Clinically trivial effects

Steve Simon


I don’t like to cite articles in the New York Times, because they are free on the web only for a couple of weeks. But an article by Denise Grady, Nominal Benefits Seen in Drugs for Alzheimer’s, published on April 7, is worth mentioning. Try to review this article before it disappears on April 21.

Grady writes that drugs to treat Alzheimer patients are expensive, and it is unclear how much they really help.

Clearly, the drugs can alter brain chemistry, and some studies show statistically significant improvements on tests that measure thinking and memory. But while a few extra points on a mental exam, or other changes obvious to a specialist, may be enough to get a drug approved by the Food and Drug Administration, they may not be enough to help a person with Alzheimer’s dementia function in the real world. “You can name 11 fruits in a minute instead of 10,” said Dr. Thomas Finucane, a professor at Johns Hopkins and a geriatrician. “Is that worth 120 bucks a month?”

This is a classic example of two issues: clinical importance and surrogate end points. Typically, patients in randomized trials are assessed using tests of memory. But is the ability to name fruits really of interest to patients? And is the ability to name 11 fruits rather than 10 worthwhile from a patient’s perspective?

Dr. Jason Karlawish, a geriatrician at the University of Pennsylvania’s Institute on Aging, said, “There is substantial controversy over the claim that current F.D.A.-approved treatments improve function or slow a patient’s decline.” He blamed several factors for the controversy, including “the lack of widely understood and accepted measures to show improvement or slowing of decline, small effects on the few measures some experts agree are appropriate, and controversial and even outrageous approaches to analyzing the data to make the claim the drug slows a patient’s decline.” Dr. Karlawish, Dr. Finucane and other researchers said they were particularly irked by a study published last July in The Journal of the American Geriatrics Society, claiming that Aricept could delay a patient’s need for nursing-home care by nearly two years - something that would clearly matter to patients and families.

The latter claim, which has been criticized, does at least concern the type of outcome that patients are interested in. There is an acronym, POEM, which stands for Patient-Oriented Evidence that Matters. Patients are only interested in morbidity, mortality, or quality of life.

Along the same lines, I co-authored an editorial with Jay Portnoy, Is 3-mm Less Drowsiness Important?, which appeared in the October 2003 issue of the Annals of Allergy, Asthma and Immunology. We commented on a study that showed a statistically significant difference in drowsiness scores. Drowsiness was measured using a visual analog scale. This is a line, usually 10 centimeters in length. You are asked to draw a mark on the line representing how drowsy you feel, with the left end of the line representing no drowsiness and the right end of the line representing the most drowsiness possible. In this study, the difference between the two drugs was 3 millimeters. Here’s an image that shows the size of 3 millimeters on a 10-centimeter scale. On your screen, of course, or on your printer, these lines may not be exactly 10 centimeters long, but the distances should be proportional.

[Image not available: a 3 millimeter difference marked on a 10 centimeter visual analog scale.]

Is a difference of 3 mm large enough to justify a claim of “less drowsy”? The authors of the study do not discuss this, and quite honestly, most researchers do not either (Chan 2001; Thomson 2000). It seems that once you compute a p-value, you stop thinking. This is a bad habit. While statistical significance is an important finding, it is also important to discuss practical relevance. Does an extra 3 mm make it safe to drive a car or to operate heavy machinery?

The visual analog scale is also used for pain assessments, and here there is some guidance for what is considered clinically important. A score of 30 mm or less is considered little or no pain (Bodian 2001), and a change of 10 mm is considered clinically relevant (Powell 2001).
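One way to make the distinction concrete is to compare a confidence interval for the difference against a threshold for clinical importance. Here is a minimal sketch in Python. The standard deviation (25 mm) and sample size (1,000 per group) are hypothetical numbers I made up for illustration; they are not from the drowsiness study. Borrowing the 10 mm threshold from the pain literature for a drowsiness scale is also an assumption, not an established standard.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(mean_diff, sd, n_per_group, level=0.95):
    """Normal-approximation confidence interval for a difference in two
    group means, assuming equal group sizes and a common standard deviation."""
    se = sd * sqrt(2.0 / n_per_group)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return mean_diff - z * se, mean_diff + z * se


def classify(ci, threshold):
    """Compare a confidence interval to a clinically important difference.
    Assumes the difference is oriented so that positive values mean benefit."""
    low, high = ci
    if low <= 0 <= high:
        return "not statistically significant"
    if low >= threshold:
        return "statistically significant and clinically important"
    if high < threshold:
        return "statistically significant but clinically trivial"
    return "statistically significant, clinical importance uncertain"


if __name__ == "__main__":
    # Hypothetical inputs: 3 mm observed difference, SD of 25 mm,
    # 1,000 patients per group, 10 mm threshold borrowed from pain research.
    interval = diff_ci(mean_diff=3.0, sd=25.0, n_per_group=1000)
    print(interval)
    print(classify(interval, threshold=10.0))
```

With these made-up inputs the whole interval sits above zero but well below 10 mm, so the result is statistically significant and still too small to matter by that yardstick.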

In other areas, it is well known that grapefruit consumption can alter the activity of a drug-metabolizing enzyme, CYP3A. Is this alteration large enough, though, to warrant a warning? An editorial (Abernethy 1997) discusses this and concludes that only an unusually high consumption of grapefruits would warrant any serious concern.

A review of randomized trials of head injury (Dickinson 2000) suggests that a 5% absolute improvement would be considered clinically relevant, though a letter to the editor (Murray 2000) suggests that perhaps even a 10% absolute improvement would not be clinically important. Given the severity of outcomes in head injuries, I would tend to side with Dickinson, but it is very difficult to detect differences this small.
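To see why small absolute differences are so hard to detect, here is a rough sample size sketch in Python using the standard normal-approximation formula for comparing two proportions. The 50% rate of poor outcome in the control group is a hypothetical value chosen for illustration, not a figure from Dickinson or Murray, and the usual defaults of a two-sided alpha of 0.05 and 80% power are also assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate number of patients per group needed to detect a difference
    between two proportions (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


if __name__ == "__main__":
    # Hypothetical control rate of 50% poor outcomes.
    print("5% absolute improvement:", n_per_group(0.50, 0.45))
    print("10% absolute improvement:", n_per_group(0.50, 0.40))
```

Halving the difference you want to detect roughly quadruples the number of patients you need, which is why a trial sized for a 10% improvement will usually miss a 5% one.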

There are many aspects to this problem, which I’ll try to discuss when I have time.

Grapefruits and drugs: when is statistically significant clinically significant? Abernethy DR. J Clin Invest 1997: 99(10); 2297-8.

The visual analog scale for pain: clinical significance in postoperative patients. Bodian CA, Freedman G, Hossain S, Eisenkraft JB, Beilin Y. Anesthesiology 2001: 95(6); 1356-61.

How well is the clinical importance of study results reported? An assessment of randomized controlled trials. Chan KB, Man-Son-Hing M, Molnar FJ, Laupacis A. CMAJ 2001: 165(9); 1197-202.

Size and quality of randomised controlled trials in head injury: review of published studies. Dickinson K, Bunn F, Wentz R, Edwards P, Roberts I. British Medical Journal 2000: 320; 1308-1311.

Quality of randomised controlled trials in head injury. Trials in head injury are more complex than review suggests. Murray GD, Teasdale GM. British Medical Journal 2000: 321(7270); 1223.

Is 3-mm Less Drowsiness Important? Portnoy JM, Simon SD. Annals of Allergy, Asthma and Immunology 2003: 91(4); 324-325.

Determining the minimum clinically significant difference in visual analog pain score for children. Powell CV, Kelly AM, Williams A. Ann Emerg Med 2001: 37(1); 28-31.

Audit and feedback: effects on professional practice and health care outcomes. Thomson OB, Oxman AD, Davis DA, Haynes RB, Freemantle N, Harvey EL. Cochrane 2000: (2); CD000259.

You can find an earlier version of this page on my original website.