Ending Medical Reversal
Given these sobering facts, what is a healthy diet? And how should you eat? Both of us are partial to arguments made and defended at length elsewhere. For instance, Michael Pollan has provided a reasonable framework and set of rules to guide healthy eating in his books In Defense of Food and The Omnivore’s Dilemma. He agrees that women like Joanne are right to feel frustrated. His statements like, “Eat food, mostly plants, and not too much” and “Don’t eat anything your great-grandmother wouldn’t recognize as food,” seem like good advice. Of course, we would add: be skeptical of any nutritional claims; they are unlikely to be from randomized trials and may soon be proved wrong.
CONCLUSION
So what have we learned? Laypeople, just like doctors, are prone to adopt therapies that are not well founded. Unfortunately, many of the therapies do not actually help us—and some may even do harm. Not long after you read this page, you will be presented a new complementary remedy. The treatment will be backed by a good story and maybe even a physiological explanation of how and why it will help. It will be supported by someone who has a stake in it—someone who needs to sell advertising on a show or in a magazine or someone who needs to keep the product moving from health-food store shelves. Some of these treatments might even make us feel better—though only for a short time and only because we hoped and expected that they would.
Our goal for the future of complementary health care is reasonable, although it might also be grand. We hope that people will hold the treatments they choose for themselves to the same standard to which doctors should hold the treatments they prescribe for their patients. If we ever reach this point, we will not have to decide between traditional and complementary therapies; we will just choose therapies that we know work.
7 THE FREQUENCY OF MEDICAL REVERSAL
WE HAVE SEEN THAT REVERSAL OCCURS in all aspects of medicine. Whether the medical intervention is a pill, a procedure, a surgery, a diagnostic test, a screening campaign, or even a checklist that doctors or nurses follow, all sorts of medical practices have been found not to work. When the two of us first began thinking and writing about reversal, peer reviewers at medical journals were surprised by our argument. Sure, they conceded, occasionally a medical practice fails, and when that happens it is certainly memorable. But, they added, those are exceptional cases; most of what doctors do is sound. And so a question was born. How often does medical reversal happen? Is it rare but memorable—like an earthquake in California? Or is it ubiquitous—like snowstorms during a Chicago winter? We were fascinated by the question and intrigued by how best to answer it.
In an ideal world, the way to measure reversal would be to make a list of all existing standards of care. (This would be a very long list. It would include everything that doctors do.) Every practice that was based on strong evidence, that had bettered a lesser therapy in a well-done randomized controlled trial (or more than one), would get a pass. Everything that remained would be tested in rigorous trials to see if it worked. Practices that came out ahead in those trials would also get a pass; those that didn’t would be identified as a case of medical reversal (and, more importantly, would no longer be offered to patients).
Of course, we do not live in an ideal world. Quantifying everything that would need to be tested would, in itself, be a career’s work. The cost of trials to test every unproven therapy would probably consume the entirety of the global research budget. So we tried to devise another way to measure it. One thought was to examine the treatments that were developed during a specific time interval, say 1995 to 2000, and ask what percentage was shown not to work by 2010. A nice idea, but also flawed, since most medical practices were not reexamined between 2000 and 2010. Because only a small fraction of the standards of care would be tested, we would underestimate the frequency of medical reversal.
Another way to answer the question would be to look at all clinical trials published during some time interval that tested standards of care and ask how many contradict those standards. Unfortunately, despite the growing databases of clinical trials, there is no easy way to get a list of such studies and their results. Besides, most trials that test a standard of care test trivial comparisons: is it better to give steroid injections for osteoarthritis of the knee every three months or every six months; do antacid medicines work better when taken once a day in double dose or twice a day in single dose? Instead, we were interested in trials that test fundamental questions: do steroid injections or antacids work at all? Is an established medical practice better than an older, proven one? Is it better than no treatment at all?
Finally, we devised a way to estimate reversal. Like the others, it had its flaws but it was practical and would give us some interesting data. We would review every article in the most prestigious and highest-impact medical journal, the New England Journal of Medicine (NEJM), in a single year. NEJM papers typically ask important questions. We would analyze any study that tested a current standard of care. Because hundreds of articles are published annually in NEJM, we picked 2009, the last complete year at the time of our investigation, in order to make the task manageable. As we worked on our project, we surveyed the literature to see if others had tried to estimate the incidence of medical reversal in the medical literature. Of the millions of papers published in medical journals, only one came close to what we envisioned.
In 2005 John Ioannidis, a physician-researcher and a truly innovative thinker, wanted to measure what proportion of important findings in medicine were later contradicted. His strategy was to start with highly cited works. These were studies that were referenced more than 1,000 times by other publications. These are the papers that really influence the practice of medicine. (It is the goal of pretty much every medical researcher to pen such an influential paper.) Ioannidis considered all the highly cited papers published in key medical journals during the years 1990 to 2003. Of those, 45 found that a medical intervention was effective. He then tracked all reports of these interventions’ efficacy over the years to come. He found that seven (16 percent) were later found to be ineffective, another seven (16 percent) were found less effective than initially believed, 20 (44 percent) were supported in future studies, and 11 (24 percent) were never tested again. The practices that were later found to be ineffective included the use of vitamins to prevent cancer and heart disease, treatment of overwhelming infections, and the use of nitric oxide in critically ill patients.
The study suggested that 16 percent of the most widely cited medical literature is later contradicted. Although an interesting and beautifully done study, it did not answer our particular question. We wanted to know what percentage of what doctors actually do is wrong. Not just the topics that get a lot of buzz (and citations), like cancer prevention and heart disease, but even the unsung heroes in medicine—treating back pain, rashes, or Bell’s palsy.
In 2011, with the help of Victor Gall, who at the time was a bright medical student at Northwestern University, we published our results in a paper entitled “The Frequency of Medical Reversal.” Not surprisingly, we found that most research in the NEJM, 77 percent in fact, concerned new medical practices and not standards of care. Of the 35 studies that did examine the current standards of care, 16 (46 percent) showed that the standard of care was ineffective. These were medical reversals. Standard therapy was no better than either the previous standard or no treatment at all.
Like good researchers (and ones always looking for another entry on the curriculum vitae) we set out to extend our findings. Our next project was to increase the time span that we studied from 1 year to 10. With the help of 10 colleagues (it was a pretty big job), we reviewed 2,044 original articles published in the NEJM between 2001 and 2010. Of these, 363 articles reported the results of studies that tested the efficacy of an established practice. Reversals were found in 146 (40 percent) of these studies, 138 (38 percent) of the studies reaffirmed the benefit of the established practice, and 79 (22 percent) were inconclusive. The breakdown of these results is shown graphically in figure 7.1.
7.1 Types of articles about a medical practice published in the New England Journal of Medicine between 2001 and 2010.
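For readers who want to check the arithmetic, the percentages quoted for the 10-year review and for the 2009 pilot can be recomputed from the reported counts. The short sketch below does only that; the variable and function names are ours, chosen for illustration, and are not taken from the published papers.

```python
# Counts reported for the 363 NEJM articles (2001-2010) that tested an established practice.
counts = {"reversal": 146, "reaffirmed": 138, "inconclusive": 79}


def percent_share(count, total):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


total = sum(counts.values())
shares = {label: percent_share(n, total) for label, n in counts.items()}

# The single-year (2009) pilot: 16 reversals among 35 tests of a standard of care.
pilot_share = percent_share(16, 35)

print(total, shares, pilot_share)
```

The three categories account for all 363 articles, so the rounded shares describe the whole set of studies that tested an established practice.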
Forty percent is a lot. Nearly half of what doctors do. If that much of medical practice is ineffective, it is pretty scary. The number did fit with how we felt about the medicine we see practiced every day and the medical literature we have followed during our careers. But the number really does not tell the whole story. First, for some fields of medicine, those for which the evidence base is weak, an estimate that 40 percent of standard therapy is ineffective is probably about right (or maybe even low). For other fields, those in which new therapies tend to be rigorously tested, the proportion that is ineffective is probably smaller. Second, the proportion of practice that is suspect depends on how a doctor practices. We know doctors, in every field, who do everything that is reasonable and evidence-based and no more. We also know doctors who test the limits of reason daily. So the average may not apply to your doctor (or any one doctor).
Of course, not everyone was happy with our findings, and some quibbled with the exact number. We agreed with some of the criticism. We may have overestimated the prevalence of reversal, because the NEJM likes to publish controversial papers and may therefore gravitate to papers that report the overturning of a standard. Then again, testing the standard of care is provocative because it happens so rarely. The NEJM publishes plenty of papers that validate what doctors do. In our analysis, they published about the same number of studies that validated current practice as they did studies that overturned it.
We may also have underestimated the frequency of reversal. By and large, doctors do not test their own standard of care. Doctors are doing what they believe is best for their patients. It takes an innovative and brave researcher (and a bit of a contrarian) to test a practice to which most of her colleagues are committed. There is also the issue of money. There is little incentive to attempt to overturn a practice from which you are profiting. Not many orthopedists would be willing to investigate whether joint replacement is better than a sham procedure. They believe in the procedure and are making a handsome profit from it. The manufacturers of the prosthetic hip are even less likely to fund such a study. Thus, countless trials are not done.
But in an important way, the number does not even matter. Earthquakes are rare in California, for sure, but we still build office towers and houses strong enough to withstand them. As long as reversal is not rare (and our data argue very strongly that it is not) and affects human health, we should address it. Doctors understand that what they do is beneficial for patients on average but not necessarily for the individual patient. (We discuss this further in chapter 9.) The data on medical reversal, however, suggest that much of what we believe helps a subset of a population may actually help no one.
Whether the actual number is a little higher or a little lower, it does fit nicely with the old adage “Half of what you are taught in medical school is wrong. The trouble is, we do not know which half.”
So although our results are limited, coming from a small slice of published literature, which itself is a small slice of what researchers think are plausible questions to ask and test, they do tell us that medical reversal is not rare. Many of the reversals we discovered have been outlined in the previous chapters. Reversals included medical therapies (prednisone use among preschool-aged children with viral wheezing and cholesterol-lowering drugs for patients on dialysis), a checklist (to assure tight glycemic control in intensive-care-unit patients), invasive procedures (percutaneous intervention for atherosclerotic renal artery disease), and screening tests (prostate cancer screening). In a review of research from a single journal, there was an example of medical reversal from virtually every corner of modern medicine. In the appendix, we give a short summary of every article that we interpreted as a reversal in this study. Experts may not agree with every one of our conclusions—sometimes arguing that the overturned therapy was not really an accepted standard or that the reversal was not complete. We ourselves had some heated discussions about what should and should not make the list. The important thing about the list is that even if you might quibble with an inclusion (or exclusion) or two, the weight of the overall number is great.
As is often the case when a topic is important, many researchers get interested in it at the same time and examine it in their own way. While we were completing our work, we began to see a lot of similar findings appearing in the literature. A project of the British Medical Journal Clinical Evidence completed a review of 3,000 medical practices. Those researchers found that 35 percent of medical practices are effective (or likely to be effective); 15 percent are harmful, unlikely to be beneficial, or a tradeoff between benefits and harms; and 50 percent are of unknown effectiveness.
How do these findings square with our results? Quite nicely, it turns out. The Clinical Evidence project maps the landscape of all medical practices. Some work, some do not (these are reversals still stuck in a lag time before doctors abandon them), and for some we simply do not have enough information. Our results apply to this last group: the 50 percent of practices about which we do not have adequate information. Our research suggests that, if you subjected the untested 50 percent to real scrutiny, 40 percent of these would be found to be, at best, ineffective. Our work shrinks the gray zone (actually the hatched area) of figure 7.2.
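The combination described here is simple arithmetic, and a rough sketch makes the sizes concrete. It assumes, as the argument above does, that the 40 percent reversal rate from the NEJM review can be carried over to the practices of unknown effectiveness; that is an extrapolation, not a measured result. The variable names are illustrative only.

```python
# Shares of the roughly 3,000 practices reviewed by BMJ Clinical Evidence, in percent.
effective = 35        # effective or likely to be effective
not_beneficial = 15   # harmful, unlikely to be beneficial, or a tradeoff
unknown = 50          # unknown effectiveness

# Assumption: the 40 percent reversal rate applies to the unknown group.
reversal_rate = 40    # percent

# Percentage points of ALL practices predicted to fail if the unknown group were tested.
predicted_reversals = unknown * reversal_rate / 100

# What would remain unresolved; the estimate makes no claim that these practices work.
still_open = unknown - predicted_reversals

estimated_not_beneficial = not_beneficial + predicted_reversals

print(predicted_reversals, still_open, estimated_not_beneficial)
```

Under that assumption, about 20 percent of all practices would move out of the unknown group, which would leave roughly 35 percent of practices ineffective or worse and about 30 percent still unresolved.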
In a similar vein, a team in Australia screened 5,209 articles in an effort to find medical practices that were unlikely to be of any benefit to patients. The results, published in 2012, listed 156 potentially ineffective or unsafe practices. There was some overlap with practices that we identified in our work. The Australian group cited eight practices that we included in our catalog of reversals. Among these were the use of arthroscopic surgery for knee osteoarthritis, endovascular repair of some abdominal aortic aneurysms, and amnioinfusion for high-risk pregnant women. While it is a bit surprising that there was not more overlap with our work, the Australians looked at the entire body of medical practice, while we looked at 10 years of trials in one journal. In the end, our study added well over 100 practices to the Australians’ already impressive list.
7.2 Combining the BMJ Clinical Evidence project with our work on the frequency of medical reversal.
During the past few years, researchers have produced multiple pieces of evidence supporting the idea that there is much that doctors do that is, at least, unproved. When these practices are examined, a sizable subset is found to be ineffective, when compared to previous practices. Some of the practices are actually found to be harmful. The obvious questions that need to be answered are, Why is reversal so common? and, What can be done to make it less so?
8 THE HARMS OF MEDICAL REVERSAL :: TODAY’S PATIENTS, TOMORROW’S PATIENTS, AND THE HEALTH-CARE FIELD
WHEN ANITA KRAMER WOKE UP, she could not move her left arm. She tried to speak, but her words were garbled. She panicked and thought, “This can’t be happening.” She somehow managed to call 911. The emergency dispatcher initially had trouble understanding her but eventually recognized the symptoms. He told her to take an aspirin and sent an ambulance. Anita had just had a stroke.
By now, most of us recognize the basic signs of stroke. Nationwide campaigns have increased the awareness of the common symptoms with the hope that people will receive treatment as quickly as possible. Strokes occur when the brain is deprived of oxygenated blood. Sometimes a stroke is caused by a narrowed artery impairing blood flow to the area of the brain beyond the obstruction. More often, a cholesterol plaque or blood clot breaks off from the wall of an artery and clogs a more distant vessel.
Because Anita woke up with symptoms, her doctors in the emergency room are in a jam. They do not know if the stroke occurred 30 minutes before, when she woke up, or at midnight the night before. This would not be important were it not that our most effective treatment for strokes, tissue plasminogen activator, or TPA, only works when given within about four and a half hours of the onset of symptoms. TPA works by breaking up clots. The drug is such a powerful blood thinner that its risks (including lethal hemorrhage) aren’t trivial. After four and a half hours, the risks of the drug begin to outweigh its benefit.
With TPA off the table, there are really no options to treat the stroke. Fortunately, with time and physical therapy, many patients will recover at least some of their lost abilities. All of the evaluation and treatment that Anita will receive in the hospital will be intended to decrease her risk of having another stroke. She receives atorvastatin to lower her cholesterol and a combination of aspirin and dipyridamole to thin her blood. Her doctors monitor her heart rhythm, looking for atrial fibrillation, a common cardiac arrhythmia that increases the risk for stroke. They also look at her heart for the presence of blood clots. If either of these is detected, she will need to be on another class of blood thinners. (We used to look for a defect between two chambers of the heart, which could allow blood clots from the legs to travel to the brain. We do not [or should not] do this anymore, because a major trial showed that closing this hole was not beneficial; this is another example of medical reversal, which we will leave for another day.) The doctors look at the carotid arteries, usually with ultrasound. A buildup of cholesterol plaque in these large vessels in the neck can be surgically repaired, markedly decreasing the risk of future stroke. And, for the past decade, many physicians have looked for narrowing of the smaller vessels in the brain. If one of these was narrowed, we could reopen it with a stent (just as we do for arteries in the heart). The idea was that this procedure could decrease the risk of future strokes.
During Anita’s evaluation, her doctors discovered that she had a 70 percent stenosis (narrowing) in the middle cerebral artery, a major blood vessel of the brain. Her neurologist recommended a stent. Anita got the stent. Six days later, she had another stroke that left her more disabled than the first.