by Ian Ayres
The Progresa method of randomized trials is also propagating. After Gertler got involved in evaluating the Progresa program, he was asked by the World Bank to be their chief economist for human development. He told me recently, “I spent the last three years helping the World Bank build a capacity of Progresa-like operations, evaluations in 100-plus activities that they were doing. And so we use that as a model to scale up worldwide. It is now entrenched in the World Bank and it is really spreading to other countries.”
Mark Twain once said, “Facts are stubborn things, but statistics are more pliable.” Government programs like Progresa and software programs like Offermatica show, however, the simple power—the nonpliability, if you will—of randomized trials. You flip coins, put similar people in different treatment groups, and then just look to see what happened to the different groups. There is a purity to this kind of Super Crunching that is hard for even quantiphobes to ignore. Gertler puts it this way: “Randomization just strips away all the roadblocks that you might have or at least it makes them bare. If the people are going to make a political decision, then they are going to make it despite the facts.”
In some ways, random trials seem too simple to be part of the Super Crunching revolution. But we have seen that they partake to varying degrees of the same elements. Randomized trials are taking place on larger and larger pools of subjects. CapOne thinks nothing of sending randomized solicitations to hundreds of thousands of prospects. And Offermatica shows how Internet automation has collapsed the period between testing and implementation. Never before has it been possible to test and recalibrate policy in so short a time. But most importantly we’ve seen how randomization can impact data-driven decision making. The pellucid simplicity of a randomized trial is hard for even political adversaries to resist. Something as seemingly inconsequential as flipping a coin can end up having a massive effect on how the world operates.
CHAPTER 4
How Should Physicians Treat Evidence-Based Medicine?
In 1992, two Canadian physicians from Ontario’s McMaster University, Gordon Guyatt and David Sackett, published a manifesto calling for “evidence-based medicine” (EBM). Their core idea was simple. The choice of treatments should be based on the best evidence and, when available, the best evidence should come from statistical research. Guyatt and Sackett didn’t call on doctors to be exclusively guided by statistical studies. Indeed, Guyatt is on record as saying that statistical evidence “is never enough.” They just wanted statistical evidence to play a bigger role in treatment decisions.
The idea that doctors should give special emphasis to statistical evidence remains controversial to this day. This struggle over EBM parallels the struggle over Super Crunching more generally. Super Crunching is crucially about the impact of statistical analysis on real-world decisions. The debate over EBM is in large part a debate about whether statistics should impact real-world treatment decisions.
The statistical studies of EBM use the two core Super Crunching techniques. Many of the studies estimate the kind of regression equations that we saw in the first chapter, often with supersized datasets—with tens, and even hundreds, of thousands of subjects. And of course, many of the studies continue to exploit the power of randomization—except now the stakes are much higher. Because of the success of the EBM movement, the pace at which some doctors incorporate results into treatment decisions has accelerated. The Internet’s advances in information retrieval have spurred a new technology of influence, and the speed at which new evidence drives decisions has never been greater.
100,000 Lives
Empirical tests of medical treatments have been around for more than a hundred years. As early as the 1840s, the great Hungarian physician Ignaz Semmelweis completed a detailed statistical study of maternity clinics in Vienna. As an assistant professor on the maternity ward of the Vienna General Hospital, Semmelweis noticed that women examined by student doctors who had not washed their hands after leaving the autopsy room had very high death rates. When his friend and colleague Jakob Kolletschka died from a scalpel cut, Semmelweis concluded that childbirth (puerperal) fever was contagious. He found that mortality rates dropped from 12 percent to 2 percent if doctors and nurses at the clinics washed their hands in chlorinated lime before seeing each patient.
This startling result, which ultimately would give rise to the germ theory of disease, was fiercely resisted. Semmelweis was ridiculed by other physicians. Some thought his claims lacked a scientific basis because he didn’t offer a sufficient explanation for why hand-washing would reduce death. Physicians refused to believe that they were causing their patients’ deaths. And they complained that hand-washing several times a day was a waste of their valuable time. Semmelweis was eventually fired. After a nervous breakdown, he ended up in a mental hospital, where he died at the age of forty-seven.
The tragedy of Semmelweis’s death, as well as the needless deaths of thousands of women, is ancient history. Doctors today of course know the importance of cleanliness. Medical dramas show them meticulously scrubbing in for operations. But the Semmelweis story remains relevant. Doctors still don’t wash their hands enough. Even today, physicians’ resistance to hand-washing is a deadly problem. But most importantly, it’s still a conflict that is centrally about whether doctors are willing to change their modus operandi because a statistical study says so. This is a conflict that has come to obsess Don Berwick.
A pediatrician and president of the Institute for Healthcare Improvement, Don Berwick inspires some pretty heady comparisons. The management guru Tom Peters calls him “the Mother Teresa of health safety.” To me, he is the modern-day Ignaz Semmelweis. For more than a decade, Berwick has been crusading to reduce hospital error. Like Semmelweis, he focuses on the most basic back-end results of our health care system: who lives and who dies. Like Semmelweis, he has tried to use the results of EBM to suggest simple reforms.
Two very different events in 1999 turned Berwick into a crusader for system-wide change. First, the Institute of Medicine published a massive report documenting widespread errors in American medicine. The report estimated that as many as 98,000 people died each year in hospitals as a result of preventable medical errors.
The second event was much more personal. Berwick’s own wife, Ann, fell ill with a rare autoimmune disorder of the spinal cord. Within three months she went from completing a twenty-eight-kilometer cross-country ski race in Alaska to barely being able to walk.
The Institute of Medicine report had already convinced Berwick that medical errors were a real problem. But it was his wife’s slovenly hospital treatment that really opened Berwick’s eyes. New doctors asked the same questions over and over again and even repeated orders for drugs that had already been tried and proven unsuccessful. After her doctors had determined that “time was of the essence” for using chemotherapy to slow deterioration of her condition, Ann had to wait sixty hours before the first dose was finally administered. Three different times, Ann was left on a gurney at night in a hospital subbasement, frightened and alone.
“Nothing I could do…made any difference,” Don recalls. “It nearly drove me mad.” Before Ann’s hospitalization, Berwick was concerned. “Now, I have been radicalized.” No longer could he tolerate the glacial movement of hospitals to adopt simple Semmelweis policies to reduce death. He lost his patience and decided to do something about it.
In December 2004, he brazenly announced a plan to save 100,000 lives over the next year and a half. The “100,000 Lives Campaign” challenged hospitals to implement six changes in care to prevent avoidable deaths. He wasn’t looking for subtle or sophisticated changes. He wasn’t calling for increased precision in surgical operations. No, like Semmelweis before him, he wanted hospitals to change some of their basic procedures. For example, a lot of people after surgery develop lung infections while they’re on ventilators. Randomized studies showed that simply elevating the head of the hospital bed and frequently cleaning the patient’s mouth substantially reduces the chance of infection. Again and again, Berwick simply looked at how people were actually dying and then tried to find out whether there was large-scale statistical evidence showing interventions that might reduce these particular risks. EBM studies also suggested checks and rechecks to ensure that the proper drugs were prescribed and administered, adoption of the latest heart attack treatments, and use of rapid response teams to rush to a patient’s bedside at the first sign of trouble. So these interventions also became part of the 100,000 Lives Campaign.
Berwick’s most surprising suggestion, however, is the one with the oldest pedigree. He noticed that thousands of ICU patients die each year from infections after a central line catheter is placed in their chests. About half of all intensive care patients have central line catheters, and ICU infections are deadly (carrying mortality rates of up to 20 percent). He then looked to see if there was any statistical evidence of ways to reduce the chance of infection. He found a 2004 article in Critical Care Medicine that showed that systematic hand-washing (combined with a bundle of improved hygienic procedures such as cleaning the patient’s skin with an antiseptic called chlorhexidine) could reduce the risk of infection from central-line catheters by more than 90 percent. Berwick estimated that if all hospitals just implemented this one bundle of procedures, they might be able to save as many as 25,000 lives per year. Just as numerical analysis had informed Ignaz so many years before, it was a statistical study that showed Berwick a way to save lives.
Berwick thinks that medical care could learn a lot from aviation, where pilots and flight attendants have a lot less discretion than they used to have. He points to FAA safety warnings that have to be read word for word at the beginning of each flight. “The more I have studied it, the more I believe that less discretion for doctors would improve patient safety,” he says. “Doctors will hate me for saying that.”
Berwick has crafted a powerful marketing message. He tirelessly travels and is a charismatic speaker. His presentations at times literally sound like a revival meeting. “Every single person in this room,” he told one gathering, “is going to save five lives during the forum.” He constantly analogizes to real-world examples to get his point across. His audiences have heard him compare health care to the escape of forest-fire jumpers, his younger daughter’s soccer team, Toyota, the sinking of a Swedish warship, the Boston Red Sox, Harry Potter, NASA, and the contrasting behaviors of eagles and weasels.
And he is fairly obsessed with numbers. Instead of amorphous goals, his 100,000 Lives Campaign was the first national effort to save a specific number of lives in a set amount of time. The campaign’s slogan is “Some Is Not a Number, Soon Is Not a Time.”
The campaign signed up more than 3,000 hospitals representing about 75 percent of U.S. hospital beds. Roughly a third of the hospitals agreed to implement all six changes, and more than half used at least three. Before the campaign, the average mortality rate for hospital admits in the United States was about 2.3 percent. For the average hospital in the campaign with 200 beds and about 10,000 admits a year, this meant about 230 annual fatalities. By extrapolating from existing studies, Berwick figured that participating hospitals could save about one life for every eight hospital beds—or about twenty-five lives a year for a 200-bed hospital.
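The arithmetic behind these figures can be checked directly. A minimal sketch, using only the numbers quoted above (2.3 percent mortality, a 200-bed hospital with 10,000 admits, one life saved per eight beds):

```python
# Back-of-the-envelope check of the campaign's figures, all taken
# from the text: 2.3% baseline mortality, a typical participating
# hospital with 200 beds and 10,000 admits a year, and an expected
# saving of one life per eight hospital beds.
admits_per_year = 10_000
mortality_rate = 0.023
beds = 200

expected_deaths = admits_per_year * mortality_rate  # -> 230 deaths/year
lives_saved = beds / 8                              # -> 25 lives/year

print(expected_deaths)  # 230.0
print(lives_saved)      # 25.0
```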
The campaign required participating hospitals to provide eighteen months of mortality data before they began participating and to report updates on a monthly basis of how many people died during the course of the experiment. It’s hard to assess at a single hospital with 10,000 admits whether any mortality decrease is just plain luck. Yet when the before-and-after results of 3,000 hospitals are crunched, it’s possible to come to a much more accurate assessment of the aggregate impact.
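The statistical logic here is worth making concrete. A rough sketch of the signal-to-noise comparison, using the binomial standard deviation as an approximation (the 2.3 percent rate, 10,000 admits, 25-lives improvement, and 3,000 hospitals are from the text; the calculation itself is an illustration, not the campaign's actual methodology):

```python
import math

admits, rate = 10_000, 0.023
reduction = 25  # true lives saved per hospital per year (from the text)

# Year-to-year chance variation in one hospital's death count,
# approximated by the standard deviation of a binomial count.
sd_one = math.sqrt(admits * rate * (1 - rate))  # roughly 15 deaths

# At a single hospital, the 25-death improvement is under two
# standard deviations -- easy to mistake for plain luck.
print(reduction / sd_one)  # about 1.7

# Pooling 3,000 hospitals multiplies the signal by 3,000 but the
# noise only by sqrt(3,000), so the aggregate effect is unmistakable.
n = 3_000
sd_pool = math.sqrt(n) * sd_one
print(n * reduction / sd_pool)  # about 91 standard deviations
```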
And the news was great. On June 14, 2006, Berwick announced that the campaign had exceeded its goal. In just eighteen months, the six reforms prevented an estimated 122,342 hospital deaths. This precise number can’t really be trusted—in part because many hospitals were independently making progress on the problem of preventable medical error. Even without the campaign, it’s probable that some of the participating hospitals would have changed the way they do business and saved lives on their own.
Still, any way you slice it, this is a huge victory for evidence-based medicine. You see, the 100,000 Lives Campaign is centrally about Super Crunching. Berwick’s six interventions didn’t come from his intuitions, they came from statistical studies. Berwick looked at the numbers to find out what was actually causing people to die and then looked for interventions that had been statistically shown to reduce the risk of those deaths.
But this is statistics on steroids. Berwick succeeded in scaling his campaign to impact two out of every three hospital beds in the country. And the sheer speed of the statistical influence is staggering: saving 100,000 lives in a little more than 500 days. It shows the possibility of quickly going from publication to mass implementation. Indeed, the central-line study was published just two months before the 100,000 Lives Campaign began.
“Don Berwick should win the Nobel Prize for Medicine,” says Blair Sadler, head of San Diego Children’s Hospital. “He has saved more lives than any doctor alive today.” And he’s not done either. In December 2006, his Institute for Healthcare Improvement announced the 5 Million Lives Campaign, a two-year initiative to protect patients from 5 million incidents of medical harm. The success of the 100,000 Lives Campaign underscores the potential for translating EBM results into mass action by health care providers.
Old Myths Die Hard
But the continuing problem of dirty hands underscores the difficulty of getting the medical community to follow where the statistics lead them. Even when statistical studies exist, doctors are often blissfully unaware of—or, worse yet, deliberately ignore—statistically prescribed treatments just because that’s not the way they were taught to treat. Dozens of studies dating back to 1989 found little support for many of the tests commonly included in a typical annual physical for symptomless people. Routine pelvic, rectal, and testicular exams for those with no symptoms of illness haven’t made any difference in overall survival rates. The annual physical exam is largely obsolete. Yet physicians insist on doing them, and in very large numbers.
Dr. Barron Lerner, an internist at Columbia University’s College of Physicians and Surgeons, asks patients to come in every year and always listens to their heart and lungs, does a rectal exam, checks lymph nodes, and palpates their abdomens.
“If a patient were to ask me, ‘Why are you listening to my heart today?’” he said, “I couldn’t say, ‘It’s going to help me predict whether you will have a heart attack.’
“It’s what I was taught and it’s what patients have been taught to expect,” he said.
Worse yet, there is an extensive literature on “medical myths” that persist in practice long after they’ve been refuted by strong statistical evidence. On the ground, many practicing doctors still believe:
Vitamin B12 deficiencies must be treated with shots because vitamin pills are ineffective.
Patching the eye improves comfort and healing in patients with corneal abrasions.
It is wrong to give opiate analgesics to patients with acute abdomen pain because narcotics can mask the signs and symptoms of peritonitis.
However, there is evidence from carefully controlled randomized trials that each of these beliefs is false. It’s not surprising that the untrained general public clings to “folk wisdom” and unproven alternative medicines. But how do medical myths persist among practicing physicians?
Part of the persistence comes from the idea that new studies aren’t really needed. There’s that old Aristotelian pull. The whole idea of empirical testing goes against the Aristotelian approach that has been a guiding principle of research. Under this approach, researchers should first try to understand the nature of the disease. Once you understand the problem, a solution will become self-evident. Semmelweis had trouble publishing his results because he couldn’t give an Aristotelian explanation for why hand-washing saved lives. Instead of focusing on the front-end knowledge about the true nature of disease, EBM shows the power of asking the back-end question of whether specific treatments work. This doesn’t mean that medical research should say goodbye to basic research or the Aristotelian approach. But for every Jonas Salk, we need a Semmelweis and, yes, a Berwick to make sure that the medical profession’s hands are clean.
The Aristotelian approach can go seriously wrong if doctors embrace a mistaken conception or model for how an ailment operates. Many doctors still cling to mistaken treatments because they have adopted a mistaken model. As Montana physician and medical myth-buster Robert Flaherty puts it, “It makes pathophysiologic sense, so it must be true.”
Moreover, once a consensus has developed about how to treat a particular disease, there is a huge urge in medicine to follow the herd. Yet blindly playing follow the leader can doom us to going the wrong way if the leaders are poorly informed. Here’s how it happens: we now know that B12 pills are just as effective as shots—they both work about 80 percent of the time. But imagine what happens if, just by chance, the first few patients who are given B12 pills don’t respond to the treatment positively (while the first few that receive B12 shots do). The pioneering clinician will respond to this anecdotal evidence by telling her students that B12 pills aren’t effective supplements. Subsequent doctors just keep giving the shots, which are in fact effective 80 percent of the time but which are also a lot more expensive and painful. Voilà, we have an informational cascade, where an initial mistaken inference based on an extremely small sample is propagated in generation after generation of education.
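The cascade mechanism can be sketched with a toy simulation. The 80 percent effectiveness figure comes from the text; the three-patient "pioneer" sample is an invented assumption for illustration:

```python
import random

random.seed(0)

EFFECTIVE = 0.80  # true success rate of B12 pills (from the text)
PILOT = 3         # assumed (hypothetical) size of the pioneer's sample

def pioneer_concludes_pills_fail():
    # A clinician tries pills on PILOT patients; if every one of them
    # happens not to respond, the anecdotal verdict becomes
    # "pills don't work" -- and gets passed on to the next generation.
    return all(random.random() >= EFFECTIVE for _ in range(PILOT))

# How often would sheer bad luck mislead a pioneering clinician?
trials = 100_000
cascades = sum(pioneer_concludes_pills_fail() for _ in range(trials))
print(cascades / trials)  # about 0.2**3 = 0.008, i.e. roughly 1 in 125
```

With enough independent pioneers, some are bound to hit that unlucky run, and each one can seed a cascade that only a large randomized trial will overturn.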