Toms River

by Dan Fagin


  The second misunderstanding was about the ubiquity of cancer. In adults, it was a much more common condition than most people recognized. In the mid-1990s, there was one new case per year for every 230 New Jerseyans. A more striking way to think about that was that an American man faced a 44 percent chance of getting cancer at some point during his life; for women, the lifetime risk was 38 percent. With so many cases, it was inevitable that some neighborhoods would have surprisingly high concentrations of cancer—again, for no reason other than bad luck. “People just didn’t realize how much cancer there is all over,” Berry would later explain.

  Finally, many of the people who called Berry to report a possible cluster assumed that cancer was a single disease instead of a catchall term applied to more than 150 distinct conditions. All cancers involved uncontrolled cell division triggered by genetic damage, but many had little else in common. Cervical cancer, for instance, was caused predominantly by a sexually transmitted virus; including cervical cases in a residential cancer cluster study made little sense. On the other hand, focusing only on cancer types that had been plausibly linked to industrial chemicals—brain and blood cancers, for example—reduced the total number of cases in a cluster study and thus made it even harder to confidently identify nonrandom clusters. For a rare type of cancer, just one extra case in a neighborhood—raising the total from one to two cases, or from two to three—would be enough to make the neighborhood look like a hotspot, even though that one additional case could easily be coincidental.

  By the time Berry finished clearing up those misconceptions about cancer and then moved on to the deficiencies of the state registry, with its out-of-date and incomplete records, many callers were so discouraged that they dropped their request for a cluster investigation. About half of the time, however, Berry’s explanations did not satisfy a caller. In those cases—perhaps fifteen times a year—Berry would take the next step and conduct an incidence analysis, using registry data. The 1986 and 1991 analyses he conducted on childhood cancer in Toms River were typical. The analyses were simple comparisons between the number of known cases in a community and the number that “should” have occurred there based on the average incidence rate for all of New Jersey.
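
  The arithmetic behind such an incidence analysis is simple enough to sketch in a few lines of Python. Apart from the statewide rate of one case per 230 residents, every figure below is invented for illustration, not an actual New Jersey number; a one-sided Poisson test then gauges whether the observed excess could plausibly be chance.

```python
from scipy.stats import poisson

# Invented figures for illustration -- not actual Toms River data.
state_rate = 1 / 230        # one new case per 230 residents per year
town_population = 10_000    # hypothetical community
years = 5                   # study window
observed_cases = 240        # hypothetical observed count

# Cases "expected" if the community simply matched the statewide rate.
expected_cases = state_rate * town_population * years

# One-sided Poisson test: chance of at least this many cases by luck alone.
p_value = poisson.sf(observed_cases - 1, expected_cases)

print(f"expected {expected_cases:.1f}, observed {observed_cases}, p = {p_value:.3f}")
```

  A result like p = 0.07, which these invented numbers produce, is exactly the kind of ambiguous answer Berry’s letters conveyed: suggestive, but fully consistent with bad luck.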

  What frustrated Berry about those analyses was that their only real scientific value was as a first pass, a preliminary screening tool. They were a way to identify communities worthy of more sophisticated investigations that might include air, water, and soil tests as well as interviews to determine residents’ past exposure to carcinogens. Yet his supervisors in the health department never authorized any follow-up work in neighborhoods, no matter what Berry had found initially. Identifying true pollution-induced clusters amid the sea of unlucky flukes, Berry discovered, was beyond the resources, expertise, and inclination of the State of New Jersey. If a community really did have significantly more cancer than expected—and over the years, Berry had found several communities that seemed to—he would confer with his supervisors, send a letter explaining his findings to the person who had asked for the study, and then … nothing. There was no next step, no follow-up. Just the letter explaining the anxiety-inducing results and reiterating what Berry had already told the caller in their first conversation: The cluster was probably due solely to bad luck—but no one could say for sure.

  New Jersey was hardly alone in its resistance to conducting anything more than the shallowest cluster studies. Only a few states or countries with cancer registries did more; most did far less. As Toms River would soon find out, full-blown environmental studies were very expensive and highly controversial—and in the end usually failed to deliver clear-cut results. The Hollywood movie set approach to cluster investigation, on the other hand, had some distinct advantages for the state health department. That Michael Berry was doing something—anything, no matter how futile—allowed the state health commissioner to assure the governor and legislature that his department was responding to every cluster call it received. That was more than many health departments could claim, and it was a political imperative in New Jersey, where industrial pollution was a perennial campaign issue. Incidence analyses were inexpensive, and their ambiguous results did not create conflicts with the chemical industry and other powerful interests. They were a clever political solution—and a scientifically illegitimate one.

  “That was the conundrum of doing these analyses,” recalled Berry, who retired in 2010, after twenty-four years with the department. “The incidence analyses were something we could do, and the registry was a resource we could use, and arguably should use, to respond to people who have concerns. But what happens next? Even if I thought there really was a problem in a particular place, to do a follow-up study would be very expensive, and who knows if we would find anything? Besides, it wasn’t like I was working in an environment where I could say, ‘Look at this, maybe we should do something.’ It became clear that what the organization preferred was that we just respond, because people complain if you don’t respond. But that’s it. Nothing else.”

  So when Michael Berry’s phone rang on March 13, 1995, and Steve Jones from the ATSDR was on the line requesting another investigation of childhood cancer in Toms River, Berry felt the familiar wave of frustration building.

  There was a depressing logic behind New Jersey’s faux approach to cluster investigation. It had its roots in a century-long quest to verify neighborhood cancer clusters scientifically—a quixotic effort that tantalized and ultimately frustrated everyone who attempted it, including some of the greatest statisticians of the twentieth century.

  Back when most cancers were believed to be infectious—the triumphs of Louis Pasteur and Robert Koch in the late nineteenth century convinced many Europeans and Americans of that era that all diseases were transmissible—the existence of “cancer houses” plagued by high numbers of cases was taken for granted. Just like Linda Gillick a century later, Victorians of all social strata, from aristocratic reformers to the working poor, looked around their communities, saw that cancer was not evenly distributed, and assumed that a hidden cause must be at work. With rare exceptions, cancer was treated like a shameful plague; many people thought it was related to venereal disease, and others believed that its victims should be barred from hospitals as risks to public health.1

  A few physicians tried to apply scientific scrutiny to the notion of “cancer houses.” One of the most dedicated was an otherwise obscure Englishman named Thomas Law Webb.2 He compiled the addresses of 377 people who died of cancer between 1837 and 1910 in the industrial town of Madeley. In 1911, Webb gave his data to the person in Britain best qualified to analyze it, a hot-tempered polymath named Karl Pearson. A man of breathtakingly broad interests—he was a philosopher, poet, songwriter, and novelist in his spare time—Pearson reserved his greatest passion for the development of mathematical statistics as a full-fledged academic discipline and as a tool for solving social problems.3 That same year, Pearson founded the world’s first academic department of applied statistics at University College London, which became the global incubator of the nascent discipline of biostatistics.

  There is no indication in the historical record of how Pearson found out about Webb’s remarkable collection of cancer data, but it is easy to see why he would be eager to analyze it: Webb’s records were a way for Pearson to use his new methods of statistical analysis to test the widely held belief in “cancer houses.” He was especially interested in what came to be known as significance testing, or statistical significance.4 The concept is simple: Any apparent pattern within a group of numbers, or apparent correlation between two or more groups of numbers, should be tested to determine how likely it is that the pattern or correlation is due to chance and not to some other cause.

  The 377 Madeley residents who had died of cancer between 1837 and 1910 lived in 354 houses, according to Webb’s records. To determine whether cases were clustering for reasons other than chance, Pearson first needed to estimate how many homes would have multiple cases if those fatal cancer cases were distributed at random. Using the statistical methods he had developed, Pearson calculated that if cancer were distributed randomly among the nearly three thousand residences in Madeley, there would be about 331 houses with one cancer death, twenty-two with two deaths, and one with three. But Webb’s records showed there were actually 315 with one death, twenty homes with two, six homes with three, and one unfortunate home in which four residents died of cancer over the seventy-three-year period.
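
  Pearson’s expected counts can be reproduced today in a few lines of Python using a Poisson approximation for the number of deaths landing in each house. The house total of 3,000 is an assumption, since the record says only “nearly three thousand”:

```python
import math

deaths = 377    # fatal cancer cases in Madeley, 1837-1910
houses = 3_000  # assumed count for "nearly three thousand" residences
lam = deaths / houses  # mean deaths per house under random allocation

# Expected number of houses recording exactly k deaths, if the
# deaths were scattered across the houses purely at random.
for k in range(1, 5):
    expected = houses * math.exp(-lam) * lam**k / math.factorial(k)
    print(f"houses with {k} death(s): expected {expected:5.1f}")
```

  The approximation lands close to the figures Pearson reported; the small discrepancies come from the assumed house count and from using a Poisson shortcut rather than an exact calculation.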

  To a non-statistician, the two sets of numbers might not have looked very different. But to Pearson they were night and day. Could it just be a fluke? Not according to Pearson. “The probability that such a distribution could arise from random sampling is only one in many, many millions,” he concluded after conducting a series of probability experiments.5 Pearson thought that his provocative findings merited a comprehensive follow-up investigation, including comparisons to nonindustrial towns and a detailed breakdown of cases by age and occupation.6 But there was no follow-up study. In that sense, Pearson was the first of a long line of cluster hunters whose tantalizing tentative findings failed to attract the interest and resources needed to confirm or refute them. He never published again on the topic in his long career, which ended with his death in 1936.
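
  Pearson’s “probability experiments” can likewise be imitated with a modern Monte Carlo simulation: scatter the 377 deaths across the houses at random many times over, and count how often the clustering looks as extreme as what Webb recorded. The sketch below again assumes 3,000 houses and uses one plausible test statistic, the number of houses with three or more deaths, which is not necessarily the statistic Pearson himself used.

```python
import numpy as np

rng = np.random.default_rng(0)
DEATHS, HOUSES, TRIALS = 377, 3_000, 100_000
OBSERVED_EXTREME = 7  # Webb's data: six houses with three deaths, one with four

hits = 0
for _ in range(TRIALS):
    # Assign each death to a house uniformly at random, then count
    # the houses that received three or more deaths.
    per_house = np.bincount(rng.integers(0, HOUSES, size=DEATHS))
    if (per_house >= 3).sum() >= OBSERVED_EXTREME:
        hits += 1

print(f"chance of clustering this extreme: about {hits / TRIALS:.5f}")
```

  Even this crude statistic turns up clustering as heavy as Madeley’s only a handful of times in a hundred thousand random trials; Pearson’s own test, run over the full distribution, put the odds lower still.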

  The cluster studies that came afterward were similar cautionary tales.7 One of Pearson’s protégés, the reclusive Percy Stocks, published a more sophisticated analysis in 1935, two years after he was appointed chief statistician in the British General Registry Office, the same position William Farr had held almost a century earlier. Stocks picked up where Pearson left off by studying approximately 3,500 cancer deaths in the cities of Bristol and Worcester. He identified ninety-four houses in which there were two deaths, when only forty-four would be expected if cancer were distributed randomly. He also counted an unexpectedly high number of cases in adjacent houses, which seemed to support the notion that cancer was infectious or environmentally induced, or both.

  But then Stocks took two more steps that would introduce lasting uncertainty into subsequent cluster investigations, including those in Toms River. Knowing that most cancer victims were over fifty-five, he wondered whether what appeared to be cancer clusters were actually just clusters of older people. What he discovered was that the distribution of people between age fifty-five and seventy-five in the two cities showed just as much clustering as cancer cases did.8 Then he analyzed the data by type of tumor, reasoning that similar kinds of cancer should cluster if they were infectious or environmentally triggered. Dividing the Bristol and Worcester cancer deaths into seven groupings, based on the organs in which the primary tumors occurred, he found that none of the groupings showed a tendency to cluster, even if cancer cases overall seemed to do so. What looked like evidence of carcinogenic infection or pollution almost certainly was not, Stocks concluded. He would remain a skeptic of cluster studies for the rest of his career, and his doubts would strongly influence his close colleagues Richard Doll and Austin Bradford Hill as they launched their seminal studies on cigarette smoking and lung cancer.
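
  The age correction Stocks pioneered survives in modern practice as indirect standardization: instead of applying one overall rate, the investigator builds the expected count from age-specific rates applied to the community’s own age structure. A minimal sketch, with every figure invented for illustration, shows why the step matters:

```python
# Invented figures throughout -- for illustration only.
overall_rate = 1 / 230  # crude reference rate, cases per person per year

age_specific_rates = {  # reference rates by age band
    "0-19": 0.0002, "20-54": 0.002, "55-74": 0.012, "75+": 0.020,
}
town_population = {     # a hypothetical town skewed toward older residents
    "0-19": 2_000, "20-54": 3_000, "55-74": 3_500, "75+": 1_500,
}
observed = 80           # hypothetical cases in one year

# Expected cases two ways: a single crude rate versus rates
# matched to the town's age structure.
crude_expected = overall_rate * sum(town_population.values())
adjusted_expected = sum(age_specific_rates[band] * town_population[band]
                        for band in age_specific_rates)

# SIR = standardized incidence ratio: observed divided by expected.
print(f"crude SIR:        {observed / crude_expected:.2f}")
print(f"age-adjusted SIR: {observed / adjusted_expected:.2f}")
```

  On the crude comparison this hypothetical town looks alarming, with nearly twice the expected cancer; once its older age profile is taken into account, the excess all but vanishes, which is exactly the effect Stocks uncovered in Bristol and Worcester.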

  After Percy Stocks, every respectable cluster investigator, including those in Toms River, had to account for the influence of age and had to look at specific types of cancer. The latter requirement was especially onerous for researchers studying clusters of rare diseases because it meant that they had even fewer cases to work with and therefore less confidence that the clustering was not caused by chance. For example, if just two cases of cancer of the larynx were expected in a community over a ten-year period, did the appearance of four cases really constitute an alarming excess of 100 percent? If an investigator saw thirty cases instead of the expected fifteen, it almost certainly did, but what about four cases instead of two? A small cluster like that could easily be a random blip, but there was also a chance that it was a sign of something much more serious. A cluster analysis, however sophisticated, could not differentiate between the two.
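
  A Poisson model, the standard tool for counts of rare events, makes the asymmetry concrete: four cases against an expected two will arise by chance about one time in seven, while thirty against an expected fifteen will arise less than one time in a thousand.

```python
from scipy.stats import poisson

# Chance of seeing at least `observed` cases by luck alone when
# `expected` is the background count, under a Poisson model.
for expected, observed in [(2, 4), (15, 30)]:
    p = poisson.sf(observed - 1, expected)
    print(f"expected {expected:>2}, observed {observed:>2}: p = {p:.4f}")
```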

  By the 1960s, biostatisticians like Percy Stocks had turned cancer cluster analysis into a well-structured scientific endeavor that almost no one wanted to pursue. Their insistence on statistical significance tests and age-adjusted data and their refusal to lump all cancers together in a single analysis legitimized the study of residential clusters while simultaneously making it seem like a waste of time. Cluster studies now produced results that were scientifically credible but hopelessly ambiguous. At a time when Richard Doll was showing the exciting potential of population-wide studies in identifying risk factors for common cancers, the study of small, location-specific clusters increasingly looked like a dead-end pursuit, a statistical trap from which it was impossible to emerge with useful information.

  There was still one major branch of epidemiology that remained very interested in cluster studies, and it was the same branch that had pioneered them one hundred years earlier. Infectious disease specialists conducted cluster studies all the time, looking at incidence patterns for tuberculosis, measles, and other highly transmissible diseases. In the United States, the leading institution for such studies was, and still is, the agency now known as the Centers for Disease Control and Prevention, based near Atlanta, Georgia.9 The CDC had made its reputation studying malaria, but health departments would send it reports of other disease clusters, too, in hopes of enticing the agency to investigate.

  One of the most compelling of those reports originated with Sister Mary Viva, the principal of the Saint John Brebeuf elementary school in Niles, northwest of Chicago. During the first three months of 1961, four young girls from Niles—which had a population of just twenty thousand—died of leukemia. Two were students at Saint John Brebeuf; a third was a preschool-age sibling of a student. The CDC had never conducted a cancer cluster study before, but by 1961 evidence was accumulating that leukemia might be infectious: Researchers had identified viruses associated with leukemia in cats and dogs. A young CDC investigator, Clark W. Heath Jr., was dispatched to look into the reported cluster. He discovered it was twice as large as Sister Mary had realized: There were eight childhood leukemia cases diagnosed in Niles between September of 1957 and August of 1960—a rate almost five times higher than expected.10 Seven of the eight were either students at Saint John Brebeuf or their younger siblings, which meant the incidence rate for families at the school was more than eight times higher than for other Niles families.

  Heath and his collaborators spent more than a year working on the investigation. They mapped the home addresses of the victims, measured radiation levels, looked for local pollution sources, interviewed almost five hundred families, collected statistics for neighboring towns, and even looked to see if there was an unusual number of local feline or canine leukemia cases. Nothing clicked. They found no common factors among the cases other than their association with the school, though there were some intriguing hints about viral infection in Niles, which was growing very quickly and absorbing wave after wave of newcomers, just as Toms River was.11 Heath would go on to investigate fifty clusters of childhood leukemia and lymphoma, finding seven others besides Niles in which there were indications of infection but no corroborating evidence.12 “We were not able to find the decisive evidence we were looking for,” he remembered many years later.

  The federal government’s first foray into cancer cluster investigation had accomplished very little, yet by the late 1970s the CDC was conducting more cluster studies than ever and was increasingly focusing on chemical pollutants, not viruses—all because of demands from citizens and politicians. Publicity over Love Canal and other environmental disasters had sparked a boom in requests for cluster investigations, especially in states that had cancer registries.13 State health departments were fielding about fifteen hundred such requests per year.14 The most worrisome of those requests—the ones with a plausible suspect cause and rates high enough to make random variation an unlikely explanation—were passed on to the CDC, which by the 1980s was conducting an average of five or six cluster investigations each year. Hundreds more were at least crudely investigated by state health departments, as Michael Berry was doing in New Jersey.

  The complaint-driven genesis of almost all of those cluster investigations was turning out to be a profound weakness, and not just because anxious members of the public often reported cancer patterns that turned out to be unexceptional. There was a deeper issue that could not be solved by the clever use of incidence comparisons and statistical significance tests. This was the problem of hidden multiple comparisons.15 The case-control studies popularized by Richard Doll in the 1950s were scientifically elegant not only because they were large enough to reduce statistical uncertainty but also because they began with a hypothesis. Doll wanted to test the proposition that smoking was a risk factor for lung cancer, so he assembled a large group of cases and compared them to a similar but cancer-free control group. Most cluster studies, by contrast, turned deductive science on its head. Instead of starting with a testable cause-and-effect hypothesis, they began with someone cherry-picking a suspicious cluster of cases out of a much larger population.16

  For example, when Lisa Boornazian in Philadelphia confided to her sister-in-law that she had noticed an unusual number of sick children from Toms River on the oncology ward, she was making an unstated comparison to the hundreds of other communities that sent patients each year to the ward. Within such a large comparison group, sheer chance could easily explain why several towns—including Toms River—were overrepresented in the ward’s patient population. Similarly, when Sister Mary Viva became concerned about a three-month period in Niles, Illinois, when leukemia was diagnosed in four local girls, she was making an unspoken comparison to dozens of other three-month periods, and dozens of other diseases, during her years as school principal.
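
  The trap is easy to demonstrate with a short simulation, with all parameters invented: give one hundred towns exactly the same underlying cancer rate, then scan for the worst-looking town in each simulated round. Nearly every scan produces a “hotspot” that would pass a conventional significance test if the scanning itself were ignored.

```python
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(1)
TOWNS, EXPECTED, TRIALS = 100, 5.0, 10_000  # invented parameters

flagged = 0
for _ in range(TRIALS):
    cases = rng.poisson(EXPECTED, TOWNS)  # every town shares one true rate
    worst = cases.max()
    # Test only the worst town, ignoring that 100 towns were scanned.
    if poisson.sf(worst - 1, EXPECTED) < 0.05:
        flagged += 1

print(f"trials yielding a 'significant' hotspot: {flagged / TRIALS:.0%}")
```

  In a typical run, more than nine trials in ten flag a hotspot even though no town is truly at elevated risk; the cherry-picking, not the cancer, creates the cluster.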

 
