Toms River
The last big issue Berry needed to face was time. Which years would he examine? On this question, he was entirely dependent on the flawed state cancer registry. There had been a few improvements, especially in reporting by out-of-state hospitals in New York City and Philadelphia, but by 1995 the registry was running four years behind, the most out of date it had ever been. The delay meant Berry’s analysis could not include cases diagnosed after 1991 or before 1979, which was the first year of fairly complete registry data. (It was also the year of Michael Gillick’s birth, which meant that he would be one of the first cases included in Berry’s analysis.) The only way Berry’s study could be both historically comprehensive and up to date would be to include years outside the 1979-to-1991 window of the registry—but that idea was too impractical to take seriously. Who would pay the huge cost of digging up records in hospitals and doctors’ offices to find reliable data for 1975, or 1995, or any other year the registry did not cover?
The upshot was that if the deluge of industrial chemicals dumped and burned in Toms River during the 1950s and 1960s had triggered a cluster of childhood cancer in those years, Berry’s analysis would not be able to discern it because he had no information about cases diagnosed before 1979. Even worse, because of the four-year time lag at the registry, Berry would not be able to address, even indirectly, the question that so alarmed Linda Gillick and Lisa Boornazian: Was there still a cancer cluster in Toms River?
Having set the parameters of his study, Berry was ready to begin. He used the cancer registry to identify the home address of every child under age twenty who had been diagnosed with cancer between 1979 and 1991 while living in Ocean County. Then he consulted local maps to double-check all of the addresses, making sure that they were classified correctly. Finally, he added up the cases, categorized the cancers, and laid out the results in a table. The one for the town and the core zone looked like this:6
The data table confirmed what Berry already knew: There were precious few cases to work with. There was no point in even trying to analyze fourteen categories—everything but brain/central nervous system tumors, leukemias, and overall cancers, he decided. Even in those three relatively large categories, however, there were still so few cases that even if the totals turned out to be much higher than expected, he might not be able to rule out bad luck as a likely cause, especially if boys and girls were counted separately.
To find out if the local totals really were high, Berry calculated the number of pediatric brain cancers, leukemias, and overall cancers that would be expected in the county, town, and core zone if their rates were identical to the statewide average of all New Jersey children.7 Then he updated his results table, calculating simple ratios that expressed the relationship between observed and expected cases in each category. (Any “incidence ratio” over 1.0 was higher than expected.) Finally, he added the special category he had decided to include for brain/nervous system cancers in children under five, as well as the countywide totals. The new table looked like this:
All it took was a quick glance at the results table for Berry to see that there was nothing typical about these children. In every remaining category, they had more cancer than expected. Just as importantly, all the remaining categories showed the same bull’s-eye pattern: Whatever mystery factor was affecting cancer rates seemed to be strongest in the heart of Toms River. For overall childhood cancer cases, for example, there were 7 percent more cases than expected in Ocean County, 31 percent more in the township, and 49 percent more in the core area. Most disturbingly, the biggest disparity was in the category thought to be the best indicator of a potential environmental problem: a sevenfold excess (three cases instead of the expected 0.4) for brain and nervous system cancers in children under age five in the Toms River core zone.
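The arithmetic behind the ratios in this passage can be sketched in a few lines of Python. The function names are hypothetical, and the only inputs are figures quoted above (three observed cases against an expected 0.4, and the 7, 31, and 49 percent excesses). Berry's full tables and his exact method of computing expected counts are not reproduced here.

```python
def incidence_ratio(observed, expected):
    """Observed cases divided by expected cases; above 1.0 means an excess."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


def percent_excess(ratio):
    """Convert an incidence ratio into 'percent more (or fewer) than expected'."""
    return (ratio - 1.0) * 100.0


# Brain and nervous system cancers, core zone, children under five:
# three observed cases against an expected 0.4 (a rounded figure from the text).
core_under_five = incidence_ratio(3, 0.4)
print(round(core_under_five, 1))  # 7.5, described in the text as a sevenfold excess

# Overall childhood cancer: ratios implied by the quoted percentage excesses.
for area, ratio in [("county", 1.07), ("township", 1.31), ("core", 1.49)]:
    print(area, round(percent_excess(ratio)))
```

Because the expected count of 0.4 is rounded, the ratio of 7.5 is itself only approximate.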
The message in the numbers seemed clear: Linda Gillick was right, and so was Lisa Boornazian. Something unusual really was happening in Toms River—or at least had happened between 1979 and 1991. For Berry, there was only one remaining question, and it was an exceedingly difficult one to answer: Could it all just be due to random variation, to a run of very bad luck? He had attended the cluster buster conference in 1989, and he knew how misleading apparent clusters could be. Now he would need to follow in the footsteps of Karl Pearson and the other biostatisticians who had confronted the same problem by performing tests of statistical significance. Berry needed to know how confident he could be that chance was not the cause of the cancer patterns he had identified in the county, the township, and especially in the heart of Toms River.
The significance test Berry employed was one of the most widely used in epidemiology: a 95 percent confidence interval, very similar to a margin of error in an opinion poll (though not quite the same). Pollsters employ margins of error because the fewer people they poll, the less confident they can be that the results accurately represent the sentiments of the larger population. To account for this uncertainty, statisticians apply a formula—its basics were first worked out by Siméon Poisson in the early nineteenth century—that assesses a poll’s accuracy based on the number of people polled and the size of the larger population those people are supposed to represent. Instead of expressing the results as a single number (“55 percent of voters approve of the president’s performance”), pollsters can apply the formula and express the results as both a number and a range (“55 percent approve, with a margin of error of plus or minus 3 percent”). The wider the margin of error, the less reliable the result. Usually, these ranges are based on a 95 percent confidence level, which means that if the poll were conducted the same way twenty times, the result would, on average, fall within the margin of error every time but once.
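The polling example can be made concrete with the textbook normal-approximation formula for a proportion. This is a sketch under assumptions: the sample size of 1,056 is hypothetical (chosen because it yields roughly the plus-or-minus 3 percent quoted above), the function name is illustrative, and the formula ignores the size of the larger population, which matters only when the sample is a sizable fraction of it.

```python
import math

Z_95 = 1.96  # standard normal multiplier for a 95 percent confidence level


def margin_of_error(share, sample_size, z=Z_95):
    """Approximate margin of error for a polled proportion (simple random sample)."""
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(share * (1.0 - share) / sample_size)


# Hypothetical poll: 55 percent approval among 1,056 respondents.
moe = margin_of_error(0.55, 1056)
print(f"55% approve, plus or minus {moe * 100:.1f} points")

# Quartering the sample (to 264 respondents) doubles the margin of error.
smaller_poll = margin_of_error(0.55, 264)
print(f"smaller poll: plus or minus {smaller_poll * 100:.1f} points")
```

The second call shows the point made above: the fewer people polled, the wider the range.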
Cancer rates fluctuated by chance, just as opinion poll results did. Rates were especially wobbly in small communities and for rare diseases. For childhood brain and nervous system cancers in the Toms River core zone, for example, the incidence ratio was 3.05—three times higher than expected. But if there had been just three fewer cases over the thirteen-year study period—a variance that was quite possible for chance reasons alone—the ratio would have been only 1.25, barely an excess at all. On the other hand, just three more cases would have hiked the incidence ratio all the way up to a truly alarming 5.0. Those random fluctuations were the “noise” that made it so difficult to identify the “signal” of a nonrandom cancer cluster. By calculating 95 percent confidence intervals for each incidence ratio, Berry could assess how confident he could be about his results. The tighter the interval, the more confident he could be. And if the entire interval was above 1.0, then Berry could reasonably conclude that there really was more cancer than expected, for reasons other than random fluctuation. His result, in other words, would be statistically significant. The problem was, for a study of rare cancers in an area as small as the Toms River core, Berry would need to find a staggeringly high excess of cases to avoid an interval that was hopelessly wide and dipped below 1.0.
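One common way to attach a 95 percent confidence interval to an observed-to-expected ratio is the exact Poisson interval, sketched below using only the Python standard library. Whether Berry used exactly this method is an assumption. The function names are hypothetical, and the only case counts used are the three observed and 0.4 expected cases the narrative reports for children under five in the core zone.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for a Poisson variable with mean mu."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def _solve_decreasing(f, lo, hi, iterations=200):
    """Bisection for a function that decreases from f(lo) > 0 to f(hi) < 0."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exact_poisson_limits(observed, alpha=0.05):
    """Exact two-sided limits for the true mean behind an observed count."""
    ceiling = observed + 10.0 * math.sqrt(observed + 1.0) + 10.0
    # Lower limit: the mean at which `observed` or more cases has probability alpha/2.
    if observed == 0:
        lower = 0.0
    else:
        lower = _solve_decreasing(
            lambda mu: poisson_cdf(observed - 1, mu) - (1.0 - alpha / 2.0),
            0.0,
            ceiling,
        )
    # Upper limit: the mean at which `observed` or fewer cases has probability alpha/2.
    upper = _solve_decreasing(
        lambda mu: poisson_cdf(observed, mu) - alpha / 2.0, 0.0, ceiling
    )
    return lower, upper


def ratio_interval(observed, expected, alpha=0.05):
    """Incidence ratio, its confidence interval, and a significance flag."""
    lower, upper = exact_poisson_limits(observed, alpha)
    ratio = observed / expected
    interval = (lower / expected, upper / expected)
    significant_excess = interval[0] > 1.0  # entire interval above 1.0
    return ratio, interval, significant_excess


ratio, (low, high), significant = ratio_interval(3, 0.4)
print(f"ratio {ratio:.1f}, interval {low:.2f} to {high:.2f}, significant: {significant}")
```

Under these assumptions the lower bound lands above 1.0 even though the interval is very wide, which illustrates how a handful of cases can be statistically significant and still leave the size of the excess highly uncertain.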
With all that in mind, Berry calculated 95 percent confidence intervals for each of his categories, taking special note every time that a confidence interval was entirely above 1.0. And then, one last time, he revised his results:
These results were precisely the kind that regularly drove cluster investigators nuts. Every number in the “incidence ratio” column carried the same message: Something was wrong in Toms River. But the numbers in the next column, the one that showed the 95 percent confidence intervals, muddled that message in every possible way. All twelve confidence intervals were wide, especially in the township and the core, where the case numbers were lower. This meant that luck could be having a large influence on those ratios, each of which could easily be much higher or lower than the ratio indicated. And in all but three categories, the lower bound of the confidence interval was below 1.0, which meant that there might not be a problem at all. Could it be, for example, that if the effects of chance were eliminated, the Toms River core might have 50 percent fewer childhood leukemia cases than the statewide average, instead of 80 percent more, as Berry had calculated? Yes, it could, since 0.50 lay within the 95 percent confidence interval. But there was also a plausible chance that the leukemia rate among Toms River children was actually almost five times higher than expected, since the upper bound of the confidence interval was 4.61. The best that Poisson’s mathematics could confidently predict was that the true, nonrandom, actual-to-expected case ratio almost certainly lay somewhere within the gaping chasm of 0.48 to 4.61. (Actually, Poisson—and Berry—could do a little better than that, because not all values in between 0.48 and 4.61 were equally likely to be the true risk. When graphed, confidence intervals form a bell curve that peaks at the calculated ratio—in this case, 1.80. So, if forced to pick just one number, Berry’s best guess would be that leukemia risk for Toms River children in the core zone was 80 percent higher than expected. But he could not be confident about that guess. He could confidently predict only that the true risk for Toms River children lay somewhere between 52 percent lower than expected and 361 percent higher than expected.)
It was a distinctly unhelpful prediction, as if a weather forecaster had studied the radar, measured the temperature, humidity, and wind, and then declared that tomorrow’s weather would be either hot or snowy or—the best guess—something in between. There was no way to know whether to wear a sunhat or a parka.
The New Jersey Department of Health, like most other public health agencies, had already established a clear precedent on how to handle this kind of uncertain finding: Ignore it. A result was not credible, and therefore not worthy of further attention, if its 95 percent confidence interval crossed under 1.0, no matter how slightly. It did not matter that an interval of, say, 0.98 to 7.11 (the range for brain cancers in Toms River children) meant that the cancer rate was far more likely than not to be much higher than expected.
There were good reasons for New Jersey’s conservative approach. For one thing, it helped to minimize the distorting effects of hidden multiple comparisons. There were almost two thousand census tracts in New Jersey, and therefore tens of thousands of groupings of four contiguous tracts that Berry could have studied. But Berry had counted cancer cases in only one such grouping, the four census tracts he designated as the Toms River core. If he had surveyed the entire state, the sample would have been so large that Berry would almost certainly have found many groupings of four contiguous census tracts that had high numbers of cases for no reason other than sheer chance. The 95 percent standard was a rigorous check against such a coincidence. It was a high hurdle, and almost all the confidence intervals Berry calculated failed to clear it.
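The multiple-comparisons worry can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, 10,000 groupings (the text says only "tens of thousands"), each with the same expected count of 0.4 cases and each independent of the others. Real overlapping groupings would violate that independence, so the result is an order-of-magnitude illustration, not an estimate for New Jersey. It asks how many groupings would show three or more cases through chance alone.

```python
import math


def prob_at_least(k, mu):
    """P(X >= k) for a Poisson variable with mean mu."""
    below = sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k))
    return 1.0 - below


EXPECTED_CASES = 0.4  # expected count the narrative reports for one category
GROUPINGS = 10_000    # hypothetical number of four-tract groupings

p_single = prob_at_least(3, EXPECTED_CASES)
chance_hits = GROUPINGS * p_single

print(f"chance of 3+ cases in one grouping: {p_single:.4f}")
print(f"groupings expected to reach 3+ by chance: {chance_hits:.0f}")
```

A result that is rare in any single grouping becomes routine somewhere once enough groupings are examined, which is the reasoning behind the state's strict threshold.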
But was the hurdle too high? The 95 percent significance test was designed for use when just one statistic was being analyzed in isolation from all others. So while there could easily have been other groupings of four census tracts in New Jersey that by chance could have had very high numbers of brain and central nervous system cancers, Berry’s analysis showed that the Toms River core zone was unusual in other important ways, too. It had an unusually high number not only of brain cancer cases but also of leukemias and all childhood cancers combined—and so did the township and the county. In fact, Berry found elevated cancer rates in every single category—all twelve of them—in which there was any hope of a statistically valid result. He had never been involved in a cluster study in which every single category was elevated, and this one was in a town with a notorious history of chemical pollution, as Berry was learning. Finally, three of those categories—brain and nervous system cancers in county children under age twenty, county children under age five, and “core zone” children under five—had managed to clear even the very high hurdle established by the state’s rules and were thus statistically significant. Could all that be just an extremely unlucky series of coincidences?
The only way to find out would be for Berry to go beyond the state’s standard protocol for incidence analyses, something he had never done before for a residential cluster. He would have to investigate, not just calculate.
Over at the chemical factory, the few workers who remained were more worried than ever about cancer, as was the much larger group of retirees. Evidence was accumulating that their worries were well founded, though few employees knew it. For his doctoral dissertation, one of Elizabeth Delzell’s students at the University of Alabama at Birmingham, Fabio Barbone, undertook a more detailed analysis of her 1987 survey of cancer among long-term employees. Barbone completed his work in 1989, concluding that “six or seven” of the eleven cases of malignant central nervous system cancers Delzell found were likely caused by exposures at the factory and that workers in the azo, vat dye, and epichlorohydrin production areas all faced elevated risks. He also concluded that about seventeen of the fifty-one lung cancer cases were probably due to exposure to chlorine, anthraquinone, and epichlorohydrin.8 This time, employees were not briefed on the results of the new analysis, which Barbone did not publish until 1994. Still, relatives of workers who died of cancer knew enough to sue the company. There were three lawsuits in 1995 alone.9 Ciba (the company dropped the “Geigy” in 1995) settled them all out of court for undisclosed sums, without the public airing that would come at a trial.
Ciba was being just as careful in how it handled what was now its most critical issue in Toms River: the massive Superfund cleanup it was about to begin. Everything about the cleanup was gigantic, including the price tag: $165 million and rising. The old dumpsites on the factory property were believed to hold more than fifty thousand intact or crushed drums of hazardous waste and at least one hundred and fifty thousand cubic yards of severely contaminated soil. That was enough chemical-soaked dirt and sand to fill the passenger compartments of one hundred and thirty 747 jumbo jets and enough drummed waste to fill four Olympic-sized swimming pools. But those were just guesses. The truth was, no one knew what had been buried back in the 1950s and 1960s. There were no reliable records, only the memories of longtime employees and the results of preliminary tests that had detected ninety-five industrial chemicals in the soil or groundwater—seventeen of them, including four known carcinogens, at concentrations higher than those permitted under state law.
It would take another decade to assess all the old dumps and figure out how to clean them up, but pumping up the contaminated groundwater was a more straightforward process. It was also much more urgent, since groundwater plumes were still spreading chemicals—and anxiety—beyond the factory grounds, across Cardinal Drive, and into Oak Ridge. The Lynnworths had moved away, but Sheila McVeigh and other residents wanted the cleanup started quickly. Life next to a Superfund site could be disconcerting. One day, McVeigh was sunbathing in her backyard when a truck appeared on the other side of her fence, on Ciba property, and a crew wearing full-body protective “moon suits” got out to check a groundwater well that was just a few dozen feet away from where McVeigh, clad in a swimsuit, was relaxing on a chaise longue. “Stuff like that happened all the time,” she recalled. “It could be a little scary.” By the spring of 1995, an interconnected system of forty-three recovery wells on the factory grounds and in Oak Ridge was sucking up about two million gallons of contaminated groundwater per day and sending it through three miles of piping to the company’s wastewater treatment plant, where the tainted water was treated and then reinjected into the ground elsewhere on the factory property.10
Now that chemical manufacturing had ended, Toms River was a much happier place for Ciba. The company gave storage space in its empty buildings to the Boy Scouts, who used it for their canned food drives for the homeless. Ocean County Citizens for Clean Water, which had started the rebellion against Ciba eleven years earlier, was now its partner, using company funds to monitor the Superfund cleanup. Even the Observer, which had been so scathing in its coverage, now ran stories like “Ex-Pariah Ciba Gets Big Honor for Eco Policy” and editorials headlined “Ciba Success Benefits All.”11 No one in Toms River—Ciba least of all—wanted to think about any latent consequences from forty years of toxic emissions into the air, the sandy soil, and the fragile river.
There were no citizens’ groups monitoring the cleanup at the other Superfund site in town and no newspaper articles chronicling every step of the process. As usual, Reich Farm was ignored, even though it posed a much more direct threat to many more people.
The winter and spring of 1995 were uncharacteristically busy at the old dump, where there had been so little activity the previous twenty-four years. In March, contractors finished digging up and treating more than fourteen thousand cubic yards of tainted soil, at a cost of $3 million to Union Carbide. It was not much in comparison to the planned cleanup at Ciba—enough soil to fill only twelve jumbo jets instead of one hundred and thirty—but it was far more elaborate than the slapdash cleanups of the 1970s at the old egg farm. This time, the EPA-supervised effort took six months and required dumping hundreds of truckloads of excavated dirt into a steel-walled device mounted on a trailer. The device, called a thermal desorber, looked like a giant spider with a smokestack rising from its belly. Dirt was poured in one end of a tubular kiln and came out the other end ten minutes later after being heated to 700 degrees Celsius—hot enough to vaporize chemicals like trichloroethylene and perchloroethylene. The vapors were captured, blown through a series of filters, and then sent up the smokestack.
The thermal desorber did its job but could not solve a fundamental problem: The worst of the pollution had long since left the Reich Farm property. Attempting a soil cleanup at the farm now, almost twenty-four years after the dumping, was like locking your doors after thieves had already taken everything you owned. Having finally mapped the huge swath of tainted groundwater seeping southward beneath Pleasant Plains, Union Carbide discovered that it stretched more than a mile, in a band four hundred feet wide and one hundred and fifty feet deep. At its southern end, the plume lurched eastward and made a beeline for the two closest Parkway wells. Now there could be no more ambiguity about what was happening: The wells, slurping up nearly two million gallons of groundwater every day, had altered the plume’s direction and were sucking up its chemical constituents. If nothing changed, almost every drop of chemical waste in the plume would be drawn into the intake screens of those two wells and then—unless the air stripper removed them—distributed to the people of Toms River. A lot was riding on that solitary air stripper tower at the Parkway well field—too much, according to the EPA.