by Lee McIntyre
As noted, on March 23, 1989, two chemists, B. Stanley Pons and Martin Fleischmann, held a press conference at the University of Utah to announce a startling scientific discovery: they had experimental evidence for the possibility of conducting a room-temperature nuclear fusion reaction, using materials that were available in most freshman chemistry labs. The announcement was shocking. If true, it raised the possibility that commercial adaptations could be found to produce a virtually unlimited supply of energy. Political leaders and journalists were mesmerized, and the announcement got front-page coverage in several media outlets, including the Wall Street Journal. But the press conference was shocking for another reason: Pons and Fleischmann had as yet produced no scientific paper detailing their experiment, thus forestalling the possibility of other scientists checking their work.
To say that this is unusual in science is a vast understatement. Peer review is a bedrock principle of science, because it is the best possible way of catching any errors—and asking all of the appropriate critical questions—before a finding is released. One commentator in this debate, John Huizenga, acknowledges that “the rush to publish goes back to the seventeenth century,” when the Royal Society of London began to give priority to the first to publish, rather than the first to make a discovery.58 The competitive pressures in science are great, and priority disputes for important findings are common. Thus, “in the rush to publish, scientists eliminate important experimental checks of their work and editors compromise the peer-review process, leading to incomplete or even incorrect results.”59 And this seems to be what happened with cold fusion.
At the time of their press conference, Pons and Fleischmann were racing to beat another researcher from a nearby university, who they claimed had pirated their discovery after reading about it during review of their grant application. That researcher, Steven Jones of Brigham Young University, counterclaimed that he had been working on fusion research long before this, and that Pons and Fleischmann’s grant proposal was merely an “impetus” to get back to work and brook no further delay in publishing his own results. Once the administrations at their respective universities (BYU and Utah) became involved, and the subject turned to patents and glory for the institution, things devolved and few behaved admirably. As noted, scientists are human beings and feel the same competitive pressures that any of us would. But Pons and Fleischmann would live to regret their haste.
As it turned out, their experimental evidence was weak. They claimed to have turned up an enormous amount of excess heat by conducting an electrochemical experiment involving palladium, platinum, a mixture of heavy water and lithium, and some electric current. But, if this was a nuclear reaction and not a chemical one, they had a problem: where was the radiation? In the fusion of two deuterium nuclei, it is expected that not only heat but also neutrons (and gamma rays) will be given off. Yet repeatedly, Pons and Fleischmann found no neutrons (or so few as to suggest that there must be some sort of error) with the detector they had borrowed from the physics department. Could there be some other explanation?
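To make that expectation concrete, deuterium-deuterium fusion proceeds by two dominant branches of roughly equal probability (a point of standard nuclear physics, stated here for reference rather than drawn from Pons and Fleischmann’s paper):

D + D → ³He (0.82 MeV) + n (2.45 MeV)
D + D → ³H (1.01 MeV) + p (3.02 MeV)

If the reported excess heat had come from these reactions, the accompanying flux of 2.45 MeV neutrons should have been unmistakable on any working detector, which is why the missing radiation was so hard to explain away.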
One would think that it would be a fairly straightforward process to sort all of this out, but in the earliest days after the press conference, Pons and Fleischmann refused to share their data with other experimenters, who were reduced to “[getting their] experimental details from The Wall Street Journal and other news publications.”60 Needless to say, that is not how science is supposed to work. Replications are not supposed to be based on guesswork about experimental technique or phone calls begging for more information. Nonetheless, in what Gary Taubes calls a “collective derangement of minds,” a number of people in the scientific community got caught up in the hype, and several partial “confirmations” began to pop up around the world.61 Several research groups reported that they too had found excess heat in the palladium experiment (but still no radiation). Researchers at the Georgia Tech Research Institute claimed to have an explanation for the absence of neutrons: boron, an ingredient in the Pyrex glassware that Pons and Fleischmann had used, would have absorbed them. When they tested the result with boron-free glassware they detected some radiation, which disappeared when the detector was screened behind a boron shield. Another group claimed to have a theoretical explanation for the lack of radiation, with helium and heat as the only products of Pons and Fleischmann’s electrolysis reaction.62
Of course, there were also critics. Some nuclear physicists in particular were deeply skeptical of how a process like fusion—which occurs at enormous temperature and pressure in the center of the Sun—could happen at room temperature. Many dismissed these criticisms, however, as representative of the “old” way of thinking about fusion. A psychologist would have a field day with all of the tribalism, bandwagon jumping, and “emperor has no clothes” responses that happened next. Indeed, some have argued that a good deal of the positive response to cold fusion can be explained by Solomon Asch’s experimental work on conformity, in which he found that a nontrivial number of subjects would deny self-evident facts if that was necessary to put them in agreement with their reference group.63 It is important to point out, however, that although all of these psychological effects may occur in science, they are supposed to be washed out by the sort of critical scrutiny that is performed by the community of scientists. So where was this scrutiny? While it is easy to cast aspersions on how slowly things developed, it is important to note that the critical research was being done elsewhere, as other researchers worked under the handicap of inadequate data and too many unanswered questions. And, as some details about Pons and Fleischmann’s work began to leak out, the problems for cold fusion began to mount.
For one, where was the radiation? As noted, some thought they could explain its absence, yet most remained skeptical. Another problem arose as some questioned whether the excess heat could be explained as a mere artifact of some unknown chemical process. On April 10, 1989, a rushed and woefully incomplete paper by Pons and Fleischmann appeared in the Journal of Electroanalytical Chemistry. It was riddled with errors. Why weren’t these caught at peer review? Because the paper was not peer reviewed!64 It was evaluated only by the editor of the journal (who was a friend of Stanley Pons) and rushed into print because of the “great interest” of the scientific community.65 At the numerous scientific conferences that followed, Pons was reluctant to answer detailed questions from colleagues about problems with his work and refused to share raw data. In his defense, one might point out that by this point numerous partial “confirmations” had come in and others were defending his work too, claiming that they had similar results. Even though Pons was well aware of some of the shortcomings of his work, he might still have believed in it.
And yet, as we saw earlier in this chapter, there is a high correlation between failure to share one’s data and analytical error.66 It is unknown whether Pons deliberately tried to keep others from discovering something that he already knew to be a weakness or was merely caught up in the hype surrounding his own theory. Although some have called this fraud—and it may well be—one need not go this far to find an indictment of his behavior. Scientists are expected to participate—or at least cooperate—in the criticism of their own theory, and woe to those who stonewall. Pons is guilty at least of obstructing the work of others, which is an assault on the spirit of the scientific attitude, whereby we should be willing to compare our ideas against the data, whether those data are provided by ourselves or by others.67
The process did not take long to unravel from there. A longer, more carefully written version of Pons and Fleischmann’s experiment was submitted to the journal Nature. While the paper was being peer reviewed, the journal’s editor, John Maddox, published an editorial in which he said:
It is the rare piece of research indeed that both flies in the face of accepted wisdom and is so compellingly correct that its significance is instantly recognized. Authors with unique or bizarre approaches to problems may feel that they have no true peers who can evaluate their work, and feel more conventional reviewers’ mouths are already shaping the word “no” before they give the paper much attention. But it must also be said that most unbelievable claims turn out to be just that, and reviewers can be forgiven for perceiving this quickly.68
A few weeks later, on April 20, Maddox announced that he would not be publishing Pons and Fleischmann’s paper because all three of its peer reviewers had found serious problems with their work. Most serious of these was that Pons and Fleischmann had apparently not done any controls! This in fact was not true. In addition to doing the fusion experiment with heavy water, Pons and Fleischmann had also done an early version of the experiment with light water, and had gotten a similar result.69 Pons kept this fact quiet, however, sharing it only with Chuck Martin of Texas A&M, who had gotten the same result. When they talked, Pons told him that he was now at the “most exciting” part of the research, but that he couldn’t discuss it further because of national defense issues.70 Similarly, other researchers who had “confirmed” the original result were finding that it also worked with carbon, tungsten, and gold.71 But if the controls all “worked,” didn’t this mean that the original result was suspect? Or could it mean that Pons and Fleischmann’s original finding was even broader—and more exciting—than they had thought?
It didn’t take long for the roof to collapse. On May 1—at a meeting of the American Physical Society in Baltimore—Nate Lewis (an electrochemist from Caltech) gave a blistering talk in which he all but accused Pons and Fleischmann of fraud for their shoddy experimental results; he received a standing ovation from the two thousand physicists in attendance. On May 18, 1989 (less than two months after the original Utah press conference), Richard Petrasso et al. (from MIT) published a paper in Nature showing that Pons and Fleischmann’s original results were experimental artifacts and not due to any nuclear fusion reaction. It seems that the full gamma ray spectrum that Pons had kept so quiet about (even though it purported to be their main piece of evidence) had been misinterpreted. From there the story gathered momentum, with even more criticism of the original result, criticism of the “confirmations” of their findings, and a US government panel assigned to investigate. Eventually Pons disappeared and later resigned.
Peer review is one of the most important ways that the scientific community at large exercises oversight of the mistakes and sloppiness of individuals who may be motivated—either consciously or unconsciously—by the sorts of pressures that could cause them to cut corners in their work. Even when it is handicapped or stonewalled, this critical scrutiny is ultimately unstoppable. Science is a public venture, and we must be prepared to present our evidence or face the consequences. In any case, the facts will eventually come out. Some may, of course, bemoan the fact that the whole cold fusion debacle happened at all and see it as an indictment of the process of science. Many critics of science will surely be prepared to do so, with perhaps nothing better than pseudoscience to offer in its place. But I think it is important to remind ourselves that—as we saw with p-hacking—the distinctiveness of science does not lie in the claim that it is perfect. Indeed, error and its discovery are an important part of what moves science forward. More than one commentator in this debate has pointed out that science is self-correcting. When an error occurs, it does not customarily fester. If Pons and Fleischmann had been prepared to share their data and embrace the scientific attitude, the whole episode might have been over within a few days, if it ever got started at all. Yet even when egos, money, and other pressures were at stake, the competitive and critical nature of scientific investigation all but guaranteed that any error would eventually be discovered. So again, rather than blaming science merely because an error happened, perhaps we should celebrate the fact that one of the largest scientific errors of the twentieth century was caught and run to ground less than two months after it was announced, based solely on its lack of fit with the empirical evidence. For all the extraneous factors that complicated matters and slowed things down, this was the scientific attitude at its finest. As John Huizenga puts it:
The whole cold fusion fiasco serves to illustrate how the scientific process works. … Scientists are real people and errors and mistakes do occur in science. These are usually detected either in early discussions of one’s research with colleagues or in the peer-review process. If mistakes escape notice prior to publication, the published work will come under close scrutiny by other scientists, especially if it disagrees with an established body of data. … The scientific process is self-corrective.72
Data Sharing and Replication
As we have just seen, failure to share data is bad. When we sign on as scientists, we are implicitly agreeing to a level of intellectual honesty whereby we will cooperate with others who seek to check our results for accuracy. In some cases (such as the journals sponsored by the American Psychological Association), the publishing agreement actually requires one to sign a pledge to share data with other researchers who request it. Beyond peer review, this introduces a further standard: the expectation that scientific findings should be replicable.
As Trivers indicates in his review of Wicherts’s findings, this does not mean that data sharing always occurs. As human beings, we sometimes shirk what is expected of us. That shirking, however, does not change the community’s expectations.73 The scientific attitude requires us to respect the refutatory power of evidence; part of this is being willing to cooperate with others who seek to scrutinize or even refute our hard-won theories. This is why it is so important for this standard to be enforced by the scientific community (and for violations to be publicized when they occur).
It should be noted, though, that there is a distinction between refusing to share data and doing irreproducible work. One of the motivations for refusing to share data is presumably fear that someone else will not be able to confirm our results. While this does not necessarily mean that our work is fraudulent, it is certainly an embarrassment. It indicates that something must be wrong. As we’ve seen, that might fall into the camp of quantitative error, faulty analysis, faulty method, bad data collection, or a host of other missteps that could be described as sloppy or lazy. Or it might reveal fraud. In any case, none of this is good news for the individual researcher, so some researchers hold back their data. Trivers rightly excoriates this practice, saying that such researchers are “a parody of academics.” And, as Wicherts demonstrates, there is a troubling correlation between refusing to share data and a higher likelihood of quantitative error in a published study.74 One imagines that, if the actual data were available, other problems might pop out as well.
Sometimes they do. The first step is when a second researcher uses the same data and follows the same methods as the original researcher, but the results do not hold up. This is when a study is said to be irreproducible. The second step is when it becomes apparent—if it does—why the original work was irreproducible. If it is because of fabricated data, a career is probably over. If it is because of bad technique, a reputation is on the line. Yet it is important to understand here that failure to replicate a study is not necessarily a bad thing for the profession at large. This is how science learns.75 This is probably in part what is meant by saying that science is self-correcting. A scientist makes a mistake, another scientist catches it, and a lesson is learned. If the lesson is about the object of inquiry—rather than the ethical standards of the original researcher—further work is possible. Perhaps even enlightenment. When we think that we know something but really don’t, it is a barrier to further discovery. But when we find that something we thought was true really wasn’t, a breakthrough may be ahead. We have the opportunity to learn more about how things work. Thus failure can be valuable to science.76
The most important thing about science is that we try to find failure. The real danger to science comes not from mistakes but from deception. Mistakes can be corrected and learned from; deception is often used to cover up mistakes. Fabricated data may be built upon by scores of other researchers before it is caught. When individual researchers put their own careers ahead of the production of knowledge, they are not only cheating on the ideals of the scientific attitude, they are cheating their professional colleagues as well. Lying, manipulation, and tampering are the very definition of having the wrong attitude about empirical evidence.
But again, we must make every effort to draw a distinction between irreproducible studies and fraud. It is helpful here to remember our logic. Fraudulent results are almost always irreproducible, but this does not mean that all or even most irreproducible studies are the result of fraud. The crime against the scientific attitude is of course much greater in cases of fraud. Yet we can also learn something about the scientific attitude by focusing on some of the less egregious reasons for irreproducible studies. Even if the reason a study fails to replicate has nothing whatsoever to do with fraud, scrutiny by the larger scientific community will probably uncover it. Merely by having a standard of data sharing and reproducibility, we are embracing the scientific attitude.77 Whether an irreproducible study results from intentional or unintentional mistakes, we can count on the same mechanism to catch them.
One might think, therefore, that most scientific studies are replicated. That at least half of all scientific work must involve efforts to reproduce others’ findings so that we can be sure they are right before the field moves forward. This would be false. Most journals do not welcome publication of studies that merely replicate someone else’s results. It is perhaps more interesting if a study fails to replicate someone else’s result, but here too there is some risk. The greater glory in scientific investigation comes from producing one’s own original findings, not merely checking someone else’s, whether they can be replicated or not. Perhaps this is why replication is even attempted for only a small percentage of scientific work.78 One might argue that this is appropriate because the standards of peer review are so high that one would not expect most studies to need to be repeated. With all that scrutiny, it must be a rare event to find a study that is irreproducible. But this assumption too has been challenged in recent years. In fact, the media in the last few years have been full of stories about the “reproducibility crisis” in science, especially in psychology, where—when someone bothered to look—researchers found that 64 percent of studies were irreproducible!79