Leaping into the pamphlet wars again, Bayes published a piece defending and explaining Newton’s calculus. This was his only mathematical publication during his lifetime. Shortly thereafter, in 1742, five men, including a close friend of Newton’s, nominated Bayes for membership in the Royal Society. His nomination avoided any hint of controversy and described him as “a Gentleman of known merit, well skilled in Geometry and all parts of Mathematical and Philosophical Learning.” The Royal Society was not the professional organization it is today; it was a private organization of dues-paying amateurs from the landed gentry. But it played a vital role because amateurs would produce some of the era’s breakthroughs.
About this time Bayes joined a second group of up-to-date amateur mathematicians. He had moved to a small congregation in a fashionable resort, the cold-water spa Tunbridge Wells. As an independently wealthy bachelor—his family had made a fortune manufacturing Sheffield steel cutlery—he rented rooms, apparently from a Dissenting family. His religious duties—one Sunday sermon a week—were light. And spa etiquette permitted Dissenters, Jews, Roman Catholics, and even foreigners to mix with English society, even with wealthy earls, as they could not elsewhere.
A frequent visitor to Tunbridge Wells, Philip, the Second Earl of Stanhope, had been passionately interested in mathematics since childhood, but his guardian had banned its study as insufficiently genteel. When Stanhope was 20 and free to do as he liked, he seldom raised his eyes from Euclid. According to the bluestocking Elizabeth Montagu, Stanhope was “always making mathematical scratches in his pocket-book, so that one half the people took him for a conjuror, and the other half for a fool.” Because of either his aristocratic position or his late start, Stanhope never published anything of his own. Instead, he became England’s foremost patron of mathematicians.
The earl and the Royal Society’s energetic secretary, John Canton, operated an informal network of peer reviewers who critiqued one another’s work. At some point Bayes joined the network. One day, for example, Stanhope sent Bayes a copy of a draft paper by a mathematician named Patrick Murdoch. Bayes disagreed with some of it and sent his comments back to Stanhope, who forwarded them to Murdoch, who in turn replied through Stanhope, and so on around and around. The relationship between the young earl and the older Reverend Bayes seems to have ripened into friendship, however, because Stanhope paid Bayes at least one personal visit at Tunbridge Wells, saved two bundles of his mathematical papers in the Stanhope estate’s library, and even subscribed to his series of sermons.
Another incendiary mix of religion and mathematics exploded over England in 1748, when the Scottish philosopher David Hume published an essay attacking some of Christianity’s fundamental narratives. Hume believed that we can’t be absolutely certain about anything that is based only on traditional beliefs, testimony, habitual relationships, or cause and effect. In short, we can rely only on what we learn from experience.
Because God was regarded as the First Cause of everything, Hume’s skepticism about cause-and-effect relationships was especially unsettling. Hume argued that all we ever observe is that certain objects are constantly associated with each other; the fact that umbrellas and rain appear together does not mean that umbrellas cause rain. The fact that the sun has risen thousands of times does not guarantee that it will do so the next day. And, most important, the “design of the world” does not prove the existence of a creator, an ultimate cause. Because we can seldom be certain that a particular cause will have a particular effect, we must be content with finding only probable causes and probable effects. In criticizing concepts about cause and effect, Hume was undermining Christianity’s core beliefs.
Hume’s essay was nonmathematical, but it had profound scientific implications. Many mathematicians and scientists believed fervently that natural laws did indeed prove the existence of God, their First Cause. As the eminent mathematician Abraham de Moivre wrote in his influential book The Doctrine of Chances, calculations about natural events would eventually reveal the underlying order of the universe and its exquisite “Wisdom and Design.”
With Hume’s doubts about cause and effect swirling about, Bayes began to consider ways to treat the issue mathematically. Today, probability, the mathematics of uncertainty, would be the obvious tool, but during the early 1700s probability barely existed. Its only extensive application was to gambling, where it dealt with such basic issues as the odds of getting four aces in one poker hand. De Moivre, who had spent several years in French prisons because he was a Protestant, had already solved that problem by working from cause to effect. But no one had figured out how to turn his work around backward to ask the so-called inverse question from effect to cause: what if a poker player deals himself four aces in each of three consecutive hands? What is the underlying chance (or cause) that his deck is loaded?
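To see the asymmetry concretely, the forward calculation is elementary combinatorics, while the inverse one is not. The arithmetic below is a modern sketch, not de Moivre’s own working, and the three-hands figure assumes independently shuffled deals:

```python
from math import comb

# Forward ("cause to effect"): the chance that a fair, well-shuffled 52-card
# deck yields a five-card hand containing all four aces.
p_four_aces = comb(48, 1) / comb(52, 5)          # choose the one non-ace card
print(f"four aces in one hand:         about 1 in {1 / p_four_aces:,.0f}")

# Naively extended to three consecutive, independently shuffled hands.
print(f"four aces in three hands running: about 1 in {1 / p_four_aces**3:,.0f}")

# The inverse question -- given that this actually happened, how probable is
# it that the deck is loaded? -- cannot be answered from these numbers alone;
# it also needs a prior probability that the dealer is cheating, which is the
# ingredient Bayes went on to supply.
```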
We don’t know precisely what piqued Bayes’ interest in the inverse probability problem. He had read de Moivre’s book, and the Earl of Stanhope was interested in probability as it applied to gambling. Alternatively, it could have been the broad issues raised by Newton’s theory of gravitation. Newton, who had died 20 years before, had stressed the importance of relying on observations, developed his theory of gravitation to explain them, and then used his theory to predict new observations. But Newton had not explained the cause of gravity or wrestled with the problem of how true his theory might be. Finally, Bayes’ interest may have been stimulated by Hume’s philosophical essay. In any event, problems involving cause and effect and uncertainty filled the air, and Bayes set out to deal with them quantitatively.
Crystallizing the essence of the inverse probability problem in his mind, Bayes decided that his goal was to learn the approximate probability of a future event he knew nothing about except its past, that is, the number of times it had occurred or failed to occur. To quantify the problem, he needed a number, and sometime between 1746 and 1749 he hit on an ingenious solution. As a starting point he would simply invent a number—he called it a guess—and refine it later as he gathered more information.
Next, he devised a thought experiment, a 1700s version of a computer simulation. Stripping the problem to its basics, Bayes imagined a square table so level that a ball thrown on it would have the same chance of landing on one spot as on any other. Subsequent generations would call his construction a billiard table, but as a Dissenting minister Bayes would have disapproved of such games, and his experiment did not involve balls bouncing off table edges or colliding with one another. As he envisioned it, a ball rolled randomly on the table could stop with equal probability anywhere.
We can imagine him sitting with his back to the table so he cannot see anything on it. On a piece of paper he draws a square to represent the surface of the table. He begins by having an associate toss an imaginary cue ball onto the pretend tabletop. Because his back is turned, Bayes does not know where the cue ball has landed.
Next, we picture him asking his colleague to throw a second ball onto the table and report whether it landed to the right or left of the cue ball. If to the left, Bayes realizes that the cue ball is more likely to sit toward the right side of the table. Again Bayes’ friend throws the ball and reports only whether it lands to the right or left of the cue ball. If to the right, Bayes realizes that the cue ball can’t be on the far right-hand edge of the table.
He asks his colleague to make throw after throw after throw; gamblers and mathematicians already knew that the more times they tossed a coin, the more trustworthy their conclusions would be. What Bayes discovered is that, as more and more balls were thrown, each new piece of information made his imaginary cue ball wobble back and forth within a more limited area.
As an extreme case, if all the subsequent tosses fell to the right of the first ball, Bayes would have to conclude that it probably sat on the far left-hand margin of his table. By contrast, if all the tosses landed to the left of the first ball, it probably sat on the far right. Eventually, given enough tosses of the ball, Bayes could narrow the range of places where the cue ball was apt to be.
Bayes’ genius was to take the idea of narrowing down the range of positions for the cue ball and—based on this meager information—infer that it had landed somewhere between two bounds. This approach could not produce an exact answer. Bayes could never know precisely where the cue ball landed, but he could tell with increasing confidence that it was most probably within a particular range. Bayes’ simple, limited system thus moved from observations about the world back to their probable origin or cause. Using his knowledge of the present (the left and right positions of the tossed balls), Bayes had figured out how to say something about the past (the position of the first ball). He could even judge how confident he could be about his conclusion.
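A short simulation makes the narrowing concrete. This is a modern sketch of the thought experiment, not Bayes’ own geometric construction; the grid resolution, random seed, and number of throws are arbitrary illustrative choices:

```python
import random

# The cue ball's unknown position x is measured as a fraction of the table's
# width from the left edge. Each later throw lands to the left of the cue ball
# with probability x, so counting left/right reports narrows x down.
random.seed(1)
x_true = random.random()        # where the cue ball actually landed (hidden)
n_throws = 50
lefts = sum(random.random() < x_true for _ in range(n_throws))

# Discretize the table into a grid, start from a uniform ("equal probability")
# prior, and weight each candidate position by the binomial likelihood of
# seeing `lefts` throws to the left out of `n_throws`.
grid = [i / 1000 for i in range(1, 1000)]
prior = [1.0] * len(grid)
likelihood = [x**lefts * (1 - x)**(n_throws - lefts) for x in grid]
unnormalized = [p * l for p, l in zip(prior, likelihood)]
total = sum(unnormalized)
posterior = [u / total for u in unnormalized]

# Read off a rough 95% credible interval for the cue ball's position.
cum, lo, hi = 0.0, None, None
for x, p in zip(grid, posterior):
    cum += p
    if lo is None and cum >= 0.025:
        lo = x
    if hi is None and cum >= 0.975:
        hi = x
print(f"true position {x_true:.3f}; probably between {lo:.3f} and {hi:.3f}")
```

Rerunning the sketch with more throws shrinks the printed interval around the true position, which is the wobble “within a more limited area” described above.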
Conceptually, Bayes’ system was simple. We modify our opinions with objective information: Initial Beliefs (our guess where the cue ball landed) + Recent Objective Data (whether the most recent ball landed to the left or right of our original guess) = A New and Improved Belief. Eventually, names were assigned to each part of his method: Prior for the probability of the initial belief; Likelihood for the probability of the new, objective data under each competing hypothesis; and Posterior for the probability of the newly revised belief. Each time the system is recalculated, the posterior becomes the prior of the new iteration. It was an evolving system, which each new bit of information pushed closer and closer to certitude. In short:
Prior times likelihood is proportional to the posterior.
(In the more technical language of the statistician, the likelihood is the probability of competing hypotheses for the fixed data that have been observed. However, Andrew Dale, a South African historian of statistics, simplified the matter considerably when he observed, “Put somewhat rudely, the likelihood is what remains of Bayes’s Theorem once the prior is removed from the discussion.”)2
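In modern symbols (an anachronism, since Bayes argued geometrically rather than with this notation), the proportionality reads:

```latex
P(\text{hypothesis} \mid \text{data}) \;\propto\; P(\text{data} \mid \text{hypothesis}) \times P(\text{hypothesis})
```

Applied to the table experiment: if the cue ball sits at a fraction x of the table’s width from the left edge and k of n later throws are reported to its left, then, starting from a uniform prior, the posterior for x is proportional to x^k(1-x)^(n-k).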
As a special case about balls thrown randomly onto a flat table, Bayes’ rule is uncontroversial. But Bayes wanted to cover every case involving uncertainty, even cases where nothing whatsoever was known about their history—in his words, where we “absolutely know nothing antecedently to any trials.”3 This expansion of his table experiment to cover any uncertain situation would trigger 150 years of misunderstanding and bitter attacks.
Two especially popular targets for attack were Bayes’ guesswork and his suggested shortcut.
First, Bayes guessed the likely value of his initial belief (the cue ball’s position, later known as the prior). In his own words, he decided to make “a guess whereabouts it’s [sic] probability is, and . . . [then] see the chance that the guess is right.” Future critics would be horrified at the idea of using a mere hunch—a subjective belief—in objective and rigorous mathematics.
Even worse, Bayes added that if he did not know enough to distinguish the position of the balls on his table, he would assume they were equally likely to fall anywhere on it. Assuming equal probabilities was a pragmatic approach for dealing with uncertain circumstances. The practice was rooted in traditional Christianity and the Roman Catholic Church’s ban on usury. In uncertain situations such as annuities or marine insurance policies, all parties were assigned equal shares and divided profits equally. Even prominent mathematicians assigned equal probabilities to gambling odds by assuming, with a remarkable lack of realism, that all tennis players or fighting cocks were equally skillful.
In time, the practice of assigning equal probabilities acquired a number of names, including equal priors, equal a priori’s, equiprobability, uniform distribution probability, and the law of insufficient reason (meaning that without enough data to assign specific probabilities, equal ones would suffice). Despite their venerable history, equal probabilities would become a lightning rod for complaints that Bayes was quantifying ignorance.
Today, some historians try to absolve him by saying he may have applied equal probabilities to his data (the subsequent throws) rather than to the initial, so-called prior toss. But this is also guesswork. And for many working statisticians, the question is irrelevant because, in the tightly circumscribed case of balls that can roll anywhere on a carefully leveled surface, both approaches produce the same mathematical results.
Whatever Bayes meant, the damage was done. For years to come, the message seemed clear: priors be damned. At this point, Bayes ended his discussion.
He may have mentioned his discovery to others. In 1749 someone told a physician named David Hartley something that sounds suspiciously like Bayes’ rule. Hartley was a Royal Society member who believed in cause-and-effect relationships. In 1749 he wrote that “an ingenious friend has communicated to me a Solution of the inverse problem . . . which shews that we may hope to determine the Proportions, and by degrees, the whole Nature, of unknown Causes, by a sufficient Observation of their Effects.” Who was this ingenious friend? Modern-day sleuths have suggested Bayes or Stanhope, and in 1999 Stephen M. Stigler of the University of Chicago suggested that Nicholas Saunderson, a blind Cambridge mathematician, made the discovery instead of Bayes. No matter who talked about it, it seems highly unlikely that anyone other than Bayes made the breakthrough. Hartley used terminology that is almost identical to Bayes’ published essay, and no one who read the article between its publication in 1764 and 1999 doubted Bayes’ authorship. If there had been any question about the author’s identity, it is hard to imagine Bayes’ editor or his publisher not saying something publicly. Thirty years later Price was still referring to the work as that of Thomas Bayes.
Although Bayes’ idea was discussed in Royal Society circles, he himself seems not to have believed in it. Instead of sending it off to the Royal Society for publication, he buried it among his papers, where it sat for roughly a decade. Only because he filed it between memoranda dated 1746 and 1749 can we conclude that he achieved his breakthrough sometime during the late 1740s, perhaps shortly after the publication of Hume’s essay in 1748.
Bayes’ reason for suppressing his essay can hardly have been fear of controversy; he had plunged twice into Britain’s pamphlet wars. Perhaps he thought his discovery was useless; but if a pious clergyman like Bayes thought his work could prove the existence of God, surely he would have published it. Some thought Bayes was too modest. Others wondered whether he was unsure about his mathematics. Whatever the reason, Bayes made an important contribution to a significant problem—and suppressed it. It was the first of several times that “Bayes’ rule” would spring to life only to disappear again from view.
Bayes’ discovery was still gathering dust when he died in 1761. At that point relatives asked Bayes’ young friend Richard Price to examine Bayes’ mathematical papers.
Price, another Presbyterian minister and amateur mathematician, achieved fame later as an advocate of civil liberties and of the American and French revolutions. His admirers included the Continental Congress, which asked him to emigrate and manage its finances; Benjamin Franklin, who nominated him for the Royal Society; Thomas Jefferson, who asked him to write to Virginia’s youths about the evils of slavery; John Adams and the feminist Mary Wollstonecraft, who attended his church; the prison reformer John Howard, who was his best friend; and Joseph Priestley, the discoverer of oxygen, who said, “I question whether Dr. Price ever had a superior.” When Yale University conferred two honorary degrees in 1781, it gave one to George Washington and the other to Price. An English magazine thought Price would go down in American history beside Franklin, Washington, Lafayette, and Paine. Yet today Price is known primarily for the help he gave his friend Bayes.
Sorting through Bayes’ papers, Price found “an imperfect solution of one of the most difficult problems in the doctrine of chances.” It was Bayes’ essay on the probability of causes, on moving from observations about the real world back to their most probable cause.
At first Price saw no reason to devote much time to the essay. Mathematical infelicities and imperfections marred the manuscript, and it looked impractical. Its continual iterations—throwing the ball over and over again and recalculating the formula each time—produced large
numbers that would be difficult to calculate.
But once Price decided Bayes’ essay was the answer to Hume’s attack on causation, he began preparing it for publication. Devoting “a good deal of labour” to it on and off for almost two years, he added missing references and citations and deleted extraneous background details in Bayes’ derivations. Lamentably, he also threw out his friend’s introduction, so we’ll never know precisely how much the edited essay reflects Bayes’ own thinking.
In a cover letter to the Royal Society, Price supplied a religious reason for publishing the essay. In moving mathematically from observations of the natural world inversely back to its ultimate cause, the theorem aimed to show that “the world must be the effect of the wisdom and power of an intelligent cause; and thus to confirm . . . from final causes . . . the existence of the Deity.” Bayes himself was more reticent; his part of the essay does not mention God.
A year later the Royal Society’s Philosophical Transactions published “An Essay toward solving a Problem in the Doctrine of Chances.” The title avoided religious controversy by highlighting the method’s gambling applications. Critiquing Hume a few years later, Price used Bayes’ method for the first and only time. As far as we know, no one else mentioned the essay for the next 17 years, until Price again brought Bayes’ rule to light.
By modern standards, we should refer to the Bayes-Price rule. Price discovered Bayes’ work, recognized its importance, corrected it, contributed to the article, and found a use for it. The modern convention of employing Bayes’ name alone is unfair but so entrenched that anything else makes little sense.