The Theory That Would Not Die
Bailey was an insurance actuary whose father had been fired and black-balled by every bank in Boston for telling his employers they should not be lending large sums of money to local politicians. So ostracized was the family that even Arthur’s schoolmates stopped inviting him and his sister to parties. Turning his back on the New England establishment, Bailey enrolled at the University of Michigan in Ann Arbor. There he studied statistics in the mathematics department’s actuarial program, earned a bachelor of science degree in 1928, and met his wife, Helen, who became an actuary for John Hancock Mutual Life before their children were born.1
Bailey’s first job was, he liked to say, “in bananas,” that is, in the statistics department of the United Fruit Company headquarters in Boston. When the department was eliminated during the Depression, Bailey wound up driving a fruit truck and chasing escaped tarantulas down Boston streets. He was lucky to have the job, and his family never lacked for bananas and oranges.
In 1937, after nine years in bananas, Bailey got a job in an unrelated field in New York City. There he was in charge of setting premium rates to cover risks involving automobiles, aircraft, manufacturing, burglary, and theft for the American Mutual Alliance, a consortium of mutual insurance companies.
Preferring church and community connections to the fair-weather friends of his youth, Bailey hid his growing professional success by living quietly in unpretentious New York suburbs. He relaxed by gardening, hiking with his four children, and annotating a copy of Gray’s Botany with the locations of his favorite wild orchids. His motto was, “Some people live in the past, some people live in the future, but the wisest ones live in the present.”
Settling into his new job, Bailey was horrified to see “hard-shelled underwriters” using the semi-empirical, “sledge hammer” Bayesian techniques developed in 1918 for workers’ compensation insurance.2 University statisticians had long since virtually outlawed those methods, but as practical business people, actuaries refused to discard their prior knowledge and continued to modify their old data with new. Thus they based next year’s premiums on this year’s rates as refined and modified with new claims information. They did not ask what the new rates should be. Instead, they asked, “How much should the present rates be changed?” A Bayesian estimating how much ice cream someone would eat in the coming year, for example, would combine data about the individual’s recent ice cream consumption with other information, such as national dessert trends.
As a modern statistical sophisticate, Bailey was scandalized. His professors, influenced by Ronald Fisher and Jerzy Neyman, had taught him that Bayesian priors were “more horrid than ‘spit,’” in the words of a particularly polite actuary.3 Statisticians should have no prior opinions about their next experiments or observations and should employ only directly relevant observations while rejecting peripheral, nonstatistical information. No standard methods even existed for evaluating the credibility of prior knowledge (about previous rates, for example) or for correlating it with additional statistical information.
Bailey spent his first year in New York trying to prove to himself that “all of the fancy actuarial [Bayesian] procedures of the casualty business were mathematically unsound.”4 After a year of intense mental struggle, however, he realized to his consternation that actuarial sledgehammering worked. He even preferred it to the elegance of frequentism. He positively liked formulae that described “actual data. . . . I realized that the hard-shelled underwriters were recognizing certain facts of life neglected by the statistical theorists.”5 He wanted to give more weight to a large volume of data than to the frequentists’ small sample; doing so felt surprisingly “logical and reasonable.” He concluded that only a “suicidal” actuary would use Fisher’s method of maximum likelihood, which assigned a zero probability to nonevents.6 Since many businesses file no insurance claims at all, Fisher’s method would produce premiums too low to cover future losses.
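A small numerical sketch may make the zero-claims problem concrete. The figures are hypothetical, and Laplace's rule of succession (a uniform prior on the claim probability) stands in here as one simple Bayesian estimate; it is an illustration, not Bailey's own calculation.

```python
from fractions import Fraction


def mle_claim_probability(claims: int, policy_years: int) -> Fraction:
    """Maximum-likelihood estimate: the observed relative frequency."""
    return Fraction(claims, policy_years)


def laplace_claim_probability(claims: int, policy_years: int) -> Fraction:
    """Laplace's rule of succession: posterior mean under a uniform prior."""
    return Fraction(claims + 1, policy_years + 2)


# Hypothetical business: ten policy-years on record, no claims filed.
claims, policy_years = 0, 10

# The relative frequency is zero, so a premium based on it alone is zero.
print("maximum likelihood:", mle_claim_probability(claims, policy_years))

# The Bayesian estimate stays above zero despite the clean record.
print("rule of succession:", laplace_claim_probability(claims, policy_years))
```

With no claims on record, the frequency-based estimate prices the risk at nothing, while the estimate that starts from a prior leaves room for a first claim.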
Abandoning his initial suspicions of Bayes’ rule, Bailey spent the Second World War studying the problem. He worked alone, isolated from academic thinkers and from his actuarial colleagues, who scratched their heads at Bailey’s brilliance.
After the war, in 1947, Bailey moved to the New York State Insurance Department as the regulatory agency’s chief actuary. An insurance executive called him “the keeper of our consciences.” As his colleagues boozed in hotel bars at conferences, Bailey sipped soft drinks and quoted occasionally from the Bible. During slack times, he read it. Some actuaries said “all manner of nasty things about Arthur Bailey,” the executive continued, “but we learned to respect his integrity and stature from knowing him in the after-hours.”7
Bailey began writing an article summarizing his tumultuous change in attitude toward Bayes’ rule. Although his old-fashioned notation was difficult to understand, he was building a mathematical foundation to justify the use of current rates as the priors in Bayes’ theorem. He started his paper with a biblical justification for using prior beliefs: “If thou canst believe,” he wrote, quoting the apostle Mark, “all things are possible to him that believeth.” Then, reviewing Albert Whitney’s mathematics for workers’ compensation, Bailey affirmed the Bayesian roots of the Credibility theory developed for workers’ compensation insurance years before. Credibility was central to actuarial thought, and while relative frequencies were relevant, so were other kinds of information. Bailey worked out mathematical methods for melding every scrap of available information into the initial body of data. He particularly tried to understand how to assign partial weights to supplementary evidence according to its credibility, that is, its subjective believability. His mathematical techniques would help actuaries systematically and consistently integrate thousands of old and new rates for different kinds of employers, activities, and locales. His working library included a 1940 reprint of Bayes’ articles with a preface by Bell Telephone’s Edward Molina. Like Molina, Bailey used Laplace’s more complex and precise system instead of Thomas Bayes’.
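The idea of giving new evidence a partial weight can be sketched with the credibility formula as it is commonly taught today: a factor Z = n / (n + k) between 0 and 1 blends the observed experience with the current rate. The notation, the constant k, and every number below are assumptions chosen for illustration, not figures or symbols taken from Bailey's paper.

```python
def credibility_factor(n: float, k: float) -> float:
    """Weight Z given to observed data: 0 with no exposure, approaching 1 as n grows."""
    if n < 0 or k <= 0:
        raise ValueError("n must be non-negative and k must be positive")
    return n / (n + k)


def credibility_estimate(observed: float, prior: float, n: float, k: float) -> float:
    """Blend observed experience with the prior (current) rate using weight Z."""
    z = credibility_factor(n, k)
    return z * observed + (1 - z) * prior


# Hypothetical inputs: the current rate acts as the prior, the new claims
# experience is the observation, and k controls how quickly data takes over.
current_rate = 100.0
observed_loss = 160.0
k = 50.0

for exposure in (0, 50, 450):
    z = credibility_factor(exposure, k)
    new_rate = credibility_estimate(observed_loss, current_rate, exposure, k)
    print(f"exposure={exposure:>3}  Z={z:.2f}  new rate={new_rate:.2f}")
```

With no exposure the rate stays where it was; as exposure grows, the estimate moves toward the observed experience. This mirrors the actuaries' question of how much the present rates should be changed.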
By 1950 Bailey was a vice president of the Kemper Insurance Group in Chicago and a frequent after-dinner speaker at black-tie banquets of the Casualty Actuarial Society. He read his most famous paper on May 22, 1950. Its title explained a lot: “Credibility Procedures: Laplace’s Generalization of Bayes’ Rule and the Combination of Collateral [that is, prior] Knowledge with Observed Data.”
For actuaries who could concentrate on a long scholarly paper after a heavy (and no doubt alcoholic) meal, Bailey’s message must have been thrilling. First, he praised his colleagues for standing almost alone against the statistical establishment and for staging the only organized revolt against the frequentists’ sampling philosophy. Insurance statisticians marched “a step ahead” of others. Actuarial practice was an obscure and profound mystery, and it went “beyond anything that has been proven mathematically.” But, he declared triumphantly, “it works. . . . They have made this demonstration many times. It does work!”8
Then he announced the startling news that their beloved Credibility formula was derived from Bayes’ theorem. Practical actuaries had thought of Bayes as an abstract, temporal solution treating time sequences of priors and posteriors. But Bailey reminded his colleagues that Bayes’ friend and editor, Richard Price, would today be considered an actuary. And he turned Bayes’ imaginary table into a frontal attack on frequentists and the contentious Fisher. He concluded with a rousing call to reinstate prior knowledge in statistical theory. His challenge would occupy academic theorists for years. It was a fighting speech. Reading it later, Professor Richard von Mises of Harvard praised it wholeheartedly. Von Mises wrote Bailey that he hoped his speech would make “the unjustified and unreasonable attacks on the Bayes theory, initiated by R. A. Fisher, fade out.”9
Unfortunately, Bailey did not live long to campaign for Bayes’ rule. Four years after giving his most important speech, he suffered a heart attack at the age of 49 and died on August 12, 1954. His son blamed the fact that Bailey had started smoking in college and been unable to stop.
Still, a few practicing actuaries understood his message. The year of Bailey’s death, one of his admirers was sipping a martini at the Insurance Company of North America’s Christmas party when INA’s chief executive officer, dressed as Santa Claus, asked an unthinkable question: Could anyone predict the probability of two planes colliding in midair?
Santa was asking his chief actuary, L. H. Longley-Cook, to make a prediction based on no experience at all. There had never been a serious midair collision of commercial planes. Without any past experience or repetitive experimentation, any orthodox statistician had to answer Santa’s question with a resounding no. But the very British Longley-Cook stalled for time. “I really don’t like these things mixed with martinis,” he drawled. Nevertheless, the question gnawed at him. Within a year more Americans would be traveling by air than by railroad. Meanwhile, some statisticians were wondering if they could avoid using the ever-controversial subjective priors by making predictions based on no prior information at all.
Longley-Cook spent the holidays mulling over the problem, and on January 6, 1955, he sent the CEO a prescient warning. Despite the industry’s safety record, the available data on airline accidents in general made him expect “anything from 0 to 4 air carrier-to-air carrier collisions over the next ten years.” In short, the company should prepare for a costly catastrophe by raising premium rates for air carriers and purchasing reinsurance. Two years later his prediction proved correct. A DC-7 and a Constellation collided over the Grand Canyon, killing 128 people in what was then commercial aviation’s worst accident. Four years after that, a DC-8 jet and a Constellation collided over New York City, killing 133 people in the planes and in the apartments below.10
Later, Arthur Bailey’s son, Robert A. Bailey, used Bayesian techniques to justify offering merit rates to good drivers. Motor vehicle casualty rates soared so high during the 1960s that half the Americans then alive could expect to be injured in a car accident during their lifetimes. Americans were buying more cars and driving more miles, but laws had not kept pace. There was no uniform road signage; most drivers and vehicles were tested or inspected only once in their lifetimes, if at all; penalties for drunk driving were light; and cars were designed without safety in mind. Insurers suffered heavy losses. A direct, up-front system to reward good drivers was needed, but merit rating was regarded as unsound because a single car had inadequate credibility. Using Bayes’ rule, Robert Bailey and Leroy J. Simon showed that relevant data from Canada’s safe-driving discounts could be used to update existing U.S. statistics.
Robert Bailey also used Bayesian procedures to rate insurance companies themselves by incorporating nonstatistical, subjective information such as opinions about a company’s ownership, including the quality and drinking habits of its managers. In time, the insurance industry accumulated such enormous amounts of data that Bayes’ rule, like the slide rule, became obsolete.
To the few actuaries who understood Arthur Bailey’s work, he was a da Vinci or a Michelangelo: he had led their profession out of its dark ages.11 News of his achievement percolated slowly and haphazardly to university theorists. During the early 1960s an actuarial professor at the University of Michigan, Allen L. Mayerson, wrote about Bailey’s seminal role in Credibility theory. Professor of statistics Jimmie Savage, a new convert to Bayesian methods, was working in Ann Arbor at the time and later visited Bruno de Finetti, the Bayesian actuarial professor, at his vacation home on an island off Italy. The two attended a conference together in Trieste, where the Italian spread the word about Bailey and the Bayesian origin of insurance Credibility. It was the first time most statisticians had heard of him.
Hans Bühlmann, who became a mathematics professor and president of ETH Zurich, remembers the excitement of that conference. He had spent a leave of absence studying in Neyman’s statistics department in Berkeley in the 1950s, “when it was kind of dangerous to pronounce the Bayesian point of view.” Taking up Bailey’s challenge, Bühlmann produced a general Bayesian theory of credibility, which statisticians carried far beyond the world of actuaries and insurance. Carefully renaming the prior a “structural function,” Bühlmann believed he helped Continental Europe escape some of the “religious” quarrels over Bayes’ rule, quarrels that lay ahead for Anglo-Americans.12
7. From Tool to Theology
While Arthur Bailey was transforming the sledgehammer of Credibility into Bayes’ rule for the insurance industry, a postwar boom in statistics was elevating the method’s lowly status. Gradually, Bayes would shed its reputation as a mere tool for solving practical problems and emerge in glorious Technicolor as an all-encompassing philosophy. Some would even call it a theology.
The Second World War had radically upgraded the stature, financial prospects, and career opportunities of applied mathematicians in the United States. The military was profoundly impressed by its wartime experience with statistics and operations research, and during the late 1940s the government poured money into science and statistics. Military funding officers roamed university hallways trying to persuade often reluctant statisticians to apply for grants. Naval leaders, convinced that postwar science needed a jump-start to prime technology’s pump, organized the Office of Naval Research, the first federal agency formed expressly to finance scientific research. Until the National Science Foundation was created in 1950, the U.S. Navy supported much of the nation’s mathematical and statistical research, whether classified or unclassified, basic or applied. Other funding came from the U.S. Army, the U.S. Air Force, and the National Institutes of Health.
A generation of pure mathematicians who had made exciting, life-or-death decisions during the war soon switched to applied mathematics and statistics. As the statistics capital of the world moved from Britain to the United States, the field exploded. Amid such spectacular growth, the number of theoretical statisticians increased a hundred-fold. Settling into mathematics departments, they coined new terms like “mathematical statistics” and “theoretical statistics.”
In these boom times, even Bayesians could get jobs in elite research institutions. At one end of the Bayesian spectrum was a small band of evangelists intent on making their theories mathematically and academically respectable. At the other end were practitioners who wanted to play key roles in science instead of in formalistic mathematical exercises.
In the face of jangling changes and new attitudes, the wartime marriage of convenience between abstract and applied mathematicians fell apart. Statisticians complained that pure mathematicians regarded useful research as “something for peasants,” akin to washing dishes and sweeping streets.1 Jack Good claimed the mathematicians at Virginia Tech, home of the nation’s third largest statistics department in the 1960s, loathed problem solvers.2
Frisky with federal funds, statisticians and data analysts divorced themselves from mathematics departments and formed their own enclaves. Yet even there tension sizzled between abstract theorizing and scientific applications, albeit in more decorous privacy. Serial schisms continue to this day, with applied mathematicians occupying—depending on the university—departments of mathematics, applied mathematics, statistics, biostatistics, and computer science.
Jerzy Neyman’s laboratory at Berkeley, then the largest and most important statistics center in the world, developed fundamental sampling theories and reigned over this fractious profession for years after the Second World War. But Neyman’s laboratory developed fissures of its own. Unable to compete with the soaring demand for statisticians, the department hired and promoted its own students and became ingrown. When a student tried to solve a blackboard problem unconventionally, Neyman grabbed his hand and forced it to write the answer Neyman’s way. For 40 years most of his hires were frequentists, and outsiders called the group “Jesus and his disciples.”3 Neyman continued to run his institute into his 80s.
Although both were fervent anti-Bayesians, Neyman and Fisher battled to the end, neither willing to admit that the other might be using the technique that best fit his own needs. For Fisher, the stakes were high: “We are quite in danger of sending highly trained and intelligent young men out into the world with tables of erroneous numbers under their arms, and with a dense fog in the place where their brains ought to be. In this century, of course, they will be working on guided missiles and advising the medical profession on the control of diseases, and there is no limit to the extent to which they could impede every sort of national effort.”4 He described Neyman as “some hundred years out of date, . . . partly incapacitated by the crooked reasoning.”5 Neyman called Fisher’s researches “insidious because, in a skillfully hidden manner, they involve unjustified claims of priority.”6 And so it went. At the age of 85, Neyman declared loftily, “[Bayes] does not interest me. I am interested in frequencies.”7
To Bayesian sympathizers, frequentism began to look like a Rube Goldberg cartoon of loosely connected ad hockeries, tests, and procedures that arose independently instead of growing in a unified, logical manner out of probability. The joke was that if you didn’t like the result of your frequentist analysis, you just redid it using a different test. By comparison, Bayes’ rule seemed to have an overall rationale. As the number of statisticians, symposia, articles, and journals multiplied, a series of publications issued around 1950 began to attract attention to the heretofore invisible world of Bayes’ rule.
Bayes stood poised for another of its periodic rebirths as three mathematicians, Jack Good, Leonard Jimmie Savage, and Dennis V. Lindley, tackled the job of turning Bayes’ rule into a respectable form of mathematics and a logical, coherent methodology. The first publication heralding the Bayesian revival was a book by Good, Alan Turing’s wartime assistant. As Good explained, “After the war, he [Turing] didn’t have time to write about statistics because he was too busy designing computers and computer languages, and speculating about artificial intelligence and the chemical basis of morphogenesis, so with his permission, I developed his idea . . . in considerable detail.”8 Good finished the first draft of Probability and the Weighing of Evidence in 1946 but could not get it published until 1950, the same year Arthur Bailey issued his Bayesian manifesto for actuaries. Much of the delay, Good explained, was caused by the continuation of wartime secrecy during the Cold War.