Some reformists go so far as to say that the classical algorithms (like “add two multidigit numbers by stacking one atop the other and carrying the one when necessary”) should be taken out of the classroom, lest they interfere with the students’ process of discovering the properties of mathematical objects on their own.*
That seems like a terrible idea to me: these algorithms are useful tools that people worked hard to make, and there’s no reason we should have to start completely from scratch.
On the other hand, there are algorithms I think we can safely discard in the modern world. We don’t need to teach students how to extract square roots by hand, or in their head (though the latter skill, I can tell you from long personal experience, makes a great party trick in sufficiently nerdy circles). Calculators are also useful tools that people worked hard to make—we should use them, too, when the situation demands! I don’t even care whether my students can divide 430 by 12 using long division—though I do care that their number sense is sufficiently developed to reckon mentally that the answer’s a little more than 35.
The danger of overemphasizing algorithms and precise computations is that algorithms and precise computations are easy to assess. If we settle on a vision of mathematics that consists of “getting the answer right” and no more, and test for that, we run the risk of creating students who test very well but know no mathematics at all. This might be satisfying to those whose incentives are driven by test scores foremost and only, but it is not satisfying to me.
Of course it’s no better (in fact, it’s substantially worse) to pass along a population of students who’ve developed some wispy sense of mathematical meaning but can’t work examples swiftly and correctly. A math teacher’s least favorite thing to hear from a student is “I get the concept, but I couldn’t do the problems.” Though the student doesn’t know it, this is shorthand for “I don’t get the concept.” The ideas of mathematics can sound abstract, but they make sense only in reference to concrete computations. William Carlos Williams put it crisply: no ideas but in things.
Nowhere is the battle more starkly defined than in plane geometry. Here is the last redoubt of the teaching of proofs, the bedrock practice of mathematics. Many professional mathematicians consider it a sort of last stand of “real math.” But it’s not clear to what extent we’re really teaching the beauty, power, and surprise of proof when we teach geometry. It’s easy for the course to become an exercise in repetition as arid as a list of thirty definite integrals. The situation is so dire that the Fields Medalist David Mumford has suggested that we might dispense with plane geometry entirely and replace it with a first course in programming. A computer program, after all, has much in common with a geometric proof: both require the student to put together several very simple components from a small bag of options, one after the other, so that the sequence as a whole accomplishes some meaningful task.
I’m not as radical as that. In fact, I’m not radical at all. Dissatisfying as it may be to partisans, I think we have to teach a mathematics that values precise answers but also intelligent approximation, that demands the ability to deploy existing algorithms fluently but also the horse sense to work things out on the fly, that mixes rigidity with a sense of play. If we don’t, we’re not really teaching mathematics at all.
It’s a tall order—but it’s what the best math teachers are doing, anyway, while the math wars rage among the administrators overhead.
BACK TO THE OBESITY APOCALYPSE
So what percentage of Americans are going to be overweight in 2048? You can guess by now how Youfa Wang and his Obesity coauthors generated their projection. The National Health and Nutrition Examination Survey, or NHANES, tracks the health data of a large, representative sample of Americans, covering everything from hearing loss to sexually transmitted infections. In particular, it gives very good data for the proportion of Americans who are overweight, which for present purposes is defined as having a body-mass index of 25 or higher.* There’s no question that the prevalence of overweight has increased in recent decades. In the early 1970s, just under half of Americans had a BMI that high. By the early 1990s that figure had risen to almost 60%, and by 2008 almost three-quarters of the U.S. population was overweight.
You can plot the prevalence of overweight against time just as we did with the missile’s vertical progress:
And you can generate a linear regression, which will look something like this:
In 2048, the line crosses 100%. And that’s why Wang writes that all Americans will be overweight in 2048, if current trends continue.
But current trends will not continue. They can’t! If they did, by 2060, a whopping 109% of Americans would be overweight.
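You can see the trick, and its absurdity, laid bare in a few lines of code. Here’s a minimal sketch in Python, using rough prevalence figures read off the text rather than Wang’s actual NHANES data, so the exact crossing year will come out a little different from his:

```python
import numpy as np

# Rough prevalence figures read off the text, not Wang's actual data:
# (year, percent of Americans with BMI >= 25)
years = np.array([1972, 1992, 2008])
prevalence = np.array([48, 59, 72])

# Fit a straight line: prevalence = slope * year + intercept
slope, intercept = np.polyfit(years, prevalence, 1)

# Extrapolate: solve slope * year + intercept = 100 for the year
print(f"Line crosses 100% around {(100 - intercept) / slope:.0f}")

# Keep going past the crossing and the absurdity becomes explicit:
print(f"Projected prevalence in 2060: {slope * 2060 + intercept:.0f}%")
```

With these inputs the line crosses 100% in the early 2050s, in the neighborhood of Wang’s 2048, and then sails blithely past it.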
In reality, the graph of an increasing proportion bends toward 100%, like this:
That’s not an ironclad law, like the gravity that bends the missile’s path into a parabola, but it’s as close as you’re going to get in medicine. The higher the proportion of overweight people, the fewer skinny malinkies are left to convert, and the more slowly the proportion increases toward 100%. In fact, the curve probably goes horizontal at some point below 100%. The thin we have always with us! And indeed, just four years later, the NHANES survey showed that the upward march of overweight prevalence had already begun to slow.
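If you want a concrete version of that bending curve, suppose the prevalence grows at a rate proportional to the fraction of people still left to convert. That’s only one possible model, and the constants below are invented for illustration, but it captures the shape:

```python
import numpy as np

# Model assumption: dp/dt = k * (1 - p), so growth slows as the pool of
# not-yet-overweight people shrinks. Solving the equation gives
#     p(t) = 1 - (1 - p0) * exp(-k * t)
p0, k = 0.48, 0.018        # prevalence in 1972, and a made-up growth constant
t = np.arange(0, 90, 10)   # years after 1972
p = 1 - (1 - p0) * np.exp(-k * t)
for year, frac in zip(1972 + t, p):
    print(f"{year}: {100 * frac:.0f}%")
```

The curve climbs briskly at first, then flattens as it nears 100%, never crossing it; no year ever sees 109% of Americans overweight.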
But the Obesity paper conceals a worse crime against mathematics and common sense. Linear regression is easy to do—and once you’ve done one, it’s cake to do more. So Wang and company broke down their data by ethnic group and sex. Black men, for instance, were less likely to be overweight than the average American; and, more important, their rate of overweight was growing only half as quickly. If we superimpose the proportion of overweight black men on the proportion of overweight Americans overall, together with the linear regressions Wang and company worked out, we get a picture that looks like this.
Nice work, black men! Not until 2095 will all of you be overweight. In 2048, only 80% of you will be.
See the problem? If all Americans are supposed to be overweight in 2048, where are those one in five future black men without a weight problem supposed to be? Offshore?
That basic contradiction goes unmentioned in the paper. It’s the epidemiological equivalent of saying there are −4 grams of water left in the bucket. Zero credit.
FOUR
HOW MUCH IS THAT IN DEAD AMERICANS?
How bad is the conflict in the Middle East? Counterterrorism specialist Daniel Byman of Georgetown University lays down some cold, hard numbers in Foreign Affairs: “The Israeli military reports that from the start of the second intifada [in 2000] through the end of October 2005, Palestinians killed 1,074 Israelis and wounded 7,520—astounding figures for such a small country, the proportional equivalent of more than 50,000 dead and 300,000 wounded for the United States.” This kind of computation has become commonplace in discussions of the region. In December 2001 the U.S. House of Representatives declared that the 26 people killed by a series of attacks in Israel were “the equivalent, on a proportional basis, of 1,200 American deaths.” Newt Gingrich in 2006: “Remember that when Israel loses eight people, because of the difference in population, it’s the equivalent of losing almost 500 Americans.” Not to be outdone, Ahmed Moor wrote in the Los Angeles Times: “When Israel killed 1,400 Palestinians in Gaza—proportionally equivalent to 300,000 Americans—in Operation Cast Lead, incoming President Obama stayed mum.”
The rhetoric of proportion isn’t reserved for the Holy Land. In 1988, Gerald Caplan wrote in the Toronto Star, “Some 45,000 Nicaraguans on both sides of the struggle have been killed, wounded or kidnapped in the past eight years; in perspective, that’s the equivalent of 300,000 Canadians or 3 million Americans.” Robert McNamara, the Vietnam-era secretary of defense, said in 1997 that the nearly 4 million Vietnamese deaths during the war were “equivalent to 27 million Americans.” Any time a lot of people in a small country come to a bad end, editorialists get out their slide rules and start figuring: how much is that in dead Americans?
Here’s how you generate these numbers. The 1,074 Israelis killed by terrorists amount to about 0.015% of the Israeli population (which between 2000 and 2005 ranged from about 6 to 7 million). So the pundits are reckoning that the death of 0.015% of the much larger United States population, which indeed comes to about 50,000, would have roughly the same impact here.
This is lineocentrism in its purest form. According to the argument by proportion, you can find the equivalent of 1,074 Israelis anywhere around the globe via the graph below:
The 1,074 Israeli victims are equivalent to 7,700 Spaniards or 223,000 Chinese, but only 300 Slovenes and either one or two Tuvaluans.
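The recipe behind that graph is a one-liner: divide and multiply. A minimal sketch, with round mid-2000s population figures I’ve assumed for illustration:

```python
# Round mid-2000s population figures, assumed for illustration:
populations = {
    "United States": 300_000_000,
    "Spain":          47_000_000,
    "China":       1_350_000_000,
    "Slovenia":        2_000_000,
    "Tuvalu":             10_000,
}
israel = 6_500_000   # midpoint of the 6-7 million range above
deaths = 1074

# The pundits' recipe: same fraction of the population, different country.
for country, pop in populations.items():
    print(f"{country}: {deaths / israel * pop:,.0f}")
```

Run it and out come the 50,000 dead Americans, the 7,700 Spaniards, and the rest.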
Eventually (or perhaps immediately?) this reasoning starts to break down. When there are two men left in the bar at closing time, and one of them coldcocks the other, it is not equivalent in context to 150 million Americans getting simultaneously punched in the face.
Or: when 11% of the population of Rwanda was wiped out in 1994, all agree that it was among the worst crimes of the century. But we don’t describe the bloodshed there by saying, “In the context of 1940s Europe, it was nine times as bad as the Holocaust.” To do so would rightly set teeth on edge.
An important rule of mathematical hygiene: when you’re field-testing a mathematical method, try computing the same thing several different ways. If you get several different answers, something’s wrong with your method.
For example: the 2004 bombings at the Atocha train station in Madrid killed almost 200 people. What would be an equivalently deadly bombing at Grand Central Station?
The United States has almost seven times the population of Spain. So if you think of 200 people as 0.0004% of the Spanish population, you find that an equivalent attack would kill 1,300 people in the United States. On the other hand, 200 people is 0.006% of the population of Madrid; scaling up to New York City, which is two and a half times as large, gives you 463 victims. Or should we compare the province of Madrid with the state of New York? That gives you something closer to 600. This multiplicity of conclusions should be a red flag. Something is fishy with the method of proportions.
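You can run the field test yourself. A quick sketch, again with round population figures assumed for illustration:

```python
# Rough mid-2000s populations, assumed for illustration:
spain, usa = 46_500_000, 300_000_000            # countries
madrid_city, nyc = 3_450_000, 8_000_000         # cities
madrid_prov, ny_state = 6_400_000, 19_200_000   # province vs. state

deaths = 200  # Atocha, 2004

# Same method, three choices of "the relevant population":
print(f"Country to country: {deaths * usa / spain:,.0f}")            # ~1,300
print(f"City to city:       {deaths * nyc / madrid_city:,.0f}")      # ~460
print(f"Province to state:  {deaths * ny_state / madrid_prov:,.0f}") # ~600
```

One method, three answers.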
One can’t, of course, reject proportions entirely. Proportions matter! If you want to know which parts of America have the biggest brain cancer problem, it doesn’t make much sense to look at the states with the most deaths from brain cancer: those are California, Texas, New York, and Florida, which have the most brain cancer because they have the most people. Steven Pinker makes a similar point in his recent best seller The Better Angels of Our Nature, which argues that the world has steadily grown less violent throughout human history. The twentieth century gets a bad rap because of the vast numbers of people caught in the gears of great-power politics. But the Nazis, the Soviets, the Communist Party of China, and the colonial overlords were actually not particularly effective slaughterers on a proportional basis, Pinker argues—there are just so many more people to kill nowadays! These days we don’t spare much grief for antique bloodlettings like the Thirty Years’ War. But that war took place in a smaller world, and by Pinker’s estimate killed one out of every hundred people on Earth. To do that now would mean wiping out 70 million people, more than the number who died in both world wars together.
So it’s better to study rates: deaths as a proportion of total population. For instance, instead of counting raw numbers of brain cancer deaths by state, we can compute the proportion of each state’s population that dies of brain cancer each year. That makes for a very different leaderboard. South Dakota takes the unwelcome first prize, with 5.7 brain cancer deaths per 100,000 people per year, well above the national rate of 3.4. South Dakota is followed on the list by Nebraska, Alaska, Delaware, and Maine. These are the places to avoid if you don’t want to get brain cancer, it seems. So where should you move? Scrolling down to the bottom of the list, you find Wyoming, Vermont, North Dakota, Hawaii, and the District of Columbia.
Now this is strange. Why should South Dakota be brain cancer central and North Dakota nearly tumor free? Why would you be safe in Vermont but imperiled in Maine?
The answer: South Dakota isn’t necessarily causing brain cancer, and North Dakota isn’t necessarily preventing it. The five states at the top have something in common, and the five states at the bottom do, too. And it’s the same thing: hardly anyone lives there. Of the nine states (and one District) that finished at the top and bottom, the biggest is Nebraska, which is currently locked with West Virginia in a close struggle to be the 37th most populous state. Living in a small state, apparently, makes it either much more or much less likely you’ll get brain cancer.
Since that makes no sense, we’d better seek another explanation.
To see what’s going on, let’s play an imaginary game. The game is called who’s the best at flipping coins. It’s pretty simple. You flip a bunch of coins and whoever gets the most heads wins. To make this a little more interesting, though, not everybody has the same number of coins. Some people—Team Small—have only ten coins, while the members of Team Big have a hundred each.
If we score by absolute number of heads, one thing’s almost sure—the winner of this game is going to come from Team Big. The typical Big player is going to get around 50 heads, a figure none of the Smalls can possibly match. Even if Team Small has a hundred members, the high scorer among them is likely to get an 8 or 9.*
That doesn’t seem fair! Team Big has got a massive built-in advantage. So here’s a better idea. Instead of scoring by raw number, let’s score by proportion. That should put the two teams on a fairer footing.
But it doesn’t. As I said, if there are a hundred Smalls, at least one is likely to get 8 heads. So that person’s score is going to be at least 80%. And the Bigs? None of the Bigs is going to get 80% heads. It’s physically possible, of course. But it’s not going to happen. In fact, you’d need about two billion players on the Big team before you’d get a reasonable chance of seeing any outcome that lopsided. This ought to fit your intuition about probability. The more coins you throw, the more likely you are to be close to 50-50.
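You don’t have to take my word for those odds; they’re exact binomial probabilities, and a few lines of Python (math.comb requires version 3.8 or later) will check them:

```python
from math import comb

def tail_prob(n, k):
    """Chance of getting at least k heads in n fair coin flips."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2**n

p_small = tail_prob(10, 8)        # one Small's chance of 8+ heads: ~0.055
print(1 - (1 - p_small) ** 100)   # chance some Small of 100 does it: ~0.996

p_big = tail_prob(100, 80)        # one Big's chance of 80+ heads out of 100
print(p_big)                      # ~5.6e-10
print(1 / p_big)                  # ~1.8 billion Bigs needed, on average
```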
You can try it yourself! I did, and here’s what happened. Repeatedly flipping 10 coins at a time to simulate Small players, I got a sequence of head counts that looked like this:
4, 4, 5, 6, 5, 4, 3, 3, 4, 5, 5, 9, 3, 5, 7, 4, 5, 7, 7, 9 . . .
With a hundred coins, like the Bigs, I got:
46, 54, 48, 45, 45, 52, 49, 47, 58, 40, 57, 46, 46, 51, 52, 51, 50, 60, 43, 45 . . .
And with a thousand:
486, 501, 489, 472, 537, 474, 508, 510, 478, 508, 493, 511, 489, 510, 530, 490, 503, 462, 500, 494 . . .
Okay, to be honest, I didn’t flip a thousand coins. I asked my computer to simulate coin flips. Who has time to flip a thousand coins?
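A simulation along the same lines is a few lines of Python; here’s a minimal sketch (your numbers will differ run to run):

```python
import random

def head_counts(n_coins, n_players=20):
    """Head counts for n_players, each flipping n_coins fair coins."""
    return [sum(random.random() < 0.5 for _ in range(n_coins))
            for _ in range(n_players)]

for n in (10, 100, 1000):
    counts = head_counts(n)
    print(f"{n} coins: {counts}")
    print(f"   proportions: {min(counts)/n:.1%} to {max(counts)/n:.1%}")
```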
One person who did was J. E. Kerrich, a mathematician from South Africa who made an ill-advised visit to Europe in 1939. His semester abroad quickly turned into an unscheduled stint in an internment camp in Denmark. Where a less statistically minded prisoner might have passed the time by scratching the days on the cell wall, Kerrich flipped a coin, 10,000 times in all, keeping track of the number of heads as he went. His results looked like this:
As you can see, the fraction of heads converges inexorably toward 50% as you flip more and more coins, as if squeezed by an invisible vise. You can see the same effect in the simulations. The proportions of heads in the first group of tries, the Smalls, range from 30% to 90%. With a hundred flips at a time, the range narrows: just 40% to 60%. And with a thousand flips, the range of proportions is only 46.2% to 53.7%. Something is pushing those numbers closer and closer to 50%. That something is the cold, strong hand of the Law of Large Numbers. I won’t state that theorem precisely (though it is stunningly handsome!), but you can think of it as saying the following: the more coins you flip, the more and more extravagantly unlikely it is that you’ll get 80% heads. In fact, if you flip enough coins, there’s only the barest chance of getting as many as 51%! Observing a highly unbalanced result in ten flips is unremarkable; getting the same proportional imbalance in a hundred flips would be so startling as to make you wonder whether someone has mucked with your coins.
The understanding that the results of an experiment tend to settle down to a fixed average when the experiment is repeated again and again is not new. In fact, it’s almost as old as the mathematical study of chance itself; an informal version of the principle was asserted in the sixteenth century by Girolamo Cardano, though it was not until the early 1800s that Siméon-Denis Poisson came up with the pithy name “la loi des grands nombres” to describe it.
THE GENDARME’S HAT
By the early eighteenth century, Jakob Bernoulli had worked out a precise statement and mathematical proof of the Law of Large Numbers. It was now no longer an observation, but a theorem.
And the theorem tells you that the Big-Small game isn’t fair. The Law of Large Numbers will always push the Big players’ scores toward 50%, while those of the Smalls are apt to vary much more widely. But it would be nuts to conclude that the Small team is “better” at flipping heads, even though that team wins every game. For if you average the proportion of heads flipped by all the Small players, not just the top scorer, they’ll likely be at just about 50%, same as the Bigs. And if we look for the player with the fewest heads instead of the most, Team Small suddenly looks bad at getting heads: it’s very likely one of their players will have only 20% heads, and none of the Big players will ever score that badly. Scoring by raw number of heads gives the Big team an insuperable advantage; but using percentages slants the game just as badly in favor of the Smalls. The smaller the number of coins—what we’d call in statistics the sample size—the greater the variation in the proportion of heads.
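That last sentence can be made precise. For n flips of a fair coin, the proportion of heads has standard deviation 1/(2√n): the spread shrinks like the square root of the sample size, so you must quadruple the number of coins just to halve the typical deviation from 50%. A quick check:

```python
from math import sqrt

# Proportion of heads in n fair flips has standard deviation
# sqrt(0.5 * 0.5 / n) = 1 / (2 * sqrt(n)).
for n in (10, 100, 1000, 10000):
    print(f"n = {n:>5}: typical spread around 50% is +/- {1/(2*sqrt(n)):.1%}")
```

At ten coins that spread is nearly 16 percentage points, so 80% heads sits only about two standard deviations out; at a hundred coins it’s 5 points, and 80% heads is six standard deviations into never-never land.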