by Nate Silver
The most calamitous failures of prediction usually have a lot in common. We focus on those signals that tell a story about the world as we would like it to be, not how it really is. We ignore the risks that are hardest to measure, even when they pose the greatest threats to our well-being. We make approximations and assumptions about the world that are much cruder than we realize. We abhor uncertainty, even when it is an irreducible part of the problem we are trying to solve. If we want to get at the heart of the financial crisis, we should begin by identifying the greatest predictive failure of all, a prediction that committed all these mistakes.
The ratings agencies had given their AAA rating, normally reserved for a handful of the world’s most solvent governments and best-run businesses, to thousands of mortgage-backed securities, financial instruments that allowed investors to bet on the likelihood of someone else defaulting on their home. The ratings issued by these companies are quite explicitly meant to be predictions: estimates of the likelihood that a piece of debt will go into default.5 Standard & Poor’s told investors, for instance, that when it rated a particularly complex type of security known as a collateralized debt obligation (CDO) at AAA, there was only a 0.12 percent probability—about 1 chance in 850—that it would fail to pay out over the next five years.6 This supposedly made it as safe as a AAA-rated corporate bond7 and safer than S&P now assumes U.S. Treasury bonds to be.8 The ratings agencies do not grade on a curve.
In fact, around 28 percent of the AAA-rated CDOs defaulted, according to S&P’s internal figures.9 (Some independent estimates are even higher.10) That means that the actual default rates for CDOs were more than two hundred times higher than S&P had predicted.11
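The arithmetic is easy to verify. Here is a minimal sketch in Python, using only the figures cited above; note that 0.12 percent works out to roughly 1 in 833, with "1 chance in 850" being the rounder figure given in the text:

```python
# Figures as cited above: S&P's forecast five-year default probability for
# AAA-rated CDOs versus the realized rate from S&P's internal figures.
forecast_default_rate = 0.0012
actual_default_rate = 0.28

print(f"Implied odds: about 1 in {1 / forecast_default_rate:.0f}")                 # ~1 in 833
print(f"Actual vs. forecast: {actual_default_rate / forecast_default_rate:.0f}x")  # ~233x
```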
This is just about as complete a failure as it is possible to make in a prediction: trillions of dollars in investments that were rated as being almost completely safe instead turned out to be almost completely unsafe. It was as if the weather forecast had been 86 degrees and sunny, and instead there was a blizzard.
FIGURE 1-1: FORECASTED AND ACTUAL 5-YEAR DEFAULT RATES FOR AAA-RATED CDO TRANCHES
When you make a prediction that goes so badly, you have a choice of how to explain it. One path is to blame external circumstances—what we might think of as “bad luck.” Sometimes this is a reasonable choice, or even the correct one. When the National Weather Service says there is a 90 percent chance of clear skies, but it rains instead and spoils your golf outing, you can’t really blame them. Decades of historical data show that when the Weather Service says there is a 1 in 10 chance of rain, it really does rain about 10 percent of the time over the long run.*
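That long-run reliability is known as calibration, and it is straightforward to check given an archive of forecasts and outcomes. A minimal sketch, with made-up records standing in for the Weather Service's actual archive:

```python
from collections import defaultdict

# Hypothetical (forecast probability of rain, did it rain?) records,
# standing in for decades of real forecast data.
records = [(0.1, False), (0.1, False), (0.1, True), (0.1, False), (0.1, False),
           (0.9, True), (0.9, True), (0.9, False), (0.9, True), (0.9, True)]

buckets = defaultdict(list)
for prob, rained in records:
    buckets[prob].append(rained)

# A well-calibrated forecaster's 10 percent forecasts verify about 10 percent
# of the time, the 90 percent forecasts about 90 percent, and so on.
for prob, outcomes in sorted(buckets.items()):
    observed = sum(outcomes) / len(outcomes)
    print(f"forecast {prob:.0%}: rain observed {observed:.0%} of the time (n={len(outcomes)})")
```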
This explanation becomes less credible, however, when the forecaster does not have a history of successful predictions and when the magnitude of his error is larger. In these cases, it is much more likely that the fault lies with the forecaster’s model of the world and not with the world itself.
In the instance of CDOs, the ratings agencies had no track record at all: these were novel securities, and the default rates claimed by S&P were not derived from historical data but were instead assumptions based on a faulty statistical model. Meanwhile, the magnitude of their error was enormous: AAA-rated CDOs were two hundred times more likely to default in practice than they were in theory.
The ratings agencies’ shot at redemption would be to admit that the models had been flawed and the mistake had been theirs. But at the congressional hearing, they shirked responsibility and claimed to have been unlucky. They blamed an external contingency: the housing bubble.
“S&P is not alone in having been taken by surprise by the extreme decline in the housing and mortgage markets,” Deven Sharma, the head of Standard & Poor’s, told Congress that October.12 “Virtually no one, be they homeowners, financial institutions, rating agencies, regulators or investors, anticipated what is coming.”
Nobody saw it coming. When you can’t state your innocence, proclaim your ignorance: this is often the first line of defense when there is a failed forecast.13 But Sharma’s statement was a lie, in the grand congressional tradition of “I did not have sexual relations with that woman” and “I have never used steroids.”
What is remarkable about the housing bubble is the number of people who did see it coming—and who said so well in advance. Robert Shiller, the Yale economist, had noted its beginnings as early as 2000 in his book Irrational Exuberance.14 Dean Baker, a caustic economist at the Center for Economic and Policy Research, had written about the bubble in August 2002.15 A correspondent at the Economist magazine, normally known for its staid prose, had spoken of the “biggest bubble in history” in June 2005.16 Paul Krugman, the Nobel Prize–winning economist, wrote of the bubble and its inevitable end in August 2005.17 “This was baked into the system,” Krugman later told me. “The housing crash was not a black swan. The housing crash was the elephant in the room.”
Ordinary Americans were also concerned. Google searches on the term “housing bubble” increased roughly tenfold from January 2004 through summer 2005.18 Interest in the term was heaviest in those states, like California, that had seen the largest run-up in housing prices19—and which were about to experience the largest decline. In fact, discussion of the bubble was remarkably widespread. Instances of the two-word phrase “housing bubble” had appeared in just eight news accounts in 200120 but jumped to 3,447 references by 2005. The housing bubble was discussed about ten times per day in reputable newspapers and periodicals.21
And yet, the ratings agencies—whose job it is to measure risk in financial markets—say that they missed it. It should tell you something that they seem to think of this as their best line of defense. The problems with their predictions ran very deep.
“I Don’t Think They Wanted the Music to Stop”
None of the economists and investors I spoke with for this chapter had a favorable view of the ratings agencies. But they were divided on whether their bad ratings reflected avarice or ignorance—did they know any better?
Jules Kroll is perhaps uniquely qualified to pass judgment on this question: he runs a ratings agency himself. Founded in 2009, Kroll Bond Ratings had just issued its first rating—on a mortgage loan made to the builders of a gigantic shopping center in Arlington, Virginia—when I met him at his office in New York in 2011.
Kroll faults the ratings agencies most of all for their lack of “surveillance.” It is an ironic term coming from Kroll, who before getting into the ratings game had become modestly famous (and somewhat immodestly rich) from his original company, Kroll Inc., which acted as a sort of detective agency to patrol corporate fraud. They knew how to sniff out a scam—such as the case of the kidnappers who took a hedge-fund billionaire hostage but foiled themselves by charging a pizza to his credit card.22 Kroll was sixty-nine years old when I met him, but his bloodhound instincts are keen—and they were triggered when he began to examine what the ratings agencies were doing.
“Surveillance is a term of art in the ratings industry,” Kroll told me. “It means keeping investors informed as to what you’re seeing. Every month you get a tape* of things like defaults on mortgages, prepayment of mortgages—you get a lot of data. That is the early warning—are things getting better or worse? The world expects you to keep them posted.”
The ratings agencies ought to have been just about the first ones to detect problems in the housing market, in other words. They had better information than anyone else: fresh data on whether thousands of borrowers were making their mortgage payments on time. But they did not begin to downgrade large batches of mortgage-backed securities until 2007—at which point the problems had become manifest and foreclosure rates had already doubled.23
“These are not stupid people,” Kroll told me. “They knew. I don’t think they wanted the music to stop.”
Kroll Bond Ratings is one of ten registered NRSROs, or nationally recognized statistical rating organizations, firms that are licensed by the Securities and Exchange Commission to rate debt-backed securities. But Moody’s, S&P, and Fitch are three of the others, and they have had almost all the market share; S&P and Moody’s each rated almost 97 percent of the CDOs that were issued prior to the financial collapse.24
One reason that S&P and Moody’s enjoyed such a dominant market presence is simply that they had been a part of the club for a long time. They are part of a legal oligopoly; entry into the industry is limited by the government. Meanwhile, a seal of approval from S&P and Moody’s is often mandated by the bylaws of large pension funds,25 about two-thirds of which26 mention S&P, Moody’s, or both by name, requiring that they rate a piece of debt before the pension fund can purchase it.27
S&P and Moody’s had taken advantage of their select status to build up exceptional profits despite picking résumés out of Wall Street’s reject pile.* Moody’s28 revenue from so-called structured-finance ratings increased by more than 800 percent between 1997 and 2007 and came to represent the majority of their ratings business during the bubble years.29 These products helped Moody’s to the highest profit margin of any company in the S&P 500 for five consecutive years during the housing bubble.30 (In 2010, even after the bubble burst and the problems with the ratings agencies had become obvious, Moody’s still made a 25 percent profit.31)
With large profits locked in so long as new CDOs continued to be issued, and no way for investors to verify the accuracy of their ratings until it was too late, the agencies had little incentive to compete on the basis of quality. The CEO of Moody’s, Raymond McDaniel, explicitly told his board that ratings quality was the least important factor driving the company’s profits.32
Instead their equation was simple. The ratings agencies were paid by the issuer of the CDO every time they rated one: the more CDOs, the more profit. A virtually unlimited number of CDOs could be created by combining different types of mortgages—or when that got boring, combining different types of CDOs into derivatives of one another. Rarely did the ratings agencies turn down the opportunity to rate one. A government investigation later uncovered an instant-message exchange between two senior Moody’s employees in which one claimed that a security “could be structured by cows” and Moody’s would rate it.33 In some cases, the ratings agencies went further still and abetted debt issuers in manipulating the ratings. In what it claimed was a nod to transparency,34 S&P provided the issuers with copies of their ratings software. This made it easy for the issuers to determine exactly how many bad mortgages they could add to the pool without seeing its rating decline.35
The possibility of a housing bubble, and that it might burst, thus represented a threat to the ratings agencies’ gravy train. Human beings have an extraordinary capacity to ignore risks that threaten their livelihood, as though this will make them go away. So perhaps Deven Sharma’s claim isn’t so implausible—perhaps the ratings agencies really had missed the housing bubble, even if others hadn’t.
In fact, however, the ratings agencies quite explicitly considered the possibility that there was a housing bubble. They concluded, remarkably, that it would be no big deal. A memo provided to me by an S&P spokeswoman, Catherine Mathis, detailed how S&P had conducted a simulation in 2005 that anticipated a 20 percent decline in national housing prices over a two-year period—not far from the roughly 30 percent decline in housing prices that actually occurred between 2006 and 2008. The memo concluded that S&P’s existing models “captured the risk of a downturn” adequately and that its highly rated securities would “weather a housing downturn without suffering a credit-rating downgrade.”36
In some ways this is even more troubling than if the ratings agencies had missed the housing bubble entirely. In this book, I’ll discuss the danger of “unknown unknowns”—the risks that we are not even aware of. Perhaps the only greater threat is the risks we think we have a handle on, but don’t.* In these cases we not only fool ourselves, but our false confidence may be contagious. In the case of the ratings agencies, it helped to infect the entire financial system. “The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong it usually turns out to be impossible to get at or repair,” wrote Douglas Adams in The Hitchhiker’s Guide to the Galaxy series.37
But how did the ratings agencies’ models, which had all the auspices of scientific precision, do such a poor job of describing reality?
How the Ratings Agencies Got It Wrong
We have to dig a bit deeper to find the source of the problem. The answer requires a little bit of detail about how financial instruments like CDOs are structured, and a little bit about the distinction between uncertainty and risk.
CDOs are collections of mortgage debt that are broken into different pools, or “tranches,” some of which are supposed to be quite risky and others of which are rated as almost completely safe. My friend Anil Kashyap, who teaches a course on the financial crisis to students at the University of Chicago, has come up with a simplified example of a CDO, and I’ll use a version of this example here.
Imagine you have a set of five mortgages, each of which you assume has a 5 percent chance of defaulting. You can create a number of bets based on the status of these mortgages, each progressively riskier than the last.
The safest of these bets, what I’ll call the Alpha Pool, pays out unless all five of the mortgages default. The riskiest, the Epsilon Pool, leaves you on the hook if any of the five mortgages defaults. Then there are other steps along the way.
Why might an investor prefer making a bet on the Epsilon Pool to the Alpha Pool? That’s easy—because it will be priced more cheaply to account for the greater risk. But say you’re a risk-averse investor, such as a pension fund, and that your bylaws prohibit you from investing in poorly rated securities. If you’re going to buy anything, it will be the Alpha Pool, which will assuredly be rated AAA.
The Alpha Pool consists of five mortgages, each of which has only a 5 percent chance of defaulting. You lose the bet only if all five actually do default. What is the risk of that happening?
Actually, that is not an easy question—and therein lies the problem. The assumptions and approximations you choose will yield profoundly different answers. If you make the wrong assumptions, your model may be extraordinarily wrong.
One assumption is that each mortgage is independent of the others. In this scenario, your risks are well diversified: if a carpenter in Cleveland defaults on his mortgage, this will have no bearing on whether a dentist in Denver does. Under this scenario, the risk of losing your bet would be exceptionally small—the equivalent of rolling snake eyes five times in a row. Specifically, it would be 5 percent taken to the fifth power, which is just one chance in 3,200,000. This supposed miracle of diversification is how the ratings agencies claimed that a group of subprime mortgages that had just a B+ credit rating on average38—which would ordinarily imply39 more than a 20 percent chance of default40—had almost no chance of defaulting when pooled together.
The other extreme is to assume that the mortgages, instead of being entirely independent of one another, will all behave exactly alike. That is, either all five mortgages will default or none will. Instead of getting five separate rolls of the dice, you’re now staking your bet on the outcome of just one. There’s a 5 percent chance that you will roll snake eyes and all the mortgages will default—making your bet 160,000 times riskier than you had thought originally.41
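Both extremes, and the gap between them, fit in a few lines of code. This sketch simply formalizes the example above:

```python
p = 0.05   # assumed default probability for each mortgage
n = 5      # number of mortgages in the pool

# Extreme 1: defaults are perfectly independent.
indep = p ** n
print(f"Independent: 1 in {1 / indep:,.0f}")         # 1 in 3,200,000

# Extreme 2: defaults are perfectly correlated (all or nothing).
corr = p
print(f"Perfectly correlated: 1 in {1 / corr:.0f}")  # 1 in 20

# The same bet is 160,000 times riskier under the second assumption.
print(f"Risk ratio: {corr / indep:,.0f}x")
```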
Which of these assumptions is more valid will depend on economic conditions. If the economy and the housing market are healthy, the first scenario—the five mortgages have nothing to do with one another—might be a reasonable approximation. Defaults are going to happen from time to time because of unfortunate rolls of the dice: someone gets hit with a huge medical bill, or they lose their job. However, one person’s default risk won’t have much to do with another’s.
But suppose instead that there is some common factor that ties the fate of these homeowners together. For instance: there is a massive housing bubble that has caused home prices to rise by 80 percent without any tangible improvement in the fundamentals. Now you’ve got trouble: if one borrower defaults, the rest might succumb to the same problems. The risk of losing your bet has increased by orders of magnitude.
The latter scenario was what came into being in the United States beginning in 2007 (we’ll conduct a short autopsy on the housing bubble later in this chapter). But it was the former assumption of largely uncorrelated risks that the ratings agencies had bet on. Although the problems with this assumption were understood in the academic literature42 and by whistle-blowers at the ratings agencies43 long before the housing bubble burst, the efforts the ratings agencies made to account for it were feeble.
Moody’s, for instance, went through a period of making ad hoc adjustments to its model44 in which it increased the default probability assigned to AAA-rated securities by 50 percent. That might seem like a very prudent attitude: surely a 50 percent buffer will suffice to account for any slack in one’s assumptions?
It might have been fine had the potential for error in their forecasts been linear and arithmetic. But leverage, or investments financed by debt, can make the error in a forecast compound many times over, and introduces the potential of highly geometric and nonlinear mistakes. Moody’s 50 percent adjustment was like applying sunscreen and claiming it protected you from a nuclear meltdown—wholly inadequate to the scale of the problem. It wasn’t just a possibility that their estimates of default risk could be 50 percent too low: they might just as easily have underestimated it by 500 percent or 5,000 percent. In practice, defaults were two hundred times more likely than the ratings agencies claimed, meaning that their model was off by a mere 20,000 percent.
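One way to see why a linear buffer is hopeless is to blend the two extremes from the Alpha Pool example. As a toy illustration (not anything the agencies actually used), suppose that with some small probability w a common shock, a bursting bubble say, makes all five mortgages default together, and that otherwise they behave independently:

```python
p = 0.05          # per-mortgage default probability
indep = p ** 5    # all five default in the independent case: 1 in 3,200,000

# w is a hypothetical weight on the perfectly correlated regime.
for w in (0.0, 0.01, 0.05, 0.20):
    mixed = w * p + (1 - w) * indep
    print(f"w = {w:.2f}: P(all five default) = {mixed:.2e}"
          f"  ({mixed / indep:,.0f}x the independent case)")
```

Even a 1 percent weight on the correlated regime makes the bet roughly 1,600 times riskier, an error that a 50 percent cushion cannot begin to absorb.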
In a broader sense, the ratings agencies’ problem was that they were unable, or unwilling, to appreciate the distinction between risk and uncertainty.
Risk, as first articulated by the economist Frank H. Knight in 1921,45 is something that you can put a price on. Say that you’ll win a poker hand unless your opponent draws to an inside straight: the chances of that happening are exactly 1 chance in 11.46 This is risk. It is not pleasant when you take a “bad beat” in poker, but at least you know the odds of it and can account for it ahead of time. In the long run, you’ll make a profit from your opponents making desperate draws with insufficient odds.
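Pricing that risk is a simple expected-value calculation. A toy example with invented stakes: suppose your opponent must call a $10 bet for a shot at a $60 pot, hitting the straight 1 time in 11:

```python
from fractions import Fraction

p_hit = Fraction(1, 11)  # chance the opponent completes the inside straight
pot = 60                 # amount won on a hit (stakes invented for illustration)
call = 10                # cost of the call

ev = p_hit * pot - (1 - p_hit) * call
print(f"Opponent's expected value per call: {float(ev):.2f} dollars")  # about -3.64
```

Over many such hands, those calls are a steady transfer of money to you: that is what it means to say a risk can be priced.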