Antifragile: Things That Gain from Disorder
I kept telling anyone who would listen to me, including random taxi drivers (well, almost), that the company Fannie Mae was “sitting on a barrel of dynamite.” Of course, blowups don’t happen every day (just as poorly built bridges don’t collapse immediately), and people kept saying that my opinion was wrong and unfounded (using some argument that the stock was going up or something even more circular). I also inferred that other institutions, almost all banks, were in the same situation. After checking similar institutions, and seeing that the problem was general, I realized that a total collapse of the banking system was a certainty. I was so certain I could not see straight and went back to the markets to get my revenge against the turkeys. As in the scene from The Godfather (III), “Just when I thought I was out, they pull me back in.”
Things happened as if they were planned by destiny. Fannie Mae went bust, along with other banks. It just took a bit longer than expected, no big deal.
The stupid part of the story is that I had not seen the link between financial and general fragility—nor did I use the term “fragility.” Maybe I didn’t look at too many porcelain cups. However, thanks to the episode of the attic I had a measure for fragility, hence antifragility.
It all boils down to the following: figuring out if our miscalculations or misforecasts are on balance more harmful than they are beneficial, and how accelerating the damage is. Exactly as in the story of the king, in which the damage from a ten-kilogram stone is more than twice the damage from a five-kilogram one. Such accelerating damage means that a large stone would eventually kill the person. Likewise a large market deviation would eventually kill the company.
Once I figured out that fragility came directly from nonlinearity and convexity effects, and that convexity was measurable, I got all excited. The technique—detecting acceleration of harm—applies to anything that entails decision making under uncertainty, and risk management. While it was the most interesting in medicine and technology, the immediate demand was in economics. So I suggested to the International Monetary Fund a measure of fragility to substitute for their measures of risk that they knew didn’t work. Most people in the risk business had been frustrated by the poor (rather, the random) performance of their models, but they didn’t like my earlier stance: “don’t use any model.” They wanted something. And a risk measure was there.1
So here is something to use. The technique, a simple heuristic called the fragility (and antifragility) detection heuristic, works as follows. Let’s say you want to check whether a town is overoptimized. Say you measure that when traffic increases by ten thousand cars, travel time grows by ten minutes. But if traffic increases by ten thousand more cars, travel time now extends by an extra thirty minutes. Such acceleration of traffic time shows that traffic is fragile and you have too many cars and need to reduce traffic until the acceleration becomes mild (acceleration, I repeat, is acute concavity, or negative convexity effect).
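The heuristic can be sketched in a few lines of code. This is my own minimal illustration, not the IMF methodology; the travel-time numbers are the made-up ones from the traffic example above, and the test is simply whether equal-sized shocks produce accelerating harm.

```python
def is_fragile(f, x, dx):
    """Flag fragility: does harm accelerate as the shock grows?

    Compares the response to one shock of size dx with the additional
    response to a second shock of the same size. If the second increment
    is larger, harm is accelerating (negative convexity): fragile.
    """
    first_increment = f(x + dx) - f(x)
    second_increment = f(x + 2 * dx) - f(x + dx)
    return second_increment > first_increment

# Hypothetical travel-time curve matching the text: the first extra
# 10,000 cars add 10 minutes, the next 10,000 add 30 more.
travel_time = {100_000: 30, 110_000: 40, 120_000: 70}

print(is_fragile(travel_time.get, 100_000, 10_000))  # True: traffic is fragile
```

A linear response, by contrast, produces equal increments and is not flagged, which is exactly the point: the heuristic does not need a correct model of traffic, only the curvature of the response.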
Likewise, government deficits are particularly concave to changes in economic conditions. Every additional deviation in, say, the unemployment rate—particularly when the government has debt—makes deficits incrementally worse. And financial leverage for a company has the same effect: you need to borrow more and more to get the same effect. Just as in a Ponzi scheme.
The same with operational leverage on the part of a fragile company. Should sales increase 10 percent, then profits would increase less than they would decrease should sales drop 10 percent.
That was in a way the technique I used intuitively to declare that the Highly Respected Firm Fannie Mae was on its way to the cemetery—and it was easy to produce a rule of thumb out of it. Now with the IMF we had a simple measure with a stamp. It looks simple, too simple, so the initial reaction from “experts” was that it was “trivial” (said by people who visibly never detected these risks before—academics and quantitative analysts scorn what they can understand too easily and get ticked off by what they did not think of themselves).
According to the wonderful principle that one should use people’s stupidity to have fun, I invited my friend Raphael Douady to collaborate in expressing this simple idea using the most opaque mathematical derivations, with incomprehensible theorems that would take half a day (for a professional) to understand. Raphael, Bruno Dupire, and I had been involved in an almost two-decades-long continuous conversation on how everything entailing risk—everything—can be seen with a lot more rigor and clarity from the vantage point of an option professional. Raphael and I managed to prove the link between nonlinearity, dislike of volatility, and fragility. Remarkably—as has been shown—if you can say something straightforward in a complicated manner with complex theorems, even if there is no large gain in rigor from these complicated equations, people take the idea very seriously. We got nothing but positive reactions, and we were now told that this simple detection heuristic was “intelligent” (by the same people who had found it trivial). The only problem is that mathematics is addictive.
The Idea of Positive and Negative Model Error
Now what I believe is my true specialty: error in models.
When I was in the transaction business, I used to make plenty of errors of execution. You buy one thousand units and in fact you discover the next day that you bought two thousand. If the price went up in the meantime you had a handsome profit. Otherwise you had a large loss. So these errors are in the long run neutral in effect, since they can affect you both ways. They increase the variance, but they don’t affect your business too much. There is no one-sidedness to them. And these errors can be kept under control thanks to size limits—you make a lot of small transactions, so errors remain small. And at year end, typically, the errors “wash out,” as they say.
But that is not the case with most things we build, and with errors related to things that are fragile, in the presence of negative convexity effects. This class of errors has a one-way outcome, that is, negative, and tends to make planes land later, not earlier. Wars tend to get worse, not better. As we saw with traffic, variations (now called disturbances) tend to increase travel time from South Kensington to Piccadilly Circus, never shorten it. Some things, like traffic, rarely experience the equivalent of positive disturbances.
This one-sidedness brings both underestimation of randomness and underestimation of harm, since one is more exposed to harm than benefit from error. If in the long run we get as much variation in the source of randomness one way as the other, the harm would severely outweigh the benefits.
So—and this is the key to the Triad—we can classify things by three simple distinctions: things that, in the long run, like disturbances (or errors), things that are neutral to them, and those that dislike them. By now we have seen that evolution likes disturbances. We saw that discovery likes disturbances. Some forecasts are hurt by uncertainty—and, like travel time, one needs a buffer. Airlines figured out how to do it, but not governments, when they estimate deficits.
This method is very general. I even used it with Fukushima-style computations and realized how fragile their computation of small probabilities was—in fact all small probabilities tend to be very fragile to errors, as a small change in the assumptions can make the probability rise dramatically, from one per million to one per hundred. Indeed, a ten-thousand-fold underestimation.
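To see why small probabilities are so fragile, here is a hedged illustration with my own numbers, not the book's Fukushima computation. The "model" is a simple Gaussian: the chance of exceeding a fixed threshold, as a function of the assumed volatility. A modest error in that one assumption moves the tail probability by orders of magnitude.

```python
# Sensitivity of a small tail probability to the volatility assumption.
from math import erfc, sqrt

def tail_probability(threshold, sigma):
    """P(X > threshold) for a zero-mean Gaussian with scale sigma."""
    return 0.5 * erfc(threshold / (sigma * sqrt(2)))

threshold = 4.75  # chosen so the baseline comes out near one per million
p_baseline = tail_probability(threshold, sigma=1.0)
p_doubled = tail_probability(threshold, sigma=2.0)  # double the assumed volatility

print(f"baseline: {p_baseline:.1e}")  # about one per million
print(f"doubled:  {p_doubled:.1e}")   # about one per hundred
print(f"ratio:    {p_doubled / p_baseline:,.0f}x")  # thousands of times larger
```

The estimate of the rare event is itself the most fragile part of the model: the smaller the probability, the more violently it responds to a change in the assumptions.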
Finally, this method can show us where the math in economic models is bogus—which models are fragile and which ones are not. Simply do a small change in the assumptions, and look at how large the effect, and if there is acceleration of such effect. Acceleration implies—as with Fannie Mae—that someone relying on the model blows up from Black Swan effects. Molto facile. A detailed methodology to detect which results are bogus in economics—along with a discussion of small probabilities—is provided in the Appendix. What I can say for now is that much of what is taught in economics that has an equation, as well as econometrics, should be immediately ditched—which explains why economics is largely a charlatanic profession. Fragilistas, semper fragilisti!
HOW TO LOSE A GRANDMOTHER
Next I will explain the following effect of nonlinearity: conditions under which the average—the first order effect—does not matter. As a first step before getting into the workings of the philosopher’s stone.
As the saying goes:
Do not cross a river if it is on average four feet deep.
You have just been informed that your grandmother will spend the next two hours at the very desirable average temperature of seventy degrees Fahrenheit (about twenty-one degrees Celsius). Excellent, you think, since seventy degrees is the optimal temperature for grandmothers. Since you went to business school, you are a “big picture” type of person and are satisfied with the summary information.
But there is a second piece of data. Your grandmother, it turns out, will spend the first hour at zero degrees Fahrenheit (around minus eighteen Celsius), and the second hour at one hundred and forty degrees (around 60°C), for an average of the very desirable Mediterranean-style seventy degrees (21°C). So it looks as though you will most certainly end up with no grandmother, a funeral, and, possibly, an inheritance.
Clearly, temperature changes become more and more harmful as they deviate from seventy degrees. As you see, the second piece of information, the variability, turned out to be more important than the first. The notion of average is of no significance when one is fragile to variations—the dispersion in possible thermal outcomes here matters much more. Your grandmother is fragile to variations of temperature, to the volatility of the weather. Let us call that second piece of information the second-order effect, or, more precisely, the convexity effect.
Here, consider that, as much as a good simplification the notion of average can be, it can also be a Procrustean bed. The information that the average temperature is seventy degrees Fahrenheit does not simplify the situation for your grandmother. It is information squeezed into a Procrustean bed—and these are necessarily committed by scientific modelers, since a model is by its very nature a simplification. You just don’t want the simplification to distort the situation to the point of being harmful.
Figure 16 shows the fragility of the health of the grandmother to variations. If I plot health on the vertical axis, and temperature on the horizontal one, I see a shape that curves inward—a “concave” shape, or negative convexity effect.
If the grandmother’s response was “linear” (no curve, a straight line), then the harm of temperature below seventy degrees would be offset by the benefits of temperature above it. And the fact is that the health of the grandmother has to be capped at a maximum, otherwise she would keep improving.
FIGURE 16. Megafragility. Health as a function of temperature curves inward. A combination of 0 and 140 degrees (F) is worse for your grandmother’s health than just 70 degrees. In fact almost any combination averaging 70 degrees is worse than just 70 degrees.2 The graph shows concavity or negative convexity effects—curves inward.
Take this for now as we rapidly move to the more general attributes; in the case of the grandmother’s health response to temperature: (a) there is nonlinearity (the response is not a straight line, not “linear”), (b) it curves inward, too much so, and, finally, (c) the more nonlinear the response, the less relevant the average, and the more relevant the stability around such average.
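The grandmother's predicament can be put in numbers. The health function below is purely hypothetical (the quadratic shape is my assumption, not the book's); all that matters is that it peaks at seventy degrees and curves inward, i.e., is concave.

```python
def health(temp_f):
    """Hypothetical health score: best at 70 degrees F, concave."""
    return -(temp_f - 70) ** 2

steady = health(70)                       # two hours at the average temperature
swinging = (health(0) + health(140)) / 2  # one hour at 0 F, one at 140 F

print(steady)    # 0: the function of the average
print(swinging)  # -4900: the average of the function, far worse
```

Both schedules have the same average temperature; only the variability differs, and for a concave response the variability is what kills.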
NOW THE PHILOSOPHER’S STONE3
Much of medieval thinking went into finding the philosopher’s stone. It is always good to be reminded that chemistry is the child of alchemy, much of which consisted of looking into the chemical powers of substances. The main efforts went into creating value by transforming metals into gold by the method of transmutation. The necessary substance was called the philosopher’s stone—lapis philosophorum. Many people fell for it, a list that includes such scholars as Albertus Magnus, Isaac Newton, and Roger Bacon and great thinkers who were not quite scholars, such as Paracelsus.
It is a matter of no small import that the operation of transmutation was called the Magnus Opus—the great(est) work. I truly believe that the operation I will discuss—based on some properties of optionality—is about as close as we can get to the philosopher’s stone.
The following note would allow us to understand:
(a) The severity of the problem of conflation (mistaking the price of oil for geopolitics, or mistaking a profitable bet for good forecasting—not convexity of payoff and optionality).
(b) Why anything with optionality has a long-term advantage—and how to measure it.
(c) An additional subtle property called Jensen’s inequality.
Recall from our traffic example in Chapter 18: 90,000 cars for one hour, then 110,000 for the next, an average of 100,000, and traffic will be horrendous. On the other hand, assume we have 100,000 cars for each of the two hours, and traffic will be smooth and time in traffic short.
The number of cars is the something, a variable; traffic time is the function of something. The behavior of the function is such that it is, as we said, “not the same thing.” We can see here that the function of something becomes different from the something under nonlinearities.
(a) The more nonlinear, the more the function of something divorces itself from the something. If traffic were linear, then there would be no difference in traffic time between the two following situations: 90,000, then 110,000 cars on the one hand, or 100,000 cars on the other.
(b) The more volatile the something—the more uncertainty—the more the function divorces itself from the something. Let us consider the average number of cars again. The function (travel time) depends more on the volatility around the average. Things degrade if there is unevenness of distribution. For the same average you prefer to have 100,000 cars for both time periods; 80,000 then 120,000 would be even worse than 90,000 and 110,000.
(c) If the function is convex (antifragile), then the average of the function of something is going to be higher than the function of the average of something. And the reverse when the function is concave (fragile).
As an example for (c), which is a more complicated version of the bias, assume that the function under question is the squaring function (multiply a number by itself). This is a convex function. Take a conventional die (six sides) and consider a payoff equal to the number it lands on, that is, you get paid a number equivalent to what the die shows—1 if it lands on 1, 2 if it lands on 2, up to 6 if it lands on 6. The square of the expected (average) payoff is then ((1+2+3+4+5+6) divided by 6)², equals 3.5², here 12.25. So the function of the average equals 12.25.
But the average of the function is as follows. Take the square of every payoff, 1²+2²+3²+4²+5²+6² divided by 6, that is, the average square payoff, and you can see that the average of the function equals 15.17.
So, since squaring is a convex function, the average of the square payoff is higher than the square of the average payoff. The difference here between 15.17 and 12.25 is what I call the hidden benefit of antifragility—here, a 24 percent “edge.”
There are two biases: one elementary convexity effect, leading to mistaking the properties of the average of something (here 3.5) and those of a (convex) function of something (here 15.17), and the second, more involved, in mistaking an average of a function for the function of an average, here 15.17 for 12.25. The latter represents optionality.
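The dice arithmetic is easy to verify directly. A short check of the numbers in the text:

```python
# For the convex squaring payoff, the average of the squares exceeds
# the square of the average (Jensen's inequality).
faces = [1, 2, 3, 4, 5, 6]

square_of_average = (sum(faces) / len(faces)) ** 2            # 3.5 squared = 12.25
average_of_squares = sum(x ** 2 for x in faces) / len(faces)  # 91/6, about 15.17

edge = average_of_squares / square_of_average - 1
print(f"{square_of_average:.2f}  {average_of_squares:.2f}  edge = {edge:.0%}")
# 12.25  15.17  edge = 24%
```

The 24 percent gap is the convexity bias itself: nothing about the die changed, only the shape of the payoff.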
Someone with a linear payoff needs to be right more than 50 percent of the time. Someone with a convex payoff, much less. The hidden benefit of antifragility is that you can guess worse than random and still end up outperforming. Here lies the power of optionality—your function of something is very convex, so you can be wrong and still do fine—the more uncertainty, the better.
This explains my statement that you can be dumb and antifragile and still do very well.
This hidden “convexity bias” comes from a mathematical property called Jensen’s inequality. This is what the common discourse on innovation is missing. If you ignore the convexity bias, you are missing a chunk of what makes the nonlinear world go round. And it is a fact that such an idea is missing from the discourse. Sorry.4
How to Transform Gold into Mud: The Inverse Philosopher’s Stone
Let us take the same example as before, using as the function the square root (the exact inverse of squaring, which is concave, but much less concave than the square function is convex).
The square root of the expected (average) payoff is then √(1+2+3+4+5+6 divided by 6), equals √3.5, here 1.87. The function of the average equals 1.87.
But the average of the function is as follows. Take the square root of every payoff, (√1+√2+√3+√4+√5+√6), divided by 6, that is, the average square root payoff, and you can see that the average of the function equals 1.80.
The difference is called the “negative convexity bias” (or, if you are a stickler, “concavity bias”). The hidden harm of fragility is that you need to be much, much better than random in your prediction and knowing where you are going, just to offset the negative effect.
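The same dice check, now for the concave square-root payoff, confirms that the inequality flips sign:

```python
# For a concave payoff, the average of the function falls below the
# function of the average: a small but one-sided penalty.
from math import sqrt

faces = [1, 2, 3, 4, 5, 6]

sqrt_of_average = sqrt(sum(faces) / len(faces))              # sqrt(3.5), about 1.87
average_of_sqrts = sum(sqrt(x) for x in faces) / len(faces)  # about 1.81

print(f"{sqrt_of_average:.3f}  {average_of_sqrts:.3f}")  # 1.871  1.805
```

Note that the gap is small here because the square root is only mildly concave; the more pronounced the curvature, the larger the penalty from variability.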
Let me summarize the argument: if you have favorable asymmetries, or positive convexity, options being a special case, then in the long run you will do reasonably well, outperforming the average in the presence of uncertainty. The more uncertainty, the more role for optionality to kick in, and the more you will outperform. This property is very central to life.
1 The method does not require a good model for risk measurement. Take a ruler. You know it is wrong. It will not be able to measure the height of the child. But it can certainly tell you if he is growing. In fact the error you get about the rate of growth of the child is much, much smaller than the error you would get measuring his height. The same with a scale: no matter how defective, it will almost always be able to tell you if you are gaining weight, so stop blaming it.