dynamic trade sizing and the use of stops and profit targets, but
unless there is real edge somewhere the idea will eventually lose
money. However, this doesn't make risk control irrelevant. Bad
risk management can turn a potentially profitable idea into a
loser. And, because risk management is the only part of the
trading process that is completely under the control of the trader,
there is no excuse for not doing it as well as possible.
In this chapter, I restate why the Kelly criterion should form the
basis of a trade-sizing scheme and look at two extensions that are
particularly important when trading options: non-normal trade
returns and uncertainty about the return distribution.
The Kelly Criterion
It is well-known that investing according to the Kelly criterion
(Kelly, 1956) will theoretically outperform any other sizing
strategy. No other sizing scheme will produce greater long-term
growth. This doesn't mean that everyone tries or even wants to
invest like this. There are three kinds of reasons for this:
The Kelly criterion maximizes the long-term growth rate of the
bankroll. But it is completely legitimate to have other goals.
For example, many traders are more interested in maximizing
the probability of hitting a goal in a given time (see Browne,
1999, 2000a, 2000b). No one has a utility function that is as
simple as the log function that matches the Kelly scheme. This
is fair.
Some traders try to deny the math. They don't like the volatile
return stream and decide that this is a flaw of the system. It
isn't. Whether or not you like what Kelly says, it is a
mathematical fact. It's objectively true. It is common for
people to conflate their dislike of a situation with its truth.
Examples of this are evolution, climate change, and the Kelly
criterion.
Some people create strawman arguments about the
mathematics, claiming that Kelly only applies to simplified,
unrealistic situations. This isn't true at all. The mathematics of
maximizing growth rate are quite general.
A derivation of the Kelly criterion for both discrete and continuous
outcomes is given in Sinclair (2013), together with a discussion of the distribution of results we can expect when investing this way.
The important results can be summarized as follows.
Good
Kelly maximizes growth rate.
The expected time to reach any goal is minimized.
It is impossible to go bankrupt.
The strategy depends only on the current bankroll, not the
specific trade results that led to it.
It is essentially unbeatable.
Bad
The best bets can be uncomfortably large.
Portfolio volatility and drawdowns are large.
Because of compounding, it is reasonably common to find that
an equal number of wins and losses leaves you with a net loss.
Here I want to look at two slight extensions that are very
important to traders, particularly option traders. What happens
when outcomes are highly non-normal? And what happens when
we are uncertain of outcomes and probabilities?
These are two aspects of the same general problem. Trading
success is largely dependent on how robust our ideas are. At best
our knowledge is uncertain. Probably our knowledge is incomplete
and only partially correct. In particular, we will be ignorant of the
true probabilities of rare events. These will be the events that drive
non-normality, and, because they appear only rarely, they will be
those that we are most uncertain of.
To introduce the ideas, we first look at the fairly impractical case
of discrete trade results.
Non-normal Discrete Outcomes
Imagine we have a discrete set of outcomes Wi, each with
probability pi. We bet a fraction, f, of our bankroll on each
opportunity. So, the gain factor per trade is
$$G(f) = \prod_i (1 + f W_i)^{p_i} \tag{9.1}$$
Alternatively, the exponential growth per unit bet is
$$g(f) = \ln G(f) = \sum_i p_i \ln(1 + f W_i) \tag{9.2}$$
To find the value of f that maximizes the exponential growth rate,
we differentiate with respect to f and set the derivative to zero. If there are more than two outcomes, this is unwieldy or impossible to do analytically and we need to
use numerical methods. A numerical solution of a simple example
is illustrative.
Case One
p1 = 0.55
p2 = 0.45
W1 = 1
W2 = −1
(55% chance of winning a dollar and 45% chance of losing a
dollar)
This implies fmax = 0.1. The dependence of the exponential growth
rate on f is shown in Figure 9.1.
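The numerical search described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's code: it takes the growth rate to be the probability-weighted sum of ln(1 + f Wi), the function names are hypothetical, and bisection on the derivative is just one convenient method, chosen because that growth rate is concave in f.

```python
import math

def growth_rate(f, probs, payoffs):
    """Expected log growth per bet: sum of p_i * ln(1 + f * W_i)."""
    return sum(p * math.log(1.0 + f * w) for p, w in zip(probs, payoffs))

def growth_rate_slope(f, probs, payoffs):
    """Derivative of growth_rate with respect to f."""
    return sum(p * w / (1.0 + f * w) for p, w in zip(probs, payoffs))

def optimal_fraction(probs, payoffs, tol=1e-12):
    """Find the f that maximizes growth_rate by bisecting on its slope."""
    if growth_rate_slope(0.0, probs, payoffs) <= 0.0:
        return 0.0  # no positive edge, so the best bet is no bet
    worst = min(payoffs)
    if worst >= 0.0:
        raise ValueError("no losing outcome: growth is unbounded in f")
    # The slope is positive at f = 0 and falls without bound as f approaches
    # the fraction at which the worst outcome would wipe out the bankroll.
    lo, hi = 0.0, -1.0 / worst
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if growth_rate_slope(mid, probs, payoffs) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Case One: 55% chance of winning a dollar, 45% chance of losing a dollar.
f_case_one = optimal_fraction([0.55, 0.45], [1.0, -1.0])
print(round(f_case_one, 4))  # approximately 0.1
```

The same function accepts any number of discrete outcomes, which is the situation in which an analytic solution stops being practical.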
FIGURE 9.1 Growth rate as a function of f (p1 = 0.55, p2 = 0.45, W1 = 1, W2 = −1).
Case Two
p1 = 0.55
p2 = 0.43
p3 = 0.02
W1 = 1
W2 = −1
W3 = −3
Here the probability, p3, of the extreme event is low enough that
we could easily misestimate it from historical data. But the
implications of including this small probability are far from
negligible. The growth rate is illustrated in Figure 9.2.
In this case fmax is approximately 0.05, half of that in the two-outcome case. And the
growth rate becomes negative for f > 0.1. An event with only a 2%
chance of occurrence could easily be missed when we estimate
parameters, and if we erroneously think p3 = 0, we will bet at a
size that gives a negative growth rate.
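The cost of ignoring the rare outcome can be checked directly. The sketch below uses the three-outcome probabilities listed in the text (0.55, 0.43, 0.02) and a simple grid search; the variable names are hypothetical and the grid resolution is an arbitrary choice.

```python
import math

def growth_rate(f, probs, payoffs):
    """Expected log growth per bet: sum of p_i * ln(1 + f * W_i)."""
    return sum(p * math.log(1.0 + f * w) for p, w in zip(probs, payoffs))

# Three-outcome case from the text: the rare event loses three dollars.
probs = [0.55, 0.43, 0.02]
payoffs = [1.0, -1.0, -3.0]

# Scan bet fractions on a fine grid below 1/3, the fraction at which the
# worst outcome would wipe out the bankroll.
grid = [i / 10000.0 for i in range(0, 3300)]
best_f = max(grid, key=lambda f: growth_rate(f, probs, payoffs))

# Growth actually earned when betting the two-outcome optimum of 0.1.
growth_at_naive_bet = growth_rate(0.1, probs, payoffs)

print(round(best_f, 2))          # 0.05
print(growth_at_naive_bet < 0)   # True: the naive bet shrinks the bankroll
```

At f = 0.1 the growth rate is only slightly below zero, which is why 0.1 marks the point at which over-betting starts to destroy capital.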
This phenomenon is qualitatively similar when returns are
continuous. Because this is the more relevant situation for trading,
that is what we will assume when deriving methods to live with
these issues.
FIGURE 9.2 Growth rate as a function of f (p1 = 0.55, p2 = 0.43, p3 = 0.02, W1 = 1, W2 = −1, W3 = −3).
Non-normal Continuous Outcomes
We are interested in the case where the outcome of a trade is
known to have a certain continuous distribution. We bet a
fraction, f, of our wealth at the start of each period so that
(9.3)
where Bn is the random variable giving the result of the nth trade and it has the payoff g(Xn). After a sequence of n trades our bankroll will be
(9.4)
Now we take logarithms:
(9.5)
so
(9.6)
(9.7)
where Φ(x) is the distribution function that describes the results of
the trades. If we maximize over the bankroll fraction, f, we find
that the optimal value is the one that satisfies
(9.8)
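For reference, the chain of steps described above can be written out as follows. This is a sketch of the standard argument in assumed notation, not a reproduction of the numbered equations: $B_n$ is taken to be the bankroll after the nth trade, $X_n$ the random outcome of that trade, and $g(X_n)$ its payoff per unit bet.

```latex
\begin{align*}
B_n &= B_{n-1}\bigl(1 + f\,g(X_n)\bigr) \\
B_n &= B_0 \prod_{i=1}^{n} \bigl(1 + f\,g(X_i)\bigr) \\
\ln\frac{B_n}{B_0} &= \sum_{i=1}^{n} \ln\bigl(1 + f\,g(X_i)\bigr) \\
\frac{1}{n}\,E\!\left[\ln\frac{B_n}{B_0}\right]
  &= E\bigl[\ln\bigl(1 + f\,g(X)\bigr)\bigr]
   = \int \ln\bigl(1 + f\,g(x)\bigr)\,d\Phi(x) \\
0 &= \int \frac{g(x)}{1 + f\,g(x)}\,d\Phi(x)
\end{align*}
```

The last line is the first-order condition obtained by differentiating the expected log growth with respect to f, assuming the trades are independent and identically distributed.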
Applying a Taylor expansion to this equation gives
(9.9)
(9.10)
(9.11)
This can be further simplified if we note that
(9.12)
is the payoff to a unit bet.
Further
(9.13)
(9.14)
(9.15)
where $m_3$ and $m_4$ are the third and fourth raw moments of the distribution of trade results.
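So, if f is small, we can truncate the series after the first term to get
$$f \approx \frac{\mu}{\mu^2 + \sigma^2} \tag{9.16}$$
(μ and σ² being the mean and variance of the trade results). And further, if μ is small, we can approximate this by
$$f \approx \frac{\mu}{\sigma^2} \tag{9.17}$$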
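A sketch of the expansion being described, again in assumed notation ($E[g^k]$ for the kth raw moment of the payoff to a unit bet): expanding $1/(1 + fg)$ as a geometric series inside the first-order condition and integrating term by term gives a power series in f whose coefficients are raw moments.

```latex
\begin{align*}
\frac{g}{1 + f g} &= g - f g^2 + f^2 g^3 - f^3 g^4 + \cdots \qquad (|f g| < 1) \\
0 &= E[g] - f\,E[g^2] + f^2 E[g^3] - f^3 E[g^4] + \cdots \\
0 &\approx E[g] - f\,E[g^2]
  \quad\Longrightarrow\quad
  f \approx \frac{E[g]}{E[g^2]} = \frac{\mu}{\mu^2 + \sigma^2} \approx \frac{\mu}{\sigma^2}
\end{align*}
```

The last step uses $E[g^2] = \mu^2 + \sigma^2$ and then drops $\mu^2$ when μ is small. As a check against the example later in the chapter, a mean of 0.059 and a standard deviation of 1.137 give 0.059 / 1.137² ≈ 0.046.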
This is the usual expression for the Kelly ratio of a trade with a
continuum of outcomes, but it is only an approximation and if we
are in a situation where skewness is important, a better
approximation can be obtained if we keep the third term, so that
equation 9.13 becomes
(9.18)
We can solve this equation to get
(9.19)
Equation 9.19 only has real solutions if
(9.20)
(which is a limitation of our sloppy use of asymptotics).
Further, it isn't immediately obvious which root is the correct one.
Also, the case where skewness is zero leads to a singularity. We
can address these issues by taking the limit as skewness
approaches zero.
To do this note that
(9.21)
(if b is small relative to a).
So if
(9.22)
we can write
(9.23)
And so the negative root of equation 9.19 is approximately
(9.24)
which simplifies to
(9.25)
So, in order for the limiting case to agree with the Kelly fraction
when trades are normally distributed (equations 9.16 and 9.17), we need to take the negative root.
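The root selection can be illustrated with a short calculation. The notation here is an assumption rather than the book's: $m_2 = \mu^2 + \sigma^2$ and $m_3$ are the second and third raw moments, and the quadratic is the series in f truncated after its third term.

```latex
\begin{align*}
0 &= \mu - f\,m_2 + f^2 m_3 \\
f_{\pm} &= \frac{m_2 \pm \sqrt{m_2^2 - 4\mu m_3}}{2 m_3},
  \qquad \text{real only if } m_2^2 \ge 4\mu m_3 \\
\sqrt{m_2^2 - 4\mu m_3} &\approx m_2 - \frac{2\mu m_3}{m_2} - \frac{2\mu^2 m_3^2}{m_2^3}
  \qquad (4\mu m_3 \ll m_2^2) \\
f_{-} &\approx \frac{\mu}{m_2} + \frac{\mu^2 m_3}{m_2^3} \\
f_{-} &= \frac{2\mu}{m_2 + \sqrt{m_2^2 - 4\mu m_3}}
\end{align*}
```

The negative root tends to $\mu / m_2$ as $m_3 \to 0$, whereas the positive root diverges. The last line is the same root written in a form that has no singularity at $m_3 = 0$, and the second-order term shows the direction of the effect: a negative third moment lowers the fraction and a positive one raises it.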
From a practitioner's perspective, the important thing is to note
that negative skewness decreases the optimal investment fraction
and positive skewness increases the optimal investment fraction.
This effect is shown in Figure 9.3.
Figure 9.4 shows the approximation of equation 9.25.
FIGURE 9.3 The optimal investment fraction as a function of skewness (return is 0.015, volatility is 0.5).
FIGURE 9.4 The approximate investment fraction as a function of skewness (return is 0.015, volatility is 0.5).
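The direction of the skewness effect can be reproduced with a small calculation. This is a sketch under assumptions, not the code behind the figures: it solves the quadratic obtained by truncating the moment series after the third term, builds the raw moments from the mean, standard deviation, and skewness, and uses a hypothetical function name.

```python
import math

def skew_adjusted_fraction(mean, std, skew):
    """Smaller root of m3*f**2 - m2*f + mean = 0, with m2, m3 raw moments."""
    m2 = std**2 + mean**2                               # second raw moment
    m3 = skew * std**3 + 3.0 * mean * std**2 + mean**3  # third raw moment
    disc = m2 * m2 - 4.0 * mean * m3
    if disc < 0.0:
        raise ValueError("no real solution for these moments")
    # (m2 - sqrt(disc)) / (2*m3) rewritten so that m3 = 0 causes no division
    # by zero; at m3 = 0 it reduces to mean / m2.
    return 2.0 * mean / (m2 + math.sqrt(disc))

# Same return and volatility as the figures: 0.015 and 0.5.
for skew in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(skew, round(skew_adjusted_fraction(0.015, 0.5, skew), 4))
```

The printed fractions rise with skewness, in line with the practitioner's rule stated above.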
Uncertain Parameters
The value of the optimal sizing fraction will generally need to be
estimated from empirical data. Because empirical data will always
have sampling errors and uncertainty, the estimate of the sizing
parameter will also have a degree of uncertainty attached to it.
This is well-known by professional gamblers. And to mitigate the
risk of over-betting, bettors following a Kelly scheme often modify
the Kelly criterion by investing only a fraction of the optimal
amount. These schemes are known as “fractional Kelly” sizing. By
doing this, traders accept that they will be reducing growth but
will also more drastically reduce variance.
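The trade-off can be made concrete under the same small-bet approximation that leads to equation 9.17, in which growth is roughly fμ − f²σ²/2 (an assumption of this sketch, as are the function names). Betting a multiple k of the Kelly fraction then earns k(2 − k) of the maximum growth, while the variance of returns scales with k².

```python
def relative_growth(k):
    """Growth at k times the Kelly fraction, relative to full Kelly."""
    return k * (2.0 - k)

def relative_variance(k):
    """Return variance at k times the Kelly fraction, relative to full Kelly."""
    return k * k

for k in (1.0, 0.5, 0.25):
    print(k, relative_growth(k), relative_variance(k))
# Half Kelly keeps 75% of the growth with 25% of the variance.
```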
However, simply scaling the investment fraction doesn't protect
against a bigger problem: the case in which the investment
fraction is estimated to be positive, but the true value is negative.
In this case, investing any positive fraction of the bankroll will be
over-betting.
In order to estimate the chances of this happening we need the
variance of the estimated Kelly criterion ratio (Sinclair, 2014).
fmax (approximated in equation 9.17) is a statistical estimator and
has an associated probability distribution.
First consider the case of normal trade results. Here the central
limit theorem says that the estimators of the mean, $\hat{\mu}$, and
variance, $\hat{\sigma}^2$, asymptotically have the following normal
distributions, where $\mu$ and $\sigma^2$ are the population mean and
variance respectively.
(9.26)
(9.27)
Alternatively, the estimation errors of the mean and variance
can be approximated by
(9.28)
(9.29)
Denote by $f(\mu, \sigma^2)$ the Kelly ratio of equation 9.17. The estimator is then just
$f(\hat{\mu}, \hat{\sigma}^2)$. The estimation errors in the mean and
variance will lead to estimation errors in f.
If we define $\theta = (\mu, \sigma^2)^T$ to be the column vector of the normal
distribution's parameters, this has an estimate of $\hat{\theta} = (\hat{\mu}, \hat{\sigma}^2)^T$.
For IID returns, $\hat{\theta} - \theta$ is asymptotically normal with mean zero and covariance matrix $V$,
where $V$ is the variance of the estimation error of $\hat{\theta}$.
Denoting the estimator of the Kelly ratio to be $f(\hat{\theta})$, where f() is
now a function that estimates the Kelly ratio, we next apply the
delta method (see, for example, Oehlert, 1992).
This states that the variance of a function $f(\hat{\theta})$ is
(9.30)
(9.31)
and
(9.32)
so, evaluating equation 9.30 gives the asymptotic variance of our estimate of the Kelly ratio as
(9.33)
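For the normal case the delta-method calculation can be sketched as follows. The notation is assumed rather than taken from the numbered equations: $V$ is the covariance matrix of the estimation errors of $(\hat{\mu}, \hat{\sigma}^2)$, and $f = \mu / \sigma^2$ is the Kelly ratio of equation 9.17.

```latex
\begin{align*}
\hat{\mu} - \mu &\approx N\!\left(0, \frac{\sigma^2}{n}\right), \qquad
\hat{\sigma}^2 - \sigma^2 \approx N\!\left(0, \frac{2\sigma^4}{n}\right) \\
V &= \frac{1}{n}\begin{pmatrix} \sigma^2 & 0 \\ 0 & 2\sigma^4 \end{pmatrix}, \qquad
\nabla f = \begin{pmatrix} \partial f / \partial \mu \\ \partial f / \partial \sigma^2 \end{pmatrix}
         = \begin{pmatrix} 1/\sigma^2 \\ -\mu/\sigma^4 \end{pmatrix} \\
\operatorname{Var}\bigl(f(\hat{\theta})\bigr) &\approx \nabla f^{T}\, V\, \nabla f
  = \frac{1}{n}\left(\frac{1}{\sigma^2} + \frac{2\mu^2}{\sigma^4}\right)
\end{align*}
```

The off-diagonal entries of $V$ are zero because the sample mean and sample variance of normal data are independent.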
If the trade returns are not normally distributed, we need to make
use of the result (Zhang, 2007) that
(9.34)
where $\mu_3$ is the third central moment of the population
distribution. Now equation 9.30 gives
(9.35)
(9.36)
It isn't possible to find the variance of the sizing fraction given by
equation 9.19, because the variance of the skewness would need to
be evaluated for the particular distribution the results were drawn
from. The best we can do in general is to measure the empirical
skewness, calculate the sizing ratio using equation 9.19, then
estimate the variance around that value by using equation 9.36.
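The delta-method calculation can also be sketched numerically. The code below is an illustration under stated assumptions rather than the book's equation: it keeps the normal-case variance of the variance estimator (2σ⁴/n), adds a covariance of μ3/n between the sample mean and sample variance for skewed data, and uses n − 1 in place of n. Function and variable names are hypothetical, and because the exact form of the book's expression is not reproduced here, the skew-adjusted output should not be expected to match the figure quoted in the text.

```python
import math

def kelly_std_error(mean, var, n, third_central_moment=0.0):
    """Delta-method standard error of the Kelly ratio mean / var."""
    d_mean = 1.0 / var        # partial derivative with respect to the mean
    d_var = -mean / var**2    # partial derivative with respect to the variance
    var_of_mean = var / n
    var_of_var = 2.0 * var**2 / n            # normal-case term (assumption)
    cov_mean_var = third_central_moment / n  # zero for symmetric data
    variance = (d_mean**2 * var_of_mean
                + d_var**2 * var_of_var
                + 2.0 * d_mean * d_var * cov_mean_var)
    if variance < 0.0:
        raise ValueError("approximation gave a negative variance")
    return math.sqrt(variance)

# Summary statistics of the option trade (Table 9.1).
mean, std, skew, n = 0.059, 1.137, -6.199, 1000
var = std**2
mu3 = skew * std**3  # third central moment implied by the skewness

se_normal = kelly_std_error(mean, var, n - 1)
se_skewed = kelly_std_error(mean, var, n - 1, mu3)
print(round(mean / var, 3))  # 0.046
print(se_normal, se_skewed)
```

With a positive mean, negative skewness makes the cross term positive, so the standard error is larger than in the normal case.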
We now use an example of real trade results to show the
importance of including estimation error in trade sizing. The trade
results are from a proprietary short volatility strategy. It is
somewhat typical of many such strategies in that it has a positive
expected value but a large negative skewness. The summary
statistics for these trade results are given in Table 9.1 and the
distribution of results is shown in Figure 9.5.
We can rearrange (and slightly modify) equation 9.36 to give an
explicit expression for the estimated standard deviation of the
Kelly ratio.
(9.37)
where the denominator of n − 1 is due to applying Bessel's
correction.
TABLE 9.1 Summary Statistics for the Option Trade
Sample size           1000
Mean                  $0.059
Standard deviation    $1.137
Skewness              −6.199
FIGURE 9.5 The distribution of the option trade results.
Because of the central limit theorem, we know that the
distribution of f is normal so we can calculate the probability that f is actually below any critical value f*.
$$P(f < f^*) = Z(f^*) \tag{9.38}$$
where Z is the cumulative distribution function of the normal
distribution with mean equal to the estimated f and a standard deviation calculated
from equation 9.37.
Equation 9.17 gives the Kelly ratio as 0.046, but equation 9.37 tells us that the standard deviation of this point estimate is 0.031, so
our point estimate is only about 1.5 standard deviations above zero.
There is a 7% chance that the true Kelly ratio of the population is
less than zero.
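The 7% figure can be checked from the two numbers quoted above. This is a minimal sketch that takes the point estimate (0.046) and its standard deviation (0.031) as given and treats the sampling distribution as normal; the function name is hypothetical.

```python
from statistics import NormalDist

def prob_true_ratio_below(threshold, estimate, std_error):
    """Probability that the true Kelly ratio lies below the threshold."""
    return NormalDist(mu=estimate, sigma=std_error).cdf(threshold)

p_negative = prob_true_ratio_below(0.0, 0.046, 0.031)
print(round(p_negative, 2))  # 0.07
```

The same function gives the chance of falling below any other critical value, such as half the estimated ratio.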
Having an expression for the sampling distribution also enables us
to estimate the chance that we are over-betting so much that our
growth rate is negative. This case corresponds to the true value of
f being roughly less than half the estimated value. Equation 9.38
tells us this is 25%.
TABLE 9.2 Fractional Schemes Corresponding to Various Probabilities of Over-Betting
Chance of Over-betting    Corresponding Benchmark    Kelly Scale Factor
0.1                       0.0022                     0.0480
0.15                      0.0104                     0.2301
0.2                       0.0169                     0.3748
TABLE 9.3 Fractional Schemes Corresponding to Various Probabilities of Over-Betting When Setting Skewness of the Trading Results to Zero
Chance of Over-betting    Corresponding Benchmark    Kelly Scale Factor
0.1                       0.0092                     0.2054
0.15                      0.0161                     0.3574
0.2                       0.0215                     0.4782
This leads us to a complementary way to use the information. We
can use equation 9.38 to solve for a benchmark given that we want
a certain chance of over-betting. For example, we have just seen
that using a benchmark of half the measured Kelly fraction (i.e.,
betting at “half-Kelly”) still implies a 25% chance that we will be
over-betting. Table 9.2 shows the probabilities of over-betting for various fractional Kelly schemes.
So, in order to introduce a margin of safety we would need to scale
the measured Kelly ratio by a considerable amount. This is in line
with the practice of professional gamblers. Much of this need for
scaling is due to the presence of negative skewness. If the returns
were normally distributed, the scaling could be reduced. This is
shown in Table 9.3.
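Solving for the benchmark amounts to inverting the normal distribution function. The sketch below shows the mechanics with hypothetical function names. It uses the rounded estimate (0.046) and standard deviation (0.031) quoted in the text, so its output should not be expected to match the tables digit for digit.

```python
from statistics import NormalDist

def benchmark_for_overbet_chance(chance, estimate, std_error):
    """Bet size f* such that P(true Kelly ratio < f*) equals chance."""
    return NormalDist(mu=estimate, sigma=std_error).inv_cdf(chance)

def kelly_scale_factor(chance, estimate, std_error):
    """Benchmark expressed as a fraction of the estimated Kelly ratio."""
    return benchmark_for_overbet_chance(chance, estimate, std_error) / estimate

for chance in (0.1, 0.15, 0.2):
    benchmark = benchmark_for_overbet_chance(chance, 0.046, 0.031)
    scale = kelly_scale_factor(chance, 0.046, 0.031)
    print(chance, round(benchmark, 4), round(scale, 4))
```

A smaller tolerated chance of over-betting always produces a smaller benchmark, and a larger standard deviation of the estimate pushes every benchmark toward zero.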
Kelly and Drawdown Control
Even after calculating and allowing for our measurement
uncertainty, it is likely that investors will find that investing the
full Kelly fraction leads to results that are unpalatably volatile.
And the more edge there is, the higher the Kelly ratio will be and
so the higher the volatility will be. Good trades are the most
volatile.
The standard way to mitigate drawdowns is to trade using a
fraction of the Kelly ratio. In this case, both growth rate and
volatility will drop. If we trade at a fraction, f, of the Kelly ratio,
Positional Option Trading (Wiley Trading) Page 19