by Scott Soames
prob (p | q) = prob (p and q) / prob q
which is a consequence of what is sometimes called the product rule.
prob (p and q) = prob (p | q) × prob q
Both are among the laws of the standard probability calculus.
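Both laws can be checked by brute enumeration in the two-dice case. The following Python sketch assumes the dice are fair, so that all 36 outcomes are equally likely; the two events chosen (`total_is_7`, `first_is_3`) and the helper names are illustrative only and do not come from the text.

```python
from fractions import Fraction
from itertools import product

# All 36 outcomes of rolling two dice, assumed fair (equally likely).
OUTCOMES = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a predicate on outcomes) under equal likelihood."""
    favorable = sum(1 for o in OUTCOMES if event(o))
    return Fraction(favorable, len(OUTCOMES))

def cond_prob(p, q):
    """prob (p | q) = prob (p and q) / prob q, defined only when prob q > 0."""
    return prob(lambda o: p(o) and q(o)) / prob(q)

# Illustrative events (hypothetical choices for this sketch).
def total_is_7(o):
    return o[0] + o[1] == 7

def first_is_3(o):
    return o[0] == 3

p_both = prob(lambda o: total_is_7(o) and first_is_3(o))
p_given = cond_prob(total_is_7, first_is_3)

# Product rule: prob (p and q) = prob (p | q) x prob q
assert p_both == p_given * prob(first_is_3)
print(prob(total_is_7), p_given, p_both)
```

Exact fractions are used rather than floating-point numbers so that the two sides of the product rule can be compared for strict equality.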
All these laws are easily verified in simple games of chance of the kind here discussed. In that limited sense, the rationality of these laws is on par with the rationality of the calculations we have made concerning simple propositions about outcomes of individual rolls of the dice. Crucially, however, the laws also impose a more general constraint on rationality, independent of the probabilities we have taken for granted regarding simple cases like the proposition that the dice will come up 7. After all, the dice might not be fair; they might be constructed so as to systematically make some combinations come up more frequently than others. Since agents are aware of this possibility, we can imagine situations in which they accept hypotheses about the dice that result in probability assignments to simple propositions about rolls of the dice that conflict with one another and with the probabilities we have been assuming here. In some situations, their conflicting assignments may be rational. Each agent may have evidence based on his or her own experience plus testimony from normally reliable others for adopting a particular hypothesis about the dice. Thus, no agent may be irrational, in which case the probability assignments based on the evidence available to the different agents won’t be irrational either.
The same can’t be said for violations of the laws of probability. Assignments violating them can be shown to be irrational. Just as sets of propositions that are inconsistent with the laws of deductive logic can’t all be true, no matter what possible state the world is in, so probability assignments inconsistent with the laws of probability generate sets of bets which, taken together, will lose, no matter what state the world is in, despite being acceptable to one who adopts those probabilities. In fact, the rational imperative to conform one’s degrees of belief in propositions (assessments of their probabilities) to the laws of probability may well be stronger than the imperative to ensure that one’s beliefs tout court don’t form a logically inconsistent set—in the sense of belief in which for any agent A and proposition p, A either believes or does not believe p (no middle ground).
Though there may be something like a cost in believing any falsehood, that cost may sometimes be outweighed by the value of gaining other, true, beliefs. For example, it may be rational to believe, of each ticket in a heavily subscribed lottery, that it will lose, while also believing that one ticket will win, even though one’s beliefs will then be logically inconsistent.4 It may also be rational to retain one’s belief in each of a series of propositions about ordinary matters about which one has formed an opinion, while also recognizing one’s fallibility, and so coming to believe that at least one of those propositions is false, even though that renders one’s belief set logically inconsistent. By contrast, it is not clear that there is anything to be said for adopting a set of credences (degrees of belief) which, if systematically acted upon, will guarantee that one’s aims are frustrated. So, although optimal rationality doesn’t require that one never believe all members of a set of logically inconsistent propositions, optimal rationality may require the credences on which one bases one’s decisions and actions always to obey the laws of probability.
Optimal rationality of one’s credences also involves conditional probability. Our assignments of probabilities to simple propositions—e.g., that the dice will come up 7—are often based on evidence. If we are in doubt about whether the dice are fair, we may gather evidence bearing on the question by rolling them two thousand times and calculating the ratio of times they come up 7 (or any other possible numerical outcome). If the observed proportion matches or closely approximates the ideally expected proportion—e.g., 1/6 in the case of coming up 7—then, all other things being equal, we will naturally come to have that credence in the proposition that the next roll will come up 7. More generally, our post-test unconditional probability that the next roll will come up 7 should equal (or perhaps closely approximate) m/n if and only if the dice came up m/n in our test, in which case our post-test conditional probability that the dice will come up m times, conditional on its being rolled n more times, is m/n. This, roughly put, is a further constraint on our rational credences. In short, our unconditional probability for a proposition p should match our conditional probability of p given evidence e, once we have verified e and know it to be true. This holds for a wide range of propositions p and e (including those in which p is a scientific theory and e is evidence for it). If one’s credence in a theory T that makes an important, but so far untested, prediction p is x, then typically one’s conditional credence in T, given p, will be x plus some positive amount, while one’s conditional credence in T, given ~p, will typically be x minus some amount. When one then finds out whether p is true, or false, one’s unconditional credence in T will be the higher, or the lower, number. This, it is reasonable to think, is how empirical confirmation works.
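The pattern of confirmation just described can be illustrated with a small Python sketch. All the numbers in it are invented for illustration: a prior credence x of 1/2 in T, and assumed credences in the prediction p given T and given ~T. The function name `conditional_credences` is likewise hypothetical.

```python
from fractions import Fraction

def conditional_credences(prior_T, p_given_T, p_given_not_T):
    """Return (credence in T given p, credence in T given ~p).

    Applies prob (T | e) = prob (T and e) / prob e, for e = p and e = ~p.
    """
    prob_p = p_given_T * prior_T + p_given_not_T * (1 - prior_T)
    T_given_p = (p_given_T * prior_T) / prob_p
    T_given_not_p = ((1 - p_given_T) * prior_T) / (1 - prob_p)
    return T_given_p, T_given_not_p

# Hypothetical inputs: x = 1/2; p is likely if T is true, less likely otherwise.
x = Fraction(1, 2)
up, down = conditional_credences(x, Fraction(9, 10), Fraction(3, 10))

# Verifying p raises the credence in T above x; verifying ~p lowers it below x.
assert down < x < up
print(up, down)
```

On these assumed figures, the agent's unconditional credence in T moves to the higher number once p is verified and to the lower number once ~p is verified.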
PHILOSOPHICAL FOUNDATIONS OF A GENERAL THEORY OF RATIONAL DECISION AND ACTION
The norm of rationality—that one’s credences obey the laws of probability—was noticed by a founder of subjective probability theory, F. P. Ramsey, in his seminal 1926 article “Truth and Probability.”5 The subjective probability of a proposition p for an agent A is the degree of confidence that A has in the truth of p. In calling this probability “subjective,” we are in no way impugning, or endorsing, the accuracy or rationality of A’s degree of confidence in p. There is no implied contrast with the “true” or “objective” probability that p is true (if such even makes sense in the situation in which A is considering p). We are simply measuring how strongly A believes, or is inclined to believe, p. After laying out the laws of probability, Ramsey notes that an agent whose probabilities are inconsistent with them is always subject to a “Dutch book”—i.e., a set of bets which, though acceptable (at odds based on the agent’s subjective probabilities), guarantees that the agent ends up a net loser, no matter how the world turns out.6 The idea is illustrated by the following simple example.
Suppose X’s estimate of the probability for a disjunction, A or B, of incompatible disjuncts A, B, is less than the sum of X’s probabilities for A and B, in violation of the law that such probabilities are additive. Imagine that X’s probability for A is 1/5, for B is 3/10, and for A or B is 49/100, even though the rule for A or B requires it to be 5/10. To ensure X’s loss, we can proceed as follows. First, we buy a bet (paying $100 if we win) from X on A or B for $49—i.e., we can, by giving him $49 now, obtain X’s assurance to pay us $100 if the disjunction proves to be true. The price of this bet, $49, is less than the sum of the price ($20) for a bet (paying $100) on A and the price ($30) of a bet (paying $100) on B. So, after buying the bet on A or B from X, we next sell X individual bets on A and B, one for $20 and one for $30. We can do all this because, given X’s credences on the disjunction and the two disjuncts, X is willing to take either side of each of the bets. But now, X will lose no matter whether A alone, B alone, or neither A nor B is true: (i) If neither is true, X loses $50 on his two bets, while gaining $49 on the one he sold us, leaving X with a net loss of $1. (ii) If only A is true, X has a gain of $80 on bet A, a loss of $30 on B, and a loss of $51 on the bet X sold us on A or B, again leaving X with a net loss of $1. (iii) If only B is true, X has a gain of $70 on bet B, a loss of $20 on A, and a loss of $51 on the bet X sold us on A or B, once more leaving X with a net loss of $1. The result generalizes. If your probabilities violate any laws of probability, a Dutch book against you can always be made.7 If your assignments of probabilities are consistent with the laws, it’s not possible to make a Dutch book against you.8
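The bookkeeping in this example can be tabulated mechanically. The Python sketch below takes X's three credences and the $100 stake from the example; the function name `x_net` and the dictionary layout are illustrative choices, and the sketch assumes, as the example does, that X buys the bets on A and on B and sells the bet on A or B at prices fixed by X's own credences.

```python
from fractions import Fraction

STAKE = 100  # each bet pays $100 if it wins

# X's credences from the example (A and B are incompatible).
cred = {"A": Fraction(1, 5), "B": Fraction(3, 10), "A or B": Fraction(49, 100)}

def x_net(a_true, b_true):
    """X's net result in dollars for one state of the world."""
    net = 0
    # X buys a bet on A and a bet on B: pays the price, collects if it wins.
    net += (STAKE if a_true else 0) - cred["A"] * STAKE
    net += (STAKE if b_true else 0) - cred["B"] * STAKE
    # X sells the bet on A or B: collects the price, pays out if it wins.
    net += cred["A or B"] * STAKE - (STAKE if (a_true or b_true) else 0)
    return net

# Since A and B cannot both be true, there are three states to consider.
states = {
    "neither": (False, False),
    "only A": (True, False),
    "only B": (False, True),
}
results = {name: x_net(*truth) for name, truth in states.items()}
for name, net in results.items():
    print(name, net)
```

Because X's price for the disjunction falls short of the sum of X's prices for the disjuncts by $1, the same shortfall shows up as X's net result in every state.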
Having left the realm of games of chance conforming to the stipulation that the various alternatives are equally likely, we next need to find a general way of assigning agent-relative subjective probabilities to propositions. To do this we must face a challenging question: What does it mean to say that an agent’s degree of confidence in, or subjective probability for, p is n? Ramsey’s pathbreaking answer begins by identifying the aspect of belief he is trying to measure. For him, “the kind of measurement of belief with which probability is concerned is … a measurement of belief qua basis of action.”9 He proposes to measure this by finding the lowest odds the agent would accept for a bet on its truth. But he recognizes that the marginal utility of money renders monetary odds insufficiently general for calculating utilities. He also notes other factors—like enjoyment or aversion to gambling—that interfere with the idea that the monetary odds one is willing to take on gambles can be the basis of accurate measures of one’s confidence in the truth of a given proposition. As we shall see, he believes that an accurate measure can be found.
This leads him to claim that “we act in the way we think most likely to realize the objects of our desires, so that a person’s actions are entirely determined by his desires and opinions.”10 Assuming the value one seeks to be additive, Ramsey observes that one “will always choose the course of action which will lead in one’s opinion to the greatest sum of good.”11 He then introduces the idea that we behave so as to maximize expected utility (or value).
I suggest that we introduce as a law of psychology that [the] behavior [of an arbitrary agent] is governed by what is called mathematical expectation; that is to say that if p is a proposition about which he is doubtful, any goods or bads for whose realization p is in his view a necessary and sufficient condition enter into his calculations multiplied by the same fraction, which is called the ‘degree of his belief in p’. We thus define degree of belief in a way that presupposes the use of the mathematical expectation. We may put this in a different way. Suppose his degree of belief in p is m/n; then his action is such as he would choose it to be if he had to repeat it exactly n times, in m of which p was true, and in the others false. (Here it may be necessary to presuppose that in each of the n times he had no memory of the previous ones.)12
Although this idea is arresting, it shouldn’t be taken as identifying any definite psychological mechanism generating action. In particular, it shouldn’t be taken to suggest that agents either consciously or unconsciously perform numerical calculations designed to identify utility-maximizing actions. Nor (I think) should it be taken as suggesting that agents always do maximize expected utility (value)—in the sense in which the latter arises from applying the laws of probability to their utilities plus their degrees of belief in simple propositions. Ramsey should be understood as allowing that agents sometimes violate those laws, and that, when they do, their actions will be out of line with what is rationally required by what they value plus their degrees of belief in simple propositions.
Still, in proposing his conception as a psychological law, he is, in effect, proposing it as a model that roughly tracks what we generally do. If, as we normally assume, most people are aware of what they want and are reasonably well attuned to what will advance their interests—no matter what actual decision processes they go through—then a normative theory of rational decision may model the general behavioral tendencies of individuals responding to common reward structures in similar situations. Indeed, it may be argued that had humans not been bred by natural selection to be reasonably good expected utility maximizers, the species would probably not have been so successful.
Ramsey illustrates his model by imagining a rational agent A on a journey to destination Z, coming to a fork in the road. A thinks the right fork is a more direct route to Z than the left. This makes a difference, since people are waiting at Z for A, and it is better for A to arrive early than late. Feeling a bit more confident in the right fork, A takes it, while looking for an opportunity to ask directions. Having gone a short distance, A sees a farmer working half a mile away in a field, causing A to deliberate whether to walk over and ask for directions. This is Ramsey’s decision problem.
Suppose A is twice as confident that the right fork is the direct route as that it is not—i.e., A’s subjective probability that the right fork is the direct route is ⅔. Then the question of whether to ask for directions depends on the values of (i) arriving in Z at the right time R (taking the direct route), (ii) arriving at a worse time W (taking the other route), and (iii) the time D needed to consult the farmer. Given A’s credence of ⅔ that the right fork is direct, we can evaluate A’s options by imagining A confronted with the decision three times. The total value A would expect from deciding three times not to ask is (3 × ⅔ × the value of arriving at R) + (3 × ⅓ × the value of arriving at W). This equals (2 × the value of arriving at R) plus the value of arriving at W. By contrast, the value A would expect from three decisions to ask would be 3 × the value of arriving at R minus 3 × the value of the time D spent to ask directions. (We assume the farmer knows the right direction and would tell it to A if A asks.)
So for it to be rationally worthwhile for A to ask directions, the number associated with asking must be greater than the number associated with not asking. For this to be so, D (the cost of asking) must be less than (the value of arriving at R minus the value of arriving at W) × ⅓.
2R + W < 3R − 3D
3D < 3R − 2R − W
3D < R − W
D < (R − W) × ⅓
Noting that ⅓ is 1 minus A’s subjective probability for proposition p (that the right fork is the direct route), we see that this means that if the decision is to be consistent with A’s credence in p, the cost D of asking must be less than (the value of arriving at R minus the value of arriving at W) × (1 minus the probability of p). The result generalizes. For any m and n, if A’s credence in p is m/n, the cost of asking for directions must be less than (the value of arriving at R minus the value of arriving at W) × (1 minus the probability of p) (i.e., 1 minus m/n).
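The comparison between asking and not asking can be sketched in Python as follows. Only the credence of ⅔ comes from the example; the utilities assigned to R and W, the two trial costs D, and the function names are hypothetical values chosen for illustration.

```python
from fractions import Fraction

def value_not_asking(prob_p, R, W):
    """Expected value of pressing on: R if p is true, W otherwise."""
    return prob_p * R + (1 - prob_p) * W

def value_asking(R, D):
    """Value of asking: arrival at R is assured, less the cost D of asking."""
    return R - D

def max_cost_of_asking(prob_p, R, W):
    """Asking is worthwhile only if D < (R - W) x (1 - prob_p)."""
    return (R - W) * (1 - prob_p)

# Hypothetical utilities for arriving at R and at W; credence 2/3 as in the text.
R, W, prob_p = Fraction(10), Fraction(4), Fraction(2, 3)
threshold = max_cost_of_asking(prob_p, R, W)  # (10 - 4) x 1/3 = 2

# A cost below the threshold favors asking; a cost above it does not.
for D in (Fraction(1), Fraction(3)):
    worthwhile = value_asking(R, D) > value_not_asking(prob_p, R, W)
    assert worthwhile == (D < threshold)
    print(D, worthwhile)
```

The sketch works with per-decision expected values rather than three imagined repetitions; the two formulations differ only by the common factor of 3.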
In working through this example, we started with A’s subjective probability of being on the right road to A’s destination. From this, we computed the maximum loss of time it would be rational for A to devote to asking for directions, as a function of the difference in value of arriving at the destination at one time rather than another. But the law-like relationship we found would also allow us to compute other variables given other information. For example, given a particular value of the maximum time D that A is willing to expend asking for information, we can calculate the subjective probability of p required for it to be rational to ask for directions. For the value of D to equal (the value of arriving at R minus the value of arriving at W) × (1 minus the probability of p), the value of D divided by the value of arriving at R minus the value of arriving at W must equal 1 minus the probability of p; so the probability of p equals 1 minus the value of D divided by (the value of arriving at R minus the value of arriving at W).
D = (R − W) × (1 − Prob (p))
D / (R − W) = 1 − Prob (p)
Prob (p) = 1 − D / (R − W)
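Read in this direction, the relationship recovers a credence from a willingness to pay. A minimal Python sketch, with hypothetical values for D, R, and W and an illustrative function name, might look like this:

```python
from fractions import Fraction

def implied_probability(D, R, W):
    """Prob (p) = 1 - D / (R - W), assuming R > W and 0 <= D <= R - W."""
    D, R, W = Fraction(D), Fraction(R), Fraction(W)
    if R <= W:
        raise ValueError("requires the value of R to exceed the value of W")
    if not 0 <= D <= R - W:
        raise ValueError("requires 0 <= D <= R - W")
    return 1 - D / (R - W)

# Hypothetical values: the most A will give up to ask is 2, where arriving
# at R is worth 10 and arriving at W is worth 4.
credence = implied_probability(2, 10, 4)
print(credence)  # 1 - 2/6 = 2/3
```

The range checks reflect the assumption that the result is to be a probability between 0 and 1; they are not part of the derivation in the text.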
Here we simply assumed we could measure the agent’s utilities. To further generalize the example, we need a way of explaining what this amounts to. What we need is not just a linear ordering of the agent’s preferences over different outcomes, but a measure of how much better the agent takes certain outcomes to be than others. In short, we need a way of assigning numerical values to the agent’s utilities. First let’s see what we can do once we have such a measure. Then we can investigate what agent-relative utilities really are. (Since the discussion contains a bit more technicality than some readers may be comfortable with, those who wish may proceed to the final section of the chapter, “Social-Scientific Applications,” without loss of continuity.)
THE LAW-LIKE CONNECTIONS BETWEEN SUBJECTIVE PROBABILITY AND AGENT-RELATIVE UTILITY
Let p be a contingent proposition (which is true if the world is a certain way and false otherwise). Let A, B, and C be outcomes (representing states the world could be in) with utilities U(A), U(B), and U(C) for a particular agent. (In doing the calculation we assume we know what the outcomes are and can assign them numerical values.) The particular outcomes A, B, and C are chosen so that the agent is indifferent to receiving U(A) for certain versus accepting U(B) if p is true, and U(C) if p is false. In other words, U(A) is the value of the gamble U(B) if p is true, and U(C) if p is false. The subjective probability of p for this agent is then determined by the odds at which the agent would accept the gamble.
Suppose the agent would take either side of the bet on p at odds of 3 to 5, which means that if p turns out to be true, the agent gets an outcome of value 5 (which is U(B)), while if p is false the agent gets an outcome of value 3 (which is U(C)). This translates directly into a ⅜ subjective probability for p and a ⅝ subjective probability for ~p, which means that U(A) = (⅜ × 5) + (⅝ × 3) = 30/8. Given this as U(A), we see that [U(A) minus U(C)] / [U(B) minus U(C)] = (6/8 × ½) = 6/16 = ⅜.
That is how Ramsey defines the subjective probability of p for the agent—provided we set up the crucial gamble with U(B) greater than U(C), as we just did. So, when the agent would take either side of a bet on p at odds of, say, 7 to 3, we do our computations on the equivalent bet with odds of 3 to 7 on the truth of ~p. Here we set U(B) at 7 and U(C) at 3. So, if ~p turns out to be true (and p is false), the agent gets value 7, while if ~p is false (and p is true), the agent gets value 3. This translates into a 3/10 subjective probability for ~p and a 7/10 subjective probability for p, which means that U(A) = (3/10 × 7) + (7/10 × 3) = 42/10. Given this as U(A), we see that [U(A) minus U(C)] / [U(B) minus U(C)] = 12/10 × ¼ = 3/10. Since this is the probability of ~p, the probability of p is 7/10.
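Both computations can be reproduced with a short Python sketch. The utilities and probabilities are those of the two worked examples; the function names are illustrative. Note that the sketch is only a consistency check: it computes U(A) from an assumed probability and then confirms that the ratio [U(A) minus U(C)] / [U(B) minus U(C)] returns that same probability.

```python
from fractions import Fraction

def gamble_value(prob_win, U_B, U_C):
    """U(A): the value of getting U(B) with probability prob_win, else U(C)."""
    return prob_win * U_B + (1 - prob_win) * U_C

def ramsey_probability(U_A, U_B, U_C):
    """[U(A) - U(C)] / [U(B) - U(C)], for a gamble set up with U(B) > U(C)."""
    if U_B <= U_C:
        raise ValueError("set up the gamble so that U(B) > U(C)")
    return (U_A - U_C) / (U_B - U_C)

# First example: bet on p with U(B) = 5 and U(C) = 3.
U_A1 = gamble_value(Fraction(3, 8), 5, 3)        # 30/8
prob_p1 = ramsey_probability(U_A1, 5, 3)         # 3/8

# Second example: bet on ~p with U(B) = 7 and U(C) = 3.
U_A2 = gamble_value(Fraction(3, 10), 7, 3)       # 42/10
prob_not_p2 = ramsey_probability(U_A2, 7, 3)     # 3/10
prob_p2 = 1 - prob_not_p2                        # 7/10

print(prob_p1, prob_not_p2, prob_p2)
```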