by Gerald Gaus
Let x = Prohibition of deficits;
Let y = Prohibition of tax increases;
Let z = Prohibition on cutting vital services.
Figure 2-3. The interactions of policies and resulting justice
Suppose we start out in social world a, which has limits on deficits. Our libertarian may judge this world to be reasonably just because current generations cannot push the costs of their consumption on to the future, and so will be apt to be more cautious about governmental expenditure. But recall that our libertarian is concerned with the less well-off members of society. The libertarian may therefore judge society b to be more just than a, since b protects vital services on which the least well-off depend. However, suppose we move to world c that keeps the prohibition on cutting vital services, but drops prohibition on deficits. On this perspective, the social realization of world c is a less just world than either a or b (with a score of 4 compared to 10 or 12), as the prohibition on cutting vital services is likely (given the model used by the mapping function) to inflate the size of the state whose costs will either be pushed onto future generations or funded through increases in taxation. Introducing a limit on taxation in world d at least mitigates some potential injustices, raising the justice of d compared to c (8 compared to 4), but still leaving d less just than a or b. Now suppose that in e, as in world d, there is a prohibition on increased taxation, but also, as in world b, prohibitions both on cutting vital services and on deficits. One might think that the libertarian would judge this to be the best of all worlds. However, now the libertarian’s model of how this set of institutions will work out may lead to a “California syndrome,” in which expenses can neither be cut nor paid for, giving rise to the real possibility of default on the state’s obligations, which may pose the greatest threat to justice of all, leaving e the least just social world.46
Note that in figure 2-3, as we move from a to e, we have “justice peaks” at b and d, with “gullies” in between. Accordingly, the perspective generates an optimization problem more akin to the rugged landscape of figure 2-4 than to the Mount Fuji landscape of figure 2-2. If the various dimensions (institutions, rules, etc.) of the social world that a perspective is judging in terms of justice interact (on the perspective’s preferred model or models, included in MP) as in our bleeding-heart libertarian toy example, then we are confronted with an NK optimization problem—one in which we are optimizing over N dimensions with K interdependencies among them.47 When K = 0, that is, when there are no interdependencies between the justice scores of the individual institutions, we are apt to face a simple sort of optimization problem depicted in figure 2-2. When we face a simple optimization problem, the more of each element the better, and each act of local optimization puts us on a path toward global optimization, or the realization of an ideal. Not so when K begins to increase (as in evolutionary adaptation). When multiple dimensions (in our example, institutions) interact in complex ways to produce varying justice scores, as we saw in figure 2-3, we are faced with a rugged landscape in which optimization is much more difficult.
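The peaks and gullies of the toy example can be checked mechanically. The following is only a sketch: the scores for a through d are those given in the text, the feature sets are my reading of the description of each world, and the score for e is an assumption (the text says only that e is the least just world). The names `WORLDS` and `local_peaks` are hypothetical.

```python
# Toy reconstruction of figure 2-3.
# x = prohibition of deficits, y = prohibition of tax increases,
# z = prohibition on cutting vital services.
WORLDS = [
    ("a", {"x"}, 10),
    ("b", {"x", "z"}, 12),
    ("c", {"z"}, 4),
    ("d", {"y", "z"}, 8),
    ("e", {"x", "y", "z"}, 2),  # ASSUMED score: the text only says e is least just
]

def local_peaks(worlds):
    """Return the names of worlds that score higher than every adjacent world."""
    peaks = []
    for i, (name, _features, score) in enumerate(worlds):
        neighbors = [worlds[j][2] for j in (i - 1, i + 1) if 0 <= j < len(worlds)]
        if all(score > n for n in neighbors):
            peaks.append(name)
    return peaks

print(local_peaks(WORLDS))
```

On these numbers the only local peaks are b and d, matching the description above. Any assumed score for e below 8 gives the same result.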
I believe that the NK characteristics of the justice of social institutions—that they have multiple dimensions on which they are evaluated and these display interdependencies as in figure 2-3—are critical in creating the rugged justice landscape of figure 2-4. However, this “justice as an NK problem” hypothesis is (roughly) sufficient to create rugged optimization problems, but by no means necessary. Even if justice were a simple unidimensional criterion (e.g., justice depended on only one institution and was itself a simple criterion), searching for the global optimum would still confront a perspective with a rugged landscape if the underlying structure that the perspective discerns in the similarity ordering of social worlds (the x-axis) is not well correlated with y-axis scores. To see this, take the simple unidimensional y-axis value of height, and arrange a group of one hundred people according to the similarity criterion of alphabetical ordering of first names. If we array our one hundred people on a line, their heights will be rugged indeed—the diminutive Willamina may be standing between Big Wallace and the 7′1″ Wilt the Stilt. Whenever a perspective arrays on the x-axis the elements in the domain in a way that is badly correlated with the relevant y-axis scores of the elements, that perspective will generate a rugged optimization landscape (§III.1.2 considers three examples of such ruggedness relating to the pursuit of justice). However, in our model of a perspective on justice, the way a perspective Σ arrays social worlds—the structure it discerns among them—is based solely on the worlds’ justice-relevant features; it is the evaluation of these and only these features that gives rise to justice evaluations.
Consequently—in marked contrast to our height example in which the scores and underlying structure are not remotely related—if we assume a simple theory of justice (whether there is only one factor in determining justice, or several factors that can be simply aggregated because they do not manifest interdependencies) we would expect the array of social worlds to be correlated with their justice, and so the justice optimization problem to be relatively smooth. That is, we would expect the variation in the total justice scores of social worlds to be fairly well correlated with variation in their justice-relevant features, approaching figure 2-2. However, when our modeling of the social realizations has significant NK properties (such as in figure 2-3), the correlation of similarity with inherent justice is almost certain to be highly imperfect, and so we are almost certain to confront a rugged optimization problem (but see §III.2.3). Of course, that a perspective judges that the features of a social world (WF) have interdependencies is a consequence of how it models the interaction of those features; thus we again see the central importance of the often-overlooked mapping function (MP). The NK features of institutional justice help explain why it is so difficult to avoid seeing the pursuit of ideal justice as a rugged optimization problem. Nevertheless, much of what I say in the remainder of this book applies even if one rejects the “justice as an NK problem” supposition, so long as the underlying x-axis structure of the social worlds in domain {X} is not highly correlated with their y-axis justice scores.
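The height example can be made concrete with a short simulation. This is a sketch on synthetic data, not a report of any real measurements: the names are random letter strings, the heights are drawn from an assumed normal distribution, and `neighbor_roughness` is a hypothetical measure (the mean absolute difference between adjacent y-axis values) introduced only for illustration.

```python
import random

def neighbor_roughness(values):
    """Mean absolute difference between adjacent values along the x-axis."""
    return sum(abs(b - a) for a, b in zip(values, values[1:])) / (len(values) - 1)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
letters = "abcdefghijklmnopqrstuvwxyz"

# One hundred synthetic people: (made-up first name, made-up height in cm).
people = [
    ("".join(rng.choice(letters) for _ in range(5)), rng.gauss(170, 10))
    for _ in range(100)
]

# x-axis structure uncorrelated with the y-axis score: alphabetical order.
by_name = [height for _name, height in sorted(people)]
# x-axis structure perfectly correlated with the y-axis score: order by height.
by_height = sorted(height for _name, height in people)

print("roughness, alphabetical order:", round(neighbor_roughness(by_name), 2))
print("roughness, height order:      ", round(neighbor_roughness(by_height), 2))
```

The same one hundred heights yield a rugged line under the alphabetical ordering and a smooth one when the ordering tracks the score, which is the sense in which ruggedness belongs to the perspective's underlying structure and not to the y-axis value alone.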
Figure 2-4. A rugged optimization landscape
Having now set aside the suggestion that the core orientation point is about feasibility (§II.1.3), Simmons can be reinterpreted as making a point about justice as a rugged optimization problem. Pace Sen, in rugged landscapes such as figure 2-4 a constant series of pairwise improvements can (i) lead to a local optimum (a low peak on figure 2-4) that is far inferior to the global optimum and (ii) lead us away (on the x-axis) from the globally optimal social world. If we are at world h in figure 2-4 we could move toward the nearby peak (a higher justice score) at world b, but this would take us further away from the ideal social world, u. Whether theories of justice are tasked with solving optimization problems in rugged or smooth landscapes is, then, the point on which a critical issue in ideal theorizing turns—whether the climbing or orientation model is most appropriate (§I.1.3). When landscapes are smooth the Orientation Condition is essentially otiose; Sen’s insistence that improvement does not require knowledge of the ideal is then sound.
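A small sketch may help fix points (i) and (ii). The landscape below is hypothetical and is not the data behind figure 2-4; `climb` is an assumed, deliberately naive rendering of a constant series of pairwise improvements, in which one always moves to the better adjacent world and stops when neither neighbor is an improvement.

```python
def climb(scores, start):
    """Follow pairwise improvements between adjacent worlds until none remains."""
    i = start
    while True:
        candidates = [j for j in (i - 1, i + 1) if 0 <= j < len(scores)]
        best = max(candidates, key=lambda j: scores[j])
        if scores[best] <= scores[i]:
            return i  # no adjacent world is more just: a local optimum
        i = best

# Hypothetical justice scores for twelve worlds arrayed on the x-axis.
scores = [3, 6, 9, 7, 5, 4, 2, 5, 8, 12, 15, 11]

start = 4                          # our current world
end = climb(scores, start)         # where pairwise improvement leaves us
ideal = scores.index(max(scores))  # position of the global optimum

print("local optimum reached at", end, "with score", scores[end])
print("global optimum at", ideal, "with score", scores[ideal])
print("distance to the ideal before and after:", abs(start - ideal), abs(end - ideal))
```

On this assumed landscape every step is an improvement, yet the climber ends on a low peak and is further from the global optimum on the x-axis than when it began.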
2.2 How Rugged? High-Dimensional Landscapes and the Social Realizations Condition
I am supposing, then, that rugged landscapes are created by NK features of the pursuit of just institutions; a theory of justice is seeking to optimize over N dimensions with K interdependencies between the dimensions. Recall that if K = 0, the N dimensions are independent, the theory is faced with a simple aggregation problem: as we increase our success on any dimension we move higher on the landscape. However, as Stuart Kauffman stressed in his groundbreaking analysis, if there are many dimensions and interdependencies are very high, the landscape will be fully random.48 Let us use the term high-dimensional optimization landscape for one in which many dimensions display a large number of interdependencies; at the limit each dimension is affected by all others. In terms of our ideal theory model, in a maximally high-dimensional landscape there is no systematic relation between the justice of social world i and the justice of the worlds that are adjacent to it. Note that in such a landscape there is no point in getting close to the ideal point, u, but not achieving it: its near neighbors may not be at all just.49 Any change of any institution (or rule) in any given world i produces a new social world, the justice of which has no systematic relation to the justice of i. Such landscapes have a very large number of poor local optima.50 The crux of maximally high-dimensional landscapes is that the justice of any one rule or institution is a function of all others, producing what Kauffman called “a complexity catastrophe.”51
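The effect of K on ruggedness can be illustrated with a small simulation in the spirit of Kauffman's NK model. This is a sketch under stated assumptions and not a reproduction of his analysis: each of N binary dimensions contributes a random score that depends on its own state and on the next K dimensions (a circular neighborhood is assumed), a world's overall score is the mean contribution, and a local optimum is a world that no single-dimension change improves. The function names and the choice N = 8 are hypothetical.

```python
import itertools
import random

def make_nk_landscape(n, k, rng):
    """Build a scoring function over n-bit worlds with k interdependencies per dimension."""
    # Dimension i depends on itself and the next k dimensions (wrapping around).
    deps = [[(i + d) % n for d in range(k + 1)] for i in range(n)]
    # One random contribution per dimension for each state of its dependencies.
    tables = [
        {key: rng.random() for key in itertools.product((0, 1), repeat=k + 1)}
        for _ in range(n)
    ]

    def score(bits):
        return sum(tables[i][tuple(bits[j] for j in deps[i])] for i in range(n)) / n

    return score

def count_local_optima(n, score):
    """Count worlds whose score beats that of every one-dimension variant."""
    count = 0
    for bits in itertools.product((0, 1), repeat=n):
        here = score(bits)
        variants = (bits[:i] + (1 - bits[i],) + bits[i + 1:] for i in range(n))
        if all(here > score(v) for v in variants):
            count += 1
    return count

N = 8
optima_by_k = {
    k: count_local_optima(N, make_nk_landscape(N, k, random.Random(1)))
    for k in (0, 2, N - 1)
}
for k, count in optima_by_k.items():
    print(f"K = {k}: {count} local optima among {2 ** N} worlds")
```

With K = 0 the contributions are independent, so there is a single peak that every path of local improvement reaches. With K = N − 1 each world's score is effectively an independent random draw, and the landscape is littered with local optima.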
Because many philosophers are committed to a type of holism, they seem committed to modeling the Social Realizations and Orientation Conditions in a way that results in high-dimensional optimization problems. “A sensible contractualism,” writes T. M. Scanlon, “like most other plausible views, will involve holism about moral justification.”52 According to holist views, the justification of every element of a system of values or beliefs is dependent on many others—such systems are often depicted as “webs,” indicating a very high degree of interdependence among many elements. At the limit, the value of every element depends on the values of all other elements. It is precisely such systems that give rise to complexity catastrophes; a variation in the value of one element jumps the system to a radically different state.
Some models of evolutionary adaptation show how such high-dimensional landscapes can be successfully traversed (a species can avoid getting stuck at one of the numerous poor local optima—low peaks on a rugged landscape).53 However our concern here is a political theory that seeks to judge the justice of various social worlds, and recommends moves based on its evaluations of these worlds. In this context, the idea of a complexity catastrophe is entirely apropos, for the system will be too complex—really chaotic—for the theory to generate helpful judgments and recommendations.54
Again, it should be pointed out that while the high dimensionality of an optimization problem can be the basis of a maximally rugged landscape—and so can help us to understand why maximally rugged optimization landscapes are so difficult to avoid for some perspectives—it is by no means necessary. Recall our example of arraying people’s heights by the alphabetical order of their first names; here the root of the problem is not the high dimensionality of our concept of height, but the fact that the perspective’s underlying structure (alphabetical ordering of first names) is entirely uncorrelated with height values. Whenever the underlying structural array of a perspective that orders the domain {X} is entirely uncorrelated with the justice values (whatever the root explanation) of the members of {X}, a perspective will face a maximally rugged optimization problem. For any element i ∈ {X}, its place in the perspective’s underlying structure tells us nothing about its score (on the y-axis) relative to its x-axis neighbors.
For the purposes of political theorizing, the problem such systems pose can be expressed in terms of:
The Maximal Precision Requirement: A political theory T, employing perspective Σ, can meet the Social Realizations condition in a maximally rugged optimization landscape only if Σ is maximally precise (and accurate) in its judgments of the justice of social worlds.
Let us say that a judgment of a social world i is maximally precise (and accurate) if and only if that judgment correctly and precisely distinguishes the justice of i from proximate social worlds. A straightforward if somewhat rough way of interpreting this requirement is that a judgment that world i is just to level α is maximally precise only if Σ’s judgment of world i does not attribute to it any features that proximate social worlds possess (say i±1), but which i does not possess. But while this is the basic idea, appealing directly to the features i “truly has” begs an important question, for a critical function of a perspective is to determine the relevant classificatory scheme—what the relevant features of each social world are. To say that world i truly has justice-relevant feature f would be to adopt a certain perspective, and this perspective may clash with Σ, which denies that f is a relevant classification applying to world i. As we shall see in the next chapter, different perspectives endorse different classificatory schemes, each of which insists that its own is superior. If we could directly determine which is true, we would have no need to adopt a perspective on the world, but could simply report the truth about it (§I.3.3).
Rather than defining a maximally precise judgment of perspective Σ about i in terms of one that truly identifies the features of i, we can formulate a criterion that is internal to Σ. There is nothing more common in social theory than that our predictions about how a social world will function and its resulting justice end up disappointing us—we who made the prediction. Using Σ, we evaluated social world i as having features {f, g, h} with a resulting justice of α; when we actually sought to bring about that social world, we found either that all these features did not cohere (say, h was inconsistent with f and g, so we ended up with f, h*, g), or else our efforts to bring about i went astray, and we actually ended up with a neighboring social world (with f, h, g*) with justice β. In this case Σ’s evaluation of i (or the move to i it recommended) was not stable before and after the move; on Σ’s own lights, it was wrong about i. Let us, then, say that Σ is maximally precise and accurate in its judgments about a social world if its judgments would be stable after moves to that world (or, we can say, Σ’s predictions would be precisely confirmed by Σ after the move).55
Now because in a maximally rugged landscape the justice value of any social world (as measured on the y-axis) is uncorrelated with the justice of its neighbors (as measured on the x-axis), unless a perspective’s judgments of social worlds are maximally accurate and precise, they do not convey useful (reliable) information. Suppose that a perspective’s judgment of a given social world is precise and accurate (in the way we have defined it) plus or minus one social world on the x-axis. Its error is in a very tiny range; it is accurate to plus or minus one feature (g− or g+ rather than g). This would imply that, given this reliability range, the justice of the world could be of any value in the entire range of justice (i.e., y-axis scores). But this, in turn, implies that the perspective cannot generate a useful ordering of the justice of the social worlds in the domain, and so a theory employing this perspective would not meet the critical Social Realizations Condition of an ideal theory (§I.4). More generally, even if we relax the assumption of maximal ruggedness, it remains the case that in high-dimensional landscapes the (x-axis) areas in which proximate social worlds have correlated justice will be very small, and so useful judgments of justice will require great, if not maximal, precision. For reasons to be explored presently (§II.3), I take it that maximally (or approaching maximally) precise and accurate judgments of near, much less far-off, alternative social worlds are a will-o’-wisp; if so, a plausible ideal theory cannot suppose that the quest for justice is a high-dimensional optimization problem.
This is not an inconsequential result.56 Philosophers often combine commitments to justificatory holism with the aim of working to an ideal through a series of improvements. We now see that these two commitments do not cohere (at least not without a very complicated story). Consider, for example, the recent interest in so-called property-owning democracy as a core of a more just social world.57 Suppose that a perspective’s modeling of how such an economy might work is almost spot-on, but misses one significant institutional fact or relevant psychological consideration; if the optimization landscape is maximally rugged (holistic), then the perspective’s evaluation, however sophisticated it may seem, tells us nothing about the justice of property-owning democracy.
2.3 How Rugged? Low-Dimensional Landscapes and the Orientation Condition
As K (the interdependencies between the dimensions to be evaluated) decreases (i) the number of local optima (peaks) decreases, (ii) the slopes lessen, so that the basins of attraction of the optima are wider (the same optimum is reached from a wider array of starting points), and (iii) the peaks are higher.58 Additionally, in low-dimensional landscapes (iv) the highest optima tend to be near each other59 and (v) the highest optima tend to have the largest basins of attraction.60 As K decreases the landscape becomes correlated within itself. More generally, in smoother optimization landscapes the underlying structure (x-axis) is correlated with the (y-axis) values of any element. In a smooth optimization landscape slight variants in current institutional structures (neighbors along the similarity array) produce new social worlds the justice of which is highly correlated with the current social order. As can be inferred from what has been said about smooth landscapes (§II.2.1), as the landscape approaches an entirely smooth optimization problem, Sen’s climbing model is adequate. That is, we do not really need an ideal to orient our improvements, for our underlying similarity ordering (SO) of the alternatives is an excellent indication of their justice: we are, essentially, always simply climbing gradients. Solving very smooth optimization problems does not require meeting the Orientation Condition, in which case Sen is right: we can do well without knowledge of the ideal.
2.4 Ideal Theory: Rugged, but Not Too Rugged, Landscapes
Formalizing the pursuit of the ideally just society as a complex optimization problem leads to an insight: ideal theory has appeal only if this pursuit poses a problem of a certain level of complexity. This point is, I think, barely recognized in the current literature, which supposes that whatever attractions “ideal theorizing” might have are independent of the complexity of the pursuit of justice.61 Recall Rawls’s key claim: “by showing how the social world may realize the features of a realistic Utopia, political philosophy provides a long-term goal of political endeavor, and in working toward it gives meaning to what we can do today.”62 If the problem of achieving justice is not sufficiently complex, Sen is right: all we need is to make the best pairwise choices we can, and we do not need to identify our long-term goal. If the problem is too complex, the ideal will not help, because any move “working toward” it is essentially a leap into the dark, which is not apt to provide much meaning. In these chaotic, high-dimensional landscapes a fear of movement is as reasonable as a relentless quest for the ideal.