Neither the Boltzmann formula nor the Gibbs formula for entropy is the “right” one. They both are things you can choose to define, and manipulate, and use to help understand the world; each comes with its advantages and disadvantages. The Gibbs formula is often used in applications, for one very down-to-Earth reason: It’s easy to calculate with. Because there is no coarse-graining, there is no discontinuous jump in entropy when a system goes from one macrostate to another; that’s a considerable benefit when solving equations.
But the Gibbs approach also has two very noticeable disadvantages. One is epistemic: It associates the idea of “entropy” with our knowledge of the system, rather than with the system itself. This has caused all kinds of mischief among the community of people who try to think carefully about what entropy really means. Arguments go back and forth, but the approach I have taken in this book, which treats entropy as a feature of the state rather than a feature of our knowledge, seems to avoid most of the troublesome issues.
The other disadvantage is more striking: If you know the laws of physics and use them to study how the Gibbs entropy evolves with time, you find that it never changes. A bit of reflection convinces us that this must be true. The Gibbs entropy characterizes how well we know what the state is. But under the influence of reversible laws, that’s a quantity that doesn’t change—information isn’t created or destroyed. For the entropy to go up, we would have to know less about the state in the future than we know about it now; but we can always run the evolution backward to see where it came from, so that can’t happen. To derive something like the Second Law from the Gibbs approach, you have to “forget” something about the evolution. When you get right down to it, that’s philosophically equivalent to the coarse-graining we had to do in the Boltzmann approach; we’ve just moved the “forgetting” step to the equations of motion, rather than the space of states.
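To make that claim concrete, here is a minimal sketch in Python; it is a toy model, not anything from the original analysis. It assumes a system with just 64 microstates and stands in for "reversible laws" with a random permutation of those states, so that distinct states always evolve to distinct states. The names gibbs_entropy, evolve, and coarse_grain are hypothetical helpers, and Boltzmann's constant is set to 1.

```python
import math
import random

def gibbs_entropy(p):
    """Gibbs entropy -sum(p ln p) over microstates, with k = 1."""
    return -sum(x * math.log(x) for x in p if x > 0)

def evolve(p, perm):
    """Toy reversible dynamics: microstate i moves to perm[i], carrying its probability."""
    q = [0.0] * len(p)
    for i, x in enumerate(p):
        q[perm[i]] = x
    return q

def coarse_grain(p, cell):
    """'Forget' fine detail: spread probability evenly within each block of `cell` states."""
    q = []
    for start in range(0, len(p), cell):
        block = p[start:start + cell]
        q.extend([sum(block) / len(block)] * len(block))
    return q

rng = random.Random(0)          # fixed seed so the run is repeatable
n_states, cell = 64, 8          # assumed sizes, chosen only for illustration
perm = list(range(n_states))
rng.shuffle(perm)               # a random one-to-one map plays the role of the laws of physics

# Initial knowledge: the system is in one of the first four microstates, equally likely.
p0 = [0.25] * 4 + [0.0] * (n_states - 4)
p1 = evolve(p0, perm)

s_before = gibbs_entropy(p0)
s_after = gibbs_entropy(p1)
s_coarse = gibbs_entropy(coarse_grain(p1, cell))

print("before evolution:     ", s_before)
print("after evolution:      ", s_after)
print("after coarse-graining:", s_coarse)
```

The permutation only shuffles which microstates carry the probability, so the Gibbs entropy comes out the same before and after. It goes up only once we deliberately throw away information in the coarse_grain step, which is the "forgetting" described above.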
Nevertheless, there’s no question that the Gibbs formula for entropy is extremely useful in certain applications, and people are going to continue to take advantage of it. And that’s not the end of it; there are several other ways of thinking about entropy, and new ones are frequently being proposed in the literature. There’s nothing wrong with that; after all, Boltzmann and Gibbs were proposing definitions to supersede Clausius’s perfectly good definition of entropy, which is still used today under the rubric of “thermodynamic” entropy. After quantum mechanics came on the scene, John von Neumann proposed a formula for entropy that is specifically adapted to the quantum context. As we’ll discuss in the next chapter, Claude Shannon suggested a definition of entropy that was very similar in spirit to Gibbs’s, but in the framework of information theory rather than physics. The point is not to find the one true definition of entropy; it’s to come up with concepts that serve useful functions in the appropriate contexts. Just don’t let anyone bamboozle you by pretending that one definition or the other is the uniquely correct meaning of entropy.
Just as there are many definitions of entropy, there are many different “arrows of time,” another source of potential bamboozlement. We’ve been dealing with the thermodynamic arrow of time, the one defined by entropy and the Second Law. There is also the cosmological arrow of time (the universe is expanding), the psychological arrow of time (we remember the past and not the future), the radiation arrow of time (electromagnetic waves flow away from moving charges, not toward them), and so on. These different arrows fall into different categories. Some, like the cosmological arrow, reflect facts about the evolution of the universe but are nevertheless completely reversible. It might end up being true that the ultimate explanation for the thermodynamic arrow also explains the cosmological arrow (in fact it seems quite plausible), but the expansion of the universe doesn’t present any puzzle with respect to the microscopic laws of physics in the same way the increase of entropy does. Meanwhile, the arrows that reflect true irreversibilities—the psychological arrow, radiation arrow, and even the arrow defined by quantum mechanics we will investigate later—all seem to be reflections of the same underlying state of affairs, characterized by the evolution of entropy. Working out the details of how they are all related is undeniably important and interesting, but I will continue to speak of “the” arrow of time as the one defined by the growth of entropy.
PROVING THE SECOND LAW
Once Boltzmann had understood entropy as a measure of how many microstates fit into a given macrostate, his next goal was to derive the Second Law of Thermodynamics from that perspective. I’ve already given the basic reasons why the Second Law works—there are more ways to be high-entropy than low-entropy, and distinct starting states evolve into distinct final states, so most of the time (with truly overwhelming probability) we would expect entropy to go up. But Boltzmann was a good scientist and wanted to do better than that; he wanted to prove that the Second Law followed from his formulation.
It’s hard to put ourselves in the shoes of a late-nineteenth-century thermodynamicist. Those folks felt that the inability of entropy to decrease in a closed system was not just a good idea; it was a Law. The idea that entropy would “probably” increase wasn’t any more palatable than a suggestion that energy would “probably” be conserved would have been. In reality, the numbers are just so overwhelming that the probabilistic reasoning of statistical mechanics might as well be absolute, for all intents and purposes. But Boltzmann wanted to prove something more definite than that.
In 1872, Boltzmann (twenty-eight years old at the time) published a paper in which he purported to use kinetic theory to prove that entropy would always increase or remain constant—a result called the “H-Theorem,” which has been the subject of countless debates ever since. Even today, some people think that the H-Theorem explains why the Second Law holds in the real world, while others think of it as an amusing relic of intellectual history. The truth is that it’s an interesting result for statistical mechanics but falls short of “proving” the Second Law.
Boltzmann reasoned as follows. In a macroscopic object such as a room full of gas or a cup of coffee with milk, there are a tremendous number of molecules, more than 10²⁴. He considered the special case where the gas is relatively dilute, so that two particles might bump into each other, but we can ignore those rare events when three or more particles bump into one another at the same time. (That really is an unobjectionable assumption.) We need some way of characterizing the macrostate of all these particles. So instead of keeping track of the position and momentum of every molecule (which would be the whole microstate), let’s keep track of the average number of particles that have any particular position and momentum. In a box of gas in equilibrium at a certain temperature, for example, the average number of particles is equal at every position in the box, and there will be a certain distribution of momenta, so that the average energy per particle gives the right temperature. Given just that information, you can calculate the entropy of the gas. And then you could prove (if you were Boltzmann) that the entropy of a gas that is not in equilibrium will go up as time goes by, until it reaches its maximum value, and then it will just stay there. The Second Law has, apparently, been derived.140
But there is clearly something fishy going on. We started with microscopic laws of physics that are perfectly time-reversal invariant—they work equally well running forward or backward in time. And then Boltzmann claimed to derive a result from them that is manifestly not time-reversal invariant—one that demonstrates a clear arrow of time, by saying that entropy increases toward the future. How can you possibly get irreversible conclusions from reversible assumptions?
This objection was put forcefully in 1876 by Josef Loschmidt, after similar concerns had been expressed by William Thomson (Lord Kelvin) and James Clerk Maxwell. Loschmidt was close friends with Boltzmann and had served as a mentor to the younger physicist in Vienna in the 1860s. And he was no skeptic of atomic theory; in fact Loschmidt was the first scientist to accurately estimate the physical sizes of molecules. But he couldn’t understand how Boltzmann could have derived time asymmetry without sneaking it into his assumptions.
The argument behind what is now known as “Loschmidt’s reversibility objection” is simple. Consider some specific microstate corresponding to a low-entropy macrostate. It will, with overwhelming probability, evolve toward higher entropy. But time-reversal invariance guarantees that for every such evolution, there is another allowed evolution—the time reversal of the original—that starts in the high-entropy state and evolves toward the low-entropy state. In the space of all things that can happen over time, there are precisely as many examples of entropy starting high and decreasing as there are examples of entropy starting low and increasing. In Figure 45, showing the space of states divided up into macrostates, we illustrated a trajectory emerging from a very low-entropy macrostate; but trajectories don’t just pop into existence. That history had to come from somewhere, and that somewhere had to have higher entropy—an explicit example of a path along which entropy decreased. It is manifestly impossible to prove that entropy always increases, if you believe in time-reversal-invariant dynamics (as they all did).141
But Boltzmann had proven something—there were no mathematical or logical errors in his arguments, as far as anyone could tell. It would appear that he must have smuggled in some assumption of time asymmetry, even if it weren’t explicitly stated.
And indeed he had. A crucial step in Boltzmann’s reasoning was the assumption of molecular chaos—in German, the Stosszahlansatz, translated literally as “collision number hypothesis.” It amounts to assuming that there are no sneaky conspiracies in the motions of individual molecules in the gas. But a sneaky conspiracy is precisely what is required for the entropy to decrease! So Boltzmann had effectively proven that entropy could increase only by dismissing the alternative possibilities from the start. In particular, he had assumed that the momenta of every pair of particles were uncorrelated before they collided. But that “before” is an explicitly time-asymmetric step; if the particles really were uncorrelated before a collision, they would generally be correlated afterward. That’s how an irreversible assumption was sneaked into the proof.
If we start a system in a low-entropy state and allow it to evolve to a high-entropy state (let an ice cube melt, for example), there will certainly be a large number of correlations between the molecules in the system once all is said and done. Namely, there will be correlations that guarantee that if we reversed all the momenta, the system would evolve back to its low-entropy beginning state. Boltzmann’s analysis didn’t account for this possibility. He proved that entropy would never decrease, if we neglected those circumstances under which entropy would decrease.
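The momentum-reversal idea can be tried out in a small Python sketch. This is a deliberately simplified stand-in, not Boltzmann's dilute gas: it assumes non-interacting particles moving on a ring of 1,000 integer sites, so that the dynamics is exactly reversible with no rounding error. The ring size, particle number, velocity range, and helper names are all assumptions made for illustration, and the entropy is the coarse-grained count of how many particles sit in the left half of the ring.

```python
import math
import random

def boltzmann_entropy(positions, ring):
    """ln of the number of ways to choose which particles sit in the left half (k = 1)."""
    n = len(positions)
    n_left = sum(1 for x in positions if x < ring // 2)
    return math.log(math.comb(n, n_left))

def step(positions, velocities, ring):
    """One tick of free motion on a ring; integer arithmetic keeps it exactly reversible."""
    return [(x + v) % ring for x, v in zip(positions, velocities)]

rng = random.Random(1)                   # fixed seed so the run is repeatable
ring, n, steps = 1000, 200, 300          # assumed toy parameters
speeds = [v for v in range(-5, 6) if v != 0]

# Low-entropy start: every particle bunched into the first tenth of the ring.
positions = [rng.randrange(0, 100) for _ in range(n)]
velocities = [rng.choice(speeds) for _ in range(n)]
start = list(positions)

history = [boltzmann_entropy(positions, ring)]
for _ in range(steps):                   # forward evolution: the bunch spreads out
    positions = step(positions, velocities, ring)
    history.append(boltzmann_entropy(positions, ring))

velocities = [-v for v in velocities]    # Loschmidt's move: reverse every momentum
for _ in range(steps):                   # the same rule now undoes the spreading
    positions = step(positions, velocities, ring)
    history.append(boltzmann_entropy(positions, ring))

print("entropy at start:     ", history[0])
print("entropy at reversal:  ", round(history[steps], 1))
print("entropy at end:       ", history[-1])
print("back where we started:", positions == start)
```

The spread-out configuration at the moment of reversal looks like any other high-entropy state, but its positions and velocities are correlated in exactly the way needed to reassemble the initial bunch. An assumption of uncorrelated velocities would rule out the second half of this run by fiat.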
WHEN THE LAWS OF PHYSICS AREN’T ENOUGH
Ultimately, it’s perfectly clear what the resolution to these debates must be, at least within our observable universe. Loschmidt is right in that the set of all possible evolutions has entropy decreasing as often as it is increasing. But Boltzmann is also right, that statistical mechanics explains why low-entropy conditions will evolve into high-entropy conditions with overwhelming probability. The conclusion should be obvious: In addition to the dynamics controlled by the laws of physics, we need to assume that the universe began in a low-entropy state. That is a boundary condition, an extra assumption, not part of the laws of physics themselves. (At least, not until we start talking about what happened before the Big Bang, which is not a discussion one could have had in the 1870s.) Unfortunately, that conclusion didn’t seem sufficient to people at the time, and subsequent years have seen confusions about the status of the H-Theorem proliferate beyond reason.
In 1876, Boltzmann wrote a response to Loschmidt’s reversibility objection, which did not really clarify the situation. Boltzmann certainly understood that Loschmidt had a point, and admitted that there must be something undeniably probabilistic about the Second Law; it couldn’t be absolute, if kinetic theory were true. At the beginning of his paper, he makes this explicit:
Since the entropy would decrease as the system goes through this sequence in reverse, we see that the fact that entropy actually increases in all physical processes in our own world cannot be deduced solely from the nature of the forces acting between the particles, but must be a consequence of the initial conditions.
We can’t ask for a more unambiguous statement than that: “the fact that entropy increases in our own world . . . must be a consequence of the initial conditions.” But then, still clinging to the idea of proving something without relying on initial conditions, he immediately says this:
Nevertheless, we do not have to assume a special type of initial condition in order to give a mechanical proof of the Second Law, if we are willing to accept a statistical viewpoint.
“Accepting a statistical viewpoint” presumably means that he admits we can argue only that increasing entropy is overwhelmingly likely, not that it always happens. But what can he mean by now saying that we don’t have to assume a special type of initial condition? The next sentences confirm our fears:
While any individual non-uniform state (corresponding to low entropy) has the same probability as any individual uniform state (corresponding to high entropy), there are many more uniform states than non-uniform states. Consequently, if the initial state is chosen at random, the system is almost certain to evolve into a uniform state, and entropy is almost certain to increase.
That first sentence is right, but the second is surely wrong. If an initial state is chosen at random, it is not “almost certain to evolve into a uniform state”; rather, it is almost certain to be in a uniform (high-entropy) state. Among the small number of low-entropy states, almost all of them evolve toward higher-entropy states. In contrast, only a very tiny fraction of high-entropy states will evolve toward low-entropy states; however, there are a fantastically larger number of high-entropy states to begin with. The total number of low-entropy states that evolve to high entropy is equal, as Loschmidt argued, to the total number of high-entropy states that evolve to low entropy.
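The bookkeeping in that paragraph can be checked with a toy count in Python. This is a sketch under loud assumptions, not a model of a real gas: there are 10,000 microstates, only 100 of which belong to the "low-entropy" macrostate, and one step of the dynamics is an arbitrary one-to-one map (a random permutation) rather than a true time-reversal-invariant law. With just two macrostates, the one-to-one property alone is enough to force the two totals to match.

```python
import random

rng = random.Random(2)             # fixed seed so the run is repeatable
n_states, n_low = 10_000, 100      # assumed sizes: low-entropy states are rare

# One step of evolution: a one-to-one map on microstates (distinct states stay distinct).
perm = list(range(n_states))
rng.shuffle(perm)

def is_low(state):
    """Hypothetical labeling: the first n_low microstates form the low-entropy macrostate."""
    return state < n_low

low_to_high = sum(1 for s in range(n_low) if not is_low(perm[s]))
high_to_low = sum(1 for s in range(n_low, n_states) if is_low(perm[s]))

frac_low_rising = low_to_high / n_low
frac_high_falling = high_to_low / (n_states - n_low)

print("low -> high:", low_to_high, "of", n_low)
print("high -> low:", high_to_low, "of", n_states - n_low)
print("fraction of low-entropy states that rise: ", frac_low_rising)
print("fraction of high-entropy states that fall:", frac_high_falling)
```

The two counts agree exactly, while the two fractions differ enormously: nearly every low-entropy state rises, and only a sliver of the high-entropy states fall. Both halves of the debate show up in the same tally.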
Reading through Boltzmann’s papers, one gets a strong impression that he was several steps ahead of everyone else—he saw the ins and outs of all the arguments better than any of his interlocutors. But after zooming through the ins and outs, he didn’t always stop at the right place; moreover, he was notoriously inconsistent about the working assumptions he would adopt from paper to paper. We should cut him some slack, however, since here we are 140 years later and we still don’t agree on the best way of talking about entropy and the Second Law.
THE PAST HYPOTHESIS
Within our observable universe, the consistent increase of entropy and the corresponding arrow of time cannot be derived from the underlying reversible laws of physics alone. They require a boundary condition at the beginning of time. To understand why the Second Law works in our real world, it is not sufficient to simply apply statistical reasoning to the underlying laws of physics; we must also assume that the observable universe began in a state of very low entropy. David Albert has helpfully given this assumption a simple name: the Past Hypothesis.142
The Past Hypothesis is the one profound exception to the Principle of Indifference that we alluded to above. The Principle of Indifference would have us imagine that, once we know a system is in some certain macrostate, we should consider every possible microstate within that macrostate to have an equal probability. This assumption turns out to do a great job of predicting the future on the basis of statistical mechanics. But it would do a terrible job of reconstructing the past, if we really took it seriously.
Boltzmann has told us a compelling story about why entropy increases: There are more ways to be high entropy than low entropy, so most microstates in a low-entropy macrostate will evolve toward higher-entropy macrostates. But that argument makes no reference to the direction of time. Following that logic, most microstates within some macrostate will increase in entropy toward the future but will also have evolved from a higher-entropy condition in the past.
Consider all the microstates in some medium-entropy macrostate. The overwhelming majority of those states have come from prior states of high entropy. They must have, because there aren’t that many low-entropy states from which they could have come. So with high probability, a typical medium-entropy microstate appears as a “statistical fluctuation” from a higher-entropy past. This argument is exactly the same argument that entropy should increase into the future, just with the time direction reversed.
As an example, consider the divided box of gas with 2,000 particles. Starting from a low-entropy condition (80 percent of the particles on one side), the entropy tends to go up, as plotted in Figure 43. But in Figure 47 we show how the entropy evolves to the past as well as to the future. Since the underlying dynamical rule (“each particle has a 0.5 percent chance of changing sides per second”) doesn’t distinguish between directions of time, it’s no surprise that the entropy is higher in the past of that special moment just as it is in the future.
You may object, thinking that it’s very unlikely that a system would start out in equilibrium and then dive down to a low-entropy state. That’s certainly true; it would be much more likely to remain at or near equilibrium. But given that we insist on having a low-entropy state at all, it is overwhelmingly likely that such a state represents a minimum on the entropy curve, with higher entropy both to the past and to the future.
Figure 47: The entropy of a divided box of gas. The “boundary” condition is set at time = 500, where 80 percent of the particles are on one side and 20 percent on the other (a low-entropy macrostate). Entropy increases both to the future and to the past of that moment.
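A curve of the kind described in the Figure 47 caption can be generated with a short Python sketch. It uses the numbers quoted in the text (2,000 particles, 80 percent on one side at time = 500, a 0.5 percent chance per second of changing sides), but the way the past is produced is an assumption: because the hopping rule is the same in both directions of time, the sketch simply runs the rule a second time from the special moment and lays that run out backward. The function names and the random seed are hypothetical choices, and the entropy is the logarithm of the number of ways to pick which particles are on the left, with Boltzmann's constant set to 1.

```python
import math
import random

def entropy(n_left, n_total):
    """Boltzmann entropy (k = 1): ln of the number of ways to pick the left-side particles."""
    return math.log(math.comb(n_total, n_left))

def run(n_left, n_total, steps, p_switch, rng):
    """Each second, every particle independently changes sides with probability p_switch."""
    trace = [n_left]
    for _ in range(steps):
        to_right = sum(rng.random() < p_switch for _ in range(n_left))
        to_left = sum(rng.random() < p_switch for _ in range(n_total - n_left))
        n_left += to_left - to_right
        trace.append(n_left)
    return trace

rng = random.Random(47)            # fixed seed so the run is repeatable
n_total, n_start = 2000, 1600      # 80 percent of the particles on the left at time 500
half_span, p_switch = 500, 0.005   # 500 seconds each way, 0.5 percent chance per second

future = run(n_start, n_total, half_span, p_switch, rng)
# Assumption: the rule is time-symmetric, so a second run, read backward, serves as the past.
past = run(n_start, n_total, half_span, p_switch, rng)[::-1]

counts = past + future[1:]         # times 0..1000, with the special moment at index 500
entropy_curve = [entropy(n, n_total) for n in counts]

for t in (0, 250, 500, 750, 1000):
    print(t, counts[t], round(entropy_curve[t], 1))
```

Nothing in the run function knows which direction is "future." The dip at time 500 is there only because the condition was imposed at that moment, which is why the curve climbs on both sides of it.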
At least, it would be overwhelmingly likely, if all we had to go on were the Principle of Indifference. The problem is, no one in the world thinks that the entropy of the real universe behaves as shown in Figure 47. Everyone agrees that the entropy will be higher tomorrow than it is today, but nobody thinks it was higher yesterday than it is today. There are good reasons for that agreement, as we’ll discuss in the next chapter—if we currently live at a minimum of the entropy curve, all of our memories of the past are completely unreliable, and we have no way of making any kind of sense of the universe.
From Eternity to Here: The Quest for the Ultimate Theory of Time Page 22