Boltzmann's first attempt to derive the second law of thermodynamics assumed that gas particles followed strict dynamical laws, that is, Newton's classical mechanics. In 1872 Boltzmann defined a mathematical quantity, H, and proved (his H-Theorem) that it changes in one direction only for a gas approaching equilibrium: H decreases just as Rudolf Clausius' entropy increases. Clausius had enunciated the two laws of thermodynamics. First, the energy of the world is constant. Second, the entropy of the world increases to a maximum. James Clerk Maxwell was critical of Boltzmann's result. He argued (correctly as it turns out) that the kinetic theory of gases must be purely statistical, not deterministic and dynamical.
Boltzmann had in 1866 derived Maxwell's velocity distribution for the molecules of a gas in equilibrium dynamically, putting it on firmer ground than Maxwell had.
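For reference, the two objects discussed so far can be written compactly in modern notation. This is a sketch under stated assumptions: the symbols (n for number density, m for molecular mass, T for temperature, k_B for Boltzmann's constant, f for the velocity distribution of a spatially uniform gas) follow today's textbook conventions and are not taken from Boltzmann's or Maxwell's original papers.

```latex
% Maxwell's equilibrium velocity distribution (modern textbook form)
f_0(\mathbf{v}) = n \left( \frac{m}{2\pi k_B T} \right)^{3/2}
                  \exp\!\left( -\frac{m\,|\mathbf{v}|^2}{2 k_B T} \right)

% Boltzmann's H for a spatially uniform gas, and the statement of the H-theorem
H(t) = \int f(\mathbf{v}, t)\, \ln f(\mathbf{v}, t)\; d^3v ,
\qquad
\frac{dH}{dt} \le 0
```

H reaches its minimum when f equals the Maxwell distribution f_0, so it is the negative of H (multiplied by k_B, per unit volume and up to an additive constant) that plays the role of the entropy.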
Boltzmann's mentor and colleague Josef Loschmidt criticized Boltzmann's 1872 demonstration of entropy increase, on the grounds that Newton's dynamical laws are reversible. If the velocities of all the individual particles could be reversed exactly (or if time could be reversed), Boltzmann's H-Theorem should show that the entropy would decrease, violating the second law. Some, including Boltzmann, suggested that time might be simply the direction in which entropy increases. Arthur Stanley Eddington later called this the Arrow of Time. The basic problem is this: how can macroscopic irreversibility result from microscopic processes that are fundamentally reversible? We shall see that the answer is found in the quantum nature of the atoms and molecules, especially when they interact with radiation.
Five years later, in 1877, responding to Loschmidt's criticism, Boltzmann reformulated his H-theorem on purely statistical and probabilistic grounds. Maxwell, who died in 1879, did not remark on this obvious improvement. He found Boltzmann's papers much too long and too dense to read. (Boltzmann found Maxwell's papers too brief to contain a full explanation.) It is not clear that Boltzmann would have agreed with Maxwell about the implicit loss of determinism in physics. Boltzmann maintained (as his student Franz Exner, and Exner's student Erwin Schrödinger would later briefly insist) that observational evidence can never justify our assumptions of strict determinism. Boltzmann was under severe attack from colleagues for espousing the reality of atoms. He may have been wary of emphasizing that atomic motions are chaotic and random. Real ontological chance was anathema to deterministic nineteenth-century thinkers and even considered atheistic by many, since it implies denial of the omniscience of God.
Boltzmann was a great believer in theories, but he knew that they could "go beyond experience," a phrase he used more than once and the key phrase in Franz Exner's denial of strict causal determinism decades before quantum mechanics. As Albert Einstein would later explain, theories are "free inventions of the human mind." Theories are guesses, new ideas, fictions, and pure information that goes beyond Ernst Mach's positivist belief that science includes only "economic summaries" of the results of experiments. The confirmation of theories always rests on the statistics of experiments. So theories are always probabilistic predictions about what will happen in the experiments. And experiments can only provide statistical evidence, although sometimes the evidence is so good that we can regard it as "adequately" or statistically determined.
For Scholars
Boltzmann and Statistical Physics
From Lectures on Gas Theory, Part I, Introduction, 1895 (1964), pp.27-30 (tr. Stephen G. Brush)
Whence comes the ancient view, that the body does not fill space continuously in the mathematical sense, but rather it consists of discrete molecules, unobservable because of their small size. For this view there are philosophical reasons. An actual continuum must consist of an infinite number of parts; but an infinite number is undefinable. Furthermore, in assuming a continuum one must take the partial differential equations for the properties themselves as initially given. However, it is desirable to distinguish the partial differential equations, which can be subjected to empirical tests, from their mechanical foundations (as Hertz emphasized in particular for the theory of electricity). Thus the mechanical foundations of the partial differential equations, when based on the coming and going of smaller particles, with restricted average values, gain greatly in plausibility; and up to now no other mechanical explanation of natural phenomena except atomism has been successful. A real discontinuity of bodies is moreover established by numerous, and moreover quantitatively agreeing, facts. Atomism is especially indispensable for the clarification of the facts of chemistry and crystallography. The mechanical analogy between the facts of any science and the symmetry relations of discrete particles pertains to those most essential features which will outlast all our changing ideas about them, even though the latter may themselves be regarded as established facts. Thus already today the hypothesis that the stars are huge bodies millions of miles away is similarly viewed only as a mechanical analogy for the representation of the action of the sun and the faint visual perceptions arising from the other heavenly bodies, which could also be criticized on the grounds that it replaces the world of our sense perceptions by a world of imaginary objects, and that anyone could just as well replace this imaginary world by another one without changing the observable facts. 
I hope to prove in the following that the mechanical analogy between the facts on which the second law of thermodynamics is based, and the statistical laws of motion of gas molecules, is also more than a mere superficial resemblance. The question of the utility of atomistic representations is of course completely unaffected by the fact, emphasized by Kirchhoff, that our theories have the same relation to nature as signs to significates, for example as letters to sounds, or notes to tones. It is likewise unaffected by the question of whether it is not more useful to call theories simply descriptions, in order to remind ourselves of their relation to nature. The question is really whether bare differential equations or atomistic ideas will eventually be established as complete descriptions of phenomena. Once one concedes that the appearance of a continuum is more clearly understood by assuming the presence of a large number of adjacent discrete particles, assumed to obey the laws of mechanics, then he is led to the further assumption that heat is a permanent motion of molecules. Then these must be held in their relative positions by forces, whose origin one can imagine if he wishes. But all forces that act on the visible body but not equally on all the molecules must produce motion of the molecules relative to each other, and because of the indestructibility of kinetic energy these motions cannot stop but must continue indefinitely. In fact, experience teaches that as soon as the force acts equally on all parts of a body — as for example in so-called free fall — all the kinetic energy becomes visible. In all other cases, we have a loss of visible kinetic energy, and hence creation of heat. The view offers itself that there is a resulting motion of molecules among themselves, which we cannot see because we do not see individual molecules, but which however is transmitted to our nerves by contact, and thus creates the sensation of heat. 
It always moves from bodies whose molecules move rapidly to those whose molecules move more slowly, and because of the indestructibility of kinetic energy it behaves like a substance, as long as it is not transformed into visible kinetic energy or work. We do not know the nature of the force that holds the molecules of a solid body in their relative positions, whether it is action at a distance or is transmitted through a medium, and we do not know how it is affected by thermal motion. Since it resists compression as much as it resists dilatation, we can obviously get a rather rough picture by assuming that in a solid body each molecule has a rest position. If it approaches a neighboring molecule it is repelled by it, but if it moves farther away there is an attraction. Consequently, thermal motion first sets a molecule into pendulum-like oscillations in straight or elliptical paths around its rest position A (in the symbolic Fig. 1, the centers of gravity of the molecules are indicated). If it moves to A', the neighboring molecules B and C repel it, while D and E attract it and hence bring it back to its original rest position. If each molecule vibrates around a fixed rest position, the body will have a fixed form; it is in the solid state of aggregation. The only consequence of the thermal motion is that the rest positions of the molecules will be somewhat pushed apart, and the body will expand somewhat. However, when the thermal motion becomes more rapid, one gets to the point where a molecule can squeeze between its two neighbors and move from A to A" (Fig. 1). It will no longer then be pulled back to its old rest position, but it can instead remain where it is. When this happens to many molecules, they will crawl among each other like earthworms, and the body is molten. Although one may find this description rather crude and childish, it may be modified later and the apparent repulsive force may turn out to be a direct consequence of the motion.
In any case, one will allow that when the motions of the molecules increase beyond a definite limit, individual molecules on the surface of the body can be torn off and must fly out freely into space; the body evaporates. If it is in an enclosed vessel, then this will be filled with freely moving molecules, and these can occasionally penetrate into the body again; as soon as the number of recondensing molecules is, on the average, equal to the number of evaporating ones, one says that the vessel is saturated with the vapor of the body in question. A sufficiently large enclosed space, in which only such freely moving molecules are found, provides a picture of a gas. If no external forces act on the molecules, these move most of the time like bullets shot from guns in straight lines with constant velocity. Only when a molecule passes very near to another one, or to the wall of the vessel, does it deviate from its rectilinear path. The pressure of the gas is interpreted as the action of these molecules against the wall of the container.
From Lectures on Gas Theory, Part I, Introduction, 1895 (1964), pp.36-43 (tr. Stephen G. Brush)
CHAPTER I The molecules are elastic spheres. External forces and visible mass motion are absent. §3. Maxwell's proof of the velocity distribution law; frequency of collisions. We shall suppose for a moment that in the container there is a single gas composed of completely identical molecules. The molecules will also from now on—unless we specify otherwise—be assumed to behave like completely elastic spheres when they collide with each other. Even if all the molecules initially had the same velocity, there would soon occur collisions in which the velocity of one colliding molecule is nearly in the direction of the line of centers, but that of the other is nearly perpendicular to it. The first molecule would thereby end up with nearly zero velocity, while the velocity of the second would become √2 times as large. In the course of further collisions it would soon happen, if the number of molecules were large enough, that all possible velocities would occur, from zero up to a velocity much larger than the original common velocity of all the molecules; it is then a question of calculating the law of distribution of velocities among the molecules in the final state thus reached, or, as one says more briefly, to find the velocity distribution law. In order to find it, we shall consider a more general case. We assume that we have two kinds of molecules in the container. Each molecule of the first kind has mass m, and each of the second has mass m1. The velocity distribution which prevails at any arbitrary time t will be represented by drawing as many straight lines (starting from the origin of coordinates) as there are m-molecules in unit volume. Each line will be the same in length and direction as the velocity of the corresponding molecule. Its endpoint will be called the velocity point of the corresponding molecule. Now at time t let
From Lectures on Gas Theory, Part II, Chapter VII, 1898 (1964), pp.441-449 (tr. Stephen G. Brush)
§87. Characterization of our assumption about the initial state. When a gas is enclosed in a rigid container, and initially one part of it has a visible motion with respect to the rest, then it soon comes to rest as a consequence of viscosity. When two kinds of gas are initially unmixed, but in contact with each other, then they mix, even if the lighter one was originally on top. In general, when a gas or a system of several kinds of gas has initially some improbable state, then it passes to the most probable state under the given external conditions, and remains there during all observable later times. In order to prove that this is a necessary consequence of the kinetic theory of gases, we used the quantity H defined and discussed in this chapter. We proved that it continually decreases as a result of the motion of the gas molecules among each other. The one-sidedness of this process is clearly not based on the equations of motion of the molecules. For these do not change when the time changes its sign. This one-sidedness rather lies uniquely and solely in the initial conditions. This is not to be understood in the sense that for each experiment one must specially assume just certain initial conditions and not the opposite ones which are likewise possible; rather it is sufficient to have a uniform basic assumption about the initial properties of the mechanical picture of the world, from which it then follows with logical necessity that, when bodies are always interacting, they must always be found in the correct initial conditions. In particular, our theory does not require that each time when bodies are interacting, the initial state of the system they form must be distinguished by a special property (ordered or improbable) which relatively few states of the same mechanical system would have under the external mechanical conditions in question.
Hereby the fact is clarified that this system takes in the course of time states which do not have these properties, and which one calls disordered. Since by far most of the states of the system are disordered, one calls the latter the probable states. The ordered initial states are not related to the disordered ones in the way that a definite state is to the opposite state (arising from the mere reversal of the directions of all velocities), but rather the state opposite to each ordered state is again an ordered state. The self-regulating most probable state — which we call the Maxwell velocity distribution since Maxwell first found its mathematical expression in a special case — is not some kind of special singular state which is contrasted to infinitely many more non-Maxwellian distributions. Rather it is, on the contrary, characterized by the fact that by far the largest number of possible states have the characteristic properties of the Maxwell distribution, and compared to this number, the number of possible velocity distributions which significantly deviate from the Maxwellian is vanishingly small. The criterion of equal possibility or equal probability is provided by Liouville's theorem. In order to explain the fact that the calculations based on this assumption correspond to actually observable processes, one must assume that an enormously complicated mechanical system represents a good picture of the world, and that all or at least most of the parts of it surrounding us are initially in a very ordered — therefore very improbable — state. When this is the case, then whenever two or more small parts of it come into interaction with each other, the system formed by these parts is also initially in an ordered state, and when left to itself it rapidly proceeds to the disordered most probable state. §88. On the return of a system to a former state. We make the following remarks: 1. 
It is by no means the sign of the time which constitutes the characteristic difference between an ordered and a disordered state. If, in the "initial states" of the mechanical picture of the world, one reverses the directions of all velocities, without changing their magnitudes or the positions of the parts of the system; if, as it were, one follows the states of the system backwards in time, then he would likewise first have an improbable state, and then reach ever more probable states. Only in those periods of time during which the system passes from a very improbable initial state to a more probable later state do the states change in the positive time direction differently than in the negative. 2. The transition from an ordered to a disordered state is only extremely probable. Also, the reverse transition has a definite calculable (though inconceivably small) probability, which approaches zero only in the limiting case when the number of molecules is infinite. The fact that a closed system of a finite number of molecules, when it is initially in an ordered state and then goes over to a disordered state, finally after an inconceivably long time must again return to the ordered state, is therefore not a refutation but rather indeed a confirmation of our theory. One should not however imagine that two gases in a 1/10 liter container, initially unmixed, will mix, then again after a few days, separate, then mix again, and so forth. On the contrary, one finds by the same principles which I used for a similar calculation that, not until after a time enormously long compared to 10^(10^10) years will there be any noticeable unmixing of the gases.
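The magnitudes in remark 2 can be made tangible with a toy calculation. The sketch below is not Boltzmann's own computation: it substitutes the much simpler urn model introduced later by Paul and Tatiana Ehrenfest, in which molecules hop at random between the two halves of a box. The function names, the choice of 1000 molecules, and the fixed random seed are all assumptions made for illustration. The first function gives the logarithm of the probability of finding every molecule in one chosen half; the second follows the gas from a fully "ordered" start toward the even split.

```python
import math
import random


def log10_prob_all_in_one_half(n_molecules: int) -> float:
    """log10 of the chance that n independent molecules all sit in one chosen half."""
    # Each molecule is in the chosen half with probability 1/2, so P = 2**(-n).
    return -n_molecules * math.log10(2.0)


def ehrenfest_urn(n_molecules: int, steps: int, seed: int = 0) -> list[int]:
    """Return the number of molecules in the left half after each step.

    Toy model (an assumption, not Boltzmann's gas dynamics): start with every
    molecule on the left, then repeatedly pick one molecule uniformly at
    random and move it to the other half.
    """
    rng = random.Random(seed)
    left = n_molecules
    history = [left]
    for _ in range(steps):
        # The picked molecule is on the left with probability left / n_molecules.
        if rng.random() < left / n_molecules:
            left -= 1
        else:
            left += 1
        history.append(left)
    return history


if __name__ == "__main__":
    history = ehrenfest_urn(n_molecules=1000, steps=20000)
    print("molecules on the left at start:", history[0])
    print("molecules on the left at end:  ", history[-1])
    print("log10 P(all 1000 in one half): ", log10_prob_all_in_one_half(1000))
```

Even for 1000 molecules the probability of the fully ordered arrangement is about 10^-301. A tenth of a liter of gas under ordinary conditions holds on the order of 10^21 molecules, so the corresponding exponent is itself astronomically large. This is a probability in a simplified model, not the waiting time Boltzmann quotes, but it suggests why he treats a return to the ordered state as practically equivalent to never.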
One may recognize that this is practically equivalent to never, if one recalls that in this length of time, according to the laws of probability, there will have been many years in which every inhabitant of a large country committed suicide, purely by accident, on the same day, or every building burned down at the same time — yet the insurance companies get along quite well by ignoring the possibility of such events. If a much smaller probability than this is not practically equivalent to impossibility, then no one can be sure that today will be followed by a night and then a day. We have looked mainly at processes in gases and have calculated the function H for this case. Yet the laws of probability that govern atomic motion in the solid and liquid states are clearly not qualitatively different in this respect from those for gases, so that the calculation of the function H corresponding to the entropy would not be more difficult in principle, although to be sure it would involve greater mathematical difficulties. §89. Relation to the second law of thermodynamics. If therefore we conceive of the world as an enormously large mechanical system composed of an enormously large number of atoms, which starts from a completely ordered initial state, and even at present is still in a substantially ordered state, then we obtain consequences which actually agree with the observed facts; although this conception involves, from a purely theoretical — I might say philosophical — standpoint, certain new aspects which contradict general thermodynamics based on a purely phenomenological viewpoint. General thermodynamics proceeds from the fact that, as far as we can tell from our experience up to now, all natural processes are irreversible. 
Hence according to the principles of phenomenology, the general thermodynamics of the second law is formulated in such a way that the unconditional irreversibility of all natural processes is asserted as a so-called axiom, just as general physics based on a purely phenomenological standpoint asserts the unconditional divisibility of matter without limit as an axiom. Just as the differential equations of elasticity theory and hydrodynamics based on this latter axiom will always remain the basis of the phenomenological description of a large group of natural phenomena, since they provide the simplest approximate expression of the facts, so likewise will the formulas of general thermodynamics. No one who has fallen in love with the molecular theory will approve of its being given up completely. But the opposite extreme, the dogma of a self-sufficient phenomenology, is also to be avoided. Just as the differential equations represent simply a mathematical method for calculation, whose clear meaning can only be understood by the use of models which employ a large finite number of elements, so likewise general thermodynamics (without prejudice to its unshakable importance) also requires the cultivation of mechanical models representing it, in order to deepen our knowledge of nature — not in spite of, but rather precisely because these models do not always cover the same ground as general thermodynamics, but instead offer a glimpse of a new viewpoint. Thus general thermodynamics holds fast to the invariable irreversibility of all natural processes. It assumes a function (the entropy) whose value can only change in one direction — for example, can only increase — through any occurrence in nature. Thus it distinguishes any later state of the world from any earlier state by its larger value of the entropy. The difference of the entropy from its maximum value — which is the goal [Treibende] of all natural processes — will always decrease.
In spite of the invariance of the total energy, its transformability will therefore become ever smaller, natural events will become ever more dull and uninteresting, and any return to a previous value of the entropy is excluded. §90. Application to the universe. Is the apparent irreversibility of all known natural processes consistent with the idea that all natural events are possible without restriction? Is the apparent unidirectionality of time consistent with the infinite extent or cyclic nature of time? He who tries to answer these questions in the affirmative sense must use as a model of the world a system whose temporal variation is determined by equations in which the positive and negative directions of time are equivalent, and by means of which the appearance of irreversibility over long periods of time is explicable by some special assumption. But this is precisely what happens in the atomic view of the world. One can think of the world as a mechanical system of an enormously large number of constituents, and of an immensely long period of time, so that the dimensions of that part containing our own "fixed stars" are minute compared to the extension of the universe; and times that we call eons are likewise minute compared to such a period. Then in the universe, which is in thermal equilibrium throughout and therefore dead, there will occur here and there relatively small regions of the same size as our galaxy (we call them single worlds) which, during the relatively short time of eons, fluctuate noticeably from thermal equilibrium, and indeed the state probability in such cases will be equally likely to increase or decrease. For the universe, the two directions of time are indistinguishable, just as in space there is no up or down.
However, just as at a particular place on the earth's surface we call "down" the direction toward the center of the earth, so will a living being in a particular time interval of such a single world distinguish the direction of time toward the less probable state from the opposite direction (the former toward the past, the latter toward the future). By virtue of this terminology, such small isolated regions of the universe will always find themselves "initially" in an improbable state. This method seems to me to be the only way in which one can understand the second law — the heat death of each single world — without a unidirectional change of the entire universe from a definite initial state to a final state. Obviously no one would consider such speculations as important discoveries or even — as did the ancient philosophers — as the highest purpose of science. However it is doubtful that one should despise them as completely idle. Who knows whether they may not broaden the horizon of our circle of ideas, and by stimulating thought, advance the understanding of the facts of experience? That in nature the transition from a probable to an improbable state does not take place as often as the converse, can be explained by assuming a very improbable initial state of the entire universe surrounding us, in consequence of which an arbitrary system of interacting bodies will in general find itself initially in an improbable state. However, one may object that here and there a transition from a probable to an improbable state must occur and occasionally be observed. To this the cosmological considerations just presented give an answer. From the numerical data on the inconceivably great rareness of transition from a probable to a less probable state in observable dimensions during an observable time, we see that such a process within what we have called an individual world — in particular, our individual world — is so unlikely that its observability is excluded. 
In the entire universe, the aggregate of all individual worlds, there will however in fact occur processes going in the opposite direction. But the beings who observe such processes will simply reckon time as proceeding from the less probable to the more probable states, and it will never be discovered whether they reckon time differently from us, since they are separated from us by eons of time and spatial distances 10^(10^10) times the distance of Sirius — and moreover their language has no relation to ours. [Footnote: For further discussion see H. Reichenbach, The Direction of Time (Berkeley and Los Angeles: University of California Press, 1956).] Very well, you may smile at this; but you must admit that the model of the world developed here is at least a possible one, free of inner contradiction, and also a useful one, since it provides us with many new viewpoints. It also gives an incentive, not only to speculation, but also to experiments (for example on the limit of divisibility, the size of the sphere of action, and the resulting deviations from the equations of hydrodynamics, diffusion, and heat conduction) which are not stimulated by any other theory. §91. Application of the probability calculus in molecular physics. Doubts have been expressed as to the permissibility of the applications made of the probability calculus to this subject. Since however the probability calculus has been verified in so many special cases, I see no reason why it should not also be applied to natural processes of a more general kind. The applicability of the probability calculus to the molecular motion in gases cannot of course be rigorously deduced from the differential equations for the motion of the molecules. It follows rather from the great number of the gas molecules and the length of their paths, by virtue of which the properties of the position in the gas where a molecule undergoes a collision are completely independent of the place where it collided the previous time.
This independence is of course attained only for an infinite number of gas molecules during an arbitrarily long time. For a finite number of molecules in a rigid container with completely smooth walls it is never completely exact, so that the Maxwell velocity distribution cannot hold throughout all time. In practice, however, the walls are continually undergoing perturbations, which will destroy the periodicity resulting from the finite number of molecules. In any case, the applicability of probability theory to gas theory is not refuted but rather is confirmed by the periodicity of motion of a finite closed system in the course of eons of time, and since it leads to a model of the world which not only agrees with experience but also stimulates speculations and experiments, it should be retained in gas theory. Moreover we see that the probability calculus plays yet another role in physics. The calculation of errors by Gauss's famous method is confirmed in purely physical processes, like the calculation of insurance premiums for statistical ones. We have to thank the laws of probability for the fact that in an orchestra the sounds regularly reinforce each other in unison rather than cancelling out by interference; and the same laws also clarify the nature of unpolarized light. Since today it is popular to look forward to the time when our view of nature will have been completely changed, I will mention the possibility that the fundamental equations for the motion of individual molecules will turn out to be only approximate formulas which give average values, resulting according to the probability calculus from the interactions of many independent moving entities forming the surrounding medium — as for example in meteorology the laws are valid only for average values obtained by long series of observations using the probability calculus.
These entities must of course be so numerous and must act so rapidly that the correct average values are attained in millionths of a second.