Richard von Mises
Richard von Mises was the younger brother of Ludwig von Mises, the great theorist of libertarian economics.
Richard became an applied mathematician and specialist in the theory of probability and statistics. He was a professor at Harvard University.
He is known for the frequency concept of probability, which has been controversial.
In Andrey Kolmogorov's book, Foundations of the Theory of Probability, Kolmogorov defined the modern axiomatic foundation of probability theory, which is more widely accepted than von Mises' frequency theory.
POSITION AND VELOCITY OF A MATERIAL PARTICLE
The physicist, W. Heisenberg, one of the founders of quantum mechanics, was the first to investigate what happens when we try to determine more and more exactly the physical variables characterizing the state of a single particle, i.e., its position in space and its velocity, or its position and its momentum.
First, let us try to fix the position of the particle in space. We place it under the microscope, illuminate it, and try to find its coordinates. The exactness with which small objects can be located under a microscope depends on the wave length of the source of illumination. The smallest distance which can be observed under the microscope is proportional to the wave length of the light used. If we want to fix the position as exactly as possible, we have to use light of a very short wave length, and consequently, of a very large frequency.
According to the modern concept of light, an illuminated particle is continuously struck by a large number of light quanta. The whole process is of a statistical nature such as Brownian motion or the motion of molecules in a gas. The energy of each light quantum is inversely proportional to its wave length. The impact of the quanta affects the state of motion of the particle, and this effect increases with the increase in the energy of the quantum, that is, with an increase in its frequency or with a decrease in its wave length (this is the so-called Compton effect). We are thus in a dilemma: the increase in accuracy of the measurement of the co-ordinates of the particle requires the use of light with a very short wave length. The shorter the wave length, however, the stronger is the disturbing influence on the measurement of the velocity of the particle. It follows that it is fundamentally impossible to measure at the same time exactly both the position and the velocity of the particle.
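The dilemma described above can be put in rough symbolic form. The following is only an order-of-magnitude sketch, not a derivation taken from the text: the symbols \Delta x (uncertainty of position), \Delta p (uncertainty of momentum), \lambda (wave length), \nu (frequency) and c (speed of light) are introduced here for illustration, and all numerical factors are dropped.

```latex
\Delta x \sim \lambda, \qquad
E = h\nu = \frac{hc}{\lambda}, \qquad
\Delta p \sim \frac{h}{\lambda}
\quad\Longrightarrow\quad
\Delta x \,\Delta p \sim h .
```

Shortening \lambda reduces \Delta x but increases the momentum h/\lambda carried by each quantum, and with it the possible disturbance \Delta p, so that the product of the two stays of the order of Planck's constant.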
The main point at issue here is not, as has often been stated, that the process of measuring influences the state of the object to be measured and thus limits the possible extent of precision. Such interaction also exists in certain instances of macrophysics, e.g., the introduction of an apparatus for measuring the dynamical pressure of a fluid affects the pressure. However, in this and other such cases we know how to apply appropriate corrections. The conditions in micromechanics are fundamentally different: the essential point is the assumed random character of the disturbing light quanta, a phenomenon which cannot be accounted for by a deterministic theory of the type of Newtonian mechanics.
The essential consequence of Heisenberg’s considerations can be summarized by saying that the results of all measurements form collectives. In the realm of macrophysics the objects of measurement are themselves statistical conglomerates, such as the length of a ruler which is a mass of molecules in motion. The notion of an absolutely exact length measure has therefore obviously no meaning with respect to objects of this kind. In microphysics, where we are concerned with measurements on a single elementary particle, the inexactness is introduced by the statistical character of the light quanta striking the particle during and through the very act of measuring. In both cases we are faced with the indeterministic nature of the problem as soon as we inquire more closely into the concrete conditions of the act of measuring.
HEISENBERG’S UNCERTAINTY PRINCIPLE
Quantum mechanics is considered today to be a purely statistical theory. Its axioms are expressed in terms of differential equations connecting the probabilities for the values of co-ordinates and velocities at a given moment with the corresponding probabilities at another moment. Some physicists still try to interpret these equations in a deterministic way and to ‘derive’ them from concepts of classical mechanics to which they are doubtlessly related by many formal analogies. Possibly these attempts will meet with a similar fate as did analogous attempts in the case of Maxwell’s equations of electrodynamics. For many years, one tried to explain these equations mechanically, by the introduction of concealed masses and complicated mechanisms. Eventually, however, it was agreed to accept these equations as elementary laws needing no mechanical ‘derivation’. The situation is more difficult in the case of quantum mechanics, because here the various assumptions are related to certain mechanical systems.
One consequence of the axioms of quantum mechanics has aroused particular interest. This is the above-mentioned relation existing between the distributions of the co-ordinates of a particle on the one hand and that of its impulses (or velocities) on the other, the most important being that the product of the variances of the two variables has a certain fixed value, independent of any other data of the problem. The order of magnitude of this product is that of the square of Planck’s universal constant (h = 6 × 10⁻²⁷ in the usual metrical units). The relation is known as Heisenberg’s Uncertainty Principle. The previously discussed example of the observation of a particle under the microscope, which led to the finding that the more exactly we measure the co-ordinates, the less exact the measurements of the velocities become, appears now as a consequence of Heisenberg’s principle.
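For comparison, present-day textbooks usually state the relation as a lower bound on the product of the standard deviations rather than as a fixed value. The notation below (\sigma_x and \sigma_p for the standard deviations of position and momentum, \hbar for the reduced constant) is not used in the text and is assumed here only for illustration; the numerical value of h is the modern one in CGS units.

```latex
\sigma_x \, \sigma_p \;\ge\; \frac{\hbar}{2}, \qquad \hbar = \frac{h}{2\pi}
\quad\Longrightarrow\quad
\sigma_x^{2}\,\sigma_p^{2} \;\ge\; \frac{\hbar^{2}}{4} = \frac{h^{2}}{16\pi^{2}},
\qquad h \approx 6.626 \times 10^{-27}\ \text{erg}\cdot\text{s}.
```

In this form the product of the variances is indeed of the order of h squared, as stated above, and the bound does not depend on any other data of the problem.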
Heisenberg’s principle of the constancy of the product of variances is a purely theoretical proposition and is in this sense mathematically precise. In other words, it presumes that each single measurement in the collective consists in an absolutely exact reading of the measuring instrument. If we were able to make an experimental device to measure lengths to 10⁻¹³ cm and to measure the impulses also to 10⁻¹³ g cm/sec, the theory provides that the results of repeated measurements of position will be the same each time (and likewise those of velocity), so that there would be practically no variance in either case. This situation would differ only by its orders of magnitude from the one discussed above where the length of a table was measured without variance by the use of a tape divided into units of whole centimetres only.
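The remark about the tape divided into whole centimetres can be illustrated numerically. The sketch below is hypothetical throughout: the function name `measure`, the 'true' length of 150.3 cm, the Gaussian noise of 0.02 cm and the two reading units are assumptions chosen for illustration, not values from the text.

```python
import random
import statistics


def measure(true_value, noise_sd, unit, n, seed=0):
    """Simulate n readings of a length, each rounded to the nearest multiple of `unit`.

    All parameters are illustrative assumptions; the noise is taken to be Gaussian.
    """
    rng = random.Random(seed)
    return [round((true_value + rng.gauss(0, noise_sd)) / unit) * unit
            for _ in range(n)]


# The same noisy process, read with a fine scale and with a coarse one.
fine = measure(150.3, 0.02, 0.001, 1000)   # readings to 0.001 cm
coarse = measure(150.3, 0.02, 1.0, 1000)   # readings to whole centimetres

fine_var = statistics.pvariance(fine)
coarse_var = statistics.pvariance(coarse)
print(fine_var, coarse_var)
```

With the coarse unit every reading falls on the same whole centimetre, so the recorded variance vanishes, although the underlying process is unchanged. With the fine unit the scatter becomes visible.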
Some physicists feel that the ground has been cut from under their feet since the Uncertainty Principle was first announced. If no exact measurements are possible, not even in principle, what is the meaning of exact physical theories? In my opinion, these apprehensions are not justified. The results of quantum mechanics or wave mechanics can be used in exactly the same way as the results of classical macrophysics. What do we care about the impossibility of predicting the beginning of an eclipse of the sun to 10⁻¹² seconds, if we can predict it to a second? In the end, our feeling of discomfort is nothing but another aspect of the old disparity between purely mathematical concepts with their ‘limitless precision’ and the realities of the physical world.
What, then, is the ultimate meaning of Heisenberg’s Uncertainty relation? We must see in it a great step towards the unification of our physical conception of the world. Until recently, we thought that there existed two different kinds of observations of natural phenomena, observations of a statistical character, whose exactness could not be improved beyond a certain limit, and observations on the molecular scale whose results were of a mathematically exact and deterministic character. We now recognize that no such distinction exists in nature. I do not want to convey the impression that every distinction between extreme regions of physics has now disappeared, and that the mechanics of solar systems and the theory of radioactive disintegration are only two paragraphs of the same chapter. The description of nature is not as simple as that, and cannot be forced into one single scheme. Nevertheless, a certain apparent contrast between two domains of physics has disappeared with the advent of the new concepts of wave mechanics.
CONSEQUENCES FOR OUR PHYSICAL CONCEPT OF THE WORLD
We can only roughly sketch here the consequences of these new concepts for our general scientific outlook. First of all, we have no cause to doubt the usefulness of the deterministic theories in large domains of physics. These theories, built on a solid body of experience, lead to results that are well confirmed by observation. By allowing us to predict future physical events, these physical theories have fundamentally changed the conditions of human life. The main part of modern technology, using this word in its broadest sense, is still based on the predictions of classical mechanics and physics.
It has been known for a long time, at least to those who strive for clear insight into these matters, that consequences drawn from the mathematical propositions of the classical theories cannot be verified with unlimited accuracy, in the mathematical sense. Atomistic theories of the ancient philosophers already pointed in this direction. The wave theory of light strongly suggests the existence of limitations of this kind. The first attempt at a comprehensive interpretation regarding the nature of the limits to the accuracy of measurements was Boltzmann’s formulation, in the second half of the nineteenth century, of the kinetic theory of gases as a statistics of molecules. He pointed out that the predictions of classical physics are to be understood in the sense of probability statements of the type of the Laws of Large Numbers, i.e.: ‘If n is a large number, it is almost certain that...’ Consideration of the values of n involved (the number of molecules, etc.) shows that under normal conditions these probabilities are so close to unity that the probable predictions become in fact certain. As explained above, at this stage of development, the usual assumption was that the atomic processes themselves, namely the motions of single molecules, are governed by the exact laws of deterministic mechanics. This point of view, which is incompatible with our concept of probability, has been retained by some physicists until quite recently.
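How quickly such probabilities approach unity as n grows can be seen in a small calculation. The sketch below is an illustration under stated assumptions, not a computation from the text: it takes the simplest case of n tosses of a fair coin, and the function name `prob_within` and the tolerance of one tenth are hypothetical choices.

```python
from fractions import Fraction
from math import ceil, comb, floor


def prob_within(n, eps):
    """Exact probability that the relative frequency of heads in n tosses
    of a fair coin lies within eps of 1/2 (illustrative assumption: p = 1/2)."""
    eps = Fraction(eps)
    lo = max(0, ceil(n * (Fraction(1, 2) - eps)))
    hi = min(n, floor(n * (Fraction(1, 2) + eps)))
    favourable = sum(comb(n, k) for k in range(lo, hi + 1))
    return favourable / 2**n


for n in (10, 100, 1000):
    print(n, prob_within(n, Fraction(1, 10)))
```

Already for n = 1000 the probability differs from unity by a negligible amount. The values of n met with in the kinetic theory of gases are larger by many orders of magnitude.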
The rise of quantum mechanics has freed us from this dualism which prevented a logically satisfactory formulation of the fundamentals of physics. We know now that besides classical physics, applicable to processes on a large scale, there is a microphysics, namely the theory of quanta or wave mechanics; the differential equations of microphysics, however, merely connect probability distributions. Therefore, the statements made by this theory with respect to the elementary particles have the character of probability propositions. In the world of molecules, ‘exact measurements’ without variance are possible only under the same restrictions as hold for ordinary bodies: only if we decide to record just those digits that do not change from one measurement to another. The order of magnitude of the unit, which in atomic physics is about 10⁻¹² mm, is of practical but not of basic importance.
I have confined myself to questions regarding inorganic matter and have avoided all attempts to carry the investigations into the field of biology. By this voluntary restriction, I do not intend to indicate that I consider an extension of our theory in this direction to be impossible or impermissible. I think, however, that the so-called biological processes are still much more complicated than those forming the subject of physics and chemistry, and that considerable additions have to be made to the physical theories before biological statements of a basic nature can be attempted.
Let us make a final brief survey of the course which we have followed in these chapters. We began by investigating the meaning of the word ‘probability’ in everyday language and by trying to restrict this meaning in an appropriate way. We found an adequate basis for the definitions and axioms of an exact scientific theory of probability in a well-known class of phenomena: games of dice and similar processes. The notions of the collective, of the limiting value of relative frequency, and of randomness became the starting-point of the new theory of probability. The four fundamental operations, selection, mixing, partition, and combination, were the tools by means of which the theory was developed.
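The notions of relative frequency and of insensitivity to place selection can be illustrated by a simulated game of dice. The sketch below is only an illustration: a finite run of a pseudo-random generator cannot establish a limiting value, and the function names, the seed, the number of throws and the particular selection rule are all assumptions made here.

```python
import random


def throw_die(n, seed=0):
    """Simulate n throws of a die with a seeded pseudo-random generator."""
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(n)]


def relative_frequency(seq, attribute):
    """Fraction of the elements of seq that show the given attribute."""
    return sum(1 for x in seq if x == attribute) / len(seq)


def select_after_six(seq):
    """A place selection: keep a throw only if the preceding throw was a six.

    Whether a throw is kept is decided without reference to its own result.
    """
    return [seq[i] for i in range(1, len(seq)) if seq[i - 1] == 6]


throws = throw_die(200_000)
full = relative_frequency(throws, 6)
selected = relative_frequency(select_after_six(throws), 6)
print(full, selected)
```

Both relative frequencies come out close to 1/6. A gambler who bets only after a six has appeared gains nothing by this system, which is the content of the principle of randomness.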
We stated once and for all that the purpose of the theory is only to derive new distributions of probabilities from initial ones. We showed that, in this sense, the theory of probability does not differ from other natural sciences, and we thus gained a stable position from which to judge the epistemologically insufficient foundations of older theories of probability, like that based on the notion of equally likely events. We reviewed the various suggestions for improvements of my original statements. No necessity for essential alterations emerged from this discussion. The classical Laws of Large Numbers and the recent additions to these laws were incorporated into the new theory. The frequency definition of probability has allowed us to interpret these laws as definite propositions concerning sequences of observable phenomena.
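The derivation of new distributions from initial ones, referred to at the beginning of this paragraph, can be sketched for finite distributions. The code below is a simplified illustration, not the formal definitions of the theory: the function names are hypothetical, distributions are represented as dictionaries of exact fractions, and combination is shown only for the special case of independent collectives.

```python
from fractions import Fraction
from itertools import product

# Initial collective (assumed for illustration): a fair die.
die = {face: Fraction(1, 6) for face in range(1, 7)}


def selection(dist):
    """Place selection leaves the distribution unchanged."""
    return dict(dist)


def mixing(dist, label):
    """Merge attributes under a new label; their probabilities are added."""
    out = {}
    for attr, p in dist.items():
        key = label(attr)
        out[key] = out.get(key, Fraction(0)) + p
    return out


def partition(dist, keep):
    """Retain only the attributes satisfying keep and divide by their total."""
    total = sum(p for attr, p in dist.items() if keep(attr))
    return {attr: p / total for attr, p in dist.items() if keep(attr)}


def combination(d1, d2):
    """Combine two collectives assumed independent: probabilities are multiplied."""
    return {(a, b): p * q for (a, p), (b, q) in product(d1.items(), d2.items())}


# Derived collective: the sum of the points shown by two dice.
two_dice_sum = mixing(combination(die, die), sum)
even_faces = partition(die, lambda face: face % 2 == 0)
print(two_dice_sum[7], even_faces[2])
```

Here the distribution of the sum of two dice is obtained by one combination followed by one mixing, which shows how a derived collective arises from repeated application of the fundamental operations.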
The first wide field of applications of the theory of probability which we have discussed was that usually known as statistics. This is, first of all, the study of sequences of numbers derived from the observation of certain repetitive events in human life. We have seen, e.g., that Marbe’s exhaustive statistics of the sex distribution of infants is in very good agreement with the predictions of the theory of probability. In other cases, such as death statistics, suicide statistics, the statistical data could not be considered directly as collectives; we found, however, ways to reduce them to collectives. We saw that methods based on the theory of probability, such as, e.g., Lexis’s theory of dispersion, were useful tools in a rational comprehensive and systematic description of repetitive events; in this sense, the methods provide us with what is usually called an ‘explanation’ of the phenomena. The theory of errors, which is the statistics of physical measurements, has served as a link with a second fundamental field of application of the calculus of probability, with statistical physics.
The problems of statistical physics are of the greatest interest in our time, since they lead to a revolutionary change in our whole conception of the universe. We have seen how Boltzmann took the first daring step in formulating a law of nature in the form of a statistical proposition. The initial stage was uncertain and in a way self-contradictory in that it attempted to derive the statistical behaviour of systems from the deterministic laws of classical mechanics, an attempt which was destined to fail, as E. Mach maintained vigorously. We have then followed the success of purely statistical arguments in the explanation of certain physical phenomena, such as Brownian motion or the scintillations caused by radioactivity. These investigations led us in a natural way to the problem of the meaning of the so-called law of causality and of the general relation between determinism and indeterminism in physics. We recognized how the progress of physics has brought about a gradual abandonment of preconceived ideas that had even been dogmatically formulated in some philosophical systems. The new quantum mechanics and Heisenberg’s Uncertainty Principle finally complete the edifice of a statistical conception of nature, showing that strictly exact observations are no more possible in the world of micromechanics than in that of macromechanics. No measurements can be carried out without the intervention of phenomena of a statistical character.
I think that I may have succeeded in demonstrating the thesis indicated in the title and in the introduction to this book: Starting from a logically clear concept of probability, based on experience, using arguments which are usually called statistical, we can discover truth in wide domains of human interest.
SUMMARY OF THE SIX LECTURES IN SIXTEEN PROPOSITIONS
1. The statements of the theory of probability cannot be understood correctly if the word ‘probability’ is used in the meaning of everyday speech; they hold only for a definite, artificially limited rational concept of probability.
2. This rational concept of probability acquires a precise meaning only if the collective to which it applies is defined exactly in every case. A collective is a mass phenomenon or repetitive event that satisfies certain conditions; generally speaking, it consists of a sequence of observations which can be continued indefinitely.
3. The probability of an attribute (a result of observation) within a collective is the limiting value of the relative frequency with which this attribute recurs in the indefinitely prolonged sequence of observations. This limiting value is not affected by any place selection applied to the sequence (principle of randomness or principle of the impossibility of a gambling system).
Occasionally we deal with sequences in which the condition of randomness is not fulfilled; we then call the limiting value of the relative frequency the ‘chance’ of the attribute under consideration.
4. The purpose of the calculus of probability, strictly speaking, consists exclusively in the calculation of probability distributions in new collectives derived from given distributions in certain initial collectives. The derivation of new collectives can always be reduced to the (repeated) application of one or several of four simple fundamental operations.
5. A probability value, initial or derived, can only be tested by a statistical experiment, i.e., by means of a sufficiently long sequence of observations. There is no a priori knowledge of probabilities; it is likewise impossible to derive probability values by way of some other non-statistical science, such as mechanics.
6. The classical ‘definition’ of probability is an attempt to reduce the general case to the special case of equally likely events where all the attributes within the collective have equal probabilities. This reduction is often impossible as, e.g., in the case of death statistics; in other cases it may lead to contradictions (Bertrand’s paradox). At any rate, it still remains necessary to give a definition of probability for the case of uniform distributions. Without the complement of a frequency definition, probability theory cannot yield results that are applicable to real events.
7. The so-called Laws of Large Numbers contain meaningful statements on the course of a sequence of observations only if we use a frequency definition of probability. Interpreted in this way, they make definite statements, essentially based on the condition of randomness, concerning the arrangement of the results in the observed sequence. On the basis of the classical definition, these laws are purely arithmetical propositions concerning certain combinatorial properties of integral numbers and bear no relation to the actual evolution of phenomena.
8. The task of probability calculus in mathematical statistics consists in investigating whether a given system of statistical data forms a collective, or whether it can be reduced to collectives. Such a reduction provides a condensed, systematic description of the statistical data that we may properly consider an ‘explanation’ of these data.
9. None of the theories that seemed to contradict the theory of probability (such as Marbe’s theory of statistical stabilization, the theory of accumulation, the law of series) has been confirmed by observations.
10. The concept of likelihood introduced by R. A. Fisher, and the methods of testing derived from it do not, if they are correctly applied and interpreted, fall outside of the domain of the theory of probability based on the frequency concept.
11. The theory of errors, which lies on the borderline between general and physical statistics, is based on the assumption that each physical measurement is an element in a collective whose mean value is the so-called ‘true’ value of the measured quantity. Additional assumptions concerning this collective lead to the various propositions of the theory of errors.
12. Statistical propositions in physics differ fundamentally from deterministic laws: they predict only what is to be expected in the overwhelming majority of cases for a sufficiently long sequence of observations of the same phenomenon (or of the same group of phenomena). As a rule, however, the relative frequency of this most probable result is so close to unity that no practical difference exists between the statistical proposition and the corresponding deterministic one.
13. Successive observations on the evolution in time of a physical system do not directly form a collective. They can, nevertheless, be dealt with satisfactorily within the framework of the rational theory of probability (probability after-effects, Markoff chains).
14. The assumption that a statistical theory in macrophysics is compatible with a deterministic theory in microphysics is contrary to the conception of probability expressed in these lectures.
15. Modern quantum mechanics or wave mechanics appears to be a purely statistical theory; its fundamental equations state relations between probability distributions. The Uncertainty Principle derived in quantum mechanics implies that measurements in microphysics, like those in macrophysics, are elements of a collective; in either case, a vanishing variance of a measurement is merely the consequence of the choice of a sufficiently large unit of measurement.
16. The point of view that statistical theories are merely temporary explanations, in contrast to the final deterministic ones which alone satisfy our desire for causality, is nothing but a prejudice. Such an opinion can be explained historically, but it is bound to disappear with increased understanding.
Probability, Statistics, and Truth, excerpts (PDF)