Albert Einstein - On the Development of Our Views Concerning the Nature and Constitution of Radiation

("Entwicklung unserer Anschauungen über das Wesen und die Konstitution der Strahlung," Physikalische Zeitschrift, 10, 817-825, 1909)

When light was shown to exhibit interference and diffraction, it seemed almost certain that light should be considered a wave. Since light can also propagate through empty space, one had to imagine a strange substance, an ether, that mediated the propagation of light waves. Since light also propagates in material objects, one had to assume that this ether was also present in material objects, and was chiefly responsible for the propagation of light in material objects. The existence of the ether seemed beyond doubt. In the first volume of Chwolson's excellent physics textbook, he states in the introduction to ether, "The hypothesis of this one agent's existence is extraordinarily close to certainty."

Today, however, we regard the ether hypothesis as obsolete. A large body of facts shows undeniably that light has certain fundamental properties that are better explained by Newton's emission theory of light than by the oscillation theory. For this reason, I believe that the next phase in the development of theoretical physics will bring us a theory of light that can be considered a fusion of the oscillation and emission theories. The purpose of the following remarks is to justify this belief and to show that a profound change in our views on the composition and essence of light is imperative.

The greatest advance in theoretical optics since the introduction of the oscillation theory was Maxwell's brilliant discovery that light can be understood as an electromagnetic process. This theory replaces mechanical quantities (the deformation and velocity of the ether's parts) with the electromagnetic state of the ether and the material under consideration. It reduces optical problems to electromagnetic problems. As the electromagnetic theory developed, it became ever less of a concern whether the electromagnetic processes could be explained by mechanical processes. One became used to treating electric and magnetic fields as fundamental concepts that did not require a mechanical interpretation.

The introduction of the electromagnetic theory simplified the elements of theoretical optics and reduced the number of arbitrary hypotheses. The old question about the oscillation direction of polarized light became irrelevant. The difficulties concerning the boundary conditions between two media were resolved using the theory's fundamental principles. An arbitrary hypothesis was no longer needed to eliminate longitudinal light waves. A consequence of the theory was the pressure of light, which plays such an important role in radiation theory and which has just recently been confirmed experimentally. I don't want to make an exhaustive list of well-known accomplishments, but rather concentrate on one main point, on which the electromagnetic and the mechanical theories of light agree — or rather, seem to agree.

In both theories, light is essentially an embodiment of the state of a hypothetical medium, the ether, which exists everywhere, even in the absence of light. It was therefore assumed that motions of this medium would influence optical and electromagnetic phenomena. The search for laws describing this influence caused a change in the basic ideas about the nature of radiation. Let us briefly examine the progression of this change.

The main outstanding question was the following: does the ether participate in the motions of matter, does the ether inside moving matter move differently from the matter, or does the ether perhaps ignore the motions of matter and remain forever at rest? To decide this question, Fizeau performed an important interference experiment, based on the following line of reasoning. Assume that light propagates with speed V in a certain object, if the object is at rest. If the object, when moved, takes its ether along with it, the light will propagate relative to the object in the same way as when the object was at rest. Therefore, the speed of propagation relative to the object will again be V. However, taken absolutely, i.e., relative to an observer not moving with the object, the speed of propagation will be the geometric sum of V and the velocity v of the object. If the motion and propagation are along the same axis and have the same sense, then V_abs is simply the sum of the two speeds:

V_abs = V + v

This is a consequence of assuming that the ether participates in the motion of its object.

To test whether this prediction was true, Fizeau let two coherent, monochromatic beams pass axially each through their own water-filled pipe and then interfere with one another. Then he set the water in the pipes moving, one in the direction of the light's propagation and the other opposite to it. In this way, a shift of the interference pattern was produced, from which Fizeau could derive the influence of the object's velocity on the absolute velocity.

As is well-known, the change is smaller than predicted by the hypothesis of complete participation, although the sense of the change is as expected. Expressed mathematically,

V_abs = V + αv

where α is always smaller than one. Ignoring dispersion,

α = 1 - 1 / n²
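
The size of the effect can be illustrated with a short numerical sketch. The refractive index of water, the flow speed, and the relation V = c/n for the speed of light in the medium at rest are assumptions chosen for illustration; they are not values taken from the text, and the function names are hypothetical.

```python
def drag_coefficient(n: float) -> float:
    """Coefficient alpha = 1 - 1/n**2 from the text (dispersion ignored)."""
    return 1.0 - 1.0 / n**2


def absolute_speed(c: float, n: float, v: float) -> float:
    """V_abs = V + alpha * v, assuming V = c / n for the medium at rest."""
    return c / n + drag_coefficient(n) * v


C = 299_792_458.0   # speed of light in vacuum, m/s
N_WATER = 1.333     # assumed refractive index of water (illustrative)
V_FLOW = 7.0        # assumed flow speed of the water, m/s (illustrative)

alpha = drag_coefficient(N_WATER)
# Change in light speed caused by the flow, under partial participation.
shift_partial = absolute_speed(C, N_WATER, V_FLOW) - C / N_WATER
# Change that complete participation of the ether would have produced.
shift_full = V_FLOW

print(f"alpha = {alpha:.3f}")
print(f"speed change: {shift_partial:.2f} m/s (observed law) vs {shift_full:.2f} m/s (complete participation)")
```

Under these assumptions α comes out a little below one half, so the flowing water changes the speed of light by less than half of what complete participation would predict.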

This experiment demonstrated that matter does not completely carry along its ether but, in general, the ether is moving relative to matter. Now the Earth is a material object, which moves in different directions over the course of a year relative to the solar system. The ether in our laboratories was assumed to not participate in the Earth's motion completely, just as the ether did not participate in the water's motion completely in Fizeau's experiment. Thus, the conclusion was that the ether was moving relative to our instruments, and that this relative motion changed over the course of a day and of a year. This relative motion was expected to produce a visible anisotropy of space, i.e., optical phenomena were expected to depend on the orientation of the apparatus. The most diverse experiments were performed without detecting the expected dependence of phenomena on orientation.

This contradiction was chiefly eliminated by the pioneering work of H. A. Lorentz in 1895. Lorentz showed that if the ether were taken to be at rest and did not participate at all in the motions of matter, no other hypotheses were necessary to arrive at a theory that did justice to almost all of the phenomena. In particular, Fizeau's experiments were explained, as well as the negative results of the above-mentioned attempts to detect the Earth's motion relative to the ether. Only one experiment seemed incompatible with Lorentz's theory, namely, the interference experiment of Michelson and Morley.

According to Lorentz's theory, a uniform translational motion of the apparatus of optical experiments does not affect light's progress, if we ignore second- and higher-order terms of the quotient (speed of apparatus)/(speed of light). The Michelson and Morley interference experiment showed that, in a special case, second-order terms also cannot be detected, although they were expected from the standpoint of the ether-at-rest theory. To include this experiment in the theory, Lorentz and FitzGerald introduced the postulate that all objects, including the parts of Michelson and Morley's experimental set-up, changed their form in a certain way, if they moved relative to the ether.

This state of affairs was very unsatisfying. The only useful theory with clear foundations was that of Lorentz, which depended on a completely immobile ether. The Earth had to be seen as moving relative to this ether. But every experiment designed to demonstrate this relative motion had a negative result, so that one was driven to a very strange hypothesis to understand why such a relative motion was not detectable.

Michelson's experiment suggests the axiom that all phenomena obey the same laws relative to the Earth's reference frame or, more generally, relative to any reference frame in unaccelerated motion. For brevity, let us call this postulate the relativity principle. Before we tackle the problem of whether it is possible to maintain the relativity principle, let us briefly consider what happens to the ether hypothesis, if we maintain the relativity principle.

The foundation of the ether hypothesis is the experimentally based assumption that the ether is at rest. The relativity principle states that all natural laws that hold in a reference frame K^' moving uniformly relative to the ether are identical with those that hold in K, a reference frame at rest relative to the ether. If that is so, we can just as well imagine the ether is at rest relative to K^', not K. It is completely unnatural to distinguish the two reference frames K^' and K by introducing an ether that is at rest in one. A satisfying theory can only be reached if we dispense with the ether hypothesis. Then the electromagnetic fields that make up light no longer appear as a state of a hypothetical medium, but rather as independent entities that the light source gives off, just as in Newton's emission theory of light. As in that theory, space that is free of matter and radiation is truly empty.

Superficial consideration suggests that the essential parts of Lorentz's theory cannot be reconciled with the relativity principle. According to Lorentz's theory, if a light beam propagates through space, it does so with a speed c in the resting frame K of the ether, independently of the state of motion of the emitting object. Let's call this the constancy of the speed of light principle. The theorem of the addition of speeds states that the same light beam will not propagate at speed c in a different frame K^' moving uniformly relative to the ether. The laws of propagation thus seem to be different in the two frames and, hence, the relativity principle seems to be incompatible with the laws governing light's propagation.

However, the theorem of the addition of speeds rests on arbitrary axioms. It presupposes that information about time and the form of moving objects has meaning independent of the motion of the moving reference frame. But one can convince oneself that the definitions of time and the form of moving objects require the introduction of clocks at rest in the reference frame under consideration. These concepts must be defined for each reference frame, and it is not self-evident that these definitions will produce the same time values in two frames K and K^' moving relative to one another. Similarly, it cannot be said a priori that statements about the form of objects in K will also be valid in K^'.

Hence, the hitherto prevailing transformation equations in passing from one frame to another moving relative to it rest on arbitrary assumptions. If these are abandoned, the essence of Lorentz's theory or, more generally, the "constancy of the speed of light" principle can be reconciled with the relativity principle. These two principles lead to certain unambiguous transformation equations characterized by the identity

x² + y² + z² - c² t² = (x^')² + (y^')² + (z^')² - c² (t^')²

for an appropriate choice of initial origins. In this equation, c is the speed of light in vacuo, x, y, z, t are the space-time coordinates in K, and x^', y^', z^', t^' are those in K^'.
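
The identity can be checked numerically. The sketch below assumes the standard Lorentz transformation for a frame K^' moving with speed v along the x-axis of K, which the text refers to but does not write out; the event coordinates and the value of v are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def boost_x(x, y, z, t, v, c=C):
    """Coordinates in K' moving with speed v along the x-axis of K (origins coincide at t = 0)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), y, z, gamma * (t - v * x / c**2)


def interval(x, y, z, t, c=C):
    """The quadratic form x^2 + y^2 + z^2 - c^2 t^2."""
    return x * x + y * y + z * z - (c * t) ** 2


event = (4.0e8, 1.0e8, -2.0e8, 1.5)   # arbitrary event in K (metres, seconds)
v = 0.6 * C                           # arbitrary relative speed of K'
event_prime = boost_x(*event, v)

s2 = interval(*event)
s2_prime = interval(*event_prime)

# A light signal along x obeys x = c t in K; it should obey x' = c t' in K'.
t_light = 2.0
xp, _, _, tp = boost_x(C * t_light, 0.0, 0.0, t_light, v)
light_speed_in_k_prime = xp / tp

print(math.isclose(s2, s2_prime, rel_tol=1e-9))
print(math.isclose(light_speed_in_k_prime, C, rel_tol=1e-12))
```

Both checks print `True`: the quadratic form has the same value in the two frames, and the light signal propagates with speed c in K^' as well as in K.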

This path leads to the so-called relativity theory. I only wish to bring in one of its consequences, for it brings with it certain modifications of the fundamental ideas of physics. It turns out that the inertial mass of an object decreases by L / c² when that object emits radiation of energy L. This can be derived as follows.

We consider a free object at rest that emits the same amount of radiative energy in two opposing directions. In doing so, it remains at rest. Let the object's energy prior to emission be denoted E and its energy after emission e, and let L be the energy of the emitted radiation. By the energy principle, we have

E = e + L

Now consider the object and the same emitted radiation from a reference frame moving with velocity v relative to the object. Relativity theory gives us the means of calculating the energy emitted in the new reference frame. One obtains the value

L^' = L / √ (1 - v²/c²)

Since the conservation of energy principle must also hold in the new reference frame, one obtains analogously

E^' = e^' + L / √ (1 - v²/c²)

Subtracting and ignoring fourth- and higher-order terms in v/c, we get

E^' - E = e^' - e + (1/2) L (v²/c²)

But E^' - E is the object's kinetic energy before the light emission and e^' - e is its kinetic energy after the light emission. If we call its mass before emission M and its mass after emission m then, by ignoring terms higher than second order, we can write

(1/2) M v² = (1/2) m v² + (1/2) L (v²/c²)

M = m + L / c²
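
The approximation step can be checked numerically. The sketch below compares the exact difference L (1/√(1 - v²/c²) - 1) with the second-order term (1/2) L (v²/c²) retained above, and evaluates the mass change L / c². The emitted energy and the frame speed are illustrative assumptions, and the function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def kinetic_energy_change(L, v, c=C):
    """Exact value of (E' - E) - (e' - e) = L * (1/sqrt(1 - v^2/c^2) - 1)."""
    beta2 = (v / c) ** 2
    return L * (1.0 / math.sqrt(1.0 - beta2) - 1.0)


def second_order_term(L, v, c=C):
    """The term (1/2) L v^2 / c^2 kept after dropping fourth- and higher-order terms."""
    return 0.5 * L * (v / c) ** 2


L_emitted = 1.0        # emitted energy in joules (illustrative)
v = 1.0e-3 * C         # frame speed, small compared with c (illustrative)

exact = kinetic_energy_change(L_emitted, v)
approx = second_order_term(L_emitted, v)
mass_loss = L_emitted / C**2   # M - m = L / c^2, in kilograms

print(f"relative error of the second-order term: {abs(exact - approx) / exact:.1e}")
print(f"mass carried off by 1 J of radiation: {mass_loss:.3e} kg")
```

For a frame speed of one thousandth of c the neglected terms are smaller than the retained one by a factor of order v²/c², which is why they can be dropped in the derivation.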

Thus, the inertial mass of an object is diminished by the emission of light. The energy given up was part of the mass of the object. One can further conclude that every absorption or release of energy brings with it an increase or decrease in the mass of the object under consideration. Energy and mass seem to be just as equivalent as heat and mechanical energy.

Relativity theory has changed our views on light. Light is conceived not as a manifestation of the state of some hypothetical medium, but rather as an independent entity like matter. Moreover, this theory shares with the corpuscular theory of light the unusual property that light carries inertial mass from the emitting to the absorbing object. Relativity theory does not alter our conception of radiation's structure; in particular, it does not affect the distribution of energy in radiation-filled space. Nevertheless, with respect to this question, I believe that we stand at the beginning of a development of the greatest importance that cannot yet be surveyed. The statements that follow are largely my personal opinion, or the results of considerations that have not yet been checked enough by others. If I present them here in spite of their uncertainty, the reason is not an excessive faith in my own views, but rather the hope to induce one or another of you to deal with the questions considered.

Even without delving deeply into theory, one notices that our theory of light cannot explain certain fundamental properties of phenomena associated with light. Why does the color of light, and not its intensity, determine whether a certain photochemical reaction occurs? Why is light of short wavelength generally more effective chemically than light of longer wavelength? Why is the speed of photoelectrically produced cathode rays independent of the light's intensity? Why are higher temperatures (and, thus, higher molecular energies) required to add a short-wavelength component to the radiation emitted by an object?

The oscillation theory, in its present formulation, gives no answers to these questions. In particular, it is completely incomprehensible why cathode rays produced photoelectrically or by X-rays acquire such a considerable velocity independent of the light's intensity. The appearance of such great amounts of energy in molecular entities under the influence of a light source in which the energy is distributed so thinly (as we must assume for light radiation and X-rays, given the oscillation theory) drove competent physicists to take refuge in a rather far-out hypothesis. They assumed that light played merely a releasing role in the process, and that the molecular energies produced were of a radioactive nature. Since this hypothesis has already been abandoned, I won't bring any arguments against it.

The fundamental property of the oscillation theory that engenders these difficulties seems to me the following. In the kinetic theory of molecules, for every process in which only a few elementary particles participate (e.g., molecular collisions), the inverse process also exists. But that is not the case for the elementary processes of radiation. According to our prevailing theory, an oscillating ion generates a spherical wave that propagates outwards. The inverse process does not exist as an elementary process. A converging spherical wave is mathematically possible, to be sure; but to approach its realization requires a vast number of emitting entities. The elementary process of emission is not invertible. In this, I believe, our oscillation theory does not hit the mark. Newton's emission theory of light seems to contain more truth with respect to this point than the oscillation theory since, first of all, the energy given to a light particle is not scattered over infinite space, but remains available for an elementary process of absorption.

Consider the laws governing the production of secondary cathode radiation by X-rays. If primary cathode rays impinge on a metal plate P1, they produce X-rays. If these X-rays impinge on a second metal plate P2, cathode rays are again produced whose speed is of the same order as that of the primary cathode rays.

In his remarks after the talk, Johannes Stark confirmed that a single X-ray had traveled as far as ten meters and ejected an equal energy electron from P2.

As far as we know today, the speed of the secondary cathode rays depends neither on the distance between P1 and P2, nor on the intensity of the primary cathode rays, but rather entirely on the speed of the primary cathode rays. Let's assume that this is strictly true. What would happen if we reduced the intensity of the primary cathode rays or the size of P1 on which they fall, so that the impact of an electron of the primary cathode rays can be considered an isolated process? If the above is really true then, because of the independence of the secondary cathode rays' speed on the primary cathode rays' intensity, we must assume that an electron impinging on P1 will either cause no electrons to be produced at P2, or else a secondary emission of an electron whose speed is of the same order as that of the initial electron impinging on P1. In other words, the elementary process of radiation seems to occur in such a way that it does not scatter the energy of the primary electron in a spherical wave propagating in every direction, as the oscillation theory demands. Rather, at least a large part of this energy seems to be available at some place on P2, or somewhere else. The elementary process of the emission of radiation appears to be directional. Moreover, one has the impression that the production of X-rays at P1 and the production of secondary cathode rays at P2 are essentially inverse processes.

Therefore, the constitution of radiation seems to be different from what our oscillation theory predicts. The theory of thermal radiation has given important clues about this, mostly by the theory on which Planck based his radiation formula. Since I cannot assume that everyone is familiar with this theory, I will cover its essential points briefly.

A radiation of definite constitution occupies the interior of a cavity of temperature T, and is independent of the cavity's material composition. The cavity contains an energy density ρ dν for frequencies between ν and ν + dν. Finding ρ as a function of ν and T poses a problem. If an electric resonator of eigenfrequency ν and negligible damping occupies the cavity, the time average of the resonator's energy E as a function of ρ(ν) can be calculated from the electromagnetic theory of radiation. The problem is thereby reduced to that of determining E as a function of T. However, the latter problem can also be reduced to the following. Let the cavity contain very many resonators of frequency ν; how does the entropy of the system depend on its energy?

To resolve this question, Planck utilized the general relationship between entropy and the probability of a state, as derived by Boltzmann from his investigations in the theory of gases. In general

entropy = k log W

where k is a universal constant, and W is the probability of the state under consideration. This probability is measured by the number of "configurations", a number that counts the number of ways the state under consideration can be realized. In the case above, the state of the resonator system is defined by its total energy, so the question to be answered reads: how many ways can a given total energy be distributed among N resonators? To find this out, Planck divided the total energy into parts of equal energy ε. A configuration is determined by the number of parts ε allotted to each resonator. The number of such configurations that result in the given total energy is calculated and set equal to W.
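
The counting step can be sketched for a small system. The closed form used below, the standard combinatorial count of ways to place P indistinguishable energy elements into N resonators, is not written out in the text and is an assumption of this illustration, as are the values of N and P and the function names. A brute-force enumeration is included as a check.

```python
from itertools import product
from math import comb, log


def count_configurations(n_resonators: int, n_elements: int) -> int:
    """Ways to allot n_elements indistinguishable energy elements to n_resonators."""
    return comb(n_resonators + n_elements - 1, n_elements)


def count_by_enumeration(n_resonators: int, n_elements: int) -> int:
    """Brute-force check: try every allotment and keep those with the right total."""
    return sum(
        1
        for allotment in product(range(n_elements + 1), repeat=n_resonators)
        if sum(allotment) == n_elements
    )


N, P = 4, 6                      # 4 resonators sharing 6 energy elements (illustrative)
W = count_configurations(N, P)   # 84
entropy_over_k = log(W)          # entropy / k = log W

print(W, count_by_enumeration(N, P))
```

Each allotment, for example (3, 0, 2, 1), is one configuration in the sense of the text; the total energy is P ε in every case.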

From Wien's displacement law, which can be derived from thermodynamic principles, Planck concluded that ε must be set equal to hν, where h is independent of ν. In this way, he found his radiation formula, which agrees with all of our experience hitherto

ρ = 8πh (ν/c)³ (1 / (e^(hν/kT) - 1))
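
A minimal sketch of the formula as a function follows, together with its two limiting forms. The modern values of h, k and c, the sample frequencies, and the function names are assumptions of the illustration; the limiting expressions are the standard low- and high-frequency approximations and are not written out in the text.

```python
import math

H = 6.62607015e-34   # Planck constant, J s (modern value)
K = 1.380649e-23     # Boltzmann constant, J/K (modern value)
C = 299_792_458.0    # speed of light in vacuum, m/s


def planck_density(nu, T):
    """rho(nu, T) = 8 pi h (nu/c)^3 / (exp(h nu / k T) - 1)."""
    x = H * nu / (K * T)
    return 8.0 * math.pi * H * (nu / C) ** 3 / math.expm1(x)


def low_frequency_limit(nu, T):
    """Limit for h nu << k T: 8 pi nu^2 k T / c^3."""
    return 8.0 * math.pi * nu**2 * K * T / C**3


def high_frequency_limit(nu, T):
    """Limit for h nu >> k T (Wien form): 8 pi h (nu/c)^3 exp(-h nu / k T)."""
    return 8.0 * math.pi * H * (nu / C) ** 3 * math.exp(-H * nu / (K * T))


print(planck_density(1.0e9, 300.0) / low_frequency_limit(1.0e9, 300.0))
print(planck_density(6.0e14, 1700.0) / high_frequency_limit(6.0e14, 1700.0))
```

Both ratios come out very close to one, which shows how a single expression covers the regime in which energy is shared almost continuously and the regime in which the quantum hν is large compared with kT.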

It might seem that, in accordance with this derivation, Planck's radiation formula follows from the present electromagnetic theory. This is not the case, for the following reason. The number of configurations of which we were just speaking could be regarded as expressing the multiplicity of the possible distributions of the total energy among N resonators only if every imaginable distribution of energy appeared, at least approximately, among the configurations used to calculate W. This requires that the energy ε be small compared to the average resonator energy E for all ν. But simple calculation shows that, for a wavelength of 0.5 μm and an absolute temperature T = 1700 K, ε / E is not only not small compared to one, but is very large compared to one; the value is approximately 6.5 x 10⁷. This numerical example shows that the counting of the states must have gone awry if the resonator's energy can only assume the value zero or 6.5 x 10⁷ times its average energy (or a multiple thereof). Clearly, in such a process, only a vanishingly small part of those energy distributions that must be possible according to the fundamental principles of the theory is drawn upon to determine the entropy. Therefore, according to the fundamental principles of the theory, the number of configurations is not an expression for the probability in Boltzmann's sense. In my opinion, accepting Planck's theory means denying the precepts of our radiation theory.
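
The order of magnitude of ε / E can be reproduced with a short calculation. The sketch assumes Planck's mean resonator energy E = hν / (e^(hν/kT) - 1), so that ε / E = e^(hν/kT) - 1, and it uses modern values of the constants, which differ from those available in 1909; the function name is hypothetical. With these assumptions the result is roughly 2 x 10⁷, the same order of magnitude as the figure quoted above.

```python
import math

H = 6.62607015e-34   # Planck constant, J s (modern value)
K = 1.380649e-23     # Boltzmann constant, J/K (modern value)
C = 299_792_458.0    # speed of light in vacuum, m/s


def quantum_to_mean_energy(wavelength, T):
    """epsilon / E, with epsilon = h nu and E = h nu / (exp(h nu / k T) - 1)."""
    nu = C / wavelength
    return math.expm1(H * nu / (K * T))


ratio = quantum_to_mean_energy(0.5e-6, 1700.0)   # 0.5 micrometres, 1700 K
print(f"epsilon / E = {ratio:.1e}")
```

Because the ratio depends exponentially on hν/kT, small changes in the adopted constants shift the numerical value noticeably while leaving the conclusion untouched: a single quantum is tens of millions of times larger than the mean energy.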

I have already attempted to show that the present principles of the theory of radiation must be abandoned. In any event, it is unthinkable to reject Planck's theory because it does not fit those fundamental principles. This theory has led to a determination of the elementary quanta, which has been splendidly confirmed by the most recent measurements on alpha-particles. For the elementary quantum of electricity, Rutherford and Geiger obtained the mean value 4.65 x 10^-10, Regener 4.79 x 10^-10, while Planck, using his radiation theory, determined the intermediate value 4.69 x 10^-10 from the constants of the radiation formula.

Planck's theory leads to the following conjecture. If it is really true that a radiative resonator can only assume energy values that are multiples of hν, the obvious assumption is that the emission and absorption of light occurs only in these energy quantities. On the basis of this hypothesis, the light-quanta hypothesis, the questions raised above about the emission and absorption of light can be answered. As far as we know, the quantitative consequences of this light-quanta hypothesis are confirmed. This provokes the following question. Is it not thinkable that Planck's radiation formula is correct, but that another derivation could be found that does not rest on such a seemingly monstrous assumption as Planck's theory? Is it not possible to replace the light-quanta hypothesis with another assumption, with which one could do justice to known phenomena? If it is necessary to modify the theory's elements, couldn't one keep the propagation laws intact, and only change the conceptions of the elementary processes of emission and absorption?

To arrive at a certain answer to this question, let us proceed in the opposite direction of Planck in his radiation theory. Let us view Planck's radiation formula as correct, and ask ourselves whether something concerning the composition of radiation can be derived from it. Of two considerations that I have carried out, I wish to sketch one for you, which seems especially convincing to me because it can be imagined so clearly.

Let there be an ideal gas inside a cavity, as well as a solid plate that is free to move perpendicularly to its plane. Because of the irregularity of the collisions between the gas molecules and the plate, the latter is moved such that its average kinetic energy is one-third of the kinetic energy of a monatomic gas molecule. (This follows from statistical mechanics.) Besides this gas (which we can imagine as consisting of only a few molecules), we assume that there is also thermal radiation at the gas temperature. This will be the case if the walls of the cavity are also at the same temperature T, do not let radiation pass through and are not completely reflective. Furthermore, we assume that our plate is completely reflective on both sides. In this situation, both the gas and the radiation will affect the plate. If the plate is at rest, the pressures are equal. If the plate is moved, however, the pressure on the forward side pushing back is greater than its counterpart pushing in the opposite direction. Hence, there will be a net force that opposes the motion of the plate, and increases with the speed of the plate. Let us call this force the "radiation friction".

If we assume for the moment that we have taken into account all the radiation's mechanical influence, we can summarize as follows. Collisions with gas molecules at irregular intervals give the plate irregular momentum. The speed of the plate decreases continuously between two such collisions, due to the radiation friction, which transforms the kinetic energy of the plate into radiative energy. As a result, the energy of the gas molecules would be continuously transformed into the energy of radiation, until all the available energy had turned into energy of radiation. There would be no equilibrium between gas and radiation.

This argument is fallacious because, like gas pressure, the radiation pressure on the plate cannot be considered constant in time and free of irregular variations. To allow thermal equilibrium, the variations in the radiation pressure must be such that, on average, they compensate for the plate's loss of speed due to radiation friction. Remember that the average kinetic energy of the plate is one-third that of a monatomic gas molecule. Given the radiation law, the radiation friction can be calculated and, thence, the average amount of momentum the plate must receive from variations in the radiation pressure to maintain a statistical equilibrium.

The argument becomes even more interesting if we choose a plate that completely reflects only frequencies between ν and ν + dν, and is transparent to all other radiation. This gives the variations in the radiation pressure for that frequency band. I merely state the result here. Let Δ be the momentum communicated to the plate in time t by irregular variations in the radiation pressure, and let A be the area of the plate. The average value of Δ² is given by the expression

<Δ²> = (1/c) [ ρhν + (c³ ρ² / 8 π ν²) ] dν A t

First of all, the simplicity of this expression is noteworthy. Planck's theory seems to be the only one that agrees with experiment to within observational error and yields such a simple expression for the statistical properties of the radiation pressure.

In trying to understand this expression, one notices at once that it is the sum of two terms. It is as if two independent causes were working to produce variations in the radiation pressure. One can conclude from the fact that Δ² is proportional to A that the pressure variations for two neighboring regions are completely independent of each other, if the regions have dimensions large compared to the wavelength of the reflection frequency ν. The second term of the expression for Δ² can be explained by the oscillation theory. According to that theory, light rays of slightly different direction, frequency and polarization interfere with one another; variations in the radiation pressure correspond to uncorrelated occurrences of interference in the whole. Simple dimensional analysis shows that this variation must be of the form of the second term of our formula. Clearly, the oscillatory structure of radiation does indeed give rise to variations in the radiation pressure, as predicted.

How can the first term be explained? It is by no means negligible; it is completely dominant in the regime where Wien's radiation formula holds. For a wavelength of 0.5 μm and a temperature T = 1700 K, this term is approximately 6.5 x 10⁷ times larger than the second. It turns out that the first term of our formula results from assuming that radiation consists of localized groupings of energy hν that are reflected and move through space independently of one another — an idea presented by the most primitive picture of the light-quanta hypothesis.
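
The relative weight of the two terms can be evaluated directly from Planck's formula. The sketch below computes the two bracketed terms, ρhν and c³ρ²/(8πν²), for a visible and a low frequency at the same temperature. Modern constants are used, the low frequency is an arbitrary contrast case, and the function names are hypothetical; none of these come from the text.

```python
import math

H = 6.62607015e-34   # Planck constant, J s (modern value)
K = 1.380649e-23     # Boltzmann constant, J/K (modern value)
C = 299_792_458.0    # speed of light in vacuum, m/s


def planck_density(nu, T):
    """Planck's energy density rho(nu, T)."""
    return 8.0 * math.pi * H * (nu / C) ** 3 / math.expm1(H * nu / (K * T))


def fluctuation_terms(nu, T):
    """The two bracketed terms: (rho h nu, c^3 rho^2 / (8 pi nu^2))."""
    rho = planck_density(nu, T)
    first_term = rho * H * nu
    second_term = C**3 * rho**2 / (8.0 * math.pi * nu**2)
    return first_term, second_term


def term_ratio(nu, T):
    """First term divided by second term."""
    first_term, second_term = fluctuation_terms(nu, T)
    return first_term / second_term


T = 1700.0
nu_visible = C / 0.5e-6   # wavelength 0.5 micrometres
nu_low = 1.0e9            # low frequency chosen for contrast (illustrative)

print(f"visible: first/second = {term_ratio(nu_visible, T):.1e}")
print(f"low frequency: first/second = {term_ratio(nu_low, T):.1e}")
```

Substituting Planck's formula shows that the ratio reduces to e^(hν/kT) - 1. The first term therefore dominates wherever hν is large compared with kT, and the second dominates at low frequencies, where the oscillation theory is adequate.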

Therefore, I believe one must conclude the following from the above formula derived from Planck's radiation formula: In addition to the spatial irregularities in the distribution of radiation's energy that arise from the oscillation theory, there are also other irregularities in the same spatial distribution that completely dominate the first-mentioned irregularities when the energy density of the radiation is small. I add that another argument involving the spatial distribution of the energy gives exactly the same results as those given above.

As far as I know, no mathematical theory has been advanced that does justice to both the oscillatory structure of radiation and its quantum structure, the latter of which we derived from the first term of the above formula. The difficulty lies in the fact that the fluctuation properties of radiation, as expressed in the above formula, offer few reference points for setting up such a theory. One might imagine a situation in which diffraction and interference were still unknown, but one knew that the average magnitude of the irregular variations of the radiation pressure was determined by the second term of the above equation, where ν is a parameter of unknown meaning that determines the color. On this basis, who would have enough imagination to construct an oscillatory theory of light?

In any case, this conception seems to me the most natural: that the manifestation of light's electromagnetic waves is bound to singular points, like the manifestation of electrostatic fields in the theory of the electron. It cannot be ruled out that, in such a theory, the entire energy of the electromagnetic field could be viewed as localized in these singularities, just as in the old theory of action at a distance. I picture each such singular point as surrounded by a field that has essentially the same character as a plane wave, and whose amplitude decreases with the distance from the singular point. If many such singularities are separated by a distance small with respect to the dimensions of the field of one singular point, their fields will be superimposed, and will form in their totality an oscillating field that is only slightly different from the oscillating field in our present electromagnetic theory of light. Of course, it need not be emphasized that such a picture is worthless unless it leads to an exact theory. I only wished to illustrate that the two structural properties of radiation according to Planck's formula (oscillation structure and quantum structure) should not be considered incompatible with one another.
