A Stubbornly Persistent Illusion


by Stephen Hawking

The method hitherto employed for laying co-ordinates into the space-time continuum in a definite manner thus breaks down, and there seems to be no other way which would allow us to adapt systems of co-ordinates to the four-dimensional universe so that we might expect from their application a particularly simple formulation of the laws of nature. So there is nothing for it but to regard all imaginable systems of co-ordinates, on principle, as equally suitable for the description of nature. This comes to requiring that:—

The general laws of nature are to be expressed by equations which hold good for all systems of co-ordinates, that is, are co-variant with respect to any substitutions whatever (generally co-variant).

It is clear that a physical theory which satisfies this postulate will also be suitable for the general postulate of relativity. For the sum of all substitutions in any case includes those which correspond to all relative motions of three-dimensional systems of co-ordinates. That this requirement of general co-variance, which takes away from space and time the last remnant of physical objectivity, is a natural one, will be seen from the following reflexion. All our space-time verifications invariably amount to a determination of space-time coincidences. If, for example, events consisted merely in the motion of material points, then ultimately nothing would be observable but the meetings of two or more of these points. Moreover, the results of our measurings are nothing but verifications of such meetings of the material points of our measuring instruments with other material points, coincidences between the hands of a clock and points on the clock dial, and observed point-events happening at the same place at the same time.

The introduction of a system of reference serves no other purpose than to facilitate the description of the totality of such coincidences. We allot to the universe four space-time variables x1, x2, x3, x4 in such a way that for every point-event there is a corresponding system of values of the variables x1 . . . x4. To two coincident point-events there corresponds one system of values of the variables x1 . . . x4, i. e. coincidence is characterized by the identity of the co-ordinates. If, in place of the variables x1 . . . x4, we introduce functions of them, x′1, x′2, x′3, x′4, as a new system of co-ordinates, so that the systems of values are made to correspond to one another without ambiguity, the equality of all four co-ordinates in the new system will also serve as an expression for the space-time coincidence of the two point-events. As all our physical experience can be ultimately reduced to such coincidences, there is no immediate reason for preferring certain systems of co-ordinates to others, that is to say, we arrive at the requirement of general co-variance.

§ 4. THE RELATION OF THE FOUR CO-ORDINATES TO MEASUREMENT IN SPACE AND TIME

It is not my purpose in this discussion to represent the general theory of relativity as a system that is as simple and logical as possible, and with the minimum number of axioms; but my main object is to develop this theory in such a way that the reader will feel that the path we have entered upon is psychologically the natural one, and that the underlying assumptions will seem to have the highest possible degree of security. With this aim in view let it now be granted that:—

For infinitely small four-dimensional regions the theory of relativity in the restricted sense is appropriate, if the coordinates are suitably chosen.

For this purpose we must choose the acceleration of the infinitely small (“local”) system of co-ordinates so that no gravitational field occurs; this is possible for an infinitely small region. Let X1, X2, X3, be the co-ordinates of space, and X4 the appertaining co-ordinate of time measured in the appropriate unit.* If a rigid rod is imagined to be given as the unit measure, the co-ordinates, with a given orientation of the system of co-ordinates, have a direct physical meaning in the sense of the special theory of relativity. By the special theory of relativity the expression

$$ds^2 = -dX_1^2 - dX_2^2 - dX_3^2 + dX_4^2 \tag{1}$$

then has a value which is independent of the orientation of the local system of co-ordinates, and is ascertainable by measurements of space and time. The magnitude of the linear element pertaining to points of the four-dimensional continuum in infinite proximity, we call ds. If the $ds^2$ belonging to the element $dX_1 \dots dX_4$ is positive, we follow Minkowski in calling it time-like; if it is negative, we call it space-like.

To the “linear element” in question, or to the two infinitely proximate point-events, there will also correspond definite differentials $dx_1 \dots dx_4$ of the four-dimensional co-ordinates of any chosen system of reference. If this system, as well as the “local” system, is given for the region under consideration, the $dX_\nu$ will allow themselves to be represented here by definite linear homogeneous expressions of the $dx_\sigma$:

$$dX_\nu = \sum_\sigma \alpha_{\nu\sigma}\, dx_\sigma \tag{2}$$

Inserting these expressions in (1), we obtain

$$ds^2 = \sum_{\sigma\tau} g_{\sigma\tau}\, dx_\sigma\, dx_\tau \tag{3}$$

where the $g_{\sigma\tau}$ will be functions of the $x_\sigma$. These can no longer be dependent on the orientation and the state of motion of the “local” system of co-ordinates, for $ds^2$ is a quantity ascertainable by rod-clock measurement of point-events infinitely proximate in space-time, and defined independently of any particular choice of co-ordinates. The $g_{\sigma\tau}$ are to be chosen here so that $g_{\sigma\tau} = g_{\tau\sigma}$; the summation is to extend over all values of σ and τ, so that the sum consists of 4 × 4 terms, of which twelve are equal in pairs.
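The passage from (1) to (3) can be checked numerically at a single point. The sketch below is only an illustration: it assumes that expression (1) has the Minkowski form with signs (-, -, -, +), and the coefficient array `alpha` is a made-up example, not something taken from the text.

```python
import numpy as np

# Local special-relativistic form assumed for (1): signs (-, -, -, +).
eta = np.diag([-1.0, -1.0, -1.0, 1.0])

# Hypothetical coefficients alpha[nu, sigma] relating dX_nu to dx_sigma.
alpha = np.eye(4) + 0.1 * np.arange(16.0).reshape(4, 4)

# g_{sigma tau} = sum_nu eta_{nu nu} * alpha_{nu sigma} * alpha_{nu tau}
g = alpha.T @ eta @ alpha

dx = np.array([0.3, -0.2, 0.1, 0.5])  # arbitrary co-ordinate differentials
dX = alpha @ dx                        # linear homogeneous relation (2)
ds2_local = dX @ eta @ dX              # evaluated in the local system
ds2_general = dx @ g @ dx              # evaluated with the g_{sigma tau}

# Components with sigma <= tau: the independent ones when g is symmetric.
independent = len(np.triu_indices(4)[0])
```

Both evaluations of the squared linear element agree, and the symmetry of `g` leaves ten independent components out of sixteen.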

The case of the ordinary theory of relativity arises out of the case here considered, if it is possible, by reason of the particular relations of the gστ in a finite region, to choose the system of reference in the finite region in such a way that the gστ assume the constant values

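$$g_{\sigma\tau} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & +1 \end{pmatrix} \tag{4}$$

We shall find hereafter that the choice of such co-ordinates is, in general, not possible for a finite region.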

From the considerations of § 2 and § 3 it follows that the quantities $g_{\tau\sigma}$ are to be regarded from the physical standpoint as the quantities which describe the gravitational field in relation to the chosen system of reference. For, if we now assume the special theory of relativity to apply to a certain four-dimensional region with the co-ordinates properly chosen, then the $g_{\sigma\tau}$ have the values given in (4). A free material point then moves, relatively to this system, with uniform motion in a straight line. Then if we introduce new space-time co-ordinates $x_1, x_2, x_3, x_4$, by means of any substitution we choose, the $g_{\sigma\tau}$ in this new system will no longer be constants, but functions of space and time. At the same time the motion of the free material point will present itself in the new co-ordinates as a curvilinear non-uniform motion, and the law of this motion will be independent of the nature of the moving particle. We shall therefore interpret this motion as a motion under the influence of a gravitational field. We thus find the occurrence of a gravitational field connected with a space-time variability of the $g_{\sigma\tau}$. So, too, in the general case, when we are no longer able by a suitable choice of co-ordinates to apply the special theory of relativity to a finite region, we shall hold fast to the view that the $g_{\sigma\tau}$ describe the gravitational field.
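As a rough illustration of how a substitution makes the metric components variable, the sketch below uses a hypothetical uniformly accelerated substitution X1 = x1 + a·x4²/2 with X2, X3, X4 unchanged. The substitution, the parameter `a`, and the function name are assumptions chosen for the example.

```python
import numpy as np

# Constant values assumed for the special-relativistic case: (-, -, -, +).
eta = np.diag([-1.0, -1.0, -1.0, 1.0])

def metric_in_accelerated_coords(x4, a=0.5):
    """Return g_{sigma tau} for X1 = x1 + a*x4**2/2, X2 = x2, X3 = x3, X4 = x4."""
    jac = np.eye(4)       # jac[nu, sigma] = dX_nu / dx_sigma
    jac[0, 3] = a * x4    # dX1/dx4, the only entry that depends on position
    return jac.T @ eta @ jac

g_start = metric_in_accelerated_coords(0.0)  # equals eta at x4 = 0
g_later = metric_in_accelerated_coords(2.0)  # components now differ from eta
```

A free point at rest in the original system, X1 = const, follows x1 = const − a·x4²/2 in the new co-ordinates, a non-uniform motion that is the same for every particle.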

Thus, according to the general theory of relativity, gravitation occupies an exceptional position with regard to other forces, particularly the electromagnetic forces, since the ten functions representing the gravitational field at the same time define the metrical properties of the space measured.

B. MATHEMATICAL AIDS TO THE FORMULATION OF GENERALLY COVARIANT EQUATIONS

Having seen in the foregoing that the general postulate of relativity leads to the requirement that the equations of physics shall be covariant in the face of any substitution of the co-ordinates x1 . . . x4, we have to consider how such generally covariant equations can be found. We now turn to this purely mathematical task, and we shall find that in its solution a fundamental rôle is played by the invariant ds given in equation (3), which, borrowing from Gauss’s theory of surfaces, we have called the “linear element.”

The fundamental idea of this general theory of covariants is the following: Let certain things (“tensors”) be defined with respect to any system of co-ordinates by a number of functions of the co-ordinates, called the “components” of the tensor. There are then certain rules by which these components can be calculated for a new system of co-ordinates, if they are known for the original system of co-ordinates, and if the transformation connecting the two systems is known. The things hereafter called tensors are further characterized by the fact that the equations of transformation for their components are linear and homogeneous. Accordingly, all the components in the new system vanish, if they all vanish in the original system. If, therefore, a law of nature is expressed by equating all the components of a tensor to zero, it is generally covariant. By examining the laws of the formation of tensors, we acquire the means of formulating generally covariant laws.

§ 5. CONTRAVARIANT AND COVARIANT FOUR-VECTORS

Contravariant Four-vectors. The linear element is defined by the four “components” $dx_\nu$, for which the law of transformation is expressed by the equation

$$dx'_\sigma = \sum_\nu \frac{\partial x'_\sigma}{\partial x_\nu}\, dx_\nu \tag{5}$$

The $dx'_\sigma$ are expressed as linear and homogeneous functions of the $dx_\nu$. Hence we may look upon these co-ordinate differentials as the components of a “tensor” of the particular kind which we call a contravariant four-vector. Any thing which is defined relatively to the system of co-ordinates by four quantities $A^\nu$, and which is transformed by the same law

$$A'^\sigma = \sum_\nu \frac{\partial x'_\sigma}{\partial x_\nu}\, A^\nu \tag{5a}$$

we also call a contravariant four-vector. From (5a) it follows at once that the sums $A^\sigma \pm B^\sigma$ are also components of a four-vector, if $A^\sigma$ and $B^\sigma$ are such. Corresponding relations hold for all “tensors” subsequently to be introduced. (Rule for the addition and subtraction of tensors.)
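The addition rule follows from the transformation being linear and homogeneous, which a short numerical sketch can make concrete. The Jacobian `J` below is an arbitrary invertible example, assumed for illustration only.

```python
import numpy as np

# Hypothetical Jacobian J[sigma, nu] = dx'_sigma / dx_nu at one point.
J = np.eye(4) + 0.1 * np.arange(16.0).reshape(4, 4)

def transform_contravariant(vec):
    """Apply A'^sigma = sum_nu (dx'_sigma / dx_nu) A^nu."""
    return J @ vec

A = np.array([1.0, 2.0, 0.0, -1.0])
B = np.array([0.5, -1.0, 3.0, 2.0])

sum_then_transform = transform_contravariant(A + B)
transform_then_sum = transform_contravariant(A) + transform_contravariant(B)

# Homogeneity: a vector that vanishes in one system vanishes in every system.
zero_image = transform_contravariant(np.zeros(4))
```

The two orders of operation give the same components, so the sum of two four-vectors transforms as a four-vector.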

Covariant Four-vectors. We call four quantities $A_\nu$ the components of a covariant four-vector, if for any arbitrary choice of the contravariant four-vector $B^\nu$

$$\sum_\nu A_\nu B^\nu = \text{Invariant} \tag{6}$$

The law of transformation of a covariant four-vector follows from this definition. For if we replace $B^\nu$ on the right-hand side of the equation

$$\sum_\sigma A'_\sigma B'^\sigma = \sum_\nu A_\nu B^\nu$$

by the expression resulting from the inversion of (5a),

$$\sum_\sigma \frac{\partial x_\nu}{\partial x'_\sigma}\, B'^\sigma$$

we obtain

$$\sum_\sigma B'^\sigma \sum_\nu \frac{\partial x_\nu}{\partial x'_\sigma}\, A_\nu = \sum_\sigma B'^\sigma A'_\sigma$$

Since this equation is true for arbitrary values of the $B'^\sigma$, it follows that the law of transformation is

$$A'_\sigma = \sum_\nu \frac{\partial x_\nu}{\partial x'_\sigma}\, A_\nu \tag{7}$$

Note on a Simplified Way of Writing the Expressions. A glance at the equations of this paragraph shows that there is always a summation with respect to the indices which occur twice under a sign of summation (e.g. the index ν in (5)), and only with respect to indices which occur twice. It is therefore possible, without loss of clearness, to omit the sign of summation. In its place we introduce the convention: If an index occurs twice in one term of an expression, it is always to be summed unless the contrary is expressly stated.

The difference between covariant and contravariant four-vectors lies in the law of transformation ((7) or (5) respectively). Both forms are tensors in the sense of the general remark above. Therein lies their importance. Following Ricci and Levi-Civita, we denote the contravariant character by placing the index above, the covariant by placing it below.
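The two transformation laws can be compared side by side in a small sketch. `np.einsum` sums over repeated index letters, which mirrors the summation convention just introduced. The Jacobian and the component values are arbitrary assumptions for the example.

```python
import numpy as np

# Hypothetical Jacobian and its inverse at one point.
J = np.eye(4) + 0.1 * np.arange(16.0).reshape(4, 4)  # J[s, n] = dx'_s / dx_n
J_inv = np.linalg.inv(J)                              # J_inv[n, s] = dx_n / dx'_s

A_cov = np.array([2.0, -1.0, 0.5, 3.0])  # covariant components A_nu
B_con = np.array([1.0, 4.0, -2.0, 0.5])  # contravariant components B^nu

# Contravariant law: B'^s = (dx'_s / dx_n) B^n
B_con_new = np.einsum("sn,n->s", J, B_con)
# Covariant law: A'_s = (dx_n / dx'_s) A_n
A_cov_new = np.einsum("ns,n->s", J_inv, A_cov)

# The contraction A_nu B^nu has the same value in both systems.
scalar_old = np.einsum("n,n->", A_cov, B_con)
scalar_new = np.einsum("s,s->", A_cov_new, B_con_new)
```

The two laws use mutually inverse coefficient arrays, which is why the contracted product stays invariant.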

§ 6. TENSORS OF THE SECOND AND HIGHER RANKS

Contravariant Tensors. If we form all the sixteen products $A^{\mu\nu}$ of the components $A^\mu$ and $B^\nu$ of two contravariant four-vectors

$$A^{\mu\nu} = A^\mu B^\nu \tag{8}$$

then by (8) and (5a) $A^{\mu\nu}$ satisfies the law of transformation

$$A'^{\sigma\tau} = \frac{\partial x'_\sigma}{\partial x_\mu} \frac{\partial x'_\tau}{\partial x_\nu}\, A^{\mu\nu} \tag{9}$$

We call a thing which is described relatively to any system of reference by sixteen quantities satisfying the law of transformation (9) a contravariant tensor of the second rank. Not every such tensor allows itself to be formed in accordance with (8) from two four-vectors, but it is easily shown that any given sixteen $A^{\mu\nu}$ can be represented as the sums of the $A^\mu B^\nu$ of four appropriately selected pairs of four-vectors. Hence we can prove nearly all the laws which apply to the tensor of the second rank defined by (9) in the simplest manner by demonstrating them for the special tensors of the type (8).
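One way to see that four pairs suffice is to pair each column of the component array with a unit vector. This is only one possible choice of pairs, assumed here for illustration, and the sixteen sample values are arbitrary.

```python
import numpy as np

# Arbitrary sixteen components T[mu, nu], not a product of two vectors.
T = np.arange(16.0).reshape(4, 4) ** 2 - 7.0
basis = np.eye(4)

# Pair k: first vector is the k-th column of T, second is the k-th unit vector.
pairs = [(T[:, k], basis[k]) for k in range(4)]

# Sum of the four outer products A^mu B^nu rebuilds every component.
rebuilt = sum(np.outer(a, b) for a, b in pairs)
```

Since each term has the product form, a property proved for products and preserved under addition carries over to the general second-rank tensor.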

Contravariant Tensors of Any Rank. It is clear that, on the lines of (8) and (9), contravariant tensors of the third and higher ranks may also be defined with $4^3$ components, and so on. In the same way it follows from (8) and (9) that the contravariant four-vector may be taken in this sense as a contravariant tensor of the first rank.

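Covariant Tensors. On the other hand, if we take the sixteen products $A_{\mu\nu}$ of two covariant four-vectors $A_\mu$ and $B_\nu$,

$$A_{\mu\nu} = A_\mu B_\nu \tag{10}$$

the law of transformation for these is

$$A'_{\sigma\tau} = \frac{\partial x_\mu}{\partial x'_\sigma} \frac{\partial x_\nu}{\partial x'_\tau}\, A_{\mu\nu} \tag{11}$$

This law of transformation defines the covariant tensor of the second rank. All our previous remarks on contravariant tensors apply equally to covariant tensors.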

NOTE.—It is convenient to treat the scalar (or invariant) both as a contravariant and a covariant tensor of zero rank.

Mixed Tensors. We may also define a tensor of the second rank of the type

$$A_\mu^\nu = A_\mu B^\nu \tag{12}$$

which is covariant with respect to the index μ, and contravariant with respect to the index ν. Its law of transformation is

$$A'^{\tau}_{\sigma} = \frac{\partial x'_\tau}{\partial x_\nu} \frac{\partial x_\mu}{\partial x'_\sigma}\, A_\mu^\nu \tag{13}$$

Naturally there are mixed tensors with any number of indices of covariant character, and any number of indices of contravariant character. Covariant and contravariant tensors may be looked upon as special cases of mixed tensors.

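Symmetrical Tensors. A contravariant, or a covariant tensor, of the second or higher rank is said to be symmetrical if two components, which are obtained the one from the other by the interchange of two indices, are equal. The tensor $A^{\mu\nu}$, or the tensor $A_{\mu\nu}$, is thus symmetrical if for any combination of the indices μ, ν,

$$A^{\mu\nu} = A^{\nu\mu} \tag{14}$$

or respectively,

$$A_{\mu\nu} = A_{\nu\mu} \tag{14a}$$

It has to be proved that the symmetry thus defined is a property which is independent of the system of reference. It follows in fact from (9), when (14) is taken into consideration, that

$$A'^{\sigma\tau} = \frac{\partial x'_\sigma}{\partial x_\mu} \frac{\partial x'_\tau}{\partial x_\nu}\, A^{\mu\nu} = \frac{\partial x'_\sigma}{\partial x_\mu} \frac{\partial x'_\tau}{\partial x_\nu}\, A^{\nu\mu} = \frac{\partial x'_\sigma}{\partial x_\nu} \frac{\partial x'_\tau}{\partial x_\mu}\, A^{\mu\nu} = A'^{\tau\sigma}$$

The last equation but one depends upon the interchange of the summation indices μ and ν, i.e. merely on a change of notation.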
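The invariance of the symmetry property can also be observed numerically. The sketch below applies the second-rank contravariant law to a symmetric array, using an arbitrary invertible Jacobian assumed for the example.

```python
import numpy as np

# Hypothetical Jacobian J[sigma, mu] = dx'_sigma / dx_mu at one point.
J = np.eye(4) + 0.1 * np.arange(16.0).reshape(4, 4)

def transform_contravariant_rank2(T):
    """Apply A'^{st} = (dx'_s / dx_m) (dx'_t / dx_n) A^{mn}."""
    return np.einsum("sm,tn,mn->st", J, J, T)

raw = np.cos(np.arange(16.0)).reshape(4, 4)
symmetric = raw + raw.T  # symmetric components in the original system

symmetric_new = transform_contravariant_rank2(symmetric)
```

The transformed array is again equal to its own transpose, whatever invertible `J` is used.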

Antisymmetrical Tensors. A contravariant or a covariant tensor of the second, third, or fourth rank is said to be antisymmetrical if two components, which are obtained the one from the other by the interchange of two indices, are equal and of opposite sign. The tensor $A^{\mu\nu}$, or the tensor $A_{\mu\nu}$, is therefore antisymmetrical, if always

$$A^{\mu\nu} = -A^{\nu\mu} \tag{15}$$

or respectively,

$$A_{\mu\nu} = -A_{\nu\mu} \tag{15a}$$

Of the sixteen components $A^{\mu\nu}$, the four components $A^{\mu\mu}$ vanish; the rest are equal and of opposite sign in pairs, so that there are only six components numerically different (a six-vector). Similarly we see that the antisymmetrical tensor of the third rank $A^{\mu\nu\sigma}$ has only four numerically different components, while the antisymmetrical tensor $A^{\mu\nu\sigma\tau}$ has only one. There are no antisymmetrical tensors of higher rank than the fourth in a continuum of four dimensions.
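The component counts can be reproduced by counting strictly increasing index combinations, since every other non-vanishing component differs from one of these only by sign. The function name below is a hypothetical helper for the illustration.

```python
from itertools import combinations

DIM = 4  # dimensions of the space-time continuum

def independent_antisymmetric_components(rank, dim=DIM):
    """Count index tuples mu < nu < ..., one per numerically distinct component."""
    return len(list(combinations(range(dim), rank)))

counts = {rank: independent_antisymmetric_components(rank) for rank in range(2, 6)}
```

The resulting counts are 6, 4 and 1 for ranks two to four, and 0 for rank five, because five indices taken from four values must contain a repetition.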

§ 7. MULTIPLICATION OF TENSORS

Outer Multiplication of Tensors. We obtain from the components of a tensor of rank n and of a tensor of rank m the components of a tensor of rank n + m by multiplying each component of the one tensor by each component of the other. Thus, for example, the tensors T arise out of the tensors A and B of different kinds,

$$T_{\mu\nu\sigma} = A_{\mu\nu} B_\sigma, \qquad T^{\alpha\beta\gamma\delta} = A^{\alpha\beta} B^{\gamma\delta}, \qquad T^{\gamma\delta}_{\alpha\beta} = A_{\alpha\beta} B^{\gamma\delta}$$

The proof of the tensor character of T is given directly by the representations (8), (10), (12), or by the laws of transformation (9), (11), (13). The equations (8), (10), (12) are themselves examples of outer multiplication of tensors of the first rank.
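A minimal sketch of outer multiplication, taking as an example a covariant tensor of the second rank and a covariant four-vector. The Jacobian and component values are arbitrary assumptions, and the check compares "multiply, then transform" with "transform, then multiply".

```python
import numpy as np

J = np.eye(4) + 0.1 * np.arange(16.0).reshape(4, 4)  # J[s, n] = dx'_s / dx_n
J_inv = np.linalg.inv(J)                              # J_inv[n, s] = dx_n / dx'_s

A = np.sin(np.arange(16.0)).reshape(4, 4)  # covariant rank two, A[mu, nu]
B = np.array([1.0, -2.0, 0.5, 3.0])        # covariant rank one, B[sigma]

# Outer product: every component of A times every component of B.
T = np.einsum("mn,s->mns", A, B)

# Covariant law applied to each factor separately...
A_new = np.einsum("ma,nb,mn->ab", J_inv, J_inv, A)
B_new = np.einsum("sc,s->c", J_inv, B)
# ...and to the product as a whole, one coefficient array per index.
T_new = np.einsum("ma,nb,sc,mns->abc", J_inv, J_inv, J_inv, T)
```

The product of the transformed factors reproduces `T_new`, so the outer product transforms as a covariant tensor of rank three.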

“Contraction” of a Mixed Tensor. From any mixed tensor we may form a tensor whose rank is less by two, by equating an index of covariant with one of contravariant character, and summing with respect to this index (“contraction”). Thus, for example, from the mixed tensor of the fourth rank $A^{\gamma\delta}_{\alpha\beta}$, we obtain the mixed tensor of the second rank,

$$A^{\delta}_{\beta} = A^{\alpha\delta}_{\alpha\beta} \left( = \sum_\alpha A^{\alpha\delta}_{\alpha\beta} \right)$$

and from this, by a second contraction, the tensor of zero rank,

$$A = A^{\beta}_{\beta} = A^{\alpha\beta}_{\alpha\beta}$$

The proof that the result of contraction really possesses the tensor character is given either by the representation of a tensor according to the generalization of (12) in combination with (6), or by the generalization of (13).
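The tensor character of a contraction can be checked numerically: contracting and then transforming should agree with transforming and then contracting. In this sketch the array `M` holds arbitrary sample components with two covariant indices followed by two contravariant ones, a storage order assumed for the example.

```python
import numpy as np

J = np.eye(4) + 0.1 * np.arange(16.0).reshape(4, 4)  # J[s, n] = dx'_s / dx_n
J_inv = np.linalg.inv(J)                              # J_inv[n, s] = dx_n / dx'_s

# M[a, b, c, d]: a, b covariant indices; c, d contravariant indices.
M = np.sin(np.arange(256.0)).reshape(4, 4, 4, 4)

# Contract the first covariant index with the first contravariant index.
C = np.einsum("abad->bd", M)

# Transform the rank-four tensor, then contract in the new system.
M_new = np.einsum("ai,bj,kc,ld,abcd->ijkl", J_inv, J_inv, J, J, M)
C_from_new = np.einsum("ijil->jl", M_new)

# Transform the contracted tensor directly as a mixed tensor of rank two.
C_new = np.einsum("bj,ld,bd->jl", J_inv, J, C)

# A second contraction gives a tensor of zero rank.
scalar_old = np.einsum("bb->", C)
scalar_new = np.einsum("jj->", C_new)
```

The agreement rests on the two coefficient arrays attached to the contracted pair being inverse to each other, so they cancel in the sum.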

Inner and Mixed Multiplication of Tensors.—These consist in a combination of outer multiplication with contraction.

Examples. From the covariant tensor of the second rank $A_{\mu\nu}$ and the contravariant tensor of the first rank $B^\sigma$ we form by outer multiplication the mixed tensor

$$D^{\sigma}_{\mu\nu} = A_{\mu\nu} B^\sigma$$

On contraction with respect to the indices ν and σ, we obtain the covariant four-vector

$$D_\mu = D^{\nu}_{\mu\nu} = A_{\mu\nu} B^\nu$$

This we call the inner product of the tensors $A_{\mu\nu}$ and $B^\sigma$. Analogously we form from the tensors $A_{\mu\nu}$ and $B^{\sigma\tau}$, by outer multiplication and double contraction, the inner product $A_{\mu\nu} B^{\mu\nu}$. By outer multiplication and one contraction, we obtain from $A_{\mu\nu}$ and $B^{\sigma\tau}$ the mixed tensor of the second rank $D^{\tau}_{\mu} = A_{\mu\nu} B^{\nu\tau}$. This operation may be aptly characterized as a mixed one, being “outer” with respect to the indices μ and τ, and “inner” with respect to the indices ν and σ.
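The three kinds of product described here map directly onto index strings in `np.einsum`. The component values below are arbitrary assumptions used only to exercise the operations.

```python
import numpy as np

A = np.sin(np.arange(16.0)).reshape(4, 4)    # covariant rank two, A[mu, nu]
B_vec = np.array([1.0, -2.0, 0.5, 3.0])      # contravariant rank one, B[sigma]
B_two = np.cos(np.arange(16.0)).reshape(4, 4)  # contravariant rank two, B[sigma, tau]

# Outer multiplication: mixed tensor of rank three, D[mu, nu, sigma].
D_outer = np.einsum("mn,s->mns", A, B_vec)

# Contraction over nu and sigma: inner product, a covariant four-vector.
D_inner = np.einsum("mnn->m", D_outer)

# Outer multiplication with double contraction: a scalar.
double_inner = np.einsum("mn,mn->", A, B_two)

# Mixed multiplication: outer in mu and tau, inner in nu and sigma.
D_mixed = np.einsum("mn,nt->mt", A, B_two)
```

In matrix language `D_inner` is `A @ B_vec` and `D_mixed` is `A @ B_two`, though the index notation extends to ranks where no matrix analogue exists.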

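We now prove a proposition which is often useful as evidence of tensor character. From what has just been explained, $A_{\mu\nu} B^{\mu\nu}$ is a scalar if $A_{\mu\nu}$ and $B^{\sigma\tau}$ are tensors. But we may also make the following assertion: If $A_{\mu\nu} B^{\mu\nu}$ is a scalar for any choice of the tensor $B^{\mu\nu}$, then $A_{\mu\nu}$ has tensor character. For, by hypothesis, for any substitution,

$$A'_{\sigma\tau} B'^{\sigma\tau} = A_{\mu\nu} B^{\mu\nu}$$

But by an inversion of (9)

$$B^{\mu\nu} = \frac{\partial x_\mu}{\partial x'_\sigma} \frac{\partial x_\nu}{\partial x'_\tau}\, B'^{\sigma\tau}$$

This, inserted in the above equation, gives

$$\left( A'_{\sigma\tau} - \frac{\partial x_\mu}{\partial x'_\sigma} \frac{\partial x_\nu}{\partial x'_\tau}\, A_{\mu\nu} \right) B'^{\sigma\tau} = 0$$

This can only be satisfied for arbitrary values of $B'^{\sigma\tau}$ if the bracket vanishes. The result then follows by equation (11). This rule applies correspondingly to tensors of any rank and character, and the proof is analogous in all cases.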
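The argument can be imitated numerically by choosing, in the new system, the sixteen tensors that have a single component equal to one. Demanding that the contracted product be a scalar then fixes each new component of A in turn. The Jacobian and the components of `A` are arbitrary assumptions for the sketch.

```python
import numpy as np

J = np.eye(4) + 0.1 * np.arange(16.0).reshape(4, 4)  # J[s, n] = dx'_s / dx_n
J_inv = np.linalg.inv(J)                              # J_inv[n, s] = dx_n / dx'_s

A = np.sin(np.arange(16.0)).reshape(4, 4)  # components A[mu, nu], original system

A_new = np.zeros((4, 4))
for s in range(4):
    for t in range(4):
        B_new = np.zeros((4, 4))
        B_new[s, t] = 1.0  # contravariant tensor with one unit component
        # Inverse of the contravariant law gives B in the original system.
        B_old = np.einsum("ms,nt,st->mn", J_inv, J_inv, B_new)
        # Scalar condition: A'_{st} * 1 must equal A_{mn} B^{mn}.
        A_new[s, t] = np.einsum("mn,mn->", A, B_old)

# Components predicted by the covariant law of transformation.
expected = np.einsum("ms,nt,mn->st", J_inv, J_inv, A)
```

The components forced by the scalar condition coincide with those given by the covariant law, which is the content of the proposition.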

The rule may also be demonstrated in this form: If $B^\mu$ and $C^\nu$ are any vectors, and if, for all values of these, the inner product $A_{\mu\nu} B^\mu C^\nu$ is a scalar, then $A_{\mu\nu}$ is a covariant tensor. This latter proposition also holds good even if only the more special assertion is correct, that with any choice of the four-vector $B^\mu$ the inner product $A_{\mu\nu} B^\mu B^\nu$ is a scalar, if in addition it is known that $A_{\mu\nu}$ satisfies the condition of symmetry $A_{\mu\nu} = A_{\nu\mu}$. For by the method given above we prove the tensor character of $(A_{\mu\nu} + A_{\nu\mu})$, and from this the tensor character of $A_{\mu\nu}$ follows on account of symmetry. This also can be easily generalized to the case of covariant and contravariant tensors of any rank.
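The step from the special assertion to the symmetric combination can be spelled out as follows. This is a sketch of one way to fill in the argument: the shorthand $K_{\sigma\tau}$ for the bracket and the auxiliary vectors $U$, $V$ are notation introduced here, not the author's.

```latex
% Shorthand for the bracket of the preceding proof (notation assumed here):
K_{\sigma\tau} = A'_{\sigma\tau}
  - \frac{\partial x_\mu}{\partial x'_\sigma}
    \frac{\partial x_\nu}{\partial x'_\tau} A_{\mu\nu}

% Hypothesis, for every four-vector B':
K_{\sigma\tau} B'^{\sigma} B'^{\tau} = 0

% Put B' = U + V and subtract the cases B' = U and B' = V:
K_{\sigma\tau} U^{\sigma} V^{\tau} + K_{\sigma\tau} V^{\sigma} U^{\tau}
  = \left( K_{\sigma\tau} + K_{\tau\sigma} \right) U^{\sigma} V^{\tau} = 0

% U and V are arbitrary, hence
K_{\sigma\tau} + K_{\tau\sigma} = 0

% If A is symmetrical in both systems, K_{\sigma\tau} = K_{\tau\sigma}, so
K_{\sigma\tau} = 0
```

Only the symmetric part of the bracket is constrained by the special assertion, which is why the symmetry of $A_{\mu\nu}$ is needed to conclude.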
