32

What's the cleanest/quickest way to go from Einstein's postulates [1]

  1. Relativity: Physical laws are the same in all inertial reference frames.
  2. Constant speed of light: "... light is always propagated in empty space with a definite speed $c$ which is independent of the state of motion of the emitting body."

to Minkowski's idea [2] that space and time are united into a 4D spacetime with the indefinite metric $ds^2 = d\vec{x}^2 - c^2 dt^2$?

Two questions related to finding the best derivation of the correspondence:
Is the correspondence 1:1? (Does it go both ways?)
And are there any hidden/extra assumptions?


Edit

Marek's answer is really good (I suggest you read it and its references now!), but not quite what I was thinking of.

I'm looking for an answer (or a reference) that shows the correspondence using only/mainly simple algebra and geometry. An argument that a smart high school graduate would be able to understand.

Simon

5 Answers

30

I will first describe the naive correspondence that is assumed in usual literature and then I will say why it's wrong (addressing your last question about hidden assumptions) :)

The postulate of relativity would be completely empty if the inertial frames weren't somehow specified. So there is already a hidden, implicit assumption here: we are talking only about rotations and translations (which imply that the universe is isotropic and homogeneous), boosts, and combinations of these. From classical physics we know there are two possible groups that could accommodate these symmetries: the Galilean group and the Poincaré group (there is a catch here, as I mentioned; I'll describe it at the end of the post). Constancy of the speed of light then implies that the group of automorphisms must be the Poincaré group and, consequently, the geometry must be Minkowskian.
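To make the last step concrete, here is a minimal sketch comparing how a velocity transforms under a Galilean boost and under a Lorentz boost in one space dimension. The function names and the choice of units ($c = 1$) are my own illustrative assumptions, and the standard textbook transformation formulas are simply used, not derived.

```python
from fractions import Fraction

C = Fraction(1)  # speed of light in units where c = 1 (assumed convention)

def galilean_velocity(u, v):
    """Velocity u of an object as seen from a frame moving at v: u' = u - v."""
    return u - v

def lorentz_velocity(u, v):
    """Relativistic velocity transformation: u' = (u - v) / (1 - u v / c^2)."""
    return (u - v) / (1 - u * v / C**2)

v = Fraction(3, 5)  # frame velocity, chosen arbitrarily for illustration

light_galilean = galilean_velocity(C, v)
light_lorentz = lorentz_velocity(C, v)

print(light_galilean)  # 2/5: the Galilean boost changes the speed of light
print(light_lorentz)   # 1: the Lorentz boost leaves it unchanged
```

Only the Lorentz rule keeps the speed of a light ray equal to $c$ in the moving frame, which is why the second postulate rules out the Galilean group.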

[Sidenote: how to obtain geometry from a group? You take the subgroup that fixes a point (the stabilizer) and factor by it; what you're left with is a homogeneous space (a coset space rather than a group, since the stabilizer is not a normal subgroup) that is acted upon by the original group. Examples: $E(2)$ (symmetries of the Euclidean plane) has the group $O(2)$ of rotations and reflections about a point as the stabilizer, and $E(2) / O(2)$ gives ${\mathbb R}^2$. Similarly $(O(1,3) \ltimes {\mathbb R}^4) / O(1,3)$ gives us Minkowski space.]

The converse direction is trivial because it's easy to check that Minkowski space satisfies both of Einstein's postulates.

Now to address the catch: there are actually not two but eight kinematical groups that describe isotropic and uniform universes and are also consistent with quantum mechanics. They have been classified by Bacry and Lévy-Leblond. The relations among them are described in Dyson's Missed Opportunities (p. 9). E.g., there is a group that has absolute space (instead of the absolute time that we have in classical physics), but this is ruled out by the postulate of constant speed of light. In fact, only two groups remain after Einstein's postulates have been taken into account: besides the Poincaré group, we have the group of symmetries of de Sitter space (in terms of the above geometric program, it is $O(1,4) / O(1,3)$).

Actually, one could also drop the above-mentioned restriction to groups that make sense in quantum mechanics, and then we could also have anti-de Sitter space ($O(2,3) / O(1,3)$). In fact, this shouldn't be surprising: general relativity is a natural generalization of special relativity, so Einstein's postulates are actually weak enough that they describe maximally symmetric Lorentzian manifolds (which probably wasn't what Einstein originally intended).

Marek
22

First, set the units so that the speed of light is equal to one, so that the paths of light rays in space-time are at 45 Euclidean degrees. Note that a moving observer has a space-time path which is tilted relative to the t-axis, and the t-axis describes the path of a stationary observer (what I really mean is, get comfortable with space-time pictures). Then restrict yourself to two dimensions, one space, one time.

A moving observer's t-axis is, by the principle of relativity, his world line. He has a bunch of friends moving along at the same speed, at different positions, and their worldlines are parallel to the first (they are relatively stationary by the principle of relativity, so their worldlines never meet). Their worldlines also carry tick marks corresponding to their clocks.

The first nontrivial question is "what does the moving observer's x-axis look like?" This is also the first thing Einstein addresses in his relativity paper.

The x-axis is defined in the stationary frame as "all events simultaneous with the origin". Another way of saying this is that the events at positions x and y are simultaneous if the light ray starting at the event at x and the light ray starting at the event at y reach the point (x+y)/2 at the same time.


So, by the principle of relativity, if two comoving observers want to synchronize their clocks, the friend at x and the friend at y send a light signal at what they think is the same clock-tick to the friend at (x+y)/2. If (x+y)/2 gets the signal from both at the same time, he tells them "ok, you're synchronous". This can be seen on a space-time diagram.

[Figure: space-time diagram of the clock synchronization, with events e, f, and g]

It is important to convince yourself of the following: the x-axis of the moving observer is tilted up by the same amount that the t-axis of the moving observer is tilted right, so that the light signal from e and the light signal from f both reach g on the parallel line halfway in between. The reason is that the light rays are still at 45 degrees for the moving observer, which is Einstein's principle of the constancy of the speed of light.
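If you'd rather check this with numbers than with a picture, here is a small sketch of the synchronization construction, drawn in the stationary frame with $c = 1$. The setup is my own assumption about how to label things: the moving observer's worldline is $x = vt$, a friend rides at $x = L + vt$, and the midpoint friend at $x = L/2 + vt$; the function name is hypothetical.

```python
from fractions import Fraction

def simultaneous_event(v, L):
    """Find the event f on the far friend's worldline that the moving
    observers call simultaneous with the origin event e = (t=0, x=0)."""
    # Right-moving light ray from e (x = t) meets the midpoint
    # worldline x = L/2 + v t at event g.
    t_g = (L / 2) / (1 - v)
    x_g = L / 2 + v * t_g
    # Left-moving light ray arriving at g satisfies x = x_g + (t_g - t).
    # Trace it back to the far friend's worldline x = L + v t.
    t_f = (x_g + t_g - L) / (1 + v)
    x_f = L + v * t_f
    return t_f, x_f

v = Fraction(3, 5)   # speed of the moving observers (arbitrary choice)
L = Fraction(1)      # separation parameter of the friends (arbitrary choice)

t_f, x_f = simultaneous_event(v, L)
x_axis_slope = t_f / x_f   # rise of the moving x-axis per unit of x
t_axis_slope = v           # drift of the moving t-axis per unit of t

print(x_axis_slope, t_axis_slope)  # 3/5 3/5
```

The moving x-axis rises by $v$ per unit of x, exactly as much as the moving t-axis leans over by $v$ per unit of t, which is the "same tilt" statement above.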

Two lines are relativistically perpendicular if one is the x-axis for the other's t-axis. Convince yourself that two lines are perpendicular when they have the same tilt relative to the 45-degree lines corresponding to paths of light rays.
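As a quick sanity check, "same tilt relative to the 45-degree line" means one direction is the mirror image of the other across a light ray. The sketch below assumes the Minkowski product $t_1 t_2 - x_1 x_2$ (with $c = 1$) as the algebraic test for perpendicularity; that formula is where this answer is heading, not something established yet, and the helper names are made up for illustration.

```python
from fractions import Fraction

def minkowski_dot(a, b):
    """Minkowski product of two (t, x) direction vectors, with c = 1."""
    (t1, x1), (t2, x2) = a, b
    return t1 * t2 - x1 * x2

def reflect_across_light_ray(a):
    """Mirror a (t, x) direction across the 45-degree line t = x."""
    t, x = a
    return (x, t)

v = Fraction(2, 7)                         # arbitrary speed below 1
t_axis = (Fraction(1), v)                  # moving t-axis: x = v t
x_axis = reflect_across_light_ray(t_axis)  # moving x-axis: t = v x

product = minkowski_dot(t_axis, x_axis)
print(product)  # 0
```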

Now you are ready for a proof of the Minkowski version of the Pythagorean theorem.

Chinese Minkowski Pythagorean theorem

[Figure: area-dissection diagram for the Minkowski Pythagorean theorem]

The middle space-time shape that looks like a diamond is really a square in relativity: you must remember what perpendicularity means, namely the same tilt relative to the 45-degree line. By adding areas:

$(a+b)^2 = c^2 + 2ab + 2b^2$

$c^2 = a^2 - b^2$
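The algebra between the two displayed equations is just expanding the left-hand side and cancelling the common terms. Spelled out (this covers only the algebra, not the area bookkeeping in the figure):

```latex
\begin{align*}
(a+b)^2 &= c^2 + 2ab + 2b^2 \\
a^2 + 2ab + b^2 &= c^2 + 2ab + 2b^2 \\
a^2 - b^2 &= c^2
\end{align*}
```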

Martin
8

Let's build up to this. Suppose you know about x and y coordinates in the Euclidean plane, but to you they're just arbitrary labels for points, like zip codes or phone numbers. Then suppose someone tells you that observers can view the plane from different directions, but the laws of geometry stay the same. You now know that x and y aren't really separate. Under a 90-degree rotation, x could become y, and y could become -x. There is a quantity that is conserved under these rotations, which is the distance $\sqrt{\Delta x^2+\Delta y^2}$ defined by the Pythagorean theorem. All observers agree on this.

Now let's consider relativity according to Galileo and Newton. All observers agree on time intervals. However, they disagree on other things, such as distances. If I tap the "b" key on my keyboard twice, like this, bb, I say that the distance between those two events is zero. But a Martian looking at me through a telescope says the earth is spinning and revolving around the sun, so the distance between b and b was a hundred meters. Only observers at rest relative to one another agree on Pythagorean distances, but they do agree on these regardless of rotation. Just as in the example of the Euclidean plane we saw that rotation could mix x and y, in Galilean relativity we see that an observer's motion along the x axis mixes x and t according to $x'=x-vt$, $t'=t$.
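Here is the keyboard example as a tiny sketch. The relative speed of 100 m/s and the one-second gap between the taps are numbers I picked so that the distance comes out to a hundred meters; they are assumptions for illustration, not measured values.

```python
def galilean(event, v):
    """Galilean transformation x' = x - v t, t' = t for an event (t, x)."""
    t, x = event
    return (t, x - v * t)

# Two taps of the "b" key: same place, one second apart (seconds, meters).
tap1 = (0, 0)
tap2 = (1, 0)
v = 100  # hypothetical speed of the keyboard relative to the Martian, in m/s

m1, m2 = galilean(tap1, v), galilean(tap2, v)

dt_rest, dx_rest = tap2[0] - tap1[0], tap2[1] - tap1[1]
dt_moving, dx_moving = m2[0] - m1[0], m2[1] - m1[1]

print(dt_rest, dx_rest)      # 1 0
print(dt_moving, dx_moving)  # 1 -100
```

Both observers agree on the time interval, but not on the distance between the two events.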

Learning about Einstein's second postulate is like learning about the rotational symmetry of the Euclidean plane. It tells you that there was a higher degree of symmetry than you believed. It says that there is something that all observers agree on. Just as all observers in the Euclidean plane agree whether points A and B are 1 meter apart, all observers in a relativistic universe agree on whether two events A and B could represent the emission of a ray of light from A and its reception at B. In this case we say that the separation of A and B is lightlike. Let the difference between A and B's x coordinates be $\Delta x$, and so on, and for convenience let's use units of seconds and light-seconds, so that $c=1$, and we don't have to write factors of $c$. If a certain observer says A and B are lightlike in relation to one another, then that observer has $\sqrt{\Delta x^2+\Delta y^2+\Delta z^2}=\Delta t$.

An observer in a different state of motion will measure different values for these deltas, and as in the Euclidean and Galilean examples, the equations relating these deltas have to be linear. (A nonlinear relationship like $x'=x^2$ would violate the homogeneity of space.) The details of the actual equations and how they're derived aren't really the topic here. But we would like to find something that both observers agree on, just as observers in the Euclidean plane agree on distances, and observers in a Galilean universe agree on times.

We might hope that they would agree on the difference $\Delta t-\sqrt{\Delta x^2+\Delta y^2+\Delta z^2}$. If they did agree on this, then they would certainly agree on whether events were lightlike. But this conjecture doesn't work. One easy way to see this is with a variation on the well-known thought experiment of the train and the two lightning flashes. If the flashes are simultaneous in the dirt's frame, then observer K in a train going in one direction sees $\Delta t<0$, while observer K' in a train going the opposite way sees $\Delta t>0$. Since K and K' agree on $\Delta x$, ... but disagree on the sign of $\Delta t$, they disagree on the value of the expression conjectured above.

What does turn out to work is the difference $\Delta t^2-\Delta x^2-\Delta y^2-\Delta z^2$. We can see that it doesn't fall prey to the same counterexample of the trains, because the sign of $\Delta t$ is irrelevant.
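To see both claims with numbers, here is a sketch of the two-train example in one space dimension. It assumes the standard Lorentz boost formulas (which this answer deliberately doesn't derive), units with $c = 1$, and a train speed of 3/5 chosen so that $\gamma = 5/4$ is exact; the names are illustrative.

```python
from fractions import Fraction

def boost(dt, dx, v, gamma):
    """Standard Lorentz boost of a separation (dt, dx), with c = 1."""
    return gamma * (dt - v * dx), gamma * (dx - v * dt)

v = Fraction(3, 5)
gamma = Fraction(5, 4)          # 1 / sqrt(1 - v^2) for v = 3/5
assert gamma**2 * (1 - v**2) == 1

# Flashes simultaneous in the dirt's frame, one light-second apart.
dt, dx = Fraction(0), Fraction(1)

dt_K, dx_K = boost(dt, dx, v, gamma)      # train going one way
dt_Kp, dx_Kp = boost(dt, dx, -v, gamma)   # train going the other way

# Conjectured invariant: delta t minus spatial distance.
linear_K = dt_K - abs(dx_K)
linear_Kp = dt_Kp - abs(dx_Kp)

# What actually works: delta t squared minus delta x squared.
quad_ground = dt**2 - dx**2
quad_K = dt_K**2 - dx_K**2
quad_Kp = dt_Kp**2 - dx_Kp**2

print(dt_K, dt_Kp)                   # -3/4 3/4
print(linear_K, linear_Kp)           # -2 -1/2
print(quad_ground, quad_K, quad_Kp)  # -1 -1 -1
```

K and K' get opposite signs for $\Delta t$ and so disagree on the first expression, while all three frames agree on the squared difference.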

By analogy with the Euclidean unification of the x and y axes, we think of this as a kind of unification of the t axis with the x, y, and z axes in relativity. The occurrence of the + sign in the time term and the - signs on the spatial ones is called the signature. The distinction between a signature like ++++ and one like +--- is real, and says that time is not exactly the same as a spatial dimension. The distinction between +--- and -+++ is not physically significant, and different people use different conventions.

6

The argument I'm about to describe is not necessarily the quickest, but I think it's very pretty, and it uses only elementary geometry and linear algebra. It requires some lemmas which are not entirely obvious, but their proofs should be accessible to mathematically sophisticated high-schoolers.

Yes, Einstein's argument does include a hidden assumption:

  • Spacetime can be modeled mathematically by the Cartesian product $T \times X$, where $T$ is an oriented one-dimensional Euclidean space (representing time) and $X$ is an $n$-dimensional Euclidean space (representing space), with $n \ge 2$.

Making this assumption explicit lets us state the speed-of-light postulate in a more precise way:

  • A flash of light that occurs at the spacetime point $(t, x) \in T \times X$ will be seen at, and only at, the points $(t', x')$ for which $\|t' - t\| = \|x' - x\|$ and $t' - t$ is positively oriented. (Here, $\|\cdot\|$ is the Euclidean norm.) The set of points where the flash can be seen will be called the null cone of $(t, x)$.
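As a concrete reading of this definition, here is a minimal sketch of a null-cone membership test for $n = 3$. The representation of a spacetime point as a pair `(t, (x1, x2, x3))` and the function name are my own assumptions; comparing squares avoids square roots and keeps integer arithmetic exact.

```python
def on_null_cone(source, point):
    """True if a flash at `source` is seen at `point`: the time difference
    is positive and equals the Euclidean distance between the positions."""
    t, x = source
    t2, x2 = point
    dt = t2 - t
    dist_sq = sum((b - a) ** 2 for a, b in zip(x, x2))
    return dt > 0 and dt * dt == dist_sq

flash = (0, (0, 0, 0))

seen = on_null_cone(flash, (5, (3, 4, 0)))       # 5 = sqrt(9 + 16), later
in_past = on_null_cone(flash, (-5, (3, 4, 0)))   # right distance, but earlier
too_close = on_null_cone(flash, (5, (3, 3, 0)))  # light has already passed

print(seen, in_past, too_close)  # True False False
```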

One advantage of saying everything so pedantically is that it lets us do away with Einstein's laws-of-physics postulate, removing the need to explain what "the laws of physics" and "inertial frame" are supposed to mean.

From the assumptions above, one can prove that the symmetries of spacetime which preserve the speed of light are precisely the Poincaré transformations and scalings. To be precise, one can prove that:

Any invertible function $f$ from $T \times X$ to itself that sends the null cone of $(t, x)$ to the null cone of $f(t, x)$ is a Poincaré transformation composed with a scaling.

(Note how little we assumed about the function $f$. We didn't even assume it's continuous!)
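The theorem itself is the hard direction. The easy direction, that a Poincaré transformation composed with a scaling really does send null cones to null cones, can be spot-checked numerically. The sketch below assumes $n = 3$, units with $c = 1$, the standard boost formulas, and arbitrarily chosen parameters (boost speed 3/5, scale factor 2, and a translation); it checks a few sample points and proves nothing.

```python
from fractions import Fraction

V = Fraction(3, 5)       # boost speed along x1 (arbitrary choice)
GAMMA = Fraction(5, 4)   # 1 / sqrt(1 - V^2) for V = 3/5
SCALE = Fraction(2)      # scaling factor (arbitrary choice)
SHIFT = (Fraction(1), (Fraction(2), Fraction(3), Fraction(4)))  # translation

def f(event):
    """Boost along x1, then scale, then translate."""
    t, (x1, x2, x3) = event
    bt = GAMMA * (t - V * x1)
    bx1 = GAMMA * (x1 - V * t)
    t0, (a1, a2, a3) = SHIFT
    return (SCALE * bt + t0,
            (SCALE * bx1 + a1, SCALE * x2 + a2, SCALE * x3 + a3))

def on_null_cone(source, point):
    """Positive time difference equal to the Euclidean distance."""
    dt = point[0] - source[0]
    dist_sq = sum((b - a) ** 2 for a, b in zip(source[1], point[1]))
    return dt > 0 and dt * dt == dist_sq

p = (0, (0, 0, 0))
q = (5, (3, 4, 0))   # on the null cone of p
r = (5, (3, 3, 0))   # not on the null cone of p

print(on_null_cone(p, q), on_null_cone(f(p), f(q)))  # True True
print(on_null_cone(p, r), on_null_cone(f(p), f(r)))  # False False
```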

Proofs for $n = 3$ can be found in:

A proof for all $n \ge 2$ can be found in:

Alexandrov's proof proceeds roughly as follows. Sub-arguments are sketched in sub-lists, and non-trivial lemmas are marked in bold.

  1. By cleverly intersecting null cones, show that $f$ sends null lines to null lines.
  2. Deduce that $f$ sends null planes to null planes.
    1. Observing that every null plane is doubly ruled by null lines, deduce that $f$ sends every null plane to a set doubly ruled by null lines.
    2. Recalling the classification of doubly ruled sets, observe that planes are uniquely identified among doubly ruled sets by the way their ruling lines intersect.
  3. Notice that every line is the intersection of two null planes. Thus $f$ sends lines to lines.
  4. In dimensions three and higher, any invertible function that sends lines to lines is affine, so $f$ is an affine transformation (this is why we needed $n \ge 2$).
  5. Show that $f$ is a Poincaré transformation composed with a scaling (Alexandrov doesn't do this part explicitly, because it was already standard material by his time, so my sketch may not be totally right).
    1. Knowing that translations preserve null cones, reduce to the case where $f$ leaves some point $p_0 = (t_0, x_0)$ invariant.
    2. Knowing that rotations and boosts preserve null cones, reduce to the case where $f$ leaves the line $x = x_0$ invariant.
    3. Knowing that dilations preserve null cones, reduce to the case where $f$ fixes every point on the line $x = x_0$.
    4. Using the linearity of $f$ about $p_0$, conclude that $f$ leaves the plane $t = t_0$ invariant.
    5. Using the fact that $f$ preserves null cones, conclude that $f$ acts on the plane $t = t_0$ by a rotation composed with a scaling.

Remarks:

  • The assumption $n \ge 2$ is necessary. When $n = 1$, the postulates I've given don't restrict the symmetry group of spacetime to the Poincaré group! If you require symmetries to be smooth functions, you end up with the Lorentz-signature conformal group as your symmetry group. I'd love to know if this fact is somehow responsible for the appearance of (1+1)-dimensional conformal field theories in physics...

  • Don't be put off by the fact that the authors cited above start out by assuming that they're working in Minkowski spacetime. They're only using this as a convenient way to define null cones. In this answer I defined null cones in a more long-winded way, which doesn't mention the Minkowski metric, to make it clear that we're not assuming what we're trying to prove.
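To illustrate the first remark: when $n = 1$, it's easy to write down a map that preserves null cones but is not a Poincaré transformation composed with a scaling. The sketch below uses null coordinates $u = t - x$, $w = t + x$ and applies the same strictly increasing function to each; the particular function $g(u) = u^3 + u$ is an arbitrary choice of mine, and any strictly increasing bijection of the real line would do.

```python
from fractions import Fraction

def g(u):
    """An arbitrary strictly increasing, nonlinear function."""
    return u**3 + u

def f(event):
    """Act separately on the null coordinates u = t - x and w = t + x."""
    t, x = event
    u, w = t - x, t + x
    new_u, new_w = g(u), g(w)
    return ((new_u + new_w) / 2, (new_w - new_u) / 2)

def on_null_cone(source, point):
    """Null cone in 1+1 dimensions: dt > 0 and |dt| = |dx|."""
    dt = point[0] - source[0]
    dx = point[1] - source[1]
    return dt > 0 and dt * dt == dx * dx

p = (Fraction(0), Fraction(0))
q_right = (Fraction(2), Fraction(2))    # on the right-moving null line from p
q_left = (Fraction(3), Fraction(-3))    # on the left-moving null line from p
r = (Fraction(2), Fraction(1))          # not on the null cone of p

cone_preserved = (on_null_cone(f(p), f(q_right))
                  and on_null_cone(f(p), f(q_left))
                  and not on_null_cone(f(p), f(r)))

# f fixes the origin, so if it were affine it would be linear,
# and f(2, 0) would equal 2 * f(1, 0).
f1 = f((Fraction(1), Fraction(0)))
f2 = f((Fraction(2), Fraction(0)))
is_linear_on_sample = (f2 == (2 * f1[0], 2 * f1[1]))

print(cone_preserved, is_linear_on_sample)  # True False
```

The map sends null lines to null lines and keeps the time orientation, yet it is not even affine, so the conclusion of the theorem fails in 1+1 dimensions.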

2

Historically, FitzGerald and Lorentz deduced the transformations from the Michelson-Morley experiment, which showed that the speed of light was the same for all observers, and then Poincaré found the Minkowski metric when he was looking for invariants of the Lorentz group.

Roger