Why is there a minus sign in the definition of the Minkowski spacetime interval?

Question

The spacetime interval is defined as follows:

$$\Delta s^2 = -(c\Delta t)^2 + \Delta x^2 + \Delta y^2 + \Delta z^2$$

or in tensor notation:

$$\Delta s^2 = \eta_{\mu\nu} \Delta x^\mu \Delta x^\nu$$

When I first studied introductory special relativity, I didn't even pay much attention to this quantity -- it was mostly time dilation, length contraction, and fancy paradoxes.

However, it has caught my attention now. The book I'm reading simply defines the quantity, and claims that it's invariant.

Now, just from tensor analysis and ignoring special relativity, $\eta_{\mu\nu} \Delta x^\mu \Delta x^\nu$ looks like the full contraction of a rank-2 covariant tensor with two contravariant vectors, which makes it a scalar and hence an invariant. Great!

But what I do not understand is why the spacetime interval is defined the way it is. Why is it $-(c\Delta t)^2 + \Delta x^2 + \Delta y^2 + \Delta z^2$, and not $(c\Delta t)^2 + \Delta x^2 + \Delta y^2 + \Delta z^2$?

I want the physical motivation behind this formula.

score 11 · Answer 1 · answered May 05 '19 at 13:00

Here are two different ways to introduce the subject of Special Relativity. Both are good ways, and each can be used to derive the other.

Approach 1: symmetry principles. We assert Relativity Postulate (physical behaviour the same relative to an inertial frame, no matter the state of relative motion of that inertial frame with others) and Speed of Light Postulate (there is a finite maximum speed for signals). From these we can derive the Lorentz transformation and hence what quantities are invariant. The spacetime interval is one such quantity.

Approach 2: geometric assertions about spacetime. We assert that spacetime is a smooth differentiable manifold with a Minkowskian metric $\eta$. The metric is itself a statement of that which is invariant; the Lorentz transformation $\Lambda$ is then defined as that class of transformations which satisfies $$ \Lambda^T \eta \Lambda = \eta $$ (Here I have used matrix notation in which $T$ is a transpose and $\eta$ has components $\eta_{ab}$.)
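As a rough numerical illustration of the condition $\Lambda^T \eta \Lambda = \eta$, here is a minimal sketch in plain Python. It assumes coordinates ordered as $(ct, x, y, z)$ and a boost along $x$ with $\beta = v/c$; the helper names (`boost_x`, `preserves_metric`, and so on) are made up for this example and are not from any library.

```python
import math

def boost_x(beta):
    """4x4 Lorentz boost along x for beta = v/c, acting on (ct, x, y, z)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [gamma, -gamma * beta, 0.0, 0.0],
        [-gamma * beta, gamma, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def diag(entries):
    n = len(entries)
    return [[entries[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def preserves_metric(lam, eta, tol=1e-12):
    """True if lam^T eta lam equals eta entry by entry, within tol."""
    result = matmul(matmul(transpose(lam), eta), lam)
    return all(abs(result[i][j] - eta[i][j]) < tol
               for i in range(4) for j in range(4))

ETA_MINKOWSKI = diag([-1.0, 1.0, 1.0, 1.0])
ETA_EUCLIDEAN = diag([1.0, 1.0, 1.0, 1.0])

boost = boost_x(0.6)
print("boost preserves Minkowski metric:", preserves_metric(boost, ETA_MINKOWSKI))
print("boost preserves Euclidean metric:", preserves_metric(boost, ETA_EUCLIDEAN))
```

The boost leaves diag(-1,1,1,1) unchanged but not the Euclidean diag(1,1,1,1), which is the matrix form of the statement that the metric encodes what is invariant.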

Your question is closest in spirit to approach 2. The question then becomes, "why the Minkowski metric? Why not some other metric?" The answer goes to the heart of what kind of universe we have. One can argue that if the metric were that of a 4-dimensional Euclidean space, for example, then there wouldn't be any sense in which time is different from space. The result would be so different from what we observe that it is hard even to describe it as a physical universe, one with conservation laws of the type that allow us to single out things and label them by their worldlines. There would be no sense of limits to causality, of past and future. Other metrics you can consider, such as diag(-1,-1,1,1), also give a spacetime so different in type that it would be totally unlike the universe as we find it.

So in so far as one can talk of a "physical motivation behind this formula" as you ask, it would be "well this is deeply and directly connected to the notion of causality and the causal structure of spacetime. It also expresses the notion that one spatial direction is as good as another in the basic structure of spacetime."

Turbotanten · Answer 2 · 2017-01-13T12:34:15.643

Leonard Susskind, a professor at Stanford University, has an excellent explanation of why the space-time invariant is defined the way it is. I've included a video link from where he begins to talk about the subject, if you'd like to watch it.

He compares space-time to Euclidean geometry, where the Pythagorean theorem says that the squared distance between two points is the sum of the squares of the coordinate differences, i.e. $c^2 = a^2+b^2$ (here $c$ is the hypotenuse, not the speed of light). This quantity is invariant: if we rotate our coordinate system and describe the same points with primed coordinates, then $a^{\prime 2}+b^{\prime 2} = a^2 + b^2$. Similarly, in space-time we look for an invariant quantity that all observers in different reference frames will agree upon. If we begin with the Lorentz transformation (in units where $c=1$), we have

$$ x^\prime = \frac{x-vt}{\sqrt{1-v^2}}, \quad t^\prime = \frac{t-vx}{\sqrt{1-v^2}}. $$ Let us look for an invariant quantity. We could begin by trying $t^{\prime 2}+x^{\prime 2}$. Substituting $x^\prime$ and $t^\prime$ gives $$ t^{\prime 2}+x^{\prime 2} = \frac{(1+v^2)(t^2+x^2) - 4vtx}{1-v^2}, $$ which in general is not equal to $t^2+x^2$, so this is not an invariant quantity in space-time. But if we try $t^{\prime 2}-x^{\prime 2}$ and follow the same procedure, the cross terms cancel and we get $$ t^{\prime 2}-x^{\prime 2} = \frac{(1-v^2)(t^2-x^2)}{1-v^2} = t^2 - x^2, $$ so this is an invariant quantity!
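The comparison between the two candidate quantities can also be tried numerically. The sketch below is only an illustration in units where $c=1$; the function name `lorentz_boost` and the sample values of $t$, $x$, $v$ are arbitrary choices, not something from the lecture.

```python
import math

def lorentz_boost(t, x, v):
    """Return (t', x') for a boost with velocity v along x, in units where c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)

# Arbitrary sample event and boost velocity (hypothetical values).
t, x, v = 2.0, 1.0, 0.5
t_prime, x_prime = lorentz_boost(t, x, v)

sum_before = t * t + x * x
sum_after = t_prime ** 2 + x_prime ** 2
diff_before = t * t - x * x
diff_after = t_prime ** 2 - x_prime ** 2

# The Euclidean-style sum changes under the boost; the difference does not.
print("t^2 + x^2 unchanged:", math.isclose(sum_before, sum_after))
print("t^2 - x^2 unchanged:", math.isclose(diff_before, diff_after))
```

Trying other values of `v` with `abs(v) < 1` should give the same pattern: only the combination with the minus sign survives the transformation.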

score 7 · Answer 3 · edited Jan 11 '19 at 13:35

Let event $A$ be the emission of a light pulse from the origin at $t=0$, and let event $B$ be the arrival of that light at the point with space coordinates $(x, y, z)$ at time $t$. The distance travelled by the light in time $t$ satisfies $$c^2t^2=x^2+y^2+z^2,$$ that is, $$c^2t^2-x^2-y^2-z^2=0.$$

Call the original frame $X$, and consider another frame $X'$ moving at velocity $v$ relative to it. Observing the same pair of events from $X'$ gives $$c^2t'^2=x'^2+y'^2+z'^2,$$ $$c^2t'^2-x'^2-y'^2-z'^2=0.$$
We call the quantity $c^2t^2-x^2-y^2-z^2$ the interval. The light pulse shows that if the interval is zero in one frame of reference, it must be zero in all inertial frames of reference, since light has the same speed in every inertial frame (a postulate of special relativity). Thus the intervals between any two events, as computed in $X$ and in $X'$, must be proportional to each other. For an observer in the $X$ frame, the $X'$ frame moves with velocity $v$, and the proportionality reads $$c^2t^2-x^2-y^2-z^2=\alpha (c^2t'^2-x'^2-y'^2-z'^2),$$ where $\alpha$ depends only on the magnitude of the velocity. If it depended on the direction, that would contradict the isotropy of space. For an observer in the $X'$ frame, the $X$ frame moves with velocity $-v$, which has the same magnitude, so the same $\alpha$ applies: $$c^2t'^2-x'^2-y'^2-z'^2=\alpha(c^2t^2-x^2-y^2-z^2).$$ Substituting this into the previous equation gives $$c^2t^2-x^2-y^2-z^2=\alpha^2(c^2t^2-x^2-y^2-z^2),$$ so $\alpha^2=1$. The root $\alpha=-1$ is excluded because $\alpha$ must equal $1$ when $v=0$ (the two frames then coincide) and must vary continuously with $v$, so $\alpha=1$. Thus the interval $$c^2t'^2-x'^2-y'^2-z'^2=c^2t^2-x^2-y^2-z^2$$ is invariant under the Lorentz transformation.

The constancy of the speed of light and the isotropy of space together determine the invariant interval.

JMLCarter · Answer 4 · 2017-01-13T11:21:07.420

Because it lets you quickly recognise the potential for a causal relation between the two events. So (dropping the $\Delta$ in front of the $s$ for brevity):

$s^2>0$ (space-like): more space in between than light can cross in the time => no causal relation
$s^2=0$ (light-like): exactly on the "light cone"
$s^2<0$ (time-like): less space in between than light can cross in the time => a causal relation is possible
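The three cases can be written as a small classifier. This is only a sketch using the $-+++$ sign convention from the question; the function names and the default value of `c` are choices made for the example.

```python
C = 299_792_458.0  # speed of light in m/s

def interval_squared(dt, dx, dy, dz, c=C):
    """s^2 = -(c dt)^2 + dx^2 + dy^2 + dz^2, using the -+++ convention."""
    return -(c * dt) ** 2 + dx ** 2 + dy ** 2 + dz ** 2

def classify(dt, dx, dy, dz, c=C):
    """Label the separation between two events by the sign of s^2."""
    s2 = interval_squared(dt, dx, dy, dz, c)
    if s2 > 0:
        return "space-like"   # too far apart for light to connect them
    if s2 < 0:
        return "time-like"    # a causal relation is possible
    return "light-like"       # exactly on the light cone

# Examples in units where c = 1, so the arithmetic is exact.
print(classify(1.0, 2.0, 0.0, 0.0, c=1.0))
print(classify(2.0, 1.0, 0.0, 0.0, c=1.0))
print(classify(1.0, 1.0, 0.0, 0.0, c=1.0))
```

With measured data the exact comparison with zero would rarely trigger, so a tolerance around $s^2 = 0$ would be a sensible addition in practice.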

What is being modelled is the way light travels through space, so the interval is not the same thing as the magnitude of an ordinary distance vector.
Obviously you could give $(c\Delta t)^2$ the same sign as the distances. The resulting quantity, whilst having some uses, would just be the squared magnitude of a Euclidean-style vector in space-time. It would not have the same (or as much, in my view) physical significance, and crucially it definitely does not get to be called the "space-time interval".

Note further that there is a choice about whether to use $-+++$ or $+---$ signs for the terms in the equation; the choice of $-+++$ here is just a matter of convention.

(What's really cool is that it can be shown that $s^2$ is preserved under the Lorentz transformation, which proves that the causal relation between two events cannot be changed simply by changing your frame of reference. Neat.)

score 1 · Answer 5 · answered Nov 08 '20 at 09:44

This answers the original question of why the space-time interval is "defined" with the opposite sign on the time component compared to the space part. As Feynman said, you must be careful with definitions in physics. (Feynman was talking about defining mass as force over acceleration, which would mean Newton could never be proved wrong, since the law would always hold by definition! Of course $F = ma$ needs to be a result, or an observation about the physical universe, and it turns out to be only an approximation.)

The space-time interval is a physical law and an observation about the universe. It is subject to direct and indirect measurements, and it has been exceedingly well verified by many experiments. It is better not to think of it as a mysterious definition; it is an incredible discovery about the physical universe. We cannot expect too much in the way of "reasons" or definitions to support fundamental physical law, though some hints are mentioned in the other answers posted.

Similarly, Pythagoras' theorem in physics is not a definition but an observation about space, and it turns out to apply only approximately to the physical world, as can be checked by measurement. It fails near black holes and very slightly everywhere else, although the deviation is unnoticeable without special techniques (some indirect consequences can be argued to be observed in everyday phenomena).

Personally, I think the space-time interval can be presented first, simply because even a high schooler familiar with Pythagoras can glean a little from it, whereas the Lorentz transformations can be grasped only after having studied the rotation matrix.

score 1 · Answer 6 · edited May 01 '24 at 06:22

It's to force the condition that the speed of light is the same in all inertial reference frames, even those travelling at appreciable fractions of the speed of light relative to one another.

Take the example of a person standing on a train-station platform when a train goes by at an appreciable fraction of the speed of light. A person sitting in the seat at the far right of a train car shines a laser at a mirror on the far left of the train car, in the same row of seats. Suppose the person has a superfast timer that ticks on when the first photon leaves the laser and ticks off when the first photon from the mirror arrives back at the timer. If the train car is 3 meters wide, the timer will read 20 nanoseconds as the time interval between ticks, 10 nanoseconds each way.

Now suppose the person on the platform has a superfast drone peering down from above, and the drone, via Google Maps with the measure-distance function clicked on, sees that the photon actually travels 5 m diagonally across the tracks. Being familiar with the 3-4-5 triangle, the person deduces that the train is going at 4/5 the speed of light and that the time to travel the 5 m diagonally across to the mirror is 16.7 nanoseconds. Plugging these numbers into the Lorentz transformation for time, the person deduces that $s^2$ is invariant.

The questioner's question is quite profound, however, because if the universe actually existed in a spacetime with time as an ordinary fourth dimension, then $s^2$ should equal $$x_1^2 + x_2^2 + x_3^2 + x_4^2,$$ since a round thing in two dimensions is defined by the equation $$x_1^2 + x_2^2 = r^2,$$ and in three dimensions by $$x_1^2 + x_2^2 + x_3^2 = r^2.$$ In four dimensions there should be another plus sign in front of $x_4^2$. But to reproduce the observed minus sign, $x_4$ would then have to be $ict$, where $i$ equals the square root of minus one, which, of course, does not exist, despite its great utility in math and physics.
In fact, we don't live in simple spacetime, but in Minkowski spacetime, specifically designed to force the condition that the speed of light is the same in all inertial reference frames. As such, I think, it is misleading to refer to the curvature of spacetime. The term curvature would only apply if the fourth dimension were ict, not simply $t$.
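The numbers in the train example above can be checked with a short calculation. This is a sketch under stated assumptions: $c$ is rounded to $3 \times 10^8$ m/s (the value implied by "3 m in 10 ns"), the laser points straight across the car so the light has no along-track displacement in the train frame, and the function and variable names are invented for the example.

```python
import math

C = 3.0e8  # m/s, rounded value implied by "3 m in 10 ns" (assumption)

def to_platform_frame(dt, dx, dy, v, c=C):
    """Transform a train-frame separation (dt, dx along track, dy across car)
    to the platform frame, in which the train moves at +v along the track."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (dt + v * dx / c ** 2), gamma * (dx + v * dt), dy

def interval_squared(dt, dx, dy, c=C):
    """s^2 = -(c dt)^2 + dx^2 + dy^2 (the z direction plays no role here)."""
    return -(c * dt) ** 2 + dx ** 2 + dy ** 2

# Train frame: light crosses the 3 m wide car, one way, in 3 m / c = 10 ns.
width = 3.0
dt_train = width / C
v = 0.8 * C

dt_platform, dx_platform, dy_platform = to_platform_frame(dt_train, 0.0, width, v)
path_platform = math.hypot(dx_platform, dy_platform)

print("platform time (ns):", dt_platform * 1e9)
print("platform path length (m):", path_platform)
print("s^2 in train frame:", interval_squared(dt_train, 0.0, width))
print("s^2 in platform frame:", interval_squared(dt_platform, dx_platform, dy_platform))
```

Up to floating-point rounding, the platform observer gets a 5 m path covered in about 16.7 ns, and $s^2$ comes out as zero in both frames, as it must for two events connected by a light ray.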

Dr Maurya Dinesh · Answer 7 · 2021-04-28T17:43:46.420

Perhaps no one has given the correct answer; here is the required answer.

The speed of light, or of anything else, need not be constant, and there is no need for a 4D space (keep these in mind). Einstein never revealed his derivation of the interval; otherwise everyone would have known how simple relativity is. Instead, everything was made mysterious. In one reference frame, the interval is as follows (it is now deducible; earlier it was a third postulate, and people are still living in this third-postulate world):

$- (\Delta S)^2 = c^2 (\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2$

where $\Delta S = \dfrac{v\,\Delta \ell}{c}$, $\Delta \ell = \hat{i}\Delta x+\hat{j}\Delta y+\hat{k}\Delta z$, $v$ is the relative speed of the object, and the rest is clear. The strange definition of the interval above is derivable from the total relativistic momentum. Now, for the first time, you will be able to understand it.

Thanks,
Dr. Maurya Dinesh
