Fixing non-calculus proof that Lorentz transformations are linear

Question

Define the Lorentz group to be $$O(1,3)=\{\Lambda:\mathbb{R}^4\rightarrow\mathbb{R}^4\mid\eta(\Lambda u,\Lambda v)=\eta(u,v)\},$$ where $\eta$ is the Minkowski inner product. One could try to mimic the simple proof that rotations are linear in this case. For example, to show that $\Lambda$ distributes over sums, we compute $$(\Lambda(u+v)-\Lambda u-\Lambda v)^2=(\Lambda(u+v))^2+(\Lambda u)^2+(\Lambda v)^2-2\eta(\Lambda(u+v),\Lambda u)-2\eta(\Lambda(u+v),\Lambda v)+2\eta(\Lambda u,\Lambda v).$$ Here we've written $u^2=\eta(u,u)$. Then, the fact that Lorentz transformations preserve the Minkowski product allows us to eliminate the $\Lambda$s from the right-hand side. One can then recombine the terms so that the right-hand side becomes $((u+v)-u-v)^2$, which clearly vanishes. Explicitly, $$(\Lambda(u+v)-\Lambda u-\Lambda v)^2=(u+v)^2+u^2+v^2-2\eta(u+v,u)-2\eta(u+v,v)+2\eta(u,v)\\=u^2+2\eta(u,v)+v^2+u^2+v^2-2u^2-2\eta(v,u)-2\eta(u,v)-2v^2+2\eta(u,v)=0.$$ If $\eta$ was non-degenerate, as in the case of $O(3)$, this would guarantee that $\Lambda(u+v)=\Lambda u+\Lambda v$. However, in the Minkowski case, this procedure only guarantees that $\Lambda(u+v)-\Lambda u-\Lambda v$ is null.
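
To see concretely why null does not imply zero, take the first component to be the time component and consider $w=(1,1,0,0)$. Then $$w^2=\eta(w,w)=0$$ in either signature convention, because the time and space contributions enter with opposite signs and cancel, and yet $w\neq 0$. So knowing only that the square of a vector vanishes cannot force the vector itself to vanish.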

Is there a way to prove further that it has to be zero, not only null? This would yield an appealing alternative to the more common calculus-based proofs one can find in CFT textbooks. These can be found in the answers to the post "Interval preserving transformations are linear in special relativity" (looking back at that post, I just realized that I gave an incorrect answer there a couple of years ago for precisely this reason, hahaha).

peek-a-boo · Accepted Answer · 2021-06-24T00:28:04.700

A small comment:

"If $\eta$ was non-degenerate..."

What you meant to say is "positive-definite"; note that the term "non-degenerate" has a specific technical meaning in linear algebra (according to which $\eta$ is non-degenerate) so we shouldn't mix up the terminology.

We can actually generalize the result much further, though your proof is bound to fail (at least I don't see how to fix it), because as you've rightly noted, all your argument shows is that the difference has to lie on the light-cone of $\eta$ (i.e. the zero-set of the quadratic form associated to $\eta$). Instead, what we should try to exploit is the non-degeneracy (in the strict mathematical sense) of the pseudo-inner product.

Here's the general claim we're going to prove:

Let $V$ be any finite-dimensional vector space over a field $\Bbb{F}$ of characteristic not $2$, and let $g:V\times V\to \Bbb{F}$ be a symmetric bilinear form which is non-degenerate (i.e. the mapping $x\mapsto g(x, \cdot)$ from $V\to V^*$ is a linear isomorphism). Then any function $T:V\to V$ which preserves $g$ is automatically linear.

(Certainly, $(\Bbb{R}^4,\eta)$ satisfies all these hypotheses.)

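The first step is to start with a basis $\{x_1,\dots, x_n\}$ with respect to which $g$ has the matrix representation $\text{diag}(+1,\dots, +1,-1,\dots, -1)$ (it doesn't matter how many plus or minus signs there are; over a general field, where such a normalization may not be available, any basis will do, since all we need is that the matrix $[g(x_i,x_j)]$ is invertible, which follows from non-degeneracy). Now, observe that because $T$ preserves $g$, it follows that $\{T(x_1),\cdots, T(x_n)\}$ will also be a basis of $V$ (prove this).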
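
A sketch of one way to check that claim, writing $\epsilon_i=g(x_i,x_i)=\pm1$ (a symbol introduced here only for illustration): suppose $\sum_j a_jT(x_j)=0$ for some scalars $a_j\in\Bbb{F}$. Pairing with $T(x_i)$ and using that $T$ preserves $g$ gives \begin{align} 0=g\Big(\sum_j a_jT(x_j),T(x_i)\Big)&=\sum_j a_j\,g(T(x_j),T(x_i))\\ &=\sum_j a_j\,g(x_j,x_i)\\ &=\epsilon_i a_i, \end{align} so every $a_i$ vanishes. Thus the $n$ vectors $T(x_1),\dots,T(x_n)$ are linearly independent in the $n$-dimensional space $V$, and hence form a basis.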

OK, now the proof is straightforward. Let $c\in\Bbb{F}$, $u,v\in V$ be arbitrary, and take any $i\in\{1,\dots, n\}$. Then, \begin{align} g(T(cu+v)-cT(u)-T(v), T(x_i))&= g(T(cu+v),T(x_i)) -cg(T(u),T(x_i))-g(T(v),T(x_i))\\ &=g(cu+v,x_i)-cg(u,x_i)-g(v,x_i)\\ &=0 \end{align} where we used bilinearity of $g$ throughout and that $T$ preserves $g$. Now, since $\{T(x_1),\cdots, T(x_n)\}$ is a basis of $V$, it follows by linearity of $g$ in its second slot that for any $y\in V$, we have \begin{align} g(T(cu+v)-cT(u)-T(v), y)&=0. \end{align} Therefore, by non-degeneracy of $g$, it follows that $T(cu+v)-cT(u)-T(v)=0$. Since $c,u,v$ were all arbitrary, this proves linearity of $T$.

As an aside: it is important for $g$ to be non-degenerate, because otherwise this can fail very badly: just take $g=0$; then any function will preserve $g$, but of course not every function is linear. For a slightly less trivial example, consider $\Bbb{R}^2$ and suppose $g((x,y),(v,w))=xv$ (so that the matrix representation with respect to the standard basis is $\begin{pmatrix}1&0\\0&0\end{pmatrix}$). Now, consider the function $T:\Bbb{R}^2\to\Bbb{R}^2$ given as \begin{align} T(x,y)=(x,\text{literally any crazy function of $(x,y)$ you want}) \end{align} Then, $T$ will preserve $g$, but is of course not linear.
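
The degenerate example can be checked numerically. The following is a minimal sketch, not part of the argument itself: the particular "crazy function" $\sin(xy)+y^2$, the sample points, and the helper names `g`, `T`, and `add` are all arbitrary choices made here for illustration.

```python
import math

def g(p, q):
    # Degenerate symmetric bilinear form on R^2: g((x, y), (v, w)) = x * v
    return p[0] * q[0]

def T(p):
    # Keeps the first coordinate; the second coordinate is an arbitrary
    # nonlinear function of (x, y), chosen here purely as an illustration.
    x, y = p
    return (x, math.sin(x * y) + y ** 2)

def add(p, q):
    # Componentwise vector addition in R^2
    return (p[0] + q[0], p[1] + q[1])

# A handful of sample points (arbitrary choices)
samples = [(1.0, 2.0), (-3.0, 0.5), (0.0, 4.0), (2.5, -1.5)]

# T preserves g on every pair of samples, since g only sees first coordinates
preserves_g = all(g(T(p), T(q)) == g(p, q) for p in samples for q in samples)

# ...but T fails additivity for this particular pair, so it is not linear
u, v = (1.0, 2.0), (0.0, 4.0)
is_additive_here = T(add(u, v)) == add(T(u), T(v))

print("preserves g on samples:", preserves_g)
print("additive on (u, v):", is_additive_here)
```

Because $g$ ignores the second coordinate entirely, nothing constrains what $T$ does there, which is exactly the freedom that non-degeneracy removes.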

So, to truly understand the proof above, you should look closely and examine where exactly we used the various hypotheses.
