3

I have a very general question about Lorentz transformations of electric and magnetic fields vs. 4-vectors. It arose from my previous post. I will describe the difficulty I encountered.

Information and problem:

  • Consider the electric and magnetic fields in a frame $S$, which can be boosted to a frame $\bar{S}$ moving at relative velocity $\mathbf{v}$. The electric field $\mathbf{E} =\mathbf{E}_\perp + \mathbf{E}_\parallel$ ($\perp$, $\parallel$ are with respect to $\mathbf{v}$) and the magnetic field $\mathbf{B} = \mathbf{B}_\perp + \mathbf{B}_\parallel$ transform as:

$$\mathbf{\bar{E}}_\parallel = \mathbf{E}_\parallel$$

$$\mathbf{\bar{B}}_\parallel = \mathbf{B}_\parallel$$

$$\mathbf{\bar{E}}_\perp = \gamma(\mathbf{E}_\perp + \mathbf{v} \times \mathbf{B})$$

$$\mathbf{\bar{B}}_\perp = \gamma(\mathbf{B}_\perp - \frac{\mathbf{v}\times \mathbf{E}}{c^2}),$$

where $\gamma \equiv \frac{1}{\sqrt{1-\mathbf{v}\cdot \mathbf{v}/c^2}}$ is used.

  • My problem with these equations is as follows. There is an equality of vectors at each line visible. This means that if we have for instance $\mathbf{E}_\parallel = E_0 \mathbf{\hat{x}}$, $\mathbf{v} = v\mathbf{\hat{x}}$ and $\mathbf{B} = \mathbf{0}$ then $\bar{\mathbf{E}}_\parallel = \mathbf{E}_\parallel = E_0 \mathbf{\hat{x}}$ according to the definition. My problem is that $\mathbf{\bar{E}}_\parallel$ is expressed in terms of basis vector $\mathbf{\hat{x}}$ from frame $S$ and not a new basis vector say $\mathbf{\bar{\hat{x}}}$ in frame $\bar{S}$.

  • For a Lorentz transformation of a 4-vector, for example the 4-potential $A \equiv \{A^{\mu}\} = (V/c,\mathbf{A})$, we have $\bar{A}^{\mu} = \Lambda^{\mu}{}_{\nu} A^{\nu}$, and since the 4-vector itself does not change under a basis transformation, $A = \bar{A} = \bar{A}^\mu \bar{e}_\mu = A^{\mu}e_\mu$, where $\{e_\mu\}_{\mu \in \{0,1,2,3\}}$ and $\{\bar{e}_\mu\}_{\mu \in \{0,1,2,3\}}$ are bases for $S$ and $\bar{S}$. The same Minkowski space is therefore described by different basis vectors in the 4-vector formalism when boosting to another frame. But the electric and magnetic fields (which I know do not form a 4-vector but are contained in the field tensor $\mathbf{F}$) appear to change only in their components and not in the basis of $\mathbf{R}^3$. I have the feeling that 4-vectors do not change under a Lorentz transformation, only the components and the basis in which they are expressed (so the basis representation) and for 3-vectors, they change as a whole, meaning that the basis vectors can stay the same although the components change. But I am not sure about this.

  • Coming back to the 4-potential. The components are also called $\bar{A}_x$, $\bar{A}_y$, $\bar{A}_z$, just like one can write $\bar{E}_x$, $\bar{E}_y$, $\bar{E}_z$ after a boost. Now the problem is if we write $\mathbf{\bar{E}} = \bar{E}^j \mathbf{e}_j$: is this basis also just $\{\mathbf{\hat{x}},\mathbf{\hat{y}},\mathbf{\hat{z}}\}$? And for $\mathbf{\bar{A}} = \bar{A}_x \mathbf{\hat{x}} + \bar{A}_y \mathbf{\hat{y}} + \bar{A}_z \mathbf{\hat{z}}$: is this also a valid expression?
  • To conclude I think the main problem lies in the question whether a Lorentz transformation changes the basis in which 3-vectors (e.g. magnetic or electric field) or the 3-vector inside a 4-vector (e.g. 4-potential) are defined. And is the basis of $\mathbf{R}^3$ different from the basis of Minkowski space?

I hope someone can explain this rigorously by using equations.

Qmechanic

5 Answers

2

There are different ways to think about these things, some more general than others. Since the question is calling for conceptual clarification, I'll offer a more general perspective. It is more general in two ways:

  • It works for an arbitrary number of dimensions.

  • It works for arbitrary coordinate transformations, not just Lorentz transformations.

Even though the goal is to deduce how $\mathbf{E}$ and $\mathbf{B}$ transform, it's easier to first deduce how $$ F_{ab}\equiv \partial_a A_b-\partial_b A_a \tag{1} $$ transforms, with coordinates denoted $$ x^0,\ x^1,\ x^2,\ ... \tag{2} $$ and the corresponding partial derivatives $$ \partial_a\equiv\frac{\partial}{\partial x^a}. \tag{3} $$ Then we can specialize to 4-d spacetime (3-d space) using $$ E_k\propto F_{0k} \hskip1cm \text{ for }k\geq 1 \tag{4a} $$ and $$ B_1 = F_{23} \hskip2cm B_2 = F_{31} \hskip2cm B_3 = F_{12}. \tag{4b} $$ (I'm using units in which the speed of light is $1$. You'll need to check the sign conventions for $\mathbf{E}$ and $\mathbf{B}$, but that's a separate issue. In fact, one of the advantages of formulating things this way is that it separates the sign-convention issue from the issue of how things transform.) Equation (4b) is specific to 3-d space, but equation (1) makes sense in any number of dimensions. Equation (1) is also independent of the spacetime metric, so there's no need to distinguish between space and time when working with equation (1). That distinction only becomes important when we want to separate (1) into electric and magnetic components, as in (4).

Here's a more general perspective on what coordinates are:

  • A coordinate system is any smooth assignment of numbers (the coordinates) to the points of spacetime so that each point is labelled by a unique combination of numbers.

  • Coordinates are not the components of a vector, and a general coordinate transformation is not a rigid transformation. A coordinate transformation does not rely on concepts like "basis" or "axis." (There are special contexts in which those concepts are perfectly valid; but I'm offering a more general perspective here.)

A coordinate transformation expresses the old coordinates $x$ as functions of new coordinates $\tilde x$, or conversely. (The transformation must be invertible.) Given those functions and using the abbreviation $$ \tilde\partial_a\equiv\frac{\partial}{\partial \tilde x^a}, \tag{5} $$ we already know from calculus how the partial derivatives in the two coordinate systems are related to each other (the chain rule): $$ \partial_a = \sum_b\frac{\partial \tilde x^b}{\partial x^a}\tilde\partial_b \hskip2cm \tilde\partial_b = \sum_a\frac{\partial x^a}{\partial \tilde x^b}\partial_a \tag{6} $$ For equation (1) to make sense, $A_a$ should transform the same way that $\partial_a$ transforms, so $$ A_a(x) = \sum_b\frac{\partial \tilde x^b}{\partial x^a}\tilde A_b(\tilde x). \tag{7} $$ Equation (7) says that $A_a$ are the components of a single-subscript tensor. The transformation rule for $F_{ab}$ (and therefore for $\mathbf{E}$ and $\mathbf{B}$) can be inferred from equations (6)-(7), and the result is simpler than the derivation. Thanks to the antisymmetry of the right-hand side of (1), the messy parts cancel each other, leaving the relatively simple result $$ F_{ab}(x) = \sum_{c,d} \frac{\partial \tilde x^c}{\partial x^a} \frac{\partial \tilde x^d}{\partial x^b} \tilde F_{cd}(\tilde x). \tag{8} $$ Equation (8) says that $F_{ab}$ are the components of a two-subscript tensor.

So far, this has all been for an arbitrary coordinate transformation. To specialize to Lorentz transformations, we first need to define what that means: a Lorentz transformation is a linear coordinate transformation for which the relationship $$ (dx^0)^2-\sum_{k>0}(dx^k)^2 = (d\tilde x^0)^2-\sum_{k>0}(d\tilde x^k)^2 \tag{9} $$ holds. The fact that it's a linear transformation implies that the quantities $$ \Lambda^b_a\equiv \frac{\partial \tilde x^b}{\partial x^a} \tag{10} $$ are constants. One example of such a transformation is the boost \begin{align} \tilde x^0 &= x^0\cosh\theta + x^1\sinh\theta \\ \tilde x^1 &= x^1\cosh\theta + x^0\sinh\theta \\ \tilde x^k &= x^k \hskip1cm \text{ for }k\geq 2 \tag{11} \end{align} for any constant $\theta$. Then the components of $\Lambda$ depend on the constant $\theta$ but not on the coordinates (neither $x$ nor $\tilde x$). Given any such transformation, equation (8) tells us how $F_{ab}$ transforms, and then equations (4) tell us how $\mathbf{E}$ and $\mathbf{B}$ transform, with (4b) being specific to 3-d space.
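As a rough numerical check of the recipe above, here is a minimal Python sketch that applies equation (8) to the boost (11) and compares the result with the 3-vector rules quoted in the question. The sign convention is an assumption made only for this sketch ($F_{0k}=E_k$, $F_{12}=-B_3$ and cyclic, $c=1$), as are all the function names; with a different sign convention the bookkeeping changes but the method does not. With these conventions, the boost (11) corresponds to relative velocity $v=-\tanh\theta$ along $x^1$.

```python
import math

def field_tensor(E, B):
    """Components F_ab with c = 1. Assumed convention (not fixed by the text):
    F_0k = E_k, F_12 = -B_z, F_23 = -B_x, F_31 = -B_y."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [
        [0.0,  Ex,  Ey,  Ez],
        [-Ex, 0.0, -Bz,  By],
        [-Ey,  Bz, 0.0, -Bx],
        [-Ez, -By,  Bx, 0.0],
    ]

def fields_from_tensor(F):
    """Read E and B back out of F_ab using the same assumed convention."""
    E = (F[0][1], F[0][2], F[0][3])
    B = (-F[2][3], -F[3][1], -F[1][2])
    return E, B

def inverse_jacobian(theta):
    """J[a][c] = d x^a / d x~^c for the boost in equation (11)."""
    ch, sh = math.cosh(theta), math.sinh(theta)
    return [
        [ch, -sh, 0.0, 0.0],
        [-sh, ch, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def transform_tensor(F, J):
    """Equation (8) solved for the new components:
    F~_cd = sum_{a,b} J[a][c] * J[b][d] * F_ab."""
    return [
        [sum(J[a][c] * J[b][d] * F[a][b] for a in range(4) for b in range(4))
         for d in range(4)]
        for c in range(4)
    ]

def boost_fields_rule(E, B, v):
    """3-vector rules from the question for a boost with velocity v along x (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    E_new = (Ex, g * (Ey - v * Bz), g * (Ez + v * By))
    B_new = (Bx, g * (By + v * Ez), g * (Bz - v * Ey))
    return E_new, B_new

# Arbitrary illustrative numbers.
v = 0.6
theta = -math.atanh(v)  # boost (11) has velocity v = -tanh(theta)
E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)

F_new = transform_tensor(field_tensor(E, B), inverse_jacobian(theta))
E_tensor, B_tensor = fields_from_tensor(F_new)
E_rule, B_rule = boost_fields_rule(E, B, v)

print("E from tensor:", E_tensor, " E from rule:", E_rule)
print("B from tensor:", B_tensor, " B from rule:", B_rule)
```

The two routes should agree up to rounding. Feeding in the question's example, $\mathbf{E}=E_0\mathbf{\hat x}$ with $\mathbf{B}=\mathbf{0}$, leaves the components unchanged, which is the component-level content of $\mathbf{\bar E}_\parallel=\mathbf{E}_\parallel$.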

All of this was expressed in terms of the quantities $F_{ab}$, which are the components of a tensor field. An even more satisfying formulation is one that doesn't refer to components or coordinates at all. Such a coordinate-free formulation does exist [1]. It gives a solid formal foundation for the heuristic reasoning that was used above, and it provides a more-general version of expressions like $\mathbf{A}=\sum_k A_k \mathbf{\hat x}^k$ that were written in the OP. Here's a preview that doesn't really convey the concepts, but it does provide some tips that can be used to help guide an on-line search: A scalar field is a function from the smooth manifold ("spacetime") to the set of real numbers. A vector field is a special kind of map from the set of scalar fields to itself; it is special because it is a derivation. This means that in a coordinate representation, a vector field is a combination of partial derivatives, like this: $v(x) = \sum_a v^a(x)\partial_a$. The functions $v^a(x)$ are the components of the vector field. Under a coordinate transformation, the components $v^a$ transform in a way that enforces this equation: $$ \sum_a v^a(x)\partial_a = \sum_a \tilde v^a(\tilde x)\tilde \partial_a. $$ Together with the identities (6), this tells us how the components $v^a$ transform. The quantities $F_{ab}$ are the components of a two-form, which in a coordinate representation looks like $F(x) = \sum_{a,b} F_{ab}(x)dx^a\wedge dx^b$, where the objects $dx^a$ are coordinate differentials, which are one-forms. (Each coordinate is a scalar field, and the differential of any scalar field is a one-form, and a one-form is a special kind of map from the set of vector fields to the set of scalar fields, involving odd-looking identities like $dx^a(\partial_b)=\delta^a_b$.) 
Under a coordinate transformation, the components $F_{ab}$ transform in a way that enforces this equation: $$ \sum_{a,b}F_{ab}(x)dx^a\wedge dx^b = \sum_{a,b}\tilde F_{ab}(\tilde x)d\tilde x^a\wedge d\tilde x^b, $$ from which equation (8) can be deduced. And so on.


Reference:

[1] Lee (2003), Introduction to Smooth Manifolds, Springer

Chiral Anomaly
1

My problem with these equations is as follows. There is an equality of vectors at each line visible. This means that if we have for instance $\mathbf{E}_\parallel = E_0 \mathbf{\hat{x}}$, $\mathbf{v} = v\mathbf{\hat{x}}$ and $\mathbf{B} = \mathbf{0}$ then $\bar{\mathbf{E}}_\parallel = \mathbf{E}_\parallel = E_0 \mathbf{\hat{x}}$ according to the definition. My problem is that $\mathbf{\bar{E}}_\parallel$ is expressed in terms of basis vector $\mathbf{\hat{x}}$ from frame $S$ and not a new basis vector say $\mathbf{\bar{\hat{x}}}$ in frame $\bar{S}$.

The way this is customarily notated in terms of a set of relations among 3-vectors is convenient and concise but kind of misleading, because what is really being stated is a set of relations among components of a 4x4 tensor. It is not true that $\bar{\mathbf{E}}_\parallel = \mathbf{E}_\parallel = E_0 \mathbf{\hat{x}}$, and I think this is a nice example of what's fundamentally bad about the notation, because it violates the transitivity of the = sign. When we write $\bar{\mathbf{E}}_\parallel = \mathbf{E}_\parallel$, what we mean is that in the electromagnetic field tensor, $F^{x't'}=F^{xt}$, i.e., these components of the tensor are equal in the two different bases. But for an observer in $\bar{S}$, $\bar{\mathbf{E}}$ is by definition a purely spatial vector (i.e., it lies in a plane of simultaneity), while to that observer $\mathbf{\hat{x}}$ has a nonzero time component. Therefore it cannot be true that $\bar{\mathbf{E}}_\parallel = E_0 \mathbf{\hat{x}}$.

I have the feeling that 4-vectors do not change under a Lorentz transformation, only the components and the basis in which they are expressed (so the basis representation)[...]

Yes, this is the way most physicists and mathematicians think about it these days. (There is another point of view dating to ca. the 19th century and used by people like (IIRC) Sylvester. This WP article gives a description from this point of view.)

[...]and for 3-vectors, they change as a whole, meaning that the basis vectors can stay the same although the components change. But I am not sure about this.

The question of whether a vector changes as a whole under a change of coordinates is not one that automatically has a well-defined answer. People like Sylvester would probably have had a different way of thinking about this than people today. The modern trend is to try to make everything obey a standard set of tensor transformation rules, and to use notations that can't even express the question of whether a vector (as opposed to its components) changes when you change coordinates. For someone who is used to the modern language, attitudes, and notations, "3-vector" just means "ordered triple of components," i.e., unlike a true vector, a 3-vector is an object that has no meaning or existence that transcends its components.

It's not possible for the basis for the 3-vectors to stay the same, because they have to be purely spatial vectors for each observer, with no time component. By definition, $\mathbf{\hat{x}}$ means a unit vector that is parallel to an infinitesimal displacement in which only $x$ changes (or, if you want to be fancier, parallel to the directional derivative in the $x$ direction, see Wald, p. 15). By the same definition, $\mathbf{\hat{x}}'$ is parallel to an infinitesimal displacement in which only $x'$ changes, and this is a different direction in spacetime.
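To make this concrete, here is a small Python sketch under stated assumptions: a standard boost with velocity $v$ along $x$, units with $c=1$, and metric signature $(+,-,-,-)$. The function names are hypothetical and exist only for this illustration. It writes the boosted basis vectors in terms of the original frame's basis and checks that $\mathbf{\hat{x}}'$ picks up a time component while staying orthogonal to the boosted time direction.

```python
import math

def minkowski_dot(a, b):
    """Minkowski inner product with assumed signature (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def boosted_basis_in_S(v):
    """Components (t, x, y, z), in the basis of S, of the unit vectors along
    t' and x' for the boost t' = g (t - v x), x' = g (x - v t).
    These follow from the chain rule: d/dx' = (dt/dx') d/dt + (dx/dx') d/dx."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    e_t_prime = (g, g * v, 0.0, 0.0)
    e_x_prime = (g * v, g, 0.0, 0.0)
    return e_t_prime, e_x_prime

def x_hat_in_boosted_basis(v):
    """Components (t', x', y', z') of the original x-hat in the boosted basis."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return (-g * v, g, 0.0, 0.0)

v = 0.6  # arbitrary illustrative velocity
e_t_prime, e_x_prime = boosted_basis_in_S(v)

print("x-hat' in S components:", e_x_prime)
print("x-hat in S' components:", x_hat_in_boosted_basis(v))
print("x-hat' . x-hat' =", minkowski_dot(e_x_prime, e_x_prime))
print("t-hat' . x-hat' =", minkowski_dot(e_t_prime, e_x_prime))
```

In this sketch $\mathbf{\hat{x}}'$ is still a unit spacelike vector orthogonal to the boosted time direction, but it is not the same spacetime direction as $\mathbf{\hat{x}}$, and conversely $\mathbf{\hat{x}}$ has a nonzero time component in the boosted basis.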

Coming back to the 4-potential. The components are also called $\bar{A}_x$, $\bar{A}_y$, $\bar{A}_z$, just like one can write $\bar{E}_x$, $\bar{E}_y$, $\bar{E}_z$ after a boost. Now the problem is if we write $\mathbf{\bar{E}} = \bar{E}^j \mathbf{e}_j$: is this basis also just $\{\mathbf{\hat{x}},\mathbf{\hat{y}},\mathbf{\hat{z}}\}$? And for $\mathbf{\bar{A}} = \bar{A}_x \mathbf{\hat{x}} + \bar{A}_y \mathbf{\hat{y}} + \bar{A}_z \mathbf{\hat{z}}$: is this also a valid expression?

You can do this kind of thing if you like, but in general we like 4-vectors better than 3-vectors, because 4-vectors have consistent transformation laws. Therefore when you have a nice, well-behaved 4-vector like $A$, you don't normally want to mess around making a 3-vector out of its spatial components.

To conclude I think the main problem lies in the question whether a Lorentz transformation changes the basis in which 3-vectors (e.g. magnetic or electric field) or the 3-vector inside a 4-vector (e.g. 4-potential) are defined.

Yes, as described above, it changes the basis in which the 3-vectors are defined, because the 3-vectors are just defined as triples of components of some tensorial object (in this case a 4x4 tensor), which is itself expressed in a basis that has changed.

And is the basis of $\mathbf{R}^3$ different from the basis of Minkowski space?

I think a better way of saying this is that if a particular observer is constructing a set of Minkowski coordinates, they will choose coordinates in which $(x,y,z)$ are purely spatial coordinates, so that their basis vectors span a 3-surface of simultaneity.

The best way to retain your sanity and avoid confusion in this stuff is to organize all your thoughts in terms of the transformation properties of tensors. Then when you come to something that is expressed in a different way, such as the transformation laws for the electromagnetic field's 3-vectors, just think of it as a shorthand for the actual underlying operations on tensors.

0

I think your confusion comes from the distinction between active and passive transformations, a topic that was already well explained in the following post.

When a coordinate transformation is performed there are two possibilities to interpret it:

  • If the transformation acts only on the coordinate axes, we speak of the passive role (right image) of the transformation.
  • In the active point of view (left image), the transformation is considered to change the vector directly.

[Figure: active transformation (left) and passive transformation (right)]

In the formulas for the parallel and perpendicular electric and magnetic fields, the active point of view is the one being used, as you may have inferred; it is also the most common convention in textbooks.

In a slightly more formal way, if you have a transformation $T$ acting on a vector $\mathbf{v}=v_i\mathbf{e}^i$ in $\mathbb{R}^3$, the passive point of view gives $\mathbf{v'}=T\mathbf{v}=v_iT\mathbf{e}^i=v_i\mathbf{e'}^i$, while the active one gives $\mathbf{v'}=T\mathbf{v}=Tv_i\mathbf{e}^i=v'_i\mathbf{e}^i$.
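A minimal Python sketch of the two viewpoints follows, using a planar rotation as a stand-in for $T$ (an assumption made purely for illustration; a boost would need the Minkowski setting). It also assumes the common textbook convention in which a passive transformation leaves the geometric vector alone and re-expresses it in the rotated basis, so the components change by the inverse rotation. All function names are hypothetical.

```python
import math

def rotate(vec, angle):
    """Rotate a 2-component tuple counterclockwise by `angle` (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = vec
    return (c * x - s * y, s * x + c * y)

def active_components(components, angle):
    """Active view: the vector itself is rotated; the basis stays fixed."""
    return rotate(components, angle)

def passive_components(components, angle):
    """Passive view (assumed convention): the basis is rotated by `angle`,
    so the components of the unchanged vector rotate by -angle."""
    return rotate(components, -angle)

def assemble(components, basis):
    """Rebuild a vector from its components and a pair of basis vectors,
    all expressed in the original fixed basis."""
    (e1, e2) = basis
    return (components[0] * e1[0] + components[1] * e2[0],
            components[0] * e1[1] + components[1] * e2[1])

angle = math.pi / 2
v = (1.0, 0.0)

v_active = active_components(v, angle)           # a new vector, old basis
v_passive = passive_components(v, angle)         # old vector, new components
new_basis = (rotate((1.0, 0.0), angle), rotate((0.0, 1.0), angle))
v_rebuilt = assemble(v_passive, new_basis)       # should reproduce v

print("active:", v_active)
print("passive components:", v_passive)
print("rebuilt from rotated basis:", v_rebuilt)
```

In the passive case, combining the new components with the rotated basis vectors recovers the original vector, whereas in the active case the vector itself has moved.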

0

In special relativity we can write statements such as $\bar{\bf E}_\parallel = {\bf E}_\parallel$ without ambiguity because there is a universal way to agree on directions in space (it is not so simple in general relativity). Whenever we consider a 3-vector $\bf w$ at an event, we can, for the purpose of agreeing on its direction between one frame and another, imagine it as the spatial part of a 4-vector lying in the plane of simultaneity of the frame of reference in which the 3-vector is specified. If we now change to another frame, and obtain another 3-vector $\bar{\bf w}$, then we can similarly construct a 4-vector lying in the plane of simultaneity of the new frame, with spatial part equal to $\bar{\bf w}$. Each of these 4-vectors can then be projected onto the plane of simultaneity of the other, and thus compared. In this way one arrives at a common sense of spatial direction between any two inertial frames (whether boosted, or rotated, or both, with respect to one another), and the concept of a 3-vector can therefore be employed without the need to deconstruct it into components and basis vectors.

I think your question is quite deep and I had to think carefully before answering it (and I am supposed to be some sort of expert here ...) so it may be that others will give better answers; I would be pleased if they did and then I will learn something myself.

Andrew Steane
-1

The unit vectors $\hat x$, $\hat y$, $\hat z$ are identical in two frames related by a boost along one of them.

my2cts