
I study mathematics but I have a deep interest in physics as well. I have taken a course in smooth manifolds where a tensor is defined as an alternating multilinear function. Recently I have learned about electrodynamics and how Maxwell's equations can be written in relativistic form. We introduce the "(anti-symmetric) 2-tensor" $F_{\mu\nu}$, which, from what I have understood so far, has the benefit that it allows us to easily calculate how the fields transform under arbitrary Lorentz transformations. (As an aside: is there really any other benefit?) I've understood how $F_{\mu\nu}$ is derived, but I've been stuck on why/how physicists call this object a tensor.

How can an object such as $F_{\mu\nu}$ be seen as a bilinear function?

Qmechanic
CBBAM

6 Answers


where a tensor is defined as an alternating multilinear function

I think you may be confusing the general concept of tensors with the specific case of differential forms, which indeed by definition are alternating. If you drop the "alternating", the definition becomes completely correct.

Now, perhaps the confusion arises from the fact that in physics we may be a bit "sloppy" at times and represent something like the electromagnetic Faraday tensor as a matrix:

\begin{equation} \left\{ F^{\mu \nu} \right\} = \begin{pmatrix} 0 & -E^1 & -E^2 & -E^3 \\ E^1 & 0 & -B^3 & B^2 \\ E^2 & B^3 & 0 & -B^1 \\ E^3 & -B^2 & B^1 & 0 \end{pmatrix} \end{equation}

In fact, this isn't a great way to represent a bilinear function. A matrix is a reasonable representation for a $(1,1)$ tensor, since it maps a vector (which is a $(1,0)$ tensor) to another vector. Suppose that $A$ is a $(1,1)$ tensor; then:

$$A^{i}_jV^j = U^i$$

However, a matrix isn't a very good way to represent a bilinear object like $F^{\mu\nu}$. Depending on where its indices sit, such an object maps a co-vector to a vector, a vector to a co-vector, or a pair (two vectors or two co-vectors) to a scalar; a $(2,0)$ tensor like $F^{\mu\nu}$ eats two co-vectors:

$$F^{\mu\nu}V_{\mu}U_{\nu} = r$$

where for example we may assume $r\in\mathbb{R}$. I apologize for the non-physicality of the example, this is only for illustration purposes :)
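As a quick sanity check of the above, here is a minimal numpy sketch (all components are made up purely for illustration) of how one array of components acts as a bilinear function via index contraction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up antisymmetric components standing in for F^{mu nu}
F = rng.standard_normal((4, 4))
F = F - F.T

V, U, W = (rng.standard_normal(4) for _ in range(3))  # covector components

# r = F^{mu nu} V_mu U_nu : contracting both slots gives a scalar
r = np.einsum("mn,m,n->", F, V, U)

# Bilinearity in the first slot: F(aV + bW, U) = a F(V, U) + b F(W, U)
a, b = 2.0, -3.0
lhs = np.einsum("mn,m,n->", F, a * V + b * W, U)
rhs = a * r + b * np.einsum("mn,m,n->", F, W, U)
assert np.isclose(lhs, rhs)

# Contracting only one slot returns the components of a vector
vec = np.einsum("mn,m->n", F, V)
assert vec.shape == (4,)
```

The `einsum` strings are just the index notation written out: repeated letters are summed, and whatever letters remain are the free indices of the result.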

(You can find much more about this index notation if you're interested and how it's related to the more straightforward notation of a multilinear function. Suffice to say that this is just a more economical notation for familiar operations from (multi)linear algebra).

But apart from some differences in notation, tensors in physics are exactly the same objects as they are in math: multilinear maps. Perhaps most importantly, in physics those multilinear maps often depend on physically significant parameters, such as position in spacetime. A good example of that is the metric tensor in relativity. At a point on the spacetime manifold, the metric tensor acts as a multilinear map, yet we still identify it as the same tensor at another point on the manifold, despite this dependence. This is because the metric tensor is properly defined as a tensor field on the manifold, that is, a section of a tensor bundle (a special case of the fiber bundle construction, which you may be familiar with).
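Regarding the aside in the question (what the tensor form buys you): a hedged numpy sketch, with made-up field values and boost speed, showing that a single congruence $F' = \Lambda F \Lambda^{\mathsf T}$ transforms all six field components at once, reproducing the textbook boost rules. The conventions are those of the matrix above: $F^{0i} = -E^i$, $F^{12} = -B^3$, metric $\mathrm{diag}(+,-,-,-)$, $c=1$.

```python
import numpy as np

def F_upper(E, B):
    # Contravariant F^{mu nu} in the convention of the matrix above
    Ex, Ey, Ez = E; Bx, By, Bz = B
    return np.array([[0.0, -Ex, -Ey, -Ez],
                     [Ex,  0., -Bz,  By],
                     [Ey,  Bz,  0., -Bx],
                     [Ez, -By,  Bx,  0.]])

E = np.array([0.3, -1.2, 0.5])        # illustrative field values
B = np.array([0.8, 0.1, -0.4])
beta = 0.6
gamma = 1 / np.sqrt(1 - beta**2)

# Boost along x: Lambda^mu_alpha
L = np.array([[gamma, -gamma * beta, 0, 0],
              [-gamma * beta, gamma, 0, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 1.0]])

Fp = L @ F_upper(E, B) @ L.T          # F'^{mu nu} = L^mu_a L^nu_b F^{ab}

# Read off the transformed fields and compare with the textbook boost rules
Ep = np.array([-Fp[0, 1], -Fp[0, 2], -Fp[0, 3]])
Bp = np.array([Fp[3, 2], Fp[1, 3], Fp[2, 1]])
assert np.allclose(Ep, [E[0], gamma * (E[1] - beta * B[2]), gamma * (E[2] + beta * B[1])])
assert np.allclose(Bp, [B[0], gamma * (B[1] + beta * E[2]), gamma * (B[2] - beta * E[1])])
```

Because $F$ carries two indices, one matrix congruence replaces the six separate transformation rules for $E_\parallel, E_\perp, B_\parallel, B_\perp$.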

Amit

How can an object such as $F_{\mu\nu}$ be seen as a bilinear function?

By realising that $F_{\mu\nu}$ are the components of a two-form $$F=\frac12F_{\mu\nu} \text{d}x^\mu \wedge \text{d} x^\nu\,,$$ which is an element of the antisymmetric (wedge) product of the cotangent space with itself, and as such is a bilinear function taking two vectors (elements of the tangent space) and returning a number.
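To make this concrete, a small numpy sketch (the components are arbitrary) verifying that the wedge expansion above really is the antisymmetric bilinear function $F(u,v) = F_{\mu\nu}u^\mu v^\nu$:

```python
import numpy as np

rng = np.random.default_rng(1)
F = rng.standard_normal((4, 4))
F = F - F.T                       # arbitrary antisymmetric components F_{mu nu}
u, v = rng.standard_normal(4), rng.standard_normal(4)

# F(u, v) = F_{mu nu} u^mu v^nu
Fuv = np.einsum("mn,m,n->", F, u, v)

# The wedge expansion: (1/2) F_{mu nu} (dx^mu ^ dx^nu)(u, v)
#                    = (1/2) F_{mu nu} (u^mu v^nu - u^nu v^mu)
wedge = 0.5 * (np.einsum("mn,m,n->", F, u, v) - np.einsum("mn,n,m->", F, u, v))
assert np.isclose(Fuv, wedge)

# Antisymmetric in its two vector arguments, so in particular F(u, u) = 0
assert np.isclose(Fuv, -np.einsum("mn,m,n->", F, v, u))
assert np.isclose(np.einsum("mn,m,n->", F, u, u), 0)
```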

Toffomat

Most physicists working in general relativity call the components of a tensor "a tensor". They then impose requirements on how these components transform under a change of basis, which guarantees that they are indeed the components of an (abstract) tensor.

Frederic Schuller's lectures on GR give a very nice presentation from a mathematical perspective, but he makes the connections to the physicist's notation/shorthand at a number of places, e.g.,

"and now comes something everybody has been waiting for who knew tensors before, namely components of tensors"

and

"... so this makes GR accessible to the masses"

Ben H

Mathematicians tend to favour the intrinsic definition of a tensor, which defines the tensor as a multilinear function or a member of the tensor product of two or more vector spaces.

Physicists tend to favour the extrinsic definition of a tensor, in which a tensor is an array of components which transform in a specific way under a change of co-ordinate system or reference frame. This is a more concrete definition, whereas the mathematician's definition is more abstract.

Having said that, the mathematician's definition and the physicist's definition are equivalent, and lead to the same type of object with the same properties.

gandalf61

Let me give an example of a rather roundabout derivation which uses this fact.

Suppose that instead of doing the usual theorist's $c=1$ trick, I wish to fully work out a strange unit system where $$ \begin{align} \nabla \cdot E&=c\rho,&\nabla\times E &=-\mathring B\\ \nabla\cdot B&=0,&\nabla\times B&=J + \mathring E \end{align} $$ where $\mathring A = \dot A/c.$ One can check that the units align properly, with $\nabla\cdot J+ c\mathring\rho=0$ being a valid continuity equation, and taking the curl of the curl one finds e.g. $$\overset{\,\scriptsize\circ\circ}B -\nabla^2 B=\square B=\nabla\times J,$$ which has the right d'Alembert operator with the right wave velocity. So it looks very promising; it is just not Gaussian/CGS or SI.

Well, there is a lot of work and cross-checking in rebuilding all of your knowledge from the ground up, and relativity would be a good guide. We start with the standard definition of vector potential, $\nabla\cdot B=0$ implies $B =\nabla\times A$ for some $A$, which means $\nabla\times(E+\mathring A)=0$ which we use to say $E = -\mathring A - \nabla\varphi$ for some $\varphi$.

Defining $\lambda = \nabla\cdot A + \mathring\varphi$, the other two Maxwell equations say $$\square\varphi = c\rho +\mathring\lambda,\\ \square A= J -\nabla\lambda,$$ and our gauge freedom means that mapping $A\mapsto A +\nabla \psi$ while $\varphi\mapsto \varphi-\mathring \psi$ preserves $E, B$ while mapping $\lambda\mapsto \lambda -\square\psi$; solving $\square\psi=\lambda$ forces $\lambda \mapsto 0$, the Lorenz gauge. Then since $(c\rho, J) = J^\bullet$ is a 4-vector and $\square$ is covariant, we find that the appropriate 4-potential is $A^\bullet=(\varphi, A)$, with no division or multiplication by $c$. With the $({+}\,{-}\,{-}\,{-})$ metric, $A_\bullet = (\varphi, -A)$, and we are ready to form the field tensor.
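The scalar equation above can be cross-checked symbolically. A short sympy sketch (the field names `Ax`, `phi`, etc. are placeholders) verifying that, with $E=-\mathring A-\nabla\varphi$, Gauss's law $\nabla\cdot E = c\rho$ is identically $\square\varphi = c\rho + \mathring\lambda$ for any potentials:

```python
import sympy as sp

x, y, z, t, c = sp.symbols("x y z t c", positive=True)
Ax, Ay, Az, phi = (sp.Function(n)(x, y, z, t) for n in ("Ax", "Ay", "Az", "phi"))

ring = lambda f: sp.diff(f, t) / c            # X-ring = (dX/dt)/c
lap = lambda f: sp.diff(f, x, 2) + sp.diff(f, y, 2) + sp.diff(f, z, 2)
box = lambda f: ring(ring(f)) - lap(f)        # d'Alembertian in these units

# E = -A-ring - grad(phi), component by component
E = [-ring(A) - sp.diff(phi, w) for A, w in ((Ax, x), (Ay, y), (Az, z))]

# Gauss's law in these units reads div E = c*rho, so read off c*rho:
c_rho = sp.diff(E[0], x) + sp.diff(E[1], y) + sp.diff(E[2], z)

lam = sp.diff(Ax, x) + sp.diff(Ay, y) + sp.diff(Az, z) + ring(phi)

# box(phi) - c*rho - lambda-ring vanishes identically, for ANY A and phi
assert sp.simplify(box(phi) - c_rho - ring(lam)) == 0
```

Since this holds as an identity of derivatives, no gauge choice is needed for the check; the Lorenz gauge then simply sets $\lambda=0$.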

We have $F_{\mu\nu}=\partial_\mu A_\nu -\partial_\nu A_\mu$, which means $$F_{\bullet\bullet}=\begin{bmatrix}0&E_x&E_y&E_z\\ -E_x&0&-B_z&B_y\\ -E_y&B_z&0&-B_x\\ -E_z&-B_y&B_x&0\end{bmatrix}.$$ Now, you ask in what sense this tensor is a bilinear function. There are two answers to that. One is simply: it is a matrix, so obviously it is a bilinear function. That is, if we take two 4-vectors, write them as raw linear-algebra column vectors $\mathbf u,\mathbf v$, and regard this matrix as $\mathbf M$, then it implements the bilinear function $$F(u, v) = \mathbf u^T \mathbf M \mathbf v,$$ which is also a Lorentz-invariant scalar. Note that this is a raw transpose of the column; no components are negated here, because those transformations are already absorbed into the matrix above.

The second answer is a bit more physical: this object should connect a 4-velocity to a 4-force. The 4-force can be regarded as a covector in the sense that it takes a small 4-displacement and tells you how much work is done over that displacement. This leads to a slightly redundant situation, because the displacement we want is in the direction of the 4-velocity of the charged particle we are tracking, so when all is said and done we actually want $F(v, v)$ for the same 4-velocity, but since $F$ is antisymmetric that is inevitably zero! In 4D the "length" of the 4-velocity is fixed, so no "4-work" is ever truly done. Nevertheless we can raise the index of the 4-force and think of it as a change in 4-momentum per unit proper time; the change has to be "perpendicular" to the 4-momentum per the above, but that doesn't make it zero.

So let's do that. We find that we want a force $${\mathrm dp^\bullet\over\mathrm d\tau} \propto \begin{bmatrix}1&&&\\ &-1&&\\ &&-1&\\ &&&-1\end{bmatrix} \begin{bmatrix}0&E_x&E_y&E_z\\ -E_x&0&-B_z&B_y\\ -E_y&B_z&0&-B_x\\ -E_z&-B_y&B_x&0\end{bmatrix} \begin{bmatrix}\gamma c\\ \gamma v_x\\ \gamma v_y\\ \gamma v_z\end{bmatrix},$$ and we know that we want for example ${\mathrm dp^x\over\mathrm dt}=\gamma^{-1} {\mathrm dp^x\over\mathrm d\tau}=qE_x$ for a purely electrostatic case. Thus we have $$ {\mathrm dp^\mu\over\mathrm d\tau}=\frac{q}{c}\eta^{\mu\nu}F_{\nu\sigma} v^\sigma,$$ and we thus come to the final discovery that the proper version of the Lorentz force law in these units is exactly the same as in CGS, $$ F = q \left( E +\frac vc\times B\right).$$ There's probably an easier way to see that, but it's a nice application of the bilinear properties of the electromagnetic field tensor.
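A minimal numpy check of that contraction, with made-up field values. I use the sign convention $F_{0i}=+E_i$, $F_{12}=-B_z$ with metric $\mathrm{diag}(+,-,-,-)$; references differ by an overall sign of $F$, which just flips the sign in front of the force-law contraction:

```python
import numpy as np

rng = np.random.default_rng(2)
c, q = 3.0, 1.5                               # illustrative values only
E = rng.standard_normal(3)
B = rng.standard_normal(3)
v3 = 0.3 * c * rng.uniform(-1, 1, 3)          # some subluminal 3-velocity

Ex, Ey, Ez = E
Bx, By, Bz = B
F = np.array([[0.0,  Ex,  Ey,  Ez],
              [-Ex,  0., -Bz,  By],
              [-Ey,  Bz,  0., -Bx],
              [-Ez, -By,  Bx,  0.]])          # covariant F_{mu nu}
eta = np.diag([1.0, -1.0, -1.0, -1.0])

gamma = 1.0 / np.sqrt(1 - (v3 @ v3) / c**2)
u = gamma * np.array([c, *v3])                # 4-velocity v^sigma

# dp^mu/dtau = (q/c) eta^{mu nu} F_{nu sigma} v^sigma
dp_dtau = (q / c) * eta @ F @ u

# Spatial part reproduces q*gamma*(E + v/c x B), i.e. the Lorentz force
assert np.allclose(dp_dtau[1:], q * gamma * (E + np.cross(v3, B) / c))

# Antisymmetry: F(u, u) = 0, the "no 4-work" statement above
assert np.isclose(u @ F @ u, 0)
```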

Another application of this fact is that $E^2-B^2=\frac12 F_{\mu\nu}F^{\nu\mu}$ is a Lorentz-invariant scalar field, and, exploiting a hidden symmetry of such antisymmetric tensors (the Hodge star), so is $E\cdot B \propto F_{\mu\nu} (\star F)^{\nu\mu}$. Both of these results come from the coordinate invariance of the trace, which can be stated slightly more geometrically: any $[m, n]$-tensor can be expressed as a finite sum of outer products of vectors with $[m-1, n]$-tensors, or of covectors with $[m, n-1]$-tensors. The procedure to "contract" an $[m, n]$-tensor is then: choose two indices to contract, express the tensor as a finite sum of outer products of $[m-1,n-1]$-tensors with a vector and a covector, and feed each vector to its covector to create invariant scalars. A scalar times a tensor is a tensor, and a sum of tensors of the same shape is a tensor, so there you go.
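Both invariants can be checked numerically; a numpy sketch with random field values (and one common sign convention for $F_{\mu\nu}$; both invariants are quadratic in $F$, so an overall sign of $F$ drops out):

```python
import itertools
import numpy as np

def F_cov(E, B):
    # Covariant F_{mu nu}, one common sign convention
    Ex, Ey, Ez = E; Bx, By, Bz = B
    return np.array([[0.0,  Ex,  Ey,  Ez],
                     [-Ex,  0., -Bz,  By],
                     [-Ey,  Bz,  0., -Bx],
                     [-Ez, -By,  Bx,  0.]])

eps = np.zeros((4, 4, 4, 4))          # the 4D Levi-Civita symbol
for p in itertools.permutations(range(4)):
    inversions = sum(p[i] > p[j] for i in range(4) for j in range(i + 1, 4))
    eps[p] = (-1) ** inversions

eta = np.diag([1.0, -1.0, -1.0, -1.0])
rng = np.random.default_rng(3)
E1, B1, E2, B2 = (rng.standard_normal(3) for _ in range(4))

s = []
for E, B in ((E1, B1), (E2, B2)):
    F = F_cov(E, B)
    Fup = eta @ F @ eta                              # F^{mu nu}
    # First invariant: (1/2) F_{mu nu} F^{nu mu} = E^2 - B^2
    assert np.isclose(0.5 * np.einsum("mn,nm->", F, Fup), E @ E - B @ B)
    dual = 0.5 * np.einsum("mnrs,rs->mn", eps, F)    # (*F)^{mu nu}, up to sign convention
    s.append(np.einsum("mn,nm->", F, dual))

# F_{mn}(*F)^{nm} is proportional to E.B with one fixed constant
# (cross-multiplied to avoid dividing by a possibly small E.B)
assert np.isclose(s[0] * (E2 @ B2), s[1] * (E1 @ B1))
```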

CR Drost

In the mathematician's definition, any array of numbers is a vector. This is fine because such arrays obviously form a vector space, and you can talk about linear functions on them (tensors) and changes of basis. All of the tools of linear algebra apply.

Physicists have reserved the word "vector" for a special case of the above: the partial derivative operators defined on the scalar functions on a manifold. The partial derivative operators defined at a point on a manifold do form a vector space, according to the mathematician's definition: the sum of two partial derivative operators is again a directional derivative operator, so they are closed under addition. All of the tools of linear algebra apply, and you can define tensors on these vectors.

Now, a co-ordinate transformation is like choosing a different basis for this specific vector space. When you transform the co-ordinates, the new co-ordinate system $(x',y',z')$ comes equipped with a new basis of partial derivatives in these directions: $\partial _{x'}$, $\partial_{y'}$, and $\partial_{z'}$. The partial derivative basis vectors corresponding to the older co-ordinate system, $\partial _x$, $\partial _y$ and $\partial _z$, will be related to these by the Jacobian matrix of the transformation.

The above is why physicists say that vectors are only those arrays whose components transform according to the Jacobian matrix under a co-ordinate transformation. A co-ordinate transformation is, almost by definition, a change of basis of this specific vector space.

If you form an array out of $(\text{temperature}, \text{pressure}, \text{mass})$, it is a vector in the mathematical sense, but it won't transform under a co-ordinate transformation, because co-ordinate transformations are, almost by definition, a change of basis of the very specific vector space of partial derivative operators.
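The Jacobian rule is easy to verify numerically; a sketch (the coordinates and scalar field here are arbitrary choices) showing that re-expressing a tangent vector's components through the Jacobian leaves the directional derivative of any scalar unchanged:

```python
import numpy as np

def f(x, y):                       # an arbitrary scalar field
    return x**2 * y + np.sin(y)

def to_cart(r, th):                # polar -> Cartesian
    return r * np.cos(th), r * np.sin(th)

r, th = 1.3, 0.7
x, y = to_cart(r, th)

# Jacobian d(x, y)/d(r, theta) at this point
J = np.array([[np.cos(th), -r * np.sin(th)],
              [np.sin(th),  r * np.cos(th)]])

v_polar = np.array([0.4, -1.1])    # components in the (d_r, d_theta) basis
v_cart = J @ v_polar               # same vector in the (d_x, d_y) basis

# Central finite differences for the partial derivatives of f
h = 1e-6
df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
g = lambda r_, th_: f(*to_cart(r_, th_))
df_dr = (g(r + h, th) - g(r - h, th)) / (2 * h)
df_dth = (g(r, th + h) - g(r, th - h)) / (2 * h)

# v^i d_i f is the same number in either coordinate system
assert np.isclose(v_polar @ np.array([df_dr, df_dth]),
                  v_cart @ np.array([df_dx, df_dy]), atol=1e-6)
```

An array like $(\text{temperature}, \text{pressure}, \text{mass})$ has no such Jacobian behaviour attached to it, which is exactly the distinction the answer draws.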

Ryder Rude