Physicists' definition of vectors based on transformation laws

Question

First of all, I want to make clear that although I've already asked a related question here, the point of this new question is a little different. In the former question I considered vector fields on a smooth manifold; here I'm considering just vectors.

In Physics vectors are almost always defined by their transformation properties. Quoting Griffiths:

Well, how about this: We have a barrel of fruit that contains $N_x$ pears, $N_y$ apples, and $N_z$ bananas. Is $\mathbf{N} = N_x\hat{\mathbf{x}}+N_y\hat{\mathbf{y}}+N_z\hat{\mathbf{z}}$ a vector? It has three components, and when you add another barrel with $M_x$ pears, $M_y$ apples, and $M_z$ bananas the result is $(N_x+M_x)$ pears, $(N_y+M_y)$ apples, $(N_z+M_z)$ bananas. So it does add like a vector. Yet it's obviously not a vector, in the physicist's sense of the word, because it doesn't really have a direction. What exactly is wrong with it?

The answer is that $\mathbf{N}$ does not transform properly when you change coordinates. The coordinate frame we use to describe positions in space is of course entirely arbitrary, but there is a specific geometrical transformation law for converting vector components from one frame to another. Suppose, for instance, the $\bar{x},\bar{y},\bar{z}$ system is rotated by an angle $\phi$, relative to $x,y,z$, about the common $x=\bar{x}$ axes. From Fig. 1.15,

$$A_y=A\cos \theta, A_z=A\sin\theta,$$

while

$$\bar{A}_y=\cos\phi A_y + \sin \phi A_z,$$

$$\bar{A}_z=-\sin\phi A_y + \cos\phi A_z.$$

More generally, for rotation about an arbitrary axis in three dimensions, the transformation law takes the form:

$$\bar{A}_i=\sum_{j=1}^3R_{ij}A_j.$$

Now: Do the components of $\mathbf{N}$ transform in this way? Of course not - it doesn't matter what coordinates you use to represent position in space, there is still the same number of apples in the barrel. You can't convert a pear into a banana by choosing a different set of axes, but you can turn $A_x$ into $\bar{A}_y$. Formally, then, a vector is any set of three components that transforms in the same manner as a displacement when you change coordinates.

It is exactly this kind of definition that I have trouble understanding. My point is the following: as a mathematician would say, a vector is just an element of a vector space.

Let $V$ be a vector space over $\mathbb{K}$ and let $\{e_i\}$ be a basis. Then the mapping $f : \mathbb{K}^n\to V$ given by $f(a^1,\dots,a^n)=a^ie_i$ is an isomorphism by definition of basis.

This means that we can pick any numbers $a^1,\dots,a^n$ and they will give a unique vector no matter what those numbers are. If they represent numbers of pears, bananas or apples, it doesn't matter. They are numbers.

Now, if we consider another basis $\{\bar{e}_i\}$, there exist unique numbers $a^i_j$ such that $e_j = a^i_j \bar{e}_i$.

In that setting if we have a vector $v = v^je_j$ then we have $v = v^ja^i_j \bar{e}_i$. In other words $v = \bar{v}^i\bar{e}_i$ with $\bar{v}^i = a^i_jv^j$. The transformation law is thus just a result from the theory of linear algebra!
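A minimal numerical sketch of the computation above, in plain Python. The two bases, the matrix `a`, and the helper `combine` are illustrative assumptions, not part of the original argument; the only point is that $\bar{v}^i = a^i_jv^j$ reproduces the same vector from the new basis.

```python
def combine(coeffs, basis):
    """Return the vector sum_i coeffs[i] * basis[i] as a tuple."""
    dim = len(basis[0])
    return tuple(sum(c * b[k] for c, b in zip(coeffs, basis)) for k in range(dim))

# New basis ebar_i, written in some fixed reference coordinates (arbitrary choice).
ebar = [(2.0, 0.0), (1.0, 1.0)]

# a[i][j] stands for a^i_j (arbitrary invertible choice).
a = [[1.0, -1.0],
     [1.0, 2.0]]

# Old basis obtained from e_j = a^i_j ebar_i.
e = [combine([a[i][j] for i in range(2)], ebar) for j in range(2)]

v = (3.0, 2.0)  # components v^j in the old basis; any numbers will do
vbar = tuple(sum(a[i][j] * v[j] for j in range(2)) for i in range(2))

from_old = combine(v, e)        # v^j e_j
from_new = combine(vbar, ebar)  # vbar^i ebar_i
print(vbar, from_old, from_new)
```

Both `from_old` and `from_new` come out as the same tuple, whatever numbers are chosen for `v`.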

Now, my main question is: what is behind this physicists' definition? They are using a result of the theory to define vectors, but why should this definition make sense? As I've pointed out, because $f$ is an isomorphism by the definition of a basis, any set of numbers will form a vector, and if we change the basis the new components will necessarily change as the theory requires.

EDIT: After thinking for a while I believe I have an idea of what's going on here. I believe we have two separate things: the mathematical idea of a vector and the physical idea of a vector quantity.

I believe that is the source of the confusion, since for a mathematician, when we pick $(a^1,\dots,a^n)\in \mathbb{K}^n$, those are just arbitrary numbers, while for a physicist, if we pick $(a^1,\dots,a^n)$, each $a^i$ has a specific physical meaning as a measurable quantity. Is that the idea?

score 9 · Answer 1 · answered Mar 05 '16 at 20:28

This is a very common disconnect between mathematicians and physicists (or at least the physicists who were taught things all weird).

What goes unspoken in the "physicist" definition of vector, and indeed what I think most people using that definition fail to appreciate, is that when you are handed a tuple of numbers, you are implicitly given a rule for generating the components in any basis.

A physics example: Consider the vector $\vec{v} = (\cos\theta, \sin\theta)$ expressed in Cartesian coordinates $(x,y)$. We know this is a vector because if we rotated our axes (say to $(x',y')$ $45^\circ$ counterclockwise of $(x,y)$) but kept measuring $\theta$ as the angle to whatever is our first axis ($x$ or $x'$), we would get the same thing. That is, $\cos\theta\ \hat{x} + \sin\theta\ \hat{y} = \cos\theta'\ \hat{x}' + \sin\theta'\ \hat{y}'$, and we're not going to bother writing the prime on $\theta'$, since everyone knows $\theta$ is the angle to the first of whatever our two axes are. In matrix form, $$ \begin{pmatrix} \cos45^\circ & \sin45^\circ \\ -\sin45^\circ & \cos45^\circ \end{pmatrix} \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} = \begin{pmatrix} \cos(\theta-45^\circ) \\ \sin(\theta-45^\circ) \end{pmatrix} = \begin{pmatrix} \cos\theta' \\ \sin\theta' \end{pmatrix} \stackrel{\text{looks like}}{\sim} \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}. $$

On the other hand, $(\cos\theta, 2)$ is not a vector, because in the primed coordinate system it would be $$ \begin{pmatrix} \cos45^\circ & \sin45^\circ \\ -\sin45^\circ & \cos45^\circ \end{pmatrix} \begin{pmatrix} \cos\theta \\ 2 \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 2+\cos\theta \\ 2-\cos\theta \end{pmatrix} \neq \begin{pmatrix} \cos\theta' \\ 2 \end{pmatrix}. $$

A mathematics example: Consider the tuple $(\cos\theta, 2)$. Let $\vec{v}$ be the vector with these coefficients in Cartesian coordinates. If we rotate our coordinates by $45^\circ$, we will still have the same vector but its components will change: \begin{align} \vec{v} & \stackrel{\text{original}}{\longrightarrow} (\cos\theta, 2), \\ \vec{v} & \stackrel{\text{new}}{\longrightarrow} \frac{1}{\sqrt{2}} (2+\cos\theta, 2-\cos\theta). \end{align}

It's not that for physicists the components are somehow measurable or physically meaningful. It's that physicists often compress a whole family of formulas into one expression, conveying (in an often unclear way) more information than the notation would seem to indicate. With so much information, though, comes the chance that the formulas for coefficients are inconsistent with being any one vector in a vector space. In that case one doesn't have the formula for a vector, or rather one has many formulas for many different vectors.
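The two cases can be checked numerically. The sketch below, in plain Python, treats each tuple as a rule that produces components from the angle measured to the first axis of whichever frame is in use, and asks whether the rule's output in the rotated frame matches the rotated components. The function names and the sample angle are illustrative assumptions.

```python
import math

def rotate_components(components, phi):
    """Components of the same vector in axes rotated counterclockwise by phi."""
    x, y = components
    return (math.cos(phi) * x + math.sin(phi) * y,
            -math.sin(phi) * x + math.cos(phi) * y)

def unit_direction(angle):
    """The rule (cos(theta), sin(theta))."""
    return (math.cos(angle), math.sin(angle))

def cos_and_two(angle):
    """The rule (cos(theta), 2)."""
    return (math.cos(angle), 2.0)

def is_consistent(rule, theta, phi, tol=1e-12):
    """True if evaluating the rule in the rotated frame (angle theta - phi)
    gives the same numbers as rotating the original components."""
    transformed = rotate_components(rule(theta), phi)
    recomputed = rule(theta - phi)
    return all(abs(t - r) < tol for t, r in zip(transformed, recomputed))

theta, phi = math.radians(70), math.radians(45)  # arbitrary sample angle, 45 degree rotation
print(is_consistent(unit_direction, theta, phi))  # True
print(is_consistent(cos_and_two, theta, phi))     # False
```

The second rule fails the check because its family of formulas does not describe any single vector, which is the inconsistency described above.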

Timaeus · Answer 2 · 2016-03-05T19:06:31.620

A mathematician will say a vector is an element of a vector space and a vector space is just a set with a binary addition and with some scalars and an operation of scaling a vector by a scalar. Your talk about three numbers wouldn't make it a vector to a mathematician until you say how you add the vectors and scale them by a scalar.

So operations matter to everyone.

But there are more operations. Let's look at column vectors and row vectors. With the usual matrix addition and scaling they are both vector spaces. But they also have a natural relationship to each other (given by the matrix multiplication).

And now we find that even though each might be an $n$-tuple, the two transform differently under a rotation if the idea is that each is a linear function taking the other as an argument.

You can think of the row vector $r$ as something that takes column vectors $c_1$ and $c_2$ and sends their linear combination $\alpha c_1+\beta c_2$ to $r(\alpha c_1+\beta c_2)=\alpha r c_1+\beta r c_2.$

Or you can think of a column vector $c$ as something that takes row vectors $r_1$ and $r_2$ and sends their linear combination $\alpha r_1+\beta r_2$ to $(\alpha r_1+\beta r_2)c=\alpha r_1c+\beta r_2c.$

And now if that natural operation of being functions of each other is to be respected, then the column and row vectors must transform differently even though both are $n$-tuples.

And a basis for one determines a basis for the other if you want to use the matrix product.

If a vector's components in two bases are given by two column vectors, and the transformation between them is given by a matrix $\Lambda$ acting on the left, then the row vectors need to be multiplied by $\Lambda^{-1}$ on the right.
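A small check of this statement in plain Python, assuming a sample invertible $2\times 2$ matrix `Lam` and sample components (all illustrative): transforming the column by $\Lambda$ on the left and the row by $\Lambda^{-1}$ on the right leaves the number $rc$ unchanged.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse_2x2(M):
    """Inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

Lam = [[2.0, 1.0], [1.0, 1.0]]  # sample invertible matrix (determinant 1)
col = [[3.0], [4.0]]            # column vector, 2x1
row = [[5.0, -1.0]]             # row vector, 1x2

new_col = matmul(Lam, col)               # column components: Lambda c
new_row = matmul(row, inverse_2x2(Lam))  # row components: r Lambda^{-1}

pairing_before = matmul(row, col)[0][0]
pairing_after = matmul(new_row, new_col)[0][0]
print(pairing_before, pairing_after)
```

The two printed numbers agree because $(r\Lambda^{-1})(\Lambda c) = rc$.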

Operations are always essential since the point of an object is to do things with it.

Indeed from a mathematician's standpoint a vector is an element of a vector space. But it seems the physicist's definition requires more than that.

For a mathematician, a row vector and a column vector are both vectors (in different vector spaces). For a physicist we know they are linear objects in their own linear spaces, but you can pick a basis for one and get a basis for the other too!

And in addition to scaling and adding there is a third operation which, given two vectors from these two different spaces (whose bases are related to each other), produces a number. In particular, for a frame $\{v_1,v_2,v_3\}$ of independent, say, column vectors in one space there is a reciprocal or dual frame $\{w_1,w_2,w_3\}$ of, say, row vectors from the other space such that $w_i v_j=\delta_{ij}$ (where the Kronecker delta is zero unless $i=j$, in which case it is one).
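As a sketch of how such a dual frame can be computed in three dimensions, the plain-Python example below uses the standard cross-product construction $w_1 = (v_2\times v_3)/\big(v_1\cdot(v_2\times v_3)\big)$ and its cyclic permutations. It assumes each row vector is stored as a plain tuple of its entries, so that $w_iv_j$ is an ordinary sum of products; the sample frame and function names are illustrative.

```python
def cross(a, b):
    """Cross product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def pair(w, v):
    """Row-times-column product w v for tuples of entries."""
    return sum(x * y for x, y in zip(w, v))

def reciprocal_frame(v1, v2, v3):
    """Return (w1, w2, w3) with w_i v_j = delta_ij for independent v1, v2, v3."""
    volume = pair(v1, cross(v2, v3))  # nonzero exactly when the frame is independent

    def scaled(u):
        return tuple(x / volume for x in u)

    return scaled(cross(v2, v3)), scaled(cross(v3, v1)), scaled(cross(v1, v2))

frame = [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]  # sample independent frame
dual = reciprocal_frame(*frame)

# Table of w_i v_j; it should be the 3x3 identity.
pairings = [[pair(w, v) for v in frame] for w in dual]
for line in pairings:
    print(line)
```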

So a basis for one naturally gives a definition of a basis for the other. It's this relationship between two different vector spaces that is important. They are different. Adding two nonzero row vectors gives a new row vector. Adding two nonzero column vectors gives a new column vector. But if you tried to add a row vector and a column vector that's not part of the definition of either vector space alone. If anything, you'd just keep them separate like adding an imaginary number to a real number. And it's the multiplication that is essential. And neither vector space by itself talks about that multiplication.

A physicist explicitly wants to tell you that both vector spaces are useful and needed and that they transform differently even though they share a basis (in a sense). And knowing how each transforms allows you to tell which is which.

So the key is to know that when you have a basis and a cobasis that are reciprocal to each other then they transform in reciprocal fashions. That is the key. So I said operations matter to everyone. But physicists are considering two spaces and a whole new operation besides addition and scaling.

When you change one basis, the other basis changes and the coordinates of both change, but in different ways.

score 1 · Answer 3 · answered Nov 03 '17 at 14:58

Let me try to clarify (let's stay in flat three-dimensional space):

In physics any triple of numbers which transforms like the radius vector under rotations is called a vector. (This definition is given, for example, in Feynman's lectures.) Why is this definition useful? We want a vector to represent a 'real'/physical quantity. Just picture an arrow placed in your room (this is sort of a real quantity); now choose any rectangular coordinate system and write down the components. Now take a different coordinate system and do the same. (Of course the two sets of components are related by the correct transformation law.) If they did not follow the transformation law, the 'arrow' would depend on your coordinates. Physics cannot depend on your choice of coordinates (you are in a bit of trouble if it does).

Maybe this example helps: choose a coordinate system (like really put some rulers down). Measure the temperature at (0,0,1), at (0,1,0) and at (1,0,0). Let's assume they are different: (0,0,1) is inside your apartment, (0,1,0) is outside and (1,0,0) is on your heating plate. Write this triple down. Is it a vector or three scalar numbers? (The latter, of course.) If you rotate the coordinate system it does not transform like a vector.

Of course formally you can say that you always take the standard basis of $\mathbb{R}^n$ (the math approach) and then any sequence of numbers defines a vector. But this does not represent 'real' vectors.

Quillo · Answer 4 · 2022-04-26T17:44:47.510

Most of the time in Physics you want your "vector" to be a physical quantity, not just an arbitrary juxtaposition of numbers (pears, apples... as in your example). Therefore, it must transform as the physical quantity it should represent. In general, in Physics we are interested in fields (scalar, vectorial, tensorial, spinorial) that transform in a definite way under the action of a certain group of transformations: our vectors are the usual "vectors" of "linear algebra" plus the requirement that they are "tensors" for a certain group of interest.

In short: vectors in Physics are mathematicians' vectors (i.e. are elements of a vector space)... but not all mathematicians' vectors are "physical vectors" (i.e. mathematicians' vectors can be "lists").

If you invent your own set of rules to make sense of addition of pears, apples and elephants then you can upgrade your "linear algebra vector" to be a "physics vector" under your designed set of rules.

Note #1: a "linear algebra vector" like (temperature, density, modulus of the velocity, magnetic field intensity...) is made of many (scalar) physical quantities but does not behave as a genuine "physics" vector.

Note #2: useful answers to the same questions (in a related question): https://physics.stackexchange.com/a/627426/226902 , https://physics.stackexchange.com/a/627456/226902 , https://physics.stackexchange.com/a/406415/226902

score 0 · Answer 5 · answered May 29 '18 at 14:57

You are right in saying that Griffiths' explanation does not nail the issue. Say $N_p$ is the number of pears and $N_b$ the number of bananas; we can change components to $N_T=N_p + N_b$, the total number of fruits, and $N_D=N_p - N_b$, the difference. This is a linear transformation as much as a rotation in space is. There are many branches of physics and engineering that do this (basically, any minimization process in some parameter space). So, what's different?

The difference is that the types of vectors he is considering (infinitesimal displacement, velocity, force, ...) are defined at a point: they have a point of application. That is, it's not just that you have a velocity, you have a velocity here at the center of mass of the ball. You don't just have a force, the force is applied here at the center of this car. Summing two forces applied at two different points has no meaning. Moreover, the unit of the component depends on the unit of space. If $x$ is measured in meters, then $v_x$ is measured in meters per second. So there is a link in how the change of coordinates in space affects the components of your vector.

Now, mathematically you'd say that you really have a manifold, and the vector is in its tangent space. However, this does not work at all physically. As we can't sum velocities with forces, they must live in different tangent spaces. So, you basically say that a vector is a set of numbers that change in a particular way (i.e. they are isomorphic to vectors in the tangent space at the point).

Hope this helps!

score 0 · Answer 6 · answered Jul 31 '23 at 09:02

The difference is that the vector space defined in Griffiths' book has a well-defined inner product, and hence a well-defined length, and it is a model of the 3D space that we live in.

For the other one (with apple, pear and banana basis), such a definition of inner product does not make any sense when viewed from the real world (you can still define an inner product).
