58

The Euler-Lagrange equation gives the equations of motion of a system with Lagrangian $L$. Let $q^\alpha$ represent the generalized coordinates of a configuration manifold, and let $t$ represent time. The Lagrangian is a function of the state of a particle, i.e. the particle's position $q^\alpha$ and velocity $\dot q^\alpha$. The Euler-Lagrange equation is

$$ \frac{d}{dt} \frac{\partial L}{\partial \dot q^\alpha } = \frac{\partial L}{\partial q^\alpha}$$

Why is this a law of physics and not a simple triviality for any function $L$ on the variables $q^\alpha$ and $\dot q^\alpha$? The following "proof" of the Lagrange Equation uses no physics, and seems to suggest that the Lagrange Equation is simply a mathematical fact that works for every function.

$$\begin{align} \frac{d}{dt} \frac{\partial L}{\partial \dot q^\alpha} & = \frac{\partial}{\partial \dot q^\alpha} \frac{dL}{dt} &\text{commutativity of derivatives} \\ \ \\ &= \frac{\partial \dot L}{\partial \dot q^\alpha} \\ \ \\ &= \frac{\partial L}{\partial q^\alpha} & \text{cancellation of dots} \end{align}$$

This can't be right, or else nobody would give a hoot about this equation and it would be totally useless to solve any problem. What is wrong with the logical reasoning above?

Qmechanic
Trevor Kafka

7 Answers

119

Ah, what a tricky mistake you've made there. The problem is that you've simply confused some notions in multivariable calculus. Don't feel bad though-- this is generally very poorly explained. Both steps 1 and 3 above are incorrect. Rest assured, the Euler-Lagrange equation is not trivial.

Let's first take a step back. The Lagrangian for a particle moving in one dimension in an external potential energy $V(q)$ is $$ L(q, \dot q) = \frac{1}{2}m \dot q^2 - V(q). $$ This is how most people write it. However, this is very confusing, because clearly $q$ and $\dot q$ are not independent variables. Once $q$ is specified for all times, $\dot q$ is also specified for all times.

A better way to write the above Lagrangian might be $$ L(a, b) = \frac{1}{2}m b^2 - V(a). $$ Here we've exposed the Lagrangian for what it really is: a function that takes in two numbers and outputs a real number. Likewise, we can clearly see that $$ \frac{\partial L}{\partial a} = -V'(a) \hspace{1cm} \frac{\partial L}{\partial b} = m b. $$ Usually, most people write this as $$ \frac{\partial L}{\partial q} = -V'(q) \hspace{1cm} \frac{\partial L}{\partial \dot q} = m \dot q. $$ However, $q$ and $\dot q$ must be understood as independent variables in order to do this correctly. Just as $a$ and $b$ were independent variables, $q$ and $\dot q$ are too when they're being put into the Lagrangian. In other words, we could put any two numbers into $L$; we just decided to put in $q$ and $\dot q$.

Furthermore, let's look at the total time derivative $\frac{d}{dt}$. How should we understand the following expression? $$ \frac{d}{dt} L(q(t), \dot q(t)) $$ Both $q$ and $\dot q$ are functions of time. Therefore, $L(q(t), \dot q(t))$ depends on time simply because $q(t)$ and $\dot q(t)$ do. Therefore, in order to evaluate the above expression, we need to use the chain rule in multivariable calculus. $$ \frac{d}{dt} L(q(t), \dot q(t)) = \frac{dq}{dt} \frac{\partial L}{\partial a}(q(t), \dot q(t)) + \frac{d \dot q}{dt} \frac{\partial L}{\partial b}(q(t), \dot q(t)) = \dot q(t) \frac{\partial L}{\partial a}(q(t), \dot q(t)) + \ddot q(t) \frac{\partial L}{\partial b}(q(t), \dot q(t)) $$

In the above expression, I once again used $a$ and $b$ in order to make my point clearer. We need to take partial derivatives of $L$ assuming $a$ and $b$ are independent variables. AFTER differentiating, we THEN evaluate $\partial L / \partial a$ and $\partial L / \partial b$ by plugging in $(q, \dot q)$ into the $(a,b)$ slots. This is just like how in single variable calculus, if you have $$ f(x) = x^2 $$ and you want to find $f'(3)$, you first differentiate $f(x)$ while keeping $x$ an unspecified variable, and THEN plug in $x = 3$.

In your first step, the derivatives DON'T commute because $t$ and $q$ are not independent. ($q$ depends on $t$.) Yes, partial derivatives commute, but ONLY if the variables are independent. In your third step, you can't "cancel the dots" because $L$ depends on two inputs. If $L$ only depended on $q$, then yes, you could "cancel the dots" (as this is equivalent to the chain rule in single variable calculus), but it doesn't, so you can't.
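To make the failure of the first step concrete, here is a short check with the Lagrangian $L(a,b) = \frac{1}{2} m b^2 - V(a)$ from above. It is only a sketch, and it assumes that $dL/dt$ is read as a function of three independent slots $(q, \dot q, \ddot q)$ before the $\dot q$-derivative is taken, which is one way to give the right-hand side of that step a meaning.

$$ \frac{d}{dt} \frac{\partial L}{\partial \dot q} = \frac{d}{dt} \big( m \dot q \big) = m \ddot q $$

$$ \frac{dL}{dt} = -V'(q) \, \dot q + m \dot q \, \ddot q \quad \Longrightarrow \quad \frac{\partial}{\partial \dot q} \frac{dL}{dt} = -V'(q) + m \ddot q $$

The two results differ by $-V'(q) = \partial L / \partial q$, so the two derivatives do not commute unless $V'(q) = 0$.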

EDIT: You can see for yourself that the Euler-Lagrange equation is not identically $0$. If you take the Lagrangian $L(q, \dot q)$ I've written above and plug it into the Euler-Lagrange equation, you get $$ m \ddot q(t) + V'(q(t)) = 0. $$ This is not the same as $0 = 0$. It is a condition that a path $q(t)$ would have to satisfy in order to extremize the action. If it were $0 = 0$, then all paths would extremize the action.
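A quick numerical illustration of this point, as a rough sketch rather than anything definitive: the potential $V(q) = \frac{1}{2} K q^2$, the values of `M` and `K`, and the two test paths are all assumptions chosen for the example. The function `el_residual` evaluates $m \ddot q + V'(q)$ along a path using a finite-difference second derivative.

```python
import math

M = 1.0  # mass (illustrative value)
K = 4.0  # spring constant of the assumed potential V(q) = K q^2 / 2


def v_prime(q):
    """Derivative V'(q) of the assumed harmonic potential."""
    return K * q


def second_derivative(path, t, h=1e-4):
    """Central finite-difference estimate of the second time derivative."""
    return (path(t + h) - 2.0 * path(t) + path(t - h)) / h**2


def el_residual(path, t):
    """m q''(t) + V'(q(t)); this vanishes only along solutions."""
    return M * second_derivative(path, t) + v_prime(path(t))


def oscillation(t):
    """A path that solves m q'' = -K q."""
    return math.cos(math.sqrt(K / M) * t)


def parabola(t):
    """An arbitrary path that is not a solution."""
    return t**2


for t in (0.3, 1.0, 2.5):
    print(t, el_residual(oscillation, t), el_residual(parabola, t))
```

The residual is zero (up to discretization error) only for the oscillating path. For the parabola it equals $2M + K t^2$, so the equation really does single out particular paths.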

EDIT: As Arthur points out, this is also a good time to discuss the difference between $dL / dt$ and $\partial L / \partial t$. If we have a time-dependent Lagrangian, $$ L(q, \dot q, t) $$ then $L$ can depend on $t$ explicitly, as opposed to just through $q$ and $\dot q$. For example, the Lagrangian for a particle in a constant gravitational field $g$ is $$ L(a,b) = \frac{1}{2} mb^2 - m g a, $$ whereas if we allow $L$ to depend on $t$ explicitly, we could have the gravitational field get stronger as time goes on: $$ L(a,b,t) = \frac{1}{2} mb^2 - m ( C t )a. $$ ($C$ is a constant such that $Ct$ has the same units as $g$.)

The quantity $$ \frac{\partial}{\partial t} L(a, b, t) $$ should be understood as differentiating the "$t$-slot" of $L$. In the above example, we would have $$ \frac{\partial}{\partial t} L(a,b,t) = - m C a. $$ The quantity $$ \frac{d}{d t} L(q(t), \dot q(t), t) $$ should be understood as the full time derivative of $L$ due to the fact that $q$ and $\dot q$ also depend on $t$. For the above example, \begin{align*} \frac{d}{d t} L(q(t), \dot q(t), t) &= \dot q(t) \frac{\partial L}{\partial a}(q(t), \dot q(t),t) + \ddot q(t) \frac{\partial L}{\partial b}(q(t), \dot q(t),t) + \frac{\partial L}{\partial t} (q(t), \dot q(t), t) \\ &= (\dot q) (-mC t ) + \ddot q(t) (m \dot q(t)) - mC q(t) \end{align*}
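The difference between the two derivatives can also be checked numerically. The sketch below uses the Lagrangian $L(a,b,t) = \frac{1}{2} m b^2 - m (Ct) a$ from this answer. The path $q(t) = \sin t$ and the values of `M` and `C` are assumptions made only for illustration.

```python
import math

M = 2.0  # mass (illustrative value)
C = 3.0  # rate at which the field strength grows (illustrative value)


def lagrangian(a, b, t):
    """L(a, b, t) = M b^2 / 2 - M (C t) a, with three independent slots."""
    return 0.5 * M * b**2 - M * (C * t) * a


# Partial derivatives with respect to each slot, worked out by hand.
def dL_da(a, b, t):
    return -M * C * t


def dL_db(a, b, t):
    return M * b


def dL_dt(a, b, t):
    return -M * C * a


# An arbitrary path q(t) = sin(t) and its first two derivatives.
def q(t):
    return math.sin(t)


def q_dot(t):
    return math.cos(t)


def q_ddot(t):
    return -math.sin(t)


def total_derivative_chain_rule(t):
    """dL/dt along the path, via the multivariable chain rule."""
    a, b = q(t), q_dot(t)
    return q_dot(t) * dL_da(a, b, t) + q_ddot(t) * dL_db(a, b, t) + dL_dt(a, b, t)


def total_derivative_numeric(t, h=1e-5):
    """dL/dt along the path, by differentiating t -> L(q(t), q'(t), t) directly."""
    def along_path(s):
        return lagrangian(q(s), q_dot(s), s)
    return (along_path(t + h) - along_path(t - h)) / (2.0 * h)


t0 = 0.7
print("chain rule:  ", total_derivative_chain_rule(t0))
print("numeric d/dt:", total_derivative_numeric(t0))
print("partial only:", dL_dt(q(t0), q_dot(t0), t0))
```

The first two printed numbers agree to within the finite-difference error, while the third (the partial derivative alone) differs from them, because it ignores the time dependence that enters through $q(t)$ and $\dot q(t)$.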

user1379857
21
  1. The commutator $$\left[\frac{\partial}{\partial \dot{q}^j},\frac{\mathrm d}{\mathrm d t}\right]~\stackrel{(2)}{=}~\frac{\partial}{\partial q^j}\tag{1}$$ of a velocity derivative $\frac{\partial}{\partial \dot{q}^j}$ with the total time derivative $$\frac{\mathrm d}{\mathrm d t} ~=~\frac{\partial}{\partial t} +\dot{q}^j\frac{\partial}{\partial q^j} +\ddot{q}^j\frac{\partial}{\partial \dot{q}^j} +\dddot{q}^j\frac{\partial}{\partial \ddot{q}^j} +\ldots \tag{2}$$ is not zero. See also e.g. this related Math.SE post & this related Phys.SE post.

  2. The cancellation of dots $$\frac{\partial \dot{L}}{\partial \dot{q}^j}~=~\frac{\partial L}{\partial q^j}\tag{3}$$ works for functions $L(q,t)$ that don't depend on velocities $\dot{q}^k$. But a Lagrangian typically depends on velocities. See also this related Phys.SE post.

  3. Note the following algebraic Poincare lemma: $$L\text{ satisfies the Euler-Lagrange (EL) eqs. identically }$$ $$\quad\Updownarrow\quad\tag{4}$$ $$L\text{ is a total time derivative}$$ (modulo possible topological obstructions). For details, see e.g. this & this Phys.SE posts.
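A small example may help with item 3. Take $L = q \dot q$ (one degree of freedom, chosen here purely for illustration), which is the total time derivative of $\frac{1}{2} q^2$:

$$ L = q \dot q = \frac{\mathrm d}{\mathrm d t}\left( \tfrac{1}{2} q^2 \right), \qquad \frac{\mathrm d}{\mathrm d t} \frac{\partial L}{\partial \dot q} = \frac{\mathrm d q}{\mathrm d t} = \dot q, \qquad \frac{\partial L}{\partial q} = \dot q. $$

Both sides agree for every path $q(t)$, so the EL equation reduces to $0 = 0$. By contrast, $L = \frac{1}{2} m \dot q^2$ is not a total time derivative, and its EL equation $m \ddot q = 0$ is a genuine condition on the path.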

Qmechanic
8

So, in principle one can choose essentially $\it{any}$ Lagrangian $\mathcal{L}$ with suitably chosen coordinates (and possibly constraints), and apply variational calculus to it via the Euler-Lagrange equations. The equations of motion that this produces may or may not correspond to an understandable model of reality. There are lots of Lagrangians that (seemingly) don't correspond to reality. The Lagrangians that produce physical models have usually been found by guess-and-check and consultation with experiment/observation.

why is this a fundamental law of physics and not a simple triviality of ANY function L on the variables $q$ and $\dot{q}$?

The Euler-Lagrange formalism is not a "fundamental law of physics." Rather, it is a differential equation (or a set of them) whose solutions make a particular functional stationary, meaning the solutions obey the principle of extremized action. This mathematical concept was actually generalized in control theory by Pontryagin's maximum principle. The laws of physics are derivable through the Euler-Lagrange method, but the method is not fundamental, similar to how the particular geometry chosen is not fundamental (par. 17) for deriving physical laws. Physicists use math to model reality, so of course we're going to use the things that work! For instance, Einstein derived his field equations heuristically, but Hilbert derived them (around the same time) from the action principle by guessing the correct $\mathcal{L}$. Nowadays, almost everyone who works with general relativity or modified gravity starts from $\mathcal{L}$ and uses the action principle (except in cosmology, where they typically start from the metric itself).

It is not entirely surprising that since we are natural creatures which evolved to understand patterns of our environment, the tools we create - especially the abstract ones like math - might have some correspondence with reality. Eugene Wigner wrote a very nice essay about this topic, called "The Unreasonable Effectiveness of Mathematics in the Natural Sciences," in which he argues that it is obvious that math works so well at modeling reality, but it's not at all obvious why this works.

"Why" questions are very difficult to answer, and this one is especially difficult. Some Lagrangians work at producing physical models, and some don't, and maybe the E-L equations work as a filter for figuring that out since it can be used to make testable predictions.

@AccidentalFourierTransform already clarified your mathematical errors, so I will not.

3

Your question: ''Why is the Lagrange equation not a triviality? What is wrong with my calculation?''.

First some notation. Using the unambiguous notation from SICM, the Lagrange equations are:

$$\mathrm{D}((\partial_2 L) \circ \Gamma[q]) - (\partial_1 L) \circ \Gamma[q] = 0$$ (where $\mathrm{D}$ is the total derivative (corresponds to the time derivative), and $\Gamma[q] = (q, \mathrm{D}q, ...)$ is the functional that provides the path and its derivative(s).)

(If you wonder what is wrong with traditional notation, then I recommend reading the preface of SICM which addresses this, but basically it is exactly such confusions as this question is about.)

Trying to rewrite your calculation using the unambiguous notation from SICM immediately reveals some problems:

Impossible to simply commute derivatives: Neither $$\mathrm{D}((\partial_2 L) \circ \Gamma[q]) \neq \partial_2((\mathrm{D} L) \circ \Gamma[q])$$ nor $$\mathrm{D}((\partial_2 L) \circ \Gamma[q]) \neq \partial_2\mathrm{D} (L \circ \Gamma[q])$$ makes any sense.

Impossible to cancel dots: $$\partial_2\mathrm{D} (L \circ \Gamma[q]) \neq \partial_1 (L \circ \Gamma[q])$$ Both the left and right sides look pretty nonsensical.

Then you need to do $$\partial_1 (L \circ \Gamma[q]) = (\partial_1 L) \circ \Gamma[q]$$ to reconstruct a sane expression.

Thus no step in your proof is warranted.
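The functional notation translates almost directly into code, which is one way to see why the attempted manipulations do not type-check. The following is a rough Python sketch, not the scmutils library that accompanies SICM: the helper names `D`, `Gamma`, `partial` and `compose` are hypothetical and merely mirror the symbols above, and the derivatives are finite-difference approximations. It also assumes the local tuple is ordered as $(t, q, \mathrm{D}q)$, so that slot 1 is the position and slot 2 the velocity.

```python
import math

H = 1e-4  # finite-difference step (an arbitrary choice for this sketch)


def D(f):
    """Derivative of a function of one real argument (central difference)."""
    return lambda t: (f(t + H) - f(t - H)) / (2.0 * H)


def Gamma(q):
    """Turn a path q into the function t -> local tuple (t, q(t), Dq(t))."""
    Dq = D(q)
    return lambda t: (t, q(t), Dq(t))


def partial(i, L):
    """Derivative of L with respect to slot i of its local-tuple argument."""
    def dL(local):
        up, down = list(local), list(local)
        up[i] += H
        down[i] -= H
        return (L(tuple(up)) - L(tuple(down))) / (2.0 * H)
    return dL


def compose(f, g):
    return lambda t: f(g(t))


def lagrange_residual(L, q):
    """D((partial_2 L) o Gamma[q]) - (partial_1 L) o Gamma[q], as a function of t."""
    momentum_along_path = compose(partial(2, L), Gamma(q))
    force_along_path = compose(partial(1, L), Gamma(q))
    return lambda t: D(momentum_along_path)(t) - force_along_path(t)


M, K = 1.0, 4.0  # illustrative mass and spring constant


def L_harmonic(local):
    """Harmonic-oscillator Lagrangian as a function of the local tuple."""
    _, x, v = local
    return 0.5 * M * v**2 - 0.5 * K * x**2


def solution(t):
    return math.cos(math.sqrt(K / M) * t)


def parabola(t):
    return t**2


print(lagrange_residual(L_harmonic, solution)(1.0))  # close to zero
print(lagrange_residual(L_harmonic, parabola)(1.0))  # close to 2*M + K = 6
```

Here `partial(i, L)` acts on functions of a local tuple, while `D` acts on functions of time. Swapping them, as the attempted proof does, would apply each operator to the wrong kind of object.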

hkBst
1

That's an interesting sequence of symbolic manipulations!

It's because of the lack of rigour that it's easy to fall into these pitfalls, and physics texts typically don't go into where they are, why they arise, or how to avoid them. It's a skill that one picks up by doing problems, going through the theory and reading around.

Similar problems are associated with the path integral, which has no rigorous definition. The variational calculus, by contrast, can be made rigorous, but this is difficult. It's typically not touched upon in an undergraduate mathematics course, where they will rigorously define calculus for one real variable, for one complex variable and for many real variables - either calculus on a manifold or, more typically, multi-variable calculus, which is calculus in a (finite-dimensional) vector space.

Making the mathematics of this rigorous requires the apparatus of jet bundles. You can find an exposition in Saunders' Jet Bundles and Michor's Natural Operations. It takes quite some development.

Mozibur Ullah
1

N. Steinle already gave a great answer on the question

why is this a fundamental law of physics and not a simple triviality of ANY function L

but I would like to point out an additional tidbit regarding the part

.. seems to suggest that the Lagrange Equation is simply a mathematical fact that works for every function.

While the Lagrange equations mathematically really only describe a function/process that is an extremal value of some Lagrangian (or also some energy or action potential), the important part is that the converse is not as simple.

It seems to be "a fundamental law of physics" that many processes we observe in nature have a Lagrangian, an energy potential, at all. This is actually not trivial: not every multidimensional function has such a potential, and its existence is a statement about the symmetry of these processes.

Aganju
mirrormere
1

This is not about "what's wrong" but about how you could figure out what's wrong (or at least find something that's wrong in your attempted proof). Take a nice simple Lagrangian, like that for a free particle in one dimension: $L=\frac m2(\dot q)^2$ (where $q$ represents distance). And take some motion that is not correct in that physical situation, like uniform acceleration $q=at^2$, where $a$ is a non-zero constant. From $L$, you get the Euler-Lagrange equation $\frac d{dt}(m\dot q)=0$ (because $\partial L/\partial\dot q=m\dot q$ and $\partial L/\partial q=0$), i.e., you get conservation of momentum. On the other hand, from $q=at^2$, you get $\frac d{dt}(m\dot q)=m\ddot q=2ma$ (assuming the mass $m$ is constant). So the Euler-Lagrange equation is violated. That already shows that the Euler-Lagrange equation cannot be "simply a mathematical fact that works for every function." But you can get more information by plugging this particular $L$ and this particular $q(t)$ into your attempted proof, to see exactly which of your equations in that proof fail.
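Carrying out that suggestion gives the following (a sketch only; it assumes $\dot L$ is read as a function of independent slots $q$, $\dot q$, $\ddot q$, which is one interpretation of the attempted proof). With $L = \frac m2 (\dot q)^2$ one has $\dot L = m \dot q \ddot q$, so

$$ \frac{d}{dt} \frac{\partial L}{\partial \dot q} = m \ddot q, \qquad \frac{\partial \dot L}{\partial \dot q} = m \ddot q, \qquad \frac{\partial L}{\partial q} = 0. $$

On the path $q = at^2$ the first two expressions equal $2ma$ and the third is $0$. For this particular $L$ the commutation step happens to hold (because $\partial L / \partial q = 0$), and it is the "cancellation of dots" that fails.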