4

Suppose we have a configuration space manifold $M$.


The Lagrangian $\mathcal{L}(q, \dot{q})$ is a function on the tangent bundle $TM$. From $\mathcal{L}$ we can define the action functional $S$ which accepts paths on $M$ and returns the action for that path. A law of physics says that the extremal paths of the action functional are physical trajectories. With some calculation we find that the extremal paths satisfy the Euler-Lagrange equations

\begin{align*} \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{q}^i}\right) = \frac{\partial \mathcal{L}}{\partial q^i} \end{align*}


The Hamiltonian $H(q, p)$ is a function on the cotangent bundle $T^*M$, also called phase space. The cotangent bundle $T^*M$ of any smooth manifold $M$ admits a canonical symplectic form $\omega$. Together with $H$, this symplectic form induces a vector field $X_H$ on phase space (not on the manifold $M$). It is a law of physics that the integral curves of this vector field are physical trajectories. There is a relationship between $X_H$ and $dH$ which gives the following Hamilton differential equations of motion for the trajectories:

\begin{align*} \frac{dq^i}{dt} =& \frac{\partial H}{\partial p_i}\\ \frac{dp_i}{dt} =& - \frac{\partial H}{\partial q^i} \end{align*}


The Lagrangian induces a map from $TM$ to $T^*M$ (an isomorphism in the cases of interest, where the relation below can be inverted for $\dot{q}$) by $$ p = d_{\dot{q}}\mathcal{L} $$ which is $$ p_i(q, \dot{q}) = \frac{\partial \mathcal{L}}{\partial \dot{q}^i} $$ in coordinates.


I understand that there is a transformation from the space of Lagrangians (functions on the tangent bundle) to the space of Hamiltonians (functions on the cotangent bundle) which preserves the equations of motion. How can we derive this transformation from the constraint that the equations of motion are preserved?

One hint is that we can use the Euler-Lagrange equations and the duality condition derived from $\mathcal{L}$ to see

\begin{align*} \frac{dp_i}{dt} = \frac{\partial \mathcal{L}}{\partial q^i} = -\frac{\partial H}{\partial q^i} \end{align*}

From $$ \frac{dq^i}{dt} = \frac{\partial H}{\partial p_i} $$ we get by the chain rule $$ \frac{dq^i}{dt} = \frac{\partial H}{\partial p_i} = \frac{\partial H}{\partial q^j}\frac{\partial q^j}{\partial p_i} + \frac{\partial H}{\partial \dot{q}^j}\frac{\partial \dot{q}^j}{\partial p_i} $$ and we can probably proceed from there but I'm not sure.


Ok, so I know the answer is $$ H(q, p) = \dot{q}(q, p)p - \mathcal{L}(q, \dot{q}(q, p)) $$ but I don't want to assume this, I want to DERIVE this. Then after that derivation I want to observe that this transformation is a Legendre transformation and then build intuition about why it is a Legendre transform.

What I want is in contrast to the usual treatment where you

  1. start with a Lagrangian
  2. Apply a Legendre transform (why???) to get a Hamiltonian
  3. Use the Legendre transform and Euler-Lagrange equations to derive Hamilton's equations.

But given my description above, Hamiltonian mechanics can stand alone without needing Lagrangian mechanics defined first. So we should be able to use the features of each to derive the bridge between the two.

Jagerber48

5 Answers

3

Ok, so I know the answer is $$ H(q, p) = \dot{q}(q, p)p - \mathcal{L}(q, \dot{q}(q, p)) $$ but I don't want to assume this, I want to DERIVE this.

I'm not sure one can "derive" this in a suitably satisfying sense... But I will try to engage with the spirit of the question as to why we might move from a Lagrangian description to a Hamiltonian description.

What I want is in contrast to the usual treatment where you

  1. start with a Lagrangian

Start with an action: $$ S = \int_{t_1}^{t_2} dt L(q(t), \dot q(t), t)\;, \tag{1} $$ where the Lagrangian could potentially have explicit time dependence.

Following Schwinger's brief description of action principles in Chapter 9 of his EM textbook, we understand that the Lagrangian viewpoint treats the position $q$ and the time $t$ as independent variables. This statement about the time as independent may be slightly different from the usual presentation which tends to focus heavily on the equation of motion that results from varying $q$ rather than the time equation of motion.

So, what happens if we actually treat time as the independent variable being changed? There may be some (surmountable) conceptual difficulty, since a change in time endpoints changes the classical path, but formally we let $$ t\to \tilde t = t + \eta(t)\;, $$ where $\eta(t)$ is "small."

The action changes to $$ S \to \tilde S = \int_{t_1+\eta_1}^{t_2+\eta_2}d\tilde t L(\tilde q(\tilde t), \frac{d\tilde q}{d\tilde t}, \tilde t)\;,\tag{2} $$ where $$ \tilde q(\tilde t) \equiv q(t(\tilde t))\;, $$ is often not explicitly shown as a different function in some physics treatments (but obviously the functional form is different).

The action of Eq. (2) can be changed back to an integral with the original endpoints by a straightforward change of variables, with Jacobian $$ d\tilde t = dt (1+\dot \eta) $$ and, to first order in $\eta$, $$ \frac{d\tilde q}{d\tilde t}(\tilde t(t)) = (1-\dot \eta)\dot q(t)\;, $$ which results in re-writing Eq. (2) as $$ \tilde S = \int_{t_1}^{t_2}dt(1+\dot \eta)L(q(t),(1-\dot \eta)\dot q(t), t+\eta)\tag{3} $$ $$ = S + \int_{t_1}^{t_2}dt \eta(t)\left(\frac{d}{dt}\underbrace{(\dot q\frac{\partial L}{\partial \dot q} - L)}_{E} + \frac{\partial L}{\partial t}\right) - \int_{t_1}^{t_2}dt\frac{d}{dt}\underbrace{\left(\eta \dot q\frac{\partial L}{\partial \dot q}-\eta L \right)}_{\eta E}\;.\tag{4} $$

In Eq. (4) we find it very hard not to notice the appearance of this quantity $$ \dot q\frac{\partial L}{\partial \dot q} - L\;, $$ which actually tells us how the action changes when the time endpoints are changed (for the classical path). I.e., we found that the generator of time translation is: $$ \dot q\frac{\partial L}{\partial \dot q} - L\;, $$ which we perhaps want to give the name $H$, but for now I'll call it $E$: $$ E\equiv\dot q\frac{\partial L}{\partial \dot q} - L\;.\tag{5} $$

  2. Apply a Legendre transform (why???) to get a Hamiltonian

If you apply a similar treatment to the action change with varying the spatial endpoints, you find that the generator of spatial translations is $$ \frac{\partial L}{\partial \dot q}\;, $$ where we want to give this thing a name, and we decide to name it "momentum" $p$: $$ p\equiv \frac{\partial L}{\partial \dot q}\;. $$

We now are motivated to switch from the Lagrangian viewpoint (where $q$ and $t$ are independent) to the Hamiltonian viewpoint (where $q$, $t$, and $p$ are independent). $$ H(q,p,t) = \dot q p - L(q,\dot q(q, p), t)\;,\tag{6} $$ where $\dot q(q,p)$ is determined by inverting $$ p\equiv \frac{\partial L}{\partial \dot q}\;. $$

Or, somewhat more generally, we can define the Hamiltonian as $$ H(q,p,t) \equiv \sup_v(vp-L(q,v,t))\;,\tag{7} $$ which, if we can find the supremum by simply differentiating with respect to $v$, reduces back to Eq. (6).
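As a rough numerical illustration of Eq. (7), the sketch below approximates the supremum over $v$ by a ternary search. The harmonic-oscillator Lagrangian, the constants `M` and `K`, and the function name `legendre_transform` are assumptions made for this example only; the search also assumes that $L$ is strictly convex in $v$ and that the maximiser lies inside the chosen bracket.

```python
def legendre_transform(lagrangian, q, p, v_min=-50.0, v_max=50.0, iters=200):
    """Approximate H(q, p) = sup_v (v*p - L(q, v)) by ternary search.

    Assumes v -> v*p - L(q, v) is strictly concave on [v_min, v_max]
    and that its maximiser lies inside that bracket.
    """
    def objective(v):
        return v * p - lagrangian(q, v)

    lo, hi = v_min, v_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1  # the maximiser cannot lie left of m1
        else:
            hi = m2  # the maximiser cannot lie right of m2
    v_star = 0.5 * (lo + hi)
    return objective(v_star), v_star


# Hypothetical example system: L = (1/2) M v^2 - (1/2) K q^2
M, K = 2.0, 3.0


def lagrangian(q, v):
    return 0.5 * M * v * v - 0.5 * K * q * q


H, v_star = legendre_transform(lagrangian, q=0.7, p=1.5)

# For this L the closed forms are H = p^2/(2M) + K q^2/2 and v* = p/M.
H_exact = 1.5 ** 2 / (2.0 * M) + 0.5 * K * 0.7 ** 2
print(H, H_exact, v_star)
```

For this choice of $L$ the maximising velocity coincides with the solution of $p = \partial L/\partial v$, which is the sense in which Eq. (7) reduces to Eq. (6).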

Admittedly, the thing defined in Eq. (6) or Eq. (7) is not the same as the thing defined in Eq. (5), since they are functions of different variables. But the motivation for introducing the Legendre transformation is at least somewhat apparent now.

The same action of Eq. (1) can now be treated as having three independent variables ($q$, $t$, and $p$): $$ S = \int_{t_1}^{t_2} dt (\dot q p - H(q,p,t))\;, $$ and the independent variation of $q$, $t$, and $p$ results in the (Hamiltonian) equations of motion.

But given my description above, Hamiltonian mechanics can stand alone without needing Lagrangian mechanics defined first. So we should be able to use the features of each to derive the bridge between the two.

Yes, it presumably can stand alone. There are other questions and answers on this website that address this issue.

hft
2

From an optimizing standpoint, it is standard to go from the Lagrangian to the Hamiltonian. Recall that you want to minimize the action: $$ S=\int L(x,v,t)dt $$ but $x,v$ are not independent. They are related by the constraint: $$ \dot x=v $$ You can enforce this constraint by Lagrange multipliers. You now have a "Lagrangian" (in the optimization sense) of the action. The Lagrange multipliers are precisely canonical momenta $p$: $$ \hat S=\int[p(\dot x-v)+L]dt $$ Since now $v,x,p$ are all independent, you can minimize over $v$ and the Legendre transform naturally pops out: $$ \hat S=\int(p\dot x-H)dt $$ with: $$ H=\sup_v\{pv-L\} $$ You now are left with the dual problem (in the sense of optimization) where you need to maximize $\hat S$ by varying $p$. As you can see, the equations of motion are pretty much irrelevant to the discussion.

This approach generalizes to discrete time and brings further insight. If you start from the action: $$ S=\sum_nL_n\left(x_n,\frac{x_n-x_{n-1}}h\right)h $$ the same Lagrange multiplier approach and minimizing out the velocity gives you the dual problem: $$ \hat S=\sum_n\left[p_{n-1}(x_n-x_{n-1})-H_n(x_n,p_{n-1})h\right]\\ H_n(x_n,p_{n-1})=\sup_{v}\{p_{n-1}v-L_n(x_n,v)\} $$ For the original problem, the equations of motion are simply a succession of canonical maps: $$ p_n=\frac{\partial S_n}{\partial x_n}\quad p_{n-1}=-\frac{\partial S_n}{\partial x_{n-1}}\\ S_n(x_n,x_{n-1})=L_n\left(x_n,\frac{x_n-x_{n-1}}h\right)h $$ The dual problem is similarly a succession of canonical transformations, but with a type 2 generating function: $$ p_n=\frac{\partial\hat S_n}{\partial x_n}\quad x_{n-1}=\frac{\partial\hat S_n}{\partial p_{n-1}}\\ \hat S_n(x_n,p_{n-1})=p_{n-1}x_n-H_n(x_n,p_{n-1})h $$ Therefore, the passage from Lagrangian to Hamiltonian in the discrete setting is just going from type 1 to type 2 generating functions. The two are also related by the Legendre transform. The discretized time case also highlights that the Hamiltonian formulation is more convenient for deriving the resulting symplectic scheme.
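To make the type 2 map concrete, here is a minimal sketch of one step of the scheme generated by $\hat S_n(x_n,p_{n-1})=p_{n-1}x_n-H(x_n,p_{n-1})h$, under the assumption (mine, for illustration) that the Hamiltonian is separable, $H = T(p) + V(x)$, so that the relation for $x_n$ becomes explicit. The unit-mass, unit-frequency oscillator and the names `symplectic_euler_step`, `dT_dp`, `dV_dx` are hypothetical choices.

```python
def symplectic_euler_step(x, p, h, dT_dp, dV_dx):
    """One step of the map generated by S_hat = p_prev * x_new - h * H(x_new, p_prev).

    x_prev = dS_hat/dp_prev  ->  x_new = x_prev + h * dT/dp(p_prev)
    p_new  = dS_hat/dx_new   ->  p_new = p_prev - h * dV/dx(x_new)
    Assumes H = T(p) + V(x), so the first relation is explicit in x_new.
    """
    x_new = x + h * dT_dp(p)
    p_new = p - h * dV_dx(x_new)
    return x_new, p_new


# Hypothetical example: H = p^2/2 + x^2/2
def dT_dp(p):
    return p


def dV_dx(x):
    return x


def energy(x, p):
    return 0.5 * p * p + 0.5 * x * x


x, p, h = 1.0, 0.0, 0.1
e0 = energy(x, p)
max_drift = 0.0
for _ in range(10_000):
    x, p = symplectic_euler_step(x, p, h, dT_dp, dV_dx)
    max_drift = max(max_drift, abs(energy(x, p) - e0))

# The energy error oscillates but stays bounded over the whole run.
print(max_drift)
```

For this linear example the one-step map has unit Jacobian determinant, which is the discrete counterpart of the map being canonical.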

Hope this helps.

LPZ
1

This might be way simpler than what you're asking for, but there's a way of phrasing the usual argument so it looks like how Legendre transformations are done in thermodynamics.

Hamilton's equations are equivalent to the fact that $H(q, p)$ has differential $$dH = \dot{q} \, dp - \dot{p} \, dq.$$ In thermodynamics, the purpose of a Legendre transformation is just to convert terms like $X \, dY$ into $Y \, dX$. In this case, we might want to trade off $dp$ for $d \dot{q}$, since $\dot{q}$ is a useful quantity, and that's done by defining $L = p \dot{q} - H$. Then we have $$dL = \dot{p} \, dq + p \, d\dot{q}$$ where $p$ and $\dot{p}$ have to be viewed as functions of $q$ and $\dot{q}$. And this is just equivalent to the usual Euler-Lagrange equation. Of course the reasoning works in reverse too. It's not any weirder than switching between $U$ and $F$ in thermodynamics.
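As a concrete illustration (the harmonic oscillator is an example chosen here, not part of the argument above), take $H = \frac{p^2}{2m} + \frac{1}{2}kq^2$. Then $$dH = \frac{p}{m} \, dp + kq \, dq,$$ and comparing with $dH = \dot{q} \, dp - \dot{p} \, dq$ gives $\dot{q} = p/m$ and $\dot{p} = -kq$. Defining $L = p\dot{q} - H$ and eliminating $p = m\dot{q}$ gives $$L = \frac{1}{2}m\dot{q}^2 - \frac{1}{2}kq^2, \qquad dL = -kq \, dq + m\dot{q} \, d\dot{q},$$ which matches $dL = \dot{p} \, dq + p \, d\dot{q}$ term by term.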

knzhou
0

This can be a little bit motivated but I wonder if there is a better picture.

The Euler-Lagrange equations are \begin{align*} \frac{d}{dt}\left(\frac{\partial \mathcal{L}}{\partial \dot{q}}\right) = \frac{\partial \mathcal{L}}{\partial q} \end{align*}

Hamilton's equations are \begin{align*} \frac{dq}{dt} = \frac{\partial H}{\partial p}\\ \frac{dp}{dt} = -\frac{\partial H}{\partial q} \end{align*}

We can make an identification $$ p(q, \dot{q}) = \frac{\partial \mathcal{L}}{\partial \dot{q}} $$ In the cases of interest, this expression can be inverted to calculate $\dot{q}(q, p)$.

This transforms the Euler-Lagrange equations into \begin{align*} \frac{dp}{dt} = \frac{\partial \mathcal{L}}{\partial q} \end{align*} We look at the two equations \begin{align*} \frac{\partial H}{\partial p} =& \dot{q}\\ \frac{\partial H}{\partial q} =& -\frac{dp}{dt} = -\frac{\partial \mathcal{L}}{\partial q} \end{align*}

The first expression suggests that $H$ might have a term like $p\dot{q}$, while the second suggests that $H$ might have a term like $-\mathcal{L}$. We can then guess

\begin{align*} H(q, p) = p\dot{q}(q, p) - \mathcal{L}(q, \dot{q}(q, p)) \end{align*} The motivation above was made neglecting the implicit dependencies. So we must check \begin{align*} \frac{\partial H}{\partial p} = \dot{q}(q, p) + p \frac{\partial \dot{q}}{\partial p} - \frac{\partial \mathcal{L}}{\partial \dot{q}} \frac{\partial\dot{q}}{\partial p} \end{align*} But the last two terms cancel because $\partial\mathcal{L}/\partial \dot{q} = p$, leaving $\partial H/\partial p = \dot{q}$. For the other derivative we have \begin{align*} \frac{\partial H}{\partial q} = p\frac{\partial \dot{q}}{\partial q} - \frac{\partial \mathcal{L}}{\partial q} - \frac{\partial \mathcal{L}}{\partial \dot{q}}\frac{\partial \dot{q}}{\partial q} \end{align*} In this case the first and third terms cancel, again because $\partial \mathcal{L}/\partial \dot{q} = p$, leaving $\partial H/\partial q = -\partial \mathcal{L}/\partial q$.

So $H$ defined in this way satisfies the necessary differential equations to ensure that, after converting $\dot{q}$ to $p$, the equations of motion will be preserved.
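The two cancellations can also be spot-checked numerically. The sketch below uses a Lagrangian with a position-dependent kinetic term, $\mathcal{L} = \frac{1}{2}(1+q^2)\dot{q}^2 - \frac{1}{4}q^4$, which is purely an assumption for this example (chosen so that $\dot{q}(q,p)$ really does depend on $q$), and compares central finite differences of $H$ against $\dot{q}$ and $-\partial\mathcal{L}/\partial q$.

```python
def lagrangian(q, v):
    # Hypothetical example: L = (1/2)(1 + q^2) v^2 - q^4 / 4
    return 0.5 * (1.0 + q * q) * v * v - 0.25 * q ** 4


def velocity(q, p):
    # Inverse of p = dL/dv = (1 + q^2) v
    return p / (1.0 + q * q)


def hamiltonian(q, p):
    v = velocity(q, p)
    return p * v - lagrangian(q, v)


def central_diff(f, x, eps=1e-6):
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


q0, p0 = 0.8, 1.3
v0 = velocity(q0, p0)

dH_dp = central_diff(lambda p: hamiltonian(q0, p), p0)
dH_dq = central_diff(lambda q: hamiltonian(q, p0), q0)
# dL/dq is taken with the velocity held fixed at v0, as in the Euler-Lagrange equation.
dL_dq = central_diff(lambda q: lagrangian(q, v0), q0)

print(dH_dp - v0)      # close to zero: dH/dp = qdot
print(dH_dq + dL_dq)   # close to zero: dH/dq = -dL/dq
```

Note that $\partial H/\partial q$ is taken at fixed $p$ while $\partial\mathcal{L}/\partial q$ is taken at fixed $\dot{q}$; the agreement depends on that distinction.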


This is better than the totally unmotivated derivations I've seen previously but I'm still left wanting a little more motivation.

For example, I wonder if there is some property of Legendre transforms that makes it obvious that exactly this transform will preserve the equations of motion.

Jagerber48
0

If you are happy starting from the premise that physics is defined on a symplectic manifold and there's a function called the Hamiltonian that generates flows on the manifold using the symplectic structure, etc, then I think it's not so bad to show this will link to the Lagrangian point of view without knowing in advance about a Legendre transformation.

In particular, we start with Hamilton's equations (which fall out of all the stuff about symplectic manifolds, etc, mentioned above) \begin{eqnarray} \frac{\partial H}{\partial p} &=& \dot{q} \\ \frac{\partial H}{\partial q} &=& -\dot{p} \\ \end{eqnarray} Then we observe these equations can be obtained by minimizing the action in first order form $$ S = \int dt \left(p \dot{q} - H(p, q)\right) $$ since doing a variation of the action yields $$ \delta S = \int dt \left[ -\left(\dot{p} + \frac{\partial H}{\partial q}\right)\delta q + \left(\dot{q} - \frac{\partial H}{\partial p}\right) \delta p\right] $$ How would you come up with $S$ without knowing it in advance? Well, knowing calculus of variations it isn't that hard to come up with it by trial and error, it's basically the simplest two terms you could write down with the right units, so you could have just tried those two terms with arbitrary coefficients to derive it. And I'm assuming you are also happy with the principle of least action as its own independent formulation of physics as part of the premise of this question, so you know at some point you need to reformulate Hamilton's equations as a minimum principle.

At this point, what's left is to do a change of variables from $p$ to $\dot{q}$. Now, normally you need to be careful about plugging equations of motion into an action. But if the equation $$ \frac{\partial H}{\partial p} = \dot{q} $$ is an invertible, algebraic equation for $p$ (which it is in "ordinary" theories that show up in mechanics), then we can solve it and plug the solution back into the action. For example, for a particle moving in an arbitrary potential, with $H = \frac{p^2}{2m} + V(q)$, this equation would be $\frac{p}{m} = \dot{q}$, which is invertible and can be solved for $p$ without needing to invert any differential operators.

So, assuming we have solved this equation to get $p(\dot{q})$, then the action is now written in the form we want $$ S = \int dt \left(p(\dot{q})\dot{q} - H(q, p(\dot{q}))\right) = \int dt L(q, \dot{q}) $$ where $L$ is defined by the transformation $L = p\dot{q} - H$. The Legendre transformation has turned up without us putting it in by hand.
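The inversion step can be sketched numerically for a case where solving for $p$ is less trivial than $p = m\dot{q}$. The example below is an assumption of mine, not taken from the argument above: a free relativistic particle with $H = \sqrt{p^2 + m^2}$ in units with $c = 1$. It solves $\partial H/\partial p = \dot{q}$ by bisection, which presumes that $\partial H/\partial p$ is increasing in $p$ and that the root lies in the chosen bracket.

```python
import math

M = 1.0  # hypothetical mass, units with c = 1


def hamiltonian(q, p):
    return math.sqrt(p * p + M * M)


def dH_dp(q, p):
    return p / math.sqrt(p * p + M * M)


def solve_momentum(q, qdot, p_lo=-1e3, p_hi=1e3, iters=200):
    """Solve dH/dp(q, p) = qdot for p by bisection.

    Assumes dH/dp is increasing in p and the root lies in [p_lo, p_hi].
    """
    lo, hi = p_lo, p_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if dH_dp(q, mid) < qdot:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lagrangian(q, qdot):
    # L = p * qdot - H, evaluated at the p that solves dH/dp = qdot
    p = solve_momentum(q, qdot)
    return p * qdot - hamiltonian(q, p)


L_num = lagrangian(0.0, 0.6)
L_exact = -M * math.sqrt(1.0 - 0.6 ** 2)  # known closed form: L = -m sqrt(1 - qdot^2)
print(L_num, L_exact)
```

Only an algebraic root-find in $p$ is needed at each $(q, \dot{q})$; no differential operator is inverted, which is the point made above.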

As pointed out in an answer linked in the comments, here I'm assuming $H$ is uniquely specified. In a general constrained Hamiltonian system, such as appears in gauge theory, this isn't true, since you can add terms proportional to the constraints. So seeing an equivalence between the Hamiltonian for QED and the covariant form of the Lagrangian for QED would involve some additional steps to handle those gauge/constraint ambiguities. But I'm assuming you aren't interested in those details for this question.

Andrew