Bailey and Hamilton's Law of Varying Action

Question

Starting in the early 1970s and continuing for more than twenty years, Bailey and his students published a series of articles on the application of what they called Hamilton's Law of Varying Action (HLVA); a short list is given below. To say that their work was mercilessly attacked by many for alleged lack of historical accuracy, errors in logic, errors in basic variational concepts, errors in mathematical analysis, etc., would understate the critics' message. One quite detailed attack came from Papastavridis, who has himself written a 1,400+ page (yes!) treatise on variational mechanics.

I am not interested in the historical originality of Bailey's work or whether he understands the role of Hamilton-Jacobi transformations (I don't...), etc. My question is really about the engineering aspect (in a broad sense) of what he was proposing, and what I believe was actually ignored by his critics. Below is my short view of Bailey's method and I ask the experts here to opine on its meaning.

Let $\mathcal F = F(\{x_i\},\{v_i\},t)$ with $i=1,2,\dots,n,$ be a continuously differentiable function of $2n+1$ variables, and let $\eta_i = \eta_i (t)$ be a set of continuously differentiable but otherwise arbitrary functions of time.

Now fix $t_0 < t_1$ and form the integral $$ \mathcal A[\mathcal F(x)] =\int_{t_0}^{t_1}dt F(\{x_i\},\{\dot x_i\},t) \tag{1}\label{1}$$ Let $\epsilon >0$ and calculate the difference $$ \mathcal A[F(x+\epsilon \eta, \dot x +\epsilon \dot \eta, t)]-\mathcal A[ F(x, \dot x, t)] =\int_{t_0}^{t_1}dt\left(F(x+\epsilon \eta, \dot x +\epsilon \dot \eta, t)- F(x, \dot x, t)\right)\\ =\epsilon \int_{t_0}^{t_1}dt\sum_i\left[\frac{\partial F(x, v, t)}{\partial x_i}\eta_i +\left. \frac{\partial F(x, v, t)}{\partial v_i}\right|_{v_i=\dot x_i}^{} \dot \eta_i\right] +\mathcal O(\epsilon^2)\\ =\epsilon\sum_i\int_{t_0}^{t_1}dt\left[\frac{\partial F(x, v, t)}{\partial x_i}\eta_i + \frac{d}{dt}\left(\left. \frac{\partial F(x, v, t)}{\partial v_i}\right|_{v_i=\dot x_i} \eta_i\right) - \frac{d}{dt}\left( \left. \frac{\partial F(x, v, t)}{\partial v_i}\right|_{v_i=\dot x_i} \right) \eta_i \right]+\mathcal O(\epsilon^2)\tag{2}\label{2}$$ Then, suppressing the indices in the final line for readability (the summations being understood implicitly), $$\left. \frac{d\mathcal A}{d\epsilon}\right|_{\epsilon=0}=\sum_i\int_{t_0}^{t_1}dt\left[ \frac{d}{dt}\left(\left.\frac{\partial F(x, v, t)}{\partial v_i}\right|_{v_i=\dot x_i} \eta_i\right) \right] +\sum_i\int_{t_0}^{t_1}dt\left[ \frac{\partial F(x, v, t)}{\partial x_i}\eta_i - \frac{d}{dt}\left( \left.\frac{\partial F(x, v, t)}{\partial v_i}\right|_{v_i=\dot x_i} \right) \eta_i \right]\\ =\left[\left. \frac{\partial F(x, v, t)}{\partial v}\right|_{v=\dot x} \eta\right]_{t_0}^{t_1} +\int_{t_0}^{t_1}dt\left[\frac{\partial F(x, v, t)}{\partial x}- \frac{d}{dt}\left( \left.\frac{\partial F(x, v, t)}{\partial v}\right|_{v=\dot x} \right) \right]\eta\tag{4}\label{4}.$$

Combining $\eqref{2}$ and $\eqref{4}$ and writing $v=\dot x$ for short, we get $$\int_{t_0}^{t_1}dt \left[\frac{\partial F(x, \dot x, t)}{\partial x}\eta + \frac{\partial F(x, \dot x, t)}{\partial \dot x} \dot \eta\right] - \left[ \frac{\partial F(x, \dot x, t)}{\partial \dot x} \eta\right]_{t_0}^{t_1}=\int_{t_0}^{t_1}dt\left[\frac{\partial F(x, \dot x, t)}{\partial x}- \frac{d}{dt}\left( \frac{\partial F(x, \dot x, t)}{\partial \dot x} \right) \right]\eta\tag{5}\label{5}.$$

Equation $\eqref {5}$ is called Hamilton's Law of Varying Action; it can be found in Hamilton's 1835 memoir, and nothing in this derivation is controversial so far.

Now let $t_0=0$ for short. Bailey sets the variation at the lower limit to zero, $\eta_i(0)=0$ (or $\eta(0)=0$ with indices suppressed), for all coordinates. I also write $\tau$ for the dummy integration variable and $t$ for the upper limit, the end time:
$$\int_{0}^{t}d\tau \left[\frac{\partial F(x, \dot x, \tau)}{\partial x}\eta + \frac{\partial F(x, \dot x, \tau)}{\partial \dot x} \dot \eta\right] - \left.\left[ \frac{ \partial F(x, \dot x, t)}{\partial \dot x} \eta\right]\right|_t=\int_{0}^{t}d\tau\left[\frac{\partial F(x, \dot x, \tau)}{\partial x}- \frac{d}{d\tau}\left( \frac{\partial F(x, \dot x, \tau)}{\partial \dot x} \right) \right]\eta\tag{6}\label{6}$$

Equation $\eqref{6}$ is actually an identity: no physics is involved so far, just summation, partial differentiation, integration, etc. Being completely general, it must also hold when the Euler-Lagrange equations hold, and that is where physics comes in. Specifically, we demand that the Euler-Lagrange expression in the bracket of the RHS integral be zero, that is, that the true trajectory $x_i(t)$, $i=1,2,\dots,n,$ satisfies, for a given $F=F(x,\dot x, t)$, a system of $n$ simultaneous second-order differential equations: $$\frac{\partial F(x, \dot x, t)}{\partial x_i} - \frac{d}{dt}\left( \frac{\partial F(x, \dot x, t)}{\partial \dot x_i} \right)=0,\qquad i=1,2,\dots,n. \tag{EL}\label{EL}$$ Now Bailey says that, because of the arbitrariness of $\eta$, we can "solve" this set of $\eqref{EL}$ equations both analytically and algebraically in a successive approximation scheme by requiring that the left-hand side of $\eqref {6}$ be zero. He proposes that we use a set of arbitrary differentiable functions $X_i$ of time $t$ with a set of "variational" parameters $a=a_1,a_2,\dots,a_m,\dots$ and write, in the $m^{th}$ approximation step, $x_i(t) = X_i(t;a_1,a_2,\dots, a_m)$. Our job is then to find the particular set $a=\{a_k\}$ for which

$$\int_{0}^{t}d\tau \left[\frac{\partial F(x, \dot x, \tau)}{\partial x}\eta + \frac{\partial F(x, \dot x, \tau)}{\partial \dot x} \dot \eta\right] - \frac{\partial F(x, \dot x, t)}{\partial \dot x} \eta(t) = 0\tag{7}\label{7}.$$

Since the time function $x_i=X_i(t;a)$ is given, the variation of $x_i$ is $\eta_i= \sum_k \frac{\partial X_i}{\partial a_k}\delta a_k$ and that of $\dot x_i$ is $\dot \eta_i= \sum_k \frac{\partial \dot X_i}{\partial a_k}\delta a_k$, resulting in Bailey's equation:

$$\left[\int_{0}^{t}d\tau \left(\frac{\partial F(x, \dot x, \tau)}{\partial x}\frac{\partial X}{\partial a}\ + \frac{\partial F(x, \dot x, \tau)}{\partial \dot x} \frac{\partial \dot X}{\partial a}\right) - \frac{\partial F(x, \dot x, t)}{\partial \dot x} \frac{\partial X}{\partial a}\right]\delta a = 0 \tag{8}\label{8}.$$

Since the variations $\delta a_k$ are independent of each other, for a given $t$ fixing the upper limit of integration we get a set of algebraic equations to be solved for the unknown parameters $a_k$. Let us write this out explicitly in index notation:

$$\sum_i\int_{0}^{t}d\tau \left(\frac{\partial F(x, \dot x, \tau)}{\partial x_i}\frac{\partial X_i}{\partial a_k}\ + \frac{\partial F(x, \dot x, \tau)}{\partial \dot x_i} \frac{\partial \dot X_i}{\partial a_k}\right) = \sum_i \frac{\partial F(x, \dot x, t)}{\partial \dot x_i} \frac{\partial X_i}{\partial a_k}\\ k=1,2,...,m \tag{9}\label{9}.$$

We can force the initial coordinates at $t=0$ to be $x_i(0)=x_i^0$ by choosing trial functions of the form $x_i(t)=x_i^0+tX_i(t;a)$, or additionally impose the initial velocities $\dot x_i(0)=\dot x_i^0$ with $x_i(t)=x_i^0+ \dot x_i^0t+X_i(t;a)t^2$, etc. Not just differential constraints but other types of constraints can be enforced as well.

In $\eqref{9}$ we are given the functions $F(x,v,t)$ and $X_i(t;a_1,a_2,\dots,a_m)$, and only the parameters $\{a_k\}$ are unknown. Once the integrations are carried out with the unknown parameters left symbolic, both sides are just functions of $\{a_k\}$, and thus we have a system of $m$ equations in exactly $m$ unknowns. As $m$ increases, the solution $\{a_k^*\}$ of this system of algebraic equations is, in the limit, the one that provides the trajectory $x_i(t)=X_i(t; a_1^*,a_2^*,\dots,a_m^*,\dots)$ that satisfies the Euler-Lagrange equations and the prescribed initial conditions.

Thus, Bailey's method generates explicit solutions of the Euler-Lagrange equations in the limit of the successive approximation as initial value problems compatible with the prescribed forms of the trial functions. Note that Bailey and several other researchers used this method to find numerical solutions to the Euler-Lagrange equations with success.
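To make the algebra in $\eqref{9}$ concrete, here is a minimal sketch, not Bailey's own implementation. It assumes a harmonic oscillator, $F=\tfrac12 v^2-\tfrac12\omega^2x^2$, and a monomial trial function $x(\tau)=x^0+\dot x^0\tau+\sum_{k=1}^m a_k\tau^{k+1}$; the function name `bailey_oscillator` and the use of NumPy's `Polynomial` class are my own choices for illustration. Because this $F$ is quadratic, $\eqref{9}$ becomes a linear system for the $a_k$.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

def bailey_oscillator(x0, v0, omega, T, m):
    """Hypothetical helper: solve Eq. (9) for F = v**2/2 - omega**2*x**2/2
    with the assumed trial function
        x(tau) = x0 + v0*tau + sum_{k=1..m} a_k * tau**(k+1)."""
    tau = P([0.0, 1.0])
    phi = [tau ** (k + 1) for k in range(1, m + 1)]  # dX/da_k; zero at tau = 0
    base = P([x0, v0])                               # part fixed by initial data

    def lhs(x, eta):
        # Integral of (dF/dx * eta + dF/dv * eta_dot) over [0, T]
        # minus the boundary term dF/dv * eta at tau = T.
        integrand = -omega**2 * x * eta + x.deriv() * eta.deriv()
        return integrand.integ()(T) - x.deriv()(T) * eta(T)

    # F is quadratic, so Eq. (9) is linear in a:  A a = -c.
    A = np.array([[lhs(pj, pk) for pj in phi] for pk in phi])
    c = np.array([lhs(base, pk) for pk in phi])
    a = np.linalg.solve(A, -c)
    x = base + sum(float(ak) * pk for ak, pk in zip(a, phi))
    return a, x

a, x = bailey_oscillator(1.0, 0.0, 1.0, 1.0, 8)
t = np.linspace(0.0, 1.0, 11)
max_err = np.max(np.abs(x(t) - np.cos(t)))   # compare with the exact cos(t)

# With a single parameter the result depends on the chosen end time T.
a_T1, _ = bailey_oscillator(1.0, 0.0, 1.0, 1.0, 1)
a_T2, _ = bailey_oscillator(1.0, 0.0, 1.0, 2.0, 1)
```

For $m=1$ the single coefficient can be worked out by hand: it is $-5/13$ for end time $T=1$ and $-5/22$ for $T=2$. So at finite $m$ the parameters do depend on the chosen end time, which is relevant to the puzzle raised next.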

But something is really strange about this whole idea. We set the end time $t>0$ arbitrarily and get a set of parameters $a=\{a_k\}$ from the system of algebraic equations $\eqref{9}$ that will give us the trajectory $x_i(t)=X_i(t; a)$ but then $x_i(\tau)=X_i(\tau; a)$ must also hold for all $0\le\tau \le t.$ So what happens if we pick an instant $t'>t$; shall we still have the same trajectory so that $x_i(t')=X_i(t'; a)$? We should, because at $t'=t$ they have the same $x_i(t)$ as initial conditions.

If the Euler-Lagrange equations $\eqref{EL}$ are assumed to hold and $t$ is taken to be infinitesimal, we just get the triviality $0=0.$ In other words, it is essential that $t$ be finite and not infinitesimal; then we do get the full trajectory $x_i(t)=X_i(t; a)$ for all $t,$ but then the parameters $\{a_k\}$ must, in the limit, be related to the integral invariants of the EL equations.

Question: How are these $a_k$ parameters related to the various invariants of the trajectory?

References:

Hitzl: "Implementing Hamilton's Law of Varying Action with Shifted Legendre Polynomials," Journal of Computational Physics 38, pp. 185-211 (1980)
Papastavridis: "The Variational Principles of Mechanics, and a Reply to C. D. Bailey," Journal of Sound and Vibration 118(2), pp. 378-393 (1987)
Bailey: "A New Look at Hamilton's Principle," Foundations of Physics, Vol. 5, No. 3 (1975)
Bailey: "Application of Hamilton's Law of Varying Action," AIAA Journal, Vol. 13, No. 9, pp. 1154-1157
Bailey: "Further Remarks on the Law of Varying Action," Journal of Sound and Vibration 131(2), pp. 331-344 (1989)

Valter Moretti · Answer 1 · 2024-07-23T12:00:30.053

Well, the set of curves $x_i(t)=X_i(t; a_1^*,a_2^*,..a_m^*)$, generally speaking, does not satisfy the EL equations (EL), contrary to what is argued in the post (if I have correctly understood the explanation above; otherwise I apologise!).

We have the following relations.

(EL) imply (6);
equations (6) imply (EL) only if the functional variations $\eta$ can be chosen arbitrarily in a suitable space of (sufficiently regular) functions such that, for every neighborhood of a point on the real (temporal) line, the space contains a non-negative, non-vanishing function with support in that neighborhood. This space is therefore necessarily infinite-dimensional;
in general, (9) does not imply (6), and thus does not imply (EL) either. That is because, in (9), the possible variations span only a finite-dimensional space of functions.

The set of curves $x_i(t)=X_i(t; a_1^*,a_2^*,..a_m^*)$ is just a solution of the algebraic equations (9).
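
To make the last point concrete, here is a short worked step under one assumption: the trial functions are built so that $\partial X_i/\partial a_k$ vanishes at $\tau=0$, as in the forms $x_i^0+tX_i(t;a)$ used in the question. Inserting $\eta_i=\partial X_i/\partial a_k$ into the identity (6) then shows that (9) is equivalent to $$\sum_i\int_{0}^{t}d\tau\left[\frac{\partial F}{\partial x_i}- \frac{d}{d\tau}\left(\frac{\partial F}{\partial \dot x_i}\right)\right]\frac{\partial X_i}{\partial a_k}=0,\qquad k=1,2,\dots,m,$$ with everything evaluated on $x=X(\tau;a)$. In other words, the Euler-Lagrange expression is only required to be orthogonal to the $m$ functions $\partial X_i/\partial a_k$ on $[0,t]$; it is not required to vanish pointwise.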

In general, there is no chance of tuning a finite set of arbitrary constants $a=\{a_j\}$ within a given finite set of functions $X_i(t,a)$ so that it describes all solutions of the EL equations: it is a desperate task!

At most, with a suitable choice of the functions $X_i(t,a)$, it is possible to approximate some types of solutions. And I think this method should be very efficient in that context.

So the answer to the final question:

"How are these $a_k$ parameters related to the various invariants of the trajectory?"

is "They are not related".

Cleonis · Answer 2 · 2024-07-23T14:14:19.030

About the content of the NASA document (where Bailey was professor of Aeronautics), that forms the basis of the 1975 original publication of the same title: Application of Hamilton's law of Varying Action

A quote from page 6:

Note that the zero on the right hand side of eq. 4 results from the fact that nature requires the equations of equilibrium to vanish as observed by Newton. It has nothing to do with the proof in variational calculus that the integral is an extremum.

Quote from page 13

$t_i$ is arbitrary. However, it is kept relatively small with the understanding that a longer period of time can be examined simply by taking the final conditions as calculated for one interval as the initial conditions of the next interval. It should be emphasized that the number of terms required in the truncated power series is not the important criteria from a practical viewpoint. The computer time required for solutions is the important criteria. Ten terms in the time variable have been found to be sufficient for all non-stationary problems of particles, beams, and plates treated to date. With this number of terms in the truncated power series, the computer time for every case of single degree of freedom particle motion was below the minimum amount ($1.68) charged for the computer and the accuracy, as will be shown, was far above expectation.

I think that in the 1970s the considerations of Professor Bailey were driven by the fact that his productivity was constrained by compute power. Making his computations twice as efficient made him and his team twice as productive.

As long as the accuracy of the results is within specification, with generous margin, Bailey will keep pushing for more efficiency.
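
The interval-restart strategy described in the second quote can be sketched as follows. This is only an illustration under my own assumptions, not Bailey's code: a harmonic oscillator with $F=\tfrac12 v^2-\tfrac12\omega^2x^2$, a truncated power series $x(\tau)=x^0+\dot x^0\tau+\sum_{k=1}^m a_k\tau^{k+1}$ on each interval, and hypothetical helper names `step` and `march`. Each interval solves the algebraic system obtained from eq. (9) of the question, and the computed final position and velocity become the initial conditions of the next interval.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

def step(x0, v0, omega, h, m):
    """Hypothetical helper: one interval of length h. Solves the linear
    system from Eq. (9) for the oscillator and returns (x, v) at tau = h."""
    tau = P([0.0, 1.0])
    phi = [tau ** (k + 1) for k in range(1, m + 1)]  # variations, zero at 0
    base = P([x0, v0])                               # fixed by initial data

    def lhs(x, eta):
        # Integral term of Eq. (9) minus the boundary term at tau = h.
        g = -omega**2 * x * eta + x.deriv() * eta.deriv()
        return g.integ()(h) - x.deriv()(h) * eta(h)

    A = np.array([[lhs(pj, pk) for pj in phi] for pk in phi])
    c = np.array([lhs(base, pk) for pk in phi])
    a = np.linalg.solve(A, -c)
    x = base + sum(float(ak) * pk for ak, pk in zip(a, phi))
    return float(x(h)), float(x.deriv()(h))

def march(x0, v0, omega, h, m, n_steps):
    """Chain intervals: final conditions of one are the initial
    conditions of the next."""
    for _ in range(n_steps):
        x0, v0 = step(x0, v0, omega, h, m)
    return x0, v0

# 20 intervals of length 0.5 with 6 series terms each, i.e. up to t = 10.
x_end, v_end = march(1.0, 0.0, 1.0, 0.5, 6, 20)
exact = (np.cos(10.0), -np.sin(10.0))   # reference solution for comparison
```

Keeping each interval short keeps the small linear system well conditioned even with a plain monomial basis, which fits the quote's emphasis on computer time rather than on the number of series terms.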

About the newtonian formulation of mechanics:
The Newtonian formulation of mechanics carries over to generalized coordinates, with the concept of force extended to generalized force.

See: Richard Fitzpatrick, University of Texas at Austin
Generalized forces

My hypothesis as to what is going on:

Professor Bailey keeps insisting that in his numerical analysis algorithms he can totally accommodate non-conservative forces. My opinion (based on the contents of the article) is that the reason Bailey can accommodate non-conservative forces is that his numerical integration implementations actually follow the Newtonian formulation (ported to generalized coordinates). We have every reason to trust that the numerical integration algorithms were fine.

Hypothesis:
Professor Bailey has talked himself into believing that applying generalized coordinates automatically means you are applying Hamilton's principle. (Could be a two-stage association: applying generalized coordinates => Lagrangian mechanics; using Lagrangian mechanics => Hamilton's principle.)

That, I hypothesize, is the reason why professor Bailey keeps insisting that everything he does is extension of Hamilton's principle.

(With idiosyncratic notation to make the expressions similar in visual appearance to application of Hamilton's principle.)

Summary of my opinion:

What Professor Bailey was doing was numerical integration, according to the Newtonian formulation, with generalized coordinates.

But at the same time Professor Bailey had convinced himself that in formal descriptions of his methodology he had to insist that he was using Hamilton's principle.
