Exercise 1.15: Equivalence

This is one of the more important exercises in the chapter. The problem asks for a proof that it's possible to absorb a coordinate transformation directly into the Lagrangian. If you can do this, you can express your paths and your forces in whatever coordinates you like, so long as you can transition between them.

I also found that this exposed, and repaired, my weakness with the functional notation that Sussman and Wisdom have used in the book.

The problem states:

Show by direct calculation that the Lagrange equations for $L'$ are satisfied if the Lagrange equations for $L$ are satisfied.

Definitions and Preliminaries

$L'$ and $L$ refer to ideas from section 1.6.1. The major promise of Lagrangian mechanics is that both the potential and kinetic energies are coordinate-independent; looking at the system through different coordinates can't affect the energy in the system, or you've dropped some information in the conversation. If this is true, then the value of the Lagrangian, equal to $T - V$, must be coordinate independent too.

Imagine a coordinate transformation $F$, maybe time-dependent, that can convert "primed coordinates" like $x'$ into "unprimed coordinates", $x$:

$$ x = F(t, x') $$

You might imagine that $x'$ is in polar coordinates and $x$ is in rectangular coordinates. Time-dependence means that the coordinate system itself is moving around in time. Imagine looking at the world out of a spinning window.

Now imagine some Lagrangian $L$ that acts on a local tuple built out of the unprimed coordinates. If the Lagrangian's value is coordinate-independent, there must be some other form, $L'$, that can act on the primed coordinates. $L'$ has to satisfy:

$$ L' \circ \Gamma[q'] = L \circ \Gamma[q] $$

The function $F$ only acts on specific coordinates. Imagine a function $C$ that uses $F$ to transform the entire local tuple from the primed coordinates to the unprimed coordinates:

$$ L \circ \Gamma[q] = L \circ C \circ \Gamma[q'] = L' \circ \Gamma[q'] $$

Function composition is associative, so two facts stare at us from \eqref{eq:1-15-relations}:

$$ \begin{aligned} L' & = L \circ C \cr \Gamma[q] & = C \circ \Gamma[q'] \end{aligned} $$

This implies that if we want describe our path in some primed coordinate system (imagine polar coordinates), but we only have a Lagrangian defined in an unprimed coordinate system (like rectangular coordinates), if we can just write down $C$ we can generate a new Lagrangian $L'$ by composing our old $L$ with $C$. This brings us to the exercise.

The goal is to show that if the Lagrange equations hold in the unprimed coordinate system:

$$ D(\partial_2L \circ \Gamma[q]) - (\partial_1L \circ \Gamma[q]) = 0 $$

Then they hold in the primed coordinate system:

$$ D(\partial_2L' \circ \Gamma[q']) - (\partial_1L' \circ \Gamma[q']) = 0 $$

Approach

The approach we'll take will be to:

Write down the form of $C$, given some arbitrary $F$
start calculating the terms of the Lagrange equations in the primed coordinate system using $L' = L \circ C$
Keep an eye out for some spot where we can use our assumption that the Lagrange equations hold in the unprimed coordinates.

I'll walk through a solution using "pen and paper", then show how we can run this derivation using Scheme to help us along.

First, some Scheme tools that will help us in both cases.

Scheme Tools

Equation (1.77) in the book describes how to implement $C$ given some arbitrary $F$. Looking ahead slightly, this is implemented as F->C on page 46.

The following function is a slight redefinition that allows us to use an $F$ that takes an explicit $(t, x')$, instead of the entire local tuple:

(define ((F->C* F) local)
  (let ((t (time local))
        (x (coordinate local))
        (v (velocity local)))
    (up t
        (F t x)
        (+ (((partial 0) F) t x)
           (* (((partial 1) F) t x)
              v)))))

Next we define $F$, $C$ and $L$ as described above, as well as qprime, a function that can represent our unprimed coordinate path function.

The types here all imply that the path has one real coordinate. I did this to make the types easier to understand; the derivation applies equally well to paths with many dimensions.

(define F
  (literal-function 'F (-> (X Real Real) Real)))

(define C (F->C* F))

(define L
  (literal-function 'L (-> (UP Real Real Real) Real)))

(define qprime
  (literal-function 'qprime))

When we apply $C$ to the primed local tuple, do we get the transformed tuple that we expect from 1.77 in the book?

(->tex-equation
 ((compose C (Gamma qprime)) 't))

$$ \begin{pmatrix} \displaystyle{ t} \cr \cr \displaystyle{ F\left( t, {q}^\prime\left( t \right) \right)} \cr \cr \displaystyle{ D{q}^\prime\left( t \right) {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) + {\partial}_{0}F\left( t, {q}^\prime\left( t \right) \right)}\end{pmatrix} $$

This looks correct. We can also transform the path before passing it to $\Gamma$:

(define ((to-q F qp) t)
  (F t (qp t)))

Subtract the two forms to see that they're equivalent:

(->tex-equations
 ((- (compose C (Gamma qprime))
     (Gamma (to-q F qprime)))
  't))

$$ \begin{pmatrix} \displaystyle{ 0} \cr \cr \displaystyle{ 0} \cr \cr \displaystyle{ 0}\end{pmatrix} $$

Now that we know $C$ is correct we can define $q$, the unprimed coordinate path function, and Lprime:

(define q (to-q F qprime))
(define Lprime (compose L C))

Derivation

Begin by calculating the components of the Lagrange equations in equation \eqref{eq:lagrange-prime}. Examine the $\partial_2L'$ term first.

As we discussed above, function composition is associative, so:

$$ (L \circ C) \circ \Gamma[q'] = L' \circ \Gamma[q'] \implies L' = L \circ C $$

Substituting $L'$ from \eqref{eq:c-l} and using the chain rule:

$$ \partial_2L' = \partial_2(L \circ C) = ((DL) \circ C) \partial_2 C $$

I found the next step troubling until I became more comfortable with the functional notation.

$C$ is a function that transforms a local tuple. It takes 3 arguments (a tuple with 3 elements, technically) and returns 3 arguments. $\partial_2 C$ is an up-tuple with 3 entries. Each entry describes the derivative each component of $C$'s output with respect to the velocity component of the local tuple.

$L$ is a function that transforms the 3-element local to a scalar output. $DL$ is a down-tuple with 3 entries. Each entry describes the derivative of the single output with respect to each entry of the local tuple.

The tuple algebra described in Chapter 9 defines multiplication between an up and down tuple as a dot product, or a "contraction" in the book's language. This means that we can expand out the product above:

$$ (DL \circ C)\partial_2 C = (\partial_0L \circ C)(I_0 \circ \partial_2 C) + (\partial_1L \circ C)(I_1 \circ \partial_2 C) + (\partial_2L \circ C)(I_2 \circ \partial_2 C) $$

$I_0$, $I_1$ and $I_2$ are "selectors" that return that single element of the local tuple.

Example the value of $\partial_2C$ using our Scheme utilities:

(->tex-equation
 (((partial 2) C) (up 't 'xprime 'vprime))
 "eq:p2c")

$$ \begin{pmatrix} \displaystyle{ 0} \cr \cr \displaystyle{ 0} \cr \cr \displaystyle{ {\partial}_{1}F\left( t, {x}^\prime \right)}\end{pmatrix} $$

The first two components are 0, leaving us with:

$$ \partial_2 L' = (DL \circ C)\partial_2 C = (\partial_2L \circ C)(I_2 \circ \partial_2 C) $$

Compose this quantity with $\Gamma[q']$ and distribute the composition into the product. Remember that $C \circ \Gamma[q'] = \Gamma[q]$:

$$ \begin{aligned} \partial_2L' \circ \Gamma[q'] & = (\partial_2L \circ C)(I_2 \circ \partial_2 C) \circ \Gamma[q'] \cr & = (\partial_2L \circ C \circ \Gamma[q'])(I_2 \circ \partial_2 C \circ \Gamma[q']) \cr & = (\partial_2L \circ \Gamma[q])(I_2 \circ \partial_2 C \circ \Gamma[q']) \end{aligned} $$

Take the derivative (with respect to time, remember, from the types):

$$ D(\partial_2L' \circ \Gamma[q']) = D\left[(\partial_2L \circ \Gamma[q])(I_2 \circ \partial_2 C \circ \Gamma[q'])\right] $$

Substitute the second term using \eqref{eq:p2c}:

$$ D(\partial_2L' \circ \Gamma[q']) = D\left[(\partial_2L \circ \Gamma[q])\partial_1F(t, q'(t))\right] $$

Expand using the product rule:

$$ D(\partial_2L' \circ \Gamma[q']) = \left[ D(\partial_2L \circ \Gamma[q]) \right]\partial_1F(t, q'(t)) + (\partial_2L \circ \Gamma[q])D\left[ \partial_1F(t, q'(t)) \right] $$

A term from the unprimed Lagrange's equation is peeking. Notice this, but don't make the substitution just yet.

Next, expand the $\partial_1 L'$ term:

$$ \partial_1L' = \partial_1(L \circ C) = ((DL) \circ C) \partial_1 C $$

Calculate $\partial_1C$ using our Scheme utilities:

(->tex-equation
 (((partial 1) C) (up 't 'xprime 'vprime)))

$$ \begin{pmatrix} \displaystyle{ 0} \cr \cr \displaystyle{ {\partial}_{1}F\left( t, {x}^\prime \right)} \cr \cr \displaystyle{ {v}^\prime {{\partial}_{1}}^{2}\left( F \right)\left( t, {x}^\prime \right) + \left( {\partial}_{0} {\partial}_{1} \right)\left( F \right)\left( t, {x}^\prime \right)}\end{pmatrix} $$

Expand the chain rule out and remove 0 terms, as before:

$$ \begin{aligned} \partial_1L' & = ((DL) \circ C) \partial_1 C \cr & = (\partial_0L \circ C)(I_0 \circ \partial_1 C) + (\partial_1L \circ C)(I_1 \circ \partial_1 C) + (\partial_2L \circ C)(I_2 \circ \partial_1 C) \cr & = (\partial_1L \circ C)(I_1 \circ \partial_1 C) + (\partial_2L \circ C)(I_2 \circ \partial_1 C) \end{aligned} $$

Compose $\Gamma[q']$ and distribute:

$$ \begin{aligned} \partial_1L' \circ \Gamma[q'] & = (\partial_1L \circ \Gamma[q])(I_1 \circ \partial_1 C \circ \Gamma[q']) + (\partial_2L \circ \Gamma[q])(I_2 \circ \partial_1 C \circ \Gamma[q']) \cr & = (\partial_1L \circ \Gamma[q])(\partial_1F(t, q'(t))) + (\partial_2L \circ \Gamma[q])D(\partial_1F(t, q'(t))) \end{aligned} $$

We now have both components of the primed Lagrange equations from \eqref{eq:lagrange-prime}.

Subtract the two terms, extract common factors and use our assumption \eqref{eq:1-15-lagrange} that the original Lagrange equations hold:

$$ \begin{aligned} D(\partial_2L' \circ \Gamma[q']) - (\partial_1L' \circ \Gamma[q']) & = \left[ D(\partial_2L \circ \Gamma[q]) - (\partial_1L \circ \Gamma[q]) \right] \partial_1F(t, q'(t)) \cr & + \left[ (\partial_2L \circ \Gamma[q]) - (\partial_2L \circ \Gamma[q]) \right] D(\partial_1F(t, q'(t))) \cr & = \left[ D(\partial_2L \circ \Gamma[q]) - (\partial_1L \circ \Gamma[q]) \right] \partial_1F(t, q'(t)) \cr & = 0 \end{aligned} $$

And boom, we're done! The primed Lagranged equations \eqref{eq:lagrange-prime} hold if the unprimed equations \eqref{eq:1-15-lagrange} hold.

I'm not sure what to make of the new constant terms. The new Lagrange equations are scaled by $\partial_1 F(t, q'(t))$, the derivative of $F$ with respect to the path; that seems interesting, and possibly there's some nice physical intuition waiting to be discovered.

Scheme Derivation

Can we use Scheme to pursue the same derivation? If we can write the relationships of the derivation in code, then we'll have a sort of computerized proof that the primed Lagrange equations are valid.

First, consider $\partial_1 L' \circ \Gamma[q']$:

(->tex-equation
 ((compose ((partial 1) Lprime) (Gamma qprime))
  't))

$$ D{q}^\prime\left( t \right) {\partial}_{2}L\left( \begin{pmatrix} \displaystyle{ t} \cr \cr \displaystyle{ F\left( t, {q}^\prime\left( t \right) \right)} \cr \cr \displaystyle{ D{q}^\prime\left( t \right) {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) + {\partial}_{0}F\left( t, {q}^\prime\left( t \right) \right)}\end{pmatrix} \right) {{\partial}_{1}}^{2}\left( F \right)\left( t, {q}^\prime\left( t \right) \right) + {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) {\partial}_{1}L\left( \begin{pmatrix} \displaystyle{ t} \cr \cr \displaystyle{ F\left( t, {q}^\prime\left( t \right) \right)} \cr \cr \displaystyle{ D{q}^\prime\left( t \right) {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) + {\partial}_{0}F\left( t, {q}^\prime\left( t \right) \right)}\end{pmatrix} \right) + {\partial}_{2}L\left( \begin{pmatrix} \displaystyle{ t} \cr \cr \displaystyle{ F\left( t, {q}^\prime\left( t \right) \right)} \cr \cr \displaystyle{ D{q}^\prime\left( t \right) {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) + {\partial}_{0}F\left( t, {q}^\prime\left( t \right) \right)}\end{pmatrix} \right) \left( {\partial}_{0} {\partial}_{1} \right)\left( F \right)\left( t, {q}^\prime\left( t \right) \right) $$

This is completely insane, and already unhelpful. The argument to $L$, we know, is actually $\Gamma[q]$. Make a function that will replace the tuple with that reference:

(define (->eq expr)
  (write-string
   (replace-all (->tex-equation* expr)
                (->tex* ((Gamma q) 't))
                "\\circ \\Gamma[q]")))

Try again:

(->eq
 ((compose ((partial 1) Lprime) (Gamma qprime))
  't))

$$ D{q}^\prime\left( t \right) {\partial}_{2}L\left( \circ \Gamma[q] \right) {{\partial}_{1}}^{2}\left( F \right)\left( t, {q}^\prime\left( t \right) \right) + {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) {\partial}_{1}L\left( \circ \Gamma[q] \right) + {\partial}_{2}L\left( \circ \Gamma[q] \right) \left( {\partial}_{0} {\partial}_{1} \right)\left( F \right)\left( t, {q}^\prime\left( t \right) \right) $$

Ignore the parentheses around $\circ \Gamma[q]$ and this looks better.

The $\partial_1 L \circ \Gamma[q]$ term of the unprimed Lagrange equations is nestled inside the expansion above, multiplied by a factor $\partial_1F(t, q'(t))$:

(let* ((factor (((partial 1) F) 't (qprime 't))))
  (->eq
   ((* factor (compose ((partial 1) L) (Gamma q)))
    't)))

$$ {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) {\partial}_{1}L\left( \circ \Gamma[q] \right) $$

Next, consider the $D(\partial_2 L' \circ \Gamma[q'])$ term:

(->eq
 ((D (compose ((partial 2) Lprime) (Gamma qprime)))
  't))

$$ {\left( D{q}^\prime\left( t \right) \right)}^{2} {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) {{\partial}_{1}}^{2}\left( F \right)\left( t, {q}^\prime\left( t \right) \right) {{\partial}_{2}}^{2}\left( L \right)\left( \circ \Gamma[q] \right) + D{q}^\prime\left( t \right) {\left( {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) \right)}^{2} \left( {\partial}_{1} {\partial}_{2} \right)\left( L \right)\left( \circ \Gamma[q] \right) + 2 D{q}^\prime\left( t \right) {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) \left( {\partial}_{0} {\partial}_{1} \right)\left( F \right)\left( t, {q}^\prime\left( t \right) \right) {{\partial}_{2}}^{2}\left( L \right)\left( \circ \Gamma[q] \right) + {\left( {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) \right)}^{2} {D}^{2}{q}^\prime\left( t \right) {{\partial}_{2}}^{2}\left( L \right)\left( \circ \Gamma[q] \right) + {\partial}_{0}F\left( t, {q}^\prime\left( t \right) \right) {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) \left( {\partial}_{1} {\partial}_{2} \right)\left( L \right)\left( \circ \Gamma[q] \right) + D{q}^\prime\left( t \right) {\partial}_{2}L\left( \circ \Gamma[q] \right) {{\partial}_{1}}^{2}\left( F \right)\left( t, {q}^\prime\left( t \right) \right) + {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) {{\partial}_{2}}^{2}\left( L \right)\left( \circ \Gamma[q] \right) {{\partial}_{0}}^{2}\left( F \right)\left( t, {q}^\prime\left( t \right) \right) + {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) \left( {\partial}_{0} {\partial}_{2} \right)\left( L \right)\left( \circ \Gamma[q] \right) + {\partial}_{2}L\left( \circ \Gamma[q] \right) \left( {\partial}_{0} {\partial}_{1} \right)\left( F \right)\left( t, {q}^\prime\left( t \right) \right) $$

This, again, is total madness. We really want some way to control how Scheme expands terms.

But we know what we're looking for. Expand out the matching term of the unprimed Lagrange equations:

(->eq
 ((D (compose ((partial 2) L) (Gamma q)))
  't))

$$ {\left( D{q}^\prime\left( t \right) \right)}^{2} {{\partial}_{1}}^{2}\left( F \right)\left( t, {q}^\prime\left( t \right) \right) {{\partial}_{2}}^{2}\left( L \right)\left( \circ \Gamma[q] \right) + D{q}^\prime\left( t \right) {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) \left( {\partial}_{1} {\partial}_{2} \right)\left( L \right)\left( \circ \Gamma[q] \right) + 2 D{q}^\prime\left( t \right) \left( {\partial}_{0} {\partial}_{1} \right)\left( F \right)\left( t, {q}^\prime\left( t \right) \right) {{\partial}_{2}}^{2}\left( L \right)\left( \circ \Gamma[q] \right) + {\partial}_{1}F\left( t, {q}^\prime\left( t \right) \right) {D}^{2}{q}^\prime\left( t \right) {{\partial}_{2}}^{2}\left( L \right)\left( \circ \Gamma[q] \right) + {\partial}_{0}F\left( t, {q}^\prime\left( t \right) \right) \left( {\partial}_{1} {\partial}_{2} \right)\left( L \right)\left( \circ \Gamma[q] \right) + {{\partial}_{2}}^{2}\left( L \right)\left( \circ \Gamma[q] \right) {{\partial}_{0}}^{2}\left( F \right)\left( t, {q}^\prime\left( t \right) \right) + \left( {\partial}_{0} {\partial}_{2} \right)\left( L \right)\left( \circ \Gamma[q] \right) $$

Staring at these two equations, it becomes clear that the first contains the second, multiplied by $\partial_1F(t, q'(t))$, the same factor that appeared in the expansion of the $\partial_1 L \circ \Gamma[q]$ term.

Try writing out the primed Lagrange equations, and subtracting the unprimed Lagrange equations, scaled by this factor:

(let* ((primed-lagrange
        (- (D (compose ((partial 2) Lprime) (Gamma qprime)))
           (compose ((partial 1) Lprime) (Gamma qprime))))

       (lagrange
        (- (D (compose ((partial 2) L) (Gamma q)))
           (compose ((partial 1) L) (Gamma q))))

       (factor
        (compose coordinate ((partial 1) C) (Gamma qprime))))
  (->tex-equation
   ((- primed-lagrange (* factor lagrange))
    't)))

$$ 0 $$

Done! It seems that the extra terms on each side exactly cancel. As with the pen and paper derivation, we've shown that the primed Lagranged equations \eqref{eq:lagrange-prime} hold if the unprimed equations \label{eq:1-15-lagrange} hold.

Final Comments

I'm troubled by my lack of intuition around two ideas:

what is the meaning of the $\partial_1F(t, q'(t))$ scaling factor?
Both sides acquire a constant $(\partial_2L \circ \Gamma[q]) \cdot D(\partial_1F(t, q'(t)))$.

The right factor is a total time derivative. Is this meaningful? We know from later discussion that if we add a total time derivative to the Lagrangian we don't affect the shape of the realizable path.

I learned quite a bit about functional notation from this exercise, and I think that test that the result represents is critical for leaning hard on the coordinate transformations that we'll continue to explore. But I do feel like I'm leaving intuitive cake on the table.

sritchie/ex15.md