Canonical Transformation and the word "Symplectic"
There is this very
frustrating thing in Lagrangian and Hamiltonian Mechanics, called a "canonical
transformation", which supposedly simplifies the equations of motion and sets
up all sorts of higher order analysis like Action Angle variables, the
Hamilton-Jacobi equation, and canonical perturbation theory.
The trouble is, it's hard to get a feel for these things. The basic building
block of Hamiltonian mechanics is the equation[s] of motion:
\begin{eqnarray}
\frac{\partial H}{\partial q_i} &=& -\dot{p_i} \\
\frac{\partial H}{\partial p_i} &=& \dot{q_i}
\end{eqnarray}
We can write this more succinctly using a phase space vector, which I will
call "z":
\begin{eqnarray}
\mathbf{z} &=& \left( q_1,q_2, \dots, q_n, p_1,p_2, \dots, p_n \right)
\end{eqnarray}
So z is a $2n$-dimensional vector in our $2n$-dimensional phase space. We can
now write the equations of motion using a a strange matrix, $\Omega$:
\begin{eqnarray}
\Omega &=& \left(\begin{array}{cc}
0 & I_n \\ -I_n & 0
\end{array} \right)
\end{eqnarray}
Where $I_n$ represent $n$-dimensional identity matrices, with ones along the
diagonal, and the zeros represent $n$-by-$n$ zero matrices. This $\Omega$ is
what we call a block diagonal matrix, in that it can be decomposed into the
four "blocks" I have written above. We could also write it as a Direct product
of two matrices, the $n$-by-$n$ identity and one of the pauli spin matrices
(which is also a rotation about the $xy$ plane of 90 degrees):
\begin{eqnarray}
i \sigma_2 &=& \left(\begin{array}{cc}
0 & 1 \\ -1 & 0
\end{array} \right)\\
\Omega &=& I_n \bigotimes i \sigma_2
\end{eqnarray}
Now, we can write Hamilton's equation of motion in the following form:
\begin{eqnarray}
\frac{dz_i}{dt} &=& \Omega_{ij} \frac{\partial H}{\partial z_j}
\end{eqnarray}
Another way to use this tidy notation is in the poisson bracket:
\begin{eqnarray}
\left[A_i,B_j \right] &=& \sum_k \left(\frac{A_i}{q_k}\frac{B_j}{p_k}-
\frac{A_i}{p_k}\frac{B_j}{q_k}\right) \\
&=& \frac{\partial A_i}{\partial z_k} \Omega_{km} \frac{\partial
B_j}{\partial z_m}\\
\end{eqnarray}
and one finds, if we compute the poisson bracket of some quantity $A_i$ with
the Hamiltonian, we get a partial derivative with respect to time via
chain-rule:
\begin{eqnarray}
\left[A_i,H \right] &=& \frac{\partial A_i}{\partial z_k} \Omega_{km}
\frac{\partial H}{\partial z_m}\\
\left[A_i,H \right] &=& \frac{\partial A_i}{\partial z_k}
\frac{\partial z_k}{\partial t}
\end{eqnarray}
and so we find, if our vector-valued function of interest $A_i$ is dependent
upon $q,p,t$ or $z,t$, we can write our total time derivative as:
\begin{eqnarray}
\frac{dA_i}{dt} &=& \left[A_i,H\right] + \frac{\partial A_i}{\partial
t}
\end{eqnarray}
The Hamiltonian is thus called the "generator" of time translation, because,
let's say $A_i$ does not depend on t. In the quantum mechanics regime of
things we would say it is a Schrodinger operator. We could essentially
translate the operator -- or in this case the function $A_i$ -- forward in
time by taylor expansion:
\begin{eqnarray}
A_i(t) &=& A_i(0) + \frac{dA_i}{dt}\vert_{t_0}(t-t_0)+
\frac{d^2A_i}{dt^2}\vert_{t_0}\frac{(t-t_0)^2}{2!}+\dots
\end{eqnarray}
But this can be accomplished by repeatedly taking the commutator with H!
\begin{eqnarray}
A_i(t) &=& A_i(0) + \frac{dA_i}{dt}\vert_{t_0}(t-t_0)+
\frac{d^2A_i}{dt^2}\vert_{t_0}\frac{(t-t_0)^2}{2!}+\dots \\
A_i(t) &=& A_i(0) + (t-t_0) \left[A_i(0),H \right]+
\frac{(t-t_0)^2}{2!} \left[\left[A_i(0),H \right], H\right]+
\frac{(t-t_0)^3}{3!} \left[\left[\left[A_i(0),H \right], H\right],H \right]
+\dots \\
&=& e^{\left[ \ast , H\right](t-t_0)}A_i(0)
\end{eqnarray}
This is in incredibly close parallel to the Baker-Hausdorff lemma in Quantum
mechanics, which essentially makes time-dependent operators -- in the
Heisenberg picture -- by repeatedly taking commutators on "both sides" of a
bra-ket operator. If we promote the Hamiltonian to be an operator, then we
write:
\begin{eqnarray}
A_i(t) &=&
e^{\frac{iH(t-t_0)}{\hbar}}A_i(0)e^{\frac{-iH(t-t_0)}{\hbar}}
\end{eqnarray}
where the commutators are no longer in the classical sense, but in the
``operator'' sense.
---------------------------------------------------------------------------------------------------------------------------------
So, why do we care about all these commutators and things? Well, a simple
reason is that if we are to have a "valid" canonical transformation, we must
show that the Hamiltonian Equations of motion remain untarnished. Let's look
at our nice form of the EOM again:
\begin{eqnarray}
\frac{dz_i}{dt} &=& \Omega_{ij} \frac{\partial H}{\partial z_j}
\end{eqnarray}
we can re-write this with a Poisson bracket
\begin{eqnarray}
\frac{dz_i}{dt} &=& \left[z_i,H \right] \\
&=& \frac{\partial z_i}{\partial z_k}\Omega_{km} \frac{\partial
H}{\partial z_m}\\
dz_i &=& \frac{\partial z_i}{\partial z_k}\Omega_{km} \frac{\partial
H}{\partial z_m}dt
\end{eqnarray}
Now let us transform into some new coordinate system
$\mathbf{y}=\left(Q_1,Q_2,\dots, Q_n, P_1,P_2,\dots P_n \right)$. We find that
all of the $dz$'s can be written as:
\begin{eqnarray}
dz_i &=& \frac{\partial z_i}{\partial y_j} dy_j \\
dz_i &=& J_{ij}^{-1} dy_j
\end{eqnarray}
The matrix we have used above is simply the standard jacobian,
$\mathbf{J}_{ij}=\frac{\partial y_i}{\partial z_j}$. Remember $J^TJ=I$. Now we
re-write our EOM in the y-coordinates:
\begin{eqnarray}
dz_i &=& \frac{\partial z_i}{\partial z_k}\Omega_{km} \frac{\partial
H}{\partial z_m}dt \\
dz_i &=& \delta_{ik}\Omega_{km} \frac{\partial H}{\partial z_m}dt \\
J_{ij}^{-1} dy_j &=& \delta_{ik} \Omega_{km} J_{mi}\frac{\partial
H}{\partial y_i}dt \\
\end{eqnarray}
Multiplying both sides by $J_{ij}$ we get:
\begin{eqnarray}
dy_j &=& J_{kj} \Omega_{km} J_{mi}\frac{\partial H}{\partial y_i}dt \\
\frac{dy_j}{dt} &=& J_{kj} \Omega_{km} J_{mi}\frac{\partial
H}{\partial y_i} \\
\end{eqnarray}
Now, we say this final equation is valid if it reproduces the standard
equations of motion:
\begin{eqnarray}
\frac{dy_j}{dt} &=& \left[ y_j, H \right]
\end{eqnarray}
Which will only be true if this jacobian transformation preserves the
structure of our original $Omega$ matrix:
\begin{eqnarray}
\Omega_{ij} &=& J_{ki}\Omega_{km}J_{mi}\\
\mathbf{\Omega} &=& \mathbf{J}^T\mathbf{\Omega}\mathbf{J}
\end{eqnarray}
such a transformation $q,p \to Q,P$ is called "simplectic" or ``canonical'',
which in my mental dictionary, means that it preserves the structure of this
matrix $\Omega$ and thus the Poisson brackets/fundamental commutation
relations:
\begin{eqnarray}
\left[ z_i, z_j \right] &=& \Omega_{ij} \\
\left[ y_i, y_j \right] &=& \Omega_{ij}
\end{eqnarray}
Just like the Lorentz boosts leave the minkowksi metric $\eta$ invariant. This
set of linear transformations $\mathbf{J}$ can be thought of as a
representation of the simplectic ``group'', which are continuously connected
to the identity operation.
---------------------------------------------------------------------------------------------------------------------------------
Now one way to define these canonical transformations is to add a total time
derivative to the lagrangian:
\begin{eqnarray}
L(q,Q,t) &=& L(q,\dot{q},t) - \frac{dF(q,Q,t)}{dt}
\end{eqnarray}
Such a "generator" of the canonical transformation is called type 1, because
it exchanges Q for $\dot{q}$. We allow ourselves to add this total time
derivative to the Lagrangian, because Hamilton's principle states that we are
only interested in minimizing the action through variation:
\begin{eqnarray}
S &=& \int L dt \\
S^\prime &=& \int L - \frac{dF}{dt}dt=S+constant \\
\delta S &=& \int \left( \frac{\partial L}{\partial
q}-\frac{d}{dt}\frac{\partial L}{\partial \dot{q}}\right)\delta q dt \\
\delta S = \delta S^\prime
\end{eqnarray}
so we don't care about adding total time derivatives. (Notice that I have not
allowed $F$ to be a function of the generalized coordinate velocity, $\dot{q}$
this is because when varying the action, any dependence upon $\dot{q}$ will
result in non-zero terms outside the functional integral, so we need to be
careful here! In field theory, we find that adding a total derivative
$\partial_\mu X^\mu$ to the lagrangian results in the same action as well, so
perhaps this can also be thought of as a type I canonical transformation...)
Pounding through the same equations of motion, we find that, if we want our
new Lagrangian to only depend upon q,Q and t, we require:
\begin{eqnarray}
L^\prime(q,Q,t) &=& L - \frac{\partial F}{\partial t}- \frac{\partial
F}{\partial q}\dot{q}- \frac{\partial F}{\partial Q}\dot{Q}\\
\frac{\partial L^\prime}{\partial \dot{q}} =0 &\implies
&\frac{\partial L}{\partial \dot{q}}=p=\frac{\partial F}{\partial q}\\
\end{eqnarray}
and we make the definition of a new momentum variable
\begin{eqnarray}
P=\frac{\partial L^\prime}{\partial \dot{Q}}=-\frac{\partial F}{\partial Q}
\end{eqnarray}
With these two definitions in hand, we have essentially defined our new phase
space vector $\mathbf{y}_i$. So, we can check out the NEW fundamental
commutation relations
\begin{eqnarray}
\left[ Q,P \right] &=& \frac{\partial Q}{\partial q}\frac{\partial
P}{\partial p}-\frac{\partial Q}{\partial p}\frac{\partial P}{\partial q} \\
\left[ Q, P \right] &=& -\frac{\partial Q}{\partial q}\frac{\partial^2
F}{\partial p \partial Q}+\frac{\partial Q}{\partial p}\frac{\partial^2
F}{\partial Q\partial q} \\
&=& \frac{\partial Q}{\partial p}\frac{\partial p}{\partial Q} \\
&=& 1
\end{eqnarray}
Trivially, we expect $\left[Q,Q \right]=\left[P,P \right]=0$, and so it all
works out. Further generators of the canonical transformation can be created
using the legendre transform.