rspeare.github.io

Canonical Transformation and the word "Symplectic"

22 Aug 2014

There is this very frustrating thing in Lagrangian and Hamiltonian mechanics called a "canonical transformation", which supposedly simplifies the equations of motion and sets up all sorts of higher-order analysis like action-angle variables, the Hamilton-Jacobi equation, and canonical perturbation theory. The trouble is, it's hard to get a feel for these things. The basic building block of Hamiltonian mechanics is the pair of equations of motion:

\begin{eqnarray} \frac{\partial H}{\partial q_i} &=& -\dot{p_i} \\ \frac{\partial H}{\partial p_i} &=& \dot{q_i} \end{eqnarray}

We can write this more succinctly using a phase space vector, which I will call $\mathbf{z}$:

\begin{eqnarray} \mathbf{z} &=& \left( q_1,q_2, \dots, q_n, p_1,p_2, \dots, p_n \right) \end{eqnarray}

So $\mathbf{z}$ is a $2n$-dimensional vector in our $2n$-dimensional phase space. We can now write the equations of motion using a strange matrix, $\Omega$:

\begin{eqnarray} \Omega &=& \left(\begin{array}{cc} 0 & I_n \\ -I_n & 0 \end{array} \right) \end{eqnarray}

where $I_n$ represents the $n$-dimensional identity matrix, with ones along the diagonal, and the zeros represent $n$-by-$n$ zero matrices. This $\Omega$ is what we call a block matrix, in that it can be decomposed into the four "blocks" I have written above. Since the non-zero blocks sit off the diagonal, it is block anti-diagonal rather than block diagonal.
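As a quick sanity check of the matrix just defined, here is a minimal pure-Python sketch. The helper names `omega`, `matmul`, `transpose` and `matvec` are hypothetical, chosen for this illustration and not taken from any library. It builds $\Omega$ for arbitrary $n$ and checks two properties that follow from the block form, $\Omega^T=-\Omega$ and $\Omega^2=-I_{2n}$. It also shows that $\Omega$ applied to the gradient $(\partial H/\partial q, \partial H/\partial p)$ returns $(\partial H/\partial p, -\partial H/\partial q)$, which is $(\dot{q},\dot{p})$ by the equations of motion above.

```python
def omega(n):
    """Return the 2n-by-2n block matrix [[0, I_n], [-I_n, 0]] as nested lists."""
    size = 2 * n
    mat = [[0] * size for _ in range(size)]
    for i in range(n):
        mat[i][n + i] = 1    # upper-right block: +I_n
        mat[n + i][i] = -1   # lower-left block: -I_n
    return mat


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def matvec(a, v):
    """Matrix times vector."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]


O = omega(2)
minus_O = [[-x for x in row] for row in O]
minus_identity = [[-1 if i == j else 0 for j in range(4)] for i in range(4)]

print("antisymmetric:", transpose(O) == minus_O)
print("squares to -I:", matmul(O, O) == minus_identity)

# Assumed example: H = (q^2 + p^2) / 2 with n = 1, so grad H = (q, p).
q, p = 0.5, 2.0
print("(qdot, pdot) =", matvec(omega(1), [q, p]))
```

Nested lists keep the sketch dependency-free; with NumPy available the same checks would be one-liners.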
We could also write $\Omega$ as a direct (Kronecker) product of two matrices: one of the Pauli spin matrices multiplied by $i$ (which is also a rotation in the $xy$ plane by 90 degrees) and the $n$-by-$n$ identity:

\begin{eqnarray} i \sigma_2 &=& \left(\begin{array}{cc} 0 & 1 \\ -1 & 0 \end{array} \right)\\ \Omega &=& i \sigma_2 \otimes I_n \end{eqnarray}

(With the ordering $\mathbf{z}=(q_1,\dots,q_n,p_1,\dots,p_n)$ the $2\times 2$ factor comes first; the opposite order, $I_n \otimes i\sigma_2$, corresponds to the interleaved ordering $(q_1,p_1,\dots,q_n,p_n)$.) Now, we can write Hamilton's equations of motion in the following form, with summation over repeated indices implied:

\begin{eqnarray} \frac{dz_i}{dt} &=& \Omega_{ij} \frac{\partial H}{\partial z_j} \end{eqnarray}

Another way to use this tidy notation is in the Poisson bracket:

\begin{eqnarray} \left[A_i,B_j \right] &=& \sum_k \left(\frac{\partial A_i}{\partial q_k}\frac{\partial B_j}{\partial p_k}- \frac{\partial A_i}{\partial p_k}\frac{\partial B_j}{\partial q_k}\right) \\ &=& \frac{\partial A_i}{\partial z_k} \Omega_{km} \frac{\partial B_j}{\partial z_m} \end{eqnarray}

and one finds that, if we compute the Poisson bracket of some quantity $A_i$ with the Hamiltonian, we get its rate of change along the trajectory via the chain rule:

\begin{eqnarray} \left[A_i,H \right] &=& \frac{\partial A_i}{\partial z_k} \Omega_{km} \frac{\partial H}{\partial z_m}\\ \left[A_i,H \right] &=& \frac{\partial A_i}{\partial z_k} \frac{d z_k}{d t} \end{eqnarray}

and so we find, if our vector-valued function of interest $A_i$ is dependent upon $q,p,t$ or $z,t$, we can write our total time derivative as:

\begin{eqnarray} \frac{dA_i}{dt} &=& \left[A_i,H\right] + \frac{\partial A_i}{\partial t} \end{eqnarray}

The Hamiltonian is thus called the "generator" of time translation. To see why, let's say $A_i$ does not depend explicitly on $t$. In quantum mechanics we would say it is a Schrödinger-picture operator. We could translate the operator (or in this case the function $A_i$) forward in time by Taylor expansion:

\begin{eqnarray} A_i(t) &=& A_i(t_0) + \frac{dA_i}{dt}\vert_{t_0}(t-t_0)+ \frac{d^2A_i}{dt^2}\vert_{t_0}\frac{(t-t_0)^2}{2!}+\dots \end{eqnarray}

But this can be accomplished by repeatedly taking the Poisson bracket (the classical "commutator") with $H$!
\begin{eqnarray} A_i(t) &=& A_i(t_0) + \frac{dA_i}{dt}\vert_{t_0}(t-t_0)+ \frac{d^2A_i}{dt^2}\vert_{t_0}\frac{(t-t_0)^2}{2!}+\dots \\ A_i(t) &=& A_i(t_0) + (t-t_0) \left[A_i(t_0),H \right]+ \frac{(t-t_0)^2}{2!} \left[\left[A_i(t_0),H \right], H\right]+ \frac{(t-t_0)^3}{3!} \left[\left[\left[A_i(t_0),H \right], H\right],H \right] +\dots \\ &=& e^{(t-t_0)\left[ \ast , H\right]}A_i(t_0) \end{eqnarray}

Here every bracket is evaluated at $t_0$, and $H$ is assumed to have no explicit time dependence. This is in incredibly close parallel to the Baker-Hausdorff lemma in quantum mechanics, which builds time-dependent operators in the Heisenberg picture by repeatedly taking commutators, with the exponentials acting on "both sides" of the operator. If we promote the Hamiltonian to be an operator, then we write:

\begin{eqnarray} A_i(t) &=& e^{\frac{iH(t-t_0)}{\hbar}}A_i(t_0)e^{\frac{-iH(t-t_0)}{\hbar}} \end{eqnarray}

where the commutators are no longer in the classical sense, but in the "operator" sense.

---------------------------------------------------------------------------------------------------------------------------------

So, why do we care about all these commutators and things? Well, a simple reason is that if we are to have a "valid" canonical transformation, we must show that Hamilton's equations of motion remain untarnished. Let's look at our nice form of the EOM again:

\begin{eqnarray} \frac{dz_i}{dt} &=& \Omega_{ij} \frac{\partial H}{\partial z_j} \end{eqnarray}

We can rewrite this with a Poisson bracket:

\begin{eqnarray} \frac{dz_i}{dt} &=& \left[z_i,H \right] \\ &=& \frac{\partial z_i}{\partial z_k}\Omega_{km} \frac{\partial H}{\partial z_m}\\ dz_i &=& \frac{\partial z_i}{\partial z_k}\Omega_{km} \frac{\partial H}{\partial z_m}dt \end{eqnarray}

Now let us transform into some new coordinate system $\mathbf{y}=\left(Q_1,Q_2,\dots, Q_n, P_1,P_2,\dots, P_n \right)$.
We find that all of the $dz$'s can be written as:

\begin{eqnarray} dz_i &=& \frac{\partial z_i}{\partial y_j} dy_j \\ dz_i &=& (J^{-1})_{ij} dy_j \end{eqnarray}

The matrix we have used above is the inverse of the standard Jacobian, $\mathbf{J}_{ij}=\frac{\partial y_i}{\partial z_j}$. Remember $J^{-1}J=I$; the Jacobian is not orthogonal in general, so $J^{-1}$ is not simply $J^T$. I will also assume the transformation has no explicit time dependence, so that $H$ is the same function re-expressed in the new variables and $\frac{\partial H}{\partial z_m} = J_{lm}\frac{\partial H}{\partial y_l}$. Now we rewrite our EOM in the y-coordinates:

\begin{eqnarray} dz_i &=& \frac{\partial z_i}{\partial z_k}\Omega_{km} \frac{\partial H}{\partial z_m}dt \\ dz_i &=& \delta_{ik}\Omega_{km} \frac{\partial H}{\partial z_m}dt \\ (J^{-1})_{ij} dy_j &=& \delta_{ik} \Omega_{km} J_{lm}\frac{\partial H}{\partial y_l}dt \end{eqnarray}

Multiplying both sides by $J_{ni}$ and summing over $i$ we get:

\begin{eqnarray} dy_n &=& J_{nk} \Omega_{km} J_{lm}\frac{\partial H}{\partial y_l}dt \\ \frac{dy_n}{dt} &=& J_{nk} \Omega_{km} J_{lm}\frac{\partial H}{\partial y_l} \end{eqnarray}

Now, we say this final equation is valid if it reproduces the standard equations of motion in the new variables:

\begin{eqnarray} \frac{dy_n}{dt} &=& \left[ y_n, H \right] = \Omega_{nl}\frac{\partial H}{\partial y_l} \end{eqnarray}

which will only be true if this Jacobian transformation preserves the structure of our original $\Omega$ matrix:

\begin{eqnarray} \Omega_{nl} &=& J_{nk}\Omega_{km}J_{lm}\\ \mathbf{\Omega} &=& \mathbf{J}\mathbf{\Omega}\mathbf{J}^T \end{eqnarray}

Inverting both sides and using $\Omega^{-1}=-\Omega$ shows that this is equivalent to the more commonly quoted form $\mathbf{\Omega} = \mathbf{J}^T\mathbf{\Omega}\mathbf{J}$. Such a transformation $q,p \to Q,P$ is called "symplectic" or "canonical", which in my mental dictionary means that it preserves the structure of this matrix $\Omega$ and thus the fundamental Poisson brackets:

\begin{eqnarray} \left[ z_i, z_j \right] &=& \Omega_{ij} \\ \left[ y_i, y_j \right] &=& \Omega_{ij} \end{eqnarray}

just as the Lorentz boosts leave the Minkowski metric $\eta$ invariant. This set of linear transformations $\mathbf{J}$ can be thought of as a representation of the symplectic "group", whose elements are continuously connected to the identity operation.
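To make the condition $\mathbf{\Omega} = \mathbf{J}^T\mathbf{\Omega}\mathbf{J}$ concrete, here is a minimal pure-Python sketch that estimates the Jacobian of a transformation by central differences and tests the condition numerically. Everything in it is an assumption made for illustration: the helper names (`jacobian`, `is_symplectic`), the step size and tolerance, and the two sample maps for $n=1$. The first map is a shear $Q=q$, $P=p+q^2$, which should pass. The second is a uniform scaling $Q=2q$, $P=2p$, which should fail because $J^T\Omega J = 4\Omega$.

```python
def omega(n):
    """Block matrix [[0, I_n], [-I_n, 0]] as nested lists."""
    size = 2 * n
    mat = [[0.0] * size for _ in range(size)]
    for i in range(n):
        mat[i][n + i] = 1.0
        mat[n + i][i] = -1.0
    return mat


def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def jacobian(transform, z, h=1e-6):
    """Central-difference estimate of J_ij = d y_i / d z_j at the point z."""
    columns = []
    for j in range(len(z)):
        up, down = list(z), list(z)
        up[j] += h
        down[j] -= h
        y_up, y_down = transform(up), transform(down)
        columns.append([(a - b) / (2 * h) for a, b in zip(y_up, y_down)])
    return transpose(columns)  # columns[j][i] = dy_i/dz_j, so transpose to J[i][j]


def is_symplectic(J, tol=1e-6):
    """Check J^T Omega J == Omega entry by entry, within a tolerance."""
    n = len(J) // 2
    O = omega(n)
    lhs = matmul(matmul(transpose(J), O), J)
    return all(abs(lhs[i][j] - O[i][j]) < tol
               for i in range(2 * n) for j in range(2 * n))


def shear(z):
    """Assumed canonical example: Q = q, P = p + q^2."""
    q, p = z
    return [q, p + q ** 2]


def scale(z):
    """Assumed non-canonical example: Q = 2q, P = 2p."""
    q, p = z
    return [2 * q, 2 * p]


point = [0.3, -1.2]
print("shear canonical?", is_symplectic(jacobian(shear, point)))
print("scale canonical?", is_symplectic(jacobian(scale, point)))
```

For $n=1$ the condition reduces to $\det \mathbf{J}=1$, so the check only tests area preservation here; for $n>1$ it is a stronger requirement than unit determinant.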
---------------------------------------------------------------------------------------------------------------------------------

Now one way to define these canonical transformations is to add a total time derivative to the Lagrangian:

\begin{eqnarray} L^\prime(q,Q,t) &=& L(q,\dot{q},t) - \frac{dF(q,Q,t)}{dt} \end{eqnarray}

Such a "generator" of the canonical transformation is called type 1: it is a function of the old and new coordinates, and it trades $\dot{q}$ for $Q$. We allow ourselves to add this total time derivative to the Lagrangian because Hamilton's principle states that we are only interested in making the action stationary under variations with fixed endpoints:

\begin{eqnarray} S &=& \int L \, dt \\ S^\prime &=& \int \left( L - \frac{dF}{dt} \right) dt=S+\text{constant} \\ \delta S &=& \int \left( \frac{\partial L}{\partial q}-\frac{d}{dt}\frac{\partial L}{\partial \dot{q}}\right)\delta q \, dt \\ \delta S &=& \delta S^\prime \end{eqnarray}

so we don't care about adding total time derivatives. (Notice that I have not allowed $F$ to be a function of the generalized velocity $\dot{q}$. This is because, when varying the action, any dependence upon $\dot{q}$ will result in non-zero boundary terms outside the integral, so we need to be careful here! In field theory, we find that adding a total derivative $\partial_\mu X^\mu$ to the Lagrangian leaves the variation of the action unchanged as well, so perhaps this can also be thought of as a type 1 canonical transformation...)
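Pounding through the same equations of motion, we find that, if we want our new Lagrangian to depend only upon $q$, $Q$ and $t$, we require:

\begin{eqnarray} L^\prime(q,Q,t) &=& L - \frac{\partial F}{\partial t}- \frac{\partial F}{\partial q}\dot{q}- \frac{\partial F}{\partial Q}\dot{Q}\\ \frac{\partial L^\prime}{\partial \dot{q}} =0 &\implies &\frac{\partial L}{\partial \dot{q}}=p=\frac{\partial F}{\partial q} \end{eqnarray}

and we make the definition of a new momentum variable

\begin{eqnarray} P=\frac{\partial L^\prime}{\partial \dot{Q}}=-\frac{\partial F}{\partial Q} \end{eqnarray}

With these two definitions in hand, we have essentially defined our new phase space vector $\mathbf{y}$. So, we can check out the NEW fundamental Poisson bracket relations. Treating $Q=Q(q,p)$ and $P=-\frac{\partial F}{\partial Q}$ evaluated at $(q,Q(q,p))$, the chain rule gives $\frac{\partial P}{\partial p}=-\frac{\partial^2 F}{\partial Q^2}\frac{\partial Q}{\partial p}$ and $\frac{\partial P}{\partial q}=-\frac{\partial^2 F}{\partial Q\partial q}-\frac{\partial^2 F}{\partial Q^2}\frac{\partial Q}{\partial q}$, so:

\begin{eqnarray} \left[ Q,P \right] &=& \frac{\partial Q}{\partial q}\frac{\partial P}{\partial p}-\frac{\partial Q}{\partial p}\frac{\partial P}{\partial q} \\ \left[ Q, P \right] &=& -\frac{\partial Q}{\partial q}\frac{\partial^2 F}{\partial Q^2}\frac{\partial Q}{\partial p}+\frac{\partial Q}{\partial p}\left(\frac{\partial^2 F}{\partial Q\partial q}+\frac{\partial^2 F}{\partial Q^2}\frac{\partial Q}{\partial q}\right) \\ &=& \frac{\partial Q}{\partial p}\frac{\partial^2 F}{\partial Q\partial q} \\ &=& \frac{\partial Q}{\partial p}\frac{\partial p}{\partial Q} \\ &=& 1 \end{eqnarray}

The second-to-last step uses $p=\frac{\partial F}{\partial q}$, so that $\frac{\partial p}{\partial Q}=\frac{\partial^2 F}{\partial Q\partial q}$ at fixed $q$; both $\frac{\partial Q}{\partial p}$ and $\frac{\partial p}{\partial Q}$ are taken at fixed $q$, so their product is 1. By antisymmetry of the bracket, $\left[Q,Q \right]=\left[P,P \right]=0$, and so it all works out. Further generators of the canonical transformation can be created using the Legendre transform.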
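As a concrete check of the bracket computation above, here is a small pure-Python sketch for one specific type 1 generator. The choice $F(q,Q)=\tfrac{1}{2}q^2\cot Q$ is an assumption made for illustration and is not from this post: it is the standard textbook generator for the harmonic oscillator with $m=\omega=1$. Solving $p=\partial F/\partial q$ and $P=-\partial F/\partial Q$ gives $Q=\arctan(q/p)$ and $P=\tfrac{1}{2}(q^2+p^2)$. The helper names, the step size and the sample point are likewise hypothetical. Derivatives are estimated by central differences, so the printed values are only approximate.

```python
import math


def generating_function(q, Q):
    """Assumed type-1 generator F(q, Q) = (1/2) q^2 cot(Q), with m = omega = 1."""
    return 0.5 * q * q / math.tan(Q)


def new_coords(q, p):
    """New variables obtained by solving p = dF/dq and P = -dF/dQ."""
    Q = math.atan2(q, p)           # from p = q cot(Q)
    P = 0.5 * (q * q + p * p)      # from P = q^2 / (2 sin^2 Q)
    return Q, P


def Q_of(q, p):
    return new_coords(q, p)[0]


def P_of(q, p):
    return new_coords(q, p)[1]


def partial(f, args, index, h=1e-5):
    """Central-difference partial derivative of f with respect to args[index]."""
    up, down = list(args), list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)


def poisson_bracket(f, g, q, p):
    """[f, g] = df/dq * dg/dp - df/dp * dg/dq for one degree of freedom."""
    return (partial(f, (q, p), 0) * partial(g, (q, p), 1)
            - partial(f, (q, p), 1) * partial(g, (q, p), 0))


q0, p0 = 0.7, 1.1                  # arbitrary sample point with q0 != 0
Q0, P0 = new_coords(q0, p0)

# The generator relations p = dF/dq and P = -dF/dQ, evaluated at (q0, Q0).
p_from_F = partial(generating_function, (q0, Q0), 0)
P_from_F = -partial(generating_function, (q0, Q0), 1)

print("p vs dF/dq :", p0, p_from_F)
print("P vs -dF/dQ:", P0, P_from_F)
print("[Q, P] =", poisson_bracket(Q_of, P_of, q0, p0))   # should be close to 1
print("[Q, Q] =", poisson_bracket(Q_of, Q_of, q0, p0))   # should be close to 0
```

In these variables $P$ equals the oscillator energy and $Q$ is the phase angle, which is one reason this generator is a popular first example.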