$$ \newcommand{\problemdivider}{\begin{center}\large \bf\ldots\ldots\ldots\ldots\ldots\ldots\end{center}} \newcommand{\subproblemdivider}{\begin{center}\large \bf\ldots\ldots\end{center}} \newcommand{\pdiv}{\problemdivider} \newcommand{\spdiv}{\subproblemdivider} \newcommand{\ba}{\begin{align*}} \newcommand{\ea}{\end{align*}} \newcommand{\rt}{\right} \newcommand{\lt}{\left} \newcommand{\bp}{\begin{problem}} \newcommand{\ep}{\end{problem}} \newcommand{\bsp}{\begin{subproblem}} \newcommand{\esp}{\end{subproblem}} \newcommand{\bssp}{\begin{subsubproblem}} \newcommand{\essp}{\end{subsubproblem}} \newcommand{\atag}[1]{\addtocounter{equation}{1}\label{#1}\tag{\arabic{section}.\alph{subsection}.\alph{equation}}} \newcommand{\btag}[1]{\addtocounter{equation}{1}\label{#1}\tag{\arabic{section}.\alph{equation}}} \newcommand{\ctag}[1]{\addtocounter{equation}{1}\label{#1}\tag{\arabic{equation}}} \newcommand{\dtag}[1]{\addtocounter{equation}{1}\label{#1}\tag{\Alph{chapter}.\arabic{section}.\arabic{equation}}} \newcommand{\unts}[1]{\ \text{#1}} \newcommand{\textop}[1]{\operatorname{#1}} \newcommand{\textopl}[1]{\operatornamewithlimits{#1}} \newcommand{\prt}{\partial} \newcommand{\pderi}[3]{\frac{\prt^{#3}#1}{\prt #2^{#3}}} \newcommand{\deri}[3]{\frac{d^{#3}#1}{d #2^{#3}}} \newcommand{\del}{\vec\nabla} \newcommand{\exval}[1]{\langle #1\rangle} \newcommand{\bra}[1]{\langle #1|} \newcommand{\ket}[1]{|#1\rangle} \newcommand{\ham}{\mathcal{H}} \newcommand{\arr}{\mathfrak{r}} \newcommand{\conv}{\mathop{\scalebox{2}{\raisebox{-0.2ex}{$\ast$}}}} \newcommand{\bsm}{\lt(\begin{smallmatrix}} \newcommand{\esm}{\end{smallmatrix}\rt)} \newcommand{\bpm}{\begin{pmatrix}} \newcommand{\epm}{\end{pmatrix}} \newcommand{\bdet}{\lt|\begin{smallmatrix}} \newcommand{\edet}{\end{smallmatrix}\rt|} \newcommand{\bs}[1]{\boldsymbol{#1}} \newcommand{\uvec}[1]{\bs{\hat{#1}}} \newcommand{\qed}{\hfill$\Box$} $$
Tags:
  • math
  • Change of basis

    Coordinates

    From standard basis to new basis

    Given basis $B$ with basis vectors $\vec b_1,\vec b_2,\dots,\vec b_n$ defined in terms of the standard basis $\vec e_1,\dots,\vec e_n$, i.e. \(\begin{align*} \vec b_1 &= b_{11}\vec e_1 + b_{12}\vec e_2 + \dots + b_{1n}\vec e_n\\ &\vdots\\ \vec b_n &= b_{n1}\vec e_1 + \dots + b_{nn}\vec e_n, \end{align*}\) a vector $\vec v' = (v'_1,\dots,v'_n)^\intercal$ in $B$ coordinates has the following standard coordinates $\vec v$:

    \[\begin{align*} \vec v &= v'_1\vec b_1 + \dots + v'_n\vec b_n\\ &=\phantom{{}+{}}v'_1(b_{11}\vec e_1 + b_{12}\vec e_2 + \dots + b_{1n}\vec e_n)\\ &\phantom{{}={}} + v'_2(b_{21}\vec e_1 + b_{22}\vec e_2 + \dots + b_{2n}\vec e_n)\\ &\phantom{{}={}}\vdots\\ &\phantom{{}={}} + v'_n(b_{n1}\vec e_1 + b_{n2}\vec e_2 + \dots + b_{nn}\vec e_n)\\ &= \begin{pmatrix} b_{11} & b_{21} & \dots & b_{n1}\\ b_{12} & b_{22} & \dots & b_{n2}\\ \vdots & \vdots & \ddots & \vdots\\ b_{1n} & b_{2n} & \dots & b_{nn} \end{pmatrix} \begin{pmatrix} v'_1\\ v'_2\\ \vdots\\ v'_n \end{pmatrix}\\ &\equiv\mathbf{B}\vec v' \end{align*}\]

    That is, $\mathbf{B}$ is just the matrix whose columns are the basis vectors of $B$.
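
    As a quick numerical check, here is a minimal NumPy sketch (the basis below is an arbitrary example):

    ```python
    import numpy as np

    # Columns of B are the basis vectors of the new basis,
    # expressed in standard coordinates.
    B = np.array([[1.0, 1.0],
                  [0.0, 2.0]])       # b1 = (1, 0),  b2 = (1, 2)

    v_prime = np.array([3.0, -1.0])  # coordinates in basis B

    v = B @ v_prime                  # standard coordinates: v = B v'
    print(v)                         # [ 2. -2.]  i.e. 3*b1 - 1*b2
    ```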

    From new basis to standard basis

    To go the other way, i.e. to take a vector in the standard basis $\vec v = v_1\vec e_1 + \dots + v_n\vec e_n$ and express it in basis $B$, we simply invert the basis matrix: \(\vec v' = \mathbf{B}^{-1}\vec v.\)
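
    In practice it is better to solve the linear system than to form $\mathbf{B}^{-1}$ explicitly; a sketch continuing the example above:

    ```python
    import numpy as np

    B = np.array([[1.0, 1.0],
                  [0.0, 2.0]])
    v = np.array([2.0, -2.0])        # standard coordinates

    # v' = B^{-1} v, via a linear solve rather than an explicit inverse
    v_prime = np.linalg.solve(B, v)
    print(v_prime)                   # [ 3. -1.]
    ```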

    Arbitrary bases

    Suppose we have bases $\mathbf A = \begin{pmatrix}\vec a_1 & \dots & \vec a_n\end{pmatrix}$ and $\mathbf B = \begin{pmatrix}\vec b_1 & \dots & \vec b_n\end{pmatrix}$. Given a vector with $\mathbf A$ coordinates $\vec v_a = (\alpha_1,\dots,\alpha_n)^\intercal$, i.e. $\vec v = \alpha_1\vec a_1 + \dots + \alpha_n\vec a_n$, we want its $\mathbf B$ coordinates $\vec v_b = (\beta_1,\dots,\beta_n)^\intercal$ such that $\vec v = \beta_1\vec b_1 + \dots + \beta_n\vec b_n$. We simply equate the two expansions and solve for $\vec v_b$: \(\begin{align*} \alpha_1\vec a_1 + \dots + \alpha_n\vec a_n &= \beta_1\vec b_1 + \dots + \beta_n\vec b_n\\ \mathbf A\begin{pmatrix} \alpha_1\\\vdots\\\alpha_n \end{pmatrix} &=\mathbf{B}\begin{pmatrix} \beta_1\\\vdots\\\beta_n \end{pmatrix}\\ \vec v_b &= \mathbf{B}^{-1}\mathbf{A}\vec v_a \end{align*}\)
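
    A minimal sketch of the composite change of basis (the bases here are arbitrary examples):

    ```python
    import numpy as np

    # Columns are the basis vectors of A and B in standard coordinates
    A = np.array([[1.0, 0.0],
                  [1.0, 1.0]])
    B = np.array([[2.0, 0.0],
                  [0.0, 1.0]])

    v_a = np.array([1.0, 2.0])         # coordinates in basis A

    # v_b = B^{-1} A v_a, again via a solve
    v_b = np.linalg.solve(B, A @ v_a)

    # Both coordinate vectors describe the same point in standard coordinates
    assert np.allclose(A @ v_a, B @ v_b)
    ```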

    Linear transformations

    Suppose we have a change of basis $\vec r' = \mathbf{B}^{-1}\vec r$ and a linear transformation $\vec s = \mathbf{T}\vec r$. In the primed coordinates, we have $\vec s' = \mathbf{B}^{-1}\vec s = \mathbf{B}^{-1}\mathbf{T}\vec r$; subbing in $\vec r = \mathbf{B}\vec r'$, we have $\vec s' = \mathbf{B}^{-1}\mathbf{T}\mathbf{B}\vec r'$. Thus, in the primed coordinates, $\mathbf{T}' = \mathbf{B}^{-1}\mathbf{T}\mathbf{B}$. $\mathbf{T}'$ is said to be similar to $\mathbf{T}$.
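
    A quick numerical sanity check of the similarity transform (random $\mathbf{T}$ and $\mathbf{B}$; a random $\mathbf{B}$ is almost surely invertible):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    T = rng.normal(size=(3, 3))          # linear transformation
    B = rng.normal(size=(3, 3))          # change-of-basis matrix

    T_prime = np.linalg.solve(B, T @ B)  # T' = B^{-1} T B

    # Transforming in standard coordinates and then changing basis
    # agrees with applying T' directly in the primed coordinates.
    r = rng.normal(size=3)
    r_prime = np.linalg.solve(B, r)
    assert np.allclose(np.linalg.solve(B, T @ r), T_prime @ r_prime)
    ```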

    Eigendecomposition

    If the change-of-basis matrix in the previous section has the eigenvectors of $\mathbf{T}$ as its columns (for symmetric $\mathbf{T}$, this amounts to rotating the coordinate system to point along the eigenvectors), then $\mathbf{T}'$ will be diagonal. This is evident from the eigenvalue equation: \(\begin{align*} \mathbf{T}\vec v = \lambda\vec v\Rightarrow \mathbf{T}\mathbf{V} = \mathbf{V}\mathbf{\Lambda}\Rightarrow \mathbf{T} = \mathbf{V}\mathbf{\Lambda}\mathbf{V}^{-1}, \end{align*}\) where the column vectors of $\mathbf{V}$ are the eigenvectors of $\mathbf{T}$ and $\mathbf{\Lambda}$ has the corresponding eigenvalues along its diagonal. Hence $\mathbf{V}$ is exactly the change-of-basis matrix $\mathbf{B}$ in the previous section, and $\mathbf{\Lambda}$ is $\mathbf{T}' = \mathbf{V}^{-1}\mathbf{T}\mathbf{V}$.
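
    A sketch verifying this with NumPy on a small symmetric example:

    ```python
    import numpy as np

    T = np.array([[2.0, 1.0],
                  [1.0, 2.0]])           # symmetric example

    eigvals, V = np.linalg.eig(T)        # columns of V are eigenvectors

    # Changing basis to the eigenvectors diagonalizes T ...
    T_prime = np.linalg.solve(V, T @ V)  # V^{-1} T V
    assert np.allclose(T_prime, np.diag(eigvals))

    # ... and T reconstructs as V Λ V^{-1}
    assert np.allclose(V @ np.diag(eigvals) @ np.linalg.inv(V), T)
    ```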

    (TODO: discuss eigendecomposition in terms of change of basis https://math.stackexchange.com/a/4745118)

    Relation to the covariance matrix

    We can think of the covariance matrix as encoding a linear transformation that scales and then rotates draws from a standard multivariate normal distribution. Writing its eigendecomposition as $\mathbf{\Sigma} = \mathbf{R}\mathbf{\Lambda}\mathbf{R}^\intercal$ (with $\mathbf{R}$ orthogonal, since $\mathbf{\Sigma}$ is symmetric), we have \(X\sim\mathcal{N}_m(\vec 0, \mathbf{I}_m)\Rightarrow \mathbf{R}\sqrt{\mathbf{\Lambda}}X \sim \mathcal{N}_m\left(\vec 0, \mathbf{R}\sqrt{\mathbf{\Lambda}}\left(\mathbf{R}\sqrt{\mathbf{\Lambda}}\right)^\intercal\right) = \mathcal{N}_m(\vec 0, \mathbf{R}\mathbf{\Lambda}\mathbf{R}^\intercal) = \mathcal{N}_m(\vec 0, \mathbf{\Sigma}),\) using the fact that $\operatorname{Cov}(\mathbf{A}X) = \mathbf{A}\operatorname{Cov}(X)\mathbf{A}^\intercal$.
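
    A sketch of sampling this way (the covariance matrix below is an arbitrary example):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Covariance matrix and its eigendecomposition Σ = R Λ Rᵀ
    Sigma = np.array([[2.0, 0.8],
                      [0.8, 1.0]])
    eigvals, R = np.linalg.eigh(Sigma)

    # Scale standard-normal draws by sqrt(Λ), then rotate by R
    Z = rng.standard_normal(size=(2, 200_000))
    X = R @ np.diag(np.sqrt(eigvals)) @ Z

    # The empirical covariance approaches Σ
    print(np.cov(X))  # ≈ [[2.0, 0.8], [0.8, 1.0]]
    ```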

    Miscellany

    Log-determinant of positive-definite matrix

    We sometimes need to compute the log-determinant of a covariance matrix, e.g. when working with multivariate Gaussians in logspace. However, computing the determinant directly may run into precision issues, since the product of the eigenvalues can easily under- or overflow. We can avoid this by computing the log-determinant as a sum in logspace via the Cholesky decomposition $\mathbf{\Sigma} = \mathbf{L}\mathbf{L}^\intercal$, where $\mathbf{L}$ is a lower triangular matrix. Thus \(\begin{align*} |\mathbf{\Sigma}| &= |\mathbf{L}\mathbf{L}^\intercal|\\ &= |\mathbf{L}||\mathbf{L}^\intercal|\\ &= |\mathbf{L}|^2\\ &= \Big(\prod_i L_{ii}\Big)^2\\ &\Downarrow\\ \log|\mathbf{\Sigma}| &= 2\sum_i \log L_{ii}, \end{align*}\) where the final product uses the fact that the determinant of a triangular matrix is the product of its diagonal entries.
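
    A sketch of the computation, cross-checked against NumPy's own stable log-determinant:

    ```python
    import numpy as np

    Sigma = np.array([[4.0, 1.0],
                      [1.0, 3.0]])  # arbitrary positive-definite example

    L = np.linalg.cholesky(Sigma)   # Σ = L Lᵀ, L lower triangular
    logdet = 2.0 * np.sum(np.log(np.diag(L)))

    sign, logdet_ref = np.linalg.slogdet(Sigma)
    assert sign > 0 and np.isclose(logdet, logdet_ref)
    ```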