\(\newcommand{\bx}{\textbf{x}} \newcommand{\bo}{\textbf{0}} \newcommand{\bv}{\textbf{v}} \newcommand{\bu}{\textbf{u}} \newcommand{\bq}{\textbf{q}} \newcommand{\by}{\textbf{y}} \newcommand{\bb}{\textbf{b}} \newcommand{\ba}{\textbf{a}} \newcommand{\grad}{\boldsymbol{\nabla}} \newcommand{\pd}[2]{\frac{\partial #1}{\partial #2}} \newcommand{\pdd}[2]{\frac{\partial^2 #1}{\partial #2^2}} \newcommand{\pddm}[3]{\frac{\partial^2 #1}{\partial #2 \partial #3}} \newcommand{\deriv}[2]{\frac{d #1}{d #2}} \newcommand{\lt}{ < } \newcommand{\gt}{ > } \newcommand{\amp}{ & } \)

Section5.5The Singular Value Decomposition

Our overarching goal of this section is to discuss an analogue to the eigenvalue-eigenvector problem for non-square matrices. That is, we would like to take a matrix \(A\) that is \(m \times n\) and find vectors are values that behave similarly to how eigenvectors and eigenvalues behave for square matrices. The key to this discussion is the matrix \(A^T A\), so let's start there.

Definition5.5.3

The singular values of an \(m \times n\) matrix \(A\) are the square roots of the eigenvalues of \(A^T A\). They are typically denoted as \(\sigma_1, \sigma_2, \dots, \sigma_n\) where \(\sigma_j = \sqrt{\lambda_j}\) and \(\lambda_j\) is an eigenvalue of \(A^T A\).

Definition5.5.4

The singular value decomposition of an \(m \times n\) matrix \(A\) with rank \(r\) is a factorization of \(A\) into the product of three matrices, \(U\), \(\Sigma\), and \(V\), such that \begin{equation*} A = U \Sigma V^T. \end{equation*}

In the singular value decomposition, \(U\) (\(m \times m\)) and \(V\) (\(n \times n\)) have orthogonal columns and \(\Sigma\) (\(m \times n\)) is a block diagonal matrix \begin{equation*} \Sigma = \begin{pmatrix}D \amp 0 \\ 0 \amp 0 \end{pmatrix} \end{equation*} where \(D\) is an \(r \times r\) diagonal matrix containing the \(r\) singular values of \(A\) in rank order (largest to smallest).

To build the singular value decomposition:

  1. Form \(A^TA\) and find the eigenvalues and eigenvectors (guaranteed to exist by Theorem 5.5.2).

  2. Form \(\Sigma\)

  3. The columns of \(V\) are the eigenvectors of \(A^T A\).

  4. The columns of \(U\) are the normalized vectors obtained by \begin{equation*} \bu_1 = \frac{1}{\sigma_1} A \bv_1\, , \, \bu_2 = \frac{1}{\sigma_2} A \bv_2 \, , \, \dots, \, \bu_m = \frac{1}{\sigma_m} A \bv_m \end{equation*}