Processing math: 0%
\newcommand{\bx}{\textbf{x}} \newcommand{\bo}{\textbf{0}} \newcommand{\bv}{\textbf{v}} \newcommand{\bu}{\textbf{u}} \newcommand{\bq}{\textbf{q}} \newcommand{\by}{\textbf{y}} \newcommand{\bb}{\textbf{b}} \newcommand{\ba}{\textbf{a}} \newcommand{\grad}{\boldsymbol{\nabla}} \newcommand{\pd}[2]{\frac{\partial #1}{\partial #2}} \newcommand{\pdd}[2]{\frac{\partial^2 #1}{\partial #2^2}} \newcommand{\pddm}[3]{\frac{\partial^2 #1}{\partial #2 \partial #3}} \newcommand{\deriv}[2]{\frac{d #1}{d #2}} \newcommand{\lt}{ < } \newcommand{\gt}{ > } \newcommand{\amp}{ & }

Section5.5The Singular Value Decomposition

Our overarching goal of this section is to discuss an analogue to the eigenvalue-eigenvector problem for non-square matrices. That is, we would like to take a matrix A that is m \times n and find vectors are values that behave similarly to how eigenvectors and eigenvalues behave for square matrices. The key to this discussion is the matrix A^T A, so let's start there.

Definition5.5.3

The singular values of an m \times n matrix A are the square roots of the eigenvalues of A^T A. They are typically denoted as \sigma_1, \sigma_2, \dots, \sigma_n where \sigma_j = \sqrt{\lambda_j} and \lambda_j is an eigenvalue of A^T A.

Definition5.5.4

The singular value decomposition of an m \times n matrix A with rank r is a factorization of A into the product of three matrices, U, \Sigma, and V, such that \begin{equation*} A = U \Sigma V^T. \end{equation*}

In the singular value decomposition, U (m \times m) and V (n \times n) have orthogonal columns and \Sigma (m \times n) is a block diagonal matrix \begin{equation*} \Sigma = \begin{pmatrix}D \amp 0 \\ 0 \amp 0 \end{pmatrix} \end{equation*} where D is an r \times r diagonal matrix containing the r singular values of A in rank order (largest to smallest).

To build the singular value decomposition:

  1. Form A^TA and find the eigenvalues and eigenvectors (guaranteed to exist by Theorem 5.5.2).

  2. Form \Sigma

  3. The columns of V are the eigenvectors of A^T A.

  4. The columns of U are the normalized vectors obtained by \begin{equation*} \bu_1 = \frac{1}{\sigma_1} A \bv_1\, , \, \bu_2 = \frac{1}{\sigma_2} A \bv_2 \, , \, \dots, \, \bu_m = \frac{1}{\sigma_m} A \bv_m \end{equation*}