Computing PCA with SVD

PCA is a great tool for performing dimensionality reduction. Two reason you might want to use SVD to compute PCA:

  • SVD is more numerically stable if the columns are close to collinear. I have seen this happen in text data, when certain terms almost always appear together.
  • Spark’s PCA implementation currently doesn’t support very wide matrices. The SVD implementation does however.

Singular Value Decomposition (SVD)

Below we briefly recap Singular Value Decomposition (SVD).

Let \(\mathbf{A}\) be a \(m \times n\) matrix, the singular value decomposition gives

$$\mathbf{A} = \mathbf{U}_{m\times m}\mathbf{\Sigma}_{m\times n}\mathbf{V}^\top_{n\times n}$$

\(\mathbf{U}\) is an orthonormal matrix and is the eigenvectors of \(\mathbf{A}\mathbf{A}^\top\).

\(\mathbf{V}\) is an orthonormal matrix and is the eigenvectors of \(\mathbf{A}^\top\mathbf{A}\).

\(\mathbf{\Sigma}\) is a diagonal matrix and contains the square-roots of the eigenvalues of \(\mathbf{A}\mathbf{A}^\top\) and   \(\mathbf{A}^\top\mathbf{A}\) e.g.

$$\mathbf{\Sigma}=\begin{pmatrix}\sqrt{\lambda_1}&0&\cdots& 0\\0 &\sqrt{\lambda_2} & \cdots &0\\\vdots & \vdots & \ddots & \vdots \\0&0& \cdots &\sqrt{\lambda_k}\end{pmatrix}$$

Remember, as \(\mathbf{U}\) is an orthonormal matrix \(\mathbf{U}^\top=\mathbf{U}^{-1}\)

Computing PCA

Start with the standard steps of PCA:

  1. Mean centre the matrix \(\mathbf{A}\)
  2. Optionally scale each column by their standard deviation. You may want to do this if the variables are measured on different scales.

We noted in the previous section that \(\mathbf{V}\) is the eigenvectors of \(\mathbf{A}^\top\mathbf{A}\) (the covariance matrix). Thus the principal component decomposition is

$$\begin{equation} \begin{split}\mathbf{T} & = \mathbf{A}\mathbf{V}\\ & =\mathbf{U}\mathbf{\Sigma}\mathbf{V}^\top\mathbf{V} \\ & =\mathbf{U}\mathbf{\Sigma}\end{split}\end{equation}$$

To reduce the dimensionality of \(\mathbf{A}\), select the \(k\) largest singular values (\(k \le n\) ), select the first \(k\) columns from \(\mathbf{U}\) and the \(k \times k\) upper-left part of \(\mathbf{\Sigma}\). The reduced dimensionality is given by

$$\mathbf{T}_k =\mathbf{U}_k\mathbf{\Sigma}_k$$

Leave a Reply

Your email address will not be published. Required fields are marked *