One of the most important matrix decompositions in data science and machine learning.

Eigenvalues are great for understanding . Does it grow? Does it shrink? How can it be easily modeled?

But it says nothing about the size of . How is transforming and rescaling the vector?

Because is a matrix, multiple scales are involved when applying . These scales can be represented systematically using the singular value decomposition.

  • : .
  • : orthogonal, . Left singular vectors
  • : orthogonal, . Right singular vectors
  • : , diagonal matrix with real positive entries = pure scaling = singular values.

Consider a ball in . transforms this ball into an ellipsoid.

  • :
    • : point on the unit ball
    • : point on an ellipsoid; the axes are aligned with the coordinate axes.
    • : rotate/reflect the ellipsoid.

The lengths of the axes of this ellipsoid are the singular values of .

As we can expect, the size of a matrix can be related to its singular values:

We can also define a new operator norm using the SVD, the Schatten -norm:

where is the rank and is the vector -norm.

The four fundamental spaces. Assume is . : number of non-zero singular values = rank of the matrix. Then:

We recover the four fundamental spaces and the rank-nullity theorem.

Connection with eigenvalues. The eigenvalues of and are equal to …, or 0. The eigenvectors of are given by , and those of by :

The computational cost of computing the singular value decomposition is .

Proof of the existence of the SVD. Multiple proofs are possible. Let’s look at one of them. Define the matrix

This matrix is symmetric and therefore is unitarily diagonalizable:

We can restrict this factorization such that has only non-zero entries on the diagonal. Assume that is an eigenvector of associated with . Then

So is also an eigenvalue. Since the eigenvectors must be orthogonal we have

Let’s normalize our eigenvector such that

This implies that .

Note that since and , we have

So is an eigenvalue of and .

Denote by all the eigenvectors of and by those of (keeping only the non-zero eigenvalues). We have shown that

Using

we find that

This is the SVD of .

The SVD is unique if and only if all the singular values are distinct (up to a sign change of columns in and ).

The four fundamental spaces, Eigenvalues, Operator and matrix norms, Orthogonal matrix and projector