One of the most important matrix decompositions in data science and machine learning.

Eigenvalues are great for understanding the behavior of the iterates $A^k x$. Does it grow? Does it shrink? How can it be easily modeled?

But they say nothing about the size of $Ax$ for an arbitrary vector $x$. How is $A$ transforming and rescaling the vector?

Because $A$ is a matrix, multiple scales are involved when applying $A$. These scales can be represented systematically using the singular value decomposition (SVD).

  • $A = U \Sigma V^T$, for $A \in \mathbb{R}^{m \times n}$.
  • $U$: orthogonal, $m \times m$. Left singular vectors.
  • $V$: orthogonal, $n \times n$. Right singular vectors.
  • $\Sigma$: $m \times n$, diagonal matrix with real non-negative entries = pure scaling = singular values $\sigma_1 \ge \sigma_2 \ge \dots \ge 0$, sorted in decreasing order by convention.
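As a concrete sketch, here is a minimal NumPy example (on an arbitrary random matrix) that computes the SVD and checks the shapes, the orthogonality of $U$ and $V$, and the reconstruction:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))  # an arbitrary 5 x 3 example

# full_matrices=True returns square U (m x m) and V^T (n x n)
U, s, Vt = np.linalg.svd(A, full_matrices=True)
print(U.shape, s.shape, Vt.shape)  # (5, 5) (3,) (3, 3)

# Rebuild the m x n diagonal Sigma from the vector of singular values
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

# U and V are orthogonal, and A = U Sigma V^T up to rounding error
assert np.allclose(U @ U.T, np.eye(5))
assert np.allclose(Vt @ Vt.T, np.eye(3))
assert np.allclose(U @ Sigma @ Vt, A)
```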

Consider a ball in $\mathbb{R}^n$. $A$ transforms this ball into an ellipsoid in $\mathbb{R}^m$.

  • $Ax = U \Sigma V^T x$:
    • $V^T x$: point on the unit ball (orthogonal matrices preserve lengths).
    • $\Sigma V^T x$: point on an ellipsoid; the axes are aligned with the coordinate axes.
    • $U \Sigma V^T x$: rotate/reflect the ellipsoid.

The lengths of the semi-axes of this ellipsoid are the singular values of $A$.
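A quick numerical check of this picture, on an arbitrary $2 \times 2$ example: the longest and shortest stretches of unit vectors under $A$ match the two singular values.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])  # an arbitrary 2 x 2 example

# Sample points on the unit circle and map them through A
theta = np.linspace(0, 2 * np.pi, 10_000)
circle = np.stack([np.cos(theta), np.sin(theta)])  # shape (2, N)
image = A @ circle                                 # points on an ellipse

lengths = np.linalg.norm(image, axis=0)
sigma = np.linalg.svd(A, compute_uv=False)

print(lengths.max(), sigma[0])  # semi-major axis ~ sigma_1
print(lengths.min(), sigma[1])  # semi-minor axis ~ sigma_2
```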

As we might expect, the size of a matrix can be related to its singular values:

$$\|A\|_2 = \sigma_1, \qquad \|A\|_F = \sqrt{\sigma_1^2 + \dots + \sigma_r^2}.$$
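A minimal check with NumPy (arbitrary random matrix):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
sigma = np.linalg.svd(A, compute_uv=False)

# Spectral norm = largest singular value
print(np.linalg.norm(A, 2), sigma[0])

# Frobenius norm = square root of the sum of squared singular values
print(np.linalg.norm(A, "fro"), np.sqrt(np.sum(sigma**2)))
```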

We can also define a family of matrix norms using the SVD, the Schatten $p$-norm:

$$\|A\|_p = \left( \sum_{i=1}^{r} \sigma_i^p \right)^{1/p} = \|\sigma\|_p,$$

where $r$ is the rank and $\|\cdot\|_p$ is the vector $p$-norm, applied here to the vector $\sigma = (\sigma_1, \dots, \sigma_r)$ of singular values.
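A small helper sketch (the name `schatten_norm` is mine) implementing this definition directly from the singular values:

```python
import numpy as np

def schatten_norm(A, p):
    """Schatten p-norm: the vector p-norm of the singular values."""
    sigma = np.linalg.svd(A, compute_uv=False)
    return np.linalg.norm(sigma, p)

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3))

print(schatten_norm(A, 1))       # p = 1: nuclear norm
print(schatten_norm(A, 2),       # p = 2: recovers the Frobenius norm
      np.linalg.norm(A, "fro"))
print(schatten_norm(A, np.inf),  # p = inf: recovers the spectral norm
      np.linalg.norm(A, 2))
```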

The four fundamental spaces. Assume $A$ is $m \times n$ and let $r$ be the number of non-zero singular values, which equals the rank of the matrix. Then, writing $u_i$ and $v_i$ for the columns of $U$ and $V$:

$$\operatorname{range}(A) = \operatorname{span}(u_1, \dots, u_r), \qquad \operatorname{null}(A^T) = \operatorname{span}(u_{r+1}, \dots, u_m),$$

$$\operatorname{range}(A^T) = \operatorname{span}(v_1, \dots, v_r), \qquad \operatorname{null}(A) = \operatorname{span}(v_{r+1}, \dots, v_n).$$

We recover the four fundamental spaces and the rank-nullity theorem: $\operatorname{rank}(A) + \dim \operatorname{null}(A) = r + (n - r) = n$.
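The columns of $U$ and $V$ give orthonormal bases for all four spaces; a sketch on a rank-deficient example:

```python
import numpy as np

# A rank-1 example: a 3 x 2 matrix whose columns are parallel
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
m, n = A.shape

U, s, Vt = np.linalg.svd(A, full_matrices=True)
r = int(np.sum(s > 1e-10))  # numerical rank = number of non-zero singular values
V = Vt.T

col_space  = U[:, :r]   # orthonormal basis of range(A)
left_null  = U[:, r:]   # orthonormal basis of null(A^T)
row_space  = V[:, :r]   # orthonormal basis of range(A^T)
null_space = V[:, r:]   # orthonormal basis of null(A)

assert np.allclose(A @ null_space, 0)   # A v = 0 on null(A)
assert np.allclose(A.T @ left_null, 0)  # A^T u = 0 on null(A^T)
print(r + null_space.shape[1] == n)     # rank-nullity: r + (n - r) = n
```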

Connection with eigenvalues. The eigenvalues of $A^T A$ and $A A^T$ are equal to $\sigma_1^2, \dots, \sigma_r^2$, or 0. The eigenvectors of $A^T A$ are given by $V$, and those of $A A^T$ by $U$:

$$A^T A = V \Sigma^T \Sigma V^T, \qquad A A^T = U \Sigma \Sigma^T U^T.$$
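Checking this numerically on an arbitrary random matrix: the eigenvalues of $A^T A$ are the squared singular values, and its eigenvectors match the right singular vectors up to sign.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))

U, s, Vt = np.linalg.svd(A)
evals, evecs = np.linalg.eigh(A.T @ A)  # symmetric eigensolver, ascending order

# Eigenvalues of A^T A = squared singular values
print(evals[::-1])  # reordered to descending
print(s**2)

# Eigenvectors of A^T A = right singular vectors, up to sign
print(np.abs(evecs[:, ::-1]))
print(np.abs(Vt.T))
```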

The computational cost of computing the singular value decomposition of an $m \times n$ matrix is $O(mn \min(m, n))$.
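A rough timing sketch (results are machine- and library-dependent): for square matrices this cost is $O(n^3)$, so doubling $n$ should multiply the runtime by roughly $2^3 = 8$.

```python
import time
import numpy as np

rng = np.random.default_rng(4)

for n in (500, 1000, 2000):
    A = rng.standard_normal((n, n))
    t0 = time.perf_counter()
    np.linalg.svd(A, compute_uv=False)
    print(n, time.perf_counter() - t0)  # expect roughly 8x per doubling
```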

The four fundamental spaces, Eigenvalues, Operator and matrix norms, Orthogonal matrix and projector