The singular value decomposition (SVD) is one of the most important matrix decompositions in data science and machine learning.
Eigenvalues are great for understanding the behavior of the iterates $A^k x$. Does the sequence grow? Does it shrink? How can it be easily modeled?
But they say nothing about the size of $Ax$ itself. How is $A$ transforming and rescaling the vector $x$?
Because $A$ is a matrix, multiple scales are involved when applying $A$. These scales can be represented systematically using the singular value decomposition.
- $A = U \Sigma V^T$, with $A$ of size $m \times n$.
- $U$: orthogonal, $m \times m$. Left singular vectors.
- $V$: orthogonal, $n \times n$. Right singular vectors.
- $\Sigma$: $m \times n$, diagonal matrix with real, non-negative entries = pure scaling = singular values.
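As a quick numerical sanity check of these shapes and properties, here is a minimal NumPy sketch (the $4 \times 3$ matrix below is an arbitrary example):

```python
import numpy as np

# Arbitrary 4 x 3 example matrix (m = 4, n = 3).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))

# full_matrices=True returns the square orthogonal factors: U is m x m, V is n x n.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Assemble the m x n "diagonal" matrix Sigma from the vector of singular values.
Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, s)

print(U.shape, Sigma.shape, Vt.shape)       # (4, 4) (4, 3) (3, 3)
print(np.allclose(U.T @ U, np.eye(4)))      # True: U is orthogonal
print(np.allclose(Vt @ Vt.T, np.eye(3)))    # True: V is orthogonal
print(np.allclose(A, U @ Sigma @ Vt))       # True: A = U Sigma V^T
```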
Consider a ball in $\mathbb{R}^n$. $A$ transforms this ball into an ellipsoid.
- $A x = U \Sigma V^T x$; apply the factors from right to left:
- $V^T x$: point on the unit ball.
- $\Sigma V^T x$: point on an ellipsoid; the axes are aligned with the coordinate axes.
- $U \Sigma V^T x$: rotate/reflect the ellipsoid.
The lengths of the semi-axes of this ellipsoid are the singular values of $A$.
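A small two-dimensional sketch of this picture (the $2 \times 2$ matrix is an arbitrary example): push points of the unit circle through $A$ and compare the extreme lengths with the singular values.

```python
import numpy as np

# Arbitrary 2 x 2 example matrix.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

# Sample points on the unit circle in R^2.
t = np.linspace(0.0, 2.0 * np.pi, 2000)
circle = np.vstack([np.cos(t), np.sin(t)])   # shape (2, 2000)

# The image of the circle under A is an ellipse.
ellipse = A @ circle
lengths = np.linalg.norm(ellipse, axis=0)

s = np.linalg.svd(A, compute_uv=False)       # singular values, largest first
print(lengths.max(), s[0])                   # longest  semi-axis ~ sigma_1
print(lengths.min(), s[1])                   # shortest semi-axis ~ sigma_2
```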
As we can expect, the size of a matrix can be related to its singular values:
$$
\|A\|_2 = \sigma_1, \qquad \|A\|_F = \Big( \sum_i \sigma_i^2 \Big)^{1/2},
$$
where $\sigma_1$ is the largest singular value.
We can also define a new matrix norm using the SVD, the Schatten $p$-norm:
$$
\|A\|_{S,p} = \Big( \sum_{i=1}^{r} \sigma_i^p \Big)^{1/p} = \big\| (\sigma_1, \ldots, \sigma_r) \big\|_p,
$$
where $r$ is the rank and $\|\cdot\|_p$ is the vector $p$-norm.
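These identities are easy to verify numerically; here is a sketch (the matrix and the choice $p = 3$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))              # arbitrary example matrix
s = np.linalg.svd(A, compute_uv=False)       # singular values, largest first

# 2-norm and Frobenius norm in terms of singular values.
print(np.linalg.norm(A, 2), s[0])                        # ||A||_2 = sigma_1
print(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2)))   # ||A||_F = (sum sigma_i^2)^(1/2)

# Schatten p-norm = vector p-norm of the singular values (here p = 3).
p = 3
print(np.sum(s**p) ** (1.0 / p), np.linalg.norm(s, p))
```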
The four fundamental spaces. Assume $A$ is $m \times n$ and let $r$ be the number of non-zero singular values = rank of the matrix. Then:
$$
\begin{aligned}
\operatorname{range}(A) &= \operatorname{span}(u_1, \ldots, u_r), & \operatorname{null}(A^T) &= \operatorname{span}(u_{r+1}, \ldots, u_m), \\
\operatorname{range}(A^T) &= \operatorname{span}(v_1, \ldots, v_r), & \operatorname{null}(A) &= \operatorname{span}(v_{r+1}, \ldots, v_n),
\end{aligned}
$$
where $u_i$ and $v_i$ are the columns of $U$ and $V$.
We recover the four fundamental spaces and the rank-nullity theorem.
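As an illustration (a sketch with a deliberately rank-deficient example matrix), the trailing columns of $U$ and $V$ give bases for the two null spaces:

```python
import numpy as np

# Build a 5 x 4 example matrix of rank 2 (product of 5 x 2 and 2 x 4 factors).
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 4))

U, s, Vt = np.linalg.svd(A, full_matrices=True)
r = int(np.sum(s > 1e-12 * s[0]))            # numerical rank = # non-zero singular values
print(r)                                     # 2

# null(A)   = span(v_{r+1}, ..., v_n): A kills the trailing right singular vectors.
# null(A^T) = span(u_{r+1}, ..., u_m): A^T kills the trailing left singular vectors.
print(np.allclose(A @ Vt[r:].T, 0.0))        # True
print(np.allclose(A.T @ U[:, r:], 0.0))      # True
```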
Connection with eigenvalues. The eigenvalues of $A^T A$ and $A A^T$ are equal to $\sigma_1^2, \ldots, \sigma_r^2$, or 0. The eigenvectors of $A^T A$ are given by $V$, and those of $A A^T$ by $U$:
$$
A^T A = V \Sigma^T \Sigma V^T, \qquad A A^T = U \Sigma \Sigma^T U^T.
$$
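A quick numerical check of this connection (a sketch; the matrix is an arbitrary example):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 3))              # arbitrary example matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Eigendecomposition of the symmetric matrix A^T A.
evals, evecs = np.linalg.eigh(A.T @ A)       # eigenvalues in ascending order
print(np.allclose(np.sort(evals), np.sort(s**2)))     # True: eigenvalues are sigma_i^2

# The eigenvector for the largest eigenvalue matches v_1 up to a sign.
print(np.isclose(abs(Vt[0] @ evecs[:, -1]), 1.0))     # True
```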
The computational cost of computing the singular value decomposition of an $m \times n$ matrix is $O(mn \min(m, n))$.
Proof of the existence of the SVD. Multiple proofs are possible. Let's look at one of them. Define the matrix
$$
B = \begin{pmatrix} 0 & A \\ A^T & 0 \end{pmatrix}.
$$
This matrix is symmetric and therefore is unitarily diagonalizable:
$$
B = Q \Lambda Q^T,
$$
with $Q$ orthogonal and $\Lambda$ diagonal.
We can restrict this factorization such that $\Lambda$ has only non-zero entries on the diagonal. Assume that $(u, v)$ is an eigenvector of $B$ associated with $\lambda \neq 0$. Then
$$
B \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} A v \\ A^T u \end{pmatrix} = \lambda \begin{pmatrix} u \\ v \end{pmatrix}
\quad\Longrightarrow\quad
A v = \lambda u, \quad A^T u = \lambda v,
$$
and
$$
B \begin{pmatrix} u \\ -v \end{pmatrix} = \begin{pmatrix} -A v \\ A^T u \end{pmatrix} = -\lambda \begin{pmatrix} u \\ -v \end{pmatrix}.
$$
So $-\lambda$ is also an eigenvalue. Since the eigenvectors must be orthogonal we have
$$
\begin{pmatrix} u \\ v \end{pmatrix}^T \begin{pmatrix} u \\ -v \end{pmatrix} = \|u\|_2^2 - \|v\|_2^2 = 0.
$$
Let's normalize our eigenvector such that
$$
\|u\|_2^2 + \|v\|_2^2 = 2.
$$
This implies that $\|u\|_2 = \|v\|_2 = 1$.
Note that since $A v = \lambda u$ and $A^T u = \lambda v$, we have
$$
A^T A \, v = \lambda \, A^T u = \lambda^2 v, \qquad A A^T u = \lambda \, A v = \lambda^2 u.
$$
So $\lambda^2$ is an eigenvalue of $A^T A$ and $A A^T$, with eigenvectors $v$ and $u$.
Denote by $u_1, \ldots, u_r$ all the eigenvectors of $A A^T$ and by $v_1, \ldots, v_r$ those of $A^T A$ (keeping only the non-zero eigenvalues). We have shown that
$$
A v_i = \sigma_i u_i, \qquad A^T u_i = \sigma_i v_i, \qquad \sigma_i = \lambda_i > 0,
$$
that is, in matrix form, $A V = U \Sigma$ and $A^T U = V \Sigma$ with $\Sigma = \operatorname{diag}(\sigma_1, \ldots, \sigma_r)$.
Using
$$
A V = U \Sigma
$$
and the fact that any $x$ can be written as $x = V V^T x + w$ with $A w = 0$ (since $w$ is orthogonal to $v_1, \ldots, v_r$, the eigenvectors of $A^T A$ with non-zero eigenvalues), we find that
$$
A x = A \, V V^T x = U \Sigma V^T x \quad \text{for all } x, \qquad \text{i.e.,} \quad A = U \Sigma V^T.
$$
This is the SVD of $A$ (in its reduced, or thin, form; padding $U$ and $V$ with orthonormal bases of the null spaces and $\Sigma$ with zeros gives the full form).
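A numerical illustration of the matrix used in this proof (a sketch; $A$ is an arbitrary full-rank example): the eigenvalues of $B$ are $\pm\sigma_i$ together with $|m - n|$ zeros.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 3))              # arbitrary full-rank example (m = 4, n = 3)
m, n = A.shape

# The symmetric block matrix used in the proof: B = [[0, A], [A^T, 0]].
B = np.block([[np.zeros((m, m)), A],
              [A.T, np.zeros((n, n))]])

s = np.linalg.svd(A, compute_uv=False)
evals = np.linalg.eigvalsh(B)

# For a full-rank A, the eigenvalues of B are +sigma_i, -sigma_i, and |m - n| zeros.
expected = np.sort(np.concatenate([s, -s, np.zeros(abs(m - n))]))
print(np.allclose(np.sort(evals), expected))          # True
```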
The SVD is unique if and only if all the singular values are distinct (up to a sign change of matching columns in $U$ and $V$).
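For instance (a sketch; the matrix is an arbitrary example), flipping the signs of a matching left/right singular vector pair yields another valid factorization of the same $A$:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((4, 3))              # arbitrary example matrix
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Flip the signs of the first left singular vector and first right singular vector.
U2 = U.copy();  U2[:, 0] *= -1
Vt2 = Vt.copy(); Vt2[0, :] *= -1

# The product is unchanged, so (U2, Sigma, V2) is another valid SVD of A.
print(np.allclose(A, U2 @ np.diag(s) @ Vt2))          # True
```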
See also: The four fundamental spaces, Eigenvalues, Operator and matrix norms, Orthogonal matrix and projector.