Iterative Methods for Eigenvalue Computation

Why Iterative Methods?

Previously, we developed a robust and powerful framework for computing the eigenvalues of a matrix \(A \in \mathbb{R}^{n \times n}\). The cornerstone of this “direct” approach is the QR algorithm, which at each step factors \(A_k = Q_k R_k\) and forms \(A_{k+1} = R_k Q_k = Q_k^{\top} A_k Q_k\), a similarity transformation that preserves the eigenvalues. These iterations drive the matrix toward an upper-triangular or real Schur form, from which the eigenvalues can be read off.
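
As a reminder of the mechanics, here is a minimal sketch of the unshifted QR iteration on a small symmetric matrix with known eigenvalues. The test matrix, random seed, and iteration count are arbitrary illustrative choices; practical implementations add Hessenberg reduction, shifts, and deflation, all omitted here.

```python
import numpy as np

# Build a small symmetric test matrix with known eigenvalues 1, 2, 3, 4, 5.
rng = np.random.default_rng(0)
Q0, _ = np.linalg.qr(rng.standard_normal((5, 5)))   # a random orthogonal matrix
A = Q0 @ np.diag([1.0, 2.0, 3.0, 4.0, 5.0]) @ Q0.T

# Basic (unshifted) QR iteration.
Ak = A.copy()
for _ in range(200):
    Qk, Rk = np.linalg.qr(Ak)    # factor A_k = Q_k R_k
    Ak = Rk @ Qk                 # A_{k+1} = R_k Q_k = Q_k^T A_k Q_k (similar to A_k)

print(np.sort(np.diag(Ak)))      # approximately [1. 2. 3. 4. 5.]
```

Note that every step works with a full, dense 5x5 matrix; this is exactly the assumption that breaks down in the large, sparse setting discussed next.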

These methods are accurate, numerically stable, and compute the entire spectrum of the matrix. However, they rest on an implicit assumption: that we can afford to work with dense matrices.

The Challenge of Large, Sparse Matrices

In virtually all large-scale applications in science, engineering, data science, and network analysis, the matrices we encounter are sparse. A sparse matrix is one where the vast majority of entries are zero.

  • Definition: A matrix is considered sparse if the number of non-zero entries, denoted \(nnz(A)\), is much smaller than the total number of entries, \(n^2\).

  • A Common Case: In many applications (e.g., from discretizing a PDE), a matrix will have \(O(n)\) non-zero entries, meaning the number of non-zeros per row is bounded by a small constant, \(O(1)\), even as \(n \to \infty\).

Why can’t we use our standard QR algorithm?

  1. Fill-In: The similarity transformations used in the QR algorithm (and its prerequisites, like reduction to Hessenberg form) almost universally destroy sparsity. Even if \(A\) is sparse, the intermediate matrices \(Q\), \(R\), and \(A_k\) will be dense. This is known as “fill-in.”

  2. Memory Cost: For a “large” problem, \(n\) might be \(10^6\) or \(10^9\). We can easily store the \(O(n)\) non-zero entries of \(A\), but we absolutely cannot store the \(O(n^2)\) entries of a dense \(A_k\): even at \(n = 10^6\), a dense matrix has \(10^{12}\) entries, roughly 8 terabytes in double precision.

  3. Computational Cost: The \(O(n^3)\) computational cost of the dense QR algorithm is completely infeasible for such dimensions.

We are forced to conclude that any method that modifies the entries of \(A\) (like LU, QR, or Hessenberg reduction) is unsuitable for large, sparse problems.
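
To make these points concrete, here is a small sketch using NumPy/SciPy. The tridiagonal 1-D Laplacian and the “arrowhead” matrix are illustrative toy examples chosen here, not matrices from the discussion above; the first shows how small \(nnz(A)\) is compared with \(n^2\), and the second shows how factoring a sparse matrix produces dense factors.

```python
import numpy as np
import scipy.sparse as sp

# A 1-D Laplacian (tridiagonal): O(n) non-zeros, roughly 3 per row.
n = 1000
A = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")
print(f"nnz(A) = {A.nnz}   vs   n^2 = {n * n}")     # ~3n versus 10^6

# Fill-in: an "arrowhead" matrix also has O(n) non-zeros, but after the
# first elimination step every row is coupled to every column, so its
# QR factors are essentially dense.
m = 200
B = np.zeros((m, m))
np.fill_diagonal(B, 4.0)
B[0, :] = 1.0
B[:, 0] = 1.0
Q, R = np.linalg.qr(B)
print(f"nnz(B) = {np.count_nonzero(B)}")            # ~3m
print(f"entries of R above 1e-12: {np.count_nonzero(np.abs(R) > 1e-12)}",
      f"out of {m * (m + 1) // 2} in the upper triangle")
```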

A New Philosophy: Matrix-as-Operator

We need a different approach. The key is to treat the matrix \(A\) not as a table of \(n^2\) numbers, but as a linear operator that “acts” on a vector. The only operation we can afford to perform with \(A\) is the one that respects its sparsity: the matrix-vector product (or “matvec”).

Let’s analyze the cost of \(y = Ax\):

\[ y_i = \sum_{j=1}^n a_{ij} \, x_j \]

If \(A\) is stored in a sparse format (like Compressed Sparse Row, CSR), we only need to iterate over the non-zero \(a_{ij}\) entries.

  • Dense Matvec: For a dense matrix we must treat every \(a_{ij}\) as potentially non-zero, so the sum requires \(n\) multiplications and \(n-1\) additions for each \(y_i\). The total cost is \(O(n^2)\).

  • Sparse Matvec: If \(A\) has \(O(1)\) non-zeros per row, the sum for each \(y_i\) costs \(O(1)\). The total cost to compute all \(n\) entries of \(y\) is therefore \(O(n)\) (and, in general, \(O(nnz(A))\)).

This \(O(n)\) operation is the fundamental building block of all iterative methods for large-scale eigenvalue problems.
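
To illustrate why the sparse matvec touches only \(nnz(A)\) entries, here is a minimal hand-rolled CSR matvec. The loop is written out for clarity only; in practice one would simply write `A @ x` with a SciPy sparse matrix. The tiny 3x3 example matrix is an arbitrary illustration.

```python
import numpy as np

def csr_matvec(data, indices, indptr, x):
    """y = A @ x for A stored in CSR format.

    data    -- the non-zero values, stored row by row
    indices -- the column index of each stored value
    indptr  -- indptr[i]:indptr[i+1] is the slice of row i in data/indices
    The work is proportional to nnz(A), not n^2.
    """
    n = len(indptr) - 1
    y = np.zeros(n)
    for i in range(n):                                  # one pass over the rows
        for k in range(indptr[i], indptr[i + 1]):       # only the stored entries
            y[i] += data[k] * x[indices[k]]
    return y

# Tiny example: a 3x3 matrix with 4 non-zeros,
# [[2, 0, 0],
#  [0, 3, 1],
#  [0, 0, 4]]
data    = np.array([2.0, 3.0, 1.0, 4.0])
indices = np.array([0, 1, 2, 2])
indptr  = np.array([0, 1, 3, 4])
x = np.array([1.0, 1.0, 1.0])
print(csr_matvec(data, indices, indptr, x))   # [2. 4. 4.]
```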

Iterative Methods: The Solution and the Trade-Offs

This leads us to iterative methods (such as Arnoldi iteration and Lanczos iteration). These methods are designed to approximate eigenvalues and eigenvectors using only matrix-vector products and other vector-level operations (like vector-vector products and linear combinations, which cost \(O(n)\)).

They build a sequence of vectors or subspaces that (ideally) converge to the desired eigenvectors.
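
As an illustration of this matvec-only interface, the sketch below uses SciPy’s `eigsh` (an ARPACK wrapper built on a Lanczos-type iteration for symmetric problems) to compute a few extreme eigenvalues of a sparse 2-D Laplacian. The grid size, number of eigenvalues, and tolerance are arbitrary choices; the matrix is hidden behind a `LinearOperator` to emphasize that the solver sees nothing but matrix-vector products.

```python
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, eigsh

# A 2-D Laplacian on a 100x100 grid: n = 10,000, about 5 non-zeros per row.
m = 100
T = sp.diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(m, m))
A = (sp.kron(sp.identity(m), T) + sp.kron(T, sp.identity(m))).tocsr()

# Expose only the matvec; nothing dense is ever formed.
op = LinearOperator(A.shape, matvec=lambda v: A @ v)

# Six largest-magnitude eigenvalues, to a modest tolerance.
vals, vecs = eigsh(op, k=6, which="LM", tol=1e-8)
print(vals)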

This new approach comes with a significant set of trade-offs:

  • Partial Spectrum: Unlike the QR algorithm, these methods typically do not find the entire spectrum. They are most effective at finding a few of the “extreme” eigenvalues (e.g., the largest or smallest in magnitude) and their corresponding eigenvectors. In practice, this is often all that is required by the application.

  • Convergence: Convergence is not guaranteed and can be slow, depending on the properties of the matrix (like the separation between eigenvalues).

  • Accuracy: We do not compute the eigenvalues to full machine precision. Instead, the iteration is stopped when the “residual” (a measure of error) falls below a user-defined tolerance.

Despite these caveats, iterative methods are the only viable option for computing eigenvalues of the very large sparse matrices that arise in modern computational problems.
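
To illustrate the residual-based stopping rule described above, here is a minimal sketch of the power method. It is not one of the methods named earlier (Arnoldi, Lanczos), but it is the simplest matvec-only iteration and uses the same criterion: stop once \(\|Av - \lambda v\|\) falls below a tolerance rather than iterating to machine precision. The test matrix and tolerance are arbitrary illustrative choices.

```python
import numpy as np
import scipy.sparse as sp

def power_method(A, tol=1e-8, max_iter=10_000):
    """Approximate the largest-magnitude eigenpair of A using only matvecs.

    Stops when the residual ||A v - lam*v|| falls below tol; convergence is
    not guaranteed and its speed depends on how well the dominant eigenvalue
    is separated from the rest of the spectrum.
    """
    rng = np.random.default_rng(0)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(max_iter):
        w = A @ v                          # the only use of A: one matvec
        lam = v @ w                        # Rayleigh quotient estimate
        if np.linalg.norm(w - lam * v) < tol:
            break
        v = w / np.linalg.norm(w)
    return lam, v

# Toy test: a sparse diagonal matrix with eigenvalues 1, 2, ..., n,
# so the dominant eigenvalue (n) is reasonably well separated.
n = 200
A = sp.diags(np.arange(1.0, n + 1.0), format="csr")
lam, v = power_method(A)
print(lam)    # close to 200, to within the residual tolerance
```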