- Krylov methods for sparse systems
- This provides a basic introduction to all the methods covered in this chapter.
- As alluded to previously, we search for the solution in the Krylov subspace (defined below) that satisfies an appropriate optimality condition.
- The optimality condition depends on the properties of the matrix: symmetric positive definite, symmetric, or general.
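For reference, the k-th Krylov subspace generated by A and the initial residual r(0) = b − A x(0) is given by the standard definition below (the chapter may use slightly different notation):

```math
\mathcal{K}_k\bigl(A, r^{(0)}\bigr) = \operatorname{span}\bigl\{ r^{(0)}, A r^{(0)}, A^2 r^{(0)}, \dots, A^{k-1} r^{(0)} \bigr\},
\qquad
x^{(k)} \in x^{(0)} + \mathcal{K}_k\bigl(A, r^{(0)}\bigr).
```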
- Conjugate Gradients Version 1
- This is a short and basic derivation of the algorithm.
- This method produces the correct sequence of iterates x(k).
- However, its space (memory) and time (number of flops) costs are not optimal.
- This will be refined in the sections below.
- Some orthogonality relations in CG
- We illustrate the first of several orthogonality relations satisfied by the CG vector sequences.
- The orthogonality relation is special: it is defined using a scaling by A^{1/2}.
- In this section, we prove that the solution increments Δx(k) are A-conjugate to each other (i.e., orthogonal after scaling by A^{1/2}); see the statement below.
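Concretely, the relation proved in this section can be written as follows (standard statement, in the chapter's increment notation):

```math
\Delta x^{(i)\,T} A \, \Delta x^{(j)} = 0 \quad (i \neq j)
\;\Longleftrightarrow\;
\bigl(A^{1/2}\Delta x^{(i)}\bigr)^{T}\bigl(A^{1/2}\Delta x^{(j)}\bigr) = 0 .
```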
- Convergence with conjugate steps
- Based on the derivation from the previous section, we can prove that CG converges in at most n steps, where n is the size of A.
- Residuals and solution increments in CG
- We now start deriving a series of results to help us turn the current version of CG into an efficient algorithm.
- We prove that the subspace spanned by the residuals is the same as the subspace spanned by the solution increments.
- CG search directions
- Using the results of the previous section, we establish that the residuals can be written as linear combinations of the increments.
- We will see later on how the coefficients in the linear combination can be easily computed using orthogonality relations.
- All the orthogonality relations in CG
- We derive all the orthogonality relations in CG that you need to know.
- This is the foundation to build the final steps of CG in the sections below.
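For orientation, the standard orthogonality relations in CG are summarized below, where r(k) denotes the residual and p(k) the search direction (the chapter may state them with different indexing or normalization):

```math
r^{(i)\,T} r^{(j)} = 0 \quad (i \neq j), \qquad
p^{(i)\,T} A \, p^{(j)} = 0 \quad (i \neq j), \qquad
r^{(i)\,T} p^{(j)} = 0 \quad (j < i).
```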
- Three-term recurrence
- Optimal step size
- Computationally efficient search directions
- Conjugate Gradients algorithm
- Based on all the previous equations, we describe each step in the CG algorithm.
- Conjugate Gradients code
- We give the Julia code for CG based on the algorithm we derived.
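As a point of comparison, here is a minimal sketch of a standard unpreconditioned CG implementation in Julia. It is illustrative only and not necessarily identical to the code derived in the chapter; the function name `cg` and the keyword arguments are assumptions.

```julia
using LinearAlgebra

# Minimal Conjugate Gradients sketch for A*x = b with A symmetric positive definite.
# Illustrative only; names and stopping criterion may differ from the chapter's code.
function cg(A, b; x0 = zeros(length(b)), tol = 1e-8, maxiter = length(b))
    x = copy(x0)
    r = b - A * x              # initial residual
    p = copy(r)                # initial search direction
    rs_old = dot(r, r)
    for _ in 1:maxiter
        Ap = A * p
        α = rs_old / dot(p, Ap)    # optimal step size along p
        x .+= α .* p
        r .-= α .* Ap
        rs_new = dot(r, r)
        sqrt(rs_new) < tol && break
        β = rs_new / rs_old        # coefficient for the new search direction
        p = r .+ β .* p            # next A-conjugate direction
        rs_old = rs_new
    end
    return x
end
```

A typical call is `cg(A, b)` with a sparse SPD matrix `A`; only matrix-vector products with `A` are required.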
- Convergence of the Conjugate Gradients
- We give an estimate for the convergence of CG.
- This convergence is optimal in the sense that, at every step, CG minimizes the A-norm of the error over the current Krylov subspace; see the standard estimate below.
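The usual estimate, stated in the A-norm of the error with κ the condition number of A (the chapter's exact constants may differ), is:

```math
\| x_* - x^{(k)} \|_A \;\le\; 2 \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^{k} \| x_* - x^{(0)} \|_A ,
\qquad \kappa = \frac{\lambda_{\max}(A)}{\lambda_{\min}(A)} .
```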
- Rajat’s painless Conjugate Gradients
- Rajat’s own derivation of the CG algorithm.
- 2024 complete and painless Conjugate Gradient
- Another standalone derivation of the CG algorithm based on Rajat’s derivation.
- This derivation contains all the key results.
- This is the recommended reading for understanding CG.
- GMRES
- GMRES least-squares problem
- We derive the least-squares problem GMRES is solving.
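In standard notation, the Arnoldi relation A Q_k = Q_{k+1} H̄_k reduces the problem to a small least-squares system of the following form (the chapter's symbols may differ):

```math
y^{(k)} = \arg\min_{y \in \mathbb{R}^{k}} \bigl\| \beta e_1 - \bar{H}_k \, y \bigr\|_2 ,
\qquad
x^{(k)} = x^{(0)} + Q_k \, y^{(k)} ,
\qquad
\beta = \| r^{(0)} \|_2 .
```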
- GMRES algorithm
- Convergence of GMRES
- As with CG, the convergence of GMRES depends on the distribution of the eigenvalues.
- It also depends on the condition number of the eigenvector basis matrix X.
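For a diagonalizable matrix A = X Λ X⁻¹, the standard bound combining both effects reads as follows (assuming this form; the chapter may state it differently):

```math
\frac{\| r^{(k)} \|_2}{\| r^{(0)} \|_2}
\;\le\;
\kappa_2(X)
\min_{\substack{p \in \mathcal{P}_k \\ p(0) = 1}}
\max_{i} \, | p(\lambda_i) | .
```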
- Space and time costs of CG and GMRES
- We derive the space and time computational cost for CG and GMRES.
- As expected, GMRES has a larger computational cost, with an extra factor of k compared to CG; see the comparison below.
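For orientation, the standard asymptotic costs after k iterations are roughly as follows (dominant terms only; the chapter's derivation gives the precise expressions):

| Method | Space (beyond storing A) | Time for k iterations |
|--------|--------------------------|-----------------------|
| CG     | O(n)                     | O(k(nnz + n))         |
| GMRES  | O(kn)                    | O(k·nnz + k²n)        |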
- MINRES
- This algorithm specializes GMRES for symmetric matrices.
- The space cost is reduced to O(n) and the time cost to O(k(nnz+n)).
- Preconditioning
- This technique is used to reduce the number of iterations in iterative methods.
- It works by changing the way eigenvalues are distributed.
- Preconditioners need to be computationally cheap to apply and should reduce the condition number of the linear system.
- Preconditioning the Conjugate Gradients algorithm
- We explain the basic idea behind preconditioning CG.
- Although we are working with C A C^T, because the Krylov subspace is based on using powers of the matrix, we can replace C and C^T by the simpler matrix M = C^T C.
- Then any preconditioner M that is SPD and such that M A ≈ I can be used to precondition CG.
- The additional requirement is that M should be computationally cheap to apply; the transformation is written out below.
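Written out, the transformation behind this idea is the following (using the chapter's convention M A ≈ I):

```math
C A C^{T} \, y = C b , \qquad x = C^{T} y , \qquad M = C^{T} C \approx A^{-1} .
```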
- Preconditioned Conjugate Gradients algorithm
- We describe how our previous idea can be simplified to reduce the computational cost.
- The matrices C and C^T are replaced by M and disappear entirely from the algorithm.
- The final algorithm is very close to the original CG algorithm.
- Preconditioned Conjugate Gradients code
- We provide a short Julia code implementing the PCG algorithm.
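Below is a minimal sketch of PCG in Julia, following the convention that the preconditioner is applied as z = M r with M ≈ A⁻¹. The function name `pcg` and the `apply_M` argument are illustrative assumptions, not the chapter's exact interface.

```julia
using LinearAlgebra

# Minimal preconditioned CG sketch; `apply_M(r)` should return M*r with M ≈ inv(A).
# Illustrative only; the chapter's code may differ in names and stopping criterion.
function pcg(A, b, apply_M; x0 = zeros(length(b)), tol = 1e-8, maxiter = length(b))
    x = copy(x0)
    r = b - A * x
    z = apply_M(r)             # preconditioned residual
    p = copy(z)
    rz_old = dot(r, z)
    for _ in 1:maxiter
        Ap = A * p
        α = rz_old / dot(p, Ap)
        x .+= α .* p
        r .-= α .* Ap
        norm(r) < tol && break
        z = apply_M(r)
        rz_new = dot(r, z)
        β = rz_new / rz_old
        p = z .+ β .* p        # preconditioned search direction update
        rz_old = rz_new
    end
    return x
end
```

Note that only applications of M (not C or C^T) appear in the loop, matching the simplification described above.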
- Flexible preconditioned conjugate gradient method
- This preconditioning technique is a variant of PCG that allows the preconditioner M(k) to vary at every step.
- This can be very useful in some circumstances.