QR Iteration#
The derivations from orthogonal iteration lead to the celebrated QR iteration method, a fundamental algorithm for computing the Schur decomposition and eigenvalues of a matrix.
In the QR iteration, we focus on computing the sequence of matrices:
where \(Q_k\) is the sequence of orthogonal matrices produced by the method of orthogonal iteration.
Let’s consider the relationship between \(T_k\) and the next matrix in the sequence, \(T_{k+1}\):
Our goal is to find a way to compute \(T_{k+1}\) directly from \(T_k\) without needing to compute \(A\) or the full \(Q_k\) matrices.
For the method of QR iteration, the matrix \(Q_k\) is assumed to be square (i.e., we are computing all eigenvalues of a square matrix \(A\)).
Derivation from Orthogonal Iteration#
The derivation proceeds in two main steps.
Step 1: Relate \(T_k\) to a QR Decomposition#
From the definition of orthogonal iteration, we have the relation \(A Q_k = Q_{k+1} R_{k+1}\). We can substitute this into the definition of \(T_k\):
Let’s define a new unitary matrix \(U_{k+1}\) as:
As \(k\) increases, \(Q_k\) converges to the Schur orthogonal transformation. Therefore, \(U_{k+1}\) can be interpreted as a small incremental rotation that updates \(Q_k\) to \(Q_{k+1}\).
Substituting this definition, we get:
This equation is simply the QR decomposition of the matrix \(T_k\). This is a key insight: \(U_{k+1}\) and \(R_{k+1}\) can be found by performing a QR decomposition on \(T_k\), without any knowledge of \(A\), \(Q_k\), or \(Q_{k+1}\).
Step 2: Relate \(T_{k+1}\) to \(T_k\)#
Now let’s find an expression for \(T_{k+1}\) in terms of \(T_k\). We start with the definition \(T_k = Q_k^H A Q_k\), which (assuming \(Q_k\) is square) implies \(A = Q_k T_k Q_k^H\).
Substitute this expression for \(A\) into the definition of \(T_{k+1}\):
By grouping the terms, we get:
Recall our definition \(U_{k+1} = Q_k^H Q_{k+1}\). This means \(U_{k+1}^H = (Q_k^H Q_{k+1})^H = Q_{k+1}^H Q_k\). Substituting these into the equation for \(T_{k+1}\) gives:
This shows that \(T_k\) and \(T_{k+1}\) are unitarily similar, which confirms they have the same eigenvalues.
Now, we combine the results from Step 1 and Step 2.
From Step 1: \(T_k = U_{k+1} R_{k+1}\).
Left-multiplying by \(U_{k+1}^H\), we get
Now substitute this into our equation for \(T_{k+1}\):
This provides the final, simple relationship:
\(T_k = U_{k+1} R_{k+1}\)
\(T_{k+1} = R_{k+1} U_{k+1}\)
This is the basis for the method of QR iteration.
Summary of the Key Steps#
The derivation shows that we can generate the sequence \(T_{k+1}\) from \(T_k\) using two simple steps:
Step 1: Compute the QR decomposition of \(T_k\): \(T_k = U_{k+1} R_{k+1}\)
Step 2: Compute \(T_{k+1}\) by multiplying the factors in reverse order: \(T_{k+1} = R_{k+1} U_{k+1}\)
We just switch the order of the terms! We can compute the entire sequence of \(T_k\) matrices starting from \(T_0 = A\) without ever needing to form the \(Q_k\) matrices from the orthogonal iteration.
This QR iteration produces the same sequence of \(R_k\) matrices as the orthogonal iteration.
From \(U_{k+1} = Q_k^H Q_{k+1}\), we can show by induction that the product of the unitary factors from the QR iteration, \(U_1 \cdots U_k\), is equal to the \(Q_k\) matrix from the orthogonal iteration (assuming \(Q_0 = I\)). This product therefore converges to the orthogonal matrix \(Q\) from the Schur decomposition.
Recall from our previous section that
Using the new sequence of \(U_k\) matrices from the QR iteration, we can write
We restate this important result as a formal theorem.
Theorem 25 (Relationship Between \(A^k\) and the QR Iteration Factors)
Let \(A\) be a square matrix. Let the sequences of matrices \(\{T_k\}\), \(\{U_k\}\), and \(\{R_k\}\) be generated by the QR iteration algorithm, defined as:
Initialize \(T_0 = A\).
For \(k=0, 1, 2, \dots\):
Compute the QR decomposition: \(T_k = U_{k+1} R_{k+1}\)
Compute the next iterate: \(T_{k+1} = R_{k+1} U_{k+1}\)
where \(U_{k+1}\) is a unitary matrix and \(R_{k+1}\) is an upper triangular matrix.
If we define the product matrices \(\tilde{Q}_k\) and \(\tilde{R}_k\) as:
\(\tilde{Q}_k = U_1 U_2 \cdots U_k\)
\(\tilde{R}_k = R_k R_{k-1} \cdots R_1\)
Then for any integer \(k \ge 1\), the \(k\)-th power of the matrix \(A\) is given by:
The proof is provided at the end of this section.
QR Iteration Algorithm#
This leads to the following simple algorithm, often called the “unshifted” QR iteration.
import numpy as np
# We use np.copy() so the original matrix A is not modified
Tk = np.copy(A)
# The 'not_converged' condition is abstract.
# In a real implementation, you would check if the
# sub-diagonal elements are all close to zero.
# For this example, we'll just run a fixed number of iterations.
num_iterations = 20
for k in range(num_iterations):
# Step 1: Compute the QR decomposition of the current matrix
# In the derivation's notation: T_k = U_{k+1} R_{k+1}
# numpy.linalg.qr() returns Q, R (which are Uk, Rk here)
Uk, Rk = np.linalg.qr(Tk)
# Step 2: Compute the next matrix by reversing the factors
# In the derivation's notation: T_{k+1} = R_{k+1} U_{k+1}
# We store the new T_{k+1} back into the Tk variable
# In Python, @ is the operator for matrix multiplication.
Tk = Rk @ Uk
First Few Iterations#
The algorithm produces a sequence of matrices \(T_0, T_1, T_2, \dots\) where \(T_k\) converges to an upper-triangular (Schur) form.
Start: \(T_0 = A\)
Iteration 1:
Compute QR decomposition: \(T_0 = U_1 R_1\)
Compute next matrix: \(T_1 = R_1 U_1\)
Iteration 2:
Compute QR decomposition: \(T_1 = U_2 R_2\)
Compute next matrix: \(T_2 = R_2 U_2\)
Iteration 3:
Compute QR decomposition: \(T_2 = U_3 R_3\)
Compute next matrix: \(T_3 = R_3 U_3\)
… and so on.
Proof of Decomposition of \(A^k\)#
We provide a direct proof of the following statement \(P(k)\):
Proof. We will use a proof by induction. To simplify notation, let’s define two product matrices:
\(\tilde{Q}_k = U_1 U_2 \cdots U_k\)
\(\tilde{R}_k = R_k R_{k-1} \cdots R_1\)
Our goal is to prove \(A^k = \tilde{Q}_k \tilde{R}_k\) for all \(k \ge 1\).
Lemma: \(A = \tilde{Q}_k T_k \tilde{Q}_k^H\)
Before the main proof, let’s first establish a key relationship between \(A\) and \(T_k\). We will prove by induction that \(T_k = \tilde{Q}_k^H A \tilde{Q}_k\). Note that this is not necessarily an obvious result since we start from the QR iteration, not the orthogonal iteration.
Base Case (k=1):
From the algorithm, \(T_1 = R_1 U_1\).
From the first step (\(k=0\)), we have \(T_0 = U_1 R_1\). Since \(T_0 = A\), this means \(A = U_1 R_1\).
Because \(U_1\) is unitary, \(U_1^H = U_1^{-1}\), so we can write \(R_1 = U_1^H A\). Substitute this into the equation for \(T_1\):
By definition, \(\tilde{Q}_1 = U_1\), so \(T_1 = \tilde{Q}_1^H A \tilde{Q}_1\). The base case holds.
Inductive Step:
Assume \(T_n = \tilde{Q}_n^H A \tilde{Q}_n\) for some \(n \ge 1\). We want to show \(T_{n+1} = \tilde{Q}_{n+1}^H A \tilde{Q}_{n+1}\).
From the algorithm, \(T_{n+1} = R_{n+1} U_{n+1}\).
From the QR decomposition step, \(T_n = U_{n+1} R_{n+1}\), which implies \(R_{n+1} = U_{n+1}^H T_n\).
Substitute this into the equation for \(T_{n+1}\):
Now, substitute the inductive hypothesis for \(T_n\):
Grouping the terms:
Using the property \((BC)^H = C^H B^H\), we get:
By definition, \(\tilde{Q}_{n+1} = \tilde{Q}_n U_{n+1}\). Therefore:
This proves the lemma. Since \(\tilde{Q}_k\) is unitary, we can rearrange this to \(A = \tilde{Q}_k T_k \tilde{Q}_k^H\).
Main Proof
Now we proceed with the main proof for \(A^k = \tilde{Q}_k \tilde{R}_k\).
Base Case (k=1)
We need to show \(A^1 = \tilde{Q}_1 \tilde{R}_1\).
\(\tilde{Q}_1 = U_1\)
\(\tilde{R}_1 = R_1\)
So we must show \(A = U_1 R_1\). From the first step of the QR iteration (\(k=0\)):
By definition, \(T_0 = A\). Therefore, \(A = U_1 R_1\). The base case holds.
Inductive Hypothesis
Assume the statement \(P(n)\) is true for some integer \(n \ge 1\):
Inductive Step
We must prove the statement \(P(n+1)\) is true. We want to show \(A^{n+1} = \tilde{Q}_{n+1} \tilde{R}_{n+1}\).
Let’s start with \(A^{n+1}\):
Substitute the inductive hypothesis for \(A^n\):
Now, use our lemma \(A = \tilde{Q}_k T_k \tilde{Q}_k^H\). For \(k=n\), we have \(A = \tilde{Q}_n T_n \tilde{Q}_n^H\). Substitute this for \(A\):
Since \(\tilde{Q}_n\) is a product of unitary matrices, it is also unitary, which means \(\tilde{Q}_n^H \tilde{Q}_n = I\) (the identity matrix).
Now, use the QR algorithm’s definition for \(T_n\): \(T_n = U_{n+1} R_{n+1}\).
Regroup the terms using the associativity of matrix multiplication:
By our definitions:
\((\tilde{Q}_n U_{n+1}) = (U_1 \cdots U_n) U_{n+1} = \tilde{Q}_{n+1}\)
\((R_{n+1} \tilde{R}_n) = R_{n+1} (R_n \cdots R_1) = \tilde{R}_{n+1}\)
Substituting these back, we get:
This is exactly the statement \(P(n+1)\).
We have shown that the statement is true for \(k=1\), and that if it is true for \(k=n\), it is also true for \(k=n+1\). Therefore, by induction, the statement \(A^k = (U_1 U_2 \cdots U_k) (R_k R_{k-1}\cdots R_1)\) is true for all integers \(k \ge 1\).