QR Iteration

QR Iteration#

The derivations from orthogonal iteration lead to the celebrated QR iteration method, a fundamental algorithm for computing the Schur decomposition and eigenvalues of a matrix.

In the QR iteration, we focus on computing the sequence of matrices:

\[ T_k = Q_k^H A Q_k \]

where \(Q_k\) is the sequence of orthogonal matrices produced by the method of orthogonal iteration.

Let’s consider the relationship between \(T_k\) and the next matrix in the sequence, \(T_{k+1}\):

\[ T_{k+1} = Q_{k+1}^H A Q_{k+1} \]

Our goal is to find a way to compute \(T_{k+1}\) directly from \(T_k\) without needing to compute \(A\) or the full \(Q_k\) matrices.

For the method of QR iteration, the matrix \(Q_k\) is assumed to be square (i.e., we are computing all eigenvalues of a square matrix \(A\)).

Derivation from Orthogonal Iteration#

The derivation proceeds in two main steps.

Step 1: Relate \(T_k\) to a QR Decomposition#

From the definition of orthogonal iteration, we have the relation \(A Q_k = Q_{k+1} R_{k+1}\). We can substitute this into the definition of \(T_k\):

\[ T_k = Q_k^H (A Q_k) = Q_k^H (Q_{k+1} R_{k+1}) \]

Let’s define a new unitary matrix \(U_{k+1}\) as:

\[ U_{k+1} \overset{def}{=} Q_k^H Q_{k+1} \]

As \(k\) increases, \(Q_k\) converges to the Schur orthogonal transformation. Therefore, \(U_{k+1}\) can be interpreted as a small incremental rotation that updates \(Q_k\) to \(Q_{k+1}\).

Substituting this definition, we get:

\[ T_k = U_{k+1} R_{k+1} \]

This equation is simply the QR decomposition of the matrix \(T_k\). This is a key insight: \(U_{k+1}\) and \(R_{k+1}\) can be found by performing a QR decomposition on \(T_k\), without any knowledge of \(A\), \(Q_k\), or \(Q_{k+1}\).

Step 2: Relate \(T_{k+1}\) to \(T_k\)#

Now let’s find an expression for \(T_{k+1}\) in terms of \(T_k\). We start with the definition \(T_k = Q_k^H A Q_k\), which (assuming \(Q_k\) is square) implies \(A = Q_k T_k Q_k^H\).

Substitute this expression for \(A\) into the definition of \(T_{k+1}\):

\[ T_{k+1} = Q_{k+1}^H A Q_{k+1} = Q_{k+1}^H (Q_k T_k Q_k^H) Q_{k+1} \]

By grouping the terms, we get:

\[ T_{k+1} = (Q_{k+1}^H Q_k) T_k (Q_k^H Q_{k+1}) \]

Recall our definition \(U_{k+1} = Q_k^H Q_{k+1}\). This means \(U_{k+1}^H = (Q_k^H Q_{k+1})^H = Q_{k+1}^H Q_k\). Substituting these into the equation for \(T_{k+1}\) gives:

\[ T_{k+1} = U_{k+1}^H T_k U_{k+1} \]

This shows that \(T_k\) and \(T_{k+1}\) are unitarily similar, which confirms they have the same eigenvalues.

Now, we combine the results from Step 1 and Step 2.

From Step 1: \(T_k = U_{k+1} R_{k+1}\).

Left-multiplying by \(U_{k+1}^H\), we get

\[U_{k+1}^H T_k = U_{k+1}^H (U_{k+1} R_{k+1}) = R_{k+1}.\]

Now substitute this into our equation for \(T_{k+1}\):

\[ T_{k+1} = (U_{k+1}^H T_k) U_{k+1} = R_{k+1} U_{k+1} \]

This provides the final, simple relationship:

\(T_k = U_{k+1} R_{k+1}\)
\(T_{k+1} = R_{k+1} U_{k+1}\)

This is the basis for the method of QR iteration.

Summary of the Key Steps#

The derivation shows that we can generate the sequence \(T_{k+1}\) from \(T_k\) using two simple steps:

Step 1: Compute the QR decomposition of \(T_k\): \(T_k = U_{k+1} R_{k+1}\)
Step 2: Compute \(T_{k+1}\) by multiplying the factors in reverse order: \(T_{k+1} = R_{k+1} U_{k+1}\)

We just switch the order of the terms! We can compute the entire sequence of \(T_k\) matrices starting from \(T_0 = A\) without ever needing to form the \(Q_k\) matrices from the orthogonal iteration.

This QR iteration produces the same sequence of \(R_k\) matrices as the orthogonal iteration.

From \(U_{k+1} = Q_k^H Q_{k+1}\), we can show by induction that the product of the unitary factors from the QR iteration, \(U_1 \cdots U_k\), is equal to the \(Q_k\) matrix from the orthogonal iteration (assuming \(Q_0 = I\)). This product therefore converges to the orthogonal matrix \(Q\) from the Schur decomposition.

Recall from our previous section that

\[ A^k = Q_k \, R_k R_{k-1}\cdots R_1. \]

Using the new sequence of \(U_k\) matrices from the QR iteration, we can write

\[ A^k = (U_1 U_2 \cdots U_k) (R_k R_{k-1}\cdots R_1). \]

We restate this important result as a formal theorem.

Theorem 25 (Relationship Between \(A^k\) and the QR Iteration Factors)

Let \(A\) be a square matrix. Let the sequences of matrices \(\{T_k\}\), \(\{U_k\}\), and \(\{R_k\}\) be generated by the QR iteration algorithm, defined as:

Initialize \(T_0 = A\).
For \(k=0, 1, 2, \dots\):
- Compute the QR decomposition: \(T_k = U_{k+1} R_{k+1}\)
- Compute the next iterate: \(T_{k+1} = R_{k+1} U_{k+1}\)

where \(U_{k+1}\) is a unitary matrix and \(R_{k+1}\) is an upper triangular matrix.

If we define the product matrices \(\tilde{Q}_k\) and \(\tilde{R}_k\) as:

\(\tilde{Q}_k = U_1 U_2 \cdots U_k\)
\(\tilde{R}_k = R_k R_{k-1} \cdots R_1\)

Then for any integer \(k \ge 1\), the \(k\)-th power of the matrix \(A\) is given by:

\[ A^k = \tilde{Q}_k \tilde{R}_k \]

The proof is provided at the end of this section.

QR Iteration Algorithm#

This leads to the following simple algorithm, often called the “unshifted” QR iteration.

import numpy as np

# We use np.copy() so the original matrix A is not modified
Tk = np.copy(A)

# The 'not_converged' condition is abstract.
# In a real implementation, you would check if the
# sub-diagonal elements are all close to zero.
# For this example, we'll just run a fixed number of iterations.
num_iterations = 20

for k in range(num_iterations):
    # Step 1: Compute the QR decomposition of the current matrix
    # In the derivation's notation: T_k = U_{k+1} R_{k+1}
    # numpy.linalg.qr() returns Q, R (which are Uk, Rk here)
    Uk, Rk = np.linalg.qr(Tk)
    
    # Step 2: Compute the next matrix by reversing the factors
    # In the derivation's notation: T_{k+1} = R_{k+1} U_{k+1}
    # We store the new T_{k+1} back into the Tk variable
    # In Python, @ is the operator for matrix multiplication.
    Tk = Rk @ Uk

First Few Iterations#

The algorithm produces a sequence of matrices \(T_0, T_1, T_2, \dots\) where \(T_k\) converges to an upper-triangular (Schur) form.

Start: \(T_0 = A\)
Iteration 1:
1. Compute QR decomposition: \(T_0 = U_1 R_1\)
2. Compute next matrix: \(T_1 = R_1 U_1\)
Iteration 2:
1. Compute QR decomposition: \(T_1 = U_2 R_2\)
2. Compute next matrix: \(T_2 = R_2 U_2\)
Iteration 3:
1. Compute QR decomposition: \(T_2 = U_3 R_3\)
2. Compute next matrix: \(T_3 = R_3 U_3\)
… and so on.

Proof of Decomposition of \(A^k\)#

We provide a direct proof of the following statement \(P(k)\):

\[ P(k): \quad A^k = (U_1 U_2 \cdots U_k) (R_k R_{k-1}\cdots R_1) \]

Proof. We will use a proof by induction. To simplify notation, let’s define two product matrices:

\(\tilde{Q}_k = U_1 U_2 \cdots U_k\)
\(\tilde{R}_k = R_k R_{k-1} \cdots R_1\)

Our goal is to prove \(A^k = \tilde{Q}_k \tilde{R}_k\) for all \(k \ge 1\).

Lemma: \(A = \tilde{Q}_k T_k \tilde{Q}_k^H\)

Before the main proof, let’s first establish a key relationship between \(A\) and \(T_k\). We will prove by induction that \(T_k = \tilde{Q}_k^H A \tilde{Q}_k\). Note that this is not necessarily an obvious result since we start from the QR iteration, not the orthogonal iteration.

Base Case (k=1):

From the algorithm, \(T_1 = R_1 U_1\).

From the first step (\(k=0\)), we have \(T_0 = U_1 R_1\). Since \(T_0 = A\), this means \(A = U_1 R_1\).

Because \(U_1\) is unitary, \(U_1^H = U_1^{-1}\), so we can write \(R_1 = U_1^H A\). Substitute this into the equation for \(T_1\):

\[ T_1 = (U_1^H A) U_1 = U_1^H A U_1 \]

By definition, \(\tilde{Q}_1 = U_1\), so \(T_1 = \tilde{Q}_1^H A \tilde{Q}_1\). The base case holds.

Inductive Step:

Assume \(T_n = \tilde{Q}_n^H A \tilde{Q}_n\) for some \(n \ge 1\). We want to show \(T_{n+1} = \tilde{Q}_{n+1}^H A \tilde{Q}_{n+1}\).

From the algorithm, \(T_{n+1} = R_{n+1} U_{n+1}\).

From the QR decomposition step, \(T_n = U_{n+1} R_{n+1}\), which implies \(R_{n+1} = U_{n+1}^H T_n\).

Substitute this into the equation for \(T_{n+1}\):

\[ T_{n+1} = (U_{n+1}^H T_n) U_{n+1} = U_{n+1}^H T_n U_{n+1} \]

Now, substitute the inductive hypothesis for \(T_n\):

\[ T_{n+1} = U_{n+1}^H (\tilde{Q}_n^H A \tilde{Q}_n) U_{n+1} \]

Grouping the terms:

\[ T_{n+1} = (U_{n+1}^H \tilde{Q}_n^H) A (\tilde{Q}_n U_{n+1}) \]

Using the property \((BC)^H = C^H B^H\), we get:

\[ T_{n+1} = (\tilde{Q}_n U_{n+1})^H A (\tilde{Q}_n U_{n+1}) \]

By definition, \(\tilde{Q}_{n+1} = \tilde{Q}_n U_{n+1}\). Therefore:

\[ T_{n+1} = \tilde{Q}_{n+1}^H A \tilde{Q}_{n+1} \]

This proves the lemma. Since \(\tilde{Q}_k\) is unitary, we can rearrange this to \(A = \tilde{Q}_k T_k \tilde{Q}_k^H\).

Main Proof

Now we proceed with the main proof for \(A^k = \tilde{Q}_k \tilde{R}_k\).

Base Case (k=1)

We need to show \(A^1 = \tilde{Q}_1 \tilde{R}_1\).

\(\tilde{Q}_1 = U_1\)
\(\tilde{R}_1 = R_1\)

So we must show \(A = U_1 R_1\). From the first step of the QR iteration (\(k=0\)):

\[ T_0 = U_1 R_1 \]

By definition, \(T_0 = A\). Therefore, \(A = U_1 R_1\). The base case holds.

Inductive Hypothesis

Assume the statement \(P(n)\) is true for some integer \(n \ge 1\):

\[ A^n = \tilde{Q}_n \tilde{R}_n = (U_1 \cdots U_n) (R_n \cdots R_1) \]

Inductive Step

We must prove the statement \(P(n+1)\) is true. We want to show \(A^{n+1} = \tilde{Q}_{n+1} \tilde{R}_{n+1}\).

Let’s start with \(A^{n+1}\):

\[ A^{n+1} = A \cdot A^n \]

Substitute the inductive hypothesis for \(A^n\):

\[ A^{n+1} = A \cdot (\tilde{Q}_n \tilde{R}_n) \]

Now, use our lemma \(A = \tilde{Q}_k T_k \tilde{Q}_k^H\). For \(k=n\), we have \(A = \tilde{Q}_n T_n \tilde{Q}_n^H\). Substitute this for \(A\):

\[ A^{n+1} = (\tilde{Q}_n T_n \tilde{Q}_n^H) \cdot (\tilde{Q}_n \tilde{R}_n) \]

Since \(\tilde{Q}_n\) is a product of unitary matrices, it is also unitary, which means \(\tilde{Q}_n^H \tilde{Q}_n = I\) (the identity matrix).

\[ A^{n+1} = \tilde{Q}_n T_n (I) \tilde{R}_n = \tilde{Q}_n T_n \tilde{R}_n \]

Now, use the QR algorithm’s definition for \(T_n\): \(T_n = U_{n+1} R_{n+1}\).

\[ A^{n+1} = \tilde{Q}_n (U_{n+1} R_{n+1}) \tilde{R}_n \]

Regroup the terms using the associativity of matrix multiplication:

\[ A^{n+1} = (\tilde{Q}_n U_{n+1}) (R_{n+1} \tilde{R}_n) \]

By our definitions:

\((\tilde{Q}_n U_{n+1}) = (U_1 \cdots U_n) U_{n+1} = \tilde{Q}_{n+1}\)
\((R_{n+1} \tilde{R}_n) = R_{n+1} (R_n \cdots R_1) = \tilde{R}_{n+1}\)

Substituting these back, we get:

\[ A^{n+1} = \tilde{Q}_{n+1} \tilde{R}_{n+1} \]

This is exactly the statement \(P(n+1)\).

We have shown that the statement is true for \(k=1\), and that if it is true for \(k=n\), it is also true for \(k=n+1\). Therefore, by induction, the statement \(A^k = (U_1 U_2 \cdots U_k) (R_k R_{k-1}\cdots R_1)\) is true for all integers \(k \ge 1\).