We know how to compute the search directions $p^{(k)}$. We can now calculate the optimal step sizes $\mu_k$ using the orthogonality relations.

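Recall the basic definitions. The iterate is updated along the search direction $p^{(k+1)}$ with step size $\mu_{k+1}$:

$$x^{(k+1)} = x^{(k)} + \mu_{k+1} \, p^{(k+1)}$$

Multiply by $-A$:

$$-A x^{(k+1)} = -A x^{(k)} - \mu_{k+1} \, A p^{(k+1)}$$

We use the definition of the residual: $r^{(k)} = b - A x^{(k)}$. Adding $b$ to both sides, we get:

$$r^{(k+1)} = r^{(k)} - \mu_{k+1} \, A p^{(k+1)}$$

Recall that $[p^{(k+1)}]^T r^{(k+1)} = 0$. Let’s multiply by $[p^{(k+1)}]^T$ to the left:

$$0 = [p^{(k+1)}]^T r^{(k)} - \mu_{k+1} \, [p^{(k+1)}]^T A p^{(k+1)}
\quad \Longrightarrow \quad
\mu_{k+1} = \frac{[p^{(k+1)}]^T r^{(k)}}{[p^{(k+1)}]^T A p^{(k+1)}}$$

We can simplify it a bit more using our three-term recurrence:

$$p^{(k+1)} = r^{(k)} + \tau_k \, p^{(k)}$$

Take a dot product with $r^{(k)}$. Since $r^{(k)}$ is orthogonal to $p^{(k)}$, the last term vanishes:

$$[r^{(k)}]^T p^{(k+1)} = \|r^{(k)}\|_2^2 + \tau_k \, [r^{(k)}]^T p^{(k)} = \|r^{(k)}\|_2^2$$

We have proved the following result.

Theorem. The optimal step-size in CG is given by:

$$\mu_{k+1} = \frac{\|r^{(k)}\|_2^2}{[p^{(k+1)}]^T A p^{(k+1)}}$$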
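
The step-size formula can be checked numerically. Below is a minimal NumPy sketch, assuming the standard CG conventions: the iterate update `x + mu * p`, the residual `r = b - A x`, and the direction update `p = r_new + tau * p` with `tau` equal to the ratio of squared residual norms. The function name `cg_with_checks`, the variable names, and the random test matrix are all illustrative assumptions, not taken from the text. At each step it compares the simplified step size (squared residual norm over `p^T A p`) with the unsimplified form (`p^T r` over `p^T A p`) and records `p^T r_new`, which should vanish up to rounding.

```python
import numpy as np


def cg_with_checks(A, b, num_steps):
    """Run CG from x = 0 and record the quantities used in the step-size derivation."""
    x = np.zeros_like(b)
    r = b - A @ x            # residual r = b - A x
    p = r.copy()             # first search direction is the initial residual
    records = []
    for _ in range(num_steps):
        Ap = A @ p
        pAp = p @ Ap
        mu = (r @ r) / pAp              # simplified step size: ||r||^2 / (p^T A p)
        mu_projection = (p @ r) / pAp   # unsimplified form: (p^T r) / (p^T A p)
        x = x + mu * p
        r_new = r - mu * Ap             # residual update obtained by multiplying by -A
        records.append((mu, mu_projection, p @ r_new))
        tau = (r_new @ r_new) / (r @ r)  # coefficient of the direction update
        p = r_new + tau * p
        r = r_new
    return x, records


# Assumed test problem: a random symmetric positive definite matrix.
rng = np.random.default_rng(0)
n = 8
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
b = rng.standard_normal(n)

x, records = cg_with_checks(A, b, num_steps=5)
for mu, mu_projection, p_dot_r_new in records:
    print(f"mu = {mu:.6f}   projection form = {mu_projection:.6f}   p^T r_new = {p_dot_r_new:.2e}")
print("residual norm after 5 steps:", np.linalg.norm(b - A @ x))
```

In exact arithmetic the two step-size expressions agree and `p^T r_new` is exactly zero; in floating point they match only up to rounding error, which is why the sketch prints both rather than testing for equality.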