The name conjugate gradients comes from the fact that the search directions are orthogonal to each other when measured in the right inner product. Proving this requires a special orthogonality relation.
Definition. We say that two non-zero vectors $u$ and $v$ are $A$-conjugate, or conjugate with respect to $A$, if

$$u^T A v = 0.$$
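As a small illustration of the definition, the sketch below checks conjugacy for one pair of vectors. The 2x2 matrix `A` and the vectors `u` and `v` are hypothetical values chosen for this example, not taken from the notes. It shows that two vectors can be conjugate with respect to a matrix without being orthogonal in the usual sense.

```python
import numpy as np

# Hypothetical symmetric positive definite matrix, chosen for illustration.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

u = np.array([1.0, 0.0])
# A u = [2, 1], so any v orthogonal to [2, 1] is conjugate to u.
v = np.array([1.0, -2.0])

a_inner = u @ A @ v   # inner product weighted by A
euclid = u @ v        # ordinary dot product

print(a_inner)  # 0.0: u and v are conjugate with respect to A
print(euclid)   # 1.0: but they are not orthogonal in the Euclidean sense
```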
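Denote by $\Delta x^{(k)} = x^{(k+1)} - x^{(k)}$ the increment between two consecutive CG iterates.
Theorem. The solution increments $\Delta x^{(k)}$ and $\Delta x^{(l)}$, $k \neq l$, are $A$-conjugate.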
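Proof. Recall that $x^{(k)}$ is the vector in the Krylov subspace $\mathcal{K}_k$ that minimizes the $A$-norm of the error $x - x^{(k)}$, where $Ax = b$. The error is therefore $A$-orthogonal to $\mathcal{K}_k$: for all $y \in \mathcal{K}_k$,

$$y^T A \, (x - x^{(k)}) = y^T r^{(k)} = 0, \qquad r^{(k)} = b - A x^{(k)}.$$

This is a key result: the residual $r^{(k)}$ is orthogonal to the Krylov subspace $\mathcal{K}_k$.

Take $l < k$ and denote by $\Delta x^{(l)} = x^{(l+1)} - x^{(l)}$ the earlier increment. Then we have that $\Delta x^{(l)} \in \mathcal{K}_{l+1} \subseteq \mathcal{K}_k \subseteq \mathcal{K}_{k+1}$ and

$$(\Delta x^{(l)})^T A \, \Delta x^{(k)} = (\Delta x^{(l)})^T (A x^{(k+1)} - A x^{(k)}) = (\Delta x^{(l)})^T (r^{(k)} - r^{(k+1)}).$$

But:

$$(\Delta x^{(l)})^T r^{(k)} = 0 \quad \text{and} \quad (\Delta x^{(l)})^T r^{(k+1)} = 0,$$

because $r^{(k)}$ is orthogonal to $\mathcal{K}_k$ and $r^{(k+1)}$ is orthogonal to $\mathcal{K}_{k+1}$. Hence $\Delta x^{(k)}$ and $\Delta x^{(l)}$ are $A$-conjugate.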
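Both the theorem and the key orthogonality result can be checked numerically. The sketch below is a minimal textbook CG loop, assuming a start at $x = 0$ and a randomly generated symmetric positive definite test matrix. The function name `cg_iterates`, the matrix size, and the random seed are choices made for this illustration. It measures how far the increments are from being conjugate and how far each residual is from being orthogonal to its Krylov subspace.

```python
import numpy as np

def cg_iterates(A, b, num_steps):
    """Run plain CG from x = 0; return the lists of iterates and residuals."""
    x = np.zeros_like(b)
    r = b.copy()              # residual b - A x at x = 0
    p = r.copy()              # first search direction
    xs, rs = [x.copy()], [r.copy()]
    for _ in range(num_steps):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        beta = (r_new @ r_new) / (r @ r)
        p = r_new + beta * p
        r = r_new
        xs.append(x.copy())
        rs.append(r.copy())
    return xs, rs

# Assumed test problem: random SPD matrix and right-hand side.
rng = np.random.default_rng(0)
n, num_steps = 8, 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
b = rng.standard_normal(n)

xs, rs = cg_iterates(A, b, num_steps)

# Rows of dx are the increments x^(k+1) - x^(k).
dx = np.array([xs[k + 1] - xs[k] for k in range(num_steps)])

# Gram matrix of the increments in the A-inner product, scaled to unit diagonal.
G = dx @ A @ dx.T
d = np.sqrt(np.diag(G))
C = G / np.outer(d, d)
max_conj = np.abs(C - np.eye(num_steps)).max()

# Normalized Krylov vectors b, A b, ..., A^(num_steps-1) b as columns of K.
K = np.column_stack([np.linalg.matrix_power(A, j) @ b for j in range(num_steps)])
K = K / np.linalg.norm(K, axis=0)

# Largest cosine between r^(k) and the vectors spanning K_k.
max_res = max(
    np.abs(K[:, :k].T @ rs[k]).max() / np.linalg.norm(rs[k])
    for k in range(1, num_steps + 1)
)

print("largest off-diagonal conjugacy term:", max_conj)
print("largest residual/Krylov cosine:", max_res)
```

In exact arithmetic both printed quantities would be zero. In floating point they should be small compared to 1, limited only by rounding error.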
The steps in CG can be visualized as shown below. If we multiply the vectors by $A^{1/2}$, then each step is orthogonal to all the previous ones, since $(A^{1/2} u)^T (A^{1/2} v) = u^T A v$. This looks like a street map of Manhattan!
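The Manhattan picture can be reproduced on a two-dimensional problem. The sketch below assumes a hypothetical 2x2 symmetric positive definite matrix and right-hand side, and builds the symmetric square root of the matrix from its eigendecomposition. It then compares the angle between the two CG steps before and after multiplying them by that square root.

```python
import numpy as np

# Hypothetical SPD matrix and right-hand side, chosen for illustration.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

# Symmetric square root from the eigendecomposition A = Q diag(w) Q^T.
w, Q = np.linalg.eigh(A)
A_half = Q @ np.diag(np.sqrt(w)) @ Q.T

# Two CG steps from x = 0, recording each increment.
x = np.zeros(2)
r = b.copy()
p = r.copy()
steps = []
for _ in range(2):
    Ap = A @ p
    alpha = (r @ r) / (p @ Ap)
    steps.append(alpha * p)
    x = x + alpha * p
    r_new = r - alpha * Ap
    p = r_new + ((r_new @ r_new) / (r @ r)) * p
    r = r_new

def cosine(a, c):
    """Cosine of the angle between two vectors."""
    return (a @ c) / (np.linalg.norm(a) * np.linalg.norm(c))

cos_plain = cosine(steps[0], steps[1])
cos_scaled = cosine(A_half @ steps[0], A_half @ steps[1])

print("cosine between raw steps:   ", cos_plain)
print("cosine between scaled steps:", cos_scaled)
```

The raw steps meet at an oblique angle, while the scaled steps should be perpendicular up to rounding error, which is the right-angle street grid described above.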