
Diagonalisation of symmetric and Hermitian matrices

In this last section we discuss properties of symmetric and Hermitian matrices which have applications in classical physics and quantum mechanics. The theory outlined for Hermitian matrices is closely related to Sturm-Liouville theory which defines properties of solution sets in boundary value problems. The latter arise in the separation of variables procedure for partial differential equations which are covered in the Year II Mathematics course in the context of heat and mass transfer.

Recall that a matrix is symmetric iff it is equal to its transpose, i.e.

A=A^{\top} .

Definition of Hermitian conjugate

If $A$ is a matrix with complex entries, the Hermitian conjugate of $A$, denoted by $A^{\dagger}$, is obtained by taking the complex conjugate of every entry and then transposing, i.e.

\left(A^{\dagger}\right)_{i j}=\bar{a}_{j i}

A complex matrix $A$ is called Hermitian if

A^{\dagger}=A

In particular, every real symmetric matrix is Hermitian. We also note that the Hermitian conjugate of the product of two matrices is the product of their conjugates taken in reverse order,

(A B)^{\dagger}=B^{\dagger} A^{\dagger}

which follows from the same identity for transpose matrices and the fact that the complex conjugate of a product is the product of complex conjugates, i.e.

\bar{\omega} \bar{z}=\overline{\omega z} .
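As a quick numerical illustration, here is a minimal NumPy sketch of the product rule for Hermitian conjugates; the matrices below are arbitrary illustrative values, not taken from the notes.

```python
import numpy as np

# Hermitian conjugate: complex conjugate followed by transpose
def dagger(M):
    return M.conj().T

# Two arbitrary complex matrices (illustrative values only)
A = np.array([[1 + 2j, 3], [0, 4 - 1j]])
B = np.array([[2, 1j], [-1j, 5]])

# (AB)^dagger should equal B^dagger A^dagger
print(np.allclose(dagger(A @ B), dagger(B) @ dagger(A)))  # True

# B is Hermitian: it equals its own Hermitian conjugate
print(np.allclose(dagger(B), B))  # True
```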

The diagonalisation of symmetric matrices is particularly easy as all eigenvalues are real and the eigenvectors can be chosen to form an orthonormal set (mutually orthogonal vectors of unit length). We recall here that the dot product of two real vectors can be written using column vector notation as,

\boldsymbol{e}_{1} \cdot \boldsymbol{e}_{2}=\boldsymbol{e}_{1}^{\top} \boldsymbol{e}_{2}

For complex-valued vectors the generalisation of the dot product involves the Hermitian conjugate. For two complex-valued vectors $\boldsymbol{u}$ and $\boldsymbol{v}$, the dot product is defined as,

\boldsymbol{u} \cdot \boldsymbol{v}=\boldsymbol{u}^{\dagger} \boldsymbol{v}
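In NumPy this complex dot product can be computed either explicitly or with np.vdot, which conjugates its first argument; a minimal sketch with illustrative vectors:

```python
import numpy as np

u = np.array([1 + 1j, 2 - 1j])
v = np.array([3j, 1 + 2j])

# u . v = u^dagger v : conjugate-transpose the first vector
explicit = u.conj().T @ v
print(explicit)

# np.vdot conjugates its first argument, giving the same result
print(np.vdot(u, v))
print(np.isclose(explicit, np.vdot(u, v)))  # True
```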

Hermitian matrices are orthogonally diagonalisable

While the full proofs are beyond the scope of the course, the results are important. The key message is that Hermitian, and in particular symmetric, matrices are orthogonally diagonalisable.

Theorem 10.1 For a symmetric matrix $A$, we may diagonalise $A$ with respect to an orthonormal set of eigenvectors.

The first step is to understand that the eigenvalues of a Hermitian matrix are real; for a real symmetric matrix this also means that the eigenvectors can be chosen to be real.

Theorem 10.2 For an $n \times n$ Hermitian matrix $A$ all eigenvalues are real.

Proof Consider an eigenvector $\boldsymbol{x}$ with a possibly complex eigenvalue $\lambda$. We want to show that $\lambda$ is real, which follows from $\lambda=\bar{\lambda}$. Starting from,

A \boldsymbol{x}=\lambda \boldsymbol{x},

and multiplying on the left by the Hermitian conjugate of $\boldsymbol{x}$,

\boldsymbol{x}^{\dagger} A \boldsymbol{x}=\lambda \boldsymbol{x}^{\dagger} \boldsymbol{x} .

Next, we take the Hermitian conjugate of both sides,

\left(\boldsymbol{x}^{\dagger} A \boldsymbol{x}\right)^{\dagger}=\bar{\lambda}\left(\boldsymbol{x}^{\dagger} \boldsymbol{x}\right)^{\dagger}

Using the product rule for Hermitian conjugates, $(A B)^{\dagger}=B^{\dagger} A^{\dagger}$, and noting that $\boldsymbol{x}^{\dagger} \boldsymbol{x}$ is a scalar so that $\left(\boldsymbol{x}^{\dagger} \boldsymbol{x}\right)^{\dagger}=\boldsymbol{x}^{\dagger} \boldsymbol{x}$, this becomes,

\boldsymbol{x}^{\dagger} A^{\dagger} \boldsymbol{x}=\bar{\lambda} \boldsymbol{x}^{\dagger} \boldsymbol{x}

Since $A$ is Hermitian, $A^{\dagger}=A$, and the previous equation reduces to,

\boldsymbol{x}^{\dagger} A \boldsymbol{x}=\bar{\lambda} \boldsymbol{x}^{\dagger} \boldsymbol{x}

Finally, subtracting this from the earlier relation $\boldsymbol{x}^{\dagger} A \boldsymbol{x}=\lambda \boldsymbol{x}^{\dagger} \boldsymbol{x}$ gives,

\begin{aligned} (\lambda-\bar{\lambda}) \boldsymbol{x}^{\dagger} \boldsymbol{x} & =\boldsymbol{x}^{\dagger} A \boldsymbol{x}-\boldsymbol{x}^{\dagger} A \boldsymbol{x} \\ & =0 . \end{aligned}

Since $\boldsymbol{x}$ is an eigenvector (and therefore nonzero), $\boldsymbol{x}^{\dagger} \boldsymbol{x} \neq 0$, which implies that $\lambda=\bar{\lambda}$ and the eigenvalues are real.
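A quick numerical sanity check of Theorem 10.2; this is a minimal sketch, and the random Hermitian matrix below is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random Hermitian matrix: M + M^dagger is always Hermitian
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = M + M.conj().T

# Eigenvalues of a Hermitian matrix have (numerically) zero imaginary part
eigvals = np.linalg.eigvals(H)
print(np.max(np.abs(eigvals.imag)) < 1e-10)  # True
```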

The second step is to show that eigenvectors corresponding to distinct eigenvalues are orthogonal.

Theorem 10.3 Eigenvectors of Hermitian matrices corresponding to different eigenvalues are orthogonal.

Proof Consider two eigenvectors $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$ of a Hermitian matrix $A$, corresponding to eigenvalues $\lambda_{1}$ and $\lambda_{2}$ where $\lambda_{1} \neq \lambda_{2}$. We want to show that $\boldsymbol{x}_{1}$ and $\boldsymbol{x}_{2}$ are orthogonal. We start from,

A \boldsymbol{x}_{1}=\lambda_{1} \boldsymbol{x}_{1}, \quad A \boldsymbol{x}_{2}=\lambda_{2} \boldsymbol{x}_{2} .

We multiply the first equation on the left by $\boldsymbol{x}_{2}^{\dagger}$,

\boldsymbol{x}_{2}^{\dagger} A \boldsymbol{x}_{1}=\lambda_{1} \boldsymbol{x}_{2}^{\dagger} \boldsymbol{x}_{1}

and the second by $\boldsymbol{x}_{1}^{\dagger}$,

\boldsymbol{x}_{1}^{\dagger} A \boldsymbol{x}_{2}=\lambda_{2} \boldsymbol{x}_{1}^{\dagger} \boldsymbol{x}_{2}

We take the Hermitian conjugate of the last equation,

\left(\boldsymbol{x}_{1}^{\dagger} A \boldsymbol{x}_{2}\right)^{\dagger}=\overline{\lambda_{2}}\left(\boldsymbol{x}_{1}^{\dagger} \boldsymbol{x}_{2}\right)^{\dagger}

With the same arguments as in Theorem 10.2, this becomes,

\boldsymbol{x}_{2}^{\dagger} A \boldsymbol{x}_{1}=\overline{\lambda_{2}} \boldsymbol{x}_{2}^{\dagger} \boldsymbol{x}_{1}

Finally, subtracting this from the first of the two relations above, $\boldsymbol{x}_{2}^{\dagger} A \boldsymbol{x}_{1}=\lambda_{1} \boldsymbol{x}_{2}^{\dagger} \boldsymbol{x}_{1}$, gives,

\begin{aligned} \left(\lambda_{1}-\overline{\lambda_{2}}\right) \boldsymbol{x}_{2}^{\dagger} \boldsymbol{x}_{1} & =\boldsymbol{x}_{2}^{\dagger} A \boldsymbol{x}_{1}-\boldsymbol{x}_{2}^{\dagger} A \boldsymbol{x}_{1} \\ & =0 \end{aligned}

Note that from Theorem 10.2 we know that $\overline{\lambda_{2}}=\lambda_{2}$ since the eigenvalues of a Hermitian matrix are real. Further, since $\lambda_{1} \neq \lambda_{2}$, we must have $\boldsymbol{x}_{2}^{\dagger} \boldsymbol{x}_{1}=0$, which implies orthogonality.
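Theorem 10.3 can likewise be checked numerically; this minimal sketch uses np.linalg.eigh, which returns the eigenvectors of a Hermitian matrix as orthonormal columns.

```python
import numpy as np

rng = np.random.default_rng(1)

# Random Hermitian matrix (illustrative)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = M + M.conj().T

# eigh returns the eigenvalues and a matrix of eigenvectors
eigvals, V = np.linalg.eigh(H)

# V^dagger V should be the identity: the eigenvectors are orthonormal
print(np.allclose(V.conj().T @ V, np.eye(4)))  # True
```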

The only question left to ask is: how do we obtain orthogonal eigenvectors when several eigenvectors share the same eigenvalue? Answering this fully requires deeper knowledge than is presented in this course, so we simply assume that every eigenvalue has a full set of eigenvectors. Once this is assumed, any set of eigenvectors belonging to a repeated eigenvalue can be turned into an orthogonal set through the Gram-Schmidt process (a minimal sketch is given after the worked example below).

Example 10.13 Let $A$ be

A=\left(\begin{array}{ccc} 3 / 2 & -1 / 2 & 0 \\ -1 / 2 & 3 / 2 & 0 \\ 0 & 0 & 3 \end{array}\right)

Diagonalise the symmetric matrix $A$ with respect to its orthonormal eigenvectors.

Solution We first compute the eigenvalues from,

\begin{aligned} \left|\begin{array}{ccc} 3 / 2-\lambda & -1 / 2 & 0 \\ -1 / 2 & 3 / 2-\lambda & 0 \\ 0 & 0 & 3-\lambda \end{array}\right| & =(3 / 2-\lambda)\left|\begin{array}{cc} 3 / 2-\lambda & 0 \\ 0 & 3-\lambda \end{array}\right|+\frac{1}{2}\left|\begin{array}{cc} -1 / 2 & 0 \\ 0 & 3-\lambda \end{array}\right| \\ & =(3-\lambda)(\lambda-2)(\lambda-1) ; \end{aligned}

hence the eigenvalues are 1, 2, and 3. Next, we compute the eigenvectors. Starting with $\lambda=1$, we need to solve

\left(\begin{array}{ccc} 1 / 2 & -1 / 2 & 0 \\ -1 / 2 & 1 / 2 & 0 \\ 0 & 0 & 2 \end{array}\right)\left(\begin{array}{l} x \\ y \\ z \end{array}\right)=\left(\begin{array}{l} 0 \\ 0 \\ 0 \end{array}\right)

which has solution $x=y$ and $z=0$, and so an eigenvector is $\boldsymbol{x}_{1}=(1,1,0)^{\top}$; the corresponding unit vector is

\boldsymbol{e}_{1}=\left(\begin{array}{c} 1 / \sqrt{2} \\ 1 / \sqrt{2} \\ 0 \end{array}\right)

Similarly, we find the other two unit vectors corresponding to $\lambda=2$ and $\lambda=3$, respectively, as,

\boldsymbol{e}_{2}=\left(\begin{array}{c} 1 / \sqrt{2} \\ -1 / \sqrt{2} \\ 0 \end{array}\right), \quad \boldsymbol{e}_{3}=\left(\begin{array}{l} 0 \\ 0 \\ 1 \end{array}\right)
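These values can be confirmed numerically; a minimal sketch using np.linalg.eigh, which returns the eigenvalues in ascending order (the eigenvector columns may differ by a sign).

```python
import numpy as np

# The matrix A from Example 10.13
A = np.array([[1.5, -0.5, 0.0],
              [-0.5, 1.5, 0.0],
              [0.0, 0.0, 3.0]])

eigvals, eigvecs = np.linalg.eigh(A)
print(eigvals)    # [1. 2. 3.]
print(eigvecs)    # columns are e1, e2, e3 (possibly up to sign)
```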

Recall that the diagonalisation is given by $A=P D P^{-1}$, where $P$ has the three eigenvectors as its columns, as follows,

P=\left(\begin{array}{ccc} 1 / \sqrt{2} & 1 / \sqrt{2} & 0 \\ 1 / \sqrt{2} & -1 / \sqrt{2} & 0 \\ 0 & 0 & 1 \end{array}\right)

Since $P$ is a matrix with orthonormal columns, it is orthogonal, which we know from Subsec. 10.1.4 means that $P^{-1}=P^{\top}$. Further, $D$ is the diagonal matrix whose diagonal entries are the eigenvalues of $A$. Hence, we have

A=\underbrace{\left(\begin{array}{ccc} 1 / \sqrt{2} & 1 / \sqrt{2} & 0 \\ 1 / \sqrt{2} & -1 / \sqrt{2} & 0 \\ 0 & 0 & 1 \end{array}\right)}_{P} \underbrace{\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{array}\right)}_{D} \underbrace{\left(\begin{array}{ccc} 1 / \sqrt{2} & 1 / \sqrt{2} & 0 \\ 1 / \sqrt{2} & -1 / \sqrt{2} & 0 \\ 0 & 0 & 1 \end{array}\right)}_{P^{-1}}
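We can confirm this factorisation numerically; a minimal sketch in which, since $P$ is orthogonal, $P^{-1}$ is taken to be $P^{\top}$.

```python
import numpy as np

s = 1 / np.sqrt(2)
P = np.array([[s, s, 0.0],
              [s, -s, 0.0],
              [0.0, 0.0, 1.0]])
D = np.diag([1.0, 2.0, 3.0])

A = np.array([[1.5, -0.5, 0.0],
              [-0.5, 1.5, 0.0],
              [0.0, 0.0, 3.0]])

# P is orthogonal, so its inverse is its transpose
print(np.allclose(P @ P.T, np.eye(3)))   # True
print(np.allclose(P @ D @ P.T, A))       # True
```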

The biggest advantage of symmetric matrices is that we do not need to find the inverse of the matrix of eigenvectors using Gauss-Jordan, but we can just take the transpose.
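As mentioned before the example, when an eigenvalue is repeated, the eigenvectors belonging to that eigenvalue can be orthonormalised by the Gram-Schmidt process. Here is a minimal sketch of classical Gram-Schmidt; the two input vectors are illustrative values only.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalise a list of linearly independent (possibly complex) vectors."""
    basis = []
    for v in vectors:
        w = v.astype(complex)
        # Subtract the projection onto each vector already in the basis
        for e in basis:
            w = w - np.vdot(e, w) * e
        basis.append(w / np.linalg.norm(w))
    return basis

# Two non-orthogonal vectors spanning the same eigenspace (illustrative values)
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, 0.0, 0.0])

e1, e2 = gram_schmidt([v1, v2])
print(np.isclose(np.vdot(e1, e2), 0))   # True: orthogonal
print(np.isclose(np.linalg.norm(e1), 1), np.isclose(np.linalg.norm(e2), 1))  # unit length
```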

Exercises

  1. Using the Cayley-Hamilton theorem, find the inverse of

    (a) A=\left(\begin{array}{ccc}2 & -1 & 1 \\ -3 / 2 & 5 / 2 & 3 / 2 \\ -1 / 2 & 1 / 2 & 7 / 2\end{array}\right);

    (b) A=\left(\begin{array}{ccc}1 / 3 & -8 / 3 & 4 / 3 \\ -1 & -2 & 2 \\ -4 / 3 & -10 / 3 & 11 / 3\end{array}\right).

  2. Orthogonally diagonalise the following matrices:

    (a) A=\left(\begin{array}{lll}0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 3\end{array}\right);

    (b) A=\left(\begin{array}{ccc}2 & 1 & -1 \\ 1 & 0 & 1 \\ -1 & 1 & 2\end{array}\right).