Diagonalisation of matrices

Diagonal matrices are particularly convenient for eigenvalue problems, since the eigenvalues of a diagonal matrix are its diagonal entries $\left\{a_{i i}\right\}$ and the eigenvector corresponding to $a_{i i}$ is simply the $i^{\text {th }}$ coordinate vector. Diagonalised matrices are also useful for computing matrix exponentials, which in turn describe the solutions of linear systems of differential equations.

Consider a $3 \times 3$ matrix $A$ with three distinct eigenvalues, $\lambda_{1}, \lambda_{2}$ , and $\lambda_{3}$ with eigenvectors $\boldsymbol{e}_{1}, \boldsymbol{e}_{2}$ , and $\boldsymbol{e}_{3}$ , respectively. We write a matrix $P$ whose columns are the eigenvectors,

P=\left(\begin{array}{lll} \boldsymbol{e}_{1} & \boldsymbol{e}_{2} & \boldsymbol{e}_{3} \end{array}\right)

What happens when we pre-multiply by $A$ ? We have

A\left(\begin{array}{lll} \boldsymbol{e}_{1} & \boldsymbol{e}_{2} & \boldsymbol{e}_{3} \end{array}\right)=\left(\begin{array}{lll} A \boldsymbol{e}_{1} & A \boldsymbol{e}_{2} & A \boldsymbol{e}_{3} \end{array}\right)

which implies, using $A \boldsymbol{e}=\lambda \boldsymbol{e}$ ,

\begin{aligned} A\left(\begin{array}{lll} \boldsymbol{e}_{1} & \boldsymbol{e}_{2} & \boldsymbol{e}_{3} \end{array}\right) &= \left(\begin{array}{lll} \lambda_{1} \boldsymbol{e}_{1} & \lambda_{2} \boldsymbol{e}_{2} & \lambda_{3} \boldsymbol{e}_{3} \end{array}\right) \\ &= \left(\begin{array}{ccc} \boldsymbol{e}_{1} & \boldsymbol{e}_{2} & \boldsymbol{e}_{3} \end{array}\right)\left(\begin{array}{ccc} \lambda_{1} & 0 & 0 \\ 0 & \lambda_{2} & 0 \\ 0 & 0 & \lambda_{3} \end{array}\right) . \end{aligned}

We define

D=\left(\begin{array}{ccc} \lambda_{1} & 0 & 0 \\ 0 & \lambda_{2} & 0 \\ 0 & 0 & \lambda_{3} \end{array}\right)

This gives us the equation

A P=P D .

Since the eigenvalues are distinct, the eigenvectors are linearly independent and $P$ is invertible. Multiplying both sides of the equation on the right by $P^{-1}$ gives

A=P D P^{-1} .

This is called the diagonalisation of matrix $A$ .

We consider the matrix

A=\left(\begin{array}{ccc} 5 / 2 & 1 / 2 & -1 / 2 \\ -1 / 2 & 7 / 2 & 1 / 2 \\ -1 & 1 & 3 \end{array}\right)

We calculate the characteristic equation,

\operatorname{det}(A-\lambda I)=\left|\begin{array}{ccc} 5 / 2-\lambda & 1 / 2 & -1 / 2 \\ -1 / 2 & 7 / 2-\lambda & 1 / 2 \\ -1 & 1 & 3-\lambda \end{array}\right|=0

which, after some algebraic manipulation, gives us

(4-\lambda)(\lambda-2)(\lambda-3)=0
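The algebraic manipulation is not spelled out above, so the following is a minimal sketch of how the factorisation could be checked numerically. It uses exact rational arithmetic from the Python standard library; the helper names `det3`, `char_poly` and `factored` are illustrative choices, not part of the text.

```python
from fractions import Fraction as F

# Matrix A from the worked example, stored with exact rational entries.
A = [[F(5, 2), F(1, 2), F(-1, 2)],
     [F(-1, 2), F(7, 2), F(1, 2)],
     [F(-1), F(1), F(3)]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_poly(m, lam):
    """Evaluate det(m - lam * I) for a 3x3 matrix m."""
    shifted = [[m[r][k] - (lam if r == k else 0) for k in range(3)]
               for r in range(3)]
    return det3(shifted)

def factored(lam):
    """The factored form quoted in the text."""
    return (4 - lam) * (lam - 2) * (lam - 3)

# Two cubic polynomials that agree at four distinct points are identical.
samples = [F(0), F(1), F(5), F(-1)]
agree = all(char_poly(A, s) == factored(s) for s in samples)

# The roots of the characteristic equation are the eigenvalues.
roots = [lam for lam in (2, 3, 4) if char_poly(A, F(lam)) == 0]
print(agree, roots)
```

Comparing the two cubics at four sample points is enough to confirm that they are the same polynomial, which avoids expanding the determinant symbolically.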

First we compute the eigenvector with eigenvalue $\lambda=2$ . We need to solve $(A-2 I) \boldsymbol{x}=\mathbf{0}$ , that is,

\left(\begin{array}{ccc} 1 & 1 & -1 \\ -1 & 3 & 1 \\ -1 & 1 & 1 \end{array}\right)\left(\begin{array}{l} x \\ y \\ z \end{array}\right)=\left(\begin{array}{l} 0 \\ 0 \\ 0 \end{array}\right)

Note that we have multiplied the first two rows by 2 to ease computations. Since $A-\lambda I$ is singular, we know that one of the equations must be redundant. We choose the following two equations to proceed,

\left(\begin{array}{ccc} 1 & 1 & -1 \\ -1 & 1 & 1 \end{array}\right)\left(\begin{array}{l} x \\ y \\ z \end{array}\right)=\left(\begin{array}{l} 0 \\ 0 \end{array}\right) .

We undertake $\left(R_{2}+R_{1} \rightarrow R_{2}\right)$ to simplify the equations,

\left(\begin{array}{ccc} 1 & 1 & -1 \\ 0 & 2 & 0 \end{array}\right)\left(\begin{array}{l} x \\ y \\ z \end{array}\right)=\left(\begin{array}{l} 0 \\ 0 \end{array}\right)

The second row immediately gives that $y=0$ and the first row that $x=z$ so the eigenvector is

\boldsymbol{e}_{1}=\left(\begin{array}{l} 1 \\ 0 \\ 1 \end{array}\right)

Following the same steps with the other two eigenvalues we obtain

\boldsymbol{e}_{2}=\left(\begin{array}{l} 1 \\ 1 \\ 0 \end{array}\right) \text { and } \boldsymbol{e}_{3}=\left(\begin{array}{l} 0 \\ 1 \\ 1 \end{array}\right)

for $\lambda=3$ and $\lambda=4$ , respectively. We can now form the matrix $P$ in Eq. (10.100),

P=\left(\begin{array}{lll} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{array}\right)

Then, $A P$ is given by

\left(\begin{array}{ccc} 5 / 2 & 1 / 2 & -1 / 2 \\ -1 / 2 & 7 / 2 & 1 / 2 \\ -1 & 1 & 3 \end{array}\right)\left(\begin{array}{lll} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{array}\right)=\left(\begin{array}{lll} 2 & 3 & 0 \\ 0 & 3 & 4 \\ 2 & 0 & 4 \end{array}\right) .

Equation (10.106) is equal to $P D$ ,

\left(\begin{array}{lll} 2 & 3 & 0 \\ 0 & 3 & 4 \\ 2 & 0 & 4 \end{array}\right)=\left(\begin{array}{lll} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{array}\right)\left(\begin{array}{lll} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{array}\right)

where $D$ is given by,

D=\left(\begin{array}{lll} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 4 \end{array}\right)
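The identity $A P=P D$ for this example can be confirmed with a few lines of exact arithmetic. The sketch below also checks $A=P D P^{-1}$; the inverse `P_inv` is not given in the text and was worked out by hand for this illustration, so it should be read as an assumption that the code itself verifies.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[F(5, 2), F(1, 2), F(-1, 2)],
     [F(-1, 2), F(7, 2), F(1, 2)],
     [F(-1), F(1), F(3)]]
P = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]   # columns are e1, e2, e3
D = [[2, 0, 0], [0, 3, 0], [0, 0, 4]]   # matching eigenvalues

# Assumed inverse of P (hand-computed, not from the text); checked below.
P_inv = [[F(1, 2), F(-1, 2), F(1, 2)],
         [F(1, 2), F(1, 2), F(-1, 2)],
         [F(-1, 2), F(1, 2), F(1, 2)]]

AP = matmul(A, P)
PD = matmul(P, D)
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(AP == PD)                          # A P = P D
print(matmul(P, P_inv) == identity)      # P_inv really is the inverse
print(matmul(PD, P_inv) == A)            # A = P D P^{-1}
```

Note that the order of the columns of $P$ must match the order of the eigenvalues on the diagonal of $D$; permuting both in the same way gives another valid diagonalisation.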

Applications of Diagonalisation

Let us consider a diagonal $3 \times 3$ matrix,

D=\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{array}\right)

What happens when we square the matrix? We obtain,

D^{2}=\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{array}\right)\left(\begin{array}{ccc} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{array}\right)=\left(\begin{array}{lll} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 4 \end{array}\right)

We observe that matrix powers of diagonal matrices correspond simply to powers of the diagonal entries. Let us consider what happens when we square both sides of Eq. (10.102),

A^{2}=P D P^{-1} P D P^{-1}=P D^{2} P^{-1} .

Repeating this multiplication, each adjacent pair $P^{-1} P$ in the middle of the product cancels, so that for any positive integer $n$ ,

A^{n}=P D^{n} P^{-1}

If we let $D=\operatorname{Diag}\left(\lambda_{1}, \lambda_{2}, \lambda_{3}\right)$ , then $D^{n}=\operatorname{Diag}\left(\lambda_{1}^{n}, \lambda_{2}^{n}, \lambda_{3}^{n}\right)$ which we can put between $P$ and $P^{-1}$ to ease the calculation of the power of $A$ .

In addition to powers of $A$ we can also use matrix diagonalisation to easily calculate matrix exponentials. Recall the power series expansion of the exponential given by Eq. (6.20),

e^{x}=\sum_{n=0}^{\infty} \frac{x^{n}}{n !}

We define the expression $e^{t A}$ through the series expansion,

e^{t A}=\sum_{n=0}^{\infty} \frac{t^{n}}{n !} A^{n}

We show how diagonalisation helps us compute the expression $e^{t A}$ . We can replace $A^{n}$ with $P D^{n} P^{-1}$ in Eq. (10.111) which gives,

e^{t A}=\sum_{n=0}^{\infty} \frac{t^{n}}{n !} P D^{n} P^{-1}

In this case, the distributive law of matrix multiplication allows us to write,

e^{t A}=P\left(\sum_{n=0}^{\infty} \frac{t^{n}}{n !} D^{n}\right) P^{-1} .

We observe that we have $e^{t D}$ in between $P$ and $P^{-1}$ and so we obtain,

e^{t A}=P e^{t D} P^{-1}

Finally, for $D=\operatorname{Diag}\left(\lambda_{1}, \lambda_{2}, \lambda_{3}\right)$ , we have $e^{t D}=\operatorname{Diag}\left(e^{t \lambda_{1}}, e^{t \lambda_{2}}, e^{t \lambda_{3}}\right)$ .

Note that the derivation is not rigorous, since we need to consider convergence properties of these series and define what it means for a matrix sum to converge. The interested reader may wish to think how to extend the concept of partial sums and the convergence of series to the convergence of a series of matrices.
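Although the derivation is not rigorous, the result can at least be checked numerically. The following sketch compares a partial sum of the defining series with $P e^{t D} P^{-1}$ for the $3 \times 3$ matrix with eigenvalues 2, 3, 4 diagonalised earlier. The value `t = 0.5`, the number of terms, and the hand-computed `P_inv` are assumptions made for this illustration.

```python
import math

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def exp_series(A, t, terms=40):
    """Partial sum of sum_n t^n A^n / n! using the first `terms` terms."""
    n = len(A)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]          # the n = 0 term is the identity
    for k in range(1, terms):
        # Next term: previous term times A, scaled by t / k.
        term = [[t * x / k for x in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total

def exp_diag(P, eigenvalues, P_inv, t):
    """Compute P e^{tD} P^{-1} with D = Diag(eigenvalues)."""
    n = len(eigenvalues)
    E = [[math.exp(t * eigenvalues[i]) if i == j else 0.0
          for j in range(n)] for i in range(n)]
    return matmul(matmul(P, E), P_inv)

A = [[2.5, 0.5, -0.5], [-0.5, 3.5, 0.5], [-1.0, 1.0, 3.0]]
P = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]]
# Assumed inverse of P, computed by hand for this sketch.
P_inv = [[0.5, -0.5, 0.5], [0.5, 0.5, -0.5], [-0.5, 0.5, 0.5]]

t = 0.5
S = exp_series(A, t)
E = exp_diag(P, [2, 3, 4], P_inv, t)
max_diff = max(abs(S[i][j] - E[i][j]) for i in range(3) for j in range(3))
print(max_diff < 1e-9)
```

The partial sums here are exactly the matrix analogue of the partial sums of a scalar series: each entry of the matrix is a sequence of numbers, and the series converges when every entry does.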

Example

Let

A=\left(\begin{array}{ccc} -2 / 3 & 2 / 3 & 4 / 3 \\ 1 & 0 & 1 \\ 2 / 3 & 1 / 3 & -4 / 3 \end{array}\right)

Diagonalise $A$ and hence calculate $A^{5}$ .

Solution We need to express $A$ as the diagonalisation $A=P D P^{-1}$ in order to calculate matrix powers. To compute $D$ we need the eigenvalues and for $P$ we need the corresponding eigenvectors. Once we have $P$ we need to compute its inverse. The eigenvalues are obtained from the characteristic equation,

\operatorname{det}(A-\lambda I)=\left|\begin{array}{ccc} -2 / 3-\lambda & 2 / 3 & 4 / 3 \\ 1 & -\lambda & 1 \\ 2 / 3 & 1 / 3 & -4 / 3-\lambda \end{array}\right|=(2+\lambda)(1-\lambda)(1+\lambda)=0

which gives the eigenvalues $-2,-1,1$ and $D$ as

D=\left(\begin{array}{ccc} -2 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{array}\right)

To find the eigenvector for $\lambda=-2$ , we solve $(A-\lambda I) \boldsymbol{x}=\mathbf{0}$ ,

\left(\begin{array}{ccc} 4 / 3 & 2 / 3 & 4 / 3 \\ 1 & 2 & 1 \\ 2 / 3 & 1 / 3 & 2 / 3 \end{array}\right)\left(\begin{array}{l} x \\ y \\ z \end{array}\right)=\left(\begin{array}{l} 0 \\ 0 \\ 0 \end{array}\right)

As before, one equation is redundant, leaving us with one degree of freedom. The eigenvector is computed as $\boldsymbol{e}_{1}=(1,0,-1)^{\top}$ . Similarly, we obtain the other two eigenvectors corresponding to the eigenvalues $\lambda=-1,1$ . For $\lambda=-1$ , we obtain $\boldsymbol{e}_{2}=(2,-3,1)^{\top}$ and for $\lambda=1, \boldsymbol{e}_{3}=(2,3,1)^{\top}$ . Using the eigenvectors as the columns of $P$ we have,

P=\left(\begin{array}{ccc} 1 & 2 & 2 \\ 0 & -3 & 3 \\ -1 & 1 & 1 \end{array}\right)

Next, we compute the inverse of $P$ using Gauss-Jordan elimination on the augmented matrix,

\left(\begin{array}{ccc|ccc} 1 & 2 & 2 & 1 & 0 & 0 \\ 0 & -3 & 3 & 0 & 1 & 0 \\ -1 & 1 & 1 & 0 & 0 & 1 \end{array}\right)

By performing the following row transformations, $\left(R_{3}+R_{1}\right) \rightarrow\left(R_{3}\right),\left(-R_{2} / 3\right) \rightarrow\left(R_{2}\right)$ , $\left(R_{1}-2 R_{2}\right) \rightarrow\left(R_{1}\right),\left(R_{3}-3 R_{2}\right) \rightarrow\left(R_{3}\right),\left(6 R_{2}+R_{3}\right) \rightarrow\left(R_{2}\right),\left(6 R_{1}-4 R_{3}\right) \rightarrow\left(R_{1}\right)$ , $\left(R_{1} / 6\right) \rightarrow\left(R_{1}\right),\left(R_{2} / 6\right) \rightarrow\left(R_{2}\right),\left(R_{3} / 6\right) \rightarrow\left(R_{3}\right)$ we obtain the identity on the LHS and the inverse on the RHS,

\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 / 3 & 0 & -2 / 3 \\ 0 & 1 & 0 & 1 / 6 & -1 / 6 & 1 / 6 \\ 0 & 0 & 1 & 1 / 6 & 1 / 6 & 1 / 6 \end{array}\right) .
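The Gauss-Jordan procedure can also be sketched in code. The function below is a generic illustration with exact fractions, and the name `inverse` is a hypothetical helper; it does not follow the same sequence of row operations as the hand calculation, but because the inverse is unique it should arrive at the same matrix.

```python
from fractions import Fraction as F

def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination on (M | I)."""
    n = len(M)
    # Build the augmented matrix with exact rational entries.
    aug = [[F(M[i][j]) for j in range(n)] +
           [F(int(i == j)) for j in range(n)] for i in range(n)]
    for c in range(n):
        # Find a row at or below the diagonal with a non-zero entry in column c.
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        # Scale the pivot row so that the pivot entry becomes 1.
        p = aug[c][c]
        aug[c] = [x / p for x in aug[c]]
        # Clear column c in every other row.
        for r in range(n):
            if r != c and aug[r][c] != 0:
                factor = aug[r][c]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[c])]
    # The right-hand half of the augmented matrix is now the inverse.
    return [row[n:] for row in aug]

P = [[1, 2, 2], [0, -3, 3], [-1, 1, 1]]
P_inv = inverse(P)
for row in P_inv:
    print([str(x) for x in row])
```

Working with `Fraction` rather than floating-point numbers keeps entries such as $1/6$ exact, so the result can be compared directly with the hand calculation.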

Now, to calculate $A^{5}$ we raise $D$ to the fifth power and multiply by $P$ on the left and $P^{-1}$ on the right,

A^{5}=\left(\begin{array}{ccc} 1 & 2 & 2 \\ 0 & -3 & 3 \\ -1 & 1 & 1 \end{array}\right)\left(\begin{array}{ccc} -32 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{array}\right)\left(\begin{array}{ccc} 1 / 3 & 0 & -2 / 3 \\ 1 / 6 & -1 / 6 & 1 / 6 \\ 1 / 6 & 1 / 6 & 1 / 6 \end{array}\right)=\left(\begin{array}{ccc} -32 / 3 & 2 / 3 & 64 / 3 \\ 1 & 0 & 1 \\ 32 / 3 & 1 / 3 & -64 / 3 \end{array}\right)
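As a cross-check, the following sketch computes $A^{5}$ in two ways: through $P D^{5} P^{-1}$ and by repeated multiplication. The matrices are those of the example; the helper names `matmul`, `matpow` and `power_via_diagonalisation` are illustrative.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matpow(M, n):
    """Naive n-th power by repeated multiplication, for n >= 1."""
    R = M
    for _ in range(n - 1):
        R = matmul(R, M)
    return R

A = [[F(-2, 3), F(2, 3), F(4, 3)],
     [F(1), F(0), F(1)],
     [F(2, 3), F(1, 3), F(-4, 3)]]
P = [[F(1), F(2), F(2)], [F(0), F(-3), F(3)], [F(-1), F(1), F(1)]]
P_inv = [[F(1, 3), F(0), F(-2, 3)],
         [F(1, 6), F(-1, 6), F(1, 6)],
         [F(1, 6), F(1, 6), F(1, 6)]]
eigenvalues = [-2, -1, 1]

def power_via_diagonalisation(n):
    """Compute A^n as P D^n P^{-1}, where D^n has entries lambda_i^n."""
    Dn = [[F(eigenvalues[i]) ** n if i == j else F(0) for j in range(3)]
          for i in range(3)]
    return matmul(matmul(P, Dn), P_inv)

A5 = power_via_diagonalisation(5)
print(A5 == matpow(A, 5))
```

For a fixed $n$ the saving is modest, but the diagonalised form gives a closed expression for $A^{n}$ valid for every $n$, which repeated multiplication does not.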

From the procedure followed in Example 10.11, we can also calculate matrix exponentials. For instance, to compute $e^{t A}$ for the matrix $A$ given in Example 10.11, we first compute $e^{t D}$ as,

e^{t D}=\left(\begin{array}{ccc} e^{-2 t} & 0 & 0 \\ 0 & e^{-t} & 0 \\ 0 & 0 & e^{t} \end{array}\right)

The matrix exponential $e^{t A}$ is then calculated from $e^{t A}=P e^{t D} P^{-1}$ .
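
The text stops short of carrying out this product. As a worked check, using the $P$ and $P^{-1}$ found in the example and assuming the arithmetic below is carried through correctly, the multiplication gives

e^{t A}=\left(\begin{array}{ccc} \frac{1}{3}\left(e^{t}+e^{-t}+e^{-2 t}\right) & \frac{1}{3}\left(e^{t}-e^{-t}\right) & \frac{1}{3}\left(e^{t}+e^{-t}-2 e^{-2 t}\right) \\ \frac{1}{2}\left(e^{t}-e^{-t}\right) & \frac{1}{2}\left(e^{t}+e^{-t}\right) & \frac{1}{2}\left(e^{t}-e^{-t}\right) \\ \frac{1}{6}\left(e^{t}+e^{-t}-2 e^{-2 t}\right) & \frac{1}{6}\left(e^{t}-e^{-t}\right) & \frac{1}{6}\left(e^{t}+e^{-t}+4 e^{-2 t}\right) \end{array}\right)

Two quick consistency checks are available: setting $t=0$ gives the identity matrix, and differentiating each entry with respect to $t$ and then setting $t=0$ recovers the entries of $A$ .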

Geometric and algebraic multiplicities

Consider the matrix,

A=\left(\begin{array}{ll} 1 & 1 \\ 0 & 1 \end{array}\right)

Its characteristic equation gives

\operatorname{det}(A-\lambda I)=(1-\lambda)^{2}=0

which gives the eigenvalue 1 repeated twice. We cannot diagonalise this matrix: since both eigenvalues equal 1, the only possible diagonal form is $D=I$ , for which,

A=P I P^{-1}=P P^{-1}=I

which contradicts $A \neq I$ . To find the eigenvectors, we solve $(A-\lambda I) \boldsymbol{x}=\mathbf{0}$ with $\lambda=1$ , which gives

\left(\begin{array}{ll} 0 & 1 \\ 0 & 0 \end{array}\right)\left(\begin{array}{l} x \\ y \end{array}\right)=\left(\begin{array}{l} 0 \\ 0 \end{array}\right)

The solution is $y=0$ with $x$ free, so the only eigenvector (up to scaling) is $(1,0)^{\top}$ . This motivates the following definition.

Algebraic and geometric multiplicities

The algebraic multiplicity of an eigenvalue is the number of times it is repeated as a root of the characteristic equation. In the example above, $\lambda=1$ has algebraic multiplicity two. The geometric multiplicity of an eigenvalue is the number of linearly independent eigenvectors that are associated with the eigenvalue. In the example above the geometric multiplicity of 1 is one.
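The geometric multiplicity can be computed as the dimension of the solution space of $(A-\lambda I) \boldsymbol{x}=\mathbf{0}$ , which by the rank-nullity theorem equals the size of the matrix minus the rank of $A-\lambda I$ . The sketch below illustrates this for the example above and, for contrast, for the identity matrix; the function names `rank` and `geometric_multiplicity` are illustrative.

```python
from fractions import Fraction as F

def rank(M):
    """Rank of a matrix via Gaussian elimination with exact arithmetic."""
    rows = [[F(x) for x in row] for row in M]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r

def geometric_multiplicity(M, lam):
    """Number of independent eigenvectors for lam: n - rank(M - lam * I)."""
    n = len(M)
    shifted = [[M[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)

A = [[1, 1], [0, 1]]    # the non-diagonalisable example
I2 = [[1, 0], [0, 1]]   # identity: same eigenvalue, but diagonal already

print(geometric_multiplicity(A, 1), geometric_multiplicity(I2, 1))
```

Both matrices have the eigenvalue 1 with algebraic multiplicity two, but only the identity has two independent eigenvectors, which is why it is diagonalisable and $A$ is not.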

Exercises

Diagonalise $A=\left(\begin{array}{ccc}-1 & 3 / 2 & 1 / 2 \\ -1 & 2 & 1 \\ -2 & 1 & 2\end{array}\right)$ and hence find $A^{6}$ and $e^{t A}$ .
Diagonalise $A=\left(\begin{array}{ccc}2 & 0 & 0 \\ 8 & -2 & 0 \\ -3 & 0 & 3\end{array}\right)$ and hence find $A^{5}$ and $e^{t A}$ .
