Definitions & rules

A matrix is an ordered rectangular array of quantities. An array $A$ with $m$ rows and $n$ columns is called an $m \times n$ matrix and is said to have $mn$ elements. For example,

$$A=\left(\begin{array}{cccc} a_{11} & a_{12} & \ldots & a_{1 n} \\ a_{21} & a_{22} & \ldots & a_{2 n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m 1} & a_{m 2} & \ldots & a_{m n} \end{array}\right)$$

The position of an element in a matrix is specified uniquely by means of a double subscript. We denote the element in the $i$-th row and the $j$-th column of the matrix $A$ by $a_{i j}$. A shorthand notation is to write $A=\left(a_{i j}\right)_{m \times n}$ where $1 \leq i \leq m$ and $1 \leq j \leq n$.

A matrix with $m=1$ is called a row vector, for example $A=\left(\begin{array}{lll}1 & 3 & 4\end{array}\right)$, and a matrix with $n=1$ is called a column vector, for example $B=\left(\begin{array}{l}3 \\ 5 \\ 1\end{array}\right)$.

Rules of matrix algebra

I. Addition and subtraction

Consider a matrix $A=\left(a_{i j}\right)_{m \times n}$ of size $m \times n$ and a matrix $B=\left(b_{i j}\right)_{p \times q}$ of size $p \times q$. Then $A \pm B$ exists only if $p=m$ and $q=n$, i.e. only if $A$ and $B$ have the same size. For example, determine the matrix $A+B$ if

$$A=\left(\begin{array}{lll} 0 & 1 & 2 \\ 9 & 8 & 7 \end{array}\right) \quad \text{and} \quad B=\left(\begin{array}{lll} 7 & 5 & 4 \\ 3 & 4 & 5 \end{array}\right).$$

The resulting matrix has the same size as $A$ and $B$; if the entries of $A$ are $a_{i j}$ and those of $B$ are $b_{i j}$, then the entries of $A+B$ are given by $a_{i j}+b_{i j}$, as follows,

$$A+B=\left(\begin{array}{lll} 0+7 & 1+5 & 2+4 \\ 9+3 & 8+4 & 7+5 \end{array}\right)=\left(\begin{array}{ccc} 7 & 6 & 6 \\ 12 & 12 & 12 \end{array}\right)$$
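The entrywise rule for addition can be sketched in a few lines of Python. This is only an illustration: storing a matrix as a list of rows and the function name `mat_add` are assumptions made here, not notation from these notes.

```python
def mat_add(A, B):
    """Return A + B for matrices stored as lists of rows."""
    # Addition is defined only when A and B have the same size.
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("A and B must have the same size")
    # Entry (i, j) of the sum is a_ij + b_ij.
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[0, 1, 2], [9, 8, 7]]
B = [[7, 5, 4], [3, 4, 5]]
print(mat_add(A, B))  # [[7, 6, 6], [12, 12, 12]]
```

Subtraction follows the same pattern with `a - b` in place of `a + b`.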

II. Equality

The matrices $A$ and $B$ are equal to each other, i.e. $A=B$, iff $a_{i j}=b_{i j}$ for all $i$ and $j$.

III. Multiplication by a scalar

If $\lambda$ is a scalar, then $\lambda A=\left(\lambda a_{i j}\right)_{m \times n}$, i.e. every element of $A$ is multiplied by $\lambda$.

IV. Matrix multiplication

If $A$ is a row vector with $n$ elements given by

$$A=\left(\begin{array}{llll} a_{11} & a_{12} & \ldots & a_{1 n} \end{array}\right)$$

and $B$ is a column vector with $n$ elements given by

$$B=\left(\begin{array}{c} b_{11} \\ b_{21} \\ \vdots \\ b_{n 1} \end{array}\right)$$

then the product $AB$ is defined to be a scalar given by

$$AB=a_{11} b_{11}+a_{12} b_{21}+\ldots+a_{1 n} b_{n 1}.$$

In general, if $A=\left(a_{i j}\right)_{m \times n}$ and $B=\left(b_{i j}\right)_{p \times q}$, the product $AB$ exists only if $n=p$, i.e. only if

the number of columns of $A$ = the number of rows of $B$.

If $n=p$ then the product $AB$ (let us call it $C$) is an $m \times q$ matrix:

$$\underbrace{A}_{m \times n} \underbrace{B}_{p \times q}=\underbrace{C}_{m \times q} \tag{10.4}$$

where $n=p$ in Eq. (10.4). For $n=p$, we define the product $C=AB$ as the matrix of size $m \times q$ with entries

$$c_{i j}=\sum_{k=1}^{n} a_{i k} b_{k j}$$

Consider the matrices $A$ and $B$ given by

$$A=\left(\begin{array}{cc} 0 & 4 \\ -1 & 6 \\ 3 & -2 \end{array}\right) \quad \text{and} \quad B=\left(\begin{array}{ccc} 1 & 2 & 0 \\ 1 & -1 & 5 \end{array}\right)$$

Suppose we want to determine the product $AB$. The number of columns of $A$ is equal to the number of rows of $B$, so the product $AB$ exists. Since $A$ is a $3 \times 2$ matrix and $B$ is a $2 \times 3$ matrix, the product $AB$ will be a $3 \times 3$ matrix.

To obtain the product $AB$:

  • We multiply the first element in the first row of $A$ (i.e. $a_{11}=0$) with the first element in the first column of $B$ (i.e. $b_{11}=1$). We then multiply the second element in the first row of $A$ (i.e. $a_{12}=4$) with the second element in the first column of $B$ (i.e. $b_{21}=1$). The sum of these products is the first element in the product $AB$:

$$\left(\begin{array}{cc} 0 & 4 \\ -1 & 6 \\ 3 & -2 \end{array}\right)\left(\begin{array}{ccc} 1 & 2 & 0 \\ 1 & -1 & 5 \end{array}\right)=\left(\begin{array}{ccc} 0+4 & - & - \\ - & - & - \\ - & - & - \end{array}\right)$$

  • So far we multiplied the first row of $A$ by the first column of $B$. The sum is $0+4=4$. Next, we multiply the first row of $A$ by the second column of $B$, which gives $0 \times 2+4 \times(-1)=-4$:

$$\left(\begin{array}{cc} 0 & 4 \\ -1 & 6 \\ 3 & -2 \end{array}\right)\left(\begin{array}{ccc} 1 & 2 & 0 \\ 1 & -1 & 5 \end{array}\right)=\left(\begin{array}{ccc} 4 & -4 & - \\ - & - & - \\ - & - & - \end{array}\right)$$

  • To complete the first row of the product $AB$, we multiply the first row of $A$ by the third column of $B$, which gives $0 \times 0+4 \times 5=20$:

$$\left(\begin{array}{cc} 0 & 4 \\ -1 & 6 \\ 3 & -2 \end{array}\right)\left(\begin{array}{ccc} 1 & 2 & 0 \\ 1 & -1 & 5 \end{array}\right)=\left(\begin{array}{ccc} 4 & -4 & 20 \\ - & - & - \\ - & - & - \end{array}\right)$$

  • Repeating the above steps with the second and third rows of $A$ yields

$$\left(\begin{array}{cc} 0 & 4 \\ -1 & 6 \\ 3 & -2 \end{array}\right)\left(\begin{array}{ccc} 1 & 2 & 0 \\ 1 & -1 & 5 \end{array}\right)=\left(\begin{array}{ccc} 4 & -4 & 20 \\ 5 & -8 & 30 \\ 1 & 8 & -10 \end{array}\right)$$
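The row-by-column procedure is the sum $c_{i j}=\sum_{k} a_{i k} b_{k j}$ applied to every pair $(i, j)$. A minimal Python sketch is given below, assuming matrices are stored as lists of rows; the name `mat_mul` is chosen here for illustration only.

```python
def mat_mul(A, B):
    """Return the product AB using c_ij = sum over k of a_ik * b_kj."""
    n = len(A[0])  # number of columns of A
    if n != len(B):  # must equal the number of rows of B
        raise ValueError("columns of A must equal rows of B")
    q = len(B[0])  # number of columns of B
    # Row i of A is combined with column j of B to give entry (i, j).
    return [[sum(row[k] * B[k][j] for k in range(n)) for j in range(q)]
            for row in A]


A = [[0, 4], [-1, 6], [3, -2]]
B = [[1, 2, 0], [1, -1, 5]]
print(mat_mul(A, B))  # [[4, -4, 20], [5, -8, 30], [1, 8, -10]]
```

The printed result agrees with the $3 \times 3$ product obtained by hand above.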

Note that if $AB=0$, it does not necessarily follow that $A=0$ or $B=0$ or $BA=0$.
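A small numerical illustration of this remark is sketched below. The two $2 \times 2$ matrices are chosen here purely as an example and do not come from the notes; `mat_mul` is an assumed helper name.

```python
def mat_mul(A, B):
    """Return AB for matrices stored as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(row[k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for row in A]


# Both A and B are non-zero matrices.
A = [[0, 1], [0, 0]]
B = [[1, 0], [0, 0]]

print(mat_mul(A, B))  # [[0, 0], [0, 0]]
print(mat_mul(B, A))  # [[0, 1], [0, 0]]
```

Here $AB$ is the zero matrix although neither factor is zero, and $BA$ is not zero.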

Properties of matrix multiplication

(i) Non-commutativity: in general, $AB \neq BA$, even if both $AB$ and $BA$ exist. If $AB=BA$ then $A$ and $B$ are said to commute.

(ii) Associativity: for matrices $A$ $(m \times n)$, $B$ $(n \times q)$, and $C$ $(q \times s)$, we have

$$(AB)C=A(BC)$$

(iii) Distributivity over matrix addition: for $A$ $(m \times n)$, $B$ $(n \times q)$, and $C$ $(n \times q)$, we have

$$A(B+C)=AB+AC.$$
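These three properties can be spot-checked numerically. The sketch below uses three arbitrary $2 \times 2$ integer matrices picked for illustration; a check on one example is not a proof, and the helper names `mat_mul` and `mat_add` are assumptions.

```python
def mat_mul(A, B):
    """Return AB for matrices stored as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(row[k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for row in A]


def mat_add(A, B):
    """Return A + B for matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

# (i) AB and BA both exist here but differ.
print(mat_mul(A, B) == mat_mul(B, A))  # False
# (ii) Associativity: (AB)C equals A(BC).
print(mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C)))  # True
# (iii) Distributivity: A(B + C) equals AB + AC.
print(mat_mul(A, mat_add(B, C)) == mat_add(mat_mul(A, B), mat_mul(A, C)))  # True
```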

Special types of matrices

In this section, we take $A$ to be an $m \times n$ matrix, i.e. $A=\left(a_{i j}\right)_{m \times n}$.

Transpose of a matrix

Consider an $m \times n$ matrix $A$. The transpose of $A$, denoted by $A^{\top}$, is the $n \times m$ matrix (note the index letters have been switched) whose rows are the columns of $A$ and whose columns are the rows of $A$, i.e. $A^{\top}=\left(a_{j i}\right)_{n \times m}$. For example, if

$$A=\left(\begin{array}{ll} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{array}\right)$$

then,

$$A^{\top}=\left(\begin{array}{lll} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \end{array}\right).$$

Note that the transpose of a matrix $A$ is also denoted as $A^{t}$ or $A^{\prime}$.
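As a quick illustration, the transpose of a matrix stored as a list of rows can be sketched in Python as follows. The function name `transpose` and the numerical $3 \times 2$ example are assumptions made for this sketch.

```python
def transpose(A):
    """Return the transpose: row i of the result is column i of A."""
    # zip(*A) walks down the columns of A one at a time.
    return [list(col) for col in zip(*A)]


A = [[1, 2], [3, 4], [5, 6]]  # a 3 x 2 matrix
print(transpose(A))  # [[1, 3, 5], [2, 4, 6]]
```

Applying `transpose` twice returns the original matrix, as expected from the definition.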

I. Square matrices

If $m=n$, then $A$ is called a square matrix of order $n$ or an $n$-th order matrix. The elements $a_{i i}$ $(i=1,2,3, \ldots, n)$ are called the main diagonal elements of $A$. Their sum is called the trace of $A$, i.e.

$$\operatorname{trace} A=\sum_{i=1}^{n} a_{i i}$$
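The trace is straightforward to compute directly from its definition. A minimal sketch follows; the function name `trace` and the sample matrix are illustrative choices.

```python
def trace(A):
    """Return the sum of the main diagonal elements of a square matrix."""
    n = len(A)
    if any(len(row) != n for row in A):
        raise ValueError("the trace is defined only for square matrices")
    return sum(A[i][i] for i in range(n))


A = [[1, 0, -1], [0, -3, 2], [-1, 2, 5]]
print(trace(A))  # 3
```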

II. Symmetric matrices

If $A^{\top}=A$, i.e. $a_{j i}=a_{i j}$, then $A$ is said to be a symmetric matrix. For example,

$$A=\left(\begin{array}{ccc} 1 & 0 & -1 \\ 0 & -3 & 2 \\ -1 & 2 & 5 \end{array}\right)=A^{\top}$$

III. Skew-symmetric or anti-symmetric matrices

If $A^{\top}=-A$, i.e. $a_{j i}=-a_{i j}$, then $A$ is said to be a skew-symmetric or anti-symmetric matrix. For example,

$$A=\left(\begin{array}{ccc} 0 & 1 & -3 \\ -1 & 0 & 4 \\ 3 & -4 & 0 \end{array}\right)=-A^{\top}$$

It follows that $a_{i i}=0$ for such matrices, since setting $j=i$ gives $a_{i i}=-a_{i i}$.

Note that any square matrix can be written as the sum of a symmetric and an anti-symmetric matrix:

$$A=\frac{1}{2}\left(A+A^{\top}\right)+\frac{1}{2}\left(A-A^{\top}\right)$$
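The decomposition can be checked on a small example. The sketch below uses exact fractions so that the factor $\frac{1}{2}$ introduces no rounding; the function names and the $2 \times 2$ test matrix are assumptions chosen for illustration.

```python
from fractions import Fraction


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def sym_skew_parts(A):
    """Return (S, K) with S = (A + A^T)/2 symmetric and K = (A - A^T)/2 skew-symmetric."""
    n = len(A)
    At = transpose(A)
    half = Fraction(1, 2)
    S = [[half * (A[i][j] + At[i][j]) for j in range(n)] for i in range(n)]
    K = [[half * (A[i][j] - At[i][j]) for j in range(n)] for i in range(n)]
    return S, K


A = [[1, 2], [4, 3]]
S, K = sym_skew_parts(A)
print(S == [[1, 3], [3, 3]])   # True: the symmetric part
print(K == [[0, -1], [1, 0]])  # True: the anti-symmetric part
```

Adding `S` and `K` entry by entry recovers `A`.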

IV. Diagonal matrices

If $a_{i j}=0$ whenever $i \neq j$, then $A$ is called a diagonal matrix. For example,

$$A=\left(\begin{array}{lll} x & 0 & 0 \\ 0 & y & 0 \\ 0 & 0 & z \end{array}\right).$$

Special cases of diagonal matrices include the identity and null matrices. If $a_{i j}=0$ where $i \neq j$ and $a_{i i}=1$, then $A=I_{n}$ and is referred to as the identity matrix of order $n$ (note that throughout these notes we also denote the identity matrix simply by $I$). For example,

$$I_{3}=\left(\begin{array}{lll} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right).$$

If $a_{i j}=0$ for all $i$ and $j$ then $A=0$ and is called the null or zero matrix.

V. Lower triangular matrix:

This is a matrix for which $a_{i j}=0$ if $i<j$, e.g.

$$A=\left(\begin{array}{ccc} 2 & 0 & 0 \\ -1 & 5 & 0 \\ 6 & 7 & 2 \end{array}\right)$$

VI. Upper triangular matrix:

This is a matrix for which $a_{i j}=0$ if $i>j$, e.g.

$$A=\left(\begin{array}{lll} 2 & 2 & 6 \\ 0 & 7 & 1 \\ 0 & 0 & 5 \end{array}\right).$$
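The index conditions for diagonal and triangular matrices translate directly into simple tests. The following sketch assumes square matrices stored as lists of rows; the predicate names are hypothetical. Note that Python indices start at 0, which does not affect the comparisons $i<j$ and $i>j$.

```python
def is_lower_triangular(A):
    """True if a_ij = 0 whenever i < j."""
    n = len(A)
    return all(A[i][j] == 0 for i in range(n) for j in range(n) if i < j)


def is_upper_triangular(A):
    """True if a_ij = 0 whenever i > j."""
    n = len(A)
    return all(A[i][j] == 0 for i in range(n) for j in range(n) if i > j)


def is_diagonal(A):
    """True if a_ij = 0 whenever i != j."""
    return is_lower_triangular(A) and is_upper_triangular(A)


L = [[2, 0, 0], [-1, 5, 0], [6, 7, 2]]
U = [[2, 2, 6], [0, 7, 1], [0, 0, 5]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(is_lower_triangular(L), is_upper_triangular(L))  # True False
print(is_lower_triangular(U), is_upper_triangular(U))  # False True
print(is_diagonal(I3))  # True
```

A diagonal matrix is exactly one that is both lower and upper triangular, which is how `is_diagonal` is written here.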

Inverse matrices

A square matrix $A$ of order $n$ has an inverse, say $B$, if

$$AB=I_{n} \quad \text{and} \quad BA=I_{n}$$

  • If the inverse of $A$ exists, then it is unique;
  • If $B$ exists, $A$ is called a non-singular matrix;
  • If $B$ does not exist, $A$ is said to be singular.

The inverse of $A$ is denoted by $A^{-1}$ so,

$$AA^{-1}=A^{-1}A=I_{n}.$$

Shortcut for $2 \times 2$ matrices

Given a $2 \times 2$ matrix,

$$A=\left(\begin{array}{ll} a_{11} & a_{12} \\ a_{21} & a_{22} \end{array}\right)$$

let

$$A^{-1}=\left(\begin{array}{ll} b_{11} & b_{12} \\ b_{21} & b_{22} \end{array}\right)$$

where the elements $b_{i j}$ are unknown. From Eqs. (10.14), we have

$$AA^{-1}=\left(\begin{array}{ll} a_{11} & a_{12} \\ a_{21} & a_{22} \end{array}\right)\left(\begin{array}{ll} b_{11} & b_{12} \\ b_{21} & b_{22} \end{array}\right)=\left(\begin{array}{ll} 1 & 0 \\ 0 & 1 \end{array}\right), \tag{10.15}$$

and,

$$A^{-1}A=\left(\begin{array}{ll} b_{11} & b_{12} \\ b_{21} & b_{22} \end{array}\right)\left(\begin{array}{ll} a_{11} & a_{12} \\ a_{21} & a_{22} \end{array}\right)=\left(\begin{array}{ll} 1 & 0 \\ 0 & 1 \end{array}\right). \tag{10.16}$$

Equations (10.15) and (10.16) give us 8 scalar equations for the 4 unknowns ($b_{11}$, $b_{12}$, $b_{21}$, $b_{22}$). It can be shown that, provided $a_{11} a_{22}-a_{12} a_{21} \neq 0$, there is a unique solution for $b_{i j}$ such that:

$$A^{-1}=\frac{1}{\left(a_{11} a_{22}-a_{12} a_{21}\right)}\left(\begin{array}{cc} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{array}\right). \tag{10.17}$$

Note that:

  • the denominator in the first term on the RHS of (10.17) is the determinant of $A$ (see Section 10.2);
  • the second term is the matrix $A$ with the positions of the diagonal elements $a_{11}$ and $a_{22}$ switched, while $a_{12}$ and $a_{21}$ keep their positions but are multiplied by $-1$.

From (10.17), we can say that:

  • $A$ has an inverse (i.e. it is non-singular) if $a_{11} a_{22}-a_{12} a_{21} \neq 0$;
  • $A$ is singular if $a_{11} a_{22}-a_{12} a_{21}=0$.

Example 10.1 If it exists, compute the inverse of the following $2 \times 2$ matrix

$$A=\left(\begin{array}{ll} 4 & 3 \\ 3 & 2 \end{array}\right).$$

Solution We first calculate the denominator of the fraction in Eq. (10.17),

$$4 \times 2-3 \times 3=-1$$

Since this is non-zero, the inverse exists. We then switch the entries on the main diagonal and multiply the off-diagonal entries by $-1$ so that we obtain the inverse as

$$A^{-1}=-\left(\begin{array}{cc} 2 & -3 \\ -3 & 4 \end{array}\right)=\left(\begin{array}{cc} -2 & 3 \\ 3 & -4 \end{array}\right).$$

It is always a good idea to check your answer using $AA^{-1}=I_{2}$,

$$\left(\begin{array}{ll} 4 & 3 \\ 3 & 2 \end{array}\right)\left(\begin{array}{cc} -2 & 3 \\ 3 & -4 \end{array}\right)=\left(\begin{array}{cc} -8+9 & 12-12 \\ -6+6 & 9-8 \end{array}\right)=\left(\begin{array}{ll} 1 & 0 \\ 0 & 1 \end{array}\right) \checkmark$$
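The $2 \times 2$ shortcut can be written as a short function. The sketch below is one possible rendering, not a general-purpose routine: the name `inverse_2x2` is an assumption, and exact fractions are used so that the division by the determinant does not introduce rounding.

```python
from fractions import Fraction


def inverse_2x2(A):
    """Return the inverse of a 2 x 2 matrix via the shortcut formula."""
    (a11, a12), (a21, a22) = A
    det = Fraction(a11) * a22 - Fraction(a12) * a21
    if det == 0:
        raise ValueError("matrix is singular: a11*a22 - a12*a21 = 0")
    # Swap the diagonal entries, negate the off-diagonal ones, divide by det.
    return [[a22 / det, -a12 / det], [-a21 / det, a11 / det]]


A = [[4, 3], [3, 2]]
print(inverse_2x2(A) == [[-2, 3], [3, -4]])  # True
```

For a singular input such as `[[1, 2], [2, 4]]` the function raises a `ValueError` instead of dividing by zero.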

Note that to find the inverse of a matrix of order higher than $n=2$, we need knowledge of determinants, the matrix of minors, and the matrix of cofactors, so we discuss inverses of larger matrices later in this chapter.

Orthogonal matrices

A square matrix $A$ with real entries and satisfying the condition $A^{-1}=A^{\top}$ is said to be orthogonal. Since computing the matrix inverse can be difficult while the transpose is straightforward, orthogonal matrices make a difficult operation easier. Equivalently, $A$ is orthogonal if $AA^{\top}=A^{\top}A=I$. Consider a matrix $A$ and its transpose $A^{\top}$,

$$AA^{\top} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{in} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \\ \end{pmatrix} \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{j1} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{j2} & \cdots & a_{n2} \\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{jn} & \cdots & a_{nn} \\ \end{pmatrix} = I \tag{10.18}$$

Consider now the dot product of the $i^{\text{th}}$ and $j^{\text{th}}$ row vectors of $A$ [refer to Eq. (10.18)]; this is given by $\sum_{k=1}^{n} a_{i k} a_{j k}$. For the product $AA^{\top}$ to be equal to the identity matrix, we deduce the following:

$$\left(AA^{\top}\right)_{i j}=\sum_{k=1}^{n} a_{i k} a_{j k}= \begin{cases}1 & \text{if } i=j \\ 0 & \text{if } i \neq j\end{cases} \tag{10.19}$$

Denoting the $i^{\text{th}}$ row vector of $A$ by $\boldsymbol{r}_{i}$, we have

$$\boldsymbol{r}_{i} \cdot \boldsymbol{r}_{j}= \begin{cases}1 & \text{if } i=j \\ 0 & \text{if } i \neq j\end{cases} \tag{10.20}$$

i.e. for $i \neq j$, $\boldsymbol{r}_{i}$ and $\boldsymbol{r}_{j}$ must be perpendicular (orthogonal), and $\left|\boldsymbol{r}_{i}\right|=1$ for every $i$. If every pair of distinct vectors in a set $\left\{\boldsymbol{r}_{i}\right\}$ is orthogonal, the vectors are said to be mutually orthogonal; further, since $\left|\boldsymbol{r}_{i}\right|=1$, the row vectors of $A$ are mutually orthogonal unit vectors. Note that we also need $A^{\top}A=I$, which implies that the column vectors of $A$ must also be mutually orthogonal unit vectors.

Example 10.2 Show that the following matrix is orthogonal,

$$A=\left(\begin{array}{ccc} \cos \theta & -\sin \theta & 0 \\ \sin \theta & \cos \theta & 0 \\ 0 & 0 & 1 \end{array}\right)$$

Solution The matrix $A$ is orthogonal iff Eq. (10.19), or equivalently Eq. (10.20), holds true. We define $\boldsymbol{r}_{1}=(\cos \theta,-\sin \theta, 0)$, $\boldsymbol{r}_{2}=(\sin \theta, \cos \theta, 0)$, and $\boldsymbol{r}_{3}=(0,0,1)$. Then,

$$\boldsymbol{r}_{1} \cdot \boldsymbol{r}_{1}=\cos ^{2} \theta+\sin ^{2} \theta=1, \quad \boldsymbol{r}_{2} \cdot \boldsymbol{r}_{2}=\sin ^{2} \theta+\cos ^{2} \theta=1, \quad \boldsymbol{r}_{3} \cdot \boldsymbol{r}_{3}=1$$

and

$$\boldsymbol{r}_{1} \cdot \boldsymbol{r}_{2}=\cos \theta \sin \theta-\sin \theta \cos \theta=0, \quad \boldsymbol{r}_{1} \cdot \boldsymbol{r}_{3}=0, \quad \boldsymbol{r}_{2} \cdot \boldsymbol{r}_{3}=0$$

Recall that the dot product is commutative, so the above shows that the rows of $A$ are mutually orthogonal unit vectors and therefore $A$ is an orthogonal matrix. It follows that,

$$A^{-1}=A^{\top}=\left(\begin{array}{ccc} \cos \theta & \sin \theta & 0 \\ -\sin \theta & \cos \theta & 0 \\ 0 & 0 & 1 \end{array}\right)$$
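The orthogonality condition $AA^{\top}=I$ can also be checked numerically for a particular angle. The sketch below is an assumption-laden illustration: the helper names, the angle $\theta=0.7$, and the tolerance `1e-12` (needed because floating-point arithmetic is inexact) are all choices made here.

```python
import math


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def mat_mul(A, B):
    """Return AB for matrices stored as lists of rows."""
    return [[sum(row[k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for row in A]


def is_orthogonal(A, tol=1e-12):
    """True if A is square and A A^T equals the identity within tol."""
    n = len(A)
    if any(len(row) != n for row in A):
        return False
    P = mat_mul(A, transpose(A))
    # Entry (i, j) of A A^T is the dot product of rows i and j of A.
    return all(math.isclose(P[i][j], 1.0 if i == j else 0.0, abs_tol=tol)
               for i in range(n) for j in range(n))


theta = 0.7  # an arbitrary angle in radians
c, s = math.cos(theta), math.sin(theta)
A = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

print(is_orthogonal(A))                 # True
print(is_orthogonal([[1, 1], [0, 1]]))  # False
```

A numerical check for one angle does not replace the algebraic argument above, which holds for every $\theta$.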

Exercises

  1. Find both $AB$ and $A+B$ of the following where possible: (a) $A=\left(\begin{array}{ll}1 & 2 \\ 2 & 1\end{array}\right)$ and $B=\left(\begin{array}{cc}1 & 1 \\ -1 & 1\end{array}\right)$; (b) $A=\left(\begin{array}{cc}1 & 2 \\ 3 & -1 \\ 2 & -1\end{array}\right)$ and $B=\left(\begin{array}{cc}2 & -2 \\ 1 & 3\end{array}\right)$; (c) $A=\left(\begin{array}{ccc}1 & -1 & 0 \\ -1 & 2 & 1 \\ 3 & -1 & 0\end{array}\right)$ and $B=\left(\begin{array}{ccc}0 & 1 & 0 \\ 0 & 2 & 2 \\ -1 & 0 & -1\end{array}\right)$.
  2. Find the decomposition of $A$ into symmetric and skew-symmetric parts where

$$A=\left(\begin{array}{ccc} 1 & 2 & -1 \\ 2 & -1 & 2 \\ 1 & -2 & 3 \end{array}\right)$$

  3. Find the inverse of the following $2 \times 2$ matrices: (a) $A=\left(\begin{array}{cc}1 & 2 \\ -2 & 1\end{array}\right)$; (b) $A=\left(\begin{array}{cc}3 & -1 \\ 1 & 2\end{array}\right)$.