21. Matrix Diagonalisation#

21.1. Motivation#

Recall that a diagonal matrix is a square matrix with zeros everywhere except the main diagonal. Multiplying a vector by a diagonal matrix $D$ is easy: just multiply each entry of the vector by the element in the corresponding position of the diagonal:

$$\begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} d_1x_1 \\ d_2x_2 \\ \vdots \\ d_nx_n \end{pmatrix}.$$

In fact, for diagonal matrices we immediately have the eigenvalues and eigenvectors! The eigenvalues are the diagonal entries $d_i$ and the eigenvectors are the standard coordinate vectors $e_i$:

$$De_i = d_ie_i.$$

Multiplying by a diagonal matrix is easy because any vector is a linear sum of coordinate vectors. The diagonal entries of $D$ act separately on each of the components of the vector $x$:

$$Dx = D(x_1e_1 + \cdots + x_ne_n) = x_1De_1 + \cdots + x_nDe_n = (x_1d_1)e_1 + \cdots + (x_nd_n)e_n.$$
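This componentwise behaviour is easy to check numerically. Here is a minimal NumPy sketch (the diagonal entries and the vector are arbitrary illustrative values):

```python
import numpy as np

d = np.array([2.0, 3.0, 5.0])    # diagonal entries d_1, ..., d_n
D = np.diag(d)                   # build the diagonal matrix D
x = np.array([1.0, -1.0, 4.0])   # an arbitrary vector x

# The full matrix-vector product agrees with elementwise scaling:
print(D @ x)   # [ 2. -3. 20.]
print(d * x)   # the same result: each x_i is scaled by d_i
```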

What about general (non-diagonal) square matrices? If $x$ is an eigenvector of a square matrix $A$ and $\lambda$ its eigenvalue, then we have

$$Ax = \lambda x.$$

If $x$ is an eigenvector, $A$ behaves like a diagonal matrix. Instead of writing $x$ as a sum of coordinate vectors $e_i$, we write it as a sum of eigenvectors $v_i$:

$$x = a_1v_1 + \cdots + a_nv_n.$$

Then

$$Ax = a_1\lambda_1v_1 + \cdots + a_n\lambda_nv_n.$$

We just need to find the values $a_i$. It turns out that we can compute them by multiplying $x$ by the inverse of the matrix of eigenvectors, as we will see in the next section.

21.2. Diagonalisation $A = X\Lambda X^{-1}$#

Matrix Diagonalisation

Let $A$ be an $(n \times n)$ matrix with $n$ eigenvectors $v_i$ and corresponding eigenvalues $\lambda_i$. Let

$$X = \begin{pmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{pmatrix}$$

be the matrix of eigenvectors and

$$\Lambda = \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix}$$

the diagonal matrix of eigenvalues. If $X$ is invertible, then

$$X^{-1}AX = \Lambda$$

and

$$A = X\Lambda X^{-1}.$$

Note that we use a capital lambda $\Lambda$ to represent the matrix of eigenvalues.

To see why the above result is true, note that $X^{-1}AX = \Lambda$ and $A = X\Lambda X^{-1}$ are both exactly equivalent to:

$$AX = X\Lambda.$$

Then the left-hand side $AX$ is $A$ times the eigenvectors:

$$AX = A\begin{pmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{pmatrix} = \begin{pmatrix} | & & | \\ \lambda_1v_1 & \cdots & \lambda_nv_n \\ | & & | \end{pmatrix}$$

because $Av_i = \lambda_iv_i$.

The right-hand side $X\Lambda$, on the other hand, is the eigenvectors times the eigenvalues:

$$X\Lambda = \begin{pmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{pmatrix}\begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} = \begin{pmatrix} | & & | \\ \lambda_1v_1 & \cdots & \lambda_nv_n \\ | & & | \end{pmatrix}.$$

Thus we see that $AX = X\Lambda$.
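We can also verify $AX = X\Lambda$ numerically. A minimal sketch using `np.linalg.eig`, whose returned matrix has the eigenvectors as its columns (the example matrix here is an arbitrary choice):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])   # an arbitrary matrix with distinct eigenvalues

# np.linalg.eig returns the eigenvalues and the matrix X of eigenvector columns.
eigenvalues, X = np.linalg.eig(A)
Lam = np.diag(eigenvalues)

print(np.allclose(A @ X, X @ Lam))                 # True: AX = X Lambda
print(np.allclose(A, X @ Lam @ np.linalg.inv(X)))  # True: A = X Lambda X^{-1}
```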

Example

Diagonalise the $(2 \times 2)$ matrix

$$A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$

Solution

This matrix represents a reflection in the line $y = x$, so we can immediately write down two eigenvalues and corresponding eigenvectors:

$$\lambda_1 = 1,\ v_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad \lambda_2 = -1,\ v_2 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}.$$

Therefore we can write down the matrix of eigenvectors $X$ and the matrix of eigenvalues $\Lambda$:

$$X = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}, \qquad \Lambda = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

To complete the diagonalisation we need to calculate $X^{-1}$:

$$X^{-1} = \frac{1}{\det(X)}\operatorname{adj}(X) = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} = \begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{pmatrix}.$$

Then

$$A = X\Lambda X^{-1} = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{pmatrix}.$$

The inverse of the matrix of eigenvectors, $X^{-1}$, is also called the change of basis matrix. Multiplying a vector $x$ by $X^{-1}$ transforms it from standard coordinates $e_1, \ldots, e_n$ to eigenvector coordinates $v_1, \ldots, v_n$.

Suppose $x$ is written as a sum of eigenvectors with coefficients $a_i$:

$$x = a_1v_1 + \cdots + a_nv_n = X\begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix},$$

then left-multiplying by $X^{-1}$ gives the vector of coefficients $a_i$:

$$\begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} = X^{-1}x. \tag{21.1}$$
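In code, the coefficients $a_i$ are best obtained by solving the linear system $Xa = x$ rather than forming $X^{-1}$ explicitly. A minimal NumPy sketch using the eigenvectors of the reflection matrix from the example above (the vector $x$ is an arbitrary choice):

```python
import numpy as np

# Eigenvector matrix X for the reflection matrix A = [[0, 1], [1, 0]].
X = np.array([[1.0, -1.0],
              [1.0,  1.0]])
x = np.array([3.0, 5.0])   # an arbitrary vector in standard coordinates

# Coefficients a such that x = a_1 v_1 + a_2 v_2, i.e. a = X^{-1} x.
a = np.linalg.solve(X, x)
print(a)       # [4. 1.]
print(X @ a)   # recovers x: [3. 5.]
```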

Note

Diagonalisation is not unique

1. If we write the eigenvalues and eigenvectors in a different order, we get a different matrix $X$. Swapping $v_1$ and $v_2$ we get another way to write the diagonalisation of $A$ from the example above:

$$A = \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} -\frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{pmatrix}.$$

However, it is important that the order of the eigenvalues is the same as the order of the eigenvectors. If you swap the eigenvectors, you must remember to also swap the eigenvalues!

2. Any eigenvector can be multiplied by a nonzero constant. For example, replacing $v_1$ by $2v_1$ in the example above:

$$A = \begin{pmatrix} 2 & -1 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} \frac{1}{4} & \frac{1}{4} \\ -\frac{1}{2} & \frac{1}{2} \end{pmatrix}.$$

3. The eigenvalues are unique, although different diagonalisations may result in a different order.

Exercise 21.1

Diagonalise the matrix

$$A = \begin{pmatrix} \frac{1}{2} & \frac{3}{2} \\ \frac{3}{2} & \frac{1}{2} \end{pmatrix}.$$

21.3. About Diagonalisation#

Not all matrices can be diagonalised. To diagonalise a matrix, the matrix of eigenvectors $X$ must be invertible. In this section we will investigate conditions under which this is the case.

Recall that if $v$ is an eigenvector then so is any nonzero multiple $av$. Suppose a $(2 \times 2)$ matrix has eigenvector $v$. Then the vector $2v$ is also an eigenvector, but the eigenvector matrix

$$X = \begin{pmatrix} | & | \\ v & 2v \\ | & | \end{pmatrix}$$

is not invertible since the second column is a multiple of the first (and therefore its determinant is zero).

We can extend this idea to $(3 \times 3)$ matrices. Suppose we have two independent eigenvectors $v_1$ and $v_2$ and a third eigenvector $v_3$ such that $v_3 = av_1 + bv_2$ for some scalars $a$ and $b$. Then the matrix of eigenvectors

$$X = \begin{pmatrix} | & | & | \\ v_1 & v_2 & v_3 \\ | & | & | \end{pmatrix}$$

is not invertible. To extend this idea to the general case, we need to introduce the important concept of linear independence.

21.4. Linear Independence#

A set of vectors is linearly dependent if one vector can be written as a linear sum of the other vectors.

Definition

Let $v_1, \ldots, v_n \in \mathbb{R}^m$ be a set of vectors.

The vectors are linearly independent if the equation

$$a_1v_1 + \cdots + a_nv_n = 0$$

has only the trivial solution

$$a_1 = \cdots = a_n = 0.$$

Otherwise, we say the vectors are linearly dependent.
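Numerically, a convenient way to test a set of vectors for linear independence is to stack them as the columns of a matrix and compute its rank. A minimal NumPy sketch with illustrative vectors:

```python
import numpy as np

v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = v1 + 2 * v2   # deliberately a linear sum of v1 and v2

X = np.column_stack([v1, v2, v3])

# The columns are linearly independent iff the rank equals the number of vectors.
print(np.linalg.matrix_rank(X))   # 2 < 3, so the set is linearly dependent
```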

Exercise 21.2

Show that a set of vectors is linearly dependent if (and only if) one of the vectors is a linear sum of the others.

If we have $n$ linearly dependent vectors $v_i \in \mathbb{R}^n$ then it is easy to show that the matrix

$$X = \begin{pmatrix} | & & | \\ v_1 & \cdots & v_n \\ | & & | \end{pmatrix}$$

is not invertible. Therefore we can extend the invertible matrix theorem with two more equivalent conditions for invertibility:

Invertible Matrix Theorem (II)

Let $A$ be an $(n \times n)$ matrix. Then the following statements are equivalent:

  1. $A$ is invertible.

  2. $\det(A) \neq 0$.

  3. $A$ has $n$ pivots.

  4. The null space of $A$ is $\{0\}$.

  5. $Ax = b$ has a unique solution for every $b \in \mathbb{R}^n$.

  6. The columns of $A$ are linearly independent.

  7. The rows of $A$ are linearly independent.
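These equivalences can be observed numerically. A small sketch checking a few of the conditions for one (arbitrarily chosen) invertible matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
n = A.shape[0]

print(np.linalg.det(A))               # 5.0, nonzero (condition 2)
print(np.linalg.matrix_rank(A) == n)  # True: independent columns and rows (6, 7)

b = np.array([1.0, 0.0])
x = np.linalg.solve(A, b)             # the unique solution of Ax = b (condition 5)
print(np.allclose(A @ x, b))          # True
```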

21.5. More Eigenvalues#

Theorem

Eigenvectors $v_1, \ldots, v_n$ corresponding to distinct eigenvalues are linearly independent.

An $(n \times n)$ matrix with $n$ distinct eigenvalues is diagonalisable.

To prove this, suppose that $v_1$ and $v_2$ are linearly dependent eigenvectors of the matrix $A$ with distinct eigenvalues $\lambda_1$ and $\lambda_2$, so that

$$v_2 = av_1 \tag{21.2}$$

for some $a \neq 0$.

Multiply by $A$:

$$\lambda_2v_2 = a\lambda_1v_1$$

then divide by $\lambda_2$ (since the eigenvalues are distinct, at most one of them can be zero, so we may assume without loss of generality that $\lambda_2 \neq 0$):

$$v_2 = \frac{a\lambda_1}{\lambda_2}v_1. \tag{21.3}$$

Comparing (21.2) and (21.3), we see that $\lambda_1/\lambda_2 = 1$ and so

$$\lambda_1 = \lambda_2.$$

We have shown that two linearly dependent eigenvectors must have identical eigenvalues. We will not show it here, but it is not difficult to extend this argument to the general case: eigenvectors from distinct eigenvalues are linearly independent.

We can conclude that if an $(n \times n)$ matrix has $n$ distinct eigenvalues then it has $n$ linearly independent eigenvectors. By the invertible matrix theorem, we can therefore conclude that its matrix of eigenvectors $X$ is invertible, and hence the matrix is diagonalisable.

Example

Determine the characteristic equation of each of the following matrices and identify which are diagonalisable:

$$A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \qquad C = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}.$$

Solution

$A$ has characteristic polynomial $\lambda^2 - 1 = (\lambda - 1)(\lambda + 1)$. It has two distinct eigenvalues $\lambda_1 = 1$, $\lambda_2 = -1$ and therefore two independent eigenvectors. It is diagonalisable, and we already calculated $A = X\Lambda X^{-1}$ in the example of Section 21.2.

$B$ has characteristic polynomial $\lambda^2 - 2\lambda = \lambda(\lambda - 2)$. It has two distinct eigenvalues $\lambda_1 = 0$, $\lambda_2 = 2$ and therefore has two independent eigenvectors and is diagonalisable.

$C$ has characteristic polynomial $\lambda^2 - 2\lambda + 1 = (\lambda - 1)^2$. It has a single eigenvalue $\lambda_1 = 1$. To determine if $C$ is diagonalisable, we need to check if there are two independent eigenvectors in the eigenspace of $\lambda_1$:

$$\operatorname{Null}(C - \lambda_1I) = \operatorname{Null}\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \operatorname{span}\left\{\begin{pmatrix} 1 \\ 0 \end{pmatrix}\right\}.$$

$C$ has only one linearly independent eigenvector and therefore it is not diagonalisable.

Exercise 21.3

  1. Diagonalise the matrix $B$ in the example above.

  2. Find a diagonalisable matrix with only one distinct eigenvalue.

Attention

Diagonalisability is not related to invertibility. Non-invertible matrices can be diagonalisable, as in $B$ above. Likewise, non-diagonalisable matrices can be invertible, as in $C$ above.

21.6. Algebraic and Geometric Multiplicity#

The eigenvalues of an $(n \times n)$ square matrix are the roots of its characteristic polynomial $f(\lambda)$. We have already seen that if there are $n$ distinct roots then there are $n$ linearly independent eigenvectors and the matrix is diagonalisable. For example, a $(3 \times 3)$ matrix with the following characteristic polynomial is diagonalisable:

$$f(\lambda) = (\lambda - 1)(\lambda - 2)(\lambda - 3).$$

If there are fewer than $n$ distinct roots then at least one of the roots must be repeated. For example, the characteristic polynomial

$$f(\lambda) = (\lambda - 2)(\lambda - 1)^2 \tag{21.4}$$

results in two distinct eigenvalues $\lambda_1 = 2$ and $\lambda_2 = 1$. $\lambda_2$ is a repeated root with multiplicity 2 since the factor $(\lambda - 1)$ divides the polynomial twice. To determine whether the matrix is diagonalisable, we need to determine how many independent eigenvectors there are in the eigenspace of each of the eigenvalues.

Theorem

Let $A$ be a square matrix and $\lambda$ an eigenvalue of $A$. Then

$$\text{geometric multiplicity of } \lambda \;\leq\; \text{algebraic multiplicity of } \lambda,$$

where the algebraic multiplicity of $\lambda$ is its multiplicity as a root of the characteristic polynomial of $A$, and the geometric multiplicity of $\lambda$ is the number of independent eigenvectors in the eigenspace of $\lambda$.

This means that for (21.4) there could be one or two independent eigenvectors in the eigenspace of $\lambda_2$. If there are two, then the matrix is diagonalisable; if only one, then it is not diagonalisable. To check, we have to calculate the null space of $A - \lambda_2I$.
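By the rank-nullity theorem, the geometric multiplicity of $\lambda$ equals $n - \operatorname{rank}(A - \lambda I)$, which is easy to compute. A minimal NumPy sketch using the non-diagonalisable matrix $C$ from the earlier example:

```python
import numpy as np

C = np.array([[1.0, 1.0],
              [0.0, 1.0]])
lam = 1.0       # the only eigenvalue of C; algebraic multiplicity 2
n = C.shape[0]

# Geometric multiplicity = dim Null(C - lam*I) = n - rank(C - lam*I).
geometric = n - np.linalg.matrix_rank(C - lam * np.eye(n))
print(geometric)   # 1 < 2, so C is not diagonalisable
```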

21.7. Matrix Powers $A^k$#

The eigenvector matrix $X$ gives the factorisation $A = X\Lambda X^{-1}$. This factorisation is useful for computing powers of $A$ because each $X^{-1}$ multiplies with the following $X$ to give $I$:

$$A^2 = X\Lambda X^{-1}X\Lambda X^{-1} = X\Lambda I\Lambda X^{-1} = X\Lambda^2X^{-1}$$

$$A^3 = X\Lambda^2X^{-1}X\Lambda X^{-1} = X\Lambda^3X^{-1}$$

and so on. Because $\Lambda$ is a diagonal matrix, its powers $\Lambda^k$ are easy to calculate.

Theorem

Let $A$ be a diagonalisable square matrix with $A = X\Lambda X^{-1}$ and let $k \in \mathbb{N}$. Then

$$A^k = X\Lambda^kX^{-1} = X\begin{pmatrix} \lambda_1^k & & \\ & \ddots & \\ & & \lambda_n^k \end{pmatrix}X^{-1}.$$
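A quick numerical check of this formula against repeated multiplication (`np.linalg.matrix_power`), with an arbitrary diagonalisable matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
k = 5

eigenvalues, X = np.linalg.eig(A)

# A^k = X Lambda^k X^{-1}; the k-th power of a diagonal matrix is elementwise.
Ak = X @ np.diag(eigenvalues**k) @ np.linalg.inv(X)

print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True
```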

21.8. Complex Eigenvalues#

We have seen that from a square matrix $A$ we can calculate the characteristic polynomial $f(\lambda)$. For an $(n \times n)$ matrix the polynomial has degree $n$:

$$f(\lambda) = \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0,$$

where the $a_i$ are real numbers.

By the fundamental theorem of algebra, $f(\lambda)$ can be factorised into $n$ factors $(\lambda - \lambda_i)$ (some of which may be repeated):

$$f(\lambda) = (\lambda - \lambda_1)(\lambda - \lambda_2)\cdots(\lambda - \lambda_n).$$

The roots $\lambda_i$ are the eigenvalues of the matrix.

In this section we consider the case where some of the roots are not real numbers.

21.8.1. Rotations in 2D#

Let $A$ be the matrix of an anticlockwise rotation by $\pi/2$ around the origin:

$$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$

The characteristic polynomial is $\det(A - \lambda I)$, which equals

$$\begin{vmatrix} -\lambda & -1 \\ 1 & -\lambda \end{vmatrix} = \lambda^2 + 1.$$

The polynomial $\lambda^2 + 1$ does not have real roots. Its roots are $\pm i$, where $i$ is the imaginary unit $\sqrt{-1}$:

$$\lambda^2 + 1 = (\lambda + i)(\lambda - i)$$

resulting in two complex eigenvalues $\lambda_1 = i$ and $\lambda_2 = -i$.

We also find that the eigenvectors contain the imaginary number $i$:

$$A - \lambda_1I = \begin{pmatrix} -i & -1 \\ 1 & -i \end{pmatrix} \xrightarrow{\text{RREF}} \begin{pmatrix} 1 & -i \\ 0 & 0 \end{pmatrix}$$

and hence the eigenvector corresponding to the eigenvalue $\lambda_1 = i$ is

$$v_1 = \begin{pmatrix} i \\ 1 \end{pmatrix}.$$

Likewise, the eigenvector corresponding to the eigenvalue $\lambda_2 = -i$ is

$$v_2 = \begin{pmatrix} -i \\ 1 \end{pmatrix}.$$

Example

Find the eigenvalues of an anticlockwise rotation by an angle $\theta$.

Solution

The matrix

$$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$$

represents an anticlockwise rotation by an angle $\theta$. The characteristic polynomial $f(\lambda)$ is given by

$$\det(A - \lambda I) = \begin{vmatrix} \cos\theta - \lambda & -\sin\theta \\ \sin\theta & \cos\theta - \lambda \end{vmatrix} = (\cos\theta - \lambda)^2 + \sin^2\theta.$$

Setting this to zero and solving for $\lambda$ gives the eigenvalues

$$\lambda = \cos\theta \pm i\sin\theta = e^{\pm i\theta}.$$
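`np.linalg.eig` handles complex eigenvalues automatically, so the result $\lambda = e^{\pm i\theta}$ can be checked directly (the angle is an arbitrary choice):

```python
import numpy as np

theta = 0.7   # an arbitrary rotation angle
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

eigenvalues, _ = np.linalg.eig(A)
print(eigenvalues)   # approximately cos(0.7) +/- i sin(0.7)

# Compare (sorted by imaginary part) with e^{-i theta} and e^{+i theta}.
computed = sorted(eigenvalues, key=lambda z: z.imag)
expected = [np.exp(-1j * theta), np.exp(1j * theta)]
print(np.allclose(computed, expected))   # True
```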

Exercise 21.4

Find the complex eigenvectors corresponding to the two complex eigenvalues of

$$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}.$$

21.9. Trace and Determinant#

Calculating eigenvalues is (in general) a difficult problem. However, in some cases we can use some ‘tricks’ to help find them.

Definition

The trace of a matrix is the sum of its diagonal entries. Given a matrix $A$ with entries $a_{ij}$:

$$\operatorname{tr}(A) = a_{11} + \cdots + a_{nn}.$$

Theorem

Let $A$ be an $(n \times n)$ matrix with eigenvalues $\lambda_1, \ldots, \lambda_n$. Then

$$\lambda_1 + \cdots + \lambda_n = \operatorname{tr}(A)$$

and

$$\lambda_1\lambda_2\cdots\lambda_n = \det(A).$$

The sum of the eigenvalues is the sum of the diagonal entries of $A$. The product of the eigenvalues is the determinant of $A$.
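Both identities are easy to verify numerically for any matrix; a minimal sketch with an arbitrary symmetric example:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # eigenvalues 3 and 1

eigenvalues = np.linalg.eigvals(A)

print(np.isclose(eigenvalues.sum(), np.trace(A)))        # True: sum equals trace (4)
print(np.isclose(eigenvalues.prod(), np.linalg.det(A)))  # True: product equals det (3)
```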

Example

Calculate the eigenvalues of the matrix

$$A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}.$$

Solution

This matrix is clearly not invertible, and so has zero determinant and at least one zero eigenvalue.

In fact there are two independent eigenvectors in the zero eigenspace (check this!). This means that we have $\lambda_1 = \lambda_2 = 0$. To find the third eigenvalue, use the identity

$$\lambda_1 + \lambda_2 + \lambda_3 = \operatorname{tr}(A)$$

to determine that $\lambda_3 = 3$.

Exercise 21.5

Show that the eigenvalues of a triangular matrix are its diagonal entries.

21.10. Solutions to Exercises#

Solution to Exercise 21.1

First find the eigenvalues and eigenvectors of $A$. The characteristic polynomial is given by

$$f(\lambda) = \det(A - \lambda I) = \lambda^2 - \lambda - 2 = (\lambda + 1)(\lambda - 2),$$

therefore the two eigenvalues are $\lambda_1 = -1$ and $\lambda_2 = 2$.

To find the corresponding eigenvectors, calculate the null space of $A - \lambda I$ for each eigenvalue. For $\lambda_1$:

$$A + I = \begin{pmatrix} \frac{3}{2} & \frac{3}{2} \\ \frac{3}{2} & \frac{3}{2} \end{pmatrix} \xrightarrow{\text{RREF}} \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$$

Therefore $v_1 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$ is an eigenvector corresponding to the eigenvalue $\lambda_1 = -1$.

For $\lambda_2$:

$$A - 2I = \begin{pmatrix} -\frac{3}{2} & \frac{3}{2} \\ \frac{3}{2} & -\frac{3}{2} \end{pmatrix} \xrightarrow{\text{RREF}} \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix}$$

Therefore $v_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$ is an eigenvector corresponding to the eigenvalue $\lambda_2 = 2$.

The matrix of eigenvectors is

$$X = \begin{pmatrix} | & | \\ v_1 & v_2 \\ | & | \end{pmatrix} = \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}$$

and its inverse

$$X^{-1} = \begin{pmatrix} -\frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{pmatrix}.$$

The matrix of eigenvalues is

$$\Lambda = \begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix}$$

and so

$$A = X\Lambda X^{-1} = \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} -\frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{pmatrix}.$$

Solution to Exercise 21.3

1. Eigenvalues are $\lambda_1 = 0$, $\lambda_2 = 2$.

$$B - \lambda_1I = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \xrightarrow{\text{RREF}} \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$$

Therefore $v_1 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$.

$$B - \lambda_2I = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} \xrightarrow{\text{RREF}} \begin{pmatrix} 1 & -1 \\ 0 & 0 \end{pmatrix}$$

Therefore $v_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$.

$$X = \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}$$

and

$$X^{-1} = \begin{pmatrix} -\frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{pmatrix}.$$

$$B = X\Lambda X^{-1} = \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} -\frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{pmatrix}.$$

The eigenvalue matrix $\Lambda$ has a zero on the diagonal because one of the eigenvalues is zero.

2. A matrix of the form

$$\begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}$$

has characteristic polynomial

$$(a - \lambda)^2$$

and therefore a single repeated eigenvalue $\lambda = a$.

But it is diagonalisable (it is diagonalised by the identity matrix). In fact, every nonzero vector $v \in \mathbb{R}^2$ is an eigenvector of the matrix.