Chapter 10 Eigenvalues and Eigenvectors

Some linear transformations have “natural directions” associated with them.

Example 10.1 Let \[ A = \begin{pmatrix} 1 & 1 \\ 2 & 0 \end{pmatrix}. \] We can see that \[ A \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 2 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \end{pmatrix}. \] It follows that \[ A\mathbf{u}_1 = 2 \mathbf{u}_1, \qquad \text{where }\quad\mathbf{u}_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \] Moreover, we have \[ A \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 2 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} -1 \\ 2 \end{pmatrix}, \] and so \[ A\mathbf{u}_2 = -\mathbf{u}_2, \qquad \text{where }\quad \mathbf{u}_2 = \begin{pmatrix} 1 \\ -2 \end{pmatrix}. \]
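These two relations are easy to check numerically. The following sketch (an illustration using NumPy, not part of the original example) verifies that multiplying by \(A\) simply scales each of the two vectors:

```python
import numpy as np

A = np.array([[1, 1],
              [2, 0]])
u1 = np.array([1, 1])
u2 = np.array([1, -2])

print(A @ u1)   # [2 2]   =  2 * u1
print(A @ u2)   # [-1  2] = -1 * u2
```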

The above example shows that a matrix can have associated vectors which, when multiplied by the matrix on the left, are simply scaled by a factor. This motivates the following definition.

Definition 10.1 (Eigenvalues and Eigenvectors) Let \(A\) be an \(n\times n\) matrix. Then a non-zero vector \(\mathbf{u}\) is said to be an eigenvector of \(A\) if there exists a scalar \(\lambda\) such that \[ A\mathbf{u} = \lambda \mathbf{u}. \] The scalar \(\lambda\) is called the eigenvalue associated with \(\mathbf{u}\).

Example 10.2 In Example 10.1, we have that \(\mathbf{u}_1 = \left(\begin{smallmatrix}1\\1\end{smallmatrix}\right)\) is an eigenvector associated with an eigenvalue \(\lambda_1 = 2\), and that \(\mathbf{u}_2 = \left(\begin{smallmatrix}1\\-2\end{smallmatrix}\right)\) is an eigenvector associated with an eigenvalue \(\lambda_2 = -1\).

We claim that any vector in \(\mathbb{R}^2\) may be written as a linear combination of \(\mathbf{u}_1\) and \(\mathbf{u}_2\). That is, for all \(\mathbf{v}\), there exist scalars \(x_1, x_2\) such that \[ x_1 \mathbf{u}_1 + x_2 \mathbf{u}_2 = \mathbf{v}. \] Writing \(\mathbf{x} = \left(\begin{smallmatrix}x_1\\ x_2\end{smallmatrix}\right)\), we may express the above equation as \(B \mathbf{x} = \mathbf{v}\), where \(B = (\mathbf{u}_1 \ \mathbf{u}_2)\), i.e. the columns of the matrix \(B\) are \(\mathbf{u}_1\) and \(\mathbf{u}_2\). Note that \[ \det(B) = \begin{vmatrix} 1 & 1 \\ 1 & -2 \end{vmatrix} = -2 -1 = -3, \] which is non-zero, hence the matrix is invertible, and so the equation \(B \mathbf{x} = \mathbf{v}\) has the (unique) solution \(\mathbf{x} = B^{-1} \mathbf{v}\). The matrix \(B\) being invertible is equivalent to saying that \(\mathbf{u}_1\) and \(\mathbf{u}_2\) are linearly independent.
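As a quick numerical illustration (a sketch only, with \(\mathbf{v}=\left(\begin{smallmatrix}2\\11\end{smallmatrix}\right)\) chosen arbitrarily), solving \(B\mathbf{x}=\mathbf{v}\) recovers the coordinates \(x_1, x_2\):

```python
import numpy as np

B = np.array([[1, 1],
              [1, -2]])        # columns are u1 and u2
v = np.array([2, 11])

x = np.linalg.solve(B, v)      # solve B x = v for the coordinates
print(x)                       # [ 5. -3.], i.e. v = 5*u1 - 3*u2
```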

Having shown that any vector \(\mathbf{v}\) may be written as a (unique) linear combination of \(\mathbf{u}_1\) and \(\mathbf{u}_2\), we apply \(A\) to both sides of this equation and deduce \[\begin{align*} A\mathbf{v} &= A(x_1 \mathbf{u}_1 + x_2 \mathbf{u}_2)\\ &= x_1 A\mathbf{u}_1 + x_2 A\mathbf{u}_2\\ &= \lambda _1 x_1 \mathbf{u}_1 + \lambda_2 x_2 \mathbf{u}_2, \end{align*}\] where we use the facts \(A\mathbf{u}_1 = \lambda_1 \mathbf{u}_1\) and \(A\mathbf{u}_2 = \lambda_2 \mathbf{u}_2\) for the last equality. Hence if we write a vector \(\mathbf{v}\) in terms of the eigenvectors \(\mathbf{u}_1\) and \(\mathbf{u}_2\), it is simple to calculate the result of multiplying by the matrix \(A\): we simply multiply these components by their eigenvalues. That is, we just stretch/compress by a factor \(\lambda_i\) in the \(\mathbf{u}_i\) direction (and reverse direction if \(\lambda_i\) is negative).

10.1 Matrix Diagonalisation

Definition 10.2 (Diagonal Matrix) A square matrix is said to be a diagonal matrix if its non-diagonal entries are zero. That is, a matrix of the form

\[ D = \begin{pmatrix} \lambda_1 & 0 & \dotsb & 0\\ 0 & \lambda_2 & \dotsb & 0\\ \vdots & & \ddots & \vdots\\ 0 & 0 & \dotsb & \lambda_n \end{pmatrix} \]

for any scalars \(\lambda_1,\dotsc,\lambda_n\).

Consider the diagonal matrix \[ D = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & -1 \end{pmatrix}. \]

Note that this acts very simply on a vector \(\mathbf{x}=\left(\begin{smallmatrix}x_1\\x_2\end{smallmatrix}\right)\):

\[ D\mathbf{x}= \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \begin{pmatrix}x_1\\x_2\end{pmatrix} =\begin{pmatrix} \lambda_1 x_1\\ \lambda_2 x_2 \end{pmatrix} = \begin{pmatrix} 2 x_1\\ -x_2 \end{pmatrix} \]

it simply multiplies each component of the vector by the corresponding diagonal entry. This is reminiscent of the action of \(A\) on a vector written in terms of the eigenvectors \(\mathbf{u}_1\) and \(\mathbf{u}_2\) above.

We’ll now look at an example to understand the meaning of the term “diagonalising a matrix”.

Example 10.3 Suppose that we write a vector \(\mathbf{v} = x_1 \mathbf{u}_1 + x_2 \mathbf{u}_2\) in terms of the eigenvectors \(\mathbf{u}_1\) and \(\mathbf{u}_2\) from Example 10.2. We say that \(\mathbf{v}\) has the coordinate vector \[\mathbf{x}_{\mathcal{P}}=\begin{pmatrix}x_1\\x_2\end{pmatrix}\] with respect to the vectors\(^1\) \(\mathcal{P}=[\mathbf{u}_1,\mathbf{u}_2]\) – these vectors take the place of the usual vectors \(\mathcal{E}=[\mathbf{e}_1,\mathbf{e}_2]\) (although note that the vectors in \(\mathcal{P}\) are not unit vectors here). We may write \(\mathbf{v}\) as \[ \mathbf{v} = P \mathbf{x}_\mathcal{P}, \] where \(P = (\mathbf{u}_1\; \mathbf{u}_2)\) is the matrix whose columns are the vectors \(\mathbf{u}_1\) and \(\mathbf{u}_2\). For example, consider the vector \[ \mathbf{v} = 5 \mathbf{u}_1 - 3 \mathbf{u}_2= 2 \mathbf{e}_1 + 11 \mathbf{e}_2. \] Then \(\mathbf{v}\) has coordinate vector \(\mathbf{x}_\mathcal{P}=\left(\begin{smallmatrix}5\\-3\end{smallmatrix}\right)\) and \[ \mathbf{v} = P \begin{pmatrix} 5 \\ -3 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -2 \end{pmatrix} \begin{pmatrix} 5 \\ -3 \end{pmatrix} = \begin{pmatrix} 2\\ 11\end{pmatrix}. \]

We have seen that \(A\) acts very simply in these coordinates: \[ A\mathbf{v}=\lambda _1 x_1 \mathbf{u}_1 + \lambda_2 x_2 \mathbf{u}_2, \] and writing this result as a coordinate vector with respect to \(\mathcal{P}\) we would have \[ (A\mathbf{v})_\mathcal{P}= \begin{pmatrix} \lambda_1 x_1\\ \lambda_2 x_2 \end{pmatrix}. \] We can see that \(A\) acts like a diagonal matrix in the coordinates \(\mathcal{P}\). If we define \[D=\begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix},\] then \(D\) acts on the coordinate vector as \[D\mathbf{x}_\mathcal{P}= \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \begin{pmatrix} x_1\\ x_2 \end{pmatrix} =\begin{pmatrix} \lambda_1 x_1\\ \lambda_2 x_2 \end{pmatrix}.\]

Since the matrix \(P\) converts between coordinates \(\mathcal{P}\) and standard coordinates \(\mathcal{E}\), that is \(\mathbf{v}=P\mathbf{x}_\mathcal{P}\), we can see that \(P^{-1}\) converts in the other direction \(\mathbf{x}_\mathcal{P}=P^{-1}\mathbf{v}\). We can use this to write \(A\) as

\[A=PDP^{-1}.\]

This decomposes \(A\mathbf{v}\) into three operations \(PDP^{-1}\mathbf{v}\):

  1. Apply \(P^{-1}\) to convert \(\mathbf{v}\) to \(P^{-1}\mathbf{v}=\mathbf{x}_\mathcal{P}\);
  2. Now apply the diagonal matrix \(D\) to \(\mathbf{x}_\mathcal{P}\), which simply multiplies each component by the eigenvalue \(\lambda_i\);
  3. Finally, convert back to standard coordinates by applying \(P\) to give the result \(A\mathbf{v}\).

We call the matrix \(D\) the diagonalisation of the matrix \(A\). Note that \(D\) can be obtained by multiplying the above equation by \(P^{-1}\) on the left and \(P\) on the right:

\[P^{-1}AP=P^{-1}PDP^{-1}P=D.\]

(using \(P^{-1}P=I\)).
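As a sanity check, here is a small numerical sketch (using the matrices of the running example) confirming both \(A=PDP^{-1}\) and \(D=P^{-1}AP\):

```python
import numpy as np

A = np.array([[1, 1],
              [2, 0]])
P = np.array([[1, 1],
              [1, -2]])        # columns are the eigenvectors u1, u2
D = np.diag([2, -1])           # the corresponding eigenvalues

P_inv = np.linalg.inv(P)
print(np.allclose(P @ D @ P_inv, A))   # True: A = P D P^{-1}
print(np.allclose(P_inv @ A @ P, D))   # True: D = P^{-1} A P
```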

Theorem 10.1 (Diagonalisation) Let \(A\) be an \(n\times n\) matrix with \(n\) eigenvectors \(\mathbf{u}_1, \dots, \mathbf{u}_n\) with corresponding eigenvalues \(\lambda_1, \dotsc, \lambda_n\). If the matrix \(P=(\mathbf{u}_1 \dotsb \mathbf{u}_n)\) is invertible (in other words, the eigenvectors are linearly independent) then \[ D = P^{-1}A P, \] where \(D\) is the diagonal matrix whose \(i\)-th diagonal entry is \(\lambda_i\). We say that \(A\) is diagonalisable.

It is worth noting here that it is not always possible to diagonalise a matrix. We shall see an example later.

10.2 Finding Eigenvalues

Suppose \(\mathbf{u}\) is an eigenvector of a matrix \(A\) with associated eigenvalue \(\lambda\). This means that \(\mathbf{u}\) is a non-zero vector such that \(A\mathbf{u} = \lambda \mathbf{u}\). Equally, \[\begin{array}{rrcl} & \lambda \mathbf{u} - A \mathbf{u} &=& \mathbf{0} \\ \Longleftrightarrow\qquad & \lambda I_n \mathbf{u} - A\mathbf{u} &=& \mathbf{0} \\ \Longleftrightarrow\qquad & (\lambda I_n-A) \mathbf{u} &=& \mathbf{0}. \end{array}\] Setting \(B_{\lambda} = \lambda I_n - A\), we have that \(\lambda\) is an eigenvalue of \(A\) if, and only if, \(B_{\lambda}\mathbf{u} = \mathbf{0}\) has a non-zero solution \(\mathbf{u}\). Recall that for a general square matrix \(M\), the matrix–vector system \(M \mathbf{v} = \mathbf{b}\) has a unique solution for \(\mathbf{v}\) if, and only if, \(\det(M) \neq 0\). Note the system \(B_{\lambda} \mathbf{v} = \mathbf{0}\) always has at least one solution, namely \(\mathbf{v} = \mathbf{0}\), so the system has a non-zero solution if, and only if, it has more than one solution, which is equivalent to \(\det(B_{\lambda}) = 0\). We use this fact to find eigenvalues, as shown in the following example.

Example 10.4 (Finding Eigenvalues) Let \[ A = \begin{pmatrix} 1 & 1 \\ 2 & 0 \end{pmatrix}. \] Then the matrix \(B_{\lambda} = \lambda I_{2} - A\) is given as \[ B_{\lambda} = \begin{pmatrix} \lambda - 1 & -1 \\ -2 & \lambda \end{pmatrix}. \] Then we have \[\begin{align*} \det(B_{\lambda}) &= \lambda(\lambda-1) - 2 \\ &= \lambda^2 - \lambda - 2 \\ &= (\lambda - 2)(\lambda + 1). \end{align*}\] Thus, \(\lambda\) is an eigenvalue of \(A\) if and only if \[ 0 = (\lambda - 2)(\lambda + 1), \] and equivalently if and only if \(\lambda = 2\) or \(\lambda = -1\). Hence, the eigenvalues of \(A\) are \[ \lambda_1 = 2 \qquad\text{and}\qquad \lambda_2 = -1. \]
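In practice one rarely factorises the characteristic polynomial by hand. The following sketch (illustrative only) finds the same eigenvalues numerically, either as roots of \(\lambda^2-\lambda-2\) or directly from the matrix:

```python
import numpy as np

A = np.array([[1, 1],
              [2, 0]])

# Roots of the characteristic polynomial lambda^2 - lambda - 2
print(np.roots([1, -1, -2]))    # 2 and -1

# Or directly from the matrix (order of the eigenvalues may differ)
print(np.linalg.eigvals(A))     # 2 and -1
```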

Note that, in the above example, \(\det(B_\lambda)\) is a polynomial of degree 2 in the variable \(\lambda\). We can generalise this observation.

Definition 10.3 (Characteristic Polynomial) Let \(A\) be an \(n\times n\) matrix. Then the function \[ p_A(\lambda)=\det(\lambda I_n - A) \] is a polynomial of degree \(n\) in the variable \(\lambda\). We call this polynomial \(p_A\) the characteristic polynomial of \(A\).

So in general, we can find the eigenvalues of a square matrix \(A\) by finding the roots of the characteristic polynomial \(p_A\), which is obtained from computing the determinant \(\det(\lambda I_n - A)\).

10.3 Finding Eigenvectors

Example 10.5 (Finding Eigenvectors) Consider again \[ A = \begin{pmatrix} 1 & 1 \\ 2 & 0 \end{pmatrix}. \] We know from above that \(\lambda_1 = 2\) and \(\lambda_2 = -1\) are the eigenvalues of \(A\). We first find an eigenvector corresponding to \(\lambda_1\). Any non-zero solution \(\mathbf{v} = \left(\begin{smallmatrix}v_1\\v_2\end{smallmatrix}\right)\) of the equation \(B_2 \mathbf{v} = \mathbf{0}\) is such an eigenvector. For \(\lambda = \lambda_1 = 2\), we have \[ B_2 = \begin{pmatrix} 1 & -1 \\ -2 & 2 \end{pmatrix}. \] We solve \(B_2 \mathbf{v} = \mathbf{0}\) via Gaussian elimination. The augmented matrix for \(B_2 \mathbf{v} = \mathbf{0}\) is given by \[ \left(\begin{array}{rr|r} 1 & -1 & 0 \\ -2 & 2 & 0 \end{array}\right). \] We perform the ERO \(R_2 \to R_2 + 2R_1\) to obtain \[ \left(\begin{array}{rr|r} 1 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right). \] From this, we see that \(v_2\) is arbitrary, say \(v_2 = \alpha\), and \(v_1 = v_2 = \alpha\). So, the eigenvectors corresponding to the eigenvalue \(\lambda_1 = 2\) are given by the set \[ \alpha \begin{pmatrix} 1 \\ 1 \end{pmatrix} \] for any non-zero value of \(\alpha\). By a similar process, we find an eigenvector corresponding to \(\lambda_2 = -1\). Any non-zero solution \(\mathbf{v} = \left(\begin{smallmatrix}v_1\\v_2\end{smallmatrix}\right)\) of the equation \(B_{-1} \mathbf{v} = \mathbf{0}\) is such an eigenvector. We have \[ B_{-1} = \begin{pmatrix} -2 & -1 \\ -2 & -1\end{pmatrix}, \] and again solve \(B_{-1} \mathbf{v} = \mathbf{0}\) via Gaussian elimination to give the eigenvectors corresponding to \(\lambda_2 = -1\) as \[ \beta \begin{pmatrix} 1 \\ -2\end{pmatrix} \] for any non-zero value of \(\beta\).

Note that there are infinitely many eigenvectors corresponding to an eigenvalue, because if a non-zero vector \(\mathbf{v}\) satisfies \(B_{\lambda} \mathbf{v} = \mathbf{0}\), then any multiple of \(\mathbf{v}\), i.e. \(\mu\mathbf{v}\) for any \(\mu \neq 0\) satisfies the equation, too, as we have \(B_{\lambda} (\mu\mathbf{v}) = \mu (B_{\lambda}{\mathbf{v}}) = \mu\mathbf{0} = \mathbf{0}\).

In order to obtain specific eigenvectors, we need only set \(\alpha\) and \(\beta\) to some non-zero value; we usually take \(\alpha=1\) and \(\beta=1\). Hence two suitable eigenvectors of \(A\) are \[ \mathbf{u}_1=\begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad\text{ and }\quad \mathbf{u}_2=\begin{pmatrix} 1 \\ -2\end{pmatrix}. \]
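A numerical routine returns the same directions, only normalised. In the sketch below (illustrative only), each column of the returned matrix is a unit-length multiple of \(\mathbf{u}_1\) or \(\mathbf{u}_2\):

```python
import numpy as np

A = np.array([[1, 1],
              [2, 0]])

eigvals, eigvecs = np.linalg.eig(A)   # eigenvectors are the columns of eigvecs
print(eigvals)                        # 2 and -1 (order may differ)
print(eigvecs)                        # columns are multiples of (1, 1) and (1, -2), up to order and sign
```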

With the following example, we demonstrate how to find the eigenvalues and eigenvectors for a \(3\times 3\) matrix.

Example 10.6 Let \(A\) be given by \[ A = \begin{pmatrix} 1 & 0 & -1 \\ 1 & 2 & 1 \\ 2 & 2 & 3 \end{pmatrix}. \] Then \(A\) has characteristic polynomial \[\begin{align*} p_A(\lambda) &= \det(\lambda I_3 - A) \\ &= \begin{vmatrix} \lambda - 1 & 0 & 1 \\ -1 & \lambda - 2 & -1 \\ -2 & -2 & \lambda - 3 \end{vmatrix}. \end{align*}\] We have: \[\begin{align*} p_A(\lambda) &= (\lambda - 1) \begin{vmatrix} \lambda -2 & -1 \\ -2 & \lambda -3 \end{vmatrix} + \begin{vmatrix} -1 & \lambda -2 \\ -2 & -2 \end{vmatrix}\\ &= (\lambda - 1)\big( (\lambda - 2)(\lambda - 3) - 2 \big) + (2 + 2(\lambda - 2)) \\ &= (\lambda - 1)(\lambda^2 - 5 \lambda + 4) + 2(\lambda - 1) \\ &= (\lambda - 1)(\lambda^2 - 5 \lambda + 6) \\ &= (\lambda - 1)(\lambda - 2)(\lambda - 3). \end{align*}\] Therefore, we find that \(A\) has eigenvalues \(\lambda_1 = 1\), \(\lambda_2 = 2\) and \(\lambda_3 = 3\).

It remains to find the eigenvectors of \(A\). Recall that, for each eigenvalue \(\lambda\), this is equivalent to finding all non-zero solutions \(\mathbf{v} = (v_1,v_2,v_3)\) to the equation \(B_{\lambda} \mathbf{v} = \mathbf{0}\), where \(B_{\lambda} = \lambda I_3 - A\). Here, we have \[ B_{\lambda} = \begin{pmatrix} \lambda - 1 & 0 & 1 \\ -1 & \lambda - 2 & -1 \\ -2 & -2 & \lambda - 3 \end{pmatrix}. \]

  1. We find the eigenvectors for the eigenvalue \(\lambda_1 = 1\). For \(\lambda = \lambda_1 = 1\), we have \[ B_{1} = \begin{pmatrix} 0 & 0 & 1 \\ -1 & -1 & -1 \\ -2 & -2 & -2 \end{pmatrix}, \] which we may treat as the augmented matrix representing \(B_{1} \mathbf{v} = \mathbf{0}\), with the zero right-hand-side column omitted. We perform Gaussian elimination. First, we apply EROs \(R_3 \to R_3 - 2 R_2\) and \(R_2 \to -R_2\) to obtain \[ \begin{pmatrix} 0 & 0 & 1 \\ 1 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \] and then we apply \(R_1 \leftrightarrow R_2\) to have \[ \begin{pmatrix} 1 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \] and finally \(R_1 \to R_1 - R_2\) to arrive at \[ \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}. \] From this we see that \(v_2\) can be chosen arbitrarily, say \(v_2 = \alpha\). Then, in view of \(v_1 = -v_2\) and \(v_3 = 0\), we have the set of eigenvectors corresponding to the eigenvalue \(\lambda_1 = 1\) given as \[ \alpha \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} \] for any non-zero value of \(\alpha\). For instance, taking \(\alpha = 1\) gives the specific eigenvector \[ \mathbf{u}_1 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}. \]

  2. We find the eigenvectors for the eigenvalue \(\lambda_2 = 2\). For \(\lambda = \lambda_2 = 2\), we have \[ B_{2} = \begin{pmatrix} 1 & 0 & 1 \\ -1 & 0 & -1 \\ -2 & -2 & -1 \end{pmatrix}. \] As above, we use Gaussian elimination to find non-zero solutions \(\mathbf{v}\) to the equation \(B_{2} \mathbf{v} = \mathbf{0}\). This gives the set of eigenvectors \[ \beta \begin{pmatrix} -2 \\ 1 \\ 2 \end{pmatrix} \] For instance, taking \(\beta = 1\), we find the specific eigenvector \[ \mathbf{u}_2 = \begin{pmatrix} -2 \\ 1 \\ 2 \end{pmatrix}. \]

  3. We find the eigenvectors for the eigenvalue \(\lambda_3 = 3\). For \(\lambda = \lambda_3 = 3\), we have \[ B_{3} = \begin{pmatrix} 2 & 0 & 1 \\ -1 & 1 & -1 \\ -2 & -2 & 0 \end{pmatrix}. \] As above, we use Gaussian elimination to find non-zero solutions \(\mathbf{v}\) to the equation \(B_{3} \mathbf{v} = \mathbf{0}\). This gives the set of eigenvectors \[ \gamma \begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix} \] for any non-zero value of \(\gamma\). For instance, taking \(\gamma = -1\), we find the specific eigenvector \[ \mathbf{u}_3 = \begin{pmatrix} 1 \\ -1 \\ -2 \end{pmatrix}. \] (A numerical check of all three eigenpairs is sketched below.)
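All three eigenpairs can be confirmed numerically. The sketch below (illustrative only) checks that \(A\mathbf{v}=\lambda\mathbf{v}\) holds for each eigenvalue/eigenvector pair returned by NumPy:

```python
import numpy as np

A = np.array([[1, 0, -1],
              [1, 2,  1],
              [2, 2,  3]])

eigvals, eigvecs = np.linalg.eig(A)
print(np.round(eigvals.real, 6))      # 1, 2 and 3 (possibly in a different order)

# Each column is a unit-length eigenvector, proportional to
# (-1, 1, 0), (-2, 1, 2) or (-1, 1, 2) (up to order and sign).
for lam, v in zip(eigvals, eigvecs.T):
    print(np.allclose(A @ v, lam * v))   # True for every pair
```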

It also makes sense to consider vectors and matrices with complex number entries. These arise in applications such as analysis of electronic circuits and in quantum mechanics. We won’t consider such situations here, but we will now see that even if we start with vectors and matrices with real entries, complex number eigenvalues and eigenvectors with complex entries can still arise. We can immediately see that this could be possible, since the eigenvalues are calculated as roots of the characteristic polynomial and we know that real polynomials can have complex roots.

Example 10.7 (Complex Eigenvalues and Eigenvectors) Consider the matrix

\[ A= \begin{pmatrix} 0&-1\\ 1&0 \end{pmatrix} \]

The matrix \(A\) has the effect of rotating vectors in the plane anticlockwise by \(\pi/2\) (it is the rotation matrix \(R_{\pi/2}\) seen in an earlier example), so we can see geometrically that it can’t have any real eigenvectors, since these would correspond to fixed directions in the plane that are scaled by an eigenvalue.

The characteristic polynomial is

\[p_A(\lambda)=\lambda^2+1\]

which has roots \(\lambda_1=i\) and \(\lambda_2=-i\). We can also find the eigenvectors in the usual way by solving \((A-\lambda I)\mathbf{v}=\mathbf{0}\). For \(\lambda_1\), \[ A-iI= \begin{pmatrix} -i&-1\\ 1&-i \end{pmatrix}, \] and performing \(R_1\to iR_1\) gives \[ \begin{pmatrix} 1&-i\\ 1&-i \end{pmatrix}, \] so the corresponding eigenvectors are \(\alpha\left(\begin{smallmatrix}1\\-i\end{smallmatrix}\right)\). Taking \(\alpha=1\),

\[ \mathbf{u}_1= \begin{pmatrix} 1\\ -i \end{pmatrix} \]

Similarly, we find an eigenvector for \(\lambda_2\): \[ \mathbf{u}_2= \begin{pmatrix} 1\\ i \end{pmatrix} \]

Let \[ P=\begin{pmatrix} 1&1\\ -i&i \end{pmatrix} \] then \[ P^{-1}= \frac{1}{2i} \begin{pmatrix} i&-1\\ i&1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1&i\\ 1&-i \end{pmatrix} \] and letting \[ D= \begin{pmatrix} i&0\\ 0&-i \end{pmatrix} \] we may write \[ A=PDP^{-1}. \] Although \(P^{-1}\), \(D\) and \(P\) all contain complex numbers, their product leaves only real numbers and we get back to the real matrix \(A\) (check that \(PDP^{-1}\) does indeed equal \(A\)).
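NumPy handles the complex arithmetic directly, so this factorisation can also be verified numerically (a sketch only, writing `1j` for \(i\)):

```python
import numpy as np

A = np.array([[0, -1],
              [1,  0]])
P = np.array([[1, 1],
              [-1j, 1j]])       # columns are u1 = (1, -i) and u2 = (1, i)
D = np.diag([1j, -1j])          # eigenvalues i and -i

# The complex factors recombine into the original real matrix A
print(np.allclose(P @ D @ np.linalg.inv(P), A))   # True
```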

Complex roots of real polynomials always come in complex conjugate pairs and so we will have complex conjugate eigenvalue pairs and corresponding complex conjugate eigenvector pairs. To see this, if \(p_B(\lambda)=\lambda^n+a_{n-1}\lambda^{n-1}+\dotsb+a_1\lambda+a_0\) is a (real) characteristic polynomial of a real \(n\times n\) matrix \(B\) and \(\lambda\) is a complex root, then taking the complex conjugate \[\begin{align*} 0=p_B(\lambda)=\overline{p_B(\lambda)}&=\overline{\lambda^n}+\overline{a_{n-1}\lambda^{n-1}}+\dotsb+\overline{a_1\lambda}+\overline{a_0}\\ &=\overline{\lambda}^n+a_{n-1}\overline{\lambda}^{n-1}+\dotsb+a_1\overline{\lambda}+{a_0} \end{align*}\] (using that the coefficients \(a_j\) are real) shows that \(\overline{\lambda}\) is also a root. Let \(\mathbf{u}\) be an eigenvector corresponding to \(\lambda\). Then, \[ B\overline{\mathbf{u}}=\overline{B}\overline{\mathbf{u}}=\overline{B\mathbf{u}}=\overline{\lambda \mathbf{u}}=\overline{\lambda}\overline{\mathbf{u}} \] (using that \(B\) is real in the first step) i.e. the vector \(\overline{\mathbf{u}}\) is an eigenvector with eigenvalue \(\overline{\lambda}\).

As we mentioned earlier, not all matrices are diagonalisable. Here is an example.

Example 10.8 (A non-diagonalisable matrix) Consider the matrix \[ A=\begin{pmatrix} 3&1\\ 0&3 \end{pmatrix}. \] The characteristic polynomial of \(A\) is \[ p_A(\lambda)=\det(\lambda I-A)= \begin{vmatrix} \lambda-3&-1\\ 0&\lambda-3 \end{vmatrix} =(\lambda-3)^2, \] so the only eigenvalue of \(A\) is \(3\). Now \[ A-3I= \begin{pmatrix} 0&1\\ 0&0 \end{pmatrix}, \] so the eigenvectors are \[ \alpha \begin{pmatrix} 1\\ 0 \end{pmatrix} \] for any non-zero \(\alpha\). So we do not have two linearly independent eigenvectors, since all eigenvectors are multiples of \(\mathbf{u}=\left(\begin{smallmatrix}1\\0\end{smallmatrix}\right)\). Therefore, we cannot diagonalise this matrix.
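The defect shows up numerically as well: a routine still returns two eigenvector columns, but they are (numerically) parallel, so no invertible \(P\) can be formed. A small sketch (illustrative only):

```python
import numpy as np

A = np.array([[3, 1],
              [0, 3]])

eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)                        # [3. 3.] -- a repeated eigenvalue
print(eigvecs)                        # both columns are (numerically) multiples of (1, 0)
print(abs(np.linalg.det(eigvecs)))    # ~0: the columns are not linearly independent
```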

We briefly mention that there are other useful matrix decompositions, beyond diagonalisation, that work for every square matrix. The generalisation of diagonalisation is the Jordan Normal Form, which is an important theoretical tool. Another key technique is the Singular Value Decomposition, which is important in numerical applications of matrices and in data science (see Principal Component Analysis).

10.4 Powers of Diagonalisable Matrices

One of the most important applications of diagonalisation is in computing powers of matrices. If a matrix \(A\) is diagonalisable, we can write it as

\[A=PDP^{-1}\]

Now if we want to compute \(A^2\), we have \[A^2=PDP^{-1}PDP^{-1}=PD^2P^{-1},\] where we used \(P^{-1}P=I\). More generally, \[ A^n=PDP^{-1}PDP^{-1}\dotsb PDP^{-1}=PD^nP^{-1} \] for any natural number \(n\). By the rules of matrix multiplication, the \(n\)-th power of the diagonal matrix \[ D= \begin{pmatrix} \lambda_1&&0\\ &\ddots&\\ 0&&\lambda_m \end{pmatrix} \] is simply \[ D^n = \begin{pmatrix} \lambda_1^n&&0\\ &\ddots&\\ 0&&\lambda_m^n \end{pmatrix}. \] This gives us an efficient way to compute \(A^n\).

Example 10.9 (Powers of a diagonalisable matrix) Consider the matrix \[ A= \begin{pmatrix} 8&-6\\ 3&-1 \end{pmatrix}. \]

Following the method above, we find eigenvalues \(2\) and \(5\) with corresponding eigenvectors \(\left(\begin{smallmatrix}1\\1\end{smallmatrix}\right)\) and \(\left(\begin{smallmatrix}2\\1\end{smallmatrix}\right)\), so that

\[A=PDP^{-1}= \begin{pmatrix} 1&2\\ 1&1 \end{pmatrix} \begin{pmatrix} 2&0\\ 0&5 \end{pmatrix} \begin{pmatrix} -1& 2\\ 1& -1 \end{pmatrix}.\]

Now using

\[ A^n=PD^nP^{-1} \] we have \[ A^n= \begin{pmatrix} 1&2\\ 1&1 \end{pmatrix} \begin{pmatrix} 2^n&0\\ 0&5^n \end{pmatrix} \begin{pmatrix} -1& 2\\ 1& -1 \end{pmatrix} = \begin{pmatrix} -2^n+2\times 5^n& 2^{n+1}-2\times 5^n\\ -2^{n}+5^n& 2^{n+1}-5^n \end{pmatrix}. \]

We now have a nice expression that works for any \(n\), and this is much easier than calculating \(A^n\) directly by repeated matrix multiplication.
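The closed-form expression can be spot-checked numerically for a particular power (a sketch only, taking \(n=4\) as an example):

```python
import numpy as np

A = np.array([[8, -6],
              [3, -1]])
P = np.array([[1, 2],
              [1, 1]])
P_inv = np.array([[-1, 2],
                  [1, -1]])

n = 4
# The closed-form entries derived above
An_formula = np.array([[-2**n + 2*5**n, 2**(n+1) - 2*5**n],
                       [-2**n + 5**n,   2**(n+1) - 5**n]])

print(np.allclose(P @ np.diag([2**n, 5**n]) @ P_inv, An_formula))   # True
print(np.allclose(np.linalg.matrix_power(A, n), An_formula))        # True
```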


  1. Technically, \(\mathcal{P}\) is what is known as a basis – a list of vectors that are linearly independent and span the space and hence can act as a coordinate system.