
Section 6.3 Diagonalization

An \(n \times n\) matrix \(A\) is typically similar to a great many other matrices. In this section, we will focus on matrices that are similar to diagonal matrices, and we do so because the action of a diagonal matrix is so straightforward.

Subsection 6.3.1 Diagonalizable Matrices

If \(A\) is similar to a diagonal matrix \(D\text{,}\) then \(A = PDP^{-1}\) for some invertible matrix \(P\text{.}\) Such a factorization of \(A\) encodes much of the information about the eigenvalues and eigenvectors of \(A\text{,}\) and it also allows us to raise \(A\) to integer powers rather easily.
If \(D\) is diagonal, then powers of \(D\) are easy to compute. Consider the following matrices \(D\) and \(D^2\text{:}\)
\begin{equation*} D = \begin{bmatrix} -3 \amp 0 \\ 0 \amp 4 \end{bmatrix}, \hspace{6pt} D^2 = \begin{bmatrix} 9 \amp 0 \\ 0 \amp 16 \end{bmatrix}\text{.} \end{equation*}
In general, the many zeros in a diagonal matrix make the powers of that matrix easy to calculate. For this specific \(D\text{,}\) we have, for any integer \(k \ge 1\text{,}\)
\begin{equation*} D^k = \begin{bmatrix} (-3)^k \amp 0 \\ 0 \amp 4^k \end{bmatrix}\text{.} \end{equation*}
Given the number of calculations that are usually involved in matrix multiplication, this is a huge savings in computing time.
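More generally, if \(D\) is a diagonal matrix with diagonal entries \(d_1, \ldots, d_n\text{,}\) then the same reasoning shows that, for any integer \(k \ge 1\text{,}\)
\begin{equation*} D^k = \begin{bmatrix} d_1^k \amp \amp \\ \amp \ddots \amp \\ \amp \amp d_n^k \end{bmatrix}\text{.} \end{equation*}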
Now, if \(A\) is similar to a diagonal matrix \(D\text{,}\) we find related behavior. Suppose that \(A = PDP^{-1}\text{.}\) Then
\begin{align*} A^2 \amp = (PDP^{-1})(PDP^{-1}) = PD(P^{-1}P)DP^{-1}\\ \amp = PD(I)DP^{-1} = PD^2P^{-1}\text{.} \end{align*}
Since \(A^3\) can be written as \(AA^2\text{,}\) we have \(A^3 = PD^3P^{-1}\text{,}\) and induction extends this pattern to every integer \(k \ge 1\text{:}\)
\begin{equation*} A^k = PD^kP^{-1}\text{.} \end{equation*}
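For instance, here is the inductive step spelled out: if we assume that \(A^{k-1} = PD^{k-1}P^{-1}\text{,}\) then
\begin{align*} A^k \amp = AA^{k-1} = (PDP^{-1})(PD^{k-1}P^{-1})\\ \amp = PD(P^{-1}P)D^{k-1}P^{-1} = PD^kP^{-1}\text{.} \end{align*}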
Perhaps we have convinced a skeptical reader that there are some advantages when \(A\) is similar to a diagonal matrix. This is worthy of a formal definition.

Definition 6.3.1.

A matrix \(A \in M_n(\ff)\) is diagonalizable if \(A = PDP^{-1}\) for some invertible matrix \(P\) and some diagonal matrix \(D\text{.}\)
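For instance, every diagonal matrix is itself diagonalizable, since we may take \(P\) to be the identity matrix:
\begin{equation*} D = IDI^{-1}\text{.} \end{equation*}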
Given this definition, it is natural to ask exactly when a matrix is diagonalizable. That answer comes in the following theorem.

Theorem 6.3.2.

A matrix \(A \in M_n(\ff)\) is diagonalizable if and only if \(A\) has \(n\) linearly independent eigenvectors. In this case, \(A = PDP^{-1}\text{,}\) where the columns of \(P\) are \(n\) linearly independent eigenvectors of \(A\) and the diagonal entries of \(D\) are the corresponding eigenvalues of \(A\text{.}\)

Proof.

If \(P\) is an \(n\times n\) matrix with columns \(\bfv_1, \ldots, \bfv_n\text{,}\) and if \(D\) is a diagonal matrix with diagonal entries \(\lambda_1, \ldots, \lambda_n\text{,}\) then we have
\begin{equation} AP = A \begin{bmatrix} \bfv_1 \amp \cdots \amp \bfv_n \end{bmatrix} = \begin{bmatrix} A\bfv_1 \amp \cdots \amp A\bfv_n \end{bmatrix}\text{,}\tag{6.4} \end{equation}
and also
\begin{equation} PD = \begin{bmatrix} \lambda_1\bfv_1 \amp \cdots \amp \lambda_n\bfv_n \end{bmatrix}\text{.}\tag{6.5} \end{equation}
(If the reader has trouble believing (6.5), it may help to view each column of the product as a linear combination of the columns of \(P\text{,}\) with weights coming from the corresponding column of \(D\text{.}\)) If \(A\) is diagonalizable, then \(A = PDP^{-1}\text{,}\) and therefore \(AP = PD\text{.}\) Equating the columns of \(AP\) and \(PD\) using (6.4) and (6.5), we see that \(A\bfv_i = \lambda_i\bfv_i\) for \(1 \le i \le n\text{.}\) Since \(P\) is invertible, its columns are linearly independent; in particular, no column of \(P\) is the zero vector, so each \(\lambda_i\) is an eigenvalue of \(A\) with eigenvector \(\bfv_i\text{.}\) This proves one direction of the theorem.
If we are given \(\bfv_1, \ldots, \bfv_n\) as eigenvectors of \(A\) with corresponding eigenvalues \(\lambda_1, \ldots, \lambda_n\text{,}\) then we can form the matrices \(P\) and \(D\text{.}\) The argument in the previous paragraph shows that \(AP = PD\text{.}\) (Note that we have not yet used the linear independence of the eigenvectors!) If the eigenvectors are linearly independent, then \(P\) is invertible, and \(AP = PD\) implies \(A = PDP^{-1}\text{,}\) making \(A\) diagonalizable.

Note 6.3.3.

Theorem 6.3.2 says that \(A\) is diagonalizable if and only if there is a basis of \(\ff^n\) consisting of eigenvectors of \(A\text{.}\) We call such a basis an eigenvector basis of \(\ff^n\text{.}\)

Subsection 6.3.2 How to Diagonalize a Matrix

Using Theorem 6.3.2, we see there are four steps to diagonalizing a matrix. We will summarize them in the following algorithm.

Algorithm 6.3.4.

To diagonalize an \(n \times n\) matrix \(A\text{:}\)
  1. Find the eigenvalues of \(A\text{.}\)
  2. Find a basis for each eigenspace of \(A\text{.}\)
  3. If these bases together contain fewer than \(n\) vectors, then \(A\) is not diagonalizable; otherwise, form the matrix \(P\) whose columns are these \(n\) linearly independent eigenvectors.
  4. Form the diagonal matrix \(D\) whose diagonal entries are the eigenvalues of \(A\) corresponding, in order, to the columns of \(P\text{.}\)
After forming \(P\) and \(D\text{,}\) it is a good idea to check that the process was successful. We may verify the equation \(A = PDP^{-1}\text{,}\) or alternatively we may check that \(AP = PD\text{.}\) (This avoids the need to find \(P^{-1}\text{.}\))

Example 6.3.5.

We consider the matrix \(A \in M_3(\rr)\text{:}\)
\begin{equation*} A = \begin{bmatrix} -2 \amp 2 \amp -2 \\ -4 \amp 4 \amp -2 \\ -6 \amp 3 \amp -1 \end{bmatrix}\text{.} \end{equation*}
Though the reader can determine this independently, we will provide the characteristic polynomial to save time and space:
\begin{equation*} p_A(\lambda) = -(\lambda+3)(\lambda-2)^2\text{.} \end{equation*}
From this we see that the eigenvalues of \(A\) are \(\lambda = -3\) and \(\lambda = 2\text{.}\)
We now find bases for the associated eigenspaces. Again, we will suppress all of the calculations since the previous sections have gone through these in some detail. We find that
\begin{equation*} \mathrm{eig}_{-3}(A) = \spn \left\{ \begin{bmatrix} 2 \\ 2 \\ 3 \end{bmatrix} \right\}\text{,} \end{equation*}
and
\begin{equation*} \mathrm{eig}_{2}(A) = \spn \left\{ \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 2 \end{bmatrix} \right\}\text{.} \end{equation*}
The two vectors spanning \(\mathrm{eig}_{2}(A)\) are linearly independent of each other, and eigenvectors corresponding to distinct eigenvalues are linearly independent, so we have three linearly independent eigenvectors in total. We know from Theorem 6.3.2 that \(A\) is diagonalizable.
We now form the matrices \(P\) and \(D\) according to the algorithm:
\begin{equation*} P = \begin{bmatrix} 2 \amp 1 \amp -1 \\ 2 \amp 2 \amp 0 \\ 3 \amp 0 \amp 2 \end{bmatrix}, \hspace{6pt} D = \begin{bmatrix} -3 \amp 0 \amp 0 \\ 0 \amp 2 \amp 0 \\ 0 \amp 0 \amp 2 \end{bmatrix}\text{.} \end{equation*}
We can check that our diagonalization was successful by calculating \(AP\) and \(PD\text{:}\)
\begin{align*} AP \amp = \begin{bmatrix} -2 \amp 2 \amp -2 \\ -4 \amp 4 \amp -2 \\ -6 \amp 3 \amp -1 \end{bmatrix} \begin{bmatrix} 2 \amp 1 \amp -1 \\ 2 \amp 2 \amp 0 \\ 3 \amp 0 \amp 2 \end{bmatrix} = \begin{bmatrix} -6 \amp 2 \amp -2 \\ -6 \amp 4 \amp 0 \\ -9 \amp 0 \amp 4 \end{bmatrix}\\ PD \amp = \begin{bmatrix} 2 \amp 1 \amp -1 \\ 2 \amp 2 \amp 0 \\ 3 \amp 0 \amp 2 \end{bmatrix} \begin{bmatrix} -3 \amp 0 \amp 0 \\ 0 \amp 2 \amp 0 \\ 0 \amp 0 \amp 2 \end{bmatrix} = \begin{bmatrix} -6 \amp 2 \amp -2 \\ -6 \amp 4 \amp 0 \\ -9 \amp 0 \amp 4 \end{bmatrix}\text{.} \end{align*}
We now consider another example of a \(3\times 3\) matrix.

Example 6.3.6.

We consider the matrix \(A \in M_3(\rr)\text{:}\)
\begin{equation*} A = \begin{bmatrix} 0 \amp -6 \amp -4 \\ 5 \amp -11 \amp -6 \\ -6 \amp 9 \amp 4 \end{bmatrix}\text{.} \end{equation*}
The characteristic polynomial for \(A\) is \(p_A(\lambda) = -(\lambda + 2)^2(\lambda +3)\text{.}\) So the eigenvalues of \(A\) are \(\lambda = -2\) and \(\lambda = -3\text{.}\)
When we look for eigenvectors, we find the following for \(A + 2I\text{:}\)
\begin{equation*} A + 2I = \begin{bmatrix} 2 \amp -6 \amp -4 \\ 5 \amp -9 \amp -6 \\ -6 \amp 9 \amp 6 \end{bmatrix} \sim \begin{bmatrix} 1 \amp 0 \amp 0 \\ 0 \amp 1 \amp \frac{2}{3} \\ 0 \amp 0 \amp 0 \end{bmatrix}\text{.} \end{equation*}
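The reduced matrix has only one free variable, so the eigenspace is spanned by a single vector:
\begin{equation*} \mathrm{eig}_{-2}(A) = \spn \left\{ \begin{bmatrix} 0 \\ -2 \\ 3 \end{bmatrix} \right\}\text{.} \end{equation*}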
This shows that \(\mathrm{eig}_{-2}(A)\) is only one-dimensional. To diagonalize \(A\text{,}\) we need three linearly independent eigenvectors in total, and \(\mathrm{eig}_{-3}(A)\) can contribute only one, so we would need \(\mathrm{eig}_{-2}(A)\) to be two-dimensional. Therefore \(A\) is not diagonalizable.
The difference between the last two examples shows that diagonalizability is subtle. There are times when we can tell whether a matrix is diagonalizable without much work, but sometimes we need to get all the way to the eigenspace calculation before having an answer. The following theorem states a situation in which diagonalizability is easier to confirm.

Theorem 6.3.7.

If \(A \in M_n(\ff)\) has \(n\) distinct eigenvalues, then \(A\) is diagonalizable.

Proof.

If \(A\) has \(n\) distinct eigenvalues, let \(\bfv_1, \ldots, \bfv_n\) be eigenvectors corresponding to those eigenvalues. Then, by Theorem 6.1.13, the set \(\{\bfv_1, \ldots, \bfv_n \}\) is linearly independent. Since this is a set of \(n\) linearly independent vectors in \(\ff^n\text{,}\) it is a basis of \(\ff^n\) consisting of eigenvectors of \(A\text{,}\) so Theorem 6.3.2 shows that \(A\) is diagonalizable.

Example 6.3.8.

We consider the following matrix \(A \in M_3(\rr)\text{:}\)
\begin{equation*} A = \begin{bmatrix} 1 \amp 0 \amp 0 \\ -4 \amp -2 \amp 0 \\ 3 \amp -1 \amp 5 \end{bmatrix}\text{.} \end{equation*}
Since \(A\) is lower triangular, we can read the eigenvalues off of the main diagonal: \(\lambda = 1, -2, 5\text{.}\) Since this \(3\times 3\) matrix has three distinct eigenvalues, Theorem 6.3.7 says that \(A\) is diagonalizable.
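Although we have not computed any eigenvectors (and therefore do not have a matrix \(P\)), we already know one possible diagonal matrix \(D\text{;}\) ordering the eigenvalues as above, we have
\begin{equation*} D = \begin{bmatrix} 1 \amp 0 \amp 0 \\ 0 \amp -2 \amp 0 \\ 0 \amp 0 \amp 5 \end{bmatrix}\text{.} \end{equation*}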

Note 6.3.9.

Having \(n\) distinct eigenvalues is a sufficient condition for a matrix to be diagonalizable, but it is not necessary. In other words, a matrix can still be diagonalizable with fewer than \(n\) distinct eigenvalues. We have already encountered this in Example 6.3.5.
The following theorem collects some facts about the diagonalizability of a matrix. (We omit the proof.)

Theorem 6.3.10.

Let \(A \in M_n(\ff)\) have distinct eigenvalues \(\lambda_1, \ldots, \lambda_p\text{.}\)
  1. For each \(i\text{,}\) the dimension of the eigenspace \(\mathrm{eig}_{\lambda_i}(A)\) is at least one and at most the multiplicity of \(\lambda_i\) as a root of the characteristic polynomial of \(A\text{.}\)
  2. The matrix \(A\) is diagonalizable if and only if the sum of the dimensions of its eigenspaces equals \(n\text{.}\)
  3. If \(A\) is diagonalizable, then combining bases for the eigenspaces of \(A\) produces an eigenvector basis of \(\ff^n\text{.}\)
In the previous theorem, we used the dimension of an eigenspace several times. This is worthy of a definition.

Definition 6.3.11.

The geometric multiplicity of an eigenvalue \(\lambda\) of a matrix \(A\) is the dimension of the eigenspace \(\mathrm{eig}_{\lambda}(A)\text{.}\)
The reader will now be able to phrase some of the results contained in Theorem 6.3.10 in terms of how the (algebraic) multiplicity and geometric multiplicity of the eigenvalues of a matrix compare to each other. For instance, in Example 6.3.5 the eigenvalue \(\lambda = 2\) has algebraic multiplicity 2 and geometric multiplicity 2, while in Example 6.3.6 the eigenvalue \(\lambda = -2\) has algebraic multiplicity 2 but geometric multiplicity only 1.

Subsection 6.3.3 Linear Transformations and Diagonalizability

In Subsection 5.5.2 we saw that, for linear transformations between finite-dimensional vector spaces, we could view these transformations as multiplication by a matrix if we were content to handle coordinate vectors. And while we didn’t have the current terminology at that point, in Section 5.6 we were calculating coordinate matrices for linear transformations using similarity. (See Example 5.6.10.)
This means that our discussion of similar matrices has implications for linear transformations broadly. And these implications are, unsurprisingly, related to eigenvalues and eigenvectors.

Definition 6.3.12.

Let \(V\) be a finite-dimensional vector space and let \(T \in L(V)\text{.}\) Then \(T\) is diagonalizable if there exists a basis \(\mcb\) of \(V\) such that \([T]_{\mcb}\) is diagonal.
Based on our discussion thus far in this section, the reader may guess that the vectors in the basis \(\mcb\) referenced in Definition 6.3.12 are eigenvectors for \(T\text{.}\) What is true for matrices is (generally) true in the proper context for linear transformations.

Theorem 6.3.13.

Let \(V\) be a finite-dimensional vector space and let \(T \in L(V)\text{.}\) Then \(T\) is diagonalizable if and only if there is a basis of \(V\) consisting of eigenvectors of \(T\text{.}\)

Example 6.3.14.

In Example 5.6.10, we considered the linear transformation on \(\rr^2\) which is reflection across the line \(y = \frac{1}{2}x\text{.}\) In that example, we looked at the basis \(\mcb = \{\bfv_1, \bfv_2 \}\) for \(\rr^2\text{,}\) where
\begin{equation*} \bfv_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}, \hspace{12pt} \bfv_2 = \begin{bmatrix} -1 \\ 2 \end{bmatrix}\text{.} \end{equation*}
We saw that the matrix \([T]_{\mcb}\) was diagonal, and now we know that was because the basis vectors are eigenvectors for \(T\text{.}\) Since \(\bfv_1\) lies on the line of reflection, it is an eigenvector for \(T\) with eigenvalue 1, and since \(\bfv_2\) lies on the line perpendicular to the line of reflection, it is an eigenvector for \(T\) with eigenvalue \(-1\text{.}\) The matrix \([T]_{\mcb}\) is
\begin{equation*} [T]_{\mcb} = \begin{bmatrix} 1 \amp 0 \\ 0 \amp -1 \end{bmatrix}\text{.} \end{equation*}
The following result is basically a restatement of Corollary 5.6.9, using the language of similar matrices.

Theorem 6.3.15.

Let \(V\) be a finite-dimensional vector space, let \(T \in L(V)\text{,}\) and let \(\mcb\) and \(\mcc\) be bases of \(V\text{.}\) Then \([T]_{\mcb}\) and \([T]_{\mcc}\) are similar matrices.

Proof.

If \(P\) is the change-of-coordinates matrix between the bases \(\mcb\) and \(\mcc\text{,}\) then Corollary 5.6.9 gives \([T]_{\mcc} = P^{-1}[T]_{\mcb}P\text{.}\) This says exactly that \([T]_{\mcb}\) and \([T]_{\mcc}\) are similar.

The final result in this section brings several prior results together, tying the diagonalizability of linear transformations and matrices to each other in a predictable way.

Theorem 6.3.16.

Let \(V\) be a finite-dimensional vector space with basis \(\mcb\text{,}\) and let \(T \in L(V)\text{.}\) Then \(T\) is diagonalizable if and only if \([T]_{\mcb}\) is diagonalizable.
This theorem says that a linear transformation \(T\) is diagonalizable if there is a basis of \(V\) with respect to which the coordinate matrix of \(T\) is diagonalizable.
We finish this section with an example which is perhaps a bit contrived but which is also, hopefully, illustrative.

Example 6.3.17.

Let \(T:P_1 \to P_1\) be the following linear transformation:
\begin{equation*} T(a + bt) = (a-4b) + (3a-6b)t\text{.} \end{equation*}
If \(\mcb\) is the standard basis for \(P_1\text{,}\) then \([T]_{\mcb}\) is
\begin{equation*} [T]_{\mcb} = \begin{bmatrix} 1 \amp -4 \\ 3 \amp -6 \end{bmatrix}\text{.} \end{equation*}
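Indeed, applying \(T\) to the standard basis vectors of \(P_1\) gives
\begin{equation*} T(1) = 1 + 3t, \hspace{6pt} T(t) = -4 - 6t\text{,} \end{equation*}
and the coordinate vectors of these images are exactly the columns of \([T]_{\mcb}\text{.}\)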
It is fairly easy to determine that \([T]_{\mcb}\) is diagonalizable, since the characteristic polynomial is
\begin{equation*} \lambda^2+5\lambda + 6 = (\lambda+2)(\lambda + 3)\text{.} \end{equation*}
This polynomial has two distinct roots, so \([T]_{\mcb}\) has two distinct eigenvalues, and Theorem 6.3.7 confirms that \([T]_{\mcb}\) is diagonalizable. Since \([T]_{\mcb}\) is diagonalizable, \(T\) is a diagonalizable linear transformation.
Using coordinate vectors, we can also determine the basis \(\mcc\) of \(P_1\) with respect to which \(T\) has a diagonal coordinate matrix. (It is a basis of eigenvectors of \(T\text{!}\))
Since the eigenvalues of \([T]_{\mcb}\) are \(\lambda = -2, -3\text{,}\) we can find bases for the related eigenspaces. For ease of notation, let \([T]_{\mcb} = A\text{.}\) Now
\begin{equation*} \mathrm{eig}_{-2}(A) = \spn \left\{ \begin{bmatrix} 4 \\ 3 \end{bmatrix} \right\}, \hspace{6pt} \mathrm{eig}_{-3}(A) = \spn \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right\}\text{.} \end{equation*}
These are the coordinate vectors for the eigenvectors of \(T\) with respect to the standard basis. Therefore, an eigenvector basis of \(P_1\) is
\begin{equation*} \mcc = \{ 4 + 3t, 1 + t \}\text{,} \end{equation*}
and \([T]_{\mcc}\) is a diagonal matrix with diagonal entries \(-2\) and \(-3\text{.}\)
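As a check, we can apply \(T\) directly to the vectors in \(\mcc\text{:}\)
\begin{align*} T(4+3t) \amp = (4-12) + (12-18)t = -8 - 6t = -2(4+3t)\text{,}\\ T(1+t) \amp = (1-4) + (3-6)t = -3 - 3t = -3(1+t)\text{.} \end{align*}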

Reading Questions 6.3.4 Reading Questions

1.

Consider the following matrix \(A \in M_2(\rr)\text{:}\)
\begin{equation*} A = \begin{bmatrix} -1 \amp 2 \\ 3 \amp 4 \end{bmatrix}\text{.} \end{equation*}
  1. Find the characteristic polynomial and the eigenvalues of \(A\text{.}\) Show your work.
  2. Using only the previous part (this means you should make no additional calculations), explain why \(A\) is diagonalizable.

Exercises 6.3.5 Exercises

1.

Let \(P, D \in M_2(\rr)\) be the following matrices:
\begin{equation*} P = \begin{bmatrix} 2 \amp 5 \\ 1 \amp 3 \end{bmatrix}, \hspace{6pt} D = \begin{bmatrix} -2 \amp 0 \\ 0 \amp -1 \end{bmatrix}\text{.} \end{equation*}
If \(A = PDP^{-1}\text{,}\) calculate \(A^4\text{.}\)

2.

Consider the following matrix \(A \in M_2(\rr)\text{:}\)
\begin{equation*} A = \begin{bmatrix} -5 \amp -3 \\ 6 \amp 4 \end{bmatrix}\text{.} \end{equation*}
Determine whether or not \(A\) is diagonalizable. If it is, diagonalize it. If it isn’t, explain why it isn’t.

3.

Consider the following matrix \(A \in M_2(\rr)\text{:}\)
\begin{equation*} A = \begin{bmatrix} 3 \amp 1 \\ -1 \amp 5 \end{bmatrix}\text{.} \end{equation*}
Determine whether or not \(A\) is diagonalizable. If it is, diagonalize it. If it isn’t, explain why it isn’t.
Answer.
This matrix is not diagonalizable. There is only one eigenvalue, \(\lambda = 4\text{,}\) and its eigenspace is only one-dimensional.

4.

Consider the following matrix \(A \in M_3(\rr)\text{:}\)
\begin{equation*} A = \begin{bmatrix} 14 \amp 3 \amp 12 \\ -2 \amp 3 \amp -2 \\ -7 \amp -1 \amp -5 \end{bmatrix}\text{.} \end{equation*}
Determine whether or not \(A\) is diagonalizable. If it is, diagonalize it. If it isn’t, explain why it isn’t. (Hint: One of the eigenvalues of \(A\) is \(\lambda = 5\text{.}\))

5.

Consider the following matrix \(A \in M_3(\rr)\text{:}\)
\begin{equation*} A = \begin{bmatrix} -9 \amp -8 \amp -16 \\ -4 \amp -5 \amp -8 \\ 4 \amp 4 \amp 7 \end{bmatrix}\text{.} \end{equation*}
Determine whether or not \(A\) is diagonalizable. If it is, diagonalize it. If it isn’t, explain why it isn’t. (Hint: One of the eigenvalues of \(A\) is \(\lambda = -1\text{.}\))
Answer.
This matrix is diagonalizable using the following matrices \(P\) and \(D\text{:}\)
\begin{equation*} P = \begin{bmatrix} -1 \amp -2 \amp -2 \\ 1 \amp 0 \amp -1 \\ 0 \amp 1 \amp 1 \end{bmatrix}, \hspace{6pt} D = \begin{bmatrix} -1 \amp 0 \amp 0 \\ 0 \amp -1 \amp 0 \\ 0 \amp 0 \amp -5 \end{bmatrix}\text{.} \end{equation*}

6.

Suppose that \(A \in M_4(\rr)\) has three distinct eigenvalues. One eigenspace is one-dimensional and one of the other eigenspaces is two-dimensional. Is it possible for \(A\) not to be diagonalizable? Explain.
Answer.
No, that is not possible. The matrix \(A\) must be diagonalizable. The dimension of the eigenspace that is not yet specified must be at least one. The sum of all dimensions of the eigenspaces must be at most four (since \(A\) is \(4\times 4\)), and the given information tells us that this sum will be exactly four. This means that \(A\) is diagonalizable.

7.

Consider the following matrix \(A \in M_2(\rr)\text{:}\)
\begin{equation*} A = \begin{bmatrix} 4 \amp -1 \\ -4 \amp 4 \end{bmatrix}\text{.} \end{equation*}
Show that a diagonalization \(A = PDP^{-1}\) is not unique by finding two pairs of matrices \((P,D)\) which diagonalize \(A\text{.}\)
Answer.
Here is one pair of matrices that diagonalizes \(A\text{:}\)
\begin{equation*} P = \begin{bmatrix} 1 \amp -1 \\ 2 \amp 2 \end{bmatrix}, \hspace{6pt} D = \begin{bmatrix} 2 \amp 0 \\ 0 \amp 6 \end{bmatrix}\text{.} \end{equation*}
We can make subtle manipulations to these matrices to find another pair which diagonalizes \(A\text{:}\)
\begin{equation*} P = \begin{bmatrix} -1 \amp 1 \\ 2 \amp 2 \end{bmatrix}, \hspace{6pt} D = \begin{bmatrix} 6 \amp 0 \\ 0 \amp 2 \end{bmatrix}\text{.} \end{equation*}
There are many, many other pairs of matrices that diagonalize \(A\text{.}\) For instance, scaling any column of \(P\) by a nonzero constant produces another valid choice of \(P\text{,}\) since any nonzero scalar multiple of an eigenvector is still an eigenvector.

Writing Exercises

8.
Prove that if \(A\) is both invertible and diagonalizable, then so is \(A^{-1}\text{.}\)
9.
Let \(A \in M_n(\ff)\text{.}\) Prove that if \(A\) has \(n\) linearly independent eigenvectors, then so does \(A^T\text{.}\)
Solution.
If \(A\) has \(n\) linearly independent eigenvectors, then it is diagonalizable by Theorem 6.3.2. This means there exist an invertible matrix \(P\) and a diagonal matrix \(D\) such that \(A = PDP^{-1}\text{.}\) If we take the transpose of both sides of this equation, we get
\begin{equation*} A^T = (PDP^{-1})^T = (P^{-1})^TD^TP^T\text{.} \end{equation*}
Since \(D\) is diagonal, \(D^T=D\text{.}\) Also, Exercise 3.3.5.8 tells us that \((P^{-1})^T=(P^T)^{-1}\text{.}\) So, we have
\begin{equation*} A^T = (P^T)^{-1}DP^T\text{.} \end{equation*}
This proves that \(A^T\) is diagonalizable, and Theorem 6.3.2 allows us to conclude that \(A^T\) has \(n\) linearly independent eigenvectors.
10.
This problem explores the relationship between invertibility and diagonalizability.
  1. Construct a nonzero \(2\times 2\) matrix which is invertible but not diagonalizable.
  2. Construct a nondiagonal \(2\times 2\) matrix that is diagonalizable but not invertible.
11.
Let \(T:\rr^3 \to \rr^3\) be projection onto the \(xy\)-plane. Prove that \(T\) is diagonalizable.
12.
Let \(T:\rr^2 \to \rr^2\) be orthogonal projection onto the line \(y = -6x\text{.}\) Prove that \(T\) is diagonalizable.