Eigenvalues, Eigenvectors, and Diagonalization
The eigenvectors of a matrix are those vectors that the matrix simply rescales, and the factor by which an eigenvector is rescaled is called its eigenvalue. These concepts can be used to quickly calculate large powers of matrices.
This post is part of the book Justin Math: Linear Algebra. Suggested citation: Skycak, J. (2019). Eigenvalues, Eigenvectors, and Diagonalization. In Justin Math: Linear Algebra. https://justinmath.com/eigenvalues-eigenvectors-and-diagonalization/
Suppose we want to compute a matrix raised to a large power, say the 999th power, i.e. the matrix multiplied by itself many times.
Of course, we could perform this computation by sheer brute force, multiplying out all 999 copies of the matrix one at a time, but this would take a while.
On the other hand, we could go about the multiplications in a more clever way. For example, if the matrix is $A$, then we could compute $AA=A^2$, $A^2A^2=A^4$, and so on until we get to $A^{256}A^{256}=A^{512}$, and then compute
$\begin{align*} A^{999} = A^{512}A^{256}A^{128}A^{64}A^{32}A^{4}A^{2}A. \end{align*}$
However, this would still require us to compute 16 multiplications (9 squarings to reach $A^{512}$, then 7 more to combine the 8 factors), which, although it is much better than brute force, is still an annoyingly large amount of work, especially once the numbers inside the matrices become large.
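The repeated-squaring idea above can be sketched in a few lines of Python. This is only an illustration: the helper names `mat_mul` and `power_by_squaring` are made up for this sketch, and the test matrix is an arbitrary stand-in rather than the matrix from the text. The function also counts how many matrix multiplications it performs, so the count for the exponent 999 can be checked directly.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power_by_squaring(A, exponent):
    """Return (A**exponent, number of matrix multiplications used)."""
    if exponent < 1:
        raise ValueError("exponent must be a positive integer")
    result = None   # product of the powers selected so far
    square = A      # holds A, A^2, A^4, ... in turn
    count = 0
    while True:
        if exponent & 1:              # this power of A is one of the factors
            if result is None:
                result = square       # first factor: nothing to multiply yet
            else:
                result = mat_mul(result, square)
                count += 1
        exponent >>= 1
        if exponent == 0:
            break
        square = mat_mul(square, square)   # next squaring
        count += 1
    return result, count


# Arbitrary stand-in matrix (an assumption, not the matrix from the text).
power, multiplications = power_by_squaring([[1, 1], [0, 1]], 999)
print(power, multiplications)
```

Python integers never overflow, so the entries stay exact even when they grow very large.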
Inverse Shearings and Rescalings
Fortunately, there is an even better way. First, notice that there is a way to express this matrix as a particular product of shearings and rescalings shown below.
The two shearings surrounding the rescaling are special in that they are inverses of each other:
As a result, if we multiply 999 copies of the decomposed matrix, we see that all of the shears cancel except the very first and the very last, leaving us with a product of 999 rescaling matrices in between.
But rescaling matrices are easy to multiply – we can just raise each diagonal entry to the 999th power separately! This leaves us with only 3 matrices left to multiply (the first shear, the combined rescaling, and the last shear), which isn’t too much work to do by hand.
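Here is a minimal sketch of that shortcut in Python. Because the matrix from the text is not reproduced here, it borrows the decomposition from exercise 2 below, with $P$, the diagonal entries $2$ and $6$, and $P^{-1}$ taken from that solution. The helper names are hypothetical, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power_via_diagonalization(P, eigenvalues, P_inv, n):
    """Compute A^n as P D^n P^(-1), where D has `eigenvalues` on its diagonal."""
    size = len(eigenvalues)
    # Powering a diagonal matrix only requires powering its diagonal entries.
    D_n = [[eigenvalues[i] ** n if i == j else 0 for j in range(size)]
           for i in range(size)]
    return mat_mul(mat_mul(P, D_n), P_inv)


# Decomposition of the matrix from exercise 2, with rows (6, 6) and (0, 2).
P = [[3, 2], [-2, 0]]
eigenvalues = [2, 6]
P_inv = [[0, Fraction(-1, 2)], [Fraction(1, 2), Fraction(3, 4)]]

print(power_via_diagonalization(P, eigenvalues, P_inv, 999)[1][1] == 2 ** 999)
```

Only two matrix products are carried out, no matter how large the exponent is.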
Diagonalized Form
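In order to reproduce this trick on other matrices, we need to come up with a general method for expressing a matrix $A$ in the diagonalized form
$\begin{align*} A = PDP^{-1} \end{align*}$
where $D$ is a diagonal rescaling matrix and the surrounding matrices $P$ and $P^{-1}$ are inverses of each other.
In order to solve for $P$ and $D$, it helps to right-multiply both sides of the equation by $P$ so that
$\begin{align*} AP = PD. \end{align*}$
Then, we can express $P$ in terms of its column vectors $v_i$ and $D$ in terms of its diagonal entries $\lambda_i$, and multiply.
$\begin{align*} A \begin{pmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \end{pmatrix} &= \begin{pmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_n \\ | & | & & | \end{pmatrix} \begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{pmatrix} \\ \begin{pmatrix} | & | & & | \\ Av_1 & Av_2 & \cdots & Av_n \\ | & | & & | \end{pmatrix} &= \begin{pmatrix} | & | & & | \\ \lambda_1 v_1 & \lambda_2 v_2 & \cdots & \lambda_n v_n \\ | & | & & | \end{pmatrix} \end{align*}$
We see that the problem amounts to finding pairs of vectors $v$ and scalars $\lambda$ such that
$\begin{align*} Av = \lambda v. \end{align*}$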
Eigenvectors and Eigenvalues
Such vectors $v$ are called eigenvectors of the matrix $A$, and the scalars $\lambda$ that the eigenvectors are paired with are called eigenvalues.
Essentially, the eigenvectors of a matrix are those vectors that the matrix simply rescales, and the factor by which an eigenvector is rescaled is called its eigenvalue.
There is one important constraint: the eigenvectors must be nonzero and independent, since we need to be able to compute the inverse of the matrix that has them as columns.
In order to solve for the eigenvector and eigenvalue pairs, we rearrange the equation once more, introducing the identity matrix $I$ so that we may factor out the eigenvector $v$.
$\begin{align*} Av &= \lambda v \\ Av - \lambda v &= 0 \\ Av - \lambda I v &= 0 \\ (A-\lambda I)v &= 0 \end{align*}$
Since we’re assuming $v$ is not the zero vector, the last equation tells us that some combination of not-all-zero multiples of columns of $A-\lambda I$ makes the zero vector. Consequently, the columns of $A-\lambda I$ must be dependent, and thus
$\begin{align*} \det(A-\lambda I) = 0. \end{align*}$
Finally, we have an equation that we can use to solve for $\lambda$. Then, for each solution that we find for the eigenvalue $\lambda$, we can simply substitute back into $(A-\lambda I)v=0$ to solve for the corresponding eigenvector $v$.
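For a $2 \times 2$ matrix, this procedure is short enough to sketch in Python. The sketch below rests on two observations: $\det(A-\lambda I)=0$ expands to the quadratic $\lambda^2 - (a+d)\lambda + (ad-bc) = 0$, and any nonzero vector perpendicular to a nonzero row of $A-\lambda I$ solves $(A-\lambda I)v=0$. The function name `eigenpairs_2x2` and the tolerance `tol` are assumptions made for illustration, and the sample matrix is the one from exercise 4 below.

```python
import cmath


def eigenpairs_2x2(A, tol=1e-12):
    """Return [(eigenvalue, eigenvector), ...] for a 2x2 matrix A.

    One eigenvector is returned per root of the characteristic polynomial;
    the function does not check whether the two vectors are independent.
    """
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    # Roots of lambda^2 - trace*lambda + det = 0 (possibly complex).
    root = cmath.sqrt(trace * trace - 4 * det)
    pairs = []
    for lam in ((trace - root) / 2, (trace + root) / 2):
        if abs(a - lam) > tol or abs(b) > tol:
            v = (b, lam - a)        # perpendicular to the row (a - lam, b)
        elif abs(c) > tol or abs(d - lam) > tol:
            v = (d - lam, -c)       # perpendicular to the row (c, d - lam)
        else:
            v = (1, 0)              # A - lam*I is zero: every vector works
        pairs.append((lam, v))
    return pairs


for lam, v in eigenpairs_2x2([[7, -3], [-2, 8]]):
    print(lam, v)
```

The eigenvectors come out as arbitrary multiples of the hand-computed ones, which is fine because any nonzero multiple of an eigenvector is again an eigenvector.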
Demonstration of Diagonalization
Let’s work an example. Say we want to diagonalize the matrix below.
We start by solving the equation $\det(A-\lambda I)=0$ for the eigenvalues $\lambda$.
Now that we have the eigenvalues $\lambda_1=-1$ and $\lambda_2=2$, we solve the equation $(A-\lambda I)v=0$ for corresponding eigenvectors $v_1$ and $v_2$.
At this point, one option is to write $v_1$ in terms of its components, say $v_1 = \left< s,t \right>$, and simplify the matrix equation into a linear system in $s$ and $t$.
We can simplify the system by dividing the top equation by $-3$ and the bottom equation by $6$. This reveals that the two equations are really just the same equation.
As a result, they can be reduced down to a single equation, and we can easily solve for $s$ in terms of $t$.
Substituting back into $v_1$, we have
In other words, the eigenvector $v_1$ can be chosen as any nonzero multiple of the vector $\left< -\frac{2}{3}, 1 \right>$. Intuitively, this makes sense: if $Av=\lambda v$, then any multiple $cv$ of $v$ will have the same property:
$\begin{align*} A(cv) = c(Av) = c(\lambda v) = \lambda (cv) \end{align*}$
We only need to choose a single vector for $v_1$. For the sake of simplicity, we will choose $v_1$ to be the smallest multiple of $\left< -\frac{2}{3}, 1 \right>$ that has whole-number components and a positive first component. We multiply the vector by $-3$ to reach
$\begin{align*} v_1 = \left< 2,-3 \right>. \end{align*}$
Thus, we have our first eigenvalue-eigenvector pair!
Solving for an eigenvector might seem like a bit of work, but once you go through the process several times, you’ll notice a faster method: we can simply multiply by a diagonal matrix.
The diagonal matrix represents the operations we did the long way on the system of equations: dividing the top equation by $-3$ and the bottom equation by $6$.
Then, we just have to choose $v_1$ as a vector whose dot product with $\left< 3,2 \right>$ is equal to $0$. The simplest choice is $\left< 2,-3 \right>$, and to keep the solution general, we introduce a parameter $t$ to mean that $v_1$ is any nonzero multiple of $\left< 2,-3 \right>$.
For the purposes of diagonalization, we just need one particular such vector, so we will choose the simplest case, $t=1$ (and we will implicitly assume such choice when solving for other eigenvectors).
Using this method, we reach the same eigenvalue-eigenvector pair.
Next we repeat the same process to find the second eigenvalue-eigenvector pair, this time starting with our second eigenvalue $\lambda_2 = 2$.
Now that we have our eigenvalues and eigenvectors, we can substitute them into our diagonalization.
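The result can be checked numerically. The matrix for this example is not reproduced in the text above, so the sketch below assumes it has rows $(-10, -6)$ and $(18, 11)$. That is the matrix implied by the eigenvalues $-1$ and $2$ together with the row operations described earlier (dividing the rows of $A+I$ by $-3$ and $6$ to get $3s+2t=0$), but treat it as a reconstruction. The second eigenvector $\left< 1,-2 \right>$ and the helper name `is_diagonalization` are likewise assumptions of this sketch.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_diagonalization(A, P, D, P_inv):
    """Check that P_inv really inverts P and that P D P_inv reproduces A."""
    n = len(A)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return (mat_mul(P, P_inv) == identity
            and mat_mul(mat_mul(P, D), P_inv) == A)


A = [[-10, -6], [18, 11]]     # reconstructed matrix (assumption)
P = [[2, 1], [-3, -2]]        # columns: v1 = <2, -3>, v2 = <1, -2>
D = [[-1, 0], [0, 2]]         # eigenvalues in the same order as the columns
P_inv = [[2, 1], [-3, -2]]    # this particular P happens to be its own inverse

print(is_diagonalization(A, P, D, P_inv))
```

The order of the eigenvalues in $D$ must match the order of the eigenvector columns in $P$; swapping one without the other makes the check fail.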
More Complicated Case
In this example, the eigenvalues came out to nice integer values. As we’ll see in the next example, eigenvalues and eigenvectors might be messy, involving roots or even complex numbers.
The next example will also be on a $3 \times 3$ matrix, to illustrate that the method of diagonalization is the same even for higher-dimensional matrices.
To diagonalize the matrix
we begin by computing the eigenvalues:
Then, we solve for the eigenvectors corresponding to the eigenvalues $\lambda_1 = 1$, $\lambda_2 = 1+i$, and $\lambda_3 = 1-i$.
Collecting our eigenvalues and eigenvectors, we have
We substitute the eigenvalues and eigenvectors into our diagonalization.
Then we compute $P^{-1}$.
Finally, we’re done!
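Complex eigenvalues need no special treatment when the arithmetic is done by machine. As a rough illustration, the sketch below uses Python's built-in complex numbers on a smaller stand-in matrix (not the $3 \times 3$ matrix of this example, which is not reproduced here): the matrix with rows $(1, 1)$ and $(-1, 1)$ from exercise 9, together with the $P$, eigenvalues $1 \mp i$, and $P^{-1}$ given in that solution. The helper names are hypothetical.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power_via_diagonalization(P, eigenvalues, P_inv, n):
    """Compute P D^n P^(-1); the entries may be complex numbers."""
    size = len(eigenvalues)
    D_n = [[eigenvalues[i] ** n if i == j else 0 for j in range(size)]
           for i in range(size)]
    return mat_mul(mat_mul(P, D_n), P_inv)


# Stand-in example taken from exercise 9.
P = [[1, 1], [-1j, 1j]]
eigenvalues = [1 - 1j, 1 + 1j]
P_inv = [[0.5, 0.5j], [0.5, -0.5j]]

A_squared = power_via_diagonalization(P, eigenvalues, P_inv, 2)
# The original matrix is real, so the imaginary parts should cancel
# (up to floating-point rounding).
print([[entry.real for entry in row] for row in A_squared])
print(max(abs(entry.imag) for row in A_squared for entry in row))
```

Even though $P$, $D$, and $P^{-1}$ all contain complex entries, their product is a real matrix, which is a useful sanity check on a hand computation.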
Eigenvalues with Multiple Eigenvectors
When diagonalizing some matrices such as the one below, we may end up with a single repeated eigenvalue, which corresponds to multiple independent eigenvectors.
This matrix has two distinct eigenvalues, one of which is repeated.
When we solve for the eigenvectors corresponding to the eigenvalue $\lambda = 2$, we find that the solution consists of combinations of two independent vectors.
We shall use the simplest cases, $s=1, \, t=0$ and $s=0, \, t=1$, to choose two eigenvectors corresponding to the eigenvalue $\lambda = 2$. Thus, we have two eigenvalue-eigenvector pairs!
We solve for the third eigenvector, corresponding to the eigenvalue $\lambda_3=1$, as usual.
Then, we can invert the eigenvector matrix and diagonalize.
Non-Diagonalizable
Other times, though, we may not find enough independent eigenvectors to create the matrix $P$.
In such cases, the matrix $A$ simply cannot be diagonalized (though we will later learn a different method to exponentiate such matrices without too much more work).
For an example of a non-diagonalizable matrix, consider the matrix below:
We are able to solve for the eigenvalues of this matrix:
However, when we attempt to solve for the eigenvectors, we reach a problem: there is only one independent vector that satisfies $(A-\lambda I)v=0$.
We need two pairs of eigenvalues and eigenvectors to diagonalize the matrix, but we have a repeated eigenvalue and only one independent eigenvector corresponding to that eigenvalue.
Thus, we simply do not have enough independent eigenvectors to diagonalize the matrix.
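One way to make this test mechanical is to count the independent solutions of $(A-\lambda I)v=0$, which equals the size of the matrix minus the rank of $A-\lambda I$. The sketch below does this with exact fractions. The function names are made up for illustration, the eigenvalue is assumed to be known exactly (an integer or fraction), and the sample matrix is the one from exercise 3 below rather than the matrix of this example.

```python
from fractions import Fraction


def rank(M):
    """Rank of a matrix (list of rows), via exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0),
                     None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def independent_eigenvector_count(A, lam):
    """Number of independent solutions v of (A - lam*I) v = 0."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)


# Matrix from exercise 3: eigenvalue 3 is repeated twice.
print(independent_eigenvector_count([[3, 2], [0, 3]], 3))
```

If the counts over all distinct eigenvalues add up to fewer than the size of the matrix, there are not enough independent eigenvectors to fill the columns of $P$.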
Exercises
Diagonalize the given matrices $A$, if possible. If diagonalization is possible, then use the diagonalization to compute a formula for $A^n$. Check your formula on the case $n=2$.
$\begin{align*} 1) \hspace{.5cm} \begin{pmatrix} 2 & 0 \\ -1 & 4 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{answers may vary; one correct answer is} \\ &A= \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix} \begin{pmatrix} \frac{1}{2} & 0 \\ -\frac{1}{2} & 1 \end{pmatrix} \\ &A^n = 2^{n-1} \begin{pmatrix} 2 & 0 \\ 1-2^n & 2^{n+1} \end{pmatrix} \end{align*}$
$\begin{align*} 2) \hspace{.5cm} \begin{pmatrix} 6 & 6 \\ 0 & 2 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{answers may vary; one correct answer is} \\ &A= \begin{pmatrix} 3 & 2 \\ -2 & 0 \end{pmatrix} \begin{pmatrix} 2 & 0 \\ 0 & 6 \end{pmatrix} \begin{pmatrix} 0 & -\frac{1}{2} \\ \frac{1}{2} & \frac{3}{4} \end{pmatrix} \\ &A^n = 2^{n-1} \begin{pmatrix} 2(3)^n & 3(3^n-1) \\ 0 & 2 \end{pmatrix} \end{align*}$
$\begin{align*} 3) \hspace{.5cm} \begin{pmatrix} 3 & 2 \\ 0 & 3 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{not diagonalizable} \\ \end{align*}$
$\begin{align*} 4) \hspace{.5cm} \begin{pmatrix} 7 & -3 \\ -2 & 8 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{answers may vary; one correct answer is} \\ &A= \begin{pmatrix} 3 & 1 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} 5 & 0 \\ 0 & 10 \end{pmatrix} \begin{pmatrix} \frac{1}{5} & \frac{1}{5} \\ \frac{2}{5} & -\frac{3}{5} \end{pmatrix} \\ &A^n = 5^{n-1} \begin{pmatrix} 3+2^{n+1} & 3(1-2^n) \\ 2(1-2^n ) & 2+3(2)^n \end{pmatrix} \end{align*}$
$\begin{align*} 5) \hspace{.5cm} \begin{pmatrix} -3 & -4 \\ -4 & 3 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{answers may vary; one correct answer is} \\ &A= \begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix} \begin{pmatrix} 5 & 0 \\ 0 & -5 \end{pmatrix} \begin{pmatrix} \frac{1}{5} & -\frac{2}{5} \\ \frac{2}{5} & \frac{1}{5} \end{pmatrix} \\ &A^n = \frac{1}{5} \begin{pmatrix} 4(-5)^n+5^n & 2 \left[ (-5)^n-5^n \right] \\ 2 \left[ (-5)^n-5^n \right] & (-5)^n+4(5)^n \end{pmatrix} \end{align*}$
$\begin{align*} 6) \hspace{.5cm} \begin{pmatrix} 2 & 3 \\ 1 & 2 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{answers may vary; one correct answer is} \\ &A= \begin{pmatrix} -\sqrt{3} & \sqrt{3} \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 2-\sqrt{3} & 0 \\ 0 & 2+\sqrt{3} \end{pmatrix} \begin{pmatrix} -\frac{\sqrt{3}}{6} & \frac{1}{2} \\ \frac{\sqrt{3}}{6} & \frac{1}{2} \end{pmatrix} \\ &A^n = \frac{1}{6} \begin{pmatrix} 3 \left[ (2+\sqrt{3})^n + (2-\sqrt{3})^n \right] & 3\sqrt{3} \left[ (2+\sqrt{3})^n - (2-\sqrt{3})^n \right] \\ \sqrt{3} \left[ (2+\sqrt{3})^n - (2-\sqrt{3})^n \right] & 3 \left[ (2+\sqrt{3})^n + (2-\sqrt{3})^n \right] \end{pmatrix} \end{align*}$
$\begin{align*} 7) \hspace{.5cm} \begin{pmatrix} -1 & 2 \\ 1 & 1 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{answers may vary; one correct answer is} \\ &A= \begin{pmatrix} -1-\sqrt{3} & -1+\sqrt{3} \\ 1 & 1 \end{pmatrix} \begin{pmatrix} -\sqrt{3} & 0 \\ 0 & \sqrt{3} \end{pmatrix} \begin{pmatrix} -\frac{\sqrt{3}}{6} & \frac{3-\sqrt{3}}{6} \\ \frac{\sqrt{3}}{6} & \frac{3+\sqrt{3}}{6} \end{pmatrix} \\ &A^n = \frac{\sqrt{3^{n-1}}}{2}\begin{pmatrix} -1+\sqrt{3} + (1+\sqrt{3})(-1)^n & 2-2(-1)^n \\ 1-(-1)^n & 1+\sqrt{3} + (\sqrt{3}-1)(-1)^n \end{pmatrix} \end{align*}$
$\begin{align*} 8) \hspace{.5cm} \begin{pmatrix} -4 & 0 \\ 5 & -4 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{not diagonalizable} \\ \end{align*}$
$\begin{align*} 9) \hspace{.5cm} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{answers may vary; one correct answer is} \\ &A= \begin{pmatrix} 1 & 1 \\ -i & i \end{pmatrix} \begin{pmatrix} 1-i & 0 \\ 0 & 1+i \end{pmatrix} \begin{pmatrix} \frac{1}{2} & \frac{i}{2} \\ \frac{1}{2} & -\frac{i}{2} \end{pmatrix} \\ &A^n = \frac{1}{2} \begin{pmatrix} (1+i)^n+(1-i)^n & i \left[ (1-i)^n-(1+i)^n \right] \\ i \left[ (1+i)^n-(1-i)^n \right] & (1+i)^n+(1-i)^n \end{pmatrix} \end{align*}$
$\begin{align*} 10) \hspace{.5cm} \begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{answers may vary; one correct answer is} \\ &A= \begin{pmatrix} i & -i \\ 1 & 1 \end{pmatrix} \begin{pmatrix}1-2i & 0 \\ 0 & 1+2i \end{pmatrix} \begin{pmatrix} -\frac{i}{2} & \frac{1}{2} \\ \frac{i}{2} & \frac{1}{2} \end{pmatrix} \\ &A^n = \frac{1}{2} \begin{pmatrix} (1+2i)^n+(1-2i)^n & i \left[ (1-2i)^n-(1+2i)^n \right] \\ i \left[ (1+2i)^n - (1-2i)^n \right] & (1+2i)^n+(1-2i)^n \end{pmatrix} \end{align*}$
$\begin{align*} 11) \hspace{.5cm} \begin{pmatrix} 1 & 0 & 1 \\ 3 & -1 & -3 \\ 0 & 0 & 2 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{answers may vary; one correct answer is} \\ &A= \begin{pmatrix} 0 & 2 & 1 \\ 1 & 3 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} -\frac{3}{2} & 1 & \frac{3}{2} \\ \frac{1}{2} & 0 & -\frac{1}{2} \\ 0 & 0 & 1 \end{pmatrix} \\ &A^n = \frac{1}{2} \begin{pmatrix} 2 & 0 & 2(2^n-1) \\ 3 \left[ 1-(-1)^n \right] & 2(-1)^n & 3 \left[ (-1)^n-1 \right] \\ 0 & 0 & 2^{n+1} \end{pmatrix} \end{align*}$
$\begin{align*} 12) \hspace{.5cm} \begin{pmatrix} 0 & 6 & 4 \\ -5 & 11 & 6 \\ 6 & -9 & -4 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{not diagonalizable} \\ \end{align*}$
$\begin{align*} 13) \hspace{.5cm} \begin{pmatrix} 4 & 2 & -2 \\ -12 & -10 & 8 \\ -9 & -9 & 7 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{answers may vary; one correct answer is} \\ &A= \begin{pmatrix} 1 & 0 & 2 \\ -1 & 1 & 0 \\ 0 & 1 & 3 \end{pmatrix} \begin{pmatrix} 2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 3 & 2 & -2 \\ 3 & 3 & -2 \\ -1 & -1 & 1 \end{pmatrix} \\ &A^n = \begin{pmatrix} 3(2)^n - 2 & 2(2^n-1) & 2(1-2^n) \\ 3 \left[ (-2)^n-2^n \right] & 3(-2)^n - 2^{n+1} & (-2)^{n+1}+2^{n+1} \\ 3 \left[ (-2)^n-1 \right] & 3 \left[ (-2)^n-1 \right] & 3+(-2)^{n+1} \end{pmatrix} \end{align*}$
$\begin{align*} 14) \hspace{.5cm} \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 4 \end{pmatrix} \end{align*}$
Solution:
$\begin{align*} &\text{not diagonalizable} \\ \end{align*}$
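The $n=2$ check requested in the exercises can also be automated. As a small sketch, the code below compares the closed form from the solution to exercise 1 against a directly computed power. The function names are hypothetical, and the same pattern can be adapted to the other exercises by swapping in their matrices and formulas.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def claimed_power(n):
    """Closed form for A^n from the solution to exercise 1."""
    scale = Fraction(2) ** (n - 1)
    return [[scale * 2, 0],
            [scale * (1 - 2 ** n), scale * 2 ** (n + 1)]]


A = [[2, 0], [-1, 4]]   # matrix from exercise 1
print(claimed_power(2) == mat_mul(A, A))
```

A single check at $n=2$ will not catch every mistake, so it is worth trying $n=1$ and $n=3$ as well.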