Linear algebra (Osnabrück 2024-2025)/Part I/Lecture 24



The Theorem of Cayley-Hamilton
Arthur Cayley (1821-1895)
William Hamilton (1805-1865)


One highlight of linear algebra is the Theorem of Cayley-Hamilton. In order to formulate this theorem, recall that we can plug a square matrix into a polynomial, see the 20th lecture. Here, the variable $X$ is replaced everywhere by the matrix $M$, the powers $M^i$ are the $i$-fold matrix product of $M$ with itself, and the addition is the (componentwise) addition of matrices. A scalar $a$ has to be interpreted as the $a$-fold of the identity matrix, that is, as $a E_n$. For the polynomial

and the matrix

we get

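For a fixed matrix $M \in \operatorname{Mat}_n(K)$, we have the substitution mapping

$$K[X] \longrightarrow \operatorname{Mat}_n(K), \quad P \longmapsto P(M).$$

This is (like the substitution mapping for an element $a \in K$) a ring homomorphism, that is, the relations (see also Lemma 20.3)

$$(P+Q)(M) = P(M) + Q(M), \quad (P \cdot Q)(M) = P(M) \circ Q(M), \quad 1(M) = E_n$$

hold. The Theorem of Cayley-Hamilton answers the question of what happens when we insert a matrix into its characteristic polynomial.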
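The substitution of a matrix into a polynomial can be tried out numerically. The following is a minimal sketch in plain Python, assuming polynomials are stored as coefficient lists (constant term first); the helper names `poly_at_matrix`, `poly_mul` and the sample data `M`, `P`, `Q` are hypothetical and chosen only for illustration. It checks the multiplicative relation of the substitution mapping for one example.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_at_matrix(coeffs, M):
    """Evaluate P(M) for P = coeffs[0] + coeffs[1] X + ... by Horner's rule.

    A scalar a is interpreted as a times the identity matrix.
    """
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, M)
        for i in range(n):
            result[i][i] += a
    return result


def poly_mul(P, Q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(P) + len(Q) - 1)
    for i, a in enumerate(P):
        for j, b in enumerate(Q):
            out[i + j] += a * b
    return out


M = [[1, 2], [3, 4]]
P = [3, 0, 1]    # 3 + X^2
Q = [-1, 2]      # -1 + 2X

# (P * Q)(M) versus P(M) composed with Q(M)
lhs_product = poly_at_matrix(poly_mul(P, Q), M)
rhs_product = mat_mul(poly_at_matrix(P, M), poly_at_matrix(Q, M))
print(lhs_product == rhs_product)
```

Horner's rule is used here only to avoid computing every power of the matrix separately; summing the terms $a_i M^i$ directly gives the same result.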


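Let $K$ be a field, and let $M$ be an $n \times n$-matrix over $K$. Let

$$\chi_M = X^n + c_{n-1} X^{n-1} + \cdots + c_1 X + c_0$$

denote the characteristic polynomial of $M$. Then

$$\chi_M(M) = M^n + c_{n-1} M^{n-1} + \cdots + c_1 M + c_0 E_n = 0.$$

This means that the characteristic polynomial annihilates the matrix.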

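We consider the matrix $X E_n - M$ as a matrix whose entries are in the field $K(X)$. The adjugate matrix

$$\operatorname{Adj}(X E_n - M)$$

belongs also to $\operatorname{Mat}_n(K(X))$. The entries of the adjugate matrix are by definition, up to sign, the determinants of $(n-1) \times (n-1)$-submatrices of $X E_n - M$. In the entries of this matrix, the variable $X$ occurs at most in its first power, so that, in the entries of the adjugate matrix, the variable occurs at most in its $(n-1)$-th power. We write

$$\operatorname{Adj}(X E_n - M) = X^{n-1} A_{n-1} + X^{n-2} A_{n-2} + \cdots + X A_1 + A_0$$

with matrices

$$A_i \in \operatorname{Mat}_n(K),$$

that is, we write the entries as polynomials, and we collect all coefficients referring to $X^i$ into a matrix. Because of Theorem 17.9, we have

$$\chi_M E_n = (X E_n - M) \circ \operatorname{Adj}(X E_n - M) = X^n A_{n-1} + X^{n-1} (A_{n-2} - M A_{n-1}) + \cdots + X (A_0 - M A_1) - M A_0.$$

We can write the matrix on the left according to the powers of $X$ and we get

$$\chi_M E_n = X^n E_n + X^{n-1} c_{n-1} E_n + \cdots + X c_1 E_n + c_0 E_n.$$

Since these polynomials coincide, their coefficients coincide. That is, we have a system of equations

$$\begin{aligned} E_n &= A_{n-1}, \\ c_{n-1} E_n &= A_{n-2} - M A_{n-1}, \\ &\;\;\vdots \\ c_1 E_n &= A_0 - M A_1, \\ c_0 E_n &= -M A_0. \end{aligned}$$

We multiply these equations from the left, from top down, with $M^n, M^{n-1}, \ldots, M, E_n$, yielding the system of equations

$$\begin{aligned} M^n &= M^n A_{n-1}, \\ c_{n-1} M^{n-1} &= M^{n-1} A_{n-2} - M^n A_{n-1}, \\ &\;\;\vdots \\ c_1 M &= M A_0 - M^2 A_1, \\ c_0 E_n &= -M A_0. \end{aligned}$$

If we add the left-hand sides of this system, then we just get $\chi_M(M)$. If we add the right-hand sides, then we get $0$, because every partial summand $M^{i+1} A_i$ occurs once positively and once negatively. Hence, we have $\chi_M(M) = 0$.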
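The coefficient comparison in this proof can be turned into a small computation. The sketch below is not the method of the lecture: it uses the Faddeev-LeVerrier recursion, which determines the coefficients $c_i$ from traces and produces coefficient matrices satisfying $A_{i-1} = M A_i + c_i E_n$ along the way. It assumes rational entries (the recursion divides by $1, \ldots, n$), and the function names and the sample matrix are hypothetical.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly_with_adjugate(M):
    """Return (c, A) with chi_M = sum c[i] X^i and Adj(X E_n - M) = sum X^i A[i].

    Faddeev-LeVerrier recursion over the rationals.
    """
    n = len(M)
    M = [[Fraction(x) for x in row] for row in M]
    c = [Fraction(0)] * n + [Fraction(1)]        # leading coefficient c_n = 1
    A = [None] * n
    current = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # A_{n-1} = E_n
    for k in range(1, n + 1):
        A[n - k] = current
        prod = mat_mul(M, current)               # M A_{n-k}
        c[n - k] = -sum(prod[i][i] for i in range(n)) / k
        # Next coefficient matrix: A_{n-k-1} = M A_{n-k} + c_{n-k} E_n
        current = [[prod[i][j] + (c[n - k] if i == j else 0) for j in range(n)]
                   for i in range(n)]
    return c, A


def poly_at_matrix(coeffs, M):
    """Evaluate the polynomial with the given coefficients at the matrix M."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = mat_mul(result, M)
        for i in range(n):
            result[i][i] += a
    return result


M = [[2, 0, 1], [1, 3, 0], [0, 1, 1]]
c, A = char_poly_with_adjugate(M)
chi_of_M = poly_at_matrix(c, M)
print([int(x) for x in c])   # coefficients c_0, ..., c_n
print(chi_of_M)              # should be the zero matrix
```

For this sample matrix, the computed polynomial is $X^3 - 6X^2 + 11X - 7$, and inserting the matrix gives the zero matrix, as the theorem predicts.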



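Let $V$ be a finite-dimensional vector space over a field $K$, and let

$$\varphi \colon V \longrightarrow V$$

denote a linear mapping. Then the characteristic polynomial $\chi_\varphi$ of $\varphi$ fulfills the relation

$$\chi_\varphi(\varphi) = 0.$$

This follows immediately from Theorem 24.1.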



Minimal polynomial and characteristic polynomial

Let $V$ be a finite-dimensional vector space over a field $K$, and let

$$\varphi \colon V \longrightarrow V$$

be a linear mapping. Then the characteristic polynomial $\chi_\varphi$ is a multiple of the minimal polynomial $\mu_\varphi$ of $\varphi$.

This follows directly from Theorem 24.2 and Corollary 20.12.


In particular, the degree of the minimal polynomial of $\varphi$ is bounded by the dimension of the vector space $V$. The minimal polynomial and the characteristic polynomial are related in several respects; for example, they have the same zeroes.
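The degree bound can be observed in small examples. The following sketch computes the degree of the minimal polynomial of a rational matrix as the smallest $d$ for which $E_n, M, \ldots, M^d$ are linearly dependent. The approach via Gaussian elimination and the names `rank` and `minimal_polynomial_degree` are illustrative assumptions, not taken from the lecture.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rank(rows):
    """Rank of a list of row vectors over the rationals (Gaussian elimination)."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def minimal_polynomial_degree(M):
    """Smallest d such that E_n, M, ..., M^d are linearly dependent."""
    n = len(M)
    powers = [[[int(i == j) for j in range(n)] for i in range(n)]]   # M^0 = E_n
    for d in range(1, n + 1):
        powers.append(mat_mul(powers[-1], M))
        flattened = [[x for row in P for x in row] for P in powers]
        if rank(flattened) < d + 1:
            return d
    raise AssertionError("not reachable: Cayley-Hamilton gives a relation of degree n")


print(minimal_polynomial_degree([[1, 0], [0, 1]]))                   # identity matrix
print(minimal_polynomial_degree([[1, 1], [0, 1]]))                   # shear matrix
print(minimal_polynomial_degree([[1, 0, 0], [0, 2, 0], [0, 0, 2]]))  # diagonal, repeated entry
```

The identity matrix shows that the minimal polynomial can have strictly smaller degree than the characteristic polynomial: here $X - 1$ already annihilates the matrix, whereas the characteristic polynomial is $(X-1)^2$.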


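Let $V$ be a finite-dimensional vector space over a field $K$, and let

$$\varphi \colon V \longrightarrow V$$

be a linear mapping. Let $v \in V$ be an eigenvector of $\varphi$ with eigenvalue $\lambda$, and let $P \in K[X]$ denote a polynomial. Then

$$P(\varphi)(v) = P(\lambda) v.$$

In particular, $v$ is an eigenvector of $P(\varphi)$ with eigenvalue $P(\lambda)$. The vector $v$ belongs to the kernel of $P(\varphi)$ if and only if $\lambda$ is a zero of $P$.

We have

$$\varphi^k(v) = \lambda^k v.$$

This implies the statement, since the assignment $P \mapsto P(\varphi)(v)$ is compatible with addition and scalar multiplication.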
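As a quick plausibility check of this statement, the following sketch applies a polynomial in a matrix to an eigenvector and compares the result with the polynomial evaluated at the eigenvalue. The matrix, the eigenvector and the polynomials are hypothetical sample data, and polynomials are again stored as coefficient lists with the constant term first.

```python
def mat_vec(M, v):
    """Apply the matrix M (list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in M]


def poly_at_scalar(coeffs, x):
    """Evaluate P(x) by Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * x + a
    return result


def poly_of_matrix_applied(coeffs, M, v):
    """Compute P(M) v as the sum of a_i M^i v, without forming P(M)."""
    result = [0] * len(v)
    power = list(v)                      # M^0 v
    for a in coeffs:
        result = [r + a * p for r, p in zip(result, power)]
        power = mat_vec(M, power)        # next power M^(i+1) v
    return result


M = [[2, 1], [0, 3]]
v = [1, 1]           # M v = (3, 3), so v is an eigenvector with eigenvalue 3
eigenvalue = 3

P = [-6, 1, 1]       # X^2 + X - 6, with P(3) = 6
Q = [-3, 1]          # X - 3, which has the eigenvalue 3 as a zero

print(poly_of_matrix_applied(P, M, v), [poly_at_scalar(P, eigenvalue) * x for x in v])
print(poly_of_matrix_applied(Q, M, v))   # v lies in the kernel of Q(M)
```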



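Let $V$ be a finite-dimensional vector space over a field $K$, and let

$$\varphi \colon V \longrightarrow V$$

be a linear mapping. Then the characteristic polynomial $\chi_\varphi$ and the minimal polynomial $\mu_\varphi$ have the same zeroes.

Since the characteristic polynomial is, by Cayley-Hamilton, a multiple of the minimal polynomial, the zeroes of the minimal polynomial are also zeroes of the characteristic polynomial.

To prove the other implication, let $\lambda \in K$ be a zero of the characteristic polynomial, and let $v \in V$ denote an eigenvector of $\varphi$ with eigenvalue $\lambda$; its existence is guaranteed by Theorem 23.2. We write the minimal polynomial as

$$\mu_\varphi = (X - \lambda_1) \cdots (X - \lambda_k) \cdot Q,$$

where $Q$ has no zero. Then

$$0 = \mu_\varphi(\varphi) = (\varphi - \lambda_1 \operatorname{Id}) \circ \cdots \circ (\varphi - \lambda_k \operatorname{Id}) \circ Q(\varphi).$$

We apply this mapping to $v$. By the statement above on polynomials applied to eigenvectors, the factors send the vector $v$ to $(\lambda - \lambda_i) v$ or to $Q(\lambda) v$, respectively. Altogether, $v$ is sent to

$$(\lambda - \lambda_1) \cdots (\lambda - \lambda_k) \cdot Q(\lambda) v.$$

As the composed mapping is the zero mapping, $v \neq 0$, and $Q(\lambda) \neq 0$, we must have $\lambda = \lambda_i$ for some $i$.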



Further examples

For the moment, we will apply the following concept only for invertible matrices.


Let $G$ be a group and $g \in G$ an element. Then we call the smallest positive number $n$ with $g^n = e_G$ the order of $g$. For this, we write $\operatorname{ord}(g)$. If all positive powers of $g$ are different from the neutral element, then we set $\operatorname{ord}(g) = \infty$.

We consider linear mappings

$$\varphi \colon V \longrightarrow V$$

with the property that some power of it is the identity, say

$$\varphi^n = \operatorname{Id}_V,$$

that is, $\varphi$ has finite order. Typical examples are rotations by an angle of the form $360/n$ degrees. The polynomial $X^n - 1$ annihilates this endomorphism, and is, therefore, a multiple of the minimal polynomial.
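A rotation of the plane by 90 degrees illustrates this. The sketch below, with the hypothetical helper `matrix_order` and an arbitrarily chosen search bound, finds the order of the rotation matrix by repeated multiplication and also checks that $X^2 + 1$, a divisor of $X^4 - 1$, already annihilates it.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]


def matrix_order(M, bound):
    """Smallest exponent 1 <= n <= bound with M^n = E, or None if there is none."""
    power = M
    for exponent in range(1, bound + 1):
        if power == identity(len(M)):
            return exponent
        power = mat_mul(power, M)
    return None


R = [[0, -1], [1, 0]]        # rotation by 90 degrees
shear = [[1, 1], [0, 1]]     # invertible, but no power is the identity

print(matrix_order(R, 10))
print(matrix_order(shear, 10))

# R^2 + E is the zero matrix, so X^2 + 1 annihilates R.
R_squared = mat_mul(R, R)
print([[R_squared[i][j] + identity(2)[i][j] for j in range(2)] for i in range(2)])
```

Returning `None` for the shear matrix only reflects that no power up to the chosen bound is the identity; for this particular matrix, one can see directly that its powers are never the identity over the rationals.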


Let $K$ be a field, and $n \in \mathbb{N}_+$. A zero of the polynomial

$$X^n - 1$$

in $K$ is called an $n$-th root of unity in $K$.

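Let $n \in \mathbb{N}_+$. The zeroes of the polynomial $X^n - 1$ over $\mathbb{C}$ are

$$e^{2 \pi i k / n} = \cos \frac{2 \pi k}{n} + i \sin \frac{2 \pi k}{n}, \quad k = 0, 1, \ldots, n-1.$$

In $\mathbb{C}[X]$, we have the factorization

$$X^n - 1 = (X - 1) (X - e^{2 \pi i / n}) \cdots (X - e^{2 \pi i (n-1) / n}).$$

The proof uses some basic facts about the complex exponential function. We have

$$\left( e^{2 \pi i k / n} \right)^n = e^{2 \pi i k} = \left( e^{2 \pi i} \right)^k = 1^k = 1.$$

Hence, the given complex numbers are indeed zeroes of the polynomial $X^n - 1$. These zeroes are all different, because

$$e^{2 \pi i k / n} = e^{2 \pi i \ell / n}$$

with $0 \leq k \leq \ell \leq n-1$ implies, by considering the fraction, that

$$e^{2 \pi i (\ell - k) / n} = 1$$

holds, and because of $0 \leq \ell - k < n$ this is only possible for $\ell = k$. Therefore, there exist $n$ explicit zeroes and these are all the zeroes of the polynomial. The explicit description in coordinates follows from Euler's formula.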
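The complex roots of unity are easy to inspect numerically. The following sketch uses Python's `cmath` module with floating point arithmetic, so all comparisons are only approximate; the choice $n = 6$ and the test point $2$ for the factorization are arbitrary.

```python
import cmath


def roots_of_unity(n):
    """The n complex numbers e^(2 pi i k / n), k = 0, ..., n-1."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]


def product_of_linear_factors(x, roots):
    """Evaluate the product of (x - z) over all given roots z."""
    result = 1
    for z in roots:
        result *= (x - z)
    return result


n = 6
roots = roots_of_unity(n)

# Each root satisfies z^n = 1 (up to rounding).
max_defect = max(abs(z ** n - 1) for z in roots)

# The roots are pairwise different.
min_distance = min(abs(roots[k] - roots[l])
                   for k in range(n) for l in range(k + 1, n))

# The factorization of X^n - 1, tested at the point x = 2.
factorization_defect = abs(product_of_linear_factors(2, roots) - (2 ** n - 1))

print(max_defect, min_distance, factorization_defect)
```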



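For a permutation $\pi$ on $\{1, \ldots, n\}$, the $n \times n$-matrix

$$M_\pi = (a_{ij})_{1 \leq i, j \leq n},$$

where

$$a_{\pi(j), j} = 1 \quad \text{for } j = 1, \ldots, n$$

and all other entries are $0$, is called a permutation matrix.

We want to determine the characteristic polynomial of a permutation matrix. Here, we use that a permutation is a product of cycles. For a cycle of the form $1 \mapsto 2 \mapsto \cdots \mapsto n \mapsto 1$, the corresponding permutation matrix is

$$\begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}.$$

Every cycle can be brought (by renumbering) into this form.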
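A permutation matrix can be built directly from the permutation. The sketch below assumes the convention that the column with index $j$ has its entry $1$ in the row with index $\pi(j)$, so that the matrix sends the $j$-th standard vector to the $\pi(j)$-th one; it also uses 0-based indices, as is usual in Python. The function names are hypothetical.

```python
def permutation_matrix(pi):
    """Matrix of the permutation pi, given as a list with pi[j] the image of j.

    Assumed convention: column j has its entry 1 in row pi[j] (0-based indices).
    """
    n = len(pi)
    return [[1 if pi[j] == i else 0 for j in range(n)] for i in range(n)]


def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def compose(pi, sigma):
    """The permutation j -> pi[sigma[j]], that is, first sigma, then pi."""
    return [pi[sigma[j]] for j in range(len(pi))]


cycle = [1, 2, 3, 0]                     # the cycle 0 -> 1 -> 2 -> 3 -> 0
for row in permutation_matrix(cycle):
    print(row)

# With this convention, composition of permutations becomes matrix product.
pi, sigma = [1, 0, 2], [0, 2, 1]
print(permutation_matrix(compose(pi, sigma))
      == mat_mul(permutation_matrix(pi), permutation_matrix(sigma)))
```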


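The characteristic polynomial of a permutation matrix $M$ for a cycle of order $k$ on $\{1, \ldots, n\}$ is

$$\chi_M = (X - 1)^{n-k} (X^k - 1).$$

We may assume that the cycle has the form $1 \mapsto 2 \mapsto \cdots \mapsto k \mapsto 1$. The corresponding permutation matrix looks with respect to $e_{k+1}, \ldots, e_n$ like the identity matrix and has, with respect to the first $k$ standard vectors, the form

$$\begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix}.$$

The determinant of $X E_n - M$ is $(X - 1)^{n-k}$ multiplied with the determinant of

$$\begin{pmatrix} X & 0 & \cdots & 0 & -1 \\ -1 & X & \cdots & 0 & 0 \\ 0 & -1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & -1 & X \end{pmatrix}.$$

The expansion with respect to the first row yields

$$X \cdot X^{k-1} + (-1)^{k+1} \cdot (-1) \cdot (-1)^{k-1} = X^k - 1.$$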



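For a permutation matrix $M$ over $\mathbb{C}$ for a cycle $1 \mapsto 2 \mapsto \cdots \mapsto k \mapsto 1$ with $k \leq n$ and a $k$-th root of unity $\zeta$, the vectors

$$v_\zeta = \sum_{j=1}^{k} \zeta^{k-j} e_j = (\zeta^{k-1}, \zeta^{k-2}, \ldots, \zeta, 1, 0, \ldots, 0)^{\text{tr}}$$

are eigenvectors of $M$ for the eigenvalue $\zeta$. In particular, a permutation matrix of a cycle over $\mathbb{C}$ is diagonalizable.

Using $M e_j = e_{j+1}$ for $j < k$, $M e_k = e_1$, and $\zeta^k = 1$, we have

$$M v_\zeta = \sum_{j=1}^{k-1} \zeta^{k-j} e_{j+1} + e_1 = \sum_{j=2}^{k} \zeta^{k-j+1} e_j + \zeta^k e_1 = \zeta \sum_{j=1}^{k} \zeta^{k-j} e_j = \zeta v_\zeta.$$

Since there are $k$ different $k$-th roots of unity in $\mathbb{C}$, these vectors are linearly independent due to Lemma 22.3, and they generate a $k$-dimensional linear subspace of $\mathbb{C}^n$. In fact, we have

$$\langle v_\zeta \mid \zeta^k = 1 \rangle = \langle e_1, \ldots, e_k \rangle.$$

Since the vectors $e_j$, $j = k+1, \ldots, n$, are fixed vectors, the $v_\zeta$ together with the $e_j$, $j = k+1, \ldots, n$, form a basis consisting of eigenvectors of $M$. Hence, $M$ is diagonalizable.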
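The eigenvector property can be checked numerically for a small cycle. The sketch below makes two assumptions that are labeled in the code: the permutation matrix sends the $j$-th standard vector to the $(j+1)$-th one along the cycle, and the eigenvector is normalized so that, in 0-based indexing, its entry at position $j < k$ is $\zeta^{k-1-j}$. Any nonzero multiple would work equally well. Floating point arithmetic is used, so equality is tested only up to a tolerance.

```python
import cmath


def cycle_matrix(k, n):
    """Permutation matrix of the cycle 0 -> 1 -> ... -> k-1 -> 0 on n points.

    Assumed convention: column j has its 1 in the row of the image of j;
    the points k, ..., n-1 are fixed.
    """
    pi = [(j + 1) % k if j < k else j for j in range(n)]
    return [[1 if pi[j] == i else 0 for j in range(n)] for i in range(n)]


def eigenvector(zeta, k, n):
    """Assumed normalization: entry zeta^(k-1-j) at position j < k, else 0."""
    return [zeta ** (k - 1 - j) if j < k else 0 for j in range(n)]


def mat_vec(M, v):
    return [sum(a * x for a, x in zip(row, v)) for row in M]


k, n = 3, 5
M = cycle_matrix(k, n)
roots = [cmath.exp(2j * cmath.pi * m / k) for m in range(k)]

# Largest deviation between M v and zeta v over all k-th roots of unity.
max_defect = 0.0
for zeta in roots:
    v = eigenvector(zeta, k, n)
    image = mat_vec(M, v)
    defect = max(abs(a - zeta * b) for a, b in zip(image, v))
    max_defect = max(max_defect, defect)

print(max_defect)
```

Together with the standard vectors at the fixed positions, which are eigenvectors for the eigenvalue $1$, this gives $n$ eigenvectors in total for the sample values $k = 3$ and $n = 5$.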



A permutation matrix over $\mathbb{C}$ is diagonalizable.

Proof

