Sometime during the first course in linear algebra we all learn the famous Cayley-Hamilton theorem which states the following:
Theorem: Let $A$ be an $n\times n$ matrix over a field $\mathbb{F}$, and denote by $f_A(t)=\det(tI-A)$ its characteristic polynomial. Then $f_A(A)=0$.
The “easy” proof of this theorem is just noting that $f_A(A)=\det(AI-A)=\det(0)=0$. Of course, this is not a real proof, since in the definition of $f_A$ as the determinant of $tI-A$, the symbol $t$ is a placeholder for a scalar and not for a matrix. Nevertheless, the theorem is still true and has several proofs. For example, if you believe in the Jordan normal form, or that each matrix can be approximated by a diagonalizable matrix, then this theorem is not that far-fetched, since it is true for matrices in Jordan form and for diagonalizable matrices.
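Before any proof, the statement itself can be checked on a concrete matrix. The following is only a minimal sketch: the helper names `char_poly` and `eval_at_matrix` are made up for this illustration, the characteristic polynomial is computed with the Faddeev-LeVerrier recursion (a choice made here, not something the argument below relies on), and exact rational arithmetic is used so that the check is not blurred by rounding.

```python
from fractions import Fraction

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def char_poly(a):
    """Coefficients [c_0, ..., c_n] of det(tI - a) = sum c_k t^k.

    Uses the Faddeev-LeVerrier recursion:
    M_0 = 0, c_n = 1, M_k = a M_{k-1} + c_{n-k+1} I, c_{n-k} = -tr(a M_k) / k.
    """
    n = len(a)
    a = [[Fraction(x) for x in row] for row in a]
    coeffs = [Fraction(0)] * (n + 1)
    coeffs[n] = Fraction(1)
    m = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        m = mat_mul(a, m)
        for i in range(n):
            m[i][i] += coeffs[n - k + 1]
        am = mat_mul(a, m)
        coeffs[n - k] = -sum(am[i][i] for i in range(n)) / k
    return coeffs

def eval_at_matrix(coeffs, a):
    """Evaluate sum c_k a^k by Horner's rule; the constant term becomes c_0 I."""
    n = len(a)
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, a)
        for i in range(n):
            result[i][i] += c
    return result

A = [[1, 2], [3, 4]]
f_A = char_poly(A)              # t^2 - 5t - 2, stored as [-2, -5, 1]
f_A_of_A = eval_at_matrix(f_A, A)   # the zero matrix
```

Here substituting the matrix for $t$ means replacing $t^k$ by $A^k$ and the constant term $c_0$ by $c_0 I$, which is exactly what the "easy" proof glosses over.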
Our goal here is to show that the most “generic” matrix satisfies this theorem, so all the other matrices have no real choice, and must follow the footsteps of their generic master ruler and do the same.
We start, as always, with a definition.
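Definition (The generic matrix): Let $x_{i,j}$, $1\le i,j\le n$, be algebraically independent indeterminates over $\mathbb{Z}$. The generic matrix is defined to be
$$X=\begin{pmatrix} x_{1,1} & \cdots & x_{1,n}\\ \vdots & \ddots & \vdots\\ x_{n,1} & \cdots & x_{n,n}\end{pmatrix}\in M_n(\mathbb{Z}[x_{i,j}]).$$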
What makes this matrix “generic” is the following argument. Suppose that $\mathbb{F}$ is any field, and $A=(a_{i,j})$ is any $n\times n$ matrix over $\mathbb{F}$. Any polynomial $p\in\mathbb{Z}[x_{i,j}]$ defines an element in $\mathbb{F}$, simply by assigning $x_{i,j}$ to be $a_{i,j}$. Let us denote this map by $\varphi_A:\mathbb{Z}[x_{i,j}]\to\mathbb{F}$. Clearly, this is a homomorphism of rings, which we can extend entrywise to a homomorphism of the matrix rings $\varphi_A:M_n(\mathbb{Z}[x_{i,j}])\to M_n(\mathbb{F})$. We now have the almost too trivial observation that $\varphi_A(X)=A$; in other words, any matrix $A$ is an image of the matrix $X$ (under some carefully chosen homomorphism). We note that the same process works if we take any commutative ring $R$ instead of a field $\mathbb{F}$.
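The substitution map described above is easy to model directly. The sketch below is an illustration only: it stores a polynomial in the $x_{i,j}$ as a dictionary from monomials to integer coefficients, uses 0-based indices, and the names `var`, `generic_matrix`, `generic_det` and `specialize` are hypothetical helpers invented here. `specialize(p, A)` plays the role of the map sending each $x_{i,j}$ to $a_{i,j}$.

```python
from itertools import permutations

# A polynomial is a dict {monomial: coefficient}. A monomial is a sorted tuple
# of index pairs, so ((0, 0), (1, 1)) stands for x_{0,0} * x_{1,1}.

def var(i, j):
    """The indeterminate x_{i,j} as a polynomial."""
    return {((i, j),): 1}

def poly_add(p, q):
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
    return {m: c for m, c in out.items() if c != 0}

def poly_mul(p, q):
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = tuple(sorted(m1 + m2))
            out[mono] = out.get(mono, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}

def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def generic_matrix(n):
    """The n x n matrix whose (i, j) entry is the indeterminate x_{i,j}."""
    return [[var(i, j) for j in range(n)] for i in range(n)]

def generic_det(n):
    """det of the generic matrix, expanded by the Leibniz formula."""
    x = generic_matrix(n)
    det = {}
    for perm in permutations(range(n)):
        term = {(): sign(perm)}
        for i in range(n):
            term = poly_mul(term, x[i][perm[i]])
        det = poly_add(det, term)
    return det

def specialize(p, a):
    """Substitute x_{i,j} -> a[i][j] in the polynomial p."""
    total = 0
    for mono, c in p.items():
        term = c
        for (i, j) in mono:
            term *= a[i][j]
        total += term
    return total

A = [[2, 0, 1], [1, 3, 0], [0, 1, 4]]
image_of_X = [[specialize(x, A) for x in row] for row in generic_matrix(3)]
det_of_A = specialize(generic_det(3), A)
```

Applying `specialize` entrywise to the generic matrix returns `A` itself, and applying it to the generic determinant returns the determinant of `A`. Nothing here depends on the entries being integers: any values that can be added and multiplied commutatively would do, which mirrors the remark about commutative rings.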
The previous argument might seem silly at first, since everything was defined so that $A$ would be an image of $X$, so we should not be surprised when this actually happens. True as that is, let us show that it tells us a lot about the matrix $A$. But first, to shorten the notation, let's give a name to the phenomenon above.
Definition (specialization): Let $R,S$ be two unital integral domains (namely commutative without zero divisors), and let $A,B$ be two $n\times n$ matrices over $R$ and $S$ respectively. We say that $A$ specializes to $B$ if there is some homomorphism $\varphi:R\to S$ such that its extension to $M_n(R)\to M_n(S)$ (which we also denote by $\varphi$) satisfies $\varphi(A)=B$.
With this notation, we see that any matrix over any field (or unital commutative ring) is a specialization of $X$. The main idea now is to show that if $A$ specializes to $B$, then for certain properties, if $A$ satisfies them then so does $B$, and for certain other properties the opposite direction holds.
For a matrix $C$, denote by $f_C(t)=\det(tI-C)$ its characteristic polynomial. Assume that $A$ specializes to $B$ through the homomorphism $\varphi:R\to S$. We also denote by $\varphi$ its natural extensions $M_n(R[t])\to M_n(S[t])$ and $R[t]\to S[t]$.
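Lemma 1: We have $\varphi(f_A)=f_B$.
Proof: Since $\varphi$ is a homomorphism of rings and the determinant is a polynomial in the entries of the matrix, we get that for any matrix $C$ over $R[t]$ we have $\varphi(\det(C))=\det(\varphi(C))$. We thus have $\varphi(f_A)=\varphi(\det(tI-A))=\det(tI-\varphi(A))=\det(tI-B)=f_B$.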
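A familiar instance of Lemma 1 is reduction modulo a prime: the map from the integers to the integers mod $p$ is a ring homomorphism, so an integer matrix specializes to its reduction mod $p$. The sketch below assumes this particular choice of rings and homomorphism, which is made here only for illustration. The function `char_poly` is a hypothetical helper that expands $\det(tI-A)$ by the Leibniz formula, either over the integers or with all arithmetic done mod `mod`.

```python
from itertools import permutations

def reduce(c, mod):
    """Reduce an integer mod `mod`, or leave it alone when mod is None."""
    return c if mod is None else c % mod

def poly_mul(p, q, mod):
    """Product of polynomials in t, coefficient lists with lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = reduce(out[i + j] + a * b, mod)
    return out

def poly_add(p, q, mod):
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [reduce(a + b, mod) for a, b in zip(p, q)]

def sign(perm):
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def char_poly(a, mod=None):
    """Coefficients of det(tI - a), lowest degree first, via the Leibniz formula."""
    n = len(a)

    def entry(i, j):
        # (i, j) entry of tI - a as a polynomial in t: -a[i][j] + delta_ij * t
        return [reduce(-a[i][j], mod), 1 if i == j else 0]

    total = [0] * (n + 1)
    for perm in permutations(range(n)):
        term = [reduce(sign(perm), mod)]
        for i in range(n):
            term = poly_mul(term, entry(i, perm[i]), mod)
        total = poly_add(total, term, mod)
    return total

p = 5
A = [[7, 12], [-3, 9]]                        # a matrix over the integers
B = [[x % p for x in row] for row in A]       # its image over the integers mod 5

f_A = char_poly(A)                            # computed over the integers
f_B = char_poly(B, mod=p)                     # computed directly mod 5
phi_f_A = [c % p for c in f_A]                # phi applied to the coefficients of f_A
```

In this example `phi_f_A` and `f_B` agree, as the lemma predicts. The Leibniz expansion takes factorial time, so this is only meant for small matrices.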
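Lemma 2: If $f_B$ is irreducible, then $f_A$ is irreducible.
Proof: Suppose that $f_A=gh$ is a nontrivial decomposition (namely $\deg(g),\deg(h)\geq 1$), which implies that $f_B=\varphi(f_A)=\varphi(g)\varphi(h)$. We claim that this is a nontrivial decomposition as well. This follows from the fact that $f_A$ is monic of degree $n$, which implies that $f_B=\varphi(f_A)$ is monic of degree $n$. Applying $\varphi$ cannot raise degrees, and the degrees of $\varphi(g)$ and $\varphi(h)$ must still add up to $n$, hence $\deg(\varphi(g))=\deg(g)$ and $\deg(\varphi(h))=\deg(h)$. It then follows that $\deg(\varphi(g))\geq 1$ and $\deg(\varphi(h))\geq 1$, so we obtain a nontrivial decomposition for $f_B$. Thus, if $f_B$ is irreducible, then so is $f_A$.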
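Lemma 2 is the principle behind a classical trick: to show that a monic integer polynomial is irreducible, it suffices to find one prime $p$ modulo which it is irreducible. The following sketch assumes that setting (integers specializing to integers mod $p$) and tests irreducibility mod $p$ by brute force. The names `poly_divmod_monic` and `is_irreducible_mod_p` are hypothetical helpers written for this illustration.

```python
from itertools import product

def poly_divmod_monic(f, g, p):
    """Divide f by a monic g over the integers mod p.

    Polynomials are coefficient lists with lowest degree first.
    Returns (quotient, remainder).
    """
    f = [c % p for c in f]
    dg = len(g) - 1
    q = [0] * max(len(f) - dg, 1)
    for k in range(len(f) - 1, dg - 1, -1):
        c = f[k]
        if c:
            q[k - dg] = c
            # subtract c * t^(k - dg) * g, which cancels the t^k term of f
            for i, gc in enumerate(g):
                f[k - dg + i] = (f[k - dg + i] - c * gc) % p
    return q, f[:dg]

def is_irreducible_mod_p(f, p):
    """Brute force test for a monic f of degree >= 1 over the integers mod p.

    f is reducible exactly when some monic g with 1 <= deg g <= deg f / 2
    divides it, so we simply try every such g.
    """
    n = len(f) - 1
    for d in range(1, n // 2 + 1):
        for low in product(range(p), repeat=d):
            g = list(low) + [1]
            _, r = poly_divmod_monic(f, g, p)
            if not any(r):
                return False
    return True

# Characteristic polynomial of [[1, 2], [3, 4]]: t^2 - 5t - 2.
f = [-2, -5, 1]
irreducible_mod_5 = is_irreducible_mod_p(f, 5)
```

When the test succeeds for some prime, as it does here for $p=5$, the lemma shows that the integer polynomial is irreducible as well. A failed test proves nothing: the implication only goes one way, and $t^4+1$ is a standard example of a polynomial that is irreducible over the integers yet reducible modulo every prime.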
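Corollary: The matrix $X$ has an irreducible characteristic polynomial.
Proof: We know that $X$ specializes to any $n\times n$ matrix, so by the previous lemma we only need to find one such matrix with an irreducible characteristic polynomial, and of course there are plenty of those. For example, the companion matrix of $t^n-2$ over $\mathbb{Q}$ has characteristic polynomial $t^n-2$, which is irreducible by Eisenstein's criterion.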
Recall that if $f$ is a polynomial of degree $n$ over a characteristic zero field, then the irreducibility of $f$ implies that all the roots of $f$ are distinct. It is always a great joy for a matrix to have such a characteristic polynomial, since then it must have $n$ distinct eigenvalues, and is therefore diagonalizable (over the algebraic closure)! Any diagonal matrix clearly satisfies the Cayley-Hamilton theorem, and with some further thought, we see that this is also true for diagonalizable matrices (this is because if $A=PDP^{-1}$ for some invertible matrix $P$, then $f_A(A)=f_D(PDP^{-1})=Pf_D(D)P^{-1}=0$). Thus we have the following:
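Corollary: The generic matrix $X$ satisfies $f_X(X)=0$.
Proof: By the previous corollary the polynomial $f_X$ is irreducible over $\mathbb{Z}[x_{i,j}]$. Since it is monic, Gauss's lemma shows that it remains irreducible over the fraction field $\mathbb{Q}(x_{i,j})$, which is of characteristic zero. Thus $X$ is diagonalizable (over the algebraic closure of that field), and hence satisfies the Cayley-Hamilton theorem.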
And now we can finally prove the Cayley-Hamilton Theorem:
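Proof (Cayley-Hamilton): Let $A$ be any $n\times n$ matrix over a field $\mathbb{F}$ and let $\varphi_A:\mathbb{Z}[x_{i,j}]\to\mathbb{F}$ be the specialization map. Again, using the fact that $\varphi_A$ is a homomorphism, together with $\varphi_A(f_X)=f_A$ and $\varphi_A(X)=A$, we get that:
$$f_A(A)=\varphi_A(f_X)\big(\varphi_A(X)\big)=\varphi_A\big(f_X(X)\big)=\varphi_A(0)=0,$$
and we are done.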
The idea of generic objects is of course much more common than only the generic matrix. Indeed, this notion is closely related to free objects and other universal properties. The next step from here is not to look at a single generic matrix, but at the algebra generated by several generic matrices (in distinct indeterminates). This algebra will be generic in the sense that it specializes to any matrix algebra (and not just a single matrix), but this is a whole new subject on its own and will be left for another time.