Minimal polynomial (linear algebra)
In linear algebra, the minimal polynomial μ_{A} of an n × n matrix A over a field F is the monic polynomial P over F of least degree such that P(A) = 0. Any other polynomial Q with Q(A) = 0 is a (polynomial) multiple of μ_{A}.
The following three statements are equivalent:
- λ is a root of μ_{A},
- λ is a root of the characteristic polynomial χ_{A} of A,
- λ is an eigenvalue of matrix A.
The multiplicity of a root λ of μ_{A} is the largest power m such that Ker((A − λI_{n})^{m}) strictly contains Ker((A − λI_{n})^{m−1}). In other words, increasing the exponent up to m will give ever larger kernels, but further increasing the exponent beyond m will just give the same kernel.
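The kernel-chain description above can be tried out numerically. The following is a minimal sketch in plain Python with exact rational arithmetic; the helper names (`mat_mul`, `rank`, `root_multiplicity`) and the sample matrix are illustrative assumptions, not part of any standard library.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][c] / rows[r][c]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def root_multiplicity(A, lam):
    """Exponent m at which dim Ker((A - lam*I)^m) stops growing."""
    n = len(A)
    N = [[A[i][j] - (lam if i == j else 0) for j in range(n)] for i in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # N^0 = I
    prev_nullity, m = 0, 0
    while True:
        power = mat_mul(power, N)          # now holds N^(m+1)
        nullity = n - rank(power)
        if nullity == prev_nullity:        # kernel did not grow any further
            return m
        prev_nullity, m = nullity, m + 1

# Hypothetical sample: one Jordan block of size 2 and one of size 1 for 2.
A = [[2, 1, 0],
     [0, 2, 0],
     [0, 0, 2]]
print(root_multiplicity(A, 2))  # prints 2
```

For this sample the kernels of A − 2I and (A − 2I)^2 have dimensions 2 and 3, after which nothing changes, so 2 is a root of multiplicity 2 of the minimal polynomial. A value that is not an eigenvalue gives multiplicity 0.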
If the field F is not algebraically closed, then the minimal and characteristic polynomials need not factor into linear factors over F; they may have irreducible factors of degree greater than 1. For irreducible polynomials P one has similar equivalences:
- P divides μ_{A},
- P divides χ_{A},
- the kernel of P(A) has dimension at least 1,
- the kernel of P(A) has dimension at least deg(P).
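As a small illustration of these equivalences (an example chosen here, not taken from the text above), consider over the real numbers the matrix of a quarter-turn rotation and the irreducible polynomial P = X^2 + 1:

```latex
A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
P = X^2 + 1, \qquad
P(A) = A^2 + I_2
     = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}
     + \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
     = 0 .
```

Here A has no real eigenvalue, yet P divides both the minimal and the characteristic polynomial (each equals X^2 + 1), and the kernel of P(A) is all of R^2, of dimension 2 = deg(P).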
Like the characteristic polynomial, the minimal polynomial does not depend on the base field: considering the matrix as one with coefficients in a larger field does not change the minimal polynomial. The reason differs from that for the characteristic polynomial (where it is immediate from the definition of determinants): the minimal polynomial is determined by the relations of linear dependence between the powers of A, and extending the base field does not introduce any new such relations (nor, of course, does it remove existing ones).
The minimal polynomial is often the same as the characteristic polynomial, but not always. For example, if A is a multiple aI_{n} of the identity matrix, then its minimal polynomial is X − a since the kernel of aI_{n} − A = 0 is already the entire space; on the other hand its characteristic polynomial is (X − a)^{n} (the only eigenvalue is a, and the degree of the characteristic polynomial is always equal to the dimension of the space). The minimal polynomial always divides the characteristic polynomial, which is one way of formulating the Cayley–Hamilton theorem (for the case of matrices over a field).
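Since the minimal polynomial is determined by the first linear dependence among I, A, A^2, ..., it can be found by elementary linear algebra. The sketch below is one possible way to do this in plain Python over the rationals; the function names `solve` and `minimal_polynomial` are hypothetical, and the approach is meant as an illustration rather than an efficient method.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def solve(columns, target):
    """Return x with sum(x[j] * columns[j]) == target, or None if impossible.

    Assumes the given columns are linearly independent.
    """
    m, d = len(target), len(columns)
    # One row per coordinate; the last entry of each row is the target.
    rows = [[Fraction(columns[j][i]) for j in range(d)] + [Fraction(target[i])]
            for i in range(m)]
    r = 0
    for c in range(d):
        pivot = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if pivot is None:
            return None  # not reached when the columns are independent
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(m):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    if any(row[d] != 0 for row in rows[r:]):
        return None  # inconsistent system: target is not in the span
    return [rows[j][d] for j in range(d)]

def flatten(M):
    """Read a matrix as one long coordinate vector."""
    return [x for row in M for x in row]

def minimal_polynomial(A):
    """Coefficients [c_0, ..., c_(d-1), 1] of the monic minimal polynomial."""
    n = len(A)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # A^0 = I
    powers = []
    while True:
        x = solve(powers, flatten(power))
        if x is not None:
            # A^d = sum x_j A^j, hence mu = X^d - sum x_j X^j.
            return [-c for c in x] + [Fraction(1)]
        powers.append(flatten(power))
        power = mat_mul(power, A)

print(minimal_polynomial([[2, 0, 0], [0, 2, 0], [0, 0, 2]]) == [-2, 1])  # prints True
```

For the scalar matrix 2I_{3} the result encodes X − 2, of degree 1, while the characteristic polynomial (X − 2)^3 has degree 3. For the Jordan block with rows (2, 1) and (0, 2) the same function returns the coefficients of (X − 2)^2.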
Formal definition
Given an endomorphism T on a finite-dimensional vector space V over a field F, let I_{T} be the set defined as
I_{T} = { p ∈ F[t] | p(T) = 0 },
where F[t] is the space of all polynomials over the field F. I_{T} is a proper ideal of F[t], and it is nonzero because V is finite-dimensional. Since F is a field, F[t] is a principal ideal domain, thus any ideal is generated by a single polynomial, which is unique up to a nonzero scalar factor in F. A particular choice among the generators can be made, since precisely one of the generators is monic. The minimal polynomial is thus defined to be the monic polynomial which generates I_{T}. It is the monic polynomial of least degree in I_{T}.
Applications
An endomorphism φ of a finite dimensional vector space over a field F is diagonalizable if and only if its minimal polynomial factors completely over F into distinct linear factors. The fact that there is only one factor X − λ for every eigenvalue λ means that the generalized eigenspace for λ is the same as the eigenspace for λ: every Jordan block has size 1. More generally, if φ satisfies a polynomial equation P(φ) = 0 where P factors into distinct linear factors over F, then it will be diagonalizable: its minimal polynomial is a divisor of P and therefore also factors into distinct linear factors. In particular one has:
- P = X^{k} − 1: finite order endomorphisms of complex vector spaces are diagonalizable. For the special case k = 2 of involutions, this is even true for endomorphisms of vector spaces over any field of characteristic other than 2, since X^{2} − 1 = (X − 1)(X + 1) is a factorization into distinct factors over such a field. This is a part of representation theory of cyclic groups.
- P = X^{2} − X = X(X − 1): endomorphisms satisfying φ^{2} = φ are called projections, and are always diagonalizable (moreover their only eigenvalues are 0 and 1).
- By contrast, if μ_{φ} = X^{k} with k ≥ 2, then φ (a nilpotent endomorphism) is not diagonalizable, since X^{k} has a repeated root 0.
These cases can also be proved directly, but the minimal polynomial gives a unified perspective and proof.
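The criterion can be checked mechanically for a given list of candidate eigenvalues: a matrix is diagonalizable with all eigenvalues among distinct values r_1, ..., r_k exactly when the product of the matrices A − r_i I vanishes. The following plain-Python sketch illustrates this; the function name `annihilated_by_distinct_roots` and the sample matrices are assumptions made for the example.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def annihilated_by_distinct_roots(A, roots):
    """True if the product of (A - r*I) over the distinct roots is zero."""
    n = len(A)
    product = [[int(i == j) for j in range(n)] for i in range(n)]
    for r in set(roots):  # repeated roots are deliberately collapsed
        factor = [[A[i][j] - (r if i == j else 0) for j in range(n)]
                  for i in range(n)]
        product = mat_mul(product, factor)
    return all(x == 0 for row in product for x in row)

projection = [[1, 1], [0, 0]]   # satisfies P^2 = P
involution = [[0, 1], [1, 0]]   # satisfies S^2 = I
nilpotent = [[0, 1], [0, 0]]    # satisfies N^2 = 0 but N != 0

print(annihilated_by_distinct_roots(projection, [0, 1]))   # prints True
print(annihilated_by_distinct_roots(involution, [1, -1]))  # prints True
print(annihilated_by_distinct_roots(nilpotent, [0]))       # prints False
```

The last line reflects that the only candidate eigenvalue of the nilpotent matrix is 0, yet X alone does not annihilate it, so its minimal polynomial X^2 has a repeated root.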
Computation
For a nonzero vector v in V define:
I_{T,v} = { p ∈ F[t] | p(T)(v) = 0 }.
This set is a proper ideal of F[t]. Let μ_{T,v} be the monic polynomial which generates it.
Properties
- Since I_{T,v} contains the minimal polynomial μ_{T}, the latter is divisible by μ_{T,v}.
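- If d is the least natural number such that v, T(v), ..., T^{d}(v) are linearly dependent, then there exist unique a_{0}, a_{1}, ..., a_{d−1} in F such that
a_{0}v + a_{1}T(v) + ... + a_{d−1}T^{d−1}(v) + T^{d}(v) = 0,
and for these coefficients one has
μ_{T,v}(t) = a_{0} + a_{1}t + ... + a_{d−1}t^{d−1} + t^{d}.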
- Let the subspace W be the image of μ_{T,v}(T), which is T-stable. Since μ_{T,v}(T) annihilates at least the vectors v, T(v), ..., T^{d−1}(v), the codimension of W is at least d.
- The minimal polynomial μ_{T} is the product of μ_{T,v} and the minimal polynomial Q of the restriction of T to W. In the (likely) case that W has dimension 0 one has Q = 1 and therefore μ_{T} = μ_{T,v}; otherwise a recursive computation of Q suffices to find μ_{T}.
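The first step of this procedure, finding μ_{T,v} from the first linear dependence among v, T(v), T^2(v), ..., can be sketched as follows in plain Python with exact rationals. The names `apply_map`, `solve` and `local_minimal_polynomial`, as well as the diagonal sample matrix, are hypothetical choices for the illustration.

```python
from fractions import Fraction

def apply_map(T, v):
    """Image of the vector v under the matrix T (list of rows)."""
    return [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(T))]

def solve(columns, target):
    """Return x with sum(x[j] * columns[j]) == target, or None if impossible.

    Assumes the given columns are linearly independent.
    """
    m, d = len(target), len(columns)
    rows = [[Fraction(columns[j][i]) for j in range(d)] + [Fraction(target[i])]
            for i in range(m)]
    r = 0
    for c in range(d):
        pivot = next((i for i in range(r, m) if rows[i][c] != 0), None)
        if pivot is None:
            return None  # not reached when the columns are independent
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(m):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    if any(row[d] != 0 for row in rows[r:]):
        return None  # target is not in the span of the columns
    return [rows[j][d] for j in range(d)]

def local_minimal_polynomial(T, v):
    """Coefficients [a_0, ..., a_(d-1), 1] of mu_{T,v} for a nonzero v."""
    images, current = [], list(v)
    while True:
        x = solve(images, current)
        if x is not None:
            # T^d(v) = sum x_j T^j(v), so a_j = -x_j.
            return [-c for c in x] + [Fraction(1)]
        images.append(current)
        current = apply_map(T, current)

# Hypothetical sample: diagonal matrix with eigenvalues 1, 1, 2.
T = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 2]]
print(local_minimal_polynomial(T, [1, 0, 1]) == [2, -3, 1])  # prints True
print(local_minimal_polynomial(T, [1, 0, 0]) == [-1, 1])     # prints True
```

For v = (1, 0, 1) the result encodes t^2 − 3t + 2 = (t − 1)(t − 2), which is already the minimal polynomial of this T. For the eigenvector v = (1, 0, 0) it is only the proper divisor t − 1, which shows why the choice of v matters.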
Example
Define T to be the endomorphism of R^{3} with matrix, on the canonical basis,
( 1  −1  −1 )
( 1  −2   1 )
( 0   1  −3 )
Taking the first canonical basis vector e_{1} and its repeated images by T one obtains (writing column vectors as rows)
e_{1} = (1, 0, 0), T⋅e_{1} = (1, 1, 0), T^{2}⋅e_{1} = (0, −1, 1), T^{3}⋅e_{1} = (0, 3, −4),
of which the first three are easily seen to be linearly independent, and therefore span all of R^{3}. The last one then necessarily is a linear combination of the first three, in fact
- T^{3}⋅e_{1} = −4T^{2}⋅e_{1} − T⋅e_{1} + e_{1},
so that:
- μ_{T,e_{1}} = X^{3} + 4X^{2} + X − 1.
This is in fact also the minimal polynomial μ_{T} and the characteristic polynomial χ_{T}: indeed μ_{T,e_{1}} divides μ_{T}, which divides χ_{T}, and since the first and last are of degree 3 and all are monic, they must all be the same. Another reason is that in general if any polynomial in T annihilates a vector v, then it also annihilates T⋅v (just apply T to the equation that says that it annihilates v), and therefore by iteration it annihilates the entire space generated by the iterated images by T of v; in the current case we have seen that for v = e_{1} that space is all of R^{3}, so μ_{T,e_{1}}(T) = 0. Indeed one verifies for the full matrix that T^{3} + 4T^{2} + T − I_{3} is the zero matrix.
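The final verification can be reproduced with a few lines of plain Python. The matrix entries below are an assumption made for this sketch: they are chosen so that the iterated images of e_{1} satisfy the relation T^3⋅e_{1} = −4T^2⋅e_{1} − T⋅e_{1} + e_{1} stated in the example.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Assumed matrix of T on the canonical basis (see the lead-in above).
T = [[1, -1, -1],
     [1, -2,  1],
     [0,  1, -3]]

I3 = [[int(i == j) for j in range(3)] for i in range(3)]
T2 = mat_mul(T, T)
T3 = mat_mul(T2, T)

# The iterated images of e_1 are the first columns of I, T, T^2, T^3.
images = [[M[i][0] for i in range(3)] for M in (I3, T, T2, T3)]
print(images)  # prints [[1, 0, 0], [1, 1, 0], [0, -1, 1], [0, 3, -4]]

# Evaluate X^3 + 4X^2 + X - 1 at T, entry by entry.
value = [[T3[i][j] + 4 * T2[i][j] + T[i][j] - I3[i][j] for j in range(3)]
         for i in range(3)]
print(value)  # prints [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```

The first output lists e_{1} and its three successive images, and the second confirms that the polynomial annihilates the whole matrix and not only the vector e_{1}.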