In the post about Diophantine approximation, we saw that in order to find “good” rational approximations to a real number, it is enough to prove that given a lattice and a “big enough” box around the origin, the box must contain a nonzero lattice point. This very intuitive result is due to Minkowski which started what is today known as the geometry of numbers. After several examples in order to get some intuition, we shall prove this result and see several of its applications.
The most trivial lattices are the 1-dimensional lattices, and in these cases a box is simply an interval , and as we want to consider symmetric boxes around the origin, we will work with intervals of the form
for
. This box contains a nonzero point from the lattice
if and only if
, or in different notation
.
The next lattice we consider is , and a symmetric box there has the form
for some
. Such a box will contain a nonzero lattice point if and only if
or
, which is certainly true if
. Using our lattice notations we see that
contains a nonzero lattice point if
. A similar argument will show that in dimension 3 the right inequality will be
, and more generally in dimension
we will require that
.
Up until now we worked with boxes and box-lattices, which work well together because we can find fundamental domains for such lattices which are boxes (which is not true for the general lattice). We can also ask a similar question where is a general parallelepiped, a ball, an ellipsoid etc and
is a general lattice. How big can we take
before it must contain a nonzero point from
?
At this point we should ask ourselves what are the right conditions for a set so that we have a result of the form “if
is big enough, then it contains a nonzero lattice point”. After playing around a little bit with examples, you should see that the following two properties are very important.
Definition: Let .
is called convex if whenever
and
, then
.
is called symmetric if whenever
we also have that
.
The convexity property just means that whenever contains two points, it must also contain the line joining these two points. If we leave it out, then we can take a very big ball
, and remove from it the small balls
for all
. If we choose
to be small enough, then the size of
with these holes will be roughly the size of
without the holes, which grows to infinity as
. In other words, we can find very big nonconvex sets (though still symmetric) which don’t contain any nonzero lattice points. The symmetric condition is also important. In the example of the square-lattice
we can take the set
where
is as big as we want to get that
is convex and doesn’t contain any lattice point. Even if we add the condition that
must contain the origin, we can take the triangle
with vertices
to get a counter example.
After this introduction, we are ready to state the theorem.
Theorem (Minkowski): Let be a lattice and
be a measurable set which is convex and symmetric.
- If
, then
contains a nonzero lattice point.
- If in addition
is a closed set, then it is enough to require that
.
Remark: You should check that these bounds are tight, i.e. we cannot change the strict inequality in (1) to , and we cannot take a smaller bound than
in (2).
Proof: The main idea of the proof is that given a region and
, the region
contains a lattice point if and only if
contains a lattice point. This let us decompose
to small parts and move them around to create new sets which are easier to work with.
The first step of the proof is due to Blichfeldt and states that any measurable set with
contains two points
such that
. Otherwise, take some fundamental domain
for
, and decompose
to small parts and move each one of them into
using elements from
. Since
, there will be overlap somewhere, which will produce two points as required. This is in a sense the pigeonhole principle which says that if we try to jam a “big set” into a “smaller set”, then two points must go to the same place.
Below we have the example with the square-lattice and the standard fundamental domain . After decomposing the set on the left and moving it into the fundamental domain (using translations from the lattice) we get an overlap in two triangles.
More formally, write as a disjoint union (up to measure zero)
In the second line we used the property of fundamental domains which says that its translations by distinct lattice points are disjoint up to measure zero, so that the measure of the union is the sum of the measures. The sets are all inside
and the sum of their measures is more than
, so we can find
such that
which implies that there are
such that
.
Now that we have proven Blichfeldt’s theorem, suppose that we are given such that
. It then follows that
, so by Blichfeldt’s theorem we can find two distinct points
such that
. If we also know that
is symmetric, then
, and if it is also convex, then
which is exactly what we needed.
The proof for closed sets is left as an exercise.
The first application for Minkowski’s theorem that we have already seen is to Diophantine approximation. Recall, that finding “good” rational approximations for some translates to finding nontrivial points from the lattice
inside the boxes
as
. Since
has
and
are closed boxes of size
, we can apply Minkowski’s theorem and get the required rational approximations.
I will leave it as an exercise (for now) to show that a similar result is true for higher dimension – given a set of numbers and
such that
, we can find infinitely many distinct integer tuples
such that
for every
.
There are many applications to Minkowski’s theorem in mathematics which seem to come out of nowhere. For the algebraic number theorists among you, Dirichlet’s unit theorem (which I will cover in a later post) and the fact that the ideal class group is finite are usually proven using Minkowski’s theorem. Another application is Lagrange’s four squares theorem which states that any positive integer can be written as a sum of four squares of integers.
Here is a baby case application of Minkowski’s theorem for number theory. Let be a prime, and consider the problem of finding a solution to
where
are integers. This problem is in a sense the first step in algebraic number theory, since such a solution is equivalent to a decomposition
in the Gaussian integers
. If
, then we have the solution
, so let us assume that
is odd. Trying to solve this for small primes, we see that while
have a solution, the primes
don’t have any solution, which lead us to consider the residue of
mod 4. If
, then there is no solution in integers because there is no solution mod 4 – this follows from the fact that the squares mod 4 are only 0,1 and you cannot solve
. If
, then a solution
implies that
. Actually, the second expression is always true for such primes (HINT: write
in two different ways), and we can ask whether the inverse is still true.
Suppose that there is some integer such that
and we want to find
such that
. If this is the case, then
, so up to a choice of sign we have that
(up to a sign), or in other words
. Running over all possible
produces the 2-dimensional lattice
which has covolume
. Thus, the nonzero vectors
in this lattice always satisfy
and we are looking for such a vector which satisfy
. Note that such a vector must have the minimal (nonzero) length in the lattice. In search of small vectors in this lattice, let us look on the open ball
which has area
. It follows that we can apply Minkowski’s theorem and find a nonzero lattice point
such that
and
so that
, and we are done.
Probably the most important application of Minkowski’s theorem is showing that lattices have “small” nonzero vectors. More precisely, let be the volume of an
-dimensional ball of radius
and let
be a lattice. By Minkowski’s theorem if
is big enough, then the closed ball
of radius
around the origin contains a nonzero lattice point, or in other words, there exists
such that
. It is quite hard in general to actually find the shortest nonzero vector from a lattice, and though I’m not sure, probably even finding the vector that Minkowski’s theorem promises is not that trivial. Finding a good algorithm for this problem (or show that there isn’t such) is an important open problem in computer science, and in cryptography in particular.
Going back to the problem of Diophantine approximation, the family of sets that we had there are boxes of the form . These boxes not only have the same size (which is big enough in order to apply Minkowski’s theorem), but are also very similar to one another. Actually, if we take only one box
and multiply it by the matrices
, then we will get the entire family of boxes. If we let
, then this family of matrices is actually a group isomorphic to
. As always, mathematical objects become much more interesting once we find a nice group which acts on them and “behaves” nicely with respect to the properties that we want to check. Thus our next step (in some future post) will be to understand this group action and reformulate all of our results so far using this new language.