In the post about Diophantine approximation, we saw that in order to find “good” rational approximations to a real number, it is enough to prove that given a lattice and a “big enough” box around the origin, the box must contain a nonzero lattice point. This very intuitive result is due to Minkowski which started what is today known as the geometry of numbers. After several examples in order to get some intuition, we shall prove this result and see several of its applications.
The most trivial lattices are the 1-dimensional lattices, and in these cases a box is simply an interval , and as we want to consider symmetric boxes around the origin, we will work with intervals of the form for . This box contains a nonzero point from the lattice if and only if , or in different notation .
The next lattice we consider is , and a symmetric box there has the form for some . Such a box will contain a nonzero lattice point if and only if or , which is certainly true if . Using our lattice notations we see that contains a nonzero lattice point if . A similar argument will show that in dimension 3 the right inequality will be , and more generally in dimension we will require that .
Up until now we worked with boxes and box-lattices, which work well together because we can find fundamental domains for such lattices which are boxes (which is not true for the general lattice). We can also ask a similar question where is a general parallelepiped, a ball, an ellipsoid etc and is a general lattice. How big can we take before it must contain a nonzero point from ?
At this point we should ask ourselves what are the right conditions for a set so that we have a result of the form “if is big enough, then it contains a nonzero lattice point”. After playing around a little bit with examples, you should see that the following two properties are very important.
Definition: Let .
- is called convex if whenever and , then .
- is called symmetric if whenever we also have that .
The convexity property just means that whenever contains two points, it must also contain the line joining these two points. If we leave it out, then we can take a very big ball , and remove from it the small balls for all . If we choose to be small enough, then the size of with these holes will be roughly the size of without the holes, which grows to infinity as . In other words, we can find very big nonconvex sets (though still symmetric) which don’t contain any nonzero lattice points. The symmetric condition is also important. In the example of the square-lattice we can take the set where is as big as we want to get that is convex and doesn’t contain any lattice point. Even if we add the condition that must contain the origin, we can take the triangle with vertices to get a counter example.
After this introduction, we are ready to state the theorem.
Theorem (Minkowski): Let be a lattice and be a measurable set which is convex and symmetric.
- If , then contains a nonzero lattice point.
- If in addition is a closed set, then it is enough to require that .
Remark: You should check that these bounds are tight, i.e. we cannot change the strict inequality in (1) to , and we cannot take a smaller bound than in (2).
Proof: The main idea of the proof is that given a region and , the region contains a lattice point if and only if contains a lattice point. This let us decompose to small parts and move them around to create new sets which are easier to work with.
The first step of the proof is due to Blichfeldt and states that any measurable set with contains two points such that . Otherwise, take some fundamental domain for , and decompose to small parts and move each one of them into using elements from . Since , there will be overlap somewhere, which will produce two points as required. This is in a sense the pigeonhole principle which says that if we try to jam a “big set” into a “smaller set”, then two points must go to the same place.
Below we have the example with the square-lattice and the standard fundamental domain . After decomposing the set on the left and moving it into the fundamental domain (using translations from the lattice) we get an overlap in two triangles.
More formally, write as a disjoint union (up to measure zero)
In the second line we used the property of fundamental domains which says that its translations by distinct lattice points are disjoint up to measure zero, so that the measure of the union is the sum of the measures. The sets are all inside and the sum of their measures is more than , so we can find such that which implies that there are such that .
Now that we have proven Blichfeldt’s theorem, suppose that we are given such that . It then follows that , so by Blichfeldt’s theorem we can find two distinct points such that . If we also know that is symmetric, then , and if it is also convex, then which is exactly what we needed.
The proof for closed sets is left as an exercise.
The first application for Minkowski’s theorem that we have already seen is to Diophantine approximation. Recall, that finding “good” rational approximations for some translates to finding nontrivial points from the lattice inside the boxes as . Since has and are closed boxes of size , we can apply Minkowski’s theorem and get the required rational approximations.
I will leave it as an exercise (for now) to show that a similar result is true for higher dimension – given a set of numbers and such that , we can find infinitely many distinct integer tuples such that for every .
There are many applications to Minkowski’s theorem in mathematics which seem to come out of nowhere. For the algebraic number theorists among you, Dirichlet’s unit theorem (which I will cover in a later post) and the fact that the ideal class group is finite are usually proven using Minkowski’s theorem. Another application is Lagrange’s four squares theorem which states that any positive integer can be written as a sum of four squares of integers.
Here is a baby case application of Minkowski’s theorem for number theory. Let be a prime, and consider the problem of finding a solution to where are integers. This problem is in a sense the first step in algebraic number theory, since such a solution is equivalent to a decomposition in the Gaussian integers . If , then we have the solution , so let us assume that is odd. Trying to solve this for small primes, we see that while have a solution, the primes don’t have any solution, which lead us to consider the residue of mod 4. If , then there is no solution in integers because there is no solution mod 4 – this follows from the fact that the squares mod 4 are only 0,1 and you cannot solve . If , then a solution implies that . Actually, the second expression is always true for such primes (HINT: write in two different ways), and we can ask whether the inverse is still true.
Suppose that there is some integer such that and we want to find such that . If this is the case, then , so up to a choice of sign we have that (up to a sign), or in other words . Running over all possible produces the 2-dimensional lattice which has covolume . Thus, the nonzero vectors in this lattice always satisfy and we are looking for such a vector which satisfy . Note that such a vector must have the minimal (nonzero) length in the lattice. In search of small vectors in this lattice, let us look on the open ball which has area . It follows that we can apply Minkowski’s theorem and find a nonzero lattice point such that and so that , and we are done.
Probably the most important application of Minkowski’s theorem is showing that lattices have “small” nonzero vectors. More precisely, let be the volume of an -dimensional ball of radius and let be a lattice. By Minkowski’s theorem if is big enough, then the closed ball of radius around the origin contains a nonzero lattice point, or in other words, there exists such that . It is quite hard in general to actually find the shortest nonzero vector from a lattice, and though I’m not sure, probably even finding the vector that Minkowski’s theorem promises is not that trivial. Finding a good algorithm for this problem (or show that there isn’t such) is an important open problem in computer science, and in cryptography in particular.
Going back to the problem of Diophantine approximation, the family of sets that we had there are boxes of the form . These boxes not only have the same size (which is big enough in order to apply Minkowski’s theorem), but are also very similar to one another. Actually, if we take only one box and multiply it by the matrices , then we will get the entire family of boxes. If we let , then this family of matrices is actually a group isomorphic to . As always, mathematical objects become much more interesting once we find a nice group which acts on them and “behaves” nicely with respect to the properties that we want to check. Thus our next step (in some future post) will be to understand this group action and reformulate all of our results so far using this new language.