One of the thing I really love in mathematics is how different subjects, which at least when starting to learn mathematics seem far apart, can come together in interesting ways. While the first reaction of many people might be that it just makes it more complicated, I always thought that it is the other way around. This means that I can use everything that I know from all of the different subjects, instead of using just my knowledge from a single subject. More over, it is almost always interesting to see the same problem from several points of view, each one with its own tools and ideas, and see how they all connect.
In this post, I want to share one such problem with you. Originally, I knew it from the algebra point of view, and that alone can fill several posts. However, along the way I have heard about this interesting new geometric proof, which connects this already interesting problem to even more field in mathematics. And the best thing, as many problems, it has a very humble beginning with a theorem we all learn about in high school – the Pythagorean theorem.
For the Hebrew speaking people – a video version of this post can be found here.
Pythagorean Triples
We begin by recalling first the famous Pythagorean theorem. If we are given a right angled triangle, then the square of the length of the hypotenuse (the edge not touching the right angle) equals the sum of the squares of the lengths of the other edges:
There are many interesting proofs for this theorem, but here we would take it as a given, and instead concentrate on a special case of such triangles – those that have edges of positive integer length. The smallest such triangle has lengths and
, and you can check that
.
These triples of integers corresponding to right angled triangles are called Pythagorean triples. They play all sorts of roles in mathematics, in particular in algebra and are quite interesting in general. However, these triples are not easy to come by. Even if we start with a triangle with edges of length near the right angle, then we only know that the hypotenuse has length
which in general is not an integer. So already, just finding such triangles is not an easy task, not to mention try to find all of them. But mathematicians do not give up easily, and eventually we have found some trick to find these triples.
The first trick to find more Pythagorean triple is to start with a given one, e.g. , and to stretch it with some integer factor. For example, if we stretch it by 2, we get the triangle with sides
which is another Pythagorean triple. You can also check it algebraically, by noting that
Of course, we can stretch our triples by any integer factor to get infinitely more Pythagorean triples. We can also try shrinking our triangle by some integer factor, but this doesn’t always give a new triangle with integer sides. If we are given a triple where all of its integers are divisible by the same number, then we can divide by it to get another Pythagorean triple (namely, move from to
by dividing by 2). When we get to an integer triangle that we cannot shrink any more, or algebraically speaking, the greatest common divisor of the lengths of its sides is 1, we call the Pythagorean triple primitive. For example, our
triple is primitive.
With this stretching and shrinking process, instead of looking for all the Pythagorean triples, we can restrict our attention to only primitive triples. If we can find all of them, we can then find all the Pythagorean triples by simply stretching them. This process already gives us a lot of integer triangles, but not all of them, so we need more tricks. If up until now we multiplied by some integer, we now want to multiply by another Pythagorean triple.
Multiplying Pythagorean triples
If we start with any two Pythagorean triples (not necessarily distinct), for example
we can multiply them to get
.
This shows that if both and
are sums of two squares, then their product is a sum of four squares. While this is already not a trivial result, we actually want to know if we can write it as sum of two squares.
To somehow go down to 2 squares we use the following trick. While the equations are defined over the integers, in a sense the right place to view them in the complex world. Indeed, in the complex world a “sum of two squares” have a very special meaning, since is a complex number times its conjugate.
Rewriting the product from above we get that:
In other words, we can now multiply two Pythagorean triples to get a new one. Note that while in the integer world we had both multiplication (the squares) and addition, which results in sum of four squares, in this complex world everything became multiplication, and it is much easier to work with only multiplication than with both multiplication and addition.
In the previous section we saw that we can either stretch our triangles or reverse this process by shrinking them. Here we saw that we can multiply them, but can we break them apart also? For example, if we only knew that can be written as a sum of squares, then can we conclude that both
and
are sums of two squares? Or going even further, are
and
are sums of two squares?
The same complex trick from above shows that if is a sum of squares, then so it
, but is the opposite true as well? For these specific cases, we can actually check that this is true (
), but can we show it in general?
More formally, if , is it true that both
and
are sums of two squares? With this formulation it is not true. For example, if
is any number which is not a sum of two squares (e.g.
) then
or even
are counter examples. To avoid these type of examples, let us assume that
are coprime, and we are going to show that in this case the claim holds. Such points
where
and
are coprime are called primitive points. Note that in our Pythagorean triple world, this condition is simply the primitiveness of the Pythagorean triple.
The lattice of good points
Our goal now is to somehow start with the equality , and try to find a solution for
. We already saw before that sum of two squares
can be thought of as a complex number times its conjugate, but another interpretation is just the distance squared of the point
from the origin. With this point of view the question becomes the following:
Question: Given an integer point
in the plane, with distance squared
, is there another point with distance squared equals to
?
As an example, assume that we start with the point , which satisfies
, and we want to find a point
such that
.
Since our only input is a given integer point with distance squared divisible by , we would like to somehow use it to create more points with this property. Eventually, after we find all the points that we can with distance squared divisible by
, we want to look for the one closest to the origin, namely with smallest
, and hope that
.
We already seen the first trick to create more such points. If satisfies
, then the same is true for
for all
. All of these points are integer points on the line spanned by
, and as an exercise you should show that since
is primitive these are exactly all the integer points on this line.
Of course, these new points are further away from the origin, so we will need to use another way to get closer point. For that we will use a second trick – if then it is also true that
and
. For example
so that if and only if
.
We can now create lots of new such points. We start with all the integer points on our line from above, and add all the new points that we can get to by moving steps left, right, down or up.
For example, our original point was which had distance squared
. The next point on the line was
with distance squared
. By jumping 5 units to the left and then 5 down we get the point
with distance squared
which in particular is closer to the origin than the point
. Similarly, we can do it for
and after the 5-jumps get to
which has distance squared
, and finally for the point
which produce the point
with distance squared
which is exactly what we wanted to find.
Remark: These -jumps basically tell us that in a sense we work modulo
. More specifically, we have the natural map
, where the points generated above are exactly the preimage of the line generated by
. In the image below we see the line generated by
“modulo” 5. Each time the line gets to the right (resp. top) edge, it jumps 5 units to the left (resp. down).
The set of points that we get in the process defined above can also be written as
This is of course a set closed under addition and taking negatives, or in other words a subgroup of (and actually, it is a lattice). Our goal is to say something about the shortest nontrivial point in this set, and hopefully show that its distance squared from the origin is
.
In our example, the lattice is going to be the blue point in the image below:
As you can see, it is a very nice looking (=symmetric) set of points. The question is how can we use its properties to say something about its shortest nonzero vector.
From lines to triangles
The natural setting of our new set is the 2-dimensional real plane, while we look for a result about the 1-dimensional length of the shortest vector, which make our problem a bit unnatural. To move from one to the other, we use a simple trick, and instead of looking on the size of the 1-dimensional vector, we will look on the size of a 2-dimensional triangle. More specifically, we want to look on the smallest possible triangle, and hopefully use it to say something about it edges (and in particular, that one of them is very small).
Let be a shortest nontrivial vector, and choose some
such that the triangle defined by
is nontrivial (
and
are not on the same line) and the triangle doesn’t contain any more points from
, whether in the interior or on the edges (prove to yourself that such a triangle exists). This triangle now has 3 edges of lengths
and
. By definition we know that
, but we claim that
also. Indeed, this follows from the fact that our set
is close under subtraction.
So what can we say about this triangle, which we know that all of its edges are at least of size ? You can check that its area is at least the area of an equilateral triangle with all side equal to
, and we can compute directly the area of such triangle – it is exactly
. We now have a way to give an upper bound on
– to do this we simply need to give an upper bound on the area of our original triangle.
Luckily for us, the set has all sorts of nice properties. The first one, is that all of the triangles with vertices in
which don’t have any other point from
in them have the same area. This is true in general for every lattice (after finding the analogue for triangles in higher dimension). We will give an intuition for this result in dimension 2 later, but for now let us assume this claim. The question is, can we find such a triangle which we can easily compute its area?
In our example, there is a very simple triangle that we can use. Its vertices are
,
and
. It has base of length
and height
, so its area is exactly
. Can we do something similar in general?
To create such a triangle for the general case, we will start with the vertices and
which we know are in
. To complete it to a triangle, we claim that there is (1) a point of the form
and (2) there are no points in
strictly between
and
. Given these two claims, the triangle with vertices
and
doesn’t contain any points from
, other than its vertices, and its area is simply
.
Recall that our whole investigation started with the point where
and
. This implies in particular that
also (since if
is some prime then
and therefore
– contradiction). Using this coprime condition, we can find integers
such that
, so that
This proves our first claim. For the second, if we can write
then just looking on the second coordinate, we get that . Since we already saw that
is coprime to
and now
we conclude that
. It then follows that
is divisible by
, which proves the second claim.
To summarize, given the shortest nonzero vector , we constructed a triangle with the length of each edge at least
. Its area is at least the area of an equilateral triangle with all edges of length
, which is exactly
. On the other hand our original triangle has the same area as other triangles defined by vertices in
(and no other points inside), which is
, so all in all we get that
.
Finally, recall that we constructed so that all of its points have distance squared
for some integer
, so we conclude that
must satisfy
, thus completing our proof.
On the area of triangles
As mentioned before, the fact that all the triangles with vertices in and no other points from it have the same area is a general property of lattices, and is not unique for this set.
For those who want to try and prove it formally, here are the steps that you should try to do:
- Instead of working with general triangles with vertices
, you can always translate them by
without changing the area, nor the number of points inside it, so that the vertices are
. Namely, we can assume that one of the vertices is the origin.
- Next, show that
is actually generated over
by two vectors (specifically in this case, one such example is
and a vector of the form
that we know exists).
- More over, two vector
generate it if and only if the triangle defined by
doesn’t contain any other point from
.
- To compute the area of the triangle, we can simply put the two vector as columns of a
matrix
, and then the area will be
.
- Finally, to move from one such matrix to another
there is always some
such that
. In particular this implies that the areas in the two triangles are the same, since
.
The arguments above can be generalized to higher dimension, however, in dimension 2 there is a nice geometric visualization of what happens there. There are three basic transformations that change the triangle without changing it area. The first one is simply a translation. The second is rotation. For the last one, since the area is the base times the height divided by two, we want to change the triangle without changing the sizes of the height and base. For that choose one edge as a base, and move it along the line the it defines and keep the opposing vertex at place. This keeps both the base and the height fix, and therefore the area of the triangle.
You should now spend some minutes convincing yourself that this is enough, at least for our example with . More over, try to see how these three transformation correspond to the algebraic proof above.
Some final remarks
We started our investigation with right angled triangles and the Pythagorean theorem, moved to the set of sum of two squares and saw some of its interesting algebraic properties, and while trying to figure them out, somehow managed to construct a lattice and looked for its shortest vector.
Each one of these steps has a full world of interesting mathematics behind it, and at least for me, I always found this type of problem, which involved several ideas and subjects together quite interesting. Not only it let us get to know new and interesting subjects, it also let us combine our knowledge from all of the different subject to create something new.
Our original problem here was to find all the Pythagorean triple, and here we “just” saw that the set of sum of two squares is not only multiplicative, but we can break them up also. This is not enough to find all the Pythagorean triples, but it is a big step. In particular, if we keep breaking them up, eventually we will get stuck at an “unbreakable” number, or in other words a prime number. Thus, in the end, we are left with the problem of which primes can be written as
(another favorite problem for me). Dealing with primes also involves lots of number theory which make this whole process even more interesting, but this is for another post.
For now, if you want to play around a little bit with the more geometric ideas mentioned in this post, and interact with all of the triangles in pictures above, you can go to the program used to create them here: https://openprocessing.org/sketch/1242516