One of the things that I love about mathematics is how you can come across a problem or a phenomenon that can be viewed from several points of view, each one different, yet all combining to form an interesting story. It can be a story that grows with you as you learn more and more advanced mathematics, or a story that lets you connect different parts of mathematics.
The story at the focus of this post and the next one is about such a problem. It revolves around the operation that takes two positive rational numbers and “returns” the wrong way to add them: $\frac{a}{b}, \frac{c}{d} \mapsto \frac{a+c}{b+d}$. This story starts from the most elementary mathematics there is and goes all the way to the most advanced, introducing many interesting concepts along the way.
For all of you Hebrew speakers out there, a video version of this post can be found here: https://www.youtube.com/watch?v=SPFxgNcm5rs
A new interesting average.
I first learned about this operation when I was in elementary school. In an attempt to teach us pupils mathematics, the school had a very simple math computer game for us to play. In this game a man walked down a hallway, and each time he wanted to open a door, he needed to find a rational number between two other rationals.
For example, we can get the problem of finding a rational number $x$ such that
$$\frac{1}{5} < x < \frac{2}{3}.$$
There are many ways to find such a rational number, but probably the most well known is to go for the number exactly in the middle, or in other words the arithmetic mean:
$$x = \frac{1}{2}\left(\frac{1}{5} + \frac{2}{3}\right) = \frac{13}{30}.$$
Once one door is opened, the next two rationals will be $\frac{13}{30}$ and one of the previous rationals, so the next problem asks for a rational number between those two.
Repeatedly computing these arithmetic means not only teaches the pupils about their importance, but also teaches them how to add rationals (e.g. computing $\frac{1}{5} + \frac{2}{3}$), which along the way requires 4 multiplications and 1 addition of integers. As learning tools go, this can be a very good one for learning addition and multiplication.
As a future mathematician, and a lazy person, I somehow found out that if I just add the numerators and denominators, I get a rational somewhere in the middle, for example:
$$\frac{1}{5} < \frac{1+2}{5+3} = \frac{3}{8} < \frac{2}{3}.$$
This meant not only that I had to compute just 2 additions, but also, as can be seen in the examples above, that both the numerator and the denominator, 3 and 8, are much smaller than their counterparts 13 and 30 in the arithmetic mean. This means that the next step will be even easier, since I need to deal with smaller numbers.
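To make the comparison concrete, here is a small Python sketch. The function names `mediant` and `arithmetic_mean` are my own, and fractions are kept as `(numerator, denominator)` pairs because the lazy sum depends on how the fraction is written. It is only an illustration with the pair $\frac{1}{5}, \frac{2}{3}$, which yields the numbers 3, 8 and 13, 30 mentioned above.

```python
from fractions import Fraction

def mediant(x, y):
    """Lazy 'sum': add the numerators and add the denominators."""
    (a, b), (c, d) = x, y
    return (a + c, b + d)

def arithmetic_mean(x, y):
    """The usual midpoint, returned as a reduced (numerator, denominator) pair."""
    (a, b), (c, d) = x, y
    m = (Fraction(a, b) + Fraction(c, d)) / 2
    return (m.numerator, m.denominator)

low, high = (1, 5), (2, 3)
print(mediant(low, high))          # (3, 8)
print(arithmetic_mean(low, high))  # (13, 30)
```

Both results lie between the two original rationals, but the lazy one is cheaper to compute and has smaller entries.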
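Formally speaking, years after finishing elementary school and a couple of degrees in mathematics, the claim above is that given two rationals $\frac{a}{b} < \frac{c}{d}$ with positive denominators we have that
- $\frac{a}{b} < \frac{a+c}{b+d} < \frac{c}{d} \qquad (1)$, and
- The numerator and denominator of $\frac{a+c}{b+d}$ are “small”.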
While the second claim is more intuition than an actual formal claim, the first one can be easily checked to be true. This is a very simple exercise: since all the denominators are positive, we have that
$$\frac{a}{b} < \frac{a+c}{b+d} \iff a(b+d) < b(a+c) \iff ad < bc \iff \frac{a}{b} < \frac{c}{d},$$
proving the left inequality in (1) above, and a similar argument proves the right inequality. In other words, the weird sum $\frac{a+c}{b+d}$ is some kind of an average of $\frac{a}{b}$ and $\frac{c}{d}$. But easy as the proof may be, it doesn’t give us any intuition about why it is true, why we should care about this average, or why it is interesting.
Being in elementary school, I didn’t even know that I needed to ask these questions, and I mostly forgot about this operation until my postdoc.
A rational postdoc conversation
During my postdoc, one of the problems that came up in my research was about the distribution of rationals in all sorts of places. For example, the most “standard” way to think about how the rationals are distributed inside the real line is to fix the denominator $n$ and then look at all the rationals $\frac{m}{n}$ with that denominator. For example, on the unit segment $[0,1]$ these are the points $\frac{0}{n}, \frac{1}{n}, \ldots, \frac{n}{n}$.
After I managed to prove some interesting claims about the distribution of the rationals above, I was asked whether I could do the same for other distributions as well. Of course, this depends very much on the distribution. Then it was suggested that I try a special way of ordering the rationals, which, to my surprise, meant that I would need to use this weird operation from elementary school.
The method was as follows. Start with the rationals zero and one, but write them as $\frac{0}{1}$ and $\frac{1}{1}$. Then add the numerators and denominators to get the three rationals $\frac{0}{1}, \frac{1}{2}, \frac{1}{1}$. In the next step, create new rationals by adding the numerators and denominators of each consecutive pair to get $\frac{0}{1}, \frac{1}{3}, \frac{1}{2}, \frac{2}{3}, \frac{1}{1}$, and we can repeat this step again and again, infinitely many times.
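The repeated step is easy to sketch in Python. The names `next_level` and `mediant_levels` are hypothetical helpers of mine, and fractions are stored as `(numerator, denominator)` pairs so that nothing gets reduced behind our back.

```python
def next_level(level):
    """Insert the mediant between every pair of consecutive fractions."""
    result = [level[0]]
    for (a, b), (c, d) in zip(level, level[1:]):
        result.append((a + c, b + d))  # the new mediant
        result.append((c, d))          # keep the old right neighbour
    return result

def mediant_levels(steps):
    """Return the lists obtained after 0, 1, ..., steps rounds of the process."""
    level = [(0, 1), (1, 1)]
    levels = [level]
    for _ in range(steps):
        level = next_level(level)
        levels.append(level)
    return levels

for level in mediant_levels(3):
    print("  ".join(f"{a}/{b}" for a, b in level))
```

The last printed line is `0/1  1/4  1/3  2/5  1/2  3/5  2/3  3/4  1/1`, one step beyond the lists written above.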
Once I heard about it in the postdoc, I learned that this weird summation is called the mediant. In this process, each mediant $\frac{a+c}{b+d}$ is written between its “summands” $\frac{a}{b}$ and $\frac{c}{d}$. Since we already showed that $\frac{a}{b} < \frac{a+c}{b+d} < \frac{c}{d}$, the numbers at each step are also ordered by their size, and in particular all the numbers that we can create like this are different. This in itself is already interesting, even though easy to prove, but then I was told that this process is much more interesting:
Theorem: Every rational number in $[0,1]$ appears in this repeated mediant process.
This is already a much more powerful theorem than just proving that the mediant computes some average of two rational numbers.
Forgetting for a moment my elementary school days, I had basically never heard about these mediants, let alone this theorem. However, the moment I saw it, I knew that I could prove it. More than that, it felt like I knew everything about this operation and the theorem, except that it existed.
But this is not enough – the 10 year old in me could not understand the university level mathematics, so I needed to find a way to prove it in the most elementary way that I can find, and this is exactly what I am going to do in this post, and in the later posts I will show how this same problem leads to more and more mathematics.
Start from the definition
One of the first things that you learn to ask in mathematics is whether a new definition is “well defined”. For example, in the standard rational addition we have $\frac{a}{b} + \frac{c}{d} = \frac{ad+bc}{bd}$. The rational $\frac{a}{b}$ has many presentations, e.g. $\frac{a}{b} = \frac{2a}{2b}$, and at first look you might wonder whether $\frac{2a}{2b} + \frac{c}{d}$ also equals $\frac{ad+bc}{bd}$. In other words, does the addition operation depend on the rational numbers themselves, or on their presentations? If it is defined not by the values of the numbers but by their presentations, then by definition it is not well defined.
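While the standard rational addition is well defined (changing the presentation doesn’t change the result), the mediant is not well defined. If we write
$$\frac{a}{b} \oplus \frac{c}{d} = \frac{a+c}{b+d},$$
then for example we have that $\frac{1}{2} = \frac{2}{4}$ and $\frac{1}{2} \oplus \frac{1}{3} = \frac{2}{5}$, but $\frac{2}{4} \oplus \frac{1}{3} = \frac{3}{7} \neq \frac{2}{5}$.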
Already we have reached our first roadblock: our operation is not well defined. However, we already know that it does have a special property, namely that it computes some sort of average, so we need to somehow make it well defined, and then study it. Making an operation into a well defined one is usually quite simple. Instead of considering the objects themselves, consider their presentations as objects. In our case, instead of the rational number $\frac{a}{b}$ we think of the integer vector $(a,b)$, and the rational number is the “projection” of this vector: $(a,b) \mapsto \frac{a}{b}$.
Now, our operation in the world of integer vectors is simply the vector addition:
$$(a,b) + (c,d) = (a+c, b+d),$$
and taking the projection of both sides gives us the original (not well defined) “operation”. So while in the 1-dimensional rational context the mediant is not well defined, we see that it is actually a shadow of a well defined 2-dimensional integer operation. In general, 2-dimensional objects might be harder to deal with than 1-dimensional objects, but in this case we also moved from rationals to integers, and even better, we have a nice visualization for this whole process.
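A few lines of Python make the distinction tangible. This is a sketch with hypothetical helper names (`add_vectors`, `project`): adding integer vectors is perfectly well defined, while its shadow on the rationals depends on which presentation we started from.

```python
from fractions import Fraction

def add_vectors(u, v):
    """Well defined operation on integer vectors."""
    return (u[0] + v[0], u[1] + v[1])

def project(v):
    """Send the vector (a, b) to the rational a/b (assumes b != 0)."""
    a, b = v
    return Fraction(a, b)

half, also_half, third = (1, 2), (2, 4), (1, 3)
print(project(half) == project(also_half))     # True: the same rational
print(project(add_vectors(half, third)))       # 2/5
print(project(add_vectors(also_half, third)))  # 3/7
```

Two different vectors over the same rational give two different "mediants", which is exactly the failure of being well defined.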
The geometric point of view
First, for the projection itself, the integer vectors mapped to the rational $\frac{a}{b}$ are exactly the integer vectors (other than the origin) on the line $bx = ay$ through the origin and $(a,b)$. Moreover, if we intersect this line with the horizontal line $y = 1$, then the intersection point is exactly $\left(\frac{a}{b}, 1\right)$. So in a sense, the rational $\frac{a}{b}$ is the “shadow” that the line $bx = ay$ drops on the line $y = 1$.
Now that we have the projection, we can also look at the addition: the sum $(a,b) + (c,d)$ is the fourth vertex of the parallelogram whose other vertices are the origin, $(a,b)$ and $(c,d)$.
With this picture in mind, showing that $\frac{a}{b} < \frac{a+c}{b+d} < \frac{c}{d}$ is straightforward. We start with the segments from the points $(a,b)$ and $(c,d)$ to the origin (the green segments), and complete them to a parallelogram. The diagonal of this parallelogram from the origin to $(a+c, b+d)$ (in red) is always inside it, and in particular between the two green lines, so when we project it onto $y = 1$, its projection is between the projections of these lines. In other words, we get the inequality from above.
We have already managed to convert one elementary computation into a visual and intuitive proof. Next, we want an even better result, and we shall use this visualization to show that we can get any rational number in $[0,1]$ when starting at $\frac{0}{1}, \frac{1}{1}$ and using the repeated mediant process.
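To get some intuition, let’s start with a simple process of getting from $\frac{0}{1}, \frac{1}{1}$ to a target rational, say $\frac{2}{5}$. We first compute the mediant to get the triple $\frac{0}{1}, \frac{1}{2}, \frac{1}{1}$. We are looking for $\frac{2}{5}$, and we need to decide whether to look inside $\left[\frac{0}{1}, \frac{1}{2}\right]$ or $\left[\frac{1}{2}, \frac{1}{1}\right]$. For our simple example we can just check which segment contains this point, but we are looking for intuition for the general case, so let us view our geometric interpretation.
The triple $(0,1), (1,2), (1,1)$ defines a parallelogram for us, and since the line corresponding to $\frac{2}{5}$ passes between the lines through $(0,1)$ and $(1,2)$, we know that we need to go down to the pair $\frac{0}{1}, \frac{1}{2}$.
Repeating this process we get the triple $\frac{0}{1}, \frac{1}{3}, \frac{1}{2}$, go down to the pair $\frac{1}{3}, \frac{1}{2}$ and finish with the mediant of these two rationals, which is $\frac{2}{5}$. Visually, we moved through the parallelograms spanned by the pairs $(0,1), (1,1)$, then $(0,1), (1,2)$, and then $(1,3), (1,2)$.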
Here we saw that we can get to our target, and in general we can try to use this process for any rational $\frac{p}{q} \in [0,1]$. The question is whether this process always ends, or whether there are cases where it does not. Intuitively, we expect the edges of the parallelograms to become larger and larger, and since the integer point $(p,q)$ is always between the lines they define, if the process never stops, then eventually we expect to see this point inside the parallelogram. However, as can be seen in the images above, the parallelograms never contain integer points other than their vertices, and this will lead to a contradiction.
Before we give a formal proof for this result, let us define the repeated mediant process properly.
The mediant process
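We start with two vectors $u_0 = (0,1)$ and $v_0 = (1,1)$ (corresponding to the rationals $\frac{0}{1}$ and $\frac{1}{1}$), and we want to reach some vector $w = (p,q)$ in between them (namely $0 < \frac{p}{q} < 1$). We also assume that $p, q$ are coprime (we can always consider our rational in its reduced form).
At each step $i$, if $u_i + v_i = w$, then we are done. Otherwise we look at the three vectors $u_i,\ u_i + v_i,\ v_i$. If the line defined by $w$ is between the lines corresponding to $u_i$ and $u_i + v_i$, then in the next step we have $u_{i+1} = u_i$ and $v_{i+1} = u_i + v_i$. Otherwise, $w$ is between $u_i + v_i$ and $v_i$ (linewise), and we continue to $u_{i+1} = u_i + v_i$ and $v_{i+1} = v_i$.
The process ends if at some point we reach $u_i + v_i = w$, and otherwise it is infinite.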
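The process just described can be sketched in Python as follows. The function name `mediant_process` and the representation of vectors as tuples are my own choices; the loop relies on the theorem proved below for termination, so the code checks the assumptions $0 < p < q$ and $\gcd(p,q) = 1$ up front.

```python
from math import gcd

def mediant_process(p, q):
    """Return the pairs (u_i, v_i) visited until u_i + v_i == (p, q).

    Assumes 0 < p < q with p, q coprime, as in the text.
    """
    if not (0 < p < q and gcd(p, q) == 1):
        raise ValueError("expected coprime integers with 0 < p < q")
    u, v = (0, 1), (1, 1)
    pairs = []
    while True:
        pairs.append((u, v))
        m = (u[0] + v[0], u[1] + v[1])  # the mediant vector u + v
        if m == (p, q):
            return pairs
        # p/q < m[0]/m[1] is the same as p*m[1] < q*m[0] (positive denominators)
        if p * m[1] < q * m[0]:
            v = m  # the target lies between u and u + v
        else:
            u = m  # the target lies between u + v and v

for u, v in mediant_process(2, 5):
    print(u, v)
```

For the target $(2,5)$ the visited pairs are $(0,1),(1,1)$, then $(0,1),(1,2)$, then $(1,3),(1,2)$, whose sum is $(2,5)$. Comparing by cross multiplication keeps everything in integers, in the spirit of working with vectors rather than rationals.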
In the language of the process above, the parallelograms that we see are those with vertices at $0,\ u_i,\ v_i,\ u_i + v_i$, where $u_i, v_i$ are the two vectors at step $i$. There is something very special when we move from the $i$-th parallelogram to the next. We can actually think of it as cutting the current parallelogram into two triangles along the diagonal from $0$ to $u_i + v_i$, and then moving only one triangle by translating it in an integer direction (by $u_i$ or by $v_i$).
The first immediate result from this point of view is that all of our parallelograms have the same area, no matter how many steps we take. Indeed, when we cut a parallelogram into two triangles and then glue them together, we do not change the total area. The second result is that all of our parallelograms don’t contain integer points in their interior (only their vertices are integer points). More generally, if $S \subseteq \mathbb{R}^2$ is any set, and we look at its translate $S + (m,n)$ for some integers $m, n$, then $S + (m,n)$ contains the integer point $(x,y)$ if and only if $S$ contains the integer point $(x-m, y-n)$. It follows that $S$ and $S + (m,n)$ contain the same number of integer points. In the case of our parallelograms, our initial parallelogram doesn’t contain any integer points other than its vertices, so the same is true for all of the rest of the parallelograms as well.
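Both claims can be sanity-checked numerically. The sketch below is not part of the argument: it uses the standard formula $|ad - bc|$ for the area of the parallelogram spanned by $(a,b)$ and $(c,d)$, which the post itself does not rely on, and a brute-force count of integer points. The helper names and the string of update choices passed to `walk` are arbitrary illustrations.

```python
def area(u, v):
    """Area of the parallelogram spanned by u and v (cross product formula)."""
    (a, b), (c, d) = u, v
    return abs(a * d - b * c)

def lattice_points(u, v):
    """Count integer points in the closed parallelogram 0, u, v, u + v.

    Assumes nonnegative entries and nonzero area. A point equals
    alpha*u + beta*v, and it lies in the parallelogram iff 0 <= alpha, beta <= 1.
    """
    (a, b), (c, d) = u, v
    det = a * d - b * c
    sign = 1 if det > 0 else -1
    count = 0
    for x in range(a + c + 1):
        for y in range(b + d + 1):
            alpha = sign * (x * d - y * c)  # equals alpha * |det|
            beta = sign * (a * y - b * x)   # equals beta * |det|
            if 0 <= alpha <= abs(det) and 0 <= beta <= abs(det):
                count += 1
    return count

def walk(choices):
    """Follow the process, replacing u ('u') or v ('v') by u + v at each step."""
    u, v = (0, 1), (1, 1)
    pairs = [(u, v)]
    for choice in choices:
        m = (u[0] + v[0], u[1] + v[1])
        if choice == "u":
            u = m
        else:
            v = m
        pairs.append((u, v))
    return pairs

for u, v in walk("vuuvu"):
    print(u, v, area(u, v), lattice_points(u, v))
```

Every line should report area 1 and exactly 4 integer points, namely the four vertices.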
We now have all the ingredients to show that we can reach any rational number by taking mediant steps.
Theorem: For any coprime integers $p, q$ such that $0 < p < q$, the mediant process defined above eventually stops.
Remark: I tried to keep the following proof as elementary as possible. There are simpler, more elegant proofs using a bit of linear algebra which I will show in the next post. For now you are welcome to try and find such a proof.
Proof: Consider the parallelograms defined by $u_i, v_i$. We expect these vectors to grow as $i$ increases. In particular, you should prove that if $\|u_i\|, \|v_i\| \geq R$, then the sector of the disc of radius $R$ around the origin between the lines defined by the vectors $u_i, v_i$ is completely inside the parallelogram. So if the process never stops, and these vectors increase enough in size, then eventually we could take $R > \|w\|$, so that the integer point $w = (p,q)$ will be contained inside the parallelogram, which is a contradiction (hence, the process must stop at some point).
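A simple way to track the size (more or less) of these vectors is by using the simple map $s(x,y) = x + y$. For $x, y \geq 0$ this map satisfies:
$$\frac{s(x,y)}{\sqrt{2}} \leq \|(x,y)\| \leq s(x,y).$$
Our initial vectors $u_0 = (0,1)$ and $v_0 = (1,1)$ have nonnegative entries. Since in our process we only add vectors, we always remain with nonnegative vectors, so that the inequality above holds.
Additionally, we have the nice property that $s(u+v) = s(u) + s(v)$, so if $s(u), s(v) \geq 1$, then $s(u+v) \geq \max(s(u), s(v)) + 1$. Applying this result to our process, it is easy to check that $s(u_i + v_i) \geq i$ for each $i$. Finally, since at each step either $u_{i+1} = u_i + v_i$ or $v_{i+1} = u_i + v_i$, at least one of them has sum of coordinates equal to $s(u_i + v_i) \geq i$.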
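As a quick numerical sanity check of these two properties, here is a sketch in which `s` stands for the coordinate sum $x + y$ and `u`, `v` for the two vectors of the process; these names, like the helper functions, are my own labels. The norm bounds are compared in squared form so that only integers are involved.

```python
def s(v):
    """Sum of coordinates of the vector v."""
    return v[0] + v[1]

def norm_bounds_hold(v):
    """Check s(v)/sqrt(2) <= ||v|| <= s(v) by comparing squares (v nonnegative)."""
    x, y = v
    norm_squared = x * x + y * y
    return s(v) ** 2 <= 2 * norm_squared and norm_squared <= s(v) ** 2

def coordinate_sums(choices):
    """Return s(u_i + v_i) for i = 0, 1, ... along the given update choices."""
    u, v = (0, 1), (1, 1)
    sums = [s(u) + s(v)]
    for choice in choices:
        m = (u[0] + v[0], u[1] + v[1])
        if choice == "u":
            u = m
        else:
            v = m
        sums.append(s(u) + s(v))
    return sums

print(coordinate_sums("vvvv"))   # [3, 4, 5, 6, 7]
print(coordinate_sums("uvuv"))
print(all(norm_bounds_hold((x, y)) for x in range(20) for y in range(20)))
```

The first line shows the slowest possible growth, where the sum increases by exactly 1 at each step; any other sequence of choices grows at least as fast.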
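Suppose that we updated both $u_i$ and $v_i$ at some steps after a given index $N$. Writing $s$ for the sum of coordinates, at the point of update we know that $s(u_{i+1}) = s(u_i + v_i) \geq N$, and since $s(u_i)$ is a non-decreasing sequence, this inequality is true for all $i$ large enough. It follows that
$$\|u_i\| \geq \frac{s(u_i)}{\sqrt{2}} \geq \frac{N}{\sqrt{2}}$$
for all $i$ large enough, and the same holds for $v_i$. If both vectors were updated infinitely many times, we could choose $N$ with $\frac{N}{\sqrt{2}} > \|w\|$, and as we wrote in the beginning of the proof, this would lead to a contradiction where we have an integer point inside our parallelogram. We conclude that at least one of them is not updated at all after a certain point.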
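Without loss of generality, assume that it is $v_i$ which stops being updated: there is some index $i_0$ such that from this point on we don’t update $v_i$. This means that $v_i = v_{i_0}$ and $u_i = u_{i_0} + (i - i_0) v_{i_0}$ for all $i \geq i_0$. Letting $u_{i_0} = (a,b)$, $v_{i_0} = (c,d)$, $w = (p,q)$ and $k = i - i_0$, by the definition of our process, we must have
$$\frac{a + kc}{b + kd} < \frac{p}{q} < \frac{c}{d} \quad \text{for all } k \geq 0.$$
However, we can write $\frac{a+kc}{b+kd} = \frac{a/k + c}{b/k + d}$, which is almost $\frac{c}{d}$ (which is larger than $\frac{p}{q}$) when $k$ goes to infinity, and we get our desired contradiction. Visually, the lines through the vectors $u_i$ get closer and closer to the fixed line through $v_{i_0}$, so eventually they must pass the line through $w$.
We see that even if the $v_i$ are not updated after some point, we can’t continue the process forever. In other words, the process must stop at some point, which is exactly what we wanted to prove.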
Conclusion for my 10-year-old past self
While the last proof became a little technical, the geometry behind it was quite simple. Hopefully it is simple enough that my past self could understand most of it. However, this is only the beginning of the story. If you know a little bit of linear algebra, there is a good chance that you caught some glimpses of it along the way. In the next post, I will meet my past self from my Bachelor’s degree in mathematics and see how this same problem and its results are connected to all sorts of mathematical subjects, and how they tie those subjects nicely together.