Summing the prime reciprocals

In the previous post we saw why primes are so important, and used Euclid’s proof to show that there are infinitely many primes. We further conjectured that not only there are infinitely many primes, they are also “nicely” distributed among the integers.

More precisely, if we fix some integer $N\geq 2$ , then for every $0\leq d <N$ we can ask how many primes $p$ are there such that $p\equiv _N d$ . If $N$ and $d$ are not coprime, then we showed that there is either zero or one such primes, so the infinitely many primes that we already know exist are equivalent to $d$ mod $N$ where the $d$ are coprime to $N$ . The conjecture was that not only the union over these $d$ is infinite, but for each $d$ the number of primes is infinite (which is Dirichlet’s theorem for primes in arithmetic progression), and more over, in a sense we have “the same” infinite in each of those $d$ . While we could generalize a bit Euclid’s proof to some other cases, it becomes really complicated quite fast.

Trying to look for other methods, we mix our prime numbers with a bit of calculus, and turn to Euler for some help.

Sum of reciprocals

Euler’s idea was instead of actually counting the primes, we should try to compute the sum

${\displaystyle \sum _{p\in P} \frac{1}{p}} ,$

and see if it is infinite or not.

The multiplicative inverse function, or its other name – the reciprocal, has all sorts of nice properties which come in handy in many cases, and we will see some of them here.

Of course, if there are only finitely many primes, then the sum above will be finite. However, unlike the simple counting of primes, there are infinite sets which we can sum over but still get a finite sum. For example, we know that $\sum_n \frac{1}{n^2}$ is finite (and we can actually compute it to get $\frac{\pi^2}{6}$ ). So while an infinite set $P$ doesn’t mean that the sum is infinite, if the sum is infinite, then the set $P$ must be infinite.

The question now is how do we even start to approach this sort of summation?

Well, the importance of the primes come from their role inside of the integers, so a good place to start looking at is actually the infinite sum

${\displaystyle \sum _{n\in \mathbb{N}} \frac{1}{n} }.$

This sum is larger than the previous one over the primes, so if we ever hope to show that the first one is infinite, we must also be able to show that the second one is infinite as well. This second sum should be easier to work with, since the set of integers has much more structure than just the primes. Indeed, it is a ring – we have both addition and multiplication and all sorts of nice properties combining those two.

Hopefully, after understanding the sum over the integers, we could then find some trick to reverse the direction and show that

$\sum _{n\in \mathbb{N}} \frac{1}{n} = \infty \Rightarrow \sum _{p\in P} \frac{1}{p} = \infty$ ,

and to conclude that there are infinitely many prime.

We already have the first step in our road map, connecting the question over the primes, to a question over the integers.

At this point you might think that even the summation over the integers is difficult (in particular with a quick look into our future where we only want to look on primes satisfying some modulo condition). So can we have an even simpler summation than the one over the integers?

Yes! We can “sum” over all the real numbers:

$\int_1^\infty \frac{1}{x} \mathrm{dx}$ .

As before, if we can show that this integral is infinite, and that this implies that the sum over the integers is infinite, then we are in a good place. More over, while the integers had addition and multiplication which were missing in the primes, the real numbers also have a metric (distance between numbers) which leads to everything we know from calculus, and in particular we have many tools that we can use to solve such integrals.

Thus, our 4 step plan is:

Show that $\int_1^\infty \frac{1}{x} \mathrm{dx} = \infty$ .
Show that $\int_1^\infty \frac{1}{x} \mathrm{dx} = \infty \Rightarrow \sum_1^\infty \frac{1}{n} = \infty$ .
Show that $\sum_1^\infty \frac{1}{n} = \infty = \sum_{p\in P} \frac{1}{p} = \infty$ .
Conclude that there are finitely many primes.

From the real numbers to integers

When trying to compute integrals, one of the main tools is using the antiderivative (indefinite integral). In our case we know that $\ln'(x) = \frac{1}{x}$ , so the fundamental theorem of calculus gives us that

$\int _1^T \frac{1}{x}\mathrm{dx} = ln(T) - ln(1) \overset{T\to\infty}{\longrightarrow} \infty$ ,

thus proving the first step. This is also a good place to recall that $\frac{1}{x}$ is in a sense on the border between convergence and divergence to infinity. Indeed, using the antiderivitive as above, we have that

$\int_{1}^{\infty}\frac{1}{x^{s}}\mathrm{dx}=\begin{cases} finite & s>1\\ \infty & s\leq1 \end{cases}$ .

This is one of the reason why this is a good function to use. Saying that our original sum over the primes diverge to infinity means that it is quite a large set inside the integers. If it was “a bit” smaller, then it would converge.

The second step, where we try to move from the real numbers to integers, involves a result that is usually seen when learning about integrals. It uses the fact that the integers are “distibuted nicely” inside the real line – the distance between each integer to the next is exactly 1.

Visually, the integral, which is the area below the graph of the function, and the summation, which can be seen as the sum of areas of rectangles, can be viewed as follows:

Immediately we see the the areas of the rectangles without the first rectangle is smaller than the area below the graph of the function. However, shifting our function by 1 to the left, we also get the following:

Formally, what we get is that:

$\sum_2^\infty \frac{1}{n} \leq \int_1^\infty \frac{1}{x}\mathrm{dx} \leq \sum_1^\infty \frac{1}{n}$

so that

${\displaystyle 0 \leq \sum_1^\infty \frac{1}{n} - \int_1^\infty \frac{1}{x}\mathrm{dx} \leq 1}.$

Both the sum and integral are over positive functions so both either converge to a finite number or infinity, but because their difference is finite, then one diverges to infinity if and only if the second diverges.

Just to connect it to what you have probably seen in your calculus class, the generalization of this result is:

CLAIM: Let $f: \mathbb{R} \to \mathbb{R}$ be a continuous function, which is positive and decreasing. Then $\sum_1^\infty f(n)$ diverge to infinity if and only if $\int_1^\infty f(x)\mathrm{dx}$ diverge to infinity.

From this claim for $f(x)=\frac{1}{x}$ and result from step 1, we conclude that $\sum_1^\infty \frac{1}{n} = \infty$ .

From the integers to the prime numbers

The next step is a bit more complicated, though still doesn’t need anything more than your second calculus class about integrals and infinite sums. When trying to move from the real numbers to the integers, we used the way the integers sit inside the real line. Similarly, when moving to the primes from the integers, we need to use the properties of the prime numbers as a subset of the integers. The main property here, as we have seen in the previous post, is that every integer can be written as a product of primes (and in a sense in a unique way).

If we let $P=\{p_1, p_2, p_3 , ... \}$ be an enumeration of the positive primes, which we can assume is increasing, then recall that each positive integer $n$ can be written as $\prod p_i^{k_i}$ where the $k_i\geq 0$ are integers, almost all of which are zero. In other words, the number $n$ corresponds to the sequence $(k_1, k_2,...)$ . We can now write

$\sum_{n=1}^\infty \frac{1}{n} = \sum _{(k_1,k_2,...)} \frac{1}{\prod p_i^{k_i}}$ .

In this presentation, each integer is a combination of many primes, while in our end result, namely the sum $\sum \frac{1}{p}$ each prime stands by itself. To move one step closer to this expression we note that

$\sum _{(k_1,k_2,...)} \frac{1}{\prod p_i^{k_i}} = \prod_{p} \sum_{k=0}^\infty \frac{1}{p^k}$ .

At this point, there should be all sort of red lights and sirens going off all around you. The expression on the right is an infinite product of infinite sums, which we claim is equal to another infinite sum! All of this requires an explanation and is not trivial. However, before you ever start to look for a formal proof, you should always try to find some basic intuition why the result should be true.

If we ignore, for just a quick minute, all the convergence problems in the world, and just open the brackets on the expression on the right, we see that it is going to be the sum of terms of the form

$\frac{1}{\prod p_i^{k_i}}$

where the $k_i$ are nonnegative integers. The expressions with only finitely many nonzero $k_i$ are exactly those appearing on the left. Those that have infinitely many nonzero $k_i$ , if the world is not broken, should be zero. With this intuition, the right hand side must be equal to the left hand side. Now we are only left with finding the right definitions to make our intuition true.

When talking about infinite products, the definition is the same as for infinite sum, namely

$\prod_1^\infty a_i := \lim_{N \to \infty} \prod_1^N a_i$ .

In our case, the product is over the primes and $a_p = \sum_0^\infty \frac{1}{p^k}$ is a convergent geometric series, where the limit is $\frac{1}{1-1/p} > 1$ , so at least each factor in our infinite product is well defined.

If we only multiply finitely many of our factors, then we can use the steps from our intuition above and open the brackets to get that

$\sum _{(k_1,k_2,...)} \frac{1}{\prod p_i^{k_i}} \geq \prod_{i=1}^N \sum_{k=0}^\infty \frac{1}{p_i^k}$ ,

and by taking the limit of $N\to\infty$ we get that

$\sum _{(k_1,k_2,...)} \frac{1}{\prod p_i^{k_i}} \geq \prod_{i=1}^\infty \sum_{k=0}^\infty \frac{1}{p_i^k}$ .

This is not too surprising, since in the back of our mind we already know that the expression on the left is inifity, but it is a good first exercise for infinite products.

What we really care about is the opposite direction. Using the definition of infinite sums as the limit of finite sums, let us fix $N$ and consider the sum $\sum_1^N\frac{1}{n}$ . We can now choose $M = M_N$ large enough, so that all the primes appearing in $1,2,...,N$ are in $p_1, p_2, ..., p_M$ . With this choice we get that

$\sum_1^N \frac{1}{n} \leq \prod_1^M (\sum_k \frac{1}{p_i^k})$ .

Our factors in the infinite product are larger than 1, so we will only increase the product when changing $M$ to infinity, namely

$\sum_1^N \frac{1}{n} \leq \prod_1^\infty (\sum_k \frac{1}{p_i^k})$ .

Finally, now that we have this fixed upper bound, we can take $N\to \infty$ to get that

$\infty = \sum_1^\infty \frac{1}{n} \leq \prod_1^\infty (\sum_k \frac{1}{p_i^k})=\prod_{p\in P} \frac{1}{1-1/p}$ ,

which leaves us with an expression containing only primes.

Finishing the proof

While we now have only primes in our final expression, it is inside an infinite product, instead of an infinite sum. However, we have the usual trick of going from product to sum which is using the logarithm. Applying it to both side of the equation gives us:

$\infty = \ln(\infty) = \ln(\prod_{p\in P} \frac{1}{1-1/p})=\sum_p \ln(\frac{p}{p-1})=\sum_p \ln(1+\frac{1}{p-1})$ .

The term $\ln(1+\frac{1}{p-1})$ and $\frac{1}{p}$ are almost the same, in the sense that $[\ln(1+\frac{1}{p-1})] / [\frac{1}{p}] \to 1$ as $p\to \infty$ . In particular, this means that the last infinite sum diverges if and only if the series $\sum_{p\in P}\frac{1}{p}$ diverges, which is exactly what we wanted to prove.

So what makes this proof better?

We now saw Euler’s proof for the fact that there are infinitely many primes, and in the previous post we saw Euclid’s proof. Without a doubt , Euler’s proof is more complicated, so we are left with the question what is so important about it?

First, I would argue that the ideas to connect results about the prime numbers to result about the integers and another connection to results about the real line, is already in itself quite interesting and important, and if you spend enough time around mathematicians you will see it in other places as well (in particular the second step).

But even more so, when we look for all the primes which are $d$ mod $N$ , and ask if it is an infinite set, we can try to generalize the Euler’s proof, which automatically leads us to the sum of $\frac{1}{n}$ for all integers which are $d$ mod $N$ . This already is a very easy sum to work with, and can be easily seen to be infinity, using what we have already seen in this post. So in a sense, we already have half of our proof right here. For the second half, we have one more tool that we need (and one more advance result that we will ignore…), and we will see it in the next post.