Equidistribution of divergent orbits

Motivation from continued fraction expansion:

Let us recall first the definition of continued fraction expansion (c.f.e).

Given a (possibly finite) sequence a_0 \in \mathbb{Z}, a_i\in \mathbb{N}, i\geq 1, we define

[a_0;a_1,a_2,...] := a_0+\cfrac{1}{a_1+\cfrac{1}{a_2+\cdots}}

to be the limit of the convergents \frac{p_n}{q_n}=[a_0;a_1,a_2,...,a_n]\in \mathbb{Q} (this limit definition is analogous to the presentation of a real number by its decimal expansion).

It is not hard to show that for any sequence (a_i) the induced sequence of convergents converges. Moreover, every irrational number has a unique such presentation, while every rational number has exactly two (e.g. \frac{1}{2}=[0;2]=[0;1,1]). Given any x\in [0,1] and q\in \mathbb{N}, it is easy to find an integer 0\leq p\leq q such that |x-\frac{p}{q}|\leq \frac{1}{q}; the importance of the convergents in the c.f.e is that they satisfy the much stronger inequality |x-\frac{p_n}{q_n}|<\frac{1}{q_n^2}. In other words, the c.f.e provides us with very good approximations to x.
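The convergents can be computed with the standard recurrence p_n = a_n p_{n-1} + p_{n-2}, q_n = a_n q_{n-1} + q_{n-2}. The following is a minimal sketch of that recurrence; the function name `convergents` is chosen here for illustration, and the test number \sqrt{2}=[1;2,2,2,...] is an example not taken from the text.

```python
from fractions import Fraction
import math

def convergents(coeffs):
    """Return the convergents p_n/q_n of [a_0; a_1, ..., a_n] as Fractions."""
    p_prev, q_prev = 1, 0          # (p_{-1}, q_{-1})
    p, q = coeffs[0], 1            # (p_0, q_0)
    result = [Fraction(p, q)]
    for a in coeffs[1:]:
        # p_n = a_n * p_{n-1} + p_{n-2}, and the same recurrence for q_n
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        result.append(Fraction(p, q))
    return result

# Illustration with sqrt(2) = [1; 2, 2, 2, ...]: every convergent
# satisfies |x - p_n/q_n| < 1/q_n^2.
x = math.sqrt(2)
for c in convergents([1, 2, 2, 2, 2, 2]):
    assert abs(x - c) < Fraction(1, c.denominator ** 2)
```

Exact rational arithmetic (`Fraction`) is used so that the denominators q_n are available without rounding.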

The standard way to study this sort of presentation is via the left shift operator on the sequence (a_i): after we have found the n-th convergent, we shift left n times in order to read off the next coefficient a_{n+1} needed for the next convergent.

Definition (Gauss map): Given x=[0;a_1 ,a_2 ,...]\in (0,1] we define the Gauss map T:(0,1]\to[0,1] to be

T(x)=T([0;a_1 ,a_2 ,a_3,...]):=[0;a_2 ,a_3,...]=\frac{1}{x}-\left\lfloor \frac{1}{x}\right\rfloor.

Note that for x=[0;a_1 ,a_2,...] \in (0,1] we have that a_1= \left\lfloor \frac{1}{x}\right\rfloor, hence the information about the n-th convergent is contained in the partial forward orbit \{x,T(x),T^2(x),...,T^{n-1}(x)\}. It is well known that this orbit equidistributes for almost every x\in (0,1], which is a special instance of the pointwise ergodic theorem (PET).
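The way the coefficients are read off from the forward orbit can be sketched as follows. This is only an illustration on exact rational inputs; the names `gauss_map` and `digits_from_orbit` are hypothetical, and the loop stops if the orbit reaches 0, where T is not defined.

```python
from fractions import Fraction
from math import floor

def gauss_map(x):
    """T(x) = 1/x - floor(1/x) for x in (0, 1]."""
    inv = 1 / x
    return inv - floor(inv)

def digits_from_orbit(x, n):
    """Read off a_1, ..., a_n as floor(1/T^i(x)), i = 0, ..., n-1.

    Stops early if the orbit reaches 0 (which happens exactly for rational x).
    """
    digits = []
    for _ in range(n):
        if x == 0:
            break
        digits.append(floor(1 / x))
        x = gauss_map(x)
    return digits

# The orbit of 3/5 is 3/5 -> 2/3 -> 1/2 -> 0, giving the coefficients 1, 1, 2.
print(digits_from_orbit(Fraction(3, 5), 10))
```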

Pointwise Ergodic Theorem (for the Gauss map): For any f\in C_c([0,1]) and for almost every x\in [0,1] we have that

\lim_{N\to \infty}\frac{1}{N}\sum_{i=0}^{N-1} f(T^i(x)) = \frac{1}{\ln(2)}\int_0^1 \frac{f(t)}{1+t} dt,

or equivalently we have the weak-star limit

\lim_{N\to \infty}\frac{1}{N}\sum_{i=0}^{N-1} \delta_{T^i(x)} =\nu_{Gauss}:=\frac{dt}{\ln(2)(1+t)}.
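As an illustration of what this limit encodes, one can (after approximating by continuous functions) take f to be the indicator of the interval (\frac{1}{k+1},\frac{1}{k}], which is exactly the set of points whose first coefficient is a_1=k. The right hand side is then a short computation, written out here as a worked example:

```latex
\nu_{Gauss}\left(\left(\tfrac{1}{k+1},\tfrac{1}{k}\right]\right)
  = \frac{1}{\ln(2)}\int_{1/(k+1)}^{1/k}\frac{dt}{1+t}
  = \frac{1}{\ln(2)}\ln\left(\frac{(k+1)^2}{k(k+2)}\right)
  = \log_2\left(1+\frac{1}{k(k+2)}\right).
```

For instance, for a generic point the coefficient 1 appears with asymptotic frequency \log_2(4/3)\approx 0.415 and the coefficient 2 with frequency \log_2(9/8)\approx 0.170.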

A point x\in [0,1] which satisfies the convergence above is called a generic point (for the Gauss map). Using indicator functions, one can show that a point is generic if and only if the coefficients in its continued fraction expansion obey certain statistics (called the Gauss-Kuzmin statistics); in particular, every natural number must appear in its c.f.e. Thus, it is easy to construct points which are not generic, and in particular we have the following three sets:

  1. The sequence a_i is bounded – corresponds to badly approximable numbers, namely x for which there exists some constant c>0 such that |x-\frac{p}{q}|>\frac{c}{q^2} for any \frac{p}{q}\in \mathbb{Q}.
  2. The sequence a_i is eventually periodic – corresponds to real algebraic numbers of degree 2.
  3. The sequence a_i is finite (in which case we cannot even take N to infinity in the PET) – corresponds to rational numbers.

By the PET, each of these sets has zero Lebesgue (and Gauss) measure. The set of badly approximable numbers is nevertheless very big: it is uncountable, has maximal Hausdorff dimension and is even Schmidt winning (so, for example, the intersection of countably many of its translates is still nonempty and has all the properties above). On the other hand, the sets in (2) and (3) are countable. For these two sets, instead of taking a single element and longer and longer pieces of its forward orbit, we can ask whether a suitable ordering of the elements produces an equidistribution result. In this work we consider the case of the finite c.f.e.

Continued fraction expansion of rational numbers:

Let \frac{p}{q} \in (0,1) be a rational number where 0<p<q are coprime. If \frac{p}{q}=[0;a_1,a_2,...,a_n], then it is easily seen that q=a_1 p +r_1 where \frac{r_1}{p}=[0;a_2,a_3,...,a_n]=T(\frac{p}{q}) and in particular 0\leq r_1<p. Similarly we have that p = a_2 r_1 +r_2 where 0\leq r_2 <r_1 and \frac{r_2}{r_1}=[0;a_3,a_4,...,a_n]. Thus, the coefficients in the c.f.e of a rational number are exactly the quotients appearing in the Euclidean division algorithm. For example, for \frac{3}{5} we have 5=1\cdot 3+2, 3=1\cdot 2+1 and 2=2\cdot 1+0, so that \frac{3}{5}=[0;1,1,2].
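The Euclidean division steps described above translate directly into code. The following is a minimal sketch, with the function name `cfe_rational` chosen here for illustration; it returns the presentation whose last coefficient is at least 2.

```python
from math import gcd

def cfe_rational(p, q):
    """Coefficients [a_1, ..., a_n] of p/q = [0; a_1, ..., a_n], for 0 < p < q coprime."""
    assert 0 < p < q and gcd(p, q) == 1
    coeffs = []
    while p != 0:
        a, r = divmod(q, p)   # q = a * p + r with 0 <= r < p
        coeffs.append(a)
        q, p = p, r           # continue with r/p = T(p/q)
    return coeffs

# The four fractions with denominator 5:
for p in range(1, 5):
    print(p, cfe_rational(p, 5))
```

For q=5 this gives [5], [2,2], [1,1,2] and [1,4], of lengths 1, 2, 3 and 2.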


Thus, trying to adapt the PET to these finite c.f.e we let len(\frac{p}{q}) be the length of the c.f.e of \frac{p}{q} (which is the number of steps in the Euclidean division algorithm) and then set

\nu_{p/q} = \frac{1}{len(\frac{p}{q})} \sum_{i=0}^{len(\frac{p}{q})-1}\delta_{T^i(\frac{p}{q})}.

The first natural question is whether for every choice of p_q coprime to q we have that \nu_{p_q /q} \overset{w^*}{\longrightarrow} \nu_{Gauss} as q\to\infty. This is false, since for example \nu_{1/q} is always a Dirac measure, and these measures cannot converge to \nu_{Gauss}. On the other hand, this is a very specific example, and we might expect that there are very few such bad examples. This leads us to consider averages of the measures \nu_{p/q} over (p,q)=1 (and 1\leq p\leq q) and ask whether

\nu_q := \frac{1}{\varphi(q)} \sum_{(p,q)=1} \nu_{p/q}\overset{w^*}{\longrightarrow} \nu_{Gauss}.

We note that the normalization for each p above is different according to len(\frac{p}{q}). For example, in the q=5 case we normalize by the lengths 1,2,3,2. If we want to use a uniform normalization we can ask instead whether

\tilde{\nu}_q := \frac{1}{\sum_{(p,q)=1} len(\frac{p}{q})} \sum_{(p,q)=1} \sum_{i=0}^{len(\frac{p}{q})-1} \delta_{T^i(\frac{p}{q})} \overset{w^*}{\longrightarrow} \nu_{Gauss}.
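Since T^i(\frac{p}{q}) lies in (\frac{1}{k+1},\frac{1}{k}] exactly when the (i+1)-th coefficient of \frac{p}{q} equals k, the mass that \tilde{\nu}_q gives to this interval is the frequency of k among all coefficients of all \frac{p}{q} with (p,q)=1. The sketch below computes this frequency so that it can be compared with the Gauss measure of the same interval, \log_2(1+\frac{1}{k(k+2)}); the function names are hypothetical and q\geq 2 is assumed.

```python
from collections import Counter
from math import gcd, log2

def cfe_rational(p, q):
    """Coefficients of p/q = [0; a_1, ..., a_n] via Euclidean division."""
    coeffs = []
    while p != 0:
        a, r = divmod(q, p)
        coeffs.append(a)
        q, p = p, r
    return coeffs

def digit_counts(q):
    """Count all coefficients over the c.f.e. of p/q, 1 <= p < q, gcd(p, q) = 1."""
    counts = Counter()
    for p in range(1, q):
        if gcd(p, q) == 1:
            counts.update(cfe_rational(p, q))
    return counts

def tilde_nu_mass(q, k):
    """Mass of (1/(k+1), 1/k] under tilde-nu_q, i.e. the frequency of the digit k."""
    counts = digit_counts(q)
    return counts[k] / sum(counts.values())

def gauss_mass(k):
    """Mass of (1/(k+1), 1/k] under the Gauss measure."""
    return log2(1 + 1 / (k * (k + 2)))

# For q = 5 the coefficients are [5], [2,2], [1,1,2], [1,4]: 8 in total, three of them 1.
print(tilde_nu_mass(5, 1), gauss_mass(1))
```

Running `tilde_nu_mass(q, k)` for growing q and a few small k gives a quick numerical sanity check of the convergence, although of course not a proof.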

In this work we prove the convergence of the last two averages, and show that this implies that \nu_{p_q /q} \overset{w^*}{\longrightarrow} \nu_{Gauss} for almost every choice of p_q. These equidistribution results correspond to similar questions about equidistribution of geodesics in the space of 2-dimensional lattices (and generalize to higher dimensions), which is where we prove them.

Divergent orbits in the space of lattices

All the results appearing here are from [1] (dimension 2) and [2] (general dimension).

Let X_n = \Gamma \backslash G,\; G=SL_n(\mathbb{R}), \Gamma=SL_n(\mathbb{Z}) be the space of unimodular n-dimensional lattices. We will use the following notation

A = \{ a(\bar{t}):=diag(e^{t_1},...,e^{t_n})\in SL_n(\mathbb{R}) \mid \sum t_i = 0 \},

U = \{u_{\bar{x}}:= I + \sum_{i=2}^{n} x_i e_{1,i} \mid \bar{x}\in \mathbb{R}^{n-1} \}

where e_{i,j} is the zero matrix with 1 in the (i,j) location.

There is a well known connection between the c.f.e of x\in(0,1) and the A-orbit \Gamma u_x A = \Gamma \left(\begin{array}{cc}1 & x\\0 & 1\end{array}\right)A. Under this connection we get that \Gamma u_x a(-t,t) diverges as t \to -\infty, and the coefficients of the c.f.e can be seen in the forward orbit (i.e. t>0). In particular, the rational numbers can be characterized as the x for which the forward orbit diverges as well. In this case we obtain that the map a\mapsto \Gamma u_x a is a proper map (the preimage of a compact set is compact). This leads us to the following definition:

Definition: Let x\in X_n. The orbit xA is called divergent if the map a\mapsto xa from A to X_n is proper.

Remark: This divergence property in dimension 2 corresponds to the fact that the c.f.e of rational numbers are always finite.
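To see the divergence concretely in dimension 2, here is a short worked computation. It assumes the usual identification of \Gamma g with the lattice \mathbb{Z}^2 g of row vectors, a convention that is not spelled out above. For x=\frac{p}{q} in lowest terms, the lattice \mathbb{Z}^2 u_x contains the two vectors

```latex
(q,-p)\,u_{p/q} = \left(q,\; q\cdot\tfrac{p}{q}-p\right) = (q,0),
\qquad
(0,1)\,u_{p/q} = (0,1),
```

and applying a(-t,t)=diag(e^{-t},e^{t}) gives

```latex
(q,0)\,a(-t,t) = (q e^{-t},0) \xrightarrow[t\to+\infty]{} 0,
\qquad
(0,1)\,a(-t,t) = (0,e^{t}) \xrightarrow[t\to-\infty]{} 0.
```

So the orbit contains arbitrarily short nonzero vectors in both time directions, and by Mahler's compactness criterion it leaves every compact subset of X_2 as t\to\pm\infty. The vector (0,1) is present for every x, which is why the backward orbit always diverges, while an axis vector of the form (q,0) exists only when x is rational.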

Any rational lattice x=\Gamma g, g \in SL_n(\mathbb{Q}) always contains a nonzero vector on each of the main axes. Hence whenever \bar{t}\in \mathbb{R}^n_0 is large enough, there exists some i for which t_i is very negative, so that xa(\bar{t}) contains a very small nonzero vector, or in other words xa(\bar{t}) is near the cusp. Thus the orbit of a rational lattice is always divergent, and as in the 2-dimensional case the converse is also true.

As in the c.f.e world, we define a measure for each orbit of a rational lattice. Recall that A\cong \mathbb{R}^n_0 carries the (n-1)-dimensional Lebesgue measure. We denote by \delta_{xA} its push forward to the orbit xA, and note that for a divergent orbit this is a well defined A-invariant locally finite measure. In dimension 2, the measure corresponding to \tilde{\nu}_q is \mu_q:=\frac{1}{\varphi(q)}\sum_{(p,q)=1} \delta_{\Gamma u_{p/q}A}.

As the measures above are only locally finite and not probability measures, we need to consider the “right” normalization, so we define the following convergence:

Definition: Let \mu_n, \mu be nonzero locally finite measures on X.

  1. We write \mu_n \overset{w^*}{\longrightarrow} \mu if for any f\in C_c(X) we have that \mu_n(f)\to \mu(f).
  2. We write [\mu_n]\to [\mu] if there exist some constants c_n>0 such that c_n \cdot \mu_n \overset{w^*}{\longrightarrow} \mu. Equivalently, for any f_1,f_2 \in C_c(X) with \mu(f_2)\neq 0 we have that \frac{\mu_n(f_1)}{\mu_n(f_2)}\to \frac{\mu(f_1)}{\mu(f_2)} as n\to \infty.

Under this notation, we have the following result:

Theorem 1: Let \Lambda_q = \{ \frac{1}{q}(p_1,...,p_{n-1}) \mid (p_i,q)=1 \forall i\} and set \mu_q = \frac{1}{|\Lambda_q|}\sum _{\lambda \in \Lambda_q}\delta_{\Gamma u_\lambda A}. Then [\mu_q] \to [\mu_{X_n}] where \mu_{X_n} is the G-invariant probability measure on X_n.

The main steps of the proof are as follows: Suppose that [\mu_q]\to [\mu] for some subsequence of q and a locally finite measure \mu. Then

  1. A-invariance: Since the \mu_q are A-invariant, then so is \mu.
  2. No escape of mass: We show that \mu is not the zero measure. This condition can be translated to a Diophantine condition on the elements in \Lambda_q, and we show that this condition holds by using the fact that \Lambda_q equidistributes in [0,1]^{n-1} as q\to \infty. This equidistribution result is obvious if q runs over the prime numbers (or more generally a product of at most k primes for a fixed k), but this actually still holds for any sequence q.
  3. Maximal entropy: Normalizing \mu we obtain an A-invariant probability measure. For such measures we can compute the entropy h_\mu(a) with respect to some nontrivial element a\in A. We then use the theorem which states that h_\mu(a)\leq h_{\mu_{X_n}}(a) with equality if and only if \mu=\mu_{X_n}. Thus to complete the proof we prove that \mu must have the maximal entropy.

By definition, if xA is a divergent orbit, then xa \to \infty as a\to\infty. In particular, there is a “nice” region \Delta \subseteq A such that most of the “interesting” life span of xA is actually in x\Delta and outside this region the orbit diverges quickly to infinity (more specifically, lattices in this part of the orbit are exactly those which do not have short vectors on the axes). Given q\in \mathbb{N} this region can be chosen uniformly over the orbits corresponding to the elements in \Lambda_q, and we denote this region by \Delta_q. With this notation we have the following result, which says that there are very few orbits whose interesting life span is uniformly bounded.

Theorem 2: Fix some compact set K\subseteq X_n and let \Lambda_{q,K} = \{ \lambda \in \Lambda_q \mid \; \Gamma u_\lambda \Delta_q \subseteq K\}. Then:

  1. For n=2 we have that |\Lambda_{q,K}|=O(|\Lambda_q|^{1-\varepsilon_K})=O(q^{1-\varepsilon_K}) for some 1>\varepsilon_K>0.
  2. For n\geq 3 we have that |\Lambda_{q,K}|=O(|\Lambda_q|^{\varepsilon})=O(q^{(n-1)\varepsilon}) for any \varepsilon>0.

Finally, for dimension 2, the theorem above can be translated back to c.f.e, where it asks, for a fixed K>0, how many p coprime to q there are such that all the coefficients in the c.f.e of \frac{p}{q} are bounded by K. Let us denote this set by \tilde{\Lambda}_{q,K}. A well known conjecture regarding this set is Zaremba’s conjecture, which asserts that there exists some constant K for which \tilde{\Lambda}_{q,K}\neq \emptyset for all q. What is known (see Bourgain and Kontorovich [3] and Huang [4]) is that for K=5, these sets are not empty for almost every q. The theorem above shows that even if this set is not empty, it cannot be too big. More specifically, we have the following:

Corollary 3: For any K>0 there exists some \varepsilon_K>0 such that |\tilde{\Lambda}_{q,K}|<q^{1-\varepsilon_K}.
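For small q the set \tilde{\Lambda}_{q,K} can simply be enumerated. The following brute-force sketch (with hypothetical function names, and using the c.f.e produced by the Euclidean division algorithm) lists the numerators p whose coefficients are all at most K; it is meant only for experimenting with the statement, not as an efficient algorithm.

```python
from math import gcd

def cfe_rational(p, q):
    """Coefficients of p/q = [0; a_1, ..., a_n] via Euclidean division."""
    coeffs = []
    while p != 0:
        a, r = divmod(q, p)
        coeffs.append(a)
        q, p = p, r
    return coeffs

def bounded_numerators(q, K):
    """All 1 <= p < q coprime to q whose c.f.e. coefficients are all at most K."""
    return [p for p in range(1, q)
            if gcd(p, q) == 1 and max(cfe_rational(p, q)) <= K]

# For q = 5 the expansions are [5], [2,2], [1,1,2], [1,4],
# so only p = 2 and p = 3 have all coefficients at most 2.
print(bounded_numerators(5, 2))
```

Comparing `len(bounded_numerators(q, K))` with q for a fixed K and growing q illustrates how sparse these sets are.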

Upgrading the result:

Using the ergodicity of the uniform measure \mu_{X_n} with respect to the group A and the fact that the measures \mu_q are averages of A-orbits, we can upgrade the equidistribution result of Theorem 1 as follows:

Upgrade 1: Let \Lambda_q'\subseteq \Lambda_q such that \liminf \frac{|\Lambda_q'|}{|\Lambda_q|}>0. Then [\sum_{\lambda\in \Lambda_q'} \delta_{\Gamma u_\lambda A}]\to [\mu_{X_n}].

This theorem follows from the presentation

\mu_q = \frac{1}{|\Lambda_q|}\sum_{\lambda\in \Lambda_q} \delta_{\Gamma u_\lambda A} =  \frac{|\Lambda_q'|}{|\Lambda_q|} \left[ \frac{1}{|\Lambda_q'|} \sum_{\lambda\in \Lambda_q'} \delta_{\Gamma u_\lambda A} \right]+(1-\frac{|\Lambda_q'|}{|\Lambda_q|}) \left[ \frac{1}{|\Lambda_q|-|\Lambda_q'|} \sum_{\lambda\notin \Lambda_q'} \delta_{\Gamma u_\lambda A} \right].

By passing to a subsequence on which all the averages above and their coefficients converge, we may assume that \frac{|\Lambda_q'|}{|\Lambda_q|}\to \alpha >0, hence taking the limit (with the proper normalization) we obtain that

\mu_{X_n}=\alpha \mu^{(1)}+(1-\alpha)\mu^{(2)}.

It is easy to show that both \mu^{(1)} and \mu^{(2)} must be A-invariant probability measures. The ergodicity of \mu_{X_n} implies that it is an extreme point in the set of such measures, i.e. it cannot be written as a proper convex combination of two distinct A-invariant probability measures. From that we conclude that \mu^{(1)}=\mu_{X_n}, which implies Upgrade 1.

Next, we upgrade this result even further and show that almost all the \delta_{\Gamma u_\lambda A} for \lambda \in \Lambda_q are close to the uniform measure (after the suitable normalization).

Upgrade 2: There exist sets \Lambda_q'\subseteq \Lambda_q such that

  1. We have that \lim_{q\to \infty} \frac{|\Lambda_q'|}{|\Lambda_q|}=1, and
  2. For any choice of sequence \lambda_q\in \Lambda_q' we have that [\delta_{\Gamma u_{\lambda_q} A}]\to [\mu_{X_n}].

The main idea of the proof of this upgrade is that if it is not true, then after a suitable normalization we could find a positive proportion of the elements in \Lambda_q and some witness function f\in C_c(X_n) for which  \delta_{\Gamma u_{\lambda_q} A}(f) are “far” from \mu_{X_n}(f). But such a result will contradict the previous upgrade.

In particular, for dimension 2 and its connection to c.f.e, we obtain the following:

Corollary:  There exist sets \Lambda_q'\subseteq \{1\leq p\leq q \mid (p,q)=1\} such that

  1. We have that \lim_{q\to \infty} \frac{|\Lambda_q'|}{\varphi(q)}=1, and
  2. For any choice of sequence p_q\in \Lambda_q' we have that \nu_{p_q/q} \to \nu_{Gauss}.

Divergent orbits in the space of adelic lattices

In the results of the previous section for the space X_n, we took an average over some set \Lambda_q and then an average over the diagonal flow. The set \Lambda_q always consisted of rational lattices with vectors defined over \frac{1}{q}\mathbb{Z}. Of course, there are many more such lattices than just those inside \Lambda_q, and a natural question is what makes this set so important, and whether there are other nice sets which have similar equidistribution results. In order to see this result in a much more natural way we lift the discussion to the space of adelic lattices, where the average over the rationals can be seen as a translation in the p-adic places.

Recall first that we have a natural projection

\pi: PGL_n(\mathbb{Q}) \backslash PGL_n(\mathbb{A}) \to PGL_n(\mathbb{Z}) \backslash PGL_n(\mathbb{R}) \cong X_n

defined as follows. Given (x,y)\in PGL_n(\mathbb{A}) where x\in PGL_n(\mathbb{R}) and y\in \prod _p PGL_n(\mathbb{Q}_p) is such that y_p \in PGL_n(\mathbb{Z}_p) for all but finitely many primes p, we can always find \gamma \in PGL_n(\mathbb{Q}) such that \gamma y_p \in PGL_n(\mathbb{Z}_p) for all p. We then define the projection \pi(PGL_n(\mathbb{Q})\cdot (x,y)):= PGL_n(\mathbb{Z})\gamma x.

Letting x_0 \in  PGL_n(\mathbb{Q}) \backslash PGL_n(\mathbb{A}) be the point corresponding to PGL_n(\mathbb{Q}) and A_\mathbb{A} be the set of diagonal matrices in PGL_n(\mathbb{A}) we denote by \eta the locally finite A_\mathbb{A}-invariant measure on the orbit x_0 A_\mathbb{A}. It is not hard to check that the projection of this measure to X_n is exactly \pi_*([\eta])=[\delta_{\Gamma A}], namely the A-invariant measure on the orbit \Gamma A. When the measure \eta is first translated by an element from PGL_n(\mathbb{A}) which is trivial in the real place and then projected down to X_n, then this translation turns into an average over several orbit measures. In particular we are interested in translation by elements of the form

\bar{g} = (Id, g, g,...), g\in PGL_n(\mathbb{Q}).

One can show that for \lambda\in \Lambda_q we have that \pi_*(\bar{u}_\lambda [\eta]) = [\mu_q], and we use this fact to prove that:

Theorem 4: The sequence \bar{u}_{\frac{1}{q}(1,...,1)} [\eta] converges to the PGL_n(\mathbb{A})-invariant measure on PGL_n(\mathbb{Q}) \backslash PGL_n(\mathbb{A}).

This theorem leads to the more general question of finding a condition on a sequence g_i \in PGL_n(\mathbb{A}) under which the translations g_i[\eta] equidistribute. The theorem above shows that we can choose the sequence g_i = \bar{u}_{\frac{1}{q}(1,...,1)}.

Note that the sequence cannot equidistribute if the g_i are in some compact set. Moreover, since \eta is A_\mathbb{A}-invariant, a necessary condition is that g_i diverges modulo A_\mathbb{A}. On the other hand, if the sequence g_i[\eta] equidistributes, then clearly k_i g_i a_i [\eta] equidistributes for any choice of a_i \in A_\mathbb{A} and k_i in a fixed compact set. Thus, using the Iwasawa decomposition in the case n=2 we obtain the following:

Theorem 5: Let g_i\in PGL_2(\mathbb{A}) be a sequence which satisfies:

  1. The sequence diverges modulo A_\mathbb{A}, and
  2. The real part of the g_i is trivial for each i (or more generally, it is in a compact set modulo A).

Then the sequence g_i [\eta] equidistributes.


  1. O. David, U. Shapira, “Equidistribution of divergent orbits and continued fraction expansion of rationals”, arXiv:1707.00427 [math.DS]
  2. O. David, U. Shapira, “Equidistribution of divergent orbits of the diagonal group in the space of lattices”, arXiv:1710.05242 [math.DS]
  3. J. Bourgain and A. Kontorovich, “On Zaremba’s conjecture”. Annals of Mathematics, 180(1), pp.137-196, 2014.
  4. S. Huang, “An improvement to Zaremba’s conjecture”. Geometric and Functional Analysis, 25(3), pp.860-914, 2015.