# Equidistribution of divergent orbits

## Motivation from continued fraction expansion:

Let us recall first the definition of continued fraction expansion (c.f.e).

Given a (possibly finite) sequence $a_0 \in \mathbb{Z}, a_i\in \mathbb{N}, i\geq 1$, we define

$[a_0;a_1,a_2,....]:=a_0+\cfrac{1}{a_1+\cfrac{1}{a_2+\cfrac{1}{a_3+\cdots}}}$

to be the limit of the convergents $\frac{p_n}{q_n}=[a_0;a_1,a_2,...,a_n]\in \mathbb{Q}$ (defining the value as a limit is analogous to defining a real number through its decimal expansion).

It is not hard to show that for any sequence $(a_i)$ the induced sequence of convergents actually converges. Moreover, every irrational number has a unique such presentation, while every rational number has exactly two (e.g. $\frac{1}{2}=[0;2]=[0;1,1]$). Given any $x\in [0,1]$ and $q\in \mathbb{N}$, it is easy to find an integer $p$ such that $|x-\frac{p}{q}|\leq \frac{1}{q}$; the importance of the convergents in the c.f.e is that they satisfy the much stronger inequality $|x-\frac{p_n}{q_n}|<\frac{1}{q_n^2}$. In other words, the c.f.e provides us with very good approximations to $x$.
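As a small illustration, the convergents can be computed with the classical recurrence $p_n=a_np_{n-1}+p_{n-2}$, $q_n=a_nq_{n-1}+q_{n-2}$ (a standard fact that is not stated above). The following Python sketch uses exact rational arithmetic; the helper name `convergents` is hypothetical and chosen only for this example.

```python
from fractions import Fraction

def convergents(a):
    """Return the convergents p_n/q_n of [a[0]; a[1], a[2], ...] as Fractions."""
    p_prev, q_prev = 1, 0          # (p_{-1}, q_{-1}), the usual initial values
    p, q = a[0], 1                 # (p_0, q_0)
    out = [Fraction(p, q)]
    for a_n in a[1:]:
        # p_n = a_n p_{n-1} + p_{n-2},  q_n = a_n q_{n-1} + q_{n-2}
        p, p_prev = a_n * p + p_prev, p
        q, q_prev = a_n * q + q_prev, q
        out.append(Fraction(p, q))
    return out

# Check the inequality |x - p_n/q_n| < 1/q_n^2 for x = 3/5 = [0;1,1,2].
x = Fraction(3, 5)
for c in convergents([0, 1, 1, 2]):
    assert abs(x - c) < Fraction(1, c.denominator ** 2)
```

The same function also shows the two presentations of a rational number: `convergents([0, 2])` and `convergents([0, 1, 1])` both end at $\frac{1}{2}$.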

The standard way to study this sort of presentation is via the left shift operator on the sequence $(a_i)$: after we have found the $n$-th convergent, we shift left $n$ times in order to get the next coefficient $a_{n+1}$ needed for the next convergent.

Definition (Gauss map): Given $x=[0;a_1 ,a_2 ,...]\in (0,1]$ we define the Gauss map $T:(0,1]\to[0,1]$ to be

$T(x)=T([0;a_1 ,a_2 ,a_3,...]):=[0;a_2 ,a_3,...]=\frac{1}{x}-\left\lfloor \frac{1}{x}\right\rfloor.$

Note that for $x=[0;a_1 ,a_2,...] \in (0,1]$ we have that $a_1= \left\lfloor \frac{1}{x}\right\rfloor$, hence the information about the $n$-th convergent is contained in the partial forward orbit $\{x,T(x),T^2(x),...,T^{n-1}(x)\}$. It is well known that this orbit equidistributes for almost every $x\in (0,1]$, which is a special instance of the pointwise ergodic theorem.
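The relation $a_1=\left\lfloor \frac{1}{x}\right\rfloor$ together with the shift $T$ gives a direct way to read off the coefficients. Below is a minimal Python sketch; the names `gauss_map` and `cfe_digits` are hypothetical. It works exactly on `Fraction` inputs, and only approximately on floats, where rounding errors grow with every iteration.

```python
from fractions import Fraction
from math import floor

def gauss_map(x):
    """T(x) = 1/x - floor(1/x) for x in (0, 1]."""
    inv = 1 / x
    return inv - floor(inv)

def cfe_digits(x, max_digits=50):
    """Read off a_1, a_2, ... from the forward orbit x, T(x), T^2(x), ..."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        digits.append(floor(1 / x))   # a_{i+1} = floor(1 / T^i(x))
        x = gauss_map(x)
    return digits

print(cfe_digits(Fraction(3, 5)))     # exact: the orbit reaches 0 after finitely many steps
print(cfe_digits(2 ** 0.5 - 1, 10))   # float: only the first few digits are reliable
```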

Pointwise Ergodic Theorem (for the Gauss map): For any $f\in C_c([0,1])$ and for almost every $x\in [0,1]$ we have that

$\lim_{N\to \infty}\frac{1}{N}\sum_{i=0}^{N-1} f(T^i(x)) = \frac{1}{\ln(2)}\int_0^1 \frac{f(t)}{1+t} dt,$

or equivalently we have the weak star limit

$\lim_{N\to \infty}\frac{1}{N}\sum_{i=0}^{N-1} \delta_{T^i(x)} =\nu_{Gauss}:=\frac{dt}{\ln(2)(1+t)}.$
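The ergodic averages can be explored numerically. The sketch below is only a floating point experiment, not a proof: rounding means that the computed orbit is not the true orbit of the starting point, and the starting point, the test function $f(t)=t$ and the helper names are all assumptions made for this illustration. For $f(t)=t$ the right hand side is $\frac{1}{\ln 2}\int_0^1\frac{t}{1+t}\,dt=\frac{1-\ln 2}{\ln 2}$.

```python
import math
import random

def gauss_map(x):
    inv = 1.0 / x
    return inv - math.floor(inv)

def birkhoff_average(f, x, n_steps):
    """Average of f over x, T(x), ..., stopping early if rounding sends the orbit to 0.

    The starting point x is assumed to lie in (0, 1).
    """
    total, done = 0.0, 0
    while done < n_steps and x > 0.0:
        total += f(x)
        x = gauss_map(x)
        done += 1
    return total / done

x0 = random.Random(0).random()                  # arbitrary starting point in (0, 1)
empirical = birkhoff_average(lambda t: t, x0, 100_000)
expected = (1 - math.log(2)) / math.log(2)      # integral of t against the Gauss measure
print(empirical, expected)
```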

A point $x\in [0,1]$ which satisfies the convergence above is called a generic point (for the Gauss map). It can be easily shown, by using indicator functions, that a point is generic if and only if the coefficients in its continued fraction expansion satisfy certain statistics (called the Gauss-Kuzmin statistics); in particular, every natural number must appear in its c.f.e. Thus, it is easy to construct points which are not generic and in particular we have the following three sets:

1. The sequence $a_i$ is bounded – corresponds to badly approximable numbers, namely $x$ for which there exists some constant $c>0$ such that $|x-\frac{p}{q}|>\frac{c}{q^2}$ for every $\frac{p}{q}\in \mathbb{Q}$.
2. The sequence $a_i$ is eventually periodic – corresponds to real algebraic numbers of degree 2.
3. The sequence $a_i$ is finite (in which case we cannot even take $N$ to infinity in the PET) – corresponds to rational numbers.

By the PET, each of these sets is of zero Lebesgue (and Gauss) measure. The set of badly approximable numbers is very big in the sense that it is uncountable, has maximal Hausdorff dimension and is even Schmidt winning (so, for example, the intersection of countably many of its translates is still nonempty and has all the properties above). On the other hand, the sets in (2) and (3) are countable. For these two sets, instead of taking a single element and longer and longer pieces of its forward orbit, we can ask whether a suitable ordering of the elements will produce an equidistribution result. In this work we consider the case of the finite c.f.e.

### Continued fraction expansion of rational numbers:

Let $\frac{p}{q} \in (0,1)$ be a rational number where $0<p<q$ are coprime. If $\frac{p}{q}=[0;a_1,a_2,...,a_n]$, then it is easily seen that $q=a_1 p +r_1$ where $\frac{r_1}{p}=[0;a_2,a_3,...,a_n]=T(\frac{p}{q})$ and in particular $0\leq r_1<p$. Similarly we have that $p = a_2 r_1 +r_2$ where $0\leq r_2<r_1$ and $\frac{r_2}{r_1}=[0;a_3,a_4,...,a_n].$ Thus, the coefficients in the c.f.e of a rational number are exactly the quotients appearing in the Euclidean division algorithm. For example

$\frac{1}{5}=[0;5];\,\,\frac{2}{5}=[0;2,2];\,\,\frac{3}{5}=[0;1,1,2];\,\,\frac{4}{5}=[0;1,4]$
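The correspondence with the Euclidean algorithm can be sketched in a few lines of integer arithmetic. The function name `cfe_rational` is hypothetical; it returns the coefficients $a_1,...,a_n$ (so the last coefficient is at least 2 whenever $q>1$, which picks one of the two presentations of $\frac{p}{q}$).

```python
def cfe_rational(p, q):
    """Coefficients [a_1, ..., a_n] of p/q in (0, 1], read off from Euclid's algorithm."""
    digits = []
    while p != 0:
        digits.append(q // p)   # quotient in q = a * p + r
        p, q = q % p, p         # continue with r / p = T(p / q)
    return digits

# Reproduces the q = 5 examples above, with lengths 1, 2, 3, 2.
for p in range(1, 5):
    print(p, cfe_rational(p, 5))
```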

Thus, trying to adapt the PET to these finite c.f.e we let $len(\frac{p}{q})$ be the length of the c.f.e of $\frac{p}{q}$ (which is the number of steps in the Euclidean division algorithm) and then set

$\nu_{p/q} = \frac{1}{len(\frac{p}{q})} \sum_{i=0}^{len(\frac{p}{q})-1}\delta_{T^i(\frac{p}{q})}.$

The first natural question is whether for every choice of $p_q$ coprime to $q$ we have that $\nu_{p_q /q} \overset{w^*}{\longrightarrow} \nu_{Gauss}$ as $q\to\infty$. This is false: for example, $\nu_{1/q}=\delta_{1/q}$ is always a Dirac measure, and these measures cannot converge to $\nu_{Gauss}$. On the other hand, this is a very specific example, and we might expect that there are very few such bad examples. This leads us to consider averages of the measures $\nu_{p/q}$ over all $1\leq p\leq q$ with $(p,q)=1$ and ask whether

$\nu_q := \frac{1}{\varphi(q)} \sum_{(p,q)=1} \nu_{p/q}\overset{w^*}{\longrightarrow} \nu_{Gauss}.$

We note that the normalization for each $p$ above is different according to $len(\frac{p}{q})$. For example, in the $q=5$ case we normalize by the lengths $1,2,3,2$. If we want to use a uniform normalization we can ask instead whether

$\tilde{\nu}_q := \frac{1}{\sum_{(p,q)=1} len(\frac{p}{q})} \sum_{(p,q)=1} \sum_{i=0}^{len(\frac{p}{q})-1} \delta_{T^i(\frac{p}{q})} \overset{w^*}{\longrightarrow} \nu_{Gauss}.$

In this work we prove the convergence of the last two averages, and show that this implies that $\nu_{p_q /q} \overset{w^*}{\longrightarrow} \nu_{Gauss}$ for almost every choice of $p_q$. These equidistribution results correspond to similar questions about equidistribution of geodesics in the space of 2-dimensional lattices (and have generalization to high dimension), which is where we prove them.
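For small $q$ the uniformly normalized measure $\tilde{\nu}_q$ can be computed exactly and compared with $\nu_{Gauss}$ on an interval $[0,t]$, where $\nu_{Gauss}([0,t])=\frac{\ln(1+t)}{\ln 2}$. The following Python sketch does this; the helper names and the sample value $q=1009$ are assumptions made for illustration, and no claim is made here about how fast the two numbers approach each other.

```python
from fractions import Fraction
from math import gcd, log

def orbit(p, q):
    """The points T^i(p/q), i = 0, ..., len(p/q) - 1, via Euclid's algorithm."""
    pts = []
    while p != 0:
        pts.append(Fraction(p, q))
        p, q = q % p, p          # T(p/q) = (q mod p) / p
    return pts

def nu_tilde_mass(q, t):
    """Mass that the uniformly normalized measure gives to the interval [0, t]."""
    pts = [x for p in range(1, q + 1) if gcd(p, q) == 1 for x in orbit(p, q)]
    return sum(1 for x in pts if x <= t) / len(pts)

def gauss_mass(t):
    """Mass of [0, t] under the Gauss measure dt / (ln(2) (1 + t))."""
    return log(1 + t) / log(2)

t = Fraction(1, 2)
print(nu_tilde_mass(5, t), nu_tilde_mass(1009, t), gauss_mass(t))
```

For $q=5$ the measure has 8 atoms, 5 of which lie in $[0,\frac{1}{2}]$, so the first printed value is $\frac{5}{8}$.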

## Divergent orbits in the space of lattices

All the results appearing here are from [1] (dimension 2) and [2] (general dimension).

Let $X_n = \Gamma \backslash G$, where $G=SL_n(\mathbb{R})$ and $\Gamma=SL_n(\mathbb{Z})$, be the space of unimodular $n$-dimensional lattices. We will use the following notation

$A = \{ a(\bar{t}):=diag(e^{t_1},...,e^{t_n})\in SL_n(\mathbb{R}) \mid \sum t_i = 0 \},$

$U = \{u_{\bar{x}}:= I + \sum_{i=2}^{n} x_i e_{1,i} \mid \bar{x}\in \mathbb{R}^{n-1} \}$

where $e_{i,j}$ is the zero matrix with 1 in the $(i,j)$ location.

There is a well known connection between the c.f.e of $x\in(0,1)$ and the $A$-orbit $\Gamma u_x A = \Gamma \left(\begin{array}{cc}1 & x\\0 & 1\end{array}\right)A$. Under this connection we get that $\Gamma u_x a(-t,t)$ diverges as $t \to -\infty$, and the coefficients of the c.f.e can be seen in the forward orbit (i.e. $t>0$). In particular, the rational numbers can be characterized as the $x$ for which the forward orbit diverges as well. In this case we obtain that the map $a\mapsto \Gamma u_x a$ is a proper map (the preimage of a compact set is compact). This leads us to the following definition:

Definition: Let $x\in X_n$. The orbit $xA$ is called divergent if the map $a\mapsto xa$ from $A$ to $X_n$ is proper.

Remark: This divergence property in dimension 2 corresponds to the fact that the c.f.e of rational numbers are always finite.

Any rational lattice $x=\Gamma g, g \in SL_n(\mathbb{Q})$ always contains a vector on each of the main axes, hence whenever $\bar{t}\in \mathbb{R}^n_0$ is large enough, there exists some $i$ for which $t_i$ is very negative, so that $xa(\bar{t})$ contains a very short nonzero vector; in other words, $xa(\bar{t})$ is near the cusp. Thus the orbit of a rational lattice is always divergent, and as in the 2-dimensional case the converse is also true.
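In dimension 2 this divergence can be observed numerically. The sketch below makes several assumptions that are not spelled out above: lattices are spanned by the rows of the matrix, so $\Gamma u_x a(-t,t)$ has basis $(e^{-t}, xe^{t})$ and $(0,e^{t})$, and the shortest vector is found with Lagrange-Gauss reduction, which is a standard tool swapped in here for illustration. The function names are hypothetical.

```python
from math import exp, hypot

def shortest_vector_length(b1, b2):
    """Length of a shortest nonzero vector of Z*b1 + Z*b2 (Lagrange-Gauss reduction)."""
    def norm2(v):
        return v[0] * v[0] + v[1] * v[1]
    if norm2(b1) > norm2(b2):
        b1, b2 = b2, b1
    while True:
        # Subtract from b2 the nearest-integer multiple of b1.
        m = round((b1[0] * b2[0] + b1[1] * b2[1]) / norm2(b1))
        b2 = (b2[0] - m * b1[0], b2[1] - m * b1[1])
        if norm2(b2) >= norm2(b1):
            return hypot(*b1)
        b1, b2 = b2, b1

def orbit_basis(x, t):
    """Rows of u_x * diag(e^{-t}, e^{t}), a basis of the lattice Gamma u_x a(-t, t)."""
    return (exp(-t), x * exp(t)), (0.0, exp(t))

# For the rational x = 2/5 the shortest vector shrinks in both time directions.
for t in (-5, -2, 0, 2, 5):
    print(t, shortest_vector_length(*orbit_basis(0.4, t)))
```

For $t\to-\infty$ the short vector comes from $(0,1)$, and for $t\to+\infty$ it comes from $(q,qx-p)=(5,0)$, the vector on the first axis.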

As in the c.f.e world, we define a measure for each orbit of a rational lattice. Recall that $A\cong \mathbb{R}^n_0$ has the $n-1$ dimensional Lebesgue measure, and we denote by $\delta_{xA}$ its push forward to the orbit $xA$, and we note that for a divergent orbit this is a well defined $A$-invariant locally finite measure. In the case of dimension 2, the corresponding measure to $\tilde{\nu}_q$ will be $\mu_q:=\frac{1}{\varphi(q)}\sum_{(p,q)=1} \delta_{\Gamma u_{p/q}A}$.

As the measures above are only locally finite and not probability measures, we need to consider the “right” normalization, so we define the following convergence:

Definition: Let $\mu_n, \mu$ be nonzero locally finite measures on $X$.

1. We write $\mu_n \overset{w^*}{\longrightarrow} \mu$ if for any $f\in C_c(X)$ we have that $\mu_n(f)\to \mu(f)$.
2. We write $[\mu_n]\to [\mu]$ if there exist some constants $c_n>0$ such that $c_n \cdot \mu_n \overset{w^*}{\longrightarrow} \mu$. Equivalently, for any $f_1,f_2 \in C_c(X)$ with $\mu(f_2)\neq 0$ we have that $\frac{\mu_n(f_1)}{\mu_n(f_2)}\to \frac{\mu(f_1)}{\mu(f_2)}$ as $n\to \infty$.

Under this notation, we have the following result:

Theorem 1: Let $\Lambda_q = \{ \frac{1}{q}(p_1,...,p_{n-1}) \mid (p_i,q)=1 \forall i\}$ and set $\mu_q = \frac{1}{|\Lambda_q|}\sum _{\lambda \in \Lambda_q}\delta_{\Gamma u_\lambda A}$. Then $[\mu_q] \to [\mu_{X_n}]$ where $\mu_{X_n}$ is the $G$-invariant probability measure on $X_n$.

The main steps of the proof are as follows: Suppose that $[\mu_q]\to [\mu]$ for some subsequence of $q$ and a locally finite measure $\mu$. Then

1. $A$-invariance: Since the $\mu_q$ are $A$-invariant, then so is $\mu$.
2. No escape of mass: We show that $\mu$ is not the zero measure. This condition can be translated to a Diophantine condition on the elements in $\Lambda_q$, and we show that this condition holds by using the fact that $\Lambda_q$ equidistributes in $[0,1]^{n-1}$ as $q\to \infty$. This equidistribution result is obvious if $q$ runs over the prime numbers (or more generally a product of at most $k$ primes for a fixed $k$), but this actually still holds for any sequence $q$.
3. Maximal entropy: Normalizing $\mu$ we obtain an $A$-invariant probability measure. For such measures we can compute the entropy $h_\mu(a)$ with respect to some nontrivial element $a\in A$. We then use the theorem which states that $h_\mu(a)\leq h_{\mu_{X_n}}(a)$ with equality if and only if $\mu=\mu_{X_n}$. Thus to complete the proof we prove that $\mu$ must have the maximal entropy.

By definition, if $xA$ is a divergent orbit, then $xa \to \infty$ as $a\to\infty$. In particular, there is a “nice” region $\Delta \subseteq A$ such that most of the “interesting” life span of $xA$ is actually in $x\Delta$ and outside this region the orbit diverges quickly to infinity (more specifically, lattices in this part of the orbit are exactly those which do not have short vectors on the axes). Given $q\in \mathbb{N}$ this region can be chosen uniformly over the orbits corresponding to the elements in $\Lambda_q$, and we denote this region by $\Delta_q$.  With this notation we have the following result which says that there are very few orbits for which their interesting life span is uniformly bounded.

Theorem 2: Fix some compact set $K\subseteq X_n$ and let $\Lambda_{q,K} = \{ \lambda \in \Lambda_q \mid \; \Gamma u_\lambda \Delta_q \subseteq K\}$. Then:

1. For $n=2$ we have that $|\Lambda_{q,K}|=O(|\Lambda_q|^{1-\varepsilon_K})=O(q^{1-\varepsilon_K})$ for some $1>\varepsilon_K>0$.
2. For $n\geq 3$ we have that $|\Lambda_{q,K}|=O(|\Lambda_q|^{\varepsilon})=O(q^{(n-1)\varepsilon})$ for any $\varepsilon>0$.

Finally, for dimension 2, the theorem above can be translated back to c.f.e: for a fixed $K>0$, it asks how many $p$ coprime to $q$ there are such that the coefficients in the c.f.e of $\frac{p}{q}$ are bounded by $K$. Let us denote this set by $\tilde{\Lambda}_{q,K}$. A well known conjecture regarding this set is Zaremba’s conjecture, which states that there exists some constant $K$ for which $\tilde{\Lambda}_{q,K}\neq \emptyset$ for all $q$. What is known (see Bourgain and Kontorovich [3] and Huang [4]) is that for $K=5$, these sets are not empty for almost every $q$. The theorem above shows that even if this set is not empty, it cannot be too big. More specifically, we have the following:

Corollary 3: For any $K>0$ there exists some $\varepsilon_K>0$ such that $|\tilde{\Lambda}_{q,K}|=O(q^{1-\varepsilon_K})$.
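The size of $\tilde{\Lambda}_{q,K}$ is easy to compute for small $q$. The Python sketch below counts the $p$ coprime to $q$ whose coefficients are all at most $K$; the function names are hypothetical, and the coefficients are taken from the Euclidean algorithm (the presentation whose last coefficient is at least 2).

```python
from math import gcd

def cfe_rational(p, q):
    """Coefficients [a_1, ..., a_n] of p/q, read off from Euclid's algorithm."""
    digits = []
    while p != 0:
        digits.append(q // p)
        p, q = q % p, p
    return digits

def bounded_count(q, K):
    """Number of 1 <= p <= q coprime to q whose c.f.e coefficients are all <= K."""
    return sum(1 for p in range(1, q + 1)
               if gcd(p, q) == 1 and max(cfe_rational(p, q)) <= K)

# For q = 5 the expansions are [5], [2,2], [1,1,2], [1,4].
print([bounded_count(5, K) for K in (1, 2, 4, 5)])
```

Comparing `bounded_count(q, K)` with $\varphi(q)$ for growing $q$ gives a quick experiment alongside the corollary, although small values of $q$ say nothing about the asymptotic exponent.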

Using the ergodicity of the uniform measure $\mu_{X_n}$ with respect to the group $A$ and the fact that the measures $\mu_q$ are averages of $A$-orbits, we can upgrade Theorem 1 as follows:

Upgrade 1: Let $\Lambda_q'\subseteq \Lambda_q$ such that $\liminf \frac{|\Lambda_q'|}{|\Lambda_q|}>0$. Then $[\sum_{\lambda\in \Lambda_q'} \delta_{\Gamma u_\lambda A}]\to [\mu_{X_n}]$.

This theorem follows from the presentation

$\mu_q = \frac{1}{|\Lambda_q|}\sum_{\lambda\in \Lambda_q} \delta_{\Gamma u_\lambda A} = \frac{|\Lambda_q'|}{|\Lambda_q|} \left[ \frac{1}{|\Lambda_q'|} \sum_{\lambda\in \Lambda_q'} \delta_{\Gamma u_\lambda A} \right]+\left(1-\frac{|\Lambda_q'|}{|\Lambda_q|}\right) \left[ \frac{1}{|\Lambda_q|-|\Lambda_q'|} \sum_{\lambda\in \Lambda_q\setminus \Lambda_q'} \delta_{\Gamma u_\lambda A} \right].$

By going to a subsequence on which all the averages above and their coefficients converge, we may assume that $\frac{|\Lambda_q'|}{|\Lambda_q|}\to \alpha >0$, hence taking the limit (with the proper normalization) we obtain that

$\mu_{X_n}=\alpha \mu^{(1)}+(1-\alpha)\mu^{(2)}$.

It is easy to show that both $\mu^{(1)}$ and $\mu^{(2)}$ must be $A$-invariant probability measures. The ergodicity of $\mu_{X_n}$ implies that it is an extreme point in the set of such measures, i.e. it cannot be written as a proper convex combination of two distinct $A$-invariant probability measures. From that we conclude that $\mu^{(1)}=\mu_{X_n}$, which implies Upgrade 1.

Next, we upgrade this result even further and show that almost all the $\delta_{\Gamma u_\lambda A}$ for $\lambda \in \Lambda_q$ are close to the uniform measure (after the suitable normalization).

Upgrade 2: There exist sets $\Lambda_q'\subseteq \Lambda_q$ such that

1. We have that $\lim_{q\to \infty} \frac{|\Lambda_q'|}{|\Lambda_q|}=1$, and
2. For any choice of sequence $\lambda_q\in \Lambda_q'$ we have that $[\delta_{\Gamma u_{\lambda_q} A}]\to [\mu_{X_n}]$.

The main idea of the proof of this upgrade is that if it is not true, then after a suitable normalization we could find a positive proportion of the elements in $\Lambda_q$ and some witness function $f\in C_c(X_n)$ for which  $\delta_{\Gamma u_{\lambda_q} A}(f)$ are “far” from $\mu_{X_n}(f)$. But such a result will contradict the previous upgrade.

In particular, for dimension 2 and its connection to c.f.e, we obtain the following:

Corollary:  There exist sets $\Lambda_q'\subseteq \{1\leq p\leq q \mid (p,q)=1\}$ such that

1. We have that $\lim_{q\to \infty} \frac{|\Lambda_q'|}{\varphi(q)}=1$, and
2. For any choice of sequence $p_q\in \Lambda_q'$ we have that $\nu_{p_q/q} \to \nu_{Gauss}$.

## Divergent orbits in the space of adelic lattices

In the results of the previous section for the space $X_n$, we took an average over some set $\Lambda_q$ and then an average over the diagonal flow. The set $\Lambda_q$ always consisted of rational lattices with vectors defined over $\frac{1}{q}\mathbb{Z}$. Of course, there are many more such lattices than just those coming from $\Lambda_q$, and a natural question is what makes this set so important, and whether there are any other nice sets which have similar equidistribution results. In order to see this result in a much more natural way, we lift the discussion to the space of adelic lattices, where the average over the rationals can be seen as a translation in the p-adic places.

Recall first that we have a natural projection

$\pi: PGL_n(\mathbb{Q}) \backslash PGL_n(\mathbb{A}) \to PGL_n(\mathbb{Z}) \backslash PGL_n(\mathbb{R}) \cong X_n$

defined as follows. Given $(x,y)\in PGL_n(\mathbb{A})$ where $x\in PGL_n(\mathbb{R})$ and $y\in \prod _p PGL_n(\mathbb{Q}_p)$ such that in almost every place $y_p \in PGL_n(\mathbb{Z}_p)$, we can always find $\gamma \in PGL_n(\mathbb{Q})$ such that $\gamma y_p \in PGL_n(\mathbb{Z}_p)$ for all $p$. We then define the projection $\pi(PGL_n(\mathbb{Q})\cdot (x,y)):= PGL_n(\mathbb{Z})\gamma x$.

Letting $x_0 \in PGL_n(\mathbb{Q}) \backslash PGL_n(\mathbb{A})$ be the point corresponding to $PGL_n(\mathbb{Q})$ and $A_\mathbb{A}$ be the set of diagonal matrices in $PGL_n(\mathbb{A})$ we denote by $\eta$ the locally finite $A_\mathbb{A}$-invariant measure on the orbit $x_0 A_\mathbb{A}$. It is not hard to check that the projection of this measure to $X_n$ is exactly $\pi_*([\eta])=[\delta_{\Gamma A}]$, namely the $A$-invariant measure on the orbit $\Gamma A$. When the measure $\eta$ is first translated by an element from $PGL_n(\mathbb{A})$ which is trivial in the real place and then projected down to $X_n$, then this translation turns into an average over several orbit measures. In particular we are interested in translation by elements of the form

$\bar{g} = (Id, g, g,...), g\in PGL_n(\mathbb{Q}).$

One can show that for $\lambda\in \Lambda_q$ we have that $\pi_*(\bar{u}_\lambda [\eta]) = [\mu_q]$, and we use this fact to prove that:

Theorem 4: The sequence $\bar{u}_{\frac{1}{q}(1,...,1)} [\eta]$ converges to the $PGL_n(\mathbb{A})$-invariant measure on $PGL_n(\mathbb{Q}) \backslash PGL_n(\mathbb{A})$.

This theorem leads to the more general question of finding a condition of a sequence $g_i \in PGL_n(\mathbb{A})$ for which the translations $g_i[\eta]$ equidistribute. The theorem above shows that we can choose the sequence $g_i = \bar{u}_{\frac{1}{q}(1,...,1)}$.

Note that the sequence cannot equidistribute if the $g_i$ are in some compact set. Moreover, since $\eta$ is $A_\mathbb{A}$-invariant, a necessary condition is that $g_i$ diverges modulo $A_\mathbb{A}$. On the other hand, if the sequence $g_i[\eta]$ equidistributes, then clearly $k_i g_i a_i [\eta]$ equidistributes for any choice of $a_i \in A_\mathbb{A}$ and $k_i$ in a fixed compact set. Thus, using the Iwasawa decomposition in the case $n=2$ we obtain the following:

Theorem 5: Let $g_i\in PGL_2(\mathbb{A})$ which satisfy:

1. The sequence diverges modulo $A_\mathbb{A}$, and
2. The real part of the $g_i$ is trivial for each $i$ (or more generally, it is in a compact set modulo $A$).

Then the sequence $g_i [\eta]$ equidistributes.

### Bibliography:

1. O. David, U. Shapira, “Equidistribution of divergent orbits and continued fraction expansion of rationals”, arXiv:1707.00427 [math.DS].
2. O. David, U. Shapira, “Equidistribution of divergent orbits of the diagonal group in the space of lattices”, arXiv:1710.05242 [math.DS].
3. J. Bourgain and A. Kontorovich, “On Zaremba’s conjecture”. Annals of Mathematics, 180(1), pp.137-196, 2014.
4. S. Huang, “An improvement to Zaremba’s conjecture”. Geometric and Functional Analysis, 25(3), pp.860-914, 2015.