$\ell^p$-improving for discrete spherical averages

We initiate the theory of $\ell^p$-improving inequalities for arithmetic averages over hypersurfaces and their maximal functions. In particular, we prove $\ell^p$-improving estimates for the discrete spherical averages and some of their generalizations. As an application of our $\ell^p$-improving inequalities for the dyadic discrete spherical maximal function, we give a new estimate for the full discrete spherical maximal function in four dimensions. Our proofs are analogous to proofs of Littman's result on Euclidean spherical averages. One key aspect of our proof is a Littlewood--Paley decomposition in both the arithmetic and analytic aspects. In the arithmetic aspect this is the major arc-minor arc decomposition of the circle method.


Introduction
The motivation for this paper is Littman's $L^p(\mathbb{R}^d)$-improving result for spherical averages from [Lit73]. For dimensions $d \geq 2$ and functions $f : \mathbb{R}^d \to \mathbb{C}$, define the spherical average (over the unit sphere) by
$$ Af(x) := \int_{S^{d-1}} f(x-y) \, d\sigma(y), $$
where $d\sigma$ is the Euclidean surface measure on the unit sphere $S^{d-1}$ in $\mathbb{R}^d$.
Littman. If $A$ is the averaging operator over the unit sphere, then
$$ \|Af\|_{L^{d+1}(\mathbb{R}^d)} \lesssim \|f\|_{L^{\frac{d+1}{d}}(\mathbb{R}^d)}. \tag{1} $$
In this note we will be interested in estimates for the discrete spherical averages which are analogous to (1). Suppose that $d \geq 2$. For $\lambda \in \mathbb{N}$ and functions $f : \mathbb{Z}^d \to \mathbb{C}$, define the discrete spherical averages
$$ A_\lambda f(x) := N_d(\lambda)^{-1} \sum_{y \in \mathbb{Z}^d : |y|^2 = \lambda} f(x-y) $$
whenever $N_d(\lambda) := \#\{y \in \mathbb{Z}^d : |y|^2 = \lambda\}$ is not zero. In other words, $A_\lambda$ is the linear operator given by convolution with the discrete (or more appropriately named "arithmetic") probability measure $\sigma_\lambda := N_d(\lambda)^{-1} 1_{\{y \in \mathbb{Z}^d : |y|^2 = \lambda\}}$.
Our main result is the following.
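Theorem 1. If $d \geq 4$ and $\frac{d+1}{d-1} \leq p \leq 2$, then for each $\epsilon > 0$, there exist constants $C_{p,\epsilon}$ depending on $p$ and $\epsilon$ such that for all $\lambda \in \mathbb{N}$ (further restrict $\lambda$ to be odd when $d = 4$), we have the $\ell^p$-improving inequality
$$ \|A_\lambda f\|_{\ell^{p'}(\mathbb{Z}^d)} \leq C_{p,\epsilon} \, \lambda^{\epsilon - \frac{d}{2}\left(\frac{2}{p} - 1\right)} \|f\|_{\ell^p(\mathbb{Z}^d)}. \tag{2} $$
The implicit constants are independent of $\lambda$.
We are also motivated by Lee's work [Lee03] which proved that the dyadic spherical maximal function variant of Littman's theorem holds. Moreover, our methods are flexible and we can use them to strengthen Theorem 1 when $p$ is sufficiently large.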
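Theorem 2. If $d \geq 5$ and $p := \frac{d}{d-2}$, then there exist constants $C_p$ depending on $p$ such that for all $\Lambda \in \mathbb{N}$, we have the $\ell^p$-improving inequality for the $\Lambda$-dyadic discrete spherical maximal function
$$ \Bigl\| \sup_{\Lambda \leq \lambda < 2\Lambda} |A_\lambda f| \Bigr\|_{\ell^{p'}(\mathbb{Z}^d)} \leq C_p \, \Lambda^{-\frac{d}{2}\left(\frac{2}{p} - 1\right)} \|f\|_{\ell^p(\mathbb{Z}^d)}. $$
The constant $C_p$ is independent of $\Lambda$.
This clearly implies the same is true for a single average and hence we may remove the $\epsilon$-loss in Theorem 1 when $p > \frac{d}{d-2}$.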
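1.1. Motivation. When $d \geq 5$ and $\lambda \in \mathbb{N}$, we have that $100^{-1} \lambda^{\frac{d-2}{2}} \leq N_d(\lambda) \leq 100 \lambda^{\frac{d-2}{2}}$; when $d = 4$ and $\lambda$ is restricted to be odd, we have that $100^{-1} \lambda^{\frac{d-2}{2}} / \log\log\lambda \leq N_d(\lambda) \leq 100 \lambda^{\frac{d-2}{2}} \cdot \log\log\lambda$. Several years ago Jim Wright asked the author: What is $\ell^p$-improving for the discrete spherical averages? We interpret his question as the following.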
Question 1 (Jim Wright). When are there exponents $1 \leq p, q \leq \infty$ and a constant $C = C_{d,p,q}$, possibly depending on $d, p, q$ but independent of $\lambda$, such that
$$ \|A_\lambda f\|_{\ell^q(\mathbb{Z}^d)} \leq C \|f\|_{\ell^p(\mathbb{Z}^d)}? \tag{3} $$
On the one hand, unlike the continuous case in $\mathbb{R}^d$, we do not have the dilational symmetry to exploit; this is why we want (3) to hold uniformly for $\lambda \in \mathbb{N}$ instead of formulating the question for the unit sphere ($\lambda = 1$) as stated in Littman's result. On the other hand, we may quickly obtain some trivial off-diagonal results by using the contraction inequality and the nesting property of $\ell^p$-spaces to see that (3) is true for all $\lambda$ for $1 \leq p \leq q \leq \infty$ with $C = 1$. This makes Question 1 trivial. We consider the following two examples; assume for simplicity that $d \geq 5$ and $\lambda \in \mathbb{N}$ is large.
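Example 1 ($\delta$-function). Take $f$ to be the delta function at the origin. Then $A_\lambda f$ is supported on the sphere of radius $\sqrt{\lambda}$ and has height $N_d(\lambda)^{-1} \approx \lambda^{\frac{2-d}{2}}$. Thus, for $p, q \geq 1$,
$$ \frac{\|A_\lambda f\|_{\ell^q}}{\|f\|_{\ell^p}} = N_d(\lambda)^{\frac{1}{q} - 1} \approx \lambda^{-\frac{d-2}{2}\left(1 - \frac{1}{q}\right)}. $$
We have a similar bound for the dyadic maximal function:
$$ \frac{\bigl\|\sup_{\Lambda \leq \lambda < 2\Lambda} |A_\lambda f|\bigr\|_{\ell^q}}{\|f\|_{\ell^p}} \approx \Lambda^{\frac{d}{2q} - \frac{d-2}{2}}. $$
Consequently,
$$ \|A_\lambda\|_{\ell^p \to \ell^q} \gtrsim \lambda^{-\frac{d-2}{2}\left(1 - \frac{1}{q}\right)}. $$
Example 2 (big ball-function). Take $f$ to be the indicator function of a ball of radius $R\sqrt{\lambda}$. Then $A_\lambda f$ is supported on the ball of radius $(R+1)\sqrt{\lambda}$ and has height 1 for a large chunk of it. Therefore,
$$ \frac{\|A_\lambda f\|_{\ell^q}}{\|f\|_{\ell^p}} \approx (R\sqrt{\lambda})^{d\left(\frac{1}{q} - \frac{1}{p}\right)}. $$
The same estimates hold with the dyadic maximal function $\sup_{\Lambda \leq \lambda < 2\Lambda} |A_\lambda f|$ in place of $A_\lambda f$.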
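Here we immediately see that we must have $q \geq p$ to satisfy (3). Combining these two we see that for $1 \leq p \leq 2$:
$$ \|A_\lambda\|_{\ell^p \to \ell^{p'}} \gtrsim \max\Bigl\{ \lambda^{-\frac{d-2}{2p}}, \; \lambda^{-\frac{d}{2}\left(\frac{2}{p} - 1\right)} \Bigr\}. $$
Here and throughout, $p'$ is the dual exponent to $p$, which means for $1 \leq p \leq \infty$ that $\frac{1}{p} + \frac{1}{p'} = 1$. Example 1 actually reveals more since Young's inequality implies
$$ \|A_\lambda f\|_{\ell^\infty} \leq N_d(\lambda)^{-1} \|f\|_{\ell^1} \lesssim \lambda^{-\frac{d-2}{2}} \|f\|_{\ell^1}. $$
This bound extends to the dyadic maximal functions:
$$ \Bigl\|\sup_{\Lambda \leq \lambda < 2\Lambda} |A_\lambda f|\Bigr\|_{\ell^\infty} \lesssim \Lambda^{-\frac{d-2}{2}} \|f\|_{\ell^1}. $$
Interpolating this with the contraction inequality we obtain the following estimate which we call the trivial bound:
$$ \|A_\lambda f\|_{\ell^{p'}} \lesssim \lambda^{-\frac{d-2}{2}\left(\frac{2}{p} - 1\right)} \|f\|_{\ell^p} \quad \text{for } 1 \leq p \leq 2. \tag{6} $$
For large $\lambda$ the trivial bound is larger than the bounds from our examples. Curiously, the trivial bound shows that there exist $\ell^p(\mathbb{Z}^d)$-improving estimates which decay with $\lambda$. Therefore a better interpretation of Jim Wright's question is the following.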
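Question 2. For each $1 < p < 2$, what is the best exponent $\eta_p$ so that
$$ \|A_\lambda f\|_{\ell^{p'}(\mathbb{Z}^d)} \leq C_p \lambda^{-\eta_p} \|f\|_{\ell^p(\mathbb{Z}^d)}, $$
where the constant $C_p$ is independent of $\lambda \in \mathbb{N}$ and $f \in \ell^p(\mathbb{Z}^d)$?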
We also ask this question for the dyadic version.
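Question 3. For each $1 < p < 2$, what is the best exponent $\nu_p$ so that
$$ \Bigl\|\sup_{\Lambda \leq \lambda < 2\Lambda} |A_\lambda f|\Bigr\|_{\ell^{p'}(\mathbb{Z}^d)} \leq C'_p \Lambda^{-\nu_p} \|f\|_{\ell^p(\mathbb{Z}^d)}, $$
where the constant $C'_p$ is independent of $\Lambda \in \mathbb{N}$ and $f \in \ell^p(\mathbb{Z}^d)$?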
Theorem 1 addresses Question 2 when we are close to $p = 2$. In particular it says that in Question 2 we may take any $0 \leq \eta_p < \frac{d}{2}\left(\frac{2}{p} - 1\right)$ for $\frac{d+1}{d-1} \leq p < 2$. Comparing (2) to (6), we see that Theorem 1 improves upon the trivial bound. Recall that the trivial bound was given by a simple convexity estimate (in this case applying Young's inequality); consequently, Theorem 1 might be referred to as a subconvexity estimate in the argot of analytic number theorists.
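The convexity computation behind the trivial bound can be sketched as follows, assuming $d \geq 5$ so that $N_d(\lambda) \approx \lambda^{\frac{d-2}{2}}$. The interpolation parameter $\theta$ is notation introduced only for this sketch and does not appear elsewhere in the paper.

```latex
% Riesz-Thorin interpolation between the two endpoint bounds for A_\lambda:
%   \ell^1 \to \ell^\infty with norm N_d(\lambda)^{-1}   (Young's inequality),
%   \ell^2 \to \ell^2      with norm at most 1           (contraction).
% \theta is a hypothetical name for the interpolation parameter.
\begin{align*}
\frac{1}{p} &= \frac{\theta}{1} + \frac{1-\theta}{2}
  \quad\Longrightarrow\quad \theta = \frac{2}{p} - 1, \\
\frac{1}{q} &= \frac{\theta}{\infty} + \frac{1-\theta}{2}
  = 1 - \frac{1}{p} = \frac{1}{p'}, \\
\|A_\lambda\|_{\ell^p \to \ell^{p'}}
  &\leq \bigl(N_d(\lambda)^{-1}\bigr)^{\theta} \cdot 1^{1-\theta}
  \lesssim \lambda^{-\frac{d-2}{2}\left(\frac{2}{p} - 1\right)}.
\end{align*}
% Difference between the exponent of Theorem 1 and the trivial exponent:
\[
\frac{d}{2}\Bigl(\frac{2}{p} - 1\Bigr) - \frac{d-2}{2}\Bigl(\frac{2}{p} - 1\Bigr)
  = \frac{2}{p} - 1 .
\]
```

Under these assumptions, the improvement of Theorem 1 over the trivial bound amounts to a factor of roughly $\lambda^{-(\frac{2}{p} - 1) + \epsilon}$, which is a genuine gain for every $p < 2$ in the stated range.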
One can also ask for ℓ p -improving estimates for the full discrete spherical maximal function.
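Let $A_* f := \sup_{\lambda \in \mathbb{N},\, \lambda \text{ odd}} |A_\lambda f|$ denote the discrete spherical maximal function when $d = 4$, and let $A_* f := \sup_{\lambda \in \mathbb{N}} |A_\lambda f|$ denote the discrete spherical maximal function when $d \geq 5$.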
Question 4. When are there exponents $1 \leq p, q \leq \infty$ so that
$$ \|A_* f\|_{\ell^q(\mathbb{Z}^d)} \lesssim \|f\|_{\ell^p(\mathbb{Z}^d)}? \tag{7} $$
When $d \geq 5$, there are some obvious ranges for which one may obtain bounds. In particular, by applying Magyar-Stein-Wainger's theorem on the discrete spherical maximal function, see "Theorem" in [MSW02], we have that (7) holds for $d \geq 5$ and $\frac{d}{d-2} < q \leq \infty$ because $\|A_* f\|_{\ell^q} \lesssim \|f\|_{\ell^q}$ for all $q > \frac{d}{d-2}$ and the nesting property of $\ell^p$-spaces implies that $\|f\|_{\ell^q} \leq \|f\|_{\ell^p}$ for $p \leq q$. Moreover this range is sharp; this can be seen by considering the delta-function example. In four dimensions, the analogous Magyar-Stein-Wainger estimate is expected to hold.
Conjecture 1. If $p > 2$, then
$$ \|A_* f\|_{\ell^p(\mathbb{Z}^4)} \lesssim \|f\|_{\ell^p(\mathbb{Z}^4)}. $$
Interestingly we can prove the following $\ell^p$-improving result when $d = 4$, which would be a corollary of Conjecture 1 by the nesting properties of $\ell^p$-spaces.

1.2. Comparison to recent works.
Our examples show that (2) fails for $p < \frac{d+2}{d}$. So, one might expect that (2) would hold for all $\frac{d+2}{d} \leq p \leq 2$, which would go beyond Theorem 1. Intriguingly, Kesler-Lacey [KL18] showed that (2) fails for $p < \frac{d+1}{d-1}$. Moreover, [KL18] removed the $\epsilon$-loss in (2) for $\frac{d+1}{d-1} < p < \frac{d}{d-2}$. We encourage the reader to read Kesler-Lacey's interesting work [KL18] which appeared independently of this paper. Kesler-Lacey also considered $\ell^p(\mathbb{Z}^d)$-improving inequalities for discrete spherical averages. In [KL18] their focus lay upon using $\ell^p$-improving inequalities to deduce 'sparse bounds' for the discrete spherical maximal function. The question of sparse bounds is not considered here. However, we prove $\ell^p$-improving inequalities for a broader class of averages which are not considered in [KL18].
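The threshold $\frac{d+2}{d}$ can be recovered from the delta-function example. The following is a sketch under the assumptions that $d \geq 5$, that $N_d(\lambda) \approx \lambda^{\frac{d-2}{2}}$, and that (2) carries the decay exponent $\frac{d}{2}\left(\frac{2}{p} - 1\right)$ up to an $\epsilon$-loss, as in the discussion of Question 2.

```latex
% Test (2) on f = \delta_0, for which A_\lambda f = \sigma_\lambda
% and \|f\|_{\ell^p} = 1.
\begin{align*}
\|A_\lambda \delta_0\|_{\ell^{p'}}
  &= N_d(\lambda)^{\frac{1}{p'} - 1}
   = N_d(\lambda)^{-\frac{1}{p}}
   \approx \lambda^{-\frac{d-2}{2p}} .
\end{align*}
% (2) would require \lambda^{-\frac{d-2}{2p}} \lesssim
% \lambda^{\epsilon - \frac{d}{2}(\frac{2}{p} - 1)} for every \epsilon > 0
% and all large \lambda, which forces
\[
\frac{d-2}{2p} \geq \frac{d}{2}\Bigl(\frac{2}{p} - 1\Bigr)
\iff \frac{d}{2} \geq \frac{d+2}{2p}
\iff p \geq \frac{d+2}{d} .
\]
```

Since $\frac{d+2}{d} < \frac{d+1}{d-1}$ for every $d \geq 2$, the Kesler-Lacey obstruction quoted above is strictly stronger than the one coming from this example.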
As the reader may compare, the ideas in [KL18] for proving $\ell^p(\mathbb{Z}^d)$-improving inequalities for discrete spherical averages are very similar to those here. Their methods may appear more complicated due to the use of the Ramanujan bound from [Bou85,Hug14]. We point out that the Ramanujan bound first appeared in Bourgain's work on restriction of the parabola to the integer lattice to remove an $\epsilon$-loss there, and the Ramanujan bound was first adapted to the context of spherical averages in my work on discrete spherical maximal functions over sparse subsequences of radii; see [Hug14].
1.3. Overview of the proofs. For many discrete analogues in harmonic analysis not only are the statements of theorems analogous, but there is also an analogy between their proofs. We take a moment to describe this since this analogy does not appear to be explained in the literature.
For many problems in Fourier analysis such as Littman's theorem, one decomposes an operator into 'low' and 'high' frequencies and obtains bounds for these different pieces. For instance the Littlewood-Paley square function is one way to do this. There is a similar decomposition for analogous problems with an arithmetic flavor.
For problems over Z d instead of R d , all frequencies in the torus (R/Z) d are ostensibly low frequencies since the torus is compact. Unfortunately, this perspective is insufficiently nuanced to treat many problems. Instead one should recalibrate to the following: replace the sobriquets 'low frequencies' and 'high frequencies' of Fourier analysis with 'major arcs' and 'minor arcs' of the circle method respectively.
Recalibrating one's perspective to the above analogy allows us to import intuitions and paradigms from continuous Euclidean harmonic analysis to discrete Euclidean harmonic analysis via the circle method. By way of this analogy one sees that the circle method is akin to Littlewood-Paley theory. I encourage the reader to review the proofs of Littman's theorem before reading the proofs here to see this analogy in action.
1.4. Organization of the paper. The paper is organized as follows. Section 2 sets some notation used throughout the paper. Section 3 generalizes Theorem 1 to hypersurfaces defined by nice, positive definite homogeneous forms with integral coefficients. We prove Theorem 1 in Section 4 by making use of improved estimates for Kloosterman sums. Section 5 proves bounds for dyadic discrete k-spherical maximal functions. In Section 6 we prove Theorem 3. Section 7 concludes with a few questions. Finally, in the Appendix we record the range of ℓ p -boundedness of Magyar's maximal functions (which arise in Section 3) from [Mag02].

Notation
We introduce here some notation that will streamline our exposition.
• We write $f(\lambda) \lesssim g(\lambda)$ if there exists a constant $C > 0$ independent of all $\lambda$ under consideration (e.g. $\lambda$ in $\mathbb{N}$ or in $\Gamma_Q$) such that $|f(\lambda)| \leq C |g(\lambda)|$.
• Subscripts in the above notations will denote parameters, such as the dimension $d$ or degree $k$ of a form $Q$, on which the implicit constants may depend.
• $*$ denotes convolution on a group such as $\mathbb{Z}^d$, $\mathbb{T}^d$ or $\mathbb{R}^d$. It will be clear from context as to which group the convolution takes place.
• $e(t)$ will denote the character $e^{-2\pi i t}$ for $t \in \mathbb{R}$ or $\mathbb{T}$.
• For a function f :
• For a function f :
• For a ring $R$, we will use the inner product notation $b \cdot m$ for vectors $b, m \in R^d$ to mean the sum $\sum_{i=1}^{d} b_i m_i$. This is used for the rings $\mathbb{R}$, $\mathbb{Z}$, $\mathbb{T}$ and $\mathbb{Z}/q$ where $q \in \mathbb{Z}$.
• We also let $1_X$ denote the indicator function of the set $X$.

ℓ p -improving for Magyar's theorem
Our method is quite general, so we start by generalizing Theorem 2. Throughout this section, $Q(x)$, for $x = (x_1, \dots, x_d)$, will denote an integral, positive definite, homogeneous form, and $k$ will denote the degree of the form $Q$. We assume that $k \geq 2$ is a natural number. Let $V_Q(\mathbb{C}) := \{x \in \mathbb{C}^d : \nabla Q(x) = 0\}$ denote the (Birch) singular locus of the form $Q$. We will say that a homogeneous, integral form is non-singular if it satisfies Birch's criterion:
$$ d - \dim_{\mathbb{C}}(V_Q(\mathbb{C})) > (k-1) 2^k. \tag{10} $$
The notion of dimension '$\dim_{\mathbb{C}}(V_Q(\mathbb{C}))$' can be taken to be the algebraic dimension of the complex variety $V_Q(\mathbb{C})$. When (10) is satisfied, Birch [Bir61] tells us that there exists a positive constant $C_Q$ and an infinite arithmetic progression $\Gamma_Q$ in $\mathbb{N}$ depending on the form $Q$ so that
$$ C_Q^{-1} \lambda^{\frac{d}{k} - 1} \leq N_Q(\lambda) \leq C_Q \lambda^{\frac{d}{k} - 1} \quad \text{for all } \lambda \in \Gamma_Q, \qquad \text{where } N_Q(\lambda) := \#\{n \in \mathbb{Z}^d : Q(n) = \lambda\}. $$
Following Magyar [Mag02], we will call any such arithmetic progression $\Gamma_Q$ a set of regular values for $Q$. For each $\lambda \in \mathbb{N}$, $N_Q(\lambda)$ is finite because the hypersurface $\{n \in \mathbb{R}^d : Q(n) = \lambda\}$ is defined by a positive definite form, which implies that this hypersurface is compact. Consequently, the averages
$$ A^Q_\lambda f(x) := N_Q(\lambda)^{-1} \sum_{y \in \mathbb{Z}^d : Q(y) = \lambda} f(x-y) $$
make sense for all $\lambda \in \Gamma_Q$ and functions $f : \mathbb{Z}^d \to \mathbb{C}$. In this setting, our trivial bound (6) becomes This follows from Young's inequality and Birch's estimate for $N_Q(\lambda)$. In all of our dyadic maximal functions we will restrict to $\lambda \in \Gamma_Q$. For a non-singular, homogeneous, integral form define the parameters Throughout we assume that $d > k \geq 2$ with $d$ sufficiently large with respect to $k$ to satisfy the Birch-Magyar non-singularity criterion (10) so that $\gamma_Q > 0$ and $\kappa_Q > 2$. The following result gives an improvement over the trivial bound (11) when $p$ is close to 2.
Theorem 4. Let Q be a positive definite, non-singular, homogeneous, integral form in d variables of degree k and Γ Q a set of regular values for Q. If p := κ κ−1 , then for each ǫ > 0 there exists a constant C ǫ,Q,p independent of Λ ≥ 1 so that for all Λ ≥ 1. Recall that the supremum is restricted so that each λ is in Γ Q .
Note that our assumption that $\kappa > 2$ implies that $\kappa/(\kappa - 1) < 2$. Our theorem implies the following corollary which says that our dyadic maximal functions satisfy essentially sharp $\ell^p(\mathbb{Z}^d) \to \ell^{p'}(\mathbb{Z}^d)$-improving estimates for $p \leq 2$ with $p$ sufficiently close to 2.
Corollary 1. Let Q be a positive definite, non-singular, homogeneous, integral form in d variables of degree k and Γ Q a set of regular values for Q. If 2(1+γ Q ) 1+2γ Q ≤ p ≤ 2, then for each ǫ > 0 there exists a constant C ǫ,Q,p independent of Λ ≥ 1 so that for all Λ ≥ 1.
Our corollary follows by determining when 2 p − 1 − γ Q (2 − 2 p ) ≤ 0 and noting that γ Q < κ Q . This is a computation that we leave to the reader.
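The computation left to the reader can be sketched as follows; it only uses that $\gamma_Q > 0$ and rearranges the stated inequality for $p$.

```latex
% Solve 2/p - 1 - \gamma_Q (2 - 2/p) <= 0 for p, with \gamma_Q > 0.
\begin{align*}
\frac{2}{p} - 1 - \gamma_Q\Bigl(2 - \frac{2}{p}\Bigr) \leq 0
  &\iff \frac{2(1 + \gamma_Q)}{p} \leq 1 + 2\gamma_Q \\
  &\iff p \geq \frac{2(1 + \gamma_Q)}{1 + 2\gamma_Q} .
\end{align*}
% The threshold is strictly below 2 because 2(1 + \gamma_Q) < 2(1 + 2\gamma_Q).
```

This recovers the lower endpoint $\frac{2(1+\gamma_Q)}{1+2\gamma_Q}$ in the hypothesis of Corollary 1.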
The heavy lifting in our theorem lies in a decomposition of Magyar for the averages $A^Q_\lambda$.
Magyar 1 ([Mag07]). Let $Q(x) \in \mathbb{Z}[x]$ be a positive definite, non-singular, integral, homogeneous form, and $\Gamma_Q$ be a set of regular values for the form $Q$. For each $\lambda \in \Gamma_Q$ the averaging operator $A^Q_\lambda$ decomposes into the sum of two convolution operators, $A_\lambda = M^Q_\lambda + E^Q_\lambda$, such that for all $\Lambda \geq 1$, we have The implicit constant is independent of $\Lambda \geq 1$. The main term, $M^Q_\lambda$, is the sum of finitely many convolution operators, We remark that M
$$ |\widehat{d\sigma_Q}(\xi)| \lesssim_\epsilon (1 + |\xi|)^{1 - \kappa + \epsilon} \tag{17} $$
for each $\xi \in \mathbb{R}^d$ and for all $\epsilon > 0$.
The estimate (14) does not explicitly appear in [Mag02]. Instead it appears for a slightly different definition of our error term, so we briefly indicate how one obtains it. Estimate (14) is encoded in the proofs of Propositions 3 and 4 in [Mag02]. One key difference in this paper is that our main term M Q λ is a finite sum depending on λ, and so we do not need (2.17) of Proposition 4 in [Mag02]. Meanwhile the estimates (2.15) and (2.16) of Proposition 4 in [Mag02] are superior to the minor arc estimate of Proposition 3 in [Mag02]. Therefore the minimal exponent which defines γ Q comes from the minor arc estimate.
Magyar's theorem gives us ℓ 2 -estimates, but we are interested in ℓ p → ℓ p ′ -estimates. We will interpolate Magyar's ℓ 2 -estimates with appropriate ℓ 1 → ℓ ∞ -estimates to deduce the following lemmas which when added together immediately yield Theorem 4. Our lemma for the main term is the following.
Lemma 1. Let $Q(x) \in \mathbb{Z}[x]$ be a positive definite, non-singular, integral, homogeneous form satisfying (10), and $\Gamma_Q$ be a set of regular values for the form $Q$. If $p := \frac{\kappa}{\kappa - 1}$ (which is less than 2 by our assumption on $\kappa$), then Up to a factor of $\Lambda^\epsilon$ the bound for our main term is the size we expect for our averages in the range $\frac{\kappa}{\kappa - 1} \leq p \leq 2$. Unfortunately the bound for the error term is much worse.
Lemma 2. Let $Q(x) \in \mathbb{Z}[x]$ be a positive definite, non-singular, integral, homogeneous form satisfying (10), and $\Gamma_Q$ be a set of regular values for the form $Q$. If $1 \leq p \leq 2$, then for all $\epsilon > 0$, for all $\Lambda \geq 1$.
The proofs of these lemmas are motivated by proofs of Littman's theorem and its variants. Proofs of Littman's theorem often proceed by frequency decomposing the spherical average into pieces and finding L 1 → L ∞ and L 2 → L 2 estimates with which to interpolate.
In particular, we use a restricted weak-type argument that was used by Bourgain for the Euclidean spherical maximal function, by Ionescu [Ion04] for the discrete spherical maximal function and also by Hu-Li for discrete restriction to the sphere in [HL14]. This is the same strategy that we follow in our proofs of these lemmas. In the next subsections we will deduce our lemmas.
3.1. Proof of Lemma 1. Let K a/q λ denote the kernel (with domain Z d ) associated to the convolution operator M Q,a/q λ . We start our proof by establishing an identity for these kernels.
Proposition 1. Let $Q(x) \in \mathbb{Z}[x]$ be a positive definite, non-singular, integral, homogeneous form satisfying (10), and $\Gamma_Q$ be a set of regular values for the form $Q$. If $1 \leq a < q < \infty$ with $(a, q) = 1$, then
Fix a form $Q$ satisfying the hypotheses of the proposition. We drop the dependence on $Q$ in our notation in order to simplify it.
By Fourier inversion, our kernel is The second equality follows since there is only a single term in the sum for each $\xi$, while the third follows from writing every $m \in \mathbb{Z}^d$ as $qn + b$ for some $n \in \mathbb{Z}^d$ and a representative $b \in \{0, 1, \dots, q-1\}^d$ which we identify with $(\mathbb{Z}/q)^d$. Using the well-known translation, modulation and dilation symmetries of the (inverse) Fourier transform, we have Identity (20) immediately follows since We leave this calculation to the reader since it is just the inverse $(\mathbb{Z}/q)^d$-Fourier transform of the Gauss sum, and the Gauss sum is the $(\mathbb{Z}/q)^d$-Fourier transform of the function $e\left(\frac{aQ(x)}{q}\right)$.

Now that we know the structure of our kernel we will use a Littlewood-Paley decomposition and a circle method decomposition to arbitrage ℓ 1 → ℓ ∞ and ℓ 2 → ℓ 2 estimates to deduce Lemma 1. In particular, the following lemma is motivated by the decompositions in [Ion04] when k = 2 and [Hug17] when k ≥ 3. With this in mind, we introduce a low-high frequency decomposition in the analytic aspect.
For ∆ ∈ (0, 1), define the low frequency piece by the Fourier multiplier
Lemma 3. Suppose that ∆ ∈ (0, 1). Let $Q(x) \in \mathbb{Z}[x]$ be a positive definite, non-singular, integral, homogeneous form satisfying (10), and $\Gamma_Q$ be a set of regular values for the form $Q$. If 1 ≤ a < q Λ 1/k with (a, q) = 1, then each major arc piece M a/q,Q = M a/q,Q,low + M a/q,Q,high decomposes into a low frequency and high frequency piece such that
Proof. The $\ell^2 \to \ell^2$-estimate is proved using Bourgain's $L^2$-estimates from [Bou85] as in [Ion04,Hug17]. We only sketch the proof of the $\ell^1 \to \ell^\infty$-estimate. When $Q(x) := \sum_{i=1}^{d} |x_i|^2$, we have the following known bound for the continuous spherical measure See for instance page 1415 of [Ion04] where (23) is used to bound the discrete spherical maximal function, or (5.5.12) of [Gra08] for its derivation. The estimate (23) also holds for our varieties $\{x \in \mathbb{R}^d : Q(x) = 1\}$ since the proof only relies on the dimensionality of the measure $d\sigma_Q$. To be precise all we require is that This implies that for each $x \in \mathbb{Z}^d$ we have
Remark 3.1. Estimates (17) and (24) and a Euclidean version of the low-high decomposition suffice to prove Euclidean versions of Theorem 4 generalizing Littman's theorem.
We now return to the proof of Lemma 1. Fix Λ ≥ 1. Let X be a fixed, finite subset of Z d . Our first observation is that estimate (21) implies that for Λ ≤ λ < 2Λ.
by taking ∆ Λ −1/k and summing over the moduli 1 ≤ q Λ 1/k . The implicit bound is independent of the set X. Consequently we are reduced to proving Since |{x ∈ Z d : sup the inequalities of Lemma 3 combine to imply that The inequality (27) shows that we want to choose $Q^{1/2} \Delta^{\kappa - \frac{3}{2}} \approx Q^{2 - \kappa}$, which is $\Delta \approx Q^{-1}$; we now make this assumption. Our restriction then takes the form Q 2 Λ −d/k |X| T which is consistent with our reduction (26). Choosing Q 2 Λ d/k T |X| −1 , we deduce that for all 0 < T ≤ 1 as desired.
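The balancing step in the proof above can be checked directly. This is a sketch under the assumption that the relation between the two sides is equality up to constants, and with the reminder that in this step $Q$ denotes the cutoff parameter of the argument rather than the form.

```latex
% Balance the two contributions; \kappa > 2 is assumed throughout Section 3.
\begin{align*}
Q^{1/2} \Delta^{\kappa - \frac{3}{2}} = Q^{2 - \kappa}
  &\iff \Delta^{\kappa - \frac{3}{2}} = Q^{\frac{3}{2} - \kappa}
        = \bigl(Q^{-1}\bigr)^{\kappa - \frac{3}{2}} \\
  &\iff \Delta = Q^{-1}
  \qquad \bigl(\text{since } \kappa - \tfrac{3}{2} > 0 \text{ when } \kappa > 2\bigr).
\end{align*}
```

The positivity of $\kappa - \frac{3}{2}$ is what allows taking roots on both sides, so the choice $\Delta \approx Q^{-1}$ is forced by the balance.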
3.2. Proof of Lemma 2. The proof of Lemma 2 follows by interpolating the error term estimate (14) in Magyar's theorem with estimate (28) below.
Proposition 2. We have the following ℓ 1 → ℓ ∞ -improving estimate for the error term: Proof. Since E λ = A λ − M λ , the trivial bound (11) for the dyadic maximal function and our bound (25) for the main term imply (28).

The discrete spherical averages
In this section we refine our main term analysis from Section 3 in order to prove Theorem 1. We now recall a decomposition of Magyar which, for the discrete spherical averages, improves upon the error term bound (14) in Magyar 1.
Therefore (40) is in general sharp for all $1 \leq p \leq 2$, and another way to interpret Theorem 1 is that the discrete sphere smooths almost as well as the discrete ball which contains the same number of points as the sphere in the range $\frac{d+1}{d-1} \leq p < 2$. This raises a few questions.
Question 5 (Michael Fryers). For a random subset X what is the best constant in the inequality (40)?
Question 6 (Michael Fryers). Is (40) still true if we take X to be a perturbation of the discrete spherical measures σ λ ?
Question 7. What are the extremizers satisfying (40) for all 1 ≤ p ≤ 2? Do they resemble the ball?
Appendix A. $\ell^p(\mathbb{Z}^d)$-boundedness for Magyar's theorem
Theorem 4 of [Mag02] is only stated for $\ell^2(\mathbb{Z}^d)$. Since it has come up in conversation on multiple occasions, we record the range of $\ell^p(\mathbb{Z}^d)$-boundedness to which Magyar's theorem extends and briefly indicate how. Standard density arguments extend the range of Magyar's pointwise ergodic theorem (Theorem 3 in [Mag02]) to the same range of $L^p$-spaces as $\ell^p(\mathbb{Z}^d)$-spaces below.
As in Section 3, we have $A_\lambda = M_\lambda + E_\lambda$. This implies that for each function f : and Our strategy is then to give sufficiently decaying $\ell^p$-bounds on $M^{a/q}_*$ to sum over $a$ and $q$, and to prove $\ell^p$-boundedness of $E_*$ through its dyadic counterparts.
Summing over a and q immediately yields the following corollary. We see that we only need 2 − κ(2 − 2 p ) < 0 which is when p > κ κ−1 .
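The range of $p$ in this condition can be verified by a short rearrangement. The sketch below assumes only $\kappa > 1$, which holds under the standing assumption $\kappa > 2$.

```latex
% Solve 2 - \kappa (2 - 2/p) < 0 for p, with \kappa > 1.
\begin{align*}
2 - \kappa\Bigl(2 - \frac{2}{p}\Bigr) < 0
  &\iff \frac{2\kappa}{p} < 2\kappa - 2 \\
  &\iff p > \frac{\kappa}{\kappa - 1} .
\end{align*}
```

This is the same endpoint $\frac{\kappa}{\kappa - 1}$ that appears as the value of $p$ in Theorem 4 and Lemma 1.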
Corollary 2. Let Q(x) ∈ Z[x] be a non-singular, integral, homogeneous form satisfying (10), and Γ Q be a set of regular values for the form Q. If κ κ−1 < p ≤ 2, then (42) sup Our lemma for the error term is the following.
Lemma 9. Let Q(x) ∈ Z[x] be a non-singular, integral, homogeneous form satisfying (10), and Γ Q be a set of regular values for the form Q. If 1 ≤ p ≤ 2, then for all ǫ > 0, Summing (43) over Λ = 2 j for integers j ≥ 0 immediately yields the following corollary.
We resume the proof of Lemma 8. By (20) of Proposition 1 we have that M a/q * ℓ 1 (Z d )→ℓ 1 (Z d )
Since k ≥ 2, we have that