An exponential inequality for orthomartingale difference random fields and some applications

In this paper, we establish an exponential inequality for random fields, which we apply to convergence rates in the law of large numbers and to the Hölderian weak invariance principle.

1. Statement of the exponential inequality

1.1. Goal and motivations. Understanding the asymptotic behavior of sums of random variables is an important topic in probability. A useful tool for establishing limit theorems is a control of the probability that the partial sums, or the maximum of the absolute values of the partial sums, exceed a fixed number. A direct application gives convergence rates in the law of large numbers, and such an inequality can also be used to check tightness criteria in some functional spaces.
In this paper, we are concerned with establishing and applying an exponential inequality for the so-called orthomartingale difference random fields introduced in [3]. We make an assumption of invariance in law of the partial sums, which is weaker than stationarity. We control the tail of the maximum of the partial sums on rectangles of N^d by two quantities: an exponential term and a term involving the common distribution function of the increments. We then formulate two applications of the result. The first deals with convergence rates in the strong law of large numbers. The second concerns the invariance principle in Hölder spaces, which is restricted to the case of stationary random fields for technical reasons.
Orthomartingales are well suited to summation over rectangles of Z^d, because a one-dimensional martingale property can be used when summing along each fixed coordinate. Moreover, stationary random fields can be approximated by this class of martingales, as in [4,13,24,35]. However, for rates in the law of large numbers and for the functional central limit theorem in Hölder spaces, only a few results are available in the literature. Indeed, rates in the law of large numbers for orthomartingales with polynomial moments have been given in [15,19,20], but the question of exponential moments seems to have been addressed only for martingale difference sequences. As for the functional central limit theorem in Hölder spaces, the i.i.d. case was treated for moduli of regularity of the form t^α, 0 < α < 1/2, in the case of random fields, and for more general moduli in the case of sequences. It turns out that the inequality presented in this paper is a well-suited tool for dealing with these problems.
A key ingredient for the aforementioned limit theorems is an exponential inequality for orthomartingale random fields, that is, an inequality involving an exponential term and the tail of the common distribution of the increments. At first glance, it may seem that multi-indexed martingales can be treated like sequences, up to some technical and notational obstacles. However, it turns out that the standard tools for proving deviation inequalities for martingales, such as martingale transforms or exponential supermartingales, do not extend easily. Nevertheless, it is possible to apply induction arguments when the random field satisfies good properties.
The paper is organized as follows. In Subsection 1.2, we state the definition of orthomartingale difference random fields and an exponential inequality for such random fields, under an assumption of invariance of the law of the sums on rectangles by translation. Subsection 2.1 (respectively 2.2) provides an application to the rates in the strong law of large numbers (respectively, to the invariance principle in Hölder spaces). Section 3 is devoted to the proofs of the previously stated results.
1.2. Exponential inequality for orthomartingales. We start by defining the concept of orthomartingale. Before that, we need to introduce the notion of filtration with respect to the coordinatewise order: for i, j ∈ Z^d, we say that i ≼ j if i_q ≤ j_q for each q ∈ {1, . . ., d} =: [d].
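In code, the coordinatewise order is a one-liner (an illustrative sketch; the helper name `leq` is ours):

```python
# Coordinatewise (partial) order on Z^d: i ≼ j iff i_q <= j_q for every q.
def leq(i, j):
    """Return True if i precedes j in the coordinatewise order."""
    return all(iq <= jq for iq, jq in zip(i, j))

# The order is only partial: some pairs are incomparable.
print(leq((1, 2), (3, 2)))   # True:  1 <= 3 and 2 <= 2
print(leq((1, 5), (3, 2)))   # False: 5 > 2
print(leq((3, 2), (1, 5)))   # False: the pair is incomparable
```
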
Definition 1.1. A filtration indexed by Z^d on a probability space (Ω, F, P) is a collection of σ-algebras (F_i)_{i∈Z^d} such that for each i ≼ j, the inclusion F_i ⊂ F_j takes place.

Definition 1.2. We say that the filtration (F_i)_{i∈Z^d} is commuting if for every integrable random variable Y and all i, j ∈ Z^d, the equality

E [ E [ Y | F_i ] | F_j ] = E [ Y | F_{i∧j} ]

takes place, where the minimum i∧j is taken coordinatewise.
Example 1.3. Let (ε_j)_{j∈Z^d} be an i.i.d. random field and let F_i := σ(ε_j, j ≼ i). Then the filtration (F_i)_{i∈Z^d} is commuting.

Example 1.4. Let (ε^{(q)}_k)_{k∈Z}, q ∈ [d] := {1, . . ., d}, be independent copies of an i.i.d. sequence (ε_k)_{k∈Z} and let F_i := σ(ε^{(q)}_{k_q}, k_q ≤ i_q, 1 ≤ q ≤ d). Then the filtration (F_i)_{i∈Z^d} is commuting, and connected to decoupled U-statistics.

Definition 1.5. We say that the random field (X_i)_{i∈Z^d} is an orthomartingale difference random field with respect to the commuting filtration (F_i)_{i∈Z^d} if for each i ∈ Z^d, X_i is F_i-measurable, integrable and E [ X_i | F_{i−e_q} ] = 0 for all q ∈ [d], where e_q is the q-th vector of the canonical basis of R^d.
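A concrete orthomartingale difference random field can be sketched numerically via the product construction used later in Remark 1.8 (a minimal d = 2 sketch; variable names are ours):

```python
import random

# X_{(i,j)} = eps1_i * eps2_j for two independent Rademacher sequences.
# Conditioning on F_{i-e_q} kills the q-th factor, so E[X_i | F_{i-e_q}] = 0.
random.seed(0)
n = 8
eps1 = [random.choice([-1, 1]) for _ in range(n)]
eps2 = [random.choice([-1, 1]) for _ in range(n)]
X = [[eps1[i] * eps2[j] for j in range(n)] for i in range(n)]

def S(n1, n2):
    """Partial sum of X over the rectangle [1, n1] x [1, n2]."""
    return sum(X[i][j] for i in range(n1) for j in range(n2))

# The partial sums over rectangles factorize coordinatewise, which is what
# makes the one-dimensional martingale property usable in each direction:
ok = all(
    S(n1, n2) == sum(eps1[:n1]) * sum(eps2[:n2])
    for n1 in range(1, n + 1)
    for n2 in range(1, n + 1)
)
print(ok)  # True
```

The factorization holds for any realization of the signs, which is why the check does not depend on the seed.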
Theorem 1.6. Let (X_i)_{i∈Z^d} be an orthomartingale difference random field such that for all n ∈ N^d and all k ∈ Z^d, the sum over the rectangle translated by k has the same law as S_n := Σ_{1≼j≼n} X_j (condition (1.3)). Then inequality (1.4) holds for all x, y > 0, with constants A_d, B_d and C_d depending only on d and with p_d = 2d.

Remark 1.8. The exponent 2/d in the exponential term of (1.4) cannot be improved, even in the bounded case. To see this, consider d mutually independent i.i.d. Rademacher sequences (ε^{(q)}_i)_{i≥1}, 1 ≤ q ≤ d, each taking the values 1 and −1 with probability 1/2, and set X_i := ∏_{q=1}^d ε^{(q)}_{i_q}. The vector of normalized one-dimensional partial sums converges in distribution to (N_q)_{q=1}^d, where the N_q are independent and each N_q has a standard normal distribution. Therefore, if f is a function such that P{ max_{1≼i≼n} |S_i| > x |n|^{1/2} } ≤ f(x) for each orthomartingale difference random field bounded by 1 satisfying (1.3) and each x > 0, then letting Y := ∏_{q=1}^d |N_q|, the inequality P{Y > x} ≤ f(x) should hold for all x. Since the L^p-norm of Y behaves like p^{d/2}, the function f cannot decay as fast as exp(−K x^γ) for any K > 0 and γ > 2/d.

Remark 1.9. The logarithmic factor in the right-hand side of (1.4) appears naturally through iterations of weak-type estimates of the form x P{X > x} ≤ E [X 1{Y > x}] for some random variable Y, which give a control of the tail of X in terms of that of Y. Consequently, the logarithmic factor does not seem to be avoidable with this method of proof. We do not know whether it can be removed.
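The moment growth invoked in Remark 1.8 can be checked numerically. Using the classical closed form E|N|^p = 2^{p/2} Γ((p+1)/2)/√π (computed in log scale to avoid overflow; the helper names are ours), the ratio ‖Y‖_p / p^{d/2} stabilizes, confirming the growth rate p^{d/2}:

```python
from math import lgamma, log, pi, exp

def log_abs_moment(p):
    """log E|N|^p for a standard normal N, via the Gamma-function formula."""
    return (p / 2) * log(2) + lgamma((p + 1) / 2) - 0.5 * log(pi)

def ratio(p, d):
    """||Y||_p / p^{d/2} for Y = |N_1|...|N_d|, computed in log scale."""
    # Independence gives E[Y^p] = (E|N|^p)^d, hence ||Y||_p = exp(d/p * log E|N|^p).
    return exp(d * log_abs_moment(p) / p - (d / 2) * log(p))

d = 3
for p in (10, 100, 1000, 10000):
    print(p, round(ratio(p, d), 4))
print(round(exp(-d / 2), 4))  # the limiting value e^{-d/2} of the ratio
```

The ratio decreases toward e^{-d/2}, so ‖Y‖_p is of exact order p^{d/2}, which is what forbids a tail lighter than exp(−K x^{2/d}).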

2. Applications to limit theorems
2.1. Convergence rates in the law of large numbers. A centered sequence (X_i)_{i≥1} satisfies the strong law of large numbers if (n^{−1} Σ_{i=1}^n X_i)_{n≥1} converges almost surely to zero. This is for example the case for a strictly stationary ergodic sequence such that X_1 is integrable and centered. Then arises the question of evaluating the speed of convergence, namely, of finding bounds for the large deviation probabilities P{ n^{−1} |Σ_{i=1}^n X_i| > ε }. This question has been treated in the independent case in [7,17,31] under conditions on the L^p-norm of X_i. The case of martingale differences has also been addressed: under boundedness of moments of order p and a Cramér-type condition sup_{i≥1} E[exp(|X_i|)] < +∞ in [21], under a conditional Cramér condition in [22], and under finite exponential moments in [8].
For random fields, one can consider the analogous large deviation probabilities for the maximum of the partial sums over the rectangle [1, n], normalized by the number of elements |n| of the rectangle. Results for orthomartingales with polynomial moments have been given in [15,19,20].
Theorem 2.1. Let (X_i)_{i∈Z^d} be an orthomartingale difference random field satisfying (1.3). Suppose that for some γ > 0, E[exp(|X_1|^γ)] < +∞. Then for each positive x, the inequality P{ max_{1≼i≼n} |S_i| > |n| x } ≤ C_{1,d,γ} exp( −C_{2,d,γ} x^{2γ/(2+dγ)} |n|^{γ/(2+dγ)} ) takes place, where C_{1,d,γ} and C_{2,d,γ} depend only on d and γ.

2.2. Application to the Hölderian invariance principle.
Given a sequence of random variables (X_i)_{i≥1}, a way to understand the asymptotic behavior of the partial sums S_n = Σ_{i=1}^n X_i is to consider the associated partial sum process (W_n)_{n≥1}. By Donsker's theorem, when (X_i)_{i≥1} is i.i.d., centered and has unit variance, the sequence (W_n)_{n≥1} converges in law in C[0,1] to a standard Brownian motion. The result has been extended to strictly stationary martingale difference sequences in [1,18]. Numerous papers then treated the case of weakly dependent strictly stationary sequences; see [23] and the references therein for an overview.
There are two other possible directions of extension of such a result. The first one is to consider other functional spaces, in order to establish the convergence of (F(W_n))_{n≥1} for a larger class of functionals than the continuous functionals F : C[0,1] → R. In other words, we view W_n as an element of a Hölder space H_ρ with modulus of regularity ρ and investigate the convergence of (W_n)_{n≥1} in this function space. Applications to epidemic changes can be given, for example in [28]. The question of the invariance principle in Hölder spaces has been treated for i.i.d. sequences for moduli of regularity of the form t^α in [26], and also for more general ones, of the form t^{1/2} log^β(ct), in [27]. Some results are also available for stationary weakly dependent sequences, for example mixing sequences [11,16], or by the use of a martingale approximation [10,12].
A second development of these invariance principles is the consideration of partial sum processes built on random fields. Given a random field (X_i)_{i∈Z^d}, one can define a random function W_n on [0,1]^d by attaching to each increment X_i the weight λ([0, n·t] ∩ R_i), where λ denotes the Lebesgue measure on R^d and R_i is the unit cube with upper corner i. Convergence of W_n in the space of continuous functions has been investigated in [34] for an i.i.d. random field, in [6] for martingales with respect to the lexicographic order, and in [32,33] for orthomartingales.
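The construction of W_n can be sketched numerically. The cell weights λ([0, n·t] ∩ R_i) below follow the standard construction (a d = 2 sketch with our own helper names; the grid identity it checks is the one discussed in the next paragraph):

```python
import random

# W_n(t) = |n|^{-1/2} * sum_i lambda([0, n*t] ∩ R_i) X_i on [0,1]^2,
# where R_i = (i_1 - 1, i_1] x (i_2 - 1, i_2].
random.seed(1)
n1, n2 = 4, 5
X = [[random.gauss(0, 1) for _ in range(n2)] for _ in range(n1)]
norm = (n1 * n2) ** 0.5

def W(t1, t2):
    total = 0.0
    for i in range(1, n1 + 1):
        for j in range(1, n2 + 1):
            # Lebesgue measure of [0, n.t] ∩ R_{(i,j)}, coordinate by coordinate
            w1 = min(max(n1 * t1 - (i - 1), 0.0), 1.0)
            w2 = min(max(n2 * t2 - (j - 1), 0.0), 1.0)
            total += w1 * w2 * X[i - 1][j - 1]
    return total / norm

def S(k1, k2):
    return sum(X[i][j] for i in range(k1) for j in range(k2))

# At grid points t = (k1/n1, k2/n2), W_n recovers every normalized partial sum:
ok = all(
    abs(W(k1 / n1, k2 / n2) - S(k1, k2) / norm) < 1e-9
    for k1 in range(1, n1 + 1) for k2 in range(1, n2 + 1)
)
print(ok)  # True
```

Between grid points, W interpolates multilinearly, which is the source of the Lipschitz continuity of the sample paths.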
In this section, we study the convergence of W_n in some Hölder spaces for orthomartingale random fields. First, observe that for each ω, the map t ↦ W_n(t)(ω) is Lipschitz-continuous, and that W_n((k_q/n_q)_{q=1}^d) = S_{k_1,...,k_d}/|n|^{1/2}, so that W_n takes into account the values of all the partial sums S_k for 1 ≼ k ≼ n. As pointed out in the i.i.d. case in [34], the finite dimensional distributions of W_n converge to those of a standard Brownian sheet, that is, a centered Gaussian process (W(t))_{t∈[0,1]^d} whose covariance function is given by E[W(s)W(t)] = ∏_{q=1}^d min(s_q, t_q), provided that X_1 is centered and has unit variance. Given an increasing continuous function ρ : [0,1] → R such that ρ(0) = 0, let H_ρ be the space of all functions x from [0,1]^d to R with finite ρ-Hölder norm. Since the sample paths of the Brownian sheet fail to be ρ-Hölder continuous when ρ is too fine (for example ρ(t) = t^{1/2}), it is not possible to expect the convergence of (W_n)_{n≥1} in all possible Hölder spaces, and some restrictions have to be made. In order to treat a class of moduli of regularity larger than the power functions, we need to introduce slowly varying functions. We say that a function L : (0,+∞) → (0,+∞) is slowly varying if for all c > 0, the quantity L(ct)/L(t) goes to 1 as t goes to infinity. For example, functions which behave asymptotically as a power of the logarithm are slowly varying. We now define the moduli of regularity, and the associated Hölder spaces of interest from the point of view of the convergence of the partial sum process W_n.
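Slow variation is easy to observe numerically; powers of the logarithm, mentioned above, are the typical example (the function name `L` mirrors the text's notation):

```python
from math import log

# Slow variation: L(ct)/L(t) -> 1 as t -> infinity for every fixed c > 0.
def L(t, beta=2.0):
    return log(t) ** beta

for t in (1e2, 1e6, 1e12):
    print(round(L(100 * t) / L(t), 4))  # prints 4.0, then 1.7778, then 1.3611
# The ratio approaches 1 (slowly), whereas for a power t^a it would equal 100^a.
```
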

Definition 2.2. Let d ≥ 1 be an integer. We say that ρ belongs to the class R_{1/2,d} if ρ is increasing on [0,1] and can be written in the form (2.9) for some slowly varying function L such that L(t) → ∞ as t goes to infinity and some constant c.
It seems that the exponent d/2 is the best we can get in view of proving tightness via the deviation inequality we established. It may not be optimal in some cases: for example, if (X_i)_{i∈Z^d} is centered, one can get an inequality similar to (1.4), but with the exponent 2 instead of 2/d in the right-hand side. We now give a sufficient condition for tightness of the partial sum process associated to a strictly stationary random field, that is, a random field (X_i)_{i∈Z^d} such that for each integer N and all i_1, . . ., i_N, j ∈ Z^d, the vectors (X_{i_k+j})_{k=1}^N and (X_{i_k})_{k=1}^N have the same distribution. It turns out to be more convenient to consider maxima indexed by dyadic elements of Z^d; in order to avoid confusion, we will use notations of the form 2^{m_k} instead of 2^{n_k}.
A similar tightness criterion was used in [11] for sequences, giving optimal results for mixing sequences and allowing one to recover the optimal result for i.i.d. sequences. For random fields, the inequality (1.4) we obtained and that of Theorem 1.13 in [15] are appropriate tools to check (2.10).
No assumption other than stationarity is made, but of course, some restriction on the dependence will be required for this condition to be satisfied, since one needs a good control of the tails of the partial sums on rectangles, normalized by the square root of the number of elements of the rectangle. We now state a result on the weak convergence of (W_n)_{n≥1} in the space H_ρ, where ρ ∈ R_{1/2,d}.
Theorem 2.4. Let d ≥ 1 and let (X_i)_{i∈Z^d} be a strictly stationary orthomartingale difference random field. Let ρ be an element of R_{1/2,d} given by (2.9), where L is slowly varying. Assume moreover that the moment condition (2.11) holds. Then (W_n)_{n≥1} converges in distribution in H_ρ^o to W, where W is a standard Brownian sheet.

3. Proofs
3.1. Proof of Theorem 1.6. The proof proceeds by induction on the dimension d. For d = 1, we need the following deviation inequality for one-dimensional martingales, which is a combination of Theorem 2.1 in [9] and Theorem 6 in [30].
Proposition 3.1 (Proposition 2.1 in [14]). Let (D_i)_{i≥1} be a martingale difference sequence with respect to a filtration (F_i)_{i≥0}. Suppose that E[D_i^2] is finite for all i ≥ 1, and that there exists a nonnegative random variable Y whose tail dominates the conditional tails of the increments D_i. Then for all x, y > 0 and each n ≥ 1, the stated deviation inequality holds.

Assume now that Theorem 1.6 holds in dimension d − 1, for some d ≥ 2. We have to prove (1.4) for all d-dimensional orthomartingale difference random fields satisfying (1.3). We first get rid of the maximum over the coordinate d, apply the inequality of dimension d − 1 to the obtained orthomartingale, and we are then reduced to controlling the tails of a one-dimensional martingale. More concretely, the induction step is done as follows.
(1) Step 1: let M := max_{1≼i≼n} |S_i| and let M′ be an auxiliary maximum not involving the last coordinate; we show that the tail of M is controlled by that of M′.
(2) Step 2: the tails of M′ are controlled by applying the result for (d − 1)-dimensional random fields.
(3) Step 3: it remains to control the tails of Σ_{i_d=1}^{n_d} X_{1,...,1,i_d}, which can be done by using the one-dimensional result.
The proof will be quite similar to that of Theorem 1.1 in [14]. The latter gave an exponential inequality in the same spirit for U-statistics, that is, sums of terms of the form h(X_1, . . ., X_r) where (X_i)_{i≥1} is i.i.d. The connection with orthomartingales is the following: first, using decoupling (see [5]), the tail of a U-statistic is compared with that of its decoupled version, whose summands form an orthomartingale difference random field with respect to a commuting filtration generated by independent sequences, as in the example above.
Let us now go into the details of the proof. Let x, y > 0. We can assume without loss of generality that x/y > 3^d. Indeed, suppose that we have shown the existence of constants A′_d, B_d and C_d such that the stated inequality holds for each orthomartingale difference random field (X_i)_{i∈Z^d} satisfying (1.3) and all x, y > 0 with x/y > 3^d. Then, enlarging A′_d if necessary, the exponential term exceeds 1 when x/y ≤ 3^d, so that the desired inequality becomes trivial in this range of parameters.
(1) Step 1. Let M and M_N be defined by M := max_{1≼i≼n} |S_i|, n = (n_1, . . ., n_d), and let M_N be the corresponding maximum with the last coordinate running up to N. Define the filtration (G_N)_{N≥1} by G_N := F_{n_1,...,n_{d−1},N}. We check that (M_N)_{N≥1} is a submartingale with respect to (G_N)_{N≥1}; this follows from the orthomartingale property. By Doob's inequality, expressing the resulting expectation as an integral of the tail and cutting this integral at x/2, we obtain a bound for P{M > x} involving the tail of M′.
(2) Step 2. The random field appearing in M′ is an orthomartingale difference random field with respect to the commuting filtration (F_{i_1,...,i_{d−1},n_d})_{i_1,...,i_{d−1}∈Z} satisfying (1.3). We thus apply the induction hypothesis. In view of (3.10) and x ≥ y, we get the estimate (3.20), where the involved quantity is given by (3.21). Observing that 1 + 2 ln(uv) can be factorized conveniently for u, v > 1, and defining suitable auxiliary functions f_a, we derive from (3.20) a bound on P{M > x}. Performing, for fixed u, v ≥ 1, the substitution t = w f_{d/2}(u) f_{1/2}(v), we are reduced to showing that a certain double integral is bounded by a constant K for each t ≥ 1. Using the fact that there exists a constant c such that f_{d/2}(u) ≥ c for each u ≥ 1, and observing that the integral over u is convergent, we are reduced to bounding the integral over v; this follows since there exists a constant κ controlling the integrand for v ≥ 1, and the relation p_d = p_{d−1} + 2 ends the proof of Theorem 1.6.

3.2. Proof of Theorem 2.1. We apply Theorem 1.6 with y = |n|^{1/(2+dγ)} x^{2/(2+dγ)}. Bounding the resulting integral term by a constant times exp(−y^γ), one sees that with this choice, the two terms of the right-hand side of (1.4) have a similar contribution.
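The choice of y above balances the two terms of (1.4): the exponential term decays like exp(−B (·)^{2/d}) while the integral term, under the exponential moment condition, decays like exp(−y^γ). Leaving the |n| factor out of this sketch, the two exponents coincide exactly, which the following elementary check confirms (helper names are ours):

```python
# With y = x^{2/(2+d*gamma)}, one has (x/y)^{2/d} = x^{2*gamma/(2+d*gamma)} = y^gamma,
# so both terms of the bound contribute at the same order.
def exponents(x, d, gamma):
    y = x ** (2.0 / (2.0 + d * gamma))
    return (x / y) ** (2.0 / d), y ** gamma

for x in (10.0, 1e3, 1e6):
    a, b = exponents(x, d=3, gamma=0.5)
    print(abs(a - b) <= 1e-9 * max(a, b))  # True each time
```
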

3.3. Proof of Proposition 2.3. The proof of the tightness criterion rests on the Schauder decomposition of the spaces H_ρ^o. In order to state it, we need to introduce the following notations. For j ≥ 0, let V_j denote the set of dyadic points of level j. For v ∈ V_j, we define the pyramidal function Λ_{j,v}, supported on a neighborhood of v of size 2^{−j}, where v^+ and v^− are the dyadic neighbors of v at level j; each v ∈ V_j is represented in a unique way by its dyadic expansion. The sequential norm is defined through the coefficients λ_{j,v}(x) = x(v) − (x(v^+) + x(v^−))/2, and by [25], this norm is equivalent to the H_ρ^o-norm. A general tightness criterion is available for moduli of the form ρ : h ↦ h^α; it rests on a Schauder decomposition of H_ρ^o as ⊕_{j≥1} E_j, where E_j is the vector space generated by the functions Λ_{j,v}, v ∈ V_j.

Theorem 3.2 (Theorem 6, [29]). Let ζ_n, n ∈ N^d, and ζ be random elements with values in the space H_α[0,1]^d. Assume that the following conditions are satisfied.
(1) For each dyadic t ∈ [0,1]^d, condition (3.37) holds. This extends readily to ρ ∈ R_{1/2,d}. First observe that, bounding from below the sum over j by the term of index j = J and taking q = d, condition (2.10) implies a control of the corresponding limit superior; performing the replacement of index m′_d = m_d − J for fixed J then gives asymptotic tightness of (W_n(t))_{n≥1} for each t.
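The mechanics of the Schauder decomposition can be illustrated in dimension one (a sketch with our own helper names; the paper works with the d-dimensional pyramidal functions Λ_{j,v}, while the coefficient formula below is the classical one-dimensional Faber-Schauder one, for functions with x(0) = 0):

```python
# Lambda_{j,v} is the triangular function of height 1 at the dyadic point
# v = (2u+1)/2^j, vanishing outside (v - 2^{-j}, v + 2^{-j}); the coefficient of
# a function x is lambda_{j,v}(x) = x(v) - (x(v + 2^{-j}) + x(v - 2^{-j}))/2.
def Lam(j, v, t):
    return max(0.0, 1.0 - 2 ** j * abs(t - v))

def coef(x, j, v):
    h = 2.0 ** (-j)
    return x(v) - 0.5 * (x(v + h) + x(v - h))

def reconstruct(x, t, J=12):
    # Level-0 term x(1)*t, then the pyramidal corrections of levels 1..J.
    s = x(1.0) * t
    for j in range(1, J + 1):
        for u in range(2 ** (j - 1)):
            v = (2 * u + 1) / 2 ** j
            s += coef(x, j, v) * Lam(j, v, t)
    return s

x = lambda t: t * (1 - t)   # a smooth test function with x(0) = x(1) = 0
# The partial sum through level J interpolates x at dyadics of level <= J:
print(all(abs(reconstruct(x, k / 16) - x(k / 16)) < 1e-9 for k in range(17)))
```

The tightness criterion controls exactly these level-j coefficients of W_n, uniformly in v, level by level.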
It remains to check the second condition of Theorem 3.2. Since the Schauder decomposition H_ρ^o = ⊕_{j≥1} E_j is also valid for ρ ∈ R_{1/2,d}, this theorem also holds with the map h ↦ h^α replaced by ρ. Therefore, it suffices to prove that the corresponding limit vanishes. By the definition of λ_{j,v} and V_j, the coefficients are controlled by increments of the partial sum process at the points t_k = k 2^{−j} and s_u = (u_i 2^{−j})_{i∈[d]\{q}}. We prove the required bound, for each positive ε, for the increments ∆^{(d)} along the last coordinate; the corresponding terms with ∆^{(q)} instead of ∆^{(d)} can be treated by switching the roles of the coordinates. We will use the following notations: i, k, n, m will denote elements of Z^d, and i′, k′, n′, m′ elements of Z^{d−1}; ((i′, i_d)) will denote the element of Z^d whose first d − 1 coordinates are those of i′ and whose d-th coordinate is i_d, and similarly for the other letters. We will denote by ≼ the coordinatewise order on both Z^d and Z^{d−1}, since there will be no ambiguity. Finally, 1′ will denote the element of Z^{d−1} whose coordinates are all equal to 1.
In view of Lemma 3.3, the increments ∆^{(d)} are controlled by maxima of partial sums. Since the indicator in the first term of the right-hand side of (3.44) vanishes if j > log n_d, this term can be bounded, and by stationarity the corresponding probabilities do not depend on the location of the rectangle. For the second term of the right-hand side of (3.44), notice that the supremum involved is increasing; if j > log n_d, then 2^j > n_d, hence min(1, n_d 2^{−j}) = n_d 2^{−j}, and for such j's we have ρ(2^{−j})^{−1} n_d 2^{−j}, so that we can use the decreasingness of the sequence (ρ(2^{−j})^{−1} 2^{−j})_{j≥1}. As a consequence, after having bounded the probability of the maximum over i_d by the sum of the probabilities and used stationarity, we obtain the desired bound.

3.4. Proof of Theorem 2.4. We have to check that (2.10) is satisfied. For simplicity, we do this for q = d; the general case can be handled similarly. To this aim, we apply inequality (1.4), for fixed m ≽ 1, J ≥ 1 and j ∈ {J, . . ., m_d}, in the following setting: n = (2^{m_1}, . . ., 2^{m_{d−1}}, 2^{m_d−j}), x = ε ρ(2^{−j}) 2^{j/2} and y = (ε/2) ρ(2^{−j}) 2^{j/2} j^{−d/2}. The sum of the exponential terms in (1.4) can be bounded by the remainder of a convergent series. With the assumption on ρ, the sum of the obtained integral terms admits the bound stated below.

Proof of Lemma 3.4. From Potter's bound (see Lemma 1.5.6 in [2]), there exists a constant K such that for each 1 ≤ j ≤ k, L(2^k) ≤ K L(2^j) 2^{(k−j)/2}. Consequently, the change of index ℓ = k − j shows that we can take C_L = K Σ_{ℓ≥0} 2^{−ℓ/2}. This ends the proof of Lemma 3.4.
Lemma 3.5. Let X be a non-negative random variable and let L : R^+ → R^+ be a slowly varying increasing function such that L(x) → ∞ as x → ∞. Suppose that for all A > 0, Σ_{j≥1} 2^j P{ X > L(2^j) A } < +∞ (condition (3.55)). Then for each C > 0, Σ_{j≥1} 2^j ∫_1^{+∞} P{ X > L(2^j) u C } u^2 du < +∞ (conclusion (3.56)).

Remark 1.7. Assume that (X_i)_{i∈Z^d} is bounded, that is, there exists a constant K such that |X_i| ≤ K almost surely for all i ∈ Z^d. If x > 3^{d/2} K/C_d, then we can choose y = K/C_d: the integral term of (1.4) vanishes and the inequality reduces to its exponential term, where A_d, B_d and C_d depend only on d, p_d = 2d and |n| = ∏_{q=1}^d n_q.

Proposition 2.3. Let d ≥ 1 and let ρ be an element of R_{1/2,d}. If (X_i)_{i∈Z^d} is a strictly stationary random field such that for each q ∈ {1, . . ., d} and each positive ε, the quantity in (2.10), namely lim_{J→∞} limsup_{min m→∞} Σ_{j=J}^{m_q} (· · ·), vanishes, then the partial sum process (W_n) is tight in H_ρ^o.

(3) Step 3 of the proof of Theorem 1.6: it remains to find a bound for the double integral. For fixed u and v > 1, we apply Proposition 3.1 in the setting where x is replaced by the quantity arising from the substitution above. Combining (3.44) with (3.47) and (3.51), we obtain (3.39); this ends the proof of Proposition 2.3.

End of the proof of Theorem 2.4: the sum of the integral terms does not exceed Σ_{j≥J} 2^j ∫_1^{+∞} P{ |X_1| > L(2^j) u C } u (log(1+u))^{p_d} du, where C depends only on ρ and ε. We now show that (2.11) guarantees the convergence to zero of this term as J goes to infinity. As there exists a constant κ such that u (log(1+u))^{p_d} ≤ κ u^2 for each u ≥ 1, it suffices to prove that for each C > 0, Σ_{j≥1} 2^j ∫_1^{+∞} P{ |X_1| > L(2^j) u C } u^2 du < +∞ (condition (3.52)). This will be a consequence of the following lemmas.

Lemma 3.4. Let L : R^+ → R^+ be a slowly varying function. There exists a constant C_L such that for each k ≥ 1, the bound (3.53) holds.

Proof of Lemma 3.5. Without loss of generality, we can assume that C = 1. First, observe that ∫_1^{+∞} P{X > L(2^j) u} u^2 du ≤ (1/3) E[(X/L(2^j))^3 1{X > L(2^j)}]. Switching the sums over j and k and applying Lemma 3.4 with L^3 instead of L, we are reduced to a series over k which converges by assumption (3.55). This ends the proof of Lemma 3.5.