TASEP fluctuations with soft-shock initial data

We consider the totally asymmetric simple exclusion process (TASEP) with soft-shock initial density, which is a step function increasing in the direction of flow and the step size chosen small to admit KPZ scaling. We prove that the fluctuations of a particle at the macroscopic position of the shock converge to the maximum of two independent GOE Tracy-Widom random variables, establishing a conjecture of Ferrari and Nejjar. Furthermore, we show that the joint fluctuations of particles near the shock are described by the maximum of two lines with height shifts given by these two independent random variables. The microscopic position of the shock is then easily seen to be the difference of these two random variables. Our proofs rely on determinantal formulae and a novel factorization of the associated kernels.


Introduction
The continuous time totally asymmetric simple exclusion process (TASEP) is an interacting particle system on Z. At time 0, there is a given initial configuration of particles such that every site of Z is occupied by at most one particle. The dynamics are as follows. A particle jumps to its neighbouring site on the right, provided that site is empty. The jumps of a particle are independent of those of the other particles and occur after exponential waiting times with mean 1.
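The dynamics just described can be simulated directly. The following is a minimal sketch, assuming a finite, non-empty set of particles (the process in the text lives on all of Z and may have infinitely many particles). The name simulate_tasep and its arguments are illustrative and do not come from the paper.

```python
import random


def simulate_tasep(positions, t_max, rng):
    """Run continuous-time TASEP with finitely many particles up to time t_max.

    positions: strictly decreasing list of integer sites, positions[0] being
    the rightmost particle. Every particle carries an independent rate-1
    exponential clock. When a clock rings, that particle jumps one site to
    the right if the target site is empty; otherwise the jump is suppressed.
    """
    x = list(positions)
    n = len(x)
    t = 0.0
    while True:
        # The first of n independent rate-1 clocks rings after an Exp(n) time,
        # and the ringing particle is uniform among the n particles.
        t += rng.expovariate(n)
        if t > t_max:
            return x
        i = rng.randrange(n)
        # Particle i is blocked only by particle i - 1, its right neighbour.
        if i == 0 or x[i - 1] > x[i] + 1:
            x[i] += 1
```

For instance, simulate_tasep([-2 * k for k in range(50)], 20.0, random.Random(0)) evolves 50 particles that start on every second site to the left of the origin.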
The book [16] provides a detailed construction and basic features of the model. TASEP can also be seen as a randomly growing 1-dimensional interface whose gradient is the particle density. It belongs to the Kardar-Parisi-Zhang (KPZ) universality class. The surveys [5,21] discuss the relationship of TASEP to KPZ.
Despite its simplicity, TASEP displays much of the interesting behaviour of nonequilibrium statistical mechanics. Consider a deterministic initial configuration such that the macroscopic particle density is ρ − to the left of the origin and ρ + to the right: For instance, particles may be arranged periodically in large enough blocks to attain such a profile. On the macroscopic scale the evolution of the particle density is a solution to Burgers' equation [24,26]. More precisely, the limit Pr [there is a particle at site xt at time T t] exists and is an entropic solution of Burgers' equation: When ρ − < ρ + , there is a traffic jam in the system because particles to the left of the origin, moving at macroscopic speed 1 − ρ − , run into particles to the right of the origin moving at a slower speed of 1 − ρ + . In this case the relevant solution of (1.2) is given by the travelling front The number ν is the speed of the traffic jam. This is the shock in Burgers' equation. It is of much interest to study the microscopic features of the shock, that is, the fluctuations of TASEP with an initial particle configuration as above. A proxy for the location of the shock is the particle at macroscopic position νt. For large times, the number¹ of said particle is Its position fluctuates randomly on the order of $t^{1/3}$ and so one would like to calculate, for every a ∈ R, $\lim_{t \to \infty} \Pr\big[X_t(n^{\mathrm{shk}}_t) \ge \nu t - a t^{1/3}\big]$.
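The speed of the traffic jam can be read off from the Rankine-Hugoniot condition for a conservation law. The sketch below assumes the usual TASEP flux j(ρ) = ρ(1 − ρ), which matches the statement that particles at density ρ move at macroscopic speed 1 − ρ. The function names are illustrative and not taken from the paper.

```python
def flux(rho):
    """Particle current at density rho: density times the speed 1 - rho."""
    return rho * (1.0 - rho)


def shock_speed(rho_minus, rho_plus):
    """Rankine-Hugoniot speed of the front joining rho_minus (left) to rho_plus (right)."""
    if rho_minus == rho_plus:
        raise ValueError("no discontinuity: the two densities coincide")
    # Jump in flux divided by jump in density.
    return (flux(rho_plus) - flux(rho_minus)) / (rho_plus - rho_minus)
```

Algebraically the quotient simplifies to 1 − ρ − − ρ + , so the shock is stationary exactly when the two densities are symmetric about 1/2.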
In the soft shock scenario, the TASEP is run until time t with the choice of ρ ± as in (1.3) and t being the parameter within ρ ± . One then considers the fluctuation of X t (n shk t ) in the double limit of t → ∞ followed by β → ∞. It is governed by the GOE Tracy-Widom-squared law, as follows.
Theorem 1. Consider TASEP with a deterministic initial configuration of particles having macroscopic density as in (1.1) and ρ ± scaled as in (1.3). Then, where F 1 is the distribution function of the GOE Tracy-Widom law.
¹ Rounding particle numbers to nearest integers is omitted throughout the paper.
The soft shock is introduced in [12] and Theorem 1 proves the conjecture therein as we explain in Section 1.6. The scaling (1.3) is an analogue of the one for the Baik-Ben Arous-Péché transition from random matrix theory within the context of last passage percolation; see [12]. The large time limit of the soft shock gives processes with interesting features as we remark below.
An advantage of the soft shock is that it allows one to describe the limiting law of X t (n shk t ) in the first limit transition, t → ∞, in terms of explicit determinantal formulae. This is then used to study the second limit transition of β → ∞ and to understand the mechanisms of shock fluctuations. This is how Theorem 1 will be proved: first, by deriving the large t limit of the joint fluctuations of particles that are near the shock and, second, by showing that the large β limit of the resulting stochastic process becomes an appropriately combined maximum of two independent GOE Tracy-Widom random variables. This general result is stated in Theorem 3 below. Fluctuations of the hard shock are expected to be given by the same law. Also, the methods of this paper should generalize to prove GOE Tracy-Widom cubed, quadrupled and so on limiting laws at the merger of soft shocks when the initial density has two jumps, three jumps, etc.
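The term "squared law" reflects an elementary fact: if X and X' are independent with common distribution function F, then max(X, X') has distribution function F². The following Monte Carlo sketch illustrates this with a standard Gaussian as a stand-in, since sampling the GOE Tracy-Widom law is beyond a short example. All names are illustrative.

```python
import math
import random


def gaussian_cdf(a):
    """Distribution function of the standard Gaussian (stand-in for F_1)."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))


def empirical_cdf_of_max(n_trials, a, rng):
    """Monte Carlo estimate of P(max(X, X') <= a) for independent Gaussians X, X'."""
    hits = 0
    for _ in range(n_trials):
        first = rng.gauss(0.0, 1.0)
        second = rng.gauss(0.0, 1.0)
        if max(first, second) <= a:
            hits += 1
    return hits / n_trials
```

With a few hundred thousand trials the estimate agrees with gaussian_cdf(a) ** 2 up to sampling error.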
1.2. Large time limit of the soft shock. In the case of soft shock the particle numbered n shk t has non-trivial correlations with other particles that are within a distance of order t 2/3 of its position. The positions of these particles also fluctuate on a scale of order t 1/3 . As such, consider particles having numbers The first limit transition derives the law in the large t limit of the process
Theorem 2. Given real numbers x 1 < x 2 < · · · < x m and a 1 , . . . , a m , as t → ∞, the probability converges to the probability where h(1, x; 2β|y|) is a random function of the variable x. The multi-point distribution functions of h(1, x; 2β|y|) are given in terms of Fredholm determinants: where $\bar\chi_a(u) = \mathbf{1}_{\{u \le a\}}$ is the projection onto $L^2(-\infty, a)$ and K β is an explicit operator.
K β is defined separately in Section 2 since its introduction requires crucial terminology and concepts.
Complicated though the determinant in Theorem 2 may appear, observe that the one-point distribution functions of h(1, x; 2β|y|) are given by the Fredholm determinant of the operators $e^{-x\partial^2} K_\beta e^{x\partial^2}$ over the spaces $L^2(a, \infty)$. These turn out to be simpler and play a crucial role in the proofs. The reason we call the limit process h(1, x; 2β|y|) is that it is the height function at time 1 of the KPZ fixed point with initial data h 0 (y) = 2β|y|, as introduced in [17]. The KPZ fixed point refers to the asymptotic scaling invariant Markov process for the KPZ universality class, starting from general initial data. Although the KPZ fixed point motivates our paper to an extent, the kernels in this case were actually known previously in [12], and so the results used from [17] are somewhat auxiliary.

1.3. Transitioning into the shock. The main result of the paper is the large β limit law of the process h(1, x; 2β|y|).
Theorem 3. As β → ∞, the process converges in the sense of finite dimensional laws to the process where $X_{TW_1}$ and $X'_{TW_1}$ are two independent GOE Tracy-Widom random variables. In other words, as β → ∞, Stated in terms of the TASEP soft shock, Theorem 3 asserts that in the double limit of t → ∞ followed by β → ∞ the process converges in law to the process (1.5). Process (1.5) may be thought of as the asymptotic "shock process" of TASEP with initial density (1.1).
[Figure, axes x and u.] Flat region: Airy 1 joint fluctuations on scales of t 1/3 for height and t 2/3 for space. Shock region: joint fluctuations given by the process (1.5) on scales of t 1/3 for both height and space.
Remark on position of the shock. The process (1.5) can be expressed as |x − X| + Y , where $X = (X_{TW_1} - X'_{TW_1})/2^{5/3}$ and $Y = (X_{TW_1} + X'_{TW_1})/2^{5/3}$. The microscopic position of the shock is then at the minimizer of this function, which is X.
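The remark rests on the identity max(p − x, q + x) = |x − (p − q)/2| + (p + q)/2, valid for any real p and q. The sketch below checks it for two lines of slopes −1 and +1 with height shifts p and q. It deliberately ignores the scaling constants of the process (1.5), such as the factor $2^{5/3}$, and the function names are illustrative.

```python
def max_of_two_lines(x, p, q):
    """Maximum of the lines p - x and q + x at the point x."""
    return max(p - x, q + x)


def vee_form(x, p, q):
    """Same function written as |x - centre| + height; also returns the minimizer."""
    centre = (p - q) / 2.0  # location of the kink, i.e. the minimizer
    height = (p + q) / 2.0  # minimal value, attained at x = centre
    return abs(x - centre) + height, centre
```

The kink sits at half the difference of the two shifts, which is why the shock position is governed by a difference of the two independent random variables.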

1.4. Remarks on the soft shock process. "Soft shock" is a bit of a misnomer since the shock manifests for large values of β whereas the process in Theorem 2 has interesting features for negative values of β as well. What we have is a family of processes interpolating from the Airy 2 process at β = −∞ to the process (1.5) at β = +∞. This becomes easy to see from the framework of the aforementioned KPZ fixed point as the mapping from initial data h 0 ↦ h(1, x; h 0 ) is continuous, so long as h 0 is upper semicontinuous with values in [−∞, ∞) and bounded from above by a linear function.
1.5. An overview of the proof. It is well known that the correlation functions of TASEP, which provide the probability of particles being at specific sites, are determinantal; see for instance [3,5] and references therein. Our proofs rely on such determinantal formulae.
Let us summarize how the GOE TW-squared law arises in Theorem 1. The operator K β associated to the law of h(1, 0; 2β|y|) can be approximately factorized in the form Err β is an error term that is vanishingly small in the appropriate trace norm as β → ∞. This effectively allows one to consider the Fredholm determinant of the product. This approximately becomes the product of determinants over the stipulated L 2 space. The conjugations by M β can then be removed. This results in the GOE TW-squared law in the large β limit.
Observe that if one conjugates away M β from one of the factors in the above representation then the other factor is conjugated by M 2 β , and the resulting operator, does not in fact converge as β → ∞. This was a challenge faced by previous attempts.
1.6. Review of literature. The study of the TASEP shock has a rich history and the reader may find nice discussions in [8,9] and the references therein. We provide an overview of prior works most directly related to this paper.
In [4] the authors find determinantal formulae for TASEP with particles having varying speeds, which allows them to study shock fluctuations with Bernoulli random initial data. The fluctuations there are Gaussian to the order of t 1/2 . The paper [1] has related results for Bernoulli initial data. Deterministic shock-like initial data is studied in [11,13] by connecting TASEP to last passage percolation. The authors prove that shock fluctuations for various setups are governed by the maximum of various Tracy-Widom random variables, although they are unable to treat the basic case of the step initial density (1.1).
The soft shock is introduced in [12] in a setup where the particles move at two different speeds instead of being spread with the two densities ρ ± . The authors prove the analogue of our Theorem 2. They present determinantal kernels for the multi-point distributions in terms of contour integrals, and one may verify that their kernel matches ours. They also conjecture our Theorem 1. A beautiful illustration of the convergence of X t (n shk t ) to the GOE TW-squared law is shown in [12, Figure 1]. The paper [18] also considers a scenario like the soft shock but with narrow-wedge-like initial data.
Finally, [10] proves that the asymptotic position of the second class particle with the step initial density (1.1) is the difference of two independent GOE Tracy-Widom random variables. One should think of the second class particle as a random walk in the potential well given by the TASEP height process, and, as expected, it sits at the minimum of the process (1.5), which is $(X_{TW_1} - X'_{TW_1})/2^{5/3}$.

Acknowledgements. JQ was supported by the Natural Sciences and Engineering Research Council of Canada. MR is thankful to Alexei Borodin for helpful discussions and guidance, especially during early stages of the project. The authors also thank Patrik Ferrari for a valuable discussion, and Ivan Corwin, Promit Ghosal and Daniel Remenik for their comments.

The soft shock operator
The operator K β associated to the soft shock is defined in terms of operators acting on functions f ∈ L 2 (a, ∞), for any fixed a > −∞. Note that the operator exp{x∂ 2 }, which corresponds to the heat kernel, is ill-defined for x < 0 but S x is well-defined due to the presence of the third derivative operator. In terms of integral kernels, where Ai(z) is the Airy function defined as and is a contour consisting of two rays going from e −iπ/3 ∞ to e iπ/3 ∞ through 0. The operator S 0 will often be denoted S. We will frequently use the fact that Define the operator S hypo(h) x in terms of its integral kernel as Consider also the projection operators onto L 2 (a, ∞) and L 2 (−∞, a), respectively: The hitting operator is defined as follows. Consider an upper semicontinuous h : R → [−∞, ∞) that has at most linear growth. The hitting operator associated to h requires choosing a split point x ∈ R. Then consider the functions It is a crucial property of the hitting operator that it does not depend on the choice of split point x (see [23] for a proof).
The operator K β is the hitting operator associated to h β (y) = 2β|y|. It is natural (and crucial for the large β asymptotics) to take the split point at x = 0, which utilizes the fact that h β has different slopes on the two sides of the split point. So denoting h + β (y) = 2βy for y ≥ 0, Since S * S = I, K hypo(h β ) can be expressed as Each of these terms has the operator exp{± ∂ 3 /3} on both sides. This ensures that the operator exp{x∂ 2 } can be applied legally around K hypo(h β ) for every x ∈ R, and so the operator inside the determinant from the statement of Theorem 2 is well-defined.
For general initial data h 0 , the multi-point distribution functions of h(1, x; h 0 ) are given as follows. Given reals x 1 < . . . < x m and a 1 , . . . , a m , .
The determinantal expression for the multi-point distribution function is the 'path integral' version from [17]. There is an alternative 'extended kernel' version. The hitting operator is also introduced in [23] in a modified form and precursors appear in [2,6,20,22].

First limit transition: proof of Theorem 2
Let us introduce a parameter ε > 0 and write In order to prove Theorem 2 one must derive the limiting joint probabilities of such events as ε → 0. Upon replacing x with x − (β 2 /2)ε 1/2 the event becomes We may express the event (3.1) in terms of the height function of TASEP. For TASEP with initial data X 0 , let In terms of the KPZ-rescaled height function one has In the ε → 0 limit the probability of the event in (3.1) remains unaffected if the term O(β 2 ) is ignored. Thus, one must show that the limiting multi-point probabilities are given by the formula from Theorem 2. (It is easy to see that h ε (0, y) converges uniformly to h 0 (y) = 2β|y|.) Here there are several approaches. In [12], a determinantal formula is derived for these multi-point probabilities for the soft-shock data in a related setup, where particles to the left of the origin have a different speed than those to the right. Using their formula, it is not difficult to guess a determinantal formula for our setup and then check it using the bi-orthogonalization procedure from [3,25]. On the other hand, [17, Theorem 2.6] provides a formula for any initial data with a rightmost particle. One can cut off the soft-shock initial data far to the right and take a limit as the cutoff is removed to get a determinantal formula for the multi-point probabilities which coincides with the guess. Then by direct asymptotic analysis of the associated determinantal kernels one arrives at Theorem 2. Since the limiting kernel is the same as from [12], we omit the details.

Second limit transition: proofs of Theorems 1 and 3
The proof of Theorem 1 is presented first in Section 4.1 followed by the proof of Theorem 3 in Section 4.2 as the latter builds on the former. We first define the GOE Tracy-Widom law, introduced in [28], in a suitable form. For the remainder of the paper it is assumed that β ≥ 0.
The GOE Tracy-Widom law. The distribution function of the GOE Tracy-Widom law may be written as a Fredholm determinant [14]. Consider the operator A with integral kernel $A(u, v) = 2^{-1/3}\,\mathrm{Ai}\big(2^{-1/3}(u + v)\big)$. If R is the reflection operator: then A may be expressed as The above representation uses that $S^2 = e^{2\partial^3/3}$ and, as an integral kernel, The relation ∂R = −R∂ implies the second equality. It will turn out that A is the operator K 0 . The GOE Tracy-Widom distribution function is $F_1(2^{2/3} a) = \det(I - A)_{L^2(a, \infty)}$.
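The Fredholm determinant of the kernel A can be evaluated numerically. The sketch below is one possible approach, not the paper's: it truncates $L^2(a, \infty)$ to a finite interval and discretizes the kernel with Gauss-Legendre quadrature (a Nyström-type method). It assumes NumPy and SciPy are available and uses the relation det(I − A) on $L^2(a, \infty)$ = F 1 (2^{2/3} a) quoted later in the text. The function names, the cutoff and the number of nodes are illustrative choices.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss
from scipy.special import airy


def fredholm_det(kernel, lo, hi, m=60):
    """Approximate det(I - K) on L^2(lo, hi) using m Gauss-Legendre nodes."""
    nodes, weights = leggauss(m)
    x = 0.5 * (hi - lo) * nodes + 0.5 * (hi + lo)  # nodes mapped to [lo, hi]
    w = 0.5 * (hi - lo) * weights                  # matching weights
    sqrt_w = np.sqrt(w)
    # Symmetrized discretization: sqrt(w_i) K(x_i, x_j) sqrt(w_j).
    matrix = sqrt_w[:, None] * kernel(x[:, None], x[None, :]) * sqrt_w[None, :]
    return np.linalg.det(np.eye(m) - matrix)


def kernel_A(u, v):
    """A(u, v) = 2^{-1/3} Ai(2^{-1/3} (u + v)), the kernel from the text."""
    c = 2.0 ** (-1.0 / 3.0)
    return c * airy(c * (u + v))[0]  # airy returns (Ai, Ai', Bi, Bi')


def goe_tw_cdf(s, cutoff=16.0):
    """F_1(s) as det(I - A) on L^2(a, a + cutoff), where a = 2^{-2/3} s."""
    a = 2.0 ** (-2.0 / 3.0) * s
    return fredholm_det(kernel_A, a, a + cutoff)
```

The truncation is harmless because the Airy function decays superexponentially on the positive axis. A useful consistency check is that the result agrees with the determinant of the kernel Ai(x + y + s) on $L^2(0, \infty)$, which is the same operator after an affine change of variables.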
The following lemma is key to calculating the hitting operator associated to h β (y) = 2β|y|.
This contributes the term $\bar\chi_0 S$. Now assume that u > 0 and let τ be the hitting time of a Brownian motion of diffusion coefficient 2, starting from u, to the hypograph of h + β .
Observe that . Recall that the Airy function has the following decay: there is a constant C such that The above implies that S −t (2βt, v) decays sufficiently fast that one has For t ≤ T , $S_{-t} = e^{(T-t)\partial^2} S_{-T}$ and one recognizes the integral kernel of $e^{(T-t)\partial^2}$ as the transition density of Brownian motion (with diffusion constant 2) to go from B(t) = u to B(T ) = v. As such, the strong Markov property implies where the expression on the right is the transition density of B to go from B(0) = u to B(T ) = v while hitting the curve h + β . Denote this expression S hit T (u, v). Thus, , so that S hit T is the transition density of X to go from X(0) = u to X(T ) = v − 2βT while hitting 0. By the Cameron-Martin Theorem, X becomes Brownian motion with diffusion coefficient 2 on [0, T ], starting from u, after a change of measure by the density exp{−β(B(T ) − u) − β 2 T }. Consequently, Since u > 0, if v − 2βT ≤ 0 then the latter transition density is simply the transition density of B to go from B(0) = u to B(T ) = v − 2βT . If v − 2βT > 0, however, one reflects along the time axis the initial segment of B from time 0 till the hitting time to the point zero. The reflection principle then implies that the latter transition density is the transition density of B to go from B(0) = −u to B(T ) = v − 2βT . Therefore, for u > 0, Consequently, writing $\chi_{2\beta T}$ as $1 - \bar\chi_{2\beta T}$ and expressing everything in operator notation shows that 2βT . Multiplying by S −T now gives The operators χ 0 and M ±β are diagonal, R is anti-diagonal, and none depend on T . The lemma thus follows if for every choice of u and v the quantity $e^{T\partial^2}\,\bar\chi_{2\beta T}\,S_{-T}(u, v) \to 0$ as T → ∞.
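The reflection step has an exact discrete analogue that can be checked by counting lattice paths. The sketch below replaces Brownian motion by a simple ±1 random walk, which is an assumption made purely for illustration: for u > 0 and v > 0, the paths from u to v that visit 0 are in bijection with all paths from −u to v, and for v ≤ 0 every path from u to v visits 0. The function name is illustrative.

```python
from functools import lru_cache


def count_paths(start, end, steps, must_hit_zero):
    """Count nearest-neighbour (+1/-1) paths of the given length from start to end.

    If must_hit_zero is True, only paths that visit site 0 at some time
    (endpoints included) are counted.
    """

    @lru_cache(maxsize=None)
    def go(pos, remaining, hit):
        hit = hit or pos == 0
        if remaining == 0:
            reached = pos == end
            return 1 if reached and (hit or not must_hit_zero) else 0
        return go(pos + 1, remaining - 1, hit) + go(pos - 1, remaining - 1, hit)

    return go(start, steps, False)
```

For example, count_paths(2, 1, 5, True) equals count_paths(-2, 1, 5, False), mirroring the statement that the hitting transition density from u equals the free transition density from −u.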
Let (I) denote the quantity $e^{T\partial^2}\,\bar\chi_{2\beta T}\,S_{-T}(u, v)$. Using the integral kernels of $e^{T\partial^2}$ and $S_{-T}$ one infers that (I) equals In order to evaluate (I), write the Airy function in terms of its contour integral representation (2.3) and switch the contour integration with the integration over the variable z by Fubini. The integral over z is a Gaussian integral, which equals Here $\Phi(w) = (2\pi)^{-1/2} \int_{-\infty}^{w} ds\, e^{-s^2/2}$, where w is a complex argument and the integral is over the horizontal contour s ↦ w − s, for s ≥ 0, oriented from −∞ to w. Substituting into the expression for (I), simplifying, and changing variables w → w − T gives the following.
We may now observe that the hitting operator K 0 is simply the operator A associated to the GOE Tracy-Widom law. Employing the definition from (2.7), Lemma 4.2, and using the fact S * S = I, it follows that The relations $R\chi_0 = \bar\chi_0 R$ and $R^2 = I$ imply which establishes the claim.
Proof. Lemma 4.2 and the relation χ 0 = χ 0 SS * χ 0 imply that (4.6) Since S * S = I and M β commutes with the projection χ 0 , we may write We now conjugate the above equation by the translation e β 2 ∂ and use relations (1) and (3) The last equation used that A = RS 2 . A key point above is that conjugation by the translation cancels the term involving ∂ in the expansion of ± 1 3 (β + ∂) 3 .
Analogously, one computes to see that In conclusion, The lemma follows from this relation and the expression (4.6) for I − K β .
Decompose the product above in the form $(I - X)^* \chi_a (I - X) + (I - X)^* \bar\chi_a (I - X)$. The determinant of the first of these terms factorizes and upon conjugating out M β from each factor one gets (4.7) .
The proof of Theorem 1 will be completed by showing that the second term in the decomposition, as well as the term E β , provide negligible error as β → ∞. This is the content of the following two lemmas. The argument makes use of some standard inequalities between the Fredholm determinant, trace norm, Hilbert-Schmidt norm and operator norm that may be found in the book [27].
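Inequalities of this type can be sanity-checked on finite matrices, where the Fredholm determinant is an ordinary determinant. The sketch below assumes NumPy and tests four standard facts: the ordering of the operator, Hilbert-Schmidt and trace norms, the bound ||T1 T2||_tr ≤ ||T1||_op ||T2||_tr, the bound |det(I + T)| ≤ exp(||T||_tr), and the Lipschitz-type estimate |det(I + T1) − det(I + T2)| ≤ ||T1 − T2||_tr exp(1 + ||T1||_tr + ||T2||_tr). Which of these the argument invokes at each step is not assumed here, and the function names are illustrative.

```python
import numpy as np


def trace_norm(t):
    """Sum of the singular values (nuclear norm)."""
    return np.linalg.norm(t, ord="nuc")


def hs_norm(t):
    """Hilbert-Schmidt (Frobenius) norm."""
    return np.linalg.norm(t, ord="fro")


def op_norm(t):
    """Operator norm, i.e. the largest singular value."""
    return np.linalg.norm(t, ord=2)


def check_inequalities(t1, t2, slack=1e-9):
    """Return a dict recording whether each inequality holds for the matrices t1, t2."""
    eye = np.eye(t1.shape[0])
    det1 = np.linalg.det(eye + t1)
    det2 = np.linalg.det(eye + t2)
    lipschitz = trace_norm(t1 - t2) * np.exp(1.0 + trace_norm(t1) + trace_norm(t2))
    return {
        "op <= HS": op_norm(t1) <= hs_norm(t1) + slack,
        "HS <= tr": hs_norm(t1) <= trace_norm(t1) + slack,
        "product": trace_norm(t1 @ t2) <= op_norm(t1) * trace_norm(t2) + slack,
        "det bound": abs(det1) <= np.exp(trace_norm(t1)) + slack,
        "det difference": abs(det1 - det2) <= lipschitz + slack,
    }
```

A random test proves nothing, but it is a cheap guard against misremembering the direction of an inequality.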
It suffices to show that both of the operators above have vanishingly small trace norm as β → ∞.
We change variable v → v + a in both integrals above. Then, changing variables y := u + v and x := u − v in both integrals gives Rescaling the variable of the first integral as y → y/2β, and of the second as y → β 2 y/2, shows that $\int_0^\infty dy\, y\, e^{-y}\, \mathrm{Ai}^2\big(\beta^2 + a + (2\beta)^{-1} y\big) \times \int_0^\infty dy\, y\, e^{\beta^3 y}\, \mathrm{Ai}^2\big(\beta^2 (1 + \tfrac{y}{2}) + a\big)$. (4.8)
Recall there is a constant C such that $|\mathrm{Ai}(z)| \le C \exp\{-\tfrac{2}{3} z^{3/2}\}$ if z ≥ 0 and |Ai(z)| ≤ C if z < 0. Since a is fixed, suppose β satisfies β 2 + a ≥ 1, say. Then, due to this bound on the Airy function, the main contribution to the first of the two integrals above comes from y of bounded order, y ≈ 1. In particular, there is a constant C a such that for sufficiently large β (in terms of a), $\int_0^\infty dy\, y\, e^{-y}\, \mathrm{Ai}^2\big(\beta^2 + a + (2\beta)^{-1} y\big) \le C_a\, \mathrm{Ai}^2(\beta^2)$.
The magnitude of the second integral from (4.8) may also be determined from a critical point analysis by using the bound on the Airy function above. By abusing notation a bit, there is a constant C a such that for large enough β, The function $y - \tfrac{4}{3}(1 + \tfrac{y}{2})^{3/2}$ is uniquely maximized at y = 0 and its value there is −4/3. Therefore the second integral from (4.8) is of order $e^{-\frac{4}{3}\beta^3}$ as β → ∞. Consequently, there is a (new) constant C a such that for sufficiently large β, This shows that $\|E^1_\beta\|_{tr} \to 0$ as β → ∞ since Ai(β 2 ) is of order $e^{-\frac{2}{3}\beta^3}$.
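The critical point claim is easy to check numerically. The sketch below takes a = 0 and replaces Ai² by its exponential bound, so it only probes the exponent f(y) = y − (4/3)(1 + y/2)^{3/2}, with b standing in for β³. Since f'(y) = 1 − (1 + y/2)^{1/2} and f''(0) = −1/4, a Laplace approximation suggests that the integral of y e^{b f(y)} over y ≥ 0 behaves like (4/b) e^{−4b/3}. The function names and grid parameters are illustrative.

```python
import math


def exponent(y):
    """f(y) = y - (4/3) (1 + y/2)^(3/2)."""
    return y - (4.0 / 3.0) * (1.0 + y / 2.0) ** 1.5


def normalised_integral(b, upper=4.0, n=40000):
    """Trapezoid estimate of e^{4b/3} times the integral of y e^{b f(y)} over [0, upper]."""
    h = upper / n
    total = 0.0
    for k in range(1, n):
        y = k * h
        # Subtracting f(0) = -4/3 keeps the exponential in floating point range.
        total += y * math.exp(b * (exponent(y) + 4.0 / 3.0))
    # Endpoint terms: the integrand vanishes at y = 0, so only y = upper contributes.
    total += 0.5 * upper * math.exp(b * (exponent(upper) + 4.0 / 3.0))
    return total * h
```

For b = 1000 the ratio normalised_integral(b) * b / 4 is within a few percent of 1, in line with the stated order of decay up to a polynomial factor.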
Now consider the operator E 2 β . Using the definitions one has that The trace norm of E 2 β is the same as that of (u, v) → E 2 β (u + a, v + a) since the latter is a conjugation of the former by the unitary operation of translation. So we consider the latter kernel.
When β 2 + a ≥ 1, the major contribution to the integral above comes from z being in a region around zero, z ≈ 0, due to the rapid decay of the integrand in the variable z. Consequently, for large β there is a constant C a such that (4.9) . The right hand side above decays rapidly in the variable u, namely, it is at most of order e − 2 3 (β 3 +u 3/2 ) 1 {u≥0} . Consider its rate of decay in the variable v.
Lemma 4.5. As β → ∞, the difference of determinants tends to zero.
Proof. In the following argument all Fredholm determinants are over L 2 (a, ∞). Denote X = A + E β . On the space L 2 (a, ∞), because $\bar\chi_a$ annihilates the identity on L 2 (a, ∞). Consequently, on L 2 (a, ∞), The determinant of Y is $\det(I - X)^2$.
Since $\chi_a$ and $\bar\chi_a$ are projections and commute with M ±β , The operators I − X and I − X * are invertible on L 2 (a, ∞) for sufficiently large β because I − A is invertible there (since $\det(I - A)_{L^2(a,\infty)} = F_1(2^{2/3} a) > 0$) and E β has vanishingly small trace norm as β → ∞. In fact, this means that both the operator norm and the Fredholm determinant of I − X are uniformly bounded away from 0 for sufficiently large β. This implies invertibility of Y on L 2 (a, ∞), and one observes from the above expressions for Y and E that on this space (4.10) In order to compare the determinant of Y + E with that of Y one first conjugates both operators as M −β (Y + E)M β and M −β EM β , and then employs the inequality to deduce that The determinant of Y remains bounded in β by Lemma 4.4.
The trace norm of M −β Y −1 EM β may be bounded using the inequalities ||T 1 T 2 || tr ≤ ||T 1 || op ||T 2 || tr and ||T 1 T 2 || tr ≤ ||T 1 || tr ||T 2 || op . The second follows from the first by taking adjoints and noting that both the operator norm and trace norm are invariant under taking adjoints. Thus, The first operator norm in the last expression above remains bounded for large β as remarked. Since χ a M −2β is a diagonal operator, it has trace norm The operator norm of $M_\beta \bar\chi_a X \chi_a$ is at most $\|M_\beta \bar\chi_a A \chi_a\|_{op} + \|M_\beta \bar\chi_a E_\beta \chi_a\|_{op}$. Observe that The operator norm of the kernel inside the big parentheses is bounded in terms of β because the kernel decays to the order $e^{-\frac{\sqrt{2}}{3} v^{3/2}}$ for large values of v and to the order $e^{-\beta|u|}$ for negative values of u. Since the operator displayed above is a conjugation of $M_\beta \bar\chi_a A \chi_a$ by a translation, it follows that $\|M_\beta \bar\chi_a A \chi_a\|_{op} \le C_a e^{\beta a}$ for some constant C a . Similarly, $M_\beta \bar\chi_a E_\beta \chi_a = e^{\beta a}\, e^{\partial a} \big( M_\beta \bar\chi_0\, e^{-\partial a} E_\beta e^{\partial a}\, \chi_0 \big) e^{-\partial a}$. The operator norm of what sits within the big parentheses is vanishingly small in terms of β by a calculation entirely analogous to that of Lemma 4.4. So in all, $\|M_\beta \bar\chi_a X \chi_a\|_{op} \le C_a e^{\beta a}$ for some constant C a . Therefore, (4.11) implies that for some (new) constant C a , Thus ||M −β Y −1 EM β || tr tends to 0 as required. This section concludes by extending Theorem 1 to arbitrary one-point distributions of h(1, x; 2β|y|), which will be utilized in the proof of Theorem 3 below.
Proposition 4.1. For every a, x ∈ R, as β → ∞ the probability Proof. By Theorem 2, the probability .

Factorization Lemma 4.3 then gives
where X = A + E β . Commutation relation (2)  The relation ∂R = −R∂ now implies the following identities.

4.2. Proof of Theorem 3. We will use an argument by way of the variational principle for the law of the process h(1, x; 2β|y|). An Airy sheet A 2 (x, y) is a random function of real variables x and y defined by the identity Here, $-\infty \cdot \mathbf{1}_{\{z \neq y\}}$ is the narrow wedge at y. The height functions above are all coupled by a "common noise". This noise is naturally present in TASEP and the coupled height functions may be obtained as a joint KPZ scaling limit of TASEPs with different wedge initial data that all move under a common dynamic. See Section 4.5 of [17].
Actually, [17] proves existence of an Airy sheet (due to tightness) but not its uniqueness. Nevertheless, the following properties we use are common to every Airy sheet. Every Airy sheet is continuous, invariant under switching variables, and has the law of the Airy 2 process in each variable when the other is held fixed. Also, the following variational principle applies to any Airy sheet [17,Theorem 4.11] (see also [7]).
Variational formulae like these originate in [15] and are similar to the Lax-Oleinik formula for solutions of Hamilton-Jacobi equations; see [7,26]. Lemma 4.6. An Airy sheet has the following modulus of continuity uniformly over y and x 1 , x 2 with |x 1 − x 2 | ≤ 1.
The notation O p () means a random quantity that is finite with probability one.
Proof. For every fixed y, A 2 (x, y) is an Airy 2 process in x, which satisfies the modulus of continuity estimate stated above by [17,Theorem 4.4]. (The Airy 2 process is Hölder-(1/2 − ε) almost surely.) Thus, the modulus of continuity estimate above holds for every fixed y. By a union bound it then holds uniformly over all rational values of y. By continuity of an Airy sheet, it also holds uniformly over all y.
Using the variational principle and separating the supremum over y ≤ 0 from the supremum over y ≥ 0, one has that h(1, x; 2β|y|) Rewrite (I) by changing variable y → y − β + x and (II) by changing variable y → y +β +x. Then h(1, (2β) −1 x; 2β|y|)−β 2 has the law of max {X 1 (x) − x, X 2 (x) + x}, where − y 2 and (4.12) Now consider X 1 (x) for a fixed value of x. As y → A 2 (x/2β, y) has the law of the Airy 2 process, by the modulus of continuity estimate of Lemma 4.6 (the roles of x and y are now switched) one infers that sup y ∈ β− |x| 2β , β+ |x| 2β As a result, the supremum of A 2 x 2β , y − β + x 2β − y 2 over y ≤ β − x 2β may be replaced by its supremum over y ≤ β with an additive error of order o p (1) as β → ∞ since the supremum on the leftover interval is of order O p (1) − β 2 . (The notation o p (1) denotes a term that converges to zero in probability as β → ∞.) Furthermore, due to the modulus of continuity estimate in Lemma 4.6, the latter supremum may be replaced by the supremum of the process y → A 2 0, y − β over y ≤ β with an additional penalty of o p (1). This is because replacing the x/2β by 0 in the above introduces an additive error of order O p (β −1/4 ). As a result, This same argument implies that X 2 (x) = sup y≥−β A 2 (0, y + β) − y 2 + o p (1) as β → ∞.
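The deterministic skeleton of this argument can be checked numerically. The sketch below assumes the variational formula has the form h(1, x; h 0 ) = sup over y of { A 2 (x, y) − (x − y)² + h 0 (y) }, which is how the supremum above is organised, and then sets the Airy sheet to zero. That substitution is made purely for illustration and discards all randomness. With h 0 (y) = 2β|y| and β ≥ 0 the supremum equals β² + 2β|x|, attained near y = x + β or y = x − β, which explains both the centring by β² and the pair of lines −x and +x after replacing x by x/(2β). The function name and grid parameters are illustrative.

```python
def deterministic_height(x, beta, half_width=50.0, n=200001):
    """Grid supremum over y of -(x - y)^2 + 2*beta*|y| (Airy sheet set to zero)."""
    step = 2.0 * half_width / (n - 1)
    best = float("-inf")
    for k in range(n):
        y = -half_width + k * step
        best = max(best, -(x - y) ** 2 + 2.0 * beta * abs(y))
    return best
```

In this noiseless picture the maximizers sit at distance β on either side of x, so for large β the two branches of the supremum probe far-apart regions of the sheet, which is consistent with the two limiting random variables being independent.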