The Schauder estimate in kinetic theory with application to a toy nonlinear model


Cyril Imbert, Clément Mouhot
Abstract. This article is concerned with the Schauder estimate for linear kinetic Fokker-Planck equations with Hölder continuous coefficients. This equation has a hypoelliptic structure. As an application of this Schauder estimate, we prove the global well-posedness of a toy nonlinear model in kinetic theory. This nonlinear model consists of a nonlinear kinetic Fokker-Planck equation whose steady states are Maxwellian and whose diffusion in the velocity variable is proportional to the mass of the solution.
Keywords: Fokker-Planck equation, hypoelliptic, Schauder estimate, nonlinear kinetic equation.
2020 Mathematics Subject Classification: 35B65, 35Q84, 82C40.
DOI: https://doi.org/10.5802/ahl.75
(*) The authors would like to thank L. Silvestre for fruitful discussions and detailed comments on the previous version of this work. This led in particular to simplifying our use of two definitions of Hölder norms to only one. The second author acknowledges partial funding from the ERC grant MAFRAN.

The Schauder estimate for linear kinetic Fokker-Planck equations
The first part of this paper deals with the Schauder estimate for linear kinetic Fokker-Planck equations of the form
(1.1) $(\partial_t + v \cdot \nabla_x) g = \sum_{i,j=1}^d a^{i,j} \partial^2_{v_i v_j} g + \sum_{i=1}^d b^i \partial_{v_i} g + c\, g + S$
for some given function $S$, under the assumption that the diffusion matrix $A = (a^{i,j}(t,x,v))_{i,j=1,\dots,d}$ satisfies a uniform ellipticity condition for some $\lambda > 0$:
(1.2) $\forall\, \xi \in \mathbb{R}^d, \quad \sum_{i,j=1}^d a^{i,j}(t,x,v)\, \xi_i \xi_j \ge \lambda |\xi|^2 .$
The main result of this article is a Schauder estimate, that is to say an a priori estimate for classical solutions to (1.1) controlling their second-order Hölder regularity (in the sense of a "kinetic order" made precise below) by their supremum norm, under the assumption that the coefficients $a^{i,j}$, $b^i$, $c$ are also Hölder continuous.
Theorem 1.1 (The Schauder estimate). Given $\alpha \in (0,1)$ and $a^{i,j}, b^i, c \in C^\alpha_\ell(\mathbb{R}\times\mathbb{R}^d\times\mathbb{R}^d)$, $i,j = 1,\dots,d$, satisfying (1.2) and a function $S \in C^\alpha_\ell(\mathbb{R}\times\mathbb{R}^d\times\mathbb{R}^d)$, any classical solution $g$ to (1.1) satisfies an estimate whose constant $C$ depends on the dimension $d$, the constant $\lambda$ from (1.2), the exponent $\alpha$, and the $[\cdot]_{C^\alpha_\ell}$ semi-norm and $L^\infty$ norm of $a^{i,j}$ for $i,j = 1,\dots,d$, of $b^i$ for $i = 1,\dots,d$, and of $c$. The semi-norm $[\cdot]_{C^\alpha_\ell}$ is the standard Hölder semi-norm for the distance $\|(t,x,v)\| = |t|^{1/2} + |x|^{1/3} + |v|$.
Remark 1.2. The left-hand side of this estimate can be understood as a Hölder regularity of (kinetic) order $2+\alpha$, according to the specific definition of Hölder spaces $C^\beta_\ell$, $\beta \ge 0$, given in the next section (Definition 2.2). We compare this result to the classical Schauder estimate for parabolic equations in the next subsection.
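As a quick sanity check on the distance used in the theorem, the following sketch verifies that it is homogeneous of degree one under the kinetic scaling $S_r(z) = (r^2 t, r^3 x, r v)$ recalled in the Notation paragraph. The last comment is an illustrative consequence, stated only for variations around the origin.

```latex
% Homogeneity of the kinetic distance under S_r(z) = (r^2 t, r^3 x, r v), r > 0.
\begin{align*}
\|S_r(z)\| &= |r^2 t|^{1/2} + |r^3 x|^{1/3} + |r v| \\
           &= r\,|t|^{1/2} + r\,|x|^{1/3} + r\,|v| \;=\; r\,\|z\| .
\end{align*}
% Illustrative consequence: if |g(z) - g(0)| \le C \|z\|^\alpha for all z,
% then g_r := g \circ S_r satisfies |g_r(z) - g_r(0)| \le C\, r^\alpha \|z\|^\alpha .
```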
Such a Schauder estimate is typically used to obtain well-posedness of nonlinear equations once Hölder estimates on the coefficients have been derived. To illustrate this fact, we consider in the second half of this paper the equation
(1.4) $\partial_t f + v \cdot \nabla_x f = \rho[f]\, \nabla_v \cdot (\nabla_v f + v f), \quad (t,x,v) \in (0,+\infty) \times \mathbb{T}^d \times \mathbb{R}^d,$
for an unknown $0 \le f = f(t,x,v)$, supplemented with the initial condition $f(0,x,v) = f_{in}(x,v)$, where $\rho[f](t,x) := \int_{\mathbb{R}^d} f(t,x,v)\, dv$ and $\mathbb{T}^d$ denotes the $d$-dimensional torus. We emphasize that studying (1.4) with $x \in \mathbb{T}^d$ is equivalent to studying it with $x \in \mathbb{R}^d$ and periodic initial data. The known a priori estimates that are preserved in time for this equation are $L^1(\mathbb{T}^d \times \mathbb{R}^d)$ and $C_1 \mu \le f \le C_2 \mu$, where $\mu$ denotes the Gaussian $(2\pi)^{-d/2} e^{-|v|^2/2}$. They are not sufficient to derive uniqueness or to bootstrap higher regularity. The Schauder estimate from Theorem 1.1, together with the Hölder regularity from [GIMV19] (see Theorem 4.3), allows us to prove global well-posedness of Eq. (1.4) in Sobolev spaces. In the following statement $H^k(\mathbb{T}^d \times \mathbb{R}^d)$ denotes the standard $L^2$-based Sobolev space.
Theorem 1.3 (Global well-posedness for a toy nonlinear model). Given two constants $0 < C_1 \le C_2$, let $f_{in}$ be such that $f_{in}/\sqrt{\mu} \in H^k(\mathbb{T}^d \times \mathbb{R}^d)$ with $k > 2 + d/2$ and satisfying $C_1 \mu \le f_{in} \le C_2 \mu$. There then exists a unique global-in-time solution

Schauder estimates for kinetic equations
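The Schauder estimate for solutions $g(t,v)$ to parabolic equations of the form
(1.5) $\partial_t g = \sum_{i,j=1}^d a^{i,j} \partial^2_{v_i v_j} g + \sum_{i=1}^d b^i \partial_{v_i} g + c\, g + S$
classically controls the Hölder semi-norms $[\partial_t g]_{C^\alpha}$ and $[D^2_v g]_{C^\alpha}$ in terms of $[S]_{C^\alpha}$ and $\|g\|_{L^\infty}$, where $C^\alpha(\mathbb{R} \times \mathbb{R}^d)$ denotes the classical Hölder space with respect to the parabolic distance $\|(t,v)\| = |t|^{1/2} + |v|$. This distance accounts for the parabolic scaling $(t,v) \mapsto (r^2 t, r v)$.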
Because we work with kinetic Fokker-Planck equations, the usual parabolic scaling is replaced with the kinetic scaling $(t,x,v) \mapsto (r^2 t, r^3 x, r v)$. Moreover, parabolic equations of the form (1.5) are translation invariant while kinetic Fokker-Planck equations of the form (1.1) are translation invariant in the space variable $x$ but not in the velocity variable $v$; the latter invariance is replaced by Galilean invariance. As already noticed for instance in [GIMV19,Pol94], these two facts (kinetic scaling and Galilean invariance) naturally require new definitions of cylinders, of the order of polynomials and of Hölder continuity. We therefore define (see Definition 2.2) the space $C^\beta_\ell$ to be the set of functions whose difference with a suitable polynomial of (kinetic) degree smaller than $\beta$ decays at rate $r^\beta$ in a cylinder of radius $r > 0$. Following [IS21], one can define the kinetic degree that follows the kinetic scaling as 2(degree in t) + 3(degree in x) + (degree in v). The subscript $\ell$ stands for "left" since the transformations leaving the equation invariant are applied to the left, see the next section.
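The two invariances mentioned above can be checked by hand on the Kolmogorov equation $(\partial_t + v\cdot\nabla_x) g = \Delta_v g + S$. The sketch below assumes the composition law $z_0 \circ z = (t_0 + t,\ x_0 + x + t v_0,\ v_0 + v)$; this law is an assumption of the sketch (the definitions (2.1)-(2.2) are not reproduced here), although it is consistent with the relations between the points $z_0, \dots, z_4$ used in the proof of Lemma 2.8.

```latex
% (a) Kinetic scaling: g_r(t,x,v) := g(r^2 t, r^3 x, r v) = (g \circ S_r)(t,x,v).
% Here (v\cdot\nabla_x g)\circ S_r denotes the function z \mapsto v\cdot\nabla_x g(z)
% evaluated at S_r(z), i.e. with velocity r v.
\begin{align*}
\partial_t g_r      &= r^2\,(\partial_t g)\circ S_r, \\
v\cdot\nabla_x g_r  &= r^3\, v\cdot(\nabla_x g)\circ S_r
                     = r^2\,(v\cdot\nabla_x g)\circ S_r, \\
\Delta_v g_r        &= r^2\,(\Delta_v g)\circ S_r, \\
\text{hence}\quad (\partial_t + v\cdot\nabla_x)\, g_r &= \Delta_v g_r + r^2\, S\circ S_r .
\end{align*}

% (b) Galilean invariance: h(t,x,v) := g(z_0 \circ z) = g(t_0 + t,\ x_0 + x + t v_0,\ v_0 + v).
\begin{align*}
\partial_t h       &= (\partial_t g)(z_0\circ z) + v_0\cdot(\nabla_x g)(z_0\circ z), \\
v\cdot\nabla_x h   &= v\cdot(\nabla_x g)(z_0\circ z), \\
\Delta_v h         &= (\Delta_v g)(z_0\circ z), \\
\text{hence}\quad (\partial_t + v\cdot\nabla_x)\, h
                   &= \big[\partial_t g + (v_0 + v)\cdot\nabla_x g\big](z_0\circ z)
                    = \Delta_v h + S(z_0 \circ z) .
\end{align*}
```

In (b) the extra term $v_0 \cdot \nabla_x g$ produced by the time derivative is exactly what is needed to rebuild the transport operator at the shifted velocity $v_0 + v$; a plain translation in $v$ would not preserve the equation.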
There exists a well-developed literature on Schauder estimates for ultraparabolic equations, e.g. [Man97,Pol94], and for so-called Kolmogorov or Hörmander type equations, see e.g. [BB07,DFP06,Lun97,Rad08] and references therein. Some of these large classes of equations include the linear kinetic Fokker-Planck equations of the form (1.1). Moreover, (1.1) is already considered in [HS20]. However, in all these works, either the choice of Hölder spaces is not appropriate to the study of kinetic equations, or the assumptions on the coefficients are too strong, or the estimate is too weak.
In [HS20], the authors use the same natural Hölder spaces $C^\alpha_\ell$ for $\alpha \in (0,1)$ but make other choices for higher exponents. For instance, following the aforementioned classical Schauder estimate for parabolic equations, the semi-norm $[\cdot]_{2+\alpha,Q}$ in [HS20] equals the sum of $[D^2 u]_{\alpha,Q}$, $[\partial_t u]_{\alpha,Q}$ plus an additional semi-norm controlling x-variations. Such a choice can be compared with the equivalence of norms discussed in Remark 2.9. The authors of [HS20] explain their choice by the fact that the natural Schauder estimate (in the spirit of Theorem 1.1) only provides a regularity in the $x$ variable of order $(2+\alpha)/3 < 1$ while they aim at reaching complete smoothing by bootstrap. Note that such a bootstrap can nevertheless be achieved in our spaces $C^\beta_\ell$, but this requires working with difference derivatives in $x$ of fractional order $(2+\alpha)/3$ each time the Schauder estimate is applied. Due to the technical length of this argument when providing full details, and in order to keep this paper concise, we defer this higher regularity bootstrap on the nonlinear model (1.4) to a future study, and also note that such bootstrap techniques are also being implemented for the Boltzmann equation in [IS19].
We also remark that our choice of norms, which measure regularity by estimating the oscillations of a Taylor expansion remainder, is related to the proof of the Schauder estimate. Indeed, we adapt the argument by Safonov [Saf84] in the parabolic case, explained in Krylov's book [Kry96]. In the latter argument, the oscillation of the remainder of the second-order Taylor expansion of the solution is shown to decay at rate $r^{2+\alpha}$ in a cylinder of radius $r$, and a corrector is added to the second-order Taylor polynomial to account for the contribution of the source term at large distance. Compared with the parabolic argument in [Kry96], the main conceptual difference is in the proof of the gradient bound, see Proposition 3.1. We combine Bernstein's method, as in [Kry96], with ideas and techniques borrowed from hypocoercivity theory [Vil09].

Motivation and background for the toy model
Equation (1.4) describes the evolution of the probability density function f of particles. The free transport translates the fact that the variable v is the velocity of the particle at position x at time t. The operator ρ[f ]∇ v · (∇ v f + vf ) takes into account the interaction between particles. The diffusion coefficient ρ[f ] is proportional to the total mass of particles lying at x at time t: diffusion is strong in regions where local density is large and weak in regions where local density is small. The diffusion The collision operator in equation (1.4) corresponds to the (much simpler) case where coefficients are given by In the case of the Landau equation, both coefficients are defined by integral quantities involving the solution. Our simplified toy model (1.4) replaces these convolutions crudely by their averages and neglects the issues of the various positive or negative moments at large velocities. This explains the factor ρ[f ]. Our simplified toy model also shares the same Gaussian steady state as the Landau collision operator.
This simplification respects the principle at the source of nonlinearity in bilinear collision operators: the amount of collisions at a point is related to the local density of particles. Note that replacing $\rho[f]$ by another $v$-moment of the solution, or even having different $v$-moments in front of the diffusion and drift terms respectively, could most likely be treated by variants of the method developed in this paper. It is also likely that replacing $\rho[f]$ by $F(\rho[f])$, where $F : \mathbb{R}^*_+ \to \mathbb{R}^*_+$ is a smooth nonlinear map, could be treated by variants of the methods in this paper.
The model (1.4) was also studied in [KL06] (see [KL06, equation (9)], when keeping only mass conservation), where the authors show how its spatially homogeneous version arises as a mean-field limit of an N-particle Markov process in the spirit of Kac's process [Kac56]. It is also related to the gallery of nonlinear Fokker-Planck models discussed for instance in [Cha08].
A recent line of research consists in extending methods from the elliptic and parabolic theories to kinetic equations. Silvestre in particular made key progress on the Boltzmann equation without cut-off in [Sil16], and together with the first author later obtained a local Hölder estimate in [IS20] and a Schauder estimate for a class of kinetic equations with integral fractional diffusion in [IS21]. In parallel, a similar program was initiated for the Landau equation in [GIMV19], following up on an earlier result in [WZ09]. The local Hölder estimate is obtained in [GIMV19,WZ09] for essentially bounded solutions, the Harnack inequality is proved in [GIMV19] and some Schauder estimates are derived in [HS20]. The results contained in the present article are part of this emerging trend in kinetic theory.

Strategy of proof for global well-posedness
We explain here the various ingredients used in the proof of the global well-posedness of equation (1.4). The proof proceeds in three steps.
(1) First, the maximum principle implies that if the initial datum $f_{in}$ lies between $C_1 \mu$ and $C_2 \mu$, then the corresponding solution $f$ to (1.4) satisfies the same property for as long as it exists: $C_1 \mu(v) \le f(t,x,v) \le C_2 \mu(v)$ (see Lemma 4.1). In particular, this ensures that the solution $f$ has fast decay at large velocities and that the diffusion coefficient $\rho[f]$ satisfies $C_1 \le \rho[f](t,x) \le C_2$ for all $(t,x)$. Therefore, the equation satisfies the uniform ellipticity condition in $v$ as stated in (1.2).
(2) We deduce from the bound on ρ[f ] that the solution f satisfies an equation of the form for a symmetric real matrix A whose eigenvalues all lie in [C 1 , C 2 ]. In particular, we can use the local Hölder estimate from [GIMV19,WZ09], see Theorem 4.3 from Subsection 4.2. The decay estimate from Step 1 and the Hölder regularity C α 0 for some small α 0 are then combined with the Schauder estimate from Theorem 1.1 to derive a higher-order Hölder estimate in C 2+α 0 (see Proposition 4.4).
(3) With such a higher-order Hölder estimate at hand, we next study how Sobolev norms in $x$ and $v$ grow as time increases and we derive a continuation criterion (in the same spirit as the Beale-Kato-Majda blow-up criterion [BKM84]). We then prove that blow-up is prevented by the $C^{2+\alpha_0}_\ell$ Hölder estimate from Step 2. This finally yields global well-posedness of equation (1.4) in Sobolev spaces.
It is worth mentioning that a conditional global smoothing effect for the Landau equation with moderately soft potentials has been recently obtained in [HS20] by combining the ingredients listed in Steps 1 to 3 above. Moreover, establishing such a global smoothing effect is in progress for the Boltzmann equation without cut-off with moderately soft potentials [IS19]. These works however assume a priori that some quantities such as mass, energy and entropy densities remain under control along the flow. The a priori assumption is necessary to prove, among other things, that the equations enjoy some uniform ellipticity and to establish good decay in the velocity variable.
The interest of the toy nonlinear model (1.4) lies in the fact that it is a nontrivial and physically relevant model for which unconditional global well-posedness can be proven following such a programme. The main simplification of our model compared with the Landau equation is the lack of local conservation of momentum and energy; therefore the fluid dynamics on the local density, momentum and energy fields reduces to the heat flow in the fluid limit and avoids the difficulties of the Euler and Navier-Stokes dynamics. The results of this paper hence provide one more hint that the formation of singularities, if any, in the Cauchy problem for nonlinear kinetic equations is likely to come from (1) fluid mechanics, or (2) issues with the decay at large velocities.
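The elementary computations behind Step 1 can be sketched as follows, assuming only the notation $\mu(v) = (2\pi)^{-d/2} e^{-|v|^2/2}$ and $\rho[f] = \int f \, dv$ from the introduction. The identification of coefficients in the last comment is an illustrative reading, not a statement taken from the paper's proofs.

```latex
% (i) The Gaussian is a steady state of the collision operator:
\[
\nabla_v \mu + v\,\mu = 0
\quad\Longrightarrow\quad
\rho[\mu]\,\nabla_v\cdot(\nabla_v \mu + v\,\mu) = 0 .
\]
% (ii) Integrating the pointwise bounds in v, using \int \mu\, dv = 1:
\[
C_1\,\mu \le f \le C_2\,\mu
\quad\Longrightarrow\quad
C_1 \le \rho[f](t,x) = \int_{\mathbb{R}^d} f(t,x,v)\,dv \le C_2 .
\]
% (iii) Expanding the collision operator in non-divergence form:
\[
\rho[f]\,\nabla_v\cdot(\nabla_v f + v f)
 = \rho[f]\,\Delta_v f + \rho[f]\,v\cdot\nabla_v f + d\,\rho[f]\,f .
\]
% Reading (iii) against (1.1): diffusion matrix \rho[f]\,\mathrm{Id}, drift \rho[f]\,v,
% zeroth-order coefficient d\,\rho[f]; by (ii), (1.2) holds with \lambda = C_1.
```

Note that the drift $\rho[f]\, v$ read off from (iii) is unbounded in $v$; this sketch does not address how that growth is handled.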

Perspectives
We conclude the introduction by mentioning that the well-posedness result for the toy nonlinear model can be improved in two directions. First, more general initial data could be considered by constructing solutions directly in our Hölder spaces rather than mixing Hölder and Sobolev spaces. Second, we previously mentioned that C ∞ regularization is expected for positive times by applying iteratively the Schauder estimates.

Organisation of the article
Section 2 is devoted to the definition of Hölder spaces. The Schauder estimate from Theorem 1.1 is proved in Section 3. We prove Theorem 1.3 in the final Section 4 by constructing local solutions to the non-linear equation (1.4) in Sobolev spaces and by using the Schauder estimate to extend these solutions globally in time.

Notation
We collect here the main notations for the convenience of the reader.

Euclidian space and torus
The d-dimensional Euclidian space is denoted by R d and the d-dimensional torus by T d . Throughout this article, the space variable x belongs to R d , except in Section 4 where x ∈ T d .

Multi-indices
The order of $m \in \mathbb{N}^d$ is $|m| := m_1 + \dots + m_d$. Given a vector $x \in \mathbb{R}^d$ and $m \in \mathbb{N}^d$, we denote $x^m := x_1^{m_1} \cdots x_d^{m_d}$.

Balls and cylinders
B r denotes the open ball of R d of radius r centered at the origin. Q r (z 0 ) denotes a cylinder in R×R d ×R d centered at z 0 of radius r following the kinetic scaling, see (2.2). Q r simply denotes Q r ((0, 0, 0)). The scaled variable is S r (z) := (r 2 t, r 3 x, rv) for z = (t, x, v), see (2.1). Radii of cylinders are denoted by r. Unless further constraints are stated, this radius is an arbitrary positive real number. It is sometimes restricted by the fact that a cylinder should not leak out of the domain of study of the equation (in particular in time), sometimes chosen to be 1 or 2, or multiplied by a given constant, e.g. 3/2 or K + 1.
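To make the cylinders concrete, here is a sketch under two assumptions: the centered cylinder is $Q_r = (-r^2, 0] \times B_{r^3} \times B_r$ (as recalled later before Corollary 3.3), and the cylinder centered at $z_0 = (t_0, x_0, v_0)$ is obtained by left composition with $z_0$, with the composition law $z_0 \circ z = (t_0 + t,\ x_0 + x + t v_0,\ v_0 + v)$. The composition law is hypothetical here since (2.1)-(2.2) are not reproduced.

```latex
% Centered cylinder and its scaling:
\[
Q_r = (-r^2, 0] \times B_{r^3} \times B_r = S_r(Q_1),
\qquad S_r(t,x,v) = (r^2 t,\ r^3 x,\ r v).
\]
% Cylinder centered at z_0 (assumed definition Q_r(z_0) := \{ z_0 \circ z : z \in Q_r \}):
\[
Q_r(z_0) = \Big\{ (t,x,v) \;:\; t_0 - r^2 < t \le t_0,\ \
 |x - x_0 - (t - t_0)\,v_0| < r^3,\ \ |v - v_0| < r \Big\}.
\]
```

The slanted constraint on $x$ reflects free transport at velocity $v_0$: the spatial section of the cylinder follows the characteristic through $z_0$.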

Constants
We use the notation $g_1 \lesssim g_2$ when there exists a constant $C > 0$ independent of the parameters of interest such that $g_1 \le C g_2$. We analogously write $g_1 \gtrsim g_2$. We sometimes use the notation $g_1 \lesssim_\delta g_2$ if we want to emphasize that the implicit constant depends on some parameter $\delta$.

Hölder spaces and exponents
Given an open set $Q$, $C^\alpha_\ell(Q)$ denotes the set of $\alpha$-Hölder continuous functions in $Q$, see Definition 2.2. The subscript $\ell$ refers to "left"; it is not a parameter. The letter $\alpha$ denotes an arbitrary positive exponent. This exponent will be fixed to some value $\alpha_0$ in Section 4 after applying the local Hölder estimate from [GIMV19,WZ09] recalled in Subsection 4.2.

Kolmogorov operator
The Green function of the operator L K :

The Green function
Consider the Kolmogorov equation with a given source term S. The Green function G of the operator L K : Proposition 2.1 (Properties of the Green function).
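For orientation, the classical fundamental solution of the Kolmogorov operator $L_K = \partial_t + v \cdot \nabla_x - \Delta_v$ with pole at the origin is sketched below. The sign and normalization conventions are assumptions of this sketch (they correspond to $L_K G = \delta_0$) and may differ from those used in Proposition 2.1.

```latex
% Classical Kolmogorov kernel (convention assumed: L_K G = \delta_{(0,0,0)}).
\[
G(t,x,v) =
\begin{cases}
\left( \dfrac{\sqrt{3}}{2\pi t^2} \right)^{\! d}
\exp\!\left( -\dfrac{3\,\big|x - \tfrac{t}{2} v\big|^2}{t^3} - \dfrac{|v|^2}{4t} \right)
 & \text{if } t > 0, \\[2ex]
0 & \text{if } t \le 0 .
\end{cases}
\]
% Two properties that can be read off the formula:
\[
\int_{\mathbb{R}^d \times \mathbb{R}^d} G(t,x,v)\,dx\,dv = 1 \quad (t > 0),
\qquad
G(S_r z) = r^{-4d}\, G(z) \quad (r > 0).
\]
```

The exponent $4d$ in the scaling identity is the homogeneous dimension $3d + d$ of the $(x,v)$ variables under the kinetic scaling.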

Hölder spaces
We now introduce Hölder spaces similar to that in [IS21].
In particular, constants have zero kinetic degree. The kinetic degree deg kin p of a polynomial p ∈ R[t, x, v] is defined as the largest kinetic degree of the monomials m j appearing in p.
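A few worked instances of the kinetic degree may help; they use the convention recalled in the introduction, 2(degree in t) + 3(degree in x) + (degree in v), and the multi-index notation $a, b \in \mathbb{N}^d$ introduced here for illustration.

```latex
% Kinetic degree of a monomial:
\[
\deg_{\mathrm{kin}}\big( t^{k}\, x^{a}\, v^{b} \big) = 2k + 3|a| + |b|,
\qquad k \in \mathbb{N},\ a, b \in \mathbb{N}^d .
\]
% Examples:
\[
\deg_{\mathrm{kin}} 1 = 0, \quad
\deg_{\mathrm{kin}} v_i = 1, \quad
\deg_{\mathrm{kin}} t = \deg_{\mathrm{kin}} v_i v_j = 2, \quad
\deg_{\mathrm{kin}} x_i = \deg_{\mathrm{kin}} t\,v_i = 3 .
\]
% Consequently, polynomials of kinetic degree at most 2 are spanned by
%   1, \; v_i, \; t, \; v_i v_j \qquad (1 \le i \le j \le d),
% and each monomial m is homogeneous under the kinetic scaling:
\[
m(S_r z) = r^{\deg_{\mathrm{kin}} m}\; m(z) .
\]
```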
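Given an open set $Q \subset \mathbb{R} \times \mathbb{R}^d \times \mathbb{R}^d$ and $\beta > 0$, we say that a function $g : Q \to \mathbb{R}$ is $\beta$-Hölder continuous at a point $z_0 \in Q$ if there exist a polynomial $p \in \mathbb{R}[t,x,v]$ with $\deg_{kin} p < \beta$ and a constant $C > 0$ such that
(2.5) $\forall\, r > 0, \quad \|g - p\|_{L^\infty(Q_r(z_0) \cap Q)} \le C r^\beta .$
If this property holds true for all $z_0 \in Q$, the function $g$ is $\beta$-Hölder continuous in $Q$ and we write $g \in C^\beta_\ell(Q)$. The smallest constant $C$ such that the property (2.5) holds true for all $z_0 \in Q$ is denoted by $[g]_{C^\beta_\ell(Q)}$.
(1) For $\beta \in (0,1)$, the semi-norm $[\cdot]_{C^\beta_\ell}$ is equivalent to the standard Hölder semi-norm $[\cdot]_{C^\beta}$ for the distance $\|(t,x,v)\| = |t|^{1/2} + |x|^{1/3} + |v|$.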
(2) For a non-zero integer $k \in \mathbb{N}$, the spaces $C^k_\ell$ differ from the usual $C^k$ spaces, in that the highest-order derivatives are not continuous but merely $L^\infty$. For instance $C^1_\ell$ functions are Lipschitz continuous in $v$ but not continuously differentiable in $v$.
This inequality justifies the fact that Definition 2.2 above coincides with [IS21, Definition 2.3] and that the semi-norms only differ by a factor $4^\beta$.

Second order Taylor expansion
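When $\beta = 2+\alpha \in (2,3)$, we now prove that the polynomial $p$ realizing the infimum in the $C^\beta_\ell$ semi-norm is the Taylor expansion of kinetic degree 2:
(2.6) $T_{z_0}[g](z) := g(z_0) + (t - t_0)\,[\partial_t g + v_0 \cdot \nabla_x g](z_0) + (v - v_0) \cdot \nabla_v g(z_0) + \tfrac12 (v - v_0)^T D^2_v g(z_0)\, (v - v_0) .$
Remark that the linear part in $x$ does not appear since it is of kinetic degree 3.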
We recall and denote where P denotes the set of polynomials of kinetic degree smaller than or equal to 2.

ANNALES HENRI LEBESGUE
Proof. -First reduce to z 0 = 0 by the change of variables g (z) := g(z 0 • z). We continue however to simply call the function g. We start by proving the first inequality. One needs to identify the minimizer Using the latter and testing for v = 0 and |t| = r 2 k+1 gives Testing for t = 0 and summing v and −v with |v| = r k+1 in all directions gives Finally by difference and testing with t = 0 and all directions of |v| = r k , one gets This shows that the coefficients are converging with These convergences and estimates imply that and M ∞ = D 2 v g(0, 0, 0) (and proves the existence of such derivatives). We thus proved that where the constant does not depend on k. This in turn implies that same inequality for any r > 0, with a constant at most multiplied by 2, which concludes the proof since ε is arbitrarily small.
The proof of the second inequality [g] C 2+α (Q) [g] C 2+α , 0 (Q) then follows from the existence of the derivatives appearing in T 0 [g] showed in the previous step, and the fact that T 0 [g] is of kinetic degree strictly smaller than 2 + α.

Interpolation inequalities
Interpolation inequalities are needed in the proof of the Schauder estimate.
Lemma 2.5 (Interpolation inequalities). -Let Q int , Q ext be two cylinders of the form Q ρ (z c ) and Q R (z c ) with either ρ < R or ρ = R = +∞. There exist C, β > 0 depending only on d, α ∈ (0, 1) (and R − ρ if R is finite) such that for any ε ∈ (0, 1) and any g ∈ C 2+α (Q ext ), and define k ∈ {0, 1, 2} the kinetic order of the differential operator D, i.e. k = 0 for D = Id, k = 1 for D = ∇ v and k = 2 for D = D 2 v and D = ∂ t + v · ∇ x (this is the kinetic degree of the polynomial naturally associated with the differential operator). Then given for β ∈ {0, α}, and provided that k +β < 2+α, we have where constant C only depends on dimension d, α and · L ∞ (Q) . Observe that the restrictions imposed on k, β yield the same inequalities as in Lemma 2.5. The two key steps in [IS21, Lemma 2.7] are the proof of the general interpolation inequality and inequalities relating the Hölder semi-norms of derivatives of a given function to its (higher order) semi-norm, as in (2.9)-(2.10) but more general.
Remark 2.7. -It is possible to get rid of the condition ρ < R when R < +∞ but this requires substantial modifications. In the following proof, a global estimate (ρ = R = +∞) is derived and a local "interior" one is easily obtained by a localization procedure. Since reaching ρ = R < +∞ is irrelevant for this work and the proof below is different from the one contained in [IS21], we believe it can be useful to restrict ourselves to this special case.
Proof of Lemma 2.5. -It is sufficient to prove the interpolation inequalities (2.7)-(2.8) in the case where Q int = Q ext = R 2d+1 . Indeed, it is then sufficient to apply global estimates to g = gφ for some cut-off function φ such that 0 φ 1, φ = 1 in Q int and φ = 0 outside Q ext to get the local ones. Indeed, in the case of (2.7), where in the case 0 < ρ < R < +∞, the constant C ∼ (R − ρ) −2−α depends on the difference of radii. This localization is however not used in the proof of (2.9)-(2.10) because it would add a term g L ∞ (Qext) on the right hand side, which we want to avoid for the application of Remark 2.9.
Since we now work in the whole space $\mathbb{R}^{2d+1}$, we do not specify the domain of function spaces in the remainder of the proof.
Step 1. -L ∞ bounds on derivatives. One can perform the same argument as in the proof of the preceding Lemma 2.4 with ε = g L ∞ , a 0 = 0, b 0 = 0, M 0 = 0 as admissible first terms in the sequence to get for any z 0 ∈ Q ext , which yields (2.14) Step 2. -Control of lower-order Hölder norms. Using the previous L ∞ bounds and Lemma 2.4, which yields (2.7) with ε = 1.
Step 3. -Hölder regularity of first-order derivatives in v.
We are left with proving (2.8), that is to say that for all z 0 ∈ R 2d+1 and z 1 ∈ Q r (z 0 ), we have In view of (2.13), it is enough to consider r ∈ (0, 1]. Define for z ∈ R 2d+1 and u ∈ S d−1 and r ∈ (0, 1], We make two observations about this function. First there exists θ = θ(z, r, u) ∈ (0, 1) such that , z 1 ∈ Q r (z 0 ) and u ∈ S d−1 be fixed, and δ > 0 to be chosen later. There then exists z 0 ∈ Q δr (z 0 ) and z 1 ∈ Q δr (z 1 ) such that Then we can write, using Lemma 2.4 to get the last inequality, . Using the fact that r 1 and taking the supremum over z 0 , z 1 , u, we get, . We can now pick δ > 0 such that 2δ α = 1 2 and conclude that (2.8) holds true.
Let z 0 ∈ R 2d+1 , z 1 ∈ Q r (z 0 ) and u ∈ S d−1 be fixed, and δ > 0 to be chosen later. There then exists z 0 ∈ Q δr (z 0 ) and z 1 ∈ Q δr (z 1 ) such that Then we can write, using Lemma 2.4 to get the last inequality, Using the fact that r 1 and taking the supremum over z 0 , z 1 , u, we get, which proves the inequality by choosing 2δ α = 1 2 .
Step 5. -Hölder regularity of first-order transport derivative. The proof of (2.10) follows the same strategy: we prove, denoting Y := ∂ t + v · ∇ x , for all z 0 ∈ R 2d+1 and z 1 ∈ Q r (z 0 ) with r ∈ (0, 1] Define for z ∈ R 2d+1 and r ∈ (0, 1], This function satisfies for some θ = θ(z, r) ∈ (0, 1) Let z 0 ∈ R 2d+1 , z 1 ∈ Q r (z 0 ), and δ > 0 to be chosen later. Then for some z 0 ∈ Q δr (z 0 ), z 1 ∈ Q δr (z 1 ): Then we can write, using Lemma 2.4 to get the last inequality, Using the fact that r 1 and taking the supremum over z 0 , z 1 , u, we get, which proves the inequality by choosing 2δ α = 1 2 . This achieves the proof of the Lemma 2.5. We will see below that (2.9) and (2.10) can be combined with the hypoelliptic estimate contained in Lemma 2.8 below to derive an equivalent semi-norm for the kinetic Hölder space C 2+α , see Remark 2.9.

A hypoelliptic estimate
We investigate here how to recover the fact that a given function g lies in C 2+α only knowing that its free transport and its velocity second order derivatives lie in C α . We remark that this fact is (almost) a consequence of the Schauder estimate contained in Theorem 1.1 (as a matter of fact, it is closer to the local version of this result, see Theorem 3.9). However we do not need here to control the L ∞ norm of g. The proof of the following lemma illustrates the hypoelliptic structure of the Hölder spaces C β and is consequently of independent interest. Lemma 2.8 (Hypoelliptic Hölder estimate). -Let Q be an arbitrary cylinder and g ∈ C 2+α (R 2d+1 ) then for some constant C only depending on α.
Remark 2.9. By combining (2.15) with (2.9) and (2.10), we deduce that ) is a semi-norm equivalent to $[\cdot]_{C^{2+\alpha}_\ell(\mathbb{R}^{2d+1})}$. In other words, although the semi-norm $[\cdot]_{C^{2+\alpha}_\ell(\mathbb{R}^{2d+1})}$ is defined by measuring oscillations around higher-order polynomials, in the whole space it can be recovered with the more classical notion of Hölder regularity along the highest-order derivatives. We do not know if this equivalence is true in a domain, as in
Proof. It is convenient to write $Y$ for $(\partial_t + v \cdot \nabla_x)$. Moreover, since the domain is the whole space $\mathbb{R}^{2d+1}$, we do not specify the domain in function spaces. Let $z_0 \in \mathbb{R}^{2d+1}$ and $r > 0$. We consider in particular h = (1, u, v) ∈ Q 1 and we have $z_0 \circ S_r(h) \in Q_r(z_0)$.
Proving (2.15) for variations along free streaming is easy: for all z 0 ∈ R 2d+1 and r > 0, we have The proof of (2.15) for variations along the v variable is also straightforward: for all z 0 ∈ R 2d+1 and r > 0, We then prove (2.15) but only for variations along x: we prove that for all z 0 ∈ R 2d+1 and u ∈ B 1 and r > 0, for some constant C only depending on α. This is where a hypoelliptic argument is used: ∇ x is realized as the Lie bracket of ∇ v and Y , along trajectories. Let z 1 denote z 0 • (0, r 3 u, 0) and z 2 = z 1 • (0, 0, ru) and z 3 = z 2 • (−r 2 , 0, 0) and z 4 = z 3 • (0, 0, −ru). Remark that z 0 = z 4 • (r 2 , 0, 0). Notice that all points remain in Q r (z 0 ). We now write

TOME 4 (2021)
We now use (2.16) with z 0 replaced successively by z 2 and z 4 and (2.17) with z 0 replaced successively by z 3 and z 2 , and we get, after summing the four resulting inequalities, We are going to prove that for all R > 0, In order to derive this estimate, we approximate ∇ v g(z) · u by the finite difference σ 3 R, u [g](z) we already used above. Recall that Remark that (2.17) implies Moreover, recalling that z 3 = z 2 • (−r 2 , 0, 0) and using (2.16) twice,

We now combine (2.19) and (2.20) and get We now pick R = δ −1 r with δ 1−α = 1/2 and get (2.18) for some constant only depending on α. This concludes the proof.
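The four-point path used in the proof above can be checked directly. The sketch assumes the composition law $z_1 \circ z_2 = (t_1 + t_2,\ x_1 + x_2 + t_2 v_1,\ v_1 + v_2)$, which is not restated in this section but reproduces the relation $z_0 = z_4 \circ (r^2, 0, 0)$ quoted in the proof.

```latex
% Path realizing a displacement r^3 u in x through moves along v and along the transport Y.
% Assumed law: z_1 \circ z_2 = (t_1 + t_2,\ x_1 + x_2 + t_2 v_1,\ v_1 + v_2), with z_0 = (t_0, x_0, v_0).
\begin{align*}
z_1 &= z_0 \circ (0, r^3 u, 0) = (t_0,\ x_0 + r^3 u,\ v_0), \\
z_2 &= z_1 \circ (0, 0, r u)   = (t_0,\ x_0 + r^3 u,\ v_0 + r u), \\
z_3 &= z_2 \circ (-r^2, 0, 0)  = (t_0 - r^2,\ x_0 + r^3 u - r^2 (v_0 + r u),\ v_0 + r u) \\
    &= (t_0 - r^2,\ x_0 - r^2 v_0,\ v_0 + r u), \\
z_4 &= z_3 \circ (0, 0, -r u)  = (t_0 - r^2,\ x_0 - r^2 v_0,\ v_0), \\
z_4 \circ (r^2, 0, 0) &= (t_0,\ x_0 - r^2 v_0 + r^2 v_0,\ v_0) = z_0 .
\end{align*}
% Infinitesimal counterpart, with Y = \partial_t + v\cdot\nabla_x:
\[
[\partial_{v_i}, Y] = \partial_{v_i} Y - Y \partial_{v_i} = \partial_{x_i} .
\]
```

The shift $r^3 u$ in $x$ is cancelled exactly by transporting for a time $r^2$ at the velocity increment $r u$, which is why a variation in $x$ costs $r^3$ in the kinetic scaling.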

The Schauder estimate
This section is devoted to the proof of the Schauder estimate (Theorem 1.1). The proof proceeds in two main steps. In the first step, the matrix $A$ is constant: the estimate is first obtained when $A$ is the identity matrix (Theorem 3.5), then for an arbitrary constant diffusion matrix (Corollary 3.7). In the second step, the estimate is established for variable coefficients by the procedure of freezing coefficients, thanks to interpolation inequalities.

A gradient bound for the Kolmogorov equation
We first establish a gradient bound for solutions of (1.1) when A is the identity matrix and when there is no lower order terms (b = 0, c = 0). The equation is then reduced to (2.3). Recall that Q 1 = (−1, 0] × B 1 × B 1 is the cylinder of radius 1. (∂ t + v · ∇ x )g = ∆ v g + S. Then [Bau17,GW12] for related gradient estimates.
Proof. -We use Bernstein's method as Krylov does in [Kry96] in the ellipticparabolic case, combined with methods from hypocoercivity theory (see for instance [Vil09]) in order to control the full (x, v)-gradient of the solution: see the construction of the quadratic form w in ∂ x i g and ∂ v i g below.
Denote the Kolmogorov operator L K g := ∂ t g + v · ∇ x g − ∆ v g and compute the following defaults of distributivity of the operator (reminiscent of the so-called Γ-calculus [BÉ85]) Consider a cut-off function 0 ≤ ζ ∈ C ∞ such that ζ 1/2 ∈ C ∞ , with support in (−1, 0] × B 1 × B 1 and such that ζ(0, 0, 0) = 1. In order to estimate the gradient of g in x and v at the origin, it is enough to find ν 0 , ν 1 > 0 and 0 < a ≤ b and 0 < c < ab such that, for any i ∈ {1, . . . , d}, −L K w ≥ 0. Indeed, the maximum principle for parabolic equations then implies that sup Since ζ ≡ 0 in ∂ p Q 1 and ζ(0, 0, 0) = 1, we get (iv) Compute fourth, for some ε 1 > 0, the term −L K (ζ 2 (∂ v i g) 2 ) using and (3.1): (v) Compute fifth, for some ε 2 > 0, the term −L K [ζ 3 (∂ x i g)(∂ v i g)], with the intermediate step To tidy the calculations slightly, observe that (1) error terms involving S are controlled by choosing ν 1 large enough, i.e. larger than a multiple of terms involving |∇ v g| 2 or ∇ v g ·∇ x g are controlled by choosing ν 0 large enough thanks to (3.3), with g controlled by its sup norm, (3) Equation (3.3) is "free" (i.e. not involved in any constant dependency) as well as equation (3.5) by choosing ε 1 small enough so that the term − ε 1 ζ 3 (∂ x i g) 3 is compensated by the positive term in (3.6), (4) Equation (3.4) has an error term of the form −O(1)ζ 3 (∂ x i g) 2 that is compensated by the positive term in (3.6) again: we use where we have used again ζ 1/2 ∈ C ∞ , in the form |∇ζ| ≲ ζ 1/2 .
These considerations result in the following calculations: We finally choose (1) a = 1, (2) c large enough so that the first term in the third line controls the fourth term in the first line, (3) ε 2 and ε 3 small enough so that the second term of the third line is controlled by the second term in the right hand side of the first line, (4) b large enough so that the third term in the fourth line is controlled by the first term in the third line and ab > c so that the quadratic form is strictly positive, (5) ε 1 small enough so that the third term in the third line is controlled by the first term in the third line, (6) finally ν 0 and ν 1 large enough to control all the v-derivatives of g and all the derivatives of S. Notice that . The conditions on the coefficients are: c which is compatible with all requirements and has solutions. This proves that −L K w ≥ 0 and the desired inequality is thus obtained from (3.2) and the choice of ν 1 we made above, which concludes the proof of Proposition 3.1.
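The algebra driving the Bernstein argument can be sketched without the cut-off function. The identities below assume only $L_K = \partial_t + v \cdot \nabla_x - \Delta_v$ and $L_K g = S$; the names $\phi, \psi$ are generic smooth functions introduced for illustration, and the powers of $\zeta$ used in the actual proof are deliberately omitted.

```latex
% Product rule ("default of distributivity") for L_K:
\[
L_K(\phi\,\psi) = \phi\, L_K \psi + \psi\, L_K \phi - 2\, \nabla_v \phi \cdot \nabla_v \psi .
\]
% Applied to a solution of L_K g = S:
\begin{align*}
L_K(g^2) &= 2\, g\, S - 2\, |\nabla_v g|^2, \\
L_K(\partial_{x_i} g) &= \partial_{x_i} S, \\
L_K(\partial_{v_i} g) &= \partial_{v_i} S - \partial_{x_i} g
   \qquad \text{(since } \partial_{v_i}(v \cdot \nabla_x g) = \partial_{x_i} g + v \cdot \nabla_x \partial_{v_i} g \text{)}, \\
-L_K\big[ (\partial_{x_i} g)(\partial_{v_i} g) \big]
  &= (\partial_{x_i} g)^2
     - (\partial_{x_i} g)(\partial_{v_i} S) - (\partial_{v_i} g)(\partial_{x_i} S)
     + 2\, \nabla_v \partial_{x_i} g \cdot \nabla_v \partial_{v_i} g .
\end{align*}
```

The mixed term is the only one that produces a positive multiple of $(\partial_{x_i} g)^2$, which is why a cross term in $\partial_{x_i} g$ and $\partial_{v_i} g$ has to enter the quadratic form, as in hypocoercivity arguments.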
A direct consequence of the gradient estimate from Proposition 3.1 is a bound on derivatives of arbitrary order in any cylinder of radius $r$. We recall that $Q_r = (-r^2, 0] \times B_{r^3} \times B_r$.
Corollary 3.3 (Bounds on arbitrary derivatives around the origin). Given $k \in \mathbb{N}$ and $r > 0$, there exists a constant $C$ depending on the dimension $d$ and on $k$ such that any solution of (2.3) in $Q_r$ with zero source term $S \equiv 0$, i.e. $(\partial_t + v \cdot \nabla_x) g = \Delta_v g$ in $Q_r$, satisfies for all $n \in \mathbb{N}$ and $\alpha, \beta \in \mathbb{N}^d$ with $|\beta| = k$,
$|\partial_t^n D_x^\alpha D_v^\beta g(0,0,0)| \le C\, \dfrac{\|g\|_{L^\infty(Q_r)}}{r^{2n+3|\alpha|+|\beta|}} .$
Remark 3.4. Remark that $2n + 3|\alpha| + |\beta|$ is the kinetic degree of the polynomial associated with $\partial_t^n D_x^\alpha D_v^\beta$.
Proof. We reduce to the case $r = 1$ by rescaling: the function $g_r(t,x,v) = g(r^2 t, r^3 x, r v)$ is a solution of (2.3) in $Q_1$. If the result is true for $r = 1$, then we get the desired estimate for arbitrary $r$. We then first treat the case $n = 0$ and argue by induction on $|\beta|$. Proposition 3.1 yields the result for $|\beta| \le 1$ since $D_x^\alpha g$ solves (2.3) with $S \equiv 0$ for an arbitrary multi-index $\alpha$. Assuming the result true for $n = 0$, any $\alpha$ and $|\beta| \le k$, remark that, since $\partial_{v_j}$ does not commute with the transport operator, $D_x^\alpha D_v^\beta g$ solves (2.3) with a source term $S$ given by derivatives of $g$ of lower order in $v$. Consider $\beta \in \mathbb{N}^d$ with $|\beta| = k + 1$; the previous step yields controls of $\partial_{x_j} S$ and $\partial_{v_j} S$ for the source term $S$ in the equation for $D_x^\alpha D_v^\beta g$. Proposition 3.1 then gives the control of $\partial_{x_i, v_i} D_x^\alpha D_v^\beta g(0,0,0)$, which completes the induction. We finally get the result for an arbitrary $n \ge 1$ by remarking that the equation allows us to control any time derivative by space and velocity derivatives.
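The power of $r$ in the corollary comes from the chain rule applied to the rescaled function. The following sketch spells this out; here $\alpha, \beta$ are the multi-indices of the corollary (not Hölder exponents), and the constant $C$ stands for whatever constant the case $r = 1$ provides.

```latex
% Rescaling g_r(t,x,v) := g(r^2 t, r^3 x, r v) = (g \circ S_r)(t,x,v).
\[
\partial_t^n D_x^\alpha D_v^\beta\, g_r (z)
 = r^{\,2n + 3|\alpha| + |\beta|}\; \big( \partial_t^n D_x^\alpha D_v^\beta\, g \big)(S_r z),
\qquad
\| g_r \|_{L^\infty(Q_1)} = \| g \|_{L^\infty(Q_r)} .
\]
% Hence the case r = 1 applied to g_r at the origin gives
\[
r^{\,2n + 3|\alpha| + |\beta|}\, \big| \partial_t^n D_x^\alpha D_v^\beta\, g(0,0,0) \big|
 = \big| \partial_t^n D_x^\alpha D_v^\beta\, g_r(0,0,0) \big|
 \le C\, \| g_r \|_{L^\infty(Q_1)} = C\, \| g \|_{L^\infty(Q_r)} .
\]
```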

Proof of Schauder estimates
With such bounds on derivatives at hand, we can turn to the proof of the Schauder estimate for the Kolmogorov equation, that is to say for equation (1.1) with A replaced with the identity matrix and with no lower order terms (b = 0, c = 0), see (2.3). A change of variables will then yield the result for any constant matrix A satisfying the ellipticity condition (1.2). Finally we shall classically approximate the coefficients locally by constants to treat the general case.
3.2.1. The core estimate

The proof of the Schauder estimate for the Kolmogorov equation follows the argument proposed by Safonov [Saf84], as explained in [Kry96].
Theorem 3.5 (The Schauder estimate for the Kolmogorov equation). -Given $\alpha \in (0, 1)$ and $S \in C^\alpha$, let $g \in C^{2+\alpha}$ be a solution to

Proof. -We first reduce to the case where $g \in C_c^\infty(\mathbb{R} \times \mathbb{R}^d \times \mathbb{R}^d)$ by mollification and truncation (as for instance in the proof of [Kry96, Lemma 8.7.1, p. 122]). We then reduce to the base point $z_0 = 0$ by considering the change of unknown $z \mapsto g(z_0 \circ z)$ (we however keep on calling the unknown $g$). Given $r > 0$ and $K \ge 1$ to be chosen later, consider $Q_{(K+1)r}$ and a cut-off function $\zeta \in C_c^\infty$ such that $\zeta \equiv 1$ in $Q_{(K+1)r}$. Recall again the Kolmogorov operator $L_K := \partial_t + v \cdot \nabla_x - \Delta_v$, and define $\bar S := L_K(\zeta T_0[g])$, where $T_0[g]$ is the Taylor polynomial $T_{z_0}[g]$ of $g$ at $z_0 = (0, 0, 0)$, defined in (2.6).
Decompose in Q (K+1)r (where ζ = 1): and G is the Green function studied in Proposition 2.1.
The general constant coefficients case is reached through a change of variables.
Corollary 3.7 (Schauder estimates for constant diffusion coefficients). -Let $\alpha \in (0, 1)$ and $S \in C^\alpha$ and let $A := (a_{i,j})$ be a constant real $d \times d$ matrix that satisfies (1.2). Then for every solution $g \in C^{2+\alpha}$ to where the constant $C$ depends on $d$, $\alpha$, the norm of the (constant) matrix $A$ and $\lambda$ in (1.2).
and the result follows from Theorem 3.5.
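The change of variables behind Corollary 3.7 can be sketched as follows. This is one possible choice, not necessarily the one used by the authors: it assumes that the diffusion term reads $\sum_{i,j} a_{i,j} \partial^2_{v_i v_j} g$ and that $A$ is symmetric (for a constant matrix this is no restriction, since only the symmetric part of $A$ enters the second-order term). The matrix $B$ and the function $h$ are names introduced here for illustration.

```latex
% Assumptions: A constant, symmetric, positive definite; B := A^{1/2} (symmetric square root).
% Assumed equation: (\partial_t + v \cdot \nabla_x) g = \sum_{i,j} a_{i,j} \partial^2_{v_i v_j} g + S.
\[
h(t,x,v) := g(t, Bx, Bv).
\]
\begin{align*}
\partial_t h(t,x,v) &= (\partial_t g)(t,Bx,Bv),\\
v \cdot \nabla_x h(t,x,v) &= v \cdot B^{T} (\nabla_x g)(t,Bx,Bv) = (Bv) \cdot (\nabla_x g)(t,Bx,Bv),\\
\Delta_v h(t,x,v) &= \operatorname{tr}\!\big( B^{T} (D_v^2 g)(t,Bx,Bv)\, B \big)
                   = \operatorname{tr}\!\big( A \,(D_v^2 g)(t,Bx,Bv) \big).
\end{align*}
% Hence h solves the Kolmogorov equation with a transformed source term:
\[
(\partial_t + v \cdot \nabla_x - \Delta_v)\, h(t,x,v) = S(t, Bx, Bv).
\]
```

Under these assumptions the norms of $B$ and $B^{-1}$ are controlled by the norm of $A$ and the ellipticity constant $\lambda$, which is consistent with the stated dependence of the constant $C$.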

Proof of Theorem 1.1
Proof of Theorem 1.1. -It is enough to treat the case where $b \equiv 0$ and $c \equiv 0$; the general case is then treated by interpolation (using Lemma 2.5). We consider a constant $\gamma > 0$ which will be fixed later, and pick $z_0, z_1 \in \mathbb{R} \times \mathbb{R}^d \times \mathbb{R}^d$ and $r > 0$ such that $z_1 \in Q_r(z_0)$ and
$$ [g]_{C^{2+\alpha}} \le 2 \, \frac{|g(z_1) - T_{z_0}[g](z_1)|}{r^{2+\alpha}} . $$
Case 1: $r \ge \gamma$. We compute then $[g]_{C^{2+\alpha}} + C_1(\gamma) \|g\|_{L^\infty}$ (using again Lemma 2.5) with $C_1(\gamma) > 0$ depending on $\gamma$ and $d$.

Case 2: $r \le \gamma$. Then $z_1 \in Q_\gamma(z_0)$ and we consider a $C^\infty$ cut-off function $0 \le \xi \le 1$ that is equal to 1 on $Q_\gamma(z_0)$ and equal to zero outside $Q_{2\gamma}(z_0)$. In particular, $\xi(z_1) = \xi(z_0) = 1$. We now use Corollary 3.7 to get where we have added the restriction to the support of $\xi$ in the last term. We estimate successively the two terms of the right hand side. On the one hand, recalling that (using again Lemma 2.5) for a constant $C_2(\gamma) > 0$ depending on $d$, $\|A\|_{C^\alpha}$, $\alpha$. On the other hand,

using again Lemma 2.5, for some constants $C_3 > 0$ and $C_4(\gamma)$ depending on $\gamma$. Combining the last three estimates yields finally in the case $r \le \gamma$: We now pick $\gamma$ such that $C \gamma^\alpha + \frac{1}{4} \le \frac{1}{2}$ and we thus get in both cases ($r \ge \gamma$ and $r \le \gamma$) that for some constant $C_5(\gamma) > 0$, which concludes the proof of (1.3) thanks to (2.9) and (2.10).
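The final absorption step can be summarized schematically. The sketch below assumes, as the choice of $\gamma$ suggests, that both cases lead to an inequality of the form shown in the first line; the symbols $X$, $Y$ and $C'$ are placeholders introduced here, with $X$ standing for $[g]_{C^{2+\alpha}}$, which is finite because $g \in C^{2+\alpha}$.

```latex
% Schematic absorption argument (X, Y, C' are illustrative placeholders).
\[
X \le \Big( C \gamma^{\alpha} + \tfrac{1}{4} \Big) X + C'(\gamma)\, Y,
\qquad X < \infty .
\]
% Choosing \gamma \le (4C)^{-1/\alpha} gives C \gamma^{\alpha} + 1/4 \le 1/2, hence
\[
X \le \tfrac{1}{2} X + C'(\gamma)\, Y
\quad \Longrightarrow \quad
X \le 2\, C'(\gamma)\, Y .
\]
```

The finiteness of $X$ is what allows the term $\tfrac{1}{2} X$ to be moved to the left-hand side.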

Localization of the Schauder estimate
Theorem 3.9 (Local Schauder estimate). -Given $\alpha \in (0, 1)$ and $a_{i,j}, b_i, c, S \in C^\alpha$ satisfying (1.2) for some constant $\lambda > 0$, any solution $g \in C^{2+\alpha}$ to (1.1) satisfies for all where the constant $C$ depends on $d$, $\lambda$ and $\alpha$ and $a_{i,j}$

Proof. -We use the strategy of [Kry96, Theorem 8.11.1]. Consider $z_0 = 0$ without loss of generality and define $R_n := \sum_{j=0}^{n} 2^{-j}$ for $n \ge 0$. Define a cutoff function $\zeta_n$ that is smooth, equal to one on $Q_{R_n}$ and zero outside $Q_{R_{n+1}}$. It satisfies the controls Then apply the non-localized estimate of Theorem 1.1 to $\zeta_n g$. In order to do so, it is convenient to consider the differential operator and use the interpolation inequalities from Lemma 2.5 to get for all $\varepsilon_n > 0$, for some $\beta > 0$ (note the additional factor $\rho^{-n}$ due to the dependency of the constant $C$ on the difference of radii in Lemma 2.5). Choosing next $\varepsilon_n := \varepsilon_0 \rho^n$ for $\varepsilon_0 \in (0, 1)$ small enough yields Consider then the geometric sum $\sum_{n \ge 0} \varepsilon_0^n A_n$, and calculate $\sum_{n \ge 0}$ Assuming $\varepsilon_0 < \rho^{\beta+2} < 1$ and cancelling terms gives finally: which concludes the proof.
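The summation argument closing the proof follows a standard pattern, sketched here under an assumed form of the recursive inequality. The quantities $A_n$ stand for the norms of the localized functions at step $n$ and $B$ collects the right-hand side terms; the precise inequality below is an assumption chosen to match the condition $\varepsilon_0 < \rho^{\beta+2}$, not a quotation of the original display.

```latex
% Assumed recursive inequality (illustrative): for all n \ge 0,
\[
A_n \le \varepsilon_0 \, A_{n+1} + C \, \rho^{-(\beta+2) n} \, B .
\]
% Multiply by \varepsilon_0^n and sum over n, assuming \sum_n \varepsilon_0^n A_n < \infty:
\[
\sum_{n \ge 0} \varepsilon_0^{n} A_n
 \le \sum_{n \ge 1} \varepsilon_0^{n} A_n
 + C\, B \sum_{n \ge 0} \big( \varepsilon_0 \, \rho^{-(\beta+2)} \big)^{n} .
\]
% The terms with n \ge 1 cancel; the geometric series converges since \varepsilon_0 < \rho^{\beta+2}:
\[
A_0 \le \frac{C\, B}{1 - \varepsilon_0 \, \rho^{-(\beta+2)}} .
\]
```

Only the term $A_0$, which corresponds to the smallest cylinder, survives on the left-hand side, so the localized norm is controlled by quantities on the larger cylinder.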

Global existence for the toy model
After this rescaling the operator $U$ is symmetric in $L^2(\mathrm{d}x \, \mathrm{d}v)$, without weight. In contrast with (1.4), this operator has no first-order term in the velocity variable. But a (simpler) difficulty is created by the appearance of the unbounded zero-order term $\left( \frac{d}{2} - \frac{|v|^2}{4} \right) g$. We overcome it using that $g$ stays between two Maxwellians (see Lemma 4.1 below) and that, after rescaling, the Hölder estimate from [GIMV19, WZ09] in Theorem 4.3 encodes decay in the $v$ variable (Proposition 4.4).
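The origin of the zero-order term can be illustrated by a short computation. The sketch below is written under assumptions not restated in this passage: the Maxwellian is taken to be $\mu(v) = (2\pi)^{-d/2} e^{-|v|^2/2}$, the rescaling is $f = \mu^{1/2} g$, and the velocity operator before rescaling is the Fokker-Planck operator $\nabla_v \cdot (\nabla_v f + v f)$.

```latex
% Assumptions: \mu(v) = (2\pi)^{-d/2} e^{-|v|^2/2}, f = \mu^{1/2} g,
% velocity operator \nabla_v \cdot (\nabla_v f + v f).
\begin{align*}
\nabla_v f + v f
 &= \mu \, \nabla_v (\mu^{-1} f)
  = \mu^{1/2} \Big( \nabla_v g + \frac{v}{2}\, g \Big),\\
\nabla_v \cdot (\nabla_v f + v f)
 &= \mu^{1/2} \Big[ -\frac{v}{2} \cdot \Big( \nabla_v g + \frac{v}{2}\, g \Big)
    + \Delta_v g + \frac{d}{2}\, g + \frac{v}{2} \cdot \nabla_v g \Big]\\
 &= \mu^{1/2} \Big[ \Delta_v g + \Big( \frac{d}{2} - \frac{|v|^2}{4} \Big) g \Big].
\end{align*}
% The first-order terms cancel. For g decaying fast enough, integrating by parts gives
\[
\int_{\mathbb{R}^d} g \Big[ \Delta_v g + \Big( \frac{d}{2} - \frac{|v|^2}{4} \Big) g \Big] \, \mathrm{d}v
 = - \int_{\mathbb{R}^d} \Big| \nabla_v g + \frac{v}{2}\, g \Big|^2 \mathrm{d}v .
\]
```

Under these assumptions the rescaled operator has no first-order term in $v$, is symmetric in unweighted $L^2$, and carries exactly the unbounded zero-order term discussed above.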

Gaussian bounds
We first explain how to propagate in time the Gaussian bounds satisfied by the initial data $g_{\mathrm{in}}(x, v) = g(0, x, v)$.
Lemma 4.1 (Gaussian bounds). -Consider a strong solution $g$ to (4.1) in for almost all $t \in [0, T]$, we have

Remark 4.2. -The notion of strong solutions used in the latter lemma and the rest of this section is standard: all the terms in the equation are defined almost everywhere, the equation is satisfied almost everywhere, and the solution is continuous in time with values in $L^2$. We could have considered weaker notions of solutions but it was unnecessary since we are interested in constructing global strong solutions.
for any $Q_{2r}(z_0) \subset [\tau, T] \times \mathbb{T}^d \times \mathbb{R}^d$. In particular,

Proof. -We apply Theorem 4.3 with $\lambda = C_1$, $\Lambda = C_2$, $A = R[g] \, \mathrm{Id}$ and $S_0 = R[g] \left( \frac{d}{2} - \frac{|v|^2}{4} \right) g$ and get for some $\alpha_0 \in (0, 1)$. Apply now Lemma 4.1 to get for an arbitrary $\delta \in (0, 1)$ for some constant $C_\delta$ depending only on $C_1$, $C_2$, $d$, $\tau$, $\delta$. Note that the role of the time $\tau > 0$ is to ensure that the constant $C_\delta$ only depends on $\tau$ and not on $r$: it gives some "room" around a point $z_0 \in [\tau, T] \times \mathbb{T}^d \times \mathbb{R}^d$. This implies in turn that for all $(t, x) \in [\tau, T] \times \mathbb{T}^d$, . This also implies for any $\delta' \in (0, \delta)$ This ensures the Hölder regularity of the coefficients and source term, and we are thus in a position to apply Theorem 3.9 in the cylinders $Q_r(z_0)$ and deduce (4.3). To get (4.4) from (4.3) and the decay of Lemma 4.1, apply Lemma 2.5 on a cylinder $Q$ to obtain $\|\nabla_v g\|_{L^\infty(Q)} \le C \|g\|_{L^\infty(Q)} + C [g]_{C^{2+\alpha_0}(Q)}$ and argue as before. This completes the proof.

Standard interpolation product inequality
We recall and prove an interpolation inequality tailored to our needs; it is folklore knowledge.
We now focus on establishing the key a priori estimate. Consider a solution $g_{n+1} \in C^1([0, T], H^{k,k}_{x,v}(\mathbb{T}^d \times \mathbb{R}^d))$ to (4.6) and compute successively: (4.7) $L^2$ estimate: $\frac{d}{dt}$ where we denote $h_{n+1} := \mu^{1/2} \nabla_v (\mu^{-1/2} g_{n+1})$. Regarding the $v$-derivatives, for any integer $\bar k \ge 1$, $\frac{d}{dt}$ In the right hand side, the first term corresponds to the transport $v \cdot \nabla_x$, the second one to the operator $U$ since $R[g_n]$ does not depend on $v$, the third term appears when one $v$-derivative applies to $|v|^2$ and the others apply to $g_{n+1}$ in the product $|v|^2 g_{n+1}$ appearing in $U[g_{n+1}]$, the fourth term appears after differentiating $|v|^2$ twice. Notice that integrations by parts are used either to further differentiate $|v|^2$ or to produce $|\partial_{v_i}^{\bar k - 1} g_{n+1}|^2$. Discarding the negative term and using the fact that $R[g_n] \le C_2$ thanks to Lemma 4.1, we get after summing over $i = 1, \dots, d$ and combining with equation (4.7) (4.8) Estimate of $v$-derivatives: $\frac{d}{dt}$ Regarding the $x$-derivatives, we write the equation on $g_{n+1}$ as