A sharp log-Sobolev inequality for the multislice

We determine the log-Sobolev constant of the multi-urn Bernoulli-Laplace diffusion model with arbitrary parameters, up to a small universal multiplicative constant. Our result extends a classical estimate of Lee and Yau (1998) and confirms a conjecture of Filmus, O'Donnell and Wu (2018). Among other applications, we completely quantify the "small-set expansion" phenomenon on the multislice, and obtain sharp mixing-time estimates for the colored exclusion process on various graphs.

This Markov chain is known as the transposition walk on the multislice, or multi-urn Bernoulli-Laplace diffusion model with parameter $\kappa$. It can also be viewed as a random walk on the Schreier graph $G_\kappa = (\Omega_\kappa, E_\kappa)$, whose edge-set $E_\kappa$ consists of the pairs $\{\omega, \omega'\}$ of elements of $\Omega_\kappa$ that differ at exactly two coordinates. Thanks to the degree of freedom in the choice of the parameter $\kappa$, the model is rich enough to encompass several classical special cases, including: (i) the random walk on the complete graph of order $n$, corresponding to $\kappa = (1, n-1)$; (ii) the $k$-particle Bernoulli-Laplace diffusion on $n$ sites, corresponding to $\kappa = (k, n-k)$; (iii) the transposition walk on $S_n$, corresponding to $\kappa = (1, \ldots, 1)$.
These fundamental examples have been studied in full detail, see in particular [12,13,2,18,33,38,30,39]. In the general case, however, understanding the precise impact of the parameter κ on the mixing properties of the graph G κ was suggested as an open problem several times [13,9,15]. Beyond the traditional "mixing times of Markov chains" perspective, this question was recently shown in [16,15,14] to have remarkable applications to the theory of Boolean functions on the multislice, see Section 2.1 below for more details. In particular, the present paper was motivated by a conjecture from [16] regarding the so-called log-Sobolev constant of the multislice, whose definition will be recalled in the next section.

Remark 1 (Coarsening).
There is an obvious partial ordering on our parameter space: say that $\kappa'$ is coarser than $\kappa$ if it can be obtained from $\kappa$ by repeatedly merging two entries into one. Note that this operation simply amounts to identifying certain colors, so that the transposition walk on $\Omega_{\kappa'}$ is a projection of the one on $\Omega_\kappa$. In particular, the mixing behavior of the chain can only improve as $\kappa$ becomes coarser, with the case $\kappa = (1, \ldots, 1)$ of example (iii) being the worst. Our main result will precisely quantify this qualitative statement.
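For small parameters, the multislice and the coarsening operation of Remark 1 can be explored by brute force. The following Python sketch is purely illustrative: the helper names (`multislice`, `swap`, `neighbours`) and the example parameter $\kappa = (1, 2, 2)$ are hypothetical choices, and colors are labeled from 0 rather than 1. It enumerates $\Omega_\kappa$, lists the neighbours of a configuration under transpositions, and records how merging two colors maps $\Omega_\kappa$ onto the coarser multislice.

```python
from collections import Counter
from itertools import permutations
from math import factorial, prod


def multislice(kappa):
    """All colorings of n = sum(kappa) sites using color l exactly kappa[l] times."""
    base = [l for l, k in enumerate(kappa) for _ in range(k)]
    return sorted(set(permutations(base)))


def swap(omega, i, j):
    """Configuration obtained from omega by transposing coordinates i and j."""
    w = list(omega)
    w[i], w[j] = w[j], w[i]
    return tuple(w)


def neighbours(omega):
    """Configurations differing from omega at exactly two coordinates."""
    n = len(omega)
    return [swap(omega, i, j)
            for i in range(n) for j in range(i + 1, n)
            if omega[i] != omega[j]]


kappa = (1, 2, 2)
n = sum(kappa)
space = multislice(kappa)

# |Omega_kappa| is the multinomial coefficient n! / (kappa_1! ... kappa_L!).
expected_size = factorial(n) // prod(factorial(k) for k in kappa)

# Every vertex has one neighbour per pair of sites carrying distinct colors.
expected_degree = (n * n - sum(k * k for k in kappa)) // 2

# Coarsening: merge colors 0 and 1, so that kappa' = (3, 2).
merge = {0: 0, 1: 0, 2: 1}
coarse_space = multislice((3, 2))
fibres = Counter(tuple(merge[c] for c in omega) for omega in space)
```

Here `fibres` counts, for each coarse configuration, how many elements of $\Omega_\kappa$ project onto it. All counts being equal is what makes the uniform law on $\Omega_\kappa$ project to the uniform law on $\Omega_{\kappa'}$.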

Functional inequalities
One of the most powerful ways to quantify the mixing properties of a Markov chain consists in establishing appropriate functional inequalities for the underlying Dirichlet form. We shall here only recall the relevant definitions, and refer to the seminal papers [8,3] or the excellent survey [34] for a detailed account. We start by turning the multislice $\Omega_\kappa$ into a probability space by equipping it with the uniform distribution. In particular, we regard functions $f : \Omega_\kappa \to \mathbb{R}$ as random variables, and write $\mathbb{E}_\kappa[f]$ for the corresponding expectation:
$$\mathbb{E}_\kappa[f] := \frac{1}{|\Omega_\kappa|} \sum_{\omega \in \Omega_\kappa} f(\omega).$$
The Dirichlet form of our chain is defined for every $f, g : \Omega_\kappa \to \mathbb{R}$ by
$$\mathcal{E}_\kappa(f, g) := \frac{1}{2n} \sum_{1 \le i < j \le n} \mathbb{E}_\kappa\left[(\nabla_{ij} f)(\nabla_{ij} g)\right], \qquad (2)$$
where $\nabla_{ij} f(\omega) := f(\omega^{ij}) - f(\omega)$, and $\omega^{ij}$ denotes the configuration obtained from $\omega$ by swapping its $i$-th and $j$-th coordinates.
Remark 2 (Scaling). We have here chosen to work under the natural continuous-time scaling where each of the $\binom{n}{2}$ possible transpositions occurs at rate $1/n$, so that a coordinate gets refreshed at rate 1. We emphasize that this is a matter of convention only: switching to discrete time amounts to nothing more than multiplying the above Dirichlet form by $2/(n-1)$.
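Since $\mathcal{E}_\kappa(f, f)$ measures the local variation of the observable $f$ along a typical transition of the chain, it is natural to compare it with the variance $\mathrm{Var}_\kappa(f)$ or the entropy $\mathrm{Ent}_\kappa(f)$, which quantify the global variation of $f$ across the whole state space:
$$\mathrm{Var}_\kappa(f) := \mathbb{E}_\kappa[f^2] - \mathbb{E}_\kappa[f]^2, \qquad \mathrm{Ent}_\kappa(f) := \mathbb{E}_\kappa[f \log f] - \mathbb{E}_\kappa[f] \log \mathbb{E}_\kappa[f].$$
All logs appearing in this paper are natural logarithms, and the last definition is of course restricted to non-negative functions, with the standard convention $0 \log 0 = 0$. With this notation in hand, the three classical functional inequalities read as follows:
• The Poincaré inequality holds with constant $\tau$ if, for all $f : \Omega_\kappa \to \mathbb{R}$,
$$\mathrm{Var}_\kappa(f) \le \tau\, \mathcal{E}_\kappa(f, f). \qquad (3)$$
• The modified log-Sobolev inequality holds with constant $\tau$ if, for all $f : \Omega_\kappa \to (0, \infty)$,
$$\mathrm{Ent}_\kappa(f) \le \tau\, \mathcal{E}_\kappa(f, \log f). \qquad (4)$$
• The log-Sobolev inequality holds with constant $\tau$ if, for all $f : \Omega_\kappa \to [0, \infty)$,
$$\mathrm{Ent}_\kappa(f) \le \tau\, \mathcal{E}_\kappa(\sqrt{f}, \sqrt{f}). \qquad (5)$$
The optimal values of $\tau$ in these functional inequalities are respectively known as the (inverse) Poincaré, modified log-Sobolev, and log-Sobolev constants of the chain. They will here be denoted by $\tau_{\mathrm{rel}}(\kappa)$, $\tau_{\mathrm{mls}}(\kappa)$ and $\tau_{\mathrm{ls}}(\kappa)$. These fundamental parameters provide powerful controls on the underlying Markov semi-group, and have tight connections to mixing times, concentration of measure, small-set expansion, and hypercontractivity. We again refer to [8,3,34] for a detailed account, and to [19] for new characterizations. Let us simply note that the statements (3), (4), (5) are essentially increasing in strength, in the sense that $2\tau_{\mathrm{rel}}(\kappa) \le 4\tau_{\mathrm{mls}}(\kappa) \le \tau_{\mathrm{ls}}(\kappa)$.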
Perhaps surprisingly, the first two quantities turn out to be too rough to capture the precise impact of κ on the mixing properties of the multislice Ω κ . Specifically, we note the following dramatic insensitivity result, see Section 3.4 for details.
In contrast, the much finer log-Sobolev constant $\tau_{\mathrm{ls}}(\kappa)$ happens to depend on $\kappa$ in a nontrivial way, and understanding the exact nature of this dependency is precisely the aim of the present paper. Before we state our results, let us give a brief account of this general problem and its broad range of applications.
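The variance, the entropy and the Dirichlet form can be evaluated by direct enumeration on a small multislice. The sketch below is illustrative only. It assumes the normalization $\mathcal{E}_\kappa(f, g) = \frac{1}{2n} \sum_{i<j} \mathbb{E}_\kappa[(\nabla_{ij} f)(\nabla_{ij} g)]$, which is the one consistent with the scaling of Remark 2, and the helper names as well as the test observable `f` are hypothetical. For any fixed observable, each of the three ratios computed at the end is a lower bound on the corresponding constant $\tau_{\mathrm{rel}}(\kappa)$, $\tau_{\mathrm{mls}}(\kappa)$, $\tau_{\mathrm{ls}}(\kappa)$.

```python
from itertools import permutations
from math import log, sqrt


def multislice(kappa):
    """All colorings of n = sum(kappa) sites using color l exactly kappa[l] times."""
    base = [l for l, k in enumerate(kappa) for _ in range(k)]
    return sorted(set(permutations(base)))


def swap(omega, i, j):
    """Configuration obtained from omega by transposing coordinates i and j."""
    w = list(omega)
    w[i], w[j] = w[j], w[i]
    return tuple(w)


def mean(f, space):
    """Expectation of f under the uniform law on space."""
    return sum(f(w) for w in space) / len(space)


def dirichlet(f, g, space):
    """(1 / 2n) * sum over i < j of E[(grad_ij f)(grad_ij g)] (assumed normalization)."""
    n = len(space[0])
    total = 0.0
    for w in space:
        for i in range(n):
            for j in range(i + 1, n):
                v = swap(w, i, j)
                total += (f(v) - f(w)) * (g(v) - g(w))
    return total / (2 * n * len(space))


def variance(f, space):
    return mean(lambda w: f(w) ** 2, space) - mean(f, space) ** 2


def entropy(f, space):
    m = mean(f, space)
    return mean(lambda w: f(w) * log(f(w)), space) - m * log(m)


def f(w):
    """An arbitrary positive observable, chosen only for illustration."""
    return 1.0 + sum((i + 1) * c for i, c in enumerate(w))


space = multislice((1, 2, 2))
root_f = lambda w: sqrt(f(w))
log_f = lambda w: log(f(w))

poincare_ratio = variance(f, space) / dirichlet(f, f, space)
mls_ratio = entropy(f, space) / dirichlet(f, log_f, space)
ls_ratio = entropy(f, space) / dirichlet(root_f, root_f, space)
```

The comparison $4\tau_{\mathrm{mls}}(\kappa) \le \tau_{\mathrm{ls}}(\kappa)$ is visible at the level of a single observable: the pointwise bound $(a - b)(\log a - \log b) \ge 4(\sqrt{a} - \sqrt{b})^2$ forces `4 * mls_ratio <= ls_ratio`.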

Related works
As already mentioned, the multi-urn Bernoulli-Laplace model encompasses various well-studied special cases. The simplest one is the random walk on the complete $n$-vertex graph, obtained with $\kappa = (1, n-1)$. This example belongs to the short list of chains whose log-Sobolev constant is known exactly, see the seminal paper [8] by Diaconis and Saloff-Coste.
Theorem 1 (Random walk on the complete graph, see Theorem A.1 in [8]).
A much richer example is the famous "Random Transposition" walk on the symmetric group $S_n$, which corresponds to the choice $\kappa = (1, \ldots, 1)$. A sharp estimate on the log-Sobolev constant of this fundamental chain can be deduced from the detailed representation-theoretic analysis conducted by Diaconis and Shahshahani in their pioneering work [12].
Several years later, Lee and Yau found a more direct proof, based on what is now known as the "martingale method" [30]. This approach also allowed them to determine the order of magnitude of the log-Sobolev constant of the $k$-particle Bernoulli-Laplace diffusion on $n$ sites, thereby resolving an open problem raised by Diaconis and Saloff-Coste in [8].
Theorem 3 (Two-urn Bernoulli-Laplace diffusion model, see Theorem 5 in [30]). There exists a universal constant ε > 0 such that for all 0 < k < n, The implications of Theorems 2-3 are too numerous to be all cited. One particularly active direction consists in "transferring" these log-Sobolev estimates to models with less symmetry in order to obtain sharp mixing-time bounds, via the celebrated "comparison method" introduced by Diaconis and Saloff-Coste [10,11]. Recent successful examples include the interchange process on arbitrary graphs [1], or the exclusion process on high-dimensional product graphs [24]. Beyond Markov chains, the well-known connection between log-Sobolev inequalities and hypercontractivity provides another extremely fertile ground for applications in discrete analysis and computer science. We refer to the book [36, Chapters 9 & 10] for details, and to the recent work [16] for an impressive list of references from combinatorics, computational learning, property testing or Boolean functions, where Theorems 2-3 played a crucial role. Motivated by these applications, Filmus, O'Donnell and Wu [16] initiated the study of the log-Sobolev constant τ ls (κ) for general κ. Their main result is as follows.
Theorem 4 (General bound, see Theorem 1 in [16]). For any choice of the parameter κ, Several remarkable consequences of this estimate can be found in the recent works [16,14].
A quick comparison with Theorems 1, 2 and 3 shows that the bound is of the right order of magnitude in the extreme case L = 2, but is off by a factor of order n at the other extreme, Regarding what the correct order of magnitude of τ ls (κ) should be for all ranges of κ, Filmus, O'Donnell and Wu proposed the following beautifully simple dependency.
Note that the right-hand side decreases smoothly from $\log n$ down to $0$ as $\kappa$ becomes coarser and coarser, in agreement with Remark 1. To better appreciate this conjecture, consider the single-site dynamics obtained by projecting the multislice onto a fixed coordinate $i \in [n]$: under our transposition walk, the variable $\omega_i$ simply gets refreshed at unit rate according to the marginal distribution $\mathbb{P}(\omega_i = \ell) = \kappa_\ell / n$, $\ell \in [L]$. The log-Sobolev constant of this trivial chain is well-known to be Although our probability space $\Omega_\kappa$ is far from being a product space, the above conjecture asserts that the transposition walk mixes essentially as well as if the coordinates $\omega_1, \ldots, \omega_n$ were being refreshed independently. A brief look at Theorems 1, 2 and 3 will convince the reader that this intuition is correct in all known special cases.

Main estimate
Our main result is the determination of the log-Sobolev constant $\tau_{\mathrm{ls}}(\kappa)$ for all values of the parameter $\kappa$, up to a (small) universal multiplicative constant.
Theorem 5. For any choice of the parameter $\kappa$,
$$\log \frac{n}{\kappa_{\min}} \le \tau_{\mathrm{ls}}(\kappa) \le \frac{4}{\log 2} \log \frac{n}{\kappa_{\min}}, \qquad \text{where } \kappa_{\min} := \min\{\kappa_1, \ldots, \kappa_L\}.$$
In contrast with the loose $L \log L$ dependency resulting from Theorem 4, our result shows that $\tau_{\mathrm{ls}}(\kappa) \asymp \log L$ as long as the vector $\kappa = (\kappa_1, \ldots, \kappa_L)$ is reasonably balanced, in the (weak) sense that its lowest entry is of the same order as the mean entry. In particular, our estimate can be readily used to sharpen the dependency in $L$ in the various quantitative results that were derived from Theorem 4 in [16]. To avoid a lengthy detour through hypercontractivity, we choose to leave the details to the reader, and to instead describe two different applications: a sharp quantification of the "small-set expansion" phenomenon for the multislice, and a general log-Sobolev inequality for the colored exclusion processes.
Remark 3 (Sharpness of universal constants). In our lower bound, the pre-factor in front of the logarithm can not be replaced by any larger universal constant, since we have in the special case κ = (1, n−1), as per Theorem 1. Regarding the upper bound, our pre-factor can not be improved by more than a log 2 factor. Indeed, we will show that in the important special case κ = (⌊n/2⌋, ⌈n/2⌉), see (26). In fact, the possibly loose log 2 term comes directly from the one appearing in Theorem 3, and any improvement of the latter will immediately imply the same improvement in our upper bound.

Small-set expansion
Recall that the multislice is naturally equipped with a graph structure by declaring two vertices $\omega, \omega' \in \Omega_\kappa$ to be adjacent if they differ at exactly two coordinates. Following standard graph-theoretical notation, we write $\partial A$ for the edge boundary of a subset $A \subseteq \Omega_\kappa$, i.e., the set of edges having one end-point in $A$ and the other outside $A$. Let us consider the problem of finding a constant $\iota(\kappa)$, as large as possible, such that the isoperimetric inequality
$$\frac{|\partial A|}{|A|} \ge \iota(\kappa) \log \frac{|\Omega_\kappa|}{|A|} \qquad (8)$$
holds for all non-empty subsets $A \subseteq \Omega_\kappa$. The left-hand side measures the conductance of $A$, i.e. the ease with which the walk escapes from $A$, given that it currently lies in $A$. The presence of the logarithmic term on the other side constitutes a notable improvement upon the more standard Cheeger inequality: instead of being constant, the right-hand side of (8) gets larger as the set $A$ gets smaller, thereby capturing the celebrated small-set expansion phenomenon [29,31,16]. Our log-Sobolev estimate allows us to determine the fundamental quantity $\iota(\kappa)$ for all values of $\kappa$, up to a small universal constant.
The proof will be given in Section 3.3. As in Remark 3, the universal constants appearing in our estimate can not be improved, apart from perhaps removing the log 2 term.

Colored exclusion process
A far-reaching generalization of the transposition walk on the multislice $\Omega_\kappa$ consists in allowing each of the $\binom{n}{2}$ possible transpositions to occur at a different (possibly zero) rate. More precisely, we fix a non-negative symmetric array $G = (G_{ij})_{1 \le i, j \le n}$ (which we interpret as a weighted graph) and consider the following weighted version of the Dirichlet form (2): The canonical setting, to which we shall here stick for simplicity, consists in taking $G$ to be the transition matrix of the simple random walk on a regular graph, which we henceforth identify with $G$. The resulting process is known as the $\kappa$-colored exclusion process on $G$, see [4]. By varying the parameter $\kappa$, we obtain a rich family of diffusion models on $G$ including: (i) the simple random walk on $G$, when $\kappa = (1, n-1)$; (ii) the $k$-particle exclusion process on $G$, when $\kappa = (k, n-k)$; (iii) the interchange process on $G$, when $\kappa = (1, \ldots, 1)$.
Comparing the mixing properties of these three processes constitutes a rich and active research problem, see [40,35,4,28,37,6,20,1]. Perhaps the most celebrated result in this direction is the remarkable fact that their Poincaré constants coincide, as conjectured by Aldous and established by Caputo, Liggett and Richthammer [4].
Theorem 6 (Insensitivity of the Poincaré constant, see [4]). The Poincaré constant $\tau_{\mathrm{rel}}(\kappa, G)$ of the $\kappa$-colored exclusion process on $G$ does not depend on $\kappa$. In particular, it equals the Poincaré constant $\tau_{\mathrm{rel}}(G)$ of the simple random walk on $G$.
In a sense, this result asserts that the Poincaré constant is too "rough" to capture the influence of the color profile κ on the mixing properties of the colored exclusion process. It is thus natural to turn one's attention to the finer log-Sobolev constant.
Our main result answers this question in the simple mean-field setting, where $G$ is the complete graph. However, it implies an estimate of $\tau_{\mathrm{ls}}(\kappa, G)$ for arbitrary $G$, by means of the celebrated "comparison method" introduced by Diaconis and Saloff-Coste [10,11]. A particularly pleasant observation here is that we do not even need to build a comparison theory for the colored exclusion process: we can simply recycle the one that has already been developed for the interchange process. Specifically, let $c(G)$ be the smallest number such that the functional inequality holds for all $f : \Omega_{(1,\ldots,1)} \to \mathbb{R}$. This fundamental quantity is known as the comparison constant of the interchange process on $G$. It was shown in [1] that where $\lesssim$ means inequality up to a universal multiplicative constant, and where $\tau_{\mathrm{mix}}(G)$ denotes the mixing time of the simple random walk on $G$. It is in fact believed that see Conjecture 2 in [22]. This refinement, inspired by an analogous relation for the Zero-Range process [25], is already known to hold for several natural families of graphs ranging from low-dimensional tori [1] to high-dimensional products [22]. Those estimates can be combined with our main result to yield a general log-Sobolev inequality for the colored exclusion process (see Section 3.4 for details):
Corollary 2 (Log-Sobolev inequality for the colored exclusion process). We have
$$\max\left\{2\tau_{\mathrm{rel}}(G),\ \log \frac{n}{\kappa_{\min}}\right\} \le \tau_{\mathrm{ls}}(\kappa, G) \le \frac{4}{\log 2}\, c(G) \log \frac{n}{\kappa_{\min}}.$$
To appreciate the sharpness of this general inequality, note that the lower and upper bounds are of the same order in the following two generic situations:
• For families of graphs with $c(G) \asymp 1$ (i.e. "well-connected" graphs), we obtain $\tau_{\mathrm{ls}}(\kappa, G) \asymp \log \frac{n}{\kappa_{\min}}$, exactly as in the mean-field case. Note that this potentially constitutes a considerable extension of our main result, since the class of graphs satisfying $c(G) \asymp 1$ is believed to contain all expanders, as per (12).
• For graphs satisfying the conjecture (12), in the regime $\kappa_{\min} \ge \varepsilon n$ ($\varepsilon > 0$ fixed), we get
Remark 4 (Mixing times). One of the many interests of those log-Sobolev estimates is that they provide powerful controls on the strong $L^\infty$-mixing time of the process, see e.g., [34]. Let us here just give one concrete example: on the $d$-dimensional hypercube, our work implies that the balanced colored exclusion process with an arbitrarily fixed number $L \ge 2$ of colors mixes in time $\Theta(d^2)$. The special case $L = 2$ of this statement had been conjectured several years ago by Wilson [40], and was settled only recently [20].
We end this section with an intriguing possibility, which arises naturally in view of Theorem 6 and of what happens in the mean-field case (Lemma 1).
Question 2 (Sensitivity of the modified log-Sobolev constant). Can the choice of the parameter $\kappa$ affect $\tau_{\mathrm{mls}}(\kappa, G)$ by more than a universal multiplicative constant?
A negative answer would, in particular, substantially improve our current knowledge on the mixing times of the interchange and exclusion processes on general graphs. We note that, unlike our main result, the estimate on τ mls (κ) provided by Lemma 1 can not be directly transferred to more general graphs, since the modified log-Sobolev constant is notoriously not amenable to comparison techniques. This severe drawback constitutes a strong point in favor of log-Sobolev inequalities (as opposed to their modified versions) for mean-field interacting particle models, and was one of the motivations for the present work.

General strategy
Let us start with an elementary but crucial observation about the multislice.
Remark 5 (Recursive structure). If $(\omega_1, \ldots, \omega_n)$ is uniformly distributed on $\Omega_\kappa$, then the conditional law of $(\omega_1, \ldots, \omega_{i-1}, \omega_{i+1}, \ldots, \omega_n)$ given $\{\omega_i = \ell\}$ is uniform on $\Omega_{\kappa'}$, where $\kappa' := (\kappa_1, \ldots, \kappa_{\ell-1}, \kappa_\ell - 1, \kappa_{\ell+1}, \ldots, \kappa_L)$.
Such a simple recursive structure suggests the possibility of proving Theorem 5 by induction over the dimension $n$, using the "chain rule" for entropy (see formula (13) below). This is in fact a classical strategy for establishing functional inequalities, known as the "martingale method". Introduced by Lu & Yau [32] in the context of Kawasaki and Glauber dynamics, it has been successfully applied to various interacting particle systems [41,30,17,18,5,21], as well as other Markov chains enjoying an appropriate recursive structure [7,26,27,16,23]. In particular, this is how Theorem 3 was proved. However, as explained in detail in [16], moving from the special case $L = 2$ covered by Theorem 3 to the general case studied in Theorem 4 significantly complicates the inductive argument, resulting in the loose $L \log L$ dependency mentioned at (7). Here we introduce two simple ideas to bypass those complications and prove Conjecture 1: (i) instead of just a single site, we condition on a whole region being colored with $\ell \in [L]$; (ii) when averaging the contributions from the various colors, we assign more weight to rare colors, which are the ones that really govern $\tau_{\mathrm{ls}}(\kappa)$. More precisely, our decomposition (16) below gives weight $1 - \frac{\kappa_\ell}{n}$ to the $\ell$-colored region, whereas the traditional uniform average over all sites would give it the weight $\frac{\kappa_\ell}{n}$.
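The recursive structure of Remark 5 and the two weighting schemes can be checked by enumeration on a small example. This is an illustrative sketch only: the parameter $\kappa = (1, 2, 2)$, the conditioning site and color, and the helper `multislice` are hypothetical choices, and indices start at 0.

```python
from fractions import Fraction
from itertools import permutations


def multislice(kappa):
    """All colorings of n = sum(kappa) sites using color l exactly kappa[l] times."""
    base = [l for l, k in enumerate(kappa) for _ in range(k)]
    return sorted(set(permutations(base)))


kappa = (1, 2, 2)
n, L = sum(kappa), len(kappa)
space = multislice(kappa)

# Condition on site i carrying color ell; kappa' lowers the ell-th entry by one.
i, ell = 3, 1
kappa_prime = tuple(k - (l == ell) for l, k in enumerate(kappa))

# Delete coordinate i from every configuration with omega_i = ell.
# Deletion is injective on this event, so uniformity is preserved.
conditioned = sorted(w[:i] + w[i + 1:] for w in space if w[i] == ell)

# Weight of the ell-colored region: the choice 1 - kappa_ell / n described
# in idea (ii), versus the uniform average over sites, kappa_ell / n.
rare_color_weights = [1 - Fraction(k, n) for k in kappa]
uniform_site_weights = [Fraction(k, n) for k in kappa]
```

The list `conditioned` coincides with `multislice(kappa_prime)`, each configuration appearing exactly once. The weights `rare_color_weights` sum to $L - 1$ and are largest for the rarest color, whereas `uniform_site_weights` sum to 1 and favor the most frequent one.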
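Let us now implement those ideas. We fix an observable $f : \Omega_\kappa \to \mathbb{R}_+$ once and for all. To lighten notation, we drop the index $\kappa$ from our expectations, and write simply
$$\mathrm{Ent}(f) := \mathbb{E}[f \log f] - \mathbb{E}[f] \log \mathbb{E}[f]$$
for the entropy of $f$. If $Z$ is a random variable on $\Omega_\kappa$, we define the conditional entropy of $f$ given $Z$ by simply replacing all expectations with conditional expectations, i.e.
$$\mathrm{Ent}(f \mid Z) := \mathbb{E}[f \log f \mid Z] - \mathbb{E}[f \mid Z] \log \mathbb{E}[f \mid Z].$$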
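We then have the following elementary "chain rule":
$$\mathrm{Ent}(f) = \mathbb{E}\left[\mathrm{Ent}(f \mid Z)\right] + \mathrm{Ent}\left(\mathbb{E}[f \mid Z]\right). \qquad (13)$$
The choice $Z = \omega_i$ is of course natural in light of Remark 5, and this was the one adopted in the proofs of Theorems 3 and 4. However, as mentioned in (i) above, we choose here to condition instead on the whole $\ell$-colored region, i.e., on the random set
$$\xi_\ell := \{i \in [n] : \omega_i = \ell\}.$$
With $Z = \xi_\ell$, the formula (13) becomes
$$\mathrm{Ent}(f) = \mathbb{E}\left[\mathrm{Ent}(f \mid \xi_\ell)\right] + \mathrm{Ent}\left(\mathbb{E}[f \mid \xi_\ell]\right).$$
Following our second idea (ii), we multiply both sides of this identity by the "unusual" weight $1 - \frac{\kappa_\ell}{n}$ and then sum over all colors $\ell \in [L]$. Recalling (1), we obtain the following formula, which will constitute the basis of our induction:
$$(L - 1)\, \mathrm{Ent}(f) = \underbrace{\sum_{\ell=1}^{L} \left(1 - \frac{\kappa_\ell}{n}\right) \mathbb{E}\left[\mathrm{Ent}(f \mid \xi_\ell)\right]}_{\Sigma_1} + \underbrace{\sum_{\ell=1}^{L} \left(1 - \frac{\kappa_\ell}{n}\right) \mathrm{Ent}\left(\mathbb{E}[f \mid \xi_\ell]\right)}_{\Sigma_2}. \qquad (16)$$
Our main task will consist in estimating the two terms $\Sigma_1$ and $\Sigma_2$ on the right-hand side, in terms of the log-Sobolev constants of certain lower-dimensional multislices. More precisely, we let $\kappa \setminus \ell$ denote the parameter obtained from $\kappa$ by removing the $\ell$-th entry, i.e. $\kappa \setminus \ell := (\kappa_1, \ldots, \kappa_{\ell-1}, \kappa_{\ell+1}, \ldots, \kappa_L)$.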
Proposition 1 (Recursive log-Sobolev estimate). We have
From this, the upper bound in Theorem 5 follows by an easy induction over the number $L$ of colors, using the known log-Sobolev estimate for $L = 2$ (Theorem 3). The details, as well as the proof of the lower bound, are provided in Section 3.3.
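The entropy chain rule underlying this strategy, $\mathrm{Ent}(f) = \mathbb{E}[\mathrm{Ent}(f \mid Z)] + \mathrm{Ent}(\mathbb{E}[f \mid Z])$, can be verified numerically with $Z$ taken to be the $\ell$-colored region. The following sketch is illustrative only: the observable `f`, the color `ell` and the helper names are hypothetical, and colors and sites are indexed from 0.

```python
from collections import defaultdict
from itertools import permutations
from math import log


def multislice(kappa):
    """All colorings of n = sum(kappa) sites using color l exactly kappa[l] times."""
    base = [l for l, k in enumerate(kappa) for _ in range(k)]
    return sorted(set(permutations(base)))


def ent(values, probs):
    """Entropy E[f log f] - E[f] log E[f] of positive values under the law probs."""
    m = sum(p * v for v, p in zip(values, probs))
    return sum(p * v * log(v) for v, p in zip(values, probs)) - m * log(m)


space = multislice((1, 2, 2))
N = len(space)
# An arbitrary positive observable, chosen only for illustration.
f = {w: 1.0 + sum((i + 1) * c for i, c in enumerate(w)) for w in space}

ell = 1
# Group the values of f according to the ell-colored region xi_ell.
groups = defaultdict(list)
for w in space:
    xi = frozenset(i for i, c in enumerate(w) if c == ell)
    groups[xi].append(f[w])

total = ent(list(f.values()), [1 / N] * N)
probs = [len(vals) / N for vals in groups.values()]
# E[Ent(f | xi_ell)]: average entropy inside each level set of xi_ell.
within = sum(p * ent(vals, [1 / len(vals)] * len(vals))
             for p, vals in zip(probs, groups.values()))
# Ent(E[f | xi_ell]): entropy of the conditional means.
between = ent([sum(vals) / len(vals) for vals in groups.values()], probs)
```

Up to floating-point error, `total` equals `within + between`. Multiplying this identity by $1 - \kappa_\ell / n$ and summing over the colors is what produces the two sums $\Sigma_1$ and $\Sigma_2$ of the main decomposition.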

Main recursion
This section is devoted to proving the two technical estimates (17) and (18) which, in view of the decomposition (16), establish Proposition 1.
Proof of the first estimate (17). Conditionally on the $\ell$-colored region $\xi_\ell$, $f$ may be regarded as a function of the remaining coordinates $(\omega_i : i \in [n] \setminus \xi_\ell)$, which form a uniformly distributed element of $\Omega_{\kappa \setminus \ell}$. Consequently, the log-Sobolev inequality for the multislice $\Omega_{\kappa \setminus \ell}$ gives Note that the event in the indicator can be rewritten as $\{\ell \notin \{\omega_i, \omega_j\}\}$, and that we may impose the restriction $\{\omega_i \ne \omega_j\}$ at no cost, since $\nabla_{ij} \sqrt{f} = 0$ on the event $\{\omega_i = \omega_j\}$. Taking expectations and rearranging, we arrive at Summing over all $\ell \in [L]$ yields which is exactly the claim made at (17).
Proof of the second estimate (18). Fix $\ell \in [L]$, and let us write $\mathbb{E}[f \mid \xi_\ell] = F(\xi_\ell)$ for some non-negative function $F = F_\ell$. The distribution of $\xi_\ell$ is uniform over all $\kappa_\ell$-element subsets of $[n]$, and this is precisely the stationary distribution of the occupied set in the $\kappa_\ell$-particle Bernoulli-Laplace diffusion model on $n$ sites. When applied to the function $F$, the log-Sobolev inequality for this process reads as follows: where $A^{ij}$ denotes the set obtained from $A$ by swapping the membership status of $i$ and $j$: On the other hand, since the involution $\tau_{ij} : \omega \mapsto \omega^{ij}$ preserves the uniform law on $\Omega_\kappa$ and maps the event $\{\xi_\ell = A\}$ onto the event $\{\xi_\ell = A^{ij}\}$, we have But the function $\Phi$ :

Averaging this inequality over all possible
We may now plug this estimate back into (20) to arrive at Finally, multiplying by $1 - \frac{\kappa_\ell}{n}$ and summing over all $\ell \in [L]$ gives which is precisely the claim (18).

Putting things together
To complete the proof of Theorem 5, we only need an estimate on the second term appearing on the right-hand side of our recursive log-Sobolev inequality. We of course use Theorem 3.
Proof. By Theorem 3, we have
Since the right-hand side is maximized when $\kappa_\ell = \kappa_{\min}$, our task boils down to establishing But this is exactly the special case $t = \frac{\kappa_{\min}}{n}$ of the inequality which is valid for all $t \in [0, \frac{1}{2}]$. To see this, note that the left-hand side is a convex function of $t \in [0, \frac{1}{2}]$ (as can be easily checked by differentiating) and that it equals zero at the two boundary points $t = 0$ and $t = \frac{1}{2}$.
We are now in position to prove our main result.
We proceed by induction over the dimension $L$ of the parameter $\kappa = (\kappa_1, \ldots, \kappa_L)$. By combining Proposition 1 and Lemma 2, we have which already establishes the claim in the base case $L = 2$. Now, assume that $L \ge 3$ and that the claim already holds for lower values of $L$. In particular, we know that for all $\ell \in [L]$. But $\Phi(\kappa \setminus \ell) \le \Phi(\kappa)$, since removing an entry from the parameter $\kappa$ can only decrease the value of the sum $n = \kappa_1 + \cdots + \kappa_L$ and increase the value of the minimum $\kappa_{\min} = \min\{\kappa_1, \ldots, \kappa_L\}$. Consequently, (22) gives and (21) is established.
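Our upper bound on $\tau_{\mathrm{ls}}(\kappa)$ implies the lower bound on $\iota(\kappa)$ given in Corollary 1, thanks to the well-known relation between log-Sobolev inequalities and small-set expansion:
Lemma 3 (Log-Sobolev inequality and small-set expansion). We have $\iota(\kappa)\, \tau_{\mathrm{ls}}(\kappa) \ge n$.
Proof. This follows from the definitions of $\iota(\kappa)$ and $\tau_{\mathrm{ls}}(\kappa)$, once we have observed that
$$\mathrm{Ent}_\kappa(\mathbf{1}_A) = \frac{|A|}{|\Omega_\kappa|} \log \frac{|\Omega_\kappa|}{|A|} \qquad \text{and} \qquad \mathcal{E}_\kappa\left(\sqrt{\mathbf{1}_A}, \sqrt{\mathbf{1}_A}\right) = \mathcal{E}_\kappa(\mathbf{1}_A, \mathbf{1}_A) = \frac{|\partial A|}{n\, |\Omega_\kappa|}$$
for any event $A \subseteq \Omega_\kappa$.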
The inequality in Lemma 3 is obtained by restricting the definition of the log-Sobolev inequality to indicator functions, and could therefore be rather loose. However, it turns out to be sharp in the present case, as we will now see.
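The restriction to indicator functions can be made concrete on a small multislice. The sketch below is illustrative only: it assumes the normalization $\mathcal{E}_\kappa(f, f) = \frac{1}{2n} \sum_{i<j} \mathbb{E}_\kappa[(\nabla_{ij} f)^2]$ for the Dirichlet form, and the event `A` and the helper names are hypothetical. It compares the Dirichlet form of $\mathbf{1}_A$ with the size of the edge boundary, and evaluates the resulting lower bound on $\tau_{\mathrm{ls}}(\kappa)$.

```python
from itertools import permutations
from math import log


def multislice(kappa):
    """All colorings of n = sum(kappa) sites using color l exactly kappa[l] times."""
    base = [l for l, k in enumerate(kappa) for _ in range(k)]
    return sorted(set(permutations(base)))


def swap(omega, i, j):
    """Configuration obtained from omega by transposing coordinates i and j."""
    w = list(omega)
    w[i], w[j] = w[j], w[i]
    return tuple(w)


kappa = (1, 2, 2)
n = sum(kappa)
space = multislice(kappa)
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]

# An arbitrary event, chosen for illustration: site 0 carries color 1.
A = {w for w in space if w[0] == 1}

# Edge boundary: unordered pairs {w, w^{ij}} with exactly one end-point in A.
boundary = {frozenset((w, swap(w, i, j)))
            for w in A for (i, j) in pairs
            if swap(w, i, j) not in A}


def indicator(w):
    return 1.0 if w in A else 0.0


# Dirichlet form of the indicator of A (assumed 1/(2n) normalization).
dirichlet = sum((indicator(swap(w, i, j)) - indicator(w)) ** 2
                for w in space for (i, j) in pairs) / (2 * n * len(space))

# Entropy of the indicator of A under the uniform law.
entropy = len(A) / len(space) * log(len(space) / len(A))

# Any test function gives a lower bound on the log-Sobolev constant.
lower_bound_on_tau_ls = entropy / dirichlet
```

Each boundary edge is seen from both of its end-points in the sum defining `dirichlet`, which is why that quantity equals `len(boundary) / (n * len(space))`. The ratio `lower_bound_on_tau_ls` is then $n |A| \log(|\Omega_\kappa| / |A|) / |\partial A|$.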
Proof of the remaining halves of Theorem 5 and Corollary 1. By definition, we have for any non-empty event A ⊆ Ω κ . We fix ℓ ∈ [L] such that κ ℓ = κ min , and consider the choice where we recall that ξ ℓ is the ℓ−colored region. Since ξ ℓ is uniformly distributed over all Thus, our pre-factor can not be improved by more than log 2, as claimed in Remark 3.
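The test set used in this argument, namely the event that the rarest color occupies exactly the first $\kappa_{\min}$ sites, can be examined by enumeration. The sketch below is illustrative only. The parameter $\kappa = (2, 2, 3)$ and the helper names are hypothetical, sites and colors are indexed from 0, and the boundary count $|\partial A| = |A|\, \kappa_{\min} (n - \kappa_{\min})$ is derived within the sketch rather than quoted from the text.

```python
from itertools import permutations
from math import comb, log


def multislice(kappa):
    """All colorings of n = sum(kappa) sites using color l exactly kappa[l] times."""
    base = [l for l, k in enumerate(kappa) for _ in range(k)]
    return sorted(set(permutations(base)))


def swap(omega, i, j):
    """Configuration obtained from omega by transposing coordinates i and j."""
    w = list(omega)
    w[i], w[j] = w[j], w[i]
    return tuple(w)


kappa = (2, 2, 3)
n = sum(kappa)
k_min = min(kappa)
ell = kappa.index(k_min)  # a rarest color
space = multislice(kappa)

# Test set: the ell-colored region is exactly the first k_min sites.
A = {w for w in space if all(w[i] == ell for i in range(k_min))}

# Number of edges of the multislice graph leaving A: one per transposition
# exchanging one of the first k_min sites with one of the other n - k_min.
boundary_size = sum(1 for w in A
                    for i in range(n) for j in range(i + 1, n)
                    if swap(w, i, j) not in A)

# Entropy over Dirichlet form for the indicator of A: a lower bound on tau_ls.
ratio = n * len(A) * log(len(space) / len(A)) / boundary_size
closed_form = n * log(comb(n, k_min)) / (k_min * (n - k_min))
```

In this example `ratio` agrees with `closed_form` and exceeds $\log(n / \kappa_{\min})$, in line with the lower bound on $\tau_{\mathrm{ls}}(\kappa)$. For $\kappa = (n/2, n/2)$ the closed form evaluates to $4 \log 2$, which is where the possible $\log 2$ slack in the upper bound shows up.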
Proof. By construction, we have $|\Psi^{-1}(\{\ell\})| = \kappa_\ell$ for each color $\ell \in [L]$, and hence $\Psi$ maps $\Omega_{(1,\ldots,1)}$ to $\Omega_\kappa$. The first claim asserts that the $\Psi$-image of the uniform measure on $\Omega_{(1,\ldots,1)}$ is the uniform measure on $\Omega_\kappa$, which is nothing more than the observation that each element of $\Omega_\kappa$ admits the same number of pre-images under $\Psi$ (namely $\kappa_1! \cdots \kappa_L!$). The second claim follows from the first and the definition (10), once we note that the commutativity relation $\nabla_{ij}(f \circ \Psi) = (\nabla_{ij} f) \circ \Psi$ trivially holds for all $1 \le i < j \le n$ and all $f : \Omega_\kappa \to \mathbb{R}$.
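Both claims of this proof can be checked by enumeration for a small parameter. In the sketch below, which is illustrative only, the coloring `psi` of the labels is one hypothetical choice of a map with $|\Psi^{-1}(\{\ell\})| = \kappa_\ell$, applied coordinate-wise to permutations. Labels, sites and colors are indexed from 0.

```python
from collections import Counter
from itertools import permutations
from math import factorial, prod

kappa = (1, 2, 2)
n = sum(kappa)

# A fixed coloring of the labels 0..n-1 using color l exactly kappa[l] times.
psi = [l for l, k in enumerate(kappa) for _ in range(k)]


def push(sigma):
    """Image of the permutation sigma under the coordinate-wise coloring."""
    return tuple(psi[s] for s in sigma)


def swap(omega, i, j):
    """Configuration obtained from omega by transposing coordinates i and j."""
    w = list(omega)
    w[i], w[j] = w[j], w[i]
    return tuple(w)


perms = list(permutations(range(n)))

# Number of pre-images of each element of the multislice.
preimages = Counter(push(sigma) for sigma in perms)
expected_fibre = prod(factorial(k) for k in kappa)

# Commutation of the coloring with every transposition of coordinates.
commutes = all(push(swap(sigma, i, j)) == swap(push(sigma), i, j)
               for sigma in perms
               for i in range(n) for j in range(i + 1, n))
```

Every element of $\Omega_\kappa$ receives the same number `expected_fibre` of pre-images, so the uniform measure on permutations is pushed forward to the uniform measure on $\Omega_\kappa$. The flag `commutes` records the commutativity relation used for the second claim.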
We can now easily establish our log-Sobolev estimate for the colored exclusion process.
The claimed upper bound now follows from our main estimate on $\tau_{\mathrm{ls}}(\kappa)$. The lower bound $\tau_{\mathrm{ls}}(\kappa, G) \ge 2\tau_{\mathrm{rel}}(G)$ is obtained by combining the general inequality $\tau_{\mathrm{ls}}(\cdot) \ge 2\tau_{\mathrm{rel}}(\cdot)$ with Theorem 6. To prove the other lower bound, we choose the test function $f = \mathbf{1}_A$ in the definition of the log-Sobolev inequality, with $A$ as in (24). We have already seen that $|A| = |\Omega_\kappa| / \binom{n}{\kappa_{\min}}$. Moreover, we now have $|\partial A| \le |A|\, d\, \kappa_{\min}$ where $d$ denotes the degree in $G$, since moving from $A$ to $A^c$ requires transposing some site in $\{1, \ldots, \kappa_{\min}\}$ with one of its $d$ neighbors. We thus obtain
$$\tau_{\mathrm{ls}}(\kappa, G) \ge \frac{|A|\, d \log \frac{|\Omega_\kappa|}{|A|}}{|\partial A|} \ge \frac{1}{\kappa_{\min}} \log \binom{n}{\kappa_{\min}} \ge \log \frac{n}{\kappa_{\min}},$$
and the proof is complete.