Spectral aspects of the Berezin transform

We discuss the Berezin transform, a Markov operator associated to positive operator valued measures (POVMs), in a number of contexts including the Berezin-Toeplitz quantization, Donaldson’s dynamical system on the space of Hermitian products on a complex vector space, representations of finite groups, and quantum noise. In particular, we calculate the spectral gap for quantization in terms of the fundamental tone of the phase space. Our results incidentally confirm a prediction of Donaldson for the spectrum of the Q-operator on Kähler manifolds with constant scalar curvature. Furthermore, viewing POVMs as data clouds, we study their spectral features via geometry of measure metric spaces and the diffusion distance.


Introduction
Given a function f on a classical phase space X, let us first quantize it and then dequantize. This operation on functions, f ↦ Bf, is called the Berezin transform. As a result of this operation, the function f blurs on the phase space. The intuition behind this is as follows^a: assume that f is the Dirac delta-function at a point x ∈ X. Its quantization is a coherent state at x, whose dequantization is approximately a Gaussian centered at x. In the framework of the Berezin-Toeplitz quantization of closed Kähler manifolds, B is known to be a Markov operator with finite-dimensional image, and is closely related to the Laplace-Beltrami operator ∆ of the Kähler manifold. In fact, the Berezin transform has the following asymptotic expansion as ħ → 0, due to Karabegov and Schlichenmaier [30]^b:

$$ Bf = f - \frac{\hbar}{4\pi}\,\Delta f + O(\hbar^2) \tag{1} $$

for every smooth function f on X, with remainder depending on f, where ħ stands for the Planck constant (see Section 3 for notations and conventions).

We focus on the spectral properties of B. For fixed ħ, this operator factors through a finite-dimensional space and hence its spectrum consists of a finite collection of points lying in the interval [0, 1]. Moreover, multiplicities of positive eigenvalues are finite, and 1 is the maximal eigenvalue corresponding to the constant function. Write its spectrum (with multiplicities) in the form

$$ 1 = \gamma_0 \ge \gamma_1 \ge \gamma_2 \ge \dots \ge \gamma_k \ge \dots \ge 0. $$

The quantity γ := 1 − γ_1 is called the spectral gap, a fundamental characteristic of a Markov chain responsible for the rate of convergence to the stationary distribution. Our first result, Theorem 3.1, implies that in the context of the Berezin-Toeplitz quantization, the spectral gap γ of the Berezin transform equals

$$ \gamma = \frac{\hbar}{4\pi}\,\lambda_1 + O(\hbar^2), \tag{2} $$

where λ_1 stands for the first eigenvalue of ∆. Note that the upper bound on the gap readily follows from (1).
The proof follows a work of Lebeau and Michel [34] on semiclassical random walks on manifolds, with extra ingredients such as an asymptotic expansion for the Bergman kernel due to Dai, Liu and Ma [13], a comparison between the Berezin transform and the heat operator motivated by the work of Liu and Ma [36], and a refined version of the above-mentioned Karabegov-Schlichenmaier asymptotic expansion [30]. In fact, Theorem 3.1 shows much more than (2), namely that one can approximate the full spectrum of ∆, as well as the associated eigenfunctions, with those of B.

^a We thank S. Nonnenmacher for this explanation.
^b Note that after renormalization, there is a missing factor of 1/2 in front of the second term of the analogous formula in [30, (1.2)].
Let us point out that the proof of Theorem 3.1 can be extended to Berezin-Toeplitz quantization of closed symplectic manifolds, using the quantum spaces given by the eigenstates corresponding to the small eigenvalues of the renormalized Bochner Laplacian. This uses the associated generalized Bergman kernel of Ma and Marinescu [39] and asymptotic estimates refining those of Ma, Marinescu and Lu [37] (see the discussion at the end of Section 3).
The Berezin transform is defined in the more general context of positive operator valued measures (POVMs). In fact, the Berezin-Toeplitz quantization is nothing else but the integration over a certain POVM on the phase space X with values in the space of quantum observables, and the dequantization is the dual operation [33,8]. In addition to quantization, POVMs appear in quantum mechanics on another occasion: they model quantum measurements [7]. Interestingly enough, within this model the spectral gap of the Berezin transform corresponding to a POVM admits two different interpretations: it measures the minimal magnitude of quantum noise production, and it equals the spectral gap of the Markov chain corresponding to repeated quantum measurements (see Section 7 for details).
Another theme of this paper is related to Donaldson's program [17] of developing approximate methods for detecting canonical metrics on Kähler manifolds. Interestingly enough, our study of the Berezin transform yields the asymptotic behaviour of the spectrum and of the eigenfunctions of the Q-operator, a geometric operator arising in this program, for Kähler metrics of constant scalar curvature. This behaviour, which was predicted by Donaldson in [17], is stated in Theorem 3.2 below.
Additionally, Donaldson discovered in [17] a remarkable class of dynamical systems on the space of all Hermitian products on a given complex vector space. Section 4 deals with the spectrum of the linearization of such a system at a fixed point. We show that it can be, roughly speaking, identified with that of the Berezin transform associated to a certain POVM. We prove that under certain natural assumptions, this linearization is contracting. By the Grobman-Hartman theorem and earlier results of Donaldson, this implies in particular that the iterations of this system converge exponentially fast to a fixed point (see Corollary 4.9). The use of Hartman's theorem in a related context has been suggested by Fine in [22].
This naturally brings us, in Section 5, to a geometric viewpoint on POVMs. Following Oreshkov and Calsamiglia [42, VII.C], we encode them as probability measures in the space of quantum states S equipped with the Hilbert-Schmidt metric. It turns out that the spectral gap admits a transparent description in terms of the geometry of such metric measure spaces and exhibits a robust behaviour under perturbations of POVMs in the Wasserstein metric. In a similar spirit, one can consider a POVM as a data cloud in S, which leads us to a link between the spectral gap and the diffusion distance, a notion coming from geometric data analysis.
Section 6 contains a case study of POVMs associated to irreducible unitary representations of finite groups. In this case the spectrum of the Berezin transform and the diffusion distance associated to the corresponding Markov chain can be calculated explicitly via the character table of the group, and their properties reflect algebraic features. In particular, we prove that any non-trivial irreducible representation of a simple group has a strictly positive spectral gap (see Corollary 6.6).

Preliminaries
The mathematical model of quantum mechanics starts with a complex Hilbert space H. In what follows we consider finite-dimensional Hilbert spaces only. Observables are represented by Hermitian operators, whose space is denoted by L(H). Quantum states are provided by density operators, i.e., positive trace-one operators ρ ∈ L(H). They form a subset S(H) ⊂ L(H).

Notation: We write ((A, B)) for the scalar product tr(AB) on L(H).
Let Ω be a set equipped with a σ-algebra C of its subsets. By default, we assume that Ω is a Polish topological space (i.e., it is homeomorphic to a complete metric space possessing a countable dense subset) and C is the Borel σ-algebra.
An L(H)-valued positive operator valued measure W on (Ω, C), which we abbreviate to POVM, is a countably additive map W : C → L(H) which takes each subset X ∈ C to a positive operator W(X) ∈ L(H) and which is normalized by W(Ω) = 1l. According to [10], every L(H)-valued POVM possesses a density with respect to some probability measure α on (Ω, C), that is, it has the form

$$ dW(s) = n\,F(s)\,d\alpha(s), \tag{3} $$

where n = dim_C H and F : Ω → S(H) is a measurable function. A POVM W given by formula (3) is called pure if
(i) for every s ∈ Ω the state F(s) is pure, i.e. a rank one projector;
(ii) the map F : Ω → S(H) is one to one.
Pure POVMs, under various names, arise in several areas of mathematics including the Berezin-Toeplitz quantization, convex geometry (see [24] for the notion of an isotropic measure and [1] for the resolution of identity associated to John and Löwner ellipsoids), signal processing (see [18] for a link between tight frames and quantum measurements) and Hamiltonian group actions [20]. When Ω is a finite set, a pure POVM with a given measure α exists if and only if the measure α({s}) of each point s ∈ Ω is at most 1/n; see [20] for a detailed account of the structure of the moduli spaces of pure POVMs on finite sets up to unitary conjugations.

Let us introduce the main character of our story, the spectral gap of a POVM of the form (3). Define a map T : L^1(Ω, α) → L(H) by

$$ T(\phi) = \int_\Omega \phi\, dW = n \int_\Omega \phi(s)\,F(s)\,d\alpha(s) $$

(here and below we work with spaces of real-valued functions). The dual map T* : L(H) → L^∞(Ω, α) is given by T*(A)(s) = n((F(s), A)). Since L^∞ ⊂ L^1, we have an operator

$$ E := \frac{1}{n}\,T T^* : L(H) \to L(H), \qquad E(A) = n \int_\Omega ((F(s), A))\,F(s)\,d\alpha(s). \tag{4} $$

Observe that E is a unital trace-preserving completely positive map. In the terminology of [26, Example 5.4], this is an example of an entanglement-breaking quantum channel. Furthermore, set

$$ B := \frac{1}{n}\,T^* T, \qquad B(\phi)(t) = n \int_\Omega \phi(s)\,((F(s), F(t)))\,d\alpha(s). \tag{5} $$

Observe that the image of B is finite-dimensional as B factors through L(H). Write (φ, ψ) := ∫_Ω φψ dα for the scalar product on L^2(Ω, α), and ‖·‖ for the associated norm. Note that B is defined as an operator on L^2(Ω, α) and its spectrum belongs to [0, 1], with 1 being the maximal eigenvalue associated with the constant function.
Note now that the positive eigenvalues of E and B coincide. Indeed, T* maps isomorphically an eigenspace corresponding to a positive eigenvalue of E to the eigenspace of B corresponding to the same eigenvalue. Denote by 1 = γ_0 ≥ γ_1 ≥ . . . the positive part of the spectrum of E and B. The number γ(W) := 1 − γ_1 is called the spectral gap of the POVM W.
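The coincidence of the positive spectra of E and B can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: it takes H = C^2, a three-point set Ω with uniform α, and the "trine" states (Bloch vectors 120 degrees apart) as F(s). This POVM and all function names are hypothetical choices made here, not taken from the text. E is written as a matrix in an orthonormal basis of Hermitian matrices for the product ((A, B)) = tr(AB), and B is replaced by its symmetrized form in L^2(Ω, α), which has the same spectrum.

```python
import numpy as np

def hermitian_basis(n):
    """Orthonormal basis of Hermitian n x n matrices for the product tr(AB)."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            if i == j:
                m = np.zeros((n, n), dtype=complex)
                m[i, i] = 1.0
                basis.append(m)
            else:
                re = np.zeros((n, n), dtype=complex)
                re[i, j] = re[j, i] = 1 / np.sqrt(2)
                im = np.zeros((n, n), dtype=complex)
                im[i, j], im[j, i] = -1j / np.sqrt(2), 1j / np.sqrt(2)
                basis.extend([re, im])
    return basis

def trine_povm():
    """Hypothetical example: three pure states on C^2, uniform measure alpha."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    angles = 2 * np.pi * np.arange(3) / 3
    F = [(np.eye(2) + np.cos(a) * sz + np.sin(a) * sx) / 2 for a in angles]
    alpha = np.full(3, 1 / 3)
    return F, alpha

def spectrum_B(F, alpha):
    """Eigenvalues of B(phi)(t) = n * sum_s phi(s) tr(F(s)F(t)) alpha_s."""
    n = F[0].shape[0]
    gram = np.array([[np.trace(Fs @ Ft).real for Fs in F] for Ft in F])
    w = np.sqrt(alpha)
    # Conjugating by sqrt(alpha) gives a symmetric matrix with the same spectrum.
    sym = n * w[:, None] * gram * w[None, :]
    return np.sort(np.linalg.eigvalsh(sym))[::-1]

def spectrum_E(F, alpha):
    """Eigenvalues of E(A) = n * sum_s tr(F(s)A) F(s) alpha_s on Hermitian matrices."""
    n = F[0].shape[0]
    basis = hermitian_basis(n)

    def E(A):
        return n * sum(a * np.trace(Fs @ A).real * Fs for Fs, a in zip(F, alpha))

    mat = np.array([[np.trace(bi @ E(bj)).real for bj in basis] for bi in basis])
    return np.sort(np.linalg.eigvalsh(mat))[::-1]

def positive_part(eigenvalues, tol=1e-9):
    return eigenvalues[eigenvalues > tol]

F, alpha = trine_povm()
gamma_B = positive_part(spectrum_B(F, alpha))
gamma_E = positive_part(spectrum_E(F, alpha))
spectral_gap = 1 - gamma_B[1]
```

For this particular example a direct computation gives the positive spectrum 1, 1/2, 1/2 for both operators, so the spectral gap is 1/2; E has one additional zero eigenvalue, along the Pauli direction orthogonal to the plane of the three Bloch vectors.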
In what follows, it is instructive to use the language of Markov chains with the state space Ω. Recall [2,45] that a Markov kernel on Ω is a map x ↦ σ_x sending a point x ∈ Ω to a probability measure σ_x on (Ω, C) such that x ↦ σ_x(A) is a measurable function for every A ∈ C. With every Markov kernel σ one associates a Markov chain, i.e., a sequence of Ω-valued random variables ζ_k, k = 0, 1, . . . defined on the same probability space, such that for every n and every sequence x_i ∈ Ω the conditional probabilities satisfy

$$ P(\zeta_{n+1} \in A \mid \zeta_0 = x_0, \dots, \zeta_n = x_n) = \sigma_{x_n}(A) \quad \text{for every } A \in \mathcal{C}. $$

If ζ_0 is distributed according to a probability measure ν_0 on Ω, then ζ_1 is distributed according to ν_1 given by the formula

$$ \nu_1(A) = \int_\Omega \sigma_x(A)\,d\nu_0(x). $$

If ν_0 = ν_1, we say that ν_0 is a stationary measure for the Markov chain.
The Markov kernel is called reversible with respect to a measure ν on Ω if

$$ d\sigma_x(y)\,d\nu(x) = d\sigma_y(x)\,d\nu(y) $$

as measures on Ω × Ω. In this case ν is a stationary measure of the Markov chain. Given a ν-reversible Markov kernel σ with the state space Ω, define the Markov operator A on L^1(Ω, ν) by

$$ A(\phi)(x) = \int_\Omega \phi(y)\,d\sigma_x(y). $$

Note that A preserves positivity (A(φ) ≥ 0 for φ ≥ 0), satisfies A(1) = 1, and has operator norm 1. The reversibility readily yields that the Markov operator A is self-adjoint on L^2(Ω, ν). Denote by 1^⊥ the orthogonal complement to the constant function 1 on Ω, i.e., the space of functions with zero mean. Then A preserves 1^⊥. By definition, the spectral gap γ(A) is defined as

With this language, the operator B given by (5) is a Markov operator with the Markov kernel

$$ t \mapsto n\,((F(s), F(t)))\,d\alpha(s). \tag{8} $$

It is reversible with respect to the stationary measure α.
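The Markov properties of the kernel n((F(s), F(t))) dα(s) can be verified by hand on a finite example with a non-uniform measure. The following sketch assumes H = C^2 and three pure states whose Bloch vectors have α-weighted barycenter zero, which is what the normalization W(Ω) = 1l amounts to in this case. The specific vectors and weights are hypothetical values chosen here for illustration.

```python
import numpy as np

# Pauli matrices, used to build rank-one projectors from Bloch vectors.
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def projector(v):
    """Rank-one projector on C^2 with unit Bloch vector v."""
    return (np.eye(2) + sum(c * s for c, s in zip(v, PAULI))) / 2

# Hypothetical data: weights alpha and unit vectors with sum_s alpha_s v_s = 0.
a = np.sqrt(5) / 3
bloch = [(0.0, 0.0, 1.0), (a, 0.0, -2 / 3), (-a, 0.0, -2 / 3)]
alpha = np.array([0.4, 0.3, 0.3])
F = [projector(v) for v in bloch]
n = 2

# Transition matrix K[t, s] = n * tr(F(s) F(t)) * alpha_s.
gram = np.array([[np.trace(Fs @ Ft).real for Fs in F] for Ft in F])
K = n * gram * alpha[None, :]

# Probability flux alpha_t * K[t, s]; reversibility means this matrix is symmetric.
flux = alpha[:, None] * K

row_sums = K.sum(axis=1)      # each row should be a probability vector
pushed_forward = alpha @ K    # distribution after one step started from alpha
```

Here `row_sums` should consist of ones, `pushed_forward` should reproduce `alpha` (stationarity), and `flux` should be symmetric (reversibility), in line with the statements above.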

Berezin transform vs. Laplace-Beltrami operator
Pure POVMs naturally appear in the context of the Berezin-Toeplitz quantization of closed Kähler manifolds which are quantizable in the following sense: the cohomology class [ω] is integral, where ω is the Kähler symplectic form on X. Recall that this is equivalent to the existence of a Hermitian holomorphic line bundle L over X whose Chern connection has curvature −2πiω.
Let us briefly recall the construction of this quantization (see [5,46,35] for preliminaries). Let X be a d-dimensional closed Kähler manifold with Hermitian holomorphic line bundle L as above. Write L^p for the p-th tensor power of L, where p ∈ N* is large enough^c, and consider the space H_p of global holomorphic sections of L^p. The quantity ħ = 1/p plays the role of the Planck constant, so that the classical limit is given by p → +∞. In this setting, one defines a family of pure L(H_p)-valued POVMs dW_p = n_p F_p dα_p on X, taking the map F_p : X → S(H_p) which sends a point x ∈ X to the coherent state projector at x, and taking n_p = dim_C H_p. From the viewpoint of algebraic geometry, the map F_p comes from the Kodaira embedding theorem. The measure α_p is given at any x ∈ X by

$$ d\alpha_p(x) = \frac{R_p(x)}{n_p}\,dv_X(x), \tag{9} $$

where the density R_p : X → R is called the Rawnsley function, and dv_X is the measure associated to the canonical volume form ω^d/d!. From the viewpoint of complex geometry, the Rawnsley function is given by the value of the Bergman kernel on the diagonal, and it is a classical fact that R_p(x) ≠ 0 for all x ∈ X and p ∈ N* big enough.

^c Our convention is that the set of natural numbers N contains 0. We write N* for strictly positive natural numbers.
In the context of the Berezin-Toeplitz quantization, the operator B_p := (1/n_p) T_p* T_p given by formula (5) above is known as the Berezin transform. Note that for any p ∈ N*, the operator B_p has a finite-dimensional image, and all its eigenvalues lie in the interval [0, 1]. There is a finite number of positive eigenvalues with multiplicities, while 0 has infinite multiplicity. Write

$$ 1 = \gamma_{0,p} \ge \gamma_{1,p} \ge \dots \ge \gamma_{k,p} \ge \dots \ge 0 $$

for the eigenvalues of B_p with multiplicities.
Let ∆f = −div ∇f be the (positive) Laplace-Beltrami operator associated with the Kähler metric, acting on functions on X with eigenvalues

$$ 0 = \lambda_0 \le \lambda_1 \le \lambda_2 \le \dots \le \lambda_k \le \dots \tag{10} $$

Theorem 3.1. For every integer k ∈ N, we have the following asymptotic estimate as p → +∞,

$$ 1 - \gamma_{k,p} = \frac{\lambda_k}{4\pi p} + O(p^{-2}). $$

Furthermore, every sequence in p ∈ N* of L^2(X, α_p)-normalized eigenfunctions of B_p corresponding to the eigenvalue γ_{k,p} contains a subsequence converging to an eigenfunction of the Laplace-Beltrami operator corresponding to λ_k in the C^∞-sense.
Note that in the context of Section 2, Theorem 3.1 is equivalent to the same statement via T * p for the operator E p : L (H p ) → L (H p ) defined from B p by the formula (4).
The Berezin transform B_p and its associated operator E_p have prominent cousins, the Q_K-operator and the Q-operator, respectively, introduced by Donaldson [17, §4] in the framework of his program of finding numerical approximations to distinguished Kähler metrics on complex projective manifolds^d. They are defined in the same way, replacing the measure α_p in (9) by the canonical probability measure dv_X / Vol(X). In particular, for any p ∈ N*, Donaldson's Q-operator Q_p :

Write the eigenvalues of Q_p as

and set

For some Kähler metrics of constant scalar curvature, Donaldson considered the Q-operator as a finite-dimensional approximation of the heat operator and predicted (see p. 611 in [17]) that as p → +∞, the spectrum of Q_p approximates the spectrum of

Furthermore, for every sequence {A_p}_{p∈N*} of normalized eigenvectors of Q_p in L(H_p) corresponding to the eigenvalue β_{k,p} for all p ∈ N*, there is a subsequence of {T_p* A_p}_{p∈N*} converging to an eigenfunction of the Laplace-Beltrami operator corresponding to λ_k in the C^∞-sense.

^d We keep Donaldson's notation for these operators.
We refer to [22,32] for a related study of the asymptotic behaviour of the spectrum of certain geometric operators arising in Donaldson's program.
Let us introduce the following useful notion [17,23]. Definition 3.3. Let (X, ω) be a closed Kähler manifold. Let L be a holomorphic bundle over X equipped with a Hermitian metric h such that the curvature of the corresponding Chern connection equals −2πiω. Fix a positive integer p so that the Kodaira map X → H 0 (X, L p ) is an embedding. We say that the collection (X, ω, L, h, p) is balanced if the corresponding Rawnsley function R p is constant on X.
Note that for the balanced data (X, ω, L, h, p) the Berezin transform B_p and the Q_{K,p}-operator coincide, as well as E_p and the Q_p-operator. In that case, the result of Theorem 3.1 is relevant in [17, § 4.3]. We refer the reader to [17, § 4.1] and to [21, § 1.4.1] for an interpretation of these operators in terms of complex geometry of (X, L^p). Let us finally mention that the approximation of the heat operator by the Q_K-operator has been explored by Liu and Ma in [36], and that the analogue of the refined Karabegov-Schlichenmaier expansion of Proposition 3.8 for the Q_K-operator has been shown by Ma and Marinescu in [40, Th.6.1]. Some ingredients of their approach are instrumental for us.
It follows from Theorem 3.1 that the spectral gap of B_p equals

$$ \gamma(B_p) = \frac{\lambda_1}{4\pi p} + O(p^{-2}). \tag{15} $$

In particular, this yields that the eigenvalue 1 of B_p is simple (i.e., has multiplicity 1) for all sufficiently large p.
Example 3.4. Take the projective line X = CP^1 = S^2 of area 1. Let L = O(1) be the holomorphic line bundle over X dual to the tautological one. The quantum Hilbert space H_p of global holomorphic sections of L^p can be identified with the (p + 1)-dimensional space of homogeneous polynomials of degree p in 2 variables. A representation-theoretical argument (see [48,17] and Remark 6.7 below) shows that the eigenvalue γ_{1,p} of the Berezin transform equals p/(p + 2). The Kähler metric on X has constant curvature. For such metrics the first eigenvalue λ_1 of the Laplace-Beltrami operator equals 8π/Area = 8π. We get that

$$ \gamma(B_p) = 1 - \frac{p}{p+2} = \frac{2}{p+2} = \frac{2}{p} + O(p^{-2}) = \frac{\lambda_1}{4\pi p} + O(p^{-2}), $$

as predicted by (15).
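The agreement in Example 3.4 can be confirmed with exact rational arithmetic. The short sketch below uses only the two quantities stated in the example, γ_{1,p} = p/(p + 2) and λ_1 = 8π, so that λ_1/(4πp) = 2/p. The function names are illustrative. It evaluates p^2 times the difference between the exact gap and the leading term, which should stay bounded if the remainder is O(p^{-2}).

```python
from fractions import Fraction

def gamma_1(p):
    """Second eigenvalue of the Berezin transform on CP^1, as stated in Example 3.4."""
    return Fraction(p, p + 2)

def predicted_gap(p):
    """Leading term lambda_1 / (4 pi p) with lambda_1 = 8 pi, i.e. 2 / p."""
    return Fraction(2, p)

def scaled_remainder(p):
    """p^2 * |exact gap - leading term|; bounded iff the remainder is O(p^-2)."""
    exact_gap = 1 - gamma_1(p)
    return p**2 * abs(exact_gap - predicted_gap(p))

remainders = {p: scaled_remainder(p) for p in (1, 10, 100, 1000)}
```

Since 2/(p + 2) − 2/p = −4/(p(p + 2)), the scaled remainder equals 4p/(p + 2), which increases to 4 as p grows and never exceeds it.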
The upper bound in (15) immediately follows from the Karabegov-Schlichenmaier asymptotic expansion (1) of the Berezin transform [30],

$$ B_p f = f - \frac{1}{4\pi p}\,\Delta f + O(p^{-2}) $$

for every smooth function f on X, where the remainder O(p^{-2}) depends on f. Indeed, choosing f to be the L^2(X, α_p)-normalized first eigenfunction of ∆, we see that

$$ (B_p f, f)_p = 1 - \frac{\lambda_1}{4\pi p} + O(p^{-2}), $$

where (·, ·)_p is the scalar product on L^2(X, α_p).

The prototype example illustrating a link between the Berezin transform and the Laplace-Beltrami operator is the flat space R^{2n}, where the Berezin transform B_p simply coincides with the heat operator e^{−ħ∆/4π}, ħ = 1/p (see [3]).

It would be interesting to explore the following problem motivated by a conversation with J.-M. Bismut. Denote by χ(t) the indicator function of the interval [0, 1].

Problem 3.5. Call a non-decreasing sequence
where the norm stands for the operator norm in L^2. According to Theorem 3.1, the constant sequence r(p) = C is admissible for all C. Is the sequence r(p) = p^τ admissible for τ > 0? What is the maximal possible growth rate of an admissible sequence?
Let us finally make a couple of comments on the physical intuition behind the Berezin transform. It has been noted in the introduction that the Berezin transform can be defined as the composition of the quantization and the dequantization. It is instructive to interpret it in terms of the quantization only. Let σ be a classical state, i.e. a Borel probability measure on X, and following [9], define its quantization as

$$ \Theta_p(\sigma) := \int_X F(x)\,d\sigma(x), $$

where as earlier F(x) stands for the coherent state projector at x ∈ X. Let further f ∈ L^2(X) be a classical observable. It was noticed in [9, (11)] that the expectation ((T_p(f), Θ_p(σ))) of the value of the quantized observable T_p(f) in the quantized state Θ_p(σ) equals the classical expectation ∫_X B(f) dσ of the Berezin transform B(f) in the classical state σ. Thus in the context of Berezin-Toeplitz quantization, we get another interpretation of the blurring of quantization measured by B. Furthermore, in view of Theorem 3.1, we know that B is a Markov operator with strictly positive spectral gap. Thus it has a unique stationary measure α_p whose density against the phase volume is given by R_p/n_p, as in formula (9). Interestingly enough, this provides an interpretation of the Rawnsley function without appealing to a specific choice of coherent states.

Comments on the proof
The proof of Theorem 3.1 occupies the rest of this section, and we will deduce Theorem 3.2 as a consequence of it in Section 3.5. Our argument is structured similarly to the one in a paper by Lebeau and Michel [34] on the Markov operator associated to the semiclassical random walk on manifolds. The key intermediate results are as follows:
(i) An a priori estimate stating that for any eigenfunction f of B_p whose eigenvalue is sufficiently bounded away from 0, any Sobolev norm ‖f‖_{H^q} is bounded by C_q ‖f‖_{L^2}. See Lemma 3.10 below, which is a counterpart of Lemma 5 in [34].
(ii) The operators A_p := p(1l − B_p) and ∆/4π turn out to be ∼ p^{-1}-close as operators from H^q to L^2, for the Sobolev space H^q with some sufficiently large q; see formula (45) below, which is a counterpart of formula (3.28) in [34] and which can be considered as a refinement of the expansion (1) obtained in [30].
Combining (i) and (ii) we conclude that, roughly speaking, eigenfunctions of A p as in (i) are "approximate" eigenfunctions of the Laplacian, which eventually implies that the spectra of A p and ∆ are close to one another, which yields the desired result (see the ending of our proof which is parallel to the one in [34]).
Proving (i) and (ii) forms the main bulk of the work. In contrast to [34], our proof does not involve micro-local analysis. The main ingredients we use are the expansion of the Bergman kernel due to Dai, Liu and Ma [13] (see Theorem 3.7) and a comparison between the Berezin transform and the heat operator motivated by the work of Liu and Ma on Donaldson's Q_K-operator [36] (see Proposition 3.9 below).
Finally, an acknowledgment is in order. After a weaker version of Theorem 3.1 was posted and formula (15) was stated as a question, Alix Deleporte kindly shared with us his ideas concerning the proof of (15). He sent us notes [14] containing a number of preliminary steps in the direction of (i) and (ii) above. While the original arguments of Deleporte dealt with the case of real-analytic Kähler manifolds and line bundles and were based on the asymptotic expansion from [44,14], he informed us that they also could be adjusted to the C ∞ -case.

Preparations
Recall that the measure dv_X associated to the canonical volume form ω^d/d! is also the Riemannian volume form of X. Let ⟨·, ·⟩_{L^2} be the usual L^2-scalar product on C^∞(X, C), and let ‖·‖_{L^2} be the associated norm. For all j ∈ N, let e_j ∈ C^∞(X, C) be the normalized eigenfunction associated with the j-th eigenvalue of the Laplace-Beltrami operator, so that ‖e_j‖_{L^2} = 1 and ∆e_j = λ_j e_j as in (10) for all j ∈ N. Then for any f ∈ C^∞(X, C), we have the following equality in L^2(X, C):

$$ f = \sum_{j=0}^{+\infty} \langle f, e_j \rangle_{L^2}\, e_j. $$

For any bounded F : R → R, we define the bounded operator F(∆) acting on L^2(X, C) by the formula

$$ F(\Delta) f := \sum_{j=0}^{+\infty} F(\lambda_j)\, \langle f, e_j \rangle_{L^2}\, e_j. \tag{17} $$

The bounded operator e^{−t∆} thus defined for all t > 0 is called the heat operator. For any m ∈ N, let |·|_{C^m} be a C^m norm on C^∞(X, C). The following result is classical and can be found for example in [31], [4, Th.2.29, (2.8)].
Proposition 3.6. For any m ∈ N, there exists C_m > 0 such that for any f ∈ C^∞(X, C) and all t > 0, we have

For any m ∈ N*, let ‖·‖_{H^m} be a Sobolev norm of order m on C^∞(X, C). Using the elliptic estimates for the Laplace-Beltrami operator, for m even we define ‖·‖_{H^m} by

Note that the Laplacian ∆ is symmetric with respect to the corresponding scalar product on H^m. By convention, we set ‖f‖_{H^0} := ‖f‖_{L^2}.
Next, we turn to the Berezin transform. Recall that the Hermitian product on L and the Riemannian measure dv_X induce an L^2-scalar product on sections of L^p for any p ∈ N*, and write L^2(X, L^p) for the associated Hilbert space. The central tool for the study of the Berezin transform is the Schwartz kernel Π_p(x, y) of the orthogonal projector Π_p : L^2(X, L^p) → H_p, called the Bergman kernel. Recall that for fixed x and y, this is an element of $L^p_x \otimes \bar{L}^p_y$, where L^p_x denotes the fiber of L^p at x ∈ X and the bar stands for the conjugate line bundle. Since the bundle L comes with a Hermitian metric, we can measure the point-wise norm |Π_p(x, y)|. By Corollary 9.1.4 (2) in [35], we have that |Π_p(x, y)| = |⟨ξ_{x,p}, ξ_{y,p}⟩|, where ξ_{x,p} is the non-normalized coherent state at x ∈ X defined up to a phase factor (see e.g. [8,35] for the definition). The Rawnsley function R_p is given by R_p(x) = |ξ_{x,p}|^2, and thus satisfies

$$ R_p(x) = |\Pi_p(x, x)|. $$

It follows from (8) and (9) that

$$ B_p f(x) = \int_X \frac{|\Pi_p(x, y)|^2}{R_p(x)}\, f(y)\, dv_X(y), \tag{20} $$

so that the Schwartz kernel of B_p with respect to dv_X is given by

$$ B_p(x, y) = \frac{|\Pi_p(x, y)|^2}{R_p(x)}. \tag{21} $$

Let ‖·‖_p be the norm on L^2(X, α_p). From the classical asymptotic expansion of R_p as p → +∞, we get a constant C > 0 such that

Asymptotic expansion of the Berezin transform
For a comprehensive account on the off-diagonal expansion of the Bergman kernel as well as tools of Berezin-Toeplitz quantization in this context, we refer to [38].
We always assume that p ∈ N * is as large as needed. For any s > 0, we use the notation O(p −s ) as p → +∞ in the usual sense, uniformly in C m -norm for all m ∈ N * .
Let ε 0 > 0 be smaller than the injectivity radius of X. Fix a point x 0 ∈ X, and let Z = (Z 1 , ..., Z 2d ) ∈ R 2d with |Z| < ε 0 be geodesic normal coordinates around x 0 , where | · | is the Euclidean norm of R 2d . In these coordinates, the canonical volume form is given by with κ x 0 (0) = 1. For any kernel K(·, ·) ∈ C ∞ (X × X, C), we write K x 0 (·, ·) for its image in these coordinates, and we write |K x | C m (X) for the C m -norm of the family of functions K x with respect to x ∈ X.
Let d_X be the Riemannian distance on X. We will derive Theorem 3.1 as a consequence of the following asymptotic expansion as p → +∞ of the Schwartz kernel of the Berezin transform.

Theorem 3.7. For any m, k ∈ N and ε > 0, there is C > 0 such that for all p ∈ N* and x, y ∈ X satisfying d_X(x, y) > ε,

For any m, k ∈ N, there is N ∈ N and C > 0 such that for any x_0 ∈ X, |Z|, |Z′| < ε_0 and for all p ∈ N*, we have

of the same parity as r, depending smoothly on x_0 ∈ X. Furthermore, for any Z, Z′ ∈ R^{2d} we have

This readily follows from formula (21)

For any x ∈ X, let B_X(x, ε_0) be the geodesic ball of radius ε_0 > 0 around x, and write B(0, ε_0) ⊂ R^{2d} for the Euclidean ball of radius ε_0 around 0. The following proposition is a refinement of the Karabegov-Schlichenmaier expansion [30, (1.2)] of the Berezin transform, where we make explicit the remainder term.

Proposition 3.8. For any m ∈ N, there exists C_m > 0 such that for any f ∈ C^∞(X, C) and all p ∈ N*, we have

Proof. For any x ∈ X, write f_x for the image of f restricted to B_X(x, ε_0) in normal coordinates around x. From (24), we know that for any ε > 0 and x ∈ X,

For any k ∈ N* and m ∈ N, we will use the following Taylor expansion of f_x up to order k − 1,

for all p ∈ N* and |Z| < ε_0, where O_m means that the expansion is uniform in x ∈ X as well as all its derivatives up to order m ∈ N, and does not depend on f. We will compute the asymptotic expansion as p → +∞ of (28) using the Taylor expansion (29) of f and the asymptotic expansion (25) of the Berezin transform up to order 3. First, using the fact that B_p 1 = 1 for all p ∈ N*, we know that the polynomials J_{r,x}(Z, Z′) of the asymptotic expansion (25) of the Berezin transform satisfy

for all x ∈ X and r ∈ N*. On the other hand, recall from (26) that J_{0,x} ≡ 1 and J_{1,x} ≡ 0 for all x ∈ X.
Using the parity of Gaussian functions, a change of variable Z ↦ Z/√p and the Taylor expansion (29) for k = 4, we get that

is a polynomial in Z ∈ R^{2d} of the same parity as r ∈ N, so that using (29), (30) and the parity of Gaussian functions, we get in the same way

Finally, again using a change of variable Z ↦ Z/√p, we get for any N ∈ N* and p ∈ N*,

This completes the proof of (27).
In view of Proposition 3.6 and Proposition 3.8, it is natural to compare the Berezin transform with the heat operator by setting t = (4πp)^{-1}. This leads to the following result, which is essentially a refinement of [36, Th.0.1].

Proposition 3.9. For any m ∈ N, there exists C_m > 0 such that for any f ∈ C^∞(X, C) and all p ∈ N*, we have

Proof. Set S_p := e^{−∆/(4πp)} − B_p, which acts on L^2(X, C) for all p ∈ N* and admits a smooth Schwartz kernel S_p(·, ·) with respect to dv_X. Comparing the classical asymptotic expansion of the heat kernel, as given for example in [4, Th.2.29], [31], with Theorem 3.7, we see that for all x, y ∈ X satisfying d_X(x, y) > ε_0,

and using the formula (26) for the first two coefficients, we get for any m ∈ N a constant C > 0 and N ∈ N such that

Let us first show (34) for m = 0. For any f ∈ C^∞(X, C) and any ε > 0, we get by the Cauchy-Schwarz inequality and (35) for S_p that for all p ∈ N*,

Then (34) for m = 0 follows from (36) with Z = 0 or Z′ = 0 respectively, as in (33).
To deal with the case of arbitrary m ∈ N*, let us assume by induction that (34) is satisfied for m − 1. Considering the estimates (35) and (36) with corresponding m ∈ N*, note that for any differential operator D_x of order m in x ∈ X, there exists a differential operator D′_{x,y} in x, y ∈ X of total order m but of order at most m − 1 in x ∈ X, such that the operator S_p^{(m)} defined through its kernel for all x, y ∈ X by

also satisfies (35) and (36). Then for all x ∈ X and p ∈ N*, we get

where D′_x and D″_y are differential operators, respectively in x and in y, obtained from D′_{x,y} using a partition of unity and integration by parts in local charts, so that in particular D′_x is of order m − 1 in x ∈ X. Then using the induction hypothesis, the inequality (34) for m follows from the same inequality for m − 1, replacing f by any number of derivatives of f, and from the estimates (36) and (37) for S_p^{(m)} in the same way as before.
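The comparison between the Berezin transform and the heat operator at time t = (4πp)^{-1} can be sanity-checked in the flat model. The sketch below rests on assumptions not spelled out in the text: it works on R^2 = C, where the Euclidean heat kernel at time t = 1/(4πp) is p·exp(−πp|x − y|^2), and applies it to the plane wave cos(ξ x_1), an eigenfunction of ∆ with eigenvalue ξ^2. The kernel factorizes, so only a one-dimensional Gaussian integral has to be evaluated numerically. The function names and the quadrature parameters are illustrative choices.

```python
import numpy as np

def berezin_multiplier(xi, p, half_width=10.0, num=200001):
    """Numerically integrate sqrt(p) * exp(-pi p y^2) * cos(xi y) over the real line.

    The rescaled variable u = sqrt(p) * y is used, so the integrand becomes
    exp(-pi u^2) * cos(xi u / sqrt(p)) and the same grid serves every p.
    """
    u = np.linspace(-half_width, half_width, num)
    du = u[1] - u[0]
    integrand = np.exp(-np.pi * u**2) * np.cos(xi * u / np.sqrt(p))
    return float(np.sum(integrand) * du)

def heat_multiplier(xi, p):
    """Eigenvalue of exp(-Delta / (4 pi p)) on a plane wave of frequency xi."""
    return float(np.exp(-xi**2 / (4 * np.pi * p)))

def second_order_remainder(xi, p):
    """p^2 * |heat multiplier - (1 - xi^2 / (4 pi p))|, expected to stay bounded."""
    first_order = 1 - xi**2 / (4 * np.pi * p)
    return p**2 * abs(heat_multiplier(xi, p) - first_order)

xi = 2.0
gaps = {p: abs(berezin_multiplier(xi, p) - heat_multiplier(xi, p)) for p in (1, 10, 100)}
remainders = {p: second_order_remainder(xi, p) for p in (10, 100, 1000)}
```

In this model the two multipliers agree up to quadrature error, and the scaled second-order remainder is bounded by ξ^4/(32π^2), which matches the shape of the expansion f − ∆f/(4πp) + O(p^{-2}).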

Spectrum
Recall that ‖·‖_p denotes the norm on L^2(X, α_p). In this section, we consider a sequence {f_p}_{p∈N*}, with f_p ∈ C^∞(X, C) such that

for some µ_p ∈ Spec(B_p) for all p ∈ N*. The following estimate is crucial for the proof of Theorem 3.1.
Proof. Note that (41) is automatically verified for m = 0 by (22) and (40). By induction on m ∈ N, let us assume that (41) is satisfied for m − 1. Let us write

where the bounded operator F(∆/p) acting on L^2(X, C) is defined as in (17) for the continuous function F : R → R given for any s ∈ R* by F(s) = 4π(1 − e^{−s/4π})/s. As |p(1 − µ_p)| < L for all p ∈ N*, by Proposition 3.9 and formula (19) for ‖·‖_{H^{2m}}, this gives a constant C_m > 0 such that

On the other hand, note that by hypothesis, we have µ_p → 1 as p → +∞. Using Proposition 3.9 again, we then get ε_m > 0 and p_m ∈ N* such that for all p > p_m,

This together with (43) gives (41).
Proof of Theorem 3.1. For any f ∈ C^∞(X, C), by Propositions 3.8 and 3.9 we get that

with q even and large enough. The inequality on the right follows from the Sobolev embedding theorem, and the same is true in L^2(X, α_p)-norm by (22). Now if e_j ∈ C^∞(X, C) is such that ∆e_j = λ_j e_j, then by (45) we get C_j > 0 not depending on p ∈ N* such that

Thus if m_j ∈ N is the multiplicity of λ_j as an eigenvalue of ∆, the estimate (46) for all eigenfunctions of ∆ associated with λ_j gives a constant C > 0 such that

This immediately follows from the variational principle for the operator p(1l − B_p) − (λ_j/4π) 1l. Consider now for every p ∈ N* an eigenfunction f_p ∈ C^∞(X, C) of B_p as in (40) such that the associated sequence {p(1 − µ_p)}_{p∈N*} of eigenvalues of p(1l − B_p) is bounded. Combining Lemma 3.10 with the right inequality in (45), we get C > 0 such that

In particular, we get that

Finally, let us show that there exists p_0 ∈ N* such that (47) is in fact an equality for p > p_0. To this end, let l ∈ N* with l ≥ m_j be such that for all p ∈ N*, there exists an orthonormal family f_{k,p}, 1 ≤ k ≤ l, of eigenfunctions of B_p in L^2(X, dα_p) with associated eigenvalues µ_{k,p} ∈ R, 1 ≤ k ≤ l, satisfying

As the inclusion of the Sobolev space H^q in H^{q−1} is compact, by Lemma 3.10 there exists a subsequence of {f_{k,p}}_{p∈N*} converging to a function f_k in H^{q−1}-norm, for all 1 ≤ k ≤ l. In particular, taking q > 2, the family f_k, 1 ≤ k ≤ l, is orthonormal in L^2(X, C) and satisfies ∆f_k = λ_j f_k for all 1 ≤ k ≤ l by (48). By definition of the multiplicity m_j ∈ N of λ_j, this forces l = m_j.
Let us sum up our findings. First, equality where m_j is the multiplicity of λ_j as an eigenvalue of ∆, together with (49), readily yields the first statement of the theorem: Second, observe that we got a subsequence of f_{k,p}, p ∈ N*, converging to f_k in the Sobolev H^{q−1} sense, where the even integer q can be chosen arbitrarily large. By the Sobolev embedding theorem, this yields a subsequence which C^l-converges to f_k for arbitrary l. Iterating this argument for this subsequence, we get a sequence p_l → +∞ such that ‖f_{k,p_l} − f_k‖_{C^l} ≤ 1/l, which means that f_{k,p_l} converges to f_k in the C^∞ sense. This completes the proof.

Proof of Theorem 3.2.
Let us consider Donaldson's Q K -operator, acting on f ∈ C ∞ (X, C) by We will show that when the scalar curvature is constant, the analogue of Theorem 3.1 holds for this operator. As p ′ /p = 1 + O(p −1 ) by the Riemann-Roch theorem, this will imply Theorem 3.2 via the morphism T * p of Section 2, which relates Q p with Q K,p in the same way that E p is related to B p .
Recall that R p : X → R denotes the Rawnsley function, and that n p = dim C H p . By the classical asymptotic expansion of the Bergman kernel, which can be found for example in [38, § 4.1.1], we know that when the scalar curvature is constant, we have As this expansion holds in C m -norm for all m ∈ N * and by the definition B p and Q K,p in formulas (20) and (51) respectively, we get a constant C m > 0 for any m ∈ N * such that It is then easy to see that Lemma 3.10 holds for any sequence with {p(1 − µ p )} p∈N * bounded, simply using the estimate (53) to replace B p by Q K,p in (42) and (44). We can then follow the proof of Theorem 3.1 above to get the same result for Q K,p , using the estimate (53) to replace B p by Q K,p in (45) and (46), and using (22) to replace · p by · L 2 in (46). This completes the proof.
Let us now make a final comment on the case when the complex structure is not assumed to be integrable, so that (X, ω) is a closed symplectic manifold of real dimension 2d, and L is a Hermitian line bundle with Hermitian connection of curvature −2πiω. One can then consider the following renormalized Bochner Laplacian acting on C ∞ (X, L p ) for any p ∈ N * , first introduced by Guillemin and Uribe [25], where ∆ L p stands for the usual Bochner Laplacian on L p . By [25, Th.2.a], the spectrum of ∆ p is contained in I ∪ (C 1 p − C 2 , +∞) for all p ∈ N * , for some C 1 , C 2 > 0 and some interval I ⊂ R containing 0. We can then consider Π p as the associated spectral projection corresponding to I and set H p = Im(Π p ). Using the work of Ma and Marinescu on the kernel of Π p [39], all the preliminaries of Section 3.3 hold in this context, and we claim that Theorem 3.1 holds as well. In fact, the Berezin transform admits an asymptotic expansion similar to Theorem 3.7 as a consequence of the analogous result in [37, (2.31), (3.2)], except for the formula (26), where we only have J 1,x 0 (0, Z ′ ) = 0 for all Z ′ ∈ R 2d as a consequence of [28, Lem.6.1, Lem.6.2] (see also [39, (2.32)]). Then Proposition 3.8 holds as stated, and Proposition 3.9 holds with p −σ instead of p −1 on the right hand side, for some σ > 0. It is then straightforward to adapt the rest of the proof. Note that the corresponding estimates in Proposition 3.8 and Proposition 3.9 can be seen as refinements of [37]. On the other hand, the proof extends with no modifications to the case where L p is replaced by L p ⊗ E for any Hermitian vector bundle E equipped with an Hermitian connection, again by the results of [39].

Berezin transform and Donaldson's iterations
In [17] Donaldson, as part of his program of developing approximate methods for detecting canonical metrics on Kähler manifolds, discovered a remarkable class of dynamical systems on the space of all Hermitian products on a given complex vector space. We shall show in this section that the linearization of such a system at a fixed point can be identified with the quantum channel E introduced in (4) above, and prove that under certain natural assumptions, E is injective and has strictly positive spectral gap. Using earlier results by Donaldson, we will then deduce that the iterations of this system converge exponentially fast to the fixed point.
For a complex n-dimensional vector space V, denote by Prod(V) the space of Hermitian products on V. Given such a q ∈ Prod(V), let H := (V, q) be the corresponding Hilbert space, and define a map sending a line Λ ∈ P(H) to the orthogonal projector to Λ with respect to q. Every product q ∈ Prod(V) gives rise to an antilinear Riesz map I q : V → V * , defined by the formula I q (v) = q(·, v), for all v ∈ V. This induces in turn a product q * ∈ Prod(V * ) by the formula q * (ξ, η) := q(I −1 q η, I −1 q ξ), for all ξ, η ∈ V * . We write H * := (V * , q * ) for the corresponding Hilbert space.
In this section, we will be mainly concerned with the problem of finding a ν-balanced Hermitian product q ∈ Prod(V), under some natural assumptions on the measure ν. The existence of ν-balanced products in this context is due to Bourguignon, Li and Yau [6], where they use such products to give an upper bound for the first eigenvalue of the Laplacian of complex manifolds embedded in the projective space. This generalizes the seminal work of Hersch [27], where he shows that the first eigenvalue of any metric over S 2 is smaller than the one of the round metric, using the notion of balanced product in its simplest form.  (3). Let σ W be the pushforward measure F * α on P(V) by the associated map where P(V) is identified with the set of rank one projectors in S(H) ⊂ L (H). It is then an immediate consequence of the definitions that q * ∈ Prod(V * ) is σ W -balanced.
is the space of holomorphic sections of L p , and q L 2 ∈ Prod(H 0 (X, L p )) is the L 2 -Hermitian product induced by the Kähler metric. In this case, the measure σ Wp is the push-forward of α p under the Kodaira embedding Then by Lemma 6.13 in [23], we get as a special case of the previous example that q L 2 ∈ Prod(H 0 (X, L p )) is σ Wp -balanced.

Example 4.3.
Taking the previous example the other way around, let X be a complex manifold and L → X be an ample line bundle over X. Then the Kodaira map (60) is an embedding for p sufficiently large. Let now q ∈ Prod(H 0 (X, L p )) be any Hermitian product on H 0 (X, L p ). This gives rise to the Fubini-Study form ω (q) F S on P(H 0 (X, L p ) * ), inducing a volume form ν q on F p (X), which we consider as a measure on P(H 0 (X, L p ) * ). Considering the Berezin-Toeplitz POVM of Section 3.1 associated with the Kähler form F S , we get by Proposition 8.3 in [23] that the collection (X, ω q , L, h, p) is balanced in the sense of (3.3) if and only if the product q is ν q -balanced. Combining with the previous example, we see that this happens if and only if q is the L 2 -Hermitian product induced by h. Now, starting with a measure ν on P(V * ), we wish to find a ν-balanced product q ∈ Prod(V). First choose any base point q 0 ∈ Prod(V), and identify (V, q 0 ) with (C n , ·, · ), where z, w = j z jwj . This identifies Prod(V) with the set L (C n ) + of positive Hermitian n × n matrices: every q ∈ Prod(V) can be uniquely written as q(v, w) = G T v, w , with G ∈ L (C n ) + . On the other hand, under the identification of (C n ) * with C n induced by ·, · , this identifies P(V * ) with CP n−1 , and the dual product q * is given by q * (v, w) For a non-zero vector z ∈ C n , denote by Π z the orthogonal projector with respect to ·, · to the line generated by z. Then one readily checks that q ∈ Prod(V) is ν-balanced if and only if its image G ∈ L (C n ) + via the identification above satisfies This can be rewritten as This dynamical system T ν on L (C n ) was defined by Donaldson in [17]. Under mild conditions on the measure ν, Donaldson proved that for every initial condition G 0 ∈ L (C n ) + , the iterations T r ν (G 0 ) converge to a fixed point G ∈ L (C n ) + as r → +∞, and that this fixed point is unique up to the action of R + on L (C n ) by scalar multiplication. 
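For a measure ν supported on finitely many points, the iteration described above can be explored numerically. The following is a minimal sketch, not the construction of [17] verbatim: it assumes that, in the identification with C^n, the map takes the form T(G) = n Σ_i w_i z_i z_i^* / ⟨G^{-1} z_i, z_i⟩ for points [z_i] with weights w_i summing to 1. This form is an assumption of ours (the displayed formulas are not reproduced here), chosen so that tr[T(G)G^{-1}] = n and so that fixed points correspond to balanced configurations. The names donaldson_step and iterate are hypothetical.

```python
import numpy as np

def donaldson_step(G, Z, w):
    """One step of the ASSUMED map T(G) = n * sum_i w_i z_i z_i^* / (z_i^* G^{-1} z_i)."""
    n = G.shape[0]
    Ginv = np.linalg.inv(G)
    # Denominators z_i^* G^{-1} z_i, real and positive for positive definite G.
    denom = np.einsum("ij,jk,ik->i", Z.conj(), Ginv, Z).real
    # Weighted sum of the rank-one matrices z_i z_i^*.
    return n * np.einsum("i,ij,ik->jk", w / denom, Z, Z.conj())

def iterate(Z, w, steps=500):
    """Iterate the map from G_0 = identity, recording relative step sizes."""
    G = np.eye(Z.shape[1], dtype=complex)
    residuals = []
    for _ in range(steps):
        G_next = donaldson_step(G, Z, w)
        residuals.append(np.linalg.norm(G_next - G) / np.linalg.norm(G))
        G = G_next
    return G, residuals

rng = np.random.default_rng(0)
N, n = 40, 3
Z = rng.normal(size=(N, n)) + 1j * rng.normal(size=(N, n))  # rows are lifts z_i
w = np.full(N, 1.0 / N)                                     # uniform weights

G, residuals = iterate(Z, w)

# Balance check in the coordinates y_i = L^{-1} z_i, where G = L L^*:
# the normalized projectors should average to the identity divided by n.
L = np.linalg.cholesky(G)
Y = np.linalg.solve(L, Z.T).T
Y /= np.linalg.norm(Y, axis=1, keepdims=True)
balance = n * np.einsum("i,ij,ik->jk", w, Y, Y.conj())
print(residuals[-1], np.linalg.norm(balance - np.eye(n)))
```

Generic points in C^3 are used here so that no projective line carries too much mass; for degenerate configurations the iteration need not converge.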
Donaldson's argument, which we reproduce below, shows that the linearization of T ν at G coincides with the quantum channel E of W defined in (4).
To see this, let q ∈ Prod(V) be a ν-balanced Hermitian product, which we identify with a fixed point G ∈ L (C n ) + of T ν as above, and let W be the associated POVM as in (57). Note that the two points of view are linked via composition by G, which identifies the space L (C n ) = L (V, q 0 ) with L (H) = L (V, q). Note that if we choose q 0 = q as a base point, then we have G = 1l and this identification is an equality.

Lemma 4.4. [17, p.609] Let q ∈ Prod(V) be a ν-balanced Hermitian product, and let G ∈ L (C n ) + be the associated fixed point of T ν . Under the natural identification of the tangent space T_G L (C n ) + ≃ L (C n ) with L (H) via composition by G, the differential of T ν satisfies D_G T_ν = E, where E is the quantum channel of the POVM W associated to q.
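The statement of the lemma can be sanity-checked by finite differences in the simplest situation G = 1l. The sketch below makes two assumptions of ours: the map has the form T(G) = n Σ_i w_i z_i z_i^* / ⟨G^{-1} z_i, z_i⟩ for unit vectors z_i, and the quantum channel of the pure POVM W_i = n w_i Π_i is E(A) = n Σ_i w_i tr(Π_i A) Π_i. The balanced configuration is produced from a matrix with orthonormal columns; the helper names T and E are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n = 12, 3

# Rows of Q form a tight frame: sum_i f_i f_i^* = identity on C^n.
Q, _ = np.linalg.qr(rng.normal(size=(N, n)) + 1j * rng.normal(size=(N, n)))
norms2 = np.sum(np.abs(Q) ** 2, axis=1)
w = norms2 / n                      # probability weights (they sum to 1)
Z = Q / np.sqrt(norms2)[:, None]    # unit vectors z_i spanning the lines

def T(G):
    """ASSUMED form of the iteration map for a finitely supported measure."""
    denom = np.einsum("ij,jk,ik->i", Z.conj(), np.linalg.inv(G), Z).real
    return n * np.einsum("i,ij,ik->jk", w / denom, Z, Z.conj())

def E(A):
    """Quantum channel A -> n * sum_i w_i tr(Pi_i A) Pi_i of the pure POVM."""
    expect = np.einsum("ij,jk,ik->i", Z.conj(), A, Z).real  # tr(Pi_i A)
    return n * np.einsum("i,ij,ik->jk", w * expect, Z, Z.conj())

identity = np.eye(n, dtype=complex)

# Random Hermitian direction of unit Hilbert-Schmidt norm.
X = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = X + X.conj().T
A /= np.linalg.norm(A)

eps = 1e-6
finite_difference = (T(identity + eps * A) - T(identity)) / eps
error = np.linalg.norm(finite_difference - E(A))
print(error)
```

The identity is a fixed point here because the configuration is balanced by construction, so the finite difference approximates the differential at that fixed point.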
Recall that the quantum channel E : L (H) → L (H) satisfies E(1l) = 1l, and that its spectral gap is the quantity γ = 1 − λ_1, where 1 = λ_0 ≥ λ_1 ≥ λ_2 ≥ · · · ≥ 0 (63) is the decreasing sequence of eigenvalues of E. Keeping in mind the identification of V and V * with C n via a base point q 0 ∈ Prod(V), our first goal is to prove the following result.
(i) Assume that for any projective subspace Σ of CP n−1 , we have Then for any G 0 ∈ L (C n ) + , the iterations T r ν (G 0 ) converge to a fixed point G ∞ ∈ L (C n ) + as r → +∞, unique up to the action of R + by scalar multiplication. Furthermore, the quantum channel E associated to G ∞ as in Lemma 4.4 has positive spectral gap.
(ii) Assume in addition that at least one irreducible component of Y is not contained in any proper projective subspace of CP n−1 . Then the associated quantum channel E is invertible.
The proof of Theorem 4.5 will be divided into Propositions 4.6, 4.7 and 4.8 below. But first, some remarks are in order.
Note that if Y is irreducible, assumptions (i) and (ii) are satisfied as soon as Y is not contained in a proper projective subspace of CP n−1 . Thus these assumptions are automatically satisfied in the important case when ν is induced by a smooth volume form over a complex manifold X embedded in a projective space via Kodaira embedding.
Conversely, observe that if there exists a ν-balanced Hermitian product, the whole variety Y (in contrast with its irreducible components) cannot lie in a proper projective subspace of CP n−1 . Indeed, assume without loss of generality that ·, · is ν-balanced, and assume on the contrary that the lift of Y to C n is orthogonal to a non-zero vector, say u. Then equation (58) yields However, Π y u = 0 by the assumption, and we arrived at a contradiction.
Note also that assumption (i) coincides with Donaldson's assumption 2 in [17, p. 581] when Y is a finite collection of points. Donaldson proved that if either Y is a complex variety which is not contained in any proper projective subspace, or Y is a finite collection of points satisfying (i), every orbit of T ν converges to a balanced product G as time goes to infinity. The proof of Proposition 4.6 closely follows the lines of [17, p. 581]. Proposition 4.6. Assume that assumption (i) of Theorem 4.5 holds. Then for any G 0 ∈ L (H) + , the iterations T r ν (G 0 ) converge to a fixed point G ∞ ∈ L (C n ) + as r → +∞, unique up to the action of R + by scalar multiplication.
Proof. Recall that ·, · denotes the canonical Hermitian product of C n . Following [17, p. 582], for any [z] ∈ CP n−1 , let z ∈ C n be a lift of norm 1, and for any G ∈ L (C n ) + , set This quantity does not depend on the choice of a lift of [z] ∈ CP n−1 of norm 1, and the second term makes it invariant under multiplication of G by a positive scalar. Given a Borel measure ν on CP n−1 , we then define a functional on L (C n ) + by the formula for any G ∈ L (C n ). Using (62), we see that G ∈ L (C n ) + is a critical point of Ψ ν if and only if it is a fixed point of T ν . Thus to show the existence and unicity of such a fixed point up to the action of R + , we can restrict Ψ ν to the space L (C n ) 1 + of positive Hermitian matrices of determinant 1, and it suffices to show that Ψ ν is strictly convex and proper along any geodesic of L (C n ) 1 + for its natural Riemannian metric as a symmetric space. In fact, any strictly convex and proper function over R has a unique absolute minimum, which is also its unique critical point. Now as two points can always be joined by a geodesic, we conclude in that case that a fixed point of T ν on L (C n ) 1 + coincide with a minimum of Ψ ν , which exists and is unique. Recall that the structure of symmetric space on L (C n ) 1 + is given by the map which realizes L (C n ) 1 + as the quotient of the special linear group SL n (C) by the special unitary group SU(n). The usual scalar product ((·, ·)) on the space of n × n matrices induces a Riemannian metric on L (C n ) 1 + through the identification of its tangent space at any point with the space of traceless matrices. By general theory of symmetric spaces, geodesics are simply the images of 1-parameter groups of SL n (C) through the above map, so that up to the action of SU(n) by conjugation, they are of the form for all t ∈ R, where λ 1 λ 2 · · · λ n satisfy n j=1 λ j = 0. 
Now if ν satisfies assumption (i) of Theorem 4.5, its pullback by the action of a unitary matrix also satisfies this assumption, and thus we are reduced to show strict convexity and properness of Now convexity follows from a direct computation, with strict convexity as long as the total mass of ν is not contained in any projective subspace of CP n−1 , which is a straightforward consequence of assumption (i).
Let us now show properness, i.e. that Ψ ν (G t ) → +∞ when t → ±∞. By considering the geodesic going to the opposite direction, it suffices to show it when t → +∞. Consider an irreducible component Z ⊂ Y , and let k n be the largest integer such that Z is contained in the projective subspace As ν is absolutely continuous over the smooth part of Z, this means in particular that the function log |z k | 2 restricted to Z is integrable with respect to ν. We thus get a constant C Z > 0 such that For any k n, write ν k > 0 for the total mass of the irreducible components of Y for which k is the largest integer such that they are not contained in Σ k as above. We then get a constant C Y > 0 such that We are thus reduced to show that n j=1 λ j ν j > 0. Notice now that assumption (i) implies n j=k ν j < n − k n n j=1 ν j , for all 1 k n .
Let us now show the convergence of iterations of T ν to a fixed point. We will first show that T ν decreases Ψ ν , so that iterations have an accumulation point by properness, and we will then show that this accumulation point is in fact a fixed point. First note that for any G ∈ L (C n ) + , using the fact that projectors are of trace 1, formula (58), together with (61) and (62), gives tr [T ν (G)G −1 ] = n. Using the strict concavity of the logarithm, we thus get with equality if and only if T ν (G)G −1 = 1l. Thus to show that Ψ ν (T ν (G)) Ψ ν (G), by definition (66) of Ψ ν , we only need to show that T ν decreases the integral against ν of the first term of formula (65). Again by concavity of the logarithm, we get This, together with (75), proves Ψ ν (T ν (G)) Ψ ν (G), for all G ∈ L (C n ) + . To conclude, note first that properness over L (C n ) 1 + and invariance under the action of R + implies that Ψ ν is bounded from below over the whole L (C n ) + . Thus for any G 0 ∈ L (C n ) + , we get that the decreasing sequence {Ψ ν (T r ν (G 0 ))} r∈N converges to its lower bound. As both terms in the definition of Ψ ν are decreasing under iterations of T ν by (75) and (76), we then deduce that {log det(T r ν (G 0 ))} r∈N , thus also {det(T r ν (G 0 ))} r∈N , are bounded in R, and that Now from properness of Ψ ν over L (C n ) 1 + and boundedness in R of the sequences {Ψ ν (T r ν (G 0 ))} r∈N and {det(T r ν (G 0 ))} r∈N , we get that the sequence {T r ν (G 0 )} r∈N admits an accumulation point G ∞ ∈ L (C n ) + . On the other hand, by strict concavity of the logarithm, formula (77) and the equality case in formula (75) imply We thus get that the accumulation point is unique, and satisfies T ν (G ∞ ) = G ∞ . This concludes the proof.
In the following Proposition, we use the result that a fixed point of T ν exists as soon as ν satisfies assumption (i), which was proved in the previous Proposition. Proof. Consider a Borel measure ν over CP n−1 satisfying assumption (i) of Theorem 4.5, and normalize it by setting α := ν/|ν|. Through the identification of Lemma 4.4, we assume without loss of generality that G = 1l and that D_G T_ν = E. Until the end of the proof we write z for a non-vanishing vector in C n and [z] for its class in CP n−1 . For any z, w ∈ C n , write for the Schwartz kernel of the Berezin transform B on L 2 (CP n−1 , ν) with respect to dα.
Recall that ·, · stands for the canonical Hermitian product of C n , and let Y 1 , . . . , Y q be the irreducible components of Y . Since (z, w) → z, w is holomorphic in z and antiholomorphic in w, for every i, j q, we get that Consider a graph Γ with vertices 1, . . . , q, where i, j are connected by an edge whenever (b) occurs of Y i × Y j . In particular, each i is connected by an edge to itself.
Using the same trick than in formula (37) above, we apply Cauchy-Schwarz inequality on the formula to get for any φ ∈ L 2 (CP n−1 , ν), In particular, the equality Bφ = φ can hold only if the inequality above is an equality, and by the equality case of Cauchy-Schwarz inequality, this implies that for α-almost all x, there exists c = 0 such that cB(x, y) 1/2 = B(x, y) 1/2 φ(y) for α-almost all y. In terms of the graph defined in the previous step, this yields that φ is constant on every subset of the form j∈star(i) Y j , where i = 1, . . . , q. Thus if φ is a non constant function satisfying Bφ = φ, it follows that Γ is disconnected. Denote by Γ i , i = 1, . . . , k the connected components, and put Assuming that there exists a non-constant φ satisfying Bφ = φ as above, we will show that assumption (i) can not hold. Recall that we work with the POVM dW (x) = nΠ x dα(x), where Π x is the orthogonal projector to the line x ∈ CP n−1 with respect to ·, · . With this notation, B(x, y) = 0 yields Π x Π y = 0. Write P = W (Z 1 ) and P ′ = W (Z 2 ∪· · ·∪Z k ). It follows that P +P ′ = 1l and P P ′ = 0. Thus P is an orthogonal projector whose image is a proper projective subspace Σ of CP n−1 of dimension m − 1, with Observe also that if P z = 0, we get and hence Π x z, z = 0 for ν-almost all x. Since ν is absolutely continuous on each irreducible component of Y , it follows that x is orthogonal to z for all x ∈ Z 1 , and hence Z 1 ⊂ Σ. We conclude that so that assumption (i) does not hold.
Pick any irreducible component Z. If it lies entirely in Ker A, then Z is contained in a proper projective subspace. Otherwise, pick [u] ∈ Z such that Au ≠ 0. We thus proved that any other [z] ∈ Z satisfies the linear equation ⟨z, Au⟩ = 0, meaning that Z lies in a proper projective subspace. This is in contradiction with assumption (ii).
The main consequence of Theorem 4.5 and the main result of this section is the exponential convergence of Donaldson's iteration process to the ν-balanced product.
Corollary 4.9. Suppose that the measure ν on CP n−1 satisfies the assumptions (i) and (ii) of Theorem 4.5. Then for any G 0 ∈ L (C n ) + , there exists a fixed point G ∞ ∈ L (C n ) + of T ν and constants C > 0 and β ∈ (0, 1) such that for all r ∈ N, we have
Proof. Suppose that ν satisfies assumptions (i) and (ii). Let us simplify the notation by setting L := L (C n ) + and T = T ν . Take any G 0 ∈ L . By Theorem 4.5, its orbit T^r(G_0) has a limit G ∞ ∈ L as r → +∞, which without loss of generality equals 1l. Write L 1 for the space of positive Hermitian matrices of determinant 1. Identify diffeomorphically L with L 1 × R + via the map Then for every r ∈ N, Recall that by Lemma 4.4, the differential D_{1l} T coincides with the quantum channel E. Since L 1 is a slice of the R + -action and T is R + -equivariant, the differential of D ∘ T at 1l equals the restriction of E to the tangent space T_{1l} L 1 . The latter subspace consists of all trace 0 Hermitian matrices. It follows from Theorem 4.5 that the spectrum of this differential is contained in (0, 1), i.e., D ∘ T is a local diffeomorphism of L 1 near its hyperbolic fixed point 1l. By the classical Hartman-Grobman theorem, in a neighbourhood of 1l the map D ∘ T is conjugate by a local homeomorphism to its linearization at 1l. In particular, taking β ∈ (0, 1) as the largest eigenvalue of E in (0, 1), we get a constant C > 0 such that dist(D(T^r(G_0)), 1l) ≤ Cβ^r, for all r ∈ N.
By (83), in order to complete the proof of the exponential convergence of the orbit of G 0 to 1l, we need to show that for r large enough To this end recall that the functional Ψ ν of the proof of Proposition 4.6 is decreasing under iterations of T and invariant with respect to the action of R + by multiplication. By (84) and the differentiability of Ψ ν at 1l, there exists a constant C > 0 such that Now as both (75) and (76) are non-positive and as T r (G 0 ) → 1l as r → +∞, recalling the definition (65)-(66) of Ψ ν we deduce that Example 4.10. Consider the setting of Example 4.2 above. The spectral gap of E coincides with the one of the Berezin transform which, by (15) above, equals ∼ λ 1 (X)/(4πp). Thus by Corollary 4.9, the first eigenvalue of the Laplace-Beltrami operator controls the convergence rate of Donaldson's iterations in this case.

POVMs and geometry of measures
Assume that we are given an L (H)-valued POVM on Ω satisfying equation (3), i.e., of the form dW = n F dα for some F : Ω → S(H). In this section we discuss spectral properties of the Berezin transform associated to W in terms of the geometry of the measure on S(H), focusing on its multi-scale features, and on stability of the spectral gap under perturbations of the measure. Recall that for pure POVMs we have encountered measure (88) in Example 4.1.
Write V ⊂ L (H) for the affine subspace consisting of all trace 1 operators, dist for the distance on V associated to the scalar product ((A, B)) = tr(AB) on L (H). Given a compactly supported probability measure σ on V, introduce the following objects: • the center of mass C(σ) = V vdσ(v); • the mean squared distance from the origin, • the mean squared distance to the best fitting line where the infimum is taken over all affine lines ℓ ⊂ V.
The infimum in the definition of J is attained at the (not necessarily unique) best fitting line, which is known to pass through the center of mass C (Pearson, 1901; see [19, p.188] for a historical account). e Observe that the center of mass C(σ_W) for the measure σ_W given by (88) coincides with the maximally mixed state (1/n) 1l.
e The problem of finding J and the corresponding minimizer ℓ appears in the literature under several different names including "total least squares" and "orthogonal regression".
Proof. Let ℓ ⊂ V be any line passing through the center of mass 1 n 1l generated by a trace zero unit vector A ∈ L (H). For a point B ∈ V we have Integrating over σ W and taking infimum over ℓ we get that with The latter integral can be rewritten as so by definition K = n −1 γ 1 = n −1 (1 − γ(W )). Substituting this into (89), we deduce the theorem.
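The relation between the gap and the best fitting line can be checked numerically on a finite pure POVM. The sketch below is an illustration under assumptions of ours: we take I to be the mean squared distance from the center of mass (1/n) 1l, we compute J as I minus the largest principal variance of the cloud (Pearson's characterization of the best fitting line), and we read the relation in the form γ_1 = n(I − J), which agrees with the identity nγ_1 = n^2(I − J) used later in the text. The random tight frame is only a convenient way of producing a balanced finite POVM.

```python
import numpy as np

rng = np.random.default_rng(2)
N, n = 15, 3

# Finite pure POVM W_i = n * w_i * P_i built from a tight frame (rows of Q).
Q, _ = np.linalg.qr(rng.normal(size=(N, n)) + 1j * rng.normal(size=(N, n)))
norms2 = np.sum(np.abs(Q) ** 2, axis=1)
w = norms2 / n
Z = Q / np.sqrt(norms2)[:, None]
P = np.einsum("ij,ik->ijk", Z, Z.conj())        # rank-one projectors P_i

# Symmetrized Berezin matrix; it has the same eigenvalues as
# B_ij = n * tr(P_i P_j) * w_j.
overlaps = np.abs(Z @ Z.conj().T) ** 2          # tr(P_i P_j)
S = n * np.sqrt(np.outer(w, w)) * overlaps
gammas = np.sort(np.linalg.eigvalsh(S))[::-1]
gamma1 = gammas[1]

# Geometry of the data cloud {P_i} with weights w_i.
C = np.einsum("i,ijk->jk", w, P)                # center of mass
X = (P - C).reshape(N, n * n)                   # centered points, vectorized
I_val = np.sum(w * np.sum(np.abs(X) ** 2, axis=1))
top_variance = np.linalg.svd(np.sqrt(w)[:, None] * X, compute_uv=False)[0] ** 2
J_val = I_val - top_variance                    # distance to best fitting line

print(gamma1, n * (I_val - J_val))
```

Both printed numbers should agree up to rounding; the computation also confirms that the center of mass is the maximally mixed state.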

Remark 5.2.
Observe that the supremum in (90) is attained at a unit vector A generating the best fitting line. By (91), A is an eigenvector of E with the eigenvalue γ_1.
For instance, consider the (pure!) Berezin-Toeplitz POVM W p from Example 4.2. Let us use formula (92) in order to calculate J. Recall that by the Riemann-Roch theorem (see [23], Propositions 2.25 and 4.21) It follows from formula (15) for γ p that . For instance, for the dual to the tautological bundle over CP 1 in Example 3.4, we have n = p + 1 and γ = 2/(p + 2), so by (92) J = 1 − 2/(p + 2).
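The value of J in the CP 1 example can be verified by hand. The computation below is a sketch under the assumption that, for a pure POVM, (92) amounts to J = I − γ_1/n with I = 1 − 1/n and γ_1 = 1 − γ; this reading is ours, and it is consistent with the identity nγ_1 = n^2(I − J) used in the proof of Theorem 5.6.

```latex
\begin{align*}
n &= p+1, \qquad \gamma = \frac{2}{p+2}, \qquad
\gamma_1 = 1-\gamma = \frac{p}{p+2},\\
I &= 1 - \frac{1}{n} = \frac{p}{p+1},\\
J &= I - \frac{\gamma_1}{n}
   = \frac{p}{p+1} - \frac{p}{(p+1)(p+2)}
   = \frac{p(p+2) - p}{(p+1)(p+2)}
   = \frac{p}{p+2}
   = 1 - \frac{2}{p+2}.
\end{align*}
```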
Furthermore, we explore robustness of the gap γ(W ), as a function of the measure σ W , with respect to perturbations in the Wasserstein distances on the space of Borel probability measures on S(H). They are defined as follows. For compactly supported Borel probability measures σ 1 , σ 2 on a metric space (X, d) the L 2 -Wasserstein distance is given by , and the L ∞ -Wasserstein distance by where in both cases the infimum is taken over all Borel probability measures ν on X ×X with marginals σ 1 and σ 2 .
Theorem 5.4. Let σ V and σ W be measures on S(H) associated to POVMs V and W respectively.
where c(n) depends on the dimension n = dim H; (ii) If in addition V and W are pure POVMs, there exists a universal constant c such that Note that this result enables us to compare spectral gaps of POVMs defined on different sets (but having values in the same Hilbert space). This idea goes back to [42] f . Let us emphasize that the estimate in (ii) is dimension-free. This is important, for instance, for comparison of spectral gaps corresponding to different Berezin-Toeplitz quantization schemes. Theorem 5.4(i) immediately follows from the fact that C(σ), I(σ) and J(σ) are Lipschitz in σ with respect to L 2 -Wasserstein distance. The details will appear in MSc thesis by V. Kaminker.
For the proof of part (ii), we need the following auxiliary statement. In what follows we write ‖A‖_2 for the Hilbert-Schmidt norm (tr(AA*))^{1/2}. Lemma 5.5. Let P, Q be rank 1 orthogonal projectors. Then for every A ∈ L (H), Proof. Suppose that P and Q are orthogonal projectors to unit vectors ξ and η, respectively. By tuning the phase of ξ, we can assume that ⟨ξ, η⟩ ≥ 0. We have |tr(A(P − Q))| = |⟨Aξ, ξ⟩ − ⟨Aη, η⟩|
f In [42] the authors consider the L 1 -version of this distance, and call it the Kantorovich distance.
Proof of Theorem 5.4 (ii): Denote by P the space of all rank 1 orthogonal projectors on H. We can assume without loss of generality that pure POVMs V and W are defined on subsets Ω 1 and Ω 2 of P, respectively, and that the maps F i : Ω i → P are the inclusions. Thus representation (3) in this case can be simplified as where σ V = α 1 and σ W = α 2 are Borel probability measures supported in Ω 1 and Ω 2 , respectively. Let us emphasize that here and below s, t stand for rank 1 orthogonal projectors. Pick any measure ν on P × P with marginals α 1 and α 2 and write We use the fact that the operators E 1 , E 2 : L (H) → L (H) given by formula (4) have the same spectrum as the Berezin transform. For A ∈ L (H) with tr(A 2 ) = 1 put One readily rewrites By Cauchy-Schwarz, writing we get |tr((s + t)A)| (tr(s + t)) 1/2 (tr(A 2 (s + t))) 1/2 = √ 2(tr(A 2 (s + t))) 1/2 .

It follows that
The integral on the right can be rewritten as and tr(A^2) = 1. It follows that D ≤ 4∆. Choosing ν so that ∆ becomes arbitrarily close to δ := δ_∞(α_1, α_2), and taking A to be an eigenvector of E_1 with the first eigenvalue γ_1(E_1), we get that But due to the variational characterization of the first eigenvalue, where the maximum is taken over all A satisfying (94). It follows that γ_1(E_1) − γ_1(E_2) ≤ 4δ. By symmetry, γ_1(E_2) − γ_1(E_1) ≤ 4δ, which yields the theorem with c = 4.
Our next result provides a geometric characterization of the eigenfunction of the operator B with the eigenvalue γ 1 . Let A ∈ L (H) be the trace zero unit vector generating the best fitting line corresponding to W . In view of Theorem 5.1, with I = I(σ W ) and J = J(σ W ).
Theorem 5.6. The function ψ_1(s) = (I − J)^{−1/2} ((F(s), A)) (95) is an eigenfunction of the operator B with the eigenvalue γ_1. Furthermore, ‖ψ_1‖ = 1.
In other words, up to a multiplicative constant, the first eigenfunction is the projection to the best fitting line.
Proof. By Remark 5.2 above, the operator A generating the best fitting line is an eigenvector of the quantum channel E: EA = γ_1 A. Since E = n^{−1} T T* and B = n^{−1} T* T, we have B(T*A) = γ_1 T*A and (T*A, T*A) = nγ_1. Furthermore, T*A(s) = n((F(s), A)) and nγ_1 = n^2(I − J). Choosing ψ_1 = T*A/‖T*A‖, we get (95).
Next, we discuss the diffusion distance on Ω associated to the Markov operator B (see [11]). This distance, which originated in geometric analysis of data sets, depends on a positive parameter τ playing the role of the time in the corresponding random process. Take any orthonormal eigenbasis {ψ k } corresponding to eigenvalues 1 = γ 0 γ 1 γ 2 . . . of B such that ψ 0 is constant. The diffusion distance D τ is defined by If γ 1 < 1, i.e., the spectral gap is positive, this expression decays exponentially. Suppose now that γ 2 < γ 1 . In this case the asymptotic behavior of D τ (s, t) as τ → ∞ is given by and D τ (s, t) = O(γ τ 2 ) otherwise. The difference in these asymptotic formulas highlights the multi-scale behaviour of the metric space (Ω, D τ ). In the first approximation, this space consists of the level sets of the function s → ((F (s), A)) situated at the distance ∼ γ τ 1 from one another, while each fiber has the diameter γ τ 2 . Viewing POVMs as data clouds in S opens up a prospect of using various tools of geometric data analysis for studying POVMs. The above result on the diffusion distance associated to a POVM can be considered as a step in this direction.
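The two scales of the diffusion distance can be observed on a finite pure POVM. The sketch below assumes the standard definition from the diffusion maps literature, D_τ(s, t)^2 = Σ_{k≥1} γ_k^{2τ} (ψ_k(s) − ψ_k(t))^2, with {ψ_k} orthonormal in L^2(Ω, α); we have not reproduced the displayed formula (96), so this normalization is an assumption. It then compares D_τ with its leading term γ_1^τ |ψ_1(s) − ψ_1(t)| and with the tail controlled by γ_2^τ. The function name diffusion_distance is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
N, n = 15, 3

# Finite pure POVM from a tight frame, as a stand-in for (Omega, alpha, F).
Q, _ = np.linalg.qr(rng.normal(size=(N, n)) + 1j * rng.normal(size=(N, n)))
norms2 = np.sum(np.abs(Q) ** 2, axis=1)
w = norms2 / n                                  # the measure alpha
Z = Q / np.sqrt(norms2)[:, None]

# Symmetrized Berezin matrix and its eigen-decomposition.
S = n * np.sqrt(np.outer(w, w)) * np.abs(Z @ Z.conj().T) ** 2
vals, vecs = np.linalg.eigh(S)
order = np.argsort(vals)[::-1]
gam = np.clip(vals[order], 0.0, None)           # 1 = gam[0] >= gam[1] >= ...
psi = vecs[:, order] / np.sqrt(w)[:, None]      # orthonormal in L^2(alpha)

def diffusion_distance(s, t, tau):
    """ASSUMED definition: sqrt(sum_{k>=1} gam_k^(2 tau) (psi_k(s)-psi_k(t))^2)."""
    diff = psi[s, 1:] - psi[t, 1:]
    return np.sqrt(np.sum(gam[1:] ** (2 * tau) * diff ** 2))

s, t, tau = 0, 1, 20
d = diffusion_distance(s, t, tau)
leading = gam[1] ** tau * abs(psi[s, 1] - psi[t, 1])
tail = gam[2] ** tau * np.linalg.norm(psi[s, 2:] - psi[t, 2:])
print(d, leading, tail)
```

When ψ_1(s) ≠ ψ_1(t) and γ_2 < γ_1, the ratio tail/leading decays like (γ_2/γ_1)^τ, which is the multi-scale behaviour described above.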

Case study: representations of finite groups
In this section we will be interested in finite POVMs associated to irreducible representations of finite groups. We start with some preliminaries from Waldron's book [47]. Let G be a finite set. Definition 6.1. A finite collection {f_s}_{s∈G} of non-zero vectors in a finite-dimensional Hilbert space H is said to be a tight frame if there exists a number A > 0, called the frame bound, such that Denote by P_s the orthogonal projector onto the line spanned by f_s. One readily checks that for such a frame, the operators form an L (H)-valued POVM on G.
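As a concrete instance, consider three unit vectors in R^2 at mutual angles of 120 degrees (often called the Mercedes-Benz frame). The sketch below assumes that the tight frame condition reads Σ_s |⟨f, f_s⟩|^2 = A ‖f‖^2 for all f, equivalently Σ_s f_s f_s^* = A 1l, and that the associated operators are W_s = (‖f_s‖^2/A) P_s; both displayed formulas are not reproduced here, so these are assumptions of ours.

```python
import numpy as np

# Three unit vectors in R^2 at mutual angles of 120 degrees.
angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
frame = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # rows are f_s

# Frame operator sum_s f_s f_s^T; tightness means it is a multiple of 1l.
frame_operator = frame.T @ frame
A = np.trace(frame_operator) / 2                            # frame bound

# ASSUMED form of the POVM elements: (|f_s|^2 / A) P_s = f_s f_s^T / A.
W = [np.outer(f, f) / A for f in frame]
total = sum(W)
print(A, total)
```

Here each W_s is positive semi-definite and the W_s sum to the identity, which is exactly the POVM property.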
Suppose from now on that G is a finite group, and we are given its non-trivial irreducible unitary representation ρ on a d ρ -dimensional Hilbert space V . g One can g All the representations considered below are assumed to be unitary. Denote by χ ρ : G → C, χ ρ (s) := tr(ρ(s)) the character of the representation ρ. Consider a basis in L 2 (G) consisting of the indicator functions of the elements of G. It readily follows from the definition that the Berezin transform B corresponding to the POVM W is given by a matrix where u(s) := |χ ρ (s)| 2 . The eigenvalues of this matrix and their multiplicities are given by the following proposition, see chapter 3E of [16].
representation of G. We claim that λ_ϕ = 1. Indeed, for s ∈ V(ρ) we have ϕ(s) = 1l and hence χ_ϕ(s) = d_ϕ, while for s ∉ V(ρ) we have χ_ρ(s) = 0. It follows that This proves the claim and hence completes the proof of the theorem. Corollary 6.6. If G is a simple group, then the gap of W is positive.
Proof. Indeed, otherwise by Theorem 6.4 and the simplicity of G, V(ρ) = {1l}, which means that χ_ρ(s) = 0 for every s ≠ 1l. Then the first statement of Lemma 6.5 yields |G| = d_ρ^2, while the second statement guarantees that |G| ≥ 1 + d_ρ^2, since ρ is a non-trivial representation. We get a contradiction.
Let us point out that there exist non-simple groups G admitting an irreducible representation ρ with V(ρ) = G. Indeed, consider the irreducible representation ρ : Z m → U(C), ρ(s) = e 2πis/m of the abelian cyclic group Z m . Observe that V(ρ) = Z m , while Z m is simple if and only if m is prime.
Let us describe the diffusion distance D τ (see (96)) corresponding to the POVM W associated to a finite group G and a non-trivial irreducible representation ρ. Recall [16] that for an irreducible representation ϕ : G → U(n), the orthonormal basis of eigenfunctions corresponding to the eigenvalue λ ϕ presented in Proposition 6.2 is given by the matrix coefficients of ϕ multiplied by d ϕ . Assume that the gap of G is strictly positve, and denote by β 1 > · · · > β k all pair-wise distinct eigenvalues of B lying in the open interval (0, 1). Denote Then (96) yields the following expression for the diffusion distance: where 2 stands for the Hilbert-Schmidt norm C 2 = (tr(CC * )) 1/2 . Note that this expression can be rewritten in terms of the character χ ϕ since ) . Define a normal subgroup Γ j := ϕ∈R j Ker(ϕ) , j = 1, . . . , k and a normal series It follows from (100) that for τ → +∞ In fact we have a sequence of nested partitions ∆ p of G formed by the cosets of K p . For every pair of distinct points s, t ∈ G choose maximal p so that s and t lie in the same element of ∆ p . Then asymptotical formula (101) holds, which manifests the multi-scale nature of the diffusion distance. Let us illustrate this in the case when G = S 4 is the symmetric group, and ρ a 3-dimensional irreducible representation. The direct calculation with the character table of S 4 shows that the first non-trivial eigenvalue 1/2 corresponds to the unique 2dimensional irreducible representation whose kernel coincides with the normal subgroup K of order 4 of S 4 called the Klein four-group. Thus D τ (s, t) ∼ (1/2) τ if s, t belong to different cosets of K in S 4 , and one can calculate that D τ (s, t) ∼ (1/3) τ if s, t are distinct and belong to the same coset. Remark 6.7. A modification of the construction presented in this section is related to Berezin-Toeplitz quantization. The modification goes in two directions. 
First, we deal with unitary representations $\rho$ of compact Lie groups $G$ instead of finite groups, and second, our POVMs are related to the $G$-orbits in a representation space $H$ as opposed to the image of $\rho$ in the endomorphisms of $H$. Let us very briefly illustrate this in the following simplest case. Consider the irreducible unitary representation $\rho_j$ of the group $G = SU(2)$ in an $n = (2j+1)$-dimensional Hilbert space $H$, $j \in \frac{1}{2}\mathbb{N}$. Fix a maximal torus $K = S^1 \subset G$, and let $w \in H$ be the maximal weight vector of $K$, that is, $\rho_j(t)w = e^{4\pi i j t} w$ for all $t \in K$. Consider an $\mathcal{L}(H)$-valued POVM $W$ on $\Omega = G/K = \mathbb{C}P^1$ of the form $dW([g]) = n P_{[g]}\, d\alpha([g])$, where $[g]$ stands for the class of $g \in G$ in $\Omega$, $\alpha$ is the $G$-invariant measure on $\Omega$ and $P_{[g]}$ is the rank one projector to $gw$. Note that $W$ is nothing else but the Berezin-Toeplitz POVM $W_p$ from Example 3.4 with $p = 2j$. We refer to [12, Chapter 7] for the representation theoretic approach to coherent states and quantization. By using the theory of Gelfand pairs (cf. [16, Chapter 3.F]) one can check that the eigenvalues of the Berezin transform are of the form $\lambda_\phi = (u, \chi_\phi)_{L^2}$, where $\phi$ runs over all irreducible unitary representations of $G$, $\chi_\phi$ stands for the character of $\phi$ and $u(g) = n |\langle \rho_j(g)w, w \rangle|^2$. The multiplicity of $\lambda_\phi$ equals $d_\phi^2$, where $d_\phi$ is the dimension of $\phi$. In order to calculate $\lambda_\phi$, recall that
$$\rho_j \otimes \rho_j = \bigoplus_{\ell=0}^{2j} \rho_\ell. \tag{102}$$
Writing $v$ for the vector of weight $-j$ of $\rho_j$, we have $u(g) = n \langle (\rho_j \otimes \rho_j)(g)\xi, \xi \rangle$, where $\xi = w \otimes v$.
In order to complete this calculation, one has to decompose $\xi$ in the sense of (102). This can be done with the help of explicit expressions for the Clebsch-Gordan coefficients, and it eventually yields the eigenvalues of the Berezin transform, including $\gamma_1 = j/(j+1)$ (cf. Example 3.4), in agreement with calculations by Zhang [48] and Donaldson [17, p. 613]. The details will appear in the MSc thesis by D. Shmoish.
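The $S_4$ example of this section can be checked with exact rational arithmetic. The sketch below rests on an assumption that is not restated in this passage: that the Berezin transform attached to $\rho$ acts on functions on $G$ by convolution with $|\chi_\rho|^2$, so that its eigenvalue on the $\phi$-isotypic component is $\lambda_\phi = \frac{1}{d_\phi |G|} \sum_{g} |\chi_\rho(g)|^2 \chi_\phi(g)$ (the characters of $S_4$ are real). The dictionary keys and function names are illustrative only.

```python
from fractions import Fraction

# Conjugacy classes of S_4 in the order: identity, transpositions,
# double transpositions, 3-cycles, 4-cycles.
CLASS_SIZES = [1, 6, 3, 8, 6]
GROUP_ORDER = sum(CLASS_SIZES)  # 24

# Character table of S_4 (all values are integers).
CHARACTERS = {
    "trivial": [1, 1, 1, 1, 1],
    "sign": [1, -1, 1, 1, -1],
    "two_dim": [2, 0, 2, -1, 0],
    "standard": [3, 1, -1, 0, -1],
    "standard_sign": [3, -1, -1, 0, 1],
}


def berezin_eigenvalue(chi_rho, chi_phi):
    """Assumed formula: lambda_phi = (|chi_rho|^2, chi_phi) / d_phi."""
    d_phi = chi_phi[0]
    total = sum(size * r * r * p
                for size, r, p in zip(CLASS_SIZES, chi_rho, chi_phi))
    return Fraction(total, GROUP_ORDER * d_phi)


def kernel_order(chi_phi):
    """Order of Ker(phi): elements g with chi_phi(g) = chi_phi(identity)."""
    return sum(size for size, value in zip(CLASS_SIZES, chi_phi)
               if value == chi_phi[0])


# rho is taken to be the standard 3-dimensional representation.
eigenvalues = {name: berezin_eigenvalue(CHARACTERS["standard"], chi)
               for name, chi in CHARACTERS.items()}

for name, value in eigenvalues.items():
    print(name, value, "kernel order:", kernel_order(CHARACTERS[name]))
```

Under this assumption the largest eigenvalue below 1 is $1/2$, attained on the 2-dimensional representation whose kernel has order 4, and both 3-dimensional representations give $1/3$, consistent with the two decay rates quoted in the example.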

Two concepts of quantum noise
In the present section we provide two different (and essentially tautological) interpretations of the spectral gap in the context of quantum noise. In quantum measurement theory, there are two concepts of quantum noise: the increment of variance for unbiased approximate measurements as formalized by the noise operator, see below, and a non-unitary evolution of a quantum system described by a quantum channel (a.k.a. a quantum operation; see, e.g., [41, Chapter 8]). Such a non-unitary evolution can be caused, for instance, by the quantum state reduction in the process of repeated quantum measurements. Interestingly enough, for pure POVMs, the spectral gap $\gamma(W)$ brings together these two seemingly remote concepts: it measures the minimal magnitude of noise production in the context of the noise operator, and it equals the spectral gap of the Markov chain modeling repeated quantum measurements.
Given an observable $A \in \mathcal{L}(H)$, write $A = \sum_i \lambda_i P_i$ for its spectral decomposition, where the $P_i$ are pairwise distinct orthogonal projectors. According to the statistical postulate of quantum mechanics, in a state $\rho$ the observable $A$ attains the value $\lambda_i$ with probability $((P_i, \rho))$. It follows that the expectation of $A$ in $\rho$ equals $E(A, \rho) = ((A, \rho))$ and the variance is given by $\mathrm{Var}(A, \rho) = ((A^2, \rho)) - E(A, \rho)^2$. In quantum measurement theory [7], a POVM $W$ represents a measuring device coupled with the system, while $\Omega$ is interpreted as the space of device readings. When the system is in a state $\rho \in \mathcal{S}(H)$, the probability of finding the device in a subset $X \in \mathcal{C}$ equals $\mu_\rho(X) := ((W(X), \rho))$. An experimentalist performs a measurement whose outcome, at every state $\rho$, is distributed in $\Omega$ according to the measure $\mu_\rho$. Given a function $\varphi \in L^2(\Omega, \alpha)$ (the experimentalist's choice), this procedure yields an unbiased approximate measurement of the quantum observable $A := T(\varphi)$. The expectation of $A$ in every state $\rho$ equals $((A, \rho))$ and thus coincides with that of the measurement procedure, given by $\int_\Omega \varphi \, d\mu_\rho$ (hence unbiased), in spite of the fact that the actual probability distributions determined by the observable $A$ (see above) and by the random variable $(\varphi, \mu_\rho)$ could be quite different (hence approximate). In particular, in general, the variance increases under an unbiased approximate measurement:
$$\mathrm{Var}(\varphi, \mu_\rho) = \mathrm{Var}(A, \rho) + ((\Delta_W(\varphi), \rho)),$$
where $\Delta_W(\varphi) := T(\varphi^2) - T(\varphi)^2$ is the noise operator. This operator, which is known to be positive, measures the increment of the variance. We wish to explore the relative magnitude of this increment for the "maximally mixed" state $\theta_0 = \frac{1}{n}\mathbb{1}$. To this end introduce the minimal noise of the POVM $W$ as
$$\mathcal{N}_{\min}(W) := \inf_{\varphi} \frac{((\Delta_W(\varphi), \theta_0))}{\mathrm{Var}(\varphi, \mu_{\theta_0})},$$
where the infimum is taken over all non-constant functions $\varphi \in L^2(\Omega, \alpha)$.

It turns out that the minimal noise coincides with the spectral gap:
$$\mathcal{N}_{\min}(W) = \gamma(W). \tag{104}$$
Indeed, since $\mathrm{tr}(T(\varphi^2)) = n(\varphi, \varphi)$, we readily get that $((\Delta_W(\varphi), \theta_0)) = ((\mathbb{1} - B)\varphi, \varphi)$, where $B = n^{-1} T^* T$ is the Markov operator given by (5), while $\mathrm{Var}(\varphi, \mu_{\theta_0}) = (\varphi, \varphi) - (\varphi, 1)^2$. Formula (104) follows from the variational principle.
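The identity between minimal noise and spectral gap can be explored numerically on a small example. The sketch below is only an illustration under stated assumptions: it takes a hypothetical pure POVM on a real two-dimensional space, formed by four rank one projectors at angles $0, \pi/4, \pi/2, 3\pi/4$ with equal weights, for which the Markov operator $B$ has eigenvalues $1, 1/2, 1/2, 0$ and hence gap $1/2$. The function names are not taken from the text.

```python
import math
import random

N_DIM = 2                                      # dimension n of the Hilbert space
ANGLES = [k * math.pi / 4 for k in range(4)]   # four lines in the real plane
ALPHA = [0.25] * 4                             # probability measure alpha on Omega


def projector(theta):
    """Rank one projector onto the line spanned by (cos theta, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]


PROJECTORS = [projector(theta) for theta in ANGLES]


def toeplitz(phi):
    """T(phi) = sum_i phi_i * n * alpha_i * P_i."""
    t = [[0.0, 0.0], [0.0, 0.0]]
    for value, weight, p in zip(phi, ALPHA, PROJECTORS):
        for r in range(2):
            for c in range(2):
                t[r][c] += value * N_DIM * weight * p[r][c]
    return t


def trace_of_square(a):
    """tr(a a) for a 2x2 matrix."""
    return sum(a[r][c] * a[c][r] for r in range(2) for c in range(2))


def noise_ratio(phi):
    """((Delta_W(phi), theta_0)) / Var(phi, mu_theta0), theta_0 = identity / n."""
    t_phi = toeplitz(phi)
    t_phi_sq = toeplitz([v * v for v in phi])
    noise = (t_phi_sq[0][0] + t_phi_sq[1][1] - trace_of_square(t_phi)) / N_DIM
    mean = sum(w * v for w, v in zip(ALPHA, phi))
    variance = sum(w * v * v for w, v in zip(ALPHA, phi)) - mean ** 2
    return noise / variance


# Sample random non-constant functions and record the smallest quotient seen.
random.seed(0)
samples = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(1000)]
smallest_ratio = min(noise_ratio(phi) for phi in samples)

print(noise_ratio([1, 0, -1, 0]))    # an eigenfunction of B with eigenvalue 1/2
print(noise_ratio([1, -1, 1, -1]))   # an eigenfunction of B with eigenvalue 0
print(smallest_ratio)
```

In this example the quotient stays between $1/2$ and $1$, and the value $1/2$ is attained on the eigenfunctions of $B$ with eigenvalue $1/2$, as the variational principle predicts.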
Suppose now that $\Omega \subset \mathcal{S}(H)$ is a finite set consisting of rank one projectors $\{P_1, \dots, P_N\}$ and that $W$ is a pure POVM of the form $W(P_i) := n\alpha_i P_i$, where $\alpha$ is a probability measure on $\Omega$. Given a system in the original state $\rho$, the result of the measurement equals $P_j$ with probability $p = n\alpha_j((P_j, \rho))$. Recall the quantum state reduction (a.k.a. the wave function collapse) axiom for the so-called Lüders repeated quantum measurements: if the result of the measurement equals $P_j$, the system moves from the original state $\rho$ to the new (reduced) state
$$\frac{P_j \rho P_j}{((P_j, \rho))} = P_j.$$
It follows that if the original state $\rho$ is chosen from $\Omega$, the repeated quantum measurements are described by the Markov chain with transition probabilities $n\alpha_j((P_i, P_j))$. The corresponding Markov operator equals $B$, and the spectral gap of the Markov chain coincides with the spectral gap $\gamma(W)$ of the POVM $W$. Furthermore, given an original state $\rho \in \Omega$, the expected value of the reduced state equals $\mathcal{E}(\rho)$. It follows that if $\gamma(W) > 0$, the states $\mathcal{E}^k(\rho)$ converge, as $k \to \infty$, to the maximally mixed quantum state $\frac{1}{n}\mathbb{1}$ at the exponential rate $\sim (1 - \gamma(W))^k$. In other words, for pure POVMs the spectral gap controls the convergence rate to the maximally mixed state under repeated quantum measurements.
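The convergence statement can be illustrated on a toy example. The sketch below assumes a hypothetical pure POVM on a real two-dimensional space, built from three rank one projectors at angles $0, 2\pi/3, 4\pi/3$ with equal weights, and takes the expected reduced state to be $\sum_j n\alpha_j((P_j, \rho)) P_j$, as suggested by the measurement probabilities above. For this POVM the non-trivial eigenvalue of $B$ is $1/2$, so the gap is $1/2$. All function names are illustrative.

```python
import math

N_DIM = 2                                        # dimension n of the Hilbert space
ANGLES = [0.0, 2 * math.pi / 3, 4 * math.pi / 3]
ALPHA = [1 / 3] * 3                              # probability measure alpha on Omega


def projector(theta):
    """Rank one projector onto the line spanned by (cos theta, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]


PROJECTORS = [projector(theta) for theta in ANGLES]


def pairing(a, b):
    """((a, b)) = tr(a b) for 2x2 matrices."""
    return sum(a[r][c] * b[c][r] for r in range(2) for c in range(2))


def expected_reduced_state(rho):
    """Assumed form of the channel: sum_j n * alpha_j * ((P_j, rho)) * P_j."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    for weight, p in zip(ALPHA, PROJECTORS):
        coeff = N_DIM * weight * pairing(p, rho)
        for r in range(2):
            for c in range(2):
                out[r][c] += coeff * p[r][c]
    return out


def distance_to_mixed(rho):
    """Hilbert-Schmidt distance from rho to the maximally mixed state."""
    diff = [[rho[r][c] - (1 / N_DIM if r == c else 0.0) for c in range(2)]
            for r in range(2)]
    return math.sqrt(pairing(diff, diff))


# Transition probabilities n * alpha_j * ((P_i, P_j)) of the Markov chain.
TRANSITION = [[N_DIM * ALPHA[j] * pairing(PROJECTORS[i], PROJECTORS[j])
               for j in range(3)] for i in range(3)]

# Iterate the channel starting from P_1 and track the distance to identity / n.
state = PROJECTORS[0]
distances = [distance_to_mixed(state)]
for _ in range(6):
    state = expected_reduced_state(state)
    distances.append(distance_to_mixed(state))
ratios = [later / earlier for earlier, later in zip(distances, distances[1:])]

print(ratios)
```

In this example each application of the channel contracts the distance to $\frac{1}{n}\mathbb{1}$ by the factor $1 - \gamma(W) = 1/2$, and each row of `TRANSITION` sums to one, as it must for a Markov chain.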