CBMS-NSF REGIONAL CONFERENCE SERIES IN APPLIED MATHEMATICS A series of lectures on topics of current research interest ...

Author:
S. R. S. Varadhan


CBMS-NSF REGIONAL CONFERENCE SERIES IN APPLIED MATHEMATICS

A series of lectures on topics of current research interest in applied mathematics under the direction of the Conference Board of the Mathematical Sciences, supported by the National Science Foundation and published by SIAM.

GARRETT BIRKHOFF, The Numerical Solution of Elliptic Equations
D. V. LINDLEY, Bayesian Statistics, A Review
R. S. VARGA, Functional Analysis and Approximation Theory in Numerical Analysis
R. R. BAHADUR, Some Limit Theorems in Statistics
PATRICK BILLINGSLEY, Weak Convergence of Measures: Applications in Probability
J. L. LIONS, Some Aspects of the Optimal Control of Distributed Parameter Systems
ROGER PENROSE, Techniques of Differential Topology in Relativity
HERMAN CHERNOFF, Sequential Analysis and Optimal Design
J. DURBIN, Distribution Theory for Tests Based on the Sample Distribution Function
SOL I. RUBINOW, Mathematical Problems in the Biological Sciences
P. D. LAX, Hyperbolic Systems of Conservation Laws and the Mathematical Theory of Shock Waves
I. J. SCHOENBERG, Cardinal Spline Interpolation
IVAN SINGER, The Theory of Best Approximation and Functional Analysis
WERNER C. RHEINBOLDT, Methods of Solving Systems of Nonlinear Equations
HANS F. WEINBERGER, Variational Methods for Eigenvalue Approximation
R. TYRRELL ROCKAFELLAR, Conjugate Duality and Optimization
SIR JAMES LIGHTHILL, Mathematical Biofluiddynamics
GERARD SALTON, Theory of Indexing
CATHLEEN S. MORAWETZ, Notes on Time Decay and Scattering for Some Hyperbolic Problems
F. HOPPENSTEADT, Mathematical Theories of Populations: Demographics, Genetics and Epidemics
RICHARD ASKEY, Orthogonal Polynomials and Special Functions
L. E. PAYNE, Improperly Posed Problems in Partial Differential Equations
S. ROSEN, Lectures on the Measurement and Evaluation of the Performance of Computing Systems
HERBERT B. KELLER, Numerical Solution of Two Point Boundary Value Problems
J. P. LASALLE, The Stability of Dynamical Systems - Z. ARTSTEIN, Appendix A: Limiting Equations and Stability of Nonautonomous Ordinary Differential Equations
D. GOTTLIEB AND S. A. ORSZAG, Numerical Analysis of Spectral Methods: Theory and Applications
PETER J. HUBER, Robust Statistical Procedures
HERBERT SOLOMON, Geometric Probability
FRED S. ROBERTS, Graph Theory and Its Applications to Problems of Society

(continued on inside back cover)

S. R. S. VARADHAN

Courant Institute of Mathematical Sciences New York University

Large Deviations and Applications

SOCIETY FOR INDUSTRIAL AND APPLIED MATHEMATICS PHILADELPHIA, PENNSYLVANIA

1984

Copyright 1984 by the Society for Industrial and Applied Mathematics. All rights reserved. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the Publisher. For information, write the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, Pennsylvania 19104-2688. Library of Congress Catalog Card Number: 83-51046. Printed by Capital City Press, Montpelier, Vermont, U.S.A. Second printing 1994.

SIAM is a registered trademark.

Contents

Preface
Section 1 INTRODUCTION
Section 2 LARGE DEVIATIONS
Section 3 CRAMER'S THEOREM
Section 4 MULTIDIMENSIONAL VERSION OF CRAMER'S THEOREM
Section 5 AN INFINITE DIMENSIONAL EXAMPLE: BROWNIAN MOTION
Section 6 THE VENTCEL-FREIDLIN THEORY
Section 7 THE EXIT PROBLEM
Section 8 EMPIRICAL DISTRIBUTIONS
Section 9 THE LARGE DEVIATION PROBLEM FOR EMPIRICAL DISTRIBUTIONS OF MARKOV PROCESSES
Section 10 SOME PROPERTIES OF ENTROPY
Section 11 UPPER BOUNDS
Section 12 LOWER BOUNDS
Section 13 CONTRACTION PRINCIPLE
Section 14 APPLICATION TO THE PROBLEM OF THE WIENER SAUSAGE
Section 15 THE POLARON PROBLEM
Section 16 BIBLIOGRAPHICAL REMARKS
References

Preface These notes are based on lectures given at the University of Southern Illinois at Carbondale during June 1982. The author wishes to thank the National Science Foundation and the Conference Board of Mathematical Sciences for their generous support which made the conference possible. The author is grateful to the Mathematics Department of the University of Southern Illinois for hosting the conference, and for the hard work of its faculty and staff in running the conference successfully.


SECTION 1

Introduction

There are many instances where solutions to problems are expressed as integrals over function spaces. The simplest nontrivial example is the Feynman-Kac formula which expresses the solution of the equation

$$\frac{\partial u}{\partial t} = \frac{1}{2}\Delta u + V(x)\,u, \qquad u(0, x) = f(x) \tag{1.1}$$

as the function space integral

$$u(t, x) = E^x\left\{ f(x(t)) \exp\left[\int_0^t V(x(s))\,ds\right] \right\} \tag{1.2}$$

where $E^x$ refers to the expectation with respect to Brownian motion on $R^d$ starting from the point $x$ in $R^d$ at time $t = 0$. Another example is the solution to Dirichlet's problem

where $G$ [...] $\lambda_2 \geq \lambda_3 \geq \cdots$ are the eigenvalues tending to $-\infty$ and $\psi_j$ are the corresponding eigenfunctions. $\psi_1(x) > 0$ and therefore $(\psi_1, 1) > 0$. The dominant term is $e^{\lambda_1 t}$ and therefore $\lambda = \lambda_1$. The question before us, then, is to see how the analysis of an integral of the form (1.4) leads directly to a formula of the form (1.6). In the second example we could replace $L$ by $L_\varepsilon$ where for $\varepsilon > 0$

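We let $u_\varepsilon$ be the solution of the corresponding Dirichlet problem with $L$ replaced by $L_\varepsilon$, and study the behavior of $u_\varepsilon$ as $\varepsilon \to 0$. This is a singular perturbation problem, and its analysis can sometimes be particularly difficult. We will return to this problem later.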
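The claim that the exponential growth rate of the Feynman-Kac integral is the largest eigenvalue can be checked numerically. The sketch below is only an illustration under stated assumptions: one dimension, the potential $V(x) = -x^2/2$ (chosen here because the operator $\frac{1}{2}\,d^2/dx^2 + V$ then has largest eigenvalue $-\frac{1}{2}$), $f \equiv 1$, and a finite-difference truncation to a bounded interval. The function name `feynman_kac_growth` and all parameters are hypothetical and do not come from the text.

```python
import numpy as np

def feynman_kac_growth(half_width=8.0, n=401, times=(5.0, 20.0, 80.0)):
    """Compare (1/t) log u(t, 0) with the top eigenvalue of (1/2) d^2/dx^2 + V."""
    x = np.linspace(-half_width, half_width, n)
    h = x[1] - x[0]
    V = -0.5 * x**2  # assumed potential, not taken from the text

    # Symmetric finite-difference matrix for (1/2) u'' + V u with u = 0 at both ends.
    main = -1.0 / h**2 + V
    off = np.full(n - 1, 0.5 / h**2)
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

    lam, psi = np.linalg.eigh(A)  # eigenvalues in ascending order
    lam1 = lam[-1]                # largest eigenvalue

    # Eigenfunction expansion u(t) = sum_j exp(lam_j t) psi_j (psi_j, 1),
    # evaluated at the grid midpoint x = 0 with exp(lam1 t) factored out.
    coeff = psi.T @ np.ones(n)
    mid = n // 2
    rates = []
    for t in times:
        scaled = np.sum(np.exp((lam - lam1) * t) * psi[mid] * coeff)
        rates.append(lam1 + np.log(scaled) / t)
    return lam1, rates

lam1, rates = feynman_kac_growth()
print("largest eigenvalue:", lam1)
print("(1/t) log u(t, 0) for increasing t:", rates)
```

As $t$ grows, the printed rates should approach the largest eigenvalue, which is the statement $\lambda = \lambda_1$ in discretized form.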

SECTION 2

Large Deviations

In this section we will give an abstract formulation for a class of large deviation problems. In some of the later sections we will look at specific examples that fit the format developed in this section. Let $X$ be a complete separable metric space, and $P_\varepsilon$ a family of probability measures on the Borel subsets of $X$. Typically as $\varepsilon \to 0$, $P_\varepsilon$ will converge weakly to the probability measure which is degenerate, i.e., has unit mass, at some point $x_0$ in $X$. For most sets $A$, then, $P_\varepsilon(A) \to 0$ as $\varepsilon \to 0$. In the examples we look at, $P_\varepsilon(A)$ will tend exponentially rapidly to zero as $\varepsilon \to 0$ with an exponential constant depending on the set $A$ and the relevant situation. We abstract the situation in the following context.

DEFINITION 2.1. We say that $\{P_\varepsilon\}$ obeys the large deviation principle with a rate function $I(\cdot)$ if there exists a function $I(\cdot)$ from $X$ into $[0, \infty]$ [...]

[...] Since $\delta > 0$ is arbitrary we are done.

Lower bound. Given $\delta > 0$ there exists a point $y \in X$ such that $F(y) - I(y) \geq \sup_x [F(x) - I(x)] - \delta/2$. We can find a neighborhood $U$ of $y$ such that $F(x) \geq F(y) - \delta/2$ for $x \in U$. We would then have

Since $\delta > 0$ is arbitrary we are done. Sometimes we need a slight variation of the above theorem, which we state and prove as another theorem.

THEOREM 2.3. Let $P_\varepsilon$ satisfy the large deviation principle with a rate function $I(\cdot)$. Let $F_\varepsilon(x)$ be a family of nonnegative functions such that for some lower semicontinuous nonnegative function $F(x)$ one has

Then

Proof. Let $l = \inf_x [F(x) + I(x)]$. For any $\delta > 0$ and $x \in X$ there is a neighborhood $U_\delta$ of $x$ such that


Therefore as $\varepsilon \to 0$

By the usual compactness argument, any compact set $K$ can be included in some open set $U$ such that $U$ is a finite union of such $U_\delta$'s. Therefore as $\varepsilon \to 0$ one has

On the other hand, since $F_\varepsilon \geq 0$, we have

By choosing $K = \{x : I(x) \leq k\}$ for a $k$ much larger than $l$, the contribution from the complement of $K$ can be made negligible compared to $\exp[-(l - 2\delta)/\varepsilon]$. Since $\delta > 0$ is arbitrary, the proof is complete.

Remark 1. Let $P_\varepsilon$ be a family of probability measures on a Polish space $X$ satisfying the large deviation principle with a rate function $I(\cdot)$. Let $\pi$ be a continuous mapping from $X$ into $Y$. Then $Q_\varepsilon = P_\varepsilon \pi^{-1}$ also satisfies the large deviation principle with a rate function $J(y)$ defined by

$$J(y) = \inf_{x : \pi(x) = y} I(x).$$

(If $y$ is not in the range of $\pi$, then $J(y) = \infty$.) We will refer to this as the "contraction principle".

Remark 2. Instead of $\varepsilon \to 0$ we can have a parameter $\lambda \to \infty$ and think of $\varepsilon$ as $1/\lambda$. The exponential rate can be proportional to $\varepsilon^{-2}$ instead of $\varepsilon^{-1}$. We would then have to reparametrize the family to make it come out in the standard form. We can also have a discrete sequence $P_n$ with an exponential rate proportional to $n$.

There is a slight technical modification of Remark 1, which we will need later. We state and prove it as a theorem.

THEOREM 2.4. Let $P_\varepsilon$ satisfy the large deviation principle with a rate function $I(\cdot)$. Let $F_\varepsilon$ be continuous maps from $X \to Y$ where $Y$ is another complete separable metric space. Assume that $\lim_{\varepsilon \to 0} F_\varepsilon = F$ exists uniformly over compact subsets of $X$. Then if we define $Q_\varepsilon$ on $Y$ by $Q_\varepsilon = P_\varepsilon F_\varepsilon^{-1}$, then $Q_\varepsilon$ satisfies the large deviation principle with a rate function $J(\cdot)$ defined by

$$J(y) = \inf_{x : F(x) = y} I(x).$$

Proof. Upper bound. Let A a. Moreover, for x>a and 00 for x>a, and by a similar argument only over 00

Therefore


Since $\theta > 0$ is arbitrary, replacing $\theta$ by $n\theta$ gives

A similar inequality is valid for intervals of the form $J_y = (-\infty, y]$ [...] $\delta > 0$ be fixed. Allow $n \to \infty$. Then

For any closed set C, one verifies easily that

We now let $\delta \to 0$ and we obtain the upper bound.

Applications. One can use this theorem to obtain tail estimates for Brownian motion. For example, one gets

by choosing $C = \{x : \int_0^1 |x(t)|^p\,dt \geq 1\}$. For $p = 2$ this reduces to an eigenvalue problem. Another application is to derive Strassen's form of the law of the iterated logarithm for Brownian motion. Define

for $0 \leq t \leq 1$, where $\beta(t)$ is Brownian motion on [0, [...]

SECTION 6

The Ventcel-Freidlin Theory

[...] $x$ from $C[[0, T]; R^d] \to C[[0, T]; R^d]$ which is in fact continuous for $\delta > 0$. We denote by $P_{\delta,\varepsilon,x}$ the distribution induced by $F_{\delta,x}$ from the scaled Wiener measure $Q_\varepsilon$. In other words, $P_{\delta,\varepsilon,x} = Q_\varepsilon F_{\delta,x}^{-1}$. We can combine the large deviation results for $Q_\varepsilon$ proved in Section 5 with Remark 1 of Section 2 to obtain a large deviation principle for $P_{\delta,\varepsilon,x}$ as $\varepsilon \to 0$ for each fixed $\delta$ and $x$. The rate function takes the form

if $f'(t)$ is square integrable and $f(0) = x$. $I_{\delta,x}(f)$ is infinite otherwise. We shall now state these facts in the form of a theorem.

THEOREM 6.1. Let $C$ be a closed set in $C[0, T]$ and $G$ an open set. Then

Proof. Since [...] the theorem follows from Theorem 2.4 by observing that the map $F_{\delta,x}(f)$ is jointly continuous in the initial point $x$ and the function $f$ as a map of $C[0, T] \times R^d \to C[0, T]$.

We want to let $\delta \to 0$ in the statement of Theorem 6.1. To do that we need a basic estimate on the difference between $x_{\varepsilon,\delta}(t)$ and $x_\varepsilon(t)$. This is stated as Lemma 6.2. The starting point $x$ should enter into the definition of $x_{\varepsilon,\delta}(t)$ and $x_\varepsilon(t)$. But our estimates will be uniform in $x$, and so we will not explicitly write out the dependence on the starting point.

LEMMA 6.2. For any $\eta > 0$ and $T$ [...] $> 0$ is an arbitrary number. We want to estimate [...] It is therefore sufficient to show that

During the time interval $0 \leq s \leq \tau_3$, we have [...] Consider the function [...] where $\theta$ and $l$ will be chosen later on. We apply Itô's formula to [...]

We note

Taking 0 = p2, we obtain for some constant C


We have used here the inequalities (6.5). In other words, if $A = Cl(1 + \varepsilon l)$, then [...] is a supermartingale. Replacing $l$ by $l/\varepsilon$ we get $A = Cl(l + 1)/\varepsilon$. From the supermartingale property we have

On the set where $\tau_3$ [...] $\to 0$ to obtain (6.4). We note as mentioned earlier that all of our estimates are uniform in the starting point $x$. This proves the lemma.

This lemma allows us to pass to the limit as $\delta \to 0$ in Theorem 6.1. We carry this out in the next theorem.

THEOREM 6.3. For any closed set $C$ and any open set $G$ in $C[0, T]$,

Here Ix(f) is given by

if $f$ is absolutely continuous with a square integrable derivative $f'$ and satisfies $f(0) = x$. Otherwise $I_x(f) = \infty$.

Proof. Lower bound. Let $f \in G$ and $I_x(f) = l$ [...] $> 0$. Taking logarithms and applying Theorem 6.1, we get

Letting $\delta \to 0$, and then $\eta \to 0$, since the left-hand side is independent of $\delta$ and $\eta$ and $C_{\delta,\eta} \to \infty$ as $\delta \to 0$ for every $\eta > 0$, we have

From the definition of $I_{\delta,x}(f)$, it is easy to see that for every closed set $D$

and

The proof of the theorem is now complete.
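To make the rate function of Theorem 6.3 concrete, the following Python sketch evaluates a discretized action functional along a sampled path. It assumes the standard Ventcel-Freidlin form $I_x(f) = \frac{1}{2}\int_0^T |f'(t) - b(f(t))|^2\,dt$ for a diffusion with drift $b$ and unit diffusion coefficient. The drift $b(x) = -x$, the helper name `action`, and the two sample paths are hypothetical choices made only for illustration.

```python
import numpy as np

def action(path, drift, T):
    """Riemann-sum approximation of 0.5 * int_0^T |f'(t) - b(f(t))|^2 dt.

    `path` holds the values of f on a uniform grid of [0, T].
    """
    path = np.asarray(path, dtype=float)
    n = len(path) - 1
    dt = T / n
    velocity = np.diff(path) / dt              # finite-difference derivative f'
    midpoint = 0.5 * (path[:-1] + path[1:])    # f evaluated at interval midpoints
    return 0.5 * np.sum((velocity - drift(midpoint)) ** 2) * dt

def drift(x):
    # Hypothetical drift b(x) = -x, with a stable equilibrium at 0.
    return -x

T, n = 1.0, 2000
t = np.linspace(0.0, T, n + 1)

flow = np.exp(-t)   # solves f' = b(f) with f(0) = 1, so its action should vanish
line = 1.0 + t      # moves against the drift from 1 to 2 and pays a positive cost

print("action of the deterministic flow:", action(flow, drift, T))
print("action of the straight line:     ", action(line, drift, T))
```

The deterministic trajectory has (numerically) zero action, while the straight line against the drift has action close to $\frac{1}{2}\int_0^1 (2 + t)^2\,dt = 19/6$. Heuristically, the large deviation principle says that the diffusion stays near a path $f$ with probability of order $\exp[-I_x(f)/\varepsilon]$.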


SECTION 7

The Exit Problem

For $\varepsilon > 0$, let $L_\varepsilon$ be the operator

Consider the solution $u_\varepsilon(x)$ of the Dirichlet problem where $G$ is a bounded region with smooth boundary $\partial G$. Consider the ordinary differential equation We assume that there is a globally stable equilibrium point $x_0$ in $G$ such that for every $x \in G$ the solution $x(t)$ of (7.3) lies in $G$ for $t > 0$ and $x(t) \to x_0$ as $t \to \infty$. The boundary data $f$ in (7.2) is assumed to be continuous on $\partial G$. As $\varepsilon \to 0$ the trajectories of the diffusion process corresponding to $L_\varepsilon$ are close to the deterministic trajectory (7.3) with a very high probability. In the limit the deterministic trajectory does not exit at all from the set $G$, so that the exit time and exit place are not defined. We need a new formulation to calculate the limit of the hitting distribution on $\partial G$ as $\varepsilon \to 0$. For each $0 < T < \infty$ we define

and

The following lemmas are elementary and are proved directly from the definitions. LEMMA 7.1. There exists a constant C such that

LEMMA 7.2. [...] in the sense of weak convergence almost surely with respect to $P_0$. If we denote by $Q_n$ the distribution of $R_{n,\omega}$ [...] $R_{t,\omega}$ is a mapping of $\Omega \to M_s(\Omega)$, and as such is [...] measurable. For each $t > 0$ and each $x \in X$, we use this mapping to induce a probability measure $\Gamma_{t,x}$ on $M_s(\Omega)$ by defining $\Gamma_{t,x} = P_{0,x} R_{t,\omega}^{-1}$, i.e., if $B$ [...] We will then show that the large deviation principle holds for $\Gamma_{t,x}$ with a rate function given by $H(Q)$. Let us consider the occupation distribution: for $\omega \in \Omega$, $t > 0$ and $A$ [...] occupies the set $A$ [...] $L_{t,\omega}(\cdot)$ is then a probability measure on $X$. We denote the space of probability measures on $X$ by $M(X)$. Now, the relation between $L_{t,\omega}$ in $M(X)$ and $R_{t,\omega}$ in $M_s(\Omega)$ is made clear by

Furthermore, in [4] the large deviation result for $L_{t,\omega}$ was governed by a certain $I$-function (for the Markov process $P_{0,x}$) which was defined on $M(X)$. To see the relation between that $I$-function and the entropy function $H(Q)$ that we will use, we show in §13 that

where, in (9.4), the notation $q(Q) = \mu$ means that the marginal of the stationary measure $Q$ is $\mu$. We refer to (9.4) as the contraction principle. Now, we wish to make precise the hypotheses imposed in order to prove the main results. For the upper estimate (Theorem 11.6), we first comment that if the space $X$ happens to be compact, then no further hypotheses on the Markov process $P_{0,x}$ need be imposed in order to obtain the large deviation principle, but if $X$ is not compact, we need to impose the following hypothesis on $P_{0,x}$: There exists a sequence $\{u_n(x)\}$ of functions in $\mathcal{D}(L)$ (the domain of the infinitesimal generator $L$ of the Markov process $P_{0,x}$) with the five properties: (1) $u_n(x) \geq c > 0$ for all $x$ and $n$. (2) There exists, for every compact set $K$ [...] $> 0$ for almost all $y$ ($\alpha$ measure) such that for all $x \in X$: I. $p(1, x, dy) = p(1, x, y)\,\alpha(dy)$. II. $p(1, x, \cdot)$ as a mapping from $X \to L_1(\alpha)$ is continuous. To prove the large deviation principle we will assume (1) through (5) as well as I and II.
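The occupation distribution described in this section can be illustrated with a small simulation. The sketch below is a hedged discrete-time analogue: it assumes a hypothetical two-state Markov chain with transition matrix `P` (not from the text), replaces the time integral by a step count, and compares the empirical occupation distribution with the stationary distribution around which the large deviation estimates are centered. The function names are illustrative only.

```python
import numpy as np

def simulate_chain(P, x0, n_steps, rng):
    """Sample a path of a finite-state Markov chain with transition matrix P."""
    path = np.empty(n_steps, dtype=int)
    x = x0
    for k in range(n_steps):
        path[k] = x
        x = rng.choice(len(P), p=P[x])
    return path

def occupation_distribution(path, n_states):
    """Fraction of time the path spends in each state (discrete-time L_t)."""
    counts = np.bincount(path, minlength=n_states)
    return counts / counts.sum()

# Hypothetical two-state chain; its stationary distribution is (0.75, 0.25).
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])
stationary = np.array([0.75, 0.25])

rng = np.random.default_rng(0)
path = simulate_chain(P, x0=0, n_steps=20000, rng=rng)
empirical = occupation_distribution(path, n_states=2)

print("occupation distribution:", empirical)
print("stationary distribution:", stationary)
```

For long paths the occupation distribution is typically close to the stationary one; the large deviation principle quantifies how unlikely it is to land near any other probability measure.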

SECTION 10

Some Properties of Entropy

Let $(X, \Sigma)$ be a measurable space and let $\lambda$ and $\mu$ be probability measures on $(X, \Sigma)$. Let $B(\Sigma)$ be the space of bounded measurable functions on $(X, \Sigma)$. We define the entropy of $\mu$ with respect to $\lambda$ by

If X is a Polish space and 2 is the Borel o--field, then replacing 38(2) by C(X) in (10.1) gives the same infimum (see [4]). From the definition (10.1) we see that for fixed A, Ji(A; j^) is a nonnegative, convex function of JUL and 0^h(A; JLL)^a,(o) with respect to the a-field ^° and where a> is the variable, i.e., EQ° be a trajectory in ft, t E (-«>, °°), and let P be a measure on &l> with s ^ t. Suppose P{a>: cu(t) = : (0)}: 1, and (P0,a,(o))tl,). Then, from (10.8) we have, for every

which implies from (10.1) that


Using Lemma 10.3, we then get But, by the definition of P^,, Q = Q on ^0°°> and therefore ^--(Q; Q) = 0. Moreover, CL = P^ = S^oPo.^o), and Q0>{0 = 8^,00 Qo.o,, so that Hence, from (10.11) we obtain which completes the proof. THEOREM 10.6. Under the hypothesis that the mapping x —> Ptx is weaklys continuous, H(Q) is lower semicontinuous and convex in Q. Proof. From Theorems 10.4 and 10.5,

The supremum on the right of (10.12) is not changed if we restrict e 98(^7°°) Pi C(ft). Thus, to show the lower semicontinuity of H(A), it suffices to show that for each P^. Assume then that o> n —»co and that o> has no jump at the origin, i.e., (0) and hence P^ ^ PO>- It remains to observe that, for all stationary processes Q, Q{o>: to has a jump at 0} = 0. The convexity of H(Q) follows from (10,12) when we note that H(Q) is the supremum of a collection of linear functionals of Q. Next (Theorem 10.8 below) we prove the surprising fact that H(Q) is linear in Q, but first we need a lemma. LEMMA 10.7. Let M& be the space of probability measures on ft. There exists a conditional probability K^, :£l-+Mn such that Roo>(o) and for all t>Q,

Now define

Clearly, THEOREM 10.9.

Proof. Define P measure on ^° by P = J P0,a>(o)Q(dtu). By Jensen's inequality and (10.1),

Hence, to show (10.15), it suffices to show that

From Lemma 10.3, where P£W and Q^ are r.c.p.d. of P and Q respectively, given ^?. But,


using the homogeneity of the Markov family and the stationarity of the Q process. By the martingale convergence theorem, QQ,^ => Q0, oo almost everywhere with respect to Q measure. As noted earlier, H(A;ia) is lower semicontinuous in LL, and hence

Thus, by Fatou's lemma, (10.17), (10.18) and (10.19),

Using the usual Cesaro argument, we obtain (10.16) and (10.20). This completes the proof of Theorem 10.9. D Let Cs(fl) be the space of functions <E> on H which are bounded and measurable on H and such that for all QeMs(fl), For Oe Cs(O), the linear functional of Q, J <J>Q(do>), is then continuous in Q. LEMMA 10.10. Let YX = {: (0)) is continuous as a function of co on the set where a) is continuous at 0. This is a set of w of Q measure 1 for every Q&MS(£L), i.e., j//(co(0))eCs(n). Moreover, Thus, if we let 3> = tKe»(3F?)nQ(n) and, as we just noted, EMc*} = 1. Thus 3>, as defined, is in Y2, and so EQ{$}^L Henc

But

Thus, (10.22) implies H(t, Q) ^ I, and the proof is complete. D
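The variational definition of entropy used throughout this section can be checked on a finite space. The sketch below assumes that (10.1) is equivalent to the usual variational characterization $h(\lambda; \mu) = \sup_\Phi \left[\int \Phi\,d\mu - \log \int e^\Phi\,d\lambda\right]$, which on a finite space equals $\sum_x \mu(x) \log(\mu(x)/\lambda(x))$. The two measures and the helper names are hypothetical.

```python
import numpy as np

def relative_entropy(mu, lam):
    """Entropy of mu with respect to lam on a finite space."""
    mask = mu > 0
    return float(np.sum(mu[mask] * np.log(mu[mask] / lam[mask])))

def variational_functional(phi, mu, lam):
    """The quantity  int phi dmu - log int exp(phi) dlam  for a test function phi."""
    return float(np.dot(mu, phi) - np.log(np.dot(lam, np.exp(phi))))

# Hypothetical measures on a three-point space.
lam = np.array([0.5, 0.3, 0.2])
mu = np.array([0.2, 0.3, 0.5])

h = relative_entropy(mu, lam)
phi_star = np.log(mu / lam)   # the test function that attains the supremum

rng = np.random.default_rng(1)
trial_values = [variational_functional(rng.normal(size=3), mu, lam)
                for _ in range(1000)]

print("entropy:                  ", h)
print("functional at log(dmu/dl):", variational_functional(phi_star, mu, lam))
print("best random test function:", max(trial_values))
```

No test function exceeds the entropy, and the supremum is attained at $\Phi = \log(d\mu/d\lambda)$. The convexity and lower semicontinuity arguments in this section rest on the same structure: a supremum of functionals that are linear in $\mu$.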

SECTION 11

Upper Bounds

In this section we obtain upper bounds on $\Gamma_{t,x}(A)$ as $t \to \infty$ for appropriate sets $A$. The main results are Lemma 11.4 and Theorem 11.6. We need some preparatory lemmas.

LEMMA 11.1. Let [...] so that we can rewrite the left side of (11.1) as

From Jensen's inequality,

Since (ds+kTo)) is ^toc+iyr measurable, we have from the hypothesis on and successive conditioning that

Inequalities (11.3) and (11.2) give us (11.1). COROLLARY. Let eSS(3^) and such that EM**0"0}^! for all x. Then, for all t,

Proof. By definition of Ft x measure,

and since


we have from Lemma 11.1 for all t,

In what follows we shall use the notation

forA^M (O). LEMMA 11.2. Let E? be the set of 3>e33(^T)nCs(n) such that for all x, EMe*(0, there exist an I and open sets G±, G2,..., Gt in Jis(fl) such that A c Uj=1 G, and

In particular, for any compact set A in Ms(fl),

Proof. Let infQeA H(Q) = TJ. From Theorem 10.6 and Lemma 10.7 it follows that, given QeA and e>0, there exist TQ and 4>QeEr such that


Since $\Phi_Q \in C_s(\Omega)$, the integral on the left of (11.10) is a continuous functional of $Q$ which implies that there exists a neighborhood $G_Q$ of $Q$ in $M_s(\Omega)$ such that

for all $Q$ in $G_Q$. The neighborhoods $\{G_Q\}$ form an open covering of the compact set $A$, and therefore there exist $G_{Q_1}, G_{Q_2}, \ldots, G_{Q_l}$ such that $A \subset \bigcup_{j=1}^{l} G_{Q_j}$. This and (11.11) yield (11.8) and hence (11.9).

LEMMA 11.4. Let $A$ be closed in $M_s(\Omega)$ and such that the family of one-dimensional marginals of $Q$ as $Q$ varies over $A$ forms a tight family of measures on $X$. Then

Proof. Let M(X) be the space of probability measures on X. Let AM cr M(X] be the family of one-dimensional marginals of Q as Q varies over A. Since by hypothesis AM is tight, given en —» 0 there exist compact sets Kn 0, v = \n2, en = 1/n and Tjn = exp [ Then, Thus, from (11.6) for each n

This implies

If we let A, ={Q: Q(C n )^l/f + 2/n for all n}, then we have just shown that

We should note that the set At depends on A since the sets Q, depend on the choice of t)n, and Tjn was selected in terms of A. From (11.17) we get

Let Aoo = n t>0 A = {O j#s(ft): Q(Cn) ^ 2/n, for all n}. The restriction of Q to &\ constitutes a tight family of measures on Dx[0,1]. But, for stationary processes, tightness when restricted to some interval implies overall tightness, and hence Ax is a compact set in Ms(£i). Since A is closed, A O A«, is compact in ^s(ft). Thus, from Lemmas 11.2 and 11.3, J(A r\Ax)^mtQ£AnA^H(Q)^ —inf QeA H(Q), which implies that for any e > 0 there exists an open set Ge => A n Aoo such that

In Lemma 11.5 to follow, we show that, given any open set G 0 and t^t0,


which means

Since the left side of (11.20) is independent of e and A, we now let e —» 0 and A —»cc, obtaining the desired (11.12). LEMMA 11.5. LetA^Ms(fl)beclosed,At = {Q: Q(Cn}o A- Let G be a neighborhood in Ms(Cl) such that G0 for all x and n. Thus, from (11.22) we conclude that, for any compact set K,

Using hypothesis (4) we get from Fatou's lemma that, for all t,

By the definition of I\x measure we have so, in particular, if

then F(Q) = JX V(y)/x(dy) where ^ is the one-dimensional marginal of Q. Thus, (11.23) becomes

Let KI = {x e X: V(x) ^ I}. Using hypothesis (3) and the constant C which appears there, we get from (11.24)

thus, for all x e K, In particular, take A >0 and choose k = Cn + An2, 8^ = 1/n. From (11.25) we get, for all x K,

which implies that, for all x e K,

If we let

then (11.26) says


Now, let $A$ be a closed set in $M_s(\Omega)$. From the definition of $A_\lambda$ we see that the marginals of $Q$ as $Q$ varies over $A_\lambda$ form a tight family. Since $A \cap A_\lambda$ is closed, we have from Lemma 11.4 that for each $\lambda > 0$

Since $A \subset (A \cap A_\lambda) \cup A_\lambda^c$, we conclude from (11.27) and (11.28) that, for every $\lambda > 0$,

In this last, if we let A. -* and let /ut be the margina distribution of Q. Then /x « a. Proof. Let A c X be such that a (A) = 0. We want to show that if Q Ms(fl such that H(Q) (Q measure). Thus, from (12.1) we conclude that for almost all o> (Q measure) Qo«{ft>':«'(l)eA} = 0. Hence, EQ{Q00. THEOREM 12.5. Let Q(=MS(CL) be such that H(Q)e93(^?) and be such that We earlier defined (in (10.12))

and noted H(f, Q)^fH(Q). Thus, for the selected, In particular, let °(L be the space of functions u e 98(X) for each of which there exist constants c and C such that for all x, 0 < c ^ u S= C < o°, and for h> 0 and ue% take (a>) = log(w(a>(h))/(Thu)(co(0))) where Th is the semigroup as-ssociated with the Markov process P0,x. Now, this particular (0))}

Since (13.4) holds for every u e % we get

This states that (l/h)I h (|a)^J. Since lim h _^(l/h)I h (ju,) = 7(/x), we conclude I(fjt)^ I Since this was true for any Q such that q(Q) = jut, we have

We now want to show the inequality in the other direction. Let JUL be a probability measure on X and suppose I(JUL) = I0, Ih(/ui)^ hl(/a) = hi In [6] it is shown that there exists a bivariate distribution p(dx, dy) on X x X such that both marginals of p are equal to /a, and if Px(dy) denotes the r.c.p.d. of the second component given the first, then for all h > 0

In (13.7), h(a;/3) is, of course, the entropy of |3 with respect to a, as introduced in Section 10, and p(t, x, •) is the transition function for the Markov process0,P x. Let n be a positive integer, let h = l/n and consider the grid {jh}, j = 0, ±1, ±2, Define a Markov chain on this grid with stationary marginal /ut and transition probability px(dy). Given a sample of this Markov chain, let us interpolate between the grid points by random trajectories whose distribution is P£y where PJUch) is the r.c.p.d. of P0)X given ^{J and where, in the interpolation, x and y are the endpoints, i.e., values of the Markov chain at adjacent grid points. Let Q(h) be the measure induced by this interpolation procedure from the Markov chain on the grid. We note that Px>y is defined for each x for almost all y with respect to p(h, x, •) measure, but px(-) is absolutelya continuous with respect to p(h, x, •) for almost all x (/3 measure) and hence P£y is defined almost everywhere with respect to p(dx, dy) measure. Thus the interpolation procedure is well defined.


Let PI,, be the Markov process on &%> with initial distribution jot, i.e., P» =JP0,x*A(dx). Consider hy«(P^ Q(fl)). Let ^(K) be the tr-field generated by a>(jh), O^j ^ l/h = n. Then,by Lemma 10.3, where P^ and Q^° are respectively the r.c.p.d. of P^ and Q(h), given ^(h). Notice that by construction P^ = Q^i) on &° for almost all o> with respect to Q(h) measure, and therefore EQ(h){fc*?(pn.« 5 Qlh))} = 0- Hence, from (13.8) we get In Lemma 13.2 to follow we show using (13.7), that h^P^; Q(h)) ^ nhl = I Hence, from (13.9), we conclude h^P^; Q(h)) ^ I This last inequalityand Lemma 10.2 imply that the family of measures {Q(H)} is tight. So, let Q by any weak limit of Q(h) as h—>0 (n —>°°) and we see that O is stationary with marginal JUL and, moreover, H(Q)^1. Hence, mfQ:q(Q)=(JLH(Q)^J = I(/ui) an the proof is complete except for the lemma. LEMMA 13.2. Let ir(x, dy) be any Markov chain transition probability with state space X and let p be a probability measure on X. Let p(dx, dy) be a probability measure onXxXwith both marginals ja, conditional measure p and let

Let P^ be the ir-chain with initial distribution /u, and let R^ be the stationary p-chain with initial distribution /u,. Let &n be the o--field generated by x0, Xi,..., Xn, the first n steps of the chain. Then Proof. If T = O° the result is obvious, so assume Te3B(^T)> and from (13.11), since H(Q)^J,

On the other hand, for this 3>,

From (13.12) and (13.13) we get

In particular, taking T — 2 and using the same estimates as in (11.13), we get

Since the one-dimensional marginals .s#M form a tight family, one can clearly choose K, e, A to make the right side of (13.15) as small as one pleases for all Qe^ simultaneously, showing M . THEOREM 13.4. Assume the Markov process P0,x satisfies hypotheses (l)-(5) (Section 9). Let si be a family in Ms(fl) such that H(Q)^ J«» for all Qe,s& Then ^ is tight in M Proof. If fx, e s&M, the family of one-dimensional marginals in ^, then, by the


contraction principle,

but since /^,e^M there is a Q with H(Q)^l and marginal & so that I(^}^1 This is true for all ju, es&M- From the definition of I(JLA) we see that

where Un is as in properties (l)-(5) of Section 9. Letting n —» °° and using Fatou's lemma, we have

Property (5) now ensures that the measures /u, in ^M are uniformly tight. Now we apply Theorem 13.3.D
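The $I$-function that appears in the contraction principle can be computed by brute force for a very small example. The sketch below is an assumption-laden illustration: it takes a discrete-time two-state chain with hypothetical transition matrix `P` and uses the variational form $I(\mu) = -\inf_{u > 0} \sum_x \mu(x) \log\left[(Pu)(x)/u(x)\right]$, which is the one-step analogue of the functions $\log(u/T_h u)$ used in this section. Since the ratio $(Pu)/u$ does not change when $u$ is scaled, $u$ is parametrized as $(1, s)$ and the infimum is approximated by a grid search over $\log s$.

```python
import numpy as np

def i_function_two_state(mu, P, log_s_grid=None):
    """Grid-search approximation of -inf_u sum_x mu(x) log((P u)(x) / u(x))."""
    if log_s_grid is None:
        log_s_grid = np.linspace(-10.0, 10.0, 2001)
    mu = np.asarray(mu, dtype=float)
    P = np.asarray(P, dtype=float)
    best = np.inf
    for log_s in log_s_grid:
        u = np.array([1.0, np.exp(log_s)])   # u is only determined up to scaling
        value = float(np.dot(mu, np.log(P @ u / u)))
        best = min(best, value)
    return -best

# Hypothetical two-state chain; its stationary distribution is (0.75, 0.25).
P = np.array([[0.9, 0.1],
              [0.3, 0.7]])

I_stationary = i_function_two_state([0.75, 0.25], P)
I_uniform = i_function_two_state([0.5, 0.5], P)

print("I at the stationary distribution:", I_stationary)
print("I at the uniform distribution:   ", I_uniform)
```

The value at the stationary distribution is zero up to rounding, and it is strictly positive at any other measure, which is the behavior one expects of a rate function for occupation distributions.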


SECTION 14

Application to the Problem of the Wiener Sausage

A problem that comes up in the study of density of states for Schrödinger operators with certain random potentials near the edge of the energy spectrum is the following: Let $\beta(t)$ be $d$-dimensional Brownian motion starting from the origin. Let $\varepsilon > 0$ be fixed. Consider

$C_t$ is just the image of the Wiener path up to time $t$, and $C_t^\varepsilon$ is the sausage around it of radius $\varepsilon$. The problem is to show that

exists and is nonzero. Here $|C_t^\varepsilon|$ is the $d$-dimensional volume of the sausage $C_t^\varepsilon$ up to time $t$. The actual physical problem involves the Brownian motion that is conditioned to return to the origin at time $t$. But for large $t$, one can see easily that the difference between the free Brownian motion and the conditional Brownian motion is small enough that the formula (14.1) is unaffected by it. We will study only the free Brownian motion. We will first carry out a Brownian change of scale so that $t^{d/(d+2)}$ appears naturally. Let us replace $\beta(s)$, $0 \leq s \leq t$, by

which is again a Brownian motion. Therefore the distribution of $|C_t^\varepsilon|$ is the same as that of $t^{d/(d+2)}\,|C_{\tau}^{\varepsilon t^{-1/(d+2)}}|$ with $\tau = t^{d/(d+2)}$. If we let $\tau = t^{d/(d+2)}$, then

The problem therefore reduces to showing that

exists and is nonzero.

A basic fact in what follows is the behavior of where G is a smooth bounded open set containing the origin. If L(t, w) is the random measure representing the occupation time of the Brownian motion, i.e., if

then We can therefore estimate Since the set {/x: supp JLL t(x)= t<j>(xtl/d), then where L, is the occupation distribution and * denotes convolution. We will denote by t,f the mollified local time. The problem we have reduces to two lemmas: LEMMA 14.3.


and LEMMA 14.4.

Of the two lemmas, the second is a standard approximation lemma involving truncation methods. We will not carry out the proof, but will only refer to [8]. We will sketch a proof of Lemma 14.3. If we consider the $L_1$ topology for densities on $T_l$, then $|\{x : f > 0\}|$ is a lower semicontinuous functional of the density $f$. In view of Theorem 2.3, it is sufficient to prove the large deviation principle for $f_t$ in the $L_1$ topology with a rate function $I(f)$.

LEMMA 14.5. Let $\psi$ be any mollifier, i.e., a smooth probability density. Then

for any $C$ closed in $L_1$.

Proof. The map $f \to f * \psi$ is continuous from $M$ with weak topology to $L_1$ with norm topology. So the large deviation principle in the weak topology for $L_t$, which implies the large deviation principle in the weak topology for $f_t$, is converted into a large deviation principle for $f_t * \psi$ in the norm topology of $L_1$. Theorem 2.4 provides the precise proof.

We now state without proof Lemma 14.6. We will then state and prove Lemma 14.7, which will imply Lemma 14.3 and our main result. Finally we will prove Lemma 14.6.

LEMMA 14.6.

where $k_\rho(\psi) \to \infty$ as $\psi \to \delta_0$ for each $\rho > 0$.

Proof. The proof will be given after the proof of Lemma 14.7.

LEMMA 14.7. The large deviation principle holds for $f_t$ with the rate function $I(f)$ in the space $L_1$ with norm topology.

Proof. Upper bound.

Therefore

Letting $\psi \to \delta_0$ and $\rho \to 0$, we get

provided $C$ is closed in $L_1$.


Lower bound. For an open set $G$ around $f$, where $G_1$ is a smaller open set around $f$ such that the sphere around $G_1$ of radius $\rho$ is contained in $G$. The result is again obvious from Lemma 14.6. From Lemma 14.7 we obtain Lemma 14.3 by an application of Theorem 2.3. If we now combine it with the lower bound, i.e. Theorem 14.2, and take Lemma 14.4 for granted, then we have

THEOREM 14.8.

where k(v, a) is given by (14.3). We now turn to the Proof of Lemma 14.6.

where (We have assumed that are symmetric.) The map g -» 6 defined by 0 = g * 0 we can find a finite number N = N(t,p) of l5..., 0N such that the image of the unit ball is covered by spheres around 0j,..., 0N of radius p/2. We can assume that 01}..., BN are al bounded by 1 as well. Then

where


One can show that for any \ with |x|

where A.J(ZX) is the largest eigenvalue of If, for each p>0, N(t, p)^exp[Dpf] for some Dp, then

One verifies that supA(z^)—»0 as i/f—»80 for each z>0. Therefore

and by letting $z \to \infty$ we will obtain our lemma. We now need only the estimation of $N(t, \rho)$ to complete the proof of Lemma 14.6.

where $\omega$ is the $L_1$ modulus of continuity of

(continued on inside back cover)

S. R. S. VARADHAN

Courant Institute of Mathematical Sciences New York University

Large Deviations and Applications

SOCIETY FOR INDUSTRIAL AND APPLIED MATHEMATICS PHILADELPHIA, PENNSYLVANIA

1984

Copyright 1984 by the Society for Industrial and Applied Mathematics. All rights reserved. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the Publisher. For information, write the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, Pennsylvania 19104-2688. Library of Congress Catalog Card Number: 83-51046. Printed by Capital City Press, Montpelier, Vermont, U.S.A. Second printing 1994.

SIAM is a registered trademark.

Contents

Preface
Section 1 INTRODUCTION
Section 2 LARGE DEVIATIONS
Section 3 CRAMER'S THEOREM
Section 4 MULTIDIMENSIONAL VERSION OF CRAMER'S THEOREM
Section 5 AN INFINITE DIMENSIONAL EXAMPLE: BROWNIAN MOTION
Section 6 THE VENTCEL-FREIDLIN THEORY
Section 7 THE EXIT PROBLEM
Section 8 EMPIRICAL DISTRIBUTIONS
Section 9 THE LARGE DEVIATION PROBLEM FOR EMPIRICAL DISTRIBUTIONS OF MARKOV PROCESSES
Section 10 SOME PROPERTIES OF ENTROPY
Section 11 UPPER BOUNDS
Section 12 LOWER BOUNDS
Section 13 CONTRACTION PRINCIPLE
Section 14 APPLICATION TO THE PROBLEM OF THE WIENER SAUSAGE
Section 15 THE POLARON PROBLEM
Section 16 BIBLIOGRAPHICAL REMARKS
References

Preface These notes are based on lectures given at the University of Southern Illinois at Carbondale during June 1982. The author wishes to thank the National Science Foundation and the Conference Board of Mathematical Sciences for their generous support which made the conference possible. The author is grateful to the Mathematics Department of the University of Southern Illinois for hosting the conference, and for the hard work of its faculty and staff in running the conference successfully.


SECTION 1

Introduction

There are many instances where solutions to problems are expressed as integrals over function spaces. The simplest nontrivial example is the Feynman-Kac formula which expresses the solution of the equation

as the function space integral

where $E_x$ refers to the expectation with respect to Brownian motion on $R^d$ starting from the point $x$ in $R^d$ at time $t = 0$. Another example is the solution to Dirichlet's problem

where $G$ [...] $\lambda_2 \ge \lambda_3 \ge \cdots$ are the eigenvalues tending to $-\infty$ and $\phi_j$ are the corresponding eigenfunctions. $\phi_1(x) > 0$ and therefore $(\phi_1, 1) > 0$. The dominant term is $e^{\lambda_1 t}$ and therefore $\lambda = \lambda_1$. The question before us, then, is to see how the analysis of an integral of the form (1.4) leads directly to a formula of the form (1.6). In the second example we could replace $L$ by $L_\varepsilon$ where for $\varepsilon > 0$

We let $u_\varepsilon$ be the solution of [...] and study the behavior of $u_\varepsilon$ as $\varepsilon \to 0$. This is a singular perturbation problem, and its analysis can sometimes be particularly difficult. We will return to this problem later.
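The Feynman-Kac representation mentioned at the start of this section can be tried out numerically. The displayed formulas are not reproduced here, so the sketch below assumes the standard form $u(t,x) = E_x\{f(\beta(t)) \exp[\int_0^t V(\beta(s))\,ds]\}$ for the equation $u_t = \frac{1}{2}\Delta u + Vu$, $u(0,\cdot) = f$, in dimension one. The function name `feynman_kac_mc`, the test potential $V(y) = -y^2/2$ and the step counts are illustrative choices, not taken from the text.

```python
import math
import random


def feynman_kac_mc(x, t, potential, terminal, n_paths=4000, n_steps=100, seed=0):
    """Monte Carlo estimate of E_x[ terminal(B_t) * exp(int_0^t potential(B_s) ds) ].

    B is a one-dimensional Brownian motion started at x (an assumption of this
    sketch); the time integral is a left-endpoint Riemann sum on n_steps pieces.
    """
    rng = random.Random(seed)
    dt = t / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        b = x
        integral = 0.0
        for _ in range(n_steps):
            integral += potential(b) * dt
            b += sqrt_dt * rng.gauss(0.0, 1.0)
        total += terminal(b) * math.exp(integral)
    return total / n_paths


# A case with a known closed form (Cameron-Martin):
# E_0[ exp(-1/2 * int_0^t B_s^2 ds) ] = cosh(t) ** (-1/2).
estimate = feynman_kac_mc(0.0, 1.0, lambda y: -0.5 * y * y, lambda y: 1.0)
exact = math.cosh(1.0) ** -0.5
print(estimate, exact)
```

The estimate carries both a sampling error and a small discretization bias, so it should only be expected to agree with the closed form to a couple of decimal places at these sample sizes.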

SECTION 2

Large Deviations

In this section we will give an abstract formulation for a class of large deviation problems. In some of the later sections we will look at specific examples that fit the format developed in this section. Let $X$ be a complete separable metric space, and $P_\varepsilon$ a family of probability measures on the Borel subsets of $X$. Typically as $\varepsilon \to 0$, $P_\varepsilon$ will converge weakly to the probability measure which is degenerate, i.e., has unit mass, at some point $x_0$ in $X$. For most sets $A$, then, $P_\varepsilon(A) \to 0$ as $\varepsilon \to 0$. In the examples we look at, $P_\varepsilon(A)$ will tend exponentially rapidly to zero as $\varepsilon \to 0$ with an exponential constant depending on the set $A$ and the relevant situation. We abstract the situation in the following context.

DEFINITION 2.1. We say that $\{P_\varepsilon\}$ obeys the large deviation principle with a rate function $I(\cdot)$ if there exists a function $I(\cdot)$ from $X$ into $[0, \infty]$ [...] $\delta > 0$ is arbitrary we are done.

Lower bound. Given $\delta > 0$ there exists a point $y \in X$ such that $F(y) - I(y) \ge \sup_x [F(x) - I(x)] - \delta/2$. We can find a neighborhood $U$ of $y$ such that $F(x) \ge F(y) - \delta/2$ for $x \in U$. We would then have

Since $\delta > 0$ is arbitrary we are done. Sometimes we need a slight variation of the above theorem, which we state and prove as another theorem.

THEOREM 2.3. Let $P_\varepsilon$ satisfy the large deviation principle with a rate function $I(\cdot)$. Let $F_\varepsilon(x)$ be a family of nonnegative functions such that for some lower semicontinuous nonnegative function $F(x)$ one has

Then

Proof. Let $l = \inf_x [F(x) + I(x)]$. For any $\delta > 0$ and $x \in X$ there is a neighborhood $U_\delta$ of $x$ such that

Therefore as $\varepsilon \to 0$

By the usual compactness argument, any compact set $K$ can be included in some open set $U$ such that $U$ is a finite union of such $U_\delta$'s. Therefore as $\varepsilon \to 0$ one has

On the other hand, since $F_\varepsilon \ge 0$, we have

By choosing $K = \{x : I(x) \le k\}$ for a $k$ much larger than $l$, the term can be made negligible compared to $\exp[-(l - 2\delta)/\varepsilon]$. Since $\delta > 0$ is arbitrary, the proof is complete.

Remark 1. Let $P_\varepsilon$ be a family of probability measures on a Polish space $X$ satisfying the large deviation principle with a rate function $I(\cdot)$. Let $\pi$ be a continuous mapping from $X$ into $Y$. Then $Q_\varepsilon = P_\varepsilon \pi^{-1}$ also satisfies the large deviation principle with a rate function $J(y)$ defined by

$$J(y) = \inf_{x :\, \pi(x) = y} I(x).$$

(If $y$ is not in the range of $\pi$, then $J(y) = \infty$.) We will refer to this as the "contraction principle".

Remark 2. Instead of $\varepsilon \to 0$ we can have a parameter $\lambda \to \infty$ and think of $\varepsilon$ as $1/\lambda$. The exponential rate can be proportional to $\varepsilon^{-2}$ instead of $\varepsilon^{-1}$. We would then have to reparametrize the family to make it come out in the standard form. We can also have a discrete sequence $P_n$ with an exponential rate proportional to $n$. There is a slight technical modification of Remark 1, which we will need later. We state and prove it as a theorem.

THEOREM 2.4. Let $P_\varepsilon$ satisfy the large deviation principle with a rate function $I(\cdot)$. Let $F_\varepsilon$ be continuous maps from $X \to Y$ where $Y$ is another complete separable metric space. Assume that $\lim_{\varepsilon \to 0} F_\varepsilon = F$ exists uniformly over compact subsets of $X$. Then if we define $Q_\varepsilon$ on $Y$ by $Q_\varepsilon = P_\varepsilon F_\varepsilon^{-1}$, then $Q_\varepsilon$ satisfies the large deviation principle with a rate function $J(\cdot)$ defined by

$$J(y) = \inf_{x :\, F(x) = y} I(x).$$

Proof. Upper bound. Let A a. Moreover, for x>a and 00 for x>a, and by a similar argument only over 00

Therefore


Since $\theta > 0$ is arbitrary, replacing $\theta$ by $n\theta$ gives

A similar inequality is valid for intervals of the form $J_y = [-$ [...] $0$ be fixed. Allow $n \to \infty$. Then

For any closed set $C$, one verifies easily that

We now let $\delta \to 0$ and we obtain the upper bound. Applications. One can use this theorem to obtain tail estimates for Brownian motion. For example, one gets

by choosing $C = \{x : \int_0^1 |x(t)|^p\,dt \ge 1\}$. For $p = 2$ this reduces to an eigenvalue problem. Another application is to derive Strassen's form of the law of the iterated logarithm for Brownian motion. Define

for $0 \le t \le 1$, where $\beta(t)$ is Brownian motion on $[0,$ [...] $x$ from $C[[0, T]; R^d] \to C[[0, T]; R^d]$ which is in fact continuous for $\delta > 0$. We denote by $P_{\delta,\varepsilon,x}$ the distribution induced by $F_{\delta,x}$ from the scaled Wiener measure $Q_\varepsilon$. In other words,

We can combine the large deviation results for $Q_\varepsilon$ proved in Section 5 with Remark 1 of Section 2 to obtain a large deviation principle for $P_{\delta,\varepsilon,x}$ as $\varepsilon \to 0$ for each fixed $\delta$ and $x$. The rate function takes the form

if $\dot f(t)$ is square integrable and $f(0) = x$. $J_{\delta,x}(f)$ is infinite otherwise. We shall now state these facts in the form of a theorem.

THEOREM 6.1. Let $C$ be a closed set in $C[0, T]$ and $G$ an open set. Then

Proof. Since [...] the theorem follows from Theorem 2.4 by observing that the map $F_{\delta,x}(f)$ is jointly continuous in the initial point $x$ and the function $f$ as a map of $C[0, T] \times R^d \to C[0, T]$. We want to let $\delta \to 0$ in the statement of Theorem 6.1. To do that we need a basic estimate on the difference between $x_{\varepsilon,\delta}(t)$ and $x_\varepsilon(t)$. This is stated as Lemma 6.2. The starting point $x$ should enter into the definition of $x_{\varepsilon,\delta}(t)$ and $x_\varepsilon(t)$. But our estimates will be uniform in $x$, and so we will not explicitly write out the dependence on the starting point.

LEMMA 6.2. For any $\eta > 0$ and $T$ [...] $0$ is an arbitrary number. We want to estimate [...] $0$. It is therefore sufficient to show that

During the time interval $0 \le s \le \tau_3$, we have [...] Consider the function [...] where $\theta$ and $l$ will be chosen later on. We apply Itô's formula to [...]

We note

Taking $\theta = \rho^2$, we obtain for some constant $C$


we have used here the inequalities (6.5). In other words, if $\lambda = Cl(1 + \varepsilon l)$, then [...] is a supermartingale. Replacing $l$ by $l/\varepsilon$ we get $\lambda = Cl(l + 1)/\varepsilon$. From the supermartingale property we have

On the set where $\tau_3$ [...] $0$ to obtain (6.4). We note as mentioned earlier that all of our estimates are uniform in the starting point $x$. This proves the lemma. This lemma allows us to pass to the limit as $\delta \to 0$ in Theorem 6.1. We carry this out in the next theorem.

THEOREM 6.3. For any closed set $C$ and any open set $G$ in $C[0, T]$,

Here $I_x(f)$ is given by

if $f$ is absolutely continuous with a square integrable derivative $\dot f$ and satisfies $f(0) = x$. Otherwise $I_x(f) = \infty$. Proof. Lower bound. Let $f \in G$ and $I_x(f) =$ [...] $0$. Taking logarithms and applying Theorem 6.1, we get

Letting $\delta \to 0$, and then $\eta \to 0$, since the left-hand side is independent of $\delta$ and $\eta$ and $C_{\delta,\eta} \to \infty$ as $\delta \to 0$ for every $\eta > 0$, we have

From the definition of $J_{\delta,x}(f)$, it is easy to see that for every closed set $D$

and

The proof of the theorem is now complete.
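To get a feeling for the rate function in Theorem 6.3, one can discretize it. The display defining $I_x(f)$ is not reproduced above, so the sketch below assumes the simplest case of a scalar path and unit diffusion coefficient, $I_x(f) = \frac{1}{2}\int_0^T |\dot f(t) - b(f(t))|^2\,dt$. The drift $b(x) = -x$ and the two test paths are hypothetical choices made only for illustration.

```python
import math


def action(path, drift, T):
    """Approximate 1/2 * int_0^T |f'(t) - b(f(t))|^2 dt for a scalar path.

    `path` holds f at the equally spaced times 0, T/n, ..., T. The derivative
    is a forward difference and the drift is evaluated at the segment midpoint.
    """
    n = len(path) - 1
    dt = T / n
    total = 0.0
    for k in range(n):
        velocity = (path[k + 1] - path[k]) / dt
        midpoint = 0.5 * (path[k] + path[k + 1])
        total += 0.5 * (velocity - drift(midpoint)) ** 2 * dt
    return total


def drift(x):
    # Hypothetical drift with a stable equilibrium at 0.
    return -x


n, T = 1000, 1.0
times = [T * k / n for k in range(n + 1)]

relaxing = [math.exp(-s) for s in times]  # solves f' = b(f), so the cost is nearly zero
climbing = list(times)                    # f(t) = t moves against the drift

cost_relaxing = action(relaxing, drift, T)
cost_climbing = action(climbing, drift, T)  # exact value: 1/2 * int_0^1 (1 + t)^2 dt = 7/6
print(cost_relaxing, cost_climbing)
```

Under this assumed form, paths that follow the deterministic flow are free, and any path that leaves the equilibrium pays a strictly positive cost; this is the quantity minimized in the exit problem of the next section.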


SECTION 7

The Exit Problem

For $\varepsilon > 0$, let $L_\varepsilon$ be the operator

Consider the solution $u_\varepsilon(x)$ of the Dirichlet problem

where $G$ is a bounded region with smooth boundary $\partial G$. Consider the ordinary differential equation

We assume that there is a globally stable equilibrium point $x_0$ in $G$ such that for every $x \in G$ the solution $x(t)$ of (7.3) lies in $G$ for $t > 0$ and $x(t) \to x_0$ as $t \to \infty$. The boundary data $f$ in (7.2) is assumed to be continuous on $\partial G$. As $\varepsilon \to 0$ the trajectories of the diffusion process corresponding to $L_\varepsilon$ are close to the deterministic trajectory (7.3) with a very high probability. In the limit the deterministic trajectory does not exit at all from the set $G$, so that the exit time and exit place are not defined. We need a new formulation to calculate the limit of the hitting distribution on $\partial G$ as $\varepsilon \to 0$. For each $0 < T < \infty$ we define

and

The following lemmas are elementary and are proved directly from the definitions. LEMMA 7.1. There exists a constant C such that

LEMMA 7.2. [...] $P_0$ in the sense of weak convergence almost surely with respect to $P_0$. If we denote by $Q_n$ the distribution of $R_{n,\omega}$, [...] $R_{t,\omega}$ is a mapping of $\Omega \to M_s(\Omega)$, and as such is [...] measurable. For each $t > 0$ and each $x \in X$, we use this mapping to induce a probability measure $\Gamma_{t,x}$ on $M_s(\Omega)$ by defining $\Gamma_{t,x} = P_{0,x} R_{t,\omega}^{-1}$, i.e., if $B$ [...] $x$. We will then show that the large deviation principle holds for $\Gamma_{t,x}$ with a rate function given by $H(Q)$. Let us consider the occupation distribution: for $\omega \in \Omega$, $t > 0$ and $A$ [...] $\omega(\cdot)$ occupies the set $A$ [...] $0$ and $\omega \in \Omega$, $L_{t,\omega}(\cdot)$ is then a probability measure on $X$. We denote the space of probability measures on $X$ by $M(X)$. Now, the relation between $L_{t,\omega}$ in $M(X)$ and $R_{t,\omega}$ in $M_s(\Omega)$ is made clear by

Furthermore, in [4] the large deviation result for $L_{t,\omega}$ was governed by a certain $I$-function (for the Markov process $P_{0,x}$) which was defined on $M(X)$ (see e.g. [4]). To see the relation between that $I$-function and the entropy function $H(Q)$ that we will use, we show in §13 that

$$I(\mu) = \inf_{Q :\, q(Q) = \mu} H(Q), \tag{9.4}$$

where, in (9.4), the notation $q(Q) = \mu$ means that the marginal of the stationary measure $Q$ is $\mu$. We refer to (9.4) as the contraction principle. Now, we wish to make precise the hypotheses imposed in order to prove the main results. For the upper estimate (Theorem 11.6), we first comment that if the space $X$ happens to be compact, then no further hypotheses on the Markov process $P_{0,x}$ need be imposed in order to obtain the large deviation principle, but if $X$ is not compact, we need to impose the following hypothesis on $P_{0,x}$: There exists a sequence $\{u_n(x)\}$ of functions in $\mathcal{D}(L)$ (the domain of the infinitesimal generator $L$ of the Markov process $P_{0,x}$) with the five properties:

(1) $u_n(x) \ge c > 0$ for all $x$ and $n$.

(2) There exists, for every compact set $K$ [...] $0$ for almost all $y$ ($\alpha$ measure) such that for all $x \in X$:

I. $p(1, x, dy) = p(1, x, y)\,\alpha(dy)$.

II. $p(1, x, \cdot)$ as a mapping from $X \to L_1(\alpha)$ is continuous.

To prove the large deviation principle we will assume (1) through (5) as well as I and II.

SECTION 10

Some Properties of Entropy

Let $(X, \Sigma)$ be a measurable space and let $\lambda$ and $\mu$ be probability measures on $(X, \Sigma)$. Let $B(\Sigma)$ be the space of bounded measurable functions on $(X, \Sigma)$. We define the entropy of $\mu$ with respect to $\lambda$ by

If $X$ is a Polish space and $\Sigma$ is the Borel $\sigma$-field, then replacing $B(\Sigma)$ by $C(X)$ in (10.1) gives the same infimum (see [4]). From the definition (10.1) we see that for fixed $\lambda$, $h(\lambda; \mu)$ is a nonnegative, convex function of $\mu$ and $0 \le h(\lambda; \mu) \le$ [...] $\omega(0)$ with respect to the $\sigma$-field [...] and where $\omega$ is the variable, i.e., [...] be a trajectory in $\Omega$, $t \in (-\infty, \infty)$, and let $P$ be a measure on [...] with $s \le t$. Suppose [...] Then, from (10.8) we have, for every

which implies from (10.1) that


Using Lemma 10.3, we then get [...] But, by the definition of [...], and therefore [...] $= 0$. Moreover, [...] so that [...] Hence, from (10.11) we obtain [...] which completes the proof.

THEOREM 10.6. Under the hypothesis that the mapping $x \to P_{0,x}$ is weakly continuous, $H(Q)$ is lower semicontinuous and convex in $Q$. Proof. From Theorems 10.4 and 10.5,

The supremum on the right of (10.12) is not changed if we restrict $\Phi$ to [...] $\cap\, C(\Omega)$. Thus, to show the lower semicontinuity of $H(Q)$, it suffices to show that for each [...] Assume then that $\omega_n \to \omega$ and that $\omega$ has no jump at the origin, i.e., [...] and hence [...]. It remains to observe that, for all stationary processes $Q$, $Q\{\omega : \omega \text{ has a jump at } 0\} = 0$. The convexity of $H(Q)$ follows from (10.12) when we note that $H(Q)$ is the supremum of a collection of linear functionals of $Q$. Next (Theorem 10.8 below) we prove the surprising fact that $H(Q)$ is linear in $Q$, but first we need a lemma.

LEMMA 10.7. Let $M_\Omega$ be the space of probability measures on $\Omega$. There exists a conditional probability $R_\omega : \Omega \to M_\Omega$ such that [...] and for all $t > 0$,

Now define

Clearly, THEOREM 10.9.

Proof. Define a measure $P$ on [...] by $P = \int P_{0,\omega(0)}\,Q(d\omega)$. By Jensen's inequality and (10.1),

Hence, to show (10.15), it suffices to show that

From Lemma 10.3, [...] where $P_\omega$ and $Q_\omega$ are r.c.p.d. of $P$ and $Q$ respectively, given [...]. But,


using the homogeneity of the Markov family and the stationarity of the $Q$ process. By the martingale convergence theorem, [...] almost everywhere with respect to $Q$ measure. As noted earlier, $h(\lambda; \mu)$ is lower semicontinuous in $\mu$, and hence

Thus, by Fatou's lemma, (10.17), (10.18) and (10.19),

Using the usual Cesàro argument, we obtain (10.16) and (10.20). This completes the proof of Theorem 10.9. □

Let $C_s(\Omega)$ be the space of functions $\Phi$ on $\Omega$ which are bounded and measurable on $\Omega$ and such that for all $Q \in M_s(\Omega)$, [...] For $\Phi \in C_s(\Omega)$, the linear functional of $Q$, $\int \Phi\,Q(d\omega)$, is then continuous in $Q$.

LEMMA 10.10. Let $Y_1 =$ [...] $\psi(\omega(0))$ is continuous as a function of $\omega$ on the set where $\omega$ is continuous at $0$. This is a set of $\omega$ of $Q$ measure 1 for every $Q \in M_s(\Omega)$, i.e., $\psi(\omega(0)) \in C_s(\Omega)$. Moreover, [...] Thus, if we let $\Phi =$ [...] and, as we just noted, [...] $= 1$. Thus $\Phi$, as defined, is in $Y_2$, and so $E^Q\{\Phi\} \le l$. Hence

But

Thus, (10.22) implies $H(t, Q) \le l$, and the proof is complete. □
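The variational character of the entropy can be checked by hand on a finite space. The display (10.1) is not reproduced above, so the sketch below assumes it is equivalent to the familiar form $h(\lambda; \mu) = \sup_{\Phi} [\int \Phi\,d\mu - \log \int e^{\Phi}\,d\lambda]$, which on a finite set equals $\sum_i \mu_i \log(\mu_i/\lambda_i)$. The two distributions and the helper names are made up for the illustration.

```python
import math
import random


def relative_entropy(lam, mu):
    """Sum of mu_i * log(mu_i / lam_i): the entropy of mu with respect to lam."""
    return sum(m * math.log(m / l) for l, m in zip(lam, mu) if m > 0)


def variational_value(lam, mu, phi):
    """The functional  int phi dmu - log int exp(phi) dlam  for a test function phi."""
    mean_phi = sum(m * p for m, p in zip(mu, phi))
    log_mgf = math.log(sum(l * math.exp(p) for l, p in zip(lam, phi)))
    return mean_phi - log_mgf


lam = [0.5, 0.3, 0.2]   # hypothetical reference measure
mu = [0.2, 0.2, 0.6]    # hypothetical measure whose entropy is computed

h = relative_entropy(lam, mu)

# The supremum is attained at phi = log(dmu/dlam).
phi_star = [math.log(m / l) for l, m in zip(lam, mu)]
best = variational_value(lam, mu, phi_star)

# Any other bounded test function gives a value no larger than h.
rng = random.Random(1)
trials = [variational_value(lam, mu, [rng.uniform(-3, 3) for _ in lam])
          for _ in range(1000)]
print(h, best, max(trials))
```

Since each test function contributes a functional that is linear in $\mu$, the supremum form makes the convexity and lower semicontinuity of the entropy in $\mu$ visible at a glance.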

SECTION 11

Upper Bounds

In this section we obtain upper bounds on $\Gamma_{t,x}(A)$ as $t \to \infty$ for appropriate sets $A$. The main results are Lemma 11.4 and Theorem 11.6. We need some preparatory lemmas.

LEMMA 11.1. Let [...] $= \sum_{k :\, k \ge 0,\ s + kT \le t} \Phi(\theta_{s+kT}\,\omega)$ so that we can rewrite the left side of (11.1) as

From Jensen's inequality,

Since $\Phi(\theta_{s+kT}\,\omega)$ is [...] measurable, we have from the hypothesis on $\Phi$ and successive conditioning that

Inequalities (11.3) and (11.2) give us (11.1).

COROLLARY. Let $\Phi \in B(\mathcal{F}^0_T)$ be such that $E^{P_{0,x}}\{e^{\Phi(\omega)}\} \le 1$ for all $x$. Then, for all $t$,

Proof. By definition of $\Gamma_{t,x}$ measure,

and since


we have from Lemma 11.1 for all t,

In what follows we shall use the notation

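for $A \subset M_s(\Omega)$.

LEMMA 11.2. Let $E_T$ be the set of $\Phi \in B(\mathcal{F}^0_T) \cap C_s(\Omega)$ such that for all $x$, $E^{P_{0,x}}\{e^{\Phi}$ [...] $0$, there exist an $l$ and open sets $G_1, G_2, \ldots, G_l$ in $M_s(\Omega)$ such that $A \subset \bigcup_{j=1}^{l} G_j$ and

In particular, for any compact set $A$ in $M_s(\Omega)$,

Proof. Let $\inf_{Q \in A} H(Q) = \eta$. From Theorem 10.6 and Lemma 10.7 it follows that, given $Q \in A$ and $\varepsilon > 0$, there exist $T_Q$ and $\Phi_Q \in E_{T_Q}$ such that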


Since $\Phi_Q \in C_s(\Omega)$, the integral on the left of (11.10) is a continuous functional of $Q$ which implies that there exists a neighborhood $G_Q$ of $Q$ in $M_s(\Omega)$ such that

for all $Q$ in $G_Q$. The neighborhoods $\{G_Q\}$ form an open covering of the compact set $A$, and therefore there exist $G_{Q_1}, G_{Q_2}, \ldots, G_{Q_l}$ such that $A \subset \bigcup_{j=1}^{l} G_{Q_j}$. This and (11.11) yield (11.8) and hence (11.9). □

LEMMA 11.4. Let $A$ be closed in $M_s(\Omega)$ and such that the family of one-dimensional marginals of $Q$ as $Q$ varies over $A$ forms a tight family of measures on $X$. Then

Proof. Let $M(X)$ be the space of probability measures on $X$. Let $A_M \subset M(X)$ be the family of one-dimensional marginals of $Q$ as $Q$ varies over $A$. Since by hypothesis $A_M$ is tight, given $\varepsilon_n \to 0$ there exist compact sets $K_n$ [...] $0$, $\nu = \lambda n^2$, $\varepsilon_n = 1/n$ and $\eta_n = \exp[$ [...] Then, [...] Thus, from (11.6) for each $n$

This implies

If we let $A_t = \{Q : Q(C_n) \le 1/t + 2/n \text{ for all } n\}$, then we have just shown that

We should note that the set $A_t$ depends on $\lambda$ since the sets $C_n$ depend on the choice of $\eta_n$, and $\eta_n$ was selected in terms of $\lambda$. From (11.17) we get

Let $A_\infty = \bigcap_{t > 0} A_t = \{Q \in M_s(\Omega) : Q(C_n) \le 2/n \text{ for all } n\}$. The restriction of $Q$ to $\mathcal{F}^0_1$ constitutes a tight family of measures on $D_X[0, 1]$. But, for stationary processes, tightness when restricted to some interval implies overall tightness, and hence $A_\infty$ is a compact set in $M_s(\Omega)$. Since $A$ is closed, $A \cap A_\infty$ is compact in $M_s(\Omega)$. Thus, from Lemmas 11.2 and 11.3, $J(A \cap A_\infty) \le -\inf_{Q \in A \cap A_\infty} H(Q) \le -\inf_{Q \in A} H(Q)$, which implies that for any $\varepsilon > 0$ there exists an open set $G_\varepsilon \supset A \cap A_\infty$ such that

In Lemma 11.5 to follow, we show that, given any open set $G$ [...] $0$ and $t \ge t_0$,


which means

Since the left side of (11.20) is independent of $\varepsilon$ and $\lambda$, we now let $\varepsilon \to 0$ and $\lambda \to \infty$, obtaining the desired (11.12).

LEMMA 11.5. Let $A \subset M_s(\Omega)$ be closed, $A_t = \{Q : Q(C_n)$ [...] Let $G$ be a neighborhood in $M_s(\Omega)$ such that $G$ [...] $0$ for all $x$ and $n$. Thus, from (11.22) we conclude that, for any compact set $K$,

Using hypothesis (4) we get from Fatou's lemma that, for all $t$,

By the definition of $\Gamma_{t,x}$ measure we have [...] so, in particular, if

then $F(Q) = \int_X V(y)\,\mu(dy)$ where $\mu$ is the one-dimensional marginal of $Q$. Thus, (11.23) becomes

Let $K_l = \{x \in X : V(x) \le l\}$. Using hypothesis (3) and the constant $C$ which appears there, we get from (11.24)

thus, for all $x \in K$, [...] In particular, take $\lambda > 0$ and choose $l = Cn + \lambda n^2$, $\delta_n = 1/n$. From (11.25) we get, for all $x \in K$,

which implies that, for all x e K,

If we let

then (11.26) says


Now, let $A$ be a closed set in $M_s(\Omega)$. From the definition of $A_\lambda$ we see that the marginals of $Q$ as $Q$ varies over $A_\lambda$ form a tight family. Since $A \cap A_\lambda$ is closed, we have from Lemma 11.4 that for each $\lambda > 0$

Since $A \subset (A \cap A_\lambda) \cup A_\lambda^c$, we conclude from (11.27) and (11.28) that, for every $\lambda > 0$,

In this last, if we let $\lambda \to$ [...] and let $\mu$ be the marginal distribution of $Q$. Then $\mu \ll \alpha$. Proof. Let $A \subset X$ be such that $\alpha(A) = 0$. We want to show that if $Q \in M_s(\Omega)$ such that $H(Q)$ [...] ($Q$ measure). Thus, from (12.1) we conclude that for almost all $\omega$ ($Q$ measure) $Q_{0,\omega}\{\omega' : \omega'(1) \in A\} = 0$. Hence, [...]

THEOREM 12.5. Let $Q \in M_s(\Omega)$ be such that $H(Q)$ [...] and be such that [...] We earlier defined (in (10.12))

and noted $H(t, Q) \le tH(Q)$. Thus, for the $\Phi$ selected, [...] In particular, let $\mathcal{U}$ be the space of functions $u \in B(X)$ for each of which there exist constants $c$ and $C$ such that for all $x$, $0 < c \le u \le C < \infty$, and for $h > 0$ and $u \in \mathcal{U}$ take $\Phi(\omega) = \log\bigl(u(\omega(h))/(T_h u)(\omega(0))\bigr)$ where $T_h$ is the semigroup associated with the Markov process $P_{0,x}$. Now, this particular [...]

Since (13.4) holds for every $u \in \mathcal{U}$ we get

This states that $(1/h) I_h(\mu) \le l$. Since $\lim_{h \to 0} (1/h) I_h(\mu) = I(\mu)$, we conclude $I(\mu) \le l$. Since this was true for any $Q$ such that $q(Q) = \mu$, we have

We now want to show the inequality in the other direction. Let $\mu$ be a probability measure on $X$ and suppose $I(\mu) = l$ [...] $0$, $I_h(\mu) \le hI(\mu) = hl$. In [6] it is shown that there exists a bivariate distribution $\rho(dx, dy)$ on $X \times X$ such that both marginals of $\rho$ are equal to $\mu$, and if $\rho_x(dy)$ denotes the r.c.p.d. of the second component given the first, then for all $h > 0$

In (13.7), $h(\alpha; \beta)$ is, of course, the entropy of $\beta$ with respect to $\alpha$, as introduced in Section 10, and $p(t, x, \cdot)$ is the transition function for the Markov process $P_{0,x}$. Let $n$ be a positive integer, let $h = 1/n$ and consider the grid $\{jh\}$, $j = 0, \pm 1, \pm 2, \ldots$ Define a Markov chain on this grid with stationary marginal $\mu$ and transition probability $\rho_x(dy)$. Given a sample of this Markov chain, let us interpolate between the grid points by random trajectories whose distribution is $P^h_{x,y}$ where $P^h_{x,\omega(h)}$ is the r.c.p.d. of $P_{0,x}$ given $\omega(h)$ and where, in the interpolation, $x$ and $y$ are the endpoints, i.e., values of the Markov chain at adjacent grid points. Let $Q^{(h)}$ be the measure induced by this interpolation procedure from the Markov chain on the grid. We note that $P^h_{x,y}$ is defined for each $x$ for almost all $y$ with respect to $p(h, x, \cdot)$ measure, but $\rho_x(\cdot)$ is absolutely continuous with respect to $p(h, x, \cdot)$ for almost all $x$ ($\mu$ measure) and hence $P^h_{x,y}$ is defined almost everywhere with respect to $\rho(dx, dy)$ measure. Thus the interpolation procedure is well defined.


Let $P_\mu$ be the Markov process on $\mathcal{F}^0_\infty$ with initial distribution $\mu$, i.e., $P_\mu = \int P_{0,x}\,\mu(dx)$. Consider $h_{\mathcal{F}^0_1}(P_\mu; Q^{(h)})$. Let $\mathcal{F}^{(h)}$ be the $\sigma$-field generated by $\omega(jh)$, $0 \le j \le 1/h = n$. Then, by Lemma 10.3, [...] where $P^\omega$ and $Q^\omega$ are respectively the r.c.p.d. of $P_\mu$ and $Q^{(h)}$, given $\mathcal{F}^{(h)}$. Notice that by construction $P^\omega = Q^\omega$ on $\mathcal{F}^0_1$ for almost all $\omega$ with respect to $Q^{(h)}$ measure, and therefore $E^{Q^{(h)}}\{h_{\mathcal{F}^0_1}(P^\omega; Q^\omega)\} = 0$. Hence, from (13.8) we get [...] In Lemma 13.2 to follow we show, using (13.7), that $h_{\mathcal{F}^{(h)}}(P_\mu; Q^{(h)}) \le nhl = l$. Hence, from (13.9), we conclude $h_{\mathcal{F}^0_1}(P_\mu; Q^{(h)}) \le l$. This last inequality and Lemma 10.2 imply that the family of measures $\{Q^{(h)}\}$ is tight. So, let $Q$ be any weak limit of $Q^{(h)}$ as $h \to 0$ ($n \to \infty$) and we see that $Q$ is stationary with marginal $\mu$ and, moreover, $H(Q) \le l$. Hence, $\inf_{Q :\, q(Q) = \mu} H(Q) \le l = I(\mu)$ and the proof is complete except for the lemma.

LEMMA 13.2. Let $\pi(x, dy)$ be any Markov chain transition probability with state space $X$ and let $\mu$ be a probability measure on $X$. Let $\rho(dx, dy)$ be a probability measure on $X \times X$ with both marginals $\mu$, conditional measure $\rho_x(dy)$ and let

Let $P_\mu$ be the $\pi$-chain with initial distribution $\mu$, and let $R_\mu$ be the stationary $\rho$-chain with initial distribution $\mu$. Let $\mathcal{F}_n$ be the $\sigma$-field generated by $x_0, x_1, \ldots, x_n$, the first $n$ steps of the chain. Then [...] Proof. If $T = \infty$ the result is obvious, so assume $T$ [...] $\Phi \in B(\mathcal{F}^0_T)$, and from (13.11), since $H(Q) \le l$,

On the other hand, for this $\Phi$,

From (13.12) and (13.13) we get

In particular, taking $T = 2$ and using the same estimates as in (11.13), we get

Since the one-dimensional marginals $\mathcal{A}_M$ form a tight family, one can clearly choose $K$, $\varepsilon$, $\lambda$ to make the right side of (13.15) as small as one pleases for all $Q \in \mathcal{A}$ simultaneously, showing [...].

THEOREM 13.4. Assume the Markov process $P_{0,x}$ satisfies hypotheses (1)-(5) (Section 9). Let $\mathcal{A}$ be a family in $M_s(\Omega)$ such that $H(Q) \le l < \infty$ for all $Q \in \mathcal{A}$. Then $\mathcal{A}$ is tight in $M_s(\Omega)$. Proof. If $\mu \in \mathcal{A}_M$, the family of one-dimensional marginals in $\mathcal{A}$, then, by the


contraction principle,

but since $\mu \in \mathcal{A}_M$ there is a $Q$ with $H(Q) \le l$ and marginal $\mu$ so that $I(\mu) \le l$. This is true for all $\mu \in \mathcal{A}_M$. From the definition of $I(\mu)$ we see that

where $u_n$ is as in properties (1)-(5) of Section 9. Letting $n \to \infty$ and using Fatou's lemma, we have

Property (5) now ensures that the measures $\mu$ in $\mathcal{A}_M$ are uniformly tight. Now we apply Theorem 13.3. □
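The contraction identity of this section has a discrete-time analogue that is small enough to compute. The sketch below is not the construction used in the text: it assumes a two-state chain with a hypothetical transition matrix `pi`, takes the $I$-function in the form $I(\mu) = \sup_{u > 0} \sum_x \mu(x) \log\bigl(u(x)/(\pi u)(x)\bigr)$, and compares it with the infimum, over bivariate laws $\rho$ with both marginals $\mu$, of the entropy of $\rho$ with respect to $\mu(x)\pi(x, y)$, in the spirit of Lemma 13.2. Both optimizations are done by a plain grid scan.

```python
import math


def dv_rate(m, pi, grid=4001, span=12.0):
    """sup over u > 0 of sum_x mu(x) * log(u(x) / (pi u)(x)), with mu = (m, 1 - m).

    The functional is unchanged when u is rescaled, so u = (1, exp(v)) and v
    is scanned over a uniform grid on [-span, span].
    """
    mu = (m, 1.0 - m)
    best = -math.inf
    for k in range(grid):
        v = -span + 2.0 * span * k / (grid - 1)
        u = (1.0, math.exp(v))
        value = sum(
            mu[x] * math.log(u[x] / (pi[x][0] * u[0] + pi[x][1] * u[1]))
            for x in (0, 1)
        )
        best = max(best, value)
    return best


def pair_rate(m, pi, grid=4001):
    """inf over bivariate laws rho on {0,1}^2 with both marginals mu of
    sum_{x,y} rho(x,y) * log(rho(x,y) / (mu(x) * pi(x,y)))."""
    mu = (m, 1.0 - m)
    lo, hi = max(0.0, 2.0 * m - 1.0), m   # admissible range of rho(0, 0)
    best = math.inf
    for k in range(grid):
        a = lo + (hi - lo) * k / (grid - 1)
        rho = ((a, m - a), (m - a, 1.0 - 2.0 * m + a))
        value = 0.0
        for x in (0, 1):
            for y in (0, 1):
                if rho[x][y] > 0.0:
                    value += rho[x][y] * math.log(rho[x][y] / (mu[x] * pi[x][y]))
        best = min(best, value)
    return best


pi = ((0.9, 0.1), (0.2, 0.8))   # hypothetical transition matrix
from_u = dv_rate(0.5, pi)
from_pairs = pair_rate(0.5, pi)
print(from_u, from_pairs)
```

For this particular matrix and $\mu = (1/2, 1/2)$ both optimizations can also be solved by hand, and each gives $\frac{1}{2}\log(50/49)$, so the two printed numbers should agree up to the grid resolution.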


SECTION 14

Application to the Problem of the Wiener Sausage

A problem that comes up in the study of density of states for Schrödinger operators with certain random potentials near the edge of the energy spectrum is the following: Let $\beta(t)$ be $d$-dimensional Brownian motion starting from the origin. Let $\varepsilon > 0$ be fixed. Consider

$C_t$ is just the image of the Wiener path up to time $t$, and $C_t^\varepsilon$ is the sausage around it of radius $\varepsilon$. The problem is to show that

exists and is nonzero. Here $|C_t^\varepsilon|$ is the $d$-dimensional volume of the sausage $C_t^\varepsilon$ up to time $t$. The actual physical problem involves the Brownian motion that is conditioned to return to the origin at time $t$. But for large $t$, one can see easily that the difference between the free Brownian motion and the conditional Brownian motion is small enough that the formula (14.1) is unaffected by it. We will study only the free Brownian motion. We will first carry out a Brownian change of scale so that $t^{d/(d+2)}$ appears naturally. Let us replace $\beta(s)$, $0 \le s \le t$, by

which is again a Brownian motion. Therefore the distribution of $|C_t^\varepsilon|$ is the same as that of $t^{d/(d+2)}\,|C^{\varepsilon t^{-1/(d+2)}}_{t^{d/(d+2)}}|$. If we let $\tau = t^{d/(d+2)}$, then

The problem therefore reduces to showing that

exists and is nonzero.
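The bookkeeping in this change of scale is easy to get wrong, so a small numerical check may help. The sketch below assumes the rescaling is the usual Brownian one, with space shrunk by $t^{1/(d+2)}$ and time by $t^{2/(d+2)}$; the displayed formula itself is not reproduced above. The function name `rescale` and the sample values are illustrative only.

```python
def rescale(t, d, eps):
    """Return (tau, radius, volume_factor) for the assumed Brownian change of scale.

    Space is shrunk by t ** (1 / (d + 2)) and time by t ** (2 / (d + 2)).
    """
    tau = t ** (d / (d + 2))             # new time horizon
    radius = eps * t ** (-1 / (d + 2))   # sausage radius in the new coordinates
    volume_factor = t ** (d / (d + 2))   # d-dimensional volumes scale by this factor
    return tau, radius, volume_factor


tau, radius, factor = rescale(t=16.0, d=2, eps=1.0)
print(tau, radius, factor)

# In terms of the new time tau, the radius is eps * tau ** (-1 / d).
radius_from_tau = 1.0 * tau ** (-1 / 2)
```

The last line records why the mollifier introduced below is scaled by $t^{1/d}$: after the change of scale the sausage of fixed radius becomes a sausage of radius proportional to $\tau^{-1/d}$ around a path run for time $\tau$.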


A basic fact in what follows is the behavior of [...] where $G$ is a smooth bounded open set containing the origin. If $L(t, \omega)$ is the random measure representing the occupation time of the Brownian motion, i.e., if

then [...] We can therefore estimate [...] Since the set $\{\mu : \operatorname{supp} \mu$ [...] $\phi_t(x) = t\,\phi(x t^{1/d})$, then [...] where $L_t$ is the occupation distribution and $*$ denotes convolution. We will denote by $f_t$ the mollified local time. The problem we have reduces to two lemmas: LEMMA 14.3.


and LEMMA 14.4.

Of the two lemmas, the second is a standard approximation lemma involving truncation methods. We will not carry out the proof, but will only refer to [8]. We will sketch a proof of Lemma 14.3. If we consider the $L_1$ topology for densities on Tf, then $|\{x : f > 0\}|$ is a lower semicontinuous functional of the density $f$. In view of Theorem 2.3, it is sufficient to prove the large deviation principle for $f_t$ in the $L_1$ topology with a rate function $I(f)$.

LEMMA 14.5. Let $\psi$ be any mollifier, i.e., a smooth probability density. Then

for any $C$ closed in $L_1$. Proof. The map $f \to f * \psi$ is continuous from $M$ with weak topology to $L_1$ with norm topology. So the large deviation principle in the weak topology for $L_t$, which implies the large deviation principle in the weak topology for $f_t$, is converted into a large deviation principle for $f_t * \psi$ in the norm topology of $L_1$. Theorem 2.4 provides the precise proof. We now state without proof Lemma 14.6. We will then state and prove Lemma 14.7, which will imply Lemma 14.3 and our main result. Finally we will prove Lemma 14.6.

LEMMA 14.6.

where $k_\rho(\psi) \to \infty$ as $\psi \to \delta_0$ for each $\rho > 0$. Proof. The proof will be given after the proof of Lemma 14.7.

LEMMA 14.7. The large deviation principle holds for $f_t$ with the rate function $I(f)$ in the space $L_1$ with norm topology. Proof. Upper bound.

Therefore

Letting $\psi \to \delta_0$ and $\rho \to 0$, we get

provided $C$ is closed in $L_1$.


Lower bound. For an open set $G$ around $f$, [...] where $G_1$ is a smaller open set around $f$ such that the sphere around $G_1$ of radius $\rho$ is contained in $G$. The result is again obvious from Lemma 14.6. From Lemma 14.7 we obtain Lemma 14.3 by an application of Theorem 2.3. If we now combine it with the lower bound, i.e. Theorem 14.2, and take Lemma 14.4 for granted, then we have THEOREM 14.8.

where $k(\nu, a)$ is given by (14.3). We now turn to the proof of Lemma 14.6.

where [...] (We have assumed that [...] are symmetric.) The map $g \to \theta$ defined by $\theta = g *$ [...] $0$ we can find a finite number $N = N(t, \rho)$ of $\theta_1, \ldots, \theta_N$ such that the image of the unit ball is covered by spheres around $\theta_1, \ldots, \theta_N$ of radius $\rho/2$. We can assume that $\theta_1, \ldots, \theta_N$ are all bounded by 1 as well. Then

where


One can show that for any \ with |x|

where $\lambda_1(z\chi)$ is the largest eigenvalue of [...] If, for each $\rho > 0$, $N(t, \rho) \le \exp[D_\rho t]$ for some $D_\rho$, then

One verifies that $\sup \lambda_1(z\chi) \to 0$ as $\psi \to \delta_0$ for each $z > 0$. Therefore

and by letting $z \to \infty$ we will obtain our lemma. We now need only the estimation of $N(t, \rho)$ to complete the proof of Lemma 14.6.

where $\omega$ is the $L_1$ modulus of continuity of
