Physics Reports 352 (2001) 1–214
The pdf approach to turbulent polydispersed two-phase flows
Jean-Pierre Minier a,∗, Eric Peirano b
a Électricité de France, Div. R&D, MFTT, 6 Quai Watier, 78400 Chatou, France
b Energy Conversion Department, Chalmers University of Technology, S-41296 Göteborg, Sweden
Received December 2000; editor: I. Procaccia
Contents
1. Introduction
   1.1. Two-phase flow regimes
   1.2. An industrial example of dispersed two-phase flows
   1.3. Mathematical and physical approach
   1.4. Description of the contents
2. Mathematical background on stochastic processes
   2.1. Random variables
   2.2. Stochastic processes
   2.3. Markov processes
   2.4. Key Markov processes
   2.5. General Chapman–Kolmogorov equations
   2.6. Stochastic differential equations and diffusion processes
   2.7. Stochastic calculus
   2.8. Langevin and Fokker–Planck equations
   2.9. The probabilistic interpretation of PDEs
   2.10. A word on numerical schemes
3. Hierarchy of pdf descriptions
   3.1. Complete and reduced pdf equations
   3.2. BBGKY hierarchy
   3.3. Hierarchy between state vectors
4. Stochastic diffusion processes for modelling purposes
   4.1. The shift from an ODE to a SDE
   4.2. Modelling principles
   4.3. Example for typical stochastic models
5. The physics of turbulence
   5.1. The turbulence problem
   5.2. Characteristic scales
   5.3. Kolmogorov theory
   5.4. Difficulties and refinements
   5.5. Experimental and numerical results
   5.6. Simplified images of turbulence and Lagrangian models
   5.7. Closing remarks
6. One-point pdf models in single-phase turbulence
   6.1. Motivation and basic ideas
   6.2. Coarse-grained description and stochastic modelling
   6.3. Relations to classical approaches
   6.4. Probabilistic description of continuous fields
   6.5. Choice of the pdf description
   6.6. Present one-point models
   6.7. Mean field equations
   6.8. Physical and information contents
   6.9. Numerical examples
7. One-point particle pdf models in two-phase flows
   7.1. Fundamental equations and modelling approaches
   7.2. Interest of the pdf approach
   7.3. Choice of the pdf description
   7.4. Present models
   7.5. Properties of present class of models
   7.6. Numerical examples and typical simulations
8. Two-point fluid–particle pdf models in dispersed two-phase flows
   8.1. Motivations and basic ideas
   8.2. Probabilistic description of dispersed two-phase flows
   8.3. Choice of the pdf description
   8.4. Present 'two-point' models
   8.5. Mean field equations
   8.6. Concluding remarks
9. Summary and propositions for new developments
   9.1. Difficulties with conventional approaches and interest of a pdf description
   9.2. Assessment of current modelling state
   9.3. Open issues and suggestions
References

∗ Corresponding author. Tel.: +33-1-30-87-71-40; fax: +33-1-30-87-79-16.
E-mail addresses: [email protected] (J.-P. Minier), [email protected] (E. Peirano).
© 2001 Elsevier Science B.V. All rights reserved. 0370-1573/01/$ - see front matter. PII: S0370-1573(01)00011-4
Abstract
The purpose of this paper is to develop a probabilistic approach to turbulent polydispersed two-phase flows. The two-phase flows considered are composed of a continuous phase, which is a turbulent fluid, and a dispersed phase, which represents an ensemble of discrete particles (solid particles, droplets or bubbles). Gathering the difficulties of turbulent flows and of particle motion, the challenge is to work out a general modelling approach that meets three requirements: to treat accurately the physically relevant phenomena, to provide enough information to address issues of complex physics (combustion, polydispersed particle flows, ...) and to remain tractable for general non-homogeneous flows. The present probabilistic approach models the statistical dynamics of the system and consists in simulating the joint probability density function (pdf) of a number of fluid and discrete particle properties. A new point is that both the fluid and the particles are included in the pdf description. The derivation of the joint pdf model for the fluid and for the discrete particles is worked out in several steps. The mathematical properties of stochastic processes are first recalled. The various hierarchies of pdf descriptions are detailed and the physical principles that are used in the construction of the models are explained. The Lagrangian one-particle probabilistic description is developed first for the fluid alone, then for the discrete particles and finally for the joint fluid and particle turbulent systems. In the case of the probabilistic description for the fluid alone or for the discrete particles alone, numerical computations are presented and discussed to illustrate how the method works in practice and the kind of information that can be extracted from it. Comments on the current modelling state and propositions for future investigations which try to link the present work with other ideas in physics are made at the end of the paper. © 2001 Elsevier Science B.V. All rights reserved.
PACS: 47.27.Eq; 47.55.Kf; 02.40.+j; 02.50.Ey
Keywords: Turbulence; Two-phase flows; Probability density function; Stochastic process
1. Introduction

In March 1999, Reviews of Modern Physics issued a special volume, for the commemoration of its 100th birthday, which discussed historical developments and gave a general outlook on a wide range of physical questions. Numerous articles, written by world experts and often major contributors to their fields, provided an overview of past achievements and of the current state of each domain. Apart from the fields which have traditionally formed the main core of theoretical physics (quantum theory, particle physics, relativity, astrophysics, etc.), the selection of other subjects (such as fluid turbulence, granular matter, soft matter, biological physics) which also found their place in this prestigious assembly is an indication of the interest in the issues raised by these subjects. This is also an indication that improved physical understanding is needed to bring these subjects to a more mature state. In the section devoted to statistical physics, two reviews, written by Sreenivasan and de Gennes, respectively, discussed separately the present understanding of fluid turbulence [1] and of granular matter [2] (say, the behaviour of non-Brownian small solid particles). Broadly speaking, both are subjects where the basic equations (for example, the Navier–Stokes equations) or the elementary behaviour (for instance, how two grains interact) may be believed to be known, but where the issue is to understand the complicated and collective behaviour of a large number of interacting degrees of freedom. Both represent problems at a human-size level. They are actually everyday-life concerns and could, at first, have been thought to be mere engineering problems. They are indeed engineering problems, but even if only approximate results or rough estimates are sought, this often requires a clear and precise physical understanding of the important phenomena at play.
There is another interesting domain which is simply obtained when the two difficulties are mixed: the case of turbulent dispersed two-phase flows. An easy way to picture this is to imagine dealing with granular matter but embedded in a turbulent flow. These flows are of crucial importance in a large variety of industrial problems. Yet, they have not received the same attention as turbulence or as granular matter. As a consequence, physical understanding remains limited and appears to be scarce compared to each of the separate sub-cases, fluid turbulence in the absence of particles, and granular matter in the absence of any underlying or interstitial fluid.

The purpose of the present work is to discuss some of the physical issues involved in two-phase flows and to put forward a probabilistic formalism that can bridge the gap between physical understanding of basic phenomena and practical simulations. That middle-road approach is that of a modeller, where one invents a model, which has simplified rules compared to the real phenomena, and which is used to simulate the overall and collective behaviour of a complex system. The question is therefore whether the model contains the right 'physics' (thus the need to understand clearly the important phenomena) and then how to reach an acceptable compromise between the simplicity of the model and its physical realism (thus the need for an appropriate formalism).

Before going into the details of the approach followed in this work, a clear definition of two-phase flows and particularly of dispersed two-phase flows must be given. Secondly, a better idea of their importance in natural and industrial situations as well as an outline of the modelling issues involved must be provided. Introducing these notions is perhaps best achieved through typical examples. Dispersed two-phase flows occur in many natural phenomena. They are met for example in fogs, in water sprays, in smokes, when desert sand is carried away or
Fig. 1. Different two-phase flow regimes in a heat exchanger.
sediments, or (to provide a more vivid image) when an erupting volcano billows around smoke and various particles. They are increasingly important in environmental problems when one or several species (not necessarily pollutants) are dispersed in a turbulent atmosphere. Nevertheless, in the following two sections and to complement these first examples, we introduce dispersed two-phase flows and discuss the modelling questions through industrial examples.

1.1. Two-phase flow regimes

As their name suggests, two-phase flows are encountered when two non-miscible phases coexist. Depending on the form of the interface between the two media, different regimes can be found. This is illustrated in Fig. 1, which shows a range of regimes for the case of a boiling liquid (for example water) in a classical heat exchanger. At the bottom of the tube, the liquid has not yet started to boil and we have a single-phase turbulent flow. When nucleation starts at the walls, bubbles can be found as separate inclusions within the liquid (bubbly flows). Then, as more vapour is created, we go through the so-called slug and plug regimes, where vapour occupies a larger volumetric fraction. Then, as the liquid continues to boil, we find the annular regime with a thin liquid layer at the walls and a central vapour flow with small droplets carried by the vapour. Other regimes can also be found when horizontal channels are considered, but their detailed description is outside the scope of the present article. The wide variety of regimes, merely outlined above, is typical of immiscible liquid–gas or liquid–liquid flows since the interface can be deformed. Two of these regimes (the bubbly and annular regimes) are characterized by the presence of one phase, either liquid or vapour, as separate inclusions embedded in the other phase.
These are two examples of what is defined as dispersed turbulent two-phase flows, where one phase (called the continuous phase) is a continuum and the other phase (called the dispersed phase) appears as separate inclusions dispersed within the continuous one, assumed here to be a turbulent fluid. When the dispersed phase is characterized by a distribution in size, one speaks of polydispersed turbulent two-phase flows. The dispersed regime (either mono- or polydispersed) is of first importance in most cases. It is always found when the dispersed phase is made up of solid particles (solid particles in a gas or a liquid turbulent flow). It is often found for a liquid dispersed as separate droplets in a gas flow (sprays for example) or for two immiscible liquids where one liquid is dispersed in the other liquid. Indeed, the dispersion of one phase within another one increases considerably the surface of the separating interface
Fig. 2. A circulating fluidized bed combustor.
and thus allows better mass and energy transfer between the two phases. These higher transfer rates explain why the dispersed regime is preferable. In the present work, we limit ourselves exclusively to the dispersed regime and we will talk of a fluid (the continuous phase) and of discrete particles (which can represent either solid particles, bubbles or droplets). In most of the problems encountered, the dispersed particles have a distribution in size. Since the polydispersed case obviously contains the monodispersed situation as a simple sub-case, we will consider the realistic problem of polydispersed two-phase flows.

1.2. An industrial example of dispersed two-phase flows

The limitation to the regime of dispersed two-phase flows is of course a simplification with respect to general two-phase flows. Yet, the range of problems remains large, and each of these problems is difficult. What are the main problems and what are the key issues? To provide some answers to that question, it is perhaps better to describe a relevant industrial example, circulating fluidized bed (CFB) boilers. This is an industrial process for thermal energy generation. A sketch of a typical unit is displayed in Fig. 2. In a confined domain (the combustion chamber), solids (inert sand and solid fuel or coal particles with a size distribution ranging from 100 µm to 1 mm and an average density of order of magnitude 1000 kg m⁻³) are transported vertically by a gas (injected at the bottom) through the combustion chamber. The solids are captured at the exit by a separator (usually a cyclone), and reintroduced near the bottom of the combustion chamber, whereas the gas leaves the cyclone through an outlet duct. The solid particles are therefore
recycled to the combustion chamber and some particles may thus perform several loops in the process. This solid circulation is the key factor of the whole process (thus its name). It ensures an approximately homogeneous temperature within the combustion chamber, which can thus be chosen as an optimum between the efficiency of the process and the formation of noxious pollutants (the resulting low level of emission is one of the strong points of CFB units). It also represents a central aspect that must be mastered if one is to expect a satisfactory performance of the overall process. The flow of the gas–solid mixture is non-stationary and non-homogeneous with a vertical distribution of the particle concentration inside the combustion chamber. This vertical distribution is characterized by three interacting zones (as a function of increasing height): (1) a bottom bed which has the characteristics of a bubbling bed (gas flows through the bed in the form of large structures) and where the concentration of discrete particles is so high that particle–particle interaction is a dominant mechanism (particles collide and possibly slide against each other), (2) a splash zone with high clustering and back-mixing activity and (3) a transport zone which exhibits a core/wall-layer structure (particles are entrained upwards in the core and fall down along the walls in the form of a thin boundary layer). In these regions, large-scale spatial inhomogeneities in the discrete particle concentration field can be observed and, for the gas, large-scale instabilities (pseudo-like turbulence) are present. In regions (2) and (3), particle loading (the local instantaneous ratio between the weight of particles and the weight of gas) is high enough so that turbulence is modulated and possibly modified by the presence of the particles. In addition, particle segregation can be observed, that is to say the mean particle diameter decreases with height and large particles tend to migrate to the boundary layers.
At the exit of the chamber, the particle-laden flow enters the cyclone(s). Cyclones are used here as separators (separating the solid coal particles from the gas in order to recycle them) and are key elements of the whole process. Indeed, should they fail to ensure a proper separation and consequently a proper particle recirculation, the whole process would not be able to run correctly. Cyclone performances are quantified by their efficiency curve, which is the fraction of solid particles being collected (and thus recycled) as a function of the particle diameters. It is important to be aware that cyclone efficiencies are due to the complicated swirling motions and gas flow patterns within the cyclone and not to external forces such as gravity. In other words, both within the combustion chamber and within the cyclone separators, satisfactory performances of a CFB process are ensured by the local hydrodynamics of the two-phase flows rather than by external monitoring. In particular, a key parameter for a good functioning of a CFB boiler is the particle size distribution, to ensure, for example, suitable particle spatial distribution and residence time in the combustion chamber. It is mainly controlled, for small diameters, by the collection efficiency of the cyclone, and for large diameters, by the characteristics of the fuel particles.

This is, after all, an engineering problem. Numerous industrial or engineering problems may also involve difficult questions. In some situations, complex questions or issues may become less relevant or secondary if engineers apply a high-enough 'margin coefficient'. This easy way around theoretical issues may lead researchers to believe that only clever or astute fixing or tinkering is needed. However, from the above outline of the CFB process, it is clear that this is not the case here, since the overall performance is a result of the local hydrodynamics throughout the process.
In summary, the physics of turbulent gas–solid flows (in the case of circulating fluidised bed boilers) contains a large spectrum of problems. Three main themes emerge, namely:

(A) polydispersed two-phase flows (there is a range of particle diameters rather than a single value),
(B) combustion, either within reactive gas flows or of the solid particles (as in the case of fuel particles within the CFB boiler),
(C) turbulence, which is the central and common issue.

Of course, these three main categories overlap and a wide range of sub-categories or classes of problems can be enumerated. For example, this concerns:

• reactive flows (heterogeneous and homogeneous combustion),
• particle dispersion (and diffusion of combustion gases),
• turbulence modulation, possibly modification of its nature, by the presence and motion of solid particles embedded in the turbulent flow,
• particle–particle interactions (short- and long-duration collisions),
• swirling two-phase flows (selectivity curve of the cyclone) and
• particle segregation, etc.
Other industrial and practical needs involving two-phase flows, such as pollutant dispersion in the atmosphere, combustion of fuel droplets within car engines, etc. would reveal the same picture and the same categories with an emphasis on one of these categories depending on the application. From the previous analysis, it appears that one has to build the link between fluid mechanics, classical mechanics, tribology, combustion and chemistry. The question to be answered is: how can we achieve this goal with a tractable formalism which has to be, in addition, suitable for numerical applications?

1.3. Mathematical and physical approach

1.3.1. The present objectives

For the two-phase flows we consider in the present work, the central subject is turbulence. Turbulence of continuous-phase flows is further compounded by the effects of the discrete particles. Direct numerical simulations are possible in theory but are quite impossible in practice, at least for the typical examples described above. Most turbulent dispersed two-phase flows involve far too many degrees of freedom to be directly simulated. The issue is therefore to reduce the number of degrees of freedom to a tractable number and to come up with a contracted description. We are thus faced with a problem of non-equilibrium statistical physics where one tries to obtain a statistical model for a reduced number of degrees of freedom. Given the inherent complexity of the problems we have to deal with, the first choice is to limit ourselves to mean or average quantities. This is classical in most problems of statistical mechanics. In other words, we treat the solutions of the fundamental equations as random variables and we are interested in some statistics. Compared to the high complexity and to the beauty of the initial problem, these may look like limited and perhaps unchallenging objectives. However, it must be remembered that we are not dealing with only one problem, either single-phase turbulence, combustion or
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
two-phase flows, but typically with all three of them. Consequently, the aim is to develop a mathematical and physical approach that meets the following requirements:

1. the important physical phenomena, such as convection or the mean pressure gradient, are treated without approximation,
2. enough information is available to handle correctly issues of complex physics (combustion, polydispersed particles),
3. the resulting numerical model is tractable for non-homogeneous flows,
4. the model can be coupled to other approaches, either more fundamental or more applied descriptions.

It is clear from the third constraint that a compromise between detailed descriptions and practical applicability must be reached. The approach will necessarily be less fundamental than a number of theoretical approaches in turbulence, for example [3–5]. This does not mean that the present choice contradicts more theoretically oriented works, but rather that the underlying objectives are somewhat different. Actually, a satisfactory model respecting the first three requirements should easily benefit from fundamental progress made in one of the three main themes (A), (B) or (C) listed above, which are of concern here. This is one of the reasons for the fourth item, which also suggests that the approach can be used in relation to coarser descriptions in a multi-scale or multi-level simulation.

The main challenge comes from the second constraint. It implies that we are not looking for a model or an approach which performs very well for only one theme but for an approach that can handle complex physics. For example, we are not looking for an approach which is perhaps the best candidate at the moment to simulate, say, isothermal incompressible single-phase turbulent flows but which requires new formalisms or new models for combustion.
We are looking for an approach that can do a fine job for single-phase turbulence and still be easily extended to handle combustion and dispersed two-phase flow issues within the same framework. This ‘engineering’ constraint has far-reaching consequences in terms of modelling choices and justifies advanced methods.

1.3.2. Choice of the modelling approach

Since we are mainly interested in some local mean statistics on a number of fluid and discrete particle properties and since we have emphasized the practical side of the problem, it would seem that the path of least dissipation (for the modeller) consists in trying to derive directly a set of closed partial differential equations (PDEs) for those mean variables. We refer to this approach as the moment approach or the conventional approach. It is indeed in line with the classical or conventional approaches in continuum mechanics where one handles fields which are solutions of some PDEs. For example, if we are interested in the mean fluid velocity, we start from the Navier–Stokes equations (we consider here an incompressible flow for the sake of simplicity)

$$ \frac{\partial U_i}{\partial t} + U_j \frac{\partial U_i}{\partial x_j} = -\frac{1}{\rho} \frac{\partial P}{\partial x_i} + \nu \frac{\partial^2 U_i}{\partial x_j^2} . \tag{1} $$

This equation contains all the information for the fluid velocity. Then, following the classical approach we apply to this equation an averaging operator (the nature of this averaging operator, be it the Reynolds average or a spatial filter, does not change the present point so we use
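here the Reynolds decomposition while comments on spatial filtering will be developed in Section 9), written $\langle \cdot \rangle$. We obtain the unclosed PDE directly for the variable of interest $\langle U \rangle$

$$ \frac{\partial \langle U_i \rangle}{\partial t} + \langle U_j \rangle \frac{\partial \langle U_i \rangle}{\partial x_j} = -\frac{1}{\rho} \frac{\partial \langle P \rangle}{\partial x_i} + \nu \Delta \langle U_i \rangle - \frac{\partial \langle u_i u_j \rangle}{\partial x_j} . \tag{2} $$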
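The extra term in Eq. (2) comes from averaging the nonlinear convective term: with $U_i = \langle U_i \rangle + u_i$, one has $\langle U_i U_j \rangle = \langle U_i \rangle \langle U_j \rangle + \langle u_i u_j \rangle$. The following minimal Python sketch checks this identity on synthetic samples. The sample values, the correlation between the two components and the function name `reynolds_decomposition` are illustrative assumptions, not data or notation from the paper.

```python
import random


def reynolds_decomposition(samples_u, samples_v):
    """Return (<U><V>, <u v>, <U V>) for paired samples, with u = U - <U>."""
    n = len(samples_u)
    mean_u = sum(samples_u) / n
    mean_v = sum(samples_v) / n
    # Covariance of the fluctuations: one component of the Reynolds stress.
    cov_uv = sum((a - mean_u) * (b - mean_v)
                 for a, b in zip(samples_u, samples_v)) / n
    mean_uv = sum(a * b for a, b in zip(samples_u, samples_v)) / n
    return mean_u * mean_v, cov_uv, mean_uv


rng = random.Random(0)
# Hypothetical, correlated velocity samples (arbitrary units).
U = [1.0 + rng.gauss(0.0, 0.5) for _ in range(10000)]
V = [0.2 + 0.6 * (a - 1.0) + rng.gauss(0.0, 0.1) for a in U]

prod_of_means, reynolds_stress, mean_of_prod = reynolds_decomposition(U, V)
print(prod_of_means, reynolds_stress, mean_of_prod)
```

The mean of the product differs from the product of the means by exactly the fluctuation covariance, which is the quantity that the mean velocity alone cannot provide.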
This open equation has to be closed by resorting to a constitutive relation giving the unknown quantity, here $\langle u_i u_j \rangle$ (the Reynolds stress tensor), as a function of known variables, here the mean velocity. In the vocabulary of statistical physics, the conventional approach is a macroscopic approach where one tries to obtain the macroscopic laws through closure relations which are written directly at the macroscopic level. If an acceptable macroscopic constitutive relation can be found, then this route is certainly the most cost-effective one since we explicitly calculate only what we want and nothing more. However, the success of this macroscopic approach hinges on the possibility of expressing unclosed terms through macroscopic laws. If such macroscopic relations cannot be explicitly written, or involve assumptions too drastic to yield acceptable results, then the conventional approach fails.

Two typical examples of such problems are provided by the reactive source terms which enter the equations of single-phase turbulent combustion with finite-rate chemistry and by the existence of a range of particle diameters (a polydispersed two-phase flow). The former issue will be explained in detail in Section 6 and the latter in Section 7. In each case, one has to express the average value of a complicated function of some instantaneous variables (the fluid instantaneous species mass fractions or the particle instantaneous diameters), whereas the conventional approach can only provide information on the first moments (usually the first two moments). We are faced with the problem of having to express a quantity such as $\langle S(\phi) \rangle$, where $S$ is a complicated function of some scalar $\phi$, in terms of the available information, usually limited to $\langle \phi \rangle$ or $\langle \phi^2 \rangle$. This results in an intractable problem and more information is needed to address these typical issues of complex physics.
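The difficulty can be made concrete with a small numerical sketch. Two discrete pdfs of a scalar $\phi$ are built with identical mean and variance, yet they give different values of $\langle S(\phi) \rangle$ for a nonlinear source term. The Arrhenius-like form $S(\phi) = \exp(-1/\phi)$, the two pdfs and all names below are hypothetical choices made only for illustration.

```python
import math


def moments_and_source(values, probs, source):
    """Return (<phi>, variance of phi, <S(phi)>) for a discrete pdf."""
    mean = sum(p * v for v, p in zip(values, probs))
    var = sum(p * (v - mean) ** 2 for v, p in zip(values, probs))
    mean_source = sum(p * source(v) for v, p in zip(values, probs))
    return mean, var, mean_source


def arrhenius_like(phi):
    """Hypothetical nonlinear source term S(phi)."""
    return math.exp(-1.0 / phi)


# Two pdfs with the same first two moments (mean 0.5, variance 0.04).
pdf_a = ([0.3, 0.7], [0.5, 0.5])
pdf_b = ([0.1, 0.5, 0.9], [0.125, 0.75, 0.125])

mean_a, var_a, s_a = moments_and_source(pdf_a[0], pdf_a[1], arrhenius_like)
mean_b, var_b, s_b = moments_and_source(pdf_b[0], pdf_b[1], arrhenius_like)

print(mean_a, var_a, s_a)
print(mean_b, var_b, s_b)
print(arrhenius_like(mean_a))  # S(<phi>) differs from both <S(phi)> values
```

Since the first two moments coincide while the mean source terms do not, no closure written only in terms of $\langle \phi \rangle$ and $\langle \phi^2 \rangle$ can be exact for both pdfs.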
In other words, even if we are interested mainly in estimates of macroscopic quantities, an advanced method providing more detailed information is required. That problem will be emphasized and explained in more detail (and for general averaging operators) in the course of the paper.

From the above outline, it can be concluded that the macroscopic path is not well suited to our present objectives. On the other hand, we have also seen that direct simulation is not tractable. In the language of statistical physics, this direct simulation is a microscopic description since all degrees of freedom are explicitly tracked. A reasonable solution is therefore to choose what can be referred to as a mesoscopic approach, or as a middle-road approach between the microscopic and macroscopic descriptions.

The mesoscopic approach retained in this work is a probabilistic approach. Its aim is to model and to simulate the probability density functions of the variables which are of primary interest. For this reason, the present approach can be defined as a pdf approach to turbulent dispersed two-phase flows. The different pdfs that will be manipulated are modelled pdfs, that is to say the basis of the approach will be to propose probabilistic models to describe the joint pdf of some variables. It will be seen that probabilistic models can be developed either in terms of the pdf or in terms of the trajectories of the stochastic processes involved. In the present work, we will mainly adopt this second point of view and we will be talking about stochastic particles. The stochastic models will be developed directly for the variables attached to these stochastic particles, providing at the same time
a Monte Carlo evaluation of the pdf. Thus, the approach can also be referred to as a particle stochastic approach.

At the moment, multi-point approaches have not been extended to non-homogeneous wall-bounded flows whereas one-point pdf models have been put forward. From the third requirement mentioned above, it follows that one-point pdf models will often be considered. Yet, this is not a strict limitation of the present work. Two-point or multi-point pdfs will be discussed and considered at different stages. This reflects the fourth constraint since multi-point pdfs represent finer probabilistic descriptions. Our purpose is a general pdf approach and the relation between these different descriptions is contained in the presentation.

1.3.3. Choice of a rigorous presentation

Pdf models are already well established in the combustion community. A number of reviews are available which discuss the necessary formalism and the present state of modelling [6–9]. In particular, Pope’s work [6] has clarified the one-point pdf formalism and has given the relations between the Lagrangian and the Eulerian pdfs. In these works, the presentation is strictly tailored for one-point pdf descriptions and the derivation of the stochastic models is often based on the previous knowledge of a macroscopic closure relation [10]. This is probably a reasonable choice (and perhaps the best compromise between model complexity and tractability) since closures of the reactive source terms, which are local source terms, require only one-point pdfs. In most presentations, the stochastic models are not derived from statistical arguments but from their correspondence with given mean moment equations, although recent proposals have tried to use only arguments from statistical physics [11]. This approach can indeed be regarded as a satisfactory answer for two of the main themes, turbulence and combustion, but application to dispersed turbulent two-phase flows requires further work.
Indeed, different physical effects are present when discrete particles are considered. Furthermore, the mean equations are not known in advance and should precisely be derived from a probabilistic approach.

On the other hand, a particle approach and stochastic models have been used for some time to simulate dispersed two-phase flows, see among others the review of Stock [12] and the references therein. A wide variety of stochastic models have been devised, most of the time from a heuristic point of view. In two-phase flows, the notion of a stochastic particle is, at first sight, less surprising than in single-phase turbulence, and it is tempting to skip the careful construction of rigorous foundations since the stochastic concepts may be believed to be ‘evident’. However, this direct approach to stochastic modelling can create severe problems that will be detailed in Section 7. Given that no macroscopic relations are known in advance (and can thus be used as safeguards), the development of a rigorous approach to the pdf description of dispersed turbulent two-phase flows is needed. There is also a new element compared to single-phase reactive flows, where the choice of the variables which are explicitly modelled is rather obvious. In two-phase flows, the selection of the basic variables is less obvious and is subject to debate. Various choices can then be made for a pdf description, and the technical aspects of the hierarchies between these different pdfs must be well understood.

As a consequence of the above analysis, the aim of this paper is to build a rigorous probabilistic framework that extends current models developed for single-phase reactive flows to include dispersed turbulent two-phase flows. Such a construction requires a mathematically oriented
presentation and the definition of a clear methodology. As such, it will be somewhat different from the above-cited references. The presentation will be based on ideas from statistical physics and on the hierarchies of pdf equations with a particle stochastic point of view. The stochastic models will be developed as much as possible from the arguments of statistical mechanics and statistical physics. The purpose of the present work is basically to propose a probabilistic description of a mixed system, composed of a continuous field and of discrete particles. The central notion that is adopted is the Lagrangian point of view which will appear as the ‘propagator of information’ for our complex system.

1.4. Description of the contents

The paper has been organized to answer the following general questions:

• What are the mathematical tools which are required? What are their main characteristics?
• How are they used for physical modelling in general?
• How are they precisely used in our case? What do we obtain from them?
• Are present models the end of the story or can they be improved or coupled to other methods?
These questions correspond to three categories: the mathematics of stochastic processes, the general physical meaning of stochastic modelling, and the development of a specific framework. As a consequence, the paper has a three-fold objective.

The first objective is to provide the reader with a comprehensive and understandable picture of the theoretical tools used in the pdf approach (Section 2). Several notions must be understood: the mathematical properties of stochastic processes, the notion of the trajectories of a stochastic process as well as the correspondence between the trajectory point of view and pdf equations.

Once the notion of a Markovian stochastic process (and more precisely the subclass ‘diffusion process’) and its associated pdf is clear, the second objective is the description of the use of diffusion stochastic processes for physical modelling. This is carried out by first recalling the concepts of statistical physics, i.e. the N-body problem. A general framework is given to work out the relations between the different levels of contraction (Section 3). Then, the modelling principles that allow stochastic processes to be used are presented (Section 4). From this results the definition of a pdf description, which is made up of the choice of the variables which constitute the state vector and of the choice of a stochastic model for this state vector.

The third objective concerns the development of a consistent and self-contained framework for the probabilistic description of two-phase flows. This derivation is the core subject of the present work and is performed in four steps. A gradual construction of the complete description has several advantages. It avoids dealing immediately with a complicated formalism which may hide or blur some physical points. By gradually building the complete description, we can discuss at length the physical meaning of the different stochastic terms, for the continuous phase and for the dispersed phase.
Since the discrete particles are embedded in a turbulent fluid, their motion and the associated statistical properties are governed by the underlying turbulent flow. It is then important to detail the physical characteristics of turbulent flows. This is the first step of our modelling approach where the reader is given a comprehensive, but still general, overview of the physics of turbulence,
Section 5. The second modelling step is the probabilistic description of single-phase flows, that is the probabilistic description of a continuous field (Section 6). Emphasis is put on the level of information which is needed for successful closures (Kolmogorov theory), on the Lagrangian point of view (which is the natural choice in fluid mechanics) and on the existence of a propagator. Correspondences with the Eulerian description and classical mean field equations are given. The third modelling step addresses the question of the probabilistic modelling of discrete particles. The usual issues of particle-tracking models are discussed at length, Section 7. In both the second and the third modelling steps, numerical computations are presented and discussed to illustrate how the method performs in different flows and the kind of information that can be extracted from it. The fourth modelling step, i.e. the complete fluid–particle pdf approach, is achieved in Section 8. It is shown, once again, that the Lagrangian point of view is the natural choice and that there exists a propagator. Correspondence with Eulerian tools is put forward and mean field equations are derived using rigorous probabilistic arguments.

By the end of Section 8, the reader has a clear picture of the pdf approach to turbulent dispersed two-phase flows. Then, the concepts of the probabilistic approach can be summarized and prospects for new developments can be put forward, Section 9.

2. Mathematical background on stochastic processes

The purpose of this chapter is to provide clear definitions of a stochastic process and of stochastic differential equations. These equations appear rather naturally in physical or engineering sciences where one would like to introduce ‘randomness’ or ‘noise’ into the differential equations that describe the evolution of a physical system. For example, one would like to give a precise meaning to the equation

$$ \frac{dX_t}{dt} = A(t, X_t) + B(t, X_t)\, \xi_t , \tag{3} $$
where $\xi_t$ is the so-called ‘white-noise’ process that represents some ‘rapid fluctuations’. It turns out that the definition and proper treatment of such an equation cannot be made directly with classical methods from ordinary differential equations (ODEs). Special mathematical notions have to be introduced to explain stochastic calculus, which has subtleties that can be surprising at first sight.

The following results and notions will be presented, as much as possible, in a logical way while trying to avoid being too mathematically involved. Most of these results will be stated without proofs and not all definitions are given. However, complements and detailed presentations of this material can be found in mathematical textbooks [13,14] or in physically oriented books [15]. An excellent presentation combining mathematical correctness and an application-oriented discussion can be found in Öttinger [16]. Most of the material needed to handle probability concepts in a simple way has been developed in Pope’s seminal work [6] for single-phase pdf methods. It did not appear useful to repeat this presentation here and the objective of this section is to go into more mathematically advanced details. None of the following subsections can claim to give a comprehensive description of its subject, but the themes and the order of the presentation reflect the important issues.
2.1. Random variables

In applied physics, random variables are often introduced directly through their probability density functions (pdf), which can be either discrete or continuous. The random variable, say $X$, can take a range of possible values $x \in \mathcal{A}$, for example $x \in \mathbb{R}$ or $\mathbb{R}^d$. The probability that $X$ takes a value between $x$ and $x + dx$ is

$$ P[x \le X \le x + dx] = p(x)\, dx . \tag{4} $$
The actual and more rigorous construction of a random variable is based on an underlying probability space $(\Omega, \mathcal{F}, P)$ and on measure theory. One defines a reference space $\Omega$ equipped with a $\sigma$-algebra $\mathcal{F}$ (an ensemble of subsets such that complements and countable unions of them still give a subset that belongs to the ensemble) on which a measure $P$ (with $P(\Omega) = 1$) is defined. A random variable $X$ is mathematically defined as a measurable function from this reference space to the one where $X$ takes its values, here $\mathcal{A}$, which is also equipped with a $\sigma$-algebra $\mathcal{G}$,

$$ (\Omega, \mathcal{F}, P) \to (\mathcal{A}, \mathcal{G}) , \qquad \omega \mapsto X(\omega) , \tag{5} $$

and the law of probability of $X$ is simply the image of the reference measure $P$, that is

$$ P_X(A) = P(X^{-1}(A)) , \qquad \forall A \in \mathcal{G} . $$
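The construction above can be illustrated on a finite sample space, where measurability is automatic and the image measure reduces to a sum over a pre-image. The sketch below is only an illustration: the die, the map `X` and the labels 'low' and 'high' are hypothetical choices, not taken from the paper.

```python
from fractions import Fraction

# Reference space Omega: the six faces of a fair die, with uniform measure P.
omega = [1, 2, 3, 4, 5, 6]
P = {w: Fraction(1, 6) for w in omega}


def X(w):
    """Random variable: a map from Omega to the value space {'low', 'high'}."""
    return "low" if w <= 2 else "high"


def law_of_X(subset):
    """Image measure P_X(subset) = P(X^{-1}(subset))."""
    preimage = [w for w in omega if X(w) in subset]
    return sum(P[w] for w in preimage)


print(law_of_X({"low"}), law_of_X({"high"}), law_of_X({"low", "high"}))
```

The law of $X$ only keeps the information carried by the pre-images of $X$, which is coarser than the full description of $\Omega$.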
2.1.1. Conditional expectations

The first central notion is the expectation of a random variable, which is the integral of the possible values against their measures

$$ \langle X \rangle = \int_\Omega X(\omega)\, dP(\omega) . \tag{6} $$

The expected or mean value is written here as $\langle X \rangle$, following the usual notation in applied physics, but is written as $E(X)$ in the mathematical literature. The level of abstraction used in the definition of random variables is not just for the sake of doing mathematics: it helps to make some notions precise, first for random variables and then for stochastic processes. One such notion is the conditional expectation, which is very important for the physical idea of coarse-grained descriptions but can only be fully understood with reference to $\sigma$-algebras. It is worth giving the formal definition:

Definition 1. If $X$ is a random variable on the probability space $(\Omega, \mathcal{F}, P)$ and if $\mathcal{F}'$ is a sub-$\sigma$-algebra of $\mathcal{F}$, that is $\mathcal{F}' \subset \mathcal{F}$, the conditional expectation (or conditional average) of $X$ given $\mathcal{F}'$, written $\langle X | \mathcal{F}' \rangle$, is a random variable defined on the sub-$\sigma$-algebra $\mathcal{F}'$ and such that its expectation or mean value on any subset $A$ of the sub-$\sigma$-algebra $\mathcal{F}'$ is equal to the mean value of $X$ on the same subset, or

$$ \langle X \chi_A \rangle = \langle \langle X | \mathcal{F}' \rangle\, \chi_A \rangle \qquad \forall A \in \mathcal{F}' , \tag{7} $$

where $\chi_A$ is the indicator of the subset $A$.
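Definition 1 can be checked by hand on a finite sample space, where a sub-$\sigma$-algebra is generated by a partition and the conditional expectation is the average of $X$ over each cell. The following sketch assumes a uniform measure on six points, the variable $X(\omega) = \omega^2$ and a partition into three cells; these choices and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

omega = list(range(6))
P = {w: Fraction(1, 6) for w in omega}
X = {w: Fraction(w * w) for w in omega}

# The coarse sub-sigma-algebra is generated by this partition of Omega.
partition = [(0, 1), (2, 3), (4, 5)]


def conditional_expectation(X, P, partition):
    """Replace X on each cell by its P-weighted average over that cell."""
    cond = {}
    for cell in partition:
        mass = sum(P[w] for w in cell)
        cell_mean = sum(P[w] * X[w] for w in cell) / mass
        for w in cell:
            cond[w] = cell_mean
    return cond


def mean_on(Z, P, subset):
    """<Z chi_A>: mean of Z restricted to the subset A."""
    return sum(P[w] * Z[w] for w in subset)


cond_X = conditional_expectation(X, P, partition)

# Every event of the sub-sigma-algebra is a union of cells (including none).
events = []
for r in range(len(partition) + 1):
    for cells in combinations(partition, r):
        events.append([w for cell in cells for w in cell])

property_7_holds = all(mean_on(X, P, A) == mean_on(cond_X, P, A) for A in events)
print(cond_X, property_7_holds)
```

The conditional average is constant on each cell, so it resolves less information than $X$, while Eq. (7) holds on every event of the coarse $\sigma$-algebra.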
This formal statement can be translated into words. The first important point is that the conditional average is also a random variable but defined on a coarser $\sigma$-algebra. A $\sigma$-algebra can be regarded as describing mathematically the notion of the ‘level of information resolved’ or the ‘information content’ of the random variable $X$. If more information is provided on a physical object which is represented by the random variable $X$, this corresponds in the mathematical definition to a function defined on a finer ensemble $\mathcal{F}$. Conversely, if less information is known or resolved by $X$, this translates into the fact that the function $X$ is defined on a coarser ensemble $\mathcal{F}'$. Therefore, for a physical object represented by a random variable $X$, the idea of a coarse-grained description, when not enough details are resolved or when one voluntarily disregards some pieces of information, can be well represented by a conditional average.

The second important point in the definition given above is that the conditional average is the mean of the actual random variable $X$ ‘averaged’ over the unresolved information or, in more mathematical terms, over the finer decomposition of any subset $A$ of $\mathcal{F}'$ into unions of subsets of $\mathcal{F}$. This appears as the only way to properly define the notion of the conditional average of one random variable. However, when one in fact handles two joint random variables $X$ and $Y$ and simply considers the sub-$\sigma$-algebras obtained by fixing the value of one of the two random variables, say for example the sub-$\sigma$-algebra obtained with $Y = y$, we retrieve the usual and more intuitive notion of a conditional random variable given the value of another one, whose pdf is then

$$ p(X = x \,|\, Y = y) = \frac{p(x, y)}{p_Y(y)} . \tag{8} $$

2.1.2. Weak and strong convergence of random variables

Random variables are not often known directly and are generally obtained as limits of approximate and simpler random variables.
This is the case when a process is simulated by numerical integration with finite time steps $\Delta t$, see Section 2.10. This also happens from a physical point of view since models are used to get practical but then approximate answers. One must be able to know how properties or characteristics of the various approximations can be carried over to the actual solution. Several modes of convergence can be defined for random variables, which must be well understood, in particular the distinction between strong and weak convergence. For these reasons, we explicitly give the following definitions.

Definition 2. The sequence $(X_n)$ converges towards the random variable $X$, defined on the same probability space, almost surely if and only if

$$ P(\{\omega \text{ for which } |X_n(\omega) - X(\omega)| \to 0 \text{ as } n \to \infty\}) = 1 . \tag{9} $$
Definition 3. The sequence $(X_n)$ converges towards the random variable $X$, defined on the same probability space, in the mean square sense if and only if

$$ \langle |X_n - X|^2 \rangle \to 0 \quad \text{as } n \to \infty . \tag{10} $$
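Definition 4. The sequence $(X_n)$ converges towards the random variable $X$, not necessarily on the same probability space, in distribution or in law if and only if, for every bounded continuous function $f$,

$$ \langle f(X_n) \rangle \to \langle f(X) \rangle \quad \text{as } n \to \infty . \tag{11} $$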
There is actually a fourth possible mode of convergence, the notion of stochastic limit, which is not given here. This mode of convergence is important in the theory but since it will not be explicitly used here and since its absence does not prevent the key concepts from being presented, it is left out so as to limit the mathematical burden.

The first mode of convergence, the almost sure limit, is the strongest possible one. It is actually the idea of classical pointwise convergence of real-valued functions used for the realizations of the random variable. It means that the sequence $(X_n)$ converges to $X$ ‘everywhere’, or that the subset on which $(X_n)$ does not converge to $X$ is negligible (in the sense that the measure of its importance is zero). The second mode of convergence has a more familiar connotation since it manipulates something that is basically an energy. Yet, these first two modes are similar. The third one is somewhat different since what is required is that only mean quantities derived from the sequence $(X_n)$ converge to a mean value derived from the limit process $X$. That limit process does not need to be known explicitly, and we only deal here with some information extracted from the different processes. That mode of convergence is therefore weaker than the first two. Indeed, the first two modes depend ‘directly’ on the values of the variables $X_n$ and $X$ whereas in the third mode the knowledge of $X$ is ‘indirect’. Loosely speaking, we can give the overall picture and say that the almost sure and the mean square convergence are strong modes of convergence while convergence in distribution refers to a weak convergence.
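The difference between the two notions can be seen on a simple Monte Carlo sketch. For a standard Gaussian variable $X$, the approximation $(1 + 1/n)X$ is built from the same realizations and is close to $X$ in the mean square sense, whereas an independent copy of $X$ has exactly the same law (so that all mean values of functions agree) but stays far from $X$ realization by realization. The Gaussian choice, the test function $\cos$, the sample size and all names are assumptions made for this illustration only.

```python
import math
import random

rng = random.Random(1)
n_samples = 20000
X = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def mean(values):
    return sum(values) / len(values)


def mean_square_distance(approx, exact):
    """Sample estimate of <|X_n - X|^2>, the quantity in Definition 3."""
    return mean([(a - e) ** 2 for a, e in zip(approx, exact)])


def weak_gap(approx, exact, f=math.cos):
    """Sample estimate of |<f(X_n)> - <f(X)>| for a bounded continuous f."""
    return abs(mean([f(a) for a in approx]) - mean([f(e) for e in exact]))


n = 100
# Strong approximation: built from the same realizations as X.
X_strong = [(1.0 + 1.0 / n) * x for x in X]
# Weak-only approximation: an independent copy with the same law as X.
X_weak = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

mse_strong = mean_square_distance(X_strong, X)  # of order 1/n**2
mse_weak = mean_square_distance(X_weak, X)      # of order 2, does not vanish
gap_strong = weak_gap(X_strong, X)
gap_weak = weak_gap(X_weak, X)

print(mse_strong, mse_weak)
print(gap_strong, gap_weak)
```

Both approximations reproduce the statistics of $X$ up to small errors, but only the first one reproduces its realizations.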
The distinction between these two ways of approximating random variables is important within the mathematical theory (definition of the Itô integral, solutions of equations, ...) but also for numerical reasons (see Section 2.10) and for physical purposes since it helps to clarify the ideas of the pdf approach in single- and two-phase flows developed in Sections 6–8.

2.2. Stochastic processes

Another interest of the exact definition of random variables given previously is to pave the way for the notion of stochastic processes and of trajectories of a process. A stochastic process is simply a family of random variables $X = (X_t)$ indexed by a parameter which is usually the time $t$. This notion is natural to introduce when one wishes to use random variables to model a time-dependent physical system. The mathematical definition of a stochastic process is in fact a measurable function of two variables

$$ T \times (\Omega, \mathcal{F}, P) \to (\mathcal{A}, \mathcal{G}) , \qquad (t, \omega) \mapsto X(t, \omega) . \tag{12} $$
The equivalence mentioned in the introduction between different points of view can now be made clear by fixing one of the two variables.

(a) for each fixed $t \in T$, $X_t$ is a random variable and we can define its pdf $p(t, x)$,
(b) for each fixed $\omega$, we have simply a function $t \mapsto X_t(\omega) = X(t, \omega)$ which is called a trajectory of the stochastic process $X$ or a sample path,
(c) there is a third point of view which generalizes the trajectory point of view. In this point of view, the stochastic process $(X_t)$ is regarded as a random variable for which the range of values is the set of real functions $X_{\cdot}(\omega)$. This defines the path-integral point of view.

In (a), we address the problem by considering the time-dependent pdf and the question is, for a given problem, to write the equation satisfied by this pdf. This is the pdf point of view. In (b), we first discretize the reference space by introducing ‘particles’ and we follow the time evolution of these particles, which define the trajectories of the process. This is the trajectory point of view. It is now clear that these particles actually represent different realizations of a stochastic process whose evolution is tracked in time. The path-integral point of view will not be used in the discussion of present models for turbulent dispersed two-phase flows, but will be referred to as an attractive tool in Section 9.3.4.

2.3. Markov processes

Manipulation of general stochastic processes is difficult since it requires handling $N$-point distributions, that is the joint distribution functions of the values of the process

$$ p(t_1, x_1; t_2, x_2; \ldots; t_N, x_N) , \tag{13} $$
at $N$ different times, and for any value of $N$. An important simplification can be obtained for a class of special processes to which we nearly always limit ourselves, Markov processes. A Markov process is defined as a process for which knowledge of the present is sufficient to predict the future. This is actually a simple notion which is carried over from ordinary differential equations (ODEs). In classical mechanics, when an ODE is written to describe the time evolution of a system, knowledge of the initial condition is sufficient. For stochastic processes, the Markov property means that if the state of the system is known at time $t_0$, additional information on the system at previous times $s$ ($s \le t_0$) has no effect on the future at $t > t_0$.

The Markov property simplifies the situation since it can be shown that Markov processes are completely determined by their initial distribution $p(t_0, x_0)$ and their transitional pdf $p(t, x | t_0, x_0)$. This transitional pdf represents the probability that $X$ takes a value $x$ at time $t$ conditioned on the fact that at time $t_0$ its value was $x_0$. The Markov property manifests itself in the following consistency relation, which is the Chapman–Kolmogorov formula

$$ p(t, x | t_0, x_0) = \int p(t, x | t_1, x_1)\, p(t_1, x_1 | t_0, x_0)\, dx_1 . \tag{14} $$

This equation states that the probability to go from $(t_0, x_0)$ to $(t, x)$ is the sum over all intermediate locations $x_1$ at an intermediate time $t_1$. The factorization inside the integral reflects the independence of the past and the future at $t_1$ when the present is known.

A Markov process can be characterized directly in terms of its transitional pdf or its trajectories, or more indirectly (in a weak or distribution sense) by its action on members of a function space. It is useful to define the infinitesimal operator for functions $g$ acting on the sample space
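of $X_t$ by

$$ L_t g(x) = \lim_{dt \to 0} \frac{\langle g(X_{t+dt}) \,|\, X_t = x \rangle - g(x)}{dt} , \tag{15} $$

where $\langle g(X) \rangle$ denotes the mean or expectation

$$ \langle g(X) \rangle = \int g(x)\, p(x)\, dx . \tag{16} $$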
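A discrete-time, finite-state analogue may help to fix ideas. In the sketch below, the integral of the Chapman–Kolmogorov formula (14) becomes a sum over intermediate states, and the operator of Eq. (15) is replaced by the mean change of $g$ over one step (a discrete stand-in for the limit $dt \to 0$, not the operator itself). The three-state transition table `T` and the function names are hypothetical.

```python
# Hypothetical one-step transition probabilities:
# T[i][j] = probability to be in state j at step k+1 given state i at step k.
T = [
    [0.9, 0.1, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.3, 0.7],
]


def compose(first, second):
    """Chapman-Kolmogorov: sum over all intermediate states."""
    size = len(first)
    return [
        [sum(first[i][k] * second[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def mean_change(T, g, i):
    """Discrete analogue of Eq. (15): <g(X_{k+1}) | X_k = i> - g(i)."""
    return sum(T[i][j] * g[j] for j in range(len(g))) - g[i]


T2 = compose(T, T)      # two-step transition probabilities
T3_a = compose(T2, T)   # three steps, intermediate state taken after two steps
T3_b = compose(T, T2)   # three steps, intermediate state taken after one step

g = [0.0, 1.0, 2.0]     # a test function on the state space
drift = [mean_change(T, g, i) for i in range(3)]

print(T2)
print(drift)
```

The two ways of composing three steps give the same result, which reflects the fact that the intermediate time in Eq. (14) is arbitrary.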
The value of $L_t g(x)$ can be thought of as the mean infinitesimal rate of change of the process $g(X_s)$ conditioned on $X_t = x$. Using this operator, the Chapman–Kolmogorov relation can be turned into differential equations. Since the conditional pdf $p(t, x | t_0, x_0)$ is a function of two sets of variables, one can consider variations with respect to the initial variables $(t_0, x_0)$ or the final variables $(t, x)$. We then obtain two different equations (see [13]), whose meaning will become clearer for diffusion processes:

• Kolmogorov backward equation:

$$ \frac{\partial p}{\partial t_0} + L_{t_0}\, p = 0 , \qquad \text{end condition } p(t, x | t_0, x_0) = \delta(x - x_0) \text{ when } t_0 \to t . \tag{17} $$

• Kolmogorov forward equation:

$$ \frac{\partial p}{\partial t} = L_t^{*}\, p , \qquad \text{initial condition } p(t, x | t_0, x_0) = \delta(x - x_0) \text{ when } t \to t_0 , \tag{18} $$
where $L_t^{*}$ denotes the adjoint of the operator $L_t$. The forward Kolmogorov equation gives the well-known Fokker–Planck equation for diffusion processes, as we will see below.

2.4. Key Markov processes

2.4.1. The Poisson process

Many situations, such as electron emission, telephone calls, shot noise or collisions, among other problems, require the notion of random points and eventually of Poisson processes. The important properties of the statistics of random points are first outlined. If a large number of points $n$ are placed at random within a wide interval, say $[-T/2, T/2]$, it can be shown that the probability to have $k$ points in an interval $I$ of length $t_I$, small with respect to $T$, is given by

$$ P(k \text{ in } I) = e^{-n t_I / T}\, \frac{(n t_I / T)^k}{k!} . \tag{19} $$

We then consider the case when $n, T \to \infty$ such that $n/T = \lambda$ remains finite. This defines the concept of random Poisson points for which the probability to have $k$ points in any interval $I$ of length $t_I$, say $n(I) = k$, is thus

$$ P(k \text{ in } I) = e^{-\lambda t_I}\, \frac{(\lambda t_I)^k}{k!} . \tag{20} $$
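The limit leading to Eq. (20) can be checked numerically. For $n$ points placed independently and uniformly in an interval of length $T$, the exact probability of finding $k$ of them in a sub-interval of length $t_I$ is binomial with parameter $t_I / T$ (a standard result, used here as an assumption about what ‘placed at random’ means). The sketch compares it with Eq. (20) while $n$ and $T$ grow at fixed $\lambda = n/T$; the numerical values and function names are illustrative.

```python
import math


def exact_probability(k, n, t_interval, T):
    """Binomial probability that k of n uniformly placed points fall in I."""
    p = t_interval / T
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_probability(k, rate, t_interval):
    """Eq. (20): probability of k Poisson points in an interval of length t_I."""
    mu = rate * t_interval
    return math.exp(-mu) * mu**k / math.factorial(k)


rate, t_interval = 2.0, 1.5   # hypothetical density and interval length
errors = {}
for n in (10, 100, 10000):
    T = n / rate              # keep n / T equal to the prescribed density
    errors[n] = max(
        abs(exact_probability(k, n, t_interval, T)
            - poisson_probability(k, rate, t_interval))
        for k in range(11)
    )

print(errors)
```

The largest discrepancy over $k = 0, \ldots, 10$ decreases as $n$ and $T$ increase together, as expected from the limiting procedure described above.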
A very important property is that the numbers of random points in non-overlapping intervals are independent. The parameter $\lambda$ which specifies Poisson points has a clear and simple meaning. This is shown by considering a small time interval of length $\Delta t$ and the probability to have one point within that interval. If $\Delta t$ is such that $\lambda \Delta t$ is much less than one, we have

$$ P(\text{one point in } [t, t + \Delta t]) \simeq \lambda \Delta t , \tag{21} $$

while we have

$$ P(\text{more than one point in } [t, t + \Delta t]) \ll \Delta t . \tag{22} $$
Consequently, the parameter $\lambda$ appears as the density of Poisson points. From the concept of random points, the definition of the Poisson process is straightforward. The Poisson process $N_t$ is defined as the number of random points, or more generally of random events, that take place in the interval $[0, t]$:
$$N_t = n(0, t). \tag{23}$$
The Poisson process is therefore a stochastic process taking discrete values. The trajectories of the Poisson process are staircase functions, being constant between random points at which they jump to the next integer. The mean value of the Poisson process and its standard deviation are given by
$$\langle N_t \rangle = \lambda t, \tag{24}$$
$$\sigma_{N_t} = \langle (N_t - \langle N_t \rangle)^2 \rangle^{1/2} = (\lambda t)^{1/2}. \tag{25}$$
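Eqs. (24) and (25) can be checked numerically. The sketch below is only illustrative: it builds the random points from independent exponential waiting times of mean $1/\lambda$ (a standard property of Poisson points that is not stated in the text), and the function name `poisson_counts` and the numerical values of `rate` and `t` are assumptions chosen for the example.

```python
import numpy as np

def poisson_counts(rate, t, n_paths, rng):
    """Number of random points in [0, t] for n_paths independent realizations.

    Points are generated from independent exponential waiting times of
    mean 1/rate (assumed construction, see the lead-in).
    """
    # Draw more waiting times than needed on average, then count arrivals <= t.
    n_draws = int(rate * t + 10.0 * np.sqrt(rate * t) + 20)
    waits = rng.exponential(1.0 / rate, size=(n_paths, n_draws))
    arrival_times = np.cumsum(waits, axis=1)
    return np.sum(arrival_times <= t, axis=1)

rng = np.random.default_rng(0)
rate, t = 2.0, 5.0                      # illustrative values of lambda and t
counts = poisson_counts(rate, t, n_paths=20000, rng=rng)

print(counts.mean(), rate * t)          # sample mean against lambda * t
print(counts.std(), np.sqrt(rate * t))  # sample deviation against (lambda * t)^(1/2)
```

The two printed pairs should agree up to the statistical error of the finite sample.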
The parameter $\lambda$ which defines the process is still the density of the random points, or rather of the random times at which certain events take place (emission of an electron, arrival of a phone call, collision with another particle, ...). It is called the intensity of the Poisson process and has the dimension of a frequency, that is, the inverse of a time scale, say $\tau_c$. The mean time interval between successive random events is simply equal to $\tau_c$.

2.4.2. Wiener process and Brownian motion

The Wiener process is the key process for our present concerns. It directly provides a model for a Brownian particle and as such has direct physical applications for modelling issues. It is also the fundamental building block on which diffusion processes and stochastic differential equations are built. The Wiener process can be introduced in different ways: directly through its construction as a random walk in some applied textbooks, or as a rather abstract mathematical object in more formal mathematical works. A middle path is sought here and further explanations can be found in [13–15]. We first limit ourselves to the one-dimensional case, but all results are easily extended to the multi-dimensional case. The Wiener process $W_t$ can be defined as a Gaussian process. Just as every Gaussian random variable is completely defined by its mean and variance, a Gaussian process is fully characterized by two functions, its mean and covariance, which are functions of one and two variables respectively:
$$M(t) = \langle X_t \rangle, \tag{26}$$
$$C(t, t') = \langle (X_t - \langle X_t \rangle)(X_{t'} - \langle X_{t'} \rangle) \rangle. \tag{27}$$
For the Wiener process, these defining functions are
$$M(t) = 0, \qquad C(t, t') = \min(t, t'), \tag{28}$$
and the transitional pdf has a Gaussian form
$$p(t, x|t_0, x_0) = \frac{1}{\sqrt{2\pi (t - t_0)}} \exp\left[ -\frac{(x - x_0)^2}{2 (t - t_0)} \right]. \tag{29}$$
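The defining functions of Eq. (28) can be illustrated with a short simulation. The sketch below is a minimal example and not a construction taken from the text: it approximates $W_t$ on a time grid by summing independent Gaussian increments of variance $\Delta t$, which is consistent with the Gaussian transitional pdf of Eq. (29). The grid size, the number of paths and the two observation times are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths, n_steps, dt = 20000, 100, 0.01    # illustrative discretization

# Independent Gaussian increments with zero mean and variance dt.
increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
# W[:, k] approximates W at time (k + 1) * dt, starting from W_0 = 0.
W = np.cumsum(increments, axis=1)

i_s, i_t = 29, 79                          # grid indices of s = 0.3 and t = 0.8
s, t = (i_s + 1) * dt, (i_t + 1) * dt
mean_t = W[:, i_t].mean()                  # estimate of M(t)
cov_st = np.mean(W[:, i_s] * W[:, i_t])    # estimate of C(s, t), the mean being zero

print(mean_t)                 # close to 0
print(cov_st, min(s, t))      # close to min(s, t)
```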
The infinitesimal operator associated to the Wiener process is
$$L_t\, g(x) = \frac{1}{2} \frac{\partial^2 g(x)}{\partial x^2}, \tag{30}$$
and the forward Kolmogorov equation shows that the transitional pdf $p(t, x|t_0, x_0)$ is the solution of the heat equation
$$\frac{\partial p}{\partial t} = \frac{1}{2} \frac{\partial^2 p}{\partial x^2}, \qquad \text{initial condition } p(t, x|t_0, x_0) = \delta(x - x_0) \ \text{when } t \to t_0. \tag{31}$$
This equation already reveals the physics of the problem. A quantity will diffuse in space (its value follows a diffusion equation such as the heat equation) because it is 'carried' by underlying and fast Brownian particles. In other words, the result of the mixing of fast Brownian particles which carry a piece of information is that the mean value of that information diffuses in space. The Wiener process has a number of particular properties. The main ones are:

• the trajectories of $W_t$ are continuous yet nowhere differentiable. Even on a small interval, $W_t$ fluctuates enormously.
• the increments of $W_t$, $dW_t = W_{t+dt} - W_t$, over small time steps $dt$ are stationary and independent. Each increment is a Gaussian variable with mean, variance and higher moments given by
$$\langle dW_t \rangle = 0, \qquad \langle (dW_t)^2 \rangle = dt, \qquad \langle (dW_t)^n \rangle = o(dt) \ \ (n > 2). \tag{32}$$
• the Wiener process is the only stochastic process with independent Gaussian increments and with continuous trajectories.
• the trajectories are of unbounded variation in every finite interval. This property explains why stochastic integrals will differ from classical Riemann–Stieltjes ones.

2.5. General Chapman–Kolmogorov equations
Some of the typical properties observed with the key stochastic processes described above can be generalized to a whole class of Markov processes, provided that certain assumptions are made on their behaviour over small time increments. From the correspondence between the trajectory and the pdf points of view, there are two ways to express this incremental behaviour. In this section, we follow the trajectory point of view and characterize these processes by the following conditions on the transitional pdf over small increments in time $\Delta t$:
$$\frac{1}{\Delta t}\, p(t + \Delta t, y|t, x) = W(y|t, x) + O(\Delta t) \quad \text{for } |x - y| > \epsilon, \tag{33a}$$
$$\frac{1}{\Delta t} \int_{|y - x| < \epsilon} (y - x)\, p(t + \Delta t, y|t, x)\, dy = A(t, x) + O(\Delta t), \tag{33b}$$
$$\frac{1}{\Delta t} \int_{|y - x| < \epsilon} (y - x)^2\, p(t + \Delta t, y|t, x)\, dy = B^2(t, x) + O(\Delta t). \tag{33c}$$
The first condition gives the probability of a jump, and trajectories are discontinuous when $W \neq 0$. The second one defines the drift coefficient $A(t, x)$, which is the mean increment of the conditional process $X_t$. The third one defines the diffusion coefficient, which represents the variance of the increment or the spread around the mean incremental value. The transitional probability density function $p(t, x|t_0, x_0)$ is a function of both the initial state $(t_0, x_0)$ and the final state $(t, x)$. Consequently, two points of view can be adopted by holding either the initial or the final condition fixed and by varying the other state. Using the Chapman–Kolmogorov relation, Eq. (14), and the above hypotheses, it can be shown that, when the initial condition is held fixed and when $p$ is regarded as a function of the final state $(t, x)$, then $p(t, x|t_0, x_0)$ satisfies the forward Kolmogorov equation [15]
$$\frac{\partial p}{\partial t} = -\frac{\partial [A(t, x)\, p]}{\partial x} + \frac{1}{2} \frac{\partial^2 [B^2(t, x)\, p]}{\partial x^2} + \int \{ W(x|t, y)\, p(t, y|t_0, x_0) - W(y|t, x)\, p(t, x|t_0, x_0) \}\, dy. \tag{34}$$
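The role of the drift and diffusion coefficients in conditions (33b) and (33c) can be made concrete with a small numerical sketch. The example assumes a process without jumps ($W = 0$) and with constant coefficients `a` and `b` (hypothetical values), so that the transitional pdf over $\Delta t$ is exactly Gaussian and can be sampled directly. Since there are no jumps, the restriction $|y - x| < \epsilon$ is not enforced in this sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
a, b = 1.5, 0.7          # hypothetical constant drift and diffusion coefficients
x, dt = 0.2, 1e-3        # conditioning state and small time increment
n_samples = 400000

# Samples of y drawn from p(t + dt, y | t, x), exactly Gaussian in this case.
y = x + a * dt + b * np.sqrt(dt) * rng.standard_normal(n_samples)

drift_est = np.mean(y - x) / dt           # left-hand side of Eq. (33b)
diff_est = np.mean((y - x) ** 2) / dt     # left-hand side of Eq. (33c)

print(drift_est, a)        # agrees with A up to sampling error
print(diff_est, b ** 2)    # agrees with B^2 up to a term a^2 * dt and sampling error
```

The second moment contains the contribution $a^2 \Delta t$, which is an instance of the $O(\Delta t)$ remainder in Eq. (33c).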
Using similar considerations and more or less the same derivation, it can be shown that, as a function of the initial state $(t_0, x_0)$ when the final condition $(t, x)$ is held fixed, $p(t, x|t_0, x_0)$ satisfies the backward Kolmogorov equation [15]
$$\frac{\partial p(t, x|t_0, x_0)}{\partial t_0} = -A(t_0, x_0) \frac{\partial p(t, x|t_0, x_0)}{\partial x_0} - \frac{1}{2} B^2(t_0, x_0) \frac{\partial^2 p(t, x|t_0, x_0)}{\partial x_0^2} + \int W(y|t_0, x_0) \{ p(t, x|t_0, x_0) - p(t, x|t_0, y) \}\, dy. \tag{35}$$
It is important not to confuse the two points of view (forward or backward), which further justifies the central role of the transitional pdf. In the forward equation, the initial state does not appear explicitly in the jump, drift and diffusion coefficients, and we can integrate over all possible initial conditions. Since the pdf of the stochastic process $X_t$ at time $t$ is of course given by
$$p(t, x) = \int p(t, x|t_0, x_0)\, p(t_0, x_0)\, dx_0, \tag{36}$$
it follows that $p(t, x)$ satisfies the same forward equation. From the general Chapman–Kolmogorov equations, various cases can be isolated by considering different possibilities for the jump, drift and diffusion coefficients. These particular equations have sometimes been obtained separately and carry different names, often for historical reasons. Yet, in the present formulation, they appear as subclasses of a general class of Markov processes.
2.5.1. The Master equation

When $A(t, x) = B(t, x) = 0$, the stochastic process involves only jumps, and between two jumps the trajectories of the process are constant (horizontal straight lines). The pdf equation is called the Master equation
$$\frac{\partial p}{\partial t} = \int [ W(x|t, y)\, p(t, y|t_0, x_0) - W(y|t, x)\, p(t, x|t_0, x_0) ]\, dy. \tag{37}$$
The Master equation is indeed the central equation for processes which are typically discrete and is met when one deals with physical issues which are also by nature discrete. The classical example is molecular and chemical processes which involve either a complete change or no change at all. Another typical application is particle collisions, where particle velocities can be constant and change discontinuously at discrete and random times. Over a finite time step $\Delta t$, we have
$$p(t + \Delta t, y|t, x) = \delta(y - x) \left[ 1 - \Delta t \int W(y'|t, x)\, dy' \right] + W(y|t, x)\, \Delta t, \tag{38}$$
which shows that $W(y|t, x)$ is the probability to jump from state $x$ to $y$ at time $t$ per unit of time. The generic process in this subclass is the Poisson process described in the previous section, for which the sample space is discrete and $W(x + 1|t, x) = \lambda$.

2.5.2. The Liouville equation

When $W(y|t, x) = B(t, x) = 0$, the process is a continuous deterministic process and the pdf equation is the Liouville equation
$$\frac{\partial p}{\partial t} = -\frac{\partial [A(t, x)\, p]}{\partial x}. \tag{39}$$
The Liouville equation is central in classical mechanics. Its characteristic form, and the presence of only first-order partial derivatives, are closely related to the choice of a closed description of a mechanical system (each degree of freedom is explicitly tracked), as will be explained in detail later on in Sections 3 and 4.

2.5.3. The Fokker–Planck equation

When $W(y|t, x) = 0$, the forward Kolmogorov equation is called the Fokker–Planck equation
$$\frac{\partial p}{\partial t} = -\frac{\partial [A(t, x)\, p]}{\partial x} + \frac{1}{2} \frac{\partial^2 [B^2(t, x)\, p]}{\partial x^2}. \tag{40}$$
Compared to the Liouville equation, the Fokker–Planck equation involves a supplementary term with a second-order partial derivative. The existence of this term has deep consequences both mathematically and physically. From the mathematical point of view, the issue is to define clearly the corresponding behaviour of the trajectories of the process and to put the manipulation of these trajectories on a sound footing. From the physical point of view, the issue is to explain how this behaviour comes into play and the physical meaning behind its use. That question is addressed in Section 4. The solutions of Fokker–Planck equations are known as diffusion processes, and the rest of the present section is devoted to clarifying their characteristics and how they are manipulated. The central example within the subclass of diffusion processes is the Wiener process, described in the previous section.
2.6. Stochastic differential equations and diffusion processes

Diffusion processes form a subclass of Markov processes. They have been carefully studied and their properties are rather well known mathematically, which makes them easier and safer to use in applied physics. They will be used extensively for modelling purposes since they represent models for diffusion phenomena (thus their name) and have continuous trajectories. The first example is the Wiener process described above, and general diffusion processes are in fact extensions of it. The infinitesimal operator for such a diffusion process is given by
$$L_t = A(t, x) \frac{\partial}{\partial x} + \frac{1}{2} B^2(t, x) \frac{\partial^2}{\partial x^2}, \tag{41}$$
and the transitional pdf $p(t, x|t_0, x_0)$ satisfies the Fokker–Planck equation, Eq. (40), with the initial condition $p(t, x|t_0, x_0) \to \delta(x - x_0)$ when $t \to t_0$. The Fokker–Planck equation reflects the pdf point of view. On the other hand, the second point of view will give direct answers to the questions explained in the introduction of this chapter related to the meaning of Eq. (3). One would like to give a meaning to the 'noise' term, $\xi_t$, introduced in an ODE. The proper way to do so is to say that we are now dealing with a stochastic process $X_t$ and that we are writing differential equations for the trajectories of this process as defined above. We consider now $\xi_t$ as a rapidly fluctuating, highly irregular stochastic process. The ideal model is a Gaussian 'white noise' model where the process is stationary with zero mean and no correlation, that is
$$\langle \xi_t \rangle = 0 \quad \text{and} \quad \langle \xi_t\, \xi_{t'} \rangle = \delta(t' - t). \tag{42}$$
This process has a constant spectral density (thus the name white noise). However, the white-noise process cannot be defined directly since it has an infinite variance. One can give an abstract sense to this process (Arnold [13], Chapter 3). However, there is a simpler way out of this difficulty. The solution consists in considering the effect of the white-noise term over (small) time intervals. We define
$$Y_t = \int_0^t \xi_{t'}\, dt'. \tag{43}$$
$Y_t$ is a Gaussian Markov process whose mean and covariance functions are worked out from the properties of the white noise:
$$\langle Y_t \rangle = 0 \quad \text{and} \quad \langle (Y_t)^2 \rangle = t. \tag{44}$$
Therefore, $Y_t$ can be identified with the Wiener process, $Y_t = W_t$. This indicates that the integral over a time interval of the white-noise process gives the Wiener process, and this justifies writing
$$W_t = \int_0^t \xi_{t'}\, dt' \quad \text{or} \quad dW_t = \xi_t\, dt. \tag{45}$$
The idea is thus to try to define not the derivatives of the trajectories, Eq. (3), but their increments over small time steps as
$$dX_t = A(t, x)\, dt + B(t, x)\, dW_t, \tag{46}$$
which is a short-hand notation for
$$X_t = X_{t_0} + \int_{t_0}^{t} A(s, X_s)\, ds + \int_{t_0}^{t} B(s, X_s)\, dW_s. \tag{47}$$
The first integral can be thought of as a classical one. The second one is a stochastic integral which must be properly defined. In the usual sense, one would split the time interval into a number of small time steps and write the integral as the limit
$$\int_{t_0}^{t} B(s, X_s)\, dW_s = \lim_{N \to \infty} \sum_{i=1}^{N} B(\tau_i, X_{\tau_i}) (W_{t_{i+1}} - W_{t_i}), \tag{48}$$
where $\tau_i$ is chosen in the interval $[t_i, t_{i+1}]$. However, it turns out that the limit result is not independent of the choice of the intermediate time $\tau_i$, as one would expect from classical integration. Different choices of this intermediate time yield finite but different results for the integral. This surprising result can be traced back to the ragged behaviour of the Wiener process and to the fact that its trajectories are of infinite total variation in any time interval. To obtain a meaningful and coherent theory (for later manipulations) of the stochastic integral, one must therefore choose the intermediate times right from the outset. The important message here is that speaking of a stochastic integral without specifying in what sense it is considered is not meaningful. Two main choices have been made in the literature. The first one is called the Itô definition and consists in taking the value at the beginning of the time interval, $\tau_i = t_i$. There is a clear probabilistic interpretation of this. The integral is written
$$\int_{t_0}^{t} B(s, X_s)\, dW_s = \lim_{N \to \infty} \sum_{i=1}^{N} B(t_i, X_{t_i}) (W_{t_{i+1}} - W_{t_i}), \tag{49}$$
which shows that we consider the function $B(t, X_t)$ as a non-anticipating function with respect to the Wiener process. The choice of $\tau_i$ signifies that we express $B(t, x)$ as a function of the present state, while the increment $dW_t$, which is independent of the present, is said to 'point towards the future'. This choice is in fact rather natural when the 'noise' does not depend on the system. From it follow the properties of the Itô stochastic integral
$$\left\langle \int_{t_0}^{t_1} X_t\, dW_t \right\rangle = 0, \tag{50}$$
$$\left\langle \int_{t_0}^{t_1} X_t\, dW_t \int_{t_2}^{t_3} Y_t\, dW_t \right\rangle = \int_{t_2}^{t_1} \langle X_t Y_t \rangle\, dt \quad \text{for } t_0 \le t_2 \le t_1 \le t_3. \tag{51}$$
The second choice is to take the intermediate point $\tau_i$ as the middle point of the interval, $\tau_i = (t_i + t_{i+1})/2$. This results in the Stratonovich definition. Actually, various definitions of the Stratonovich integral can be found depending upon the exact expression of the term involving $B(\tau_i, X_{\tau_i})$ in the limit sum, Eq. (48). For example, one can choose to take $B((t_i + t_{i+1})/2, X_{(t_i + t_{i+1})/2})$ or $B(t_i, X_{(t_i + t_{i+1})/2})$ as in Arnold [13]. The most common definition met in mathematical books is (written with a characteristic symbol $\circ$)
$$\int_{t_0}^{t} B(s, X_s) \circ dW_s = \lim_{N \to \infty} \sum_{i=1}^{N} \frac{1}{2} [ B(t_i, X_{t_i}) + B(t_{i+1}, X_{t_{i+1}}) ] (W_{t_{i+1}} - W_{t_i}). \tag{52}$$
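The dependence of the limit on the choice of the intermediate point can be observed directly on the integral of $W_s$ against $dW_s$ over $[0, T]$, that is, with $B(s, X_s) = W_s$. The sketch below is an illustration on a single discretized path, with an arbitrary number of steps. It compares the sum of Eq. (49) with the sum of Eq. (52) and uses the classical closed-form values $(W_T^2 - T)/2$ and $W_T^2/2$ for the Itô and Stratonovich integrals of this particular integrand, which are standard results not derived in the text.

```python
import numpy as np

rng = np.random.default_rng(3)
T, N = 1.0, 100000                             # final time and number of steps
dW = rng.normal(0.0, np.sqrt(T / N), size=N)   # Wiener increments
W = np.concatenate(([0.0], np.cumsum(dW)))     # W[i] = W at time t_i, W[0] = 0

# Integrand taken at the beginning of each interval, as in Eq. (49).
ito_sum = np.sum(W[:-1] * dW)
# Average of the two end-point values, as in Eq. (52).
strato_sum = np.sum(0.5 * (W[:-1] + W[1:]) * dW)

print(ito_sum, 0.5 * (W[-1] ** 2 - T))     # Ito value
print(strato_sum, 0.5 * W[-1] ** 2)        # Stratonovich value
print(strato_sum - ito_sum, 0.5 * T)       # the two sums differ by about T/2
```

The difference between the two sums does not vanish as the number of steps increases, which is the numerical counterpart of the statement that the definition of the stochastic integral must be chosen at the outset.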
These various definitions differ only by the assumptions required on $B$ for the sums to converge, and if $B$ is smooth enough they lead to the same limit object. Therefore, the above sum can be taken as the present definition of the Stratonovich integral. The question of which definition of the stochastic integral, Itô or Stratonovich, should be chosen has led to continuous debate in applied physics textbooks. A detailed discussion of this dilemma is outside the scope of the present notes. However, the key point is to be aware of this apparent peculiarity so as to avoid confusion. Indeed, if these different definitions and properties are ignored, it is hard to understand why calculations performed with seemingly identical procedures can lead to conflicting results. Actually, the two forms can be transformed one into the other. The Stratonovich definition of a SDE
$$dX_t = A(t, x)\, dt + B(t, x) \circ dW_t \tag{53}$$
can be shown to be equivalent to the Itô SDE [13,15]
$$dX_t = A(t, x)\, dt + \frac{1}{2} B(t, x) \frac{\partial B(t, x)}{\partial x}\, dt + B(t, x)\, dW_t. \tag{54}$$
The difference between the two definitions is therefore a mean drift term and is not 'negligible'. This illustrates and further stresses that, even if one is not interested in mathematical subtleties, a careful definition and at least some understanding of what these definitions embody is unavoidable. The best illustration of such pitfalls is perhaps numerical schemes for the integration of the trajectories of the process in practical computations, see Section 2.10. Finally, for a stochastic process $X_t$ whose trajectories satisfy stochastic differential equations in the Stratonovich sense, it can be seen from the correspondence with an Itô form, Eq. (54), and the Fokker–Planck equation verified for diffusion processes in the Itô sense, Eq. (40), that the pdf of $X_t$ is the solution of
$$\frac{\partial p}{\partial t} = -\frac{\partial [A(t, x)\, p]}{\partial x} + \frac{1}{2} \frac{\partial}{\partial x} \left[ B(t, x) \frac{\partial [B(t, x)\, p]}{\partial x} \right]. \tag{55}$$

2.7. Stochastic calculus

Most of the strangeness of stochastic processes and of SDEs is embodied in stochastic calculus. Although surprising at first sight, the differences with ordinary differential rules are not too difficult to grasp. They stem from the irregular behaviour of the trajectories of the Wiener process $W_t$. Indeed, we have seen that on a small time increment $dt$ the variance of the increments of the Wiener process, $\langle (dW_t)^2 \rangle$, is linear in $dt$ (in fact, it is equal to $dt$). This already contradicts the 'normal' calculus result which says that the square of an increment should be of order $(dt)^2$. The explanation is that the 'correct' behaviour is expected for a differentiable process (a process whose trajectories are differentiable) while the Wiener process is precisely not differentiable. As a consequence, normal calculus rules must be modified by going over to the second-order derivatives, which in normal cases give only terms of order $(dt)^2$ but will bring a first-order contribution in our case. To illustrate this, we consider a SDE defined in the Itô sense
$$dX_t = A(t, x)\, dt + B(t, x)\, dW_t \tag{56}$$
and we want to derive the SDE verified by a function $g(t, X_t)$ of the stochastic process $X_t$. The rule of thumb is thus to write the Taylor series up to the second order and not to forget the contribution that arises from the term involving $(dW_t)^2$, whose mean is $dt$. The result is Itô's formula
$$\begin{aligned} dg(t, X_t) = {} & \frac{\partial g}{\partial t}(t, X_t)\, dt + A(t, X_t) \frac{\partial g}{\partial x}(t, X_t)\, dt + B(t, X_t) \frac{\partial g}{\partial x}(t, X_t)\, dW_t \\ & + \frac{1}{2} B^2(t, X_t) \frac{\partial^2 g}{\partial x^2}(t, X_t)\, dt, \end{aligned} \tag{57}$$
where the term on the second line is the 'new term' with respect to classical calculus, which would have produced only the first line. On the other hand, the choice of the Stratonovich definition leads to calculus rules which are identical to classical ones [13,15]. However, this nice point is offset by the difficulty in manipulating the stochastic integral, and Itô's simple properties, Eqs. (50)–(51), are no longer valid.
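As a short worked example of Eq. (57), consider the simplest case $A = 0$, $B = 1$, so that $X_t = W_t$, together with the function $g(t, x) = x^2$. This particular choice is only an illustration and is not taken from the text.

```latex
% Ito's formula, Eq. (57), applied to g(t,x) = x^2 with A = 0 and B = 1:
%   dg/dt = 0,   dg/dx = 2x,   d^2g/dx^2 = 2
\begin{align*}
  d(W_t^2) &= 2 W_t \, dW_t + \tfrac{1}{2} \cdot 2 \, dt
            = 2 W_t \, dW_t + dt , \\
  % integrate from 0 to t with W_0 = 0 and take the mean, using Eq. (50):
  \langle W_t^2 \rangle
           &= 2 \left\langle \int_0^t W_s \, dW_s \right\rangle + t = t .
\end{align*}
```

The result agrees with the covariance of Eq. (28), $C(t, t) = t$. Classical calculus would give $d(W_t^2) = 2 W_t\, dW_t$ only, whose mean vanishes in the Itô sense, and would therefore miss the growth of the variance.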
2.8. Langevin and Fokker–Planck equations

Once stochastic calculus has been defined and the meaning of SDEs has been given, the picture is complete. We can state what is in fact the main point of this whole section: when dealing with stochastic processes there are two ways to characterize the properties, the time-evolution equation of the trajectories of the process or the equation satisfied in sample space by its pdf. This correspondence is particularly clear for diffusion processes and is central in the present paper. We use this summary as an opportunity to write results in the multi-dimensional case. If $\mathbf{Z}(t) = (Z_1, \ldots, Z_n)$ is a diffusion process with a vector drift $\mathbf{A} = (A_i)$ and a diffusion matrix $B = (B_{ij})$, the trajectories of the process are solutions of the following SDE
$$dZ_i = A_i(t, \mathbf{Z}(t))\, dt + B_{ij}(t, \mathbf{Z}(t))\, dW_j, \tag{58}$$
where $\mathbf{W}_t = (W_1, \ldots, W_n)$ is a set of independent Wiener processes. These SDEs are called Langevin equations in the physical literature. This corresponds in sample space to the Fokker–Planck equation for the transitional pdf, written $p(t, \mathbf{z}|t_0, \mathbf{z}_0)$,
$$\frac{\partial p}{\partial t} = -\frac{\partial [A_i(t, \mathbf{z})\, p]}{\partial z_i} + \frac{1}{2} \frac{\partial^2 [(B B^T)_{ij}(t, \mathbf{z})\, p]}{\partial z_i\, \partial z_j}, \tag{59}$$
where $B^T$ is the transpose matrix of $B$. Actually, the correspondence between the two points of view is not a strict equivalence. Indeed, the matrix $D$ that enters the Fokker–Planck equation is related to the diffusion matrix $B$ of the SDEs by $D = B B^T$. Since the decomposition of a given positive-definite matrix $D$ in this form is not always unique, there may exist several choices for the diffusion matrix $B$. Therefore, we can have different models for the trajectories that still correspond to the same transitional pdf. In other words, there is more information in the trajectories of a diffusion process than in the solution of the Fokker–Planck equation. However, since we are in the present work interested mainly in statistics extracted from the stochastic process, or in a weak approach (in the sense already used in Section 2.1.2), we can consider that the different models for the trajectories belong to the same class and then speak of the equivalence between SDEs and Fokker–Planck equations.

2.9. The probabilistic interpretation of PDEs

The equivalence between the trajectory and the pdf points of view is also the basis of the probabilistic interpretation of some PDEs. The starting idea is to interpret the solution of a PDE as a function or a functional of some stochastic process. Instead of solving the PDE by classical numerical methods, the idea is then to simulate directly the trajectories of the process and to obtain the solution by some sort of averaging operation. This methodology can be applied to a large variety of PDEs (see [14,17]). We limit ourselves to the case of parabolic equations, and the relation between stochastic processes and PDEs of convection–diffusion type is simply the relation between the two definitions of a diffusion process. The probabilistic interpretation reverses that point of view and regards a convection–diffusion PDE as a kind of Fokker–Planck equation. For example, the solution of the problem
$$\frac{\partial u}{\partial t} = -\frac{\partial [A(t, x)\, u]}{\partial x} + \frac{1}{2} \frac{\partial^2 [B^2(t, x)\, u]}{\partial x^2}, \qquad u(0, x) = h(x) \ \text{when } t = 0, \tag{60}$$
can be built from the transitional pdf of the diffusion process $X_t$, $p(t, x|t_0, x_0)$, as
$$u(t, x) = \int p(t, x|t_0, x_0)\, h(x_0)\, dx_0, \tag{61}$$
where $A(t, x)$ and $B(t, x)$ are, respectively, the drift and diffusion coefficients of the process $X_t$. Therefore, in physical terms, $X_t$ appears as the propagator of the initial function $h(x_0)$. In other words, $X_t$ is the carrier of the information. At the initial time, particles start at $x_0$ with an 'information' that is $h(x_0)$. Then they follow the SDE
$$dX_t = A(t, x)\, dt + B(t, x)\, dW_t. \tag{62}$$
As a consequence of this motion, information is carried from the initial state $(t_0, x_0)$ to another one $(t, x)$. The average result is then the solution of the PDE, which is of convection–diffusion type. It is seen that the diffusion term in the PDE reflects in fact the fast and random motion expressed by the Wiener process, $dW_t$, in the 'particle' evolution equation. Conversely, when particles undergo a random walk, the result of their mixing is to produce a diffusion in space. Then, in practical simulations, any statistics that are continuously obtained from the pdf of the process can be approximated, at a given time $t$, from an ensemble of realizations of the process by the Monte Carlo evaluation
$$\langle f(X_t) \rangle \simeq \frac{1}{N} \sum_{i=1}^{N} f(X_t^i). \tag{63}$$
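The Monte Carlo evaluation of Eq. (63) can be sketched as follows. The example is a minimal illustration under stated assumptions: the coefficients $A(t, x) = -x$ and $B(t, x) = 1$ are hypothetical choices, the trajectories of Eq. (62) are advanced with a simple explicit time stepping (the Euler scheme discussed in Section 2.10), and the reference value used for comparison is the known second moment of this particular linear model.

```python
import numpy as np

def drift(t, x):
    return -x                     # hypothetical drift coefficient A(t, x)

def diffusion(t, x):
    return np.ones_like(x)        # hypothetical diffusion coefficient B(t, x)

def simulate(x0, t_end, n_steps, n_paths, rng):
    """Advance n_paths independent realizations of Eq. (62) from x0 up to t_end."""
    dt = t_end / n_steps
    x = np.full(n_paths, x0, dtype=float)
    t = 0.0
    for _ in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal(n_paths)   # Wiener increments
        x = x + drift(t, x) * dt + diffusion(t, x) * dW
        t += dt
    return x

rng = np.random.default_rng(4)
samples = simulate(x0=1.0, t_end=1.0, n_steps=100, n_paths=50000, rng=rng)

estimate = np.mean(samples ** 2)                     # Eq. (63) with f(x) = x^2
exact = np.exp(-2.0) + 0.5 * (1.0 - np.exp(-2.0))    # second moment of this linear model at t = 1
print(estimate, exact)
```

The estimate carries both a statistical error, which decreases with the number of realizations, and a time-discretization error, which decreases with the time step.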
For a stochastic process, when statistics are required at various times, the different realizations at time $t$ are simply provided by the values at the corresponding time of the trajectories of the process. Indeed, it is now clear that the trajectory point of view consists in practice in following in time a number of trajectories, that we can write as $X^i(t) = X(t, \omega_i)$ for a certain number of possible events represented here by $\omega_i$. In other words, simulating stochastic processes from the trajectory point of view corresponds to performing Monte Carlo integration at each time. One speaks then of the Monte Carlo integration of partial differential equations.

2.10. A word on numerical schemes

The problem of how to devise accurate numerical schemes for the integration of SDEs is a difficult issue, and also a recent concern. This is the subject of current research [18,19]. The detailed presentation of current state-of-the-art proposals is not within the scope of the present paper and we limit ourselves to the main points that also illustrate the notions put forward in the previous sections. Compared to similar numerical schemes that are now well established for ordinary differential equations, the question of the consistency of stochastic numerical schemes must be carefully analysed. Actually, most of the difficulties arise from a lack of understanding of the exact definition of the stochastic integral, see Section 2.7. Numerical schemes, as well as the manipulation of a function of the stochastic process $X_t$, can only be handled after an interpretation of the stochastic integral has been chosen. If one has chosen the Itô interpretation, then it is implicitly assumed that the discretization of $B(t, x)$ should not anticipate the future. As a result, Runge–Kutta schemes cannot be applied directly. More precisely, careless applications of high-order Runge–Kutta schemes can introduce spurious drifts which may not be easy to detect. For the Langevin equation
$$dX_t = A(t, X_t)\, dt + B(t, X_t)\, dW_t, \tag{64}$$
the Euler scheme is the simplest choice and is written as
$$X^i(t + \Delta t) - X^i(t) = A(t, X^i(t))\, \Delta t + B(t, X^i(t))\, \Delta W_t, \tag{65}$$
where the random term $\Delta W_t$ is expressed as $\sqrt{\Delta t} \times e$, $e$ being a value sampled from a normalized Gaussian random variable, independently at each time step and for each trajectory. A rather illuminating example of typical pitfalls is seen if one tries to apply directly the well-known predictor–corrector scheme. This is a two-step scheme with the Euler scheme acting as a predictor:
$$\tilde{X}^i(t + \Delta t) - X^i(t) = A(t, X^i(t))\, \Delta t + B(t, X^i(t))\, \Delta W_t, \tag{66a}$$
$$X^i(t + \Delta t) - X^i(t) = \frac{1}{2} \left( A(t, X^i(t)) + A(t + \Delta t, \tilde{X}^i(t + \Delta t)) \right) \Delta t + \frac{1}{2} \left( B(t, X^i(t)) + B(t + \Delta t, \tilde{X}^i(t + \Delta t)) \right) \Delta W_t. \tag{66b}$$
Yet, a series expansion in time of this scheme reveals that, due to the first-order behaviour of $(\Delta W_t)^2$ in time, the corresponding differential equation turns out to be Eq. (54) rather than the Itô SDE, which is here Eq. (64). In other words, the predictor–corrector scheme is consistent with the Stratonovich interpretation of SDEs, Eq. (53), but not (in general) with the Itô interpretation. Therefore, if the Itô interpretation has been chosen and the stochastic integrals are manipulated using the simple Itô's rules (see Section 2.6), the scheme is not consistent. The key point here is that the numerical discretization must be in line with the mathematical definition of the stochastic terms. To stay on somewhat safer grounds, one can stick to the Euler scheme or pay enough attention to the validity of the numerical schemes. After consistency is checked, the quality of schemes must be measured to analyse how they actually approximate solutions, and for this the notion of order of convergence must be properly defined. For stochastic processes, various definitions can be adopted which mirror the different ways random variables may converge to a limit random variable, see Section 2.1.2. One can define a strong order of convergence and a weak order of convergence. Let us consider a numerical approximation of the process $X_t$ obtained with a finite time step $\Delta t$, called $X_t^{\Delta t}$. On the one hand, the numerical scheme will have a strong order of convergence $m$ if at a time $t_{max}$ we have that
$$\left( \langle | X_{t_{max}} - X_{t_{max}}^{\Delta t} |^2 \rangle \right)^{1/2} \le C (\Delta t)^m. \tag{67}$$
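Returning briefly to the consistency issue raised above, the spurious drift of the predictor–corrector scheme can be observed on a simple case. The sketch below is an illustration with hypothetical coefficients $A = 0$ and $B(t, x) = x$, chosen because the two interpretations then give different means: for the Itô SDE the mean of $X_t$ stays equal to its initial value, whereas the equivalent Itô form of the Stratonovich SDE, Eq. (54), contains the drift $x/2$, so that the mean grows as $\exp(t/2)$. Both schemes, Eq. (65) and Eqs. (66a)–(66b), are driven by the same noise increments.

```python
import numpy as np

def b(t, x):
    return x                      # hypothetical B(t, x) = x, with A(t, x) = 0

rng = np.random.default_rng(5)
n_paths, n_steps, t_end = 200000, 100, 1.0
dt = t_end / n_steps

x_euler = np.ones(n_paths)        # Euler trajectories, X(0) = 1
x_pc = np.ones(n_paths)           # predictor-corrector trajectories, X(0) = 1
t = 0.0
for _ in range(n_steps):
    dW = np.sqrt(dt) * rng.standard_normal(n_paths)   # same noise for both schemes
    # Euler scheme, Eq. (65), with A = 0.
    x_euler = x_euler + b(t, x_euler) * dW
    # Predictor step, Eq. (66a), then corrector step, Eq. (66b), with A = 0.
    x_pred = x_pc + b(t, x_pc) * dW
    x_pc = x_pc + 0.5 * (b(t, x_pc) + b(t + dt, x_pred)) * dW
    t += dt

euler_mean = x_euler.mean()
pc_mean = x_pc.mean()
print(euler_mean)                       # close to 1, the mean of the Ito solution
print(pc_mean, np.exp(0.5 * t_end))     # close to exp(t/2), the Stratonovich mean
```

The gap between the two means does not shrink when the time step is reduced, which is what distinguishes a consistency defect from an ordinary discretization error.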
On the other hand, the numerical scheme will have the weak order of convergence $m$ if at time $t_{max}$ we have that
$$\left| \langle f(X_{t_{max}}) \rangle - \langle f(X_{t_{max}}^{\Delta t}) \rangle \right| \le C (\Delta t)^m \tag{68}$$
for all sufficiently smooth functions $f$. For example, the Euler scheme has a strong order of convergence $m = 1/2$ but a weak order of convergence $m = 1$. As already explained, since we are mainly interested in approximating various statistics of single- and two-phase flows, the natural notion is that of weak convergence.

3. Hierarchy of pdf descriptions

Most of the necessary mathematical elements concerning stochastic differential equations have been given in the preceding section. For our purposes, attention has been focused on Markovian processes, and more specifically on a particular subset of Markovian processes, diffusion processes. These processes will be used as building blocks, first in turbulent single-phase flow modelling in Section 6 and then in turbulent two-phase flow modelling in Sections 7 and 8. Up to now, emphasis has been mainly put on the mathematical characteristics of diffusion processes rather than on their application for physical purposes. Such an application requires further analysis and discussion. Indeed, even in the multi-dimensional case where the stochastic process $\mathbf{Z}(t)$ is a vector of $d$ real stochastic processes, $\mathbf{Z}(t) = (Z_1(t), \ldots, Z_d(t))$, the selection of variables that make up the stochastic process $\mathbf{Z}(t)$ in a practical case, its dimension $d$, and the choice of the evolution equation (through the drift and diffusion coefficients) were not discussed and were considered as given. However, a pdf description appears in a closed form only when:
(i) the stochastic process $\mathbf{Z}(t)$ is chosen,
(ii) the model for its time evolution equation is specified.
The form and the nature of the different models used for two-phase flow modelling will be presented in detail in later sections. In the present one, we discuss issues related to the choice of the stochastic process $\mathbf{Z}(t)$ which is used to describe a physical system. By considering different stochastic descriptions, either finer or coarser, different pdf equations result. It is important to be aware of the interplay between the different and increasingly coarser descriptions and the structure of the corresponding reduced pdf equations. In practice, the cornerstone of the contracted description is, of course, to be able to choose the 'correct' reduced number of variables, which must be small enough to make up a tractable system while still capturing the essence of the physics of the problem. The discussion on how to perform such a choice in some cases is postponed to the next section. In the present one, we limit ourselves to the technical presentation of this interplay, which manifests itself by various pdf hierarchies. These hierarchies will be referred to continuously in the rest of the paper. The general issue of a pdf hierarchy is first presented, and is then illustrated by two examples. The first hierarchy is very well known in Statistical Physics. However, the second hierarchy is not often described, though it is of the same nature. Both hierarchies appear constantly in the modelling considerations later on.

3.1. Complete and reduced pdf equations

Numerous physical situations fall into the category of what is called N-body problems. That is, we have $N$ objects, identical or not, which interact mutually. This situation can be loosely referred to as an N-particle problem by defining each 'object' as being a particle. This terminology will be retained here. In this general approach, each particle represents the particular value of a set of variables and is fully determined by the knowledge of these 'internal' variables. A classical example is molecular dynamics problems, where each particle represents a molecule and can be thought of as a point particle defined by the value of its location and velocity.
In another case, the knowledge of the state of each particle may require more variables. The way these N particles interact and influence one another is considered to be known when the state of the N particles is known, that is the mutual forces are internal with respect to the whole system made up by the ensemble of the N particles. The dimension of the system (or the number of degrees of freedom), $d = \dim(\mathbf{Z})$, is given by $d = N \times p$, where N is the number of particles included in the system and p represents the number of variables attached to each particle. For this system, the complete vector which gathers all available information is then
\[ \mathbf{Z} = (Z_1^1, Z_2^1, \ldots, Z_p^1;\; Z_1^2, Z_2^2, \ldots, Z_p^2;\; \ldots;\; Z_1^N, Z_2^N, \ldots, Z_p^N). \]
This vector is the state vector of the N-particle system. The vector defined by the p variables attached to each particle, $\mathbf{Z}^i = (Z_1^i, Z_2^i, \ldots, Z_p^i)$, is called the one-particle state vector, in this case for the particle labelled i. In practice the dimension of the system is huge (it might be infinite) and one has to come up with a reduced (or contracted) description, or in other words to consider a subset of dimension $d' = s \times p \ll d$. Such a reduced description is needed to achieve a practical formulation of the behaviour of the system, that is to formulate a set of equations in closed form which can be solved numerically with the help of modern computer technology. The key point is that, in the general case, such a contraction is followed by a loss of information and that knowledge of higher-order pdfs has to be provided through closure relations.
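The loss of information under contraction can be made concrete with a discrete toy case. The sketch below is not taken from the paper: the two-state variables, the joint laws `independent` and `correlated`, and the drift `drift(z_r, y) = z_r + y` are all hypothetical choices made for illustration. It shows two joint laws that share the same marginal for the retained variable but give different averaged drifts, so the marginal alone cannot close the reduced equation.

```python
from fractions import Fraction


def reduce_description(joint, drift):
    """Contract a discrete joint law p(z_r, y) onto the retained variable z_r.

    joint: dict mapping (z_r, y) -> probability.
    drift: function A(z_r, y).
    Returns (p_r, cond_drift): p_r[z_r] is the marginal law and
    cond_drift[z_r] is the average of A conditioned on z_r.
    """
    p_r, weighted = {}, {}
    for (z_r, y), prob in joint.items():
        p_r[z_r] = p_r.get(z_r, 0) + prob
        weighted[z_r] = weighted.get(z_r, 0) + drift(z_r, y) * prob
    cond_drift = {z_r: weighted[z_r] / p_r[z_r] for z_r in p_r}
    return p_r, cond_drift


def drift(z_r, y):
    # Hypothetical drift that depends on the discarded variable y.
    return z_r + y


quarter, half = Fraction(1, 4), Fraction(1, 2)
# Two hypothetical joint laws with the same marginal for z_r.
independent = {(0, 0): quarter, (0, 1): quarter, (1, 0): quarter, (1, 1): quarter}
correlated = {(0, 0): half, (1, 1): half}

p_ind, a_ind = reduce_description(independent, drift)
p_cor, a_cor = reduce_description(correlated, drift)
```

Both marginals are uniform on {0, 1}, yet the conditional drifts differ, which is the information a closure relation has to supply.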
To illustrate this problem, let us consider an N-particle system where the time evolution equation involves simply a deterministic force
\[ \frac{d\mathbf{Z}(t)}{dt} = \mathbf{A}(t, \mathbf{Z}(t)). \tag{69} \]
The dimension of the complete state vector $\mathbf{Z}$ is equal to $d$, and the corresponding pdf $p(t; \mathbf{z})$ verifies the Liouville equation
\[ \frac{\partial p(t; \mathbf{z})}{\partial t} + \frac{\partial}{\partial \mathbf{z}} \cdot \big(\mathbf{A}(t, \mathbf{z})\, p(t; \mathbf{z})\big) = 0. \tag{70} \]
This equation is closed since all the degrees of freedom of the system are explicitly tracked. We now consider a reduced pdf $p^r(t; \mathbf{z}^r)$ where $\dim(\mathbf{Z}^r) = d'$ and $p(t; \mathbf{z}) = p(t; \mathbf{z}^r, \mathbf{y})$ with, of course, $\dim(\mathbf{Y}) = d - d'$. By integration of the previous equation over $\mathbf{y}$, the transport equation for the marginal (reduced) pdf becomes
\[ \frac{\partial p^r(t; \mathbf{z}^r)}{\partial t} + \frac{\partial}{\partial \mathbf{z}^r} \cdot \big[\langle \mathbf{A} \,|\, \mathbf{z}^r \rangle\, p^r(t; \mathbf{z}^r)\big] = 0, \tag{71} \]
where the conditional expectation is defined by
\[ \langle \mathbf{A} \,|\, \mathbf{z}^r \rangle = \int \mathbf{A}(t, \mathbf{z}^r, \mathbf{y})\, p(\mathbf{y} \,|\, t; \mathbf{z}^r)\, d\mathbf{y} = \frac{1}{p^r(t; \mathbf{z}^r)} \int \mathbf{A}(t, \mathbf{z}^r, \mathbf{y})\, p(t; \mathbf{z}^r, \mathbf{y})\, d\mathbf{y}. \tag{72} \]
Eq. (71) is now unclosed. This illustrates the fact that when a reduced description (in terms of a subset of degrees of freedom) is performed, information is lost, and one has to come up with a closure equation for higher-order pdfs. We have moved from a complete description and therefore a closed pdf equation, Eq. (70), to a contracted description and thus an unclosed pdf equation, Eq. (71). At this point, two sets of reduced descriptions can be chosen in the N-particle example, by varying either the number of particles retained in the state vector of the reduced system or the number of variables attached to each particle. The first one corresponds to the classical BBGKY hierarchy (the initials are those of the authors who derived it independently: Bogoliubov, Born, Green, Kirkwood and Yvon) encountered in kinetic theory (p = 2), and is fully described in textbooks, for example [20,21]. In the second one, the dimension of the state vector is addressed from a single particle point of view, s = 1.

3.2. BBGKY hierarchy

Classical mechanical questions are well represented by N-particle deterministic problems, involving N particles of identical mass m in mutual interaction and with no external forces. The dimension of the one-particle state vector is, almost always, taken as two, including particle location and velocity. This is a consequence of the search for a kinetic description and of the hypothesis that forces derive from a location-dependent potential. Consequently, the dimension of the complete state vector is $d = 2 \times N$. The drift vector is $\mathbf{A} = (\mathbf{U}, \mathbf{F})$ where the mutual acceleration, taken in the direction $\mathbf{x}_i - \mathbf{x}_j$, is denoted $F_{ij}$ and is given in terms of a potential $\phi_{ij} = \phi(|\mathbf{x}_i - \mathbf{x}_j|)$, which is the mutual potential energy of the pair of particles (i, j). Therefore $m F_{ij} = \partial \phi_{ij} / \partial \mathbf{x}_i$ represents the force on particle i due to particle j. In the classical mechanical framework, a reduced description is meant as a description of the system using identical variables for each particle but using only a subset of the total number. The reduced pdf for a subset of
s particles, $p^s(t; \mathbf{y}_1, \mathbf{V}_1, \ldots, \mathbf{y}_s, \mathbf{V}_s)$, is written for the sake of simplicity as $p^s(t; 1, \ldots, s) = p^s$ and consequently, for integration, $d\mathbf{y}_s\, d\mathbf{V}_s$ reads $d s$. Integration of the Liouville equation yields (summation over the i index should not be confused with tensor notation as it represents the number of particles in the subset, that is a summation from 1 to s, with $1 \le s \le N$)
\[ L^s(p^s) + \frac{\partial}{\partial V_i} \sum_{j>s} \int F_{ij}\, p^N\, d(s+1) \ldots dN = 0, \tag{73} \]
where the $L^s$ operator is given by
\[ L^s(\cdot) = \frac{\partial\, \cdot}{\partial t} + \frac{\partial}{\partial y_i}(V_i\, \cdot) + \frac{\partial}{\partial V_i}\Big(\sum_{j=1}^{s} F_{ij}\, \cdot\Big). \tag{74} \]
Eq. (73) has been obtained by applying the correspondence between diffusion processes and the Fokker–Planck equation, and more especially between deterministic processes and the Liouville equation. It can also be derived using Classical Mechanics, i.e. the properties of the Liouville operator, Liboff [21], or the Hamiltonian, Balescu [22]. Noticing that (by permutation and variable changes)
\[ \sum_{j>s} \int F_{ij}\, p^N\, d(s+1) \ldots dN = (N - s) \int F_{i(s+1)}\, p^{s+1}\, d(s+1), \tag{75} \]
the following set of equations is obtained:
\[ L^s(p^s) + (N - s)\, \frac{\partial}{\partial V_i} \int F_{i(s+1)}\, p^{s+1}\, d(s+1) = 0, \tag{76} \]
which is a set of N coupled equations, often called the BBGKY hierarchy. This simply states that, for a deterministic ensemble of N particles, a contracted description of the system gives an unclosed equation for the reduced pdf, as illustrated by Eq. (76). For s = 1 (one-point pdf), one recognizes the kinetic equation, which involves the two-point pdf, and so on. At this point, it should be mentioned that, in the case of mutual interactions given by a simple potential, it was quite straightforward to illustrate the hierarchy of pdfs. However, in the case of discrete particles (or even fluid particles) carried by a turbulent fluid, for example, the force exerted on a particle does not have a simple analytical form since it depends simultaneously on all the other particles; in this case, the hierarchy problem is given by Eq. (71). Finally, this type of hierarchy is not a property of the pdf approach but is typical, in general, of problems where a reduction is made, as, for example, in the case of the Reynolds decomposition of the local instantaneous Navier–Stokes equations.

3.2.1. Normalization of the distribution function

In the previous approach, a pdf, $p(t; \mathbf{x})$, has been used: $p(t; \mathbf{x})\, d\mathbf{x}$ is the probability to find the system (the N particles) in a given state in the range $[\mathbf{x}, \mathbf{x} + d\mathbf{x}]$, cf. Section 2.2 (this can be understood more easily using the notion of an ensemble density function, introduced by Gibbs, cf. e.g. [21]). The marginal $p^s$ then represents the probability to find the reduced system (s particles) in a given state.
In many applications, as will be seen later, it is convenient to work with the s-tuple distribution function, $f^s(t; 1, \ldots, s)$, where $f^s(t; 1, \ldots, s)\, d1 \ldots ds$ represents the probable number of s-tuples in a given state in the range $[1, 1 + d1], \ldots, [s, s + ds]$ at time t. The relation between $p^s$ and $f^s$ is directly given by combinatorics, that is by the number of ways of taking s elements from a population of N elements, without replacement and of course with regard to order. The answer is the falling factorial $(N)_s = N!/(N-s)!$, that is
\[ f^s(t; 1, \ldots, s) = \frac{N!}{(N - s)!}\, p^s(t; 1, \ldots, s). \tag{77} \]
Normalization is given by
\[ \int f^s(t; 1, \ldots, s)\, d1 \ldots ds = \frac{N!}{(N - s)!} \int p^s(t; 1, \ldots, s)\, d1 \ldots ds = \frac{N!}{(N - s)!}, \tag{78} \]
and with $r < s \le N$, the r- and s-tuple distribution functions verify the following relation:
\[ f^r(t; 1, \ldots, r) = \frac{(N - s)!}{(N - r)!} \int f^s(t; 1, \ldots, s)\, d(r+1) \ldots ds. \tag{79} \]
The BBGKY hierarchy, Eq. (76), can be written in a slightly different form
\[ L^s(f^s) + \frac{\partial}{\partial V_i} \int F_{i(s+1)}\, f^{s+1}\, d(s+1) = 0. \tag{80} \]
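The combinatorial factors in Eqs. (77)–(79) can be checked with a few lines of arithmetic. The sketch below is only an illustration; the function names `tuple_count` and `contraction_factor` are hypothetical and do not come from the paper. It verifies that integrating Eq. (79) over the remaining r particles is consistent with the normalization of Eq. (78).

```python
import math
from fractions import Fraction


def tuple_count(n_particles, s):
    """Number of ordered s-tuples drawn without replacement: N!/(N-s)!."""
    return math.perm(n_particles, s)


def contraction_factor(n_particles, r, s):
    """Prefactor (N-s)!/(N-r)! relating f^r to the integral of f^s, r < s <= N."""
    if not 0 <= r < s <= n_particles:
        raise ValueError("expected 0 <= r < s <= N")
    return Fraction(math.factorial(n_particles - s),
                    math.factorial(n_particles - r))


# Consistency check for N = 5, r = 1, s = 3: the norm of f^s times the
# prefactor must equal the norm of f^r.
n, r, s = 5, 1, 3
lhs = tuple_count(n, r)
rhs = contraction_factor(n, r, s) * tuple_count(n, s)
```

Here `lhs` and `rhs` are the norms of $f^r$ obtained directly and through Eq. (79), respectively.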
3.3. Hierarchy between state vectors

The BBGKY hierarchy gives a comprehensive picture of the resulting modelling problem in the frame of Classical Mechanics. The issue is now to express the statistical effect of all the disregarded particles on the statistical properties of the small number (usually one or two) of particles that are kept in the state vector. In this hierarchy, the choice of the one-particle state vector and its dimension, here p = 2, remains unchanged. However, in different situations, various choices can be made for the one-particle state vector and it is useful to consider a second set of pdf equations which corresponds to different and increasing one-particle state vectors. This happens already when we consider an N-particle problem where the force acting on one particle due to the other ones can be any function of particle properties, for example a function of particle acceleration or other ‘internal’ particle properties. It is therefore important to express also the interplay between the choice of the one-particle state vector and the structure of the corresponding pdf equation, even when a given subset of s particles is considered. There is another strong justification for considering this second pdf hierarchy with respect to modelling purposes. Indeed, to obtain a closed pdf equation at some chosen level, a model must be introduced to simulate the behaviour of the degrees of freedom that are summed over. As will be explained in more detail in the following section in the case of white-noise terms, it is important to select the ‘correct’ variable that can be well modelled by a certain stochastic process. A very precise example of this choice will be given by the choice of the variable to model in the one-point particle pdf for two-phase flows, see Section 7. The BBGKY hierarchy was presented using a top-bottom approach, that is starting from the complete Liouville equation and deriving from it the different reduced descriptions. The
hierarchy between vector states will be presented here from a bottom-top approach, starting from the most reduced level and moving to higher levels while introducing modelling concerns. We consider only one particle (s = 1), and follow a presentation based on the historical case of a Brownian particle that will be taken up again in the next section. First of all, we can restrict ourselves to following the position of the particle (that was Einstein’s point of view with time steps that are large enough, see the next section). With that choice of the state vector $\mathbf{Z}(t) = (\mathbf{X}(t))$, the particle velocity is an external variable and the pdf equation for $p(t; \mathbf{y})$ is unclosed
\[ \frac{\partial p(t; \mathbf{y})}{\partial t} + \frac{\partial}{\partial \mathbf{y}} \cdot \big(\langle \mathbf{U} \,|\, \mathbf{y} \rangle\, p(t; \mathbf{y})\big) = 0. \tag{81} \]
To obtain a closed model, the effect of the particle velocity has to be replaced by a model
\[ \frac{dX^+(t)}{dt} = U^+(t) \;\Rightarrow\; \frac{dX(t)}{dt} = F[t, X(t)], \tag{82} \]
where the superscript + denotes the exact equation and $F[t, X(t)]$ represents a functional of the position $X(t)$. If the functional F is deterministic we end up with a reduced Liouville equation. However, if F is stochastic, the techniques of Section 2 may be applied. If this first picture is believed to be too crude, one can include the velocity of the particle in the state vector, which then becomes $\mathbf{Z}(t) = (\mathbf{X}(t), \mathbf{U}(t))$ (Langevin’s point of view). Now, the particle acceleration $\mathbf{A}(t)$ becomes an external variable and the corresponding pdf equation for $p(t; \mathbf{y}, \mathbf{V})$ is unclosed
\[ \frac{\partial p(t; \mathbf{y}, \mathbf{V})}{\partial t} + \frac{\partial (V_i\, p(t; \mathbf{y}, \mathbf{V}))}{\partial y_i} + \frac{\partial}{\partial V_i}\big(\langle A_i \,|\, \mathbf{y}, \mathbf{V} \rangle\, p(t; \mathbf{y}, \mathbf{V})\big) = 0. \tag{83} \]
To obtain a closed form, the acceleration has to be eliminated or replaced by a model
\[ \begin{cases} \dfrac{dX^+(t)}{dt} = U^+(t) \\[2mm] \dfrac{dU^+(t)}{dt} = A^+(t) \end{cases} \;\Rightarrow\; \begin{cases} \dfrac{dX(t)}{dt} = U(t) \\[2mm] \dfrac{dU(t)}{dt} = F[t, X(t), U(t)]. \end{cases} \tag{84} \]
It is thus clear that the second description encompasses the first one. It contains more information and in physical terms corresponds to a description performed with a finer resolution. From a modelling point of view the task is also different depending upon the choice of the one-particle state vector. In the first case (Einstein’s point of view), one has to model particle velocities. In the second case (Langevin’s point of view), one has to model particle accelerations. From the above example, a general picture emerges. We consider a one-particle reduced description (s = 1) but with many internal degrees of freedom, i.e. $\mathbf{Z}^1 = (Z_1^1, Z_2^1, \ldots, Z_p^1, \ldots)$. The complete one-particle state vector is written here for a particle labelled i = 1, but in a one-particle pdf description the label is irrelevant (the same would be valid for any particle i) and the superscript is therefore skipped in the following. If the time rate of change of the particle degrees of freedom has the following form:
\[ \frac{dZ_1}{dt} = g(t, Z_1, Z_2), \tag{85a} \]
\[ \frac{dZ_2}{dt} = g(t, Z_1, Z_2, Z_3), \tag{85b} \]
\[ \vdots \tag{85c} \]
\[ \frac{dZ_p}{dt} = g(t, Z_1, \ldots, Z_p, Z_{p+1}), \tag{85d} \]
\[ \vdots \tag{85e} \]
and if the chosen one-particle reduced state vector contains only a limited number of degrees of freedom, say p, $\mathbf{Z}^r = (Z_1, \ldots, Z_p)$, then the corresponding pdf equation for $p^r(t; z_1, z_2, \ldots, z_p)$ is unclosed since it involves an external variable, namely $Z_{p+1}$:
\[ \frac{\partial p^r}{\partial t} + \frac{\partial (g(t, z_1, z_2)\, p^r)}{\partial z_1} + \cdots + \frac{\partial (g(t, z_1, \ldots, z_p)\, p^r)}{\partial z_{p-1}} + \frac{\partial (\langle g(t, Z_1, \ldots, Z_p, Z_{p+1}) \,|\, \mathbf{Z}^r = \mathbf{z}^r \rangle\, p^r)}{\partial z_p} = 0. \tag{86} \]
To obtain a closed model, the external variable $Z_{p+1}$ must be expressed as a function of the variables contained in the chosen state vector, and the equations for the modelled system have the following form, with a model written $g_m$ for the time rate of change of $Z_p$:
\[ \frac{dZ_1}{dt} = g(t, Z_1, Z_2), \tag{87a} \]
\[ \frac{dZ_2}{dt} = g(t, Z_1, Z_2, Z_3), \tag{87b} \]
\[ \vdots \tag{87c} \]
\[ \frac{dZ_p}{dt} = g_m(t, Z_1, \ldots, Z_p). \tag{87d} \]
4. Stochastic diffusion processes for modelling purposes

The purpose of the present section is to show how stochastic processes can be used in applied situations for modelling issues. Indeed, we have seen in the previous section that the practical need to limit ourselves to reduced descriptions results in unclosed pdf equations. To obtain closed equations, the disregarded degrees of freedom may be replaced by stochastic models. The objective in this section is to try to clarify what is meant when a stochastic process is written to replace a real physical process. This is not always an easy question, though there are some situations when such a move is clear. For example, if we are dealing with a mechanical system subject to an external force $F(t)$ which fluctuates rapidly with a variance $\sigma^2(t)$ around a mean term $F_d(t)$, then the obvious model is to write the equivalent of Newton’s law as
\[ \frac{dX_t}{dt} = F(t) \;\Rightarrow\; dX_t = F_d(t)\, dt + \sigma(t)\, dW_t. \tag{88} \]
However, the situation is perhaps less clear when we are dealing with internal degrees of freedom. The methodology is thus detailed in the rest of this section starting with a ‘simple’ example.
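A stochastic equation of the form $dX_t = F_d(t)\,dt + \sigma(t)\,dW_t$ can be integrated numerically with the Euler–Maruyama scheme. The sketch below is a minimal illustration under stated assumptions rather than a method taken from the paper: the function name `euler_maruyama`, its arguments and the constant coefficients used in the usage lines are hypothetical.

```python
import math
import random


def euler_maruyama(x0, mean_force, sigma, t_end, n_steps, rng):
    """Integrate dX = F_d(t) dt + sigma(t) dW with the Euler-Maruyama scheme.

    mean_force and sigma are functions of time; rng is a random.Random
    instance. Returns the list of positions at the n_steps + 1 time levels.
    """
    dt = t_end / n_steps
    x, t = x0, 0.0
    path = [x]
    for _ in range(n_steps):
        # Wiener increment: Gaussian with zero mean and variance dt.
        dw = rng.gauss(0.0, math.sqrt(dt))
        x += mean_force(t) * dt + sigma(t) * dw
        t += dt
        path.append(x)
    return path


# Usage with hypothetical constant coefficients F_d = 2 and sigma = 0.5.
rng = random.Random(0)
path = euler_maruyama(0.0, lambda t: 2.0, lambda t: 0.5, 1.0, 100, rng)
```

With `sigma` set to zero the scheme reduces to the explicit Euler method for the deterministic part, which gives a simple sanity check.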
4.1. The shift from an ODE to a SDE

Let us consider the case of a system $X_t$ whose time rate of change is $Y_t$
\[ \frac{dX_t}{dt} = Y_t. \tag{89} \]
We consider that we are dealing with stochastic processes (due, for example, to random initial conditions) which are differentiable and can thus be handled with normal calculus rules. This gives
\[ \frac{d\langle X^2(t) \rangle}{dt} = 2 \int_0^t \langle Y(t)\, Y(t') \rangle\, dt'. \tag{90} \]
If we consider, for the sake of simplicity, $Y(t)$ as a stationary process and introduce its autocorrelation $R_y(s)$ defined by $R_y(s) = \langle Y(t)\, Y(t+s) \rangle / \langle Y^2 \rangle$, we can write
\[ \frac{d\langle X^2 \rangle}{dt} = 2 \langle Y^2 \rangle \int_0^t R_y(s)\, ds. \tag{91} \]
The important scale in that reasoning is the integral time scale of $Y(t)$, say T, which is defined as the integral of the autocorrelation
\[ T = \int_0^\infty R_y(s)\, ds. \tag{92} \]
This time scale is a measure of the ‘memory’ of the process. If we consider time intervals s small with respect to T, successive values of $Y(t)$ are well correlated. On the other hand, successive values of $Y(t)$ over time intervals that are large with respect to T are nearly uncorrelated. Therefore, in this second limit, we have
\[ \text{for } t \gg T, \quad \int_0^t R_y(s)\, ds \sim T \;\Rightarrow\; \langle X^2 \rangle \simeq 2 \langle Y^2 \rangle\, T \times t, \tag{93} \]
that is the mean square of $X(t)$ varies linearly with the time interval, here t. This is the ‘diffusive regime’. It should be noted that this regime is always reached (for long enough time spans) and that, once it is reached, the behaviour of $\langle X^2 \rangle$ does not depend on the particular form of $R_y(s)$ but simply on two mean quantities, namely the variance and integral time scale of $Y(t)$. This reasoning is certainly not new. Applied to the position and velocity of a fluid particle, this point was described by Taylor in 1921 and has been detailed in most textbooks. However, we are not simply interested in reformulating known results concerning the statistics of $X(t)$ but in modelling the instantaneous trajectories. Indeed, if we assume that the trajectories of $X(t)$ are continuous, the previous result suggests that, in the range $t \gg T$, $X(t)$ can be seen as a Wiener process, that is undergoing a random walk. The previous behaviour is obtained with a finite time difference and by first introducing T and then making t or $\Delta t$ large enough. The reasoning can be reversed to reveal what the introduction of a white noise means. We still consider $X_t$ whose time rate of change is $Y_t$. Let us consider that there is a separation of scales: we introduce a time step $\Delta t \sim dt$ representing the time interval over which we observe the process $X_t$. This time increment $dt$ is therefore assumed to be small with respect to a characteristic time of $X_t$. Nevertheless, we assume that the integral time scale of $Y_t$, T, is very small with respect to $dt$. Thus, $Y_t$ is a fast and rapidly changing
variable. Actually, we would like to take directly the limit $T \to 0$, since $dt$ is assumed to be arbitrarily small. Yet, if we take that limit, assuming that $Y_t$ is a normal process having a finite variance, Eq. (93) shows that the effect of the fluctuations of $Y_t$ vanishes completely. Consequently, to retain a finite limit when $T \to 0$, we are forced to consider that $\langle Y^2 \rangle$ becomes arbitrarily large in the sense that
\[ \langle Y^2 \rangle \xrightarrow[T \to 0]{} +\infty \quad \text{such that} \quad \langle Y^2 \rangle\, T \xrightarrow[T \to 0]{} D, \tag{94} \]
where D is a finite constant. In that case, the modelling step consists in replacing the differentiable process $Y_t$ by a white noise and writing that $X_t$ becomes a diffusion process defined by the SDE
\[ \frac{dX_t}{dt} = Y(t) \;\xrightarrow[T \to 0]{}\; dX_t = \sqrt{2D}\, dW_t, \qquad D = \lim_{T \to 0} \langle Y^2 \rangle\, T. \tag{95} \]
By making this step, $X_t$ becomes a Markov process since the memory of $Y_t$ becomes infinitesimally small. It also implies that some ‘information’ has been lost (the information associated with $Y_t$) in an irreversible way. The significance of this modelling step can be further clarified by writing the consequences in the pdf equation. If $p(t; y)$ is the pdf associated with the process $X_t$, we have
\[ \frac{\partial p}{\partial t} = -\frac{\partial [\langle Y_t \,|\, X_t = y \rangle\, p]}{\partial y} \;\xrightarrow[T \to 0]{}\; \frac{\partial p}{\partial t} = D\, \frac{\partial^2 p}{\partial y^2}, \tag{96} \]
which shows that we have in fact introduced a ‘transport coefficient’, namely D. The discussion above is presented in the framework of continuous-time stochastic processes, and to be put on firm mathematical grounds the limit expressed in Eq. (94) is required. On a discrete time basis, the time scale T of $Y_t$ does not have to go exactly to zero. What is required is that this very time scale be small with respect to the time step, which is the reference time scale $\Delta t$ we have introduced right at the outset. It is important to realize that in practice the introduction of a white-noise term is a relative notion. With regard to one time scale, another process is assumed to vary ‘sufficiently quickly’. Therefore, the details of this fast process are not crucial: the wild variations can be expressed by a Wiener increment. Yet, the eliminated fast process leaves its trace through its variance and integral time scale, which define the transport coefficient D. Using a discrete representation of $X_t$, this step can be expressed by
\[ \Delta X(t) = X(t + \Delta t) - X(t) = \int_t^{t + \Delta t} Y_s\, ds \;\xrightarrow[T \ll \Delta t]{}\; \Delta X(t) = \int_t^{t + \Delta t} \sqrt{2D}\, dW_s. \tag{97} \]
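The transition from the correlated regime to the diffusive one can be checked in closed form for one particular autocorrelation. The sketch below assumes an exponential form $R_y(s) = e^{-s/T}$, which is an illustrative choice and is not required by the argument above; the function names are hypothetical. Integrating Eq. (91) with this form gives $\langle X^2 \rangle(t) = 2 \langle Y^2 \rangle T\,[t - T(1 - e^{-t/T})]$ for $X(0) = 0$.

```python
import math


def msd_exponential(t, var_y, T):
    """<X^2>(t) from Eq. (91) for the assumed form R_y(s) = exp(-s/T), X(0) = 0."""
    # expm1 keeps the short-time cancellation accurate.
    return 2.0 * var_y * T * (t + T * math.expm1(-t / T))


def msd_diffusive(t, var_y, T):
    """Diffusive-regime estimate 2 <Y^2> T t of Eq. (93)."""
    return 2.0 * var_y * T * t


# Long times (t >> T): the exact result approaches the diffusive estimate.
ratio_long = msd_exponential(1000.0, 1.0, 1.0) / msd_diffusive(1000.0, 1.0, 1.0)

# White-noise limit: T -> 0 with <Y^2> T = D held fixed gives <X^2> -> 2 D t.
D, T_small, t_obs = 1.0, 1.0e-4, 1.0
msd_white = msd_exponential(t_obs, D / T_small, T_small)

# Short times (t << T): ballistic behaviour <X^2> ~ <Y^2> t^2.
ratio_short = msd_exponential(1.0e-3, 1.0, 1.0) / (1.0 * (1.0e-3) ** 2)
```

The three numbers illustrate, in order, the diffusive regime, the limit of Eq. (94), and the short-time regime in which the memory of $Y_t$ still matters.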
4.2. Modelling principles

All the necessary elements have been given in the above example and can be developed in a more complex context to propose a general methodology. The idea is that introducing a local closure in an (open) set of equations means a Markovian approximation. Such an approximation
can be justified by a coarse-graining procedure, that is by observing the system on ‘large enough’ time intervals. This is precisely what we did in the previous example by taking a not-too-small $dt$ in order to disregard the ‘information’ related to $Y_t$ and to retain only its effects on $X_t$ through the coefficient D. The success of such a procedure will therefore rest upon a satisfactory choice of the ‘size of the grain’ (in practice a time or length scale) and upon a separation of scales as explained in Section 3. Let us build on these ideas in a complex situation to help us select the proper degrees of freedom to retain in the state vector. As explained in the previous sections, for the case of N interacting particles, and even if we limit ourselves to characterizing the statistical behaviour of one particle (one-point pdf), we still have a huge (maybe infinite) number of degrees of freedom. We could limit ourselves to the position of the particle, say $X_t$, or include its velocity to have $(X_t, U_t)$, or also its acceleration $(X_t, U_t, A_t)$ and so on. Using the language of Statistical Mechanics or of Synergetics [23,24], the principle is to introduce first a reference scale which in our example with one particle would be a reference time scale $dt$. Then, the degrees of freedom written as $\mathbf{Z}_t = (Z_1, \ldots, Z_n)$ are classified with respect to that scale as slow and fast variables,
\[ (Z_1, Z_2, \ldots, Z_n, \ldots), \]
where the reference scale marks the boundary between the two classes in this list. A slow variable is a variable whose integral time scale T is greater than the reference scale $dt$ while fast variables are those with an integral time scale $\tau$ smaller than the reference scale,
\[ \tau \ll dt \ll T. \tag{98} \]
The guiding principle is then to retain only the slow modes or variables in the state vector used to build the model and to ‘eliminate’ the fast ones. The latter modes are eliminated by expressing them as functions of the slow ones. This is called the slaving principle [23] and is in fact an equilibrium hypothesis. The fast modes are assumed to relax ‘very rapidly’ to equilibrium values or distributions which are determined or parameterized by the values taken by the slow modes. This corresponds to sorting out the degrees of freedom in terms of solutions of transport equations and local source terms. The slow modes $(Z_1, Z_2, \ldots, Z_d)$ that are kept in the state vector will satisfy differential equations while the fast ones $(Z_{d+1}, Z_{d+2}, \ldots)$ will be given by algebraic relations. In fluid mechanics applications, statistics on the slow modes will be solutions of transport equations while statistics on the fast modes will appear as local source terms. Of course, this procedure will be successful if there exists a clear separation of scales between the integral time scales of the slow modes and of the fast ones. This was indeed the case in the previous example, and it allows the fast modes to be replaced by white noise or increments of Wiener processes. In the general case, there is no such clear-cut separation and replacing the fast variables by white-noise terms appears as a less justified approximation. However, the interest of this principle is at least to provide a convenient and coherent framework and to suggest in practice which variables have the ‘best chances’ to be replaced by a model.
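The classification step can be sketched as a small bookkeeping routine. Everything in the example below is hypothetical and only illustrates the principle: the function names, the variable labels and the numerical time scales are not taken from the paper.

```python
def classify_modes(time_scales, dt):
    """Split degrees of freedom into slow and fast ones.

    time_scales: dict mapping a variable name to its integral time scale.
    dt: reference (observation) time scale.
    A variable is 'slow' if its time scale exceeds dt, 'fast' otherwise.
    """
    slow = [name for name, scale in time_scales.items() if scale > dt]
    fast = [name for name, scale in time_scales.items() if scale <= dt]
    return slow, fast


def scale_separation(time_scales, dt):
    """Ratio of the shortest slow time scale to the longest fast one.

    A large ratio supports replacing the fast modes by white-noise terms.
    Assumes that both classes are non-empty.
    """
    slow, fast = classify_modes(time_scales, dt)
    shortest_slow = min(time_scales[name] for name in slow)
    longest_fast = max(time_scales[name] for name in fast)
    return shortest_slow / longest_fast


# Hypothetical time scales (arbitrary units) for position, velocity, acceleration.
scales = {"X": 8.0, "U": 1.0, "A": 0.015625}
slow, fast = classify_modes(scales, dt=0.25)
separation = scale_separation(scales, dt=0.25)
```

With these numbers the position and velocity are retained in the state vector, while the acceleration is the candidate for replacement by a model.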
4.3. Example for typical stochastic models

A typical example of this reasoning is the historical case of a Brownian particle. This example was already used in Section 3 to illustrate the pdf hierarchy with respect to increasing one-particle state vectors. We can return to that case and go one step further by introducing specific models following the general methodology above. The first and simplest description retains only the position of the particle (that was Einstein’s point of view). With that choice of the state vector $\mathbf{Z}_t = (X_t)$, the particle velocity is an external variable and has to be eliminated to obtain a closed model, as already explained in Section 3. When a large enough time step $\Delta t$ or $dt$ is used, the particle velocity can be regarded as a fast variable and the resulting stochastic model for the Brownian particle location is expressed by
\[ \frac{dX_t}{dt} = U_t \;\to\; dX_t = \sqrt{2D}\, dW_t. \tag{99} \]
That procedure implicitly assumes that the time scale of the particle velocity $U_t$, say $T_U$, is small with respect to $dt$. The corresponding pdf equation is a simple diffusion equation in sample space (identical to a heat equation)
\[ \frac{\partial p}{\partial t} = D\, \frac{\partial^2 p}{\partial y^2}. \tag{100} \]
The correlation between successive particle locations (for a particle starting from $X_0 = 0$) is given by
\[ \langle X_t X_s \rangle = 2D\, \min(t, s). \tag{101} \]
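The correlation of successive positions can be checked by direct sampling. For $X_t = \sqrt{2D}\, W_t$ started from $X_0 = 0$, the covariance is $2D \min(t, s)$, which reduces to $\min(t, s)$ when $2D = 1$. The sketch below is a Monte Carlo illustration only; the function names, the seed and the sample size are hypothetical choices.

```python
import math
import random


def sample_positions(D, s, t, rng):
    """Draw (X_s, X_t), with s <= t, for dX = sqrt(2 D) dW and X_0 = 0."""
    # Independent Gaussian increments over [0, s] and [s, t].
    x_s = math.sqrt(2.0 * D * s) * rng.gauss(0.0, 1.0)
    x_t = x_s + math.sqrt(2.0 * D * (t - s)) * rng.gauss(0.0, 1.0)
    return x_s, x_t


def estimate_covariance(D, s, t, n_samples, seed=0):
    """Monte Carlo estimate of <X_s X_t>."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x_s, x_t = sample_positions(D, s, t, rng)
        total += x_s * x_t
    return total / n_samples


# With D = 0.5 (so that 2 D = 1), s = 0.5 and t = 1, the target value is 0.5.
estimate = estimate_covariance(D=0.5, s=0.5, t=1.0, n_samples=40000)
```

The estimate fluctuates around the target with a statistical error of order $1/\sqrt{n}$, so only agreement within that error should be expected.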
In Einstein’s picture, particle velocities do not exist. If this first picture is believed to be too crude, one can include the velocity of the particle in the state vector, which then becomes $\mathbf{Z}_t = (X_t, U_t)$ (Langevin’s point of view). Now, the particle acceleration $A_t$ becomes an external variable that has to be eliminated. The model proposed by Langevin is written as
\[ \begin{cases} \dfrac{dX_t}{dt} = U_t \\[2mm] \dfrac{dU_t}{dt} = A_t \end{cases} \;\to\; \begin{cases} dX_t = U_t\, dt \\[1mm] dU_t = -\dfrac{U_t}{T}\, dt + \sqrt{K}\, dW_t \end{cases} \tag{102} \]
and the corresponding pdf equation for $p(t; y, V)$ is
\[ \frac{\partial p}{\partial t} + V \frac{\partial p}{\partial y} = \frac{\partial}{\partial V}\left[\frac{1}{T}\, V p\right] + \frac{1}{2}\, \frac{\partial^2 [K p]}{\partial V^2}. \tag{103} \]
The correlation between successive particle velocities is now given by
\[ \langle U_t U_{t'} \rangle = \langle U_0^2 \rangle\, e^{-(t + t')/T} - \frac{K T}{2}\, e^{-(t + t')/T} + \frac{K T}{2}\, e^{-|t - t'|/T}. \tag{104} \]
When both times are long enough with respect to the initial time of the process, the correlation takes the simplified form
\[ t, t' \gg T: \quad \langle U_t U_{t'} \rangle = \frac{K T}{2}\, e^{-|t - t'|/T}. \tag{105} \]
This reveals that the time scale T used in the stochastic velocity equation is the time scale of particle velocity correlations since
\[ R_U(s) = e^{-|s|/T} \quad \text{and thus} \quad \int_0^\infty R_U(s)\, ds = T. \tag{106} \]
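The velocity correlation of the Langevin model and its integral time scale can be evaluated numerically as a check. The sketch below encodes the correlation of Eq. (104) in the form $(\langle U_0^2 \rangle - KT/2)\, e^{-(t+t')/T} + (KT/2)\, e^{-|t-t'|/T}$ and integrates the exponential autocorrelation with a trapezoidal rule. The function names, the truncation of the integral and the parameter values are hypothetical choices made for illustration.

```python
import math


def velocity_correlation(t, tp, u0_sq, K, T):
    """<U_t U_t'> for dU = -(U/T) dt + sqrt(K) dW with <U_0^2> = u0_sq."""
    stationary = 0.5 * K * T  # long-time variance K T / 2
    return ((u0_sq - stationary) * math.exp(-(t + tp) / T)
            + stationary * math.exp(-abs(t - tp) / T))


def integral_time_scale(T, cutoff=40.0, n_steps=20000):
    """Trapezoidal estimate of the integral of exp(-s/T) over [0, cutoff*T]."""
    h = cutoff * T / n_steps
    total = 0.5 * (1.0 + math.exp(-cutoff))
    for k in range(1, n_steps):
        total += math.exp(-k * h / T)
    return total * h


# Long after the initial time, only the stationary part survives.
late = velocity_correlation(50.0, 51.0, u0_sq=3.0, K=2.0, T=1.0)
stationary_value = 0.5 * 2.0 * 1.0 * math.exp(-1.0)
time_scale = integral_time_scale(T=0.5)
```

The first comparison mirrors the passage from the full correlation to its long-time form, and the second recovers T as the integral of the autocorrelation.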
The Langevin model has better support if the acceleration can easily be replaced by a model. In the case of a Brownian particle, the acceleration is due to the large number of collisions with fluid molecules. Due to the large inertia of the Brownian particle compared to the inertia of fluid molecules, we can select a time step which is small with respect to the time scale of particle velocities and yet large with respect to the time scale of fluid molecule velocities. The motion of these molecules can thus be seen as a fast and purely random process. The total action of the collisions is written as the sum of two contributions: a purely deterministic one opposed to the Brownian particle motion and a purely random one expressed as a white-noise process. For that precise example the complete form of the Langevin model is written, with $k_B$ the Boltzmann constant, $\beta$ the friction coefficient and $\Theta$ the fluid temperature, as
\[ dX_t = V_t\, dt, \tag{107} \]
\[ dV_t = -\beta V_t\, dt + \sqrt{2 k_B \Theta \beta}\, dW_t. \tag{108} \]
The Langevin model is really the archetype of stochastic processes for fluid dynamical modelling problems and will be extensively referred to in the next chapters. It is therefore important to be aware of its physical justification and, consequently, of its inherent limitations. In Langevin’s picture, one part of the particle acceleration is taken as a fast process and replaced by a white-noise term. Consequently, information related to the acceleration is lost. If such information is needed, or if acceleration cannot be seen as infinitely fast, the same procedure can be pursued by shifting the introduction of the necessary model to the time rate of change of $A_t$. A useful model can be written as
\[ dX_t = U_t\, dt, \quad \ldots \tag{109} \]

[...] or the fluid viscosity. To focus the discussion on this issue, we limit ourselves in this section to passive scalars but we still consider one-particle pdf descriptions of the scalar field. That problem appears simply when we have to simulate thermal effects, for example heat exchanges between the fluid and the particles. Following our Lagrangian point of view, this is done by assigning to each discrete particle a new variable which represents the particle temperature $T_p$. The simplest model which accounts for the heat exchange between the fluid and the particles (with no mass exchange) relies on a macroscopic coefficient, the heat transfer coefficient $h_{fp}$, and has the form
\[ \frac{dT_p}{dt} = \frac{6 h_{fp}}{\rho_p d_p C_{pp}}\, (T_s - T_p), \tag{533} \]
where $C_{pp}$ is the particle heat capacity and $\rho_p$ the particle density. The heat transfer coefficient $h_{fp}$ is usually given by empirical expressions in terms of the non-dimensional Nusselt and Prandtl numbers
\[ Nu = \frac{h_{fp}\, d_p}{\lambda_f}, \qquad Pr = \frac{\nu_f}{\Gamma}, \tag{534} \]
which have for example the form
\[ Nu = 2 + 0.6\, Re_p^{1/2}\, Pr^{1/3}. \tag{535} \]
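The heat-exchange model can be sketched numerically. The example below rests on explicit assumptions: the relaxation rate in Eq. (533) is taken as $6 h_{fp} / (\rho_p d_p C_{pp})$ with $\rho_p$ the particle density (the form required dimensionally for a sphere when $C_{pp}$ is a specific heat), the fluid temperature seen $T_s$ and $h_{fp}$ are held constant, and all numerical values and function names are hypothetical.

```python
import math


def nusselt(re_p, pr):
    """Empirical correlation Nu = 2 + 0.6 Re_p^(1/2) Pr^(1/3) of Eq. (535)."""
    return 2.0 + 0.6 * math.sqrt(re_p) * pr ** (1.0 / 3.0)


def thermal_relaxation_time(rho_p, d_p, c_pp, h_fp):
    """Time scale rho_p d_p C_pp / (6 h_fp); rho_p is an assumed factor."""
    return rho_p * d_p * c_pp / (6.0 * h_fp)


def particle_temperature(t, t_p0, t_s, tau):
    """Solution of dT_p/dt = (T_s - T_p) / tau for constant T_s and tau."""
    return t_s + (t_p0 - t_s) * math.exp(-t / tau)


# Hypothetical values in SI units.
nu_rest = nusselt(0.0, 0.7)   # particle at rest relative to the fluid
nu_flow = nusselt(100.0, 1.0)
tau = thermal_relaxation_time(rho_p=1000.0, d_p=1.0e-4, c_pp=4000.0, h_fp=2000.0)
temp_late = particle_temperature(50.0 * tau, t_p0=300.0, t_s=350.0, tau=tau)
```

For a constant fluid temperature the particle temperature relaxes exponentially towards $T_s$. In the actual modelling problem $T_s$ fluctuates along the particle trajectory, which is where the closure issue discussed in the text arises.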
These equations mirror the ones that express the discrete particle momentum equation, Eq. (281) and Eq. (285), with $h_{fp}$ playing in the discrete particle temperature equation the role of the drag coefficient $C_D$ in the momentum equation. In Eq. (533), $T_s$ stands for the instantaneous temperature of the fluid seen, $T_s(t) = T_f(t, \mathbf{x}_p)$, which implies modelling issues similar to the ones detailed in Section 7.4.1 for the velocity of the fluid seen. Even if we neglect the crossing-trajectory effect and regard $T_s$ as having the same statistics as for a fluid particle, we still have to model this instantaneous fluid temperature. For the sake of simplicity, and since this does not change the modelling problem that we would like to bring up in this section, we consider only the fluid case. We also generalize the discussion to include, not specifically the fluid temperature, but any scalar whose local exact equation involves a molecular transport coefficient. The background is therefore provided by Section 6. We follow mainly a Lagrangian point of view in the rest of this section, the Eulerian pdf being retrieved from the Lagrangian one through the general relations, see Section 6.4.3. The one-particle pdf equation for the pdf $p^L(t; \mathbf{y}, \psi)$ can be obtained either from the exact instantaneous field equation, Eq. (117), by applying directly the techniques of Section 3, or by starting
from the pdf equation satisAed by the joint velocity–scalar pdf, Eq. (231) in Section 6.5.2. Indeed, pL (t; y; ) is simply the marginal of the joint velocity–scalar pdf pL (t; y; V; ) L (536) p (t; y; ) = pL (t; y; V; ) d V : From Eq. (231), by integration over velocity variables we obtain the pdf equation for pL (t; y; ) 9[>Q |( = )pL ] 9[S( )pL ] 9p L 9[Ui |( = )pL ] =− − + : (537) 9t 9yi 9 9 This equation illustrates once again the interplay between closure terms and the hierarchy of di8erent descriptions. At the level where we handle the joint values of the velocity and of the scalar, that is when the one-particle state vector is Z = (y; V; ), turbulent uxes are closed and are treated without approximation. By ‘going down’ to the reduced vector state Z = (y; ), the e8ect of velocity (which has now become an external variable whereas it was an internal variable in the former case) represented by the mean conditional Ui |( = ) has to be closed. This shows that one may have an interest in staying at the upper oor even though one is mainly interested in the scalar statistics. Yet, we leave out that question and concentrate on the terms on the rhs of the pdf equation. It can be shown that the term which involves the molecular transport coe@cient, here the scalar di8usivity >, can be exactly re-expressed as the sum of two contributions [6,7,11] 9p L 9[Ui |( = )pL ] 92 [>pL ] 92 [+ |( = )pL ] 9[S( )pL ] = − − + ; (538) 9t 9yi 92 9 9yi2 where the Arst term on the rhs is negligible at high Peclet numbers and where + is the scalar dissipation 9 (t; x) 2 + = > : (539) 9xi In the above scalar pdf equation, we end up with the same form as with the viscous terms of the momentum equation, Eq. (235) in Section 6.6. This is a general result for all molecular transport terms in the exact Aeld equations. Such terms which are di8usive in physical space but yield a negative coe@cient (anti-di8usion) in sample space. 
For the fluid particle velocity model developed in Section 6.6, this negative coefficient was later compensated by a larger positive term arising from the model of the fluctuating pressure gradient, so that the negative square root that one would be tempted to write in the corresponding equations of the trajectories of the process was a mere formal intermediate. For micro-mixing models there is (unfortunately) no such supplementary term and one is faced with the difficulty of modelling an anti-diffusion coefficient. At high Péclet numbers, without reactive source terms ($S = 0$), and approximating the conditional scalar dissipation by its unconditional form, $\langle \varepsilon_\phi \,|\, \phi = \psi \rangle \simeq \langle \varepsilon_\phi \rangle$, we have
$$ \frac{\partial p^L}{\partial t} + \frac{\partial [\langle U_i \,|\, \phi = \psi \rangle\, p^L]}{\partial y_i} = -\frac{\partial^2 [\langle \varepsilon_\phi \rangle\, p^L]}{\partial \psi^2}. \tag{540} $$
The scalar modelling issue, referred to as the micro-mixing problem, is to construct the term in the equations of the trajectories of the process that corresponds to this second-derivative form in the pdf equation, following the equivalence between the trajectory and
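the pdf point of view for stochastic processes (see Section 2). In the absence of velocity effects, this is illustrated by the following sketch where the question mark indicates the unknown and required term:
$$ d\phi = \; ? \quad \longleftrightarrow \quad \frac{\partial p^L}{\partial t} = -\frac{\partial^2 [\langle \varepsilon_\phi \rangle\, p^L]}{\partial \psi^2}. \tag{541} $$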
The scalar pdf equation looks very much like a Fokker–Planck equation (see Section 2.8) and the required model term for the trajectories of the process should be something like a Langevin stochastic differential equation. However, this cannot really be a Fokker–Planck equation since the coefficient appearing in front of the second-order derivative is negative whereas the similar coefficient in an actual Fokker–Planck equation is always positive, as shown in Eq. (40) or in Eq. (59) for the multi-dimensional case. The micro-mixing issue, or anti-diffusion behaviour, is bound to raise a great many eyebrows and perhaps even downright suspicion, particularly for the mathematically oriented reader, since the equation is not well posed. It is thus useful to try to provide further physical explanations and a general picture for the origin of this behaviour. Let us go back to the basic physical ideas leading to Langevin and Fokker–Planck diffusive equations, which are developed in Sections 2 and 4, and which seem well in place. From Section 2.8, we know that the existence of a positive second-order term in a partial differential equation of the convection–diffusion type is equivalent to the existence of a white-noise term in the particle evolution equation. If $\phi(t, x)$ is the solution of a 1D heat transfer equation, $\phi(t, x)$ can be interpreted as the law of a stochastic process, say $X$, whose trajectories undergo random walks. This can be represented as
$$ dX = \sqrt{2\Gamma}\, dW \quad \longleftrightarrow \quad \frac{\partial \phi(t, x)}{\partial t} = \Gamma\, \frac{\partial^2 \phi(t, x)}{\partial x^2}. \tag{542} $$
The stochastic equation for the trajectories of the process (which we will call particle equations) helps to bring out the underlying physics. A particle dynamics leads to a diffusive behaviour because it is subject to random forces or kicks from its environment.
The 'force' acting on this particle (the rate of change of the state variables considered) is taken as a fast variable (rapidly changing or with no memory) and as being independent of the actual state of the system considered, resulting in the white-noise term $dW$. The solution $\phi(t, x)$ of the PDE appears as a 'macroscopic' quantity and represents the mean or averaged behaviour of the underlying 'microscopic' constituents, see Section 6.2. These 'microscopic' constituents can be seen as the carriers of the related information (here it would be thermal energy or their kinetic energy) which they carry and propagate through the domain. In our case, these microscopic constituents are the molecules of the fluid. The emerging picture is thus: the temperature field $\phi(t, x)$ diffuses in space because each small (but macroscopic with respect to the molecules) volume exchanges molecules very rapidly with the surrounding small volumes of fluid. If we select an observation time scale (the incremental time interval in the differential equations) small with respect to the evolution of the small element of fluid but much larger than molecular time scales, and a length scale (the dimension of the small fluid elements) much larger than molecular sizes, these molecules can be regarded as being independent and varying infinitely fast. This corresponds to the discussion of the chosen 'observation' time scale in Section 4 and is in line with the
general picture given above. Such a modelling can only represent a system whose variance $\langle X^2 \rangle$ constantly increases, as a direct consequence of the Langevin stochastic equation in Eq. (542):
$$ d\langle X^2 \rangle = 2\Gamma\, dt > 0. \tag{543} $$
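The link between the random-walk equation in Eq. (542) and the growth of the variance can be checked with a short numerical experiment. The following is only an illustrative sketch: the function name `simulate_random_walk`, the value of the diffusivity and the numerical parameters are assumptions made for the example and are not taken from the text.

```python
import numpy as np

def simulate_random_walk(gamma, dt, n_steps, n_particles, seed=0):
    """Integrate dX = sqrt(2*gamma) dW for independent particles starting at X = 0.

    Returns an array of shape (n_steps, n_particles) holding the particle
    positions after each time step.
    """
    rng = np.random.default_rng(seed)
    # Wiener increments: independent Gaussians with zero mean and variance dt
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_steps, n_particles))
    return np.cumsum(np.sqrt(2.0 * gamma) * dW, axis=0)

gamma = 0.5            # diffusivity (arbitrary illustrative value)
dt = 1.0e-2            # time step
n_steps = 200
n_particles = 20000

X = simulate_random_walk(gamma, dt, n_steps, n_particles)
times = dt * np.arange(1, n_steps + 1)
variance = X.var(axis=1)   # ensemble variance over the particles at each time

print("final time      :", times[-1])
print("sample variance :", variance[-1])
print("2 * gamma * t   :", 2.0 * gamma * times[-1])
```

With these values the sample variance at the final time should be close to $2\Gamma t$, up to statistical noise, and it grows steadily along the way, which is the behaviour that a white-noise term alone can represent.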
This variance can be interpreted as the entropy of the system: it increases as the molecules become increasingly well mixed throughout the domain. The probabilistic description performed in the present pdf methods is different. As emphasized in Section 6.2, we actually try to follow the reverse direction compared with the previous explanation of hydrodynamical diffusion and molecular random walks. We start at the hydrodynamical level and interpret the field as an ensemble of $N$ interacting fluid particles. These fluid particles are therefore small elements of fluid (we are within the framework of continuum mechanics) or, in other words, each fluid particle is a large-scale particle (compared to molecules) or a cluster gathering a large number of molecules. We then try to describe not the dynamics of this $N$-particle problem but rather $N$ one-particle dynamical problems. The difference between the points of view may be illustrated by the following sketch:
$$ \underbrace{dX = \sqrt{2\Gamma}\, dW}_{\text{molecular level}} \;\longrightarrow\; \underbrace{\frac{\partial T(t, x)}{\partial t} = \Gamma\, \frac{\partial^2 T(t, x)}{\partial x^2}}_{\text{hydrodynamical level}} \;\longrightarrow\; \underbrace{\frac{\partial p^L}{\partial t} = -\frac{\partial^2 [\langle \varepsilon_\phi \rangle\, p^L]}{\partial \psi^2}}_{\text{one-particle PDF level}} \tag{544} $$
$$ \xrightarrow{\quad \text{increasingly coarser description} \quad} \tag{545} $$
If we consider a volume of fluid that contains a number of these fluid particles and that we describe as statistically homogeneous, then the fluid particles contained in that volume interact with one another through the exchange of molecules. We could explicitly calculate these interactions, in a particle formulation for example through the use of the particle strength exchange (PSE) method developed in vortex methods [116]. However, in turbulence these interactions are small-scale forces. Most of the energy exchanged in the process is due to interactions that take place within a distance which scales with the Kolmogorov inner scale $\eta$. An explicit calculation of this phenomenon would require a sufficient number of particles to be present within a distance of order $\eta$ and to follow the time evolution with a time step of the order of the Kolmogorov time scale $\tau_\eta$. These are the observation scales mentioned in the previous paragraph, which are macroscopic time scales with respect to molecular scales. This would be like a direct simulation and this is precisely what we would like to avoid in the present pdf models! We now want to express the dynamics and the time evolution of the temperature attached to each particle at a much larger time scale, see Sections 4.2 and 6.8. We are looking for a coarser description without explicitly computing the local interactions of the particle considered with all its neighbours. We simply want to account for the resulting effect of these molecular interactions on the one-particle pdf. What is this resulting effect? Let us consider a number of fluid elements or particles that have different temperatures (our considered scalar). In the one-particle pdf sample space, this means that the pdf is spread since there is a range (either discrete or continuous) of possible values. Then, as time goes by, these temperatures will be smoothed out by the action of the fluid diffusivity and will tend towards a single value.
This smoothing action reflects the exchange of molecules between the fluid particles. At the time scale
chosen in the pdf description, larger than the Kolmogorov scale which in turn is larger than the molecular time scale, this exchange is random and infinitely fast. The resulting mixing effect is irreversible at the pdf level (thus perhaps the second-order derivative) and in the corresponding sample space the pdf tends towards a Dirac delta distribution. In the pdf description, the mean value of the fluid particle scalar does not change but the variance decreases:
$$ d\langle \phi'^2 \rangle = -\langle \varepsilon_\phi \rangle\, dt < 0. \tag{546} $$
There is no new physical phenomenon involved. The variance (or the entropy) in the pdf description decreases since energy (or order) is transferred in an irreversible way to the molecules whose disorder increases. Both evolutions, Eqs. (543) and (546), are two manifestations of molecular random motions. The evolution at the pdf level is the reflection of the relaxation of macroscopic systems towards thermodynamical equilibrium through increasing molecular disorder. The micro-mixing issue, indicated by the sketch (541), is one of the key difficulties of the subject and it remains a much discussed and open question. No satisfactory model representing this pdf behaviour in a trajectory formulation has been proposed, at least in the turbulence community. Interestingly enough, this question has also appeared recently in other physical situations. For example, it is mentioned in an appendix (appendix S.11) of the latest edition of Risken's book [117] where the idea of 'doubling the phase space variables' to retrieve a positive diffusion matrix is briefly put forward. Similar notions are also addressed in the reference book of Gardiner, particularly in part 7.7.4 of [15] where complex stochastic differential equations are introduced. This is achieved through what is referred to as the Poisson representation, and the direct analogy with our present micro-mixing modelling issue, although striking, remains to be properly established. It seems therefore that there is room for improvement and that theoretical work, perhaps in connection with the above-mentioned works, would greatly help to devise better and physically sound model proposals. At the moment, the modelling problem remains partially and even poorly treated, with the limited objective of representing the correct evolution of the mean scalar $\langle \phi \rangle$ and the decrease of its variance $\langle \phi'^2 \rangle$ instead of representing the correct evolution of the pdf $p^L(t, \mathbf{y}, \psi)$.
This is really a limiting problem, and any improvement would have important consequences for practical calculations. As an example of current closures, the simplest model replaces the real process by a linear return-to-equilibrium to the mean (IEM) [8]
$$ d\phi = -\frac{\phi - \langle \phi \rangle}{\tau_\phi}\, dt, \tag{547} $$
where $\tau_\phi$ is the scalar time scale. It is seen that, in sample space, the second-order derivative term has been replaced by a first-order one (the model involves only a drift term), which highlights its limitations. In sample space, the pdf equation is now of convection type:
$$ \frac{\partial p^L}{\partial t} = \frac{\partial}{\partial \psi} \left[ \frac{\psi - \langle \phi \rangle}{\tau_\phi}\, p^L \right]. \tag{548} $$
Therefore, in homogeneous turbulence starting from an initial two-value discrete pdf, the IEM model predicts that, although the variance $\langle \phi'^2 \rangle$ decreases, the shape of the pdf is conserved
contrary to the real one which tends towards a Gaussian distribution with a vanishing standard deviation. A number of other models have been proposed that come more or less close to the exact process. They have been discussed at length in most reviews on the subject [7–9] among others, and these works are referred to for further details and tests of the various models.

9.3.4. Path-integral ideas for dispersion models

In this section, we go back to the core subject of this paper, two-phase flow modelling. The propositions developed above aim at improving current stochastic models by using more information provided by fundamental approaches (see Section 9.3.1) or by introducing spatial information (see Section 9.3.2). This will obviously increase the complexity of the models and increase the computational costs, and in turn could limit their range of applications. Yet, there is clearly one aspect, even within the present one-point formalism, that is in need of improvement: the modelling of the velocity of the fluid seen by discrete particles. The modelling issues have been detailed in Section 7.4. It was explained that, even if we accept the present models for the velocity of a fluid particle, the issue in turbulent two-phase flows is to model the successive velocities of the fluid seen or sampled by a discrete particle as it moves across a turbulent flow. A careful analysis of the differences between these two variables, the velocity of a fluid particle $U_f$ and the velocity of the fluid seen $U_s$, has been proposed in Section 7.4. However, it is clear that the derivation of a Langevin model for $U_s$ implies additional assumptions compared to a Langevin model for $U_f$ (see the discussion of the crossing-trajectory effect in Section 7.4.1 and the application of the Kolmogorov hypothesis in Section 7.4.2). Improving the present closure relations of Section 7.4.2 would certainly enhance the precision of the numerical predictions in most practical cases.
This improvement would be obtained within the same one-particle pdf approaches and without going to higher pdf descriptions. Moreover, such a description would not impair the applicability of the approach to practical problems. It seems difficult to keep on striving to devise better models by fiddling with the different terms of the stochastic equations (the drift and the diffusion coefficients) and by trying to obtain better comparisons against various experimental data sets. A strong reason for this limitation is that we cannot be helped in the construction of these models by the knowledge of the macroscopic laws (see Sections 7.1 and 7.2). Consequently, it is believed that too much uncertainty limits the confidence we can have in present closures and that a breakthrough is needed. Such a breakthrough requires that the model approach be first put on firm ground with a clear theoretical setting. In other words, we need the help of a fundamental approach. The approach we propose to follow is a path-integral and variational approach. We first outline the main characteristics of the general path-integral approach and then suggest how this approach can be used for our modelling issue. Originating in quantum mechanics, this method, which extends the Lagrangian/Hamiltonian variational ideas of classical mechanics, has a direct interpretation for stochastic diffusion processes [118]. In Section 2, we have emphasized that stochastic diffusion processes can be addressed from two points of view: the trajectory and the pdf point of view. In a 1D formulation for a stochastic process $Z$, the trajectory point of view consists in a Langevin equation
$$ dZ = A(t, Z)\, dt + B(t, Z)\, dW, \tag{549} $$
Fig. 43. Representation of one path between the two states $(t_0, z_0)$ and $(t, z)$.
while the pdf point of view consists in the Fokker–Planck equation, satisfied by the pdf $p(t, z)$ of the process in sample space:
$$ \frac{\partial p}{\partial t} = -\frac{\partial [A(t, z)\, p]}{\partial z} + \frac{1}{2} \frac{\partial^2 [B^2(t, z)\, p]}{\partial z^2}. \tag{550} $$
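The equivalence between the trajectory and the pdf points of view can be illustrated numerically. The sketch below is only an example under stated assumptions: it takes a linear drift $A = -z/T$ and a constant diffusion coefficient $B_0$ (an Ornstein–Uhlenbeck process chosen here because the stationary solution of Eq. (550) is then a Gaussian of zero mean and variance $B_0^2 T/2$), and the names `euler_scheme`, `T` and `B0` are hypothetical.

```python
import numpy as np

def euler_scheme(drift, diffusion, z0, t0, t_end, n_steps, rng):
    """Integrate dZ = A(t, Z) dt + B(t, Z) dW with an explicit Euler scheme."""
    dt = (t_end - t0) / n_steps
    z = np.array(z0, dtype=float)
    t = t0
    for _ in range(n_steps):
        # Wiener increment over dt: zero mean, variance dt
        dW = rng.normal(0.0, np.sqrt(dt), size=z.shape)
        z = z + drift(t, z) * dt + diffusion(t, z) * dW
        t += dt
    return z

T = 1.0    # relaxation time scale (illustrative value)
B0 = 1.0   # constant diffusion coefficient (illustrative value)

drift = lambda t, z: -z / T
diffusion = lambda t, z: B0 * np.ones_like(z)

rng = np.random.default_rng(1)
z_start = np.full(20000, 2.0)   # all trajectories start from z = 2
z_end = euler_scheme(drift, diffusion, z_start, 0.0, 10.0, 1000, rng)

print("sample mean     :", z_end.mean())
print("sample variance :", z_end.var())
print("B0**2 * T / 2   :", 0.5 * B0**2 * T)
```

After about ten relaxation times, the ensemble of trajectories should have forgotten its initial condition and its sample variance should be close to the stationary Fokker–Planck value, up to statistical noise and a small time-step bias of the Euler scheme.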
The correspondence is explained in the general case in Section 2.8. There is actually a third representation, which is the path-integral. This representation, built from the trajectory and pdf points of view, expresses the probability to follow one particular path between two possible states of the stochastic process at two different times, $(t_1, z_1)$ and $(t_2, z_2)$. We briefly recall the main characteristics of this way of handling stochastic processes, which can be found in a few textbooks [119–121]. It is illustrated in Fig. 43, which represents one particular path connecting the initial state $(t_0, z_0)$ and the final state $(t, z)$ through a number of intermediate values $z_i$ at the intermediate times $t_i$ when the finite time interval is split into small time intervals. The transitional probability density $p(t, z \,|\, t_0, z_0)$ to have the value $z$ for the process $Z$ at time $t$, given that we had the value $z_0$ at time $t_0$, can be worked out by the successive use of the Chapman–Kolmogorov relation, Eq. (14):
$$ p(t, z \,|\, t_0, z_0) = \int \cdots \int p(t, z \,|\, t_n, z_n)\, p(t_n, z_n \,|\, t_{n-1}, z_{n-1}) \cdots p(t_2, z_2 \,|\, t_1, z_1)\, p(t_1, z_1 \,|\, t_0, z_0)\, dz_n\, dz_{n-1} \ldots dz_1. \tag{551} $$
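A single application of the Chapman–Kolmogorov relation can be verified numerically. The sketch below is a minimal check under an assumption made only for this example: the transition density is taken as Gaussian with constant drift and diffusion coefficients, for which the composition over two time intervals is known exactly. The function name `transition_pdf` and all numerical values are hypothetical.

```python
import numpy as np

def transition_pdf(z_new, z_old, dt, A, B):
    """Gaussian transition density over dt for constant drift A and diffusion B."""
    var = B**2 * dt
    return np.exp(-(z_new - z_old - A * dt) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

A, B = 0.5, 1.2        # constant coefficients (illustrative values)
z0, z2 = 0.3, -0.7     # initial and final states
dt1, dt2 = 0.4, 0.6    # durations of the two sub-intervals

# Direct transition from (t0, z0) to (t2, z2)
direct = transition_pdf(z2, z0, dt1 + dt2, A, B)

# Same transition obtained by integrating over the intermediate state z1
z1 = np.linspace(-10.0, 10.0, 4001)
dz = z1[1] - z1[0]
integrand = transition_pdf(z2, z1, dt2, A, B) * transition_pdf(z1, z0, dt1, A, B)
composed = np.sum(integrand) * dz   # simple quadrature on a uniform grid

print("direct   :", direct)
print("composed :", composed)
```

The two numbers should agree to within the quadrature error. Repeating the same integration over every intermediate state is what builds the multiple integral of Eq. (551).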
If we split the time interval $(t_0, t)$ into $N + 1$ identical subintervals of equal duration $\Delta t$, with $\Delta t = (t - t_0)/(N + 1)$, we can approximate the relation between the intermediate states $z_i$ and $z_{i+1}$ by an Euler scheme (see Section 2.10):
$$ \frac{1}{B(t_i, z_i)} \left( z_{i+1} - z_i - A(t_i, z_i)\, \Delta t \right) = dW_i. \tag{552} $$
From the property of the Wiener process given in Section 2.4.2, the incremental conditional pdf is
$$ p(t_{i+1}, z_{i+1} \,|\, t_i, z_i) = \frac{1}{\sqrt{2\pi B^2(t_i, z_i)\, \Delta t}} \exp\left\{ -\frac{\left( z_{i+1} - z_i - A(t_i, z_i)\, \Delta t \right)^2}{2 B^2(t_i, z_i)\, \Delta t} \right\} \tag{553} $$
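and by adding all the terms within the exponential expression in Eq. (551) we get
$$ p(t, z \,|\, t_0, z_0) = \lim_{N \to \infty} \int \prod_{i=1}^{N+1} \frac{dz_i}{\sqrt{2\pi B^2(t_i, z_i)\, \Delta t}} \exp\left\{ -\Delta t \sum_{i=1}^{N+1} \frac{1}{2} \frac{\left( (z_{i+1} - z_i)/\Delta t - A(t_i, z_i) \right)^2}{B^2(t_i, z_i)} \right\}. \tag{554} $$
The short-hand notation for the above integral is
$$ p(t, z \,|\, t_0, z_0) = C \int \exp\{ -S[z(\tau)] \}\, \mathcal{D}[z(\tau)], \tag{555} $$
where $C$ is a normalization factor and where
$$ S[z(\tau)] = \int_{t_0}^{t} L[z(\tau)]\, d\tau, \tag{556a} $$
$$ L[z(\tau)] = L(\dot{z}(\tau), z(\tau)) = \frac{1}{2 B^2(\tau, z(\tau))} \left[ \dot{z}(\tau) - A(\tau, z(\tau)) \right]^2, \tag{556b} $$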
where $\dot{z}(\tau)$ is the time derivative of $z(\tau)$. In these relations, the notation $z(\tau)$ represents a particular trajectory between $(t_0, z_0)$ and $(t, z)$ and denotes the complete time function (for $\tau$ varying from $t_0$ to $t$). The quantities $S$ and $L$ are functions of the whole trajectory. They are thus functionals and we use the classical notation $L[\cdot]$ to indicate that $L$ depends on all values of $z(\tau)$, $\tau \in [t_0, t]$. The resulting expression, Eq. (555), yields what is referred to as the path-integral representation of a diffusion process. It has the form of a sum over all histories, since it expresses that the probability to start with the value $z_0$ at time $t_0$ and to end up with the value $z$ at time $t$ is the sum over all the possible paths that connect $(t_0, z_0)$ to $(t, z)$, each path $z(\tau)$ being weighted by the factor $\exp\{-S[z(\tau)]\}$. The above expressions are similar to the variational approach to classical mechanics [122]. The functional $S[z(\tau)]$ can be regarded as the action along a given path $z(\tau)$ and the functional $L[z(\tau)]$ as the equivalent of the classical Lagrangian extended to a stochastic context. Loosely speaking, Eq. (555) means that the probability $p(t, z \,|\, t_0, z_0)$ to go from $(t_0, z_0)$ to $(t, z)$ is the probability to follow one particular path, summed over all the possible paths. The probability to follow one path is proportional to $\exp\{-S[z(\tau)]\}$. These forms of the Lagrangian and of the action are referred to as Onsager–Machlup actions, after the authors who first derived this representation [123,124]. The above continuous forms of the path-integral representation, Eqs. (555) and (556), are actually symbolic expressions. From a mathematical point of view, the 'measure' written in Eq. (555), $\mathcal{D}[z(\tau)]$, does not have a well-defined meaning on the ensemble of the trajectories $z(\tau)$. Furthermore, what is loosely called the 'probability' to follow one particular path $z(\tau)$, $\exp\{-S[z(\tau)]\}$, is also not well defined.
Indeed, the expression of the Lagrangian functional, $L$, uses the derivative $\dot{z}(\tau)$ along the path $z(\tau)$. However, we have seen in Section 2 that one of the characteristics of a diffusion process is that the trajectories are continuous but nowhere differentiable! The expression entering the Lagrangian functional $L$ is thus meaningless. Yet, the formulas can be used in a discrete sense, as in Eq. (554). From the physical point of view, they nevertheless have a clear and appealing meaning. We can consider complete paths $z(\tau)$ and sort them out with respect to their relative contribution to the sum. In that sense, the best meaning for
the functional $S[z(\tau)]$ is perhaps as an importance function for the set of trajectories between $(t_0, z_0)$ and $(t, z)$. The path-integral representation, Eqs. (555) and (556), is just another way to express the properties of a stochastic diffusion process. It does not really add anything new to the knowledge of $Z$. If we already know either the trajectory Langevin stochastic differential equations or the Fokker–Planck equation, then this is simply a reformulation of a closed problem. The strong interest of the path-integral approach, for our present concerns, is to reverse that point of view and to follow an action principle. In that approach, a Langevin model is derived from the functionals $S$ and $L$ as follows [125]. If we have an idea of a suitable or reasonable functional, $L[z(\tau)]$, that represents the relative importance of the different paths $z(\tau)$, then we can work out a Langevin equation by the following procedure. We first calculate the mean path $\langle z \rangle(\tau)$, which is the path that minimizes the action functional $S[z(\tau)]$ (assuming there is only one such path):
$$ \langle z \rangle(\tau) \quad \text{such that} \quad \min_{z(\tau)} S[z(\tau)] = S[\langle z \rangle(\tau)]. \tag{557} $$
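The minimisation in Eq. (557) can be made concrete in the discrete sense of Eq. (554). The sketch below evaluates a discretised Onsager–Machlup action for two paths sharing the same end points: the path obtained by integrating the drift alone, and a perturbed path. It is only an illustration under assumptions made for the example: a linear drift $A = -z/T$, a constant diffusion coefficient, and the hypothetical names `discrete_action`, `z_mean` and `z_perturbed`.

```python
import numpy as np

def discrete_action(t, z, drift, diffusion):
    """Discretised Onsager-Machlup action of the path z sampled at the times t."""
    dt = np.diff(t)
    z_dot = np.diff(z) / dt            # finite-difference 'derivative' along the path
    A = drift(t[:-1], z[:-1])
    B = diffusion(t[:-1], z[:-1])
    return float(np.sum((z_dot - A) ** 2 / (2.0 * B**2) * dt))

T = 1.0                                 # relaxation time scale (illustrative value)
drift = lambda t, z: -z / T
diffusion = lambda t, z: 0.5 * np.ones_like(z)

t = np.linspace(0.0, 2.0, 201)

# Path following the drift only: z_{i+1} = z_i + A(t_i, z_i) * dt
z_mean = np.empty_like(t)
z_mean[0] = 1.0
for i in range(len(t) - 1):
    z_mean[i + 1] = z_mean[i] + drift(t[i], z_mean[i]) * (t[i + 1] - t[i])

# Perturbed path that leaves both end points unchanged
bump = np.sin(np.pi * (t - t[0]) / (t[-1] - t[0]))
z_perturbed = z_mean + 0.1 * bump

action_mean = discrete_action(t, z_mean, drift, diffusion)
action_perturbed = discrete_action(t, z_perturbed, drift, diffusion)

print("action along the drift path :", action_mean)
print("action along perturbed path :", action_perturbed)
```

In this example the drift path has a vanishing action (to rounding error) while any departure from it is penalised, so that its weight $\exp\{-S\}$ is the largest. Here the final state was chosen as the one reached by the drift path; for another final state the minimising path would differ.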
The mean path corresponds to the trajectory on which the functional derivative of $S$ is the zero function,
$$ \frac{\delta S[\langle z \rangle(\tau)]}{\delta \langle z \rangle(\tau)} = 0. \tag{558} $$
By writing the 'derivative' along the mean path as
$$ \langle \dot{z} \rangle(\tau) = A(\tau, \langle z \rangle(\tau)), \tag{559} $$
we get the desired expression for the drift coefficient $A$ of the Langevin SDE. If we now make a quadratic approximation of the Lagrangian around the mean path,
$$ L[z(\tau)] \simeq D(\tau, z(\tau)) \left( z(\tau) - \langle z \rangle(\tau) \right)^2, \tag{560} $$
where $D(\tau, z(\tau))$ is positive since $\langle z \rangle(\tau)$ is the minimum of $L$ and of $S$, the diffusion coefficient $B$ for the Langevin equation is then
$$ B(\tau, z(\tau)) = \frac{1}{\sqrt{2 D(\tau, z(\tau))}} \tag{561} $$
and is given by the second-order derivative of the functional $L[z(\tau)]$ for the mean path $\langle z \rangle$. How can this approach be put to practical use for our concern, which is to model the successive velocities of the fluid seen by discrete particles in a turbulent flow? To see this, it is necessary to go back to the discussion of Section 7.4.1. The modelling issue is to derive a model for $U_s$ taking into account particle inertia and crossing-trajectory effects. In a one-particle approach, the dispersion model is built on given models for the Lagrangian increments of the velocity of a fluid particle and on Eulerian correlations between two particles at the same time. The classical scheme is given in Fig. 27 where only one fluid particle location is considered. That scheme was analysed in Section 7.4.1 and, to get around the difficulties it creates, a simplified picture based on the mean relative velocity (see Fig. 29) was used as the starting point for the derivation of Langevin models in Section 7.4.2. Compared to what was just a qualitative analysis there, the path-integral representation provides a powerful tool to delve into the question in a systematic
Fig. 44. Sketch of the possible Lagrangian correlation step between $F(t_1)$ and $F(t_2)$ and Eulerian correlation step between $F(t_2)$ and $F'(t_2)$ for the velocity of the fluid seen.
and consistent way. If we consider a discrete particle at two time steps, say $t_1$ and $t_2$, and the relative motion of the fluid particle $F$ located around the discrete particle at time $t_1$, there is actually an infinite range of possibilities for the Lagrangian and Eulerian correlation steps (see Section 7.4.1 for the details of these correlation steps). This is represented in Fig. 44 where four possibilities are displayed. For the same discrete particle motion, we cannot say that the fluid particle will go there, but we can say that it has a probability to go there. Therefore, all of the four sketches in Fig. 44 are possible but do not have the same 'importance'. In Section 7.4.1, we already used a similar reasoning to point out the shortcomings due to a 'poor choice' of the relative disposition of the fluid and the discrete particle locations (see Fig. 28). The path-integral formalism can now help us to select consistently the 'relevant path' on which to build a stochastic Langevin approximation for $U_s$. We propose the following notion. From the previous description, we assign to each path $z_s(\tau)$ connecting the value of $U_s$ at time $t_1$ (represented by the velocity of the particle $F(t_1)$ in Fig. 44) to the value of $U_s$ at time $t_2$ (represented by the velocity of the particle $F'(t_2)$ in Fig. 44) an importance function, or loosely speaking a probability, say $L_s[z_s(\tau)]$. This Lagrangian functional $L_s$ for the paths of the velocity of the fluid seen can be proposed directly. Another possibility is to try to build it by a combination of the Lagrangian step and of the Eulerian step. Indeed, we can assign to the Lagrangian step, indicated by [L] in Fig. 44, a Lagrangian functional, say $L_L$. Likewise, we can assign to the Eulerian step, indicated by [E] in Fig. 44, another Lagrangian functional, say $L_E$, for the paths that link the two possible values of the velocities of the fluid particles $F'(t_2)$ and $F(t_2)$.
The sum of these two steps, that is the link between the possible values of $F(t_1)$ and $F'(t_2)$, can be regarded as a complete path and we write $z_s = z_L + z_E$. The complete Lagrangian functional, $L_s$, that will describe the path of the velocity of the fluid seen can be taken as the sum of the two
elementary Lagrangian functionals
$$ L_s[z(\tau)] = L_L[z_L(\tau)] + L_E[z_E(r(\tau))]. \tag{562} $$
The Eulerian step is dependent on the Lagrangian one, and we loosely indicate this by the notation $r(\tau)$. The total action is then
$$ S_s[z(\tau)] = \int_{t_1}^{t_2} L_s[z(\tau)]\, d\tau. \tag{563} $$
Once we know $S_s$, we can apply an action principle and the above procedure to derive a Langevin equation model. We say that the relevant or most important path is the complete path, $\langle z_s \rangle(\tau)$, that minimizes the action:
$$ \langle z_s \rangle(\tau) \quad \text{such that} \quad \min_{z_s(\tau)} S_s[z_s(\tau)] = S_s[\langle z_s \rangle(\tau)]. \tag{564} $$
This should yield the drift terms of the Langevin approximation while the positive function that is the coefficient in the quadratic expansion around $\langle z_s \rangle(\tau)$ should give the diffusion coefficient. At this point, it must be repeated that the previous description does not pretend to lay out a complete and final solution. That description should simply be regarded as a proposal. It is not claimed that following the path-integral representation is a necessity, but it is merely suggested that, through this formalism, one can approach the problem in a rigorous framework. Indeed, the knowledge of the action leads to a clear definition of the relevant path and may help to avoid the difficulties encountered in the heuristic models which are mentioned in Section 7.4. At the moment, this path-integral approach has never been followed. Nevertheless, it appears as an interesting possibility to marry theoretical tools from other fields of physics and the rather practical concerns of two-phase flow modelling. Much work still remains to be done, which requires insights from researchers conversant with the path-integral concepts and their manipulation. If we accept a Langevin model for the velocities of a fluid particle, the Lagrangian functional $L_L$ (for the first Lagrangian correlation step [L]) is given by an Onsager–Machlup expression, Eq. (556). On the other hand, a proper expression for the Lagrangian functional describing the Eulerian correlation [E] has yet to be proposed. As mentioned in Section 7.4.1, one must also account for the fact that the Eulerian step is actually conditioned on the Lagrangian one. It is here simply hoped that these challenges will be deemed worthy of consideration.

9.3.5. Particle–particle interactions and granular behaviour

Up till now, we have mainly treated the diffusion and dispersion problems (diffusion for fluid particles and dispersion for discrete particles) in Sections 6 and 7, respectively, and the problem of turbulence modulation has been briefly touched on in Section 8. Broadly speaking, one can state that as the concentration of discrete particles increases, one encounters first the one-way coupling case (where the discrete particles are dispersed by the turbulent fluid), then the two-way coupling situation (where particles modify the intensity and possibly the nature of turbulence) and finally four-way coupling, when the relative distance between particles is small enough so that there are particle–particle interactions (particles start to collide in the case of hard spheres, or there is coalescence and break-up in the case of particles which can be deformed like bubbles and droplets).
Here, no attempt is made at classifying turbulent dispersed two-phase flows, that is, at predicting when turbulence modulation and particle–particle interactions become relevant mechanisms based on, for example, some characteristic time scales. Instead, the physics of four-way coupling is briefly presented and discussed from the physical and modelling points of view, and this only for hard spheres, a case which is in line with the field equations presented in Section 8. As far as the physics of particle–particle interactions between non-rigid particles is concerned, the reader can find suitable information in [126,127]. In the case of hard spheres immersed in an interstitial fluid, it is quite intricate to address the problem in a general way. Instead, it is often necessary to consider distinct cases that correspond to different areas of physics. If the fluid flow is not turbulent (which is not the case of the present work), a problem often referred to as sedimentation (discrete particles immersed in a fluid whose density has the same order of magnitude as $\rho_p$), hydrodynamic interactions become important as mentioned in Section 7 and the nature of the contact between particles, if any, is a subject of controversy [128]. Here, only turbulent fluid flows are under consideration and hydrodynamic interactions are neglected. There is no formal proof for this, but some guidance can be found in the work of Saffman [129], who showed that as long as the relative distance between particles is large enough, the velocity perturbations on a given particle induced by the surrounding particles remain small. Let us define a time scale $\tau_c$ which characterizes particle–particle interactions, for example the time experienced by a given particle between two consecutive binary collisions, in the spirit of the kinetic theory [20].
One can state that when $\tau_c \ll \tau_p$, there is almost no influence of the fluid on the collisional mechanisms (although energy can still be mainly supplied by the fluid provided its agitation is high enough), a regime which is called dry granular flows in the engineering community and more recently granular matter in physics (a large collection of small grains, under conditions in which the Brownian motion of the grains is negligible [2]). When $\tau_p \sim \tau_c$, collisions are controlled by fluid-dynamic properties, a situation which is of course much more complex than the case of granular matter where the influence of the fluid is negligible. In the case where $\tau_p \ll \tau_c$, the motion of particles is mainly controlled by aerodynamic forces. The use of $\tau_p$ and $\tau_c$ alone is, of course, not enough for an exhaustive definition of the different regimes. One might need to know how the particles respond to gas phase turbulence by comparing $\tau_p$ and $T_L$, see Section 6 (the ratio $\tau_p / T_L$ is called the Stokes number). A typical example of the complexity of the physics when the interstitial fluid influences the collisional motion between the particles ($\tau_p \sim \tau_c$) is the evaluation of the time scale $\tau_c$. In dry granular flows that are rapidly sheared, by analogy with the kinetic theory (this will soon be explained) and assuming molecular chaos (two colliding particles have uncorrelated velocities), $\tau_c$ is given as the ratio between the mean free path (which is a function of $d_p$ and the particle volumetric fraction) and a characteristic fluctuating velocity. The assumption of molecular chaos is valid when $T_L \ll \tau_p$, that is, when the particle motion is hardly affected by the turbulent motion of the fluid. However, when particle motion is influenced by the turbulent motion of the fluid, $\tau_p \sim T_L$, the assumption of molecular chaos is no longer valid.
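The comparison of time scales described above can be sketched in a few lines. This is only an illustration under explicit assumptions that are not taken from the text: the mean free path uses the dilute hard-sphere kinetic-theory expression $d_p / (6\sqrt{2}\,\alpha_p)$ without any dense-packing correction, the relaxation time uses the Stokes-drag form $\rho_p d_p^2 / (18 \mu_f)$, and the `margin` factor separating the regimes is an arbitrary, hypothetical threshold. All function names are hypothetical as well.

```python
import math

def collision_time(d_p, alpha_p, u_fluct):
    """tau_c = mean free path / fluctuating velocity (dilute hard spheres, molecular chaos)."""
    mean_free_path = d_p / (6.0 * math.sqrt(2.0) * alpha_p)
    return mean_free_path / u_fluct

def stokes_relaxation_time(rho_p, d_p, mu_f):
    """tau_p for a heavy particle under Stokes drag (assumed form)."""
    return rho_p * d_p**2 / (18.0 * mu_f)

def classify_regime(tau_c, tau_p, margin=10.0):
    """Crude regime label from tau_c / tau_p; 'margin' is an arbitrary threshold."""
    ratio = tau_c / tau_p
    if ratio < 1.0 / margin:
        return "collision-dominated (dry granular flow)"
    if ratio > margin:
        return "aerodynamic forces dominate"
    return "collisions influenced by the fluid"

# Illustrative values: 100 micron particles of density 2500 kg/m^3 in air
tau_p = stokes_relaxation_time(rho_p=2500.0, d_p=1.0e-4, mu_f=1.8e-5)
tau_c_dense = collision_time(d_p=1.0e-4, alpha_p=0.3, u_fluct=0.5)
tau_c_dilute = collision_time(d_p=1.0e-4, alpha_p=1.0e-5, u_fluct=0.5)

print("tau_p          :", tau_p)
print("dense case     :", classify_regime(tau_c_dense, tau_p))
print("dilute case    :", classify_regime(tau_c_dilute, tau_p))
```

Such a two-parameter classification remains incomplete, since the Stokes number $\tau_p / T_L$ and the validity of the molecular chaos assumption are not taken into account.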
If large-scale instabilities (or turbulence) are present in the fluid, one might end up with a situation where particles are dragged by the fluid (particles relax fast to changes in the local instantaneous fluid velocity field) and do not collide, since their motion is well correlated with the large-scale motions of the fluid (or, if they happen to collide, their velocities will be correlated), i.e. $\tau_c \to \infty$. In that
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
Fig. 45. Energy path in a rapid granular flow (when the interstitial fluid is involved in the granular flow, additional mechanisms are indicated in bold style).
case, spatial information might be needed to solve the problem. Before we go on to some proposals and perspectives for this type of flow, we start with a description of granular matter, which is a simplified case of our general problem.

Interest in granular flows is not new, and outstanding pioneers like Coulomb, Reynolds and Bagnold [130], to name a few, had already gathered knowledge in the science of granular materials. Over the years the subject has received a great deal of attention in chemical and mechanical engineering (where granular flows are ubiquitous) and, more recently, in physics, where granular matter is a new type of condensed matter and has become a fruitful metaphor for describing microscopic, dissipative dynamical systems and the concept of self-organized criticality [131,132]. Even though there is a striking analogy between granular matter and the other forms of matter (solid, liquid, gas), granular matter exhibits unique properties in both its solid-like and fluid-like behaviour [2,133]. This is mainly due to two reasons: (i) ordinary temperature plays no role (the relevant energy scale is potential energy, and also kinetic energy if particles are dragged by a fluid, but not thermal energy $k_B T$); (ii) the interactions between the particles (grains) are dissipative because of static friction (solid-like state) or the inelasticity of collisions (fluid-like state), and also because of friction if particles are dragged by a fluid. Granular matter is an unusual solid, liquid or gas. Fill a container with sand and the pressure at the bottom will reach a maximum value independent of the height. Vibrate the container and one will find that the degree of compaction is history dependent. Pour the grains on a flat table and motion will stop almost instantaneously. If a heap is formed, pour some more grains and phase transitions (solid-like and fluid-like) can be observed.
These simple experiments (and many others) clearly indicate that granular matter is a non-conventional fluid or liquid. These non-conventional behaviours raise fundamental questions: is it possible to describe granular matter using a field approach as is done in continuum mechanics? Is it possible to describe phase transitions with classical arguments from statistical physics? There is no general agreement on the subject. For example, as far as the derivation of hydrodynamic equations in the fluid-like case is concerned, some authors are inclined to say 'no' [134] whereas for others it is a subject of controversy [133]. In classical mechanics, the derivation of field equations has been addressed for many years for the so-called rapid granular flows, Campbell [135], i.e. the fluid-like behaviour of granular matter that is rapidly sheared so that there is a constant free motion of the particles where only short-duration contacts are involved. In this particular case, the concept of granular temperature can be introduced, $T_p = \langle u'_{p,i} u'_{p,i} \rangle / 3$. This internal energy, which is supplied by external forces, is dissipated into heat by inelastic collisions. The energy path is described as follows, Fig. 45. Driving forces (gravity, motion of external boundaries,
forces exerted by the fluid) supply the system of grains with kinetic energy. Part of the kinetic energy is converted, via shear work (due to velocity gradients), into random agitation of the particles (granular temperature). This internal energy is dissipated into (thermodynamic) heat due to collisions (small deformations at the surface of the particles). When the interstitial fluid drags the particles, heat is also dissipated due to friction (drag force) if one assumes that the particles are smaller than the Kolmogorov length scale, so that perturbations in the fluid velocity field are directly dissipated into heat. If not, spatial information is once again needed.

Once the concepts of rapid granular flows and granular temperature are accepted, and especially their relevance in physical and engineering applications, the physical similarity with the kinetic theory of gases is a fact. The questions which remain to be answered are: to what extent can dissipative gases fit in the framework of the kinetic theory? Under what assumptions can we derive macroscopic equations? In the particular case of a population of smooth, rigid, non-rotating, identical spheres, these questions were answered mainly by Savage and Jenkins, see for example [136,137], and their results were later extended to gas–solid flows [78,79]. In the case where the interstitial gas drags the particles, the procedure can be sketched as follows:

unclosed Boltzmann equation
↓ simplified closure on $\langle U_{s,i} \mid y_p, V_p \rangle$; molecular chaos assumption
closed Boltzmann equation
↓
unclosed mean field equations
↓ small departure from equilibrium; Grad's 13-moment approximation; simplified collision model and $1 - e \ll 1$
closed mean field equations

The starting point of the 'kinetic theory of granular flows' is an unclosed Boltzmann-like equation for the one-point particle pdf $p(t; y_p, V_p)$, Eq. (373), where an additional term accounting for the time rate of change of $p(t; y_p, V_p)$ due to collisions is introduced,

$$\frac{\partial p}{\partial t} + \frac{\partial}{\partial y_{p,i}}\left[V_{p,i}\, p\right] - \frac{\partial}{\partial V_{p,i}}\left[\frac{V_{p,i}}{\tau_p}\, p\right] = -\frac{\partial}{\partial V_{p,i}}\left[\frac{1}{\tau_p}\,\langle U_{s,i} \mid y_p, V_p \rangle\, p\right] + \left(\frac{\partial p}{\partial t}\right)_c. \qquad (565)$$

This Boltzmann-like equation is closed by making a simplified assumption on $\langle U_{s,i} \mid y_p, V_p \rangle$ and by assuming molecular chaos, that is, the two-point pdf (for two discrete particles) is the product of the one-point particle pdfs. Unclosed mean field equations can then be derived using the procedure outlined in Section 8.5.3. Closure at the macroscopic level can be performed
by supposing that the system is close to equilibrium (perturbation analysis), that the form of the pdf is known in advance (Grad's 13-moment approximation) and that simplified collision models can be used (binary collisions are characterized only by the restitution coefficient $e$ and collisions are nearly elastic). Expressions for the mean flux of momentum and energy, and for the mean dissipation rate, can then be derived analytically for the simplified collision integrals, i.e. closed mean field equations are written for the mean density, the mean velocity field and the granular temperature. The form of the equations and the technical aspects of the problem will not be discussed but, roughly speaking, mean field equations can be derived in the case of an (almost) dense gas close to equilibrium. Efforts are still being made in the field to include more physics (binary mixtures, accurate collision models, rotation of the particles, influence of the gas by working on $p(t; y_p, V_p, V_s)$, etc.). The spirit of the method has however not changed, and one is left with the classical drawbacks of closure performed at the macroscopic level, as explained in Section 6.

At the microscopic level, it is possible today, with modern computer technology, to perform calculations of granular flows and turbulent gas–solid flows. Microscopic simulations of granular flows, for example [138–140], are a powerful tool to study granular matter (for example inelastic collapse in the liquid-like form) since precise information can be extracted and more realistic physics can be put into the model (collision models, rotation of the particles). However, it is not yet reasonable to claim that these simulations are truly microscopic ones since the collisional models are simplified [141]. In the case of gas–solid suspensions the difficulty is increased by the presence of the fluid.
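To make the simplified collision model concrete, here is a minimal sketch of the kind of rule a microscopic simulation might apply: a binary collision between two identical smooth spheres characterized only by a restitution coefficient $e$, together with a sample estimate of the granular temperature. The function names and the two-particle example are hypothetical; friction, rotation and the interstitial fluid are deliberately ignored.

```python
import math


def granular_temperature(velocities):
    """Sample estimate of T_p = <u'_i u'_i> / 3 from a list of 3D velocities."""
    n = len(velocities)
    mean = [sum(v[i] for v in velocities) / n for i in range(3)]
    total = sum((v[i] - mean[i]) ** 2 for v in velocities for i in range(3))
    return total / (3.0 * n)


def collide(v1, v2, normal, e):
    """Post-collision velocities of two identical smooth spheres.

    `normal` points from the centre of sphere 1 to the centre of sphere 2;
    `e` is the restitution coefficient (e = 1 is perfectly elastic).
    """
    norm = math.sqrt(sum(c * c for c in normal))
    n = [c / norm for c in normal]
    # Normal component of the relative velocity (positive when approaching).
    g_n = sum((a - b) * c for a, b, c in zip(v1, v2, n))
    j = 0.5 * (1.0 + e) * g_n  # impulse per unit mass for equal masses
    new_v1 = [a - j * c for a, c in zip(v1, n)]
    new_v2 = [b + j * c for b, c in zip(v2, n)]
    return new_v1, new_v2


if __name__ == "__main__":
    # Head-on collision of two particles with a nearly elastic coefficient.
    pair = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
    before = granular_temperature(pair)
    pair = list(collide(pair[0], pair[1], [1.0, 0.0, 0.0], e=0.9))
    after = granular_temperature(pair)
    print(before, after)
```

With this rule, momentum is conserved while the normal relative velocity is reversed and reduced by the factor $e$, so each collision removes a fraction of the fluctuating kinetic energy, which is the dissipation of granular temperature into heat described earlier.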
Real DNS (where particles become moving boundaries) is not possible, and most of the time particular turbulent fields are generated and large eddy simulation is used together with a point-particle approximation [142] (the size of the LES filter is chosen so that the unresolved velocity fluctuations do not affect the motion of the particles). Even though these methods provide fruitful 'numerical experiments', they suffer from two major drawbacks: (i) the number of degrees of freedom is huge, (ii) the collision-tracking algorithm imposes stringent numerical constraints [138].

The leading idea of the present paper is that, when the number of degrees of freedom of a system is too large, one has to come up with a contracted description, i.e. to describe the system at a mesoscopic level. The treatment of collisions in granular matter and dispersed two-phase flows can fit in this approach. If one is interested in one-point information for the discrete particles, the exact trajectories of the discrete particles can be approximated by stochastic particles (in order to reproduce the statistical signature of particle–particle collisions), in the spirit of DSMC (direct simulation Monte Carlo) [143]. The trajectories of the stochastic process are no longer continuous (there are velocity discontinuities) and diffusion processes cannot be directly applied as a modelling tool. Instead, more general Markov processes must be used, for example the combination of a diffusion process and a jump process, see Section 2. A first proposal could be to model the statistical signature of the collisions with a generalized Poisson process. For example, for a given particle over a time interval $dt$, the velocity increment becomes

$$dX_t = A(t, X_t, \langle F(t, X_t) \rangle)\,dt + B(t, X_t, \langle G(t, X_t) \rangle)\,dW_t + dN_t, \qquad (566)$$

where

$$dN_t = \begin{cases} 0 & \text{with probability } 1 - \lambda(t, x, \langle h(t, x) \rangle)\,dt, \\ Y_t - X_t & \text{with probability } \lambda(t, x, \langle h(t, x) \rangle)\,dt. \end{cases}$$
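A minimal numerical sketch of such a jump-diffusion process is given below, using an Euler–Maruyama step for the continuous part and a Bernoulli trial per time step for the jump. It is an illustration under stated assumptions, not the scheme of the paper: the function and argument names are hypothetical, the coefficients are passed as plain functions of $(t, x)$ so that the dependence on mean fields is ignored, and the time step is assumed small enough for the jump probability per step to stay below one.

```python
import math
import random


def simulate_jump_diffusion(x0, t_end, dt, drift, diffusion,
                            jump_rate, sample_post_jump, rng):
    """Integrate dX = A dt + B dW + dN with a first-order scheme.

    drift(t, x), diffusion(t, x) and jump_rate(t, x) are user-supplied
    callables (hypothetical closures); sample_post_jump(t, x, rng) draws
    the post-jump value Y from the conditional pdf of the jump model.
    Returns the final state and the number of jumps.
    """
    x, t, n_jumps = x0, 0.0, 0
    for _ in range(int(round(t_end / dt))):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Wiener increment
        x_new = x + drift(t, x) * dt + diffusion(t, x) * dw
        # A jump occurs with probability (jump rate) * dt during this step.
        if rng.random() < jump_rate(t, x) * dt:
            y = sample_post_jump(t, x, rng)
            x_new += y - x  # dN_t = Y_t - X_t
            n_jumps += 1
        x, t = x_new, t + dt
    return x, n_jumps


if __name__ == "__main__":
    rng = random.Random(0)
    # Illustrative closures only: linear relaxation, constant noise,
    # constant collision frequency, Gaussian post-collision velocity.
    x, n = simulate_jump_diffusion(
        x0=1.0, t_end=5.0, dt=1e-3,
        drift=lambda t, x: -x,
        diffusion=lambda t, x: 0.5,
        jump_rate=lambda t, x: 2.0,
        sample_post_jump=lambda t, x, r: r.gauss(0.0, 1.0),
        rng=rng,
    )
    print(x, n)
```

The point of the design is that no collision partner has to be tracked: only the jump rate and the conditional pdf of the post-jump value need to be modelled.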
Here, we obtain a generalized McKean stochastic equation. $Y_t$ is a random variable which is specified by a conditional pdf $g(y \mid t; x)$. Since $\lambda(t, x, \langle h(t, x) \rangle)$ and $g(y \mid t; x)$ are independent, the pdf associated with $N_t$ is

$$W(y \mid t; x) = \lambda(t, x, \langle h(t, x) \rangle)\, g(y \mid t; x), \qquad (567)$$

that is, roughly speaking, the probability to jump from $X_t = x$ to $Y_t = y$ is the product of the probability that a jump occurs and the probability to have the specified amplitude. It is obvious that

$$\int g(y \mid t; x)\,dy = 1 \qquad (568)$$

(since $g(y \mid t; x)$ is a pdf) and, inserting Eq. (567) into Eq. (34), one obtains for the transitional pdf $p(t; x \mid t_0; x_0)$

$$\frac{\partial p}{\partial t} = -\frac{\partial}{\partial x}\left(A(t, x, \langle F(t, x) \rangle)\, p\right) + \frac{1}{2}\frac{\partial^2}{\partial x^2}\left(B^2(t, x, \langle G(t, x) \rangle)\, p\right) + \int \lambda(t, y, \langle h(t, y) \rangle)\, g(x \mid t; y)\, p(t; y \mid t_0; x_0)\,dy - \lambda(t, x, \langle h(t, x) \rangle)\, p(t; x \mid t_0; x_0). \qquad (569)$$
The modelling problem is now to find expressions for the jump rate $\lambda(t, x, \langle h(t, x) \rangle)$ and for $g(y \mid t; x)$ based on physical arguments. By doing so, that is, by applying a mesoscopic description, the numerical treatment of collisions could be significantly simplified.

References

[1] K.R. Sreenivasan, Fluid turbulence, Rev. Mod. Phys. 71 (2) (1999) 383–395.
[2] P.G. de Gennes, Granular matter: a tentative view, Rev. Mod. Phys. (Centenary) 71 (2) (1999) S374–S382. ∗
[3] M. Lesieur, Turbulence in Fluids, 3rd Edition, Kluwer, Dordrecht, 1997.
[4] W.D. McComb, The Physics of Fluid Turbulence, Clarendon Press, Oxford, 1990.
[5] U. Frisch, Turbulence, The Legacy of A.N. Kolmogorov, Cambridge University Press, Cambridge, 1995. ∗
[6] S.B. Pope, Pdf methods for turbulent reactive flows, Prog. Energy Combust. Sci. 11 (1985) 119–192. ∗∗∗
[7] S.B. Pope, Lagrangian pdf methods for turbulent reactive flows, Ann. Rev. Fluid Mech. 26 (1994) 23–63. ∗∗
[8] C. Dopazo, Recent developments in PDF methods, in: P.A. Libby, F.A. Williams (Eds.), Turbulent Reactive Flows, Academic, New York, 1994.
[9] R.O. Fox, Computational methods for turbulent reacting flows in the chemical process industry, Rev. Inst. Fr. Pét. 51 (2) (1996).
[10] S.B. Pope, On the relationship between stochastic Lagrangian models of turbulence and second-order closures, Phys. Fluids 6 (2) (1994) 973–985.
[11] J.-P. Minier, J. Pozorski, Derivation of a pdf model for turbulent flows based on principles from statistical physics, Phys. Fluids 9 (6) (1997) 1748–1753. ∗
[12] D.E. Stock, Particle dispersion in flowing gases, J. Fluids Eng. 118 (1996) 4–17.
[13] L. Arnold, Stochastic Differential Equations: Theory and Applications, Wiley, New York, 1974. ∗∗∗
[14] B. Øksendal, Stochastic Differential Equations, An Introduction with Applications, Springer, Berlin, 1995.
[15] C.W. Gardiner, Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences, 2nd Edition, Springer, Berlin, 1990. ∗∗∗
[16] H.C. Öttinger, Stochastic Processes in Polymeric Fluids, Tools and Examples for Developing Simulation Algorithms, Springer, Berlin, 1996.
[17] B. Lapeyre, E. Pardoux, R. Sentis, in: Méthodes de Monte-Carlo pour les équations de transport et de diffusion, Coll. Mathématiques et Applications, Springer, Berlin, 1998.
[18] P.E. Kloeden, E. Platen, Numerical Solution of Stochastic Differential Equations, Springer, Berlin, 1992.
[19] D. Talay, Simulation of stochastic differential equation, in: P. Kree, W. Wedig (Eds.), Probabilistic Methods in Applied Physics, Springer, Berlin, 1995. ∗∗∗
[20] S. Chapman, T.G. Cowling, The Mathematical Theory of Non-Uniform Gases, Cambridge Mathematical Library, Cambridge, 1970.
[21] R.L. Liboff, Kinetic Theory: Classical, Quantum, and Relativistic Descriptions, 2nd Edition, Prentice-Hall Advanced Reference Series, London, 1998.
[22] R. Balescu, Statistical Dynamics: Matter Out of Equilibrium, Imperial College Press, London, 1997. ∗∗∗
[23] H. Haken, Synergetics: an overview, Rep. Prog. Phys. 52 (1989) 515–533. ∗∗∗
[24] M. Bushev, Synergetics, Chaos, Order, Self-Organization, World Scientific, Singapore, 1994.
[25] A.S. Monin, A.M. Yaglom, Statistical Fluid Mechanics, MIT Press, Cambridge, MA, 1975. ∗∗∗
[26] H. Tennekes, J.L. Lumley, A First Course in Turbulence, The MIT Press, Cambridge, MA, 1990.
[27] S.B. Pope, Turbulent Flows, Cambridge University Press, Cambridge, 2000.
[28] K.R. Sreenivasan, R.A. Antonia, The phenomenology of small-scale turbulence, Annu. Rev. Fluid Mech. 29 (1997) 435–472.
[29] P.K. Yeung, S.B. Pope, Lagrangian statistics from direct numerical simulations of isotropic turbulence, J. Fluid Mech. 207 (1989) 531–586. ∗∗
[30] K.D. Squires, J.K. Eaton, Lagrangian and Eulerian statistics obtained from direct numerical simulations of homogeneous turbulence, Phys. Fluids A 3 (1991) 130–143.
[31] E. Deutsch, Dispersion de particules dans une turbulence stationnaire homogène isotrope calculée par simulation directe des grandes échelles, Ph.D. Thesis, Université Paris VI, 1992.
[32] G.K. Batchelor, The Theory of Homogeneous Turbulence, Cambridge University Press, Cambridge, 1953.
[33] C.W. Van Atta, R.A. Antonia, Reynolds number dependence of skewness and flatness factors of turbulent velocity derivatives, Phys. Fluids 23 (1980) 252–257.
[34] J. Jimenez, A.A. Wray, P.G. Saffman, R.S. Rogallo, The structure of intense vorticity in homogeneous isotropic turbulent flows, J. Fluid Mech. 255 (1993) 65–90.
[35] F. Anselmet, Y. Gagne, E.J. Hopfinger, R.A. Antonia, High-order velocity structure functions in turbulent shear flows, J. Fluid Mech. 140 (1984) 63–89.
[36] K.R. Sreenivasan, Fractals and multifractals in fluid turbulence, Annu. Rev. Fluid Mech. 23 (1991) 539–600.
[37] L.P. Wang, S. Chen, J.G. Brasseur, J.C. Wyngaard, Examination of hypotheses in the Kolmogorov refined turbulence theory through high-resolution simulations. Part 1. Velocity field, J. Fluid Mech. 309 (1996) 113–156.
[38] W. George, P.D. Beuther, R.E. Arndt, Pressure spectra in turbulent free shear flows, J. Fluid Mech. 148 (1984) 151–191.
[39] T. Gotoh, R.S. Rogallo, Statistics of pressure and pressure gradient in homogeneous isotropic turbulence, Proceedings of the Summer Program, Stanford, USA, Center for Turbulence Research, 1994.
[40] M. Nelkin, Universality and scaling in fully developed turbulence, Adv. Phys. 43 (1994) 143–181.
[41] R. Benzi, S. Ciliberto, R. Tripiccione, C. Baudet, F. Massaioli, S. Succi, Extended self similarity in turbulent flows, Phys. Rev. E 48 (1993) R29–R32.
[42] Z.S. She, E. Leveque, Universal scaling laws in fully developed turbulence, Phys. Rev. Lett. 72 (1994) 336–339.
[43] C.W. Van Atta, W.Y. Chen, Structure functions of turbulence in the atmospheric boundary layer over the ocean, J. Fluid Mech. 44 (1970) 145–159.
[44] B. Castaing, Y. Gagne, E.J. Hopfinger, Velocity probability density functions of high-Reynolds number turbulence, Physica D 46 (1990) 177–200.
[45] Y. Gagne, M. Marchand, B. Castaing, Conditional velocity pdf in 3-d turbulence, J. Phys. II France 4 (1994) 1–8. ∗∗
[46] A. Vincent, M. Meneguzzi, The spatial structure and statistical properties of homogeneous turbulence, J. Fluid Mech. 225 (1991) 1–20.
[47] Z.S. She, E. Jackson, S.A. Orszag, Scale-dependent intermittency and coherence in turbulence, J. Sci. Comput. 1 (1988) 407–434.
[48] A. Pumir, A numerical study of pressure fluctuations in three-dimensional incompressible, homogeneous, isotropic turbulence, Phys. Fluids 6 (1994) 2071–2083.
[49] G.L. Brown, A. Roshko, On density effects and large structures in turbulent mixing layers, J. Fluid Mech. 64 (1974) 775–816.
[50] E.D. Siggia, Numerical study of small scale intermittency in three dimensional turbulence, J. Fluid Mech. 107 (1981) 375–406.
[51] J. Jimenez, A.A. Wray, On the characteristics of vortex filaments in isotropic turbulence, J. Fluid Mech. 373 (1998) 225–285.
[52] O. Cadot, S. Douady, Y. Couder, Characterisation of the low-pressure filaments in a three-dimensional turbulent shear flow, Phys. Fluids 7 (1995) 630–646.
[53] O. Cadot, D. Bonn, Y. Couder, Turbulent drag reduction in a closed system: boundary layer versus bulk effects, Phys. Fluids 10 (1995) 426–436.
[54] J.-P. Minier, Lagrangian stochastic modelling of turbulent flows, Lecture Notes of the Von-Karman Institute, Session on Advances in Turbulence Modelling, 23–27 March, 1998.
[55] M.H. Kalos, P.A. Whitlock, Monte Carlo Methods, Vol. I, Wiley, New York, 1986.
[56] J. Xu, S.B. Pope, Assessment of numerical accuracy of pdf/Monte Carlo methods for turbulent reacting flows, J. Comput. Phys. 152 (1999) 192.
[57] S.B. Pope, Y.L. Chen, The velocity-dissipation probability density function model for turbulent flows, Phys. Fluids A 2 (1990) 1437.
[58] S.B. Pope, Application of the velocity-dissipation probability density function model to inhomogeneous turbulent flows, Phys. Fluids A 3 (1991) 1947.
[59] J.-P. Minier, J. Pozorski, Analysis of a pdf model in a mixing layer case, Proceedings of the 10th Symposium on Turbulent Shear Flows, University Park, PA, 1995.
[60] P.R. Van Slooten, Jayesh, S.B. Pope, Advances in pdf modeling for inhomogeneous turbulent flows, Phys. Fluids 10 (1998) 246.
[61] H.A. Wouters, T.W.J. Peeters, D. Roekaerts, On the existence of a stochastic Lagrangian model representation for second-moment closures, Phys. Fluids A 8 (1996) 1702.
[62] P.J. Colucci, F.A. Jaberi, P. Givi, S.B. Pope, The filtered density function for large-eddy simulation of turbulent reactive flows, Phys. Fluids 10 (1998) 499.
[63] F.A. Jaberi, P.J. Colucci, S. Givi, S.B. Pope, Filtered mass density function for large-eddy simulation of turbulent reactive flows, J. Fluid Mech. 401 (1999) 85–121. ∗
[64] D.C. Haworth, S.H. El Tahry, Probability density function approach for multidimensional turbulent flow calculations with application to in-cylinder flows in reciprocating engines, AIAA J. 29 (2) (1991) 208–218.
[65] M. Muradoglu, P. Jenny, S.B. Pope, D.A. Caughey, A consistent hybrid finite volume/particle method for the pdf equations of turbulent reactive flows, J. Comput. Phys. 154 (1999) 342–371.
[66] J.-P. Minier, J. Pozorski, Wall boundary conditions in the pdf method and application to a turbulent channel flow, Phys. Fluids 11 (1999) 2632–2644.
[67] R.P. Patel, J. AIAA 11 (67) (1973).
[68] F.H. Champagne, Y.H. Pao, I.J. Wygnanski, On the two-dimensional mixing region, J. Fluid Mech. 74 (1976) 209–250.
[69] I.J. Wygnanski, H.E. Fiedler, The two-dimensional mixing layer, J. Fluid Mech. 41 (1970) 327–361.
[70] J. Pozorski, J.-P. Minier, Full velocity–scalar pdf approach for wall-bounded flows and computation of thermal boundary layers, Proceedings of the 8th European Turbulence Conference, Barcelona, June 27–30, 2000.
[71] C.M. Tchen, Mean value and correlation problems connected with the motion of small particles suspended in a turbulent fluid, Ph.D. Thesis, Delft University of Technology, 1947.
[72] M.R. Maxey, J.J. Riley, Equation of motion for a small rigid sphere in a nonuniform flow, Phys. Fluids 26 (4) (1983) 883–889.
[73] R. Gatignol, The Faxén formulae for a rigid particle in an unsteady non-uniform Stokes flow, J. Méc. Théor. Appl. 1 (2) (1983) 143–160. ∗
[74] J. Magnaudet, M. Rivero, J. Fabre, Accelerated flows past a rigid sphere or a spherical bubble, Part 1, steady straining flow, J. Fluid Mech. 284 (1995) 97–135.
[75] R. Clift, J.R. Grace, M.E. Weber, Bubbles, Drops and Particles, Academic Press, New York, 1978.
[76] D.L. Koch, Kinetic theory for a monodispersed gas-solid suspension, Phys. Fluids A 2 (10) (1990) 1711–1723.
[77] M. Boivin, O. Simonin, K.D. Squires, Direct numerical simulation of turbulence modulation by particles in isotropic turbulence, J. Fluid Mech. 375 (1998) 235–263.
[78] E. Peirano, B. Leckner, Fundamentals of turbulent gas-solid flows applied to circulating fluidized bed combustion, Prog. Energy Combust. Sci. 24 (1998) 259–296.
[79] O. Simonin, Continuum modelling of dispersed two-phase flows, Combustion and Turbulence in Two-Phase Flows, 1995–1996, Lecture Series Programme, von Kármán Institute, Belgium, 1996.
[80] J. Pozorski, J.-P. Minier, Probability density function modelling of dispersed two-phase turbulent flows, Phys. Rev. E 59 (1) (1998) 855–863.
[81] O. Simonin, E. Deutsch, J.-P. Minier, Eulerian prediction of the fluid/particle correlated motion in turbulent two-phase flows, Appl. Sci. Res. 51 (1993) 275–283.
[82] M.W. Reeks, On the continuum equations for dispersed particles in nonuniform flows, Phys. Fluids A 4 (6) (1992) 1290–1303.
[83] M.W. Reeks, On the constitutive relations for dispersed particles in nonuniform flows, I dispersion in a simple shear flow, Phys. Fluids A 5 (3) (1993) 750–761.
[84] J. Pozorski, J.-P. Minier, On the Lagrangian turbulent dispersion models based on the Langevin equation, Int. J. Multiphase Flow 24 (1998) 913–945.
[85] J.-P. Minier, Closure proposals for the Langevin equation model in Lagrangian two-phase flow modelling, Proceedings of the third ASME/JSME Conference, San Francisco, ASME FED, July 28–23 1999, pp. FEDSM99-7885. ∗
[86] J.M. Mc Innes, F.V. Bracco, Stochastic particle dispersion modeling and the tracer-particle limit, Phys. Fluids A 4 (1992) 2809. ∗
[87] S.B. Pope, Consistency conditions for random-walk models of turbulent dispersion, Phys. Fluids 30 (8) (1987) 2374–2378. ∗
[88] J.O. Hinze, Turbulence, 2nd Edition, McGraw Hill, New York, 1975.
[89] T.R. Auton, J.C.R. Hunt, M. Prud'homme, The force exerted on a body in inviscid unsteady non-uniform rotational flow, J. Fluid Mech. 197 (1988) 241–257.
[90] Hockney, Eastwood, Computer Simulations Using Particles, Institute of Physics Publishing, Bristol, Philadelphia, 1988.
[91] J. Pozorski, J.-P. Minier, Computation and projection of statistical averages in Monte Carlo particle-mesh methods, J. Comput. Phys. 2000, submitted.
[92] Y. Sato, K. Hishida, M. Maeda, Effect of dispersed phase on modification of turbulent flow in a wall jet, J. Fluids Eng. 118 (1996) 307–314.
[93] T. Ishima, J. Boree, P. Fanouillere, I. Flour, Presentation of a data base: confined bluff body flow laden with solid particle, Proceedings of the Ninth Workshop on Two-Phase Flow Predictions, Halle-Wittenberg, Germany, Martin-Luther-Universität, April 13–16, 1999.
[94] V. Mathiesen, T. Solberg, B.H. Hjertager, An experimental and computational study of multiphase flow behavior in a circulating fluidized bed, Int. J. Multiphase Flow 26 (3) (2000) 387–419.
[95] C. Gourdel, O. Simonin, E. Brunier, Modelling and simulation of gas-solid turbulent flows with a binary mixture of particles, Proceedings of the Third International Conference on Multiphase Flow, ICMF 98, Lyon, France, June 8–12, 1998.
[96] S.L. Soo, Multiphase Fluid Dynamics, Science Press, Gower Technical, New York, 1990.
[97] T.D. Dreeben, S.B. Pope, Probability density function/Monte Carlo simulation of near-wall turbulent flows, J. Fluid Mech. 357 (1998) 141.
[98] B.J. Delarue, S.B. Pope, Application of pdf methods to compressible turbulent flows, Phys. Fluids 9 (9) (1997) 2704.
[99] P. Moin, K. Mahesh, Direct numerical simulation: a tool in turbulence research, Ann. Rev. Fluid Mech. 30 (1998) 539–578.
[100] V. L'vov, I. Procaccia, Phys. World 9 (1995) 35.
[101] V. L'vov, I. Procaccia, Fusion rules in turbulent systems with flow equilibrium, Phys. Rev. Lett. 76 (1996) 2898–2901.
[102] P.K. Yeung, One- and two-particle Lagrangian acceleration correlations in numerically simulated homogeneous turbulence, Phys. Fluids 9 (10) (1997) 2981–2990.
[103] P. Vedula, P.K. Yeung, Similarity scaling of acceleration and pressure statistics in numerical simulations of isotropic turbulence, Phys. Fluids 11 (5) (1999) 1208–1220.
[104] G.A. Voth, K. Satyanarayan, E. Bodenschatz, Lagrangian acceleration measurements at large Reynolds numbers, Phys. Fluids 10 (9) (1998) 2268–2280.
[105] M.S. Sawford, B.L. Borgas, Stochastic equations with multifractal random increments for modeling turbulent dispersion, Phys. Fluids 6 (2) (1994) 618–632.
[106] Z.S. She, E. Jackson, S.A. Orszag, Structure and dynamics of homogeneous turbulence: models and simulations, Proc. Roy. Soc. London A 434 (1991) 101–124. ∗
[107] R.J. Adrian, P. Moin, Stochastic estimation of organized turbulent structure: homogeneous shear flow, J. Fluid Mech. 190 (1988) 531–559.
[108] Y. Nagano, Modelling heat transfer in near-wall flows, Closure strategy for modelling turbulent and transitional flows, Isaac Newton Institute for Mathematical Sciences, Cambridge, April 6–17, 1999.
[109] D.J. Thomson, A stochastic model for the motion of particle pairs in isotropic high-Reynolds number turbulence, and its application to the problem of concentration variance, J. Fluid Mech. 210 (1990) 113–153.
[110] B.L. Borgas, M.S. Sawford, A family of stochastic models for two-particle dispersion in isotropic homogeneous stationary turbulence, J. Fluid Mech. 279 (1994) 69–99.
[111] O.A. Kurbanmuradov, Stochastic Lagrangian models for two-particle relative dispersion in high-Reynolds number turbulence, Monte Carlo Methods Appl. 3 (1) (1997) 37–52.
[112] K.K. Sabelfeld, O.A. Kurbanmuradov, Stochastic Lagrangian models for two-particle motion in turbulent flows, Monte Carlo Methods Appl. 3 (1) (1997) 53–72.
[113] K.K. Sabelfeld, O.A. Kurbanmuradov, Stochastic Lagrangian models for two-particle motion in turbulent flows. Numerical results, Monte Carlo Methods Appl. 3 (3) (1997) 199–223.
[114] B.M.O. Heppe, Generalized Langevin equation for relative turbulent dispersion, J. Fluid Mech. 357 (1998) 167–198.
[115] J.J. Monaghan, Smoothed particle hydrodynamics, Ann. Rev. Astron. Astrophys. 30 (1992) 543–574. ∗∗
[116] G.H. Cottet, P. Koumoutsakos, Vortex Methods, Theory and Practice, Cambridge University Press, Cambridge, 2000.
[117] H. Risken, The Fokker–Planck Equation, Methods of Solution and Applications, 2nd Edition, Springer, Berlin, 1989.
[118] G. Roepstorf, Path Integral Approach to Quantum Physics, Springer, Berlin, 1994.
[119] L.S. Schulman, Techniques and Applications of Path Integration, Wiley, New York, 1981.
[120] F.W. Wiegel, Introduction to Path-Integral Methods in Physics and Polymer Science, World Scientific, Singapore, 1986.
[121] M. Namiki, Stochastic Quantization, Springer, Berlin, 1992.
[122] H. Goldstein, Classical Mechanics, 2nd Edition, Addison-Wesley Publishing Co., Reading, MA, 1980.
[123] L. Onsager, S. Machlup, Phys. Rev. 91 (1953) 1505.
[124] L. Onsager, S. Machlup, Phys. Rev. 91 (1953) 1512.
[125] G.L. Eyink, Linear stochastic models for nonlinear dynamical systems, Phys. Rev. E 58 (6) (1998) 6975–6991.
[126] M. Orme, Experiments on droplet collisions, bounce, coalescence and disruption, Prog. Energy Combust. Sci. 23 (1997) 65–79.
[127] S.P. Lin, R.D. Reitz, Drop and spray formation from a liquid jet, Ann. Rev. Fluid Mech. 30 (1998) 85–105.
[128] S. Zeng, E.T. Kerns, R.H. Davis, The nature of particle contacts in sedimentation, Phys. Fluids 8 (1996) 1389.
[129] P.G. Saffman, On the settling speed of free and fixed suspensions, Stud. Appl. Maths. 52 (1973) 115–127.
[130] E.R. Bagnold, Physics of Blown Sand and Sand Dunes, Chapman & Hall, London, 1941.
[131] P. Bak, How Nature Works: the Science of Self-Organized Criticality, Oxford University Press, Oxford, 1997.
[132] Jensen, Self-organized Criticality, Cambridge Lecture Notes in Physics, Vol. 10, 1998.
[133] H.M. Jaeger, S.R. Nagel, R.P. Behringer, Granular solids, liquids, and gases, Rev. Mod. Phys. 68 (4) (1996) 1259–1273.
[134] L.P. Kadanoff, Built upon sand: theoretical ideas inspired by granular flows, Rev. Mod. Phys. 71 (1) (1999) 435–444.
[135] C.S. Campbell, Rapid granular flows, Ann. Rev. Fluid Mech. 22 (1990) 57–92.
[136] J.T. Jenkins, S.B. Savage, A theory for rapid flow of identical, smooth, nearly elastic, spherical particles, J. Fluid Mech. 130 (1983) 187–202.
[137] J.T. Jenkins, M.W. Richman, Grad's 13 moment system for a dense gas of inelastic spheres, Arch. Rational Mech. Anal. 87 (1985) 355–377.
[138] M.A. Hopkins, M.Y. Louge, Inelastic microstructure in rapid granular flows of smooth disks, Phys. Fluids A 3 (1) (1991) 47–57.
[139] S. MacNamara, W.R. Young, Inelastic collapse in two dimensions, Phys. Rev. E 50 (1) (1994) R28–R31.
[140] N. Schörghofer, T. Zhou, Inelastic collapse of rotating spheres, Phys. Rev. E 54 (5) (1996) 5511–5515.
[141] S.F. Foerster, M.Y. Louge, H. Chang, K. Allia, Measurements of the collision properties of small spheres, Phys. Fluids 6 (3) (1994) 1108–1115.
[142] J. Laviéville, E. Deutsch, O. Simonin, Large eddy simulation of interactions between colliding particles and a homogeneous isotropic turbulent field, in: Gas–Solid Flows, ASME FED, Vol. 228, ASME, New York, 1995, pp. 347–357.
[143] E.S. Oran, C.K. Oh, B.Z. Cybyk, Direct Simulation Monte Carlo: recent advances and applications, Ann. Rev. Fluid Mech. 30 (1998) 403–441.
Renormalization group theory in the new millennium. III
edited by Denjoe O'Connor, C.R. Stephens
editor: I. Procaccia

Contents
D. O'Connor, C.R. Stephens, Renormalization group theory in the new millennium. III, 215
D.V. Shirkov, V.F. Kovalev, The Bogoliubov renormalization group and solution symmetry in mathematical physics, 219
G. Gallavotti, Renormalization group in statistical mechanics and mechanics: gauge symmetries and vanishing beta functions, 251
G. Gentile, V. Mastropietro, Renormalization group for one-dimensional fermions. A review on mathematical results, 273
G. Jona-Lasinio, Renormalization group and probability theory, 439
E.A. Calzetta, B.L. Hu, F.D. Mazzitelli, Coarse-grained effective action and renormalization group theory in semiclassical gravity and cosmology, 459

PII S0370-1573(01)00069-2
RENORMALIZATION GROUP THEORY IN THE NEW MILLENNIUM. III
edited by Denjoe O'CONNOR, C.R. STEPHENS
Electricité de France, Div. R&D, MFTT, 6 Quai Watier, 78400 Chatou, France; Energy Conversion Department, Chalmers University of Technology, S-41296 Göteborg, Sweden
AMSTERDAM – LONDON – NEW YORK – OXFORD – PARIS – SHANNON – TOKYO
Physics Reports 352 (2001) 215–218
Editorial
Renormalization group theory in the new millennium. III

1. Introduction

This volume constitutes the third in a series of reviews based loosely on plenary talks given at the conference “RG2000: Renormalization Group Theory at the Turn of the Millennium” held in Taxco, Mexico in January 1999. The chief purpose of the conference was to bring together a group of people who had made significant contributions to RG theory and its applications, especially those who had contributed to the development of the subject in quantum field theory/particle physics and statistical mechanics/critical phenomena, i.e. the high- and low-energy regimes of RG theory.

In the last half-century, renormalization group (RG) theory has become a central structure in theoretical physics and beyond, though it is not always clear that different authors mean the same thing when they speak about it. The aim of these reviews is to try and convey some of the power and scope of RG theory and its applications and in the process hopefully convey the underlying unity of the set of ideas involved.

Although RG theory has had a major impact it has tended to be viewed as a tool rather than as a subject in and of itself. Being presented principally in terms of its applications has therefore meant a lack of contact between practitioners from different fields. An important exception to this tendency is the series of RG conferences organized by Dimitri Shirkov and others of the Joint Institute for Nuclear Research, Dubna theoretical physics community. The Taxco conference was in the same spirit. The advent in recent years of conferences on the “exact” RG has also provided an opportunity for practitioners to come together. The only criticism one might have of this latter series is the large emphasis on field theory. This small criticism notwithstanding we hope that there will be continued opportunity to bring together RG practitioners from different fields.
In obtaining contributions for these reviews we did not restrict ourselves to speakers from the conference. A major concern was to avoid producing yet another typical conference proceedings. Hence, the remit given to the contributors was to write as extensively and comprehensively as they saw fit. Naturally, with such a liberal regime the length of the articles varies significantly. Our goal was to try and review the state of the art of RG theory given that it could now be considered mature enough to warrant a large scale overview. We believe that we were to some extent defeated in our purpose by the very size and range of applicability of the RG. Although we have managed to cover a large gamut we know there are glaring omissions. Nevertheless, we feel it is of great benefit to have reviews by leading practitioners all brought together in the same place even if the range of coverage is suboptimal. A possible remedy to this would be for specialists in areas not adequately covered here to submit articles which would naturally fall into the present series.

We particularly wanted to emphasize the idea that although mature enough to warrant a major review, RG theory is young enough, and more significantly, deep enough, that such a review would still only barely scratch its surface. We hope that young researchers will get the feeling that it is still very much an emerging field with a large number of open problems associated with the understanding of RG theory itself and an even larger number associated with its applications.

© 2001 Elsevier Science B.V. All rights reserved. 0370-1573/01/$ - see front matter. PII: S0370-1573(01)00038-2

2. Introduction to the third volume

This third volume in the series is principally devoted to the “field theoretic RG”, treating both theoretical developments and applications. The two principal strands of RG theory are based on two apparently quite different concepts: “coarse graining” and “reparametrization”. In the context of field theory, of either statistical or quantum systems, they coexist within the same formalism, coarse grainings naturally introducing coordinate changes on the space of Hamiltonians and reparametrizations being naturally associated with scale changes. However, the last few years have witnessed developments wherein the basic differences between the two approaches have become more apparent. An example of this is how the reparametrization approach has been abstracted away from its field theoretic origins to treat problems such as the solution of non-linear partial or ordinary differential equations. The review of Shirkov and Kovalev gives an overview of the evolution of these developments.
Stated abstractly, the RG in the reparametrization approach is a continuous symmetry of a solution of some problem with respect to transformations of the parameters on which the solution depends. A good example would, for instance, be boundary conditions. Importantly, it is also an exact symmetry of the problem. If the infinitesimal generators of this group can be found perturbatively then the solution of the corresponding RG equation corresponds to an “exponentiation” of this perturbation theory. This fact is at the heart of the applications that involve a RG improvement of the perturbative solutions of differential equations. The authors consider several illustrative examples such as solution of the modified Burgers equation and the self-focusing of laser beams. In the latter case one very interesting feature of the reparametrization method is its generalization to a more than one parameter family of reparametrizations, i.e. more than one RG, which opens the door to solving problems where the singularities one is trying to access are more than one-dimensional. Without doubt applications of this more abstract view of the RG method, outside its origin in the context of Quantum Field Theory, will be a continuing important area of application of RG methods.

The review of Gallavotti discusses two very different yet similar problems: the theory of the ground state of a system of fermions in one dimension and the theory of KAM tori in classical mechanics. He thus demonstrates ‘the conceptual unity that renormalization group methods’ have brought to theoretical physics. In many ways, the most surprising use of RG techniques is the application to the KAM problem. This is a difficult problem of classical mechanics where surprisingly a field theoretic analysis and RG study of an associated Feynman diagrammatic series (which involves only trees) proves to be a powerful and insightful method of attacking the problem.

In the third review, Gentile and Mastropietro give a very comprehensive treatment of one-dimensional Fermionic systems. They review what is known at a rigorous level about the correlation functions of models of interacting one-dimensional Fermi systems, emphasizing primarily results obtained in the 1990s. It turns out that the correlation functions of many models can be written as convergent series, in the weak coupling regime, and such expressions provide all the information of interest.

In “Renormalization Group and Probability Theory” Jona-Lasinio discusses the probabilistic interpretation of the RG and argues that the critical point can be characterized by deviations from the central limit theorem. He further argues that to any type of RG transformation one can associate a multiplicative structure and that ‘the characterizing feature of the Green’s function RG is that it is defined directly in terms of this multiplicative structure’ and that this multiplicative structure emerges from the properties of conditional probabilities. It is undoubtedly true that further insights into the connection between the different RG approaches are to be gained by this type of probabilistic analysis. This seems to us to be an aspect of RG theory that has not yet yielded all of its secrets.

In the fifth review Calzetta, Hu and Mazzitelli discuss various applications of the RG in situations involving a non-trivial gravitational field, such as in the early universe.
In particular, they consider both at a qualitative and quantitative level the application of RG theory to non-equilibrium quantum processes and phase transitions in the early universe. For example, in the simplest case, De Sitter space, where the dynamics can be rewritten as a pure scaling transformation, one can easily calculate the running of the couplings in, for example, a $\lambda\phi^4$ theory. The resultant renormalized couplings “run” with time, leading to an exponentially decreasing effective coupling constant. The full consequences of this type of behaviour for phase transitions in an inflationary universe do not seem to have been worked out, but certainly deserve to be. Similar calculations are in principle possible for more general spacetimes. An associated unsolved problem is how to implement a cutoff in these dynamical situations and how to avoid having to confront post-Planckian frequencies.

The authors further go on to consider the Closed Time Path Coarse Grained Effective Action as the key functional for dynamic processes (remember, the effective potential has little meaning in a dynamic situation). This involves a division of the wavemodes into fast and slow types and a subsequent coarse graining (over a spatial volume rather than the space-time of the Euclidean average effective action) of the fast modes, which then act as noise for the slow modes. A key point here is the ability to study the origin and properties of the noise from first principles rather than put it in by hand as is done in, for example, RG approaches to the Langevin equation. The authors end their review by considering the interesting possibility that RG equations themselves, when noise and dissipation are inherent in a system, should be stochastic equations. It remains to be seen under what circumstances this is indeed the case and whether it has important quantitative consequences.
Acknowledgements

We take this opportunity to thank our coorganizers of the Taxco conference, Alberto Robledo, Riccardo Capovilla and Juan Carlos D’Olivo, and the conference secretaries, Trinidad Ramirez and Alejandra Garcia. We thank the conference sponsors for their significant financial support: CONACyT, México; NSF, USA; ICTP, Italy; the Depto de Física, Cinvestav, México; Instituto de Ciencias Nucleares, UNAM, México; Instituto de Física, UNAM, México; Fenomec, UNAM, México; Cinvestav, México; DGAPA, UNAM, México and the Coordinación de Investigación Científica, UNAM, México. It is fair to say that without this generous support a conference of such caliber could not have taken place.

We also take this opportunity to express our gratitude, for their advice and assistance, to the international advisory committee comprised of: A.P. Balachandran, Syracuse University, USA; K. Binder, Mainz, Germany; M.E. Fisher, University of Maryland, USA; N. Goldenfeld, University of Illinois, USA; B.L. Hu, University of Maryland, USA; D. Kazakov, Dubna, Russia; V.B. Priezzhev, Dubna, Russia; I. Procaccia, Weizmann Institute, Israel; M. Shifman, University of Minnesota, USA; D.V. Shirkov, Dubna, Russia; F. Wegner, Heidelberg, Germany; J. Zinn-Justin, Saclay, France. We express our special thanks to Michael Fisher for his cogent advice and organizational help, to Bei Lok Hu for helping organize the US component of the conference and to Itamar Procaccia for organizing an appropriate forum in which to present this overview.

Denjoe O’Connor
Dept. de Fisica, CINVESTAV, A. Postal 14-740, 07360 Mexico D.F., Mexico
E-mail address:
[email protected] (D. O’Connor). C.R. Stephens Instituto de Ciencias Nucleares, A. Postal 70-543, 04510 Mexico, D.F., Mexico E-mail address:
[email protected] (C.R. Stephens).
Physics Reports 352 (2001) 219–249
The Bogoliubov renormalization group and solution symmetry in mathematical physics
Dmitrij V. Shirkov (a,∗), Vladimir F. Kovalev (b)
(a) Bogoliubov Laboratory of Theoretical Physics, J.I.N.R., Dubna 141980, Russia
(b) Institute for Mathematical Modelling, Moscow 125047, Russia
Received March 2001; editor: I. Procaccia
Contents
1. The Bogoliubov renormalization group
1.1. Historical introduction
1.2. The Bogoliubov RG: symmetry of a solution
1.3. The renorm-group method
2. Evolution of the renormalization group concept
2.1. Renormalization group evolution
2.2. Difference between the Bogoliubov RG and KW-RG
2.3. Functional self-similarity
3. Solution symmetry in mathematical physics
3.1. Constructing RG-symmetries and their application
3.2. Examples of solution improvement
4. The RG in non-linear optics
4.1. Formulation of a problem
4.2. Plane geometry
4.3. Cylindrical geometry
5. Overview
Acknowledgements
References
Abstract

Evolution of the concept known in theoretical physics as the renormalization group (RG) is presented. The corresponding symmetry, which was first introduced in quantum field theory (QFT) in the mid-1950s, is a continuous symmetry of a solution with respect to transformations involving the parameters (e.g., those that determine the boundary conditions) which specify some particular solution. After a short detour into Wilson’s discrete semi-group, we follow the expansion of the QFT RG and argue that the underlying transformation, being considered as a reparametrization, is closely related to the property of self-similarity. It can be treated as its generalization, Functional Self-similarity (FS). Next, we review the essential progress made in the last decade in the application of the FS concept to boundary value problems formulated in terms of differential equations. A summary of a regular approach, recently devised for discovering the RG = FS symmetries with the help of modern Lie group analysis, and some of its applications are given. As the principal physical illustration, we consider the solution of the problem of a self-focusing laser beam in a non-linear medium. © 2001 Elsevier Science B.V. All rights reserved.

PACS: 02.20.−a; 03.70.+k; 11.10.Hi
Keywords: Quantum field theory; Renormalization group; Renorm-group symmetry; Lie groups

∗ Corresponding author. E-mail addresses: [email protected] (D.V. Shirkov), [email protected] (V.F. Kovalev).
0370-1573/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0370-1573(01)00039-4
D.V. Shirkov, V.F. Kovalev / Physics Reports 352 (2001) 219–249
1. The Bogoliubov renormalization group

1.1. Historical introduction

1.1.1. Discovery of the renormalization group

In 1952–1953 Stückelberg and Petermann [1] discovered¹ a group of infinitesimal transformations related to the finite arbitrariness that arises in S-matrix elements upon elimination of ultraviolet (UV) divergences. These authors introduced the notion of a normalization group as a Lie transformation group generated by differential operators connected with renormalization of a coupling constant, $e$.

The following year, on the basis of (infinite) Dyson’s renormalization transformations formulated in the regularized form, Gell-Mann and Low [3] derived functional equations (FEs) for the QED propagators in the UV limit. The appendix to this article contained the general solution (obtained by T.D. Lee) of the FE for the renormalized transverse photon propagator amplitude $d(Q^2/\lambda^2, e^2)$ (where $\lambda$ is a cutoff defined as a normalization momentum). This solution was used for a qualitative analysis of the small distance behavior of the quantum electromagnetic interaction. Two possibilities, namely, infinite and finite charge renormalizations, were pointed out. However, paper [3] paid no attention to the group character of the analysis and of the qualitative results obtained there. The authors missed a chance to establish a connection between their results and the QED perturbation theory and did not discuss the possibility that a ghost pole solution might exist.

The decisive step was made by Bogoliubov and the present author [4–6] in 1955.² Using the group properties of finite Dyson transformations for the coupling constant, fields and Green’s functions, they derived functional group equations for the renormalized propagators and vertices in QED in the general case (i.e., with the electron mass taken into account). In “modern notation”, the first equation,

$$\bar\alpha(x, y; \alpha) = \bar\alpha\!\left(\frac{x}{t}, \frac{y}{t}; \bar\alpha(t, y; \alpha)\right), \qquad x = \frac{Q^2}{\mu^2}, \quad y = \frac{m^2}{\mu^2}, \qquad (1)$$

is that for the invariant charge (now also widely known as the effective or running coupling) $\bar\alpha = \alpha\, d(x, y; \alpha = e^2)$, and the second,

$$s(x, y; \alpha) = s(t, y; \alpha)\, s\!\left(\frac{x}{t}, \frac{y}{t}; \bar\alpha(t, y; \alpha)\right), \qquad (2)$$

is that for the electron propagator amplitude.

These equations obey a remarkable property: the product, $e^2 d \equiv \bar\alpha$, of the electron charge squared and the photon transverse propagator amplitude enters into both FEs. This product is invariant with respect to the finite Dyson transformation (as stated by Eq. (1)), which can now be written in the form

$$R_t : \{\mu^2 \to t\mu^2, \quad \alpha \to \bar\alpha(t, y; \alpha)\}. \qquad (3)$$

We called this product the invariant charge and introduced the term renormalization group.

¹ For a more detailed exposition of the RG’s early history see our recent reviews [2].
² See also the two survey papers [7] published in English in 1956.
We emphasize that, unlike in Refs. [1,3], in the Bogoliubov formulation there is no reference to UV divergences and their subtraction or regularization. At the same time, technically, there is no simplification due to the massless nature of the UV asymptotics. Here, the homogeneity of the transfer momentum scale $Q$ is explicitly violated by the mass $m$. Nevertheless, the symmetry with respect to the transformation $R_t$ (even though a bit more involved) underlying the RG is formulated as an exact property of the solution. This is what we mean by the term Bogoliubov renormalization group, or renormgroup for short.

The differential Lie equations for $\bar\alpha$ and for the electron propagator,

$$\frac{\partial \bar\alpha(x, y; \alpha)}{\partial \ln x} = \beta\!\left(\frac{y}{x}, \bar\alpha(x, y; \alpha)\right), \qquad \frac{\partial s(x, y; \alpha)}{\partial \ln x} = \gamma\!\left(\frac{y}{x}, \bar\alpha(x, y; \alpha)\right) s(x, y; \alpha), \qquad (4)$$

with

$$\beta(y, \alpha) = \frac{\partial \bar\alpha(\xi, y; \alpha)}{\partial \xi}, \qquad \gamma(y, \alpha) = \frac{\partial s(\xi, y; \alpha)}{\partial \xi} \quad \text{at } \xi = 1, \qquad (5)$$

were first derived in [4] by differentiating the FEs (1) and (2) with respect to $x$ at the point $t = x$. On the other hand, by differentiating the same equations with respect to $t$ (at $t = 1$) one obtains [8]

$$X \bar\alpha(x, y; \alpha) = 0, \qquad X s(x, y; \alpha) = \gamma(y, \alpha)\, s(x, y; \alpha), \qquad (6)$$

with

$$X = x\partial_x + y\partial_y - \beta(y, \alpha)\partial_\alpha \qquad (\partial_x \equiv \partial/\partial x) \qquad (7)$$
being the Lie infinitesimal operator.

1.1.2. Creation of the RG method

Another important achievement of [4] was the formulation of a simple algorithm for improving an approximate perturbative solution by combining it with Lie group equations; for details, see Section 1.3 below. In the adjacent publication [5] this algorithm was effectively used to analyze the UV and infrared (IR) behavior in QED. In particular, the one-loop UV asymptotics of the photon propagator as well as the IR behavior of the electron propagator in the transverse gauge,

$$\bar\alpha^{(1)}_{rg}(x, \alpha) = \frac{\alpha}{1 - (\alpha/3\pi) \ln x}, \qquad s(x, y; \alpha) \approx (p^2/m^2 - 1)^{-3\alpha/2\pi}, \qquad (8)$$

were derived. At that time, these expressions, summing the leading logs, were already known from papers by Landau and collaborators [9]. However, Landau’s approach did not provide a means for constructing subsequent approximations. A simple technique for calculating higher approximations was found only within the new renormgroup method. In the same paper, starting with the next order perturbation expression $\bar\alpha^{(2)}_{pt}(x, \alpha)$ containing the $\alpha^3 \ln x$ term, we arrived at the second renormgroup approximation (see Section 1.3.2 below):

$$\bar\alpha^{(2)}_{rg}(x, \alpha) = \frac{\alpha}{1 - (\alpha/3\pi) \ln x + (3\alpha/4\pi) \ln\!\left(1 - (\alpha/3\pi) \ln x\right)}, \qquad (9)$$

which performs an infinite summation of the $\alpha^2 (\alpha \ln)^n$ terms. This two-loop solution for the invariant coupling, first obtained in [5], contains the non-trivial log-of-log dependence which is now widely known as the “next-to-leading logs” approximation for the running coupling in quantum chromodynamics (QCD); see Eq. (21) below. Comparing (9) with (8), one concludes that the two-loop correction is essential in the vicinity of the ghost pole at $x_1 = \exp(3\pi/\alpha)$. This also shows that the RG method is a regular procedure, within which it is easy to estimate the range of applicability of its results.

Quite soon, this approach was formulated [6] for the case of QFT with two coupling constants. To the system of FEs for two invariant couplings there corresponds a coupled system of non-linear differential equations (DEs). The latter was used in [10] to study the UV behavior of the $\pi$–N interaction at the one-loop level. Thus, in Refs. [4–6,10] the RG was directly connected with practical computations of the UV and IR asymptotics. Since then, this technique, the renormalization group method (RGM),³ has become the principal means of asymptotic analysis in local QFT.

1.2. The Bogoliubov RG: symmetry of a solution

The RG transformation: Generally, the RG can be defined as a continuous one-parameter group of specific transformations of a particular solution (or the solution characteristic) of a problem, a solution that is fixed by boundary conditions. The RG transformation involves boundary condition parameters and corresponds to some change in the way of imposing this condition.

For illustration, imagine a one-argument solution characteristic $f(x)$ that has to be specified by the boundary condition $f(x_0) = f_0$. Formally, one can represent a given characteristic of a particular solution as a function of the boundary parameters as well: $f(x) = f(x; x_0, f_0)$. This step can be treated as an embedding operation. Without loss of generality, $f$ can be written in the form of a two-argument function $F(x/x_0, f_0)$ with the property $F(1, \gamma) = \gamma$. The RG transformation then corresponds to a change in the parameterization, say from $\{x_0, f_0\}$ to $\{x_1, f_1\}$, for the same solution.
In other words, the $x$ argument value at which the boundary condition is given can be changed to $x_1$, with $f(x_1) = f_1$. The equality $F(x/x_0, f_0) = F(x/x_1, f_1)$ now reflects the fact that under such a change the form of the function $F$ itself is not modified. Noting that $f_1 = F(x_1/x_0, f_0)$, we get

$$F(\xi, f_0) = F(\xi/t, F(t, f_0)), \qquad \xi = x/x_0, \quad t = x_1/x_0.$$

The group transformation here is $\{\xi \to \xi/t, \; f_0 \to F(t, f_0)\}$.

The renormgroup transformation for a given solution of some physical problem in the simplest case can now be defined as a simultaneous one-parameter transformation of two variables, say $x$ and $g$, by

$$R_t : \{x \to x' = x/t, \quad g \to g' = \bar g(t, g)\}, \qquad (10)$$

the first being a scaling of a coordinate $x$ (or reference point) and the second a more complicated functional transformation of the solution characteristic. The equation

$$\bar g(x, g) = \bar g(x/t, \bar g(t, g)) \qquad (11)$$

for the transformation function $\bar g$ provides the group property $T_{\tau t} = T_\tau T_t$ of the transformation (10).

These are just the RG FEs and transformation for a massless QFT model with one coupling constant $g$. In that case $x = Q^2/\mu^2$ is the ratio of a four-momentum $Q$ squared to a “normalization” momentum $\mu$ squared, and $g$ is the coupling constant. The RG transformation (10) of a QFT amplitude $s$ is of the form (compare with Eq. (2))

$$R_t \cdot s(x, g) \equiv e^{-\ln t\, X} s(x, g) = s(x/t, \bar g(t, g)) = z_s^{-1} s(x, g), \qquad z_s = s(t, g). \qquad (12)$$

³ Summarized in the special chapter of the first edition of the monograph [11].
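The group property (11) can be checked numerically in the massless one-coupling case. The following is only a minimal sketch: it assumes that $\bar g$ is generated by the massless form of the Lie equation (4), $\partial \bar g/\partial \ln x = \beta(\bar g)$ with $\bar g(1, g) = g$, and the function `beta` with its coefficients is a made-up example, not taken from the text. The helper names `g_bar` and `group_defect` are likewise hypothetical.

```python
import math


def beta(g):
    # Hypothetical generator, chosen only for illustration (not from the text).
    return -0.7 * g**2 - 0.3 * g**3


def g_bar(x, g, steps=2000):
    """Integrate d g_bar / d ln x = beta(g_bar) from ln x = 0, where g_bar = g,
    using the classical fourth-order Runge-Kutta scheme."""
    h = math.log(x) / steps
    for _ in range(steps):
        k1 = beta(g)
        k2 = beta(g + 0.5 * h * k1)
        k3 = beta(g + 0.5 * h * k2)
        k4 = beta(g + h * k3)
        g += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return g


def group_defect(x, t, g):
    """Left side minus right side of Eq. (11): g_bar(x, g) - g_bar(x/t, g_bar(t, g))."""
    return g_bar(x, g) - g_bar(x / t, g_bar(t, g))


if __name__ == "__main__":
    # The defect should vanish up to the integration error.
    print(group_defect(50.0, 4.0, 0.2))
```

Because the flow equation is autonomous in $\ln x$, evolving from $1$ to $t$ and then from $1$ to $x/t$ is the same as evolving from $1$ to $x$; this is the content of Eq. (11), so the printed defect is limited only by the accuracy of the integrator.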
Several generalizations are in order.

(a) “Massive” case: For example, in QFT, if we do not neglect the mass, $m$, of a particle, we have to insert an additional dimensionless argument into the invariant coupling $\bar g$, which now has to be considered as a function of three variables: $x = Q^2/\mu^2$, $y = m^2/\mu^2$, and $g$. The presence of a new “mass” argument $y$ modifies the group transformation (10) and the FE (11):

$$R_t : \left\{x' = \frac{x}{t}, \quad y' = \frac{y}{t}, \quad g' = \bar g(t, y; g)\right\}, \qquad \bar g(x, y; g) = \bar g\!\left(\frac{x}{t}, \frac{y}{t}; \bar g(t, y; g)\right). \qquad (13)$$

Here, it is important that the new parameter $y$ (which, physically, should be close to the $x$ variable, as it scales similarly) enters also into the transformation law of $g$. If the considered QFT model, like QCD, contains several masses, there will be several mass arguments $y \to \{y\} \equiv y_1, y_2, \ldots, y_n$.

(b) Multi-coupling case: A more involved generalization corresponds to the case of several coupling constants: $g \to \{g\} = g_1, \ldots, g_k$. Here, there arises a “family” of effective couplings

$$\bar g \to \{\bar g\}, \qquad \bar g_i = \bar g_i(x, y; \{g\}), \qquad i = 1, 2, \ldots, k, \qquad (14)$$

satisfying the system of coupled functional equations

$$\bar g_i(x, y; \{g\}) = \bar g_i\!\left(\frac{x}{t}, \frac{y}{t}; \{\bar g(t, y; \{g\})\}\right). \qquad (15)$$

The RG transformation now is

$$R_t : \{x \to x/t, \quad y \to y/t, \quad \{g\} \to \{g(t)\}\}, \qquad g_i(t) = \bar g_i(t, y; \{g\}). \qquad (16)$$
1.3. The renorm-group method

1.3.1. The algorithm

The unification of an approximate solution [4,5] and the abstract group symmetry can be realized with the help of the group DEs. If we define $\beta$ and $\gamma$ (the so-called group “generators”) via some approximate solutions and then solve the evolutional DEs, we obtain the RG improved solutions that obey the group symmetry and correspond to the approximate solutions used as an input.

Now, we can formulate an algorithm for improving an approximate solution. The procedure is given by the following prescription which we illustrate in the massless one-coupling cases (4) and (5):
Assume some approximate solution $\bar g_{appr}(x, g)$, $s_{appr}(x, g)$ is known:

1. On the basis of Eq. (5) define the beta- and gamma-functions

$$\beta(g) \stackrel{\rm def}{=} \left.\frac{\partial}{\partial \xi} \bar g_{appr}(\xi, g)\right|_{\xi=1}, \qquad \gamma(g) \stackrel{\rm def}{=} \left.\frac{\partial}{\partial \xi} s_{appr}(\xi, g)\right|_{\xi=1}. \qquad (17)$$

2. Integrate the first of Eqs. (4), i.e., construct the function

$$f(g) \stackrel{\rm def}{=} \int^{g} \frac{d\gamma}{\beta(\gamma)}. \qquad (18)$$

3. Solve the resulting equation to obtain

$$\bar g_{rg}(x, g) = f^{-1}\{f(g) + \ln x\}. \qquad (19)$$

4. Integrate the second of Eqs. (4) using this expression $\bar g_{rg}$ on its right hand side to explicitly obtain $s_{rg}(x, g)$.

5. The expressions $\bar g_{rg}$ and $s_{rg}$ precisely satisfy the RG symmetry, i.e., they are exact solutions of Eqs. (11) and (12) corresponding to $\bar g_{appr}$ and $s_{appr}$ used as input.

1.3.2. A simple illustration

As a concrete illustration, take the simplest perturbative expressions: $\bar g^{(1)}_{pt} = g - g^2 \beta_1 \ln x$ for $\bar g_{appr}$ and $s^{(1)}_{pt} = 1 - g\gamma_1 \ln x$. Here, $\beta(g) = -\beta_1 g^2$, $\gamma(g) = -\gamma_1 g$, and integration of (4) gives the explicit expressions

$$\bar g^{(1)}_{rg}(x, g) = \frac{g}{1 + g\beta_1 \ln x}, \qquad s^{(1)}_{rg}(x, g) = \left(\bar g(x, g)/g\right)^{\nu_1}, \qquad \nu_1 = \gamma_1/\beta_1, \qquad (20)$$

which, on the one hand, exactly satisfy the RG symmetry and, on the other, being expanded in powers of $g$, correlate with $\bar g_{pt}$ and $s_{pt}$.

Now, on the basis of the geometric progression (20), let us present the two-loop perturbative approximation for $\bar g$ in the form $\bar g^{(2)}_{pt} = g - g^2 \beta_1 \ln x + g^3(\beta_1^2 \ln^2 x - \beta_2 \ln x)$. By using this expression as an input in Eq. (17), we have $\beta^{(2)}(g) = -\beta_1 g^2 - \beta_2 g^3$ and then (performing step 2),

$$\beta_1 f^{(2)}(z) = -\int^{z} \frac{d\gamma}{\gamma^2 + b\gamma^3} = \frac{1}{z} + b \ln \frac{z}{1 + bz}, \qquad b = \frac{\beta_2}{\beta_1}.$$

To make the last step, we have to start with the equation, written here with the same normalization as the previous formula,

$$\beta_1 f^{(2)}[\bar g^{(2)}_{rg}(x, g)] = \beta_1 f^{(2)}(g) + \beta_1 \ln x,$$

which is transcendental and has no simple explicit solution.⁴ Due to this, one usually solves the equation approximately by noting that the second, logarithmic, contribution to $f^{(2)}(z)$ is a small correction to the first one at $bz \ll 1$. With this caveat we can substitute the one-loop RG expression (20), instead of $\bar g^{(2)}_{rg}$, into this correction and obtain the explicit “iterative” solution

$$\bar g^{(2)}_{rg} = \frac{g}{1 + g\beta_1 l + g(\beta_2/\beta_1) \ln[1 + g\beta_1 l]}, \qquad l = \ln x. \qquad (21)$$

⁴ It can be expressed in terms of the special Lambert $W$-function, $W(z) \exp W(z) = z$; see, e.g., Ref. [12].
An analogous procedure for $s^{(2)}_{pt} = 1 - g\gamma_1 \ln x + g^2\left(\gamma_1(\gamma_1 + \beta_1)\tfrac{1}{2} \ln^2 x - \gamma_2 \ln x\right)$ yields

$$s^{(2)}_{rg} = \frac{S(\bar g^{(2)}_{rg}(x, g))}{S(g)} \quad \text{with} \quad S(g) = g^{\nu_1} e^{\nu_2 g} \quad \text{and} \quad \nu_2 = \frac{\beta_1\gamma_2 - \gamma_1\beta_2}{\beta_1^2}. \qquad (22)$$

These results are interesting from several aspects.

• Firstly, being expanded in powers of $g$ and $gl$, they produce an infinite series containing “leading” and “next-to-leading” UV logarithmic contributions.
• Secondly, they contain a new analytic dependence $\ln(1 + g\beta_1 l) \sim \ln(\ln Q^2)$ which is absent in the perturbative input.
• Thirdly, when compared with the one-loop solution, Eq. (20), they illustrate an algorithm for the improvement of a solution’s accuracy, i.e., of the RGM regularity.
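Steps 1–3 of the algorithm and the comparison of Eqs. (20) and (21) can be traced numerically. The sketch below is illustrative only: the coefficient values `B1` and `B2` (standing for $\beta_1$, $\beta_2$) are arbitrary assumptions, all function names are hypothetical, and the bisection solver is just one convenient way of inverting the transcendental two-loop equation for $x \ge 1$; it is not the method of the original papers.

```python
import math

# Hypothetical values of beta_1 and beta_2 (illustrative, not from the text).
B1, B2 = 0.5, 0.2
b = B2 / B1


def g_pt(x, g):
    """Two-loop perturbative input g_pt^(2) of Section 1.3.2."""
    l = math.log(x)
    return g - g**2 * B1 * l + g**3 * (B1**2 * l**2 - B2 * l)


def beta_from(approx, g, h=1e-5):
    """Step 1, Eq. (17): derivative of approx(xi, g) in xi at xi = 1 (central difference)."""
    return (approx(1.0 + h, g) - approx(1.0 - h, g)) / (2.0 * h)


def F(z):
    """Step 2: beta_1 * f^(2)(z) = 1/z + b ln(z / (1 + b z)); decreasing in z."""
    return 1.0 / z + b * math.log(z / (1.0 + b * z))


def g_rg_exact(x, g):
    """Step 3: solve F(g_bar) = F(g) + beta_1 ln x by bisection on a log scale.
    Assumes x >= 1, so that the target stays above the large-z limit of F."""
    target = F(g) + B1 * math.log(x)
    lo, hi = 1e-12, 1e6
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if F(mid) > target:   # F decreases, so the root lies at larger z
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def g_rg_one_loop(x, g):
    """Eq. (20)."""
    return g / (1.0 + g * B1 * math.log(x))


def g_rg_iter(x, g):
    """Iterative two-loop solution, Eq. (21)."""
    l = math.log(x)
    return g / (1.0 + g * B1 * l + g * b * math.log(1.0 + g * B1 * l))


def defect(gbar, x, t, g):
    """Violation of the functional equation (11) for a candidate gbar."""
    return gbar(x, g) - gbar(x / t, gbar(t, g))


if __name__ == "__main__":
    g, x, t = 0.1, 100.0, 10.0
    print("beta(g) from Eq. (17):", beta_from(g_pt, g))
    print("FE defect, perturbative input:", defect(g_pt, x, t, g))
    print("FE defect, RG solution:", defect(g_rg_exact, x, t, g))
    print("exact / Eq. (21) / Eq. (20):",
          g_rg_exact(x, g), g_rg_iter(x, g), g_rg_one_loop(x, g))
```

With these assumed numbers the truncated perturbative expression visibly violates Eq. (11), whereas the solution of the group equation satisfies it to solver accuracy, and the iterative formula (21) lies much closer to the numerically inverted two-loop solution than the one-loop formula (20) does.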
1.3.3. RGM usage in QFT

As we have seen, finite order perturbative expressions in QFT do not obey the RG symmetry. On the other hand, it was shown that the one- and two-loop approximations, used as an input for the construction of the “generators” $\beta(g)$ and $\gamma(g)$, yield expressions (20)–(22) that obey the group symmetry and exactly satisfy the FEs (11) and (12). More generally, one can state the following logical structure of the RGM procedure.

• Solve the group equation(s) for the invariant coupling(s) $\bar g_{rg}(x, g)$ using some approximate solution $\bar g_{pt}$ as an input.
• Obtain the RG solutions for some other QFT objects (like vertices and propagator amplitudes) on the basis of the expression(s) for $\bar g_{rg}$ just derived.

Typically, they satisfy the equation

$$X M(x, y; g) = \gamma(y, g)\, M(x, y; g). \qquad (23)$$

The general structure of the corresponding solutions has the form

$$M(x, y; g) = z_M^{-1}(y, g)\, \mathcal{M}(x/y, \bar g(x, y; g)). \qquad (24)$$

Note that the function $\mathcal{M}$ on the right hand side depends only on the RG invariants, that is, on the first integrals of the RG operator $X$ introduced in Eqs. (6) and (7). It satisfies the homogeneous partial differential equation (PDE) $X\mathcal{M} = 0$. For RG invariant objects, like observables, $z_M = 1$, $\gamma = 0$.

Now we can summarize the properties of the RGM. The RGM is a regular procedure for combining dynamical information (taken from an approximate solution) with the RG symmetry. The essence of the RGM is the following:

(1) The mathematical tool used in the RGM is Lie differential equations.
(2) The key element of the RGM is the possibility of an (approximate) determination of the “generators”, such as $\beta(g)$, $\gamma(g)$, from the dynamics.
(3) The RGM works most effectively in the case where the solution has a singular behavior. It restores the structure of the singularity compatible with the RG symmetry.
2. Evolution of the renormalization group concept

In the 1970s and 1980s RG ideas were applied to critical phenomena (spontaneous magnetization, polymerization, percolation), non-coherent radiation transfer, dynamic chaos, and so on. Perhaps a less sophisticated motivation by Wilson in the context of spin lattice phenomena (rather than in QFT) made this “explosion” of RG applications possible.

2.1. Renormalization group evolution

2.1.1. The Kadanoff–Wilson RG in critical phenomena

(a) Spin lattice: The so-called renormalization group in critical phenomena is based on the Kadanoff–Wilson procedure [13,14] of “decimation” or “blocking”. Initially, it emerged from the problem of critical phenomena on spin lattices. Imagine a regular (two- or three-dimensional) lattice consisting of $N^d$, $d = 2, 3$, sites with an ‘elementary step’ $a$ between them. Suppose that at every site a spin vector $\sigma$ is located. The Hamiltonian describing the spin interaction between nearest neighbors,

$$H = k \sum_i \sigma_i \cdot \sigma_{i\pm 1},$$

contains $k$, the coupling constant. A statistical sum is obtained from the partition function, $S = \langle \exp(-H/\theta) \rangle_{aver}$.

To realize blocking, one has to perform a “spin averaging” over blocks consisting of $n^d$ elementary sites. This step diminishes the number of degrees of freedom from $N^d$ to $(N/n)^d$. It also destroys the short-range properties of the system, some information being lost in the averaging procedure. However, the long-range physics (such as the correlation length, essential for understanding the phase transition) is not affected by it, and thus we gain a simplification of the problem.

As a result of this blocking procedure, new effective spins, $\Sigma$, arise at new sites, forming a new effective lattice with lattice spacing $na$. We arrive also at the new effective Hamiltonian

$$H_{eff} = K_n \sum_I \Sigma_I \cdot \Sigma_{I\pm 1} + \Delta H$$

with the effective coupling $K_n$ between new spins $\Sigma_I$ of new neighboring sites; $K_n$ has to be defined by the averaging process as a function of $k$ and $n$. Here, $\Delta H$ contains quartic and higher spin forms which are irrelevant for the IR (long-distance) properties. Due to this, one can drop $\Delta H$ and conclude that the spin averaging leads to an approximate transformation,

$$k \sum_i \sigma_i \cdot \sigma_{i\pm 1} \to K_n \sum_I \Sigma_I \cdot \Sigma_{I\pm 1},$$

or, taking into account the “elementary step” change, to $\{a \to na, \; k \to K_n\}$. The latter is the Kadanoff–Wilson transformation. It is convenient to write down the new coupling $K_n$ in the form $K_n = K(1/n, k)$. Then, the KW transformation reads as

$$KW_n : \{a \to na, \quad k \to K_n = K(1/n, k)\}. \qquad (25)$$
These transformations obey a composition law $KW_n \cdot KW_m = KW_{nm}$ if the relation

$$K(x, k) = K(x/t, K(t, k)), \qquad x = 1/nm, \quad t = 1/n \qquad (26)$$

holds. This is very close to the RG symmetry. We observe the following points:
• The RG symmetry in this case is approximate (due to neglecting $\Delta H$).
• The transformations $KW_n$ are discrete.
• There exists no inverse transformation to $KW_n$.
• The transformations $KW_n$ relate different auxiliary models.
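The composition law and relation (26) can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the two coupling functions `K_power` and `K_log` and the constants `NU` and `B` are hypothetical examples that happen to satisfy (26), and `K_bad` is a deliberately chosen counterexample; none of them comes from an actual blocking calculation.

```python
import math

NU = 1.3  # illustrative exponent, not taken from the text
B = 0.2   # illustrative coefficient, not taken from the text


def K_power(x, k):
    """Power-law coupling: K(1/n, k) = k * n**NU."""
    return k * x ** (-NU)


def K_log(x, k):
    """Logarithmic coupling: K(1/n, k) = k / (1 + B*k*ln n)."""
    return k / (1.0 - B * k * math.log(x))


def K_bad(x, k):
    """A function that violates relation (26)."""
    return k * (2.0 - x)


def kw(n, a, k, K):
    """One KW_n step of Eq. (25): (a, k) -> (n*a, K(1/n, k))."""
    return n * a, K(1.0 / n, k)


def composition_defect(K, n, m, a=1.0, k=0.7):
    """Distance between 'KW_n followed by KW_m' and the single step KW_{nm}."""
    a1, k1 = kw(n, a, k, K)
    a2, k2 = kw(m, a1, k1, K)
    a12, k12 = kw(n * m, a, k, K)
    return abs(a2 - a12) + abs(k2 - k12)


if __name__ == "__main__":
    for name, K in (("power", K_power), ("log", K_log), ("bad", K_bad)):
        print(name, composition_defect(K, 2, 3))
```

Only integer $n, m \geq 1$ are meaningful here, which reflects the discrete semi-group character of the transformation.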
Hence, the “Kadanoff–Wilson renormalization group” (KW-RG) is an approximate and discrete semi-group. For the long-distance (IR limit) physics, however, $\delta(1/n)$ is small and it is possible to use differential Lie equations.^5

(b) Polymer theory: In polymer physics, one considers the statistical properties of polymer macromolecules, which can be imagined as very long chains of identical elements (with the number of elements $N$ as big as $10^5$). The molecules swim in a solvent and form globules. This big molecular chain forms a specific pattern resembling that formed by a random walk.

The central problem of polymer theory is very close to that of a random walk and can be formulated as follows. For a long chain of $N$ “steps” (with step size $a$), one has to find the “chain size” $R_N$, i.e. the distance between the “start” and the “finish” points (the size of a “globule”), with the distribution function $f(\vartheta)$ of angles between the neighboring elements being given. For large values of $N$, the molecular size $R_N$ obeys the Flory power law $R_N \sim N^{\nu}$, with $\nu$ the Flory index.

When $N$ is given, $R_N$ is a functional of $f(\vartheta)$ which depends on external conditions (e.g., temperature $T$, properties of the solvent, etc.). If $T$ grows, $R_N$ increases and at some moment the globules touch one another. This is the polymerization process, which is very similar to a phase transition phenomenon.

The Kadanoff–Wilson blocking ideology was introduced into polymer physics by De Gennes [15]. The key idea is a grouping of $n$ neighboring elements of a chain into a new “elementary block”. It leads to the transformation $\{1 \to n,\ a \to A_n\}$, which is analogous to the one for spin lattice decimation. This transformation must be specified by a direct calculation which gives an explicit form of $A_n = \bar a(n, a)$. Here, we have a discrete semi-group. Then, by using the KW-RG technique, one finds the fixed point, obtains the Flory power law and calculates its index $\nu$.

An essential feature of a polymer chain is the impossibility of a self-intersection. This is known as the excluded volume effect in the random walk problem. Generally, the excluded volume effect yields some complications. However, using the QFT RG approach to polymers [16], it can be treated rather simply by introducing another argument which is analogous to the finite length $L$ in the transfer problem or the particle mass $m$ in QFT.

^5 In applications of these transformations to critical phenomena the notion of a fixed point is important. Generally, a fixed point is associated with power-type asymptotic behavior. Note here that, contrary to the QFT case considered in Section 1.3.2, in phase transitions we deal with an IR stable point.
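The effect of the excluded volume on the power law $R_N \sim N^{\nu}$ can be seen in a small exact enumeration. The Python sketch below is an illustration only, not the KW-RG calculation described above: it assumes a chain modelled as a self-avoiding walk on a square lattice (a hypothetical choice of model), counts all such walks of a given length, and forms an effective exponent from two short chain lengths. An ideal random walk would give $\nu = 1/2$.

```python
import math

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def saw_stats(n):
    """Return (number of n-step self-avoiding walks, sum of squared end-to-end distances)."""
    count = 0
    r2_sum = 0
    visited = {(0, 0)}

    def extend(x, y, remaining):
        nonlocal count, r2_sum
        if remaining == 0:
            count += 1
            r2_sum += x * x + y * y
            return
        for dx, dy in STEPS:
            nxt = (x + dx, y + dy)
            if nxt not in visited:  # excluded volume: no site is visited twice
                visited.add(nxt)
                extend(nxt[0], nxt[1], remaining - 1)
                visited.remove(nxt)

    extend(0, 0, n)
    return count, r2_sum


def effective_exponent(n1, n2):
    """Effective nu from <R^2> ~ N^(2 nu), using chain lengths n1 < n2."""
    c1, s1 = saw_stats(n1)
    c2, s2 = saw_stats(n2)
    return 0.5 * math.log((s2 / c2) / (s1 / c1)) / math.log(n2 / n1)


if __name__ == "__main__":
    for n in range(1, 9):
        c, s = saw_stats(n)
        print(n, c, s / c)
    print("effective nu between N=4 and N=8:", effective_exponent(4, 8))
```

Chains this short give only a rough effective exponent; the point of the sketch is that it lies above the random-walk value $1/2$.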
Besides polymers, the KW-RG technique has been used in other fields of physics, such as percolation, non-coherent radiation transfer [17], dynamical chaos [18] and others.

2.1.2. The Bogoliubov symmetry outside QFT

The original QFT-RG approach has also proliferated into other parts of theoretical physics. In the late 1950s, it was used [19] for the summation of Coulomb singularities in Bogoliubov’s theory of superconductivity based on the Fröhlich electron–phonon interaction. Twenty years later it was used in the theory of turbulence.

(a) Turbulence: To formulate the turbulence problem in terms of the RG, one has to perform the following steps [20,21]:
1. Introduce the generating functional for correlation functions.
2. Write down the path integral representation for this functional.
3. By changing the functional integration variable, find the equivalence of the statistical system to some QFT model.
4. Construct the system of Schwinger–Dyson equations for this equivalent QFT model.
5. Perform the finite renormalization procedure and derive the RG equations.
Here, the reparametrization degree of freedom physically corresponds to a change of the long-wavelength cutoff which is built into the definition of a few effective parameters.

(b) Weak shock wave: Another example can be taken from hydrodynamics. Consider a weak shock wave in the one-dimensional case at a large distance $l$ from the starting (implosion) point. The dependence of the velocity $v$ of the matter on $l$ at a given moment of time $t$ has a simple triangular shape and can be described by the expression

$$v(l) = \frac{l}{L} V \ \ {\rm at}\ l \le L; \qquad v(l) = 0 \ \ {\rm for}\ l > L,$$

where $L = L(t)$ is the front position and $V = v(L)$ is the front velocity. They are functions of time. In the absence of viscosity, the “conservation law” $LV = {\rm const}$ holds. Due to this, they can be treated as functions of the front wave position as well, $L \equiv x$, $V = V(x)$. If the physical situation is homogeneous, the front velocity $V(x)$ can be considered to be a function of only two additional relevant arguments: its own value $V_0 = V(x_0)$ at some other point ($x_0 < x$) and the $x_0$ coordinate. It can be written in the form $V(x) = G(x/x_0, V_0)$. If we pick three points $x_0$, $x_1$ and $x_2$ (for details, see Refs. [22,23]), then the initial condition may be given either at $x_0$ or at $x_1$. Thus, we obtain the FE equivalent to (11):

$$V_2 = G(x_2/x_0, V_0) = G(x_2/x_1, V_1) = G(x_2/x_1, G(x_1/x_0, V_0)).$$

(c) One-dimensional transfer: A similar argument has been given by Mnatzakanian [26] in the one-dimensional transfer problem. Imagine a half-space filled with a homogeneous medium on the surface of which some flow (of radiation or particles) with intensity $g_0$ falls from the vacuum half-space. We follow the flow as it moves into the medium to a distance $l$ from the boundary. Due to homogeneity along the $l$ coordinate, the intensity of the penetrated flow $g(l)$ depends on
two essential arguments, $g(l) = G(l, g_0)$. The values of the flow at three different points, $g_0$ (on the boundary), $g_1$ and $g_2 > g_1$, can be connected to each other by the transitivity relations $g_1 = G(\lambda, g_0)$, $g_2 = G(\lambda + l, g_0) = G(l, g_1)$, which lead to the FE

$$G(l, g) = G(l - \lambda, G(\lambda, g)). \qquad (27)$$
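The transitivity argument behind (27) can be checked on explicit examples. The Python sketch below is an illustration under assumptions not made in the text: `G_linear` models pure exponential absorption and `G_nonlinear` solves the hypothetical loss law $dg/dl = -c g^2$, with illustrative constants `KAPPA` and `C`. The shock-wave function `G_shock` follows from $LV = {\rm const}$, which gives $V(x) = V_0 x_0 / x$, i.e. $G(x, V_0) = V_0 / x$.

```python
import math

KAPPA = 0.8  # illustrative absorption coefficient
C = 0.5      # illustrative nonlinear loss coefficient


def G_linear(l, g):
    """Exponential attenuation, solving dg/dl = -KAPPA * g."""
    return g * math.exp(-KAPPA * l)


def G_nonlinear(l, g):
    """Solves dg/dl = -C * g**2 with g(0) = g."""
    return g / (1.0 + C * g * l)


def additive_defect(G, l, lam, g):
    """Residual of Eq. (27): G(l, g) - G(l - lam, G(lam, g))."""
    return abs(G(l, g) - G(l - lam, G(lam, g)))


def multiplicative_defect(G, x, t, g):
    """Residual after the change l = ln x, lam = ln t: gbar(x, g) - gbar(x/t, gbar(t, g))."""
    def gbar(x_, g_):
        return G(math.log(x_), g_)
    return abs(gbar(x, g) - gbar(x / t, gbar(t, g)))


def G_shock(x, v0):
    """Weak shock wave with L*V = const: V(x) = G(x/x0, V0) = V0 / (x/x0)."""
    return v0 / x


def shock_defect(x0, x1, x2, v0):
    """Residual of V2 = G(x2/x0, V0) = G(x2/x1, G(x1/x0, V0))."""
    return abs(G_shock(x2 / x0, v0) - G_shock(x2 / x1, G_shock(x1 / x0, v0)))


if __name__ == "__main__":
    print(additive_defect(G_linear, 2.0, 0.7, 1.5))
    print(additive_defect(G_nonlinear, 2.0, 0.7, 1.5))
    print(multiplicative_defect(G_nonlinear, 5.0, 2.0, 1.5))
    print(shock_defect(1.0, 2.5, 4.0, 3.0))
```

The same function therefore satisfies both the additive form (27) and, after the logarithmic change of variables, the multiplicative form.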
Performing a logarithmic change of variables $l = \ln x$, $\lambda = \ln t$, $G(l, g) = \bar g(x, g)$, we see that (27) is equivalent to (11).

Consider now the intensity of a reverse flow, i.e. the total amount of particles at the point $l$ moving in the backward direction. It is completely defined by $g_0$ and can be written down as $R(l, g_0)$. This function can be represented in the form $R(l, g) = R_0(g) N(l, g)$ with $R_0 \equiv R(0, g)$ and the function $N$ “normalized” on the boundary, $N(0, g) = 1$. Playing the same game with the transitivity we arrive at the FE

$$N(l, g) = Z(\lambda, g)\, N(l - \lambda, G(\lambda, g)), \qquad Z = R_0(g_1)/R_0(g), \qquad (28)$$
related to Eq. (12) by a logarithmic change of variables. One can refer to (27) and (28) as the additive version of the RG FEs, while the previous equations of Section 1, like (11), (12) and (13), are the multiplicative ones.

The transfer problem admits a modification connected with discrete inhomogeneity. Imagine the case of two different kinds of homogeneous materials separated by an inner boundary surface at $l = L$. The separation point $l = L$ may correspond, for instance, to the boundary with empty space, wherein the resulting equation is equivalent to Eq. (13).

One more generalization is related to “multiplication” of the argument $g$ as expressed by Eq. (14). Physically, this relates to the case of radiation on different frequencies $\nu_i$, $i = 1, 2, \ldots, k$ (or particles of different energies or of different types). Take the case $k = 2$ and suppose that the material of the medium has such properties that the transfer processes of the two flows are not independent. In this case, the characteristic functions of these flows, $G$ and $H$, depend on both the boundary values $g_0$ and $h_0$ and can be taken as functions $g(l) = G(l; g_0, h_0)$, $h(l) = H(l; g_0, h_0)$. After a group operation $l \to l - \lambda$, we arrive at a coupled set of functional equations

$$G(l + \lambda; g, h) = G(l; g', h'), \qquad g' \equiv G(\lambda; g, h);$$
$$H(l + \lambda; g, h) = H(l; g', h'), \qquad h' \equiv H(\lambda; g, h),$$

which is just an additive version of system (15) for $k = 2$.

Now, we can make the important conclusion that a common property yielding functional group equations is just the transitivity property of some physical quantity with respect to the way of giving its boundary or initial value. Hence, the RG symmetry is not a symmetry of equations but a symmetry of equation solutions, that is, of equations and boundary conditions considered as a whole.
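The coupled functional equations for $k = 2$ can be checked on a simple solvable case. The Python sketch below assumes a hypothetical linear transfer model, $dg/dl = -\kappa g$, $dh/dl = -\mu h + c g$, with illustrative constants `KAPPA`, `MU` and `C`; the closed-form flows $G$ and $H$ follow from that assumption, not from the text.

```python
import math

KAPPA, MU, C = 0.6, 1.1, 0.4  # illustrative constants (MU != KAPPA)


def G(l, g, h):
    """First flow: solves dg/dl = -KAPPA * g."""
    return g * math.exp(-KAPPA * l)


def H(l, g, h):
    """Second flow, fed by the first: solves dh/dl = -MU * h + C * g."""
    feed = C * g * (math.exp(-KAPPA * l) - math.exp(-MU * l)) / (MU - KAPPA)
    return h * math.exp(-MU * l) + feed


def coupled_defect(l, lam, g, h):
    """Residual of G(l+lam; g, h) = G(l; g', h') and H(l+lam; g, h) = H(l; g', h')."""
    g1 = G(lam, g, h)  # g' = G(lam; g, h)
    h1 = H(lam, g, h)  # h' = H(lam; g, h)
    return (abs(G(l + lam, g, h) - G(l, g1, h1))
            + abs(H(l + lam, g, h) - H(l, g1, h1)))


if __name__ == "__main__":
    print(coupled_defect(1.2, 0.5, 2.0, 0.3))
```

Both boundary values must be shifted together: replacing only $g$ by $g'$ while keeping $h$ fixed would break the second equation.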
2.2. Difference between the Bogoliubov RG and KW-RG

As mentioned above, RG ideas have expanded into diverse fields of physics in two different ways:
• Via direct analogy with the Kadanoff–Wilson construction (averaging over some set of degrees of freedom) in polymers, non-coherent transfer and percolation, i.e., constructing a set of models for a given physical problem.
• Via finding an exact RG symmetry by proof of the equivalence with a QFT model (e.g., in turbulence [20,21], plasma turbulence [27]) or by some other reasoning (like in the transfer problem).

To the question “Are there different renormalization groups?” the answer is positive:
1. In QFT and some simple macroscopic examples, RG symmetry is an exact symmetry of a solution formulated in terms of its natural variables.
2. In turbulence, continuous spin-field models and in some others, it is a symmetry of an equivalent QFT model.
3. In polymers, percolation, etc. (with KW blocking), the RG transformation is a transformation between different auxiliary models (specially constructed for this purpose) of a given system.

As we have shown, there is no essential difference in the mathematical formulation. There exists, however, a profound difference in physics:
• In cases 1 and 2 (as well as in some macroscopic examples), the RG is an exact symmetry of a solution.
• In the Kadanoff–Wilson type problem (spin lattice, polymers, etc.), one has to construct a set $M$ of models $M_i$. The KW-RG transformation

$$KW_n M_i = M_{ni} \quad \text{with integer } n \qquad (29)$$

acts inside a set of models.

2.3. Functional self-similarity

The RG transformations have a close connection to the concept of self-similarity. Self-similarity transformations for problems formulated by using non-linear PDEs have been well known since the last century, mainly in the dynamics of liquids and gases. They are one-parameter transformations defined as a simultaneous power scaling of independent variables $z = \{x, t, \ldots\}$, solutions $f_k(z)$ and other functions $V_i(z)$ (like external forces) entering into the equations,

$$S_\lambda : \{x' = x\lambda,\ t' = t\lambda^{a},\ f_k' = \lambda^{\varphi_k} f_k,\ V_i' = \lambda^{\nu_i} V_i\}.$$

To emphasize their power structure, we use the term power self-similarity (PS). According to Zel’dovich and Barenblatt [28], PS can be classified into two types:
(a) PS of the 1st kind with all indices $a, \ldots, \varphi, \nu, \ldots$ being integers or rational numbers (rational PS) that are usually found from the theory of dimensions;
(b) PS of the 2nd kind with irrational indices (fractal PS) which should be defined from the system’s dynamics.

To relate the RG to PS, consider the renormgroup FE $\bar g(xt, g) = \bar g(x, \bar g(t, g))$. Its general solution is known; it depends on an arbitrary function of one argument (see Eq. (19)). However, at the moment, we are interested in a special solution linear in the second argument: $\bar g(x, g) = g X(x)$. The function $X(x)$ should satisfy the equation $X(xt) = X(x) X(t)$ with the solution $X(x) = x^{\nu}$. Hence, $\bar g(x, g) = g x^{\nu}$. This means that in our special case, linear in $g$, the RG transformation (10) is reduced to the PS transformation,

$$R_t \Rightarrow S_t : \{x' = x t^{-1},\ g' = g t^{\nu}\}. \qquad (30)$$

More generally, with the RG, instead of a power law we have an arbitrary functional dependence. Thus, one can consider transformations (10), (13) and (16) as functional generalizations of the usual (i.e., power) self-similarity transformations. Hence, it is natural to refer to them as transformations of functional scaling or functional (self-)similarity (FS) rather than as RG transformations. In short, RG ≡ FS, with FS standing for functional similarity.^6

Now, we can answer the question on the physical meaning of the symmetry underlying FS and the Bogoliubov renormgroup. As we have mentioned, it is not a symmetry of a physical system or of the equation(s) for the problem at hand, but rather a symmetry of a solution considered as a function of the relevant physical variables and suitable boundary parameters. A symmetry like that can be related, in particular, to the invariance of a physical quantity described by a solution. It means that this quantity remains unaltered under group transformations changing the way in which boundary conditions are imposed. For instance, this happens in the illustration of Section 1.2, where the change of the reference point constitutes the group operation. Homogeneity is an important feature of a physical system under consideration.
However, homogeneity can be violated in a discrete manner. Imagine that such a discrete violation is connected with a certain value of $x$, say, $x = y$. In this case the RG transformation with the canonical parameter $t$ has the form (13).

The symmetry connected to FS is a very simple and frequently encountered property of physical solutions. It can be easily “discovered” in numerous problems of theoretical physics like classical mechanics, transfer theory, classical hydrodynamics, and so on [30,26,22,23] (see Section 2.1.2 above).

^6 This notion was first mentioned in [29] and formally introduced [30] in the beginning of the 1980s.

3. Solution symmetry in mathematical physics

3.1. Constructing RG-symmetries and their application

From the discussion in Sections 1.1 and 1.2 it follows that the FS transformation in QFT is the scaling transformation of an independent variable $x$ (and, possibly, a parameter $y$) accompanied by a functional transformation of the solution characteristic $g$. It is introduced by means of either finite transformations (10), (13) and (16) or the infinitesimal operator (7). Hence, the symmetry of a solution, i.e., FS symmetry, is commonly understood in QFT as the Lie point symmetry of a one-parameter transformation group defined by an operator of (7)-type.

Fig. 1. RGS construction and application to a BVP in mathematical physics.

Now, we are interested in answers to the following questions:
• is it possible to extend the notion of RG symmetry (RGS) and generalize the form of RGS implementation so that it may differ from that given by (7)? And if “yes”,
• is it possible to create a regular algorithm for finding these symmetries?
The answer is yes to both these questions, and below we demonstrate a regular algorithm for constructing an RGS in mathematical physics that up to now has been devised only for boundary value problems (BVPs) for the (system of) differential equation(s) which we shall refer to as basic equations (BEs). The point is that these models can be analyzed by methods of Lie group analysis, which employ infinitesimal group transformations instead of finite ones.

The general idea of the algorithm is to find a specific renormgroup manifold RM that contains the desired solution of a BVP. Then, construction of an RGS that leaves this solution unaltered is performed by using standard methods of group analysis of DEs.

The regular algorithm for constructing RGSs (and their application) can be formulated in the form of a scheme^7 which comprises a few steps. It is illustrated in Fig. 1.

(I) First of all, a specific renormgroup manifold RM for the given BVP is constructed, which is identified below with a system of the $k$th-order DEs

$$F_\sigma(z, u, u_{(1)}, \ldots, u_{(k)}) = 0, \qquad \sigma = 1, \ldots, s. \qquad (31)$$

^7 In the present form this scheme was described in [31]. One can find there historical comments and references to the pioneering publications.
In (31) and what follows we use the terminology of group analysis and the notation of differential algebra. In contrast to mathematical analysis, where we usually deal with functions $u^\alpha$, $\alpha = 1, \ldots, m$, of independent variables $x^i$, $i = 1, \ldots, n$, and derivatives $u^\alpha_i(x) \equiv \partial u^\alpha/\partial x^i$, $u^\alpha_{ij}(x) \equiv \partial^2 u^\alpha/\partial x^i \partial x^j, \ldots$ that are also considered as functions of $x$, in differential algebra we also treat $u^\alpha, u^\alpha_i, u^\alpha_{ij}, \ldots$ as variables. Therefore, in differential algebra we deal with an infinite number of variables

$$x = \{x^i\}, \quad u = \{u^\alpha\}, \quad u_{(1)} = \{u^\alpha_i\}, \quad u_{(2)} = \{u^\alpha_{i_1 i_2}\}, \ldots, \quad (i, i_1, \ldots = 1, \ldots, n), \qquad (32)$$
where the $x^i$ are called independent variables, $u^\alpha$ dependent variables and $u_{(1)}, u_{(2)}, \ldots$ derivatives. A locally analytic function $f(x, u, u_{(1)}, \ldots, u_{(k)})$ of the variables (32), with the highest-order derivative being of the $k$th order, is called a differential function of order $k$. The set of all differential functions of a given order forms a space of differential functions $A$, the universal space of modern group analysis [32–35].

The realization of the first step is not unique, as it depends on both the form of the basic equations and the boundary conditions; generally, the RM does not coincide with the BEs. We indicate here a few possibilities for achieving this step.
• One can use an extension of the space of variables involved in the group transformations. These variables may be, for example, parameters $p = \{p_j\}$, $j = 1, \ldots, l$, entering into a solution via the equations and/or boundary conditions. Adding the parameters $p$ to the list of independent variables, $z = \{x, p\}$, we treat the BEs in this extended space as the RM (31). Similarly, one can extend the space of differential variables by treating derivatives with respect to $p$ as additional differential variables.
• Another possibility employs reformulating the boundary conditions in terms of embedding equations or differential constraints which are then combined with the BEs. The key idea here is to treat the solution of the BVP simultaneously as an analytic function of the independent variables and of the boundary parameters $b = \{x^i_0, u^\alpha_0\}$. Differentiation with respect to these parameters leads to additional DEs (embedding equations) that, together with the BEs, form an RM. In some cases, while calculating the Lie point RGS, the role of the embedding equations can be played by differential constraints (for details see [31]) that come from an invariance condition for the BEs with respect to the Lie–Bäcklund^8 symmetry group.
• In the case when the BEs contain a small parameter, the desired RM can be obtained by simplification of these equations and use of “perturbation methods of group analysis” (see Vol. 3, Chapter 2, p. 31 in [34]). The main idea here is to consider a simplified model (with the small parameter set to zero), which admits a wider symmetry group (see examples in Section 4.2 below) in comparison with the case of a non-zero parameter. When we take the contributions from the small parameter into account, this symmetry is inherited by the BEs, which results in some additional terms, corrections in powers of the small parameter, in the RGS generator.

^8 We use here the terminology adopted in the Russian literature [32,35]. This symmetry is also known as generalized or higher-order symmetry [33,34].

(II) The next step consists of calculating the most general symmetry group $G$ that leaves the RM unaltered. The term “symmetry group”, as used in classical group analysis, means
the property of system (31) that it admits a local Lie group of point transformations in the space $A$. The Lie algorithm for finding such symmetries consists of constructing tangent vector fields defined by the operator

$$X = \xi^i \partial_{x^i} + \eta^\alpha \partial_{u^\alpha}, \qquad \xi^i, \eta^\alpha \in A, \qquad (33)$$

where the coordinates $\xi^i, \eta^\alpha$ are functions of the group variables and have to be determined by a system of equations

$$X F_\sigma \big|_{(31)} = 0, \qquad \sigma = 1, \ldots, s, \qquad (34)$$
that follow from the invariance of the RM. Here $X$ is extended^9 to all derivatives involved in $F_\sigma$, and the symbol $|_{(31)}$ means calculated on the frame (31). The linear homogeneous PDEs (34) for the coordinates $\xi^i, \eta^\alpha$ are known as determining equations and, as a rule, form an overdetermined system.

The solution of Eqs. (34) defines a set of infinitesimal operators (33) (also known as group generators), which correspond to the admitted vector field and form a Lie algebra. In the case that the general element of this algebra,

$$X = \sum_j A^j X_j, \qquad (35)$$

where $A^j$ are arbitrary constants, contains a finite number of operators, $1 \le j \le l$, the group is called finite dimensional (or simply finite) with the dimension $l$; otherwise, for unlimited $j$ or in the case that the coordinates $\xi^i, \eta^\alpha$ depend upon arbitrary functions of the group variables, the group is called infinite.

The use of the infinitesimal criterion (34) for calculating the symmetry groups makes the whole procedure algorithmic, so that it can be carried out not only “by hand” but also using symbolic packages of computer algebra (see, e.g., Vol. 3 in [34]). In modern group analysis, different modifications of the classical Lie scheme are in use (see [32–34] and references therein).

Generator (33) of the group $G$ is equivalent to the canonical Lie–Bäcklund operator

$$Y = \varkappa^\alpha \partial_{u^\alpha}, \qquad \varkappa^\alpha \equiv \eta^\alpha - \xi^i u^\alpha_i, \qquad (36)$$

that is known as a canonical representation of $X$ and plays an essential role in RGS construction.

The group defined by generators (33) and (36) is, in general, wider than the desired RG, which usually appears as its subgroup. As the RGS is related to a particular BVP solution, it can be revealed by restricting the admitted group $G$ to a manifold defined by this given solution.

(III) Hence, to obtain the RGS, the restriction of the group $G$ to a particular BVP solution should be made, and this forms the third step. Mathematically, this procedure appears as checking the vanishing condition for the linear combination of coordinates $\varkappa^\alpha_j$ of the canonical operator equivalent to (35) on a particular approximate (or exact) BVP solution $U^\alpha(z)$:

$$\Big\{ \sum_j A^j \varkappa^\alpha_j \equiv \sum_j A^j \big( \eta^\alpha_j - \xi^i_j u^\alpha_i \big) \Big\}\Big|_{u^\alpha = U^\alpha(z)} = 0. \qquad (37)$$

^9 The extension of the generators to the derivatives employs the prolongation formulas and is a regular procedure in group analysis (see, e.g. [34]).
Evaluating (37) on a particular BVP solution $U^\alpha(z)$ transforms the system of DEs for the group invariants into algebraic relations.^10 Firstly, it gives relations between the $A^j$, thus “combining” different coordinates of the group generators $X_j$ admitted by the RM (31). Secondly, it eliminates (partially or entirely) the arbitrariness that may appear in the coordinates $\xi^i, \eta^\alpha$ in the case of an infinite group $G$. In terms of the “classic” QFT RG terminology, where there exists only one operator $X$ of (7)-type (i.e., all $A^j$ except one are equal to zero), the procedure of group restriction on a particular BVP solution $\bar g_{\rm appr}$ eliminates the arbitrariness in the form of the $\beta(g)$-function.

While the general form of the condition given by Eq. (37) is the same for any BVP solution, the way the restriction procedure is realized in every particular case employs a particular perturbation approximation (PA) for the concrete BVP.

Generally, the restriction procedure reduces the dimension of $G$. It also “fits” boundary conditions into operator (35) by a special choice of coefficients $A^j$ and/or by choosing the particular form of arbitrary functions of the coordinates $\xi^i, \eta^\alpha$. Hence, the general element (35) of the group $G$ after the fulfillment of a restriction procedure is expressed as a linear combination of the new generators $R_j$ with the coordinates $\tilde\xi^i, \tilde\eta^\alpha$,

$$X \Rightarrow R = \sum_j B^j R_j, \qquad R_j = \tilde\xi^i_j \partial_{x^i} + \tilde\eta^\alpha_j \partial_{u^\alpha}, \qquad (38)$$

where the $B^j$ are arbitrary constants.

The set of RGS generators $R_j$, each containing the desired BVP solution in its invariant manifold, defines a group of transformations that we also refer to as a renormgroup. This symmetry group is wider than the one considered in QFT, as the set of generators $R_j$ generally forms a finite or infinite dimensional algebra. Moreover, $R_j$ may correspond to a Lie–Bäcklund symmetry. Therefore, here we extend the notion of renormgroup and RG symmetry, the direct analogy with the “Bogoliubov RG” being preserved only for a one-parameter group of point transformations.

(IV) The three steps prescribed above entirely define a regular algorithm for RGS construction but do not touch on how a BVP solution is found. Hence, one more important, fourth, step should be added. It consists of using the RGS generators to find analytical expressions for the new, “improved”, solution of the BVP.

Mathematically, this step makes use of the RG = FS invariance conditions that are given by a combined system of (31) and the vanishing condition for the linear combination of coordinates $\tilde\varkappa^\alpha_j$ of the canonical operator equivalent to (38),

$$\sum_j B^j \tilde\varkappa^\alpha_j \equiv \sum_j B^j \big( \tilde\eta^\alpha_j - \tilde\xi^i_j u^\alpha_i \big) = 0. \qquad (39)$$

One can see that conditions (39) are akin to (37). However, in contrast to the previous step, the differential variables $u^\alpha$ in (39) should not be replaced by an approximate expression for the BVP solution $U^\alpha(z)$, but should be treated as normal dependent variables.

^10 Similar relations were discussed in [32, Chapter 8], when constructing invariant solutions for the Cauchy problem for a quasi-linear system of first order PDEs.
For the one-parameter Lie point renormgroup, RG invariance conditions lead to the first order PDE that gives rise to the so-called group invariants (such as invariant couplings in QFT), which arise as solutions of associated characteristic equations. A general solution of the BVP is now expressed in terms of these invariants. On the one hand, this is in direct analogy with the structure of RG invariant solutions in QFT (compare with Eqs. (22) and (24)). On the other hand, it reminds one of the so-called Π-theorem from the theory of dimensional analysis and similitude (see Section 19 in [32], Section 6 of Chapter 1 in [36] and the historical comment to Section 43 in [37]), directly related to power self-similarity, discussed above in Section 2.3.

However, as we shall see later, in the general case of arbitrary RGS the group invariance conditions obtained for a BVP are not necessarily characteristic equations for the Lie point group operator. They may appear in a more complicated form, e.g., as a combination of PDEs and higher order ODEs (see Section 4.2). Nevertheless, the general idea of finding solutions to the BVP in terms of RG invariant solutions remains valid.

3.2. Examples of solution improvement

We now present a few examples of RGS construction with further use of the symmetry for “improving” an approximate solution.

3.2.1. Modified Burgers equation

As the first example, we take the initial value problem for the modified Burgers equation

$$u_t - a u_x^2 - \nu u_{xx} = 0, \qquad u(0, x) = f(x). \qquad (40)$$

It is connected to the heat equation

$$\tilde u_t = \nu \tilde u_{xx} \qquad (41)$$

by the transformation $\tilde u = \exp(au/\nu)$ and has an exact solution, which therefore allows us to check the validity of our approach. The RGS construction for (40) is an apt illustration of the general scheme shown in Fig. 1, which may be helpful in understanding other examples of the implementation of the general algorithm. We review here briefly the procedure and results of Ref. [38].

The RG-manifold RM (step (I)) is given by Eq. (40) with the parameters of non-linearity $a$ and dissipation $\nu$ included in the list of independent variables. The Lie calculational algorithm applied to the RM gives, for the admitted group $G$ (step (II)), nine independent terms in the general expression for the group generator

$$X = \sum_{i=1}^{8} A^i(a, \nu) X_i + \alpha(t, x, a, \nu)\, e^{-au/\nu} \partial_u, \qquad (42)$$

$$X_1 = 4\nu t^2 \partial_t + 4\nu t x \partial_x - (\nu/a)(x^2 + 2\nu t) \partial_u, \qquad X_2 = 2t \partial_t + x \partial_x,$$
$$X_3 = (1/\nu) \partial_t, \qquad X_4 = 2\nu t \partial_x - (\nu/a) x \partial_u, \qquad X_5 = \partial_x, \qquad X_6 = -(\nu/a) \partial_u,$$

$$X_7 = a \partial_a + [(\nu/a) - u] \partial_u, \qquad X_8 = 2\nu \partial_\nu + x \partial_x + 2[u - (\nu/a)] \partial_u.$$

Here, $A^i(a, \nu)$ are arbitrary functions of their arguments and $\alpha(t, x, a, \nu)$ is an arbitrary function of four variables, satisfying the heat equation (41). The set of operators $X_i$ forms an eight-dimensional Lie algebra, $L_8$. The first six generators are related to the well-known symmetries of the modified (potential) Burgers equation (see, e.g., Vol. 1, p. 183 in [34]). They describe projective transformations in the $(t, x)$-plane ($X_1$), dilatations in the same plane ($X_2$), translations along the $t$-, $x$- and $u$-axes ($X_3$, $X_5$ and $X_6$) and Galilean transformations ($X_4$). The last two generators $X_7$ and $X_8$ relate to dilatations of the parameters $a$ and $\nu$, now involved in group transformations.

The procedure of restriction (step (III)) of the group (42) admitted by the RM (40) gives us a check of the invariance condition (37) on a particular BVP solution $u = U(t, x, a, \nu)$:

$$\Big\{ \eta^\infty + \sum_{i=1}^{8} A^i(a, \nu) \varkappa_i \Big\}\Big|_{u = U(t, x, a, \nu)} = 0, \qquad \varkappa_i \equiv \eta_i - \xi^1_i u_t - \xi^2_i u_x - \xi^3_i u_a - \xi^4_i u_\nu. \qquad (43)$$
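This formula expresses the coordinate $\eta^\infty = \alpha\, e^{-au/\nu}$ of the last term in (42) in terms of the remaining coordinates of the eight generators $X_i$ for arbitrary $t$, and hence for $t = 0$, when $U(0, x, a, \nu) = f(x)$. As a result, we obtain the “initial” value $\alpha(0, x, a, \nu)$ and then, using the standard representation for the solution to the linear parabolic equation (41), the value of $\alpha$ at arbitrary $t \neq 0$:

$$\alpha(t, x, a, \nu) = - \sum_{i=1}^{8} A^i(a, \nu) \langle \bar\varkappa_i(x, a, \nu) \rangle. \qquad (44)$$

Here, $\bar\varkappa_i(x, a, \nu)$ denote “partial” canonical coordinates $\varkappa_i$ taken at $t = 0$ and $u = f(x)$. The symbol $\langle F \rangle$ designates the convolution of a function $F$ with the fundamental solution of (41), multiplied by the exponential function of $f$ entering into the boundary condition:

$$\langle F(x, t, a, \nu) \rangle \equiv \frac{1}{\sqrt{4\pi\nu t}} \int_{-\infty}^{\infty} dy\, F(y, t, a, \nu) \exp\left( -\frac{(x - y)^2}{4\nu t} + \frac{a f(y)}{\nu} \right).$$

Substitution of (44) in the general expression (42) gives the desired RG generators

$$R_i = X_i + \varrho_i\, e^{-au/\nu} \partial_u,$$

$$\varrho_1 = \frac{\nu}{a} \langle x^2 \rangle, \quad \varrho_2 = \langle x f_x \rangle, \quad \varrho_3 = \frac{1}{\nu} \langle a f_x^2 + \nu f_{xx} \rangle, \quad \varrho_4 = \frac{\nu}{a} \langle x \rangle,$$

$$\varrho_5 = \langle f_x \rangle, \quad \varrho_6 = \frac{\nu}{a} \langle 1 \rangle, \quad \varrho_7 = \Big\langle f - \frac{\nu}{a} \Big\rangle, \quad \varrho_8 = \Big\langle x f_x - 2f + 2\frac{\nu}{a} \Big\rangle.$$

The operators $R_i$ form an eight-dimensional RG algebra $RL_8$ that has the same tensor of structure constants as $L_8$, i.e. $RL_8$ and $L_8$ are isomorphic. Hence, the group restriction procedure eliminates the arbitrariness presented by the function $\alpha$ and “fits” the boundary conditions into the RG generators by means of $\varrho_i$.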
It can be verified that the exact solution of the initial-value problem (40),
$$u(t, x, a, \nu) = \frac{\nu}{a} \ln \langle 1 \rangle \equiv \frac{\nu}{a} \ln \left[ \frac{1}{\sqrt{4\pi\nu t}} \int_{-\infty}^{\infty} dy\, \exp\left( -\frac{(x - y)^2}{4\nu t} + \frac{a f(y)}{\nu} \right) \right] , \qquad (45)$$
is the invariant manifold for any of the above RGS operators. Conversely, (45) can be reconstructed from an approximate solution with the help of any of the RGS operators or their linear combination. For example, two such operators, $\nu R_3 \equiv R_t$ and $(1/a)(R_6 + R_7) \equiv R_a$, were used in [38] to reconstruct the exact solution from perturbative (in time and in the non-linearity parameter $a$) solutions. Below, we describe this procedure (step (IV)) using the operator $R_a$,
$$R_a = \partial_a + \frac{1}{a}\left( -u + e^{-au/\nu} \langle f(x) \rangle \right) \partial_u . \qquad (46)$$
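The two statements above can be checked numerically. The following is a minimal sketch, not part of the original derivation. It assumes that Eq. (40) has the form $u_t = a u_x^2 + \nu u_{xx}$, $u(0, x) = f(x)$ (which is what the Hopf–Cole structure of (45) implies), picks the illustrative boundary function $f(x) = \cos x$, and evaluates $\langle F \rangle$ by a plain trapezoid rule. The function names and parameter values are hypothetical choices made for this example only.

```python
import math

def bracket(F, f, x, t, a, nu, half_width=12.0, n=4001):
    """<F>: heat-kernel convolution of F(y) * exp(a f(y) / nu), trapezoid rule."""
    sigma = math.sqrt(2.0 * nu * t)          # standard deviation of the kernel
    lo = x - half_width * sigma
    h = 2.0 * half_width * sigma / (n - 1)
    total = 0.0
    for k in range(n):
        y = lo + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * F(y) * math.exp(-(x - y) ** 2 / (4.0 * nu * t) + a * f(y) / nu)
    return total * h / math.sqrt(4.0 * math.pi * nu * t)

def u_exact(f, t, x, a, nu):
    """Eq. (45): u = (nu / a) * ln <1>."""
    return nu / a * math.log(bracket(lambda y: 1.0, f, x, t, a, nu))

def pde_residual(f, t, x, a, nu, h=1e-3):
    """u_t - a u_x^2 - nu u_xx by central differences (assumed form of Eq. (40))."""
    u0 = u_exact(f, t, x, a, nu)
    u_t = (u_exact(f, t + h, x, a, nu) - u_exact(f, t - h, x, a, nu)) / (2 * h)
    up, um = u_exact(f, t, x + h, a, nu), u_exact(f, t, x - h, a, nu)
    u_x = (up - um) / (2 * h)
    u_xx = (up - 2 * u0 + um) / h ** 2
    return u_t - a * u_x ** 2 - nu * u_xx

def ra_residual(f, t, x, a, nu, h=1e-4):
    """Invariance of (45) under R_a of Eq. (46): u_a - (1/a)(-u + e^{-au/nu} <f>)."""
    u0 = u_exact(f, t, x, a, nu)
    u_a = (u_exact(f, t, x, a + h, nu) - u_exact(f, t, x, a - h, nu)) / (2 * h)
    rhs = (-u0 + math.exp(-a * u0 / nu) * bracket(f, f, x, t, a, nu)) / a
    return u_a - rhs

if __name__ == "__main__":
    f = math.cos  # illustrative boundary function, an assumption of this sketch
    print("PDE residual:", pde_residual(f, t=0.5, x=0.3, a=0.7, nu=0.4))
    print("R_a residual:", ra_residual(f, t=0.5, x=0.3, a=0.7, nu=0.4))
```

Both residuals should be small compared with the individual terms, limited only by the finite-difference steps. The second one expresses the invariant-manifold condition $R_a (u - U) = 0$ on $u = U(t, x, a, \nu)$.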
It is evident that $t$, $x$ and $\nu$ are invariants of the group transformations with (46), whilst finite RG transformations of the two remaining variables, $a$ and $u$, are obtained by solving the Lie equations for (46), with $\ell$ the group parameter:
$$\frac{da'}{d\ell} = 1, \quad a'|_{\ell=0} = a; \qquad \frac{du'}{d\ell} = \frac{1}{a'} \langle f(x) \rangle_{a'}\, e^{-a'u'/\nu} - \frac{u'}{a'}, \quad u'|_{\ell=0} = u , \qquad (47)$$
where the subscript $a'$ indicates that the convolution $\langle \cdot \rangle$ is taken with $a$ replaced by $a'$. Combining these equations yields one more invariant, $J = e^{au/\nu} - \langle 1 \rangle$, for the RGS generator (46). Solution of (47) along with (44) gives the formulae for finite RG transformations of the group variables $\{t, x, a, \nu, u\}$:
$$t' = t, \quad x' = x, \quad \nu' = \nu, \quad a' = a + \ell, \quad u' = \frac{\nu}{a + \ell} \ln\left( e^{au/\nu} + \langle e^{\ell f(x)/\nu} - 1 \rangle \right) . \qquad (48)$$
Choosing the value $a$ equal to zero, which is a starting point of PA in $a$, we get $a' = \ell$. Then, after excluding $t$, $x$, $\nu$ and $\ell$ from the expression for $u'$ in (48) and omitting accents over $t'$, $x'$, $\nu'$, $u'$ and $a'$, the desired BVP solution (45) is obtained. It also follows directly from $J$ in view of the initial condition $J|_{a=0} = 0$. A similar procedure can be followed for the other RG operator,
$$R_t = \partial_t + e^{-au/\nu} \langle a f_x^2 + \nu f_{xx} \rangle\, \partial_u ,$$
which is consistent with the PA in time $t$. Although the invariants for $R_t$ and the finite RG transformations differ from those for (46), the final result, i.e., the exact solution of BVP (40) given by (45), is the same. This clearly demonstrates the ability of the multi-dimensional RGS to reconstruct the unique BVP solution from different PAs: either in the parameter $a$ or in $t$ (though we used only two one-dimensional subalgebras here^11).

3.2.2. BVPs for ODEs: a simple example

Quite recently, the QFT RG ideology has been applied, in a rather straightforward fashion, in mathematical physics to the asymptotic analysis of solutions to DEs [41,42] and in constructing an envelope for a family of solutions [43]. Our second methodological example with a linear ODE is presented here in order to illustrate the difference between our approach and the "perturbative RG theory" devised in [41] for a global analysis of BVP solutions in mathematical physics.

^11 This can be considered as a construction parallel to the one used in Ref. [39].
Consider a linear second-order ODE for $y(t)$ with the initial conditions at $t = \tau$,
$$y_{tt} + y_t + \epsilon y = 0, \qquad y(\tau) = \tilde u, \quad y_t(\tau) = \tilde w , \qquad (49)$$
which has the exact solution
$$y = C_+ e^{-\lambda_+ (t - \tau)} + C_- e^{-\lambda_- (t - \tau)}, \quad \lambda_\pm = \frac{1 \pm K}{2}, \quad K = \sqrt{1 - 4\epsilon}, \quad C_\pm = \mp \frac{\tilde w + \lambda_\mp \tilde u}{K} . \qquad (50)$$
Provided that the parameter $\epsilon$ is small, the solutions to Eq. (49) have been treated in [41] with the goal of demonstrating the effectiveness of "perturbative RG theory" in the asymptotic analysis of a solution's behavior. The main goal of this treatment was to improve a perturbative expansion in powers of $\epsilon$ with secular terms $\propto \epsilon (t - \tau)$ and obtain^12 a uniformly valid asymptotic of a solution
$$y = c_+ e^{(-1 + \epsilon(1 + \epsilon))(t - \tau)} + c_- e^{-\epsilon(1 + \epsilon)(t - \tau)} + O(\epsilon^2) , \qquad (51)$$
$$c_+ \approx -\left( (1 + 2\epsilon)\tilde w + \epsilon \tilde u \right), \qquad c_- \approx (1 + 2\epsilon)\tilde w + (1 + \epsilon)\tilde u ,$$
which is accurate for small values $\epsilon \ll 1$ but for arbitrary values of the product $\epsilon(t - \tau)$. We are going to show that the use of our regular RG algorithm enables one to improve a PA solution (either in powers of $\epsilon$ or in $t - \tau$) up to the exact BVP solution (50). Rewriting (49) in the form of a system of two first-order ODEs for the functions $u \equiv y$ and $w \equiv y_t$,
$$u_t = w, \qquad w_t = -w - \epsilon u , \qquad (52)$$
we construct the RM (step (I)) using the invariant embedding method (this approach was first realized in [44]). Then, the RM is presented as a joint system of BEs (52) and embedding equations
$$u_\tau - (\tilde w + \epsilon \tilde u) u_{\tilde w} + \tilde w u_{\tilde u} = 0, \qquad w_\tau - (\tilde w + \epsilon \tilde u) w_{\tilde w} + \tilde w w_{\tilde u} = 0 ,$$
treated in the extended space of group variables which include the parameters $\tau$, $\tilde w$, $\tilde u$ associated with the boundary conditions in addition to $t$ and the dependent variables $u$, $w$. Omitting tedious calculations related to the following two steps (steps (II) and (III)), we present here two examples of the resulting RGS generators
$$R_\tau = \partial_\tau - (\tilde w + \epsilon \tilde u)\, \partial_{\tilde w} + \tilde w\, \partial_{\tilde u} ;$$
t 1 t t (2w + u) 1 − + u 9w + 2 (2w + u)9u RC = 9C − 2
2 2
1 − (2 w ˜ + u) ˜ 1 − + u˜ 9w˜ + 2 (2w˜ + u) ˜ 9u˜ ; (53)
2 2 2
that involve the initial values $\tilde w$, $\tilde u$ and the initial point $\tau$ in RG transformations. In addition, $R_\epsilon$ transforms the parameter $\epsilon$.
12
The algorithm used in [41] for improving PA solutions with secular terms involves: (a) an introduction of some additional parameters in the solutions, (b) a special choice of these parameters that eliminates secular divergences, and (c) imposing an independence condition on the solution with respect to the way these parameters are introduced. In some cases, this algorithm, directly borrowed from the QFT RG-method, gives an exact solution. However, the question of correspondence of this construction to a transformation group of a solution of the BEs remains open.
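As a hedged numerical illustration of this example (a sketch, not taken from [41] or [44]), the code below evaluates the exact solution (50) and the uniformly valid asymptotic (51), and checks that shifting the initial point $\tau$ along the solution, which is the finite transformation generated by $R_\tau$, leaves $y(t)$ unchanged. It assumes the form $y_{tt} + y_t + \epsilon y = 0$ of Eq. (49); function names and the sample values are illustrative.

```python
import math

def exact(t, tau, u0, w0, eps):
    """Eq. (50): (y, y_t) at time t for data y(tau) = u0, y_t(tau) = w0."""
    K = math.sqrt(1.0 - 4.0 * eps)
    lam_p, lam_m = (1.0 + K) / 2.0, (1.0 - K) / 2.0
    C_p = -(w0 + lam_m * u0) / K
    C_m = (w0 + lam_p * u0) / K
    s = t - tau
    e_p, e_m = math.exp(-lam_p * s), math.exp(-lam_m * s)
    return C_p * e_p + C_m * e_m, -lam_p * C_p * e_p - lam_m * C_m * e_m

def uniform_asymptotic(t, tau, u0, w0, eps):
    """Eq. (51) without the O(eps^2) remainder."""
    c_p = -((1.0 + 2.0 * eps) * w0 + eps * u0)
    c_m = (1.0 + 2.0 * eps) * w0 + (1.0 + eps) * u0
    s = t - tau
    return (c_p * math.exp((-1.0 + eps * (1.0 + eps)) * s)
            + c_m * math.exp(-eps * (1.0 + eps) * s))

def shift_initial_point(delta, tau, u0, w0, eps):
    """Finite R_tau transformation: move (tau, u~, w~) along the solution by delta."""
    u1, w1 = exact(tau + delta, tau, u0, w0, eps)
    return tau + delta, u1, w1

if __name__ == "__main__":
    eps, tau, u0, w0, t = 0.01, 0.0, 1.0, 0.5, 50.0
    y_exact, _ = exact(t, tau, u0, w0, eps)
    print("exact (50):      ", y_exact)
    print("asymptotic (51): ", uniform_asymptotic(t, tau, u0, w0, eps))
    tau1, u1, w1 = shift_initial_point(3.0, tau, u0, w0, eps)
    print("after tau-shift: ", exact(t, tau1, u1, w1, eps)[0])
```

The first two numbers should agree to a few parts in $10^{4}$ even though $\epsilon(t - \tau) = 0.5$ is not small, while the third should reproduce the first up to rounding.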
Now, the procedure for constructing the BVP solution (50) (step (IV)) is similar to that used in the previous Section 3.2.1 and employs finite transformations that are defined by the Lie equations for operators (53). For $R_\tau$, the functions $u$, $w$ and the parameter $\epsilon$ are group invariants, while the translations of $\tau$ and the corresponding transformations of $\tilde u$, $\tilde w$ restore the exact solution (50) from the PA in powers of $t - \tau$ (note that the parameter $\epsilon$ is not necessarily small in this PA!). For $R_\epsilon$ the difference $t - \tau$ is a group invariant, whilst the transformation of $\epsilon$ and the related transformations of $u$, $w$, $\tilde u$, $\tilde w$ restore the exact solution (50) from the PA (discussed in [41]) in powers of $\epsilon$. Hence, as in the previous Section 3.2.1, both the RGS generators (53) reconstruct the unique BVP solution but from different PAs.

4. The RG in non-linear optics

4.1. Formulation of a problem

As a problem of real physical interest, take the BVP that describes self-focusing of a high-power light beam. While the problem has played an important role in non-linear electrodynamics since the 1960s, a detailed quantitative understanding of self-focusing is still missing [45], and there is no method which allows one to find an analytic solution to the corresponding equations with arbitrary boundary conditions. Here, we demonstrate the potential of the RGS approach in constructing analytic solutions to BVP equations with arbitrary boundary conditions. The RGS method allows one to consider different types of BEs for self-focusing processes which include plane and cylindrical beam geometry, non-linear refraction and diffraction. The merit of the RGS method is that it describes BVP solutions with one- or two-dimensional singularities in the entire range of variables from the boundary up to the singularity point. Let us start with the BVP for a system of two DEs
$$v_z + v v_x - \alpha n_x = 0, \qquad n_z + n v_x + v n_x + (\nu - 1)\frac{n v}{x} = 0 , \qquad (54)$$
$$v(0, x) = 0, \qquad n(0, x) = N(x) , \qquad (55)$$
which are used in the non-linear optics of self-focusing wave beams when diffraction is negligible. We study the spatial evolution of the derivative of the beam eikonal $v$ and the beam intensity $n$ in the direction into the medium $z$ and in the transverse direction $x$. The term proportional to $\alpha$ is related to non-linear refraction effects; $\nu = 1$ and $\nu = 2$ refer to the plane and cylindrical beam geometry, respectively. Boundary conditions (55) correspond to the plane front of the beam and an arbitrary transverse intensity distribution.

4.2. Plane geometry

In the plane beam geometry (at $\nu = 1$) Eqs. (54) can be reduced to the system of BEs
$$\tau_w - n \chi_n = 0, \qquad \chi_w + \alpha \tau_n = 0 \qquad (56)$$
for the functions $\tau = nz$ and $\chi = x - vz$ of the arguments $w = v/\alpha$ and $n$, with boundary conditions
$$\tau(0, n) = 0, \qquad \chi(0, n) = H(n) , \qquad (57)$$
where $H(n)$ is the inverse to $N(x)$. Here, the procedure of RGS construction makes use of the Lie–Bäcklund symmetry and is described as follows [31]. The manifold RM (step (I)) is defined by Eqs. (56) treated in the extended space that includes the dependent and independent variables $\tau$, $\chi$, $w$, $n$ and derivatives of $\tau$ and $\chi$ with respect to $n$ of arbitrarily high order. The admitted symmetry group $G$ (step (II)) is represented by the canonical Lie–Bäcklund operator
$$X = f\, \partial_\tau + g\, \partial_\chi \qquad (58)$$
with the coordinates $f$ and $g$ that are linear combinations of $\tau$ and $\chi$ and their derivatives $\partial^i \tau / \partial n^i$ and $\partial^i \chi / \partial n^i$, $i \geq 1$, with coefficients depending on $w$ and $n$. The restriction of the group admitted by RM (56) (step (III)) implies checking the invariance condition (37), which yields two relations
$$f = 0, \qquad g = 0 . \qquad (59)$$
These relations should be valid on a particular solution of the BVP with the boundary data (57). For example, choosing the so-called "soliton" profile, $N(x) = \cosh^{-2}(x)$, i.e., $H(n) = \mathrm{Arccosh}(1/\sqrt{n})$, we have
$$f = 2n(1 - n)\tau_{nn} - n\tau_n - 2nw(\chi_n + n\chi_{nn}) + \frac{\alpha w^2}{2}\, n\tau_{nn} ,$$
$$g = 2n(1 - n)\chi_{nn} + (2 - 3n)\chi_n + \alpha w(2n\tau_{nn} + \tau_n) + \frac{\alpha w^2}{2}(n\chi_{nn} + \chi_n) . \qquad (60)$$
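The role of the boundary data in condition (59) can be illustrated with a short numerical sketch (an illustration under stated assumptions, not part of the original construction). On the boundary $w = 0$ one has $\tau = 0$ and $\chi = H(n)$, so $f$ in (60) vanishes trivially and $g$ reduces to $2n(1 - n)H_{nn} + (2 - 3n)H_n$. The code evaluates this expression by finite differences for the soliton profile and, for contrast, for a Gaussian profile $H(n) = (\ln(1/n))^{1/2}$, for which (60) is not expected to vanish. Function names and step sizes are illustrative.

```python
import math

def H_soliton(n):
    """Inverse of N(x) = cosh(x)**-2 for x >= 0: H(n) = arccosh(1 / sqrt(n))."""
    return math.acosh(1.0 / math.sqrt(n))

def H_gauss(n):
    """Inverse of N(x) = exp(-x**2) for x >= 0: H(n) = sqrt(ln(1 / n))."""
    return math.sqrt(math.log(1.0 / n))

def g_on_boundary(H, n, h=1e-4):
    """g of Eq. (60) at w = 0, where tau = 0 and chi = H(n)."""
    H_n = (H(n + h) - H(n - h)) / (2.0 * h)
    H_nn = (H(n + h) - 2.0 * H(n) + H(n - h)) / h ** 2
    return 2.0 * n * (1.0 - n) * H_nn + (2.0 - 3.0 * n) * H_n

if __name__ == "__main__":
    for n in (0.2, 0.5, 0.8):
        print(n, g_on_boundary(H_soliton, n), g_on_boundary(H_gauss, n))
```

For the soliton profile the printed values should vanish up to finite-difference error, which is the statement that (60) is compatible with the boundary data (57); for the Gaussian profile they should not.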
Dependence on $\tau_{nn}$ and $\chi_{nn}$ indicates that here the RGS is a second-order Lie–Bäcklund symmetry. In order to find a particular solution of a BVP (step (IV)), one should solve the joint system of BEs (56) and second-order ODEs that follow from the RG = FS invariance conditions (59) and (60). The resulting expressions [46], the well-known Khokhlov solutions^13
$$v = -2\alpha n z \tanh(x - vz), \qquad \alpha n^2 z^2 = n \cosh^2(x - vz) - 1 , \qquad (61)$$
describe the process of self-focusing of a soliton beam: the sharpening of the beam intensity profile with the increase of $z$ is accompanied by the intensity growth on the beam axis. Solution (61) is valid up to the singularity point where the derivatives $v_x$ and $n_x$ tend to infinity whilst the beam intensity $n$ remains finite:
$$z^{\rm sol}_{\rm sing} = \frac{1}{2\sqrt{\alpha}}, \qquad n^{\rm sol}_{\rm sing} = 2 . \qquad (62)$$
Here, the Lie–Bäcklund RGS enables one to reconstruct the BVP solution and describe the solution singularity for the light beam with the soliton initial intensity profile. One more example of an exact BVP solution obtained with the help of the Lie–Bäcklund RGS (with the initial beam profile in the form of a "smoothed" step) can be found in [46].

For arbitrary boundary data, it turns out to be impossible to fulfill condition (59) with the help of Lie–Bäcklund symmetries of any finite order, and one is forced to use a different algorithm [31,46] of RGS construction, based on the approximate group methods. Here (step (I)), RM is given by BEs (56) with a small parameter $\alpha$, and the coordinates of the group generator (58) (and, hence, the coordinates of the RGS operators) appear as infinite series in powers of $\alpha$:
$$f = \sum_{i=0}^{\infty} \alpha^i f^i, \qquad g = \sum_{i=0}^{\infty} \alpha^i g^i . \qquad (63)$$

^13 In Ref. [47], where this solution was first obtained, it did not result from a regular procedure.
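Returning briefly to the exact soliton-beam result, the implicit Khokhlov solution (61) and the singularity values (62) can be checked numerically. The sketch below is an illustration only: it assumes Eqs. (54) with $\nu = 1$ in the form $v_z + v v_x - \alpha n_x = 0$, $n_z + (nv)_x = 0$, solves (61) for $(v, n)$ by a two-dimensional Newton iteration started from the boundary data, and then evaluates the residuals of (54) by central differences. The Newton solver, the function names and the sample point are choices made for this example, not part of the original text.

```python
import math

def khokhlov(z, x, alpha, max_iter=60):
    """Solve the implicit relations (61) for (v, n) by Newton iteration."""
    v, n = 0.0, 1.0 / math.cosh(x) ** 2      # boundary data (55) at z = 0
    for _ in range(max_iter):
        chi = x - v * z
        th, ch, sh = math.tanh(chi), math.cosh(chi), math.sinh(chi)
        F1 = v + 2.0 * alpha * n * z * th
        F2 = alpha * n * n * z * z - n * ch * ch + 1.0
        # Jacobian of (F1, F2) with respect to (v, n)
        a11 = 1.0 - 2.0 * alpha * n * z * z * (1.0 - th * th)
        a12 = 2.0 * alpha * z * th
        a21 = 2.0 * n * z * ch * sh
        a22 = 2.0 * alpha * n * z * z - ch * ch
        det = a11 * a22 - a12 * a21
        dv = (-F1 * a22 + a12 * F2) / det
        dn = (-a11 * F2 + a21 * F1) / det
        v, n = v + dv, n + dn
        if abs(dv) + abs(dn) < 1e-15:
            break
    return v, n

def residuals(z, x, alpha, h=1e-5):
    """Residuals of Eqs. (54) with nu = 1, by central differences."""
    v0, n0 = khokhlov(z, x, alpha)
    (vzp, nzp), (vzm, nzm) = khokhlov(z + h, x, alpha), khokhlov(z - h, x, alpha)
    (vxp, nxp), (vxm, nxm) = khokhlov(z, x + h, alpha), khokhlov(z, x - h, alpha)
    v_z, n_z = (vzp - vzm) / (2 * h), (nzp - nzm) / (2 * h)
    v_x, n_x = (vxp - vxm) / (2 * h), (nxp - nxm) / (2 * h)
    return v_z + v0 * v_x - alpha * n_x, n_z + n0 * v_x + v0 * n_x

def n_on_axis(z, alpha):
    """Smaller root of alpha z^2 n^2 - n + 1 = 0, i.e. (61) at x = 0, v = 0 (z > 0)."""
    d = 4.0 * alpha * z * z
    return (1.0 - math.sqrt(max(0.0, 1.0 - d))) / (2.0 * alpha * z * z)

if __name__ == "__main__":
    alpha = 0.5
    print("residuals of (54):", residuals(0.3, 0.4, alpha))
    z_sing = 1.0 / (2.0 * math.sqrt(alpha))
    print("z_sing, n_sing:", z_sing, n_on_axis(z_sing, alpha))
```

On the axis the two roots of the quadratic merge at $4\alpha z^2 = 1$, which reproduces (62).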
The procedure of finding the coefficients $f^i$, $g^i$ (step (II)) leads to a system of recurrence relations that express the higher-order coefficients $f^{i+1}$, $g^{i+1}$ in terms of the previous ones $f^i$, $g^i$. It means that once the zero-order terms are specified, the other terms are reconstructed by the recurrence relations. The coefficients $f^i$ and $g^i$ contain an arbitrary function of $n$, $\chi_{[s]}$ and $\tau_{[s]} - w(s\chi_{[s]} + n\chi_{[s+1]})$, where the subscript $[s]$ denotes the partial derivative of order $s$ with respect to $n$. This arbitrariness is eliminated by the procedure of group restriction (step (III)), i.e., by imposing the invariance condition (59). For particular forms of $f^0$ and $g^0$, that is for particular boundary conditions (57), the infinite series are truncated automatically, and we arrive at the exact RGS. One example of this kind is given by Eqs. (60) that have a binomial structure $f = f^0 + \alpha f^1$, $g = g^0 + \alpha g^1$. If we neglect the higher-order terms in the case of arbitrary boundary conditions (when series (63) are not truncated automatically), then we get an approximate RGS which produces an approximate solution to the BVP. As an example, we give here two sets of expressions for the coordinates $f^i$ and $g^i$ for the Gaussian initial profile with $N(x) = \exp(-x^2)$, i.e., $H(n) = (\ln(1/n))^{1/2}$, which define approximate RGS:
(a) $f^0 = 1 + 2n\chi\chi_n$, $g^0 = 0$, $f^1 = -2\tau\tau_n + \tau^2/n$, $g^1 = -2(\chi\tau_n + \tau\chi_n)$;
(b) $f^0 = 2n(\tau\chi_n + \tau_n\chi)$, $g^0 = 1 + 2n\chi\chi_n$, $f^1 = 2\chi\tau_\alpha$, $g^1 = 2(\chi\chi_\alpha - \tau\tau_n)$.
Here, linear dependencies of $f$ and $g$ upon first-order derivatives indicate that the RGS is equivalent to a Lie point symmetry. The peculiarity of case (b) is a dependence of $f$ and $g$ not only on derivatives with respect to $n$ but also with respect to $\alpha$: it means that the parameter $\alpha$ is also involved in group transformations. In the non-canonical representation (33), the RGS generator in this case has the form
$$R_{\rm Gauss1} = 2\tau\, \partial_w + 2n\chi\, \partial_n + 2\alpha\chi\, \partial_\alpha - \partial_\chi . \qquad (64)$$
The last step (IV) is performed in the usual way by solving the joint system of BEs (56) and the equations that follow from the RG = FS invariance condition (59), or else, using invariants of the associated characteristic equations for the RG operator provided that the RGS is a Lie point symmetry. We give here the solution that follows from RGS (64),
$$x^2 = (1 - 2\alpha n z^2)^2 \ln \frac{1}{n(1 - \alpha n z^2)}, \qquad v = -\frac{2\alpha x n z}{1 - 2\alpha n z^2} . \qquad (65)$$
These expressions describe a self-focusing Gaussian beam (the plot $n(x)$ for this solution is presented at the end of the section in Fig. 2), that is qualitatively very similar to the spatial evolution of the soliton beam (61). Moreover, the singularity position and the value of the maximum beam intensity at this point coincide with the analogous values (62) for the soliton beam. Although formulae (65) correspond to an approximate BVP solution, they exactly describe the behavior of $n$ on the beam axis at $x = 0$. To estimate the reliability of result (65) in the off-axis region, we compared it with another approximate BVP solution which arises from the approximate RGS in case (a). These approximations agree very well (details are presented in [46]), thus confirming the accuracy^14 of the RG approach.

Fig. 2. Intensity $n$ versus transverse coordinate $x$ for a plane (left panel) and cylindrical (right panel) beam geometry for a few values of the distance $z$ from the boundary, $z_2 > z_1 > z_0 = 0$.

4.3. Cylindrical geometry

In the above discussion we dealt with the plane beam geometry and took into account only effects of non-linear beam refraction, neglecting diffraction. The flexibility of the RGS algorithm allows one to apply it in a similar way to a more complicated model as compared to (56), e.g., for the cylindrical beam geometry, $\nu = 2$. Omitting technical details, we present the RGS generator for the cylindrical parabolic beam with $N = 1 - x^2$:
$$R_{\rm par} = (1 - 2\alpha z^2)\, \partial_z - 2\alpha z x\, \partial_x - 2\alpha(x - vz)\, \partial_v + 4\alpha n z\, \partial_n .$$
The BVP solution is expressed in terms of group invariants for this generator:
$$J_1 = \frac{x^2}{\rho}, \qquad J_2 = n\rho, \qquad J_3 = 2\alpha x^2 - v^2 \rho + x v \rho_z, \qquad \rho = 1 - 2\alpha z^2 . \qquad (66)$$
The explicit form of the dependencies $J_2 = 1 - J_1$, $J_3 = 2\alpha J_1$ upon $J_1$ follows from the boundary conditions (55). They lead to the well-known solution [47]
$$v = \frac{x}{2\rho}\rho_z, \qquad n = \frac{1}{\rho}\left( 1 - \frac{x^2}{\rho} \right) \qquad (67)$$
that describes the convergence of the beam to the singularity point $z^{\rm par}_{\rm sing} = 1/\sqrt{2\alpha}$, where $\rho = 0$ and $n \to \infty$. The solution singularity is two-dimensional here: the infinite growth of the beam intensity in the vicinity of the singularity $z \to z^{\rm par}_{\rm sing}$ is accompanied by the infinite growth of the derivative $v_x$ and the collapse of the beam size in the transverse direction. The RGS algorithm based on approximate group methods can also be applied in the case when, besides non-linear refraction, diffraction effects are also taken into account. Then, the first equation in (54) should be modified by adding the diffraction term
$$-\beta\, \partial_x \left\{ \frac{x^{1-\nu}}{\sqrt{n}}\, \partial_x \left( x^{\nu-1} \partial_x \sqrt{n} \right) \right\} .$$

^14 One more piece of evidence is provided by the comparison of the approximate and exact BVP solutions for the soliton beam performed in [46].
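The parabolic-beam solution (67) lends itself to a direct check. The following sketch (an illustration, with hypothetical function names and sample values) evaluates the residuals of Eqs. (54) for $\nu = 2$ without the diffraction term, and verifies the relations $J_2 = 1 - J_1$ and $J_3 = 2\alpha J_1$ between the invariants (66), assuming $\rho = 1 - 2\alpha z^2$ and $J_3 = 2\alpha x^2 - v^2\rho + xv\rho_z$.

```python
import math

def parabolic_beam(z, x, alpha):
    """Eq. (67): v and n for the cylindrical parabolic beam, N = 1 - x**2."""
    rho = 1.0 - 2.0 * alpha * z * z
    rho_z = -4.0 * alpha * z
    v = x * rho_z / (2.0 * rho)
    n = (1.0 - x * x / rho) / rho
    return v, n

def residuals(z, x, alpha, nu=2, h=1e-5):
    """Residuals of Eqs. (54) (no diffraction) by central differences; needs x != 0."""
    v0, n0 = parabolic_beam(z, x, alpha)
    (vzp, nzp), (vzm, nzm) = parabolic_beam(z + h, x, alpha), parabolic_beam(z - h, x, alpha)
    (vxp, nxp), (vxm, nxm) = parabolic_beam(z, x + h, alpha), parabolic_beam(z, x - h, alpha)
    v_z, n_z = (vzp - vzm) / (2 * h), (nzp - nzm) / (2 * h)
    v_x, n_x = (vxp - vxm) / (2 * h), (nxp - nxm) / (2 * h)
    r1 = v_z + v0 * v_x - alpha * n_x
    r2 = n_z + n0 * v_x + v0 * n_x + (nu - 1) * n0 * v0 / x
    return r1, r2

def invariants(z, x, alpha):
    """J1, J2, J3 of Eq. (66) evaluated on the solution (67)."""
    v, n = parabolic_beam(z, x, alpha)
    rho = 1.0 - 2.0 * alpha * z * z
    rho_z = -4.0 * alpha * z
    return x * x / rho, n * rho, 2.0 * alpha * x * x - v * v * rho + x * v * rho_z

if __name__ == "__main__":
    alpha, z, x = 0.5, 0.4, 0.3
    print("residuals of (54):", residuals(z, x, alpha))
    J1, J2, J3 = invariants(z, x, alpha)
    print("J2 - (1 - J1):", J2 - (1.0 - J1), " J3 - 2 alpha J1:", J3 - 2.0 * alpha * J1)
    print("z_sing:", 1.0 / math.sqrt(2.0 * alpha))
```

All printed differences should vanish up to rounding and finite-difference error.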
Standard calculations done in compliance with the general scheme for the thus modified RM (for details see [48]) give the RGS generator for the cylindrical beam geometry ($\nu = 2$):
$$R_{\rm Gauss2} = (1 + z^2 S_{\chi\chi})\, \partial_z + (z S_\chi + v z^2 S_{\chi\chi})\, \partial_x + S_\chi\, \partial_v - \left[ nz\left( 1 + \frac{vz}{x} \right) S_{\chi\chi} + \frac{nz}{x} S_\chi \right] \partial_n . \qquad (68)$$
Here the function $S$, defined by the form of the intensity boundary distribution,
$$S(\chi) = \alpha N(\chi) + \beta\, \frac{\partial_\chi \left( \chi\, \partial_\chi \sqrt{N(\chi)} \right)}{\chi \sqrt{N(\chi)}} ,$$
contains two small parameters, $\alpha$ and $\beta$, and, as in the case $\beta = 0$, there exist specific forms of the boundary distribution $N$ for which the RGS operator (68) defines an exact (not approximate) symmetry valid for arbitrary values of $\alpha$ and $\beta$. Constructing a particular BVP solution (step (IV)) implies the use of group invariants related to (68), and the procedure is similar to that for the parabolic beam. For the Gaussian wave beam, $N = \exp(-x^2)$, the result is as follows:
$$v(z, x) = \frac{x - \chi}{z}, \qquad n(z, x) = e^{-\mu^2}\, \frac{\chi\, (\beta - \alpha e^{-\chi^2})}{x\, (\beta - \alpha e^{-\mu^2})} . \qquad (69)$$
Here $\chi$ and $\mu$ are expressed in terms of $z$ and $x$ by the implicit relations
$$\beta(\mu^2 - \chi^2) + \alpha(e^{-\mu^2} - e^{-\chi^2}) = 2 z^2 \chi^2 (\beta - \alpha e^{-\chi^2})^2, \qquad x = \chi\left( 1 + 2 z^2 (\beta - \alpha e^{-\chi^2}) \right) .$$
Solution (69) describes the self-focusing of the cylindrical Gaussian beam that gives rise to the two-dimensional singularity: both the beam intensity $n$ and the derivatives $v_x$, $n_x$ go to infinity at the point $z^{\rm Gauss}_{\rm sing} = 1/\sqrt{2(\alpha - \beta)}$, provided that $\alpha > \beta$. A detailed analysis of (69) and of more general solutions with a parabolic form of the eikonal at $z = 0$, $v(0, x) = -x/T$, is given in [48,45].

To illustrate the difference between the one- and two-dimensional solution singularities, in Fig. 2 we present a typical behavior of the wave beam intensity, defined by Eqs. (65) and (69). The left panel corresponds to the plane beam geometry, $\nu = 1$, without diffraction, $\beta = 0$, while the right one is concerned with a cylindrical wave beam, $\nu = 2$, with both non-linearity and diffraction effects included. The different curves describe the beam intensity distribution over the coordinate $x$ at different distances from the medium boundary, where we have the collimated Gaussian beam, $N = \exp(-x^2)$. It is clear that in the plane geometry the derivative of the beam intensity with respect to $x$ tends to infinity at some singular point, while the value of the intensity on the axis remains finite. In the cylindrical case the solution singularity is two-dimensional: both the beam intensity and its derivative with respect to the transverse coordinate tend to infinity simultaneously at the point $z_{\rm sing}$. This last example demonstrates the ability of the RGS approach to analyze a two-dimensional singularity. In the practice of RG application to critical phenomena, this correlates with the case of "two renormalization groups" [39].
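The singularity positions quoted in this section can be collected in a few lines of code. This is a hedged sketch: it assumes the reconstructed forms $z^{\rm sol}_{\rm sing} = 1/(2\sqrt{\alpha})$, $z^{\rm par}_{\rm sing} = 1/\sqrt{2\alpha}$ and $z^{\rm Gauss}_{\rm sing} = 1/\sqrt{2(\alpha - \beta)}$, together with the relation $x = \chi(1 + 2z^2(\beta - \alpha e^{-\chi^2}))$, and shows that for the cylindrical Gaussian beam the map $\chi \mapsto x$ degenerates on the axis exactly at $z^{\rm Gauss}_{\rm sing}$. Function names and the sample values of $\alpha$, $\beta$ are illustrative.

```python
import math

def z_sing_soliton_plane(alpha):
    """Eq. (62): plane geometry, soliton (and Gaussian, on the axis) profile."""
    return 1.0 / (2.0 * math.sqrt(alpha))

def z_sing_parabolic_cyl(alpha):
    """Cylindrical parabolic beam: rho = 1 - 2 alpha z^2 = 0."""
    return 1.0 / math.sqrt(2.0 * alpha)

def z_sing_gauss_cyl(alpha, beta):
    """Cylindrical Gaussian beam with diffraction; requires alpha > beta."""
    if alpha <= beta:
        raise ValueError("no singularity of this type unless alpha > beta")
    return 1.0 / math.sqrt(2.0 * (alpha - beta))

def dx_dchi(chi, z, alpha, beta):
    """Derivative of x = chi * (1 + 2 z^2 (beta - alpha exp(-chi^2))) w.r.t. chi."""
    e = math.exp(-chi * chi)
    return 1.0 + 2.0 * z * z * (beta - alpha * e) + 4.0 * z * z * alpha * chi * chi * e

if __name__ == "__main__":
    alpha, beta = 0.6, 0.2
    print("plane soliton:       ", z_sing_soliton_plane(alpha))
    print("cylindrical parabola:", z_sing_parabolic_cyl(alpha))
    zg = z_sing_gauss_cyl(alpha, beta)
    print("cylindrical Gaussian:", zg, " dx/dchi on axis:", dx_dchi(0.0, zg, alpha, beta))
```

Diffraction ($\beta > 0$) pushes the Gaussian-beam singularity further from the boundary, and for $\alpha \leq \beta$ the on-axis derivative never vanishes.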
Fig. 3. Early development of concept: from Bogoliubov RG to Wilson RG and FS.
5. Overview

To complete our review, we indicate some milestones in the evolution of the RG concept. Since its appearance in QFT, the RG has served as a powerful tool for analyzing diverse physical problems and improving solution singularities disturbed by a PA. The development of the RG concept can be divided into two stages. The first one (from the mid-1950s up to the mid-1980s) is summarized in Fig. 3. Besides the early history (discovery of the RG, formulation of the RG method and application to UV and IR asymptotics), it comprises the development of the Kadanoff–Wilson RG in the 1970s and the subsequent explosive expansion into other fields of theoretical physics. During this stage, the formulation of the RG method was based on the unified scaling transformation of an independent variable (and/or some parameters) accompanied by a more complicated transformation of a solution characteristic $\bar g$; see Eqs. (10), (13) and (16) in Section 1.2. Here, the main role of RG = FS was the a priori establishment of the fact that the solution under consideration admits functional transformations that form a group. Particular implementations of the RG symmetry differ in the form of the function(s) $\bar g$ (or $\beta(g)$) which, in every particular case, is obtained from some approximate solution.

The next stage, after the mid-1980s, is depicted in Fig. 4. The scheme describes the entire evolution of the Bogoliubov RG. There were several important reasons for further developing the RG concept in theoretical physics in this period. On the one hand, it was due to the extension of the notion of FS and RG symmetry that until then were based on one-parameter Lie groups of point transformations. Appending multi-dimensional Lie point groups and Lie–Bäcklund groups to the possible realizations of the group symmetry enhanced the usefulness of the RG method.
Fig. 4. Evolution of concept: from the Bogoliubov RG via FS to RG-symmetries.
On the other hand, this additional possibility arose due to the mathematical apparatus that was used in mathematical physics to reveal RGS. The advantage came from infinitesimal transformations that enabled one to describe the RGS by an algebra of RG generators. However, in contrast to the situation typical of QFT models with only one operator, in mathematical physics we have finite- or infinite-dimensional algebras. Both their dimension and the method of construction depend upon the model employed and upon the form of the boundary conditions. The use of the infinitesimal approach results in constructing an RG-type symmetry with the help of the regular methods of group analysis of DEs. More precisely, this regular algorithm naturally includes the RG = FS invariance condition in the general scheme of construction and application of the RGS generators (see also our recent review [49]). Within the infinitesimal approach this condition is formulated in terms of the vanishing of the canonical RG operator coordinates, which is especially important for the Lie–Bäcklund RGS because finite transformations in this case are expressed as formal series. In particular, this property attributes a new feature to the RG analysis of a BVP solution with singular behavior, making the singularity analysis more powerful. At the same time, as the group analysis technique is still developing (here we mean both extension to new types of symmetries and application to more complicated mathematical models, e.g., including integro-differential equations), we have a clear perspective that the possibilities of a regular scheme based upon the Bogoliubov RG method are far from being exhausted.

Acknowledgements

The authors are grateful to Professors Chris Stephens and Denjoe O'Connor for the invitation to participate in the Conference "Renormalization Group 2000". They are indebted to these gentlemen for useful discussions and comments. This work was partially supported by grants of the Russian Foundation for Basic Research (RFBR projects Nos 96-15-96030, 99-01-00232 and 99-01-00091) and by INTAS grant No 96-0842, as well as by the Organizing Committee of the above-mentioned meeting.

References

[1] E.E.C. Stueckelberg, A. Petermann, Helv. Phys. Acta 22 (1953) 499–520.
[2] D.V. Shirkov, Historical remarks on the renormalization group, in: L.M. Brown (Ed.), Appendix in the collective monograph Renormalization: From Lorentz to Landau (and Beyond), Springer, New York, 1993, pp. 167–186; D.V. Shirkov, On the early days of renormalization group, in: L. Hoddeson et al. (Eds.), The Rise of the Standard Model, Proceedings of the Third International Symposium on the History of Particle Physics, SLAC, 1992, Cambridge Univ. Press, Cambridge, 1997, pp. 250–258.
[3] M. Gell-Mann, F. Low, Phys. Rev. 95 (1954) 1300–1312.
[4] N.N. Bogoliubov, D.V. Shirkov, Dokl. Akad. Nauk SSSR 103 (1955) 203–206 (in Russian); see also in Nuovo Cimento 3 (1956) 845–863.
[5] N.N. Bogoliubov, D.V. Shirkov, Dokl. AN SSSR 103 (1955) 391–394 (in Russian); see also in Nuovo Cimento 3 (1956) 845–863.
[6] D.V. Shirkov, Dokl. AN SSSR 105 (1955) 972–975 (in Russian); see also in Nuovo Cimento 3 (1956) 845–863.
[7] N.N. Bogoliubov, D.V. Shirkov, Nuovo Cimento 3 (1956) 845–863.
[8] L.V. Ovsyannikov, Dokl. AN SSSR 109 (1956) 1112–1115 (in Russian) (for English translation see in: Yu. Trutnev (Ed.), Intermissions ..., WS, 1998, pp. 76–79); C. Callan, Phys. Rev. D 2 (1970) 1541–1547; K. Symanzik, Comm. Math. Phys. 18 (1970) 227–246.
[9] L.D. Landau et al., Nuovo Cimento 3 (Supp.) (1955) 80–104.
[10] I.F. Ginzburg, Dokl. AN SSSR 110 (1956) 535–538 (in Russian), see Chapter "Renormalization Group" in N. Bogoliubov, D. Shirkov, Introduction to the Theory of Quantized Fields, 1959; Wiley-Interscience, New York, 1980.
[11] N. Bogoliubov, D. Shirkov, Introduction to the Theory of Quantized Fields, 1959; Wiley-Interscience, New York, 1980.
[12] D.V. Shirkov, Theor. Math. Phys. 119 (1999) 55–66; hep-th/9810246.
[13] L. Kadanoff, Physics 2 (1966) 263–272.
[14] K. Wilson, Phys. Rev. B 4 (1971) 3174–3183.
[15] P.G. De Gennes, Phys. Lett. 38A (1972) 339–340; J. des Cloiseaux, J. Phys. (Paris) 36 (1975) 281–292.
[16] V.I. Alkhimov, Theor. Math. Phys. 39 (1979) 422–424; 59 (1984) 591–597.
[17] T.L. Bell et al., Phys. Rev. A 17 (1978) 1049–1057; G.F. Chapline, Phys. Rev. A 21 (1980) 1263–1271.
[18] B.V. Chirikov, Lect. Notes Phys. 179 (1983) 29–46; B.V. Chirikov, D.L. Shepelansky, Chaos Border and Statistical Anomalies, in: D.V. Shirkov, D.I. Kazakov, A.A. Vladimirov (Eds.), Renormalization Group, Proceedings of 1986 Dubna Conference, WS, Singapore, 1988, pp. 221–250; Yu.G. Sinai, K.M. Khanin, Renormalization group method in the theory of dynamical systems, in: D.V. Shirkov, D.I. Kazakov, A.A. Vladimirov (Eds.), Renormalization Group, Proceedings of 1986 Dubna Conference, WS, Singapore, 1988, pp. 251–277; A. Peterman, A. Zichichi, Nuovo Cimento 109A (1996) 341–355.
[19] D.V. Shirkov, Sov. Phys. JETP 9 (1959) 421–424.
[20] C. DeDominicis, P. Martin, Phys. Rev. A 19 (1979) 419–422; L.Ts. Adzhemyan et al., Theor. Math. Phys. 58 (1984) 47–51; 64 (1985) 777–784; A.N. Vasiliev, Quantum Field Renormalization Group in the Theory of Turbulence and in Magnetic Hydrodynamics, in: D.V. Shirkov, D.I. Kazakov, A.A. Vladimirov (Eds.), Renormalization Group, Proceedings of 1986 Dubna Conference, WS, Singapore, 1988, pp. 146–159.
[21] A.N. Vasiliev, Quantum Field Renormalization Group in the Theory of Critical Behavior and Stochastic Dynamics, PINF Publ., St-Petersburg, 1998, 773 pp (in Russian), also Taylor and Francis group PLC, London, in preparation.
[22] D.V. Shirkov, Teor. Mat. Fiz. 60 (1984) 778–782; The RG method and functional self-similarity in physics, in: R.Z. Sagdeev (Ed.), Nonlinear and Turbulent Processes in Physics, Vol. 3, Harwood Acad. Publ., New York, 1984, pp. 1637–1647.
[23] D.V. Shirkov, Renormalization group in modern physics, in: D.V. Shirkov, D.I. Kazakov, A.A. Vladimirov (Eds.), Renormalization Group, Proceedings of 1986 Dubna Conference, WS, Singapore, 1988, pp. 1–32; Int. J. Mod. Phys. A 3 (1988) 1321–1341; Several topics on renorm-group theory, in: D.V. Shirkov, V.B. Priezzhev (Eds.), Renormalization group '91, Proceedings of Second International Conference, September 1991, Dubna, USSR, WS, Singapore, 1992, pp. 1–10; Renormalization group in different fields of theoretical physics, KEK Report 91-13 (February 1992), 85 pp.
[26] M.A. Mnatsakanyan, Sov. Phys. Dokl. 27 (1982) 856–859.
[27] G. Pelletier, Plasma Phys. 24 (1980) 421–443.
[28] Ja.B. Zel'dovich, G.I. Barenblatt, Sov. Phys. Dokl. 3 (1) (1958) 44–47; see also G.I. Barenblatt, Scaling, Self-Similarity, and Intermediate Asymptotics, Cambridge Univ. Press, Cambridge, 1996 (Chapter 5).
[29] V.Z. Blank, V.L. Bonch-Bruevich, D.V. Shirkov, Sov. Phys. JETP 3 (1956) 845–863.
[30] D.V. Shirkov, Sov. Phys. Dokl. 27 (1982) 197–200.
[31] V.F. Kovalev, V.V. Pustovalov, D.V. Shirkov, J. Math. Phys. 39 (1998) 1170–1188; hep-th/9706056.
[32] L.V. Ovsyannikov, Group Analysis of Differential Equations, Academic Press, New York, 1982.
[33] Peter J. Olver, Applications of Lie Groups to Differential Equations, Springer, New York, 1986.
[34] N.H. Ibragimov (Ed.), CRC Handbook of Lie Group Analysis of Differential Equations, 3 Vols., CRC Press, Boca Raton, FL, USA, 1994–1996.
[35] N.H. Ibragimov, Transformation Groups Applied to Mathematical Physics, Reidel Publ., Dordrecht-Lancaster, 1985.
[36] L.I. Sedov, Similarity and Dimensional Analysis, Academic Press, New York, 1959.
[37] G. Birkhoff, Hydrodynamics, A Study in Logic, Fact and Similitude, Princeton Univ. Press, Princeton, 1960.
[38] V.F. Kovalev, V.V. Pustovalov, Lie Groups Appl. 1 (1994) 104–120.
[39] C.R. Stephens, Why two renormalization groups are better than one, in: D.V. Shirkov, D.I. Kazakov, V.B. Priezzhev (Eds.), Renormalization group '96, Proceedings of Third International Conference, August 1996, Dubna, Russia, JINR publ., Dubna, 1997, pp. 392–407; Int. J. Mod. Phys. 12 (1998) 1379–1396.
[41] L.-Y. Chen, N. Goldenfeld, Y. Oono, Phys. Rev. E 54 (1996) 376–394.
[42] J. Bricmont, A. Kupiainen, G. Lin, Comm. Pure Appl. Math. 47 (1994) 893–922.
[43] T. Kunihiro, Progr. Theor. Phys. 94 (1995) 503–514.
[44] V.F. Kovalev, S.V. Krivenko, V.V. Pustovalov, The Renormalization group method based on group analysis, in: D.V. Shirkov, V.B. Priezzhev (Eds.), Renormalization group '91, Proceedings of Second International Conference, September 1991, Dubna, USSR, WS, Singapore, 1992, pp. 300–314.
[45] V.F. Kovalev, V.Yu. Bychenkov, V.T. Tikhonchuk, Phys. Rev. A 61(3) (2000) 0338098(1-10).
[46] V.F. Kovalev, Theor. Math. Phys. 111 (1997) 686–702; V.F. Kovalev, D.V. Shirkov, J. Nonlinear Opt. Phys. Mater. 6 (1997) 443–454.
[47] S.A. Akhmanov, R.V. Khokhlov, A.P. Sukhorukov, Sov. Phys. JETP 23 (1966) 1025–1033.
[48] V.F. Kovalev, Theor. Math. Phys. 119 (1999) 719–730.
[49] V.F. Kovalev, D.V. Shirkov, Theor. Math. Phys. 121 (1999) 1315–1322.
Physics Reports 352 (2001) 251–272

Renormalization group in statistical mechanics and mechanics: gauge symmetries and vanishing beta functions

Giovanni Gallavotti
Dipartimento di Fisica, Università di Roma La Sapienza 1, Piazzale Aldo Moro 2, 00185 Roma, Italy
Received March 2001; editor: I. Procaccia

Contents
1. Introduction
2. Fermi systems in one dimension
3. The conceptual scheme of the renormalization group approach followed above
4. The KAM problem
Acknowledgements
References

Abstract
Two very different problems that can be studied by renormalization group methods are discussed with the aim of showing the conceptual unity that renormalization group has introduced in some areas of theoretical Physics. The two problems are: the ground state theory of a one-dimensional quantum Fermi liquid and the existence of quasi-periodic motions in classical mechanical systems close to integrable ones. I summarize here the main ideas and show that the two treatments, although completely independent of each other, are strikingly similar. © 2001 Elsevier Science B.V. All rights reserved.

PACS: 05.10.Cc

E-mail address: [email protected] (G. Gallavotti).
0370-1573/01/$ - see front matter. PII: S0370-1573(01)00040-0
G. Gallavotti / Physics Reports 352 (2001) 251–272
1. Introduction

There are few cases in which a renormalization group analysis can be performed in full detail and without approximations. The best known case is the hierarchical model theory of Wilson (1970) and Wilson and Kogut (1974). Other examples are the (Euclidean) $\varphi^4$ quantum field theories in two and three space-time dimensions (Wilson, 1973) (for an analysis in the spirit of what follows see Gallavotti (1985) or Benfatto and Gallavotti (1995)), and the universality of critical points (Wilson and Fisher, 1972). In all such examples there is a basic difficulty to overcome: the samples of the fields can be unboundedly large. This does not destroy the method because such large values have extremely small probability (Gallavotti, 1985). The necessity of a different treatment of the large and the small field values hides, to some extent, the intrinsic simplicity and elegance of the approach; unnecessarily so, as the end result is that one can essentially ignore the large field values (to the extent that they are not even mentioned in most application-oriented discussions) and treat the renormalization problem perturbatively, as if the large fields were not possible. Here I shall discuss two (nontrivial) problems in which the large field difficulties are not at all present, and the theory leads to a convergent perturbative solution of the problem (unlike the above-mentioned classical cases, in which the perturbation expansion cannot be analytic in the perturbation parameter). The problems are: (1) the theory of the ground state of a system of (spinless, for simplicity) fermions in one dimension (Berretti et al., 1994; Benfatto and Gallavotti, 1990; Bonetto and Mastropietro, 1995); (2) the theory of KAM tori in classical mechanics (Eliasson, 1996; Gallavotti, 1995; Bonetto et al., 1998; Gallavotti et al., 1995). The two problems will be treated independently, for completeness, although it will appear that they are closely related.
Since the discussion of problem (1) is quite technical we summarize it at the end (in Section 3) in a form that shows the generality of the method, which will then be applied to problem (2) in Section 4. The analysis of the above examples suggests methods to study and solve several problems in the theory of rapidly perturbed quasi periodic unstable motion (Gallavotti, 1995; Gallavotti et al., 1999): but for brevity we shall only refer to the literature for such applications.

2. Fermi systems in one dimension

The Hamiltonian for a system of $N$ spinless fermions at $x_1,\ldots,x_N$ enclosed in a box (actually an interval) of size $V$ is

$$H=\sum_{i=1}^{N}\Big(-\frac{1}{2m}\Delta_{x_i}-\mu\Big)+2\lambda\sum_{i<j}v(x_i-x_j)-\nu N, \tag{2.1}$$

where $\mu$ is the chemical potential, $v$ is a smooth interaction pair potential, $\lambda$ is the strength of the coupling; $\nu$ is a correction to the chemical potential that vanishes for $\lambda=0$ and that has
Fig. 1. The two basic building blocks (“graph elements”) of the Feynman graphs for the description of the ground state: the first represents the potential term ($2\lambda v$ in (2.1)) and the second the chemical potential term ($\nu$).
Fig. 2. Graphical representation of the “external” lines and vertices in Feynman graphs.
to be adjusted as a function of $\lambda$; it is introduced in order that the Fermi momentum stays $\lambda$-independent and equal, therefore, to $p_F=(2m\mu)^{1/2}$. It is in fact convenient to develop the theory at fixed Fermi momentum because the latter has a more direct physical meaning than the chemical potential, as it marks the location of important singularities of the functions that describe the theory. The parameter $m$ is the mass of the particles in absence of interaction.

It is well known (Luttinger and Ward, 1960) that the ground state of the above Hamiltonian is described by the Schwinger functions of a fermionic theory whose fields will be denoted $\psi^{\pm}_{x}$. For instance, the occupation number function $n_k$ which, in absence of interaction, is the simple characteristic function $n_k=1$ if $|k|<p_F$ and $n_k=0$ if $|k|>p_F$ is, in general, the Fourier transform of $S(x,t)$ with $x=(x,t)=(x_1-x_2,\,t_1-t_2)$ and $t=0^+$:

$$S(x)=S(x_1,t_1;x_2,t_2)\Big|_{t_1=t_2^+}=\lim_{\beta\to\infty,\;V\to\infty}\frac{\mathrm{Tr}\,e^{-(\beta-t_1)H}\,\psi^{+}_{x_1,t_1}\,e^{-(t_1-t_2)H}\,\psi^{-}_{x_2,t_2}\,e^{-t_2H}}{\mathrm{Tr}\,e^{-\beta H}}\bigg|_{t_1=t_2^+}. \tag{2.2}$$
Formal perturbation analysis of the 2-point Schwinger function $S(x)$ and of its natural $n$-point extensions $S(x_1,x_2,\ldots,x_n)$ can be done, and the (heuristic) theory is very simple in terms of Feynman graphs. The $n$-point Schwinger function is expressed as a power series in the couplings $\lambda,\nu$, namely $\sum_{p,q=0}^{\infty}\lambda^p\nu^q\,S^{(p,q)}(x_1,x_2,\ldots,x_n)$, with the coefficients $S^{(p,q)}$ computed by considering the (connected) Feynman graphs composed by linking together in all possible ways the following basic “graph elements”: (1) $p$ “internal 4-lines graph elements” (also called “coupling graphs”) and $q$ “internal 2-lines graph elements” (or “chemical potential vertices”) of the form in Fig. 1, where the incoming or outgoing arrows represent $\psi^{-}_{x}$ or $\psi^{+}_{x}$, respectively, and (2) $n$ single lines attached to “external” vertices $x_j$: the first half of them oriented towards the vertex $x$ and the other half oriented away from it (Fig. 2). The graphs are formed by contracting (i.e. joining) together lines with equal orientation. The lines emerging from different nodes are regarded as distinct: we can imagine that each line carries a label distinguishing it from any other, e.g. the lines are thought to be numbered from 1 to 4 or from 1 to 2, depending
Fig. 3. An example of a Feynman graph: in spite of its involved structure it is far simpler than its numerical expression, see (2.5). A systematic consideration of graphs as “short cuts” for formulae permits us to visualize more easily various quantities and makes it possible to recognize cancellations due to symmetries.
on the structure of the graph element to which they belong, so that there are many graphs giving the same contributions. Each graph is assigned a value which is $\pm(p!\,q!)^{-1}\lambda^p\nu^q$ times a product of propagators, one per line. The propagator for a line joining $x_1$ to $x_2$ is, if $x_1=(x_1,t_1)$, $x_2=(x_2,t_2)$:

$$g(x_1-x_2)\;=\;x_1\,\bullet\!\!\longrightarrow\!\!\bullet\,x_2\;=\;\frac{1}{(2\pi)^2}\int\frac{e^{-i(k_0(t_1-t_2)+k(x_1-x_2))}}{-ik_0+(k^2-p_F^2)/2m}\,dk_0\,dk. \tag{2.3}$$

A wavy line, see Fig. 1, joining $x_1$ with $x_2$ is also given a propagator

$$\tilde g(x_2-x_1)=v(x_2-x_1)\,\delta(t_2-t_1), \tag{2.4}$$
associated with the “potential”. However the wavy lines are necessarily internal, as they can only arise from the first graph element in Fig. 1. The $p+q$ internal node labels $(x,t)$ must be integrated over the volume occupied by the system (i.e. the whole space-time when $V,\beta\to\infty$): the result will be called the “integrated value” of the graph or simply, if not ambiguous, the graph value. Since the value of a graph has to be integrated over the labels $x=(x,t)$ of the internal nodes, we shall often consider also the value of a graph $\vartheta$ without the propagators corresponding to the external lines but integrated with respect to the positions of all nodes that are not attached to external lines, and we call it the “kernel” of the graph $\vartheta$: the value of a graph $\vartheta$ will often be denoted by $\mathrm{Val}\,\vartheta$ and the kernel by $K_\vartheta$. Note that the kernel of a graph depends on fewer variables: in particular it depends only on the positions of the internal nodes; it also depends on the labels $\omega$ of the branches external to them through which they are connected to the external vertices. Introducing the notion of kernel is useful because it makes it natural to collect together values of graphs which contain subgraphs with the same number of lines exiting them, i.e. whose kernels have the same number of variables. The function $\lambda^p\nu^q S^{(p,q)}$ is given by the sum of the values of all Feynman graphs with $p$ vertices of the first type in Fig. 1 and $q$ of the second type in Fig. 1 and, of course, $n$ external vertices, integrated over the positions of the internal vertices. As an example consider the contribution to $S^{(4,2)}$ in Fig. 3.
The value of the graph in Fig. 3 can be easily written in formulae: apart from a global sign that has to be computed by a careful examination of the order in which the $x_j$-labels are written, it is

$$\begin{aligned}
S^{(4,2)}_{\vartheta}(x_1,x_2,x_8,x_9)=\pm\frac{1}{4!\,2!}\int\; & g(x_{10}-x_2)\,g(x_3-x_1)\,g(x_3-x_{10})\\
&\times g(x_4-x_3)\,g(x_5-x_4)\,g(x_4-x_5)\,g(x_5-x_4)\,g(x_4-x_5)\,g(x_6-x_5)\\
&\times g(x_4-x_5)\,g(x_5-x_4)\,g(x_7-x_6)\,g(x_9-x_7)\,g(x_8-x_7)\,\delta(t_4-t_4')\\
&\times \delta(t_5-t_5')\,v(x_3-x_3')\,\delta(t_3-t_3')\,v(x_7-x_7')\,\delta(t_7-t_7')\\
&\times v(x_4-x_4')\,v(x_5-x_5')\;dx_3\,dx_3'\,dx_4\,dx_4'\,dx_5\,dx_5'\,dx_6\,dx_7\,dx_7'\,dx_{10},
\end{aligned} \tag{2.5}$$
which is easily derived from the figure. One hardly sees how this formula could be useful, particularly if one thinks that this is but one of a large number of possibilities that arise in evaluating $S$: not to mention what we shall get when looking at higher orders, i.e. at $S^{(p,q)}$ when $p$ is a bit larger than 2. Many (in fact most) of the integrals over the node variables $x_v$ will, however, diverge. This is a “trivial” divergence due to the fact that the interaction tends to change the value of the chemical potential. The chemical potential is related (or can be related) to the Fermi field propagator singularities, and the chemical potential is changed (or may be changed) by the interaction: the divergences are due to the naiveté of the attempt at expanding the functions $S$ in a power series involving functions with singularities located “at the wrong places”. The divergences disappear if the (so far free) parameter $\nu$ is chosen to depend on $\lambda$ as

$$\nu=\sum_{k=1}^{\infty}\nu_k\,\lambda^k \tag{2.6}$$

with the coefficients $\nu_k$ suitably defined so that the resulting power series in the single parameter $\lambda$ has coefficients free of divergences (Luttinger and Ward, 1960). This leads to a power series in just one parameter, and the “only” problem left is therefore that of the convergence of the expansion of the Schwinger functions in powers of $\lambda$. This is nontrivial because naive estimates of the sum of all graphs contributing to a given order $p$ yield bounds that grow like $p!$, thus giving a vanishing estimate for the radius of convergence. The idea is that there are cancellations between the values of the various graphs contributing to a given order in the power series for the Schwinger functions, and that such cancellations can be best exhibited by further breaking up the values of the graphs and by again combining them conveniently. The “renormalization group method” can be seen in different ways: here I am proposing to see it as a resummation method for (possibly divergent) power series.
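The effect of a $p!$ growth of the bounds on the estimated radius of convergence can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: the bounds `factorial_bound` and `exponential_bound`, and the constant `c`, are hypothetical stand-ins and are not taken from the estimates of the theory. It applies the root test at finite order to a bound of the form $c^p\,p!$ and to a bound of the form $c^p$ (the kind of bound one hopes to reach after exploiting cancellations).

```python
import math


def root_test_radius(bound, order):
    """Finite-order root-test estimate of the radius: bound(order) ** (-1/order)."""
    return bound(order) ** (-1.0 / order)


def factorial_bound(p, c=1.0):
    # Hypothetical naive bound on the order-p coefficient: c^p * p!
    return (c ** p) * math.factorial(p)


def exponential_bound(p, c=1.0):
    # Hypothetical improved bound on the order-p coefficient: c^p
    return c ** p


if __name__ == "__main__":
    for p in (5, 10, 20, 40, 80):
        print(p,
              root_test_radius(factorial_bound, p),
              root_test_radius(exponential_bound, p))
```

With the factorial bound the estimate decreases towards zero as the order grows, while with the purely exponential bound it stays equal to $1/c$: this is the sense in which a $p!$ bound gives no information on convergence.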
Fig. 4. A (non-smooth) scaling (by a factor of 2) decomposition of unity.
Keeping the original power series in $\lambda,\nu$, i.e. postponing the choice of $\nu$ as a function of $\lambda$, one checks the elementary fact that the propagator $g(x)$ can be written, setting $k=(k_0,k)\in R^2$, also as

$$g(x)=\sum_{h=-\infty}^{1}\;\sum_{\omega=\pm1}e^{i\omega p_F x}\,2^h\,g^{(h)}_{\omega}(2^h p_F x),\qquad \hat g^{(h)}_{\omega}(k)=\frac{\chi^{(h)}(k)}{-ik_0+\omega k}+\text{“negligible corrections”}, \tag{2.7}$$

where $\chi^{(1)}(k)$ is a function increasing from 0 to 1 between $\tfrac12 p_F$ and $p_F$, while the functions $\chi^{(h)}(k)$ are the same function scaled to have support in $2^{h-2}p_F<|k|<2^h p_F$. This means that for $h\le 0$ it is $\chi^{(h)}(k)=\chi(2^{-h}k\,p_F^{-1})$. The simplest choice is to take $\chi^{(1)}(k)$ to be the characteristic function of $z\equiv p_F^{-1}|k|>1$ and $\chi(z)$ to be the characteristic function of the interval $[\tfrac12,1]$ (Fig. 4), so that

$$\sum_{h=-\infty}^{1}\chi^{(h)}(k)\equiv 1. \tag{2.8}$$
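The sharp decomposition of unity (2.8) can be checked numerically. The Python sketch below is an illustration only: the value `P_F = 1.0`, the function names and the cutoff `h_min` are assumptions made here, and the interval $[\tfrac12,1]$ is taken half-open, $(\tfrac12,1]$, so that the dyadic points $|k|=2^h p_F$ are not counted twice (a choice that affects a set of measure zero).

```python
P_F = 1.0  # Fermi momentum in arbitrary units (illustrative value)


def chi(z):
    """Characteristic function of (1/2, 1]; half-open by assumption."""
    return 1.0 if 0.5 < z <= 1.0 else 0.0


def chi_h(h, k):
    """Sharp cutoff on scale h: ultraviolet part for h = 1, rescaled chi for h <= 0."""
    z = abs(k) / P_F
    if h == 1:
        return 1.0 if z > 1.0 else 0.0
    return chi(z * 2.0 ** (-h))


def partition_sum(k, h_min=-60):
    """Sum of chi^(h)(k) over h_min <= h <= 1 (the sum in (2.8), truncated at h_min)."""
    return sum(chi_h(h, k) for h in range(h_min, 2))


if __name__ == "__main__":
    for k in (1.7, 1.0, 0.3, -0.02, 1e-6):
        scales = [h for h in range(-60, 2) if chi_h(h, k) == 1.0]
        print(k, partition_sum(k), scales)
```

Each nonzero momentum in the sampled range belongs to exactly one scale, so the sum equals 1; the point $k=0$ is reached only in the limit $h\to-\infty$ and is therefore excluded by any finite cutoff `h_min`.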
To avoid technical problems it would be convenient to smooth the discontinuities in Fig. 4 of $\chi$ and $\chi^{(1)}$, turning them into $C^\infty$-functions which in a small vicinity of the jump increase from 0 to 1 or decrease from 1 to 0; this is possible while still keeping the scaling decomposition (2.8) (i.e. with $\chi^{(h)}(k)\equiv\chi(2^{-h}k\,p_F^{-1})$ for $h\le0$). However, the formalism that this smoothing would require is rather heavy and hides the structure of the approach; therefore we shall continue with the decomposition of unity in (2.8) with the sharply discontinuous functions in Fig. 4, warning the reader (cf. footnote 2) when this should cause a problem. The “negligible terms” in (2.7) are terms of a similar form but which are smaller by a factor $2^h$ at least: their presence does not alter the analysis other than notationally. We shall henceforth set them equal to 0 because taking them into account only introduces notational complications. The above is an infrared scale decomposition of the propagator $g(x)$: in fact the propagator $g^{(h)}$ contains only momenta $k$ of $O(2^h p_F)$ for $h\le 0$ while the propagator $g^{(1)}$ contains all (and only) large momenta (i.e. the ultraviolet part of the propagator $g(x)$). The representation (2.7) is called a quasi particles representation of the propagator and the quantities $\omega p_F$
are called quasi particle momenta. The function $g^{(h)}_{\omega}$ is the “quasi particle propagator on scale $h$”. After extracting the exponentials $e^{i\omega p_F x}$ from the propagators, the Fourier transforms $\hat g^{(h)}_{\omega}(k)$ of $g^{(h)}_{\omega}(x)$ will no longer be oscillating on the scale of $p_F$ and the variable $k$ will have the interpretation of “momentum measured from the Fermi surface”. The mentioned divergences are still present because we do not yet relate $\lambda$ and $\nu$: they will be eliminated temporarily by introducing an infrared cut-off, i.e. by truncating the sum in (2.7) to $h\ge -R$. We then proceed keeping in mind that we must get results which are uniform as $R\to\infty$: this will eventually be possible only if $\nu$ is suitably fixed as a function of $\lambda$. We write $g(x)=Z_1^{-1}g^{(1)}(x)+Z_1^{-1}g^{(\le 0)}(x)$ with $Z_1\overset{\rm def}{=}1$ and $g^{(\le m)}$ being defined in general, see (2.7), as

$$g^{(\le m)}(x)=\sum_{h=-R}^{m}\;\sum_{\omega=\pm1}2^h\,e^{i\omega p_F x}\,g^{(h)}_{\omega}(2^h p_F x),\qquad m\le 0. \tag{2.9}$$
Each graph can now be decomposed as a sum of graphs, each of which has internal lines carrying extra labels “1” and “$\omega$” or “$\le 0$” and “$\omega$” (signifying that the value of the graph has to be computed by using the propagator $Z_1^{-1}g^{(1)}_{\omega}(x-x')$ or $Z_1^{-1}g^{(\le0)}_{\omega}(x-x')$ for the line in question, if it goes from $x'$ to $x$). We now define clusters of scale 1: a “cluster” on scale 1 will be any set $C$ of vertices connected by lines bearing the scale label 1 and which is maximal in size (i.e. it is not part of a larger cluster of the same type). Wavy lines are regarded as bearing a scale label 1. The graph is thus decomposed into smaller graphs formed by the clusters and connected by lines of scale $\le 0$: it is convenient to visualize the clusters as enclosed into contours that include the vertices of each cluster as well as all the lines that connect two vertices of the same cluster. The latter can be naturally called lines internal to the cluster $C$. The integrated value of a graph will be represented, up to a sign which can be determined as described above, as a sum over the quasi particle labels $\omega$ of the cluster lines and as an integral over the locations of the inner vertices of the various clusters. The integrand is a product between (a) the kernels $K_{C_i}$ associated with the clusters $C_i$ and depending only on the locations of the vertices inside the cluster $C_i$ which are extremes of lines external to the cluster and on the quasi particle labels $\omega$ of the lines that emerge from it,¹ and (b) the propagators $Z_1^{-1}g^{(\le0)}_{\omega}$ corresponding to the lines that are external to the clusters (in the sense that they have at least one vertex not inside the cluster). We now look at the clusters $C$ that have just $|C|=2$ or $|C|=4$ external lines and that are therefore associated with kernels $K_C(\{x_j,\omega_j\}_{j=1,2};C)$ or $K_C(\{x_i,\omega_i\}_{i=1,\ldots,4};C)$. Such kernels, by

¹ By definition the kernel $K_C$ also involves integration over the locations of its inner vertices and the sum over the quasi particle momenta of the inner propagators.
the structure of the propagators, see (2.7) and (2.9), will have the form

$$K_C=e^{ip_F\left(\sum_j\omega_j x_j\right)}\;\bar K_C(\{x_j,\omega_j\}_{j=1,2,\ldots,|C|}), \tag{2.10}$$

where $x_j$ are the vertices of the cluster $C$ to which the entering and exiting lines are attached; the cluster may contain more vertices than just the ones to which the external lines are attached: the positions of such “extra” vertices must be considered as integration variables (and as integrated), and a sum is understood to act over all the quasi particle labels of the internal lines (consistent with the values $\omega_j$ of the external lines). If $|C|=2,4$ we write the Fourier transform at $k=(k_0,k)$ of the kernels $\bar K_C(\ldots)$:

$$\begin{aligned}
&Z_1\,2^{-1}\nu^{(1)}_C\,\delta_{\omega_1,\omega_2}+Z_1\big(-ik_0\,\zeta^{(1)}_C+\omega k\,\alpha^{(1)}_C\big)\,\delta_{\omega_1,\omega_2}+\text{“remainder”},\\
&Z_1^2\,\lambda^{(1)}_C\,\delta_{\omega_1+\omega_2+\omega_3+\omega_4=0}+\text{“remainder”},
\end{aligned} \tag{2.11}$$
where the first equation ($|C|=2$) is a function of one $k$ only while the second equation ($|C|=4$) depends on four momenta $k$: one says that the remainders are obtained by “subtracting from the kernels their values at the Fermi surface” or by collecting terms that do not conserve the quasi particle momenta (like terms with $\delta_{\omega_1,-\omega_2}$ in the first equation or with $\omega_1+\cdots+\omega_4\ne 0$ in the second). The remainder contains various terms which do not have the form of the terms explicitly written in (2.11): a form which could be as simple as $\omega\cdot k\,\delta_{\omega_1,-\omega_2}$ but that will in general be far more involved. In evaluating graphs we imagine them, as described, as made of clusters, and the graph value is obtained by integrating the product of the values of the kernels associated with the graph times the product of the propagators of the lines that connect different clusters. Furthermore we imagine attaching to each cluster with 2 external lines a label indicating that it contributes to the graph value with the first term in the decomposition in (2.11) only (which is the term proportional to $\nu^{(1)}$), or with the second term (which is proportional to $-ik_0\,\zeta^{(1)}+\omega k\,\alpha^{(1)}$), or with the remainder. This is easily taken into account by attaching to the cluster an extra label 1, 2 or r. Likewise, we imagine attaching to each cluster with 4 external lines a label indicating that it contributes to the value with the first term in the decomposition in (2.11) only (which is the term proportional to $\lambda^{(1)}$), or with the remainder. This is again easily taken into account by attaching to the cluster an extra label 1, r. The label r stands for “remainder term” or “irrelevant term”; however, as usual in the renormalization group nomenclature, irrelevant does not mean negligible (on the contrary, such terms are in a way the most important ones). The next idea is to collect together all graphs with the same cluster structure, i.e. which become identical once the clusters with 2 or 4 external lines are “shrunk” to points. Since the internal structure of such graphs is different this means that we are collecting together graphs of different perturbative order. In this way, we obtain a representation of the Schwinger functions that is no longer a power series representation, and evaluation rules for graphs in which single vertex subgraphs (or single node subgraphs) with 2 or 4 external lines have a new meaning. Namely a four external
lines vertex will mean a quantity $Z_1^2\lambda'$ equal to the sum of the values $Z_1^2\lambda^{(1)}_C$ of all the clusters $C$ with 4 external lines and with label 1. The 2 external lines nodes will mean

$$e^{i(\omega_1x_1-\omega_2x_2)p_F}\,\delta_{\omega_1,\omega_2}\,Z_1\,(\nu'+\zeta'\,\partial_t-i\alpha'\,\omega\,\partial_x)\,\delta(x_1-x_2), \tag{2.12}$$

where again $\nu'$ or $\zeta',\alpha'$ are the sums of the contributions from all the graphs with 2 external lines and with label 0 or 1, respectively. It is convenient to define $\delta'=\zeta'-\alpha'$ and to rewrite the 2 external lines nodes contributions to the product generating the value of a graph simply as

$$e^{i(\omega_1x_1-\omega_2x_2)p_F}\,\delta_{\omega_1,\omega_2}\,Z_1\,\big(2\nu'+\delta'\,\partial_t+\alpha'\,(\partial_t-i\omega\cdot\partial_x)\big)\,\delta(x_1-x_2). \tag{2.13}$$
One then notes that this can be represented graphically by saying that the 2 external lines nodes in graphs which do not carry the label r can contribute in 3 different ways to the product determining the graph value. The 3 ways can be distinguished by a label 0, 1 and z corresponding to the three addends in (2.13). Any graph without z-type nodes can be turned into a graph which contains an arbitrary number of them, on each line connecting the clusters. This amounts to saying that we can compute the series by imposing that there is not even a single vertex with two external lines and with label z, simply by modifying the propagators of the lines connecting the clusters: changing them from $Z_1^{-1}g^{(\le0)}$ to $Z_0^{-1}g^{(\le0)}$ with

$$Z_0=Z_1(1+\alpha'). \tag{2.14}$$
This can be seen either elementarily, by remarking that adding values of graphs which contain chains of nodes with label z amounts to summing a geometric series (i.e. precisely the series $\sum_{k=0}^{\infty}(-1)^k(\alpha')^k=(1+\alpha')^{-1}$), or, much more easily, by recalling that the graphs are generated by a formal functional integral over Grassmannian variables and checking (2.14) from this remark without any real calculation, see Benfatto and Gallavotti (1995). In the first approach care is needed to get the correct relation (2.14) and it is wise to check it first in a few simple cases (starting with the “linear” graphs which only contain nodes with one entering line and one exiting line, see Fig. 1: the risk is to get $Z_0=Z_1(1-\alpha')$ instead of (2.14)).²
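The geometric series argument can be checked on numbers. The Python sketch below is an illustration under an explicit assumption: each node of type z inserted on a line is taken to multiply the line value by a factor $-\alpha'$ relative to the propagator $Z_1^{-1}g^{(\le0)}$, which is the reading that reproduces (2.14). The names `chain_sum`, `Z1`, `alpha` and the value `alpha = 0.3` are hypothetical choices made for the example.

```python
def chain_sum(alpha, n_terms):
    """Partial sum over chains with 0, 1, ..., n_terms - 1 nodes of type z.

    Assumption: a chain with k such nodes contributes (-alpha)**k.
    """
    return sum((-alpha) ** k for k in range(n_terms))


if __name__ == "__main__":
    Z1 = 1.0
    alpha = 0.3                  # illustrative value, |alpha| < 1
    Z0 = Z1 * (1.0 + alpha)      # relation (2.14)
    wrong_Z0 = Z1 * (1.0 - alpha)  # the sign error warned against in the text
    for n in (1, 2, 5, 10, 40):
        print(n, chain_sum(alpha, n) / Z1, 1.0 / Z0, 1.0 / wrong_Z0)
```

The partial sums divided by $Z_1$ approach $1/Z_0$ with $Z_0=Z_1(1+\alpha')$ and stay well away from $1/(Z_1(1-\alpha'))$, which makes the sign check suggested in the text easy to perform.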
² It is at this point that using the sharply discontinuous $\chi$-functions would cause a problem. In fact if one uses the smooth decomposition, (2.14) is no longer correct: namely it would become $Z_0=Z_1(1+\chi^{(0)}(k)\,\alpha')$, with the consequence that $Z_0$ would no longer be a constant. At this point there are two possible ways out: the first is to live with a $Z_0$ which depends on $k$ and with the fact that the quantities $Z_j$, $j\le-1$, introduced below will also be $k$-dependent; this is possible but it is perhaps too different from what one is used to in the phenomenological renormalization group approaches, in which quantities like $Z_j$ are usually constants. The other possibility is to modify the propagator on scale 0 from $Z_0^{-1}g^{(\le0)}(k)$ to $g^{(\le0)}(k)\,(1+\alpha')/(1+\chi^{(0)}(k)\,\alpha')$. The second choice implies that $g^{(\le h)}$ will no longer be exactly $\chi^{(h)}(k)/(-ik_0+\omega\cdot k)$ but it will be gradually modified as $h$ decreases, and the modification has to be computed step by step. This is also unusual in the phenomenological renormalization group approaches: the reason being simply that in such approaches the decomposition with sharp discontinuities is always used. The latter is not really convenient if one wants to make estimates of large order graphs. Here this will not be a problem for us because we shall not do the technical work of deriving estimates. In Benfatto and Gallavotti (1990) as well as in Berretti et al. (1994) the second choice has been adopted.
Correspondingly we set:

$$\nu^{(0)}=2\,\frac{Z_0}{Z_1}\,\nu',\qquad \delta^{(0)}=\frac{Z_0}{Z_1}\,\delta',\qquad \lambda^{(0)}=\frac{Z_0^2}{Z_1^2}\,\lambda'. \tag{2.15}$$
We can now iterate the analysis: we imagine writing the propagators of the lines connecting the clusters so far considered, and that we shall call clusters of scale 1, as

$$\frac{1}{Z_0}\,g^{(\le0)}=\frac{1}{Z_0}\,g^{(0)}+\frac{1}{Z_0}\,g^{(\le-1)} \tag{2.16}$$

and proceed to decompose all the propagators of lines outside the clusters of scale 1 into propagators of scale 0 or of scale $\le-1$. In this way, imagining all clusters of scale 0 as points, we build a new level of clusters (whose vertices are either vertices or clusters of scale 0): they consist of maximal sets of clusters of scale 1 connected via paths of lines of scale 0. Proceeding in the same way as in the above “step 1” we represent the Schwinger functions as sums of graph values of graphs built with clusters of scale 0 connected by lines with propagators on scale $\le-1$ given by $Z_0^{-1}g^{(\le-1)}$ and with the clusters carrying labels 1, 2 or r. Again we rearrange the 2-external lines clusters with labels 1, 2 as in (2.13), introducing the parameters $\nu',\delta',\alpha',\lambda'$ and graphs with nodes of type z, by defining $\alpha'$ in an analogous way as the previous quantity with the same name (relative to the scale 1 analysis). The one vertex nodes of such graphs with 2 or 4 external lines of scale $\le-1$ will contribute to the product defining the graph value a factor $Z_{-1}\,2\nu^{(-1)}$ or $Z_{-1}\,\delta^{(-1)}$ or $Z_{-1}^2\,\lambda^{(-1)}$ (while the propagators in the clusters of scale 1 and the 2 or 4 nodes with two lines of scale 0 emerging from them retain the previous meaning). Again one sets

$$Z_{-1}=Z_0(1+\alpha') \tag{2.17}$$

and correspondingly we set

$$\nu^{(-1)}=2\,\frac{Z_{-1}}{Z_0}\,\nu',\qquad \delta^{(-1)}=\frac{Z_{-1}}{Z_0}\,\delta',\qquad \lambda^{(-1)}=\frac{Z_{-1}^2}{Z_0^2}\,\lambda' \tag{2.18}$$
and now we shall only have graphs with 2 or 4 external lines clusters which carry a label 0, 1 or r, as in the previous analysis of the scale 1, and the propagators connecting clusters of scale 0 changed from $Z_0^{-1}g^{(\le-1)}$ to $Z_{-1}^{-1}g^{(\le-1)}$. Having completed the step 0 we then “proceed in the same way” and perform “step −1” and so on. One can wonder why the choice of the scaling factor 2 in (2.13) and (2.18) multiplying the ratio of the renormalization factors in the definition of the new $\nu_j$ or, for that matter, why the choice of 1 for the definition of the new $\delta_j,\lambda_j$: these are dimensional factors that come out naturally, and any attempt at modifying the above choices leads to a beta function, defined below, which is not uniformly bounded as we remove the infrared cut-off. In other words: different scalings can be considered but there is only one which is useful. It could also be found by using an arbitrary scaling and then looking for the one for which the estimates needed to get a convergent expansion can be made. The conclusion is a complete rearrangement of the perturbation expansion, which is now expressed in terms of graphs which bear various labels and, most important, contain propagators
that bear a scale index which gives us information on the scale on which they are sizably different from 0. The procedure, apart from convergence problems, leads us to define recursively a sequence $\lambda^{(j)},\delta^{(j)},\nu^{(j)},Z_j$ of constants, each of which is a sum of a formal power series involving values of graphs with 2 or 4 external lines. The quantities $g_j=(\lambda^{(j)},\delta^{(j)},\nu^{(j)})$ can be called the running coupling constants while $Z_j$ can be called the running wave function renormalization constants: here $j=1,0,-1,-2,\ldots$. Of course all the above is nothing but algebra, made simple by the graphical representation of the objects that we wish to compute. The reason why it is of any interest is that, since the construction is recursive, one derives expressions of the $g_j,Z_j$ in terms of the $g_n,Z_n$ with $n>j$:

$$\frac{Z_{j+1}}{Z_j}=1+B_j(g_{j+1},\ldots,g_0),\qquad g_j=\Lambda_j\,g_{j+1}+\mathbf{C}_j(g_{j+1},\ldots,g_0), \tag{2.19}$$

where $\Lambda_j$ is the matrix

$$\Lambda_j=\begin{pmatrix}(Z_{j+1}/Z_j)^2&0&0\\ 0&(Z_{j+1}/Z_j)&0\\ 0&0&2\,(Z_{j+1}/Z_j)\end{pmatrix} \tag{2.20}$$

and the functions $B_j,\mathbf{C}_j$ are given by power series, so far formal, in the running couplings. The expression of $Z_{j+1}/Z_j$ can be used to eliminate such ratios in the second relation of (2.19), which therefore becomes

$$\frac{Z_{j+1}}{Z_j}=1+B_j(g_{j+1},\ldots,g_0),\qquad g_j=\Lambda\,g_{j+1}+\mathbf{B}_j(g_{j+1},\ldots,g_0), \tag{2.21}$$
where $\Lambda$ is the diagonal matrix with diagonal $(1,1,2)$. The scalar functions $B_j$ and the three components vector functions $\mathbf{B}_j=(B_{j,1},B_{j,2},B_{j,3})$ are called the beta functional of the problem. There are two key points, which are nontrivial at least if compared to the above simple algebra and which we state as propositions.

Proposition 1 (regularity and boundedness of the beta function). Suppose that there is $\varepsilon>0$ such that $|g_j|<\varepsilon$, $|Z_j/Z_{j-1}-1|<\varepsilon$ for all $j\le1$; then if $\varepsilon$ is small enough the power series defining the beta functionals converge. Furthermore the functions $B_j,\mathbf{B}_j$ are uniformly bounded and have a dependence on the arguments with label $j+n$ exponentially decaying as $n$ grows; namely there exist constants $D,\kappa$ such that if $G=(g_{j+1},\ldots,g_0)$ and $G'=(g'_{j+1},\ldots,g'_0)$, with $G$ and $G'$ differing only by the $(j+n)$th “component”, $d=g'_{j+n}-g_{j+n}\ne0$, then for all $j\le0$ and all $n>0$

$$|B_j(G)|,\;|\mathbf{B}_j(G)|\le D\varepsilon^2,\qquad |B_j(G')-B_j(G)|,\;|\mathbf{B}_j(G')-\mathbf{B}_j(G)|\le D\,e^{-\kappa n}\,\varepsilon\,|d|, \tag{2.22}$$
if $\varepsilon$ is small enough: i.e. the “memory” of the “beta functionals” $B_j,\mathbf{B}_j$ is short ranged. The Schwinger functions are expressed as convergent power series in $g_j$ in the same domain $|g_j|<\varepsilon$.

The difficult part of the proof of the above proposition is to get the convergence of the series under the hypotheses $|g_j|<\varepsilon$, $|Z_j/Z_{j-1}-1|<\varepsilon$ for all $j$: this is possible because the system is a fermionic system and one can collect the contributions of all graphs of a given order $k$ into a few, i.e. not more than an exponential in $k$, groups each of which gives a contribution that is expressed as a determinant which can be estimated, by making use of the Gram–Hadamard inequality, without really expanding it into products of matrix elements (which would lead to bounding the order $k$ by a quantity growing with $k!$). Thus the $k!^{-1}$ that is in the definition of the values compensates the number of labels that one can put on the trees, and the number of Feynman graphs, which is also of order $k!$, is controlled by their representability as determinants that can be bounded without generating a $k!$ via the Hadamard inequality. The basic technique for achieving these bounds is well established after the work (Lesniewski, 1987). A second nontrivial result is

Proposition 2 (short range and asymptotics of the beta function). Let $G^0=(g,g,\ldots,g)$ with $g=(\lambda,\delta,\nu)$; then the function $\mathbf{B}_j(G^0)$ defines an analytic function of $g$, that we shall call “beta function”, by setting

$$\beta(g)=\lim_{j\to-\infty}\mathbf{B}_j(G^0) \tag{2.23}$$

for $|g|<\varepsilon$. The limit is reached exponentially: $|\beta(g)-\mathbf{B}_j(G^0)|<\varepsilon^2D\,e^{-\kappa|j|}$, for some $\kappa>0$, $D>0$, provided $|g|<\varepsilon$.

Finally the key result (Benfatto and Gallavotti, 1990; Berretti et al., 1994) is

Proposition 3 (vanishing of the beta function). If $g=(\lambda,\delta,0)$ then the function $\beta(g)=0$ provided $|g|<\varepsilon$. Furthermore for some $D,\kappa>0$ it is, for all $j\le0$,

$$\begin{aligned}
&B_{j3}(g_{j+1},\ldots,g_0)=\nu_{j+1}\,\lambda_{j+1}^2\,B'_{j3}(g_{j+1},\ldots,g_0)+e^{\kappa j}\,B''_{j3}(g_{j+1},\ldots,g_0),\\
&|B'_{j3}(g_{j+1},\ldots,g_0)|<D,\qquad |B''_{j3}(g_{j+1},\ldots,g_0)|<D\varepsilon^2,
\end{aligned} \tag{2.24}$$

provided, for $h=0,\ldots,j+1$, $|g_h|<\varepsilon$.

The above propositions are proved in Benfatto and Gallavotti (1990), Berretti et al. (1994) and Benfatto and Mastropietro (2000). The vanishing of $\beta(\lambda,\delta,0)$ is proved in a rather indirect way. We proved that the function $\beta$ is the same for model (2.1) and for a similar model, the Luttinger model, which is exactly soluble but which can also be studied with the technique described above: and the only way the exactly soluble model results could hold is to have $\beta=0$. The vanishing of the beta function seems to be a kind of Ward identity: it is easy to prove it directly if one is willing to accept a formal proof. This was pointed out, after the work
(Benfatto and Gallavotti, 1990), in other papers and it was believed to be true probably much earlier in some equivalent form, see Solyom (1979); note that the notion of the beta function is intrinsic to the formalism of the renormalization group and therefore a precise conjecture on it could not even be stated before the 1970s; but of course the existence and importance of infinitely many identities had already been noted.

Given the above propositions one shows that “things go as if” the recursion relation for the running couplings was, up to exponentially small corrections, a simple memoryless evolution $g_{j-1}=\Lambda g_j+\beta(g_j)+O(e^{-\kappa|j|})$: the propositions say in a precise way that this is asymptotically, as $j\to-\infty$, true. This tells us that the running couplings $\lambda_j,\delta_j$ stay constant (because $\beta_1,\beta_2$ vanish): however they in fact tend to a limit as $j\to-\infty$ exponentially fast because of the corrections in the above propositions, provided we can guarantee that also $\nu_j\to0$ exponentially fast as $j\to-\infty$ and that the limits of $\lambda_j,\delta_j,\nu_j$ do not exceed $\varepsilon$ (so that the beta functionals and the beta function still make sense).

It is now important to recall that we can adjust the initial value of the chemical potential.³ This freedom corresponds to the possibility of changing the chemical potential “correction” $\nu$ in (2.1) and tuning its value so that $\nu_h\to0$ as $h\to-\infty$. Informally, if $\nu_0$ is chosen “too positive” then $\nu_j$ will grow (exponentially) in the positive direction (becoming larger than $\varepsilon$, a value beyond which the series that we are using become meaningless); if $\nu_0$ is chosen “too negative” then $\nu_j$ also will grow (exponentially) in the negative direction: so there is a unique choice such that $\nu_j$ can stay small (and, actually, it can be shown to converge to 0).⁴ The vanishing of the beta function gives us the existence of a sequence of running couplings $g_j=(\lambda_j,\delta_j,\nu_j)$ which converge exponentially fast to $(\lambda_{-\infty},\delta_{-\infty},0)$ as $j\to-\infty$ if $\nu_0$ is conveniently chosen: and one can prove that $\lambda_{-\infty},\delta_{-\infty},\nu_0$ are analytic in $\lambda$ for $\lambda$ small enough (Berretti et al., 1994). In this way one gets a convergent expansion of the Schwinger functions, which leads to an essentially complete theory of the one dimensional Fermi gas with spin zero and short range interaction.

3. The conceptual scheme of the renormalization group approach followed above

The above schematic exposition of the method is a typical example of how one tries to apply the multiscale analysis that is commonly called a “renormalization group approach”:

(1) one has series that are easily shown to be finite order by order, possibly provided that some free parameters are suitably chosen (“formal renormalizability theory”: this is the proof
³ Which is a “relevant operator”, in the sense that if regarded as a running coupling it is roughly multiplied by 2 at each change of scale, i.e. $\nu_{j-1}\sim2\nu_j$.

⁴ A simplified analysis is obtained by “neglecting memory corrections”, i.e. using as a recursion relation $g_j=\Lambda g_{j+1}+\beta(g_{j+1})$ with $\beta(g)$ verifying (2.24): this gives that $\lambda_j,\delta_j\to(\lambda_{-\infty},\delta_{-\infty})$ exponentially fast as $j\to-\infty$ and $\nu_j\to0$ exponentially fast as $j\to-\infty$, provided $\nu_0$ is suitably chosen in terms of $\lambda_0,\delta_0$: otherwise everything diverges.
in Luttinger and Ward (1960)) that if h in (2.2) are suitably chosen we obtain a well-defined perturbation series in powers of ε. (2) However the series, even when finite term by term, come with poor bounds which grow at order k as k! and which, nevertheless, are often nontrivial to obtain (although this is not so in case (2.1) discussed here, unlike the case discussed in the next sections). (3) One then tries to reorganize the series by leaving the original parameters ( ; in the present case, as is fixed) as independent parameters and collecting terms together, the aim being to show that they become very convergent power series in a sequence of new parameters, the “running couplings” (h) ; (h) and (h) in the present case, under the assumption that such parameters are small (they are functions, possibly singular, of the initial parameters of the theory, ; in case (2.1), as has to be imagined to be 0). (4) The running couplings, essentially by construction, also verify a recursion relation that makes sense again under the assumption that the parameters are small. This relation allows us to express (if it makes sense) successively the running couplings in terms of the preceding ones: the running couplings are ordered into a sequence by “scale labels” h = 1, 0, −2, … The recursion relation is interpreted as an evolution equation for a dynamical system (a map defined by the beta function(al)): it generates a “renormalization group trajectory” (the sequence ( h ; h ; h ) out of the original parameters ; present in (2.1), as has to be taken as 0). (5) One then shows that if the free parameters in the problem (i.e. ; in (2.1)) are conveniently chosen, then the recursion relation implies that the trajectory stays bounded and small, thus giving a precise meaning to (2.2), and actually the limit relation ( (h) ; (h) ; (h) ) → ( ∞ ; ∞ ; 0) holds as h → −∞ (this is achieved in the above fermionic problem by fixing as a suitable function of , see Berretti et al. (1994)). (6) Hence the whole scheme is self-consistent and it remains to check that the expressions that one thus attributes to the sum of the series are indeed solutions of the problem that has generated them: not unexpectedly this is the easy part of the work, because we have always worked with formal solutions which “only missed, perhaps, to be convergent”. (7) The first step, i.e. going to scale 0, is different from the others as the propagators have no ultraviolet cutoff (see the graph of $(1) in Fig. 2). Although there are no ultraviolet divergences, the control of this first step offers surprising difficulties (due to the fact that in the direction of k0 the decay of the propagators is slow, making various integrals improperly convergent): the analysis is done in Berretti et al. (1994) and Gentile and Scoppola (1993).

Note that the above scheme leaves room for the possibility that the running couplings, rather than being analytic functions of a few of the initial free parameters, are singular: this does not happen in the above fermionic problem because some components of the beta function vanish identically; this is however a peculiarity of the fermionic models. In other applications to field theory, and particularly in the very first example of the method, which is the hierarchical model of Wilson, this is by far not the case, and the perturbation series are not analytic in the running couplings but just asymptotic in the actual free parameters of the theory. The method however “reduces” the perturbation analysis to a recursion relation in small dimension (namely 3 in case (2.1)) which is also usually easy to treat heuristically. The d = 2 ground state fermionic problem (i.e. (2.1) in 2 space dimensions) provides, however, an example in which even the heuristic analysis is not easy.
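The tuning of a free parameter described in step (5) can be illustrated with a deliberately simple numerical sketch. The recursion below is a toy model assumed here for illustration only, not the fermionic beta function: a "relevant" coupling is doubled at each change of scale and receives a source term that decays geometrically (the names `run_flow`, `tune_nu0`, `a` and `r` are all hypothetical). Only one initial value keeps the trajectory bounded, and bisection finds it by checking in which direction the flow escapes.

```python
def run_flow(nu0, a=0.1, r=0.5, steps=40, bound=1.0):
    """Iterate the toy recursion nu -> 2*nu + a*r**k and return the last value.

    The iteration stops early once |nu| exceeds `bound`, mimicking a coupling
    that leaves the region where the expansion is meaningful.
    """
    nu = nu0
    for k in range(steps):
        nu = 2.0 * nu + a * r**k
        if abs(nu) > bound:
            break
    return nu


def tune_nu0(lo=-1.0, hi=1.0, iterations=60, **flow_args):
    """Bisect on the initial value so that the flow escapes in neither direction."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if run_flow(mid, **flow_args) > 0.0:
            hi = mid  # "too positive": the flow escapes upwards
        else:
            lo = mid  # "too negative": the flow escapes downwards
    return 0.5 * (lo + hi)


nu0_star = tune_nu0()
exact = -0.1 / (2.0 - 0.5)  # closed form for this toy model: -a / (2 - r)
```

For this linear toy model the bounded solution is known in closed form, so `nu0_star` can be compared with `exact`; in a genuine problem no such formula is available and the existence of the tuned value has to be proved.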
In the following section we discuss another problem where the beta function does not vanish, but one can guarantee the existence of a bounded and small solution for the running couplings thanks to a “gauge symmetry” of the problem. This is an interesting case as the theory has no free parameters, so that it would not be possible to play on them to find a bounded trajectory for the renormalization group running constants. This also illustrates another very important mechanism that can save the method in cases where there seemed to be no hope for its use, namely a symmetry that magically eliminates all terms that one would fear to produce “divergences” in formal expansions. Again the case studied is far from the complexity of gauge field theory because it again leads to the result that the perturbation series itself is summable (unlike gauge field theories, which can only yield asymptotic convergence): but it has the advantage of being a recognized difficult problem and therefore is a nice illustration of the role of symmetries in the resummation of (possibly) divergent series and of the power of the renormalization group approach in dealing with complex problems.

4. The KAM problem

Consider d rotators with angular momentum A = (A1, …, Ad) ∈ R^d and positions α = (α1, …, αd) ∈ T^d = [0, 2π]^d; let J > 0 be their inertia moment and suppose that ε f(α) is the potential energy in the configuration α, which we suppose to be an even trigonometric polynomial (for simplicity) of degree N. Then the system is Hamiltonian with Hamiltonian function

$$H = \frac{1}{2J} A^2 + \varepsilon f(\alpha) \tag{4.1}$$

giving rise to a model called “Thirring model” (see footnote 5). For ε = 0 motions are quasi periodic (being t → (A0, α0 + ω0 t) with ω0 = J^(−1) A0) and their “spectrum” ω0 fills the set S0 ≡ R^d of all vectors ω0: there is a 1-to-1 correspondence between the spectra ω0 and the angular momenta A0. Question: If ε ≠ 0 can we find, given ω0 ∈ S0, a perturbed motion, i.e. a solution of the Hamilton equations of (4.1), which has spectrum ω0 and, as ε → 0, reduces with continuity to the unperturbed motion with the same spectrum? Or, less formally: which among the possible spectra ω ∈ S0 survive perturbation?

5 (1) The global canonical transformations C of R^d × T^d with generating functions S(A, α) = N A · α + 8(α) · A + ’(α), parameterized by a nonsingular matrix N with integer components and analytic functions g(α), f(α), leave invariant the class of Hamiltonians of the form H = (A, M(α)A)/2 + A · g(α) + f(α). The subgroup CLd(R) of the global canonical coordinate transformations C was (remarked and) used by Thirring, so that (4.1) is called the “Thirring model”, see Thirring (1983). (2) The function H_ε(ψ) in (4.2) must have zero average over ψ or, if ψ → ψ + ω0 t, over time: hence the surviving quasi periodic motions can be parameterized by their spectrum ω0 or, equivalently, by their average action A0 = J ω0. The “spectral dispersion relation” between the average action A0 and the frequency spectrum is not twisted by the perturbation. Furthermore, the function ε(ω0, J) can be taken monotonically increasing in J: J^(−1) is called the “twist rate”. The latter two properties motivated the name of “twistless motions” given to the quasi periodic motions of form (4.2) for Hamiltonians like (4.1). (3) The invariance under the group CLd(R) has been used widely in the numerical studies of the best threshold value ε(ω0, J), and a deeper analysis of this group would be desirable, particularly a theory of its unitary representations.
Analytically this means asking whether two functions H_ε(ψ), h_ε(ψ) on T^d exist, are divisible by ε and are such that if A0 = J ω0 and if we set

$$A = A_0 + H_\varepsilon(\psi), \qquad \alpha = \psi + h_\varepsilon(\psi), \qquad \psi \in T^d , \tag{4.2}$$

then ψ → ψ + ω0 t yields a solution of the equations of motion for ε small enough. It is well known that in general only “nonresonant” spectra can survive: for instance (KAM theorem) those which verify, for some […]

[…] effective potential on scale h. We can introduce also a scale label h = 1 to denote the ultraviolet scale, ψ^(1) = ψ^(u.v.), so that ψ^(≤1) ≡ ψ and V^(1)(ψ^(≤1)) = V(ψ). By using iteratively the invariance of exponentials property we see that V^(h) can be expressed in terms of V^(h+1) as

$$V^{(h)}(\psi^{(\le h)}) = \sum_{n=0}^{\infty} \frac{1}{n!}\, \mathcal{E}^T_{h+1}\big(V^{(h+1)}(\cdot + \psi^{(\le h)}); n\big) , \tag{5.18}$$
where V^(h+1) in turn can be expressed in terms of V^(h+2) as in (5.18) with h replaced with h + 1, and so on until V^(h) is expressed in terms of V^(1) ≡ V. At each step of the iterative procedure a circle representing V^(h′), for some h < h′ < 1, is transformed into a point v on a vertical line x = h′ + 1 (we use the coordinate system introduced above) with s_v ≥ 1 exiting lines leading to s_v circles representing V^(h′+1), and so on. At the end only points are left (i.e. no circles remain): the ones on the line x = 2 are called endpoints. Summarizing the above discussion, we see that we can introduce a graph representation of V^(h) in terms of labelled trees. We refer to Section A.1 for a systematic discussion on trees: here we confine ourselves to the basic notions, in order to make the following analysis self-contained. On the plane (x, y) one draws the vertical lines x = h, h + 1, …, 0, 1, 2 and one considers all the possible planar graphs obtained as follows [60]. One draws a horizontal line (a branch or a line) starting from a point r on the line x = h, the root, and leading to a point v0 with coordinate x = h_v0 > h, the first nontrivial vertex. Such a point is the branching point of s_v0 ≥ 2 lines (also branches or lines) forming angles ϑ_j ∈ (−π/2, π/2), j = 1, …, s_v0, with the x-axis and ending into points each of which is located on some vertical line x = h_v0 + 1, h_v0 + 2, … (and it becomes another branching point). One proceeds in such a way until n points on the line x = 2 are reached, the endpoints. All the branching points between the root and the endpoints will be called the nontrivial vertices. The trivial vertices will be the points located at the intersections of the lines connecting two
G. Gentile, V. Mastropietro / Physics Reports 352 (2001) 273–437
Fig. 8. A tree appearing in the graphic representation of V^(h). Such a tree is obtained by iterating the graph representations of the previous figures. All the endpoints are on the vertical line corresponding to the line h = 2.
nontrivial vertices with the vertical lines. The integer n denoting the number of endpoints will be called the order of the tree. We associate to the endpoints a number 1 to n, ordered from top to bottom. See Fig. 8. If the tree has only one line connecting the root to a vertex on the line x = 2, we say that the tree is trivial and we shall write E = E0. Note that in such a case the root has scale h = 1. The graph so obtained is a tree graph: it consists of a set of lines connecting a partially ordered set of points (the vertices). The partial ordering of the vertices will be denoted by the symbol ≺: if v ≺ w are two vertices, then h_v < h_w. Of course the lines are ordered as well; note that there is a one-to-one correspondence between vertices and lines, as a line uniquely identifies the vertex which it enters. Note that to each vertex v an integer h_v is associated by construction: it is called the scale label. In particular we can associate the scale label h to r. We can associate with the unlabelled trees also some other labels: the values of such labels will depend on the particular problem we are studying. Therefore, we shall consider also the labelled trees (to be called simply trees in the following): we shall denote by the same symbol E the labelled trees (in the following we shall deal only with labelled trees) and by T_{h,n} the set of all labelled trees with n endpoints (i.e. of order n) and with a scale label h associated to the root. It is then easy to see that the number of unlabelled trees with n endpoints is bounded by 4^n; see Section A.1. If we include also the endpoints into the set of vertices, we have that the vertices can be either trivial vertices or nontrivial vertices (which include also the endpoints). We shall denote by V(E) the set of vertices of a tree E and by V_f(E) the set of vertices in V(E) which are endpoints. By construction h_v = 2 for any v ∈ V_f(E), while h < h_v < 2 for any v ∈ V(E)\V_f(E).
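The tree construction above can be made concrete with a small data structure. The sketch below is only an illustration under simplifying assumptions: it stores the nontrivial vertices and the endpoints (trivial vertices are omitted), and the names `Vertex`, `order` and `is_valid` are hypothetical. It computes the order of a tree and checks the constraints stated in the text: endpoints on the line x = 2, at least two branches at each branching point, and scale labels that strictly increase along the partial ordering.

```python
from dataclasses import dataclass, field
from typing import List

ENDPOINT_SCALE = 2  # endpoints sit on the vertical line x = 2


@dataclass
class Vertex:
    scale: int                                   # the scale label h_v
    children: List["Vertex"] = field(default_factory=list)

    def is_endpoint(self) -> bool:
        return not self.children


def order(v: Vertex) -> int:
    """Number of endpoints following v (the order of the tree if v = v0)."""
    if v.is_endpoint():
        return 1
    return sum(order(child) for child in v.children)


def is_valid(v: Vertex, parent_scale: int) -> bool:
    """Check h_parent < h_v, endpoints at scale 2, branching number >= 2."""
    if v.scale <= parent_scale:
        return False
    if v.is_endpoint():
        return v.scale == ENDPOINT_SCALE
    if len(v.children) < 2 or v.scale >= ENDPOINT_SCALE:
        return False
    return all(is_valid(child, v.scale) for child in v.children)


# Root on the line x = h = -3; first nontrivial vertex v0 on the line x = -1.
h = -3
v0 = Vertex(-1, [Vertex(2), Vertex(0, [Vertex(2), Vertex(2)])])
```

Here `order(v0)` counts three endpoints and `is_valid(v0, h)` confirms that the scale labels respect the ordering; swapping the scales of the two branching points would violate h_v < h_w for v ≺ w and the check would fail.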
To each endpoint there corresponds one of the contributions to the interaction part of the Hamiltonian. With respect to Hamiltonian (2.17), it is more convenient to consider a Hamiltonian containing some extra term having the same form of the terms defining the free Hamiltonian H0 times some parameter: physically this is interpreted by saying that the interaction changes the “free” values of the parameters, i.e. the values of the parameters of the Hamiltonian
describing the free system (the mass and the chemical potential). By using the decomposition in (2.1) and (2.2) for H0, we shall consider Hamiltonians of the form

H = H0 + V = H0 + 2V1 + 7V2 + uV3 + V4 + (V5 ,    V1 = T0 , V2 = N0 , V3 = P , V4 = V , V5 = B .    (5.19)
Then with each endpoint v of scale h_v = 2 we associate one of the five contributions to V; so we can associate with v a label i ≡ i_v ∈ {1, …, 5} uniquely identifying the contribution V_i to V in (5.19). We shall say that the endpoint is (1) of type 2 if i = 1, (2) of type 7 if i = 2, (3) of type u if i = 3, (4) of type if i = 4, (5) of type ( if i = 5. We can also introduce a label r_v for v ∈ V_f(E) such that r_v = 2 if i_v = 1 and so on. If n is the number of endpoints, n = |V_f(E)|, we shall write n = n1 + · · · + n5, where n_i is the number of endpoints v ∈ V_f(E) with i_v = i. Moreover, with such an endpoint v we associate also a set {x_v} of space–time points, which are the integration variables corresponding to the particular interaction contribution V_i: in particular {x_v} contains one point for any i ≠ 4 and two points for i = 4. Given a vertex v which is not an endpoint, {x_v} will denote the family of all space–time points associated with the endpoints following v, i.e. with the endpoints w ∈ V_f(E) such that v ≺ w. We introduce a field label f to distinguish the fields appearing in the terms associated with the endpoints: the set of field labels associated with the endpoint v will be called I_v. Then x(f), σ(f) and ω(f) will denote the space–time point, the σ index and the ω index, respectively, of the field with label f. For instance, for v ∈ V_f(E) with i_v = 4, then {x_v} = {x, y} and I_v = {f1, f2}, if x(f1) = x and x(f2) = y. We shall write also x(I_v) = {x(f): f ∈ I_v}. Analogously, if v is not an endpoint, we shall call I_v the set of field labels associated with the endpoints following the vertex v.

5.2. Clusters

It is clear that, if h ≤ 0, the effective potential (if Ẽ_h are normalization factors for any h ≤ 2) can be written in the following way:

$$V^{(h)}(\psi^{(\le h)}) + L\beta \tilde{E}_{h+1} = \sum_{n=1}^{\infty} \sum_{E \in T_{h,n}} V^{(h)}(E, \psi^{(\le h)}) , \tag{5.20}$$

where V^(h)(E, ψ^(≤h)) is defined iteratively as follows.
If E is the trivial tree E0, then h = 1 and V^(1)(E0, ψ^(≤1)) is given by one of the contributions to V(ψ) listed in (5.19). If E is nontrivial and v0 is the first vertex of E and E1, …, Es (with s = s_v0) are the subtrees of E with root v0, then

$$V^{(h)}(E, \psi^{(\le h)}) = \frac{1}{s!}\, \mathcal{E}^T_{h+1}\big(V^{(h+1)}(E_1, \psi^{(\le h+1)}), \ldots, V^{(h+1)}(E_s, \psi^{(\le h+1)})\big) . \tag{5.21}$$
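The role of the truncated expectations in (5.18) and (5.21) can be checked numerically in a much simpler setting. The sketch below assumes an ordinary commuting random variable X with finitely many values, which is not the Grassmann-valued case of the text and ignores its sign conventions; the function names are hypothetical. In that setting the truncated expectation of order n is the n-th cumulant of X, and the sum of the cumulants divided by n! reproduces log E[exp(X)].

```python
from math import comb, exp, factorial, log


def moments(values, probs, nmax):
    """Moments m_0, ..., m_nmax of a discrete random variable."""
    return [sum(p * x**n for x, p in zip(values, probs)) for n in range(nmax + 1)]


def truncated_expectations(m):
    """Cumulants from moments via
    kappa_n = m_n - sum_{k=1}^{n-1} C(n-1, k-1) * kappa_k * m_{n-k}."""
    kappa = [0.0] * len(m)
    for n in range(1, len(m)):
        correction = sum(
            comb(n - 1, k - 1) * kappa[k] * m[n - k] for k in range(1, n)
        )
        kappa[n] = m[n] - correction
    return kappa


# Toy variable: X = 0 or 0.3 with equal probability (small, so the series converges fast).
values, probs = [0.0, 0.3], [0.5, 0.5]
kappa = truncated_expectations(moments(values, probs, 12))

series = sum(kappa[n] / factorial(n) for n in range(1, 13))
exact = log(sum(p * exp(x) for x, p in zip(values, probs)))
```

The two numbers `series` and `exact` agree to high accuracy for this toy variable, which is the scalar analogue of expressing the effective potential on one scale through truncated expectations of the potential on the next one.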
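In general, for each v ∈ V(E) we denote by s_v the number of lines exiting from v (s_v = 0 if v ∈ V_f(E)), so that, by iterating (5.21), one obtains

$$V^{(h)}(E, \psi^{(\le h)}) = \Big[\prod_{v \in V(E)} \frac{1}{s_v!}\Big]\, \mathcal{E}^T_{h+1}\big(\mathcal{E}^T_{h+2}\big(\mathcal{E}^T_{h+3} \cdots \mathcal{E}^T_{-2}\big(\mathcal{E}^T_{-1}\big(\mathcal{E}^T_{0}(V(E_0, \psi^{(\le 1)}), \ldots), \ldots\big), \ldots\big), \ldots\big), \ldots\big) , \tag{5.22}$$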
where E0 is the trivial tree. The truncated expectations in (5.21) are meant to be computed starting from the endpoints towards the root. The above expression can look a little intricate at first sight: the best way to understand it is to work out some examples explicitly (for instance for values of h such as h = 0, −1, −2, …) and try to generalize them to any value of h ≤ 0. Once a vertex v is reached, one has to consider an expression of the kind

$$\frac{1}{s_v!}\, \mathcal{E}^T_{h_v}\big(\tilde\psi^{(\le h_v)}(P_{v_1}), \ldots, \tilde\psi^{(\le h_v)}(P_{v_{s_v}})\big) , \tag{5.23}$$
where s_v is the number of lines exiting from v and P_vj, with j = 1, …, s_v, is a set of indices such that

$$\tilde\psi^{(\le h_v)}(P_{v_j}) = \prod_{f \in P_{v_j}} \psi^{(\le h_v)\sigma(f)}_{x(f),\,\omega(f)} , \qquad j = 1, \ldots, s_v , \tag{5.24}$$

is a product of |P_vj| fields on scale ≤ h_v. This can be proven by induction on the scale h_v; see Section A.6. Therefore, the effect of the truncated expectation E^T_{h_v} is to contract the fields on scale h_v appearing in the products (5.24) in all the possible ways. If one uses expansion (4.42) one obtains a sum over all the possible Feynman diagrams which can be obtained by contracting the half-lines emerging from the sets P_v1, …, P_vs_v. This means that, when the vertex v is reached moving along the tree E, we construct a “diagram” formed by lines ℓ on scales h_ℓ ≥ h_v. To any vertex w v there corresponds a subdiagram <w such that all the lines on scale h_w form a connected set if all the subdiagrams <wj, j = 1, …, s_w, corresponding to the vertices immediately following w, are thought of as contracted into points (this simply follows from the very definition of truncated expectation). We call P_v the set of labels corresponding to the fields associated to the external lines of […]

[…] differentiable as functions of x0, if i = a, b. Moreover, there exist two constants Q1 and Q2 of the form

$$Q_1 = -a_1 J_3 + O(J_3^2), \qquad Q_2 = a_2 J_3 + O(J_3^2) , \tag{16.7}$$

a1 and a2 being positive constants, uniformly bounded in L, β, pF and (u, J3) ∈ A, such that the following is true. Given any positive integers n and N, there exist positive constants Ã < 1 and C_{n,N}, independent of L, β, pF and (u, J3) ∈ A, so that, for any integers n0, n1 ≥ 0 and putting n = n0 + n1,

$$|\partial_{x_0}^{n_0} \bar\partial_x^{n_1} U^{3,a}(x)| \le \frac{1}{|x|^{2+2Q_2+n}}\, \frac{C_{n,N}}{1 + [3|x|]^N} , \tag{16.8}$$
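$$|\partial_{x_0}^{n_0} \bar\partial_x^{n_1} U^{3,b}(x)| \le \frac{1}{|x|^{2+n}}\, \frac{C_{n,N}}{1 + [3|x|]^N} , \tag{16.9}$$

$$|U^{3,c}(x)| \le \frac{1}{|x|^{2}} \left[ \frac{1}{|x|^{\tilde A}} + \frac{(3|x|)^{1/2}}{|x|^{\min(0,\,2Q_2)}} \right] \frac{C_{0,N}}{1 + [3|x|]^N} , \tag{16.10}$$

where $\bar\partial_x$ denotes the discrete derivative and

$$3 = \max\Big\{ |u|^{1+Q_1},\ \sqrt{(v_0 \beta)^{-2} + L^{-2}} \Big\} . \tag{16.11}$$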
(16.11)
(c) U3; a (x) and U3; b (x) are even functions of x and there exists a constant ∗ ; of order J3 ; such that; if 1 6 |x| 6 3−1 and v0∗ = v0 (1 + ∗ ) 1 + A1 (x) ; + (v0∗ x0 )2 ]1+Q2
2 x0 − (x=v0 )2 1 3; b + A2 (x) ; U (x) = 2 2 2' [x + (v0 x0 )2 ]2 [x2 + (v0 x0 )2 ]2
U3; a (x) =
2'2 [x2
|Ai (x)| 6 c1 {|J3 | + (3|x|)1=2 }
(16.12) (16.13)
for some constant c1 . The function U3; a (x) is the restriction to Z×R of a function on R2 ; satisfying the symmetry relation 3; a 3; a ∗ x U (x; x0 ) = U x0 v0 ; ∗ : (16.14) v0 3 (d ) Let Uˆ (k); k = (k; k0 ) ∈ [ − '; '] × R1 ; the Fourier transform of U3 (x). For any =xed k 3 with k = (0; 0); (±2pF ; 0); Uˆ (k) is uniformly bounded as u → 0; moreover; for some constant c2 ; c2 ; 3
|Uˆ (0; 0)| 6 c2 + c2 |J3 | log 3
|Uˆ (±2pF ; 0)| 6 c2
1 ; 3
1 − 3Q2 : Q2
(16.15)
Finally, if u = 0, Û^3(k) is at most logarithmically divergent at k = (0, 0) for any J3, and, at k = (±2pF, 0), it is singular only if J3 < 0; in this case it diverges as |k − (±2pF, 0)|^{Q2}/|Q2|. (e) Let G(x) = U^3(x, 0) and Ĝ(k) its Fourier transform. For any fixed k ≠ 0, ±2pF, Ĝ(k) is uniformly bounded as u → 0, together with its first derivative; moreover

$$|\partial_k \hat G(0)| \le c_2 , \qquad |\partial_k \hat G(\pm 2p_F)| \le c_2 (1 + 3^{Q_2}) . \tag{16.16}$$
Finally, if u = 0, ∂_k Ĝ(k) has a first-order discontinuity at k = 0, with a jump equal to 1 + O(J3), and, at k = ±2pF, it is singular only if J3 < 0; in this case it diverges as |k − (±2pF)|^{Q2}.

We comment on the above very elaborate theorem.
(a) The above theorem holds for any magnetic field h such that sin pF > 0, if pF = h − J3. Remember that the exact solution [8] is valid only for h = 0. Moreover, u need not be small, see (16.5), and the only small parameter is J3; however the interesting (and more difficult) case is when u is small.
(b) A naïve estimate of , is , = c(sin pF )2 , with c; 2 positive numbers; in other words we must take smaller and smaller J3 for pF closer and closer to 0 or π, i.e. for magnetic fields of size close to 1. It is unclear at the moment if this is only a technical problem or a property of the model.
(c) If J1 ≠ J2 and J3 ≠ 0 one can distinguish, like in the J3 = 0 case (16.1), two regions in the behaviour of the correlation function U^3(x), discriminated by an intrinsic length which is given approximately by the inverse of the spectral gap. In the first region the bounds for the correlation function are the same as in the gapless J1 = J2 case, while in the second region there is a faster than any power decay with rate given essentially by the gap size, which is O(|u|^{1+Q1}), see (16.11), in agreement with (16.3), found by the exact solution. The interaction J3 has the effect that the gap becomes anomalous and acquires a critical index Q1; the ratio between the renormalized and bare gap is very small or very large, if u is small, depending on the sign of J3. In the first region one can obtain the large distance asymptotic behaviour of U^3(x), see (16.12) and (16.13); in the second region only an upper bound is obtained, but even in the J3 = 0 case we are not able to obtain more from the exact solution if h = 0. If u = 0 only the first region is present, as the spectral gap is vanishing.
(d) It is useful to compare the expression for the large distance behaviour of U^3(x) in the case u = 0 with its analogue for the Luttinger model (2.7). A first difference is that, while in the Luttinger model the Fermi momentum is independent of the interaction, in the XYZ model in general it is changed nontrivially by the interaction, unless the external magnetic field is zero, i.e. pF = π/2. The reason is that the Luttinger model has special parity properties which are not satisfied by the XYZ chain (except if the magnetic field is vanishing).
(e) Another peculiar property of the Luttinger model correlation function is that it depends on pF only through the factor cos(2pF x); this is true not only asymptotically (i.e. it is true not only in (14.25) but in the complete expression in [41,42]) and is due to a special symmetry of the Luttinger model (the Fermi momentum disappears from the Hamiltonian if a suitable redefinition of the fermionic fields is done, see [41,42]). This is of course not true in the XYZ model and in fact the dependence of U^3(x) on pF is very complicated. However, we will see that U^3(x) can be written as the sum of three terms, see (16.6), and from (16.17) and (16.9) we have that the derivatives of the first two terms verify the same bounds as their analogues in the Luttinger model (which were pF independent). This is not true for the third term U^{3,c}(x), in which there are possibly oscillating terms invalidating a bound on the derivatives like (16.17) and (16.9). However, we can prove that such a term is smaller for large distances, see (16.10) (note that Ã is J3 and u independent, contrary to Q2). Of course this is true only for small J3 and it could be that such a third term plays an important role for larger J3. If we compare (16.12) with u = 0 with (14.25) we see that the expressions
differ essentially for the factors A_i(x), containing terms of higher order in our expansion. We can prove that A_i(x) verify (16.13) and that the derivatives verify a bound like (16.8) and (16.9), which means that the higher-order terms verify the same bound as the first-order terms, or the same bound as their analogues in the Luttinger model. However, the first-order terms, or (14.25), have subtle symmetry properties which are very important in analysing the Fourier transform. We are able to prove that A1(x) verifies (16.14), which says essentially that v0∗ is the renormalized Fermi velocity; in fact the decomposition of U^{3,a} in the form of (16.12) with A1(x) verifying (16.13) is not unique, as one can replace v0∗ with any velocity v̄0∗ of the form v̄0∗ = v0∗(1 + O( )) and an expression similar to (16.12) with A1(x) verifying (16.13) is still found; however, with v̄0∗ property (16.14) is not true unless v̄0∗ = v0∗, and this allows us to say that v0∗ is the renormalized Fermi velocity. We are not able, however, to prove a similar property for A2(x), see below.
(f) Another important property of the Luttinger model correlation function is the fact that the non-oscillating term does not acquire a critical index, contrary to what happens for the term oscillating with frequency pF/π. In the Luttinger model the non-oscillating term of the correlation function is exactly (i.e. not asymptotically) equal to the noninteracting one. Again in the XYZ model this is not true, but one is naturally led to the conjecture that the critical index of U^{3,b}_{L,β}(x) is still vanishing, see for instance [Sp]. In our expansion, we have a series also for the critical index of U^{3,b}_{L,β}(x), and while an explicit computation of the first order gives a vanishing result, it is not obvious that this is true at any order. However, due to some hidden symmetries of the model (i.e.
symmetries enjoyed approximately by the relevant part of the effective action) we can prove that all the coefficients are vanishing by proving a Ward identity. We want to stress that this is, to our knowledge, the first example in which an approximate Ward identity is proved in a rigorous way. The Ward identity we find is not the same as the one obtained by neglecting the regularizations and proceeding formally.
(g) The above properties can be used to study the equal time density correlation Fourier transform; if J3 = 0 its first derivative at k = ±2pF is logarithmically divergent at u = 0 and it is finite at k = 0; if J3 ≠ 0 the behaviour of the first derivative at k = ±2pF is completely different, as it is finite if J3 > 0 while it has a power like singularity, if u = 0, if J3 < 0, see item (e) in the theorem. This is due to the fact that the critical index Q2 appearing in the oscillating term in U^3_{L,β}(x) has the same sign as J3 (note that Q2 has nothing to do with the critical index Q appearing in the two-point fermionic Schwinger function, which is O((J3)^2)). On the other hand, the equal time density correlation Fourier transform near k = 0 of the Luttinger model, of the XYZ model or of the free fermionic gas (J1 = J2, J3 = 0) behaves in the same way (see also [Sp] for a heuristic explanation). This is due to a parity cancellation in the expansion eliminating the apparent dimensional logarithmic divergence.
(h) From (14.25) in the u = 0 case we can see that the (bidimensional) Fourier transform can be singular only at k = (0, 0) and k = (±2pF, 0). If J3 = 0 the singularity is logarithmic at k = (±2pF, 0), but there is no singularity if J3 > 0 and there is a power like singularity if J3 < 0, see item (d) in the theorem. Then the singularity at k = (±2pF, 0) is of the same type as in the Luttinger model, see (14.25). However, we cannot conclude that the same is true for the Fourier transform at k = 0, which is bounded in the Luttinger model, while we cannot exclude a logarithmic divergence.
In order to get such a stronger result, it would be sufficient
to prove that the function U^{3,b}(x) is odd in the exchange of (x, x0) with (x0 v, x/v), for some v; this property is true for the leading term corresponding to U^{3,b}(x) in (14.25), with v = v0, but seems impossible to prove on the basis of our expansion. We can only see this symmetry for the leading term, with v = v0∗.
(i) Note that our theorem cannot be proved by building a multiscale renormalized expansion, neither by taking the XY model as the “free model” and J3 as the perturbative parameter, nor by taking the XXY model as the free model and u as the perturbative parameter. In fact, in order to solve the model, one cannot perform a single Bogoliubov transformation as in the J3 = 0 case; the gap has a nontrivial flow and one has to perform a different Bogoliubov transformation for each renormalization group integration.
(l) If u = 0 the critical indices and 7 can be computed with any prefixed precision; we write explicitly in the theorem only the first order for simplicity. However, if u ≠ 0, they are not fixed uniquely; as far as 7 is concerned, this means that, in the gapped case, the system is insensitive to variations of the magnetic field much smaller than the gap size.
(m) Finally, there is no reason for considering only a nearest-neighbour Hamiltonian like (2.10); it will be clear from the following analysis that our results still hold for non-nearest-neighbour spin Hamiltonians.
17. Spinning fermions

17.1. The repulsive case

If the fermions are spinning, the general scheme is the same as the one discussed for spinless fermions, but new complications arise from the fact that the number of running coupling constants is much higher. Let us consider a system of spinning fermions on a lattice in the non-filled band case with Hamiltonian H = H0 + V + 7N0
(17.1)
with H0 ; N0 given by (2.1), and V given by (2.5). This case was studied in [18] to which we refer for details. One can de6ne an anomalous integration similar to the one in Section 8 for spinless fermions; the localization operator is de6ned by (8.19) – (8.21). The spin has the e=ect that there are more running coupling constants; in fact the relevant part of the e=ective potential, which in the spinless case is given by (8.25), is, if pF = 0; ' for any integer n: (h)
(h)
(h)
(h)
LV (h) = Ah 7h F7(h) + h Fz(h) + gh1 F1 + gh2 F2 + gh4 F4 + pF ;'=2 gh3 F3
;
where F1(h) =
1 (L+)4
k1 ;:::;k4 ∈DL;+ ; !
(17.2)
(6h)+ k1 +!pF ;!;
(6h)+ k2 −!pF ;−!;
(6h)− k3 +!pF ;!;
(6h)− k4 −!pF ;−!;
4 i=1
i ki
;
F2(h) =
1 (L+)4
k ;:::;k ∈D 1
4
L;+
!
;
(6h)+ k1 +!pF ;!;
(6h)+ k2 −!pF ;−!;
(6h)− k3 −!pF ;−!;
(6h)− k4 +!pF ;!;
4
i ki
;
i=1
(17.3) F4(h) =
1 (L+)4
k ;:::;k ∈D 1
4
L;+
;
!
(6h)+ k1 +!pF ;!;
(6h)+ k2 +!pF ;!;
(6h)− k3 +!pF ;!;
(6h)− k4 +!pF ;!;
4
i ki
;
i=1
(17.4) F3(h) =
1 (L+)4
k ;:::;k ∈D 1
4
L;+
;
!
(6h)+ k1 +!pF ;!;
(6h)+ k2 +!pF ;!;
(6h)− k3 −!pF ;−!;
(6h)− k4 −!pF ;−!;
4
i ki
i=1
and ˆ + O(2 ); g02 = v(0) 2 g01 = v(2p ˆ F ) + O( );
g04 = v(0) ˆ + O(2 ) ; 2 g03 = v(2p ˆ F ) + O( ) :
Note that g^2_h, g^4_h correspond to interactions with a small exchange of momentum and are called forward scattering processes; g^1_h corresponds to an interaction with a big exchange of momentum and is called backward scattering. Finally g^3_h is possible only at pF = π/2 and it is an Umklapp scattering. Of course one can obtain the analyticity of the beta function if the running coupling constants are small enough, proving a result similar to Theorem 1 in Section 8. However the flow of the running coupling constants is now much more complex. We consider the case pF ≠ 0, π/2, π; the renormalization group flow equations for the running coupling constants g^1_h, g^2_h, g^4_h are given by, if h = gh2 ; gh4 ; h 1 gh−1 = gh1 + gh1 [ − +gh1 + +h1 (˜vh ; : : : ; v0 )] ; 2 gh−1 = gh2
+
4 = gh4 + gh−1
gh1
+ 1 ˆh (h) − gh + +2 (˜vh ; : : : ; v0 ) + +2 (h ; 7h ; : : : ; 0 ; 70 ) ;
2
h gh1 +ˆ 4 (˜vh ; : : : ; v0 )
+ +4(h) (h ; 7h ; : : : ; 0 ; 70 )
with β > 0, and we have written explicitly the second-order terms. Note that, by trivial symmetry considerations, any contribution to g^1_{h−1} has at least a g^1 endpoint. Truncating the above equations at the second order we see that g^1_h → 0 if g^1_0 > 0, while it grows, exiting the radius of convergence of the beta function, if g^1_0 < 0. We consider for the moment the repulsive case v̂(2pF) > 0. One can proceed as in Section 10, dividing the beta function in a part depending only on the Luttinger model part of the propagator g!(h) (see Lemma 2 in Section 8) plus a “correction” which is smaller by a factor AQh . Moreover, one can fix the counterterm 7 so that 7h = O(AQh ), so dividing, like in Section 10, the beta function in a part independent from 7h
G. Gentile, V. Mastropietro / Physics Reports 352 (2001) 273–437
385
plus a correction smaller by a factor $\gamma^{\eta h}$. Let $\beta^h_i(\mu_h,\nu_h;\ldots;\mu_0,\nu_0)$ be the function obtained from $\beta^h_i(\vec v_h,\ldots,\vec v_0)$ by putting $g^1_h = \nu_h = 0$; one can show, see [18], that if
\[ \beta^h_2(\mu_h,0;\ldots;\mu_h,0) = 0, \tag{17.5} \]
\[ \beta^h_4(\mu_h,0;\ldots;\mu_h,0) = 0, \tag{17.6} \]
\[ \beta^h_1(\mu_h,0;\ldots;\mu_h,0) = 0, \tag{17.7} \]
\[ \hat\beta^h_2(\mu_h,0;\ldots;\mu_h,0) = 0, \tag{17.8} \]
\[ \hat\beta^h_4(\mu_h,0;\ldots;\mu_h,0) = 0, \tag{17.9} \]
then it is possible to choose a counterterm $\nu$ such that, if $\lambda\hat v(2p_F) > 0$, then
\[ g^1_h \to_{h\to-\infty} 0, \qquad \nu_h \to_{h\to-\infty} 0, \qquad \frac{Z_{h-1}}{Z_h} \to_{h\to-\infty} \gamma^{\eta} \]
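and $g^2_h, g^4_h, \delta_h \to_{h\to-\infty} g^2_\infty, g^4_\infty, \delta_\infty$, with $\eta = a\lambda^2 + O(\lambda^3)$, $a > 0$, and $g^2_\infty = g^2_0 + O(\lambda^2)$, $g^4_\infty = g^4_0 + O(\lambda^2)$, $\delta_\infty = O(\lambda^2)$. In order to prove (17.5)–(17.8) we follow essentially the same strategy as for the spinless case, see Section 11, but in the spinning case the role of the Luttinger model is played by the Mattis model with Hamiltonian
\[
\begin{aligned}
H ={}& \int_0^L dx\,(1+\delta) \sum_{\omega=\pm1}\sum_{\sigma=\pm1/2} \psi^+_{\omega,\sigma,x}(i\omega\partial_x - p_F)\psi^-_{\omega,\sigma,x} \\
&+ g_{2,p}\sum_{\omega,\sigma}\int_0^L dx\int_0^L dy\, v(x-y)\, \psi^+_{\omega,\sigma,x}\psi^-_{\omega,\sigma,x}\psi^+_{-\omega,\sigma,y}\psi^-_{-\omega,\sigma,y} \\
&+ g_{2,o}\sum_{\omega,\sigma}\int_0^L dx\int_0^L dy\, v(x-y)\, \psi^+_{\omega,\sigma,x}\psi^-_{\omega,\sigma,x}\psi^+_{-\omega,-\sigma,y}\psi^-_{-\omega,-\sigma,y} \\
&+ g_{4,p}\sum_{\omega,\sigma}\int_0^L dx\int_0^L dy\, v(x-y)\, \psi^+_{\omega,\sigma,x}\psi^-_{\omega,\sigma,x}\psi^+_{\omega,\sigma,y}\psi^-_{\omega,\sigma,y} \\
&+ g_{4,o}\sum_{\omega,\sigma}\int_0^L dx\int_0^L dy\, v(x-y)\, \psi^+_{\omega,\sigma,x}\psi^-_{\omega,\sigma,x}\psi^+_{\omega,-\sigma,y}\psi^-_{\omega,-\sigma,y}.
\end{aligned}
\]
This model is also solvable, see [113], and its Schwinger functions can be explicitly computed [81].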
Reasoning as in Section 11, one can study the above model by the renormalization group. Let us start from the spin symmetric Mattis model, $g_{2,p} = g_{2,o}$ and $g_{4,p} = g_{4,o}$, in which one obtains an expression for the relevant part of the effective potential similar to (17.2) but with $g^1_h = g^3_h = \nu_h \equiv 0$. As the finite volume Schwinger functions of the Mattis model are known, we can reason exactly as in Section 11 and we obtain (17.5) and (17.6). In order to prove (17.7) and (17.8) we study by the renormalization group the nonspin symmetric Mattis model, in which $g_{2,p} \neq g_{2,o}$ and $g_{4,p} \neq g_{4,o}$. One again obtains an expression similar to (17.2) but with $g^1_h = g^3_h = \nu_h = 0$, and the relevant part of the effective potential is given by
\[ \mathcal{L}V^{(h)} = g^{2,p}_h F^{(h)}_{2,p} + g^{2,o}_h F^{(h)}_{2,o} + g^{4,o}_h F^{(h)}_{4,o}, \]
where $F^{(h)}_{2,p}$ and $F^{(h)}_{2,o}$ are given by (17.3) with $\sigma' = \sigma$ and $\sigma' = -\sigma$, respectively, and in the same way are defined $F^{(h)}_{p,4} = 0$ and $F^{(h)}_{o,4}$, see (17.4). The beta function, with all the running coupling constants having the same scale, driving the flow of $g^{i,\alpha}_h$ with $i = 2, 4$ and $\alpha = o, p$ of the nonspin symmetric Mattis model can be written as
\[ \sum_{n_1,\ldots,n_4} [g^{4,o}_h]^{n_1}[g^{2,p}_h]^{n_2}[g^{2,o}_h]^{n_3}[\delta_h]^{n_4}\, \beta^{(h)}_{i,\alpha;n_1,\ldots,n_4}. \tag{17.10} \]
Again reasoning as in Section 11, by comparison with the Schwinger functions of the nonspin symmetric Mattis model, the vanishing of (17.10) follows, and from the independence of $g^{4,o}_h$, $g^{2,p}_h$, $g^{2,o}_h$ it follows that
\[ \beta^{(h)}_{i,\alpha;n_1,\ldots,n_4} = 0. \tag{17.11} \]
Let us return to the spin symmetric model with effective potential given by (17.2) and $g^{2,p}_h = g^{2,o}_h$, $g^{1,p}_h = g^{1,o}_h$. By the conservation of the quasi-particle and spin indices, it is not possible to have a contribution to $g^2_{h-1}$ involving only one $g^{1,o}_h$ and any number of $\mu_h$; then the only possibility is to have a contribution to $g^2_{h-1}$ involving only one $g^{1,p}_h$ and any number of $\mu_h$. But such a contribution is equal to
\[ [g^{o,4}_h]^{n_1}[g^{p,2}_h]^{n_2-1}[g^{o,2}_h]^{n_3}[g^{p,1}_h][\delta_h]^{n_4}\, \beta^{(h)}_{2,\alpha;n_1,\ldots,n_4}, \tag{17.12} \]
so it is vanishing. In fact the functions $\beta^{(h)}_{2,\alpha;n_1,\ldots,n_4}$ in (17.12) and (17.10) are the same, as $F^{(h)}_{p,2} = F^{(h)}_{p,1}$. This proves (17.8). The same argument can be repeated for $i = 4$, so proving (17.9). Finally, let us consider the contribution to $g^1_{h-1}$ involving only one $g^1_h$ and any number of $\mu_h$. We consider a contribution to $g^{p,1}_{h-1}$; by symmetry considerations it follows that there is no contribution to $g^{p,1}_{h-1}$ involving one $g^{o,1}_h$ and any number of $\mu_h$, and the only possibility is a contribution involving one $g^{p,1}_h$ and any number of $\mu_h$. But replacing $g^{p,1}_h$ with $g^{p,2}_h$ and remembering that $F^{(h)}_{p,2} = F^{(h)}_{p,1}$, this contribution coincides with a contribution to $g^{p,2}_{h-1}$, so it is vanishing by (17.11). On the other hand, we are considering the spin symmetric case, so $g^{p,1}_{h-1} = g^{o,1}_{h-1}$ and (17.7) is proved.
At the end, the following theorem can be proved (the proof in [18] refers to the continuum case):

Theorem 10. Given Hamiltonian (17.1) for spinning fermions with $p_F \neq 0, \pi/2, \pi$, if $\lambda\hat v(2p_F) > 0$ there exists an $\varepsilon > 0$ such that, for $|\lambda| \le \varepsilon$, there are functions $\nu(\lambda)$, $\eta(\lambda)$ such that the two-point Schwinger function is given by
\[ S(x;y) = \frac{g(x;y)}{|x-y|^{\eta}} + \frac{A(x;y)}{|x-y|^{1+\eta}}, \]
with $A(x;y)$ bounded by a constant, $\nu(\lambda) = O(\lambda)$ and $\eta = a\lambda^2 + O(\lambda^3)$, with $a > 0$.

In the half-filled band case $p_F = \pi/2$ there is one more running coupling constant, $g^3_h$, whose second-order flow is not trivial and is given by
\[ g^3_{h-1} = g^3_h + \beta g^3_h (g^1_h - 2g^2_h), \]
so that the flow of the running coupling constants becomes much more complex to study. It is quite clear that one can add to Hamiltonian (17.1) a term $uP$ representing the interaction with a commensurate or an incommensurate potential; in the repulsive case $\lambda\hat v(2p_F) > 0$, and under proper conditions on $p_F$ forbidding the appearance of extra running coupling constants (for instance, if $p/\pi$ is a rational number we require $p_F \neq np/2$ for any integer $n$), one can prove results similar to their analogues in the spinless case, see Section 13.
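The dichotomy between the repulsive and the attractive case discussed in this section can be illustrated with a minimal numerical sketch. It iterates only the second-order truncation of the flow of the backscattering coupling, $g^1_{h-1} = g^1_h - \beta (g^1_h)^2$. The value $\beta = 1$, the initial couplings $\pm 0.1$ and the cutoff `radius`, which stands in for the convergence radius of the beta function, are illustrative assumptions and are not taken from the text.

```python
def truncated_g1_flow(g1_0, beta=1.0, max_steps=200, radius=1.0):
    """Iterate the second-order truncation g -> g + g * (-beta * g).

    Returns the list of couplings visited, stopping early once |g|
    exceeds `radius` (a stand-in for the convergence radius; assumption).
    """
    flow = [g1_0]
    g1 = g1_0
    for _ in range(max_steps):
        g1 = g1 + g1 * (-beta * g1)
        flow.append(g1)
        if abs(g1) > radius:
            break
    return flow


# Repulsive initial datum: the coupling decreases monotonically towards 0.
repulsive = truncated_g1_flow(0.1)
# Attractive initial datum: the coupling grows in modulus and leaves the radius.
attractive = truncated_g1_flow(-0.1)

print("repulsive :", len(repulsive) - 1, "steps, final g1 =", repulsive[-1])
print("attractive:", len(attractive) - 1, "steps, final g1 =", attractive[-1])
```

In the attractive run the number of steps needed to leave the radius grows roughly like $1/|g^1_0|$, so the corresponding scale $\gamma^h$ is exponentially small in the inverse coupling.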
17.2. The attractive case

The analysis above shows that the presence of the spin, if $p_F \neq 0, \pi/2, \pi$ and the interaction is repulsive, is in some sense irrelevant, as the asymptotic behaviour of the two-point Schwinger function is similar to the one in the spinless case. The situation is completely different in the attractive case $\lambda\hat v(2p_F) < 0$, in which the running coupling constants do not remain within the convergence radius of the series for the beta function unless, in the infinite volume limit, the temperature is larger than $e^{-\kappa/|\lambda\hat v(2p_F)|}$ for some suitable constant $\kappa$. It is easy in fact to check that for $h > h_{\bar\beta} = O(\log \bar\beta^{-1})$, with $\bar\beta \le e^{\kappa/|\lambda\hat v(2p_F)|}$, the running coupling constants remain $O(\lambda)$. It is generally believed that the growth of the coupling $g^{(h)}_1$ in the attractive case, or of $g^{(h)}_3$ if $p_F = \pi/2$ (again in the attractive case), is related to the opening of a gap and to exponential decay of correlations. Our result gives an upper bound on a possible gap in the ground state energy, saying that $|\Delta| \le e^{-\kappa/|\lambda\hat v(2p_F)|}$. A proof that there really is a gap in the spectrum is up to now lacking except in the remarkable case of the Hubbard model; it is a particular case of the model we are considering in which $v(x-y) = \delta_{x,y}$ and $p_F = \pi/2$. In this case it was proved in [5] that the ground state has a gap for any $\lambda < 0$; moreover, the ground state is such that each site is occupied by an electron and the spins are alternating (hence a spin density wave with period $1/\rho$). In the general situation, only mean-field approximations are at our disposal; a very simple heuristic mean-field argument, by which one can deduce the appearance of a gap from the growth of $g^{(h)}_1$, is the following. As $g^1_h$ is the unstable process, this suggests that the relevant interactions involve the exchange of a momentum of order $2p_F$, so that the important terms in the
interaction are of the form, for |k |; |k | 6 pF =4 (say) 1 1 + − − + : k+!pF ; k−!pF ; L L k −!pF ;− k +!pF ;− !; k
(17.13)
k
Making a BCS-type mean-6eld theory we write |S |ei2 =
1 L k
− + k+pF ; k−pF ;
and neglecting quantum Cuctuations one obtains an e=ective interaction x; |S | cos(2pF x + + − 2) x; x; , from which the existence of a gap at the Fermi surface can be deduced. In this argument there is, however, a Caw; it does not take into account the fact that, if pF =' is irrational, then it can be that 2npF 2pF mod . 2' for very large n, so that it is not a priori true that one of the interactions exchanging momenta O(2npF ) are negligible. A more correct way to perform a mean-6eld analysis is the following one. One can replace in the interaction (assumed local for simplicity) x;+ x;− x;+− x;−− two fermionic 6elds with a classical 6eld + − x; x;
→ ’(x) + [
+ − x; x;
− ’(x)] ;
neglecting (this is the approximation) terms quadratic in the “Cuctuations” [ thus obtaining a model H0 + ’(x) x;+− x;−− − ’2x : x∈
(17.14) + − x; x;
− ’(x)],
(17.15)
x∈
This model is called the variational Holstein model, and the nontrivial problem is to minimize the ground-state energy with respect to $\varphi$. One arrives at the same model also by considering the interaction of fermions with a phonon field, neglecting quantum fluctuations, which will be discussed in Section 19. We anticipate that even in this approximation the existence of periodic ground states (which can be commensurate or incommensurate depending on whether $p_F/\pi$ is a rational or an irrational number) is not trivial (for instance, it is not proved for small $\lambda$ and $p_F/\pi$ irrational, see below). In other words, even in a mean-field model the existence of a gap is not proven, in general, in the attractive case.
18. Fermions interacting with phonon fields

18.1. Interaction with a quantized phonon field

The Hamiltonian of a system of one-dimensional fermions on a lattice interacting locally with the optical modes of a quantized phonon field is given by (2.8) and (2.9). We refer to [78] for
more details. The two-point Schwinger function can be written as P(d W) P(d )e−gV˜ x;+ y;− S(x; y) = ; P(d W) P(d )e−gV˜ where P(d W) is a bosonic integration with propagator 1 e−ik(x−y) ; v(x; y) = 2 2 2 2(1 − cos k) L+ ik + ikL k + 1 + b 0 0 0 e
(18.1)
(18.2)
=1 e =1; |k|6'
with |v(x; 0)| 6
and
2 (b) =
C(b) =
C(b) − 0−1 |x0 | −2 (b)|x| e e ; 0
O(b−1 ) O(log b−1 )
for b → ∞ ;
O(1) O(b−1 log b)
for b → 0 ;
(18.3)
for b → 0;
(18.4)
for b → ∞:
Integrating out the boson 6elds in (18.1) we obtain 2 P(d )eg V x;+ y;− S(x; y) = ; P(d )eg2 V with
+=2 +=2 1 d x0 d y0 v(x − y) V= 8 −+=2 −+=2 x;y∈
− + − + x; x; y; y;
(18.5)
:
(18.6)
The only difference with respect to the previously considered interacting spinless Hamiltonian is that the interaction is not local in time; it is easy to check that this changes nothing in the previous discussion. Then in the spinless case one can prove that the Schwinger function has an anomalous behaviour; of course the convergence radius vanishes as $b \to \infty$ (corresponding to long range interactions, i.e. $p_0 \to 0$); it also vanishes if 0 → ∞. In the spinning case one is in the situation of the preceding section, so results are found only for temperatures greater than $e^{-\kappa/g^2}$.

18.2. Classical limit: the static Holstein model

We can study the above model also in the "static" limit in which the quantum fluctuations are neglected, formally corresponding to putting 0 = ∞; b = 0; one again gets, in this way, the variational Holstein model [107] found at the end of the previous section.
The ground-state problem is now equivalent to finding the field minimizing the ground-state fermionic energy. Before discussing this model, we stress again that the relationship between the variational Holstein model and the models considered in this and in the preceding sections is not well understood. Surely, if there is no spin, the quantum fluctuations completely change the behaviour (the static Holstein model makes no difference between spinning and spinless fermions), at least for small interactions.
19. The variational Holstein model

19.1. Old results

In the two preceding sections, we arrived at the variational Holstein model either by considering a mean-field model for spinning fermions with an attractive potential or by considering a semi-classical model for the phonon–fermion interaction. The problem is to find the function $\varphi(x)$ minimizing the ground-state energy of a system of fermions with Hamiltonian
\[ H \equiv H^{el}_L + \frac12\sum_{x\in\Lambda}\varphi^2(x) = \sum_{x,y\in\Lambda} t_{xy}\psi^+_x\psi^-_y - \mu\sum_{x\in\Lambda}\psi^+_x\psi^-_x - \lambda\sum_{x\in\Lambda}\varphi(x)\psi^+_x\psi^-_x + \frac12\sum_{x\in\Lambda}\varphi^2(x). \tag{19.1} \]
At finite $L$, the fermionic Fock space is finite dimensional, hence there is a minimum eigenvalue $E^{el}_L(\varphi,\mu)$ of the operator $H^{el}_L$, for each given phonon field $\varphi$ and each value of $\mu$; let $\rho_L(\varphi,\mu)$ be the corresponding fermionic density. The aim is to minimize the functional
\[ F_L(\varphi,\mu) = E^{el}_L(\varphi,\mu) + \frac12\sum_{x\in\Lambda}\varphi_x^2, \tag{19.2} \]
subject to the condition
\[ \rho_L(\varphi,\mu) = \rho_L, \tag{19.3} \]
where $\rho_L$ is a fixed value of the density, converging for $L \to \infty$, say to $\rho$. It is generally believed that, as a consequence of Peierls' instability argument [82], in the limit $L \to \infty$ there is a field $\varphi^{(0)}$, uniquely defined up to a spatial translation, which minimizes (19.2) with constraint (19.3), and that it is a function of the form $\bar\varphi(2\pi\rho x)$, where $\bar\varphi(u)$ is a $2\pi$-periodic function of $u$. This is physically interpreted by saying that one-dimensional metals are unstable at low temperature, in the sense that they can lower their energy through a periodic distortion of the "physical lattice" with period $1/\rho$ (in the continuous version of the model, since $1/\rho$ is not an integer in general). There are a few results about this model in the literature.
(1) An exact result [84,110] makes rigorous the theory of Peierls instability for model (19.1) in the case $\rho = \rho_L = 1/2$ (half-filled band case), for any value of $\lambda$. In fact, in this case it has
been proved that there is a global minimum of $F_L(\varphi)$ of the form $\varepsilon(\lambda)(-1)^x$, where $\varepsilon(\lambda)$ is a suitable function of $\lambda$. This means that the periodicity of the ground state phonon field is 2 (recall that in our units 1 is just the lattice spacing): this phenomenon is called dimerization. The proof heavily relies on symmetry properties which hold only in the half-filled case. As in the case of the Hubbard model, the special symmetries at $p_F = \pi/2$ play a crucial role.
(2) In [85,95] Peierls instability for the Holstein model is proven assuming $\lambda$ large enough: in that case the fermions are almost classical particles and the quantum effects are treated as perturbations. The results hold for both the commensurate and the incommensurate case; in particular in the incommensurate case the function $\bar\varphi(u)$, related to the minimizing field through the relation $\varphi(x) = \bar\varphi(2\pi\rho x)$, has infinitely many discontinuities. On the contrary, in the small $\lambda$ case, according to numerical results, $\bar\varphi(u)$ has been conjectured to be an analytic function of its argument, both for the commensurate and incommensurate cases [85]. The results are closely related to the existence of the so-called "Aubry–Mather" sets in Classical Mechanics.

19.2. New results

We discuss here a result in [42] found using the RG methods reviewed above, in the case of small $\lambda$ and any $p_F$. A local minimum of (19.2) satisfying (19.3) must fulfil the conditions
\[ \varphi(x) = \lambda\rho_x(\varphi,\mu), \qquad \rho_L = \frac1L\sum_x \rho_x(\varphi,\mu) \tag{19.4} \]
and
\[ M_{xy} \equiv \delta_{xy} - \lambda\frac{\partial}{\partial\varphi_x}\rho_y(\varphi,\mu) \quad \text{is positive definite}. \tag{19.5} \]
If $\varphi$ is a solution of (19.4), it must satisfy the condition $\hat\varphi_0 = L^{-1}\sum_x\varphi(x) = \lambda\rho_L$. On the other hand, if we define $\chi_x = \varphi(x) - \hat\varphi_0$, we can see immediately that $\rho_L(\varphi,\mu) = \rho_L(\chi,\mu + \lambda\hat\varphi_0)$. It follows that we can restrict our search for local minima of (19.2) to fields $\varphi$ with zero mean, satisfying the conditions
\[ \varphi(x) = \lambda(\rho_x(\varphi,\mu) - \rho_L), \qquad \rho_L = \frac1L\sum_x\rho_x(\varphi,\mu), \tag{19.6} \]
and condition (19.5). Of course, if the field $\varphi(x)$ satisfies (19.6), the same is true for the translated field $\varphi(x+n)$, for any integer $n$. On the other hand, one expects that the solutions of (19.6) are even with respect to some point of $\Lambda$; hence we can eliminate the trivial source of nonuniqueness described above by imposing the further condition $\varphi(x) = \varphi(-x)$. We shall then consider only fields of
the form
\[ \varphi(x) = \sum_{n=-[L/2]}^{[(L-1)/2]} \hat\varphi_n e^{i2n\pi x/L}, \qquad \hat\varphi_{-n} = \hat\varphi_n \in \mathbb{R}, \qquad \hat\varphi_0 = 0. \tag{19.7} \]
We want to consider the case of rational density, $\rho = P/Q$, $P$ and $Q$ relatively prime, and we want to look for solutions such that $\varphi(x) = \varphi(x+Q)$. Hence, we shall look for solutions of (19.6) with $L = L_i = iQ$, $\rho_L = \rho$, and
\[ \varphi(x) = \sum_{n=-[Q/2]}^{[(Q-1)/2]} \hat\varphi_n e^{i2\pi\rho nx}, \qquad \hat\varphi_n = \hat\varphi_{-n} \in \mathbb{R}, \qquad \hat\varphi_0 = 0. \tag{19.8} \]
Note that the condition on $L$ allows us to rewrite in a trivial way the field $\varphi(x)$ of (19.8) in the general form (19.7), by putting $\hat\varphi_n = 0$ for all $n$ such that $2n\pi/L \neq 2\pi\rho m$ for all $m$, and by relabelling the other Fourier coefficients. Conditions (19.6) can be easily expressed in terms of the variables $\hat\varphi_n$; if we define $\hat\rho_n$ so that
\[ \rho_x(\varphi,\mu) = \sum_{n=-[Q/2]}^{[(Q-1)/2]} \hat\rho_n(\varphi,\mu) e^{i2n\pi\rho x}, \tag{19.9} \]
we get
\[ \hat\varphi_n = \lambda\hat\rho_n(\varphi,\mu), \qquad n \neq 0, \quad n = -[Q/2],\ldots,[(Q-1)/2], \tag{19.10} \]
\[ \hat\rho_0(\varphi,\mu) = \rho_L. \tag{19.11} \]
Also the minimum condition (19.5) can be expressed in terms of the Fourier coefficients; we obtain the $L \times L$ matrix
\[ \bar M_{nm} \equiv \delta_{nm} - \lambda\frac{\partial}{\partial\hat\varphi_n}\hat\rho_m(\varphi,\mu), \tag{19.12} \]
which has to be positive definite, if the field $\varphi$ satisfies (19.10) and (19.11) and $\hat\rho_m(\varphi,\mu)$ is defined analogously to $\hat\varphi_m$ in (19.8). Hence, if we restrict the space of phonon fields to those of form (19.8), we have to show that the $Q \times Q$ matrix
\[ \tilde M_{nm} \equiv \delta_{nm} - \lambda\frac{\partial}{\partial\hat\varphi_n}\hat\rho_m(\varphi,\mu) \tag{19.13} \]
is positive definite, if the field $\varphi$ satisfies (19.10) and (19.11). Then the following result holds.

Theorem 11. Let $\rho = P/Q$, with $P$, $Q$ relatively prime integers, and $L = L_i \equiv iQ$. Then, for any positive integer $N$, there exist positive constants $\varepsilon$, $\tilde\varepsilon$, $c$ and $K$, independent of $i$, $\rho$ and $N$, such that, if
\[ 0 \le \frac{4\pi v_0}{\log(\tilde\varepsilon v_0 L)} \le \lambda^2 \le \varepsilon\,\frac{v_0^2\,(1 + \log v_0^{-1})^{-1}}{K^N N!\,\log(cQ/v_0^4)}, \tag{19.14} \]
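where
\[ v_0 = \sin(\pi\rho), \tag{19.15} \]
there exist two solutions $\varphi^{(\pm)}$ of (19.6), with $L = L_i$, $1 - \mu = \cos(\pi\rho)$ and $\rho_L = \rho$, of form (19.8). The matrices $\tilde M$ corresponding to these solutions, defined as in (19.13), are positive definite. Moreover, the Fourier coefficients $\hat\varphi^{(\pm)}_n$ verify, for $|n| > 1$, the bound
\[ |\hat\varphi^{(\pm)}_n| \le \Big(\frac{\lambda^2}{v_0|n|}\Big)^N |\hat\varphi^{(\pm)}_1|. \tag{19.16} \]
Finally, $\lambda\hat\varphi^{(\pm)}_1$ is of the form
\[ \lambda\hat\varphi^{(\pm)}_1 = \pm v_0^2 \exp\Big\{-\frac{2\pi v_0 + \beta^{(\pm)}(\lambda,L)}{\lambda^2}\Big\}, \tag{19.17} \]
with
\[ |\beta^{(\pm)}(\lambda,L)| \le C\lambda^2\Big(1 + \log\frac{1}{v_0}\Big), \tag{19.18} \]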
where $C$ is a suitable constant. The one-particle Hamiltonian corresponding to this solution has a gap of order $|\lambda\hat\varphi_1|$ around $\mu$, uniformly in $i$.

The above theorem proves that there are two stationary points of the ground-state energy corresponding to a periodic function with period equal to the inverse of the density, if the coupling is small enough and the density is rational, and that these stationary points are local minima at least in the space of periodic functions with that period. The energies associated with such minima are different, so that the ground-state energy is not degenerate. The theorem is proved by writing $\rho_x(\varphi,\mu)$ as an expansion convergent for small $\lambda$ and solving the set of equations (19.10) by a contraction method. As a byproduct it is found that the $\hat\varphi_n$ are fast decaying (see (19.16)), so that $\varphi(x)$ is really well approximated by its first harmonics (this remark is important as the number of harmonics could be very large). The results are uniform in the volume, so they are interesting from a physical point of view (a solution defined only for $|\lambda| \le O(1/L)$ would be outside any reasonable physical value for $\lambda$). The result of [84] for the half-filled case is contained in Theorem 11, but in [84] it is also proved that the solution is a global minimum. Finally, the lower bound in (19.14) is a large volume condition: this is not a technical condition as, if the number of fermions is odd, there is Peierls instability only for $L$ large enough. The upper bound for $\lambda$ in (19.14) requires $\lambda$ to decrease as $Q$ increases; in particular, irrational density is forbidden. This requirement is due to the discreteness of the lattice and to Umklapp phenomena. Note that the dependence of the maximum allowed $\lambda$ on $Q$ is not very strong, as it is logarithmic.
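The dimerization at half filling and the associated one-particle gap can be checked numerically on a small ring. The following is only a minimal sketch under stated assumptions: the hopping is taken to be nearest-neighbour with amplitude $-1/2$ plus an on-site term $1$ (so that the free dispersion is $1 - \cos k$), the particle number is fixed to $L/2$ instead of tuning $\mu$, and the values $L = 40$, $\lambda = 2$ and field amplitude $0.5$ are illustrative choices that lie outside the small-$\lambda$ regime of Theorem 11. The function name `holstein_functional` is hypothetical.

```python
import numpy as np


def holstein_functional(phi, lam, n_fermions):
    """Return (F, gap) for a static field phi on a periodic chain.

    F   = sum of the n_fermions lowest one-particle levels + (1/2) sum_x phi_x^2
    gap = difference between the first empty and the last filled level.
    Hopping -1/2 and on-site term 1 are assumptions of this sketch.
    """
    L = len(phi)
    h = np.diag(1.0 - lam * phi)           # on-site part: 1 - lam * phi(x)
    for x in range(L):                     # nearest-neighbour hopping, periodic
        y = (x + 1) % L
        h[x, y] = h[y, x] = -0.5
    levels = np.linalg.eigvalsh(h)         # eigenvalues in ascending order
    energy = levels[:n_fermions].sum()
    gap = levels[n_fermions] - levels[n_fermions - 1]
    return energy + 0.5 * np.sum(phi ** 2), gap


L, lam, eps = 40, 2.0, 0.5
x = np.arange(L)
F_flat, gap_flat = holstein_functional(np.zeros(L), lam, L // 2)
F_dim, gap_dim = holstein_functional(eps * (-1.0) ** x, lam, L // 2)

print("F(dimerized) - F(flat) =", F_dim - F_flat)
print("gap flat =", gap_flat, " gap dimerized =", gap_dim)
```

With these parameters the staggered field lowers the functional and opens a gap equal to $2\lambda$ times the field amplitude, in line with the statement that the gap is of the order of the first harmonic of $\lambda\varphi$.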
We know that $\rho_x(\varphi,\mu)$ is well defined for small $\lambda$ not only in the rational density case (in which the proof is almost trivial), but also in the irrational case: in fact the small divisor problem due to the irrationality of the density can be controlled thanks to a Diophantine condition (see Theorem 2). However, to solve the set of equations (19.10), a contraction method is used which is not trivially adaptable to the latter case. The same kind of problem arises in proving the positive definiteness of $\bar M_{nm}$ in the rational case (and this is the reason why we are able to prove that the stationary points are local minima only in the space of periodic functions with prefixed period). It is not known whether such problems are only technical or whether there is some physical reason for them.
20. Coupled Luttinger liquids

A natural question is what happens if we consider two or more fermionic chains coupled by a hopping term from one chain to another. This problem is surprisingly difficult, as the number of running coupling constants is very high (15 or more, see [83]) and many of them are growing, so that a rigorous analysis in the limit $\beta \to \infty$ based on RG seems impossible. We can consider a simple model of two Mattis models exchanging Cooper pairs between them. Even for this model a renormalization group analysis of the $\beta \to \infty$ limit is not possible (the flow equations are similar to those for spinning fermions in the attractive case), but it is possible to perform a sort of mean-field theory, see [45,87], obtaining the equivalent of a BCS theory in which, however, the corresponding critical temperature $T_c$ is not exponentially small (see also [88] for a perturbative third-order analysis). We consider the following functional integral:
\[ Z_{L,\beta,r} = \int P_a(d\psi) P_b(d\psi)\, e^{-V_a - V_b - V_{ab} - h_r}, \tag{20.1} \]
where, calling 2g2 ≡ gt Vi = −
1 (L+)4
Vab = −2
k1 ;k2 ;k3 ;k4
!; ;
g (+L)3=2 k ;! 1
− − + + k1 ;!; ;i k2 ;!; ;i k3 ;−!; ;i k4 ;−!; ;i (k1
+ + k1 ;!1 ;1=2;a −k1 ;−!1 ;−1=2;a
1
1
hr =
1 [r L+ !; i k
1
g (+L)3=2 k ;! 2
g −2 (+L)3=2 k ;!
− k2 + k3 − k4 )
− − −k2 ;−!2 ;−1=2;b k2 ;!2 ;1=2;b
2
+ k1 ;!1 ;1=2;b
+ g −k1 ;−!1 ;−1=2;b 3=2 (+L) k ;!
− − k; !; 1=2; i −k; −!; −1=2; i
2
+r
− − −k2 ;−!2 ;−1=2;a k2 ;!2 ;1=2;a
2
+ + k; !; 1=2; i −k; −!; −1=2; i ]
;
(20.2)
where $\psi^\pm_{k,\omega,\sigma,i}$ is the Grassmann variable describing a fermion with momentum $k$ and spin $\sigma = \pm 1/2$ associated with the chain $i = a, b$; $V_i$ describes the interaction between fermions belonging to the same chain and $V_{ab}$ describes the tunnelling of Cooper pairs from one chain to the other, in the Bardeen approximation. The term $h_r$ represents the interaction with an external field and the parameter $r$ is real and positive (to fix ideas). If $g = 0$ the system reduces to two independent Mattis models, and the Schwinger functions have an anomalous behaviour like (13.30). It is convenient to write the interaction in terms of Gaussian variables. We write
\[ V_{ab} = -2[\Delta_a\bar\Delta_b + \Delta_b\bar\Delta_a], \]
where
\[ \Delta_i = \frac{g}{(\beta L)^{3/2}}\sum_{k,\omega}\psi^+_{k,\omega,1/2,i}\psi^+_{-k,-\omega,-1/2,i}, \qquad \bar\Delta_i = \frac{g}{(\beta L)^{3/2}}\sum_{k,\omega}\psi^-_{-k,-\omega,-1/2,i}\psi^-_{k,\omega,1/2,i}. \]
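By using the identity (Hubbard–Stratonovich transformation; $\phi = u + iv$, $\bar\phi = u - iv$, $u, v \in \mathbb{R}$)
\[ e^{2ab} = \frac{1}{2\pi}\int_{\mathbb{R}^2} du\,dv\; e^{-(1/2)|\phi|^2} e^{a\phi + b\bar\phi} \tag{20.3} \]
we can rewrite the partition function as
\[
\begin{aligned}
Z_{L,\beta,r} ={}& \frac{1}{2\pi}\int_{\mathbb{R}^2} du_1\,dv_1\, e^{-(1/2)|\phi_1|^2}\, \frac{1}{2\pi}\int_{\mathbb{R}^2} du_2\,dv_2\, e^{-(1/2)|\phi_2|^2} \\
&\times \int P_a(d\psi)\, e^{-V_a} \int P_b(d\psi)\, e^{-V_b}\, e^{-h_r}\, e^{\phi_1\Delta_a + \bar\phi_1\bar\Delta_b}\, e^{\phi_2\Delta_b + \bar\phi_2\bar\Delta_a}.
\end{aligned}
\tag{20.4}
\]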
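The Gaussian identity $e^{2ab} = \frac{1}{2\pi}\int du\,dv\, e^{-|\phi|^2/2} e^{a\phi + b\bar\phi}$, with $\phi = u + iv$, can be checked numerically. The sketch below is an illustration under a simplifying assumption: $a$ and $b$ are ordinary commuting numbers, whereas in the text they are even elements of a Grassmann algebra. The grid half-width, the number of grid points and the test values of $a$, $b$ are arbitrary choices, and `hs_integral` is a hypothetical helper name.

```python
import numpy as np


def hs_integral(a, b, half_width=8.0, n=801):
    """Riemann-sum approximation of (1/2pi) * int du dv exp(-|phi|^2/2 + a*phi + b*conj(phi)).

    phi = u + i v; the integral is truncated to the square [-half_width, half_width]^2.
    """
    u = np.linspace(-half_width, half_width, n)
    du = u[1] - u[0]
    U, V = np.meshgrid(u, u, indexing="ij")
    phi = U + 1j * V
    integrand = np.exp(-0.5 * np.abs(phi) ** 2 + a * phi + b * np.conj(phi))
    return integrand.sum() * du * du / (2.0 * np.pi)


a, b = 0.3, -0.2
numeric = hs_integral(a, b)
exact = np.exp(2 * a * b)
print("numeric:", numeric)
print("exact  :", exact)
```

The imaginary part of the numerical value is zero up to rounding, since the odd part of the integrand in $v$ cancels on the symmetric grid.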
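Performing the change of variables $(u_i, v_i) \to \sqrt{\beta L}\,(u_i, v_i)$, we obtain
\[
\begin{aligned}
Z_{L,\beta,r} ={}& \frac{\beta L}{2\pi}\int_{\mathbb{R}^2} du_1\,dv_1\, e^{-(\beta L/2)|\phi_1|^2}\, \frac{\beta L}{2\pi}\int_{\mathbb{R}^2} du_2\,dv_2\, e^{-(\beta L/2)|\phi_2|^2} \\
&\times \int P_a(d\psi)\, e^{-V_a} \int P_b(d\psi)\, e^{-V_b}\, e^{g(\phi_1 - r/g)D_a + g(\bar\phi_1 - r/g)\bar D_b}\, e^{g(\phi_2 - r/g)D_b + g(\bar\phi_2 - r/g)\bar D_a},
\end{aligned}
\]
where
\[ D_i = \frac{1}{\beta L}\sum_{k,\omega}\psi^+_{k,\omega,1/2,i}\psi^+_{-k,-\omega,-1/2,i}, \qquad \bar D_i = \frac{1}{\beta L}\sum_{k,\omega}\psi^-_{-k,-\omega,-1/2,i}\psi^-_{k,\omega,1/2,i}. \]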
After the integration of the Fermi 6elds, if ˜v = (u1 ; u2 ; v1 ; v2 ) r r −(+L=2)[(u1 + g )2 +(u2 + g )2 +v12 +v22 ] −+LFL; +; r (˜v) 2 ; g d u1 d v1 d u2 d v2 e e ZL; +; r = [+L=2']
=
+L 2'
2 R4
R4
L; +; r
d u1 d v1 d u2 d v2 e−+LH; g
(˜v)
;
(20.5)
where L; +
e−+LF; g (˜v) =
Pa (d )
S S
S S
Pb (d )e−Va −Vb eg1 Da +g1 Db eg2 Db +g2 Da :
(20.6)
The partition function is then written as the (four dimensional) integral of the exponential L; +; r e−+LH; g (˜v) . If the function r +; r H+; v) = lim HL; v) ; g (˜ ; g (˜ L→∞
is two times di=erentiable and it admits a nondegenerate global minimum ˜v∗ for + large enough (the parameter r is introduced just to remove the possible degeneration) then L; +; r
lim lim r→0 L→∞
e−+LH; g
(˜v)
d u1 d u2 d v1 d v2 e
+; r −+LHL; v) ; g (˜
= (˜v − ˜v∗ ) :
(20.7)
r v) has a global minimum the model is solved; all the Schwinger If we can prove that H+; ; g (˜ functions can be computed using (20.7) and, if ˜v∗ = 0, there is a spontaneous gap generation. r So the problem is reduced to the computation of H+; v) and to the determination of its ; g (˜
r global minimum. However, H+; v) is given by the Grassmanian integral (20.6) which is not ; g (˜ quadratic in the Grassman variables and it is nontrivial to compute, especially in the g0t case. One has to take into account the interaction Va + Vb which is responsible for the g0t = 0 case of the Luttinger liquid behaviour of the model. +; r Let us assume that, given ˜v∗ , the function HL; v) is di=erentiable in a small neighbourhood ; g (˜ ∗ of ˜v (uniformly in L; +) and +; r +; r 9HL; v) 9HL; v) ; g (˜ ; g (˜ = 0; =0 : (20.8) ∗ ∗ 9ui 9vi ˜v=˜v
˜v=˜v
+; r This means that ˜v∗ is an extremal point for HL; v). An extremal point satis6es the following ; g (˜ extremality equations:
u1 +
r 1 -. −g g L+ k ;!
u2 +
1 -.
r −g g L+
v1 + ig
k ;!
1 -. L+ k ;!
v2 + ig
1 [ L+ k ;!
+ + k ; !; 1=2; a −k ; −!; −1=2; a + + k ; !; 1=2; b −k ; −!; −1=2; b
+ + k ; !; 1=2; a −k ; −!; −1=2; a
+ + k ; !; 1=2; b −k ; −!; −1=2; b
/
−
−
/ / .
+ +
. .
− − −k ; −!; −1=2; b k ; !; 1=2; b − − −k ; −!; −1=2; a k ; !; 1=2; a
− − −k ; −!; −1=2; b k ; !; 1=2; b
/0
/0 /0
=0 ; =0 ;
=0 ;
− − −k ; −!; −1=2; a k ; !; 1=2; a ] = 0
;
(20.9)
where L+
+ + k ; !; 1=2; i −k ; −!; −1=2; i
=
S S S S + Pa (d )e−Va Pb (d )e−Vb eg1 Da +g1 Db eg2 Db +g2 Da k+ ; !; 1=2; i −k ; −!; −1=2; i S S S S Pa (d )e−Va Pb (d )e−Vb eg1 Da +g1 Db eg2 Db +g2 Da
(20.10)
− − and a similar one for −k ; −!; −1=2; i k ; !; 1=2; i . One has then to compute the r.h.s. of (20.9); if = 0 such computation is trivial and one obtains, as in BCS theory, that the gap and the critical temperature are exponentially small in 1=g2 . However the presence of the interaction along the chain, which is responsible of the anomalous behaviour, has a dramatic e=ect. One could think that the r.h.s. of the self-consistence equation (20.9) is obtained by the one obtained in the = 0 case simply replacing propagator (3.4) with the Mattis model Schwinger function (see [1], p. 209). This is in fact what is found by a naive 6rst order perturbation theory. However the true result is more complex, as also the gap acquires a critical index. In fact one can compute (20.10) by the techniques describes above and the following result holds, see [45,87]. r Theorem 12. There exist an , such that; if ¿ 0; ; |g| 6 , the function H+; v) de=ned in ; g (˜ (2:10) is di>erentiable at u1 = u2 ; v1 = v2 and the extremality equations (3:2) are pairwise equal. In particular the l.h.s. of third and the fourth are vanishing while the =rst and the second are equal to; if 1=+ 6 K |gu|; K ¡ 1 −Q 1 |gu| −Q ˜ | gu | 2 −1 2 u + r=g − g u f(g; ; u) = 0] ; − 1 [a + f(g; ; u)] + g u Q A A
(20.11) where Q = +1 + Q; ˜ |Q˜| 6 C2 ; |f|; |f˜ | 6 C; and C; a; +1 ; A are positive constants. Note that (20.11) is a non BCS or anomalous self-consistence equation describing a superconductor whose normal state is a Luttinger liquid; the Luttinger interaction modi6es the self-consistence equation for the gap from the BCS-like one to (20.11). Note that ; g2 have to be small but there is no restriction on their ratio, in particular it can be =g2 1. v) admits two Corollary. There exist , and K ¡ 1 such that; if ¿ 0; ; |g| 6 , then H+:r ; g (˜ 2 −1 2 extremal points; both if =g ¡ K or =g ¿ K . In the limit + → ∞, r → 0 they become of the form (±3; ±3; 0; 0). In particular if =g2 ¿ K −1 2 1=Q 2 1=Q g g |g3| = A 1 + O() + O (20.12) aQ while if =g2 ¡ K 2
|g3| = Ae(−a+O(g))=g :
The above analysis says that two one-dimensional spinning Fermi systems with an intrachain interaction given only by forward scattering and an interchain interaction expressed by a Cooper pair tunnelling Hamiltonian, in the Bardeen approximation, are such that the two-point Schwinger function has a behaviour similar to the Mattis model Schwinger function if $T > T_c$, while for $T \le T_c$ there is long distance exponential decay related to the opening of a gap $\Delta$; $T_c \simeq \Delta$, and $\Delta$ has the non-BCS form given by (20.12) if the intrachain interaction is smaller than the interchain one.
21. Bidimensional Fermi liquids

The techniques we have applied to one-dimensional fermions are general and can be applied also in $d \ge 2$. In this case much less is known, and there is so far no rigorous construction of the theory in the $\beta \to \infty$ limit. The study of $d \ge 2$ fermions was started in [25,26] and has been pursued in [101–103]: a renormalization group analogous to the $d = 1$ case was defined; many new problems appear due to the fact that the singularity (i.e. the Fermi surface) is not two points but a circle or a sphere. The main results obtained in these papers were the definition of a well-defined mathematical setting, $n!$ bounds for the perturbative series and the definition of the beta function. However, it appears that even truncating arbitrarily at second order (there is no proof of the convergence of the beta function, only $n!$ bounds) there are problems; one has infinitely many running coupling constants and: (1) if the interaction is attractive, the flow is not bounded due to the BCS instability, while (2) if it is repulsive, due to the Kohn–Luttinger phenomenon it is likely that, except for very particular interactions with special symmetries, the flow is still not bounded. As there is the generation of a gap, the fermionic techniques discussed so far probably have to be supplemented by cluster expansion techniques (the theory becomes partly bosonic due to the appearance of a Goldstone boson). At the moment, the only rigorous construction for a problem of interacting fermions in $d = 2$ is for temperature $T > e^{-k/|\lambda|}$ [89,47,48]; note that we cannot expect to reach a colder region due to the appearance of the BCS instability at $T_c = e^{-a/|\lambda|}$ (but =c1, see below; so perhaps fermionic techniques will allow us to reach at least =c 1). Let us consider a model in $d = 2$ of interacting fermions with Hamiltonian $H = H_0 + \lambda V + \nu N_0$, where $H_0$ and $V$ are defined by the analogue of (2.2), (2.7) in two dimensions with an ultraviolet cut-off.
In d = 2 the Fermi surface is the circle k12 + k22 − pF2 = E(k) and the propagator is given by 0h=−∞ g(h) (x − y) with eik0 (t−s)+ik(x−y) (h) g (x − y) = d k0 d k fh (k02 + [E(k) − pF2 ]2 ) : (21.1) −ik0 + E(k) − pF2 Passing to polar coordinates we 6nd eik0 (t−s)+ik(x−y) (h) g (x − y) = d k0 d # |k|d|k|fh (k02 + [E(k) − pF2 ]2 ) −ik0 + E(k) − pF2
(21.2)
and we can introduce another decomposition over the integration in # in the following way. The annulus of radius Ah around the Fermi surface is divided in sectors centred at # = #r and of angular width Ah=2 (the choice Ah=2 is not arbitrary, see below). Then 1 = ! @h; ! (#), where @h; ! (#) are compact support functions with support in Ah=2−1=2 6 |#−#! | 6 Ah=2+1=2 , ! 1 = A−h=2 and g!h (˜x − ˜y) = ei!pF (x−y) gS!h (˜x − ˜y) with gS!h (˜x
h d k0 d #@! (#)
− ˜y) =
k d kfh (k02 + [E(k) − pF2 ]2 )
eik0 (t−s)+i[(k−!pF )(x−y) ; −ik0 + E(k) − pF2
(21.3)
which is bounded by
\[ |\bar g^h_\omega(\vec x - \vec y)| \le \gamma^{3h/2}\, \frac{C_N}{1 + \big[\gamma^h |t - s| + \gamma^h |(x - y)_r| + \gamma^{h/2} |(x - y)_t|\big]^N} , \tag{21.4} \]
where $(x - y)_r = |x - y| \cos\theta_\omega$ and $(x - y)_t = |x - y| \sin\theta_\omega$. As in $d = 1$ one can write
\[ \psi^{h}_{\vec x} = \sum_\omega e^{i\omega p_F x}\, \psi^{(h)}_{\omega, \vec x} , \tag{21.5} \]
where $\psi^{(h)}_{\omega,\vec x}$ has propagator given by $\bar g^h_\omega(\vec x - \vec y)$. The difference with respect to the $d = 1$ case is that $\sum_\omega 1 = \gamma^{-h/2}$. We write a tree expansion as in the preceding section and we write the truncated expectation as a sum over anchored trees times determinants; the Gram–Hadamard inequality can be applied as there is always a finite number of kinds of fermions (on the contrary, if as in [25,28] one considers continuous $\omega$ variables, one finds technical difficulties in doing the Gram–Hadamard bound). Then we get the following bound for the effective potential; for a fixed tree $\tau$ and an anchored tree $T$ we get: (1) a factor $\gamma^{-(5/2) h_v (s_v - 1)}$ for the integration over the coordinates, where $s_v$ is the number of subtrees coming out of the vertex $v$; (2) a factor $\gamma^{(3/2) h_v \tilde n_v}$, where $\tilde n_v$ is the number of propagators (in the anchored tree $T$ or in the determinants) in the cluster $v$ and not in any smaller one; calling $m^4_v$ the number of vertices with 4 external lines we get, using (5.32), (5.33), a factor
\[ C^n \gamma^{hD} \prod_v \gamma^{(h_v - h_{v'})\left((3/2)(2 m^4_v - n^e_v/2) - (5/2)(m^4_v - 1)\right)} , \tag{21.6} \]
if $D$ is a proper dimension; (3) we now have to sum over $\omega$, which is the crucial point. In order to perform this sum, suppose that we have a number of vertices $v$ with all the external lines fixed to some scale
$h_v$, with $n^e_v$ external lines; then the sum over $\omega$ gives
\[ \prod_v \left[ \gamma^{(-h_v/2)(n^e_v - 3)\,\chi(n^e_v \ge 3)} \right] . \tag{21.7} \]
In order to understand this formula one has to note that for each vertex $v$ there are $n^e_v$ sums over $\omega$, but (a) the conservation of momentum at each vertex eliminates one sum; (b) the vertices are connected by an anchored tree in the truncated expectations, so if $v_1$, $v_2$ are two vertices connected by a line $l$ of the spanning tree, fixing the sector of the half-line of $v_1$ forming $l$ automatically fixes the sector of the half-line of $v_2$ which forms $l$; (c) by geometrical considerations [34], the fact that the momenta have to stay in an annulus of width $\gamma^h$ around the Fermi surface and that the sectors have width $O(\gamma^{h/2})$ eliminates another sum. However, in general, the external lines are not all on the same scale and we need a slightly more complicated argument. One can perform an iterative argument for summing over $\omega$; let us consider the endpoints (assume only four-field interactions, for simplicity). In general, the scales of the external lines are different; let us fix all of them equal to the largest one. By the above argument we get a factor (all the lines are fixed to have the same scale):
\[ \prod_v \gamma^{-(1/2)(h_v - h_{v'})\, m^4_v} . \tag{21.8} \]
Now we have to sum over the lines of the vertices whose scale was not the largest one. We contract all the minimal clusters into points, and we iterate the above argument; the lines external to the minimal clusters $v$ were fixed to a sector of width $\gamma^{h_v/2}$, so summing over the sectors of these lines (fixing all of them to the smallest scale) gives a factor $\gamma^{-(1/2)(h_v - h_{v'})}$, and at the end we get
\[ \prod_v \gamma^{(1/2)(h_v - h_{v'})(n^e_v - 3)\,\chi(n^e_v \ge 3)} . \tag{21.9} \]
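Putting together all terms we get
\[ \prod_v \gamma^{(h_v - h_{v'})\left((3/2)(2 m^4_v - n^e_v/2) - (5/2)(m^4_v - 1) - (1/2) m^4_v + (1/2)(n^e_v - 3)\,\chi(n^e_v \ge 3)\right)} , \tag{21.10} \]
which gives
\[ \prod_v \gamma^{(h_v - h_{v'})\left[-(3/4)\, n^e_v + 5/2 + (1/2)(n^e_v - 3)\,\chi(n^e_v \ge 3)\right]} . \tag{21.11} \]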
From the above formula, we see that the power counting is exactly the same as in the $d = 1$ case, i.e. the dimension of the cluster with two external lines is $-1$ and that of the cluster with four is $0$. Then, if one could restrict the summation to $|P_v| > 4$, the series for the effective potential would be convergent (the above argument really works for trees such that, for any $v$, $20 > |P_v| > 4$, see [47]; in fact the sector sums done as above produce a constant $K^{|P_v|}$ which should develop a factorial. For $v$ with $|P_v| \ge 20$ one uses the fact that the dimension is very negative. For this technical point, see [47]).
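The arithmetic leading from (21.10) to (21.11) can be checked mechanically. The following is a minimal sketch, assuming the reading of the exponents given above ($m^4_v$ four-line vertices, $n^e_v$ external lines); the function names are hypothetical and only illustrate the bookkeeping.

```python
from fractions import Fraction as F

def exponent_21_10(m4, ne):
    """Bracket multiplying (h_v - h_v') in (21.10), as read from the text."""
    chi = 1 if ne >= 3 else 0  # chi(n^e_v >= 3)
    return (F(3, 2) * (2 * m4 - F(ne, 2))
            - F(5, 2) * (m4 - 1)
            - F(1, 2) * m4
            + F(1, 2) * (ne - 3) * chi)

def exponent_21_11(ne):
    """Bracket multiplying (h_v - h_v') in (21.11)."""
    chi = 1 if ne >= 3 else 0
    return -F(3, 4) * ne + F(5, 2) + F(1, 2) * (ne - 3) * chi

# The dependence on m4 cancels: 3*m4 - (5/2)*m4 - (1/2)*m4 = 0.
for m4 in range(1, 6):
    for ne in (2, 4, 6, 8):
        assert exponent_21_10(m4, ne) == exponent_21_11(ne)

# Bracket values for clusters with 2, 4, 6, 8 external lines.
brackets = {ne: exponent_21_11(ne) for ne in (2, 4, 6, 8)}
print(brackets)
```

With this reading the bracket equals 1 for two external lines, 0 for four, and is negative from six external lines on. Up to the overall sign convention for the dimension (an assumption here), this matches the values $-1$ and $0$ quoted in the text.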
To renormalize the above theory, one uses a definition very similar to the one for $d = 1$ fermions. If we allow logarithmic divergences, we only have to renormalize, at first order, the clusters with two external lines (logarithmic divergences give a factor of order $C^n |\lambda|^n (\log\beta)^n$ in the bounds, which allows us to get convergence for $T \ge e^{-k/|\lambda|}$, with $kC \le 1$). The definition of localization is the same as in the $d = 1$ case (note that, by conservation of momentum, the $\omega$ indices of the external lines of the clusters with two external lines are the same):
\[ \mathcal{L} \int dk\, dk_0\, \psi^{(\le h)+}_{k,\omega}\, W^{(h)}(k_0, k)\, \psi^{(\le h)-}_{k,\omega} = \int dk\, dk_0\, \psi^{(\le h)+}_{k,\omega}\, W^{(h)}(0, \omega p_F)\, \psi^{(\le h)-}_{k,\omega} . \tag{21.12} \]
Note that the theory is rotation invariant, so that $W^{(h)}(0, \omega p_F)$ is in fact independent of $\omega$. There is, however, a difference with respect to the $d = 1$ case (see [48]). The effect of $\mathcal{R}$ gives
\[ \mathcal{R} \int dk\, dk_0\, \psi^{(\le h)+}_{k,\omega}\, W^{(h)}(k_0, k)\, \psi^{(\le h)-}_{k,\omega} = \int dk\, dk_0\, \psi^{(\le h)+}_{k,\omega}\, \big[(k - \omega p_F)\, \partial_k W^{(h)} + k_0\, \partial_{k_0} W^{(h)}\big]\, \psi^{(\le h)-}_{k,\omega} . \tag{21.13} \]
Let us fix a reference frame in which axis 1 is directed along $\omega$ and axis 2 is orthogonal to it; then $k = (k_1, k_2)$, and $(1, 0)$ is a radial vector while $(0, 1)$ is a tangential vector. Then, if $k - \omega p_F = k'$ ($k'$ is the momentum measured from the Fermi surface), we can write the above equation as
\[ k'_1 \int_0^1 dt\, \partial_{k_1} W^{(h)} + \int_0^1 dt\, k'_2\, \partial_{k_2} W^{(h)} , \tag{21.14} \]
where $k'_1 = O(\gamma^h)$, $k'_2 = O(\gamma^{h/2})$. The first addend gives a factor $\gamma^{h_{v'} - h_v}$, which is the right factor to leave only a logarithmic divergence; however, the second addend gives a factor
\[ \gamma^{(h_{v'} - h_v)/2}\, \gamma^{-h_v/2} \tag{21.15} \]
which is not the correct one to have only logarithmic divergences. This (apparent) problem is solved using the rotational invariance of the theory. In fact
\[ W^{(h)}(k_0, k) = \bar W^{(h)}\Big(k_0, \sqrt{k_1^2 + k_2^2}\Big) \tag{21.16} \]
and if $P(t) = \sqrt{(p_F + t k'_1)^2 + t^2 k'^2_2}$ then
\[ W^{(h)}(k_0, k) - W^{(h)}(k_0, \omega p_F) = \int_0^1 dt\, \frac{d}{dt} \bar W^{(h)}(k_0, P(t)) = \int_0^1 dt\, \partial_P \bar W^{(h)}(k_0, P(t))\, \frac{k'_1 (p_F + t k'_1) + t k'^2_2}{P(t)} \tag{21.17} \]
and from the absence of terms linear in $k'_2$ we see that the renormalization produces the right dimensional gain.
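The interpolation identity (21.17) and the absence of a term linear in the tangential component can be illustrated numerically. This is only a sketch under stated assumptions: the radial profile `w_bar` is a hypothetical smooth function chosen for illustration (it is not the kernel of the theory), the dependence on $k_0$ is suppressed, and $p_F$ is set to 1.

```python
import math

P_F = 1.0  # Fermi momentum (illustrative value)

def w_bar(p):
    """Hypothetical smooth radial profile standing in for W-bar at fixed k0."""
    return math.exp(-p) + 0.3 * p ** 3

def w_bar_prime(p):
    """Derivative of the illustrative profile with respect to the radius."""
    return -math.exp(-p) + 0.9 * p ** 2

def radius(t, k1, k2):
    """P(t) = sqrt((p_F + t*k1)^2 + t^2*k2^2), with (k1, k2) measured from the Fermi surface."""
    return math.sqrt((P_F + t * k1) ** 2 + (t * k2) ** 2)

def interpolated_difference(k1, k2, steps=2000):
    """Right-hand side of (21.17), evaluated with the midpoint rule."""
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps
        p = radius(t, k1, k2)
        total += w_bar_prime(p) * (k1 * (P_F + t * k1) + t * k2 ** 2) / p
    return total / steps

def direct_difference(k1, k2):
    """Left-hand side of (21.17): W at the full momentum minus W on the Fermi surface."""
    return w_bar(radius(1.0, k1, k2)) - w_bar(P_F)

# The two sides agree up to quadrature error.
assert abs(interpolated_difference(0.05, 0.2) - direct_difference(0.05, 0.2)) < 1e-6

# No linear term in the tangential component: the difference is even in k2
# and, at k1 = 0, scales like k2^2 * w_bar'(p_F) / (2 p_F).
eps = 1e-3
ratio = direct_difference(0.0, eps) / eps ** 2
assert abs(ratio - w_bar_prime(P_F) / (2 * P_F)) < 1e-4
```

Since the tangential component enters only through its square, a tangential displacement of size $O(\gamma^{h/2})$ contributes at order $\gamma^{h}$, the same order as the radial one.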
Uncited references

[68,75,80,86,97,100,104,105,108,109,115,117]

Appendix A

A.1. Graphs, diagrams and trees

A.1.1. Graphs

Given a set $V$ with $n$ elements, we shall call graph $\tau$ on $V$ a couple $(V, E)$, where $E$ is a set of unordered pairs of elements in $V$; we shall write $V = V(\tau)$ and $E = E(\tau)$ and shall call points the elements of $V(\tau)$ and lines the elements of $E(\tau)$. We shall denote by $|V(\tau)|$ and by $|E(\tau)|$ the number of elements in $V(\tau)$ and in $E(\tau)$, respectively; of course $|V(\tau)| = n$. We shall also write $\ell \in \tau$ for $\ell \in E(\tau)$. See Fig. 18. If a line $\ell$ connects two points $v, w \in V(\tau)$ we shall also write $\ell = (vw)$: we say that the line $\ell$ is incident with the points $v$ and $w$. Two points $v, w \in V(\tau)$ are adjacent if $(vw) \in E(\tau)$, while two lines are adjacent if they are incident on the same point. Given a point $v \in V(\tau)$ we define as degree of the point $v$ the number $d(v)$ of lines incident on $v$; a point such that $d(v) = 1$ is called an endpoint. Of course
\[ \sum_{v \in V(\tau)} d(v) = 2|E(\tau)| . \tag{A.1.1} \]
A subgraph $\tau'$ of $\tau$ is a couple $(V', E')$ with $V' = V(\tau') \subset V(\tau)$ and $E' = E(\tau')$ a subset of the lines $(vw)$ in $E(\tau)$ with $v, w \in V(\tau')$; we shall write $\tau' \subset \tau$. A graph $\tau$ is connected if for any $v, w \in V(\tau)$ there exist $p \in \mathbb{N}$ and $p$ points $v_1, \ldots, v_p$, with $v_1 = v$ and $v_p = w$, such that $v_j$ and $v_{j+1}$ are adjacent for each $j = 1, \ldots, p - 1$: in such a case we say that the lines $(v_1 v_2), \ldots, (v_{p-1} v_p)$ form a path $P$ on $\tau$ connecting the point $v$ with the point $w$. We shall also say that $P$ crosses or intersects the points $v_1, \ldots, v_p$. See Fig. 19. A graph is disconnected if it is not connected. A graph is acyclic if it has no cycle (or loop), i.e. if for any two points $v, w \in V(\tau)$ there is at most one path connecting them.
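The definitions above translate directly into a few lines of code. The sketch below is an illustration only: the graph is stored as a list of points and a list of unordered pairs, and the helper names are hypothetical. It checks the degree identity (A.1.1) and tests connectedness and acyclicity on a small example.

```python
from collections import defaultdict, deque

def degrees(points, lines):
    """d(v): number of lines incident on each point v."""
    d = {v: 0 for v in points}
    for v, w in lines:
        d[v] += 1
        d[w] += 1
    return d

def is_connected(points, lines):
    """True if every point can be reached from the first one along a path."""
    adjacent = defaultdict(set)
    for v, w in lines:
        adjacent[v].add(w)
        adjacent[w].add(v)
    start = next(iter(points))
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adjacent[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(points)

def is_acyclic(points, lines):
    """Union-find test: a line joining two already connected points closes a cycle."""
    parent = {v: v for v in points}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for v, w in lines:
        rv, rw = find(v), find(w)
        if rv == rw:
            return False
        parent[rv] = rw
    return True

# A triangle 1-2-3 with a pendant point 4 and an isolated point 5.
points = [1, 2, 3, 4, 5]
lines = [(1, 2), (2, 3), (3, 1), (3, 4)]

d = degrees(points, lines)
assert sum(d.values()) == 2 * len(lines)   # identity (A.1.1)
assert not is_connected(points, lines)     # point 5 is isolated
assert not is_acyclic(points, lines)       # lines (12), (23), (31) form a cycle
```

Removing the line $(3, 1)$ and the point 5 leaves a connected acyclic graph, i.e. a tree with 4 points and 3 lines.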
Fig. 18. A graph E with 14 points and 18 lines.
Fig. 19. A path P connecting v1 with v6 .
Fig. 20. A rooted tree of order 9 with 27 vertices.
A.1.2. Trees

A tree graph (or tree tout court) $\tau$ is a connected acyclic graph. If $|V(\tau)| = n$ we say that $\tau$ is a tree with $n$ points [106]. Given a tree one has
\[ |E(\tau)| = |V(\tau)| - 1 . \tag{A.1.2} \]
Note that given a tree $\tau$ any connected subgraph (subtree) of $\tau$ is still acyclic, so any subtree is a tree. A rooted tree is a tree with a distinguished point $v_0$. A rooted tree can be seen as a partially ordered set of points connected by lines. The partial ordering relation can be denoted by $\preceq$: we shall say that $v \prec w$ if there is a path $P$ connecting $w$ with $v_0$ and $v$ is crossed by $P$. We can also superpose an arrow on each line pointing towards $v_0$: we say that the lines of the tree are oriented; by extension the tree is also said to be oriented. We shall also call vertices the points in $V(\tau)$. The point $v_0$ is called the first vertex of $\tau$. To identify the first vertex $v_0$, we can draw an extra point $r$ and an extra oriented line $\ell$ connecting $v_0$ with $r$ (see Fig. 20). We shall call $r$ the root of $\tau$ and $\ell$ the root line. Such a line is added to the lines in $E(\tau)$, while the root is not considered a vertex. With such a convention, (A.1.2) has to be replaced with
\[ |E(\tau)| = |V(\tau)| = n . \tag{A.1.3} \]
Fig. 21. Two unequivalent unlabelled trees of order 3.
Note also that in this way (A.1.1) becomes
\[ \sum_{v \in V(\tau)} d(v) = 2|E(\tau)| - 1 . \tag{A.1.4} \]
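The counting identities (A.1.2)–(A.1.4) can be verified on a small rooted tree. The sketch below is illustrative: the tree is encoded by a hypothetical `parent` map sending each vertex to the vertex immediately preceding it, with the first vertex `"v0"` sent to the extra root `"r"`, which is not counted as a vertex.

```python
def tree_counts(parent):
    """Return (|V|, |E|, sum of degrees) for a rooted tree with its root line.

    `parent` maps every vertex to the vertex immediately preceding it;
    the first vertex is mapped to the extra root 'r', which is not a vertex.
    """
    vertices = set(parent)
    lines = [(p, v) for v, p in parent.items()]  # includes the root line (r, v0)
    degree = {v: 0 for v in vertices}
    for p, v in lines:
        degree[v] += 1
        if p in vertices:  # the root r carries no degree: it is not a vertex
            degree[p] += 1
    return len(vertices), len(lines), sum(degree.values())

# First vertex v0 with two subtrees: a (with endpoints c, d) and the endpoint b.
parent = {"v0": "r", "a": "v0", "b": "v0", "c": "a", "d": "a"}

n_vertices, n_lines, degree_sum = tree_counts(parent)
assert n_lines - 1 == n_vertices - 1     # (A.1.2): without the root line
assert n_lines == n_vertices             # (A.1.3): with the root line
assert degree_sum == 2 * n_lines - 1     # (A.1.4): the root line is counted once
```

The root line is incident on a single vertex, which is why the degree sum falls one short of $2|E(\tau)|$.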
Given a vertex $v \in V(\tau)$ we denote by $v'$ the node immediately preceding $v$, i.e. the vertex $v' \prec v$ such that $(v'v) \in E(\tau)$. We say that the line $\ell = (v'v)$ exits from $v$ and enters $v'$. Note that the vertex $v'$ is uniquely defined, as the ordering relation implies a bijective correspondence between lines and vertices: given a vertex there is one and only one line exiting from it. For any vertex there are $s_v \ge 0$ entering lines: one has $s_v = 0$ if $v$ is an endpoint. We define the order of a tree as the number of its endpoints. We call trivial a vertex $v$ with $s_v = 1$ and nontrivial a vertex $v$ either with $s_v \ge 2$ or with $s_v = 0$ (this means that the endpoints are counted as nontrivial vertices). Denote by $V_f(\tau)$ the set of endpoints in $\tau$, by $V_t(\tau)$ the set of trivial vertices in $\tau$ and by $V_{nt}(\tau)$ the set of nontrivial vertices in $\tau$: of course $V(\tau) = V_t(\tau) \cup V_{nt}(\tau)$ and
\[ V_f(\tau) = \{ v \in V_{nt}(\tau) : s_v = 0 \} . \tag{A.1.5} \]
By the notation $v \notin V_f(\tau)$ we mean $v \in V(\tau) \setminus V_f(\tau)$. Given a vertex $v \in V(\tau)$ the subgraph $(V', E')$ with
\[ V' = \{ w \in V(\tau) : w \succeq v \} , \qquad E' = \{ \ell \in E(\tau) : \ell = (w'w),\ w \succeq v \} , \tag{A.1.6} \]
is a rooted subtree with root $v'$. The trees just defined are sometimes called unlabelled trees, in order to distinguish them from the "labelled trees" (to be defined). Unlabelled trees are identified if they are superposable up to a continuous deformation of the lines on the plane such that the endpoints coincide: in such a case we say that they are equivalent. In Fig. 21 two inequivalent unlabelled trees of order $n = 3$ are drawn. Note that the indices used to identify the vertices $v \in V_f(\tau)$ play no role. The notions which will be used are that of unlabelled tree and, mostly, that of labelled tree. A (rooted) labelled tree can be obtained from an unlabelled tree by assigning labels $h_v$ to its vertices $v \in V(\tau)$ in the following way. A label $h \le 0$ is associated to the root. If $\mathcal{T}_{h,n}$
Fig. 22. A rooted tree and the corresponding walk W .
denotes the corresponding set of labelled trees of order $n$ (i.e. with $n$ endpoints), we introduce a set of vertical lines, labelled by an integer assuming values in $[h, 2]$, such that each vertex $v \in V(\tau)$ is contained in some vertical line $h' \in [h, 2]$ (this will always be possible, as the lines can be continuously deformed): then we set $h_v = h'$. The label $h_v$ will be called the frequency or the scale of the vertex $v$. By construction $h_v \ge h$ for all $v \in V(\tau)$ and $h_v \ge h + 1$ for all $v \in V_f(\tau)$. Moreover, if $v \prec w$ then $h_v < h_w$. The number of trees is controlled through the following result.

Lemma A.1. The number of (rooted) unlabelled trees with $n$ points is bounded by $C^n$ for some constant $C$.

Proof. The number of (rooted) unlabelled trees is bounded by the number of one-dimensional random walks $W$ with $2n$ steps. This can be proved as follows. We can imagine moving along the tree by remaining to the left of the lines and starting from the root line. We move forward until an endpoint is reached: in this case we turn backwards until we meet a nontrivial vertex; then we turn once more forward and so on, until we come back to the root line. See Fig. 22: $+$ means that we move from left to right along the line, while $-$ means that we move from right to left. Each time we move forward along a line we associate to it a sign $+$, while we associate to it a sign $-$ when we move backwards. So the tree can be characterized by a collection of $2n$ signs $\pm$ which define a walk $W = \{\pm \pm \ldots \pm\}$. Note that not all one-dimensional random walks with $2n$ steps correspond to unlabelled trees: we call compatible the random walks for which this happens. For instance, the first sign is always a $+$ and the last one is always a $-$; moreover, the overall number of $+$ signs has to be equal to the overall number of $-$ signs. Note that the correspondence between unlabelled trees and one-dimensional compatible random walks is one-to-one. By neglecting all the constraints we can bound the number of collections of $2n$
signs, hence the number of unlabelled trees with $n$ nodes, by $2^{2n}$, that is the overall number of random walks with $2n$ steps. So we can choose $C = 4$ and the assertion follows.

Given a tree with $n$ vertices one has, as it is straightforward to check,
\[ 1 \le |V_f(\tau)| \le \begin{cases} n - 1 & \text{if } n \ge 2 , \\ 1 & \text{if } n = 1 , \end{cases} \qquad |V_{nt}(\tau)| \le 2|V_f(\tau)| - 1 . \tag{A.1.7} \]
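The walk encoding used in the proof of Lemma A.1 and the inequalities (A.1.7) can be tested by brute force for small $n$. In the sketch below, which is only an illustration, an unlabelled tree is represented as a nested tuple of its subtrees (so `()` is an endpoint); this representation and the function names are assumptions made here, not notation from the text.

```python
def plane_trees(n):
    """All rooted unlabelled (plane) trees with n vertices, as nested tuples."""
    if n == 1:
        return [()]
    result = []
    # Split off the first subtree (k vertices); the remaining n - k vertices
    # form a tree sharing the same top vertex.
    for k in range(1, n):
        for first in plane_trees(k):
            for rest in plane_trees(n - k):
                result.append((first,) + rest)
    return result

def walk(tree):
    """Signs met when going around the tree, starting along the root line."""
    signs = ["+"]              # move forward along the line entering this vertex
    for subtree in tree:
        signs.extend(walk(subtree))
    signs.append("-")          # come back along the same line
    return signs

def vertex_counts(tree):
    """Return (number of endpoints, number of nontrivial vertices)."""
    s = len(tree)
    endpoints = 1 if s == 0 else 0
    nontrivial = 1 if s != 1 else 0
    for subtree in tree:
        f, nt = vertex_counts(subtree)
        endpoints += f
        nontrivial += nt
    return endpoints, nontrivial

for n in range(1, 8):
    trees = plane_trees(n)
    walks = {"".join(walk(t)) for t in trees}
    assert all(len(w) == 2 * n for w in walks)   # 2n signs per tree
    assert len(walks) == len(trees)              # distinct trees, distinct walks
    assert len(trees) <= 4 ** n                  # Lemma A.1 with C = 4
    for t in trees:
        f, nt = vertex_counts(t)
        assert 1 <= f <= (n - 1 if n >= 2 else 1)   # first part of (A.1.7)
        assert nt <= 2 * f - 1                      # second part of (A.1.7)

print([len(plane_trees(n)) for n in range(1, 7)])  # [1, 1, 2, 5, 14, 42]
```

The counts grow much more slowly than $4^n$, as expected, since the bound neglects every constraint on the walk.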
The number of labelled trees in $\mathcal{T}_{h,n}$ cannot be bounded uniformly in $h$: there are at most $2n - 1$ nontrivial vertices, by (A.1.7), but once they have been fixed, one can add many trivial vertices between them, and the number of possible insertions goes to infinity for $h \to -\infty$. Nevertheless, we have the following result on labelled trees.

Lemma A.2. Let $\mathcal{T}_{h,n}$ be the set of labelled trees of order $n$ and with scale $h$ assigned to the root. If $\gamma > 1$ and $\alpha > 0$, then
\[ \sum_{\tau \in \mathcal{T}_{h,n}} \prod_{v \notin V_f(\tau)} \gamma^{-\alpha(h_v - h_{v'})} \le C_\alpha^n \tag{A.1.8} \]
for some constant $C_\alpha$.

Proof. Let us denote by $\mathcal{T}^*_{h,n}$ the set of labelled trees of order $n$ having only nontrivial vertices, and by $\tau^*$ any element in $\mathcal{T}^*_{h,n}$. A labelled tree $\tau$ of order $n$ can be imagined as formed from a tree $\tau^*$ of order $n$ by inserting trivial vertices between the (nontrivial) vertices of $\tau^*$: the number of inserted vertices automatically determines the values of the scale labels. Fixing a tree $\tau$, so that the corresponding tree $\tau^*$ is determined, we can write
\[ \prod_{v \notin V_f(\tau)} \gamma^{-\alpha(h_v - h_{v'})} = \prod_{v \in V_{nt}(\tau^*) \setminus V_f(\tau^*)} \gamma^{-\alpha(h_v - h_{v'})} , \tag{A.1.9} \]
where, for $v$ seen as a vertex of $\tau^*$, $v'$ denotes the vertex in $\tau^*$ immediately preceding $v$. The tree $\tau$ can be obtained by inserting $h_v - h_{v'}$ trivial vertices between $v \in \tau^*$ and $v' \in \tau^*$. Then we have
\[ \sum_{\tau \in \mathcal{T}_{h,n}} \prod_{v \notin V_f(\tau)} \gamma^{-\alpha(h_v - h_{v'})} = \sum_{\tau^* \in \mathcal{T}^*_{h,n}} \prod_{v \in V(\tau^*) \setminus V_f(\tau^*)} \gamma^{-\alpha(h_v - h_{v'})} . \tag{A.1.10} \]
Denote by $\mathcal{T}^*_n$ the set of unlabelled trees of order $n$ having only nontrivial vertices. Then
\[ \sum_{\tau^* \in \mathcal{T}^*_{h,n}} = \sum_{\tau^* \in \mathcal{T}^*_n} \sum_{\{h_v\}_{v \in \tau^*}} , \tag{A.1.11} \]
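so that
\[ \sum_{\tau^* \in \mathcal{T}^*_{h,n}} \prod_{v \in V(\tau^*) \setminus V_f(\tau^*)} \gamma^{-\alpha(h_v - h_{v'})} = \sum_{\tau^* \in \mathcal{T}^*_n} \sum_{\{h_v\}_{v \in \tau^*}} \prod_{v \in V(\tau^*) \setminus V_f(\tau^*)} \gamma^{-\alpha(h_v - h_{v'})} \le \sum_{\tau^* \in \mathcal{T}^*_n} \left( \frac{1}{\gamma^\alpha - 1} \right)^{n} \le C_\alpha^n , \tag{A.1.12} \]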
where we used $|V(\tau^*)| = |V_{nt}(\tau)| \le 2n$ (see (A.1.7)), so that the number of elements in $\mathcal{T}^*_n$ is bounded by $C^{2n}$, for a constant $C$ (see Lemma A.1); moreover, in performing the sum over the scales we neglected all constraints except that of $h_v - h_{v'} \ge 1$.

A.1.3. Feynman diagrams

A graph can be imagined as formed by giving $n$ points $v_1, \ldots, v_n$ with $d_{v_1}, \ldots, d_{v_n}$ outcoming lines, respectively, and contracting (some of) such lines between themselves. We can also associate to each line a sign $\sigma = \pm 1$ and allow only contractions such that a line with a sign $+$ is contracted with a line with a sign $-$. In particular, we can consider points with 2 or 4 outcoming lines: in the first case there is one line with a sign $+$ and one line with a sign $-$, while in the second one there are two lines with a sign $+$ and two lines with a sign $-$. We denote by $n_2$ the number of points $v$ with $d_v = 2$ and by $n_4$ the number of points $v$ with $d_v = 4$: of course $n = n_2 + n_4$. The points can also have a structure: when $d_v = 4$ the point $v$ is formed by two disjoint points connected through a wavy line, while when $d_v = 2$ the point can be characterized by an extra label. We shall call graph elements the points with structure. We shall consider only graphs of the above type which are connected: such graphs will be called Feynman diagrams and will be denoted by