CAMBRIDGE MONOGRAPHS ON MATHEMATICAL PHYSICS General Editors: P.V. Landshoff, W.H. McCrea, D.W. Sciama, S. Weinberg
GRAVITATIONAL PHYSICS OF STELLAR AND GALACTIC SYSTEMS
FRONTISPIECE: A cluster of galaxies in the southern constellation of Pavo about 300 million light years away. This illustrates several types of galaxies: spiral, elliptical, barred and box-shaped. The giant galaxy in the center may have formed by the merging of smaller galaxies, or it may have formed first and attracted other galaxies to cluster around it. (Original negative by U.K. Schmidt Telescope Unit. Photography by Photolabs, Royal Observatory, Edinburgh, with permission.)
GRAVITATIONAL PHYSICS OF STELLAR AND GALACTIC SYSTEMS
WILLIAM C. SASLAW
University of Cambridge, University of Virginia, National Radio Astronomy Observatory, USA
The right of the University of Cambridge to print and sell all manner of books was granted by Henry VIII in 1534. The University has printed and published continuously since 1584.
CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, New Rochelle, Melbourne, Sydney
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Dubai, Tokyo, Mexico City
Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521349758
© Cambridge University Press 1985
This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.
First published 1985
First paperback edition (with corrections) 1987
A catalogue record for this publication is available from the British Library
Library of Congress catalogue card number: 84-12127
ISBN 9780521234313 Hardback
ISBN 9780521349758 Paperback
Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate. Information regarding prices, travel timetables, and other factual information given in this work is correct at the time of first printing but Cambridge University Press does not guarantee the accuracy of such information thereafter.
to my parents
Contents
Preface
Introduction

Part I: Idealized homogeneous systems - basic ideas and gentle relaxation
1 The average and fluctuating gravitational fields
2 Gentle relaxation: timescales
3 The dynamics of random impulsive forces
4 General properties of Fokker-Planck evolution
5 Fokker-Planck description of gravitating systems
6 Dynamics with a memory: non-Markovian evolution
7 The Boltzmann equation
8 Some properties of the Boltzmann equation
9 The virial theorem
10 The grand description - Liouville's equation and entropy
11 Extracting knowledge: the BBGKY hierarchy
12 Extracting knowledge: the Fourier development
13 Collective effects - grexons
14 Collective scattering
  14.1. The scattering probability
  14.2. Fluctuations, correlations, form factors and the $f$-sum rule for stellar systems
  14.3. The deflection angle, dynamical friction again, and a numerical test
15 Linear response and dispersion relations
  15.1. Basic result
  15.2. Gaseous systems
  15.3. Stellar systems
16 Damping and decay
  16.1. Physical description
  16.2. Calculation of Landau damping rate
  16.3. Other damping mechanisms
17 Star-gas interactions
  17.1. Gas dynamical processes
  17.2. Accretion and momentum decrease
  17.3. Jeans and two-stream instabilities
18 Problems and extensions
  18.1. The point mass approximation
  18.2. Plummer's model
  18.3. Solutions of the master equation
  18.4. Self-similar solutions of the collisionless Boltzmann equation
  18.5. Wave propagation in an inhomogeneous system
  18.6. Accretion of gas by stars
19 Bibliography

Part II: Infinite inhomogeneous systems - galaxy clustering
20 How does matter fill the Universe?
  20.1. General description
  20.2. Quantifying the distribution
  20.3. Measurement of the galaxy pair correlation function
21 Gravitational instability of the infinite expanding gas
22 Gravitational graininess initiates clustering
23 Growth of the two-galaxy correlation function
24 The energy and early scope of clustering
25 Later evolution of cosmic correlation energies
  25.1. The cosmic energy equation
  25.2. Stability of $\xi(r) \propto r^{-2}$
26 N-body simulations
27 Evolving spatial distributions
28 Evolving velocity distributions
29 Short review of basic thermodynamics
  29.1. Concepts
  29.2. Interrelations
  29.3. Connections with kinetic theory and statistical mechanics
  29.4. Fluctuations and phase transitions
30 Gravity and thermodynamics
  30.1. Statistical mechanical approach
  30.2. Kinetic theory approach
  30.3. Model approach
31 Gravithermodynamic instability
  31.1. The vanishing of sound speed
  31.2. Spatial fluctuations
  31.3. Temporal fluctuations
  31.4. Gravitational phase transitions
32 Thermodynamics and galaxy clustering; $\xi(r) \propto r^{-2}$
33 Efficiency of gravitational clustering
34 Nonlinear theory of high order correlations
  34.1. Equation of state
  34.2. The distribution functions $f(N)$
35 Problems and extensions
  35.1. Gravitational instability in multiple component systems
  35.2. Pancakes
  35.3. Fluctuations and semi-invariants
36 Bibliography

Part III: Finite spherical systems - clusters of galaxies, galactic nuclei, globular clusters
37 Breakaway
38 Violent relaxation
  38.1. Introduction and basic physical ideas
  38.2. The collisionless distribution function
  38.3. Criteria for violent relaxation
  38.4. Damping and computer simulations
  38.5. Distributions for collisional systems
39 Symmetry and Jeans' theorem
40 Quasi-equilibrium models
  40.1. Polytropes and isothermal spheres
  40.2. Loaded polytropes
  40.3. Fokker-Planck models
41 Applying the virial theorem
42 Observed dynamical properties of clusters
  42.1. Clusters of galaxies
  42.2. Galactic nuclei
  42.3. Globular clusters
43 Gravithermal instabilities
44 Self-similar transport
45 Evaporation and escape
46 Mass segregation and equipartition
47 Orbit segregation
48 Binary formation and cluster evolution
  48.1. Formation by few-body interactions
  48.2. Formation by stellar dissipation - tidal energy transfer
  48.3. Effects on cluster evolution
49 Slingshot
50 Role of a central singularity
51 Role of a distributed background
52 Physical stellar collisions
53 More star-gas interactions
  53.1. Galactic winds
  53.2. Central disks and star formation
  53.3. Embedded stars
54 Problems and extensions
  54.1. Potential energy of a polytrope
  54.2. Virial mass discrepancy for a system which loses mass
  54.3. Effect of angular momentum on the contraction of a stellar system
  54.4. Calculation of slingshot results
55 Bibliography

Part IV: Finite flattened systems - galaxies
56 Observed dynamics of galaxies
57 Kinematics of motion
  57.1. General kinematics
  57.2. Motions in galactic coordinates
58 Transfer of angular momentum
59 Rotation curves and galaxy mass
60 Orbits and third integrals
61 Axisymmetric and bar instabilities
62 Spiral instabilities
  62.1. Introduction
  62.2. Basic properties of patterns
  62.3. Self-consistent patterns
  62.4. Sustaining the spiral
63 Triaxial and irregular systems
  63.1. Introduction - spinning polytropes
  63.2. Computer experiments
64 Gravitational shocks
65 Passing-merging
66 Problems and extensions
  66.1. Uniform ellipsoids and Maclaurin spheroids
  66.2. Simple evolution of bars
  66.3. Distribution function for a uniformly rotating disk of stars
  66.4. Models of thick disks
  66.5. Origin of the density profiles of elliptical galaxies
  66.6. Orbits and adiabatic invariants in a triaxial galaxy
67 Unanswered questions
68 Bibliography

Index
Preface
The plan of this monograph is divided into four main parts. These parts develop in order of decreasing symmetry, from idealized infinite homogeneous systems to finite flattened irregular systems. Along this sequence, the ratio of model applications to fundamental physical ideas and techniques increases. Even so, I have tended to emphasize the basic physics over detailed applications. Specific astronomical models wax and wane as data and fashions change, but the principles on which they are built have much longer lifetimes. Thus the degree to which various topics are discussed does not always reflect their popularity in today's, or yesterday's, literature.

Nearly all the theory described in this book is based on classical Newtonian gravity. Relativistic generalizations of almost every aspect are possible, and there was a flurry of these generalizations in the 1960s and early 1970s. It was greatly stimulated by possible applications to quasars. Although quasars still are unsolved, no evidence has developed that relativistic star clusters are needed to explain them. That, plus the fact that there are enough fascinating things to say about observed nonrelativistic systems, persuaded me to restrict this book to classical gravity.

The book is reasonably self-contained in that most of its results can be obtained directly from preceding ones, sometimes with two or three intermediate algebraic steps to be added by the reader. Usually these steps are straightforward and they are outlined in the text. So the book is meant to be read with a notepad and pencil. If, having bought the book, you can no longer afford a notepad, feel free to write on the pages. It is a copy to be used, not a sacred relic.

Although this is a theoretical treatise, I have also tried to describe just enough related observations to set the scene and provide some motivation for the physical problems. In some cases, observations are described after a conceptual framework for them is developed.
Computer simulations of N-body systems have reached the stage where they can almost be regarded as quasi-observations, or at least as experiments, and I have included some representative examples. I have not attempted to put results in an historical context, or trace them back to their origins. Some acquaintance with the early literature has convinced me that this
is a major task, interesting in its own right but inappropriate here. There are many cases, incidentally, where the original ideas go back much earlier than generally recognized. Although the book contains almost 300 references, the bibliographies are not meant to be complete, but just to be small gates into a vast literature. Some of these gates are in sections called 'Problems and extensions' where the reader can try out his skill and pursue nontrivial problems into the literature. Results given in these problems are occasionally used later in the text. Other, more straightforward problems, are lightly sprinkled throughout the text. There are also a number of suggestions for research topics, and I have not hesitated to point out loose ends of the theory. Gravitational physics has developed through many applications, and, inevitably, conflicts of notation arise. There are disadvantages to imposing a superficial unity on the subject by requiring each symbol to stand for a unique physical quantity. It would require too many symbols, many of which would be unfamiliar. Rather, I have generally kept to the usual notation in each branch of the subject, trusting the reader to keep his wits about him and not confuse the mass of a galaxy with the index of a tensor. Where serious confusion might occur, or I have had to use a variant notation, I have normally remarked on it in the text. Most symbols are defined within sight of the equations where they are first used in each section. For some applications it is convenient (or traditional) to think of the gravitational potential as positive; for other applications as negative. In either case, the sign in Poisson's equation is compensated by the sign in the equation of motion, where the direction of the force is usually obvious. Many colleagues have been helpful and influential with regard to this book, and I would not have been able to complete a project this large without their assistance. 
For their careful reading of the manuscript, numerous helpful comments and friendly criticism, I especially thank Douglas Heggie, Ray White and Andrew Hamilton. I am also glad to thank Dennis Sciama for encouraging both the writing of this book and my early work in astrophysics. Richard Epstein and Chris Pethick invited me to give a series of lectures on gravitational physics at NORDITA, which was the catalyst that started my writing. Subsequently while writing, I enjoyed the hospitality of NORDITA, the Niels Bohr Institute, the Tata Institute and the Peking Observatory. I owe a particular debt to Larry Frederick, Dave Heeschen and Morton Roberts in Charlottesville, and Fred Hoyle, Donald Lynden-Bell, Martin Rees, the late Sir Denys Page, Sir Alan Cottrell, and the Fellows of Jesus College in Cambridge, England. They have encouraged the somewhat unusual administrative arrangements which enabled me to spend half of each year at the University of Virginia and the National Radio Astronomy Observatory in Charlottesville, and the other half at the Institute of Astronomy and Jesus College, Cambridge University. At this stage in the acknowledgements, authors often thank colleagues 'too numerous to mention' for reading parts of the manuscript and for helpful discussions. But I shall try to mention all who, having been helpful in these respects
over the years, come to mind at the moment, with double apologies to those inadvertently left out: S. Aarseth, G.B. van Albada, G.D. van Albada, T.S. van Albada, P. Anderson, J. Audouze, H. Ardavan, D. Arnett, J.M. Bardeen, J. Barnes, J. Barrow, J. Bartlett, M. Begelman, J. Binney, R. Bond, S. Bonometto, G. Burbidge, M. Burbidge, S. van den Bergh, W. Burke, A.G.W. Cameron, R. Cannon, B. Carr, S. Chandrasekhar, S. Colgate, G. Contopoulos, A. Cottrell, P. Crane, R. Davies, M. Davis, D. De Young, M. Disney, G. Efstathiou, P. Eggleton, R. Epstein, M. Fall, J. Faulkner, J. Felten, G. Field, W. Fowler, R. Fox, J. Franck, K. Freeman, C. Frenk, K. Fricke, M. Geller, T. Gold, P. Goldreich, J.R. Gott, J. Gunn, D. Ter Haar, E.R. Harrison, R. Havlen, M. Henon, J. Hindmarsh, S. von Hoerner, H. van Horn, F. Hoyle, V. Icke, S. Inagaki, M. Iye, B. Jones, A. Kalnajs, J. Katz, T. Kiang, I. King, G. Lake, M. Lecar, A. Lightman, D.N. Limber, C.C. Lin, D. Lin, D. Lynden-Bell, W. Mathews, D. Merritt, L. Mestel, R. Miller, J. Monaghan, P. Morrison, R. Nakatsuka, C. Norman, J. Oort, J. Ostriker, P.J.E. Peebles, C. Pethick, K. Prendergast, W. Press, J. Primack, M. Rees, M. Roberts, W. Roberts, M. Ruderman, E.E. Salpeter, R. Sanders, W. Sargent, B. Schutz, M. Schwarzschild, J. Sellwood, F. Shu, J. Silk, E.A. Spiegel, L. Spitzer, G. Steigman, P. Strittmatter, B. Stromgren, P. Sturrock, K. Thorne, T. Thuan, A. Toomre, J. Toomre, S. Tremaine, A. Tubbs, J.A. Tyson, M. Valtonen, P. Vandervoort, G. de Vaucouleurs, J. Wadiak, J. Wheeler, S. White, M. Whittle, A. Whitworth, A.M. Wolfe, P. Woodward, A. Zee. Saundra Mason typed a difficult manuscript with truly outstanding ability and patience. She said she was only doing her job, but she did it extraordinarily well. George Kessler, also at NRAO, drew the diagrams with a steady and skilful hand. I am very grateful to them both.
It has also been a pleasure, as I expected it would be, to work with Simon Mitton, Rufus Neal, and the staff of the Cambridge University Press. W. Saslaw
Introduction Glendower: I can call spirits from the vasty deep. Hotspur: Why, so can I, or so can any man; but will they come when you do call for them? Shakespeare
The spirit beneath the surface of nearly any astronomical phenomenon is gravitation. The reason why gravity is the motive force for much of the Universe is not hard to see. What primarily interests us about the Universe is its structure, including ourselves. And the physical reason for the existence of this structure is gravity. Even in the case of ourselves, it is the force of gravity in massive stars which drives their nuclear reactions to produce heavy elements, then eventually causes the star to explode and spew these elements throughout the galaxy. Some of them collect into new stars and planets, partly through the more gentle ministrations of gravity, and here we are! Of course, the mass of humanity, though important to ourselves, is only about $10^{-41}$ of the mass of the visible Universe. To put it another way, we contribute about $10^{-19}$ km s$^{-1}$ Mpc$^{-1}$ to the Hubble constant. As our own origin, through star formation and evolution, was driven by gravity, so even more directly does gravity govern the dynamics of other astronomical structure: stellar clusters, the shapes and evolution of galaxies, and the motions of the entire system of galaxies.
Richard Hooker, writing in the sixteenth century, thought the relations between human and gravitational phenomena were even closer: If the frame of that heavenly arch erected over our heads should loosen and dissolve itself; if celestial spheres should forget their wonted motions, and by irregular volubility turn themselves any way as it might happen; if the prime of the lights of heaven, which now as a giant doth run his unwearied course, should as it were through a languishing faintness begin to stand and to rest himself; if the moon wander from her beaten way, the times and seasons of the year blend themselves by disordered and confused mixture, the winds breathe out their last gasp, the clouds yield no rain, the earth be defeated of heavenly influence...: what would become of man himself, whom these things now do all serve? See we not plainly that obedience of creatures unto the law of nature is the stay of the whole world? So you see, there are dire consequences of not understanding this subject!
The scope of gravitational many-body physics and its applications to astronomical systems is very broad. To gain a bit of perspective, consider two major ways in which the gravitational problem differs from other branches of many-body physics such as superfluids, condensed nuclear matter, or amorphous solids.

First, we have had our Newton, so we start with a known microscopic description of the physics. The gravitational Hamiltonian was written down more than 150 years ago, by Hamilton. Thus our problem is often to find a suitable macroscopic description of the physics. This development is historically inverted compared to other branches of physics where macroscopic descriptions were followed by microscopic ones. For example: thermodynamics led to statistical mechanics, fluid mechanics was followed by kinetic theory, and a description of the bulk properties of solids preceded our understanding of their molecular and atomic structure. Because we understand the microscopic interaction of point Newtonian particles completely (even when the 'particles' represent galaxies with diameters of 100,000 parsecs!) it is possible to do numerical experiments with them on large computers. So we can both simulate natural astronomical systems (the great analog computer in the sky) and test our macroscopic ideas of many-body properties. This has been one of the recent growth industries of astronomy and some of its results will be incorporated into our discussion.

We notice the second difference between gravitational and other many-body systems immediately on writing down the Hamiltonian. For a system of gravitationally interacting particles with momenta $\mathbf{p}_i$ and constant masses $m_i$ the Hamiltonian is the total energy of the system:

$$H = \sum_i \frac{p_i^2}{2m_i} - G \sum_{\text{pairs}} \frac{m_i m_j}{r_{ij}}$$
This is a simple but highly nonlinear function of the separations $r_{ij}$ of the particles. The equations of motion are also nonlinear and, since all pairs are attractive, they do not saturate, unlike plasmas with overall neutrality. Already there is a hint that gravitating systems like to evolve toward more and more condensed phases, toward more negative gravitational energy. However, it does not follow that the entire system develops into a singular state with $r_{ij} \to 0$ for all $i$ and $j$. Since $H$ is constant, small values of $r_{ij}$ must be balanced by large values of momenta. So some particles acquire high velocities and, if the system is finite, they escape it. Thus part of the system can condense, and the rest expand to conserve total energy; so that the particles' harmonic mean separation, proportional to

$$\left[\sum_{\text{pairs}} r_{ij}^{-1}\right]^{-1},$$

becomes very small. This tendency toward subclustering will lurk behind much of our discussion of gravitating systems.

Although the gravitational Hamiltonian was written down in 1834, for more than
70 years its applications were mainly to the orbits of two or three bodies, with the perturbations of celestial mechanics. Observations of the motions of large numbers of stars in our galaxy had led to interest in the many-body problem by the first decade of this century. By the end of its second decade, several classic papers of Charlier, Eddington, Jeans and others treated the problem as an analog of the Maxwell-Boltzmann kinetic theory of gases. A major difference between the two approaches was the lack of short-range collisions among stars, so they used a collisionless Boltzmann approximation. In the 1930s and 40s, much effort went into using the collisionless Boltzmann equation to try to determine the gravitational potential and structure of our galaxy from observations of its stellar distribution and assumptions that its velocity distribution could be parametrized by a Gaussian ellipsoid. This was the so-called inverse Jeans problem. It did not turn out to be very fruitful for understanding the structure of our galaxy, and was superseded in this respect by the 1420 MHz measurements of the distribution of neutral hydrogen. In the 1940s also there were calculations of the evaporation of stars from clusters and in the 1950s and 1960s many astronomers began applying variants of techniques developed for plasma physics to the gravitational problem.

In the last two decades, there has been a resurgence of many aspects of this problem. The revival has been motivated mainly by observations of quasars and violent activity in galactic nuclei, consisting of highly concentrated stellar systems, new analyses of galaxy clustering, increased understanding of spiral structure of galaxies as gravitational density waves, observations of triaxial elliptical galaxies, and the discovery of X-ray emission from globular clusters of stars in our galaxy, previously thought to be very stable and boring astronomical objects but now quite lively.
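As a small numerical illustration of the Hamiltonian and the harmonic mean separation discussed in this Introduction, the following is a minimal sketch and not part of the original text. It assumes point masses in three dimensions and units with $G = 1$; the function names `hamiltonian` and `harmonic_mean_separation` are hypothetical, chosen only for this example.

```python
import math


def hamiltonian(masses, positions, momenta, G=1.0):
    """Total energy: sum of p_i^2 / (2 m_i) minus G * sum over pairs of m_i m_j / r_ij.

    G = 1 is an assumed choice of units for illustration.
    """
    kinetic = sum(
        sum(p * p for p in mom) / (2.0 * m) for m, mom in zip(masses, momenta)
    )
    potential = 0.0
    n = len(masses)
    for i in range(n):
        for j in range(i + 1, n):  # each pair counted once
            r_ij = math.dist(positions[i], positions[j])
            potential -= G * masses[i] * masses[j] / r_ij
    return kinetic + potential


def harmonic_mean_separation(positions):
    """Harmonic mean of the pair separations: (number of pairs) / sum of 1 / r_ij."""
    n = len(positions)
    inverse_sum = 0.0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            inverse_sum += 1.0 / math.dist(positions[i], positions[j])
            pairs += 1
    return pairs / inverse_sum


# A loose triple versus a tight pair with a distant third body.
loose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
condensed = [(0.0, 0.0, 0.0), (0.01, 0.0, 0.0), (100.0, 0.0, 0.0)]
print(harmonic_mean_separation(loose), harmonic_mean_separation(condensed))
```

In the second configuration two of the three separations are large, yet the harmonic mean is dominated by the single tight pair. This is the sense in which part of a system can condense while the rest expands.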
PART I
Idealized homogeneous systems - basic ideas and gentle relaxation In reality we apprehend nothing exactly, but only as it changes according to ... the things that impinge on or offer resistance to it. Democritus
To understand some of the most important properties of gravitating systems we temporarily put aside the effects of their density and velocity gradients, their components of different masses and sizes, and any external forces which may act upon them. Imagine an idealized, isolated, homogeneous gravitating system of particles. Usually in discussing the physics of these systems we will call their particles stars for brevity, although when discussing many astronomical contexts they will often be galaxies, or even clusters of galaxies. These idealized systems will sometimes be finite, sometimes infinite. In later sections we will find that both sizes of homogeneous clusters turn out to be unstable. But never mind that for now; there are more basic properties to consider.
1 The average and fluctuating gravitational fields Little by little does the trick. Aesop
The self-consistent motions of all the stars or galaxies in an isolated cluster are completely determined by their mutual gravitational forces, so long as each object can be treated as a point mass. For a system of objects, each with radius $d$, this criterion requires that the volume occupied by the objects be small compared to the volume of the system, so that bodily collisions are infrequent. Two spherical objects of radius $d$ have an effective radius for grazing collision of $2d$, and thus an effective cross section $\sigma = 4\pi d^2$. Randomly moving objects with number density $n$ will have a mean free path $\lambda_G$ to geometric collisions given by

$$\lambda_G = \frac{1}{n\sigma} = \frac{R^3}{3Nd^2} \qquad (1.1)$$

in a spherical system of radius $R$ containing $N$ uniformly distributed members. The number of times an object can move through its own diameter before colliding is essentially the ratio of the cluster's volume to that occupied by objects. Moreover, the number of cluster radii the object could traverse before colliding is essentially the ratio of the projected area of the cluster to that of the objects. For galaxies in clusters: $N \approx 10^3$, $R \approx 3$ Mpc, and $d \approx 0.01$ Mpc, so $\lambda_G/d \approx 10^4$ and $\lambda_G/R \approx 30$. For stars in globular clusters: $N \approx 10^5$, $R \approx 10$ pc, and $d \approx 3 \times 10^{-8}$ pc, so $\lambda_G/d \approx 10^{20}$ and $\lambda_G/R \approx 3 \times 10^{11}$. For stars in moderately dense galactic nuclei: $N \approx 10^6$, $R \approx 0.1$ pc, and $d \approx 3 \times 10^{-8}$ pc, so $\lambda_G/d \approx 10^{13}$ and $\lambda_G/R \approx 3 \times 10^6$. Therefore in all these cases direct collisions rarely affect the motions of most galaxies or stars. This is not to say that collisions are always unimportant for the overall evolution of the clusters, and we shall later examine the effects when they dominate.

The criterion that a system be isolated means that forces (potential gradients) from the stars within the system are much greater than forces from outside sources. An external mass $M_e$ at distance $l$ typically produces a force per unit mass $\sim GM_e/l^2$, if it is sufficiently spherical that its multipole contributions are unimportant. Internal forces on a star arise from two contributions: an average force produced by many distant stars and a fluctuating force produced by a few nearby stars. To make a rough estimate (which will be sharpened later), consider a star at distance $r$ from
the centre of a uniform spherical system. Within this distance the total mass is $M(r) = \frac{4}{3}\pi m n r^3$ and it produces a mean field force $\sim GM(r)/r^2 \approx 4Gmnr$. Thus the average internal force dominates an external force if

$$\frac{r}{R} > \frac{R^2}{l^2}\frac{M_e}{M} \qquad (1.2)$$

where $M = M(R)$. The contribution from the few nearest neighbors, which cannot be averaged because its time fluctuations are of the same order as its instantaneous value, is approximately the mass $m$ of the nearest star times $G$ divided by the square of the interparticle distance, i.e., $Gmn^{2/3} \approx GmN^{2/3}R^{-2}$. This fluctuating force exceeds the mean force for $r/R \lesssim N^{-1/3}$. The internal fluctuating force dominates an external force if

$$\frac{R^2}{l^2}\frac{M_e}{M} < N^{-1/3}. \qquad (1.3)$$
As a first example, consider the forces on stars within our previous globular cluster (supposing each star to be one solar mass) at the edge of a spherical galaxy 10 kpc in radius containing $10^{11}\,M_\odot$. From inequality (1.2) we find that in the central region of the cluster the external force dominates the average internal force by about an order of magnitude, and the forces are about the same near the cluster's edge. Equation (1.3) shows that the internal fluctuating forces are much less than the external force. Next, consider stars within a galaxy influenced by a companion galaxy of the same mass at $l \approx 10R$. The situation is different. Now the internal average force dominates, except for very small $r/R$. A caveat arises because the mass density in the center is idealized to be constant, giving a simple harmonic restoring force. In real galaxies this is most unlikely to be true.

It might seem, in the previous globular cluster example, that we cannot isolate the cluster from the galaxy when analyzing its internal dynamics. This is only true, however, in a restricted sense. Insofar as the external force is the same for the entire cluster, it only determines the motion of the cluster's center of mass, i.e., its orbit around the galaxy. It is the differential force on the cluster, the tidal force, which affects its internal dynamics. The effect of the external mass on the outer stars in a nearby cluster differs from the effect at the center by an amount

$$F(R) - F(0) \approx \frac{\partial F}{\partial l}\,\Delta r \approx \frac{GM_e}{l^3}\,R. \qquad (1.4)$$
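The order-of-magnitude comparisons in this example can be checked with a few lines of arithmetic. The sketch below is an illustration and not part of the original text. It assumes the uniform-sphere mean force $GMr/R^3$, the nearest-neighbour estimate $GmN^{2/3}/R^2$, and a tidal difference of order $GM_eR/l^3$, each expressed as a ratio to the external force $GM_e/l^2$ so that $G$ cancels. The function name `force_ratios` is hypothetical.

```python
def force_ratios(N, m, R, M_ext, l, r):
    """Internal and tidal forces per unit mass, each divided by the external force.

    All lengths share one unit and all masses share one unit; G cancels in every ratio.
    These are rough estimates that ignore numerical factors of order unity.
    """
    M = N * m                                # total mass of the system
    external = M_ext / l**2                  # external force (without G)
    mean = M * r / R**3                      # mean-field force at radius r in a uniform sphere
    fluctuating = m * N ** (2.0 / 3.0) / R**2  # force from a neighbour at the mean spacing
    tidal = M_ext * R / l**3                 # differential (tidal) force across the system
    return {
        "mean": mean / external,
        "fluctuating": fluctuating / external,
        "tidal": tidal / external,
    }


# Globular cluster (N = 1e5 solar-mass stars, R = 10 pc) at l = 10 kpc from 1e11 solar masses.
inner = force_ratios(N=1e5, m=1.0, R=10.0, M_ext=1e11, l=1e4, r=1.0)
edge = force_ratios(N=1e5, m=1.0, R=10.0, M_ext=1e11, l=1e4, r=10.0)
print(inner)
print(edge)
```

With these inputs the mean internal force is about a tenth of the external force at $r/R = 0.1$ and comparable to it at the edge, the fluctuating force is a few per cent of the external force, and the tidal ratio is $R/l = 10^{-3}$.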
For the globular cluster example the tidal force is smaller than the external gravitational force by a factor $R/l \approx 10^{-3}$. So the galaxy does not determine the cluster's internal dynamics, except near the tidal radius. There is an interesting case when a galaxy can influence a globular cluster's internal dynamics strongly, even if the tidal forces are small. If the external field varies rather sharply in space, it can give the cluster a shock or impulse every time its orbit comes around to the irregularity. Although each shock may be small, their cumulative result over many orbits can add enough energy to the cluster to shake it
apart. We return to this in more detail in Part IV; for the present we consider isolated systems.

One of the most important ideas which arose in the previous discussion is that we can separate the internal field into the irregularly fluctuating field of near neighbors plus the calmer averaged field of more distant stars. Many many-body approximations use this idea, so we now examine it further. We start with a static case, imagining a star in a uniform random distribution. The force on this star from $N$ stars in a small cell at distance $r$ is proportional to $N/r^2$, so cells at different distances which produce the same force must contain numbers of stars $N \propto r^2$. In a uniform random distribution $N$ will be the average number of stars in the cell, but there will also be a typical relative fluctuation $\Delta N/N \approx N^{-1/2} \propto r^{-1}$ from cell to cell. Thus, the relative importance of fluctuations in cells producing the same magnitude of gravitational force decreases inversely as their distance.

Let us look at it another way. Suppose the star of interest is at the center of a spherical cluster. If the matter were distributed in a continuous uniform way, the central star would feel no force. But because the matter is condensed into stars, its distribution is grainy. Graininess imposes a net force. A spherical shell surrounding the star will contain $N_{\rm shell} = 4\pi r^2 n\,\Delta r$ stars, on average. Thus, the fluctuating component of the force due to the shell would be

$$\Delta F_{\rm shell} \approx \frac{2\pi^{1/2} G m n^{1/2} (\Delta r)^{1/2}}{r} \qquad (1.5)$$

for, say, a positive number fluctuation in the shell. This force would arise, for example, if one-half the shell contained about $\frac{1}{2}(N + \sqrt{N})$ stars. No longer would the shell be completely spherical, so the central star would be deflected. The net effect on the star would result from all the shells, there being $R/\Delta r$ of them. However, the deflection of each shell is also in a random direction. The net residual effect is approximately proportional to the square root of the number of shells, $R^{1/2}/(\Delta r)^{1/2}$. Thus, the total fluctuating force is approximately

$$\Delta F \approx 2\pi^{1/2} G m n^{1/2} R^{1/2} \langle r^{-1} \rangle \qquad (1.6)$$

where $\langle r^{-1} \rangle$ is a convenient average reciprocal radial distance, which would arise from integrating the fluctuating force over the system. Notice that the shell thickness actually cancels in the calculation, so its particular value is irrelevant. In fact, the system behaves essentially as though $N^{1/2}$ of its stars were placed in a lump roughly halfway out from the center, and this lump provides a typical fluctuation in the gravitational force. As the number of stars increases, the medium becomes more continuous and the ratio of the fluctuating to the average force decreases as $N^{-1/2}$.
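The claim that the shell thickness cancels can be illustrated numerically. The following is a rough sketch under stated assumptions, not the author's calculation. Each shell is taken to contribute a fluctuating force $Gm N_{\rm shell}^{1/2}/r^2$, randomly directed contributions are added in quadrature, and an inner cutoff at $0.1R$ is imposed by hand to avoid the divergence at $r \to 0$. The function name `fluctuating_force` and the parameter `r_min_frac` are hypothetical.

```python
import math


def fluctuating_force(N, R, n_shells, m=1.0, G=1.0, r_min_frac=0.1):
    """Quadrature sum of shell fluctuations at the centre of a uniform sphere.

    N stars of mass m fill a sphere of radius R; shells span r_min..R.
    The inner cutoff r_min = r_min_frac * R is an assumption of this sketch.
    """
    n = 3.0 * N / (4.0 * math.pi * R**3)   # uniform number density
    r_min = r_min_frac * R
    dr = (R - r_min) / n_shells            # shell thickness
    total_sq = 0.0
    for k in range(n_shells):
        r = r_min + (k + 0.5) * dr         # shell midpoint radius
        n_shell = 4.0 * math.pi * r**2 * n * dr
        dF = G * m * math.sqrt(n_shell) / r**2  # sqrt(N_shell) excess stars at distance r
        total_sq += dF**2                  # random directions add in quadrature
    return math.sqrt(total_sq)


# Same system, two very different shell thicknesses.
coarse = fluctuating_force(N=1e4, R=1.0, n_shells=100)
fine = fluctuating_force(N=1e4, R=1.0, n_shells=1000)
mean_at_edge = 1e4  # G * m * N / R**2 with G = m = R = 1
print(coarse, fine, fine / mean_at_edge)
```

The two results agree closely even though the shell thickness differs by a factor of ten, and quadrupling $N$ at fixed $R$ doubles the fluctuating force while the mean force quadruples, so their ratio falls as $N^{-1/2}$. The numerical prefactor depends on the assumed cutoff and should not be read as a prediction.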
Part I : Idealized homogeneous systems
These fluctuations will give the star a random component of velocity, in addition to the velocity it gains from the average field. The simple static picture we have used so far is not adequate to calculate this random velocity. As the star moves, the fluctuations in the surrounding stars also change their position and strength. Moreover, the motion of the star through the system can lead to collective effects in which the positions and velocities of nearby stars are not random, but correlated. So we must turn from static to dynamic calculations.
2 Gentle relaxation: timescales

I have moved this way and that: gradually, this way and that, but mostly this way.
Mervyn Peake
Since gravity is a binary interaction, occurring between pairs of stars, the deflection which results from all the interactions in a system is essentially the sum of all pair interactions. The dynamical effect of each interaction is to deflect the star's previous orbit. It is easy to estimate this deflection for weak scattering. Consider, as in Figure 1, a massive star $m_1$ deflecting a much less massive star $m_2$, moving initially with velocity $v$ perpendicular to the impact parameter (distance of closest approach of the undeflected orbit) $b$. The gravitational acceleration $Gm_1/b^2$ acts for an effective time $2b/v$ to produce a component of velocity

$$ \Delta v \approx \frac{2 G m_1}{b v} \tag{2.1} $$
approximately perpendicular to the initial velocity. Since $\Delta v$ ...

General properties of Fokker-Planck evolution

... 0, say, objects of either positive or negative velocities would be accelerated to large positive velocities. This would clearly be the case for any additional dependence of $a(v)$ on even powers of $v$. However, if $a$ were to depend on only odd powers of $v$ with negative coefficients, the negative velocities would become less negative and the positive velocities less positive. Thus if we expand $a(v)$ in odd powers of $v$, the phenomenological form $a = -\alpha v$ (for a unit mass) can be regarded as the first term of such an expansion, providing the simplest form of dynamical friction. This was the intuition behind Langevin's analysis. With these forms for the coefficients of $f(v)$, the normalized solution of the steady state Fokker-Planck equation is $f(v) =$
(4.14a)
0, then gives the net instantaneous rate of change of the distribution function

$$ \frac{\partial f(v,t)}{\partial t} = \int \left[ f(v',t)\,W(v',v) - f(v,t)\,W(v,v') \right] dv'. \tag{4.24} $$
This is known as the 'master equation' and, so far, is just an empty formalism without any physics apart from the Markov assumption.
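A discretized version makes the gain and loss structure of the master equation concrete. The following is a minimal sketch, not the author's method: the velocity grid, the Gaussian shape chosen for $W$, and every parameter value are assumptions made for illustration. With symmetric rates, $W(v',v) = W(v,v')$, the total probability is conserved and a flat distribution is stationary.

```python
import math

def make_rates(n_bins, width):
    """Symmetric rates W[i][j] for jumps from bin i to bin j, sharply peaked in |i - j|."""
    return [[0.0 if i == j else math.exp(-((i - j) / width) ** 2)
             for j in range(n_bins)] for i in range(n_bins)]

def master_rhs(f, W):
    """df_i/dt = sum_j [ f_j W(j -> i) - f_i W(i -> j) ]: gains minus losses."""
    n = len(f)
    return [sum(f[j] * W[j][i] - f[i] * W[i][j] for j in range(n)) for i in range(n)]

def evolve(f, W, dt, steps):
    """Forward-Euler integration of the discretized master equation."""
    for _ in range(steps):
        rhs = master_rhs(f, W)
        f = [fi + dt * ri for fi, ri in zip(f, rhs)]
    return f

n_bins = 11
W = make_rates(n_bins, width=1.5)
f0 = [0.0] * n_bins
f0[3] = 1.0                      # all probability starts in one velocity bin
f_late = evolve(f0, W, dt=0.02, steps=5000)
print(sum(f_late), max(f_late), min(f_late))
```

Nothing physical has been specified beyond the Markov assumption and the symmetry of the rates, which is the sense in which the formalism is still empty.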
Fig. 4. The scattering processes contributing to the master equation. (Time axis marked at $t$ and $t + \delta t$; one trajectory is labeled 'Undeflected'.)

To build on this skeleton we first give $W$ some general features. Near equilibrium the velocity of a star rarely changes by a large amount in a short time. Thus $W(v', v)$ may be regarded as a sharply peaked function of $\Delta v = v' - v$ and either $v$ or $v'$. At equilibrium, $f(v', t) = f(v, t)$ and $W(v', v) = W(v, v')$ giving $\partial f/\partial t = 0$, as expected. So we start by supposing that very close to equilibrium $f(v', t) \approx f(v, t)$ and that the flow of stars between different velocities is so nearly reversible that $W(v', v) \approx W(v, v')$. Since $f(v', t)$ does not differ very much from $f(v, t)$, we expand it in a Taylor series around $f(v, t)$
_ . . „
A2
.
dv
where $\Delta = v' - v = \Delta v$. (4.26)
We consider the velocity change in (4.25) to be negative in order to obtain the usual 'forward' FokkerPlanck equation. More rigorous derivations (found in texts on probability theory or stochastic processes) show this is because a negative velocity change implies differentiation with respect to the final time of the interval St. A positive velocity change would give the corresponding 'backward' (Kolmogorov) equation, useful for finding the probability distribution of initial conditions which would lead to a given final state. With W, we can go a step further. Its expansion may be written ~A+~~A2 uv 2 dv
+ ....
(4.27)
The truncation of these expansions with the second derivative is also a statement of proximity to equilibrium. Far from equilibrium, large-angle scattering becomes important and Taylor series expansions become inadequate. Inserting (4.25)-(4.27) into (4.24) and integrating with respect to $\Delta$ gives the Fokker-Planck equation (3.24) yet again with

$$ a(v) = \int_{-\infty}^{\infty} \Delta\, W(v, \Delta)\, d\Delta = \langle \Delta v \rangle \tag{4.28} $$

interpreted as the average change of velocity per unit time and

$$ \sigma^2(v) = \int_{-\infty}^{\infty} \Delta^2\, W(v, \Delta)\, d\Delta = \langle (\Delta v)^2 \rangle \tag{4.29} $$

as the mean square change of velocity in a unit time interval. This derivation shows explicitly how the first and second moments of the transition probability determine the distribution function. It also shows clearly how the validity of the Fokker-Planck equation depends on both the Markov assumption and the system being close to equilibrium. Small departures from these conditions can easily be fitted into modifications of the formalism. But violent nonlinear departures cannot.
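To see how the two moments alone fix the long-time behavior, one can simulate a single velocity whose change in each small step has mean $a\,\delta t$ and mean square $\sigma^2\,\delta t$. The sketch below assumes the simplest forms $a = -\alpha v$ and constant $\sigma^2$; the parameter values, step size and seed are illustrative assumptions. For these coefficients the steady state of the one-dimensional Fokker-Planck equation is a Gaussian with $\langle v^2 \rangle = \sigma^2/2\alpha$, which the simulated variance should approach.

```python
import math
import random

def simulate_velocity_variance(alpha=1.0, sigma2=1.0, dt=0.01, steps=400_000, seed=1):
    """Time-averaged <v^2> for steps with mean -alpha*v*dt and mean square sigma2*dt."""
    rng = random.Random(seed)
    v = 0.0
    kick = math.sqrt(sigma2 * dt)      # rms size of the random velocity change per step
    burn_in = steps // 10              # discard the initial transient
    total, count = 0.0, 0
    for i in range(steps):
        v += -alpha * v * dt + kick * rng.gauss(0.0, 1.0)
        if i >= burn_in:
            total += v * v
            count += 1
    return total / count

variance = simulate_velocity_variance()
expected = 1.0 / (2.0 * 1.0)           # sigma2 / (2 * alpha) for the default values
print(variance, expected)
```

The friction term alone would bring the velocity to rest and the diffusion term alone would let it grow without bound; the balance of the two sets the dispersion.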
5 Fokker-Planck description of gravitating systems
So far we have examined just an illustrative one-dimensional example of the Fokker-Planck equation. To apply it to gravitating systems we need to generalize it to higher dimensions and make the average velocity changes represent gravitational scattering.

Generalization to higher dimensions is necessary because the distribution function depends just on the modulus $v$ of the velocity only for a completely isotropic system. Once anisotropies are present $f = f(v_1, v_2, v_3, t)$. Similarly, the transition probability is $W = W(v_1, v_2, v_3, v_1', v_2', v_3')$, still assuming it to be stationary. It is straightforward to extend the derivation of the last section using Taylor series expansions to generalize (3.24) for anisotropic or $N$-dimensional distributions

$$ \frac{\partial f}{\partial t} = -\sum_i \frac{\partial}{\partial v_i}\left( a_i f \right) + \frac{1}{2} \sum_{j,k} \frac{\partial^2}{\partial v_j\, \partial v_k}\left( a_{jk} f \right). \tag{5.1} $$
The coefficients $a_i$ and $a_{jk}$ are defined analogously to (4.28) and (4.29). Next we see how $a_i$ and $a_{jk}$ can describe the effects of many small-angle gravitational scatterings. Each two-body interaction scatters the relative velocity, of magnitude $u = |\mathbf{v} - \mathbf{v}'|$, through an angle $\theta$ in the center-of-mass frame. With an inverse square force, the effective area, per unit solid angle, for this change is the famous Rutherford differential cross section for two masses $m_a$ and $m_b$ with reduced mass $m_{ab} = m_a m_b/(m_a + m_b)$

$$ \sigma(\theta, u) = \left( \frac{G m_a m_b}{2 m_{ab} u^2} \right)^2 \frac{1}{\sin^4(\theta/2)} \tag{5.2} $$
(see, e.g., Landau & Lifshitz, 1976). An analysis of the kinematics of the deflection relates the scattering angle $\theta$ to the change of velocity $\Delta v_i$. To find the average velocity change $a_i$ for all deflections, $\Delta v_i$ is first integrated over all scattering angles and then over all velocities $\mathbf{v}'$ of the scattering star. Each of these velocities occurs with a probability $f(\mathbf{v}')$, so $a_i = \langle \Delta v_i \rangle$ will itself depend on an integral over the distribution function occurring in the Fokker-Planck equation (5.1). The result for stars of mass $m_a$ encountering stars of mass $m_b$, which may represent an average mass for the field stars, is (Rosenbluth, MacDonald & Judd, 1957)

$$ \langle \Delta v_i \rangle = \Gamma \frac{\partial h}{\partial v_i}, \tag{5.3} $$

$$ \langle \Delta v_j\, \Delta v_k \rangle = \Gamma \frac{\partial^2 g}{\partial v_j\, \partial v_k}, \tag{5.4} $$

where the coefficient $\Gamma$ is given by (5.5), with $D$ the upper limit to the impact parameter of the deflections, and

$$ h = \frac{m_a + m_b}{m_b} \int f(\mathbf{v}')\, |\mathbf{v} - \mathbf{v}'|^{-1}\, d\mathbf{v}', \tag{5.6} $$

$$ g = \int f(\mathbf{v}')\, |\mathbf{v} - \mathbf{v}'|\, d\mathbf{v}'. \tag{5.7} $$
Next, we specialize $h$ and $g$ to gravitating systems. It looks as though to solve the Fokker-Planck equation we must already know its solution in order to use the proper coefficients. We break this deadlock in the usual way for differentio-integral equations by assuming a distribution function $f_0(v)$ for the coefficients, calculating $f_1 = f_0 + \Delta_1 f$ from the equation, substituting $f_1$ for $f_0$ in the coefficients, calculating $f_2 = f_1 + \Delta_2 f$, etc. The hope is that such an iterative procedure will converge. In practice no one has had the stamina, or found it necessary, to do more than one iteration analytically. This is because in a uniform, or even a fairly uniform, system, it is reasonable to suppose that near equilibrium the background distribution of field star velocities is essentially Gaussian with a constant normalization $A$ up to some maximum velocity $v_{\max}$

$$ f = \begin{cases} A e^{-\beta^2 v^2}, & v \le v_{\max}, \\ 0, & v > v_{\max}. \end{cases} \tag{5.8} $$
Thus the velocities are isotropic, $\partial h/\partial v_i = (v_i/v)\, \partial h/\partial v$, etc., and only the derivatives of $h$ and $g$ with respect to the speed $v$ appear. Moreover, there are no cross correlations of orthogonal velocity changes so $a_{jk} = 0$ when $j \ne k$. In this case the derivatives are (Michie, 1963)

...

where

$$ F(x) = \int_0^x e^{-z^2}\, dz = \frac{\sqrt{\pi}}{2}\, \mathrm{erf}(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{n!\,(2n+1)} \tag{5.12} $$

and $n_0$ is the average number density. Expanding $F(\beta v)$ and the exponentials (for $\beta^2 v_{\max}^2 \gg 1$), using (5.3)-(5.5) and noting $\langle v^2 \rangle = 3/2\beta^2$, shows that for $\beta v < 1$
(5.13)
and
^%
2
[
^Z2[^^~j[llP^
+ ...l (5.14)
Thus we now find more rigorously the result $a \propto -v$, or the coefficient of dynamical friction $\alpha = -a/v$ is independent of velocity for small velocities, a relation which we earlier derived heuristically. Moreover (5.14) is essentially our earlier result (2.8), now reduced by a factor of about 0.2 from the velocity integrations (remember also that ... $\tfrac{1}{2}\sqrt{\pi}$ as $x \to \infty$, we now have
and
*2V
(5 16)

Again, this could have been guessed. Since relatively rapidly moving stars do not have time to perturb their neighbors significantly (see (5.15)), the dynamical friction on them is less and they take longer to slow down. Nor do their velocities diffuse as fast as those of stars with $v \approx v_{\rm rms}$. Note that molecular or atomic systems in which short-range, hard-core collisions dominate will behave oppositely. In microscopic systems, the faster moving particles make more collisions each second, so are slowed down more quickly. This is another example where the analogy between molecular viscosity and gravitational friction breaks down. The form of $a_i$ and $a_{jk}$, both normalized to 1 at $\beta v = 0$, from Equations (5.3)-(5.12) is shown in Figure 5.

To gain some preliminary insight into the importance of anisotropy in the velocity distribution, consider an imaginary, artificial system. Let it be spatially homogeneous with three types of stars: red, white and blue, say. Suppose the red stars move in the $x$ direction, white stars in $y$ and blue stars in $z$ throughout the system.
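The function $F(x)$ of Equation (5.12), which enters these coefficients, is simple to evaluate numerically. As a small check, the sketch below compares its power series with the closed form $\tfrac{1}{2}\sqrt{\pi}\,\mathrm{erf}(x)$ using Python's standard `math.erf`; the function names and the number of series terms are illustrative choices.

```python
import math

def F_series(x, terms=40):
    """Power series for F(x) = integral from 0 to x of exp(-z^2) dz."""
    return sum((-1) ** n * x ** (2 * n + 1) / (math.factorial(n) * (2 * n + 1))
               for n in range(terms))

def F_erf(x):
    """Closed form F(x) = (sqrt(pi) / 2) * erf(x)."""
    return 0.5 * math.sqrt(math.pi) * math.erf(x)

for x in (0.1, 0.5, 1.5):
    print(x, F_series(x), F_erf(x))

# For large argument F(x) saturates at sqrt(pi)/2.
print(F_erf(10.0), 0.5 * math.sqrt(math.pi))
```

The saturation of $F$ at large argument is what distinguishes the behavior of fast stars ($\beta v \gg 1$) from that of slow ones.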
Fig. 5. The coefficient of dynamical friction $a_i$ and the fluctuating force $a_{jk}$ for a gravitating system described by the Fokker-Planck equation.
Each class of stars has a Gaussian velocity distribution but with different dispersions represented by $\beta_x$, $\beta_y$ and $\beta_z$. Of course, in reality this would never work since streaming instabilities would also produce relaxation on the same timescale $\sim \tau_R$ as dynamical friction. So we must either imagine that stars of different colors do not interact with each other, or, alternatively, examine such a system just for times shorter than $\tau_R$ after these initial conditions are set up. The motivation for this system is that each coefficient $a_i$ of dynamical friction will have the same form as in our previous discussion, with $\beta_i$ replacing $\beta$. Thus from (5.13) and (5.15) we see that the friction is a strong function of $\beta_i$, typically varying as $\beta_i^3$.

Insert a green star of arbitrarily low velocity, less than any velocity dispersion, into such a system. It tends to increase its velocity more rapidly in the directions of smaller velocity dispersion. Encounters with stars of smaller velocity dispersion generally last longer and give greater deflections. On the other hand, a green star of arbitrarily high velocity tends to decrease its velocity less rapidly in directions of smaller velocity dispersion since those stars usually have less time to deflect it. A green star whose initial velocity is less than the dispersion in some directions and greater than it in others will tend toward the average velocity in each direction. But the greater its initial difference, the longer it will take to come to equilibrium.

Superimposed on these effects is the ever-present tendency to increase the velocity by diffusion. From (5.14) and (5.16), a low velocity green star increases its velocity by diffusion faster in directions of low dispersion. A high velocity star increases its velocity faster in directions of high dispersion. Again the coupling is stronger when the velocities of test stars and field stars are closer. In real systems, anisotropic velocity dispersions are most often produced by rotation.
But for these cases, the density is also anisotropic and leads to further interesting behavior.

Behind our dynamical analysis so far lie three basic assumptions. First, the development of the system at any given time depends only on the state of the system at that time, and not on its previous history. Second, the field stars are uncorrelated in position or velocity. In particular, the test star does not change the distribution of field stars as it passes through them. Third, the distribution evolves through a series of stellar encounters, each of which makes only a slight fractional change in the stars' velocities. Naturally we want to know what happens when these assumptions do not hold, since they are rather special. So we will describe non-Markovian effects in the next section, collective effects starting in Section 13, and violent relaxation in Part III.
6 Dynamics with a memory: non-Markovian evolution

Those who do not remember the past are condemned to relive it.
George Santayana
Why should the evolution of a gravitating system depend on its previous history, and when is this important? In discussing the properties of systems of colliding objects, two timescales are especially critical. One is the time of a collision $\tau_*$ during which the objects affect each other significantly. The other is the time between collisions $\tau_0$. Ordinary systems of atoms or molecules interact with a clear separation of these two timescales. Consider a room-temperature gas of hydrogen atoms with cross section $\sigma \approx 10^{-16}$ cm$^2$ so that $\tau_* \approx (m\sigma/12\pi kT)^{1/2} \approx 10^{-14}$ s but $\tau_0 \approx (m/3kT)^{1/2} (n\sigma)^{-1} \approx 3 \times 10^{10}\, n^{-1}$ s. Then for $n < 10^{24}$ cm$^{-3}$, we have $\tau_0 > \tau_*$.

When there is a clear separation of these timescales, the particles undergo a series of buffetings, each of which can be idealized as a random impulse to its orbit. Each buffet is over well before the next one begins. One can start to describe such a system's evolution at any time with initial conditions just at that time. It is not necessary to know what happened earlier, for the system has no memory. This is why the Fokker-Planck equation is a differential equation and not a differentio-integral equation; it operates locally in time.

Now consider gravitating systems. The analog of short-range collisions for point objects is encounters which deflect their orbits by $\sim 90°$. (Short-range physical collisions of finite objects, like stars, will be analyzed quite differently in Part III.) From (2.1) these have a cross section $\sigma \approx 4\pi b^2 \approx 16\pi G^2 m^2/v^4$. Therefore, using the subscript s to denote short range,

$$ \tau_{*s} \approx \frac{b}{v} \approx \frac{2Gm}{v^3} \approx 2 N^{-3/2}\, \tau_G \tag{6.1} $$

and

$$ \tau_{0s} \approx \frac{1}{n \sigma v} \approx \frac{v^3}{16\pi G^2 m^2 n} \approx \frac{1}{12} N^{1/2}\, \tau_G. \tag{6.2} $$

In the last term of both these equalities, we have used the virial relation $v^2 = GmN/R$ (together with $n = 3N/4\pi R^3$ for a uniform sphere) and set

$$ \left( \frac{R^3}{Gm} \right)^{1/2} \equiv \tau_G. \tag{6.3} $$
The length $R$ is the radius for a finite spherical system, and for an infinite system it represents a scale over which orbits can be correlated. For an infinite system, this correlation length may result from a density enhancement $\Delta n \approx n$ giving rise to enhancements of order $v$ in the peculiar (i.e., non-streaming) velocities of the objects. The time $\tau_G$ is basically the contribution that one object of mass $m$ would make to the gravitational response time if it were smoothed over the whole system. From Equations (6.1) and (6.2) we see that the time between collisions is much longer than the time of a short-range collision, by a factor of about $0.04 N^2$.

The opposite result holds for long-range collisions (subscript L), as might be expected from their name. For these, the time of a collision is

$$ \tau_{*L} \approx \frac{R}{v} \approx \left( \frac{R^3}{GmN} \right)^{1/2} \approx N^{-1/2}\, \tau_G, \tag{6.4} $$

while the time between collisions is

$$ \tau_{0L} \approx \frac{1}{n R^2 v} \approx N^{-3/2}\, \tau_G. \tag{6.5} $$
Indeed the time of a long-range collision is greater than the time between collisions by a factor $\sim N$, and as a result a long-range collision is never complete! So it would seem that long-range collisions, by their very nature, must be non-Markovian. But this is not the whole story.

Since long-range collisions are so gentle, it takes about $10^{-2} N$ times as many of them to produce the same effect, $\Delta v \approx v$, as one short-range collision (see (2.10), (2.11) and (6.4)). Comparing $10^{-2} N \tau_{*L}$ with $\tau_{0s}$ shows that the time $10^{-2} N \tau_{*L}$ for long-range collisions to have a major effect on the orbits is about one-tenth the time between short-range collisions. In a realistic system with density variations and non-Maxwellian distributions, at least roughly one-tenth the long-range collisions are quenched by a short-range nearly random scattering. The time between short-range collisions as a function of velocity change varies as $\tau_{0s} \propto b^{-2} \propto (\Delta v/v)^2 v^3$. The long-range relaxation time $\tau_R$ has the same velocity dependence (see (2.8)). So the quenching is about the same at any level of velocity perturbation.

Other phenomena also tend to quench long-range deflections in realistic systems. Any density or velocity inhomogeneities will scatter individual objects strongly. If these inhomogeneities, representing collective motions, change rapidly with time, as in a system far from equilibrium, they can be much more important than single long-range collisions. This state, called violent relaxation, is discussed in Section 38. Even without violent relaxation, we should not forget that an object responds to the mean gravitational field, as well as to the fluctuations. Global asymmetries of scale length $D$ in the average field will affect the orbit on times of order $D/v$ (the crossing time for finite systems). If the amplitude of the asymmetry is large, as in a contracting ellipsoidal system of stars, it can dominate the fluctuations. Finally, the tidal effects of external systems may also influence internal orbits.
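The comparison of timescales in this section is easy to tabulate. The sketch below works in units of $\tau_G$ and assumes the order-of-magnitude forms $\tau_{*s} \approx 2N^{-3/2}$, $\tau_{0s} \approx N^{1/2}/12$, $\tau_{*L} \approx N^{-1/2}$ and $\tau_{0L} \approx N^{-3/2}$ for a uniform sphere with $v^2 = GmN/R$; the numerical prefactors and the function names are illustrative assumptions, not exact results.

```python
def short_range_times(N):
    """(time of a collision, time between collisions) for short-range encounters, in units of tau_G."""
    return 2.0 * N ** -1.5, N ** 0.5 / 12.0

def long_range_times(N):
    """(time of a collision, time between collisions) for long-range encounters, in units of tau_G."""
    return N ** -0.5, N ** -1.5

def cumulative_long_range_ratio(N):
    """Time for ~0.01 N long-range encounters to give dv ~ v, relative to the short-range interval."""
    t_star_long, _ = long_range_times(N)
    _, t_between_short = short_range_times(N)
    return 0.01 * N * t_star_long / t_between_short

N = 1.0e5  # e.g. a globular cluster
t_star_s, t_0_s = short_range_times(N)
t_star_L, t_0_L = long_range_times(N)
print(t_0_s / t_star_s / N ** 2)          # about 0.04: short-range collisions are well separated
print(t_star_L / t_0_L / N)               # about 1: long-range collisions overlap by a factor ~N
print(cumulative_long_range_ratio(N))     # about one-tenth, independent of N
```

With these forms the last ratio does not depend on $N$, which is why the quenching argument applies equally to small and large systems.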
Nevertheless, there may be cases where non-Markovian effects are important and we should at least construct a simple theory to see how they can alter a system's evolution. To do this we return to the Langevin analysis of (3.1) and ask how it is modified by perturbing forces which still have zero average value (3.3) but are no longer unrelated at different times. What form will
as
(8.8)
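for all velocity components and any power $n$. Basically this means that finite systems contain very few objects with velocities greater than the escape velocity, and infinite systems contain a low density of stars or galaxies moving with arbitrarily high relativistic velocities. Next, define the average velocity moments of order $l + k$

$$ \int v_i^l\, v_j^k\, f(\mathbf{x}, \mathbf{v}, t)\, dv_1\, dv_2\, dv_3 = n(\mathbf{x}, t)\, \overline{v_i^l v_j^k}, \tag{8.9} $$

where $n(\mathbf{x}, t)$ is the number density and the bar denotes the velocity average. Multiply (8.7) by $v_j^k$ and integrate over all velocity space. The first term is just

$$ \int v_j^k\, \frac{\partial f}{\partial t}\, dv_1\, dv_2\, dv_3 = \frac{\partial}{\partial t} \left( n\, \overline{v_j^k} \right). \tag{8.10} $$

The second set of terms are

$$ \int v_j^k\, v_i\, \frac{\partial f}{\partial x_i}\, dv_1\, dv_2\, dv_3 = \frac{\partial}{\partial x_i} \left( n\, \overline{v_i v_j^k} \right). \tag{8.11} $$

The third set of terms reduces to

$$ \int v_j^k\, \dot{v}_i\, \frac{\partial f}{\partial v_i}\, dv_1\, dv_2\, dv_3 = \begin{cases} 0 & \text{when } i \ne j, \\ -k\, \dot{v}_j\, n(\mathbf{x}, t)\, \overline{v_j^{k-1}} & \text{when } i = j \text{ (no summation on } j\text{)} \end{cases} \tag{8.12} $$

since gravitational acceleration does not depend on velocity and we integrate by parts using (8.8). Combining these results

$$ \frac{\partial}{\partial t} \left( n\, \overline{v_j^k} \right) + \frac{\partial}{\partial x_i} \left( n\, \overline{v_i v_j^k} \right) = k\, \dot{v}_j\, n(\mathbf{x}, t)\, \overline{v_j^{k-1}} \quad \text{(no summation over } j\text{)}. \tag{8.13} $$

There are three equations, one for each value of $j$. It takes the merest moment to interpret what this technique produces. Letting $k = 0$, the three equations (8.13) are all the same

$$ \frac{\partial n}{\partial t} + \nabla \cdot (n \bar{\mathbf{v}}) = 0. \tag{8.14} $$

This is just the equation of continuity (4.3) for number (mass) conservation of the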
Some properties of the Boltzmann equation
average flow in the system. Letting $k = 1$ gives three equations, each of the form

$$ \frac{\partial}{\partial t} (n \bar{v}_1) + \frac{\partial}{\partial x_1} \left( n\, \overline{v_1^2} \right) + \frac{\partial}{\partial x_2} \left( n\, \overline{v_1 v_2} \right) + \frac{\partial}{\partial x_3} \left( n\, \overline{v_1 v_3} \right) = \dot{v}_1\, n(\mathbf{x}, t), \tag{8.15} $$

from which the remaining two also follow by cyclic permutation, i.e., changing the indices $1 \to 2$, $2 \to 3$, and $3 \to 1$, then repeating this transformation. Equations (8.15) describe the transport of momentum. In this case it is momentum per unit mass for equal mass objects, but it can easily be generalized for a distribution function which depends on mass as well as on position, velocity and time (an exercise for the reader). The results (8.14)-(8.15) were first derived by Jeans (1915, 1919, 1922) and are often called the equations of stellar hydrodynamics.

The transport equations can be made to look more like ordinary fluid hydrodynamics by rewriting the nine terms of the form $\partial (n\, \overline{v_1 v_2})/\partial x_2$. These terms represent, for example, the gradient in the $x_2$ direction of the average flux of $n v_1$ momentum along $x_2$. This generally differs from $n \bar{v}_1 \bar{v}_2$, which is the flux of the average momentum $n \bar{v}_1$ along $x_2$. The reason is essentially that the fluctuations of opposite sign in the mean flow make it unequal to the root mean square flow, as also occurs in the Langevin and Fokker-Planck descriptions. Supposing that each velocity component is the sum of an average and a fluctuating term,

$$ v_i = \bar{v}_i + v_i', \tag{8.16} $$

and that the fluctuations are not autocorrelated,

$$ \overline{v_i'} = 0, \tag{8.17} $$

gives

$$ \overline{v_i v_j} = \bar{v}_i \bar{v}_j + \overline{v_i' v_j'} = \bar{v}_i \bar{v}_j + \frac{P_{ij}}{mn}, \tag{8.18} $$
where $P_{ij} = P_{ji}$ is a pressure tensor and $m$ is the mass of a star in the system. The lack of autocorrelation is a very strong assumption, much less realistic for gravitating systems than for atomic gases. But it is consistent with the collisionless Boltzmann approximation and, as we will see, simplifies the situation so much that it is often used to give approximate results which provide some insight into particular astronomical problems. Most of these problems are so difficult that even small insights become significant.

Substituting (8.18) into (8.15), recalling from (8.9) that $\bar{v}_i$ is generally a function of $\mathbf{x}$ and $t$ even though $v_i$ is not, and using the continuity equation (8.14) transforms the momentum transport equation into its more familiar form

$$ \frac{\partial \bar{v}_1}{\partial t} + \bar{v}_1 \frac{\partial \bar{v}_1}{\partial x_1} + \bar{v}_2 \frac{\partial \bar{v}_1}{\partial x_2} + \bar{v}_3 \frac{\partial \bar{v}_1}{\partial x_3} + \frac{1}{mn} \left( \frac{\partial P_{11}}{\partial x_1} + \frac{\partial P_{12}}{\partial x_2} + \frac{\partial P_{13}}{\partial x_3} \right) = \frac{F_1}{m} \tag{8.19} $$

with $F_1 = m \dot{v}_1$ the 'external' force due to the mean field. Similar equations obtained by cyclic permutation hold for the $\bar{v}_2$ and $\bar{v}_3$ components. When the pressure is isotropic so $P_{ij} = P \delta_{ij}$, all these equations can be combined in the vector form

$$ \frac{\partial \bar{\mathbf{v}}}{\partial t} + \bar{\mathbf{v}} \cdot \nabla \bar{\mathbf{v}} + \frac{1}{mn} \nabla P = \mathbf{F} m^{-1}. \tag{8.20} $$
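The separation of a second velocity moment into a streaming part and a random part, as in (8.16)-(8.18), can be verified directly on a sample of velocities. The following is a small sketch under stated assumptions: the two velocity components are drawn from Gaussians with arbitrary illustrative means and dispersions, and the function names are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def moment_decomposition(v1, v2):
    """Return (mean of v1*v2, product of the means, mean of the fluctuation product)."""
    v1_bar, v2_bar = mean(v1), mean(v2)
    full = mean([a * b for a, b in zip(v1, v2)])
    streaming = v1_bar * v2_bar
    random_part = mean([(a - v1_bar) * (b - v2_bar) for a, b in zip(v1, v2)])
    return full, streaming, random_part

rng = random.Random(0)
# Illustrative sample: mean flow (2, -1) with random velocities superimposed.
v1 = [2.0 + rng.gauss(0.0, 1.0) for _ in range(10_000)]
v2 = [-1.0 + rng.gauss(0.0, 0.5) for _ in range(10_000)]
full, streaming, random_part = moment_decomposition(v1, v2)
print(full, streaming + random_part)
```

The random part, multiplied by $mn$, plays the role of the pressure tensor component $P_{12}$; for independent components, as sampled here, it is close to zero, while the corresponding diagonal terms are not.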
Poisson's equation (7.13) determines $\mathbf{F}$ from $n$, and mass conservation (8.14) relates changes in $n$ and $\bar{\mathbf{v}}$. But something is lacking. Even in the simplest case of isotropic pressure, we have eight variables, $\bar{v}_1, \bar{v}_2, \bar{v}_3, n, P, F_1, F_2, F_3$, and only seven equations to determine them. There is not enough information in these moments to solve them self-consistently. If only there were another equation for, say, the pressure $\overline{v'^2}$. Aha! If we go back to the moment Equation (8.13) and take the next higher moment with $k = 2$ we get a transport equation for energy. And after applying the separation of (8.16) we would have our equation for the pressure. The catch, however, is that the pressure equation contains terms involving third order moments arising from $\partial (n\, \overline{v_i v_j^2})/\partial x_i$. The trend is now clear: equations for second order moments contain third order moments, those for third order moments contain fourth order moments and so on to infinity. It seems we cannot break this chain. What started as a promising technique turns out disappointing.

All is not lost, however. The problem just calls for a new approximation; there is more than one way to proceed. We could calculate many moments until we tired and then set the highest moment equal to zero to close the set of equations. This would have the advantage that differences in the solutions found by closing the equations at different levels would give an idea of their accuracy. The more moments used, the greater the constraints placed on the solution to behave like the solution of the original Boltzmann equation. Infinite knowledge of all the solution's moments would be equivalent to knowing the solution itself. (This follows from an extension of the arguments after (3.14) along the lines that all the moments of a function give its transform which can then be inverted to obtain the function. The functions need to have standard properties of continuity and good behavior.) Without infinite knowledge, we can take a more physical approach.
If the motion of the stars or galaxies is close to Maxwellian, then $P_{ij} = P_{ii} \propto \overline{v'^2}$ and this average square random velocity is approximately the velocity dispersion of the distribution function. In a finite spherical system, the kinetic and potential energies approximately balance, so $\overline{v'^2} \approx GM/R$. At a stroke we can then truncate the coupled moment equations. This only works, however, for the equilibrium case where we half know the answer anyway. Nevertheless, an important series of questions involves perturbations around equilibrium states, for which these equations are ideally suited. We will discuss them several times, in different contexts and coordinate systems, in later sections.
9 The virial theorem

One by one the moments fall; some are coming, some are going; do not strive to grasp them all.
Adelaide Proctor
This result is used so often throughout so many branches of dynamical astronomy that it is worth discussing separately, even though it is just a special moment of the equations of motion. For a first look at the virial theorem, we start with the equation of motion (8.20) written in component form as

$$ \rho \frac{d \bar{v}_i}{dt} = -\frac{\partial P}{\partial x_i} + n F_i. \tag{9.1} $$

Here $\rho = mn$ is the mass density and the total derivative

$$ \frac{d}{dt} = \frac{\partial}{\partial t} + \bar{v}_j \frac{\partial}{\partial x_j} \tag{9.2} $$

follows the motion. In a discrete distribution the $i$th component of gravitational force on the $\alpha$th star is

$$ F_i^{(\alpha)} = -m^{(\alpha)} \frac{\partial \phi}{\partial x_i^{(\alpha)}}, \tag{9.3} $$

where the potential
+ 3=0.
(9.19)
In this form of the virial theorem, we must remember that $\langle T \rangle$ does not refer to the total kinetic energy, but just to the portion of the kinetic energy which involves the average motion. If we observe a stellar system, we have no way of knowing how to separate these components. Nor, as a practical matter, can we average a system's internal motions over times which are much longer than its dynamical relaxation time. The first problem can be solved easily by recasting the virial theorem, and we will do that straightaway. The second difficulty is often assumed away by supposing that the configuration we observe actually is the time average. The validity of this assumption will be discussed in Section 41.

Whereas the previous derivation of the virial theorem treated the system as a continuum, this one starts with the basic equations of motion for discrete objects. This will avoid the need to separate the kinetic energy into mean flow and 'thermal' terms. Moreover, we shall generalize the situation by allowing the mass of each object to change with time. This represents mass loss from stars or galaxies, accretion and merging of galaxies, or, more speculatively, cosmologies in which mass varies. The equations of motion for each object are
using (9.4). The $(\alpha)$ and $(\beta)$ indices serve the purpose of the primed and unprimed variables in the continuous case. The masses now are functions of time. For simplicity the objects lose mass isotropically, although the anisotropic case can easily be described by denoting the mass loss in each direction by ... (9.24) which is motivated by differentiating the inertia tensor, given the form of (9.14) as a clue.
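For the kinetic energy tensor we now have

$$ T_{ij} = \frac{1}{2} \sum_\alpha m^{(\alpha)} v_i^{(\alpha)} v_j^{(\alpha)}, \tag{9.25} $$

for the inertia tensor

$$ I_{ij} = \sum_\alpha m^{(\alpha)} x_i^{(\alpha)} x_j^{(\alpha)}, \tag{9.26} $$

and for the mass variation tensor $J_{ij}$ ... $\to \infty$, then

$$ 2 \langle T_{ij} \rangle + \langle W_{ij} \rangle = 0. $$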
(9.29)
The time averages will not, except fortuitously, be equal to the instantaneous average here, since $T_{ij}$ and $W_{ij}$ are explicitly time dependent. However, this time dependence does not change the form of the virial theorem. Contracting and summing over the indices gives

$$ 2 \langle T \rangle + \langle W \rangle = 0, \tag{9.30} $$
which is the most quoted version. Any number can play in the virial theorem. It applies to a satellite going around a planet as well as to a cluster of galaxies. In the simplest case, a two-body circular orbit, for example, we know from first principles that centrifugal and gravitational forces balance: $m^{(\alpha)} v^2/r = G m^{(\alpha)} m^{(\beta)}/r^2$. Multiplying through by $r$ gives the virial theorem. In this case the time average is equal to the instantaneous value since the orbit is periodic and symmetric.

When applied to clusters of many objects, the virial theorem gives the order-of-magnitude relation used in Section 2 between the size of the cluster and its velocity dispersion: $v^2 \approx GmN/R$. About a half-century ago this relation was used to estimate the mass of several clusters of galaxies from their observed radii and velocity dispersions. An alternative estimate from the calibrated luminosity-mass relation of the galaxies gave a cluster mass about an order of magnitude less than the dynamical mass. The disagreement of these two estimates was called 'the mystery of the missing mass' and is discussed further in Section 42.1.
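Both uses of the virial theorem mentioned here can be sketched in a few lines. The example below is an illustration under stated assumptions: the two-body check uses the general circular orbit about the common center of mass (reduced mass and relative velocity), which reduces to the balance quoted above when one mass is much smaller than the other, and the mass estimate is only the order-of-magnitude relation $M \approx v^2 R/G$. The function names and sample values are hypothetical.

```python
G = 6.674e-11  # gravitational constant in SI units (m^3 kg^-1 s^-2)

def two_body_virial(m1, m2, r):
    """Kinetic and potential energy of two masses on a circular orbit of separation r."""
    mu = m1 * m2 / (m1 + m2)          # reduced mass
    v_rel_sq = G * (m1 + m2) / r      # centrifugal balance for the relative orbit
    T = 0.5 * mu * v_rel_sq
    W = -G * m1 * m2 / r
    return T, W

def virial_mass(velocity_dispersion, radius):
    """Order-of-magnitude dynamical mass from v^2 ~ G M / R."""
    return velocity_dispersion ** 2 * radius / G

# Earth-Moon-like values (kg, m): 2T + W should vanish for the circular orbit.
T, W = two_body_virial(5.97e24, 7.35e22, 3.84e8)
print(2.0 * T + W, W)

# A cluster of galaxies with dispersion ~1000 km/s and radius ~1 Mpc (illustrative values).
print(virial_mass(1.0e6, 3.086e22))   # mass in kg
```

Because the estimate depends only on the observed radius and velocity dispersion, it measures all the gravitating mass, whether luminous or not.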
10 The grand description - Liouville's equation and entropy

All for one, one for all, that is our device.
Alexandre Dumas, Elder
Langevin's equation, the Fokker-Planck equation, the master equation, and Boltzmann's equation are all just partial descriptions of gravitating systems. Each is based on different assumptions, suited to different conditions. They all arise from physical, rather intuitive, approaches to the problem. But there is also a more general description from which our previous ones emerge as special cases. We know this must be true because Newton's equations of motion provide a complete description of all the orbits. The trouble with Newton's equations is that they are not very compact: $N$ objects generate $6N$ equations. True, the total angular and linear momenta, and energy, are conserved, at least for isolated systems, but this is not usually a great simplification.

By extending our imagination, we can cope with the problem. We previously imagined a six-dimensional phase space for the collisionless Boltzmann equation. Each point in this phase space represented the three position and three velocity (or momentum) coordinates of a single particle. It was a slight generalization of the two-dimensional phase plane whose coordinates are values of a quantity and its first derivative resulting from a second order differential equation for that quantity. The terminology probably arose from the case of the harmonic oscillator where this plane gave the particular stage or phase in the recurring sequence of movement of the oscillator.

Now consider a bigger 'phase space' having $6N$ dimensions. Each point represents not a single object, but the entire system of $N$ objects. As the system evolves, the trajectory of its phase point traces out this evolution. So far, nothing new has been added except a pictorial representation of the dynamics.

The next step is a great piece of intuitive insight due to Gibbs. Although only one system with a particular set of properties may really exist in nature, suppose there were many such systems, an ensemble.
If all the objects in each of these systems had exactly the same positions and velocities at some time, the systems would all share the same phase point. More interestingly, consider such a Gibbs ensemble in which each system has a different internal distribution of positions and velocities (for the same number of objects). This ensemble is represented by a cloud of points in $6N$-dimensional phase space. At any time, the probability density for finding a system in the ensemble within a particular range of $6N$ coordinates will be denoted by

$$ f^{(N)}(\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(N)}, \mathbf{v}^{(1)}, \ldots, \mathbf{v}^{(N)}, t)\, d\mathbf{x}^{(1)} \ldots d\mathbf{v}^{(N)}. \tag{10.1} $$

This shorthand notation avoids the boredom of writing millions of coordinates for, say, a globular cluster. The value of $f^{(N)}$ is the fraction of systems in the ensemble with the desired range of velocities and positions. Thus the integral of $f^{(N)}$ over all phase space is unity

$$ \int \cdots \int f^{(N)}\, d\mathbf{x}^{(1)} \ldots d\mathbf{v}^{(N)} = 1. \tag{10.2} $$
The next conceptual step is to assume, following Gibbs, that the probability distribution of all members of the ensemble is the same as the probability of finding a given set of coordinates in any one member of the ensemble. For ordinary statistical mechanics this is justified by supposing that all members of the ensemble are fairly similar and represent different microscopic realizations of systems with the same macroscopic (average) properties such as temperature and density. Then one appeals to the ergodicity of the ensemble. Some physicists find this intuitively obvious and note that it leads to many experimentally verified predictions. Others find it intuitively implausible, and thus all the more remarkable for seeming to be true. They have therefore sought rigorous proofs in statistical mechanics and generated a considerable industry.

In gravitational (and other explicitly Hamiltonian) systems, the situation is perhaps more straightforward: the Gibbs concept implies the exact equations of motion of the system. To see this, we first determine how $f^{(N)}$, considered as the probability of finding a given system in the ensemble, changes with time. Initially the probability that a given system has coordinates $(\mathbf{x}_0^{(1)}, \ldots, \mathbf{v}_0^{(N)}, t_0)$ lying within a small $6N$-dimensional volume, with boundary $S_0$, of phase space is

$$ A(t_0) = \int_{S_0} f_0^{(N)}(\mathbf{x}_0^{(1)}, \ldots, \mathbf{v}_0^{(N)}, t_0)\, d\mathbf{x}_0^{(1)} \ldots d\mathbf{v}_0^{(N)}. \tag{10.3} $$

At some later time, the coordinates $\mathbf{x}_0$, $\mathbf{v}_0$ in the system will evolve dynamically into $\mathbf{x}$ and $\mathbf{v}$, the distribution function will become $f^{(N)}(\mathbf{x}^{(1)}, \ldots, \mathbf{v}^{(N)}, t)$ and the boundary $S_0$ will change to $S_t$. The probability that the evolved system now lies within $S_t$ is
A(t)= I /•(x