Physics Reports vol.353

NEW DEVELOPMENTS IN THE CASIMIR EFFECT

M. BORDAG, U. MOHIDEEN, V.M. MOSTEPANENKO
Électricité de France, Div. R&D, MFTT, 6 Quai Watier, 78400 Chatou, France
Energy Conversion Department, Chalmers University of Technology, S-41296 Göteborg, Sweden

Physics Reports 353 (2001) 1–205

New developments in the Casimir effect

M. Bordag (a), U. Mohideen (b,*), V.M. Mostepanenko (c,1)

(a) Institute for Theoretical Physics, Leipzig University, Augustusplatz 10/11, 04109 Leipzig, Germany
(b) Department of Physics, University of California, Riverside, CA 92521, USA
(c) Department of Physics, Federal University of Paraíba, Caixa Postal 5008, CEP 58059-970, João Pessoa, Paraíba, Brazil

Received December 2000; editor: A.A. Maradudin

(*) Corresponding author. Fax: +1-909-787-4529. E-mail address: [email protected] (U. Mohideen).
(1) On leave from A. Friedmann Laboratory for Theoretical Physics, St. Petersburg, Russia.

© 2001 Elsevier Science B.V. All rights reserved. 0370-1573/01/$ - see front matter. PII: S0370-1573(01)00015-1

Contents

1. Introduction
  1.1. Zero-point oscillations and their manifestation
  1.2. The Casimir effect as a macroscopic quantum effect
  1.3. The role of the Casimir effect in different fields of physics
  1.4. What has been accomplished during the last years?
  1.5. The structure of the review
2. The Casimir effect in simple models
  2.1. Quantized scalar field on an interval
  2.2. Parallel conducting planes
  2.3. One- and two-dimensional spaces with nontrivial topologies
  2.4. Moving boundaries in a two-dimensional space–time
3. Regularization and renormalization of the vacuum energy
  3.1. Representation of the regularized vacuum energy
  3.2. The heat kernel expansion
  3.3. The divergent part of the ground state energy
  3.4. Renormalization and normalization conditions
  3.5. The photon propagator with boundary conditions
4. Casimir effect in various configurations
  4.1. Flat boundaries
  4.2. Spherical and cylindrical boundaries
  4.3. Sphere (lens) above a disk: additive methods and proximity forces
  4.4. Dynamical Casimir effect
  4.5. Radiative corrections to the Casimir effect
  4.6. Spaces with non-Euclidean topology
5. Casimir effect for real media
  5.1. The Casimir effect at nonzero temperature
  5.2. Finite conductivity corrections
  5.3. Roughness corrections
  5.4. Combined effect of different corrections
6. Measurements of the Casimir force
  6.1. General requirements for the Casimir force measurements
  6.2. Primary achievements of the older measurements
  6.3. Experiment by Lamoreaux
  6.4. Experiments with the Atomic Force Microscope by Mohideen et al.
  6.5. Demonstration of the nontrivial boundary properties of the Casimir force
  6.6. The outlook for measurements of the Casimir force
7. Constraints for non-Newtonian gravity from the Casimir effect
  7.1. Constraints from the experiments with dielectric test bodies
  7.2. Constraints from Lamoreaux's experiment
  7.3. Constraints following from the atomic force microscope measurements of the Casimir force
8. Conclusions and discussion
Acknowledgements
Appendix A. Applications of the Casimir force in nanotechnology
  A.1. Casimir force and nanomechanical devices
  A.2. Casimir force in nanoscale device fabrication
References

Abstract

We provide a review of both new experimental and theoretical developments in the Casimir effect. The Casimir effect results from the alteration by the boundaries of the zero-point electromagnetic energy. Unique to the Casimir force is its strong dependence on shape, switching from attractive to repulsive as a function of the size, geometry and topology of the boundary. Thus, the Casimir force is a direct manifestation of the boundary dependence of the quantum vacuum. We discuss in depth the general structure of the infinities in the field theory which are removed by a combination of zeta-functional regularization and heat kernel expansion. Different representations for the regularized vacuum energy are given. The Casimir energies and forces in a number of configurations of interest to applications are calculated. We stress the development of the Casimir force for real media including effects of nonzero temperature, finite conductivity of the boundary metal and surface roughness. Also, the combined effect of these important factors is investigated in detail on the basis of condensed matter physics and quantum field theory at nonzero temperature. The experiments on measuring the Casimir force are also reviewed, starting first with the older measurements and finishing with a detailed presentation of modern precision experiments. The latter are accurately compared with the theoretical results for real media. At the end of the review we provide the most recent constraints on the corrections to Newtonian gravitational law and other hypothetical long-range interactions at submillimeter range obtained from the Casimir force measurements. © 2001 Elsevier Science B.V. All rights reserved.

PACS: 12.20.−m; 11.10.Wx; 72.15.−v; 68.35.Ct; 12.20.Fv; 14.80.−j

Keywords: Vacuum; Zero-point oscillations; Renormalization; Finite conductivity; Nonzero temperature; Roughness; Precision measurements; Atomic force microscope; Long-range interactions


1. Introduction

More than 50 years have passed since H.B.G. Casimir published his famous paper [1] where he found a simple yet profound explanation for the retarded van der Waals interaction (which was described by him along with D. Polder [2] only a short time before) as a manifestation of the zero-point energy of a quantized field. For a long time, the paper remained relatively unknown. But starting from the 1970s this effect quickly received increasing attention and in the last few years has become highly popular. New high-precision experiments demonstrating the Casimir force have been performed and more are under way. In theoretical developments considerable progress has been made in the investigation of the structure of divergencies in general, nonflat backgrounds and in the calculation of the effect for more complicated geometries and boundary conditions, including those due to the real structure of the boundaries. In the Introduction we discuss the fundamental problems connected with the concept of physical vacuum, the role of the Casimir effect in different domains of physics and the scope of the review.

1.1. Zero-point oscillations and their manifestation

The Casimir effect in its simplest form is the interaction of a pair of neutral, parallel conducting planes due to the disturbance of the vacuum of the electromagnetic field. It is a purely quantum effect: there is no force between the plates (assumed to be neutral) in classical electrodynamics. In the ideal situation, at zero temperature for instance, there are no real photons in between the plates. So it is only the vacuum, i.e., the ground state of quantum electrodynamics (QED), which causes the plates to attract each other. It is remarkable that a macroscopic quantum effect appears in this way. In fact, the roots of this effect date back to the introduction by Planck in 1911 of the half-quanta [3].
In the language of quantum mechanics one has to consider a harmonic oscillator with energy levels $E_n = \hbar\omega(n + \tfrac{1}{2})$, where $n = 0, 1, \ldots$, and $\hbar$ is the Planck constant. It is the energy

$$E_0 = \frac{\hbar\omega}{2} \tag{1.1}$$

of the ground state ($n = 0$) which matters here. From the point of view of the canonical quantization procedure this is connected with the arbitrariness of the operator ordering in defining the Hamiltonian operator $\hat{H}$ by substituting in the classical Hamiltonian $H(p, q)$ the dynamical variables by the corresponding operators, $\hat{H} = H(\hat{p}, \hat{q})$. It must be underlined that the ground state energy $E_0$ cannot be observed by measurements within the quantum system, i.e. in transitions between different quantum states, or for instance in scattering experiments. However, the frequency $\omega$ of the oscillator may depend on parameters external to the quantum system. This was noticed as early as 1919 in the explanation of the vapor pressure of certain isotopes, where the different masses provide the necessary change in the external parameter (for a historical account see [4]).

In quantum field theory one is faced with the problem of ultraviolet divergencies which come into play when one tries to assign a ground state energy to each mode of the field. One then has to consider

$$E_0 = \frac{\hbar}{2} \sum_J \omega_J , \tag{1.2}$$

where the index $J$ labels the quantum numbers of the field modes. For instance, for the electromagnetic field in Minkowski space the modes are labeled by a three-vector $\mathbf{k}$ in addition to the two polarizations. Sum (1.2) is clearly infinite. It was Casimir who was the first to extract the finite force acting between the two parallel neutral plates

$$F(a) = -\frac{\pi^2}{240}\,\frac{\hbar c}{a^4}\,S \tag{1.3}$$

from the infinite zero-point energy of the quantized electromagnetic field confined in between the plates. Here $a$ is the separation between the plates, $S \gg a^2$ is their area and $c$ is the speed of light. To do this Casimir subtracted from the infinite vacuum energy of Eq. (1.2) in the presence of plates the infinite vacuum energy of the quantized electromagnetic field in free Minkowski space. Both infinite quantities were regularized and, after subtraction, the regularization was removed, leaving the finite result.

Note that in standard textbooks on quantum field theory the dropping of the infinite vacuum energy of free Minkowski space is motivated by the fact that energy is generally defined up to an additive constant. Thus, it is suggested that all physical energies be measured starting from the top of this infinite vacuum energy in free space. In this manner the infinite energy of free space is effectively set to zero. Mathematically, this is achieved by the so-called normal ordering procedure. This operation, when applied to the operators for physical observables, puts all creation operators to the left of annihilation operators as if they commute [5–7]. It would be incorrect, however, to apply the normal ordering procedure in this simplest form when there are external fields or boundary conditions, e.g., on the parallel metallic plates placed in vacuum. In that case there is an infinite set of different vacuum states and corresponding annihilation and creation operators for different separations between plates. These states turn into one another under adiabatic changes of separation. Thus, it is incorrect to pre-assign zero energy values to several states between which transitions are possible. Because of this, the finite difference between the infinite vacuum energy densities in the presence of plates and in free space is observable and gives rise to the Casimir force.
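The "regularize, subtract, remove the regularization" procedure described above can be illustrated numerically in a much simpler setting than the parallel plates. The sketch below is not Casimir's calculation: it assumes a massless scalar field on an interval of length $a$ with mode frequencies $\omega_n = \pi c n/a$, and an exponential cutoff $e^{-\delta n}$ chosen purely for illustration. The function names are hypothetical. In units of $\pi\hbar c/(2a)$ the regularized sum (1.2) becomes $\sum_n n\,e^{-\delta n}$, the free-space counterpart is the same expression with the sum replaced by an integral, and their difference stays finite as $\delta \to 0$.

```python
import math

def regularized_mode_sum(delta):
    """Cutoff-regularized zero-point sum  sum_{n>=1} n*exp(-delta*n),
    in units of pi*hbar*c/(2a) (assumed massless modes omega_n = pi*c*n/a)."""
    n_max = math.ceil(60.0 / delta)  # later terms are suppressed by exp(-60)
    return math.fsum(n * math.exp(-delta * n) for n in range(1, n_max + 1))

def free_space_term(delta):
    """Same expression with the discrete modes replaced by a continuum:
    integral_0^inf n*exp(-delta*n) dn = 1/delta**2."""
    return 1.0 / delta**2

def renormalized_sum(delta):
    """Difference of the two divergent quantities at fixed cutoff."""
    return regularized_mode_sum(delta) - free_space_term(delta)

# Each term grows like 1/delta**2, while the difference settles near -1/12.
for delta in (0.1, 0.03, 0.01):
    print(delta, regularized_mode_sum(delta), renormalized_sum(delta))
```

Under these assumptions the difference tends to $-1/12$, which corresponds to the finite energy $-\pi\hbar c/(24a)$ for this one-dimensional model. The exponential cutoff is only one possible regularization; the finite part does not depend on that choice.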
It is important to discuss briefly the relation of the Casimir effect to other effects of quantum field theory connected with the existence of zero-point oscillations. It is well known that there is an effect of vacuum polarization by external fields. The characteristic property of this effect is some nonzero vacuum energy depending on the field strength. Boundaries can be considered as a concentrated external field. In this case the vacuum energy in restricted quantization volumes is analogous to the vacuum polarization by an external field. We can then say that material boundaries polarize the vacuum of a quantized field, and the force acting on the boundary is a result of this polarization. The other vacuum quantum effect is the creation of particles from vacuum by external fields. In this effect energy is transferred from the external field to the virtual particles (vacuum oscillations), transforming them into real ones. There is no such effect in the case of static boundaries. However, if the boundary conditions depend on time there is particle creation, in addition to a force (this is the so-called nonstationary or dynamic Casimir effect).

A related topic to be mentioned is quantum field theory with boundary conditions. The most common part of that is quantum field theory at finite temperature in the Matsubara formulation (we discuss this subject here only in application to the Casimir force). The effects to be considered in this context can be divided into pure vacuum effects like the Casimir effect and those where excitations of the quantum fields are present, i.e., real particles in addition to virtual ones. An example is an atom whose spontaneous emission is changed in a cavity. Another example is the so-called apparatus correction to the electron g-factor. Here, the virtual photons responsible for the anomaly $a_e = (g - 2)/2$ of the magnetic moment are affected by the boundaries. In this case a real particle, the electron, is involved and the quantity to be considered is the expectation value of the energy in a one-electron state. The same holds for cavity shifts of the hydrogen levels. This topic, together with a number of related ones, is called "cavity QED". In the methods used (a photon propagator obeying boundary conditions), this is very closely related to the quantum field theoretic treatment of the Casimir effect. However, the difference is "merely" that expectation values are considered in the vacuum state in one case and in one (or more) particle states in the other.

1.2. The Casimir effect as a macroscopic quantum effect

The historical path taken by Casimir in his dealings with vacuum fluctuations is quite different from the approaches discussed in the previous subsection (see for example [8]). In investigating long-range van der Waals forces in colloids together with his collaborator D.
Polder he took the retardation in the electromagnetic interaction of dipoles into account and arrived at the so-called Casimir–Polder forces between polarizable molecules [2]. This was later extended by Lifshitz [9] to forces between dielectric macroscopic bodies usually characterized by a dielectric constant $\varepsilon_0$,

$$F(a) = -\frac{\pi^2}{240}\,\frac{\hbar c}{a^4}\,\frac{(\varepsilon_0 - 1)^2}{(\varepsilon_0 + 1)^2}\,\varphi(\varepsilon_0)\,S , \tag{1.4}$$

where $\varphi(\varepsilon_0)$ is a tabulated function. In this microscopic description, the ideal conductor is obtained in the limit $\varepsilon_0 \to \infty$, and the same Casimir force (1.3) emerges just as in the zero-point energy approach. The point is that in the limit of ideal conductors only the surface layer of atoms, thought of as a continuum, interacts with the electromagnetic field. Clearly, in this idealized case, boundary conditions provide an equivalent description with the known consequences for the vacuum of the electromagnetic field. These alternative descriptions also work for deviations from the ideal conductor limit. For example, the vacuum interaction of two bodies with finite conductivity can be described approximately by impedance boundary conditions with finite penetration depth in one case and by the microscopic model in the other. For two dielectric bodies of arbitrary shape the summation of the Casimir–Polder interatomic potentials was shown to be approximately equal to the exact results if special normalizations accounting for the nonadditivity effects are performed [10]. Only recently has an important theoretical advance occurred in our understanding of this equivalence in the example of a spherical body (instead of two separate bodies). Here the equivalence of the Casimir–Polder summation and vacuum energy has been shown, at least in the dilute gas approximation [11].

The microscopic approach to the theory of both van der Waals and Casimir forces can be formulated in a unified way. It is well known that the van der Waals interaction appears between neutral atoms of condensed bodies separated by distances which are much larger than the atomic dimensions. It can be obtained nonrelativistically in second-order perturbation theory from the dipole–dipole interaction energy [12]. Because the expectation values of the dipole moment operators are zero, the van der Waals interaction is due to their dispersions, i.e. to quantum fluctuations. Thus, it is conventional to speak about a fluctuating electromagnetic field both inside the condensed bodies and in the narrow gap between them. Using the terminology of Quantum Field Theory, for closely spaced macroscopic bodies the virtual photon emitted by an atom of one body reaches an atom of the second body during its lifetime. The correlated oscillations of the instantaneously induced dipole moments of those atoms give rise to the nonretarded van der Waals force [13,14]. Let us now increase the distance between the two macroscopic bodies so much that the virtual photon emitted by an atom of one body cannot reach the second body during its lifetime. In this case the usual van der Waals force is absent. Nevertheless, the correlation of the quantized electromagnetic field in the vacuum state is not equal to zero at the two points where the atoms belonging to the different bodies are situated. Hence nonzero correlated oscillations of the induced atomic dipole moments arise once more, resulting in the Casimir force.
In this theoretical approach the latter is also referred to as the retarded van der Waals force [7]. In the case of a perfect metal the presence of the bounding condensed bodies can be reduced to boundary conditions at the sides of the gap. In the general case it is necessary to calculate the interaction energy in terms of the frequency-dependent dielectric permittivity (and, generally, also the magnetic permeability) of the media. For the case of two semispaces with a gap between them this was first realised in [9], where the general expressions for both the van der Waals and Casimir force were obtained. Needless to say, this theoretical approach is applicable only to the electromagnetic Casimir effect caused by material boundaries having atomic structure. The case of quantization volumes with nontrivial topology, which also lead to boundary conditions [15–17], is not covered by it.

An important feature of the Casimir effect is that even though it is quantum in nature, it predicts a force between macroscopic bodies. For two plane-parallel metallic plates of area $S = 1\,\mathrm{cm}^2$ separated by a large distance (on the atomic scale) of $a = 1\,\mu\mathrm{m}$ the value of the attractive force given by Eq. (1.3) is $F(a) \approx 1.3 \times 10^{-7}$ N. This force, while small, is now within the range of modern laboratory force measurement techniques. Unique to the Casimir force is its strong dependence on shape, switching from attractive to repulsive as a function of the geometry and topology of a quantization manifold [18,19]. This makes the Casimir effect a likely candidate for applications in nanotechnologies and nanoelectromechanical devices. The attraction between neutral metallic plates in a vacuum was first observed experimentally in [20]. This and other recent experimental developments in the measurement of the Casimir force are discussed in Section 6.
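The magnitude quoted above follows directly from Eq. (1.3). The sketch below evaluates it in SI units for $S = 1\,\mathrm{cm}^2$ and $a = 1\,\mu\mathrm{m}$. The numerical values of $\hbar$ and $c$ are standard constants supplied here, not taken from the text, and the function names are illustrative. As a consistency check it also uses the ideal-plate Casimir energy $E(a) = -\pi^2\hbar c S/(720 a^3)$, a standard result not quoted in this section and therefore an assumption of the example, and confirms numerically that $F = -\partial E/\partial a$.

```python
import math

HBAR = 1.054571817e-34  # J*s, reduced Planck constant (standard value, not from the text)
C = 2.99792458e8        # m/s, speed of light

def casimir_force(a, area):
    """Force between ideal parallel plates, Eq. (1.3); negative means attractive."""
    return -math.pi**2 * HBAR * C * area / (240.0 * a**4)

def casimir_energy(a, area):
    """Ideal-plate Casimir energy, assumed here: E = -pi^2 hbar c S / (720 a^3)."""
    return -math.pi**2 * HBAR * C * area / (720.0 * a**3)

a = 1.0e-6      # plate separation: 1 micrometre
area = 1.0e-4   # plate area: 1 cm^2 expressed in m^2

force = casimir_force(a, area)

# Central-difference estimate of F = -dE/da as an independent check.
h = 1.0e-4 * a
force_from_energy = -(casimir_energy(a + h, area) - casimir_energy(a - h, area)) / (2.0 * h)

print(f"F(a)   = {force:.3e} N")
print(f"-dE/da = {force_from_energy:.3e} N")
```

Because of the $a^{-4}$ dependence, halving the separation increases the force sixteenfold, which is why measurements of small separations dominate the experimental error budget.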


There exist only a few other macroscopic manifestations of quantum phenomena. Among them are the famous ones such as Superconductivity, Superfluidity, and the Quantum Hall effect. In these macroscopic quantum effects the coherent behavior of a large number of quantum particles plays an important role. In line with the foregoing, the Casimir force arises due to coherent oscillations of the dipole moments of a great number of atoms belonging to the different boundary bodies. For this reason the Casimir effect can also be considered a macroscopic quantum effect. The clearest implication of the above is that the greater attention traditionally given to the macroscopic quantum effects will eventually also be received by the Casimir effect.

1.3. The role of the Casimir effect in different fields of physics

The Casimir effect is an interdisciplinary subject. It plays an important role in a variety of fields of physics such as Quantum Field Theory, Condensed Matter Physics, Atomic and Molecular Physics, Gravitation and Cosmology, and in Mathematical Physics [21].

In Quantum Field Theory, the Casimir effect finds three main applications. In the bag model of hadrons in Quantum Chromodynamics the Casimir energy of quark and gluon fields makes essential contributions to the total nucleon energy. In Kaluza–Klein field theories the Casimir effect offers one of the most effective mechanisms for spontaneous compactification of extra spatial dimensions. Moreover, measurements of the Casimir force provide opportunities to obtain stronger constraints on the parameters of long-range interactions and light elementary particles predicted by the unified gauge theories, supersymmetry, supergravity, and string theory.

In Condensed Matter Physics, the Casimir effect leads to attractive and repulsive forces between closely spaced material boundaries which depend on the configuration geometry, on temperature, and on the electrical and mechanical properties of the boundary surface. It is responsible for some properties of thin films and should be taken into account in investigations of surface tension and latent heat. The Casimir effect plays an important role in both bulk and surface critical phenomena.

In Gravitation, Astrophysics and Cosmology, the Casimir effect arises in space–times with nontrivial topology. The vacuum polarization resulting from the Casimir effect can drive the inflation process. In the theory of structure formation of the Universe due to topological defects, the Casimir vacuum polarization near the cosmic strings may play an important role.

In Atomic Physics, the long-range Casimir interaction leads to corrections to the energy levels of Rydberg states. A number of Casimir-type effects arise in cavity Quantum Electrodynamics when the radiative processes and associated energy shifts are modified by the presence of the cavity walls.

In Mathematical Physics, the investigation of the Casimir effect has stimulated the development of powerful regularization and renormalization techniques based on the use of zeta functions and heat kernel expansion.

The majority of these applications will be discussed below and references to the most important papers will also be given.


1.4. What has been accomplished during the last years?

This review is devoted to new developments in the Casimir effect. In spite of the extensive studies on the subject performed over more than 50 years there is only a small number of review publications. The first two large reviews [22,23] were published more than 10 years ago. There is a single monograph [24] specially devoted to the different aspects of the Casimir effect (the first, Russian, edition of it was published in 1990). The other monograph [25] is mostly concerned with the condensed matter aspects of the subject. Several chapters of the monograph [7] are also devoted to the Casimir effect.

There are at least three very important new developments in the Casimir effect which have made their appearance after the publication of the above-mentioned reviews. The first falls within the domain of Quantum Field Theory. It has been known that in the case of flat boundaries the vacuum energy turns into infinity at large momenta in the same way as in free Minkowski space. Thus, it is apparently enough to subtract the contribution of Minkowski space in order to obtain the final physical result for the Casimir energy. For arbitrary compact domains bounded by closed surfaces (for example, the interior of a sphere) this is, however, not the case. Apart from the highest infinity (which is proportional to the fourth power of a cut-off momentum) there exist lower-order infinities. The investigation of the general structure of these infinities for an arbitrary domain was a theoretical problem which had been solved by the combination of zeta-functional regularization [26,27] and heat kernel expansion [28]. However, these results had been obtained mostly in the context of curved space–time.
The explicit application to the Casimir effect was done later, first in [29,30], where as an example the known divergencies of the Casimir effect for a massive field with boundary conditions on a sphere were related to the corresponding heat kernel coefficients. In the 1990s essential progress was made in the understanding and application of zeta functions [31,32] as well as in the calculation of the Casimir effect for massive fields with nonplane boundaries, e.g., in [33].

The second important development in the Casimir effect during the last years is concerned with Condensed Matter Physics. It has long been known that there are large corrections to the ideal field-theoretical expressions for the Casimir force due to several factors which are necessarily present in any experimental situation. The most important factors of this sort are the finite conductivity of the boundary metal, surface roughness, and nonzero temperature. In the papers [34–39] the Casimir force including these factors was investigated in detail. In doing so not only was the influence of each individual factor examined, but their combined action was also determined. This made it possible to improve the agreement between theory and experiment.

Probably, the third development is the most striking. It consists of new, more precise measurements of the Casimir force between metallic surfaces. In [40] a torsion pendulum was used to measure the force between a Cu plus Au coated quartz optical flat and a spherical lens. In [41–44] an atomic force microscope was first applied to measure the Casimir force between an Al plus Au/Pd and Au coated sapphire disk and a polystyrene sphere. Considerable progress has been made towards improving the accuracy of the Casimir force measurements. The results of these measurements have allowed stringent constraints to be calculated on hypothetical forces such as those predicted by supersymmetry, supergravity, and string theory [45–48].
Other important results in the Casimir effect obtained during the last few years are also discussed below (see the review paper [49] and resource letter [50]).

1.5. The structure of the review

In the present review both the theoretical and experimental developments mentioned above are considered in detail. In Section 2 a simplified overview of the subject is provided. The main theoretical concepts used in the theory of the Casimir effect are illustrated here by simple examples, where no technical difficulties arise and all calculations can be performed in closed form. Thus, the concepts of regularization and renormalization are demonstrated for the case of a scalar field on an interval and for the simplest spaces with nontrivial topology. The famous Casimir formula (1.3) for the force between perfectly conducting parallel plates is derived by two methods. The additional effects arising for moving boundaries are considered in two-dimensional space–time. The presentation is designed to be equally accessible to field theorists, specialists in condensed matter and experimentalists.

Section 3 contains the general field-theoretical analyses of regularization and renormalization procedures for the quantized field in an arbitrary quantization domain with boundaries. Here the divergent parts of the vacuum state energy and effective action are found by a combination of heat kernel expansion and zeta-functional regularization. Different representations for the regularized vacuum energy are obtained. The correspondence between the massive and massless cases is discussed in detail.

In Section 4 the Casimir energies and forces in a number of different configurations are calculated, among which are stratified media, rectangular cavities, wedge, sphere, cylinder, sphere (lens) above a disk and others. Different kinds of boundary conditions are considered and possible applications to the bag model of hadrons, Kaluza–Klein field theories, and cosmology are discussed.
Radiative corrections to the Casimir effect are also presented. Both exact and approximate calculation methods are used in Section 4. Some of the obtained results (especially the ones for the stratified media and a sphere above a disk) are of principal importance for the following sections devoted to aspects of condensed matter physics and of the experiments.

Section 5 is devoted to the Casimir force for real media. Here the Casimir force with account of nonzero temperature, finite conductivity of the boundary metal and surface roughness is investigated. The finite conductivity corrections are found both analytically in the framework of the plasma model of metals and numerically using the tabulated optical data for the complex refractive index. Surface roughness is taken into account by means of perturbation theory in powers of a small parameter which relates the effective roughness amplitude to the distance between the boundary surfaces. Special attention is paid to the combined effect of roughness and conductivity corrections, conductivity and temperature corrections, and also of all three factors acting together. It is shown that there are serious difficulties when applying the well-known general expression for the temperature Casimir force [9] in the case of real metals. A line of attack on this problem is advanced.

In Section 6 the experiments on measuring the Casimir force are first reviewed. The presentation begins with the discussion of experimental problems connected with the measuring of small forces and small separations. Different background effects are also considered in detail. The historical experiments on measuring the Casimir force between metals and dielectrics are presented starting from [20]. The major part of Section 6 is devoted to the presentation of the results of modern experiments [40–44] and their comparison with the theoretical results for the Casimir force between real media presented in Section 5. The prospects for further improving the accuracy of Casimir force measurements are outlined.

In Section 7 the reader finds new interesting applications of the Casimir effect for obtaining constraints on the parameters of hypothetical long-range interactions including corrections to Newtonian gravitational law and light elementary particles predicted by the modern theories of fundamental interactions. The constraints following from both the historical and the modern experiments on measuring the Casimir force are presented. They are the best ones in comparison to all the other laboratory experiments in a wide interaction range. With further improvements in the Casimir force measurements the obtained constraints can be strengthened further.

The presentation is organized in such a way that specialists in different fields of physics and also students could restrict their reading to some selected sections. For example, those who are interested in condensed matter aspects of the Casimir effect could read only Sections 1, 2, 4.1.1, 4.3 and 5. Those who are also interested in experimental aspects of the problem may add to this Sections 6 and 7. Except for the purely theoretical Section 3 we preserve in all formulas the fundamental constants $\hbar$ and $c$, as experimentalists usually do, which helps physical understanding.

2. The Casimir effect in simple models

In this section we present the elementary calculation of the Casimir energies and forces for several simple models. These models are mainly low-dimensional ones. Also the classical example of two perfectly conducting planes is considered.
Such important concepts as regularization and renormalization are illustrated here in an intuitive manner readily accessible to all physicists, including nonspecialists in Quantum Field Theory. An introduction to the dynamical Casimir effect is given at the end of the section.

2.1. Quantized scalar field on an interval

We start with a real scalar field $\varphi(t,x)$ defined on an interval $0 < x < a$ and obeying the boundary conditions

\[
\varphi(t,0) = \varphi(t,a) = 0 . \tag{2.1}
\]
This is the typical case where the Casimir effect arises. The simplicity of the situation (one-dimensional space and a one-component field) makes it possible to discuss the problems connected with the calculation of the Casimir force in the most transparent manner. In Section 2.2 a more realistic case of the quantized electromagnetic field between perfectly conducting planes will be considered. The scalar field equation is as usual [5,6]

\[
\frac{1}{c^2}\frac{\partial^2\varphi(t,x)}{\partial t^2} - \frac{\partial^2\varphi(t,x)}{\partial x^2} + \frac{m^2c^2}{\hbar^2}\,\varphi(t,x) = 0 , \tag{2.2}
\]

where $m$ is the mass of the field. The indefinite scalar product associated with Eqs. (2.1) and (2.2) is

\[
(f,g) = i\int_0^a dx\,\bigl(f^{*}\,\partial_{x_0} g - \partial_{x_0} f^{*}\, g\bigr) , \tag{2.3}
\]

where $f$, $g$ are two solutions of (2.2) and $x_0 \equiv ct$. We remind the reader that the scalar field in two-dimensional space–time is dimensionless. It is easy to check that the positive- and negative-frequency solutions of (2.2) obeying boundary conditions (2.1) are as follows:

\[
\varphi_n^{(\pm)}(t,x) = \left(\frac{c}{a\omega_n}\right)^{1/2} e^{\pm i\omega_n t}\sin k_n x , \qquad
\omega_n = \left(\frac{m^2c^4}{\hbar^2} + c^2k_n^2\right)^{1/2} , \quad
k_n = \frac{\pi n}{a} , \quad n = 1,2,\ldots . \tag{2.4}
\]

They are orthonormalized in accordance with the scalar product (2.3):

\[
\bigl(\varphi_n^{(\pm)},\varphi_{n'}^{(\pm)}\bigr) = \mp\delta_{nn'} , \qquad
\bigl(\varphi_n^{(\pm)},\varphi_{n'}^{(\mp)}\bigr) = 0 . \tag{2.5}
\]

We consider here a free field. Soliton-type solutions for the self-interacting field between the boundary points in two-dimensional space–time with different boundary conditions are considered in [51]. Now the standard quantization of the field is performed by means of the expansion

\[
\varphi(t,x) = \sum_n \bigl[\varphi_n^{(-)}(t,x)\,a_n + \varphi_n^{(+)}(t,x)\,a_n^{+}\bigr] , \tag{2.6}
\]

where the quantities $a_n$, $a_n^{+}$ are the annihilation and creation operators obeying the commutation relations

\[
[a_n, a_{n'}^{+}] = \delta_{n,n'} , \qquad [a_n, a_{n'}] = [a_n^{+}, a_{n'}^{+}] = 0 . \tag{2.7}
\]

The vacuum state in the presence of boundary conditions is defined by

\[
a_n|0\rangle = 0 . \tag{2.8}
\]

We are interested in investigating the energy of this state in comparison with the vacuum energy of the scalar field defined on an unbounded axis $-\infty < x < \infty$. The operator of the energy density is given by the 00-component of the energy-momentum tensor of the scalar field in the two-dimensional space–time

\[
T_{00}(x) = \frac{\hbar c}{2}\left\{\frac{1}{c^2}\,[\partial_t\varphi(x)]^2 + [\partial_x\varphi(x)]^2\right\} . \tag{2.9}
\]

Substituting Eq. (2.6) into Eq. (2.9) with account of (2.4), (2.7), and (2.8) one easily obtains

\[
\langle 0|T_{00}(x)|0\rangle = \frac{\hbar}{2a}\sum_{n=1}^{\infty}\omega_n - \frac{m^2c^4}{2a\hbar}\sum_{n=1}^{\infty}\frac{\cos 2k_n x}{\omega_n} . \tag{2.10}
\]



The total vacuum energy of the interval $(0,a)$ is obtained by the integration of (2.10):

\[
E_0(a) = \int_0^a \langle 0|T_{00}(x)|0\rangle\,dx = \frac{\hbar}{2}\sum_{n=1}^{\infty}\omega_n . \tag{2.11}
\]

The second, oscillating term on the right hand side of (2.10) does not contribute to the result. Expression (2.11) for the vacuum state energy of the quantized field between boundaries is the standard starting point in the theory of the Casimir effect. Evidently the quantity $E_0(a)$ is infinite. It can be assigned a meaning by the use of some regularization procedure. There are many such regularization procedures discussed below. Here we use one of the simplest ones, i.e., we introduce an exponentially damping function $\exp(-\delta\omega_n)$ after the summation sign. In the limit $\delta \to 0$ the regularization is removed. For simplicity let us consider the regularized vacuum energy of the interval for a massless field ($m = 0$). In this case

\[
E_0(a,\delta) \equiv \frac{\hbar}{2}\sum_{n=1}^{\infty}\frac{c\pi n}{a}\exp\left(-\frac{\delta c\pi n}{a}\right) = \frac{\pi\hbar c}{8a}\,\sinh^{-2}\frac{\delta c\pi}{2a} . \tag{2.12}
\]

In the limit of small $\delta$ one obtains

\[
E_0(a,\delta) = \frac{\hbar a}{2\pi c\,\delta^2} + E(a) + O(\delta^2) , \qquad E(a) = -\frac{\pi\hbar c}{24a} , \tag{2.13}
\]

i.e., the vacuum energy is represented as a sum of a singular term and a finite contribution. Let us compare result (2.13) with the corresponding result for the unbounded axis. Here instead of (2.4) we have the positive- and negative-frequency solutions in the form of traveling waves

\[
\varphi_k^{(\pm)}(t,x) = \left(\frac{c}{4\pi\omega}\right)^{1/2} e^{\pm i(\omega t - kx)} , \qquad
\omega = \left(\frac{m^2c^4}{\hbar^2} + c^2k^2\right)^{1/2} , \quad -\infty < k < \infty . \tag{2.14}
\]

The sum in the field operator (2.6) is interpreted now as an integral with the measure $dk/2\pi$, and the commutation relations contain delta functions $\delta(k-k')$ instead of the Kronecker symbols. Let us call the vacuum state defined by

\[
a_k|0_M\rangle = 0 \tag{2.15}
\]

the Minkowski vacuum to underline the fact that it is defined in free space without any boundary conditions. Repeating exactly the same simple calculation which was performed for the interval we obtain the divergent expression for the vacuum energy density in Minkowski vacuum

\[
\langle 0_M|T_{00}(x)|0_M\rangle = \frac{\hbar}{2\pi}\int_0^{\infty}\omega\,dk \tag{2.16}
\]

and for the total vacuum energy on the axis

\[
E_{0M}(-\infty,\infty) = \frac{\hbar}{2\pi}\int_0^{\infty}\omega\,dk\;L , \tag{2.17}
\]

where $L \to \infty$ is the normalization length. Let us separate the interval $(0,a)$ of the whole axis whose energy should be compared with (2.11):

\[
E_{0M}(a) = \frac{E_{0M}(-\infty,\infty)}{L}\,a = \frac{\hbar a}{2\pi}\int_0^{\infty}\omega\,dk . \tag{2.18}
\]

To calculate (2.18) we use the same regularization as above, i.e., we introduce the exponentially damping function under the integral. For simplicity, consider once more the massless case

\[
E_{0M}(a,\delta) = \frac{c\hbar a}{2\pi}\int_0^{\infty} k\,e^{-\delta ck}\,dk = \frac{\hbar a}{2\pi c\,\delta^2} . \tag{2.19}
\]

The obtained result coincides with the first term on the right hand side of (2.13). Consequently, the renormalized vacuum energy of the interval $(0,a)$ in the presence of boundary conditions can be defined as

\[
E_0^{ren}(a) \equiv \lim_{\delta\to 0}\,[E_0(a,\delta) - E_{0M}(a,\delta)] = E(a) = -\frac{\pi\hbar c}{24a} . \tag{2.20}
\]
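The subtraction in (2.20) can be checked numerically. The following is a minimal sketch, assuming units with $\hbar = c = a = 1$ and a massless field. The function names are illustrative only. The Gaussian damping function is an extra choice that does not appear in the text; its free-space term $a/(4\pi\delta^2)$ is obtained from the same integral as in (2.18) under that assumption.

```python
import math

def cutoff_sum(delta, damping, a=1.0):
    """Regularized half-sum (1/2) * sum_n omega_n * f(omega_n, delta), m = 0, hbar = c = 1."""
    total = 0.0
    n = 1
    while True:
        omega = math.pi * n / a
        weight = damping(omega, delta)
        if weight < 1e-18:          # remaining terms are negligible
            break
        total += 0.5 * omega * weight
        n += 1
    return total

def exponential(omega, delta):
    # Damping function used in the text, exp(-delta * omega)
    return math.exp(-delta * omega)

def gaussian(omega, delta):
    # Alternative damping function (an assumption made for illustration)
    return math.exp(-(delta * omega) ** 2)

def renormalized_energy(delta, a=1.0):
    """Return E_0(a, delta) - E_0M(a, delta) for the exponential and Gaussian cutoffs."""
    exp_result = cutoff_sum(delta, exponential, a) - a / (2.0 * math.pi * delta ** 2)
    gauss_result = cutoff_sum(delta, gaussian, a) - a / (4.0 * math.pi * delta ** 2)
    return exp_result, gauss_result

for delta in (0.1, 0.03, 0.01):
    print(delta, renormalized_energy(delta), -math.pi / (24.0))
```

As $\delta$ decreases, both differences should approach $-\pi/24$ in these units, while each term separately grows like $\delta^{-2}$.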

It is quite clear that in this simplest case the renormalization corresponds to removing a quantity equal to the vacuum energy of the unbounded space in the given interval. The general structure of the divergences of the vacuum energy will be considered in Section 3.3. The renormalized energy $E(a)$ monotonically decreases as the boundary points approach each other. This points to the presence of an attractive force between the conducting planes,

\[
F(a) = -\frac{\partial E(a)}{\partial a} = -\frac{\pi\hbar c}{24a^2} . \tag{2.21}
\]

In the massive case $m \neq 0$ the above calculations lead to the result

\[
E(a,m) = -\frac{mc^2}{4} - \frac{\hbar c}{4\pi a}\int_{2\mu}^{\infty} dy\,\frac{\sqrt{y^2 - 4\mu^2}}{\exp(y) - 1} \tag{2.22}
\]

with $\mu \equiv mca/\hbar$. Here the first, constant contribution is associated with the total energy of the wall (boundary point). It does not influence the force. For $m = 0$, Eq. (2.22) gives the same result as (2.20). It is possible to find the asymptotic behaviors of (2.22) in the case of small and large $\mu$. Thus, for $\mu \ll 1$ we have

\[
E(a,m) \approx -\frac{mc^2}{4} - \frac{\pi\hbar c}{24a} + \frac{\hbar c\,\mu^2}{2\pi^3 a}\,\ln\mu \tag{2.23}
\]

and for $\mu \gg 1$

\[
E(a,m) \approx -\frac{mc^2}{4} - \frac{\sqrt{\mu}\,\hbar c}{4\sqrt{\pi}\,a}\,e^{-2\mu} . \tag{2.24}
\]
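The distance-dependent term of (2.22) is easy to evaluate numerically. The sketch below is an illustration under stated assumptions, not part of the original calculation: energies are measured in units of $\hbar c/a$, the function name is hypothetical, and the substitution $y = \sqrt{4\mu^2 + s^2}$ is introduced here only to remove the square-root endpoint. The large-$\mu$ comparison uses the leading exponential behavior $\sqrt{\pi\mu}\,e^{-2\mu}$ of the integral.

```python
import math

def distance_dependent_energy(mu, s_max=60.0, steps=60000):
    """Second term of (2.22) in units of hbar*c/a:
    -(1/(4*pi)) * int_{2mu}^inf sqrt(y^2 - 4mu^2) / (e^y - 1) dy.
    With y = sqrt(b^2 + s^2), b = 2*mu, the integrand becomes s^2 / (y * (e^y - 1)).
    """
    b = 2.0 * mu
    h = s_max / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h                  # midpoint rule, never evaluates s = 0
        y = math.sqrt(b * b + s * s)
        total += s * s / (y * math.expm1(y))
    return -total * h / (4.0 * math.pi)

# Massless limit: should reproduce -pi/24 from (2.20)
print(distance_dependent_energy(0.0), -math.pi / 24.0)

# Large mu: compare with the leading exponential behavior
mu = 5.0
leading = -math.sqrt(mu) / (4.0 * math.sqrt(math.pi)) * math.exp(-2.0 * mu)
print(distance_dependent_energy(mu), leading)
```

For $\mu = 5$ the two numbers agree to a few percent; the residual difference is the next term of the large-argument expansion.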



Note that the exponentially small value of the distance-dependent term in the Casimir energy for $\mu \gg 1$ is typical. The same small value is also obtained for parallel planes in three-dimensional space and for fields of higher spin. It is, however, an artifact of plane boundaries. If some curvature is present, either in the boundary or in space–time, the behavior is, generally speaking, in powers of the corresponding geometrical quantity, for example the radius of a sphere. There are only accidental exceptions to this rule, e.g., the case of a three-sphere (see Section 4.6.2). Therefore, it is primarily for plane boundaries and flat space that the Casimir effect for massless fields is larger and more important than that for massive fields.

Evidently, the physical results like those given by Eq. (2.20) or (2.22) should not depend on the chosen regularization procedure. We reserve the detailed discussion of this point for Section 3. It is not difficult, however, to make sure that results (2.20), (2.22) actually do not depend on the specific form of the damping function. Let us, instead of the exponential function used above, use some function $f(\omega,\delta)$ with the following properties: the function $f(\omega,\delta)$ monotonically decreases with increasing $\omega$ or $\delta$ and satisfies the conditions $f(\omega,\delta) \le 1$, $f(\omega,0) = 1$, $f(\omega,\delta) \to 0$ for all $\delta \neq 0$ when $\omega \to \infty$. The independence of the obtained results of the form of $f(\omega,\delta)$ can be most easily demonstrated by the use of the Abel–Plana formula [52]

\[
\sum_{n=0}^{\infty} F(n) - \int_0^{\infty} dt\,F(t) = \frac{1}{2}F(0) + i\int_0^{\infty}\frac{dt}{e^{2\pi t} - 1}\,[F(it) - F(-it)] , \tag{2.25}
\]

where $F(z)$ is an analytic function in the right half-plane. One can substitute for $F(n)$ the frequency $\omega_n$ multiplied by the damping function $f(\omega_n,\delta)$. Then the left hand side of Eq. (2.25) can be interpreted as the difference in the regularized energies in the presence of boundaries and in free space from Eq. (2.25) defining the renormalization procedure. The independence of the integral on the right hand side of (2.25) of the form of $f(\omega_n,\delta)$ follows from the exponentially fast convergence, which permits taking the limit $\delta \to 0$ under the integral (note that the Abel–Plana formula was first applied for the calculation of the Casimir force in [17]). In the case of the Casimir effect for both scalar and spinor fields a modification of (2.25) is useful for the summation over the half-integer numbers

\[
\sum_{n=0}^{\infty} F\!\left(n + \frac{1}{2}\right) - \int_0^{\infty} dt\,F(t) = -i\int_0^{\infty}\frac{dt}{e^{2\pi t} + 1}\,[F(it) - F(-it)] . \tag{2.26}
\]

Other generalizations of the Abel–Plana formula can be found in [24]. The Abel–Plana formulas are used in Sections 2.2, 2.3, 4.1.2, and 4.6.1 to calculate the Casimir energies in different configurations.

2.2. Parallel conducting planes

As was already mentioned in the Introduction, in its simplest case the Casimir effect is the reaction of the vacuum of the quantized electromagnetic field to changes in external conditions like conducting surfaces. The simplest case is that of two parallel perfectly conducting planes with a distance $a$ between them at zero temperature. They provide conducting boundary conditions to the electromagnetic field. These boundary conditions can be viewed as an idealization of the interaction of the metal surfaces with the electromagnetic field. In general, this interaction is much more complicated and is modified by the finite conductivity of the metal (or alternatively the skin depth of the electromagnetic field into the metal) and the surface roughness. But the idealized conducting boundary conditions are a good starting point for the understanding as they provide a complete problem and one that can be easily modified for the case of realistic metals. So it is possible to treat real metals with their finite conductivity and surface roughness as small perturbations (see Section 5). Here we focus on understanding the simple case of ideal metal boundaries. It is well known in classical electrodynamics that both polarizations of the photon field have to satisfy boundary conditions

\[
\mathbf{E}_t|_S = \mathbf{H}_n|_S = 0 \tag{2.27}
\]

on the surface $S$ of perfect conductors. Here $\mathbf{n}$ is the outward normal to the surface. The index $t$ denotes the tangential component which is parallel to the surface $S$. Conditions (2.27) imply that the electromagnetic field can exist outside the ideal conductor only. To proceed we imagine the electromagnetic field as an infinite set of harmonic oscillators with frequencies $\omega_J = c\sqrt{\mathbf{k}^2}$. Here the index of the photon momentum in free space (i.e., without boundaries) is $J = \mathbf{k} = (k_1,k_2,k_3)$ where all $k_i$ are continuous. In the presence of boundaries $J = (k_1,k_2,\pi n/a) = (\mathbf{k}_\perp,\pi n/a)$, where $\mathbf{k}_\perp$ is a two-dimensional vector and $n$ is an integer. In the latter case the frequency is

\[
\omega_J = \omega_{k_\perp,n} = c\,\sqrt{k_1^2 + k_2^2 + \left(\frac{\pi n}{a}\right)^2} . \tag{2.28}
\]

This has to be inserted into the half-sum over frequencies to get the vacuum energy of the electromagnetic field between the plates

\[
E_0(a) = \frac{\hbar}{2}\int\frac{dk_1\,dk_2}{(2\pi)^2}\sum_{n=-\infty}^{\infty}\omega_{k_\perp,n}\,S , \tag{2.29}
\]

where $S \to \infty$ is the area of the plates. In contrast to Eq. (2.11) the sum is over negative integers also, so as to account for the two photon polarizations. The expression obtained is ultraviolet divergent for large momenta. Therefore, we have to introduce some regularization like that discussed in the preceding section. This procedure is well known in quantum field theory and consists in changing the initial expression in such a way that it becomes finite. This change depends on the so-called regularization parameter, and it is assumed that it can be removed formally by taking the limit of this parameter to some appropriate value. Here, we perform the regularization by introducing a damping function of the frequency, which was used in the original paper by Casimir (see also Section 2.1), and the modern zeta-functional regularization. We obtain correspondingly

\[
E_0(a,\delta) = \frac{\hbar}{2}\int\frac{dk_1\,dk_2}{(2\pi)^2}\sum_{n=-\infty}^{\infty}\omega_{k_\perp,n}\,e^{-\delta\omega_{k_\perp,n}}\,S \tag{2.30}
\]

and

\[
E_0(a,s) = \frac{\hbar}{2}\int\frac{dk_1\,dk_2}{(2\pi)^2}\sum_{n=-\infty}^{\infty}\omega_{k_\perp,n}^{1-2s}\,S . \tag{2.31}
\]

These expressions are finite for $\delta > 0$, respectively, for $\mathrm{Re}\,s > \tfrac{3}{2}$, and the limits of removing the regularization are $\delta \to 0$ and $s \to 0$ correspondingly.²

The first regularization has an intuitive physical meaning. As any real body becomes transparent for high frequencies, their contribution should be suppressed in some way, which is provided by the exponential function. The regularization parameter $\delta$ can be viewed as somewhat proportional to the inverse plasma frequency. In contrast, the zeta-functional regularization does not provide such an explanation. Its advantage is more mathematical. The ground state energy in zeta-functional regularization $E_0(a,s)$ is the zeta function of an elliptic differential operator, which is well known in spectral geometry. It is a meromorphic function of $s$ with simple poles on the real axis for $\mathrm{Re}\,s \le \tfrac{3}{2}$. To remove this regularization one has to construct the analytic continuation to $s = 0$. At $s = 0$ it may or may not have a pole (see examples below in Section 3). These properties give the zeta-functional regularization quite important technical advantages and allow the calculations to be simplified considerably. Together with this it must however be stressed that all regularizations must be equivalent, as in the end they must deliver one and the same physical result.

Let us first consider the regularization done by introducing a damping function. The regularized vacuum energy of the electromagnetic field in free Minkowski space–time is given by

\[
E_{0M}(-\infty,\infty) = \frac{\hbar}{(2\pi)^3}\int d^3k\;\omega_k\,e^{-\delta\omega_k}\,LS , \tag{2.32}
\]

where $L \to \infty$ is the length along the $z$-axis which is perpendicular to the plates, $\omega_k = c|\mathbf{k}| = c\sqrt{k_1^2 + k_2^2 + k_3^2}$, $\mathbf{k} = (k_1,k_2,k_3)$. The renormalized vacuum energy is obtained by the subtraction from (2.30) of the Minkowski space contribution in the volume between the plates. After that the regularization can be removed. It is given by (cf. (2.20))

\[
\begin{aligned}
E_0^{ren}(a) &= \lim_{\delta\to 0}\frac{\hbar}{2}\int\frac{dk_1\,dk_2}{(2\pi)^2}\left[\sum_{n=-\infty}^{\infty}\omega_{k_\perp,n}\,e^{-\delta\omega_{k_\perp,n}} - 2a\int\frac{dk_3}{2\pi}\,\omega_k\,e^{-\delta\omega_k}\right] S \\
&= \frac{c\pi\hbar}{a}\,\lim_{\delta\to 0}\int\frac{dk_1\,dk_2}{(2\pi)^2}\left[\sum_{n=0}^{\infty}\sqrt{\frac{k_\perp^2a^2}{\pi^2} + n^2}\;e^{-\delta\omega_{k_\perp,n}} - \int_0^{\infty} dt\,\sqrt{\frac{k_\perp^2a^2}{\pi^2} + t^2}\;e^{-\delta\omega_k} - \frac{k_\perp a}{2\pi}\right] S ,
\end{aligned} \tag{2.33}
\]

where $k_\perp^2 \equiv k_1^2 + k_2^2$, $t \equiv ak_3/\pi$.

² In fact, one has to exclude the mode with $n = 0$ but this does not change the physical result, see below Section 3.


To calculate (2.33) we apply the Abel–Plana formula (2.25) and obtain

\[
E_0^{ren}(a) = -\frac{c\pi^2\hbar}{a^3}\int_0^{\infty} y\,dy\int_y^{\infty}\frac{\sqrt{t^2 - y^2}}{e^{2\pi t} - 1}\,dt\;S , \tag{2.34}
\]

where $y = k_\perp a/\pi$ is the dimensionless radial coordinate in the $(k_1,k_2)$-plane. Note that we could put $\delta = 0$ under the sign of the integrals in (2.34) due to their convergence. Also the signs when rounding the branch points $t_{1,2} = \pm iA$ of the function $F(t) = \sqrt{A^2 + t^2}$ by means of

\[
F(it) - F(-it) = 2i\sqrt{t^2 - A^2} \qquad (t \ge A) \tag{2.35}
\]

were taken into account. To calculate (2.34) finally we change the order of integration and obtain

\[
\begin{aligned}
E_0^{ren}(a) &= -\frac{c\pi^2\hbar}{a^3}\int_0^{\infty}\frac{dt}{e^{2\pi t} - 1}\int_0^{t} y\,\sqrt{t^2 - y^2}\,dy\;S \\
&= -\frac{c\pi^2\hbar}{3a^3}\,\frac{1}{(2\pi)^4}\int_0^{\infty}\frac{dx\,x^3}{e^x - 1}\;S = -\frac{c\pi^2\hbar}{720a^3}\,S .
\end{aligned} \tag{2.36}
\]

The force (1.3) acting between the plates is obtained as the derivative with respect to their distance

\[
F(a) = -\frac{\partial E_0^{ren}(a)}{\partial a} = -\frac{\pi^2\hbar c}{240a^4}\,S . \tag{2.37}
\]
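To get a feeling for the magnitudes implied by (2.36) and (2.37), the following sketch evaluates the energy and force per unit area for ideal plates. The function names and the chosen separations are illustrative assumptions; the constants are the CODATA 2018 values of $\hbar$ and $c$.

```python
import math

HBAR = 1.054571817e-34   # J s (CODATA 2018)
C = 299792458.0          # m/s (exact)

def casimir_energy_per_area(a):
    """E_0^ren / S from (2.36), in J/m^2, for plate separation a in metres."""
    return -math.pi ** 2 * HBAR * C / (720.0 * a ** 3)

def casimir_pressure(a):
    """F / S from (2.37), in Pa; the negative sign means attraction."""
    return -math.pi ** 2 * HBAR * C / (240.0 * a ** 4)

for a in (1.0e-7, 5.0e-7, 1.0e-6):
    print(f"a = {a:.1e} m   E/S = {casimir_energy_per_area(a):.3e} J/m^2   "
          f"F/S = {casimir_pressure(a):.3e} Pa")

# Consistency check of (2.37) against a central finite difference of (2.36)
a0, h = 1.0e-6, 1.0e-9
numeric = -(casimir_energy_per_area(a0 + h) - casimir_energy_per_area(a0 - h)) / (2.0 * h)
print(numeric, casimir_pressure(a0))
```

At a separation of one micrometre the pressure is about 1.3 mPa, and it grows as $a^{-4}$ when the plates are brought closer.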

Now, we demonstrate the calculation of the ground state energy in zeta-functional regularization starting from Eq. (2.31). Using polar coordinates $(k_\perp,\varphi_k)$ in the plane $(k_1,k_2)$ and performing the substitution $k_\perp = y\,(\pi n/a)$ we obtain

\[
E_0(a,s) = \frac{\hbar c}{2\pi}\int_0^{\infty} dy\,y\,(y^2 + 1)^{1/2 - s}\sum_{n=1}^{\infty}\left(\frac{\pi n}{a}\right)^{3-2s} S . \tag{2.38}
\]

Note that we put $s = 0$ in the powers of some constants, e.g., $c$. The integration can be performed easily. The sum reduces to the well known Riemann zeta function ($t = 2s - 3$)

\[
\zeta_R(t) = \sum_{n=1}^{\infty}\frac{1}{n^t} , \tag{2.39}
\]

which is defined for $\mathrm{Re}\,t > 1$, i.e., $\mathrm{Re}\,s > \tfrac{3}{2}$, by this sum. We need, however, the value of $\zeta_R(-3)$ in the limit of removing the regularization $s \to 0$. If we use the definition of $\zeta_R(t)$ according to the right hand side of (2.39) this value evidently diverges. It is known that there exists a meromorphic function with a simple pole in $t = -1$ ($s = 1$) which can be obtained by analytic continuation of the right hand side of Eq. (2.39) to the whole complex plane. Such analytic continuation is unique and well defined, for instance, at the point $t = -3$, although needless to say its values for $\mathrm{Re}\,t \le 1$ are not represented by the right hand side of Eq. (2.39). It can be shown that the use of the value $\zeta_R(-3) = 1/120$ instead of the value of (2.38) which is infinite when $s \to 0$ is equivalent to the renormalization of the vacuum energy under consideration (the reasons for the validity of this statement are presented in Section 3). In this simplest case the value of the analytically continued zeta function can be obtained from the reflection relation

\[
\Gamma\!\left(\frac{z}{2}\right)\pi^{-z/2}\,\zeta_R(z) = \Gamma\!\left(\frac{1-z}{2}\right)\pi^{(z-1)/2}\,\zeta_R(1-z) , \tag{2.40}
\]

where $\Gamma(z)$ is the gamma function, taken at $z = 4$. Substituting $\zeta_R(-3) = 1/120$ into (2.38) and putting $s = 0$ one obtains once more the renormalized physical energy of vacuum (2.36) and the attractive force acting between the plates (2.37).

Here a remark must be added concerning the renormalization. The result which we obtained from the regularization by means of analytical continuation of the zeta function into the complex $s$-plane is finite. This is particular to the case under consideration. In more complicated configurations the result will, in general, be infinite in the limit of removing the regularization so that some additional renormalization is needed. The situation for two plane plates considered above is referred to sometimes as renormalization by zeta-functional regularization. It should be noted that it works in some special cases only.

2.3. One- and two-dimensional spaces with nontrivial topologies

As noted in the Introduction, when the space is topologically nontrivial the boundary conditions are imposed on the quantized field similar to the case of material boundaries. As a consequence, a nonzero vacuum energy appears, though there are no boundaries and therefore no force can act upon them. Let us return to the interval $0 \le x \le a$ and impose boundary conditions

\[
\varphi(t,0) = \varphi(t,a) , \qquad \partial_x\varphi(t,0) = \partial_x\varphi(t,a) , \tag{2.41}
\]

which describe the identification of the boundary points $x = 0$ and $a$. As a result we get the scalar field on a flat manifold with topology of a circle $S^1$ (see Fig. 1).

Fig. 1. Illustration of two flat manifolds with Euclidean topology (a) and topology of a circle (b).

In contrast to (2.1), solutions with $\varphi \neq 0$ at the points $x = 0, a$ are now possible. The orthonormal set of solutions to (2.2), (2.41) can be represented in the following form:

\[
\varphi_n^{(\pm)}(t,x) = \left(\frac{c}{2a\omega_n}\right)^{1/2}\exp[\pm i(\omega_n t - k_n x)] , \qquad
\omega_n = \left(\frac{m^2c^4}{\hbar^2} + c^2k_n^2\right)^{1/2} , \quad
k_n = \frac{2\pi n}{a} , \quad n = 0,\pm 1,\pm 2,\ldots . \tag{2.42}
\]



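Substituting (2.42) into Eqs. (2.6) and (2.9) we obtain the vacuum energy density of a scalar field on $S^1$

\[
\langle 0|T_{00}(x)|0\rangle = \frac{\hbar}{2a}\sum_{n=-\infty}^{\infty}\omega_n . \tag{2.43}
\]

Here, as distinct from Eq. (2.10), no oscillating contribution is contained. The total vacuum energy is

\[
E_0(a,m) = \int_0^a\langle 0|T_{00}(x)|0\rangle\,dx = \frac{\hbar}{2}\sum_{n=-\infty}^{\infty}\omega_n = \hbar\sum_{n=0}^{\infty}\omega_n - \frac{mc^2}{2} . \tag{2.44}
\]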

The renormalization of this infinite quantity is performed by subtracting the contribution of the Minkowski space in accordance with (2.20). The simplest way to perform the calculation of the renormalized vacuum energy without introducing an explicit renormalization function is the use of the Abel–Plana formula (2.25). Substituting (2.42), (2.44) and (2.18) into (2.20) one obtains

\[
\begin{aligned}
E_0^{ren}(a,m) &= \hbar\left\{\sum_{n=0}^{\infty}\omega_n - \frac{a}{2\pi}\int_0^{\infty}\omega(k)\,dk\right\} - \frac{mc^2}{2} \\
&= \frac{2\pi\hbar c}{a}\left\{\sum_{n=0}^{\infty}\sqrt{A^2 + n^2} - \int_0^{\infty}\sqrt{A^2 + t^2}\,dt\right\} - \frac{mc^2}{2}
\end{aligned} \tag{2.45}
\]

with $A \equiv amc/(2\pi\hbar)$, and the substitution $t = ak/(2\pi)$ was made. Now we put $F(t) = \sqrt{A^2 + t^2}$ into Eq. (2.25) and take account of Eq. (2.35). Substituting (2.25) into (2.45), we finally obtain

\[
E_0^{ren}(a,m) = -\frac{4\pi\hbar c}{a}\int_A^{\infty}\frac{\sqrt{t^2 - A^2}}{e^{2\pi t} - 1}\,dt = -\frac{\hbar c}{\pi a}\int_{\mu}^{\infty}\frac{\sqrt{\xi^2 - \mu^2}}{e^{\xi} - 1}\,d\xi , \tag{2.46}
\]

where $\xi = 2\pi t$, $\mu \equiv mca/\hbar = 2\pi A$. Here the constant term describing the energy of a wall like in Eq. (2.22) is absent as there are no walls in space with non-Euclidean topology. In the massless case we have $\mu = 0$, and the result corresponding to (2.20) for the interval reads [19]

\[
E_0^{ren}(a) = -\frac{\hbar c}{\pi a}\int_0^{\infty}\frac{\xi\,d\xi}{\exp(\xi) - 1} = -\frac{\pi\hbar c}{6a} . \tag{2.47}
\]

For $\mu \gg 1$ it follows from (2.46)

\[
E_0^{ren}(a,m) \approx -\frac{\sqrt{\mu}\,\hbar c}{\sqrt{2\pi}\,a}\,e^{-\mu} , \tag{2.48}
\]



i.e., the vacuum energy of the massive field is exponentially small, which also happens in the case of flat spaces. In the one-dimensional case $S^1$ is the only topologically nontrivial manifold. In two-dimensional spaces, i.e., $(2+1)$-dimensional space–times, there exist both flat and curved manifolds with non-Euclidean topology. Below we discuss one example of each. A plane with the topology of a cylinder $S^1 \times R^1$ is a flat manifold. This topology implies that points with Cartesian coordinates $(x + na, y)$, where $n = 0,\pm 1,\pm 2,\ldots$, are identified. For the scalar field $\varphi$ defined on that manifold the following boundary conditions hold:

\[
\varphi(t,0,y) = \varphi(t,a,y) , \qquad \partial_x\varphi(t,0,y) = \partial_x\varphi(t,a,y) . \tag{2.49}
\]

Bearing in mind future applications to curved space–times of different dimensionality we remind the reader of the scalar wave equation in $(N+1)$-dimensional Riemannian space–time

\[
\left(\nabla_k\nabla^k + \xi R + \frac{m^2c^2}{\hbar^2}\right)\varphi(x) = 0 , \tag{2.50}
\]

where $\nabla_k$ is the covariant derivative, and $R$ is the scalar curvature of the space–time, $\xi = (N-1)/4N$, $x = (x_0,x_1,\ldots,x_N)$. This is the so-called "equation with conformal coupling". For zero mass it is invariant under conformal transformations (see [53,54]). The equation with minimal coupling is obtained from (2.50) with $\xi = 0$ for all $N$. The metric energy-momentum tensor is obtained by varying the Lagrangian corresponding to (2.50) with respect to the metric tensor $g_{ik}$. Its diagonal components are [53,54]

\[
\begin{aligned}
T_{ii} = \hbar c\,\Bigl\{ &(1 - 2\xi)\,\partial_i\varphi\,\partial_i\varphi + \bigl(2\xi - \tfrac{1}{2}\bigr)\,g_{ii}\,\partial_k\varphi\,\partial^k\varphi - \xi\,(\varphi\,\nabla_i\nabla_i\varphi + \nabla_i\nabla_i\varphi\;\varphi) \\
&+ \Bigl[\bigl(\tfrac{1}{2} - 2\xi\bigr)\frac{m^2c^2}{\hbar^2}\,g_{ii} - \xi G_{ii} - 2\xi^2 R\,g_{ii}\Bigr]\varphi^2 \Bigr\} ,
\end{aligned} \tag{2.51}
\]

where $G_{ik} = R_{ik} - \tfrac{1}{2}R\,g_{ik}$ is the Einstein tensor, and $R_{ik}$ is the Ricci tensor.

Now let $N = 2$, and the curvature be zero as in the case of $S^1 \times R^1$. It is not difficult to find the orthonormalized solutions to Eq. (2.50) with the boundary conditions (2.49). The whole procedure described above for the case of $S^1$ can be repeated thereafter with the result [55]

\[
E_0^{ren}(a,m) = -\frac{\hbar c}{4\pi a^2}\int_{\mu}^{\infty}\frac{\xi^2 - \mu^2}{\exp(\xi) - 1}\,d\xi\;L , \tag{2.52}
\]

where $L \to \infty$ is the normalization length along the $y$-axis as in Eq. (2.17). Note that this result is valid for an arbitrary value of $\xi$ and not for $\xi = \tfrac{1}{8}$ only. In the massless case the integral in (2.52) is easily calculated with the result

\[
E_0^{ren}(a) = -\frac{\hbar c\,\zeta_R(3)}{2\pi a^2}\,L , \tag{2.53}
\]

where $\zeta_R(z)$ is the Riemann zeta function with $\zeta_R(3) \approx 1.202$. The calculation of the vacuum energy of the scalar field in other topologically nontrivial two-dimensional flat manifolds (a 2-torus, a Klein bottle, a Möbius strip of infinite width) can be found in [24].

We now consider the Casimir effect for a scalar field on a two-dimensional sphere $S^2$ with radius $a$. This is a curved manifold with scalar curvature $R = 2a^{-2}$. In spherical coordinates the space–time metric reads

\[
ds^2 = c^2dt^2 - a^2(d\theta^2 + \sin^2\theta\,d\varphi^2) . \tag{2.54}
\]

The scalar field equation (2.50) with $\xi = \tfrac{1}{8}$, $R = 2a^{-2}$ after the transformations takes the form

\[
\frac{a^2}{c^2}\,\partial_t^2\varphi(x) - \Delta^{(2)}\varphi(x) + \left(\frac{m^2c^2a^2}{\hbar^2} + \frac{1}{4}\right)\varphi(x) = 0 , \tag{2.55}
\]

where $\Delta^{(2)}$ is the angular part of the Laplace operator. The orthonormal set of solutions to Eq. (2.50) obeying periodic boundary conditions in both $\theta$ and $\varphi$ can be represented as

\[
\begin{aligned}
&\varphi_{lM}^{(+)}(t,\theta,\varphi) = \frac{\sqrt{c}}{a\sqrt{2\omega_l}}\,\exp(i\omega_l t)\,Y_{lM}(\theta,\varphi) , \qquad
\varphi_{lM}^{(-)}(t,\theta,\varphi) = \bigl(\varphi_{lM}^{(+)}(t,\theta,\varphi)\bigr)^{*} , \\
&\omega_l = \left[\frac{m^2c^4}{\hbar^2} + \frac{c^2}{a^2}\left(l + \frac{1}{2}\right)^2\right]^{1/2} , \quad l = 0,1,2,\ldots , \quad M = 0,\pm 1,\ldots,\pm l ,
\end{aligned} \tag{2.56}
\]

where $Y_{lM}(\theta,\varphi)$ are the spherical harmonics. Substituting the field operator in the form (2.6) with eigenfunctions (2.56) into Eq. (2.51) we find the still nonrenormalized vacuum energy density

\[
\langle 0|T_{00}(x)|0\rangle = \frac{\hbar}{4\pi a^2}\sum_{l=0}^{\infty}\left(l + \frac{1}{2}\right)\omega_l \tag{2.57}
\]

and the total vacuum energy of the sphere $S^2$ to be

\[
E_0(a,m) = \hbar\sum_{l=0}^{\infty}\left(l + \frac{1}{2}\right)\omega_l . \tag{2.58}
\]

The renormalization is performed according to (2.20), i.e., by introducing a regularization by means of a damping function and subtracting the contribution of the vacuum energy of the three-dimensional Minkowski space–time. The final result, which does not depend on the specific form of the damping function, is most easily obtained by use of the Abel–Plana formula (2.26) for the summation over half-integers, resulting in [56]

\[
E_0^{ren}(a,m) = 2mc^2\left(\frac{mca}{\hbar}\right)^2\int_0^1\frac{\xi\,\sqrt{1 - \xi^2}}{\exp(2\pi mca\,\xi/\hbar) + 1}\,d\xi . \tag{2.59}
\]

It is significant that here the Casimir energy is positive in contrast to the case of a flat manifold considered above. When $\mu = mca/\hbar \ll 1$ it follows from (2.59) that

\[
E_0^{ren}(a,m) \approx \tfrac{1}{3}\,mc^2\mu^2 . \tag{2.60}
\]

By this means, for a massless scalar field the Casimir energy on $S^2$ is equal to zero. In the opposite limiting case $\mu \gg 1$ we have

\[
E_0^{ren}(a,m) \approx \frac{mc^2}{24}\left(1 - \frac{7}{40\mu^2}\right) . \tag{2.61}
\]


As seen from (2.61) the Casimir energy on the surface of $S^2$ diminishes as a power of $\mu^{-1}$, whereas in the examples considered above it was exponentially small for $\mu \gg 1$. This is because the manifold $S^2$ has a nonzero curvature (see Section 3). For a sphere of infinite radius the total Casimir energy of a scalar field on $S^2$ takes the value $E_0^{ren}(m) = mc^2/24$ (see the discussion of the additional normalization condition for $m \to \infty$ in Section 3.4).

2.4. Moving boundaries in a two-dimensional space–time

In the preceding sections we considered the case when the boundaries and boundary conditions are static. The corresponding vacuum energies and forces were also static. If the geometrical configuration and, respectively, the boundary conditions depend on time, the so-called dynamic Casimir effect arises. The most evident manifestation of dynamic behavior is the dependence of the force on time. Let us return to a massless scalar field on an interval $(0,a)$ considered in Section 2.1. Now let the right boundary depend on time: $a = a(t)$. It is obvious that in the first approximation the force also depends on time according to the same law as in (2.21)

\[
F(a(t)) = -\frac{\pi\hbar c}{24a^2(t)} . \tag{2.62}
\]

This result is valid under the condition that the boundary velocity $a'(t)$ is small compared with the velocity of light. The proof of this statement and the calculation of the velocity-dependent corrections to the force can be found in Section 4.4, where more realistic dimensionalities are considered.

The other, and more interesting, manifestation of the dynamic behavior is the creation of particles from vacuum by a moving boundary (this effect was first discussed in [57,58]). The effect of creation of particles from vacuum by nonstationary electric and gravitational fields is well known (see, e.g., [53,54]). As noted above, boundary conditions are idealizations of concentrated external fields. It is not surprising, then, that moving boundaries act in the same way as a nonstationary external field. We outline the main ideas of the effect of particle creation by moving boundaries with the same example of a massless scalar field defined on an interval $(0,a)$ with $a = a(t)$ depending on time when $t > 0$. Instead of (2.1) the boundary conditions now read

\[
\varphi(t,0) = \varphi(t,a(t)) = 0 . \tag{2.63}
\]

For $t < 0$ the orthonormalized set of solutions to Eq. (2.2) with $m = 0$ is given by Eq. (2.4) in which one should substitute $a$ by $a_0 \equiv a(0)$. The field operator, which is understood now as the in-field (i.e., the field defined for $t < 0$ when the boundary point is at rest), is given by Eq. (2.6). For any moment the set of solutions to Eq. (2.2), $\chi_n^{(\pm)}(t,x)$, should satisfy the boundary conditions (2.63) and the initial condition

\[
\chi_n^{(\pm)}(t \le 0, x) = \varphi_n^{(\pm)}(t,x) = \frac{1}{\sqrt{\pi n}}\,e^{\pm i\omega_n t}\sin\frac{\pi n x}{a_0} . \tag{2.64}
\]

The field operator at any moment is given by

\[
\varphi(t,x) = \sum_n\bigl[\chi_n^{(-)}(t,x)\,a_n + \chi_n^{(+)}(t,x)\,a_n^{+}\bigr] . \tag{2.65}
\]

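The functions $\chi_n^{(\pm)}(t,x)$, which are unknown for $t > 0$, can be found in the form of a series (see, e.g., [59,60])

\[
\chi_n^{(-)}(t,x) = \frac{1}{\sqrt{\pi n}}\sum_k\sqrt{\frac{a_0}{a(t)}}\;Q_{nk}(t)\,\sin\frac{\pi k x}{a(t)} \tag{2.66}
\]

with the initial conditions

\[
Q_{nk}(0) = \delta_{nk} , \qquad Q'_{nk}(0) = -i\omega_n\,\delta_{nk} . \tag{2.67}
\]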

Here $Q_{nk}(t)$ are the coefficients to be determined, $n, k = 1,2,3,\ldots$ . The positive-frequency functions $\chi_n^{(+)}$ are obtained as the complex conjugate of (2.66). It is obvious that both the boundary conditions (2.63) and the initial conditions (2.64) are satisfied automatically in (2.66). Substituting Eq. (2.66) into the field equation (2.2) (with $m = 0$) we arrive after conversion at an infinite coupled system of differential equations with respect to the functions $Q_{nk}(t)$ [60]

\[
Q''_{nk}(t) + \omega_k^2(t)\,Q_{nk}(t) = \sum_j\bigl[2\lambda(t)\,h_{kj}\,Q'_{nj}(t) + \lambda'(t)\,h_{kj}\,Q_{nj}(t)\bigr] + \lambda^2(t)\sum_{j,l} h_{jk}\,h_{jl}\,Q_{nl}(t) . \tag{2.68}
\]

Here the following notations are introduced:

\[
\omega_k(t) = \frac{c\pi k}{a(t)} , \qquad \lambda(t) = \frac{a'(t)}{a(t)} , \qquad
h_{kj} = -h_{jk} = (-1)^{k-j}\,\frac{2kj}{j^2 - k^2} , \quad j \neq k . \tag{2.69}
\]

Let the right boundary of the interval return to its initial position $a_0$ after some time $T$. For $t > T$ the right hand side of Eq. (2.68) is equal to zero and the solution with initial conditions (2.67) can be represented as a linear combination of exponents with different frequency signs

\[
Q_{nk}(t) = \alpha_{nk}\,e^{-i\omega_k t} + \beta_{nk}\,e^{i\omega_k t} . \tag{2.70}
\]

This is a familiar situation which is well known in the theory of particle creation from vacuum by a nonstationary external field. Substituting Eqs. (2.66) and (2.70) into the field operator (2.65) we represent it once more as the expansion in terms of the functions $\varphi_k^{(\pm)}$ from (2.4) where $m = 0$, $a = a_0$, but with the new creation and annihilation operators

\[
b_k = \sum_n\sqrt{\frac{k}{n}}\,\bigl(\alpha_{nk}\,a_n + \beta_{nk}^{*}\,a_n^{+}\bigr) \tag{2.71}
\]

and the Hermitian conjugate for $b_k^{+}$.



Eq. (2.71) is the Bogoliubov transformation connecting the in-creation and annihilation operators $a_n^{+}$, $a_n$ with the out-ones $b_n^{+}$, $b_n$. Its coefficients satisfy the equality

\[
\sum_k k\,\bigl(|\alpha_{nk}|^2 - |\beta_{nk}|^2\bigr) = n , \tag{2.72}
\]

which is the unitarity condition (see [53,54] for details). Different vacuum states are defined for $t < 0$ and for $t > T$ due to the nonstationarity of the boundary conditions

\[
a_k|0_{in}\rangle = 0 \ \ \text{for } t < 0 , \qquad b_k|0_{out}\rangle = 0 \ \ \text{for } t > T . \tag{2.73}
\]

The number of particles created in the $k$th mode is equal to the vacuum–vacuum matrix element in the in-vacuum of the out-operator for the number of particles. It is calculated with the help of Eqs. (2.71) and (2.73)

\[
n_k = \langle 0_{in}|b_k^{+}b_k|0_{in}\rangle = k\sum_{n=1}^{\infty}\frac{1}{n}\,|\beta_{nk}|^2 . \tag{2.74}
\]

The total number of particles created by the moving boundary during time $T$ is

\[
N = \sum_{k=1}^{\infty} n_k = \sum_{k=1}^{\infty} k\sum_{n=1}^{\infty}\frac{1}{n}\,|\beta_{nk}|^2 . \tag{2.75}
\]

It should be mentioned that in the original papers [57,58] another method was used to calculate the number of created particles. There the property that in two-dimensional space–time the classical problem with nonstationary boundary conditions can be reduced to a static one by means of conformal transformations was exploited. This method, however, does not work in four-dimensional space–time.

To calculate quantities (2.74) and (2.75) it is necessary to solve system (2.68), which is a rather complicated problem. It is possible, however, to obtain a much simpler system in the case when the boundary undergoes small harmonic oscillations under the condition of a parametric resonance. Let us consider, following [61], the motion of the boundary according to the law

\[
a(t) = a_0\,[1 + \varepsilon\sin(2\omega_1 t)] , \tag{2.76}
\]

where $\omega_1 = c\pi/a_0$, and the nondimensional amplitude of the oscillations is $\varepsilon \ll 1$ (in realistic situations $\varepsilon \sim 10^{-7}$). In the framework of the theory of parametrically excited systems [62] the coefficients $\alpha_{nm}$, $\beta_{nm}$ can be considered as slowly varying functions of time. Substituting (2.70) into (2.68) we neglect all the terms of the order $\varepsilon^2$ and perform the averaging over fast oscillations with the frequencies of the order $\omega_k$. As a result the simplified system (2.68) takes the form [61]

\[
\begin{aligned}
\frac{d\alpha_{n1}}{d\tau} &= -\beta_{n1} + 3\alpha_{n3} , &\qquad \frac{d\alpha_{nk}}{d\tau} &= (k+2)\,\alpha_{n,k+2} - (k-2)\,\alpha_{n,k-2} , \quad k \ge 2 , \\
\frac{d\beta_{n1}}{d\tau} &= -\alpha_{n1} + 3\beta_{n3} , &\qquad \frac{d\beta_{nk}}{d\tau} &= (k+2)\,\beta_{n,k+2} - (k-2)\,\beta_{n,k-2} , \quad k \ge 2 ,
\end{aligned} \tag{2.77}
\]


where the initial conditions are

\[
\alpha_{nk}(0) = \delta_{nk} , \qquad \beta_{nk}(0) = 0 . \tag{2.78}
\]

Here we introduce the so-called "slow time"

\[
\tau = \tfrac{1}{2}\,\varepsilon\,\omega_1 t . \tag{2.79}
\]

Note that even modes are not coupled to the odd modes in (2.77). Due to the initial conditions (2.78)

\[
\beta_{n,2k}(t) = \beta_{2l,k}(t) = 0 , \tag{2.80}
\]

which is to say that the particles are created in odd modes only. The solution of the differential system (2.77) and (2.78) and of the integral equation which is equivalent to it can be found in [61]. Here we present only the final result for the particle creation rate. With the proviso that $\tau \ll 1$ the number of created particles and the creation rate in the lowest mode with $k = 1$ are [61]

\[
n_1(t) \approx \frac{1}{4}\,(\varepsilon\omega_1 t)^2 , \qquad \frac{dn_1(t)}{dt} \approx \frac{1}{2}\,\varepsilon^2\omega_1^2\,t . \tag{2.81}
\]

In the opposite limiting case $\tau \gg 1$ the results are

\[
n_1(t) \approx \frac{4}{\pi^2}\,(\varepsilon\omega_1 t) + \frac{2}{\pi^2}\ln 4 - \frac{1}{2} , \qquad \frac{dn_1(t)}{dt} \approx \frac{4}{\pi^2}\,\varepsilon\,\omega_1 . \tag{2.82}
\]
The total number of created particles in all modes is N (t) ≈ n1 (t) if 91, i.e., the lowest mode alone determines the result. However, if 91 we have N (t) ∼ 92 n1 (t). For the energy of the particles created in the lowest mode one obtains evidently the result !1 n1 (t). The total energy of particles created in all modes is 1 E(t) = !1 knk (t) = !1 sinh2 (29) : (2.83) 4 k
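The limiting formulas (2.81) to (2.83) can be compared numerically. The following is only a minimal sketch: the function names are illustrative, the slow time $\tau$ of Eq. (2.79) is used as the single argument (so that $\epsilon\omega_1 t = 2\tau$), and energies are measured in units of $\omega_1$.

```python
import math

def n1_short(tau):
    """Lowest-mode particle number for tau << 1, Eq. (2.81): (eps*omega1*t)^2 / 4 = tau^2."""
    return tau ** 2

def n1_long(tau):
    """Lowest-mode particle number for tau >> 1, Eq. (2.82), written with eps*omega1*t = 2*tau."""
    return 8.0 * tau / math.pi ** 2 + 2.0 * math.log(4.0) / math.pi ** 2 - 0.5

def energy_over_omega1(tau):
    """Total energy of all modes in units of omega1, Eq. (2.83): sinh^2(2*tau) / 4."""
    return 0.25 * math.sinh(2.0 * tau) ** 2

if __name__ == "__main__":
    for tau in (1e-3, 1e-2, 1.0, 3.0):
        print(tau, n1_short(tau), n1_long(tau), energy_over_omega1(tau))
```

For small $\tau$ the total energy reduces to $\omega_1 n_1(t)$, in line with the statement that the lowest mode alone determines the result, while for large $\tau$ the energy grows exponentially and the lowest-mode particle number only linearly.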

As is seen from Eq. (2.83), the total energy increases faster than the total number of photons, i.e., energy is pumped into the high-frequency modes at the expense of the low-frequency ones. In Section 4.4, where three-dimensional configurations are considered, we discuss the possibility of experimental observation of the photons created by the moving mirrors. Additional factors such as imperfections of the boundary mirrors, the back reaction of the radiated photons upon the mirror, etc., will be discussed. The influence of a detector placed in the cavity will also be touched upon.

3. Regularization and renormalization of the vacuum energy

This section is devoted to the theoretical foundation of the Casimir effect. It contains the general regularization and renormalization procedures formulated in the framework of quantum field theory under the influence of boundary conditions. The divergent part of the vacuum state energy is found in an arbitrary quantization domain. Different representations for the regularized vacuum energy are obtained. The photon propagator in the presence of boundary conditions is presented. The mathematical methods set forth in this section make it possible to calculate the Casimir energies and forces for a variety of configurations and for quantized fields of different spin. Recall that in this section units are used in which $\hbar = c = 1$.

3.1. Representation of the regularized vacuum energy

The basic quantities appearing in quantum field theory in connection with the Casimir effect are the vacuum expectation value of the energy operator in the ground state (vacuum) of the quantum field under consideration and the corresponding effective action. In zeta-functional regularization the ground state energy can be written as a half-sum over the one-particle energies $\epsilon_J$ labeled by some general index $J$:

$$E_0(s) = \pm\frac{\mu^{2s}}{2}\sum_J \epsilon_J^{1-2s}. \tag{3.1}$$

The one-particle energies $\epsilon_J$ are connected, by means of $\epsilon_J^2 = \Lambda_J$, with the eigenvalues of the corresponding one-particle Hamiltonian

$$H\Phi_J(x) = \Lambda_J\Phi_J(x). \tag{3.2}$$

In the case of a first-order (in the derivatives) theory, e.g., for a spinor field, they have to be taken with positive sign, $\epsilon_J \to |\epsilon_J|$. We assume the spectrum to be discrete for the moment. Within the given regularization we are free to introduce an arbitrary constant $\mu$ with the dimension of a mass, which ensures that the energy $E_0(s)$ has the correct dimension. The different signs in Eq. (3.1) correspond to bosonic and fermionic fields, respectively. The mathematical background of the regularization used in (3.1) is the zeta function

$$\zeta_H(s) = \sum_J \Lambda_J^{-s} \tag{3.3}$$

associated with the operator $H$. Then the ground state energy (3.1) reads

$$E_0(s) = \pm\frac{\mu^{2s}}{2}\,\zeta_H\!\left(s - \frac{1}{2}\right). \tag{3.4}$$
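As a simple illustration of Eqs. (3.1) to (3.4), consider a massless scalar field on an interval of length $a$ with Dirichlet boundary conditions and no potential, so that $\Lambda_n = (\pi n/a)^2$, $n = 1, 2, \ldots$ (this specific choice is an assumption made for the example). Then $\zeta_H(s) = (a/\pi)^{2s}\zeta_R(2s)$, and removing the regularization gives $E_0 = (\pi/2a)\,\zeta_R(-1) = -\pi/(24a)$. The sketch below cross-checks the value $\zeta_R(-1) = -1/12$ with an exponential cutoff, which is a different regularization from the zeta-functional one used in the text; the function names are illustrative.

```python
import math

def cutoff_sum(delta):
    """Sum over n >= 1 of n*exp(-delta*n), truncated where the terms are negligible."""
    n_max = int(60.0 / delta)
    return sum(n * math.exp(-delta * n) for n in range(1, n_max + 1))

def finite_part(delta):
    """Subtract the divergent 1/delta^2 term; the remainder tends to -1/12 as delta -> 0."""
    return cutoff_sum(delta) - 1.0 / delta ** 2

def casimir_energy_interval(a):
    """E_0 = (1/2) * (pi/a) * zeta_R(-1) = -pi/(24 a) for the assumed Dirichlet interval."""
    return 0.5 * (math.pi / a) * (-1.0 / 12.0)
```

The cutoff sum behaves as $1/\delta^2 - 1/12 + O(\delta^2)$, so its finite part agrees with the analytically continued zeta function.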

This zeta function is one of the functions defined on an operator's spectrum and is used in field theory as well as in geometry. It is a well investigated object with a clearly defined meaning, especially for hyperbolic pseudo-differential operators. It is known to be a meromorphic function of $s$ and it has simple (sometimes double [63]) poles. For $\mathrm{Re}\, s > d/p$ (where $d$ is the dimension of the manifold and $p$ the order of the differential operator; we consider operators with $p = 2$ only) this function coincides with the sum on the right-hand side of Eq. (3.3).

In a broader context, in quantum field theory one considers, instead of the ground state energy, the effective action $\Gamma$, which is defined by means of

$$\Gamma = -i\ln Z, \tag{3.5}$$

where $Z$ is the vacuum-to-vacuum transition amplitude. It can be represented by the functional integral

$$Z = \int D\Phi\; e^{iS(\Phi)} \tag{3.6}$$

taken over fields satisfying the corresponding boundary conditions. In the context of the ground state energy one usually restricts oneself to an action quadratic in the fields, i.e., to a free theory.³ A typical action reads

$$S(\Phi) = -\frac{1}{2}\int dx\;\Phi(x)\left(\Box + V(x) + m^2\right)\Phi(x), \tag{3.7}$$

where $V(x)$ is a background potential. So integral (3.6) is Gaussian and can be carried out, delivering

$$Z = \left(\det(\Box + V(x) + m^2)\right)^{-1/2} = e^{-(1/2)\,\mathrm{Tr}\ln(\Box + V(x) + m^2)}. \tag{3.8}$$

The box operator may be the usual wave operator or may be more complicated, for example $\Box \to D_\mu D^\mu$ with the covariant derivative $D_\mu = \partial_\mu + ieA_\mu(x)$ in the case of a complex Klein–Gordon field in the background of an electromagnetic potential. The functional integral can also be taken over Grassmann fields describing quantized fermion fields. In that case we have the usual changes in the sign, so that the effective action may be written as

$$\Gamma = \pm\frac{i}{2}\,\mathrm{Tr}\ln(\Box + V(x) + m^2). \tag{3.9}$$

We need to give this expression a more precise meaning. Again, we assume the spectrum of the one-particle Hamiltonian (3.2) to be discrete. Moreover, we assume the background to be static. In that case we can immediately switch to the Euclidean formulation and separate the time dependence by means of a Fourier transform to a momentum $p_0$. Then the effective action (3.9) becomes diagonal. Finally, we introduce the zeta-functional regularization, and the effective action can be written as

$$\Gamma = \mp\frac{\partial}{\partial s}\,\mu^{2s}\int_{-\infty}^{\infty}\frac{dp_0}{2\pi}\sum_J\left(p_0^2 + \Lambda_J\right)^{-s} \quad (s \to 0). \tag{3.10}$$

The integral over $p_0$ may be carried out, and by means of Eq. (3.3) the effective action can be expressed in terms of the zeta function of the operator $H$:

$$\Gamma = \mp\frac{\partial}{\partial s}\,\frac{\mu^{2s}}{\sqrt{4\pi}}\,\frac{\Gamma(s - \frac{1}{2})}{\Gamma(s)}\,\zeta_H\!\left(s - \frac{1}{2}\right). \tag{3.11}$$

In this way, a connection between the ground state energy and the effective potential is established. In general, the physical consequences resulting from both quantities are expected to be the same. This will be seen after the discussion of the renormalization and the corresponding normalization conditions.

The quantities introduced so far have a precise mathematical meaning. However, they have a restricted region of applicability, which is due to the assumption of a discrete spectrum, which is here equivalent to a finite quantization volume. In order to go beyond that we are faced with the infinite Minkowski space contribution in problems with a flat background manifold. A typical example is the Casimir effect for the exterior of a ball. More precisely, let $L$ be the size of the quantization volume. Then the ground state energy (and the zeta function as well) for $L \to \infty$ contains a contribution which does not depend on the background and which, by translational invariance, is proportional to the infinite volume of the Minkowski space. This contribution is independent of the background and thus of the boundary conditions. Because of this, it does not carry any information of interest. In the following, we drop it without changing the notation of the corresponding quantities like $E_0$. Next, there is a contribution which does not depend on $L$ but depends on the background, and this is the one we are interested in. Finally, there are contributions vanishing for $L \to \infty$.

To put this procedure into a mathematical framework it is useful to transform the sum, say in Eq. (3.1), into an integral. A second reason for this is the need to construct the analytic continuation of the zeta function to the left of $\mathrm{Re}\, s > s_0$, which is necessary for removing the regularization. There is no general procedure to do this and we are left with some special assumptions which are, however, still quite general and allow consideration of a wide class of problems. So we assume that the variables separate in the problems considered. The simplest example in this respect is a background depending on one Cartesian coordinate only, e.g., parallel mirrors. The next example is a spherically symmetric background, say boundary conditions on a sphere or a potential $V(r)$. As is known, separation of variables is connected with some symmetry. So the same can be done for a problem with cylindrical symmetry, for a generalized cone, for monopoles and for quite a large number of other problems not considered here.

³ Note that "free" is meant in this context as a free field theory whereby the background potential may be taken into account exactly.
We restrict ourselves to the two simplest, but typical, cases.

3.1.1. Background depending on one Cartesian coordinate

Here we assume the potential to depend on the coordinate $x_3$ only. In that case, by translational invariance, the vacuum energy is proportional to the volume of the directions $x_\perp$ perpendicular to $x_3$ and we have, in fact, to consider the corresponding energy density. With the obvious ansatz $\Phi_J(x) = \exp(ik_\perp x_\perp)\,\Phi_n(x_3)$ the equation for the one-particle energy becomes one-dimensional,

$$H\Phi_n(x_3) = \Lambda_n\Phi_n(x_3) \tag{3.12}$$

with the operator

$$H = k_\perp^2 - \frac{d^2}{dx_3^2} + V(x_3) + m^2. \tag{3.13}$$

It is meaningful to define the pure Schrödinger operator

$$P = -\frac{d^2}{dx^2} + V(x) \tag{3.14}$$

with the eigenvalue problem, which looks like a one-dimensional Schrödinger equation,

$$P\varphi_n(x) = \lambda_n\varphi_n(x), \tag{3.15}$$

so that Eq. (3.1) for the vacuum energy takes the form

$$E_0(s) = \pm\frac{\mu^{2s}}{2}\sum_n\int\frac{dk_1\,dk_2}{(2\pi)^2}\left(\lambda_n + k_\perp^2 + m^2\right)^{1/2-s}. \tag{3.16}$$

Fig. 2. A typical one-dimensional potential; the bound state levels are shown by broken lines.

Here we assume that the potential $V(x)$ tends to zero for $x \to \pm\infty$ (see Fig. 2). Otherwise, if it tends to a nonzero constant, this constant can be absorbed into a redefinition of the mass $m$. The case when it tends to different constants at both infinities can be treated in a similar way and is not considered here. Formula (3.16) can be rewritten by integrating out $k_1$ and $k_2$. By means of the substitution $k_{1,2} \to \sqrt{\lambda_n + m^2}\;k_{1,2}$ the expression factorizes. The $k$-integration reads simply

$$\int\frac{dk_1\,dk_2}{(2\pi)^2}\left(k_\perp^2 + 1\right)^{1/2-s} = \frac{1}{4\pi}\,\frac{1}{s - 3/2}.$$

The integral is defined for $\mathrm{Re}\, s > 3/2$. The analytic continuation to the whole $s$-plane is given by the right-hand side. In more general cases, for instance when the number of dimensions of the directions perpendicular to $x_3$ is odd, a combination of gamma functions results. After this transformation the ground state energy takes the form

$$E_0(s) = \pm\frac{\mu^{2s}}{8\pi}\,\frac{1}{s - 3/2}\sum_n\left(\lambda_n + m^2\right)^{3/2-s}. \tag{3.17}$$

Now, with the discrete eigenvalues $\lambda_n$, Eq. (3.17) is a well defined expression and we could start to construct its analytic continuation to $s = 0$. An example of such a problem is the Casimir effect between two planes with Dirichlet boundary conditions at $x = \pm L$ and no potential, i.e., with $V(x) = 0$ in Eq. (3.14), which was considered in Section 2.2. There, the remaining sum delivered the Riemann zeta function with its well known analytic continuation. But this is not the general case, and in order to separate the translationally invariant part (in the $x_3$-direction) we proceed as follows. We consider the one-dimensional scattering problem on the whole axis ($x \in (-\infty, \infty)$) associated with the operator $P$ (3.15),

$$P\varphi(x) = k^2\varphi(x). \tag{3.18}$$
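The $k$-integration used on the way from Eq. (3.16) to Eq. (3.17) can be checked numerically in its region of convergence. The sketch below assumes real $s \geq 2$ and uses the substitution $k_\perp = \tan\theta$, which turns the radial integral into $\int_0^{\pi/2} d\theta\,\sin\theta\,\cos^{2s-4}\theta$; the helper names are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def k_integral(s):
    """Integral of d^2k/(2 pi)^2 (k^2 + 1)^(1/2 - s), evaluated with k = tan(theta)."""
    def integrand(theta):
        return math.sin(theta) * math.cos(theta) ** (2.0 * s - 4.0)
    # The angular integration contributes 2*pi, leaving a factor 1/(2*pi) overall.
    return simpson(integrand, 0.0, math.pi / 2.0) / (2.0 * math.pi)

def closed_form(s):
    """Right-hand side of the k-integration formula: 1 / (4 pi (s - 3/2))."""
    return 1.0 / (4.0 * math.pi * (s - 1.5))
```

For example, `k_integral(3.0)` and `closed_form(3.0)` can be compared directly; the closed form then provides the continuation to values of $s$ where the integral itself diverges.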

Fig. 3. The complex k-plane.

This scattering problem is well investigated; see, for example, the textbook [64]. We note the following properties. Eq. (3.18) has two linearly independent solutions which can be chosen to have the asymptotics

$$\varphi_1(x) \underset{x\to-\infty}{\sim} e^{ikx} + s_{12}\,e^{-ikx}, \qquad \varphi_1(x) \underset{x\to\infty}{\sim} s_{11}\,e^{ikx},$$

$$\varphi_2(x) \underset{x\to-\infty}{\sim} s_{22}\,e^{-ikx}, \qquad \varphi_2(x) \underset{x\to\infty}{\sim} s_{21}\,e^{ikx} + e^{-ikx}. \tag{3.19}$$

The matrix $s = (s_{ij})$ composed of the coefficients $s_{ij}$ in (3.19) is unitary. From this, for real $k$, the relation

$$s_{11}^2(k) - s_{21}^2(k) = \frac{s_{11}(k)}{s_{11}(-k)} = e^{2i\delta(k)} \tag{3.20}$$

follows, where $\delta(k)$ is the scattering phase. The first solution, $\varphi_1(x)$, describes a wave incident from the left which is scattered by the potential; $t(k) = s_{11}(k)$ is the transmission coefficient, $r(k) = s_{12}(k)$ is the reflection coefficient, and their squared moduli $R = |r(k)|^2$ and $T = |t(k)|^2$ are connected by $R + T = 1$. The second solution, $\varphi_2(x)$, has the same meaning for a wave incident from the right. The function $s_{11}(k)$ is a meromorphic function. Its poles in the upper half-plane (if any) are located on the imaginary axis and correspond to bound states in the potential $V(x)$ with binding energy $\kappa = ik$. Now, we consider the following two linear combinations of the solutions:

$$\varphi_\pm(x) = \varphi_1(x) \pm \varphi_2(x). \tag{3.21}$$

Imposing the boundary conditions $\varphi_\pm(L) = 0$ delivers the discrete eigenvalues $k^2 \to \lambda_n$ as solutions of these equations. In other words, the functions $\varphi_\pm(L)$, considered as functions of $k$, have zeros at $k^2 = \lambda_n$. By means of this we rewrite the sum in (3.17) as an integral,

$$E_0(s) = \frac{1}{8\pi}\,\frac{1}{s - 3/2}\left\{\sum_i\left(-\kappa_i^2 + m^2\right)^{3/2-s} + \int_C\frac{dk}{2\pi i}\left(k^2 + m^2\right)^{3/2-s}\frac{\partial}{\partial k}\ln\left(\varphi_+(L)\,\varphi_-(L)\right)\right\}, \tag{3.22}$$

where the first sum is over the bound states $\kappa_i$ of the potential $V(x)$ and the contour $C$ encloses the continuous spectrum of $P$, i.e., all eigenvalues $\lambda_n$ on the real axis (see Fig. 3). We are interested in the limit of large $L$. Let $\epsilon$ be the imaginary part of the integration variable $k$ in (3.22). We note that $\epsilon > 0$ on the upper half and $\epsilon < 0$ on the lower half of the path $C$. Then, using the asymptotic expansion (3.19), we have

$$\ln\left(\varphi_+(L)\,\varphi_-(L)\right) = \begin{cases} -2ikL + 2\epsilon L - i\pi + O(e^{-\epsilon L}) & (\epsilon > 0), \\[1ex] 2ikL - 2\epsilon L + \ln\left(s_{11}^2(k) - s_{21}^2(k)\right) + O(e^{\epsilon L}) & (\epsilon < 0), \end{cases}$$

correspondingly. The contributions proportional to $L$ constitute the so-called "Minkowski space contribution" and are thrown away. Then we obtain, from the part depending on the background potential, in the limit $L \to \infty$

$$E_0(s) = \frac{1}{8\pi}\,\frac{1}{s - 3/2}\left\{\sum_i\left(-\kappa_i^2 + m^2\right)^{3/2-s} + \int_0^\infty\frac{dk}{2\pi i}\left(k^2 + m^2\right)^{3/2-s}\frac{\partial}{\partial k}\ln\left(s_{11}^2(k) - s_{21}^2(k)\right)\right\}. \tag{3.23}$$

Now there are two ways to proceed. By means of relation (3.20) we obtain

$$E_0(s) = \frac{1}{8\pi}\,\frac{1}{s - 3/2}\left\{\sum_i\left(-\kappa_i^2 + m^2\right)^{3/2-s} + \int_0^\infty\frac{dk}{\pi}\left(k^2 + m^2\right)^{3/2-s}\frac{\partial}{\partial k}\,\delta(k)\right\}, \tag{3.24}$$

where the integration is over the real $k$-axis. Another representation can be obtained using (3.20) in the form $\ln\left(s_{11}^2(k) - s_{21}^2(k)\right) = \ln s_{11}(k) - \ln s_{11}(-k)$ and turning the integration contour towards the positive imaginary axis in the contribution from $\ln s_{11}(k)$ and towards the negative imaginary axis in that from $\ln s_{11}(-k)$. Taking into account the cut resulting from the factor $(k^2 + m^2)^{3/2-s}$, we arrive at the representation

$$E_0(s) = -\frac{1}{8\pi}\,\frac{\cos\pi s}{\pi}\,\frac{1}{s - 3/2}\int_m^\infty dk\left(k^2 - m^2\right)^{3/2-s}\frac{\partial}{\partial k}\ln s_{11}(ik). \tag{3.25}$$

Here the integration is over the imaginary axis. The explicit contribution from the bound states is canceled by the extra terms arising when the integration contour is moved to the imaginary axis. The two representations, (3.24) and (3.25), are connected by the known dispersion relation [65]

$$\ln s_{11}(k) = \frac{-1}{2\pi i}\int_{-\infty}^{\infty}dq\,\frac{\ln\left(1 - |s_{12}(q)|^2\right)}{k - q + i\epsilon} + \sum_i\ln\frac{k + i\kappa_i}{k - i\kappa_i}, \tag{3.26}$$

representing the analytic properties of the scattering matrix.

Now, we return to a problem whose spectrum is discrete from the very beginning. Here we do not need to separate a translationally invariant part, but in order to perform the analytic continuation in $s$ it is useful to rewrite the sum as an integral. Again, as an example we consider the interval $x \in [0, a]$ with Dirichlet boundary conditions, i.e., the Casimir effect between planes. We define a function $s(k)$ such that the solutions of the equation $s(k) = 0$ are the eigenvalues $k^2 \to \lambda_n$. Further, we assume that $s(ik)$ and $s(-ik)$ differ by a factor which is independent of $k$. In the example we can choose $s(k) = \sin ka$. Then by deforming the integration contour as above we arrive just at formula (3.25) with $s$ instead of $s_{11}$. In this way representation (3.25) for the ground state energy is valid in both cases, for discrete and for continuous spectra.

We illustrate this by another simple example. Consider a potential given by a delta function, $V(x) = \alpha\,\delta(x)$. The transmission coefficient is known from quantum mechanics textbooks. It reads $t(k) \equiv s_{11}(k) = (1 - \alpha/2ik)^{-1}$. For $\alpha < 0$ it has a pole on the positive imaginary axis corresponding to the single bound state in the attractive delta potential. The problem of Eq. (3.18) with the delta potential present in the operator $P$ (3.14) can be reformulated as a problem with no potential but with the matching condition

$$\varphi(x - 0) = \varphi(x + 0), \qquad \varphi'(x + 0) - \varphi'(x - 0) = \alpha\,\varphi(0), \tag{3.27}$$

stating that the function is continuous and its derivative has a jump at $x = 0$. In the formal limit $\alpha \to +\infty$ we obtain Dirichlet boundary conditions at $x = 0$. In this sense the delta potential can be viewed as a "semitransparent" boundary condition. We note the corresponding formula if two delta potentials, located at $x = \pm a/2$, are present:

$$t(k) = \left(1 - \frac{\alpha}{2ik} - \frac{\alpha}{2ik}\,e^{ika}\right)^{-1}. \tag{3.28}$$

This problem was considered in [66,67].

3.1.2. Spherically symmetric background

Here we assume the background potential $V(r)$ to depend on the radial variable alone, with the boundary conditions being given on a sphere. For example, Dirichlet boundary conditions read $\Phi(x) = 0$ for $r = R$. With the ansatz

$$\Phi(x) = \frac{1}{r}\,Y_{lm}(\theta, \varphi)\,\varphi_{nl}(r), \tag{3.29}$$

where $Y_{lm}(\theta, \varphi)$ are the spherical harmonics, the equation for the one-particle energies takes a form similar to Eq. (3.12),

$$H\varphi_{nl}(r) = \omega_{nl}\,\varphi_{nl}(r) \tag{3.30}$$

with $H = P + m^2$, and the operator $P$ reads

$$P = -\frac{\partial^2}{\partial r^2} + \frac{l(l+1)}{r^2} + V(r), \tag{3.31}$$

defining the eigenvalue problem

$$P\varphi_{nl}(r) = \lambda_{nl}\,\varphi_{nl}(r). \tag{3.32}$$
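Returning briefly to the single delta potential of Section 3.1.1, the transmission coefficient quoted there can be recovered directly from the matching conditions (3.27) applied to the solution $\varphi_1$ of Eq. (3.19). The following sketch does this for real $k$ and also checks $R + T = 1$, relation (3.20), and the bound-state pole. It relies on $s_{21} = s_{12}$, which holds for this symmetric potential; the function names are illustrative.

```python
import cmath

def inverse_t(k, alpha):
    """1/t(k) = 1 - alpha/(2ik) for V(x) = alpha*delta(x); a zero marks a pole of t."""
    return 1.0 - alpha / (2j * k)

def solve_matching(k, alpha):
    """Solve the matching conditions (3.27) for a wave incident from the left.

    Left of x = 0:  exp(ikx) + r*exp(-ikx);  right of x = 0:  t*exp(ikx).
    Continuity:       1 + r = t
    Derivative jump:  ik*t - ik*(1 - r) = alpha*t
    """
    # Eliminating r = t - 1 gives ik*(2t - 2) = alpha*t, i.e. t = 2ik/(2ik - alpha).
    t = 2j * k / (2j * k - alpha)
    r = t - 1.0
    return t, r

def scattering_phase(k, alpha):
    """delta(k) from exp(2i*delta) = s11(k)/s11(-k), Eq. (3.20), for real k."""
    ratio = inverse_t(-k, alpha) / inverse_t(k, alpha)
    return 0.5 * cmath.phase(ratio)
```

For an attractive potential, e.g. $\alpha = -0.8$, `inverse_t(0.4j, -0.8)` vanishes, which is the pole of $t(k)$ on the positive imaginary axis at $k = -i\alpha/2$.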

The vacuum energy now takes the form

$$E_0(s) = \frac{1}{2}\sum_{l=0}^{\infty}(2l + 1)\sum_n\left(\lambda_{nl} + m^2\right)^{1/2-s}, \tag{3.33}$$

where the factor $(2l + 1)$ accounts for the multiplicity of the eigenvalues $\lambda_{nl}$. Again, we assume the potential $V(r)$ to vanish for $r \to \infty$; otherwise we would have to redefine the mass $m$.

In the case of a continuous spectrum we are faced with the same problem of separating the translationally invariant contribution as in the preceding subsection. Here we consider as the "large box" a sphere of radius $R$ for $R \to \infty$. For this task we consider the scattering problem on the half-axis $r \in [0, \infty)$ associated with the operator $P$, Eq. (3.31),

$$P\varphi_{nl}(r) = k^2\varphi_{nl}(r). \tag{3.34}$$

We need several facts from three-dimensional scattering theory related to the standard partial wave analysis. They can be found in familiar textbooks, see [68] for example, and can be formulated as follows. Let $\varphi^{\rm reg}_{nl}(r)$ be the so-called "regular scattering solution". It is defined as that solution of Eq. (3.34) which for $r \to 0$ becomes proportional to the solution of the free equation, i.e., of the equation with $V = 0$:

$$\varphi^{\rm reg}_{nl}(r) \underset{r\to 0}{\sim} \hat{j}_l(kr), \tag{3.35}$$

where $\hat{j}_l(z) = \sqrt{\pi z/2}\;J_{l+1/2}(z)$ is the Riccati–Bessel function. This solution is known to have for $r \to \infty$ the asymptotic behavior

$$\varphi^{\rm reg}_{nl}(r) \underset{r\to\infty}{\sim} \frac{i}{2}\left(f_l(k)\,\hat{h}^-_l(kr) - f^*_l(k)\,\hat{h}^+_l(kr)\right), \tag{3.36}$$

where $\hat{h}^{\pm}_l(z) = \pm i\sqrt{\pi z/2}\;H^{(1,2)}_{l+1/2}(z)$ are the Riccati–Hankel functions and the coefficients $f_l(k)$ and $f^*_l(k)$ are the "Jost function" and its complex conjugate, respectively. We note the property

$$f_l(-k) = f^*_l(k) \tag{3.37}$$

for real $k$ and the relation to the scattering phases $\delta_l(k)$,

$$\frac{f_l(k)}{f_l(-k)} = e^{-2i\delta_l(k)}. \tag{3.38}$$

Next, we impose Dirichlet boundary conditions on the regular solution, $\varphi^{\rm reg}_{nl}(R) = 0$. Considered as a function of $k$, this equation has the discrete eigenvalues $\lambda_{nl}$ as solutions, $k^2 \to \lambda_{nl}$, and we rewrite the sum in (3.33) as a contour integral,

$$E_0(s) = \frac{1}{2}\sum_{l=0}^{\infty}(2l + 1)\left\{\sum_i\left(-\kappa_{il}^2 + m^2\right)^{1/2-s} + \int_C\frac{dk}{2\pi i}\left(k^2 + m^2\right)^{1/2-s}\frac{\partial}{\partial k}\ln\varphi^{\rm reg}_{nl}(R)\right\}. \tag{3.39}$$
Here the sum over i accounts for the bound states which may be present in the potential V (r). Their binding energy is *il which, in general, depends on the orbital quantum number l. The integration path C is the same as in the preceding section and encloses the positive real axis, see Fig. 3.

34


Now, in order to perform the limit R → ∞, we use the asymptotic expression (3.36). We rewrite it in the form − + ln(fl (k)hˆl (kR) − fl∗ (k)hˆl (kR))

 ∗ (k)hˆ+ (kR)  f −  l  ln fl (k) + ln hˆl (kR) + ln 1 − l ;  −   ˆ fl (k)hl (kR) =

 ˆ−  + f (k) h (kR)  l l ∗  ˆ  :  ln fl (k) + ln hl (kR) + ln −1 + ∗ + fl (k)hˆl (kR)

(3.40)

Keeping in mind the behavior of the Hankel functions for large arguments, $\hat{h}^{\pm}_l(z) \sim \exp(\pm i(z - l\pi/2))$, we use the first representation on the right-hand side on the upper half of the integration path $C$, i.e., for $\mathrm{Im}\,k > 0$, and the second one on the lower half. So in both cases we keep the first contribution in the limit $R \to \infty$. The second one does not depend on the background; it represents the Minkowski space contribution and must be dropped. The third contribution vanishes for $R \to \infty$. So we arrive at the representation

$$E_0(s) = \frac{1}{2}\sum_{l=0}^{\infty}(2l + 1)\left\{\sum_i\left(-\kappa_{il}^2 + m^2\right)^{1/2-s} - \int_0^\infty\frac{dk}{2\pi i}\left(k^2 + m^2\right)^{1/2-s}\frac{\partial}{\partial k}\ln\frac{f_l(k)}{f^*_l(k)}\right\}. \tag{3.41}$$

To proceed we have two choices. By means of Eq. (3.38) we obtain the representation which is analogous to (3.24), with the integration over the real axis:

$$E_0(s) = \frac{1}{2}\sum_{l=0}^{\infty}(2l + 1)\left\{\sum_i\left(-\kappa_{il}^2 + m^2\right)^{1/2-s} + \int_0^\infty\frac{dk}{\pi}\left(k^2 + m^2\right)^{1/2-s}\frac{\partial}{\partial k}\,\delta_l(k)\right\}. \tag{3.42}$$

By turning the integration contour $C$ in (3.41) towards the imaginary axis we obtain with (3.37)

$$E_0(s) = -\frac{\cos\pi s}{2\pi}\sum_{l=0}^{\infty}(2l + 1)\int_m^\infty dk\left(k^2 - m^2\right)^{1/2-s}\frac{\partial}{\partial k}\ln f_l(ik). \tag{3.43}$$

Again, the contributions arising from moving the contour across the poles, which come from the zeros of the Jost function on the imaginary axis, cancel the explicit contributions from the bound states. In fact, representation (3.43) has to be handled with care. For instance, while representations (3.41) and (3.42) are valid for $\mathrm{Re}\,s > 3/2$, in (3.43) $s$ cannot be made too large, since otherwise the integration diverges for $k \to m$. This can be avoided, for example, by encircling the point $k = m$ at some small but finite distance. The advantage of representation (3.43), as well as of (3.25), is that the integrand does not oscillate for large $k$ and that its behavior for large argument can be found quite easily.

To complete these representations of the regularized ground state energy it remains to note that both representations, Eqs. (3.42) and (3.43), are connected by the standard dispersion relation, which reads in this case

$$\ln f_l(ik) = \sum_i\ln\left(1 - \frac{\kappa_{il}^2}{k^2}\right) - \frac{2}{\pi}\int_0^\infty\frac{dq\;q}{q^2 + k^2}\,\delta_l(q). \tag{3.44}$$

In case the spectrum is discrete from the beginning, we do not need to separate a translationally invariant part and can use the sum in Eq. (3.33) directly. However, in order to construct the analytic continuation in $s$ it is useful anyway to transform it into an integral. Consider as an example the Casimir effect inside a conducting sphere. One of the modes of the electromagnetic field obeys Dirichlet boundary conditions at $r = R$. The radial wave functions are Bessel functions like (3.35), and the solutions of the boundary conditions $\hat{j}_l(kR) = 0$ are given by the roots of the Bessel functions: $\sqrt{\lambda_{n,l}} = j_{n,l+1/2}/R$. Although this sum is explicit so far, its analytic continuation is not easy to construct. It is advantageous to obtain an integral representation which can be continued much more easily. The procedure is essentially the same as in the preceding subsection. For definiteness we demonstrate it here on the given example. So consider the ground state energy (3.33). The sum can be rewritten as an integral,

$$E_0(s) = \frac{1}{2}\sum_{l=0}^{\infty}(2l + 1)\int_C\frac{dk}{2\pi i}\left(k^2 + m^2\right)^{1/2-s}\frac{\partial}{\partial k}\ln\left(k^{-(l+1/2)}J_{l+1/2}(kR)\right). \tag{3.45}$$
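The step from a mode sum to a contour integral, as in Eq. (3.45), can be illustrated numerically for $l = 0$, where $k^{-1/2}J_{1/2}(kR) = \sqrt{2/(\pi R)}\,\sin(kR)/k$ and the constant prefactor drops out under $\partial/\partial k$. The sketch below makes several simplifying assumptions that are not part of the text: the contour is a finite rectangle enclosing only the first $N$ roots $k_n = n\pi/R$, the parameter $s = 2$ is chosen so that no analytic continuation is needed, and the half-height of the rectangle stays below $m$ to avoid the branch points at $k = \pm im$. All names are illustrative.

```python
import cmath
import math

def simpson_complex(f, z0, z1, n=4000):
    """Composite Simpson rule for a complex function along the straight segment z0 -> z1."""
    h = (z1 - z0) / n
    total = f(z0) + f(z1)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(z0 + i * h)
    return total * h / 3.0

def mode_sum_direct(N, R=1.0, m=1.0, s=2.0):
    """Sum of (k_n^2 + m^2)^(1/2 - s) over the first N roots k_n = n*pi/R of sin(kR)."""
    return sum(((n * math.pi / R) ** 2 + m * m) ** (0.5 - s) for n in range(1, N + 1))

def mode_sum_contour(N, R=1.0, m=1.0, s=2.0, half_height=0.5):
    """The same sum as a contour integral of g(k) * d/dk ln(sin(kR)/k) * dk/(2*pi*i)."""
    def integrand(k):
        g = (k * k + m * m) ** (0.5 - s)
        # Logarithmic derivative of sin(kR)/k; the -1/k term comes from the factor 1/k.
        dlog = R * cmath.cos(k * R) / cmath.sin(k * R) - 1.0 / k
        return g * dlog
    x0 = 0.5 * math.pi / R          # left edge, between k = 0 and the first root
    x1 = (N + 0.5) * math.pi / R    # right edge, between roots N and N + 1
    h = half_height
    # Counterclockwise rectangle around the segment of the positive real axis.
    corners = [complex(x0, -h), complex(x1, -h), complex(x1, h), complex(x0, h)]
    total = 0j
    for a, b in zip(corners, corners[1:] + corners[:1]):
        total += simpson_complex(integrand, a, b)
    return total / (2j * math.pi)
```

Each root of $\sin(kR)$ contributes a simple pole of the logarithmic derivative with unit residue, so the contour integral reproduces the direct sum up to the quadrature error.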

The zeros of the function in the logarithm in (3.45) are just the roots of the Bessel function, and the contour $C$ must include all of them. Now we note the following technical point. We are free to modify the function within the logarithm by any function which is analytic inside the contour $C$ without changing the integral. Also we note that a constant there does not contribute, due to the derivative $\partial/\partial k$. We used this freedom to introduce the factor $k^{-(l+1/2)}$. The reason is that we want to turn the contour towards the imaginary axis. Then it crosses the point $k = 0$. Without the factor we introduced, the integrand would have a pole there, by means of $\partial/\partial k\,\ln(J_{l+1/2}(kR)) \sim (l + 1/2)/k$, delivering an extra contribution. With the factor introduced this is a regular point and we can move the contour towards the imaginary axis. With $J_\nu(iz) = \exp(i\nu\pi/2)\,I_\nu(z)$, where $I_\nu(z)$ is the modified Bessel function, we obtain just the same expression (3.43) as in the case of a continuous spectrum, with the substitution

$$f_l(ik) \to k^{-(l+1/2)}\,I_{l+1/2}(kR) \tag{3.46}$$

for the Jost function.

We want to conclude this section with another example. First of all, let us consider the exterior of a sphere with Dirichlet boundary conditions. We can use the formulas given above for the related scattering problem. We need the regular solution of the scattering problem (3.34). Because $r \geq R$ in this case, we do not need condition (3.35). The operator $P$ is given by Eq. (3.31) with $V(r) = 0$, and the solution coincides with a free one. There are two independent solutions and we have to choose the linear combination which fits the asymptotic expression (3.36). The solution in this case is just given by the right-hand side of Eq. (3.36). Now we impose the boundary conditions and arrive at the equation $f_l(k)\,\hat{h}^-_l(kR) - f^*_l(k)\,\hat{h}^+_l(kR) = 0$, from which we determine the Jost function as

$$f_l(k) = i\,(kR)^{l+1/2}\,\hat{h}^+_l(kR), \tag{3.47}$$

where the factor in front of the Riccati–Hankel function is chosen in such a way that $f_l(k)$ is regular for $k \to 0$. Frequently, for the modified spherical Riccati–Bessel functions the notations

$$\hat{j}_l(iz) = i^l\,s_l(z) \quad\text{and}\quad \hat{h}^+_l(iz) = (-i)^l\,e_l(z) \tag{3.48}$$

are used. So we can represent the Jost functions (3.46) and (3.47) as

$$f_l(ik) = (kR)^{-(l+1/2)}\,s_l(kR), \qquad f_l(ik) = (kR)^{l+1/2}\,e_l(kR), \tag{3.49}$$

for the problem inside and outside the sphere with radius $R$, respectively. Note that the factors $(kR)^{\pm(l+1/2)}$ compensate each other when the two problems are considered together. In that case we have to add the corresponding vacuum energies and, consequently, the logarithms, so that we obtain the same expression as given by Eq. (3.43) with $f_l(ik) = s_l(kR)\,e_l(kR)$.

3.2. The heat kernel expansion

Here we consider the heat kernel expansion as the most suitable tool to investigate the divergence structure of the vacuum energy. We start from a general operator $P$. Specific examples are those given by Eqs. (3.14) or (3.31) on a manifold $M$. However, the formulas given below are valid in general for elliptic operators, which may also be pseudo-differential operators. Likewise, an extension to first-order operators, e.g. the Dirac operator, is possible, where some special considerations must be taken into account. Roughly speaking, one has to take some quadratic combination of the Dirac operator $D$ such as $D^\dagger D$. Readers interested in more details should refer to the paper [69]. In this and the following two sections we write the formulas for a massive scalar field in order to avoid unnecessary technical details like internal indices. We would like to promote the understanding of the underlying essential ideas, which are the same for all theories. In case the manifold has a boundary, $\partial M$, one has to impose some boundary conditions in order to have a symmetric operator $P$. We consider here Dirichlet and Neumann boundary conditions. Let $\lambda_J$ and $\varphi_J(x)$ be the corresponding eigenvalues and eigenfunctions, respectively. The local heat kernel is then defined by

$$K(x, y|t) = \sum_J\varphi_J(x)\,\varphi^*_J(y)\,e^{-t\lambda_J}. \tag{3.50}$$

It is defined for operators with a discrete and/or continuous spectrum. The global heat kernel can be obtained formally as the trace over the local one,

$$K(t) \equiv \mathrm{Tr}\,K(x, y|t) = \sum_J e^{-t\lambda_J}, \tag{3.51}$$

where the trace means an integral over the manifold, $\mathrm{Tr}\,K(x, y|t) = \int dx\;\mathrm{tr}\,K(x, x|t)$, and tr is over the internal indices, e.g., corresponding to some symmetry group. The heat kernel obeys the heat conduction equation

$$\left(\frac{\partial}{\partial t} + P\right)K(x, y|t) = 0 \tag{3.52}$$

with the initial condition

$$K(x, y|t = 0) = \delta(x - y) \tag{3.53}$$

and describes the diffusion of heat from a pointlike source. It can be defined for a very wide class of operators and on quite general manifolds. The appropriate language is that of differential geometry and geometric objects on a Riemannian manifold equipped with a connection and some endomorphism (scalar background field). However, we do not use this in full generality and restrict the discussion to the minimum necessary for understanding the divergences of the Casimir energy.

The zeta function $\zeta_H(s)$, Eq. (3.3), can be expressed as an integral over the heat kernel. Representing the power of the eigenvalues in (3.3) by

$$A^{-s} = \frac{1}{\Gamma(s)}\int_0^\infty dt\;t^{s-1}\,e^{-tA} \tag{3.54}$$

($\mathrm{Re}\,s > 0$, $\mathrm{Re}\,A > 0$), which is, in fact, an integral representation of the Euler gamma function, and changing the order of summation and integration (this is possible due to the convergence of both for $\mathrm{Re}\,s > s_0$), we obtain with (3.51)

$$\zeta_H(s) = \frac{1}{\Gamma(s)}\int_0^\infty dt\;t^{s-1}\,K(t). \tag{3.55}$$

Now, from an inspection of Eq. (3.55) it is clear that the integral is well behaved for $t \to \infty$ and that possible singularities of the zeta function result only from the behavior of the integrand at the lower limit, $t \to 0$. Therefore, we need information on $K(t)$ for $t \to 0$. This is given by the heat kernel expansion [70], which we note in the form

$$K(t) = \frac{1}{(4\pi t)^{d/2}}\sum_{n=0,1/2,1,\ldots}a_n\,t^n, \tag{3.56}$$

where $d$ is the dimension of the manifold $M$. Eq. (3.56) is an asymptotic expansion for $t \to 0$ and is in general not a convergent series. The coefficients $a_n$ are called heat kernel coefficients. There is, of course, an analogous expansion for the local heat kernel. Let us first consider the heat kernel of a free problem for a scalar field, i.e., without potential or boundary conditions. The operator is $P = -\Delta + m^2$. In that case Eq. (3.52) has the solution

$$K^{(0)}(x, y|t) = \frac{1}{(4\pi t)^{d/2}}\exp\left(-\frac{(x - y)^2}{4t} - t\,m^2\right), \tag{3.57}$$

where $m$ is the mass of the field. This solution can be easily verified by inserting it into Eq. (3.52). The initial condition (3.53) is also satisfied because, for $t \to 0$, the right-hand side of Eq. (3.57) is a representation of the delta function. The idea for the behavior of the heat kernel for $t \to 0$ comes from the observation that the leading ultraviolet divergence in the presence of a background is the same as in free space. For that reason the ansatz for the expansion

$$K(x, y|t) = K^{(0)}(x, y|t)\sum_{n \geq 0}a_n(x, y)\,t^n \tag{3.58}$$

is meaningful. The coefficients $a_n(x, y)$ are called "local heat kernel coefficients".
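The asymptotic character of expansion (3.56) can be made concrete for the simplest flat manifold without boundary, a circle of circumference $L$ with the massless operator $-d^2/dx^2$ (an example chosen here for illustration). Its eigenvalues are $(2\pi n/L)^2$ with integer $n$, and the global heat kernel (3.51) should approach the single term $a_0/(4\pi t)^{1/2}$ with $a_0 = L$ up to exponentially small corrections. A minimal sketch with illustrative names:

```python
import math

def heat_trace_circle(t, L=1.0, n_max=2000):
    """K(t) = sum over integer n of exp(-t*(2*pi*n/L)^2) for -d^2/dx^2 on a circle."""
    k2 = (2.0 * math.pi / L) ** 2
    # n = 0 contributes 1; the terms with +n and -n are equal.
    return 1.0 + 2.0 * sum(math.exp(-t * k2 * n * n) for n in range(1, n_max + 1))

def leading_term(t, L=1.0):
    """The a_0 term of expansion (3.56) for d = 1 with a_0 = L (the volume)."""
    return L / math.sqrt(4.0 * math.pi * t)
```

For $L = 1$ and $t = 0.01$ the two functions agree to better than one part in $10^9$; the difference is of order $\exp(-L^2/4t)$, which is invisible in any power series in $t$.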

The connection between the local and the global coefficients is very simple on a manifold without boundary:

$$a_n = \mathrm{Tr}\,a_n(x, y) = \int_M dx\;\mathrm{tr}\,a_n(x, x). \tag{3.59}$$

In that case, from inserting expansion (3.56) into Eq. (3.52), the recurrence relations

$$(x - y)^i\frac{\partial}{\partial x^i}\,a_0(x, y) = 0,$$

$$\left((x - y)^i\frac{\partial}{\partial x^i} + n + 1\right)a_{n+1}(x, y) = \left(\Delta - V(x)\right)a_n(x, y) \quad (n = 0, 1, 2, \ldots) \tag{3.60}$$

follow, see for example [69]. They allow the determination of these coefficients and their derivatives very easily. As an illustration we note here the first coefficients for a flat manifold with a background potential $V(x)$ in the so-called coincidence limit, i.e., for $y = x$:

$$a_0(x, x) = 1, \qquad a_1(x, x) = -V(x), \qquad a_2(x, x) = -\frac{1}{6}\Delta V(x) + \frac{1}{2}V^2(x). \tag{3.61}$$
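The coincidence-limit coefficients (3.61) can be tested against a case where the heat kernel is known in closed form. The sketch below uses the one-dimensional operator $P = -d^2/dx^2 + \omega^2x^2$ with $m = 0$, for which the standard Mehler formula gives $K(x, x|t) = \sqrt{\omega/(2\pi\sinh 2\omega t)}\,\exp(-\omega x^2\tanh\omega t)$. This formula and the choice of potential are brought in for illustration only and are not part of the text above.

```python
import math

def ratio_exact(x, t, omega=1.0):
    """K(x,x|t) / K0(x,x|t) for P = -d^2/dx^2 + omega^2 x^2, using the Mehler kernel."""
    # K0(x,x|t) = (4*pi*t)^(-1/2) for m = 0, so the prefactors combine to the square root below.
    prefactor = math.sqrt(2.0 * omega * t / math.sinh(2.0 * omega * t))
    return prefactor * math.exp(-omega * x * x * math.tanh(omega * t))

def ratio_from_coefficients(x, t, omega=1.0):
    """1 + a_1*t + a_2*t^2 with a_1 and a_2 from Eq. (3.61) for V = omega^2 x^2."""
    V = omega ** 2 * x ** 2
    laplace_V = 2.0 * omega ** 2
    a1 = -V
    a2 = -laplace_V / 6.0 + 0.5 * V ** 2
    return 1.0 + a1 * t + a2 * t * t
```

The two functions differ only at order $t^3$, so for small $t$ the truncated series (3.58) with the coefficients (3.61) reproduces the exact diagonal heat kernel.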

However, these recurrence formulas do not work on manifolds with boundary or with singular background potentials. The simplest example is a background potential given by a delta function. While the coeTcient a1 is well de3ned, whereas in a2 the delta function appears squared and the expression in (3.61) becomes meaningless. Also, it is well known that on a manifold with boundary there are in addition coeTcients with half-integer numbers and the corresponding powers of t in expansion (3.56). There is a general framework to determine these coeTcients based on the formalism of pseudo-dierential operators which is however not very useful in speci3c calculations. During the past few years progress had been made in a combination of conformal techniques and special case calculations. So for Dirichlet and Robin 4 boundary conditions the coeTcient a5=2 had been calculated for an arbitrarily shaped smooth boundary. For boundaries with symmetries, a sphere or a generalized cone, the coeTcients up to quite high numbers are available. Due to their properties as distributions, the structure of the coeTcients can be best expressed in the smeared form with a test function f(x) as integral over the manifold and over its boundary an (f) = d x f(x)an (x; x) + dSx f(x)bn (x) ; (3.62) M

9M

where bn (x) are the boundary dependent contributions. It is important that these coeTcients are all local in the sense that they may be represented as integrals taking information from the background only at one point. In Eq. (3.62) the volume integrals contribute to coeTcients with integer number only whereas the surface integrals deliver coeTcients with integer and with half-integer numbers. Another remarkable fact is that the coeTcients can be expressed solely in terms of geometric characteristics 4

Footnote 4: Robin boundary conditions on a field $\varphi(x)$ are given by $(\partial/\partial n - h(x))\,\varphi(x)|_{x\in S} = 0$, where $h(x)$ is some function defined on the boundary. For $h = 0$ they turn into Neumann boundary conditions.



of $M$, such as the curvature and its derivatives, and in this way they do not depend on the dimension of the manifold. The first two coefficients are

$$ a_0(f) = \int_M dx\, f, \qquad a_{1/2}(f) = \frac{\sqrt{\pi}}{2}\int_{\partial M} dS_x\, f. \tag{3.63} $$

So $a_0(1)$ is the volume and $(2/\sqrt{\pi})\,a_{1/2}(1)$ is the surface area of the manifold. It follows that these two coefficients do not depend on the details of the background and do not carry any information of interest to us. In particular, for a flat manifold $a_0$ is the so-called Minkowski space contribution. Mostly, it is relevant only in curved space–time, where it enters the renormalization of the cosmological constant. To give an impression of the general structure of the coefficients we note the next one, $a_1$, in the case of Dirichlet boundary conditions:

$$ a_1(f) = \int_M dx\, f(x)\left(-V(x) + \frac{1}{6}R(x)\right) + \int_{\partial M} dS_x \left(\frac{1}{3}f(x)L_{aa} + \frac{1}{2}f_{;N}(x)\right), \tag{3.64} $$

where $R(x)$ is the scalar curvature of $M$, $L_{aa}$ is the trace of the second fundamental form on $\partial M$ and $f_{;N}$ is the normal derivative of $f$. For more details on these quantities the reader should consult a textbook on differential geometry. It is important to notice that all higher coefficients after $a_0$ and $a_{1/2}$ are proportional to the background potential $V(x)$ or to the geometrical quantities just mentioned and their derivatives. It follows, for instance, that for a flat $M$ without boundary or with flat boundary (in particular for parallel planes) all coefficients except $a_0$ vanish. This is also the case in a field theory at finite temperature, where the corresponding manifold is simply $S^1$.

To conclude the discussion of the general properties of the heat kernel coefficients we note that the boundary-dependent contributions $b_n$ to the coefficients inside and outside the boundary are connected by

$$ b_n\big|_{\rm inside} = (-1)^{2n+1}\, b_n\big|_{\rm outside}, \tag{3.65} $$

i.e., the boundary-dependent coefficients with half-integer number are the same on the two sides of the boundary, whereas those with integer number have opposite signs. To illustrate the topic further we list below the heat kernel coefficients for some simple configurations: the Laplace operator with Dirichlet (D) and Neumann (N) boundary conditions on a sphere, and a penetrable sphere (ps) given by a delta function potential, all in three dimensions:

$$
\begin{array}{c|ccc}
 & \mathrm{D} & \mathrm{N} & \mathrm{ps} \\ \hline
a_0 & \frac{4}{3}\pi R^3 & \frac{4}{3}\pi R^3 & \\
a_{1/2} & -2\pi^{3/2}R^2 & 2\pi^{3/2}R^2 & 0 \\
a_1 & \pm\frac{8}{3}\pi R & \pm\frac{16}{9}\pi R & -4\pi\alpha R \\
a_{3/2} & -\frac{1}{6}\pi^{3/2} & \frac{7}{6}\pi^{3/2} & \pi^{3/2}\alpha^2 \\
a_2 & \mp\frac{16}{315}\frac{\pi}{R} & \pm\frac{16}{9}\frac{\pi}{R} & -\frac{2}{3}\frac{\pi\alpha^3}{R}
\end{array}
\tag{3.66}
$$



Here the upper (lower) sign corresponds to the interior (exterior) region (except for $a_0$, which is infinite in the exterior region) and $\alpha$ is the coefficient in front of the delta function. Note that for the penetrable sphere there is no subdivision into interior and exterior regions.

The calculation of the heat kernel coefficients is quite complicated, but a large number of them are now known. For manifolds without boundary the most complete calculation is done in [71,72], see also [73]. There is even a computer program provided for that in [74]. The coefficients for Dirichlet and Neumann boundary conditions can be found in [75]. For a general surface the coefficients are given in [76–80]. In [81] the coefficients for a d-dimensional sphere with Dirichlet and Robin boundary conditions are given up to n = 10. The coefficients for the penetrable sphere were first calculated in [82] and generalized to an arbitrary penetrable surface in [83]. The coefficients for the dielectric ball are given in [82]. They do not have an explicit expression. For a particular example see Eq. (3.86) below.

3.3. The divergent part of the ground state energy

We start with the representation

$$ E_0(s) = \pm\frac{\mu^{2s}}{2}\sum_J (\lambda_J + m^2)^{1/2-s} \tag{3.67} $$

of the ground state energy in zeta-functional regularization. Here we have shown the dependence on the mass explicitly, and $\lambda_J$ are the eigenvalues of the corresponding operator, such as $P$ in Eq. (3.31). In parallel we consider the frequency-cutoff regularization

$$ E_0(\delta) = \pm\frac{1}{2}\sum_J (\lambda_J + m^2)^{1/2}\, e^{-\delta(\lambda_J + m^2)^{1/2}}, \tag{3.68} $$

where convergence is achieved by the exponential damping function and we have to put $\delta = 0$ at the end. Both regularizations, like many others, have their own advantages and limitations. The zeta-functional regularization is the most elegant known regularization, with pleasant mathematical properties. However, one has to take care not to have modes with zero eigenvalue, $\lambda_J = 0$, in a massless theory, because $0^{-s}$ is ill-defined. Here the mass can serve as an intermediate infrared regulator. The cutoff regularization (3.68) has the physically very intuitive meaning of cutting off at frequencies $\sim 1/\delta$, where any real mirror becomes transparent.

With both regularizations an arbitrariness comes in. In the zeta-functional regularization there is the arbitrary parameter $\mu$ with the dimension of a mass, which can be introduced freely (in the limit $s \to 0$, i.e., when the regularization is formally removed, it disappears). It gives the regularized ground state energy the correct dimension. In the cutoff regularization the parameter $\delta$ (it has the dimension of an inverse mass) can be multiplied by any finite, positive number. For physical reasons, all regularizations must deliver the same end result. How to achieve this is the subject of the normalization condition to be discussed below. To conclude the general considerations, we note that in introducing a regularization one has to take care that it does in fact remove the divergences, i.e., that it delivers a mathematically well-defined expression.
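To make the role of the cutoff parameter in Eq. (3.68) concrete, the following sketch evaluates the sum for a simple spectrum chosen here purely for illustration: a massless scalar field on an interval of length $a$ with Dirichlet conditions, i.e. $\lambda_n = (\pi n/a)^2$, $m = 0$, upper sign. For this assumed spectrum the sum is geometric and can be done in closed form, and for small $\delta$ it behaves as $a/(2\pi\delta^2) - \pi/(24a)$: a divergent term proportional to the length plus a finite remainder. All function names are hypothetical.

```python
from math import exp, pi


def cutoff_energy_sum(a, delta, n_max=20000):
    """Direct evaluation of Eq. (3.68), upper sign, for omega_n = pi*n/a and m = 0."""
    return 0.5 * sum((pi * n / a) * exp(-delta * pi * n / a)
                     for n in range(1, n_max + 1))


def cutoff_energy_closed(a, delta):
    """Closed form of the same sum, using sum_n n*exp(-n*x) = exp(-x)/(1-exp(-x))**2."""
    x = delta * pi / a
    return (pi / (2 * a)) * exp(-x) / (1 - exp(-x)) ** 2


def cutoff_energy_asymptotic(a, delta):
    """Small-delta behaviour: divergent piece plus the delta-independent remainder."""
    return a / (2 * pi * delta ** 2) - pi / (24 * a)


a, delta = 1.0, 0.01
print(cutoff_energy_sum(a, delta))
print(cutoff_energy_closed(a, delta))
print(cutoff_energy_asymptotic(a, delta))
```

The default `n_max` is adequate only when $\delta\pi\,n_{\max}/a \gg 1$; for smaller $\delta$ it has to be raised accordingly.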



We want to study the divergent part of the ground state energy by making use of the heat kernel expansion. For the ground state energy in zeta-functional regularization (3.67) we use formula (3.54) and obtain

$$ E_0(s) = \pm\frac{\mu^{2s}}{2}\sum_J \int_0^\infty \frac{dt}{t}\, \frac{t^{s-1/2}}{\Gamma(s-\frac{1}{2})}\, e^{-t(\lambda_J + m^2)}. \tag{3.69} $$

Due to the convergence for ${\rm Re}\, s > \frac{3}{2}$, the sum and the integral on the right-hand side may be interchanged, and by means of the heat kernel

$$ K(t) = \sum_J e^{-t\lambda_J}, \tag{3.70} $$

where the mass is not included in the heat kernel (in contrast to Eq. (3.51)), we obtain

$$ E_0(s) = \pm\frac{\mu^{2s}}{2}\int_0^\infty \frac{dt}{t}\, \frac{t^{s-1/2}}{\Gamma(s-\frac{1}{2})}\, K(t)\, e^{-tm^2}. \tag{3.71} $$

Now, as we are interested in the divergent part of the ground state energy, we can insert the heat kernel expansion into the right-hand side. Then the $t$-integration can be carried out simply by applying formula (3.54), and we arrive at

$$ E_0(s) = \pm\frac{\mu^{2s}}{2}\sum_{n\ge 0} \frac{a_n}{(4\pi)^{3/2}}\, \frac{\Gamma(s+n-2)}{\Gamma(s-\frac{1}{2})}\, m^{2(2-s-n)}. \tag{3.72} $$

From this expression it is seen by inspection that only the coefficients with numbers $n \le 2$ contribute to divergences at $s = 0$. We take these contributions and define the divergent part as the sum of the nonvanishing terms for $s \to 0$ resulting from the heat kernel coefficients with numbers $n \le 2$:

$$
\begin{aligned}
E_0^{\rm div} = {} & -\frac{m^4}{64\pi^2}\left(\frac{1}{s} + \ln\frac{4\mu^2}{m^2} - \frac{1}{2}\right) a_0 - \frac{m^3}{24\pi^{3/2}}\, a_{1/2} \\
& + \frac{m^2}{32\pi^2}\left(\frac{1}{s} + \ln\frac{4\mu^2}{m^2} - 1\right) a_1 + \frac{m}{16\pi^{3/2}}\, a_{3/2} \\
& - \frac{1}{32\pi^2}\left(\frac{1}{s} + \ln\frac{4\mu^2}{m^2} - 2\right) a_2 .
\end{aligned}
\tag{3.73}
$$

In fact, it also contains some contributions that are finite at $s = 0$. Furthermore, we observe that for dimensional reasons $E_0^{\rm div}$ contains only nonnegative powers of the mass and that all terms of $E_0^{\rm div}$ are of this type.

A similar procedure can be applied to the cutoff regularization (3.68). It is technically a bit more involved. First of all let us consider the integral

$$ I(s) \equiv \int_0^\infty d\delta\, \delta^s\, E_0(\delta). \tag{3.74} $$
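As a cross-check on the zeta-regularized result (3.73), the individual terms of the sum (3.72) can be evaluated numerically at a small but nonzero $s$ and compared with the corresponding pole-plus-finite parts. The sketch below does this for the upper sign and with each $a_n$ set to 1; the parameter values and function names are arbitrary choices made for this illustration, and the agreement is expected only up to corrections of order $s$.

```python
from math import gamma, log, pi


def term_exact(n, s, m, mu):
    """n-th term of Eq. (3.72) (upper sign) with the coefficient a_n set to 1."""
    return (0.5 * mu ** (2 * s) / (4 * pi) ** 1.5
            * gamma(s + n - 2) / gamma(s - 0.5)
            * m ** (2 * (2 - s - n)))


def term_div(n, s, m, mu):
    """Coefficient multiplying a_n in Eq. (3.73), for n = 0, 1/2, 1, 3/2, 2."""
    pole_log = 1 / s + log(4 * mu ** 2 / m ** 2)
    table = {
        0: -m ** 4 / (64 * pi ** 2) * (pole_log - 0.5),
        0.5: -m ** 3 / (24 * pi ** 1.5),
        1: m ** 2 / (32 * pi ** 2) * (pole_log - 1),
        1.5: m / (16 * pi ** 1.5),
        2: -(pole_log - 2) / (32 * pi ** 2),
    }
    return table[n]


s, m, mu = 1e-5, 1.3, 0.7  # arbitrary test values
for n in (0, 0.5, 1, 1.5, 2):
    print(n, term_exact(n, s, m, mu), term_div(n, s, m, mu))
```

For $n > 2$ the same `term_exact` stays finite as $s \to 0$ and carries a negative power of $m$, which is why those terms are not part of the divergent piece.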



Using Eq. (3.68) and formula (3.54) we obtain

$$ I(s) = \Gamma(s+1)\, \frac{1}{2}\sum_J (\lambda_J + m^2)^{-s/2} = \frac{1}{2}\,\Gamma(s+1)\, \zeta_P\!\left(\frac{s}{2}\right). \tag{3.75} $$

So $I(s)$ is proportional to the corresponding zeta function of half the argument. Using a Mellin transform we obtain from Eqs. (3.74) and (3.75)

$$ E_0(\delta) = \frac{1}{2}\int_{-i\infty}^{i\infty} \frac{ds}{2\pi i}\, \delta^{-s-1}\, \Gamma(s+1)\, \zeta_P\!\left(\frac{s}{2}\right), \tag{3.76} $$

which expresses the ground state energy in cutoff regularization as an integral of Mellin–Barnes type over the zeta function. Here the integration path runs parallel to the imaginary axis, to the right of the poles of the integrand. Being interested in the divergent part, i.e., in the behavior for $\delta \to 0$, we move the integration path to the left and collect the contributions that result from crossing the poles of the integrand. These poles are known to result from the heat kernel expansion inserted into representation (3.55) of the zeta function. We obtain

$$
\begin{aligned}
E_0(\delta) &= \frac{1}{2}\int_{-i\infty}^{i\infty} \frac{ds}{2\pi i}\, \delta^{-s-1}\, \Gamma(s+1) \int_0^\infty \frac{dt}{t}\, \frac{t^{s/2}}{\Gamma(s/2)} \sum_{n\ge 0} \frac{a_n t^n}{(4\pi t)^{3/2}}\, e^{-tm^2} \\
&= \frac{1}{16\pi^{3/2}} \int_{-i\infty}^{i\infty} \frac{ds}{2\pi i}\, \delta^{-s-1}\, \frac{\Gamma(s+1)}{\Gamma(s/2)} \sum_{n\ge 0} a_n\, \Gamma\!\left(\frac{s-3}{2} + n\right) m^{3-s-2n}.
\end{aligned}
\tag{3.77}
$$
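The pole structure of the integrand in Eq. (3.77) can be tabulated mechanically, since it is fixed by the three gamma functions alone: $\Gamma(z)$ has simple poles at $z = 0, -1, -2, \dots$ and no zeros, so $1/\Gamma(s/2)$ contributes zeros. The sketch below counts, for each heat kernel coefficient $a_n$ and each integer $s$, the net pole order of $\Gamma(s+1)\,\Gamma((s-3)/2+n)/\Gamma(s/2)$. It is an illustrative bookkeeping aid with hypothetical function names, not a computation of the residues themselves.

```python
from fractions import Fraction


def gamma_pole_order(z):
    """Return 1 if Gamma(z) has a pole at z (z = 0, -1, -2, ...), otherwise 0."""
    return 1 if z.denominator == 1 and z <= 0 else 0


def integrand_pole_order(s, n):
    """Net pole order at s of Gamma(s+1) * Gamma((s-3)/2 + n) / Gamma(s/2).

    A value of 0 means regular (or a pole cancelled by a zero),
    a negative value means the integrand has a zero there.
    """
    s, n = Fraction(s), Fraction(n)
    order = gamma_pole_order(s + 1)              # poles of Gamma(s+1)
    order += gamma_pole_order((s - 3) / 2 + n)   # poles of Gamma((s-3)/2 + n)
    order -= gamma_pole_order(s / 2)             # zeros of 1/Gamma(s/2)
    return order


for n in (0, Fraction(1, 2), 1, Fraction(3, 2), 2):
    row = {s: integrand_pole_order(s, n) for s in (3, 2, 1, 0, -1)}
    # A pole at s multiplies delta**(-s-1); a double pole adds a factor ln(delta).
    print(f"a_{n}: {row}")
```

Simple poles at $s = 3, 2, 1$ appear for $a_0$, $a_{1/2}$ and $a_0, a_1$ respectively, nothing survives at $s = 0$, and the double poles at $s = -1$ occur for $a_0$, $a_1$ and $a_2$.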

From this representation it is seen that there are poles at $s = 3, 2, 1$. There is no pole at $s = 0$ due to the gamma function in the denominator. Furthermore, there is a double pole at $s = -1$. The poles to the left of $s = -1$ contribute positive powers of $\delta$ and we do not need to consider them. Calculating the contributions from the poles at $s = 3, 2, 1, -1$ we obtain

$$ E_0^{\rm div}(\delta) = \frac{1}{16\pi^2}\left[\left(\frac{24}{\delta^4} - \frac{2m^2}{\delta^2} + \frac{m^4}{2}\ln\delta\right) a_0 + \frac{4\sqrt{\pi}}{\delta^3}\, a_{1/2} + \left(\frac{2}{\delta^2} - m^2\ln\delta\right) a_1 + (\ln\delta)\, a_2\right]. \tag{3.78} $$

From this formula it is seen that all contributions are divergent, including that from the coefficient $a_{1/2}$. Moreover, these divergent contributions are also present in the massless case, so that their absence in the zeta-functional regularization (except for $a_2$) is due to the specific form of that regularization.

3.4. Renormalization and normalization condition

In order to perform the renormalization we need a quantity to be renormalized. In the framework of perturbative quantum field theory there are the bare constants (masses, couplings, etc.), which become renormalized by the corresponding counter-terms, a well-known procedure. In the case of the vacuum energy (as well as other vacuum expectation values) the corresponding quantities are the classical background fields that the vacuum expectation values depend on.



The simplest example to be considered is a scalar field theory with a classical background field $\Phi(x)$ and a quantum field $\varphi(x)$ with the action

$$ S = \frac{1}{2}\int dx\, \left\{\Phi(x)\left(\Box + M^2 + \lambda\Phi^2(x)\right)\Phi(x) + \varphi(x)\left(\Box + m^2 + \lambda'\Phi^2(x)\right)\varphi(x)\right\}, \tag{3.79} $$

where $V(x) = \lambda'\Phi^2(x)$ is the background potential that is coupled to the quantum field. Now there is a classical energy associated with the background,

$$ E^{\rm class} = \frac{1}{2}\int d^3x\, \left((\nabla\Phi(x))^2 + M^2\Phi^2(x) + \lambda\Phi^4(x)\right), \tag{3.80} $$

where we assumed the background to be static. Furthermore, we have the ground state energy of the quantum field, which is given, e.g., in zeta-functional regularization by Eq. (3.67), and there is the complete energy of the system as a whole, which is the sum of these two. The renormalization procedure consists simply in subtracting the divergent part of the ground state energy, e.g., as given by Eq. (3.73), from the ground state energy and adding it to the classical energy:

$$ E = \underbrace{E^{\rm class} + E_0^{\rm div}}_{\tilde{E}^{\rm class}} + \underbrace{E_0 - E_0^{\rm div}}_{E_0^{\rm ren}} \equiv \tilde{E}^{\rm class} + E_0^{\rm ren}. \tag{3.81} $$

The change from $E^{\rm class}$ to $\tilde{E}^{\rm class}$ can be interpreted as a renormalization of the parameters of the classical system. In the first model it reads

$$
\begin{aligned}
M^2 &\to M^2 - \frac{\lambda' m^2}{16\pi^2}\left(\frac{1}{s} + \ln\frac{4\mu^2}{m^2} - 1\right), \\
\lambda &\to \lambda - \frac{\lambda'^2}{64\pi^2}\left(\frac{1}{s} + \ln\frac{4\mu^2}{m^2} - 2\right).
\end{aligned}
\tag{3.82}
$$

The divergence associated with $a_0$ in Eq. (3.73) would lead to a renormalization of a constant added to the classical energy. As noted above, we drop such a contribution. This procedure works, in principle, for any background field. Some of the structures in the classical action may be missing when the corresponding heat kernel coefficient is zero. Other coefficients may be present, for instance those with half-integer numbers in the case of a singular background field, say one containing a delta function as considered, e.g., in [82]. The best known example is semiclassical gravity. In this connection it is interesting to remark that in (3.82) we observe a renormalization of the self-coupling $\lambda$ of the background field, which in this sense is an inevitable part of the classical action. The terms to be included for renormalization are determined primarily by dimensional reasons. In the same manner one needs to include the contributions quadratic in the curvature in the left-hand side of the Einstein equation.

We would like to stress that the interpretation we can give to the vacuum expectation values is very important. In general, they must be viewed as a quantum correction to the classical system, given, e.g., by $\Phi(x)$ in (3.80). This system must have its own dynamics, from which its characteristics like $M$ or $\lambda$ may be determined, or there must be additional, say experimental, knowledge of them. In any case they cannot be determined from the ground state energy, i.e., from calculating quantum corrections to them. So we are left with the question how



to give a unique meaning to the vacuum expectation values. As we have seen, some arbitrariness creeps in when a regularization is introduced. In fact, this is known from perturbative quantum field theory. There, e.g., in QED, observable quantities such as masses or couplings provide unique normalization conditions. For vacuum expectation values viewed as quantum corrections to some classical system there is also a natural normalization condition. One has to require that in the limit of a large mass of the quantum field its vacuum expectation value must vanish,

$$ E^{\rm ren} \to 0 \quad \text{for} \quad m \to \infty. \tag{3.83} $$

This condition is natural since a field in the limit of infinite mass should not have quantum fluctuations. From the technical point of view this condition completely removes the arbitrariness of the renormalization procedure. The reason is that all divergent contributions come with nonnegative powers of the mass. This follows from dimensional reasons together with the well-known fact that the heat kernel expansion is also an adiabatic expansion in the limit of a large mass.

There is no general normalization condition for a massless field, however. As shown in [82], a ground state energy normalized according to (3.83) does not have a finite limit for $m \to 0$ except in the case of a vanishing heat kernel coefficient $a_2$. Therefore, in the case $a_2 \ne 0$, the ground state energy of a massless field cannot be uniquely defined and, hence, it is physically meaningless. Below we will discuss some examples of this case. It is necessary to stress that it is just the coefficient $a_2$ which becomes problematic. This is because the nonuniqueness comes from the logarithmic contributions. Moreover, in the zeta-functional regularization for a massless field it is only the contribution from $a_2$ which is present (see Eq. (3.73) for $m = 0$). In another regularization, e.g., the frequency cutoff, Eq. (3.78), again at $m = 0$, the only logarithmic contribution is the one proportional to $a_2$.

Some special considerations are required for the computation of the ground state energy in the presence of boundaries, as in the Casimir effect. Here we have two different situations. First, let us consider a setup with two distinct bodies with boundary conditions on their surfaces and assume we are interested only in the dependence of the ground state energy on the distance between these bodies. This is the typical situation for measurements. In that case the heat kernel coefficients (except for $a_0$, which is of no interest here) have only surface contributions.
Now, the surface $\partial M$ consists of two parts, $\partial M = \partial M_1 \cup \partial M_2$, corresponding to the two bodies, and the heat kernel coefficients become a sum of two contributions according to

$$ a_n = \int_{\partial M_1} dS\, b_n^{(1)} + \int_{\partial M_2} dS\, b_n^{(2)}. \tag{3.84} $$

Due to the locality of the coefficients there is no dependence on the distance between the bodies and, consequently, there are no distance-dependent singularities or ambiguities.

Now we consider the vacuum energy for boundary conditions given on one body, so that the energy itself has a certain meaning. One can think, for example, of a conducting sphere, where the dependence of the ground state energy on the radius defines a pressure. In that case, as discussed, e.g., in [29], one has to introduce a classical energy associated with the geometrical



configuration under consideration. So for a spherical surface one has to consider

$$ E^{\rm class} = pV + \sigma S + FR + k + \frac{h}{R}, \tag{3.85} $$

in accordance with the dependence of the heat kernel coefficients (3.66) on the radius $R$. Here $V = \frac{4}{3}\pi R^3$ is the volume and $S = 4\pi R^2$ is the surface area of the sphere. Correspondingly, $p$ is the pressure and $\sigma$ is the surface tension. The parameters $F$, $k$ and $h$ do not have a special meaning. Now the addition of $E_0^{\rm div}$ to $E^{\rm class}$ in (3.81) can be reformulated as a redefinition of the corresponding constants $p, \sigma, \dots$. With regard to the normalization conditions the same considerations apply as above.

It is of particular interest to consider the Casimir effect for a conducting sphere from this point of view. First of all we observe that the problem can initially be divided into two, one for the interior and the other for the exterior of the sphere. The quantum fields inside and outside are completely independent of each other because the sphere with conductor (or Dirichlet or Neumann, respectively) boundary conditions is impenetrable. In the case of a massive quantum field its vacuum energy can be uniquely defined (and calculated [33]) in each region independently, owing to the normalization condition (3.83). However, for a massless field this is not the case. Here we have $a_2 \ne 0$ (e.g., $a_2 = \mp 16\pi/(315R)$ for Dirichlet boundary conditions, see Eq. (3.66)) and uniqueness cannot be achieved. This can be seen in the given example by noticing that the vacuum energy is, for dimensional reasons, proportional to $1/R$. Therefore, it is impossible to formulate a condition of the type that the vacuum energy must vanish for $R \to \infty$, because this does not remove the arbitrariness resulting from the logarithmic term, which is, of course, also proportional to $1/R$. The situation changes, however, when the interior and the exterior of the sphere are considered together. In that case the contributions to $a_2$ from inside and from outside cancel and a unique calculation of the ground state energy is possible.
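The cancellation between interior and exterior can be read off from table (3.66) together with the sign rule (3.65). The sketch below does the bookkeeping for the Dirichlet column with exact rationals, keeping only the rational prefactor of each entry. It assumes that for $n \ge 1/2$ the listed coefficients are pure boundary contributions, so that rule (3.65) applies to the whole coefficient; the names used are hypothetical.

```python
from fractions import Fraction

# Interior (upper-sign) Dirichlet entries of table (3.66):
# n -> (rational prefactor, remaining factor as a label)
INTERIOR_D = {
    Fraction(1, 2): (Fraction(-2), "pi^(3/2) R^2"),
    Fraction(1): (Fraction(8, 3), "pi R"),
    Fraction(3, 2): (Fraction(-1, 6), "pi^(3/2)"),
    Fraction(2): (Fraction(-16, 315), "pi / R"),
}


def exterior_prefactor(n, interior):
    """Apply Eq. (3.65): b_n|inside = (-1)**(2n+1) * b_n|outside."""
    sign = (-1) ** int(2 * n + 1)
    return sign * interior  # sign is +1 or -1, so dividing equals multiplying


def inside_plus_outside(n):
    """Rational prefactor of a_n summed over the interior and exterior regions."""
    interior, _ = INTERIOR_D[n]
    return interior + exterior_prefactor(n, interior)


for n, (_, label) in INTERIOR_D.items():
    print(f"a_{n}: total prefactor {inside_plus_outside(n)} x {label}")
```

The integer-numbered coefficients $a_1$ and $a_2$ cancel in the sum, while the half-integer ones double.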
For the same reason one can drop the contributions in the classical energy which are generated by $a_0$, $a_1$ and $a_2$.

Equipped with this knowledge, it is interesting to consider the first calculation [18] of the Casimir effect for a conducting sphere. The electromagnetic field obeys conductor boundary conditions and a frequency cutoff regularization was used. In that work a delicate cancellation of divergences was observed. In fact, the most divergent contribution is that delivered by $a_0$ in Eq. (3.78). It vanishes when the modes from inside and from outside are taken together. The same applies to $a_1$ and $a_2$. The remaining divergence is that resulting from $a_{1/2}$. It vanishes when both the TE and the TM modes are taken, i.e., the contributions from Dirichlet and Neumann boundary conditions. It was this coincidence of cancellations which made that important calculation possible. Let us note that in zeta-functional regularization the only divergent contribution is that of $a_2$ (see Eq. (3.73) for $m = 0$), which vanishes when the inside and outside contributions are taken together. Hence, in this regularization the contributions from the TE and from the TM modes each have a finite vacuum energy.

There is another example, much discussed during the last few years, where the situation is not so pleasant, namely the dielectric ball. The heat kernel coefficients have been calculated in [82], with the result that $a_2$ is in general nonzero. It vanishes only in the dilute approximation, i.e., for small $\varepsilon - 1$ or, equivalently, for a small difference in the speeds of light $c_1$ and $c_2$ inside



and outside the ball. One has

$$ a_2 = -\frac{2656\,\pi}{5005\,R}\, \frac{(c_1 - c_2)^3}{c_2^2} + O\!\left((c_1 - c_2)^4\right). \tag{3.86} $$

From this fact two consequences follow. First, all calculations in the dilute approximation (there are at least three different ones: by mode summation [11], by summing the pairwise Casimir–Polder forces [11] and by a perturbative approach [84]) must deliver the same result. In fact, they do. They are performed in different regularizations, and some divergent contributions had to be dropped, which is possible in a unique way because there are no logarithmic contributions. The same applies to calculations with equal speed of light inside and outside. The second conclusion is that beyond the dilute approximation the ground state energy of the electromagnetic field is not unique. In fact, this has the consequence that the approximation of real matter in which a ball is characterized solely by a dielectric constant constitutes an ill-defined problem. At the moment there is no satisfactory explanation for this. Let us conclude with the remark that the same problem exists for a conducting spherical shell of finite thickness, as discussed in [82]. The calculation of the Casimir energy and force for a ball and other configurations will be considered in more detail in Section 4.

3.5. The photon propagator with boundary conditions

In order to investigate the properties of the vacuum it is in most cases possible to investigate global quantities like the ground state energy. Here a knowledge of the spectrum is sufficient, along with formulas discussed in the preceding subsections like (3.1) or (3.43), to calculate (at least in principle) the quantities of interest. The eigenfunctions or propagators are not needed for this purpose. The situation is somewhat different for subjects such as semiclassical gravity, where there are difficulties with the definition of the global energy and the investigation of the (local) Einstein equations is usually preferred.
The local vacuum energy density may also be of interest in cases where the global energy does not exist because the background is too singular. An example is the vacuum energy density in the background of an infinitely thin magnetic string, which is well defined at any point outside the flux line but cannot be integrated in its vicinity. Yet another example is the calculation of radiative corrections to the vacuum energy. Here a higher-loop graph must be calculated in which one of the propagators (or all of them) obeys boundary conditions. A knowledge of the spectrum alone is clearly insufficient and one needs knowledge of the eigenfunctions also.

The photon propagator in the presence of boundary conditions has been widely used in the calculation of the Casimir effect in various configurations. In the simplest case of plane parallel plates it can be constructed using the reflection principle well known from electrostatic problems. This also works in some other geometries, for example a wedge. There is a generalization to arbitrarily shaped cavities in the form of the multi-reflection expansion. This is a formal, infinite series and may be useful in some specific examples. In general, it is a matter of intuition combined with symmetry considerations to find a sufficiently simple representation of the propagator for a given problem. A number of examples are considered in Section 4. Here we are going to discuss a more systematic, general (and necessarily more formal) approach which reflects the general properties of the propagator. Thereby we connect it with the specific



problem of which degrees of freedom of a gauge field contribute to the ground state energy. We restrict ourselves to QED, but the considerations may be easily generalized to other gauge field theories.

3.5.1. Quantization in the presence of boundary conditions

The conductor boundary conditions that the electromagnetic field has to fulfill on some boundary $\mathcal{S}$ are

$$ n^\mu F^*_{\mu\nu}(x)\big|_{x\in\mathcal{S}} = 0, \tag{3.87} $$

where $F^*_{\mu\nu} = \epsilon_{\mu\nu\alpha\beta}F^{\alpha\beta}$ is the dual field strength tensor and $n^\mu$ is the (outer) normal of $\mathcal{S}$. Being the idealization of a physical interaction (with the conductor), they are formulated in terms of the field strengths and thus are gauge-invariant. Now, for well-known reasons, it is desirable to perform the quantization of QED in terms of the gauge potentials $A_\mu(x)$. Obviously, the boundary conditions (3.87) do not unambiguously imply boundary conditions for all components of $A_\mu(x)$, as would be required in order to obtain a self-adjoint wave operator. There are two ways to proceed.

The first one is to impose boundary conditions on the potentials in such a way that conditions (3.87) are satisfied and a self-adjoint wave operator is provided. Local boundary conditions of this kind fix either the magnetic or the electric field on the boundary. These conditions are stronger than (3.87) and they are not gauge-invariant. When Becchi–Rouet–Stora–Tyutin (BRST) invariance is then required, the ghosts (and some auxiliary fields) become boundary dependent too and contribute to physical quantities like the ground state energy. This was probably first observed in the papers [85,86] and has later also been noticed in connection with some models in quantum cosmology [87,88]. The common understanding is that the ghost contributions cancel those from the unphysical photon polarizations; for recent discussions see [89–91,82].

In a second approach one considers the boundary conditions (3.87) as constraints when quantizing the potentials $A_\mu(x)$, as was first done in [92]. In that case, explicit gauge invariance is kept. There is no need to impose any additional conditions. Conditions (3.87) appear to be incorporated in a “minimal” manner. In that respect, this second approach resembles the so-called dyadic formalism, which has successfully been used in the calculation of the Casimir energy in a spherical geometry [93]. There are two ways to put this approach into practice.
The first way is to solve the constraints explicitly. For this purpose one has to introduce a basis of polarization vectors $E^s_\mu$ (instead of the commonly used $e^s_\mu$) such that only two amplitudes $a_s(x)$ ($s = 1, 2$) of the corresponding decomposition

$$ A_\mu(x) = \sum_{s=0}^{3} E^s_\mu\, a_s(x) \tag{3.88} $$

of the electromagnetic potential have to satisfy boundary conditions. The other two amplitudes ($s = 0, 3$) remain free. In the case where the surface $\mathcal{S}$ consists of two parallel planes such a polarization basis was constructed explicitly in [92]. In this way, roughly speaking, half of the photon polarizations feel the boundary (in [94] they have been shown to be the physical ones in the sense of the Gupta–Bleuler quantization procedure) and half of them do not. However, such



an explicit decomposition, which simultaneously diagonalizes both the action and the boundary conditions, can only be found in the simplest case. The problem is that for a nonplanar surface $\mathcal{S}$ the polarizations $E^s_\mu$ become position dependent and no general expression can be given.

In the second way one realizes the boundary conditions as constraints, which is equivalent to restrictions on the integration space in the functional integral approach. Then the generating functional of the Green functions in QED reads

$$
Z(J, \bar{\eta}, \eta) = C \int DA_\mu\, D\bar{\psi}\, D\psi \prod_{\nu,\, x\in\mathcal{S}} \delta\!\left(n^\mu F^*_{\mu\nu}(x)\right) \exp\left\{ iS[A_\mu, \psi, \bar{\psi}] + i\int dx\, \left(A_\mu(x)J^\mu(x) + \bar{\psi}(x)\eta(x) + \bar{\eta}(x)\psi(x)\right)\right\},
\tag{3.89}
$$

where the integration runs over all fields with the usual asymptotic behavior and $J_\mu(x)$, $\bar{\eta}(x)$ and $\eta(x)$ are the corresponding sources. The functional delta function restricts the integration space to such potentials $A_\mu$ that the corresponding field strengths satisfy the boundary conditions (3.87). This approach was used in [92,95]. It was shown to result in a new photon propagator and an otherwise unaltered covariant perturbation theory of QED. The boundary conditions (3.87) appear to be incorporated with a “minimal” disturbance of the standard formalism, completely preserving gauge invariance (as well as the gauge fixing procedure) and Lorentz covariance as far as possible.

The spinor field deserves a special discussion with respect to the boundary conditions. We do not impose boundary conditions on the electron, and consider the electromagnetic field and the spinor field on the entire Minkowski space with the conducting surface placed in it. In the case of $\mathcal{S}$ being a sphere we thus consider the interior and the exterior region together. In general, the surface $\mathcal{S}$ need not be closed. Only the electromagnetic field obeys boundary conditions on the surface $\mathcal{S}$. The electron penetrates it freely; it does not feel the surface. As a physical model one can think of a very thin metallic surface which does not scatter the electrons but reflects the electromagnetic waves. If the thickness of the metallic surface (e.g., 1 μm) is small compared to the radiation length of the metal (e.g., 1.43 cm for copper), this approximation is well justified. However, the radiative corrections are of order $\alpha\lambda_c/L$ with respect to the Casimir force itself and thus too small to be directly observable. A discussion of the validity of this approximation seems therefore to be somewhat academic at this time. We remark that the situation is different in the case of the bag model of the hadrons in QCD (see Section 4.2.3).
The boundary conditions of the gluon field and of the spinor field are connected by means of the equation of motion (this is because the field strength tensor enters the boundary conditions rather than the dual field strength tensor as in (3.87)).

3.5.2. The photon propagator

Once the quantization is defined by the functional integral (3.89), the task “remains” of calculating it. We start from the standard representation of perturbation theory in QED,

$$ Z(J, \bar{\eta}, \eta) = \exp\left[ i S_{\rm int}\!\left(\frac{\delta}{i\,\delta J}, -\frac{\delta}{i\,\delta\eta}, \frac{\delta}{i\,\delta\bar{\eta}}\right)\right] Z^{(0)}(J, \bar{\eta}, \eta), \tag{3.90} $$



where $Z^{(0)}$ is the generating functional of the free Green functions and $S_{\rm int}(A, \psi, \bar{\psi}) = e\int dx\, \bar{\psi}(x)\hat{A}(x)\psi(x)$ is the interaction. In this way the problem is reduced to that of a standard perturbative technique with the corresponding Feynman rules and a free theory which now depends on the boundary conditions.

Before proceeding further we rewrite the boundary conditions (3.87) in the following way. Let $E^s_\mu(x)$ ($s = 1, 2$) be the two polarization vectors in (3.88) with the properties

$$ \frac{\partial}{\partial x_\mu} E^s_\mu(x) = 0, \qquad n^\mu E^s_\mu(x) = 0 \tag{3.91} $$

for $x \in \mathcal{S}$ (assuming $\partial_\mu n_\nu(x) = \partial_\nu n_\mu(x)$). They span a space of transversal vectors tangential to the surface $\mathcal{S}$. Note that due to (3.91) there is no derivative acting outside the tangential space, i.e., no normal derivative in (3.91). Without loss of generality, we assume the normalization $E^{s\dagger}_\mu g^{\mu\nu} E^t_\nu = -\delta^{st}$. We remark that the transformations $A_\mu \to A_\mu + \partial_\mu\varphi(x)$ and $A_\mu \to A_\mu + n_\mu\varphi(x)$ respect the boundary conditions (3.87), i.e., if $A_\mu(x)$ satisfies the boundary conditions, then so does the transformed potential. The invariance under the first transformation simply says that the boundary conditions are gauge independent. The second means that the projection of $A_\mu$ onto the normal $n^\mu$ of $\mathcal{S}$ is unaffected by the boundary conditions. We therefore conclude that the boundary conditions (3.87) can equivalently be expressed as

$$ E^s_\mu A^\mu(x)\big|_{x\in\mathcal{S}} = 0 \qquad (s = 1, 2). \tag{3.92} $$

Explicit examples of such polarization vectors $E^s_\mu$ are given in [92] for plane parallel plates (see Eq. (3.97) below) and in [95] for a sphere and for a cylinder. In [96] these polarization vectors were used in connection with the radial multiple scattering expansion for a sphere.

Now, with the boundary conditions in the form (3.92), it is possible to represent the delta functions in (3.89) as functional Fourier integrals. Then the functional integral is Gaussian and can be carried out. With the notation $K^{\mu\nu} = g^{\mu\nu}\partial^2 - (1 - 1/\alpha)\,\partial^\mu\partial^\nu$ for the kernel of the free action of the electromagnetic field and its inverse, which is the free (i.e., without boundary conditions) photon propagator $D_{\mu\nu}(x - y)$, we define a new integral kernel on the boundary manifold $\mathcal{S}$ by

$$ \bar{K}^{st}(z, z') \equiv E^{s\dagger}_\mu(z)\, D^{\mu\nu}(z - z')\, E^t_\nu(z'), \qquad z, z' \in \mathcal{S}. \tag{3.93} $$

This object is, in fact, the projection of the propagator $D^{\mu\nu}(z - z')$ onto the surface $\mathcal{S}$ and, with respect to the Lorentz indices, into the tangential subspace spanned by the polarization vectors $E^s_\mu(x)$, $s = 1, 2$. Further, we need to define the inverse $\bar{K}^{-1\,st}(z, z')$ of this operation by

$$ \int_{\mathcal{S}} dz''\, \bar{K}^{-1\,st'}(z, z'')\, \bar{K}^{t't}(z'', z') = \delta_{\mathcal{S}}(z - z')\, \delta_{st}, \tag{3.94} $$

where $\delta_{\mathcal{S}}(z - z')$ is the delta function with respect to the integration over the surface $\mathcal{S}$ and $\delta_{st}$ is the usual Kronecker symbol. Then the new photon propagator with boundary conditions takes



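the form

$$
\begin{aligned}
{}^{\mathcal{S}}D_{\mu\nu}(x, y) &\equiv D_{\mu\nu}(x - y) - \bar{D}_{\mu\nu}(x, y) \\
&= D_{\mu\nu}(x - y) - \int_{\mathcal{S}} dz \int_{\mathcal{S}} dz'\, D_\mu^{\ \mu'}(x - z)\, E^s_{\mu'}(z)\, \bar{K}^{-1}_{st}(z, z')\, E^{t\dagger}_{\nu'}(z')\, D^{\nu'}_{\ \ \nu}(z' - y)
\end{aligned}
\tag{3.95}
$$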

and the generating functional of the free (in the sense of perturbation theory) Green's functions obeying the boundary conditions takes the form

$$Z^{(0)}(J,\bar\eta,\eta) = C\,(\det K)^{-1/2}\,(\det\bar K)^{-1/2}\exp\left[\frac{1}{2}\int dx\,dy\;J^\mu(x)\,{}^{G}D_{\mu\nu}(x,y)\,J^\nu(y) + \frac{1}{i}\int dx\,dy\;\bar\eta(x)\,S(x-y)\,\eta(y)\right]. \tag{3.96}$$

Note the appearance, in addition to $\det K$, which is known from the theory without boundary conditions, of the boundary-dependent determinant $\det\bar K$ of the operation $\bar K$. Representation (3.95) of the photon propagator is valid for an arbitrary surface $G$. Its explicit form in the plane parallel geometry and for boundary conditions on a sphere (explicit in terms of Bessel functions) can be found in [95]. In general, for geometries allowing a separation of variables it is possible to write down the corresponding explicit expressions. In this respect representation (3.95) is equivalent to any other. Its main advantage is that it separates the propagator into a free part and a boundary-dependent part, which allows, at least in one-loop calculations, for an easy subtraction of the Minkowski space contribution. Another remarkable property is that the gauge dependence resides only in the free-space part, whereas the boundary-dependent part does not contain the gauge parameter $\alpha$. As an illustration of these general formulas we consider in the next subsection the photon propagator in plane parallel geometry.

3.5.3. The photon propagator in plane parallel geometry

Here the surface $G$ consists of two pieces. A coordinatization of the planes is given by $z = \{x^\alpha, x^3 = a_i\}$, where the subscript $i = 1, 2$ distinguishes the two planes and $\alpha = 0, 1, 2$ labels the directions parallel to them (the planes are taken perpendicular to the $x^3$-axis, intersecting it at $x^3 = a_i$; $|a_1 - a_2| \equiv a$ is the distance between them).
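The polarizations $E_s^\mu$ ($s = 1, 2$) can be chosen as [92]

$$E_1^\mu = \frac{1}{\sqrt{-\partial_{x_\parallel}^2}}\begin{pmatrix} 0 \\ i\partial_{x_2} \\ -i\partial_{x_1} \\ 0 \end{pmatrix}, \qquad E_2^\mu = \frac{1}{\sqrt{-\partial_{x_\parallel}^2}\,\sqrt{-\partial_{x_0}^2 + \partial_{x_\parallel}^2}}\begin{pmatrix} -\partial_{x_\parallel}^2 \\ -\partial_{x_0}\partial_{x_1} \\ -\partial_{x_0}\partial_{x_2} \\ 0 \end{pmatrix},$$

$$E_3^\mu = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}, \qquad E_0^\mu = \frac{1}{\sqrt{-\partial_{x_0}^2 + \partial_{x_\parallel}^2}}\begin{pmatrix} -i\partial_{x_0} \\ -i\partial_{x_1} \\ -i\partial_{x_2} \\ 0 \end{pmatrix}, \tag{3.97}$$

with $\partial_{x_\parallel}^2 \equiv \partial_{x_1}^2 + \partial_{x_2}^2$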



($E_s^{\mu\dagger} g_{\mu\nu} E_t^{\nu} = g_{st}$). These polarization vectors do not depend on $x^\alpha$ or $i$. Therefore they commute with the free photon propagator

$$D_{\mu\nu}(x-y) = \left(g_{\mu\nu} - (1-\alpha)\,\frac{\partial_{x^\mu}\partial_{x^\nu}}{\partial_x^2}\right)\int\frac{d^4k}{(2\pi)^4}\,\frac{e^{ik(x-y)}}{-k^2 - i\epsilon} \tag{3.98}$$

($\epsilon > 0$). Inserting $E_s^\mu$ into (3.93) yields the operator $\bar K^{st}$ in the form

$$\bar K^{st}(z,z') = -\delta_{st}\,D(x-x')\big|_{x,x'\in G}. \tag{3.99}$$

We proceed with deriving a special representation of the scalar propagator. It is obtained by performing the integration over $k_3$ in Eq. (3.98):

$$D(x-x') = \int\frac{d^3k_\alpha}{(2\pi)^3}\,\frac{e^{ik_\alpha(x^\alpha - x'^\alpha) + i\Gamma|x^3 - x'^3|}}{-2i\Gamma}, \tag{3.100}$$

with $\Gamma = \sqrt{k_0^2 - k_1^2 - k_2^2 + i\epsilon}$. Substituting $x^3 = a_i$ and $x'^3 = a_j$ we get

$$\bar K^{st}(z,z') = -\delta_{st}\int\frac{d^3k_\alpha}{(2\pi)^3}\,\frac{i}{2\Gamma}\,h_{ij}\,e^{ik_\alpha(x^\alpha - x'^\alpha)}, \tag{3.101}$$

where the abbreviation

$$h_{ij} = e^{i\Gamma|a_i - a_j|} \qquad (i, j = 1, 2) \tag{3.102}$$

has been introduced. With (3.101) we have achieved a mode decomposition of the operator $\bar K^{st}$ on the surface $G$. An advantage of this representation is that the inversion of $\bar K^{st}$, defined by (3.94), is now reduced to the algebraic problem of inverting the $(2\times 2)$ matrix $h_{ij}$. With

$$h^{-1}_{ij} = \frac{i}{2\sin\Gamma a}\begin{pmatrix} e^{-i\Gamma a} & -1 \\ -1 & e^{-i\Gamma a} \end{pmatrix}_{ij}, \tag{3.103}$$

we get

$$\bar K^{-1\,st}(z,z') = -\delta_{st}\int\frac{d^3k_\alpha}{(2\pi)^3}\,\frac{2\Gamma}{i}\,h^{-1}_{ij}\,e^{ik_\alpha(x^\alpha - x'^\alpha)}. \tag{3.104}$$

After inserting this expression into (3.95) we find the photon propagator for the electromagnetic field in covariant gauge with conductor boundary conditions on two parallel planes, which was first derived in [92]. The connection to different representations can be seen from the remark that the zeros of the denominator, $\sin\Gamma a = 0$, correspond just to the discrete momentum perpendicular to the plates. However, the photon propagator (3.104) is valid in the whole space, i.e., in the outside region too. The connection to the reflection principle can be made by expanding $1/\sin\Gamma a = -2i\sum_{n\ge 0}\exp\left(2i(n+\tfrac{1}{2})\Gamma a\right)$. The photon propagator for plane parallel plates as given in this subsection was used in [97] for the calculation of boundary-dependent contributions to the anomalous magnetic moment of the electron, and in [98,99] for the calculation of boundary-dependent level shifts of a hydrogen atom. It will be applied to calculate the radiative corrections to the Casimir force in Section 4.5.
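The algebra behind Eqs. (3.102) and (3.103) can be checked numerically. The sketch below is only an illustration: the function names and the sample values of $\Gamma$ and $a$ are assumptions chosen here, with a small positive imaginary part of $\Gamma$ (mimicking the $+i\epsilon$ prescription) so that the image-type series for $1/\sin\Gamma a$ converges.

```python
import cmath

def h_matrix(gamma, a):
    """h_ij = exp(i*Gamma*|a_i - a_j|), Eq. (3.102), for two planes a distance a apart."""
    e = cmath.exp(1j * gamma * a)
    return [[1.0, e], [e, 1.0]]

def h_inverse(gamma, a):
    """Closed form of Eq. (3.103)."""
    pref = 1j / (2.0 * cmath.sin(gamma * a))
    e = cmath.exp(-1j * gamma * a)
    return [[pref * e, -pref], [-pref, pref * e]]

def matmul2(x, y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv_sin_series(gamma, a, terms=200):
    """Partial sum of -2i * sum_{n>=0} exp(2i(n+1/2)*Gamma*a); needs Im(Gamma*a) > 0."""
    return -2j * sum(cmath.exp(2j * (n + 0.5) * gamma * a) for n in range(terms))

# Illustrative values (assumptions, not taken from the text)
gamma, a = 0.7 + 0.3j, 1.5
product = matmul2(h_matrix(gamma, a), h_inverse(gamma, a))
series = inv_sin_series(gamma, a)
closed = 1.0 / cmath.sin(gamma * a)
```

Here `product` should reproduce the unit matrix up to rounding, and `series` should agree with `closed`, which is the expansion used above to connect the propagator with the reflection principle.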



4. Casimir effect in various configurations

In this section the Casimir energies and forces in various configurations are calculated for flat and curved boundaries. Different theoretical methods are applied for this purpose. For some configurations, such as stratified media, a wedge, a sphere, or a cylinder, the exact calculational methods described above are applicable. For other cases, e.g., for a sphere (lens) above a disk, approximate methods are developed. Applications of the results obtained are considered in quantum field theory, condensed matter physics, and cosmology. Some of the results presented here are basic for the comparison of theory and experiment. They will be used in the subsequent sections.

4.1. Flat boundaries

Here two examples of flat boundaries are considered: semispaces including stratified media, and rectangular cavities. Both configurations are of much current interest in connection with experiments on the measurement of the Casimir force and with applications of the Casimir effect in nanotechnologies. Only the simplest configurations are considered below, i.e., empty cavities and gaps between semispaces (see, e.g., [100,101], where the influence of additional external fields on the Casimir effect is calculated). The role and size of the different corrections to the Casimir force that are important from the experimental point of view are discussed in Section 5.

Flat boundaries play a distinguished role for the Casimir effect because they allow for quite explicit formulas. In addition, due to the missing curvature of the boundaries, most heat kernel coefficients are zero, which makes it much easier to extract the finite part of the vacuum energy even if additional external factors are included. In a series of papers [102,103] generalizations of flat boundaries to rectangular regions (e.g., a half-plane attached to a plane) are considered, and in [104] generalizations to softened boundaries (e.g., a background potential growing to infinity at the position of the boundary). In this connection the penetrable plane mirrors (see, e.g., Section 3.1.1, Eq. (3.28)) should also be mentioned.

4.1.1. Two semispaces and stratified media

In the case of flat boundaries the method of separation of variables in the field equation can usually be applied, which permits the application of the exact calculational methods. The best known example of flat boundaries is the configuration of two semispaces filled with two dielectric materials and separated by a gap filled with some other material. This is the configuration investigated by Lifshitz [9], for which he obtained the general representation of the van der Waals and Casimir force in terms of the frequency-dependent dielectric permittivities of all three media (the magnetic permeabilities were assumed to be equal to unity). Actually, the Lifshitz results can be generalized to any stratified medium containing an arbitrary number of plane-parallel layers of different materials.

The original Lifshitz derivation was based on the assumption that the dielectric materials can be considered as continuous media characterized by randomly fluctuating sources. The correlation function of these sources, situated at different points, is proportional to the $\delta$-function of the radius-vector joining these points. The force per unit area acting upon one of the semispaces was calculated as the flux of incoming momentum into this semispace through the boundary plane. This flux is given by the appropriate component of the stress tensor (the $zz$-component if $xy$ is the boundary plane). The usual boundary conditions on the boundary surfaces between different media were imposed on the Green's functions. To exclude the divergences, the values of all the Green's functions in vacuum were subtracted from their values in the dielectric media [105].

Fig. 4. The configuration of two semispaces with a dielectric permittivity $\varepsilon_2(\omega)$ covered by layers of thickness $d$ with a permittivity $\varepsilon_1(\omega)$. The space separation between the layers is $a$.

Here we present another derivation of the Lifshitz results and their generalization, starting directly from the zero-point energy of the electromagnetic field. In doing so, the continuous media, characterized by the frequency-dependent dielectric permittivities, and the appropriate boundary conditions on the photon states can be considered as some effective external field (which cannot be described, however, by a potential added to the left-hand side of the wave equation). The main ideas of such a derivation were first formulated in Refs. [106,107] (see also [7,36,108], where they were generalized and elaborated). We are interested here not only in the values of the force acting upon the boundaries but also in the finite, renormalized values of the Casimir energies, for the purpose of future application to configurations used in experiments.

In the experiments on the Casimir force measurements, symmetrical configurations are usually used, i.e., both interacting bodies are made of one and the same material, which at times is covered by a thin layer of another material [40–44]. In line with this let us consider the configuration presented in Fig. 4. Here the main material of the plates situated in the $(x, y)$ planes has the permittivity $\varepsilon_2(\omega)$, and the covering layers (if any) have the permittivity $\varepsilon_1(\omega)$. The empty space between the external surfaces of the layers is of thickness $a$, and the layer thickness is $d$.



In line with Eq. (2.29) from Section 2.2, the nonrenormalized vacuum energy density of the electromagnetic field reads

$$\frac{E_0(a,d)}{S} = E_S(a,d) = \frac{\hbar}{2}\int\frac{dk_1\,dk_2}{(2\pi)^2}\sum_n\left(\omega^{(1)}_{k_\perp,n} + \omega^{(2)}_{k_\perp,n}\right), \tag{4.1}$$

where we have separated the proper frequencies of the modes with two different polarizations of the electric field (parallel and perpendicular to the plane formed by $k_\perp$ and the $z$-axis, respectively). As in Section 2.2, here $k_\perp = (k_1, k_2)$ is the two-dimensional propagation vector in the $xy$-plane. For simplicity the $x$-axis is chosen to be parallel to $k_\perp$. However, it is more difficult than for perfectly conducting metallic planes to find the frequencies $\omega^{(1,2)}_{k_\perp,n}$. In order to solve this problem we use the formalism of surface modes [106,107], which are exponentially damped for $z > a/2 + d$ and $z < -a/2 - d$. These modes describe waves propagating parallel to the surface of the walls [109]. They form a complete set of solutions and this approach is widely used. For another approach using conventional scattering states see Section 5.1, where it is, in addition, generalized to nonzero temperature.

To find these modes let us represent the orthonormalized set of negative-frequency solutions to the Maxwell equations in the form

$$E^{(i)}_{k_\perp,\alpha}(t,r) = f^{(i)}_\alpha(k_\perp,z)\,e^{i(k_x x + k_y y) - i\omega t}, \qquad B^{(i)}_{k_\perp,\alpha}(t,r) = g^{(i)}_\alpha(k_\perp,z)\,e^{i(k_x x + k_y y) - i\omega t}, \tag{4.2}$$

where the index $i$ numerates the same states of polarization as in Eq. (4.1) and the index $\alpha$ numerates the regions shown in Fig. 4. From the Maxwell equations the wave equation for the $z$-dependent vector functions follows:

$$\frac{d^2 f^{(i)}_\alpha}{dz^2} - R_\alpha^2 f^{(i)}_\alpha = 0, \qquad \frac{d^2 g^{(i)}_\alpha}{dz^2} - R_\alpha^2 g^{(i)}_\alpha = 0, \tag{4.3}$$

where the notation is introduced

$$R_\alpha^2 = k_\perp^2 - \varepsilon_\alpha(\omega)\frac{\omega^2}{c^2}, \qquad k_\perp^2 = k_1^2 + k_2^2, \qquad \varepsilon_0 = 1, \qquad \alpha = 0, 1, 2. \tag{4.4}$$

In obtaining Eqs. (4.3) we have assumed that the media are isotropic, so that the electric displacement is $D_\alpha = \varepsilon_\alpha E_\alpha$. According to the boundary conditions at the interface between two dielectrics, the normal component of $D$ and the tangential component of $E$ should be continuous. Also $B_n$ and $H_t = B_t$ (in our case of nonmagnetic media) are continuous. It is easy to verify that all these conditions are satisfied automatically if the quantities $\varepsilon_\alpha f^{(1)}_{z,\alpha}$ and $df^{(1)}_{z,\alpha}/dz$, or $f^{(2)}_{y,\alpha}$ and $df^{(2)}_{y,\alpha}/dz$, are continuous. Let us consider in detail the first of these conditions. According to Eq. (4.3), the surface modes $f^{(1)}_z$ in the different regions of Fig. 4 can be represented as the following combinations of exponents:

$$f^{(1)}_z = \begin{cases} A e^{R_2 z}, & z < -\dfrac{a}{2} - d, \\[4pt] B e^{R_1 z} + C e^{-R_1 z}, & -\dfrac{a}{2} - d < z < -\dfrac{a}{2}, \\[4pt] D e^{R_0 z} + E e^{-R_0 z}, & -\dfrac{a}{2} < z < \dfrac{a}{2}, \\[4pt] F e^{R_1 z} + G e^{-R_1 z}, & \dfrac{a}{2} < z < \dfrac{a}{2} + d, \\[4pt] H e^{-R_2 z}, & z > \dfrac{a}{2} + d. \end{cases} \tag{4.5}$$

Imposing the continuity conditions on $\varepsilon_\alpha f^{(1)}_{z,\alpha}$ and $df^{(1)}_{z,\alpha}/dz$ at the points $z = -a/2 - d$, $-a/2$, $a/2$, and $a/2 + d$, and taking into account (4.5), we arrive at the following system of equations:

$$\begin{aligned}
A\varepsilon_2 e^{R_2(-a/2-d)} &= B\varepsilon_1 e^{R_1(-a/2-d)} + C\varepsilon_1 e^{-R_1(-a/2-d)}, \\
A R_2 e^{R_2(-a/2-d)} &= B R_1 e^{R_1(-a/2-d)} - C R_1 e^{-R_1(-a/2-d)}, \\
B\varepsilon_1 e^{-R_1 a/2} + C\varepsilon_1 e^{R_1 a/2} &= D e^{-R_0 a/2} + E e^{R_0 a/2}, \\
B R_1 e^{-R_1 a/2} - C R_1 e^{R_1 a/2} &= D R_0 e^{-R_0 a/2} - E R_0 e^{R_0 a/2}, \\
D e^{R_0 a/2} + E e^{-R_0 a/2} &= F\varepsilon_1 e^{R_1 a/2} + G\varepsilon_1 e^{-R_1 a/2}, \\
D R_0 e^{R_0 a/2} - E R_0 e^{-R_0 a/2} &= F R_1 e^{R_1 a/2} - G R_1 e^{-R_1 a/2}, \\
F\varepsilon_1 e^{R_1(a/2+d)} + G\varepsilon_1 e^{-R_1(a/2+d)} &= H\varepsilon_2 e^{-R_2(a/2+d)}, \\
F R_1 e^{R_1(a/2+d)} - G R_1 e^{-R_1(a/2+d)} &= -H R_2 e^{-R_2(a/2+d)}.
\end{aligned} \tag{4.6}$$

This is a linear homogeneous system of algebraic equations relating the unknown coefficients $A, B, \ldots, H$. It has nontrivial solutions under the condition that the determinant of its coefficients is equal to zero. This condition is, accordingly, the equation for the determination of the proper frequencies $\omega^{(1)}_{k_\perp,n}$ of the modes with a parallel polarization [36]:

$$\Delta^{(1)}\left(\omega^{(1)}_{k_\perp,n}\right) \equiv e^{-R_2(a+2d)}\left\{\left(r_{10}^{+} r_{12}^{+} e^{R_1 d} - r_{10}^{-} r_{12}^{-} e^{-R_1 d}\right)^2 e^{R_0 a} - \left(r_{10}^{-} r_{12}^{+} e^{R_1 d} - r_{10}^{+} r_{12}^{-} e^{-R_1 d}\right)^2 e^{-R_0 a}\right\} = 0. \tag{4.7}$$

Here the following notations are introduced:

$$r_{\alpha\beta}^{\pm} = R_\alpha\varepsilon_\beta \pm R_\beta\varepsilon_\alpha, \qquad q_{\alpha\beta}^{\pm} = R_\alpha \pm R_\beta. \tag{4.8}$$

Similarly, the requirement that the quantities $f^{(2)}_{y,\alpha}$ and $df^{(2)}_{y,\alpha}/dz$ are continuous at the boundary points results in the equation for the determination of the frequencies $\omega^{(2)}_{k_\perp,n}$ of the perpendicularly polarized modes [36]:

$$\Delta^{(2)}\left(\omega^{(2)}_{k_\perp,n}\right) \equiv e^{-R_2(a+2d)}\left\{\left(q_{10}^{+} q_{12}^{+} e^{R_1 d} - q_{10}^{-} q_{12}^{-} e^{-R_1 d}\right)^2 e^{R_0 a} - \left(q_{10}^{-} q_{12}^{+} e^{R_1 d} - q_{10}^{+} q_{12}^{-} e^{-R_1 d}\right)^2 e^{-R_0 a}\right\} = 0. \tag{4.9}$$

Note that to obtain Eqs. (4.7) and (4.9) we set the determinants of the linear systems of equations equal to zero and do not perform any additional transformations. This is the reason why (4.7) and (4.9) do not coincide with the corresponding equations of [7,108], where some transformations were used which are not equivalent in the limit $|\omega| \to \infty$ (see below).



Summation in Eq. (4.1) over the solutions of Eqs. (4.7) and (4.9) can be performed by applying the argument theorem, which was applied for this purpose in [106,107]. According to this theorem

$$\sum_n \omega^{(1,2)}_{k_\perp,n} = \frac{1}{2\pi i}\left[\int_{i\infty}^{-i\infty}\omega\, d\ln\Delta^{(1,2)}(\omega) + \int_{C_+}\omega\, d\ln\Delta^{(1,2)}(\omega)\right], \tag{4.10}$$

where $C_+$ is a semicircle of infinite radius in the right half of the complex $\omega$-plane with its center at the origin. Notice that the functions $\Delta^{(1,2)}(\omega)$, defined in Eqs. (4.7) and (4.9), have no poles. For this reason the sum over their poles is absent from (4.10). The second integral on the right-hand side of (4.10) is simply calculated with the natural supposition that

$$\lim_{\omega\to\infty}\varepsilon_\alpha(\omega) = 1, \qquad \lim_{\omega\to\infty}\frac{d\varepsilon_\alpha(\omega)}{d\omega} = 0 \tag{4.11}$$

along any radial direction in the complex $\omega$-plane. The result is infinite and does not depend on $a$:

$$\int_{C_+}\omega\, d\ln\Delta^{(1,2)}(\omega) = 4\int_{C_+} d\omega. \tag{4.12}$$
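The argument theorem used in Eq. (4.10) can be illustrated on a toy function. The sketch below is not the calculation of the text: it replaces $\Delta^{(1,2)}(\omega)$ by a cubic polynomial with known zeros and the contour (imaginary axis plus $C_+$) by a full circle, both of which are assumptions made purely for illustration. It evaluates $\frac{1}{2\pi i}\oint z\, d\ln f(z)$ and recovers the sum of the zeros enclosed by the contour, which is the mechanism that turns the mode sum into a contour integral when $f$ has no poles.

```python
import cmath
import math

def sum_of_zeros(f, df, radius, steps=4000):
    """(1/(2*pi*i)) * closed integral of z*f'(z)/f(z) dz over the circle |z| = radius.

    For a function without poles inside the contour this equals the sum of its
    zeros inside the contour (argument theorem weighted with z).
    """
    total = 0j
    dtheta = 2.0 * math.pi / steps
    for k in range(steps):
        z = radius * cmath.exp(1j * k * dtheta)
        dz = 1j * z * dtheta          # dz = i z dtheta on the circle
        total += z * df(z) / f(z) * dz
    return total / (2j * math.pi)

# Toy "mode-generating function" with zeros at 1, 2.5 and -0.5 (hypothetical example)
def f(z):
    return (z - 1.0) * (z - 2.5) * (z + 0.5)

def df(z):
    return (z - 2.5) * (z + 0.5) + (z - 1.0) * (z + 0.5) + (z - 1.0) * (z - 2.5)

zeros_sum = sum_of_zeros(f, df, radius=4.0)
```

With all three zeros inside the circle of radius 4, `zeros_sum` should equal $1 + 2.5 - 0.5 = 3$ up to rounding.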

Now we introduce a new variable $\xi = -i\omega$ in Eqs. (4.10) and (4.12). The result is

$$\sum_n \omega^{(1,2)}_{k_\perp,n} = \frac{1}{2\pi}\int_{\infty}^{-\infty}\xi\, d\ln\Delta^{(1,2)}(i\xi) + \frac{2}{\pi}\int_{C_+} d\xi, \tag{4.13}$$

where both contributions on the right-hand side diverge. To remove the divergences we use a renormalization procedure which goes back to the original Casimir paper [1] (see also [23,107,108]). The idea of this procedure is that the renormalized physical vacuum energy density vanishes for infinitely separated interacting bodies. From Eqs. (4.7), (4.9) and (4.13) it follows that

$$\lim_{a\to\infty}\sum_n \omega^{(1,2)}_{k_\perp,n} = \frac{1}{2\pi}\int_{\infty}^{-\infty}\xi\, d\ln\Delta^{(1,2)}_\infty(i\xi) + \frac{2}{\pi}\int_{C_+} d\xi, \tag{4.14}$$

where the asymptotic behavior of $\Delta^{(1,2)}$ at $a \to \infty$ is given by

$$\Delta^{(1)}_\infty = e^{(R_0 - R_2)a - 2R_2 d}\left(r_{10}^{+} r_{12}^{+} e^{R_1 d} - r_{10}^{-} r_{12}^{-} e^{-R_1 d}\right)^2, \qquad \Delta^{(2)}_\infty = e^{(R_0 - R_2)a - 2R_2 d}\left(q_{10}^{+} q_{12}^{+} e^{R_1 d} - q_{10}^{-} q_{12}^{-} e^{-R_1 d}\right)^2. \tag{4.15}$$



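Now the renormalized physical quantities are found with the help of Eqs. (4.13)–(4.15):

$$\left(\sum_n \omega^{(1,2)}_{k_\perp,n}\right)_{ren} \equiv \sum_n \omega^{(1,2)}_{k_\perp,n} - \lim_{a\to\infty}\sum_n \omega^{(1,2)}_{k_\perp,n} = \frac{1}{2\pi}\int_{\infty}^{-\infty}\xi\, d\ln\frac{\Delta^{(1,2)}(i\xi)}{\Delta^{(1,2)}_\infty(i\xi)}. \tag{4.16}$$

They can be transformed to a more convenient form with the help of integration by parts:

$$\left(\sum_n \omega^{(1,2)}_{k_\perp,n}\right)_{ren} = \frac{1}{2\pi}\int_{-\infty}^{\infty} d\xi\,\ln\frac{\Delta^{(1,2)}(i\xi)}{\Delta^{(1,2)}_\infty(i\xi)}, \tag{4.17}$$

where the term outside the integral vanishes. To obtain the physical, renormalized Casimir energy density one should substitute the renormalized quantities (4.17) into Eq. (4.1) instead of Eq. (4.13), with the result

$$E_S^{ren}(a,d) = \frac{\hbar}{4\pi^2}\int_0^\infty k_\perp\, dk_\perp\int_0^\infty d\xi\,\left[\ln Q_1(i\xi) + \ln Q_2(i\xi)\right], \tag{4.18}$$

where we introduced polar coordinates in the $(k_1, k_2)$ plane, and

$$Q_1(i\xi) \equiv \frac{\Delta^{(1)}(i\xi)}{\Delta^{(1)}_\infty(i\xi)} = 1 - \left(\frac{r_{10}^{-} r_{12}^{+} e^{R_1 d} - r_{10}^{+} r_{12}^{-} e^{-R_1 d}}{r_{10}^{+} r_{12}^{+} e^{R_1 d} - r_{10}^{-} r_{12}^{-} e^{-R_1 d}}\right)^2 e^{-2R_0 a},$$

$$Q_2(i\xi) \equiv \frac{\Delta^{(2)}(i\xi)}{\Delta^{(2)}_\infty(i\xi)} = 1 - \left(\frac{q_{10}^{-} q_{12}^{+} e^{R_1 d} - q_{10}^{+} q_{12}^{-} e^{-R_1 d}}{q_{10}^{+} q_{12}^{+} e^{R_1 d} - q_{10}^{-} q_{12}^{-} e^{-R_1 d}}\right)^2 e^{-2R_0 a}. \tag{4.19}$$

In Eq. (4.18) the fact that $Q_{1,2}$ are even functions of $\xi$ has been taken into account. For the convenience of numerical calculations below we introduce, instead of $k_\perp$, the new variable $p$ defined by

$$k_\perp^2 = \frac{\xi^2}{c^2}\,(p^2 - 1). \tag{4.20}$$

In terms of $p$, $\xi$ the Casimir energy density (4.18) takes the form

$$E_S^{ren}(a,d) = \frac{\hbar}{4\pi^2 c^2}\int_1^\infty p\, dp\int_0^\infty \xi^2\, d\xi\,\left[\ln Q_1(i\xi) + \ln Q_2(i\xi)\right], \tag{4.21}$$

where a more detailed representation for the functions $Q_{1,2}$ from (4.19) is

$$Q_1(i\xi) = 1 - \left[\frac{(K_1 - \varepsilon_1 p)(\varepsilon_2 K_1 + \varepsilon_1 K_2) - (K_1 + \varepsilon_1 p)(\varepsilon_2 K_1 - \varepsilon_1 K_2)\,e^{-2(\xi/c)K_1 d}}{(K_1 + \varepsilon_1 p)(\varepsilon_2 K_1 + \varepsilon_1 K_2) - (K_1 - \varepsilon_1 p)(\varepsilon_2 K_1 - \varepsilon_1 K_2)\,e^{-2(\xi/c)K_1 d}}\right]^2 e^{-2(\xi/c)pa},$$

$$Q_2(i\xi) = 1 - \left[\frac{(K_1 - p)(K_1 + K_2) - (K_1 + p)(K_1 - K_2)\,e^{-2(\xi/c)K_1 d}}{(K_1 + p)(K_1 + K_2) - (K_1 - p)(K_1 - K_2)\,e^{-2(\xi/c)K_1 d}}\right]^2 e^{-2(\xi/c)pa}. \tag{4.22}$$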



Here all permittivities depend on $i\xi$ and

$$K_\alpha = K_\alpha(i\xi) \equiv \sqrt{p^2 - 1 + \varepsilon_\alpha(i\xi)} = \frac{c}{\xi}\,R_\alpha(i\xi), \qquad \alpha = 1, 2. \tag{4.23}$$

For $\alpha = 0$ one has $p = cR_0/\xi$, which is equivalent to Eq. (4.20).

Notice that expressions (4.18) and (4.21) give us finite values of the Casimir energy density (which is in less common use than the force). Thus in [7] no finite expression for the energy density is presented for two semispaces. In [108] the omission of infinities is performed implicitly, namely instead of Eqs. (4.7) and (4.9) the result of their division by the terms containing $\exp(R_0 a)$ was presented. The coefficient near $\exp(R_0 a)$, however, turns into infinity on $C_+$. In other words, Eqs. (4.7) and (4.9) are divided by infinity. As a result the integral along $C_+$ is equal to zero in [108] and quantity (4.1) would seem to be finite. Fortunately, this implicit division is equivalent to the renormalization procedure explicitly presented above. That is why the final results obtained in [108] are indeed correct. In [105] the energy density is not considered at all.

From Eq. (4.21) it is easy to obtain the Casimir force per unit area acting between semispaces covered with layers:

$$F_{ss}(a,d) = -\frac{\partial E_S^{ren}(a,d)}{\partial a} = -\frac{\hbar}{2\pi^2 c^3}\int_1^\infty p^2\, dp\int_0^\infty \xi^3\, d\xi\,\left[\frac{1 - Q_1(i\xi)}{Q_1(i\xi)} + \frac{1 - Q_2(i\xi)}{Q_2(i\xi)}\right]. \tag{4.24}$$

This expression coincides with the Lifshitz result [9,105,110] for the force per unit area between semispaces with a dielectric permittivity $\varepsilon_2$ if the covering layers are absent. To obtain this limiting case from Eq. (4.24) one should put $d = 0$ and $\varepsilon_1 = \varepsilon_2$:

$$F_{ss}(a) = -\frac{\hbar}{2\pi^2 c^3}\int_1^\infty p^2\, dp\int_0^\infty \xi^3\, d\xi\,\left\{\left[\left(\frac{K_2 + \varepsilon_2 p}{K_2 - \varepsilon_2 p}\right)^2 e^{2(\xi/c)pa} - 1\right]^{-1} + \left[\left(\frac{K_2 + p}{K_2 - p}\right)^2 e^{2(\xi/c)pa} - 1\right]^{-1}\right\}. \tag{4.25}$$

The corresponding quantity for the energy density follows from Eq. (4.21):

$$E_S^{ren}(a) = \frac{\hbar}{4\pi^2 c^2}\int_1^\infty p\, dp\int_0^\infty \xi^2\, d\xi\,\left\{\ln\left[1 - \left(\frac{K_2 - \varepsilon_2 p}{K_2 + \varepsilon_2 p}\right)^2 e^{-2(\xi/c)pa}\right] + \ln\left[1 - \left(\frac{K_2 - p}{K_2 + p}\right)^2 e^{-2(\xi/c)pa}\right]\right\}. \tag{4.26}$$
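The reduction of the layered expressions (4.22) to the two-semispace forms entering (4.25) and (4.26) can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the permittivities at imaginary frequency are passed as plain numbers (no dispersion model is implied), the function names are hypothetical, and `xi_over_c` stands for $\xi/c$ in inverse length units.

```python
import math

def k_alpha(p, eps):
    """K_alpha = sqrt(p^2 - 1 + eps_alpha(i xi)), Eq. (4.23)."""
    return math.sqrt(p * p - 1.0 + eps)

def q_layered(p, xi_over_c, a, d, eps1, eps2):
    """Q1 and Q2 of Eq. (4.22) for semispaces (eps2) covered by layers (eps1, thickness d)."""
    k1, k2 = k_alpha(p, eps1), k_alpha(p, eps2)
    layer = math.exp(-2.0 * xi_over_c * k1 * d)
    gap = math.exp(-2.0 * xi_over_c * p * a)
    num1 = (k1 - eps1 * p) * (eps2 * k1 + eps1 * k2) - (k1 + eps1 * p) * (eps2 * k1 - eps1 * k2) * layer
    den1 = (k1 + eps1 * p) * (eps2 * k1 + eps1 * k2) - (k1 - eps1 * p) * (eps2 * k1 - eps1 * k2) * layer
    num2 = (k1 - p) * (k1 + k2) - (k1 + p) * (k1 - k2) * layer
    den2 = (k1 + p) * (k1 + k2) - (k1 - p) * (k1 - k2) * layer
    return 1.0 - (num1 / den1) ** 2 * gap, 1.0 - (num2 / den2) ** 2 * gap

def q_semispace(p, xi_over_c, a, eps):
    """Arguments of the logarithms in Eq. (4.26) for bare semispaces of permittivity eps."""
    k = k_alpha(p, eps)
    gap = math.exp(-2.0 * xi_over_c * p * a)
    return (1.0 - ((k - eps * p) / (k + eps * p)) ** 2 * gap,
            1.0 - ((k - p) / (k + p)) ** 2 * gap)

# Illustrative sample point (assumed values, arbitrary units)
p, xi_over_c, a = 1.7, 0.9, 1.2
eps1, eps2 = 3.0, 7.0

no_layer = q_layered(p, xi_over_c, a, 0.0, eps1, eps2)      # d = 0
thick_layer = q_layered(p, xi_over_c, a, 50.0, eps1, eps2)  # d much larger than a
bare_eps2 = q_semispace(p, xi_over_c, a, eps2)
bare_eps1 = q_semispace(p, xi_over_c, a, eps1)
```

At $d = 0$ the layered expressions should coincide with those for bare semispaces of permittivity $\varepsilon_2$, and for a very thick layer with those for semispaces of permittivity $\varepsilon_1$.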

It is well known [105] that Eqs. (4.25) and (4.26) contain the limiting cases of both van der Waals and Casimir forces and energy densities. At small distances $a \ll \lambda_0$ (where $\lambda_0$ is the characteristic absorption wavelength of the semispace dielectric matter) these equations take the simplified form

$$F_{ss}(a) = -\frac{H}{6\pi a^3}, \qquad E_S^{ren}(a) = -\frac{H}{12\pi a^2}, \tag{4.27}$$

where the Hamaker constant is introduced

$$H = \frac{3\hbar}{8\pi}\int_0^\infty d\xi\int_0^\infty x^2\, dx\left[\left(\frac{\varepsilon_2 + 1}{\varepsilon_2 - 1}\right)^2 e^{x} - 1\right]^{-1} \tag{4.28}$$

and the new integration variable is $x = 2p\xi a/c$. This is the nonretarded van der Waals force per unit area and the corresponding energy density between the semispaces. In the opposite case of large distances $a \gg \lambda_0$ the dielectric permittivities can be represented by their static values at $\xi = 0$. Introducing in Eq. (4.25) the variable $x$ (now instead of $\xi$) one obtains

$$F_{ss}(a) = -\frac{\hbar c\,\pi}{10 a^4}\,L(\varepsilon_{20}), \qquad E_S^{ren}(a) = -\frac{\hbar c\,\pi}{30 a^3}\,L(\varepsilon_{20}), \tag{4.29}$$

where the function $L$ is defined by

$$L(\varepsilon_{20}) = \frac{5}{16\pi^3}\int_1^\infty\frac{dp}{p^2}\int_0^\infty x^3\, dx\left\{\left[\left(\frac{K_{20} + p}{K_{20} - p}\right)^2 e^{x} - 1\right]^{-1} + \left[\left(\frac{K_{20} + p\,\varepsilon_{20}}{K_{20} - p\,\varepsilon_{20}}\right)^2 e^{x} - 1\right]^{-1}\right\} \tag{4.30}$$

and $K_{20} = (p^2 - 1 + \varepsilon_{20})^{1/2}$, $\varepsilon_{20} = \varepsilon_2(0)$. If both bodies are ideal metals, the dielectric permittivity $\varepsilon_2(i\xi) \to \infty$ for all $\xi$, including $\xi \to 0$. Putting $\varepsilon_{20} \to \infty$, $L(\varepsilon_{20}) \to \pi/24$, we obtain the Casimir result for the force per unit area and the energy density:

$$F_{ss}^{(0)}(a) = -\frac{\pi^2\hbar c}{240\, a^4}, \qquad E_S^{(0)ren}(a) = -\frac{\pi^2\hbar c}{720\, a^3}. \tag{4.31}$$

The other method to obtain the force between semispaces (but with a permittivity $\varepsilon_1$) is to consider the limit $d \to \infty$ in (4.24). In this limit we obtain once more results (4.25) and (4.26), where $K_2$, $\varepsilon_2$ are replaced by $K_1$, $\varepsilon_1$. Note also that here we have not taken into account the effect of nonzero temperature, which is negligible for $a \ll \hbar c/(k_B T)$. The calculation of the Casimir force including the effect of nonzero temperature is contained in Section 5.1.

The above formulas can also be applied to describe the Casimir force between two dielectric plates of finite thickness. For this purpose it is enough to put $\varepsilon_2(\omega) = 1$ in Eqs. (4.22) and (4.24). This case was especially considered in [111]. For anisotropic plates, along with the vacuum force, a torque can appear which tends to change the mutual orientation of the bodies (see [24,112,113]).

4.1.2. Rectangular cavities: attractive or repulsive force?

As mentioned above, the Casimir energy may change its sign depending on the geometry and topology of the configuration. Probably the most evident example of the dependence on the geometry is given by the Casimir effect inside a rectangular box. As was noticed in [114], the vacuum Casimir energy of the electromagnetic field inside a perfectly conducting box may change sign depending on the lengths of the sides. The detailed calculation of the Casimir energy inside a rectangular box, when it is positive or negative, as a function of the box dimensions is contained in Refs. [19,115]. In these references the analytical results for two- and three-dimensional boxes were obtained by the repeated application of the Abel–Plana formula (2.25). The results of [19,115] were later verified by other authors (see the references below). In [116] the Epstein zeta function was applied to calculate the Casimir energy for a scalar and an electromagnetic field in a general hypercuboidal region with $p$ sides of finite length $a_1, \ldots, a_p$ and $d - p$ sides with length $L \gg a_i$. Both the periodic and the perfect conductivity boundary conditions were considered, and the contours were computed at which the energy is zero. The detailed examination of the attractive or repulsive nature of the Casimir force for a massless scalar field as a function of the dimensionality of space and the number of compact sides $p$ was performed in [117,118] using the method of [116]. In [119] the Casimir effect for the electromagnetic field in three-dimensional cavities was investigated by the use of Hertz potentials and their second quantization. Particular attention was given to the isolation of divergent quantities and their interpretation, which is in accordance with [19,115]. The case of a massless scalar field in a multidimensional rectangular cavity was reexamined in [120]. The sign of the Casimir energy and its dependence on the type of boundary conditions (periodic, Dirichlet, Neumann) was studied.

Let us apply the Epstein zeta function method to calculate the Casimir energy and force for the electromagnetic vacuum inside a rectangular box with the side lengths $a_1$, $a_2$, and $a_3$.
The box faces are assumed to be perfect conductors. Imposing the boundary conditions of Eq. (2.27) on the faces, the proper frequencies are found to be

$$\omega^2_{n_1 n_2 n_3} = \pi^2 c^2\left(\frac{n_1^2}{a_1^2} + \frac{n_2^2}{a_2^2} + \frac{n_3^2}{a_3^2}\right). \tag{4.32}$$

Here the oscillations for which all $n_i \neq 0$ and positive are doubly degenerate. If one of the $n_i$ vanishes they are not degenerate. There are no oscillations with two or three indices equal to zero because in such cases the electromagnetic field vanishes. As a consequence, the nonrenormalized vacuum energy of the electromagnetic field inside a box takes the form

$$E_0(a_1,a_2,a_3) = \frac{\hbar}{2}\left(2\sum_{n_1,n_2,n_3=1}^{\infty}\omega_{n_1 n_2 n_3} + \sum_{n_2,n_3=1}^{\infty}\omega_{0 n_2 n_3} + \sum_{n_1,n_3=1}^{\infty}\omega_{n_1 0 n_3} + \sum_{n_1,n_2=1}^{\infty}\omega_{n_1 n_2 0}\right). \tag{4.33}$$

We regularize this quantity with the help of the Epstein zeta function, which for the simple case under consideration is defined by

$$Z_3\left(\frac{1}{a_1},\frac{1}{a_2},\frac{1}{a_3};t\right) = {\sum_{n_1,n_2,n_3=-\infty}^{\infty}}{\vphantom{\sum}}'\left[\left(\frac{n_1}{a_1}\right)^2 + \left(\frac{n_2}{a_2}\right)^2 + \left(\frac{n_3}{a_3}\right)^2\right]^{-t/2}. \tag{4.34}$$

This series is convergent if $t > 3$. The prime on the sum indicates that the term for which all $n_i = 0$ is to be omitted.


First, Eq. (4.33) should be transformed identically to

$$E_0(a_1,a_2,a_3) = \frac{\hbar}{8}\sum_{n_1,n_2,n_3=-\infty}^{\infty}\omega_{n_1 n_2 n_3}\left(1 - \delta_{n_1,0}\delta_{n_2,0} - \delta_{n_1,0}\delta_{n_3,0} - \delta_{n_2,0}\delta_{n_3,0}\right). \tag{4.35}$$
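The identity between the two mode sums (4.33) and (4.35) holds term by term, so it can be verified exactly on truncated sums. The sketch below is an illustration only: the cutoff `n_max`, the sample side lengths, and the units $\hbar = c = 1$ are assumptions made for the check.

```python
import math

def omega(n1, n2, n3, sides):
    """Proper frequency of Eq. (4.32) in units with c = 1."""
    a1, a2, a3 = sides
    return math.pi * math.sqrt((n1 / a1) ** 2 + (n2 / a2) ** 2 + (n3 / a3) ** 2)

def energy_octant(n_max, sides):
    """Truncated Eq. (4.33): doubly degenerate bulk modes plus single face modes (hbar = 1)."""
    r = range(1, n_max + 1)
    bulk = sum(omega(i, j, k, sides) for i in r for j in r for k in r)
    faces = (sum(omega(0, j, k, sides) for j in r for k in r)
             + sum(omega(i, 0, k, sides) for i in r for k in r)
             + sum(omega(i, j, 0, sides) for i in r for j in r))
    return 0.5 * (2.0 * bulk + faces)

def energy_full_lattice(n_max, sides):
    """Truncated Eq. (4.35): sum over all integers with the Kronecker-delta weight (hbar = 1)."""
    r = range(-n_max, n_max + 1)
    total = 0.0
    for i in r:
        for j in r:
            for k in r:
                d1, d2, d3 = int(i == 0), int(j == 0), int(k == 0)
                weight = 1 - d1 * d2 - d1 * d3 - d2 * d3
                total += weight * omega(i, j, k, sides)
    return total / 8.0

sides = (1.0, 1.3, 2.1)   # illustrative side lengths
e_433 = energy_octant(6, sides)
e_435 = energy_full_lattice(6, sides)
```

The weight in (4.35) removes the modes with two vanishing indices, while the factor $1/8$ compensates for counting each octant mode eight times and each face mode four times, so `e_433` and `e_435` should agree up to rounding.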

Introducing the regularization parameter $s$ as in Section 2.2 and using definitions (4.34), (2.39) one obtains

$$E_0(a_1,a_2,a_3;s) = \frac{\hbar}{8}\sum_{n_1,n_2,n_3=-\infty}^{\infty}\omega^{1-2s}_{n_1 n_2 n_3}\left(1 - \delta_{n_1,0}\delta_{n_2,0} - \delta_{n_1,0}\delta_{n_3,0} - \delta_{n_2,0}\delta_{n_3,0}\right) = \frac{\hbar\pi c}{8}\left[Z_3\left(\frac{1}{a_1},\frac{1}{a_2},\frac{1}{a_3};2s-1\right) - 2\zeta_R(2s-1)\left(\frac{1}{a_1} + \frac{1}{a_2} + \frac{1}{a_3}\right)\right]. \tag{4.36}$$

To remove the regularization ($s \to 0$) we need the values of the Epstein and Riemann zeta functions at $t = -1$. Both of them are given by the analytic continuation of these functions. As to $\zeta_R(t)$, Eq. (2.40) should be used. For the Epstein zeta function the reflection formula analogous to (2.40) is [116]

$$\Gamma\left(\frac{t}{2}\right)\pi^{-t/2}\,Z_3(a_1,a_2,a_3;t) = (a_1 a_2 a_3)^{-1}\,\Gamma\left(\frac{3-t}{2}\right)\pi^{(t-3)/2}\,Z_3\left(\frac{1}{a_1},\frac{1}{a_2},\frac{1}{a_3};3-t\right), \tag{4.37}$$

where $\Gamma(z)$ is the gamma function. The results of their application are

$$Z_3\left(\frac{1}{a_1},\frac{1}{a_2},\frac{1}{a_3};-1\right) = -\frac{a_1 a_2 a_3}{2\pi^3}\,Z_3(a_1,a_2,a_3;4), \qquad \zeta_R(-1) = -\frac{1}{12}. \tag{4.38}$$
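The numerical coefficients in (4.38) follow from the reflection formula (4.37) at $t = 4$ and from the analytic continuation of the Riemann zeta function. The short check below is a sketch under one stated assumption: for $\zeta_R(-1)$ it uses the standard functional equation $\zeta_R(s) = 2^s\pi^{s-1}\sin(\pi s/2)\,\Gamma(1-s)\,\zeta_R(1-s)$ together with $\zeta_R(2) = \pi^2/6$, which may differ in form from Eq. (2.40) of the text.

```python
import math

# Reflection formula (4.37) at t = 4:
#   Gamma(2) * pi^(-2) * Z3(a; 4) = Gamma(-1/2) * pi^(1/2) * Z3(1/a; -1) / (a1 a2 a3)
t = 4.0
lhs_factor = math.gamma(t / 2.0) * math.pi ** (-t / 2.0)
rhs_factor = math.gamma((3.0 - t) / 2.0) * math.pi ** ((t - 3.0) / 2.0)

# Coefficient c in Z3(1/a; -1) = c * (a1 a2 a3) * Z3(a; 4)
epstein_coefficient = lhs_factor / rhs_factor

# zeta_R(-1) from the standard functional equation (assumed form, see lead-in)
s = -1.0
zeta_2 = math.pi ** 2 / 6.0
zeta_minus_1 = (2.0 ** s * math.pi ** (s - 1.0) * math.sin(math.pi * s / 2.0)
                * math.gamma(1.0 - s) * zeta_2)
```

Since $\Gamma(-1/2) = -2\sqrt{\pi}$, `epstein_coefficient` should equal $-1/(2\pi^3)$, and `zeta_minus_1` should equal $-1/12$.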

Substituting the finite values obtained into Eq. (4.36) in the limit $s \to 0$ we obtain the renormalized (by means of zeta function regularization) vacuum energy [116]

$$E_0^{ren}(a_1,a_2,a_3) = -\frac{\hbar c\, a_1 a_2 a_3}{16\pi^2}\,Z_3(a_1,a_2,a_3;4) + \frac{\hbar c\,\pi}{48}\left(\frac{1}{a_1} + \frac{1}{a_2} + \frac{1}{a_3}\right). \tag{4.39}$$

It is clearly seen that the result obtained consists of the difference of two positive-definite terms and, therefore, can be both positive and negative. The two renormalizations performed above (associated with the analytic continuation of the Epstein and Riemann zeta functions) may be interpreted as the omission of terms proportional to the volume $a_1 a_2 a_3$ and to the perimeter $(a_1 + a_2 + a_3)$ of the box [119]. For a possible explanation of the Casimir repulsion in terms of vacuum radiation pressure see [121].

As usual, the forces acting upon the opposite pairs of faces and directed normally to them are

$$F_i(a_1,a_2,a_3) = -\frac{\partial E_0^{ren}(a_1,a_2,a_3)}{\partial a_i}, \tag{4.40}$$

so that the total vacuum force is

$$F(a_1,a_2,a_3) = -\nabla E_0^{ren}(a_1,a_2,a_3). \tag{4.41}$$
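Equation (4.39) is straightforward to evaluate numerically, because the series $Z_3(a_1,a_2,a_3;4)$ converges. The sketch below is one possible way to do this, not the procedure of the cited references: it sums the lattice points inside a sphere and adds the continuum estimate $4\pi/(a_1 a_2 a_3 R)$ for the omitted tail. The function names, the cutoff parameter, and the units $\hbar = c = 1$ are assumptions made here.

```python
import math

def epstein_z3_t4(a1, a2, a3, cutoff=20.0):
    """Z3(a1,a2,a3;4) = sum' [(n1 a1)^2 + (n2 a2)^2 + (n3 a3)^2]^(-2).

    Lattice points with |(n1 a1, n2 a2, n3 a3)| <= R are summed explicitly
    (one octant, with multiplicity 2 per nonzero index); the remainder is
    approximated by the integral over |x| > R, which equals 4*pi/(a1 a2 a3 R).
    """
    radius = cutoff * max(a1, a2, a3)
    r2 = radius * radius
    total = 0.0
    for n1 in range(int(radius / a1) + 1):
        w1 = 1 if n1 == 0 else 2
        for n2 in range(int(radius / a2) + 1):
            w2 = 1 if n2 == 0 else 2
            q12 = (n1 * a1) ** 2 + (n2 * a2) ** 2
            if q12 > r2:
                break
            for n3 in range(int(radius / a3) + 1):
                q = q12 + (n3 * a3) ** 2
                if q > r2:
                    break
                if q > 0.0:               # the prime: omit n1 = n2 = n3 = 0
                    w3 = 1 if n3 == 0 else 2
                    total += w1 * w2 * w3 / (q * q)
    return total + 4.0 * math.pi / (a1 * a2 * a3 * radius)

def casimir_box_energy(a1, a2, a3, cutoff=20.0):
    """Renormalized energy of Eq. (4.39) in units with hbar = c = 1."""
    z = epstein_z3_t4(a1, a2, a3, cutoff)
    return (-a1 * a2 * a3 * z / (16.0 * math.pi ** 2)
            + math.pi / 48.0 * (1.0 / a1 + 1.0 / a2 + 1.0 / a3))

e_cube = casimir_box_energy(1.0, 1.0, 1.0)   # positive energy for the cube
e_long = casimir_box_energy(1.0, 1.0, 4.0)   # elongated box with square section
```

In this sketch the cube gives a positive energy close to $0.092\,\hbar c/a_1$, while the elongated box with $a_3 = 4a_1$ gives a negative one, illustrating the change of sign with the side lengths. The value is also unchanged under a permutation of the sides, as expected from the symmetry of (4.39).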



Depending on the sign of the energy, the forces (4.40) can be either repulsive or attractive, according to the relationship between the lengths of the sides $a_1$, $a_2$, and $a_3$. In Ref. [122] detailed computations of the vacuum forces and energies were performed numerically for boxes of different dimensions. In particular, the zero-energy surfaces are presented, which separate the positive-energy surfaces from the negative-energy ones, and the surfaces of zero force. The analytical results of Refs. [19,114,115,119] were confirmed numerically in [122].

Although the general result (4.39) settles the question of the vacuum energy inside a perfectly conducting rectangular box, we give some attention to the alternative calculation of the same quantity based on the application of the Abel–Plana formula [19,115]. The latter provides a simple way of obtaining analytical results, which is an advantage over the zeta function method in application to a box. Let us start with a massless scalar field in a two-dimensional box $0 \le x \le a_1$, $0 \le y \le a_2$, for which the nonrenormalized vacuum energy is expressed by

$$E_0(a_1,a_2) = \frac{\hbar}{2}\sum_{n_1,n_2=1}^{\infty}\omega_{n_1 n_2}, \qquad \omega^2_{n_1 n_2} = \pi^2 c^2\left(\frac{n_1^2}{a_1^2} + \frac{n_2^2}{a_2^2}\right). \tag{4.42}$$

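To perform the summation in (4.42) we apply the Abel–Plana formula (2.25) twice. As was noted in Section 2.3, the explicit introduction of the damping function is not necessary. After the first application one obtains

$$S_{n_1} \equiv \sum_{n_2=1}^{\infty}\left(\frac{n_1^2}{a_1^2} + \frac{n_2^2}{a_2^2}\right)^{1/2} = -\frac{n_1}{2a_1} + \int_0^\infty dt\left(\frac{n_1^2}{a_1^2} + \frac{t^2}{a_2^2}\right)^{1/2} - 2\int_{n_1 a_2/a_1}^{\infty}\left(\frac{t^2}{a_2^2} - \frac{n_1^2}{a_1^2}\right)^{1/2}\frac{dt}{e^{2\pi t} - 1}, \tag{4.43}$$

where the last integral uses the fact that the difference of the radicals is nonzero only above the branch point (see Eq. (2.35)). The result of the second application is

$$\sum_{n_1=1}^{\infty} S_{n_1} = -\frac{1}{2}\left(\frac{1}{a_1} + \frac{1}{a_2}\right)\int_0^\infty t\, dt + \int_0^\infty dt\int_0^\infty dv\left(\frac{t^2}{a_1^2} + \frac{v^2}{a_2^2}\right)^{1/2} + \frac{1}{24a_1} - \frac{a_2}{8\pi^2 a_1^2}\,\zeta_R(3) + \frac{2a_2}{a_1^2}\,G\left(\frac{a_2}{a_1}\right), \tag{4.44}$$

where

$$G(x) = -\int_1^\infty ds\,\sqrt{s^2 - 1}\,\sum_{n_1=1}^{\infty}\frac{n_1^2}{e^{2\pi x n_1 s} - 1}. \tag{4.45}$$

Renormalization of quantity (4.44) is equivalent to the omission of the first two integrals on the right-hand side (the first is proportional to the perimeter, and the second to the volume, i.e., the area of the box). As a result, the renormalized vacuum energy is

$$E_0^{ren}(a_1,a_2) = \frac{\hbar\pi c}{2}\left(\sum_{n_1=1}^{\infty} S_{n_1}\right)_{ren} = \hbar c\left[\frac{\pi}{48a_1} - \frac{\zeta_R(3)\,a_2}{16\pi a_1^2} + \frac{\pi a_2}{a_1^2}\,G\left(\frac{a_2}{a_1}\right)\right]. \tag{4.46}$$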
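The relative size of the term containing $G$ in Eq. (4.46) can be estimated numerically. The sketch below is an illustration under stated assumptions: it evaluates (4.45) with the substitution $s = \cosh u$ (so that $\sqrt{s^2-1}\,ds = \sinh^2 u\,du$) and Simpson's rule, truncates the sum over $n_1$, and works in units $\hbar = c = 1$; the function names and numerical parameters are choices made here, not taken from the text.

```python
import math

ZETA3 = 1.2020569031595943  # Riemann zeta(3)

def g_correction(x, n_terms=20, u_max=6.0, steps=2000):
    """G(x) of Eq. (4.45) by Simpson's rule in u, where s = cosh(u)."""
    def integrand(u):
        s = math.cosh(u)
        total = 0.0
        for n in range(1, n_terms + 1):
            y = 2.0 * math.pi * x * n * s
            # 1/(e^y - 1) written as e^(-y)/(1 - e^(-y)) so that large y underflows to 0
            total += n * n * math.exp(-y) / (-math.expm1(-y))
        return math.sinh(u) ** 2 * total

    h = u_max / steps                     # steps must be even for Simpson's rule
    acc = integrand(0.0) + integrand(u_max)
    for k in range(1, steps):
        acc += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return -acc * h / 3.0

def energy_2d(a1, a2, include_g=True):
    """E0_ren(a1, a2) of Eq. (4.46) in units hbar = c = 1, intended for a1 <= a2."""
    energy = math.pi / (48.0 * a1) - ZETA3 * a2 / (16.0 * math.pi * a1 ** 2)
    if include_g:
        energy += math.pi * a2 / a1 ** 2 * g_correction(a2 / a1)
    return energy

g_one = g_correction(1.0)
relative_correction = (energy_2d(1.0, 1.0) - energy_2d(1.0, 1.0, include_g=False)) \
    / energy_2d(1.0, 1.0, include_g=False)
zero_crossing = math.pi ** 2 / (3.0 * ZETA3)   # a2/a1 at which the G-free energy vanishes
```

For the square, `relative_correction` comes out at roughly one percent in magnitude, and `zero_crossing` is the ratio $a_2/a_1 \approx 2.74$ at which the expression without $G$ changes sign.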

In the foregoing we performed the summation over $n_2$ first, followed by the summation over $n_1$. This order is advantageous for $a_2 \ge a_1$ because $G(a_2/a_1)$ is small in this case. It can be easily shown from Eq. (4.44) that for $a_1 = a_2$ the contribution of $G(1)$ to the vacuum energy is of order 1%, and for $a_2 > a_1$ it is exponentially small. In this manner one can neglect the term containing $G$ in (4.46) and get an analytical expression for the vacuum energy which is valid with a high degree of accuracy. It is seen from this expression that the energy is positive if

$$1 \le \frac{a_2}{a_1} < \frac{\pi^2}{3\zeta_R(3)} \approx 2.74, \tag{4.47}$$

and negative if $a_2 > 2.74\,a_1$ (we recall that $\zeta_R(3) \approx 1.202$).

The above derivation can be applied to the more realistic case of the electromagnetic vacuum confined in a three-dimensional box $a_1 \times a_2 \times a_3$ with perfectly conducting faces. After three applications of the Abel–Plana formula and renormalization, the result is the following [19]:

$$E_0^{ren}(a_1,a_2,a_3) = \hbar c\left[-\frac{\pi^2 a_2 a_3}{720\,a_1^3} - \frac{\zeta_R(3)\,a_3}{16\pi a_2^2} + \frac{\pi}{48}\left(\frac{1}{a_1} + \frac{1}{a_2}\right) + H\left(\frac{a_2}{a_1},\frac{a_3}{a_1},\frac{a_3}{a_2}\right)\right], \tag{4.48}$$

where the function $H$ is exponentially small in all its arguments if $a_1 \le a_2 \le a_3$ (the explicit expression for $H$ can be found in [19,23]). If there is a square section $a_1 = a_2 \le a_3$ and the small integral sums contained in $H$ are neglected, the result is

$$E_0^{ren}(a_1,a_3) \approx \frac{\hbar c}{a_1}\left[\frac{\pi}{24} - \left(\frac{\pi^2}{720} + \frac{\zeta_R(3)}{16\pi}\right)\frac{a_3}{a_1}\right]. \tag{4.49}$$

In the opposite case $a_1 = a_2 > a_3$ the vacuum energy is

$$E_0^{ren}(a_1,a_3) \approx \frac{\hbar c}{a_1}\left[\frac{\pi}{48} - \frac{\zeta_R(3)}{16\pi} + \frac{\pi}{48}\,\frac{a_1}{a_3} - \frac{\pi^2}{720}\left(\frac{a_1}{a_3}\right)^3\right]. \tag{4.50}$$

For the case $a_1 = a_2$, one can easily obtain from (4.48) that the energy is positive if

$$0.408 < \frac{a_3}{a_1} < 3.48 \tag{4.51}$$

and passes through zero at the ends of this interval. Outside interval (4.51) the vacuum energy of the electromagnetic field inside a box is negative. For the cube $a_1 = a_2 = a_3$ the vacuum energy takes the value

$$E_0^{ren}(a_1) \approx 0.0916\,\frac{\hbar c}{a_1}. \tag{4.52}$$

Exactly the same results as in (4.51) and (4.52) are obtained from Eq. (4.39) by numerical computation [122]. If, instead of (4.48), one uses the approximate formulas (4.49), (4.50), the error is less than 2%. For example, for the case of the cube, where the error is largest, it follows from (4.49) that $E_0^{ren}(a_1) \approx 0.0933\,\hbar c/a_1$ instead of (4.52). This demonstrates the advantage of the analytical representations (4.48)–(4.50) for the vacuum energies inside rectangular boxes. Results analogous to (4.48) can also be obtained for periodic boundary conditions imposed on the box faces (see, e.g., [23]).

It is notable that the vacuum energy (4.39) preserves its value if two sides of the box are interchanged or a cyclic permutation of all three sides is performed. The same property is implicitly present in Eqs. (4.46) and (4.48) even though it appears to be violated. The apparent violation is connected with the adopted condition $a_1 \le a_2 \le a_3$ under which the additional contributions $G$ and $H$ are small.

4.2. Spherical and cylindrical boundaries

In this subsection we consider the ground state energy in spherical and cylindrical geometry. The interest in these results comes from a number of sources. Historically, the first example emerged from Casimir's attempt to explain the stability of the electron [123], which stimulated Boyer's work [18] with the surprising result of a repulsive force for the conducting sphere. In the 1970s there was the bag model in QCD, and during the past decade there were attempts to explain sonoluminescence, which stimulated the investigation of spherical geometries. Technically, quite closely related problems appear from vacuum polarization in the background of black holes and in some models of quantum cosmology.

Here we explain the methods suited to handle problems given by local boundary conditions and matching conditions on a sphere. In a separate subsection we collect the corresponding results for a cylindrical surface.
These methods possess generalizations to smooth spherically symmetric background potentials and to boundary conditions on generalized cones, as well as to higher dimensions, which will, however, not be discussed here.

We consider two types of boundary conditions. The first are “hard” boundary conditions dividing the space into two parts, the interior of the sphere ($0 \le r \le R$) and the exterior ($R \le r < \infty$), so that the field is quantized independently in each region. Examples are Dirichlet and Neumann boundary conditions for a scalar field, conductor boundary conditions for the electromagnetic field and bag boundary conditions for the spinor field. The second type is better called “matching conditions”: they connect the field on both sides of the boundary, so that it must be quantized in the whole space as an entity. Examples are the matching conditions which are equivalent to a delta function potential and the conditions on the surface of a dielectric ball. The fields which we consider here are, in general, massive ones, except for the electromagnetic field.

4.2.1. Boundary conditions on a sphere

In spherical geometry consider a function which, after separation of variables by means of Eq. (3.29) (see Section 3.1.2), is subject to Eq. (3.32), which we rewrite here in the form

$$\left(-\frac{\partial^2}{\partial r^2} + \frac{l(l+1)}{r^2} + V(r)\right)\phi_l(r) = \lambda\,\phi_l(r), \tag{4.53}$$

where we kept the background potential $V(r)$ for a moment. Dirichlet and Neumann boundary conditions are defined by

$$\text{Dirichlet:}\quad \phi_l(r)\big|_{r=R} = 0, \qquad \text{Neumann:}\quad \frac{\partial}{\partial r}\phi_l(r)\Big|_{r=R} = 0. \tag{4.54}$$

Conductor boundary conditions for the electromagnetic field (3.87) turn into boundary conditions for the two modes into which the field can be decomposed: Dirichlet boundary conditions for the transverse electric (TE) modes and Neumann$^5$ boundary conditions for the transverse magnetic (TM) ones. In the bag model of QCD, the boundary conditions on a surface $\Sigma$ for the gluon field read $n^\mu F_{\mu\nu}(x)|_{x\in\Sigma} = 0$ and can be treated by means of duality in the same way as the conductor boundary conditions (3.87) (at least at the one-loop level). For the spinor field the bag boundary conditions read

$$n^\mu\gamma_\mu\,\psi(x)\big|_{x\in\Sigma} = 0. \tag{4.55}$$

The bag boundary conditions prevent the color flux through the surface $\Sigma$, simulating confinement. Note that it is impossible to impose boundary conditions on each component of the spinor individually, since this would result in an overdetermined problem because the Dirac equation is of first order. These conditions are the most important local boundary conditions. There are some others, such as spectral boundary conditions [124] or boundary conditions containing tangential derivatives [125,126], which are discussed in some problems of quantum cosmology or spectral geometry, see e.g. [127].

Penetrable boundary conditions interpolate to some extent between hard boundary conditions and smooth background potentials. There is a simple, although somewhat unphysical, example given by a potential containing a delta function, $V(r) = (\alpha/R)\,\delta(r-R)$. Here $\alpha$ is the dimensionless strength of the potential. The problem with this potential in Eq. (4.53) can be reformulated as a problem with no potential but with the matching conditions

$$\phi_l(r)\big|_{r=R-0} = \phi_l(r)\big|_{r=R+0}, \qquad \frac{\partial}{\partial r}\phi_l(r)\Big|_{r=R+0} - \frac{\partial}{\partial r}\phi_l(r)\Big|_{r=R-0} = \frac{\alpha}{R}\,\phi_l(r)\big|_{r=R} \tag{4.56}$$

in complete analogy to the one-dimensional case, Eq. (3.27). Another example, having an obvious physical meaning, is that of the matching conditions for the electromagnetic field on the surface between two bodies with different dielectric constants, requiring the continuity of the normal components of D and B and of the tangential components of E and H. In this case we have different speeds of light, $c_{1,2} = c/\sqrt{\varepsilon_{1,2}\,\mu_{1,2}}$, in the corresponding regions.

We are going to apply the methods explained in Section 3 for the calculation of the ground state energy. Consider first the field in the interior ($0 \le r \le R$) with hard boundary conditions.

$^5$ Note that it is a Neumann boundary condition for the function $\phi_l(r)$ defined by (3.29); for the full field as a function of $x$ it is a Robin boundary condition.


In this case the spectrum is discrete and we need a function whose zeros are just the discrete eigenvalues. For Dirichlet and Neumann boundary conditions these functions are the solutions $\hat{j}_l(kr)$ to Eq. (4.53) with $V = 0$ and its derivative $\hat{j}_l'(kr)$ (cf. the discussion in Section 3.1.2). In order to use representation (3.43) we need these functions on the imaginary axis. We remind the reader of the notations

$$\hat{j}_l(iz) = i^{l} s_l(z), \quad s_l(z) = \sqrt{\frac{\pi z}{2}}\, I_{l+1/2}(z); \qquad \hat{h}_l^{+}(iz) = i^{-l} e_l(z), \quad e_l(z) = \sqrt{\frac{2z}{\pi}}\, K_{l+1/2}(z). \tag{4.57}$$

One should note that these functions depend on Bessel functions of half-integer order, so that they have explicit analytic expressions

$$s_l(z) = z^{l+1}\left(\frac{1}{z}\frac{\partial}{\partial z}\right)^{l}\frac{\sinh z}{z}, \qquad e_l(z) = (-1)^{l} z^{l+1}\left(\frac{1}{z}\frac{\partial}{\partial z}\right)^{l}\frac{e^{-z}}{z},$$

which are, however, not of much use with respect to the ground state energy. As explained in Section 3.1.2, we multiply these functions by a certain power of their argument in order to make the Jost functions regular at $k = 0$. In this way we obtain for the Dirichlet problem in the interior

$$f_l^{D,i}(ik) = (kR)^{-(l+1)} s_l(kR) \tag{4.58}$$

and for the Neumann problem

$$f_l^{N,i}(ik) = (kR)^{-l} s_l'(kR). \tag{4.59}$$

Now we turn to the corresponding exterior problems. According to Section 3.1.2, the Jost function for the Dirichlet problem is given by Eq. (3.47), which we note now in the form

$$f_l^{D,e}(ik) = (kR)^{l} e_l(kR). \tag{4.60}$$

In the same manner we obtain for the Neumann problem

$$f_l^{N,e}(ik) = (kR)^{l+1} e_l'(kR). \tag{4.61}$$

The additivity of the contributions from different modes to the regularized ground state energy implies that the corresponding Jost functions must be multiplied. In this way we obtain for the Dirichlet problem on a thin spherical shell, taking the interior and the exterior problems together,

$$f_l^{D}(ik) = (kR)^{-1} s_l(kR)\, e_l(kR) \tag{4.62}$$

and for the corresponding Neumann problem

$$f_l^{N}(ik) = kR\, s_l'(kR)\, e_l'(kR). \tag{4.63}$$

For the conductor boundary conditions on the electromagnetic field we have to take the product of the Jost functions for Dirichlet and Neumann boundary conditions. For the interior problem we obtain in this way

$$f_l^{\mathrm{em},i}(ik) = (kR)^{-(2l+1)} s_l(kR)\, s_l'(kR) \tag{4.64}$$

and for the exterior problem

$$f_l^{\mathrm{em},e}(ik) = (kR)^{2l+1} e_l(kR)\, e_l'(kR) \tag{4.65}$$

and, finally, for the thin conducting sphere

$$f_l^{\mathrm{em}}(ik) = s_l(kR)\, s_l'(kR)\, e_l(kR)\, e_l'(kR). \tag{4.66}$$

This expression can be rewritten using the Wronskian $s_l'(z)e_l(z) - s_l(z)e_l'(z) = 1$, so that with $s_l(z)s_l'(z)e_l(z)e_l'(z) = -\frac{1}{4}\{1 - [(s_l(z)e_l(z))']^2\}$ we can use $f_l^{\mathrm{em}}(ik) = 1 - \{[s_l(kR)e_l(kR)]'\}^2$ instead of (4.66), taking into account that a $k$-independent factor does not influence the ground state energy. Note that for the electromagnetic field the sum over the orbital momentum starts from $l = 1$, i.e., the s-wave is missing, in contrast to the scalar field.

Now we turn to the problems with matching conditions. Let us note the general form of the solution to Eq. (4.53),

$$\phi_l(r) = \hat{j}_l(qr)\,\theta(R-r) + \frac{i}{2}\left(f_l(k)\,\hat{h}_l^{-}(kr) - f_l^{*}(k)\,\hat{h}_l^{+}(kr)\right)\theta(r-R). \tag{4.67}$$

With $q = k$ this is the solution for the delta function potential. Note that in this case the asymptotic expressions (3.35) and (3.36) coincide with the exact solutions in $0 \le r \le R$ and $R \le r < \infty$, respectively, as the potential has a pointlike support in the radial coordinate. The matching conditions applied to solution (4.67) can be solved, and the Jost function follows to be

$$f_l^{\mathrm{delta}}(ik) = 1 + \frac{\alpha}{kR}\, s_l(kR)\, e_l(kR). \tag{4.68}$$
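The Wronskian and the product identity used above to simplify (4.66) can be checked numerically. The following is a minimal sketch, assuming the explicit expressions for $s_l$ and $e_l$ quoted after (4.57), evaluated by hand for $l = 1$; the function names are illustrative only and do not come from the original text.

```python
import math

def s1(z):
    # s_1(z) = cosh z - sinh z / z  (closed form of s_l for l = 1)
    return math.cosh(z) - math.sinh(z) / z

def e1(z):
    # e_1(z) = exp(-z) * (1 + 1/z)  (closed form of e_l for l = 1)
    return math.exp(-z) * (1.0 + 1.0 / z)

def ds1(z):
    # derivative of s_1 with respect to z
    return math.sinh(z) - math.cosh(z) / z + math.sinh(z) / z**2

def de1(z):
    # derivative of e_1 with respect to z
    return -math.exp(-z) * (1.0 + 1.0 / z + 1.0 / z**2)

def wronskian(z):
    # s_l' e_l - s_l e_l', expected to equal 1 for every z > 0
    return ds1(z) * e1(z) - s1(z) * de1(z)

def identity_gap(z):
    # difference between s s' e e' and -(1/4) * {1 - [(s e)']^2}
    lhs = s1(z) * ds1(z) * e1(z) * de1(z)
    d_se = ds1(z) * e1(z) + s1(z) * de1(z)
    return lhs - (-0.25 * (1.0 - d_se**2))

for z in (0.5, 2.0, 7.0):
    print(z, wronskian(z), identity_gap(z))
```

Both checks hold independently of $z$, which is why only the combination $1 - \{[s_l e_l]'\}^2$ needs to be kept in the Jost function of the conducting sphere.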

Note that in the formal limit $\alpha \to \infty$ this function turns into $f_l^{D}(ik)$ for the Dirichlet problem taken interior and exterior together.

In the electromagnetic case we consider a dielectric ball with $\varepsilon_1$ and $\mu_1$ inside and $\varepsilon_2$ and $\mu_2$ outside; $c_{1,2} = c/\sqrt{\varepsilon_{1,2}\,\mu_{1,2}}$ are the corresponding speeds of light. We insert solution (4.67) into Eq. (4.53) and obtain $q^2 = \omega^2/c_1^2$ and $k^2 = \omega^2/c_2^2$, respectively, so that we have $q = (c_2/c_1)k$. As in the case of the delta potential, the spectrum is completely continuous, which can be understood in physical terms as there are no bound states for the photons. The Jost functions following from the matching conditions for the electromagnetic field are well known in classical electrodynamics and are denoted by $\Delta_l^{\mathrm{TE}}(k)$ and $\Delta_l^{\mathrm{TM}}(k)$ for the two polarizations. They read

$$f_l^{\mathrm{TE}}(ik) \equiv \Delta_l^{\mathrm{TE}}(ik) = \sqrt{\varepsilon_1\mu_2}\; s_l'(qR)\, e_l(kR) - \sqrt{\varepsilon_2\mu_1}\; s_l(qR)\, e_l'(kR), \tag{4.69}$$

$$f_l^{\mathrm{TM}}(ik) \equiv \Delta_l^{\mathrm{TM}}(ik) = \sqrt{\mu_1\varepsilon_2}\; s_l'(qR)\, e_l(kR) - \sqrt{\mu_2\varepsilon_1}\; s_l(qR)\, e_l'(kR). \tag{4.70}$$

The Jost function for the dielectric ball is the product of them:

$$f_l^{\mathrm{diel}}(ik) = \Delta_l^{\mathrm{TE}}(ik)\,\Delta_l^{\mathrm{TM}}(ik). \tag{4.71}$$

A frequently discussed special case is that of equal speeds of light inside and outside the sphere. In that case the formulas simplify greatly and no ultraviolet divergences appear. After some trivial transformations making use of the Wronskian one obtains

$$f_l^{\mathrm{diel},\,c_1=c_2}(ik) = 1 - \xi^2\{[s_l(kR)\,e_l(kR)]'\}^2 \tag{4.72}$$

with $\xi = (\varepsilon_1 - \varepsilon_2)/(\varepsilon_1 + \varepsilon_2)$. Note that in this case we have $\varepsilon_1/\varepsilon_2 = \mu_2/\mu_1 \ne 1$, while for unequal speeds it is possible to have a pure dielectric ball, i.e., to have $\mu_1 = \mu_2 = 1$. For $\xi = 1$ this Jost function coincides, up to an irrelevant factor, with that of the conducting sphere, Eq. (4.66).

For completeness we note here the Jost functions for bag boundary conditions. For the interior they read

$$f_l^{\mathrm{bag},i}(ik) = (kR)^{-2l-2}\left[s_l^2(kR) + s_{l+1}^2(kR) + \frac{2mc}{\hbar k}\, s_l(kR)\, s_{l+1}(kR)\right] \tag{4.73}$$

and for the exterior

$$f_l^{\mathrm{bag},e}(ik) = (kR)^{2l+2}\left[e_l^2(kR) + e_{l+1}^2(kR) + \frac{2mc}{\hbar k}\, e_l(kR)\, e_{l+1}(kR)\right]. \tag{4.74}$$

Here $m$ is the mass of the spinor field. Its appearance in the Jost function is a special feature of the boundary conditions (4.55), and the corresponding equation of motion has to be used. As discussed in [128], this dependence on the mass causes some specific problems in the definition of the ground state energy, which we do not, however, discuss here.

4.2.2. Analytic continuation of the regularized ground state energy

By means of Eq. (3.43) we have a representation of the regularized ground state energy well suited for analytic continuation in $s$. By means of Eq. (3.73) we have a representation of the contributions singular at $s = 0$, which are to be subtracted in accordance with Eq. (3.81) in order to obtain the renormalized ground state energy. Because it is impossible to put $s = 0$ under the signs of summation and integration in $E^{\mathrm{reg}}$, we subtract and add in the integrand in Eq. (3.43) the first few terms of the uniform asymptotic expansion of the logarithm of the Jost function, $\ln f_l^{\mathrm{as}}(ik)$, and split the renormalized ground state energy accordingly. Then it takes the form

$$E^{\mathrm{ren}} \equiv E^{\mathrm{reg}} - E_0^{\mathrm{div}} = E^{f} + E^{\mathrm{as}} \tag{4.75}$$

with

$$E^{f} = -\frac{\hbar c}{2\pi}\sum_{l=0}^{\infty}(2l+1)\int_{mc/\hbar}^{\infty} dk \left[k^2 - \left(\frac{mc}{\hbar}\right)^{2}\right]^{1/2}\frac{\partial}{\partial k}\left[\ln f_l(ik) - \ln f_l^{\mathrm{as}}(ik)\right] \tag{4.76}$$

and

$$E^{\mathrm{as}} = -\hbar c\,\frac{\cos \pi s}{2\pi}\sum_{l=0}^{\infty}(2l+1)\int_{mc/\hbar}^{\infty} dk \left[k^2 - \left(\frac{mc}{\hbar}\right)^{2}\right]^{1/2-s}\frac{\partial}{\partial k}\ln f_l^{\mathrm{as}}(ik) - E_0^{\mathrm{div}}(s). \tag{4.77}$$

We assume $\ln f_l^{\mathrm{as}}(ik)$ to be defined such that

$$\ln f_l(ik) - \ln f_l^{\mathrm{as}}(ik) = O(l^{-4}) \tag{4.78}$$

in the limit $l \to \infty$, $k \to \infty$, uniformly with respect to $k/l$. This allows us to put $s = 0$ in the integrand of $E^{f}$, (4.76), because of the convergence of both the integral and the sum. Now the computation of $E^{f}$ for a given problem is left as a purely numerical task. The so-called “asymptotic contribution”, $E^{\mathrm{as}}$, can be analytically continued to $s = 0$, where it is finite because the pole term is subtracted. The construction of this analytic continuation can be done analytically because its structure is quite simple, see below for examples. In this way explicit results for $E^{\mathrm{as}}$ can sometimes be reached, mainly for massless fields. Otherwise expressions in terms of well converging integrals or sums can be obtained. We note the asymptotic expansion in the form

$$\ln f_l^{\mathrm{as}}(ik) = \sum_{i=-1}^{3}\frac{X_i(t)}{\nu^{i}}. \tag{4.79}$$

This is an expansion in $\nu \equiv l + \frac{1}{2}$, which is better than the corresponding expansion in $l$. We use the notation

$$t = \frac{1}{\sqrt{1 + (kR/\nu)^2}}, \tag{4.80}$$

which is well known from the Debye polynomials appearing in the uniform asymptotic expansion of the modified Bessel functions, see e.g. [129]. For most problems the functions $X_i(t)$ ($i = 1, 2, \ldots$) are polynomials in $t$; for the dielectric ball they are more complicated. Let us remark that in terms of the variable $t$ it becomes clear why it is useful to work on the imaginary axis with respect to the variable $k$ in representation (3.39) or (3.45). On the real axis (by means of $k \to -ik$ in (4.80)) the expansion is not uniform for $k \sim \nu$ because $t$ becomes large, and more complicated expansions should be used there.

The depth of the asymptotic expansion required for the procedure described here is determined by the spatial dimension of the problem considered, up to 3 in our case. It is admissible to include higher terms into the definition of $\ln f_l^{\mathrm{as}}(ik)$, which is some kind of over-subtraction not changing, of course, $E^{\mathrm{ren}}$. In general, one can expect this to increase $E^{\mathrm{as}}$ at the expense of $E^{f}$, diminishing the part left for purely numerical calculation. This is, however, not useful for large background potentials (e.g. large $\alpha$ in (4.68)), as large compensations between $E^{f}$ and $E^{\mathrm{as}}$ will appear in that case.

To obtain $\ln f_l^{\mathrm{as}}(ik)$ there are two possibilities. For smooth background potentials one can use the Lippmann–Schwinger equation known in potential scattering to obtain a recursion. Examples can be found in [130,131]. The other way is to have an explicit expression for $\ln f_l(ik)$ in terms of special functions, like the examples shown above, and to expand them directly using the known expansions for these functions. As all our examples are in terms of Bessel functions, it is sufficient to note their uniform asymptotic expansion, which is well known, e.g.,

$$I_\nu(\nu z) \sim \frac{1}{\sqrt{2\pi\nu}}\,\frac{e^{\nu\eta}}{(1+z^2)^{1/4}}\left[1 + \sum_{k=1}^{\infty}\frac{u_k(t)}{\nu^{k}}\right] \tag{4.81}$$

with $t = 1/\sqrt{1+z^2}$ and $\eta = \sqrt{1+z^2} + \ln[z/(1+\sqrt{1+z^2})]$. The first few polynomials $u_k(t)$ and the recursion relation for the higher ones can be found in [129]. In this way one obtains for Dirichlet boundary conditions on the sphere ($z = kR$)

$$\begin{aligned}
X_{-1} &= \eta(z) - \ln z, \qquad X_0 = -\tfrac{1}{4}\ln(1+z^2) = \tfrac{1}{2}\ln t,\\
X_1 &= \tfrac{1}{8}t - \tfrac{5}{24}t^3, \qquad X_2 = \tfrac{1}{16}t^2 - \tfrac{3}{8}t^4 + \tfrac{5}{16}t^6,\\
X_3 &= \tfrac{25}{384}t^3 - \tfrac{531}{640}t^5 + \tfrac{221}{128}t^7 - \tfrac{1105}{1152}t^9.
\end{aligned} \tag{4.82}$$
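The role of the polynomials $X_1$, $X_2$, $X_3$ in (4.82) can be illustrated numerically. The sketch below assumes only the expansion (4.81) and the polynomials of (4.82): it evaluates $\ln I_\nu(\nu z)$ from the ascending power series of the modified Bessel function and compares it with $\nu\eta - \frac{1}{2}\ln(2\pi\nu) - \frac{1}{4}\ln(1+z^2) + \sum_{i=1}^{3} X_i(t)/\nu^i$. The helper names and the choice $l = 20$, $z = 1$ are illustrative assumptions, not part of the original text.

```python
import math

def log_bessel_i(nu, x, terms=400):
    # ln I_nu(x) from the power series sum_m (x/2)^(2m+nu) / (m! Gamma(m+nu+1)),
    # summed in log space to avoid overflow
    logs = [(2 * m + nu) * math.log(x / 2.0)
            - math.lgamma(m + 1) - math.lgamma(m + nu + 1)
            for m in range(terms)]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs))

def debye_log(nu, z, order=3):
    # logarithm of the right-hand side of (4.81), truncated after X_order / nu^order
    root = math.sqrt(1.0 + z * z)
    t = 1.0 / root
    eta = root + math.log(z / (1.0 + root))
    x1 = t / 8 - 5 * t**3 / 24
    x2 = t**2 / 16 - 3 * t**4 / 8 + 5 * t**6 / 16
    x3 = 25 * t**3 / 384 - 531 * t**5 / 640 + 221 * t**7 / 128 - 1105 * t**9 / 1152
    lead = nu * eta - 0.5 * math.log(2 * math.pi * nu) - 0.25 * math.log(1.0 + z * z)
    corrections = [x1 / nu, x2 / nu**2, x3 / nu**3]
    return lead + sum(corrections[:order])

def residual(l, z, order=3):
    # what is left of ln I_nu(nu z) after subtracting the truncated expansion
    nu = l + 0.5
    return log_bessel_i(nu, nu * z) - debye_log(nu, z, order)

print(residual(20, 1.0, order=1), residual(20, 1.0, order=3))
```

Keeping all three polynomials leaves a remainder that falls off like $\nu^{-4}$, which is the property (4.78) relied on when $s = 0$ is put in the integrand of $E^{f}$.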

For the contribution from the exterior one has to change only signs, $X_i \to (-1)^i X_i$. For a semitransparent shell it holds [82,132]

$$X_1 = \frac{\alpha}{2}\,t, \qquad X_2 = -\frac{\alpha^2}{8}\,t^2, \qquad X_3 = \left(\frac{\alpha}{16} + \frac{\alpha^3}{24}\right)t^3 - \frac{3\alpha}{8}\,t^5 + \frac{5\alpha}{16}\,t^7. \tag{4.83}$$

For the dielectric sphere the first two coefficients are [82]

$$X_{-1} = \eta\!\left(\frac{z}{c_1}\right) - \eta\!\left(\frac{z}{c_2}\right), \qquad X_0 = \ln\frac{\sqrt{\varepsilon_1\mu_2/\varepsilon_2\mu_1}\;c_1 t_2 + c_2 t_1}{2\sqrt{c_1 c_2 t_1 t_2}} + \ln\frac{\sqrt{\varepsilon_2\mu_1/\varepsilon_1\mu_2}\;c_1 t_2 + c_2 t_1}{2\sqrt{c_1 c_2 t_1 t_2}} \tag{4.84}$$

with $t_{1,2} = 1/\sqrt{1 + (z/c_{1,2})^2}$. The higher $X_i$ are listed in the appendix of [82]. For bag boundary conditions these functions can be found in [128].

Now, equipped with the explicit expressions for the functions $X_i(t)$ in (4.79), we can construct the analytic continuation in $E^{\mathrm{as}}$, (4.77). This had been done in all the examples mentioned, sometimes repeatedly by different authors. We demonstrate such calculations here on the simplest example of the Casimir effect for a scalar field with Dirichlet boundary conditions on a sphere. We introduce notations for the pieces of $E^{\mathrm{as}}$ as follows:

$$E^{\mathrm{as}} = \sum_{i=-1}^{3} A_i - E_0^{\mathrm{div}}(s) \equiv -\hbar c\,\frac{\cos \pi s}{2\pi}\sum_{i=-1}^{3}\sum_{l=0}^{\infty}(2l+1)\int_{mc/\hbar}^{\infty} dk\left[k^2 - \left(\frac{mc}{\hbar}\right)^{2}\right]^{1/2-s}\frac{\partial}{\partial k}\frac{X_i}{\nu^{i}} - E_0^{\mathrm{div}}(s). \tag{4.85}$$

For $X_{-1}$ and $X_0$ given by Eq. (4.82) the $k$-integration delivers hypergeometric functions, which by means of their Mellin–Barnes integral representation can be transformed into a sum in powers of the mass $m$. After that the $l$-sum can be taken, delivering a Hurwitz zeta function. We obtain (for details see [33])

$$A_{-1} = \frac{\hbar c R^{2s-1}}{4\sqrt{\pi}\,\Gamma(s-\frac{1}{2})}\sum_{j=0}^{\infty}\frac{(-1)^j}{j!}\left(\frac{mcR}{\hbar}\right)^{2j}\frac{\Gamma(j+s-1)}{s+j-\frac{1}{2}}\,\zeta_H(2j+2s-3;\,1/2),$$

$$A_{0} = -\frac{\hbar c R^{2s-1}}{4\,\Gamma(s-\frac{1}{2})}\sum_{j=0}^{\infty}\frac{(-1)^j}{j!}\left(\frac{mcR}{\hbar}\right)^{2j}\Gamma\!\left(s+j-\tfrac{1}{2}\right)\zeta_H(2j+2s-2;\,1/2).$$

In the contributions from $i = 1, 2, \ldots$ the $X_i(t)$ are polynomials in $t$,

$$X_i(t) = \sum_{a=0}^{i} x_{i,a}\, t^{\,i+2a}, \tag{4.86}$$

and the $k$-integration can be carried out explicitly:

$$A_i = -\frac{\hbar c R^{2s-1}}{\Gamma(s-\frac{1}{2})}\sum_{j=0}^{\infty}\frac{(-1)^j}{j!}\left(\frac{mcR}{\hbar}\right)^{2j}\zeta_H(2s-2+i+2j;\,1/2)\sum_{a=0}^{i} x_{i,a}\,\frac{\Gamma(s+a+j+(i-1)/2)}{\Gamma(a+i/2)}.$$

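In these expressions the pole contributions are given explicitly in terms of gamma and zeta functions. In fact, they cancel exactly the pole contribution in $E^{\mathrm{div}}$, (3.73). In this way the analytic continuation to $s = 0$ is constructed. The sums over $j$ in these expressions converge for $mcR/\hbar < 1$. A representation valid for all $mcR/\hbar$ had been derived in [33], but it is too complicated to be displayed here. In contrast to this, in the massless case the expressions are simple:

$$\begin{aligned}
A_{-1} &= \frac{\hbar c}{2\pi R}\left[\frac{7}{1920}\left(\frac{1}{s} + \ln\left(\frac{\mu cR}{\hbar}\right)^{2}\right) + \frac{7}{1920} + \frac{1}{160}\ln 2 + \frac{7}{8}\,\zeta_R'(-3)\right] + O(s),\\
A_0 &= 0,\\
A_1 &= \frac{\hbar c}{2\pi R}\left[\frac{1}{192}\left(\frac{1}{s} + \ln\left(\frac{\mu cR}{\hbar}\right)^{2}\right) - \frac{1}{36} - \frac{1}{8}\,\zeta_R'(-1)\right] + O(s),\\
A_2 &= 0,\\
A_3 &= \frac{\hbar c}{2\pi R}\left[-\frac{229}{40320}\left(\frac{1}{s} + \ln\left(\frac{\mu cR}{\hbar}\right)^{2}\right) + \frac{269}{7560} - \frac{229}{20160}\,\gamma - \frac{229}{6720}\ln 2\right] + O(s),
\end{aligned} \tag{4.87}$$

where $\mu$ is the mass parameter introduced in (3.1). The pole contribution is

$$\sum_{i=-1}^{3} A_i = \frac{1}{630}\,\frac{\hbar c}{\pi R s} + O(1).$$

It cancels just $E^{\mathrm{div}} = -a_2\hbar c/(32\pi^2 s)$ (Eq. (3.73) for $m = 0$) with $a_2 = -16\pi/(315R)$ given in Eq. (3.66), as expected.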
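The cancellation of the pole terms can be verified with exact rational arithmetic. The sketch below takes the $1/s$ coefficients from (4.87) and the value of $a_2$ quoted above; in addition, as an assumption of this sketch, it evaluates the residue of $A_3$ from the massless ($j = 0$) term of the general formula for $A_i$ with the coefficients $x_{3,a}$ of (4.82), using $\sqrt{\pi}\,a!/\Gamma(a+\frac{3}{2}) = a!\,2^{a+1}/(2a+1)!!$. All function names are illustrative.

```python
from fractions import Fraction

# 1/s coefficients of A_i in (4.87), in units of hbar*c/(2*pi*R)
POLES = {-1: Fraction(7, 1920), 0: Fraction(0), 1: Fraction(1, 192),
         2: Fraction(0), 3: Fraction(-229, 40320)}

def pole_sum():
    # total 1/s coefficient of sum_i A_i, in units of hbar*c/(pi*R)
    return sum(POLES.values()) / 2

def divergent_part():
    # 1/s coefficient of E_div = -a2*hbar*c/(32*pi^2*s) with a2 = -16*pi/(315*R),
    # in units of hbar*c/(pi*R)
    return Fraction(16, 315 * 32)

def a3_pole_from_x3():
    # coefficients of t^3, t^5, t^7, t^9 in X_3 of (4.82)
    x3 = [Fraction(25, 384), Fraction(-531, 640),
          Fraction(221, 128), Fraction(-1105, 1152)]
    total = Fraction(0)
    fact, dfact = 1, 1  # running a! and (2a+1)!!
    for a, x in enumerate(x3):
        if a > 0:
            fact *= a
            dfact *= 2 * a + 1
        # sqrt(pi) * a! / Gamma(a + 3/2) = a! * 2^(a+1) / (2a+1)!!
        total += x * Fraction(fact * 2**(a + 1), dfact)
    # residue of A_3 in units of hbar*c/(2*pi*R)
    return total / 2

print(pole_sum(), divergent_part(), a3_pole_from_x3())
```

The first two numbers coincide, which is the cancellation stated in the text, and the third reproduces the coefficient $-229/40320$ of $A_3$ directly from the Debye-type polynomial $X_3$.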

Table 1
Coefficients for the zeta functions and the Casimir energy for a massless scalar field with Dirichlet boundary conditions

D   Zeta function inside              Zeta function outside             Casimir energy
2   +0.010038 − 0.003906/s            −0.008693 − 0.003906/s            +0.000672 − 0.003906/s
3   +0.008873 + 0.001010/s            −0.003234 − 0.001010/s            +0.002819
4   −0.001782 + 0.000267/s            +0.000470 + 0.000267/s            −0.000655 + 0.000267/s
5   −0.000940 − 0.000134/s            +0.000364 + 0.000134/s            −0.000288
6   +0.000268 − 0.000033/s            −0.000062 − 0.000033/s            +0.000102 − 0.000033/s
7   +0.000137 + 0.000021/s            −0.000055 − 0.000021/s            +0.000040
8   −0.000045 + 5.228 × 10^−6/s       +0.000010 + 5.228 × 10^−6/s       −0.000017 + 5.228 × 10^−6/s
9   −0.000022 − 3.769 × 10^−6/s       +9.399 × 10^−6 + 3.769 × 10^−6/s  −6.798 × 10^−6

When adding the contributions from the exterior by means of the signs, $X_i \to (-1)^i X_i$, the contributions from the $A_i$ cancel each other and one is left with $E^{f}$.

4.2.3. Results on the Casimir effect on a sphere

As we have seen in the foregoing subsection, the Casimir energy is typically represented as a sum of two parts. The first, $E^{f}$, is a convergent integral and sum which can be calculated only numerically. The second part is sometimes explicit in terms of gamma and Hurwitz zeta functions; when this is not the case, it has to be evaluated numerically as well. It is not meaningful to describe the numerical procedures here. They are simple and can be done, for example, using Mathematica. Therefore, we restrict ourselves to a collection of most of the known results.

The historically first result on a conducting sphere is that by Boyer [18]. It had been confirmed with higher precision in [96,93] using Green's function methods. In the bag model the vacuum energy of spinors was calculated in [133–136] and more recently in [137,138]. For massive fields it was recognized quite early that there are additional divergencies [139]. The first complete calculation for the massive case was done in [33] for a scalar field and in [128] for a spinor field with bag boundary conditions. In [140] the results for a massless field in several dimensions are collected. A calculation of the Casimir energy in any (even fractional) dimension is done in [141,142].

The Casimir energy for the massless scalar field inside the sphere of radius $R$ in $D = 3$ dimensions with Dirichlet boundary conditions is given by (see Table 1)

$$E_{\mathrm{Cas}} = \frac{\hbar c}{R}\left[0.0044 + \frac{1}{630\pi}\left(\frac{1}{s} + \ln\left(\frac{\mu cR}{\hbar}\right)^{2}\right)\right]. \tag{4.88}$$

It is proportional to $1/R$ for dimensional reasons. The pole part (in zeta-functional regularization with $s \to 0$) follows from (4.87); the logarithmic term is a consequence of the divergence. It carries the mentioned arbitrariness in terms of the parameter $\mu$ introduced in (3.1).
The finite part results from the corresponding numerical evaluation of $E^{f}$. In the same manner results for the exterior space and for different dimensions can be obtained. Taking interior and exterior


Table 2
Coefficients for the zeta functions and the Casimir energy for a massless scalar field with Neumann boundary conditions

D   Zeta function inside       Zeta function outside      Casimir energy
2   −0.344916 − 0.019531/s     −0.021330 − 0.019531/s     −0.183123 − 0.019531/s
3   −0.459240 − 0.035368/s     +0.012324 + 0.035368/s     −0.223458
4   −0.512984 − 0.044716/s     −0.008760 − 0.044716/s     −0.260872 − 0.044716/s
5   −0.556588 − 0.048921/s     +0.016024 + 0.048921/s     −0.270281
6   −0.677067 − 0.051373/s     −0.076351 − 0.051373/s     −0.376709 − 0.051373/s

Table 3
Coefficients for the zeta functions and the Casimir energy for a massless spinor field with bag boundary conditions

D   Zeta function inside       Zeta function outside      Casimir energy
2   −0.00537 + 0.007812/s      +0.02167 + 0.007812/s      −0.008148 − 0.007812/s
3   −0.060617 − 0.005052/s     +0.019796 + 0.005052/s     +0.020410
4   +0.005931 − 0.002838/s     −0.010171 − 0.002838/s     +0.002120 + 0.002838/s
5   +0.025059 + 0.002510/s     −0.008981 − 0.002510/s     −0.008039
6   −0.003039 + 0.001171/s     +0.004602 + 0.001171/s     −0.000781 − 0.001171/s
7   −0.010862 − 0.001174/s     +0.004024 + 0.001174/s     +0.003419

Table 4
Coefficients for the zeta functions and the Casimir energy for the electromagnetic field in a perfectly conducting spherical shell. It has to be noted that in even dimensions, in contrast with the scalar field, the divergences of the inside and outside energies are different for D > 2. This is due to the fact that (only in even dimensions) the l = 0 mode explicitly contributes to the poles of the ζ-function

D   Zeta function inside       Zeta function outside      Casimir energy
2   −0.344916 − 0.019531/s     −0.021330 − 0.019531/s     −0.183123 − 0.019531/s
3   +0.167872 + 0.008084/s     −0.075471 − 0.008084/s     +0.046200
4   −0.044006 − 0.073556/s     −0.351663 + 0.006021/s     −0.197834 − 0.033768/s
5   +0.580372 + 0.049692/s     −0.593096 − 0.049692/s     −0.006362

space together, in odd dimensions the divergencies cancel and so do the logarithmic terms. A finite and unique result emerges. In Tables 2–4, taken from [143], where the known results are collected and extended, the corresponding numbers are shown for several spatial dimensions D. In the first two columns the values of the corresponding zeta function are given, the sums of which are twice the coefficient of the Casimir energy given in the third column. The pole parts are given as real numbers for better comparison with the finite parts, although they all have explicit representations in terms of zeta functions. The Casimir energy for the whole space is finite in odd spatial dimensions and infinite in even ones. It is interesting to note the changes in the sign of the Casimir energy for the whole space. There is no general rule known for that.
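The relation between the columns can be checked directly on the Dirichlet data of Table 1. The following is a small sketch, assuming the numbers as transcribed from that table (each entry is stored as a pair of finite part and $1/s$ coefficient); the dictionary and function names are illustrative only.

```python
# (finite part, 1/s coefficient) for the columns inside, outside, Casimir energy,
# transcribed from Table 1 (massless scalar field, Dirichlet boundary conditions)
TABLE1 = {
    2: ((0.010038, -0.003906), (-0.008693, -0.003906), (0.000672, -0.003906)),
    3: ((0.008873, 0.001010), (-0.003234, -0.001010), (0.002819, 0.0)),
    4: ((-0.001782, 0.000267), (0.000470, 0.000267), (-0.000655, 0.000267)),
    5: ((-0.000940, -0.000134), (0.000364, 0.000134), (-0.000288, 0.0)),
    6: ((0.000268, -0.000033), (-0.000062, -0.000033), (0.000102, -0.000033)),
    7: ((0.000137, 0.000021), (-0.000055, -0.000021), (0.000040, 0.0)),
    8: ((-0.000045, 5.228e-6), (0.000010, 5.228e-6), (-0.000017, 5.228e-6)),
    9: ((-0.000022, -3.769e-6), (9.399e-6, 3.769e-6), (-6.798e-6, 0.0)),
}

def max_mismatch(table):
    # largest deviation between (inside + outside)/2 and the Casimir column,
    # taken over the finite parts and the 1/s coefficients of all rows
    worst = 0.0
    for inside, outside, casimir in table.values():
        for part in (0, 1):
            half_sum = 0.5 * (inside[part] + outside[part])
            worst = max(worst, abs(half_sum - casimir[part]))
    return worst

def odd_dimension_poles(table):
    # sum of the inside and outside 1/s coefficients in odd dimensions
    return {d: row[0][1] + row[1][1] for d, row in table.items() if d % 2 == 1}

print(max_mismatch(TABLE1))
print(odd_dimension_poles(TABLE1))
```

Within the rounding of the tabulated digits the half-sums reproduce the third column, and the pole coefficients cancel between interior and exterior for odd D, in line with the statement that the whole-space energy is finite there.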

Fig. 5. The Casimir energy of a massive scalar field in the interior of a sphere of radius R.

In the case of a massive field we have a renormalization prescription delivering a unique result, as described in Section 3.4. For dimensional reasons the renormalized energy can be represented in the form

$$E^{\mathrm{ren}} = \hbar c\,\frac{f(mcR/\hbar)}{R}, \tag{4.89}$$

where $f(mcR/\hbar)$ is some dimensionless function. This function is shown in the figures: for the interior space in Fig. 5, for the exterior space in Fig. 6, and for the whole space in Fig. 7. This is the Casimir energy measured in units of the inverse radius. In the interior space the energy takes positive as well as negative values, demonstrating an essential dependence on the mass. It should be noted that for large masses the Casimir energy decreases as a power of the mass and is not exponentially damped, as is known from plane parallel geometry. This was probably first noted in [33]. It is due to the nonzero heat kernel coefficient $a_{5/2}$ (in general, the next nonvanishing coefficient after $a_2$). For small masses there is a logarithmic behavior in the interior and exterior taken separately, which is due to the normalization condition and the nonzero heat kernel coefficient $a_2$. For the whole space these contributions cancel and the massless result (4.88) is again obtained for $mcR/\hbar \to 0$. It should be noted that the energy is positive for all values of the mass, in contrast to the contribution in the interior, where the energy takes different signs as a function of the radius.

In the paper [128] the same had been done for the spinor field obeying bag boundary conditions on a sphere. The result is shown in Fig. 8. Again, a nontrivial dependence on the radius is seen. In [132] the corresponding calculation had been done for the semitransparent spherical shell given by a delta function potential corresponding to the matching condition (3.27) and the


Fig. 6. The Casimir energy of a massive scalar field in the exterior of a sphere of radius R.

Fig. 7. The Casimir energy of a massive scalar field in the whole space with Dirichlet boundary conditions on a sphere of radius R.

Jost function is given by Eq. (4.68). The result is a ground state energy changing its sign in dependence on the radius and the strength of the potential.

The vacuum energy of the electromagnetic field in the presence of a dielectric body is of general interest as an example for the interaction with real macroscopic matter. Recently, this was intensively discussed as a possible explanation of sonoluminescence (there is a number

Fig. 8. The Casimir energy for a spinor field obeying bag boundary conditions.

of papers on this topic, see, e.g., [11,144] and papers cited therein). The structure of the ultraviolet divergencies had been discussed in Section 3.3. For a dielectric ball (permittivity $\varepsilon_1$ ($\varepsilon_2$) and permeability $\mu_1$ ($\mu_2$) inside (outside)) these divergencies, represented by the heat kernel coefficient $a_2$, turned out to be present in general, not allowing for a unique definition of the vacuum energy of the electromagnetic field due to the lack of a normalization condition in a massless case. Only in the dilute approximation, i.e., to order $(c_1 - c_2)^2$ for $c_1 \sim c_2$, or for equal speeds of light, $c_1 = c_2$ (still allowing for $\varepsilon_1 \ne \varepsilon_2$ by means of $\varepsilon_1\mu_1 = \varepsilon_2\mu_2$), can a unique result be obtained.

In general, the corresponding calculations can be done in a similar manner to that for the conducting sphere. One has to take the corresponding Jost function, Eq. (4.72) for $c_1 = c_2$ or Eq. (4.71), which must be expanded in powers of $c_1 - c_2$, and to insert it into the regularized energy, Eq. (3.43). The resulting expression, which contains an infinite sum and integral, can be analytically continued in $s$ and has a finite continuation to $s = 0$.

The first calculation for $c_1 = c_2$ had been done in [145] using a different representation. The vacuum energy turned out to be positive. As a function of $\xi = (\varepsilon_1 - \varepsilon_2)/(\varepsilon_1 + \varepsilon_2)$ it interpolates smoothly between zero for $\xi = 0$ and the known result for a conducting sphere at $\xi = 1$. For small $\xi$ it is possible to obtain an analytical result,

$$E = \frac{5}{16\pi}\,\frac{\hbar c\,\xi^2}{2R} + O(\xi^4). \tag{4.90}$$

It had been obtained in [96] by the multiple reflection expansion and in [146] by performing the orbital momentum summation prior to the radial momentum integration.

In the dilute approximation, i.e., for nonequal speeds of light, the result in the first order in $c_1 - c_2$ is zero. In second order it had been computed by different methods.
In [147] the Casimir–Polder potential between pairs of molecules within the ball had been integrated over the volume,

$$E = -\frac{1}{2}\,\frac{23}{4\pi}\,\hbar c\,\alpha^2 N^2 \int_V dr_1\, dr_2\, \frac{1}{|r_1 - r_2|^{7-\delta}}, \tag{4.91}$$

where $\alpha = (\varepsilon - 1)/(4\pi N)$ is the polarizability and $N$ is the density of molecules. The divergencies at $r_1 = r_2$ forced the introduction of a regularization ($\delta$ sufficiently large and $\delta \to 0$ in the end).


After dropping some divergent contributions, which can be understood as particular examples (in a different regularization) of the divergent terms in Eq. (3.78), the finite result

$$E = \frac{23}{1536}\,\frac{(\varepsilon - 1)^2\,\hbar c}{\pi R} \tag{4.92}$$

emerges. Another approach was chosen in [84]. Here the dielectric ball was taken as a perturbation to the Minkowski space. Again, there was a number of divergent contributions to be dropped in the same sense as above, and the remaining finite part coincides with (4.92). In this calculation a wave number cutoff was taken as regularization, which is quite close to the frequency cutoff used in Eq. (3.68). These two papers establish the equivalence of “Casimir–Polder summation” and ground state energy, which has been known for distinct bodies since the 1950s, now also for a single body, which is quite important for the understanding of these phenomena. For instance, it was demonstrated that the attractive force between individual molecules in (4.91) turns into a repulsive one after performing the integration and dropping divergent contributions. In [148] the same result (4.92) was obtained by mode summation, i.e., starting from a formula equivalent to (3.43). However, no regularization was used there. It would be of interest to get this result using a mathematically correct regularization procedure.

Despite the results mentioned, the situation with a dielectric body remains unsatisfactory. Consider first the dilute approximation. There are divergent contributions in some regularizations (in zeta-functional regularization we would expect to have no divergent contributions according to Eq. (3.73) with $m = 0$ and $a_2 = 0$). So we are forced to introduce some classical model like (3.85). There is no understanding of how to do that, as no classical energy is associated with a dielectric body. Beyond the dilute approximation the situation is even worse. It is impossible to identify a unique quantum energy. But on the other hand, we are confronted with real macroscopic bodies and the clear existence of vacuum fluctuations of the electromagnetic field, constituting a real physical situation, so that no infinities or arbitrariness should occur.
At the moment the only conclusion that can be made is that representing the properties of a real body by a dielectric constant is not a good idealization. Dispersion and other properties must be taken into account. A consequence would be that the vacuum energy will depend on them. It should be remarked that spatial dispersion, i.e., a position dependent $\varepsilon(x)$, makes the situation even worse, as then $a_2 \ne 0$ even in the dilute approximation [149].

4.2.4. The Casimir effect for a cylinder

Using the methods explained above in this section it is possible to investigate the Casimir effect for a large number of configurations which can be characterized as generalized cones. Among them the conducting and dielectric cylinders in three dimensions are of particular interest as an intermediate situation between the parallel plates and the sphere, where the Casimir force is attractive and repulsive, respectively.

The calculations go essentially the same way as in the spherical case and we restrict ourselves to pointing out the differences. Due to the translational invariance along the axis of the cylinder, the quantum numbers labeling the eigenvalues are $(k_z, l, n)$ (corresponding to a cylindrical coordinate system), where $k_z \in (-\infty, \infty)$ is the momentum corresponding to the $z$-axis, $l = -\infty, \ldots, \infty$ is the orbital momentum and $n$ is the radial quantum number (assuming for the

moment the presence of a “large cylinder”). In parallel to Eq. (3.43) we obtain for the energy density per unit length

$$E_0^{\mathrm{cyl}}(s) = -\hbar c\,\frac{\cos \pi s}{2\pi}\sum_{l=-\infty}^{\infty}\int_{-\infty}^{\infty}\frac{dk_z}{2\pi}\int_{mc/\hbar}^{\infty} dk\,(k^2 - k_z^2)^{1/2-s}\,\frac{\partial}{\partial k}\ln f_l(ik; k_z). \tag{4.93}$$

The Jost function $f_l(ik; k_z)$ is defined for the scattering problem in the $(x, y)$-plane. Usually, it does not depend on $k_z$ and the corresponding integration can be carried out. But for the dielectric cylinder it depends on $k_z$ (see below). The main difference from the spherical case is that the corresponding Bessel functions are of integer order now. Also, it is useful to take the notations of the modified Bessel functions instead of the spherical ones, Eq. (4.57). For example, for Dirichlet boundary conditions on the cylinder we have for the interior problem

$$f_l^{D,i\ \mathrm{cyl}}(ik) = (kR)^{-l} I_l(kR). \tag{4.94}$$
A technical complication appears from the $l = 0$ contribution. Instead of Eq. (4.79) we must define the asymptotic expansion in the form

$$\ln f_l^{\rm as}(ik) = \sum_{i=-1}^{3} \frac{X_i(t)}{l^i} \tag{4.95}$$

as an expansion in inverse powers of $l$. The $l = 0$ contribution must be treated separately using the corresponding asymptotic expansion for large argument. Another difference is that in the electromagnetic case the $l = 0$ mode must be included, so that the Casimir energy for a conducting cylinder is the sum of the corresponding energies for Dirichlet and Neumann boundary conditions. The results have been compiled in [150]. For a massless scalar field with Dirichlet boundary conditions (taking interior and exterior regions together) the Casimir energy is

$$E^{\rm cyl}_{\rm Dir} = \hbar c\,\frac{0.0006148}{R^2}, \tag{4.96}$$

for Neumann boundary conditions it reads

$$E^{\rm cyl}_{\rm Neum} = -\hbar c\,\frac{0.014176}{R^2}, \tag{4.97}$$

and the Casimir energy for a conducting cylinder is the sum of these two,

$$E^{\rm cyl}_{\rm elm} = -\hbar c\,\frac{0.01356}{R^2}, \tag{4.98}$$

and the force is attractive. The first calculation of this quantity was done in [151]. The dielectric cylinder has been considered only recently. This problem is more involved than the corresponding spherical case because the TE- and TM-modes do not decouple in general. The corresponding Jost function reads

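$$f_l(ik, k_z) = \Delta_{\rm TE}\,\Delta_{\rm TM} + l^2\,\frac{(k^2 - k_z^2)\,k_z^2\,(c_1^2 - c_2^2)^2\,c^2}{k^2 q^2 c_1^4 c_2^2}\,[I_l(qR) K_l(kR)]^2 \tag{4.99}$$

with $q = \sqrt{(c_2/c_1)^2 k^2 + (1 - (c_2/c_1)^2) k_z^2}$ and

$$\Delta_{\rm TE} = \mu_1 kR\, I_l'(qR) K_l(kR) - \mu_2 qR\, I_l(qR) K_l'(kR) \tag{4.100}$$

and

$$\Delta_{\rm TM} = \varepsilon_1 kR\, I_l'(qR) K_l(kR) - \varepsilon_2 qR\, I_l(qR) K_l'(kR). \tag{4.101}$$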
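As a brief aside on the conducting cylinder, the statement that the electromagnetic energy (4.98) is the sum of the Dirichlet and Neumann energies (4.96) and (4.97) can be checked directly from the quoted coefficients. The following Python sketch does only this arithmetic; the variable and function names are illustrative and not taken from the cited papers.

```python
# Coefficients of hbar*c/R^2 as quoted in Eqs. (4.96)-(4.98).
E_DIRICHLET = 0.0006148    # Eq. (4.96): scalar field, Dirichlet conditions
E_NEUMANN = -0.014176      # Eq. (4.97): scalar field, Neumann conditions
E_CONDUCTING = -0.01356    # Eq. (4.98): conducting cylinder (rounded in the text)


def conducting_cylinder_coefficient(dirichlet: float, neumann: float) -> float:
    """Electromagnetic coefficient obtained as the sum of the two scalar ones."""
    return dirichlet + neumann


total = conducting_cylinder_coefficient(E_DIRICHLET, E_NEUMANN)
print(f"Dirichlet + Neumann = {total:.7f}")
print(f"quoted in (4.98)    = {E_CONDUCTING:.5f}")
```

The sum agrees with the quoted value to the number of digits given, and its negative sign corresponds to the attractive force mentioned in the text.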

Up to now an analysis of the ultraviolet divergences in terms of the heat kernel coefficients is missing. Calculations have been done only in special cases. For equal speeds of light inside and outside, the TE- and TM-modes decouple and the Casimir energy is finite. As a function of the parameter $\xi = (\varepsilon_1 - \varepsilon_2)/(\varepsilon_1 + \varepsilon_2)$ it was calculated for small $\xi$ in [152–155] and found to be zero in order $\xi^2$ and nonzero in order $\xi^4$ and higher (for $\xi = 1$ the conducting cylinder, Eq. (4.98), is reobtained). For $c_1 \neq c_2$, but in the dilute approximation, the Casimir energy was calculated in the appendix of [152] by summing the Casimir–Polder forces and was found to be zero too. At the moment there is no explanation for these somewhat unexpected results. As for the heat kernel coefficients, one may speculate that $a_2$ is zero in the dilute approximation, as in the case of a dielectric sphere. The reason is that in the Casimir–Polder summation (where, of course, a regularization had to be used) no polarization term and, hence, no logarithmic singularity appeared. For a massive field the ground state energy was calculated in [156] for a semitransparent cylinder using the same method as in the spherical case. In contrast to the spherical case, the ground state energy turned out to be negative for all values of the radius and the strength of the background potential.

4.3. Sphere (lens) above a disk: additive methods and proximity forces

In many calculations of the Casimir energies and forces presented previously, exact calculation methods were used, based on the separation of variables in a field equation and on finding a frequency spectrum (in an explicit or implicit form). Unfortunately, this can be done only for configurations possessing high symmetry. Of particular interest is the configuration of a sphere (lens) above a disk, which is used in the modern experiments on the measurement of the Casimir force [40–44].
For this configuration the variables in the wave equation cannot all be separated, which makes it impossible to obtain an exact expression for the Casimir force starting from first principles. Hence some approximate methods should be applied in this case. Such methods are also needed to account for the surface roughness, which is a very important factor in the Casimir force measurements (see Section 5.3 below). There exist iterative procedures to obtain the Green's function for configurations where the variables cannot be separated, starting from related configurations for which the exact solution is known. Such a procedure, called the multiple scattering expansion method, was proposed for Casimir problems with boundaries in [157]. It turns out to be rather complicated in applications to configurations of experimental interest. Because of this, more phenomenological approximate procedures which can be applied easily are also of great interest. One such procedure is called the Proximity Force Theorem [158]. It has gained widespread acceptance in recent years. The application range of this theorem extends from the coagulation of aerosols (the subject where it was first used [159]) to atomic nuclei [158]. It is based on the expression for the proximity energy associated with a curved gap of smoothly variable width $z$,

$$V_p = \int_{\Sigma} E_S^{\rm ren}(z)\, d\sigma, \tag{4.102}$$

where $E_S^{\rm ren}(z)$ is the known interaction energy per unit area of two parallel planes at the separation $z$, and $\Sigma$ is one of the two surfaces restricting the gap. It is apparent that in Eq. (4.102) we neglect the nonparallelity of the area elements situated on the curved boundary surfaces restricting the gap. The corrections to (4.102) diminish as the curvatures of these surfaces become small. After some steps of approximately the same accuracy as Eq. (4.102), the expression for the force acting between the gap boundaries follows [158]:

$$F(a) = -\frac{\partial V_p}{\partial a} = 2\pi \bar{R}\, E_S^{\rm ren}(a). \tag{4.103}$$

Here $a$ is the minimal value of the separation $z$ of the boundary surfaces and $\bar{R} = (R_1 R_2)^{1/2}$ is the geometric mean of the two principal radii of curvature $R_1$ and $R_2$ characterizing the gap. Let us apply the Proximity Force Theorem expressed by Eq. (4.103) to calculate the Casimir force in the configuration of a sphere (or a spherical lens) situated above a large disk. Let the closest separation between the sphere and disk points be $a \ll R$, where $R$ is the sphere (lens) radius. Under this condition the force acting in both configurations is one and the same because the upper part of a sphere makes a negligible contribution to the force value. We put the coordinate origin on the surface of the disk under the sphere center, which is situated at the point $z = a + R$. Then the width of the gap is

$$z = R + a - \sqrt{R^2 - x^2 - y^2} \tag{4.104}$$

and the principal radii of the gap curvature are simply calculated:

$$R_1 = \frac{1}{z''_{xx}(0,0)} = R, \qquad R_2 = \frac{1}{z''_{yy}(0,0)} = R, \tag{4.105}$$

which leads also to $\bar{R} = R$. Now, in accordance with Eq. (4.103), the Casimir (van der Waals) force acting between a disk and a sphere (lens) is

$$F_{d\ell}(a) = 2\pi R\, E_S^{\rm ren}(a), \tag{4.106}$$

where the energy density $E_S^{\rm ren}(a)$ between semispaces (planes) is given by Eq. (4.26). This result, as well as (4.26), contains the limiting cases of both the van der Waals and Casimir forces. At small separations ($a \ll \lambda_0$, see Section 4.1.1) it follows from (4.106) and (4.27) for perfect conductors

$$F_{d\ell}(a) = -\frac{HR}{6a^2}. \tag{4.107}$$

This is the van der Waals force between a disk and a lens. At large separations, $a \gg \lambda_0$, using Eqs. (4.106) and (4.31) one obtains the Casimir force in the configuration of a perfectly conducting disk and lens [160]:

$$F_{d\ell}^{(0)}(a) = -\frac{\pi^3 \hbar c R}{360 a^3}. \tag{4.108}$$

If Eq. (4.18) is substituted into the right-hand side of (4.106), the expression for the Casimir (van der Waals) force acting between a disk and a lens covered by additional layers follows. The question arises as to the accuracy of results (4.106)–(4.108) given by the Proximity Force Theorem. Before discussing this question we consider one more approximate method which can be applied to the calculation of the Casimir force in complicated configurations.

To calculate the Casimir and van der Waals forces in configurations where the exact methods do not work, the additive summation of interatomic pairwise interaction potentials is widely used. Let us consider two atoms with electric polarizability $\alpha(\omega)$ and let $\omega_0$ be the characteristic frequency of the transition between the ground and excited states. If the distance $r_{12}$ between these atoms is such that $\omega_0 r_{12}/c \gg 1$, the Casimir–Polder interaction occurs [2]:

$$U(r_{12}) = -\frac{C}{r_{12}^7}, \qquad C \equiv \frac{23}{4\pi}\,\hbar c\,\alpha^2(0) \tag{4.109}$$

(it is assumed that the magnetic polarizabilities of the atoms are equal to zero). In the opposite case, $\omega_0 r_{12}/c \ll 1$, the nonretarded van der Waals interaction with the inverse sixth power of distance holds. Its coefficient does not depend on $c$. The additive result for the Casimir interaction energy of two bodies $V_1$ and $V_2$ separated by a distance $a$ is obtained by the summation of the retarded interatomic potentials (4.109) over all atoms of the interacting bodies,

$$U^{\rm add}(a) = -C N^2 \int_{V_1} d^3 r_1 \int_{V_2} d^3 r_2\, |\mathbf{r}_1 - \mathbf{r}_2|^{-7}, \tag{4.110}$$

where $N$ is the number of atoms per unit volume. It is known that the additive result (4.110) correctly reproduces the dependence of $U$ on distance (see, e.g., Section 2.5 of [161], which contains a number of examples and references on this point). The coefficient of this dependence is, however, overestimated in $U^{\rm add}(a)$ compared to the true value. This is because Eq. (4.110) does not take into account the nonadditivity effects connected with the screening of more distant layers of material by closer ones (the value of the coefficient in (4.110) tends to the true value if $\varepsilon \to 1$, i.e., for a very rarefied medium). In Ref. [162] a normalization procedure was suggested to take the nonadditivity into account approximately. According to this procedure the additive expression (4.110) should be divided by a special factor, which is obtained by dividing the additive result by the exact quantity found for the plane-parallel configuration (see also [24]). For two semispaces separated by the distance $a$, the integration in (4.110) yields the additive energy per unit area

$$U^{\rm add}(a) = -\frac{\pi C N^2}{30 a^3}. \tag{4.111}$$

The normalization factor is obtained by dividing (4.111) by the exact energy density from (4.29):

$$K = \frac{U^{\rm add}(a)}{E_S^{\rm ren}(a)} = \frac{C N^2}{\hbar c\, \Psi(\varepsilon_{20})}. \tag{4.112}$$
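The plane-parallel integration behind Eq. (4.111) can be reproduced numerically. The sketch below is only an illustration under stated assumptions: the integral over the plane parallel to the boundaries is done analytically (it equals $2\pi/(5d^5)$ for two atoms at normal distance $d$), while the two integrations over the depth into each semispace are done with a simple midpoint rule. The helper names are hypothetical.

```python
import math


def integrate_half_line(f, scale, n=200):
    """Midpoint rule for the integral of f over [0, inf).

    The substitution z = scale * t / (1 - t) maps [0, inf) onto [0, 1).
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        z = scale * t / (1.0 - t)
        jacobian = scale / (1.0 - t) ** 2
        total += f(z) * jacobian
    return total * h


def plane_sum(d):
    """Integral of (rho^2 + d^2)^(-7/2) over a plane at distance d (analytic)."""
    return 2.0 * math.pi / (5.0 * d ** 5)


def additive_integral(a, n=200):
    """I(a) such that the additive energy per unit area is -C * N**2 * I(a)."""
    def layer(z1):
        # Depth z2 into the second semispace; natural length scale is a + z1.
        return integrate_half_line(lambda z2: plane_sum(a + z1 + z2), a + z1, n)
    # Depth z1 into the first semispace.
    return integrate_half_line(layer, a, n)


a = 1.0
numeric = additive_integral(a)
closed_form = math.pi / (30.0 * a ** 3)  # coefficient appearing in Eq. (4.111)
print(numeric, closed_form)
```

The numerical value agrees with $\pi/(30a^3)$ and scales as $a^{-3}$, which is the distance dependence that the normalization factor $K$ of Eq. (4.112) leaves untouched.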



Finally, the expression for the Casimir energy with approximate account of nonadditivity is

$$U(a) \equiv \frac{U^{\rm add}(a)}{K} = -\hbar c\, \Psi(\varepsilon_{20}) \int_{V_1} d^3 r_1 \int_{V_2} d^3 r_2\, |\mathbf{r}_2 - \mathbf{r}_1|^{-7}. \tag{4.113}$$

Applying Eq. (4.113) to a sphere (spherical lens) situated above a large disk at a minimum distance $a \ll R$, one easily obtains

$$U(a) = -\frac{\pi^2 \hbar c R}{30 a^2}\, \Psi(\varepsilon_{20}). \tag{4.114}$$

Let us calculate the force as

$$F_{d\ell}(a) = -\frac{\partial U(a)}{\partial a} = -\frac{\pi^2 \hbar c R}{15 a^3}\, \Psi(\varepsilon_{20}). \tag{4.115}$$

Comparing the obtained result with Eq. (4.106) for a sphere above a disk derived with the Proximity Force Theorem, and taking into account the second equation of (4.29), we observe that both methods are in agreement.

Now we start the discussion of the accuracy of both methods. For two plane parallel plates the result (4.113) is exact by construction. If one considers an arbitrarily shaped body above a conducting plane, the maximal error occurs when this body is a small sphere with radius $R \ll a$ (the configuration which is maximally distinct from two plane parallel plates). For this case the independent result obtained from first principles is [163]

$$E^{\rm ex}(a) = -\frac{9 \hbar c R^3}{16 \pi a^4}. \tag{4.116}$$

For this configuration the additive method supplemented by normalization (Eq. (4.114) in the limit $\varepsilon_{20} \to \infty$) gives the result

$$U(a) = -\frac{\hbar c\, \pi^3 R^3}{180 a^4}. \tag{4.117}$$

The comparison of Eqs. (4.116) and (4.117) shows that the maximum error of the above method is only 3.8%. It should be stressed that for configurations such as a sphere (lens) above a disk under the condition $a \ll R$, the actual accuracy of both the Proximity Force Theorem and the additive summation with normalization is much higher. For such configurations the dominant contribution to the Casimir force comes from the surface elements which are almost parallel.

In the papers [164,165] the semiclassical approach was proposed for the calculation of the Casimir energies and forces between curved conducting surfaces. In the framework of this approach the Casimir energy is explained in terms of the classical periodic trajectories along which the virtual photons travel between the walls. The contribution from periodic trajectories decreases with their length (the contribution from each trajectory is inversely proportional to the third power of its length). As shown in [164], the semiclassical approximation reproduces the value of the Casimir energy for a large class of configurations.
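Returning to the comparison of Eqs. (4.116) and (4.117), the quoted maximum error of 3.8% follows from the ratio of the two numerical coefficients of $\hbar c R^3/a^4$. A minimal Python sketch of this arithmetic (function names are illustrative):

```python
import math


def exact_small_sphere_coefficient() -> float:
    """Coefficient of hbar*c*R^3/a^4 in the exact result, Eq. (4.116)."""
    return 9.0 / (16.0 * math.pi)


def additive_small_sphere_coefficient() -> float:
    """Coefficient of hbar*c*R^3/a^4 in the normalized additive result, Eq. (4.117)."""
    return math.pi ** 3 / 180.0


exact = exact_small_sphere_coefficient()
additive = additive_small_sphere_coefficient()
relative_error = abs(additive - exact) / exact
print(f"exact = {exact:.6f}, additive = {additive:.6f}")
print(f"relative error = {100.0 * relative_error:.1f}%")
```

The ratio of the two coefficients is $16\pi^4/1620 \approx 0.962$, so the additive method underestimates the magnitude of the energy by roughly 3.8% in this least favorable configuration.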
The semiclassical approach was applied to configurations of two spheres of radii $R_1, R_2$ a distance $a \ll R_1, R_2$ apart, and also of a sphere (lens) above a disk under the condition $a \ll R$. In both cases the results are the same as those obtained earlier by the application of the Proximity Force Theorem or the additive summation with normalization (for a sphere above a disk this result is given by Eq. (4.108)). Also, several more complicated configurations were considered semiclassically [165]. The semiclassical approach provides additional theoretical justification for the results obtained by the Proximity Force Theorem and the additive method. According to the semiclassical approach, for the configuration of a sphere (lens) above a disk the correction term to Eq. (4.108) is of order $a/R$ [165]. This means that for the typical values used in experiment [40], $a = 1\,\mu$m and $R = 10$ cm, the accuracy of Eq. (4.108) is of order $10^{-3}\%$. Another possibility of checking and confirming the high accuracy of the phenomenological methods for small deviations from plane parallelity is discussed in Section 5.3.1. In that section the additive method is applied to plane plates inclined at a small angle to one another. For this configuration the exact result is also obtainable. The comparison of both results shows that the additive result coincides with the exact one at least up to $10^{-2}\%$, which is in agreement with the semiclassical estimate of accuracy (see Section 5.3.1).

At the end of this section we discuss the application range of the approximate methods mentioned above. The Proximity Force Theorem is applicable for any material and any distance between the interacting bodies, i.e., in the range of the van der Waals forces, the Casimir forces, and also in the intermediate transition region. It is, however, not applicable for stochastic and drastically fluctuating surfaces, which is the case if the surface distortions are taken into account (see Section 5.3). The additive method supplemented by the normalization procedure starts from the summation of pairwise interatomic potentials. These potentials decrease as $r_{12}^{-6}$ for the van der Waals forces and as $r_{12}^{-7}$ for the Casimir ones. That is why the transition region between the two types of forces is not covered by the additive method.
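To give a feeling for the magnitudes involved, the sketch below evaluates the Proximity Force Theorem result (4.108) for a perfectly conducting sphere (lens) above a disk, together with the semiclassical estimate $a/R$ of the relative correction, for the values $a = 1\,\mu$m and $R = 10$ cm quoted above. It is a plain evaluation of the formula in SI units; the function names are illustrative, and the physical constants are the standard CODATA values rather than anything taken from the review.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C_LIGHT = 2.99792458e8   # speed of light, m/s


def pft_sphere_plate_force(radius_m: float, separation_m: float) -> float:
    """Eq. (4.108): Casimir force in newtons (negative means attractive)."""
    return -math.pi ** 3 * HBAR * C_LIGHT * radius_m / (360.0 * separation_m ** 3)


def relative_correction(radius_m: float, separation_m: float) -> float:
    """Semiclassical estimate a/R of the relative correction to Eq. (4.108)."""
    return separation_m / radius_m


radius = 0.10          # R = 10 cm
separation = 1.0e-6    # a = 1 micrometer
force = pft_sphere_plate_force(radius, separation)
correction_percent = 100.0 * relative_correction(radius, separation)
print(f"F = {force:.3e} N, correction of order {correction_percent:.0e} %")
```

For these values the force is a fraction of a nanonewton, and $a/R = 10^{-5}$ corresponds to the $10^{-3}\%$ accuracy mentioned in the text.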
The advantage of the additive method is that it can be applied to calculate roughness corrections to the van der Waals or Casimir force both for dielectrics and metals. The more fundamental semiclassical approach is applicable, however, only for perfectly conducting bodies placed at a distance small compared to the characteristic curvature radius of the boundary surface. In the case of large distances (e.g., a sphere of radius $R \ll a$ or of the order of $a$, where $a$ is the distance to the disk) diffraction effects should be taken into account. This means that trajectories which go around the sphere make nonnegligible contributions to the result. The inclusion of diffraction into the semiclassical theory of the Casimir effect is an outstanding problem to be solved in the future [164,165]. Given that the modern experiments on the Casimir force measurement [40–44] use configurations such as a sphere (lens) above a disk, the approximate methods discussed above are gaining importance.

4.4. Dynamical Casimir effect

For the case of two dimensions (one-dimensional space and one-dimensional time) the dynamical Casimir effect was already discussed in Section 2.4. It was emphasized there that the nonstationarity of the boundary conditions leads to two different effects. The first one is the dependence of the Casimir energy and force on the velocity of a moving plate. The second effect is quite different and consists in the creation of photons from vacuum, as happens in nonstationary external fields. Let us begin with the first effect, which was not discussed in Section 2.4.



Consider the massless scalar field in the configuration of two perfectly conducting planes, one of which, $K$, lies in the plane $x^3 = 0$, and the other, $K'$, moves with a constant velocity $v$ in the positive direction of the $x^3$-axis [166]. The Green's function of the field is the solution of the Dirichlet boundary problem

$$\Box_x G(x, x') = -\delta(x - x'), \qquad G(x, x')\big|_{x,\,x' \in K\ {\rm or}\ K'} = 0, \tag{4.118}$$

where $K$, $K'$ are the planes at $x^3 = 0$ and at $x^3 = vt$, respectively. The Green's function has to be found separately for the three domains: $x^3 < 0$; $0 \le x^3 < vt$; $vt \le x^3 < \infty$. Given knowledge of the Green's functions it is not difficult to find the nonrenormalized vacuum energy density in all three domains using the equalities

$$\langle 0|T_{00}(x)|0\rangle = \frac{\hbar c}{2} \sum_{k=0}^{3} \langle 0|\partial_k \varphi(x)\, \partial_k \varphi(x)|0\rangle = \frac{\hbar c}{2} \lim_{x' \to x} \sum_{k=0}^{3} \partial_k \partial_k' \langle 0|\varphi(x) \varphi(x')|0\rangle = \frac{i \hbar c}{2} \lim_{x' \to x} \sum_{k=0}^{3} \partial_k \partial_k' G(x, x'). \tag{4.119}$$

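Within the first domain the Green's function is found by the use of the reflection principle with one reflection only:

$$G^{<}(x, x') = \frac{i}{4\pi^2} \left[ \frac{1}{(x - x')^2} - \frac{1}{(x - x_1')^2} \right], \qquad x^3,\, x'^3 < 0, \tag{4.120}$$

where

$$x_1' = S_K x', \qquad S_K = \begin{pmatrix} 1 & & & \\ & 1 & & \\ & & 1 & \\ & & & -1 \end{pmatrix} \tag{4.121}$$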

and the diagonal operator $S_K$ describes the reflection in the plane of the mirror $K$. To find the Green's function in the third domain we transform the point under consideration into the reference system of the moving mirror $K'$, find its reflection in $K'$, and determine the coordinates of this reflection relative to the mirror $K$ at rest by an inverse Lorentz transformation. The result is

$$G^{>}(x, x') = \frac{i}{4\pi^2} \left[ \frac{1}{(x - x')^2} - \frac{1}{(x - x_1')^2} \right], \qquad x^3,\, x'^3 > vt, \tag{4.122}$$

where

$$x_1' = S_{K'} x', \qquad S_{K'} = \begin{pmatrix} \cosh s & 0 & 0 & -\sinh s \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \sinh s & 0 & 0 & -\cosh s \end{pmatrix}, \qquad s \equiv \ln \frac{c + v}{c - v}. \tag{4.123}$$
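The construction just described (boost to the rest frame of the moving mirror, reflect, boost back) can be checked numerically against the $(x^0, x^3)$ block of $S_{K'}$ in Eq. (4.123), namely rows $(\cosh s, -\sinh s)$ and $(\sinh s, -\cosh s)$ with $s = \ln[(c+v)/(c-v)]$. The following Python sketch works with $\beta = v/c$ and $2 \times 2$ matrices acting on $(x^0, x^3)$; the function names are illustrative.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def moving_mirror_reflection(beta):
    """Boost to the mirror rest frame, reflect x3, boost back to the lab frame."""
    theta = math.atanh(beta)                      # rapidity of the mirror
    ch, sh = math.cosh(theta), math.sinh(theta)
    to_mirror_frame = [[ch, -sh], [-sh, ch]]
    back_to_lab = [[ch, sh], [sh, ch]]
    reflect = [[1.0, 0.0], [0.0, -1.0]]
    return matmul(back_to_lab, matmul(reflect, to_mirror_frame))


def s_kprime_block(beta):
    """(x0, x3) block of S_K' as written in Eq. (4.123)."""
    s = math.log((1.0 + beta) / (1.0 - beta))
    return [[math.cosh(s), -math.sinh(s)], [math.sinh(s), -math.cosh(s)]]


beta = 0.3
constructed = moving_mirror_reflection(beta)
quoted = s_kprime_block(beta)
squared = matmul(quoted, quoted)                  # a reflection squares to unity
# A point on the mirror world line, (x0, x3) = (1, beta), should be left fixed.
fixed = [quoted[0][0] + quoted[0][1] * beta, quoted[1][0] + quoted[1][1] * beta]
print(constructed, quoted, squared, fixed)
```

The doubling of the rapidity, $s = 2\,{\rm artanh}(v/c)$, is what produces the logarithm in the definition of $s$.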



Now consider the case of the second domain, where the point lies in between the mirrors and experiences multiple reflections in both of them. Here the Green's function is given by the infinite sum over all reflections

$$\bar{G}(x, x') = \frac{i}{4\pi^2} \sum_{m=-\infty}^{\infty} \frac{(-1)^m}{(x - x_m')^2}, \qquad 0 \le x^3,\, x'^3 \le vt, \tag{4.124}$$

where $x_0' \equiv x'$ and

$$x_{2m}' = (S_K S_{K'})^m x', \quad x_{-2m}' = (S_{K'} S_K)^m x', \quad x_{2m-1}' = S_K (S_K S_{K'})^m x', \quad x_{-2m-1}' = S_K (S_{K'} S_K)^m x'. \tag{4.125}$$

The Casimir energy density is calculated in all three regions separately using Eq. (4.119) together with Eqs. (4.120), (4.122) and (4.124). To obtain the renormalized energy density, the contribution of free Minkowski space is subtracted in each region. This is equivalent to discarding the first terms in Eqs. (4.120), (4.122), (4.124), which coincide with the Green's function $G_0 = i/[4\pi^2 (x - x')^2]$ in free space without any boundaries. The force per unit area is calculated by differentiating the obtained energy per unit area with respect to the time-dependent distance $a(t) = vt$ between the plates:

$$F(a(t)) = -\frac{d}{d(vt)} \int_{-\infty}^{\infty} dx^3\, E_0^{\rm ren}(x^3, s). \tag{4.126}$$

The result is [166]

$$F(a(t)) = -\frac{\pi^2 \hbar c}{480 a^4(t)} \left[ 1 + \frac{8}{3} \left( \frac{v}{c} \right)^2 + O\!\left( \frac{v^4}{c^4} \right) \right]. \tag{4.127}$$

It is seen that in the first approximation the Casimir force between moving boundaries coincides with the well-known result for the massless scalar field between plates separated by a distance $a(t)$ (which is one-half of the electromagnetic force from Eq. (2.37)). The same method was applied in [167] to calculate the velocity-dependent correction to the Casimir force for the electromagnetic field. Both the cases of small and large plate velocities were investigated. For $v \ll c$ the following result is obtained [167]:

$$F(a(t)) = -\frac{\pi^2 \hbar c}{240 a^4(t)} \left[ 1 - \left( \frac{10}{\pi^2} - \frac{2}{3} \right) \left( \frac{v}{c} \right)^2 + O\!\left( \frac{v^4}{c^4} \right) \right]. \tag{4.128}$$

In the limiting case $c - v \ll c$ the force is given by

$$F(a(t)) = -\frac{3 \hbar c}{8 \pi^2 a^4(t)} \left[ 1 + \frac{(c^2 - v^2)^2}{16 c^4} + O\!\left( \frac{(c^2 - v^2)^4}{c^8} \right) \right]. \tag{4.129}$$
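The size of the velocity-dependent terms in Eqs. (4.127)–(4.129) is easy to evaluate. The Python sketch below computes the leading correction factors for a given $\beta = v/c$ and compares the prefactor of the limit $v \to c$, Eq. (4.129), with the static electromagnetic prefactor $\pi^2/240$. It is a direct evaluation of the quoted formulas with the higher-order terms dropped; the function names are illustrative.

```python
import math


def scalar_factor(beta: float) -> float:
    """Bracket of Eq. (4.127) to order beta^2 (massless scalar field)."""
    return 1.0 + (8.0 / 3.0) * beta ** 2


def em_factor_slow(beta: float) -> float:
    """Bracket of Eq. (4.128) to order beta^2 (electromagnetic field, v << c)."""
    return 1.0 - (10.0 / math.pi ** 2 - 2.0 / 3.0) * beta ** 2


def em_limit_ratio() -> float:
    """Prefactor of Eq. (4.129) at v -> c divided by the static one, pi^2/240."""
    return (3.0 / (8.0 * math.pi ** 2)) / (math.pi ** 2 / 240.0)


beta = 0.1
em_coefficient = 10.0 / math.pi ** 2 - 2.0 / 3.0
ratio = em_limit_ratio()
print(f"scalar bracket at beta = {beta}: {scalar_factor(beta):.5f}")
print(f"EM bracket at beta = {beta}:     {em_factor_slow(beta):.5f}")
print(f"EM coefficient of beta^2:        {em_coefficient:.4f}")
print(f"ratio of v -> c to static force: {ratio:.4f}")
```

The ratio equals $90/\pi^4$, so even in the limit $v \to c$ the electromagnetic force at a given separation differs from the static value by less than 8%.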

It is seen from Eqs. (4.127) and (4.128) that the velocity-dependent correction to the Casimir force has a different sign in the scalar and electromagnetic cases. Also, the velocity dependence of the Casimir force for the electromagnetic field appears to be very slight (less than 8% of the static value for the same separation).

Now let us discuss the effect of photon creation in the case of a three-dimensional nonstationary cavity and the possibility of the experimental observation of this effect. As was mentioned in Section 2.4, the method suggested in [57] cannot be used in four-dimensional space–time (the generalization of this method to the resonant oscillations of a one-dimensional cavity was given in [168]). A perturbation method applicable to a single boundary moving along some prescribed trajectory was developed in [169]. It was used to calculate the radiated energy for a plane mirror in four-dimensional space–time. The analytical method of Ref. [61] used in Section 2.4 leaves room for application to three-dimensional resonant cavities. Let a rectangular cavity have dimensions $a_1, a_2, a_3$. If these dimensions do not depend on time, the eigenfrequencies $\omega_{n_1 n_2 n_3}$ are given by Eq. (4.32). For simplicity consider the case $a_3 \ll a_1 \sim a_2$ [61]. Under this condition the frequencies with $n_3 \neq 0$ are much greater than the frequencies with $n_3 = 0$. Studying the excitation of the lowest modes, one may put $n_3 = 0$, taking into account the fact that the interaction between low- and high-frequency modes is weak. As a consequence the vector potential of the electromagnetic field is directed along the z-axis and depends only on the two space coordinates $x$ and $y$. For $t \le 0$ the field operator is given by

$$A_z(t \le 0, x, y) = \sum_{n_1, n_2} \psi_{n_1 n_2}(x, y) \left[ e^{-i \omega_{n_1 n_2 0} t}\, a_{n_1 n_2} + e^{i \omega_{n_1 n_2 0} t}\, a^{+}_{n_1 n_2} \right], \tag{4.130}$$

where

$$\psi_{n_1 n_2}(x, y) = \frac{2c \sqrt{\pi}}{\sqrt{a_1 a_2 a_3\, \omega_{n_1 n_2 0}}} \sin \frac{\pi n_1 x}{a_1} \sin \frac{\pi n_2 y}{a_2}. \tag{4.131}$$

For $t > 0$ the dimension $a_1$ depends on time, $a_1 = a(t)$. The boundary conditions for the operator of the vector potential are

$$A_z|_{x=0} = A_z|_{x=a(t)} = A_z|_{y=0} = A_z|_{y=a_2} = 0. \tag{4.132}$$

For any time $t$ it can be found in the form

$$A_z(t, x, y) = \sum_{n_1, n_2} \tilde{\psi}_{n_1 n_2}(x, y)\, Q_{n_1 n_2}(t), \tag{4.133}$$

where the functions $\tilde{\psi}_{n_1 n_2}$ are obtained from (4.131) by replacing $a_1 = {\rm const}$ with $a_1 = a(t)$. The operators $Q_{n_1 n_2}(t)$ for $t > 0$ are unknown. Substituting (4.133) into the wave equation

$$\frac{\partial^2}{\partial t^2} A_z(t, x, y) - \Delta A_z(t, x, y) = 0, \tag{4.134}$$

one obtains a coupled system similar to (2.68) for their determination. The coefficients of this system are given by

$$h_{n_1 n_2,\, k_1 k_2} = a \int_0^{a} dx \int_0^{a_2} dy \int_0^{a_3} dz\, \tilde{\psi}_{k_1 k_2}(x, y)\, \frac{\partial \tilde{\psi}_{n_1 n_2}(x, y)}{\partial a} \tag{4.135}$$

(cf. (2.69)). Now let us assume that the wall oscillates according to (2.76),

$$a(t) = a_1 [1 - \varepsilon \cos(2 \omega_{n_1 n_2 0} t)] \tag{4.136}$$

with $\varepsilon \ll 1$. Using considerations analogous to those of Section 2.4 for the one-dimensional case, it is possible to find the total number of photons created by the time $t$ [61]:

$$n(t) = \sinh^2(\omega_{n_1 n_2 0}\, \gamma t), \tag{4.137}$$

where the frequency modulation depth $\gamma$ is defined by

$$\gamma = \frac{1}{2}\, \varepsilon \left[ 1 + \left( \frac{n_1 a_1}{n_2 a_2} \right)^2 \right]^{-1/2}. \tag{4.138}$$

It is notable that in the three-dimensional case both the total energy of the cavity and the number of photons grow exponentially with time. Let us discuss the possibility of detecting the photons created by virtue of the dynamical Casimir effect. Instead of oscillations of a wall as a whole it is more realistic to consider the oscillations of its surface due to strong acoustic waves excited inside the wall. The maximal possible value of the dimensionless displacement $\varepsilon$ from Eqs. (2.76), (4.136) which the wall material can endure is estimated as $\varepsilon_{\max} \sim 3 \times 10^{-8}$ for the lowest mode $\omega_1 \sim c\pi/a_0$ [61]. For a separation between the plates of the order of several centimeters, the frequency of the lowest mode is $\omega_1 \sim 60$ GHz. Substituting these numbers into Eq. (2.82) one obtains the photon creation rate for the one-dimensional cavity

$$\frac{dn_1(t)}{dt} \approx \frac{4}{\pi^2}\, \varepsilon_{\max}\, \omega_1 \sim 700\ {\rm s}^{-1}. \tag{4.139}$$
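The estimate (4.139) is a one-line computation, reproduced here as a Python sketch. It uses only the values quoted in the text ($\varepsilon_{\max} \sim 3 \times 10^{-8}$, $\omega_1 \sim 60$ GHz taken as $6 \times 10^{10}\ {\rm s}^{-1}$) and the relation $\omega_1 \sim c\pi/a_0$ to recover the corresponding plate separation; the names are illustrative.

```python
import math

C_LIGHT = 2.99792458e8   # speed of light, m/s
EPS_MAX = 3.0e-8         # maximal dimensionless displacement quoted from [61]
OMEGA_1 = 60.0e9         # lowest-mode frequency, taken as 6e10 s^-1


def photon_creation_rate(eps: float, omega: float) -> float:
    """Right-hand side of Eq. (4.139): photons per second in the 1D cavity."""
    return 4.0 / math.pi ** 2 * eps * omega


def plate_separation(omega: float) -> float:
    """Separation a0 implied by omega_1 ~ c*pi/a0, in meters."""
    return C_LIGHT * math.pi / omega


rate = photon_creation_rate(EPS_MAX, OMEGA_1)
a0 = plate_separation(OMEGA_1)
print(f"creation rate ~ {rate:.0f} photons/s, a0 ~ {100.0 * a0:.1f} cm")
```

The rate comes out slightly above 700 photons per second, and the implied separation is of the order of a centimeter, consistent with the orders of magnitude stated above.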

In the case of a three-dimensional cavity the number of created photons can be even larger due to the exponential dependence on time in Eq. (4.137). Using the same frequency value as in the one-dimensional case, $\omega_{n_1 n_2 0} \sim \omega_1 \sim 60$ GHz, and the displacement parameter $\varepsilon = \varepsilon_{\max}/100 \sim 3 \times 10^{-10}$, the frequency modulation depth of Eq. (4.138) takes the value $\gamma \sim 10^{-10}$. As a result it follows from Eq. (4.137) that

$$n(t) \approx \sinh^2(6t), \tag{4.140}$$

$t$ being measured in seconds, which results in approximately $4 \times 10^4$ photons created in the cavity during 1 s due to the wall oscillations. Such a large number of photons can be observed experimentally. The main problem is, however, how to excite high-frequency surface vibrations in the GHz range of sufficient amplitude.

There are many important factors which should be taken into account in future experiments on the dynamical Casimir effect. It should be noted that the above derivations were performed for perfectly reflecting walls. The case of a wall with any finite (but nondispersive) refractive index was considered in [170] for the scalar field in one-dimensional space. The methods elaborated in [170] can, however, be generalized to the electromagnetic field in four-dimensional space–time. An important point is the type of detector and its influence on the photon creation process. In [61] two types of detectors were discussed. The first one proposes that a beam of Rydberg atoms be injected into the cavity after the creation of a sufficient number of photons. The typical frequencies of order 10 GHz mentioned above correspond to transitions between atomic levels with $n \sim 100$ ($n$ being the principal quantum number). These transitions can be observed by well-known methods. The second type of detector suggests that a harmonic oscillator tuned to the frequency of the resonant mode be placed into the cavity from the start, so that a quantum system consisting of two subsystems is built up. The interaction between the radiated resonant modes and the detector can be described by a quadratic Hamiltonian with time-dependent coefficients. As a result, a squeezed state of the electromagnetic field is generated, and the number of photons in each subsystem is equal to one-half of the result given by Eq. (4.137) in the absence of the detector.

Another point which affects the possibility of experimental observation of the dynamical Casimir effect is the back reaction of the radiated photons upon the mirror. In Ref. [171] not only the total energy of radiated photons but also the dissipative force exerted on a single mirror moving nonrelativistically is considered. The effect, however, turns out to be very small (the creation rate there is of order $10^{-5}$ photons/s only, compared with the much larger rates obtained above in the case of a cavity). In Ref. [172] a master equation is derived describing a one-dimensional nonrelativistic mirror interacting with vacuum via radiation pressure (fluctuations of the position of a dispersive mirror driven by the vacuum radiation pressure were considered earlier in [173]). The other part of the radiative reaction force exerted on a free mirror, which is not dissipative but of reactive nature, was considered in [170,174]. The existence of this force leads to corrections to the inertial mass of the mirrors.

All the above considerations of the dynamical Casimir effect deal with the case of zero temperature. The Casimir effect at nonzero temperature will be considered in Section 5.1. Here we touch only on the influence of nonzero temperature upon the photon creation rate in the dynamical Casimir effect. In Ref. [175] the temperature correction to the number of photons created by a moving mirror is derived in the framework of response theory. It was shown that for a resonantly vibrating cavity of a typical size of about 1 cm at room temperature a thermal factor of order $10^3$ should be considered along with the zero temperature result (4.137). Due to this, at room temperature the number of created photons will be 3 orders of magnitude larger than the $4 \times 10^4$ photons created during 1 s for a displacement parameter $\varepsilon \sim 3 \times 10^{-10}$ (see Eq. (4.140)).
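The numbers entering Eq. (4.140) and the thermal estimate can be reproduced with a few lines of Python. The sketch evaluates Eq. (4.137) with $\omega_{n_1 n_2 0} \sim 6 \times 10^{10}\ {\rm s}^{-1}$ and $\gamma \sim 10^{-10}$, and then applies the thermal factor of order $10^3$ quoted from [175] as a plain multiplicative factor. That last step is a simplifying assumption made here for illustration; the names are illustrative as well.

```python
import math

OMEGA = 60.0e9          # resonant frequency, taken as 6e10 s^-1
GAMMA = 1.0e-10         # frequency modulation depth, Eq. (4.138), order of magnitude
THERMAL_FACTOR = 1.0e3  # order-of-magnitude factor quoted from [175]


def photons_created(t_seconds: float, omega: float, gamma: float) -> float:
    """Eq. (4.137): number of photons created in the 3D cavity by time t."""
    return math.sinh(omega * gamma * t_seconds) ** 2


n_zero_temperature = photons_created(1.0, OMEGA, GAMMA)
n_room_temperature = THERMAL_FACTOR * n_zero_temperature
print(f"zero temperature, t = 1 s: {n_zero_temperature:.3e} photons")
print(f"with thermal factor:       {n_room_temperature:.3e} photons")
```

Because of the exponential growth, halving $\gamma$ (for instance by halving the displacement parameter) reduces the one-second yield by more than two orders of magnitude, which is why the thermal enhancement matters for relaxing the requirements on the wall vibration amplitude.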
This makes it possible to decrease the displacement parameter and makes the experimental observation of the dynamical Casimir effect in the near future more probable.

4.5. Radiative corrections to the Casimir effect

From the point of view of quantum field theory the Casimir effect is a one-loop radiative correction to an external classical background given by some boundary conditions (or background fields, in a broader understanding). The question of higher loop corrections naturally appears. So the first radiative correction to the Casimir effect is, in fact, a two-loop contribution. In general, the expected effects are very small. First, they are suppressed by the corresponding coupling constant. Second, for boundary conditions they are suppressed by the ratio of the Compton wavelength $\lambda_c$ of the corresponding quantum field to the characteristic macroscopic geometrical size, for instance the plate separation $a$. Nevertheless, there is general interest in the consideration of radiative corrections, mainly as a test of the applicability of the general methods of perturbative quantum field theory. For example, the covariant formulation in the presence of explicitly noncovariant boundaries and the generalization of the perturbation expansion of a gauge theory are among the important issues to be addressed in this context. Boundary conditions necessitate a modification of the renormalization procedure as well. The interaction of the quantum fields with the boundary leads to ultraviolet divergences in vacuum graphs that cannot simply be discarded by normal ordering. Last but not least, the question of the order of the geometrical effects is raised, i.e., the question of the leading power of $\lambda_c/a$ contributing to the effective action. Additional attention to radiative corrections arises in the bag model of QCD, where they are not a priori negligible.

In the framework of general perturbative QED the radiative corrections to the ground state energy can be obtained from the effective action. We start from representation (3.90) of the generating functional of the Green's functions with $Z^{(0)}$ being the corresponding functional of the free Green's functions, Eq. (3.96). Then the effective action is given by (see footnote 6)

$$\Gamma = \frac{i\hbar}{2}\, {\rm Tr} \ln K + \frac{i\hbar}{2}\, {\rm Tr} \ln \bar{K} - i\hbar \sum \{\text{1PI graphs}\}. \tag{4.141}$$

Here, in addition to Eq. (3.5), we have the trace of the operator $\bar{K}$ according to the representation, Eq. (3.96), of the generating functional, and the sum is over all one-particle irreducible (1PI) graphs with no external legs (vacuum graphs) resulting from Eq. (3.90). Since the boundary conditions are static, the effective action is proportional to the total time $T$ and the ground state energy is given by

$$E_0 = -\frac{1}{T}\, \Gamma. \tag{4.142}$$

If there are translation-invariant directions (e.g., parallel to the plates), the relevant physical quantity is the energy density and we have to divide by the corresponding volume as well. In this representation, the first contribution, $(i/2)\hbar\, {\rm Tr} \ln K$, does not depend on the boundary and delivers just the Minkowski space contribution. The second contribution, $(i/2)\hbar\, {\rm Tr} \ln \bar{K}$, delivers the boundary-dependent part, and the third contribution contains the higher loops. We consider the first of them. Graphically it is given by [two-loop vacuum diagram: a closed spinor loop with one photon line]. Here, the solid line is the spinor propagator and the wavy line represents the photon propagator. So up to the two-loop order we find the following expression for the ground state energy:

$$E_0 \equiv E_0^{(0)} + E_0^{(1)} = -\frac{i\hbar}{2T}\, {\rm Tr} \ln K - \frac{i\hbar}{2T}\, {\rm Tr} \ln \bar{K} + \frac{i\hbar}{2T} \int dx\, dy\, [D_{\mu\nu}(x - y) - \bar{D}_{\mu\nu}(x, y)]\, \Pi^{\mu\nu}(y - x), \tag{4.143}$$

where

$$\Pi^{\mu\nu}(x - y) = -i\alpha\, {\rm Tr}\, \gamma^{\mu} S^c(x - y)\, \gamma^{\nu} S^c(y - x) \tag{4.144}$$

is the known polarization tensor, $S^c(x - y)$ is the spinor propagator and $\alpha = e^2/\hbar c$ is the fine structure constant.

Footnote 6: As we restored the dimensional constants $\hbar$ and $c$ in this section, we note that $K$ and $\bar{K}$ do have dimensions. However, as the functional determinants here and below are defined up to a constant, it would be an unnecessary complication to introduce factors making the arguments of the logarithms dimensionless.

Here we briefly pause from the calculation of the radiative correction and instead recalculate the one-loop contribution in order to illustrate the given representation. Using the explicit expression (3.101), the boundary-dependent part takes the form

$$E_0^{(0)} \equiv -\frac{i\hbar}{2TV}\, {\rm Tr} \ln \bar{K} = -\frac{i\hbar}{2TV} \int d^3 x_\alpha \int \frac{d^3 k_\alpha}{(2\pi)^3}\, {\rm tr} \ln \left( -\frac{i}{2\Gamma}\, \delta_{st} h_{ij} \right),$$

where tr is related to the two-dimensional indices, $V$ is the volume of the translationally invariant directions parallel to the plates and $E_0^{(0)}$ is the energy density per unit area of the plates. We perform the Wick rotation $k_0 \to i k_0$ (thereby $\Gamma \to i\gamma \equiv i\sqrt{k_0^2 + k_1^2 + k_2^2}$) and calculate the "tr ln":

$${\rm tr} \ln \left( -\frac{1}{2\gamma}\, \delta_{st} h_{ij} \right) = \ln \det \left( -\frac{1}{2\gamma}\, \delta_{st} h_{ij} \right) = 2 \ln (1 - e^{-2\gamma a}) - 4 \ln (2\gamma).$$

The term $-4 \ln(2\gamma)$ yields a distance-independent contribution and will be dropped. In this way the known result for the Casimir energy is reproduced:

$$E_0^{(0)} = \hbar c \int \frac{d^3 k_\alpha}{(2\pi)^3} \ln (1 - e^{-2\gamma a}) = -\frac{\pi^2 \hbar c}{720 a^3}.$$

Now we turn back to the radiative correction. Owing to the separation of the photon propagator into two parts, the radiative correction in (4.143) separates into two corresponding parts as well. The part proportional to the free space photon propagator $D_{\mu\nu}(x - y)$ is just the same as without boundary conditions. It delivers a constant contribution not depending on the boundary and we drop it as part of the Minkowski space contribution. Hence, we can restrict ourselves to the contribution from $\bar{D}_{\mu\nu}(x, y)$. We rewrite it using (3.95) and (4.144) in the form

$$E_0^{(1)} = -\frac{i\hbar}{2TV} \int_{\Sigma} d^3 z \int_{\Sigma} d^3 z'\, \bar{K}^{-1}_{st}(z, z')\, \bar{\Pi}^{ts}(z', z), \tag{4.145}$$

where the abbreviation

$$\bar{\Pi}^{ts}(z', z) = \int dx\, dy\, E_{\mu}^{t\dagger}(z')\, D^{\mu\mu'}(z' - y)\, \Pi_{\mu'\nu'}(y - x)\, D^{\nu'\nu}(x - z)\, E_{\nu}^{s}(z) \Big|_{z,\,z' \in \Sigma} \tag{4.146}$$

has been introduced. Using the transversality of S4 (x), S4 (x) = (g4 92x − 9x 9x4 )S(x2 ) ; this expression simpli3es to ts SZ (z ; z) = d x dy Et† (z )D(z − y)g4 92y S((y − x)2 )D(x − z)E4s (z)|z; z ∈G : Introducing the Fourier transform of S(x) d 4 q iqx ˜ 2 2 e S(q ) ; S(x ) = (2 )4 we rewrite it in the 3nal form ˜ 2) s d 4 q iq(z−z ) t† 4 S(q ts t† e E (z )g E (z )|z; z ∈G : SZ (z ; z) = E (z ) (2 )4 −q2 4

(4.147)

(4.148)



This representation of the radiative correction is still valid for a general surface $S$. Here, we observe another important conclusion. The scalar part $\tilde\Pi(q^2)$ of the polarization tensor is known to possess a logarithmic divergence which is independent of $q$ (for example, in Pauli–Villars regularization it is of the form $-(2\alpha/3\pi)\ln(M/m)$, where $M$ is the regularizing mass and $m$ the mass of the electron). However, if only a $q$-independent constant is added to $\tilde\Pi$, i.e. $\tilde\Pi(q^2) \to \tilde\Pi(q^2) + C$, the quantity $\bar\Pi^{ts}$ changes according to $\bar\Pi^{ts}(z',z) \to \bar\Pi^{ts}(z',z) + C\bar K^{ts}(z',z)$. Then it may be verified with the help of (3.94) that the corresponding change in the ground state energy is a simple constant which is independent of the geometry. Consequently, the removal of the divergence of the polarization tensor can be interpreted as a renormalization of the cosmological constant, analogous to the free Minkowski space contributions discussed above. As a result, only the finite, renormalized part of the polarization tensor needs to be taken into account when calculating the boundary dependent part of the radiative correction $E_0^{(1)}$.

Now, we rewrite the radiative correction (4.148) in the specific geometry of two parallel planes. Using the polarization vectors $E^s_\mu$ ($s=1,2$), (3.97), which commute with the polarization tensor, we obtain

$$\bar\Pi^{st}(z-z') = \delta_{st}\int\frac{d^4q}{(2\pi)^4}\,\frac{\tilde\Pi(q^2)}{q^2}\,e^{iq(z-z')}\Big|_{z,z'\in S}. \tag{4.149}$$

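We substitute this expression into the radiative correction (4.145) and find

$$E_0^{(1)} = \hbar c\int\frac{d^4k}{(2\pi)^4}\sum_{i,j=1}^{2}\Gamma\,h^{-1}_{ij}\,e^{-ik_3(a_j-a_i)}\,\frac{\tilde\Pi(k^2)}{k^2} \tag{4.150}$$

with $\Gamma = \sqrt{k_\alpha k^\alpha + i\epsilon}$ and $k^2 = k_\mu k^\mu$. Using (3.103) the last equation takes the form

$$E_0^{(1)} = 2i\hbar c\int\frac{d^4k}{(2\pi)^4}\,\frac{\Gamma}{\sin\Gamma a}\left(e^{-i\Gamma a} - \cos k_3a\right)\frac{\tilde\Pi(k^2)}{k^2}. \tag{4.151}$$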
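The algebra behind the step from (4.150) to (4.151) can be spot-checked numerically. The sketch below assumes $h_{ij} = e^{i\Gamma|a_i-a_j|}$ for two planes at $a_1 = 0$, $a_2 = a$; this form is an assumption made here (Eq. (3.103) is not reproduced in this section), chosen because it is consistent with the determinant $1 - e^{-2\gamma a}$ used in the one loop calculation. It checks only the bracket $(e^{-i\Gamma a} - \cos k_3a)/\sin\Gamma a$; the remaining overall numerical factor depends on the normalization of (3.103) and is not tested. The function names are illustrative.

```python
import cmath

def h_inverse_sum(Gamma, k3, a):
    """Sum over i, j of h^{-1}_{ij} * exp(-i k3 (a_j - a_i)) for planes at 0 and a.

    Assumption (not taken from the text): h_ij = exp(i * Gamma * |a_i - a_j|).
    """
    positions = [0.0, a]
    e = cmath.exp(1j * Gamma * a)
    det = 1 - e * e                      # determinant of the 2x2 matrix h
    h_inv = [[1 / det, -e / det],
             [-e / det, 1 / det]]        # explicit inverse of [[1, e], [e, 1]]
    total = 0j
    for i in range(2):
        for j in range(2):
            phase = cmath.exp(-1j * k3 * (positions[j] - positions[i]))
            total += h_inv[i][j] * phase
    return total

def closed_form(Gamma, k3, a):
    """i * (exp(-i Gamma a) - cos(k3 a)) / sin(Gamma a), the bracket of (4.151)."""
    return 1j * (cmath.exp(-1j * Gamma * a) - cmath.cos(k3 * a)) / cmath.sin(Gamma * a)

# Arbitrary test point; Gamma is complex because of the +i*epsilon prescription.
difference = abs(h_inverse_sum(0.7 + 0.2j, 1.3, 0.9) - closed_form(0.7 + 0.2j, 1.3, 0.9))
```

At the chosen test point the two expressions agree to rounding accuracy.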

By means of the trivial relation $\exp(-i\Gamma a) = \exp(i\Gamma a) - 2i\sin\Gamma a$ we again separate a distance independent contribution that will be omitted. Now, we perform the Wick rotation and obtain the final result for the radiative correction to the ground state energy to order $\alpha$ in the geometry of two parallel conducting planes:

$$E_0^{(1)} = 2\hbar c\int\frac{d^4k}{(2\pi)^4}\,\frac{\gamma}{\sinh\gamma a}\left(e^{-\gamma a} - \cos k_3a\right)\frac{\tilde\Pi(k^2)}{k^2}. \tag{4.152}$$

It is interesting to study the radiative correction (4.152) in the limit $\lambda_c/a \ll 1$. For this purpose we transform the integration path of the $k_3$-integration. Due to $\tilde\Pi(k_3^2+\gamma^2)$ there is a cut with branch point $k_3 = i\sqrt{4(mc/\hbar)^2+\gamma^2}$ in the upper half of the complex $k_3$-plane. The discontinuity of the one loop vacuum polarization $\tilde\Pi(k^2)$ across the cut, $\mathrm{disc}\,\tilde\Pi(k^2) = \tilde\Pi(k^2+i\epsilon) - \tilde\Pi(k^2-i\epsilon)$, is well known [6]:

$$\mathrm{disc}\,\tilde\Pi(k^2) = -\frac{2i}{3}\,\alpha\sqrt{1-\frac{4m^2c^2}{\hbar^2k^2}}\left(1+\frac{2m^2c^2}{\hbar^2k^2}\right). \tag{4.153}$$



We move the integration contour towards the imaginary axis so that the cut is enclosed. The result can be written in the form

$$E_0^{(1)} = \frac{i\hbar c}{\pi}\int\frac{d^3k_\alpha}{(2\pi)^3}\,\frac{\gamma}{\sinh\gamma a}\int_{2mc/\hbar}^{\infty}\frac{dk_3}{k_3}\,\frac{\mathrm{disc}\,\tilde\Pi(k_3^2)}{\sqrt{k_3^2+\gamma^2}}\left(e^{-\gamma a} - e^{-\sqrt{k_3^2+\gamma^2}\,a}\right). \tag{4.154}$$

In the limit $mca \gg \hbar$, the second term in the round brackets is exponentially suppressed and can be neglected. The desired series in inverse powers of $mca/\hbar$ is now simply achieved by expanding the square root in the denominator:

$$E_0^{(1)} = \frac{i\hbar c}{\pi}\int\frac{d^3k_\alpha}{(2\pi)^3}\,\frac{\gamma e^{-\gamma a}}{\sinh\gamma a}\int_{2mc/\hbar}^{\infty}dk_3\,\frac{\mathrm{disc}\,\tilde\Pi(k_3^2)}{k_3^2}\left[1+O\left(\frac{\gamma^2}{k_3^2}\right)\right]. \tag{4.155}$$

After some elementary integrations and with the help of

$$\int_{2mc/\hbar}^{\infty}dk_3\,\frac{\mathrm{disc}\,\tilde\Pi(k_3^2)}{k_3^2} = -\frac{3\pi i}{16}\,\frac{\alpha\hbar}{2mc}, \tag{4.156}$$

we find the leading order contribution to the radiative correction

$$E_0^{(1)} = \frac{\pi^2\alpha\hbar^2}{2560\,ma^4} + O\left(\frac{\alpha\hbar^3}{m^2ca^5}\right) \tag{4.157}$$

in agreement with [92,100].

The calculation of the radiative correction to the Casimir effect for a conducting sphere was carried out in [95]. It follows essentially the lines shown here but is technically more involved. Therefore, we discuss only the results here. First, the ultraviolet divergencies, which could simply be dropped in the case of plane-parallel plates (as they were distance independent), deserve special consideration in the spherical case. In zeta-functional regularization there are divergent contributions

$$E_0^{(1)\mathrm{div}} = -\frac{16}{9}\,\alpha m^3R^2\,\frac{c^4}{\hbar^2} - \frac{4}{15}\,\alpha mc^2 \tag{4.158}$$

to $E_0^{(1)}$ (here we use the same definition of the divergent part as in Section 3.3, namely the contributions with nonnegative powers of the mass). In this case the mass is that of the spinor field. The physical argument is the same: the vacuum energy must vanish when the fluctuating field becomes too massive. We note that there is actually no divergence in (4.158). This is due to the specific regularization used. One can expect that the same quantity calculated in another regularization shows up as a real divergence like those in the corresponding contributions in Eq. (3.78). Finally, we note that in the (one loop) expression for the divergent part of the ground state energy in zeta-functional regularization, Eq. (3.73), for a massless field only the contribution of $a_2$ is present. Thus, we may consider the divergent contributions given by Eq. (4.158) as radiative corrections to the corresponding heat kernel coefficients $a_{1/2}$ and $a_{3/2}$. In order to perform the renormalization we must assume a procedure as described in Section 3.4. We have to define a classical system with energy $E^{\rm class} = \sigma S + k$ as a special case of Eq. (3.85), because there are only two heat kernel coefficients different from zero.
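The numerical coefficients of the parallel-plate results can be cross-checked with a few lines of quadrature. The sketch below works in units $\hbar = c = a = 1$ and evaluates the one loop integral $\int d^3k/(2\pi)^3\,\ln(1-e^{-2\gamma})$, the $\gamma$-integral appearing in (4.155), and the cut integral (4.156) after the substitution $k_3 = (2mc/\hbar)/\sin t$ (a substitution chosen here only to make the integrand smooth). The helper `simpson` and all variable names are illustrative and not taken from the text.

```python
import math

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

# One loop energy in units of hbar*c/a^3:
# (1/(2 pi^2)) * int_0^inf k^2 ln(1 - exp(-2k)) dk, expected -pi^2/720.
def one_loop_integrand(k):
    return k * k * math.log1p(-math.exp(-2.0 * k)) if k > 0.0 else 0.0

one_loop = simpson(one_loop_integrand, 0.0, 40.0) / (2.0 * math.pi ** 2)

# gamma-integral of (4.155) in units of 1/a^4:
# (1/(2 pi^2)) * int_0^inf g^3 exp(-g)/sinh(g) dg, expected pi^2/240.
def gamma_integrand(g):
    return g ** 3 * math.exp(-g) / math.sinh(g) if g > 0.0 else 0.0

gamma_int = simpson(gamma_integrand, 0.0, 40.0) / (2.0 * math.pi ** 2)

# Cut integral of (4.156): with k3 = (2mc/hbar)/sin(t) it becomes
# (hbar/(2mc)) * int_0^{pi/2} cos^2(t) (1 + sin^2(t)/2) dt, times -(2i/3)*alpha.
def cut_integrand(t):
    return math.cos(t) ** 2 * (1.0 + 0.5 * math.sin(t) ** 2)

cut_int = simpson(cut_integrand, 0.0, math.pi / 2)   # expected 9*pi/32

# Assemble (i/pi) * gamma_int * (-(2i/3)) * (1/2) * cut_int
# in units of alpha*hbar^2/(m a^4); expected pi^2/2560 as in (4.157).
coefficient = gamma_int * cut_int / (3.0 * math.pi)
```

Within the quadrature error the three integrals reproduce $-\pi^2/720$, $\pi^2/240$ and $9\pi/32$, and their combination gives the coefficient $\pi^2/2560$ of (4.157).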



We present the finite parts, i.e. the renormalized ground state energy for parallel plates and that for the sphere, together with the corresponding one loop contribution, respectively, as

$$E_0^{\rm plates} = -\frac{\pi^2\hbar c}{720\,a^3} + \frac{\pi^2\alpha\hbar^2}{2560\,ma^4},$$

$$E_0^{\rm sphere} = \frac{0.092353\,\hbar c}{2R} - \frac{\alpha\hbar^2}{mR^2}\left[7.5788\times10^{-4}\,\ln\frac{mcR}{\hbar} + 6.4833\times10^{-3}\right]. \tag{4.159}$$

The numbers in $E_0^{\rm sphere}$ are a result of numerical computation in [95], whereby the one loop contribution is in fact that of Boyer, the number being taken from [93]. The common feature of both equations is that the radiative correction is the largest possible, i.e., it is proportional to the first power of the fine structure constant and to the first power of the ratio of the Compton wavelength of the electron, $\lambda_c = \hbar/(mc)$, to the geometric quantity (the distance $a$ or the radius $R$, respectively). The appearance of the logarithmic contribution for the sphere can be understood as a consequence of the radius dependent ultraviolet divergence, Eq. (4.158). In both cases, i.e., for the plates as well as for the sphere, the radiative correction can be interpreted as a renormalization of the distance by the spinor loop such as

$$a \to a\left(1+\frac{3}{32}\,\frac{\alpha\lambda_c}{a}\right),$$

$$R \to R\left[1+\left(1.4040\times10^{-1} + 1.6413\times10^{-2}\,\ln\frac{mcR}{\hbar}\right)\frac{\alpha\lambda_c}{R}\right], \tag{4.160}$$

thus making them larger. This is the same in both cases regardless of the different overall sign in the energy.

An extension of the above calculations is given in [176], where the radiative correction to the Casimir force between partly transmitting mirrors was calculated. These mirrors are given by delta function potentials on parallel planes, which is equivalent to the matching conditions (3.27) with the transmission coefficient (3.28). It was demonstrated that the calculation can be performed as a straightforward generalization of the above formulas. As a result, the relative weight of the radiative correction $\Delta E_0$ can be represented as

$$\frac{\Delta E_0}{E_0} = \frac{\alpha\hbar}{mca}\,f(\beta a), \tag{4.161}$$

where $\beta$ is now the strength of the delta potential (see Eq. (3.27)) and $f(\beta a)$ is a smooth function with limiting values

$$f(\beta a) = -\frac{9}{32}\left(1+\frac{2.92}{\beta a}+\cdots\right)\quad\text{for } \beta a\to\infty, \tag{4.162}$$

i.e., in the limit of impenetrable mirrors (4.159) is again obtained, and

$$f(\beta a) = \frac{3}{16}\,\beta a\quad\text{for } \beta a\to0 \tag{4.163}$$

for nearly completely penetrable mirrors. It is worth noting that in this case the "largest possible" radiative correction, proportional to the first powers of $\alpha$ and $\lambda_c/a$, is present.

In [100] the radiative correction was calculated at finite temperature. For intermediate temperatures, obeying $k_BT \ll mc^2$, it turned out to be given by the first line in Eq. (4.160), i.e., by the same substitution as in the zero temperature case.
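The interpretation of the radiative correction as a shift of the distance can be verified by simple arithmetic: inserting the substitutions (4.160) into the one loop energies and expanding to first order in $\alpha\lambda_c$ should reproduce the correction terms of (4.159). The following sketch does this for the quoted numbers; the variable names are illustrative.

```python
import math

# Plates: E = -pi^2 hbar c / (720 a^3). With a -> a (1 + delta) the energy changes by
# +3 * delta * pi^2 hbar c / (720 a^3) to first order, delta = (3/32) alpha lambda_c / a.
shift_plates = 3.0 / 32.0
plates_coeff = 3.0 * shift_plates * math.pi ** 2 / 720.0   # units alpha hbar^2 / (m a^4)

# Relative weight of the correction for impenetrable mirrors,
# in units of alpha*hbar/(m c a); should equal the limit -9/32 in (4.162).
relative_weight = -plates_coeff / (math.pi ** 2 / 720.0)

# Sphere: E = 0.092353 hbar c / (2R). With R -> R (1 + delta) the energy changes by
# -delta * 0.092353 hbar c / (2R), delta = (c0 + c1 ln(mcR/hbar)) alpha lambda_c / R.
boyer_number = 0.092353
sphere_const = 0.5 * boyer_number * 1.4040e-1   # compare with 6.4833e-3 in (4.159)
sphere_log = 0.5 * boyer_number * 1.6413e-2     # compare with 7.5788e-4 in (4.159)
```

Both lines of (4.160) are consistent with (4.159) to the number of digits quoted.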



4.6. Spaces with non-Euclidean topology

As was mentioned in the Introduction, the Casimir effect arises not only in the presence of material boundaries but also in spaces with nontrivial topology. The latter leads to boundary conditions or, more exactly, periodicity conditions imposed on the wave functions, which are of the same kind as those due to boundaries. In Section 2.3 simple examples in one- and two-dimensional spaces were already given. Here we consider in detail the four-dimensional space–time of a closed Friedmann model, which is physically important. The vacuum polarization in the space–time of cosmic strings is also discussed, and the vacuum interaction between two parallel strings is calculated. Another example of the Casimir effect in spaces with non-Euclidean topology is given by Kaluza–Klein theories. There the local Casimir energy density and pressure can lead to spontaneous compactification of extra spatial dimensions with a definite value of the compactification parameter.

4.6.1. Cosmological models

According to the concept of standard cosmology, the space of our Universe is homogeneous and isotropic. Depending on the value of the mean density of matter, the cosmological models can be open (curved hyperbolic space of infinite volume), quasi-Euclidean (flat infinite space in three dimensions), or closed (curved spherical space of finite volume). At present the precise value of the mean density of matter is not known. Thus, it does not offer a unique means of deciding among the three possible models, only one of which (the closed one) possesses non-Euclidean topology and, consequently, a Casimir effect. Note that nontrivial topology may be introduced in any model, including the quasi-Euclidean case, by imposing some point identification (see below). The closed Friedmann model, however, is topologically nontrivial by itself and incorporates the Casimir effect naturally, in contrast to the other models mentioned above.
It is well known that in gravitational theory it is the energy density and the other components of the stress-energy tensor, and not the total energy, which are the physical quantities of importance. They are present in the right hand side of the Einstein equations and determine the space–time geometry. Because of this, we are interested here (and in the next section also) not in the global characteristics of a vacuum but in the local ones: $\langle0|T_{ik}(x)|0\rangle_{\rm ren}$. Due to the symmetry properties of the space–time only the diagonal components survive. In addition, the space components are equal to each other and can be expressed through the energy density by means of the conservation condition. Thus we are interested here only in the energy density

$$\varepsilon(\eta) = \langle0|T_{00}(x)|0\rangle, \tag{4.164}$$

where $\eta$ is a conformal time variable, and the stress-energy tensor is defined in Eq. (2.51). The metric of the closed Friedmann model has the form

$$ds^2 = a^2(\eta)\left(d\eta^2 - dl^2\right),\qquad dl^2 = d\chi^2 + \sin^2\chi\left(d\theta^2 + \sin^2\theta\,d\varphi^2\right), \tag{4.165}$$

where $a(\eta)$ is a scale factor with dimensions of length, $0\le\chi\le\pi$, and $\theta$ and $\varphi$ are the usual spherical angles. It is seen that the space section of a closed model is the surface of a 3-sphere with topology $S^3$.

The dependence of the radius of curvature $a(\eta)$ on time complicates the issue. With this dependence the case under consideration can be likened to a dynamical Casimir effect with moving boundaries (see Sections 2.4 and 4.4). Because of this, other vacuum quantum effects may be expected in addition to the usual Casimir vacuum polarization depending on the sphere radius. Taking into account that $N = 3$, $\xi = 1/6$ and the scalar curvature is $R = 6(a''+a)/a^3$, where the prime denotes differentiation with respect to the conformal time $\eta$, we rewrite Eq. (2.50) in the form

$$\varphi''(x) + 2\,\frac{a'}{a}\,\varphi'(x) - \Delta^{(3)}\varphi(x) + \left(\frac{m^2c^2a^2}{\hbar^2} + \frac{a''}{a} + 1\right)\varphi(x) = 0, \tag{4.166}$$

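where $\Delta^{(3)}$ is the angular part of the Laplacian operator on a 3-sphere and $x = (\eta,\chi,\theta,\varphi)$. The orthonormal set of solutions to Eq. (4.166) can be represented as

$$\varphi^{(+)}_{\lambda lM}(x) = \frac{1}{\sqrt2\,a(\eta)}\,g_\lambda(\eta)\,\Phi^*_{\lambda lM}(\chi,\theta,\varphi),\qquad \varphi^{(-)}_{\lambda lM}(x) = \left[\varphi^{(+)}_{\lambda lM}(x)\right]^*, \tag{4.167}$$

where the eigenfunctions of the Laplacian operator are defined according to

$$\Phi_{\lambda lM}(\chi,\theta,\varphi) = \frac{1}{\sqrt{\sin\chi}}\sqrt{\frac{\lambda(\lambda+l)!}{(\lambda-l-1)!}}\;P^{-l-1/2}_{\lambda-1/2}(\cos\chi)\,Y_{lM}(\theta,\varphi), \tag{4.168}$$

with $\lambda = 1,2,\ldots$ and $l = 0,1,\ldots,\lambda-1$; $Y_{lM}$ are the spherical harmonics and $P^\mu_\nu(z)$ are the associated Legendre functions on the cut. The discrete quantity $\lambda$ has the sense of a dimensionless momentum, the physical momentum being $\hbar\lambda/a$. The time dependent function $g_\lambda$ satisfies the oscillatory equation

$$g_\lambda''(\eta) + \omega_\lambda^2(\eta)\,g_\lambda(\eta) = 0,\qquad \omega_\lambda^2(\eta) = \lambda^2 + \frac{m^2c^2a^2(\eta)}{\hbar^2}, \tag{4.169}$$

with the time dependent frequency and initial conditions fixing the frequency sign at the initial time:

$$g_\lambda(\eta_0) = \frac{1}{\sqrt{\omega_\lambda(\eta_0)}},\qquad g_\lambda'(\eta_0) = i\sqrt{\omega_\lambda(\eta_0)}. \tag{4.170}$$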

Eigenfunctions (4.167)–(4.170) define the vacuum state at a moment $\eta_0$. In the homogeneous isotropic case one may put $\eta_0 = 0$. Substituting the field operator expanded in terms of the functions (4.167), (4.168) into the 00-component of (2.51) and calculating the mean value in the initial vacuum state according to (4.164), one obtains the nonrenormalized vacuum energy density

$$\varepsilon^{(0)}(\eta) = \frac{\hbar c}{4\pi^2a^4(\eta)}\sum_{\lambda=1}^{\infty}\lambda^2\omega_\lambda(\eta)\left[2s_\lambda(\eta)+1\right],\qquad s_\lambda(\eta) = \frac{1}{4\omega_\lambda}\left(|g_\lambda'|^2 + \omega_\lambda^2|g_\lambda|^2 - 2\omega_\lambda\right). \tag{4.171}$$
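The behavior of $s_\lambda(\eta)$ can be illustrated by integrating the oscillator equation (4.169) numerically from the initial conditions (4.170). The sketch below uses a classical fourth-order Runge–Kutta step and a linearly growing scale factor $a(\eta) = 1+\eta$; this choice of $a(\eta)$, the units (in which $a$ is dimensionless and `M` stands for $mc/\hbar$), and all function names are assumptions made for illustration only.

```python
import math

def s_history(lam, M, a, eta_max=1.0, steps=4000):
    """Integrate g'' + omega^2(eta) g = 0 (Eq. 4.169) with RK4; return s_lambda on the grid.

    lam : dimensionless momentum lambda
    M   : m*c/hbar in units where the scale factor a(eta) is dimensionless
    a   : callable scale factor (a hypothetical choice made by the caller)
    """
    def omega2(eta):
        return lam * lam + (M * a(eta)) ** 2

    h = eta_max / steps
    w0 = math.sqrt(omega2(0.0))
    g = complex(1.0 / math.sqrt(w0), 0.0)   # g(eta_0) from Eq. (4.170)
    p = complex(0.0, math.sqrt(w0))          # g'(eta_0) from Eq. (4.170)
    eta = 0.0
    history = []
    for step in range(steps + 1):
        w = math.sqrt(omega2(eta))
        # s_lambda as defined in Eq. (4.171)
        history.append((abs(p) ** 2 + w * w * abs(g) ** 2 - 2.0 * w) / (4.0 * w))
        if step == steps:
            break
        w2_mid = omega2(eta + 0.5 * h)
        k1g, k1p = p, -omega2(eta) * g
        k2g, k2p = p + 0.5 * h * k1p, -w2_mid * (g + 0.5 * h * k1g)
        k3g, k3p = p + 0.5 * h * k2p, -w2_mid * (g + 0.5 * h * k2g)
        k4g, k4p = p + h * k3p, -omega2(eta + h) * (g + h * k3g)
        g += h * (k1g + 2 * k2g + 2 * k3g + k4g) / 6.0
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        eta += h
    return history

def expanding(eta):
    """Hypothetical scale factor used only for this illustration."""
    return 1.0 + eta

s_massless = s_history(lam=2.0, M=0.0, a=expanding)   # omega is constant
s_massive = s_history(lam=1.0, M=2.0, a=expanding)    # omega grows with eta
```

In this sketch $s_\lambda$ vanishes at $\eta_0$ by construction, stays at zero (up to rounding) when the frequency is constant, and becomes positive once the frequency varies in time.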

The corresponding vacuum energy density in tangential Minkowski space at a given point is

$$\varepsilon^{(0)}_M(\eta) = \frac{\hbar c}{4\pi^2a^4(\eta)}\int_0^\infty\lambda^2\,d\lambda\,\omega_\lambda(\eta). \tag{4.172}$$

Subtracting (4.172) from (4.171) with the help of the Abel–Plana formula (2.25) we come to the result

$$\varepsilon^{(0)}_{\rm ren}(\eta) = E^{(0)}(\eta) + \frac{\hbar c}{2\pi^2a^4(\eta)}\sum_{\lambda=1}^{\infty}\lambda^2\omega_\lambda(\eta)s_\lambda(\eta), \tag{4.173}$$

where the Casimir energy density of a scalar field in a closed Friedmann model is [17]

$$E^{(0)}(\eta) = \frac{\hbar c}{2\pi^2a^4(\eta)}\int_{mca(\eta)/\hbar}^{\infty}\frac{\lambda^2\,d\lambda}{e^{2\pi\lambda}-1}\left(\lambda^2 - \frac{m^2c^2a^2(\eta)}{\hbar^2}\right)^{1/2}. \tag{4.174}$$

Note that in subtracting the two infinite quantities (4.171) and (4.172) the damping function was introduced implicitly (see Section 2.2 for details). Also, Eq. (2.35) was taken into account to obtain Eq. (4.174).

The index ren on the energy density in (4.173) is put by convention. However, as will become clear soon, additional renormalizations are needed in (4.173). In the framework of dimensional regularization the quantity $\langle0_M|T_{ik}|0_M\rangle$ is demonstrated to be proportional to the metric tensor of space–time $g_{ik}$. This means that the subtraction of the quantity (4.172) performed above is equivalent to the renormalization of the cosmological constant in the effective action of the gravitational field [53,28]. The same interpretation is valid also when one subtracts the contribution of free Minkowski space in the problem with material boundaries (if the renormalized value of the cosmological constant is put equal to zero).

Up to this point we have considered the closed Friedmann model, which is nonstationary. If we consider Eq. (4.165) with $a'(\eta) = 0$ we obtain the metric of the static Einstein model $R^1\times S^3$. In that case $s_\lambda(\eta) = s_\lambda(\eta_0) = 0$, as is evident from Eq. (4.171), and the total vacuum energy density is given by the Casimir term $E^{(0)}$ from Eq. (4.174) with $a = \mathrm{const}$. It follows from (4.169), (4.171) that $s_\lambda(\eta) = s_\lambda(\eta_0) = 0$ for a massless field ($m = 0$) even if $a'(\eta)\ne0$, i.e. if the metric is nonstationary. In that case, however, the total vacuum energy density does not reduce to the static-like Casimir term $E^{(0)}$ only. The point to note is that the second term on the right hand side of Eq. (4.173) is the subject of two additional renormalizations in accordance with the general structure of infinities of Quantum Field Theory in curved space–time [53,54,28].
This is the renormalization of the gravitational constant and of the constants multiplying the terms that are quadratic in the curvature in the effective gravitational action. Both these renormalizations are accidentally finite for the conformal scalar field in isotropic homogeneous space. As a result of these renormalizations the total vacuum energy density of a massless scalar field in the closed Friedmann model takes the form

$$\varepsilon^{(0)}_{\rm ren}(\eta) = E_0^{(0)}(\eta) + \frac{\hbar c}{960\pi^2a^4(\eta)}\left(2b''b - b'^2 - 2b^4\right), \tag{4.175}$$

where $b = b(\eta)\equiv a'(\eta)/a(\eta)$. The Casimir energy density of a massless field which appears in this expression is obtained from Eq. (4.174) for both constant and variable $a$ [15,17] as

$$E_0^{(0)}(\eta) = \frac{\hbar c}{2\pi^2a^4(\eta)}\int_0^\infty\frac{d\lambda\,\lambda^3}{e^{2\pi\lambda}-1} = \frac{\hbar c}{480\pi^2a^4(\eta)}. \tag{4.176}$$

The second term on the right hand side of Eq. (4.175) may be interpreted in two ways. On the one hand, it can be viewed as a part of the dynamical Casimir effect, as it disappears when $a'(\eta)\to0$. Closer examination of this term makes the second interpretation more reasonable, according to which it is the usual vacuum polarization by the external gravitational field, having no connection with the periodicity conditions. Indeed, this term, which leads to the conformal anomaly, is present in the open Friedmann model also, where the periodic boundary conditions (and therefore the Casimir effect) are absent.

In the case of massive fields Eq. (4.174) can be integrated analytically in the limit $mca(\eta)/\hbar\gg1$ both for constant and variable $a$, with the result [17]

$$E^{(0)}(\eta) \approx \frac{(mca(\eta)/\hbar)^{5/2}\,\hbar c}{8\pi^3a^4(\eta)}\,e^{-2\pi mca(\eta)/\hbar}. \tag{4.177}$$
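The numbers in (4.174), (4.176) and (4.177) lend themselves to a quick numerical cross-check. The sketch below evaluates the dimensionless integral of (4.174) by Simpson quadrature after the substitution $\lambda = M + u^2$ with $M = mca/\hbar$ (a substitution chosen here only to remove the square-root endpoint behavior), compares the massless value with $1/240$ and the large-mass value with the asymptotic form, and also evaluates the analogous massless spinor integral that is quoted later in Eq. (4.181). As an independent check of the Abel–Plana subtraction, the mode sum $\sum\lambda^3$ is regularized with an exponential damping function, which is one possible choice and an assumption of this sketch. All names are illustrative.

```python
import math

def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def scalar_integral(M, u_max=4.0):
    """int_M^inf lam^2 sqrt(lam^2 - M^2) / (exp(2 pi lam) - 1) dlam, using lam = M + u^2."""
    def integrand(u):
        lam = M + u * u
        if lam == 0.0:
            return 0.0
        root = u * math.sqrt(2.0 * M + u * u)          # sqrt(lam^2 - M^2)
        return lam * lam * root / math.expm1(2.0 * math.pi * lam) * 2.0 * u
    return simpson(integrand, 0.0, u_max)

# Massless scalar field: expected 1/240, so that E_0 = hbar c / (480 pi^2 a^4).
scalar_massless = scalar_integral(0.0)

# Large mass: ratio to the asymptotic form M^{5/2} exp(-2 pi M) / (4 pi) behind (4.177).
M_large = 20.0
asymptotic = M_large ** 2.5 * math.exp(-2.0 * math.pi * M_large) / (4.0 * math.pi)
ratio = scalar_integral(M_large) / asymptotic

# Massless spinor field, Eq. (4.181): expected 17/1920.
spinor_massless = simpson(
    lambda lam: lam * (lam * lam + 0.25) / (math.exp(2.0 * math.pi * lam) + 1.0),
    0.0, 16.0)

# Mode sum minus integral with damping exp(-delta*lam): tends to 1/120 = 2 * (1/240).
delta = 0.02
mode_sum = math.fsum(n ** 3 * math.exp(-delta * n) for n in range(1, 5000)) - 6.0 / delta ** 4
```

The ratio to the asymptotic form approaches unity from above as $mca/\hbar$ grows; at the value used here it differs from one by a few percent, as expected for corrections of relative order $\hbar/(mca)$.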

It is seen that the Casimir energy density of a massive field on $S^3$ is exponentially small, which is not typical for configurations with nonzero curvature. For the nonstationary metric and a massive field the quantity $s_\lambda(\eta)$ from Eqs. (4.171), (4.173) is not equal to zero except at the initial moment. As a consequence, the total vacuum energy density consists of the Casimir contribution (4.174), the vacuum polarization contribution given by the second term on the right hand side of Eq. (4.175) (it is the same for both massless and massive fields), and the contribution of particles created from vacuum by the nonstationary gravitational field. The latter effect owes its origin to the variable gravitational field, not to the periodicity conditions due to non-Euclidean topology. It exists in the open and quasi-Euclidean models, which are topologically trivial. In this regard cosmological particle production is different from the production of photons in the dynamical Casimir effect with boundaries. There the material boundaries themselves play the role of a nonstationary external field, so that without boundaries the effect is absent. Because of this, we do not discuss the effect of particle production here in further detail (see, e.g., [53,54]).

All the above considerations on the Casimir energy density of a scalar field in the closed Friedmann model can be extended to the case of a quantized spinor field. The details of the quantization procedure can be found in [24,25]. Here we dwell on the results only. After the calculation of the vacuum energy density and subtraction of the tangential Minkowski space contribution, the result similar to Eq. (4.173) is

$$\varepsilon^{(1/2)}_{\rm ren}(\eta) = E^{(1/2)}(\eta) + \frac{2\hbar c}{\pi^2a^4(\eta)}\sum_{\lambda=3/2}^{\infty}\left(\lambda^2-\frac14\right)\omega_\lambda(\eta)s_\lambda(\eta), \tag{4.178}$$



where $s_\lambda(\eta)$ is expressed in terms of the solution to the oscillatory equation with a complex frequency obtained from the Dirac equation after the separation of variables. The Casimir contribution on the right hand side is [53]

$$E^{(1/2)}(\eta) = \frac{2\hbar c}{\pi^2a^4(\eta)}\int_{mca(\eta)/\hbar}^{\infty}\frac{d\lambda}{e^{2\pi\lambda}+1}\left(\lambda^2+\frac14\right)\left(\lambda^2-\frac{m^2c^2a^2(\eta)}{\hbar^2}\right)^{1/2}. \tag{4.179}$$

To obtain (4.179) the Abel–Plana formula (2.26) for the summation over the half-integers was used. In the static Einstein model $s_\lambda(\eta) = 0$ once more, and the total vacuum energy density is given by the Casimir term $E^{(1/2)}(\eta)$ from Eq. (4.179) with $a = \mathrm{const}$. In the nonstationary case $a'(\eta)\ne0$ the two additional renormalizations mentioned above are needed to obtain the total physical energy density of a vacuum (note that for the spinor field the sum in Eq. (4.178) is divergent). The result in the massless case is given by

$$\varepsilon^{(1/2)}_{\rm ren}(\eta) = E_0^{(1/2)}(\eta) + \frac{\hbar c}{480\pi^2a^4(\eta)}\left(6b''b - 3b'^2 - \frac72\,b^4 + 5b^2\right), \tag{4.180}$$

where the Casimir energy density of a massless spinor field is

$$E_0^{(1/2)}(\eta) = \frac{2\hbar c}{\pi^2a^4(\eta)}\int_0^\infty\frac{\lambda\,d\lambda}{e^{2\pi\lambda}+1}\left(\lambda^2+\frac14\right) = \frac{17\hbar c}{960\pi^2a^4(\eta)}. \tag{4.181}$$

The second contribution on the right hand side of (4.180), as well as in (4.175) for the scalar case, is interpreted as a vacuum polarization by the nonstationary gravitational field. In the case of a massive field, the effect of fermion pair creation in vacuum by the gravitational field is possible [53,54]. It has no direct connection, however, to the periodicity conditions and the Casimir effect. As to the Casimir energy density of a massive spinor field, it decreases exponentially for $mca(\eta)/\hbar\gg1$, and $E^{(1/2)}(\eta)\approx4E^{(0)}(\eta)$, where the scalar result is given by Eq. (4.177).

Finally, for the quantized electromagnetic field the Casimir energy density in the closed Friedmann model is

$$E_0^{(1)}(\eta) = \frac{11\hbar c}{240\pi^2a^4(\eta)}. \tag{4.182}$$

Notice that although the Casimir energy densities (4.176), (4.181), (4.182) are very small at the present stage of the evolution of the Universe, their existence provides an important means to determine the global topological structure of space–time from the results of local measurements.

Theoretically, different topological structures of space–time on cosmological scales are possible. It is well known that one and the same metric, which is a solution to the Einstein equations, may correspond to different spatial topologies. For example, in [177] the quasi-Euclidean space–time was considered with a 3-torus topology of space. The latter means that the points $x+kL$, $y+mL$, $z+nL$, where $k$, $m$, $n$ are integers taking both negative and positive values, are identified, with a characteristic identification scale $L$. As was shown in [177], the Casimir energy density in such a model can drive the inflation process. A number of multi-connected cosmological models have been investigated in the literature (see the review [178]). If the identification scale is smaller than the horizon, some observational effects of nontrivial topology are possible. This was first discussed in [179,180].
In particular, due to nontrivial topologies, multiple images of a given cosmic source may exist (for some specific cosmological models a detailed analysis of this possibility was performed in [181]). The observable effects of the various topological properties of space–time in the various cosmological models make the calculation of the Casimir energy densities important (see, e.g., the monograph [31] and the review [182], where it was calculated in spherical and cylindrical Universes of different dimensionality).

4.6.2. Vacuum interaction between cosmic strings

Cosmic strings are classical objects which may have been produced in the early universe as byproducts of cosmological phase transitions. They are interesting due to their cosmological implications, such as their influence on primordial density fluctuations [183]. Straight cosmic strings are solutions to the Einstein equations whose metric is flat everywhere except for the string axis, where it is conical, i.e., it has an angle deficit. As a consequence, in classical theory there are gravitational forces between test bodies and cosmic strings, but there is no interaction between the strings themselves. The interaction of a matter field with a string can be described by boundary conditions requiring the field to be periodic after rounding the string with an angle of $2\pi\alpha$:

$$\Phi(t,z,r,\varphi+2\pi\alpha) = \Phi(t,z,r,\varphi), \tag{4.183}$$

where $1\ge\alpha>0$ and $(r,\varphi)$ are polar coordinates in the $(x,y)$-plane. In this way we have a situation analogous to the Casimir effect for plane conductors: there is no interaction between the classical objects, and there are boundary conditions on the matter fields. Indeed, as one might expect, the ground state energy appearing from the quantization of these fields may result in vacuum polarization and in a Casimir force between two (or more) cosmic strings.

Vacuum polarization in the background of one string was first considered in [184], and later in more general space–times, containing for example conical flux tube singularities, in [185,186]. For purely dimensional reasons, the local energy density $\langle0|T_{00}(x)|0\rangle$ has a singularity proportional to $r^{-4}$ near the string, where $r$ is the distance from the string. Therefore, the global vacuum energy is not integrable near $r = 0$. One way to avoid this problem is to consider a string of finite thickness. Here a complete analysis is still missing. A calculation for small angle deficit showed a vanishing vacuum energy in the first nontrivial order [187].

The situation is different if one considers two (or more) cosmic strings. Here one would expect a force acting between them due to the vacuum fluctuations, in analogy with the force between conducting planes, whereby there should be no distance dependent singularities. The corresponding calculation was done in [188,189] for a scalar field, in [190] for a spinor field and in [191] for the electromagnetic field. Let us note that the same holds for magnetic strings [192], and can easily be generalized to cosmic strings carrying magnetic flux. From the technical point of view, however, the problem with two or more strings is complicated because in that background the variables do not separate, and at present it is not known how to calculate the vacuum energy in such a situation. So one is left with perturbative methods. Fortunately, for cosmic strings this is reasonable because the corresponding couplings are small, see below.
The background containing parallel cosmic strings is given by the interval [193]

$$ds^2 = c^2dt^2 - dz^2 - e^{-2\Lambda(x_\perp)}\left(dx^2+dy^2\right) \tag{4.184}$$

with $\Lambda(x_\perp) = \sum_k4\lambda_k\ln(|x_\perp-a_k|/\rho_0)$, where $x_\perp = (x,y)$ is a vector in the $(x,y)$-plane, $a_k$ are the positions of the strings in that plane, and $\rho_0$ is the unit of length. The coupling to the background, $\lambda_k = G\mu_k/c^2$, where $\mu_k$ are the linear mass densities and $G$ is the gravitational constant, is connected with the angle deficit in Eq. (4.183) by means of $\alpha_k = 1-4\lambda_k$. These couplings are small quantities of typical order $\lambda_k\sim(M_{\rm GUT}/m_{\rm Pl})^2\sim10^{-6}$.

In order to perform the calculations it is useful to start from the local energy density, from Eq. (4.164) for example, and then consider the energy per unit length

$$E_0 = \int dx_\perp\,\sqrt{-g}\,\langle0|T_{00}(x)|0\rangle, \tag{4.185}$$

where

$$\langle0|T_{00}(x)|0\rangle = -\frac{i\hbar c}{2}\,\partial_{x_\rho}\partial_{y_\rho}D(x,y)\Big|_{y=x} \tag{4.186}$$

is the vacuum expectation value of the (00)-component of the energy-momentum tensor, given here for the massless real scalar field. The propagator $D(x,y)$ obeys the equation

$$\left[e^{-2\Lambda(x)}\left(c^{-2}\partial_t^2-\partial_z^2\right)-\partial_x^2-\partial_y^2\right]D(x,y) = -\delta(x,y). \tag{4.187}$$

The perturbative setup is obtained from rewriting this equation in the form

$$\Box D(x,y) = -\delta(x,y) - V(x)D(x,y), \tag{4.188}$$

where

$$V(x) = -\left(1-e^{-2\Lambda(x)}\right)\left(c^{-2}\partial_t^2-\partial_z^2\right) \tag{4.189}$$

is the perturbation. Iteration yields

$$D(x,y) = D^{(0)}(x,y) + \int dz\,D^{(0)}(x,z)V(z)D^{(0)}(z,y) + \int dz\int dz'\,D^{(0)}(x,z)V(z)D^{(0)}(z,z')V(z')D^{(0)}(z',y) + \cdots. \tag{4.190}$$
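The structure of the iteration (4.190) can be made concrete with a finite-dimensional toy model in which the propagators and the perturbation are replaced by $2\times2$ matrices and the integrals by matrix products. The matrices below are arbitrary illustrative numbers, not a discretization of the cosmic string problem; the point is only that the equation $D = D^{(0)} + D^{(0)}VD$, which follows from (4.188), is solved order by order by the series, with a truncation error that shrinks by roughly one power of the coupling per order.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def matadd(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def inverse2(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def max_abs_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

# Toy "free propagator" and weak "perturbation" (hypothetical numbers).
D0 = [[1.0, 0.3], [0.3, 0.5]]
V = [[0.02, 0.0], [0.0, -0.01]]

# Exact solution of D = D0 + D0 V D, i.e. D = (1 - D0 V)^{-1} D0.
D0V = matmul(D0, V)
one_minus = [[(1.0 if i == j else 0.0) - D0V[i][j] for j in range(2)] for i in range(2)]
exact = matmul(inverse2(one_minus), D0)

# Truncations of the series (4.190).
first_order = matadd(D0, matmul(D0V, D0))
second_order = matadd(first_order, matmul(D0V, matmul(D0V, D0)))

err1 = max_abs_diff(exact, first_order)    # of second order in V
err2 = max_abs_diff(exact, second_order)   # of third order in V
```

In the string problem the second order term is the first one that contains the product $\lambda_1\lambda_2$ and hence the interaction between two strings.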

To first order in $V(z)$ the propagator $D^{(1)}(x,y)$ can be calculated quite easily, see for example [191], with the result (for $y = x$)

$$D^{(1)}(x,x) = -\frac{i}{6\pi^2}\sum_k\frac{\lambda_k}{r_k^2}, \tag{4.191}$$

where $r_k = |x_\perp-a_k|$ and the sum goes over the strings. In the same way the vacuum expectation value of the energy-momentum tensor can be calculated, and one obtains

$$\langle0|T_{00}(x)|0\rangle = -\frac{2\hbar c}{15\pi^2}\sum_k\frac{\lambda_k}{r_k^4}$$

for the electromagnetic field. The singularity $\sim r^{-4}$ near each string is clearly observed.

In order to obtain the corresponding contribution to the interaction of two strings one has to take the second order term in (4.190). There one has to expand $V(z)$ and $V(z')$ for small $\lambda_k$ and to pick up the term proportional to $\lambda_1\lambda_2$. In this way the interaction energy and the force per unit length of two parallel strings can be obtained. They have the form

$$E_0 = -\sigma\hbar c\,\frac{\lambda_1\lambda_2}{a^2},\qquad F_0 = -2\sigma\hbar c\,\frac{\lambda_1\lambda_2}{a^3}, \tag{4.192}$$

where $a$ is the distance between the strings and $\sigma$ is a number. Eqs. (4.192) follow for dimensional reasons, and the number $\sigma$ has to be calculated. In [189] this was done for the massless scalar field with the result $\sigma = 4/(15\pi)$, in [190] for the massless spinor field with the same result, and in [191] for the electromagnetic field with the result of $\sigma$ being twice the value for the scalar field. The force acting between the strings is attractive.

We presented here the typical setup of the vacuum polarization and of the Casimir force between cosmic strings, and would like to mention that more work has been done in the past years on the vacuum polarization in space–times with conical singularities. The corresponding heat kernel coefficients have been calculated in [194–196] and further papers. Similar methods have been applied to other topological singularities, to monopoles [197] or black holes for example.

4.6.3. Kaluza–Klein compactification of extra dimensions

The previous discussion has shown that in spaces with non-Euclidean topology there is a nonzero vacuum energy density of a Casimir nature. This fact is widely used in multi-dimensional Kaluza–Klein theories, which provide the basis for the modern extended studies of unified descriptions of all the fundamental interactions, including the gravitational one. The main idea of the Kaluza–Klein approach is that the true dimensionality of space–time is $d = 4+N$, where the additional $N$ dimensions are compactified and form a compact space with geometrical size of the order of the Planck length $l_{\rm Pl} = \sqrt{G\hbar/c^3}\sim10^{-33}$ cm ($G$ is the gravitational constant). Originally, this idea arose in the 1920s through investigation of the possibility of unifying gravitation and electromagnetism. For this purpose a five-dimensional space was used with a topology $M^4\times S^1$. A review of the modern applications of Kaluza–Klein theories in the framework of supersymmetric field models can be found in [198].
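The order of magnitude quoted for the Planck length follows directly from its definition. The short sketch below evaluates $l_{\rm Pl} = \sqrt{G\hbar/c^3}$ with rounded SI values of the constants; the numerical values are standard reference values supplied here, not taken from the text.

```python
import math

# Rounded SI values of the fundamental constants (assumed inputs for this estimate).
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.0546e-34   # reduced Planck constant, J s
C = 2.998e8         # speed of light, m/s

planck_length_m = math.sqrt(G * HBAR / C ** 3)
planck_length_cm = 100.0 * planck_length_m   # about 1.6e-33 cm
```

This is the scale that sets the expected size of the compact extra dimensions in the Kaluza–Klein approach.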
According to present concepts, the most promising theory of fundamental interactions is the Superstring Theory [199]. The dimensionality of space–time in string theory is fixed as (9 + 1), which results from Lorentz invariance and unitarity. There is no way to establish a link between this theory and the real world except by suggesting that the real space–time is of the form $M^4\times K^6$, where $K^6$ is a six-dimensional compact space in the spirit of Kaluza–Klein theories. The superstring theories attract so much attention because they do not contain ultraviolet divergencies and some of them incorporate both nonabelian gauge interactions and the gravitational interaction. The important problem of string theories is to find the mechanism of dynamical compactification that is responsible for the stable existence of the six-dimensional compact manifold $K^6$. Probably the most popular mechanism of this kind is based on the Casimir effect: the Casimir energy density of different quantized fields defined on $M^4\times K^6$ is substituted into the right hand side of the multi-dimensional Einstein equations in order to look for self-consistent solutions. The values of the geometrical parameters of the compact space $K^6$ are determined in the process.



In four-dimensional space–time, a vacuum stress-energy tensor of the quantized fields is widely used in the right hand side of the classical Einstein equations [53,54]. Such equations are, in fact, approximate, being a one loop approximation in comparison to a fully quantized theory of gravitation. This imposes evident restrictions on their application range: the characteristic geometrical values of the obtained solutions must be larger than the Planck length. In cosmology the self-consistent nonsingular solutions to the Einstein equations with the energy densities (4.175) and (4.180) as the sources in the closed, open and quasi-Euclidean Friedmann models were first obtained in [200–202]. Among the obtained solutions there are the famous ones describing inflation. For the closed model, the Casimir effect makes a nonzero contribution to the vacuum energy density. However, a crucial role is played by the vacuum polarization due to the nonstationary nature of the gravitational field.

In Kaluza–Klein compactification the manifold $K$ is presumed to be stationary. Because of this, the Casimir vacuum energy density is important for the determination of the parameters of $K$ and its stability. Representing the Casimir stress-energy tensor by $T^{\rm ren}_{AB}$, the multi-dimensional Einstein equations take the form

$$R_{AB} - \frac12R_dg_{AB} + \Lambda_dg_{AB} = -\frac{8\pi G_d}{c^4}\,T^{\rm ren}_{AB}, \tag{4.193}$$

where $A, B = 0, 1, \ldots, d-1$ (we leave $d$ arbitrary, not necessarily equal to ten as in string theory), and $G_d$ and $\Lambda_d$ are the gravitational and cosmological constants in $d$ dimensions. In the literature a great number of different compact manifolds were considered and the corresponding Casimir stress-energy tensors $T_{AB}^{\rm ren}$ were calculated. In many cases self-consistent solutions to Eq. (4.193) were found. In [203], as an example, the Casimir energies for scalar and spinor fields were computed in even-dimensional Kaluza–Klein spaces of the form $M^4 \times S^{N_1} \times S^{N_2} \times \cdots$. For massless scalar and spinor fields defined on $M^4 \times S^2 \times S^2$ (a four-dimensional compact internal space) a stable self-consistent solution was found. The Casimir energies on the background $M^4 \times T^N$, where $T^N$ is the $N$-dimensional torus, are considered and reviewed in [204]. This includes the case $N = 6$, which is of interest for superstring theory. In [205,206] the Casimir energies were computed on the background $M^4 \times S^N$; self-consistent solutions to Eq. (4.193) were found and the problem of their stability was discussed. The case of $M^4 \times B$, where $B$ is the Klein bottle, was considered in [207]. Self-consistent solutions to the Einstein equations for a static space–time with spatial section $S^3 \times S^3$ at finite temperature were examined in [208]. It was shown that the Casimir stress-energy tensor of a massless Dirac field determines the self-consistent value of the sphere radii for all $T$. Many more complicated Kaluza–Klein geometries have also been studied. To take one example, in [209] the Casimir effect was investigated for a free massless scalar field defined on a space–time $R^2 \times H^{d-1}/\Gamma$, where $\Gamma$ is a torsion-free subgroup of the isometries of the $(d-1)$-dimensional Lobachevsky space $H^{d-1}$ (see also [31]). All the above research is aimed at finding the genuine structure of the internal space compatible with the experimental knowledge of high energy physics.
Unfortunately, this objective has not been met up to the present time. Because of this, there is no point in going through all the details of the numerous multi-dimensional models in which the Casimir energies are calculated and serve as a mechanism of spontaneous compactification of the extra dimensions. Instead, we give below only one example [206], based on a space–time of the form $M^4 \times S^N$, which illustrates
the main ideas in this field of research (note that the six-sphere does not satisfy the consistency requirements of string theory [210] but presents a clear typical case).

We are looking for solutions of Eq. (4.193) which are Poincaré invariant in four dimensions. This means that the metric tensor $g_{AB}$ and the Ricci tensor $R_{AB}$ have the block structure
$$ g_{AB} = \begin{pmatrix} \eta_{mn} & 0 \\ 0 & h_{ab}(u) \end{pmatrix}, \qquad R_{AB} = \begin{pmatrix} 0 & 0 \\ 0 & R_{ab}(u) \end{pmatrix}, \tag{4.194} $$
where $\eta_{mn}$ ($m, n = 0, 1, 2, 3$) is the metric tensor in Minkowski space–time $M^4$, and $h_{ab}(u)$ is the metric tensor on the manifold $S^N$ with coordinates $u$ ($a, b = 4, 5, \ldots, d-1$). It is clear that the scalar curvature $R_d$ coincides with the one calculated from the metric tensor $h_{ab}(u)$. The Casimir stress-energy tensor also has the block structure
$$ T_{mn}^{\rm ren} = T_1\, \eta_{mn}, \qquad T_{ab}^{\rm ren}(u) = T_2\, h_{ab}(u). \tag{4.195} $$
Note that $T_{1,2}$ do not depend on $u$ because the space is homogeneous. The Ricci tensor on an $N$-dimensional sphere is
$$ R_{ab}(u) = -\frac{N-1}{a^2}\, h_{ab}(u), \tag{4.196} $$
where $a$ is the sphere radius.

To find $T_1$ and $T_2$ we remind the reader that the Casimir stress-energy tensor $T_{ab}^{\rm ren}$ can be expressed in terms of the effective potential $V$ by variation with respect to the metric,
$$ T_{ab}^{\rm ren} = -\frac{2}{\sqrt{|\det h_{ab}|}}\, \frac{\delta V(h)}{\delta h_{ab}}. \tag{4.197} $$
The variation of the metric tensor $h_{ab}$ can be considered as a change in the sphere radius $a$. Multiplying both sides of (4.197) by $h_{ab}$, summing over $a$ and $b$, and integrating over the volume of $S^N$, one obtains with the use of (4.195) [206]
$$ T_2 N \Omega_N = -2\int d^N u\; h_{ab}\, \frac{\delta V(h)}{\delta h_{ab}} = -2a^2\, \frac{dV}{da^2} = -a\,\frac{dV}{da}, \tag{4.198} $$
where the volume of the sphere is
$$ \Omega_N = \int d^N u\, \sqrt{|\det h_{ab}|}. \tag{4.199} $$
To express $T_1$ in terms of the effective potential a similar trick is used. It is provisionally assumed that the Minkowski tensor is of the form $g_{mn} = \lambda^2 \eta_{mn}$, where $\lambda$ is varied. The result is [206]
$$ T_1 \Omega_N = -V. \tag{4.200} $$
Now we rewrite the Einstein equations (4.193) separately for the subspaces $M^4$ and $S^N$ using Eqs. (4.194)–(4.196) and (4.198), (4.200). The result is
$$ \frac{N(N-1)}{2a^2} + \Lambda_d = \frac{8\pi G}{c^4}\, V(a), \qquad -\frac{N-1}{a^2} + \frac{N(N-1)}{2a^2} + \Lambda_d = \frac{8\pi G}{c^4}\, \frac{a}{N}\, \frac{dV(a)}{da}. \tag{4.201} $$

Subtracting the second equation from the first, one obtains
$$ \frac{c^4 (N-1)}{8\pi G a^2} = V(a) - \frac{a}{N}\, \frac{dV(a)}{da}, \tag{4.202} $$
where the usual gravitational constant $G$ is connected with the $d$-dimensional one by the equality $G_d = \Omega_N G$. From dimensional considerations, for a massless field we have $T_{1,2} \sim a^{-d}$ in a $d$-dimensional space–time. Taking into account Eqs. (4.198), (4.200) and $\Omega_N \sim a^N$, this leads to
$$ V(a) = \frac{\hbar c\, C_N}{a^4}. \tag{4.203} $$
Here $C_N$ is a constant whose value depends on the dimensionality of the compact manifold. Substituting this into Eq. (4.202) we find the self-consistent value of the radius of the sphere,
$$ a^2 = \frac{8\pi C_N (N+4)\, G \hbar}{N(N-1)\, c^3}. \tag{4.204} $$
Then from (4.201) the cosmological constant is
$$ \Lambda_d = -\frac{N^2 (N-1)^2 (N+2)\, c^3}{16\pi C_N (N+4)^2\, G \hbar}. \tag{4.205} $$
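As a numerical cross-check of Eqs. (4.204) and (4.205), the following minimal sketch evaluates both in Planck units ($G = \hbar = c = 1$, so lengths are measured in $l_{Pl}$) and substitutes them back into the two equations (4.201) with $V = C_N/a^4$. The function names are illustrative and not part of the original text; the values of $C_N$ and $N$ passed in are arbitrary test inputs.

```python
import math


def radius_squared(c_n: float, n: int) -> float:
    """Self-consistent a^2 in units of l_Pl^2, Eq. (4.204)."""
    if n <= 1 or c_n <= 0:
        raise ValueError("Eq. (4.204) needs N > 1 and C_N > 0")
    return 8.0 * math.pi * c_n * (n + 4) / (n * (n - 1))


def cosmological_constant(c_n: float, n: int) -> float:
    """Lambda_d in units of l_Pl^-2, Eq. (4.205)."""
    return -(n**2 * (n - 1) ** 2 * (n + 2)) / (16.0 * math.pi * c_n * (n + 4) ** 2)


def residuals(c_n: float, n: int) -> tuple:
    """Left minus right side of both equations (4.201), with V = C_N / a^4."""
    a2 = radius_squared(c_n, n)
    lam = cosmological_constant(c_n, n)
    v = c_n / a2**2           # Eq. (4.203) in Planck units
    a_dv_da = -4.0 * v        # a dV/da for V ~ a^-4
    r1 = n * (n - 1) / (2.0 * a2) + lam - 8.0 * math.pi * v
    r2 = -(n - 1) / a2 + n * (n - 1) / (2.0 * a2) + lam - 8.0 * math.pi * a_dv_da / n
    return r1, r2


if __name__ == "__main__":
    for n in (2, 6, 7):
        print(n, math.sqrt(radius_squared(1.0, n)), cosmological_constant(1.0, n), residuals(1.0, n))
```

Both residuals vanish to rounding accuracy, and the cosmological constant comes out negative whenever $C_N > 0$.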

Thus, self-consistent radii are possible when $N > 1$ and $C_N > 0$. In that case the multi-dimensional cosmological constant is negative. It is seen from Eq. (4.204) that $a \sim l_{Pl}$, and the value of the coefficient in this dependence is determined by the value of $C_N$. Generally speaking, one should take into account not one field but all kinds of boson and fermion fields contributing to the Casimir energy. From this point of view the self-consistent radii are expressed by
$$ a = \left[\frac{8\pi (N+4)\, C_N}{N(N-1)}\right]^{1/2} l_{Pl}, \qquad C_N = n_B C_B^N + n_F C_F^N, \tag{4.206} $$
where $C_B^N$ and $C_F^N$ are the dimensionless constants in Eq. (4.203) written for each field separately, and $n_B$ and $n_F$ are the numbers of massless boson and fermion fields. It is important that $C_N > 1$. Otherwise the one-loop approximation, within which the self-consistent solutions are found, is not valid, and one should take into account the corrections to it due to the quantization of gravity.

To determine the values of $C_B^N$ and $C_F^N$ it is necessary to calculate the effective potential (4.203) explicitly. This was done in [206] for odd values of $N$ using the method of dimensional regularization and generalized zeta functions. In the scalar case both conformally and minimally coupled fields were considered (see Section 2.3). Some of the results obtained are presented in Table 5. It is seen from Table 5 that the values of all coefficients are rather small, being especially small for the conformally coupled scalar field. The values of $C_N$ from Eq. (4.206) are positive, e.g., for $N = 3$ or 7 regardless of the number of different fields. For $N = 5, 9$ the number of fermions and conformally coupled scalar fields should not be too large compared with the number of minimally coupled scalar fields in order to ensure the condition $C_N > 0$. In all these cases Eq. (4.206) provides the self-consistent values of the compactification radius. To reach the value $C_N \approx 1$ (which is the minimum permissible for the validity of the one-loop approximation) a large number of fields is needed, however. For example, even for $N = 11$, where the value $C_B^N \approx 133 \times 10^{-5}$ is achieved, one needs 752 scalar fields with minimal coupling to have $C_N \approx 1$. The enormous number of light matter fields required to get a reasonable size of the compactification radius (no smaller than the Planck length) is the general characteristic feature of the spontaneous compactification mechanisms based on the Casimir effect. Although such mechanisms are of great interest from a theoretical point of view, only future developments can show whether they have any relationship to the real world.

Table 5
The coefficients of the effective potential for the minimally coupled scalar field, conformally coupled scalar field, and fermion field

N     10^5 C_B^N (min.)   10^5 C_B^N (conf.)   10^5 C_F^N
3       7.56870             0.714589             19.45058
5      42.8304             -0.078571            -11.40405
7      81.5883              0.007049              5.95874
9     113.389              -0.000182             -2.99172
11    132.932              -0.000157              1.47771

5. Casimir effect for real media

In this section the Casimir effect for real media is considered, which is to say that the realistic properties of the boundary surfaces are taken into account. In the previous sections mostly highly symmetrical configurations were considered, restricted by geometrically perfect boundaries on which idealized boundary conditions were imposed. In contrast, here we concentrate on the distinguishing features of all physically realizable situations, where the impact of test conditions like nonzero temperature, surface roughness or finite conductivity of the boundary metal should be carefully taken into account in order to obtain highly accurate results. As indicated below, these conditions, separately and in combination, have a dramatic influence on the value of the Casimir force. Thus, the present section forms the basis for the experimental investigation of the Casimir effect presented in Section 6.

5.1. The Casimir effect at nonzero temperature

The Casimir effect at nonzero temperature is interesting both for the description of the present experimental situation and as an example of nonzero temperature quantum field theory. We consider the static Casimir effect.
This is a system in thermal equilibrium and can be described by the Matsubara formulation. One has to take the Euclidean version of the field theory with fields periodic (antiperiodic) in the Euclidean time variable for bosons (fermions) within the time interval $\beta = \hbar/k_B T$, where $k_B$ is the Boltzmann constant. This procedure is well known and we restrict ourselves here to some questions which are specific to the Casimir effect. Its temperature dependence was first investigated in [211,212] for conducting planes, with the result that the influence of temperature is just below what was measured in
the experiment by Sparnaay [20]. Later on the temperature dependence was intensively investigated theoretically, see for example [96,213–215]. It is still an active area of research; see, for instance, a recent investigation of the thermodynamics of the Casimir effect with different temperatures in between and outside the plates [216]. It is impossible to cover the whole area in this review, so we focus on two specific points. From the mathematical point of view, the inclusion of nonzero temperature is nothing other than the addition of another pair of boundaries, just in the time coordinate, with periodic boundary conditions (strictly speaking, this is a compactification with no boundary). The spectrum in the corresponding momentum variable is equally spaced in both cases and one problem can be transformed into the other. So, for example, the simple Casimir effect (two parallel conducting planes) can be reduced to the same Riemann zeta function (see Section 2.2, Eq. (2.39)) as appears in the black body radiation in empty space. Yet another example is the Casimir effect for a rectangular domain, say with two discrete frequencies $\pi n_x/a_x$ and $\pi n_y/a_y$, which can be expressed in terms of an Epstein zeta function in the same way as the Casimir effect for a pair of plates at nonzero temperature. Here one has, of course, to take into account that the boundary conditions are different: periodic ones in the imaginary time and Dirichlet ones in the spatial directions. Recently, the precision of the Casimir force measurements has increased to the point where there is hope of measuring the temperature effects. Therefore, they must be calculated with sufficient accuracy, see Section 5.4. Of particular interest is the force between dielectric bodies at finite temperature. This problem was first solved by Lifshitz [9]. Because of its complicated form it was reconsidered later on, see for example [217].
In view of its current importance we give below another derivation which is based on field theoretical methods, especially the representation of the ground state energy for a background depending on one coordinate developed in Section 3.1.

5.1.1. Two semispaces

We start from the representation of the free energy $F_E$ in a field theory at nonzero temperature, which at the one-loop level is given by
$$ F_E = \frac{\hbar}{2}\, {\rm Tr} \ln\left[\Box_E + V(x) + \left(\frac{mc}{\hbar}\right)^2\right], \tag{5.1} $$
where the trace and the wave operator are taken in Euclidean space. This formula can be obtained from the effective action (3.5) by Wick rotation, $x_0 \to i x_4$, with $x_4 \in [0, \beta]$ and periodic boundary conditions in $x_4$ on the field. To be specific we consider two dielectric bodies with frequency dependent permittivity $\varepsilon(\omega)$, restricted by two plane parallel surfaces $z = \pm a/2$ and separated by air with distance $a$ between them, see Fig. 9. Then Eq. (5.1) can be rewritten in the more specific form
$$ F_E = -\frac{\hbar}{2}\, \frac{\partial}{\partial s}\, \frac{1}{\beta} \sum_l \int \frac{dk_\perp}{(2\pi)^2} \sum_n \left[\varepsilon(i\omega_l)\, \frac{\omega_l^2}{c^2} + k_\perp^2 + \lambda_n\right]^{-s}, \tag{5.2} $$
where $k_\perp = (k_x, k_y)$ are the momenta in the translationally invariant directions perpendicular to the $z$-axis and $\omega_l = 2\pi l/\beta$ ($l = -\infty, \ldots, \infty$) are the Matsubara frequencies (we do not need to introduce the factor $\mu^{2s}$ into this formula because it drops out from the distance dependent contributions to the free energy like (5.5) below). Following the discussion in Section 3.1 we assume for a moment the presence of a "large box" in the $z$-direction in order to have the corresponding eigenvalues $\lambda_n$ discrete. As usual we use zeta-functional regularization with $s \to 0$ in the end, and the logarithm is restored by the derivative with respect to $s$.

Fig. 9. The configuration of two dielectric bodies separated by air.

As the next step we assume the scattering problem on the $z$-axis to be formulated in the same manner as in Section 3.1 for a background depending on one Cartesian coordinate. This is a reasonable setup for the problem under consideration. An electromagnetic wave coming from the left (or from the right) in the dielectric will be scattered on the air strip, and there will be a transmitted and a reflected wave. Note that there are no bound states because the photon cannot be confined between the two dielectric bodies. Also note that in this formulation there is no need to consider evanescent waves separately, as the asymptotic states given by the incident waves from the left and from the right are complete. Now we use again the same discussion as in Section 3.1 to pass to the representation
$$ F_E = -\frac{\hbar}{2\beta}\, \frac{\partial}{\partial s} \sum_l \int \frac{dk_\perp}{(2\pi)^2} \int_0^\infty \frac{dk}{\pi} \left[\varepsilon(i\omega_l)\, \frac{\omega_l^2}{c^2} + k_\perp^2 + k^2\right]^{-s} \frac{\partial}{\partial k}\, \delta(k, k_\perp), \tag{5.3} $$
which is analogous to Eq. (3.24). Here, however, having in mind the properties of a dielectric, we allow the scattering phase $\delta(k, k_\perp)$ to depend on $k_\perp$. In passing to Eq. (5.3) we dropped a distance independent contribution and took into account that there are no bound states. Now we turn the integration over $k$ towards the imaginary axis and obtain, in parallel to Eq. (3.25),
$$ F_E = \frac{\hbar}{2\pi\beta}\, \frac{\partial}{\partial s} \sum_l \int \frac{dk_\perp}{(2\pi)^2} \int_{\sqrt{\varepsilon(i\omega_l)\omega_l^2/c^2 + k_\perp^2}}^\infty dk \left[k^2 - \varepsilon(i\omega_l)\, \frac{\omega_l^2}{c^2} - k_\perp^2\right]^{-s} \sin(\pi s)\, \frac{\partial}{\partial k} \ln s_{11}(ik, k_\perp). \tag{5.4} $$
To proceed we assume that the integral over $k$ converges for $s = 0$, i.e., with the regularization removed. This is justified for the problem we are interested in, namely the force between two separated bodies, where a divergent distance independent contribution can be dropped. In the limit $s \to 0$ the derivative with respect to $s$ can be carried out by means of $\partial_s [s f(s)]|_{s=0} = f(0)$ for any function $f(s)$ regular at $s = 0$. Then the integral over $k$ is over a total derivative and can be carried out with the result
$$ F_E = -\frac{\hbar}{2\beta} \sum_l \int \frac{dk_\perp}{(2\pi)^2}\, \ln s_{11}\!\left(i\sqrt{\varepsilon(i\omega_l)\, \frac{\omega_l^2}{c^2} + k_\perp^2},\; k_\perp\right). \tag{5.5} $$

This is the generalization of Eq. (3.25) to nonzero temperature. For $T \to 0$, in the sense that $k_B T \ll \hbar c/a$, the sum over $l$ turns into an integral by means of $(1/\beta) \sum_l \to \int d\omega/2\pi$ with $\omega_l \to \omega$, and after a change of variables to $\omega^2/c^2 + k_\perp^2 = k^2$ Eq. (3.25) is reobtained.

So it remains to obtain an expression for the scattering coefficient $s_{11}(ik, k_\perp)$ for the problem with two dielectric bodies as shown in Fig. 9. In fact, this scattering problem for an electromagnetic wave is quite simple. We consider the electric field strength $E(t, x)$ subject to the Maxwell equations
$$ \left(\frac{\varepsilon}{c^2}\, \frac{\partial^2}{\partial t^2} - \Delta\right) E(t, x) = 0, \qquad \nabla \cdot E(t, x) = 0. \tag{5.6} $$
We turn to Fourier space in the translationally invariant variables $t$ and $x_\perp$ by means of
$$ E(t, x) = e^{i\omega t - i k_\perp x_\perp}\, E(\omega, k_\perp; z) \tag{5.7} $$
and the equations become
$$ \left(-\varepsilon\, \frac{\omega^2}{c^2} + k_\perp^2 - \frac{\partial^2}{\partial z^2}\right) E(\omega, k_\perp; z) = 0, \qquad i k_\perp E_\perp(\omega, k_\perp; z) - \frac{\partial}{\partial z} E_z(\omega, k_\perp; z) = 0. \tag{5.8} $$
Now we allow the permittivity $\varepsilon$ to be a function of the frequency $\omega$. Although the formalism described here works also in the case of an $\varepsilon(\omega, z)$ which is a general function of $z$, we consider the specific form
$$ \varepsilon(\omega, z) = \begin{cases} 1 & \text{for } -\dfrac{a}{2} \le z \le \dfrac{a}{2}, \\[2mm] \varepsilon(\omega) & \text{for } \dfrac{a}{2} \le |z| \end{cases} \tag{5.9} $$
for the dependence of $\varepsilon$ on $z$ in accordance with Fig. 9, where $\varepsilon(\omega)$ is assumed to be a function of the frequency $\omega$ with the necessary analytic properties.

As a next step we have to separate the polarizations. This can be done using the standard polarization vectors known from the transverse photons in Coulomb gauge. We introduce the decomposition
$$ E(\omega, k_\perp; z) = N_1 \begin{pmatrix} -i k_1 \dfrac{\partial}{\partial z} \\[2mm] -i k_2 \dfrac{\partial}{\partial z} \\[2mm] k_1^2 + k_2^2 \end{pmatrix} e_1(k_\perp; z) + N_2 \begin{pmatrix} -i k_2 \\ i k_1 \\ 0 \end{pmatrix} e_2(k_\perp; z) \tag{5.10} $$
so that the second equation of (5.8) is satisfied ($N_{1,2}$ are the normalization factors).


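The matching conditions on the surface of the dielectric, demanding that $E_\perp(\omega, k_\perp; z)$ and $\varepsilon(\omega, z) E_z(\omega, k_\perp; z)$ be continuous, are satisfied if
$$ \varepsilon(\omega, z)\, e_1(k_\perp; z), \qquad \frac{\partial}{\partial z}\, e_1(k_\perp; z) \tag{5.11} $$
and
$$ e_2(k_\perp; z), \qquad \frac{\partial}{\partial z}\, e_2(k_\perp; z) \tag{5.12} $$
are continuous at $z = \pm a/2$. Note that the second condition differs from the first one simply by the absence of $\varepsilon$. By the first equation of (5.8) and the matching conditions (5.11), (5.12) we have for the functions $e_1(k_\perp; z)$ and $e_2(k_\perp; z)$ a one-dimensional scattering problem. So we can find a solution of the form
$$ e_i(k_\perp; z) = \begin{cases} e^{ikz} + s_{12}\, e^{-ikz}, & z \le -\dfrac{a}{2}, \\[2mm] \alpha\, e^{iqz} + \beta\, e^{-iqz}, & -\dfrac{a}{2} \le z \le \dfrac{a}{2}, \\[2mm] s_{11}\, e^{ikz}, & \dfrac{a}{2} \le z \end{cases} \tag{5.13} $$
for each of the polarizations $i = 1, 2$ (within Eqs. (5.13)–(5.15) the symbol $\beta$ denotes a wave amplitude, not the inverse temperature). From Eq. (5.8) it follows that
$$ -\varepsilon(\omega)\, \frac{\omega^2}{c^2} + k_\perp^2 + k^2 = 0 \qquad \text{and} \qquad -\frac{\omega^2}{c^2} + k_\perp^2 + q^2 = 0. \tag{5.14} $$
The coefficients can be determined from the matching conditions. Consider first the polarization with $i = 1$. At $z = a/2$ we have from Eq. (5.11)
$$ \alpha\, e^{iqa/2} + \beta\, e^{-iqa/2} = \varepsilon(\omega)\, s_{11}\, e^{ika/2}, \qquad iq\left(\alpha\, e^{iqa/2} - \beta\, e^{-iqa/2}\right) = ik\, s_{11}\, e^{ika/2}, $$
from which
$$ \alpha = \frac{s_{11}}{2} \left[\varepsilon(\omega) + \frac{k}{q}\right] e^{i(k-q)a/2}, \qquad \beta = \frac{s_{11}}{2} \left[\varepsilon(\omega) - \frac{k}{q}\right] e^{i(k+q)a/2} \tag{5.15} $$
follows. At $z = -a/2$ we have
$$ \varepsilon(\omega) \left(e^{-ika/2} + s_{12}\, e^{ika/2}\right) = \alpha\, e^{-iqa/2} + \beta\, e^{iqa/2}, \qquad ik \left(e^{-ika/2} - s_{12}\, e^{ika/2}\right) = iq \left(\alpha\, e^{-iqa/2} - \beta\, e^{iqa/2}\right). $$
Adding the first equation divided by $\varepsilon(\omega)$ to the second one divided by $ik$ we eliminate $s_{12}$. After inserting $\alpha$ and $\beta$ from Eq. (5.15) we arrive at
$$ s_{11}(k, k_\perp) = \frac{4\varepsilon(\omega)\, k q\, e^{-ika}}{[\varepsilon(\omega) q + k]^2 e^{-iqa} - [\varepsilon(\omega) q - k]^2 e^{iqa}}. \tag{5.16} $$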
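The algebra leading to Eq. (5.16) can be checked numerically. The sketch below (not part of the original derivation) writes the four matching conditions of the $i = 1$ polarization at $z = \pm a/2$ as a linear system for the unknowns $s_{12}$, $\alpha$, $\beta$, $s_{11}$, solves it with a small hand-written Gaussian elimination, and compares $s_{11}$ with the closed form (5.16). The function names and the sample values of $\varepsilon$, $k$, $q$, $a$ are illustrative assumptions; $k$ and $q$ are treated as independent inputs, so the dispersion relations (5.14) are not imposed.

```python
import cmath


def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small complex system."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    x = [0j] * n
    for row in range(n - 1, -1, -1):
        s = sum(aug[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (aug[row][n] - s) / aug[row][row]
    return x


def s11_from_matching(eps, k, q, a):
    """Solve the matching conditions for (s12, alpha, beta, s11); return s11."""
    ek = cmath.exp(0.5j * k * a)   # e^{ika/2}
    eq = cmath.exp(0.5j * q * a)   # e^{iqa/2}
    matrix = [
        [eps * ek, -1 / eq, -eq, 0],        # continuity of eps*e1 at z = -a/2
        [-k * ek, -q / eq, q * eq, 0],      # continuity of de1/dz at z = -a/2
        [0, eq, 1 / eq, -eps * ek],         # continuity of eps*e1 at z = +a/2
        [0, q * eq, -q / eq, -k * ek],      # continuity of de1/dz at z = +a/2
    ]
    rhs = [-eps / ek, -k / ek, 0, 0]
    return solve_linear(matrix, rhs)[3]


def s11_closed_form(eps, k, q, a):
    """Closed form of Eq. (5.16)."""
    num = 4 * eps * k * q * cmath.exp(-1j * k * a)
    den = (eps * q + k) ** 2 * cmath.exp(-1j * q * a) - (eps * q - k) ** 2 * cmath.exp(1j * q * a)
    return num / den


if __name__ == "__main__":
    args = (3.0, 1.3, 0.7, 2.0)
    print(s11_from_matching(*args), s11_closed_form(*args))
```

For $\varepsilon = 1$ and $k = q$ there is no interface at all, and both routes give $s_{11} = 1$, as expected for free propagation.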

For representation (5.5) of the free energy we need $s_{11}$ on the imaginary $k$-axis,
$$ s_{11}(ik, k_\perp) = \frac{4\varepsilon(i\xi)\, k q\, e^{ka}}{[\varepsilon(i\xi) q + k]^2 e^{qa} - [\varepsilon(i\xi) q - k]^2 e^{-qa}}, \tag{5.17} $$
where we have substituted $k \to ik$, $q \to iq$ and $\omega \to i\xi$, so that these quantities are now related by
$$ \varepsilon(i\xi)\, \frac{\xi^2}{c^2} + k_\perp^2 - k^2 = 0 \qquad \text{and} \qquad \frac{\xi^2}{c^2} + k_\perp^2 - q^2 = 0. $$
For the second polarization the same formulas hold with $\varepsilon = 1$ wherever it explicitly appears in Eqs. (5.16) and (5.17).

By means of Eq. (5.16) we have the coefficient $s_{11}$ of the scattering problem ($s_{12}$ can be obtained easily). It has all the usually expected analytic properties. For instance, it is an analytic function in the upper half-plane of $k$. There are no poles on the positive imaginary axis, in accordance with the absence of bound states in this problem. The coefficient $s_{11}(k, k_\perp)$ has poles in the lower half-plane which correspond to resonance states of the photon between the two dielectric bodies.

Next we insert $s_{11}(ik, k_\perp)$, Eq. (5.17), into the free energy, Eq. (5.5). As previously, we rewrite it in the form
$$ s_{11}(ik, k_\perp) = \left\{\left[\frac{\varepsilon(i\xi) q + k}{\varepsilon(i\xi) q - k}\right]^2 - e^{-2aq}\right\}^{-1} \frac{4\varepsilon(i\xi)\, k q}{[\varepsilon(i\xi) q - k]^2}\; e^{(k-q)a}. \tag{5.18} $$
Now we remark that the second factor is distance independent and that the third factor delivers a contribution linear in the distance $a$ between the dielectrics. Both deliver contributions to the energy which are not relevant for the force $F_{ss}^T = -\partial F_E/\partial a$, and we drop them. So we are left with the first factor on the right-hand side of Eq. (5.18). Here we note that it provides a convergent contribution to the free energy, Eq. (5.4), so that the transition to Eq. (5.5) is justified. Inserting this first factor into Eq. (5.5) and taking the derivative with respect to $a$ we arrive at
$$ F_{ss}^T = -\frac{\hbar}{\beta} \sum_l \int \frac{dk_\perp}{(2\pi)^2}\; q_l \left\{ \left[\left(\frac{\varepsilon(i\xi_l) q_l + k_l}{\varepsilon(i\xi_l) q_l - k_l}\right)^2 e^{2a q_l} - 1\right]^{-1} + \left[\left(\frac{q_l + k_l}{q_l - k_l}\right)^2 e^{2a q_l} - 1\right]^{-1} \right\} \tag{5.19} $$
with $q_l = \sqrt{\xi_l^2/c^2 + k_\perp^2}$, $k_l = \sqrt{\varepsilon(i\xi_l)\, \xi_l^2/c^2 + k_\perp^2}$ and $\xi_l = 2\pi l/\beta$.

Introducing a new variable $p$ according to
$$ k_\perp^2 = \frac{\xi_l^2}{c^2}\, (p^2 - 1), \tag{5.20} $$


we rewrite Eq. (5.19) in the original Lifshitz form [9,105]
$$ F_{ss}^T(a) = -\frac{k_B T}{\pi c^3} \sum_{l=0}^{\infty}{}' \, \xi_l^3 \int_1^\infty p^2\, dp \left\{ \left[\left(\frac{K(i\xi_l) + \varepsilon(i\xi_l) p}{K(i\xi_l) - \varepsilon(i\xi_l) p}\right)^2 e^{2a(\xi_l/c)p} - 1\right]^{-1} + \left[\left(\frac{K(i\xi_l) + p}{K(i\xi_l) - p}\right)^2 e^{2a(\xi_l/c)p} - 1\right]^{-1} \right\}, \tag{5.21} $$
where the notation
$$ K(i\xi_l) \equiv [p^2 - 1 + \varepsilon(i\xi_l)]^{1/2} \tag{5.22} $$
is introduced in analogy with the one from Eq. (4.23), and the prime on the summation sign means that the zeroth term is taken with the coefficient 1/2. As was already mentioned above, in the limit of low temperatures $k_B T \ll \hbar c/a$ the summation over $l$ in Eq. (5.21) can be replaced by integration with $dl = \hbar\, d\xi/(2\pi k_B T)$. As a result we return to Eq. (4.25), which is the Casimir force between two semispaces at zero temperature.

Note that the representation of Eq. (5.21) for the Casimir force at nonzero temperature has a disadvantage, as the $l = 0$ term in it is the product of zero and a divergent integral. This is usually eliminated [105] by one more change of variables, $z = 2a\xi_l p/c$. Both this change and also (5.20) are, however, singular at $l = 0$. Because of this, the representation of Eq. (5.19) is preferable to (5.21) and to other representations obtained from it by a change of variables that is singular at $l = 0$ (in Section 5.4.2 the additional difficulties connected with the zeroth term of the Lifshitz formula in application to real metals are discussed).

Now let us apply Eq. (5.19) to ideal metals of infinitely high conductivity in order to find the temperature correction to the Casimir force $F_{ss}^{(0)}(a)$ between perfect conductors (see, e.g., Eq. (1.3)). To do this we use the prescription by Schwinger, DeRaad and Milton that the limit $\varepsilon \to \infty$ should be taken before setting $l = 0$ [214]. Then, introducing a new variable $y = 2a q_l$ in Eq. (5.19) instead of $|k_\perp|$ (note that $y$ is regular at any $l$), we arrive at
$$ F_{ss}^T(a) = -\frac{k_B T}{4\pi a^3} \sum_{l=0}^{\infty}{}' \int_{2a\xi_l/c}^\infty \frac{y^2\, dy}{e^y - 1}. \tag{5.23} $$
This expression can be put in the form [213,214]
$$ F_{ss}^T(a) = F_{ss}^{(0)}(a) \left\{ 1 + \frac{30}{\pi^4} \sum_{n=1}^\infty \left[ \left(\frac{T}{T_{\rm eff}}\right)^4 \frac{1}{n^4} - \pi^3\, \frac{T}{T_{\rm eff}}\, \frac{1}{n}\, \frac{\cosh(\pi n T_{\rm eff}/T)}{\sinh^3(\pi n T_{\rm eff}/T)} \right] \right\}, \tag{5.24} $$
where the effective temperature is defined by $k_B T_{\rm eff} = \hbar c/(2a)$. Note that the quantity in square brackets is always positive. At low temperatures $T \ll T_{\rm eff}$ it follows from (5.24) that
$$ F_{ss}^T(a) \approx F_{ss}^{(0)}(a) \left[ 1 + \frac{1}{3} \left(\frac{T}{T_{\rm eff}}\right)^4 \right]. \tag{5.25} $$

In the high temperature limit $T \gg T_{\rm eff}$
$$ F_{ss}^T(a) \approx -\frac{k_B T}{4\pi a^3}\, \zeta_R(3). \tag{5.26} $$
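The two limits (5.25) and (5.26) can be recovered directly from the Matsubara sum (5.23). The sketch below is an illustration under stated assumptions, not the authors' code: it writes the lower integration limit as $2a\xi_l/c = 2\pi l\, T/T_{\rm eff}$, takes the zero-temperature force as $F_{ss}^{(0)} = -\pi^2 \hbar c/(240 a^4)$ (the standard ideal-metal value, assumed to be what Eq. (1.3) states), and evaluates $\int_x^\infty y^2 dy/(e^y - 1)$ through the series $\sum_{n \ge 1} e^{-nx}(x^2/n + 2x/n^2 + 2/n^3)$. The ratio then depends only on $\tau = T/T_{\rm eff}$. The names `bose_tail` and `force_ratio` are hypothetical.

```python
import math

ZETA3 = 1.2020569031595943  # Riemann zeta(3)


def bose_tail(x: float) -> float:
    """I(x) = int_x^inf y^2 / (e^y - 1) dy, summed as a geometric-type series."""
    if x == 0.0:
        return 2.0 * ZETA3
    total, n = 0.0, 1
    while True:
        term = math.exp(-n * x) * (x * x / n + 2.0 * x / n**2 + 2.0 / n**3)
        total += term
        if term <= 1e-17 * total or n > 100000:
            break
        n += 1
    return total


def force_ratio(tau: float) -> float:
    """F_ss^T / F_ss^(0) from Eq. (5.23), with tau = T / T_eff > 0."""
    total = 0.5 * bose_tail(0.0)  # primed sum: l = 0 term with weight 1/2
    l = 1
    while True:
        term = bose_tail(2.0 * math.pi * l * tau)
        total += term
        if term <= 1e-17 * total:
            break
        l += 1
    return 30.0 * tau / math.pi**3 * total


if __name__ == "__main__":
    for tau in (0.25, 0.5, 1.0, 2.0, 4.0):
        low = 1.0 + tau**4 / 3.0                  # Eq. (5.25)
        high = 30.0 * tau * ZETA3 / math.pi**3    # Eq. (5.26) divided by F_ss^(0)
        print(tau, force_ratio(tau), low, high)
```

Printing the three columns side by side shows how quickly the exact sum approaches each asymptotic form on either side of $\tau = 1$.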

Note that the corrections to the above asymptotic results are exponentially small, as $\exp(-2\pi T_{\rm eff}/T)$ at low temperatures and as $\exp(-2\pi T/T_{\rm eff})$ at high temperatures. As a consequence, the asymptotic regime is reached even when the temperature is only two times lower (higher) than the effective temperature. In [218] analogous results are obtained for the so-called unusual pair of parallel plates, i.e., the first plate perfectly conducting and the second infinitely permeable. Other new fascinating problems arising from the study of the nonzero temperature Casimir force for real metals of finite conductivity are discussed in Sections 5.4.2 and 5.4.3.

5.1.2. A sphere (lens) above a disk

Here we deal with the nonzero temperature Casimir force for the configuration of a spherical lens (sphere) above a plate (disk), which is used in most experiments. The sphere is considered to have a radius that is large compared with its separation from the plate. As was noted in Section 4.3, in this situation the Proximity Force Theorem produces highly accurate results. To apply the Proximity Force Theorem one should first derive the expression for the free energy density in the configuration of two dielectric plates at a temperature $T$. It is obtained by integrating $-F_{ss}^T(a)$ from Eq. (5.19) with respect to $a$, or simply by substituting Eq. (5.18) into Eq. (5.5). The result is
$$ F_E(a) = \frac{k_B T}{4\pi} \sum_{l=-\infty}^{\infty} \int_0^\infty k_\perp\, dk_\perp \left\{ \ln\left[1 - \left(\frac{\varepsilon(i\xi_l) q_l - k_l}{\varepsilon(i\xi_l) q_l + k_l}\right)^2 e^{-2a q_l}\right] + \ln\left[1 - \left(\frac{q_l - k_l}{q_l + k_l}\right)^2 e^{-2a q_l}\right] \right\}. \tag{5.27} $$

Once more, the contribution to the free energy which does not depend on $a$ is omitted, so that $F_E(a) \to 0$ when $a \to \infty$. The nonzero temperature Casimir force in the configuration of a sphere (lens) above a disk (plate) is given by Eq. (4.106) as follows:
$$ F_{d\ell}^T(a) = 2\pi R\, F_E(a). \tag{5.28} $$
Let us apply results (5.27), (5.28) to the case of a disk and a sphere made of ideal metals. By the use of the same change of variable $y = 2a q_l$ as in Section 5.1.1, in the limit $\varepsilon \to \infty$ one arrives at
$$ F_{d\ell}^T(a) = \frac{k_B T R}{2a^2} \sum_{l=0}^{\infty}{}' \int_{2a\xi_l/c}^\infty y\, dy\, \ln(1 - e^{-y}). \tag{5.29} $$

Expression (5.29) is in direct analogy with Eq. (5.23) for two semispaces. It can be put in the equivalent form [213]
$$ F_{d\ell}^T(a) = F_{d\ell}^{(0)}(a) \left\{ 1 + \frac{45}{\pi^3} \sum_{n=1}^\infty \left[ \left(\frac{T}{T_{\rm eff}}\right)^3 \frac{1}{n^3} \coth\left(\pi n \frac{T_{\rm eff}}{T}\right) + \pi \left(\frac{T}{T_{\rm eff}}\right)^2 \frac{1}{n^2} \sinh^{-2}\left(\pi n \frac{T_{\rm eff}}{T}\right) \right] - \left(\frac{T}{T_{\rm eff}}\right)^4 \right\}. \tag{5.30} $$
Note that, as in (5.24), the temperature correction to the Casimir force is always positive. It is approximately 2.7% of $F_{d\ell}^{(0)}$ at $a = 1\,\mu$m. But, e.g., at $a = 6\,\mu$m the temperature correction is equal to $1.74\, F_{d\ell}^{(0)}$, i.e., it is already larger than the zero temperature force.

At low temperature $T \ll T_{\rm eff}$ the hyperbolic functions behave like exponentials of large arguments. As a result the Casimir force is approximately equal to
$$ F_{d\ell}^T(a) \approx F_{d\ell}^{(0)}(a) \left[ 1 + \frac{45 \zeta_R(3)}{\pi^3} \left(\frac{T}{T_{\rm eff}}\right)^3 - \left(\frac{T}{T_{\rm eff}}\right)^4 \right]. \tag{5.31} $$
In the opposite case of high temperatures $T \gg T_{\rm eff}$ all the terms of series (5.29) with $l \ge 1$ are exponentially small. As a result the zeroth term alone determines the value of the Casimir force [40],
$$ F_{d\ell}^T(a) \approx \frac{k_B T R}{4a^2} \int_0^\infty y\, dy\, \ln(1 - e^{-y}) = -\frac{k_B T R\, \zeta_R(3)}{4a^2}. \tag{5.32} $$
The corrections to Eq. (5.31) are exponentially small in $\exp(-2\pi T_{\rm eff}/T)$, and those to Eq. (5.32) in $\exp(-2\pi T/T_{\rm eff})$. For this reason the transition region between the two asymptotic regimes is actually very narrow.

5.1.3. The asymptotics of the Casimir force at high and low temperature

As explained at the beginning of Section 5.1, the Casimir effect at nonzero temperature can be described in quantum field theory by the Matsubara formalism. This has a number of important consequences. For example, one can easily understand that all ultraviolet divergences at finite temperature are the same as at zero temperature. Hence, the temperature dependent part is a finite expression which can be calculated without facing major problems, at least numerically. A second important consequence is that for high and low temperatures asymptotic expansions can be obtained in quite general terms, with remainders that are exponentially small. These asymptotics were first obtained in [215] and later reconsidered and generalized, see for example [219].

Let $P$ be some three-dimensional operator describing the spatial part of the considered system in the same sense as in Eq. (3.14) or (3.31), and take Eq. (3.15) as its eigenvalue problem. The free energy, as the quantity to be considered, can be written following Eqs. (5.1) and (5.2) as
$$ F_E = -\frac{\hbar}{2\beta}\, \frac{\partial}{\partial s}\, \mu^{2s} \sum_{l=-\infty}^{\infty} \sum_J \left(\frac{\omega_l^2}{c^2} + \lambda_J\right)^{-s} \tag{5.33} $$

($\omega_l = 2\pi l/\beta$, $\beta = \hbar/(k_B T)$) with $s \to 0$ in the end. In this formula, $\mu$ is the arbitrary parameter (here with the dimension of an inverse length) entering zeta-functional regularization, see Section 3.1. Using Eq. (3.54) we rewrite this expression in the form
$$ F_E = -\frac{\hbar c}{2}\, \frac{\partial}{\partial s}\, \mu^{2s} \int_0^\infty \frac{dt}{t}\, \frac{t^s}{\Gamma(s)}\, K_T(t)\, K_P(t), \tag{5.34} $$
where
$$ K_T(t) = \frac{1}{\beta c} \sum_{l=-\infty}^{\infty} e^{-t \omega_l^2/c^2} \tag{5.35} $$
is the temperature dependent heat kernel and
$$ K_P(t) = \sum_J e^{-t \lambda_J} \tag{5.36} $$
is the same heat kernel of the operator $P$ as given by Eq. (3.51).

In order to obtain the low temperature expansion it is useful to employ the Poisson formula
$$ K_T(t) = \frac{1}{\sqrt{4\pi t}} + \frac{2}{\sqrt{4\pi t}} \sum_{l=1}^\infty \exp\left(-\frac{l^2 \beta^2 c^2}{4t}\right). \tag{5.37} $$
When inserted into $F_E$, Eq. (5.34), the first term on the right-hand side of Eq. (5.37) delivers just the zero temperature ($\beta = \infty$) contribution
$$ F_0 \equiv F_E|_{T=0}, \tag{5.38} $$
where we split
$$ F_E = F_0 + F_T. \tag{5.39} $$
The second term on the right-hand side of Eq. (5.37) makes the $t$-integral in Eq. (5.34) exponentially convergent for $t \to 0$, so that there are no ultraviolet divergences. The limit $s \to 0$ can be taken trivially and we obtain
$$ F_T = -\frac{\hbar c}{2} \sum_{l=1}^\infty \int_0^\infty \frac{dt}{t}\, \frac{2}{\sqrt{4\pi t}}\; e^{-l^2 \beta^2 c^2/4t}\, K_P(t), \tag{5.40} $$
where sum and integral are absolutely convergent. Now we return to the sum representation (5.36) of $K_P(t)$. The $t$-integration in (5.40) can be done explicitly, and after that the sum over $l$ also, resulting in
$$ F_T = \frac{\hbar}{\beta} \sum_J \ln\left[1 - \exp\left(-\beta c \sqrt{\lambda_J}\right)\right]. \tag{5.41} $$
Note that this expression can be obtained by applying the Abel–Plana formula (2.25) to Eq. (5.33).

Representation (5.41) is well suited to discuss the behavior for $T \to 0$, see [215]. For a purely discrete spectrum, $F_T$ is exponentially decreasing as $\exp(-\beta c \sqrt{\lambda_0})$, where $\lambda_0$ is the smallest eigenvalue of the operator $P$. If the spectrum of $P$ is partly (or completely) continuous, going down to $\lambda = 0$, then powers of $T$ may be present. Consider as an example the Casimir effect


between conducting planes. In this case, for the temperature dependent part of the free energy per unit area we have
$$ F_T = \frac{\hbar}{\beta} \int \frac{dk_1\, dk_2}{(2\pi)^2} \sum_{n=-\infty}^{\infty} \ln\left[1 - \exp\left(-\beta c \sqrt{k_1^2 + k_2^2 + \left(\frac{\pi n}{a}\right)^2}\right)\right]. \tag{5.42} $$
The power-like contribution at $T \to 0$ comes from $n = 0$,
$$ F_T = \frac{\hbar}{\beta} \int \frac{dk_1\, dk_2}{(2\pi)^2} \ln\left[1 - \exp\left(-\beta c \sqrt{k_1^2 + k_2^2}\right)\right] + O(e^{-\beta c \pi/a}) = -\frac{\hbar\, \zeta_R(3)}{2\pi \beta^3 c^2} + O(e^{-\beta c \pi/a}), \tag{5.43} $$
which is just the result first found by Mehra [212]. In order to compare with (5.25) we remark that Eq. (5.42) gives the temperature contribution to the free energy confined between the two plates. The expansion of Eq. (5.43) for small $T$ gives no contribution to the force. In Eq. (5.25), however, the contribution from the exterior region is taken into account (it was derived as a special case from extended dielectric media). It is not difficult to write down, from Eq. (5.41), the temperature dependent part of the free energy per unit volume for the exterior region of one plate,
$$ f_T^{\rm ext} = \frac{\hbar}{\beta} \int \frac{dk_1\, dk_2\, dk_3}{(2\pi)^3} \ln\left[1 - \exp\left(-\beta c \sqrt{k_1^2 + k_2^2 + k_3^2}\right)\right] = -\frac{\pi^2 \hbar}{90 \beta^4 c^3}. \tag{5.44} $$
Multiplying this by 2 to account for the photon polarizations, one obtains the familiar free energy density of the black body radiation in empty space,
$$ f_{BB} = 2 f_T^{\rm ext}. \tag{5.45} $$
We now recall that at zero temperature, to obtain the renormalized energy between the plates, we subtracted the energy of the zero-point fluctuations of empty space in the same volume (the interval $a$ in this case). To obtain the renormalized free energy we do the same, i.e., we subtract from it the free energy of free space in the interval $a$ at the given temperature, with the result
$$ F_T^{\rm ren} = F_T - a f_{BB} = F_T - 2a f_T^{\rm ext}. \tag{5.46} $$
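The force that follows from Eq. (5.46) can be written out explicitly. The short worked calculation below uses only Eqs. (5.43), (5.44) and (5.46), together with $\beta = \hbar/(k_B T)$, $k_B T_{\rm eff} = \hbar c/(2a)$ and the standard ideal-metal value $F_{ss}^{(0)}(a) = -\pi^2 \hbar c/(240 a^4)$, which is assumed here to be what Eq. (1.3) states.

```latex
\begin{align*}
F_T^{\rm ren} &= -\frac{\hbar\,\zeta_R(3)}{2\pi\beta^3 c^2}
                 + \frac{\pi^2\hbar\, a}{45\,\beta^4 c^3}
                 + O\!\left(e^{-\beta c\pi/a}\right), \\
-\frac{\partial F_T^{\rm ren}}{\partial a}
              &= -\frac{\pi^2\hbar}{45\,\beta^4 c^3}
               = -\frac{\pi^2 (k_B T)^4}{45\,\hbar^3 c^3}, \\
F_{ss}^{(0)}(a)\,\frac{1}{3}\left(\frac{T}{T_{\rm eff}}\right)^4
              &= -\frac{\pi^2\hbar c}{240\,a^4}\cdot\frac{1}{3}\cdot
                 \frac{16\,a^4 (k_B T)^4}{\hbar^4 c^4}
               = -\frac{\pi^2 (k_B T)^4}{45\,\hbar^3 c^3}.
\end{align*}
```

The last two lines coincide, so the only power-like low temperature correction to the force is the $a$-independent radiation pressure term, in agreement with the second term of Eq. (5.25).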

It is evident that calculating the force in accordance with $-\partial F_T^{\rm ren}/\partial a$ gives the well known result in the second term of Eq. (5.25). Alternatively, one can say that there is equilibrium thermal radiation on the outside whose pressure is equivalent to the force which attracts the plates. From the structure of these expressions one can, for instance, conclude that $F_T \sim T^2$ for the conducting cylinder. Let us consider as one more example the exterior of a sphere or of a spherically symmetric background potential as in Section 3.1.2. The discussion leading there
from Eqs. (3.33) to (3.42) can be applied without change to FT , Eq. (5.41). Note that it is not useful to turn the integration path towards the imaginary axis, i.e., to pass to a formula in analogy with (3.43). In this case we obtain ∞ ˝ dk 9 FT = (2l + 1) (5.47) ln(1 − e−7ck ) l (k) ; 7

9 k 0 l

where $\sum_l$ is the sum over the orbital momentum and $\delta_l(k)$ are the scattering phase shifts. After the substitution $k \to k/\beta$ we see that for $\beta \to \infty$ the phase shifts can be expanded at $k = 0$ and the expansion

$$F_T = -\frac{\hbar}{\pi\beta}\sum_{n>0}\frac{\zeta_R(n+1)}{(\beta c)^n}\sum_l (2l+1)\,\delta_l^{(n)}(0) \tag{5.48}$$

emerges relating the power-like contributions for $T \to 0$ to the derivatives of the phase shifts at zero momentum. In order to obtain the high temperature expansion it is useful to separate in $K_T(t)$ (5.35) the zeroth Matsubara frequency

$$K_T(t) = \frac{1}{\beta c} + \frac{2}{\beta c}\sum_{l=1}^{\infty} e^{-t\omega_l^2/c^2} . \tag{5.49}$$

Being inserted into Eq. (5.34), the first contribution delivers by means of Eq. (3.55) the derivative of the zeta function of the operator P. Because of

$$\ln\det P = {\rm Tr}\ln P = -\frac{\partial}{\partial s}{\rm Tr}\,P^{-s}\Big|_{s=0} = -\frac{\partial}{\partial s}\zeta_P(s)\Big|_{s=0} = -\zeta_P'(0) ,$$

it is directly related to the determinant of the operator P (in three dimensions). This is an example of the well known dimensional reduction connected with the zeroth Matsubara frequency. Consider now the second contribution on the right hand side of Eq. (5.49) being inserted into Eq. (5.34). The t-integration is exponentially convergent for $t \to \infty$. Hence, we can use the heat kernel expansion (3.56) for $K_P(t)$, neglecting possible contributions which are exponentially small for $t \to 0$ and which result in contributions to $F_E$ that are exponentially small for $T \to \infty$. After that, the integration over t can be carried out and we arrive at

$$F_E = -\frac{\hbar}{2\beta}\left[\zeta_P'(0) + \zeta_P(0)\ln\mu^2\right] - \frac{\hbar}{\beta}\,\frac{\partial}{\partial s}\,\mu^{2s}\sum_{n=0,1/2,\ldots}\frac{a_n}{(4\pi)^{3/2}}\,\frac{\Gamma\left(s+n-\frac{3}{2}\right)}{\Gamma(s)}\left(\frac{2\pi}{\beta c}\right)^{3-2(s+n)}\zeta_R(2(s+n)-3) . \tag{5.50}$$
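For those values of n at which $\Gamma(s+n-\frac{3}{2})\,\zeta_R(2(s+n)-3)$ is regular at $s = 0$, the factor $1/\Gamma(s) \approx s$ means that the derivative with respect to s at $s = 0$ simply picks out the remaining factors evaluated at $s = 0$. A minimal numerical sketch for n = 0 and n = 1 follows, using the standard values $\zeta_R(-3) = 1/120$ and $\zeta_R(-1) = -1/12$; the helper name is hypothetical. It gives the numbers $\pi^2/90$ and $1/24$ that multiply $a_0/(\beta c)^4$ and $a_1/(\beta c)^2$ in the resulting high temperature expansion.

```python
import math

# Standard values of the Riemann zeta function at negative odd integers.
ZETA_R = {-3: 1.0 / 120.0, -1: -1.0 / 12.0}

def regular_term_coefficient(n):
    """Value at s = 0 of Gamma(n - 3/2) (2 pi)^(3 - 2n) zeta_R(2n - 3) / (4 pi)^(3/2)."""
    return (math.gamma(n - 1.5) * (2 * math.pi) ** (3 - 2 * n)
            * ZETA_R[2 * n - 3] / (4 * math.pi) ** 1.5)

coeff_a0 = regular_term_coefficient(0)
coeff_a1 = regular_term_coefficient(1)
print(coeff_a0, math.pi**2 / 90)   # n = 0 term
print(coeff_a1, 1 / 24)            # n = 1 term
```

The half-integer and n = 2 terms need the pole structure of the gamma and zeta functions and are not covered by this shortcut.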

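The limit $s \to 0$ can now be performed. One has to take into account that there are poles from the gamma function in the numerator for n = 1/2 and 3/2. For n = 1/2 there is a compensation by $\zeta_R(-2) = 0$. For n = 2 there is a pole from the zeta function so that for n = 3/2 and 2 there are logarithmic contributions. Finally (using (2.40)), one obtains for $T \to \infty$

$$\frac{F_E}{\hbar c} = -\frac{1}{2\beta c}\,\zeta_P'(0) - \frac{a_0}{(\beta c)^4}\,\frac{\pi^2}{90} - \frac{a_{1/2}}{(\beta c)^3}\,\frac{\zeta_R(3)}{4\pi^{3/2}} - \frac{a_1}{(\beta c)^2}\,\frac{1}{24} + \frac{a_{3/2}}{\beta c}\,\frac{\ln\beta c}{(4\pi)^{3/2}} - \frac{a_2}{16\pi^2}\left(\gamma + \ln\frac{\mu\beta c}{4\pi}\right) - \sum_{n>2}\frac{a_n}{(\beta c)^{4-2n}}\,\frac{(2\pi)^{3/2-2n}}{2\sqrt{2}}\,\Gamma\left(n-\frac{3}{2}\right)\zeta_R(2n-3) \tag{5.51}$$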

in agreement with Eq. (41) in [215] up to a temperature independent contribution due to the ultraviolet subtraction done there. The result is remarkable since the high-T expansion is expressed completely in terms of the heat kernel coefficients and the determinant of the operator P, and in this way in terms of quantities usually depending only locally on the background. On the other hand, this is just what is generally expected from a high energy expansion. Sometimes the high-T expansion is called the classical limit. For instance, the contributions from the determinant and that from $a_{3/2}$ to $F_E$ do not contain $\hbar$. Consider again the simplest example of two conducting plates. All distance dependent heat kernel coefficients are zero. The determinant of P which leads to a nonzero contribution can be obtained from the zeta function

$$\zeta_P(s) = \int\frac{dk_1\,dk_2}{(2\pi)^2}\sum_{n=1}^{\infty}\left[k_1^2+k_2^2+\left(\frac{\pi n}{a}\right)^2\right]^{-s} \tag{5.52}$$

for a real scalar field with $\zeta_P'(0) = \zeta_R(3)/(8\pi a^2)$ (see Eqs. (2.38) and (2.39)). Substituting this into the first term on the right hand side of (5.51) one obtains (5.26) after multiplication by 2 for the photon polarizations. Another easy example is a sphere with various boundary conditions. In this case the heat kernel coefficients and the determinant of P, which is the Laplace operator, are known (see [220] for Dirichlet and [221] for Robin boundary conditions, where also further references can be found).

5.2. Finite conductivity corrections

The original Casimir result (1.3) is valid only for perfect conductors of infinitely high conductivity, i.e. for planes which fully reflect electromagnetic oscillations of arbitrary frequency. In fact, for sufficiently high frequency any metal becomes transparent. This is connected with the finiteness of its conductivity. Because of this, there are finite conductivity corrections to results like (1.3) which are derived for the perfect metal. These corrections may contribute of order 10–20% of the net result at separations $a \sim 1\,\mu$m. Thus, they are very important in the modern high precision experiments on the Casimir force. In this section we present different approaches to the calculation of the finite conductivity corrections in configurations of two semispaces and a sphere (lens) above a plate.



5.2.1. Plasma model approach for two semispaces

The general expression for the Casimir and van der Waals force between two semispaces made of material with a frequency dependent dielectric permittivity $\varepsilon_2$ is given by Eq. (4.25). It is convenient now to use the notation $\varepsilon$ instead of $\varepsilon_2$, introduce the new variable $x = 2\xi pa/c$ instead of $\xi$, and change the order of integration. As a result the force equation takes the form

$$F_{ss}^C(a) = -\frac{\hbar c}{32\pi^2a^4}\int_0^\infty x^3\,dx\int_1^\infty\frac{dp}{p^2}\left\{\left[\frac{(K+\varepsilon p)^2}{(K-\varepsilon p)^2}\,e^x-1\right]^{-1}+\left[\frac{(K+p)^2}{(K-p)^2}\,e^x-1\right]^{-1}\right\} , \tag{5.53}$$

where the quantity $K = K(i\xi) = K(icx/(2pa))$ is defined by Eq. (4.23), and the index C is introduced to underline that the nonideality of the boundary metal is taken into account. It is common knowledge that the dominant contribution to the Casimir force comes from frequencies $\xi \sim c/a$. We consider the micrometer domain with a from a few tenths of a micrometer to around a hundred micrometers. Here the dominant frequencies are of visible light and infrared optics. In this domain, the free electron plasma model works well. In the framework of this model the dielectric permittivity of a metal can be represented as

$$\varepsilon(\omega) = 1-\frac{\omega_p^2}{\omega^2} , \qquad \varepsilon(i\xi) = 1+\frac{\omega_p^2}{\xi^2} , \tag{5.54}$$

where the plasma frequency $\omega_p$ is different for different metals. It is given by the formula

$$\omega_p^2 = \frac{4\pi Ne^2}{m^*} , \tag{5.55}$$

where N is the density of conduction electrons, and $m^*$ is their effective mass. Note that the plasma model does not take into account relaxation processes. The relaxation parameter, however, is much smaller than the plasma frequency. That is why relaxation can play some role only for large distances between the metal surfaces $a \gg \lambda_p = 2\pi c/\omega_p$, where the corrections to the Casimir force due to finite conductivity are very small. Let us expand the expression under the integral with respect to p in Eq. (5.53) in powers of the small parameter

$$\delta \equiv \frac{\xi}{\omega_p} = \frac{c}{2\omega_p a}\,\frac{x}{p} = \frac{\delta_0}{a}\,\frac{x}{2p} , \tag{5.56}$$

where $\delta_0 = \lambda_p/(2\pi)$ is the effective penetration depth of the electromagnetic zero-point oscillations into the metal. Note that in terms of this parameter $\varepsilon(i\xi) = 1 + 1/\delta^2$. The perturbative expansion of the Casimir force in powers of the relative penetration depth $\delta_0/a$ can then be obtained. Such a formulation of the problem was given first in [110], where the first order finite conductivity correction to the Casimir force was calculated (with an error in a numerical coefficient which was corrected in [222], see also [7,214]). In [223] the second order finite conductivity correction was found in the framework of the Leontovich impedance approach, which has seen rapid progress recently [224,225]. Here we present the perturbative results up to fourth order [39], which make it possible to take accurate account of finite conductivity corrections in a wide separation range. Comparison of the perturbation results up to the fourth order with the numerical computations using the optical tabulated data for the complex refractive index (see below Section 5.2.3) shows that perturbation theory works well for separations $a \ge \lambda_p$ (not only for $a \gg \lambda_p$ as one would expect from general considerations). After straightforward calculations one obtains the expansion of the first term in the integrand of Eq. (5.53)

$$\left[\frac{(K+\varepsilon p)^2}{(K-\varepsilon p)^2}\,e^x-1\right]^{-1} = \frac{1}{e^x-1}\left[1-\frac{4A}{p}\,\delta+\frac{8A}{p^2}(2A-1)\,\delta^2+\frac{2A}{p^3}\left(-6+32A-32A^2+2p^2-p^4\right)\delta^3+\frac{8A}{p^4}(2A-1)\left(2-16A+16A^2-2p^2+p^4\right)\delta^4+O(\delta^5)\right] , \tag{5.57}$$

where $A \equiv e^x/(e^x-1)$ and $\delta$ is defined in Eq. (5.56). In perfect analogy, the other contribution from Eq. (5.53) is

$$\left[\frac{(K+p)^2}{(K-p)^2}\,e^x-1\right]^{-1} = \frac{1}{e^x-1}\left[1-4Ap\,\delta+8A(2A-1)\,p^2\delta^2+2A\left(-5+32A-32A^2\right)p^3\delta^3+8A\left(-1+18A-48A^2+32A^3\right)p^4\delta^4+O(\delta^5)\right] \tag{5.58}$$
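Expansions (5.57) and (5.58) can be spot-checked numerically by comparing the truncated series with the exact bracketed terms of Eq. (5.53) at a small value of $\delta$. The sketch below is an illustration under stated assumptions: it takes $K = \sqrt{p^2-1+\varepsilon}$ for the quantity defined in Eq. (4.23) (the usual Lifshitz-formula form, which is not reproduced in this section), uses the plasma model $\varepsilon(i\xi) = 1 + 1/\delta^2$, and treats $\delta$ as an independent small number. The function names and sample point are arbitrary.

```python
import math

def exact_brackets(x, p, delta):
    """Exact terms of the integrand of Eq. (5.53) for the plasma model."""
    eps = 1.0 + 1.0 / delta**2
    K = math.sqrt(p * p - 1.0 + eps)   # assumed form of K from Eq. (4.23)
    with_eps = 1.0 / (((K + eps * p) / (K - eps * p)) ** 2 * math.exp(x) - 1.0)
    without_eps = 1.0 / (((K + p) / (K - p)) ** 2 * math.exp(x) - 1.0)
    return with_eps, without_eps

def series_brackets(x, p, delta):
    """Fourth-order truncations, Eqs. (5.57) and (5.58)."""
    A = math.exp(x) / (math.exp(x) - 1.0)
    pref = 1.0 / (math.exp(x) - 1.0)
    with_eps = pref * (
        1 - 4 * A / p * delta
        + 8 * A / p**2 * (2 * A - 1) * delta**2
        + 2 * A / p**3 * (-6 + 32 * A - 32 * A**2 + 2 * p**2 - p**4) * delta**3
        + 8 * A / p**4 * (2 * A - 1)
        * (2 - 16 * A + 16 * A**2 - 2 * p**2 + p**4) * delta**4)
    without_eps = pref * (
        1 - 4 * A * p * delta
        + 8 * A * (2 * A - 1) * p**2 * delta**2
        + 2 * A * (-5 + 32 * A - 32 * A**2) * p**3 * delta**3
        + 8 * A * (-1 + 18 * A - 48 * A**2 + 32 * A**3) * p**4 * delta**4)
    return with_eps, without_eps

x, p, delta = 2.0, 1.5, 0.01
exact = exact_brackets(x, p, delta)
approx = series_brackets(x, p, delta)
rel_errors = [abs(e - s) / abs(e) for e, s in zip(exact, approx)]
print(rel_errors)   # residuals should be of order delta**5
```

Repeating the comparison with a smaller $\delta$ should shrink the residuals roughly as $\delta^5$, which is a simple way to confirm that all four orders are matched.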

(Note that expression (5.58) actually does not depend on p, due to (5.56).) After substitution of (5.57) and (5.58) into (5.53), all integrals with respect to p have the form $\int_1^\infty dp\,p^{-k}$ with $k \ge 2$ and are calculated next. The integrals with respect to x have the form

$$\int_0^\infty dx\,\frac{x^n e^{mx}}{(e^x-1)^{m+1}} \tag{5.59}$$

and can be easily calculated with the help of [226]. Substituting their values into (5.53) we obtain after some transformations the Casimir force between metallic plates with finite conductivity corrections up to the fourth power in the relative penetration depth [39]

$$F_{ss}^C(a) = F_{ss}^{(0)}(a)\left[1-\frac{16}{3}\frac{\delta_0}{a}+24\,\frac{\delta_0^2}{a^2}-\frac{640}{7}\left(1-\frac{\pi^2}{210}\right)\frac{\delta_0^3}{a^3}+\frac{2800}{9}\left(1-\frac{163\pi^2}{7350}\right)\frac{\delta_0^4}{a^4}\right] , \tag{5.60}$$

where $F_{ss}^{(0)}(a)$ is defined in Eq. (4.31). The first order term of this expansion was obtained first in [110,222], and the second order one in [223]. In Section 5.2.3 the dependence of Eq. (5.60) is displayed graphically for aluminum and gold in comparison with numerical computations, demonstrating high accuracy of the perturbation result for all separations between semispaces larger than the plasma wavelength of the corresponding metal.

5.2.2. Plasma model approach for a sphere (lens) above a disk

We consider now the plasma model perturbation approach in the configuration of a sphere (lens) above a disk. The sphere (lens) radius R is supposed to be much larger than the sphere-disk separation a. Owing to this the Proximity Force Theorem is valid (see Section 4.3) and the



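Casimir force is given by Eqs. (4.106) and (4.26). Introducing once more the variable $x = 2\xi pa/c$ the following result is obtained:

$$F_{d\ell}^C(a) = \frac{\hbar cR}{16\pi a^3}\int_0^\infty x^2\,dx\int_1^\infty\frac{dp}{p^2}\left\{\ln\left[1-\frac{(K-\varepsilon p)^2}{(K+\varepsilon p)^2}\,e^{-x}\right]+\ln\left[1-\frac{(K-p)^2}{(K+p)^2}\,e^{-x}\right]\right\} . \tag{5.61}$$

Bearing in mind the need to do perturbative expansions it is convenient to perform in (5.61) an integration by parts with respect to x. The result is

$$F_{d\ell}^C(a) = -\frac{\hbar cR}{48\pi a^3}\int_0^\infty x^3\,dx\int_1^\infty\frac{dp}{p^2}\left\{\frac{(K-\varepsilon p)^2-(K+\varepsilon p)^2\,\dfrac{\partial}{\partial x}\dfrac{(K-\varepsilon p)^2}{(K+\varepsilon p)^2}}{(K+\varepsilon p)^2e^x-(K-\varepsilon p)^2}+\frac{(K-p)^2-(K+p)^2\,\dfrac{\partial}{\partial x}\dfrac{(K-p)^2}{(K+p)^2}}{(K+p)^2e^x-(K-p)^2}\right\} . \tag{5.62}$$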

The expansion of the first term under the integral in powers of the parameter $\delta$ introduced in (5.56) is

$$\frac{(K-\varepsilon p)^2-(K+\varepsilon p)^2\,\dfrac{\partial}{\partial x}\dfrac{(K-\varepsilon p)^2}{(K+\varepsilon p)^2}}{(K+\varepsilon p)^2e^x-(K-\varepsilon p)^2} = \frac{1}{e^x-1}\left\{1+\frac{4}{px}(1-Ax)\,\delta+\frac{8A}{p^2x}(-2-x+2Ax)\,\delta^2+\frac{2}{p^3x}\left[2-6p^2+3p^4+Ax\left(-6+32A-32A^2+2p^2-p^4\right)+16A(2A-1)\right]\delta^3+\frac{8A}{p^4x}\left[-8+32A-32A^2+8p^2-4p^4+x(2A-1)\left(2-16A+16A^2-2p^2+p^4\right)\right]\delta^4+O(\delta^5)\right\} . \tag{5.63}$$

In the same way for the second term under the integral of (5.62) one obtains

$$\frac{(K-p)^2-(K+p)^2\,\dfrac{\partial}{\partial x}\dfrac{(K-p)^2}{(K+p)^2}}{(K+p)^2e^x-(K-p)^2} = \frac{1}{e^x-1}\left[1+\frac{4}{x}(1-Ax)\,p\delta+\frac{8A}{x}(-2-x+2Ax)\,p^2\delta^2+\frac{2}{x}\left(-1-16A+32A^2-5Ax+32A^2x-32A^3x\right)p^3\delta^3+\frac{8A}{x}\left(-4+32A-32A^2-x+18Ax-48A^2x+32A^3x\right)p^4\delta^4+O(\delta^5)\right] . \tag{5.64}$$



Substituting Eqs. (5.63) and (5.64) into Eq. (5.62) we first calculate integrals with respect to p. All integrals with respect to x are of the form (5.59). Calculating these we come to the following result after some tedious algebra [39]:

$$F_{d\ell}^C(a) = F_{d\ell}^{(0)}(a)\left[1-4\,\frac{\delta_0}{a}+\frac{72}{5}\frac{\delta_0^2}{a^2}-\frac{320}{7}\left(1-\frac{\pi^2}{210}\right)\frac{\delta_0^3}{a^3}+\frac{400}{3}\left(1-\frac{163\pi^2}{7350}\right)\frac{\delta_0^4}{a^4}\right] , \tag{5.65}$$

where $F_{d\ell}^{(0)}(a)$ is defined in Eq. (4.108). Although results (5.60) and (5.65) for the two configurations were obtained independently, they can be related by the use of the Proximity Force Theorem. By way of example, the energy density associated with the fourth order contribution in (5.60) is

$$E_{ss}^{C,(4)}(a) = \int_a^\infty F_{ss}^{C,(4)}(a')\,da' = -\frac{5\pi^2\hbar c}{27}\left(1-\frac{163\pi^2}{7350}\right)\frac{\delta_0^4}{a^7} . \tag{5.66}$$
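The integration used in Eq. (5.66) can be applied order by order. A term of (5.60) proportional to $a^{-(4+n)}$ integrates to an energy term proportional to $a^{-(3+n)}/(3+n)$, so relative to the zeroth order each coefficient is multiplied by $3/(3+n)$ when passing from two semispaces to the lens above a disk. The sketch below checks this with exact rational arithmetic; the dictionaries hold only the rational prefactors (the factors containing $\pi^2$ are common to both expansions), and the names are illustrative.

```python
from fractions import Fraction

# Rational prefactors of (delta_0 / a)^n in Eq. (5.60), signs included.
SS = {1: Fraction(-16, 3), 2: Fraction(24), 3: Fraction(-640, 7), 4: Fraction(2800, 9)}
# Rational prefactors of (delta_0 / a)^n in Eq. (5.65), signs included.
DL = {1: Fraction(-4), 2: Fraction(72, 5), 3: Fraction(-320, 7), 4: Fraction(400, 3)}

def pft_map(ss_coefficients):
    """Apply the factor 3 / (3 + n) that arises from integrating a^-(4+n) over a."""
    return {n: c * Fraction(3, 3 + n) for n, c in ss_coefficients.items()}

mapped = pft_map(SS)
for n in sorted(mapped):
    print(n, mapped[n], DL[n], mapped[n] == DL[n])
```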

Then the fourth order contribution to the force between a disk and a lens given by

$$F_{d\ell}^{C,(4)}(a) = 2\pi R\,E_{ss}^{C,(4)}(a) = -\frac{10\pi^3\hbar cR}{27a^3}\left(1-\frac{163\pi^2}{7350}\right)\frac{\delta_0^4}{a^4} \tag{5.67}$$

is in agreement with (5.65). The other coefficients of (5.65) can be verified in the same way. Note that the linear and quadratic corrections from Eq. (5.65) for the configuration of a sphere (lens) above a disk were first obtained in [40] and [227], respectively. In Section 5.2.3 Eq. (5.65) is displayed graphically and the comparison with numerical computations is made, showing excellent agreement for all $a \ge \lambda_p$.

5.2.3. Computational results using the optical tabulated data

The plasma model representation for the dielectric permittivity (5.54) was applied above to calculate the finite conductivity corrections to the Casimir force. The obtained perturbation results are adequate in some distance range. The plasma model does not take into account, however, the absorption bands of the boundary metal and the relaxation of conduction electrons. In addition, the plasma frequency of the metal under consideration (e.g. aluminum or gold) is not known very precisely. Because of this, in [228] the Lifshitz formalism was applied numerically to different metals. For this purpose the tabulated data for the frequency dependent complex refractive index of these metals were used together with the dispersion relation to calculate the values of the dielectric permittivity on the imaginary frequency axis. Thereupon the Casimir force was calculated numerically for configurations of two plates and a spherical lens above a plate. As shown in [35] (see also [229]), the computations of [228] contain errors in the interpolation and extrapolation procedures which resulted in deviations of the obtained results from the correct values. The same computations were performed in [36] in a wider range of space separations and with account of thin layers covering the metallic surfaces. The results of [35,36] are in agreement. Let us discuss them in more detail. We begin from the force per unit area for the configuration of two semispaces or the force for a sphere (lens) above a semispace given by Eq. (4.25), and Eqs. (4.26) and (4.106). To calculate numerically the corrections to the results for ideal metal $F_{ss}^{(0)}$, $F_{d\ell}^{(0)}$ due to the finite conductivity we use the tabulated data for the complex index of refraction $n + ik$ as a function of frequency [230]. The values of the dielectric permittivity along the imaginary axis can be expressed through ${\rm Im}\,\varepsilon(\omega) = 2nk$ with the help of the dispersion relation [105]

$$\varepsilon(i\xi) = 1+\frac{2}{\pi}\int_0^\infty\frac{\omega\,{\rm Im}\,\varepsilon(\omega)}{\omega^2+\xi^2}\,d\omega . \tag{5.68}$$

All calculations were performed in [36] for Al and Au surfaces because these metals were used in the recent experiments on the Casimir force measurements (see Section 6). The complete tabulated refractive indices extending from 0.04 to 10 000 eV for Al and from 0.1 to 10 000 eV for Au from [230] are used to calculate ${\rm Im}\,\varepsilon(\omega)$. For frequencies below 0.04 eV in the case of Al and below 0.1 eV in the case of Au, the table values can be extrapolated using the Drude model, which is more exact than the plasma one because it takes relaxation into account. In this case, the dielectric permittivity along the imaginary axis is represented as

$$\varepsilon(i\xi) = 1+\frac{\omega_p^2}{\xi(\xi+\gamma)} , \tag{5.69}$$

where $\omega_p = 2\pi c/\lambda_p$ is the plasma frequency and $\gamma$ is the relaxation frequency. The values $\omega_p = 12.5$ eV and $\gamma = 0.063$ eV were used for the case of Al based on the last results in Table XI on p. 394 of [230]. Note that the Drude model was used to describe the dielectric function of a sphere which undergoes the Casimir attraction to a perfectly reflecting wall [231]. In the case of Au the analysis is not as straightforward, but proceeding in the manner outlined in [35] we obtain $\omega_p = 9.0$ eV and $\gamma = 0.035$ eV. While the values of $\omega_p$ and $\gamma$ based on the optical data of various sources might differ slightly, we have found that the resulting numerically computed Casimir forces differ by less than 1%. In fact, if for Al the values $\omega_p = 11.5$ eV and $\gamma = 0.05$ eV as in [35] are used, the differences are extremely small. Of the values tabulated below, only the value of the force in the case of a sphere and a semispace at 0.5 $\mu$m separation is increased by 0.1%, which on round-off to the second significant figure leads to an increase of 1%. The results of numerical integration by Eq. (5.68) for Al (solid curve) and Au (dashed curve) are presented in Fig. 10 on a logarithmic scale. As is seen from Fig. 10, the dielectric permittivity along the imaginary axis decreases monotonically with increasing frequency (in distinction to ${\rm Im}\,\varepsilon(\omega)$ which possesses peaks corresponding to interband absorption). The obtained values of the dielectric permittivity along the imaginary axis were substituted into Eqs. (4.25) and (4.26), (4.106) to calculate the Casimir force acting between real metals in configurations of two semispaces (ss) and a sphere (lens) above a disk (d$\ell$). Numerical integration was done from an upper limit of $10^4$ eV to a lower limit of $10^{-6}$ eV. Changes in the upper limit or lower limit by a factor of 10 lead to changes of less than 0.25% in the Casimir force. If the trapezoidal rule is used in the numerical integration of Eq. (5.68) the corresponding Casimir force decreases by less than 0.5%. The results are presented in Fig. 11 (two semispaces) and in Fig. 12 (a sphere above a disk) by the solid lines 1 (material of the test bodies is aluminum) and 2 (material is gold). On the vertical axis the relative force $F_{ss}^C/F_{ss}^{(0)}$ is plotted in Fig. 11 and $F_{d\ell}^C/F_{d\ell}^{(0)}$ in Fig. 12. These quantities provide a sense of the correction factors to the Casimir force due to the effect of finite conductivity. On the horizontal axis the space separation is plotted in the range 0.1–1 $\mu$m. We do not present the results for larger distances because then the temperature corrections to the Casimir force become significant. At room temperature the temperature corrections contribute only 2.6% of



Fig. 10. The dielectric permittivity as a function of imaginary frequency for Al (solid line) and Au (dashed line).

Fig. 11. The correction factor to the Casimir force due to finite conductivity of the metal as a function of the surface separation in the configuration of two semispaces. The solid lines 1 and 2 represent the computational results for Al and Au, respectively. The dashed lines 1 and 2 represent the perturbation correction factor up to the fourth order for the same metals.



Fig. 12. The correction factor to the Casimir force due to finite conductivity of the metal as a function of the surface separation for a sphere (lens) above a disk. The solid lines 1 and 2 represent the computational results for Al and Au, respectively. The dashed lines 1 and 2 represent the perturbation correction factor up to the fourth order for the same metals.

$F_{d\ell}^{(0)}$ at $a = 1\,\mu$m, but at $a = 3\,\mu$m they contribute 47% of $F_{d\ell}^{(0)}$, and at $a = 5\,\mu$m, 129% of $F_{d\ell}^{(0)}$ [45]. It is seen that the relative force for Al is larger than for Au at the same separations, as it should be because of the better reflectivity of Al. At the same time the relative force for Cu is almost the same as for Au [35]. The computational results presented here are in good agreement with the analytical perturbation expansions of the Casimir force in powers of the relative penetration depth of the zero-point oscillations into the metal (see the two preceding sections). In Fig. 11 (two semispaces) the dashed line 1 represents the results obtained by (5.60) for Al with $\lambda_p = 107$ nm (which corresponds to $\omega_p = 11.5$ eV), and the dashed line 2 the results obtained by (5.60) for Au with $\lambda_p = 136$ nm ($\omega_p = 9$ eV) [35]. In Fig. 12 the dashed lines 1 and 2 represent the perturbation results obtained for Al and Au by (5.65) for a lens above a disk. As one can see from the figures, the perturbation results are in good (up to 0.01) agreement with computations for all distances larger than $\lambda_p$. Only at $a = 0.1\,\mu$m for Au are there larger deviations, because $\lambda_{p1} \equiv \lambda_p^{\rm Au} > 0.1\,\mu$m. This shows that the perturbation expansions (5.60) and (5.65) are applicable with rather high accuracy not only for $a \gg \lambda_p$, as could be expected from general considerations, but for all $a \ge \lambda_p$. The same formalism makes it possible to consider the influence of thin outer metallic layers on the Casimir force value [36]. Let the semispace made of Al ($\varepsilon_2$) be covered by Au ($\varepsilon_1$) layers as shown in Fig. 4.
For a configuration of a sphere above a plate such a covering made of Au/Pd was used in experiments [41–43] with different values of layer thickness d. In this case the Casimir force is given by Eqs. (4.24) and (4.21), (4.106), where the quantities $Q_{1,2}(i\xi)$ are expressed by Eqs. (4.22), (4.23). The computational results for $\varepsilon(i\xi)$ are obtained



from Eq. (5.68). Substituting them into (4.24) and (4.21), (4.106) and performing a numerical integration in the same way as above one obtains the Casimir force including the effect of covering layers. The numerical computations described above show that an Au layer of d = 20 nm thickness significantly decreases the relative Casimir force between Al surfaces. With this layer the force approaches the value for pure Au semispaces. For a thicker Au layer of d = 30 nm thickness the relative Casimir force is scarcely affected by the underlying Al. For example, at a space separation a = 300 nm in the configuration of two semispaces we have $F_{ss}^C/F_{ss}^{(0)} = 0.773$ for pure Al, 0.727 for Al with a 20 nm Au layer, 0.723 for Al with a 30 nm Au layer, and 0.720 for pure Au. In the same way for the configuration of a sphere above a disk the results are: $F_{d\ell}^C/F_{d\ell}^{(0)} = 0.817$ (pure Al), 0.780 (Al with a 20 nm Au layer), 0.776 (Al with a 30 nm Au layer), and 0.774 (pure Au). Let us now discuss the application range of the obtained results for the case of covering layers [36]. First, from a theoretical standpoint, the main question concerns the layer thicknesses to which the obtained formulas (4.24) and (4.21), (4.106) and the above computations can be applied. In the derivation of Section 4.1.1 the spatial dispersion is neglected and, as a consequence, the dielectric permittivities $\varepsilon$ depend only on $\omega$, not on the wave vector k. In other words, the field of vacuum oscillations is considered as time dependent but space homogeneous. Apart from the thickness of the skin layer $\delta_0$, the main parameters of our problem are the velocity of the electrons on the Fermi surface, $v_F$, the characteristic frequency of the oscillating field, $\omega$, and the mean free path of the electrons, l. For the considered region of high frequencies (micrometer distances between the test bodies) the following conditions are valid [232]:

$$\frac{v_F}{\omega} < \delta_0 \ll l . \tag{5.70}$$

Note that the quantity $v_F/\omega$ on the left hand side of Eq. (5.70) is the distance traveled by an electron during one period of the field, so that the first inequality is equivalent to the assumption of spatial homogeneity of the oscillating field. Usually, the corresponding frequencies start from the far infrared part of the spectrum, which means the space separation $a \sim 100\,\mu$m [23]. The region of high frequencies is restricted by the short-wave optical or near ultraviolet parts of the spectrum, which correspond to surface separations of several hundred nanometers. For smaller distances absorption bands, the photoelectric effect and other physical phenomena should be taken into account. For these phenomena, the general Eqs. (4.24) and (4.21), (4.106), however, are still valid if one substitutes the experimental tabulated data for the dielectric permittivity along the imaginary axis incorporating all these phenomena. Now let us include one more physical parameter, the thickness d of the additional (i.e. Au) covering layer. It is evident that Eqs. (4.24) and (4.21), (4.106) are applicable only for layers of such thickness that

$$\frac{v_F}{\omega} < d . \tag{5.71}$$

Otherwise an electron goes out of the thin layer during one period of the oscillating field and the approximation of space homogeneity is not valid. If d is so small that inequality (5.71) is violated, the spatial dispersion should be taken into account, which means that the dielectric permittivity would depend not only on frequency but on the wave vector also: $\varepsilon_1 = \varepsilon_1(\omega, k)$. So,



if (5.71) is violated, the situation is analogous to the anomalous skin effect where only space dispersion is important and the inequalities below are valid

$$\delta_0(\omega) < \frac{v_F}{\omega} , \qquad \delta_0(\omega) < l . \tag{5.72}$$

In our case, however, the role of $\delta_0$ is played by the layer thickness d (the influence of nonlocality effects on the van der Waals force is discussed in [233,234]). From (5.70) and (5.71) it follows that for pure Au layers ($\lambda_p \approx 136$ nm) the space dispersion can be neglected only if $d \ge 25$–30 nm. For thinner layers a more general theory taking into account nonlocal effects should be developed to calculate the Casimir force. Thus for such thin layers the bulk tabulated data of the dielectric permittivity depending only on frequency cannot be used (see the experimental investigation [235] demonstrating that for Au the bulk values of dielectric constants can only be obtained from films whose thickness is about 30 nm or more). That is why the above calculated results for the case of d = 20 nm are subject to corrections due to the influence of spatial dispersion. From an experimental standpoint, thin layers of order a few nm grown by evaporation or sputtering techniques are highly porous. This is particularly so in the case of sputtered coatings as shown in [236]. The nature of the porosity is a function of the material and the underlying substrate. Thus, it should be noted that the theory presented here, which uses the bulk tabulated data for $\varepsilon_1$, cannot be applied to calculate the influence on the Casimir force of thin covering layers of d = 20 nm like those used in [34,41] and of d = 8 nm used in [42,43]. The measured high transparency of such layers for the characteristic frequencies [34,41] corresponds to a larger change of the force than what follows from Eqs. (4.24) and (4.21). This is in agreement with the above qualitative analysis. Note that the role of spatial dispersion was neglected in [237], where the van der Waals interaction was considered between metallic films of several nanometer thickness. According to the above considerations such neglect is unjustified.
With account of spatial dispersion the Casimir attraction between a bulk conductor and a metal film deposited on a dielectric substrate was studied in [238]. As is seen from Figs. 11, 12, at room temperature the Casimir force does not follow its ideal field-theoretical expressions $F_{ss}^{(0)}$, $F_{d\ell}^{(0)}$. For space separations less than $a = 1\,\mu$m the corrections due to finite conductivity of the metal are rather large (thus, at $a = 1\,\mu$m they are around 7–9% for a lens above a disk, and 10–12% for two semispaces; at $a = 0.1\,\mu$m, around 38–44% (d$\ell$), and 45–52% (ss)). For $a > 1\,\mu$m the temperature corrections increase very quickly (see Section 5.1). Actually, the range presented in Figs. 11 and 12 is the beginning of a transition with decreasing a from the Casimir force to the van der Waals force. In [36] the intermediate region is investigated in more detail for smaller a, and the values of a are found where the pure (nonretarded) van der Waals regime described by Eq. (4.27) starts. In doing so, more exact values of the Hamaker constants for Al and Au were also calculated

$$H^{\rm Al} = (3.6\pm0.1)\times10^{-19}\ {\rm J} , \qquad H^{\rm Au} = (4.4\pm0.2)\times10^{-19}\ {\rm J} \tag{5.73}$$

(see Eq. (4.28) for the determination of H).
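As a rough cross-check of the numbers quoted in this section, the perturbative factors (5.60) and (5.65) can be evaluated directly. The sketch below uses the plasma wavelengths given in the text (107 nm for Al, 136 nm for Au) and the separation a = 300 nm, so that the output can be compared with the tabulated-data values quoted above for pure Al and pure Au (0.773 and 0.720 for two semispaces, 0.817 and 0.774 for a sphere above a disk). The function name and the use of nanometers are illustrative choices.

```python
import math

def correction_factors(a_nm, lambda_p_nm):
    """Perturbative ratios F^C / F^(0) from Eqs. (5.60) and (5.65); lengths in nm."""
    t = lambda_p_nm / (2 * math.pi) / a_nm      # relative penetration depth delta_0 / a
    k3 = 1 - math.pi**2 / 210
    k4 = 1 - 163 * math.pi**2 / 7350
    ss = 1 - 16 / 3 * t + 24 * t**2 - 640 / 7 * k3 * t**3 + 2800 / 9 * k4 * t**4
    dl = 1 - 4 * t + 72 / 5 * t**2 - 320 / 7 * k3 * t**3 + 400 / 3 * k4 * t**4
    return ss, dl

for metal, lambda_p in (("Al", 107.0), ("Au", 136.0)):
    ss, dl = correction_factors(300.0, lambda_p)
    print(f"{metal}: two semispaces {ss:.3f}, sphere above disk {dl:.3f}")
```

Small differences from the tabulated-data values are expected, since the plasma model ignores relaxation and interband absorption.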



5.3. Roughness corrections

The next point characterizing real media is the imperfection of their boundaries. In reality, there are always small deviations from the perfect geometry, whether it be two plane parallel plates or a spherical lens (sphere) situated above a plate. These deviations can be of different types. For example, the plates can be at some nonzero angle to each other. More generally, the boundary surfaces may have point dependent small deviations from the perfect plane, spherical or cylindrical shape. In all cases “small” means that the characteristic distortion amplitude A is much less than the space separation a between the two bodies. The distortion period T can be much larger than, of the order of, or much smaller than a. In the latter case one may speak about short-scale distortions describing surface roughness. Some kinds of roughness can be described also by large-scale distortions. Roughness is necessarily present on any real surface and contributes to the value of the Casimir force. Its contribution is rather large at $a \sim 1\,\mu$m and should be taken into account when comparing theory with experiment. The problem of roughness corrections has long attracted the attention of researchers (see, e.g., [239,240] where the roughness corrections to the nonretarded van der Waals force were found). In principle, they can be calculated with perturbation theory based on the Green’s function method [96,157]. In doing so a small parameter characterizes the deviation from the basic geometry. Also the formalism based on functional integration can be applied for this purpose [241,242]. However, the resulting expressions turn out to be quite complicated and not very effective for specific applications. Because of this, we use here the phenomenological methods of Section 4.3 to calculate corrections to the Casimir force due to imperfection of the boundary geometry [10,227,243–247].
As is shown below, for small deviations from the plane parallel geometry the accuracy of these methods is very high, so that the obtained results are quite reliable and can be used for the interpretation of the modern precision experiments measuring the Casimir force. We consider first the configuration of two plane parallel plates and next a sphere (spherical lens) above a plate, which are the two cases of experimental interest.

5.3.1. Expansion in powers of relative distortion amplitude: two semispaces

Consider two semispaces modeled by plates which are made of a material with a static dielectric permittivity $\varepsilon_0$ bounded by surfaces with small deviations from plane geometry. The approximate expression for the Casimir energy in this configuration is given by Eq. (4.113), where the function L was defined in (4.30), and we now change the notation $\varepsilon_{20}$ used in Section 4.1.1 for $\varepsilon_0$. As zeroth approximation in the perturbation theory we consider the Casimir force between the square plane plate $P_1$ with the sidelength 2L and the thickness D and the other plane plate $P_2$, which is parallel to $P_1$ and has the same length and thickness. Our aim is to calculate the Casimir force between plates whose surfaces possess some small deviations from the plane geometry. Let us describe the surface of the first plate by the equation

$$z_1^{(s)} = A_1 f_1(x_1, y_1) \tag{5.74}$$

and the surface of the second plate by

$$z_2^{(s)} = a + A_2 f_2(x_2, y_2) , \tag{5.75}$$



where a is the mean value of the distance between the plates. The values of the amplitudes are chosen in such a way that $\max|f_i(x_i, y_i)| = 1$. It is suitable to choose the zero point on the z-axis so that

$$\langle z_1^{(s)}\rangle \equiv A_1\langle f_1(x_1,y_1)\rangle \equiv \frac{A_1}{(2L)^2}\int_{-L}^{L}dx_1\int_{-L}^{L}dy_1\,f_1(x_1,y_1) = 0 , \qquad \langle z_2^{(s)}\rangle \equiv a + A_2\langle f_2(x_2,y_2)\rangle = a . \tag{5.76}$$
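Conventions (5.74)–(5.76) can be imposed on any measured height profile. The sketch below is a minimal illustration, assuming the surface is sampled on a uniform grid so that the integral average in (5.76) can be replaced by an arithmetic mean. The function name and the sample profile are hypothetical.

```python
import math

def normalize_profile(heights):
    """Split sampled heights z(x, y) into (mean level, amplitude A, shape f).

    The returned shape satisfies <f> = 0 and max|f| = 1 on the sampling grid.
    """
    flat = [z for row in heights for z in row]
    mean = sum(flat) / len(flat)
    amplitude = max(abs(z - mean) for z in flat)
    if amplitude == 0.0:
        raise ValueError("perfectly flat surface: the distortion function is undefined")
    shape = [[(z - mean) / amplitude for z in row] for row in heights]
    return mean, amplitude, shape

# Illustrative profile: a 5 nm sinusoidal ripple on top of a 300 nm mean level.
n = 64
sample = [[300.0 + 5.0 * math.sin(2 * math.pi * i / n) for i in range(n)] for _ in range(n)]
mean, amplitude, shape = normalize_profile(sample)
print(mean, amplitude)
```

For the second plate the returned mean level plays the role of a and the amplitude that of $A_2$; for the first plate the mean level fixes the origin of the z-axis.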

We assume in our perturbation expansion that $A_i \ll a$, $a \ll D$, and $a \ll L$. At the same time, in all real situations, we have $a/D,\ a/L \ll A_i/a$, so that we are looking for the perturbation expansion in powers of $A_i/a$ and in zeroth order in $a/D$ and $a/L$. The nonnormalized potential of one atom at a height $z_2$ over the plate $P_1$ is given by

$$U_A(x_2,y_2,z_2) = -CN\int_{-L}^{L}dx_1\int_{-L}^{L}dy_1\int_{-D}^{A_1f_1(x_1,y_1)}dz_1\,\left[(x_2-x_1)^2+(y_2-y_1)^2+(z_2-z_1)^2\right]^{-7/2} , \tag{5.77}$$

where C is an interaction constant from Eq. (4.109), and N is the number of atoms per unit volume of plates $P_1$ and $P_2$. Let us calculate this expression as a series with respect to the parameter $A_1/z_2$, which is small due to $z_2 \ge a \gg A_1$. In carrying out these calculations we can neglect the corrections of order of $z_2/L$ and $z_2/D$, i.e. assume that the thickness and length of the sides of the first plate are infinitely large. The result of the expansion up to the fourth order with respect to the small parameter $A_1/z_2$ may be written in the form

$$U_A(x_2,y_2,z_2) = -CN\left\{\frac{\pi}{10z_2^4}+\int_{-L}^{L}dx_1\int_{-L}^{L}dy_1\left[\frac{z_2f_1(x_1,y_1)}{X^{7/2}}\,\frac{A_1}{z_2}+\frac{7z_2^3f_1^2(x_1,y_1)}{2X^{9/2}}\left(\frac{A_1}{z_2}\right)^2+\frac{7z_2^3}{6X^{9/2}}\left(\frac{9z_2^2}{X}-1\right)f_1^3(x_1,y_1)\left(\frac{A_1}{z_2}\right)^3+\frac{21z_2^5}{8X^{11/2}}\left(\frac{11z_2^2}{X}-3\right)f_1^4(x_1,y_1)\left(\frac{A_1}{z_2}\right)^4\right]\right\} \tag{5.78}$$

with $X = (x_1-x_2)^2+(y_1-y_2)^2+z_2^2$. Here, the limit $L \to \infty$ is performed in the first term, which describes the perfect plates without deviations from the planar case. The normalized potential of the Casimir force between the plates may be obtained by the integration of Eq. (5.78) over the volume $V_2$ of the second plate using the boundary function $A_2f_2(x_2,y_2)$ and by division of the obtained result by the normalization constant K from Eq. (4.112)

$$U^R(a) = \frac{N}{K}\int_{-L}^{L}dx_2\int_{-L}^{L}dy_2\int_{a+A_2f_2(x_2,y_2)}^{a+D}dz_2\,U_A(x_2,y_2,z_2) . \tag{5.79}$$

The Casimir force between the plates per unit area is given by

$$F_{ss}^R(a) = -\frac{1}{(2L)^2}\,\frac{\partial U^R(a)}{\partial a} . \tag{5.80}$$

Substituting (5.79) into (5.80) we can write

$$F_{ss}^R(a) = -\frac{1}{(2L)^2}\,\frac{N}{K}\int_{-L}^{L}dx_2\int_{-L}^{L}dy_2\,U_A[x_2,y_2,a+A_2f_2(x_2,y_2)] . \tag{5.81}$$

Let us now represent the quantity $U_A$ defined in Eq. (5.78) as a series up to the fourth order in the small parameter $A_2/a$. Then we substitute this series into Eq. (5.81). The result may be written in the form

$$F_{ss}^R(a) = F_{ss}(a)\sum_{k=0}^{4}\sum_{l=0}^{4-k} c_{kl}\left(\frac{A_1}{a}\right)^k\left(\frac{A_2}{a}\right)^l, \tag{5.82}$$

where $F_{ss}(a)$ is as defined in Eq. (4.29) for the case of perfect plates. For the coefficients in (5.82) we note that

$$c_{00} = 1, \qquad c_{01} = c_{10} = 0. \tag{5.83}$$

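The last two equalities follow from the choice (5.76). The coefficients whose first index is zero are

$$c_{02} = 10\langle f_2^2\rangle, \qquad c_{03} = -20\langle f_2^3\rangle, \qquad c_{04} = 35\langle f_2^4\rangle, \tag{5.84}$$

where the notation for the averaged values is the same as in Eq. (5.76). The remaining coefficients $c_{kl}$ in Eq. (5.82) are more complicated. They read

$$\begin{aligned}
c_{20} &= \frac{35}{\pi}a^7\langle f_1^2 Y^{-9}\rangle, & c_{30} &= \frac{35}{3\pi}a^7\langle f_1^3\varphi_1(Y)\rangle,\\
c_{40} &= \frac{105}{4\pi}a^9\langle f_1^4\varphi_2(Y)\rangle, & c_{11} &= -\frac{70}{\pi}a^7\langle f_1 f_2 Y^{-9}\rangle,\\
c_{12} &= \frac{35}{\pi}a^7\langle f_1 f_2^2\varphi_1(Y)\rangle, & c_{21} &= -\frac{35}{\pi}a^7\langle f_1^2 f_2\varphi_1(Y)\rangle,\\
c_{13} &= -\frac{105}{\pi}a^9\langle f_1 f_2^3\varphi_2(Y)\rangle, & c_{31} &= -\frac{105}{\pi}a^9\langle f_1^3 f_2\varphi_2(Y)\rangle,\\
c_{22} &= \frac{315}{2\pi}a^9\langle f_1^2 f_2^2\varphi_2(Y)\rangle. &&
\end{aligned} \tag{5.85}$$

In (5.85) the following notation is used:

$$Y = \left[(x_1-x_2)^2+(y_1-y_2)^2+a^2\right]^{1/2}, \qquad \varphi_1(Y) = 9a^2Y^{-11}-Y^{-9}, \qquad \varphi_2(Y) = 11a^2Y^{-13}-3Y^{-11} \tag{5.86}$$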

and the following averaging procedure for a function depending on four variables:

$$\langle\Phi(x_1,y_1;x_2,y_2)\rangle = \frac{1}{(2L)^2}\int_{-L}^{L}dx_2\int_{-L}^{L}dy_2\int_{-L}^{L}dx_1\int_{-L}^{L}dy_1\,\Phi(x_1,y_1;x_2,y_2). \tag{5.87}$$
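The coefficients (5.84) can be traced back to the first term of Eq. (5.78). As a sketch, assuming only that the flat-plate contribution scales as $z_2^{-4}$ and that the first plate is undistorted ($A_1 = 0$), one may expand at $z_2 = a + A_2 f_2$; the symbol $\varepsilon$ is introduced here for illustration only:

```latex
\left(a + A_2 f_2\right)^{-4}
  = a^{-4}\left[1 - 4\varepsilon + 10\varepsilon^{2} - 20\varepsilon^{3}
      + 35\varepsilon^{4} + O(\varepsilon^{5})\right],
\qquad \varepsilon \equiv \frac{A_2 f_2}{a}.
```

Averaging over the plate removes the linear term, because $\langle f_2\rangle = 0$ by (5.76), and the remaining terms give $c_{02} = 10\langle f_2^2\rangle$, $c_{03} = -20\langle f_2^3\rangle$ and $c_{04} = 35\langle f_2^4\rangle$.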

We have also used the notations $f_1 \equiv f_1(x_1,y_1)$, $f_2 \equiv f_2(x_2,y_2)$. To obtain the general result for corrections to the Casimir force, we need to calculate the values of the coefficients $c_{kl}$ defined in (5.85). We start with the coefficients $c_{k0}$, which depend on one distortion function only. Integrating over $x_2$ and $y_2$ according to (5.87) from $-\infty$ to $\infty$, we obtain to zeroth order in $a/L$

$$c_{20} = 10\langle f_1^2\rangle, \qquad c_{30} = 20\langle f_1^3\rangle, \qquad c_{40} = 35\langle f_1^4\rangle. \tag{5.88}$$
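The integration leading to Eq. (5.88) can be checked with exact rational arithmetic. The sketch below assumes the prefactors $35/\pi$, $35/(3\pi)$ and $105/(4\pi)$ for $c_{20}$, $c_{30}$ and $c_{40}$ in Eq. (5.85) and uses the elementary plane integral $\int d^2r\,(r^2+a^2)^{-p/2} = 2\pi a^{2-p}/(p-2)$; the function name is illustrative.

```python
from fractions import Fraction


def plane_integral(p):
    """Integral of (r^2 + a^2)^(-p/2) over the whole plane,
    in units of pi * a^(2 - p); valid for p > 2."""
    return Fraction(2, p - 2)


# Prefactors of Eq. (5.85) in units of 1/pi (assumed: 35, 35/3, 105/4).
c20 = Fraction(35) * plane_integral(9)
# phi_1(Y) = 9 a^2 Y^-11 - Y^-9
c30 = Fraction(35, 3) * (9 * plane_integral(11) - plane_integral(9))
# phi_2(Y) = 11 a^2 Y^-13 - 3 Y^-11
c40 = Fraction(105, 4) * (11 * plane_integral(13) - 3 * plane_integral(11))

# Numbers multiplying <f1^2>, <f1^3>, <f1^4> in Eq. (5.88)
print(c20, c30, c40)
```

All powers of $a$ cancel, so only the pure numbers remain.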

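The results (5.88) are in agreement with Eq. (5.84) and may also be obtained from symmetry considerations. The calculation of the mixed coefficients (depending on the deviation functions of both plates) is quite complicated. The results read [10]

$$\begin{aligned}
c_{11} &= -\frac{4}{3}\sqrt{\frac{2}{\pi}}\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}G_{mn}^{(1,1)}z_{mn}^{7/2}K_{7/2}(z_{mn}),\\
c_{12} &= \frac{2}{3}\sqrt{\frac{2}{\pi}}\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}G_{mn}^{(1,2)}z_{mn}^{7/2}\left[z_{mn}K_{9/2}(z_{mn})-K_{7/2}(z_{mn})\right],\\
c_{13} &= -\frac{2}{9}\sqrt{\frac{2}{\pi}}\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}G_{mn}^{(1,3)}z_{mn}^{9/2}\left[z_{mn}K_{11/2}(z_{mn})-3K_{9/2}(z_{mn})\right],\\
c_{22} &= 210\,g_{00}^{(2)}h_{00}^{(2)}+\frac{1}{3}\sqrt{\frac{2}{\pi}}\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}G_{mn}^{(2,2)}z_{mn}^{9/2}\left[z_{mn}K_{11/2}(z_{mn})-3K_{9/2}(z_{mn})\right].
\end{aligned} \tag{5.89}$$

Let us explain the notation in Eqs. (5.89). The functions $K_\nu(z)$ are the modified Bessel functions, and

$$z_{mn} \equiv \pi\frac{a}{L}\sqrt{n^2+m^2}. \tag{5.90}$$

The quantities $G_{mn}^{(i,k)}$ are given by

$$G_{mn}^{(i,k)} = \frac{1}{4}\left(1+\delta_{m0}+\delta_{n0}\right)\sum_{l=1}^{4}g_{l,mn}^{(i)}h_{l,mn}^{(k)}, \tag{5.91}$$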

where $\delta_{m0}$ is the Kronecker symbol and $g_{l,mn}^{(i)}$ and $h_{l,mn}^{(k)}$ are the Fourier coefficients of the functions $f_1^i$ and $f_2^k$, considered as periodic functions with the period $2L$ (for details see [10]). The quantities $g_{00}^{(2)}$ and $h_{00}^{(2)}$ are the zeroth terms of the Fourier expansions of the functions $f_1^2$ and $f_2^2$, respectively. The coefficient $c_{21}$ differs from $c_{12}$ by its sign and by the sequence of the upper indices of $G$. To obtain $c_{31}$ it is enough to change the sequence of the upper indices of $G$ in $c_{13}$.

So, the perturbation formalism developed allows one to obtain the Casimir force for configurations with deviations from plane parallel geometry in the form of the series given by Eq. (5.82), with coefficients defined in Eqs. (5.83), (5.84), (5.88) and (5.89). These coefficients can be calculated explicitly for various kinds of distortions. However, the simple case of nonparallel plates is of prime interest here because an exact value of the Casimir force is also known for it. This makes it possible to estimate the accuracy of the above phenomenological approach.

5.3.2. Casimir force between nonparallel plates and plates covered by large-scale distortions

Let us consider the Casimir force for the configuration of two plane plates with an angle $\alpha$ between them. This angle is assumed to be small, so that the inequality $\alpha L \ll a$ holds. This configuration is a particular example of a deviation from plane parallel geometry whose characteristic length scale is much larger than $a$. For all deviations of this kind the coefficients (5.89) can be calculated in a general form. To do this we use the explicit expressions for the modified Bessel functions [226] in (5.89) and obtain [10]

$$\begin{aligned}
c_{11} &= -20\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}G_{mn}^{(1,1)}e^{-z_{mn}}\left(1+z_{mn}+\frac{2}{5}z_{mn}^2+\frac{1}{15}z_{mn}^3\right),\\
c_{12} &= 60\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}G_{mn}^{(1,2)}e^{-z_{mn}}\left(1+z_{mn}+\frac{13}{30}z_{mn}^2+\frac{1}{10}z_{mn}^3+\frac{1}{90}z_{mn}^4\right),\\
c_{13} &= -140\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}G_{mn}^{(1,3)}e^{-z_{mn}}\left[1+z_{mn}S(z_{mn})\right],\\
c_{22} &= 210\left\{g_{00}^{(2)}h_{00}^{(2)}+\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}G_{mn}^{(2,2)}e^{-z_{mn}}\left[1+z_{mn}S(z_{mn})\right]\right\},
\end{aligned} \tag{5.92}$$

where $S(z) = 1+\frac{19}{42}z+\frac{5}{42}z^2+\frac{2}{105}z^3+\frac{1}{630}z^4$. The Fourier coefficients of the distortion functions $f_{1,2}$ and of their powers decrease quickly with the number of the harmonic. Due to Eq. (5.91) the quantities $G_{mn}^{(i,k)}$ also decrease with increasing $m$ and $n$. The quantity $z_{mn}$, defined in (5.90), is of the order of $a/L \ll 1$ for all harmonics which give a significant contribution to the coefficients $c_{ik}$. So we can put $z_{mn} = 0$ in Eq. (5.92) without loss of accuracy:

$$\begin{aligned}
c_{11} &= -20\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}G_{mn}^{(1,1)}, & c_{12} &= 60\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}G_{mn}^{(1,2)},\\
c_{13} &= -140\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}G_{mn}^{(1,3)}, & c_{22} &= 210\left[g_{00}^{(2)}h_{00}^{(2)}+\sum_{m=0}^{\infty}\sum_{n=0}^{\infty}G_{mn}^{(2,2)}\right].
\end{aligned} \tag{5.93}$$
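The polynomials in Eq. (5.92) can be cross-checked against the Bessel-function form (5.89). The sketch below uses the standard closed form $K_{n+1/2}(z) = \sqrt{\pi/(2z)}\,e^{-z}\sum_{k=0}^{n}\frac{(n+k)!}{k!\,(n-k)!}(2z)^{-k}$ and assumes that $c_{13}$ and $c_{22}$ contain the combination $z K_{11/2}(z) - 3K_{9/2}(z)$, which is what agreement with $S(z)$ requires. Helper names are illustrative.

```python
from fractions import Fraction
from math import factorial


def half_integer_poly(n):
    """Coefficients (lowest power first) of P_n(z), where
    sqrt(2/pi) * z**(n + 1/2) * K_{n+1/2}(z) = exp(-z) * P_n(z)."""
    coeffs = [Fraction(0)] * (n + 1)
    for k in range(n + 1):
        coeffs[n - k] = Fraction(factorial(n + k),
                                 factorial(k) * factorial(n - k) * 2**k)
    return coeffs


def combine(p, q, weight_q, scale):
    """Return scale * (p - weight_q * q); q is padded with zeros."""
    q = q + [Fraction(0)] * (len(p) - len(q))
    return [scale * (a - weight_q * b) for a, b in zip(p, q)]


P3, P4, P5 = (half_integer_poly(n) for n in (3, 4, 5))

# Series multiplying exp(-z) after dividing out the overall factors
# 20, 60 and 140 of Eq. (5.92); entries are coefficients of z^0, z^1, ...
c11_series = [Fraction(4, 3) / 20 * c for c in P3]
c12_series = combine(P4, P3, 1, Fraction(2, 3) / 60)
c13_series = combine(P5, P4, 3, Fraction(2, 9) / 140)

print(c11_series)
print(c12_series)
print(c13_series)
```

The last list should be read as $1 + zS(z)$, so its entries from the second one onward are the coefficients of $S(z)$. Setting $z = 0$ leaves the leading 1 in each series, which corresponds to Eq. (5.93).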

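Taking into account Eq. (5.91) and the elementary properties of Fourier expansions, it is possible to rewrite Eqs. (5.93) in the form

$$\begin{aligned}
c_{11} &= -20\langle f_1 f_2\rangle, & c_{12} &= 60\langle f_1 f_2^2\rangle,\\
c_{13} &= -140\langle f_1 f_2^3\rangle, & c_{22} &= 210\langle f_1^2 f_2^2\rangle,\\
c_{21} &= -60\langle f_1^2 f_2\rangle, & c_{31} &= -140\langle f_1^3 f_2\rangle
\end{aligned} \tag{5.94}$$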

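(in the last line the relations between $c_{ik}$ and $c_{ki}$ have been used; see Section 5.3.1). Using Eqs. (5.83), (5.84), (5.88) and (5.94), the final result for the Casimir force from Eq. (5.82) can be represented in the form [10]

$$\begin{aligned}
F_{ss}^R(a) = F_{ss}(a)\Bigg\{1 &+ 10\left[\langle f_1^2\rangle\left(\frac{A_1}{a}\right)^2 - 2\langle f_1 f_2\rangle\frac{A_1}{a}\frac{A_2}{a} + \langle f_2^2\rangle\left(\frac{A_2}{a}\right)^2\right]\\
&+ 20\left[\langle f_1^3\rangle\left(\frac{A_1}{a}\right)^3 - 3\langle f_1^2 f_2\rangle\left(\frac{A_1}{a}\right)^2\frac{A_2}{a} + 3\langle f_1 f_2^2\rangle\frac{A_1}{a}\left(\frac{A_2}{a}\right)^2 - \langle f_2^3\rangle\left(\frac{A_2}{a}\right)^3\right]\\
&+ 35\left[\langle f_1^4\rangle\left(\frac{A_1}{a}\right)^4 - 4\langle f_1^3 f_2\rangle\left(\frac{A_1}{a}\right)^3\frac{A_2}{a} + 6\langle f_1^2 f_2^2\rangle\left(\frac{A_1}{a}\right)^2\left(\frac{A_2}{a}\right)^2 - 4\langle f_1 f_2^3\rangle\frac{A_1}{a}\left(\frac{A_2}{a}\right)^3 + \langle f_2^4\rangle\left(\frac{A_2}{a}\right)^4\right]\Bigg\}.
\end{aligned} \tag{5.95}$$

As can be seen from Eq. (5.95), the mixed terms have an evident interference character. For example, in the particular case $f_2 = \mp f_1$ we have

$$F_{ss}^R(a) = F_{ss}(a)\left[1 + 10\langle f_1^2\rangle\left(\frac{A_1}{a}\pm\frac{A_2}{a}\right)^2 + 20\langle f_1^3\rangle\left(\frac{A_1}{a}\pm\frac{A_2}{a}\right)^3 + 35\langle f_1^4\rangle\left(\frac{A_1}{a}\pm\frac{A_2}{a}\right)^4\right]. \tag{5.96}$$

Now let us apply results (5.95) and (5.96) to the case of two plane plates with a small angle $\alpha$ between them. This configuration can be realized in three ways. For example, the upper plate may be left flat and the lower plate allowed to deviate from the parallel position as described by the function

$$f_1(x_1,y_1) = \frac{x_1}{L} \tag{5.97}$$

with the amplitude $A_1 = \alpha L$. Substituting (5.97) into expression (5.95) for the Casimir force and putting $A_2 = 0$, one obtains

$$F_{ss}^R(a) = F_{ss}(a)\left[1 + 10\langle f_1^2\rangle\left(\frac{A_1}{a}\right)^2 + 20\langle f_1^3\rangle\left(\frac{A_1}{a}\right)^3 + 35\langle f_1^4\rangle\left(\frac{A_1}{a}\right)^4\right]. \tag{5.98}$$

Using the averaged values calculated from function (5.97), $\langle f_1^2\rangle = 1/3$, $\langle f_1^3\rangle = 0$ and $\langle f_1^4\rangle = 1/5$, one further obtains

$$F_{ss}^R(a) = F_{ss}(a)\left[1 + \frac{10}{3}\left(\frac{\alpha L}{a}\right)^2 + 7\left(\frac{\alpha L}{a}\right)^4\right]. \tag{5.99}$$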



The same configuration can be obtained when both plates are perturbed with the amplitudes $A_1 = \alpha_1 L$ and $A_2 = \alpha_2 L$. The angle between them is then $\alpha = \alpha_1 + \alpha_2$ or $\alpha = |\alpha_1 - \alpha_2|$, and the deviation functions are given by Eq. (5.97) supplemented by $f_2 = \mp f_1$, respectively. It is easy to see that the calculation of the Casimir force according to Eq. (5.96) reproduces result (5.99). This example is interesting because it makes it possible to observe the role of the interference terms in (5.95). Such terms must be taken into account in order to obtain the proper result. The configuration of two nonparallel plates can also be considered by the method of Green's functions. For this purpose let us use the exact expression for the Casimir energy density of the electromagnetic field inside a wedge obtained by this method in [215,248]

$$E(\rho) = -\frac{(n^2-1)(n^2+11)}{720\pi^2\rho^4}, \tag{5.100}$$

where $\rho$ is the distance from the line of intersection of the wedge faces, and $n = \pi/\alpha$. The total energy in the space between two square plates of side length $2L = \rho_2 - \rho_1$ at an angle $\alpha$ to each other is

$$U^R = \int_0^{\alpha}d\varphi\int_0^{2L}dz\int_{\rho_1}^{\rho_2}d\rho\,\rho\,E(\rho). \tag{5.101}$$

Substituting Eq. (5.100) into Eq. (5.101) and integrating, keeping $\alpha \ll 1$, one obtains the result

$$U^R = U^R(a) = -\frac{\pi^2 L}{720\alpha}\left[\frac{1}{(a-\alpha L)^2}-\frac{1}{(a+\alpha L)^2}\right], \tag{5.102}$$

where $a$ is the mean distance between the two plates. Then the Casimir force can be obtained from Eq. (5.80) as

$$F_{ss}^R(a) = -\frac{\pi^2}{1440\alpha L}\left[\frac{1}{(a-\alpha L)^3}-\frac{1}{(a+\alpha L)^3}\right]. \tag{5.103}$$
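The series expansion of Eq. (5.103) in powers of $x = \alpha L/a$ is easy to check with exact arithmetic. The sketch below assumes the ideal-plate normalization $F_{ss}(a) = -\pi^2/(240a^4)$ (in the units of Eq. (5.100)), so that the ratio $F_{ss}^R/F_{ss}$ equals $[(1-x)^{-3}-(1+x)^{-3}]/(6x)$; the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def wedge_ratio_series(order):
    """Coefficients of F^R_ss / F_ss from Eq. (5.103) in powers of
    x = alpha*L/a, from x**0 up to x**order (order assumed even)."""
    # (1 -/+ x)^(-3) = sum_k C(k+2, 2) (+/- x)^k; the difference keeps
    # odd k only, and division by 6x lowers each power by one.
    coeffs = [Fraction(0)] * (order + 1)
    for k in range(1, order + 2, 2):
        coeffs[k - 1] = Fraction(2 * comb(k + 2, 2), 6)
    return coeffs


exact = wedge_ratio_series(4)

# Perturbative coefficients for a tilted plate with f1 = x1/L:
# <f1^2> = 1/3, <f1^3> = 0, <f1^4> = 1/5 inserted into Eq. (5.98).
perturbative = [Fraction(1), Fraction(0), 10 * Fraction(1, 3),
                Fraction(0), 35 * Fraction(1, 5)]

print(exact)
print(perturbative)
```

Both lists contain the coefficients $1, 0, 10/3, 0, 7$, i.e. the two approaches agree through fourth order.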

Expanding expression (5.103) for the force as a series in powers of the small parameter $\alpha L/a$ up to fourth order, one obtains exactly Eq. (5.99). So it has been shown that, at least up to the fourth order inclusive in the parameter $\alpha L/a$, the approximate approach based on additive summation with a subsequent normalization yields exactly the same result as the Green's function method if the slope angle between the plates is small. This allows one to estimate the relative error of the above approach when it is used for configurations with small deviations from plane parallel geometry. Taking the realistic estimate $\alpha L/a \approx 10^{-1}$, one concludes that the relative error of result (5.95) for the Casimir force is much less than $10^{-2}\,\%$. Therefore, the application of the approach under consideration to configurations with small deviations from plane parallel geometry can be expected to give reliable results up to the fourth order in the parameter $A_{1,2}/a$. As was noted above, Eq. (5.95) is valid for any large-scale distortions whose characteristic length scale is $T \gg a$. Such distortions may describe large-scale surface roughness. Let us consider longitudinal distortions with amplitudes $A_{1,2}$ described by the functions

$$f_1(x_1,y_1) = \sin\omega x_1, \qquad f_2(x_2,y_2) = \sin(\omega x_2+\delta). \tag{5.104}$$


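In the case when $\omega^{-1} \ll L$ the mean values of the functions $f_{1,2}^k$ in Eq. (5.95) can be calculated as the mean values over one period. Substituting these results into Eq. (5.95), one obtains [10]

$$\begin{aligned}
F_{ss}^R(a) = F_{ss}(a)\Bigg\{1 &+ 5\left[\left(\frac{A_1}{a}\right)^2 - 2\cos\delta\,\frac{A_1}{a}\frac{A_2}{a} + \left(\frac{A_2}{a}\right)^2\right]\\
&+ \frac{105}{8}\left[\left(\frac{A_1}{a}\right)^4 - 4\cos\delta\left(\frac{A_1}{a}\right)^3\frac{A_2}{a} + 2(2+\cos 2\delta)\left(\frac{A_1}{a}\right)^2\left(\frac{A_2}{a}\right)^2 - 4\cos\delta\,\frac{A_1}{a}\left(\frac{A_2}{a}\right)^3 + \left(\frac{A_2}{a}\right)^4\right]\Bigg\}.
\end{aligned} \tag{5.105}$$

It is seen that the result depends significantly on the value of the phase shift $\delta$. If the periods of the distortion functions in Eq. (5.104) are different, it is necessary to calculate the mean values in the mixed terms over the whole plate according to Eq. (5.76). It can be seen that in this case only the contribution of $\langle f_1^2 f_2^2\rangle = 1/4$ to the mixed terms is not equal to zero. So the Casimir force is

$$F_{ss}^R(a) = F_{ss}(a)\left\{1 + 5\left[\left(\frac{A_1}{a}\right)^2 + \left(\frac{A_2}{a}\right)^2\right] + \frac{105}{8}\left[\left(\frac{A_1}{a}\right)^4 + 4\left(\frac{A_1}{a}\right)^2\left(\frac{A_2}{a}\right)^2 + \left(\frac{A_2}{a}\right)^4\right]\right\}. \tag{5.106}$$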
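To see how strongly the phase shift matters, Eqs. (5.105) and (5.106) can be evaluated directly. The following sketch uses illustrative amplitudes $A_1/a = A_2/a = 0.1$, which are an assumption and are not taken from the text; function names are hypothetical.

```python
import math


def ratio_same_period(x1, x2, delta):
    """F^R_ss / F_ss from Eq. (5.105); x_i = A_i / a, delta = phase shift."""
    c = math.cos(delta)
    second = 5 * (x1**2 - 2 * c * x1 * x2 + x2**2)
    fourth = (105 / 8) * (x1**4 - 4 * c * x1**3 * x2
                          + 2 * (2 + math.cos(2 * delta)) * x1**2 * x2**2
                          - 4 * c * x1 * x2**3 + x2**4)
    return 1 + second + fourth


def ratio_different_periods(x1, x2):
    """F^R_ss / F_ss from Eq. (5.106)."""
    return (1 + 5 * (x1**2 + x2**2)
            + (105 / 8) * (x1**4 + 4 * x1**2 * x2**2 + x2**4))


x = 0.1  # illustrative relative amplitude, same on both plates
in_phase = ratio_same_period(x, x, 0.0)
anti_phase = ratio_same_period(x, x, math.pi)
uncorrelated = ratio_different_periods(x, x)
print(in_phase, uncorrelated, anti_phase)
```

For equal amplitudes and $\delta = 0$ the two surfaces are parallel everywhere and the correction vanishes, while $\delta = \pi$ gives the largest correction; the different-period result lies between the two.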

The force from Eq. (5.106) is evidently larger than the force from Eq. (5.105) for $\delta = 0$, but smaller than that for $\delta = \pi$.

5.3.3. Casimir force between plates covered by short-scale roughness

The surfaces of real plates are always covered by some short-scale distortions or short-scale roughness of different types. It is necessary to take the contribution of such distortions into account in precision experiments on Casimir force measurements. The characteristic longitudinal scales of such distortions are of the order of (or less than) the distance $a$ between the plates. The distortions of the plate surfaces may be periodic or nonperiodic. In both cases the general result for the Casimir force (see Eq. (5.82) with the coefficients (5.83), (5.84), (5.88) and (5.92)) acquires one and the same form, which is simpler than that presented by Eq. (5.95). As we shall see below, most of the mixed terms in Eq. (5.82) turn into zero for short-scale distortions. The reasons for these simplifications are different for periodic and nonperiodic distortions. Let us start with the nonperiodic case, which is the more general. If the functions $f_{1,2}$ are nonperiodic, result (5.95) for the Casimir force applies. Because the characteristic scale of the distortions is of the order of $a \ll L$, we obtain by means of Eq. (5.76), for the odd numbers $(i,k)$,

$$\langle f_1^i f_2^k\rangle = 0. \tag{5.107}$$

Using this, Eq. (5.95) takes the form

$$F_{ss}^R(a) = F_{ss}(a)\left\{1 + 10\left[\langle f_1^2\rangle\left(\frac{A_1}{a}\right)^2 + \langle f_2^2\rangle\left(\frac{A_2}{a}\right)^2\right] + 35\left[\langle f_1^4\rangle\left(\frac{A_1}{a}\right)^4 + 6\langle f_1^2 f_2^2\rangle\left(\frac{A_1}{a}\right)^2\left(\frac{A_2}{a}\right)^2 + \langle f_2^4\rangle\left(\frac{A_2}{a}\right)^4\right]\right\}. \tag{5.108}$$

The same result occurs for large-scale periodic distortions with different periods. Now let the functions $f_{1,2}$ be periodic. If the periods of $f_1$ and $f_2$ are different (at least in one coordinate), Eq. (5.107) is valid once more and we return to expression (5.108) for the Casimir force. A different formula appears only in the case when the periods $T_x$ and $T_y$ of $f_1$ and $f_2$ are equal in both coordinates, respectively. It is evident in our case that $2L \approx mT_x$, $2L \approx nT_y$ holds, where $m, n \gg 1$. In the Fourier expansions of the functions $f_1^i$ and $f_2^k$ the coefficients of the modes with the corresponding large numbers are large. So in Eq. (5.92) the main contributions are given by the terms containing the coefficients $G_{mn}^{(i,k)}$, which were defined in Eq. (5.91), with $m$, $n$ large. The parameter $z_{mn}$ defined in Eq. (5.90) is of the order of $a/T_{x,y}$ in that case. For extremely short-scale distortions, $T_{x,y} \ll a$, we have $e^{-z_{mn}} \to 0$ and only one coefficient among all the mixed ones survives:

$$c_{22} = 210\,g_{00}^{(2)}h_{00}^{(2)} \equiv 210\langle f_1^2\rangle\langle f_2^2\rangle. \tag{5.109}$$

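As a consequence the Casimir force for the extremely short-scale distortions has the form [10]

$$F_{ss}^R(a) = F_{ss}(a)\left\{1 + 10\left[\langle f_1^2\rangle\left(\frac{A_1}{a}\right)^2 + \langle f_2^2\rangle\left(\frac{A_2}{a}\right)^2\right] + 35\left[\langle f_1^4\rangle\left(\frac{A_1}{a}\right)^4 + 6\langle f_1^2\rangle\langle f_2^2\rangle\left(\frac{A_1}{a}\right)^2\left(\frac{A_2}{a}\right)^2 + \langle f_2^4\rangle\left(\frac{A_2}{a}\right)^4\right]\right\}. \tag{5.110}$$

For the case $T_{x,y} \sim a$ one should calculate the coefficients according to the nonsimplified formulas of Eq. (5.92). As an example of periodic distortions let us examine the longitudinal ones described by Eq. (5.104) with $\omega^{-1} \sim a$. Calculating the mixed coefficients according to Eq. (5.92) and taking into account that only $G_{m0}^{(i,k)}$ with $m = \omega L/\pi$ is not equal to zero, one obtains

$$\begin{aligned}
c_{11} &= -20\cos\delta\; e^{-z_{m0}}\left[1+z_{m0}+\frac{2}{5}z_{m0}^2+\frac{1}{15}z_{m0}^3\right],\\
c_{12} &= c_{21} = 0,\\
c_{13} &= c_{31} = -\frac{105}{2}\cos\delta\; e^{-z_{m0}}\left[1+z_{m0}S(z_{m0})\right],\\
c_{22} &= \frac{105}{2}\left\{1+\cos 2\delta\; e^{-z_{m0}}\left[1+z_{m0}S(z_{m0})\right]\right\},
\end{aligned} \tag{5.111}$$

where $S(z)$ is as defined in Eq. (5.92) and $z_{m0} = \omega a$.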


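If $\omega a \gg 1$, the terms of Eq. (5.111) containing the exponential factors can be neglected and we return to result (5.110) with

$$\langle f_1^2\rangle = \langle f_2^2\rangle = \frac{1}{2}, \qquad \langle f_1^4\rangle = \langle f_2^4\rangle = \frac{3}{8}. \tag{5.112}$$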

It is seen that this result coincides with (5.106) and does not depend on the phase shift $\delta$, as would be expected in this case. For $\omega^{-1} \sim a$ one has to carry out the computations. For example, for $z_{m0} = 1$ the results are

$$c_{11} = -18.08\cos\delta, \qquad c_{13} = -49.98\cos\delta, \qquad c_{22} = \frac{105}{2}\left(1+0.953\cos 2\delta\right). \tag{5.113}$$

The other coefficients coincide with those obtained in Section 5.3.2 for this example. They are contained in Eq. (5.105) and do not depend on the period. A detailed investigation of the transition region between large-scale and short-scale roughness is contained in [246]. Substituting all these coefficients into Eq. (5.82), one obtains the Casimir force which is analogous to Eq. (5.105) for larger periods. It is seen that as the period decreases, the absolute values of the coefficients of the interference terms, i.e. the terms containing $\delta$, decrease [10]. In the above manner, the Casimir force between plates covered by all types of small distortions can be described perturbatively with the required accuracy (note that the case of an atom near a cavity wall covered by roughness is examined in detail in [247]).

5.3.4. Expansion in powers of relative distortion amplitude: a spherical lens above a plate

The configuration of a spherical lens (or a sphere) above a plate (or a disk) is the most preferable from the experimental point of view (see Section 6). Because of this, a knowledge of the roughness corrections in this configuration is necessary to compare theoretical predictions with the results of Casimir force measurements. Let us start from the same approximate expression for the Casimir force potential as for two plates, given by Eqs. (4.113) and (4.30). Let the lens curvature radius be $R$, its thickness $h$, and its diameter $2r$. The parameters of the plate are the same as in Section 5.3.1. The coordinate plane $(x, y)$ is chosen to coincide with the plate surface. As above, the surface of the plate with some small distortions can be described by Eq. (5.74) with the condition given by the first equality of (5.76). Let us consider the surface of a lens with small deviations from the ideal spherical shape described by the equation

$$z_2^{(s)} = a + R - \sqrt{R^2-\rho^2} + A_2 f_2(\rho,\varphi), \tag{5.114}$$

where $\rho$, $\varphi$ are the polar coordinates.
The perturbation theory can be developed in the same way as in Section 5.3.1. As a result the normalized potential of the Casimir force acting between the lens and the plate is given by

$$U^R(a) = -\hbar c\,\Psi(\varepsilon)\int_0^{2\pi}d\varphi\int_0^{r}\rho\,d\rho\int_{z_2^{(s)}}^{h+a}dz_2\, U_A(\rho,\varphi,z_2), \tag{5.115}$$

where

$$U_A(\rho,\varphi,z_2) = \int_{-L}^{L}dx_1\int_{-L}^{L}dy_1\int_{-D}^{z_1^{(s)}}dz_1\,\left[(x_1-\rho\cos\varphi)^2+(y_1-\rho\sin\varphi)^2+(z_1-z_2)^2\right]^{-7/2}. \tag{5.116}$$

Function (5.116) was calculated in Section 5.3.1 as a series with respect to the parameter $A_1/z_2$, which is small due to $z_2 > a \gg A_1$. Neglecting the corrections of the order of $z_2/L$ and $z_2/D$ one has

$$U_A(\rho,\varphi,z_2) = \frac{\pi}{10z_2^4} + \int_{-L}^{L}dx_1\int_{-L}^{L}dy_1\left[z_2 f_1(x_1,y_1)X^{-7/2}\,\frac{A_1}{z_2} + \frac{7}{2}z_2^3 f_1^2(x_1,y_1)X^{-9/2}\left(\frac{A_1}{z_2}\right)^2\right] \tag{5.117}$$

with $X = (x_1-\rho\cos\varphi)^2+(y_1-\rho\sin\varphi)^2+z_2^2$. Here the limit $L \to \infty$ is performed in the first term, which describes the contribution of the perfect plate without deviations from the plane. For brevity we restrict ourselves to second-order perturbation theory. Substituting Eq. (5.117) into Eq. (5.115) and calculating the force as $-\partial U^R/\partial a$, we arrive at the result in zeroth order with respect to the small parameter $a/h$:

$$F_{d\ell}^R(a) = -\hbar c\,\Psi(\varepsilon)\int_0^{2\pi}d\varphi\int_0^{r}\rho\,d\rho\; U_A(\rho,\varphi,z_2^{(s)}). \tag{5.118}$$

Here $z_2^{(s)}$ is as defined by Eq. (5.114). Let us now represent the quantity $U_A(\rho,\varphi,z_2^{(s)})$ in Eq. (5.118) as a series up to the second order in the small parameter $A_2/a$. After integration, the result can be written in the form

$$F_{d\ell}^R(a) = F_{d\ell}(a)\sum_{k=0}^{2}\sum_{l=0}^{2-k}C_{kl}\left(\frac{A_1}{a}\right)^k\left(\frac{A_2}{a}\right)^l, \tag{5.119}$$

where $F_{d\ell}(a)$ is defined by Eq. (4.115). For the first coefficient in Eq. (5.119) we note that

$$C_{00} = 1. \tag{5.120}$$
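The other coefficients in Eq. (5.119), in zeroth order with respect to the small parameters $a/h$, $a/r$, $r/R$, $a/L$, are [245]

$$\begin{aligned}
C_{01} &= -\frac{6}{\pi Ra}\int_0^{2\pi}d\varphi\int_0^{\infty}\rho\,d\rho\, f_2(\rho,\varphi)\left(1+\frac{\rho^2}{2aR}\right)^{-5},\\
C_{02} &= \frac{15}{\pi Ra}\int_0^{2\pi}d\varphi\int_0^{\infty}\rho\,d\rho\, f_2^2(\rho,\varphi)\left(1+\frac{\rho^2}{2aR}\right)^{-6},\\
C_{10} &= \frac{15a^4}{\pi^2R}\left\langle f_1 Y^{-7}\right\rangle,\\
C_{20} &= \frac{105a^5}{2\pi^2R}\left\langle f_1^2\left(a+\frac{\rho^2}{2R}\right)Y^{-9}\right\rangle,\\
C_{11} &= -\frac{105a^5}{\pi^2R}\left\langle f_1 f_2\left(a+\frac{\rho^2}{2R}\right)Y^{-9}\right\rangle.
\end{aligned} \tag{5.121}$$

In Eqs. (5.121) we use the notation

$$Y = \left[(x_1-\rho\cos\varphi)^2+(y_1-\rho\sin\varphi)^2+\left(a+\frac{\rho^2}{2R}\right)^2\right]^{1/2} \tag{5.122}$$

and the following averaging procedure for a function depending on four variables:

$$\langle\Phi(\rho,\varphi;x_1,y_1)\rangle = \int_0^{2\pi}d\varphi\int_0^{\infty}\rho\,d\rho\int_{-\infty}^{\infty}dx_1\int_{-\infty}^{\infty}dy_1\,\Phi(\rho,\varphi;x_1,y_1). \tag{5.123}$$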
(5.123)

Besides, we have used in Eqs. (5.121) the notations $f_1 \equiv f_1(x_1,y_1)$, $f_2 \equiv f_2(\rho,\varphi)$. A point that should be mentioned is that the coefficients $C_{01}$, $C_{10}$ in Eq. (5.119) in general differ from zero. For the configuration of two parallel plates with small distortions, the analogous perturbative expansion starts from the second order (see Section 5.3.1). The coefficients given by Eqs. (5.121), and also those of higher orders, can be calculated in the same way as illustrated in Sections 5.3.1 and 5.3.2 for the example of two plates. For this purpose the distortion function $f_1$ is considered as periodic with a period $2L$ in both variables, and the function $f_2(\rho,\varphi)$ is considered as a periodic function of $\rho$ with a period $r$. All the details can be found in Ref. [245]. Instead of presenting the detailed algebra, here we only discuss the specific results which are applicable to experiments.

5.3.5. Corrections to the Casimir force between a plate and a lens due to different kinds of roughness

Let us consider first the short-scale roughness on the plate (disk) and on the lens with characteristic longitudinal scales $T_d$, $T_\ell$ which are subject to the inequality

$$T_d,\ T_\ell \ll \sqrt{aR}. \tag{5.124}$$

In this case only the terms of the Fourier expansions of the functions $f_1^i$, $f_2^k$ with sufficiently large numbers are significant. As a result all the expansion coefficients from Eq. (5.119) can be calculated in closed form. The result up to the fourth order in the relative roughness amplitude is

$$\begin{aligned}
F_{d\ell}^R(a) = F_{d\ell}(a)\Bigg\{1 &+ 6\left[\langle f_1^2\rangle\left(\frac{A_1}{a}\right)^2 - 2\langle f_1 f_2\rangle\frac{A_1}{a}\frac{A_2}{a} + \langle f_2^2\rangle\left(\frac{A_2}{a}\right)^2\right]\\
&+ 10\left[\langle f_1^3\rangle\left(\frac{A_1}{a}\right)^3 - 3\langle f_1^2 f_2\rangle\left(\frac{A_1}{a}\right)^2\frac{A_2}{a} + 3\langle f_1 f_2^2\rangle\frac{A_1}{a}\left(\frac{A_2}{a}\right)^2 - \langle f_2^3\rangle\left(\frac{A_2}{a}\right)^3\right]\\
&+ 15\left[\langle f_1^4\rangle\left(\frac{A_1}{a}\right)^4 - 4\langle f_1^3 f_2\rangle\left(\frac{A_1}{a}\right)^3\frac{A_2}{a} + 6\langle f_1^2 f_2^2\rangle\left(\frac{A_1}{a}\right)^2\left(\frac{A_2}{a}\right)^2 - 4\langle f_1 f_2^3\rangle\frac{A_1}{a}\left(\frac{A_2}{a}\right)^3 + \langle f_2^4\rangle\left(\frac{A_2}{a}\right)^4\right]\Bigg\}.
\end{aligned} \tag{5.125}$$

For the extremely short-scale roughness, which is given by the condition $T_d, T_\ell \ll a$, one obtains $\langle f_1 f_2\rangle = 0$ and the result (up to the second order in the relative roughness amplitude) is [245]

$$F_{d\ell}^R(a) = F_{d\ell}(a)\left[1 + 6\langle f_1^2\rangle\left(\frac{A_1}{a}\right)^2 + 6\langle f_2^2\rangle\left(\frac{A_2}{a}\right)^2\right]. \tag{5.126}$$

As is seen from Eqs. (5.125) and (5.126), there is no contribution to the Casimir force from the terms linear in the relative distortion amplitude. This is analogous to the case of two plates. However, large linear corrections to the Casimir force between a plate and a lens may appear in the case of large-scale distortions. If the characteristic longitudinal distortion scales $T_d$, $T_\ell$ are much larger than $a$, the Fourier modes with rather small numbers contribute to the result. The corresponding corrections to the Casimir force can be important. Their magnitude and sign are affected by the position of the lens. By way of illustration let us consider longitudinal periodic distortions of the plate (disk) described by

$$f_1(x) = \cos\left(\frac{2\pi x}{T_d}+\delta_1\right) \tag{5.127}$$

and concentric distortions of the lens

$$f_2(\rho) = \cos\left(\frac{2\pi\rho}{T_\ell}+\delta_2\right). \tag{5.128}$$

For large-scale distortions $T_d \sim L$, $T_\ell \sim r$. The parameter $\delta_2$ in Eq. (5.128) defines the type of distortion in the lens center: convex or concave, smooth ($\delta_2 = 0$ or $\pi$) or sharp. The parameter $\delta_1$ in Eq. (5.127) fixes the position of the lens above the plate. Calculating the expansion coefficients with functions (5.127) and (5.128), one obtains the Casimir force (5.119) in the form

$$F_{d\ell}^R(a) = F_{d\ell}(a)\left[1 + 3\cos\delta_1\,\frac{A_1}{a} - 3\cos\delta_2\,\frac{A_2}{a} + 3\left(\frac{A_1}{a}\right)^2 + 3\left(\frac{A_2}{a}\right)^2 - 12\cos\delta_1\,\frac{A_1}{a}\frac{A_2}{a}\right]. \tag{5.129}$$

Here the dependence of the Casimir force on the parameters $\delta_1$, $\delta_2$ is seen in explicit form. At the same time the Casimir force for extremely short-scale distortions of the form (5.127) and (5.128) with $T_d, T_\ell \ll a$ is given by Eq. (5.126) and naturally does not depend on the parameters $\delta_1$, $\delta_2$. In Ref. [245] the smooth transition of the Casimir force between Eqs. (5.126) and (5.129) was followed as the scales $T_d$, $T_\ell$ in Eqs. (5.127) and (5.128) decrease. The case of two crossed cylinders was also considered there, and the roughness corrections to the Casimir force were calculated.


It is notable that result (5.95) for two plane plates and result (5.125) for a lens above a plate are connected by the Proximity Force Theorem (see Section 4.3). This means that one may obtain the perturbation expansion for the energy density between the plates by integration of Eq. (5.95) and then determine the force between a plate and a lens by multiplying the result by $2\pi R$. But in the case of large-scale distortions it is impossible to derive an equation like (5.129) using this theorem. Indeed, the distance between the interacting bodies was defined above as the distance between those ideal surfaces relative to which the average values of all distortions are equal to zero. With this definition, for the configuration of two parallel plates there are no first-order corrections for all distortion types considered above. However, for the configuration of a lens above a plate there exist first-order corrections in the case of large-scale distortions. It is easy to see that the origin of such corrections is connected with the definition of the distance between the interacting bodies, i.e. they disappear if one defines $a$ as the distance between the nearest points of the distorted surfaces bounding the lens and the plate [227]. Thus, the choice of the theoretical formula for the Casimir force to be compared with the experimental results depends on how the distance is measured experimentally. The results of the above calculations do not depend on $L$, i.e. the plate (disk) was effectively taken to be infinitely large. In some experiments, however, the size of the lens can be even larger than that of the plate (see, e.g., [40]). This is the reason why the boundary effects due to the finite size of the plate may be interesting.
According to the analysis of this problem performed in [227], the Casimir force with account of the finiteness of the plate is given by

$$F_{d\ell}^{L}(a) \approx F_{d\ell}(a)\left[1-\frac{a^3}{R^3}\frac{1}{(1-T)^3}\right], \qquad T \equiv \max\left\{\frac{R}{\sqrt{R^2+L^2}},\ \frac{R-h}{R}\right\}. \tag{5.130}$$

For the parameters of experiment [40] (see Section 6.3) this results in a correction which does not exceed $6\times10^{-7}$ in the complete measurement range. An even smaller contribution is given by the boundary effects in experiments using the atomic force microscope to measure the Casimir force. This is the reason why we limit our discussion of these effects here.

5.3.6. Stochastic roughness

In the preceding subsections the surface roughness was described by regular functions. Here we discuss the case of extremely irregular roughness, which can be modeled in a better way by stochastic functions. Let us discuss the case of two parallel plates first [244] and a lens (sphere) above a plate next [245]. Now it is assumed that the surfaces of both plates with the dimensions $2L \times 2L$ and of thickness $D$ are described by the stochastic functions $\{\delta_i f_i(x_i,y_i)\}$, $i = 1, 2$, with dispersions $\delta_i$ and mean values

$$\langle\delta_i f_i(x_i,y_i)\rangle_i = 0. \tag{5.131}$$

Here, $\langle\;\rangle_i$ denotes the averaging over the ensemble of all particular realizations $\delta_i f_i(x_i,y_i)$ of the corresponding stochastic functions. The factor $\delta_i$ is written in front of $f_i$ so that the dispersion of the functions $\{f_i(x_i,y_i)\}$ is equal to unity. So the surfaces of the plates under consideration are given by the functions

$$z_1^{(s)} = \delta_1 f_1(x_1,y_1), \qquad z_2^{(s)} = a + \delta_2 f_2(x_2,y_2). \tag{5.132}$$

For test bodies with surfaces (5.132), the potential $U^R$ in Eq. (4.113) has to be replaced by $\langle\langle U^R(a)\rangle_1\rangle_2$. Then the Casimir force is given by

$$F_{ss}(a) = -\frac{1}{(2L)^2}\frac{\partial}{\partial a}\langle\langle U^R(a)\rangle_1\rangle_2. \tag{5.133}$$

Naturally, we assume the absence of any correlation between the stochastic functions describing the deviations from plane parallel geometry on the two plates. In our perturbation treatment we assume $\delta_i \ll a$. In direct analogy with the above treatment, $a/L \ll \delta_i/a$, so that a perturbative expansion in powers of $\delta_i/a$ is considered, taken in zeroth order with respect to $a/D$ and $a/L$. The nonnormalized potential of one atom at the height $z_2$ above the first plate is given by Eq. (5.77), where the dispersion $\delta_1$ should be substituted for $A_1$. As a result of the perturbation expansion up to the fourth order with respect to the small parameter $\delta_1/z_2$, Eq. (5.78) follows once more with the same substitution. In accordance with Eq. (5.133) it is necessary to calculate the mean value over all realizations of the stochastic functions $\{f_{1,2}\}$. For the first function it is suitable to do this directly in Eq. (5.78). Using the normal distribution at each point of the surfaces and the corresponding mean values

$$\langle f_i\rangle_i = \langle f_i^3\rangle_i = 0, \qquad \langle f_i^2\rangle_i = 1, \qquad \langle f_i^4\rangle_i = 3, \tag{5.134}$$

one obtains from (5.78)

$$\langle U_A(x_2,y_2,z_2)\rangle_1 = -CN\left\{\frac{\pi}{10z_2^4} + \int_{-L}^{L}dx_1\int_{-L}^{L}dy_1\left[\frac{7z_2^3}{2X^{9/2}}\left(\frac{\delta_1}{z_2}\right)^2 + \frac{63z_2^5}{8X^{11/2}}\left(\frac{11z_2^2}{X}-3\right)\left(\frac{\delta_1}{z_2}\right)^4\right]\right\}. \tag{5.135}$$

If the distortions on the surfaces are described by stationary stochastic functions with $\delta_i = \mathrm{const}$, the result in the limit $L \to \infty$ is

$$\langle U_A(x_2,y_2,z_2)\rangle_1 = -CN\frac{\pi}{10z_2^4}\left[1 + 10\left(\frac{\delta_1}{z_2}\right)^2 + 105\left(\frac{\delta_1}{z_2}\right)^4\right]. \tag{5.136}$$

In order to obtain the normalized potential of the Casimir force between the plates let us integrate expression (5.136) over the volume of the second plate with the boundary function $z_2^{(s)}$ given by Eq. (5.132). Then we calculate the mean value over all realizations of the stochastic function $\{f_2\}$ and divide the result by the normalization factor $K$ given by Eq. (4.112):

$$\langle\langle U^R(a)\rangle_1\rangle_2 = \frac{N}{K}\left\langle\int_{-L}^{L}dx_2\int_{-L}^{L}dy_2\int_{a+\delta_2 f_2(x_2,y_2)}^{a+D}dz_2\,\langle U_A(x_2,y_2,z_2)\rangle_1\right\rangle_2. \tag{5.137}$$
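The coefficients 10 and 105 in Eq. (5.136) follow from Eq. (5.135) by elementary plane integrals. As a sketch, using $\int d^2r\,(r^2+z^2)^{-p/2} = 2\pi z^{2-p}/(p-2)$ with $X = r^2 + z_2^2$ in the limit $L \to \infty$:

```latex
\begin{aligned}
\int dx_1\,dy_1\,\frac{7z_2^3}{2X^{9/2}}
  &= \frac{7z_2^3}{2}\cdot\frac{2\pi}{7z_2^{7}}
   = \frac{\pi}{z_2^4}
   = 10\cdot\frac{\pi}{10z_2^4},\\
\int dx_1\,dy_1\,\frac{63z_2^5}{8X^{11/2}}
     \left(\frac{11z_2^2}{X}-3\right)
  &= \frac{63z_2^5}{8}
     \left(\frac{2\pi}{z_2^{9}}-\frac{2\pi}{3z_2^{9}}\right)
   = \frac{21\pi}{2z_2^4}
   = 105\cdot\frac{\pi}{10z_2^4}.
\end{aligned}
```

The factor $63/8$ is the coefficient $21/8$ of the fourth-order term in Eq. (5.78) multiplied by the Gaussian moment $\langle f_1^4\rangle_1 = 3$ from Eq. (5.134).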



Substituting (5.137) into (5.133) we obtain for the Casimir force per unit area

$$F_{ss}^R(a) = \frac{1}{(2L)^2}\frac{N}{K}\int_{-L}^{L}dx_2\int_{-L}^{L}dy_2\,\langle\langle U_A(x_2,y_2,a+\delta_2 f_2(x_2,y_2))\rangle_1\rangle_2. \tag{5.138}$$

Let us now expand the quantity $\langle U_A\rangle_1$ defined by (5.136) as a series with respect to the small parameter $\delta_2/a$ and substitute this series into Eq. (5.138). Using Eqs. (5.134) and the condition $\delta_2 = \mathrm{const}$ for stationary stochastic functions, one obtains the final result

$$F_{ss}^R(a) = F_{ss}(a)\left\{1 + 10\left[\left(\frac{\delta_1}{a}\right)^2+\left(\frac{\delta_2}{a}\right)^2\right] + 105\left[\left(\frac{\delta_1}{a}\right)^2+\left(\frac{\delta_2}{a}\right)^2\right]^2\right\}. \tag{5.139}$$

It is seen that the correction to the Casimir force depends on the sum $\delta_1^2+\delta_2^2$ only and does not depend on the correlation radii $T_{1,2}$ of the stochastic functions describing the distortions. This result coincides with the result of Ref. [249], where the corresponding corrections to the van der Waals and Casimir forces were calculated (up to the second order in the dispersions only). For a typical value of $\delta_{1,2}/a \approx 0.1$ the correction given by Eq. (5.139) is 24% of $F_{ss}$, of which 4% results from the fourth order. The same calculation procedure can be applied to the configuration of a lens (sphere) above a plate (disk) covered by stochastic roughness. In the case of stationary stochastic functions the Casimir force, which is analogous to Eq. (5.139), is

$$F_{d\ell}^R(a) = F_{d\ell}(a)\left\{1 + 6\left[\left(\frac{\delta_1}{a}\right)^2+\left(\frac{\delta_2}{a}\right)^2\right] + 45\left[\left(\frac{\delta_1}{a}\right)^2+\left(\frac{\delta_2}{a}\right)^2\right]^2\right\}. \tag{5.140}$$

The case of nonstationary stochastic functions describing roughness is more complicated. Some results for the Casimir force were nevertheless obtained for the case in which the mean values of these stochastic functions do not depend on $x_i$, $y_i$ but the dispersions are coordinate dependent: $\delta_i = \delta_i(x_i,y_i)$ (see Ref. [244] for the case of two plates and [245] for a lens above a plate).

5.4. Combined effect of different corrections

As discussed above, the corrections due to nonzero temperature, finite conductivity of the boundary material and surface roughness make important contributions to the value of the Casimir force and should be taken into account when comparing theory and experiment. Up to this point it has been assumed that each correction factor influences the Casimir force independently, i.e. that there are no multiplicative effects. Based on this assumption the combined effect of several corrections can be calculated by additive summation of the results obtained in Sections 5.1–5.3. The additivity of different corrections to the Casimir force holds, however, in the first approximation only, when all of them are small. In the region where, e.g., two corrections are significant (such as finite conductivity and surface roughness at small separations), more sophisticated methods to account for their combined effect are needed. These methods are discussed below.



5.4.1. Roughness and conductivity

Finite conductivity corrections to the Casimir force were computed in Section 5.2 in the whole distance range $a \leq 1\,\mu$m. For larger distances they should be considered together with the corrections due to nonzero temperature (see Section 5.4.2). For separations $a \leq 1\,\mu$m the surface roughness makes important contributions to the value of the Casimir force. Surface roughness corrections were calculated in Section 5.3 based on the retarded interatomic potential of Eq. (4.109) with a power index equal to seven. This relationship is, however, specific to the pure Casimir regime only (distances of the order of $1\,\mu$m) and, broadly speaking, not applicable to the wide transition region from the Casimir to the van der Waals force. Because of this the results of Section 5.3 may not be immediately applicable to separations $a < 1\,\mu$m. At the same time the roughness corrections of Section 5.3 demonstrate that the geometry of the boundary surfaces covered by roughness is more important than the specific type of the interatomic potential. Let us consider two large plates of dimension $L \times L$ whose boundary planes (perpendicular to the $z$-axis) are covered by small roughness. Let

$$a(x,y) = a_0 + f(x,y) \tag{5.141}$$

be a distance between the points of boundary surfaces of both plates with the coordinates (x; y). Here a0 is the mean distance between the plates which means that L L dx dy f(x; y) = 0 : (5.142) −L

−L

We remind the reader that for a wide range of surface distortions the Casimir force is given by Eq. (5.95) (for con3guration of two semispaces) and Eq. (5.125) (for a lens above a disk). It is signi3cant that the same expression can be obtained by the integration of the Casimir force, which takes into account the small distortions on the plates in the surface separation, over the boundary surface L 1 L R Fss (a0 ) = 2 dx dy Fss (a(x; y)) : (5.143) L −L −L Here Fss (a) is given by Eq. (4.29). In doing so the separation function f(x; y) introduced in Eq. (5.141) is connected with the distortion functions of Section 5.3 according to f(x; y) = A2 f2 (x; y) − A1 f1 (x; y) :

(5.144)
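The geometrical averaging of Eq. (5.143) can be illustrated with a short numerical sketch. It rests on several simplifying assumptions that are not part of the review: the force law is taken as the ideal $a^{-4}$ dependence for two plates, the roughness is a one-dimensional sinusoidal profile $f = A\sin(2\pi x/\Lambda)$ with zero mean, and the average is taken over one period. The directly averaged force is then compared with the expansion in the moments $\langle f^2\rangle$ and $\langle f^4\rangle$.

```python
import math


def averaged_force_ratio(amplitude_over_a0, n=400, power=4):
    """Mean of (1 + f/a0)**(-power) over one period of f = A sin(phase).

    This is F^R(a0) / F(a0) for a force law F ~ a**(-power), obtained by
    averaging the local force over the surface (midpoint rule).
    """
    total = 0.0
    for i in range(n):
        phase = 2.0 * math.pi * (i + 0.5) / n
        total += (1.0 + amplitude_over_a0 * math.sin(phase)) ** (-power)
    return total / n


def perturbative_ratio(amplitude_over_a0, power=4):
    """Same ratio expanded up to fourth order in A/a0 (odd moments vanish)."""
    mean_f2 = amplitude_over_a0 ** 2 / 2.0        # <sin^2> = 1/2
    mean_f4 = 3.0 * amplitude_over_a0 ** 4 / 8.0  # <sin^4> = 3/8
    c2 = power * (power + 1) / 2.0
    c4 = power * (power + 1) * (power + 2) * (power + 3) / 24.0
    return 1.0 + c2 * mean_f2 + c4 * mean_f4


if __name__ == "__main__":
    for ratio in (0.05, 0.1, 0.2):
        print(ratio, averaged_force_ratio(ratio), perturbative_ratio(ratio))
```

For small $A/a_0$ the two numbers agree closely, and the gap widens as the amplitude grows, which is the regime where direct averaging is the safer choice. Replacing the power law inside `averaged_force_ratio` by any other force law $F(a)$ is a one-line change.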

The analogous result can also be obtained for the configuration of a spherical lens above a plate (this can be done, e.g., by application of the Proximity Force Theorem to Eq. (5.143)). A detailed example illustrating the above-mentioned equivalence between the two methods in the case of the roughness correction is given below in Section 6.4.1. There the experimental configuration of a sphere (lens) above a disk covered by roughness is considered and the roughness corrections are calculated by both methods, leading to the same result. Although Eq. (5.143) is also applicable in the pure retarded (Casimir) regime only, it can be simply generalized to the range of smaller separations where the law of the interatomic interaction changes and the finite conductivity corrections become significant. To do so, it is enough to replace the ideal Casimir force between dielectrics $F_{ss}$ in Eq. (5.143) by the one



from Section 5.2, which takes the finite conductivity corrections into account. The resultant expression
\[
F_{ss}^{R,C}(a_0) = \frac{1}{L^2}\int_{-L}^{L} dx \int_{-L}^{L} dy\, F_{ss}^{C}\bigl(a(x, y)\bigr) \tag{5.145}
\]
makes it possible to evaluate the Casimir force taking into account both the finite conductivity and the surface roughness of the boundary material. Either the perturbation expression of Eq. (5.60) or the one of Section 5.2.3 based on the tabulated optical data can be substituted into Eq. (5.145) as the Casimir force $F_{ss}^{C}$ in the case of a real metal. In Section 6.4.1 Eq. (5.145) is applied to find the combined effect of surface roughness and finite conductivity corrections in the measurement of the Casimir force by means of the Atomic Force Microscope, and excellent agreement between theory and experiment is demonstrated.

5.4.2. Conductivity and temperature: two semispaces

Both the finite conductivity of the boundary material and its nonzero temperature are taken into account in the general Lifshitz formula (5.19). Thus their combined effect can be calculated in a fundamental way on the basis of this formula. The contribution of finite conductivity to the Casimir force decreases with increasing separation, whereas the contribution of nonzero temperature increases with distance. Because of this, the two corrections are important at the opposite ends of the separation interval. Nevertheless, at intermediate distances both corrections make important contributions and should be taken into account in precision experiments on the Casimir force. This subject is also interesting from a theoretical point of view, since it is connected with some delicate features of the Lifshitz formula which have been recognized only recently [38].

We start from the Lifshitz formula (5.19) and rewrite it more conveniently in the form
\[
F_{ss}^{T}(a) = -\frac{k_B T}{2\pi}\sum_{l=-\infty}^{\infty}\int_0^{\infty} k_\perp\, dk_\perp\, q_l \left\{\left[r_1^{-2}(\xi_l, k_\perp)e^{2aq_l} - 1\right]^{-1} + \left[r_2^{-2}(\xi_l, k_\perp)e^{2aq_l} - 1\right]^{-1}\right\}, \tag{5.146}
\]
where $r_{1,2}$ are the reflection coefficients with parallel (perpendicular) polarization, respectively, given by
\[
r_1^{-2}(\xi_l, k_\perp) = \left[\frac{\varepsilon(i\xi_l)q_l + k_l}{\varepsilon(i\xi_l)q_l - k_l}\right]^2, \qquad
r_2^{-2}(\xi_l, k_\perp) = \left[\frac{q_l + k_l}{q_l - k_l}\right]^2; \tag{5.147}
\]
the other notations are introduced in Section 5.1.1. In terms of the dimensionless variables
\[
y = 2aq_l = 2a\sqrt{\frac{\xi_l^2}{c^2} + k_\perp^2}, \qquad \tilde{\xi}_l = 2a\frac{\xi_l}{c}, \tag{5.148}
\]
Eqs. (5.146) and (5.147) can be rearranged to
\[
F_{ss}^{T}(a) = -\frac{k_B T}{16\pi a^3}\sum_{l=-\infty}^{\infty}\int_{|\tilde{\xi}_l|}^{\infty} y^2\, dy \left\{\left[r_1^{-2}(\tilde{\xi}_l, y)e^{y} - 1\right]^{-1} + \left[r_2^{-2}(\tilde{\xi}_l, y)e^{y} - 1\right]^{-1}\right\}, \tag{5.149}
\]
where
\[
r_1(\tilde{\xi}_l, y) = \frac{\varepsilon y - \sqrt{(\varepsilon - 1)\tilde{\xi}_l^2 + y^2}}{\varepsilon y + \sqrt{(\varepsilon - 1)\tilde{\xi}_l^2 + y^2}}, \qquad
r_2(\tilde{\xi}_l, y) = \frac{y - \sqrt{(\varepsilon - 1)\tilde{\xi}_l^2 + y^2}}{y + \sqrt{(\varepsilon - 1)\tilde{\xi}_l^2 + y^2}}, \qquad
\varepsilon \equiv \varepsilon(i\xi_l) = \varepsilon\!\left(i\frac{c\tilde{\xi}_l}{2a}\right). \tag{5.150}
\]

To calculate the combined effect of nonzero temperature and finite conductivity one should substitute some model dielectric function $\varepsilon(i\xi_l)$ into Eqs. (5.149) and (5.150). The simplest function of this kind is given by the plasma model (see Eq. (5.54)). On the basis of the plasma model the combined effect of nonzero temperature and finite conductivity was first examined in [37,38].

It is common knowledge that the plasma model does not provide the correct behavior of the dielectric function at small frequencies. This is given by the more complete Drude model, which takes relaxation processes into account. The dielectric function of the Drude model is given by Eq. (5.69). There is, however, one major problem when the dielectric function of Eq. (5.69) is substituted into Eqs. (5.149) and (5.150). Let us change the discrete variable $\tilde{\xi}_l$ for the continuous one $\tilde{\xi}$. It is easily seen that the reflection coefficient $r_2(\tilde{\xi}, y)$ is discontinuous as a function of the two variables at the point (0, 0). Indeed, substituting Eq. (5.69) into $r_2$ one obtains two limiting values
\[
\lim_{\tilde{\xi} \to 0} r_2^2(\tilde{\xi}, 0) = 1, \qquad \lim_{y \to 0} r_2^2(0, y) = 0, \tag{5.151}
\]
which together demonstrate the presence of the discontinuity. As a result the zeroth term of Eq. (5.149) becomes ambiguous. What is more important, the perpendicular reflection coefficient at zero frequency is discontinuous with respect to the relaxation parameter $\gamma$ at the point $\gamma = 0$ [250].

In [237,251] the value $r_2^2(0, y) = 0$ was used to calculate numerically the Casimir force (5.149). As a result a temperature correction was obtained which changes its sign at different separations. Also the value of $r_2$ used in [237,251] leads to temperature corrections for the real metal that are anomalously large in modulus at small separations and to wrong asymptotic values at large separations. These asymptotic values are two times smaller than that for the case of a perfect metal, independent of how high a conductivity is assumed for the real metal. This means that the value of $r_2$ at zero frequency used in [237,251] is unacceptable. The use of the value $r_2^2(0, y) = 1$ in [252,253], as it is for physical photons, is also unjustified and leads to large temperature corrections at small separations and to the absence of any conductivity corrections at moderate separations of several nanometers with no regard to the quality of a metal.

In fact the scattering problem which underlies the Lifshitz formula (see Section 5.1.1) is not well defined at zero frequency in the presence of relaxation. Because of this the zeroth term of the Lifshitz formula, when applied to real metals, should be corrected in an appropriate way, as was done in [214] for the ideal metal. To solve this problem one can use the representation of the Lifshitz formula in terms of continuous variables instead of a discrete summation. Such a representation was suggested in [212,214] for other purposes. This approach is discussed below.
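The discontinuity expressed by Eq. (5.151) can be made visible with a few lines of code. The sketch below assumes the standard Drude form along the imaginary frequency axis, $\varepsilon(i\xi) = 1 + \omega_p^2/[\xi(\xi + \gamma)]$, as a stand-in for Eq. (5.69), which is not reproduced in this section. The names `wp` and `gamma` are illustrative and denote the plasma and relaxation frequencies in the same dimensionless units as $\tilde{\xi}$, i.e. multiplied by $2a/c$. Setting `gamma = 0` recovers the plasma model.

```python
import math


def r2_squared(xi, y, wp, gamma):
    """Square of the perpendicular reflection coefficient r_2 of Eq. (5.150).

    xi, y  : dimensionless frequency and wave-vector variables
    wp     : dimensionless plasma frequency (assumed name)
    gamma  : dimensionless relaxation parameter (assumed name), 0 for plasma
    """
    if xi == 0.0:
        # (eps - 1) * xi**2 tends to wp**2 for the plasma model and to 0
        # for any nonzero relaxation parameter.
        eps_term = wp ** 2 if gamma == 0.0 else 0.0
    else:
        eps_term = wp ** 2 * xi / (xi + gamma)
    root = math.sqrt(eps_term + y * y)
    if y + root == 0.0:
        raise ValueError("r_2 is undefined exactly at xi = 0, y = 0")
    return ((y - root) / (y + root)) ** 2


if __name__ == "__main__":
    wp, gamma = 100.0, 1.0
    print("Drude,  xi -> 0 at y = 0 :", r2_squared(1e-9, 0.0, wp, gamma))
    print("Drude,  y -> 0 at xi = 0 :", r2_squared(0.0, 1e-9, wp, gamma))
    print("plasma, y -> 0 at xi = 0 :", r2_squared(0.0, 1e-9, wp, 0.0))
```

With a nonzero relaxation parameter the two approaches to the origin give different values, while for the plasma model both give unity. This is the jump in $\gamma$ at $\gamma = 0$ mentioned in the text.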



According to the Poisson summation formula, if $c(\alpha)$ is the Fourier transform of a function $b(x)$,
\[
c(\alpha) = \frac{1}{2\pi}\int_{-\infty}^{\infty} b(x)e^{-i\alpha x}\, dx, \tag{5.152}
\]
then it follows [212]
\[
\sum_{l=-\infty}^{\infty} b(l) = 2\pi\sum_{l=-\infty}^{\infty} c(2\pi l). \tag{5.153}
\]
Let us apply this formula to Eq. (5.149) using the identification
\[
b_{ss}(l) \equiv -\frac{k_B T}{16\pi a^3}\int_{|l|\tau}^{\infty} y^2\, dy\, f_{ss}(l\tau, y), \qquad \tau \equiv \frac{4\pi a k_B T}{\hbar c}, \tag{5.154}
\]
where $\tilde{\xi}_l = \tau l$ and
\[
f_{ss}(l\tau, y) = f_{ss}^{(1)}(l\tau, y) + f_{ss}^{(2)}(l\tau, y) \equiv \left(r_1^{-2}e^{y} - 1\right)^{-1} + \left(r_2^{-2}e^{y} - 1\right)^{-1} \tag{5.155}
\]
is an even function of $l$. Then the quantity $c_{ss}(\alpha)$ from Eq. (5.152) is given by
\[
c_{ss}(\alpha) = -\frac{k_B T}{16\pi^2 a^3}\int_0^{\infty} dx\, \cos\alpha x \int_{x\tau}^{\infty} y^2\, dy\, f_{ss}(x\tau, y). \tag{5.156}
\]
Using Eqs. (5.149), (5.153) and (5.156) one finally obtains the new representation of the Lifshitz formula
\[
F_{ss}^{T}(a) = \sum_{l=-\infty}^{\infty} b_{ss}(l) = -\frac{\hbar c}{16\pi^2 a^4}\,{\sum_{l=0}^{\infty}}{}'\int_0^{\infty} d\tilde{\xi}\, \cos\!\left(l\tilde{\xi}\frac{T_{\mathrm{eff}}}{T}\right)\int_{\tilde{\xi}}^{\infty} y^2\, dy\, f_{ss}(\tilde{\xi}, y), \tag{5.157}
\]
where the continuous variable $\tilde{\xi} = \tau x$ and the prime on the sum indicates that the $l = 0$ term is taken with the weight 1/2. Note that in the representation (5.157) the $l = 0$ term gives the force at zero temperature. As is shown in [250], for real metals Eq. (5.157) can be transformed to the form
\[
F_{ss}^{T}(a) = -\frac{k_B T}{16\pi a^3}\left\{\int_0^{\infty} y^2\, dy\left[f_{ss}^{(1)}(0, y) + f_{ss}^{(2)}(y, y)\right] + 2\sum_{n=1}^{\infty}\int_{\tilde{\xi}_n}^{\infty} y^2\, dy\, f_{ss}(\tilde{\xi}_n, y)\right\}. \tag{5.158}
\]
Here all terms with $n \ge 1$ coincide with the corresponding contributions to (5.149). The zeroth term of (5.149) is modified by the prescription generalizing the recipe used in [214] for the ideal metal (note that in the case of the plasma dielectric function Eq. (5.158) is exactly equivalent to (5.149)). The representation (5.158) of the Lifshitz formula is free from the above disadvantages and can be applied to calculate the temperature Casimir force between real metals at all separations [250].
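The normalization of the Poisson summation formula in Eqs. (5.152) and (5.153), with the factor $1/2\pi$ in the transform and $2\pi$ in front of the transformed sum, is easy to get wrong. As a quick sanity check, the sketch below applies it to a Gaussian test function $b(x) = \exp(-x^2/2s^2)$, which is an illustrative choice and not the function $b_{ss}$ of Eq. (5.154). Its transform in this convention is $c(\alpha) = (s/\sqrt{2\pi})\exp(-s^2\alpha^2/2)$.

```python
import math


def b(x, s):
    """Gaussian test function (illustrative, not b_ss of the text)."""
    return math.exp(-x * x / (2.0 * s * s))


def c(alpha, s):
    """Fourier transform of b in the convention of Eq. (5.152)."""
    return s / math.sqrt(2.0 * math.pi) * math.exp(-((s * alpha) ** 2) / 2.0)


def direct_sum(s, n=50):
    """Left-hand side of Eq. (5.153), truncated at |l| <= n."""
    return sum(b(l, s) for l in range(-n, n + 1))


def transformed_sum(s, n=50):
    """Right-hand side of Eq. (5.153), truncated at |l| <= n."""
    return 2.0 * math.pi * sum(c(2.0 * math.pi * l, s) for l in range(-n, n + 1))


if __name__ == "__main__":
    for s in (0.3, 0.7, 3.0):
        print(s, direct_sum(s), transformed_sum(s))
```

For a narrow Gaussian the direct sum converges after a few terms and the transformed sum needs many, and the reverse holds for a wide Gaussian. The Lifshitz formula uses the same trade-off: the discrete Matsubara sum (5.149) converges fast at high temperature, while the representation (5.157) isolates the zero temperature force in its $l = 0$ term.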



Here we present some analytical results which can be obtained in the framework of the plasma model. To obtain the perturbation expansion of Eq. (5.157) in terms of the small parameter of the plasma model $\delta_0/a$ (see Section 5.2.1), it is useful to change the order of integration and then rewrite it in terms of the new variable $v \equiv \tilde{\xi}/y$ instead of $\tilde{\xi}$:
\[
F_{ss}^{T}(a) = -\frac{\hbar c}{16\pi^2 a^4}\,{\sum_{l=0}^{\infty}}{}'\int_0^{\infty} y^3\, dy\int_0^1 dv\, \cos\!\left(vyl\frac{T_{\mathrm{eff}}}{T}\right) f_{ss}(v, y). \tag{5.159}
\]
Expanding the quantity $f_{ss}$ defined in (5.150) and (5.155) up to the first order in powers of $\delta_0/a$ one obtains
\[
f_{ss}(v, y) = \frac{2}{e^{y} - 1} - 2\frac{ye^{y}}{(e^{y} - 1)^2}(1 + v^2)\frac{\delta_0}{a}.
\]
Substituting this into Eq. (5.159) we come to the Casimir force including the effect of both the nonzero temperature and finite conductivity (to underline this we have added the index $C$)
\[
F_{ss}^{T,C}(a) = F_{ss}^{(0)}(a)\Biggl\{1 + \frac{30}{\pi^4}\sum_{n=1}^{\infty}\left[\frac{1}{t_n^4} - \frac{\pi^3\cosh(\pi t_n)}{t_n\sinh^3(\pi t_n)}\right] - \frac{16}{3}\frac{\delta_0}{a}
- 60\frac{\delta_0}{a}\sum_{n=1}^{\infty}\left[\frac{2\cosh^2(\pi t_n) + 1}{\sinh^4(\pi t_n)} - \frac{2\cosh(\pi t_n)}{\pi t_n\sinh^3(\pi t_n)} - \frac{1}{2\pi^2 t_n^2\sinh^2(\pi t_n)} - \frac{\coth(\pi t_n)}{2\pi^3 t_n^3}\right]\Biggr\}, \tag{5.160}
\]
where $t_n \equiv nT_{\mathrm{eff}}/T$. The first summation in (5.160) is exactly the temperature correction in the case of ideal metals (see Eq. (5.24)). The second summation takes into account the effect of finite conductivity. In the limit of low temperatures $T \ll T_{\mathrm{eff}}$ one has from (5.160), up to exponentially small corrections [38],
\[
F_{ss}^{T,C}(a) \approx F_{ss}^{(0)}(a)\left\{1 + \frac{1}{3}\left(\frac{T}{T_{\mathrm{eff}}}\right)^4 - \frac{16}{3}\frac{\delta_0}{a}\left[1 - \frac{45\zeta_R(3)}{8\pi^3}\left(\frac{T}{T_{\mathrm{eff}}}\right)^3\right]\right\}. \tag{5.161}
\]
For $\delta_0 = 0$ (perfect conductor) Eq. (5.161) turns into Eq. (5.25), and for $T = 0$ the result (5.60) is reproduced. Note that the first correction mixing finite conductivity and finite temperature is of order $(T/T_{\mathrm{eff}})^3$. More significantly, note that there are no temperature corrections of orders $(T/T_{\mathrm{eff}})^k$ with $k \le 4$ in the higher order conductivity correction terms $(\delta_0/a)^i$ from the second up to the sixth order [38]. In the limit of high temperatures $T \gg T_{\mathrm{eff}}$ Eq. (5.160) leads to
\[
F_{ss}^{T,C}(a) \approx -\frac{k_B T}{4\pi a^3}\zeta_R(3)\left(1 - 3\frac{\delta_0}{a}\right) \tag{5.162}
\]
up to exponentially small corrections. For $\delta_0 = 0$ one obtains from (5.162) the known result (5.26) for perfect conductors. Finite conductivity corrections of higher orders are not essential at large separations.

In the next subsection the combined effect of finite conductivity and nonzero temperature will be considered in the configuration of a spherical lens (sphere) above a disk. There the results of



numerical computation are presented which make it possible to observe the smooth transition between the asymptotics of type (5.161) and (5.162). The role of relaxation, which is taken into account by the Drude model, is also discussed there.

5.4.3. Conductivity and temperature: lens (sphere) above a disk

Along the lines of the previous section the combined effect of nonzero temperature and finite conductivity can also be found in the configuration of a sphere (or spherical lens) above a disk. Here one should start from the temperature Casimir force of Eqs. (5.27) and (5.28) acting in the above-mentioned configurations. In terms of the reflection coefficients introduced in Eq. (5.147) this force can be represented as
\[
F_{d\ell}^{T}(a) = \frac{k_B T R}{2}\sum_{n=-\infty}^{\infty}\int_0^{\infty} k_\perp\, dk_\perp \left\{\ln\left[1 - r_1^2(\xi_n, k_\perp)e^{-2aq_n}\right] + \ln\left[1 - r_2^2(\xi_n, k_\perp)e^{-2aq_n}\right]\right\}. \tag{5.163}
\]

Introducing the dimensionless variables $\tilde{\xi}_n = 2a\xi_n/c$ and $y = 2aq_n$ we rewrite Eq. (5.163) in the form
\[
F_{d\ell}^{T}(a) = \frac{k_B T R}{8a^2}\sum_{n=-\infty}^{\infty}\int_{|\tilde{\xi}_n|}^{\infty} y\, dy \left\{\ln\left[1 - r_1^2(\tilde{\xi}_n, y)e^{-y}\right] + \ln\left[1 - r_2^2(\tilde{\xi}_n, y)e^{-y}\right]\right\}. \tag{5.164}
\]
The term of Eq. (5.164) with $n = 0$ suffers exactly the same ambiguity as the zeroth term of Eq. (5.149) for two plane parallel plates if one uses the Drude model to describe the dependence of the dielectric permittivity on frequency. To eliminate this ambiguity we rewrite Eq. (5.164) in a form analogous to (5.157). This is achieved by the Poisson summation formula of Eqs. (5.152) and (5.153). Repeating the same transformations as in Section 5.4.2, one obtains the final result
\[
F_{d\ell}^{T}(a) = \frac{\hbar c R}{8\pi a^3}\,{\sum_{n=0}^{\infty}}{}'\int_0^{\infty} d\tilde{\xi}\, \cos\!\left(n\tilde{\xi}\frac{T_{\mathrm{eff}}}{T}\right)\int_{\tilde{\xi}}^{\infty} y\, dy\, f_{d\ell}(\tilde{\xi}, y), \tag{5.165}
\]
where the prime indicates that the $n = 0$ term is taken with the weight 1/2 and
\[
f_{d\ell}(\tilde{\xi}, y) = f_{d\ell}^{(1)}(\tilde{\xi}, y) + f_{d\ell}^{(2)}(\tilde{\xi}, y) \equiv \ln\left(1 - r_1^2 e^{-y}\right) + \ln\left(1 - r_2^2 e^{-y}\right). \tag{5.166}
\]
Evidently, the zeroth term of Eq. (5.165) gives the Casimir force at zero temperature. The terms of (5.165) with $n \ge 1$ represent the temperature correction. Following the prescription formulated in [250] for real metals, Eq. (5.165) can be represented in a form analogous to (5.158):
\[
F_{d\ell}^{T}(a) = \frac{k_B T R}{8a^2}\left\{\int_0^{\infty} y\, dy\left[f_{d\ell}^{(1)}(0, y) + f_{d\ell}^{(2)}(y, y)\right] + 2\sum_{n=1}^{\infty}\int_{\tilde{\xi}_n}^{\infty} y\, dy\, f_{d\ell}(\tilde{\xi}_n, y)\right\}. \tag{5.167}
\]
In the case of the plasma dielectric function Eqs. (5.165) and (5.167) are equivalent. In the case of the Drude model Eq. (5.167) avoids the difficulties connected with the zeroth term of (5.164) [250].



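Let us first calculate the temperature Casimir force (5.165) in the framework of the plasma model. Changing the order of integration and introducing the new variable $v = \tilde{\xi}/y$ instead of $\tilde{\xi}$ one obtains
\[
F_{d\ell}^{T}(a) = \frac{\hbar c R}{8\pi a^3}\,{\sum_{n=0}^{\infty}}{}'\int_0^{\infty} y^2\, dy\int_0^1 dv\, \cos\!\left(nvy\frac{T_{\mathrm{eff}}}{T}\right) f_{d\ell}(v, y). \tag{5.168}
\]
The expansion of $f_{d\ell}$ up to first order in the small parameter $\delta_0/a$ is
\[
f_{d\ell}(v, y) = 2\ln\left(1 - e^{-y}\right) + 2\frac{y}{e^{y} - 1}(1 + v^2)\frac{\delta_0}{a}.
\]
Substitution of this into Eq. (5.168) leads to the result
\[
F_{d\ell}^{T,C}(a) = F_{d\ell}^{(0)}(a)\Biggl\{1 + \frac{45}{\pi^3}\sum_{n=1}^{\infty}\left[\frac{\coth(\pi t_n)}{t_n^3} + \frac{\pi}{t_n^2\sinh^2(\pi t_n)}\right] - \frac{1}{t_1^4} - 4\frac{\delta_0}{a}
+ \frac{180}{\pi^4}\frac{\delta_0}{a}\sum_{n=1}^{\infty}\left[\frac{\pi\coth(\pi t_n)}{2t_n^3} - \frac{2}{t_n^4} + \frac{\pi^3\cosh(\pi t_n)}{t_n\sinh^3(\pi t_n)} + \frac{\pi^2}{2t_n^2\sinh^2(\pi t_n)}\right]\Biggr\}. \tag{5.169}
\]
Recall that $t_n \equiv nT_{\mathrm{eff}}/T$. In the case of low temperatures $T \ll T_{\mathrm{eff}}$ [38]
\[
F_{d\ell}^{T,C}(a) \approx F_{d\ell}^{(0)}(a)\left\{1 + \frac{45\zeta_R(3)}{\pi^3}\left(\frac{T}{T_{\mathrm{eff}}}\right)^3 - \left(\frac{T}{T_{\mathrm{eff}}}\right)^4 - 4\frac{\delta_0}{a}\left[1 - \frac{45\zeta_R(3)}{2\pi^3}\left(\frac{T}{T_{\mathrm{eff}}}\right)^3 + \left(\frac{T}{T_{\mathrm{eff}}}\right)^4\right]\right\}. \tag{5.170}
\]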
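The sums in Eq. (5.169) are straightforward to evaluate numerically, which gives a way to see how the low temperature form (5.170) and the high temperature behaviour emerge from one expression. The sketch below is a minimal illustration under stated assumptions: `x` stands for $T/T_{\mathrm{eff}}$ and `d` for $\delta_0/a$ (both names are chosen here), the sums are truncated after `n_terms` terms, and the hyperbolic functions are written through decaying exponentials so that large arguments do not overflow. The high temperature reference value $45\zeta_R(3)x(1 - 2d)/\pi^3$ follows from dividing Eq. (5.171) by the ideal zero temperature force and using $k_B T_{\mathrm{eff}} = \hbar c/(2a)$.

```python
import math

ZETA3 = 1.2020569031595943  # Riemann zeta function at 3


def inv_sinh2(u):
    """1 / sinh(u)**2 written with decaying exponentials (no overflow)."""
    e = math.exp(-2.0 * u)
    return 4.0 * e / (1.0 - e) ** 2


def coth(u):
    return 1.0 / math.tanh(u)


def factor_full(x, d, n_terms=20000):
    """Expression in braces of Eq. (5.169); x = T/T_eff, d = delta_0/a."""
    s0 = 0.0  # temperature sum for the ideal metal
    s1 = 0.0  # sum mixing temperature and finite conductivity
    for n in range(1, n_terms + 1):
        t = n / x
        u = math.pi * t
        ct, ish2 = coth(u), inv_sinh2(u)
        s0 += ct / t ** 3 + math.pi * ish2 / t ** 2
        s1 += (math.pi * ct / (2.0 * t ** 3) - 2.0 / t ** 4
               + math.pi ** 3 * ct * ish2 / t      # cosh/sinh^3 = coth/sinh^2
               + math.pi ** 2 * ish2 / (2.0 * t ** 2))
    # 1 / t_1**4 equals x**4 because t_1 = 1/x
    return (1.0 + 45.0 / math.pi ** 3 * s0 - x ** 4 - 4.0 * d
            + 180.0 / math.pi ** 4 * d * s1)


def factor_low(x, d):
    """Expression in braces of Eq. (5.170), valid for T << T_eff."""
    a3 = 45.0 * ZETA3 / math.pi ** 3
    return 1.0 + a3 * x ** 3 - x ** 4 - 4.0 * d * (1.0 - 0.5 * a3 * x ** 3 + x ** 4)


def factor_high(x, d):
    """High temperature limit in the same normalization, for T >> T_eff."""
    return 45.0 * ZETA3 / math.pi ** 3 * x * (1.0 - 2.0 * d)


if __name__ == "__main__":
    for x in (0.1, 0.5, 1.0, 2.0, 5.0):
        print(x, factor_full(x, 0.01), factor_low(x, 0.01), factor_high(x, 0.01))
```

Scanning `x` between the two regimes shows the smooth transition that the perturbative and asymptotic formulas bracket. The truncation at `n_terms` limits the accuracy to roughly $x^3/n_{\mathrm{terms}}^2$, which is an implementation detail of this sketch rather than a property of Eq. (5.169).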

For the ideal metal $\delta_0 = 0$ and Eq. (5.170) coincides with Eq. (5.31). At zero temperature the first perturbation order of (5.65) is reobtained. Once more the perturbation orders $(\delta_0/a)^i$ with $2 \le i \le 6$ do not contain temperature corrections of orders $(T/T_{\mathrm{eff}})^3$ and $(T/T_{\mathrm{eff}})^4$ or lower ones [38]. At high temperature $T \gg T_{\mathrm{eff}}$
\[
F_{d\ell}^{T,C}(a) \approx -\frac{\zeta_R(3)}{4a^2}Rk_B T\left(1 - 2\frac{\delta_0}{a}\right). \tag{5.171}
\]
For $\delta_0 = 0$ one reobtains Eq. (5.32).

In Ref. [38] the Casimir force was computed numerically by Eq. (5.168) in the framework of the plasma model with the plasma frequency $\omega_p = 1.92 \times 10^{16}$ rad/s (as for aluminum), $T = 300$ K, and $R = 100\,\mu$m (see the solid line in Fig. 13). The dotted line represents the perturbative results for low temperatures. The dashed line shows the Casimir force at zero temperature (but taking the finite conductivity into account). It is seen from the figure that perturbation theory works well within the range $0.1\,\mu\mathrm{m} \le a \le 3.5\,\mu$m (note that six perturbative orders in finite conductivity were used at small separations). Starting from $a = 6\,\mu$m the solid line represents the asymptotic at high temperatures.

In the Drude model representation of the conductivity, the computations of the Casimir force (5.167) were performed in Ref. [250]. In addition to the above-mentioned parameters the value



Fig. 13. The nonzero temperature Casimir force as a function of the surface separation in the configuration of a sphere above a disk. The solid line represents the result of numerical computations, the dotted line is calculated by the perturbative theory, the dashed line is the zero temperature result.

of the relaxation frequency $\gamma = 9.6 \times 10^{13}$ rad/s was used. At the separation $a = 8\,\mu$m (which corresponds to the high temperature asymptotic) the obtained value is $F_{d\ell}^{T,C} \approx 1.9303 \times 10^{-15}$ N. This is approximately 0.78% lower than the asymptotic limiting value for a perfect metal at large separations, $1.9454 \times 10^{-15}$ N. The same force computed by the plasma model would be equal to $1.9378 \times 10^{-15}$ N, which is only 0.39% lower than in the case of a perfect metal. Thus, there is a factor of 2 difference between the finite conductivity corrections to the high temperature Casimir force obtained by the two models. It is apparent that at large separations (where the low frequencies make important contributions) it is the Drude model which gives the correct result for the finite conductivity correction. Note that this correction itself is rather small at large separations.

5.4.4. Combined effect of roughness, conductivity and temperature

As is seen from the previous subsection, the combined effect of conductivity and temperature is very nontrivial. It essentially depends on the behavior of the reflection coefficients at small frequencies. If this effect is taken into account correctly, the temperature effect for a real metal of finite conductivity is directly analogous to the case of the ideal metal. This means that the temperature corrections, which at room temperature are rather small at the space separations



$a < 1\,\mu$m, increase monotonically with distance, and dominate the zero temperature Casimir force starting at a separation distance of several micrometers. Note that in the asymptotic regime of large separations $a > 10\,\mu$m the finite conductivity corrections are even smaller than the temperature corrections at small separations $a \sim 1\,\mu$m.

According to the results of Section 5.3, roughness corrections can make important contributions at $a \approx 1\,\mu$m, while for larger separations their contribution decreases rapidly. The most important contribution of surface roughness occurs in the separation range $a < 1\,\mu$m. In this transition region to the van der Waals forces (as noted in Section 5.4.1) the retarded interatomic potential is not applicable. The combined effect of surface roughness and finite conductivity (which are the two influential factors in this range) can be found, however, by the geometrical averaging of Eq. (5.145), which accounts for roughness through the space separation entering the Casimir force law for real metallic surfaces of finite conductivity.

Exactly the same geometrical approach can be applied to find the effect of roughness in addition to the combined effect of finite conductivity and nonzero temperature. All one has to do is to replace the Casimir force including only the effect of finite conductivity in Eq. (5.145) with the Casimir force calculated at both nonzero temperature and finite conductivity (see Sections 5.4.2 and 5.4.3). The result is
\[
F_{ss}^{R,T,C}(a_0) = \frac{1}{L^2}\int_{-L}^{L} dx \int_{-L}^{L} dy\, F_{ss}^{T,C}\bigl(a(x, y)\bigr). \tag{5.172}
\]
Here the distance $a(x, y)$ between the interacting surfaces takes roughness into account in accordance with Eqs. (5.141) and (5.144). Eq. (5.172) can be used to calculate the combined effect of surface roughness, nonzero temperature and finite conductivity of the boundary metal in the experiments on the Casimir force (see Section 6).
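The large separation numbers quoted in Section 5.4.3 for the sphere above a disk can be roughly reproduced from the high temperature asymptotic (5.171). The sketch below is an illustration under assumptions that are not spelled out in this section: the plasma model parameter is taken as $\delta_0 = c/\omega_p$, the physical constants are the SI values hard-coded in the block, and the function returns the magnitude of the force. With $R = 100\,\mu$m, $T = 300$ K and $a = 8\,\mu$m it gives a perfect metal value close to the quoted $1.9454 \times 10^{-15}$ N and a plasma model reduction of about 0.39%.

```python
K_B = 1.380649e-23          # Boltzmann constant, J/K
C_LIGHT = 299792458.0       # speed of light, m/s
ZETA3 = 1.2020569031595943  # Riemann zeta function at 3


def high_temperature_force(a, radius, temperature, omega_p=None):
    """Magnitude of the asymptotic force of Eq. (5.171), in newtons.

    a, radius   : separation and sphere radius in metres
    temperature : temperature in kelvin
    omega_p     : plasma frequency in rad/s, or None for a perfect metal
    """
    force = ZETA3 * radius * K_B * temperature / (4.0 * a ** 2)
    if omega_p is not None:
        delta0 = C_LIGHT / omega_p          # assumption: delta_0 = c / omega_p
        force *= 1.0 - 2.0 * delta0 / a     # first-order plasma correction
    return force


def relative_reduction(a, omega_p):
    """Fractional decrease of the force caused by the plasma correction."""
    return 2.0 * C_LIGHT / (omega_p * a)


if __name__ == "__main__":
    a, radius, temperature = 8e-6, 100e-6, 300.0
    print("perfect metal:", high_temperature_force(a, radius, temperature))
    print("plasma model :", high_temperature_force(a, radius, temperature, 1.92e16))
    print("reduction    :", relative_reduction(a, 1.92e16))
```

The Drude model value quoted in the text cannot be obtained from this closed form, since it requires the numerical evaluation of Eq. (5.167).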
It is necessary to stress, however, that up to the present none of the experiments has the precision necessary to measure the temperature contribution to the Casimir force. The reason is that the relative error of force measurements increases quickly with increasing separation and attains hundreds of percent at the separations where the temperature effects could be noticeable (see the next section). Because of this, up to the present only the finite conductivity and roughness corrections to the Casimir force have been measured. The measurement of the temperature corrections is a problem to be solved in the future.

6. Measurements of the Casimir force

In this section we review the experimental developments in the measurements of the Casimir force. Given that there have been only a few attempts at the measurement, a brief review of the older measurements will also be provided. It should be noted that the older measurements set the benchmark and the basis for improvement. Given two parallel plates of area $S$ and infinite conductivity, separated by a distance $a$, the Casimir force is given by Eq. (1.3). The force is a strong function of $a$ and is measurable only for $a \sim 1\,\mu$m. This force is on the order of $10^{-7}$ N for flat surfaces of 1 cm$^2$ area at a separation distance of 1 μm. Given the small value of the force for experimentally accessible surface areas, the force sensitivity of the experimental technique has been one of the most severe limitations on the accuracy of the



various measurements. Also, given that the force has a very strong dependence on the separation distance, an accurate determination of the surface separation is necessary for a good comparison with the theory.

6.1. General requirements for the Casimir force measurements

The first experiments dealing with the measurement of the Casimir force were done by Sparnaay [20,254]. The experimental technique, based on a spring balance and parallel plates, served to set the benchmarks. These experiments also clarified the problems associated with other Casimir force measurements. From the instrumental standpoint there are clear requirements, such as an extremely high force sensitivity and the capability to reproducibly measure the separation between the two surfaces. Other than these, there are clear material requirements necessary for a good measurement of the Casimir force. These fundamental requirements, as spelled out by Sparnaay [20,254], are:
(1) clean plate surfaces completely free of chemical impurities and dust particles;
(2) precise and reproducible measurement of the separation between the two surfaces, in particular a measurement of the average distance on contact of the two surfaces, which is nonzero due to the roughness of the metal surfaces and the presence of dust;
(3) low electrostatic charges on the surface and low potential differences between the surfaces. Note that there can exist a large potential difference between clean and grounded metallic surfaces due to the work-function differences of the materials used and the cables used to ground the metal surfaces. Thus an independent measurement of the systematic error due to the residual electrostatic force is absolutely necessary.

Each of the above instrumental or material requirements is difficult to meet in practice, and it is certainly very difficult to meet them all together. They have bedeviled this field because at least one of them was neglected in the force measurements.
Regarding the material requirements, as pointed out by Sparnaay [20], requirement 1 was ignored in experiments with glass and quartz surfaces [159,160,255–257], where surface reactions with moisture and silicone oil from the vacuum apparatus lead to the formation of a “gel layer” [20] on the surface. Sparnaay expects this gel layer to completely modify the forces for surface separation distances less than 1.5 μm. The last two requirements are particularly difficult to meet in the case of nonconductive surfaces such as glass, quartz [159,160,255] or mica [258–261]. Yet all these early measurements possibly neglected the systematic correction due to the electrostatic force in their experiments. Many other requirements, such as the exact surface separation distance and the role of surface roughness, were neglected in all but the most recent experiments. Some experiments have tried to use an ionized environment [262] to neutralize the static charges but reported additional electrostatic effects. Also, all early measurements took the surface separation on contact to be zero. This can be a significant error for large flat surfaces, or alternatively surfaces with a large radius, as the inevitable presence of obstacles prevents close contact of the two surfaces. As stated in [20], this is also true for some experiments with Pt metallic wires where the point of contact was assumed to be at zero separation distance [255]. Thus independent checks of the surface separation are necessary for a correct analysis of the data.



Of the earlier experiments with metallic surfaces only two meet at least some of the stringent criteria set forth by Sparnaay as necessary for rigorous measurements of the Casimir force. The first one is by Sparnaay [20]. The second is by van Blokland and Overbeek [257]. It should be mentioned that both experiments are a culmination of many years of improvements, references for which are provided in the respective publications. Some of the other significant older measurements will be briefly discussed: those on metal and dielectric surfaces by Derjaguin [159,160,255] (it should be noted that Derjaguin et al. were the first to use curved bodies, which overcame the need to align parallel surfaces), those with dielectric bodies by Tabor, Winterton and Israelachvili [258–261], the dynamical measurements of Hunklinger et al. [263,264], and those using micromechanical tunnelling transducers by Onofrio et al. [265,266]. The experiments on van der Waals forces with liquid He films [267] are outside the confines of this review on retarded forces and accordingly will not be dealt with in detail. Mention should also be made of the identification of van der Waals forces with micron-sized polystyrene spheres, using a tapping mode Atomic Force Microscope [268]. This work is also outside the limits of this review. Finally, the more recent experiments using the torsion pendulum by Lamoreaux [40] and those using the Atomic Force Microscope by Mohideen et al. [41,43,44] will be reviewed. In each case the experimental technique and a discussion of the results will be provided.

As discussed in Section 4, the boundary dependence of the Casimir effect is one of its most intriguing properties. For example, the Casimir force is a strong function of geometry, and that between two halves of thin metal spherical shells is repulsive. For two halves of a box it can be repulsive or attractive depending on the height to base ratio.
The sign and value of the Casimir force become even more interesting for complex topologies such as that of a torus. However, given the difficulty of making unambiguous measurements, there has been only one attempt at demonstrating the nontrivial boundary dependence of the Casimir force [42], for the case of a corrugated plate. This experiment will also be reviewed.

6.2. Primary achievements of the older measurements

Here we briefly discuss the main experiments on the Casimir force measurements which were performed up to the year 1997, when the modern stage in this field of research started. In all cases, special mention is made of the necessary requirements for a good measurement that are met by the experiment under consideration. In many of them dielectric test bodies were used, although in several experiments the Casimir force between metallic surfaces was also measured.

6.2.1. Experiments with parallel plates by Sparnaay

Sparnaay [20] attempted to measure the Casimir force between two flat metal plates. A force balance based on a spring balance was used in the final series of measurements. The sensitivity of the spring balance was between $(0.1\text{–}1) \times 10^{-3}$ dyn. The extension of the spring was measured through a measurement of the capacitance formed by the two flat plates. Calibration of this capacitance was done with the help of tungsten and platinum wires (uncertainties in this calibration are not reported). Care was taken with vibration isolation. The author reports that the knife-edges and the springs used led to large hysteresis, which made the determination of the surface separation distance difficult. This was reported to be the most severe drawback of



the measurement technique. The plates were mounted such that they were electrically insulated from the rest of the apparatus. Sparnaay realized that even a small potential difference of 17 mV between the two parallel plates was sufficient to overwhelm the Casimir force. To take care of any potential differences between the surfaces, the two plates were brought into contact at the start of the experiment. Three sets of metal plates, Al–Al, chromium–chromium and chromium–steel, were used in the measurements. Even with a variety of electrical and mechanical cleaning procedures, dust particles larger than 2–3 μm were observed on the plates. The plates were aligned parallel by visual inspection, with about a 10% variation in the interplate distance from one end of the plate to the other. Because of the presence of the dust particles it is estimated that even on contact the plates are separated by 0.2 μm (the procedure used to determine this was not provided). The chromium–steel and the chromium plates both led to attractive forces between them, whereas the aluminum plates led to a repulsive force. The peculiar repulsive force noticed in the case of the aluminum plates was thought to be due to the presence of impurities on the aluminum surface. In the case of the attractive force for the chromium and chromium–steel plates, given the uncertainties in the measurement of the interplate distance, only a general agreement with the Casimir force formula (here perfectly reflecting boundaries are assumed) could be achieved. Barring the repulsive forces measured with aluminum plates, the necessary improvements other than the force sensitivity are:
(1) more accurate measurement of the surface separation;
(2) more accurate measurement of the parallelism between the two surfaces (angles of less than $10^{-4}$ radians between plates of 1 cm$^2$ area are necessary);
(3) measurement of any residual electrostatic potential differences between the two surfaces given the presence of the dust particles.
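To put the 17 mV figure in perspective, one can compare the ideal Casimir pressure between perfectly conducting parallel plates, $\pi^2\hbar c/(240a^4)$, with the electrostatic pressure of a parallel-plate capacitor, $\varepsilon_0 V^2/(2a^2)$. The sketch below is only an order-of-magnitude illustration: both formulas are the textbook idealized expressions (no roughness, finite conductivity or edge effects), the constants are hard-coded SI values, and the function names are chosen here. The crossover separation at which the two pressures are equal follows from setting them equal, $a^2 = \pi^2\hbar c/(120\,\varepsilon_0 V^2)$.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
C_LIGHT = 299792458.0    # speed of light, m/s
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m


def casimir_pressure(a):
    """Ideal Casimir pressure (magnitude, Pa) between perfect mirrors."""
    return math.pi ** 2 * HBAR * C_LIGHT / (240.0 * a ** 4)


def electrostatic_pressure(a, voltage):
    """Attractive pressure (Pa) between capacitor plates at a potential difference."""
    return EPS0 * voltage ** 2 / (2.0 * a ** 2)


def crossover_separation(voltage):
    """Separation (m) above which the electrostatic pressure exceeds the Casimir one."""
    return math.sqrt(math.pi ** 2 * HBAR * C_LIGHT / (120.0 * EPS0 * voltage ** 2))


if __name__ == "__main__":
    area = 1e-4  # 1 cm^2 in m^2
    print("Casimir force at 1 um, 1 cm^2 (N):", casimir_pressure(1e-6) * area)
    print("electrostatic force, 17 mV (N)   :", electrostatic_pressure(1e-6, 0.017) * area)
    print("crossover separation (m)         :", crossover_separation(0.017))
```

Because the electrostatic pressure falls off only as $a^{-2}$ while the Casimir pressure falls as $a^{-4}$, a residual potential of this size matters most at the larger separations of a measurement, which is why an independent determination of the residual potential is listed among the requirements.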
In conclusion, these sets of measurements were the first indication of an attractive Casimir force between metallic surfaces, approximately in line with the expectations. (Note that the aluminum plates showed a repulsive force and therefore the attractive force was not conclusive.) Most importantly, from these experiments Sparnaay clearly elucidated the problems that needed to be overcome for a rigorous and conclusive measurement of the Casimir force.

6.2.2. Experiments by Derjaguin et al.

One of the major improvements pioneered by the group of Derjaguin et al. [159] was the use of curved surfaces to avoid the need to maintain two flat plates perfectly parallel. This was accomplished by replacing one or both of the plates by a curved surface such as a lens, sphere or cylinder. The first use of this technique was to measure the force between a silica lens and a plate [159,160,269–271]. Sparnaay [20] points out that this work did not take into account the presence of the “gel layer” which is usually present on such surfaces. Also, the possible substantial electrostatic forces, which would result in systematic errors, are not reported in the experiment. These experiments will not be further discussed here.

There was also related work with metallic surfaces by Derjaguin et al. [160,255]. In the case of metallic surfaces the forces between platinum fibers and gold beads were measured. The force measurement was done by keeping one surface fixed and attaching the other surface to



the coil of a galvanometer. The rotation of the galvanometer coil in response to the force led to the deflection of a light beam which was reflected off mirrors attached to the galvanometer coil. This deflected light beam was detected through a resistance bridge, two of whose elements could be photoactivated. The measured forces indicated an unretarded van der Waals force for distances below 50–80 nm and a retarded force region for larger distances. However, more accurate modern theoretical results [36] predict an unretarded force below distances of 2 nm in the case of gold. Derjaguin et al. report a discrepancy in the force measurements of around 60% [255]. Also, any possible electrostatic forces due to the potential differences between the two surfaces appear to have been neglected. While the distance on contact of the two surfaces appears to have been taken as the zero distance (ignoring the role of surface roughness), mention is made that surface roughness might have affected the experimental measurements and made the comparison with theory very difficult, particularly for distances less than 30 nm.

6.2.3. Experiments of Tabor, Winterton and Israelachvili using mica cylinders

In the intervening years between the experiments of Sparnaay and those of van Blokland and Overbeek using metallic surfaces, there were many force measurements on nonconductive surfaces. Of these, the experiments on muscovite mica [258–261] will be discussed here. The major improvement in these experiments was the use of atomically smooth surfaces from cleaved muscovite mica. This provides the possibility of a very close approach of the two surfaces. As a result it was possible to measure the transition region between retarded and nonretarded van der Waals forces in those particular materials. Cylindrical surfaces of radii between 0.4 and 2 cm, obtained by wrapping the mica sheets on glass cylinders, were used to measure the force.
The procedure used in making the mica cylinder led to 50% uncertainties in its radius. A spring-type balance based on the jump method was used (modifications to this were made for large separations). Here the force of attachment of one of the cylinders to an extended spring is overcome by the attractive force from the opposite cylindrical surface. By using springs of different extensions and different spring constants, a variety of distances could be measured. Multiple beam interferometry was used for the measurement of the surface separation with a reported resolution of 0.3 nm. In what appears to be the final work in this regard [259], a sharp transition from the retarded to the nonretarded van der Waals force was found at 12 nm (earlier work had measured a transition at larger surface separations [258]). Later reanalysis of the data with more precise spectral properties for the mica revealed that the data could be reconciled with the theory only if errors of at least 30% in the radius of curvature of the mica cylinders were introduced [261]. The possibility of changes in the spectral properties of the mica surface used to make the cylinders was also mentioned to explain the discrepancy. The separation on contact of the two surfaces was assumed to be zero, i.e., the surfaces were assumed to be completely free of dust, impurities and any atomic steps on the cleaved surface. Additionally, as mica is a nonconductor which can easily accumulate static charges, the role of electrostatic forces between the cylinders is hard to estimate.

6.2.4. Experiments of van Blokland and Overbeek

The next major set of improved experiments with metallic surfaces was performed by van Blokland and Overbeek [257]. Here many of the improvements achieved with dielectric surfaces were incorporated. Care was also taken to address many of the concerns raised by Sparnaay that were listed in the introductory notes. (Earlier measurements in the group [256] with dielectric surfaces did not report on the effects of chemical purity of the surface and the role of the electrostatic forces between the surfaces.) The final improved version of the experiment using metallic surfaces was done by van Blokland and Overbeek [257]. The experiment was done using a spring balance. The force was measured between a lens and a flat plate coated with either 100 ± 5 or 50 ± 5 nm of chromium. The chromium surface was expected to be covered with 1–2 nm of surface oxide. Water vapor was used to reduce the surface charges. This use of water vapor might have further affected the chemical purity of the metal surface. At the outset the authors recognize the outstanding problems in the Casimir force measurements as: (1) the potential difference between the two surfaces leads to electrostatic forces which complicate the measurement; (2) the exact determination of the separation distance between the two surfaces is required; (3) the exact determination of the nonzero surface separation on contact of the two surfaces should be performed. The authors then try to address these problems. The first was addressed by two methods: by looking for a minimum in the Casimir force as a function of the applied voltage, and by measuring the potential difference from the intersection point in the electrostatic force with application of positive and negative voltages. The two methods yielded approximately consistent values for the potential difference of between 19 and 20 mV. The electrostatic force due to this large potential difference was equal to the Casimir force around 400 nm surface separation. Thus, to measure the Casimir force the experiment had to be carried out with a compensating voltage present at all times.
The separation distance between the two surfaces was measured through a measurement of the lens–plate capacitance using a Schering bridge. This capacitance method is applicable for relative determination of distances, as cable and stray capacitances are of the same order as that between the spherical surface and the plate. Additional problems such as the tilt of the lens with respect to the plate were recognized by the authors. The distance was calibrated with the help of the electrostatic force at a few points. The force was measured for distances between 132 and 670 nm for the 100 nm thick metal coating. Only distances larger than 260 nm could be probed for the 50 nm metal coating. The theoretical treatment of chromium metal was noted to be problematic as it has two strong absorption bands around 600 nm. Given this, it was very hard to develop a complete theory based on the Lifshitz model and some empirical treatment was necessary. The imaginary part of the dielectric constant corresponding to this absorption was modeled as a Lorentz atom [272]. The two overlapping absorption bands were treated as a single absorption band. The strength of this absorption band could only approximately be taken into account in the theoretical modeling. However, this absorption band was found to contribute about 40% of the total force. The long wavelength response of chromium was modeled as a Drude metal with a plasma frequency based on an electron number density of $1.15 \times 10^{22}\,\mathrm{cm}^{-3}$. With this theoretical treatment the measured force was shown to be consistent with the theory. The authors estimate the effect of surface roughness [107,239,240,245], which was neglected in the theoretical treatment, to make contributions of order 10%. The relative uncertainty in the measured force was reported to be around 25% near 150 nm separation but much larger around 500 nm separation. The authors report that the noise comes from the force measurement apparatus. Given the above, we can estimate the accuracy of the experiment to be of order 50%. But it is worth noting that this was the first experiment to grapple with all the important systematics and other factors noted in Section 6.1 which are necessary to make a clear measurement of the Casimir force. This experiment can therefore be considered as the first unambiguous demonstration of the Casimir force between metallic surfaces. Thus, it is also the first measurement of surface forces, in general, where an independent estimate of the experimental precision can be attempted (though none was provided by the authors).

6.2.5. Dynamical force measurement techniques

In theory, dynamical force measurements are more sensitive as the signal (and the noise) in a narrow bandwidth is monitored. A dynamical force measurement technique was first used by Hunklinger, Arnold and their co-authors to measure the Casimir force between silica surfaces and silica surfaces coated with a thin layer of silicon [263,264]. Here a glass lens of radius 2.5 cm was attached to the metal coated top membrane of a loudspeaker, while a flat glass plate formed the top surface of a microphone. In one case the glass surfaces were coated with silicon [264]. A sinusoidal voltage (at the microphone resonance frequency of 3 kHz) was applied to the loudspeaker such that the distance between the two surfaces also changed sinusoidally. This change resulted in the sinusoidal oscillation of the flat plate on the microphone due to the Casimir force. This oscillation of the microphone was detected. The calibration was done by removing the plate and lens and applying an electrostatic voltage between the top of the loudspeaker and the microphone. A probable error of 20% [263] and 50% [264] in the force calibration is reported.
The possible force sensitivity was about $10^{-7}$ dyn. The electrostatic force was minimized by use of water vapor and acetic acid vapor. No report of the residual electrostatic force was provided. Given the glass manufacturers’ roughness specifications of 50 nm for the surfaces, the surface separation on contact was estimated to be around 80 nm. Deviations from the expected behavior were found for separation distances below 300 nm and larger than 800 nm. Such deviations might be due to the presence of the “gel layer” pointed out by Sparnaay and to the role of the electrostatic charges. There is also a proposal to detect Casimir forces using a tunneling electromechanical transducer by Onofrio and Carugno [265,266]. Here the force was to be measured between two parallel metal surfaces, with one plate mounted on a cantilever and a disk mounted rigidly on a piezoelectric stack. Small oscillations induced on the disk by the application of a sinusoidal voltage to the piezo will induce oscillations of the plate (and the cantilever) through the Casimir force. This Casimir force induced cantilever motion should be detected by monitoring the tunneling current between a sharp needle and the cantilever. The sensitivity was increased by looking at the sidebands of the oscillation frequency, as they were found to be less affected by seismic noise or 1/f electrical noise. In this report only the noise limit was found by applying an electrostatic bias to the metal pieces. The noise limit at surface separations of 1 μm was reported to be three orders of magnitude larger than that necessary for the detection of the Casimir force. Further improvements such as a resonator with larger mechanical quality factor, phase sensitive detection and vacuum operation were proposed.



6.3. Experiment by Lamoreaux

This experiment by Lamoreaux [40] was a landmark experiment, being the first in a modern phase of Casimir force measurements. It particularly invigorated both the theoretical and experimental community by coinciding with the development of the modern unification theories on compactified dimensions (discussed below in Section 7). This coincidence heightened the awareness of the usefulness of Casimir force measurements as one of the most sensitive tests for new forces in the submillimeter distance range. It was also the first time that most of the relevant parameters necessary for a careful calculation of the experimental precision were measured and reported. Thus, a quantification of the experimental precision could be consistently attempted without recourse to an arbitrary estimation of parameters. This experiment used a balance based on the torsion pendulum to measure the Casimir force between a gold coated spherical lens and flat plate. A lens with a radius of 11.3 ± 0.1 cm (later corrected to 12.5 ± 0.3 cm in [228,273]) was used. The two surfaces were first coated with 0.5 μm of Cu followed by a 0.5 μm coating of Au. Both coatings were done by evaporation. The lens was mounted on a piezo stack and the plate on one arm of the torsion balance. The other arm of the torsion balance formed the center electrode of dual parallel plate capacitors $C_1$ and $C_2$. Thus, the position of this arm, and consequently the angle of the torsion pendulum, could be controlled by application of voltages to the plates of the dual capacitor. The Casimir force between the plate and lens surface would result in a torque, leading to a change in the angle of the torsion balance. This change in angle would result in changes of the capacitances $C_1$ and $C_2$, which were detected through a phase sensitive circuit. Then compensating voltages were applied to the capacitors $C_1$ and $C_2$ through a feedback circuit to counteract the change in the angle of the torsion balance.
These compensating voltages were a measure of the Casimir force. The calibration of this system was done electrostatically. When the lens and plate surfaces were grounded, a “shockingly large” [40] potential difference of 430 mV was measured between the two surfaces. This large electrostatic potential difference between the two surfaces was partially compensated with application of voltage to the lens (from the analysis there appears to have been a residual electrostatic force even after this compensation). The lens was moved towards the plate in 16 steps by application of voltage to the piezo stack on which it was mounted. At each step, the restoring force, given by the change in voltage required to keep the pendulum angle fixed, was noted. The maximum separation between the two surfaces was 12.3 μm. The average displacement for a 5.75 V step was about 0.75 μm. A considerable amount of hysteresis was noted between the up and down cycles, i.e., approach and retraction of the two surfaces. The displacement as a function of the 16 applied voltages was reported as measured to 0.01 μm accuracy with a laser interferometer. The total force was measured between separations of 10 μm and contact of the two surfaces. The experiment was repeated and a total of 216 up/down sweeps were used in the final data set. The total measured force data was binned into 15 surface separation points. Two of the important experimental values needed, (a) the residual electrostatic force and (b) the surface separation on contact of the two surfaces, were obtained by curve fitting the expected Casimir force to the total measured force in the following manner. The total measured force for separations greater than 2 μm (12 of the 15 data points) was fit to the sum of the Casimir force (not including the conductivity corrections) and the electrostatic force. Thus the fitting function was of the form

\[ F^m(i) = F_c^T(a_i + a_0) + \frac{\beta}{a_i + a_0} + b , \tag{6.1} \]

where $F^m(i)$ is the total measured force at the $i$th step and $F_c^T$ is the theoretical Casimir force including the temperature correction given by Eqs. (5.31) and (5.32). The distance $a_0$ in Eq. (6.1) is left as a fit parameter which gives the absolute plate separation on contact of the two surfaces, and $b$ is a constant. Also $\beta$, which is a measure of the electrostatic force between the two surfaces, was left to be determined by the fit. The uncertainty in $a_0$ was noted to be less than 0.1 μm and a typical drift of 0.1 μm was noted between the up/down sweeps. About 10% of the up/down sweeps were rejected because of poor convergence, a nonphysical value for $a_0$ or an inconsistent result for $\beta$. Finally, 216 up/down sweeps were retained. The values of $a_0$ and $\beta$ determined from this fit to the 12 data points were then used to subtract the residual electrostatic force from the total measured force at all 15 data points. Thus,

\[ F_c^m(a_i) = F^m(a_i) - \frac{\beta}{a_i} - b , \tag{6.2} \]

where $F_c^m(a_i)$ is the measured Casimir force. Even after application of the compensating voltages to the lens and plate surfaces, the Casimir force was noted to be only around 20% of the residual electrostatic force at the point of closest approach. Next, a quantification of the experimental precision was attempted. Here the experimentally measured force $F_c^m(a_i)$ and the theoretical force $F_c^T$ were compared using Eq. (4.108):

\[ F_c^m(a_i) = (1 + \delta) F_c^T(a_i) + b . \tag{6.3} \]

In the above, values of $b < 5 \times 10^{-7}$ dyn (95% confidence level) were used in the fit. For the 216 sweeps considered, the average value of $\delta$ determined from the above was 0.01 ± 0.05. Based on this, a 5% degree of experimental precision was quoted for this experiment at all surface separations from 0.6 to 6 μm. In the above determination of the precision, the conductivity correction of the first order from Eq. (5.65) was not included in the comparison. No evidence of its influence on the force value was gleaned from the measurement. In [40] an upper limit of 3% was placed on any effect of the conductivity correction. Surface roughness corrections were not reported in [40]. It was first realized in [45] that finite conductivity corrections could amount to as much as 20% of the Casimir force at a separation of about 1 μm. Subsequently, Lamoreaux pointed out two errors [273] in his experimental measurement. He calculated the finite conductivity correction at 0.6 μm to be 22% for gold and 11% for copper. Thus, the theoretical value of the Casimir force would decrease by these percentages. This calculation was based on the tabulated complex index of refraction for the two metals. A second error, in the measurement of the radius of curvature, was also noted. Here the lens surface was reported to be aspheric and the radius in the region where the Casimir force measurement was performed was reported to be 12.5 ± 0.3 cm, in contrast with the 11.3 ± 0.1 cm used earlier in [40]. This change corresponds to a 10.6% increase in the theoretical value of the Casimir force. It was noted in [42] that only a pure copper film (with the 11% conductivity correction) is consistent with the 5% precision reported earlier in [40]. Thus, an assumption of complete diffusion of the copper layer through the 0.5 μm gold layer on both the lens and plate surfaces was thought necessary for the preservation of the experimental precision [273]. Later work [35,229] has shown that gold and copper surfaces lead to identical forces. In conclusion, this experiment introduced the modern phase sensitive detection of forces and thus brought possible increased sensitivity to the measurement of the Casimir forces. By using piezoelectric translation of the lens towards the plate, reproducible measurements of the surface separation could be done. The value of the electrostatic force and the surface separation on contact could only be determined by curve fitting part of the experimental data to the expected Casimir force. Such a procedure biases that part of the experiment with the input value of the Casimir force. Also, as noted by Lamoreaux in [40], the data are not of sufficient accuracy to demonstrate the temperature corrections to the Casimir force. It should be noted that the temperature corrections are 86%, 129% and 174% of the zero temperature result at surface separations of 4, 5 and 6 μm, respectively [45]. As follows from Fig. 4 of [40], the absolute error of the force measurements is around $\Delta F = 1 \times 10^{-11}$ N, resulting in a relative error of approximately 700% at $a = 6$ μm. Thus the temperature corrections remain unconfirmed and the 5–10% accuracy would probably apply at the smallest separations only. Above all, this work, by applying modern positioning and force measurement techniques to the Casimir effect, promised rigorous tests of the theory including the various corrections. Thus it stimulated a surge in theoretical activity.

6.4. Experiments with the Atomic Force Microscope by Mohideen et al.

The increased sensitivity of the Atomic Force Microscope (AFM) was used by Mohideen et al. to perform the most definitive experiments on the measurement of the Casimir force [41,43,44].
With the use of the AFM the authors report a statistical precision of 1% at the smallest separations in their measurement of the Casimir force. The three important requirements set forth by Sparnaay, i.e. the use of nonreactive and clean metal surfaces, the determination of the average surface separation on contact of the two surfaces (only in [43,44]) and the minimization and independent measurement of the electrostatic potential differences, were all met independently of the Casimir force measurement. In the case of the experiments with aluminum coating, a thin Au/Pd coating was sputtered on top of the aluminum to reduce the effects of its oxidation. In these experiments the Au/Pd coating was treated phenomenologically as transparent. A complete theoretical treatment of thin metal coatings is complicated by the wave vector dependence of the dielectric properties of metal layers [36,233]. It should be noted that, with regard to the initial work [41], the complete conductivity and roughness corrections were described in detail in Ref. [34]. The last experiment [44], using gold coating, avoids all these ambiguities and the comparisons with theory are more solid. Below we discuss the experiments and their improvements chronologically, culminating in the most accurate work with the gold surfaces. In this section the configuration of a sphere above a flat disk is considered. The case of a corrugated plate is the subject of Section 6.5.

6.4.1. First AFM experiment with aluminum surfaces

A schematic diagram of the experiment [41] is shown in Fig. 14. A force between the sphere and plate causes the cantilever to flex. This flexing of the cantilever is detected by the deflection of the laser beam, leading to a difference signal between photodiodes A and B. This difference signal of the photodiodes was calibrated by means of an electrostatic force. Polystyrene spheres of 200 ± 4 μm diameter were mounted on the tip of the metal coated cantilevers with Ag epoxy. A 1 cm diameter optically polished sapphire disk is used as the plate. The cantilever (with sphere) and plate were then coated by thermal evaporation with about 300 nm of aluminum. To prevent the rapid oxidation of the aluminum coating and the development of space charges, the aluminum was sputter coated with a 60%/40% Au/Pd coating of less than 20 nm thickness. In the first experiments aluminum metal was used due to its high reflectivity at short wavelengths (corresponding to small surface separations). Aluminum coatings are also easy to apply due to the strong adhesion of the metal to a variety of surfaces and its low melting point.

Fig. 14. Schematic diagram of the experimental setup. Application of voltage to the piezoelectric element results in the movement of the plate towards the sphere.

To measure the Casimir force between the sphere and the plate they are grounded together with the AFM. The plate is then moved towards the sphere in 3.6 nm steps and the corresponding photodiode difference signal was measured. The signal obtained for a typical scan is shown in Fig. 15. Here “0” separation stands for contact of the sphere and plate surfaces. It does not take into account the absolute average separation between the Au/Pd layers due to the surface roughness, which is about 80 nm. If one also takes into account the Au/Pd cap layers, which can be considered transparent at small separations (see below), the absolute average separation at contact between the Al layers is about 120 nm. Note that in the experiment the separation distance on contact was found by fitting the experimental data at large separations with the measured electrostatic force and the theoretical Casimir force. Region 1 shows that the force curve at large separations is dominated by a linear signal.
This is due to increased coupling of scattered light into the diodes from the approaching flat surface. Embedded in the signal is a long-range attractive electrostatic force from the contact potential difference between the sphere and plate, and the Casimir force (small at such large distances). In region 2 (absolute separations vary from contact to 350 nm) the Casimir force is the dominant characteristic, far exceeding all the systematic errors. Region 3 is the flexing of the cantilever resulting from the continued extension of the piezo after contact of the two surfaces. Given the distance moved by the flat plate (x-axis), the difference signal of the photodiodes can be calibrated to a cantilever deflection in nanometers using the slope of the curve in region 3.

Fig. 15. Typical force curve as a function of the distance moved by the plate.

Next, the force constant of the cantilever was calibrated by an electrostatic measurement. The sphere was grounded to the AFM and different voltages in the range ±0.5 to ±3 V were applied to the plate. The force between a charged sphere and plate is given as [274]

\[ F = \frac{1}{2}(V_1 - V_2)^2 \sum_{n=1}^{\infty} \operatorname{csch} n\alpha \, (\coth\alpha - n \coth n\alpha) . \tag{6.4} \]
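As a numerical illustration, the series in Eq. (6.4) can be summed directly. The sketch below is not taken from the analysis of the experiment. It assumes that the prefactor 1/2 in Eq. (6.4) is written in Gaussian units, so that it becomes $2\pi\varepsilon_0$ in SI units, and it uses $\alpha = \cosh^{-1}(1 + a/R)$ with $a$ the sphere–plate separation and $R$ the sphere radius. The function names, the truncation rule and the sample values are illustrative choices. For $a \ll R$ the sum should approach the proximity limit $-\pi\varepsilon_0 R (V_1 - V_2)^2 / a$, i.e. an inverse linear dependence on separation.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m (SI)


def sphere_plate_force(a, R, dV):
    """Electrostatic sphere-plate force in newtons from the series of Eq. (6.4).

    Assumption: SI form with prefactor 2*pi*eps0 (1/2 in Gaussian units).
    a  -- surface separation in m
    R  -- sphere radius in m
    dV -- potential difference V1 - V2 in V
    A negative result means attraction.
    """
    alpha = math.acosh(1.0 + a / R)
    # Terms decay like exp(-n*alpha); stop once n*alpha exceeds about 50.
    n_max = int(math.ceil(50.0 / alpha))
    total = 0.0
    for n in range(1, n_max + 1):
        x = n * alpha
        total += (1.0 / math.tanh(alpha) - n / math.tanh(x)) / math.sinh(x)
    return 2.0 * math.pi * EPS0 * dV ** 2 * total


def proximity_limit(a, R, dV):
    """Leading behaviour of the series for a << R."""
    return -math.pi * EPS0 * R * dV ** 2 / a


if __name__ == "__main__":
    R = 100e-6  # illustrative sphere radius in m
    for a in (100e-9, 500e-9, 1e-6):
        exact = sphere_plate_force(a, R, 1.0)
        print(a, exact, exact / proximity_limit(a, R, 1.0))
```

The ratio printed in the last column indicates how closely the full series follows the inverse linear form at a given separation.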

Here $V_1$ is the applied voltage on the plate, and $V_2$ represents the residual potential on the grounded sphere. One more notation is $\alpha = \cosh^{-1}(1 + a/R)$, where $R$ is the radius of the sphere and $a$ is the separation between the sphere and the plate. From the difference in force for voltages $\pm V_1$ applied to the plate, the residual potential on the grounded sphere is measured to be $V_2 = 29$ mV. This residual potential is a contact potential that arises from the different materials used to ground the sphere. The electrostatic force measurement was repeated at 5 different separations and for 8 different voltages $V_1$. As the force is electrostatically calibrated, one can derive an equivalent force constant using Hooke’s law and the force from Eq. (6.4). The average value thus derived was $k = 0.0182$ N/m. The systematic error corrections to the force curve of Fig. 15, due to the residual potential on the sphere and the true separations between the two surfaces, are now calculated. Here the near linear force curve in region 1 is fit to a function of the form

\[ F = F_c(\Delta a + a_0) + \frac{B}{\Delta a + a_0} + C \times (\Delta a + a_0) + E . \]



Fig. 16. Measured average Casimir force for large distances as a function of plate–sphere separation is shown as open squares. The theoretical Casimir force with corrections for surface roughness and finite conductivity is shown by the solid line (when the space separation is defined as the distance between Al layers) and by the dashed line (with the distance between Au/Pd layers).

In this equation $a_0$, the absolute separation at contact, which is constrained to 120 ± 5 nm, is the only unknown to be completely obtained by the fit. The second term represents the inverse linear dependence of the electrostatic force between the sphere and plate for $R \gg a$. The constant $B = -2.8$ nN nm, corresponding to $V_2 = 29$ mV and $V_1 = 0$, is used. The third term represents the linearly increasing coupling of the scattered light into the photodiodes and $E$ is the offset of the curve. Both $C$ and $E$ can be estimated from the force curve at large separations. The best fit values of $C$, $E$ and the absolute space separation $a_0$ are determined by minimizing the $\chi^2$. The finite conductivity correction and roughness correction (the largest corrections) do not play a significant role in region 1 and thus the value of $a_0$ determined by the fitting is unbiased with respect to these corrections. These values of $C$, $E$ and $a_0$ are then used to subtract the systematic errors from the force curve in regions 1 and 2 to obtain the measured Casimir force as $(F_c)_m = F_m - B/a - Ca - E$, where $F_m$ is the measured total force. This procedure was repeated for 26 scans in different locations of the flat plate. The average measured Casimir force $(F_c)_m$ as a function of sphere–plate separation from all the scans is shown in Figs. 16 and 17 as open squares in the separation range $120\ \mathrm{nm} \le a \le 950\ \mathrm{nm}$ ($a$ is the distance between Al surfaces). In Fig. 17 the dashed line represents the Casimir force of Eq. (4.108), which deviates evidently from the experimental data. Because of this, different corrections to the Casimir force in the above experimental configuration should be estimated and taken into account. Now let us compare the experimental results with a more precise theory taking into account surface roughness and finite conductivity corrections to the Casimir force (temperature corrections are negligibly small for the separations under consideration).
For distances of $a \sim 1$ μm between the interacting bodies both the surface roughness and finite conductivity of the boundary metal make important contributions to the value of the Casimir force. Although the exact calculation is impossible, one can find the corresponding corrections approximately with the required accuracy. In the first report [41] only the second order conductivity and roughness corrections were used for the comparison between theory and experiment. Using such a comparison, the authors found that the root mean square (rms) deviation of the experiment ($F_{\rm exp}$) from the theory ($F_{\rm th}$) is $\sigma_F = 1.6$ pN in the complete measurement range. This is of the order of 1% of the forces measured at the closest separation and was used as the measure of precision. Second order perturbation theory is, however, insufficient to calculate the force value with an accuracy of 1%, because the third and fourth order contributions are generally larger than 1%. The rather good agreement between theory and experiment obtained in [41] is explained by the fact that the two corrections are of different signs and partly compensate each other (see Sections 5.2.2 and 5.3.5). As a result, the value of $\sigma_F$ in [41] was interval dependent, i.e. different if calculated separately at small separations, large separations and in the complete measurement range. Subsequently in Ref. [34] both the conductivity and roughness corrections were improved and a better approach to the theory was considered. This is discussed below, starting with the roughness correction.

Fig. 17. Measured average Casimir force for small distances as a function of plate–sphere separation is shown as open squares. The theoretical Casimir force with corrections due to surface roughness and finite conductivity is shown by the solid line, and without any correction by the dashed line.

We use the formalism presented in Section 5.3 to describe a plane plate (disk) of dimension $2L$, thickness $D$ and a sphere above it of radius $R$, both covered by roughness. The roughness on the plate is described by the function (5.74). The roughness on the sphere is described by Eq. (5.114). The values of the roughness amplitude are defined as specified in Section 5.3. As is seen from the experimental investigation below, the characteristic lateral sizes of roughness on the plate ($T_d$) and on the sphere ($T_\ell$) obey inequality (5.124).
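In this case, following the development outlined in Section 5.3.5, the Casimir force with account of the roughness correction takes the form (see Eq. (5.125))

\[
\begin{aligned}
F_{d\ell}^{R}(a) = F_{d\ell}^{(0)}(a) \Biggl\{ 1
&+ 6 \Biggl[ \langle\langle f_1^2 \rangle\rangle \left(\frac{A_1}{a}\right)^{2}
 - 2 \langle\langle f_1 f_2 \rangle\rangle \frac{A_1}{a}\frac{A_2}{a}
 + \langle\langle f_2^2 \rangle\rangle \left(\frac{A_2}{a}\right)^{2} \Biggr] \\
&+ 10 \Biggl[ \langle\langle f_1^3 \rangle\rangle \left(\frac{A_1}{a}\right)^{3}
 - 3 \langle\langle f_1^2 f_2 \rangle\rangle \left(\frac{A_1}{a}\right)^{2} \frac{A_2}{a}
 + 3 \langle\langle f_1 f_2^2 \rangle\rangle \frac{A_1}{a} \left(\frac{A_2}{a}\right)^{2}
 - \langle\langle f_2^3 \rangle\rangle \left(\frac{A_2}{a}\right)^{3} \Biggr] \\
&+ 15 \Biggl[ \langle\langle f_1^4 \rangle\rangle \left(\frac{A_1}{a}\right)^{4}
 - 4 \langle\langle f_1^3 f_2 \rangle\rangle \left(\frac{A_1}{a}\right)^{3} \frac{A_2}{a}
 + 6 \langle\langle f_1^2 f_2^2 \rangle\rangle \left(\frac{A_1}{a}\right)^{2} \left(\frac{A_2}{a}\right)^{2} \\
&\qquad - 4 \langle\langle f_1 f_2^3 \rangle\rangle \frac{A_1}{a} \left(\frac{A_2}{a}\right)^{3}
 + \langle\langle f_2^4 \rangle\rangle \left(\frac{A_2}{a}\right)^{4} \Biggr] \Biggr\} .
\end{aligned}
\tag{6.5}
\]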

Here the double angle brackets denote two successive averaging procedures, the first performed over the surface area of the interacting bodies and the second over all possible phase shifts between the distortions situated on the surfaces of the interacting bodies against each other. This second averaging is necessary because in the experiment [41] the measured Casimir force was averaged over 26 scans done at different points on the plate surface. $F_{d\ell}^{(0)}(a)$ is the Casimir force acting between perfect metals in a perfectly shaped configuration. The above result, Eq. (6.5), was used to calculate the roughness corrections to the Casimir force in the experiment [41]. The roughness of the metal surface was measured with the same AFM. After the Casimir force measurement the cantilever with sphere was replaced with a standard cantilever having a sharp tip. Regions of the metal plate differing in size from 1 μm × 1 μm to 0.5 μm × 0.5 μm were scanned with the AFM. A typical surface scan is shown in Fig. 18. The roughness of the sphere was investigated with a SEM and found to be similar to that of the flat plate. In the surface scan of Fig. 18, the lighter tone corresponds to larger height. As is seen from Fig. 18, the major distortions are the large separate crystals situated irregularly on the surfaces. They can be modeled approximately by parallelepipeds of two heights. As the analysis of several AFM images shows, the height of the highest distortions is about $h_1 = 40$ nm and that of the intermediate ones about $h_2 = 20$ nm. Almost all of the surface between the distortions is covered by stochastic roughness of height $h_0 = 10$ nm. It consists of small crystals which are not clearly visible in Fig. 18 due to the vertical scale used. Altogether they form a homogeneous background of averaged height $h_0/2$. The character of the roughness on the plate and on the lens is quite similar. Note that in [41] only the highest distortions, $h_1 = 40$ nm, were used to estimate the distortion amplitude.
Now it is possible to determine the height H relative to which the mean value of the function, describing the total roughness, is zero. It can be found from the equation h0 (h1 − H )S1 + (h2 − H )S2 − H − S0 = 0 ; (6.6) 2 where S1; 2; 0 are, correspondingly, the surface areas occupied by distortions of the heights h1 , h2 and stochastic roughness. Dividing Eq. (6.6) into the area of interacting surface S = S1 + S2 + S0

166


Fig. 18. Typical atomic force microscope scan of the metal surface. The lighter tone corresponds to larger height as shown by the bar graph on the left.

one gets

$$(h_1 - H)v_1 + (h_2 - H)v_2 - \left(H - \frac{h_0}{2}\right)v_0 = 0, \qquad (6.7)$$

where $v_{1,2,0} = S_{1,2,0}/S$ are the relative parts of the surface occupied by the different kinds of roughness. The analysis of AFM pictures similar to Fig. 18 gave the values $v_1 = 0.11$, $v_2 = 0.25$, $v_0 = 0.64$. Solving Eq. (6.7) we get the height of the zero distortion level, $H = 12.6$ nm. The value of the distortion amplitude defined relative to this level is

$$A = h_1 - H = 27.4\ \mathrm{nm}. \qquad (6.8)$$
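As a numerical check of Eqs. (6.7) and (6.8), the zero distortion level and the distortion amplitude can be recomputed from the heights and area fractions quoted above. The short Python sketch below is illustrative only; the function name `zero_distortion_level` is a hypothetical helper introduced here and does not come from Refs. [34,41].

```python
def zero_distortion_level(heights_nm, fractions):
    """Solve Eq. (6.7) for H.

    heights_nm = (h1, h2, h0) and fractions = (v1, v2, v0).
    Eq. (6.7) is linear in H; since v1 + v2 + v0 = 1 its solution is
    H = h1*v1 + h2*v2 + (h0/2)*v0.
    """
    h1, h2, h0 = heights_nm
    v1, v2, v0 = fractions
    return (h1 * v1 + h2 * v2 + 0.5 * h0 * v0) / (v1 + v2 + v0)


# Values quoted in the text for the experiment of Ref. [41].
heights_nm = (40.0, 20.0, 10.0)
fractions = (0.11, 0.25, 0.64)

H = zero_distortion_level(heights_nm, fractions)
A = heights_nm[0] - H  # Eq. (6.8)
print(f"H = {H:.1f} nm, A = {A:.1f} nm")
```

Dividing by the sum of the fractions keeps the helper usable when the measured fractions do not add up to exactly one.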

Below, two more parameters will also be used:

$$\beta_1 = \frac{h_2 - H}{A} \approx 0.231, \qquad \beta_2 = \frac{H - h_0/2}{A} \approx 0.346. \qquad (6.9)$$

With their help the distortion function from Eq. (5.74) was represented as

$$f_1(x_1, y_1) = \begin{cases} 1, & (x_1, y_1) \in G_{S_1}, \\ \beta_1, & (x_1, y_1) \in G_{S_2}, \\ -\beta_2, & (x_1, y_1) \in G_{S_0}, \end{cases} \qquad (6.10)$$

where $G_{S_1}$, $G_{S_2}$, $G_{S_0}$ are the regions of the first interacting body surface occupied by the different kinds of roughness.


The same representation is valid for $f_2$ also:

$$f_2(x_2, y_2) = \begin{cases} -1, & (x_2, y_2) \in \tilde{G}_{S_1}, \\ -\beta_1, & (x_2, y_2) \in \tilde{G}_{S_2}, \\ \beta_2, & (x_2, y_2) \in \tilde{G}_{S_0}, \end{cases} \qquad (6.11)$$

where $\tilde{G}_{S_1}$, $\tilde{G}_{S_2}$, $\tilde{G}_{S_0}$ are the regions of the second interacting body surface occupied by the distortions of different kinds. For the roughness under consideration the characteristic lateral sizes of the distortions are $T_d, T_\ell \sim 200$–300 nm, as can be seen from Fig. 18. At the same time $\sqrt{aR} > 3000$ nm. Thus, condition (5.124) is valid and Eq. (6.5) is indeed applicable for calculating the roughness corrections. Now it is not difficult to calculate the coefficients of expansion (6.5). One example is

$$\langle\langle f_1 f_2 \rangle\rangle = -v_1^2 - 2\beta_1 v_1 v_2 + 2\beta_2 v_1 v_0 - \beta_1^2 v_2^2 + 2\beta_1\beta_2 v_2 v_0 - \beta_2^2 v_0^2 = 0, \qquad (6.12)$$

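which follows from Eqs. (6.7) to (6.9). The results for the other coefficients are

$$\begin{aligned}
\langle\langle f_1^2 \rangle\rangle &= \langle\langle f_2^2 \rangle\rangle = v_1 + \beta_1^2 v_2 + \beta_2^2 v_0, \\
\langle\langle f_1^3 \rangle\rangle &= -\langle\langle f_2^3 \rangle\rangle = v_1 + \beta_1^3 v_2 - \beta_2^3 v_0, \\
\langle\langle f_1^4 \rangle\rangle &= \langle\langle f_2^4 \rangle\rangle = v_1 + \beta_1^4 v_2 + \beta_2^4 v_0, \\
\langle\langle f_1 f_2^2 \rangle\rangle &= \langle\langle f_1^2 f_2 \rangle\rangle = 0, \qquad
\langle\langle f_1 f_2^3 \rangle\rangle = \langle\langle f_1^3 f_2 \rangle\rangle = 0, \\
\langle\langle f_1^2 f_2^2 \rangle\rangle &= \left(v_1 + \beta_1^2 v_2 + \beta_2^2 v_0\right)^2.
\end{aligned} \qquad (6.13)$$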

Substituting Eq. (6.13) into Eq. (6.5) we get the final expression for the Casimir force with surface distortions included up to the fourth order in the relative distortion amplitude:

$$\begin{aligned}
F^{R}_{d\ell}(a) = F^{(0)}_{d\ell}(a)\Big\{ 1 &+ 12\left(v_1 + \beta_1^2 v_2 + \beta_2^2 v_0\right)\frac{A^2}{a^2} + 20\left(v_1 + \beta_1^3 v_2 - \beta_2^3 v_0\right)\frac{A^3}{a^3} \\
&+ 30\left[v_1 + \beta_1^4 v_2 + \beta_2^4 v_0 + 3\left(v_1 + \beta_1^2 v_2 + \beta_2^2 v_0\right)^2\right]\frac{A^4}{a^4} \Big\}.
\end{aligned} \qquad (6.14)$$

It should be noted that exactly the same result can be obtained in a very simple way. To do this it is enough to calculate the value of the Casimir force (4.108) for the six different distances that are possible between the distorted surfaces, multiply them by the appropriate probabilities, and then sum the results:

$$\begin{aligned}
F^{R}_{d\ell}(a) = \sum_{i=1}^{6} w_i F^{(0)}_{d\ell}(a_i)
\equiv{} & v_1^2 F^{(0)}_{d\ell}(a - 2A) + 2v_1 v_2 F^{(0)}_{d\ell}\big(a - A(1 + \beta_1)\big) + 2v_2 v_0 F^{(0)}_{d\ell}\big(a - A(\beta_1 - \beta_2)\big) \\
& + v_0^2 F^{(0)}_{d\ell}(a + 2A\beta_2) + v_2^2 F^{(0)}_{d\ell}(a - 2A\beta_1) + 2v_1 v_0 F^{(0)}_{d\ell}\big(a - A(1 - \beta_2)\big).
\end{aligned} \qquad (6.15)$$

As was noted in Section 5.4, representations of this type follow immediately from the Proximity Force Theorem and therefore, with an appropriate interpretation of $F^{(0)}$, can be applied not only inside the retarded (Casimir) regime but also in the transition region to the van der Waals force.
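The agreement between the fourth-order expansion (6.14) and the six-distance average (6.15) can be illustrated numerically. The Python sketch below is a minimal check under stated assumptions: the perfect-metal sphere–plate force of Eq. (4.108) is taken to scale as $a^{-3}$, only the ratio $F^{R}_{d\ell}/F^{(0)}_{d\ell}$ is evaluated, and $\beta_1$, $\beta_2$ are computed inside the code from the heights through the definitions in Eq. (6.9). All function names are hypothetical helpers introduced for this illustration.

```python
from itertools import product


def distortion_levels(heights_nm, fractions):
    """Return the amplitude A (nm) and the three allowed values of f1.

    heights_nm = (h1, h2, h0), fractions = (v1, v2, v0); Eqs. (6.7)-(6.10).
    """
    h1, h2, h0 = heights_nm
    v1, v2, v0 = fractions
    H = h1 * v1 + h2 * v2 + 0.5 * h0 * v0  # zero level, Eq. (6.7)
    A = h1 - H                              # Eq. (6.8)
    beta1 = (h2 - H) / A                    # definitions of Eq. (6.9)
    beta2 = (H - 0.5 * h0) / A
    return A, (1.0, beta1, -beta2)


def factor_six_distances(a_nm, heights_nm, fractions):
    """F^R / F^(0) from Eq. (6.15), ASSUMING F^(0)(a) is proportional to a**-3."""
    A, f_values = distortion_levels(heights_nm, fractions)
    levels = list(zip(f_values, fractions))
    total = 0.0
    # f2 takes the values -f1, so the local separation is a - A*(f_i + f_j).
    # The nine ordered pairs collapse to the six distinct distances of Eq. (6.15).
    for (f_i, v_i), (f_j, v_j) in product(levels, repeat=2):
        a_ij = a_nm - A * (f_i + f_j)
        total += v_i * v_j * (a_nm / a_ij) ** 3
    return total


def factor_fourth_order(a_nm, heights_nm, fractions):
    """F^R / F^(0) from the fourth-order expansion, Eq. (6.14)."""
    A, f_values = distortion_levels(heights_nm, fractions)
    m2, m3, m4 = (
        sum(v * f ** n for f, v in zip(f_values, fractions)) for n in (2, 3, 4)
    )
    x = A / a_nm
    return 1.0 + 12 * m2 * x**2 + 20 * m3 * x**3 + 30 * (m4 + 3 * m2**2) * x**4


HEIGHTS_NM = (40.0, 20.0, 10.0)
FRACTIONS = (0.11, 0.25, 0.64)

for a_nm in (200.0, 500.0, 900.0):
    exact = factor_six_distances(a_nm, HEIGHTS_NM, FRACTIONS)
    series = factor_fourth_order(a_nm, HEIGHTS_NM, FRACTIONS)
    print(f"a = {a_nm:5.0f} nm: six distances {exact:.6f}, fourth order {series:.6f}")
```

The two factors differ only by terms of fifth and higher order in $A/a$, so the difference shrinks rapidly as the separation grows.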



Now let us turn to the corrections to the Casimir force due to finite conductivity. The interacting bodies used in the experiment [41] were coated with 300 nm of Al in an evaporator. The thickness of this metallic layer is much larger than the penetration depth $\delta_0$ of electromagnetic oscillations into Al for the wavelengths (sphere–plate separations) of interest. Taking $\lambda_p^{Al} = 100$ nm as the approximate value of the effective plasma wavelength of the electrons in Al [230], one gets $\delta_0 = \lambda_p^{Al}/(2\pi) \approx 16$ nm. This means that the interacting bodies can be considered as made of Al as a whole. Although Al reflects more than 90% of the incident electromagnetic oscillations in the complete measurement range 120 nm < a < 950 nm, some corrections to the Casimir force due to the finiteness of its conductivity exist and should be taken into account. In addition, as was already mentioned above, to prevent oxidation the surface of Al in [41] was covered with a layer of 60% Au/40% Pd of thickness less than $\Delta = 20$ nm. The reflectivity properties of this alloy are much worse than those of Al (the effective plasma wavelength of Au is $\lambda_p^{Au} = 136$ nm and the penetration depth is $\tilde{\delta}_0 \approx 21.6$ nm). Because of this, it is necessary to take into account the finiteness of the metal conductivity. For large distances, which are several times larger than $\lambda_p^{Au}$, both Al and Au/Pd are good metals. In this case a perturbation theory in the relative penetration depth into both metals can be developed. The small parameter is the ratio of an effective penetration depth $\delta_e$ (into both Au/Pd and Al) to the distance $a$ between the Au/Pd layers. The quantity $\delta_e$, in turn, is understood as the depth at which the electromagnetic oscillations are attenuated by a factor of $e$. It takes into account the properties of both the Al and the Au/Pd layers. The value of $\delta_e$ can be found from the equation

$$\frac{\delta_e - \Delta}{\delta_0} + \frac{\Delta}{\tilde{\delta}_0} = 1, \qquad \delta_e = \left(1 - \frac{\Delta}{\tilde{\delta}_0}\right)\delta_0 + \Delta \approx 21.2\ \mathrm{nm}. \qquad (6.16)$$

The resultant finite conductivity correction is given by Eq. (5.65), where $\delta_e$ from Eq. (6.16) is substituted instead of $\delta_0$. Now we combine both corrections, one due to the surface roughness and the second due to the finite conductivity of the metals (see Section 5.4.1). For this purpose we substitute the quantity $F^{\delta_e}_{d\ell}(a_i)$ from Eq. (5.65) into Eq. (6.15) instead of $F^{(0)}_{d\ell}(a_i)$. The result is

$$F^{R,C}_{d\ell}(a) = \sum_{i=1}^{6} w_i F^{\delta_e}_{d\ell}(a_i), \qquad (6.17)$$
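The effective penetration depth of Eq. (6.16), which enters Eq. (6.17), can be verified with a few lines of Python. This is a sketch only: the function names are hypothetical helpers, and the closed form assumes, as Eq. (6.16) does, that the Au/Pd layer is thinner than its own penetration depth, so that the $1/e$ attenuation point lies inside the Al.

```python
import math


def penetration_depth_nm(plasma_wavelength_nm):
    """Penetration depth delta = lambda_p / (2*pi), as used in the text."""
    return plasma_wavelength_nm / (2.0 * math.pi)


def effective_penetration_depth_nm(delta_al_nm, delta_aupd_nm, layer_nm):
    """Solve (delta_e - Delta)/delta_0 + Delta/delta_tilde_0 = 1 for delta_e.

    Valid only for layer_nm < delta_aupd_nm (attenuation by 1/e reached in Al).
    """
    return (1.0 - layer_nm / delta_aupd_nm) * delta_al_nm + layer_nm


delta_0 = penetration_depth_nm(100.0)        # Al, lambda_p = 100 nm
delta_tilde_0 = penetration_depth_nm(136.0)  # Au, lambda_p = 136 nm
delta_e = effective_penetration_depth_nm(delta_0, delta_tilde_0, 20.0)

print(f"delta_0 = {delta_0:.1f} nm, delta_tilde_0 = {delta_tilde_0:.1f} nm")
print(f"delta_e = {delta_e:.1f} nm")
```

Using the rounded inputs 16 nm and 21.6 nm instead of the unrounded depths changes $\delta_e$ only in the second decimal place.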

In Eq. (6.17) the different possible distances between the rough surfaces and their probabilities are those introduced in Eq. (6.15). Eq. (6.17) describes the Casimir force between Al bodies with Au/Pd layers, taking into account the finite conductivity of the metals and the surface roughness, for distances several times larger than $\lambda_p^{Au}$. For distances of the order of $\lambda_p^{Au}$ or even smaller a simpler, phenomenological approach to the calculation of the Casimir force can be applied. It uses the fact that the transmittance of 20 nm Au/Pd films at wavelengths around 300 nm is greater than 90%. This transmission measurement was made by taking the ratio of the light transmitted through a glass slide with and without the Au/Pd coating in an optical spectrometer. Such a high transmittance makes it possible to neglect the Au/Pd layers when calculating the Casimir force and to enlarge the distance between the bodies by $2\Delta = 40$ nm when comparing the theoretical and experimental results. With this approach, for distances $a < \lambda_p^{Au}$, instead of Eq. (6.17) the following result is valid:

$$F^{R,C}_{d\ell}(a) = \sum_{i=1}^{6} w_i F^{\delta_0}_{d\ell}(a_i + 2\Delta), \qquad (6.18)$$

where the Casimir force with account of finite conductivity is defined by Eq. (5.65).

Now let us compare the theoretical Casimir force, taking into account the fourth-order roughness and conductivity corrections, with the experiment. Let us first consider large surface separations (the distance between the Au/Pd layers changes in the interval 610 nm ≤ a ≤ 910 nm). We compare the results given by Eqs. (6.17) and (6.18) with the experimental data. In Fig. 16 the dashed curve represents the results obtained by Eq. (6.17), and the solid curve those obtained by Eq. (6.18). The experimental points are shown as open squares. For the 80 experimental points which belong to the range of $a$ under consideration, the root mean square average deviation between theory and experiment in both cases is $\sigma = 1.5$ pN. It is notable that for large $a$ the same result is also valid if we use the Casimir force from Eq. (4.108) (i.e. without any corrections) both for $a$ and for $a + 2\Delta$. This means that for large $a$ the problem of the proper definition of distance is not significant, owing to the experimental uncertainty and the large scatter in the experimental points. The same situation occurs with the corrections. At $a + 2\Delta = 950$ nm the correction due to roughness (positive) is about 0.2% of $F^{(0)}_{d\ell}$, and the correction due to finite conductivity (negative) is 6% of $F^{(0)}_{d\ell}$. Together they give a negative contribution, which is also 6% of $F^{(0)}_{d\ell}$. It is negligible if we take into account that the relative error of the force measurement at the extreme distance of 950 nm is approximately 660% (this is because the Casimir force is much less than the experimental uncertainty at such distances).

Now we consider the range of smaller values of the distance, 80 nm ≤ a ≤ 460 nm (or, between Al, 120 nm ≤ a + 2Δ ≤ 500 nm). Here Eq. (6.18) should be used for the Casimir force. In Fig. 17 the Casimir force $F^{(0)}_{d\ell}(a + 2\Delta)$ from Eq. (4.108) is shown by the dashed curve.
The solid curve represents the dependence calculated according to Eq. (6.18). The open squares are the experimental points. Taking into account all 100 experimental points belonging to the range of smaller distances, we get for the solid curve a root mean square deviation between theory and experiment of $\sigma_{100} = 1.5$ pN. If we consider the narrower distance interval 80 nm ≤ a ≤ 200 nm, which contains thirty experimental points, it turns out that $\sigma_{30} = 1.6$ pN for the solid curve. Over the whole measurement range 80 nm ≤ a ≤ 910 nm the root mean square deviation for the solid curves of Figs. 16 and 17 is $\sigma_{223} = 1.4$ pN (223 experimental points). This means that the dependence, Eq. (6.18), gives equally good agreement with the experimental data in the region of small distances (for the smallest ones the relative error of the force measurement is about 1%), in the region of large distances (where the relative error is rather large) and in the whole measurement range. If one uses less sophisticated expressions for the corrections to the Casimir force due to surface roughness and finite conductivity, the value of $\sigma$ calculated for small $a$ would be larger than in the whole range. It is interesting to compare the obtained results with those given by Eq. (4.108), i.e. without taking account of any corrections. In this case for the interval 80 nm ≤ a ≤ 460 nm (100 experimental points) we have $\sigma^0_{100} = 8.7$ pN. For the whole measurement range 80 nm ≤ a ≤ 910 nm (223 points) there is $\sigma^0_{223} = 5.9$ pN. It is evident that without appropriate treatment of the



corrections to the Casimir force the value of the root mean square deviation is not only larger but also depends significantly on the measurement range. The comparative role of each correction is also quite obvious. If we take into account only the roughness correction according to Eq. (6.14), then one obtains for the root mean square deviation in different intervals: $\sigma^R_{30} = 22.8$ pN, $\sigma^R_{100} = 12.7$ pN and $\sigma^R_{223} = 8.5$ pN. At $a + 2\Delta = 120$ nm the correction is 17% of $F^{(0)}_{d\ell}$. For the single finite conductivity correction calculated by Eq. (5.65) with $\delta_0$ instead of $\delta_e$, the deviations over the same 30, 100 and 223 points are 5.2 pN, 3.1 pN and 2.3 pN, respectively. At 120 nm this correction contributes −34% of $F^{(0)}_{d\ell}$. (Note that both corrections together contribute −22% of $F^{(0)}_{d\ell}$ at 120 nm, so that their nonadditivity is demonstrated most clearly.)

Several conclusions can be reached on the first AFM experiment on measuring the Casimir force. This was the first experiment using the $10^{-12}$ N force sensitivity of the AFM to make the Casimir force measurement. In the initial report [41] only the second order corrections to the conductivity and the roughness were considered. This was pointed out in Ref. [275]. Subsequently in Ref. [34], both the surface roughness and the finite conductivity corrections were calculated up to the fourth order in the respective small parameters. The obtained theoretical results for the Casimir force with both corrections were compared with the experimental data. Excellent agreement was demonstrated, characterized by almost the same value of the root mean square deviation between theory and experiment for small and large separations between the test bodies and in the complete measurement range. It was shown that the agreement between the theory and experiment is substantially worse if any one of the corrections is not taken into account.
This means that the surface roughness and finite conductivity corrections should be taken into account in precision Casimir force measurements with space separations of the order of 1 μm and less. Two of the three requirements set forth by Sparnaay, i.e. the use of a clean metal surface and the independent measurement of the electrostatic force between the two surfaces, were met in this experiment. However, the value of the surface separation on contact of the two surfaces was determined by fitting the Casimir force to a part of the experimental curve at large separations which, as was the case with [40], will bias the experimental curve. Also, the roughness corrections were large, of the order of 17% [34,275] at the smallest separations. In the next set of experiments reported both these problems were eliminated. There an independent measurement of the surface separations was done and the roughness corrections were reduced to the order of 1% only. These experiments are discussed below.

6.4.2. Improved precision measurement with aluminum surfaces using the AFM

The following year, Mohideen et al. reported an improved version of the above experiment [43]. The particular experimental improvements were: (i) use of smoother metal coatings, which reduces the effect of surface roughness and allows for closer separations between the two surfaces, (ii) vibration isolation, which reduces the total noise, (iii) independent electrostatic measurement of the surface separations, and (iv) reductions in the systematic errors due to the residual electrostatic force, scattered light and instrumental drift. Also, the complete dielectric properties of Al are used in the theory along the lines of Section 5.2.3. The average precision defined on the rms deviation between experiment and theory remained at the same 1% of the forces measured at the closest separation. For a metal with a dielectric constant $\varepsilon(\omega)$ the force



between a large sphere and a flat plate is given by the Lifshitz theory [9]. Here the complete dielectric data (extending from 0.04 to 1000 eV, from Ref. [230]) along with the Drude model below 0.04 eV are used to calculate $\varepsilon(i\omega)$. In the Drude model the dielectric constant of Al along the imaginary frequency axis is given by Eq. (5.69), where $\omega_p$ is the plasma frequency corresponding to a wavelength of 100 nm and $\gamma$ is the relaxation frequency corresponding to 63 meV [230]. Al metal was chosen because of its ease of fabrication and high reflectivity at short wavelengths (corresponding to the close surface separations). As in the previous experiment, the roughness of the metal surface is measured directly with the AFM. The metal surface is composed of separate crystals on a smooth background. The heights of the highest distortions were 14 nm and of the intermediate ones 7 nm, both on a stochastic background of height 2 nm, with fractional surface areas of 0.05, 0.11 and 0.84, respectively. The crystals are modeled as parallelepipeds. This leads to the complete Casimir force including the roughness correction [43]. Here, $A = 11.8$ nm is the effective height defined by requiring that the mean of the function describing the total roughness is zero, and the numerical coefficients are the probabilities of different distance values between the interacting surfaces (see the previous subsection). As a result the roughness correction is about 1.3% of the measured force (a factor of nearly 20 improvement over the previous measurement). There is also a temperature correction, which in this case was less than 1% of the force at the closest separation.

Let us now discuss the measurement procedure of the improved experiment. The same technique was used to attach the sphere to the AFM cantilever. Then a 250 nm aluminum coating was evaporated onto the sphere and a 1 cm diameter sapphire plate. Next, both surfaces were sputter coated with a 7.9 ± 0.1 nm layer of 60% Au/40% Pd.
Thus here the Au/Pd coating was made much thinner and its thickness was precisely measured. The sphere diameter was measured using the Scanning Electron Microscope (SEM) to be 201.7 ± 0.5 μm. The rms roughness amplitude of the Al surfaces was measured using an AFM to be 3 nm. The AFM was calibrated in the same manner as reported in the last section. Next, the residual potential of the grounded sphere was measured as $V_2 = 7.9 \pm 0.8$ mV by the AC measurement technique reported earlier (a factor of 3.5 improvement over the previous experiment). Minor corrections due to the piezo hysteresis and cantilever deflection were applied as reported. To measure the Casimir force between the sphere and the flat plate, both are grounded together with the AFM. The raw data from one scan are shown in Fig. 19. Region 1 is the flexing of the cantilever resulting from the continued extension of the piezo after contact of the two surfaces. In region 2 ($a_0 + 16$ nm < surface separations < $a_0 + 516$ nm) the Casimir force is the dominant characteristic, far exceeding all systematic errors. The systematic effects are primarily from the residual electrostatic force (< 1.5% of the force at closest separation) and a linear contribution from scattered light. This linear contribution due to scattered light (and some experimental drift) can be observed and measured in region 3. A key improvement in this experiment is that the electrostatic force between the sphere and the flat plate was used to arrive at an independent and consistent measurement of $a_0$, the average surface separation on contact of the two surfaces. This was done immediately following the Casimir force measurement, without breaking the vacuum and with no lateral movement of the surfaces. The flat plate is connected to a DC voltage supply while the sphere remains grounded. The applied voltage $V_1$ in Eq. (6.4) is so chosen that the electrostatic force is > 10 times the



Fig. 19. Typical force curve as a function of the distance moved by the plate.

Fig. 20. The measured electrostatic force for an applied voltage of 0.31 V to the plate. The best-fit solid line shown leads to $a_0 = 47.5$ nm. The average over many voltages leads to $a_0 = 48.9 \pm 0.6$ nm.

Casimir force. The open squares in Fig. 20 represent the measured total force for an applied voltage of 0.31 V as a function of distance. The force results from a sum of the electrostatic force and the Casimir force. The solid line, which is a best $\chi^2$ fit to the data in Fig. 20, results in $a_0 = 47.5$ nm. This procedure was repeated for other voltages between 0.3 and 0.8 V, leading to an average value of $a_0 = 48.9 \pm 0.6$ nm (the rms deviation is 3 nm). Given the 7.9 nm Au/Pd coating on each surface, this would correspond to an average surface separation of 48.9 ± 0.6 + 15.8 nm =



Fig. 21. The measured average Casimir force as a function of plate–sphere separation is shown as squares. The error bars represent the standard deviation from 27 scans. The solid line is the theoretical Casimir force with account of roughness and finite conductivity corrections.

64.7 ± 0.6 nm for the case of the Casimir force measurement. Having presented the measurement procedure, we now turn to the results. The electrostatically determined value of $a_0$ was used to apply the systematic error corrections to the force curve of Fig. 19. Here the force curve in region 3 is fit to the function $F = F_C(\Delta a + 64.7\ \mathrm{nm}) + F_e(\Delta a + 48.9\ \mathrm{nm}) + Ca$. The first term is the Casimir force contribution to the total force in region 3 and the second term represents the electrostatic force between the sphere and the flat plate due to the residual potential difference of $V_2 = 7.9$ mV. The third term, with coefficient $C$, represents the linear coupling of scattered light from the moving plate into the diodes together with experimental drift, and corresponds to a force < 1 pN (< 1% of the forces at closest separation). The value of $C$ is determined by minimizing the $\chi^2$. The value of $C$ determined in region 3 and the electrostatic force corresponding to $V_2 = 7.9$ mV and $V_1 = 0$ are used to subtract the systematic errors from the force curve in regions 3 and 2 to obtain the measured Casimir force as $F_{C\text{-}m} = F_m - F_e - Ca$, where $F_m$ is the measured total force. Thus the measured Casimir force from region 2 has no adjustable parameters. The experiment is repeated for 27 scans and the average Casimir force measured is shown as open squares in Fig. 21. The error bars represent the standard deviation from the 27 scans at each data point. The authors report that, due to the surface roughness, the averaging procedure introduces a ± 3 nm uncertainty in the surface separation on contact of the two surfaces. The theoretical curve is shown as a solid line. The authors used a variety of statistical measures to define the precision of the Casimir force measurement. They check the accuracy of the theoretical curve over the complete region between 100 and 500 nm with $N = 441$ points (with an average of 27 measurements representing each point) with no adjustable parameters.
Given that the experimental standard deviation over this range is 7 pN from thermal noise, the experimental uncertainty is $\le 7/\sqrt{27} = 1.3$ pN, leading to a precision which is better than 1% of the largest forces measured. If one wished to consider the rms deviation of the experiment ($F_{exp}$) from the



theory ($F_{th}$), which is equal to 2.0 pN, as a measure of the precision, it is also of the order of 1% of the forces measured at the closest separation. From the above definitions, the statistical measure of the experimental precision is of the order of 1% of the forces at the closest separation. These measurements of the Casimir force using an AFM and aluminum surfaces were conclusive to a statistical precision of 1%. The second AFM experiment met all three requirements set forth by Sparnaay and noted in Section 6.1 (in the first experiment the separation distance on contact was not independently determined). However, these aluminum surfaces required the use of a thin Au/Pd coating on top. This coating could only be treated in a phenomenological manner. A more complete theoretical treatment is complicated, as nonlocal effects such as spatial dispersion need to be taken into account in the calculation of the Casimir force (see Section 5.2.3).

6.4.3. Precision measurement with gold surfaces using the AFM

This was the third in a series of precision measurements using the AFM. The other two were discussed above. The primary differences here are the use of gold surfaces and the related experimental changes. The use of a thin Au/Pd coating on top of the aluminum surface to reduce the effects of oxidation in the above two experiments prevented a complete theoretical treatment of the properties of the metal coating. Thus, it is important to use chemically inert materials such as gold for the measurement of the Casimir force. The complete dielectric properties of Au are used in the theory. Here the complete dielectric data (extending from 0.125 to 9919 eV, from Ref. [230]) along with the Drude model below 0.125 eV are used to calculate $\varepsilon(i\xi)$. In the Drude representation $\omega_p = 11.5$ eV is the plasma frequency and $\gamma$ is the relaxation frequency corresponding to 50 meV. These values of $\omega_p$ and $\gamma$ are obtained in the manner detailed in [35,229].
The temperature correction is ≪ 1% of the Casimir force for the surface separations reported here and can be neglected. The fabrication procedures had to be modified, given the different material properties of gold as compared to the aluminum coatings used previously in Refs. [41,43]. The 320 μm long AFM cantilevers were first coated with about 200 nm of aluminum to improve their thermal conductivity. This metal coating on the cantilever decreases the thermally induced noise when the AFM is operated in vacuum. Aluminum coatings are better, as applying thick gold coatings directly to these silicon nitride cantilevers led to their curling due to the mismatch in the thermal expansion coefficients. Next, polystyrene spheres were mounted on the tip of the metal coated cantilevers with Ag epoxy. A 1 cm diameter optically polished sapphire disk is used as the plate. The cantilever (with sphere) and plate were then coated with gold in an evaporator. The sphere diameter after the metal coating was measured using the Scanning Electron Microscope (SEM) to be 191.3 ± 0.5 μm. The rms roughness amplitude $A$ of the gold surface on the plate was measured using an AFM to be 1.0 ± 0.1 nm. The thickness of the gold coating was measured using the AFM to be 86.6 ± 0.6 nm. Such a coating thickness is sufficient to reproduce the properties of an infinitely thick metal for the precision reported here. To reduce the development of contact potential differences between the sphere and the plate, great care was taken to follow identical procedures in making the electrical contacts. This is necessary given the large difference in the work functions of aluminum and gold. As before, with the application of voltages $\pm V_1$ to the plate, the residual potential difference between the grounded sphere and the plate was measured to be $V_2 = 3 \pm 3$ mV. This residual potential leads to forces which are ≪ 1% of the Casimir forces at the closest separations reported here.



Fig. 22. The raw data of the force measured as a photodiode difference signal as a function of the distance moved by the plate.

To measure the Casimir force between the sphere and the flat plate, both are grounded together with the AFM. The raw data from a scan are shown in Fig. 22. Region 1 is the flexing of the cantilever resulting from the continued extension of the piezo after contact of the two surfaces. Region 2 ($a_0$ < surface separations < $a_0 + 400$ nm) clearly shows the Casimir force as a function of separation distance. The Casimir force measurement is repeated for 30 scans. The only systematic error associated with the Casimir force in these measurements is that due to the residual electrostatic force, which is less than 0.1% of the Casimir force at closest separation. For surface separations exceeding 400 nm the experimental uncertainty in the force exceeds the value of the Casimir force. The surface separation on contact, $a_0$, is a priori unknown due to the roughness of the metal surface and is determined independently as described below. A small additional correction to the separation distance results from the deflection of the cantilever in response to the attractive Casimir force. As can be observed from the schematic in Fig. 14, this leads to a decrease in the separation of the two surfaces. This “deflection correction” modifies the separation distance between the two surfaces, which is given by $a = a_0 + a_{piezo} - F_{pd}\,m$, where $a$ is the correct separation between the two surfaces, $a_{piezo}$ is the distance moved by the plate due to the voltage applied to the piezo, i.e. the horizontal axis of Fig. 22, and $F_{pd}$ is the photodiode difference signal shown along the vertical axis in Fig. 22. Here $m$ is the deflection coefficient corresponding to the rate of change of separation distance per unit photodiode difference signal (from the cantilever deflection) and is determined independently as discussed below. The slope of the line in region 1 of the force curve shown in Fig.
22 cannot be used to determine $m$, as the free movement of the sphere is prevented on contact of the two surfaces (due to the larger forces encountered here). Next, the authors used the electrostatic force between the sphere and the flat plate to arrive at an independent measurement of the constant $m$ in the deflection correction and of $a_0$, the average surface separation on contact of the two surfaces. The flat plate is connected to a DC voltage



Fig. 23. The measured electrostatic force curves for three different voltages: 0.256 V (a), 0.202 V (b), and 0.154 V (c). The rate of change of separation distance per unit photodiode difference signal, corresponding to the slope of the dashed line which connects the vertices, yields the deflection coefficient $m$.

supply while the sphere remains grounded. The applied voltage $V_1$ in Eq. (6.4) is so chosen that the electrostatic force is much greater than the Casimir force. As can be observed from Fig. 14, at the start of the force measurement the plate and the sphere are separated by a fixed distance, and the plate is moved towards the sphere in small steps with the help of the piezoelectric tube. When different voltages $V_1$ are applied to the plate, the point of contact between the plate and the sphere varies, corresponding to the different cantilever deflections. This is shown in Fig. 23 for three different applied voltages: 0.256, 0.202 and 0.154 V. The vertex in each curve identifies the contact point between sphere and plate. The deflection coefficient $m$ can be determined from the slope of the dashed line connecting the vertices. The slope corresponds to an average value of $m = 8.9 \pm 0.3$ nm per unit photodiode difference signal. The separation distance is then corrected for this cantilever deflection. Next, the surface separation on contact $a_0$ was determined from the same electrostatic force curves. As explained in Section 6.4.2, a best $\chi^2$ fit was done to the electrostatic force curves to obtain an average value of $a_0 = 31.7$ nm. The average Casimir force measured from the 30 scans is shown as open squares in Fig. 24. The theoretical curve including all the corrections is shown as a solid line. For clarity only 10% of the 2583 data points are shown in the figure. Thus, the accuracy of the theoretical curve is checked over the complete region between 62 and 350 nm with $N = 2583$ points (with an average of 30 measurements representing each point). Given that the experimental standard deviation around 62 nm is 19 pN, the experimental uncertainty is $\le 19/\sqrt{30} = 3.5$ pN, leading to a precision which is better than 1% of the largest forces measured.
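The uncertainty estimates quoted for the two experiments are standard errors of the mean: the single-scan standard deviation divided by the square root of the number of scans. The brief Python sketch below reproduces the arithmetic; it assumes statistically independent scans, and the function name is a hypothetical helper introduced here.

```python
import math


def standard_error_pn(single_scan_sigma_pn, n_scans):
    """Standard error of the mean force, ASSUMING independent scans."""
    return single_scan_sigma_pn / math.sqrt(n_scans)


# Gold surfaces: 19 pN standard deviation near 62 nm, 30 scans.
print(f"Au: {standard_error_pn(19.0, 30):.2f} pN")
# Aluminum surfaces (Section 6.4.2): 7 pN standard deviation, 27 scans.
print(f"Al: {standard_error_pn(7.0, 27):.2f} pN")
```

Because the uncertainty falls only as the inverse square root of the number of scans, halving it would require four times as many scans at the same thermal noise level.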
If one wished to consider the rms deviation of the experiment ($F_{exp}$) from the theory ($F_{th}$), $\sigma = 3.8$ pN, as a measure of the precision, it is also of the order of 1% of the forces measured at the closest separation. The



Fig. 24. The measured average Casimir force as a function of plate–sphere separation is shown as squares. For clarity only 10% of the experimental points are shown in the figure. The error bars represent the standard deviation from 30 scans. The solid line is the theoretical Casimir force with account of roughness and finite conductivity corrections.

authors note that the uncertainties of 3.8 pN measured here are larger than the 2 pN in previous AFM measurements due to the poor thermal conductivity of the cantilever resulting from the thinner metal coatings used. Thus experiments at cryogenic temperatures should substantially reduce this noise. In conclusion, the above measurement of the Casimir force with the AFM met all three criteria set forth by Sparnaay and thus is the most definitive experiment to date. The complete conductivity properties of the metal coating were used in the comparison of the theory and experiment. Here the corrections due to the surface roughness and the electrostatic force were both reduced to ≪ 1% of the Casimir force at the closest separation. The separation distance on contact of the two surfaces was determined independently of the Casimir force measurement. In this experiment the temperature corrections were less than 1% of the Casimir force at the smallest separations.

6.5. Demonstration of the nontrivial boundary properties of the Casimir force

One of the most exciting and unique aspects of the Casimir force is that its sign and value can change with changes in the geometry and topology of the boundary. However, given the difficulties in the measurement of the Casimir force, it has only now become possible to measure Casimir forces for simple deviations from the flat boundary case. The first and only experiment to date in this regard is the measurement of Casimir forces between a sphere and a plate with periodic uniaxial sinusoidal corrugations (PUSC) by Roy and Mohideen [42]. Such PUSC surfaces have been theoretically shown to exhibit a rich variety of surface



interactions [49,241,245]. Golestanian and Kardar [242] point out that such corrugated surfaces might lead to lateral forces and AC Josephson-like mechanical forces. Also, the creation of photons (the dynamic Casimir effect discussed in Sections 2.4 and 4.4) is thought to result from the lateral movement of two such corrugated surfaces. In the present case of a measurement between a large sphere and the PUSC surface, the intent was to explore the effect of diffraction and any lateral forces introduced by the uniform corrugation. Thus the force resulting from a PUSC plate can be expected to be substantially different from that resulting from a flat one. The Casimir force was measured between the large sphere and the PUSC for surface separations between 0.1 and 0.9 μm using an AFM. The amplitude of the corrugation is much smaller than the separation. Yet the measured force shows significant deviations from a perturbative theory which takes into account only the small periodic corrugation of the plate in the surface separation. The authors also compare the measured Casimir force between the same sphere and an identically coated flat plate and show that it agrees well with the same theory in the limit of zero corrugation amplitude. The authors point out that these results, considered together, demonstrate the nontrivial boundary dependence of the Casimir force.

6.5.1. Measurement of the Casimir force due to the corrugated plate

The experiment of [42] deals with the configuration of a polystyrene sphere above a 7.5 × 7.5 mm² plate with periodic uniaxial sinusoidal corrugations. Both the sphere and the plate were coated with 250 nm of Al and an 8 nm layer of Au/Pd. For the outer Au/Pd layer, transparencies greater than 90% were measured at the characteristic frequencies contributing to the Casimir force. The diameter of the sphere was $2R = 194.6 \pm 0.5$ μm.
The surface of the corrugated plate is described by the function

$$z_s(x,y) = A\sin\frac{2\pi x}{L}, \tag{6.19}$$

where the amplitude of the corrugation is A = 59.4 ± 2.5 nm and its period is L = 1.1 μm. The mean amplitude of the stochastic roughness was $A_p = 4.7$ nm on the corrugated plate and $A_s = 5$ nm on the sphere bottom. As is seen from Eq. (6.19), the origin of the z-axis is taken such that the mean value of the corrugation is zero. The separation between the zero corrugation level and the sphere bottom is given by a. The minimum value of a is given by $a_0 \approx A + A_p + A_s + 2h \approx 130$ nm, where h ≈ 30 nm is the height of the highest occasional rare Al crystals which prevent intimate contact between the sphere bottom and the maximum point of the corrugation.

As in the standard measurement of Casimir forces, the boundary dependence of the Casimir force can be easily obscured by errors in the measurement of the surface separation. To eliminate this ambiguity, the authors used the electrostatic method described in Section 6.4.2 to determine the surface separation independently and to establish procedures for consistent comparison with theory. The electrostatic force between the sphere and the PUSC surface is given by

$$F_e = -\frac{(V_1 - V_2)^2}{4(a_0 + \Delta a)}\,R \sum_{m=0}^{\infty} D_m \left(\frac{A}{a_0 + \Delta a}\right)^{m}, \tag{6.20}$$

where Δa is the distance between the surfaces measured from contact and, as before, $a_0$ is the true average separation on contact of the two surfaces due to the periodic corrugation and stochastic roughness of the aluminum coating (note that $a = a_0 + \Delta a$). The nonzero even power coefficients in Eq. (6.20) are $D_0 = 1$, $D_2 = 1/2$, $D_4 = 3/8$, $D_6 = 5/16$, ... . $V_1$ and $V_2$ are the voltages on the corrugated plate and sphere, respectively. The above expression was obtained using the Proximity Force Theorem, starting from the electrostatic energy between parallel flat plates.

Fig. 25. A typical force curve as a function of the distance moved by the plate. The “0” distance stands for the point of contact and does not take into account the amplitude of the corrugation and the roughness of the metallic coating.

First the residual potential of the grounded sphere was measured. The sphere is grounded and the electrostatic force between the sphere and the corrugated plate was measured for four different voltages and five different surface separations $a \gg A$. With Eq. (6.20), from the difference in force for voltages $+V_1$ and $-V_1$ applied to the corrugated plate, one can measure the residual potential on the grounded sphere $V_2$ as 14.9 mV. This residual potential is a contact potential that arises from the different materials used to fabricate the sphere and the corrugated plate.

As previously done, to measure the Casimir force between the sphere and the corrugated plate they are both grounded together with the AFM. The plate is then moved towards the sphere in 3.6 nm steps and the corresponding photodiode difference signal is measured. The signal obtained for a typical scan is shown in Fig. 25. Here “0” separation stands for contact of the sphere and corrugated plate surfaces, i.e., Δa = 0. It does not take into account $a_0$. Region 1 can be used to subtract the minor (< 1%) experimental systematic due to scattered laser light without biasing the results in region 2. In region 2 (absolute separations between contact and 450 nm) the Casimir force is the dominant characteristic, far exceeding all systematic errors (the electrostatic force is < 2% of the peak Casimir force). Region 3 is the flexing of the cantilever resulting from the continued extension of the piezo after contact of the two surfaces.
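The coefficients quoted for Eq. (6.20) coincide with the period averages of $\sin^m$ (1, 1/2, 3/8, 5/16 for m = 0, 2, 4, 6), which is what one expects when the parallel-plate result, proportional to $1/(a - A\sin(2\pi x/L))$, is averaged over one corrugation period. The following Python sketch assumes that this identification holds for all m; the function names are illustrative and not taken from [42]. Under that assumption the series sums to $1/\sqrt{1-(A/a)^2}$, which the sketch checks numerically.

```python
import math


def corrugation_coefficient(m: int) -> float:
    """Period average of sin(t)**m: zero for odd m, C(m, m/2) / 2**m for even m.

    Assumption: the D_m of Eq. (6.20) are these averages; this reproduces the
    quoted values D_0 = 1, D_2 = 1/2, D_4 = 3/8, D_6 = 5/16.
    """
    if m % 2:
        return 0.0
    return math.comb(m, m // 2) / 2 ** m


def series_factor(x: float, terms: int = 200) -> float:
    """Partial sum of sum_m D_m * x**m, with x = A / (a_0 + delta_a) < 1."""
    return sum(corrugation_coefficient(m) * x ** m for m in range(terms))


def closed_form_factor(x: float) -> float:
    """Closed form of the same series: average of 1 / (1 - x sin t) over a period."""
    return 1.0 / math.sqrt(1.0 - x * x)


if __name__ == "__main__":
    amplitude_nm = 59.4   # corrugation amplitude A from the text
    contact_nm = 132.0    # electrostatically determined a_0 from the text
    x = amplitude_nm / contact_nm
    print(series_factor(x), closed_form_factor(x))
```

At contact (Δa = 0) the corrugation therefore enhances the electrostatic force by a factor of about 1.12 relative to a flat plate at the same mean separation, and the enhancement decays quickly as Δa grows.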
Next, by applying a DC voltage between the corrugated plate and sphere, an independent and consistent measurement of $a_0$, the average surface separation on contact of the two surfaces, is obtained. Here the procedure outlined in Section 6.4.2 is followed. The applied voltage $V_1$ in Eq. (6.20) is chosen such that the electrostatic force is > 20 times the Casimir force. The experiment is repeated for other voltages between 0.4 and 0.7 V, leading to an average value of $a_0 = 132 \pm 5$ nm. Given the 8 nm Au/Pd coating on each surface, this would correspond to an average surface separation of 132 ± 5 + 8 + 8 nm = 148 ± 5 nm for the case of the Casimir force measurement.

Fig. 26. The measured Casimir force F(a) as a function of the surface separation in the configuration of a sphere above a corrugated disk is shown by solid circles. The error bars represent the standard deviation from 15 scans. Curve 1 represents the computational results obtained with the uniform probability distribution; curve 2, with the distribution in which the sphere is located with equal probability above the convex part of the corrugations; curve 3, with a distribution which increases linearly when the sphere approaches the points of stable equilibrium; and curve 4, for a sphere situated above the points of stable equilibrium.

The electrostatically determined value of $a_0$ can now be used to apply the systematic error corrections to the force curve considered, as above, in three regions. The force curve in region 1 is fit to a function $F = F_C(\Delta a + 148) + F_e(\Delta a + 132) + Ca$. The first term is the Casimir force contribution to the total force in region 1. The second term represents the electrostatic force between the sphere and corrugated plate as given by Eq. (6.20). The third term, with coefficient C, represents the linear coupling of scattered light from the moving plate into the diodes and corresponds to a force < 1 pN (a < 1% effect). Here again the difference in $a_0$ between the electrostatic term and the Casimir force is due to the 8 nm Au/Pd coating on each surface. The value of C is determined by minimizing $\chi^2$. The value of C determined in region 1 and the electrostatic force corresponding to $V_2 = 14.9$ mV and $V_1 = 0$ in Eq. (6.20) are used to subtract the systematic errors from the force curve in regions 1 and 2 to obtain the measured Casimir force as $F_{C\text{-}m} = F_m - F_e - Ca$, where $F_m$ is the measured total force. Thus the measured Casimir force from region 2 has no adjustable parameters. The experiment was repeated for 15 scans and the average measured Casimir force is presented as solid circles in Fig. 26 together with the experimental uncertainties shown by the error bars. To gain a better understanding of the accord between theory and experiment, the distance range a ≤ 400 nm, where the Casimir force is measured with higher accuracy, is considered here.
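The separation bookkeeping used in this subsection can be restated as a short Python sketch. It only reproduces the arithmetic quoted in the text: the geometric estimate $a_0 \approx A + A_p + A_s + 2h$ and the 8 nm Au/Pd offset added on each surface for the Casimir force. The function and constant names are illustrative.

```python
# All lengths in nanometres, values as quoted in the text.
CORRUGATION_AMPLITUDE = 59.4   # A
PLATE_ROUGHNESS = 4.7          # A_p
SPHERE_ROUGHNESS = 5.0         # A_s
CRYSTAL_HEIGHT = 30.0          # h, height of the rare Al crystals
COATING_THICKNESS = 8.0        # Au/Pd layer on each surface


def estimated_contact_separation() -> float:
    """Geometric estimate a_0 = A + A_p + A_s + 2h."""
    return (CORRUGATION_AMPLITUDE + PLATE_ROUGHNESS
            + SPHERE_ROUGHNESS + 2.0 * CRYSTAL_HEIGHT)


def casimir_contact_separation(a0_electrostatic: float) -> float:
    """Contact separation entering the Casimir term: a_0 plus both coatings."""
    return a0_electrostatic + 2.0 * COATING_THICKNESS


def absolute_separations(delta_a: float, a0_electrostatic: float = 132.0):
    """Return (electrostatic, Casimir) separations for a piezo distance delta_a."""
    return (a0_electrostatic + delta_a,
            casimir_contact_separation(a0_electrostatic) + delta_a)


if __name__ == "__main__":
    print(estimated_contact_separation())      # close to the quoted ~130 nm
    print(casimir_contact_separation(132.0))   # 148 nm, as in the fit function
```

The geometric estimate of about 129 nm lies within the ±5 nm uncertainty of the electrostatically measured 132 nm, which is the consistency the text relies on.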



As is seen from the next subsection, there is a significant deviation between the measured force and the perturbative theory which takes into account the periodic corrugations of the plate in the surface separation. The experiment and analysis were repeated in [42] for the same sphere and an identically coated flat plate (i.e., without corrugations). The average measured Casimir force from 15 scans shows good agreement with the perturbation theory of Section 6.4.1. The reasons for this are discussed below.

6.5.2. Possible explanation of the nontrivial boundary dependence of the Casimir force

In Ref. [276] perturbative calculations of both the vertical and the lateral Casimir force acting in the configuration of a sphere situated above a corrugated plate are presented (note that the lateral force arises due to the absence of translational symmetry on a plate with corrugations). The study of [276] revealed that a lateral force acts upon the sphere (in contrast to the usual vertical Casimir force) in such a way that it tends to change the bottom position of the sphere in the direction of the nearest maximum point of the vertical Casimir force (which coincides with the maximum point of the corrugations). In consequence of this, the assumption of a simple perturbation theory that the locations of the sphere above different points of a corrugated surface are equally probable can be violated. As indicated in [276], the diverse assumptions on the probability distribution describing the location of the sphere above different points of the plate result in an essential change in the force–distance relation. Thus the perturbation theory taking the lateral force into account may work for the case of a corrugated plate.

Let us take up first the vertical Casimir force acting between a corrugated plate and a sphere. A sinusoidal corrugation of Eq. (6.19) leads to the modification of the Casimir force between a flat plate and a sphere.
The modified force can be calculated by averaging over the period

$$F(a) = \int_0^L dx\,\rho(x)\,F^{C}_{d\ell}\bigl(d(a,x)\bigr). \tag{6.21}$$

Here d(a, x) is the separation between the sphere (lens) bottom and the point x on the surface of the corrugated plate (disk)

$$d(a,x) = a - A_p - A_s - A\sin\frac{2\pi x}{L}. \tag{6.22}$$

$F^{C}_{d\ell}$ is the Casimir force acting between a flat plate and a sphere with account of corrections due to finite conductivity of the boundary metal given by Eq. (5.65). The quantity ρ(x) from Eq. (6.21) describes the probability distribution of the sphere positions above different points x belonging to one corrugation period. If the plate corrugation is taken into account in the surface separation only, then the uniform distribution is assumed (ρ(x) = 1/L), which is to say that the sphere is located above all points x belonging to the interval 0 < x < L with equal probability. The right-hand side of Eq. (6.21) can be expanded in powers of the small parameter $A/(a - A_p - A_s)$. This expansion had shown significant deviations from the measured data of [42], while Eq. (5.65) is in excellent agreement with data for all $d > \lambda_p$ in the limit of zero amplitude of corrugation.

Now we turn to the lateral projection of the Casimir force. The lateral projection is nonzero only in the case of nonzero corrugation amplitude. Let us find it in the simplest case of the ideal metal.
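The averaging in Eq. (6.21) can be sketched numerically. The Python example below is a minimal illustration under a stated assumption: instead of the finite-conductivity force of Eq. (5.65), which is not reproduced in this section, it uses the ideal-metal sphere-plate force $-\pi^3\hbar c R/(360 d^3)$ as a stand-in, so the absolute numbers are not those of Fig. 26. Besides the uniform distribution it also encodes the two non-uniform distributions considered later in this subsection, Eqs. (6.28) and (6.29), restricted to one period. All function names are hypothetical.

```python
import math

HBAR_C = 3.16152677e-26      # hbar * c in J m
R = 97.3e-6                  # sphere radius, m (2R = 194.6 um)
L = 1.1e-6                   # corrugation period, m
A = 59.4e-9                  # corrugation amplitude, m
AP, AS = 4.7e-9, 5.0e-9      # stochastic roughness amplitudes, m


def ideal_force(d: float) -> float:
    """Stand-in for F^C_dl: ideal-metal sphere-plate Casimir force (assumption)."""
    return -math.pi ** 3 * HBAR_C * R / (360.0 * d ** 3)


def separation(a: float, x: float) -> float:
    """Eq. (6.22): local separation above the point x of the corrugated plate."""
    return a - AP - AS - A * math.sin(2.0 * math.pi * x / L)


def rho_uniform(x: float) -> float:
    return 1.0 / L


def rho_convex(x: float) -> float:
    """Eq. (6.28): equal probability above the convex half period only."""
    return 2.0 / L if (x % L) <= L / 2 else 0.0


def rho_linear(x: float) -> float:
    """Eq. (6.29): density rising linearly towards the maximum at L/4."""
    s = x % L
    if s <= L / 4:
        return 16.0 * s / L ** 2
    if s <= L / 2:
        return 16.0 * (L / 2 - s) / L ** 2
    return 0.0


def integrate_over_period(func, n: int = 20000) -> float:
    """Midpoint rule on [0, L]; n divisible by 4 keeps the kinks on grid points."""
    h = L / n
    return sum(func((i + 0.5) * h) for i in range(n)) * h


def normalization(rho) -> float:
    return integrate_over_period(rho)


def averaged_force(a: float, rho) -> float:
    """Eq. (6.21) with the stand-in force law."""
    return integrate_over_period(lambda x: rho(x) * ideal_force(separation(a, x)))


if __name__ == "__main__":
    a = 200e-9
    for name, rho in (("uniform", rho_uniform), ("convex", rho_convex),
                      ("linear", rho_linear)):
        print(name, normalization(rho), averaged_force(a, rho))
    print("peak", ideal_force(separation(a, L / 4)))
```

With this stand-in force the magnitude of the averaged force grows in the order uniform, convex half period, linear, sphere fixed above the maximum, which mirrors the ordering of the assumptions behind curves 1 to 4; for the uniform case the result agrees with the leading perturbative correction $1 + 3\epsilon^2$, $\epsilon = A/(a - A_p - A_s)$, when ε is small.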



This can be achieved by applying the additive summation method of the retarded interatomic potentials over the volumes of a corrugated plate and a sphere with subsequent normalization of the interaction constant. Alternatively the same result is obtainable by the Proximity Force Theorem. Let an atom of the sphere be situated at a point with the coordinates $(x_A, y_A, z_A)$ in the coordinate system described above. Integrating the interatomic potential $U = -C/r_{12}^{7}$ over the volume of the corrugated plate ($r_{12}$ is the distance between this atom and the atoms of the plate) and calculating the lateral force projection according to $-\partial U/\partial x_A$ one obtains [247,277]

$$F_x^{(A)}(x_A,y_A,z_A) = \frac{4\pi^2 N_p C}{5 z_A^5}\,\frac{A}{z_A}\,\frac{z_A}{L}\left[\cos\frac{2\pi x_A}{L} + \frac{5}{2}\,\frac{A}{z_A}\sin\frac{4\pi x_A}{L}\right], \tag{6.23}$$

where $N_p$ is the atomic density of the corrugated plate. Eq. (6.23) is obtained by perturbation expansion of the integral (up to second order) in the small parameter $A/z_A$.

We can represent $x_A = x_0 + x$, $y_A = y_0 + y$, $z_A = z_0 + z$, where $(x_0, y_0, z_0)$ are the coordinates of the sphere bottom in the above coordinate system, and (x, y, z) are the coordinates of the sphere atom in relation to the sphere bottom. The lateral Casimir force acting upon the sphere is calculated by integration of (6.23) over the sphere volume and subsequent division by the normalization factor $K = 24 C N_p N_s/(\pi\hbar c)$ obtained by comparison of additive and exact results for the configuration of two plane parallel plates (see Section 4.3)

$$F_x(x_0,y_0,z_0) = \frac{N_s}{K}\int_{V_s} d^3r\,F_x^{(A)}(x_0+x,\,y_0+y,\,z_0+z), \tag{6.24}$$

where $N_s$ is the atomic density of the sphere metal. Let us substitute Eq. (6.23) into Eq. (6.24), neglecting the small contribution of the upper semisphere, which is of order $z_0/R < 4\times10^{-3}$ compared to unity. In a cylindrical coordinate system the lateral force acting upon the sphere rearranges to the form

$$F_x(x_0,y_0,z_0) = \frac{\pi^3\hbar c}{30}\,\frac{A}{L}\left[\cos\frac{2\pi x_0}{L}\int_0^R \rho\,d\rho\int_0^{R-\sqrt{R^2-\rho^2}}\frac{dz}{(z_0+z)^5}\int_0^{2\pi} d\varphi\,\cos\!\left(\frac{2\pi\rho}{L}\cos\varphi\right) + \frac{5}{2}\,A\sin\frac{4\pi x_0}{L}\int_0^R \rho\,d\rho\int_0^{R-\sqrt{R^2-\rho^2}}\frac{dz}{(z_0+z)^6}\int_0^{2\pi} d\varphi\,\cos\!\left(\frac{4\pi\rho}{L}\cos\varphi\right)\right]. \tag{6.25}$$

Using the standard formulas from [226], the integrals with respect to φ and z are taken explicitly. Preserving only the lowest order terms in the small parameter $x_0/R < 10^{-2}$ we arrive at

$$F_x(x_0,y_0,z_0) = -\frac{\pi^4\hbar c}{60 z_0^4}\,\frac{A}{L}\left[\cos\frac{2\pi x_0}{L}\int_0^R \rho\,d\rho\,J_0\!\left(\frac{2\pi\rho}{L}\right) + 2\,\frac{A}{z_0}\sin\frac{4\pi x_0}{L}\int_0^R \rho\,d\rho\,J_0\!\left(\frac{4\pi\rho}{L}\right)\right], \tag{6.26}$$

where $J_n(z)$ is the Bessel function.
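The position dependence of the atom-plate lateral force in Eq. (6.23) is carried entirely by the square bracket. By the double-angle identity this bracket equals $\cos u\,(1 + 5(A/z_A)\sin u)$ with $u = 2\pi x_A/L$, so within the perturbative regime $A/z_A < 1/5$ its sign is that of $\cos u$. The Python sketch below evaluates only this dimensionless bracket (the positive prefactor is omitted) and is meant as an illustration of the sign structure, not as the calculation of [276]; the function names are hypothetical.

```python
import math


def lateral_bracket(x: float, period: float, amp_over_z: float) -> float:
    """Square bracket of Eq. (6.23): cos(2 pi x/L) + (5/2)(A/z_A) sin(4 pi x/L)."""
    u = 2.0 * math.pi * x / period
    return math.cos(u) + 2.5 * amp_over_z * math.sin(2.0 * u)


def lateral_bracket_factored(x: float, period: float, amp_over_z: float) -> float:
    """Same quantity written as cos(u) * (1 + 5 (A/z_A) sin(u))."""
    u = 2.0 * math.pi * x / period
    return math.cos(u) * (1.0 + 5.0 * amp_over_z * math.sin(u))


if __name__ == "__main__":
    L = 1.1e-6          # corrugation period from the text, m
    ratio = 0.15        # illustrative A / z_A inside the perturbative regime
    for frac in (0.0, 0.125, 0.25, 0.375, 0.5, 0.75):
        print(frac, lateral_bracket(frac * L, L, ratio))
```

The bracket vanishes at x = L/4 and x = 3L/4, is positive just to the left of L/4 and negative just to the right of it, so the force pushes the atom towards the corrugation maximum from both sides.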


Integrating in ρ, the final result is obtained [276]

$$F_x(x_0,y_0,z_0) = 3F^{(0)}_{d\ell}(z_0)\left[\frac{A}{z_0}\cos\frac{2\pi x_0}{L}\,J_1\!\left(\frac{2\pi R}{L}\right) + \left(\frac{A}{z_0}\right)^{2}\sin\frac{4\pi x_0}{L}\,J_1\!\left(\frac{4\pi R}{L}\right)\right], \tag{6.27}$$

where the vertical Casimir force $F^{(0)}_{d\ell}$ for ideal metal was defined in Eq. (4.108).

As is seen from Eq. (6.27), the lateral Casimir force takes zero value at the extremum points of the corrugation described by Eq. (6.19). The lateral force achieves its maximum at the points $x_0 = 0,\ L/2$ where the corrugation function is zero. If the sphere is situated to the left of the point $x_0 = L/4$ (maximum of corrugation) it experiences a positive lateral Casimir force. If it is situated to the right of $x_0 = L/4$ the lateral Casimir force is negative. In both cases the sphere tends to change its position in the direction of a corrugation maximum, which is the position of stable equilibrium. The situation here is the same as for an atom near a wall covered by large-scale roughness [247]. That is the reason why the different points of a corrugated plate are not equivalent and the assumption that the locations of the sphere above them are described by the uniform probability distribution may be too simplistic.

On this basis, one may suppose that the probability distribution under consideration is given by

$$\rho(x) = \begin{cases} \dfrac{2}{L}, & kL \le x \le \left(k+\tfrac12\right)L, \\[2mm] 0, & \left(k+\tfrac12\right)L \le x \le (k+1)L, \end{cases} \tag{6.28}$$

where k = 0, 1, 2, ... . This would mean that in the course of the measurements the sphere is located with equal probability above different points of the convex part of the corrugation but cannot be located above the concave one. It is even more reasonable to suppose that the function ρ increases linearly when the sphere approaches the points of stable equilibrium. In this case the functional dependence is given by

$$\rho(x) = \begin{cases} \dfrac{16}{L^2}\,x, & kL \le x \le \left(k+\tfrac14\right)L, \\[2mm] \dfrac{16}{L^2}\left(\dfrac{L}{2}-x\right), & \left(k+\tfrac14\right)L \le x \le \left(k+\tfrac12\right)L, \\[2mm] 0, & \left(k+\tfrac12\right)L \le x \le (k+1)L. \end{cases} \tag{6.29}$$

By way of example, in Fig. 26 the theoretical results computed by Eq. (6.21) and shown by curves 1 (uniform distribution), 2 (distribution of Eq. (6.28)), 3 (distribution of Eq. (6.29)), and 4 (the bottom of the sphere is directly over the maximum of corrugation at all times) are compared with the measured Casimir force. It is seen that curve 3 is in agreement with the experimental data within the given uncertainties ΔF = 5 pN, Δa = 5 nm. The root mean square average deviation between theory and experiment within the range 169.5 nm ≤ a ≤ 400 nm (62 experimental points), where the perturbation theory is applicable, is σ = 20.28 pN for curve 1, σ = 8.92 pN (curve 2), σ = 4.73 pN (curve 3), and σ = 9.17 pN (curve 4). By this means perturbation theory with account of the lateral Casimir force can be made consistent with experimental data and might explain the observed nontrivial boundary dependence of the Casimir



force [276]. The complete solution of the problem may be achieved with an experiment where both the vertical and the lateral Casimir forces are measured.

6.6. The outlook for measurements of the Casimir force

With the advent of sensitive force detection techniques such as phase sensitive detection and the AFM, the measurement of the Casimir force has become conclusive. However, the present measurements are only sensitive to forces at surface separations above 32 nm and below 1000 nm. The importance of these two separation limits and the possibility for their improvement will be discussed next.

First, there is a clear need to measure Casimir (van der Waals) forces at smaller and smaller surface separations given the possible presence of compactified dimensions as postulated by modern unified theories. The measurement of the Casimir force at separations above 1 μm is also of great interest as a test for some predictions of supersymmetry and string theory (see Section 7). Such measurements would substantially improve the limits on the coupling constants of these hypothetical forces. In order to make such measurements at smaller surface separations, one needs to use smoother metal coatings. Atomic layer-by-layer growth of metal coatings by technologies like Molecular Beam Epitaxy might be best suited. Even here, single atomic lattice steps of size ±0.5 nm are unavoidable. So the smallest separation distance that can possibly be achieved is on the order of 1–2 nm. In this regard, a template-stripped method for the growth of smooth gold surfaces for use in Casimir force measurements has recently been reported [278]. As a result, an rms roughness of 0.4 nm with pits of height 3–4 nm was achieved. Here a separation distance of 20 nm is estimated for the contact of the two surfaces, and an rms deviation of 1% of the Casimir force at the closest separation distance was reported.
However, in this particular case a hydrocarbon coating was necessary on top of the gold, which might complicate the theoretical analysis and the independent measurement of the residual electrostatic force. Also, a substantial deformation of the gold coating appears to have prevented an independent and exact determination of the surface separation [278]. Probably both the hydrocarbon layer and the deformation can be eliminated in future experiments, leading to surface separations of 5–10 nm on contact of the two surfaces.

Given the implications of the Casimir force measurements for the detection of new forces in the submillimeter distance scale, it is increasingly important that consistent values of the experimental precision be reported. It should be noted that such precision can only be claimed if an independent measurement of all the parameters, particularly the surface separation on contact of the two surfaces, is performed. All ambiguities such as thin protective coatings should preferably be avoided. An independent measurement of the systematics, such as the electrostatic force between the two surfaces, is also necessary. A value for the roughness of the surface needs to be given in order to estimate the roughness correction. If all the above are indeed provided, then one can devise methods to unambiguously measure the deviation from the theoretical values of the Casimir force. A reasonable method is to report, at each and every separation distance, the deviation of the average value of the force from the theoretical value at an appropriate confidence level together with the absolute error of the experimental measurement. One also needs to measure the errors introduced by the uncertainties in the measurement of the surface separation.



The second case of extending Casimir force measurements beyond 1000 nm is also vitally important in order to measure the finite temperature corrections to the Casimir force. The temperature corrections make a significant contribution to the Casimir force only for separations exceeding a micrometer. Such measurements are needed given the controversies surrounding the theoretical treatments of the Casimir force between real metals (explained in Section 5.4.2). It appears that techniques similar to the AFM would be best suited to increase the sensitivity. In the case of the AFM, the authors propose to increase the sensitivity by (a) lithographic fabrication of cantilevers with large radius of curvature and (b) interferometric detection of cantilever deflection. Lower temperatures can also be used to reduce the thermal noise. This would, however, also reduce the temperature corrections. In addition, a dynamic measurement might lead to a substantial increase in the sensitivity.

With regard to the boundary dependences, only one demonstration experiment has so far been done. Other boundaries, for example with cubical and spherical cavities, might be feasible with the sensitivity available with the AFM technique. The influence of small material effects, such as the polarization dependence of the material properties, on the Casimir force might also be measurable in the future.

7. Constraints for non-Newtonian gravity from the Casimir effect

According to the predictions of unified gauge theories, supersymmetry, supergravity, and string theory, there would exist a number of light and massless elementary particles (for example, the axion, scalar axion, graviphoton, dilaton, arion, and others [279]). The exchange of such particles between two atoms gives rise to an interatomic force described by Yukawa or power-law effective potentials. The interaction range of this force is to be considered from 1 Å to cosmic scales.
Because of this, it is called a “long-range force” (in comparison with the nuclear size). The long-range hypothetical force, or “fifth force” as it is often referred to [280], may be considered as some specific correction to the Newtonian gravitational interaction [281]. That is why the experimental constraints for this force are known also as constraints for non-Newtonian gravity [282]. Constraints for the constants of hypothetical long-range interactions, or on non-Newtonian gravity, are obtainable from Galileo-, Eötvös-, and Cavendish-type experiments. The gravitational experiments lead to rather strong constraints over a distance range $10^{-2}$ m $< \lambda < 10^{6}$ km [283]. In the submillimeter range, however, no constraints on hypothetical long-range interactions follow from the gravitational experiments, and the Newtonian law is not experimentally confirmed in this range.

The pioneering studies in applying the Casimir force measurements to the problem of long-range interactions were made in [284–288]. There it was shown that the Casimir effect leads to the strongest constraints on the constants of Yukawa-type interactions with a range of action of $10^{-8}$ m $< \lambda < 10^{-4}$ m (see also [289]). This means that the Casimir effect becomes a new nonaccelerator test for the search of hypothetical forces and associated light and massless elementary particles.

Tests of this type take on great significance in the light of the exciting new ideas that the gravitational and gauge interactions may become unified at the weak scale (see, e.g., [290]). As a consequence, there should exist extra spatial dimensions compactified at a relatively large scale of $10^{-3}$ m or less. Also, the Newtonian gravitational law acquires



Yukawa-type corrections in the submillimeter range [291,292] like those predicted earlier from other considerations. These corrections can be constrained and even discovered in experiments on precision measurements of the Casimir force.

Below we demonstrate the application of the Casimir force measurements for obtaining stronger constraints on the parameters of hypothetical long-range interactions and light elementary particles, starting from historical experiments and finishing with the most modern ones.

7.1. Constraints from the experiments with dielectric test bodies

As was noted in Section 6.2, the Casimir force measurements between the dielectric disk and spherical lens were not very precise, and some important factors were not taken into account. Although the final accuracy of the force measurement was not calculated on a solid basis in those experiments, it may be estimated liberally as ∼ 10% in the separation distance range 0.1 μm < a < 1 μm (an uncertainty of this value by a factor of about 2–5 is implied, which is not important for the discussion below). The Casimir force acting between a disk (plate) and a lens of curvature radius R is given by Eq. (4.115), which involves the dielectric constant of the lens and plate material.

The effective gravitational interaction between two atoms including an additional Yukawa-type term is given by

$$V(r_{12}) = -\frac{G M_1 M_2}{r_{12}}\left(1 + \alpha_G\,e^{-r_{12}/\lambda}\right). \tag{7.1}$$

Here $M_{1,2}$ are the masses of the atoms, $r_{12}$ is the distance between them, G is the Newtonian gravitational constant, $\alpha_G$ is a dimensionless constant of the hypothetical interaction, and λ is the interaction range. In the case that the Yukawa-type interaction is mediated by a light particle of mass m, the interaction range is given by the Compton wavelength of this particle, so that $\lambda = \hbar/(mc)$. Let the thickness of the dielectric plate be D and that of the lens H.
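To give a feeling for the scales in Eq. (7.1), the following Python sketch converts an interaction range λ into the rest energy of the mediating particle through $\lambda = \hbar/(mc)$ and evaluates the ratio of the Yukawa-corrected potential to the Newtonian one. The function names are illustrative, and the value of $\alpha_G$ in the usage lines is an arbitrary placeholder, not a constraint taken from the text.

```python
import math

HBAR_C_EV_M = 1.973269804e-7   # hbar * c in eV m


def mediator_rest_energy_ev(interaction_range_m: float) -> float:
    """Rest energy m c^2 (in eV) of a particle whose Compton wavelength is lambda."""
    return HBAR_C_EV_M / interaction_range_m


def yukawa_to_newton_ratio(r12_m: float, alpha_g: float, lam_m: float) -> float:
    """V(r12) / V_Newton(r12) = 1 + alpha_G * exp(-r12 / lambda), from Eq. (7.1)."""
    return 1.0 + alpha_g * math.exp(-r12_m / lam_m)


if __name__ == "__main__":
    for lam in (1e-8, 1e-6, 1e-4):
        print(lam, mediator_rest_energy_ev(lam))
    # Placeholder coupling: the correction is exponentially cut off for r12 >> lambda.
    print(yukawa_to_newton_ratio(1e-6, 1e10, 1e-6))
    print(yukawa_to_newton_ratio(1.0, 1e10, 1e-6))
```

The range $10^{-8}$ m $< \lambda < 10^{-4}$ m probed by Casimir force measurements thus corresponds to mediator rest energies between roughly 2 meV and 20 eV, and at distances much larger than λ even a very large $\alpha_G$ leaves the Newtonian law unchanged.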
It is easily seen [48] that the gravitational force acting between a lens and a plate is several orders of magnitude smaller than the Casimir force and can be neglected (see also Section 7.3). The hypothetical force can be computed in the following way. We first find the interaction potential between the whole plate and one atom of the lens located at a height l above the plate center

$$v(l) = -2\pi G M_1 M_2\,\alpha_G N \lambda^2\,e^{-l/\lambda}\left(1 - e^{-D/\lambda}\right), \tag{7.2}$$

where N is the number of atoms per unit plate and lens volume, and $a \le l \le a + H$. The density of atoms in a thin horizontal section of the lens at a height $l \ge a$ is given by

$$\sigma(l) = \pi N\left[2R(l-a) - (l-a)^2\right]. \tag{7.3}$$

The interaction potential between a lens and a plate is found by integration of Eq. (7.2) weighted with (7.3) in the limits a and a + H. Then the hypothetical force is obtained by differentiating with respect to a. The result is

$$F^{\mathrm{hyp}}(a) = -4\pi^2 G\rho^2\lambda^3 R\,\alpha_G\left(1 - e^{-D/\lambda}\right)e^{-a/\lambda}\left[1 - \frac{\lambda}{R} + e^{-H/\lambda}\left(\frac{\lambda}{R} - 1 + \frac{H}{R} + \frac{H^2}{2R\lambda} - \frac{H}{\lambda}\right)\right], \tag{7.4}$$



Fig. 27. Constraints on the constants of Yukawa-type hypothetical interaction following from the measurement of the Casimir (curve 2) and van der Waals (curve 3) forces. Curve 1 represents the constraints following from one of the Cavendish-type experiments.

where ρ is the density of the plate and sphere material. In the above calculations the inequalities $L, R, H, D \gg a$ were used, based on the experimental configuration under consideration. For λ belonging to the submillimeter range it is also valid that $D, H, R \gg \lambda$, so that Eq. (7.4) can be substantially simplified

$$F^{\mathrm{hyp}}(a) = -4\pi^2 G\rho^2\lambda^3 R\,\alpha_G\,e^{-a/\lambda}. \tag{7.5}$$

The constraints on the constants of the hypothetical long-range interaction $\alpha_G$ and λ were obtained from the condition [285,287]

$$\left|F^{\mathrm{hyp}}(a)\right| < \frac{\eta}{100\%}\left|F_{d\ell}(a)\right|, \tag{7.6}$$

where η ∼ 10% is the relative accuracy of the force measurements estimated above and the Casimir force $F_{d\ell}(a)$ is given by Eq. (4.115). This condition has the meaning that the hypothetical force was not observed within the limits of experimental accuracy.

The constraints following from inequality (7.6) are shown by curve 2 in Fig. 27, drawn on a logarithmic scale. The regions below the curves in this figure are permitted by the results of force measurements, and those above the curves are prohibited. Curve 2 was the strongest result restricting the value of the hypothetical force in the range $10^{-8}$ m $< \lambda < 10^{-4}$ m until the new measurements of the Casimir force between metallic surfaces, which were started in 1997. For slightly larger λ the best constraints on $\alpha_G$, λ follow from the Cavendish-type experiment of Ref. [293] (see curve 1 in Fig. 27). Note that in Refs. [266,294,295] the constraints were obtained based on the search for displacements induced on a micromechanical resonator in the presence of a dynamic gravitational field (see Section 6.2.5). For the moment, these constraints are less stringent than those from the Casimir force measurements between dielectrics and from the Cavendish-type experiment of [293].
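The passage from Eq. (7.4) to Eq. (7.5) can be checked numerically. The Python sketch below implements both expressions in SI units and, as an independent cross-check, compares the square bracket of Eq. (7.4) with a direct midpoint-rule evaluation of $\int_0^H e^{-t/\lambda}(2Rt - t^2)\,dt/(2R\lambda^2)$, which is what the integration of (7.2) weighted with (7.3) produces. The geometry and density values in the usage lines are illustrative assumptions (centimeter-scale lens and plate, a density typical of glass), not the parameters of a specific dielectric experiment, and the function names are hypothetical.

```python
import math

G_NEWTON = 6.674e-11   # m^3 kg^-1 s^-2


def bracket(lam: float, R: float, H: float) -> float:
    """Square bracket of Eq. (7.4)."""
    return (1.0 - lam / R
            + math.exp(-H / lam) * (lam / R - 1.0 + H / R
                                    + H * H / (2.0 * R * lam) - H / lam))


def bracket_numeric(lam: float, R: float, H: float, n: int = 20000) -> float:
    """Midpoint-rule value of int_0^H exp(-t/lam) (2 R t - t^2) dt / (2 R lam^2)."""
    h = H / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += math.exp(-t / lam) * (2.0 * R * t - t * t)
    return total * h / (2.0 * R * lam * lam)


def force_full(a, lam, alpha_g, rho, R, H, D):
    """Eq. (7.4): Yukawa-type force between a plate and a lens, in newtons."""
    prefactor = -4.0 * math.pi ** 2 * G_NEWTON * rho ** 2 * lam ** 3 * R * alpha_g
    return prefactor * (1.0 - math.exp(-D / lam)) * math.exp(-a / lam) * bracket(lam, R, H)


def force_simplified(a, lam, alpha_g, rho, R):
    """Eq. (7.5): limit D, H, R >> lambda of Eq. (7.4)."""
    return -4.0 * math.pi ** 2 * G_NEWTON * rho ** 2 * lam ** 3 * R * alpha_g * math.exp(-a / lam)


if __name__ == "__main__":
    # Illustrative values only (assumptions): R = 12.5 cm, H = 1.8 mm, D = 5 mm.
    R, H, D, rho = 0.125, 1.8e-3, 5.0e-3, 2.2e3
    for lam in (1e-7, 1e-6, 1e-5, 1e-4):
        full = force_full(1e-6, lam, 1.0, rho, R, H, D)
        simple = force_simplified(1e-6, lam, 1.0, rho, R)
        print(lam, full / simple)
```

For these dimensions the ratio of the two expressions differs from unity by less than one part in a thousand throughout the submillimeter range up to λ = $10^{-4}$ m, which illustrates why Eq. (7.5) is sufficient for deriving the constraints.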



For $10^{-9}$ m $< \lambda < 10^{-8}$ m the best constraints on $\alpha_G$, λ were found [296] from the measurements of the van der Waals forces between dielectric surfaces (a lens above a plate and two crossed cylinders). To do this the same procedure was used as in the case of the Casimir force. The main difference is that the van der Waals force of Eq. (4.107) was substituted into Eq. (7.6) as $F_{d\ell}$. The obtained constraints are shown by curve 3 in Fig. 27. For extremely short wavelengths $10^{-10}$ m $< \lambda < 10^{-9}$ m the best constraints on $\alpha_G$, λ were obtained in [297,298] by measuring the van der Waals force acting between a plate and a tip of an Atomic Force Microscope made of Al₂O₃.

As discussed above, the Yukawa-type corrections to the Newtonian gravitational law are caused by the exchange of light but massive particles. In the case of massless particles (e.g., two-arion exchange) the corrections are of power type, so that instead of Eq. (7.1) one has

$$V(r_{12}) = -\frac{G M_1 M_2}{r_{12}}\left[1 + \lambda_n^{G}\left(\frac{r_0}{r_{12}}\right)^{n-1}\right]. \tag{7.7}$$

Here $\lambda_n^{G}$ with n = 2, 3, ... are dimensionless constants, and $r_0 = 1\ \mathrm{F} = 10^{-15}$ m is introduced for the proper dimensionality of potentials with different n [299]. Casimir force measurements between dielectrics have also led to rather strong constraints on $\lambda_n^{G}$ with n = 2, 3, 4 (see [285]). At the present time, however, the best constraints for the constants of power-type hypothetical interactions follow from the Cavendish- and Eötvös-type experiments [283,300–302].

7.2. Constraints from Lamoreaux’s experiment

In [40] the Casimir force between two metallized surfaces of a flat disk and a spherical lens was measured with the use of a torsion pendulum (see Section 6.3). The radius of the disk was L = 1.27 cm and its thickness D = 0.5 cm. The final radius of curvature of the lens reported was R = 12.5 cm and its height H = 0.18 cm. The separation between them was varied from a = 0.6 up to 6 μm. Both bodies were made out of quartz and covered by a continuous layer of copper with δ = 0.5 μm thickness.
The surfaces facing each other were additionally covered with a layer of gold of the same thickness. The experimental data obtained in [40] were found to be in agreement with the ideal theoretical result of Eq. (4.108) in the limits of the absolute error of force measurements $\Delta F = 10^{-11}$ N for the distances 1 μm ≤ a ≤ 6 μm (note that this ΔF is around 3% of $F^{(0)}_{d\ell}$ at a = 1 μm). No corrections due to surface roughness, finite conductivity of the boundary metal or nonzero temperature were recorded. These corrections, however, may not lie within the limits of the absolute error ΔF (see Section 6.3). The correction due to the finiteness of the disk diameter L (which is even smaller than the size of the lens) was shown to be negligible [45,46,227].

The constraints on the constants of hypothetical long-range interactions following from this experiment [40] can be obtained as follows. First, the Casimir force acting in the configuration under consideration should be computed theoretically. It is the force $F_{R,T,C}(a)$ taking into account surface roughness, nonzero temperature and finite conductivity in accordance with Section 5.4. In the first approximation the different corrections are additive, so that

$$F_{R,T,C}(a) = F^{(0)}_{d\ell}(a) + \delta_G F^{(0)}_{d\ell}(a) + \delta_L F^{(0)}_{d\ell}(a), \tag{7.8}$$

where the total correction to the force due to nonzero temperature, finite conductivity and short-scale roughness

$$\delta_G F^{(0)}_{d\ell}(a) \approx \delta_T F^{(0)}_{d\ell}(a) + \delta_C F^{(0)}_{d\ell}(a) + \delta_R F^{(0)}_{d\ell}(a) \tag{7.9}$$

can be estimated theoretically. The quantity $\delta_L F^{(0)}_{d\ell}(a)$ in (7.8) is the correction due to the large-scale roughness (see, e.g., Eq. (5.129)). It cannot be estimated theoretically because the actual shape of the interacting bodies was not investigated experimentally in [40]. To a first approximation, however, for two different separations we can say that

$$\delta_L F^{(0)}_{d\ell}(a_2) = \frac{1}{k_{21}}\,\delta_L F^{(0)}_{d\ell}(a_1), \qquad k_{21} \equiv \left(\frac{a_2}{a_1}\right)^{4}. \tag{7.10}$$

Next, the hypothetical force acting in the experimental configuration should be computed. As in Section 7.1, the gravitational force is small compared to the Casimir force. In view of the layer structure of the interacting bodies the Yukawa potential of their interaction takes the form

$$U^{\mathrm{hyp}} = -\alpha_G G\sum_{i,j=1}^{3}\rho_i\,\rho'_j\,U^{\mathrm{hyp}}_{ij}, \qquad U^{\mathrm{hyp}}_{ij} = \int_{V_i} d^3r_1\int_{V'_j} d^3r_2\,\frac{1}{r_{12}}\,e^{-r_{12}/\lambda}. \tag{7.11}$$

Here $\rho_i, V_i$ ($i = 1, 2, 3$) are the densities and volumes of the lens and the covering metallic layers ($\rho'_j, V'_j$ are the same for the disk). In the numerical calculations below the values $\rho'_1 = 2.23$ g/cm³, $\rho_1 = 2.4$ g/cm³, $\rho_2 = \rho'_2 = 8.96$ g/cm³, $\rho_3 = \rho'_3 = 19.32$ g/cm³ are used. As usual, the force is obtained from $F^{hyp} = -\partial U^{hyp}/\partial a$. In [45] the hypothetical force was computed analytically in two limiting cases: $\lambda < H$ and $\lambda \gg R$. For example, if $\lambda$ is not only less than $H$ but also $\lambda < a$, a particularly simple result is obtained [45,46]:

$$F^{hyp}(a) = -4\pi^2 \lambda^3 \alpha_G G R\, e^{-a/\lambda} \left(\rho_1 e^{-2\Delta/\lambda} + \rho_2 e^{-\Delta/\lambda} + \rho_3\right)\left(\rho'_1 e^{-2\Delta/\lambda} + \rho'_2 e^{-\Delta/\lambda} + \rho'_3\right). \qquad (7.12)$$

In the intermediate range between $\lambda < H$ and $\lambda \gg R$ the integration in Eq. (7.11) was performed numerically [45].

Because the values of the different corrections to the Casimir force surpass the absolute error of force measurements $\Delta F$, their partial cancellation appears very likely. In this situation, bearing in mind that the expression for $F_{d\ell}^{(0)}$ was confirmed with an absolute error $\Delta F$, the constraints on the parameters $\alpha_G, \lambda$ of the hypothetical interaction can be calculated from the inequality [45]

$$\left|F^{R,T,C}(a) + F^{hyp}(a) - F_{d\ell}^{(0)}(a)\right| \le \Delta F. \qquad (7.13)$$

Here $F^{R,T,C}(a)$ is the theoretical Casimir force value with account of all corrections given by Eq. (7.8). Substituting (7.8) into (7.13) one obtains

$$\left|F^{hyp}(a) + \Delta_G F_{d\ell}^{(0)}(a) + \Delta_L F_{d\ell}^{(0)}(a)\right| \le \Delta F. \qquad (7.14)$$

According to the above results, the value of the hypothetical force is proportional to the interaction constant, $F^{hyp}(a_i) = \alpha_G K_\lambda(a_i)$. Considering Eq. (7.14) for two different values of distance $a_1, a_2$ with account of Eq. (7.10) and excluding the unknown quantity $\Delta_L F_{d\ell}^{(0)}(a_1)$, the desired constraints on $\alpha_G$ are obtained:

$$-(k_{21} + 1)\Delta F - \Delta_G(a_1) + k_{21}\Delta_G(a_2) \le \alpha_G \left[K_\lambda(a_1) - k_{21} K_\lambda(a_2)\right],$$

$$\alpha_G \left[K_\lambda(a_1) - k_{21} K_\lambda(a_2)\right] \le (k_{21} + 1)\Delta F - \Delta_G(a_1) + k_{21}\Delta_G(a_2). \qquad (7.15)$$

Fig. 28. Constraints for the Yukawa-type hypothetical interaction following from the Lamoreaux experiment are shown by curves 4,a ($\alpha_G > 0$) and 4,b ($\alpha_G < 0$). Constraints following from the experiments by Mohideen et al. are shown by curves 5 and 6. Curves 1–3 are the same as in the previous figure.
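The two-separation bound of Eq. (7.15) can be sketched numerically as follows. This is only an illustration under stated assumptions: the function and argument names are hypothetical, the values of $K_\lambda(a_i)$ and of the corrections $\Delta_G(a_i)$ must be supplied by the user from Eqs. (7.11), (7.12) and (7.9), and the numbers in the usage lines are made up to exercise the code, not taken from [45].

```python
import math


def k21(a1, a2):
    """Scaling factor k21 = (a2 / a1)**4 from Eq. (7.10)."""
    return (a2 / a1) ** 4


def alpha_bounds(K1, K2, dG1, dG2, a1, a2, dF):
    """Interval of alpha_G allowed by the pair of inequalities (7.15).

    K1, K2   : K_lambda(a1), K_lambda(a2), force per unit alpha_G (N)
    dG1, dG2 : total theoretical corrections Delta_G at a1 and a2 (N)
    dF       : absolute error of the force measurement (N)
    """
    k = k21(a1, a2)
    coeff = K1 - k * K2
    if coeff == 0.0:
        raise ValueError("K(a1) - k21*K(a2) vanishes; no constraint follows")
    lower = -(k + 1.0) * dF - dG1 + k * dG2
    upper = (k + 1.0) * dF - dG1 + k * dG2
    # Dividing by a negative coefficient reverses the inequalities.
    lo, hi = lower / coeff, upper / coeff
    return (lo, hi) if lo <= hi else (hi, lo)


# Illustrative (made-up) inputs: attractive Yukawa force, corrections set to zero.
lo, hi = alpha_bounds(K1=-1.0e-30, K2=-1.0e-32, dG1=0.0, dG2=0.0,
                      a1=1.0e-6, a2=2.0e-6, dF=1.0e-11)
```

Because $K_\lambda(a_1) - k_{21}K_\lambda(a_2)$ may have either sign, the sketch orders the two limits explicitly, which is why separate curves for $\alpha_G > 0$ and $\alpha_G < 0$ can arise.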

The specific values of $a_1, a_2$ in Eqs. (7.15) were chosen in the interval 1 μm ≤ $a$ ≤ 6 μm in order to obtain the strongest constraints on $\alpha_G, \lambda$. For the upper limit of the distance interval ($a \approx 6$ μm), the Casimir force $F^{T}(a)$ from Eq. (5.30), i.e., together with the temperature correction, should be considered as the force under measurement. All corrections to it are much smaller than $\Delta F = 10^{-11}$ N. Thus for such values of $a$, the constraints on the hypothetical interaction may be obtained from the simplified inequality (instead of from Eq. (7.14))

$$\left|F^{hyp}(a)\right| = \left|\alpha_G K_\lambda(a)\right| \le \Delta F. \qquad (7.16)$$

The results of numerical computations with use of Eq. (7.15) are shown in Fig. 28 by curves 4,a ($\alpha_G > 0$) and 4,b ($\alpha_G < 0$). In the same figure curves 1, 2, 3 show the previously known constraints following from the Cavendish-type experiments and Casimir (van der Waals) force measurements between dielectrics (Fig. 27). For different $\lambda$ the values of $a_1 = 1$ μm and $a_2 = 1.5$–$3$ μm were used. The complicated character of curves 4,a,b (nonmonotonic behavior of their first derivatives) is explained by the multilayered structure of the test bodies. For $\lambda > 10^{-5}$ m the metallic layers do not contribute much to the total value of the force, which is determined mostly by quartz. But for $\lambda < 10^{-5}$ m the contribution of the metal layers is the predominant one. It is seen that the new constraints following from [40] are the best ones within a wide range $2.2 \times 10^{-7}$ m $< \lambda < 1.6 \times 10^{-4}$ m [45] (a slightly different result was obtained in Ref. [303] where, however, the corrections to the ideal Casimir force $F_{d\ell}^{(0)}$ were not taken into account).

They surpass the results obtained from the Casimir force measurements between dielectrics by up to a factor of 30. For $\lambda < 2.2 \times 10^{-7}$ m the latter lead to better constraints than the new ones. This is caused by the small value of the Casimir force between dielectrics compared to the case of metals and also by the fact that the force was measured for smaller values of $a$.

The constraints for power-type hypothetical interactions of Eq. (7.7) following from the experiment [40] were calculated in [304]. They turned out to be weaker than the best ones obtained from the Cavendish- and Eötvös-type experiments.

In Ref. [45] a possible strengthening of the obtained constraints by up to $10^4$ times is proposed through some modifications of the experimental setup. The prospective constraints would make it possible to restrict the masses of the graviphoton and dilaton. They also fall within the range $\alpha_G \sim 1$ predicted by the theories with the weak unification scale. Because of this, obtaining such strong constraints is gaining in importance.

7.3. Constraints following from the atomic force microscope measurements of the Casimir force

The results of the Casimir force measurements by means of an atomic force microscope are presented in Section 6.4. They were shown to be in good agreement with the theory taking into account the finite conductivity and roughness corrections. Temperature corrections are not important in the interaction range 0.1 μm < $a$ < 0.9 μm as in [41] or 0.1 μm < $a$ < 0.5 μm [43]. In both experiments the test bodies (sapphire disk and polystyrene sphere) were covered by an aluminum layer of 300 nm thickness [41] (or 250 nm thickness [43]) and an Au/Pd layer of 20 or 7.9 nm thickness, respectively. This layer was demonstrated to be transparent for electromagnetic oscillations of the characteristic frequency. The absolute error of force measurements in [41] was $\Delta F = 2 \times 10^{-12}$ N and almost two times lower in [43] due to the use of vibration isolation, lower systematic errors, and independent measurement of surface separation. The confident experimental confirmation of the complete Casimir force including corrections has made it possible to find the constraints on the parameters $\alpha_G, \lambda$ of the hypothetical interaction from the inequality

$$\left|F_R^{hyp}(a)\right| \le \Delta F. \qquad (7.17)$$

Index $R$ indicates that the hypothetical force should be calculated with account of surface roughness. Note that here, as opposed to Eq. (7.15), all the corrections are included in the Casimir force. For this reason constraints on $|\alpha_G|$ rather than on $\alpha_G$ are obtained.

The gravitational and hypothetical forces described by potential (7.1) can be calculated as follows (we substitute the numerical parameters of the improved experiment [43]). The diameter $2R = 201.7$ μm of the sphere is much smaller than the diameter of the disk $2L = 1$ cm. Because of this, each atom of the sphere can be considered as if it were placed above the center of the disk. Let an atom of the sphere with mass $M_1$ be at height $l \ll L$ above the center of the disk. The vertical component of the Newtonian gravitational force acting between this atom and the disk can be calculated as

$$f_{N,z}(l) = \frac{\partial}{\partial l}\left[ G M_1 \rho\, 2\pi \int_0^{L} r\, dr \int_l^{l+D} \frac{dz}{\sqrt{r^2 + z^2}} \right] \approx -2\pi G M_1 \rho D \left(1 - \frac{D + 2l}{2L}\right), \qquad (7.18)$$

where $\rho = 4.0 \times 10^3$ kg/m³ is the sapphire density, $D = 1$ mm is the thickness of the sapphire disk, and only the first-order terms in $D/L$ and $l/L$ are retained. Integrating Eq. (7.18) over the volume of the sphere one obtains the Newtonian gravitational force acting between a sphere and a disk,

$$F_{N,z} \approx -\frac{8}{3}\pi^2 G \rho \rho' D R^3 \left(1 - \frac{D}{2L} - \frac{R}{L}\right), \qquad (7.19)$$

where $\rho' = 1.06 \times 10^3$ kg/m³ is the polystyrene density. Note that this force does not depend on the distance $a$ between the disk and the sphere because $a = 0.1$–$0.5$ μm $\ll R$. Substituting the values of parameters given above into (7.19) we arrive at the value $|F_{N,z}| \approx 6.7 \times 10^{-18}$ N, which is much smaller than $\Delta F$. The value of the Newtonian gravitational force between the test bodies remains nearly unchanged when the contributions of the Al and Au/Pd layers on the sphere and the disk are taken into account. The corresponding result can be obtained simply by combining several expressions of type (7.19). The additions to the force due to the layers are suppressed by the small factors $\Delta_i/D$ and $\Delta_i/R$. That is why the Newtonian gravitational force is negligible in the Casimir force measurement by means of an atomic force microscope (note that for the configuration of two plane parallel plates the gravitational force can play a more important role [305]).

Now we consider the hypothetical force acting between a disk and a sphere due to the Yukawa-type term from Eq. (7.1). It can be calculated using the same procedure which was already applied in Section 7.1 for the configuration of a lens above a disk and here in the case of the gravitational force. The only complication is that the contribution of the two covering layers of thickness $\Delta_1$ (Al) and $\Delta_2$ (Au/Pd) should be taken into account. Under the conditions $a, \lambda \ll R$ the result is [47,48]

$$F^{hyp}(a) = -4\pi^2 G \alpha_G \lambda^3 e^{-a/\lambda} R \left[\rho_2 - (\rho_2 - \rho_1)e^{-\Delta_2/\lambda} - (\rho_1 - \rho)e^{-(\Delta_2 + \Delta_1)/\lambda}\right] \times \left[\rho_2 - (\rho_2 - \rho_1)e^{-\Delta_2/\lambda} - (\rho_1 - \rho')e^{-(\Delta_2 + \Delta_1)/\lambda}\right]. \qquad (7.20)$$

Here $\rho_1 = 2.7 \times 10^3$ kg/m³ is the density of Al and $\rho_2 = 16.2 \times 10^3$ kg/m³ is the density of 60% Au/40% Pd.

As was shown in [47], the surface distortions can significantly influence the value of the hypothetical force in the nanometer range. In [43] smoother metal coatings than in [41] were used. The roughness of the metal surface was measured with the atomic force microscope. The major distortions both on the disk and on the sphere can be modeled by parallelepipeds of two heights, $h_1 = 14$ nm (covering the fraction $v_1 = 0.05$ of the surface) and $h_2 = 7$ nm (covering the fraction $v_2 = 0.11$ of the surface). The surface between these distortions is covered by a stochastic roughness of height $h_0 = 2$ nm ($v_0 = 0.84$ fraction of the surface). It consists of small crystals which form a homogeneous background of average height $h_0/2$.
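The gravitational estimate from Eq. (7.19) and the layered Yukawa force of Eq. (7.20) can be evaluated with a short script. The sketch below is illustrative only: the function names are hypothetical, the value of Newton's constant is the standard one (it is not quoted in the text), and all lengths and densities are assumed to be in SI units.

```python
import math

G_NEWTON = 6.674e-11  # m^3 kg^-1 s^-2; standard value, assumed here


def gravity_sphere_disk(rho_disk, rho_sphere, D, R, L):
    """Newtonian force of Eq. (7.19) in newtons (negative means attractive)."""
    prefactor = (-(8.0 / 3.0) * math.pi ** 2 * G_NEWTON
                 * rho_disk * rho_sphere * D * R ** 3)
    return prefactor * (1.0 - D / (2.0 * L) - R / L)


def yukawa_sphere_disk(a, lam, alpha_G, R, rho_disk, rho_sphere,
                       rho1, rho2, d1, d2):
    """Yukawa-type force of Eq. (7.20) for bodies covered by two layers.

    rho1, d1 : density and thickness of the inner (Al) layer
    rho2, d2 : density and thickness of the outer (Au/Pd) layer
    """
    common = rho2 - (rho2 - rho1) * math.exp(-d2 / lam)
    disk = common - (rho1 - rho_disk) * math.exp(-(d2 + d1) / lam)
    sphere = common - (rho1 - rho_sphere) * math.exp(-(d2 + d1) / lam)
    return (-4.0 * math.pi ** 2 * G_NEWTON * alpha_G * lam ** 3
            * math.exp(-a / lam) * R * disk * sphere)


# Parameters of the experiment [43] as quoted in the text.
F_grav = gravity_sphere_disk(rho_disk=4.0e3, rho_sphere=1.06e3,
                             D=1.0e-3, R=0.5 * 201.7e-6, L=0.5e-2)
```

With these inputs the magnitude of `F_grav` reproduces the quoted estimate of about $6.7 \times 10^{-18}$ N to within the rounding of the input values.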


The detailed calculation of roughness corrections to the Casimir force for this kind of roughness was performed in Section 6.4.1. Using similar methods, a result in perfect analogy with Eq. (6.15) is obtained:

$$F_R^{hyp}(a) = \sum_{i=1}^{6} w_i F^{hyp}(a_i) \equiv v_1^2 F^{hyp}(a - 2A) + 2 v_1 v_2 F^{hyp}\big(a - A(1 + \beta_1)\big) + 2 v_2 v_0 F^{hyp}\big(a - A(\beta_1 - \beta_2)\big) + v_0^2 F^{hyp}(a + 2A\beta_2) + v_2^2 F^{hyp}(a - 2A\beta_1) + 2 v_1 v_0 F^{hyp}\big(a - A(1 - \beta_2)\big). \qquad (7.21)$$
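The weighted sum in Eq. (7.21) is straightforward to evaluate for any force law. The following sketch assumes that the two garbled parameters in the extracted text are $\beta_1$ and $\beta_2$, and that the surface fractions are $v_1 = 0.05$, $v_2 = 0.11$ and $v_0 = 0.84$; the function name and signature are hypothetical.

```python
def roughness_averaged_force(force, a, A, beta1, beta2, v0, v1, v2):
    """Evaluate Eq. (7.21): a weighted sum of force(a_i) over six separations.

    force : callable returning the force at a given separation
    a     : separation between the zero distortion levels
    A     : distortion amplitude relative to the zero level
    """
    terms = [
        (v1 * v1,       a - 2.0 * A),
        (2.0 * v1 * v2, a - A * (1.0 + beta1)),
        (2.0 * v2 * v0, a - A * (beta1 - beta2)),
        (v0 * v0,       a + 2.0 * A * beta2),
        (v2 * v2,       a - 2.0 * A * beta1),
        (2.0 * v1 * v0, a - A * (1.0 - beta2)),
    ]
    return sum(weight * force(separation) for weight, separation in terms)


# Roughness parameters quoted in the text for the experiment [43] (SI units).
params = dict(A=11.69e-9, beta1=0.4012, beta2=0.1121, v0=0.84, v1=0.05, v2=0.11)

# A constant force returns the sum of the weights; a linear one returns
# the weighted mean separation.
weight_sum = roughness_averaged_force(lambda x: 1.0, 84.2e-9, **params)
mean_separation = roughness_averaged_force(lambda x: x, 84.2e-9, **params)
```

Two consistency checks follow from the structure of the formula: the six weights add up to $(v_0 + v_1 + v_2)^2 = 1$, and the weighted mean separation stays equal to $a$ up to rounding of the parameters, as expected when heights are measured from the zero distortion level.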

Here the amplitude is defined relative to the zero distortion level, with $A = 11.69$ nm, $\beta_1 = 0.4012$, $\beta_2 = 0.1121$.

According to the results of [43] no hypothetical force was observed. The constraints on it are obtained by the substitution of Eq. (7.21) into Eq. (7.17). The strongest constraints on $\alpha_G$ follow for the smallest possible value of $a$. As stated above, $a_{min} = 100$ nm in the Casimir force measurement for the experiment under consideration. This distance is between the Al layers because the Au/Pd layers of $\Delta_2 = 7.9$ nm thickness were shown to be transparent for frequencies of order $c/a$. For the Yukawa-type hypothetical interaction this means that $a_{min}^{Yu} = 100\ \text{nm} - 2\Delta_2 = 84.2$ nm. Substituting this value into Eqs. (7.17), (7.20) and (7.21) one obtains constraints on $\alpha_G$ for different $\lambda$ [48]. The computational results are represented by curve 5 of Fig. 28. As follows from Fig. 28, the new constraints turned out to be up to 560 times stronger than the constraints obtained from the Casimir and van der Waals force measurements between dielectrics (curves 2 and 3 in Figs. 27 and 28). The strengthening takes place within the interaction range $5.9 \times 10^{-9}$ m ≤ $\lambda$ ≤ $1.15 \times 10^{-7}$ m. The largest strengthening takes place for $\lambda = 10$–$15$ nm. Note that the same calculations using the results of the experiment [41] lead to approximately four times weaker constraints.

Recently, one more measurement of the Casimir force was performed using the atomic force microscope [44]. The test bodies (a sphere and a disk) were coated with gold instead of aluminum, which removes some difficulties connected with the additional thin Au/Pd layers used in the previous measurements to reduce the effect of oxidation processes on Al surfaces. The gold-coated polystyrene sphere had a diameter $2R = 191.3$ μm, and the sapphire disk had a diameter $2L = 1$ cm and a thickness $D = 1$ mm. The thickness of the gold coating on both test bodies was $\Delta = 86.6$ nm.
This can be considered as infinitely thick for the case of the Casimir force measurements. The root mean square roughness amplitude of the gold surfaces was reduced to 1 nm, which makes roughness corrections negligibly small. The measurements were performed at smaller separations, i.e., 62 nm ≤ $a$ ≤ 350 nm. The absolute error of force measurements was, however, $\Delta F = 3.8 \times 10^{-12}$ N, i.e., a bit larger than in the previous experiments. The reason is the thinner gold coating used in [44], which led to poor thermal conductivity of the cantilever. At smaller separations of about 65 nm this error is less than 1% of the measured Casimir force.

The gravitational force given by Eq. (7.19) is once more negligible. Even if the sphere and disk were made of vacuo-distilled gold with $\rho = \rho' = \rho_1 = 18.88 \times 10^3$ kg/m³, one arrives from (7.19) at the negligibly small value of $F_{N,z} \approx 6 \times 10^{-16}$ N $\ll \Delta F$ for the gravitational force.



The Yukawa-type addition to the Newtonian gravity which is due to the second term of potential (7.1) should be calculated including the effects of the true materials of the test bodies. It can be easily obtained using the same procedure which was applied above. The result is

$$F^{hyp}(a) = -4\pi^2 G \alpha_G \lambda^3 e^{-a/\lambda} R \left[\rho_1 - (\rho_1 - \rho)e^{-\Delta/\lambda}\right]\left[\rho_1 - (\rho_1 - \rho')e^{-\Delta/\lambda}\right]. \qquad (7.22)$$
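Combined with a simple inequality of the form $|\alpha_G K_\lambda(a)| \le \Delta F$, as in Eq. (7.16), the expression (7.22) gives a direct numerical bound on $|\alpha_G|$ for each $\lambda$. The sketch below is an assumption-laden illustration: the function names are hypothetical, Newton's constant is the standard value, the gold density is the vacuo-distilled value quoted in the text, and the choice $\lambda = 10$ nm in the usage line is arbitrary.

```python
import math

G_NEWTON = 6.674e-11  # m^3 kg^-1 s^-2; standard value, assumed here


def yukawa_force_gold(a, lam, alpha_G, R, rho_gold, rho_disk, rho_sphere, delta):
    """Yukawa-type force of Eq. (7.22) for a gold-coated sphere and disk (N)."""
    disk = rho_gold - (rho_gold - rho_disk) * math.exp(-delta / lam)
    sphere = rho_gold - (rho_gold - rho_sphere) * math.exp(-delta / lam)
    return (-4.0 * math.pi ** 2 * G_NEWTON * alpha_G * lam ** 3
            * math.exp(-a / lam) * R * disk * sphere)


def alpha_limit(a, lam, dF, **geometry):
    """Largest |alpha_G| compatible with |alpha_G * K_lambda(a)| <= dF."""
    K = yukawa_force_gold(a, lam, 1.0, **geometry)  # force per unit alpha_G
    return dF / abs(K)


# Parameters of the experiment [44] as quoted in the text (SI units).
geometry = dict(R=0.5 * 191.3e-6, rho_gold=18.88e3,
                rho_disk=4.0e3, rho_sphere=1.06e3, delta=86.6e-9)
limit = alpha_limit(a=65.0e-9, lam=10.0e-9, dF=3.8e-12, **geometry)
```

Because the force falls off as $e^{-a/\lambda}$, the bound weakens quickly with increasing separation, which is why the smallest measured separation gives the strongest constraint.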

According to [44] the theoretical value of the Casimir force was confirmed within the limits of $\Delta F = 3.8 \times 10^{-12}$ N and no hypothetical force was observed. In such a situation, the constraints on $\alpha_G$ can be obtained from inequality (7.16). The strongest constraints follow for the smallest possible values of $a \approx 65$ nm. The computational results are presented in Fig. 28 by curve 6 [306]. As is seen from Fig. 28, the Casimir force measurement between the gold surfaces by means of an atomic force microscope makes it possible to strengthen the previously known constraints (curve 5) by up to 19 times within the range $4.3 \times 10^{-9}$ m ≤ $\lambda$ ≤ $1.5 \times 10^{-7}$ m. The largest strengthening takes place for $\lambda = 5$–$10$ nm. Compared with the constraints obtained from the Casimir and van der Waals force measurements between dielectrics (curves 2 and 3), a strengthening of up to 4500 times was achieved with the Casimir force measurement between gold surfaces using the atomic force microscope. Note that there still persists a gap between the new constraints of curves 4 and 6 where the old results of curve 2 obtained from dielectric surfaces are valid.

The Casimir force measurement between two crossed cylinders with gold surfaces ([278], see Section 6.6) also makes it possible to strengthen constraints on the Yukawa-type interaction. A maximal strengthening by 300 times is achieved from this experiment at $\lambda = 4.26$ nm [307].

As indicated above, there is abundant evidence that the gravitational interaction at small distances undergoes deviations from the Newtonian law. These deviations can be described by the Yukawa-type potential. They were predicted in theories with the quantum gravity scale both of order $10^{18}$ and $10^{3}$ GeV. In the latter case the problem of experimental search for such deviations takes on even greater significance. The existence of large extra dimensions can radically alter many concepts of space–time, elementary particle physics, astrophysics and cosmology.
According to Section 7.2, the improvement of the experiment of Ref. [40] by a factor of $10^4$ in the range around $\lambda = 10^{-4}$ m makes it possible to attain the values of $\alpha_G \sim 1$. As to the experiments using the atomic force microscope to measure the Casimir force, almost 14 orders of magnitude remain to be gained to achieve these values. Thus, in the experiments with an atomic force microscope it is desirable not only to increase the strength of constraints but also to shift the interaction range to larger $\lambda$. For this purpose the sphere radius and its separation from the disk should be increased. Other experimental schemes are also of interest, for example, dynamical ones [294,295,308] (note that the gravitational experiments on the search for hypothetical interactions in the submillimeter range also suggest some promise [309]). In any case the Casimir effect offers important advantages as a new test for fundamental physical theories. Further strengthening of constraints on non-Newtonian gravity from the Casimir effect is expected in the future.


8. Conclusions and discussion

The foregoing shows that the Casimir effect has become the subject of diverse studies of general physical interest in a variety of fields. It is equally interesting and important for Quantum Field Theory, Condensed Matter Physics, Gravitation, Astrophysics and Cosmology, Atomic Physics, and Mathematical Physics. Currently, the Casimir effect has been advanced as a new powerful test for hypothetical long-range interactions, including corrections to the Newtonian gravitational law at small distances, predicted by the unified gauge theories, supersymmetry, supergravity and string theory. It is also gaining in technological importance in vital applications such as nanoelectromechanical devices [310,311].

From the field-theoretical standpoint the most complicated problems arising from the theory of the Casimir effect are already solved (see Sections 3 and 4). There are no more problems (at least fundamentally) with singularities. Their general structure was determined by the combination of two powerful tools: heat kernel expansion and zeta-functional regularization. Calculation procedures for the finite Casimir energies and forces are settled as long as there is separation of variables. Otherwise, approximate methods should be applied or brute-force numerical computation done, which is sensible only for the purposes of some specific application.

Numerous illustrations of the Casimir effect in various configurations are given in Section 4. Here both flat and curved boundaries are considered and the important progress with the case of a sphere is illustrated. Much importance is given to the additive methods and proximity forces, which provide simple alternative possibilities to calculate the value of the Casimir force with high accuracy. In Section 4 the new developments in the dynamical Casimir effect are briefly discussed and the results of calculations of the radiative corrections to the Casimir force are presented.
Special attention is paid to the Casimir effect in spaces with non-Euclidean topology, where cosmological models, topological defects and Kaluza–Klein theories are considered.

A subject which is beyond the scope of our report is the atomic Casimir effect. As was shown in Sections 3.5 and 4.5, the presence of cavity walls leads to modifications of the propagators. If the atom is situated between two mirrors there arise specific modifications in the spontaneous emission rates [312–315]. The role of Casimir-type retardation effects in the atomic spectra is the subject of extensive study (see, e.g., [316–320] and references therein).

The important new developments are in the study of the Casimir effect for real media, i.e., with account of nonzero temperature, finite conductivity of the boundary metal and surface roughness. Not only does each of these influential factors contribute by itself, but their combined effect should also be taken into account in order to make possible a reliable comparison of theory and experiment. The theoretical results obtained here during the last few years are presented in Section 5. It turns out that the value of the Casimir force depends crucially on the electrical and mechanical properties of the boundary material. It was also discussed that the combination of such factors as nonzero temperature and finite conductivity presents a difficult theoretical problem, so that one should use extreme caution in calculating their combined effect on the basis of the general Lifshitz theory.

As was already noted in the Introduction, the most striking development of the last years on this subject is the precision measurement of the Casimir force between metallic surfaces. In Section 6 a review of both older and recent experiments on measuring the Casimir force is given along with a comparison to modern theoretical results taking into account all the corrections arising in real media. Excellent agreement between the experiment by means of an atomic force microscope and theory at the smallest separations is demonstrated.

The measure of the agreement between the theoretical and experimental Casimir force makes it possible to obtain stronger constraints for the corrections to the Newtonian gravitational law and other hypothetical long-range interactions predicted by the modern theories of fundamental interactions. These results are reviewed in Section 7 and represent the Casimir effect as a new test for fundamental interactions.

Despite the rapid progress in both theory and experiment during the last few years, the Casimir effect remains on the verge of potentially exciting developments. Although the fundamental theoretical foundations are already laid, much should be done in the development of approximate methods with controlled accuracy. In future the investigation of real media will involve spatial dispersion, which is needed for applications of the obtained results to thin films. Some actual interest is connected with nonsmooth backgrounds (for example, when the derivative of a metric has a jump). Also, as was noticed in Section 4.2.3, the problem of the Casimir effect for the dielectric sphere is still open.

More future progress in the field is expected in connection with the new Casimir force measurements. Here the accuracy will be undoubtedly increased by several orders of magnitude, and the separation range will be expanded. As a result the first measurement of the nonzero temperature Casimir force will be performed very shortly. The increased accuracy and fabrication of an array of many open boxes should make it possible to observe the repulsive Casimir force, which will have a profound impact on nanotechnology (see Appendix A).
The other experimental advance to be expected in the near future is the observation of the dynamical Casimir effect.

It is anticipated that the strength of the constraints on the constants of hypothetical long-range interactions obtained by means of the Casimir effect will be increased at least by four orders of magnitude in the next two to three years. This would mean that the Casimir effect makes it possible to check the Newtonian gravitational law in the submillimeter range, which was not possible by other methods in the last 300 years. As a result the exceptional predictions concerning the structure of space–time at short scales should be confirmed or rejected by the measurement of the Casimir force.

To conclude, we would like to emphasize that although more than 50 years have passed since its discovery, the Casimir effect is gaining greater and greater importance for the development of modern physics. In our opinion, at present the Casimir effect is on the threshold of becoming a tool of exceptional importance both in fundamental physics and in technological applications.

Acknowledgements

The authors are greatly indebted to Prof. G.L. Klimchitskaya for numerous helpful discussions and collaboration. V.M.M. is grateful to the Institute of Theoretical Physics (Leipzig University) and the Brazilian Center of Physical Research (Rio de Janeiro) for kind hospitality. His work was partly supported by DFG (Germany) under the reference 436 RUS 17/19/00 and by FAPERJ under the number E-26/150.867/2000 and CNPq (process 300106/98-0), Brazil.



Appendix A. Applications of the Casimir force in nanotechnology

The first paper anticipating the dominant role of Casimir forces in nanoscale devices appeared over 15 years ago [321], but was largely ignored, as at that time silicon chip fabrication dimensions were on the order of many microns. More recently, as device dimensions have shrunk to nanometers, the important role of Casimir forces in nanoscale devices has become well recognized [310,311], both in device performance and in device fabrication. Very recently, even an actuator based on the Casimir force has been fabricated using silicon nanofabrication technology [322].

A.1. Casimir force and nanomechanical devices

Most present-day nanomechanical devices are based on thin cantilever beams above a silicon substrate, fabricated by photolithography followed by dry and wet chemical etching [310,311,322–324]. Such cantilevers are usually suspended about 100 nm above the silicon substrate [323,324]. The cantilevers move in response to the Casimir force, applied voltages on the substrate, or incoming radio-frequency signals [323,324]. In the case of radio-frequency transmitters and receivers, a high quality factor (Q) is necessary for the narrow-bandwidth operation of these devices. However, due to the coupling to the substrate and neighboring cantilevers through the Casimir force, the vibration energy of the cantilever can be dissipated. This dissipation of mechanical energy leads to a decrease in the Q and to crosstalk with neighboring receivers, both leading to degradation in the signal. The problem will be exacerbated in the dense arrays of high-Q transmitters and receivers needed for future mobile communication. Thus effective incorporation of the Casimir force is necessary in these devices to optimize their performance.

Recently the first actuator based on the Casimir force was developed by researchers at Bell Labs.
It is a silicon-based device that provides mechanical motion as a result of controlling vacuum fluctuations [322]. The device is fabricated by standard nanofabrication techniques such as photolithography and chemical etching on a silicon substrate. It consists of a 3.5 μm thick, 500 μm square heavily doped polysilicon plate freely suspended on two of its opposite sides by thin torsional rods as shown in Fig. 29A. The other ends of the torsional rods are anchored to the substrate via support posts as shown in Fig. 29B. There is a 2 μm gap between the top plate and the underlying substrate, which is created by etching a SiO2 sacrificial layer. To apply the Casimir force, the researchers suspended a gold-coated ball just above one side of the plate. As the plate was moved closer to the ball, the Casimir force acting on the plate tilted it about its central axis towards the sphere. Thus the Casimir force led to the mechanical motion of the nanofabricated silicon plate. This is the first case of a microelectromechanical device which shows actuation by the Casimir force. The measured tilt angle of the plate was calibrated and found to correspond accurately to the Casimir force. The experiments were done at room temperature and at a pressure less than 1 mTorr. The roughness amplitude of the gold films was 30 nm. Due to the large roughness, the closest approach of the two surfaces was 75.7 nm. Even though the rms deviation of the experimental results from the theory was on the order of 0.5% of the forces at closest separation, the authors point out that the large roughness correction and limited understanding of the metal coating prevent a comparison to the theory with better than 1% accuracy. However, an unambiguous mechanical movement of the silicon plate in response to the Casimir force was demonstrated.

Fig. 29. Scanning electron micrographs of (A) the nanofabricated torsional device and (B) a close-up of one of the torsional rods anchored to the substrate. Courtesy of Federico Capasso, Bell Labs, Lucent Technologies.

A.2. Casimir force in nanoscale device fabrication

The Casimir force dominates over other forces at distances of a few nanometers. Thus movable components in nanoscale devices that are separated by less than 100 nm often stick together due to the strong Casimir force. This process, referred to as "stiction", leads to the collapse of movable elements to the substrate or the collapse of neighboring components during nanoscale device operation. This sometimes leads to permanent adhesion of the device components [310,311]. Thus this phenomenon severely restricts the yield and operation of the devices. The stiction process is complicated by capillary forces that are present during fabrication. Together these lead to poor yield in the microelectromechanical systems fabrication process.

From the above it is clear that the Casimir forces fundamentally influence the performance and yield of nanodevices. The Casimir forces might well set fundamental limits on the performance and the possible density of devices that can be optimized on a single chip. On the grounds of the above discussion, even actuators based entirely on the Casimir force, or using a combination of the Casimir force and electrostatic forces, will be possible in the near future. Thus a complete understanding of the material and shape dependences of the Casimir effect will be necessary to improve the yield and performance of the nanodevices.

References

[1] H.B.G. Casimir, Proc. Kon. Nederl. Akad. Wet. 51 (1948) 793. ∗∗∗
[2] H.B.G. Casimir, D. Polder, Phys. Rev. 73 (1948) 360. ∗∗∗
[3] M. Planck, Verh. d. Deutsch. Phys. Ges. 2 (1911) 138.
[4] H. Rechenberg, in: M. Bordag (Ed.), The Casimir Effect 50 Years Later, World Scientific, Singapore, 1999, p. 10.
[5] C. Itzykson, J.-B. Zuber, Quantum Field Theory, McGraw-Hill, New York, 1980.
[6] N.N. Bogoliubov, D.V. Shirkov, Quantum Fields, Benjamin/Cummings, London, 1982.
[7] P.W. Milonni, The Quantum Vacuum: An Introduction to Quantum Electrodynamics, Academic Press, New York, 1994. ∗∗∗
[8] H.B.G. Casimir, in: M. Bordag (Ed.), The Casimir Effect 50 Years Later, World Scientific, Singapore, 1999, p. 3.
[9] E.M. Lifshitz, Sov. Phys. JETP (USA) 2 (1956) 73. ∗∗∗
[10] M. Bordag, G.L. Klimchitskaya, V.M. Mostepanenko, Int. J. Mod. Phys. A 10 (1995) 2661. ∗∗
[11] I. Brevik, V.N. Marachevsky, K.A. Milton, Phys. Rev. Lett. 82 (1999) 3948.
[12] V.B. Berestetskii, E.M. Lifshitz, L.P. Pitaevskii, Quantum Electrodynamics, Pergamon, Oxford, 1982.
[13] Yu.S. Barash, V.L. Ginzburg, Sov. Phys. Usp. (USA) 18 (1975) 305. ∗∗
[14] Yu.S. Barash, V.L. Ginzburg, Sov. Phys. Usp. (USA) 27 (1984) 467. ∗∗
[15] L.H. Ford, Phys. Rev. D 11 (1975) 3370.
[16] L.H. Ford, Phys. Rev. D 14 (1976) 3304.
[17] S.G. Mamayev, V.M. Mostepanenko, A.A. Starobinsky, Sov. Phys. JETP (USA) 43 (1976) 823.
[18] T.H. Boyer, Phys. Rev. 174 (1968) 1764. ∗∗
[19] S.G. Mamayev, N.N. Trunov, Theor. Math. Phys. (USA) 38 (1979) 228. ∗∗
[20] M.J. Sparnaay, Physica 24 (1958) 751. ∗∗∗
[21] V.M. Mostepanenko, in: A.M. Prokhorov (Ed.), Physical Encyclopaedia, Vol. 5, Large Russian Encyclopaedia, Moscow, 1998, p. 664 (in Russian).
[22] G. Plunien, B. Müller, W. Greiner, Phys. Rep. 134 (1986) 87. ∗∗∗
[23] V.M. Mostepanenko, N.N. Trunov, Sov. Phys. Usp. (USA) 31 (1988) 965. ∗∗∗
[24] V.M. Mostepanenko, N.N. Trunov, The Casimir Effect and its Applications, Clarendon Press, Oxford, 1997. ∗∗∗
[25] M. Krech, The Casimir Effect in Critical Systems, World Scientific, Singapore, 1994.
[26] J.S. Dowker, R. Critchley, Phys. Rev. D 13 (1976) 3224.
[27] S.W. Hawking, Comm. Math. Phys. 55 (1977) 133.
[28] B.S. DeWitt, Phys. Rep. 19 (1975) 297. ∗
[29] S.K. Blau, M. Visser, A. Wipf, Nucl. Phys. B 310 (1988) 163. ∗
[30] E. Elizalde, A. Romeo, Int. J. Mod. Phys. A 5 (1990) 1653. ∗
[31] E. Elizalde, S.D. Odintsov, A. Romeo, A.A. Bytsenko, S. Zerbini, Zeta Regularization Techniques with Applications, World Scientific, Singapore, 1994.
[32] E. Elizalde, Ten Physical Applications of Spectral Zeta Functions, Springer, Berlin, Heidelberg, 1995.
[33] M. Bordag, E. Elizalde, K. Kirsten, S. Leseduarte, Phys. Rev. D 56 (1997) 4896.
[34] G.L. Klimchitskaya, A. Roy, U. Mohideen, V.M. Mostepanenko, Phys. Rev. A 60 (1999) 3487. ∗
[35] A. Lambrecht, S. Reynaud, Eur. Phys. J. D 8 (2000) 309. ∗
[36] G.L. Klimchitskaya, U. Mohideen, V.M. Mostepanenko, Phys. Rev. A 61 (2000) 062107. ∗
[37] C. Genet, A. Lambrecht, S. Reynaud, Phys. Rev. A 62 (2000) 012110. ∗

[38] M. Bordag, B. Geyer, G.L. Klimchitskaya, V.M. Mostepanenko, Phys. Rev. Lett. 85 (2000) 503. ∗
[39] V.B. Bezerra, G.L. Klimchitskaya, V.M. Mostepanenko, Phys. Rev. A 62 (2000) 014102.
[40] S.K. Lamoreaux, Phys. Rev. Lett. 78 (1997) 5. ∗∗∗
[41] U. Mohideen, A. Roy, Phys. Rev. Lett. 81 (1998) 4549. ∗∗∗
[42] A. Roy, U. Mohideen, Phys. Rev. Lett. 82 (1999) 4380. ∗∗∗
[43] A. Roy, C.-Y. Lin, U. Mohideen, Phys. Rev. D 60 (1999) 111101(R). ∗∗
[44] B.W. Harris, F. Chen, U. Mohideen, Phys. Rev. A 62 (2000) 052109. ∗∗
[45] M. Bordag, B. Geyer, G.L. Klimchitskaya, V.M. Mostepanenko, Phys. Rev. D 58 (1998) 075003.
[46] V.M. Mostepanenko, in: C.-K. Au, P. Milonni, L. Spruch, J.F. Babb (Eds.), Proceedings of Workshop on Casimir Forces, ITAMP, Cambridge, 1998, p. 85.
[47] M. Bordag, B. Geyer, G.L. Klimchitskaya, V.M. Mostepanenko, Phys. Rev. D 60 (1999) 055004.
[48] M. Bordag, B. Geyer, G.L. Klimchitskaya, V.M. Mostepanenko, Phys. Rev. D 62 (2000) 011701(R). ∗
[49] M. Kardar, R. Golestanian, Rev. Mod. Phys. 71 (1999) 1233. ∗
[50] S.K. Lamoreaux, Am. J. Phys. 67 (1999) 850. ∗
[51] J.A. Espichán Carrillo, A. Maia Jr., V.M. Mostepanenko, Int. J. Mod. Phys. A 15 (2000) 2645.
[52] A. Erdélyi et al., Higher Transcendental Functions, Vol. 1, McGraw-Hill, New York, 1953.
[53] A.A. Grib, S.G. Mamayev, V.M. Mostepanenko, Vacuum Quantum Effects in Strong Fields, Friedmann Laboratory Publishing, St. Petersburg, 1994.
[54] N.D. Birrell, P.C.W. Davies, Quantum Fields in Curved Space, Cambridge University Press, Cambridge, 1982.
[55] S.G. Mamayev, V.M. Mostepanenko, in: M.A. Markov, V.A. Berezin, V.P. Frolov (Eds.), Proceedings of the Third Seminar on Quantum Gravity, World Scientific, 1985, p. 462.
[56] S.G. Mamayev, N.N. Trunov, Sov. Phys. J. (USA) 22 (1979) 766.
[57] G.T. Moore, J. Math. Phys. 11 (1970) 2679.
[58] S.A. Fulling, P.C.W. Davies, Proc. Roy. Soc. London A 348 (1976) 393.
[59] M. Razavy, J. Terning, Phys. Rev. D 31 (1985) 307.
[60] C.K. Law, Phys. Rev. A 51 (1995) 2537.
[61] V.V. Dodonov, A.B. Klimov, Phys. Rev. A 53 (1996) 2664.
[62] N.N. Bogoliubov, Yu.A. Mitropolsky, Asymptotic Methods in the Theory of Non-Linear Oscillations, Gordon and Breach, New York, 1985.
[63] P.B. Gilkey, G. Grubb, Comm. Partial Differential Equations 23 (1998) 777.
[64] K. Chadan, P.S. Sabatier, Inverse Problems in Quantum Scattering Theory, Springer, Berlin, 1989.
[65] M.L. Goldberger, Collision Theory, Wiley, New York, 1964.
[66] M.T. Jaekel, S. Reynaud, J. Phys. I (France) 1 (1991) 1395.
[67] M. Bordag, D. Hennig, D. Robaschik, J. Phys. A 25 (1992) 4483.
[68] J.R. Taylor, Scattering Theory, Wiley, New York, 1972.
[69] B.S. DeWitt, Dynamical Theory of Groups and Fields, Gordon and Breach, New York, 1965.
[70] P.B. Gilkey, Invariance Theory, The Heat Equation and the Atiyah-Singer Index Theorem, 2nd Edition, CRC Press, Boca Raton, 1995.
[71] I.G. Avramidi, Nucl. Phys. B 355 (1991) 712.
[72] A.E.M. van de Ven, Class. Quant. Grav. 15 (1998) 2311.
[73] A.O. Barvinsky, G.A. Vilkovisky, Phys. Rep. 119 (1985) 1.
[74] M.J. Booth, preprint hep-th/9803113, 1998.
[75] G. Kennedy, J. Phys. A 11 (1978) 173.
[76] T.P. Branson, P.B. Gilkey, Comm. Partial Differential Equations 15 (1990) 245.
[77] D.V. Vassilevich, J. Math. Phys. 36 (1995) 3174.
[78] T.P. Branson, P.B. Gilkey, D.V. Vassilevich, Boll. Union. Mat. Ital. B 11 (1997) 39.
[79] T.P. Branson, P.B. Gilkey, K. Kirsten, D.V. Vassilevich, Nucl. Phys. B 563 (1999) 603.
[80] K. Kirsten, Class. Quant. Grav. 15 (1998) L5.
[81] M. Bordag, E. Elizalde, K. Kirsten, J. Math. Phys. 37 (1996) 895.
[82] M. Bordag, K. Kirsten, D.V. Vassilevich, Phys. Rev. D 59 (1999) 085011. ∗∗
[83] M. Bordag, D.V. Vassilevich, J. Phys. A 32 (1999) 8247.

M. Bordag et al. / Physics Reports 353 (2001) 1–205 [84] [85] [86] [87] [88] [89] [90] [91] [92] [93] [94] [95] [96] [97] [98] [99] [100] [101] [102] [103] [104] [105] [106] [107] [108] [109] [110] [111] [112] [113] [114] [115] [116] [117] [118] [119] [120] [121] [122] [123] [124] [125] [126] [127] [128] [129] [130] [131] [132] [133] [134]

201

G. Barton, J. Phys. A 32 (1999) 525. J. AmbjHrn, R.J. Hughes, Nucl. Phys. B 217 (1983) 336. C. Peterson, T.H. Hansson, K. Johnson, Phys. Rev. D 26 (1982) 415. G. Esposito, A.Yu. Kamenshchik, I.V. Mishakov, G. Pollifrone, Class. Quant. Grav. 11 (1994) 2939. G. Esposito, A.Yu. Kamenshchik, I.V. Mishakov, G. Pollifrone, Phys. Rev. D 52 (1995) 2183. I.G. Moss, S. Poletti, Phys. Lett. B 245 (1990) 355. I.G. Moss, P.J. Silva, Phys. Rev. D 55 (1997) 1072. G. Esposito, A.Yu. Kamenshchik, K. Kirsten, Int. J. Mod. Phys. A 14 (1999) 281. M. Bordag, D. Robaschik, E. Wieczorek, Ann. Phys. (N.Y.) 165 (1985) 192. ∗ K.A. Milton, L.L. DeRaad Jr., J. Schwinger, Ann. Phys. (N.Y.) 115 (1978) 388. ∗ ∗ ∗ M. Bordag, preprint JINR-P2-84-115, JINR-Communication, 1984. M. Bordag, J. Lindig, Phys. Rev. D 58 (1998) 045003. R. Balian, B. Duplantier, Ann. Phys. (N.Y.) 112 (1978) 165. ∗ M. Bordag, Phys. Lett. B 171 (1986) 113. M. Bordag, in: D.Y. Grigoriev, V.A. Matveev, V.A. Rubakov, P.G. Tinyakov (Eds.), Seventh International Seminar on Quarks ’92, World Scienti3c, Singapore, 1993, p. 80. I.-T. Cheon, Zeitschr. Phys. D 39 (1997) 3. D. Robaschik, K. Scharnhorst, E. Wieczorek, Ann. Phys. (N.Y.) 174 (1987) 401. M.V. Congo-Pinto, C. Farina, M.R. Negrão, A.C. Tort, J. Phys. A 32 (1999) 4457. A.A. Actor, I. Bender, Fortschr. Phys. 44 (1996) 281. A.A. Actor, Fortschr. Phys. 43 (1995) 141. A.A. Actor, I. Bender, Phys. Rev. D 52 (1995) 3581. E.M. Lifshitz, L.P. Pitaevskii, Statistical Physics, Part 2, Pergamon Press, Oxford, 1980. ∗∗ N.G. van Kampen, B.R.A. Nijboer, K. Schram, Phys. Lett. A 26 (1968) 307. ∗∗ K. Schram, Phys. Lett. A 43 (1973) 283. ∗ F. Zhou, L. Spruch, Phys. Rev. A 52 (1995) 297. I. Belyanitsckii-Birula, J.B. Brojan, Phys. Rev. D 5 (1972) 485. I.E. Dzyaloshinskii, E.M. Lifshitz, L.P. Pitaevskii, Sov. Phys.—Usp. (USA) 4 (1961) 153. D. Kupiszewska, Phys. Rev. A 46 (1992) 2286. Yu.S. Barash, Sov. Radiophys. 16 (1975) 945. S.J. van Enk, Phys. Rev. A 52 (1995) 2569. W. 
Lukosz, Physica 56 (1971) 109. ∗∗ S.G. Mamaev, N.N. Trunov, Sov. Phys. J. (USA) 22 (1979) 51. J. AmbjHrn, S. Wolfram, Ann. Phys. (N.Y.) 147 (1983) 1. F. Caruso, N.P. Neto, B.F. Svaiter, N.F. Svaiter, Phys. Rev. D 43 (1991) 1300. F. Caruso, R. De Paola, N.F. Svaiter, Int. J. Mod. Phys. A 14 (1999) 2077. S. Hacyan, R. J]auregui, C. Villarreal, Phys. Rev. A 47 (1993) 4204. X.-Z. Li, H.-B. Cheng, J.-M. Li, X.-H. Zhai, Phys. Rev. D 56 (1997) 2155. V. Hushwater, Am. J. Phys. 65 (1997) 381. G.J. Maclay, Phys. Rev. A 61 (2000) 052110. H.B.G. Casimir, Physica 19 (1953) 846. M.F. Atiyah, V.K. Patodi, I. Singer, Math. Proc. Cambridge Philos. Soc. 77 (1975) 43. P.B. Gilkey, L. Smith, J. Dierential Geom. 18 (1983) 393. G. Grubb, R. Seeley, Comp. Rend. Acad. Sci. Paris, Ser. I 317 (1993) 1123. P.D. D’Eath, G. Esposito, Phys. Rev. D 44 (1991) 1713. E. Elizalde, M. Bordag, K. Kirsten, J. Phys. A 31 (1998) 1743. M. Abramowitz, I.A. Stegun, Handbook of Mathematical Functions, Dover, New York, 1970. M. Bordag, K. Kirsten, Phys. Rev. D 53 (1996) 5753. ∗ M. Bordag, K. Kirsten, Phys. Rev. D 60 (1999) 105019. M. Scandurra, J. Phys. A 32 (1999) 5679. G.E. Brown, A.D. Jackson, M. Rho, V. Vento, Phys. Lett. B 140 (1984) 285. K.A. Milton, Phys. Rev. D 22 (1980) 1441.

202 [135] [136] [137] [138] [139] [140] [141] [142] [143] [144] [145] [146] [147] [148] [149] [150] [151] [152] [153] [154] [155] [156] [157] [158] [159] [160] [161] [162] [163] [164] [165] [166] [167] [168] [169] [170] [171] [172] [173] [174] [175] [176] [177] [178] [179] [180] [181] [182] [183]


K.A. Milton, Phys. Rev. D 22 (1980) 1444. K.A. Milton, Ann. Phys. (N.Y.) 150 (1983) 432. M. De Francia, H. Falomir, E.M. Santangelo, Phys. Rev. D 45 (1992) 2129. H. Falomir, E.M. Santangelo, Phys. Rev. D 43 (1991) 539. J. Baacke, Y. Igarashi, Phys. Rev. D 27 (1983) 460. S. Leseduarte, A. Romeo, Ann. Phys. (N.Y.) 250 (1996) 448. C.M. Bender, K.A. Milton, Phys. Rev. D 50 (1994) 6547. K.A. Milton, Phys. Rev. D 55 (1997) 4940. K. Kirsten, preprint hep-th=0007251, 2000. S. Liberati, F. Belgiorno, M. Visser, D.W. Sciama, J. Phys. A 33 (2000) 2251. I. Brevik, H. Kolbenstvedt, Ann. Phys. (N.Y.) 143 (1982) 179. I. Klich, Phys. Rev. D 61 (2000) 025004. K.A. Milton, Y.J. Ng, Phys. Rev. E 57 (1998) 5504. G. Lambiase, G. Scarpetta, V.V. Nesterenko, preprint hep-th=9912176, 1999. M. Bordag, K. Kirsten, D.V. Vassilevich, J. Phys. A 31 (1998) 2381. P. Gosdzinsky, A. Romeo, Phys. Lett. B 441 (1998) 265. L.L. DeRaad Jr., K.A. Milton, Ann. Phys. (N.Y.) 136 (1981) 229. K.A. Milton, A.V. Nesterenko, V.V. Nesterenko, Phys. Rev. D 59 (1999) 105009. G. Lambiase, V.V. Nesterenko, M. Bordag, J. Math. Phys. 40 (1999) 6254. V.V. Nesterenko, I.G. Pirozhenko, J. Math. Phys. 41 (2000) 4521. I. Klich, A. Romeo, Phys. Lett. B 476 (2000) 369. M. Scandurra, J. Phys. A 33 (2000) 5707. R. Balian, B. Duplantier, Ann. Phys. (N.Y.) 104 (1977) 300. J. Blocki, J. Randrup, W.J. Swiatecki, C.F. Tsang, Ann. Phys. (N.Y.) 105 (1977) 427. B. Derjaguin, Kolloid Z. 69 (1934) 155. B.V. Derjaguin, I.I. Abrikosova, E.M. Lifshitz, Q. Rev. 10 (1956) 295. Yu.S. Barash, The van der Waals Forces, Nauka, Moscow, 1988 (in Russian). V.M. Mostepanenko, I.Yu. Sokolov, Sov. Phys.—Dokl. (USA) 33 (1988) 140. T. Datta, L.H. Ford, Phys. Lett. A 83 (1981) 314. M. Schaden, L. Spruch, Phys. Rev. A 58 (1998) 935. M. Schaden, L. Spruch, Phys. Rev. Lett. 84 (2000) 459. ∗ M. Bordag, G. Petrov, D. Robaschik, Sov. J. Nucl. Phys. (USA) 39 (1984) 828. M. Bordag, F.-M. Dittes, D. Robaschik, Sov. J. Nucl. Phys. (USA) 43 (1986) 1034. D.A.R. 
Dalvit, F.D. Mazzitelli, Phys. Rev. A 59 (1999) 3049. L.H. Ford, A. Vilenkin, Phys. Rev. D 25 (1982) 2569. G. Barton, C. Eberlein, Ann. Phys. (N.Y.) 227 (1993) 222. P.A. Maia Neto, L.A.S. Machado, Phys. Rev. A 54 (1996) 3420. D.A.R. Dalvit, P.A. Maia Neto, Phys. Rev. Lett. 84 (2000) 798. M.T. Jaekel, S. Reynaud, J. Phys. I (France) 3 (1993) 1. G. Barton, A. Calogeracos, Ann. Phys. (N.Y.) 238 (1995) 227. G. Plunien, R. SchUutzhold, G. So, Phys. Rev. Lett. 84 (2000) 1882. M. Bordag, K. Scharnhorst, Phys. Rev. Lett. 81 (1998) 3815. Ya.B. Zel’dovich, A.A. Starobinsky, Sov. Astron. Lett. (USA) 10 (9)135. M. Lachièze-Rey, J.-P. Luminet, Phys. Rep. 254 (1995) 135. G.F.R. Ellis, Gen. Relat. Grav. 2 (1971) 7. D.D. Sokolov, V.F. Shvartsman, Sov. Phys.—JETP (USA) 39 (1974) 196. H.V. Fagundes, Astrophys. J. 338 (1989) 618. A.A. Bytsenko, G. Cognola, L. Vanzo, S. Zerbini, Phys. Rep. 266 (1996) 1. A. Vilenkin, E.P.S. Shellard, Cosmic Strings and Other Topological Defects, Cambridge University Press, Cambridge, 1994. [184] T.M. Helliwell, D.A. Konkowski, Phys. Rev. D 34 (1986) 1918. [185] J.S. Dowker, Phys. Rev. D 36 (1987) 3095.

M. Bordag et al. / Physics Reports 353 (2001) 1–205 [186] [187] [188] [189] [190] [191] [192] [193] [194] [195] [196] [197] [198] [199] [200] [201] [202] [203] [204] [205] [206] [207] [208] [209] [210] [211] [212] [213] [214] [215] [216] [217] [218] [219] [220] [221] [222] [223] [224] [225] [226] [227] [228] [229] [230] [231] [232] [233] [234] [235]

203

J.S. Dowker, Class. Quant. Grav. 4 (1987) L157. N.R. Khusnutdinov, M. Bordag, Phys. Rev. D 59 (1999) 064017. M. Bordag, Ann. Phys. 47 (1990) 93. D.V. Gal’tsov, Yu.V. Gratz, A.B. Lavrent’ev, Phys. Atom. Nucl. 58 (1995) 570. U A.N. Aliev, M. Hortaacsu, N. Ozdemir, Class. Quant. Grav. 14 (1997) 3215. A.N. Aliev, Phys. Rev. D 55 (1997) 3903. M. Bordag, Ann. Phys. (N.Y.) 206 (1991) 257. P.S. Letelier, Class. Quant. Grav. 4 (1987) L75. D.V. Fursaev, Class. Quant. Grav. 11 (1994) 1431. G. Cognola, K. Kirsten, L. Vanzo, Phys. Rev. D 49 (1994) 1029. M. Bordag, K. Kirsten, J.S. Dowker, Commun. Math. Phys. 182 (1996) 371. E.R. Bezerra de Mello, V.B. Bezerra, N.R. Khusnutdinov, Phys. Rev. D 60 (1999) 063506. M.J. Du, B.E.W. Nilsson, C.N. Pope, Phys. Rep. 130 (1986) 1. M. Green, J. Schwarz, E. Witten, Superstring Theory, Vols. 1, 2, Cambridge University Press, Cambridge, 1987. S.G. Mamayev, V.M. Mostepanenko, Sov. Phys.—JETP (USA) 51 (1980) 9. V.M. Mostepanenko, Sov. J. Nucl. Phys. (USA) 31 (1980) 876. A.A. Starobinsky, Phys. Lett. B 91 (1980) 99. D. Birmingham, R. Kantowski, K.A. Milton, Phys. Rev. D 38 (1988) 1809. I.L. Buchbinder, S.D. Odintsov, Fortschr. Phys. 37 (1989) 225. A. Chodos, E. Myers, Ann. Phys. (N.Y.) 156 (1984) 412. P. Candelas, S. Weinberg, Nucl. Phys. B 237 (1984) 397. S.K. Blau, E.I. Guendelman, A. Taormina, L.C.R. Wijewardhana, Phys. Lett. B 144 (1984) 30. J.S. Dowker, Phys. Rev. D 29 (1984) 2773. A.A. Bytsenko, S. Zerbini, Class. Quant. Grav. 9 (1992) 1365. A. Sen, in: T. Piran (Ed.), Proceedings of the Eighth Marcel Grossmann Meeting on General Relativity, Part A, World Scienti3c, Singapore, 1999, p. 47. M. Fierz, Helv. Phys. Acta 33 (1960) 855. J. Mehra, Physica 37 (1967) 145. ∗ ∗ ∗ L.S. Brown, G.J. Maclay, Phys. Rev. 184 (1969) 1272. ∗ ∗ ∗ J. Schwinger, L.L. DeRaad Jr., K.A. Milton, Ann. Phys. (N.Y.) 115 (1978) 1. ∗ ∗ ∗ J.S. Dowker, G. Kennedy, J. Phys. A 11 (1978) 895. ∗∗ H. Mitter, D. Robaschik, Eur. Phys. J. B 13 (2000) 335. L. Spruch, Y. 
Tikochinsky, Phys. Rev. A 48 (1993) 4213. F.S. Santos, A. Ten]orio, A.C. Tort, Phys. Rev. D 60 (1999) 105022. A.A. Actor, Fortschr. Phys. 37 (1989) 465. J.S. Dowker, J.S. Apps, Class. Quant. Grav. 12 (1995) 1363. M. Bordag, B. Geyer, K. Kirsten, E. Elizalde, Commun. Math. Phys. 179 (1996) 215. C.M. Hargreaves, Proc. Kon. Nederl. Acad. Wet. B 68 (1965) 231. ∗ V.M. Mostepanenko, N.N. Trunov, Sov. J. Nucl. Phys. (USA) 42 (1985) 818. ∗ A.A. Maradudin, Opt. Commun. 116 (1995) 452. T.A. Leskova, A.A. Maradudin, I.V. Novikov, Appl. Opt. 38 (1999) 1197. I.S. Gradshteyn, I.M. Ryzhik, Table of Integrals, Series and Products, Academic Press, New York, 1980. V.B. Bezerra, G.L. Klimchitskaya, C. Romero, Mod. Phys. Lett. A 12 (1997) 2623. S.K. Lamoreaux, Phys. Rev. A 59 (1999) 3149(R). A. Lambrecht, S. Reynaud, Phys. Rev. Lett. 84 (2000) 5672. E.D. Palik (Ed.), Handbook of Optical Constants of Solids, Academic Press, New York, 1998. L.H. Ford, Phys. Rev. A 58 (1998) 4279. L.D. Landau, E.M. Lifshitz, Electrodynamics of Continuous Media, Pergamon, Oxford, 1982. E.I. Kats, Sov. Phys.—JETP (USA) 46 (1977) 109. G. Barton, Rep. Progr. Phys. 42 (1979) 963. P.B. Johnson, R.W. Christy, Phys. Rev. B 6 (1972) 4370.

204 [236] [237] [238] [239] [240] [241] [242] [243] [244] [245] [246] [247] [248] [249] [250] [251] [252] [253] [254] [255] [256] [257] [258] [259] [260] [261] [262] [263] [264] [265] [266] [267] [268] [269] [270] [271] [272] [273] [274] [275] [276] [277] [278] [279] [280] [281] [282] [283] [284] [285]

M. Bordag et al. / Physics Reports 353 (2001) 1–205 I. Stokroos, D. Kalicharan, J.J.L. Van der Want, W.L. Jongebloed, J. Microsc. 189 (1997) 79. M. BostrUom, Bo E. Sernelius, Phys. Rev. B 61 (2000) 2204. V.N. Dubrava, V.A. Yampol’skii, Low Temp. Phys. (USA) 25 (1999) 979. A.A. Maradudin, P. Mazur, Phys. Rev. B 22 (1980) 1677. ∗ P. Mazur, A.A. Maradudin, Phys. Rev. B 23 (1981) 695. R. Golestanian, M. Kardar, Phys. Rev. Lett. 78 (1997) 3421. ∗ ∗ ∗ R. Golestanian, M. Kardar, Phys. Rev. A 58 (1998) 1713. M. Bordag, G.L. Klimchitskaya, V.M. Mostepanenko, Mod. Phys. Lett. A 9 (1994) 2515. M. Bordag, G.L. Klimchitskaya, V.M. Mostepanenko, Phys. Lett. A 200 (1995) 95. G.L. Klimchitskaya, Yu.V. Pavlov, Int. J. Mod. Phys. A 11 (1996) 3723. G.L. Klimchitskaya, M.B. Shabaeva, Russ. Phys. J. (USA) 39 (1996) 678. V.B. Bezerra, G.L. Klimchitskaya, C. Romero, Phys. Rev. A 61 (2000) 022115. D. Deutsch, P. Candelas, Phys. Rev. D 20 (1979) 3063. J.L.M. van Bree, J.A. Poulis, B.J. Verhaar, K. Schram, Physica 78 (1974) 187. G.L. Klimchitskaya, V.M. Mostepanenko, Phys. Rev. A 63 (2001) 062108. ∗∗ M. BostrUom, Bo E. Sernelius, Phys. Rev. Lett. 84 (2000) 4757. V.B. Svetovoy, M.V. Lokhanin, Mod. Phys. Lett. A 15 (2000) 1013. V.B. Svetovoy, M.V. Lokhanin, Mod. Phys. Lett. A 15 (2001) 1437. M.J. Sparnaay, in: A. Sarlemijn, M.J. Sparnaay (Eds.), Physics in the Making, North-Holland, Amsterdam, 1989. B.V. Derjaguin, N.V. Churaev, V.M. Muller, Surface Forces, Plenum, New York, 1987. G.C.J. Rouweler, J.T.G. Overbeek, J. Chem. Soc. Faraday Trans. 67 (1971) 2117. P.H.G.M. van Blokland, J.T.G. Overbeek, J. Chem. Soc. Faraday Trans. 74 (1978) 2637. ∗∗∗ D. Tabor, R.H.S. Winterton, Nature 219 (1968) 1120. ∗∗ J.N. Israelachvili, D. Tabor, Proc. Roy. Soc. London A 331 (1972) 19. J. Israelachvili, Intermolecular and Surface Forces, Academic Press, San Diego, 1992. L.R. White, J.N. Israelachvili, B.W. Ninham, J. Chem. Soc. Faraday Trans. 72 (1976) 2526. J.T.G. Overbeek, M.J. Sparnaay, Discuss. 
Faraday Soc. 18 (1954) 12. S. Hunklinger, H. Geisselmann, W. Arnold, Rev. Sci. Instr. 43 (1972) 584. W. Arnold, S. Hunklinger, K. Dransfeld, Phys. Rev. B 19 (1979) 6049. R. Onofrio, G. Corrugno, Phys. Lett. A 198 (1995) 365. G. Corugno, Z. Fontana, R. Onofrio, G. Rizzo, Phys. Rev. D 55 (1997) 6591. E.S. Sabiski, C.H. Anderson, Phys. Rev. A 7 (1973) 790. B. Gady, D. Schleef, R. Reifenberger, D. Rimai, L.P. De Meja, Phys. Rev. B 53 (1996) 8065. B.V. Derjaguin, I.I. Abrikosova, J. Phys. Chem. Solid 5 (1958) 1. ∗∗ B.V. Derjaguin, I.I. Abrikosova, Sci. Am. 203 (1960) 47. B.V. Derjaguin, I.I. Abrikosova, Zh. Eksp. Teor. Fiz. 21 (1951) 445. H. Krupp, Adv. Colloid Interface Sci. 1 (1967) 111. S.K. Lamoreaux, Phys. Rev. Lett. 81 (1998) 4549(E). W.R. Smythe, Electrostatics and Electrodynamics, McGraw-Hill, New York, 1950. S.K. Lamoreaux, Phys. Rev. Lett. 83 (1999) 3340. G.L. Klimchitskaya, S.I. Zanette, A.O. Caride, Phys. Rev. A 63 (2001) 014101. G.L. Klimchitskaya, V.M. Mostepanenko, Comm. Mod. Phys. 1 (2000) 285. T. Ederth, Phys. Rev. A 62 (2000) 062104. J. Kim, Phys. Rep. 150 (1987) 1. E. Fischbach, G.T. Gillies, D.E. Krause, J.G. Schwan, C. Talmadge, Metrologia 29 (1992) 213. G.T. Gillies, Rep. Progr. Phys. 60 (1997) 151. E. Fischbach, C.L. Talmadge, The Search for Non-Newtonian Gravity, Springer, New York, 1998. ∗∗ G.L. Smith, C.D. Hoyle, J.H. Gundlach, E.G. Adelberger, B.R. Heckel, H.E. Swanson, Phys. Rev. D 61 (1999) 022001. V.A. Kuz’min, I.I. Tkachev, M.E. Shaposhnikov, JETP Lett. (USA) 36 (1982) 59. ∗∗ V.M. Mostepanenko, I.Yu. Sokolov, Phys. Lett. A 125 (1987) 405.

M. Bordag et al. / Physics Reports 353 (2001) 1–205 [286] [287] [288] [289] [290] [291] [292] [293] [294] [295] [296] [297] [298] [299] [300] [301] [302] [303] [304] [305] [306] [307] [308] [309] [310] [311] [312] [313] [314] [315] [316] [317] [318] [319] [320] [321] [322] [323] [324]

205

V.M. Mostepanenko, I.Yu. Sokolov, Sov. J. Nucl. Phys. (USA) 46 (1987) 685. V.M. Mostepanenko, I.Yu. Sokolov, Phys. Lett. A 132 (1988) 313. V.M. Mostepanenko, I.Yu. Sokolov, Sov. J. Nucl. Phys. (USA) 49 (1989) 1118. V.M. Mostepanenko, I.Yu. Sokolov, Phys. Rev. D 47 (1993) 2882. N. Arkani-Hamed, S. Dimopoulos, G. Dvali, Phys. Rev. D 59 (1999) 086004. E.G. Floratos, G.K. Leontaris, Phys. Lett. B 465 (1999) 95. A. Kehagias, K. Sfetsos, Phys. Lett. B 472 (2000) 39. V.P. Mitrofanov, O.I. Ponomareva, Sov. Phys.—JETP (USA) 67 (1988) 1963. R. Onofrio, Mod. Phys. Lett. A 13 (1998) 1401. G. Bressi, G. Carugno, A. Galvani, R. Onofrio, G. Ruoso, Clas. Quant. Grav. 17 (2000) 2365. M. Bordag, V.M. Mostepanenko, I.Yu. Sokolov, Phys. Lett. A 187 (1994) 35. Yu.N. Moiseev, V.M. Mostepanenko, V.I. Panov, I.Yu. Sokolov, Sov. Phys.—Dokl. (USA) 34 (1989) 147. Yu.N. Moiseev, V.M. Mostepanenko, V.I. Panov, I.Yu. Sokolov, in: R. RuTni (Ed.), Proceedings of the Fifth Marcel Grossmann Relativity Meeting, World Scienti3c, Singapore, 1989. G. Feinberg, J. Sucher, Phys. Rev. D 20 (1979) 1717. ∗ V.M. Mostepanenko, I.Yu. Sokolov, Phys. Lett. A 146 (1990) 373. G.L. Klimchitskaya, V.M. Mostepanenko, C. Romero, Ye.P. Krivtsov, A.Ye. Sinelnikov, Int. J. Mod. Phys. A 12 (1997) 1465. E. Fischbach, D.E. Krause, Phys. Rev. Lett. 83 (1999) 3593. J.C. Long, H.W. Chan, J.C. Price, Nucl. Phys. B 539 (1999) 23. G.L. Klimchitskaya, E.R. Bezerra de Mello, V.M. Mostepanenko, Phys. Lett. A 236 (1997) 280. D.E. Krause, E. Fischbach, in: C. LUammerzahl, C.W.F. Everitt, F.W. Hehl (Eds.), Testing General Relativity in Space: Gyroscopes, Clocks, and Interferometers, Springer, 2001. V.M. Mostepanenko, M. Novello, preprint hep-ph=0008035, 2000, in: A.A. Bytsenko, A.E. Goncalves, B.M. Pimentel (Eds.), Geometrical Aspects of Quantum Fields, World Scienti3c, Singapore, 2001. V.M. Mostepanenko, M. Novello, Phys. Rev. D 63 (2001) 115003. A. Grado, E. Calloni, L. Di Fiore, Phys. Rev. D 59 (1999) 042002. C.D. Hoyle, U. 
Schmidt, B.R. Heckel, E.G. Adelberger, J.H. Gundlach, D.J. Kapner, H.E. Swanson, Phys. Rev. Lett. 86 (2001) 1418. F. Serry, D. Walliser, G.J. Maclay, J. Appl. Phys. 84 (1998) 2501. E. Buks, M.L. Roukes, Phys. Rev. B 63 (2001) 033402. H. Moravitz, Phys. Rev. 187 (1969) 1792. P.W. Milonni, P.L. Knight, Opt. Comm. 9 (1973) 119. G. Barton, Proc. Roy. Soc. London A 410 (1987) 141. D.T. Alves, C. Farina, A.C. Tort, Phys. Rev. A 61 (2000) 034102. E.J. Kelsey, L. Spruch, Phys. Rev. A 18 (1978) 15. J.F. Babb, L. Spruch, Phys. Rev. A 40 (1989) 2917. G. Feinberg, J. Sucher, C.K. Au, Phys. Rep. 180 (1989) 83. F.S. Levin, D. Micha (Eds.), Long Range Forces: Theory and Recent Experiments in Atomic Systems, Plenum, New York, 1992. P.W. Milonni, M. Schaden, L. Spruch, Phys. Rev. A 59 (1999) 4259. Y. Srivastava, A. Widom, M.H. Friedman, Phys. Rev. Lett. 55 (1985) 2246. H.B. Chan, V.A. Aksyuk, R.N. Kleiman, D.J. Bishop, F. Capasso, Science 291 (2001) 1941. ∗∗ F.D. Bannon, J.R. Clark, C.T.-C. Nguyen, IEEE Journal of Solid State Circuits 35 (2000) 512. C.T.-C. Nguyen, L.P.B. Katehi, G.M. Rebez, Proc. of IEEE 86 (1998) 1756.


Roy equation analysis of ππ scattering

B. Ananthanarayanᵃ, G. Colangeloᵇ, J. Gasserᶜ, H. Leutwylerᶜ,∗

ᵃ Centre for Theoretical Studies, Indian Institute of Science, Bangalore, 560 012 India
ᵇ Institute for Theoretical Physics, University of Zurich, Winterthurerstr. 190, CH-8057 Zurich, Switzerland
ᶜ Institute for Theoretical Physics, University of Bern, Sidlerstr. 5, CH-3012 Bern, Switzerland

Received May 2000; editor: R. Petronzio

B. Ananthanarayan et al. / Physics Reports 353 (2001) 207–279

Contents

1. Introduction
2. Scattering amplitude
3. Background amplitude
4. Driving terms
5. Roy equations as integral equations
6. On the uniqueness of the solution
   6.1. Roy's integral equation in the one-channel case
   6.2. Cusps
   6.3. Uniqueness in the multi-channel case
7. Experimental input
   7.1. Elasticity below the matching point
   7.2. Input for the I = 0, 1 channels
   7.3. Phase of the P-wave from $e^+e^- \to \pi^+\pi^-$ and $\tau \to \pi^-\pi^0\nu_\tau$
   7.4. Phases at the matching point
   7.5. Input for the I = 2 channel
8. Numerical solutions
   8.1. Method used to find solutions
   8.2. Illustration of the solutions
9. Universal band
10. Consistency
11. Olsson sum rule
12. Comparison with experimental data
   12.1. Data on $\delta_0^0 - \delta_1^1$ from $K_{e4}$, and on $\delta_0^2$ below 0.8 GeV
   12.2. The ρ resonance
   12.3. Data on the I = 0 S-wave below 0.8 GeV
   12.4. Data above 0.8 GeV
13. Allowed range for $a_0^0$ and $a_0^2$
14. Threshold parameters
   14.1. S- and P-waves
   14.2. D- and F-waves
15. Values of the phase shifts at $s = M_K^2$
16. Comparison with earlier work
17. Summary and conclusions
Acknowledgements
Appendix A. Integral kernels
Appendix B. Background amplitude
   B.1. Expansion of the background for small momenta
   B.2. Constraints due to crossing symmetry
   B.3. Background generated by the higher partial waves
   B.4. Asymptotic contributions
   B.5. Driving terms
Appendix C. Sum rules and asymptotic behaviour
   C.1. Sum rules for the P-wave
   C.2. Asymptotic behaviour of the Roy integrals
Appendix D. Explicit numerical solutions
Appendix E. Lovelace–Shapiro–Veneziano model
References

∗ Corresponding author. E-mail address: [email protected] (H. Leutwyler).

Abstract

We analyse the Roy equations for the lowest partial waves of elastic ππ scattering. In the first part of the paper, we review the mathematical properties of these equations as well as their phenomenological applications. In particular, the experimental situation concerning the contributions from intermediate energies and the evaluation of the driving terms are discussed in detail. We then demonstrate that the two S-wave scattering lengths $a_0^0$ and $a_0^2$ are the essential parameters in the low energy region: once these are known, the available experimental information determines the behaviour near threshold to within remarkably small uncertainties. An explicit numerical representation for the energy dependence of the S- and P-waves is given and it is shown that the threshold parameters of the D- and F-waves are also fixed very sharply in terms of $a_0^0$ and $a_0^2$. In agreement with earlier work, which is reviewed in some detail, we find that the Roy equations admit physically acceptable solutions only within a band of the $(a_0^0, a_0^2)$ plane. We show that the data on the reactions $e^+e^- \to \pi\pi$ and $\tau \to \pi\pi\nu$ reduce the width of this band quite significantly. Furthermore, we discuss the relevance of the decay $K \to \pi\pi e\nu$ in restricting the allowed range of $a_0^0$, preparing the grounds for an analysis of the forthcoming precision data on this decay and on pionic atoms. We expect these to reduce the uncertainties in the two basic low energy parameters very substantially, so that a meaningful test of the chiral perturbation theory predictions will become possible. © 2001 Elsevier Science B.V. All rights reserved.

PACS: 11.30.Rd; 11.55.Fv; 11.80.Et; 13.75.Lb

Keywords: Roy equations; Dispersion relations; Partial wave analysis; Meson–meson interactions; Pion–pion scattering; Chiral symmetries



1. Introduction

The present paper deals with the properties of the ππ scattering amplitude in the low energy region. Our analysis relies on a set of dispersion relations for the partial wave amplitudes due to Roy [1]. These equations involve two subtraction constants, which may be identified with the S-wave scattering lengths, $a_0^0$ and $a_0^2$. We demonstrate that the subtraction constants represent the essential parameters in the low energy region: once these are known, the Roy equations allow us to calculate the partial waves in terms of the available data, to within small uncertainties. Given the strong dominance of the two S-waves and of the P-wave, it makes sense to solve the equations only for these, using experimental as well as theoretical information to determine the contributions from higher energies and from the higher partial waves. More specifically, we solve the relevant integral equations on the interval $2M_\pi < \sqrt{s} < 0.8$ GeV. One of the main results of this work is an accurate numerical representation of the S- and P-waves for a given pair of scattering lengths $a_0^0$ and $a_0^2$.

Before describing the outline of the present paper, we review previous work concerning the Roy equations. Roy's representation [1] for the partial wave amplitudes $t_\ell^I$ of elastic ππ scattering reads

\[
t_\ell^I(s) = k_\ell^I(s) + \sum_{I'=0}^{2} \sum_{\ell'=0}^{\infty} \int_{4M_\pi^2}^{\infty} ds'\, K_{\ell\ell'}^{II'}(s,s')\, \mathrm{Im}\, t_{\ell'}^{I'}(s') , \tag{1.1}
\]

where $I$ and $\ell$ denote isospin and angular momentum, respectively, and $k_\ell^I(s)$ is the partial wave projection of the subtraction term. It shows up only in the S- and P-waves,

\[
k_\ell^I(s) = a_0^I\, \delta_\ell^0 + \frac{s - 4M_\pi^2}{4M_\pi^2}\, (2a_0^0 - 5a_0^2) \left( \frac{1}{3}\, \delta_0^I \delta_\ell^0 + \frac{1}{18}\, \delta_1^I \delta_\ell^1 - \frac{1}{6}\, \delta_2^I \delta_\ell^0 \right) . \tag{1.2}
\]

The kernels $K_{\ell\ell'}^{II'}(s,s')$ are explicitly known functions (see Appendix A). They contain a diagonal, singular Cauchy kernel that generates the right hand cut in the partial wave amplitudes, as well as a logarithmically singular piece that accounts for the left hand cut. Relations (1.1) are consequences of the analyticity properties of the ππ scattering amplitude, of the Froissart bound and of crossing symmetry. They are valid on the interval $-4M_\pi^2 < s < 60M_\pi^2$ [1–3]. Combined with unitarity, the Roy equations amount to an infinite system of coupled, singular integral equations for the phase shifts. The integration is split into a low energy interval $4M_\pi^2 < s < s_0$ and a remainder, $s_0 < s < \infty$. We refer to $s_0$ as the matching point, which is chosen somewhere in the range where the Roy equations are valid. The two S-wave scattering lengths, the elasticity parameters below the matching point and the imaginary parts above that point are treated as an externally assigned input. The mathematical problem consists in solving Roy's integral equations with this input.

Soon after the original article of Roy [1] had appeared, extensive phenomenological applications were performed [4–10], resulting in a detailed analysis and exploitation of the then available experimental data on ππ scattering. For a recent review of those results, we refer the reader to the article by Morgan and Pennington [11]. Parallel to these phenomenological applications, the very structure of the Roy equations was investigated. In [13], a family of



partial wave equations was derived, on the basis of manifestly crossing symmetric dispersion relations in the variables $st + tu + us$ and $stu$. Each set in this family is valid in an interval $s_0 < s < s_1$, and the union of these intervals covers the domain $-28M_\pi^2 \le \mathrm{Re}\, s \le 125.3M_\pi^2$ (for a recent application of these dispersion relations, see [14]). Using hyperbolae in the plane of the above variables, Auberson and Epele [15] proved the existence of partial wave equations up to $\mathrm{Re}\, s = 165M_\pi^2$. Furthermore, the manifold of solutions of Roy's equations was investigated, in the single channel [16–18] as well as in the coupled channel case [19]. In the late 1970s, Pool [20] provided a proof that the original, infinite set of integral equations does have at least one solution for $\sqrt{s_0} < 4.8M_\pi$, provided that the driving terms are not too large, see also [21]. Heemskerk and Pool also examined numerically the solutions of the Roy equations, both by solving the N equation [21] and by using an iterative method [22].

It emerged from these investigations that, for a given input of S-wave scattering lengths, elasticity parameters and imaginary parts, there are in general many possible solutions to the Roy equations. This nonuniqueness is due to the singular Cauchy kernel on the right hand side of (1.1). In order to investigate the uniqueness properties of the Roy system, one may, in a first step, keep only this part of the kernels, so that the integral equations decouple: one is left with a single channel problem, that is a single partial wave, which, moreover, does not have a left hand cut. This mathematical problem was examined by Pomponiu and Wanders, who also studied the effects due to the presence of a left hand cut [16]. Investigating the infinitesimal neighbourhood of a given solution, they found that the multiplicity of the solution increases by one whenever the value of the phase shift at the matching point goes through a multiple of $\pi/2$. Note that the situation for the usual partial wave equation is different: there, the number of parameters in general increases by two whenever the phase shift at infinity passes through a positive integer multiple of $\pi$, see for instance [23,24] and references cited therein.

After 1980, interest in the Roy equations waned, until recently. For instance, in Refs. [25] these equations are used to analyse the threshold parameters for the higher partial waves, relying on the approach of Basdevant et al. [7,8]. The uncertainties in the values of $a_0^0$ and $a_0^2$ are reexamined in Refs. [26]. In recent years, it has become increasingly clear, however, that a new analysis of the ππ scattering amplitude at low energies is urgently needed. New $K_{e4}$ experiments and a measurement of the combination $a_0^0 - a_0^2$ based on the decay of pionic atoms are under way [27–31]. It is expected that these will significantly reduce the uncertainties inherent in the data underlying previous Roy equation studies, provided the structure of these equations can be brought under firm control. For this reason, the one-channel problem has been revisited in great detail in a recent publication [32], while the role of the input in Roy's equations is discussed in Ref. [33].

The main reason for performing an improved determination of the ππ scattering amplitude is that this will allow us to test one of the basic properties of QCD, namely the occurrence of an approximate, spontaneously broken symmetry: the symmetry leads to a sharp prediction for the two S-wave scattering lengths [34–42]. The prediction relies on the standard hypothesis, according to which the quark condensate is the leading order parameter of the spontaneously broken symmetry. Hence an accurate test of the prediction would allow us to verify or falsify that hypothesis [36]. First steps in this program have already been performed [37–41]. However, in the present paper, we do not discuss this issue. We follow the phenomenological path and ignore the constraints imposed by chiral symmetry altogether, in order not to bias the data analysis



with theoretical prejudice. The matching of the phenomenological representation obtained in the present work with chiral perturbation theory to two loops [42] is discussed elsewhere [43].¹

Finally, we describe the content of the present paper. Our notation is specified in Section 2. Sections 3 and 4 contain a discussion of the background amplitude and of the driving terms, which account for the contributions from the higher partial waves and from the high-energy region. As is recalled in Section 5, unitarity leads to a set of three singular integral equations for the two S-waves and for the P-wave. The uniqueness properties of the solutions to these equations are discussed in Section 6, while Section 7 contains a description of the experimental input used for energies between 0.8 and 2 GeV. In particular we also discuss the information concerning the P-wave phase shift, obtained on the basis of the $e^+e^- \to \pi^+\pi^-$ and $\tau \to \pi^-\pi^0\nu_\tau$ data. In Section 8, we describe the method used to solve the integral equations for a given input. The resulting universal band in the $(a_0^0, a_0^2)$ plane is discussed in Section 9, where we show that, in the region below 0.8 GeV, any point in this band leads to a decent numerical solution for the three lowest partial waves. As discussed in Section 10, however, the behaviour of the solutions above that energy is consistent with the input used for the imaginary parts only in part of the universal band, approximately the same region of the $(a_0^0, a_0^2)$ plane where the Olsson sum rule is obeyed (Section 11). The solutions are compared with available experimental data in Section 12, and in Section 13, we draw our conclusions concerning the allowed range of $a_0^0$ and $a_0^2$. The other threshold parameters can be determined quite accurately in terms of these two.
The outcome of our numerical evaluation of the scattering lengths and effective ranges of the lowest six partial waves as functions of $a_0^0$ and $a_0^2$ is given in Section 14, while in Section 15, we describe our results for the values of the phase shifts relevant for $K \to \pi\pi$. Section 16 contains a comparison with earlier work. A summary and concluding remarks are given in Section 17. In Appendix A we describe some properties of the Roy kernels, which are extensively used in our work. The background from the higher partial waves and from the high energy tail of the dispersion integrals is discussed in detail in Appendix B. In particular, we show that the constraints imposed by crossing symmetry reduce the uncertainties in the background, so that the driving terms can be evaluated in a reliable manner. In Appendix C we discuss sum rules connected with the asymptotic behaviour of the amplitude and show that these relate the imaginary part of the P-wave to the one of the higher partial waves, thereby offering a sensitive test of our framework. Explicit numerical solutions of the Roy equations are given in Appendix D and, in Appendix E, we recall the main features of the well-known Lovelace–Shapiro–Veneziano model, which provides a useful guide for the analysis of the asymptotic contributions.

2. Scattering amplitude

We consider elastic $\pi\pi$ scattering in the framework of QCD and restrict our analysis to the isospin symmetry limit, where the masses of the up and down quarks are taken equal and the

¹ A more detailed account of this work is in preparation.



e.m. interaction is ignored.² In this case, the scattering process is described by a single Lorentz invariant amplitude $A(s,t,u)$,

$$\langle \pi^d(p_4)\,\pi^c(p_3)\ \mathrm{out} \,|\, \pi^a(p_1)\,\pi^b(p_2)\ \mathrm{in} \rangle = \delta_{fi} + (2\pi)^4\, i\, \delta^4(P_f - P_i)\, \{\delta^{ab}\delta^{cd} A(s,t,u) + \delta^{ac}\delta^{bd} A(t,u,s) + \delta^{ad}\delta^{bc} A(u,s,t)\} .$$

The amplitude only depends on the Mandelstam variables $s$, $t$, $u$, which are constrained by $s + t + u = 4M_\pi^2$. Moreover, crossing symmetry implies

$$A(s,t,u) = A(s,u,t) .$$

The s-channel isospin components of the amplitude are given by

$$\begin{aligned}
T^0(s,t) &= 3A(s,t,u) + A(t,u,s) + A(u,s,t) , \\
T^1(s,t) &= A(t,u,s) - A(u,s,t) , \\
T^2(s,t) &= A(t,u,s) + A(u,s,t) .
\end{aligned} \tag{2.1}$$
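The isospin projection in Eq. (2.1) is simple enough to sketch in code. The snippet below is only an illustration: `isospin_amplitudes` and `toy_amplitude` are hypothetical names, and the toy function is not a physical amplitude; it is merely chosen to be symmetric under $t \leftrightarrow u$, as crossing symmetry requires.

```python
def isospin_amplitudes(A, s, t, u):
    """Return (T0, T1, T2) as defined in Eq. (2.1) for a callable A(s, t, u)."""
    T0 = 3 * A(s, t, u) + A(t, u, s) + A(u, s, t)
    T1 = A(t, u, s) - A(u, s, t)
    T2 = A(t, u, s) + A(u, s, t)
    return T0, T1, T2


def toy_amplitude(s, t, u):
    # Hypothetical test function (an assumption, not from the paper):
    # symmetric under t <-> u, so that A(s, t, u) = A(s, u, t).
    return s + 0.1 * t * u
```

With such an input, exchanging $t$ and $u$ leaves $T^0$ and $T^2$ unchanged and flips the sign of $T^1$, which is a quick consistency check on any implementation of (2.1).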

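In our normalization, the partial wave decomposition reads

$$\begin{aligned}
T^I(s,t) &= 32\pi \sum_{\ell} (2\ell+1)\, P_\ell\!\left(1 + \frac{2t}{s - 4M_\pi^2}\right) t_\ell^I(s) , \\
t_\ell^I(s) &= \frac{1}{2i\sigma(s)} \left\{ \eta_\ell^I(s)\, e^{2i\delta_\ell^I(s)} - 1 \right\} , \\
\sigma(s) &= \sqrt{1 - \frac{4M_\pi^2}{s}} .
\end{aligned} \tag{2.2}$$

The threshold parameters are the coefficients of the expansion

$$\mathrm{Re}\, t_\ell^I(s) = q^{2\ell} \{ a_\ell^I + q^2 b_\ell^I + q^4 c_\ell^I + \cdots \} \tag{2.3}$$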

with $s = 4(M_\pi^2 + q^2)$. The isospin amplitudes $\vec{T} = (T^0, T^1, T^2)$ obey fixed-t dispersion relations, valid in the interval $-28M_\pi^2 < t < 4M_\pi^2$ [2]. As shown by Roy [1], these can be written in the form³

$$\vec{T}(s,t) = (4M_\pi^2)^{-1} (s\,\mathbf{1} + t\,C_{st} + u\,C_{su})\, \vec{T}(4M_\pi^2, 0) + \int_{4M_\pi^2}^{\infty} ds'\, g_2(s,t,s')\, \mathrm{Im}\, \vec{T}(s',0) + \int_{4M_\pi^2}^{\infty} ds'\, g_3(s,t,s')\, \mathrm{Im}\, \vec{T}(s',t) . \tag{2.4}$$

The subtraction term is fixed by the S-wave scattering lengths:

$$\vec{T}(4M_\pi^2, 0) = 32\pi\, (a_0^0, 0, a_0^2) .$$

² In our numerical work, we identify the value of $M_\pi$ with the mass of the charged pion.
³ For an explicit representation of the kernels $g_2(s,t,s')$, $g_3(s,t,s')$ and of the crossing matrices $C_{st}$, $C_{su}$, we refer to Appendix A.
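As a minimal sketch of the partial wave parametrization in Eq. (2.2), the following snippet builds $t_\ell^I(s)$ from a phase shift and an elasticity. The function names are hypothetical, and the numerical value `M_PI = 0.13957` GeV for the charged pion mass is an assumption supplied here, not a number quoted in the text.

```python
import cmath
import math

M_PI = 0.13957  # charged pion mass in GeV (assumed numerical value)


def sigma(s):
    """Phase space factor sigma(s) = sqrt(1 - 4 M_pi^2 / s), for s >= 4 M_pi^2."""
    return math.sqrt(1.0 - 4.0 * M_PI**2 / s)


def partial_wave(s, delta, eta=1.0):
    """t(s) = (eta * exp(2 i delta) - 1) / (2 i sigma(s)), cf. Eq. (2.2)."""
    return (eta * cmath.exp(2j * delta) - 1.0) / (2j * sigma(s))


def q_squared(s):
    """Centre of mass momentum squared, from s = 4 (M_pi^2 + q^2)."""
    return s / 4.0 - M_PI**2
```

For $\eta = 1$ the amplitude obeys elastic unitarity, $\mathrm{Im}\, t = \sigma |t|^2$, while for $\eta < 1$ the imaginary part exceeds $\sigma |t|^2$; both properties follow directly from the formula and can serve as checks.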



The Roy equations (1.1) represent the partial wave projections of Eq. (2.4). Since the partial wave expansion of the absorptive parts converges in the large Lehmann–Martin ellipse, they are valid in the interval $-4M_\pi^2 < s < 60M_\pi^2$. If the scattering amplitude obeys Mandelstam analyticity, the fixed-t dispersion relations can be shown to hold for $-32M_\pi^2 < t < 4M_\pi^2$ and the Roy equations are then also valid in a larger domain: $-4M_\pi^2 < s < 68M_\pi^2$ (for a review, see [3]). In fact, as we mentioned in the introduction, the range of validity can be extended even further [13,15], so that Roy equations could be used to study the behaviour of the partial waves above $\sqrt{68}\,M_\pi = 1.15$ GeV, where the uncertainties in the data are still considerable. In the following, however, we focus on the low energy region. We assume Mandelstam analyticity and analyse the Roy equations in the interval from threshold to

$$s_1 = 68M_\pi^2 , \qquad \sqrt{s_1} = 1.15\ \mathrm{GeV} .$$

3. Background amplitude

The dispersion relation (2.4) shows that, at low energies, the scattering amplitude is fully determined by the imaginary parts of the partial waves in the physical region, except for the two subtraction constants $a_0^0$, $a_0^2$. In view of the two subtractions, the dispersion integrals converge rapidly. In the region between 0.8 and 2 GeV, the available phase shift analyses provide a rather detailed description of the imaginary parts of the various partial waves. Our analysis of the Roy equations allows us to extend this description down to threshold. For small values of $s$ and $t$, the contributions to the dispersion integrals from the region above 2 GeV are very small. We will rely on Regge asymptotics to estimate these.
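The energy scales quoted above are easy to cross-check. The sketch below converts between $s$ in units of $M_\pi^2$ and the energy $\sqrt{s}$ in GeV; the helper names are hypothetical and the value `M_PI = 0.13957` GeV for the charged pion mass is an assumption, not a number given in the text.

```python
import math

M_PI = 0.13957  # charged pion mass in GeV (assumed numerical value)


def energy_from_s(s_in_mpi2):
    """Energy sqrt(s) in GeV for s given in units of M_pi^2."""
    return math.sqrt(s_in_mpi2) * M_PI


def s_from_energy(e_gev):
    """s in units of M_pi^2 for an energy sqrt(s) given in GeV."""
    return (e_gev / M_PI) ** 2


if __name__ == "__main__":
    for label, s_units in (("60 M_pi^2", 60.0), ("68 M_pi^2", 68.0)):
        print(f"{label}: sqrt(s) = {energy_from_s(s_units):.3f} GeV")
    for e in (0.8, 2.0):
        print(f"{e} GeV: s = {s_from_energy(e):.1f} M_pi^2")
```

Under this assumption, $s_1 = 68M_\pi^2$ reproduces the quoted 1.15 GeV, and the energies 0.8 and 2 GeV that delimit the experimental input can be expressed in the same units.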
In the following, we split the interval of integration into a low energy part ($4M_\pi^2 \le s \le s_2$) and a high energy tail ($s_2 \le s < \infty$), with

$$\sqrt{s_2} = 2\ \mathrm{GeV} , \qquad s_2 = 205.3\, M_\pi^2 .$$

For small values of $s$ and $t$, the scattering amplitude $\vec{T}(s,t)$ is dominated by the contributions from the subtraction constants and from the low energy part of the dispersion integral over the imaginary parts of the S- and P-waves. We denote this part of the amplitude by $\vec{T}(s,t)_{SP}$. The corresponding contribution to the partial waves is given by

$$t_\ell^I(s)_{SP} = k_\ell^I(s) + \sum_{I'=0}^{2} \sum_{\ell'=0}^{1} \int_{4M_\pi^2}^{s_2} ds'\, K_{\ell\ell'}^{II'}(s,s')\, \mathrm{Im}\, t_{\ell'}^{I'}(s') . \tag{3.1}$$

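The remainder of the partial wave amplitude,

$$d_\ell^I(s) = \sum_{I'=0}^{2} \sum_{\ell'=2}^{\infty} \int_{4M_\pi^2}^{s_2} ds'\, K_{\ell\ell'}^{II'}(s,s')\, \mathrm{Im}\, t_{\ell'}^{I'}(s') + \sum_{I'=0}^{2} \sum_{\ell'=0}^{\infty} \int_{s_2}^{\infty} ds'\, K_{\ell\ell'}^{II'}(s,s')\, \mathrm{Im}\, t_{\ell'}^{I'}(s') , \tag{3.2}$$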

is called the driving term. It accounts for those contributions to the r.h.s. of the Roy equations that arise from the imaginary parts of the waves with $\ell = 2, 3, \ldots$ and in addition also contains those generated by the imaginary parts of the S- and P-waves above 2 GeV. By construction, we have

$$t_\ell^I(s) = t_\ell^I(s)_{SP} + d_\ell^I(s) . \tag{3.3}$$

For the scattering amplitude, the corresponding decomposition reads

$$\vec{T}(s,t) = \vec{T}(s,t)_{SP} + \vec{T}(s,t)_{d} . \tag{3.4}$$

We refer to $\vec{T}(s,t)_d$ as the background amplitude. The contribution from the imaginary parts of the S- and P-waves turns out to be crossing symmetric by itself. In this sense, crossing symmetry does not constrain the imaginary parts of the S- and P-waves.⁴ The symmetry can be exhibited explicitly by representing the three components of the vector $\vec{T}(s,t)_{SP}$ as the isospin projections of a single amplitude $A(s,t,u)_{SP}$ that is even with respect to the exchange of $t$ and $u$. The explicit expression involves three functions of a single variable [13,38]:

$$A(s,t,u)_{SP} = 32\pi \left\{ \tfrac{1}{3} W^0(s) + \tfrac{3}{2}(s-u)\, W^1(t) + \tfrac{3}{2}(s-t)\, W^1(u) + \tfrac{1}{2} W^2(t) + \tfrac{1}{2} W^2(u) - \tfrac{1}{3} W^2(s) \right\} . \tag{3.5}$$

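These are determined by the imaginary parts of the S- and P-waves and by the two subtraction constants $a_0^0$, $a_0^2$:

$$\begin{aligned}
W^0(s) &= \frac{a_0^0\, s}{4M_\pi^2} + \frac{s(s-4M_\pi^2)}{\pi} \int_{4M_\pi^2}^{s_2} ds'\, \frac{\mathrm{Im}\, t_0^0(s')}{s'(s'-4M_\pi^2)(s'-s)} , \\
W^1(s) &= \frac{s}{\pi} \int_{4M_\pi^2}^{s_2} ds'\, \frac{\mathrm{Im}\, t_1^1(s')}{s'(s'-4M_\pi^2)(s'-s)} , \\
W^2(s) &= \frac{a_0^2\, s}{4M_\pi^2} + \frac{s(s-4M_\pi^2)}{\pi} \int_{4M_\pi^2}^{s_2} ds'\, \frac{\mathrm{Im}\, t_0^2(s')}{s'(s'-4M_\pi^2)(s'-s)} .
\end{aligned} \tag{3.6}$$

The representation

$$A(s,t,u) = A(s,t,u)_{SP} + A(s,t,u)_{d} \tag{3.7}$$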

yields a manifestly crossing symmetric decomposition of the scattering amplitude into a leading term generated by the imaginary parts of the S- and P-waves at energies below $\sqrt{s_2}$ and a background, arising from the imaginary parts of the higher partial waves and from the high energy tail of the dispersion integrals.

4. Driving terms

In the present paper, we restrict ourselves to an analysis of the Roy equations for the S- and P-waves, which dominate the behaviour at low energies. The background amplitude only

⁴ The asymptotic behaviour of the scattering amplitude does tie the imaginary part of the P-wave to the contributions from the higher partial waves, see Appendix C.1.



generates small corrections, which can be worked out on the basis of the available experimental information. The calculation is described in detail in Appendix B. In particular, we show that crossing symmetry implies a strong constraint on the asymptotic contributions. The resulting numerical values for the driving terms are well described by polynomials in $s$, or, equivalently, in the square of the centre of mass momentum $q^2 = \tfrac{1}{4}(s - 4M_\pi^2)$. By definition, the driving terms vanish at threshold, so that the polynomials do not contain q-independent terms. In view of their relevance in the evaluation of the threshold parameters, we fix the coefficients of the terms proportional to $q^2$ with the derivatives at threshold and also pin down the term of order $q^4$ in the P-wave, such that it correctly accounts for the background contribution to the effective range of this partial wave. The remaining coefficients of the polynomial are obtained from a fit on the interval from threshold to $s_1$. The explicit result reads

$$\begin{aligned}
d_0^0(s) &= 0.116\, q^2 + 4.79\, q^4 - 4.09\, q^6 + 2.69\, q^8 , \\
d_1^1(s) &= 0.00021\, q^2 + 0.038\, q^4 + 0.94\, q^6 - 1.21\, q^8 , \\
d_0^2(s) &= 0.0447\, q^2 + 1.59\, q^4 - 6.26\, q^6 + 5.94\, q^8 ,
\end{aligned} \tag{4.1}$$

where $q$ is taken in GeV units. The range $4M_\pi^2 < s < 68M_\pi^2$ corresponds to $0 < q < 0.56$ GeV. In this region, the background amplitude is dominated by the contribution from the isoscalar spin 2 resonance $f_2(1275)$. The driving term of the $I = 0$ S-wave is due almost exclusively to this state, while those with $I = 1$ and $I = 2$ pick up a corresponding contribution only through crossed channels. For this reason, $d_1^1(s)$ and $d_0^2(s)$ are smaller than $d_0^0(s)$ by an order of magnitude. In $d_1^1(s)$, the D- and F-waves nearly cancel, so that the main contributions arise from the region above 2 GeV. The term $d_0^2(s)$ picks up small contributions both from low energies and from the asymptotic domain.

The above polynomials are shown as full lines in Fig. 1. The shaded regions represent the uncertainties of the result, which may be represented as $d_\ell^I(s) \pm e_\ell^I(s)$, with

$$\begin{aligned}
e_0^0(s) &= 0.008\, q^2 + 0.31\, q^4 - 0.33\, q^6 + 0.41\, q^8 , \\
e_1^1(s) &= 0.002\, q^2 + 0.06\, q^4 - 0.17\, q^6 + 0.21\, q^8 , \\
e_0^2(s) &= 0.005\, q^2 + 0.20\, q^4 - 0.32\, q^6 + 0.39\, q^8 .
\end{aligned} \tag{4.2}$$

Above threshold, the error bars in $d_0^0(s)$, $d_1^1(s)$ and $d_0^2(s)$ roughly correspond to 6%, 1% and 4% of $d_0^0(s)$, respectively.

As far as $d_0^0(s)$ is concerned, our result roughly agrees with earlier calculations [5,8]. Our values for $d_1^1(s)$ and $d_0^2(s)$, however, are much smaller. The bulk of the difference is of purely kinematic origin: The values taken for $s_2$ are different. While we are working with $\sqrt{s_2} = 2$ GeV, the values used in Refs. [5,8] are $\sqrt{53}\, M_\pi \simeq 1$ GeV and $\sqrt{110}\, M_\pi \simeq 1.5$ GeV, respectively. The value of $s_2$ enters the definition of the driving terms in Eq. (3.2) as the lower limit of the integration over the imaginary parts of the S- and P-waves. We have checked that, once this difference in the range of integration is accounted for, the driving terms given in these references are consistent with the above representation. Note however, that our uncertainties are considerably smaller, and we do rely on this accuracy in the following. It then matters that



Fig. 1. Driving terms versus energy in GeV. The full lines show the result of the calculation described in Appendix B. The shaded regions indicate the uncertainties associated with the input of that calculation. The dashed curves represent the contributions from the D- and F-waves below 2 GeV.



not only the range of integration, but also the integrands used in [5,8] differ from ours: In these references, it is assumed that, above the value taken for $s_2$, the behaviour of the S- and P-wave imaginary parts is adequately described by a Regge representation. The difference between such a picture and our representation for the background amplitude is best illustrated with the simple model used in the early literature, where the asymptotic region is described by a Pomeron term with $\sigma_{\mathrm{tot}} = 20$ mb and a contribution from the $\rho$-$f$-trajectory, taken from the Lovelace–Shapiro–Veneziano model (Appendix E). As discussed in detail in Appendix B.4, the assumption that an asymptotic behaviour of this type sets in early is in conflict with crossing symmetry [44]. In particular, the model overestimates the contribution to the driving terms from the region above 1.5 GeV, roughly by a factor of two. Either the value of $\sigma_{\mathrm{tot}}$ or the residue of the leading Regge trajectory or both must be reduced in order for the model not to violate the sum rule (B.6). The manner in which the asymptotic contribution is split into one from the Pomeron and one from the leading Regge trajectory is not crucial. For any reasonable partition that obeys the sum rule (B.6), the outcome for the driving terms is approximately the same. The result for $d_1^1(s)$ and $d_0^2(s)$ is considerably smaller than what is expected from the above model. The leading term $d_0^0(s)$, on the other hand, is dominated by the resonance $f_2(1275)$ and is therefore not sensitive to the behaviour of the imaginary parts in the region above 1.5 GeV.
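The polynomial representation of the driving terms in Eq. (4.1) and of their uncertainties in Eq. (4.2) can be evaluated with a few lines of code. The sketch below copies the coefficients from those two equations; the function and dictionary names are hypothetical, and the value `M_PI = 0.13957` GeV for the charged pion mass is an assumption.

```python
M_PI = 0.13957  # charged pion mass in GeV (assumed numerical value)

# Coefficients of q^2, q^4, q^6, q^8 (q in GeV), keyed by (isospin I, angular momentum l).
DRIVING = {
    (0, 0): (0.116, 4.79, -4.09, 2.69),
    (1, 1): (0.00021, 0.038, 0.94, -1.21),
    (2, 0): (0.0447, 1.59, -6.26, 5.94),
}
ERROR = {
    (0, 0): (0.008, 0.31, -0.33, 0.41),
    (1, 1): (0.002, 0.06, -0.17, 0.21),
    (2, 0): (0.005, 0.20, -0.32, 0.39),
}


def _poly(coeffs, q2):
    """Evaluate c1*q^2 + c2*q^4 + c3*q^6 + c4*q^8 for a given q^2."""
    return sum(c * q2 ** (k + 1) for k, c in enumerate(coeffs))


def driving_term(isospin, ell, s):
    """Return (d, e): driving term and its uncertainty at s (in GeV^2)."""
    q2 = s / 4.0 - M_PI**2
    key = (isospin, ell)
    return _poly(DRIVING[key], q2), _poly(ERROR[key], q2)
```

For instance, `driving_term(0, 0, 68 * M_PI**2)` evaluates $d_0^0$ and $e_0^0$ at the upper end $s_1$ of the fit interval, where $q = 4M_\pi$. The polynomials are fitted on the interval from threshold to $s_1$ only, so they should not be used outside that range.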

5. Roy equations as integral equations

Once the driving terms are pinned down, the Roy equations for the S- and P-waves express the real parts of the partial waves in terms of the S-wave scattering lengths and of a principal value integral over their imaginary parts from $4M_\pi^2$ to $s_2$. Unitarity implies that, in the elastic domain $4M_\pi^2 < s < 16M_\pi^2$, the real and imaginary parts of the partial wave amplitudes are determined by a single real parameter, the phase shift. If we were to restrict ourselves to the elastic region, setting $s_2 = 16M_\pi^2$, the Roy equations would amount to a set of coupled, nonlinear singular integral equations for the phase shifts. We may extend this range, provided the elasticity parameters $\eta_\ell^I(s)$ are known. On the other hand, since the Roy equations do not constrain the behaviour of the partial waves for $s > 68M_\pi^2$, the integrals occurring on the r.h.s. of these equations can be evaluated only if the imaginary parts in that region are known, together with the subtraction constants $a_0^0$, $a_0^2$, which also represent parameters to be assigned externally.

In the present paper, we do not solve the Roy equations in their full domain of validity, but use a smaller interval, $4M_\pi^2 < s < s_0$. The reason why it is advantageous to use a value of $s_0$ below the mathematical upper limit, $s_0 < s_1$, is that the Roy equations in general admit more than one solution. As will be discussed in detail in Section 6, the solution does become unique if the value of $\sqrt{s_0}$ is chosen between the $\rho$ mass and the energy where the $I = 0$ S-wave phase passes through $\pi/2$, which happens around 0.86 GeV. In the following, we use

$$\sqrt{s_0} = 0.8\ \mathrm{GeV} , \qquad s_0 = 32.9\, M_\pi^2 .$$

In the variable $s$, our matching point is nearly at the centre of the interval between threshold and $s_1 = 68M_\pi^2$. We are thus solving the Roy equations on the lower half of their range of validity, using the upper half to check the consistency of the solutions so obtained (Section 10). Our results are not sensitive to the precise value taken for $s_0$ (Section 9).

The Roy equations for the S- and P-waves may be rewritten in the form

$$\begin{aligned}
\mathrm{Re}\, t_\ell^I(s) = {} & k_\ell^I(s) + -\!\!\!\!\!\!\int_{4M_\pi^2}^{s_0} ds'\, K_{\ell 0}^{I0}(s,s')\, \mathrm{Im}\, t_0^0(s') + -\!\!\!\!\!\!\int_{4M_\pi^2}^{s_0} ds'\, K_{\ell 1}^{I1}(s,s')\, \mathrm{Im}\, t_1^1(s') \\
& + -\!\!\!\!\!\!\int_{4M_\pi^2}^{s_0} ds'\, K_{\ell 0}^{I2}(s,s')\, \mathrm{Im}\, t_0^2(s') + f_\ell^I(s) + d_\ell^I(s) ,
\end{aligned} \tag{5.1}$$

where $I$ and $\ell$ take only the values $(I, \ell) = (0,0)$, $(1,1)$ and $(2,0)$. The bar across the integral sign denotes the principal value integral. The functions $f_\ell^I(s)$ contain the part of the dispersive integrals over the three lowest partial waves that comes from the region between $s_0$ and $s_2$, where we are using experimental data as input. They are defined as

$$f_\ell^I(s) = \sum_{I'=0}^{2} \sum_{\ell'=0}^{1} -\!\!\!\!\!\!\int_{s_0}^{s_2} ds'\, K_{\ell\ell'}^{II'}(s,s')\, \mathrm{Im}\, t_{\ell'}^{I'}(s') . \tag{5.2}$$
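The principal value integrals in Eqs. (5.1) and (5.2) have a Cauchy singularity at $s' = s$. One standard way of handling such an integral numerically, shown here only as a generic sketch and not as the method used in the paper, is to subtract the value of the numerator at the singular point and add the analytically known logarithm back. The function name `pv_integral` and the midpoint rule are illustrative choices.

```python
import math


def pv_integral(f, a, b, c, n=2000):
    """Principal value of the integral of f(x) / (x - c) over [a, b], a < c < b.

    Uses  PV int f(x)/(x-c) dx = int (f(x)-f(c))/(x-c) dx + f(c) * ln((b-c)/(c-a)),
    with a midpoint rule for the regular first term.
    """
    fc = f(c)
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        d = x - c
        if abs(d) < 1e-12 * (b - a):
            # Node coincides with the singular point: use the derivative of f instead.
            eps = 1e-6 * (b - a)
            total += (f(c + eps) - f(c - eps)) / (2.0 * eps)
        else:
            total += (f(x) - fc) / d
    return total * h + fc * math.log((b - c) / (c - a))
```

The routine can be validated on integrands with a known closed form, for example $f(x) = x^2$ on $[0, 2]$ with $c = 1/2$, where the principal value equals $3 + \tfrac{1}{4}\ln 3$.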

The experimental input used to evaluate these integrals will be discussed in Section 7, together with the one for the elasticity parameters of the S- and P-waves. One of the main tasks we are faced with is the construction of the numerical solution of the integral equations (5.1) in the interval $4M_\pi^2 \le s \le s_0$, for a given input $\{a_0^0, a_0^2, f_\ell^I, \eta_\ell^I, d_\ell^I\}$. Once a solution is known, the real part of the amplitude can be calculated with these equations, also in the region $s_0 \le s \le s_1$.

6. On the uniqueness of the solution

The literature concerning the mathematical structure of the Roy equations was reviewed in the introduction. In the following, we first discuss the situation for the single channel case (which is simpler, but clearly shows the salient features) and then describe the generalization to the three channel problem we are actually faced with. For a detailed analysis, we refer the reader to two recent papers on the subject [32,33] and the references quoted therein.

6.1. Roy's integral equation in the one-channel case

If we keep only the diagonal, singular Cauchy kernel in (1.1), the partial wave relations decouple, and the left hand cut in the amplitudes disappears. Each one of the three partial wave amplitudes then obeys the following conditions:

(i) In the interval between the threshold $s = 4M_\pi^2$ and the matching point $s = s_0$, the real part is given by a dispersion relation

$$\mathrm{Re}\, t(s) = a + \frac{(s - 4M_\pi^2)}{\pi}\, -\!\!\!\!\!\!\int_{4M_\pi^2}^{\infty} ds'\, \frac{\mathrm{Im}\, t(s')}{(s' - 4M_\pi^2)(s' - s)} . \tag{6.1}$$

(ii) Above $s_0$, the imaginary part $\mathrm{Im}\, t(s)$ is a given input function

$$\mathrm{Im}\, t(s) = A(s) , \qquad s > s_0 . \tag{6.2}$$



(iii) For simplicity, we take the matching point in the elastic region, so that

$$t(s) = \frac{1}{\sigma(s)}\, e^{i\delta(s)} \sin\delta(s) , \qquad 4M_\pi^2 \le s \le s_0 , \tag{6.3}$$

where $\delta(s)$ is real and vanishes at threshold.

We refer the reader to [32] for a precise formulation of the regularity properties required from the amplitude and from the input absorptive part. As a minimal condition, we must require

$$\lim_{s \nearrow s_0} \mathrm{Im}\, t(s) = A(s_0) . \tag{6.4}$$

Otherwise, the principal value integral does not exist at the matching point. Eqs. (6.1)–(6.4) constitute the mathematical problem we are faced with in this case: Determine the amplitudes $t(s)$ that verify these equations for a given input of scattering length $a$ and absorptive part $A(s)$. Once a solution is known, the real part of the amplitude above $s_0$ is obtained from the dispersion relation (6.1), and $t(s)$ is then defined on $4M_\pi^2 \le s < \infty$. The following points summarize the results relevant in our context:

1. Elastic unitarity reduces the problem to the determination of the real function $\delta(s)$, defined in the interval $4M_\pi^2 \le s \le s_0$. The amplitude $t(s)$ is then obtained from (6.3).

2. A given input $\{a, A(s)\}$ does not, in general, fix the solution uniquely: in addition, the value of the phase at the matching point plays an important role. Indeed, let $t(s)$ be a solution and suppose first that the phase at the matching point is positive. For $0 < \delta(s_0) < \pi/2$, the infinitesimal neighbourhood of $t(s)$ does not contain further solutions. For $\delta(s_0) > \pi/2$, however, the neighbourhood contains an m-parameter family of solutions. The integer $m$ is determined by the value of the phase at the matching point ($[x]$ is the largest integer not exceeding $x$):

$$m = \left[ \frac{2\delta(s_0)}{\pi} \right] . \tag{6.5}$$

For a monotonically increasing phase, the index $m$ counts the number of times $\delta(s)$ goes through multiples of $\pi/2$ as $s$ varies from threshold to the matching point. We illustrate the situation for $m = 0, 1, 2$ in Fig. 2.

3. If the value of the phase at the matching point is negative, the problem does not in general have a solution. In order for the problem to be soluble at all, the input must be tuned. For $-\pi/2 < \delta(s_0) < 0$, for instance, we may keep the absorptive part $A(s)$ as it is, but tune the scattering length $a$. This situation may be characterized by $m = -1$: Instead of having a family of solutions containing free parameters, the input is subject to a constraint. Once a solution does exist, it is unique in the sense that the infinitesimal neighbourhood does not contain further solutions.

4. Consider now the case displayed in Fig. 2a, where the phase at the matching point is below $\pi/2$. This corresponds to the situation encountered in the coupled channel case, for our choice of the matching point. According to the above statements, a given input $\{a, A(s)\}$ then generates a locally unique solution, if a solution exists at all. We take it that uniqueness also holds globally, see [17].

The solution may be constructed in the following manner: Consider a family of unitary amplitudes, parametrized through $c_1, \ldots, c_n$. For any given amplitude, evaluate the right and



Fig. 2. Boundary conditions on the phase $\delta(s_0)$ for solving Roy's integral equation. Figs. a–c represent the cases $0 < \delta(s_0) < \pi/2$, $\pi/2 < \delta(s_0) < \pi$ and $\pi < \delta(s_0) < 3\pi/2$, respectively. In Fig. c, the phase winds around the Argand circle slightly more than once.

left hand sides of Eq. (6.1) and calculate the square of the difference at N points in the interval $4M_\pi^2 \le s \le s_0$. Finally, minimize the sum of these squares by choosing $c_1, \ldots, c_n$ accordingly. Since the solution is unique, it suffices to find one with this method: it is then the only one.

6.2. Cusps

In general, the solutions are not regular at the matching point, but have a cusp (branch point) there: $\delta(s) = \delta(s_0) + C (s_0 - s)^\gamma + \cdots$, with $\gamma > 0$. The phenomenon arises from our formulation of the problem; the physical amplitude is regular there. We conclude that, even if a mathematical solution can be constructed for a given input $\{a, A(s)\}$, in general it will not be acceptable physically, because it contains a fictitious singularity at the matching point.

The behaviour of the phase is sensitive to the value of the exponent: If $\gamma$ is close to 1, the discontinuity in the derivative is barely visible, while for small values of $\gamma$, it manifests itself very clearly. The strength of the singularity is determined by the constant $C$, whose value depends on the input used. In particular, if the scattering length $a$ is varied, while the absorptive part $A(s)$ is kept fixed, the size of $C$ changes. We may search for the value of $a$ at which $C$ vanishes. Although the singularity does not disappear entirely even then, it now only manifests itself in the derivatives of the function (for the solution to become analytic at $s_0$, we would need to also adapt the input for $A(s)$). In view of the fact that our solutions are inherently fuzzy, because the values of the input are subject to experimental uncertainties, we consider solutions with $C \simeq 0$ or $\gamma \simeq 1$ as physically acceptable and refer to these as solutions without cusp.

The search for solutions without cusp can be implemented as follows. Instead of fixing $a$, constructing solutions in the class of functions with a cusp and then determining the value of $a$ at which the cusp disappears, we may simply consider parametrizations that do not contain a cusp, treating the scattering length $a$ as a free parameter, on the same footing as the set $c_1, \ldots, c_n$ used to parametrize the phase shift and minimizing the difference between the left and right hand sides of Eq. (6.1). We have verified that if a solution without cusp does exist, this procedure indeed finds it: Allowing for the presence of cusps does not lead to a better minimum.



Table 1
Multiplicity of solutions in the coupled channel case. The multiplicity index $m$ is the number of free parameters occurring in the solutions of the Roy equations, if the matching point $\sqrt{s_0}$ is in the interval indicated (in GeV units). Also displayed is the variation of the physical phases $\delta_0^0$ and $\delta_1^1$ on that interval

       Range of $\sqrt{s_0}$          Range of $\delta_0^0$              Range of $\delta_1^1$              m
I      $1 < \sqrt{s_0} < 1.15$        $\pi < \delta_0^0 < 3\pi/2$        $\pi/2 < \delta_1^1 < \pi$         2
II     $0.86 < \sqrt{s_0} < 1$        $\pi/2 < \delta_0^0 < \pi$         $\pi/2 < \delta_1^1 < \pi$         1
III    $0.78 < \sqrt{s_0} < 0.86$     $0 < \delta_0^0 < \pi/2$           $\pi/2 < \delta_1^1 < \pi$         0
IV     $0.28 < \sqrt{s_0} < 0.78$     $0 < \delta_0^0 < \pi/2$           $0 < \delta_1^1 < \pi/2$           −1
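For the single-channel problem of Section 6.1, the multiplicity index of Eq. (6.5) is just the integer part of $2\delta(s_0)/\pi$. The following sketch (with the hypothetical function name `multiplicity_index`) makes the bracket notation explicit; it covers the one-channel formula only and does not attempt to reproduce the coupled channel counting of Table 1.

```python
import math


def multiplicity_index(delta_s0):
    """m = [2 * delta(s0) / pi], where [x] is the largest integer not exceeding x.

    delta_s0 is the phase at the matching point, in radians.
    """
    return math.floor(2.0 * delta_s0 / math.pi)
```

Phases in the three ranges shown in Fig. 2 give $m = 0, 1, 2$, while a phase in $-\pi/2 < \delta(s_0) < 0$ gives $m = -1$, the case in which the input is subject to a constraint.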

The net result of this discussion is that the scattering length $a$ must match the input for $A(s)$: it does not represent an independent parameter. When solving the Roy equations, we can at the same time also determine the value of $a$ that belongs to a given input for the high energy absorptive part. The conclusion remains valid even if the matching point is above the first inelastic threshold, provided the elasticity parameter is known and sufficiently smooth at the matching point. For a thorough analysis of the issue, we refer to [33].

6.3. Uniqueness in the multi-channel case

In the multichannel case, we need to determine three functions $\delta_0^0$, $\delta_1^1$ and $\delta_0^2$ for a given input $\{a_0^0, a_0^2, f_\ell^I, \eta_\ell^I, d_\ell^I\}$. The multiplicity index $m$ of the infinitesimal neighbourhood of a given solution is displayed in Table 1 [33], for various values of the matching point $s_0$. The table contains the following information. In the situations indicated with the labels I and II, the infinitesimal neighbourhood of a given solution contains a family of solutions, characterized by 2 and 1 free parameters, respectively. In case III, the solution is unique in the sense that the neighbourhood does not contain further solutions, while in case IV a solution only exists if the input is subject to a constraint ($m = -1$, compare paragraph 3 in Section 6.1). In order to uniquely characterize the solution in case I, for instance, we thus need to fix two more parameters in addition to the input, say the position of the $\rho$ resonance and its width, or the position of the $\rho$ resonance and the value of $s$ where the $I = 0$ phase passes through $\pi/2$, and similarly for II. In the following, we stick to case III, where the solution is unique for a given input.

As discussed above, each of the three partial waves will in general develop a cusp at the matching point $s_0$, unless some of the input parameters take special values. The situation encountered in practice is the following.
Let $0.1 < a_0^0 < 0.6$, and let $f_\ell^I$, $\eta_\ell^I$ and $d_\ell^I$ be fixed as well. For an arbitrary value of the scattering length $a_0^2$, the solution in general develops a strong cusp in the P-wave. This cusp can be removed by tuning $a_0^2 \to \bar{a}_0^2$, using for instance the method described in the single channel case above. Remarkably, it turns out that the solutions so obtained are nearly free of cusps in the two S-waves as well. The problem manifests itself almost exclusively in the P-wave, because our matching point is rather close to the mass of the $\rho$, where the imaginary part shows a pronounced peak. If $a_0^2$ is chosen to slightly differ from the optimal value $\bar{a}_0^2$, a cusp in the P-wave is clearly seen. We thus obtain a relation between the scattering lengths $a_0^0$ and $a_0^2$. This is how the so-called universal curve, discovered a long time ago [45], shows up in our framework. We will discuss the properties of this curve in detail below.

In principle, we might try to also fix $a_0^0$ with this method, requiring that there be no cusp in one of the two S-waves. The cusps in these are very weak, however, so the procedure does not allow us to accurately pin down the second scattering length. The choice $a_0^0 = -0.2$, for instance, still leads to a fully acceptable solution. On the other hand, we did not find a solution in the class of smooth functions for $a_0^0 = -0.5$. This shows that the analyticity properties that are not encoded in the Roy integral equations (5.1) do constrain the range of admissible values for $a_0^0$, but since that range is very large, the constraint is not of immediate interest, and we do not consider the matter further. In our numerical work, we consider values in the range $0.15 < a_0^0 < 0.30$ and use the centre of this interval, $a_0^0 = 0.225$, as our reference point.

7. Experimental input

In this section, we describe the experimental input used for the elasticity below the matching point at $\sqrt{s_0} = 0.8$ GeV and for the imaginary parts of the S- and P-waves in the energy interval between $\sqrt{s_0}$ and $\sqrt{s_2} = 2$ GeV. The references are listed in [46–60] and for an overview, we refer to [11,61]. The evaluation of the contributions from the higher partial waves and from the asymptotic region ($s > s_2$) is discussed in detail in Appendix B.

7.1. Elasticity below the matching point

The Roy equations allow us to determine the phase shifts of the S- and P-waves only if the corresponding elasticity parameters $\eta_0^0(s)$, $\eta_1^1(s)$ and $\eta_0^2(s)$ are known on the interval between threshold and the matching point. On kinematic grounds, the transition $2\pi \to 4\pi$ is the only inelastic channel open below our matching point, $\sqrt{s_0} = 0.8$ GeV.
The threshold for this reaction is at $E = 4M_\pi \simeq 0.56$ GeV, but phase space strongly suppresses the transition at low energies: a significant inelasticity only sets in above the matching point. In particular, the transition $\pi\pi \to K\bar{K}$, which occurs for $E > 2M_K \simeq 0.99$ GeV, does generate a well-known, pronounced structure in the elasticity parameters of the waves with $I = 0, 1$. Below the matching point, however, we may neglect the inelastic reactions altogether and set

$$\eta_0^0(s) = \eta_1^1(s) = \eta_0^2(s) = 1 , \qquad \sqrt{s} < 0.8\ \mathrm{GeV} .$$

We add a remark concerning the effects generated by the inelastic reaction $2\pi \to 4\pi$, which are analyzed in Ref. [58].⁵ In one of the phase shift analyses given there (solution A), the inelasticity $1 - \eta_1^1(s)$ reaches values of order 4%, already in the region of the $\rho$-resonance. The effect is unphysical: it arises because the parametrization used does not account for the strong phase space suppression at the $4\pi$ threshold.⁶ For the purpose of the analysis performed in Ref. [58], which focuses on the region above 1 GeV, this is immaterial, but in our context, it

⁵ We thank B.S. Zou and A.V. Sarantsev for providing us with the corresponding Fortran codes.
⁶ We thank Wolfgang Ochs for this remark.



Fig. 3. Comparison of the different input we used for the imaginary parts of the $I = 0$ and $I = 1$ lowest partial waves above the matching point at 0.8 GeV.

matters: We have solved the Roy equations also with that representation for the elasticities. The result shows significant distortions, in particular in the P-wave.

7.2. Input for the I = 0, 1 channels

The experimental information on the $\pi\pi$ phase shifts in the intermediate energy region comes mainly from the reaction $\pi N \to \pi\pi N$. A rather involved analysis is necessary to extract the $\pi\pi$ phase shifts from the raw data, and several different representations for the phases and elasticities are available in the literature. The main source of experimental information is still the old measurement of the reaction $\pi^- p \to \pi^-\pi^+ n$ by the CERN–Munich (CM) collaboration [50], but there are also older, statistically less precise data, for instance from Saclay [46] and Berkeley [49], as well as newer ones, such as the data of the CERN–Cracow–Munich collaboration concerning pion production on polarized protons [55] and those on the reaction $\pi^- p \to \pi^0\pi^0 n$, obtained recently by the E852 collaboration at Brookhaven [60]. For a detailed discussion of the available experimental information, we refer to [11,58,61].

For our purposes, energy-dependent analyses are most convenient, because these yield analytic expressions for the imaginary parts, so that the relevant integrals can readily be worked out. To illustrate the differences between these analyses, we plot the corresponding imaginary parts in Fig. 3, both for the $I = 0$ S-wave and for the P-wave. The representations of Refs. [48,56,58] do not extend to 2 GeV, but they do cover the range between 0.8 and 1.7 GeV. Unitarity ensures that the contributions generated by the imaginary parts of the S- and P-waves in the region



Fig. 4. Comparison of the results obtained for the dispersion integral $f_0^0$ with the various imaginary parts shown in Fig. 3.

between 1.7 and 2 GeV are very small, so that we may use these representations also there without introducing a signi@cant error. For the P-wave, the diLerences between the various parametrizations are not dramatic, but for the I = 0 S-wave, they are quite substantial. Despite these diLerences, the result obtained for the dispersive integrals are similar, at least in the range where we are solving the Roy equations. This can be seen in Fig. 4, where we plot the value of the dispersion integral f00 , de@ned in Eq. (5.2). The only visible diLerence is between parametrization B of Ref. [58] and the others. In order of magnitude, the eLect is comparable to the one occurring if the scattering length a00 is shifted by 0:01. It arises from the diLerence in the behaviour of the S-wave imaginary part in the region between 1 and 1:5 GeV. The phase shift analysis of Protopopescu et al. [49] does not cover that region, as it only extends to 1:15 GeV, but those of Au, Morgan and Pennington [56] as well as Bugg et al. [58] do. Both of these include, aside from the CM data, additional experimental information, not included in the analysis of Hyams et al. [48]. In the following, we rely on the representation of Au et al. [56] for the S-wave and the one of Hyams et al. [48] for the P-wave (the analysis of Au et al. does not include the P-wave). We have veri@ed that, using [48] also for the S-wave would not change our results below the matching point, beyond the uncertainties to be attached to the solutions, anyway. On the other hand, Au et al. [56] yield a more consistent picture above the matching point—for this reason we stick to that analysis. More precisely, we use the solution denoted by K1 (Etkin) in Ref. [56], Table I. That solution contains a narrow resonance in the 1 GeV region, which does not occur in the other phase shift analyses. 
In our opinion, the extra state is an artefact of the representation used: A close look reveals that the occurrence of this state hinges on small details of the



K-matrix representation. In fact, the resonance disappears if two of the K-matrix coefficients are slightly modified, for instance with $(-c_{12}^0, -c_{22}^0) = (3.1401, 2.8447) \to (3.2019, 2.6023)$.

7.3. Phase of the P-wave from $e^+e^- \to \pi^+\pi^-$ and $\tau \to \pi^-\pi^0\nu_\tau$

For the P-wave, the data on the processes $e^+e^- \to \pi^+\pi^-$ and $\tau \to \pi^-\pi^0\nu_\tau$ yield very useful, independent information. The corresponding transition amplitude is proportional to the pion form factor $F_{e.m.}(s)$ of the electromagnetic current and to the form factor $F_V(s)$ of the charged vector current, respectively. The data provide a measurement of the quantities $|F_{e.m.}(s)|$ and $|F_V(s)|$ in the time-like region, $s > 4M_\pi^2$. In the isospin limit, the two form factors coincide: The currents only differ by an isoscalar operator that carries odd G-parity, so that the pion matrix elements thereof vanish.

While the isospin breaking effects in $|F_V(s)|$ are very small, $\rho$–$\omega$ interference does produce a pronounced structure in the electromagnetic form factor. The $\omega$-resonance generates a second sheet pole in the isoscalar matrix elements, at $s = (M_\omega - \frac{i}{2}\Gamma_\omega)^2$. The residue of the pole is small, of order $O(m_d - m_u, e^2)$, but in view of the small width of the $\omega$, the denominator also nearly vanishes for $s = M_\omega^2$. Moreover, the pole associated with the exchange of a $\rho$ occurs in the immediate vicinity of this point, so that the transition amplitude involves a sum of two contributions that rapidly change with $s$, both in magnitude and phase. Since the interference phenomenon is well understood, it can be corrected for. When this is done, the data on the two processes $e^+e^- \to \pi^+\pi^-$ and $\tau \to \pi^-\pi^0\nu_\tau$ are in remarkably good agreement (for a review, see [62,63]).

We denote the phase of the vector form factor by $\phi(s)$,

$$F_V(s) = |F_V(s)|\, e^{i\phi(s)}.$$

In the elastic region $4M_\pi^2 < s < 16M_\pi^2$, the final state interaction exclusively involves $\pi\pi$ scattering, so that the Watson theorem implies that the phase $\phi(s)$ coincides with the P-wave phase shift,

$$\phi(s) = \delta_1^1(s), \qquad 4M_\pi^2 < s < 16M_\pi^2.$$

In fact, phase space suppresses the inelastic channels also in this case: the available data on the decay channel $\tau \to 4\pi\,\nu_\tau$ show that, for $E < 0.9$ GeV, the inelasticity is below 1%, so that the phase of the form factor must agree with the P-wave phase shift to high accuracy [64].

In the region where the singularity generated by $\rho$-exchange dominates, in particular also in the vicinity of our matching point, the form factor is well represented by a resonance term and a slowly varying background. Quite a few such representations may be found in the recent literature. Since the uncertainties in the data (statistical as well as systematic) are small, these parametrizations agree quite well. In the following, we use the Gounaris–Sakurai representation of Ref. [65] as a reference point. That representation involves a linear superposition of three resonance terms, associated with $\rho(770)$, $\rho(1450)$ and $\rho(1700)$. We have investigated the uncertainties to be attached to this representation by (a) comparing the magnitude of the form factor with the available data (see footnote 7), (b) comparing it with other parametrizations, (c) varying the

Footnote 7: We are indebted to Simon Eidelman and Fred Jegerlehner for providing us with these.



Table 2. Values of the phases $\delta_0^0$ and $\delta_1^1$ at 0.8 GeV, in degrees. The first three rows stem from analyses of the data at a fixed value of the energy ("energy-independent"), while the remaining entries are obtained from a fit to the data that relies on an explicit parametrization of the energy dependence ("energy-dependent analysis").

$\delta_0^0$     $\delta_1^1$      $\delta_1^1 - \delta_0^0$    Ref.

Energy-independent analyses:
81.7 ± 3.9       105.2 ± 1.0       23.4 ± 4.0       [47,48]
90.4 ± 3.6       115.2 ± 1.2       24.8 ± 3.8       [51] s-channel moments
85.7 ± 2.9       116.0 ± 1.8       30.3 ± 3.4       [51] t-channel moments

Energy-dependent analyses:
81.6 ± 4.0       108.1 ± 1.4       26.5 ± 4.2       [49, Table VI]
80.9             105.9             25.0             [47,48]
79.5             106.1             26.5             [58] solution A
79.9             106.8             26.9             [58] solution B
80.7             -                 -                [56] solution K1
82.0             -                 -                [56] solution K1 (Etkin)

resonance parameters in the range quoted in Ref. [65] and (d) using the fact that analyticity imposes a strong correlation between the phase of the form factor and its magnitude. On the basis of this analysis, we conclude that the $e^+e^-$ and $\tau$ data determine the phase of the P-wave at 0.8 GeV to within an uncertainty of $\pm 2^\circ$. A detailed comparison between the phase of the form factor and the solution of the Roy equations for the P-wave will be given in Section 12.2.

7.4. Phases at the matching point

In the framework of our analysis, the input used for $s > s_0$ enters in two ways: (i) it specifies the value of the three phases at the matching point and (ii) it determines the contributions to the Roy equation integrals from the region above that point. Qualitatively, we are dealing with a boundary value problem: At threshold, the phases vanish, while at the matching point, they are specified by the input. The solution of the Roy equations then yields the proper interpolation between these boundary values. The behaviour of the imaginary parts above the matching point is less important than the boundary values, because it only affects the slope and the curvature of the solution.

We now discuss the available information for the phases $\delta_0^0$ and $\delta_1^1$ at the matching point. The values obtained from the high energy, high statistics $\pi N \to \pi\pi N$ experiments are collected in Table 2. In those cases where the published numbers do not directly apply at 0.8 GeV, we have used a quadratic interpolation between the three values of the energy closest to this one. The errors given in the third column are obtained by adding those from the first two columns in quadrature. For the energy-dependent entries, the error analysis is more involved; only Ref. [49] explicitly quotes an error. The scatter seen in the table partly arises from the fact that different methods of analysis are used.
The corresponding systematic uncertainties are not covered by the error bars quoted in the individual phase shift analyses: Taken at face value, the numbers listed in the table are contradictory, particularly in the case of the P-wave. For a thorough discussion of the experimental discrepancies, we refer to [61].
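The statements about Table 2 can be checked with a few lines of arithmetic. The sketch below is only an illustration: it takes the energy-independent entries of the table at face value, rebuilds the errors of the third column by quadrature, and uses a plain inverse-variance weighted mean with its $\chi^2$ as a rough measure of how well the three analyses agree. The function names are hypothetical, and treating the entries as independent Gaussian measurements is an assumption made here, not a statement about how the original analyses were combined.

```python
import math

# Energy-independent entries of Table 2, in degrees: (value, error).
delta00 = [(81.7, 3.9), (90.4, 3.6), (85.7, 2.9)]
delta11 = [(105.2, 1.0), (115.2, 1.2), (116.0, 1.8)]
diff = [(23.4, 4.0), (24.8, 3.8), (30.3, 3.4)]


def weighted_mean_chi2(points):
    """Inverse-variance weighted mean and chi^2 about that mean."""
    weights = [1.0 / err ** 2 for _, err in points]
    mean = sum(w * val for w, (val, _) in zip(weights, points)) / sum(weights)
    chi2 = sum(w * (val - mean) ** 2 for w, (val, _) in zip(weights, points))
    return mean, chi2


# Errors of the third column: first two columns added in quadrature.
quad_errors = [round(math.hypot(e0, e1), 1)
               for (_, e0), (_, e1) in zip(delta00, delta11)]
print("quadrature errors:", quad_errors)

for label, column in (("delta_0^0", delta00),
                      ("delta_1^1", delta11),
                      ("difference", diff)):
    mean, chi2 = weighted_mean_chi2(column)
    print(f"{label}: mean = {mean:.1f} deg, chi2 = {chi2:.1f} (2 d.o.f.)")
```

Under these assumptions the P-wave column yields a $\chi^2$ of order 50 for 2 degrees of freedom, which quantifies the contradiction mentioned in the text, whereas the difference column gives a $\chi^2$ close to 2.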



As discussed above, both the statistical and the systematic uncertainties of the $e^+e^-$ and $\tau$ data are considerably smaller. They constrain the phase of the P-wave at 0.8 GeV to a narrow range, centered around the value $\delta_1^1(s_0) = 108.9^\circ$ obtained with the Gounaris–Sakurai representation of the form factor in Ref. [65]:

$$\delta_1^1(s_0) = 108.9^\circ \pm 2^\circ. \qquad (7.1)$$

The comparison with the numbers listed in the second column of the table shows that this value is within the range of the results obtained from $\pi N \to \pi\pi N$. Unfortunately, the $e^+e^-$ and $\tau$ data only concern the P-wave. To pin down the I = 0 S-wave, we observe that the overall phase of the scattering amplitude drops out when considering the difference $\delta_1^1 - \delta_0^0$, so that one of the sources of systematic error is absent. Indeed, the third column in the table shows that the outcome of the various analyses is consistent with the assumption that the fluctuations seen are of statistical origin. The statistical average of the energy-independent analyses yields $\delta_1^1(s_0) - \delta_0^0(s_0) = 26.6^\circ \pm 3.7^\circ$, with $\chi^2 = 2$ for 2 degrees of freedom (as the numbers are based on the same data, we have inflated the error bar: the number given is the mean error of the three data points). The remaining entries in the table neatly confirm this result. Combining it with the one in the fourth row, which is based on independent data, we finally arrive at

$$\delta_1^1(s_0) - \delta_0^0(s_0) = 26.6^\circ \pm 2.8^\circ. \qquad (7.2)$$

Since the value for $\delta_1^1$ comes from the data on the form factor, while the one for the difference $\delta_1^1 - \delta_0^0$ is based on the reaction $\pi N \to \pi\pi N$, these numbers are independent, so that it is legitimate to combine them. Adding errors quadratically, we obtain

$$\delta_0^0(s_0) = 82.3^\circ \pm 3.4^\circ. \qquad (7.3)$$

In the following, we rely on the two values for the phases at the matching point given in Eqs. (7.1) and (7.3). We emphasize that the $\pi N \to \pi\pi N$ data are consistent with these: in fact, the result of the energy-dependent analysis quoted in the fourth row of the table is in nearly perfect agreement with the above numbers. We are exploiting the fact that the $e^+e^-$ and $\tau$ data strongly constrain the behaviour of the P-wave in the region of the $\rho$, thus reducing the uncertainties in the value of $\delta_1^1$ at the matching point.

For the principal value integrals to exist, we need to continuously connect the values of the imaginary parts calculated from the phases at the matching point with those of the phase shift representation we wish to use. This can be done either by slightly modifying the parameters occurring in the representation in question or with a suitable interpolation of the phases between the matching point and the $K\bar{K}$ threshold. We have checked that our results do not depend on how that is done, as long as the interpolation is smooth. Note that, for the representation $K_1$ (Etkin) [56], our reference input for the imaginary part of the I = 0 S-wave, an interpolation is not needed: The last row of Table 2 shows that, at the matching point, this representation nearly coincides with the central value in Eq. (7.3).

7.5. Input for the I = 2 channel

The uncertainties in this channel are rather large. The current experimental situation is summarized in Fig. 5, where we show the data points from the two main experiments [52,54], and



Fig. 5. Different data sets for the S-wave in the I = 2 channel and curves that we have used as input in the Roy equation analysis.

five different parametrizations that we will use as input. The central one is our best fit to the data of the Amsterdam–CERN–Munich collaboration (ACM) [54] solution B (which we call from now on ACM(B)) with a parametrization à la Schenk [66]. To cover the rather wide scatter of the data, we have varied the input in this channel, using the five curves shown in the figure, together with $\eta_0^2 = 1$ (note that for the Roy equation analysis, only the value of the scattering length $a_0^2$ and the behaviour of the imaginary part above 0.8 GeV matter).

8. Numerical solutions

In the preceding section, the input required to evaluate the r.h.s. of our system of equations was discussed in detail. In the present section, we describe the numerical method used to solve this system and illustrate the outcome with an example.

8.1. Method used to find solutions

We search for solutions of the Roy equations by numerically minimizing the square of the difference between the left and right hand sides of Eq. (5.1) in the region between threshold and 0.8 GeV. As we are neglecting the inelasticity in this region, the real and imaginary parts of $t_\ell^I(s)$ are determined by a single real function, the phase $\delta_\ell^I(s)$. In principle, the minimization should be performed over the whole space of physically acceptable functions $\{\delta_0^0(s), \delta_1^1(s), \delta_0^2(s)\}$, but for obvious practical reasons we restrict ourselves to functions described by a simple parametrization. We will use the one proposed by Schenk some time ago [66], allowing for an additional


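parameter in the polynomial part:

$$\tan\delta_\ell^I = \sqrt{1 - \frac{4M_\pi^2}{s}}\; q^{2\ell} \left\{A_\ell^I + B_\ell^I q^2 + C_\ell^I q^4 + D_\ell^I q^6\right\} \left(\frac{4M_\pi^2 - s_\ell^I}{s - s_\ell^I}\right). \qquad (8.1)$$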

The first term represents the scattering length, while the second is related to the effective range:

$$a_\ell^I = A_\ell^I, \qquad b_\ell^I = B_\ell^I + \frac{4}{s_\ell^I - 4M_\pi^2}\,A_\ell^I - \frac{1}{M_\pi^2}\,(A_\ell^I)^3\,\delta_{\ell 0}. \qquad (8.2)$$
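As a cross-check of Eq. (8.2), the threshold expansion of the parametrization (8.1) can be worked out explicitly. The steps below are a sketch that assumes the standard conventions $s = 4(M_\pi^2 + q^2)$, $t_\ell^I = (e^{2i\delta_\ell^I} - 1)/(2i\sigma)$ with $\sigma = \sqrt{1 - 4M_\pi^2/s}$, and $\mathrm{Re}\, t_\ell^I = q^{2\ell}\{a_\ell^I + b_\ell^I q^2 + O(q^4)\}$; these definitions are not restated in this section and are taken here as assumptions.

```latex
\begin{align*}
\frac{4M_\pi^2 - s_\ell^I}{s - s_\ell^I}
  &= 1 + \frac{4q^2}{s_\ell^I - 4M_\pi^2} + O(q^4), \qquad
\sigma^2 = \frac{q^2}{M_\pi^2 + q^2} = \frac{q^2}{M_\pi^2} + O(q^4), \\
\frac{\tan\delta_\ell^I}{\sigma}
  &= q^{2\ell}\Big\{A_\ell^I + \Big(B_\ell^I + \frac{4A_\ell^I}{s_\ell^I - 4M_\pi^2}\Big)q^2 + O(q^4)\Big\}, \\
\mathrm{Re}\,t_\ell^I
  &= \frac{\sin 2\delta_\ell^I}{2\sigma}
   = \frac{\tan\delta_\ell^I}{\sigma}\,\frac{1}{1 + \tan^2\delta_\ell^I}
   = \frac{\tan\delta_\ell^I}{\sigma}\Big\{1 - \frac{(A_\ell^I)^2 q^2}{M_\pi^2}\,\delta_{\ell 0} + O(q^4)\Big\}.
\end{align*}
```

Collecting the terms of order $q^{2\ell}$ and $q^{2\ell+2}$ reproduces the two expressions in Eq. (8.2). The cubic term only appears for $\ell = 0$, because $\tan^2\delta_\ell^I$ is of order $q^{4\ell+2}$.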

In each channel, one of the five parameters is fixed in order to ensure the proper value of the phase at $s_0$. Moreover, the S-wave scattering lengths $a_0^0$ and $a_0^2$ are identified with the two constants that specify the subtraction polynomials in the Roy equations. As discussed in Section 6, we need to tune the value of $a_0^2$ in order to avoid cusps. Treating this parameter on the same footing as the others, we are dealing altogether with 15 − 3 − 1 = 11 free variables, to be determined by a minimization procedure. Our choice of $s_0$ ensures that the solution is unique, and therefore the method is safe: The choice of a bad parametrization would manifest itself in a failure of the minimization method, in that the minimum would not yield a decent solution.

The square of the difference between the left and right hand sides of the Roy equations is calculated at 22 points between threshold and $s_0$ for each of the three waves, so that the sum of squares ($\Delta^2_{\rm Roy}$) contains 66 terms. The minimization of the function $\Delta^2_{\rm Roy}$ over 11 parameters can be handled by standard numerical routines [67]. Our procedure does generate decent solutions: The differences between the left and right hand sides of the Roy equations are not visible on our plots; they are typically of order $10^{-3}$. The equations could be solved even more accurately by allowing for more degrees of freedom in the parametrization of the phases, but, in view of the uncertainties in the input, the accuracy reached is perfectly sufficient. Note also that the exact solution corresponding to a given input contains cusps. We have checked that these are too small to matter: Enlarging the space of functions on which the minimum is searched by explicitly allowing for such cusps in the parametrization of the phases, we find that the solutions remain practically the same.

8.2. Illustration of the solutions

To illustrate various features of our numerical solutions, we freeze for a moment all the inputs and analyse the properties of the specific solution we then get. The input for the imaginary parts above $s_0$ is the following: For the I = 0 wave, we use the parametrization labelled $K_1$ (Etkin) of Au et al. [56]. In the case of the I = 1 wave, we rely on the energy-dependent analysis of Hyams et al. [48], smoothly modified between $s_0$ and $4M_K^2$ to match the value $\delta_1^1(s_0) = 108.9^\circ$. For the I = 2 wave, we take the central curve in Fig. 5. The driving terms are specified in Eq. (4.1). Moreover we fix $a_0^0 = 0.225$. With this input, the minimization leads to $a_0^2 = -0.0371$ and the Schenk parameters take the values listed in Table 3, in units of $M_\pi$. The plot in Fig. 6 shows that the numerical solution is indeed very good: Below $s_0$, it is not possible to distinguish the two curves representing the right and left hand sides of Eq. (5.1). For this solution we found as a minimum $\Delta^2_{\rm Roy} = 2.1 \times 10^{-5}$, which corresponds to an average difference between the right and left hand sides of about $6 \times 10^{-4}$.
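The two numbers quoted for this solution are related by simple arithmetic. The line below assumes that the "average difference" is the root mean square over the 66 terms of the sum of squares; the text does not spell out the averaging, so this reading is an assumption.

```latex
\sqrt{\frac{\Delta^2_{\rm Roy}}{66}}
  = \sqrt{\frac{2.1 \times 10^{-5}}{66}}
  \simeq 5.6 \times 10^{-4}
  \approx 6 \times 10^{-4}.
```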



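Table 3. Schenk parameters of the solution shown in Fig. 6 (in units of $M_\pi$).

               I = 0                      I = 1                      I = 2
$A_\ell^I$     0.225                      $3.63 \times 10^{-2}$      $-3.71 \times 10^{-2}$
$B_\ell^I$     0.246                      $1.34 \times 10^{-4}$      $-8.55 \times 10^{-2}$
$C_\ell^I$     $-1.67 \times 10^{-2}$     $-6.98 \times 10^{-5}$     $-7.54 \times 10^{-3}$
$D_\ell^I$     $-6.40 \times 10^{-4}$     $1.41 \times 10^{-6}$      $1.99 \times 10^{-4}$
$s_\ell^I$     36.7                       30.7                       −11.9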
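The parametrization (8.1) with the entries of Table 3 can be evaluated directly, which gives a quick check that the phases at the matching point come out near the values in Eqs. (7.1) and (7.3). The sketch below is illustrative only: the pion mass used to convert 0.8 GeV into units of $M_\pi$ is an assumed value, the function name is hypothetical, and the use of `atan2` to follow the phase continuously through $90^\circ$ at $s = s_\ell^I$ is a choice made here. Since the table quotes rounded parameters, agreement is expected only at the level of a fraction of a degree.

```python
import math

M_PI_GEV = 0.13957  # assumed charged-pion mass in GeV (not stated in this section)

# Table 3 in units of M_pi; "l" is the angular momentum of the wave.
SCHENK = {
    0: dict(l=0, A=0.225, B=0.246, C=-1.67e-2, D=-6.40e-4, s_l=36.7),
    1: dict(l=1, A=3.63e-2, B=1.34e-4, C=-6.98e-5, D=1.41e-6, s_l=30.7),
    2: dict(l=0, A=-3.71e-2, B=-8.55e-2, C=-7.54e-3, D=1.99e-4, s_l=-11.9),
}


def schenk_phase_deg(isospin, energy_gev):
    """Phase shift in degrees from Eq. (8.1), for s above threshold."""
    p = SCHENK[isospin]
    s = (energy_gev / M_PI_GEV) ** 2          # s in units of M_pi^2
    q2 = s / 4.0 - 1.0                        # q^2 in units of M_pi^2
    sigma = math.sqrt(1.0 - 4.0 / s)
    poly = p["A"] + p["B"] * q2 + p["C"] * q2 ** 2 + p["D"] * q2 ** 3
    x = sigma * q2 ** p["l"] * poly
    pole = 4.0 - p["s_l"]
    # tan(delta) = x * pole / (s - s_l); atan2 with a common sign flip keeps
    # the phase continuous when s crosses s_l (where delta = 90 degrees).
    return math.degrees(math.atan2(x * abs(pole),
                                   math.copysign(1.0, pole) * (s - p["s_l"])))


for isospin in (0, 1, 2):
    print(f"I = {isospin}: delta(0.8 GeV) = {schenk_phase_deg(isospin, 0.8):.1f} deg")
```

With these inputs the I = 0 and I = 1 phases at 0.8 GeV come out close to 82 and 109 degrees, respectively, and the I = 1 phase passes through 90 degrees at $\sqrt{s} = \sqrt{30.7}\,M_\pi$, in the region of the $\rho$.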

Fig. 6. Numerical solution of the Roy equations for $a_0^0 = 0.225$, $a_0^2 = -0.0371$ (the value of $a_0^0$ corresponds to the centre of the range considered, while the one of $a_0^2$ results if the input used for $\mathrm{Im}\, t_0^2$ is taken from the central curve in Fig. 5). The arrow indicates the limit of validity of the Roy equations.

Having solved the Roy equations in the low-energy region, we now have a representation for the imaginary parts of the three lowest partial waves from threshold up to $s_2$. Since the driving terms account for all remaining contributions, we can then calculate the Roy representation for the real parts from threshold up to 1.15 GeV (full lines in Fig. 6). On the same plot, above $s_0$, we also show the real part of the partial wave representation that we used as an input for the imaginary parts (dashed lines). The comparison shows that the input we are using is well compatible with the Roy equations (we should stress at this point that the Roy equations have not been used in any of the phase-shift analyses that we are using as input).



Fig. 7. Universal band. The five lines correspond to the five different curves shown in Fig. 5 (the top line, for instance, results if the input for $\mathrm{Im}\, t_0^2$ in the region above 0.8 GeV is taken from the top curve in that figure). $S_0$ marks our reference point: $a_0^0 = 0.225$, $a_0^2 = -0.0371$. The bar attached to it indicates the uncertainty in $a_0^2$ due to the one in the phase $\delta_0^0$ at the matching point, the most important remaining source of error if the input for $\mathrm{Im}\, t_0^2$ is held fixed.

9. Universal band

As we have discussed in the preceding sections, for a given value of $a_0^0$ and fixed input, the Roy equations admit a solution without cusp only for a single value of $a_0^2$. By varying the input value of $a_0^0$, the Roy equations define a function $a_0^2 = F(a_0^0)$ that is known in the literature as the "universal curve" [45]. The experimental uncertainties in the input above 0.8 GeV convert this curve into a band. The universal band is the area in the $(a_0^0, a_0^2)$ plane that is allowed by the constraints given by the $\pi\pi$ scattering data above 0.8 GeV and the Roy equations. In this section we give a more precise definition of our universal band, and calculate it accordingly.

We first point out that the universal curve $a_0^2 = F(a_0^0)$ depends rather mildly on the input in the I = 0 and 1 channels (a more quantitative statement concerning this dependence is given below). For this reason, we only consider the uncertainties in the input for the I = 2 channel. The available data in this channel are shown in Fig. 5, together with five different curves that we have used as input. For each one of these, we obtain a universal curve, which is nearly a straight line in the $(a_0^0, a_0^2)$ plane. The resulting five lines are shown in Fig. 7. The central one is well represented by the following second degree polynomial:

$$a_0^2 = -0.0849 + 0.232\,a_0^0 - 0.0865\,(a_0^0)^2. \qquad (9.1)$$



The analogous representations for the top and bottom lines read:

$$a_0^2 = -0.0774 + 0.240\,a_0^0 - 0.0881\,(a_0^0)^2,$$
$$a_0^2 = -0.0922 + 0.225\,a_0^0 - 0.0847\,(a_0^0)^2. \qquad (9.2)$$
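The three polynomials in Eqs. (9.1) and (9.2) are easy to evaluate. The following sketch, with hypothetical function names, computes the central, top and bottom curves and tests whether a given pair of scattering lengths lies between the two extreme lines. It only encodes the fitted polynomials quoted above and says nothing about the range of $a_0^0$ over which they were fitted.

```python
def a20_central(a00):
    """Central universal curve, Eq. (9.1)."""
    return -0.0849 + 0.232 * a00 - 0.0865 * a00 ** 2


def a20_top(a00):
    """Top line of the band, first relation in Eq. (9.2)."""
    return -0.0774 + 0.240 * a00 - 0.0881 * a00 ** 2


def a20_bottom(a00):
    """Bottom line of the band, second relation in Eq. (9.2)."""
    return -0.0922 + 0.225 * a00 - 0.0847 * a00 ** 2


def in_universal_band(a00, a20):
    """True if (a00, a20) lies between the bottom and top lines."""
    return a20_bottom(a00) <= a20 <= a20_top(a00)


a00 = 0.225
print(f"central: {a20_central(a00):.4f}")
print(f"top:     {a20_top(a00):.4f}")
print(f"bottom:  {a20_bottom(a00):.4f}")
print("reference point inside band:", in_universal_band(a00, -0.0371))
```

At $a_0^0 = 0.225$ the central polynomial returns the reference value $a_0^2 = -0.0371$ to the quoted precision.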

The region between these two solid lines is our universal band. It is difficult to make a precise statement in probabilistic terms of how unlikely it is that the physical values of the two scattering lengths are outside this band. With our rather generous choice of the two extreme curves, we consider it fair to say that the experimental information above the matching point essentially excludes such values. In fact, we will argue below that the theoretical constraints arising from the consistency of the Roy equations above the matching point restrict the admissible region even further.

We now turn to the dependence of the universal curve $a_0^2 = F(a_0^0)$ on the input in the I = 0 and I = 1 channels, keeping the one for I = 2 fixed. Changes in the input above $2M_K$ are practically invisible at threshold: If we keep the phase shifts at the matching point fixed, the three different available inputs for the I = 0 and 1 channels yield values of $a_0^2$ that differ by less than one permille. The phase shifts at $s_0$ are the only relevant factor here. Moreover, for the value of $a_0^2$, $\delta_0^0(s_0)$ is much more important than $\delta_1^1(s_0)$: Shifts of $\delta_1^1(s_0)$ by $\pm 2^\circ$ change the value of $a_0^2$ roughly by a permille, but a change by $\pm 3.4^\circ$ in $\delta_0^0(s_0)$ induces a shift of $\Delta a_0^2 = \pm 8.4 \times 10^{-4}$, which amounts to 2%. Even so, this is much smaller than the width of the band, as can be seen in Fig. 7.

We have also varied $\sqrt{s_0}$ within the bounds 0.78 and 0.86 GeV and found that the dependence of the relation $a_0^2 = F(a_0^0)$ on $s_0$ is rather weak. To exemplify, we mention that for the solution with $a_0^0 = 0.25$ at the centre of the universal band, a shift from $\sqrt{s_0} = 0.8$ to 0.85 GeV changes $a_0^2$ by $10^{-3}$.

10. Consistency

It takes a good balancing of the various terms occurring in the Roy equations for the partial waves not to violate the unitarity limit.
In the case of the S-wave with I = 0, for instance, the contribution to $\mathrm{Re}\, t_0^0$ that arises from the subtraction term $k_0^0(s)$ is very large already at 1 GeV: The solution shown in Fig. 6 corresponds to $a_0^0 = 0.225$ and $a_0^2 = -0.0371$, so that $k_0^0(s) = 2.7$ for $s = 1\ \mathrm{GeV}^2$. As the energy grows, the term increases and reaches $k_0^0(s_1) = 3.6$ at the upper end of the region where our equations are valid, $s_1 = 68\,M_\pi^2$. Unless the contributions from the dispersion integrals nearly compensate the subtraction term, the unitarity limit, $|\mathrm{Re}\, t_0^0| \le (2\sigma)^{-1} \simeq \frac{1}{2}$, is violated. The example in Fig. 6 demonstrates that we do find solutions for which such a cancellation takes place, with values of $a_0^0$, $a_0^2$ that are within the universal band.

It is striking that, above the matching point, this solution very closely follows the real part of the input. In a restricted sense, this is necessary for the solution to be acceptable physically: The solution is obtained by identifying the imaginary part above the matching point with the one obtained from a particular representation of the partial waves. The Roy equations then determine the real part of the amplitude in the region below $\sqrt{s_1} = 1.15$ GeV. If the result were very different from the real part of the particular representation used, we would have



to conclude that this representation cannot properly describe the physics. This amounts to a consistency condition: Above the matching point, the Roy solution should not strongly deviate from the real part of the input. The condition can be met only if the cancellation discussed above takes place, but it is stronger. The example in Fig. 6 demonstrates that there are solutions that obey the consistency condition remarkably well, indicating that our apparatus is indeed working properly. We will discuss the consistency condition on a quantitative level below.

Before entering this discussion, we briefly comment on a different aspect of our framework: the stability of the solutions. The behaviour below 0.8 GeV is not sensitive to the uncertainties in the input used for the imaginary parts above 1 GeV. We can modify that part of the input quite substantially, and without changing anything else (not even below $s_0$) still get a decent solution from threshold up to the limit of validity of our equations. Naturally, if we do not modify the Schenk parameters that define the phase below $s_0$, the Roy equations are not strictly obeyed, but the deviation from the true solution is quite small. The reason is that, if $s$ is small, the kernels $K_{\ell\ell'}^{II'}(s, s')$ strongly suppress the contributions from the region where $s'$ is large. The term $K_{00}^{00}(s, s')$, for instance, has the following expansion for $s' \gg s$:

$$K_{00}^{00}(s, s') = \frac{1}{9\pi}\left\{11 s^2 - 10 s\,(4M_\pi^2) - (4M_\pi^2)^2\right\}\frac{1}{s'^3} + O\!\left(\frac{1}{s'^4}\right).$$

The interval above 1 GeV only generates very small contributions to the integrals on the r.h.s. of the Roy equations, if these are evaluated in the region below the matching point.

We now take up the consistency condition and first observe that, once a solution has a consistent behaviour above the matching point, reasonable changes in the input above 1 GeV lead to solutions that also obey the consistency condition: It looks as if the Roy equations were almost trivially satisfied, behaving like an identity for $E > 1$ GeV.
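The stability argument given above rests on the $1/s'^3$ fall-off of the kernel $K_{00}^{00}(s, s')$, which can be checked numerically. The sketch below compares the leading term of the expansion with a closed form of the kernel in units $M_\pi = 1$. That closed form is not given in this section: it is the expression commonly quoted for the S-wave Roy kernel and is used here as an assumption, so the comparison is an illustration rather than a derivation. The function names are hypothetical.

```python
import math


def kernel_00_00(s, sp):
    """Closed form of K^{00}_{00}(s, s') in units M_pi = 1 (assumed form).

    Valid for s != 4 and s' > s; s is the external variable, sp = s'.
    """
    pole = 1.0 / (sp - s)
    log_term = 2.0 * math.log1p((s - 4.0) / sp) / (3.0 * (s - 4.0))
    rational = (5.0 * sp + 2.0 * s - 16.0) / (3.0 * sp * (sp - 4.0))
    return (pole + log_term - rational) / math.pi


def kernel_00_00_asymptotic(s, sp):
    """Leading term of the expansion for s' >> s, with 4 M_pi^2 = 4."""
    return (11.0 * s ** 2 - 10.0 * s * 4.0 - 4.0 ** 2) / (9.0 * math.pi * sp ** 3)


s = 10.0  # a point below the matching point (s_0 is roughly 33 in these units)
for sp in (1.0e2, 1.0e3, 1.0e4):
    exact = kernel_00_00(s, sp)
    ratio = exact / kernel_00_00_asymptotic(s, sp)
    print(f"s' = {sp:8.0f}: K = {exact:.3e}, exact / leading term = {ratio:.4f}")
```

In the assumed closed form the terms of order $1/s'$ and $1/s'^2$ cancel among the three pieces, so that doubling $s'$ at large $s'$ reduces the kernel by roughly a factor of 8.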
Is this consistent behaviour automatic, or does it depend crucially on part of the input? The answer to this question can be found in Fig. 8, where we show two solutions obtained with the same value of $a_0^0$ as in Fig. 6, but different inputs for $\mathrm{Im}\, t_0^2$: The solution on the left is obtained by using the top curve in Fig. 5 instead of the central one ($a_0^2 = -0.0279$ instead of $a_0^2 = -0.0371$). The solution on the right corresponds to the bottom curve in Fig. 5, where $a_0^2 = -0.0460$. The figure clearly shows that the consistent picture which we have at the centre of the universal band is almost completely lost if we go to the upper border of this band: It is by no means trivial that we find solutions at all for which the output is consistent with the input.

The fact that the peaks and valleys seen in the solutions mimic those in the input can be understood on the basis of analyticity alone: The curvature above the matching point arises from the behaviour of the imaginary parts there. The relevant term is the one from the principal value integral,

$$\mathrm{Re}\, t(s) = \frac{1}{\pi}\, \mathrm{P}\!\!\int_{4M_\pi^2}^{s_2} ds'\, \frac{\mathrm{Im}\, t(s')}{s' - s} + r(s).$$

The remainder, $r(s)$, contains the contributions associated with the subtraction polynomial, the left hand cut, the higher partial waves, as well as the asymptotic region. On the



Fig. 8. Solutions of the Roy equations for $a_0^0 = 0.225$ and two extreme values for $a_0^2$. The left figure corresponds to the point $S_2$ in Fig. 7, while the one on the right shows the solution for $S_1$. The arrows indicate the limit of validity of the Roy equations.

interval $s_0 < s < s_1$, it varies only slowly and is well approximated by a first order polynomial in $s$. The representations of the partial wave amplitudes that we are using as an input are specified in terms of simple functions. In the vicinity of the region where we are comparing their real parts with the Roy solutions, these are analytic in $s$, except for the cut along the positive real axis. Hence they also admit an approximate representation of the above form, with the contributions from distant singularities well approximated by a first order polynomial. Disregarding the interpolation needed to match the representation with the prescribed value of the phase at $s_0$, their imaginary parts coincide with those of the corresponding Roy solution above the matching point. The small differences occurring in the interpolation region and below the matching point do not generate an important difference in the curvature. We conclude that the difference between the Roy solution and the real part of the input must be linear in $s$, to a good approximation. Moreover, within the accuracy to which our solutions obey the Roy equations, the two expressions agree at the matching point, by construction. Accordingly, the relation can be written in the form

$$\mathrm{Re}\, t(s)_{\rm Roy} = \mathrm{Re}\, t(s)_{\rm input} + (s - s_0)\,\beta. \qquad (10.1)$$

We have checked that this relation indeed holds to sufficient accuracy, for all three partial waves. This does not yet explain why the solution follows the real part of the input, but shows that it must do so up to a term linear in $s$ that vanishes at the matching point. In particular, if the difference between input and output is small at the upper end of validity of our equations, then analyticity ensures that the same is true in the entire region between the matching point and that energy (in this interval, $s$ varies by about a factor of two).

In view of the uncertainties attached to our input, we cannot require the Roy equations to be strictly satisfied also above the matching point. The band spanned by the two green lines in Fig. 9 shows the region in the $(a_0^0, a_0^2)$ plane where the solution for $\mathrm{Re}\, t_0^0(s)$ differs from the real part of the input by less than 0.05 (expressed in terms of the parameter $\beta$ in



Fig. 9. Regions inside which the consistency condition is met. The band between the two blue lines is for the condition in the I = 2 channel, whereas the one between the two green lines is for the I = 0 channel. The two red lines delimit the band inside which the Olsson sum rule is satisfied. The shaded area gives the intersection of the three bands.

Eq. (10.1), this amounts to $|\beta_0^0| < 0.07\ \mathrm{GeV}^{-2}$). Likewise, the band spanned by the two blue lines represents the region where $|\mathrm{Re}\, t_0^2(s)_{\rm Roy} - \mathrm{Re}\, t_0^2(s)_{\rm input}| < 0.05$, so that $|\beta_0^2| < 0.07\ \mathrm{GeV}^{-2}$. The corresponding band for the P-wave is much broader: in this channel, the consistency condition is rather weak and is met everywhere inside the universal band. We conclude that, in the lower half of the universal band, all three waves show a consistent behaviour, while for the upper quarter of the band, this is not the case (the situation at the upper border is shown on the left in Fig. 8).

It is not difficult to understand why the consistency condition is strongest for the I = 0 S-wave. In this connection, the most important term in the Roy equations is the one from the subtraction polynomial: the solution can satisfy the consistency condition only if the term proportional to $s$ is nearly cancelled by a linear growth of the remaining contributions. The term generates the contribution $(\beta_0^0, \beta_1^1, \beta_0^2) = (6, 1, -3) \times (2a_0^0 - 5a_0^2)/(72\,M_\pi^2)$ to the coefficients that describe the difference between output and input for the three lowest partial waves. The subtraction polynomial thus contributes twice as much to $\beta_0^0$ as to $\beta_0^2$, so that the consistency band for the I = 2 wave must be about twice as broad as the one for the I = 0 wave, while the one for the P-wave must roughly be six times broader. At the qualitative level, these features are indeed borne out in the figure, but we stress that the term from the subtraction polynomial is not the only one that matters: those arising from the integrals also depend on the values of $a_0^0$ and $a_0^2$. The two green lines correspond to a variation in $a_0^2$ by about $\pm 0.004$. Increasing $a_0^2$ by 0.004, the value of the subtraction term $k_0^0(s_1)$ decreases by 0.10. The fact that the lines correspond to



a change in $\mathrm{Re}\, t_0^0(s_1)$ of only $\pm 0.05$ implies that the contributions from the integrals reduce the shift by a factor of 2. Also, if only the subtraction term were relevant, the consistency bands would be determined by the combination $2a_0^0 - 5a_0^2$ and thus have a slope of $\frac{2}{5}$. Actually, these bands are roughly parallel to the universal band, whose slope is positive, but smaller by about a factor of 2.

11. Olsson sum rule

In the Roy equations, the imaginary parts above the matching point and the two subtraction constants $a_0^0$, $a_0^2$ appear as independent quantities. The consistency condition interrelates the two in such a manner that the contributions from the integrals over the imaginary parts nearly cancel the one from the subtraction term. In fact, a relation of this type can be derived on general grounds. The fixed-t dispersion relation (2.4) contains two subtractions. In principle, one subtraction suffices, for the following reason. The t-channel I = 1 amplitude

$$T^{(1)}(s, t) \equiv \tfrac{1}{6}\left\{2T^0(s, t) + 3T^1(s, t) - 5T^2(s, t)\right\}$$

does not receive a Pomeron contribution and thus only grows in proportion to $s^{\alpha(t)}$ for $s \to \infty$. The dispersion relation (2.4), however, does contain terms that grow linearly with $s$. For the relation to be consistent with Regge asymptotics, the contribution from the subtraction term must cancel the one from the dispersion integral (see footnote 8). At $t = 0$, this condition reduces to the Olsson sum rule, which relates the subtraction constants to an integral over the imaginary parts [68]:

$$2a_0^0 - 5a_0^2 = \frac{M_\pi^2}{8\pi^2} \int_{4M_\pi^2}^{\infty} ds\, \frac{2\,\mathrm{Im}\, T^0(s, 0) + 3\,\mathrm{Im}\, T^1(s, 0) - 5\,\mathrm{Im}\, T^2(s, 0)}{s\,(s - 4M_\pi^2)}. \qquad (11.1)$$

It is well known that this sum rule converges only slowly: the contributions from the asymptotic region cannot be neglected.
We split the integral into four pieces,

$$2a_0^0 - 5a_0^2 = O_{SP} + O_D + O_F + O_{as}.$$

The first term represents the contributions from the imaginary parts of the S- and P-waves in the region below 2 GeV, which are readily worked out, using our Roy solutions on the interval from threshold to 0.8 GeV and the input phase shifts on the remainder. The result is not very sensitive to the input used and is well approximated by a linear dependence on the scattering lengths,

$$O_{SP} = 0.483 \pm 0.011 + 1.13\,(a_0^0 - 0.225) - 1.01\,(a_0^2 + 0.0371).$$

The remainder is closely related to the moments $I_n^I$ introduced in Appendix B.1: here, we are concerned with the case $n = -1$. The term $O_D$ describes the contribution from the imaginary part of the D-waves, in the interval from threshold to 2 GeV. The relevant experimental information is discussed in Appendix B.3, where we also explain how we estimate the uncertainties. The

Footnote 8: In the case of the t-channel amplitudes with I = 0 and 2, the fixed-t dispersion relation (2.4) does ensure the proper asymptotic behaviour.



numerical result reads $O_D = 0.061 \pm 0.004$, including the small, negative contribution from the I = 2 D-wave. The bulk stems from the tensor meson $f_2(1275)$: In the narrow width approximation, this contribution amounts to 0.063. For the analogous contribution due to the F-wave, we obtain $O_F = 0.017 \pm 0.002$ (in narrow width approximation, the term generated by the $\rho_3(1690)$ yields 0.013). Those from the asymptotic region are dominated by the leading Regge trajectory; as noted above, the Pomeron does not contribute. Evaluating the asymptotic contributions with the formulae given in Appendix B.4, we obtain $O_{as} = 0.102 \pm 0.017$. Collecting terms, this yields

$$2a_0^0 - 5a_0^2 = 0.663 \pm 0.021 + 1.13\,(a_0^0 - 0.225) - 1.01\,(a_0^2 + 0.0371). \qquad (11.2)$$

The result corresponds to a band in the (a00 ; a20 ) plane: a20 = −0:044 ± 0:005 + 0:218 (a00 − 0:225) :

(11.3)
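The bookkeeping that leads from the four pieces to Eqs. (11.2) and (11.3) can be retraced with a short script. This is only an illustrative sketch: it assumes that the four uncertainties are independent and are combined in quadrature, and the names `pieces` and `olsson_band` are introduced here for illustration; they do not appear in the original analysis.

```python
import math

# Central values and uncertainties of the four pieces, as quoted in the text.
pieces = {
    "O_SP": (0.483, 0.011),
    "O_D": (0.061, 0.004),
    "O_F": (0.017, 0.002),
    "O_as": (0.102, 0.017),
}

total = sum(value for value, _ in pieces.values())
# Assumption: independent uncertainties, added in quadrature.
total_err = math.sqrt(sum(err**2 for _, err in pieces.values()))

# Linear dependence on the scattering lengths, as in Eq. (11.2).
C0, C2 = 1.13, -1.01
A00_REF, A20_REF = 0.225, -0.0371


def olsson_band(a00):
    """Solve 2*a00 - 5*a20 = total + C0*(a00 - A00_REF) + C2*(a20 - A20_REF) for a20."""
    rhs = total + C0 * (a00 - A00_REF) - C2 * A20_REF
    return (2.0 * a00 - rhs) / (5.0 + C2)


slope = (2.0 - C0) / (5.0 + C2)          # slope of the band in the (a00, a20) plane
half_width = total_err / (5.0 + C2)      # half-width of the band in a20
asymptotic_shift = -pieces["O_as"][0] / (5.0 + C2)  # shift in a20 due to O_as

print(f"sum of pieces  = {total:.3f} +/- {total_err:.3f}")
print(f"a20 at a00=0.225: {olsson_band(A00_REF):.4f} +/- {half_width:.4f}")
print(f"slope = {slope:.3f}, shift from O_as = {asymptotic_shift:.3f}")
```

With these inputs the script reproduces, after rounding, the central value, slope and width of the band in Eq. (11.3).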

The band is spanned by the two red lines shown in Fig. 9. One of these nearly coincides with the lower border of the universal band, while the other runs near the center. The Olsson sum rule thus imposes roughly the same relation between $a_0^0$ and $a_0^2$ as the consistency condition. Note that the asymptotic contributions are numerically quite important here: the term $O_{as}$ amounts to a shift in $a_0^2$ of −0.026 ± 0.004. The fact that the sum rule is indeed obeyed in the region where our solutions are internally consistent represents a good check on our asymptotics.

The Olsson sum rule ensures the proper asymptotic behaviour of the amplitude only for t = 0. In order for the terms that grow linearly with s to cancel also for t ≠ 0, the imaginary part of the P-wave must obey an entire family of sum rules. The matter is discussed in detail in Appendix C.1, where we demonstrate that one of these offers a further, rather sensitive test of our framework. The relationship between the Roy equations and those proposed by Chew and Mandelstam [69] is described in Appendix C.2, where we also comment on the asymptotic behaviour of the dispersion integrals that occur on the r.h.s. of the Roy equations for the S- and P-waves.

12. Comparison with experimental data

In our framework, the only free parameter is $a_0^0$. Comparing our Roy equation solutions to data, we can determine the range of $a_0^0$ consistent with these, as well as a corresponding range for $a_0^2$. This experimental determination of the two S-wave scattering lengths is the ultimate goal of the present analysis and the main subject of the present section. Data on the amplitude are available in a rather wide range of energies (we do not indicate the upper limit in energy when this exceeds 1.15 GeV, the limit of validity of our equations):

• $K_{e4}$ data for the combination $\delta_0^0 - \delta_1^1$ ($2M_\pi \le E \le 0.37$ GeV);
• ACM and Losty et al. data for $\delta_0^2$ (0.35 GeV ≤ E);
• Data on the vector form factor: according to the discussion in Section 7.3, these can safely be converted into values for $\delta_1^1$ in the region of the ρ (0.5 ≤ E ≤ 0.9 GeV);
• CERN–Munich and Berkeley data in the channels with I = 0 and 1 (0.5 GeV ≤ E).

In the Roy equations, $a_0^0$ and $a_0^2$ exclusively enter through the subtraction polynomials, specified in Eq. (1.2). Those relevant for the S-waves contain a constant contribution given by the scattering length and a term proportional to $(s - 4M_\pi^2)\times(2a_0^0 - 5a_0^2)$. In the I = 0 wave, that term is larger than $a_0^0$ from E ∼ 0.5 GeV on. For the I = 2 wave, the linear term starts dominating over $a_0^2$ even earlier. Since $t_1^1(s)$ vanishes at threshold, the corresponding subtraction polynomial exclusively involves the linear term. This implies that, except in the vicinity of threshold, the behaviour of the solutions is sensitive only to the combination $2a_0^0 - 5a_0^2$ of scattering lengths, which is roughly the combination that characterizes the universal band. Accordingly, only data that reach down close to threshold give a direct handle to separately determine $a_0^0$ and $a_0^2$. In fact, only those coming from $K_{e4}$ decays meet this condition.

There is another threshold in energy that is obviously relevant for our approach: the matching point $s_0$. We will make a clear distinction between data points below $s_0$ and those at higher energies. The comparison with data above $s_0$ can hardly yield any information on the scattering lengths, because the behaviour of our solutions at those energies very strongly depends on the input used for the imaginary parts: the uncertainties in the experimental input completely cover the dependence of the solutions on the scattering lengths (we will discuss this in detail below). Instead, we analyse the requirement that the solution is consistent with the input for $s > s_0$, in the sense discussed in Section 10. This condition turns out to be practically independent of the input used for the imaginary parts above $s_0$ and does therefore yield a meaningful constraint on $2a_0^0 - 5a_0^2$.

12.1. Data on $\delta_0^0 - \delta_1^1$ from $K_{e4}$, and on $\delta_0^2$ below 0.8 GeV

Let us first consider the $K_{e4}$ data.
The comparison between our solutions and the high-statistics data of the Geneva–Saclay collaboration [70] is shown in Fig. 10, for various values of the scattering lengths. The figure confirms the simple intuition that these data are mainly sensitive to $a_0^0$. In accordance with previous analyses [76], we find that they roughly constrain $a_0^0$ to the range between 0.18 and 0.3.

As for the low-energy data in the I = 2 channel, we should stress that this wave is quite strongly constrained once $\delta_0^2(s_0)$ is fixed. Because of the absence of any structures between threshold and 0.8 GeV, once we fix $\delta_0^2(s_0)$, the only freedom is in the way the phase approaches zero at threshold, i.e. in the value of $a_0^2$, which depends on $a_0^0$. Fig. 11 shows that, at fixed $\delta_0^2(s_0)$, even a sizeable change in $a_0^0$ is barely visible in the I = 2 phase. The only important factor here is the value of the phase at the matching point: the comparison with the experimental data basically tells us which value of $\delta_0^2(s_0)$ is preferred. A quantitative statement can be made in terms of $\chi^2$, and in principle we could calculate three different $\chi^2$-values, based on the three sets of data shown in Fig. 5. Two of these, however, represent two different analyses of the same set of πN → ππN data. Their difference is a clear sign of the presence of sizeable systematic errors. We have estimated the latter using the difference, point by point, between the two analyses A and B of Ref. [54], and added this in quadrature to the statistical errors. As reference we have used the ACM(A) set of data, but have checked that interchanging it with the one of Losty et al. does not give significantly different results. The corresponding $\chi^2$, combined with the one obtained from the $K_{e4}$ data, has a minimum $\chi^2_{min} = 5.1$ (with 8 d.o.f.) at $a_0^0 = 0.242$, $a_0^2 = -0.0357$. The contour corresponding to 68% confidence level ($\chi^2 = \chi^2_{min} + 2.3$) is shown in Fig. 12. The range $0.18 < a_0^0 < 0.3$ is dictated by the $K_{e4}$ data, whereas the I = 2 data exclude the upper border of the band.

Fig. 10. Comparison of our Roy solutions for different values of the scattering lengths with the data of the Geneva–Saclay collaboration, Rosselet et al. [70]. The full, dash–dotted and dashed lines correspond to the points $S_0$, $S_2$ and $S_3$ in Fig. 7.

Fig. 11. Comparison of our Roy solutions with the data on $\delta_0^2$ obtained by the ACM collaboration [54] and by Losty et al. [52]. The full, dash–dotted and dashed lines correspond to the points $S_0$, $S_2$ and $S_3$ in Fig. 7.

Fig. 12. Range selected by the data below 0.8 GeV. The dashed line represents the 68% C.L. contour obtained by combining the Geneva–Saclay data on $K_{e4}$ decay with those from ACM(A) on $\delta_0^2$.
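As a cross-check of the contour definition used for Fig. 12, the following sketch evaluates the probability content of the region $\chi^2 \le \chi^2_{min} + \Delta\chi^2$ for two jointly fitted parameters. It assumes Gaussian errors, so that $\chi^2 - \chi^2_{min}$ follows a chi-square distribution with two degrees of freedom; the function names are illustrative and are not taken from the original analysis.

```python
import math


def coverage_two_params(delta_chi2):
    """Probability content of chi2 <= chi2_min + delta_chi2 for two parameters.

    For a chi-square distribution with 2 degrees of freedom the cumulative
    distribution function has the closed form 1 - exp(-x/2).
    """
    return 1.0 - math.exp(-delta_chi2 / 2.0)


def delta_chi2_two_params(confidence):
    """Inverse of coverage_two_params: the offset that encloses `confidence`."""
    return -2.0 * math.log(1.0 - confidence)


print(f"coverage for delta chi2 = 2.3: {coverage_two_params(2.3):.3f}")
print(f"delta chi2 for 68% coverage:   {delta_chi2_two_params(0.68):.2f}")
```

Under the stated assumption, an offset of 2.3 corresponds to a coverage of about 68%, which is the convention quoted in the text.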

12.2. The ρ resonance

The input used at the matching point implies that the P-wave phase shift must pass through 90° somewhere between threshold and 0.8 GeV; the Roy equations determine the place where this happens and how rapidly the phase must grow with the energy there. The solutions turn out to be very stiff: varying the values of $a_0^0$ and $a_0^2$ within the universal band, and also varying the input for the imaginary parts above 0.8 GeV within the experimental uncertainties, we obtain the narrow band of solutions shown in Fig. 13. In this figure, the energy range only extends to 0.82 GeV, for the following reason: our solutions move along the Argand circle only below the matching point. At higher energies, the real part of the partial wave calculated from the Roy equations does not exactly match the imaginary part used as an input: unless we correct the latter, the elasticity $\eta_1^1$ differs from unity, already before the inelastic channels start making a significant contribution. If the consistency condition is met well, the departure from unity is small, but it can become as large as 5% if we go to the extreme of the consistency region shown in Fig. 9. This means that it is not very meaningful to extract the value of the phase without adjusting the imaginary part. The proper way to do this is to extend the interval on which the Roy equations are solved, but we did not carry this out.


Fig. 13. P-wave phase shift. The band shows the result of our analysis, obtained by varying the input within its uncertainties, while the data points indicate the phase shift measured in the process πN → ππN by the CERN–Munich collaboration. The full line represents the phase of the vector form factor (Gounaris–Sakurai fit of Ref. [65]).

In the region between 0.7 GeV and 0.82 GeV, the result closely follows the data of the CERN–Munich collaboration. Below 0.7 GeV, however, the data are in conflict with the outcome of our analysis: the five lowest data points are outside the range allowed by the Roy equations, a problem noted already in Ref. [8]. In our opinion, we are using a generous estimate of the uncertainties to be attached to our input. Note, in particular, that at those energies, the driving terms barely contribute. We conclude that the discrepancy between our result and the CERN–Munich phase shift analysis occurring on the left wing of the ρ is likely to be attributed to an underestimate of the experimental errors. As discussed below, the comparison with the $e^+e^-$ and τ decay data corroborates this conclusion.

Concerning the resonance parameters, we first give the ranges of mass and width that follow if, in the vicinity of the resonance, the phase shift is approximated by a Breit–Wigner formula⁹

$$e^{2i\delta_1^1(s)} = \frac{M_\rho^2 + i\,\Gamma_\rho M_\rho - s}{M_\rho^2 - i\,\Gamma_\rho M_\rho - s}, \qquad \tan\delta_1^1(s) = \frac{\Gamma_\rho M_\rho}{M_\rho^2 - s} .$$

⁹ The difference between $M_\rho^2 \pm i M_\rho\Gamma_\rho$ and $(M_\rho \pm \tfrac{i}{2}\Gamma_\rho)^2$ is beyond the accuracy of that approximation. The second is obtained from the first with the substitution $M_\rho^2 \to M_\rho^2 - \tfrac{1}{4}\Gamma_\rho^2$, $M_\rho\Gamma_\rho \to M_\rho\Gamma_\rho$, which increases the value of $M_\rho$ by about 4 MeV.
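The two properties of the Breit–Wigner formula that are used to read off mass and width can be checked numerically. The sketch below is illustrative only: it evaluates the simple formula above with the central values quoted in the text for the band of solutions, the function and variable names are assumptions made here, and the formula is not the algebraic parametrization of the phase used in the actual analysis.

```python
import math


def breit_wigner_phase(s, mass, width):
    """Phase in radians, defined by tan(delta) = width * mass / (mass**2 - s).

    atan2 keeps the phase continuous when it passes through 90 degrees.
    """
    return math.atan2(width * mass, mass**2 - s)


M_RHO = 0.774      # GeV, central value quoted in the text
GAMMA_RHO = 0.145  # GeV, central value quoted in the text

s_res = M_RHO**2
phase_at_resonance = math.degrees(breit_wigner_phase(s_res, M_RHO, GAMMA_RHO))

# At resonance the slope d(delta)/ds equals 1 / (mass * width),
# so the width can be recovered from a numerical derivative.
step = 1.0e-6  # GeV^2
slope = (breit_wigner_phase(s_res + step, M_RHO, GAMMA_RHO)
         - breit_wigner_phase(s_res - step, M_RHO, GAMMA_RHO)) / (2.0 * step)
width_from_slope = 1.0 / (M_RHO * slope)

# Footnote 9: writing the denominator as (M' -/+ i*Gamma/2)**2 - s requires
# M'**2 - Gamma**2/4 = M**2, i.e. a slightly larger mass parameter M'.
mass_shift = math.sqrt(M_RHO**2 + GAMMA_RHO**2 / 4.0) - M_RHO

print(f"phase at s = M^2: {phase_at_resonance:.1f} degrees")
print(f"width from slope: {1000 * width_from_slope:.1f} MeV")
print(f"mass shift:       {1000 * mass_shift:.1f} MeV")
```

For these inputs the mass shift comes out between 3 and 4 MeV, which is of the size quoted in footnote 9.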

In this approximation, the mass of the resonance is the real value of the energy where the phase passes through 90° and the width may be determined from the value of the slope $d\delta_1^1/ds$ at resonance. The solutions contained in the band shown in the figure correspond to the range $M_\rho = 774 \pm 3$ MeV and $\Gamma_\rho = 145 \pm 7$ MeV, to be compared with the average values obtained by the Particle Data Group, $M_\rho = 770.0 \pm 0.8$ MeV, $\Gamma_\rho = 150.7 \pm 1.1$ MeV [71].

The only process independent property of the resonance is the position of the corresponding pole; the above numbers specify this position only approximately. To determine it more accurately, we first observe that the Roy equations yield a representation of the partial wave $t_1^1(s)$ on the first sheet, in terms of the imaginary parts along the real axis. The first sheet contains both a right and a left hand cut. We need to analytically continue the function from the upper rim of the right hand cut into the lower half plane (second sheet). The difference between the values obtained in this manner and those found by evaluating the Roy representation in the lower half plane is given by the analytic continuation of the imaginary part,

$$\mathrm{Im}\,t_1^1(s) = \frac{1}{\sigma(s)}\,\sin^2\delta_1^1(s) .$$

On the first sheet, $t_1^1(s)$ does not have singularities. Hence a pole can only arise from the continuation of the imaginary part. Indeed, the function $\sin^2\delta_1^1(s)$ contains the term $\exp 2i\delta_1^1(s)$, which has a pole below the real axis. The position is readily worked out with the explicit, algebraic parametrization of the phase that we are using. The result illustrates an observation made long ago [72–74]: the pole mass is lower than the energy at which the phase goes through 90°, by about 10 MeV. For the band shown in the figure, the pole position varies in the range

$$M_\rho = 762.5 \pm 2\ \mathrm{MeV}, \qquad \Gamma_\rho = 142 \pm 7\ \mathrm{MeV} .$$

The $e^+e^-$ and τ data neatly confirm the conclusion reached above: the phase of the form factor is in perfect agreement with the behaviour of the P-wave that follows from the Roy equations, but differs from the data of the CERN–Munich phase shift analysis, particularly below 0.7 GeV. In our opinion, the information obtained about the behaviour on the left wing of the resonance on the basis of the reactions $e^+e^- \to \pi^+\pi^-$ and $\tau \to \pi^-\pi^0\nu_\tau$ is more reliable than the one obtained from πN → ππN. The fact that the Roy equations are in good agreement with the $e^+e^-$ and τ data is very encouraging.

In view of the clean determination of the P-wave phase shift through $e^+e^-$ and τ experiments, we find it instructive to draw fixed $\chi^2$-contours in the $(a_0^0, a_0^2)$ plane. To do so, we first need to attach an error bar to the curve representing the phase shift. In Section 7.4, we estimated the uncertainty in $\delta_1^1(s_0)$ at ±2° or ±2%. As we go down in energy, the relative precision of the determination of the phase decreases: a generous estimate of the uncertainty at $\sqrt{s} = 0.5$ GeV is 10% or ±0.6°. A smooth interpolation between these two values is our estimate of the experimental error bar (below that energy, the $e^+e^-$ and τ data become scarce and have sizeable uncertainties). To construct the $\chi^2$ we have compared our solutions to the experimentally determined phase shift at five points between 0.5 and 0.75 GeV. Combining this $\chi^2$ with those from the data on $K_{e4}$ decays and on $\delta_0^2$ below 0.8 GeV, we obtain the 68% C.L. area drawn in Fig. 14. The minimum of the $\chi^2$ is now 5.4 (with 13 d.o.f.). The position of the minimum is barely shifted: it now occurs at $a_0^0 = 0.240$, $a_0^2 = -0.0356$. In other words, at the place where the $\chi^2$ of the $K_{e4}$ data on $\delta_0^0 - \delta_1^1$ and those on $\delta_0^2$ had a minimum, the $\chi^2$ relative to the data on the form factor is practically zero and also has a minimum. In view of the fact that the uncertainties in $\delta_1^1$ are very small, this is quite remarkable. The data on the P-wave do not change the position of the minimum, but shrink the ellipse along the width of the universal band. As expected, they do not reduce the range of allowed values of $a_0^0$.

Fig. 14. 68% C.L. contour obtained by combining all relevant low energy data: $K_{e4}$ decay, ACM(A) data on $\delta_0^2$ below 0.8 GeV and results for $\delta_1^1$ extracted from the $e^+e^-$ and τ data on the pion form factor.

12.3. Data on the I = 0 S-wave below 0.8 GeV

In Fig. 15 we compare the S-wave obtained from our Roy equation solutions with the available data: CERN–Munich [48] and Berkeley [49]. The band shown is a representation of the uncertainties in the solution, which have two main sources: the uncertainty in $\delta_0^0(s_0)$ and the one in $\delta_0^2(s_0)$ (width of the universal band). The central curve shows our reference solution $a_0^0 = 0.225$, $a_0^2 = -0.0371$. The uncertainties indicated do not account for the changes occurring if the value $a_0^0 = 0.225$ is modified. Changing this value within reasonable bounds, however, brings the solution out of the band only below 0.4 GeV, already far below the first data point. The figure shows good agreement with the data, especially so for the Berkeley data set. The CERN–Munich data set shows a certain structure, which does not occur in our solutions; in view of the uncertainties in the data, this difference does not represent a problem.

Despite the positive picture which emerges from the comparison, we refrain from using these data to draw confidence-level contours in the $(a_0^0, a_0^2)$ plane. The S-wave phase shifts have been extracted simultaneously with the P-wave. As discussed in the preceding section, the P-wave phase shifts are affected by systematic errors which are at least as large as the statistical ones. The same must be true for the data in the I = 0 channel, so that a quantitative comparison with the Roy solutions is barely significant.

Fig. 15. Comparison between the Roy solution for the S-wave and the phase shift analyses of the CERN–Munich (circles) and Berkeley (squares) collaborations. The band shows the uncertainties in the Roy solution, which are dominated by those in $a_0^0$ and $a_0^2$.

12.4. Data above 0.8 GeV

The Roy equations are valid up to $\sqrt{s_1} = 1.15$ GeV. In Fig. 16, we show three different solutions for the I = 0 and 1 partial waves, in the region above the matching point. They are obtained by using three different inputs for the imaginary parts (note that the curves represent our solutions, not the real parts of the input). The figure shows that the differences are substantial, especially in the S-wave, despite the fact that, below $\sqrt{s_0} = 0.8$ GeV, the three solutions are practically identical, for all three waves. Evidently, above the matching point, the Roy solutions are very sensitive to the input used for the imaginary parts.

It is not difficult to understand why that is so. As discussed in detail in Section 10, the solutions follow the real parts of the representation that is used as input (see Fig. 6 for the case of Au et al.; in the other two cases, the picture is similar). The real parts of the three representations differ considerably. Moreover, all of these are systematically lower than the “data points” in Fig. 16, which show the result of an energy-independent analysis of the CERN–Munich data [48]. In view of this, it is not surprising that the three Roy solutions are quite different and that they are also systematically lower than the data points. We conclude that a comparison of the Roy solutions with the data in the region above the matching point does not yield reliable information about the values of the two S-wave scattering lengths, and we therefore do not show confidence-level contours relative to data above 0.8 GeV.


Fig. 16. Behaviour of the solutions above the matching point. The curves show the solutions obtained with three different inputs for the imaginary parts. The data points are taken from the energy-independent analysis of the CERN–Munich data [48]. The upper (lower) curves display the I = 0 S-wave (I = 1 P-wave).

13. Allowed range for $a_0^0$ and $a_0^2$

The above discussion has made clear that we can rely only on two rather solid sources of experimental information to determine the two S-wave scattering lengths: the data on $K_{e4}$ and those on the P-wave in the ρ region. The former determine a range of allowed values for $a_0^0$ while the latter yield a range for the combination $2a_0^0 - 5a_0^2$. The consistency condition and the Olsson sum rule impose further constraints. Fig. 17 summarizes our findings: we have superimposed the ellipse of Fig. 14 with the lines that delimit the consistency bands for the two S-waves, as well as those relevant for the Olsson sum rule. The allowed range for $a_0^0$ and $a_0^2$ is the intersection of the ellipse with the band where the Olsson sum rule is obeyed within the estimated errors. In that region, the solutions also satisfy the consistency condition.

We find it quite remarkable that the data on the shape of the ρ resonance, the consistency condition and the Olsson sum rule all show a preference for the lower part of the universal band. This gives us confidence that our conclusion on which region in the $(a_0^0, a_0^2)$ plane is allowed by the present experimental information is rather solid. Once the new data on $K_{e4}$ decays become available, the allowed range in $a_0^0$ will become much narrower, and we will have a very small ellipse. The prospects of making a real precision test of the predictions for the two S-wave scattering lengths in the near future appear to be very good, in particular also in view of the pionium experiment under way at CERN [31].

The πN → ππN data do provide essential information concerning the input of our calculations, but, as discussed in Sections 12.3 and 12.4, they do not impose a firm constraint on the scattering lengths (incidentally, these data also prefer the lower half of the universal band). This is unfortunate, because the power of the Roy equations (unitarity, crossing symmetry and analyticity) is that of connecting regions of very different energy scales. The behaviour of the two S-waves in the immediate vicinity of threshold is determined by the scattering lengths. In the combination $2a_0^0 - 5a_0^2$, these also determine the linear growth of the subtraction polynomial: as we discussed in detail in Section 10, the large contribution from the polynomial must be compensated to a high degree of accuracy by the dispersive integrals. We therefore expect that a reanalysis of the πN → ππN data based on the Roy equations would lead to a rather stringent constraint on the allowed region, as it would make full use of the information contained in these data; in our opinion, the existing phase shift analyses are a comparatively poor substitute.

Fig. 17. Intersection of the ellipse in Fig. 14 (68% C.L. relative to the data on $K_{e4}$ decay, on $\delta_0^2$ and on the form factor) with the bands allowed by the consistency condition in all the three channels and by the Olsson sum rule.

14. Threshold parameters

14.1. S- and P-waves

As shown in Ref. [75], the effective ranges of the S- and P-waves and the P-wave scattering length can be expressed in the form of sum rules, involving integrals over the imaginary parts
of the scattering amplitude and the combination $2a_0^0 - 5a_0^2$ of S-wave scattering lengths. The sum rules may be derived from the Roy representation by expanding the r.h.s. of Eq. (5.1) in $q^2$ and reading off the coefficients according to Eq. (2.3). In the case of the S-wave effective ranges, the expansion can be interchanged with the integration over the imaginary parts only after removing the threshold singularity. This can be done by supplementing the integrand with a term proportional to the derivative

$$\frac{d}{ds}\,\frac{1}{\sqrt{s(s-4M_\pi^2)}} = -\frac{h(s)}{\{s(s-4M_\pi^2)\}^2}, \qquad h(s) = (s-2M_\pi^2)\sqrt{s(s-4M_\pi^2)} .$$

In this notation, the sum rules may be written in the form

$$b_0^0 = \frac{1}{3M_\pi^2}(2a_0^0 - 5a_0^2) + \frac{16}{3\pi}\int_{4M_\pi^2}^{s_2}\frac{ds}{\{s(s-4M_\pi^2)\}^2}\,\Big\{4M_\pi^2(s-M_\pi^2)\,\mathrm{Im}\,t_0^0(s) - 9M_\pi^2(s-4M_\pi^2)\,\mathrm{Im}\,t_1^1(s) + 5M_\pi^2(s-4M_\pi^2)\,\mathrm{Im}\,t_0^2(s) - \tfrac{3}{2}(a_0^0)^2 h(s)\Big\} - \frac{8}{\pi}\int_{s_2}^{\infty} ds\,\frac{(a_0^0)^2 h(s)}{\{s(s-4M_\pi^2)\}^2} + b^0_{0d} ,$$

$$b_0^2 = -\frac{1}{6M_\pi^2}(2a_0^0 - 5a_0^2) + \frac{8}{3\pi}\int_{4M_\pi^2}^{s_2}\frac{ds}{\{s(s-4M_\pi^2)\}^2}\,\Big\{2M_\pi^2(s-4M_\pi^2)\,\mathrm{Im}\,t_0^0(s) + 9M_\pi^2(s-4M_\pi^2)\,\mathrm{Im}\,t_1^1(s) + M_\pi^2(7s-4M_\pi^2)\,\mathrm{Im}\,t_0^2(s) - 3(a_0^2)^2 h(s)\Big\} - \frac{8}{\pi}\int_{s_2}^{\infty} ds\,\frac{(a_0^2)^2 h(s)}{\{s(s-4M_\pi^2)\}^2} + b^2_{0d} ,$$

$$a_1^1 = \frac{1}{18M_\pi^2}(2a_0^0 - 5a_0^2) + \frac{8M_\pi^2}{9\pi}\int_{4M_\pi^2}^{s_2}\frac{ds}{\{s(s-4M_\pi^2)\}^2}\,\Big\{-2(s-4M_\pi^2)\,\mathrm{Im}\,t_0^0(s) + 9(3s-4M_\pi^2)\,\mathrm{Im}\,t_1^1(s) + 5(s-4M_\pi^2)\,\mathrm{Im}\,t_0^2(s)\Big\} + a^1_{1d} ,$$

$$b_1^1 = \frac{8}{9\pi}\int_{4M_\pi^2}^{s_2}\frac{ds}{\{s(s-4M_\pi^2)\}^3}\,\Big\{-2(s-4M_\pi^2)^3\,\mathrm{Im}\,t_0^0(s) + 9(3s^3 - 12s^2M_\pi^2 + 48sM_\pi^4 - 64M_\pi^6)\,\mathrm{Im}\,t_1^1(s) + 5(s-4M_\pi^2)^3\,\mathrm{Im}\,t_0^2(s)\Big\} + b^1_{1d} . \qquad (14.1)$$

The integrals only involve the imaginary parts of the S- and P-waves and are cut off at $s = s_2$. The contributions from higher energies, as well as those from the imaginary parts of the partial waves with ℓ = 2, 3, … are contained in the constants $b^0_{0d}$, $b^2_{0d}$, $a^1_{1d}$, $b^1_{1d}$. By construction, these represent derivatives of the driving terms at threshold,

$$d_0^0(s) = q^2\, b^0_{0d} + O(q^4), \qquad d_1^1(s) = q^2\, a^1_{1d} + q^4\, b^1_{1d} + O(q^6), \qquad d_0^2(s) = q^2\, b^2_{0d} + O(q^4) .$$

The numerical values obtained within our framework are given in the upper half of Table 4, where we also show the numbers quoted in the compilation of Nagels et al. [76], which are based on the analysis of Basdevant et al. [8]. In accordance with the literature, we use pion mass units. Since the relevant physical scale is of the order of 1 GeV, the numerical values rapidly decrease with the dimension of the quantity listed.

Table 4. Threshold parameters of the S-, P-, D- and F-waves. The significance of the entries in columns A–E is specified in the text. Column Δ1 indicates the uncertainty due to the error bars in the experimental input at and above 0.8 GeV, whereas Δ2 shows the shifts occurring if $a_0^0$ and $a_0^2$ are varied within the ellipse of Fig. 14, according to Eqs. (14.2) and (14.4).

Quantity    A       B       C       D       E      Total    Δ1       Δ2             Ref. [76]       Units
b_0^0       2.12    0.45   −0.03    0.02    0.00    2.56   ±0.02    +0.28/−0.12    2.5 ± 0.3       10^-1 M_π^-2
b_0^2      −1.06    0.26    0.02    0.01    0.00   −0.77   ±0.003   +0.03/−0.07    −0.82 ± 0.08    10^-1 M_π^-2
a_1^1       3.53   −0.03    0.13   −0.01    0.01    3.63   ±0.02    +0.29/−0.11    3.8 ± 0.2       10^-2 M_π^-2
b_1^1       -       4.05    1.39   −0.07    0.08    5.45   ±0.13    +0.35/−0.44    -               10^-3 M_π^-4
a_2^0       -       1.29    0.28    0.07    0.03    1.67   ±0.01    +0.15/−0.06    1.7 ± 0.3       10^-3 M_π^-4
b_2^0       -      −3.48   −0.04    0.25    0.02   −3.25   ±0.07    +0.34/−0.95    -               10^-4 M_π^-6
a_2^2       -       1.67   −0.51    0.35    0.02    1.53   ±0.07    +1.1/−0.72     1.3 ± 3         10^-4 M_π^-4
b_2^2       -      −3.10   −0.09    0.06    0.02   −3.11   ±0.07    +0.41/−0.87    -               10^-4 M_π^-6
a_3^1       -       5.11    0.26    0.05    0.01    5.43   ±0.1     +1.6/−0.45     6 ± 2           10^-5 M_π^-6
b_3^1       -      −3.96   −0.01    0.01    0.01   −3.95   ±0.08    +0.89/−1.9     -               10^-5 M_π^-8

Columns A–E indicate the following contributions to the total:¹⁰

A. Contribution from the subtraction term ∝ $2a_0^0 - 5a_0^2$.
B. Imaginary parts of the S- and P-waves on the interval $4M_\pi^2 < s < s_0$. This contribution is evaluated with the Roy solutions described in the text.
C. Imaginary parts of the S- and P-waves in the range $s_0 < s < s_2$. Here, we are relying on the experimental information, discussed in Section 7.
D. Imaginary parts of the higher partial waves in the range $4M_\pi^2 < s < s_2$. These are calculated in the same manner as for the driving terms of the S- and P-waves (see Appendix B.3).
E. Asymptotic contributions, $s > s_2$. These are evaluated with the representation given in Appendix B.4.

¹⁰ The numbers given for the total include the tiny additional contributions to $b_0^0$ and $b_0^2$ that arise from the integrals over $h(s)(a_0^0)^2$ and $h(s)(a_0^2)^2$ in the interval $s_2 < s < \infty$. Numerically, these amount to $-6.3\times 10^{-4}\,M_\pi^{-2}$ for $b_0^0$ and $-1.7\times 10^{-5}\,M_\pi^{-2}$ for $b_0^2$.
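The entries of column A can be checked directly, because the subtraction term enters the sum rules (14.1) with the coefficients 1/3, −1/6 and 1/18 in pion mass units. The following sketch evaluates them at the reference point $a_0^0 = 0.225$, $a_0^2 = -0.0371$; the dictionary keys are labels chosen here for illustration.

```python
# Reference scattering lengths (point S0), in pion mass units (M_pi = 1).
A00, A20 = 0.225, -0.0371
combination = 2.0 * A00 - 5.0 * A20  # 2 a_0^0 - 5 a_0^2

# Subtraction-term contributions, with the coefficients read off from (14.1).
column_a = {
    "b_0^0": combination / 3.0,    # tabulated in units of 10^-1 M_pi^-2
    "b_0^2": -combination / 6.0,   # tabulated in units of 10^-1 M_pi^-2
    "a_1^1": combination / 18.0,   # tabulated in units of 10^-2 M_pi^-2
}

for name, value in column_a.items():
    print(f"{name}: {value:.4f}")
```

Expressed in the units of the table, the three values round to the entries listed in column A.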


For the reasons discussed earlier, we use $\sqrt{s_0} = 0.8$ GeV, $\sqrt{s_2} = 2$ GeV. The values quoted in columns A and B are obtained with our reference solution, $a_0^0 = 0.225$, $a_0^2 = -0.0371$, which corresponds to the point $S_0$ in Fig. 7. The table shows that the result for $b_0^0$, $b_0^2$, $a_1^1$, $b_1^1$ is dominated by the contributions from the subtraction term and from the imaginary parts of the S- and P-waves. The higher partial waves and the asymptotic region only yield tiny corrections. The sum D + E represents the contribution from the driving terms. In the evaluation of these terms, which is discussed in detail in Appendix B.5, we have constrained the polynomial fit with the relevant derivatives at threshold, so that the numerical values of the four constants $b^0_{0d}$, $b^2_{0d}$, $a^1_{1d}$, $b^1_{1d}$ are correctly reproduced by the corresponding terms in the representation (4.1).

The uncertainty given in column Δ1 of Table 4 only accounts for the noise seen in our evaluation for the specific values $a_0^0 = 0.225$, $a_0^2 = -0.0371$ (errors in columns B–E added up quadratically). The sensitivity to these two parameters is well represented by linear relations of the form¹¹

$$b_0^0 = 2.56\times 10^{-1}\,M_\pi^{-2}\,\{1 + 3.2\,\Delta a_0^0 - 12.7\,\Delta a_0^2\} ,$$
$$b_0^2 = -0.77\times 10^{-1}\,M_\pi^{-2}\,\{1 + 2.5\,\Delta a_0^0 - 7.6\,\Delta a_0^2\} ,$$
$$a_1^1 = 3.63\times 10^{-2}\,M_\pi^{-2}\,\{1 + 2.3\,\Delta a_0^0 - 7.8\,\Delta a_0^2\} ,$$
$$b_1^1 = 5.45\times 10^{-3}\,M_\pi^{-4}\,\{1 + 0.1\,\Delta a_0^0 - 5.7\,\Delta a_0^2\} , \qquad (14.2)$$

with $\Delta a_0^0 = a_0^0 - 0.225$, $\Delta a_0^2 = a_0^2 + 0.0371$. Using this representation, the 1σ ellipse of Fig. 14 can be translated into 1σ ranges for the various quantities listed in the table; these are shown in column Δ2 (since our reference point is not at the center of the ellipse, the ranges are asymmetric).

The table neatly demonstrates that the two S-wave scattering lengths are the essential low energy parameters: the uncertainty in the result is due almost exclusively to the one in $a_0^0$, $a_0^2$. This is to be expected on general grounds [77]: the integrals occurring in the above sum rules are rapidly convergent, so that only the behaviour of the partial waves in the threshold region matters. The uncertainties in the input used for the imaginary parts above the matching point only enter indirectly, through their effect on the S- and P-waves in the threshold region. We did not expect, however, that the effect would be as small as indicated in the table and add a few comments concerning this remarkable finding.

In order to document the statement that the uncertainties which we are attaching to the phenomenological input of our calculation (behaviour of the imaginary parts above the matching point, elasticity, driving terms) only have a minute effect on the result for the threshold parameters, we find it best to give the numerical size of this effect (column Δ1 of the table). We repeat that the numbers quoted there merely indicate the noise seen in our evaluation; we do not claim to describe the scattering amplitude to that accuracy. Isospin breaking, for instance, cannot be neglected at that level of precision.

¹¹ For $0.15 \le a_0^0 \le 0.30$ the representation holds inside the universal band to better than 4%. Similar relations also follow directly from the representation of the S- and P-waves given in Appendix D, but since the threshold region does not carry particular weight when solving the Roy equations, these do not have the same accuracy.
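The linear relations (14.2) are easy to tabulate for any point in the $(a_0^0, a_0^2)$ plane. The sketch below encodes the coefficients quoted above and evaluates them at the minimum of the combined fit, $a_0^0 = 0.240$, $a_0^2 = -0.0356$. The function name and data layout are illustrative assumptions, and according to footnote 11 the relations are meant to be used only inside the universal band for $0.15 \le a_0^0 \le 0.30$.

```python
REF_A00, REF_A20 = 0.225, -0.0371  # reference point of the expansion

# name: (central value in pion mass units, coefficient of Delta a_0^0,
#        coefficient of Delta a_0^2), copied from Eq. (14.2)
LINEAR_RELATIONS = {
    "b_0^0": (2.56e-1, 3.2, -12.7),
    "b_0^2": (-0.77e-1, 2.5, -7.6),
    "a_1^1": (3.63e-2, 2.3, -7.8),
    "b_1^1": (5.45e-3, 0.1, -5.7),
}


def threshold_parameters(a00, a20):
    """Evaluate the linear relations (14.2) at the given scattering lengths."""
    d0 = a00 - REF_A00
    d2 = a20 - REF_A20
    return {
        name: central * (1.0 + k0 * d0 + k2 * d2)
        for name, (central, k0, k2) in LINEAR_RELATIONS.items()
    }


at_minimum = threshold_parameters(0.240, -0.0356)
for name, value in at_minimum.items():
    print(f"{name}: {value:.5f}")
```

By construction, the relations return the central values of Table 4 at the reference point, so shifts such as those in column Δ2 can be estimated by scanning points along the boundary of the ellipse.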

The reason why the threshold parameters are insensitive to the uncertainties of our input is the following. As discussed in detail in Sections 6–9, the solutions of the Roy equations in general exhibit a cusp at the matching point. If the imaginary parts above 0.8 GeV and the value of $a_0^0$ are specified, there is a solution with physically acceptable behaviour in the vicinity of the matching point only if the parameter $a_0^2$ is chosen properly. In other words, there is a strong correlation between the behaviour of the imaginary parts and the parameters $a_0^0$, $a_0^2$. As we are selecting a specific value for these parameters, we are in effect subjecting the imaginary parts to a constraint. For this reason, the uncertainties in the input can barely be seen in the output for the threshold parameters; the main effect is hidden in $a_0^0$, $a_0^2$.

The correlation just described originates in the fact that one of the two subtraction constants is superfluous: the combination $2a_0^0 - 5a_0^2$ may be represented as a convergent dispersion integral over the imaginary part of the amplitude. The correlation is illustrated by the lines in Fig. 7, which correspond to the specific parametrization of the input used for the imaginary part of the I = 2 S-wave shown in Fig. 5. As there is very little experimental information about the energy dependence of this partial wave, we have worked out the change in the Roy solutions that occurs if this energy dependence is modified above the matching point. The result for the threshold parameters turns out to be practically unaffected. Also, we have varied the driving terms within the uncertainties given in Section 4. Again, the response in the threshold parameters can barely be seen.

14.2. D- and F-waves

Similar sum rules also hold for the threshold parameters of the higher partial waves. The contributions from the imaginary parts of the S- and P-waves are obtained by expanding the kernels occurring in the Roy equations for the D- and F-waves around threshold.
We write the result in the form

$$a_2^0 = \frac{16}{45\pi}\int_{4M_\pi^2}^{s_2}\frac{ds}{s^3(s-4M_\pi^2)}\,\{(s-4M_\pi^2)\,\mathrm{Im}\,t_0^0(s) + 9(s+4M_\pi^2)\,\mathrm{Im}\,t_1^1(s) + 5(s-4M_\pi^2)\,\mathrm{Im}\,t_0^2(s)\} + a^0_{2d} ,$$

$$b_2^0 = -\frac{32}{15\pi}\int_{4M_\pi^2}^{s_2}\frac{ds}{s^4(s-4M_\pi^2)}\,\{(s-4M_\pi^2)\,\mathrm{Im}\,t_0^0(s) - 3(s-12M_\pi^2)\,\mathrm{Im}\,t_1^1(s) + 5(s-4M_\pi^2)\,\mathrm{Im}\,t_0^2(s)\} + b^0_{2d} ,$$

$$a_2^2 = \frac{8}{45\pi}\int_{4M_\pi^2}^{s_2}\frac{ds}{s^3(s-4M_\pi^2)}\,\{2(s-4M_\pi^2)\,\mathrm{Im}\,t_0^0(s) - 9(s+4M_\pi^2)\,\mathrm{Im}\,t_1^1(s) + (s-4M_\pi^2)\,\mathrm{Im}\,t_0^2(s)\} + a^2_{2d} ,$$

$$b_2^2 = -\frac{16}{15\pi}\int_{4M_\pi^2}^{s_2}\frac{ds}{s^4(s-4M_\pi^2)}\,\{2(s-4M_\pi^2)\,\mathrm{Im}\,t_0^0(s) + 3(s-12M_\pi^2)\,\mathrm{Im}\,t_1^1(s) + (s-4M_\pi^2)\,\mathrm{Im}\,t_0^2(s)\} + b^2_{2d} ,$$

$$a_3^1 = \frac{16}{105\pi}\int_{4M_\pi^2}^{s_2}\frac{ds}{s^4(s-4M_\pi^2)}\,\{2(s-4M_\pi^2)\,\mathrm{Im}\,t_0^0(s) + 9(s+4M_\pi^2)\,\mathrm{Im}\,t_1^1(s) - 5(s-4M_\pi^2)\,\mathrm{Im}\,t_0^2(s)\} + a^1_{3d} ,$$

$$b_3^1 = -\frac{128}{105\pi}\int_{4M_\pi^2}^{s_2}\frac{ds}{s^5(s-4M_\pi^2)}\,\{2(s-4M_\pi^2)\,\mathrm{Im}\,t_0^0(s) + 36M_\pi^2\,\mathrm{Im}\,t_1^1(s) - 5(s-4M_\pi^2)\,\mathrm{Im}\,t_0^2(s)\} + b^1_{3d} , \qquad (14.3)$$

where $a^0_{2d}$, $b^0_{2d}$, … contain the contributions from $s > s_2$ as well as those from the higher partial waves. The evaluation of these contributions, however, meets with problems that we need to discuss in some detail.

First, we note that the definition of the driving terms in Eq. (3.2) is suitable only for the analysis of the S- and P-waves. For ℓ ≥ 2, the functions $d^I_\ell(s)$ contain a branch cut at threshold, so that these quantities are complex. In order to solve the Roy equations for the D-waves, for instance, the contributions generated by their imaginary parts need to be isolated, using a different decomposition of the right hand side of these equations. As far as the scattering lengths and effective ranges are concerned, however, only the values of the functions $d^I_\ell(s)$ and their first derivatives at threshold are needed, which are real.

A more subtle problem arises from the fact that the explicit form of the kernels occurring in the Roy equations for the higher partial waves depends on the choice of the partial wave projection. As discussed in detail in Ref. [78], the definition (A.4), which we used in our analysis of the S- and P-waves, does not automatically ensure that the threshold behaviour of the partial waves with ℓ ≥ 3 starts with the power $q^{2\ell}$. The problem arises from the fact that the solution of the Roy equations leads to a crossing symmetric scattering amplitude only if the imaginary parts of the higher partial waves satisfy sum rules such as the one in Eq. (B.8). In particular, the expansion of the F-wave in powers of q in general starts with

$$\mathrm{Re}\,t_3^1(s) = x_3^1\,q^4 + a_3^1\,q^6 + b_3^1\,q^8 + \cdots .$$

For the fictitious term $x_3^1$ to be absent, the imaginary parts of the higher partial waves must obey a sum rule. In fact, we have written down the relevant sum rule already: Eq. (B.8).
The derivation given in Section B.2 shows that this constraint ensures crossing symmetry of the terms occurring in the expansion of the scattering amplitude around threshold, up to and including contributions of $O(q^4)$. The threshold expansion of the partial waves with $\ell \geq 3$ thus only starts at $O(q^6)$ if this condition holds; in particular, $x^1_3$ then vanishes. The sum rule that allows us to pin down the asymptotic contributions to the driving terms for the S- and P-waves thus at the same time also ensures the proper threshold behaviour of the F-waves. The absence of a term of $O(q^6)$ in the G-waves leads to a new constraint, which could be derived in the same manner, etc. Note that the contributions from the imaginary parts of the S- and P-waves are manifestly crossing symmetric: the constraints imposed by crossing symmetry exclusively concern the higher waves.$^{12}$

$^{12}$ The family of sum rules discussed in Appendix C.1 does not follow from crossing symmetry, but from an asymptotic condition that goes beyond the Roy equations. As shown there, those sum rules do tie the imaginary part of the P-wave to the higher partial waves.



The F-wave scattering length occurs in the expansion of the amplitude around threshold among the contributions of $O(q^6)$, two powers of $q$ beyond the term just discussed. In the numerical analysis, we thus need to make sure that the sum rule holds to high precision if we are to get a reliable value in this manner. For the effective range, the situation is even worse. This indicates that for the numerical analysis of the higher partial waves, the extension of the range of validity of the Roy equations achieved if the standard partial wave projection (A.2) is replaced by (A.3) generates considerable complications. For the evaluation of the threshold parameters, this extension is not needed: we may use the partial wave projection (A.2), for which the problem discussed above does not occur. In particular, $x^1_3$ then automatically vanishes, so that the evaluation of the scattering lengths and effective ranges does not pose special numerical problems. To evaluate the contributions from the asymptotic region, we expand the fixed-$t$ dispersion relation (2.4) in powers of $t$.

The results obtained for $a^0_0 = 0.225$, $a^2_0 = -0.0371$ are listed in the lower half of Table 4. The dependence on the S-wave scattering lengths may again be represented (to better than 6% inside the universal band for $0.15 \leq a^0_0 \leq 0.30$) with a set of linear relations:

$$\begin{aligned}
a^0_2 &= 1.67\times10^{-3}\,M_\pi^{-4}\,\{1+2.6\,\Delta a^0_0-8.6\,\Delta a^2_0\}\,,\\
b^0_2 &= -3.25\times10^{-4}\,M_\pi^{-6}\,\{1+6.6\,\Delta a^0_0-17\,\Delta a^2_0\}\,,\\
a^2_2 &= 1.53\times10^{-4}\,M_\pi^{-4}\,\{1+14\,\Delta a^0_0-25\,\Delta a^2_0\}\,,\\
b^2_2 &= -3.11\times10^{-4}\,M_\pi^{-6}\,\{1+6.2\,\Delta a^0_0-11\,\Delta a^2_0\}\,,\\
a^1_3 &= 5.43\times10^{-5}\,M_\pi^{-6}\,\{1+5.5\,\Delta a^0_0-8\,\Delta a^2_0\}\,,\\
b^1_3 &= -3.95\times10^{-5}\,M_\pi^{-8}\,\{1+8\,\Delta a^0_0-8\,\Delta a^2_0\}\,.
\end{aligned} \tag{14.4}$$
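The linear relations of Eq. (14.4) are easy to tabulate. The following is only a minimal sketch: the central values and coefficients are copied from Eq. (14.4), while the function name, the dictionary layout, and the reading of $\Delta a^0_0$ and $\Delta a^2_0$ as deviations from the reference point $a^0_0 = 0.225$, $a^2_0 = -0.0371$ are assumptions made for illustration.

```python
# Sketch: evaluate the linear representations of Eq. (14.4).
# Entries: (central value, coefficient of Delta a00, coefficient of Delta a20, unit).
THRESHOLD_PARAMS = {
    "a02": (1.67e-3, 2.6, -8.6, "M_pi^-4"),
    "b02": (-3.25e-4, 6.6, -17.0, "M_pi^-6"),
    "a22": (1.53e-4, 14.0, -25.0, "M_pi^-4"),
    "b22": (-3.11e-4, 6.2, -11.0, "M_pi^-6"),
    "a13": (5.43e-5, 5.5, -8.0, "M_pi^-6"),
    "b13": (-3.95e-5, 8.0, -8.0, "M_pi^-8"),
}

# Reference point quoted in the text.
A00_REF = 0.225
A20_REF = -0.0371


def threshold_parameter(name, a00, a20):
    """Return the threshold parameter `name` in the units listed above.

    Assumption: Delta a00 = a00 - A00_REF and Delta a20 = a20 - A20_REF.
    """
    central, c0, c2, _unit = THRESHOLD_PARAMS[name]
    d_a00 = a00 - A00_REF
    d_a20 = a20 - A20_REF
    return central * (1.0 + c0 * d_a00 + c2 * d_a20)


if __name__ == "__main__":
    for name, (_, _, _, unit) in THRESHOLD_PARAMS.items():
        value = threshold_parameter(name, 0.245, A20_REF)
        print(f"{name} = {value:.3e} {unit}")
```

The large coefficients for $a^2_2$ show directly why the linear form is only trustworthy close to the central values: a shift of 0.02 in $a^0_0$ already changes $a^2_2$ by 28%.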

The sensitivity is more pronounced here than in the case of the threshold parameters for the S- and P-waves. In particular, the linear representation for the D-wave scattering length $a^2_2$ only holds to a good approximation if $a^0_0$ and $a^2_0$ do not deviate too much from the central values.

15. Values of the phase shifts at $s = M_K^2$

A class of important physical processes where the $\pi\pi$ phase shifts play a relevant role is that of kaon decays. Let us recall, for instance, that the phase of $\varepsilon'$ is given by the value of $\delta^2_0 - \delta^0_0 + \tfrac{1}{2}\pi$ at $s = M_K^2$. In this section, we give numerical values for the three phase shifts at the kaon mass as they come out from our Roy equation analysis, and show the explicit dependence on the two S-wave scattering lengths. In this manner, an improved determination of the latter will immediately translate into a better knowledge of the phases at $s = M_K^2$.

The decays $K^0 \to \pi\pi$ and $K^+ \to \pi\pi$ concern slightly different values of the energy. In view of the fact that the CP-violating parameter $\varepsilon'$ manifests itself in the decays of the neutral kaons, we evaluate the phases at $s = M_{K^0}^2$. Note that, in addition to this difference in the masses, there are also isospin breaking effects in the relevant transition matrix elements. As far as the $\pi\pi$ phases are concerned, however, the isospin breaking effects due to $m_d - m_u$ are tiny, because G-parity implies that these only occur at order $(m_d - m_u)^2$.



Table 5. Values of the phase shifts at $s = M_{K^0}^2$ in degrees. The central value is obtained with our reference solution of the Roy equations, where $a^0_0 = 0.225$, $a^2_0 = -0.0371$. The column $\Delta_1$ indicates the uncertainty due to the error bars in the experimental input at and above 0.8 GeV, whereas $\Delta_2$ shows the shifts occurring if $a^0_0$ and $a^2_0$ are varied within the ellipse of Fig. 14, according to Eq. (15.1).

Phase                      Value at s = M_{K^0}^2    Δ_1       Δ_2
δ^0_0                      37.3                      ±1.4      +4.3 / −1.6
δ^1_1                      5.5                       ±0.1      +0.3 / −0.13
δ^2_0                      −7.8                      ±0.04     +0.7 / −0.8
δ^0_0 − δ^2_0              45.2                      ±1.3      +4.5 / −1.6

As in the preceding section, we give values at the reference point $a^0_0 = 0.225$ and $a^2_0 = -0.0371$, and break down the errors into those due to the noise in our calculations and those due to the poorly known values of the two scattering lengths. The results are shown in Table 5. As for the threshold parameters, the two S-wave scattering lengths are the main source of uncertainty. In the present case, the errors due to the uncertainties in our experimental input at 0.8 GeV are not negligible, but they amount to at most 4%. The dependence of the central values on the two scattering lengths is well described by the following polynomials:

$$\begin{aligned}
\delta^0_0(M_{K^0}^2) &= 37.3^\circ\,\{1+3.0\,\Delta a^0_0-8.5\,\Delta a^2_0\}\,,\\
\delta^1_1(M_{K^0}^2) &= 5.5^\circ\,\{1+1.7\,\Delta a^0_0-6.7\,\Delta a^2_0\}\,,\\
\delta^2_0(M_{K^0}^2) &= -7.8^\circ\,\{1+1.9\,\Delta a^0_0-13\,\Delta a^2_0\}\,,\\
\delta^0_0(M_{K^0}^2)-\delta^2_0(M_{K^0}^2) &= 45.2^\circ\,\{1+2.8\,\Delta a^0_0-9.4\,\Delta a^2_0\}\,.
\end{aligned} \tag{15.1}$$
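A quick internal check of Eq. (15.1) is to compare the fourth relation with the difference of the first and third. The sketch below does this with the coefficients quoted above; the function and dictionary names are hypothetical, and the arguments are taken to be the deviations $\Delta a^0_0$, $\Delta a^2_0$ themselves, so no assumption about the reference point enters.

```python
# Sketch: phases at s = M_{K0}^2 from the polynomials of Eq. (15.1), in degrees.
# Entries: (central value, coefficient of Delta a00, coefficient of Delta a20).
PHASES = {
    "delta00": (37.3, 3.0, -8.5),
    "delta11": (5.5, 1.7, -6.7),
    "delta20": (-7.8, 1.9, -13.0),
    "delta00_minus_delta20": (45.2, 2.8, -9.4),
}


def phase_at_kaon_mass(name, d_a00, d_a20):
    """Evaluate one of the linear relations of Eq. (15.1)."""
    central, c0, c2 = PHASES[name]
    return central * (1.0 + c0 * d_a00 + c2 * d_a20)


def difference_mismatch(d_a00, d_a20):
    """Direct difference delta00 - delta20 minus the fitted fourth relation."""
    direct = (phase_at_kaon_mass("delta00", d_a00, d_a20)
              - phase_at_kaon_mass("delta20", d_a00, d_a20))
    fitted = phase_at_kaon_mass("delta00_minus_delta20", d_a00, d_a20)
    return direct - fitted


if __name__ == "__main__":
    for shift in [(0.0, 0.0), (0.01, 0.005), (-0.02, -0.005)]:
        print(shift, round(difference_mismatch(*shift), 3))
```

The mismatch stays at the level of a tenth of a degree for small shifts, which is the size expected from rounding the quoted central values and coefficients.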

Our results are in agreement with Refs. [61,79,80], but are more accurate. In the foreseeable future, the two S-wave scattering lengths will be pinned down to good precision, so that the above formulae will fix the phases to within remarkably small uncertainties.

16. Comparison with earlier work

The Roy equations were used to obtain information on the $\pi\pi$ scattering amplitudes already in the early seventies. Most of the work done since then either follows the method of Pennington and Protopopescu [5,6] or the one of Basdevant et al. [7,8]. In the present section, we briefly compare these two approaches with ours. A review of the results obtained by means of the Roy equations is given in Ref. [11].



To our knowledge, Pennington and Protopopescu [5] were the first to analyse $\pi\pi$ scattering data using Roy's equations. In principle, the approach of these authors is similar to ours. In our language, they fixed the matching point at $\sqrt{s_0} = 0.48$ GeV. As input data, they relied on the production experiment of the Berkeley group [49], using the data of Baton et al. [46] for the $I = 2$ channel (at the time they performed the analysis, the high-energy, high-statistics CERN–Munich data [48] were not yet available). The Roy equations then allowed them to extrapolate the S- and P-wave phases of Protopopescu et al. [49] to the region below 0.48 GeV. Comparing the Roy-predicted real parts with the data (this corresponds to what we call consistency), they found that these constrain the two S-wave scattering lengths to the range $a^0_0 = 0.15 \pm 0.07$, $a^2_0 = -0.053 \pm 0.028$. In their subsequent work [6], they then used the Roy equations to resolve the famous up–down ambiguity that occurs in the analysis of the S-wave.

The fact that, in their analysis, the matching point is taken below the mass of the $\rho$ has an interesting mathematical consequence: As discussed in Section 6.3, the Roy equations then do not admit a solution for arbitrary values of $a^0_0$, $a^2_0$, even if cusps at the matching point are allowed for (the situation corresponds to row IV of Table 1). To enforce a solution, one may for instance keep the input for the imaginary parts as it is, but tune the scattering length $a^2_0$. The result, however, in general contains strong cusps in the partial waves with $I = 0, 1$. These can only be removed if the input used for the imaginary parts above the matching point is also tuned. The situation is very different from the one encountered for our choice of the matching point.

Basdevant et al. [7,8] constructed solutions of the Roy equations by considering several phase shift analyses and a broad range of S-wave scattering lengths. The method used by these authors is different from ours in that they relied on an analytic parametrization of the S- and P-waves from threshold up to $\sqrt{s_2} = \sqrt{110}\,M_\pi = 1.47$ GeV, the onset of the asymptotic region in their case. Some of the parameters occurring therein are determined from a fit to the data, some by minimizing the difference between the right and left hand sides of the Roy equations in the region below $\sqrt{s_0} = \sqrt{60}\,M_\pi = 1.08$ GeV. In this manner, they construct universal bands corresponding to the Berkeley [49], Saclay [46] and CERN–Munich phases as determined by Estabrooks et al. [51]. The individual bands are not very much broader than the shaded region in Fig. 17, but they are quite different from one another: Crudely speaking, the Berkeley band is centred at the upper border of our universal band, while the one constructed with the CERN–Munich phases is centred at the lower border. The Saclay band runs outside the region where we can find acceptable solutions at all.

In order to compare their results with ours, we first note that, for the six explicit solutions given in Table 5 of [8], the value of $a^0_0$ varies between $-0.06$ and $0.59$. Only two of these correspond to values of the S-wave scattering lengths in the region considered in the present paper: BKLY$_2$ and SAC$_2$. For these two, the value of the P-wave phase shift at 0.8 GeV is $108.3^\circ$ and $108.0^\circ$, respectively, remarkably close to the central value of the range allowed by the data on the form factor, Eq. (7.2). Concerning the value of $\delta^0_0$ at 0.8 GeV, however, the two solutions differ significantly: While BKLY$_2$ yields $79.7^\circ$ and is thus within our range in Eq. (7.4), the value $70.2^\circ$ that corresponds to SAC$_2$ is significantly lower. In our opinion, that solution is not consistent with the experimental information available today. In the interval from threshold to 0.8 GeV, our solution differs very little from BKLY$_2$. Above this energy, the imaginary part of the $I = 0$ S-wave in BKLY$_2$ is substantially smaller than the one we are using



as an input. Nevertheless, the solutions are very similar at low energies, because the behaviour below the matching point is not sensitive to the input above 1 GeV.

17. Summary and conclusions

The Roy equations follow from general properties of the $\pi\pi$ scattering amplitude. We have set up a framework to solve these equations numerically. In the following, we summarize the main features of our approach and the results obtained with it, omitting details, even if these would be necessary to make the various statements watertight.

1. In our analysis, three energies $s_0 < s_1 < s_2$ play a special role:

$\sqrt{s_0} = 0.8$ GeV, $s_0 = 32.9\,M_\pi^2$;
$\sqrt{s_1} = 1.15$ GeV, $s_1 = 68\,M_\pi^2$;
$\sqrt{s_2} = 2$ GeV, $s_2 = 205.3\,M_\pi^2$.

We refer to the point $s_0$ as the matching point: At this energy, the region where we calculate the partial waves meets the one where we are relying on phenomenology. The point $s_1$ indicates the upper end of the interval on which the Roy equations are valid, while $s_2$ is the onset of the asymptotic region.

2. Given the strong dominance of the S- and P-waves, we solve the Roy equations only for these, and only on the interval $4M_\pi^2 < s < s_0$, that is on the lower half of their range of validity. In that region, the contributions generated by inelastic channels are negligibly small. There, we set $\eta^0_0(s) = \eta^1_1(s) = \eta^2_0(s) = 1$. In the interval from $s_0$ to $s_2$, we evaluate the imaginary parts with the available experimental information, whereas above $s_2$, we invoke a theoretical representation, based on Regge asymptotics. We demonstrate that crossing symmetry imposes a strong constraint on the asymptotic contributions, which reduces the corresponding uncertainties quite substantially; in most of our results, these are barely visible.

3. The Roy equations involve two subtraction constants, which may be identified with the two S-wave scattering lengths $a^0_0$, $a^2_0$. In principle, one subtraction would suffice: The Olsson sum rule relates the combination $2a^0_0 - 5a^2_0$ to an integral over the imaginary parts in the forward direction (or, in view of the optical theorem, over the total cross section). This imposes a correlation between the input used for the imaginary parts and the values of the S-wave scattering lengths, but using this constraint ab initio would lead to an unnecessary complication of our scheme. We instead treat the two subtraction constants as independent parameters. The consequences of the Olsson sum rule are discussed below.

4. Unitarity converts the Roy equations for the S- and P-waves into a set of three coupled integral equations for the corresponding phase shifts: The real part of the partial wave amplitudes is given by a sum of known contributions (subtraction polynomial, integrals over the region $s_0 < s < s_2$ and driving terms) and certain integrals over their imaginary parts, extending from threshold to $s_0$. Since unitarity relates the real and imaginary parts in a nonlinear manner, these equations are inherently nonlinear and cannot be solved explicitly.

5. Several mathematical properties of such integral equations are known, and are used as a test and a guide for our numerical work. In particular, the existence and uniqueness of the solution is guaranteed only if the matching point $s_0$ is taken in the region between the place where the


B. Ananthanarayan et al. / Physics Reports 353 (2001) 207–279

P-wave phase shift goes through $90^\circ$ and the energy where the $I = 0$ S-wave does the same. As this range is quite narrow ($0.78\ \mathrm{GeV} < \sqrt{s_0} < 0.86\ \mathrm{GeV}$), there is little freedom in the choice of the matching point; we use $\sqrt{s_0} = 0.8$ GeV. According to Table 1, the multiplicity index of the interval $0.86\ \mathrm{GeV} < \sqrt{s_0} < 1\ \mathrm{GeV}$ is equal to 1. By way of example ($\sqrt{s_0} = 0.88$ GeV), we have verified that our framework indeed admits a one-parameter family of numerical solutions if the matching point is taken in that energy range.

6. A second consequence of the mathematical structure of the Roy equations is that, for a given input and for a random choice of the two subtraction constants, the solution has a cusp at $s_0$: In the vicinity of the matching point, the solution in general exhibits unphysical behaviour. The strength of the cusp is very sensitive to the value of $a^2_0$. In fact, we find that the cusp disappears in the noise of our calculation if that value is tuned properly. Treating the imaginary parts as known, the requirement that the solution is free of cusps at the matching point determines the value of $a^2_0$ as a function of $a^0_0$. This is how the universal curve of Martin, Morgan and Shaw manifests itself in our approach.

7. The input used for the imaginary parts above the matching point is subject to considerable uncertainties. In our framework, the values of the S- and P-wave phase shifts at the matching point represent the essential parameters in this regard. In order to pin these down, we first make use of the fact that the data on the pion form factor, obtained from the processes $e^+e^- \to \pi^+\pi^-$ and $\tau \to \pi^-\pi^0\nu_\tau$, very accurately determine the behaviour of the P-wave phase shift in the region of the $\rho$-resonance, thus constraining the value of $\delta^1_1(s_0)$ to a remarkably narrow range. Next, we observe that the absolute phase of the scattering amplitude drops out in the difference $\delta^1_1 - \delta^0_0$, so that one of the sources of systematic uncertainty is eliminated. Indeed, the phase shifts extracted from the reaction $\pi N \to \pi\pi N$ yield remarkably coherent values for this difference. Since the P-wave is known very accurately, this implies that $\delta^0_0(s_0)$ is also known rather well. The experimental information concerning $\delta^2_0$, on the other hand, is comparatively meagre. We vary it in the broad range shown in Fig. 5.

8. The uncertainties in the experimental input for the imaginary parts and those in the driving terms turn the universal curve into a band in the $(a^0_0, a^2_0)$ plane, part of which is shown in Fig. 7. Outside this "universal band", the Roy equations do not admit physically acceptable solutions that are consistent with what is known about the behaviour of the imaginary parts above the matching point.

9. One of the striking features of the solutions is that, above the matching point, they very closely follow the real part of the partial wave used as input for the imaginary part, once the value of $a^2_0$ is in the proper range. The phenomenon is discussed in detail in Section 10, where we show that, in a certain sense, this property represents a necessary condition for the solution to be acceptable physically. The region where this consistency condition holds is shown in Fig. 9: It roughly constrains the admissible values of $a^2_0$ to the lower half of the universal band.

10. As mentioned above, the Olsson sum rule relates the combination $2a^0_0 - 5a^2_0$ of scattering lengths to an integral over the imaginary parts of the amplitude. Evaluating the integral, we find that the sum rule is satisfied in the band spanned by the two red curves shown in Fig. 9. The Olsson sum rule thus amounts to essentially the same constraint as the consistency condition. Presumably, the universal band is of the same origin: Physically acceptable solutions only exist if the subtraction constants are properly correlated with the imaginary parts. The shaded region



in Fig. 9 shows the domain where all of these conditions are satisfied. It is by no means built in from the start that the various requirements can simultaneously be met. In our opinion, the fact that this is the case represents a rather thorough check of our analysis.

11. The admissible region can be constrained further if use is made of experimental data below the matching point. At the moment there are two main sources of information on $\pi\pi$ scattering below 0.8 GeV: a few data points for the $I = 2$ S-wave phase shift (which to our knowledge will, unfortunately, not be improved in the foreseeable future) and a few data points on $\delta^0_0 - \delta^1_1$ very close to threshold, as measured in $K_{e4}$ decays. These data do provide an important constraint. We compare our solutions inside the universal band to both sets of data. As shown in Fig. 12, the corresponding 12 contours nicely fit inside the universal band. The net result for the allowed range of the parameters is shown in Fig. 17, which summarizes our findings.

12. To our knowledge, the Roy equation analysis is the only method that allows one to reliably translate low energy data on the scattering amplitude into values for the scattering lengths. As discussed above, the available data do correlate the value of $a^2_0$ with the one of $a^0_0$. Unfortunately, however, the value of $a^0_0$ as such is not strongly constrained: In agreement with earlier analyses, we find that these data are consistent with any value of $a^0_0$ in the range from 0.18 to 0.3.

13. The new $K_{e4}$ experiments at Brookhaven [29] and at DAΦNE [30] will yield more precise information in the very near future. We expect that the analysis of the forthcoming results along the lines described in the present paper will reduce the error in $a^0_0$ by about a factor of three. Moreover, the pionic atom experiment under way at CERN [31] will allow a direct measurement of $|a^0_0 - a^2_0|$ and thus confine the region to the intersection of the corresponding, approximately vertical strip with the region shown in Fig. 17.

14. The two subtraction constants $a^0_0$, $a^2_0$ are the essential parameters at low energies: If these were known, our method would allow us to calculate the S- and P-wave phase shifts below 0.8 GeV to an amazing degree of accuracy. The parameters $a^0_0$, $a^2_0$ act like a filter: If the solutions are sorted out according to the values of these parameters, the noise due to the uncertainties in our input practically disappears, because variations of that input require a corresponding variation, either in $a^0_0$ or in $a^2_0$; otherwise, the behaviour of the solution near the matching point is unacceptable. A simple explicit representation for the S- and P-wave phase shifts as functions of the energy is given in Appendix D. The representation explicitly displays the dependence on $a^0_0$, $a^2_0$.

15. We have also analysed the implications for the scattering lengths of the P-, D- and F-waves, as well as for the various effective ranges. The fact that $a^0_0$ and $a^2_0$ are the essential low energy parameters manifests itself also here: If we change the input in the Roy equations within the uncertainties, but keep $a^0_0$ and $a^2_0$ constant, the values of the various threshold parameters vary only by tiny amounts, typically around one percent or less. The main source of uncertainty in the determination of the threshold parameters is by far the one attached to the S-wave scattering lengths.

16. If the energy approaches the matching point, the uncertainties in the experimental input, naturally, come more directly into play. Also, the uncertainties in the driving terms grow rather rapidly with the energy. At the kaon mass, however, these are still very small. We have analysed the phase shifts at $E = M_K$ in detail, because these represent an important ingredient in the calculation for various decay modes of the $K$ mesons. The result shows that the uncertainties



are dominated by those in $a^0_0$, $a^2_0$, also at that energy. We conclude that the future precision data on $K_{\ell 4}$ decay and on pionic atoms will translate, via the Roy equations, into a rather precise knowledge of the $\pi\pi$ scattering amplitude (not only the lowest three partial waves) in the entire low energy region, extending quite far above threshold.

17. In the present paper, we followed the phenomenological path and avoided making use of chiral symmetry, in order not to bias the data analysis with theoretical prejudice. A famous low energy theorem [34] predicts the values of the two basic low energy parameters in terms of the pion decay constant. The prediction holds to leading order in an expansion in powers of the quark masses. The corrections arising from the higher order terms in the chiral expansion are now known to order $p^6$ (two loops) [42]. The phenomenological representation obtained in the present paper is compared with the chiral one in Ref. [43], where it is shown that the matching leads to a very sharp prediction for $a^0_0$ and $a^2_0$. The confrontation of the prediction with the forthcoming results of the precision measurements will subject chiral perturbation theory to a crucial test.

Acknowledgements

We are indebted to W. Ochs, M. Pennington and G. Wanders for many discussions and remarks concerning various aspects of our work. Also, we wish to thank G. Ecker for providing us with his notes on the problem that were very useful at an early phase of this investigation. Moreover, we thank J. Bijnens, P. Büttiker, S. Eidelman, F. Jegerlehner, B. Loiseau, B. Moussallam, S. Pislak, A. Sarantsev, J. Stern and B. Zou for informative comments, in particular also for detailed information on data and phase shift analyses. This work was supported by the Swiss National Science Foundation, Contract No. 2000-55605.98, by TMR, BBW-Contract No. 97.0131 and EC-Contract No. ERBFMRX-CT980169 (EURODAΦNE).

Appendix A. Integral kernels

Crossing symmetry, $A(s,u,t) = A(s,t,u)$, implies that the isospin components $\vec T = (T^0, T^1, T^2)$ are subject to the constraints ($u \equiv 4M_\pi^2 - s - t$)

$$\vec T(s,u) = C_{tu}\,\vec T(s,t)\,,\qquad \vec T(t,s) = C_{st}\,\vec T(s,t)\,,\qquad \vec T(u,t) = C_{su}\,\vec T(s,t)\,,$$

where the crossing matrices $C_{tu} = C_{ut}$, $C_{su} = C_{us}$, $C_{st} = C_{ts}$ are given by

$$C_{st} = \begin{pmatrix} \tfrac13 & 1 & \tfrac53 \\ \tfrac13 & \tfrac12 & -\tfrac56 \\ \tfrac13 & -\tfrac12 & \tfrac16 \end{pmatrix},\qquad
C_{tu} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix},\qquad
C_{su} = \begin{pmatrix} \tfrac13 & -1 & \tfrac53 \\ -\tfrac13 & \tfrac12 & \tfrac56 \\ \tfrac13 & \tfrac12 & \tfrac16 \end{pmatrix}.$$



Their products obey the relations

$$(C_{tu})^2 = (C_{su})^2 = (C_{st})^2 = 1\,,\qquad C_{st}C_{tu} = C_{tu}C_{us} = C_{us}C_{st}\,,\qquad C_{su}C_{ut} = C_{ts}C_{su} = C_{ut}C_{ts}\,.$$
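These product relations can be checked with exact rational arithmetic. The sketch below is purely illustrative: the matrix entries are the standard isospin crossing matrices as we read them from the definition above, and the helper `matmul` and the variable names are our own.

```python
from fractions import Fraction as F


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


IDENTITY = [[F(int(i == j)) for j in range(3)] for i in range(3)]

# Crossing matrices acting on the isospin vector (T^0, T^1, T^2).
C_ST = [[F(1, 3), F(1), F(5, 3)],
        [F(1, 3), F(1, 2), F(-5, 6)],
        [F(1, 3), F(-1, 2), F(1, 6)]]
C_TU = [[F(1), F(0), F(0)],
        [F(0), F(-1), F(0)],
        [F(0), F(0), F(1)]]
C_SU = [[F(1, 3), F(-1), F(5, 3)],
        [F(-1, 3), F(1, 2), F(5, 6)],
        [F(1, 3), F(1, 2), F(1, 6)]]

if __name__ == "__main__":
    # Each crossing matrix squares to the identity.
    print(all(matmul(c, c) == IDENTITY for c in (C_ST, C_TU, C_SU)))
    # First chain: C_st C_tu = C_tu C_us = C_us C_st (with C_us = C_su).
    print(matmul(C_ST, C_TU) == matmul(C_TU, C_SU) == matmul(C_SU, C_ST))
    # Second chain: C_su C_ut = C_ts C_su = C_ut C_ts.
    print(matmul(C_SU, C_TU) == matmul(C_ST, C_SU) == matmul(C_TU, C_ST))
```

Using `Fraction` rather than floating point makes the equalities exact, so a single wrong sign or entry in one of the matrices shows up immediately.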

The quantities $g_2(s,t,s')$, $g_3(s,t,s')$ occurring in the fixed-$t$ dispersion relation (2.4) represent $3\times3$ matrices built with $C_{st}$, $C_{tu}$ and $C_{su}$,

$$\begin{aligned}
g_2(s,t,s') &= -\frac{t}{\pi s'(s'-4M_\pi^2)}\,(u\,C_{st}+s\,C_{st}C_{tu})\left(\frac{1}{s'-t}+\frac{C_{su}}{s'-u_0}\right),\\
g_3(s,t,s') &= -\frac{s\,u}{\pi s'(s'-u_0)}\left(\frac{1}{s'-s}+\frac{C_{su}}{s'-u}\right),
\end{aligned} \tag{A.1}$$

where $u = 4M_\pi^2 - s - t$ and $u_0 = 4M_\pi^2 - t$. The straightforward partial wave projection of the amplitude reads

$$t^I_\ell(s) = \frac{1}{64\pi}\int_{-1}^{1} dz\,P_\ell(z)\,T^I(s,t_z)\,,\qquad t_z = \tfrac12(4M_\pi^2 - s)(1-z)\,. \tag{A.2}$$

On account of crossing symmetry, the formula is equivalent to

$$t^I_\ell(s) = \frac{1}{32\pi}\int_{0}^{1} dz\,P_\ell(z)\,T^I(s,t_z)\,. \tag{A.3}$$

As pointed out by Roy [1], the second form of the projection is preferable in the present context, because it involves smaller values of $|t_z|$, so that the domain of convergence of the partial wave series for the imaginary parts on the r.h.s. of the fixed-$t$ dispersion relation (2.4) becomes larger: Whereas for the straightforward projection, the large Lehmann–Martin ellipse is mapped into $-4M_\pi^2 < s < 32M_\pi^2$, the one in Eq. (A.3) corresponds to $-4M_\pi^2 < s < 60M_\pi^2$.

The kernels $K^{II'}_{\ell\ell'}(s,s')$ that occur in Eq. (1.1) are different from zero only if both $I + \ell$ and $I' + \ell'$ are even. With the partial wave projection (A.3), the explicit expression becomes$^{13}$

$$K^{II'}_{\ell\ell'}(s,s') = (2\ell'+1)\int_0^1 dz\,P_\ell(z)\,K_{\ell'}(s,t_z,s')^{II'}\,,\qquad t_z = \tfrac12(4M_\pi^2 - s)(1-z)\,. \tag{A.4}$$

$^{13}$ Note that the fixed-$t$ dispersion relation (2.4) is not manifestly crossing symmetric: for $\ell \geq 2$, the kernels do depend on the specific form used for the partial wave projection. In particular, the kernels occurring in the Roy equations for the waves with $\ell \geq 3$ are proportional to $(s - 4M_\pi^2)^\ell$ only if the projection in Eq. (A.2) is used. For the one we are using here, the proper behaviour of the solutions only results if the contributions from the imaginary parts of the different partial waves compensate one another near threshold (see Section 14.2). For a detailed discussion of these issues we refer to [78].



The functions $K_{\ell'}(s,t,s')^{II'}$ are the matrix elements of

$$K_{\ell'}(s,t,s') = g_2(s,t,s') + g_3(s,t,s')\,P_{\ell'}\!\left(1+\frac{2t}{s'-4M_\pi^2}\right). \tag{A.5}$$

The kernels contain the usual pole at $s = s'$, generating the right hand cut of the partial wave amplitudes, as well as a piece $\bar K^{II'}_{\ell\ell'}(s,s')$ that is analytic in $\mathrm{Re}\,s > 0$, but contains a logarithmic branch cut for $s \leq -(s'-4M_\pi^2)$:

$$K^{II'}_{\ell\ell'}(s,s') = \frac{1}{\pi(s'-s)}\,\delta^{II'}\delta_{\ell\ell'} + \bar K^{II'}_{\ell\ell'}(s,s')\,.$$
To illustrate the structure of the second term, we give the explicit expression for $I = I' = \ell = \ell' = 0$:

$$\bar K^{00}_{00}(s,s') = \frac{2}{3\pi(s-4M_\pi^2)}\,\ln\!\left(\frac{s+s'-4M_\pi^2}{s'}\right) - \frac{2s+5s'-16M_\pi^2}{3\pi s'(s'-4M_\pi^2)}\,.$$

We do not need to list other components: they may be generated from the above formulae by means of standard integration routines.

Appendix B. Background amplitude

B.1. Expansion of the background for small momenta

The background amplitude only contains very weak singularities at low energies. At small values of the arguments, $A(s,t,u)_d$ thus represents a slowly varying function of $s$, $t$, $u$, which is adequately approximated by a polynomial. We may, for instance, consider the Taylor series expansion around the centre of the Mandelstam triangle: Set $s_0 = \tfrac43 M_\pi^2$, $s = s_0 + x$, $t = s_0 - \tfrac12(x-y)$, expand in powers of $x$ and $y$ and truncate the series. Alternatively, we may exploit the fact that, in view of the angular momentum barrier, the dispersion integral over the imaginary parts of the higher partial waves receives significant contributions only for $s' \gtrsim 1\ \mathrm{GeV}^2$. For small values of $s$ and $t$, we can therefore expand the kernels $g_2(s,t,s')$ and $g_3(s,t,s')$ in inverse powers of $s'$. The coefficients of this expansion are homogeneous polynomials of $s$, $t$ and $M_\pi^2$, which may be ordered with the standard chiral power counting. The corresponding expansion of the Legendre polynomial starts with

$$P_\ell\!\left(1+\frac{2t}{s'-4M_\pi^2}\right) = 1 + \ell(\ell+1)\,\frac{t}{s'} + O(p^4)\,.$$

Truncating the expansion at order $p^6$, the background amplitude becomes

$$\begin{aligned}
\vec T(s,t)_d = -32\pi\,\big\{ &(tu\,C_{st} + su\,C_{su} + st\,C_{tu})(1+C_{su})\,\vec I_0\\
&+ \{s^2t\,C_{tu} + u^2s\,C_{su} + t^2u\,C_{st} + (t^2s\,C_{tu} + s^2u\,C_{su} + u^2t\,C_{st})\,C_{su}\}\,\vec I_1\\
&+ stu\,(1+C_{su})\,\vec H \big\} + O(p^8)\,.
\end{aligned} \tag{B.1}$$
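The quality of the leading term in the expansion of the Legendre polynomial is easy to probe numerically. The following sketch works in units of $M_\pi^2$ and uses Bonnet's recurrence to evaluate $P_\ell$ outside the interval $[-1, 1]$; the function names and the sample values of $t$ and $s'$ are illustrative choices, not taken from the text.

```python
def legendre(ell, x):
    """Evaluate P_ell(x) with Bonnet's recurrence (valid for any real x)."""
    p_prev, p = 1.0, x
    if ell == 0:
        return p_prev
    for n in range(1, ell):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        p_prev, p = p, ((2 * n + 1) * x * p - n * p_prev) / (n + 1)
    return p


def legendre_at_kernel_argument(ell, t, s_prime, m_pi_sq=1.0):
    """P_ell(1 + 2t / (s' - 4 M_pi^2)), the combination entering Eq. (A.5)."""
    return legendre(ell, 1.0 + 2.0 * t / (s_prime - 4.0 * m_pi_sq))


def leading_expansion(ell, t, s_prime):
    """First two terms of the expansion: 1 + ell (ell + 1) t / s'."""
    return 1.0 + ell * (ell + 1) * t / s_prime


if __name__ == "__main__":
    t, s_prime = 1.0, 200.0  # illustrative values in units of M_pi^2
    for ell in (2, 3, 4):
        exact = legendre_at_kernel_argument(ell, t, s_prime)
        approx = leading_expansion(ell, t, s_prime)
        print(ell, exact, approx, exact - approx)
```

The residual grows with $\ell$, which reflects the fact that the neglected terms carry higher powers of $\ell(\ell+1)\,t/s'$.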


The coefficients $\vec I_0$ and $\vec I_1$ represent moments$^{14}$ of the imaginary part at $t = 0$,

$$I^I_n = \frac{1}{32\pi^2}\int_{4M_\pi^2}^{\infty} ds\,\frac{\mathrm{Im}\,T^I(s,0)_d}{s^{n+2}(s-4M_\pi^2)}\,. \tag{B.2}$$

In view of the optical theorem, these quantities are given by integrals over the total cross section, except that the contributions from the S- and P-waves below $s_2$ are to be removed. Equivalently, we may express these coefficients in terms of the imaginary parts of the partial waves:

$$I^I_n = \sum_{\ell=2}^{\infty}\frac{(2\ell+1)}{\pi}\int_{4M_\pi^2}^{s_2} ds\,\frac{\mathrm{Im}\,t^I_\ell(s)}{s^{n+2}(s-4M_\pi^2)} + \sum_{\ell=0}^{\infty}\frac{(2\ell+1)}{\pi}\int_{s_2}^{\infty} ds\,\frac{\mathrm{Im}\,t^I_\ell(s)}{s^{n+2}(s-4M_\pi^2)}\,. \tag{B.3}$$

Except for a contribution proportional to $I^1_1$, the last term in Eq. (B.1) may be expressed in terms of the derivative of $\mathrm{Im}\,\vec T(s,t)_d$ with respect to $t$:

$$H^I = -2\,I^1_1\,\delta^I_1 + \frac{1}{32\pi^2}\int_{4M_\pi^2}^{\infty}\frac{ds}{s^3}\left.\frac{\partial\,\mathrm{Im}\,T^I(s,t)_d}{\partial t}\right|_{t=0}. \tag{B.4}$$

Here, only the higher partial waves contribute:

$$H^I = \sum_{\ell=2}^{\infty}\frac{(2\ell+1)}{\pi}\,\{\ell(\ell+1) - 2\,\delta^I_1\}\int_{4M_\pi^2}^{\infty} ds\,\frac{\mathrm{Im}\,t^I_\ell(s)}{s^3(s-4M_\pi^2)}\,. \tag{B.5}$$

The expression is similar to the one for $I^I_1$, except that the sum over the angular momenta picks up a factor of $\ell(\ell+1)$, indicating that partial waves with higher values of $\ell$ are more significant here. Note that all of the above moments are positive.

B.2. Constraints due to crossing symmetry

The expansion of the background amplitude starts at order $p^4$, with a manifestly crossing symmetric contribution determined by the moments $\vec I_0$. The term from $\vec I_1$ is also crossing symmetric, but the one proportional to $stu$ violates the condition $\vec T(s,u)_d = C_{tu}\,\vec T(s,t)_d$, unless the $I = 1$ component of the vector $(1 + C_{su})\vec H$ vanishes, i.e.

$$2H^0 = 9H^1 + 5H^2\,. \tag{B.6}$$

This sum rule is both necessary and sufficient for the polynomial approximation to the background amplitude to be crossing symmetric up to and including contributions of order $p^6$. The sum rule illustrates the well-known fact that crossing symmetry leads to stringent constraints on the imaginary parts of the partial waves with $\ell \geq 2$ (for a thorough discussion, see [81,3]). Crossing symmetry implies for instance that $\mathrm{Im}\,t^0_2(s)$ can be different from zero only if some of the higher partial waves with $I = 1$ or 2 also possess an imaginary part, in marked contrast to the situation for the S- and P-waves, where crossing symmetry does not constrain the imaginary parts.

$^{14}$ The factor $1/(s - 4M_\pi^2)$ could also be expanded in inverse powers of $s$, but this would worsen the accuracy of the polynomial representation. Note that the same factor also occurs in the representation (3.6) for the contributions generated by the imaginary part of the S- and P-waves below $s_2$: The expansion of the functions $W^I(s)$ in powers of $s$ yields integrals of the same form. Hence the low energy expansion of the full amplitude can be expressed in terms of moments of this type.

In the form given, the sum rule only holds up to corrections of order $M_\pi^2$. We may, however, establish an exact variant by expanding the $I = 1$ component of the relation $\vec T(s,u)_d = C_{tu}\,\vec T(s,t)_d$ around threshold, for instance in powers of $t$ and $u$. In order for the term of order $tu$ occurring in the expansion of the left hand side to agree with the corresponding term on the right hand side, the imaginary parts must obey the sum rule

$$\int_{4M_\pi^2}^{\infty}\frac{ds}{s^2(s-4M_\pi^2)}\,\{2\,\mathrm{Im}\,\dot T^0(s,0) - 5\,\mathrm{Im}\,\dot T^2(s,0)\}
= 3\int_{4M_\pi^2}^{\infty}\frac{ds\,(3s-4M_\pi^2)}{s^2(s-4M_\pi^2)^3}\,\{(s-4M_\pi^2)\,\mathrm{Im}\,\dot T^1(s,0) - 2\,\mathrm{Im}\,T^1(s,0)\}\,, \tag{B.7}$$

where $\dot T^I(s,t)$ stands for the partial derivative of $T^I(s,t)$ with respect to $t$. Expressed in terms of the imaginary parts of the partial waves, the relation reads

$$\sum_{\ell=2,4,\ldots}(2\ell+1)\,\ell(\ell+1)\int_{4M_\pi^2}^{\infty}\frac{ds}{s^2(s-4M_\pi^2)^2}\,\{2\,\mathrm{Im}\,t^0_\ell(s) - 5\,\mathrm{Im}\,t^2_\ell(s)\}
= \sum_{\ell=3,5,\ldots}(2\ell+1)\,\{\ell(\ell+1)-2\}\int_{4M_\pi^2}^{\infty}\frac{ds\,(s-\tfrac43 M_\pi^2)}{s^2(s-4M_\pi^2)^3}\,9\,\mathrm{Im}\,t^1_\ell(s)\,. \tag{B.8}$$

The approximate version (B.6) differs from this exact result only through terms of order $M_\pi^2$.

The constraints imposed by crossing symmetry show, in particular, that the concept of tensor meson dominance is subject to a limitation that does not occur in the case of vector dominance: The hypothesis that convergent dispersion integrals or sum rules are saturated by the contributions from a spin 2 resonance leads to coherent results only at leading order of the low energy expansion. The sum rule (B.7) demonstrates that the hypothesis in general fails: Crossing symmetry implies that singularities with $\ell \geq 2$ cannot be dealt with one by one.

Since the relation (B.6) ensures crossing symmetry, the above low energy expansion of the isospin components of the amplitude is equivalent to a manifestly crossing symmetric representation of the background amplitude:

$$A(s,t,u)_d = p_1 + p_2\,s + p_3\,s^2 + p_4\,(t-u)^2 + p_5\,s^3 + p_6\,s(t-u)^2 + O(p^8)\,. \tag{B.9}$$

By construction, $A(s,t,u)_d$ does not contribute to the S-wave scattering lengths. This condition fixes $p_1$ and $p_2$ in terms of the remaining coefficients:
\[
p_1 = -16M_\pi^4\,p_4, \qquad p_2 = 4M_\pi^2\,(-p_3 + p_4 - 4M_\pi^2\,p_5).
\]
The explicit expressions for the latter read
\[
p_3 = \tfrac{8}{3}\,(4I_0^0 - 9I_0^1 - I_0^2) + \tfrac{16}{3}\,M_\pi^2\,(-8I_1^0 - 21I_1^1 + 11I_1^2 + 12H), \qquad
p_4 = 8\,(I_0^1 + I_0^2) + 16M_\pi^2\,(I_1^1 + I_1^2), \tag{B.10}
\]
\[
p_5 = \tfrac{4}{3}\,(8I_1^0 + 9I_1^1 - 11I_1^2 - 6H), \qquad
p_6 = 4\,(I_1^1 - 3I_1^2 + 2H). \tag{B.11}
\]

In view of the sum rule (B.6), only two of the components $H^0$, $H^1$, $H^2$ are independent. Moreover, the amplitude only involves a combination thereof:
\[
H \equiv \tfrac{2}{5}\,(H^0 - 2H^1) = \tfrac{2}{9}\,(H^0 + 2H^2) = H^1 + H^2. \tag{B.12}
\]
The above formulae show that the leading background contribution is determined by the integrals $I_0^I$, which yield
\[
p_1 = O(M_\pi^4), \qquad p_2 = O(M_\pi^2), \qquad p_3 = O(1), \qquad p_4 = O(1).
\]

The contributions from $I_1^I$ and $H$ modify the result by corrections that are suppressed by one power of $M_\pi^2$ and, in addition, generate a polynomial of third degree in $s$, $t$, $u$, characterized by $p_5$ and $p_6$.

B.3. Background generated by the higher partial waves

Next, we turn to the numerical evaluation of the integrals $I_0^I$, $I_1^I$, $H^I$ and first consider the contributions from the imaginary parts of the partial waves with $\ell\ge 2$ in the region below 2 GeV. The integrals are dominated by the resonances, which generate peaks in the imaginary parts. In the vicinity of the peak, we may describe the phase shift with the Breit–Wigner formula
\[
e^{2i\delta(s)} = \frac{M_r^2 + i\,\Gamma_r M_r - s}{M_r^2 - i\,\Gamma_r M_r - s},
\]
where $M_r$ and $\Gamma_r$ denote the mass and the width of the resonance, respectively. To account for inelasticity (decays into states other than $\pi\pi$), we multiply the corresponding expression for the imaginary part of the partial wave amplitude with the branching fraction $\Gamma_{r\to\pi\pi}/\Gamma_r$. This leads to
\[
\mathrm{Im}\,t_{\ell_r}^{I_r}(s) = \sqrt{\frac{s}{s-4M_\pi^2}}\;\frac{\Gamma_{r\to\pi\pi}\,\Gamma_r\,M_r^2}{(s-M_r^2)^2 + \Gamma_r^2 M_r^2},
\]
where $I_r$ and $\ell_r$ denote the isospin and the spin of the resonance, respectively. In the narrow width approximation, the formula simplifies to
\[
\mathrm{Im}\,t_{\ell_r}^{I_r}(s) = \pi\,\Gamma_{r\to\pi\pi}\,M_r\,(1-4M_\pi^2/M_r^2)^{-1/2}\,\delta(s-M_r^2). \tag{B.13}
\]
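As a rough numerical cross-check of the narrow width formula (B.13), the following Python sketch evaluates the contribution of a single resonance to the moments. It rests on assumptions that are not spelled out in this section: the moments are taken to be $I_n^I = \frac{1}{\pi}\sum_\ell (2\ell+1)\int ds\,\mathrm{Im}\,t_\ell^I(s)/[s^{n+2}(s-4M_\pi^2)]$, and $H^I$ is obtained from $I_1^I$ by weighting each wave with $\ell(\ell+1)$ for $I=0,2$ and with $\ell(\ell+1)-2$ for $I=1$ (the weights that appear in Eq. (B.8)). The pion mass value and the function name `narrow_width_moments` are illustrative choices. With these conventions the sketch comes out close to the narrow width numbers quoted in this appendix.

```python
import math

M_PI = 0.13957  # pion mass in GeV (assumed value, not taken from the text)


def narrow_width_moments(mass, width_pipi, spin, isospin):
    """Narrow width estimate of (I_0, I_1, H) for one resonance.

    mass and width_pipi are in GeV. The moment definitions are assumptions
    (see the lead-in); they are not quoted from the text.
    """
    s_r = mass ** 2
    # Eq. (B.13): Im t = pi * strength * delta(s - M_r^2)
    strength = width_pipi * mass / math.sqrt(1.0 - 4.0 * M_PI ** 2 / s_r)
    # The factor pi from (B.13) cancels against the assumed 1/pi in the moments.
    i0 = (2 * spin + 1) * strength / (s_r ** 2 * (s_r - 4.0 * M_PI ** 2))
    i1 = i0 / s_r
    # Weight l(l+1) for I = 0, 2 and l(l+1) - 2 for I = 1, as in Eq. (B.8).
    weight = spin * (spin + 1) - (2 if isospin == 1 else 0)
    return i0, i1, weight * i1


# Resonance parameters quoted in this appendix (masses and pi-pi widths in GeV).
f2 = narrow_width_moments(1.275, 0.157, spin=2, isospin=0)
rho3 = narrow_width_moments(1.691, 0.038, spin=3, isospin=1)
f4 = narrow_width_moments(2.044, 0.035, spin=4, isospin=0)

for name, (i0, i1, h) in (("f2", f2), ("rho3", rho3), ("f4", f4)):
    print(f"{name}: I0 = {i0:.3f} GeV^-4, I1 = {i1:.3f} GeV^-6, H = {h:.2f} GeV^-6")
```

The $\rho_3$ mass of 1691 MeV used here is the value that appears in the Regge trajectory quoted in the text.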

Only four of the states listed in the particle data booklet [71] below 2 GeV have spin $\ell\ge 2$ and carry the proper quantum numbers to be produced in $\pi\pi$ collisions: the spin 2 resonances $f_2(1275)$ and $f_2'(1525)$, the spin 3 state $\rho_3(1681)$ and the state $f_J(1710)$, whose spin is not firmly established, but must be even. There is no evidence for exotic states: $f_2$, $f_2'$ and $f_J$ are isoscalars, while the $\rho_3$ is an isovector. Very likely, the lightest spin 4 state is the $f_4(2044)$: a linear $\rho(770)$–$f_2(1275)$–$\rho_3(1691)$ Regge trajectory calls for a spin 4 recurrence almost exactly there. At any rate, if the spin of the state $f_J(1710)$ were equal to 4 or even larger, it would sit above that trajectory and thus upset the standard Regge picture, which we will be making use of to estimate the asymptotic part of the driving terms. We take it for granted that $J=0$ or 2 and conclude that only the $I=0$ D-wave and the F-wave contain resonances below 2 GeV. In the following, we discuss the contributions generated by these states, comparing the result obtained from the narrow width formula with the one found on the basis of two different phase shift analyses.

The most important contribution arises from the tensor meson $f_2(1275)$. Inserting the values $M_{f_2} = 1275$ MeV, $\Gamma_{f_2\to\pi\pi} = 157$ MeV, the narrow width formula gives $I^0_{0f_2} = 0.25\ \mathrm{GeV}^{-4}$, $I^0_{1f_2} = 0.15\ \mathrm{GeV}^{-6}$, $H^0_{f_2} = 0.93\ \mathrm{GeV}^{-6}$, to be compared with the results obtained with the parametrizations of the D-wave in Refs. [48,58], which yield
\[
I^0_{0D} = 0.25\ \mathrm{GeV}^{-4}, \qquad I^0_{1D} = 0.18\ \mathrm{GeV}^{-6}, \qquad H^0_D = 1.10\ \mathrm{GeV}^{-6} \tag{B.14}
\]
(Ref. [48]),
\[
I^0_{0D} = 0.27\ \mathrm{GeV}^{-4}, \qquad I^0_{1D} = 0.19\ \mathrm{GeV}^{-6}, \qquad H^0_D = 1.17\ \mathrm{GeV}^{-6} \tag{B.15}
\]

(Ref. [58]). These numbers show that the contributions from the imaginary part of the D-wave are dominated by the $f_2(1275)$. We add a few remarks concerning the detailed behaviour of $\mathrm{Im}\,t_2^0(s)$ and first note that the $f_2'(1525)$ mainly decays into $K\bar K$. In the present context, this state may be ignored, because the corresponding partial width is tiny: $\Gamma_{f_2'\to\pi\pi} = 0.62 \pm 0.14$ MeV. The phase shift analysis of Ref. [58] does contain a second resonance in the D-wave, which generates a small enhancement in the integrands on the r.h.s. of Eqs. (B.3), (B.5) towards the upper end of the range of integration. The numerical result in Eq. (B.15) includes the tiny contribution produced by this enhancement, but this effect only accounts for a small fraction of the difference in the values obtained with the two different phase shift analyses. The main reason for that difference is that the two representations of the D-wave in Refs. [48,58] do not agree very well on the left wing of the $f_2(1275)$. In the context of the present paper, these details are not essential: we use the difference between the two phase shift analyses as a measure for the uncertainties to be attached to the moments.

To estimate the significance of the remaining partial waves with $I=0$, we consider the contribution generated by the $f_4(2044)$. This resonance also mostly decays into states other than $\pi\pi$. The relevant partial width is $\Gamma_{f_4\to\pi\pi} = 35 \pm 4$ MeV. The narrow width formula shows that the contribution from this state is very small: $I^0_{0f_4} = 0.009\ \mathrm{GeV}^{-4}$, $I^0_{1f_4} = 0.002\ \mathrm{GeV}^{-6}$, $H^0_{f_4} = 0.04\ \mathrm{GeV}^{-6}$. Moreover, the centre of the peak is outside our range of integration: more than half of the contribution from this level is to be booked in the asymptotic part. We conclude that the imaginary parts of the partial waves with $\ell\ge 4$ only matter at energies above 2 GeV.

The $\rho_3(1681)$ shows up as a peak in the imaginary part of the F-wave. According to the particle data tables [71], it mainly decays into $4\pi$.

Table 6
Moments of the background amplitude. The rows L, R and P indicate the contributions from the region below 2 GeV, from the leading Regge trajectory and from the Pomeron, respectively. The last two rows show the result for the sum of these contributions and our estimate of the uncertainties, respectively

          I = 0                             I = 1                             I = 2
          I_0^0     I_1^0     H^0           I_0^1     I_1^1     H^1           I_0^2     I_1^2     H^2
          GeV^-4    GeV^-6    GeV^-6        GeV^-4    GeV^-6    GeV^-6        GeV^-4    GeV^-6    GeV^-6
L         0.26      0.19      1.13          0.029     0.014     0.14          0.005     0.006     0.04
R         0.03      0.004     0.11          0.018     0.003     0.07          –         –         –
P         0.01      0.001     0.04          0.010     0.001     0.04          0.010     0.001     0.04
Total     0.30      0.19      1.28          0.058     0.018     0.24          0.015     0.007     0.08
±         0.01      0.01      0.05          0.007     0.002     0.03          0.008     0.006     0.04

The partial width of interest in our context is $\Gamma_{\rho_3\to\pi\pi} = 38 \pm 3$ MeV. Inserting this in the narrow width formula, we obtain $I^1_{0\rho_3} = 0.020\ \mathrm{GeV}^{-4}$, $I^1_{1\rho_3} = 0.007\ \mathrm{GeV}^{-6}$, $H^1_{\rho_3} = 0.07\ \mathrm{GeV}^{-6}$, to be compared with the values found by performing the numerical integration with the representations for the F-wave given in the two references quoted above:
\[
I^1_{0F} = 0.028\ \mathrm{GeV}^{-4}, \qquad I^1_{1F} = 0.012\ \mathrm{GeV}^{-6}, \qquad H^1_F = 0.12\ \mathrm{GeV}^{-6} \tag{B.16}
\]
(Ref. [48]),
\[
I^1_{0F} = 0.030\ \mathrm{GeV}^{-4}, \qquad I^1_{1F} = 0.016\ \mathrm{GeV}^{-6}, \qquad H^1_F = 0.16\ \mathrm{GeV}^{-6} \tag{B.17}
\]

(Ref. [58]). In the present case, the narrow width formula only accounts for about half of the result: the region below the resonance is equally important. There, the difference between the two phase shift analyses is more pronounced than for the D-waves. Accordingly, the uncertainties in the F-wave contributions to the moments are larger.

The formula (B.13) predicts that the contribution generated by the imaginary part of the $I=2$ waves vanishes, because that channel does not contain any resonances. According to Martin et al. [82], the D-wave phase shift may be approximated as $\delta_2^2(s) \simeq -0.003\,(s/4M_\pi^2)\,(1-4M_\pi^2/s)^{5/2}$. The corresponding contributions to the moments are indeed very small: $I_0^2 = 0.005\ \mathrm{GeV}^{-4}$, $I_1^2 = 0.006\ \mathrm{GeV}^{-6}$, $H^2 = 0.04\ \mathrm{GeV}^{-6}$. In the following, we assume that these estimates do hold to within a factor of two.

This completes our discussion of the contributions generated by the higher partial waves in the region below 2 GeV. The net result is that these are due almost exclusively to the D- and F-waves. The numerical results are listed in row L of Table 6. For $I = 0, 1$, the values given rely on the phase shift analyses of Refs. [48,58], while the estimates for $I = 2$ correspond to the parametrization of Ref. [82].

B.4. Asymptotic contributions

We now turn to the contributions from the high energy tail of the dispersion integrals. The asymptotic behaviour of the scattering amplitude may be analysed in terms of Regge poles. A trajectory with isospin $I$ generates a contribution $\propto s^{\alpha(t)}$ to the t-channel isospin component $\mathrm{Im}\,T^{(I)}(s,t)$, which is defined by
\[
\mathrm{Im}\,T^{(I)}(s,t) = \sum_{I'} C_{st}^{II'}\,\mathrm{Im}\,T^{I'}(s,t).
\]
The asymptotic behaviour of the amplitude with $I_t = 1$ ($s\to\infty$, $t$ fixed) is governed by the $\rho$-trajectory,
\[
\mathrm{Im}\,T^{(1)}(s,t) = \beta_\rho(t)\,s^{\alpha_\rho(t)}.
\]



The Pomeron dominates the high energy behaviour of the $I_t = 0$ amplitude. Together with the contribution from the f-trajectory, the Regge representation of this component reads
\[
\mathrm{Im}\,T^{(0)}(s,t) = 3P(s,t) + \beta_f(t)\,s^{\alpha_f(t)}.
\]
In the absence of exotic trajectories, the component with $I_t = 2$ rapidly tends to zero when $s$ becomes large. The asymptotic behaviour of the s-channel isospin components thus takes the form
\[
\begin{aligned}
\mathrm{Im}\,T^0(s,t) &= P(s,t) + \tfrac{1}{3}\,\beta_f(t)\,s^{\alpha_f(t)} + \beta_\rho(t)\,s^{\alpha_\rho(t)} + (t\leftrightarrow u),\\
\mathrm{Im}\,T^1(s,t) &= P(s,t) + \tfrac{1}{3}\,\beta_f(t)\,s^{\alpha_f(t)} + \tfrac{1}{2}\,\beta_\rho(t)\,s^{\alpha_\rho(t)} - (t\leftrightarrow u),\\
\mathrm{Im}\,T^2(s,t) &= P(s,t) + \tfrac{1}{3}\,\beta_f(t)\,s^{\alpha_f(t)} - \tfrac{1}{2}\,\beta_\rho(t)\,s^{\alpha_\rho(t)} + (t\leftrightarrow u).
\end{aligned} \tag{B.18}
\]

If $t$ is kept fixed, the terms with $P(s,t)$ and $\beta(t)\,s^{\alpha(t)}$ dominate, generating a peak in the forward direction, while the analogous structure in the backward direction (fixed $u$) is described by those with $P(s,u)$ and $\beta(u)\,s^{\alpha(u)}$. At fixed $t$, the crossed terms drop off very rapidly with $s$, so that their contribution disappears in the noise of the calculation and may just as well be dropped.

The Lovelace–Shapiro–Veneziano model [83–85] provides a very instructive framework for understanding the interplay of the asymptotic contributions with the resonance structures seen at low energies (see Appendix E). In that model, the $\rho$- and f-trajectories are linear and exchange degenerate,
\[
\alpha_\rho(t) = \alpha_f(t) = \alpha_0 + t\,\alpha_1. \tag{B.19}
\]

We fix the intercept with the Adler zero, $\alpha(M_\pi^2) = \tfrac{1}{2}$, and choose the slope such that the spin 1 state on the leading trajectory occurs at the proper mass:
\[
\alpha_1 = \tfrac{1}{2}\,(M_\rho^2 - M_\pi^2)^{-1}, \qquad \alpha_0 = \tfrac{1}{2} - \alpha_1 M_\pi^2. \tag{B.20}
\]
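The trajectory parameters in Eq. (B.20) are easy to evaluate numerically. The short Python sketch below does so under assumed mass values ($M_\pi = 139.57$ MeV and $M_\rho = 770$ MeV, neither of which is quoted explicitly in the text); the helper names `alpha` and `recurrence_mass` are illustrative. It checks that the trajectory passes through $\tfrac12$ at $t = M_\pi^2$ and through 1 at $t = M_\rho^2$, locates the zero of $\alpha(t)$, and lists the masses at which the linear trajectory reaches spin 2, 3 and 4.

```python
import math

M_PI = 0.13957   # GeV, assumed pion mass
M_RHO = 0.770    # GeV, assumed rho mass

# Eq. (B.20): slope and intercept of the linear trajectory
ALPHA_1 = 0.5 / (M_RHO ** 2 - M_PI ** 2)
ALPHA_0 = 0.5 - ALPHA_1 * M_PI ** 2


def alpha(t):
    """Linear trajectory alpha(t) = alpha_0 + alpha_1 * t, t in GeV^2."""
    return ALPHA_0 + ALPHA_1 * t


def recurrence_mass(n):
    """Mass (GeV) at which the trajectory reaches spin n, i.e. alpha(M_n^2) = n."""
    return math.sqrt((n - ALPHA_0) / ALPHA_1)


t_zero = -ALPHA_0 / ALPHA_1  # zero of alpha(t); equals 2*M_pi^2 - M_rho^2

print(f"alpha_1 = {ALPHA_1:.4f} GeV^-2, alpha_0 = {ALPHA_0:.4f}")
print(f"alpha(t) vanishes at t = {t_zero:.2f} GeV^2")
for spin in (2, 3, 4):
    print(f"spin {spin}: M = {1000 * recurrence_mass(spin):.0f} MeV")
```

With these inputs the spin 2, 3 and 4 recurrences come out at the masses quoted for the model in Appendix E, and the zero of $\alpha(t)$ sits near $-0.55\ \mathrm{GeV}^2$.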

The amplitude may be represented as a sum of narrow resonance contributions. Since the model does not contain exotic states, $\mathrm{Im}\,T^2(s,t)$ vanishes, so that the residues $\beta_f(t)$ and $\beta_\rho(t)$ are in the ratio 3 : 2. The explicit expression reads
\[
\beta_\rho(t) = \tfrac{2}{3}\,\beta_f(t) = \frac{\pi\lambda\,(\alpha_1)^{\alpha(t)}}{\Gamma[\alpha(t)]}. \tag{B.21}
\]
Finally, we fix the overall normalization constant $\lambda$ such that the width of the $\rho$ agrees with what is observed. This requires
\[
\lambda = 96\pi\,\Gamma_\rho\,M_\rho^2\,(M_\rho^2 - 4M_\pi^2)^{-3/2}. \tag{B.22}
\]

The model explicitly obeys crossing symmetry and yields a decent picture both for the masses and widths of the resonances occurring on the leading trajectory and for the qualitative properties of the Regge residues $\beta_\rho(t)$, $\beta_f(t)$. The main deficiency of the model is lack of unitarity: it does not contain a Pomeron term, so that the total cross section tends to zero at high energies. While the model yields quite decent values for the full widths, it does not account for the fact that the resonances often decay into states other than $\pi\pi$, particularly if the available phase space becomes large: in the model, the branching fraction $\Gamma_{r\to\pi\pi}/\Gamma_r$ is equal to 1. Consequently, the LSV-model overestimates the magnitude of the Regge residues; a significant fraction thereof should be transferred to the Pomeron term. For this reason, the model can only serve as a semi-quantitative guide.

As discussed in Section B.2, crossing symmetry strongly correlates the asymptotic behaviour of the partial waves with their properties at low energy. In particular, the parameters occurring in the Regge representation of the scattering amplitude can be extracted from low energy phenomenology. For a review of these calculations, we refer to the article by Pennington [44]. The value obtained for $\beta_\rho(0)$ is smaller$^{15}$ than what follows from Eqs. (B.21), (B.22) by a factor of $0.6 \pm 0.1$. Also, while formula (B.21) implies that the residue contains a zero at $t_0 = 2M_\pi^2 - M_\rho^2 = -0.55\ \mathrm{GeV}^2$ because $\alpha(t)$ vanishes there, the calculation of Ref. [44] instead yields a zero at $t_0 = -0.44 \pm 0.05\ \mathrm{GeV}^2$. This confirms the remarks made above: the LSV-model describes the qualitative properties of the Regge residues quite decently, but overestimates their magnitude.

In the numerical evaluation, we use the linear $\rho$-trajectory specified above, $\alpha_\rho(t) = \alpha(t)$, and fix the corresponding residue with the results of Ref. [44], which are adequately described by a modified version of the LSV-formula:
\[
\beta_\rho(t) = \frac{\pi\lambda_1\,\alpha_1^{\alpha(t)}}{\Gamma[(t-t_0)\,\alpha_1]}, \qquad t_0 = -0.44\ \mathrm{GeV}^2, \qquad \lambda_1 = (0.78 \pm 0.13)\,\lambda. \tag{B.23}
\]
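To get a feeling for the size of the Regge term, the Python sketch below evaluates the modified residue at $t = 0$ and converts the resulting contribution to $\mathrm{Im}\,T^0(s,0)$ into a cross section. Several ingredients are assumptions rather than statements of the text: the numerical inputs ($M_\pi = 139.57$ MeV, $M_\rho = 770$ MeV, $\Gamma_\rho = 151$ MeV), the reading $\beta_\rho(t) = \pi\lambda_1\alpha_1^{\alpha(t)}/\Gamma[(t-t_0)\alpha_1]$ with $\lambda = 96\pi\,\Gamma_\rho M_\rho^2 (M_\rho^2-4M_\pi^2)^{-3/2}$, the exchange degenerate weight $\tfrac32\beta_\rho(0)\,s^{\alpha(0)}$ for the $I=0$ component with the crossed term dropped, the optical theorem in the normalization $\sigma_{tot} = \mathrm{Im}\,T^0(s,0)/\sqrt{s(s-4M_\pi^2)}$, and the conversion $1\ \mathrm{GeV}^{-2} \approx 0.3894$ mb. All function names are illustrative.

```python
import math

M_PI = 0.13957       # GeV, assumed
M_RHO = 0.770        # GeV, assumed
GAMMA_RHO = 0.151    # GeV, assumed rho width
GEV2_TO_MB = 0.3894  # 1 GeV^-2 expressed in mb

ALPHA_1 = 0.5 / (M_RHO ** 2 - M_PI ** 2)
ALPHA_0 = 0.5 - ALPHA_1 * M_PI ** 2
# Normalization fixed by the rho width (assumed reading of Eq. (B.22))
LAMBDA = 96 * math.pi * GAMMA_RHO * M_RHO ** 2 * (M_RHO ** 2 - 4 * M_PI ** 2) ** -1.5


def alpha(t):
    return ALPHA_0 + ALPHA_1 * t


def beta_rho_lsv(t):
    """Residue of the unmodified LSV model (assumed reading of Eq. (B.21))."""
    return math.pi * LAMBDA * ALPHA_1 ** alpha(t) / math.gamma(alpha(t))


def beta_rho_modified(t, t0=-0.44, scale=0.78):
    """Modified residue (assumed reading of Eq. (B.23)); valid for t > t0 only."""
    return math.pi * scale * LAMBDA * ALPHA_1 ** alpha(t) / math.gamma((t - t0) * ALPHA_1)


def regge_cross_section_mb(sqrt_s):
    """Cross section (mb) corresponding to the rho-f term in Im T^0(s, 0)."""
    s = sqrt_s ** 2
    im_t0 = 1.5 * beta_rho_modified(0.0) * s ** alpha(0.0)
    return GEV2_TO_MB * im_t0 / math.sqrt(s * (s - 4 * M_PI ** 2))


ratio = beta_rho_modified(0.0) / beta_rho_lsv(0.0)
print(f"modified / LSV residue at t = 0: {ratio:.2f}")
for energy in (2.0, 3.0):
    print(f"sqrt(s) = {energy} GeV: {regge_cross_section_mb(energy):.1f} mb")
```

Under these assumptions the ratio of the two residues at $t = 0$ falls inside the range $0.6 \pm 0.1$ quoted above, and the cross sections at 2 and 3 GeV come out close to 21 mb and 14 mb.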

We determine the properties of the f-trajectory with exchange degeneracy, i.e. set $\alpha_f(t) = \alpha(t)$ and $\beta_f(t) = \tfrac{3}{2}\,\beta_\rho(t)$. For the Pomeron, we use the representation
\[
P(s,t) = \sigma\,s\,e^{\frac{1}{2}bt}. \tag{B.24}
\]

While the parameter $b = 8\ \mathrm{GeV}^{-2}$ [44] describes the width of the diffraction peak, the optical theorem implies that $\sigma$ represents the asymptotic value of the total cross section. Evidently, the above parametrization can be adequate only in a limited range of energies: the cross section does not tend to a constant, but grows logarithmically. In the present context, however, the behaviour at very high energies is an academic issue, because the integrands of the moments rapidly fall off with $s$. What counts is that the above representation yields a decent approximation for c.m. energies in the range between 2 and 3 GeV. There, the terms generated by the $\rho$-f-trajectory are by no means negligible: the formula (B.18) shows that at 2 GeV (3 GeV), these terms by themselves generate a contribution to $\mathrm{Im}\,T^0(s,0)$ that corresponds to a total cross section of 21 mb (14 mb). In the energy range relevant for the moments, the Pomeron term does not represent the dominating contribution to the total cross section. As discussed in detail in Ref. [44], crossing symmetry leads to the estimate $\sigma = (6 \pm 5)$ mb. Although the error bar is large, the value is significantly smaller than what is indicated by the rule of thumb $\sigma_{tot}^{\pi\pi} \simeq \tfrac{2}{3}\,\sigma_{tot}^{\pi N} \simeq \tfrac{4}{9}\,\sigma_{tot}^{NN} \simeq 20$ mb.

$^{15}$ In Ref. [44], the residue is written as $\beta_\rho(t) = \tfrac{16\pi}{3}\,\gamma_\rho(t)\,\alpha_1^{\alpha_\rho(t)-1/2}$. The result obtained for the value at $t = 0$ is $\gamma_\rho(0) = (0.6 \pm 0.1)\,M_\pi^{-1}$, to be compared with the number $\gamma_\rho(0) = 0.97\,M_\pi^{-1}$ that follows from Eqs. (B.19)–(B.22).



Indeed, the sum rule (B.6) confirms this result. The numerical values obtained with the above representation for the contributions from the $\rho$-f-trajectory are indicated in row R of Table 6. If the high energy tail is omitted altogether, the l.h.s. of the sum rule (B.6) becomes $(2H^0)_L = 2.3\ \mathrm{GeV}^{-6}$, while the r.h.s. amounts to $(9H^1 + 5H^2)_L = 1.5\ \mathrm{GeV}^{-6}$. Clearly, further contributions are required to bring the sum rule into equilibrium. The Regge terms do contribute more to the right than to the left and reduce the discrepancy by a factor of two. Since the Pomeron affects the various isospin components almost equally, it contributes about 7 times more to the right than to the left. For the sum rule to be obeyed within the uncertainties of the remaining contributions, the value of $\sigma$ must be in the range $\sigma = (5 \pm 3)$ mb.

Let us compare our representation of the background with the model used for the asymptotic behaviour in the early literature. Assume that, above an energy of 1.5 GeV, the imaginary parts can be described by a Pomeron term with $\sigma_{tot} = 20$ mb and a Regge term that corresponds to the leading trajectory of the LSV-model. The l.h.s. of the sum rule (B.6) then takes the value $2H^0 = 3.3$, while the r.h.s. yields $9H^1 + 5H^2 = 6.1$ (to be compared with the value 2.6 obtained for either one of the two sides with our representation of the background). Evidently, the model is in conflict with crossing symmetry. In the region relevant for the driving term integrals, the LSV-model overestimates the Regge residues by about 40% [44] and the sum rule (B.6) then implies that the value $\sigma = 20$ mb is too large by about a factor of 4. We repeat that our calculation has no bearing on the asymptotic behaviour of the total cross section: we are merely observing that, unless the value of $\sigma$ is in the range $5 \pm 3$ mb, the representation used for the amplitude violates crossing symmetry.

The row P indicates the contributions to the moments generated by the Pomeron if $\sigma$ is taken in the middle of this range. The net result of our calculation is contained in the last two rows of Table 6, which list the outcome for the moments and for the error bars to be attached to these, respectively. For the quantity $H$ defined in Eq. (B.12), we obtain
\[
H = 0.32 \pm 0.02\ \mathrm{GeV}^{-6}. \tag{B.25}
\]
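The bookkeeping behind these statements can be retraced from the entries of Table 6. The Python sketch below is a plain arithmetic check, not part of the original calculation: it takes the $H^I$ entries of the table (the missing Regge entry for $I=2$ is treated as zero, which is an assumption), evaluates both sides of the sum rule in the form $2H^0 = 9H^1 + 5H^2$ used in the discussion above, and forms the three equivalent combinations of Eq. (B.12). The dictionary and function names are illustrative.

```python
# H^I entries of Table 6 in GeV^-6; the "–" entry for R, I = 2 is taken as 0.0
TABLE6_H = {
    "L": (1.13, 0.14, 0.04),
    "R": (0.11, 0.07, 0.0),
    "P": (0.04, 0.04, 0.04),
    "Total": (1.28, 0.24, 0.08),
}


def sum_rule_sides(h0, h1, h2):
    """Left and right hand side of the sum rule 2 H^0 = 9 H^1 + 5 H^2."""
    return 2 * h0, 9 * h1 + 5 * h2


def h_combinations(h0, h1, h2):
    """The three expressions for H in Eq. (B.12); they agree if the sum rule holds."""
    return 2 / 5 * (h0 - 2 * h1), 2 / 9 * (h0 + 2 * h2), h1 + h2


for row, values in TABLE6_H.items():
    lhs, rhs = sum_rule_sides(*values)
    print(f"{row:5s}: 2 H0 = {lhs:.2f}, 9 H1 + 5 H2 = {rhs:.2f}")

print("H from the Total row:", [round(h, 2) for h in h_combinations(*TABLE6_H["Total"])])
```

For the Total row the two sides of the sum rule coincide, and all three combinations reproduce the central value of Eq. (B.25).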

B.5. Driving terms

The polynomial approximation for the background amplitude can be used to determine the low energy behaviour of the driving terms: it suffices to evaluate the partial wave projections of the polynomial $\vec T(s,t)_d$. The range of validity of the resulting representation for the driving terms, however, only extends to c.m. energies of about 0.6 GeV. For our numerical work, we need a representation that holds for higher energies. The approximations for the imaginary parts discussed above yield the following representation of the driving terms:
\[
d_\ell^I(s) = d_\ell^I(s)_L + d_\ell^I(s)_R + d_\ell^I(s)_P,
\]
\[
d_\ell^I(s)_L = \sum_{I'=0}^{2}\,\sum_{\ell'=2}^{3}\int_{4M_\pi^2}^{s_2} ds'\,K_{\ell\ell'}^{II'}(s,s')\,\mathrm{Im}\,t_{\ell'}^{I'}(s'),
\]
\[
d_\ell^I(s)_H = \frac{1}{32\pi}\int_0^1 dz\,P_\ell(z)\,T^I(s,t_z)_H, \qquad H = R, P,
\]
\[
\vec T(s,t)_H = \int_{s_2}^{\infty} ds'\,g_2(s,t,s')\,\mathrm{Im}\,\vec T(s',0)_H + \int_{s_2}^{\infty} ds'\,g_3(s,t,s')\,\mathrm{Im}\,\vec T(s',t)_H,
\]
\[
\begin{aligned}
\mathrm{Im}\,T^0(s,t)_R &= \tfrac{3}{2}\,\beta_\rho(t)\,s^{\alpha(t)} + \tfrac{3}{2}\,\beta_\rho(u)\,s^{\alpha(u)},\\
\mathrm{Im}\,T^1(s,t)_R &= \beta_\rho(t)\,s^{\alpha(t)} - \beta_\rho(u)\,s^{\alpha(u)},\\
\mathrm{Im}\,T^2(s,t)_R &= 0,\\
\mathrm{Im}\,T^0(s,t)_P &= \mathrm{Im}\,T^2(s,t)_P = P(s,t) + P(s,u),\\
\mathrm{Im}\,T^1(s,t)_P &= P(s,t) - P(s,u).
\end{aligned}
\]
The result of the numerical evaluation of these integrals with the parameter values specified above is given in Eq. (4.1). We use the difference between the results for $d_0^0(s)_L$ and $d_1^1(s)_L$ obtained with the two phase shift analyses quoted above as a measure for the uncertainties in these quantities. In the case of the $I = 2$ D-wave, we assume that the Martin–Morgan–Shaw formula does describe the behaviour of the imaginary part to within a factor of 2. For the Regge contributions, we use the error estimate $\gamma_\rho(0) = (0.6 \pm 0.1)\,M_\pi^{-1}$ given in Ref. [44]. Finally, the uncertainties attached to the Pomeron term correspond to those in the value $\sigma = 5 \pm 3$ mb, obtained in Section B.4. The result quoted in Eq. (4.2) is obtained by adding the corresponding error bars quadratically and fitting the outcome with a polynomial.

There is a neat and rather thorough check of the above calculation. The driving terms represent the partial wave projections of the background amplitude. Since that amplitude must be crossing symmetric, we may equally well calculate the projections with the formula (A.2) instead of using (A.3); the result should be the same. The modification of the partial wave projection changes the form of the kernels $K_{\ell\ell'}^{II'}(s,s')$, and the contributions from the imaginary parts of the higher partial waves below 2 GeV then change quite substantially. The contributions from the asymptotic region, however, are also modified. In the sum, these changes indeed cancel out, to a remarkable degree of accuracy. This corroborates the claim that our description of the background is approximately crossing symmetric. Evidently, the sum rule (B.6) plays an important role here, as it correlates the magnitude of the asymptotic contributions with those from the low energy region.

Appendix C. Sum rules and asymptotic behaviour

C.1. Sum rules for the P-wave

As discussed in Section 11, the Olsson sum rule ensures the correct asymptotic behaviour of the t-channel $I = 1$ scattering amplitude $T^{(1)}(s,t)$ for $s\to\infty$, $t = 0$. The requirement that this amplitude has the proper high energy behaviour also for $t < 0$ implies a further constraint, which is readily derived from the fixed-t dispersion relation (2.4). It suffices to evaluate the coefficient of the term that grows linearly with $s$ and to subtract the value at $t = 0$. The result involves the following integrals over the imaginary parts of the amplitude ($t \le 0$):
\[
S(t) \equiv \int_{4M_\pi^2}^{\infty} ds\,\frac{2\,\mathrm{Im}\,\bar T^0(s,t) + 3\,\mathrm{Im}\,\bar T^1(s,t) - 5\,\mathrm{Im}\,\bar T^2(s,t)}{12\,s\,(s+t-4M_\pi^2)}
- \int_{4M_\pi^2}^{\infty} ds\,\frac{(s-2M_\pi^2)\,\mathrm{Im}\,T^1(s,0)}{s\,(s-4M_\pi^2)(s-t)(s+t-4M_\pi^2)}. \tag{C.1}
\]
The barred quantities stand for $\mathrm{Im}\,\bar T^I(s,t) = \{\mathrm{Im}\,T^I(s,t) - \mathrm{Im}\,T^I(s,0)\}/t$. The amplitude $T^{(1)}(s,t)$ has the proper asymptotic behaviour only if $S(t)$ vanishes for space-like values of $t$. Since the S-waves drop out, the condition amounts to a family of sum rules that relate integrals over the imaginary part of the P-wave to the higher partial waves. For $t = 0$, for instance, the sum rule may be written in the form
\[
\int_{4M_\pi^2}^{\infty} ds\,\frac{\mathrm{Im}\,t_1^1(s)}{s^2(s-4M_\pi^2)}
= \sum_{\ell=2,4,\ldots}(2\ell+1)\,\ell(\ell+1)\int_{4M_\pi^2}^{\infty} ds\,\frac{2\,\mathrm{Im}\,t_\ell^0(s) - 5\,\mathrm{Im}\,t_\ell^2(s)}{18\,s\,(s-4M_\pi^2)^2}
+ \sum_{\ell=3,5,\ldots}(2\ell+1)\int_{4M_\pi^2}^{\infty} ds\,\frac{\{\ell(\ell+1)\,s - 4\,(s-2M_\pi^2)\}\,\mathrm{Im}\,t_\ell^1(s)}{6\,s^2(s-4M_\pi^2)^2}. \tag{C.2}
\]

The integrals over the individual partial waves converge more rapidly than in the case of the Olsson sum rule, but the factor $\ell(\ell+1)$ gives the higher partial waves more weight; in fact, the contributions from the asymptotic region are even more important here. The sum rule is of the same structure as the one that follows from crossing symmetry, Eq. (B.7), but there are two differences: the integrals converge less rapidly by one power of $s$ and the P-wave does not drop out.

Since the sum rule (C.2) offers a good opportunity to check the representation used for the asymptotic region, we evaluate it explicitly with our input for the imaginary parts. We split the integration into one from threshold to $\sqrt{s_2} = 2$ GeV and one over the asymptotic region, $s > s_2$ (compare Appendix B). Denoting the low energy part of the integral over the P-wave by
\[
S_P = \int_{4M_\pi^2}^{s_2} ds\,\frac{\mathrm{Im}\,t_1^1(s)}{s^2(s-4M_\pi^2)},
\]
we write the sum rule in the form
\[
S_P = S_D + S_F + S_{as}, \tag{C.3}
\]
where $S_D$ and $S_F$ stand for the analogous integrals over the D- and F-waves. While the low energy contributions from the waves with $\ell\ge 4$ are neglected, their high energy tails are accounted for in the term $S_{as}$, which collects all contributions from the region above $s_2$. Form (C.2) of the sum rule is suitable to calculate the contributions from the interval $4M_\pi^2 < s < s_2$. Numerically, we obtain
\[
S_P = 1.93 \pm 0.08, \qquad S_D = 0.55 \pm 0.03, \qquad S_F = 0.13 \pm 0.01,
\]
in units of $\mathrm{GeV}^{-4}$. To evaluate the asymptotic contributions, we instead use the form (C.1): the term $S_{as}$ coincides with the expression $S(0)/48\pi$, except that the integration now only extends over the interval $s_2 < s < \infty$. Inserting the representation specified in Appendix B.4, we find that the bulk stems from the leading Regge trajectory ($1.12 \pm 0.19$). The Pomeron does not contribute to the first integral on the r.h.s. of Eq. (C.1), because that integral is of the same isospin structure as the one occurring in the Olsson sum rule, but it does generate a small negative term via the second integral ($-0.02 \pm 0.01$). The net result for the asymptotic contributions,
\[
S_{as} = 1.10 \pm 0.19,
\]
leads to $S_D + S_F + S_{as} = 1.77 \pm 0.19$. Within the errors, the outcome agrees with the numerical value $S_P = 1.93 \pm 0.08$ obtained for the l.h.s. of the sum rule (C.3). Note that more than half of the r.h.s. stems from the asymptotic region. We conclude that our asymptotic representation is valid within the estimated uncertainties, also for this sum rule, which converges more slowly than the moments considered in Appendix B. Since the Olsson sum rule belongs to the same convergence class as the one above, we feel confident that our error estimates apply also in that case.

C.2. Asymptotic behaviour of the Roy integrals

If the imaginary parts of the partial waves with $\ell > 1$ are discarded, the Roy equations become a closed system for the S- and P-waves. The explicit expressions for the kernels show that the dispersion integrals over the imaginary parts of these waves grow linearly with $s$, like the subtraction polynomials. Except for the contributions from the higher partial waves, the r.h.s. of the Roy equations for the S- and P-waves thus grows in proportion to $s$:
\[
\mathrm{Re}\,t_0^0 \to \frac{A\,s}{12M_\pi^2}, \qquad \mathrm{Re}\,t_1^1 \to \frac{A\,s}{72M_\pi^2}, \qquad \mathrm{Re}\,t_0^2 \to -\frac{A\,s}{24M_\pi^2},
\]
\[
A = 2a_0^0 - 5a_0^2 - \frac{4M_\pi^2}{\pi}\int_{4M_\pi^2}^{\infty} ds\,\frac{2\,\mathrm{Im}\,t_0^0(s) + 27\,\mathrm{Im}\,t_1^1(s) - 5\,\mathrm{Im}\,t_0^2(s)}{s\,(s-4M_\pi^2)}. \tag{C.4}
\]
So, if the coefficient $A$ vanishes, the contribution from the dispersion integrals cancels the one from the subtraction polynomial, simultaneously for all three partial waves [4,82].
In fact, if the imaginary parts of the higher partial waves are dropped and if $A$ is set equal to zero, the Roy equations become identical to those proposed by Chew and Mandelstam [69] (see Ref. [4] for a detailed discussion). The expression for $A$ resembles the Olsson sum rule, where the contributions from the S- and P-wave read
\[
2a_0^0 - 5a_0^2 = \frac{4M_\pi^2}{\pi}\int_{4M_\pi^2}^{\infty} ds\,\frac{2\,\mathrm{Im}\,t_0^0(s) + 9\,\mathrm{Im}\,t_1^1(s) - 5\,\mathrm{Im}\,t_0^2(s)}{s\,(s-4M_\pi^2)} + \cdots.
\]
If only the S-waves are retained, the Olsson sum rule does imply that $A$ vanishes. Evidently, this sum rule is closely related to the observation that the linearly rising contribution from the subtraction terms must cancel the one from the dispersion integrals (Section 10). As is well known, however, the coefficient of the P-wave term in $A$ differs from the one in the Olsson sum rule. The implications of this discrepancy for the Chew–Mandelstam framework are discussed in the references quoted above. The family of sum rules derived in Appendix C.1 points in the same direction: the imaginary part of the P-wave is tied together with those of the higher partial waves, and setting these equal to zero leads to inconsistencies [86].

For the above asymptotic formulae to apply at $E \sim 1$ GeV, two conditions would have to be met: (a) the contributions from the higher partial waves can be ignored at these energies and (b) the integrals over the imaginary parts of the S- and P-waves are dominated by the contributions from low energies. Unfortunately, neither of the two conditions is met. The solutions show a pronounced structure in the region above the matching point; evidently, we are not dealing with the asymptotic behaviour there. The numerical value of $A$ is negative: the integral in Eq. (C.4) over-compensates the term $2a_0^0 - 5a_0^2$. We may lay the blame upon the contributions above the matching point: if the integral were cut off there, $A$ would approximately vanish. The situation is quite different for the Olsson sum rule, which does not rely on low energy approximations but represents the exact version of the condition that must be obeyed by the two subtraction constants for the scattering amplitude to have the proper asymptotic behaviour. In that case, the coefficient of the P-wave is three times smaller, and the region above the matching point plays an essential role in bringing the sum rule into balance. The numerical evaluation in Section 11 shows that even the contributions from the region above 2 GeV are significant. The rapid growth of the driving terms indicates that the higher partial waves become increasingly important as the energy rises. It is clear that the asymptotic behaviour of the partial wave amplitudes cannot be studied on the basis of the S- and P-wave contributions to the r.h.s. of the Roy equations. We conclude that, at the quantitative level, the above simple mechanism cannot explain why, for suitable values of $a_0^0$ and $a_0^2$, our solutions remain within the bounds set by unitarity.
For an analysis of the behaviour above the matching point that neither discards the higher partial waves, nor relies on low energy dominance, we refer to Sections 10 and 11.

Appendix D. Explicit numerical solutions

In this appendix, we make available our explicit numerical solutions of the Roy integral equations. We proceed as follows. For a few tens of pairs $(a_0^0, a_0^2)$ in the universal band (see Fig. 7), we have constructed the three lowest partial waves at $2M_\pi \le \sqrt{s} \le 0.8$ GeV. As we explain in the main text, we parametrize the phase shifts $\delta_\ell^I$ of the solutions in the manner proposed by Schenk [66],
\[
\tan\delta_\ell^I = \sqrt{1-\frac{4M_\pi^2}{s}}\;q^{2\ell}\,\{A_\ell^I + B_\ell^I\,q^2 + C_\ell^I\,q^4 + D_\ell^I\,q^6\}\,\frac{4M_\pi^2 - s_\ell^I}{s - s_\ell^I}. \tag{D.1}
\]
Each solution of the Roy equations corresponds to a specific value of the 3 × 5 coefficients in this expansion,
\[
A_\ell^I = A_\ell^I(a_0^0, a_0^2),\ \ldots,\ s_\ell^I = s_\ell^I(a_0^0, a_0^2).
\]
We approximate these by a polynomial of third degree in the scattering lengths $a_0^0$ and $a_0^2$.

Table 7
Polynomial coefficients for Roy solutions

       A_0^0          B_0^0          C_0^0          D_0^0          s_0^0
z1     0.2250         0.2463         −0.1665×10⁻¹   −0.6403×10⁻³   0.3672×10²
z2     0.2250         0.1985         0.3283×10⁻²    −0.4136×10⁻²   0.1339×10
z3     0.0000         0.1289         0.1142×10⁻¹    −0.3699×10⁻²   0.6504
z4     0.0000         0.1426×10⁻¹    0.1400×10⁻¹    −0.3980×10⁻²   −0.3211×10
z5     0.0000         0.8717×10⁻²    0.1613×10⁻¹    −0.3152×10⁻²   −0.1396×10
z6     0.0000         0.5058×10⁻¹    0.3000×10⁻¹    −0.7354×10⁻²   −0.4114×10
z7     0.0000         −0.4266×10⁻²   −0.4045×10⁻²   −0.1212×10⁻²   −0.3447×10
z8     0.0000         −0.4658×10⁻²   0.2110×10⁻²    −0.4544×10⁻²   −0.8428×10
z9     0.0000         −0.5358×10⁻²   0.1095×10⁻¹    −0.4558×10⁻²   −0.6350×10
z10    0.0000         −0.2555×10⁻²   0.4249×10⁻²    −0.1271×10⁻²   −0.1486×10

       A_1^1          B_1^1          C_1^1          D_1^1          s_1^1
z1     0.3626×10⁻¹    0.1337×10⁻³    −0.6976×10⁻⁴   0.1408×10⁻⁵    0.3074×10²
z2     0.1834×10⁻¹    −0.2336×10⁻²   0.1965×10⁻³    −0.1974×10⁻⁴   −0.2459
z3     0.1081×10⁻¹    −0.8563×10⁻³   0.3268×10⁻⁴    −0.8821×10⁻⁵   −0.1733
z4     −0.3195×10⁻²   0.1678×10⁻³    0.2173×10⁻⁴    −0.6047×10⁻⁶   0.6323×10⁻¹
z5     0.1670×10⁻³    0.4147×10⁻⁴    0.3267×10⁻⁵    −0.1617×10⁻⁵   −0.1090×10⁻²
z6     −0.9543×10⁻³   0.8402×10⁻⁴    0.2059×10⁻⁴    −0.3125×10⁻⁵   0.2724×10⁻¹
z7     0.5049×10⁻³    −0.9308×10⁻⁴   0.1070×10⁻⁴    −0.1257×10⁻⁵   −0.7218×10⁻²
z8     0.4595×10⁻⁴    −0.2755×10⁻³   0.5554×10⁻⁴    −0.4432×10⁻⁵   0.1483×10⁻¹
z9     −0.9000×10⁻⁴   −0.2308×10⁻³   0.5307×10⁻⁴    −0.4415×10⁻⁵   0.1813×10⁻¹
z10    −0.1198×10⁻⁴   −0.6120×10⁻⁴   0.1519×10⁻⁴    −0.1344×10⁻⁵   0.5016×10⁻²

       A_0^2          B_0^2          C_0^2          D_0^2          s_0^2
z1     −0.3706×10⁻¹   −0.8553×10⁻¹   −0.7542×10⁻²   0.1987×10⁻³    −0.1192×10²
z2     0.0000         −0.1236×10⁻¹   0.3466×10⁻¹    −0.2524×10⁻²   −0.4040×10²
z3     −0.3706×10⁻¹   −0.6673×10⁻²   0.2857×10⁻¹    −0.1993×10⁻²   −0.3457×10²
z4     0.0000         0.4901×10⁻²    0.2674×10⁻²    0.1506×10⁻²    −0.9879×10²
z5     0.0000         0.2810×10⁻¹    0.1477×10⁻¹    0.2915×10⁻³    −0.9856×10²
z6     0.0000         0.4010×10⁻¹    0.2458×10⁻¹    0.1325×10⁻²    −0.2072×10³
z7     0.0000         −0.1663×10⁻¹   −0.3030×10⁻¹   0.8759×10⁻³    −0.1589×10³
z8     0.0000         −0.6784×10⁻¹   −0.9512×10⁻¹   0.4713×10⁻²    −0.5259×10³
z9     0.0000         −0.5429×10⁻¹   −0.8744×10⁻¹   0.5313×10⁻²    −0.5366×10³
z10    0.0000         −0.1178×10⁻¹   −0.2535×10⁻¹   0.1730×10⁻²    −0.1723×10³

In terms of the variables
\[
u = \frac{a_0^0}{p_0} - 1, \qquad v = \frac{a_0^2}{p_2} - 1, \qquad p_0 = 0.225, \qquad p_2 = -0.03706,
\]

the numerical representation for the coeNcient B00 , for instance, reads B00 = z1 + z2 u + z3 v + z4 u2 + z5 v2 + z6 uv + z7 u3 + z8 u2 v + z9 uv2 + z10 v3 : The 15 × 10 numbers z1 ; : : : ; z10 for the coeNcients A00 ; B00 ; : : : ; s02 are displayed in Table 7, in units of M2 . The uncertainties to be attached to these numbers are dominated by those in the phases at the matching point. It is not very meaningful to list corresponding error bars for the individual coeNcients, because these are very strongly correlated. The variations in the result for 00 (s) and

274


$\delta_1^1(s)$ that arise from both the uncertainties in $a_0^0$ and $a_0^2$ and those in the experimental input are shown in Figs. 13 and 15. In the lower half of the interval from threshold to 0.8 GeV, the uncertainties are dominated by those in $a_0^0$ and $a_0^2$, while towards the upper end of that interval, the values of the phases at the matching point govern the behaviour. As shown in Ref. [43], chiral symmetry practically eliminates the first source of error, as it very strongly constrains the values of $a_0^0$ and $a_0^2$. The uncertainties that remain once the S-wave scattering lengths are pinned down in this manner are discussed in detail in a forthcoming publication.

Appendix E. Lovelace–Shapiro–Veneziano model

In this appendix, we describe the model used to illustrate the basic properties of the Regge poles occurring in the asymptotic representation of the scattering amplitude [83–85]. In this model, the scattering amplitude is taken to be of the form
$$A(s,t,u)_V = \lambda_1\,\Phi(\alpha_s,\alpha_t) + \lambda_1\,\Phi(\alpha_s,\alpha_u) + \lambda_2\,\Phi(\alpha_t,\alpha_u) ,$$
where $\Phi(\alpha,\beta)$ is closely related to the Beta function,
$$\Phi(\alpha,\beta) = \frac{\Gamma(1-\alpha)\,\Gamma(1-\beta)}{\Gamma(1-\alpha-\beta)} ,$$
and $\alpha_s$ represents a linear Regge trajectory,
$$\alpha_s = \alpha_0 + \alpha_1 s .$$
At fixed t, the function $\Phi(\alpha_s,\alpha_t)$ shows Regge behaviour when s tends to infinity:
$$\Phi(\alpha_s,\alpha_t) \to (-\alpha_s)^{\alpha_t}\,\Gamma(1-\alpha_t) .$$
At the same time, the representation ($1-\alpha_t > 0$)
$$\Phi(\alpha_s,\alpha_t) = (1-\alpha_s-\alpha_t)\,B(1-\alpha_s,1-\alpha_t) = (1-\alpha_s-\alpha_t)\left\{\frac{1}{1-\alpha_s} + \sum_{n=1}^{\infty}\frac{\alpha_t(\alpha_t+1)\cdots(\alpha_t+n-1)}{n!\,(n+1-\alpha_s)}\right\} \qquad \text{(E.1)}$$
shows that the amplitude may be expressed as a sum of narrow resonance contributions, with mass
$$M_n^2 = (\alpha_1)^{-1}(n-\alpha_0), \qquad n = 1, 2, \ldots .$$
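As a rough numerical cross-check of the representation (E.1), the following Python sketch compares the Gamma-function form of the function appearing in the amplitude with its truncated series, and evaluates the pole positions of a linear trajectory. The function names, the test values α = 0.3 and β = −0.5, and the truncation at 20000 terms are illustrative assumptions, not quantities taken from the text.

```python
import math


def phi_closed(alpha: float, beta: float) -> float:
    """Gamma(1 - alpha) Gamma(1 - beta) / Gamma(1 - alpha - beta)."""
    return (math.gamma(1 - alpha) * math.gamma(1 - beta)
            / math.gamma(1 - alpha - beta))


def phi_series(alpha: float, beta: float, n_terms: int = 20000) -> float:
    """Truncated series of Eq. (E.1); it converges for 1 - beta > 0."""
    coeff = 1.0                    # beta (beta+1) ... (beta+n-1) / n!, starting at n = 0
    total = 1.0 / (1.0 - alpha)    # the n = 0 term, 1 / (1 - alpha)
    for n in range(1, n_terms + 1):
        coeff *= (beta + n - 1) / n
        total += coeff / (n + 1 - alpha)
    return (1 - alpha - beta) * total


def pole_mass_squared(n: int, alpha0: float, alpha1: float) -> float:
    """Value of s at which alpha_0 + alpha_1 * s equals the integer n."""
    return (n - alpha0) / alpha1


if __name__ == "__main__":
    a, b = 0.3, -0.5               # illustrative values only
    print(phi_closed(a, b), phi_series(a, b))
```

The series is summed term by term with a running coefficient, so no large factorials are formed. Its terms fall off like a power of n, which is why a fairly long truncation is used.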

The coupling constants $\lambda_1$, $\lambda_2$ may be chosen such that the amplitude does not contain resonances with I = 2. For this condition to be satisfied, the corresponding s-channel isospin component
$$T^2(s,t)_V = 2\lambda_1\,\Phi(\alpha_t,\alpha_u) + (\lambda_1+\lambda_2)\{\Phi(\alpha_s,\alpha_t) + \Phi(\alpha_s,\alpha_u)\}$$
should be free of poles in the physical region of the s-channel. This requires
$$\lambda_2 = -\lambda_1 \equiv \tfrac{1}{2}\lambda ,$$
so that the amplitude takes the form
$$A(s,t,u)_V = -\tfrac{1}{2}\lambda\{\Phi(\alpha_s,\alpha_t) + \Phi(\alpha_s,\alpha_u) - \Phi(\alpha_t,\alpha_u)\} ,$$
$$T^0(s,t)_V = -\tfrac{1}{2}\lambda\{3\Phi(\alpha_s,\alpha_t) + 3\Phi(\alpha_s,\alpha_u) - \Phi(\alpha_t,\alpha_u)\} ,$$
$$T^1(s,t)_V = -\lambda\{\Phi(\alpha_s,\alpha_t) - \Phi(\alpha_s,\alpha_u)\} ,$$
$$T^2(s,t)_V = -\lambda\,\Phi(\alpha_t,\alpha_u) . \qquad \text{(E.2)}$$

In the chiral limit, where the Mandelstam triangle shrinks to the point s = t = u = 0, the amplitude must contain an Adler zero there. Indeed, the factor $1-\alpha_s-\alpha_t$ generates such a zero if $\alpha_0 = \tfrac{1}{2}$. Hence the deviation of $\alpha_0$ from $\tfrac{1}{2}$ must be of order $M_\pi^2$, so that $\alpha_s - \tfrac{1}{2}$ represents a quantity of order $p^2$. At leading order of the low energy expansion, the behaviour of the amplitude therefore represents the first term in the expansion around the point $\alpha_s = \alpha_t = \alpha_u = \tfrac{1}{2}$, which in view of $\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$ yields
$$A(s,t,u)_V = \pi\lambda\,(\alpha_s - \tfrac{1}{2}) + O(p^4) . \qquad \text{(E.3)}$$

This does have the structure of the Weinberg formula, provided the intercept $\alpha_0$ is chosen such that $\alpha_s$ passes through the value $\tfrac{1}{2}$ at $s = M_\pi^2$, i.e. [84]
$$\alpha_0 = \tfrac{1}{2} - \alpha_1 M_\pi^2 .$$
The lowest levels of spin 1, 2, 3, 4 indeed occur on an approximately linear trajectory with this intercept. Fixing the value of the slope with $M_\rho$,
$$\alpha_1 = \tfrac{1}{2}(M_\rho^2 - M_\pi^2)^{-1} ,$$
the model predicts
$$M_{f_2} = 1319\ (1275)\ \text{MeV}, \qquad M_{\rho_3} = 1699\ (1691)\ \text{MeV}, \qquad M_{f_4} = 2008\ (2044)\ \text{MeV} ,$$
where the numbers in brackets are those in the data tables [71].

The representation (E.1) shows that for $s > 4M_\pi^2$, $t < 0$, the imaginary part of $\Phi(\alpha_s,\alpha_t)$ consists of a sequence of δ-functions:
$$\mathrm{Im}\,\Phi(\alpha_s,\alpha_t) = -\pi\sum_{n=1}^{\infty} R_n(\alpha_t)\,\delta(\alpha_s - n) ,$$
$$R_n(\alpha_t) = \frac{\Gamma(\alpha_t+n)}{\Gamma(n)\,\Gamma(\alpha_t)} = \frac{1}{(n-1)!}\,\alpha_t(\alpha_t+1)\cdots(\alpha_t+n-1) .$$
For the imaginary part of the s-channel isospin components, we thus obtain
$$\mathrm{Im}\,T^0(s,t)_V = \frac{3\pi\lambda}{2\alpha_1}\sum_{n=1}^{\infty}\{R_n(\alpha_t) + R_n(\alpha_u)\}\,\delta(s - M_n^2) ,$$
$$\mathrm{Im}\,T^1(s,t)_V = \frac{\pi\lambda}{\alpha_1}\sum_{n=1}^{\infty}\{R_n(\alpha_t) - R_n(\alpha_u)\}\,\delta(s - M_n^2) ,$$
$$\mathrm{Im}\,T^2(s,t)_V = 0 ,$$
with $u = 4M_\pi^2 - t - M_n^2$.



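We may then read off the imaginary parts of the partial wave amplitudes by decomposing the polynomial $R_n(\alpha_t)$ into a Legendre series (see footnote 16):
$$R_n(\alpha_t) = \sum_{\ell=0}^{n}(2\ell+1)\,P_\ell\!\left(1 + \frac{2t}{M_n^2 - 4M_\pi^2}\right) r_{n\ell} ,$$
$$\mathrm{Im}\,t_\ell^0(s)_V = \frac{3\lambda}{64\,\alpha_1}\{1 + (-1)^\ell\}\sum_{n=\ell}^{\infty} r_{n\ell}\,\delta(s - M_n^2) ,$$
$$\mathrm{Im}\,t_\ell^1(s)_V = \frac{\lambda}{32\,\alpha_1}\{1 - (-1)^\ell\}\sum_{n=\ell}^{\infty} r_{n\ell}\,\delta(s - M_n^2) ,$$
$$\mathrm{Im}\,t_\ell^2(s)_V = 0 .$$
On the leading trajectory, the coefficients are
$$r_{nn} = \frac{n}{2^n\,(2n+1)!!}\,\alpha_1^n\,(M_n^2 - 4M_\pi^2)^n .$$
Comparison with the narrow width formula (B.13) shows that the model predicts the width of the various levels as (see footnote 17)
$$\Gamma_{n\ell} = \frac{\lambda\,\omega_I\,r_{n\ell}}{32\pi\,\alpha_1 M_n^2}\,(M_n^2 - 4M_\pi^2)^{1/2} , \qquad \text{(E.4)}$$
where $\omega_I$ depends on the isospin of the particle: $\omega_0 = 3$, $\omega_1 = 2$, $\omega_2 = 0$. In particular, the result for the width of the ρ reads
$$\Gamma_\rho = \frac{\lambda}{96\pi M_\rho^2}\,(M_\rho^2 - 4M_\pi^2)^{3/2} . \qquad \text{(E.5)}$$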
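The formulas above can be evaluated numerically. The sketch below is only an illustration under stated assumptions: the pion and ρ masses (139.57 MeV and 770 MeV) are values chosen here, the coupling is fixed from the experimental ρ width of 151.2 MeV used in this appendix, and Eqs. (E.4) and (E.5) are taken with a factor π in the denominator, which is what makes the resulting value of λ/32π come out near 0.73. The Legendre coefficients are obtained by a simple Simpson-rule projection rather than by any method described in the text, and all function names are hypothetical.

```python
import math

# Assumed inputs in MeV; the pion and rho masses are illustrative choices.
M_PI, M_RHO, GAMMA_RHO = 139.57, 770.0, 151.2

ALPHA1 = 0.5 / (M_RHO**2 - M_PI**2)   # slope fixed with the rho mass
ALPHA0 = 0.5 - ALPHA1 * M_PI**2       # intercept from the Adler condition


def mass_squared(n):
    """M_n^2 = (n - alpha_0) / alpha_1."""
    return (n - ALPHA0) / ALPHA1


def legendre(l, z):
    """Legendre polynomial P_l(z) via the three-term recurrence."""
    p_prev, p = 1.0, z
    if l == 0:
        return p_prev
    for k in range(1, l):
        p_prev, p = p, ((2 * k + 1) * z * p - k * p_prev) / (k + 1)
    return p


def residue(n, alpha):
    """R_n(alpha) = alpha (alpha + 1) ... (alpha + n - 1) / (n - 1)!"""
    prod = 1.0
    for k in range(n):
        prod *= alpha + k
    return prod / math.factorial(n - 1)


def r_coeff(n, l, steps=2000):
    """r_{nl} = (1/2) int_{-1}^{1} dz P_l(z) R_n(alpha_t(z)), Simpson's rule."""
    span = mass_squared(n) - 4 * M_PI**2

    def integrand(z):
        t = 0.5 * (z - 1.0) * span    # from z = 1 + 2t / (M_n^2 - 4 M_pi^2)
        return legendre(l, z) * residue(n, ALPHA0 + ALPHA1 * t)

    h = 2.0 / steps
    total = integrand(-1.0) + integrand(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(-1.0 + i * h)
    return 0.5 * total * h / 3.0


def r_leading(n):
    """Closed form for r_{nn} on the leading trajectory."""
    double_factorial = math.prod(range(1, 2 * n + 2, 2))   # (2n + 1)!!
    span = mass_squared(n) - 4 * M_PI**2
    return n * (ALPHA1 * span) ** n / (2**n * double_factorial)


def width(n, l, lam):
    """Eq. (E.4); omega_I = 3 for even l (I = 0), 2 for odd l (I = 1)."""
    omega = 3.0 if l % 2 == 0 else 2.0
    m2 = mass_squared(n)
    return lam * omega * r_coeff(n, l) * math.sqrt(m2 - 4 * M_PI**2) / (
        32 * math.pi * ALPHA1 * m2)


# Coupling fixed by inverting Eq. (E.5) for the rho width.
LAMBDA = 96 * math.pi * M_RHO**2 * GAMMA_RHO / (M_RHO**2 - 4 * M_PI**2) ** 1.5

if __name__ == "__main__":
    print("lambda / 32 pi =", LAMBDA / (32 * math.pi))
    for n in (2, 3, 4):
        print(n, math.sqrt(mass_squared(n)), width(n, n, LAMBDA))
```

With these inputs the masses for n = 2, 3, 4 come out close to 1319, 1699 and 2008 MeV, and the numerically projected `r_coeff(n, n)` agrees with the closed form `r_leading(n)`. The same `width` function can be applied off the leading trajectory, for example to (n, ℓ) = (1, 0), the scalar daughter of the ρ.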

Fixing the coupling constant with the experimental value $\Gamma_\rho = 151.2$ MeV, we obtain $\lambda/32\pi = 0.728$. Formula (E.4) then predicts
$$\Gamma_{f_2} = 130\ (157)\ \text{MeV}, \qquad \Gamma_{\rho_3} = 51\ (51)\ \text{MeV}, \qquad \Gamma_{f_4} = 46\ (35)\ \text{MeV} ,$$
where the numbers in brackets are again taken from the data tables [71]. This shows that the model does yield a decent picture, not only for the masses but also for the widths of the particles on the leading trajectory.

Footnote 16: In the case of $t_0^0(s)$, the sum over n only starts at n = 1.
Footnote 17: The formula reproduces the numerical results in Table I of Ref. [85], if the parameter values are adapted accordingly ($\alpha_0 = 0.48$, $\alpha_1 = 0.9\ \text{GeV}^{-2}$, $\Gamma_\rho = 112$ MeV, $M_\rho = 764$ MeV).

In addition to the levels on the leading trajectory, the model, however, also contains plenty of daughters, with a rather strong coupling to the ππ-channel. For the states on the first daughter trajectory, for instance, Eq. (E.4) yields $\Gamma_{10} = 783$ MeV, $\Gamma_{21} = 154$ MeV, $\Gamma_{32} = 113$ MeV, $\Gamma_{43} = 42$ MeV, etc. The scalar daughter of the ρ is particularly fat. It is clear that an amplitude that describes all of the levels as narrow resonances fails here. Unitarity implies the bound
$$\int_{4M_\pi^2}^{M^2} ds\ \mathrm{Im}\,t_0^0(s)\,\sqrt{1 - 4M_\pi^2/s}\ \le\ M^2 - 4M_\pi^2 .$$



This condition is violated for M < 1.3 GeV. Also, if the intercept of the leading trajectory is fixed with the Adler condition as above, the scalar daughter of the $f_2$ is a ghost: Formula (E.4) yields a negative decay width [85]. In this respect, the model is deficient: as witnessed by the life of even royal families, the decency of a mother does not ensure that her daughters behave.

The problem also shows up in the S-wave scattering lengths: Chiral symmetry relates the coefficient of the leading term in the low energy expansion (E.3) to the pion decay constant,
$$\pi\lambda\alpha_1 = \frac{1}{F_\pi^2} . \qquad \text{(E.6)}$$
If the coupling constant λ is fixed such that the model yields the proper width for the ρ, the l.h.s. of this relation exceeds the r.h.s. by a factor of 1.7. Accordingly, the prediction of the model for $a_0^0$ exceeds the current algebra result by about this factor. In the vicinity of threshold, the behaviour of the amplitude is determined by the properties of the function $\Phi(\alpha,\beta)$ for $\alpha \simeq \beta \simeq \tfrac{1}{2}$. There, the first term in the series (E.1) accounts for about two thirds of the sum. The spin 1 component of this term is due to ρ-exchange, while the spin 0 part arises from the scalar daughter of the ρ. By construction, the former does have the proper magnitude. The S-wave scattering lengths are too large because the scalar daughter of the ρ is too fat.

As was noted from the start [85], the model is not unique. To arrive at a more realistic model, we could add extra terms that domesticate the daughters and leave the leading trajectory and the position of the Adler zero untouched. Note, however, that the number of states occurring in the Veneziano model corresponds to the degrees of freedom of a string, while the spectrum of bound states in QCD is the one of a local field theory, where the number of independent states grows much less rapidly with the mass. Modifications of the type just mentioned can at best provide a partial cure.
In particular, these do not remove the main deficiency of the model, lack of unitarity.

References

[1] S.M. Roy, Phys. Lett. B 36 (1971) 353.
[2] A. Martin, Nuovo Cimento 42 (1966) 930; A. Martin, Nuovo Cimento 44 (1966) 1219; Scattering Theory: Unitarity, Analyticity and Crossing, Lecture Notes in Physics, Vol. 3, Springer, Berlin, Heidelberg, New York, 1969.
[3] S.M. Roy, Helv. Phys. Acta 63 (1990) 627.
[4] J.L. Basdevant, J.C. Le Guillou, H. Navelet, Nuovo Cimento A 7 (1972) 363.
[5] M.R. Pennington, S.D. Protopopescu, Phys. Rev. D 7 (1973) 1429.
[6] M.R. Pennington, S.D. Protopopescu, Phys. Rev. D 7 (1973) 2591.
[7] J.L. Basdevant, C.D. Froggatt, J.L. Petersen, Phys. Lett. B 41 (1972) 173, 178.
[8] J.L. Basdevant, C.D. Froggatt, J.L. Petersen, Nucl. Phys. B 72 (1974) 413.
[9] J.L. Petersen, Acta Phys. Austriaca 13 (Suppl.) (1974) 291; Yellow Report CERN 77-04, 1977.
[10] C.D. Froggatt, J.L. Petersen, Nucl. Phys. B 91 (1975) 454; C.D. Froggatt, J.L. Petersen, Nucl. Phys. B 104 (1976) 186 (E); C.D. Froggatt, J.L. Petersen, Nucl. Phys. B 129 (1977) 89.
[11] D. Morgan, M.R. Pennington, in: Ref. [12], p. 193.
[12] L. Maiani, G. Pancheri, N. Paver, The Second DAΦNE Physics Handbook, INFN-LNF-Divisione Ricerca, SIS-Ufficio Publicazioni, Frascati, 1995.
[13] G. Mahoux, S.M. Roy, G. Wanders, Nucl. Phys. B 70 (1974) 297.
[14] B. Ananthanarayan, Phys. Rev. D 58 (1998) 036002 [hep-ph/9802338].
[15] G. Auberson, L. Epele, Nuovo Cimento A 25 (1975) 453.

B. Ananthanarayan et al. / Physics Reports 353 (2001) 207–279

[16] C. Pomponiu, G. Wanders, Nucl. Phys. B 103 (1976) 172.
[17] D. Atkinson, R.L. Warnock, Phys. Rev. D 16 (1977) 1948.
[18] D. Atkinson, T.P. Pool, H. Slim, J. Math. Phys. 18 (1977) 2407.
[19] L. Epele, G. Wanders, Phys. Lett. B 72 (1978) 390; Nucl. Phys. B 137 (1978) 521.
[20] T.P. Pool, Nuovo Cimento A 45 (1978) 207.
[21] A.C. Heemskerk, Application of the N/D method to the Roy equations, Ph.D. Thesis, University of Groningen, 1978.
[22] A.C. Heemskerk, T.P. Pool, Nuovo Cimento A 49 (1979) 393.
[23] C. Lovelace, Comm. Math. Phys. 4 (1967) 261.
[24] O. Brander, Comm. Math. Phys. 40 (1975) 97.
[25] P. Büttiker, Comparison of Chiral perturbation theory with a dispersive analysis of ππ scattering, Ph.D. Thesis, Universität Bern, 1996; B. Ananthanarayan, P. Büttiker, Phys. Rev. D 54 (1996) 1125 [hep-ph/9601285]; Phys. Rev. D 54 (1996) 5501 [hep-ph/9604217]; Phys. Lett. B 415 (1997) 402 [hep-ph/9707305] and in Ref. [27], p. 370.
[26] O.O. Patarakin, V.N. Tikhonov, K.N. Mukhin, Nucl. Phys. A 598 (1996) 335; O.O. Patarakin (for the CHAOS collaboration), in Ref. [27], p. 376, and hep-ph/9711361; πN Newsletter 13 (1997) 27; M. Kermani et al. [CHAOS Collaboration], Phys. Rev. C 58 (1998) 3431.
[27] A.M. Bernstein, D. Drechsel, T. Walcher (Eds.), Chiral Dynamics: Theory and Experiment, Workshop held in Mainz, Germany, Lecture Notes in Physics, Vol. 513, Springer, Berlin, 1–5 September 1997.
[28] J. Gasser, A. Rusetsky, J. Schacher, Miniproceedings of the Workshop HadAtom99, held at the University of Bern, October 14–15, 1999 [hep-ph/9911339].
[29] J. Lowe, in Ref. [27], p. 375, and hep-ph/9711361; talk at Workshop on Physics and Detectors for DAΦNE, Frascati, November 16–19, 1999, to appear in the Proceedings [http://wwwsis.lnf.infn.it/talkshow/dafne99.htm]; S. Pislak, in Ref. [28], p. 25.
[30] P. de Simone, in Ref. [28], p. 24.
[31] B. Adeva et al., Proposal to the SPSLC, CERN/SPSLC 95-1, 1995.
[32] J. Gasser, G. Wanders, Eur. Phys. J. C 10 (1999) 159 [hep-ph/9903443].
[33] G. Wanders, Eur. Phys. J. C 17 (2000) 323 [hep-ph/0005042].
[34] S. Weinberg, Phys. Rev. Lett. 17 (1966) 616.
[35] J. Gasser, H. Leutwyler, Phys. Lett. B 125 (1983) 325.
[36] N.H. Fuchs, H. Sazdjian, J. Stern, Phys. Lett. B 269 (1991) 183; J. Stern, H. Sazdjian, N.H. Fuchs, Phys. Rev. D 47 (1993) 3814 [hep-ph/9301244]; M. Knecht, J. Stern, in Ref. [12], p. 169, and references cited therein; J. Stern, in Ref. [27], p. 26, and hep-ph/9712438.
[37] J. Sá Borges, Nucl. Phys. B 51 (1973) 189; J. Sá Borges, Phys. Lett. B 149 (1984) 21; J. Sá Borges, Phys. Lett. B 262 (1991) 320; J. Sá Borges, F.R. Simao, Phys. Rev. D 53 (1996) 4806; J. Sá Borges, J. Soares Barbosa, V. Oguri, Phys. Lett. B 393 (1997) 413; J. Sá Borges, J. Soares Barbosa, M.D. Tonasse, Phys. Rev. D 57 (1998) 4108 [hep-ph/9707394].
[38] M. Knecht et al., Nucl. Phys. B 457 (1995) 513 [hep-ph/9507319]; M. Knecht et al., Nucl. Phys. B 471 (1996) 445 [hep-ph/9512404].
[39] L. Girlanda et al., Phys. Lett. B 409 (1997) 461 [hep-ph/9703448].
[40] M.G. Olsson, Phys. Lett. B 410 (1997) 311 [hep-ph/9703247].
[41] D. Počanić, in Ref. [27], p. 352, and hep-ph/9801366; in: M.A. Ivanov et al. (Eds.), Proceedings of the International Workshop on Hadronic Atoms and Positronium in the Standard Model, Dubna, Russia, 1998, p. 33, and hep-ph/9809455.
[42] J. Bijnens et al., Phys. Lett. B 374 (1996) 210 [hep-ph/9511397]; J. Bijnens et al., Nucl. Phys. B 508 (1997) 263 [hep-ph/9707291]; J. Bijnens et al., Nucl. Phys. B 517 (1998) 639 (E).
[43] G. Colangelo, J. Gasser, H. Leutwyler, Phys. Lett. B 488 (2000) 261 [hep-ph/0007112].
[44] M.R. Pennington, Ann. Phys. 92 (1975) 164.
[45] D. Morgan, G. Shaw, Nucl. Phys. B 10 (1969) 261; Phys. Rev. D 2 (1970) 520.
[46] J.P. Baton et al., Phys. Lett. B 25 (1967) 419; Nucl. Phys. B 3 (1967) 349; Phys. Lett. B 33 (1970) 525, 528.



[47] W. Ochs, Die Bestimmung von ππ-Streuphasen auf der Grundlage einer Amplitudenanalyse der Reaktion π− p → π− π+ n bei 17 GeV/c Primärimpuls, Ph.D. Thesis, Ludwig-Maximilians-Universität, München, 1973.
[48] B. Hyams et al., Nucl. Phys. B 64 (1973) 134.
[49] S.D. Protopopescu et al., Phys. Rev. D 7 (1973) 1279.
[50] G. Grayer et al., Nucl. Phys. B 75 (1974) 189.
[51] P. Estabrooks, A.D. Martin, Nucl. Phys. B 79 (1974) 301.
[52] M.J. Losty et al., Nucl. Phys. B 69 (1974) 185.
[53] B. Hyams et al., Nucl. Phys. B 100 (1975) 205.
[54] W. Hoogland et al., Nucl. Phys. B 126 (1977) 109.
[55] H. Becker et al. [CERN-Cracow-Munich Collaboration], Nucl. Phys. B 150 (1979) 301.
[56] K.L. Au, D. Morgan, M.R. Pennington, Phys. Rev. D 35 (1987) 1633.
[57] B.S. Zou, D.V. Bugg, Phys. Rev. D 48 (1993) 3948; B.S. Zou, D.V. Bugg, Phys. Rev. D 50 (1994) 591; V.V. Anisovich, D.V. Bugg, A.V. Sarantsev, B.S. Zou, Phys. Rev. D 50 (1994) 1972, 4412.
[58] D.V. Bugg, B.S. Zou, A.V. Sarantsev, Nucl. Phys. B 471 (1996) 59.
[59] R. Kaminski, L. Lesniak, K. Rybicki, Z. Phys. C 74 (1997) 79 [hep-ph/9606362]; R. Kaminski, L. Lesniak, B. Loiseau, Eur. Phys. J. C 9 (1999) 141 [hep-ph/9810386]; R. Kaminski, L. Lesniak, K. Rybicki, Acta Phys. Polon. B 31 (2000) 895 [hep-ph/9912354].
[60] J. Gunter et al. [E852 Collaboration], hep-ex/0001038.
[61] W. Ochs, πN Newslett. 3 (1991) 25.
[62] S. Eidelman, F. Jegerlehner, Z. Phys. C 67 (1995) 585 [hep-ph/9502298].
[63] R. Barate et al. [ALEPH Collaboration], Z. Phys. C 76 (1997) 15.
[64] L. Lukaszuk, Phys. Lett. B 47 (1973) 51.
[65] S. Anderson et al. [CLEO Collaboration], hep-ex/9910046.
[66] A. Schenk, Nucl. Phys. B 363 (1991) 97.
[67] The NAG library, The Numerical Algorithms Group Ltd, Oxford UK.
[68] M.G. Olsson, Phys. Rev. 162 (1967) 1338.
[69] G.F. Chew, S. Mandelstam, Phys. Rev. 119 (1960) 467.
[70] L. Rosselet et al., Phys. Rev. D 15 (1977) 574.
[71] C. Caso et al. [Particle Data Group], Eur. Phys. J. C 3 (1998) 1.
[72] J. Pišút, M. Roos, Nucl. Phys. B 6 (1968) 325.
[73] C.B. Lang, A. Mas-Parareda, Phys. Rev. D 19 (1979) 956.
[74] J. Bohacik, H. Kuhnelt, Phys. Rev. D 21 (1980) 1342.
[75] G. Wanders, Helv. Phys. Acta 39 (1966) 228.
[76] M.M. Nagels et al., Nucl. Phys. B 147 (1979) 189.
[77] H. Leutwyler, Nucl. Phys. A 623 (1997) 169C [hep-ph/9709406].
[78] D. Atkinson, T.P. Pool, Nucl. Phys. B 81 (1974) 502.
[79] J.F. Donoghue, C. Ramirez, G. Valencia, Phys. Rev. D 38 (1988) 2195.
[80] J. Gasser, U.G. Meissner, Phys. Lett. B 258 (1991) 219.
[81] G. Wanders, Springer Tracts Mod. Phys. 57 (1971) 22.
[82] B.R. Martin, D. Morgan, G. Shaw, Pion-Pion Interactions in Particle Physics, Academic Press, London, 1976.
[83] G. Veneziano, Nuovo Cimento 57 (1968) 190.
[84] C. Lovelace, Phys. Lett. B 28 (1968) 264.
[85] J.A. Shapiro, Phys. Rev. 179 (1969) 1345.
[86] C. Lovelace, Nuovo Cimento 21 (1961) 305.

THE INFRARED BEHAVIOUR OF QCD GREEN'S FUNCTIONS
Confinement, dynamical symmetry breaking, and hadrons as relativistic bound states

Reinhard ALKOFER, Lorenz VON SMEKAL
Institut für Theoretische Physik, Universität Tübingen, Auf der Morgenstelle 14, 72076 Tübingen, Germany
Institut für Theoretische Physik III, Universität Erlangen-Nürnberg, Staudtstr. 7, 91058 Erlangen, Germany



Received December 2000; editor: W. Weise

Contents

1. Introduction
2. Basic concepts in quantum field theory
2.1. Generating functional of QED and QCD
2.2. Gribov copies, monopoles and gauge fixing
2.3. Positivity versus colour antiscreening
2.4. Description of confinement in the linear covariant gauge
2.5. Alternative singularity structures of Green's functions
3. The Dyson–Schwinger formalism
3.1. Dyson–Schwinger equations for QED and QCD propagators
3.2. BRS invariance of Green's functions
3.3. The structure of vertex functions
4. Green's functions of QED
4.1. (1 + 1)-dimensional QED (Schwinger model): Dyson–Schwinger equations in a model with instantons
4.2. Confinement in (2 + 1)-dimensional QED
4.3. Dynamical chiral symmetry breaking in (3 + 1)-dimensional QED
5. The infrared behaviour of QCD Green's functions
5.1. "Confined" or "confining" gluons?
5.2. The gluon propagator in axial gauge
5.3. Truncation schemes for propagators in Landau gauge
5.4. A non-perturbative expansion scheme in Landau gauge QCD
5.5. Lattice results
6. Mesons as quark–antiquark bound states
6.1. Bethe-Salpeter equation for mesons
6.2. Ground state mesons
7. Baryons as diquark–quark bound states
7.1. Goldstone theorem and diquark confinement
7.2. Modelling of diquarks

Supported in part by DFG (Al 279/3-3) and COSY (contract no. 41376610). Corresponding author: R. Alkofer.



R. Alkofer, L. von Smekal / Physics Reports 353 (2001) 281–465

7.3. Quark–diquark Bethe-Salpeter equations
7.4. Electromagnetic form factors
7.5. Strong and weak form factors
7.6. Hadronic reactions
8. Concluding remarks
Acknowledgements
Appendix A. Real versus complex ghost fields: ghost–antighost symmetry and SL(2, R)
Appendix B. Conventions for Fourier transformations
Appendix C. Dyson–Schwinger equation for the gluon propagator
Appendix D. 3-gluon vertex in axial gauge
References

Abstract

Recent studies of QCD Green's functions and their applications in hadronic physics are reviewed. We discuss the definition of the generating functional in gauge theories, in particular, the rôle of redundant degrees of freedom, possibilities of a complete gauge fixing versus gauge fixing in presence of Gribov copies, BRS invariance and positivity. The apparent contradiction between positivity and colour antiscreening in combination with BRS invariance in QCD is considered. Evidence for the violation of positivity by quarks and transverse gluons in the covariant gauge is collected, and it is argued that this is one manifestation of confinement. We summarise the derivation of the Dyson–Schwinger equations (DSEs) of QED and QCD. For the latter, the implications of BRS invariance on the Green's functions are explored. The possible influence of instantons on DSEs is discussed in a two-dimensional model. In QED in (2 + 1) and (3 + 1) dimensions, the solutions for Green's functions provide tests of truncation schemes which can under certain circumstances be extended to the DSEs of QCD. We discuss some limitations of such extensions and assess the validity of assumptions for QCD as motivated from studies in QED. Truncation schemes for DSEs are discussed in axial and related gauges, as well as in the Landau gauge. Furthermore, we review the available results from a systematic non-perturbative expansion scheme established for Landau gauge QCD. Comparisons to related lattice results, where available, are presented. The applications of QCD Green's functions to hadron physics are summarised. Properties of ground state mesons are discussed on the basis of the ladder Bethe-Salpeter equation for quarks and antiquarks. The Goldstone nature of pseudoscalar mesons and a mechanism for diquark confinement beyond the ladder approximation are reviewed. We discuss some properties of ground state baryons based on their description as Bethe-Salpeter/Faddeev bound states of quark–diquark correlations in the quantum field theory of confined quarks and gluons. © 2001 Elsevier Science B.V. All rights reserved.

PACS: 02.30.Rz; 11.10.−z; 11.15.Tk; 12.20.Ds; 12.38.−t; 14.20.−c; 14.40.Aq; 14.65.Bt; 14.70.Dj

Keywords: Strong QCD; Green's functions; Confinement; Chiral symmetry breaking; Dyson–Schwinger equations; Bethe-Salpeter equation



1. Introduction

Looking at the plethora of different hadrons it is evident that baryons and mesons are not elementary particles in the naive sense of the word "elementary". Experimental hadron physics has determined the partonic substructure of the nucleon to an enormous precision, leaving no doubt that the parton picture emerges from quarks and gluons, the elementary fields of quantum chromodynamics (QCD). On the other hand, it is a well-known fact that these quarks and gluons have not been detected outside hadrons. This puzzle was given a name: confinement. Despite the fact that the confinement hypothesis was formulated several decades ago, our understanding of the confinement mechanism(s) is still not satisfactory. And, in contrast to other non-perturbative phenomena of interest in QCD (e.g., dynamical breaking of chiral symmetry, the $U_A(1)$ anomaly, and formation of relativistic bound states), the phenomenon of confinement might to some extent be in conflict with the nowadays widely accepted foundations of quantum field theory.

Quantum field theory provides the basis for our current understanding of particle physics. The quantum field theoretical description of elementary particles has been impressively successful since it was first developed in the quantisation of electrodynamics in the late 1920s. After its first applications to elementary processes like the spontaneous decays of excited atoms, the photo and the Compton effect, electron–electron scattering, pair creation and Bremsstrahlung, the next major step was accomplished in the late 1940s. The anomalous magnetic moment of the free electron from the Dirac theory of relativistic quantum mechanics was observed experimentally. The development of the covariant perturbation expansion by Tomonaga, Schwinger and Feynman together with the concept of renormalisation made it possible to calculate higher order corrections to the elementary processes of electrons and photons. Its application to the Lamb shift explained the experimental observations, and higher order corrections subsequently agreed with the results of refined experiments.

Since these developments of quantum electrodynamics (QED), local quantum field theory has further been developed and applied to the descriptions of elementary particles. Their processes are accounted for by the collision theory developed by Lehmann, Symanzik and Zimmermann, the so-called LSZ formalism [1] (for a description of its role in modern quantum field theory see, e.g., Refs. [2] or [3]). Together with perturbation theory it describes the processes of elementary particles at high energies based on asymptotically free gauge theories. In particular, in the weak coupling regime of QCD, i.e., at high energies, the agreement of perturbative calculations with the huge number of measurements available is impressive (see, e.g., Fig. 14 in Section 5.3).

The perturbative description of elementary particles is essentially based on the field-particle duality, which means that each field in a quantum field theory is associated with a physical particle. A simple example of this is the Fermi theory of β-decay, where a field is associated with each of the particles involved, the proton, the neutron, as well as the electron and the neutrino. This is, of course, not what one has in mind in describing hadrons as composite states with quark and gluonic substructure in hadronic processes. On the one hand, scattering theory can be extended to include processes of composite particles described by "almost local" fields, leading to a generalised LSZ formalism for bound states [4–6] (described also in the book of Ref. [3]). On the other hand, the situation in QCD is more complicated. Not only does the asymptotic state space contain composite states, but the physical Hilbert space of the asymptotic hadron states does not contain any states corresponding to particles associated with

284


the elementary fields in QCD, the quarks and gluons. For a description of confinement of quarks and gluons within the framework of local quantum field theory, the elementary fields have to be divorced completely from a particle interpretation. A quote from Haag's book [3] expresses this in a clear way as follows: "The rôle of fields is to implement the principle of locality. The number and the nature of different basic fields needed in the theory is related to the charge structure, not to the empirical spectrum of particles."

The description of hadronic states and processes based on the dynamics of the confined correlations of quark and glue is the outstanding challenge in the formulation of QCD as a local quantum field theory. In particular, assuming that only hadrons are produced from processes involving hadronic initial states, one has to explain that the only thresholds in hadronic amplitudes are due to the productions of other hadronic states, and that possible structure singularities occur in composite states which are due to their hadronic substructure only.

Some theoretical insight into the mechanism(s) for confinement into colourless hadrons could be obtained from disproving the cluster decomposition property for colour-non-singlet gauge-covariant operators. One idea in this direction is based on the possible existence of severe infrared divergences, i.e., divergences which cannot be removed from physical cross sections by a suitable summation over degenerate states by virtue of the Kinoshita–Lee–Nauenberg theorem [7] (see footnote 1). Such severe infrared divergences could provide damping factors for the emission of coloured states from colour-singlet states (see [9]). However, the Kinoshita–Lee–Nauenberg theorem applies to non-Abelian gauge theories in four dimensions order by order in perturbation theory [10,11]. Therefore, such a description of confinement in terms of perturbation theory is impossible. In fact, extended to Green's functions, the absence of unphysical infrared divergences implies that the spectrum of QCD necessarily includes coloured quark and gluon states to every order in perturbation theory [12].

An alternative way to understand the insufficiency of perturbation theory to account for confinement in four-dimensional field theories is that confinement requires the dynamical generation of a physical mass scale. In presence of such a mass scale, however, the renormalisation group (RG) equations imply the existence of essential singularities in physical quantities, such as the S-matrix, as functions of the coupling at g = 0. This is because the dependence of the RG invariant confinement scale $\Lambda$ on the coupling and the renormalisation scale $\mu$ near the ultraviolet fixed point is determined by [13]
$$\Lambda = \mu\,\exp\left(-\int^{g}\frac{dg'}{\beta(g')}\right)\ \xrightarrow{\,g\to 0\,}\ \mu\,\exp\left(-\frac{1}{2\beta_0 g^2}\right), \qquad \beta_0 > 0 . \qquad (1)$$
Since all RG invariant masses in massless QCD will exhibit behaviour (1) up to a multiplicative constant, the ratios of all bound state masses are, at least in the chiral limit, determined independent of all parameters.

Footnote 1: Also referred to as the non-Abelian Bloch–Nordsieck prescription, cf. Ref. [8].

Therefore, to study the infrared behaviour of QCD amplitudes non-perturbative methods are required. In addition, as singularities are anticipated, a formulation in the continuum is desirable. One promising approach to non-perturbative phenomena in QCD is provided by studies of



truncated systems of its Dyson–Schwinger equations (DSEs) [14,15], the equations of motion for QCD Green's functions. Typical truncation schemes resort to additional sources of information like the Slavnov–Taylor identities [16,17], as entailed by gauge invariance, to express vertex functions and higher n-point functions in terms of the elementary two-point functions, i.e., the quark, ghost and gluon propagators. In principle, these propagators can then be obtained as self-consistent solutions to the non-linear integral equations representing the closed set of truncated DSEs. The underlying conjecture to justify such a truncation of the originally infinite set of DSEs is that a successive inclusion of higher n-point functions in self-consistent calculations will not result in dramatic changes to previously obtained lower n-point functions. To achieve this it is important to incorporate as much independent information as possible in constructing those n-point functions which close the system. Such information, e.g., from implications of gauge invariance or symmetry properties, is sufficiently reliable so that the related properties are expected to be reproduced by the solutions to subsequent truncation schemes.

Until recently, available solutions to truncated DSEs of QCD did not even fully include all contributions of the propagators themselves. In particular, even in absence of quarks, solutions for the gluon propagator in Landau gauge used to rely on neglecting ghost contributions [18–21] which, though numerically small in perturbation theory, are unavoidable in this gauge. While this particular problem can be avoided by ghost free gauges such as the axial gauge, in studies of the gluon DSE in the axial gauge [22–26], the possible occurrence of an independent second term in the tensor structure of the gluon propagator has been disregarded [27]. In fact, if the complete tensor structure of the gluon propagator in axial gauge is taken into account properly, one arrives at a coupled system of equations which is of similar complexity as the ghost–gluon system in the Landau gauge and which is yet to be solved.

In addition to providing a better understanding of confinement based on studies of the behaviour of QCD Green's functions in the infrared, DSEs have proven successful in developing a hadron phenomenology which interpolates smoothly between the infrared (non-perturbative) and the ultraviolet (perturbative) regime; for recent reviews see, e.g., [28,29]. In particular, a dynamical description of the spontaneous breaking of chiral symmetry from studies of the DSE for the quark propagator is well established in a variety of models for the gluonic interactions of quarks [30]. For a sufficiently large low-energy quark–quark interaction, quark masses are generated dynamically in the quark DSE in some analogy to the gap equation in superconductivity. This in turn leads naturally to the Goldstone nature of the pion and explains the smallness of its mass as compared to all other hadrons. In this framework a description of the different types of mesons is obtained from Bethe-Salpeter equations (BSEs) for quark–antiquark bound states. Recent progress towards a solution of a fully relativistic three-body equation extends this consistent framework to baryonic bound states.

Investigations of QCD Green's functions have been extended successfully to finite temperatures and densities during the last few years. As this is a subject of its own, and with regard to the length of the present review, we have refrained from reviewing this topic. Instead we refer the interested reader to the recent review provided by Ref. [31].

This review is organised as follows: Chapter 2 reviews some basic concepts of quantum field theory and especially QCD, some derivations and the notations to provide the necessary background for the later sections.
In Chapter 3 the Dyson–Schwinger formalism is presented.



Chapter 4 provides a short summary of QED Green’s functions in (1 + 1); (2 + 1) and (3 + 1) dimensions. These results are helpful to put the corresponding results for QCD into perspective. Chapter 5 is the central part of this review: The infrared behaviour of QCD Green’s function and its implications for con%nement, for dynamical breaking of chiral symmetry and the structure of hadrons in general are discussed. Phenomenological studies of mesons which are based on the results for the propagators are summarised in Chapter 6. On the way towards a description of baryons as bound states in colour singlet 3-quark channels a detailed understanding of diquark correlations is necessary. In Chapter 7 the corresponding framework is provided, and baryonic bound states of quarks and diquarks are described in reduced Bethe-Salpeter=Faddeev equations obtained for separable diquark correlations. A few concluding remarks are given in Chapter 8. Some more technical issues are provided in several appendices. We wish to emphasise that this review is a status report on an on-going eLort. The long way from the dynamics of quark and glue to hadrons in one coherent description is far from paved. Considerable segments are, however, increasingly well understood, some on a fairly fundamental level, others from temporarily used model assumptions. Connections between the various pieces are made in form of justi%cations and improvements of the respective model assumptions. The present review describes some of these segments along the way. 2. Basic concepts in quantum eld theory In this chapter, some underlying concepts of the subsequent chapters are brieJy reviewed, mainly to introduce de%nitions and conventions for later use. In addition, and maybe more importantly, we discuss some of the fundamental issues in this chapter which might in future lead to advances in the understanding of hadronic physics based on the dynamics of quark and glue. 
Since a comprehensive and final quantum field theoretic description of a confining theory is not established yet, it is necessary to cover various descriptions in this review. However, even the most widely adopted ones possess some quite complementary aspects. A rough classification may be possible in realising that the least modifications, necessary to accommodate confinement in quantum field theory, seem to be given by the choice of either relaxing the principle of locality or abandoning the positivity of the representation space.² We will discuss some of the implications of both these possibilities in this chapter. Since abandoning locality has much further reaching consequences, the latter choice, the description of QCD based on local quantum field theory with indefinite metric spaces, might be the more viable possibility of the two, if the cluster decomposition property of local fields can be circumvented. As was originally suggested for QED by Gupta [32] and Bleuler [33], the starting point for a covariant description of gauge theories is an indefinite metric space. In particular, this implies that, apart from positivity, most other properties of local quantum field theory, and hereby most importantly the analyticity properties of Green's functions and amplitudes, remain valid in such a formulation. In QCD, coloured states are supposed to exist in the indefinite metric space of asymptotic states. A semidefinite subspace is obtained as the kernel of an operator. Just as in QED, where the Gupta–Bleuler condition is to enforce the Lorentz condition on physical

² As a third alternative both might eventually turn out to be necessary, of course.



states, this subspace, called the physical subspace, has a partition in equivalence classes of states which differ only by their zero norm components. A first impression of such a description of confinement can be obtained from the analogy with QED. Quantising the electromagnetic field in a linear covariant gauge, besides transverse photons one also obtains longitudinal and scalar (time-like) photons. The latter two are unobservable because one eliminates indefinite metric states by requiring the Lorentz condition on all physical states. The S-matrix of QED scatters physical states into physical ones only, because it commutes with the Lorentz condition. The scalar photons are "eaten up" by the longitudinal ones with which they form metric partners. Colour confinement in QCD can be described by an analogous mechanism: no coloured states should be present in the positive definite space of physical states defined by some suitable condition which has to commute with the S-matrix of QCD to ensure scattering of physical states into physical ones. The dynamical aspect of such a formulation resides in the cluster decomposition property of local field theory, the proof of which, absolutely general otherwise [34], does not include the indefinite metric spaces of covariant gauge theories. In fact, there is quite convincing evidence for the contrary, namely that the cluster decomposition property does not hold for coloured correlations of QCD in such a description [35]. This would thus eliminate the possibility of scattering a physical state into colour singlet states consisting of widely separated coloured clusters (the "behind-the-moon" problem, see also Ref. [36] and references therein). We will return in some more detail in Section 2.4 to the foundations of this description which is based on the representations of a particular symmetry of covariant gauge theories found by Becchi, Rouet and Stora, the BRS symmetry [37].
The dynamics of the elementary degrees of freedom of QCD is encoded in its n-point correlation functions, i.e., the hierarchy of the (time-ordered) Green's functions $G^{(n)}(x_1, \dots, x_n)$. Owing to the axioms of local quantum field theory, in Minkowski space–time these are defined to be boundary values (by $\eta_k \to 0$, see below) of analytic functions of $n-1$ complex 4-vectors $z_k = \xi_k + i\eta_k$ which are complex extensions of the relative coordinates $\xi_k \equiv x_k - x_{k+1}$ (with $k = 1, \dots, n-1$). The complicated domain of holomorphy of the correlation functions is then established in several steps, see Refs. [38,3] for more details. First, one observes that it contains the primitive domain which is defined by the requirement that the negative imaginary parts of all $z_k$ lie in the forward cone, $-\eta_k \in V_+$. Then, all "points" are included which can be reached from the primitive domain by complex Lorentz transformations, i.e. by extending $SL(2,\mathbb{C})$, the double cover of the proper orthochronous Lorentz group, to $SL(2,\mathbb{C}) \times SL(2,\mathbb{C})$. Permutations of the $n-1$ variables $z_k$ and the theory of functions of several complex variables then lead to the so-called envelope of holomorphy of permuted extended tubes. This connects the primitive domain with the non-coincident Euclidean region, $\{(x_1, \dots, x_n) \in \mathbb{R}^{4n} : x_k \neq x_l \ \forall\, k \neq l \in \{1, \dots, n\}\}$. Or vice versa, with the Euclidean $SU(2) \times SU(2)$ symmetry as subgroup of $SL(2,\mathbb{C}) \times SL(2,\mathbb{C})$, the domain of holomorphy allows a complex extension of the Euclidean space. This apparently technical issue is quite important to realise, however, since it justifies the incorporation of time-like vectors (e.g., the total momenta of bound states) as complex 4-vectors in an analytically continued Euclidean formulation. We will adopt such a Euclidean formulation throughout the following chapters with few exceptions which will be mentioned explicitly where they occur.
The first of the alternatives mentioned in the beginning of this section is based on describing confinement by an absence of coloured states from the asymptotic state space altogether. In particular, for a formulation in terms of some elementary quark and gluon fields this requires a relaxation of the principle of locality in order to admit singularity structures of their Green's functions that cannot occur in a local quantum field theory. We will discuss some consequences of this in Section 2.5. One way to implement confinement in such a description might be provided by assuming that the elementary correlations are given by entire functions in momentum space, e.g., that no singularities are present at all in any finite region of the complex $p^2$-plane of the 2-point correlations reflecting their confined character. While (finite) time-like momenta are readily incorporated in such a description, singularities for $p^2 \to \infty$ are indispensable for non-trivial entire functions. Asymptotic freedom, however, entails that analytic 2-point functions need to vanish in this limit for all directions of the complex $p^2$-plane [39,40], see also Section 2.3 below. Perturbation theory, yielding the perturbative logarithms for large $p^2$, thus seems hard to reconcile with the idea of entire 2-point functions. A singularity structure which generates the perturbative logarithms and complies at the same time with the analyticity considerations above entails that singularities occur on the time-like real $p^2$-axis only. The 2-point correlations of quarks and transverse gluons are then analytic functions in the cut complex $p^2$-plane. While the phenomenologically appealing models employing entire 2-point functions can therefore be motivated only as approximations for not too large $|p^2|$, the required analyticity structure is evident in the local description of covariant gauge theories based on indefinite metric spaces. Confinement of quarks and transverse gluons is hereby attributed to violations of positivity which should result in indefinite spectral densities for their respective correlations, see, e.g., Ref. [40].
It has in fact been argued that such a violation of positivity can already be inferred from asymptotic freedom in combination with the unbroken BRS invariance in QCD [40,41]. The subtleties in this argument which might be regarded as not absolutely conclusive are discussed further in Section 2.3. Independent of this perturbative argument, however, such violations of positivity are observed in the presently available solutions to Dyson–Schwinger equations as well as in lattice results for the transverse gluon propagator, see Chapter 5. Since these results together seem to provide quite convincing evidence for such violations of positivity of the elementary correlations of QCD in the covariant formulation, we will return to this issue repeatedly in the following chapters.

2.1. Generating functional of QED and QCD

The Feynman–Schwinger functional integral representation of the generating functional for a gauge theory coupled to fermions is in the Euclidean domain formally given by

$$Z[j, \bar\eta, \eta] = \int \mathcal{D}[A, q, \bar q]\, \Delta[A]\, \delta(f^a(A)) \exp\left\{ -\int d^4x \left( \frac{1}{4} F^a_{\mu\nu} F^a_{\mu\nu} + \bar q(-D\!\!\!\!/ + m) q \right) + \int d^4x \left( A^a_\mu j^a_\mu + \bar\eta q + \bar q \eta \right) \right\}. \tag{2}$$

Hereby sources $j^a_\mu$ for the gauge fields $A^a_\mu$, and Grassmannian sources $\bar\eta$ and $\eta$ for the fermion fields $q$ and $\bar q$, have been introduced. We furthermore employ a positive definite Euclidean metric $g_{\mu\nu} = \delta_{\mu\nu}$ with hermitian $\gamma$-matrices, $\{\gamma_\mu, \gamma_\nu\} = 2\delta_{\mu\nu}$. In the case of QED, an Abelian gauge theory with gauge coupling $e$, the field strengths are given by

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu, \quad\text{and}\quad D_\mu = \partial_\mu + ieA_\mu \tag{3}$$

is the covariant derivative. The other case of interest here is QCD, i.e., the gauge group $SU(3)$. It is often convenient, however, to consider a variable number of colours $N_c$, and we present the following discussions mostly in a way applicable to general $SU(N_c)$ gauge groups. The field strengths are in either case given by

$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu - g f^{abc} A^b_\mu A^c_\nu, \quad\text{and}\quad D^{ab}_\mu = \delta^{ab}\partial_\mu + g f^{abc} A^c_\mu \tag{4}$$

is the covariant derivative in the adjoint representation of $SU(N_c)$ with $f^{abc}$ being the corresponding structure constants, and $g$ is the coupling constant. Denoting the generators of $SU(N_c)$ in the fundamental representation as $t^a$ we can rewrite the covariant derivative:

$$A_\mu = t^a A^a_\mu, \quad\text{and}\quad D_\mu = \partial_\mu + igA_\mu \quad\text{with}\quad [t^a, t^b] = i f^{abc} t^c. \tag{5}$$

The functional integration of the gauge fields over the hypersurface $f^a(A) = 0$ involves the measure $\Delta[A]$, called the Faddeev–Popov determinant [42]. In linear covariant gauges one uses $f^a(A) = \partial_\mu A^a_\mu$. In the case of QED the corresponding condition $f(A) = \partial_\mu A_\mu$ leads to a field-independent Faddeev–Popov determinant, i.e., a pure number, and Faddeev–Popov ghosts do not couple to physical fields. The situation is different in non-Abelian gauge theories despite the fact that the underlying idea is quite similar. To obtain the physical configuration space it is necessary to divide the set of all gauge potentials by the set of all gauge transformations including the homotopically non-trivial ones [43,44]. A local gauge fixing condition is introduced to select a particular gauge field configuration $A^{U_0}$ by $f^a(A^{U_0}) = 0$ from the equivalence class of gauge fields belonging to the same orbit,

$$[A^U] := \{ A^U = U A U^\dagger + U\, dU^\dagger : U(x) \in SU(N_c) \}. \tag{6}$$

This procedure is locally unique if for infinitesimally neighbouring configurations along the orbit, $A^U = A^{U_0} + \delta A_\theta$ with $\delta A^a_{\theta\,\mu} = -D^{ab}_\mu \theta^b$, one has

$$\Delta[A] = \mathrm{Det}\left( \frac{\delta f^a(A^U(x))}{\delta\theta^b(y)} \right)\bigg|_{\theta=0} \neq 0. \tag{7}$$

In linear covariant gauges the Faddeev–Popov determinant reads explicitly:

$$\Delta[A] = \mathrm{Det}(-\partial_\mu D^{ab}_\mu). \tag{8}$$
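As a quick consistency check of the conventions in Eq. (5), the commutation relations can be verified numerically for the simplest case $N_c = 2$. The following is only a minimal sketch: it assumes the standard choice $t^a = \sigma^a/2$ with Pauli matrices $\sigma^a$ and $f^{abc} = \epsilon^{abc}$, which is not spelled out in the text, and all helper names are purely illustrative.

```python
# Pauli matrices as 2x2 nested tuples of complex numbers.
PAULI = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)

# Assumed SU(2) generators in the fundamental representation: t^a = sigma^a / 2.
T = tuple(tuple(tuple(0.5 * x for x in row) for row in s) for s in PAULI)


def matmul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices taken from {0, 1, 2}."""
    return (a - b) * (b - c) * (c - a) / 2


def max_commutator_violation():
    """Largest entry of [t^a, t^b] - i eps^{abc} t^c over all a, b."""
    worst = 0.0
    for a in range(3):
        for b in range(3):
            ab = matmul(T[a], T[b])
            ba = matmul(T[b], T[a])
            for i in range(2):
                for j in range(2):
                    lhs = ab[i][j] - ba[i][j]
                    rhs = sum(
                        1j * levi_civita(a, b, c) * T[c][i][j] for c in range(3)
                    )
                    worst = max(worst, abs(lhs - rhs))
    return worst


print(max_commutator_violation())
```

A vanishing result (up to rounding) confirms that the sign and the factor $i$ in $[t^a, t^b] = i f^{abc} t^c$ are compatible with hermitian generators, as required for $D_\mu = \partial_\mu + igA_\mu$.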

Perturbatively this Jacobian factor is taken care of by introducing ghost fields, i.e., scalar Grassmann fields $\bar c^a$ and $c^a$ in the adjoint representation, such that the Faddeev–Popov determinant is written as a Gaussian integral of these ghost fields. The scalar ghost fields belong to the trivial representation of $SL(2,\mathbb{C})$, the cover of the connected part of the Lorentz group. As local fields with space-like anticommutativity, they violate the spin-statistics theorem and are thus necessarily unphysical. The domain of holomorphy of the vacuum expectation values of any product of local fields, and the positive definiteness of the scalar product between any two states generated from the vacuum by the polynomial algebra of those fields, together entail that anticommutativity is normally tied to fields belonging to half-odd integer spin representations of $SL(2,\mathbb{C})$, see, e.g., [3]. This is not necessarily so in the indefinite-metric spaces of covariant gauge theories, however, in which the scalar product is replaced by an indefinite sesquilinear form. It implies of course that Faddeev–Popov ghosts are unobservable, see also [45]. As we shall describe in a little more detail in the context of BRS invariance in Section 2.4, in the covariant operator formulation of gauge theories ghosts and antighosts together with longitudinal and time-like gluons form quartets of metric partners, see, e.g., [36]. With the exception that ghosts decouple in QED this is no different from the case of longitudinal and time-like photons [32,33]. In contrast to QED, however, in QCD positivity is violated for transverse gluon states too. This can be inferred already in perturbation theory from asymptotic freedom (and unbroken global gauge invariance) for less than 10 quark flavours [46,47], see the discussion at the end of Section 2.3. It implies that the massless transverse asymptotic gluon states of perturbation theory belong to unphysical quartets also.³ This alone is not sufficient for a realisation of confinement, however, for which it is absolutely crucial that non-perturbatively no such massless transverse gluon states exist, regardless of the fact that possible asymptotic single particle states in transverse gluon correlations generally form quartets. We will come back to this point in Section 2.4. The results presented in Section 5 for the Landau gauge gluon propagator from both solutions to Dyson–Schwinger equations [48,49] and lattice simulations [50,51] demonstrate the violation of positivity for transverse gluons non-perturbatively and,⁴ in addition, agree in confirming the absence of massless asymptotic transverse single gluon states. The gauge condition $f^a(A) = 0$, formally represented by a delta functional, is usually relaxed into $f^a(A) = iB^a$ with a Gaussian distribution of width $\xi$.
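In the linear covariant gauges this amounts to the replacement,

$$\delta(f^a(A)) \to \exp\left\{ -\frac{1}{2\xi} \int d^4x\, (\partial_\mu A^a_\mu)^2 \right\} = \int \mathcal{D}B\, \exp\left\{ -\int d^4x \left( iB^a \partial_\mu A^a_\mu + \frac{\xi}{2} B^a B^a \right) \right\}, \tag{9}$$

which may or may not be represented by a Gaussian integration of the (Euclidean) Nakanishi–Lautrup auxiliary field $B^a$. The Lorentz condition $\partial_\mu A^a_\mu = 0$ is strictly implemented only in the limit $\xi \to 0$ which defines the Landau gauge. Perturbation theory can then be defined by choosing the Gaussian measure of free quark, gluon and ghost fields as a starting point for a power series expansion of the non-Gaussian interaction terms in the effective Lagrangian $\mathcal{L}_{\rm eff}$ of covariant perturbation theory,

$$Z[j, \bar\eta, \eta, \bar\sigma, \sigma] = \int \mathcal{D}[A, q, \bar q, c, \bar c]\, \exp\left\{ -\int d^4x\, \mathcal{L}_{\rm eff} + \int d^4x \left( A^a_\mu j^a_\mu + \bar\eta q + \bar q \eta + \bar\sigma c + \bar c \sigma \right) \right\}, \tag{10}$$

with

$$\mathcal{L}_{\rm eff} = \frac{1}{2} A^a_\mu \left( -\partial^2 \delta_{\mu\nu} - \left( \frac{1}{\xi} - 1 \right) \partial_\mu \partial_\nu \right) A^a_\nu + \bar c^a \partial^2 c^a + g f^{abc}\, \bar c^a \partial_\mu (A^c_\mu c^b) - g f^{abc} (\partial_\mu A^a_\nu) A^b_\mu A^c_\nu + \frac{1}{4} g^2 f^{abe} f^{cde} A^a_\mu A^b_\nu A^c_\mu A^d_\nu + \bar q (-\partial\!\!\!/ + m) q - i g\, \bar q \gamma_\mu t^a q\, A^a_\mu. \tag{11}$$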
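The Gaussian identity behind Eq. (9) can be illustrated for a single mode. The sketch below is an assumption-laden toy version: it replaces the functional integral over $B^a(x)$ by an ordinary one-dimensional integral over a real variable $B$ and treats the gauge condition $f$ as a fixed real number; the function names and the quadrature parameters are hypothetical. In this setting the identity reads $\int dB\, e^{-iBf - \xi B^2/2} = \sqrt{2\pi/\xi}\; e^{-f^2/(2\xi)}$, where the prefactor is the field-independent normalisation that is dropped in Eq. (9).

```python
import cmath
import math


def b_integral(f, xi, half_width=12.0, steps=20000):
    """Trapezoidal estimate of the one-mode integral over B of
    exp(-i*B*f - xi*B**2/2), truncated where the Gaussian is negligible."""
    cutoff = half_width / math.sqrt(xi)
    h = 2.0 * cutoff / steps
    total = 0j
    for n in range(steps + 1):
        b = -cutoff + n * h
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * cmath.exp(-1j * b * f - 0.5 * xi * b * b)
    return total * h


def closed_form(f, xi):
    """Right-hand side: sqrt(2*pi/xi) * exp(-f**2 / (2*xi))."""
    return math.sqrt(2.0 * math.pi / xi) * math.exp(-f * f / (2.0 * xi))


if __name__ == "__main__":
    for xi in (2.0, 0.5, 0.1):
        print(xi, abs(b_integral(0.7, xi) - closed_form(0.7, xi)))
```

For decreasing $\xi$ the weight $e^{-f^2/(2\xi)}$ becomes increasingly peaked at $f = 0$, which is the one-mode analogue of the statement that the Lorentz condition is strictly implemented only in the Landau gauge limit $\xi \to 0$.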

Here, sources $\sigma$ and $\bar\sigma$ have been introduced for the (anti)ghost fields $(\bar c)c$, in exactly the same way as the sources for the quark fields. The sign convention adopted for Grassmann fields is

³ Together with massless states in certain ghost–gluon composite operators, see Section 4.4.3 in [36].
⁴ Positivity violation was already observed in the lattice studies of Refs. [52,53].



that derivatives generically denote,

$$\frac{\delta}{\delta(\bar\eta, \bar\sigma)} := \text{left derivative}, \qquad \frac{\delta}{\delta(\eta, \sigma)} := \text{right derivative}. \tag{12}$$

Despite the gauge invariance of the generating functional the Green's functions, obtained as the moments of this functional by taking derivatives with respect to the sources, are of course gauge dependent. The underlying gauge invariance, however, leads to relations between different Green's functions: the Ward–Takahashi identities in QED [54,55] and the Slavnov–Taylor identities in QCD [16,17]. The most convenient device to derive the Slavnov–Taylor identities is to exploit the Becchi–Rouet–Stora (BRS) symmetry [37] of Green's functions [56]. This will be discussed in detail in Section 3.2. Here, for completeness, we give the (on-shell) nilpotent BRS transformations for linear covariant gauges:

$$\delta A^a_\mu = D^{ab}_\mu c^b \lambda, \qquad \delta q = -i g\, t^a c^a q\, \lambda, \qquad \delta c^a = -\frac{g}{2} f^{abc} c^b c^c \lambda, \qquad \delta \bar c^a = \frac{1}{\xi} \partial_\mu A^a_\mu \lambda \tag{13}$$

with the global parameter $\lambda$ belonging to the Grassmann algebra of the ghost fields. This parameter thus (anti)commutes with monomials in the fields that contain (odd)even powers of ghost or antighost fields. Here it commutes with the quark fields since we assumed commutativity of ghosts with quark fields (without loss of generality, because of ghost number conservation, either commutativity or anticommutativity can be assumed, see Ref. [36]). Therefore, one can assign $\lambda$ the ghost number $N_{\rm FP} = -1$ reflecting the fact that the BRS charge has ghost number $N_{\rm FP} = 1$. The BRS invariance of the total Lagrangian in Eq. (11) follows from the gauge invariance of the classical action and the fact that gauge fixing and ghost terms can be expressed as a BRS variation themselves (i.e., they are BRS-exact). We use complex ghost fields with $\bar c \equiv c^\dagger$. With this hermiticity assignment the Lagrangian of Eq. (11) is not strictly hermitian and, furthermore, the BRS transformation given in Eq. (13) is not compatible with this assignment, as discussed, e.g., in [36]. To avoid this, independent (here Euclidean) real Grassmann fields $u, v$ should be introduced by substituting $c \to u$ and $\bar c \to iv$ in Eqs. (11) and (13) above. In Landau gauge ($\xi = 0$) we can make use of the additional ghost–antighost symmetry, however, to maintain the hermiticity of the Lagrangian and compatibility with the larger double BRS symmetry also for the assignment $\bar c = c^\dagger$. For $\xi \neq 0$, in the more general covariant gauge, this assignment is possible only at the expense of a quartic ghost interaction (which vanishes for $\xi \to 0$). Since we are mainly interested in the Landau gauge, we can disregard this subtlety and employ the naive BRS transformations together with the apparently wrong hermiticity assignment for the ghost fields in the derivations of Slavnov–Taylor identities. The results obtained in this way will be correct as long as we let $\xi \to 0$ eventually in these derivations.
The explicit connection between independent real and complex ghost fields is provided by realising that, in Landau gauge, the ghost number $Q_c$ and the ghost–antighost symmetry of the real formulation are actually both part of a larger global $SL(2,\mathbb{R})$ symmetry (which can be maintained for $\xi \neq 0$ by introducing the quartic ghost self-interactions). The connection with the complex formulation is provided by the Cayley map and $SL(2,\mathbb{R}) \simeq SU(1,1)$. Some details of this connection are provided in Appendix A.



Noether's theorem implies that there is a conserved anticommuting charge associated with the BRS symmetry. The existence of this non-trivial nilpotent and hermitian charge is only possible because the state space has indefinite metric. From Noether's theorem one deduces furthermore that the BRS charge is the generator of BRS transformations: the BRS transform of an operator is given by the (anti)commutator of it with the BRS charge. An operator which is the BRS transform of another operator is called exact. From the nilpotency of the BRS charge one immediately concludes that the BRS transform of an exact operator vanishes. Taking furthermore into account that the BRS charge commutes with the S-matrix, this has led to the conjecture that physical states are the ones which are annihilated by the BRS charge [57,41]. Furthermore, we note here that $\lambda$ need not be infinitesimal nor need it be field independent for (13) to be a symmetry of the Faddeev–Popov gauge fixed action [58]. However, the use of a field-dependent BRS transformation is aggravated by the fact that the functional measure is in general not invariant under this non-local transformation. Within linear covariant gauges the Green's functions also depend on the gauge parameter $\xi$. Using BRS symmetry one can furthermore derive the Nielsen identities [59] which control the gauge parameter dependence of Green's functions. These identities can be used to prove the gauge independence of particle poles in the standard model to all orders in perturbation theory, for recent applications see Ref. [60] and the references therein. Concluding this section we would like to add a remark regarding the non-perturbative use of the generating functional. In perturbation theory, the Gaussian measure over the free fields can formally be defined as a probability measure with support over the space of tempered distributions, see Refs. [61,38]. For ghosts and longitudinal gluons this measure is not positive.
The products of free fields occurring in the interactions, the composite fields, may also be well defined as tempered distributions. Ambiguities arise for products of these composite fields at coinciding Euclidean points. This is the origin for the need of renormalisation. In a renormalisable theory there exists a finite set of composite fields such that the product of any of them at coinciding points contains composite fields within the same set multiplied by (formally infinite) renormalisation constants. This is usually proven at all orders in perturbation theory. Multiplicative renormalisability beyond perturbation theory has the status of a conjecture. Beyond perturbation theory, the only safe way to define the measure in the Euclidean generating functional, and thus the Euclidean Green's functions as its moments, is given by the continuum limit of the lattice formulation of quantum field theory. The need for gauge fixing, the presence of long-range correlations such as the infrared divergences caused by the soft photons in QED,⁵ the possibility of infrared slavery in QCD, and triviality are some obstacles in a proper definition of the generating functional beyond perturbation theory. Some of these can be taken care of, others are less understood. In order to proceed, the existence of the generating functional has to some extent still to be postulated for many realistic theories. This will also be the point of view in the following chapters. Before we move on, however, we briefly discuss the incompleteness of the standard gauge fixing procedure and some related issues in the next section.

⁵ For two recent reviews on the treatment of soft and collinear infrared divergences see, e.g., [62,63].



2.2. Gribov copies, monopoles and gauge fixing

It is a well-known problem that the Lorentz gauge condition $\partial A^a = 0$ is not complete [64]. On compact space–time manifolds it has been proven that solutions to local gauge conditions of the form $f(A) = 0$ are generally unable to uniquely specify the connection, i.e., the gauge potentials $A$. The problem is generic and due to the topological structure of the non-Abelian gauge group [65], for a pedagogical discussion see Chapter 8 of [66]. Gribov's observation from the Coulomb, or analogously, from the Lorentz gauge condition $\partial A = 0$ is intuitively easy to understand. Consider the set of connections $\Gamma := \{A : \partial A = 0\}$ with the further constraint that all $A$ connected by global transformations $SU(N_c)_{\rm global}$ have to be identified in addition. For sufficiently strong $A(x)$ the Faddeev–Popov operator $-\partial D(A)$ can be shown not to be positive. Very much like bound states in quantum mechanics arise for sufficiently strong potentials, there is a critical $A_c$ for which the lowest eigenvalue $\lambda_0$ of the Faddeev–Popov operator is zero. A normalisable zero mode always arises from the analogue of the bound state wave-function in the limit $A \to A_c$ from that side for which $\lambda_0 \to 0^-$. Field configurations for which such zero modes occur in the Faddeev–Popov operator $-\partial D(A)$ constitute the Gribov horizons. In particular, the configurations $A_c$ where this happens for the lowest eigenvalue are said to lie on the first Gribov horizon $\partial\Omega$, i.e., the set of field configurations for which the lowest eigenvalue of the Faddeev–Popov operator vanishes. It can be shown that any point on $\partial\Omega$ has a finite distance to the origin in field space [44]. Furthermore, in Coulomb gauge on any compact three-manifold every Gribov copy obtained by a homotopically non-trivial gauge transformation of the trivial gauge field $A = 0$ has a vanishing Faddeev–Popov determinant [43]. An example of this has already been given in the appendix of Gribov's original paper [64].
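He considered a pure gauge potential $A = (0, \mathbf{A}^{pg}) = (0, -iU^\dagger \nabla U)$ in Coulomb gauge $\nabla \cdot \mathbf{A}^{pg} = 0$. If the gauge condition was unique, the only solution should be $\mathbf{A}^{pg} = 0$. However, choosing a hedgehog configuration in an $SU(2)$ subgroup parametrised by the generators $\boldsymbol\sigma$,

$$U(\mathbf{r}) = \cos\frac{\theta(r)}{2} + i\, \boldsymbol\sigma \cdot \hat{\mathbf{r}}\, \sin\frac{\theta(r)}{2}, \qquad \hat{\mathbf{r}} = \frac{\mathbf{r}}{|\mathbf{r}|}, \tag{14}$$

the gauge condition becomes

$$\frac{d^2\tilde\theta}{dt^2} + \frac{d\tilde\theta}{dt} - 2\sin\tilde\theta = 0, \quad\text{with}\quad t = \ln r \quad\text{and}\quad \tilde\theta(t) = \theta(r). \tag{15}$$

This is the classical equation of motion of a damped pendulum corresponding to the motion of a particle in the potential $V(\tilde\theta) = 2\cos\tilde\theta$ with friction $d\tilde\theta/dt$ of unit strength. The static solutions $d\tilde\theta/dt = 0$ are given by $\tilde\theta = \theta = l\pi$ which decomposes into two sequences, the even and the odd multiples of $\pi$ for $l = 2n$ and $2n+1$ with $n \in \mathbb{Z}$, respectively. For all pure gauge field configurations that approach these solutions at spatial infinity,

$$\mathbf{A}^{pg} \xrightarrow{\; r \to \infty \;} -iU^\dagger \nabla U = -i \exp\left( -i\frac{\theta}{2}\, \boldsymbol\sigma \cdot \hat{\mathbf{r}} \right) \nabla \exp\left( i\frac{\theta}{2}\, \boldsymbol\sigma \cdot \hat{\mathbf{r}} \right), \tag{16}$$

the Pontryagin index (winding number) is found to be half integer,

$$\nu_{pg} = \frac{-i}{24\pi^2} \int d^3x\, \epsilon_{ijk}\, \mathrm{tr}\left( A^{pg}_i A^{pg}_j A^{pg}_k \right) = \frac{l}{2} \quad\text{with}\quad l \in \mathbb{Z}. \tag{17}$$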
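The half-integer values in Eq. (17) can be made plausible with a short numerical check. The sketch below does not evaluate the three-dimensional integral of Eq. (17) directly. Instead it assumes the standard radial reduction for hedgehog maps of the form (14), $|\nu| = \frac{1}{2\pi} \int_0^\infty dr\, \theta'(r)\, (1 - \cos\theta(r))$, which is not derived in the text and fixes only the magnitude, not the orientation convention. The profile is the regularised one, $\theta(r) = l\pi r/\sqrt{r^2 + \epsilon^2}$, and the function names and grid parameters are illustrative.

```python
import math


def theta_reg(r, l, eps):
    """Regularised profile: vanishes at r = 0 and tends to l*pi at infinity."""
    return l * math.pi * r / math.sqrt(r * r + eps * eps)


def winding_number(l, eps=0.1, steps=20000):
    """Midpoint-rule estimate of (1/2pi) * integral of (1 - cos(theta)) dtheta
    along the profile, sampled at r = eps * tan(phi) for phi in [0, pi/2]."""
    total = 0.0
    prev = 0.0  # theta at r = 0
    for k in range(1, steps + 1):
        if k == steps:
            cur = l * math.pi  # limiting value at r -> infinity
        else:
            phi = 0.5 * math.pi * k / steps
            cur = theta_reg(eps * math.tan(phi), l, eps)
        mid = 0.5 * (prev + cur)
        total += (cur - prev) * (1.0 - math.cos(mid))
        prev = cur
    return total / (2.0 * math.pi)


if __name__ == "__main__":
    for l in (1, 2, 3):
        print(l, winding_number(l))
```

Within this reduction the result depends only on the boundary values of $\theta$, so that it is independent of the regulator $\epsilon$ and equals $(\theta - \sin\theta)/2\pi$ evaluated between $0$ and $l\pi$.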



Note that even though

$$\mathbf{A}^{pg}_{(2n)} = 0, \quad\text{and}\quad \mathbf{A}^{pg}_{(2n+1)} = \frac{\boldsymbol\sigma \times \hat{\mathbf{r}}}{r}, \quad\text{at}\quad r \neq 0 \tag{18}$$

for the even $\theta = 2n\pi$ and the odd $\theta = (2n+1)\pi$ static solutions, respectively, neither of these need to be entirely trivial. They can carry winding number concentrated at $r = 0$ (for $l \neq 0$). For regularised $\theta^l_\epsilon(r) = l\pi r/\sqrt{r^2 + \epsilon^2}$ one verifies

$$-\frac{1}{24\pi^2}\, \epsilon_{ijk}\, \mathrm{tr}\left[ (U_\epsilon^\dagger \nabla_i U_\epsilon)(U_\epsilon^\dagger \nabla_j U_\epsilon)(U_\epsilon^\dagger \nabla_k U_\epsilon) \right] \xrightarrow{\; \epsilon \to 0 \;} \frac{l}{2}\, \delta^3(x), \quad\text{for}\quad U_\epsilon = \exp\left( i\frac{\theta^l_\epsilon(r)}{2}\, \boldsymbol\sigma \cdot \hat{\mathbf{r}} \right). \tag{19}$$

The even sequence $l = 2n$ yields infinitesimal and thus singular n-vacua obtained from the regular ones, $\theta^{2n}_\epsilon(r) = 2\pi n\, r/\sqrt{r^2 + \epsilon^2} \to 2n\pi$, for $\epsilon \to 0$. These, of course, correspond to the classification of configurations according to $\pi_3[SU(N)] = \mathbb{Z}$, i.e., of configurations with the boundary condition that $U \to U_0$ for a unique $U_0 \in SU(2)$ in all directions at spatial infinity. The odd sequence, on the other hand, corresponds to configurations for which two group elements at spatial infinity in opposite directions differ by a non-trivial central element (here by $-\mathbb{1} \in Z_2 = \{\pm\mathbb{1}\}$ for $SU(2)$) which does not affect adjoint fields such as the gauge potentials.⁶ This additional classification into even and odd sequences generalises for the $SU(N)$ pure gauge theory according to $\pi_1[SU(N)/Z_N] = Z_N$ corresponding to fractional topological indices $k/N$ with $k = 0, \dots, N-1$, see [67]. In addition to the even and odd static solutions with $\theta = l\pi$ discussed so far, there are of course also solutions to Eq. (15) for $\tilde\theta(t)$ which start at one of the maxima of the potential $V(\tilde\theta) = 2\cos\tilde\theta$ at $\tilde\theta = 2n\pi$ with infinitesimal velocity for $t \to -\infty$, and which approach one of the two neighbouring minima at $\tilde\theta = (2n \pm 1)\pi$ for $t \to \infty$ where they eventually come to rest due to the friction term. For $n = 0$ these correspond to everywhere regular pure gauge field configurations, i.e., Gribov copies of the vacuum, with $U(\mathbf{r}) \to \pm\mathbb{1}$ for $r \to 0$ and $U(\mathbf{r}) \to \pm i\, \boldsymbol\sigma \cdot \hat{\mathbf{r}}$ for $r \to \infty$, and with topological index $\nu = \pm 1/2$.
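The behaviour of these regular solutions can be illustrated by integrating the pendulum equation (15) numerically. The following is a minimal sketch under stated assumptions: a fixed-step fourth-order Runge–Kutta scheme is used, the solution is started close to the maximum $\tilde\theta = 0$ on the growing mode of the linearised equation ($\tilde\theta \propto e^{t}$, i.e. $\theta \propto r$ near the origin), and the step size, the initial displacement and the integration time are arbitrary illustrative choices.

```python
import math


def rhs(theta, omega):
    """Eq. (15) as a first-order system: theta' = omega,
    omega' = -omega + 2 sin(theta)."""
    return omega, -omega + 2.0 * math.sin(theta)


def integrate(theta0=1e-3, omega0=1e-3, t_span=50.0, dt=0.01):
    """Fixed-step RK4 integration in t = ln r; returns the final (theta, omega).

    The defaults start just off the maximum of V = 2 cos(theta) at theta = 0,
    with theta' = theta as appropriate for the growing mode exp(t)."""
    theta, omega = theta0, omega0
    for _ in range(int(round(t_span / dt))):
        k1 = rhs(theta, omega)
        k2 = rhs(theta + 0.5 * dt * k1[0], omega + 0.5 * dt * k1[1])
        k3 = rhs(theta + 0.5 * dt * k2[0], omega + 0.5 * dt * k2[1])
        k4 = rhs(theta + dt * k3[0], omega + dt * k3[1])
        theta += dt * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        omega += dt * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return theta, omega


if __name__ == "__main__":
    theta_end, omega_end = integrate()
    print(theta_end / math.pi, omega_end)
```

Since the friction term only removes energy, a solution released from the maximum at $\tilde\theta = 0$ cannot pass the neighbouring maxima at $\pm 2\pi$; it spirals into the minimum at $+\pi$ or $-\pi$ depending on the sign of the initial displacement.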
The property of these regular solutions to reach the odd sequence asymptotically at $r \to \infty$ suffices to show that any Wilson line-integral along a curve $\gamma$ starting from spatial infinity in some direction and leading to spatial infinity in the opposite direction is $-\mathbb{1}$, see Ref. [67],

$$W_{\rm fd}(\mathbf{A}^{pg}) = P \exp\left( i \int_\gamma \mathbf{A}^{pg} \cdot d\mathbf{s} \right) = -\mathbb{1}, \quad\text{for}\quad \mathbf{A}^{pg} = -i \exp\left( -i\frac{\theta(r)}{2}\, \boldsymbol\sigma \cdot \hat{\mathbf{r}} \right) \nabla \exp\left( i\frac{\theta(r)}{2}\, \boldsymbol\sigma \cdot \hat{\mathbf{r}} \right) \tag{20}$$

in the fundamental representation for these configurations. Note that the n-vacua yield $+\mathbb{1}$ just as the trivial vacuum; and adjoint Wilson lines are $+\mathbb{1}$ in either case, of course. Instantons change the index by one unit. Regular ones of infinitesimal size $\rho = \epsilon \to 0$, or equivalently, those with $\rho \to \infty$ in a singular gauge, connect the vacua from the even sequence with the adjacent even ones and those of the odd sequence with adjacent odd ones, i.e., they correspond to transitions between the above vacua of constant angles $\theta = l\pi$ with $l \to l \pm 2$. At zero temperature these instantons are discontinuous at a space-like surface passing through

⁶ For the pure gauge theory these boundary conditions are equivalent and combined in $U \to Z_N U_0$.



their centres. This is a ramification of the general argument for a necessarily discontinuous time evolution of such transitions in the Coulomb gauge [68]. In the Hamiltonian description this was related to the observation that, within such a transition, the configurations pass through the Gribov horizon at which the Coulomb gauge Hamiltonian fails to generate a continuous time evolution [64,68,69]. Instantons at finite temperature $T = 1/\beta$ are the Harrington–Shepard calorons [70]. Those calorons that connect the n-vacua of Eq. (18), with $l \to l \pm 2$, at the opposite ends of the finite time interval can be obtained from (antiperiodic) gauge transformations of special types of static, (anti)self-dual Bogomolnyi–Prasad–Sommerfield (BPS) monopoles, namely the ones with topological charge $Q = \beta\mu/2\pi = 1$, where $\mu$ is the scale of the BPS monopole, see [71,72].⁷ Studying the limit $T \to 0$ of these calorons with first, at finite $T$, e.g., $\rho \to \infty$ for the singular gauge, one can see the aforementioned discontinuity arise explicitly. In this limit, they reduce to sequences of the form

$$\mathbf{A}(x, t) = \begin{cases} \mathbf{A}^{pg}_{(2n+2)}, & t > t_0, \\ \mathbf{A}^{pg}_{(2n+1)}, & t = t_0, \\ \mathbf{A}^{pg}_{(2n)}, & t < t_0, \end{cases} \tag{21}$$

where $t = t_0$ defines the central time slice of the original caloron. At the expense of the finite action for an instanton, the vacua of the odd sequence of constant angles $\theta = (2n+1)\pi$ can therefore exist for infinitesimally short times only. Cutting the time interval for finite $\beta = 1/T$ at $t = t_0$, one obtains self-dual configurations in each of the two parts which are separately gauge transforms of BPS monopoles, now with topological charge $Q = 1/2$.⁸ These configurations have half the instanton action and connect the even with the odd vacua, with integer and half-odd integer winding numbers $\nu = l/2$, respectively, corresponding to transitions $l \to l \pm 1$ from one end of their finite time intervals to the other. Quite obviously, however, for temperatures $T \to 0$ these transitions occur at infinitesimally early or late times leading to the pure gauge configurations $\mathbf{A}^{pg}_{(2n)}$ with integer winding number $\nu = l/2 = n$ for all finite times. In the Hamiltonian description, it therefore seems rather questionable whether such transitions can in the end give rise to a ground state wave-functional that is centre symmetric in the sense of the Wilson lines of Eq. (20).

⁷ The Polyakov loop of these special BPS monopoles passes through the centre of $SU(2)$ at the position of the monopole, and it approaches the centre at spatial infinity. They thus correspond to two axial-gauge monopoles, the ramifications of the Gribov problem in the axial gauge, one at the position of the BPS monopole and the other one at infinity with zero total magnetic charge. This is a special case of the general relation between instantons and axial-gauge monopoles which was clarified in Refs. [73–76]. Instantons corresponding to two axial-gauge monopoles at a finite distance of each other, called the non-trivial holonomy instantons because for these the Polyakov loop does not approach the centre at spatial infinity, were found in Refs. [77,78]. For the relation between instantons and the magnetic monopoles of general Abelian gauges, see Ref. [79]. For gauge fixing and instantons in a field strength formulation, see Ref. [80]. A relation to monopoles might be provided by a field strength formulation in the maximal Abelian gauge [81].
⁸ Since for these BPS monopoles the Polyakov loop passes through the centre only once, at their positions and, in particular, does not approach a central element at spatial infinity, they each correspond to one single axial-gauge monopole and thus have non-trivial holonomy.



An alternative possibility to connect vacua of the even sequence with neighbouring odd ones is provided by merons. These are classical solutions of the $SU(N)$ pure gauge theory which are singular, non-self-dual and thus of infinite action [82,83]. Explicitly, a Euclidean one-meron solution in Coulomb gauge, and for an arbitrary $SU(2)$ subgroup, see the review in Ref. [84], is given by

$$\mathbf{A}^{mn}(\mathbf{r}, t) = \frac{1}{2}\, \frac{\boldsymbol\sigma \times \hat{\mathbf{r}}}{r} \left( 1 - \frac{t}{\sqrt{t^2 + r^2}} \right). \tag{22}$$

Clearly, for $t \to \infty$ (and $r \neq 0$) the meron configuration vanishes. More carefully, including the singularity at $r = 0$, one finds that it approaches a vacuum of the even sequence, corresponding to the $\epsilon \to 0$ limit of $\theta^{2n}_\epsilon(r) \to 2n\pi$. For $t \to -\infty$ one has $\mathbf{A}^{mn} \to \mathbf{A}^{pg} = (\boldsymbol\sigma \times \hat{\mathbf{r}})/r$ corresponding to a $\theta^{2n+1}_\epsilon(r) \to (2n+1)\pi$ configuration of the odd sequence.⁹ As might intuitively seem reasonable for configurations that connect the even with the odd vacua, the gauge potentials of single-meron configurations are exactly half the gauge potentials of the instantons in the special limit discussed above, i.e., of those with $\rho = \epsilon \to 0$ or $\rho \to \infty$ in the regular or the singular gauge, respectively. Or, vice versa, these instantons can in fact be viewed as the special case of analytically known two-meron solutions for vanishing distance, see Refs. [83,84].¹⁰ Though single-meron configurations are neither self-dual nor have finite action, in contrast to the $Q = 1/2$ BPS monopoles discussed above, they do connect vacua with integer and half-odd integer winding number without discontinuity in time. At finite separations in the time-direction two-meron configurations might therefore, at least in principle, be employed to populate the half-odd winding number configurations $\mathbf{A}^{pg}_{(2n+1)}$ for finite time periods. It was furthermore argued that the logarithmically diverging action of meron pairs, which can be made explicit in a regularisation in terms of so-called "instanton caps", might be compensated in the free energy by their contributions to the entropy [86,87].
This observation has induced some renewed interest in the role of merons with respect to confinement; for a recent lattice study of meron pairs, see Ref. [88]. To summarise, while a simple picture of confinement, e.g., as intuitive as the dual Meissner effect by a condensation of magnetic monopoles proposed for the maximal Abelian gauge [89,90], is not yet available for Coulomb or Landau gauge, relations between the various types of monopoles in the various gauges, gauge singularities and the role of topologically non-trivial gauge copies are increasingly well understood. Some analogies of these issues can be found also for the Coulomb or Landau gauge along the directions mentioned above. Whether these subtleties will be relevant for a description of confinement or not, whether any kind of semiclassical analysis might in the end be swamped by genuine quantum effects, the mere existence and the possible couplings of the fractional n-vacua seems to show that the capacity exists, at least in principle, to introduce a sufficient disorder which might eventually be all that is needed

9 Its connection to monopoles is quite obvious: at $t = 0$ the meron passes through a chromomagnetic monopole with $B_i^a = -r_i r_a/r^4$ and $E_i^a = 0$, which was first reported as a static solution to the classical Yang–Mills equations in Ref. [85].
10 The hypothesis that exact solutions for two merons at a finite distance [82] might provide a more general connection between instantons and monopoles, in particular also for Coulomb and Landau gauges, is a long-standing conjecture, see Refs. [86,87].



in the pure gauge theory to lead to an area law for large Wilson loops also in the Coulomb or the Landau gauge. For reasons that will become clear below, consider now the square norm of the various pure gauge configurations obtained from Eq. (14). It is straightforward to see that

\[ \|A^{\rm pg}\|^2 = \int d^3x\, {\rm tr}\, A_i^{\rm pg} A_i^{\rm pg} = 2\pi \int_0^\infty r^2\, dr \left( (\alpha'(r))^2 + \frac{4}{r^2}\,\bigl(1 - \cos\alpha(r)\bigr) \right), \tag{23} \]

which vanishes for $A = 0$ and for all static $\alpha = 2n\pi$ copies thereof.¹¹ From Eq. (23) it is clear that $\alpha(r)$ must approach $2n\pi$ for $r \to \infty$, corresponding to the maxima of the potential $\tilde{V}(\tilde\alpha)$, in order for the square norm of such a pure gauge configuration to be finite. This is the case only for those configurations that approach the integer n-vacua $A^{\rm pg}_{(2n)}$ at large $r$. In particular, it is not the case for the regular Gribov copies discussed in the paragraph above Eq. (20), nor for regularised vacua with half-odd winding numbers obtained from $\alpha_\epsilon(r) = \pi(2n+1)\,r/\sqrt{r^2+\epsilon^2}$. In addition, the first term in the norm integral, with

\[ \int_0^\infty r^2\, dr\, (\alpha'(r))^2 = \int_{-\infty}^{\infty} dt \left( \frac{d\tilde\alpha(t)}{dt} \right)^2, \tag{24} \]

then corresponds to the total energy dissipated (for $t \to \infty$) by the particle $\tilde\alpha(t)$ moving in the potential $\tilde{V}(\tilde\alpha) = 2\cos\tilde\alpha$. This implies that, due to the friction $d\tilde\alpha(t)/dt$ in its equation of motion (15), any particle that comes to rest at one of the maxima $\tilde\alpha = 2\pi l$ for $t \to \infty$ must at $t \to -\infty$ initially have come from (positive or negative) infinity with infinite initial energy. Therefore, the dissipated energy of this particle is also infinite except for $\tilde\alpha(t) \equiv 2n\pi$ at all $t$. The only configurations with finite square norm are thus the singular n-vacua of the even sequence with integer winding number, for which $\|A^{\rm pg}\|^2 = 0$, degenerate with the trivial configuration $A = 0$. The even sequence $\alpha = 2n\pi$ with winding number $n \in \mathbb{Z}$ provides a set of degenerate absolute minima of the square norm.

Generally, the Gribov region $\Omega$ is defined as the set of connections within the first Gribov horizon, i.e., as the set of gauge potentials for which the Faddeev–Popov operator is positive. In the linear covariant gauges it explicitly reads $\Omega := \{A : \partial A = 0,\; -\partial D(A) > 0\}$. This convex Gribov region is determined by the set of the local minima of the functional

\[ E_A[U] \equiv \|A^U\|^2 := \frac{1}{2} \int d^4x\, A_\mu^{a\,U}(x)\, A_\mu^{a\,U}(x) \tag{25} \]

for the equivalence classes of gauge fields $[A^U]$ given in Eq. (6). There are in general many local minima, and the Gribov region still contains gauge copies. As a further restriction, the fundamental modular region $\Lambda$ is defined as the set of absolute minima of functional (25). Each orbit (6) intersects $\Lambda$ exactly once [91]. The fundamental modular region $\Lambda$ is contained within the Gribov region $\Omega$. In the interior of $\Lambda$ the absolute minima are non-degenerate. Degenerate minima exist, however, on the boundary $\partial\Lambda$. These minima have to be identified [43,44]. The Gribov horizon, i.e., the boundary $\partial\Omega$ of the Gribov region $\Omega$, touches $\partial\Lambda$ at the so-called singular

11 The fact that these are singular at $r = 0$ is irrelevant here. One readily verifies that the norm of the regular n-vacuum configurations $\alpha_\epsilon(r) = 2\pi n\, r/\sqrt{r^2+\epsilon^2}$ is of order $\epsilon$ and thus vanishes for $\epsilon \to 0$.
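The scaling with $\epsilon$ stated in this footnote can be checked numerically. The following is a minimal sketch, assuming the norm integral of Eq. (23) and the profile $\alpha_\epsilon(r) = a\, r/\sqrt{r^2+\epsilon^2}$ with amplitude $a = 2\pi n$ (regular n-vacuum) or $a = (2n+1)\pi$ (half-odd winding). The function names, the variable map $r = \epsilon u/(1-u)$ and the step count are illustrative choices, not taken from the text.

```python
import math


def norm_integrand(r, amplitude, eps):
    """Integrand of Eq. (23) without the overall factor 2*pi:
    r^2 * alpha'(r)^2 + 4 * (1 - cos(alpha(r)))
    for the profile alpha(r) = amplitude * r / sqrt(r^2 + eps^2)."""
    x = r / eps
    alpha = amplitude * x / math.sqrt(x * x + 1.0)
    dalpha_dr = amplitude / (eps * (x * x + 1.0) ** 1.5)
    return r * r * dalpha_dr ** 2 + 4.0 * (1.0 - math.cos(alpha))


def square_norm(amplitude, eps, steps=2000):
    """Composite Simpson rule for the norm integral after mapping
    r in [0, infinity) to u in [0, 1) via r = eps * u / (1 - u).
    Intended for amplitude = 2*pi*n, where the mapped integrand
    vanishes at both end points and the integral converges."""
    h = 1.0 / steps
    total = 0.0
    for i in range(1, steps):  # end points contribute zero
        u = i * h
        r = eps * u / (1.0 - u)
        jacobian = eps / (1.0 - u) ** 2
        weight = 4.0 if i % 2 else 2.0
        total += weight * norm_integrand(r, amplitude, eps) * jacobian
    return 2.0 * math.pi * total * h / 3.0


if __name__ == "__main__":
    a = 2.0 * math.pi  # n = 1 regular vacuum copy
    for eps in (0.4, 0.2, 0.1):
        print(eps, square_norm(a, eps))
    # Half-odd winding: the integrand tends to a constant at large r,
    # so the norm integral diverges linearly.
    print(norm_integrand(1.0e6, math.pi, 1.0))
```

In this sketch the norm is proportional to $\epsilon$ because the profile depends on $r$ only through $r/\epsilon$. For the half-odd amplitude the integrand does not decay at large $r$, which is the divergence referred to in the discussion of Eq. (23).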



Fig. 1. Sketch of the hypersurface $\Gamma = \{A : \partial A = 0\}$, the Gribov and the fundamental modular region, $\Omega$ and $\Lambda$, respectively. The necessity of identifications on the boundary of $\Lambda$ is indicated by dashed arrows.

boundary points. The situation is sketched in Fig. 1. Note that the relevant configuration space is $\Lambda/SU(N_c)$, since the origin of the fundamental modular region, $A = 0$, is invariant under global gauge transformations [43,44]. In a continuum formulation of QCD it seems unlikely that a systematic elimination of gauge copies is possible at all.¹² Their presence may or may not be a serious problem. On the other hand, there has recently been some progress in treating the Gribov problem in lattice calculations. The lattice analogue of restricting to the absolute minima of $E_A$ is called minimal Landau gauge [98,99]. Various algorithms are used in gauge fixed lattice calculations to minimise this functional, e.g., in Refs. [100,52,53,101]. Methods to find the absolute minima and the influence of Gribov copies are assessed in Ref. [102]. Therefore, a solution of the Gribov problem might in principle be feasible on the lattice. However, the existence and uniqueness of the continuum limit for corresponding quantities still remains an open question. On compact manifolds, gauge fixing without the necessity of elimination of Gribov copies can be formulated systematically in terms of a (Witten type) topological quantum field theory on the gauge group $\mathcal{G}$ (see Refs. [103,66]). In such a formulation, the standard Faddeev–Popov procedure of inserting unity into the unfixed generating functional, which generalises to a weighted average over all $U$ with $A^U \in \Gamma = \{A : \partial A = 0\}$ in the presence of Gribov copies [104,105], or, equivalently, the perturbative BRS quantisation, essentially corresponds to constructing a topological quantum field theory whose partition function computes the generalised Euler characteristic $\chi(\mathcal{G})$ of the gauge group [106]. This can vanish, however, just as the Witten index vanishes in theories with spontaneously broken supersymmetry.
For the SU(2) lattice gauge theory the vanishing of the Euler character follows quite trivially from $\chi(\otimes_{\rm sites} SU(2)) = \chi(S^3)^{\#{\rm sites}} = 0$, see also Ref. [107]. In the continuum this remains the case due to the global gauge transformations, which provide one vanishing factor $\chi(S^3)$ that survives the continuum limit. One way to cure this problem is to remove the global ghost zero modes by constructing a topological quantum
12 For completeness we mention that using stochastic quantisation there is no need for a gauge fixing term and the Gribov problem is thus avoided, see Ref. [92] for a pedagogical treatment of this topic. A related continuum formulation [93,94] considers QCD from a five-dimensional point of view, the fifth dimension playing the role of the “stochastic time”. This leads to a parabolic equation for the propagators of the various ghost fields in the five-dimensional bulk and thus yields a trivial Faddeev–Popov determinant. There are also recent numerical investigations on the lattice based on stochastic quantisation, e.g., see Refs. [95–97].



field theory that computes the Euler character of the coset space $\otimes_x SU(N_c)/SU(N_c)_{\rm global}$, which was shown not to vanish for SU(2) in Ref. [106]. Within the framework of BRS quantisation this procedure has been worked out for QCD in the covariant gauge on the 4-torus in Refs. [108,109].¹³ An alternative way to avoid a vanishing Euler character, proposed in Ref. [110], is to fix the SU(N) gauge symmetry only partially to the maximal Abelian subgroup $U(1)^{N-1}$. For the SU(2) lattice gauge theory the BRS construction to compute $\chi(\otimes_{\rm sites} SU(2)/U(1)) = \chi(S^2)^{\#{\rm sites}} = 2^{\#{\rm sites}}$ can then be used to obtain a reduced U(1) lattice gauge theory [110]. Within covariant Abelian gauges in the continuum this kind of BRS quantisation by ghost–antighost condensation can lead to mass generation for off-diagonal gauge bosons and thus to finite propagators at all Euclidean momenta, except for the diagonal “Abelian” gauge boson which remains massless [111,112]. The most important difference between this scheme and the standard Faddeev–Popov gauge fixing adopted for the maximal Abelian gauge, e.g., in Ref. [81], is that the maximal Abelian gauge condition is not implemented exactly in the former but softened by a Gaussian weight of width $\xi$, analogous to the Lorentz condition in the linear covariant gauge. This leads to the occurrence of quartic ghost self-interactions (which formally vanish for $\xi = 0$) together with a global SL(2,R) symmetry in the BRS construction of Refs. [111,112]. The global SL(2,R) can dynamically break down to the usual ghost number symmetry in the phase with ghost–antighost condensation, with the condensate as the order parameter. Besides being responsible for the ghost–antighost condensation and mass generation in some analogy to the BCS theory of superconductivity (with Higgs mechanism for the plasmon excitation), technically, the quartic ghost self-interactions eliminate the global gauge zero modes due to the constant (in this case the off-diagonal) ghosts.
In this description, screening masses for the off-diagonal gauge bosons thus emerge naturally which persist in the high temperature phase [111]. This last conclusion is due to the relation of the ghost condensate with the scale anomaly, which at the same time seems to show that it cannot provide an order parameter for the chiral symmetry breaking and/or confinement transition. It rather suggests that the global SL(2,R) is broken in both the high and the low temperature phases. In order to understand the possible origin of confinement in such a formulation, which in the usual BRS framework is related to the realisation of the global gauge symmetry on unphysical states, an application to the Higgs mechanism in the SU(2) × U(1) electroweak interactions would seem to be a natural next step.¹⁴ It might be interesting for our present purposes, however, to note that a global SL(2,R) symmetry containing ghost number and ghost–antighost symmetry emerges also in the Landau gauge, i.e., in the special case of the linear covariant gauge with $\xi = 0$ in which the Lorentz condition is implemented “exactly”. Maintaining this symmetry in covariant gauges for gauge parameters $\xi \neq 0$ leads to the (massless) Curci–Ferrari gauges discussed in Appendix A. The significance of this symmetry seems not entirely clear at present. The differences between these Curci–Ferrari gauges and the standard linear covariant gauge seem, however, quite analogous to the situation in the maximal Abelian gauge discussed above.

13 Among compact space–time manifolds, the special choice of the torus was adopted for simplicity, to avoid global topological obstructions. The infinite volume limit is believed to be independent of this particular choice, of course.
14 In particular, the question might arise why the massive gauge boson belongs to an unphysical quartet (see below) in one case while it definitely is a BRS singlet in any of the known Higgs models.



While the presence of the quartic ghost self-interactions might at first not seem to be a very appealing feature of the Curci–Ferrari generalisation of the Landau gauge, apart from maintaining its special symmetry, they do have one possibly quite interesting effect: the quartic ghost self-interactions could be effective in eliminating all constant ghost and antighost zero modes for the gauge group SU(3) of QCD.¹⁵ This might therefore provide for a BRS formulation by a topological quantum field theory with non-vanishing partition function without the need to eliminate the global gauge invariance. As we shall discuss in Section 2.4, the realisation of the global gauge symmetry is of particular importance in the BRS formulation of the linear covariant gauge. The Kugo–Ojima criterion is based on the necessity of this global symmetry to be unbroken for a realisation of confinement. Its breaking, on the other hand, leads via the converse of the Higgs mechanism to massive physical (BRS singlet) states in transverse gauge boson channels. In light of this, a formulation that allows both these possibilities by leaving the global gauge invariance untouched clearly seems desirable. Among the SL(2,R)-symmetric covariant gauges, Landau gauge is special in that the quartic ghost interactions disappear for $\xi = 0$, with the effect that constant ghost zero modes arise. These are certainly problematic for a proper formulation of the gauge fixed theory at a finite volume as discussed above. It is not inconceivable, however, that it might suffice to deform the Landau gauge just slightly into an SL(2,R)-symmetric covariant gauge without such zero modes at large but finite volume, without modifying the naive Landau gauge results presented in later chapters of this review in the infinite volume limit (in which $\xi \to 0$ might be retained).
This is certainly a quite optimistic assessment of the current situation about gauge fixing in the presence of Gribov copies, and considerable further studies will be necessary to clarify this issue. Apart from some evidence in favour of the naive procedure, by comparing the results from Dyson–Schwinger equations to the conjectures of Gribov [64] and Zwanziger [98] and to lattice results, we will not have much more to say about this problem in the following chapters.

2.3. Positivity versus colour antiscreening

In this section we briefly review and discuss a long-known contradiction between asymptotic freedom, implying antiscreening of the colour charge in the sense of Källén, and the positivity of the spectral density for gluons in the covariant gauge [39]. While the apparent paradox was argued to be resolved for the (space-like) axial gauge by West [113], as will be discussed in Section 5.2.3, present knowledge of the axial gauge suggests that this resolution is itself likely to be an artefact of a violation of positivity introduced by the axial-gauge singularity of the gluon propagator.¹⁶ The root of the contradiction thus seems to be more generic and not special to the covariant gauge. As will become clear in subsequent chapters, combined evidence from different non-perturbative calculations indicates quite convincingly that gluonic correlations
15 For SU(3) there are 16 constant (anti)ghost modes, and the expansion of the exponential of their quartic interaction to fourth order yields exactly one term that contains each of them exactly once, provided the prefactor of this term does not vanish. The same was not possible for SU(2) with 6 constant modes, since no 6-(anti)ghost term arises in this expansion in the first place.
16 In fact, recent studies of the axial gauge and, in particular, of the singularities in the corresponding tree-level gluon propagator, start from linear covariant gauges in defining the axial gauge, see Ref. [58] and the references therein.



do indeed violate positivity. As mentioned in the beginning of this chapter, and discussed in more detail in the next section, this can be interpreted as a manifestation of confinement. In the present section, we describe the original argument that this might be inferred already from asymptotic freedom and BRS invariance [46,114,41]. To understand the origin of the problem some basic properties of interacting fields are briefly recalled. For the moment a one-to-one correspondence between basic fields and stable particles is assumed, which is of course not the case in QCD. Assuming field-particle duality and asymptotic completeness, the Lehmann–Symanzik–Zimmermann asymptotic condition for $t \to -\infty$ on an interacting field $\phi(\varphi, t)$ states that it converges weakly on a dense domain $D$ to the creation operator $a^\dagger_{\rm in}(\varphi)$ of a single particle state with wave-function $\varphi(x)$:

\[ \langle\Psi|\,\phi(\varphi,t)\,|\Phi\rangle \;\xrightarrow{\;t\to-\infty\;}\; Z^{1/2}\, \langle\Psi|\,a^\dagger_{\rm in}(\varphi)\,|\Phi\rangle, \tag{26} \]

i.e., the matrix elements of all states $|\Psi\rangle$, $|\Phi\rangle$ in $D$ converge to those of the asymptotic field involving a normalisation constant $Z$ for the overlap with the corresponding single particle state of mass $m_\varphi$. This constant appears, of course, in the Lehmann representation of the propagator of the $\phi$-field ($m > m_\varphi$),¹⁷

\[ D_\phi(k) = \frac{Z}{k^2 + m_\varphi^2} + \int_{m^2}^\infty d\kappa^2\, \frac{\tilde\rho(\kappa^2)}{k^2 + \kappa^2}. \tag{27} \]

The single particle contribution is explicitly separated here, i.e., a full spectral function can be defined as $\rho(k^2) := Z\,\delta(k^2 - m_\varphi^2) + \tilde\rho(k^2)$. If one insisted on equal-time commutation relations for the interacting fields (e.g., as in [2]), the following spectral sum rule would be obtained:

\[ 1 = Z + \int_{m^2}^\infty d\kappa^2\, \tilde\rho(\kappa^2), \tag{28} \]

which would thus imply $1 > Z > 0$ for positive $\tilde\rho$. In general, however, the second term on the r.h.s. of Eq. (28) is a divergent quantity associated with the field renormalisation necessary in a renormalisable theory. This reflects the fact that, in contrast to free fields, which as operator-valued distributions are defined at fixed times, the interacting fields in a renormalisable theory are more singular objects. In particular, smearing over both the space and time variables is necessary in their definition. Their equal-time commutation relations are no longer well defined. Heuristically, they involve the divergent field renormalisation constants. In the case of the gluon field, which is primarily under consideration here, this constant is usually called $Z_3$, and the resulting spectral sum rule reads

\[ Z_3^{-1} = Z + \int_{m^2}^\infty d\kappa^2\, \tilde\rho(\kappa^2). \tag{29} \]

Turning the above argument around, the necessity of renormalisation can be understood as follows: If the canonical equal-time commutation relations of the free theory ($g = 0$) are retained in the interacting theory ($g \neq 0$), the constant $Z$ has to acquire a divergence so as to cancel the one

17 The difficulties encountered in the presence of massless particles are ignored here. Losing the correspondence between single particle states and the discrete eigenvalues of the mass operator, one has to account for the infrared divergences caused by, e.g., the soft photons in QED. These can be dealt with by introducing coherent states. Thus, this complication is of no further significance for the arguments sketched in the following.



on the r.h.s. of Eq. (28). This would imply that the asymptotic condition, Eq. (26), is lost. The representation of the interacting fields and the Fock space representation of the free canonical fields are inequivalent, which is referred to as Haag’s theorem. It is an example of the general representation problem in quantum field theory, the existence of inequivalent representations of the canonical commutation relations being the rule rather than the exception. Of course, in constructive field theory the equal-time canonical commutation relations are replaced by space-like commutativity as the more general implementation of locality for interacting fields, see Haag’s book for a thorough presentation and an account of the mathematical background [3]. The renormalised version of the spectral sum rule for the interacting theory given in Eq. (29) is in conflict with positivity of the spectral density of the (transverse) gluon propagator in Landau gauge QCD, as was first observed in Ref. [39]. To see this, we note that Eq. (29) implies $Z_3 \le Z^{-1}$ for a positive spectral function, $\tilde\rho(\kappa^2) > 0$. Near the renormalisation group fixed point of vanishing $g^2$, however, one has in linear covariant gauges

\[ Z_3 = \left( \frac{g^2}{g_0^2} \right)^{\gamma} = \frac{\xi_0}{\xi} \tag{30} \]

with the renormalised gauge parameter $\xi$, the bare one $\xi_0$, and $\gamma$ being the leading coefficient of the anomalous dimension of the gauge field. The second equality is meaningful only, of course, if one is not considering the Landau gauge $\xi = \xi_0 = 0$. The first equality, however, holds for general covariant gauges including $\xi = 0$. In QED the spectral density $\rho(k^2)$ of the photon propagator in the covariant gauge is identical to its axial gauge counterpart $\rho_g(k^2)$, cf. Section 5.2.2, which is a consequence of the gauge invariance of the Coulomb potential. In QED one furthermore has $\gamma = 1$, and from the gauge invariance of $\rho$ it was argued that for the bare gauge parameter only the choices $\xi_0 = 0$ and $\xi_0 = \infty$ exist, implying the possible values of the bare coupling to be $e_0^2 \in \{\infty, 0\}$ [115]. The second choice corresponding to asymptotic freedom, in QED one might thus expect that the bare coupling diverges, i.e., that the running coupling behaves as $\bar{e}^2(\mu) \to \infty$ for $\mu \to \infty$ (beyond one-loop). Here, a possible problem might rather be triviality of QED in the absence of an ultraviolet fixed point (for a pedagogical discussion see, e.g., Huang’s text book on quantum field theory [116]). In fact, recent evidence in favour of triviality is obtained for instance in the lattice simulation of Ref. [117]. Without a further fixed point, it follows that $Z_3 \to 0$. Due to the infrared fixed point at $e^2 = 0$, however, it is sufficient to note that the positive $\beta$-function near this fixed point generally implies a renormalised charge which is smaller than the bare charge. This is referred to as Källén screening. Therefore, in QED one has $Z_3 < 1 \le Z^{-1}$ as one should. In contrast, asymptotic freedom corresponds to the scaling limit $g_0 \to 0$. Therefore, for $\gamma > 0$ one has $Z_3 \to \infty$, and from Eq. (29) one thus concludes that the spectral density cannot be positive. In perturbative QCD in Landau gauge one has $1 > \gamma > 0$ for $N_f < 10$ quark flavours. Then, $Z_3^{-1} \to 0$ leads to the Oehme–Zimmermann superconvergence relation [39].
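Its generalisation to the complete class of linear covariant gauges leads to the following form of the spectral sum rule for the transverse gluon propagator [46,47,114]:

\[ Z + \int_{m^2}^\infty d\kappa^2\, \tilde\rho(\kappa^2) = \frac{\xi}{\xi_0} = \begin{cases} 0 & \text{for } \xi \le 0, \\ \xi/\xi_0 & \text{for } \xi > 0. \end{cases} \tag{31} \]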
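For the Landau gauge case the step from Eqs. (29) and (30) to the superconvergence relation can be spelled out as follows. This is only a sketch of the reasoning given in the text; it additionally assumes that the single-particle normalisation $Z$ is non-negative and that the limit $g_0 \to 0$ may be taken at fixed renormalised coupling $g$.

```latex
\begin{align*}
  Z + \int_{m^2}^{\infty} d\kappa^2\, \tilde\rho(\kappa^2)
    &= Z_3^{-1}
     = \left( \frac{g_0^2}{g^2} \right)^{\gamma}
     && \text{(Eqs. (29) and (30))} \\
    &\longrightarrow 0
     && \text{for } g_0 \to 0,\ \gamma > 0 , \\[4pt]
  \Longrightarrow \quad
  \int_{m^2}^{\infty} d\kappa^2\, \tilde\rho(\kappa^2)
    &= -Z \le 0
     && \text{(assuming } Z \ge 0\text{)} .
\end{align*}
```

Under these assumptions the integral of $\tilde\rho$ cannot be positive, so $\tilde\rho(\kappa^2)$ must take negative values somewhere unless both $Z$ and $\tilde\rho$ vanish identically.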



Positivity of the gluon spectral density in Landau gauge is thus apparently in contradiction with antiscreening. As a result of this, it was concluded that positivity for gauge-boson fields is indeed violated in gauge theories with $Z_3^{-1} \to 0$ [114]. This has been interpreted as a manifestation of confinement from asymptotic freedom and unbroken BRS invariance, since the existence of a semidefinite physical space of transverse gluon states obtained after projecting out longitudinal gluon and ghost degrees of freedom would imply that $\rho(\kappa^2) > 0$ [46,40,41]. The significance of the global BRS charge structure here, to describe confinement in QCD on the one hand versus the Higgs mechanism in the standard model of electroweak interactions on the other, will be discussed in the next section. In order to look into the origin of the superconvergence relation, we recall that the renormalised gluon propagator in linear covariant gauges (explicitly including the dependence on the renormalisation scale $\mu$ for the moment) has the general structure,

\[ D_{\mu\nu}(k,\mu) = \left( \delta_{\mu\nu} - \frac{k_\mu k_\nu}{k^2} \right) \frac{Z(k^2,\mu^2)}{k^2} + \xi\, \frac{k_\mu k_\nu}{k^4}. \tag{32} \]

The subtlety of the argument can be made a little more explicit by considering the gluon renormalisation function $Z(k^2,\mu^2)$, which depends on the invariant momentum $k^2$, the scale $\mu$ and the gauge parameter $\xi$. In a perturbative momentum subtraction scheme in Landau gauge (i.e., $\xi = 0$) one obtains for sufficiently large $\mu$ its leading logarithmic behaviour to be (also compare Section 5.3.3),

\[ Z(k^2,\mu^2) = \left( \frac{\bar{g}^2(t_k, g)}{g^2} \right)^{\gamma} = Z_3^{-1}(\mu^2, k^2) \to 0 \quad \text{for } k^2 \to \infty \tag{33} \]

with a positive anomalous dimension $\gamma$ (for $N_f < 10$). Here, $\bar{g}^2(t_k, g)$ is the one-loop running coupling with $t_k = \frac{1}{2}\ln(k^2/\mu^2)$ and $\bar{g}^2(0, g) = g^2$. $Z_3(\mu^2, \Lambda^2)$ is the multiplicative constant of finite renormalisation group transformations. Depending on the details of the regularisation scheme it is related to the gluon field renormalisation constant essentially by $Z_3(\mu^2, \Lambda^2) \to Z_3$ for $\Lambda \to \infty$. The spectral representation of the gluon propagator, on the other hand, leads to

\[ Z(k^2,\mu^2) = \int_0^\infty dm^2\, \frac{k^2}{k^2 + m^2}\, \rho(m^2, \mu^2, g), \tag{34} \]

where the dependence of the spectral function $\rho$ on $\mu^2$ and $g$ was made explicit again (the pair $(g, \mu)$ really represents only one parameter, of course). The renormalisation condition of the momentum subtraction scheme fixes the gluon propagator to the tree-level one at a sufficiently large space-like renormalisation point $k^2 = \mu^2$,

\[ Z(\mu^2,\mu^2) = 1 = \int_0^\infty dm^2\, \frac{\mu^2}{\mu^2 + m^2}\, \rho(m^2, \mu^2, g) \to \int_0^\infty dm^2\, \rho(m^2) \quad \text{for } \mu^2 \to \infty. \tag{35} \]

Comparing Eq. (35) to the limit in (33) one thus realises that

\[ \int_0^\infty dm^2\, \rho(m^2, \infty, 0) = 1, \tag{36} \]

as expected for the free theory, whereas from Eq. (34) for $k^2 \to \infty$ one obtains

\[ \int_0^\infty dm^2\, \rho(m^2, \mu^2 < \infty, g > 0) = 0. \tag{37} \]
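The two limits behind Eqs. (36) and (37) can be illustrated with the leading-logarithmic form of Eq. (33). The sketch below assumes the standard one-loop running coupling $\bar{g}^2(t,g) = g^2/(1 + 2\beta_0 g^2 t)$. The values $\beta_0 = 11/(16\pi^2)$ and $\gamma = 13/22$ are the usual one-loop numbers for pure SU(3) gauge theory in Landau gauge; they are inserted here as assumptions for illustration and are not quoted from this section. Function names are hypothetical.

```python
import math

# Assumed one-loop inputs for pure SU(3) (N_f = 0), not taken from the text.
BETA0 = 11.0 / (16.0 * math.pi ** 2)
GAMMA = 13.0 / 22.0


def running_coupling_sq(t, g_sq, beta0=BETA0):
    """One-loop running coupling with gbar^2(0, g) = g^2."""
    denominator = 1.0 + 2.0 * beta0 * g_sq * t
    if denominator <= 0.0:
        raise ValueError("one-loop formula hits its Landau pole; use k^2 >= mu^2")
    return g_sq / denominator


def gluon_renormalisation_function(k_sq, mu_sq, g_sq, beta0=BETA0, gamma=GAMMA):
    """Leading-log Z(k^2, mu^2) = (gbar^2(t_k, g) / g^2)^gamma, cf. Eq. (33)."""
    t_k = 0.5 * math.log(k_sq / mu_sq)
    return (running_coupling_sq(t_k, g_sq, beta0) / g_sq) ** gamma


if __name__ == "__main__":
    mu_sq, g_sq = 1.0, 1.0
    # Fixed finite coupling: Z decreases (logarithmically slowly) towards zero.
    for k_sq in (1.0, 1.0e2, 1.0e10, 1.0e100, 1.0e300):
        print(k_sq, gluon_renormalisation_function(k_sq, mu_sq, g_sq))
    # Coupling sent to zero first: Z stays at its free value for any fixed k^2.
    print(gluon_renormalisation_function(1.0e100, mu_sq, 1.0e-12))
```

Taking $k^2 \to \infty$ at fixed $g > 0$ drives $Z$ to zero, while taking $g \to 0$ first leaves $Z = 1$ for every $k^2$; in this simplified form the two limits do not commute, which is the content of Eqs. (36) and (37).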



Note that this last limit results by choosing a strictly finite renormalisation point $\mu^2$ and employing the limit $k^2/\mu^2 \to \infty$. This thus demonstrates explicitly that it is not possible to renormalise the interacting theory at a strictly finite scale, and a small but finite coupling $g$, to the free theory (corresponding to $g \equiv 0$). The superconvergence relation might therefore be interpreted as a reincarnation of Haag’s theorem. The free theory and the interacting theory are inequivalent no matter how small the coupling is. The contradiction with positivity from the superconvergence relation could be avoided at this stage by supplying the definition of the asymptotic subtraction scheme with an implicit limit $\mu \to \infty$,

\[ \lim_{\mu^2 \to \infty} D(k)\Big|_{k^2 = \mu^2} = D(k)^{\text{tree-level}}. \tag{38} \]

In practical applications of the renormalisation group this means that momenta larger than the renormalisation point can be considered; their rough order of magnitude, however, is bound by that of the renormalisation scale (with the momentum dependence for $k^2 \sim \mu^2$ governed by the scaling fixed point). Ambiguities in the non-commuting limits $k^2 \to \infty$ and $\mu^2 \to \infty$, such as those leading to Eqs. (36) versus (37), arise if momenta are taken to infinity relative to the subtraction point. This same spirit of renormalising the interacting theory to the free one asymptotically was actually adopted previously also for the quark propagator [118]. In that context it turned out to be necessary in order to implement the quark-confinement mechanism of infrared slavery into the asymptotically free theory. The alert reader will have noticed that the Landau gauge considered so far is exceptional ($\xi = 0$), and that for other possible choices of the gauge parameter in Eq. (31) the superconvergence relation might not rule out positivity anyway. The generalisation of the Oehme–Zimmermann argument to the whole family of linear covariant gauges is less obvious. It is, in fact, long known that QCD in the covariant gauge has an ultraviolet fixed point in the $(g^2, \xi)$ plane at a finite positive value of the gauge parameter, $(0, \xi_0)$ with $\xi_0 = 13/3 - 4N_f/9$, see, e.g., Ref. [9]. Therefore, naively one might think that with $\xi \to \xi_0$ the spectral sum rule of the free theory (28) is recovered. A more detailed analysis shows, however, that the gluon spectral function $\rho(k^2)$ is negative for sufficiently large $k^2$ also in this case. Assuming that the only singularities of the gluon propagator lie on the time-like real axis in the complex $k^2$-plane, one can still show that the discontinuity at the cut behaves asymptotically as [46,47]

\[ \rho(k^2) \equiv \rho(k^2, g^2, \mu^2, \xi) \to -\gamma\, C_R(g^2, \xi)\, \frac{1}{k^2} \left( \ln\frac{k^2}{\mu^2} \right)^{-\gamma-1} \quad \text{for } k^2 \to \infty. \tag{39} \]

Here, $\gamma$ is the same positive (for $N_f < 10$) anomalous dimension of the gluon field and $C_R(g^2, \xi)$ some positive constant. Therefore, $\rho(k^2)$ is shown not to be positive also in the general linear covariant gauges. Analogous results from the renormalisation group analysis employing analyticity in the cut complex $k^2$-plane exist also for the ghost and the quark propagator for asymptotically large but complex $k^2$, see Ref. [119]. While this demonstrates the violation of positivity of transverse gluons independent of the spectral sum rule in Eq. (31), and already at the level of perturbation theory, by itself it is of course not sufficient to yield confinement. It can serve to demonstrate that the massless transverse gluon states of perturbation theory have



to belong to unphysical BRS quartets (see the next section). For the realisation of confinement it is necessary in addition that there is a mass gap in the transverse gluon correlations, i.e., that the massless one-gluon pole of perturbation theory is screened non-perturbatively. Only then can the Kugo–Ojima confinement criterion establish the equivalence of BRS singlets with colour singlets by requiring the global colour symmetry to be unbroken. This will be discussed next. Note, however, that the requirement of an unbroken global gauge symmetry, i.e., the absence of both physical and unphysical massless states from the spectrum of the global gauge current, is a necessary condition in the derivation of the superconvergence relations discussed in this section [41]. This condition is violated in models with Higgs mechanism, which prevents one from concluding a positivity violation of the massive physical vector-states in the transverse gauge-boson correlations in that case. The non-positivity of the gluon spectral density will be discussed again in Section 5.3.4. There we collect the present evidence for its violation from two sources of non-perturbative results, from lattice simulations of the gluon propagator in the Landau gauge, and from the solutions to truncated Dyson–Schwinger equations. These two kinds of non-perturbative results furthermore both agree in indicating that no massless one-particle pole exists in the transverse gluon correlations.

2.4. Description of confinement in the linear covariant gauge

Covariant quantum theories of gauge fields require indefinite metric spaces. This implies some modifications to the standard (axiomatic) framework of quantum field theory. Modifications are also necessary to accommodate confinement in QCD. These seem to be given by the choice of either relaxing the principle of locality or abandoning the positivity of the representation space.
The much stronger of the two principles being locality, non-local descriptions (see Section 2.5) have received far less attention than local ones. Great emphasis has therefore been put on the idea of relating confinement to the violation of positivity in QCD. Just as in QED, where the Gupta–Bleuler prescription is to enforce the Lorentz condition on physical states, a semidefinite physical subspace can be defined as the kernel of an operator. The physical states then correspond to equivalence classes of states in this subspace differing by zero norm components. Besides transverse photons, covariance implies the existence of longitudinal and scalar photons in QED. The latter two form metric partners in the indefinite space. The Lorentz condition eliminates half of these, leaving unpaired states of zero norm which do not contribute to observables. Since the Lorentz condition commutes with the S-matrix, physical states scatter into physical ones exclusively. Colour confinement in QCD is ascribed to an analogous mechanism: No coloured states should be present in the positive definite space of physical states defined by some suitable condition maintaining physical S-matrix unitarity. A comprehensive and detailed account of most of the material summarised in this section can be found in the textbook by Nakanishi and Ojima [36]. Here, we briefly recall those of the general concepts that relate to some of the results presented in Chapter 5. In particular, we would like to emphasise the following three aspects: positivity violations of transverse gluon and quark states, the Kugo–Ojima confinement criterion, and the conditions necessary for a failure of the cluster decomposition. We describe each of these in the next three subsections, and we will find that the results of



Chapter 5 nicely fit into these general considerations, which thus together lead to a quite coherent, though certainly still somewhat incomplete, picture.

2.4.1. Representations of the BRS algebra and positivity

Within the framework of BRS algebra, in the simplest version for the BRS charge $Q_B$ and the ghost number $Q_c$ (both hermitian with respect to an indefinite inner product) given by

$$Q_B^2 = 0, \qquad [iQ_c, Q_B] = Q_B, \tag{40}$$

completeness of the nilpotent BRS charge $Q_B$ in a state space $\mathcal{V}$ of indefinite metric is assumed. This charge generates the BRS transformations ($\delta\Phi \equiv \lambda\,\delta_B\Phi$ with Grassmann parameter $\lambda$) of a generic field $\Phi$ by the ghost-number graded commutator,

$$\delta_B\Phi = \{iQ_B, \Phi\}, \tag{41}$$

i.e., by a commutator or anticommutator for fields $\Phi$ with even or odd ghost number $Q_c$, respectively. In the presence of ghost–antighost symmetry by Faddeev–Popov conjugation, this structure generalises to that for the semidirect product of the global $SL(2,\mathbb{R})$ with the double BRS invariance,$^{18}$ see Appendix A. The semidefinite physical subspace $\mathcal{V}_{\rm phys} = \mathrm{Ker}\,Q_B$ is defined on the basis of this algebra by those states which are annihilated by the BRS charge $Q_B$,

$$\mathcal{V}_{\rm phys} = \{|\psi\rangle \in \mathcal{V} : Q_B|\psi\rangle = 0\} = \mathrm{Ker}\,Q_B. \tag{42}$$

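Since $Q_B^2 = 0$, this subspace contains the space of so-called daughter states, which are images of others, their parent states in $\mathcal{V}$,

$$\mathrm{Im}\,Q_B = \{|\psi\rangle \in \mathcal{V} : |\psi\rangle = Q_B|\phi\rangle,\ |\phi\rangle \in \mathcal{V}\} \subset \mathcal{V}_{\rm phys}. \tag{43}$$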
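The kernel and image structure behind Eqs. (42) and (43) can be illustrated with a finite-dimensional toy model. The sketch below is not taken from the review: the 3×3 matrix `Q` is a hypothetical stand-in for the BRS charge, chosen only because it squares to zero, and the helper names `matmul` and `rank` are illustrative. It checks nilpotency, which is what forces the image to lie inside the kernel, and counts the dimensions of kernel, image and the quotient of the two.

```python
from fractions import Fraction

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def rank(m):
    """Rank by Gauss-Jordan elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

# Hypothetical toy "charge": maps the parent e2 to the daughter e1,
# and annihilates e1 and e3 (e3 plays the role of a singlet).
Q = [[0, 1, 0],
     [0, 0, 0],
     [0, 0, 0]]

nilpotent = all(x == 0 for row in matmul(Q, Q) for x in row)
dim_im = rank(Q)                    # dimension of Im Q
dim_ker = len(Q) - dim_im           # rank-nullity theorem
dim_quotient = dim_ker - dim_im     # dimension of Ker Q / Im Q

print(nilpotent, dim_ker, dim_im, dim_quotient)
```

In this toy case the daughter state spans the image, the kernel is spanned by daughter and singlet, and one dimension survives in the quotient. The indefinite metric, which is essential for the physical argument, is not modelled here.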

A physical Hilbert space is then obtained as (the completion of) the covariant space of equivalence classes, the BRS cohomology of states in the kernel modulo those in the image of $Q_B$,

$$\mathcal{H}(Q_B, \mathcal{V}) = \mathrm{Ker}\,Q_B / \mathrm{Im}\,Q_B \simeq \mathcal{V}_s, \tag{44}$$

which is isomorphic to the space $\mathcal{V}_s$ of BRS singlets. It is easy to see that the image is furthermore contained in the orthogonal complement of the kernel. Given completeness they are identical, $\mathrm{Im}\,Q_B = (\mathrm{Ker}\,Q_B)^\perp = \mathrm{Ker}\,Q_B \cap (\mathrm{Ker}\,Q_B)^\perp$, which is the isotropic subspace of $\mathcal{V}_{\rm phys}$. It follows that states in $\mathrm{Im}\,Q_B$, in the language of de Rham cohomology called BRS coboundaries, do not contribute to the inner product in $\mathcal{V}_{\rm phys}$. Completeness is thereby important in the proof of positivity for physical states [57,121,36], because it assures the absence of metric partners of BRS singlets, so-called "singlet pairs", which would otherwise jeopardise the proof.

18 Corresponding to an Inönü–Wigner contraction of an OSp(1,2) superalgebra, see Refs. [120,36].

With completeness, all states in $\mathcal{V}$ can be shown to be either BRS singlets in $\mathcal{V}_s$ or to belong to so-called quartets, which are metric-partner pairs of BRS doublets (of parent with daughter states), and this exhausts all possibilities. The generalisation of the Gupta–Bleuler condition on physical states, $Q_B|\psi\rangle = 0$ in $\mathcal{V}_{\rm phys}$, eliminates half of these metric partners, leaving unpaired states of zero norm (in the isotropic subspace of $\mathcal{V}_{\rm phys}$) which do not contribute to any observable. This essentially is the quartet mechanism: just as in QED, one such quartet, the elementary quartet, is formed by the massless asymptotic states of longitudinal and time-like gluons together with ghosts and antighosts, which are thus all unobservable. In contrast to QED, however, one expects the quartet mechanism also to apply to transverse gluon and quark states, as far as they exist asymptotically. A violation of positivity for such states then entails that they have to be unobservable also. The combined evidence for this, as collected in the present review, provides strong indication in favour of such a violation for possible transverse gluon states.

The members of quartets are frequently said to be confined kinematically. This is no comprehensive explanation of confinement, of course, but one aspect (among others, as we shall describe below) of its description within the covariant operator formulation [36]. In particular, asymptotic transverse gluon and quark states may or may not exist in the indefinite metric space $\mathcal{V}$. If either of them does exist and the Kugo–Ojima criterion is realised (see below), they belong to unobservable quartets. In that case, the BRS transformations of their asymptotic fields entail that they form these quartets together with ghost–gluon and/or ghost–quark bound states, respectively, see Section 4.4.3 in [36]. We reiterate that it is furthermore crucial for confinement, however, to have a mass gap in transverse gluon correlations, i.e., the massless transverse gluon states of perturbation theory have to disappear (even though they should belong to quartets due to superconvergence in asymptotically free and local theories, see the discussion at the end of Section 2.3).

Before we continue we add two brief remarks.
The BRS construction of the physical state space sketched above is endowed with a quantum mechanical interpretation in terms of transition probabilities and measurements as expectation values of observables. A necessary and sufficient condition on a (smeared local) operator$^{19}$ $A$ is that the isotropic subspace of zero-norm states does not affect its expectation values in $\mathcal{V}_{\rm phys}$ [122,36], i.e.,

$$\delta_B A = \{iQ_B, A\} = 0. \tag{45}$$

$A$ is then called a (smeared local) observable in the present context, thereby slightly generalising the usual notion of an observable (by self-adjointness). It then follows that for all states generated from the vacuum | [...]

[...]

$$\chi(x_1,x_2;P) := \langle 0|T\{\Phi(x_1)\Phi(x_2)\}|P\rangle, \qquad \bar\chi(x_1,x_2;P) := \langle 0|T\{\Phi^\dagger(x_1)\Phi^\dagger(x_2)\}|P\rangle,$$

together with their Fourier transforms

$$\chi(x_1,x_2;P) =: e^{-iPX}\int\frac{d^4p}{(2\pi)^4}\, e^{-ipx}\,\chi(p,P), \tag{242}$$

$$\bar\chi(x_1,x_2;P) =: e^{+iPX}\int\frac{d^4p}{(2\pi)^4}\, e^{-ipx}\,\bar\chi(p,P). \tag{243}$$

Hereby $|0\rangle$ denotes the ground state (vacuum) and $|P\rangle$ the bound state. Close to the pole the regular terms can safely be neglected and the dependence of the 4-point function on the relative momenta $p$ and $p'$ can be separated. Expanding $G^{(4)}$ and $[D - K]$ in the inhomogeneous Bethe–Salpeter equation in powers of $(P^0 - \omega)$ yields the homogeneous Bethe–Salpeter equation and the normalisation condition for the amplitude. The order $(P^0 - \omega)^{-1}$ provides

$$\int\frac{d^4p'}{(2\pi)^4}\,\big[D(p,p';P_{os}) + K(p,p';P_{os})\big]\,\chi(p',P_{os}) = 0, \tag{244}$$

whereas to $O((P^0 - \omega)^0)$ one obtains

$$\int\frac{d^4p}{(2\pi)^4}\int\frac{d^4p'}{(2\pi)^4}\,\mathrm{tr}\left[\bar\chi(p,P_{os})\,\frac{\partial}{\partial P^0}\big(D(p,p';P) + K(p,p';P)\big)\Big|_{P^0=\omega}\,\chi(p',P_{os})\right] = 2i\omega. \tag{245}$$

This ensures the residue to be equal to 1 at the bound state pole. The homogeneous Bethe–Salpeter equation (244) is a linear integral equation for the amplitude $\chi$ whose overall normalisation is fixed by (245). Approximating the kernel by the one-boson exchange depicted in the first diagram of Fig. 6, Eq. (244) can be cast into an eigenvalue problem for the coupling constant by using the vertex function $\Gamma(p_1,p_2)$ instead of the amplitude:

$$\chi(p,P) =: G_1(p_1)\,G_2(p_2)\,\Gamma(p_1,p_2). \tag{246}$$

In the context of BS equations it is advantageous to use the total and relative momenta, see Eq. (238), as arguments of the vertex functions, $\Gamma(p_1,p_2) \to \Gamma(p,P)$. The homogeneous BS equation (244) in terms of the vertex function then reads

$$\Gamma(p,P_{os}) = -\int\frac{d^4p'}{(2\pi)^4}\,K(p,p';P_{os})\,G_1(p_1')\,G_2(p_2')\,\Gamma(p',P_{os}). \tag{247}$$

Fig. 23. Pictorial representation of the homogeneous Bethe–Salpeter equation in the ladder approximation.

In the ladder approximation the kernel $K$ is essentially given by the propagator of the exchanged particle multiplied by one coupling constant $g$ for each vertex, i.e., by $g^2$. On inspection one finds that (247) is an eigenvalue problem for $g^2$, if $G_1$ and $G_2$ are the bare propagators of the constituents. The ladder approximation to the BS equation is pictorially represented in Fig. 23. If a parameter pair $(g^2, P^0 = M)$ exists, the pole assumption is justified a posteriori and $M$ is the bound state mass, with $\chi$ being the corresponding amplitude (wave-function), as can be inferred from Eq. (241), which reflects, of course, nothing else than the Lehmann representation of the 4-point function.

The description of mesons, and especially that of pions, requires the use of a generalised ladder approximation: the intrinsically non-perturbative nature of bound-state problems, and the complex structure of the QCD vacuum, necessitate that one employs non-perturbative gluon and quark propagators in the BS equation kernel. There have been many studies of meson spectroscopy using this framework; summaries can be found in Refs. [28–30]. Typically, such studies employ an Ansatz for a dressed gluon propagator in solving a rainbow-approximate quark DSE, and then pair the input gluon propagator with the calculated quark propagator to construct the non-perturbatively dressed kernel for the meson BS equation in ladder approximation. The resulting BS equation is then solved to obtain the spectrum. This rainbow-ladder truncation scheme has the feature that Goldstone's theorem is manifest; i.e., in the chiral limit, when the current quark mass $m_q = 0$, the pion is a zero-mass bound state in strongly dressed quark–antiquark correlations [292]. As we will see in the following, with few-parameter models for the gluon propagator, this can be used to provide fair descriptions of the light–light, light–heavy and heavy–heavy meson spectra and decays.
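The statement that the ladder equation is an eigenvalue problem for $g^2$ can be made concrete with a heavily simplified numerical sketch. Everything in it is an assumption made for illustration and not the calculation of the review: the momenta are one-dimensional and Euclidean, the exchange kernel and bare propagators are toy functions, the overall sign of Eq. (247) is absorbed into the kernel, and the names `ladder_matrix` and `largest_eigenpair` are hypothetical. Writing the discretised equation as $\Gamma = g^2 M\,\Gamma$, a solution exists when $1/g^2$ is an eigenvalue of $M$; the largest eigenvalue then gives the smallest such coupling.

```python
import math

def ladder_matrix(n=40, p_max=10.0, m=1.0, mu=0.5):
    """Toy 1D discretisation: M_ij = h * K(p_i, k_j) * G(k_j) * G(k_j)."""
    h = 2.0 * p_max / n
    grid = [-p_max + (i + 0.5) * h for i in range(n)]  # midpoint rule

    def exchange(p, k):
        # Toy exchange propagator with mass mu (coupling g^2 factored out).
        return 1.0 / ((p - k) ** 2 + mu ** 2)

    def bare(k):
        # Toy bare constituent propagator with mass m.
        return 1.0 / (k ** 2 + m ** 2)

    return [[h * exchange(p, k) * bare(k) * bare(k) for k in grid] for p in grid]

def largest_eigenpair(mat, iterations=500):
    """Power iteration; adequate here because all matrix entries are positive."""
    n = len(mat)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(mat[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(x * x for x in w))
        v = [x / lam for x in w]
    return lam, v

M_toy = ladder_matrix()
lam, gamma = largest_eigenpair(M_toy)
g2 = 1.0 / lam  # smallest coupling for which the toy equation has a solution

print(f"largest eigenvalue of M: {lam:.6f}, critical g^2: {g2:.6f}")
```

In a realistic calculation the matrix would additionally depend on the bound state momentum $P$, and one would scan $P^0 = M$ until the eigenvalue condition is met for the physical coupling.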
In order to simplify notations we introduce "multiple indices" $E = \{i_c, i_f, i_D\}$ associated with the colour, the flavour and the Dirac structure of an amplitude. The homogeneous BS equation for the vertex function $\Gamma_M$, the subscript $M$ denoting meson, then reads

$$\Gamma_M^{EF}(p,P) = \int\frac{d^4k}{(2\pi)^4}\,K_M^{EF;GH}(k,p;P)\left[S\big(k + \tfrac12 P\big)\,\Gamma_M(k,P)\,S\big(k - \tfrac12 P\big)\right]^{GH}, \tag{248}$$

where we have symmetrised the momenta of the quark legs for reasons which will become clear soon. The rainbow approximation is obtained from inserting a bare quark–gluon vertex ($\Gamma_\mu(k,p) \equiv \gamma_\mu$) into the quark DSE. Corresponding to this, the generalised ladder approximation consists of employing

$$K_M^{EF;GH}(k,p;P)\left(S\big(k + \tfrac12 P\big)\,\Gamma_M(k,P)\,S\big(k - \tfrac12 P\big)\right)^{GH} \equiv -g^2 D_{\mu\nu}(p-k)\left(\gamma_\mu\frac{\lambda^a}{2}\,S\big(k + \tfrac12 P\big)\,\Gamma_M(k,P)\,S\big(k - \tfrac12 P\big)\,\gamma_\nu\frac{\lambda^a}{2}\right)^{EF} \tag{249}$$

for the kernel in Eq. (248). This form of the kernel, the dressed-ladder gluon exchange combined with the solution to the rainbow quark DSE, preserves the Goldstone boson character of the pion. One observes that, in the chiral limit $m_q = 0$, the meson BS equation (248) is obtained from the rainbow DSE for the quark self-energy (defined by $S^{-1}(p) =: i\gamma\cdot p + \Sigma(p)$),

$$\Sigma(p) = m_q + \frac43\, g^2\int\frac{d^4k}{(2\pi)^4}\,\gamma_\mu S(k)\gamma_\nu\,D_{\mu\nu}(p-k), \tag{250}$$

upon replacing

$$\gamma_\mu S(k)\gamma_\nu \to \gamma_\mu S(k + P/2)\,\Gamma_M(k,P)\,S(k - P/2)\,\gamma_\nu. \tag{251}$$

Straightforward algebraic manipulations then reveal that the BS equation in the pseudoscalar channel for $P = 0$ is identical to the equation for the scalar quark self-energy, i.e., to its chirally non-invariant dynamical contribution [292].$^{65}$ The derivation of this can equivalently be based on the observation that in the chiral limit the vertex function of a pion, with flavour index $a$ and vanishing momentum $P = 0$, must be proportional to the result of an infinitesimal chiral rotation of the quark self-energy,

$$\Gamma^a(p,P=0) \propto -i\,\frac{d}{d\alpha_a}\left(e^{i\alpha_b\frac12\tau^b\gamma_5}\,\Sigma(p)\,e^{i\alpha_c\frac12\tau^c\gamma_5}\right)_{\alpha_a=0} = \frac12\,\tau^a\{\gamma_5,\Sigma(p)\}. \tag{252}$$

In the chiral limit the dynamical breaking of chiral symmetry is therefore always accompanied by a massless bound state in the pion channel within the rainbow-ladder scheme. Before we discuss the solutions of the ladder BS equations for the ground state mesons, two comments on general properties are, however, in order.

6.1.2. Solutions of the ladder Bethe–Salpeter equation in Minkowski space

A Euclidean formulation is used throughout this review which, as described in the beginning of Section 2, has its justification in the fact that the domain of holomorphy for the Green's functions of a quantum field theory (fulfilling the usual axioms) allows a complex extension of the Euclidean space. The on-shell momenta of bound states introduced in the last chapter are time-like and have to be represented as complex four-vectors in an analytically continued Euclidean formulation. While this is justified formally, at least for quantum field theories without complications like confinement, there are inherent practical difficulties, however. Looking at Eq. (248) one realises that the quark propagators have to be known in a parabolic region of the complex $p^2$-plane. This region contains the positive real (half-)axis of space-like $p^2$, and it extends to $p^2 = -M^2/4$ on the negative real axis with its boundary intersecting the

65 An extension of this constructive way to preserve the Goldstone boson character of the pion will be presented in Section 7.1.1.



Fig. 24. Plot of the complex $p^2$ plane. The interior of the parabola shows the subset of the complex plane where the BS equation 'probes' the propagators of the constituents. $M$ is the mass of the bound state and $\eta_P$ is the momentum partitioning parameter.

imaginary axis at $p^2 = \pm iM^2/2$, see Fig. 24. (This does, of course, depend on the momentum partitioning parameter $\eta_P$; e.g., its extent into the time-like region is $p^2 = -M^2\eta_P^2$ for one and $p^2 = -M^2(1-\eta_P)^2$ for the other constituent, such that $\eta_P = 1/2$ is the best choice to minimise this complex domain for both quark propagators.) If the kernel of the quark DSE, i.e., in rainbow approximation the gluon propagator, is a known analytic function in this parabolic region, it is possible to solve the BS equation without further approximation, see Section 6.2 for a discussion of such calculations. If, however, the kernel of the quark DSE is known only numerically at certain momenta, e.g., on the space-like $p^2$-axis, or if singularities occur in this parabolic region (note that it always contains the point $p^2 = 0$, for example), or if both are the case, a reliable numerical evaluation of the integration kernel in the BS equation is virtually impossible. Thus, a solution of the DSE directly in Minkowski space might provide some new insight.

Rewriting the integrals in the fermion DSE in rainbow approximation with the help of dispersion relations, it is possible to solve this equation directly in Minkowski space [293]. However, for the extraction of the imaginary part it has been a necessary prerequisite that the analytic structure of the kernel is known explicitly. In Ref. [293] either a bare gauge boson propagator or a bare gauge boson propagator multiplied by a logarithmically running coupling has been used. Of course, the results obtained are in accordance with those obtained from the Euclidean DSE. Another Minkowski space study of DSEs employs the perturbation theory integral representation (see below) in scalar $\phi^3$ theory [294]. This work is, however, still unpublished. In Refs. [295,296] the ladder BS equation for a scalar–scalar bound state with scalar exchange has been solved in Minkowski space using the perturbation theory integral representation [297].
This representation is an extension of the spectral representation for 2-point Green’s functions. Hereby, for the relatively simple kernels tested so far, the results for bare and dressed ladder kernels are in complete agreement with the results obtained from an Euclidean approach and Wick rotation. The main advantage of this method would, however, become vital when using non-ladder kernels. Note that beyond ladder approximation the naive Wick rotation is not possible, i.e., one has to choose much more complicated integration contours. Furthermore, the



method is quite general and it is also applicable if a Wick rotation is not possible at all. In these cases, too, it could provide the vertex function and the amplitude for the entire range of allowed momenta. Therefore, progress in this direction would be highly desirable.

6.1.3. (In-)consistency of the relativistic description of excited states

At first sight the BS equation seems very well suited to describe excited states. Interpreting the spectrum of the homogeneous BS equation is, however, far from trivial [298]. Common to the analytical$^{66}$ and the numerical solutions is the existence of abnormal states, which have led to controversial discussions regarding their physical interpretation [299–301]. Even worse, for the case of constituents with unequal masses some eigenvalues of the homogeneous Bethe–Salpeter equation become complex [302,303]. Clearly, such a behaviour is unexpected and has to be understood. It has usually been attributed to the use of the ladder approximation [299], which destroys crossing symmetry from the very beginning. This conjecture in its strict form is, however, refuted: going beyond ladder approximation and employing also crossed ladder exchanges, the abnormal states still exist [304]. In this section we will provide evidence that the use of a dressed ladder kernel is absolutely required if one wants to interpret the spectrum of the BS equation [298].

The abnormal solutions are "excitations in relative time". They will obviously not appear in a purely non-relativistic treatment where the constituents are considered for equal times only.$^{67}$ In the Wick–Cutkosky model [299,300] with constituents of equal masses $m_1 = m_2 = m$ these abnormal states are easily identified: they only exist for certain values of the coupling constant ($\lambda := g^2/(16\pi^2 m^2) > \lambda_c = 1/4$). If the binding energy becomes very small, the corresponding coupling constant $\lambda$ vanishes for the normal solutions, i.e., $\lambda \to 0$, whereas $\lambda \to \lambda_c = 1/4$ for the abnormal states.
The latter behaviour is completely unexpected: a vanishing binding energy should be related to no coupling at all. It can be shown that in this model the abnormal solutions possess nodes when plotted as functions of the relative time. For all normal solutions, including the ground state, there are no nodes in relative time. It has to be noted, however, that for a general BS equation there is no known method to identify abnormal states.

The appearance of complex eigenvalues is related to a crossing of an abnormal with a normal (or abnormal) state [298]. Therefore, this problem is related to the existence of abnormal states. It occurs for a wide range of parameters. Increasing the mass of the exchange particle, the higher lying eigenvalues tend to become real again. This can be understood from the fact that for an infinitely heavy exchange particle the BS equation assumes an O(4) symmetric form as in the case of a massless exchange particle, see also Table 1, which summarises the symmetries of the scalar ladder BS equation. It is interesting to note that the ladder BS spectrum of QED also shows such a phenomenon [298]. First, one has to note that for QED in Feynman gauge there

66 The only analytically solvable example of a BS equation is the one for two (massive) scalar particles bound by the ladder approximation to the exchange of a massless scalar field [299–301]. Despite its relative simplicity as compared to realistic systems, this model, the Wick–Cutkosky model, already displays the advantages (e.g., full covariance) as well as the shortcomings (e.g., the existence of abnormal states) inherent to almost all Bethe–Salpeter based approaches used until today.

67 However, not all abnormal solutions necessarily vanish in a three-dimensional reduction of the BS equation [305]. On the contrary, the spectrum of a three-dimensionally reduced equation will contain remnants of these abnormal states.



Table 1
Summary of the symmetries of the scalar BS equation in ladder approximation. denotes the mass of the exchange particle and $M$ is the mass of the bound state. The functions $Z$ (or $Y$) denote the spherical harmonics for the corresponding n-sphere.

[...]

$$\begin{pmatrix}\Phi^5\\ \Phi^\mu\end{pmatrix} = S^{-1}\begin{pmatrix} D^{-1} & 0\\ 0 & (D^{\mu\nu})^{-1}\end{pmatrix}\begin{pmatrix}\Psi^5\\ \Psi^\nu\end{pmatrix}. \tag{322}$$

The diquark propagators $D$ and $D^{\mu\nu}$ are given in Eqs. (318), (319), and the quark propagator $S$ in Eq. (320). The coupled system of BS equations for the nucleon amplitudes or their vertex functions can be written in the following compact form:

$$\int\frac{d^4p'}{(2\pi)^4}\,G^{-1}(p,p',P)\begin{pmatrix}\Psi^5\\ \Psi^{\mu'}\end{pmatrix}(p',P) = 0, \tag{323}$$

in which $G^{-1}(p,p',P)$ is the inverse of the full quark–diquark 4-point function. It is the sum of the disconnected part and the interaction kernel. The latter results from the reduction of the Faddeev equation for separable quark–quark correlations. It describes the exchange of the quark with one of those in the diquark, which is necessary to implement Pauli's principle in the baryon, i.e., it describes the minimal dynamical coupling necessary to account for the full exchange symmetry in the quark–diquark model [366]. Due to the overall colour antisymmetry of the baryon (being a colour singlet), the other quantum numbers have to be symmetrised, leading to Pauli attraction instead of the Pauli repulsion familiar from most ordinary fermionic many-body systems. Taking into account the coupled channel nature of scalar and axial vector



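Fig. 32. The coupled set of BS equations for the vertex functions $\Phi$. (Adapted from Ref. [385].)

diquark contributions within the nucleon one obtains

$$G^{-1}(p,p',P) = (2\pi)^4\delta^4(p-p')\,S^{-1}(p_q)\begin{pmatrix} D^{-1}(p_d) & 0\\ 0 & (D^{\mu'\mu})^{-1}(p_d)\end{pmatrix} - \frac12\begin{pmatrix} -\chi(p_2^2)\,S^T(q)\,\bar\chi(p_1^2) & \sqrt3\,\chi^{\mu'}(p_2^2)\,S^T(q)\,\bar\chi(p_1^2)\\ \sqrt3\,\chi(p_2^2)\,S^T(q)\,\bar\chi^{\mu}(p_1^2) & \chi^{\mu'}(p_2^2)\,S^T(q)\,\bar\chi^{\mu}(p_1^2)\end{pmatrix}. \tag{324}$$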

The flavour and colour factors have been taken into account explicitly, and $\chi$, $\chi^\mu$ stand for the Dirac structures of the diquark–quark vertices. The freedom to partition the total momentum between quark and diquark introduces the parameter $\eta \in [0,1]$ with $p_q = \eta P + p$ and $p_d = (1-\eta)P - p$ as usual. The momentum of the exchanged quark is then given by $q = -p - p' + (1-2\eta)P$. The relative momenta of the quarks in the diquark vertices $\chi$ and $\bar\chi$ are $p_2 = p + p'/2 - (1-3\eta)P/2$ and $p_1 = p/2 + p' - (1-3\eta)P/2$, respectively. Invariance under four-dimensional translations implies that for every solution $\Phi(p,P;\eta_1)$ of the BS equation there exists a family of solutions of the form $\Phi(p + (\eta_2-\eta_1)P, P; \eta_2)$. The corresponding BS equations are pictorially represented in Fig. 32.

The necessary presence of the total momentum $P$ of the baryonic bound state in the exchange kernel for $\eta \neq 1/2$ was apparently not taken into account in the studies of Refs. [379,380]. As stated in the last section, the quark exchange kernel of the reduced Bethe–Salpeter/Faddeev problem for baryons, for $\eta \neq 1/2$, necessarily depends on the total momentum $P$ of the baryonic bound state. Since this has important implications for the normalisations and charges of the bound state amplitudes, it is preferable to use the residual freedom in choosing the momentum partitionings in the relativistic bound state problem so as to keep the bound state momentum dependence of the exchange kernel to a necessary minimum. While the exchange quark momentum is found to be $P$-independent for $\eta = 1/2$, this choice, however, necessarily introduces $P$-dependence in the diquark amplitudes: the dominant momentum dependence of the diquark amplitudes is given by the scalars $x_1$ and $x_2$,

$$x_1 = -p_1^2 - (1-2\sigma)\big((1-\eta)\,p_1\cdot P - p_1\cdot k\big), \tag{325}$$

$$x_2 = -p_2^2 + (1-2\hat\sigma)\big((1-\eta)\,p_2\cdot P - p_2\cdot p\big). \tag{326}$$

These coincide with $p_{1,2}^2$ only for symmetric quark momentum partitionings, i.e., $\sigma = \hat\sigma = 1/2$, cf. the discussion at the end of the last section. These symmetrised arguments $x_{1,2}$ of the diquark amplitudes are independent of the total nucleon momentum only if $\sigma = \hat\sigma = \frac12$ and thus $\eta = \frac13$.
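As a short worked check of the two conditions just quoted, assume the relation $\sigma = \hat\sigma = (1-2\eta)/(1-\eta)$ between the partitioning parameters, which is the choice the text reports for the actual calculations. Requiring symmetric partitioning inside the diquark then fixes $\eta$, and the expression for the exchanged quark momentum shows what remains of the $P$-dependence:

$$\sigma = \frac{1-2\eta}{1-\eta} = \frac12 \;\Longrightarrow\; 2(1-2\eta) = 1-\eta \;\Longrightarrow\; \eta = \frac13, \qquad q\big|_{\eta=1/3} = -p - p' + \tfrac13 P.$$

At $\eta = 1/3$ the exchanged quark therefore still carries a fraction of the total momentum. Its momentum becomes $P$-independent only at $\eta = 1/2$, where the same relation would give $\sigma = 0$, so the two requirements cannot be met simultaneously under this assumption.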



This conclusion can be generalised [382]: the exchange symmetry of the diquark amplitudes suffices to show that these can generally be independent of $P$ only if $\eta = 1/3$ and $\sigma = \hat\sigma = 1/2$. This is the only choice leading to diquark amplitudes independent of the total nucleon bound state momentum $P$, and this follows from the exchange symmetry alone and is not a result of the particular parametrisations employed in the model calculation. In actual calculations the variable $\eta$ is varied around the value $\eta = 1/3$ [384]. The diquark momentum partitionings are fixed to $\sigma = \hat\sigma = (1-2\eta)/(1-\eta)$ for given $\eta$. While $P$-independent diquark amplitudes can be obtained only for the value $\eta = 1/3$, with this choice the exchange quark carries total momentum whenever $\eta \neq 1/2$. This entails that the exchange kernel of the reduced Bethe–Salpeter/Faddeev equation for baryons unavoidably depends on the total momentum of the baryonic bound state. This implies some considerable extensions to the calculations, e.g., of electromagnetic form factors, which become necessary with the inclusion of diquark sub-structure [382,384].

Using the positive energy projector with nucleon bound state mass $M_n$,

$$\Lambda^+ = \frac12\left(1 + \frac{\not{P}}{iM_n}\right), \tag{327}$$

the vertex functions can be decomposed into their most general Dirac structures,

$$\Phi^5(p,P) = \left(S_1 + \frac{i}{M_n}\not{p}\,S_2\right)\Lambda^+, \tag{328}$$

$$\Phi^\mu(p,P) = \frac{P^\mu}{iM_n}\left(A_1 + \frac{i}{M_n}\not{p}\,A_2\right)\gamma_5\Lambda^+ + \gamma^\mu\left(A_3 + \frac{i}{M_n}\not{p}\,A_4\right)\gamma_5\Lambda^+ + \frac{p^\mu}{iM_n}\left(A_5 + \frac{i}{M_n}\not{p}\,A_6\right)\gamma_5\Lambda^+. \tag{329}$$

In the rest frame of the nucleon, $P = (\mathbf{0}, iM_n)$, the unknown scalar functions $S_i$ and $A_i$ are functions of $p^2 = p^\mu p^\mu$ and of $P\cdot p$. Certain linear combinations of these eight covariant components then lead to a full partial wave decomposition, see the next section. The BS solutions are normalised by the canonical condition

$$M_n\Lambda^+ \stackrel{!}{=} -\int\frac{d^4p}{(2\pi)^4}\int\frac{d^4p'}{(2\pi)^4}\,\bar\Psi(p',P_n)\left[P^\mu\frac{\partial}{\partial P^\mu}G^{-1}(p',p,P)\right]_{P=P_n}\Psi(p,P_n). \tag{330}$$

The effective multi-spinor for the delta baryon representing the BS wave-function can be characterised as $\Psi_\Delta^{\mu\nu}(p,P)\,u^\nu(P)$, where $u^\nu(P)$ is a Rarita–Schwinger spinor. The momenta are defined analogously to the nucleon case. As the delta state is flavour symmetric, only the axial vector diquark contributes and, accordingly, the corresponding BS equation reads

$$\int\frac{d^4p'}{(2\pi)^4}\,G_\Delta^{-1}(p,p',P)\,\Psi_\Delta^{\mu'\nu}(p',P) = 0, \tag{331}$$

where the inverse quark–diquark propagator $G_\Delta^{-1}$ in the $\Delta$-channel is given by

$$G_\Delta^{-1}(p,p',P) = (2\pi)^4\delta^4(p-p')\,S^{-1}(p_q)\,(D^{\mu\mu'})^{-1}(p_d) + \chi^{\mu'}(p_2^2)\,S^T(q)\,\bar\chi^{\mu}(p_1^2). \tag{332}$$



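The general decomposition of the corresponding vertex function $\Phi_\Delta^{\mu\nu}$, obtained as in Eq. (322) by truncating the quark and diquark legs of the BS wave-function $\Psi_\Delta^{\mu\nu}$, reads

$$\Phi_\Delta^{\mu\nu}(p,P) = \left(D_1 + \frac{i}{M}\not{p}\,D_2\right)\Lambda^{\mu\nu} + \frac{P^\mu}{iM}\left(E_1 + \frac{i}{M}\not{p}\,E_2\right)\frac{p^{\lambda T}}{iM}\Lambda^{\lambda\nu} + \gamma^\mu\left(E_3 + \frac{i}{M}\not{p}\,E_4\right)\frac{p^{\lambda T}}{iM}\Lambda^{\lambda\nu} + \frac{p^\mu}{iM}\left(E_5 + \frac{i}{M}\not{p}\,E_6\right)\frac{p^{\lambda T}}{iM}\Lambda^{\lambda\nu}. \tag{333}$$

Here, $\Lambda^{\mu\nu}$ is the Rarita–Schwinger projector,

$$\Lambda^{\mu\nu} = \Lambda^+\left(\delta^{\mu\nu} - \frac13\gamma^\mu\gamma^\nu + \frac23\frac{P^\mu P^\nu}{M^2} - \frac{i}{3}\frac{P^\mu\gamma^\nu - P^\nu\gamma^\mu}{M}\right), \tag{334}$$

which obeys the constraints

$$P^\mu\Lambda^{\mu\nu} = \gamma^\mu\Lambda^{\mu\nu} = 0. \tag{335}$$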

Therefore, the only non-zero components arise from the contraction with the transverse relative momentum $p^{\mu T} = p^\mu - P^\mu (p\cdot P)/P^2$. The invariant functions $D_i$ and $E_i$ in Eq. (333) again depend on $p^2$ and $p\cdot P$. The partial wave decomposition in the rest frame is made explicit below.

Finally, we want to comment on an extension to three flavours. In the isospin limit the strange quark constituent mass is the only source of flavour symmetry breaking. The equations describing octet and decuplet baryons have been derived under the premises of flavour and spin conservation, i.e., only those wave function components with the same spin and flavour content couple to each other [381]. The flavour structure of the eight equations describing $N$, $\Lambda$, $\Sigma$, $\Xi$, $\Delta$, $\Sigma^*$, $\Xi^*$ and $\Omega$ can be found in Appendix A of Ref. [381]. The $\Lambda$-hyperon is hereby of special interest. Firstly, its measured polarisation asymmetry in the process $p\gamma \to K^+\Lambda$ will provide a stringent test for the diquark–quark model for time-like momenta, see below. As discussed in [391], there are only scalar diquarks involved in this process. Secondly, broken SU(3)-flavour symmetry induces a component of the total antisymmetric flavour singlet into wave and vertex function. In non-relativistic quark models with SU(6) symmetry such a component is forbidden by the Pauli principle. As the flavour singlet is composed of scalar diquarks and quarks only, this generates two additional scalar amplitudes besides the usual two from the octet state. The axial vector part of the vertex function remains unchanged in flavour space. In Ref. [381] it has been found that the flavour singlet amplitudes are numerically small. Thus, one can safely regard the $\Lambda$-hyperon as an almost pure octet state in flavour space.

7.3.2. Partial wave decomposition

In a relativistic system only the total angular momentum, 1/2 for the nucleon and 3/2 for the $\Delta$, is a good quantum number. Nevertheless, it is instructive to decompose the BS amplitudes into partial waves in the rest frame. However, it has to be noted that these partial waves start to mix when the covariant amplitude is boosted. In the rest frame the Pauli–Lubanski operator for an arbitrary multi-spinor is given by

$$W^i = \frac12\,\epsilon_{ijk}\,\mathcal{L}^{jk}. \tag{336}$$



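Its eigenvalues are the total angular momentum

$$\mathbf{W}^2 = J(J+1). \tag{337}$$

The tensor $\mathcal{L}^{jk}$ is the sum of an orbital part, $L^{jk}$, and a spin part, $S^{jk}$. For a three-particle system they are given by

$$L^{jk} = \sum_{a=1}^{3}(-i)\left(p_a^j\frac{\partial}{\partial p_a^k} - p_a^k\frac{\partial}{\partial p_a^j}\right), \tag{338}$$

$$2\,(S^{jk})_{\alpha\alpha',\beta\beta',\gamma\gamma'} = (\sigma^{jk})_{\alpha\alpha'}\otimes\delta_{\beta\beta'}\otimes\delta_{\gamma\gamma'} + \delta_{\alpha\alpha'}\otimes(\sigma^{jk})_{\beta\beta'}\otimes\delta_{\gamma\gamma'} + \delta_{\alpha\alpha'}\otimes\delta_{\beta\beta'}\otimes(\sigma^{jk})_{\gamma\gamma'}, \tag{339}$$

such that $\mathcal{L}^{jk} = L^{jk} + \frac12 S^{jk}$. The tensor $L^{jk}$ is proportional to the unit matrix in Dirac space. The definition of $\sigma^{\mu\nu} := -(i/2)[\gamma^\mu,\gamma^\nu]$ differs by a minus sign from its Minkowski counterpart. The tensors $L$ and $S$ are written as a sum over the respective tensors for each of the three constituent quarks, which are labelled $a = 1, 2, 3$ and carry the respective Dirac indices $\alpha\alpha'$, $\beta\beta'$, $\gamma\gamma'$. Defining the spin matrix $\Sigma^i = \frac12\epsilon_{ijk}\sigma^{jk}$, the Pauli–Lubanski operator can be written as

$$(W^i)_{\alpha\alpha',\beta\beta',\gamma\gamma'} = L^i\,\delta_{\alpha\alpha'}\otimes\delta_{\beta\beta'}\otimes\delta_{\gamma\gamma'} + (S^i)_{\alpha\alpha',\beta\beta',\gamma\gamma'}, \tag{340}$$

$$L^i = (-i)\,\epsilon_{ijk}\left(p^j\frac{\partial}{\partial p^k} + q^j\frac{\partial}{\partial q^k}\right), \tag{341}$$

$$(S^i)_{\alpha\alpha',\beta\beta',\gamma\gamma'} = \frac12\Big((\Sigma^i)_{\alpha\alpha'}\otimes\delta_{\beta\beta'}\otimes\delta_{\gamma\gamma'} + \delta_{\alpha\alpha'}\otimes(\Sigma^i)_{\beta\beta'}\otimes\delta_{\gamma\gamma'} + \delta_{\alpha\alpha'}\otimes\delta_{\beta\beta'}\otimes(\Sigma^i)_{\gamma\gamma'}\Big). \tag{342}$$

Hereby the relative momentum $p$ between quark and diquark and the relative momentum $q$ within the diquark have been introduced via a canonical transformation:

$$P = p_1 + p_2 + p_3, \qquad p = \eta\,(p_1 + p_2) - (1-\eta)\,p_3, \qquad q = \frac12(p_1 - p_2). \tag{343}$$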
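The tensor-product structure of the spin operator in Eq. (342) can be checked numerically in a simplified setting. The sketch below is an illustration under stated assumptions, not the relativistic calculation: it replaces the 4×4 Dirac spin matrices $\Sigma^i$ by the 2×2 Pauli matrices (a non-relativistic reduction), ignores orbital angular momentum, and uses hypothetical helper names (`kron`, `matmul`, `total_spin_component`). It builds the total spin of three spin-1/2 constituents and verifies that its square only has eigenvalues $J(J+1)$ with $J = 1/2$ and $J = 3/2$, as in Eq. (337).

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

I2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def total_spin_component(s):
    """S^i = (1/2) (s x 1 x 1 + 1 x s x 1 + 1 x 1 x s), an 8x8 matrix."""
    terms = [kron(kron(s, I2), I2), kron(kron(I2, s), I2), kron(kron(I2, I2), s)]
    return [[0.5 * sum(t[i][j] for t in terms) for j in range(8)] for i in range(8)]

S = [total_spin_component(s) for s in PAULI]
squares = [matmul(Si, Si) for Si in S]
S2 = [[sum(sq[i][j] for sq in squares) for j in range(8)] for i in range(8)]

def shifted(m, c):
    """Return m - c * identity."""
    return [[m[i][j] - (c if i == j else 0) for j in range(8)] for i in range(8)]

# If S^2 only has eigenvalues 3/4 (J=1/2) and 15/4 (J=3/2), this product vanishes.
product = matmul(shifted(S2, 0.75), shifted(S2, 3.75))
max_deviation = max(abs(x) for row in product for x in row)

trace = sum(S2[i][i] for i in range(8)).real
n_three_half = round((trace - 0.75 * 8) / 3.0)  # number of states with J = 3/2
n_half = 8 - n_three_half                       # number of states with J = 1/2

print(max_deviation, trace, n_half, n_three_half)
```

The eight spin states thus split into four states with $J = 3/2$ and four with $J = 1/2$ (two doublets). In the relativistic description the additional lower Dirac components and the orbital part $L^i$ enlarge this structure.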

Taking into account only the leading Dirac covariant in the diquark amplitudes, no orbital angular momentum is carried by the diquarks,

$$\mathbf{L}^2\chi(q) = \mathbf{L}^2\chi^\mu(q) = 0. \tag{344}$$

The Pauli–Lubanski operator then simplifies and can be calculated by straightforward but tedious algebra [381,385]. In the nucleon (or generally in octet baryons) there is one s-wave associated with the scalar diquark and two s-waves associated with the axial vector diquark, one of them connected with its virtual time component, see Table 9. In the non-relativistic limit only two s-waves out of the eight components would survive. It is remarkable that the relativistic description leads to four accompanying p-waves, the "lower components", and a d-wave, which are expected to give substantial contributions to the fraction of the nucleon spin carried by orbital angular momentum. At least, these p-waves would not be present in a non-relativistic model. In the delta (or generally in decuplet baryons) only one s-wave is found by the method described above. Two d-waves that could, in principle, survive the non-relativistic limit are present, and one d-wave can be attributed to the virtual time component of the axial vector diquark. All even partial waves are accompanied by relativistic "lower" components that could be even more important than in the nucleon case.



Table 9
Components of the octet baryon wave-function with their respective spin and orbital angular momentum. ($\gamma_5 C$) corresponds to scalar and ($\gamma^\mu C$), $\mu = 1, \ldots, 4$, to axial vector diquark correlations. Note that the partial waves in the first row possess a non-relativistic limit. (Adapted from Ref. [392].)

Diquark structure              (γ5 C)   (γ4 C)   (γi C)   (γi C)
"Non-relat." partial waves
  Spin                          1/2      1/2      1/2      3/2
  Orb. ang. mom.                s        s        s        d
"Relat." partial waves
  Spin                          1/2      1/2      1/2      3/2
  Orb. ang. mom.                p        p        p        p

The relativistic decomposition of nucleon and quark–diquark wave-functions yields a rich structure in terms of partial waves; for more details see Refs. [381,385]. Well-known problems from certain non-relativistic quark model descriptions are avoided from the beginning in a relativistic treatment. First, photo-induced $N$–$\Delta$ transitions that are impossible in spherically symmetric non-relativistic nucleon ground states will occur in this model through overlaps in the axial vector part of the respective wave-functions. Additionally, photo-induced transitions from scalar to axial vector diquarks can take place, thus creating an overlap of the nucleon scalar diquark correlations with the axial vector diquark correlations. Secondly, the total baryon spin will mainly be due to the quark spin in the s-waves and the orbital angular momentum of the relativistic p-waves (that are absent in a non-relativistic description). Which fraction of, e.g., the nucleon spin is carried by the quark spins is related to the matrix element of the flavour singlet axial current and is the subject of an on-going investigation.

7.3.3. Numerical results for ground state baryons

The quark–diquark BS equations have been solved in the corresponding bound state rest frame using an expansion in Chebyshev moments, see Refs. [379,381,382,385] for details. This method exploits the approximate O(4) symmetry of the BS equation and proves to be very efficient for obtaining numerically accurate solutions of the full four-dimensional equations. Actually, there is not much difference in the requirements for computational resources between solving the four-dimensional equation this way and solving a reduced three-dimensional approximation. Given the non-covariance of the reduced equation (and the related shortcomings when calculating observables, e.g., see Ref. [383]), further use of three-dimensional reductions seems questionable.

In Ref. [384] the nucleon and delta amplitudes have been calculated using two different parameter sets. For one set a constituent quark mass of $m_q = 0.36$ GeV has been employed. Due to the free-particle poles in the quark and diquark propagators used in Ref. [384], the axial vector diquark mass is below 0.72 GeV and the delta mass below 1.08 GeV. On the other hand, nucleon and delta masses are fitted by the second set, and the parameter space is constrained by these two masses accordingly. In particular, this implies $m_q > 0.41$ GeV. The first parameter set



Table 10. Octet and decuplet masses obtained with two different parameter sets. Set I represents a calculation with weakly confining propagators (d = 10 in Eq. (345)), Set II with strongly confining propagators (d = 1 in Eq. (345)). All masses are given in GeV. (Adapted from Ref. [392].)

          m_u    m_s    M_Λ    M_Σ    M_Ξ    M_Σ*   M_Ξ*   M_Ω
Set I
Set II
Exp.
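The Chebyshev-moment expansion mentioned in Section 7.3.3 can be illustrated with a minimal sketch. It assumes Chebyshev polynomials of the second kind, U_n(z), whose weight sqrt(1 - z^2) matches the four-dimensional angular measure in z = cos ψ; the cited references may use a different convention. The function names, the quadrature rule and the toy amplitude are illustrative assumptions, not taken from Refs. [379,381,382,385].

```python
import numpy as np


def chebyshev_u(n_max, z):
    """Second-kind Chebyshev polynomials U_0..U_{n_max} at z, via the recurrence
    U_{n+1} = 2 z U_n - U_{n-1}. Returns an array of shape (n_max + 1,) + z.shape."""
    z = np.asarray(z, dtype=float)
    u = np.empty((n_max + 1,) + z.shape)
    u[0] = 1.0
    if n_max >= 1:
        u[1] = 2.0 * z
    for n in range(1, n_max):
        u[n + 1] = 2.0 * z * u[n] - u[n - 1]
    return u


def chebyshev_moments(f, n_max, n_nodes=32):
    """Moments f_n = (2/pi) * integral_{-1}^{1} dz sqrt(1 - z^2) U_n(z) f(z),
    evaluated with Gauss-Chebyshev quadrature of the second kind.
    f must accept a NumPy array of z values."""
    k = np.arange(1, n_nodes + 1)
    theta = k * np.pi / (n_nodes + 1)
    z = np.cos(theta)                                # quadrature nodes
    w = np.pi / (n_nodes + 1) * np.sin(theta) ** 2   # weights, include sqrt(1 - z^2)
    u = chebyshev_u(n_max, z)
    return (2.0 / np.pi) * (u * (w * f(z))).sum(axis=1)


def reconstruct(moments, z):
    """Sum the truncated series sum_n f_n U_n(z)."""
    u = chebyshev_u(len(moments) - 1, z)
    return np.tensordot(moments, u, axes=1)


def toy_amplitude(z, p=1.0):
    """Hypothetical amplitude with a weak angular dependence (illustration only)."""
    return 1.0 / (1.0 + p**2 + 0.5 * p * z)
```

For an amplitude that depends only weakly on the angle, as expected when the O(4) symmetry is approximately realized, the moments fall off quickly, so a handful of them reproduces the full angular dependence. For example, `reconstruct(chebyshev_moments(toy_amplitude, 5), z)` can be compared with `toy_amplitude(z)` on a grid of z values.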
