Lecture Notes in
Physics
Monographs Editorial Board R.
Beig, Wien, Austria J. Ehlers, Potsdam, Germany U. Frisch, Nic...
10 downloads
540 Views
16MB Size
Report
This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Report copyright / DMCA form
Lecture Notes in
Physics
Monographs Editorial Board R.
Beig, Wien, Austria J. Ehlers, Potsdam, Germany U. Frisch, Nice, France K. Hepp, Zilrich, Switzerland W. Hillebrandt
Garching, Germany Imboden, Zt1rich, Switzerland R. L. Jaffe, Cambridge, MA, USA R. Kippenhahn, G6ttingen, Germany R.. Lipowsky, Golm, Germany H. v. L6hneysen, Karlsruhe, Germany D.
I.
Ojima, Kyoto, Japan Weidemniffler, Heidelberg, Germany J. Wess, Mflnchen, Germany J. Zittartz, K61n, Germany H. A.
Managing Editor
Beiglb6ck Springer-Verlag, Physics Editorial Department II Tiergartenstrasse 17, D-69121 Heidelberg, Germany W.
c/o
Springer Berlin
Heidelberg New York
Barcelona
Hong Kong London
E
Milan Paris
Singapore Tokyo
Physics
and Astron0ifty
-
ONLINE LIBRARY
http.//wwwspringerde/phys/
The Editorial
Policy for Monographs
The series Lecture Notes in
Physics reports new developments in physical research and teaching quickly, informally, and at a high level. The type of material considered for publication in the monograph Series includes monographs presenting original research or new angles in a classical field. The timeliness of a manuscript is more important than its form, which may be preliminary or tentative. Manuscripts should be reasonably selfcontained. They will often present not only results of the author(s) but also related work by other people and will provide sufficient motivation, examples, and applications. The manuscripts or a detailed description thereof should be submitted either to one of the series editors or to the managing editor. The proposal is then carefully refereed. A final decision concerning publication can often only be made on the basis of the complete manuscript, but otherwise the editors will try to make a preliminary decision as definite as they can on the basis of the available information. Manuscripts should be no less than loo and preferably no more than 400 pages in length. Final manuscripts should be in English. They should include a table of contents and an informative introduction accessible also to readers not particularly familiar with the topic treated. Authors are free to use the material in other publications. However, if extensive use is made elsewhere, the publisher should be informed. Authors recei#e jointly 30 complimentary copies of their book. They are entitled to purchase further copies of their book at a reduced rate. No reprints of individual contributions can be supplied. No royalty is paid on Lecture Notes in Physics volumes. Commitment to publish is made by letter of interest rather than by signing a formal contract. Springer-Verlag secures the copyright -
for each volume.
The Production Process are hardbound, and quality paper appropriate to the needs of the author(s) is used. Publication time is about ten weeks. More than twenty years of experience guarantee authors the best possible service. To reach the goal of rapid publication at a low price the
The books
of photographic reproduction from a camera-ready manuscript was chosen. This process shifts the main responsibility for the technical quality considerably from the publisher to the author. We therefore urge all authors to observe very carefully our guide-
technique
preparation of camera-ready manuscripts, which we will supply on request. applies especially to the quality of figures and halftones submitted for publication. Figures should be submitted as originals or glossy prints, as very often Xerox copies are not suitable for reproduction. For the same reason, any writing within figures should not be smaller than 2.5 mm. It might be useful to look at some of the volumes already published or, especially if some atypical text is planned, to write to the Physics Editorial Department of Springer-Verlag direct. This avoids mistakes and time-consuming correspondence during the production period. As a special service, we offer free of charge BTEX and ThX macro packages to format the text according to Springer-Verlag's quality requirements. We strongly recommend authors to make use of this offer, as the result will be a book of considerably improved technical quality. For further information please contact Springer-Verlag, Physics Editorial Department II, Tiergartenstrasse 17, D-6912i Heidelberg, Germany.
lines for the
This
Series
homepage
-
http://wwwspringer-de/phys/books/lilpm
Volker Perlick
Ray Optics,
Fermat's Principle, and Applications to General Relativity
4
Q Springer 11
'14%
Author Volker Perlick
Technische Universität Berlin Sekr. PN 7-1
Hardenbergstrasse 36 io623 Berlin, Germany
Library of Congress Cataloging-in-Publication Data
Perlick, Volker, 1956Ray optics, Fermat's principle, and applications to general relativity / c Volker Perlick. (Lecture notes in physics. Monographs, ISSN 0940-7677 ; v. m6 1) p. cm. Includes bibliographical references and index. ISBN 3540668985 (alk. paper) 1. Light--Transmission--MathematicaI models. 2. Maxwell equations. 3. General relativity (Physics) I. Title. H. Lecture notes in physics. New series m, Monographs --
m6 1.
QC389.P37
2000
523.01'53--dc2l 99-089303 ISSN 0940-7677 (Lecture Notes in Physics. Monographs) ISBN 3-540-66898-5 Springer-Verlag Berlin Heidelberg New York to copyright. All rights are reserved, whether the whole or part concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only
This work is
subject
of the material is
under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are
@
liable for
prosecution under the
Springer-Verlag
Printed in
Berlin
German
Copyright Law.
Heidelberg 2000
Germany
The use of general descriptive names, registered names, trademarks, etc. in this publication does not
imply,
even
from the relevant
a specific statement, that such names are exempt regulations and therefore free for general use.
in the absence of
protective laws
and
Typesetting: Camera-ready by the author Cover design: design &production, Heidelberg Printed
on
acid-free paper
SPIN:io644482
55/3144/du
-
5 4 3
2 10
Preface
All kind of information from distant celestial bodies of
comes
to
us
in the form
electromagnetic radiation. In most cases the propagation of this radiation can be described, as a reasonable approximation, in terms of rays. This is true not only in the optical range but also in the radio range of the electromagnetic spectrum. For this reason the laws of ray optics are of fundamental importance for astronomy, astrophysics, and cosmology. According to general relativity, light rays are the light-like geodesics of a Lorentzian metric by which the spacetime geometry is described. This, however, is true only as long as the light rays are freely propagating under the only influence of the gravitational field which is coded in the spacetime geometry. If a light ray is influenced, in addition, by an optical medium (e.g., by a plasma), then it will not follow a light-like geodesic of the spacetime metric. It is true that for electromagnetic radiation traveling through the universe usually the influence of a medium on the path of the ray and on the frequency is small. However, there are several cases in which this influence is very well measurable, in particular in the radio range. For example, the deflection of radio rays in the gravitational field of the Sun is considerably influenced by the Solar corona. Moreover, current and planned Doppler experiments with microwaves in the Solar system reach an accuracy in the frequency of Awlw 5--- 10-15 which makes it necessary to take the influence of the interplanetary medium into account. Finally, even in cases where the quantitative influence of the medium is negligibly small it is interesting to ask in which way the qualitative aspectsof the theory are influenced by the medium. The latter remark applies, in particular, to the intriguing theory of gravitational lensing. Unfortunately, general-relativistic light propagation in media is not usually treated in standard textbooks, and the more specialized literature is concentrated on particular types of media and on particular applications rather than on general methodology. In this sense a comprehensive review of general-relativistic ray optics in media would fill a gap in the literature. It is the purpose of this monograph to provide such a review. Actually, this monograph grew out of a more special idea. It was my original plan to write a review on variational principles for light rays in general relativistic media, and to give some applications to astronomy and astro-
V1
Preface
particular to the theory of gravitational lensing. However, I soon realized the necessity of precisely formulating the mathematical theory of light rays in general before I could tackle the question of whether these light rays are characterized by a variational principle. The sections on variational principles and on applications are now at the end of Part II, in which a general mathematical ftamework for ray optics is set up. This is written in the language of symplectic geometry, thereby elucidating the well-known analogy between ray optics and the phase-space formulation of classical mechanics. Moreover, I found it desirable to also treat the question of how to derive ray optics as an approximation scheme from Maxwell's equations. This is the topic of Part I which serves the purpose of physically motivating the physics,
in
fundamental definitions of Part II. In vacuo, the passage from Maxwell's equations to ray optics is, of course, an elementary textbook matter and the
generalization to isotropic and non-dispersive media is quite straightforward. However, for anisotropic and/or dispersive media this passage is more subtle. In Part I two types of media are discussed in detail, viz., an anisotropic one and a dispersive one, and the emphasis is on general methodology. I have organized the material in such a way that it should be possible to read Part II without having read Part I. This is not recommended, of course, but the reader might wish to do so. Both parts begin with an introductory section containing a brief guide to the literature and a statement of assumptions and notations used throughout. Whenever the reader feels that a symbol needs explanation or that the underlying assumptions are not clearly stated, he or she should consult the introductory section of the respective part. Also, the index might be of help if problems of that kind occur. Large parts of this monograph present material which, in essence, is not new. However, I hope that the formulation chosen here might give some new insight. As to Part I, our discussion of the passage from Maxwell's equations optics includes several mathematical details which are difficult to find literature, although the general features are certainly known to experts. To mention just one example, it is certainly known to experts that in a linear but anisotropic medium on a general-relativistic spacetime the light rays axe determined by two "optical Finsler metrics"; to the best of my knowledge, however, a full proof of this fact is given here for the first time. As to Part II, the basic formalism is just the 170-year-old Hamiltonian optics, rewritten in modern mathematical terminology and adapted to the framework of general relativity. However, the presentation is based on some general mathematical definitions which have not been used before. This remark applies, in particular, to Definition 5.1.1, which is the definition of what I call "ray-optical structures". This definition formalizes the widely accepted idea that all of term "ray sysray optics can be derived from a "dispersion relation". (The
to ray
in the
tem" is sometimes used
though
by Vladimir
not quite identical
sense.)
Arnold and his collaborators in
It also
applies,
e.g., to Definition
a
similar
5.4.1,
on
Preface
VII
optical structures, which characterizes dispersion-free geometric way. On the other hand, I want to direct the reader's attention to the fact that this monograph contains some particular results which, as far as I know, have not been known before. These include, e.g.: "dilation-invariant" ray
media in
a
general redshift formula for light rays in media on a general-relativistic spacetime in Sect. 6.2; the results on light bundles in isotropic non-dispersive media on a generalrelativistic spacetime in Sect. 6.4, in particular the generalized "reciprocity
the
theorem '
(Theorem 6.4.3);
Theorem 7.3.1, which can be viewed as a version of Fermat's principle for light rays in (possibly anisotropic and dispersive) media on general-
relativistic spacetimes; Theorem 7.5.4, which generalizes the "Morse index theorem" of Riemannian geometry to the case of light general-relativistic spacetimes.
Some of the some
questions raised
extent this is
remark
applies
an
to the
in this
rays in
stationary media
monograph
remain
on
stationary
unanswered, i.e.,
to
on work in progress. In particular, this two special issues. (a) In Part I we are able to
interim report
following
prove that for the linear medium treated in
Chap.
2 ray
optics is associated
approximate solutions of Maxwell's equations, i.e., that ray optics gives viable approximation scheme for electromagnetic radiation. Unfortunately,
with a
we are
not able to prove
a
similar result for the plasma model of Chap. 3.
a gap which should be filled in the future. (b) In Part II we are able to establish a Morse index theorem for light rays in stationary media. However, it is still an open question whether these results can be generalized
This is
to the
non-stationary
case
in
which,
up to now,
a
Morse
theory
exists
only
rays. With Fermat's principle in the form of Theorem 7.3.1 we have a starting point for setting up a Morse theory for light rays in arbitrary (non-stationary) media. This is an interesting problem to be tackled in future
for
vacuum
work.
This
monograph
in its
present form is a slightly revised version of my Hause this opportunity to thank the members of
bilitation thesis. I would like to
Committee, Karl-Eberhard Hellwig, Erwin Sedlmayr, Bernd Wegner, John Beem, Friedrich Wilhelm Hehl, and Gernot Neugebauer, for their interest in this work and for several useful comments. In particular, I would like to thank Bernd Wegner for paving the way to having this text published with Springer Verlag. While working at this monograph I have profited from many discussions, in particular with my academic teacher Karl-Eberhard Hellwig and his collaborators at the Technical University in Berlin, but also with other colleagues. Special thanks are due to Wolfgang Hasse and Marcus Kriele for collabora, tion on various aspects of light propagation in general relativity; to Wolfgang the Habilitation
VIII
Preface
Rindler for sions
on
at the
hospitality
at the
the fundamentals of
University
University of Texas at Dallas and for discusgeneral relativity; to John Beem for hospitality
of Missouri at Columbia and for discussions
to Gernot
Neugebauer and his collaborators
on
Lorentzian
for
hospitality at the University of Jena and for discussions on various aspects of general relativity; to Paolo Piccione, Fabio Giannoni, and Antonio Masiello for hospitality during several visits to Italy and to Brazil and for collaboration on Morse theory; and to Jfirgen Ehlers and Arlie Petters for fruitful discussions on Fermat's principle and gravitational lensing. Also, I have enjoyed discussions on this subject with students during seminars and classes in Berlin, 0snabrfick, and geometry;
S5,o Paulo.
Finally, I am grateful to the Deutsche Forschungsgemeinschaft for sponsoring this work with a Habilitation stipend, and to the Wigner Foundation, to the Deutscher Akademischer Austauschdienst, and to the FundagRo de Amparo, 6 Pesquisa do Estado de Sdo Paulo for financially supporting my visits to Dallas, Columbia, and Sdo Paulo. Berlin, August
1999
Volker Perlick
Contents
Part I. From Maxwell's
1.
1.2
optics 3
A brief guide to the literature Assumptions and notations
Light propagation in
............................
3
..............................
5
linear dielectric and
2.2
2.3
Asymptotic
2.4
Derivation of the eikonal
2.5
Discussion of the eikonal equation Discussion of transport equations and the introduction of rays Ray optics as an approximation scheme
2.6 2.7
.........................
solutions of Maxwell's equations equation and transport
...............
5.
equations
Introduction to Part II
19 24
31 36
44
.............
46
...................................
61
Assumptions
and notations on
optics
............................
..............................
arbitrary manifolds
.............
5.2
Definition and basic properties of ray-optical structures Regularity notions for ray-optical structures
5.3
Symmetries
of
17
43
4.2
structures
7 14
...............
A brief guide to the literature
Ray-optical
7
................
4.1
61 63
67
.....
67
...............
76
......................
82
5.5
ray-optical structures Dilation-invariant ray-optical structures Eikonal equation
5.6
Caustics
5.4
.
...................
Light propagation in other kinds of media 3.1 Methodological remarks on dispersive media 3.2 Light propagation in a non-magnetized plasma
5.1
.
........................
Part II. A mathematical framework for ray 4.
permeable media permeable media.
Maxwell's equations in linear dielectric and Approximate-plane-wave families
2.1
3.
to ray
Introduction to Part 1 1.1
2.
equations
...................
87
.......................................
92
...............................................
100
x
Contents
6.
Ray-optical 6.1
6.2 6.3
6.4 6.5 6.6
7.
structures
on
Lorentzian manifolds
7.3 7.4
7.5
...........
ray-optical structure Observer fields, frequency, and redshift Isotropic ray-optical structures Light bundles in isotropic media Stationary ray-optical structures Stationary ray optics in vacuum and in simple media
ill
....................
113
...........................
120
..........................
123
.........................
131
........
141
.............................
149
case
.........
principle of stationary action: The strongly regular Fermat's principle A Hilbert manifold setting for variational problems A Morse theory for strongly hyperregular ray-optical structures The
154
......................................
156
.........
165
.............
168
..............................................
on
8.4
Minkowski space
.....................................
183 190
193
....................................
199
....................................................
211
in
a
plasma
on
Kerr
183
...........
Light propagation Gravitational lensing
References
149
.
case
Applications 8.1 Doppler effect, aberration, and drag effect in isotropic media. 8.2 Light rays in a uniformly accelerated medium 8.3
III
.........................
vacuum
Variational principles for rays 7.1 The principle of stationary action: The general 7.2
8.
The
spacetime
1. Introduction to Part I
In Part I
recapitulate the general ideas of how to derive the laws of equations. We presuppose a general-relativistic ray optics and as we consider media which are general enough spacetime background, to elucidate all relevant features of the method. Chapter 2 treats the case of a linear (not necessarily isotropic) dielectric and permeable medium in full detail. Chapter 3 discusses dispersive media in general and a simple plasma model in particular. In this way the material presented in Part I serves two purposes. First, it motivates our mathematical frame-work for ray optics, to be set up in Part II below. Second, it provides us with physically important examples of ray optical structures to which we shall recur frequently. we
from Maxwell's
1.1 A brief
guide
to the literature
we have to assume some familiarity on the reader's side with equations in matter on a general-relativistic spacetime. Whereas vacuum Maxwell's equations are detailed in any textbook on general relativity, the matter case is not usually treated in extenso. For general aspects of the phenomenology of electromagnetic media in general relativity we refer to Bressan [171 who gives many earlier references. The case of a linear (not necessarily isotropic) dielectric and permeable medium which is at the basis of Chap. 2 is briefly treated by Schmutzer [127], Chap. IV, following an original article by Marx [91]. The general-relativistic plasma model which is at the basis of Chap. 3 is systematically treated in two articles by Breuer and Ehlers [18] [19]; for earlier work on the same subject we refer to Madore [90], to Bi66k and Hadrava [14],
In Part I
Maxwell's
and to Anile and Pantano
As of cal
an
media
electromagnetic methods, i.e., that
tions
can
[51 [6].
aside it should be mentioned that the can
phenomenological theory theory by statisti-
be derived from electron
the macroscopic (phenomenological) Maxwell equa, a sort of microscopic Maxwell equations. For linear
be derived from
isotropic media in inertial motion on flat spacetime this is a standard textbook matter; the generalization to accelerated media is due to Kaufmann [67). For a general-relativistic plasma, the derivation of phenomenological proper-
V. Perlick: LNPm 61, pp. 3 - 6, 2000 © Springer-Verlag Berlin Heidelberg 2000
4
1. Introduction to Part I
ties from the kinetic
theory of photons is discussed in the above-mentioned by Bi&ik and Hadrava [14]. The main topic of Part I is the derivation of the laws of ray optics from Maxwell's equations. The basic idea is to make an approximate-plane-wave ansatz for the electromagnetic field and to assume that this ansatz satisfies Maxwell's equations in an asymptotic sense for high frequencies. This results in a dynamical law for wave surfaces which can be rewritten equivalently as a dynamical law for rays. In optics the dynamical law for wave surfaces is usually called the eikonal equation. It is formally analogous to the Hamilton-Jacobi equation of classical mechanics, whereas the dynamical law for rays is formally analogous to Hamilton's equations. Mathematically, this so-called my method is, of course, not restricted to Maxwell's equations but applies equally well to other partial differential equations with or without relevance to physics. In this sense, the ray method has applications not only to optics but also to acoustics and to wave mechanics. In the latter context, the ray method is known as JWKB method, refering to the pioneering work of Jeffreys, Wentzel, Kramers and Brioullin, and is detailed in virtually any article
textbook
on
quantum mechanics.
In this brief guide to the literature
we
shall concentrate
on
the ray method
optics. As to other applications we refer to the comprehensive list of references given in monographs such as Keller, Lewis and Seckler [70] or Jeffrey and Kawahara [66]. Purely mathematical aspects of the ray method can be found in textbooks on partial differential equations. Particularly useful for our purposes axe, e.g., the books by Chazarain and Piriou [26] and by Egorov
in
and Shubin
[36].
Whereas rudiments of the ray method can be traced back to work of Liouville and Green around 1830, it was first carried through in the context
optics by Sommerfeld and Runge [1321 in the year 1911, following a suggestion by Debye. The work of Sommerfeld and Runge was restricted to the vacuum Maxwell equations in an inertial system, and the only goal was to derive the corresponding eikonal equation. Their treatment was generalized and systematized by Luneburg [881 who considered infinite asymptotic series solutions rather than just asymptotic solutions of lowest order as Sommerfeld and Runge did. Later, the method was extended from the vacuum case to the case of light propagation in matter. This was a very active field of reaseaxch in the 1960s, see, e.g., Lewis [84], Chen [27] and Kravtsov [75]. All these papers are restricted to special relativity in the sense that they are presupposing a flat spacetime. Nonetheless, the techniques used are of interest also in view of general relativity. The reason is that vacuum Maxwell's equations on a general-relativistic spacetime are very similar to Maxwell's equations in an inhomogeneous medium on flat spacetime, at least locally. This was first observed by Plebafiski [120]. Note, however, that global aspects which do not carry over to general relativity are brought into play whenever temporal Fourier expansions (as e.g. by Lewis [841) and/or spatial of
1.2
Fourier expansions (as e.g. by Chen ray method that does carry over to
Assumptions
and notations
[271)
are used. A global treatment of the general relativity is possible in terms of the Lagrangian manifold techniques introduced in the 1960s by Maslov and Arnold, see Arnold [71, Maslov [94], Duistermaat [311 or Guillemin and Sternberg (55]. In Part I we are concerned with local questions only. However, we shall touch upon Lagrangian manifold techniques and their relevance for the investigation of caustics in Part II below. In general relativity, the passage from Maxwell's equations to ray optics was carried through for the first time by Laue [77] in the year 1920. In this paper, which is the written version of a talk given by Laue at the 86. Naturforscherversammlung, the author demonstrated how to derive from vacuum Maxwell's equations on a curved spacetime the light-like geodesic equation for the rays. Laue's treatment followed closely the seminal paper by Sommerfeld and Runge [132]. A more systematic general-relativistic treatment of the ray method in optics, including asymptotic solutions of arbitrarily high order, was brought forward much later by Ehlers [381. He considered linear isotropic nondispersive media on an arbitrary general-relativistic spacetime and derived not only the eikonal equation for the rays but also transport equations of arbitrary order for the polarization plane along the rays. In particular, his results put earlier findings about light propagation in such media by Gordon [50] and Pham Mau Quan [117] on a mathematically firm basis. At least for the
vacuum
case, the main results
can now
or
be found in many textbooks on gen[98], Straumann [136],
Thorne and Wheeler
relativity, see, e.g., Misner, Stephani [133). The general-relativistic
eral
relevance of
higher order
terms in
the asymptotic series expansion was discussed by Dwivedi and Kantowski [321 and by Anile [4]. A general-relativistic treatment of the ray method for
dispersive media, exemplified with and Ehlers
by
[181 [19]
Bi66k and Hadrava
Assumptions
1.2
a
special plasma model,
who modified and enhanced earlier work
[14),
and
by
Anile and Pantano
is due to Breuer
by Madore [90],
[51 [6].
and notations
general-relativistic spacetime, i.e., a four-dimensional C' mansignature (+, +, +, -). On this spacetime background we consider Maxwell's equations, using units making the dielec1. tricity and permeability constants of vacuum equal to one, -,, po Thereby, in particular, the vacuum velocity of light is set equal to one. We restrict ourselves to the C' category in the sense that throughout Part I all
We
assume a
ifold with
a
metric of Lorentzian
=
maps and tensor fields
are
tacitly assumed
to be
=
infinitely often differentiable.
We work in local coordinates using standard index notation. Throughout, Einstein's summation convention is in force with latin indices running from 1
greek indices running from 1 to 3.The (covariant) components spacetime metric will be denoted by gab. As usual, we define gbr by Jc, where Jc denotes the Kronecker delta, and we use gab (and gbe,
to 4 and with
of the gab 9
bc
=
a
a
1. Introduction to Part I
respectively)
to lower
(and raise, respectively)
indices. With respect to
a co-
'9 ordinate system x (x 1,X 2,X3,X4), paxtial derivatives -9X" will be denoted by 0. for short, whereas V. means covariant derivative with respect to the =
Levi-Civita connection of of "a tensor field
Qabc
77
if
our
metric. For the sake of
we mean
brevity,
we
shall speak
"a tensor field whose contravariant
com-
coordinate system are Qabc " etc. Our treatment will be purely local throughout Part 1. Therefore, the use of local coordinates and index
ponents in
a
notation is
no
restriction whatsoever.
Light propagation in and permeable media
linear dielectric
2.
On
spacetime manifold we consider Maxwell's equations in a linear but necessarily isotropic medium, i. e., in a medium phenomenologically characterized by a dielectricity tensor field and a permeability tensor field. It is our goal to derive and to discuss the laws of ray optics in such a medium. The standard textbook problem of light propagation in vacuo is, of course, included as a special case. The results of this chapter cover a wide range of applications including light propagation in gases (isotropic case) and crystals (anisotropic case) as long as dispersion is ignored. For dispersive media we refer to Chap. 3 below. In view of applications to astrophysics, the isotropic case is more interesting than the (much more complicated) anisotropic case. On the other hand, a thorough treatment of the anisotropic case is highly instructive from a methodological point of view. In particular, it gives us the opportunity to discuss the phenomenon of birefringence. our
not
2.1 Maxwell's
and On
equations permeable media
our
in linear dielectric
spacetime manifold, the source-free Maxwell equations for (macro-
scopic) electromagnetic
fields in matter
can
be written in local coordinates
as
77 or,
VbFcd
using partial rather than 77
In
abcd
abcd
i9bFcd
=
=
0
and
covariant
0
and
Vb Gbe
derivatives,
=
(2.1)
0
as
77abcd 19b(?7cde_f Gef)
=
0
(2.1) and (2.2), 77abcd denotes the totally antisymmetric Levi-Civita (volume form) of our metric which is defined by the equation
(2.2) tensor
field
771234
Here the minus
plus sign
sign
=
,,/_jd_et(9cd)I
is valid if the coordinate
system is right-handed and the
is valid if it is left-handed. In other
V. Perlick: LNPm 61, pp. 7 - 41, 2000 © Springer-Verlag Berlin Heidelberg 2000
(2.3)
-
words,
we
have to choose
an
8
Light propagation
2.
orientation
on
in linear dielectric and
the domain of
our
of the Levi-Civita tensor field.
permeable
media
coordinate system to fix the sign ambiguity However, this is irrelevant since Maxwell's
invariant under 77abcd 1 -?7abcd7 and so are all the relevant results in Part I. If the reader is not familiar with volume elements he or she
equations
axe
consult, e.g., Wald [1461, p. 432. Fab and Gab denote the electromagnetic field strength and the electromagnetic excitation, respectively, both of which are antisymmetric second rank tensor fields. With respect to a reference system, given in terms of a time-like vector field Ua with Ua U., -1, we can introduce the electric field strength
may
=
Ea and the
=
Fab Ub
(2.4)
magnetic field strength Ba
77abcd
2
Ub Fcd
(2.5)
such that
Fab Here
we
_,7cd abBcUd+EbUa-EaUb
=
have used the familiar property 77abcd
,,aefk je d
Similarly,
we
jf jkd
_jeb
.
jf jkb c
of the Levi-Civita tensor
je
_
c
C
jfb jkd
+ je C
field, cf.,
jfd jkb
+ je b
_
je d
jfb jkc+
jfd jk
e.g., Wald
(2.7)
c
[146], equation (B.2.12).
introduce the electric excitation
Da and the
(2.6)
=
Gab Ub
(2.8)
magnetic excitation Ha
2
77abcd
Ub Gcd
(2.9)
such that
Gab
=
_,,cd ab Hc Ud + Db Ua
-
Da Ub
(2.10)
-
With respect to the reference system used for their definitions, the electric and magnetic field -strengths are purely spatial one-forms, and so are the electric and
magnetic excitations,
EaUa=B,,Ua=O Our
terminology of calling Ea
and
Da Ua
=
Ha Ua
and Ba the "field
=
0.
strengths" (in
(2.11) german:
Feldstdrken) and Da and Ha the "excitations" (in german: Erregungen) follows Gustav Mie and Arnold Sommerfeld. This terminology is reasonable
2.1 Maxwell's
equations
in linear dielectric and
permeable
media
9
Ea and Ba determine the Lorentz force exerted on a charged test partiwhereas, in the presence of field-producing charges and currents, Da and Ha are the fields "excited" by those sources via Maxwell's equations. The traditional terminology of calling Ha the "magnetic field strength" is misleading. Moreover, it is highly inconvenient from a relativistic point of view where Ea and Ba, rather than Ea and Ha, are united into an antisymmetric second rank tensor field on spacetime. In what follows we consider Maxwell's equations in the form (2.2). As long as only the metric is known, (2.2) gives us eight component equations for since
cle
(The unknown functions are the six independent electromagnetic field strength plus the six independent electromagnetic excitation.) Hence, (2.2) is an underdetermined system of partial differential equations. It must be supplemented by constitutive equations relating the electromagnetic field strength with the electromagnetic excitation. Thereby the medium is characterized in a phenomenological way. In this chapter we consider linear dielectric and permeable media according to the following definition.
twelve unknown functions.
components of the components of the
Definition 2.1.1. A linear dielectric and
tion,
a
medium characterized
Da
:--
Ca
b
permeable medium is, by definiequations that take the form
constitutive
by Eb
and
Ba
--::
sat-
for all (ZI, Z2, Z3, Z4) : (0, 0, 0, 0)
with
some
(a) Ua Cab 0 and Ua /jab 0 ba (b) Cab eba and ttab /-t (c) Eab Za Zb > 0 and ttab Za Zb =
We
refer
6abas
=
fields -a
b
.
> 0
0.
distinguished reference system Ub as to the rest system, to dielectricity tensor field and to tiab as to the permeability tensor
to the
to the
of the medium.
field
Condition
of Definition 2.1.1 guarantees that the constitutive equa0 and Ba Ua = 0. Conditions (b) in agreement with Da Ua
(a)
(2.12) are (c) imply that
tions
and
tensor
=
=
=
Ua Za
(2.12)
,
andpab
reference system U', with second rank the following conditions: isfying in
Ila bHb
=
in the rest w
=
system of the medium the energy .1 2
(DaE a+BaH ) a
I
density
(2.13)
electromagnetic field is positive definite. Altogether, conditions (a), (c) guarantee that the dielectricity and permeability tensor fields are "spatially invertible". We can, thus, define (/.L-')ab by the properties
of the
(b)
and
(4_1)a b=O, (,,-l)ab (/.,-l)ba, (/_t-1)ab Abc Ja + Ua Uc Ua
=
=
C
(2.14)
10
Light propagation
2.
in linear dielectric and
The constitutive equations
Gab The
(
---
.1 2
77
cd
ab
(2.12)
can
then be united in
(A-1)ce, ?7erpq Ur Ud + 6bP Uq Ua
following special
case
deserves
particular
Definition 2.1.2. A linear dielectric and
tropic if the dielectricity and permeability Ca
with
some
requires
-
b
=
scalar
and IL
6
(jb + U, Ub ) a
particular,
=
ell
it =
form
=
(2.15)
(Fab
Gab
e
and
-
a
single equation,
Cap Uq
Ub) Fpq
vacuum can
1. In this
case
permeable medium is called isoare of the special form
tZab
tt
=
(jb + Ua Ub) a
,
(2.16) 2.1.1 then
reduces to
(1
CA) (Fad Ub
be characterized
(and,
(2.15)
tensor fields
*
+
-
interest.
functions and IL. (Condition (c) of Definition to be strictly positive.)
In the isotropic case,
In
permeable media
more
-
as a
generally,
Fbd Ua
)Ud)
(2.17)
linear isotropic medium with isotropic medium with
in any
1) Ua drops out from (2.17) and the constitutive equations take the (2.12) in any reference system. This is in agreement with the obvious
fact that for
vacuum
any reference
system
can
be viewed
as
the rest system
of the medium.
We emphasize that our phenomenological constitutive equations are physically reasonable in the rotational as well as in the irrotational case, i.e., Ua need not be hypersurface-orthogonal. Although this should be clear from the general rules of relativity, there is still a debate on this issue, even in the case of an isotropic medium on flat spacetime, see, e.g., Pellegrini and Swift [106]. We are now going to analyze the dynamics of electromagnetic fields in a linear dielectric and permeable medium. We have already mentioned that Maxwell's equations (2.2) alone give us eight equations for twelve unknown b b functions. With (2.12) at hand, and assuming that Ua, Ca and Pa are known, we can eliminate six of the unknown functions. Now (2.2) gives us eight equations for six functions, i.e., the system looks overdetermined. However, only six of those eight equations are evolution equations, governing the dynamics of electromagnetic fields, whereas the other two equations are constraints.This is most easily verified in a local coordinate system (xi, X2, X3, X4) in which const. are space-like such that x4 can be viewed as the hypersurfaces x4 function. local time a Owing to the antisymmetry of the Levi-Civita tensor, 4 the a components of equations (2.2) do not involve any a4 derivative. =
=
Hence, these two equations are to be viewed as constraints whereas the re1, 2, 3 components of equations (2.2), maining six equations, i.e., the a are the evolution equations governing the dynamics. Again owing to the antisymmetry of the Levi-Civita tensor field, the evolution equations preserve the constraints in the following sense. If the constraints are written in the form =
2.1 Maxwell's
equations
in linear dielectric and
permeable
media
11
C2 fi C1 0, then the evolution equations imply that (94C, o94C2 f2 C2 with some spacetime functions f, and f2. Hence, if a solution of the evolution equations satisfies the constraints on some initial 4 const., then it satisfies the constraints everywhere (on hypersurface x some neighborhood of any point of the initial hypersurface, that is). In other words, locally around any one point all solutions of Maxwell's equations can be found in the following way.
C1
=
0 and
and
=
=
=
=
Step Step
1. Choose
a
2. Choose
a
surface is
Step
space-like hypersurface through that point. local coordinate system such that the chosen 4 const. given by the equation x
hyper-
=
equations with all initial data that satisfy
3. Solve the evolution
the constraints. In the rest of this section
shall prove that the initial value
we
problem
con-
sidered in Step 3 is well-posed in the sense that it is characterized by a local existence and uniqueness theorem, provided that the initial hypersurface has been chosen appropriately. Conditions (a), (b) and (c) of Definition 2.1.1 will prove essential for this result.
First
we
introduce
special coordinates according
to the
following
defini-
tion. Definition 2.1.3. Let U' denote the rest system of a linear dielectric and medium and fix a spacetime point x0. Then a local coordinate sys-
permeable
(Xl,X2,X3,X4) defined
tem near
(a) (b)
x0
neighborhood of x0,
on a
,
is called
adapted
to Ua
if
Ua is
given by the equation
94A
0
=
for
p
=
Ua
at the
1, 2, 3
77ffg 4=4=
point
xo
xa
14
-
For any linear dielectric and permeable medium, it is obvious that adapted are characterized by the following existence and uniqueness prop-
coordinates
erty. If we choose
a
spacetime point x0 and
to Ua at x0, then there is
a
a
hypersurface S that is orthogonal adapted to U1 near xo such
coordinate system
4 const. Another coordinate system represented by the equation x (x ilxt2 x13 x/4 ) is, again, adapted to Ua near xo if and only if it is related to (Xl,X2,X3,X4) by a coordinate transformation of the special form
that S is
=
Xpi X
9X/4
with TZF ., Condition
4
)X
/P
(X 1,X2,X3)
X/4 (X 1, X2,X3,X4)
(2.18)
0.
(b)
hypersurface x4
sure that at the point xo the respective integral curve of Ua orsufficiently small neighborhood of x0, all
of Definition 2.1.3 makes
=
const. intersects the
thogonally. This implies that, on a const. are space-like. Of course, they cannot be orthogohypersurfaces x4 nal to Ua on a whole neighborhood unless the medium is non-rotating. Hence, =
12
in
2.
an
Light propagation
adapted
in linear dielectric and
permeable media
coordinate system the mixed components 9,.4 and g114 of the except at the central point xo. The spatial components
metric need not vanish
give positive definite 3 x 3 matrices (g,,,) and (gl") on some neighborhood point xo these matrices are inverse to each other. The temporal 44 components 944 and g are strictly negative functions on some neighborhood of xO ; at the point xO they are inverse to each other. Now we consider a linear dielectric and permeable medium in a coordinate system adapted to its rest system Ua near some point xO. Then (2.11) reduces
of xO; at the
to
E4=B4=0
owing
to condition
(a)
and
Hence, (2.12) simplifies
of Definition 2.1.3.
D,
e,P Ep
=
and
(2.19)
D4=H4=0
B,
=
p,P Hp
to
(2.20)
.
Conditions (b) and (c) of Definition 2.1.1 guarantee that -,P and A,P are positive definite and symmetric with respect to gP. We can, thus, define v,P and w,P, which are again positive definite and symmetric with respect to gP', by
M,,' v,' v,-P For the
following
=
JP V
and
e,,' w,' w,P
it will be convenient to introduce the a
Zp
=
vp B,
and
Yp
=
wp'D,
(2.21)
JIP,
=
quantities
(2.22)
,
Z1, Z2, Z3, Y1, Y2, Y3 for the six independent components of the electromagnetic field. That is to say, we start from Maxwell's equations (2.2) with (2.6) and (2.10); we use part (a) of Definition 2.1.3 and equation (2.19); and we eliminate E, and H, with the help of (2.20); finally, we express D, of bit little After of a and means of algebra, terms (2.22). B, in Yp by Zp the a 1, 2, 3 components of Maxwell's equations (2.2) give us evolution and to
use
=
equations of the form La
for the
Y
+ M
(Z) (1)
(2.23)
(Y3)
(2.24)
=
Y
0
dynamical variables Z
Here
(Z)
j9a
L1,
L 2,
L3 L
and L
=
4
(Z3) Z, Z2
are
=
1
Y
x-dependent
(Q QT ) 1
4
and
and
LP
6
=
x
Y, Y2
6 matrices of the form 0
(AP )T
AP
0
)
'
(2.25)
2.1 Maxwell's
where
Q
is
a
3
x
3 matrix with
QA AP is
a
3
equations
T
=:
WXv
3 matrix with
x
77.4
4-y
V_Y
T
=
=
M is
a
6
6 matrix whose
x
For the
Wx"'w 04-y V_YT904
WX V?7v4PY Uy
(QT),\,r gp,\ with v,\' and
13
components
transposition with respect
means
permeable media
(2.26)
components
APxIT
)T
in linear dielectric and
to
=
(2.27)
'r
gP" such that,
QxP gr,\
e.g.,
(2.28)
;
components involve the spacetime metric along
wpr.
investigation of the evolution equations (2.23) the following
observations
are
two
crucial.
0 adapted coordinate system we have 9,04 4 invertible L is some on neighborhood By continuity, and, thus, Q of xO to which we can restrict our considerations. Hence, (2.23) can be solved for the i94 derivative. 2 (b) L1, L L3 and L 4are symmetric (=self-adjoint) with respect to the positive definite scalar product
(a)
At the central point =
xo of
=
our
0.
,
(YJz1) (Z2) .
=
Y2
Here the dots
on
the
right-hand
'I 2
(Z1
-
Z2 + Y1
-
Y2)
side refer to the scalar
(2.29) product
defined
by a
-
b
=
gl" Zi-, bm
(2.30)
for any two (C3 -valued functions a and b on the neighborhood considered, with the overbar denoting complex conjugation. (To be sure, in (2.23) all quantities are real. For later purposes, however, we need the complex version of this scalar
product.)
imply that (2.23) satisfies the defining properties of a symmetric hyperbolic system of partial differential equations. By a well-known
These two observations
theorem
Sect. 4.12 e.g., Theorem 4.5 in Chazarain and Piriou [261 or and uniqueness and Shubin [36]) this guarantees local existence
(see,
in
Egorov
of
a
X4
=
solution Z, Y for any initial data ZO, YO given on our hypersurface const. (Please recall our stipulation of tacitly working in the CI cate-
throughout Part I. Had we restricted ourselves to the analytic category instead, property (a) alone would guarantee local existence and uniqueness of a solution to any initial data, owing to the well-known Cauchy-Kovalevsky theorem.) Moreover, the fact that (2.23) is symmetric hyperbolic implies that solutions Z, Y are bounded in terms of so-called energy inequalities, see, e.g.,
gory
14
Light propagation
2.
in linear dielectric and
Theorem 4.3 in Chazarain and Piriou Shubin
[36].
are
going
to
[261
employ these
Theorem 2.63 in
or
Egorov
and
facts later.
explicit form of the matrix M in (2.23) will be of no interest for following. What really matters is the structure of the L, i.e., the
The us
We
permeable media
in the
information contained in the 6
x
6 matrix
L (x, p)
=
pa
L'(x)
(2.31)
.
Here the first argument x (Xl,X2,X3,X4 ) ranges borhood considered and the second argument p =
=
R4. The
L(x, p)
by (2.31)
defined
over
the coordinate
(P1, P2
7
neigh-
P3 i P4) ranges
over
is called the
principal matrix or the characteristic matrix of the system of differential equations (2.23). Its determinant, which gives a homogeneous polynomial of degree six in the Pa, is called the principal determinant or the characteristic determinant of (2.23). We shall see later that the laws of ray optics in our medium are coded in the matrix
characteristic determinant. The notions of characteristic matrix and characteristic determinant
can
be introduced for any system of kth order partial differential equations, linear in the highest order derivatives, that gives n equations for n dynamical vari-
ables. The characteristic matrix is then formed in from the coefficients of the
a
fashion similar to
highest independent of the unknown functions, i.e., if the system of differential tions is semi-linear, the characteristic matrix is of the form
L(x, p) Hence,
its determinant
respect
to the Pa.
2.2
gives
(2.31)
order derivatives. If these coefficients
=Pal a
...
al Pa, L
...
axe
equa-
(2.32)
ak(X).
homogeneous polynomial of degree nk with
Approximate-plane-wave families
preceding section we have discussed Maxwell's equations in a linear permeable medium. The laws of light propagation in such a medium are determined by the dynamics of wavelike solutions of those equations. In this section we clarify what is meant by the attribute "wavelike". The following definition is basic.
In the
dielectric and
Definition 2.2.1. An
family of
approximate-plane-wave family
antisymmetric second rank tensor
Fab(a, x) with the
(a)
=
fields of
the
is
a
one-parameter
form
Rej e S(x)lcl fab(ai X) 1
(2.33)
following properties.
The coordinates
x
=
(x 1,X2, X3,x 4)
range
and the parameter
spacetime manifold real numbers, a E R+.
a
over some
ranges
over
of the strictly positive
open subset
the
2.2
(b)
S is
a
real-valued function whose
8S(X)
families
Approximate-plane-wave
gradient has
no
zeros,
i.e.,
(49IS(X), a2S(X),193S(X), 494S(X)) 0 (0, 0, 0, 0)
=
15
(2.34)
neighborhood considered. We refer to S as to the eikonal of the approximate-plane-wave family. (c) For each a E R+, fab(a7 *) is a complex-valued antisymmetric second rank tensor field. Moreover, fab admits a Taylor expansion of the form for all
x
in the
function
No+1
fab (Ce X) 7
a
:--
Nf b(x) b
+
0(aNO+2)
(2.35)
N=O
for all integers No
>
-1, where
f b (x) refer to f wave family. We
(d)
For all
x
br
in the
as
N! lim
=
a-- O
to the N
th
order
fab (a, x)
(2.36)
.
amplitude of the approximate-plane-
neighborhood considered,
(fa0b W) In
9N ---ff ac,
(2.33), i denotes,
of course, the
0
0
(2.37)
-
imaginary unit, i2
1, and Re denotes
the real part of a complex number. We call S the "eikonal function" because
an approximate-plane-wave family satisfies Maxwell's equations in an asymptotic sense to be discussed later only if S satisfies a partial differential equation which is known as the eikonal equation. The term "eikonal", which was introduced in 1895 by Bruns [23] in a more special context, is derived from the greek word eikon which means "image". This terminology is, indeed, justified since the eikonal equation is the fundamental equation of ray optics; so it governs, in particular, the ray
optical laws of image formation. According to our general stipulation that all maps and tensor fields are tacitly assumed to be infinitely often differentiable it goes without saying that fab(al x) is a C' function of a E R+. A Taylor expansion of the form (2.35) is valid if and only if this function admits a C' extension into the point
a
goes to
=
0. Note that
zero
for N
--+
we
oo,
do not
i.e.,
we
assume
do not
that the
assume
O(a N+1)
analyticity
term in
(2.35)
with respect to
a.
It is important to realize that an approximate-plane-wave family cannot of the following lemma converge for a --+ 0. This is an immediate consequence which will often be used in the Lemma 2.2.1. Let S be the
following.
eikopal function of an approximate-plane-wave (b). Let u be a complex valued function
to Definition 2.2.1
family, according defined on the same open subset of spacetime as the approximate-plane-wave family. (As always in Part I, we tacitly assume that u is of class C' and, 0. thus, continuous). Then lim Re{e'S/auJ exists pointwise only if u =
a-0
16
2.
in linear dielectric and
Light propagation
permeable media
Proof. If u is different from zero at some point, it is different from zero, by continuity, on a whole neighborhood. For almost all points x of this neigh1:1 borhood, (2.34) implies that S(x) =h 0 and the limit does not exist. are now going to justify the name "approximate-plane-wave family". fully, (More (2.33) should be called a "locally-approximate-plane-and-monochromatic-wave family". This terminology, however, seems a little bit too cumbersome.) The physical idea behind Definition 2.2.1 becomes clear if we consider the special case that the tensor fields OaS and fab(a7 -) are covariantly constant (and non-zero, as assured by (2.34) and (2.37)), i.e., that the 0 are satisfied. Then (2.33) gives 0 and Vcfab(a) ') equations WaS a one-parameter family of monochromatic plane waves. With respect to an inertial system (i.e., a covariantly constant time-like vector field Va with -1), the frequency of such a wave is given by w IVaOaS and gab Va Vb I S W gab Vb Hence, the limit the spatial wave covector is given by ka ;, 19a a -* 0 corresponds to infinitely high frequency with respect to all inertial systems Va with VI 49aS :A 0. Now this is a very special case since on a spacetime without symmetry there are no non-zero covariantly constant vector fields. Therefore, as we want to work with ansatz (2.33) on an arbitrary spacetime, we cannot assume that (9a S and fab (a7 ) are covariantly constant. However, if we restrict our consideration to a sufficiently small neighborhood7 OaS and fab(al -) deviate arbitrarily little from being covariantly constant. Similarly, on a sufficiently -I devismall neighborhood, any time-like vector field Va with gab Va Vb small this ates arbitrarily little from an inertial system. However neighborhood may be, by choosing a sufficiently small we can have arbitrarily many wave periods in this small spacetime region. This reasoning justifies the terminology introduced in Definition 2.2.1. Please note that (2.34) and (2.37) are essential to guarantee that (2.33) gives an approximately plane and monochromatic wave near each point for a sufficiently close to zero. In correspondence with this interpretation we shall refer to the hyperconst. as to the wave surfaces of our approximate-plane-wave surfaces S terms eikonal surfaces and phase surfaces are also alternative family. The call common. Moreover, we
We
=
=
=
=
Ct
-
.
-
=
=
w
its
frequency function and ka (ce, x)
its
(a, x)
spatial
wave
covector
we
=
=
field
19aS(X) Va(x)
(2.38)
call
I- 19a S a
-.1.
(X)
-
LO
(Ci, X) gab (X) Vb(X)
(2.39)
with respect to the observer field Va; here all -1 are admitted for which
those time-like vector fields with gab Va Vb Va OaS has no zeros.
=
2.3
Asymptotic
It is worthwile to note that from
an
produce non-monochromatic appropriate density function w,
we
can
17
solutions of Maxwell's equations
approximate-plane-wave family (2.33) by integrating over a with an
waves
C,
Ra b (X)
Fab (a, x)
w
(a) da
(2.40)
.
al
generalized Fourier synthesis. Here we have to assume sufficiently small to justify the approximate-planewave interpretation. Moreover, it is also possible to form superpositions of approximate-plane-wave families with different eikonal functions S.
This
can
be viewed
as a
that a, < 02, with a2
2.3
Asymptotic
solutions of Maxwell's
equations
dynamics of wavelike electromagnetic fields in our medium we plug our approximate-plane-wave ansatz (2.33) into Maxwell's equations, i.e., into (2.2) supplemented with our constitutive equations. Unfortunately, only in very special cases is it possible to determine the eikonal function S and the amplitudes f b in such a way that the resulting equations feature of the are exactly satisfied for some a E R+. It is the characteristic To
study
the
have to
f, b
ray method to determine S and are
satisfied,
rather than for
some
in such
a
way that Maxwell's
In this way the ray method gives amplitudes in the high frequency limit. To put this
the
following sense
rigorously
we
-->
0.
wave
introduce
notation.
Definition 2.3.1. For N E in the
equations
finite value of a, asymptotically for a us the dynamics of wave surfaces and
Z,
an
approximate-plane-wave family Fab(al
of Definition 2.2.1 is called
an
Nth order asymptotic solution of
Maxwell's equations if lim a-0
lim a--+O
Here
Gf (a, ) -
(-2-V C,
G_
I
W,
77
77
abcd
abcd
0,
(2.41)
ab ('Rcd`f Gef (a)
Fab (Ce7
is related to
09bFcd(a1
by
0
the constitutive
equations of the
medium. In to the
(2.41),
the limits
are
meant to be
spacetime coordinates;
we
performed pointwise with respect neighborhoods on
shall restrict ourselves to
which the convergence is uniform. For the evaluation of two observations are crucial.
(a)
The metric is
independent
of
a
and
so are
(2.41),
the
following
the other tensor fields that
equations, i.e., U', '-a and Pab. Hence, the special form in which a enters into the approximate-plane-wave ansatz (2.33) together with the linearity of the constitutive equations implies that (2.41) is trivially satisfied for N < -1.
enter into the constitutive
b
18
2.
(b)
If
in linear dielectric and
Light propagation
(2.41)
holds for N
=
No, then
Permeable media
it holds all the
more
for N < No.
together suggest that Noth order asymptotic solutions can be found for arbitrarily large No > 0 by first evaluating (2.41) for N -1 and then proceeding step by step up to N No. We shall see in the following that this inductive procedure gives us dynamical laws for the eikonal function S and, step by step, for the amplitudes f, b up to arbitrarily large order These two observations
=
=
N
=
No.
If we want to get dynamical laws for f, b for all N E N, we have to assume that our approximate-plane-wave family (2.33) satisfies (2.41) for all N E N
F,,.b (Ce, equations.
or, what is the same, for all N C- Z. In this
asymptotic
series solution of Maxwell's
Fig.
2. 1. For N >
tions
approaches
for
-+
oz
0,
as
0,
an
Nth
order
case
is called
an
infinite
of Maxwell's equaasymptotic solution F(,b (a, equations asymptotically
the space of exact solutions of Maxwell's will be proven in Sect. 2.7.
of course, not imply that the one-parameter family Fab (a pointwise (or in any other sense) towards an exact solution of Maxwell's equations for a --> 0. We have already emphasized that for an approximate-plane-wave family the limit lim. F,,b(a, *) cannot exist. This
(2.41) does,
7
converges
C,__+O
question of whether asymptotic solutions can be viewed as approxThis question will be answered in Sect. 2.7 below by proving solutions. imate
raises the
following result. Let Fab(a7 -) be an approximate-plane-wave family that an Nth order asymptotic solution of Maxwell's equations in a linear dielectric and permeable medium for some N > 0. Then there exists, locally around any one point, a one-parameter family F* ab(a, ') of exact solutions of Maxwell's equations such that F,*,b(a, Fab(a) ) goes to zero in the finer norms involving arbiwith to some even sense respect pointwise (and aN+1 --+ 0. In other words, for a sufficiently for a as trarily high derivatives) small the members of our approximate-plane-wave family can be viewed as arbitrarily good approximations to exact solutions of Maxwell's equations. Figure 2.1 illustrates this situation in the infinite-dimensional space of (CI)
the is
2.4 Derivation of the eikonal
antisymmetric second-rank
equation and transport equations
tensor fields defined
on some
open
19
spacetime do-
main.
2.4 Derivation of the eikonal
and transport
equation
equations
we derive, in a linear dielectric and permeable medium, the dynamical equations for wave surfaces and for wave amplitudes in the high frequency limit. We do that locally around any spacetime point xO. As a preparation, we prove the following fact.
In this section
Proposition 2.4.1. Consider an approximate-plane-wave family Fab(a, that is an Nth order asymptotic solution of Maxwell's equations in a linear dielectric and permeable medium for some N > -1. Then the frequency function (2.38) of Fab(a) ') with respect to the rest system of the medium
(Va
Ua)
=
has
no zeros.
Proof. We introduce, around any spacetime point x, a coordinate system adapted to Ua in the sense of Definition 2.1.3. We are done if we can show that o94S is different from zero at x,,. By assumption, our approximate-planewave family satisfies (2.41) for N -1, i.e. =
77
77
where
=
4
abcd
ab S fcod
(2.42)
0
=
abS 77,def 900
=
(2.43)
0
fcod
by the constitutive equations. (Here we made use 0 at x,. At this point, the thata4S component of (2.42) implies
gOef
of Lemma a
abcd
is related to
2.2.1.)
Now let
us assume
=
g"' ajS bo, for the
magnetic part
bO,
of
fa0b, whereas the a a, S eto,
for the electric part eoV of
(2.44)
0
=
-
P
81, S eo,
0
falb. Similarly, (2-43) 9
A'r
a,, S d2r
=
components of (2.42) imply
(2.45)
results in
(2.46)
0
and
a,S ho A
-
ajS ho
0
V
(2.47)
for the electric part dO. and for the magnetic part hO Of goaab. Note that S is real whereas the amplitudes are complex. (2.45) and (2.46) imply
9/,&'r eo TO, aS
=
0
.
(2.48)
2.
20
Light propagation
Similarly, (2.44)
and
in linear dielectric and
permeable media
(2.47) imply 9trr
ho R, aS
0
=
(2.49)
.
0. Thus, condition (2.34) rewe are at a point where a4S condition Hence, by a3 0, (c) of Definition 2. 1. 1, : S, i92S, 0). S) (0, quires (a, that (bO, and that implies eo, 0, (2.49) 0) eo) (0, (eO, implies 1 bO, 2 bO) (2.48) 3 3 1 2 contraof a zero that a shows This our i94S gives having hypothesis 0, (0, 0).
Recall that
=
=
diction to
(3
(2.37).
To analyze the dynamics of wave surfaces and amplitudes in the high frequency limit near an arbitrary spacetime point xO, we introduce near xO a coordinate system which is adapted to the rest system U' of the medium in the sense of Definition 2.1.3. We can then express electromagnetic fields in terms of the dynamical variables Z1, Z2, Z3, Y1, Y2, Y3 introduced in (2.22). Then any approximate-plane-wave family takes the form
))
Z(a Y(a x) for any
No+1
Ref
=
integer No
eiS(X)l
E
(XN
N=O
> -1. Here the
(ZN(X)) yN(X)
o(aNo+2)
+
complex amplitudes
f, b
from
(2.50)
(2.35)
are ex-
following proposition yN. in terms gives necessary and sufficient conditions on the eikonal function S and on the amplitudes zN, yN such that (2.50) is an asymptotic solution of Maxwell's equations. of (C3-valued functions zN and
pressed
The
2.4.2. Consider, locally around any spacetime point xO, a coordinate system (xi, x2, x3, x') adapted to the rest system Ua of a linear dielectric and permeable medium, Then an approximate-plane-wave family, represented in this coordinate system in the form (2.50), is an asymptotic -1 if and Maxwell's equations in lowest non-trivial order N solution
Proposition
of only if 94S has
=
no zeros
and
,OaS La
(YO) CO) Z
0
(2.51)
=
th
No ! 0, such an approximate-plane-wave family is an No order totic solution of Maxwell's equations if and only if, in addition,
For
(L a(ga+M ) (YN) Z
N
for
0 < N
0 for the functions
case
o,
1, 2, 3, has
=
hi(x, p)2
and
a
double
h2(X, p)
2
of
bilinear with respect to the momentum coordinates. An example for medium is a uniaxial crystal. For the partial Hamiltonians (2.73) we
(2.5)
axe
such
a
find in this
special
after
case
quick
a
HA(x7P) for A
31
and the introduction of rays
=
1 and A
1 2
ga6 oA (X) Pa A
(2.90)
2, where
=
90ab ab 9 o2
==
calculation
1 =
-
F3
1
(,ab + Ua Ub)
=
61
(Xa1 Xb1
+
X2a Xb) 2
_
I
+ El
Ua
Ub, (2.91)
Oa
b
X3 X3
-
Ua
Ub.
Hence, either branch of the characteristic variety is the null cone bundle of a Lorentzian metric. Generalizing the terminology from the isotropic case, the two metrics (2.91) can be called the optical metrics of the medium. The first optical metric does not distinguish a spatial direction, i.e., it is of the second optical same kind as the optical metric in an isotropic medium. The is metric, however, reflects the fact that the X3-direction distinguished in the situation like this the 1-branch of the characteristic called the ordinary branch whereas the 2-branch is called branch. In this terminology solutions of the partial eikonal
medium considered. In
variety is usually the extraordinary equation (2.76) with A with A
=
2
are
a
=
1
are
associated with ordinary
associated with extraordinary
waves
and solutions
waves.
dielectricity tensor field are mutually different, the two characteristic varieties are no longer the null cone bundles of Lorentzian metrics. An example for such a medium is a biaxial crystal. If we want to speak of "optical metrics" in such a medium, we have to understand If the
eigenvalues
El, -52) E3 of the
the term "metric" in the Finslerian sense. Both these optical Finsler metrics display the anisotropy of the medium in a symmetrical way, i.e., it is not to distinguish one of them by the attribute "ordinary". For this
justified reason, waves
we
prefer
to
speak
and extraordinary
2.6 Discussion of
of 1-waves and 2-waves
waves)
in
general
(rather
than of
ordinary
anisotropic media.
transport equations and the
introduction of rays In this section
we
associate solutions of the eikonal
equation
in
a
linear di-
electric and permeable medium with congruences of rays. The guiding idea is to introduce the notion of rays in such a way that the transport equations
(2.58)
can
be
reinterpreted
as
ordinary differential equations along
rays.
32
Light propagation
2.
in linear dielectric and
permeable media
According to Proposition 2.5.1 the left-hand side of the eikonal equation a product structure. This suggests to introduce, for solutions S of the eikonal equation, the following terminology. S is called a solution of multiplicity two iff it satisfies both partial eikonal equations (2.76), and it is called a solution of multiplicity one iff it satisfies exactly one of the two partial eikonal has
equations. The multiplicity
can, of course, change from point to point. following we consider solutions of the eikonal equation on neighborhoods where the multiplicity is constant. Note that, for any given solution, there exists such a neighborhood near almost all spacetime points. We begin our discussion with solutions of multiplicity one, later we consider the somewhat more complicated case of solutions of multiplicity two. We introduce the following definition.
In the
Definition 2.6.1. Let S be
a solution of the eikonal equation according to on some open spacetime region U, S is of that, Proposition multiplicity one, i.e., that the partial eikonal equation (2.76) is satisfied for 2 at all points of U. Then the vector field A 1, say, but not for A
2.5.1. Assume
=
=
Ka(X) on
='OHI X,,gS(X) (
(2.92)
16Pa
U is called the transport vector field and its
integral
called the
curves are
rays associated with S.
In the
theory of partial differential equations the
(bi-) characteristic curves. Owing to (2.83) the transport are
immersed
corresponds
vector field has
rays
no
are
zeros,
also known
i.e., the
Changing the partial Hamiltonian according reparametrizing the rays. Please note that
curves.
to
a,,,S(x) K'(x)
=
0
to
as
rays
(2.84) (2.93)
.
This follows from the fact that H, satisfies equation (2.81) which was a consequence of the homogeneity property established in Proposition 2.5.2 (a). the transport vector field is tangent to the hypersurfaces S = const. We want to show now that the transport equations can be viewed as
Thus,
ordinary differential equations along rays. We do that locally around any point x0 of the neighborhood U mentioned in Definition 2.6.1. To that end we introduce a coordinate system adapted to the rest system U1 of the medium near x0. (Please recall Definition 2.1.3.) In such a coordinate system, the partial Hamiltonians (2.73) take the form HA (X P) i
1 :--:
2
(hA (X P) 7
+
P4
V
944 (X)
i.e., the partial eikonal equations (2.76)
-) (hA(X7P) are
94;(X) 1)
P4 -
r-
V
equivalent
to
,
(2.94)
2.6 Discussion of transport
equations and the introduction of rays
194S
hA (X7 49S(X)) for A
(2.95)
0
=
V'
33
944 (X)
1, 2. Here and in the following, the upper sign corresponds to negative frequency solutions, a4S < 0, and the lower sign corresponds to positive frequency solutions, 04S > 0. With (2-95) the partial derivative of HA takes =
the form
i9HA
OP.
194S
(x,,OS(x))
ah Uf 6A
-
N/
944
With these informations at
hand,
zNL
and
with ZN and 11
yN 1
yN 11
take
we
transport equation, i.e., at equation for
Ja
(T ON (X, aS(X)) (X) -
-
(2.58)
--4
_
N/1-944(X)
(2.96)
closer look at the Nth order
a
viewed
as a
differential equation
assumed known. In the situation of Defini-
2.6.1, the projection operator PS(x) onto the kernel of 9aS(x) L'(x) given, in terms of the eigenvectors UA and vA of (2.66), by tion
(U1(x'aS(x))))'3 ( ul(x,as(x)))) (X"9s(x)
PS(X) N and ZN and y I I
are
as(x)
V, (X,
necessarily of the ZN -L (X)
with
a
factor
C-valued function
form
U1
(X, 09S(X)) (x, as(x)
(2.98)
V,
N
After multiplication with the
the N th order transport equation for the function N of the form
(94S(X)1944(X),
differential
equation
K a (X) Here Ka is
an
K a(X)
(2.97)
V,
N(X)
Y N(X)
is
iga6N (X)
+
f (X)
6N (X)
))
-L a(x)
+
kN(X)
=
non-vanishing
(2.58)
reduces to
a
(2.99)
0.
abbreviation for
84S(X) 944(X)
(
(x, as(x)) V1 (X, 19S(X)
U,
( U1(x1(x, V,
aS(X) 49S(X)
)
(2.100)
and f (x) and kN(x) are known C-valued functions. Clearly, (2.99) gives an ordinary differential equation for N along each integral curve of the vector field Ka. We
now
through (2.92).
show that K a is,
indeed,
We first observe that
UA(XM ( VA(XiP))
(P4
-Pa
the transport vector field defined
(2.66) implies
L'(x)
UB(X)P) (VB(X7P))
\/--g44(x) hA(X7P)) JAB
(2.101)
34
Light propagation
2.
for
A, B
=
1,
2.
in linear dielectric and
Upon differentiation with respect
UA(XiP) ( VA(X7P))
,F-g44(x) (hA(X7P)
-
we
b
(X)
(
(2.101) yields
P))
UB (X; P) VB
(Xi
19
UB
VA
49hA
-
1OPb
( Xi P)) JAB
(2.102)
evaluate this equation with A B I along p 49S(x), we see that side of (2.100) coincides with the right-hand side of (2.96) for =
=
right-hand
A = 1. This proves that the vector field Ka in vector field associated with S. Now let
us
of solutions of
the
to A,
(Xi P) (UA(X,P) (X, P)) .ON (VB(X7P))
V -_944(X)
b
the
L
hB(X,P))
(J40 If
permeable media
following
(2.99) is, indeed,
the transport
investigate to what extent these results carry over to the case multiplicity two. In analogy to Definition 2.6.1, we introduce
notions.
Definition 2.6.2. Let S be
solution
of the eikonal equation according to Proposition that, on some open spacetime region U, S is of that the partial eikonal equation (2.76) is satisfied for multiplicity two, i.e., A 1 and A 2 at all points of U. Then for A 1 and A 2 the vector field a
2.5.1. Assume
=
=
KI (X) a
on
the the
U is called the
A-rays partial
=
=
A-transport
allA
ON
(X"9S(X))
vector field and its
(2.103) integral
associated with S. We shall also refer to transport vector fields associated with S.
Kl'(x)
curves are
and
K2a(x)
called as
to
and K2a may or may not coincide. For a solution of multiplicity two, Ka 1 (If they are collinear, they can be made equal by a transformation of the form (2.84).) If the two branches of the characteristic variety coincide, all solutions are of multiplicity two with Ka K2a. This is the case for an 1 isotropic medium where, by (2.87), =
K1a(X)
=
Ka(x) 2
=
gab(X),gbS(X)
(2.104)
0
Hence, there is only one congruence of rays associated with each solution of the eikonal equation in an isotropic medium. In the anisotropic case we have to live with the situation that solutions of the eikonal equation might be associated with two different congruences of rays. Clearly, this makes it more complicated to interpret the transport equations as ordinary differential equations along rays. We are now going to work out the details. In the situation of Definition
2.6.2, (2.97)
is to be
replaced
with
2.6 Discussion of transport 2
Ps(x)
=
(
E A=1
and ZN and I
yNI
are
equations and the introduction
UA(X,P) VA(X,P)
) ( 0
UA(X,P) VA(XiP)
of rays
)
35
(2.105)
of the form
(Y (x
(
2
ZN(X) I A 1,
(X, IOS(X)) ) VA (X, OS (X )
UA
E UN(X) A=l
(2.106)
and 2N to be determined. Upon multipliIN non-vanishing factor a4S(X)1944(X), the transport equation system of two coupled differential equations for ,N and 2N of
with two C-valued functions cation with the
(2.58) gives
a
the form 2
Kj(X),9a AN(X)
+
E fAB (X) N(X) + kAN(X) B
=0,
A=1,2.
(2.107)
)
(2.108)
B=1
Here
KI
Kj(x)
is
=
an
abbreviation for
194S(X) 944(X)
-
(X, aS(X)) (VA(X749S(X))] UA
L a(X)
(
(X7 49S(X)) VA (Xi 19S(X))
UA
known C-valued functions. To put the transport 1, 2 and fAB, kN A are form the into equation (2.107), we made use of the fact that, by (2.6), our for A
=
multiplicity-two solution
(
(X, i9s(x)) (x, as(x))
) U2(X,19S(X)))) (X, aS(X) U,
-
V,
V2
To
satisfies
.
(X, 19S(X)) (V2(X,19S(X))) "(X) ( ul(x,as(x))) ) (X"9s(x)
La (X)
L
U2
(2.1.09) =
0.
V,
verify that the KAa given by (2.108)
are,
indeed,
the
partial transport
B. This shows (2.103), we consider (2.6) with A that the right-hand side of (2.108) coincides with the right-hand side of (2.96). If the two partial transport vector fields coincide, (2.107) gives an ordinary the integral curves of Kja differential equation for the tuple ( 1N' 6N 2 ) along I K2a. In the general case, the situation is more complicated. (2.107) with A along the integral curves of gives an ordinary differential equation for 6N 1 2 gives an ordinary differential Kf that involves 62N, and (2.107) with A along the integral curves of K2a that involves 6N equation for 6N 1 2
vector fields defined in
=
=
=
1
following way. For a solution of the eikonal equation in a linear dielectric and permeable medium, Definition 2.6.1 gives a transport vector field and, thus, a congruence of rays on open subsets on which the multiplicity is one, and Definition 2.6.2 gives two partial transport vector fields and, thus, two congruences of rays on open subsets on We
summarize our
discussion in the
36
Light propagation
2.
in linear dielectric and
permeable media
which the
multiplicity is two. What is left out is the set of all points where multiplicity changes. By continuous extension into such point we might get pathologies such as bifurcating rays. Any ray is an integral curve of a vector field KAI given by (2.103) with A I and/or A 2. For any such integral curve s x(s) we can define a ) map s i p(s) by p(s) o9S(x(s)), thereby getting a solution of Hamilton's the
=
i
=
)
=
equations
HA
(x(s) p(s))
a(S)
,
OHA
Pa(S) We call any immersed curve s ) p(s), an A-ray for short, A s
=
0
1
X(s),P(s)
ON OHA
)
,gxa
(X(S),P(S))
x(s)
for which
=
1,
(2.110)
I
(2.6)
is
satisfied, with
some
2.
partial Hamiltonian HA is changed into f1A by a transformation (2.84), the A-rays undergo a reparametrization but they are unchanged otherwise. In other words, the A-rays are determined, up to their parametrization, by the A-branch of the characteristic variety. In the uniaxial case discussed in Example 2.5.2, the 1-rays are called ordinary rays and the 2-rays are called extraordinary rays. If we solve Hamilton's equations (2.6) with the partial Hamiltonians given by (2.90), we find that the ordinary and extraordinary rays are the light-like geodesics of the first and the second optical metric (2.91), respectively. In the isotropic case there is only one optical metric and the notions of 1H2 rays and 2-rays coincide. By solving Hamilton's equations (2.6) with H1 given by (2.87), we find that the rays are exactly the light-like geodesics of the optical metric. This implies, of course, in particular the familiar textbook result that in vacuum the rays are exactly the light-like geodesics of the If the
of the form
=
spacetime metric.
2.7
Ray optics
From the
as an
approximation scheme
preceding sections we know that
rays
are
associated with asymptotic
solutions of Maxwell's equations. We are now going to show that they are associated, moreover, with approximate solutions of Maxwell's equations. For the
physical interpretation
of ray
optics this is
a
crucial point.
solution S of the eikonal equation which, in a medium of the kind under consideration, is given by (2.74) with partial Hamiltonians Let
(2.73).
us
start with
a
As always, we assume that S is given on some open neighborhood of spacetime and that its gradient has no zeros. Moreover, we have to assume in
2.7
Ray optics
as an
37
approximation scheme
following that S is associated with a unique -congruence of rays. In other words, we have to assume that either S is a solution of multiplicity one or that S is a solution of multiplicity two for which the two partial transport the
vector fields coincide.
It is our goal to associate such an eikonal function S with an approximate solution of Maxwell's equations. To that end, we fix a spacetime point and, the rest on a neighborhood of that point, a coordinate system adapted to We 2.1.3. of Definition use the inductive the in medium sense the of system method of Sect. 2.5 to construct
(
Z(a ,x ) Y(a ,X)
) =R+
an
Nth order asymptotic solution
I] am (ym(x) (x)) +1
is(x)/Ci
/m
+ 0 (a
N+2) I
(2.111)
M=O
equations (2.23), where N can be chosen as large as we want. arbitrary. For any choice of this term, (2.111) th order N asymptotic solution of the constraints as well. automatically an
of the evolution
This leaves the 0 (a N+2) term is
It is not difficult to check that the
0(aN+2)
term
can
be chosen in such
the initial hypersurface a way that the constraints are exactly satisfied on X4 = const. These initial values determine a unique exact solution of the
evolution
family
equations (2.23)
for each a,
thereby giving by Z* (a,
us
of exact solutions that will be denoted
a
one-parameter
Y* (a,
Now the
difference
AZ(a,
Z(a,
Z*(a,
AY(a,
Y(a,
Y*(a,
La i9a + M
AZ(a ) (,6Y(a:
(2.112)
satisfies ,
)))
=
0(aN+1)
(2.113)
and vanishes on the initial hypersurface. We have already stressed in Sect. 2.1 that the differential operator on the left-hand side of (2.113) is symmetric hyperbolic with respect to the scalar product (2.29). Hence, we have the so-called energy inequalities at our disposal (see, e.g., Theorem 4.3 in Chazor Theorem 2.63 in Egorov and Shubin [36]). As a arain and Piriou consequence,
[26] (2.113) implies
the existence of
a
constant C such that the
inequality
(,AZ(a, AY(a, holds
on
an
appropriately
AZ(a, (,AY(a, chosen
0
has
no
(ja.
This is indeed
possible
zeros,
(3.18)
,
spacetime region considered. If this condition is satisfied, following way. From (3.14) we find, with the help of (3.9) and (3.17),
in the
we can
pro-
ceed in the
0
e 0
e A
Since
we can
(3.19)
"a
Vb Pcb (ja
+ U
C
0
UC)
(3.20)
by no, (3.20) can be used to eliminate 6ra from (3.15). following linear second order differential equation for -Pab:
0
Ub(ja + U0a Uc) VbVd-P C
=
Vb-Pab
divide
This results in the 0
(ja
Ua
ft
cd
0
+ e2
(Vb Ub(ja C
0
0
-AU
M
b
^a
Fb
0
0
'
a + U U
")
+
Vc U
a) VdP
cd -
=0-
(3.21)
Light propagation
3.2
If
have
we
a
solution
-Pab
(3.13)
of
in
non-magnetized plasma
a
(3.21),
and
we can
49
&a by
define ft and
(3.19) (3.20), respectively. It is easy to check that then the full system of equations (3.13)-(3.17) is satisfied. In other words, we have reduced this and
system (3.13)-(3.17)
Pab alone, given by (3.13)
dynamical equations for
to
(3.21).
and
To rewrite in terms of
a
(3.13) and (3-21)
and
a more
we assume
a
we
express
-Pab
(3.22)
thatAa satisfies the Landau gauge
condition in the rest system
background
electron
"--
fluid, -
(3.23)
0
theory to verify that any antisymmetric be locally represented in this way, and (locally) uniquely determined by P ab up to gauge transformations
standard exercise in Maxwell
tensor field
that
form,
V[aAbj
0[aAb]
'--:
Aa Ua It is
convenient
potential Aa,
Pab of the
in
Aa
is
-Pab that satisfies (3.13)
can
Aa + 49ah
A,,
(3.24) 0
where h is any spacetime function that is constant along the flow lines of Ua. In other words, h can be freely prescribed on a hypersurface transverse to those flow lines.
With
(3.22), (3.13)
automatically satisfied and (3.21) takes the
is
E)af Af
=
(3-25)
0
where the differential operator Daf is defined
E)af Af
0 =
U
b
0
0
0
a
c
e
U c) + Vc U
20
;;T
determines the
0
a
_
C
(VbU b(p + U (3.25)
by
(ja + U U,)Vb(VfVc "
0
n
a) (VfVc
_
gfcVd Vd) Af gfcVd Vd) Af
+
-
U Vb)Af U _gafob (OfVa
dynamics
of
form
electromagnetic
(3.26) waves
in
our
plasma.
component equations, but only three of them (3.25) dependent since the equation consists of four
are
in-
0
Ua'DafAf
=
0
(3.27)
identically satisfied for any Af. By the Landau gauge condition (3.23), Af independent components. Hence, we have as many equations as unknown functions. In this sense, (3.25) gives a determined system of linear third order differential equations for the electromagnetic potential. To make
is
has three
50
3.
this
Light propagation
explicit,
one can
in other kinds of media
choose,
open subset of
appropriate
on an
spacetime,
0
orthonormal tetrad field El, E2, E3, E4 with form
Af
gfkAA
=
of
Multiplication
E k with A
(3.25)
scalar
some
with ga,
it Ehlers
[18] [19]
existence and
prescribed
=
U'.
By (3.23),
functionsAl, A2, A3
El' gives ,
1, 2, 3) for the three functions
=
E4'
us
on
three equations
Al, A 2 A 3
.
,
isofthe
that domain.
(numbered by
It is shown in Breuer and
that this system of linear differential
equations admits 0
AA,
uniqueness theorem for any data space-like hypersurface.
Af
an
Ua
OaAll,
0
0
Ua U
b
a
local
iaaabA"
on a
Viewed in this sense, (3.25) is the system of evolution equations for electromagnetic waves in our plasma. Those evolution equations are of second
strengths, and they are not supplemented by constraints. different from the evolution equations (2.23) in a linear thus, quite are, dielectric and permeable medium. Unfortunately, (3.25) is not of the kind for which standard theorems guarantee the validity of energy inequalities. With the dynamical law (3.25) at hand, we can now perform the passage to ray optics. Since it is our goal to take dispersion into account, we proceed in a way different from Chap. 2. As outlined in Sect. 3.1, it will be crucial to consider one-parameter families of background fields rather than fixed background fields. The background fields that enter into the differential operator -Daf are the metric gab, the electron number density n" and the electron 4-
order in the field
They
0
velocity
Ua. Let
(3.6)-(3.9)
and
us
fix such
a
system around this point. We
by
the coordinates xo
=
1
background fields which have to satisfy fix a spacetime point and a coordinate assume that the chosen point is represented
set of
(3.18). Further,
let
3
2
(xO' X0
us
,
X0,
4
xO)
and that the considered coordinate
star-shaped with respect to xo in R 4. The latter condition means that for any point x in this domain the straight line between x and xO is completely contained in this domain. Refering to this fixed coordinate system, we define new background fields, depending on a real parameter,8, by domain is
gab 0
n
(0, X) P, X)
=
gab ,
=
A
(XO +Nx
-
-
XO))
XO))
(3.28)
,
(3.29)
,
0
0
Ua
(XO + 0 (X
(0, X)
Ua (X, + 0 (X
=
-
X()))
(3.30)
.
0
nO(,8,
For 0 < 6 < 1, the new background fields gab (,3 1 well defined on the star-shaped domain considered, and
and U' (,3,
are
they satisfy again
equations (3.6)-(3.9) and condition (3.18). (This observation does
not carry
0
over
if
an
electromagnetic background field Fab -'A 0 is to be taken into a magnetized plasma, one cannot assume the same,3-dependence
account. For
0
0
background fields gab) on) U,and Fab-) 0, the components of the background For,3 a
for all
--+
fields become constant in
the coordinate system under consideration. In this sense,
gab(O, )) nO(O, *)
3.2
Light propagation
in
a
non-magnetized plasma
51
0
and
U'(0,
fields. In
homogeneous
are
particular, gab(O -)
is
a
flat metric
0
and
Ua(O'
is
covariantly constant, i.e.,
to this metric. For this reason,
homogeneous background If
replace
we
9ab(6, .),
(3.2)
in
we
inertial system, with respect '3 --+ 0 as to the
an
shall refer to the limit
limit.
the
original background
fields gab,
on
0
and Ua
by
0
AO(3, -)
Ua(O' .), respectively, we get a one-parameter family ). It is our plan to enter into the differential 0 with an approximate-plane-wave ansatz for equation E)af (0, .)Af (0, -) the potential Af (A ). Hence, we consider two-parameter families of the form and
of differential operators E)af (,J,
-
=
-
Af (a,fl, x) 2
;7
which
Re
f e's(xO+,O(x-xO))Ia af (a,
satisfy the Landau
0
assume
that the
xo
+,3(x
-
xo)) I
gauge condition
Af (a, 13, x)
Of (3, x) We
(3.31)
=
complex amplitudes
are
0
=
(3.32)
.
of the form
No+1 N aN( f -)a
eif (a
+
O(CNo+2
(3.33)
N=O
for all integers No > -I and that
Fab (a, 0, X) Re
Iei
S(xo+o(x-xo))/Ci i
'--
19[aAb] (a,,3) x)
(49[a S 60b]) (XO + 0 (X
(3.34)
=
-
XO))
+ 0 (a)
I
approximate-plane-wave family, in the sense of Sect. 2.2, for any fixed 0 < 1. For an approximate plane wave in this family, the frequency function with respect to the background electron rest system (3.30) is then given by
is
an
with 0
0
on
are
which
to be
A"
=
0
embark upon such
an
investigation here.)
3.2
Light propagation
in
a
non-magnetized plasma
57
is, therefore, justified to concentrate the discussion of light propagation plasma on the partial Hamiltonian H3. Using the eikonal. equation (3.56) and the Oth order polarization condition (3.55) as the starting point, we could now go on to evaluate (3.37) inductively for N 1, 2, 3 etc. This would result in transport equations and polarization conditions for the amof arbitrarily high order. We leave it to the reader to verify that, plitudes 6N f proceeding along the lines of Sect. 2.4, this hierarchy of equations can be solved inductively to construct solutions of the asymptotic condition (3.37) for arbitrarily large N. Unfortunately, it is a difficult problem, apparently unsolved so far, to prove that those asymptotic solutions are approximate It
in
our
=
well. The method of Sect. 2.7 does not carry over since the differential equation (3.25) is not of the kind for which energy inequalities are solutions
as
Therefore, it is hard to see how the difference between our asymptotic solutions and appropriate exact solutions could be estimated. If such error estimates do exist, they are, of course, quite different depending on which sort of limit is considered. For the homogeneous background limit with fixed frequency considered here, the error bounds must go to zero if known to hold true.
o
background fields gab7 no and Ua become homogeneous. In other words, our asymptotic solutions yield good approximations if the background fields are sufficiently homogeneous. Clearly, this is not necessarily the case if the high-frequency limit on a fixed background is considered. Either limit yields
the
reasonable eikonal equation, reasonable transport equations and reasonpolarization conditions. The difference is in the range of validity as an approximation scheme for exact electromagnetic wave fields (providing this
a
able
validity
can
be
established,
in terms of
From the results of this section
error
estimates,
at
all).
following lesson. The plasma does not only depend on
we can
draw the
equation for light propagation in a plasma model (two-fluid model, infinite inertia of the ion component, vanishing pressure of the electron component, linearization around background with vanishing electromagnetic field, etc.); it also depends on the kind of asymptotic limit considered. For the high frequency limit on a fixed background, the eikonal equation is exactly the same as for light propagation in vacuum, i.e., there is no effect of the plasma on the rays. For the homogeneous background limit with fixed frequency, on the other hand, the eikonal equation is given by (3.43), i.e., there is an effect of the plasma on the rays which has causes, in particular, dispersion. Although the eikonal. equation (3.43) a product structure associated with three partial Hamiltonians, one should not speak of "multiple refraction" in this case. The reason is that only the transverse modes described by H3 can be linked to solutions of the vacuum eikonal equation in the way indicated above. In other words, rays that enter into our plasma from an adjacent vacuum region have to proceed as 3-rays, i.e., they are not multiply refracted. In a magnetized plasma, however, the background electromagnetic field causes the branch of the dispersion relation associated with H3 to split into two branches. Then the medium becomes
eikonal the
58
3.
Light propagation
in other kinds of media
On a special-relativistic background this is a standard plasma physics. For a general-relativistic treatment of this case
double-refractive. sult of
refer to Breuer and Ehlers
rewe
(18] [19].
In addition to the high frequency limit on a fixed background and the homogeneous background limit with fixed frequency there are many other possibilities. To mention just one further example, we could modify ansatz (3.28)-(3.30) by assuming a different scaling behavior for the one-parameter family of background fields. In this way it is possible to derive, e.g., an eikonal equation such that the rays are directly affected by the rotation of the back0
ground rest system U1. An eikonal equation of this kind was brought forward by Heintzmann and SchrUer [61], based on earlier work by Heintzmann, Kundt and Lasota [601 in the context of special relativity. For each eikonal equation derived that way, the range of validity as an approximation scheme (if any) must be checked individually.
4. Introduction to Part II
In Part II
optics as a theory in its own right. In Chap. 5 we finite-dimensional manifold M and set up a Hamilarbitrary presuppose tonian formalism for ray optics in the cotangent bundle over M. In Chap. 6 we assume that, in addition, a Lorentzian metric is given on M. Specialized to the case dim(M) 4, (M, g) can then be interpreted as a spacetime in the sense of general relativity and our formalism covers ray optics in arbitrary media on such a spacetime. This procedure has the advantage that the results of Chap. 5 apply equally well to spacetime theories other than general relativity and to the case that M is to be interpreted as space, rather than as spacetime, in any kind of theory where such a notion makes sense. Chapter 7 will then be devoted to variational principles for rays and Chap. 8 presents applications of the general formalism to astrophysics and astronomy. we
treat ray
an
=
The results of Part I will often be used for the sake of motivation, and will provide us with illustrative examples. However, the mathematical
they
formalism
developed
4.1 A brief In the most some
guide
following
part
in Part II is
we
we use
completely self-contained.
to the literature
make extensive
use
of Hamiltonian formalism. For the
coordinate free notation since it is
global questions. We assume symplectic geometry
calculus and with
our
goal
to also treat
that the reader is familiar with differential as
it is used in the modern treatment
of classical mechanics. Our standard reference for
background
material is the
textbook by Abraham and Marsden [1]. In addition, we also refer to Arnold [8) and to Woodhouse [150]. More particularly in view of optics, it might
helpful to consult the textbook by Guillemin and Sternberg [551 where applications of the Hamiltonian formalism and of symplectic geometry to optics are given in modern mathematical terminology. In traditional notation, applications of the Hamiltonian formalism to optics can be found, e.g., in the classical work of Carath6odory [25] and of Luneburg [88) [89). Readers interested in the historical roots of "Hamiltonian optics" should go back to the original work of Sir William Rowan Hamilton who established this formalism in the 1820s, see vol. 1 of the collected papers of Hamilton edited by Conway and Synge [29]. Next to the work of Hamilton, the most fundamental be
V. Perlick: LNPm 61, pp. 61 - 65, 2000 © Springer-Verlag Berlin Heidelberg 2000
4. Introduction to Part Il
62
contribution to the mathematical
theory
of ray optics is due to Bruns
[23]
who introduced the socalled eikonal function. The relation of Bruns's eikonal function to Hamilton's characteristic
function is controversially discussed in by Herzberger [631 and Synge [1391. Textbooks on general relativity do not usually treat ray optics in detail. Most of them are restricted to light propagation in vacuo where the light rays are just the light-like geodesics of the spacetime metric. An important exception to this rule is the book by Synge [142] where a Hamiltonian formalism articles
for ray
optics
in
isotropic media
to compare this to earlier work
[1381 [1401 [141].
is discussed in
some
detail. It is worthwile
ray optics by the same author, see Synge More recent work by Miron and Kawaguchi [97], also see
[68] [69],
on
strongly influenced by ideas of Synge but terminology. It is the main purpose of their work to develop a differential geometric formalism for light propagation in isotropic dispersive media. Miron and Kawaguchi repeatedly claim that standard symplectic geometry does not provide an appropriate framework for the treatment of such media. We do not share this point of view Having set up a Hamiltonian formalism for general-relativistic ray optics, the way is paved for characterizing rays by a variational principle. Some of these variational principles can be interpreted as general-relativistic versions of Fermat's principle. The oldest versions, which hold for vacuum rays in static or stationary spacetimes, date back to Weyl [149] and Levi-Civita [81]. Related material can be found in Levi-Civita [80) [82] [83] and Synge [137]. Kawaguchi uses
and Miron
a more
is
modern mathematical
.
reader is cautioned that the latter paper does not meet the standard of Synge's later work.) These versions of Fermat's principle are also discussed in
(The
several modern textbooks and review articles, see, e.g., Frankel [43] or Straumann [136] for the static case and Landau and Lifshitz [761 or Brill [211 for the stationary case. For a discussion from a mathematical point of view we refer to Masiello [931. Generalizations from vacuum to an isotropic medium,
first considered by Pham Mau Quan Uhlenbeck [144] found the first hand, [116] [117] [118] [119]. variational principle for (vacuum) light rays in general-relativistic spacetimes without symmetries. Whereas for the work of Uhlenbeck it was crucial that the spacetime be globally hyperbolic, Kovner [741 was able to formulate a Fermat principle for vacuum light rays in an arbitrary Lorentzian manifold. A rigorous proof that the solution curves of Kovner's variational principle are, indeed, the light-like geodesics was given in Perlick [108]. Kovner's variational principle was further discussed, both from a physical and from a mathematical point of view, e.g., by Faraoni [42], Nityananda and Samuel [1011, Schneider, Ehlers and Falco [1281, Bel and Martfn [12], Perlick [110] and in several articles by Giannoni, Masiello and Piccione, see, e.g., Giannoni but still assuming
stationarity,
were
On the other
and Masiello
[46]
or
Giannoni, Masiello and Piccione [47] [48].
4.2
4.2
Assumptions
Throughout
Part II
we
Assumptions and notations
63
and notations presuppose
a
finite-dimensional real C"O manifold M
whose topology satisfies Hausdorff 's axiom and the second axiom of countabil-
ity. This implies that M is paracompact. The terms "manifold" and "submanifold" always mean manifold and submanifold without boundary. The physical interpretation we have in mind refers to M as to a spacetime model in the sense of general relativity. However, for the basic concepts of ray optics, to be introduced and discussed in Chap. 5, we need no additional structure on M and n dim(M) need not be specified. In Chap. 6 we shall assume that there is a C1 Lorentzian metric g given on M, whereas n dim(M) will still be an unspecified positive integer except for the restriction that, in Chap. 6, we assume n > 2 to exclude some pathologies. In all applications to relativity we use units making, the vacuum velocity c of light equal to one. At a point q E M, we denote the tangent space by TM and its dual, the cotangent space, by Tq*M. The tangent bundle will be denoted by ) M. It will ) M and the cotangent bundle by -r.) : T*M ,rm: TM often be necessary to remove the zero section from TM and from T*M; =
=
0
0
by TM and T*M, respectively. By a "Lorentzian metric" we always mean a covariant symmetric second rank tensor field with signature ( ...... +, -). With respect to a Lorentzian metric g, a linear subspace of the tangent space TM is called space-like if on this subspace the metric g is positive definite, light-like if it is positive semidefinite but not positive definite, and time-like otherwise. A vector X E TqM is called space-like, light-like, or time-like whenever the linear subspace spanned by this vector has the respective property. As a consequence, X is 0 or g(X, X) > 0, light-like if X : 0 and g(X, X) 0, and space-like if X the same time-like if g(X, X) < 0. If X is space-like, light-like, or time-like, of is submanifold M property is assigned to the covector g(X, .). Finally, a the has whenever its time-like called space-like, light-like, or tangent space respective property at all points. With a Lorentzian metric (or, more generally, a pseudo-Riemannian met-
what is left will be denoted
=
=
ric of any
signature)
on
M there is associated its Levi-Civita connection V.
This defines the notions of
parallel transport and of geodesics. By
) M from map A: I to k Such a curve is called
always is parallel if, more specifically, mean a
we
V
We
assume
definition of
VA
a
a
geodesic
real interval into M such that
an
affinely parametrized geodesic
0 is satisfied. the equation that the reader is familiar with exterior calculus. As to the
antisymmetric
tensor
=
product,
exterior
derivative, etc.,
and factor conventions follow Abraham and Marsden
[1].
Whenever
sign refering
our
a local chart (xl,... I Xn) on M, we use Einstein's summation convention with latin indices running from 1 to n and greek indices running from I to 1. With respect to such a local chart, elements of TM can be represented n in the form vaalaxa, and elements of T*M can be represented in the form
to
-
64
4. Introduction to Part 11
p,,dx'.
In this way
(x ...... x vl,..., v')
local chart
on TM, and a Following Abraham and Marsden [1) we refer to charts constructed in this way as natural charts induced by (XI, x'). It is also usual to refer to the va as to velocity coordinates and to the Pa as to momentum coordinates conjugate to the x". Occasionally we also refer to elements of TM as to velocity vectors and to elements of T*M .
as
.
.
we
get
(xl.... xn, pi,
local chart
I
a
-
-
-,
p.)
on
T*M
-
,
to momentum covectors.
It is well known and
well-defined) 0 is known
one-form 0
as
easily verified that on
the canonical
coordinate-free
there is
T*M such that 0
one-form
on
a
padxa
=
T*M. It
can
(unique
and
globally
in any natural chart. be characterized in a
T* M such that 3* 0 =,a ) T*M, see Abraham and Marsden 0: M [1], p. 179. (Here,,3*0 denotes the pull-back of 0 with,3.) The two-form Q -dO, which dxa A dpa in is known as the canonical two-form on T* M, takes the form Q chart More Q in natural which chart. takes this T*M on generally, any any It 0 called canonical chart. obvious that form is is is closed a special (i.e., dQ 0) and non-degenerate (i.e., the equation Q(X, -) 0 implies X 0). Hence, S? makes T*M into a symplectic manifold The restrictions of 0 and manner as
the unique one-form
on
for all C' sections
=
=
=
0
again be denoted by 0 and S? for the sake of simplicity. R a Hamilassign to each C' function H: T*M tonian vector field XH on T*M by the formula
f2 to T*M will
S?
can
be used to
Q(XH, -) non-degeneracy indeed, well defined.
The
=
(4.1)
dH.
of f2 guarantees that the assignment H i XH iS2 In a natural chart, the Hamiltonian vector field XH
takes the form
XH A C'
curve
of Hamilton's
field
aH
0
=
aH
a
19xa
'9N
(4.2)
_
ON
19xa
'
T*M, defined on a real interval 1, is called a solution : I equations iff it is an integral curve of the Hamiltonian vector
XH, i.e., iff
(4.3) for all
(4.3)
s
E
1. If
is
represented
in
a
natural chart
as a
takes the familiar canonical form of Hamilton's
.ta(s)
pa(s)
'9' =
159Pa
(X(S)'P('
OH =
-
axa
(X(s), p(s))
map
s 1
)
(x(s), p(s)),
equations
(4.4)
.
(4.5)
Equation (4.4), which gives the velocity coordinates as functions of the position and momentum coordinates, is properly called the vertical part of
4.2
Hamilton's tors
equations
tangent
corresponds
since it
to the fibers of T*M. If
(4.4)
Assumptions and to
equation (4.3) applied
can
65
notations
to
vec-
be solved for the momentum
coordinates p, the result can be inserted into (4.5). This leaves us with a set of second order equations for the position coordinates x1. Locally around some point u E T*M, (4.4) can be solved for the momentum coordinates p,, if and
if the condition
only
det(H ab) holds at
Here and in the
u.
Ha
following,
aH -
H
,
ON
ab
-
(4-6)
0
we use
the abbreviations
a2H =
aPaaPb'
(4.7)
etc.
A Hamiltonian H that satisfies (4.6) at u is called regular or non-degenerate at u. (It is easy to check that (4.6) holds in any natural chart if it holds in be viewed one natural chart.) Hence, locally Hamilton's equations can
just
if and only if H is as a system of second order differential equations on M everywhere regular. To express regularity in invariant notation, without refering to a natural .) R in chart, one introduces the fiber derivative of a function H: T*M to H of the following way. For q E M, we denote the restriction T,*M by ) R being a differential the each for E (dHq)u: u Tq*M then, Tq*M, Hq;
linear map can be viewed as an element of (Tq*M)* '- TqM. The fiber deriva, (dHq)u ) TM of H is defined by the equation (FH)(u) tive FH: T*M where q -r) (u). Using natural charts on T*M and TM, induced by one and the same chart x x') on M, the fiber derivative takes the form (x 1, =
=
=
(x, p)
i
)
Hamilton's
.
.
.
,
9H1,9p). With the help of the map FH the equations (4.3) can be written in the form (-r 4
(x, v
=
vertical part of FH o , )*
o
=
where the ring denotes composition of maps and the dot stands for derivative with respect to the curve parameter. Hence the desired invariant characteriza-
regularity can be given in the following way. A Hamiltonian is regular at a point u E T*M if and only if its fiber derivative FH maps a neighborhood TM is of u diffeomorphically onto an open subset of TM. If FH: T*M called H is hyperregular. even a global diffeomorphism,
tion of
Ray-optical structures on arbitrary manifolds 5.
object on which all of ray dispersion relation, i.e., a characteristic variety in a cotangent bundle. We make this into the general definition of ray optical structures on our arbitrary n-dimensional manifold. In Part I
optics
we
can
have
seen
be based is
that the fundamental
a
5.1 Definition and basic
of
ray-optical
properties
structures
we performed the passage from Maxwell's equations to ray optics in I, the characteristic variety came about as the zero-level surface of a Hamiltonian function H, where H was a smooth function on the punctured
When Part
cotangent bundle whose derivative with respect nates had on our
no zeros.
to the momentum coordi-
It is therefore natural to define
arbitrary n-dimensional manifold M
as a
a
ray-optical
structure
closed codimension-one sub-
0
manifold N of T*M which is
everywhere
transverse to the fibers. If
the condition that JV covers all points of M, we definition which is fundamental for all of Part II. Definition 5.1.1. A
ray-optical
structure
on
led to the
are
M is
a
(2n
-
we
add
following
I)-dimensional
0
closed embedded C' a
submanifold M of T* M
such that
r) JAr: JV
M is
su7jective submersion. Here
-r) JAr
denotes the restriction of the bundle
M to Ar. The condition of
-r,14 I.Ar being
a
projection -r.L
:
T* M
submersion guarantees that A( is
0
-r.) JAr being surjectogether imply that
transverse to the fibers of T* M, whereas the condition of
tive guarantees that
X
covers
all of M. Both conditions
0
the set
Arq
=
Ar
n
T,*M
is
a
codimension-one submanifold of the
cotangent
punctured
0
0
for all q E M. As space T*M q
0
Ar is closed in T*M,
so
is
Arq
in
points q and q', the manifolds Arq and JVq, are not necessarily diffeomorphic. particular, Ar need not be a fiber bundle below. be over M. This will exemplified
Tq*M.
Note that, for two different
In
V. Perlick: LNPm 61, pp. 67 - 109, 2000 © Springer-Verlag Berlin Heidelberg 2000
68
5.
Ray-optical
According
structures
on
arbitrary manifolds
5.1.1, A( need not be closed in T*M and its might fail to be a smooth manifold at the zero section. examples we have in mind, it is indeed necessary to keep as general as that. In particular, we do not want to exclude
to Definition
closure in T*M In view of the
Definition 5.1.1 the
that Ar is the null
case
If
A(l
then Ar
=
connected its
cone
bundle of
a
Lorentzian metric.
Ar2 are two ray-optical structures on M with Ar, n A(2 0, A(, U A(2 is again a ray-optical structure on M. Conversely, each component of a ray-optical structure is a ray-optical structure in
and
=
right. Ray-optical
own
functions
are
structures that
characterized
by
come
the
about
level surfaces of Hamiltonian
as
following proposition. 0
Proposition
5.1.1. Fix
C'
a
function
R and let
H: T*M
Ar be the
0
zero-level
optical
Then A( is a ray0 u E T*M I H(u) surface of H, i.e., A( on M, provided that H satisfies the following two properties. =
structure
0
(a) (b)
For all q E For all u E
M, the
M,
the
set
Xq
fiber
u
E
derivative
T*M q
of
H
I H(u)
1 is non-empty. satisfies (FH)(u) -' 0. =
0
Proof. Condition (a) guarantees that the map -r) jjv is surjective. Condition (b), which is the coordinate-free way of saying that the derivative of H with respect
to the momentum coordinates has
no
zeros,
guarantees that X is
a
0
closed embedded codimension-one submanifold of T*M and that
-r, 41j\r
submersion.
is
a
0
0
T*q
Fig.
5 1. For the -
metric g,, at each
ray-optical point
structure of
q E M.
Example 5.1 1, JV, -
is the null
cone
of the
5.1 Definition and basic
A
ray-optical
properties of ray-optical
generated globally by
structure need not be
69
structures
a
function H in
this way. In general, such a function H exists only locally. (Later in this section we shall investigate this question in more detail.) Insofar, Definition 5.1.1
applies to situations generalization seems throughout. Next as our
we
mention
standard
more
general
than those encountered in Part I. Such
to be reasonable since the treatment of Part I
examples
some
examples
of
ray-optical
requested
a
local
They will serve properties of ray-optical
structures.
for the discussion of all
structures. Therefore the reader is
was
to commit them to his
or
her
memory.
Example
5. 1. 1. Let go be
induced fiber metric of go
are
denoted
(go)
good
=
a
C' Lorentzian metric
on
by
0
ac
jd.) a
M and denote the
T*M by g:O. (In other words, if the components 0 of g# the are given by gab with components (go)ab, 0 0
on
H(x,p)
by H(u)
R
Define H: T*M .1
=
2
.1
=
2
gO (u, u) i.e., 0
,
gab (X) PaPb
(5.1)
0
0
in terms of natural coordinates. Then
ray-optical
structure
Ar
u
E
T*M
I H(u)
=
0
1
is
a
M.
on
example admits several different interpretations on the basis of general relativity. The first possibility is to interpret go as the spacetime metric such that Ar gives light propagation in vacuum. The second possibility is to interpret go as the optical metric (2.88) in a linear dielectric and permeable medium which is isotropic. The third possibility is to interpret go as one of the two optical metrics (2.91) in a linear dielectric and permeable medium which is anisotropic with the special features typical of a uniaxial crystal. In the latter case it is important to realize that, in general, the two optical metrics must be treated separately; the union of the two "light cone bundles" is not a ray-optical structure in the sense of Definition 5.1.1 unless they are disjoint. We should keep in mind that Example 5. 1 .1 is not general enough to cover light propagation in arbitrary linear dielectric and permeable media. According to Sect. 2.5 this would require a generalization to Finslerian metrics. Here is another example of a ray-optical structure. For the
Example
case
dim(M)
=
5.1.2. Let go and
H: T*M
R
4 this
g#
by H(u)
0
2
be
as
in
Example
(g# (u, u) + 1),
H(x,p)
0
=
.1 2
5.1.1 and define
a
function
i.e.,
(gab (X)Pa Pb + 1)
(5.2)
0
0
in terms of natural coordinates. Then
ray-optical
structure
on
M.
X
u
E
T*M
I H(u)
=
0
1
is
a
70
5.
Ray-optical
structures
on
arbitrary manifolds
0, Arq is a double cone whereas Arq, is a two-shell hyperboloid, i.e., Ar. and Arq, are not even homeomorphic, let alone diffeomorphic We now turn to a different kind of examples for ray-optical structures. u
0
where h vanishes this leads
us
is
a
-
back to
=
=
=
.
Example
5.1.3. Fix
a
C' vector field U
on
M that has
no zeros
and define
0
a
function H: T*M
0
R
by H(u)
=
u(Uq)
for all q E M and
U
E
T*M, q
i.e., H (x, p)
=
Ua (X) Pa
(5.3)
5.1 Definition and basic
properties of ray-optical
71
structures
0
JV
in terms of natural coordinates. Then the set
is
a
ray-optical
In
our
structure
discussion of
encountered the
I
E T* M
H (u)
0
=
M.
on
light propagation
(partial)
U
in
a
non-magnetized plasma
we
(3.44)
which generates a ray-optical will be useful in the following to demon-
Hamiltonian
structure of this sort. This
example possible pathologies. Example 5.1.3 admits the following generalization which is mathematically interesting although somewhat contrived in view of physical applica-
strate
some
tions. Instead of
a
line field
COO vector field U without zeros, it suffices to have a C' map that assigns to each point q C- M a one-dimensional
L, i.e., a subspace Lq of the tangent
TqM.
space
Then
define
we
Arq
as
the set of all
0
T*M that annihilate Arq I q E M I is indeed
covectors in
Ar
all vectors in
It is easy to check that structure on M. If L is not
Lq.
a ray-optical Iu E globally spanned by a C' vector field without zeros, this ray-optical ture M is not globally generated by a Hamiltonian function, i.e., it is =
strucnot of
the kind considered in Proposition 5.1.1.
/__77
0
T*, M
Aj
Arq
0
Ar.
0
T*qM
0
(b)
(a)
Fig. 5.3. For the ray-optical structure of Example 5.1.3, Arq is a punctured hyperplane (a), whereas for Example 5.1.4 it is a pair of hyperplanes (b).
following example is related to Example Example 5.1.2 is related to Example 5.1-1.
5.1.3 in
The
Example 5.1-4.
Let U be
a
COO vector field
on
0
define
a
R
function H: T*M
by H(u)
=
1 2
a
similar way
M that has
no
(U(Uq )2
for all
_
1)
zeros
as
and
points
0
q E M and for all covectors
H(x, p)
U
E
.1 2
T*M, i.e., q
(Ua (X) Ub (X)Pa Pb
(5.4)
1 0
in terms of natural coordinates. Then the set
is
a
ray-optical
structure
on
M.
X
u
E
T*M
I H(u)
=
0
72
5.
Ray-optical
structures
on
axbitrary manifolds
partial Hamiltonian (3.45), which determines the plasma oscillations non-magnetized plasma, generates a ray-optical structure of this kind.
The in
a
0
Here
we
have to divide the vector field U of
(3.45) by
given by (3.51), to get the vector field U, and regions where the plasma frequency has no zeros.
w2,,
the
we
plasma frequency
have to restrict to
0
T*A q
Fig.
5.4. For the
ray-optical
structure of
Example 5.1.5, Arq
is
a
sphere.
Finally, we mention an example that has no physical relevance if M is interpreted as a spacetime manifold. However, it is the most important example of a ray-optical structure if M is interpreted as space, e.g., in ordinaxy optics or in a static general-relativistic spacetime. In the latter context, we shall examine this example in more detail in Sect. 6.5 below.
Example 5.1.5. Let g+ be a C' (positive definite) Riemannian metric on M Define a Hamiltonian and denote the induced fiber metric on T*M by
go.
0
H: T*M
R
1(go+ (u
by H(u)
2
H(x, p)
I
u)
.1 2
-
1),
i.e.,
(gab + PaPb
-
1)
(5.5) 0
in terms of natural coordinates. Then
ray-optical
structure
JV
u
E
T*M
I H(u)
0
1
is
a
M.
on
examples to which we shall come back frequently. now our goal justify the term "ray-optical structure" by showing such structure gives rise to the notions of rays. If X is a that, indeed, any This ends It is
our
list of
to
0
ray-optical structure on M, we can use the inclusion map i: X pull back the canonical one-form 0 and the canonical two-form S? resulting forms will be denoted by OAr
=
i*O
and
QAr
=
i*Q.
T*M to to
Ar. The
(5.6)
5.1 Definition and basic
properties of raye-optical
73
structures
pull-back operation i*,
Since the exterior derivative d commutes with the
related by the equation flAr dOV. In particular, this 0. Moreover, at each point implies that QV is a closed two-form, dS?Ar Q is non-degenerate and since U E A( the kernel of S?,v is one-dimensional these two forms
are
=
=
JV has codimension
Abraham and Marsden
(Ar, S?Ar)
This shows that
one.
[1],
is
a
contact
manifold,
see
Definition 5.1.4.
field iff it satisfies one-dimensional, S?,v any two equation Qv (X, ) characteristic vector fields are linearly dependent. Integral curves of characteristic vector fields (or their projections to M) are often called characteristic curves. They give us the rays of JV, according to the following definition. A vector field X
the
on
Ar is called
=
-
Definition 5.1.2. Let JV be
characteristic vector
a
ray-optical
is
structure
on
M and let PAr be the
JV from
Ar defined by (5.6). A C1 immersion : I real interval I into JV is called a lifted ray iff
contact a
a
0. As the kernel of
two-form
on
(PAO (s) ( (S), for all
s
1. Then the
E
Clearly, lifted properties.
(a)
(b)
rays
projected satisfy
curve
the
)
-
-r, 4
o
following
=
:
(5.7)
0
M is called
I
existence and
a
ray.
(non-) uniqueness
A lifted ray remains a lifted ray under an arbitrary reparametrization (which need not be orientation-preserving). Through each point u E JV there is a lifted ray, and it is unique up to
reparametrization and Moreover, it is important
(This fibers.) Hence,
fiber of T*M. to the
"rays
extension.
to realize that
a
lifted ray is nowhere tangent to
follows from the fact that A( is every ray is
an
immersed
curve
everywhere
a
transverse
in M. In other
words,
do not stand still in M".
In the
case
of
Example
metric g,, whereas in the of the metric g,,. In the in M is
5.1.1 the rays are the of Example 5.1.2 they
case
case
of
Example
5.1.3 and 5.1.4
an
immersed
field U. In the ray iff it is an integral curve of the vector 5.1.5 the rays are the immersed geodesics of the metric g+.
a
Example
light-like geodesics of the are the time-like geodesics curve
case
of
immediately from the definitions that two ray-optical structures equal if and only if they determine the same set of lifted rays. Howstructures to have ever, it is very well possible for two different ray-optical the same rays. As an example we may consider a ray-optical structure constructed from a (positive definite) Riemannian metric g+ as in Example 5.1.5 such that the rays are the geodesics of the metric g+. If we change g+ by multiplication with a positive constant c: 0 1 we get a different ray-optical structure but the set of rays remains unchanged. This is obvious since g+ and It follows
on
M
are
cg+ have the
same
Levi-Civita connection.
74
Ray-optical
5.
So far
structures
notion of rays made
our
arbitrary manifolds
on
no use
of Hamiltonian functions. To cast the
equation (5.7) into Hamiltonian form we need the following proposition which can be viewed as the converse of Proposition 5.1.1. ray
Proposition
5.1.2. Let
A( be
ray-optical
a
structure
on
M and
fix
point
a
0
u
Ar. Then there is
E
function H: W
(a) ArnW (b) dH has Any
such
=
)
jw
function
Proof. This is
Later
is
we
Proposition We shall let
admits
I H(w)
01
all
a
of Ar
a
us
shall give criteria for the existence of now
rewrite the ray equation special case of a
consider the
global Hamiltonian H: W by XH. Then, at
and tangent to Ar. Hence, JV. If the defining equation
non-zero on
global
a
that, by Definition
5. 1. 1,
S?Ar (XH IM
n
global Hamiltonians,
see
5.1.4 below.
vector field of H
field
for M. H is called by )/V, i.e., if Ar C W.
local Hamiltonian is covered
immediate consequence of the fact
an
C'
a
Ar n W.
H is called
if
=
u
embedded codimension-one C' submanifold of T*M.
an
with,
W
E
no zeros on
Hamiltonian for Ar
JV
an open neighborhood W of R with the following properties.
in T*M and
)
(5.7)
in Hamiltonian form. To
ray-optical structure./V
R and let
all points
us
the vector
XHjAr gives (4.1) of XH is pulled a
begin
M that
denote the Hamiltonian
Ar,
E
U
on
nowhere
(XH)u
vanishing
back to
JV,
is
vector
find
we
') 0, i.e., XH jAr is a characteristic vector field. Thus, any other characteristic vector field on X must be a multiple Of XHjAr. This implies ) M is a lifted that an immersion : I ray if and only if its tangent field is
=
i
everywhere
a
multiple
of XH.
Thus, lifted
rays
are
characterized
by
the
equations H
( (s),
0
(5.8)
,
k(s) (dH) (s)
7
(5.9)
vanishing but otherwise arbitrary function. The freedom corresponds to the fact that lifted rays can be arbitrarily reparametrized. If H is a local rather than a global Hamiltonian, this result remains true that on part of X which is covered by W. This implies that, with respect to a natural chart and a local Hamiltonian, lifted rays are characterized by the equations
where k is
a
nowhere
to choose this function at will
properties of ray-optical
5.1 Definition and basic
H
(x(s), p(s)) k(s)
.0(s) J
These considerations
can
"
-k(s)
P,,(s)
(5.10)
0,
=
75
structures
P_- (x(s),p(s)) "
,
(x(s),p(s))
,gxa
be summarized in the
(5.12) way. If
following
lucky
we are
JV, the lifted rays of Ar can be enough to have a global with this function H; (ii) singling Hamilton's found by (i) solving equations in lie that out those solutions Ar; (iii) allowing for arbitrary reparametrizaHamiltonian H for
tions. If there is
through
no
global Hamiltonian, the
same
procedure
can
the domain of each local Hamiltonian. On the mutual
on
be carried
overlaps
of
those domains the lifted rays have to be patched up. Rom a geometrical point of view it is quite satisfactory to work with the contact manifold (A(, S?Ar) without refering to Hamiltonians. On the other
hand, the
use
familiar to
of
(local)
physicists.
Hamiltonians leads to
For that
reason we
a
formalism that looks
more
will often refer to Hamiltonians.
propositions that are helpful when working proposition clarifies the relation between and the same ray-optical structure, the second
We end this section with two
with local Hamiltonians. The first two local Hamiltonians for
one
criteria for the existence of
proposition gives
Proposition 5.1.3. Let JV the COO function H: W
be
a
ray-optical
R is
a
a
global
structure
Hamiltonian.
M and
on
local Hamiltonian
assume
that
for Af, defined
on
0
some
of T* M with Ar
open subset W
n
W
: - 0.
Then another C'
function
0
R, defined on the same open subset W of T*M, is again a local ) R \ 10} for Ar if and only if there is a C' function F: W FH holds on W. (By continuity, the function F such that the equation ft must be either everywhere positive or everywhere negative.)
fl: W
Hamiltonian
=
Proof. So let
Fo(u)
Since the "if' part is trivial, we just have to prove the "only if' part. that both H and ft are local Hamiltonians for Ar. Then R \ 101 since ) defines a C' function Fo: W \ A
us assume =
ft(u)JH(u)
both H and H
are
non-zero on
W
\ Ar.
As dH and dH have
no zeros on
jV n W, the Bernoulli-I'Mpital rule guarantees that F0 has a continuous ) R \ 101. At a point u E JV n W, the value of F is given extension F: W (df1)u(X)/(dH)u(X), where X is any vector in Tu(T*M) which
by F(u) is
=
non-tangential
to
X. What remains
to be shown is that at all
of class Cr for all
W the function F is, indeed, verified by induction over r where again the
A(
be
n
r
points of
N. This
Bernoulli-l'Mpital rule
can
be
has to 11
applied.
Proposition 5.1.4. Let Ar be a my-optical lowing properties are mutually equivalent.
E
structure
on
M. Then the fol-
76
5.
Ray-optical
structures
on
arbitrary manifolds
(a) X admits a global Hamiltonian H. (b) There is a nowhere vanishing characteristic C' vector field X (C) There is a nowhere vanishing C' one-form,3 on M such that
on
X.
the char-
acteristic direction on X is transverse to the kernel of 3 at all points of X. (d) Ar is orientable, i.e., there is a nowhere vanishing C' (2n 1)-form on Ar. -
Proof.
The
implication "(a)=>-(b)"
is trivial since
we can
choose X
=
XHJ'V.
To prove the implication "(b)=*(c)" we put a C' (positive definite) Riemannian metric g+ on X. Such a metric exists since X is a finite-dimensional
paracompact manifold; for
a
proof
of this well-known fact
see
Proposi-
[1]. Then 3 g+(X, -) will do the job. To prove the implication "(c)=>(d)" we can put E 0 A (f2Ar)(n-1)' where (Q.'V)(n-1) A QAr with (n QAr A 1) factors on the right-hand side. Finally, we prove the implication "(d)=*(a)". By assumption, A( is an tion 2.5.13 in Abraham and Marsden
=
=
=
...
-
0
orientable codimension-one submanifold of T*M. We choose
orientation
an
0
for Ar and put a (positive definite) C' Riemannian metric G+ on T*M. (The existence of such a metric is guaranteed by the same argument as above.) This ) 0 u E M. Let t i (u, t) affinely parametrized G+-geodesic tangent to this unit vector at ) R for A( is well-defined on some t 0. Then a. global Hamiltonian H: VV t if and only if there is a u E Ar neighborhood W of Ar by setting H(w)
defines
an
outward unit normal vector at each point
denote the =
=
such that
w
=
O(u, t).
1:1
proposition we should keep in mind, in particular, that oriequivalent to the existence of a global Hamiltonian. We agreement with usual terminology if we define the choice of an for Ar in the following way.
Rom this
entability
of JV is
thus in
are
orientation
ray-optical structure A( is an equivfor JV. Here two global Hamiltonians for A( are called equivalent if they are related, according to Proposition 5.1.3, by a positive function F. After an orientation [H] for Ar has been chosen, the parametrization of a lifted ray is called positively oriented if (5.9) holds with a positive function k for any H E [H]. Definition 5.1.3. An orientation
alence class
5.2 In
[H] of global
for
a
Hamiltonians
Regularity notions for ray-optical
Proposition
5.1.3
we
have
seen
structures
that two local Hamiltonians H and
fl
for
ray-optical structure are related, on their common domain F H where F is a nowhere of definition, by an equation of the form AI with respect to the their derivatives As function. a vanishing consequence, momentum coordinates in any natural chart have to satisfy the equations one
and the
same
=
5.2 a
on
fl,
JV,
=
flab
and
where the abbreviations
and F. This
(4.6)
F Ha
Regularity
implies,
in
cannot be satisfied at
a
notions for
F Hab + Fa
=
(4.7) u
Hb
77
structures b
(5.13)
+ F Ha
have been used for the functions
particular, point
ray-optical
E
that the usual
regularity
H,
condition
Af by all Hamiltonians for N. If this
regularity condition is satisfied by one Hamiltonian for JV it can always be spoiled by switching to another Hamiltonian with the help of an appropriate F, according to (5.13). Therefore following way.
function
we
define
regularity
of
ray-optical
structures in the
Definition 5.2.1. A
point
u
E
ray-optical
JV
on
M is called regular at
a
local Hamiltonian H for Ar, defined on some neighu, such that the fiber derivative FH of H maps A( n W dif-
A( if there is
borhood W
structure
of
a
0
feomorphically
onto its
image
in TM. This is true
if and only if H satisfies
condition (4.6) at u in any natural chart. A my-optical structure X on M is called hyperregular if there is a global Hamiltonian H for)v such that the 0
fiber If
derivative FH take
we
a
of H
look at
find that
maps
our
five
A( difleomorphically
examples
of
onto its
ray-optical
image
in TM.
structures mentioned
Example 5.1.1, Example 5.1.2 and Example 5.1.5 give
above, we hyperregular ray-optical
structures. In each of these
cases
the fiber derivative 0
0
TM. given Hamiltonian is a global diffeomorphism FH: T*M The ray-optical structures of Example 5.1.3 and Example 5.1.4, on the other hand, axe nowhere regular, provided that dim(M) > 2. It is easy to check that on a two-dimensional manifold all ray-optical structures are everywhere regular. The best strategy to verify results of this kind is the following. To find out whether a ray-optical structure Ar is regular at some point u E Ar, we choose the a local Hamiltonian H and a natural chart around u. As before, we use abbreviations (4.7). If det(Hab(U)) =A 0, it is obvious that JV is regular at u. of the
If
det(Hab(U))
=
0,
we
U(X1'...' Xn)
P
If this is the
zero
consider the second-order
=
polynomial
det(Hab(U) + Ha(U) Xb + Xa Hb(U)).
polynomial, i.e.,
if the
equation P,
(X1,
....
(5.14) Xn)
=
0 is
satisfied by all (X1,...' X') E Rn, we know that Ar cannot be regular at u. This follows immediately from the fact that any other local Hamiltonian ft for M must be related to H by the transformation formulae (5.13). If, on the other there is an (Xl,..., X") E Rn with Pu(Xl,..., Xn) : - 0, Ar must be
hand, FH, regular at u. In order to prove this we switch to a new Hamiltonian ft X1. Then and 1 that such in F a F'(u) function the F(u) way choosing we can read from (5.14) that det(fl'b(u)) is equal to Pu(Xl,..., X") which, by assumption, is different from zero. If JV is regular at u, an appropriate choice of a Hamiltonian near u gives us and velocity a local one-to-one correspondence between momentum covectors =
=
=
78
Ray-optical
5.
structures
on
arbitrary
manifolds
vectors. It is clear that
a ray-optical structure alone, without choosing a parHamiltonian, cannot give such a correspondence since lifted rays can be arbitrarily reparametrized. Under such a reparametrization the momentum covectors remain unchanged whereas the velocity vectors are multiplied by a scalar factor. This observation suggests a different regularity notion for ray-optical structures. We want to call a ray-optical structure "strongly regular" if it gives a local one-to-one correspondence between momentum covectors and directions of velocity vectors. To make this precise we consider the vertical part of the ray equation together with the dispersion relation in a natural chart, i.e., the equations (5.10) and (5.11). The desired local oneto-one correspondence holds if and only if this system of equations can be solved for the momentum coordinates p1(s),...,A,,(s) and for the stretching factor k(s). This solvability condition can be written in the form
ticular
det
where
a
is
such that With the
an
index
numbering
(n + 1) help of (5.13) it is we
get
(
(Hab) (Ha) 0 (Hb)
an
of the Hamiltonian chosen.
x
)
54
and b is
rows
(n + 1)
matrix
Switching
an
index
numbering columns
the left-hand side of
(5-15). (5.15) is, indeed, independent
on
easy to check that
(5.15)
0
back to coordinate-free
notation, this
leads to the following definition. Definition 5.2.2. A at
a
point
u
E
ray-optical
Ar if for
structure
and hence
one
Ar
for
on
M is called
strongly regular R,
any Hamiltonian H: W 0
defined on a sufficiently small neighborhood W of TM defined by CH: Ar n W x R+
9H(Wi C) is
a
diffeomorphism
onto its
=
u
in
T*M, the
c]FH(w)
image. This is the
map
(5.16) case
if and only if in
any
ray-optical structure Ar on M is called strongly hyperregular if there is a global Hamiltonian H for Ar such ) that the map CH: X x R+ TM defined by (5.16) is a diffeomorphism onto its image. Here FH denotes the fiber der1vative of H and R+ denotes the set of strictly positive real numbers.
natural chart condition
Strong regularity
(5.15)
we
look at
our
u.
A
is easier to check than
regularity;
we
just have
to calcu-
with any Hamiltonian for Ar. standard examples, we find that Example 5.1.2 and Ex-
late the left-hand side of
If
holds at
(5.15)
ample 5.1.5 give strongly hyperregular ray-optical structures. On the other hand, Example 5.1.1 gives ray-optical structures which are nowhere strongly regular. The same is true of Example 5.1.3 and Example 5.1.4 if we exclude the trivial case dim(M) 1. This shows that strong regularity is violated for several physically interesting ray-optical structures on spacetimes. (In the next section we shall see that a ray-optical structure that describes light =
5.2
in
propagation
Regularity
non-dispersive medium
a
ray-optical
notions for
spacetime
on a
79
structures
cannot be
strongly
Nonetheless strong regularity is a physically useful notion, in particular in the case that M is to be interpreted as space rather than as spacetime.
regular.)
We have not yet justified our terminology by showing that, indeed, strong regularity implies regularity. This follows from the next proposition.
Proposition 5.2.1. A ray-optical structure,,V on M is strongly regular at a point u E A( if and only if there exists a local Hamiltonian H for A( on some neighborhood of u such that in any natural chart
(a)
(Hab)
det
(b) Hab H' H at
u.
HabH
Here be
=
0 and
:A b
0
-
we use
again the abbreviations (4.7) and Hab
is
defined through
jc.
Proof. The "if' part is a trivial exercise in linear algebra. To prove the "only if' part, we assume that JV is strongly regular at u and fix a local Hamiltonian H for A( on a neighborhood 1YV of u. If H satisfies (a) we are done since in this that H does not satisfy case, by (5.15), (b) is also satisfied. So let us assume
(a). It is our goal to find another Hamiltonian H FH for X such that (a) As, by assumption, the and, thus, (b) are satisfied if H is replaced with kernel of the matrix (Hab) is non-trivial, our strong regularity assumption (5.15) guarantees u
uniqueness of
existence and
which satisfies Hab nb
=
0 and H bnb
vertical vector
a
1. If W is
=
nb,0/0pb
sufficiently small,
at
the
R is, again, a local Hamiltonian for JV. It of the matrix kernel (Hab + 2Ha H b) is our goal to prove that the that demonstrate ftab to want We 0. that yb is trivial. So let us assume the form in be Yb Ha can As 0. decomposed this implies Yb 1, Yb na
function
ft
=
H(H
1):
+
W
)
(ftab)
=
=
Zb + c nb Hab Zb +
where Hb Zb 2
c
Ha
=
=
=
=
0.
=
0. Then
Owing
to
assumption takes the form ftab yb strong regularity assumption this implies =
our
our
Z1, 2 c is in the kernel of a components Z1, these all components are zero. 13 determinant; hence,
that the column vector with matrix with
non-zero
.
.
.
,
This proposition shows that, at the level of local Hamiltonians, our strong regularity notion is equivalent to the so-called Condition N introduced by Guckenheimer For further
(53]. illustrating strong regularity
we
introduce the
following
nota-
tion.
Definition 5.2.3. Let Ar be
ray-optical
a
structure
on
M. For any q
E
M,
the set
Cq is called the infinitesimal
(O) I light C
=
A is
cone
IX
is called the bundle of infinitesimal
a
ray with
of JV E
Cq
light
I
X(0)
=
q
(5.17)
at q. The set q E M cones
1
of JV.
(5.18)
80
Ray-optical
5.
Here
structures
allow ourselves
we
necessarily a fiber bundle X E Cq implies that c X
E
slight
a
over
Cq
arbitrary manifolds
on
abuse of
language
insofar
C is not
as
M. As rays can be arbitrarily reparametrized, for all c E R 0} This justifies calling Cq a -
0
"cone". X
=
Clearly,
FH(u)
vector X E
a
where
defined around
Xq
E
u
TqM
is in
and H is
a
if and
Cq
only if
it is of the form
local Hamiltonian for JV which is
Moreover, for any local Hamiltonian H, the image of the defined through (5.16) is a subset of C. Note, however, that even for map CH a global Hamiltonian the image of aH does not necessarily cover all of C; the reason is that the image Of OH need not be invariant under multiplication with negative numbers. u.
0
In the
case
Example 5.1.1, Cq
of
i.e., Cq equals the null
cone
{X
=
E
I go(X,X)
TqM
01,
=
of the Lorentzian metric go at q. In partic0
ular, C.
is
a
closed codimension-one submanifold of
The situation is
completely different
0
X E
Cq
Tq M
I go (XI X)
< 0
1
in the
equals
case
of
in this
TqM
case.
5.1.2. Here
Example
the interior of the null
of the
cone
0
Lorentzian metric go and is, thus, an open subset of TqM. For the ray-optical is c Uq c (-= R \ 10} structures of Example 5.1.3 and Example 5.1.4, Cq
I
I
0
a
one-dimensional submanifold of
TqM
whereas in the
of
case
Example
5.1.5
0
Cq
is all of
TqM.
examples show that C has very different features for different rayoptical structures. Moreover, they demonstrate that there is no obvious reThese
lation between the geometry of A( and the geometry of C. In the case of Example 5.1.1, which dominates our intuitive ideas of general relativistic ray
optics, JV and C
are
diffeomorphic.
In the other cases,
com-
different from A(.
pletely
following proposition implies that diffeomorphic to Ar.
The not be
however, C looks
Proposition
5.2.2. Let
A( is strongly regular
A( be
at all
a
in the
ray-optical
points
u
E
Ar.,
strongly regular
structure
the
on
case
M and q
infinitesimal light
C
E
can-
M.
If
Cq
is
cone
0
an
open subset
of TqM. the differential of
Proof. By Definition 5.2.2, strong regularity implies that the map 0`H has maximal rank. This observation is
1:1
exemplified by Example
5.1.2 and
Example
5.1.5.
Please recall that strong regularity guarantees that the system of equations (5. 10) and (5. 11) can be solved for the momentum coordinates Pa (s) and for the
stretching
factor
k(s).
It is worthwile to become clear about the in-
formation contained in this system of equations. If a curve s i satisfies (5.10) and (5.11) with some k(s) but not necessarily termines at each of its
points the
same
velocity
vector
as a
)
(x(s), p(s))
(5.12),
it de-
lifted ray passing
5.2
through
Regulaxity
notions for
ray-optical
81
structures
this point. Hence this curve, although not necessarily a lifted ray, an object moving at the velocity of light (in the medium for which
describes
M gives the dispersion
relation).
We
now
introduce
a
special
name
for such
curves.
Definition 5.2.4. Let M be immersion r ay
M from
: I
arbitrary ray-optical
an
structure
real interval I into JV is called
a
on a
M. A C'
lifted virtual
iff
WS), Zoo)
=
(5.19)
0
T (,)Af with (T-r) )(Z (,))
0. Then the
projected
rays" should not be confused with the images" which is used in elementary optics. It follows immediately from the definitions that lifted virtual virtual rays can be characterized in the following way.
notion of
for all s curve
E I
-r;4
o
and all
:
I
Z (,) )
E
M is called
a
=
virtual ray.
This notion of "virtual
"virtual
Proposition
M be any ray-optical structure M on M. Then M is a lifted virtual ray if and only if
5.2.3. Let
CI immersion
(-r, 4 function k: I ray-optical structure M.
with the
some
R
E
s
o
)*
\ 101.
=
k FH
a
(5.20)
o
Here H is any
(local)
Hamiltonian
for
if and only if (s) E C,\(s) I. Here C,\(,) denotes the infinitesimal light cone introduced in M is
A COO immersionX: I
for all
rays and
a
virtual ray
Definition 5.2.3.
Clearly, general not
a
ray is all the
more
a
virtual ray whereas the
converse
is in
Example 5.1.1 all g,,-light-like curves in M are virtual rays but only the g,,-Iight-like geodesics are rays. In the case of Example 5.1.2 the rays are the g,,-time-like geodesics whereas all g,,-time-like curves are virtual rays. In the case of Example 5.1.3 and 5.1.4 an immersed curve in M is a ray iff it is a virtual ray iff it is an arbitrarily parametrized integral curve of the vector field U. In the case of Example 5.1.5 all immersed curves in M are virtual rays whereas only the g+-geodesics are rays. If M is orientable, we can generalize Definition 5.1.3 in the following way. true. In the
case
Definition 5.2.5. Let Ar be
of
a
ray-optical
structure
on
M and
[H]
be
an
ori-
of JV is called positively oriented for Ar. Then a lifted virtual ray if (5.20) holds with a positive function k for any H E [H] Similarly, a virtual oriented ray is called positively oriented if it is the projection of a positively lifted virtual ray. entation
-
proposition we prove that in the strongly hyperregular case global one-to-one correspondence between positively oriented lifted
In the next there is
a
82
5.
Ray-optical
structures
on
arbitrary manifolds
virtual rays and positively oriented virtual rays, i.e., that at each point of a positively oriented virtual ray the momentum covector is uniquely determined.
Proposition 5.2.4. Let Ar be a ray-optical structure on M and assume that Ar is strongly hyperregular. By Definition 5.2.2 this guarantees the existence ) TM of a global Hamiltonian H for M such that the map OH: Ar x R+ defined through (5.16) is a global diffeomorphism onto its image. Choose such a Hamiltonian, thereby defining an orientation [H] for JV. Then for every ) M there is a unique positively oriented positively oriented virtual ray X: I Ar that projects onto X, r, 4 o A. lifted virtual ray : I
Proof. The nontrivial claim is the uniqueness of . So let us assume that , and 2 do the job. Since lifted virtual rays have to satisfy (5.20), this k2 FH o 2. Since both , and 2 are supposed to be implies that k, IFH o , and k, k2 have to be positive such that the last equation oriented, positively =
( , (s), ki (s)) diffeomorphism, this implies 61
be written in the form 6rH
can
Since OH is
a
17H
=
=
( 2 (s), k2 (S))
2.
for all
s
E I. 11
Example 5.1.5 demonstrates that the restriction to positively oriented indeed, necessary to get uniqueness.
lifted virtual rays is,
Symmetries of ray-optical
5.3
As the symmetries of
a
ray-optical
structures
structure
Ar
on
M
we can
view all dif-
0
feomorphisms
on
T*M that leave JV invariant. For
our
purposes it will be
0
diffeomorphisms on T*M that are induced diffeomorphisms on the base manifold M. (Such diffeomorphisms are called "point transformations" in Hamiltonian mechanics.) To work this out, ) M induces a cotanwe have to recall that each diffeomorphism. V): M T*M which is again a diffeomorphism, defined gent map T*?P: T*M u (TV) (X)) for all q E M, X E Tq M and by the equation ((T* V)) (u)) (X) is well known and easily verified that T*0 leaves the canonical It M. u E T one-form 0 and, thus, the canonical two-form S? invariant, i.e., reasonable to restrict to those
from
)
=
(T*,O)*O
=
0
and
(T*O)*Q
=
R
-
(5.21)
proof we refer to Abraham and Marsden [1], Theorem 3.2.12. As an alternative, the proof can be accomplished easily in a natural chart. If ) x, (T*O)-l is represented ,0 is represented, in a local chart, by a map x ) the in the pertaining natural chart by (x', p') with p' given map (x, p) 1 by (2.71). After these preparations we are now ready to introduce the following
For
an
invariant
i
definition.
5.3
Symmetries
of
ray-optical
structures
83
Definition 5.3.1. Let X be
a ray-optical structure on M. The symmetry ) M GAr of A( is, by definition, the set of all diffeomorphismo: M such that T* 0 leaves Ar invariant, i. e., such that (T*'O) (u) E A( for all u E M.
group
Clearly, GA(
is
a
group with
For the sake of illustration
respect take
to
composition of
maps.
look at the symmetry groups of our of Example 5.1.1, where Ar is the set of all
we
a
standard examples. In the case light-like covectors of a Lorentzian metric g,,, the symmetry group G.'V con) M for which the pulled back metric sists of all diffeomorphisms V): M
V)*g,, has the same cone bundle as g,. This is the case if and only if O*g, is e2f g,, with some C' conformally equivalent to g,,, i.e., if and only if O*g,, function f : M ) R. For a proof of this well-known fact we refer, e.g., to Wald [146], p. 445. Hence, in the case of Example 5.1.1 the symmetry group G.1v equals the set of all conformal symmetries of the Lorentzian manifold (M, g,,). In particular, this implies that GAr is a finite-dimensional Lie group. Similarly, in the case of Example 5.1.2 we find that the symmetry group is the group of all isometries of the metric g,,. Again, this is a finite-dimensional Lie group. In the case of Example 5.1.3, on the other hand, the symmetry ) M that map arbitrargroup GA( consists of all diffeomorphisms 0: M of U onto curves arbitrarily parametrized integral ily parametrized integral curves of U. In general, this is an infinite-dimensional subgroup of the diffeomorphism group Diff (M). The same result is found for Example 5.1.4, with the only difference that the diffeomorphisms 0 have to respect the parametrization adapted to U in addition. Finally, in the case of Example 5.1.5, the symmetry group is the group of isometries of the Riemannian =
metric g+. Next we want to show that for any
V) E GAr the induced cotangent map T*,o maps lifted rays onto lifted rays. For later convenience we prove the following more general proposition. 0
Proposition
5.3-1. Let A( be
a
ray-optical structure
on
M and TI: T*M
0
T*M be
form
a
factor,
a
0
R, and that T1
f : T*M whenever
Assume that T1 leaves the canonical two
diffeomorphism.
C'
0 invariant up to
-r, j (ul)
--*:--
is
i. e., !P* Q
=
f Q with
some
fiber preserving, i.e., r j (Tf(ul)) Then the following properties
Ttj (U2).
-=
function
T j (11 (U2))
are
mutually
equivalent.
(a) (b) (c)
T1 leaves JV
invariant, i.e., 0/(u) E JV for all u E./V. lifted ray of M onto a lifted ray. lifted virtual my of JV onto a lifted virtual
T1 maps each T1 maps each
Proof.
First
hold true,
we assume
we
pull
that
(a)
ray.
is satisfied. To prove that then (b) and (c) = f Q to Ar. As the diffeomorphism
back the equation Tl* Q
0
closed submanifold of T*M, TV maps JV ) X onto itself. Hence, !P (S?Ar f IA(S?,v where T1,V: M
TV leaves JV invariant and A( is
diffeomorphically
a
=
84
5.
Ray-optical
structures
on
arbitrary manifolds
denotes the restriction of T1 to Ar. This
S?x ((TI Now let
us assume
has to be
o
f
o
that
is
(5.22) vanishes,
hand side of T1
a
lifted ray
as
S?,v (
o
, (TTI)
latter condition is
(5.22)
(TTI)-'(X)
equivalent
proof
(5.22)
of
vanishes
on
all
is vertical. Since TV is
fiber-preserving, the being vertical. Thus, the left-hand side
to X
has to be a lifted all vertical vectors, i.e., T1 o of the converse implications "(b)=>(a)" and "(c)=*(a)"
has to vanish
virtual ray. The
(5.22)
well. This proves the implication "(a)=>(b)". that is a lifted virtual
implication "(a)=*(c)", let us assume Then, by Definition 5.2.4, the right-hand side
vectors X such that
of
for any C' immersion
lifted ray of Ar. Then, by (5.19), the rightthe left-hand side has to vanish as well, i.e.,
a so
To prove the ray.
implies that,
Ar,
I
on
0
point u E T*M is in A' if and only if there is if and only if there is a lifted virtual ray through u.
is trivial since
through If
u
we
a
apply this proposition
to the map T1
=
T*,o
ITIM o
a
lifted ray
we see
b that
a
M is in G.,V if and only if its cotangent map diffeomorphism b: lifted lifted onto T*O maps rays if and only if T*O maps lifted virtual rays This virtual lifted onto implies, in particular, that any 0 E GAr rays. rays M
maps rays onto rays. Please
note, however, that the
converse
is not true. A
M that maps rays onto rays need not be in GV. with our earlier observation that two different ray-
M
diffeomorphism 0: This is in correspondence optical structures on M may have the same rays. As an example we may consider a ray-optical structure constructed from a (positive definite) Rie) M mannian metric g+ as in Example 5.1.5. Then a diffeomorphism'O: M onto 1 with constant c such that O*g+ a 54 positive rays maps rays cg+ but it is not in the symmetry group GAr. As an alternative, symmetries can be treated in terms of. infinitesimal generators. This gives a symmetry algebra rather than a symmetry group. To make this definition precise we need the well-known fact that each vector )
=
field K
lift
on
M defines
a
9
vector field
on
T*M which is called the canonical
of K. Let 0: V
gRxM
K, i.e., let t through the point
denote the flow of passes at t
=
0
1
condition that its flow
)
)
(t,q)
M,
dit(q)
denote the
integral curve of K that k is defined by the
b is given by the equation 4bt
=
K a (X)
(5.23)
q. Then the vector field
chart the canonical lift of the vector field K
ff
Ot(q),
)
a -
axa
A
=
Ka(x)
T*-D-t. In a natural -.!2- takes the form
=
8XI
9Kb (X)
a
gxa
9pa
(5.24)
Comparison with (4.2) shows that k is the Hamiltonian vector field of the ) R, i.e., -k 0 (k) : T* M function H XH- In terms of a natural chart =
=
this function takes the form H (x, p)
=
pa Ka (X).
5.3
Now the symmetry the
following
algebra of
Symmetries a
of
ray-optical
ray-optical structure
85
structures
can
be defined in
way.
Definition 5.3.2. Let Ar be
a ray-optical structure on M. The symmetry Ar algebra 9Ar of is, by definition, the set of all C' vector fields K on M such that at all points of X the canonical lift K of K is tangent to M.
It is easy to check that
gAr is a Lie algebra with respect to the usual Lie Clearly, the one-parameter subgroups of GAr are in one-to-one correspondence with the complete vector fields in Q'V. In analogy to (5.21) the canonical lift k of a vector field K satisfies bracket of vector fields.
LEO
=
0
where L denotes the Lie derivative.
(local)
5.3.1 to the
Proposition
and
LES?
=
(5.25)
0
Hence, by applying (a local version of) we get the following result.
flow of k
Proposition 5.3.2. Let Ar be a ray-optical structure on M. Then for vector field K on M the following properties are equivalent.
(a) (b) (c)
a
C'
K is in
9,v. flow of the canonical lift The flow of the canonical lift
of K of K
The
maps maps
lifted lifted
lifted rays. virtual rays onto lifted
rays onto
virtual rays. In Hamiltonian mechanics it is well-known that
symmetries give
rise to
constants of motion. Similarly, every element of 9,V is associated with a function on T*M which is constant along each lifted ray. This is shown in the
following proposition. Proposition 5.3-3. Let Ar be a ray-optical structure on M and K E 9A(. ) R is constant along each lifted ray. Here Then the function O(ff): T*M 0 denotes the canonical one-form on T*M and k denotes the canonical lift of K. Proof. We
fix
a
point
u
E
Ar and
a
local Hamiltonian H for Ar around
Then the definition of the exterior derivative d
XH(O(k))
-
k(O(XH))
-
OQXH7 k1)
vector fields. On the left-hand side
tonian vector field
XH,
on
the
where
we use
implies
[ -, ] -
that
(dO) (XH, k)
u. =
denotes the Lie bracket of
(4.1) of the Hamil(5.25). This results in
the definition
right-hand side
we use
XH (0 (k)) On Ar, the left-hand side vanishes since k is tangent Ar. Hence, the right-hand side has to vanish on JV as well. This implies that the function O(k) is constant along each integral curve Of XH which is 0 contained in JV, i.e., along each lifted ray.
-dH(k)
=
.
to
If K E 9,V is represented in a local chart as K alOx', which is possible locally around any point of M where K does not vanish, the constant of motion O(k) is exactly the corresponding momentum coordinate, O(R) p,,. =
=
86
5.
For this K E
Ray-optical
structures
O(k)
reason
arbitrary manifolds
on
is called the momentum of the infinitesimal symmetry
Qff-
The fact that symmetries imply constants of motion is of particular relin view of dimensional reductions. Given a ray optical structure Ar
evance
M, any subgroup G of the symmetry equivalence relation on M by q,
If the action of G
M
=
Ml-
can
M satisfies
on
)
M'
0
a
some
be furnished with
pr: M
projection
there is
4==:>.
q2
-
QV
group
on
can
G such that q,
E
be used to define
=
an
V)(q2)-
regularity conditions, the quotient
space
manifold structure such that the natural
a
becomes
a
submersion. If G is
an
r-dimensional
Lie group, Proposition 5.3.3 gives rise to r constants of motion. Fixing a value for each of them singles out a certain subclass of lifted rays of Ar.
Circumstances
ray-optical
permitted,
)V
structure
full detail for stationary Sect. 6.5 below. Relevant reduction formalism Marsden
can
ray-optical structures on Lorentzian manifolds in background material on the general features of the be found in Chap. 4 of the book by Abraham and
[1].
possibility
The
this subclass of lifted rays projects onto a reduced M. We shall discuss this reduction formalism in
on
symmetries for dimensional reduction is an imporconsidering ray-optical structures on bare manifolds of
to
tant motivation for
use
dimension.
unspecified
We end this section with
GAr. This
is
defined,
GAqr
=
remarks
some
for each q E
on
the
isotropy subgroup
GAZV
M, by
10
E
GAr I O(q)
=
(5.26)
q}.
the cotangent map T*O restricted to T,*M gives us For any 0 E GqV, JV linear automorphism Tq*o: Tq*M Tq*M that leaves the manifold Arq
A(
n
Tq* M
of
invariant. We introduce the
a
following definition.
Definition 5.3.3. Let A( be
a ray-optical structure on M and fix a point By definition, the structure group of JV at q is the set of all linear A( n Tq*M automorphisms Tq*M Tq*M that leave the manifold Xq
q E
M.
=
invariant.
In the
case
of
Example 5.1.2, the
Lorentz transformations of the metric
ple
structure group at q consists of all
golq,
whereas in the
case
of Exam-
multiplications with nonzero numbers in addition. Example 5.1.4, the structure group at q is represented by all
5.1.1 it contains the
In the
case
of
invertible matrices of the form
A',-' An,
A
(A b
(5.27) A'
n-
0
1
...
...
1 An Ann- 1 IL- 1
0
An
5.4 Dilation-invariant
in
a
basis such that
equation A ple 5.1.5, the
=
U'(q)
=
b.,
ray-optical
whereas in the
1 must be satisfied in addition.
case
of
Finally,
87
structures
Example
in the
case
5.1.3 the of Exam-
structure group at q consists of all linear
automorphisms that orthogonal with respect to the metric 9+lqAs long as M is a bare manifold, the only distinguished linear automorphisms on Tq*M are dilations and inversions, i.e., multiplications with positive or negative numbers. The question of whether or not a ray-optical structure is invariant under dilations and inversions is of particular relevance. are
Therefore
we
devote the next subsection to this question.
5.4 Dilation-invariant
ray-optical
The notion of dilation-invariance for
structures structures will
ray-optical
mathematically elegant For this reason the following definition
characterization of media which is of paramount
are
give us a dispersion-free.
importance in
ray
optics. Definition 5.4.1. A
ray-optical
structure
Ar
on
M is called dilation-invari-
M if etu E Arq for all u E Arq and t E R. Ar is called point reversible at a point q E M if -u E Arq for all u E Arq. Ar is called dilationant at
invariant at all
q E
a
(or reversible, resp.) if it
points
q c-
is dilation-invariant
(or reversible, resp.)
M.
interpreted as a general-relativistic spacetime, a dilation-inray-optical structure on M is called dispersion-free or non-dispersive, otherwise it is called dispersive. In Chap. 6 below we shall link up this terminology with the physics textbook definition of non-dispersive media, i.e., we shall use the notions of phase velocity and group velocity for characterizing dilation-invariant ray-optical structures. These notions refer to a time-like vector field; hence, they can only be introduced if there is a Lorentzian metric on M since otherwise we do not know what is meant by "time-like". A brief look at our standard examples shows the following. Whereas Examples 5.1.1 and 5.1.3 are dilation-invariant and reversible, Examples 5.1.2, 5.1.4, and 5.1.5 are only reversible but not dilation-invariant. For each t E R, the dilation If M is to be
variant
0
0
Pt: T*M
)
T*M,
u
i
)
et U
(5.28)
) (x, et p). With the represented in a natural chart by the map (x, p) help of this representation it is readily verified that Ot leaves the canonical one-form 0 p,, dxa invariant up to a factor,
is
i
=
(P*O t
Application
=
of the exterior derivative d
et 0.
yields
(5.29)
88
Ray-optical
5.
structures
on
arbitrary manifolds
P*S? t
Applying Proposition
5.3.1 to the map Tf
Proposition 5.4.1. For a ray-optical erties am mutually equivalent.
(a) N
(b)
et 0
=
=
(5.30) Ot
proves the
structure
A(
on
following.
M, the following prop-
is dilation-invariant.
) Af remains a lifted lifted ray : I ray if it is multiplied pointwise a positive number, ) et 6( ( ) ). ) JV remains a lifted virtual (c) Each lifted virtual ray 6: 1 ray if it is with et 6( a positive number, multiplied pointwise
Each with
i
-
-
0
0
T*M is represented in each natural Similarly, the inversion X: T*M chart bythe map (x,p) i ) (x,-p). This implies that X*O= -0 and X*Q -Q. Hence, Proposition 5.3.1 can be applied to the map T1 X as well, resulting in the following proposition. =
=
Proposition 5.4.2. For a ray-optical structure JV erties are mutually equivalent.
on
M, the following prop-
(a) A( is reversible. A( remains a lifted ray if it is pointwise inverted, (b) Each lifted ray 6: 1 6H -6(-). A( remains a lifted virtual ray if it is (c) Each lifted virtual ray 6: 1 pointwise inverted, 6(.) -6(.). A
)
0
The dilation-invariant
)
case can
also be characterized in terms of the vector
field that generates the one-parameter group of dilations. Since
(5.28)
satisfies
0
all
properties of
a
*global
C' flow
T*M,
on
we can
define
a
C' vector field
0
E
on
T*M
by Eu
=
d dt
(et u) I t=0
(5.31)
0
for all
u
T*M. The
E
integral
curves
of E
are
the radial lines in the fibers of
the cotangent bundle. This vector field E is known
as
the Euler vector
field
0
or
Liouville vector
field
on
T*M. In E
(5.29)
and
(5.30) imply
natural chart E takes the form
a
(9 =
P"
1OPa
that the Lie derivatives of the canonical one-form
and of the canonical two-form with respect to the Euler vector field
LEO=O Moreover
,
(5.32)
the identities
and
LEQ= Q-
satisfy
(5.33)
5.4 Dilation-invariant
O(E)=O are
easily verified
dilation-invariant
in
a
ray-optical
89
structures
(5.34)
-20(E, -)=O
and
natural chart. In terms of the Euler vector field E
ray-optical
structures
are
by the following
characterized
proposition.
Proposition 5.4.3. A ray-optical if and only if the Euler vector field Proof. The "only if part
follows
structure
X
M is dilation-invariant
on
E is tangent to JV at all
directly from Definition
points of JV.
5.4.1. For the "if'
0
part
one
has to
In terms of in the
use
the fact that M is closed in T*M in addition.
(local)
following
Hamiltonians; dilation-invariance
can
El
be characterized
way.
ray-optical structure M on M is dilation-invariant if Proposition 0 and only if any local Hamiltonian H for M satisfies the equation dH(E) 5.4.4. A
=
on
M. The
tion
proof follows, immediately
dH(E)
=
from the definitions.
By (5.32), the
equa-
0 takes the form
'H Pa
1OPa
(x,p)
=
(5.35)
0
(5.35) is certainly satisfied on M if H is a homogeneous (of any degree) with respect to the momentum coordinates p,,. If Ar is a dilation-invariant ray-optical structure and u E Ar, then the whole radial line I et u It E R I must be in Ar. This implies that X, A(nT,*M in
a
natural chart.
function
=
necessarily has a non-void intersection with each neighborhood of the origin in T*M. Hence, A(q cannot be closed in T,*M. Typically, the closure Arq U 101 Of in T,* M fails to be a smooth manifold at the origin but forms something like a tip or a vertex there like in our Example 5.1.1, see Figure 5.1. For a dilation-invariant ray-optical structure, Xq U f 0} is a smooth manifold at the origin if and only if it is a hyperplane. This situation is encountered in our pathological Example 5.1.3, see Figure 5.3 (a). Proposition 5.4.4 can be used to further characterize dilation-invariant ray-optical structures in the following way.
7q
Proposition M. Fix
a
5.4.5. Let M be
my-optical
dilation-invariant
structure
on
0
point
natural chart
a
u
E
(x, p)
Ar and, and
a
on some
open
neighborhood W of u in T*M, a for M. Then there is a C E R
local Hamiltonian H
such that
(Ha6) (Ha) 0 (H b) Here
at
u.
in
(5.15).
we use
the abbreviations
) ( ) (0) (Pb)
1:
0
c
(4-7)
and the
same
(5.36) matrix notation
as
90
5.
Ray-optical
structures
on
arbitrary
manifolds
Proof. By assumption, H has to satisfy equation (5.35) gives the last row of the matrix equation (5.36).
at all
This
of all vertical vectors
and
only
(5.35)
equation of
row
if it satisfies
as a
A Zrppb Pb
(5.36)
aH
ap.
=
resulting
-c
=
at the
ap.
Z,,
derivation a2H
a
Z,,
point
0. In this in A
-2-H-L holds at '9P.
Such
u.
case
a2H
Za ZrPa Pb u
with
it =
a
vector is
can
0 at
some
points of ArnW.
Now consider the set
be u.
tangent
applied
Since at least
(5.36) implies
zero
the
one
that the
real number
This
corollary in
a
As the last
c.
this
proposition has the
(n + 1)
of strong
says
medium
that,
on
on
of the momentum coordinates must be
determinant. On the other
defining property
agation
equation
This shows that the
Corollary 5-4.1. A dilation-invariant ray-optical structure Ar be strongly regular at any point U E Ar.
Proof.
Ar if
already verified, this completes the proof
was
Recalling Definition 5.2.2 of strong regularity, following important consequence.
u,
to
to
x
(n + 1)
matrix
M cannot
non-zero
hand, non-vanishing of.this determinant regularity according to Definition 5.2.2.
if M is
a
at
the left-hand side has
on
was
0
spacetime and Ar describes light propregularity can hold only if the
this spacetime, strong
medium
is.dispersive. Moreover, with the help
fiber derivative of
a
(local)
of
Proposition
5.4.5 it is easy to verify that the a dilation-invariant ray-optical
Hamiltonian of
I et u I JV. This implies that, for
I
to radial lines
structure maps radial lines
t E R
for
a
dilation invariant
a
closed subset of
u
E
I e FH(u) I
t E RI
ray-optical structure, 0
the infinitesimal
light
codimension > 1 if it is
cone
a
Cq
is
submanifold.
(Please
TqM
and that it has
recall Definition 5.2.3 and the
subsequent discussion.) This general result is exemplified by Example 5.1.1 and Example 5.1.3. On the other hand, transversality of Ar to the flow of the Euler vector field is not sufficient for the infinitesimal light cones to be open. This is demonstrated by Example 5.1.4. The following proposition gives a pointwise characterization of dilationinvariant ray-optical structures that will be of relevance later.
Proposition 5.4.6. Let Ar be a ray-optical structure on M. Then, for point u E Ar, the following properties are mutually equivalent.
(a) The Euler vector field E is tangent to Ar at the point u. (b) Every characteristic vector field X on JV satisfies OAr(X) (c) O.,V A f2r' 0 at u.
=
0 at
any
u.
=
Here
(n
-
Or'
means
1) factors.
the
antisymmetrized
tensor
product S?Ar
A
...
A
S?Ar with
5.4 Dilation-invaxiant
Proof.
To prove the
equivalence
characteristic vector field
(a)
of
(b)
and
ray-optical we
recall that
JV is the Hamiltonian
on
91
structures
locally
vector field of
Hamiltonian for JV. Thus, (b) holds true if and only O(XH) for any local Hamiltonian H. Owing to the second identity given in if
(4.1)
and the definition
of the Hamiltonian vector
=
this is
field,
every
a
local
0 at
u
(5.34)
equivalent
to
dH(E) 0 at u for any local Hamiltonian H, i.e., it is equivaleni to (a). Now we prove the equivalence of (a) and (c). It is obvious that, at each =
0
point of T*M, 0
A
01-1 is
dimensional kernel. From
a non-zero
(5.34)
Euler vector field E. Hence the at those
we
to
I)-form
and has,
01-1
to
thus,
a one-
spanned by
A
the
JV vanishes exactly
JV.
proposition, OAr
to this
-
of 0 A
pull-back
points where E is tangent
According
(2n
read that this kernel is
13
Q5V1
has
no
zeros
if
JV
is every-
where transverse to the flow of the Euler vector field. In this case (JV, Ov) is an exact contact manifold in the terminology of Abraham and Marsden
[1),
Definition 5.1.4. In other
of JV to the flow of the
words, transversality
Euler vector field guarantees that JV is orientable and that there is even a on Ar. By Proposition 5.1.4, this imcanonical volume form, viz. OAr A
OW',
plies
in
particular
the existence of
a
Hamiltonian for such
global
a
ray-optical
structure.
Proposition tual rays.
5.4.6 has the
(Please
following important
recall Definition
5.2.4.)
Proposition 5.4.7. Let JV be a ray-optical A( satisfies the equation ray : I
OV-1) Ws)) at the to
parameter value
Ar at the point
s
E
I
consequence for lifted vir-
=
structure
on
M. A
lifted virtual
(5.37)
0
if and only if the Euler
vector field E is
tangent
(s).
Proof. By Proposition 5.2.3, the tangent
vector of
a
lifted virtual ray is the
characteristic vector tangent to Ar and a vertical vector. Since all vertical vectors are in the kernel of the canonical one-form, now the statement 1:1 follows from the equivalence of (a) and (b) in Proposition 5.4.6. sum
of
a
In Hamiltonian mechanics, integration over the canonical one-form gives the socalled action functional. In this terminology, Proposition 5.4.7 says that for a dilation-invariant ray-optical structure the action functional vanishes on
all lifted virtual rays. This observation will be of great importance for principles in Chap. 7 below.
our
discussion of variational
If, to
on
the other
Ar, Proposition
reparametrization
hand, 5.4.7
the Euler vector field is
guarantees that
everywhere
transverse
every lifted virtual ray admits
a
such that
001) Ws))
=
1
(5.38)
92
5.
Rayo-optical
structures
on
arbitrary manifolds
This
which gives a canonical parametrization for each lifted virtual ray ) unique up to an additive constant, (s) (s + s,,). In the case of Example 5.1.5 this distinguished parametrization gives g+-arc length along each virtual ray A 1. Similarly, in the case of -r, 4 o , i.e., g+ ( , ) Example 5.1.2 the distinguished parametrization gives g,,-proper time along each virtual ray (= g,,-time-like curve). Finally, in the case of Example 5.1.4 the distinguished parametrization is adapted to U or adapted to -U along each virtual ray (= integral curve of U) A U o A. -r., 4 o , i.e., No such distinguished parametrization exists if jV is dilation-invariant since then every lifted virtual ray satisfies equation (5.37). is
1
=
=
=
5.5 Eikonal From the
=
equation
examples
studied in Part I
ciated with families of
we
know that families of rays
are asso-
surfaces. In this
chapter we want to study the relation of rays and wave surfaces in our general geometrical setting. In particular we want to introduce, for arbitrary ray-optical structures in the sense of Definition 5.1.1, an eikonal equation by which the dynamics of wave surfaces is determined. It is largely a matter of taste whether one considers rays as more
wave
fundamental than
light propagation the notion of
wave
surfaces
vice
or
versa.
Our intuitive ideas of
the notion of rays, rather than on surfaces. On the other hand, the derivation of ray optics
are
wave
normally
based
on
from Maxwell's equations leads to the eikonal equation first and to the ray equation at a later stage, as we have seen in Part I.
Formally the eikonal equation
of
a
ray-optical
Ar
structure
on
the Hamilton-Jacobi equation determined by any Hamiltonian for Ar. More precisely, we say that a C' function S: U
be introduced defined
as
open subset U of
on some
equation of Ar iff
it
M, is satisfies the equation
H(dS(q))
=
0
a
classical solution
as a
map dS: U
T*U C
) R, of the eikonal
(5.39) as a
local section in the
T*M, and
H denotes any
local Hamiltonian for JV whose domain of definition W C T*M
point dS(q).
In
a
can
(local)
for all q E U
Here the differential of the function S is to be viewed
cotangent bundle, i.e.,
M
natural chart the eikonal equation
(5.39)
covers
takes the
the
more
familiar form
H(x, 0S(x)) (5.39)
can
=
0.
be rewritten without any reference to local Hamiltonians
dS(U)
C
Ar
(5.40) as
(5.41)
5.5 Eikonal
where
dS(U)
denotes the
image of the
equation
93
T*U C T*M.
section dS: U
0
a subset of T*M, (5.41) automatically guarantees that dS has Hence, the function S determines a foliation (or "slicing") of the const. which are called open subset U of M into smooth hypersurfaces S wave surfaces (or eikonal surfaces, or phase -surfaces). The motivation for this terminology comes, of course, from the approximate-plane-wave method
Since A( is
no zeros.
=
outlined in Sect. 2.2.
Example 5.1.1 a wave surface is a g,,-light-like hypersurface whereas it is a g,,-space-like hypersurface in the case of Example 5.1.2. In the case of Example 5.1.3 a wave surface is foliated into integral curves of the vector field U whereas it is transverse to U in the case of Example 5.1.4. Finally, in the case of Example 5.1.5 a wave surface is a completely arbitrary hypersurface. The eikonal equation can be viewed analytically as a partial differential equation for a function S. As an alternative, suggested by (5.41), the eikonal equation can be viewed geometrically as the problem of finding an n-aimensional manifold (with certain properties) that is contained in a given (2n l)-dimensional manifold. Henceforth we take the geometrical point of view which is of great advantage for global questions. In other words, we turn In the
case
of
-
our
attention away from the function S and concentrate upon the manifold
dS(U).
For any Cc' function S: U R, dS(U) is an n-dimensional C' submanifold of T*M and it is transverse to the fibers. Moreover, dS(U) is a socalled "Lagrangian submanifold" of the symplectic manifold (T* M, Q). This notion, which will be at the center of this section, is defined in the
following
way.
Definition 5.5.1. LetC C T*M be
and denote the inclusion map
(a)
f- is called isotropic 0. vanishes, j*f2
by j:
an
C
embedded C' )
submanifold of T*M
T*M.
iff the pull-back with j of the canonical two-form
S?
=
(b)
C is called
Lagrangian iff f-
Definition 5.5.1 admits
an
is
isotropic and dim(L)
obvious
generalization
=
for
dim(M). immersed, rather
embedded, submanifolds of T*M. non-degeneracy of Q immediately implies that an isotropic submanifold of T*M must have dimension < dim(M). Thus, Lagrangian sub-
than
The
isotropic submanifolds of maximal dimension. The name "La, grangian submanifold" was introduced by Maslov and Arnold in the 1960s. It refers to the following characterization of such submanifolds in terms of the classical Lagrange brackets. Consider a k-dimensional embedded submanifold f- of T*M; let (ul, Uk) be any local chart on the manifold C and natural chart (or, more generally, a canonical be a let (xl.... Ixn Pi, p.)
manifolds
are
...
7
chart)
on
-
-
-
7
,
T*M. Then the classical Lagrange brackets
are
defined
as
94
5.
Ray-optical
structures
(UA) UB) for
A, B
=
1,
reference to
.
a
.
.
,
k.
Clearly,
natural
axa
A, B j: C
yxa
19Pa IOUB allA
IOUA IOUB
the
same
(or canonical)
k where
lba
-
(UA7UB) for
arbitrary manifolds
on
expression
chart
(j*J?)
=
can
(5-43)
be written without any
as
C9
GUA
7
(5.43)
IOUB
j*f2 denotes the pull-back
T*M. From
(5.42)
of 0 with the inclusion
it is obvious that the
Lagrange brackets k if and only if the submanifold L is isotropic. In A, B 17 other words, Lagrangian submanifolds are submanifolds of maximal dimension for which the Lagrange brackets vanish identically. Properties of Lagrangian submanifolds are detailed in many articles and textbooks, e.g., in Weinstein [148], Guillemin and Sternberg [55]), Abraham and Marsden [1], and Woodhouse [150]. In the following two propositions we recall some well-known facts which are of particular relevance for us, cf. Figure 5.5. map
vanish for all
=
....
R be a C' function defined on an open Proposition 5.5.1. Let S: U subset U of M. Then L dS(U) is an embedded Lagrangian submanifold of T*M which is everywhere transverse to the fibers. =
Proof.
The
only non-trivial claim
is that C is
recall that the canonical one-form 0
isotropic. To
this,
prove
we
T*M satisfies 0*0 0 where 3 is dS. Now we apply the exterior any local section in T*M. Hence, (dS)*O 0 and the fact derivative d to this equation. Upon using the identity dd on
=
=
=
that d commutes with the
pull-back operation, this results
in
(dS)*f2
) T* M can be written in the form As the inclusion map j: fj * this implies j * S? (-r, 4) * (dS) Q 0. =
0.
=
dS o -r,
=
4
,
1:1
=
Proposition 5.5.2. Let f- be an n-dimensional embedded Lagrangian C' submanifold of T*M which is everywhere transverse to the fibers. Then L can be represented, locally around each of its points, in the forra L dS(U) ) where S: U R is a C' function defined on an open subset U of M. Moreover, if L is simply connected, L can be globally represented in this way. S is then called a generating function for L. =
Proof. Since L is an n-dimensional C' submanifold of T*M transverse to the fibers, it is the image of a local section in T*M. Thus, there is a (necessarily
open) subset U L =,3(U). Since
of M and L is
a
one-form 3: U
Lagrangian,
3*Q
=
)
T*U C T*M such that
0. On the other
hand,
the
defining
6 property of the canonical one-form 0 on T*M guarantees that 6*0 these of results U 0. 8*0 If two -d,3. Comparison and, thus, gives d# =
=
is
simply connected, this implies that )3 ) R which is unique up
function S: U
=
is of the form to
an
0
=
dS with
additive constant.
(S
some
can
be
5.5 Eikonal
equation
95
dS
M
Fig.
5.5. A
the fibers is
Lagrangian submanifoldC of T*M which is everywhere transverse locally generated by a function S on M, i.e., C dS(U).
to
q
where the integral by fixing a point q' in U and setting S(q) is to be performed along any path from q' to q. If U is simply connected, the equation d,3 0 guarantees that the result is independent of the path chosen. This follows from the well known Stokes theorem, see, e.g., Abraham and Marsden [1], p. 138). Since U is simply connected if and only if L is simply connected, this proves the second claim. The first claim follows from 11 the fact that each point in U has a simply connected neighborhood. defined
=
These two propositions have the following consequences, which are illus) R of the eikonal equation Figure 5.5. Any classical solution S: U
trated in
Lagrangian submanifold L dS(U) of T*M which is transverse completely contained in Ar. Conversely, to any Lagrangian submanifold,C of T*M which is transverse to the fibers and completely contained in A( we can find, on each simply connected subset U of -r., 4 (C), a clasdetermines
a
=
to the fibers and
sical solution S of the eikonal equation such thatCn(-r )-'(U) dS(U); this solution S is unique up to an additive constant. These observations suggest =
the
following
definition.
Definition 5.5.2. Let A( be
a
ray-optical
structure
on
M. A generalized
solution of the eikonal equation of Ar is a Lagrangian C' T*M which is completely contained in Ar. The
following proposition guaranteees that
eikonal equation determines
an
(n
-
any
submanifold
r-
of
generalized solution of the
l)-parameter family
of lifted rays.
Proposition 5.5.3. Let A( be a ray-optical structure on M and let L be an embedded Lagrangian C' submanifold of T*M which is completely contained in Ar. Then f- is foliated into lifted rays.
96
Ray-optical
5.
structures
on
arbitrary manifolds
point u E L and let X,, E TA( c Tu(T*M) be a characteristic 0. This implies that Qu (Xu, Zu) 0 for X, i.e., (f2,v)u (Xu, ) all vectors Zu E TuC C TuJV. Since f- is Lagrangian, i.e., maximally isotropic, this can be true only if Xu E T,,,,C. In other words, at all points u E C the characteristic direction of N must be tangent to L. Hence, L must be foliated 13 into integral curves of characteristic vector fields. Proof. Fix
a
vector for
=
-
=
1C
P
T*M
M
Fig. 5.6. A generalized solution L to the eikonal equation can be constructed, according to Proposition 5.5.4, by applying the characteristic flow to an appropriately chosen isotropic submanifold P.
If
we
Proposition 5.5.1, we find that each equation is associated with a are the projections to M of the lifted
combine this observation with
classical solution S: U
)
R of the eikonal
on U. Those rays = L which into dS(U) is foliated. (In terms of a local Hamiltonian and a rays natural chart this construction is already known to us from Sect. 2.4.) In other
congruence of rays
words, on
S
a
) R of the eikonal equation determines, M, not only a "slicing" into wave surfaces "threading" into rays. Later in this section we shall
classical solution S: U
its domain of definition U C =
const. but also
a
inquire whether rays and wave surfaces are transverse to each other. The following proposition gives a construction method for generalized solutions of the eikonal equation, please cf. Figure 5.6. Ar be
a ray-optical structure on M. Fix an (n submanifold P of T*M such that P is completely contained in Ar, isotropic, and non-characteristic. By the latter condition we mean that, at all points u E P C Ar, the characteristic direction of JV is non-
Proposition
5.5.4. Let
-
dimensional embedded C'
set of all points in T*M that can be connected to point of P by a lifted ray. Then L is a generalized solution of the eikonal equation of Ar. (In general, C is only an immersed but not an embedded submanifold of T*M. However, if we restrict to an appropriate neighborhood of P this construction always gives an embedded submanifold.)
tangent to P. Let C be the a
5.5 Eikonal
is defined
Proof. C
field. Since P is
(n
as -
97
equation
the image of P under the flow of a characteristic vector and non-characteristic, 'C must be an n-
I)-dimensional
dimensional immersed submanifold of X. What remains to be shown is that L ) -P, (u) is isotropic, i.e., that the pull-back of Q to C vanishes. If !D: (s, u) i under of P the denotes the flow of a characteristic vector field on X, image
0., is
isotropic submanifold (for each
an
s
R for which this
E
image
is
non-
empty). This follows from the fact that the Lie derivative of Q with respect to a Hamiltonian vector field on T*M vanishes. Hence, at each point of 'C the tangent space to L is spanned by the characteristic direction and by the tangent space to an isotropic submanifold. This proves that 'C must be 0
isotropic. At the level of
(local) Hamiltonians this [11, Lemma 5.3.29.
is
a
standard
result, cf.,
e.g.,
Abraham and Marsden
The construction of Proposition 5.5.4 can be carried through, in particAr n Tq*M where q is any point in Xq ular, for the special choice P M. Since Arq is completely contained in the fiber Tq*M, it is, indeed, noncharacteristic and isotropic. Hence the image of P under the characteristic flow gives a generalized solution L of the eikonal equation. Clearly, this fcannot be transverse to the fibers at q. The projection of 'C to M gives the set of all points in M that can be joined to q by a ray. On the other hand, it is also possible to choose the initial surface P in such a way that the projection -r, 4 maps P diffeomorphically onto an (n I)dimensional submanifold -r. 4 (?) of M. In this case the resulting generalized solution L of the eikonal equation is transverse to the fibers, and thus of the form dS(U), near P. It is foliated into lifted rays which, if projected to M, give a congruence of rays that intersect -r 4 (P) transversely. Farther away from P, however, C need not be transverse to the fibers and neighboring =
=
-
other, see Figure 5.6. This shows that it is necessary to consider generalized solutions, rather than just classical solutions, of the eikonal equation if one wants to treat global questions. Proposition 5.5.4 has the following interesting consequence. rays may intersect each
Proposition ture
jV
on
8
I
e=
is
M and
R
S: U -
5.5.5. Let s
E
:
I
)
A( be
I. Then there is
a
lifted
an E
of
> 0 and
of the eikonal equation for A' such
"'s +
ray
that
a
a
(s')
ray-optical
struc-
classical solution E
dS(U) for
all
El.
Proof. Construct a generalized solutionC of the eikonal equation according Proposition 5.5.4, with an initial manifold P that passes through the point (s) and is transverse to the fibers at that point. Then f- must be transverse to the fibers on some neighborhood of X(s), i.e., it can be written as the image is contained of a differential dS on that neighborhood. As, by construction, 1:1 in C, this concludes the proof.
to
Quite generally, Proposition 5.5.4 gives a generalized solution of the I)-parameter family of lifted rays, equation in the form of an (n
eikonal
-
98
Ray-optical
5.
structures
arbitrary manifolds
on
parametrized by the points of P. It
(n
is crucial to realize that
only very special
l)-parameter
families of lifted rays give rise to a generalized solution of the eikonal equation. This special property is in the condition of P be-
ing isotropic which guarantees that L is Lagrangian. This condition on the (n I)-parameter family of lifted rays can be viewed as an integrability condition. The following proposition is helpful to clarify the geometric meaning of this integrability condition. -
Proposition
5.5.6. Let L be
note the inclusion map
Then the
(a) Qc (b) On
following
=
two statements
submanifold of T*M, dej*O and &?L j*S?-
T*M and let Oc
)
are
=
=
equivalent.
0, i.e., L
every
9: 0
Proof.
by j:
embedded C'
an
L
)
is an isotropic submanifold of T*M. simply connected open subset a of L there is a C' function R, unique up to an additive constant, such that d,9 OL Ia. =
Since 0
=
-dO and the exterior derivate d commutes with the
pull-
back operation, S?,c 0 is equivalent to d0,C 0. On a simply connected subset this equation is satisfied if and only if 0,C is the differential of a function, =
please As
cf. the a
=
proof of Proposition
consequence,
an
5.5.2.
n-dimensional submanifold L of M is
solution of the eikonal equation if and one-form 0,C is locally integrable. If
11
only if
a
generalized
the kernel distribution of the
supplement the hypotheses of Proposition 5.5.6 with the assump0 guarantees that 0,C has no zeros, the isotropy condition 0,C L is locally foliated into hypersurfaces 9 const. If we specialize from the isotropic to the Lagrangian case, the situation that 0,C has no zeros can be characterized with the help of the Euler vector field (5.31) in the following we
tion that
=
=
way.
Proposition and at
u E
to L
that Ou(Xu) 0 for all Xu E T,,f- C Tu(T*M). By 0 for all Xu E TuL C Tu(T*M). As equivalent to Qu(Eu, X,,,) Lagrangian (i.e., maximally isotropic), this is equivalent to Eu E TuL c
Proof. Let
(5.34) L is
an
pull-back
if and only if the Euler
u
embedded
Lagrangian submanifold of T*M of the canonical one-form 0 has a zero vector field E is tangent to C at u.
5.5.7. Let L be
L. Then the
us assume
this is
Tu(T*M). If
=
=
o
Lagrangian submanifold of T*M is invariant under the flow of the field, it is called conic (cf. Guckenheimer [52]) or homogeneous (cf. Guillemin and Sternberg [55]). Conic Lagrangian submanifolds are of relevance as generalized solutions of the eikonal equation for dilation-invariant ray-optical structures. They are necessarily non-transverse to the fibers, i.e., they cannot be associated with classical solutions of the eikonal equation. By Proposition 5.5.7, the pull-back of the canonical one-form 0 to a conic a
Euler vector
5.5 Eikonal
99
equation
Lagrangian submanifold vanishes identically, i.e., its kernel distribution does not give a foliation into smooth hypersurfaces. Let us consider, on the other hand, a Lagrangian submanifoldC of T*M such that the Euler vector field is nowhere tangent to L. In this case Proposition 5.5.7 guarantees that the pull-back of 0 to C has no zeros. As a consequence of Proposition 5.5.6, the kernel distribution of this one-form defines a foliation of C into smooth hypersurfaces. We introduce the following terminology. Definition 5.5.3. Let L be
X
a
generalized
solution
of
the eikonal
equation
M. Assume that the Euler vector field is of a ray-optical nowhere tangent to L such that the pull-back Oc to JV of the canonical oneform 0 has no zeros. Then an integral manifold of the kernel distTibution of structure
0,C
is called
a
lifted
wave
on
surface
of
simply connected open represented as a surface 9
On each can
be
=
C.
subset of
L,
const. where
a
lifted
wave
surface of C
S satisfies dS
=
OC. If, in
the situation of Definition 5.5.3, C is transverse to the fibers of T*M, the projection r. 4 maps each lifted wave surface onto a smooth hypersurface in
M which, in agreement with our earlier terminology, is called a wave surface associated with C. If C is not transverse to the fibers, the image of a lifted need not be a smooth submanifold of wave surface under the projection r) a generalized wave surface. have, thus, generalized our earlier observation that a classical solution ) R of the eikonal equation is associated with a "slicing" of U into S: U of the wave surfaces and a "threading" of U into rays. A generalized solution L surfaces lifted C into of wave eikonal equation is associated with a "slicing" and a "threading" of C into lifted rays, provided that the Euler vector field is nowhere tangent to L. The question of whether lifted rays are transverse to lifted wave surfaces is answered in the following proposition.
M and could be called We
Proposition 5.5.8. Let C be a generalized solution of the eikonal equation of a ray-optical structure M on M. Assume that the Euler vector field is nowhere tangent to C, i. e., that L is foliated not only into lifted rays but also into lifted wave surfaces. Then for any point u E C C M the following two statements are equivalent.
(a) (b)
through The Euler vector field The
lifted
ray
u
is
E is
tangent to the lifted tangent to Ar at u.
unique
speak of "the" lifted ray through u in the up to reparametTization and extension.
Proof.
Let X. E
Here
we
T,,j\(
C
T.,,(T*M)
wave
sense
surface through
that this
lifted
be tangent to the lifted ray
u.
ray is
through
the characteristic direction at u. By u, i.e., let Xu be a vector that spans 0. By Proposition 5.4.6, Definition 5.5.3, (a) is satisfied if and only if Ou (Xu) =
this is
equivalent
to
(b).
11
100
Ray-optical
5.
Let
structures
on
arbitrary
manifolds
this proposition to a ray-optical structure JV which is evetransverse to the flow of the Euler vector field, such as given by
apply
us
rywhere Example 5.1.2, 5.1.4 or 5.1.5. Then the Euler vector field cannot be tangent to a generalized solution of the eikonal equation, i.e., it is automatically guaranteed that every generalized solution C of the eikonal equation is foliated into lifted wave surfaces. By Proposition 5.5.8, those lifted wave surfaces are always transverse to the lifted rays into which C is foliated. Any real valued (local) function 9 on C with d,9 0,C gives a (local) parametrization on each of those lifted rays. This distinguished parametrization, which is unique (globally along the lifted ray) up to an additive constant, was already mentioned in Sect. 5.4, see (5.38). The situation is completely different for a dilation-invariant ray-optical structure Ar, such as given by Example 5.1.1 or 5.1.3. For a generalized solu=
tion L of the eikonal equation, the Euler vector field E may or may not be tangent to C at any of its points. Only in the case that E is nowhere tan-
gent to L is C foliated into lifted lifted
wave
surfaces
valued function S
wave
surfaces.
By Proposition 5.5.8, those
then foliated into lifted rays. In other words, any real C with dS 0,C is constant along each of those lifted
are
on
=
rays.
For
an
arbitrary ray-optical
structure
to the maximal open subset of
Ar
the maximal open subset of Ar
on
JV
on
M these considerations
appropriate matching procedure in addition. This cumbersome and we abstain from working out an example.
sion
requires
apply
which E is non-tangent to JV and to which E is tangent to Ar. A full discus-
on
an
is rather
5.6 Caustics In the last section
we
have
seen
that
generalized solutions
L of the eikonal
equation are foliated into lifted rays. If projected to M those lifted rays give a congruence of rays as long as L is transverse to the fibers of T*M. At points where L fails to be transverse to the fibers
ing,
see
Figure
64caustic"
.
5.6. In
Therefore
neighboring
rays start intersect-
this indicates the formation of
optical terminology, introduce the following mathematical definition.
a
we
Definition 5.6.1. Let L c T*M be
embedded
Lagrangian C' submaniof the cotangent bundle projec) M. Then u (-= C is called a critical point of L tion -r;4 by n -r, 4 1,c: L ) the iff tangent map Tun: TuL Tr.(u)M is not surjective. The set
fold of
an
T*M and denote the restriction to L =
Caust,c is called the caustic
=
I n(u)
E
M
I
u
is
critical
point of L
(5.44)
of L.
Clearly, the critical points
of L
are
exactly those points where L is not words, C is everywhere transverse
transverse to the fibers of T*M. In other
to the fibers if and
a
only if Caustc
=
0.
5.6 Caustics
In any case, Caustc is
a
set of
measure zero
in M. This is
101
imme-
an
diate consequence of the well-known Sard theorem which is proven, e.g., in Abraham and Robbin [2], p. 37. In general, Caustc features cusps, edges and vertices, i.e., Caustc is not a submanifold of M. Thus, the geometry of caustics can be very complicated even locally. Quite generally, the variety of and vertices possible is so vast that a complete classification is However, V. Arnold was able to locally classify all caustic types which are stable in a certain sense. For the details of this highly technical work we refer to Arnold, Gusein-Zade and Varchenko (9]. Arnold's formalism was applied to general relativity, e.g., by Friedrich and Stewart [45], by Petters [114], by Hasse, Kriele and Perlick [57], and by Low [871. Here we want, of course, to apply Definition 5.6.1 to the case that L is a generalized solution of the eikonal equation of a ray-optical structure X on M. Then L is foliated into lifted rays which can be projected to M to give a family of rays. In this situation, Caustc is the set of all points in M where infinitesimally neighboring rays intersect each other. To put this rigorously we introduce the following notation (see Figure 5.7). cusps,
edges
not feasible.
Definition 5.6.2. Let X be
(a)
A COO vector field Z
on
a
X
M.
my-optical
structure
is called
field of connecting vectors iff, for the Lie bracket [Z, X] is, again,
every characteristic vector field X
on
a
on
X,
characteristic.
TAr be a C' map lifted ray of Ar and let T: I Jacobi field along i lifted called I. is all a with 7(s) E T (,)A( for s E around be any parameter value s E 1, in represented, locally iff it can where Z is a field of connecting vectors. Two lifted Z o the form .7 are called equivalent iff they differ by a multiple of Jacobi fields along the tangent field of . The respective equivalence classes are called lifted Jacobi classes. A lifted Jacobi field is called trivial if it is equivalent to the zero vector field, i.e., if it is parallel to the tangent field of 6 (c) If i is a lifted Jacobi field along , J T-r 4 o i is called a Jacobi field along the ray A -r,' o - Two Jacobi fields along A are called equivalent if they differ by a multiple of the tangent field of A. The respective equivalence classes are called Jacobi classes. A Jacobi field is called trivial if it is equivalent to the zero vector field, i.e., if it is parallel to the tangent field of A.
(b)
Let
:
I
)
Ar be
a
=
-
=
=
For a ray-optical structure X whose rays are geodesics, such as in our Examples 5.1.1, 5.1.2 and 5.1.5, Definition 5.6.2 (c) reproduces the standard textbook definition of Jacobi fields. (Note, however, that those standard textbooks usually assume their geodesics to be affinely parametrized whereas our rays are arbitrarily parametrized.) If i is a lifted Jacobi field along a lifted ray , the "arrow-head" of .7 can be thought as tracing a neighboring lifted ray which is infinitesimally close to ray.
. All members
of
a
lifted Jacobi class trace the
same
neighboring
lifted
102
5.
Ray-optical
structures
on
arbitrary manifolds
0
T*A1
J(S)
M
A(S)
5.7. A lifted Jacobi field
Fig. ray;
a
Jacobi field J connects
To construct
a
i
connects
a
lifted ray
with
ray A with
a
neighboring
ray.
lifted Jacobi field
a
along
a
lifted ray
a
neighboring
a
variation of
a
-
-
Af
: [sl, s2]
C" map q: I , i.e., Cc? 7 Co [ X [S 11 such that q (e, ) is a lifted ray for all C G ] 6,,, E,, [ and 77 (0, ) differentiation with respect to the variational parameter E at e consider
S21
along ,
i(s)
=
77(
s) 1,=O.
In
*
-,
a
natural
the derivative with respect to the variational parameter
i=6x' a
lifted Jacobi field is determined
a
7XW
+
by the
6P.
=
0
-
J(Pa
+ k
aH axa
19
(XI P))
Ar
0
5 such that
(5.45)
ON
set of
(H(x, p)) 9H j(ba k ON (X)P)) 5
by
we
Then
=
=
lifted Jacobi field
o
. gives a chart, denoting -
-
lifted
equations
(5.46)
,
=
0,
(5.47)
=
0
(5.48)
1
where H is any (local) Hamiltonian for the ray-optical structure considered. (5.46), (5.47) and (5.48) are, of course, just the conditions that the ray equa-
(5.10), (5.11) and (5.12) are to be preserved. The J-derivatives in (5.46), (5.47) and (5.48) can be evaluated with the help of the usual product and
tions
chain rules. As derivatives with respect to the variational parameter and with respect to the curve parameter commute, (5.47) and (5.48) give us a system of first order linear differential equations for JPa and Jxa. Any solution of this system, with any Jk, that satisfies (5.46) gives us a lifted Jacobi field. It is easy to
verify
the
following fact.
To any initial values
jxa(sl), jp,,(Sl)
5.6 Caustics
satisfy (5.46)
that
there is
a
solution
103
JxI, Jp,, of the full system (5.46), (5.47)
and it is unique up to adding a multiple of the tangent field. This freedom corresponds to the freedom of choosing 6k at will.
and
(5.48),
This observation proves the following result which can be verified directly as well, without refering to the coordinate representation
from Definition 5.6.2
(5.46), (5.47)
and
(5.48).
Proposition 5.6.1. The set of all lifted Jacobi fields along a lifted ray is an infinite dimensional real vector space. The set of lifted Jacobi classes along is a (2n 2)-dimensional real vector space, corresponding to the (2n 2) -
-
directions transverse to the characteristic direction in
JV.
As a consequence, the set of Jacobi fields along a ray A is an infinite dimensional real vector space. The set of Jacobi classes along '\ is a finite dimensional real vector space of dimension < (2n 2). Since there is a ray smaller than (n be cannot dimension the of 1). each M, point through -
-
This minimal value is realized, e.g., in Examples 5.1.3 and 5.1.4. We shall now demonstrate that the maximal value is realized in strongly regular ray-
optical
Definition 5.2.2.) The proof will be based on strongly regular case Jacobi classes are determined differential equation that admits an existence and
(Please recall
structures.
the observation that in the
by
second order linear
a
uniqueness theorem; so the dimension of the space of Jacobi classes found out by counting the allowed values for the inditial data.
on
field
M be
[Sli 821
)
M. Then
for
and A:
V
J
along
Ar be
5.6-2. Let
Proposition
\ with
J(s 1)
J+w
i
any ray X has dimension
X and the
(n
-
strongly regular ray-optical structure on M of Ar. Choose an arbitrary affine connection
X and
transformations J strongly regular ray-optical up to
(s 1) J
V
with
=
T,\(,,)M
-
w(si)
2), corresponding
1) components of
there is
Y. This Jacobi =
0. As
structure the vector space
(2n
be
a
ray any two vectors X and Y in a
can
of
Y transverse to the
Jacobi
for a along 1) components of
consequence,
a
Jacobi classes
(n
to the
a
field is unique
-
tangent
vector
(si).
) Ar be a lifted ray that projects onto X First we : [Sli S21 Proof. Let give the proof of the proposition under the additional assumption that can be covered by the domain of a Hamiltonian and by a natural chart. Then our assumption of strong regularity guarantees that, in the natural chart chosen, condition (5.15) holds along ; we can thus introduce the inverse matrix by
(Gca) (Gc) ) (Ga)
G
( (H
ab
) (Ha)
(H b)
0
1
0
0
1
(5.49)
where the components Gca, G,, and G are to be viewed as functions of the of their coordinate representacurve parameter s. Please recall that, in terms determined by (5.46), (5.47), and lifted Jacobi fields along 6 are tion
(5.45), (5.48). It is
our
goal
to eliminate 5k and
bp,,
from these
equations and
to
get
104
a
5.
Ray-optical
structures
on
arbitrary manifolds
second order differential equation for Jx'
alone, i.e.,
to
(5.46)
and
(5.47)
be written
can
help
(5.49), (5.50)
of
(Jpc)
C
Jk
(Ga)
With the an
help
(5.50),
of
we
G
j92H
a
k
ap.,gXb -9H
jXa
_
JPb
) (.1jta
gp.igXb'
Jp,
-9H
a-x--r
'X')
and
92 H -
k
may eliminate
equation for observe that
equation in the following
be solved for
can
(Gca) (G )
7
matrix
('pk')
(Hab) (Ha) 0 ( H b) With the
as a
an we
get
Jacobi fields rather than for lifted Jacobi fields. To that end
way.
(5.50)
Jk,
jXb
(5.51)
jXb
and Jk from
(5.48)
which gives
equation of the form Gab & b
+
Bab j:tb
+
b Cab JX
=
(5.52)
o.
meaning as in (5.49) and (5.50) whereas Bab and Cab the special form of which will be of no interest in the following. By construction, j jXb ax.-9ris a Jacobi field if and only if the JXb satisfy (5.52). From (5.49) we read that Here
Gab has the
are some
same
functions of
s
=
-
GcaH
a
=
0
(5.53)
.
Thus, for each parameter value s the matrix (Gca (s)) has a non-trivial kernel which is spanned by the tangent vector of the ray, so (5.52) cannot be solved for the second derivatives. This reflects the fact that initial values
and of
6P(si)
do not fix
adding multiples
introduce the
(n
-
a
solution Jx1 of
(5.52) uniquely but
of the tangent field. At each parameter value
l)-dimensional
L(s)
=
I (za)
I Ga(S) za
E R'
GaH
(5.49)
we
read that for all H
6a
(za)
ta(,)
a =
in
=
=
0
an
(5.52) gives
existence and
E
(5.54)
(5.55)
L(s)
the equation
Zb
(5.56)
(Gac(S))
(H ba(S))
=
holds true, i.e., that on L(s) the matrix is invertible, with Jacobi fields with to If inverse. restrict its we being
then
may
k(s) Ha(S) since, by (5.49),
1.
(s)Gac(s)zc
(jxa (s))
s we
vector space
which is transverse to the tangent vector
From
JXa(Sj)
leave the freedom
L(s)
for all s,
(5.57)
second order differential equation for JXa that admits uniqueness theorem. In order to prove this it is convenient us a
5.6 Caustics to choose the coordinates in such
105
a way that Ha J." and G,, 5a along A, which is possible owing to (5.55). With this choice of coordinates (5.57) implies that (&ba(s)) E L(s) and (& a(s)) E L(s) for all s. As a consequence, multiplying (5.52) with Hca results in
51-' + Hca Bab
Giving initial values J(sl)
-
5,tb
Cab jXb
+ Hca
X and
J
=
=
(5.58)
0.
Y is
equivalent to giving initial By adding appriopriate multiples of a (SI) we get
=
values 6xa (si) and 6.,ta (,,)
=
=
L(sl) which determine a unique solution jXb of (5.58). Then f &b, with f (si) chosen appropriately, is a Jacobi field that satisfies the original initial conditions. Except for its value at the initial point, f can be chosen at will. This completes the proof under the assumption that A can be initial values in
6Xb
+
covered
by a chart of the desired form. general case we divide the domain
In the
the restriction of A to each subinterval
can
of A into subintervals such that
be covered
by a local chart in which
the equations Ha 6' hold along A. Then we get the desired 6.1 and Ga a Jacobi fields by solving (5.58) piecewise, where on each subinterval the initial =
=
values
are
determined
by
the end values
on
the
preceding
subinterval.
1:1
Quite generally, Jacobi fields and lifted Jacobi fields can be used to following, way. Let L be a generalized solution of the eikonal equation of a ray-optical structure Ar on M and let C C Ar be a lifted ray through the point u ) : [S17 821 (S2) E L. Then u is a critical point of L, in the sense of Definition 5.6.1, if and only if there is a non-zero vertical vector Zu E TuC. By the above argument, the existence of such a vector Zu is equivalent to the existence of a non-trivial lifted Jacobi field i along such that J is everywhere tangent to C and .7(82) is vertical. (Please note that i is everywhere tangent to L if 7(S2) is tangent to L, owing to the Lagrange property of L.) Verticality of .7(S2) indicates with the "infinitesimally neighboring ray" an intersection of the ray 7-,Z4 o J o j. In this sense, Caustc is the set of all points where infinitesT-r. 4 imally neighboring members of the family of rays determined by L have an intersection. Note that, in general, J may be zero on a whole interval, i.e., the two neighboring rays may coincide on a whole interval. To exclude this unwanted situation one introduces the following definition. characterize caustics in the
=
=
Ar be a ray of a ray-optical structure X on different parameter values 8 11 S2 E 1. Let Jac(A, S1 S2) denote the vector space of Jacobi classes [J) along A such that there is a J E [J] with
Definition 5.6-3. Let A: I
)
M and fix two
J(si)
=
0 and
7
J(S2)
==
0. Then the
point A(S2) is called conjugate
along A iff the dimension of Jac(A, S1, S2) is called the multiplicity of the conjugate point. For the
ray-optical
structures of
non-zero
to
A(sl)
and this dimension is
Example 5.1.5, Definition 5.6.3 coincides conjugate points in Riemannian
with the standard textbook definition of
106
5.
Ray-optical
geometry. For the ray-optical
tively)
arbitrary
manifolds
structures of
Example
structures
on
5.1.1 (and 5.1.2, respec(and time-like, respectively) Beem, Ehrlich, and Easley [11].
it coincides with the definition of light-like
conjugate points in Lorentzian geometry, cf. ray-optical structures of Example 5.1.3 and 5.1.4 do jugate points. The
not admit any
con-
Later we shall come back to the notion of Jacobi fields and of conjugate points. In particular, we shall explicitly evaluate the differential equation for Jacobi fields of isotropic ray-optical structures on Lorentzian manifolds in Sect. 6.4, and we shall use the notion of conjugate points and its multiplicity to develop a Morse theory for rays of strongly (hyper-)regular ray-optical structures in Sect. 7.5. In the latter
context, the following observation will
be important.
Ar be a ray of a strongly regular rayProposition 5.6.3. Let A : I optical structure A( and fix a parameter value sl E 1. Then the following holds true.
(a) If A(s) is conjugate to A(sl) along A, its multiplicity cannot be bigger than (n 1) (b) There is an e > 0 such that fo r 0 < I s s 1 1 < E the point A (s) cannot be conjugate to A(si) along A. Let be a lifted ray that projects onto A and assume that, with respect to (C) local Hamiltonian and any natural chart, the matrix in (5.15) is not any only non-degenerate but even positive definite at all points of . If A(S2) is conjugate to A(si), then there is an e > 0 such that for 0 < IS -S21 < 6 the point A(s) cannot be conjugate to A(si) along A. -
-
-
Proof. By Proposition 5.6.2, the vector space of Jacobi classes that vanish at a particular point has dimension (n 1). Hence, the vector space of Jacobi classes that vanish at two points cannot be bigger than (n 1). This proves -
-
(a). To prove
(b),
we
consider the set of all Jacobi fields J that vanish at
si.
By Proposition 5.6.2,
an
(n
-
l)-dimensional
-
si
I
< E
V (,,,)J
of those Jacobi fields span
vector space transverse to
(si),
where V denotes
M. By Taylor's theorem, this implies that for 0 < the values J(s) pf those Jacobi fields span an (n l)-dimensional
any affine connection
Is
the derivatives
on
-
A(s).
This proves of (c) which is
vector space transverse to
(b).
We now turn to the proof more difficult. If the point A(S2) conjugate to A(si) along A, part (a) implies that the multiplicity M of this conjugate point has to satisfy the inequality m < n 1. In this situation we can find Jacobi fields J1,..., J,,-, along A such that is
-
-
-
-
-
the vector fields J1,
JI(SO J1 (S2)
=
=
the vectors
.
.
.
,
J.-1(sl) Jm (S2) Jm+ 1 (S2))
J,,- 1, =
=
...
are
linearly independent
over
R;
0;
0; )
Jn- 1 (S2)
7
(S2)
are
linearly independent.
107
5.6 Caustics
a Jacobi field if and only if it J,,- 1 with constant coefficients by J1, multiple of k Here we made use of Proposition 5.6.2. Now we give the proof of (c) under the additional assumption that the
A vector field differs from a
a
along
A that vanishes at s, is
linear combination of
.
.
,
-
is contained in the domain of a natural chart (x,p) and of lifted ray a Hamiltonian H for M. For A (n 1), our Jacobi fields are then -
represented
in the form
JA
a
JAXa
=
(5.59)
,gXa
with
6AXa(sl)
=
j,xa (S2)=O
(jM+Ixa (S2))) We
are
A
still free to
JA
JA
...
+
(n
=
fA -
1)
7
(j
_lxa (S2))
(n
for A
0
i
-
(5.60)
1)
(5.61)
forI=1,...'M;
(,ta (S2))
are
(5.62)
linearly independent.
change the Jacobi fields by
a
transformation of the form
with functions fA that satisfy fA(sl) 0 for all indices I = 1,.. and fI(82) =
0 for all indices m.
As shown in
proof of Proposition 5.6.2, we may use this freedom in such a way that, in an appropriately chosen chart, the second order differential equation (5.58) 1, is satisfied by jxa jAxa for all A (n 1). Then (5.61) implies that
the
=
=
(jl_+a (S2)))
...
I
Otherwise there would be
-
-
(jmj a (S2))
-
-
are
,
(5.63)
linearly independent.
cij,xa(s) jxa(s) equation (5.58) with jXa (S2)
a non-zero
C!njmXa(S) of the linear differential j_ta (S2) 0, which is impossible.
solution
=
+
=
...
+
0 and
=
following we have to use the positive-definiteness assumption of (c). This assumption implies that, along , we have (5.49) at our disposal, with both matrices on the left-hand side positive definite. (Here we make use of the elementary fact that the inverse of a positive definite matrix is, again, (n 1), the JAXa are the components of a positive definite.) For A 6APb jXa JAXa into (5.51) determines JPb Jacobi field. Hence, inserting and are i.e., satisfied, that such (5.48) in a way and bk (5.46), (5.47), 6Ak In the
=
-
=
=
=
such that
jA is
a
lifted Jacobi field
(5.47)
and
(5.48)
it is
=
6AXO'
19
,gxa
+
JAPa
along for each A readily verified that
(jjpa jrXa
-
6.,P. 6jxa)
49
(5.64)
ON
(n
0
-
1).
With the help of
(5.65)
108
5.
Ray-optical
structures
arbitrary manifolds
on
for any two indices I and J between 1 and
(n
1). By (5-60),
-
this
implies
that
JJP. 5IX, Evaluating
As 5,p,, is determined by
derivatives, (5.67)
can
( (jjXb (S2)) 0) for 1 < I
2 is essential.) Thus, Arg has either one or two connected components. The Lorentzian manifold (M, g) is called timeorientable iff Mg has two connected components, cf., e.g., Sachs and Wu At each point q E
(Here
our
[126],
p.
Mg
24,
Wald
or
may be viewed
p. 189. In this case, each connected component of ray-optical structure in its own right. One of them
(146],
as a
future-pointing momenta whereas the other one gives rays with past-pointing momenta. If (M, g) is not time-orientable, i.e., if Ary has only one connected component, no such distinction can be made in a globally gives
rays with
consistent way.
Ray-optical interpreted
as
structures
Ar
on
M which
giving light propagation
V. Perlick: LNPm 61, pp. 111 - 147, 2000 © Springer-Verlag Berlin Heidelberg 2000
in
a
are
different from Arg
are
to be
medium. If Ar is dilation-invariant
112
6.
Ray-optical
structures
on
Lorentzian manifolds
it is called
of Definition 5.4.1, the medium is called non-dispersive, otherwise dispersive. The following proposition characterizes the vacuum
ray-optical
structure
in the
sense
AP in comparison
to other
ray-optical
structures
on
M.
Proposition 6.1.1. Let JV be a ray-optical structure on M. Assume that Ar is regular at all points u E M in the sense of Definition 5.2.1 and that all the rays of M are g-light-like. Then M C Mg (That is to say, if (M, g) is not time-orientable, M Mg; if (M, g) is time-orientable, either M Mg .
=
=
or
Ar is
one
of
the two connected components of
Mg.)
Proof. Fix a point q E M. Since all the rays of the ray-optical structure M Ar n T,* M is a light-like hypersurface of the Minkowski are g-light-like, jVq By elementary Minkowski geometry, this implies that Xq space (T,* M, g#). q =
0
any such line by light-like straight lines. Since jVq is closed in T*M, q either runs from infinity to infinity or it runs from the origin to infinity. We shall prove that the first case is impossible. By contradiction, let us assume that there is a light-like straight line L in Afq that runs from infinity to infinity. This means that all light-like straight lines in Arq which are infinitesimally close to L also have to run from infinity to infinity, without intersecting L. We decompose the motion of those neighboring lines in the familiar way into rotation, shear and expansion. Since the lines are surface forming, the rotation vanishes. Since the neighboring lines have no intersection with L, shear and expansion also vanish. (Non-vanishing shear gives an intersection with L of those neighboring lines that lie in the principal shear directions. In the case of vanishing shear, non-vanishing expansion gives an intersection with L of all neighboring lines.) Hence, all the neighboring lines have to be parallel to L. It is easy to verify that this conclusion contradicts our assumption that M is everywhere regular. Thus, we have proven that X, is ruled by light-like straight lines running from the origin to infinity. As a consequence, Xq must
is ruled
be
a
subset of
Ar.9
=
Mg
n
Tq*M.
can be rephrased in the following way. As long as reguviolated, the velocity of light in a medium is necessarily different from the vacuum velocity of light, at least for some rays. To demonstrate that the regularity assumption is, indeed, necessary one can consider Example 5.1.3 with a light-like vector field U. For an arbitrary ray-optical structure on our Lorentzian manifold the rays can be time-like, light-like or space-like; they can even change their causal
This
larity
proposition
is not
point to point. If we assume that rays can be used to transmit signals (and this, after all, is a basic idea of ray optics), the rules of general relativity prohibit space-like rays. We use the following terminology.
character from
Definition 6.1.1. A
spect
to g
if all
rays
ray-optical
of Ar
are
structure
Ar
on
M is called causal with
everywhere g-time-like
or
g-light-like.
re-
6.2 Observer
ray-optical
The
respect
Examples
structures of
5.1.1 and 5.1.2
to g iff the metric g,, is "narrower" than g,
g,,(X, X) of
fields, frequency, and redshift
< 0
satisfies the
implies the inequality g(X, X)
5.1.3 and 5.1.4
Examples
causal with
i.e., iff the inequality
The
ray-optical
structures
causal with respect to g iff the vector field U
are
inequality g(U, U)
< 0.
are
113
< 0.
The question of whether or not all physically reasonable ray-optical structures on a spacetime have to be causal is a bit subtle. If ray optics is viewed as an
approximation scheme
to
wave
optics, the
energy of
a wave
field does
propagate exactly along rays; this is true only in an approximative sense. (We have discussed this issue in Sect. 2.7, at least for a special class of media.) On the basis of this observation, non-causal rays are not necessarily to
not
altogether as unphysical. In the next section we shall introduce the notions of phase velocity and group velocity for a ray-optical structure X, and we shall see that X is causal iff the group velocity does not exceed the vacuum velocity of light. It is well known that there are physically relevant optical media in which the group velocity exceeds the vacuum velocity of light. For a comprehensive discussion of this issue we refer to Brillouin [?2]. be discarded
6.2 Observer
fields, frequency, and redshift
Several basic concepts of ray optics which are familiar from elementary textdepend on the notion of frequency. For a ray-optical structure on our
books
Lorentzian manifold
(M, g)
this notion
can
be introduced after
a
time-like
vector field has been chosen.
field, given on some open subset U of M will be M we speak of field henceforth. In the case U a global observer field. This terminology refers to the fact that the integral curves of such a vector field can be interpreted as the worldlines of observers. A global observer field exists if and only if (M, g) is time-orientable, see, that we have e.g., Wald [146], Lemma 8.1.1. In the following we assume A time-like C' vector
called
an
a
(local)
observer
observer field V
=
given
normalization condition
on some
g(V, V)
=
open subset U of M which satisfies the -1. This normalization condition means
integral curves of V are parametrized by proper time. At each point q E U C M, we write Vq for the value at q of the timelike vector field V. We decompose the tangent space TqM into the onedimensional time-like subspace spanned by Vq and its (n I)-dimensional orthocomplement
that the
-
HqM
=
JY
E
TqM I gq(Vq,y)
=
0}
-
(6.2)
Similarly, we decompose the cotangent space T.-M into the one-dimensional l)-dimensional orsubspace spanned by the covector g,(Vq, -) and its (n thocomplement -
Hq*M
=
fu
E
Tq*M I u(Vq)
=
0}
-
(6.3)
114
As
6.
Ray-optical
suggested by
our
structures
on
Lorentzian manifolds
notation, H*M q
can
be identified with the dual space of
HqM. Now let JV be a ray-optical structure on M. To each point assign the following quantities, denoting the footpoint of u by q The frequency
w(u) the
spatial
-u(Vq)
E
E
X
we
r. 4 (u). (6.4)
R;
covector
wave
k(u) the
=
u =
u
=
-
w(u) g(Vq,
Hq*M
(6.5)
Hq*M;
(6.6)
E
phase velocity
W(U) the ray
velocity
or
group
V(U) denotes the
Here
ilk(u)112
a
E
velocity
=
gq norm
-(FH) (u) -Vq EHqM. (Vq, (IFH) (u)) induced
1FH denotes the fiber derivative of
In
k(u)
=
a
on
Hq*M by
-Pa Va (X)
W(X7P) -
following
form.
(6.8)
,
W(XI P) gab(X) V6(X)
(X, P) ka(X,P) 9bc (x) kb (X, p) kc (x, p)
(6.9)
W
A
=
va (X, P)
=
Wa (X,
Pa
Lorentzian metric and
local Hamiltonian H for JV.
natural chart these definitions take the
ka(x, P)
our
(6.7)
7
(6.10)
Va(X).
(6.11)
-
QH
If
we
evaluate
(6-8)
and
*5-P. (X 7 P) _
gcd(X) VCW 2H (X, P) .9p,l
(6.9) along
a
classical solution S: U
R of the
frequency (2.38) and the reproduce a can be parameter respectively. (The (2.39), spatial absorbed by a redefinition of the eikonal function S.) This justifies the terminology. The name "phase velocity" for w(u) refers, of course, to the same situation. At each point U E dS(U), the covector w(u) determines the spatial velocity of the wave surfaces (="phase surfaces") with respect to the observer field V. Geometrically, the norm of w(u) is a measure for the pseudoEuclidean angle ^I between the covector gq (Vq, ) and the covector u according eikonal
equation
wave
of
M,
function
the
we
covector field
-
to the formula sinh
27
=
(1
_
I I W (U) 11 2)
velocity (6.7), interpretation if evaluated along The ray
on
-
1,
the other a
as can
hand,
be read from equation (6.6). admits an obvious physical
lifted ray. The direction of the vector
v(u)
6.2 Observer
fields, frequency, and redshift
115
determines the
spatial direction in which the ray is moving, viewed with the traveling along an integral curve of V; the norm of v(u) is a measure for the pseudo-Euclidean angle between the tangent vector to the ray and the tangent vector to the observer's worldline, i.e., for the relative velocity of the ray with respect to the observer field chosen. This justifies the term ray velocity". To verify that, moreover, (6.7) is equivalent to the familiar textbook definition of the "group velocity", we proceed in the following way. First we have to assume that gq(Vq,1FH(u)) =h 0 to make sure that v(u), as defined by (6.7), is non-singular. This condition is satisfied if and only if at the point u the straight line parallel to gq (Vq, ) in Tq*M is transverse to Xq X n Tq* M. (For causal ray-optical structures this transversality condition is automatically satisfied.) Clearly, this is the case if and only if the manifold R locally Xq C Tq* M --- Hq* M x R is the graph of a function f : Hq* M around u. (Globally, however, Arq need not be the graph of a single-valued function. In typical cases, such as in our Examples 5.1.1, 5.1.2, and 5.1.4, Afq has several "branches".) Then a quick calculation shows that v(u), as defined by equation (6.7), is equal to the differential (df)u E (Hq*M)* --- HqM. In other words, to calculate v(u) we have to write the frequency as a function of the spatial wave covector by means of the dispersion relation and we have to calculate the gradient of this function. This is exactly the usual textbook definition of the group velocity. By (6.6), the phase velocity has a zero at points u E Ar where the frequency vanishes, and it has a singularity at points u E X where the spatial wave covector vanishes. Either case is to be viewed as a pathological behavior and indicates a "bad" choice of observer. (If no other choice is possible, the ray-optical structure is to be viewed as "bad".) Similarly, we can read from (6.7) that the ray velocity has a zero at points u E X where FH(u) is parallel to our observer field and that it has a singularity at points u E X where FH(u) is orthogonal to our observer field. The first case indicates a "bad" choice of observer, whereas the second case cannot happen if our rayoptical structure is causal in the sense of Definition 6.1.1. As a matter of fact, a ray-optical structure is causal if and only if I I v (u) I I :, 1 for all U E Ar. eyes of an observer
cc
=
-
)
Here space
11 11 denotes the norm induced by HqM at the point q -r.; (u). In -
=
our
Lorentzian metric
other
words,
a
on
the vector
ray-optical
structure
only if the ray velocity group velocity) is bounded by the of vacuum velocity light. Note that basically the phase velocity is a spatial covector whereas the ray velocity is a spatial vector. We can, of course, use the metric g to identify vectors and covectors. In particular, we can use the metric to make the spatial wave covector into a vector, i.e., we can introduce, in terms of a natural chart, the quantity is causal if and
V (x, p)
=
gab (x) kb (X P)
This vector is sometimes called the vector
7
-
(6.12)
of normal slowness, following Sir along any clas-
William R. Hamilton. Here, "normal" refers to the fact that,
116
6.
Ray-optical
structures
on
Lorentzian manifolds
equation, this vector is orthogonal to all spatial wave surfaces; "slowness" refers to the fact that the phase velocity decreases if the length of this vector increases. The properties of a specific ray-optical structure are nicely visualized by indicating, for a fixed frequency, the phase velocity and the group velocity for each spatial direction. To make this idea precise we fix a ray-optical structure M, a normalized observer field V, a point q E M, and a real number W,, E R. sical solution of the eikonal
vectors which
Then
we
are
tangent
to the
define the sets
Q * 2'=
w(u)
u
E
JV,, w(u)
v(u)
u
E
Xq, w(u)
=
=
W,,
c
Hq*M
w,,,
C
HqM
c C
Tq*M,
(0.13)
TqM
(6.14)
-
figuratrix of the medium whereas !a, is called the indicaftix. Both names originate from variational calculus. In general, figuratrix and indicatrix depend on the frequency value W,,, i.e., if we switch from w,, to ew, with some real number c, figuratrix and indicatrix undergo a deformation. If we restrict ourselves to the case c > 0 (i.e., if we fix the sign of the frequency), there is no such deformation, for any observer field and for any point q E M, if and only if our ray-optical structure is dilation invariant. In other words, the dilation invariant case is characterized by the property that phase velocity and group velocity are independent of the frequency, as long as the sign of the frequency is fixed. This is the defining property of a non-dispersive medium according to standard textbooks on optics. We have thus justified our earlier claim that for a rayoptical structure on a Lorentzian manifold the attributes "dilation invariant" and "non-dispersive" are synonymous. As an example, we consider a Hamiltonian of the form called the
usually
31* is
H (x, p)
9ab(X)
gab(X) )
0
Then the
W2/ (W2
-
0
0
(6.15)
and
Example
(h(x) :A
0 and
point with coordinates
at the
figuratrix
5.1.2
h(x)) (WO2 h(x)) IW2
around the origin in
Hq*M,
gab(X) O x
is
a
=
=
0 and
gab (x)1h(x)).
sphere
of radius
whereas the indicatrix is
a
sphere
around the origin in HqM. 0 For a pathological ray-optical structure of the type given in Example 5.1.3, the other hand, the figuratrix is a sphere through the origin and the
of radius
on
!(gab (X)PaPb + h(x))
2
coordinates, which comprises Example 5.1.1 (h(x)
in natural =
=
-
indicatrix is Now
we
a
single point. question
words, we want for light propagation
other
natural chart function
d
TS
of how the
frequency changes along
to discuss the
relativistic redshift
in media.
general Along any
curve
turn to the
as
map
a
s
1
)
(x(s),p(s)),
in
M, given
a
ray. In
(or blueshift) in terms of
differentiation of the
a
frequency
(6.8) yields
W(X(S),P(S))
=
-p.(S) Va(X(S))
-
pa(S)
'V'(X(S)) IgXb
: b (S).
(6.16)
6.2 Observer
fields, frequency, and redshift
117
If we introduce the canonical lift V of the observer field V
according to (5.24),
equation (6.16)
as
can
be rewritten in coordinate-free form d
T.9 W
(X(S), P(s))
(6.17)
for any C' curve 6: 1 JV. Since A( is transverse to the fibers of T*M, IV can be decomposed, at each point of Ar, into a vector tangent to JV and a vector
is to
a
(6.17).
the
(This decomposition is,
to the fiber.
tangent
lifted ray,
of course, not unique.) If a contribution
only the component tangent to the fiber gives Therefore, it is justified to say that a C' curve : I
redshift law
sk.) WS), QW)
=
(6.18)
Hamiltonian H for A(. Then
conveniently a parametrization adapted to H (i.e., satisfies, by (6.17), the redshift law we use a
with
d
TS w
(x(s), p(s))
To illustrate the redshift law with structure that is
generated by
c
is
a
gab 0
are
an
expressed
A(
lifted ray 6: 1 such that (5.9) holds with k a
dH(V)
example,
=
more
1)
(6.19) we
consider
a
ray-optical
=.I 2
gab(X) PaPb + C
(6.20)
0
the contravariant components of a Lorentzian metric g,, and Examples 5.1.1 and 5.1.2 are of this form. A lifted
real constant. Our
ray of such
a
ray-optical
structure
gives
a
geodesic
of the metric go if pro-
M. The parametrization is adapted to H if this geodesic is affinely
jected parametrized to
and
if,
in
addition,
Pa(s) In this situation, the is
=
T-r., 4(Q(s))
Hamiltonian of the form
a
H(x,p) where
Ar satisfies
0
for all C' maps Q: I TAr with Q(s) E T (,)Ar and ) The law for lifted rays can be L for redshift S E Y if
)
of lifted rays with respect to the observer field V if
:--:
(go)ab(X(S)) :tb (S)
frequency with respect
to
a
(6.21)
normalized observer field V
given by
W(X(8)1P(S))
=:
-(go)ab(X(S))
the lifted ray. If we switch to coordinate-free
bb (S) Va (X (S))
(6.22)
along
Ar and its projection
to M
by
W( (S2)) WWSO)
A
=
notation, denoting the lifted
=,T., 4
o
ray
by :
I
, (6.22) implies
(go)A(a2) ( (S2)) VX(52)) (90)A(so Ns'), vxoo)
(6.23)
118
6.
Ray-optical
structures
on
Lorentzian manifolds
enters into the for any two parameter values si, S2 E I. Please note that right-hand side of (6.23) only in terms of its projection X As (6.23) remains true after
which is The
an
affine reparametrization, it gives the redshift
parametrized affinely (as a geodesic of g,,). redshift formula (6.23) applies, in particular,
(6.23)
to the
any ray
vacuum
ray-
have to read g,, g. In this special context equais well known. It was found by Kermack, McCrea and Whittacker
optical structure, where tion
along
we
=
by Schr6dinger [129]. A particularly clear derivation was given by Brill [20]. As an alternative, the vacuum redshift formula can also be expressed in terms of acceleration, expansion and shear of the observer field V. Details can be found in articles by Ehlers [37], by Hasse and Perlick [581 and by Perlick [107]. Now we turn back to arbitrary ray-optical structures. The following propo-
[71]
and rediscovered
sition characterizes the redshift-free
case.
Proposition 6.2.1. Consider a ray-optical structure Ar and a global ob-1 on M. Then the following two properties server field V with g(V, V) are equivalent. =
(a) (b)
The
V (=-
frequency with respect to V is constant along each lifted ray of Ar. 9,v. (Please recall Definition 5.3.2 of the symmetry algebra gAr.)
Proof. The general redshift formula (6-17) implies that (a) is true if and only if the function fl(X, V) vanishes identically on Ar whenever X is a characteristic vector field on Ar. Since, at each point of Ar, the kernel of Q(X, -) coincides with the tangent space to Ar, this condition is satisfied if and only if V is tangent to Ar at all points of Ar. By Definition 5.3.2, this is 0 equivalent to (b). For
an
arbitrary ray-optical
need not contain
a
structure
Ar
on
M the symmetry
time-like vector field normalized to
g(V, V)
algebra 9Ar -1. Thus,
=
in very special cases is it possible to find a normalized observer field V such that the frequency is constant along all lifted rays.
only
Proposition 6.2.1 can be specialized to our standard examples for which analyzed the symmetries in Sect. 5.3. This gives the following results. For the ray-optical structures of Example 5.1.1 the frequency with respect to V is constant along each lifted ray iff V is a conformal Killing vector field of the metric g,,, i.e., iff the Lie derivative Lvg,, is a multiple of g,,. (In the
we
have
vacuum case
g,,
=
g, the normalization condition g (V,
V)
=
-
1 then
requires
field, i.e., LVg 0.) For the ray-optical structures of Example 5.1.2 the frequency with respect to V is constant along each lifted 0. For the iff Lvg,, ray iff V is a Killing vector field of the metric g,,, i.e., with the respect to V is frequency ray-optical structures of Example 5.1.3 is bracket Lie the iff a multiple of U. constant along each lifted ray [V, U] the 5.1.4 of frequency with Example Finally, for the ray-optical structures 0. iff lifted each [V, U] ray respect to V is constant along V to be
a
Killing
vector
=
=
=
6.2 Observer
fields, frequency, and redshift
119
interesting to consider normalized observer fields which are not in gAr but can be rescaled to give an element of gAr. In this case the redshift does not vanish but it admits a representation in terms of a "redshift potential". This situation, which is of particular interest in cosmology, is chaxacterized by the following proposition. It is also
Proposition 6.2.2. Consider a ray-optical structure M and a global ob-1 on M. Then for any C1 function server field V with g(V, V) the two ) R properties are equivalent. following f:M =
(a) f
is
JV is a lifted potential in the following sense. If 6: 1 with the o A respect to V frequency w projection -r. , 6,
redshift
a
of M with satisfies ray
=
ef
(b) ef V
E
Proof. We
write W
=
d
be
(6.24)
9,v.
TS as can
const.
ef V. Then the general redshift formula (6.17) implies
( ef (-(8))w WS0
=
sks) WS), vv_ (-))
(6.25)
easily checked with the help of the coordinate expression (5.24) for
the canonical lift of
a
vector field. The
right-hand side of (6.25)
lifted rays if and only if W is tangent to JV at all if W E 9Ar (This argument is analogous to the -
vanishes for all
points of A(, i.e., if and only proof of Proposition 6.2.1).
13
If the
frequency has
no
W
ln W
zeros,
( (S2)) (6 (S 1))
(6.24) implies =
f (A(Sl))
-
f (A(S2))
(6.26)
for any two parameter values si and S2. It is this expression to which the name "redshift potential" refers. By Proposition 6.2.2, a ray-optical structure admits a redshift potential, for an appropriately chosen observer field, if and
only if there is a time-like vector field in the symmetry algebra gAr. A rayoptical structure with this property is called "stationary", a notion we are going to discuss in full detail in Sect. 6.5 below. For the vacuum ray-optical structure the notion of a "redshift potential" (or "redshift function") was investigated in papers by Dautcourt [30] and by Hasse and Perlick [581. Please JV9 the condition W E recall that for the vacuum ray-optical structure M 9,V means that W is a conformal Killing vector field of the metric g. =
120
.6.
6.3
Rays-optical
structures
on
Isotropic ray-optical
Lorentzian manifolds
structures
With the Lorentzian metric g given on M we can define, for each point q E M, the set of Lorentz transformations on the cotangent space Tq*M by
Lor(q) A is
a
linear
A:
=
automorphism with
The Lorentz transformations
Tqm* M1
Tqq* M
Lor(q)
(6.27)
g*(A(.),A(.)) q
=
N
foliate the punctured cotangent space
0
0
Tq*M
into orbits.
Here,
I
a
subset
form Q JA(u) A E orbits is sketched in Figure 6.1. =
Lor(q)j
Q
of
Tq*M q
is called
an
orbit iff it is of the
0
for
some
u
E
T*M. The geometry of the q
0
T*ql
Fig.
6.1. The orbits of
two-shell
Lor(q)
hyperboloids and
Please recall that
we
a
are
family
the of
g-light-like cone, a family of g-space-like g-time-like one-shell hyperboloids.
have defined the structure group of
a
ray-optical
structure in Definition 5.3.3. The next definition characterizes the situation
that,
at
a
point
q E
M, the set of Lorentz transformations (6.27)
contained in the structure group of M.
is
completely
6.3
Isotropic ray-optical
structures
121
ray-optical structure Ar on (M, g) is called Lorentz invariant at a point q E M iff A(u) E Arq for all u E Xq and A E Lor(q). Ar is called Lorentz invariant iff it is Lorentz invariant at all points q E M. Definition 6.3.1. A
If an
a
ray-optical
orbit of
structure
Lor(q)
or a
Ar
is Lorentz invariant at
union of several orbits which
a
point q, JVq must be separated by open
are
0
If Ar is causal, only space-like and light-like orbits neighborhoods in T*M. q come into question, i.e., the one-shell hyperboloids of Figure 6.1 are excluded. The following proposition is rather trivial.
Proposition
6.3.1. Let
Ar be
ray-optical
a
dilation invariant and Lorentz invariant at all vacuum
ray-optical structure, A(
Proof. By assumption, jVq
jVq
and
is
an
points
orbit of the
Lor(q)
C
only
T*M with these properties is the double q
This
proposition implies that
(M,g)
which is
M. Then JV is the
or a
union of several
orbits,
codimension-one submanifold
0
Arq
on
q E
Arg.
=
Clearly,
is dilation invariant.
structure
Arq
cone
=
jVq9.
El
non-dispersive medium necessarily has
a
to
break Lorentz invariance. The class of Lorentz invariant
ray-optical
structures is rather small. We
get much laxger classes if we require invariance not under the full Lorentz If we fix a vector Uq E TM with group but only under certain subgroups. of spatial rotations the consider can we subgroup -1, gq(Uq, Uq) =
Rot(Uq) Uq-
with respect to
=
{ A E Lor(q) I A(gq(Uq) *))
=
gq(Uqi *) 1
(6.28)
This gives rise to the following definition.
Definition 6.3.2. A ray-optical structure Ar on (M,g) is called isotropic at a point q E M with respect to a normalized time-like vector Uq E TqM iff A(U) E Arq for all U E jVq and A E Rot(Uq). JV is called isotropic with
respect to all points
a
global
observer
q E M with
field
respect
U with
-1 iff Ar is isotropic at g(U, U) Uq U(q) ( U evaluated at q ). =
to the vector
=
=
use more precise attributes such as "spa, spatial rotations". For the sake of brevity, however, we stick with the terminology of Definition 6.3.2. If a ray-optical structure is Lorentz invariant at q, then it is in particular isotropic at q with respect to all normalized time-like vectors Uq E T.M. In addition, we already know some examples of isotropic ray-optical structures that need not be Lorentz invariant. In Example 2.5.1 of Part I we have derived
Instead of "isotropic"
tially isotropic"
or
one
might
"invariant under
the Hamiltonian
H(x,p)
1 =
2
(
gab (x)
+ Ua (x)
Ub (X) _
n(X)2
Ua (X)
Ub (X)
Pa Pb
(6.29)
122
Ray-optical
6.
structures
on
Lorentzian manifolds
for
light propagation in a linear and isotropic electromagnetic medium, cf. eq. (2.87). Clearly, this Hamiltonian generates a ray-optical structure which is isotropic with respect to the rest system U of the medium. We are now going to prove that this example is much more general than it seems to be. As a matter of fact, every isotropic ray-optical structure is locally generated by a Hamiltonian of the form (6.29) provided that it is dilation invariant. If it is not dilation invariant, the only modification is in the fact that the index of refraction must be allowed to depend on the frequency, i.e., we have to write
n(x, -U'(x)p,,)
It needs
the
a
ray-optical
n(x).
instead of
little bit of
preparation
X
structure
on
(M, g)
field U. For notational convenience
Ar. We have
we
to prove this fact. Let
us assume
that
isotropic with respect to an observer introduce a natural chart (x, p) around is
that neither the frequency W(x,p), given point with the V" nor U', by (6.8) spatial wave covector k,, (x, p), given by (6.9) has Then there is a neighborhood W C T*M with Va at a zero u. Ua, of u on which frequency and spatial wave covector are different from zero. If this neighborhood W has been chosen appropriately, our isotropy assumption guarantees that the norm Vgab (x) ka (X, p) kb (X P) of the spatial wave covector must be a function of x and w (x, p) on X n W, as is nicely illustrated by the orbit structure of Figure 6.1. Hence there is a strictly positive real valued function n, defined on some subset of M x R, such that a
E
u
to
assume
=
=
7
n(x,w(x,p))jw(x,p)j on
gab(x)ka(X,p)kb(X,P)
=
(6.30)
Ar n W. This construction locally assigns a frequencyo-dependent index of n to any isotropic ray-optical structure. Comparison with (6.10)
refraction
reciprocal to the norm of the phase velocity. Since we use units equal to 1, this can be rephrased as saying making that 1/n gives the phase velocity in units of the vacuum velocity of light. It is to be emphasized that, as long as there are no additional assumptions on our isotropic ray-optical structure Ar, the index of refraction is a local concept. Here, "local" refers in particular to the necessity of restricting the fibers of T*M. Globally, n might be a "multi-valued function" corresponding shows that
n
is
the vacuum velo,'city of light
to various "branches" of
On
an
tion is well-defined and
H(x,p) It is over,
that
an
1 =
2
can
gab (X)
(
C
T*M,
at
be used to introduce
+ Ua (X)
Ub (X) _
n(x, -UC(X)P, )2
a
least, the index of refraclocal Hamiltonian
Ua(X) Ub(X)
PaPb
(6.31)
immediate consequence of (6.30) that H vanishes on Ar n W. Moreassumption that the spatial wave covector has no zeros implies
our
(,OH/c9pa)
has
Hamiltonian for
i.e.,
JV-
appropriate neighborhood W
no zeros on
our
ray-optical
Ar
n
W. Hence, Ar. If
constant with respect to its
(6.31) gives, indeed,
a
local
frequency-independent, second argument, the rays determined by
structure
n
is
6.4
Light
bundles in
isotropic media
123
(6.31) are the light-like geodesics of a Lorentzian metric. In frequency-dependent case they are associated with a sort of "generalized Lorentzian metric" investigated in detail by Miron and Kawaguchi [97]. To sum up, we have proven that any isotropic ray-optical structure is locally generated by a Hamiltonian of the form (6.31) near any point of JV where frequency and spatial wave covector are non-vanishing. Here, frequency and spatial wave covector are meant with respect to the normalized observer field U distinguished by the isotropy assumption. Hence, a ray-optical structure on
the Hamiltonian the
isotropic with respect to some given normalized observer field R+, unambiguously characterized by an index of refraction n: M x R
(M, g) U is
which is
neighborhood 'YV
on a
C
is well-behaved. It is easy to
(a)
point where the ray-optical structure check that such a ray-optical structure is
T* M
near
any
causaliff
n(x, w)
+
WW)2
: 1;
W
n(x,
(6.32)
dilation-invariant iff
(b)
,On
TW (X, W) Lorentz invariant iff
(c)
n
(b)
From
just
a
and
(c)
we can
different way of
locally generated by
read
an
=
I
(6.33)
0;
h(x)
a
(6.34)
_
alternative
saying that a
=
is of the form
n(x, W)2
is
2
Ln (X, W)
W2
*
proof of Proposition 6.3.1. (c) is ray-optical structure
Lorentz invariant
Hamiltonian of the form
H(x,p)
Pa A + h(x)) ,!(gab(x) 2
(6.35)
Please note that the Hamiltonian (3.46) for light propagation in a nonmagnetized plasma is of this form. In this case the function h(x) is given
by
the
6.4
plasma frequency (3.51), h(x)
Light bundles
in
=
isotropic
WP (X)2
=
e2
0
mn (X).
media
investigate the dynamics of infinitesimal bundles of light several stanrays in an isotropic non-dispersive medium, thereby generalizing to asbe will it For our necessary of results dard purposes ordinary optics. characterized it is that and irrotational is globally medium the that sume by a single-valued index of refraction which has no zeros and no singularities. According to the results of the preceding section, this implies that the
In this section
we
124
6.
Ray-optical
structures
Lorentzian manifolds
on
exactly the light-like geodesics of a Lorentzian metric go, called the metric" henceforth. The contravariant components gob of the optical metric are determined by writing (6.29) in the form H(x,p) jgab(X)Papb, 0 rays
are
"optical
=
go is related to the
i.e.,
go (X,
Y)
spacetime =
n2 g(X, Y)
for all vector fields X and Y
g(U, U)
_
1) g(U7X) g(UIy)
M. Here, U is
(6.36)
given observer field with
a
M which is supposed to be hypersurface-orthogonal and R+ is a given C' function. In the following we refer to U as to the -1
M
n:
on
(n2
+
2
by
metric g
on
"rest
system" of the medium and to n as to the "index of refraction". These assumptions include, of course, vacuum light propagation as the special case n
1.
=
Now let
us
fix
ray A: I
a
M, i.e.,
light-like go-geodesic, where
a
real interval. For the sake of simplicity we choose an affine I TM of A satparametrization such that the tangent field K I denotes
a
isfies the equations
go(K, K)
=
(Vo)K K
and
0
(6.37)
0
where Vo denotes the Levi-Civita connection of the metric go. Infinitesimally neighboring rays are mathematically modeled by Jacobi fields along A. (Please
recall that Jacobi fields Definition
5.6.2.)
In the
TM with rm
J: I
o
defined, for arbitrary ray-optical structures, by at hand, a Jacobi field along A is a C' map A that satisfies the following two conditions.
are
case
J
=
(Vo)K (Vo)K J
-
Ro(K, J, K)
go (K,
J)
=
is
const.
a
(6.38) (6.39)
multiple of K,
,
where Ro denotes the curvature tensor of the connection Vo. (6.38) assures that "the arrow-head of J is tracing a neighboring go-geodesic" and (6.39)
neighboring geodesic is again go-light-like. analyzing the motion of such Jacobi fields it is convenient to refer to an appropriate basis of vector fields along A. We introduce the following definition which makes sense for arbitrary curves A in our Lorentzian manifold assures
that this
For
(M, g)
-
M
Definition 6.4.1. Let A: I
E,,-2)
field by K. Then (El, 1,...,n 2
be
a
is called
C'
a
curve
and denote its tangent
Sachs bein
along
X
iff for A, B
-
(a) EA is a C1* vector field alongA, i.e., EA: (b) g(EA, EB) JAB and g(K, EA) 0; (C) g(EA, VKEB) 0 =
A Sachs bein is called
g-orthogonal
ITM with -rM
o
EA
=
A;
=
=
-
adapted
to the vector
to
an
Vx(,) for
observer
all A
=
1,
field V iff -
.
.
,
n
-
the vector EA (s) is
2 and all
8
E I.
Light bundles
6.4
in
isotropic media
125
Whenever refering to a Sachs bein we use the summation convention for capital indices A, B, C.... running from I to n 2. It is easy to check that, for an arbitrary curve X and an arbitrary observer field V on M, there is a Sachs bein along \ which is adapted to V and that it is unique up to transformations of the form -
EA (s) where
(OB A)
is
a
the literature, the
constant name
OB A EB (S)
m
(6.40)
6AB- In orthogonal matrix, i.e., OD A OCB 8DC usually restricted to the case that A is
"Sachs bein" is
light-like geodesic with respect to g. In this situation, which was considered original paper by Sachs [125], condition (b) of Definition 6.4.1 assures that K, Ej,. E,-2 span the g-orthocomplement of K and condition (c) requires that, apart from the freedom of adding multiples of K, each EA is V-parallel along A. Here, however, we are considering the case that A is a light-like geodesic of the optical metric rather than of the spacetime metric. In the following we fix a Sachs bein along A that is adapted to the distinguished observer field U, i.e., that satisfies in addition to (6.42) and (6.43) the condition a
in the
-
-)
g (U,
EA)
'
go (Ui
EA)
0
(6.41)
-
help of (6.36) and (6.41), conditions (b) and (c) of Definition be rewritten in terms of the optical metric in the following way.
With the can
go(EA, EB)
=
n2 JAB
(Vo) K Q EA)
is
and a
n
Whereas
(6.42)
is
obvious,
it needs
calculate the difference tensor V of U
a
go (K,
multiple
bit of work to
being hypersurface-orthogonal (
=
In this situation every vector field J
=
0
(6.42)
,
(6.43)
of K.
verify (6.43). One has to to use the assumption
(6.36) and irrotational).
V,, from
-
EA)
6.4.1
along
A
can
be
represented
as a
linear
combination
j(S)
=
jA (s) EA (s)
+
v(s) U,\(,)
+
w(s) K(s)
,
(6.44)
jA(S), v(s) and w(s). Jacobi fields are determined by inserting (6.44) into (6.38) and (6.39). This gives conditions on the coefficients
with scalar coefficients
jA and
v
but not
on w
because
a
Jacobi field remains
a
Jacobi field if
an
0, J is g-orthogonal arbitrary multiple of the tangent field is added. If v to the observer field U up to the irrelevant term proportional to the tangent =
field, i.e., the connecting vector from X to the neighboring ray is purely spatial 1 with respect to the distinguished observer field U. (In the vacuum case n of the fields dynamics simultaneously.) Hence, this is true for all observer infinitesimally thin bundles of light rays in our isotropic medium is given by 0 into (6.38) and (6.39). As (6.39) is automatically inserting (6.44) with v =
=
126
6.
Ray-optical
structures
on
Lorentzian manifolds
satisfied with const. 0, we only have to care about system of second order linear differential equations =
n(s)d2 (n(s) jA(S)) ds-7
for the coefficients
jA'
SA B (S)
This
jB(S)
gives the
(6.45)
where
sA B Here and in the
=
(6.38).
jAC go (Ec, Ro (K, EB, K))
=
(6.46)
following we write n(s) for n (A(s)). Owing SAB satisfies the identity
to the
symmetries
of the curvature tensor,
sA B 6BC If the Sachs bein is
SCB 6BA
=
changed according
to
(6.47)
(6.40), SA B undergoes
the trans-
formation
jAC ODC OF B SG F(S) JDG
sA B(S) Now let
(L
A
B
us
(8))
assume
that
have
we
that satisfies the matrix
(6.48)
matrix valued function
a
s
L(s)
=
analogue of the differential equation (6.45),
i.e., d2
n(s) ds2* (n(s)L AC(S)) Then any
(Cl'...' e-2 )
=
SA B (s) L13 c (s)
Rn-2 determines
E
jA(S)
=
L
A
a
(6.49)
solution
B(S) CB
(6-50)
0 and w arbitrary, a (6.45) and, upon inserting into (6.44) with v (6.38) and (6.39) with const. 0. In other words, such a matrixvalued function L determines an (n 2)-parameter family of infinitesimally neighboring rays around the central ray A. If det (L(s)) : 0, those neighboring rays fill the space (not the spacetime!) around A completely. For that reason, we call a solution L (L AB) of (6.49) that satisfies det(L(s)) 54 0, with
of
=
solution J of
=
-
=
the
possible exception
of
some
isolated parameter values s,
an
infinitesimal
bundle of rays around A. More precisely, L should be called the representation of such a bundle with respect to the Sachs bein chosen. If we change the Sachs
bein,
we
have, of
course, to
L
Since
n
has
no
AB(S)
zeros,
change
L
according
to
6AC ODC OF BL G F(S) 3DG
m
(6.49)
can
d2
be solved for ds-7 LAc(s).
initial values for L and for its first derivative determine
a
Invertibility of L(s) for almost all parameter values equation D
A
(6.51)
B(s)L'c(s)
=
jAC(S) ,
Thus, arbitrary
unique solution. s
implies that
the
(6.52)
defines
D(s)
matrix
a
those isolated values
infinite.) Hence,
(DA B(S)) for det(L(s))
=
where
s
the jA from
almost all parameter values
0,
=
some
D
=
A B
(s) jB (S)
decompose D(s) part according to
into
we
D
AB(S)
=
(At
D(s)
a
respect
symmetrical and
OA B(S)
+
measures
the
to the Sachs bein
an
antisymmetrical
WA B(S)
(6-54)
OCB(S) jBA
0,
(6-55)
AB (s) jBC + WCB(S) jBA
0,
(6.56)
oA B (S) jBC W
s.
(6-53)
motion of the infinitesimal bundle around A with
chosen. If
127
components of D become
demonstrates that the matrix
(6.53)
s.
isotropic media
(6.50) satisfy
jA(S) for almost all
bundles in
Light
6.4
-
symmetrical part OAB (s) gives
the deformation and the antisymmetrical infinitesimal bundle with respect to the the of WA the rotation part B (S) gives Sachs bein. The symmetrical part can be further decomposed according to
the
oA B(S)
=
07
AB(S) +
0 (s)
(n-2)
jA B
(6.57)
OA A(s) gives the expansion and a A B(S), which is defined O(s) through (6.57), gives the shear of the infinitesimal bundle with respect to
where
=
the Sachs bein.
propagation equations for these quantities. If we calcu-
It is easy to derive
late the derivative of
(6.49)
for
(6.52), we find that the second order differential equation
LA B implies the first order differential equation
j)A B(S) I
n(s)2
for D A B.
sA B(S)
AB(S)
For the
D
A
24(') D A B(S) n(s)
C(s)DCB(S)
-
(6.58)
(S) 6AB
n(s)
Symmetrization respectively antisymmetrization results n(s-)y
SA B (S)
WA C(S)W C B
2nLj ) n(s)
6A B(S)
ci
-
+
=
U)
A
1
C(S) OCB (8)
vacuum case n
=
_
OAC(S) OCB(S)
OA B (S)
OAC(S) WCB(S)
n(s)
-
(6.59)
jA B
s)
n(s)
1, the propagation equations
in
A
B(S)
(6.59)
and
(6.60) (6.60)
are
well-known and can be found in many textbooks on general relativity, see, of (6.59) gives the well-known e.g., Wald [146], p. 222. In particular, the trace
128
Ray-optical
6.
structures
on
Lorentzian manifolds
1. Please note that focusing equation for vacuum light rays in the case n in the generalization considered here SA B(S) involves the curvature tensor of the optical metric according to (6.46). Now we use the propagation equations (6.59) and (6.60) to prove three theorems on light propagation in a general-relativistic medium of the kind under consideration and thus, in particular, in vacuum. All three have famous counter-parts in ordinary optics. =
Theorem 6.4.1. In the
an
isotropic, dispersion-free and non-rotating medium
holds true.
rotation
following If for an infinitesimal bundle of rays the for one parameter value s, then it vanishes for all s.
vanishes
Proof. This is an immediate consequence of the fact that homogeneous differential equation (6.60).
the rotation satisfies
the
This theorem
be viewed
can
as a
11
general relativistic analogue of the Malus
theorem of ordinary optics. In its most elementary version, found by Malus in 1808, this classical theorem can be formulated in the following way (cf., e.g.,
[161). A family of straight lines that starts surface-orthogonal surface-orthogonal (a) after reflexion at an arbitrarily curved surface according to the usual reflexion law and (b) after refraction at an arbitrarily curved surface according to Snell's law. In other words, a surface-orthogonal bundle of light rays remains surface-orthogonal after passing through any system of mirrors and lenses. Theorem 6.4.1 gives a similar statement for light rays in an isotropic, non-dispersive, and non-rotating medium on a generalrelativistic spacetime. The analogy comes from the fact that a two-parameter family of straight lines in ordinary Euclidean 3-space is surface orthogonal iff, around any member of this family, the infinitesimally neighboring mem-
Born and Wolf remains
bers
are
ordinary
irrotational. In this Euclidean
case
rotation is to be measured with respect to
parallel transport.
The other two theorems refer to infinitesimal bundles of rays which are homocentric, i.e., to the case that L(s) is the zero matrix for one particular
parameter value
s.
Theorem 6.4.2. In
the
following holds
an
true.
isotropic, non-dispersive and non-rotating medium If an infinitesimal bundle of rays is homocentric,
then its rotation vanishes.
Proof. can
Let L
=
(LA B)
be
a
solution of the differential equation
be rewritten in matrix notation d
2
n(s) da-7 (n(s)L(s)) with S
=
=
which
n(s) dT (n(s)L T(s))
=
(6.61)
S(s) L(s)
(SA B)- Owing to (6.47), transposition d2
(6.49)
as
L
of
(6.61)
T(S) S(S)
results in
(6.62)
6.4
where
denotes the transpose of
imply that the
a
isotropic media
(6.61)
matrix.
a
in
and
129
(6.62) together
matrix
C(s) has
Light bundles
n(8)2 (.LT (s) L(s)
=
( (s)
vanishing derivative,
homocentric infinitesimal
-
L
T(S).L(S))
(6.63)
Hence, C(s) is a constant matrix. If L is bundle, L(s,,) is the zero matrix for some specific 0.
=
parameter value s,,. So C(s) C(s,,) must be the zero matrix for all s. With C the matrix we 0, multiply equation (6.63) from the left by (L-1)T(s) =
=
and from the
L-1(s).
right by
This
be done for almost all parameter equation reads D D(s) == 0, i.e., the can
T(S)
By (6.52), resulting antisymmetrical part of D(s) vanishes.
values
the
s.
This theorem is
-
0
analogous to the elementary fact that, are surface-orthogonal.
in
ordinary optics,
homocentric bundles We
turn to the last of
now
relevance for
particular
cosmology.
(Reciprocity theorem)
Theorem 6.4.3. and
three theorems which is of
our
In
an
non-rotating medium, the following holds
infinitesimal bundles around are homocentric, with L, (sl)
the =
central ray
same
0 and
L2 (S2)
isotropic, dispersion-free If L, and L2 are two A and if both Ll and L2
true.
=
0, then
I det (n(S2) L, (S2))l- jdet(n(s1)L2(S1))j jdet(n(S2)-WS2))j 1 det (n(si).Li (si)) I Proof.
We
use
the
same
matrix notation
as
(6.64)
proof of Theorem
in the
6.4.2.
By
assumption, the differential equation (6.61) and its transposed version (6.62) are satisfied by L L2. This implies that the matrix L, and L =
=
C12(S) has
=
vanishing derivative, particular, the equation
n(S)2 (LT1 (s) L2(S) t12(S)
=
has to hold. Owing to
simplifies
our
is
=
is
a
(6.65)
constant matrix. In
(6.66)
C12(S2)
hypothesis Ll(si)
an
obvious consequence of
=
=
0 and
L2(32)
=
0, (6.66)
--:-n(S2)2 LT1 (S2) L2(S2
and L2
are
To
a
give
underlying
infinitesimal bundles and
physical interpretation
(6.67)
(6.67).
Please note that the denominators in
L,
(S) L2(S))
to
n(s, )2 jT (si) L2(SI) (6.64)
LI
Hence,. C12(S)
0.
C12(Sl)
T
-
0
(6.64) n
is
are
different from
zero
since
strictly positive.
to this theorem
we now assume
that
our
Lorentzian manifold is 4-dimensional and that, with respect to the
130
6.
Ray-optical
structures
time orientation defined
A(si).
future of
by
We restrict
on
the
our
Lorentzian manifolds
distinguished
observer field
U, A(82)
is in the
consideration to the section of X between these
points. We consider the totality of all Jacobi fields, defined via eq. (6.50) A LI, with 6AB C cB < 1. This can be interpreted as a pencil of light rays issuing from the point A(si). Similarly, the analogous construction carried through with L L2 gives a pencil of light rays received at the point
two
with L
=
=
A(S2).
Now the numerators in
(6.64),
up to
a
factor
-x
and up to the square of
the index of
refraction, give the cross-sectional areas of these two pencils at the points A(si) and A(S2), respectively. It is important to realize that these quantities have an invariant geometrical meaning, independent of the Sachs bein and of the g,,-affine parametrization chosen for A. On the other hand, the denominators in (6.64), again up to the square of the index of refraction, are measuring the opening angle of the respective pencil at its focal point. These quantities are, again, independent of the choice of the Sachs bein; however, they do depend on the affine parametrization chosen for A. If we switch to ) another affine parametrization by a transformation s i as + b with real constants a : A 0 and b, on both sides of (6.64) the denominator is getting a factor jaj-1-. As for a ray (i.e., for a g,,-light-like geodesic) the choice of a particular affine parametrization is a matter of arbitrariness, this argument shows that the denominators of (6-64) are "unphysical" in the sense that they cannot be measured. Therefore we introduce the quantities dl,,m
=
jdet(Lj(S2)) Idet (Li (si)) I
dang
=
jdet(L2(Sl)) jdet(12(S2))j
g,\(81)
(K(sl), U(sj))
jg,\(. ) (K(S2), U(S2))l
U(s) denotes the distinguished observer dang are invariant with respect to changing
(6.68)
(6.69)
1
A(s). dj"nj parametrization of
where
field at the point
and
the affine
,\ since the tangent field K in the numerator is stretched with the same factor as the derivative operator (overdot) in the denominator. Here it is essential that
we
restrict to the
matrices.
2
djum
opening angle
case
dim(M)
=
4 such that
relates the cross-sectional at
A(si)
area
where the latter is
of the
now
L, and L2
Li-pencil
measured
as a
at
(2 x 2)\(S2) to its
are
solid
angle
in
the local rest space of the observer U(si), see Figure 6.2. In cosmology djuln is known as the corrected luminosity distance from A(sj) to A(S2), whereas the
analogously from
A(si)
defined quantity dalg is known as the angular diameter distance A(S2). Now the reciprocity law (6.64) can be rewritten in the
to
form
dium
=
d
ang
n(s 1)2 1g,\(,,J) (K(si), U(si)) 1 n(82 )2 19MS2) (K(S2) U(S2)) ,
Please note
that, by (6.23), the factor
(6.70)
6.5
Stationary ray-optical
131
structures
-52) ;2)
A(SI) a pencil with central ray A, issuing from a light observer at A(82). The corrected luminosity distance relates the cross-sectional area of the pencil at A(82) to its opening angle at A(si), measured in the rest space of the observer U(si). The angular diameter distance is defined in
Fig.
6.2. This illustration shows
source
an
at
A(si)
analogous
to
an
manner
for
a
pencil around A with focus
(K(si), U(si)) 9X(32) (K(S2)i U(S2))
at
A(.52)-
(K(si), U(si))
g,\(,,)
(6-71)
-
(911))1(12) (K(82)i U(S2))
gives the redshift under which U(si) is seen by U(S2)In the vacuum case n 1, (6.70) gives a remarkable relation between corrected luminosity distance, angular diameter distance and redshift. (The observers U(si) and U(S2) are arbitrary in the vacuum case.) As to the literature on this subject we refer to Etherington [41] who discovered the law (6.70) for the case n 1; to Ellis [40] whose proof of the reciprocity law for I served as a model for our proof of Theorem 6.4.3; and to the case n Schneider, Ehlers and Falco [128], Sect. 3.5, who give a detailed discussion of the reciprocity theorem for vacuum light rays and of its relevance for cosmology. It should also be mentioned that the name "reciprocity theorem" goes back to Straubel (1351 who introduced the analogue of Theorem 6.4.3 in ordinary optics. This classical reciprocity theorem is closely related to the socalled sine condition for stigmatic imaging. The latter can be traced back well into the 19th century to Clausius, Helmholtz and Abb6. It is discussed, =
=
=
e.g., in the standard textbook
6.5
by
Born and Wolf
Stationary ray-optical
[16],
p. 166.
structures
Recalling Definition 5.3.2 of the symmetry algebra gAr, stationarity optical structure A( can be introduced in the following way.
of
a
ray-
132
6.
Ray-optical
structures
on
Lorentzian manifolds
ray-optical structure Ar on (M, g) is called stationary iff there is a vector field W E gAr which is everywhere time-like, g(W, W) < 0. If, in addition, the one-form g(W, -) is globally (or locally, respectively) of the form g (W, ) h dt with some scalar C' functions h and t, Ar is called globally (or locally, respectively) static.
Definition 6.5. 1. A
=
-
In the static which
are
the spacetime is foliated into hypersurfaces t const. the of W. It to follows from the wellcurves g-orthogonal integral case
=
known Frobenius Theorem that the one-form
[1261,
p. 53
For the
K
=
g(W,
locally such
foliation exists if and
a
satisfies the equation
r.
A dk
only
if
0, cf. Sachs and Wu
=
.
vacuum
ray-optical
structure
JV
=
jV9
on
(M, g)
know from
we
Sect. 5.3 that the symmetry algebra gAr coincides with the set of conformal Killing vector fields of g. Hence, the vacuum ray-optical structure is stationary sense of Definition 6.5.1 if and only if the Lorentzian manifold (M, g) conformally stationary. In terms of local Hamiltonians, stationary ray-optical structures can be characterized in the following way.
in the is
Proposition 6.5.1. Let JV be a stationary ray-optical structure on (M, g) and fix a point u E JV. Then there is a local Hamiltonian H for M, defined on 0. Here W denotes the canonical a neighborhood of u, such that dH(fV-) lift (5.24) of the time-like vector field W E gAr. =
Proof. Since the a
vector field W is
W(u) 54
0. Hence, we can choose through u that is transverse 9,v implies that W(u) is tangent
time-like,
codimension-one C' submanifold P of T*M
to the flow of
W. As the assumption W
E
automatically guaranteed that P is transverse to JV at u. This transversality property implies that Ar n P is a codimension-one C' submanifold of P. If P is small enough, this guarantees the existence of a
to
JV,
it is then
C' function h: P
A(n P
=
define
a
is
a
IW
E P
)
I h(w)
R such that the differential dh has =
0
1.
Then the conditions H I p
real valued function H
on a
neighborhood
of
=
u.
no
h and
zeros
dH(17v)
a
natural
=
0
By construction, H
local Hamiltonian for M.
In
and
1:1
chart, induced by
a
chart
(x',..., x')
on
M with W
=
49/(9x',
the equation dH(fV-) 0 means that H is independent of the coordinate x'. For a stationary ray-optical structure, the time-like vector field W E 9Ar =
defines
a
normalized observer field V
f
=
.1 2
=
e-f W, where
In(- g(W, W))
.
(6.72)
By Proposition 6.2.2, f is a redshift potential for this observer field. This observation is closely related to the fact that, by Proposition 5.3.3, the mo) R is constant along each lifted ray. The existence of mentum O(W): T*M this constant of motion is crucial for the dimensional reduction of stationary
6.5
ray-optical
structures. We
are now
Stationary ray-optical
going
133
structures
to discuss this reduction formalism
in detail.
goal
The
study,
is to
for
a
stationary ray-optical
structure
M
on
our
(M, g), the dynamics of rays in terms of their prol)-dimensional space A In the first step we have to jections onto an (n A The obvious idea is to introduce M as a quotient construct the space n-dimensional spacetime -
space of
M, calling
integral
curve
two
points of M equivalent
iff
they
are
of the time-like vector field W E iCA( and to
connected
hope
b
an
that M is
smooth Hausdorff manifold. This is, indeed, always true locally, i.e., if we restrict to a sufficiently small neighborhood of an arbitrary point in M. Globally, however, the topological space M' may violate the Hausdorff axiom and a
need not admit Ir :
M
)
M
smooth manifold structure such that the natural projection a submersion. E.g., for a time-like vector field W on
a
becomes
Minkowski space with one point removed the quotient space necessarily violates the Hausdorff property. Also, it is easy to verify that the quotient space cannot be a smooth manifold if an integral curve of W is almost closed, com-
ing back
into any
neighborhood
of
some
point infinitely often without being
periodic. It is, thus, necessary, to introduce additional assumptions to make sure that the quotient space M is a smooth Hausdorff manifold. To'put this rigorously we introduce the following terminology, cf. Figure 6.3.
Definition 6.5.2. Let W be
function
t: M
--+
R is called
a a
time-like C' vector
global timing
(a) dt(W) 1 and (b) for any t, and t2 in R the fiow of W diffeomorphically onto the hypersurface t
field on (M, g). A for W iff
C'
function
=
It would be
respect
misleading to call t hypersurfaces
since the
function",
a
t
"time =
maps the =
t2
hypersurface
t
=
t,
-
function", rather than a "timing space-like with
const. need not be
to the Lorentzian metric g.
function for W, the above-mentioned quotient be identified with any of the hypersurfaces t = const.; this identification makes A 4 into an (n I)-dimensional C110 manifold such that A 4 becomes a submersion. Then the map the natural projection ir: M ---> If t is
space
A
a
global timing
can
-
global diffeomorphism. counter-examples demonstrate that, for an arbitrary time-like vector field W, a global timing function need not exist. It is interesting to note the following result. If W is a time-like vector field on a Lorentzian manifold that has no closed integral curves, then the Hausdorff property of the quotient space M' guarantees the existence of a global timing function for an appropriate reparametrization of W. The proof can be taken
(.7r, t):
M
-+
M
x
R is
a
The above-mentioned
over
from Harris
If t: M
R is, again,
-->
a
[56],
R is
a
global
Theorem 2.
global timing function for W, timing function for W if and
a
second function t': M
only if t'
is of the form
134
6.
Ray-optical
structures
on
Lorentzian manifolds
R
W
t
M
7r
Fig. 6.3. A global timing function t spacetime M as a product of space
for
t'
a
time-like vector field W allows to write
and time R.
t +
h o ir
(6.73)
h: M' R is any C' function. In bundle theoretical language, two different global timing functions for W define two different global trivializa, A 4. If t can be chosen in such a way that tions for the fiber bundle ir: M the hypersurfaces t const. are g-orthogonal to W, this additional condition fixes the timing function uniquely up to an additive constant. However, since it is our goal to study stationary ray-optical structures, and not only where
--+
--+
=
globally static not possible. Now let
ones,
we
us assume
W. Then the
have to deal with situations where such
that
we
have
a
global timing function
global difleomorphism (7r, t):
M
--4
M
x
a
t: M
R induces
a
choice is
-
R for
splitting
of the cotangent, spaces T,*M -- T* M E) T* R for all points q E M. ,r(q) t(q) Projecting onto the first factor gives a reduction map red: T*M If
we
change
undergoes
the
)
T*M.
timing function according
to
(6.74) (6.73),
the reduction map
the transformation
red(u)
1
)
red(u)
=
red(u)
+
(dh)7r(q)
(6.75)
Stationary ray-optical
6.5
for
u
E
structures
135
T*M. In local coordinates, the reduction map is most easily expressed q (xl,. x1) on M with t xI and W ala'x". Then
if we choose coordinates
=
(xl.... Xn-1)
we can use
as
I
coordinates
on
f4
=
and the reduction map takes
the form
(xl,...,Xn,pl
'... ,
Please note that in such
Pn)
a
coincides with the function
Proposition
(XII
'
...
I
Xn-1, PI,
-
-
-)Pn-1)
(6-76)
-
coordinate system the momentum coordinate p" O(W) which is a constant of motion according to
5.3-3.
ready to formulate a reduction theorem for stationary rayoptical structures. Roughly speaking, this theorem says that a stationary ray-optical structure)V on our n-dimensional spacetime (M, g) induces a onel)-dimensional parameter family )V,,. of ray-optical structures on the (n quotient space M', where the family parameter w,, E R is given by the value of We
are now
-
the conserved momentum, w,, -O(W). For the construction of the reduced it is structure )V,,. necessaxy to choose a global timing function t ray-optical for the time-like vector field W E !gAr, i.e., in situations where such a t does =
globally. Moreover, the theorem reEven locally for the reduction process assumptions. transversality quires to give a reduced ray-optical structure near a point U E JV it is necessary that (i) the covector u is not a multiple of the differential dt at the point q -rt4 (u); and (ii) the fiber derivative FH (u) is not a multiple of W at the not exist the reduction does not work two
=
point q -r;A (u); where H is any local Hamiltonian for M. Note that FH(u) is a multiple of W at those points where the ray velocity with respect to the e- f W has a zero, see (6.7). Hence, the secnormalized observer field V ond transversality assumption just excludes all points where the ray-optical structure has a pathological behavior. The precise formulation of the reduction theorem reads as follows. =
=
(Reduction
Theorem 6.5.1.
theorem for stationary
ray-optical
stationary ray-optical structure on (M, g) and let t: M --+ R be a global timing function for the time-like vector field W E !9,V As outlined above, this induces a global diffeomorphism (ir, t): M -- M x R T*M'. Now fix a value w,, E R such that and a reduction map red: T*M Ar the set Q, u E -w,, I is non-empty. Assume that for all I O(W) Let A( be
structures)
a
-
(i) the covector u is not a multiple of the differential dt at (ii) the fiber derivative FH(u) is not a multiple of W at -r) (u); -r, 4 (u), where H is any local Hamiltonian for M. Then 9,,o red(Q,,.) : I 9,. is a lifted ray (or a ray-optical structure on X4'. A C' curve
points q q
is a
u
E
and
=
--+
lifted virtual
form ray,
red
ray, o
respectively) of )V,,. if and only if it
where
:
I
--
Q,.
C
X is
a
lifted
can
ray
be written in the
(or
a
lifted virtual
respectively) of M.
Proof. Recall that W
is the Hamiltonian vector field of the function
O(W),
136
Ray-optical
6.
structures
Lorentzian manifolds
on
d(O(W))
Q(W, -)
=
This equation shows that the differential set
IC,,.
=
{u
of T*M. On
E
T*M
I (O(W)) (u)
=
-w,,
(6.77)
.
d(O(W)) I
is
a
has
the canonical two-form Q induces
IC,,.,
no zeros.
Hence, the
codimension-one submanifold two-form
a
S?,,.
with
a
one-dimensional kernel. At each point of IC,,,, this kernel is spanned by W, as can be read from (6.77). Let us call two points of IC,,. equivalent if they can be connected
by
integral
an
of
curve
W. Then the quotient
space
)C,,. /_
carries
Hausdorff manifold structure such that the natural projection /C,,. becomes a submersion. This follows from the fact that W admits
timing function. Moreover, on
1Q,. /-.
the two-form
S?,,.
induces
a
symplectic:
a
/C,,. /_ a global )
structure
It is worthwile to reconsider this construction in terms of
a
natural
by coordinates (xl,..., x') on M with t x' and W 9/,gxn. Then )C,,,,, is given by the equation p,, -w,,, i.e., r-,,. is parametrized by the coordinates (xl.... X", P1, p.- 1). Forming the quotient IC,,,, /_ comes up to factoring out the coordinate x'. This shows that )C,,./an be identified, as a symplectic manifold, with the cotangent bundle T*M, and that the ) IC,,.I- can be identified with the restriction of natural projection IC,,,, the reduction map (6.74) to )C,,,,. For all points u Xn Now we consider the set Q,. FH(u) is linearly independent of W,;,(.,,) by assumption. Thus, the characteristic direction of 1Q,. (i.e., the direction spanned by W) and the characteristic direction of Ar (i.e., the lifted ray direction) do not coincide. This implies that Ar and IC,,. have a transverse intersection at all points u E Q,,.. Thus, Q,,. is a closed codimension-one submanifold of IC,,., i.e., a manifold of dimension (2n 2). Since W E gAr, Q, is invariant under the flow of W. This implies chart induced
=
=
=
-
I
-
-
,
=
-
that the set of dimension of
(2n
dt,;,(,,), 9,,.
that for
u
E
red(Q,,,,) is a closed submanifold 3). Since we assume that u E Q,,,,
=
-
does not meet the
Q,,.
zero
the fiber derivative
T*M
of is
T*M; since never a multiple
section in
FH(u)
is
c--
never a
multiple
we
assume
of W
(u),
everywhere transverse to the fibers of T*M. This proves that Jv, is, wo indeed, a ray-optical structure on A To prove the rest of the proposition, we observe that Q,,. is foliated into the two-surfaces spanned by lifted rays and by integral curves of W. If we denote the pull-back of the canonical these two-surfaces can be characterized as the two-form 0 to Q,,. by Hence, red maps any such two-surface integral manifolds of the kernel of This proves that for each lifted ray onto the image of a lifted ray of ,o. of there is a one-parameter family of lifted rays : I red o . Since the map red is fiber preserving, this result X such that El remains true if "lifted ray" is replaced with "lifted virtual ray". A
is
=
Please note that at
a
point
u
E
to the normalized observer field V
reversible
one can
Q,,, =
n
Tq*M
e-f W
is
the
frequency with respect
equal
to
restrict to values w,, > 0 without loss of
e-f (q) ,
W".
If Af is
generality.
Stationary ray-optical
6.5
137
structures
Locally around a point u E Ar the reduction can be carried through in following way. It is convenient to introduce coordinates (xl,...,xn) on x' and W M such that t O/Ox'. By Proposition 6.5.1, we can choose a local Hamiltonian H for JV around u which is independent of x1. It is easy to verify that a local Hamiltonian ft for the reduced ray-optical structure is given simply by setting pn equal to -w,,, i.e., the
=
=
ft (XI,
.
,
.
.
x-
1, pl,
.
.
.
,
pn-
1)
=
H
(xl,
.
.
.
,
xn- 1, P1,
pu- 1,
-w,,)
.
(6.78)
It is important to realize that the reduced ray-optical structure 9,,. depends on the choice of the global timing function for W in the following way. If, in the situation of Theorem 6.5-1, the global timing function is changed according to (6.73), the reduced ray-optical structure changes according to
9,,,, is
again
as
well
as
a
ray-optical
structure
on
A
for the old
one.
proof
The
of
(6.79)
M provided
that the
transversality global timing function (6.79) follows immediately from the
of Theorem 6.5.1 is satisfied for the
(i)
condition
+
new
(6.75) of the reduction map. In this situation, Theorem 6.5.1 gives a natural one-to-one relation beThis relation is defined by and lifted rays of./([' tween lifted rays of
transformation behavior
9,,,,
associating
a
representable
.
of
lifted ray
JV,,. red
in the form
of M.
ray
s
E
I. There is
In terms of
a
an
=
analoguous
natural chart
on
and
where p 1, coincide =
n
-
=
PP(s)
o
with the
then related
they
same
are
lifted
by
(6.80)
A
T*A, (6-80)
_XP
takes the form
xP(s)
(6-81)
(xj(s),...,X._j(s))
(6-82)
1. This observation
although the lifted
are
iff
of
one-to-one relation for lifted virtual rays.
A +
red'
'
(s) + A I
x1p (8)
PIP(s)
a
lifted ray
and
o
By (6.75),
'(s) for
with
implies that the rays of 9,,,, and Similarly, the virtual rays of
rays do not.
although the lifted virtual rays do not. From (6.80) the transformation behavior of wave surfaces. read or (6.82) we can also : M is ) R a classical solution of the eikonal equation of The function the function if + h is a classical solution of the eikonal Xr,,o if and only and
W"
coincide
equation of 9'o. Clearly, both solutions are associated with the same family of rays. There is a far-reaching formal analogy between this situation and the dynamics of charged particles moving in a magnetostatic field. The change of the global timing function corresponds to a gauge transformation of the
6.
138
Rayv-optical
structures
Lorentzian manifolds
on
magnetostatic potential. In both cases, the canonical
undergo
transformation of the form
(6.82).
momentum coordinates
In view of this
analogy one "gauge invariant" whereas the wave surfaces are not. This "gauge freedom" can be removed if our stationary rayoptical structure is globally static. Then we can "fix the gauge" by choosing the hypersurfaces t const. g-orthogonal to the integral curves of W. If, however, our stationary ray-optical structure is not globally static, then there is no distinguished choice for the global timing function and we have to live with the "gauge freedom". Having clarified the dependence of the reduced ray-optical structure on the choice of the global timing function, we are now going to investigate its dependence on the parameter w,, which fixes the frequency of the rays. If the assumptions of Theorem 6.5.1 are satisfied for two real numbers and are in general w,, and w', the reduced ray-optical structures completely different. If, however, the stationary ray-optical structure M is might
a
say that the rays of
91,1,,
are
=
0
dilation-invariant in the consideration is
sense
of Definition 5.4.1
non-dispersive),
then
JV,,,
and
(i.e.,
JV,,,
if the medium under
are
related
o
by the fol-
lowing proposition. Proposition 6.5.2. Assume that all the assumptions of Theorem 6.5.1 are satisfied and that, in addition, Ar is dilation-invariant. Then the assumptions c W,, for any of Theorem 6.5.1 are still satisfied if w,, is replaced with W real number c: > 0, and the reduced ray-optical structures are related by =
0
R. ,, In
particular,
the rays
of
dilation-invariant but also
Proof. Recall that M numbers
c,
> 0. As
we
=
C
X ,.
coincide with the rays carries
reversible, (6.83)
is dilation-invariant if and
have
(6.83)
-
seen
in Sect.
5.4,
of 9,,.. If Ar over
to the
only. if M
=
c
is not
case c
only
< 0.
Ar for all real
dilation-invariance
implies that,
for any u E Ar and any local Hamiltonian H of Ar which is defined around u, the fiber derivative satisfies FH(c u) cFH(u) for all real numbers c > 0. If X =
is not
only
dilation-invariant but also
c
an
easy exercise.
< 0.
reversibel, these properties remain valid proof of Proposition 6.5.2 is
On the basis of these observations the
for
13
Proposition 6.5.2 is, of course, in perfect agreement with the basic idea a non-dispersive medium the spatial path of a light ray is independent of its frequency. constructed by the method If we have a reduced ray-optical structure of Theorem 6.5.1 from a stationary ray-optical structure, we can integrate of T*M'. Quite each lifted virtual ray of 9.,, over the canonical one-form generally, the integral over the canonical one-form is known as the action functional and will play a central role in our discussion of variational principles in Chap. 7 below. In the case at hand, it is helpful to introduce the following definition. that in
Stationary ray-optical
6.5
of Theorem 6.5.1 with of the reduced ray-optical
Definition 6.5.3. Consider the situation For
JQ ,,
[81) 821
virtual ray
lifted
a
f is called the on
Wo
f82
(' (s))
., I
.
optical path length of
Here
0
139
structures
54
W,,
0.
structure
(6.84)
ds
denotes the canonical
one-form
T*A 4.
everywhere transverse to the Euler vector field k on T*A , the integral (6.84) is a strictly monotonous function of the upper bound S2. In this case the optical path length gives us a distinguished parametrization along each lifted virtual ray of Ar,,,,. We have already seen that such a distinguished parametrization exists if and only if the Euler vector field is transverse to Ar, please recall Proposition 5.4.7 and the subsequent discussion. It is important to realize that the optical path length is "gauge dependent" in the following sense. Under a change of the global timing function the according to (6.79). reduced ray-optical structure changes into of 9,,,, 9,,,, changes into a [81, S21 Thereby each lifted virtual ray f : [SI) S21 lifted virtual ray according to (6.80). If we R',, of If
9,,,,
is
,
,W
W
compare the optical path length of ' with the optical find that they do not coincide but are related by
()
-T
path length
of
we
(6.85)
+
A
where A
=
-r*.
M
-r*.
o
o
M
following proposition relates the optical path length to the "travel global timing function t. This result is of time", Fermat's of view in relevance principle to be discussed in Chap. 7 particular The
measured in terms of the
below.
Ar Proposition 6.5.3. Let, in the situation of Theorem 6.5.1, 6: [S1 821 be a lifted virtual ray of A( along which the conserved momentum O(W) takes is given by red o the value -w,, =A 0. Then the optical path length of 7
=
S2
If N
( (s)) ds
is dilation-invariant this
-T
Proof. Since
red
O (,,)
o
()
and
( (s))
(/X(S2))
equation simplifies
==
t
(A(S2))
O(W)
=
+ t
-
t
-
t
(A(S1))
-
to
(6.87)
(A(81))
takes the constant value -w,,
( (s))
-
wo
(6.86)
dt,\(,)
( (s))
along ,
(6.88)
140
for
6.
8
E
Ray-optical
[81) S21.
To
structures
verify this equation
(Xl,...,xn)
coordinates
Lorentzian manifolds
on
on
t and
=
natural chart induced
by W; then (6.88) is Integrating (6.88) from si integral vanishes owing to
we can use a
M with x'
alax'
just the trivial identity p,, dx' pp dxP + pn dxn. to S2 yields (6.86). If JV is dilation-invariant, the Proposition 5.4.7. =
=
13
If the reduced ray-optical structure 9,,. is strongly hyperregular and thus orientable, we know from Proposition 5.2.4 that there is a one-to-one relation between positively oriented virtual rays and positively oriented lifted virtual rays. In that case the optical path length can be viewed as a functional on (positively oriented) virtual rays rather than on lifted virtual rays. Again, this observation is crucial in view of Fermat's principle. As a matter of fact, for any stationary ray-optical structure with relevance to physics the reduced rayoptical structure is indeed strongly hyperregular or at least strongly regular. In the latter case the above-mentioned one-to-one correspondence holds true at least locally. The following proposition gives a useful criterion.
Proposition 6.5.4. Assume that all satisfied and fix a point u E Ar with ray-optical structure only if the condition
9,,.
(5.15),
u
-w,,.
=
at the
point
(Hab) (Ha) (Wa) 0 (H6) 0 0 (Wb) 0
fi
red(u) if
and
(6.89)
in any natural chart. Here we use the same matrix notation as in denoting any local Hamiltonian for Ar and Wa denoting the vector
field
W.
It is easy to check that
induced
coordinates
by
choose
a
(6.89)
is
independent
of which natural chart
JV,,.
holds at
u
(xl,
.
.
.
,
x")
on
M with t
=
a
Xn and W
natural chart =
o9/c9x`,
and
local Hamiltonian H that is
owing to Proposition for
=
0
and which local Hamiltonian has been chosen. We choose
we
are
Then the reduced
with H
components of the
Proof.
assumptions of Theorem 6.5.1
(O(W)) (u)
strongly regular
is
det
holds at
the
6.5.1. Then
independent of x1. This is possible equation (6.78) gives us a local Hamiltonian
around & As in the coordinates chosen Wa if and
=
Jna,
condition
(6.89)
only if the condition det
Olpa) (ftp)
( (fi')
0
(6.90)
0
holds at fi, where p is an index numbering rows and o, is an index numbering columns, both running from I to n 1. By Definition 5.2.2, (6. 90) holds at 'a -
if and
only
if
9,.
is
strongly regular
at that
point.
In many cases of interest condition (6.89) can be checked help of the following result from linear algebra.
1-:1
quickly with the
Stationary
6.6
Proposition 6.5.5. If the is equivalent to det
We
Proof.
and
is satisfied if and
I (Zb)
I
R"
(=-
and in
vacuum
is
HcdHcWd)
Hef He Wf
Hgh Wg
(Ha)
and
(Wa)
141
simple media
invertible, HabHb,:
Hab Ha Hb
Ja, (6.89)
=
(6.91)
0.
.
Whj
linearly independent
are
since
both wrong. With this assumption, (6.89) if the image of the (n 2)-dimensional vector space 0 } under (Hab) is transverse to the 2Wa Za
(6.91)
only
Ha Za
(Hab)
matrix
that
can assume
(6.89)
otherwise
(
optics in
ray
are
-
=
=
dimensional space spanned by (Ha) and (Wa). This is the case if and only if (Hab) is non-degenerate on the 2-dimensional space spanned by (Ha) and
(Wa), i.e.,
6.6
if and
only
Stationary
In the
preceding
(6.91)
if
ray
section
holds true.
in
optics
0
and in
vacuum
have established the
we
general
simple media features of the
re-
structures. To illustrate these
duction formalism for stationary ray-optical results by way of example, we shall now carry
through
the reduction in full
structures. To that end
detail for stationary vacuum ray-optical assume that we have a g-time-like vector field W notes the
vacuum
ray-optical
structure
on
we
where M
9Ar (M, g). According E
to
=
our
have to
Afg de-
results of
5.3, the condition W E 9Ar means that W is a conformal Killing vector field of the metric g, i.e., that the Lie derivative Lwg is a multiple of g. This implies that W is a Killing vector field of the rescaled metric e-2f 91 Sect.
(e-2f g)
LW where
f
2
In(
-
g (W,
W)).
=
(6.92)
0
Hence, the one-form
_e-2f g(w,
(6.93)
satisfies
O(W) Now let This
us a
(6.94)
LwO=O.
global timing function ) M' global diffeomorphism (7r, t): M
us assume
gives
and
=1
that
we
have
a
t: M x
R for W.
R. The fact that
-2f g induces a particular Killing vector field of the rescaled metric e the one-form (6.93) M'. this work use To we out, geometrical structure on the spacetime write to function the of dt and the differential global timing
W is
a
metric in the form g
=
e
2f
e-2f g + 0 0 0
_
(0
-
dt +
dt)
0
(0
-
dt +
dt))
(6.95)
142
6.
Ray-optical
structures
on
Lorentzian manifolds
which is
a trivial identity. Clearly, the symmetric second rank tensor field e-2f g + 0 0 0 satisfies (e -2f g+000)(W, -) 0 and Lw(e -2f 9+000) 01 and it is positive definite on the orthocomplement of W. Hence, it must be the pull-back of a (positive definite) Riemannian metric on =
e-2f g
+
0
0
0
=
=
7r*.
(6.96)
dt satisfies Similarly, the one-form dt) (W) 0 and Lw it be the of must one-form a Hence, pull-back 0 on M,
dt)
=
=
0.
A
0 With
(6.96)
and
(6.97) g
=
e
dt
inserted into
(7r*
e2f
The conformal factor
-
2f has
(-7r*
-
no
=
-7r* .
(6.95), +
the metric g takes the form
dt)
influence
(6.97)
0
on
(-7r* the
+
dt))
(6.98)
.
vacuum
light
rays.
Thus,
(6.98) suggests that the metric and the one-form are the relevant geometrical objects that determine the reduced ray-optical structures This
9"..
is indeed the
case as we
depend cording to (6.73), on
shall
see
below. But first
the choice of the
transforms like
gauge
want to check if
and
we change acglobal timing obviously unaffected whereas the one-form potential,
the metric a
we
function t. If
t
is
(6.99) Thus, the two-form 4
=
d
(6.100)
independent of which global timing function has been chosen. The geometmeaning of c is that it measures the rotation (=twist) of the integral curves of the time-like vector field W. Vanishing of ca characterizes the lo0 is equivalent to W being locally cally static case, i.e., the equation c j 0 is hypersurface-orthogonal. If M' is simply connected, the equation c even equivalent to W being globally hypersurface-orthogonal. Let us quickly prove the second statement which implies, of course, the first one. On a sim0 guarantees the existence of a ply connected manifold the equation d A. (We have used this well-known fact already in function h such that the proof of Proposition 5.5.2.) Thus, a gauge transformation (6.99) with this function h leads to 0. Together with (6.97) this shows that, for M simply 0 is equivalent to the existence of a global timconnected, the equation c 2f dt'. 0 and thus, by (6.93), g(W, ) -e ing function t' such that 0 dt' Clearly, the latter equation characterizes the case that W is orthogonal to the hypersurfaces t' const., i.e., it characterizes the globally static case. As a preparation for the reduction, we now use the representation (6.98) of the spacetime metric g to write the dispersion relation for vacuum light is
rical
=
=
=
=
=
=
-
=
6.6
Stationary
rays in terms of the
spatial
(xl,..., x')
local coordinates
ray
optics in
metric on
and in
vacuum
and the
M with x1
simple
media
143
spatial one-form 0. If we use i9lax' W, (6.98) takes
t and
=
the form
dXb
gab dXa g
(
e2f
p.
dxP 0 dx'
(6.101)
( , dx' + dt) 0 ( p dxP + dt)).
-
following greek indices are running from 1 to n 1 p, and xn- 1) whereas f depends on (xl, xr). With the covaridepend on (xl, ant metric components gab given by (6.101) it is an easy exercise to calculate the contravariant metric components gab This puts the dispersion relation 9ab Pa Pb 0, by which the vacuum ray-optical structure is determined, into Here and in the
-
-
.
.
.
.
,
.
.
,
.
=
the form
(Pp
VC
-
p) (Pa
Pn
-
Pn
a)
2
P'
have introduced the contravariant components ja which are defined by ,,p P`r Here
we
=
We
are now
according
of the metric
g^
/V
to construct the reduced
ready
(6.102)
Pn =0.
_
ray-optical
structures
first check if all the assumptions of this at hand. The set Q,,,, is non-empty for all real
to Theorem 6.5.1. Let
us
are satisfied in the case numbers w,, :A 0. The transversality condition (i) of Theorem 6.5.1 is satisfied if and only if the one-form dt is nowhere g-light-like whereas the transversality condition (ii) is always satisfied. Thus, we have to assume that the global
theorem
timing function has been chosen in such a way that the hypersurfaces t const. are either everywhere space-like or everywhere time-like with respect to g. Then Theorem 6.5.1 gives us a reduced ray-optical structure 9,,,, for all real numbers w,, 0 0. Please note that the left-hand side of (6.102) gives us a Hamiltonian H for JV that is independent of the coordinate Xn As in (6.78) we get a Hamiltonian ft for the reduced ray-optical structure 9". simply by =
.
setting
p,
equal
to -w,,
ft(X1,..., Xn-1,P1,..., pn- 1)
(
2
where the factor 2wo
ta
was
(PI,
With the Hamiltonian
ray-optical
-
2),
WO
0-
structure reads
t) (PC + WO &)
+ U)O
(6.103), ,_
wo jz) (PC + WO a)
introduced for later convenience. Thus, the disper-
sion relation of the reduced
ga (P,4
+
(6.103)
Hamilton's
=1 WO
ap
(P
P
_
W2 0
=
0
equations take the
+ WO
4)
1
(6.104)
.
form
(6.105)
144
6.
Ray-optical
structures
I
on
aPA
2 w,, i9xO'
VA
Lorentzian manifolds
(PP
a0p axo-
P) (P
+ W'
(PA
+
(6.106)
1.1)
+ WO
-
Together with (6.104), the equations (6.105) and (6.106) determine the lifted rays of 9,,. in the special parametrization adapted to H. (6.105) can be solved for the momentum coordinates; upon inserting the result into (6.104) and (6.106), respectively, we find
,,, ;.r..
, A + Here
we
=
AP
1
AP
(,OP.
9XA
(6.107)
,
*
04
( OX0,
have introduced the Christoffel
2
I
09XP
):t,
.
(6.108)
symbols
+
ap,\ -
g:_-_,%)
'9XP
jqXr.
(6.109)
. (6.107) and (6.108) determine the rays of 9". in the parametadapted to ft. Rom (6.107) we read that this is the parametrization d 0, the right-hand side of by -arc length. In the locally static case, 4 the -geodesics. If ( does not vanish, (6.108) vanishes and the rays are exactly in the the rays deviate from -geodesics response to the "force ter&' on the has the same formal structure This force term side of (6.108). right-hand exerted the Lorentz a on as charged particle by a magnetostatic field. force In this analogy, the two-form c. corresponds to the magnetic field strength and the one-form corresponds to the magnetic potential. This is, of course, formal analogy. In the situation at hand c has nothing to do with a only a real magnetic field. It is a purely kinematical quantity measuring the rotation ( twist) of the integral curves of W. Physically, the right-hand side of (6.108) can be viewed as a CoTiolis force. Since enters into (6.108) only in terms of 4 d , the rays of JV,,. are gauge invariant although the lifted rays are not. We know already from our discussion following Theorem 6.5.1 that this is a general feature of the reduction formalism. Moreover, as neither (6.107) nor (6.108) involves the for WO E R \ f01, parameter w,,, all the reduced ray-optical structures 6.5.2 since the observation This the same Proposition exemplifies give rays. vacuum ray-optical structure is dilation-invariant and reversible. We shall now derive an expression for the optical path length, which was
of the metric rization
=
=
=
=
and the introduced in Definition 6.5.3, in terms of the Riemannian metric one-form . (6.104) and (6-105) 4etermine the lifted virtual rays of 9". in the
parametrization adapted
P'-'+,
=
to H. These
W.
equations imply
1A (V/9_^P11,1--tP
-
4 X-P)
-
(6.110)
6.6
Stationary
ray
optics in
vacuum
and in
simple
media
145
We can now free ourselves from the particular parametrization. Clearly, an orientation-preserving reparametrization leaves (6.110) unchanged whereas an orientation-reversing reparametrization requires replacing the positive square-root with the negative square-root. Since 0 p, dx'7, (6.110) gives us the integrand of the optical path length (6.84). If we switch back to invariant )9"1" notation, the optical path length of a lifted virtual ray : [Sl) 821 =
takes the form
JS2 where can
: [S 11 S21 A 4 is (6.98) that
the
(s) ds projection of
(6.111)
As A is
to
light-like,
we
read from
dt(A)
(6.112)
.
Comparison with (6-93) and (6.97) shows that the positive square-root in (6.112) corresponds to the case that the parametrization of '\ is futureoriented with respect to W, i.e., g(W, ) < 0. Inserting (6.112) into (6.111) is equal demonstrates that, in the case at hand, the optical path length of to the travel time with respect to the global timing function used for the reduction. The same result follows from Proposition 6.5.3, using the fact that the vacuum ray-optical structure is dilation-invariant. (6.111) clearly shows that, up to an orientation-depending sign ambiguity, the optical path length of is determined by its projection k We have already mentioned that this is true whenever the reduced ray-optical structure is strongly hyperregular. In the case at hand 9,,. has the additional property that every COI curve in M is a virtual ray. For this reason the optical pa h length can be viewed as a functional on the set of all CO, curves X in M, given by the right-hand side of (6.111). (6.111) again exemplifies the gauge dependence of the optical path length. In the globally static case we can choose the global timing function in such a way that 0 vanishes. In this distinguished gauge the optical path length coincides with the -arc length. In the stationary but non-static case, however, the gauge freedom in the definition of the optical path length cannot be A
removed.
established all the relevant equations of the reduction formalism for stationary vacuum ray-optical structures. Examples will be given and the gauge-dependent one-form in Chap. 8 below, where the metric We have
are
now
calculated for several
relevance to
physics.
Carter and Lasota For
For
[3]
(conformally) stationary spacetimes (M, g)
examples
of this kind
and to Perlick
light propagation ray-optical
also refer to
structure
with
Abramowicz,
[109]. general, particular, determined by a
in matter, the reduction formalism
rather different features in comparison to the the reduced
we
R,,,,
is,
in
vacuum
general,
has,
case.
not
In
in
146
Ray-optical
6.
structures
on
Lorentzian manifolds
Riemannian metric and
by a one-form which is unique up to gauge transforHowever, there is a special class of (non-dispersive) media to which our vacuum results immediately carry over, viz., media characterized by a Lorentzian "optical metric". So let us consider a ray-optical structure Ar on (M, g) which is of the kind given in Example 5. 1. 1, i.e., let us assume that mations.
Ar
C
T*M is the null
cone
bundle of
a
Lorentzian metric N. If g,, is confor-
mally equivalent to 9, then Ar is the vacuum ray-optical structure on (M, g), otherwise Ar gives light propagation in a medium. We assume that JV is stationary, i.e., we assume that there is a vector field W which is time-like with respect to g and a conformal Killing field with respect to g,,. If W is time-like with respect to g, as well, and if we can find a global timing function for W, then we can carry through the reduction procedure in analogy to the vacuum case. Now it is, of course, the optical metric g,, that is decomposed in the form (6.98). Hence, vanishing of the induced one-form implies that const. with respect to the optiW is orthogonal to the hypersurfaces t cal metric g,, and does not characterize the globally static case. Similarly, const. it is now the metric g,, with respect to which the hypersurfaces t have to be non-light-like in order to guarantee that the assumptions of Theorem 6.5.1 are satisfied for all w,, 0 0. In this situation the rays of the reduced ray-optical structure are, again, determined by equations of the form (6.107) and (6.108), and the optical path length is, again, representable in the form (6.111). Explicit examples of this kind will be given in Chap. 8 below. One of the most interesting aspects of the reduction formalism for stationary ray-optical structures is that it provides a link between our general relativistic Lorentzian geometry setting of ray optics and the ordinary Euclidean geometry setting of elementary textbook ray optics. Roughly Speaking, ray optics in media, as it is treated in elementary optics textbooks, can be viewed as the result of our reduction process applied to an appropriate ray-optical =
=
structure
Minkowski space. If (M, g) is n-dimensional Minkowski space, Xn- 1, Xn t) to identify M pseudo-Cartesian coordinates (xl,.
on
we can use
=
with Rn and to put g into the form g
Here,
as
=
6pcr dxP
before, the summation
from 1 to
n
-
1. This induces
a
0 dxo'
-
dt (9 dt
greek indices running P1.... p.) globally coordinate p,, O(W) gives
convention is used for
natural chart
(xl....
T*M. Up to a minus sign, the momentum the frequency with respect to the inertial system V on
optical
structure
Ar
on
(6.113)
.
M. Now let
us
consider'a
I
X
n
7
=
=
W
=
9/,9t
ray-optical
for any ray-
structure
JV
on
M such that all the matter functions that enter into the dispersion relation t. This implies that W 49/&t E are independent of the time coordinate x' =
=
it implies that JV stationary. Since g(W, ) t gives coordinate x' the time static and
gAr, i.e., globally timing function. For of Theorem 6.5.1
is
-
-dt, Ar
is then
even
distinguished global frequency values w,, E R for which the assumptions satisfied, the reduction formalism gives us a reduced =
all
are
=
a
Stationary
6.6
ray-optical
9,,,.
structure
ray
on
optics
in
vacuum
and in
simple media
147
Euclidean space A _2` R". The physically 4. Any sort of medium treated in elementary
interesting case is, of course, n optics textbooks can be modeled in ray-optical structures on M' c---- R3. =
Here is
ray-optical
a
special example
structure
JV
terms of such
a
one-parameter family of
of this construction. Let
Minkowski space is given
on
6pa ppp,
by
n(xl,..., Xn-1, -P.)2 N2
-
=
us
assume
that the
the equation
(6.114)
0
R+ is a C' function. In the terminology of Sect. 6.3, where n: R'- 1 x R W A( is isotropic with respect to the inertial system V allot. The =
=
gives the index of refraction which is assumed to be independent x" whereas it may depend on the frequency -p,'. of the time coordinate t function
n
=
In this situation the From
by
(6.114)
the
global
we
assumptions of Theorem 6.5.1 ray-optical
read that the reduced
: 0. governed
satisfied for all W,,
structure
9,".
is
Hamiltonian
ft(xl) I
2
This
are
implies that
...
( n(xl,
the rays of
X"_1IP17
7
...
I
Pn- 1)
P, PP P, .
9.,,
Xn- 1,
are
W,,)
the
2
(6.115)
W2 0
geodesics
of the
conformally flat
Riemannian metric
p, on
A4
which
virtual ray of its
projection
(6.116). optics
depends, of
Xf,,o, to
n(. U)0)2 jPO' ,
(6.116)
on the frequency value wo. For a lifted optical path length _T( ) equals the -arclength denotes the Riemannian metric given by where
of course,
the
M,
We have thus rediscovered the standard textbook formulae for ray
in
dispersive isotropic
media.
7. Variational
In this
chapter
principles for
rays
want to characterize the rays of
a ray optical structure principles. In particular, we want to investigate for what kind of ray optical structures some version of Fermat's principle holds true. This question is of interest not only from an abstract theoretical point of view but also in view of applications. Most elementary optics textbooks, such as, e.g., Born and Wolf [16], give a formulation of Fermat's principle for non-dispersive and isotropic media only. However, generalizations to more complicated media are known, see, e.g., Newcomb [1001. If we allow for dispersive and anisotropic media, Fermat's principle in ordinary optics can be phrased in the following way. we
in terms of variational
Fix two points in space and a frequency value w,,. Consider all possibilities to go, along different spatial paths, from one point to the
velocity of light as it is determined, for the frequency W,,, by the medium considered. Among all these "trial curves", the actual light rays are then the local extrema and saddle-points of a certain functional which is called the optical path length. If the medium is non-dispersive it is not necessary to specify the frequency and the optical path length can be reinterpreted as travel time.
other at the
principle can be viewed as a mathematical relegitimate to consider it as a version of Fermat's principle. In the following we establish several variational principles for the rays of a ray-optical structure, and we discuss their relation to Fermat's principle. Whenever
a
variational
formulation of this statement it is
7.1 The
principle of stationary
action: The
In classical mechanics it is usual to define the action
[SI S21 7
--+
T*M
by integration
over
0
V. Perlick: LNPm 61, pp. 149 - 181, 2000 © Springer-Verlag Berlin Heidelberg 2000
=
functional on C1 0, i.e.,
case
curves
the canonical one-form
S2
A( )
general
OC(S)
( (s))
ds
(7.1)
7. Variational
150
principles
for rays
Actually, the action functional can be defined on differentiability class; for the time being, however, From standard textbooks
on
curves we
classical mechanics
of
a more
general
stick to the C'
we
case.
know that the solu-
equations satisfy a principle of stationary action. Theresurprise that the lifted rays of an arbitrary rayon M satisfy a principle of stationary action as well. In com-
tions of Hamilton's
fore,
it should not
come as a
optical structure parison to the situation of classical mechanics there are three modifications. First, for an arbitrary ray-optical structure the existence of a Hamiltonian is guaranteed only locally. Second, lifted rays have to satisfy the dispersion relation, i.e., in mechanical terminology they are restricted to the "energy 0. Third, lifted rays can be reparametrized arbitrarily which is surface" H reflected by the arbitrary stretching factor k in the ray equations (5.9). Mimicking standard techniques of classical mechanics, but taking care of these three modifications, we find the following principle of stationary action for lifted rays of an arbitrary ray-optical structure, cf. Figure 7.1. =
(Principle of stationary action for lifted rays) Let arbitrary ray-optical structure on M and fix a C' immersion consider all C' maps JV. As allowed variations of ) [SI -921 the tangent vectors which ) M withq(O, -) X for S21 [S17 EO[ 1 -607 the curves q ( s 1) and q ( 82) are annihilated by the canonical one-form
Theorem 7.1.1. be
an
1
77: to
=
-
-
-
,
following
0. Then the
(a) If
is
a
lifted
ray
holds true.
of JV,
is
then
a
stationary point of A,
d
TEA (,q
(7.2)
0
for all allowed variations q of . (b) Conversely, if (7-2) is true at least for all allowed variations q of with sj) and 77(', S2) constant, then is fixed endpoints in Ar, i. e., with a lifted ray. Proof. Let
q be
an
tional vector field
allowed variation of
by
Y:
[Sl) S21
and denote the
pertaining
varia-
TA(, i.e.,
)
Y(s) =,q(., SY(O) Then the variational derivative of the action
(7.3)
-
can
be calculated in the
following
way, using standard derivative rules. d
Te
LO
A (77
152 ((dO) ( ,) (Y(s), (s))
+ -4ds
(0 (,) (Y(s)))
ds
S
(Y (s), (s)) ds
J52 S1
+
0 (-92)(Y(82))
( (s), Y(s)) ds
-
.
0 (sj)(Y(Sl)) (7.4)
7.1 The
Fig.
principle
of
7.1. An allowed variation, in the
stationary
sense
action: The
general
case
of Theorem 7.1.1, must be
151
completely
contained in Ar.
In the last step we have used the fact that Y(sj) and Y(82) are annihilated by 0 because 71 is an allowed variation. Since our allowed variations stay within X, Y must be everywhere tangent to Ar. Thus, we can read from
(7.1) that part (a) of the proposition follows directly from Definition 5.1.2. To prove part (b) we assume that the last integral in (7.1) vanishes for all that are everywhere tangent to Ar and vanish at C' vector fields Y along the endpoints. Hence the fundamental lemma of variational calculus implies must be a lifted ray according to Definition 5.1.2. Alternative proof. For those readers who feel more comfortable with tradican be tional index notation we give an alternative prooL We assume that covered by the domain of a Hamiltonian and of a natural chart, and we write at e 5 for the derivative with respect to 0, as in (5.46), (5.47) and (5.48).
that
,
Then
(7.1)
=
takes the form 82
JA
=
6
P"
(s).,ka(s)ds 82
2
JPa(S) :ta(,) ds
132 6pa (S) Pa (S2) 6xa (82)
-
Pa (81)
Pa (S) jta (s) ds
+
(7.5)
ba (s) ds +
5xa(sl)
-
f52 Pa(s)
jxa (s) ds
.q 491
z
'92
bPa(s) ia (s) ds 1
-
182 49
ba (S) jxa (s) ds
152
7. Variational
principles
Since all allowed variations constraint
equation aH
igXa
for rays
confined to
are
H(x(s),p(s))
(X(S), 19(s)) 6x,,(s)
Ar, they
have to preserve the
0, i.e.,
=
'H +
(x(s),P(s)) Jp"(S)
ON
=
To prove part (a) we recall that lifted rays satisfy (5.11) and inserting these equations into (7.1) the desired result follows
with
(7.6).
Lagrange
To prove part (b) we insert multiplier k(s). This results in
JA
=
1,92 Jpa(S) (&a(,)
-
(7.6)
k(s)
'9H
ap.
into
(7.1)
0.
(5.12). Upon immediately help of a
with the
(x(s), p(s))
ds
-
(7.7)
92
jxa(s) (P,, (s)
+ k (s)
(7.6)
ax"
(x(s), p(s)))
ds
By assumption, the right hand side of (7.7) is equal to zero for all smooth Jp,, and bxa that vanish at s, and S2. (The Lagrange multiplier k(s) allows to forget about (7.6).) Hence, the fundamental lemma of variational calculus El implies that the ray equations (5.11) and (5.12) have to be satisfied. Theorem 7.1.1 d
Te
A(77(e:, -)) 1,,=o
and
77(', S2)
are
in particular, that is a lifted ray if and only if 0 for all allowed variations for which the curves n(-, si)
implies,
=
vertical, i.e., for variations that keep the endpoints fixed
in
M. As
A(x)
obviously invariant under orientation-preserving reparametrizaequally well allow for variations that change the parameter interval [S1, S21. However, such a generalization of Theorem 7.1.1 will not be needed in the following. We emphasize that it is not justified to use the name "principle of minimal action", rather than "principle of stationary action ', for Theorem 7.1.1. In general, a lifted ray can be a local minimum, a local maximum, or a saddlepoint of the action functional. This is true even if we restrict to arbitrarily
tions of
,
is
we
could
short rays. The advantage of Theorem 7.1.1 is in the fact that it holds for
arbitrary ray-optical structures. In particular, no regularity assumption is needed. Its disadvantage is in the fact that a very big set of allowed variations is used. Apart from the boundary conditions, all curves in Ar, i.e., all curves for which the momenta satisfy the dispersion relation, are to be considered as "trial curves" q(E, -). This includes curves with arbitrary velocities and not only motions at the velocity of light. For this reason Theorem 7.1.1 cannot be viewed as a version of Fermat's principle as it was stated in the beginning of this chapter. If we restrict the set of trial curves to motions at the velocity of light, in general we will have not enough allowed variations to prove an analogue of part (b) of Theorem 7.1.1. (There is, of course, no problem with
7.1 The
principle
of
stationary action: The general
part (a).) In Sect. 7.2 we shall see that there if Ar is strongly (hyper-) regular. Theorem 7.1.1 has several
ample, we Huyghens notions of
interesting
are
153
case
enough allowed variations
still
consequences. To
give just
one ex-
now show that part (a) of Theorem 7.1.1 leads to the familiar construction of (lifted) wave surfaces. Here we have to recall the
generalized solutions
of the eikonal
equation and of lifted
wave
surfaces from Sect. 5.5. Let Ar be
(Huyghens construction)
a ray-optical structhe Euler vector field. of flow on everywhere Let 45 be the flow of the distinguished characteristic vector fleld X on Ar which 1. Fix a generalized solution L of the is- determined by the equation OAr(X) eikonal equation of M and a lifted wave surface S of L. Let s E R be such that the set -P, (9) is non-empty and, thus, again a lifted wave surface of C. Then 4i,(S) can be constructed as the envelope of the surfaces P,(Arq) where -rJL (u) q ranges over all points in M that can be represented in the fo nn q
Theorem 7.1.2.
transverse to the
M which is
ture
=
=
with
some u
E
Ar denote the maximal integral Proof. Choose any u E S and let : I u. of the distinguished characteristic vector field X on./V with (O) We have to prove that at the point (s) the (n I)-dimensional surfaces =
curve
-
(P,(S)
and
P,(Arq)
tangent
are
to each other where q
consider all C1 maps 77: ] -eo 7 eo [ x [0, s] the varied curves q(e, ) are lifted rays with we
-
)
=
Ar with
q(e, 0)
E
-r 4(u).
To that end
such that 77(0, ) Arq and 77(,,, s) E lps (S) =
-
By Theorem 7. 1.1 (a), any such 77 satisfies the condition d This implies that for any such 77 the vector 71( s)* (0) is 1,0. 76- A(77(e, tangent to P, (A(,). Since all elements of T (,) (C (S)) can be represented in 11 this form, 4i,,(S) and P,(AQ must be tangent to each other at (s). for all
6
E
E0, -0 [.
M
M
a)Example
b)Example
5.1.2
5.1.5
Fig. 7.2. Theorem 7.1.2 leads to the familiar Huyghens construction in M according to which wave surfaces are the envelopes of "elementary waves". Please note that transversality of the ray-optical structure to the flow of the Euler vector field is necessary. This is the case, e.g., for Example 5.1.2, where the "elementaxy waves" waves" axe spheres. are hyperboloids, and for Example 5.1.5, where the "elementary
Upon projection according
struction
to
M, Theorem
to which
the
7.1.2
wave
gives the familiar Huyghens
surfaces
(projections
of
4i,(S))
con-
are
154
7. Variational
principles
for rays
envelopes of "elementary waves" (projections of 4i,(A(q)), see Figure 7.2. Clearly, the projections to M of (P,(S) and P,(Arq) are smooth submanifolds only if %(S)) and %(Afq) are transverse to the fibers. It is to be emphasized that Theorem 7.1.2 has to presuppose a ray-optical
the
structure
Ar that is
otherwise
a
transverse to the flow of the Euler vector field since
characteristic vector field X with
0,\r(X)
=
1 does not exist. In
particular, Theorem 7.1.2 does not apply to ray-optical structures that give light propagation in a non-dispersive medium on a spacetime. For this reason Theorem 7.1.2 has more relevance for ray-optical structures on space rather than on spacetime.
principle of stationary action: strongly regular case
7.2 The
The
already stressed that Theorem 7.1.1 cannot be viewed as a verprinciple since the trial curves are not restricted to motions at the velocity of light. What we want to have is a theorem, analogous to Theorem 7.1.1, where only lifted virtual rays are considered as trial curves rather than arbitrary curves in M. (Please recall that lifted virtual rays are defined through Definition 5.2.4.) We are now going to show that such a theorem holds true for strongly regular ray-optical structures. Contrary to Theorem 7.1.1 we restrict to variations with fixed end-points in M.
We have
sion of Fermat's
Theorem 7.2. 1. Let X be
a
strongly regular ray-optical
structure
on
M and
) Ar virtual ray : [81 S21 of JV. As allowed variations of A for ) M with 77(0, -) we consider all C1 maps q: EO 7 EO [ X [S1 7 S21 which the curves 77 (,-, ) are lifted virtual rays for all e E ] eo 7 -o [ and the curves si) and q( 82) are vertical. Then is a lifted ray if and only if
fix
lifted
a
=
-
-
-
d
-j,6
A (,q (e,
0
=
=O
for all allowed
variations q
of
Proof. Since allowed variations in the sense of this theorem are, ular, allowed variations in the sense of Theorem 7.1.1, the "only
in
partic-
if' part is a trivial consequence of part (a) of Theorem 7.1.1. To prove the "if' part, with Z(si) = 0 and ) TA( be let Z: [Sli S21 any CIO vector field along
Z(S2) IL(O, -
0. Then =
that Z is
we can
find
a
I
C1 map p:
-
60 7 EO
X
[S1 S21
=
=
=
general the
curves
)
i
(S2) for all e E , p(s, si) (si) and A(E, S2) the pertaining variational vector field, i.e., Z(s)
X with
-0, Eo [ such -
,
s)'(0)
-
In
p(6, -)
allowed variation of
will not be lifted virtual rays, so ti will not be an in the sense of this theorem. Therefore we consider the
M of p to M which gives a vari-r;4 op: I -EO EO [ X [817 S21 with fixed end-points. Now we have to recall that strong r;, regularity guarantees local solvability of (5.10) and (5.11) for the momenta and for the factor k. By compactness of the interval [sl', S21 this guarantees Ar existence and uniqueness of an allowed variation q: I 150 160 [ X [SI S21
projection
r.
ation of A
==
=
7
)
o
-
7
)
7.2 The
principle
of
stationary action: The strongly regular
case
155
sufficiently small, that projects onto n. We denote the variational by Y as in (7.3). Now we have two variations p and 77 of C both of which project onto n. Hence, the difference between the pertaining variational vector fields Z and Y must be vertical. Since q is, in particular, an allowed variation in the sease of Theorem 7.1.1, (7.1) holds true. By hypothesis, 0 -)) 1..-0. Thus, the last integral in (7.1) has to vanish. de TJV is vertical and C is a lifted virtual ray, this Z : [811 821 Since Y integral still vanishes if Y is replaced with Z. But Z was an arbitrary C' vector field along C tangent to Ar that vanishes at both endpoints. Hence, as in the proof of Theorem 7. 1. 1, the fundamental lemma of variational calculus 1:1 implies that C must be a lifted ray. of
, for
eo
vector field of 77
=
-
If Ar is not
choose
an
only strongly regular
orientation for
X and
but
even
strongly hyperregular,
construct
we can
a
we can
one-to-one relation be-
positively oriented lifted virtual rays and positively oriented virtual rays according to Proposition 5.2.4. In this situation the action functional, which is defined by (7.1) on curves in T*M, gives a well-defined functional, A on the set of all positively oriented virtual rays A via
tween
A(A) Here
is the
C
A and
A(C)
=
(7.8)
A(C).
unique positively oriented lifted virtual ray that projects onto by (7.1). Therefore, in the strongly hyperregular case
is defined
Theorem 7.2.1
can
be reformulated
variational
as a
following
than for lifted rays, in the
principle
for rays, rather
way.
stationary action for rays) Let M be strongly hyperregular and thus oria ray-optical structure on and M entable. Choose an orientation for fix a positively oriented virtual variations of A all C' maps allowed Consider ) as M. ray A: [S I S21
Theorem 7.2.2.
(Principle
of
M which is
i
A such that r.(e, si) A(si), r,(O, ) virtual are curves r.(E, 60, -0 [. rays for all E E ] ) d 0 for all allowed variations Then A is a ray if and only if de A (r. (E, )) I e=o A is A. Here n of defined through (7.8).
] EO) EO [ X [S 1) S21 r.(e, S2) A (S2), and the K:
)
-
M with
-
=
-
=
-
=
proof follows immediately from Theorem 7.2.1 and Proposition 5.2.4. If M is to be interpreted as space (and not as spacetime), Theorem 7.2.1 and Theorem 7.2.2 can be viewed as versions of Fermat's principle. To put this rigorously, we consider a stationary ray-optical structure and we assume that, for some value w,, E R, all the assumptions of Theorem 6.5.1 are satisfied such that the reduction can be carried through. We can then apply Theorem 7.2.1 to the reduced ray-optical structure 9,,., provided that 9,,. is strongly regular. (Criteria for 9,,. to be strongly regular are given in Propositions 6.5.4 and 6.5.5.) The action functional A of Theorem 7.2.1 equals the optical path The
length _T
frequency factor W,. If we 0, varying A is equivalent to varying 1.
of Definition 6.5.3 up to the constant
exclude the
pathological
case
Thus, Theorem 7.2.1 tells
us
w,,
=
that,
among all lifted virtual rays between two
156
7. Variational
principles
for rays
fixed points in space M, the lifted rays are characterized by making the oppath length IT stationary. This can be viewed as a version of Fermat's
tical
principle.
9,.
If
is
even
strongly hyperregular, Theorem
7.2.2
gives
a more
familiar reformulation of this result in terms of rays rather than in terms of lifted rays. In the non-dispersive case the optical path length can be reinter-
preted the
are
travel time, according to Proposition 6.5.3, and the rays of JV.,, for all positive values of w, according to Proposition 6.5.2.
as a
same
Viewed in this sense, Theorem 7.2.2 covers principle for stationary situations.
virtually all known versions Considering stationary vacuum ray-optical structures, as in the first part of Sect. 6.6, reproduces Fermat's principle for vacuum light propagation on (conformally) stationary Lorentzian manifolds as it is given in many textbooks on general relativity, see, e.g., Landau and Lifshitz [76] or, for the static case, Frankel [43] or Straumann [1361. Considering stationary ray-optical structures on Minkowski space, as in the last part of Sect. 6.6, reproduces all elementary textbook versions of Fermat's principle in ordinary optics. Theorem 7.2.1 and 7.2.2 also apply to some (necessarily dispersive) rayoptical structures on spacetimes, e.g., to those of Example 5.1.2 giving light propagation in a non-magnetized plasma on a general-relativistic spacetime. In those cases, however, they cannot be interpreted as versions of Fermat's principle since the trial curves have fixed endpoints not only in space but even in spacetime. of Fermat's
7.3 Fermat's Now can
we
principle light rays in an arbitrary general-relativistic medium by a version of Fermat's principle. Throughout this
want to ask if
be characterized
section
Chap.
(M, g)
we presuppose a Lorentzian manifold (M, g) with dim(M) > 2, as in 6. Our physical interpretation refers to the case dim(M) = 4 where
be viewed
general-relativistic spacetime. For an arbitrary M, the results of Sect.s 7.1 and 7.2 can then be summarized in the following way. In any case, the lifted rays of A( are characterized by the variational principle of Theorem 7.1.1. This, however, cannot be viewed as a version of Fermat's principle because the space of trial curves is too big, as outlined above. If A( is strongly regular, which by Corollary 5.4.1 can hold only if A( describes light propagation in a dispersive medium, the lifted rays of A( are characterized by the variational principle of Theorem 7.2.1. This, however, cannot be interpreted as a version of Fermat's principle either because the end-points of the trial curves are fixed in spacetime rather than in space. The results of the preceding sections give a version of Fermat's principle only in the very special case that Ar is stationary. More precisely, we have can
ray-optical
to
assume
structure
as
JV
a
on
in addition that all the conditions of the reduction theorem
of Theorem
6.5.1)
are
satisfied for
some
w,, E R
\ 101
(i.e.,
and that the reduced
7.3 Fermat's
9,,.
principle
157
strongly regular. Then Theorem 7.2.1 applied to principle for the lifted rays with frequency constant w,, in the medium considered. If 9,,. is even strongly hyperregular, this result can be reformulated as a variational principle for rays, rather than for lifted rays, according to Theorem 7.2.2. If JV is a non-stationary ray-optical structure on (M, g), our previous results do not give us a version of Fermat's principle for its rays or lifted rays. Therefore we have to formulate another variational principle which needs some prepaxation. The first step is to define the space of trial curves. According to the elementary formulation given at the beginning of Chap. 7 Fermat's principle requires to consider motions "at the velocity of light". In our setting this translates into considering (lifted) virtual rays of Ar. For
ray-optical
9,,.
gives
structure
us a
convenience
is
version of Fermat's
we
(lifted)
shall restrict to
fixed parameter interval [0, 1] Then we have to impose
virtual rays which
are
defined
on
the
-
boundary
conditions
by fixing
"two points in
space". The appropriate way to translate this into a spacetime setting is to fix a point q and a time-like curve -f in spacetime M, and to restrict to virtual "a point in rays \ that start at q and terminate on -f. q can be interpreted as in "a be point interpreted as space viewed space at a particular time"; -y can float to the along a time-like endpoint over some time interval". Allowing to Fermat's principle requires vary the arrival time. curve is necessary since between a point and a for Further physical motivation considering light rays whenwe below are 8.4 Sect. in will be going to discuss time-like curve given gravitational lensing. Finally, as we want to include dispersive media, it is necessary to "fix the frequency". This requires to choose an observer field. More precisely, what rather we shall need is not exactly a time-like vector field on all of M but the introduce We to from virtual each field vector time-like ^J. q a along ray 7.3. see Figure following definition, Definition 7.3.1. Let -y be a time-like curve in the Lorentzian manifold (M, g). A generalized observer field for -y is a map W that assigns to each virtual ray A that terminates on -y a time-like COO vector field Wx along ,\ that coincides with
A(1) W,\ : [0, 11 and W,\(1) nates at
=
-y(T(A)),
=
If W is
with
TM satisfies the
-
end-point.
L e.,
if
A
:
[0, 1]
M terrni-
)
T(A) denoting some parameter value, then conditions VV,\(s) E T;,\(,)M for all s E [0, 1]
(T(X)). the notion of observer fields in the
following ordinary observer field, i.e., a time-like C' vector field on W o A ) an integral curve of W, then the assignment A
This definition sense.
at the
generalizes
an
M, and if -y is gives a generalized observer field
l
for -y. Whereas observer fields do not exist which manifolds are not time-orientable, generalized observer on Lorentzian time-like curve y. E.g., we may define W,\(S) by for exist fields always any
parallely transporting
the vector
(T(A))
along
A
158
7. Variational
principles for
rays
(1)
T(A))
Al
Fig.
7.3. A
field for -y,
generalized observer
to each virtual ray A that terminates
on
-j
a
as defined in Definition 7.3.1, assigns time-like vector field W,\ along X
Upon choosing a generalized observer field for -/ we may assign to each point (s) of a lifted ray by the equation
a
frequency
w(s)
W
(s)
=
-
Ws)) (Wx (s))
(7.9)
,
provided that A r, , o terminates on -y. As, in general, the frequency does not satisfy a conservation law along rays, the right way to "fix the frequency" is to prescribe a frequency value w,, for the arrival at y and to require that the redshift law for lifted rays is everywhere satisfied. (Please recall our discussion of redshift; in Sect. 6.2.) Collecting all this together, we are led to defining the space of trial curves 9R(A(, q,,y, W, w,,) in the following way. =
Definition 7.3.2. Let M be
manifold (M, g).
rentzian
(1) (2)
a a
point q E M ; C' embedding =
(3) (4)
a
W-Y(.)
-f: I
we
arbitrary ray-optical
structure
on
the Lo-
M
from
W
for -y;
a
real interval I into M such that
;
generalized observer field
a non-zero
Then
an
Fix
real number w,,
-
define the space of trial curves 9R(A(, q, -y, W, w,,) ) T*M such that : [0, 11
C' immersions
is a lifted virtual ray of JV; (a) (b) (O) E Tq*M; (c) there is a T(C) E I such that C(l) -wo; (d) ( (1)) ( (T(6))) =
E
T*Y(T(e)) M
as
the set
of
all
7.3 Fermat's
(e)
principle
159
satisfies the redshift law of lifted rays with respect to the generalized 0 for all C' vector fields field W, i.e., S? (,) ( (s), Q (s)) o with TA( ) along Q: [0, 1] T-r 4 Q parallel to W,;,. along -r) o recall please (6.18). observer
=
In terms of a natural chart and that the
a
representation (x(s),p(s)) (e), the equation
local of
Hamiltonian, condition (a) requires satisfy (5.10) and (5.11). By
has to
condition
k (s) Wa (S)
Wa(s) Pa (S) has to hold with the
same
factor
k(s)
,
axa
(X(s),P(s))
that appears in
(5.11);
(7.10) here
Wa(s)
denotes the components W, ,o (s) In the non-dispersive case, i.e., if JV is dilation-invariant, Proposition 6.5.2 gives a natural one-to-one relation between the spaces 9X(Ar, q, -/, W, w,,) and of
V(M, q, -y, W, c w,,) but also
reversible,
-
for any constant c > 0 If M is not this result carries over to the case -
only dilation-invariant c:
< 0.
In the stationary case, the distinguished time-like vector field W E ON gives us a generalized observer field W : A i ) W o A for each integral curve
by our using the same but related to which different are objects symbol each other in an obvious way.) In this special case conditions (d) and (e) of Definition 7.3.2 are equivalent to saying that the momentum O(W) takes the constant value -w,, along 6. If, in addition, all the assumptions of the reduction theorem (i.e., of Theorem 6.5.1) are satisfied, conditions (a), (d) -y of W.
(We hope
that the reader will not be confused
W for two mathematical
(e)
and
of Definition 7.3.2
imply
that
=
In the
6 is a lifted virtual ray of the non-stationary case, however, there red
o
ray-optical ray-optical structure and the trial curves have to be defined in the rather complicated way of Definition 7.3.2. With the space of trial curves at hand, we are now ready to introduce the reduced is
no
structure
reduced
functional that is to be extremized. Definition 7.3.3. F
:
Under the assumptions
9R(Ar, q, -y, W, w,,)
of Definition 7.3.2, R
-,-LA( ) + T( )
(7.11)
generalized optical path length functional. Here A denotes the action functional (7.1) and T denotes the arrival time functional defined by condition (c) of Definition 7.3.2. is called the
In the
non-dispersive
optical path length Proposition 5.4.7.
ized
case,
i.e., if JV is dilation-invariant, the generalT( ), owing to F( )
reduces to the arrival time,
In the stationary case, with W field W E 9A( and 7 an integral
=
given by the distinguished time-like vector curve of W, the reduction theorem (i.e.,
160
7. Variational
Theorem
6.5.1)
principles for
rays
be used to rewrite the
generalized optical path length in ray-optical structure, provided that all the assumtions of the reduction theorem are satisfied. In that case we find that the generalized optical path length is related to the optical path length of Definition 6.5.3 by F(6) T(red o ) + const. for all E 9R(A(, q, -/, W, w,,), owing to Proposition 6.5.3. This justifies the name "generalized optical path length" for the can
terms of the reduced
=
functional -T. Our
goal is, of course, to prove that the stationary points of the functional exactly the lifted rays in 9X(Ar, q, -y, W, w,,). Our previous results suggest that some kind of regularity condition will be necessary to prove this. The crucial point is the following lemma. .F
are
Lemma 7.3.1. Assume that all the assumptions of Definition 7.3.2 isfied and fix a lifted virtual ray 9A(Ar, q, -y, W, w,,). Assume that
the
are
sat-
along
regularity condition
(Hab) (H a) (Wa) 0 0 (Hb)
det
(Wb)
7
(7.12)
0
0
0
holds in any natural chart and with any local Hamiltonian, where Ha 9H/49paPb, and Wa denotes the components of the vector ,OH/49N, Hab =
field ,q:
]
Consider
W, ,.C. -
for all
eo,,-o[ E E I
[0, 1]
X
-
as )
allowed variations
Ar with
-0, -0 [. Then the
with Z(O) any Cw vector field along such that Tr, 4 allowed variation q of
=
=
o
and
(Y
-
o
Proof. Let
Z:
0. Fix
=
that keeps the with p(0, -)
[0, 1] a
TA( be
variation of
a
in
of all
the set
q(E, ) -
true.
If
E
CO' maps
9X(JV, q, -y, W, w,,) TA( is [0, 11
Z:
)
Z(1) Z) is a multiple of
0 and
Here Y denotes the variational vector
. ,r) (7.3).
Z(1)
of
6 q (0, ) following holds -
=
field of
COO vector field
q which is
along
Ar with variational
0, then there is
with
along defined by
Z(O)
vector field
an
W
=
0 and
equal
to Z
) Ar end-points fixed, i.e., fix a C' map p: ] eo, co [ x [0, 1] (O) and (I). In Z(s), tL(,-,O) , 1L(-,s)'(0) -
=
=
=
=
We shall now use this map tt will not be an allowed variation of that satisfies all the requirements tt for constructing another variation q of of the proposition.
general,
In the first step we give the construction of 77 under the special assumption can be covered by the domain of a natural chart and of a local Hamilto-
that
that the natural chart is induced
by coordinates o . In that case, our assumption (7.12) allows us to solve the system of equations (5.10), (5.11) and (7.10) for pl,...,p,,-,, k,:in,P,, along this central curve. By conti nuity, the same solvability condition is true for curves which are sufficiently close, i.e., for varied curves with sufficiently small variational parameter E. nian.
(Xi
Moreover,
I...
IXn)
on
we assume
M with W'
=
J.' along
the central
curve
r)
7.3 Fermat's
(Here
we
interval
make
[0, 1].)
(7. 10) give
us
equations for
use
of the fact that
defined
our curves are
words, with x1, algebraic equations for pi, In other
.
.
.
,
Xn- I
161
principle on
the compact
known, (5.10), (5.11) and
and first order differential , p,,- 1 xn and p,, which have to be satisfied by the varied curves to -
.
.
e =A 0, the coordinate representation of the curve 11(c, -) satisfy the system of equations (5.10), (5.11) and (7.10). However, I xn- 1 (,-, s) from this representation we can take the coordinates x (e, s), and determine the quantities pi, p,,_ 1, k,. b, Pn, locally uniquely, in such a way that (5.10), (5-11) and (7.10) hold. Together with the boundary valxn (0, 0) and p,,(,-, 1) ues xn (e, 0) -w,, this determines a unique curve for all ) JV near sufficiently small. In this way we get an ): [0, 1] allowed variationq of . Its variational vector field Y satisfies the condition of T-r.; o (Z Y) being parallel to W,;,. since, by construction, the coordinates xn-1 coincide along??(,-, -) and p(e, .). X11 If cannot be covered by a single chart of the kind considered above, this construction must be supplemented by an appropriate matching procedure. By compactness of the interval [0, 1], we can find finitely many intermediary < s,,,-, < s,,, 0 < si < 1, such that, for each 0 _< i 1.
1,
>
a
own
r
> 1 can
right,
for all
be
([0, 11, M)
integers r > 1. replaced by TM. This
construction with M
Hilbert manifold HI ([0,
1], TM) that may be More the tangent precisely, 1], ([0, M). E HI ([0, 1], M) is given, in the sense of
([0, 11, M)
at
a
point A
identification, by
T,\Hr ([0, 1], M) for
same
the proof for
this result and consider Hr
the tangent bundle of Hr
space of Hr a
we use
Hilbert C' manifold in its
Now
=
=
{Z (=- H'([O, 11, TM) I
-rm
o
Z
AI
=
This becomes obvious if the tangent vectors to Hr
(7.23)
QO, 1], M)
are
RN and its tangent map expressed with the help of an embedding j: M ) TRN '-'-R2N. Tj: TM We now state two simple lemmas which axe readily verified with the help ) RN. of an embedding j: M Lemma 7.4.1. For each Hr
Q0, 11, M)
integer r > 2, a C' if and only if its tangent field is in
Hr QO, is
a
11, M)
[0, 1)
M is in
([0, 11, TM)
The map
A:
curve
Hr- 1
)Hr-1([0,11,TM),
)
(7.24)
[0, 11
the evaluation
AI
C' map.
Lemma 7.4.2. For each
integer
> 1
r
and each
E
s
map ev,:
is
a
C' map. Its
T,\
ev,
Furthermore,
H'([O, 1], M)
tangent :
we
map at
a
T,\ Hr Q0, 1], M) shall need the
)M' point A )
E
AI Hr
T,\(")m
(7.25)
)A(s)
Q0, 1], M) ,
Y
1
following important
)
is
given by
Y(s).
result.
(7.26)
168
7. Variational
principles for
rays
Lemma 7.4.3. Consider
a C' map 0: M, M2 between two finite manifolds M1 and M2 and fix an integer r > 1. Then ) Hr 0 o A defines a COO map di: Hr Q0, 11, Ml) T)(A) QO, 11 M2) and the tangent map T P is given by ((TiP),\(Z))(s) (TO).X(,)(Z(s)) for all
dimensional C' =
1
=
X E
HrQ0,1J,M1),
For r
by
the
1, 1, the result
>
r
=
induction
ET,\HrQ0'1],M1),
Z
proof can
was
and
be found in Palais
S
[0,1].
E
[104]
or
first stated in Palais and Smale
in Schwartz
[105]
and
can
[130].
For
be proven
over r.
After these preparations
7.5 A Morse
ray-optical
theory
turn to the
we now
for
problem
we are
interested in.
strongly hyperregular
structures
As indicated above, it is our goal to rephrase Theorem 7.2.2 as a variational principle on a Hilbert manifold. To that end we modify our notion of virtual
First, we replace the C' condition on virtual rays by an appropriate integer r. Second, we choose a global Hamiltonian and use it for fixing the parametrization of each virtual ray. (Since Theorem 7.2.2 presupposes a strongly hyperregular ray-optical structure, it only applies to situations where the existence of a global Hamiltonian is assured.) We are, thus, led to introduce the following definition. rays in two respects. an HI condition, for
If JV is a ray-optical structure on M and H a global ) M for Ar, we denote by 93(H) the set of all maps A: [0, 1] that satisfy the following condition. There is a 6 E H1 QO, 1], Ar) and a c E R+ A and (-r, 4 o ) such that -rL o 6 c FH o .
Definition 7.5.1. Hamiltonian
'
=
=
Lemma 7.4.1 and Lemma 7.4.3
imply that 93(H)
C
H2 QO'
11, M).
It is
virtual ray in the sense of Def93(H) inition 5.1.2. Conversely, any such virtual ray which is defined on a compact parameter interval can be made into an element of 9J(H) by a unique obvious that each C'
reparametrization. It
curve
is thus
A
is
E
justified
a
to say that Definition 7.5.1 translates
earlier notion of virtual rays into an HI-setting. In natural coordinates, elements of QJ(H) are represented by exactly those x E H2 QO, 11, R) for our
which the system of equations
ba(s)
49H =
c
Pa
(X(s)'P(s))
H(x(s),p(s)) admits
a
solution
c
E
R+,
p E
HI QO,
(7.28)
0.
1], Rn).
In the
strongly hyperregular
solution must be unique. We write QJ(H), rather than Z(Ar), to indicate that this set
case
such
=
(7.27)
a
the choice of
a
global Hamiltonian. However,
if H and
ft
are
depends on two global
7.5 A Morse
theory for strongly hyperregular ray-optical
169
ray-optical structure, there is a natural Z(ft) given by relating those curves each other which coincide up to reparametrization. We are now going to show that, in the strongly hyperregular case, 93(H)
Hamiltonians for
and the
one
one-to-one relation between to
structures
carries
a
same
93(H)
and
natural Hilbert manifold structure.
Proposition 7.5.1. Let Ar be a ray-optical structure on M. Assume that H ) TM deis a global Hamiltonian for Ar such that the map UH: Ar x R+ recall its onto that, by image. (Please fined by (5.16) is a C' diffeomorphism Definition 5.2.2, such a global Hamiltonian exists if and only if Ar is strongly hyperregular.) Then T(H) is a C' submanifold of H2([0' 1], M). Proof. We denote the image Of 0`H by C+. Since OH is a C' diffeomorphism onto its image, its differential has maximal rank. Hence, C+ is open in TM. As a consequence Lemma 7.4.1 implies that the set H is
a
2
([0,1],M;C+) ={A
EH
C' submanifold of H 2Q0, X: H
2
2
Q0, 1], M) I
1], M)
-
Now
Q0, 1], M; C+)
)
E
we
H1 Q0, 11, C+)
}
introduce the map
H'
1Q0, 11, R+)
(7.30)
)pr200*H_ O
/\1
(7.29)
R+ denotes the projection onto the second factor. where pr2: X x R+ It follows immediately from Lemma 7.4.1 and Lemma 7.4.3 that X is a C' that 9J(H) X-'(R+) where the set map. This map is defined in such a way R+. By Lemma 7.4.2, with identified R+ is of constant functions from [0, 1] to of the propostatement the Now HI R+ is a COO submanifold of 1], =
R+).
Q0,
able to prove that X is a submersion. To that end we c pick an element \ E T(H) C H 2Q0, 1], M; C+). Then the equation X(A) the of extension continuous R+. tangent map has to hold with some c E By
sition follows if
we are
=
T,\ X : T,\ H2 Q0, 1], M; C+) we
get
a
TcHl Q0, 11, R+)
5---
H1 Q0, 1], R)
(7-31)
map
T,\X where
)
:
T,\H2 ([0, 1], M; C+)
T,\H2 Q0, 1], M; C+)
)
Ho ([0, 1], R)
denotes the closure of
H1 ([0, 1], M). Since pr2 0 OH satisfy the equation
-I
is
homogeneous,
TA_X (f
c
any
,
(7-32)
T,\H Q0, 1], M; C+) f E H1 Q0, 1], R) has
f
This proves that the image of T,\X is dense in H1 Q0, 1], the image of T,\X is a closed linear subspace of H1 Q0, 1],
2
in to
(7.33)
R). On the other hand R). Both observations
170
7. Variational
principles for
rays
together imply that the image of T.\X is all of H1 QO, 1], R), i.e., that X is a submersion at the point A. As this result holds for all A E qj(H), we have
X-1(R+) is a closed H'([O, 1], M).
proven that
T(H)
and, thus,
submanifold of
If the
a
=
T(H)
boundary conditions Proposition
7.5.2. 11:
is
a
of
assumption
result and view
Hilbert C' manifold in its
need the
we
a
closed C'
is either
empty
M
M
x
=
{A
E
To
,
Proposition 7.5.1,
A
this
impose
the map
(A(O), A(1))
(7.34)
1
(7.35)
Z(H) I A(O)
submanifold of Z(H),
or a
we can use
right.
Thus,
Z(H; q, q) M
own
following proposition.
Under the assumptions of
93(H; q) is
11, M; C+) 11
strong hyperregularity is satisfied,
as a
Z(H)
C' submersion.
submanifold of H 200'
closed C'
q
and
93(H) I A(O)
A E
=
=
=
q,
A(1)
=
q'
submanifold of 93(H; q) for
(7.36) any q and
q'
in
-
Proof. By Lemma 7.4-2, H is a C' map. To prove that IT is a submersion, c has to hold with pick any element A E T(H). Then the equation X(A) some c: E R+, where the map X is defined by (7.30). Again by Lemma 7.4.2, the tangent map of IT at the point A is given by we
=
Tx IT: TXZ(H)
C
2
TxH ([0' Y
1
1], M)
TA (0) M
(Y(O), Y(1))
)
X
T'\ (1) M
.
We consider the continuous extension
T,\-H: T,\Z(H)
)
T,\ (0) M
X
T,\ (1) M
T,\17, where T,\93(H) denotes the closure of T,\QJ(H) in T,\Hl QO, 1], M). an arbitrary element of T,\Hl QO, 1], M). So, in particular, Y(O) and Y(1) are arbitrary vectors in T.\(O)M and T,\(,)M, respectively. We want to find a function f E H1 QO, 1], R) such that of
Let Y be
f /\ Since
we
know from the
the function
f
satisfies
proof
(7.37)
T,\-x (Y + f
of
T,\QJ(H)
E
(7.37)
.
Proposition only if
7.5.1 that
Z(H)
=
X-'(R+),
if and
)
=
T,\-x (Y)
+
C
const.
(7.38)
7.5 A Morse
theory
for
strongly hyperregular ray-optical
171
structures
Q0,
For any choice of const. (7.38) has a unique solution f E H1 with 11, f (0) = 0. By integrating (7.38) from 0 to I we see that there is a unique choice for const. such that the corresponding solution f satisfies the boundary condition
f (1)
0. With this function
=
T,\--Iy (Y
f)
+
f
=
we
R)
get
(Y (0), Y (1))
(7.39)
which proves that T,\H is surjective, i.e., that the image of T,\H is dense in T\(o)M x T,\(,)M. On the other hand, the image of T,\11 is a closed subspace of
T,\(O)M
x
T,\(,)M. Thus,
1Y is
To define the action functional sition
5.2.4, for
submersion.
a
on
n
QJ(H), we have to recall that, by Propo-
strongly hyperregular ray-optical
a
structure there is
a one-
to-one relation between virtual rays and lifted virtual rays. Translated into
Hr-setting, this gives
our
Proposition
7.5.3.
following proposition.
Under the assumptions
)H1([0,1],T*M),
F: 93 (H)
is
rise to the
injective and of class C'. Here first factor.
prl:
of Proposition 7.5.1,
the map
)prjOaH-10
Ai
JV x R+
(7.40)
Ar denotes the projection
onto the
Proof. Lemma 7.4.1 and Lemma 7.4.3 guarantee that 35- is a C' map. By Proposition 5.2.4, the restriction of S to 93(H) n COO Q0, 1], M) is injective. 0 must be injective. By continuous extension, It is not difficult to show
that, moreover, represented by the map x
E-3 is
an
immersion. In natural
(x,p) given by solving (7.27) 1], R'). Hence, for every Z E T,\QJ(H) with coordinate representation 6x E H' Q0, 1], R), the coordinate representation (Jx, Jp) E H1 Q0, 1], R2n) of the vector T =(Z) satisfies the system of coordinates SE-F is
(7.28)
and
for
c
l
E
R+ and p
E
)
H1 Q0,
equations
6(b6 with
some
Sc E R. As
our
aH -
c
'OP.
(X)p))(s)
(H(x, p)) (s)
notation
=
suggests,
0
=
(7.41)
0,
(7.42)
,
one
should think of 6
as
of the
derivative with respect to a variational parameter E at the point E 0, and one should calculate the left-hand sides of (7.41) and (7.42) with the help =
of the
ordinary chain rule and product rule. Moreover, 5 and (.)' commute
Lemma 7.4.3.
Hr+1 ([0, 11, RN) We
are
translate
now
(7.8)
This
procedure
to HI
Q0, 1], RN)
ready
to define the action functional A
into
our
is
justified by
since the derivative map from
is linear.
Hr-setting.
on
Q(H), i.e.,
to
172
7. Variational
Proposition
principles
7.5.4. Let all the
Then the action
assumptions of Proposition ) R defined by functional A: QJ(H)
A(A) is
a
is the
QJ(H)
=
0
=
(A)
11 ((E(A)) ( )) (s)
H1 QO, 1], TM
E(A))
-
7.5.1 be
satisfied.
(7.43)
ds
0
composition of the following three
-->
A
is
denotes the map introduced in
C' map. Here
Proof. A
for rays
(D
T*M)
--*
(EF(A)) (A)
Proposition
7.5.3.
maps.
H1 QO, 1], R)
--+
(7.44)
R
f ((E(A)) ( )) (s)
ds
0
Here TM E) T*M denotes the fiber bundle
over
M whose fiber at q
E
M
Tq*M. The first map in this -sequence is a C' map owing to Lemma 7.4.1 and Proposition 7.5.3. The second map is, again, a C' map as
is TM
be
x
by applying Lemma 7.4.3 to the natural pairing map between Finally, the last map in the sequence is obviously linear and, with the help of inequality (7.21), it is easy to check that it is continuous. 1:1 Thus, A is the composition of three C' maps. can
seen
vectors and covectors.
ft
are two global Hamiltonians for a ray-optical structure A( satisfy the assumptions of Proposition 7.5.1, Proposition 7.5.4 gives us a C' action functional on Z(H) and on 93(fl). If we identify Z(H) and QJ(H) in the way outlined above, these two action functionals are easily shown to be related in the following way. From Proposition 5.1.3 we know that H and satisfy, on their common domain of definition, an equation of FH with a nowhere vanishing function F. If F is positive, the the form ft action functionals on Z(H) and 93(fl) coincide. If F is negative, they differ by sign. A can be represented with the help of natural coordinates in the form
If H and
both of which
A(A)
2([0, 1], R')
=
fo
1
p,,
(s) &'(s) ds
.
(7.45)
denotes the coordinate representation of A E QJ(H) H1 QO, 1], R2n ) denotes the coordinate representation of ----(A) in ) p(s) is the notation of Proposition 7.5.3. In other words, the function s 1 determined by solving the system of equations (7.27) and (7.28) for c E R+ and p E HI QO, 1], R). We may use the representation (7.45) even if A cannot be covered by a single chart. We just have to read the integrand as an invariant function that takes the given form locally in any natural chart. Now we view A as a composition of three maps, as in the proof of Proposition 7.5.4, and we apply Lemma 7.4.3. This shows that the Fr6chet differen) R can be derived from (7.45) by calculating, in the tial (dA),\: T,\93(H) Here and
x
E H
(x, p)
e
7.5 A Morse
theory for strongly hyperregular ray-optical
173
structures
usual way of differential calculus, the derivative with respect to a variational E under the integral. Denoting this derivative at e 0 by 9 we get
parameter
=
10
(dA),\ (Z)
1 f
(bp.
M c
0
1
(
_
c
jxa
0
for Z E
(7.42),
aH TXW (X, P)
T,\T(H).
Here
6x E H 2 Q0,
sents T --'--(Z). It is rem
Jx') (s) ds + p,, (1) Jx'(1)
(x, P)
19P.
(Jp. t' + p,, &t') (s) ds
p,,
jXa) (s) ds + Pa (1) jxa (1)
we
have used
1], R) represents
now
-
(7.46)
=
(0) Jx'(0)
-
P(' (0) jXa (0)
(7.27) and (7.42). As in (7.41) and (6x, 6p) E H1 Q0, 1], R 2n) repre-
Z and
easy to prove the desired H'-reformulation of Theo-
7.2.2.
Theorem 7.5.1. Consider the situation
of Proposition 7.5.1. Let Aq,q, defunctional A, defined in Proposition 7.5.4, the Hilbert manifold T(H; q, q) of Proposition 7.5.2. Then '\ E QJ(H; q, q) a my if and only if the Fr6chet differential (dAq,ql),\ is equal to zero.
note the restriction to
is
of the
action
Proof. For \ E QJ(H; q, q'), the differential (dAq,ql),\ is equal to zero if and 0. 0 and Z(I) 0 for all Z E T,\93(H) with Z(O) only if (dA),\(Z) By continuous extension, this is the case if and only if the last line in (7.46) vanishes for all 6x 1], R') w!th_x (0) 0 and 6x (1) 0 that repre=
=
T,\T(H),
sent elements Z E
T,\Hl Q0, 1], M).
=
=
=
where
T,\T(H)
denotes the closure of
T,\T(H)
proof of Proposition 7.5.2 we know that those Jx are of the form 6xI (s) ya (s) + f (S) ta (S), where y is an arbitrary el0 and y(l) ement of H1 Q0, 1], R) with y(O) 0 and f is some function in H1 Q0, 1] , R) with f (0) 0. However, the term proportional to f (1) the tangent field gives no contribution to the last integral in (7.46) as is easily verified with the help of (7.27) and (7.28). Hence, the Fr6chet differential (dAq,ql),\ is zero if and only if the last integral in (7.46) vanishes for all 0. Owing to the fundamental 6x E HI Q0, 1], Rn) with 6x(O) 0 and 6X'(1) lemma of variational calculus, generalized into an Hl-setting by continuous extension, this is the case if and only if in
From the =
=
=
=
=
P"(s)
==
=
IH =
_c
,oXa
(x(s),P(S))
(7.47)
) i.e., if and only if the H' map s (x(s),p(s)) satisfies not only (7.27) and This concludes the proof since, by induction, all the but (7.28) ray equations. H' ) then be must s i an map for all r E N, i.e., it must be a (x(s),p(s))
CI map.
proposition we have achieved our goal of formulating a variaprinciple for rays of a strongly hyperregular ray-optical structure in a
With this tional
11
174
7. Variational
principles
for rays
Hilbert manifold setting. In a nutshell, the rays from q to q' are the critical ) R, i.e., the points of the functional Aq,ql : 93(H; q, q') points where this functional has a local minimum, a local maximum, or a saddle point. These three types of critical points can be distinguished by looking at the second derivative of Aq,ql i.e., at the Hessian of Aq,,, at A,
T,\Z(H; q, q')
Hess,\Aq,ql
(X,\,Y,\)
i
)
x
T,\93(H; q, q')
(7.48)
R
Hess,\Aq,q'(XA,Y,\)=(XYAq,ql),\.
Here X and Y denote any C' vector fields (i.e., derivations) on 93(H; q, q') Y,\, respectively, at the point A. If (dAq,q,),\ 0, the
with values X,\ and
=
Hessian is indeed well-defined
A), and it gives T,\W(H; q, q').
Y at the point Hilbert space
(i.e., a
depends only on the values of symmetric continuous bilinear form it
X and on
the
Clearly, if the Hessian is non-degenerate, it characterizes the critical point following way. Depending on whether the Hessian is positive definite, negative definite, or indefinite, the critical point is a strict local minimum, a strict local maximum, or a saddle point. If the Hessian is degenerate, third or higher order derivatives have to be considered for characterizing the critical point. In this connection it is helpful to recall the following standard terminology. For a symmetric continuous bilinear form (P: Y) x.0' ) R on a Hilbert space the index the maximal b, ind(fl is, by definition, dimension.of a subspace of S5 on which (P is negative definite. The extended index ind,' (fl is, by definition, the maximal dimension of a subspace of .1r) on which P is negative semidefinite. The nullity null(fl is the dimension of the kernel of P. Then in the
ind,,(fl
=
since, by Hilbert be
ind(fl
space algebra, the into two orthogonal
+
null((P)
(7-49)
orthocomplement of the kernel of P can subspaces on which P is positive definite
decomposed negative definite, respectively. For a C' (or, more generally, C2)
and
index of the Hessian at
function
on a
Hilbert
manifold, the
critical point is called the Morse index of the critical point. If the Hessian is non-degenerate at all critical points, the function is called a Morse function. Clearly, the Hessian at a critical point A is non-
degenerate
if and
a
only
if the extended Morse index of A coincides with the
Morse index of A. A strict local minimum has
vanishing Morse index (but necessarily vanishing extended Morse index). Conversely, if the extended Morse index of A vanishes, A is a strict local minimum. not
In terms of natural coordinates the Hessian of can
ing
be calculated from
(7-46)
with
Jx(O)
=
0 and
Aq,q, at 5x(l)
with respect to another variational parameter s'. If 0, we find
derivative at e'
=
=
a
0
we
critical point A by differentiat-
write J' for this
7.5 A Morse
theory
strongly hyperregular ray-optical
for
1
J1
Hess,\Aq,ql (ZI Z) 7
0
In
1
(P-
aH +
OH
J
+
c
0
X T (X, Tx
c
5
P)) (S)
(x,
175
structures
p)) (s) Jx'(s) ds
J'x
(s)
(7.50)
ds'
Z, Z' E T,\93(H; q, q'). Here (Jx, Jp) and (J'x, Yp) denote the coordinate representations of T-'---(Z) and T-'---(Z'), respectively. The first equality in (7.50) holds since A is a ray which implies that (7.47) is satisfied. The second equality in (7.50) holds since the Hessian is symmetric. We shall now give a criterion for Hess,\Aq,ql to be non-degenerate. To that end we use the notion of Jacobi fields which was introduced, for arbitrary rayoptical structures, in Definition 5.6.2.
for
Theorem 7.5.2. a
Let,
in the situation
T(H; q, q') be of Hess.\Aq,ql if and
7.5. 1, \ E
of Theorem
ray and Z E TX93(H; q, q'). Then Z is in the kernel Z is a Jacobi field along A.
only if
7.5.1, we work in natural coordinates following argument is valid independently of whether or not the considered curve can be covered by a single chart. Please recall that the coordinate representation (bx, 6p) of T-'---(Z) has to satisfy (7.41) and (7.42). If Z is a Jacobi field, (5.46), (5.47) and (5.48) must be true. Comparison shows that the function Jk must be equal to the
Proof. As
in the
proof
of Theorem
for notational convenience. The
constant Jc such that
(5.48)
J(Pa
takes the form OH +
c
(X1
A) (8)
=
0
(7.51)
-
that Z is in the kernel of the Hessian. Conversely, Hessian, the last integral in (7.50) vanishes for all Yx that represent elements Z' E 93(H; q, q'). Hence, the last integral in
Now
we can
read from
(7.50)
if Z is in the kernel of the
(7.50)
vanishes for all Yx E H
2QO' 1], Rn)
with
Yx(O)
=
0 and
Yx(l)
=
0.
This follows from the fact that any such Yx is the coordinate representation of some Z' E T,\Z(H; q, q') up to adding a multiple of the tangent field of A
which drops out from (7.50) anyway. The same trick was used already in the proof of Theorem 7.5.1. Here the situation is even more convenient since A is 2 map and not a C' curve such that its tangent field is, in particular, an H
H1
only
an
that
(7.51)
have to
map. Now the fundamental lemma of variational calculus complete the proof that Z is a Jacobi field
has to hold. To
verify
implies we
still
2n).
QO,
a C' map. We know that (Jx, 6p) E H1 1], R 2,, for all R that and (7.41) imply (Jx, 6p) E H'([O, 11,
that Z is
By induction, (7.51) r E N, i.e., that Jx and Jp
)
are,
indeed, C'
13
maps.
terminology of Definition 5.6.3, this proposition implies that A(O) A(1) is conjugate to q Hess,\A,,ql is degenerate if and only if q' Hessian equals the multiplicity of the along A, and that the nullity of the In the
=
=
176
7. Variational
principles for
rays
conjugate point. Please note that in each Jacobi class along a ray A E Z(H) a unique representative J E T,\93(H). It is instructive to illustrate these results by specializing to the ray-optical structure of Example 5.1.5 where the rays are the geodesics of a (positive definite) Riemannian metric 9+. For the Hamiltonian given in this example, there is
93(H) the
is the set of all \ E H 2
g+-Iength
Q0, 11, M)
functional. In this
case
(, )
with g+ is
Hess,\Aq,ql
(the
=
const. and A is
H 2 version of ) the
standard index form of Riemannian geometry and the notions of Jacobi fields and of conjugate points are the familiar textbook ones, see e.g. Bishop and
[15], Chap. 11. Similarly, specializing to the ray-optical structure of Example 5.1.2 gives the analogous results for time-like geodesics of a Lorentzian metric which should be compared with Beem, Ehrlich and Easley (111, Sect. 10.1. Note that for the Hamiltonian H given in Example 5.1.2 A is the negative Lorentzian length functional on time-like curves. Switching from H to -H yields the positive Lorentzian length functional instead. Now we want to relate the Morse index of a critical point A of Aq,q, to the number of conjugate points along A, thereby generalizing the classical Morse index theorem for Riemannian geodesics. Partly as a preparation for that we prove the following criterion for a critical point to be a minimum. This criterion applies to rays that are associated with a classical solution of the eikonal equation. (Please recall Sect. 5.5.) It generalizes a classical theorem of variational calculus, based on the socalled Weierstrass excess function, into our setting of ray -optical structures. In the language of traditional variational calculus, rays associated with a classical solution of the eikonal equation are usually characterized as being "embedded in a field of extremals". Crittenden
Theorem 7.5.3. be
a
Let,
in the situation
ray. Assume that there is ==
(a) If, for
7.5. 1,
A,,
E
Z(H; q, q')
) R of the is completely lifted ray lifting map of (7.40). Then the
denotes the
true.
the Hamiltonian H under
of (5.15)
hand side
(in
along
local minimum
(b) If the
of Theorem
classical solution S: U C M
0 such that the
eikonal equation H o dS contained in dS(U), where
following holds
a
is not
one
consideration, the matrix on the leftonly non-degenerate but even positive definite
and thus in any natural
of Aq,ql
chart),
then A,, is
a
strict
-
R is defined on a domain W 9 T*M such all q E M, if the matrix (o9'H1apa19Pb) for Tq*M is positive definite at all points u E W (in one and thus in any natural chart), and if the domain U of S covers all curves A E Z(H; q, q'), then ,\,, is a strict global minimum of Aq,ql
Hamiltonian H: W
that W n
is
convex
-
Proof. We
use
the coordinate representation
(7.45)
of A and its restriction
Aq,ql. As in the proof of Theorem 7.5.1 and Theorem 7.5.2, coordinates are employed for notational convenience only. The following argument remains
7.5 A Morse
valid X E
theory for strongly hyperregular ray-optical
if U cannot be covered
even
93(H; q, q')
Aq,ql (A)
-
by
single chart. Then
a
(x, p)
we
find for all
contained in U
Aq,ql (I\o)
I(p,,., ') (s)
=
ds
-(9,,S(x,,).,t')(s)ds.
is the coordinate
(7.52)
0
0
n 0
Here
177
structures
representation of -F(A) whereas (x, aS(x,,))
the coordinate representation of we can replace x,, by x in the second
x(O)
As
=
xo(O)
and
x(l)
=
is
x,,(1),
integral. With (7.27) this puts (7.52)
into the form i
Aq,ql (A)
-
If A is close to
Aq,ql (Ao) A,,, p(s)
:-4:
fo ((P,,
on an
i9,,S(X)) Co9H (x, p)) (s) ds.
'
ap.
(7.53)
aS(x(s)).
In that case, as H is a C' and open neighborhood of Ar, Taylor's theorem
is close to
thus C2 function defined
-
implies
H(x(s),p(s)) 92 H
1
for
some
Pa(S)
+
H(x(s),aS(x(s))) 'H (x(s), P(s)) ('9aS(x(s)) _Pa(s)) ON
(X(S), P'(S)) (0a S(X(S)) =
left-hand side of
Pa(S) (7.54)
+
O(S)
-
on
the
a
Z(H) satisfy (7.28). Hence, inserting (7.54)
Aq,q'(, ) C
1
-
with 0
1 implies that completely contained in the vacuum light sphere, gab Va Vb < the rest system, the indicatrix turns into a sphere, see
gab V*a
V*b
this
ellipsoid
1. If
we
-1 ny
-
is
pass to
(8.22)
we also calculate the figuratrix (6.13) to visualize dependence of the phase velocity on spatial directions. The definition (6.10) of the phase velocity implies that
For the sake of completeness
the
W
ka With
(8.23),
the
-
Wa
(8.23)
gab Wa Wb
dispersion relation (8-7) yields
(I
_
p2) gab Wa Wb (I
(n
gab Wa,Wb)
_
(gab Wa Wb
2
_
(8.24)
Ua Wa )2.
a fourth order surface, see Figure 8.2. Our assumption guarantees that gab Wa Wb :f , 1, i.e., not only the ray velocity but also the phase velocity is bounded by the vacuum veclocity of light. If we pass to the rest system of the medium, the figuratrix turns into a sphere,
The n
figuratrix is, thus,
> 1
9
ab
W* W*b a
1
-n7
(8.25)
-
the rest system figuratrix and indicatrix coincide if tangent space and cotangent space in the usual way with the spacetime metric.
Hence, for
For rays relations gab situation
parallel V
a
(8.21)
Ub
or
antiparallel
iO'\/gcd Vc Vd
and
to the relative motion
and Wa Ua
:L,3
we
identify help of the we
can
N/ w , wd.
use
the
In this
(8.24) imply
N//gab Va Vb
=
'A wa Wb
(8.26) n
8.1
,3
Fig.
=
Doppler effect, aberration, and drag
0
0
8.2. This
0 and a
the
a
0, but that (8.70) and (8.71) cannot be used in this limit since they contain undetermined exFor this reason, a somewhat inconvenient matching 0 and regions with wp 54 0 are procedure must be used if regions with wp to be treated in a unified setting.
pressions of the
form
0/0.
=
(c)
The
Now
we
stationary
case
want to consider the situation that
JV is
a
stationary ray-optical
structure and that -1 is an integral curve of the distinguished time-like vector field W E gAr, i.e., that the light source is at rest with respect to this time-like vector field. Moreover, we shall assume that the assumptions of the reduc-
of Theorem 6.5.1) are satisfied. The gravitational lensing then be described in terms of space rather than in terms of viz., in terms of the reduced ray-optical structure. If the reduced
tion theorem
situation
(i.e.,
can
spacetime,
ray-optical structure is strongly regular (which is true in virtually all situa, physical interest in which the preceding assumptions are valid), the Morse theory developed in Sect. 7.5 can be applied. We want to illustrate the general features of this approach by way of example. To that end we consider, on a 4-dimensional Lorentzian spacetime manifold (M, g), a ray-optical structure Ar determined by a Hamiltonian of the form (8.42), i.e., a dispersion relation of the form
tions of
9ab W Pa A + WP(X)2 which describes
light propagation
in
a
=
(8.73)
0
non-magnetized plasma. Here gab
are
the contravariant components of the spacetime metric and W,, is the plasma frequency. The spacetime metric is supposed to describe a cosmological model with some local mass concentrations that act as "deflectors"; the light rays are
supposed
to be
influenced by
plasma clouds, situated
some
in
regions
where the function w,, is different from zero. We want to assume that Ar is stationary, i.e., that there is a time-like vector field W in the symmetry algebra 9,V. This means that W must be a conformal
Killing field
spacetime
of the
metric g,
Lw (e -2f g) where
f
constant
=
-!In( 2
-
g(W, W)),
along each integral
=
and that the rescaled
curve
of
W,
(8.74)
0
plasma density
must be
206
8.
Applications Lw (e-f wp)
=
0.
(8.75)
These assumptions are satisfied, e.g., if there is an open subset V in M, invariant under the flow of W, with the following properties. (M,g) is a Robertson-Walker spacetime without plasma (wp 0) on M \ , whereas it =
is
stationary spacetime with a stationary plasma (Lwg 0 and Lwwp 0) on D. D is to be interpreted as the region where the influence of the deflector mass and of the plasma cloud on the light rays is to be taken into account. a
=
=
Instead of
a Robertson-Walker spacetime we could use any other conformally stationary cosmological background on M \ To apply the reduction theorem, we have to assume that there is a global ) R for W which timing function t : M gives us a global diffeomorphism X4' ) M x (7r, t) : R, please recall Figure 6.3. To construct the reduced rayoptical structure according to Theorem 6.5.1, we choose a frequency constant w,, > 0. Rom (8.73) we read that rays with p,,W' -w,, cannot leave the region =
M,,,, If
we
restrict to this
and the reduction
{q
=
E
M
e-2f (q) wp(q)2
0, otherwise it might have, of course, M,,. w,, be necessary to excise some parts from spacetime where the plasma frequency is so large that rays with frequency constant w, cannot enter. However, if the structure case
=
on
0
=
we
=
function wl, has
spatially compact support
sufficiently large
w,,.
we
always
have
M,".
=
M for
With the results from Sect. 6.6 it is easy to find a Hamiltonian for the reduced ray-optical structure First we recall that, by (8.74), the spacetime metric induces
according (6.98). to the hypersurfaces to
a
positive definite
The one-form t
=
const. In coordinates with
the spacetime metric takes the form (6.101). there is a function c4 R such that
e-fwp Hence,
(8.73)
in coordinates with is
equivalent
x'
=
=
t and
a
one-form
only x
4
=
if W is t and
on M, orthogonal
al,9X4
=
Moreover, (8.75) implies
-7r*c ,,.
191aX4
W
that
(8.77) =
W the
dispersion relation
+C 2P
0
to
k"'(Pl.i with
and
metric
vanishes if and
-
P4 [t) (Po- N a) -
2
-
P4
=
(8.78)
greek indices running from 1 to 3. According to the general rules found 6.5, the left-hand side of (8.78) gives us a Hamiltonian for the reduced ray-optical structure if P4 is replaced with -w,,. Since we are always free to multiply the Hamiltonian with a non-zero function, this implies that 9". is generated by the Hamiltonian in Sect.
8.4 Gravitational
OAO, (PA
2
To
+
wo jl) (Pa + wo a) 2
WO
study gravitational lensing
lensing
P
A G.
we
9,,.
ask how many rays of go from Hamiltonian (8.79) the map af, phism onto its image, i.e., that
(8.79)
C 2
-
207
fix two points 4 and d' in and we to 4'. It is easy to check that for the
4
TM^,,.
global diffeomorstrongly hyperregular according to Definition 5.2.2. Thus, the Morse theory developed in Sect. 7.5 applies. For the Hamiltonian (8.79), the space of trial curves 93(k, 4, d') is equal to the set of all H2 curves, defined on the interval [0, 1], in A G. from 4 to 4' with
(W2
_
0
C P2)
and the action functional is given A
wo
10
X
o
R
)
is
a
is
:ia
=
(8.80)
const.
by z,2
ds
-
(8.81)
A
934,q, with coordinate representation x E H 2QO, 1], R 3). Please that, up to the factor wo > 0, the action functional (8.81) equals the optical path length (6.84) of the lifted ray associated with the ray A. In the vacuum case c p 0, the optical path length can be reinterpreted as a travel time according to Proposition 6.5.3. According to Fermat's principle in the version of Theorem 7.5.1, the light rays from 4 to d' are the stationary points of the action functional (8.81) or, equivalently, of the optical path length functional. In the static (i.e., non0. rotating) case we can choose the timing function in such a way that Then the optical path length functional is equal to the length functional of the frequency-dependent metric
for each X E note
A
=
=
0=
(1-":i
(8.82)
optical path length functional is orientation-reversing reparameterizations. Hence, in that from 4 to 4' does not travel along the same path as a light
Please note that in the
rotating
case
the
not invariant under case a
light
ray from
d'
ray to
Since, for
4.
the Hamiltonian
(8.79),
the matrix
,92fl
(9P,4,9PO, )
=
0tta) W2 0
-
W2
(8.83)
P
the Morse index theorem in the version of Theopositive definite on rem 7.5.4 implies that along each ray the extended Morse index is equal to the number of conjugate points counted with multiplicity, see (7.56). In particular, a ray gives a strict local minimum of the optical path length functional
is
208
8.
if and
Applications
only
if it is free of
conjugate points whereas it gives a saddle-point conjugate point in the interior. Since at each conjugate point neighboring light rays are crossing from one side to the other, an odd Morse index is associated with a side-reversed image in comparison to an even Morse index. This is observable for light sources surrounded by irregular structures, e.g., for quasars with jets or lobes. For the vacuum rays it is known that the occurence of conjugate points gives rise to multiple imaging situation. Under certain assumptions on the causal and topological structure of spacetime, the converse is also true, i.e., in any multiple imaging situation at least one of the rays must contain a pair of conjugate points. For a general proof of these facts we refer to Perlick [111]. This is an interesting result since, in combination with Einstein's field equation, the existence of conjugate points along a vacuum light ray allows to estimate the matter density along the ray, see Padmanabhan and Subramanian [102]. The above-mentioned Morse index theorem might be useful for generalizing this result to the case of light rays in media, at least for stationary situations and for media which satisfy the positive-definiteness assumption of Theorem 7.5.4. Finally we want to prove an odd number theorem, i.e., we want to show that, under certain reasonable assumptions, a transparent deflector always produces an odd number of images. To that end we generalize a differentialtopological argument, first published by McKenzie [95], into our setting of stationary ray-optical structures. For the sake of comparison the reader is refered to Dyer and Roeder [33] who prove an odd number theorem for spherical deflectors, and to Burke [24] and Petters [113] where odd number theorems are given for thin deflectors and weak gravitational fields. An argument very similar to Burke's but under slightly more general assumptions was worked out by Lombardi [86]. A general discussion of odd number theorems can also be found in Schneider, Ehlers, and Falco [128]. The following argument applies to all situations in which the assumptions of the reduction theorem (i.e., of Theorem 6.5.1) are satisfied. As before, we fix two points 4 and 4' in M,,. and we ask how many rays of 9,,. go from 4 to 4'. We need the following three additional assumptions (see Figure 8.9). if there is
(a)
a
There is
and
1
an
open subset B in
is contractible to
with
(P(O,
ray of
9,,,, 0
(c) Every ray is
In
vector in
unique
a
B
for
A
boundary
issuing from 4
S
=
W of B is
intersects
if
diffeomorphic
to
a
sufficiently extended.
^
TqM,,.
is the
tangent vector of a ray of reparametrization.
and this
up to extension and
conditions (a) and (b) prohibit non-transparent deflectors. non-transparent deflector would to be modeled either as a hole in M,,.,
physical terms,
Such
following properties. 4 E : [0, 1] x 1 all E 8. The closure of
is differentiable map P
P(1,
and
is compact in M,,,. The 2-sphere and d' E S.
(b) Every
with the
M,,.
4, i.e., there
8.4 Gravitational
Fig.
8.9. Under the
assumptions stated in the text, the
rays
lensing
209
issuing from 4 define
continuous map from the small sphere . to the big sphere . The degree of this map must be equal to 1 which proves that there is an odd number of rays from a
to
41.
thereby violating condition (a), or as a compact region in which some rays are trapped, thereby violating (b). Condition (c), roughly speaking, makes sure This condition that in any spatial direction there is exactly one ray of is satisfied, e.g., if M is the vacuum ray-optical structure. Please note that condition (c) could not hold if 9,,,, was not strongly regular. In the dispersive case, conditions (a), (b), and (c) have, of course, to be checked for each value of the frequency constant w,, individually. Under these assumptions, every ray issuing from 4 intersects an infinitesimally small sphere ,, around 4 in exactly one point , and it reaches the )S. sphere S at some point f ( ). This defines a differentiable map f : S,, that such for all 6 fix of value We now fix a regular a f, i.e., we laoint E is 6 the tangent map Tf f : Tf S,, a f E S,, with f bijection. Tf (f) S,, Please note that, according to the well known Sard.Theorem (see, e.g., Abraham and Robbin [2], p. 37) almost all points in S are regular values of f Clearly, 6 is a regular value of f if and only if 6 is not conjugate to 4 along any ray in 8. With a regular value 6 chosen we define the degree of f as A
.
deg(f)
=
E
sgn( )
(8.84)
f M=6
sgn( ) is equal to +1 if the differential Tpf is orientation preserving equal to -1 otherwise. Here we refer, of course, to the orientations of the spheres according to which 4 "lies to their inner sides". It is a standard
where
and
210
Applications
8.
theorem in differential
topology
that
deg(f)
is
well-defined, i.e., independent
of the choice of 6, see, e.g., Guillemin and Pollack [54] for a detailed discussion. Moreover, our assumption of B beingcontractible to 4 gives an orientation
preserving diffeomorphism from to S,, and a smooth deformation of f into identity, i.e., it implies that f is homotopic to the identity map. As it is well known that, for maps between compact manifolds without boundary, the degree is a homotopic: invariant, the degree of f must be the degree of 1. the identity, i.e., deg(f) We now consider the rays from 4 to 4'. We exclude the exceptional case that d' is conjugate to 4 along some ray, Le, we assume that 4' is a regular value of f. Then the definition of the degree implies that
the
=
deg(f)
=
n+
-
n-
(8.85)
1. where n is the number of rays from 4 to 4' in 8 such that sgn( ) Here f denotes the intersection of the ray with S,,. Clearly, n+ is the number of rays with an even number of conjugate points and n- is the number of =
odd number of conjugate points. As the degree of f is equal to I + 2n-, i.e., the number of rays from 4 to implies that n+ + n-
rays with
1, (8.85)
an
=
is odd.
For this argument stationarity was, of course, essential since otherwise in which it could be applied. Even for vacuum rays it is no space
M,,,,
there is
similar degree argument could give an odd number theorem spacetime setting, i.e., without assuming stationarity. (This problem was discussed in detail by Gottlieb [51].) For that reason it is important to know that McKenzie [95] was able to give another argument to prove that a transparent deflector produces an odd number of images. This was done for vacuum light rays in a globally hyperbolic spacetime, using Morse theoretical results of Uhlenbeck [144]. Unfortunately, it was necessary for McKenzie to impose some additional assumptions on the spacetime metric the physical meaning of which is obscure. Therefore it seems fair to say that in the non-stationary case a satisfactory odd number theorem is still missing, even for vacuum rays. Infinite dimensional Morse theory, as it was developed for vacuum rays between a point and a time-like curve in a Lorentzian manifold partially by Perlick [110] and, to a fuller extent, by Giannoni, Masiello, and Piccione [47] [481, could be a useful tool. In the non-stationary non-vacuum case, there are not even rudiments of a Morse theory for light rays between a point and a time-like curve. So there is still a lot to be done in the future. hard to in
a
see
how
a
References
R., Marsden J. (1978) Foundations of mechanics. Addison-Wesley,
1. Abraham
New York 2. Abraham
R., Robbin J. (1967) Transversal mappings and flows. Benjamin, New
York
M., Carter B., Lasota J. P. (1988) Optical reference geometry for stationary and static dynamics. Gen. Rel. Grav. 20, 1173-1183 4. Anile A.M. (1976) Geometrical optics in general relativity: A study of the higher order corrections. J. Math. Phys. 17, 576-584 5. Anile A. M., Pantano P. (1977) Geometrical optics in dispersive media. Phys. Lett. 61A, 215-218 6. Anile A.M., Pantano P. (1979) Foundation of geometrical optics in general relativistic dispersive media. J. Math. Phys. 20, 177-183 7. Arnold V.I. (1967) Characteristic class entering in quantization conditions. Funct. Anal. Appl. 1, 1-13 8. Arnold V.I. (1978) Mathematical methods of classical mechanics. Springer, New 3. Abramowicz
York
I., Gusein-Zade S., Varchenko A. (1985) Singularities of differentiable Birkhiiuser, Boston Asanov, G. (1985) Finsler geometry, relativity and gauge theories. Reidel, Dor-
9. Arnold V. maps. I.
10.
drecht
K., Ehrlich P.E., Easley K.L. (1996) Global Lorentzian geometry. MaxDekker, New York Bel U., Martin J. (1994) Fermat's principle in general relativity. Gen. Rel. Grav. 26, 567-585 Bertotti B. (1998) Doppler effect in a moving medium. Gen. Rel. Grav. 30,
11. Beem. J. cel 12. 13.
209-226
J., Hadrava P. (1975) General-relativistic radiative transfer theory in dispersive media. Astron. Astrophys. 44, 389-399 Bishop R.C., Crittenden R.J. (1964) Geometry of manifolds. Academic Press,
14. Bi66k
refractive and 15.
New York
M., Wolf E. (1960) Principles of optics. Pergamon, Oxford (1978) Relativistic theories of materials. Springer, Berlin Breuer R.A., Ehlers J. (1980) Propagation of high frequency waves through a magnetized plasma in curved space-time. I. Proc. Roy. Soc. London A 370,
16. Born
17. Bressan A. 18.
389-406
R.A., Ehlers J. (1981) Propagation of high frequency magnetized plasma in curved space-time. II. Proc. Roy. Soc.
19. Breuer
waves
through
a
London A 374,
65-86 D. (1972) A simple derivation of the general redshift formula. In Farnsworth D., Fink J., Porter J., Thompson A. (Eds.) Methods of local and
20. Brill
212
References
global differential geometry Springer, New York, 45-47
in
general relativity. Lecture
Notes in
Physics 14,
(1973) Observational contacts of general relativity. In Israel W. (Ed.) Relativity, Astrophysics and Cosmology. Proceedings of the Banff Summer School August 1972, Reidel, Dordrecht, 127-152 Brillouin L. (1960) Wave propagation and group velocity. Academic Press, New
21. Brill D.
22.
York 23. Bruns H.
(1895) Das Eikonal. Hirzel, Leipzig (1981) Multiple gravitational imaging by
24. Burke W. A. 25. 26.
27. 28. 29.
30.
distributed
masses.
As-
trophys. J. 244, Ll Carath6odory C. (1937) Geometrische Optik. Springer, Berlin Chazarain J., Piriou A. (1982) Introduction to the theory of linear partial differential equations. North-Holland, Amsterdam Chen K. H. (1961) Asymptotic theory of wave propagation in spatial and temporal dispersive inhomogeneous media. J. Math. Phys. 12, 743-753 Chwolson 0. (1924) tber eine m6gliche Form fiktiver Doppelsterne. Astronomische Nachrichten 221, 329 Conway A., Synge J. L. (1931) (Eds.) The mathematical papers of William Rowan Hamilton. Cunningham Memoir Nr.XIII, Cambridge Univ. Press, Cambridge Dautcourt G. (1987) Spacetimes admitting a universal redshift function. Astronomische Nachrichten 308, 293-298
(1974) Oscillatory integrals, Lagrange immersions and unsingularities. Commun. Pure Appl. Math. 27, 207-281 Dwivedi I., Kantowski R. (1972) On the possibility of observing first order corrections to geometrical optics in curved space-time. J. Math. Phys. 13, 1941-
31. Duistermaat J. J.
folding 32.
of
1943 33. 34.
35. 36.
37.
Dyer C., Roeder R. (1980) Possible multiple imaging by spherical galaxies. Astrophys. J. 238, L67-L70 Eddington, A. S. (1920) Report on the relativity theory of gravitation. London, Physical Society Eddington, A. S. (1920) Space, time, and gravitation. Cambridge University Press, Cambridge Egorov Yu. V., Shubin M. A. (1992) Partial differential equations. I. Encyclopedia of Mathematical Sciences, vol. 30, Springer, Berlin Ehlers J. (1961) Beitriige zur relativistischen Mechanik kontinuierlicher Medien. Akad. Wiss. Lit.
38. Ehlers J.
(1967)
(Mainz), Zum
Abh. Math. K1.
fTbergang
von
der
1961(11),
Wellenoptik
791-837 zur
geometrischen Optik
Relativitiitstheorie. Z. Naturforsch. 22a, 1328-1332 39. Einstein A. (1936) Lens-like action of a star by the deviation of light in the in der
allgemeinen
gravitational field. and 41.
42. 43. 44. 45. 46.
Science 84, 506-507 Relativistic cosmology. In Sachs R.
(Ed.) General relativity cosmology. Enrico Fermi School, Course XLVII, Academic Press, New York Etherington, I. M. H. (1933) On the definition of distance in general relativity. The Philos. Mag. and J. of Science (Ser. 7) 15, 761-773 Faxaoni V. (1992) Nonstationary gravitational lenses and the Fermat principle. Astrophys. J. 398, 425-428 Frankel T. (1979) Gravitational curvature. Freeman, San IRrancisco Rench A. (1968) Special relativity. MIT Course, Norton, New York Friedrich H., Stewart J. (1983) Chaxacteristic initial data and wavefront singularities. Proc. Roy. Soc. London A 385, 345-371 Giannoni F., Masiello A. (1996) On a Fermat principle in general relativity: A Morse theory for light rays. Gen. Rel. Grav. 28, 855-897
40. Ellis G. F. R.
(1971)
References
213
F., Masiello A., Piccione P. (1997) A variational theory for light stably causal Lorentzian manifolds: Regularity and multiplicity results. Commun. Math. Phys. 187, 375-415 48. Giannoni F., Masiello A., Piccione P. (1998) A Morse theory for light rays on stably causal Lorentzian manifolds. Ann. Inst. H. Poincar6, Physique Theoretique 69, 359-412 49. Golubitsky M., Guillemin V. (1973) Stable mappings and their singularities. Springer, New York 50. Gordon W. (1923) Zur Lichtfortpflanzung nach der Relativitiitstheorie. Annalen der Physik 72, 421-456 51. Gottlieb D. (1994) A gravitational lens need not produce an odd number of images. J. Math. Phys. 35, 5507-5510 52. Guckenheimer J.(1973) Catastrophes and partial differential equations. Ann.
47. Giannoni rays in
Inst. Fourier 23, No. 2, 31-59 (1974) Caustics and
nondegenerate Hamiltonians. Topology 13,127-133 54. Guillemin V., Pollack S. (1974) Differential topology. Prentice-Hall, Eaglewood Cliffs, NJ 55. Guillemin V., Sternberg S. (1977) Geometric asymptotics. Amer. Math. Soc., Providence, Rhode Island 56. Harris S. (1992) Conformally stationary spacetimes. Class. Quantum Grav. 9,
53. Guckenheimer J.
1823-1827 57. Hasse W., Kriele M., Perlick V. (1996) Caustics of wavefronts in general relativity. Class. Quantum Grav. 13, 1161-1182; there will be an Erraturn to this paper, rectifying the incorrect proof of Theorem 4.4 58. Hasse W., Perlick V. (1988) Geometrical and kinematical chaxacterization of 59. 60. 61. 62. 63.
64. 65. 66.
67.
parallax-free world models. J. Math. Phys. 29, 2064-2068 Hawking S., Ellis G. F. R. (1973) The laxge scale structure of space-time. Cambridge Univ. Press., Cambridge Heintzmann H., Kundt W., Lasota J.P. (1975) Electrodynamics of a chargesepaxated plasma. Phys. Rev. A 12, 204-210 Heintzmann H., Schriffer E. (1977) Lorentz covaxiant eikonal method in magnetohydrodynamics. I. The dispersion relation. Phys. Lett. A 60, 79-80 Helfer A. (1994) Conjugate points on spacelike geodesics or pseudo-self-adjoint
Morse-Sturm-Liouville systems. Pacific J. Math. 164, 321-350 Herzberger M. (1936) On the characteristic function of Hamilton, the eikonal of Bruns and their use in optics. J. Opt. Soc. Amer. 26, 177-180 Higbie H. (1981) Gravitational lens. Amer. J. Phys. 49, 652-655 Ives H., Stilwell G. (1938) Experimental study of the rate of a moving atomic clock J. Opt. Soc. Amer. 28, 215-226 Jeffrey A., Kawahara T. (1982) Asymptotic methods in nonlinear wave theory. Pitman, Boston Kaufmann A. (1962) Maxwell equations in nommiformly moving media. Annals
Physics 18, 264-273 Kawaguchi T., and Miron
of 68.
the metric 69. 70. 71.
-y'j(x)
R.
(1989)
(1/c2)y'yj.
+ Miron R.
On the
generalized Lagarange
spaces with
Tensor 48, 53-63
(1989) A Lagrangian model for gravitation and elecKawaguchi T., tromagnetism. Tensor 48, 153-168 Keller J. B., Lewis R. M., Seckler, B. D. (1956) Asymptotic solutions of some diffraction problems. Commun. Pure Appl. Math. 9, 207-265 Kermack W. 0., McCrea W. H., Whittacker, E. T. (1932) On properties of null geodesics and their application to the theory of radiation. Proc. Roy. Soc. Edinburgh 53, 31-47
214 72. 73.
74.
References
Klingenberg Klingenberg
(1978) (1983)
Lectures
on closed geodesics. Springer, Berlin geodesics on Riemannian manifolds. Conference Board of the Mathematical Science Regional Conference Series in Mathematics No. 53, Amer. Math. Soc., Providence, Rhode Island Kovner 1. (1990) Fermat principle in gravitational fields. Astrophys. J. 351,
W.
W.
Closed
114-120
(1968) The geometrical optics approximation in the general inhomogeneous and nonstationary media with frequency and spatial dispersion. Sov. Phys. JETP 55, 1470-1476 76. Landau L. D., Lifshitz E.M. (1959) Course of theoretical physics. II: Theory of fields. Addison-Wesley, Reading, Massachusetts and Pergamon, London 77. Laue M. v. (1920) Theoretisches iiber neuere optische Beobachtungen zur Relativitiitstheorie. Phys. Z. 21, 659-662 78. Lebach D. E., Corey B. E., Shapiro I. I., Ratner M. I., Webber J. C., Rogers A. E. E., Davis J. L., Herring T. A. (1995) Measurement of the Solar gravitational deflection of radio waves using very-long-baseline interferometry. Phys. Rev. Lett. 75, 1439-1442 (1995) 79. Lenard P. (1921) tber die Ablenkung eines Lichtstrahls von seiner geradlinigen Bewegung durch die Attraktion eines Weltk6rpers, an welchem er nahe vorbeigeht; von J.Soldner, 1801. Annalen der Physik 65 593-604 80. Levi-Civita T. (1917) Statica Einsteiniana. Atti della reale accademia dei lincei, Seria quinta, Rendiconti, Classe di scienze fisiche, matematiche e naturali 26,
75. Kravtsov Yu. A. case
of
,
458-470 81. Levi-Civita T.
(1918)
Cimento 16, 105-114 82. Levi-Civita T. (1924)
La teoria di Einstein
e
il
principio di Fermat. Nuovo,
Ragen der klassischen und relativistischen Mechanik.
dem Jahre 1921, Springer, Berlin 83. Levi-Civita T. (1927) The absolute differential calculus.
Vier
Vortrhge
84. Lewis R. M. 85.
aus
(1965) Asymptotic theory (1998)
lenses. Modern
90. 91. 92.
Blackie, London propagation. Arch. Rat. Mech.
An
light. Nature 104, 354 application of the topological degree
to
gravitational
Lett. A 13, 83-86 Stable singularities of wave-fronts in
Phys.
general relativity. J. Math. (1998) Phys. 39, 3332-3335 Luneburg R. K. (1948) Propagation of electromagnetic waves. Lecture Notes, New York University Luneburg R. K. (1964) Mathematical theory of optics. Mimeographed notes from 1949, University of California Press, Berkeley Madore J. (1974) Faraday transport in curved space-time. Commun. Math. Phys. 38, 103-110 Marx G. (1954) Das elektromagnetische Feld in bewegten anisotropen Medien. Acta Phys. Hung. 3, 75-94 Mashhoon B. (1987) Wave propagation in a gravitational field. Phys. Lett. A
87. Low R.
89.
wave
Anal. 20, 191-250 Lodge 0. (1919) Gravitation and
86. Lombardi M.
88.
of
122, 299-304 93. Masiello, A.
(1994)
Variational methods in Lorentzian geometry. Pitman Re-
search Notes in Mathematics Series 309, Longman Scientific & Technical, Essex 94. Maslov V. P. (1972) Th6orie des perturbations et m6thodes; asymptotiques.
Dunod, Gauthier-Villars, Paris (1985) A gravitational lens produces an odd number of images. J. Math. Phys. 26,1592-1596 Milnor J. (1963) Morse theory. Ann. Math. Studies Vol. 51, Princeton University Press, Princeton
95. McKenzie R. 96.
References
215
R., Kawaguchi T. (1991) Relativistic geometrical optics. Int. J. Theor. Phys. 30, 1521-1543 98. Misner C., Thorne K., Wheeler J. (1973) Gravitation. Freeman, San Francisco 99. Nandor M., Helliwell T. (1996) Fermat's principle and multiple imaging by gravitational lenses. Amer. J. Phys. 64, 45-49 100. Newcomb W. A. (1983) Generalized Fermat principle. Amer. J. Phys. 51, 97. Miron
338-340
101.
Nityananda R., Samuel
J.
(1992)
Fermat's
principle
in
general relativity. Phys.
Rev. D 45, 3862-3864 102. Padmanabhan T., Subramanian K.
the condition for
(1988) The focusing equation, caustics and multiple imaging by thick gravitational lenses. Mon. Not. Roy.
Astr. Soc. 233, 265-284 Pais, A. (1982) Subtle is the Lord... Oxford
103.
104. Palais R. 105. Palais
(1963)
University Press, Oxford Topology 2, 299-340 (1964) A generalized Morse theory. Bull. Amer. Math.
Morse
theory
on
Hilbert manifolds.
R., Smale S. Soc. 70, 165-172 106. Pellegrini G. N., Swift A. R.
(1995) Maxwell's equations in a rotating medium. problem? Amer. J. Phys. 63, 694-705 107. Perlick V. (1990) On redshift and paxallaxes in general relativistic kinematical world models. J. Math. Phys. 31, 1962-1971 108. Perlick V. (1990) On Fermat's principle in general relativity: 1. The general case. Class. Quantum Grav. 7, 1319-1331 109. Perlick V. (1990) On Fermat's principle in general relativity: II. The conformally stationary case. Class. Quantum Grav. 7, 1849-1867 110. Perlick V. (1995) Infinite dimensional Morse theory and Fermat's principle in general relativity. I. J. Math. Phys. 36, 6915-6928 111. Perlick V. (1996) Criteria for multiple imaging in Lorentzian manifolds. Class. Quantum Grav. 13, 529-537 112. Perlick V., Piccione, P. (1998) A general-relativistic Fermat principle for extended light sources and extended receivers. Gen. Rel. Grav. 30, 1461-1476 113. Petters A. 0. (1992) Morse theory and gravitational microlensing. J. Math. Phys. 33, 1915-1931 114. Petters A. 0. (1993) Arnold's singularity theory and gravitational lensing. J. Math. Phys. 34, 3555-3581 115. Petters A. 0., Levine H., Wambsganss J. (1999) Singulaxity theory and gravitational lensing. To appear with Birkhiiuser, Boston 116. Pham Mau Quan (1956) Projections des g6od6siques de longeur nulle et rayons 6lectromagn6tiques dans un milieu en mouvement permanent. C. R. Acad. Sci. Paris 242, 875-878 117. Pham. Mau Quan (1957) Inductions 616ctromagn6tique en relativit6 g6n6rale et principe de Fermat. Arch. Rat. Mech. Anal. 1, 54-80 118. Pham. Mau Quan (1958) Sur le principe de Fermat. Enseignement Math.(2) 4,41-70 119. Pham Mau Quan (1962) Le principe de Fermat en relativit6 g6n6rale. In Proc. Royaumont Conf. (1959) Les th6ories relativistes de la gravitation. Editions du Centre Nationale de la Recherche Scientifique, Paris, p. 165 120. Plebaiiski J. (1960) Electromagnetic waves in gravitational fields. Phys. Rev. 118, 1396-1408 121. Preston T. (1912) Theory of light. Macmillan, London 122. Refsdal S., Surdej J. (1994) Gravitational lenses. Rep. Prog. Phys. 56, 117-185 123. Rindler W. (1977) Essential relativity. Springer, New York 124. Rund H. (1959) The differential geometry of Finsler spaces. Springer, Berlin Is there
a
*
References
216
IV. The
outgoing
radiation condition. Proc. Roy. Soc. London A 264, 309-338 126. Sachs R. K., Wu H. S. (1977) General relativity for mathematicians.
Springer,
125. Sachs R. K.
(1961)
Gravitational
waves
in
general relativity.
New York 127. Schmutzer E. (1968) Relativistische Physik. Akademische Verlagsgesellschaft Geest & Portig, Leipzig 128. Schneider P., Ehlers J., Falco E. (1992) Gravitational lenses. Springer, Hei-
delberg New York Schr6dinger E. (1956) Expanding bridge
129.
130. Schwartz J. T. and its
universes.
Cambridge Univ. Press, Cam-
Nonlinear functional analysis. Notes on Mathematics Gordon & Breach, New York-London-Paris Morse theory and a nonlinear generalization of the Dirichlet
(1969)
Applications,
131. Smale S.
(1964)
problem. Ann. Math. 80, 382-396 132. Sommerfeld A., Runge 1. (1911) Anwendungen der Vektorrechnung auf die Grundlagen der geometrischen Optik. Annalen der Physik 35, 277-298 133. Stephani H. (1991) Allgemeine Relativitiitstheorie. Deutscher Verlag der Wissenschaften, Berlin 134. Stix T. H. (1962) The theory of plasma waves. McGraw-Hill, New York 135. Straubel R. (1902) Über einen allgemeinen Satz der geometrischen Optik und einige Anwendungen. Physikalische Zeitschrift 4, 114-117 136. Straumann N. (1984) General relativity and relativistic astrophysics. Springer, Berlin
Synge J. L. (1925) An alternative treatment of Fermat's principle for a stationary gravitational field. The Philosophical Magazine and Journal of Sciences
137.
(6.Ser.) 50,
913-916
Synge J. L. (1937) Geometrical optics. Cambridge University Press, Cwnbridge 139. Synge J. L. (1937) Hamilton's characteristic function and Bruns's eikonal. J. Opt. Soc. Amer. 27, 138-144 140. Synge J. L. (1954) Geometrical mechanics and de Broglie waves. Cambridge University Press, Cwnbridge 141. Synge J. L. (1956) Geometrical optics in moving dispersive media. Comm. Dublin Inst. Adv. Studies, Ser. A No. 12 142. Synge J. L. (1960) Relativity. The general theory. North-Holland, Amsterdam 143. Thorpe J. A. (1979) Elementary topics in differential geometry. Springer, New York-Heidelberg-Berlin 144. Uhlenbeck K. (1975) A Morse theory for geodesics on a Lorentzian manifold. Topology 14, 69-90 145. Wambsganss, J. (1998) Gravitational lensing in astronomy.
138.
,
http://www.livingreviews.org/Articles/Volumel/1998-12wamb/
(1984) General relativity. Chicago University Press, Chicago 147. Walsh D., Carlswell R., Weyman R. (1979) 0957+561 A,B: twin quasistellar objects or gravitational lens? Nature 279, 381-384 148. Weinstein A. (1971) Symplectic manifolds and their Lagrangian submanifolds. 146. Wald R.
Advances in Mathematics 6, 329-346 Weyl H. (1917) Zur Gravitationstheorie. Annalen der Physik 54, 117-145 150. Woodhouse N. (1980) Geometric quantization. Clarendon, Oxford 151. Zheleznyakov V. V. (1996) Radiation in astrophysical plasmas. Kluwer Aca-
149.
Publishers, Dordrecht Zwicky F. (1937) Nebulae as gravitational lenses. Phys. Rev. 51,
demic 152.
290
Index
aberration 183, 187 action functional 91, 149, 172 adapted coordinates 11 angulax diameter distance 130 approximate solution of Maxwell equation 18,36
conjugate momentum 26,64 conjugate point 105, 165, 175, 178, 203,207 constant of motion 85,132 constitutive equations 9 constraints 10, 43
-
approximate-plane-wave family arrival time functional
4,14
159,164
asymptotic series solution of Maxwell equation 18,24 asymptotic solution of Maxwell equation 4,17,36
contact manifold
corrected
73
luminosity distance
cotangent bundle
130
63
-
-
Baumbach-Allen formula 198 biaxial crystal 31 bicharacteristic curve 32
birefringence
29,
58
of characteristic vaxiety bundle
infinitesimal
27
drag
15,
62
eikonal equation 4,15,92 classical solution of 92
-
of vector field
84
-
canonical one-form
64
canonical two-form
64
causality of ray-optical
-
-
-
-
structure
112, 115,
123
for linear medium
caustic
5, 100 curve
plasma 53 generalized solution paxtial 27,54
eikonal function energy
32, 73
characteristic determinant
95
15
39
equivalence of Hamiltonians Euler equation 47
14
-
Euler vector field
88
evolution equations in linear medium
-
-
equation
92
in
plasma
excess
98
-
50
function
excitation
Lagrangian submanifold
of
16,93 9,39
density
energy flux
chaxacteristic equation 21 characteristic function 62 chaxacteristic matrix 14 characteristic variety 27, 67 chaxacteristic vector field 73 classical solution of eikonal
19, 21, 24
for
eikonal surface
characteristic
27
54
183,185 29,58 183,187
effect
eikonal
36,64
canonical lift
conic
for
double refraction
123,126
canonical chaxt 64 canonical equations
for linear medium
plasma Doppler effect
-
-
-
light 195 3, 5, 9 dilation invariance 87,116,138 dimensional reduction 86,135 dispersion 44,87,116 dispersion relation 44 dielectricity
-
branch
-
deflection of
electric
8
176
43
10,
12
29,75
Index
218
-for
electromagnetic 8 magnetic 8 expansion 127 of light bundle extraordinary ray 36 extraordinary wave 31
-
stationary ray-optical
-hyperregular -partial -regular
-
65
27,53 65
-transformation of
29,75,76
Hamiltonian vector field
-
-
-
-
-
-
64
principle 62 203 and gravitational lensing for arbitraxy ray-optical structure
Hessian
156,161
homogeneous background limit 51, 52 homogeneous Lagrangian submanifold
Fermat -
structure
132
-
for Kerr metric
195
for Rindler model
Hilbert manifold
stationary ray-optical 155,165
structure
for
ray-optical
16, 19, 52
165
98
195
vacuum
limit
high frequency
192
for Schwazschild metric
for
174
Huyghens construction hyperregularity -
structure
-
65
of Hamiltonian of
ray-optical
153
77
structure
163 -
in
ordinaxy optics
field -
149
index
65
fiber derivative
174,178,207 30, 122, 124, 183, 190 indicatrix 116,188,189 16 inertial system 11 initial value problem isotropic medium 10, 30 isotropic ray-optical structure 121 isotropic submanifold 93 isotropy subgroup 86
-
strength
electric
-
8
electromagnetic 8 magnetic 8 figuratrix 116,188,189
-
-
Finsler metric
28
experiment 189 Fourier synthesis 17,45 frequency 16,114,136,185 Fizeau
Resnel formula
189
gauge transformation
Jacobi class 137
95
generalized wave surface 99 geodesic 63 geometric optics approximation of Maxwell fields
24
gravitational lensing 199 113,114,188,189 group velocity H' space Hamilton
166
equations
4,36,64
65 vertical paxt Hamilton-Jacobi equation
36,64 equivalence of 29, 75 for isotropic ray-optical
4,27,92
Hamiltonian
-
structure
-
-
for linear medium for for
lifted
Kerr
structure
4 193
spacetime
Lagrange bracket 93 Lagrangian submanifold -
-
conic
98
homogeneous
Landau gauge lensing 199
49
Levi-Civita, connection
6
7
lifted Jacobi class
101
lifted Jacobi field
101
73 lifted ray light bundle 123 -
68, 74
5,93
98
cone
ray-optical light-like 63
27
53
plasma ray -optical
101, 124,175
101
JWKB method
light
121 -
Jacobi field -
101, 176
101
Levi-Civita tensor
-
-
of refraction
-lifted
generalized observer field 157 generalized optical path length 159 generalized solution of eikonal equation
-
Morse
for
Lorentz force
structure
47
79
Index Lorentzian metric
63
partial transport vector field 34 permeability 3, 5, 9 permeable 7 phase surface 16,93 phase velocity 113,114,188,189 plasma 3, 5, 46, 70, 205 193 on Kerr spacetime plasma frequency 55, 70,72,193,205 plasma oscillation 55, 72
Malus theorem 128 Maxwell equations 3,7 in linear medium 3,4,7
-
-
in
plasma
3, 46
medium -
-
-
-
-
accelerated 190 biaxial. 31 dielectric 3, 7, 9, 69
dispersive
5,44,87,112,116,147,
189 -
-
Poincar6 half-space 193 polarization condition -
isotropic
3,10,30,62,121,147,183,
190 -
linear 3,7,9,69 non-lineax 45
-
permeable
-
uniaxial
3, 9, 30, 69
69
for linear medium
-
Lorentzian
for
,
-
ray
63
optical 30 31 36 69 spacetime, 69 ,
,
and energy flux
-
extraordinary
186 -
Minkowski energy-momentum tensor 39
-
-
momentum -
-
-
conjugate
26,64
of symmetry
Morse index
86
gravitational lensing 207 theory 165,168 and gravitational lensing 205 multiplicity of conjugate point 105,175,178 and Morse
ordinary
36
oriented
81
virtual
of solution of eikonal equation
32
-
-
-
-
-
-
generalized
157
-
odd number theorem
optical
metric
-
-
208
-
30,31,36,69,124,146,
183
-
-
optical path length as travel time 139,145,149 generalized 159,205 139
-
-
-
-
ordinary ordinaxy
ray wave
36 31
,
causal
-
for ray-optical structure
87 , 116 , 123
examples 69 hyperregular 77 isotropic 121, 123 Lorentz invariant on
120,123 111
Lorentzian manifold
orientable
regular
75
77
reversible
87
131, 155
stationary
strongly hyperregular 78, strongly regular 78, 140 vacuum
of
168
111
vacuum
-
-
-
rays
potential
118
119,132
for Rindler observers
reduction
partial eikonal equation 27 partial Hamiltonian 27
67
112 , 115 , 123
dilation invaxiant
redshift
76,81
189
,
reciprocity theorem 129 redshift 116,131,186 -
orientation
82
velocity 114 188 ray-optical structure
-
64 113
81
ray
-
observer field
41 36
73
-
natural chaxt
150,154
36, 111
in vacuum
lifted
-symmetry of
178
-
-
action
4 , 17 ray method ray optical structure
174,178
Morse index theorem
14
31,32,34,36,54,73
-
-
microwave links
22
plasma 55, 57 Poynting vector 39 principal determinant principal matrix 14 principle of stationaxy
-
metric -
219
and
192
86
gravitational lensing spacetime 195
of Kerr
206
Index
220
-
-
of Rindler model of
tangent bundle
192
stationary ray-optical
time-like
134 -
of
stationary
structure
vacuum
135
-
8
reference system
-
65
of Hamiltonian
-
ray-optical structure reversibility 87
-
of
76
-
-
spacetime,
Robertson-Walker
light bundle
Sachs bein
32,34
for
plasma 54 partial 34, 54 145
uniaxial
crystal
vacuum
ray
30,69
36, 111 111
vacuum
194
150 variational vector field 115 vector of normal slowness velocity coordinates 64
127
light bundle
sine condition
131
Sobolev space
166
space-like 63 stationarity of ray-optical structure strong hyperregularity 78 strong regularity 78 -
structure group
for linear medium
rayv-optical structure 149 variational principle
124
shear of
19,22,33,35
57
127
Schwarzschild metric
-
plasma
206
rotation
of
for
travel time
190
Rindler model
-
for lineax medium
transport vector field
regularity -
111
63
timing function 133 transport equation
ray-optical
141
reduction theorem
63
orientability
time
structure
vertical part of Hamilton 65
wave
extraordinary ordinary 31 wave amplitude
31
-
-
85
symmetry group 83 symmetry of ray optical structure symplectic geometry 61
symplectic manifold
81
virtual ray 131
86
superposition 17 symmetry algebra
equations
82
64
Offsetdruck, WrIenbach
Druck:
Strauss
Verarbeitung:
Schiffer, GrUnstadt
wave
covector
wave
surface
-
-
generalized lifted
19
17,114 4,16,19,92,99,153 99
99,153
Weierstrass
excess
function
176