8083.9789814340489-tp.indd 1
8/29/11 4:59 PM
Bernard H Lavenda
Università degli Studi di Camerino, Italy
World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

A NEW PERSPECTIVE ON RELATIVITY
An Odyssey in Non-Euclidean Geometries
Copyright © 2012 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN-13 978-981-4340-48-9
ISBN-10 981-4340-48-0

Typeset by Stallion Press
Email: [email protected]
Printed in Singapore.
Aug. 26, 2011
11:17
SPI-B1197
A New Perspective on Relativity
b1197-fm
In memory of Franco Fraschetti (1924–2009)
Preface
Electrodynamics was the next oasis after thermodynamics which saw a confluence of physicists and mathematicians, many of whom had been protagonists in thermodynamics. Just as thermodynamics had an offspring, quantum theory, so too did electrodynamics, namely the theory of relativity. While a single name can be attached to the origins of thermodynamics, Sadi Carnot, and that of its offspring, Max Planck, no such simplicity exists in electrodynamics and relativity. Relativity is as much about physics as it is about the human beings, and their failings, that made it. Every physics student will have heard of Maxwell’s equations but will he have also heard of Weber’s force? The student may have heard of Weber and Gauss for the units named after them, but not about their championing of Ampère’s law which threatened the supremacy of Newton’s inverse square law. The names of Helmholtz, Clausius and Boltzmann may be familiar from thermodynamics and statistical thermodynamics, but they are much less known for their theories of electromagnetism. Every student of mathematics will have heard the names of Gauss and Riemann, but will he also know of their fundamental contributions to electromagnetism? Who were Abraham, Heaviside, Larmor, Liénard, Lorenz, Ritz, Schwarzschild, and Voigt? Why have their names been struck from the annals of electromagnetism? We are familiar with the priority disputes between Kelvin and Clausius in thermodynamics, but not with those in electromagnetism and relativity. A student of physics may have heard the name of Lorentz, because of his law of force and transformation, but not at the same level as Einstein. And Poincaré is known for just about everything other than his principle of relativity. The history of electromagnetism and relativity has been rewritten and in a very unflattering way.
By the modern historical account of electromagnetism and relativity, there were winners and losers. Maxwell is said to have triumphed over Weber and Gauss in formulating a field theory of electromagnetism, and over Lorenz and Riemann in the formulation of his displacement current; Einstein’s absolute speed of light prevailed over Ritz’s ballistic theory of emission; Lorentz prevailed over Abraham and Bucherer in devising a model of the electron whose expressions for the variation of mass, momentum, and energy with velocity were later to be adopted in toto by relativity as a model for all matter, whether charged or not; and Einstein was granted seniority in stating the principles of relativity, though they were previously enunciated by Poincaré. Why were the experimentalists, Ives and Essen, so vehemently opposed to relativity? Ives viewed his verification of the second-order Doppler shift as a clear demonstration that a moving clock runs slow by the same factor that was predicted by Larmor and Lorentz, and not as a vindication of time dilatation in special relativity. Essen, who built the first cesium clock, queried what happens to the lost ticks when more ticks are transmitted than are received, independent of whether two clocks are approaching or receding from one another. Essen went so far as to query relativity as a “joke or swindle?” Most if not all monographs on relativity do not touch on these questions. Not so with O’Rahilly’s Electromagnetics written in 1938. Not everyone will agree with his dispraise of Maxwell’s displacement current, or his over-appraisal of Ritz, but much of what he says could not be truer today:
There is far more authoritarianism in science than physicists are aware or at least publicly acknowledge. Anybody with a scientific reputation would today hesitate to criticize Einstein, except by way of outdoing him in cosmological speculations.
Essen expressed similar views:
Students are told that the theory (relativity) must be accepted although they cannot be expected to understand it. . . The theory is so rigidly held that young scientists who have any regard for their careers dare not openly express their doubts.
Whether there is any truth in the allegations I will leave to the reader. But what I plan to do is to present relativity from a ‘new’ point of view that treats relativistic phenomena, known and unknown, from different perspectives. I put ‘new’ in quotation marks because the approach is really not new, but was suggested by Kaluza and Varičak over a century ago. What
is ‘new,’ I believe, is the wealth of physical phenomena that can be drawn from the non-Euclidean geometrical perspective. This monograph is intended neither as a historical account of relativity nor as an essay in constructive criticism of it. A recurring theme is that motion causes deformity and this can, under certain circumstances, catapult us into non-Euclidean spaces. It was also an exciting exercise to see where non-Euclidean geometries could be found but were not appreciated as such. There are at least two eye-catching relations: the product of two longitudinal Doppler shifts is the square root of the cross-ratio, whose logarithm is the hyperbolic distance; and the Beltrami metric in polar coordinates is the exact expression for the metric of the uniformly rotating disc. Gravitational phenomena, rather than being a manifestation of warped space-time, can be accounted for by a varying index of refraction in an inhomogeneous medium that modifies Fermat’s principle of least time. The reader will find old and new things alike, but the ‘old’ with a new interpretation. I don’t expect everything to be 100 percent correct; some things will have to be changed, modified or clarified. But I do believe that this is a very fruitful approach that has led to the questioning of many fundamental aspects of relativity. According to Riemann, physics is the search for a geometric manifold upon which physical processes occur. The line element of constant curvature,

$$\frac{1}{1 + \frac{1}{4}\alpha \sum x^2}\,\sqrt{\sum dx^2}, \tag{R}$$

appearing in his Habilitation Dissertation, when written in polar coordinates is precisely the metric for a uniformly rotating disc with constant negative curvature, α < 0. When charge is added, it becomes the Liénard expression for the rate of energy loss due to radiation. The role of the longitudinal Doppler shift means that space and time do not appear separately but only in a ratio, as a homogeneous coordinate. It is the difference in longitudinal Doppler shifts that is responsible for the slowing down of clocks in relative motion. Einstein elevated c, the velocity of light in vacuo, to a universal constant. The fact that c is a constant, even to observers in relative motion, is tantamount to making it a unit of measurement, one which is necessary for the existence of a
non-Euclidean geometry. So raising c to a universal constant, as Essen has pointed out, meant that the definition of unit length or time, or both, had to be abandoned. Relativity will thus unfold in a hyperbolic space of velocities that is entirely consonant with the relativistic addition of velocities. Following the historical route spreads the honors of discovery of relativity more evenly. Poincaré had arrived at the postulates of relativity at least five years before Einstein, and “because he did not fully appreciate the status of both postulates” is no argument for denying him credit. To deny Poincaré his primary role in developing the theory of relativity because he held on to the aether concept is to deny Carnot the credit for discovering his principle because he still believed in caloric theory. It would never have crossed my mind that I detracted any credit from Boltzmann by saying that his principle is incomplete, because it deals with only part of a probability distribution (a very large number instead of a proper fraction), whereas I have shown that the entropy is the potential of a law of error for which the most probable value is the average value of the measurements. And which average is considered most probable will determine the form of the entropy. What is incomprehensible is Poincaré’s need to ‘adjust’ the laws of physics so as to preserve Euclidean geometry, and Einstein’s later concurrence with him. Was Euclidean geometry superior to the non-Euclidean geometries to which Poincaré made so many outstanding contributions? Why couldn’t Poincaré connect his fractional linear transformations, which preserve certain geometric properties and define a new concept of length in hyperbolic geometry, with the Lorentz transformations on which he did so much work?
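The link between fractional linear transformations, the longitudinal Doppler shift and hyperbolic distance mentioned above can be checked numerically. What follows is only a minimal sketch, assuming units in which c = 1 and a velocity space of curvature −1; the function names are illustrative and are not taken from the text. It uses the standard relations: the relativistic composition of collinear velocities is a fractional linear map of the interval (−1, 1), the Doppler factors of successive boosts multiply, and the logarithm of the Doppler factor, which plays the role of hyperbolic distance in velocity space, therefore adds.

```python
import math


def doppler_factor(beta):
    """Longitudinal Doppler factor sqrt((1 + beta) / (1 - beta)) for |beta| < 1 (c = 1 assumed)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def add_velocities(beta1, beta2):
    """Relativistic composition of collinear velocities: a fractional linear map of (-1, 1)."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)


def hyperbolic_distance(beta):
    """Distance from rest in velocity space, taken as the log of the Doppler factor (= artanh(beta))."""
    return math.log(doppler_factor(beta))


# Illustrative values only.
beta1, beta2 = 0.6, 0.7
beta12 = add_velocities(beta1, beta2)

# Doppler factors multiply under composition of velocities ...
print(doppler_factor(beta12), doppler_factor(beta1) * doppler_factor(beta2))
# ... so the hyperbolic distances add, while the composed velocity stays below 1.
print(hyperbolic_distance(beta12), hyperbolic_distance(beta1) + hyperbolic_distance(beta2))
print(beta12)
```

In this sketch the composed velocity never leaves the interval (−1, 1), which is the numerical counterpart of the statement that the fractional linear map preserves the disc on which the hyperbolic length is defined.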
Historians of science make much ado over the tortuous path that Einstein followed to arrive at his field equations of general relativity, taking for granted that they are the final solution to the gravitational problem. Little progress has been made since Einstein wrote down his equations almost a century ago, and what the general theory proposes has still to be corroborated by observation. Singularities, black holes, and gravitational waves have, as yet, to be confirmed. Why time warps and what constitutes emptiness is left to be explained. How can a gravitational field exist in the absence of matter and all other physical fields?
Young Maxwell gave a very interesting example of an optical instrument, for which the optical length of any curve in the object space is equal to that of its image, by expressing Fermat’s principle for the extremal path of a ray in terms of a varying index of refraction and a flat metric. The varying index of refraction had the exact same form as the coefficient in (R) for positive curvature (α > 0). This gave me the idea that an optico-gravitational approach might prove useful in which a non-constant index of refraction would mimic a varying gravitational field while the flat metric would include the centrifugal potential. That gravitational and centrifugal forces appeared in different parts led me to question the equivalence principle whereby a gravitational field can be annulled by acceleration. The distortions that we observe due to motion are the result of our Euclidean rulers and clocks. Inhabitants of hyperbolic space would see no changes in the measuring devices since they change along with them. All gedanken experiments using local observers would lead to null results. Paradoxes exist because the phenomena which give rise to them are not understood. Everyone would agree that emission theories are dead, but to say that the velocity on the outward journey is c + v, and the velocity on the return journey is c − v, where v is the velocity of the aether in the Michelson–Morley experiment, is truly contradictory. To explain the null result a contraction hypothesis in the direction of the motion was assumed, yet the only contraction that arises from the Doppler shift is a second-order one in the direction normal to the motion. The journey has been a long one for me. Along the way I have gotten to know a lot of people through their writings.
I can feel the eccentricity and biting sarcasm of Oliver Heaviside, who, without a formal education, took on the establishment with his unwavering faith in Maxwell; the youthful enthusiasm of Walther Ritz for his science, the credit that was denied him, and the tragedy of his short and painful life; the nonchalance with which Poincaré added hypotheses to theories, his wavering afterthoughts about them, and his humility that led him to uphold Euclidean geometry after all the work he did in bringing hyperbolic geometry into the mainstream of mathematics, but not physics; the quarrelsome and critical Abraham, who was denied the credit he justly deserved, and whose death was also tragic; the mild-mannered, cautious and pragmatic approach of Lorentz, and, finally, the enigmatic figure of Einstein, who, more often than not,
contradicted his own principles. It has also shown me other sides to people whom I thought I knew. The openness to explore all avenues, no matter how distasteful, that Planck exercised in his approach to blackbody radiation is now contrasted to his opinionated view that non-Euclidean geometries were ‘child’s play’ in comparison to the demands that relativity makes on the mind. But is it?
Trevignano Romano
March 2011
Bernard H. Lavenda
Contents
Preface
List of Figures

1. Introduction
  1.1 Einstein’s Impact on Twentieth Century Physics
    1.1.1 The author(s) of relativity
    1.1.2 Models of the electron
    1.1.3 Appropriation of Lorentz’s theory of the electron by relativity
  1.2 Physicists versus Mathematicians
    1.2.1 Gauss’s lost discoveries
    1.2.2 Poincaré’s missed opportunities
  1.3 Exclusion of Non-Euclidean Geometries from Relativity
  References

2. Which Geometry?
  2.1 Physics or Geometry
    2.1.1 The heated plane
  2.2 Geometry of Complex Numbers
    2.2.1 Properties of complex numbers
    2.2.2 Inversion
    2.2.3 Maxwell’s ‘fish-eye’: An example of inversion from elliptic geometry
    2.2.4 The cross-ratio
    2.2.5 The Möbius transform
  2.3 Geodesics
  2.4 Models of the Hyperbolic Plane and Their Properties
  2.5 A Brief History of Hyperbolic Geometry
  References

3. A Brief History of Light, Electromagnetism and Gravity
  3.1 The Drag Coefficient: A Clash Between Absolute and Relative Velocities
  3.2 Michelson–Morley Null Result: Is Contraction Real?
  3.3 Radar Signaling versus Continuous Frequencies
  3.4 Ives–Stilwell Non-Null Result: Variation of Clock Rate with Motion
  3.5 The Legacy of Nineteenth Century English Physics
    3.5.1 Pressure of radiation
    3.5.2 Poynting’s derivation of E = mc²
    3.5.3 Larmor’s attempt at the velocity composition law via Fresnel’s drag
  3.6 Gone with the Aether
    3.6.1 Elastic solid versus Maxwell’s equations
    3.6.2 The index of refraction
  3.7 Motion Causes Bodily Distortion
    3.7.1 Optical effect: Double diffraction experiments
    3.7.2 Trouton–Noble null mechanical effect
    3.7.3 Anisotropy of mass
    3.7.4 e/m measurements of the transverse mass
  3.8 Modeling Gravitation
    3.8.1 Maxwellian gravitation
    3.8.2 Ritzian gravitation
  References

4. Electromagnetic Radiation
  4.1 Spooky Actions-at-a-Distance versus Wiggly Continuous Fields
    4.1.1 Irreversibility from a reversible theory
    4.1.2 From fields to particles
    4.1.3 Absolute versus relative motion
    4.1.4 Faster than the speed of light
  4.2 Relativistic Mass
    4.2.1 Gedanken experiments
    4.2.2 From Weber to Einstein
    4.2.3 Maxwell on Gauss and Weber
    4.2.4 Ritz’s electrodynamic theory of emission
  4.3 Radiation by an Accelerating Electron
    4.3.1 What does the radiation reaction force measure?
    4.3.2 Constant rate of energy loss in hyperbolic velocity space
    4.3.3 Radiation at uniform acceleration
    4.3.4 Curvatures: Turning and twisting
    4.3.5 Advanced potentials as perpetual motion machines
  References

5. The Origins of Mass
  5.1 Introduction
  5.2 From Motional to Static Deformation
    5.2.1 Potential theory
  5.3 Gravitational Mass
    5.3.1 Attraction of a rod: Increase in mass with broadside motion
    5.3.2 Attraction of a spheroid on a point in its axis of revolution: Forces of attraction as minimal curves of convex bodies
  5.4 Electromagnetic Mass
    5.4.1 What does the ratio e/m measure?
    5.4.2 Models of the electron
    5.4.3 Thomson’s relation between charges in motion and their mass
    5.4.4 Oblate versus prolate spheroids
  5.5 Minimal Curves for Convex Bodies in Elliptic and Hyperbolic Spaces
  5.6 The Tractrix
  5.7 Rigid Motions: Hyperbolic Lorentz Transforms and Elliptic Rotations
  5.8 The Elliptic Geometry of an Oblate Spheroid
  5.9 Matter and Energy
  References

6. Thermodynamics of Relativity
  6.1 Does the Inertia of a Body Depend on its Heat Content?
  6.2 Poincaré Stress and the Missing Mass
  6.3 Lorentz Transforms from the Velocity Composition Law
  6.4 Density Transformations and the Field Picture
  6.5 Relativistic Virial
  6.6 Which Pressure?
  6.7 Thermodynamics from Bessel Functions
    6.7.1 Boltzmann’s law via modified Bessel functions
    6.7.2 Asymptotic probability densities
  References

7. General Relativity in a Non-Euclidean Geometrical Setting
  7.1 Centrifugal versus Gravitational Forces
  7.2 Gravitational Effects on the Propagation of Light
    7.2.1 From Doppler to gravitational shifts
    7.2.2 Shapiro effect via Fermat’s principle
  7.3 Optico-gravitational Phenomena
  7.4 The Models
  7.5 General Relativity versus Non-Euclidean Metrics
  7.6 The Mechanics of Diffraction
    7.6.1 Gravitational shift of spectral lines
    7.6.2 The deflection of light
    7.6.3 Advance of the perihelion
  References

8. Relativity of Hyperbolic Space
  8.1 Hyperbolic Geometry and the Birth of Relativity
  8.2 Doppler Generation of Möbius Transformations
  8.3 Geometry of Doppler and Aberration Phenomena
  8.4 Kinematics: The Radar Method of Signaling
    8.4.1 Constant relative velocity: Geometric-arithmetic mean inequality
    8.4.2 Constant relative acceleration
  8.5 Comparison with General Relativity
  8.6 Hyperbolic Geometry of Relativity
  8.7 Coordinates in the Hyperbolic Plane
  8.8 Limiting Case of a Lambert Quadrilateral: Uniform Acceleration
  8.9 Additivity of the Recession and Distance in Hubble’s Law
  References

9. Nonequivalence of Gravitation and Acceleration
  9.1 The Uniformly Rotating Disc in Einstein’s Development of General Relativity
  9.2 The Sagnac Effect
  9.3 Generalizations of the Sagnac Effect
  9.4 The Principle of Equivalence
  9.5 Fermat’s Principle of Least Time and Hyperbolic Geometry
  9.6 The Rotating Disc
  9.7 The FitzGerald–Lorentz Contraction via the Triangle Defect
  9.8 Hyperbolic Nature of the Electromagnetic Field and the Poincaré Stress
  9.9 The Terrell–Weinstein Effect and the Angle of Parallelism
  9.10 Hyperbolic Geometries with Non-Constant Curvature
    9.10.1 The heated disc revisited
    9.10.2 A matter of curvature
    9.10.3 Schwarzschild’s metric: How a nobody became a one-body
    9.10.4 Schwarzschild’s metric: The inside story
  9.11 Cosmological Models
    9.11.1 The general projective metric in the plane
    9.11.2 The expanding Minkowski universe
    9.11.3 Event horizons
    9.11.4 Newtonian dynamics discovers the ‘big bang’
  References

10. Aberration and Radiation Pressure in the Klein and Poincaré Models
  10.1 Angular Defect and its Relation to Aberration and Thomas Precession
  10.2 From the Klein to the Poincaré Model
  10.3 Aberration versus Radiation Pressure on a Moving Mirror
    10.3.1 Aberration and the angle of parallelism
    10.3.2 Reflection from a moving mirror
  10.4 Electromagnetic Radiation Pressure
  10.5 Angle of Parallelism and the Vanishing of the Radiation Pressure
  10.6 Transverse Doppler Shifts as Experimental Evidence for the Angle of Parallelism
  References

11. The Inertia of Polarization
  11.1 Polarization and Relativity
    11.1.1 A history of polarization and some of its physical consequences
    11.1.2 Spin
    11.1.3 Angular momentum
    11.1.4 Elastic strain
    11.1.5 Plane waves
    11.1.6 Spherical waves
    11.1.7 β-decay and parity violation
  11.2 Stokes Parameters and Their Physical Interpretations
  11.3 Poincaré’s Representation and Spherical Geometry
    11.3.1 Isospin and the electroweak interaction
  11.4 Polarization of Mass
    11.4.1 Mass and momentum
    11.4.2 Relativistic space-time paths: An example of mass polarization
  11.5 Mass in Maxwell’s Theory and Beyond
    11.5.1 A model of radiation
    11.5.2 Enter mass: Proca’s equations
    11.5.3 Proca’s approach to superconductivity
    11.5.4 Phase and mass
    11.5.5 Compressional electromagnetic waves: Helmholtz’s theory
    11.5.6 Directed electromagnetic waves
  11.6 Relativistic Stokes Parameters
    11.6.1 Weyl and Dirac versus Stokes
    11.6.2 Origin of the zero helicity state
    11.6.3 Lamb shift and left-hand elliptical polarization
  References

Index
List of Figures
1.1
A tiling of the hyperbolic plane by curvilinear triangles that form right-angled pentagons. . . . . . . . . . . . . . . . . . .
37
A bug’s life in the heated disk; ‘hot’ in the center and ‘cold’ on the disc. . . . . . . . . . . . . . . . . . . . . . . . . . . . .
52
2.2
Construction of the point of inversion P. . . . . . . . . . . . .
59
2.3
Circle of inversion for constructing the inverse P with respect to P . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
60
2.4
Maxwell’s “fish-eye.” . . . . . . . . . . . . . . . . . . . . . . .
63
2.5
The magnification of the inner product as it is projected stereographically onto the Euclidean plane. . . . . . . . . . .
64
In the case of inversion both the point and its image are on the same ray emanating from the center of the disc H. . . . . . .
66
It appears that rulers get longer as they are moved further from the origin. However, the elliptic distance from x to y is exactly the same as that from X to Y. . . . . . . . . . . . . . . . . . .
67
2.8
A tiling of the plane. . . . . . . . . . . . . . . . . . . . . . . . .
69
2.9
Calculation of cross-ratio and perspectivity. . . . . . . . . . .
70
2.1
2.6 2.7
2.10
a, d, x , y
The four points u, a, c, v and from point p have the same angles, hence, have the same cross-ratio. This also is true for c, b, w, z and d, b, x , y . . . . . . . . . . . . . . . . . . . . . .
72
2.11
Derivation of Snell’s law. . . . . . . . . . . . . . . . . . . . . .
77
2.12
Angle of parallelism. . . . . . . . . . . . . . . . . . . . . . . .
78
2.13
The number of lines passing through P that are hyperparallel to the line g are infinite. The lines h1 and h2 are limiting parallel to g, while the others are hyperparallel to g. . . . . . . . . . .
79
xxi
Aug. 26, 2011
11:17
SPI-B1197
A New Perspective on Relativity
b1197-fm
xxii
A New Perspective on Relativity
2.14
Surfaces of negative constant curvature that are mapped onto part of the hyperbolic plane. The middle figure is the mapping of a pseudosphere that produces horocycles as dashed curves.
2.15 The ratio of concentric limiting arcs depends only on the distance between them.
2.16 Using Euclidean geometry to derive the angle of parallelism by considering concentric limiting arcs.
2.17 A right triangle in hyperbolic space: As P increases without limit the angle tends to the angle of parallelism which is a function only of d.
2.18 The parallax of a star.
2.19 Tractrix and pseudosphere as its surface of revolution.
2.20 Minkowski’s vision of space-time.
2.21 Projection of the hyperboloid onto the plane.
2.22 Geodesics determined by planes cutting the hyperboloid and passing through the center.
2.23 Cayley’s calculation of distance in the projective disc model.
2.24 The Poincaré disc model as a stereographic projection from the south pole S of the bottom sheet.
2.25 Beltrami’s double mapping of Klein and his hyperbolic disc model onto the Poincaré disc model.
2.26 The combined vertical orthogonal projection upwards and the stereographic projection downwards.
2.27 Geodesics consist of arcs that cut the disc, , orthogonally.
3.1 Fizeau’s aether-drag apparatus with mirrors placed on corners to reflect light.
3.2 Monochromatic, yellow light is split by a mirror into two beams.
3.3 Second-order wavelength shifts plotted as a function of first-order shifts.
3.4 Trouton–Noble experiment to search for effect of Earth moving through aether.
3.5 Planes formed from a moving trihedron.
3.6 Thomson’s apparatus for determining the ratio e/m for cathode rays.
3.7 The points on the parabola refer to electrons deflected by parallel and anti-parallel (left side) fields.
3.8 Elliptical orbit of Mercury showing the excess rotation of the major axis.
4.1 The configuration for calculating the retarded scalar potential.
4.2 Orientation of two circuit elements ds and ds′.
4.3 Frenet frame field for a trajectory of the motion.
5.1 Stellar aberration: (a) A telescope at rest, and (b) a telescope aimed at the same star but in relative motion.
5.2 The potential of a homogeneous rod.
5.3 A rod AB has length 2 with O as its center. The attracted point P with an element of mass dm at a distance r from it. r1 and r2 are the lines joining P to the ends of the rod at A and B.
5.4 Family of ellipses and orthogonal confocal hyperbolas.
5.5 Attraction of a circular disc on its axis.
5.6 A figure of revolution.
5.7 The ratio of charge to mass as a function of the relativity velocity. The sloping curve is the ratio determined by Abraham while the horizontal curve results from Lorentz’s formula.
5.8 The orientation of the fields in Bucherer’s experiment.
5.9 (a) Oblate ellipsoid with a = b > c; (b) prolate ellipsoid with a = b < c.
5.10 The caustic circle of radius c separates the bright (periodic) region a > c from the shadow (exponential) region, a < c.
5.11 The perimeter L consists of the two half-lines that are tangent to the circle and the arc length between them.
5.12 A circle inscribed in an n-gon.
5.13 A regular n-gon inscribed in a circle.
5.14 Newton’s tractrix.
7.1 The set-up for the Shapiro effect.
7.2 Rays tangent to a circular caustic of radius l.
7.3 Sector inscribed in a triangle.
7.4 Newton’s tractrix again.
7.5 The stereographic projection of a point on the sphere P onto the plane at point Q.
7.6 Comparison of the Newtonian potential (a) with that of the Schwarzschild potential (b).
7.7 Geodesic curves that cut the rim of the hyperbolic plane orthogonally are arcs of a circle whose center O lies outside the disc.
8.1 Circles of inversion.
8.2 A more detailed description of the circle containing the fixed points v1 and λ which are uniform states of motion at relative velocities u and 2u/(1 + u²). The Möbius automorphism of the disc may be considered as a composition of two hyperbolic rotations: A rotation of π about the hyperbolic midpoint between the origin and λ, and a rotation about the origin. The maximum angle φ is determined by the angle of parallelism, , beyond which no motion can occur.
8.3 Extension of hyperbolic trigonometry to general triangles.
8.4 Hyperbolic velocity triangle.
8.5 A Lambert quadrilateral in velocity space consisting of three right-angles and one acute angle.
8.6 A Lambert quadrilateral comprised of complementary segments where the ‘fourth vertex’ is an ideal point.
9.1 The Sagnac Interferometer as originally depicted in his 1913 article.
9.2 Disc cut out of hemisphere at an angle ϑ.
9.3 Gamow’s [62] depiction of Einstein’s gedanken experiment showing the equivalence between acceleration and gravity.
9.4 The angle of parallelism between two bounding parallels connected by the geodesic curve γ.
9.5 Geometric characterization of the metric density.
9.6 Geometric set-up for stellar aberration.
9.7 Fokker’s [65] visualization of fitting errors when objects are placed on curved surfaces. The left and right sides correspond to negative and positive curvature, respectively.
9.8 Hyperbolic right triangle inscribed in a unit disc.
9.9 Interpretation of the variables of the two metrics which are the radii of the elliptic plane.
9.10 The three possible scenarios of closed, flat and open universes. The freckles are the galaxies which are more or less evenly distributed.
9.11 The fates of the universe.
10.1 A segment H of a horocycle with center at infinity with angles of parallelism.
10.2 Angle of parallelism with transversal perpendicular to one of the parallel lines.
10.3 Poincaré’s projections of the Beltrami model vertically into the southern hemisphere and stereographically back onto the equator.
10.4 Klein model where vertical sections of the hemisphere are projected into straight lines. Geodesics retain their straightness at the cost of not being conformal.
10.5 Radiation falling obliquely on a mirror of length AB.
10.6 The Poincaré half-plane model of measuring distances.
11.1 Spherical right triangle for scheme (II).
11.2 Hyperbolic right triangle related to the scheme (III).
11.3 Weak β-decay of the neutron. In Fermi’s theory this occurs at a single point where the emission of an electron-antineutrino pair is analogous to electromagnetic photon emission.
11.4 The decay of polarized cobalt.
11.5 The decay plane of cobalt 60.
11.6 The spherical coordinates used to describe the orientation of spin.
11.7 The Poincaré sphere is the parametrization of the Stokes parameters in elliptic geometry.
11.8 The polarization ellipse swept out by the electric field vector which is enclosed by a rectangle of sides 2a and 2b. The transformation to new electric vector components Ex and Ey consists in a counter-clockwise rotation about the angle ψ.
11.9 Complex plane representation of polarized states.
11.10 Stereographic projection of the complex plane onto the Poincaré sphere.
11.11 The scattering of a neutrino and antineutrino emits a Z0 boson which decays into W bosons.
11.12 V and S interactions rotate toward one another as the electron velocity decreases.
11.13 A short vertical antenna.
11.14 The configuration of electric and magnetic fields on the surface of a sphere. P is Poynting’s vector showing the direction of radiation. In any small portion, a spherical wave cannot be distinguished from a plane wave.
11.15 The polar plots of the spherical harmonics. Maxwell’s equations prohibit the middle radiation pattern.
11.16 The diagrams of the original and deformed paths of integration with the pole at r = ∞ as if it were at a finite distance.
11.17 A right-spherical triangle.
11.18 A right-spherical triangle traced out by an orbiting electron.
11.19 Zeeman splitting: light path parallel (perpendicular) to field results in a doublet (triplet).
11.20 The conventional explanation of the Lamb shift as the shielding of the electron’s charge by virtual electron-positron pairs that are produced by the vacuum when acted upon by an electric field.
11.21 Splitting of energy levels of a hydrogen-like atom (not drawn to scale). All shifts are left-hand elliptical polarizations.
Chapter 1
Introduction
Planck made two great discoveries in his lifetime: the energy quantum and Einstein [Miller 81]
1.1 Einstein’s Impact on Twentieth Century Physics
When one mentions the word ‘relativity’ the name Albert Einstein springs to mind. So it is quite natural to ask what was Einstein’s contribution to the theory of relativity, in particular, and to twentieth century physics, in general. Biographers and historians of science go to great lengths to rewrite history. Undoubtedly, Abraham Pais’s [82] book, Subtle is the Lord, is the definitive biography of Einstein; it attempts to go beneath the surface and gives mathematical details of his achievements. A case in point, which will serve only for illustration, is the photoelectric effect. Pais tells us that Einstein proposed $E_{\max} = h\nu - P$, where ν is the frequency of the incident (monochromatic) radiation and P is the work function, the energy needed for an electron to escape the surface. He pointed out that [this equation] explains Lenard’s observation of the light intensity independence of the electron energy. Pais then goes on to say that first E [sic $E_{\max}$] should vary linearly with ν. Second, the slope of the (E, ν) plot is a universal constant, independent of the nature of the irradiated material. Third, the value of the slope was predicted to be Planck’s constant determined from the radiation law. None of this was known then.
This gives the impression that Einstein singlehandedly discovered the photoelectric law. This is certainly inaccurate. Just listen to what J. J. Thomson [28] had to say on the subject:

It was at first uncertain whether the energy or the velocity was a linear function of the frequency. . . Hughes, and Richardson and Compton were however able to show that the former law was correct. . . The relation between maximum energy and the frequency can be written in the form $\tfrac{1}{2}mv^2 = k\nu - V_0 e$, where $V_0$ is a potential characteristic of the substance. Einstein suggested that k was equal to h, Planck’s constant. [italics added]
Pais asks “What about the variation of the photoelectron energy with light frequency? One increases with the other; nothing more was known in 1905.” So it is not true that “At the time Einstein proposed his heuristic principle, no one knew how E depended on ν beyond the fact that one increases with the other.” . . . And this was the reason for Einstein’s Nobel Prize.
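The three properties that Pais lists for the relation $E_{\max} = h\nu - P$ (linearity in ν, a slope that does not depend on the material, and a slope equal to Planck’s constant) can be illustrated numerically. The following is only a minimal sketch and is not taken from the text: the function names, the two sample frequencies and the two work functions are hypothetical values chosen for illustration, and h is taken at its modern SI value.

```python
# Illustrative sketch only: names and sample values are assumptions, not from the text.
PLANCK_H = 6.62607015e-34        # Planck's constant in J s (SI defining value)
ELECTRON_VOLT = 1.602176634e-19  # one electron volt in joules


def max_kinetic_energy(frequency_hz, work_function_j):
    """Maximum photoelectron energy E_max = h*nu - P, in joules."""
    return PLANCK_H * frequency_hz - work_function_j


def slope(frequency_1, frequency_2, work_function_j):
    """Slope of the (E_max, nu) line between two frequencies."""
    e1 = max_kinetic_energy(frequency_1, work_function_j)
    e2 = max_kinetic_energy(frequency_2, work_function_j)
    return (e2 - e1) / (frequency_2 - frequency_1)


# Two hypothetical materials with different work functions P.
work_function_a = 2.3 * ELECTRON_VOLT
work_function_b = 4.5 * ELECTRON_VOLT

slope_a = slope(1.2e15, 1.5e15, work_function_a)
slope_b = slope(1.2e15, 1.5e15, work_function_b)

# Both slopes agree with h up to rounding, whatever the value of P.
print(slope_a, slope_b, PLANCK_H)
```

The work function only shifts the line vertically, which is why it drops out of the slope. Thomson’s form of the law, with an undetermined k in place of h, would pass the first two checks but not the third.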
1.1.1 The author(s) of relativity
Referring to the second edition of Edmund Whittaker’s book, History of the Theory of Relativity, Pais writes:

Forty years later, a revised edition of this book came out. At that time Whittaker also published a second volume dealing with the period from 1910 to 1926. His treatment of the special theory of relativity in the latter volume shows how well the author’s lack of physical insight matches his ignorance of the literature. I would have refrained from commenting on his treatment of special relativity were it not for the fact that his book has raised questions in many minds about the priorities in the discovery of this theory. Whittaker’s opinion on this point is best conveyed by the title of his chapter on this subject: ‘The Relativity Theory of Poincaré and Lorentz.’
Whittaker ignited the priority debate by saying:

In the autumn of the same year, in the same volume of the Annalen der Physik as his paper on Brownian motion, Einstein published a paper which set forth the relativity theory of Poincaré and Lorentz with some amplifications, and which attracted much attention. He asserted as a fundamental principle the constancy of the speed of light, i.e. that the velocity of light in vacuo is the same for all systems of reference which are moving relatively to each other: the assertion which at the time was widely accepted, but has been severely criticized by later writers. In this paper Einstein gave the modifications which must now be introduced into the formulae for aberration and the Doppler effect.
Except for the ‘severe criticism,’ which we shall address in Sec. 4.2.1, Whittaker’s appraisal is balanced. Pais’s criticism that “as late as 1909 Poincaré did not know that the contraction of rods is a consequence of the two Einstein postulates,” and that “Poincaré therefore did not understand one of the most basic traits of special relativity” is an attempt to discredit Poincaré in favor of Einstein. In fact, there have been conscientious attempts at demonstrating Poincaré’s ignorance of special relativity.
The stalwarts of Einstein, Gerald Holton [88] and Arthur Miller [81], have been joined by John Norton [04] and Michel Janssen [02]. There has been growing support for Poincaré from the French, Jules Leveugle [94] and Christian Marchal, and from Anatoly Logunov [01], a member of the Russian Academy of Sciences. It is, however, the general consensus that Poincaré arrived at the two postulates first, by at least ten years, but that “he did not fully appreciate the status of both postulates” [Goldberg 67]. Appreciation is fully in the mind of the beholder. There is a similar debate about who ‘discovered’ general relativity: was it Einstein or David Hilbert? These debates make sense if the theories are correct, unique and compelling, and most of all the results they bear. In this book we will argue that they are not unique. It is also very dangerous when historians of science enter the fray, for they have no means of judging the correctness of the theories. However, since it makes interesting reading we will indulge and present the pros and cons of each camp.

Why then all the appeal for Einstein’s special theory of relativity? Probably because the two predictions of the theory were found to have practical applications to everyday life. The slowing down of clocks as a result of motion should also apply to all other physical, chemical and biological phenomena. The apparently inescapable conclusions that a twin who goes on a space trip at a speed near that of light returns to earth to find his twin has aged more than he has, and the decrease in frequency of an atomic oscillator on a moving body with the increase in mass on the moving body which is converted into radiation, all have resulted in paradoxes. All this means that the physics of the problems has yet to be understood.
Just listen to the words of the eminent physicist Victor Weisskopf [60]: We all believe that, according to special relativity, an object in motion appears to be contracted in the direction of motion by a factor $[1 - (v/c)^2]^{1/2}$. A passenger in a fast space ship, looking out the window, so it seemed to us, would see spherical objects contracted into ellipsoids.
Commenting on James Terrell’s paper on the “Invisibility of the Lorentz contraction” in 1960, Weisskopf concludes: . . . is most remarkable that these simple and important facts of the relativistic appearance of objects have not been noticed for 55 years.
It is well to recognize that what appears to be a firmly established phenomenon keeps popping up in different guises. It is the same type of remarks that the space contraction is a ‘psychological’ state of mind, and not a ‘real’ physical effect, that prompted Einstein to reply:

The question of whether the Lorentz contraction is real or not is misleading. It is not ‘real’ insofar as it does not exist for an observer moving with the object.
Here, Einstein definitely committed himself to the ‘reality’ of the Lorentz contraction.
1.1.1.1 Einstein’s retraction of these two postulates and the existence of the aether
The cornerstones of relativity are the equivalence of all inertial frames and the constancy of the speed of light in all directions in vacuo. These postulates were also those of Poincaré, who uttered them at least seven years prior to Einstein. So what makes Einstein’s postulates superior to those of Poincaré? Stanley Goldberg [67] and Arthur Miller [73] tell us that Poincaré’s [04] statements

the laws of physical phenomena must be the same for a stationary observer as for an observer carried along in a uniform motion of translation; so that we have not and cannot have any means of discerning whether or not we are carried along in such a motion,
and no velocity can surpass that of light,
were elevated to “a priori postulates” [Goldberg 67] which “stood at the head of his theory.” These postulates also carry the name of Einstein. Why then would Einstein ever think of retracting them? If time dilatation and space contraction due to motion are actual processes then there is no symmetry between observers in different inertial frames. The first postulate of relativity is therefore violated [Essen 71]. Einstein used gedanken experiments, which is an oxymoron. Consider what Einstein [16] has to say about a pair of local observers on a rotating disc:

By a familiar result of the special theory of relativity the clock at the circumference, judged by K, goes more slowly than the other because the former is in motion and the latter is at rest. An observer at the common origin of coordinates capable of observing the clock at the circumference by means of light would therefore see it lagging behind the clock beside him. As he will not make up his mind to let the velocity of light along the path in question depend explicitly on the time, he will interpret his observations as showing that the clock at the circumference ‘really’ goes more slowly than the clock at the origin.
First, the uniformly rotating disc is not an inertial system so the special theory does not apply. Second, local observers cannot discern any changes to their clocks or rulers as to where they are on the disc because they shrink or expand with them. It is only to us Euclideans that these variations are perceptible.

If the velocity of light is independent of the velocity of its source, how then can a light signal sent to an observer moving at velocity v travel at c + v on its outward journey and at c − v on its return? Although this violates the second postulate, such assertions appear in the expression for the elapsed time of sending out a light signal from one point to another and back again in the Michelson–Morley experiment whose null result they hope to explain. They also appear alongside Einstein’s relativistic velocity composition law in his famous 1905 paper “On the Electrodynamics of Moving Bodies.” Also in that paper is his ‘definition’ of the velocity of light as the ratio of “light path” to the “time interval.” But we are not allowed to measure the path of the light ray and determine the time it took, for c has been elevated to a universal constant! “How can two units of measurement be made constant by definition?” Essen queries.

In his first attempt to explain the bending of rays in a gravitational field, Einstein [11] claims:

For measuring time at a place which, relative to the origin of the coordinates, has a gravitation potential $\Phi$, we must employ a clock which, when removed to the origin of coordinates, goes $(1 + \Phi/c^2)$ times more slowly than the clock used for measuring time at the origin of coordinates. If we call the velocity of light at the origin of coordinates $c_0$, then the velocity of light c at a place with the gravitational potential $\Phi$ will be given by the relation

$$c = c_0\left(1 + \frac{\Phi}{c^2}\right).$$

The principle of the constancy of the velocity of light holds good according to this theory in a different form from that which usually underlies the ordinary theory of light. [italics added]
On the contrary, this violates the second postulate, which makes no reference to either inertial or non-inertial frames. And is his equation a cubic equation for determining c?
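The question about the cubic can be made concrete by clearing the denominator in Einstein’s relation. The lines below are only a short worked step, assuming (as in the quotation) that $\Phi$ is the gravitational potential and $c_0$ the velocity of light at the origin of coordinates. The last line is an interpretation added here, not a statement from the text: it is what results if $c^2$ on the right is replaced by $c_0^2$, which is valid only to first order in $\Phi/c_0^2$.

```latex
\begin{align}
c &= c_0\left(1 + \frac{\Phi}{c^{2}}\right) \\
\Longrightarrow\quad c^{3} - c_0\,c^{2} - c_0\,\Phi &= 0 \\
c &\simeq c_0\left(1 + \frac{\Phi}{c_0^{2}}\right)
\quad \text{(assumed: first order in } \Phi/c_0^{2}\text{)}
\end{align}
```

Read literally, the relation is therefore a cubic in c for given $c_0$ and $\Phi$; it becomes an explicit formula for c only under the first-order reading.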
It did not take Max Abraham [12] long to point this out, stating that Einstein had given “the death blow to relativity,” by retracting the invariance of c. Abraham said he warned “repeatedly against the siren song of this theory. . . [and] that its originator has now convinced himself of its untenability.” What Abraham objected to most was that even if relativity could be salvaged, at least in part, it could never provide a “complete world picture,” because it excludes, by its very nature, gravity.

Einstein also uses the same Doppler expression for the frequency shift. The Doppler shift is caused by the motion of the source with respect to the observer. “There is, therefore, no logical reason why it should be caused by the gravitational potential, which is assumed to be equivalent to the acceleration times distance” [Essen 71]. Thus Einstein is proposing another mechanism for the shift of spectral lines that employs accelerative motion rather than the relative motion of source and receiver. Does the acceleration of a locomotive cause a shift in the frequency of its whistle? Or is it due to its velocity with respect to an observer on a stationary platform? But no, Einstein has replaced the product of acceleration and distance with the gravitational potential, which is static! Just where a clock is in a gravitational field will change its frequency. This is a shift caused by neither velocity nor acceleration.

Everyone would agree that Einstein removed the aether. Whereas Hertz considered the aether to be dragged along with the motion of a body, Lorentz considered the aether to be immobile, a reference frame for an observer truly at rest. On the occasion of a visit to Leyden in 1920, Einstein [22a] had this to say about the aether:

. . . the whole change in the conception of the aether which the special theory of relativity brought about, consisted in taking away from the aether its last mechanical quality, namely, its immobility. . . .
according to the general theory of relativity space is endowed with physical qualities; in this sense, therefore, there exists an aether. . . . space without aether is unthinkable; for in such a space there not only would be no propagation of light, but also no possibility of the existence for standards of space and time (measuring rods and clocks), nor therefore any space time intervals in the physical sense. But this aether may not be thought of as endowed with the quality characteristic of ponderable media, as consisting of parts which may be tracked through time. The idea of motion may not be applied to it.
Essentially what Einstein is saying is that what was not good for special relativity is good for general relativity, for “We know that [the new aether] determines the metrical relations in the space-time continuum.” How is it needed for the propagation of light signals and yet has not the characteristics of a medium? Einstein’s real problem is with rotations, for “Newton might no less well have called his absolute space ‘aether;’ what is essential is merely that besides observable objects, another thing, which is not perceptible, must be looked upon as real, to enable acceleration or rotation to be looked upon as something real.” This is five years after Einstein’s formulation of general relativity, and his desire is to unite the gravitational and electromagnetic fields into “one unified conformation” that would enable “the contrast between aether and matter [to] fade away, and, through the general theory of relativity, the whole of physics would become a complete system of thought.” The search for that utopia was to occupy Einstein for the remainder of his life.
1.1.1.2 Which mass?
In Lorentz’s theory two masses result depending on how Newton’s law is expressed, i.e.

$$F = \frac{d}{dt}(mv),$$

or F = ma, where a is the acceleration. Both forms of the force law coincide when the mass is independent of the velocity, but not so when it is a function of the velocity. If the force is perpendicular to the velocity there results the transverse mass,

$$m_t = \frac{m_0}{\sqrt{1 - \beta^2}},$$

while if parallel to the velocity there results the longitudinal mass,

$$m_l = \frac{m_0}{(1 - \beta^2)^{3/2}}.$$
While it is true that a larger force is required to produce an acceleration in the direction of the motion than when it is perpendicular to the motion, it “is unfortunate that the concept of two masses was ever developed, for the [second] form of Newton’s law is now recognized as the correct one” [Stranathan 42].
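The origin of the two masses can be checked numerically. The sketch below is not from the text: it assumes the momentum $p = m_0 v/\sqrt{1 - \beta^2}$ with $\beta = v/c$, works in units where c = 1, and uses hypothetical function names. A force perpendicular to the velocity leaves the speed unchanged, so the coefficient of the acceleration is simply $m_t$; a force parallel to the velocity changes the speed, so the coefficient is $dp/dv$, which the code estimates by a central difference and compares with the closed formula for $m_l$.

```python
import math


def lorentz_factor(beta):
    """Return 1/sqrt(1 - beta^2) for beta = v/c with |beta| < 1."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def transverse_mass(m0, beta):
    """m_t = m0 / sqrt(1 - beta^2): force perpendicular to the velocity."""
    return m0 * lorentz_factor(beta)


def longitudinal_mass(m0, beta):
    """m_l = m0 / (1 - beta^2)^(3/2): force parallel to the velocity."""
    return m0 * lorentz_factor(beta) ** 3


def momentum(m0, beta):
    """Assumed momentum m_t * v, in units where c = 1 (so v = beta)."""
    return transverse_mass(m0, beta) * beta


def longitudinal_mass_numeric(m0, beta, step=1e-6):
    """Central-difference estimate of dp/dv, the response to a parallel force."""
    return (momentum(m0, beta + step) - momentum(m0, beta - step)) / (2.0 * step)


m0, beta = 1.0, 0.6  # hypothetical rest mass and speed, chosen for illustration
print(transverse_mass(m0, beta))            # m_t from the closed formula
print(longitudinal_mass(m0, beta))          # m_l from the closed formula
print(longitudinal_mass_numeric(m0, beta))  # m_l from differentiating the momentum
```

Under these assumptions the ratio $m_l/m_t$ is $1/(1 - \beta^2)$, so the two masses coincide only in the limit of small velocities.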
In the early days of relativity the relativistic mass was written $m = \tfrac{4}{3}E/c^2$ and not $m = E/c^2$. Einstein was aloof to the factor of 4/3, which was a consequence of the Lorentz transform on energy, but not to there being two masses. According to Einstein [05] “with a different definition of the force and acceleration we would obtain different numerical values for the masses; this shows that we must proceed with great caution when comparing different theories of the motion of the electron.” Apart from ‘numerical’ differences, Kaufmann’s experiments identified the mass as the transverse mass, but this did not prevent Einstein [06a] from proposing an experimental method to determine the ratio of the transverse to the longitudinal mass. According to Einstein the ratio of the transverse to longitudinal mass would be given by the ratio of the electric force, eE, to the potential, V, “at which the shadow-forming rays get deflected,” i.e.

$$\frac{m_t}{m_l} = \frac{\rho}{2}\,\frac{E_x}{V},$$

where ρ is the radius of curvature of the shadow-forming rays and $E_x$ is the electric field in the x-direction. As the ‘definition’ of the longitudinal mass, $m_l$, Einstein takes

$$\text{kinetic energy} = \tfrac{1}{2}\, m_l v^2.$$

It would be very difficult for Einstein to get this energy as a nonrelativistic approximation of a relativistic expression for the kinetic energy. Einstein’s contention that

A change of trajectory evidently is produced by a proportional change of the field only at electron velocities at which the ratio of the transverse to longitudinal mass is noticeably different from unity

is at odds with his assumption of the validity of the equation of motion,

$$m_0 \frac{d^2 x}{dt^2} = -eE_x,$$

which holds “if the square of the velocity of the electrons is very small compared to the square of the velocity of light.” The mass of the electron $m_0$ is not specified as to whether it is the transverse or longitudinal mass, or a combination of the two.
This example shows that Einstein was not as attached to his relativity theory as he is made out to be. Why is it that the same types of contradictions and incertitudes found in Poincaré’s statements are used as proof of his limitations as a physicist, while there is never mention of them in Einstein’s case?
1.1.1.3 Conspiracy theories
In order to defend the supremacy of German science, David Hilbert, with the help of Hermann Minkowski and Emil Wiechert, set out to deny Poincaré the authorship of relativity. Hilbert was the last in a long line of illustrious Göttingen mathematicians who sought to retain the dominance of the University which boasted of the likes of Carl Friedrich Gauss, Bernhard Riemann and Felix Klein. Whereas there existed a friendly competition between Poincaré and Felix Klein, Hilbert’s predecessor [Stillwell 89], there was jealousy between Hilbert and Poincaré, which was only exacerbated when Poincaré won the Bolyai prize in mathematics for the year 1905. Ironic as it may be, János Bolyai was the co-inventor of hyperbolic geometry, and the rivalry between Klein and Poincaré had to do with the development of that geometry.

As the story goes, Arnold Sommerfeld [04], an ex-assistant of Klein’s, Gustav Herglotz and Wiechert were working on superluminal electrons during the fall of 1904 through the spring of 1905. In the summer months of 1905, beginning on the notorious date of the 5th of June, the Göttingen mathematicians organized seminars on the ‘theory of electrons,’ in which there was a session on superluminal electrons chaired by Wiechert on the 24th of July. The date of the 5th of June coincided with Poincaré’s [05] presentation of his paper, “Sur la dynamique de l’électron,” to the French Academy of Sciences. The printed paper was published and sent out to all correspondents of the Academy that Friday, the 9th of June. The earliest it could have arrived in Göttingen was Saturday the 10th, or given postal delays it would have arrived no later than the following Tuesday, the 13th of June.ᵃ In that paper Poincaré supposedly declared that no material body can go faster than the velocity of light in vacuum, and this threw a wrench into the works of the Göttingen school [Marchal]. However, this is nothing different than what Poincaré [98] had been saying since 1898 when he postulated the invariance of the velocity of light in vacuo to all observers, whether they are stationary or in motion. Or, to what Poincaré reiterated in 1904: “from these results, if they are confirmed would arise a new mechanics [in which] no velocity could surpass that of light.” So the 5th of June publication date, all-important to the proponents of the conspiracy theory [Leveugle 04], is a red herring, for the paper said only what he had said before on the limiting velocity of light. Moreover, there was a continual boycott of Poincaré’s relativity work in such prestigious German journals as Annalen der Physik. Consequently, there was no contingency for the appearance of Einstein’s paper when it did. But let us continue.

ᵃ These dates are reasonable since the other German physics bi-monthly journal, Fortschritte der Physik, had a synopsis of the Poincaré paper in its 30th of June issue. Given the publication delay, it would make the 10th of June arrival date of the Comptes Rendus issue more likely.

So the plot was hatched that some German, of minor importance and one who was willing to take the risks of plagiarism, had to be found who would reproduce Poincaré’s results without his name. Now Minkowski knew of Einstein since he had been his student at the ETHᵇ from 1896–1900. Einstein was also in contact with Planck, since Einstein’s summary of the work appearing in other journals for the Beiblätter zu der Annalen der Physik earned him a small income. In fact, there is one review by Einstein of a paper by A. Ponsot, “Heat in the displacement of the equilibrium of a capillary system,” that appeared in the Comptes Rendus 140 just 325 pages before Poincaré’s June 5th paper. To make matters worse, an article by Weiss, which appeared in the same issue of Comptes Rendus, was summarized in the November issue of the Supplement, but not Poincaré’s paper.
Neither that paper nor its longer extension, published in the Rendiconti del Circolo Matematico di Palermo [06], was ever summarized in the Beiblätter. Surely, these papers would have caught the eye of Planck, who was running the Annalen and was known to be in correspondence with Einstein, not only in this connection but also with regard to questions on quanta. Einstein had also published some papers on the foundations
b The Eidgenössische Technische Hochschule (ETH) was then known as the Eidgenössische Polytechnikum; the name was officially changed in 1911.
of thermodynamics during the years 1902–1903 in the Annalen whose similarity with those of J. Willard Gibbs was “quite amazing” even to Max Born [51]. Thus, the relativity paper was supposedly prepared by the Göttingen mathematicians and signed by Einstein, who submitted it for publication at the end of June; it arrived at the offices of the Annalen on the 30th of June. Einstein was an outsider, being considered a thermodynamicist, with a lot to gain and little, if anything, to lose. The paper fails to mention either Lorentz or Poincaré and, for that matter, contains no references at all. If there was a referee for the paper,^c other than Planck himself, it would have been obvious to him that the transformation of the electrodynamic quantities went under the name of Lorentz, with Lorentz’s parameter k(v) replaced by Einstein’s ϕ(v), both ultimately set equal to 1, and that the relativistic addition law had already been written down by Poincaré as a consequence of the Lorentz transformation in his 1905 paper “Sur la dynamique de l’électron.” Although Einstein derives the relativistic composition law in the same way as Poincaré, he provides a new generalization to the case where the Lorentz transformations being composed are in different planes, for that also involves rotations. It has been claimed that there was no connection between Lorentz and Einstein, for Einstein gets the wrong expression for the transverse mass in his “Electrodynamics of moving bodies,” while Lorentz errs when he subjects the electric current to a Lorentz transformation [Ohanian 08]. But it is clear from his method of derivation from the Lorentz force that Einstein’s error was a typo. Einstein’s paper appeared in the 26th of September issue of the Annalen, and Planck lost no time in organizing a symposium on his paper that November, which, in the words of von Laue, was “unforgettable.” Not all is conjecture; certain things are known.
First, Poincaré worked in friendly competition with Klein in studying universal coverings of surfaces. What initiated Poincaré on his studies of hyperbolic geometry was
c Apparently the paper was handled by Wilhelm Röntgen, a member of the Kuratorium of the Annalen, who gave it to his young Russian assistant, Abraham Joffe [Auffray 99]. Joffe noted that the author was known to the Annalen, and recommended publication. That an experimental physicist should have handled the paper, and not Planck, the only theoretician on the Kuratorium, would have made such a refereeing procedure extremely dubious.
an 1882 letter from Klein to Poincaré, which informed him of previous work by Schwarz. Second, it was Klein who brought Hilbert to Göttingen. When criticized about his choice, Klein responded “I want the most difficult of all.” Third, Klein was known to pass on important letters and scientific material to Hilbert. Fourth, since Klein and Poincaré were on good terms and in contact, it is unthinkable that Klein did not know of Poincaré’s work on relativity, and Klein would have passed this on to Hilbert. Fifth, there was a lack of “kindred spirit” [Gray 07] between Poincaré and Hilbert from their first meeting in Paris in 1885. Sixth, Poincaré was “unusually open about his sources” [Gray 07] and non-polemical, while Hilbert had a tremendous will and thought every problem was solvable. Lastly, Poincaré’s work on relativity was actively boycotted in Germany, and later in France thanks to Paul Langevin. Thus, it is unthinkable that Hilbert was in the dark about relativity theory prior to 1905. His colleague, Minkowski, became interested in electrodynamics through reading Lorentz’s papers. According to C. Reid, in “Hilbert,” Hilbert conducted a joint seminar with Minkowski. A year after their study, in 1905, they decided to dedicate the seminar to a topic in physics: the electrodynamics of moving bodies. Hilbert was often quoted as saying “physics is too important to be left to the physicists.” What is truly unbelievable is that the discoverer of relativity and of two models of hyperbolic geometry would not even once think there was a relation between the two. Everything else is conjecture, even Einstein’s supposed receipt of the latest issue of Volume CXL of Comptes Rendus, in his capacity as a reviewer for the Beiblätter, on Monday the 12th of June in the Berne Patent Office. Undoubtedly, that would have created a dire urgency to finish his article on the electrodynamics of a moving body [Auffray 99].
But wherever the real truth may lie, there cannot be any doubt that Planck played a decisive role in Einstein’s rise to fame. The behavior of Langevin toward a fellow countryman is even more baffling when we realize that he was the first French physicist to learn of the “new mechanics” of Poincaré, which would later be known as relativity, but without the name of its author. Langevin had accompanied Poincaré to the Saint-Louis Congress of 1904, where Poincaré presented his principle of relativity. It is hardly admissible that Langevin was not familiar with all of Poincaré’s publications, especially when Poincaré [06] dedicated a whole section of
his 1906 article in the Rendiconti to him, entitling it “Langevin Waves,” and stating:
Langevin has put forth a particularly elegant formulation of the formulas which define the electromagnetic field produced by the motion of a single electron.
Yet, in his obituary of Poincaré, Langevin fails to note Poincaré’s priority over Einstein, writing:
Einstein has rendered the things clearer by underlining the new notions of space and time which correspond to a group totally different than the conserved transformations of rational mechanics, and asserting the generality of the principle of relativity and admitting that no experimental procedure could ascertain the translational movement of a system by measurements made on its interior. He has succeeded in giving definitive form to the Lorentz group and has indicated the relations that exist between the same quantity simultaneously made on each of two systems in relative movement. Henri Poincaré arrived at the same equations in the same time following a different route, his attention being directed to the imperfect form which the formulas for the transformation had been given by Lorentz. Familiar with the theory of groups, he was preoccupied to find the invariants of the transformation, elements which are unaltered and thanks to which it is possible to pronounce all the laws of physics in a form independent of the reference system; he sought the form that these laws must have in order to satisfy the principle of relativity.
This could not have appeared in a more appropriate place: Revue de Métaphysique et de Morale! Another priority feud also erupted between Einstein and Hilbert over general relativity in November 1915. It ended with the publication of papers with the unpretentious titles of “The foundation of the general theory of relativity,” by Einstein, and “The foundations of physics,” by Hilbert. Historians of science make Einstein’s theory the ultimate theory of gravitation with titles like “How Einstein found his field equations” [Norton 84], and “Lost in the tensors: Einstein’s struggle with covariance principles” [Earman & Glymour 78]. In the opinion of O’Rahilly [38], “Einstein’s theory, which delights every aesthetically minded mathematician, is a much less grandiose affair as judged and assessed by the physicist.” He points out that Walther Ritz arrived at a prediction of a perihelion advance of the planets in 1908. In Sec. 3.8.2 we will use his force equation to show that he could have obtained the other predictions of general relativity. Furthermore, the same experimental tests of these equations can be obtained with far more simplicity, as we shall see in Chapter 7. The proponents of the conspiracy
theory claim that Einstein’s conciliatory letter of December to Hilbert may be due, in part, to the favor that Hilbert did for him ten years earlier. The defenders of Einstein belittle Poincaré for his “lack of insight into certain aspects of the physics involved” [Goldberg 67]. The same can be said of Einstein; in a much quoted letter to Carl Seelig on the occasion of the 50th ‘anniversary’ of relativity, Einstein writes:
The new feature was the realization of the fact that the bearing of the Lorentz transformations transcended their connection with Maxwell’s equations and was concerned with the nature of space and time in general. A further result was that the Lorentz invariance is a general condition for any physical theory. This was for me of particular importance because I had already previously found that Maxwell’s theory did not account for the micro-structure of radiation and could therefore have no general validity.
In a letter to von Laue in 1952, Einstein elaborated what he meant by a “second type” of radiation pressure: one has to assume that there exists a second type of radiation pressure, not derivable from Maxwell’s theory, corresponding to the assumption that radiation energy consists of indivisible point-like localized quanta of energy hν (and of momentum hν/c, c = velocity of light), which are reflected undivided. The way of looking at the problem showed in a drastic and direct way that a type of immediate reality has to be ascribed to Planck’s quanta, that radiation, must, therefore, possess a kind of molecular structure as far as energy is concerned, which of course contradicts Maxwell’s theory.
Maxwell’s equations together with the Lorentz force satisfy the Lorentz transform so it is difficult to see that the transformation is more general than what it transforms. In addition, the discovery of Planck’s radiation law did not contradict the Stefan–Boltzmann radiation law, nor provide a new type of radiation. Here, Einstein is confusing macroscopic laws with the underlying microscopic processes that are entirely compatible with those laws when the former are averaged over all frequencies of radiation. Consequently, there is no second type of radiation pressure. What the conspiracy theories have in common with their opponents is the presumption that the end result is correct. What authority did Poincaré’s June paper of 1905 have for dashing the efforts of Sommerfeld’s investigations on superluminal electrons? Weber was no stranger to superluminal particles nor was Heaviside. In all the years preceding that paper, there was no authority bearing down upon them even though the mathematical structure of relativity had been set in place. What was supposedly
new about Einstein’s paper was the liberation of space and time from an electromagnetic framework, as he claimed in his letter to Seelig. But is this true?
1.1.1.4 Space-time in Einstein’s world
The conventional way of rebuffing the conspiracy theories is “to show the nature of Poincaré’s ideas and approach that prevented him from producing what Einstein achieved” [Cerf 06]. Einstein was not so unread as he would have us believe for he used Poincaré’s method — radar signaling — in discussing simultaneous events, and falls into the same trap as Poincaré did. Poincaré asks us to consider two observers, A and B, who are equipped with clocks that can be synchronized with the aid of light signals. B sends a signal to A marking down the time instant in which it is sent. A, on the other hand, resets his clock to that instant in time when he receives the signal. Poincaré realized that such a synchronization would introduce an error because it takes a time t for light to travel between B and A. That is, A’s clock would be behind B’s clock by a time t = d/c, where d is the distance between B and A. This error, according to Poincaré is easy to correct: Let A send a light signal to B. Since light travels at the same speed in both directions, B’s clock will be behind A’s by the same time t. Therefore, in order to synchronize their clocks it is necessary for A and B to take the arithmetic mean of the times arrived at in this way. This is also Einstein’s result. Certainly the definition of the velocity v = d/t seems innocuous enough. But, as Louis Essen [71] has pointed out it is possible to define the units of any two of these terms. Normally, one measures distance in meters and time in seconds so the velocity is meters per second. But making the velocity of light constant “in all directions and to all observers whether stationary or in relative motion” is tantamount to making c a unit of measurement, or what will turn out to be an absolute constant. 
According to Essen, “the definition of the unit of length or of time must be abandoned; or, to meet Einstein’s two conditions, it is convenient to abandon both units.” The two conditions that Essen is referring to are the dilatation of time and the contraction of length. There is no new physical theory, but “simply a new system of units in which c is constant” so that either time or length
or both must be a function of c such that their ratio, d/t, gives c. This is not what Louis de Broglie [51] had to say: Poincaré did not take the decisive step. He left to Einstein the glory of having perceived all the consequences of the principle of relativity and, in particular, of having clarified through a deeply searching critique of the measures of length and duration, the physical nature of the connection established between space and time by the principle of relativity.
So by elevating the velocity of light to a universal constant, Einstein implied that the geometry of relativity was no longer Euclidean. The number c is an absolute constant for hyperbolic geometry that depends for its value on the choice of the unit of measurement. To the local observers there is no such thing as time dilatation or length contraction. These distortions are due to our Euclidean perspective. It is all a question of ‘frame of reference.’ Poincaré, after having written down his relativistic law of the composition of velocities, should have realized that the only function which could satisfy such a law is the hyperbolic tangent, which is the straight line segment in Lobachevsky (velocity) space. Thus, time and space have no separate meaning, but only their ratio does.
Consider Einstein’s two postulates which he enunciated in 1905:
(i) The same laws of electrodynamics and optics will be valid for all frames of reference for which the equations of mechanics hold.
(ii) Light is always propagated in empty space with a definite velocity c, which is independent of the state of motion of the emitting body.
Match them against Poincaré’s first two postulates as he pronounced them in 1904:
(i) The laws of physical phenomena should be the same whether for an observer fixed, or for an observer carried along in a uniform movement of translation; so that we could not have any means of discerning whether or not we are carried along in such a motion;
(ii) Light has a constant velocity and in particular that its velocity is the same in all directions.
Now Poincaré introduces a third postulate, on which Pais makes the following comment:
The new mechanics, Poincaré said, is based on three hypotheses. The first of these is that bodies cannot attain velocities larger than the velocity of light. The second is (I use modern language) that the laws of physics shall be the same in all inertial frames. So far so good. Then Poincaré introduces a third hypothesis. ‘One needs to make still a third hypothesis, much more surprising, much more difficult to accept, one which is of much hindrance to what we are currently used to. A body in translational motion suffers a deformation in the direction in which it is displaced. . . However strange it may appear to us, one must admit that the third hypothesis is perfectly verified.’ It is evident that as late as 1909 Poincaré did not know that the contraction of rods is a consequence of the two Einstein postulates. Poincaré therefore did not understand one of the most basic traits of special relativity.
Whether or not rods contract or rotate when in motion will be discussed in Sec. 9.9, but it appears that Pais is reading much too much into what Poincaré said, as opposed to what he actually did. In Sec. 4 of “Sur la dynamique de l’électron” published in 1905, entitled “The Lorentz transformation and the principle of least action,” Poincaré shows that both time dilatation and space contraction follow directly from the Lorentz transformations. By the Lorentz transformation,
$$\delta x' = \gamma l(\delta x - \beta c\,\delta t), \qquad \delta y' = l\,\delta y, \qquad \delta z' = l\,\delta z, \qquad \delta t' = \gamma l(\delta t - \beta\,\delta x/c),$$
it follows that for measurements made on a body at the same moment, δt = 0, in an inertial system moving with a relative velocity β = v/c along the x-axis, the body undergoes contraction by a factor $\gamma^{-1}$ when viewed in the unprimed frame when we set l = 1. It is therefore very strange that Poincaré would reintroduce this as a third hypothesis when it is a consequence of Lorentz’s transformation which he accepts unreservedly. As Poincaré was prone to writing popular articles and books, he may have thought that the contraction of rods was sure to catch the imagination of the layman. The problem is in the interpretation of what is meant by the second postulate regarding the constancy of light, which is usually interpreted as the velocity of light relative to an observer, whether he be stationary or moving at a velocity v. Thus, instead of obtaining values c + v or c − v for the velocity of light, for an observer moving at ±v relative to the source, one would always ‘measure’ c. A frequency would therefore not undergo a Doppler shift, contrary to what occurs.
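The contraction and dilatation factors read off from the Lorentz transformation above can be checked numerically. The following is a minimal Python sketch and not part of the original argument: the helper name `lorentz_interval` and the sample value β = 0.6 are assumptions made purely for illustration, and the scale factor l is set to 1 as in the text.

```python
import math

def lorentz_interval(dx, dt, beta, c=1.0, l=1.0):
    """Map an interval (dx, dt) to the primed frame moving at v = beta * c along x.

    Hypothetical helper for illustration; l is the scale factor that the text sets to 1.
    """
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    dx_p = gamma * l * (dx - beta * c * dt)
    dt_p = gamma * l * (dt - beta * dx / c)
    return dx_p, dt_p

beta = 0.6  # assumed sample value of v/c
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)

# Simultaneous measurement in the unprimed frame (dt = 0): dx' = gamma * dx,
# so the unprimed length is the primed one multiplied by 1/gamma.
dx_p, _ = lorentz_interval(dx=1.0, dt=0.0, beta=beta)
contraction = 1.0 / dx_p

# Clock at the origin of the moving system (dx = beta * c * dt): dt' = dt / gamma.
_, dt_p = lorentz_interval(dx=beta * 1.0, dt=1.0, beta=beta)

print(contraction, 1.0 / gamma)
print(dt_p, 1.0 / gamma)
```

Both printed pairs should agree to rounding error, which is all the sketch is meant to show: neither factor requires a separate hypothesis once the transformation is accepted.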
According to Einstein’s prescription, the time taken for a light signal to complete a ‘back-and-forth’ journey over a distance d is the arithmetic average of the two
$$t = \frac{1}{2}\,d\left(\frac{1}{c+v} + \frac{1}{c-v}\right) = d\,\frac{c}{c^2 - v^2}.$$
We are thus forced to conclude that instead of obtaining the velocity c, we get the velocity $c(1 - v^2/c^2)$, which differs from the former in the presence of a second order term, $-v^2/c^2$. Rather, if we use the relativistic velocities (c + v)/(1 + v/c) and (c − v)/(1 − v/c), we obtain
$$t = \frac{1}{2}\,d\left(\frac{1 + v/c}{c+v} + \frac{1 - v/c}{c-v}\right) = d/c,$$
and the second-order effect disappears, just as it would in the Michelson–Morley experiment [cf. Sec. 3.2]. It is not as Einstein claims: “The quotient [distance by time] is, in agreement with experience, a universal constant c, the velocity of light in empty space.” The ‘experience’ is the transmission of signals back and forth, like those envisioned by Poincaré. In this setting, the ‘principle’ of the constancy of light is untenable [Ives 51]. The velocities of light in the out and back directions, $c_o$ and $c_b$, will, in general, be different. If the distance traversed by the light signal is d, the total time for the outward and backward journey is, according to Einstein,
$$t = \frac{1}{2}\left(\frac{d}{c_o} + \frac{d}{c_b}\right) = \left(\frac{c_o + c_b}{c_o c_b}\right)\frac{d}{2}. \tag{1.1.1}$$
But, according to the principle of relativity, there should be no difference in the velocities of light in the outward and backward directions, so that this principle decrees
$$t = \frac{d}{c}. \tag{1.1.2}$$
Equating (1.1.1) and (1.1.2) yields [Ives 51]
$$\frac{(c_o + c_b)/2}{c_o c_b} = \frac{1}{c},$$
which can easily be rearranged to read:
$$\frac{c}{\sqrt{c_o c_b}} = \frac{2\sqrt{c_o c_b}}{c_o + c_b} \le 1. \tag{1.1.3}$$
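The round-trip argument leading to (1.1.3) can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: the function names and the sample values c = 1, v = 0.3, d = 1 are assumptions, not taken from the text. It compares the mean travel time obtained with the velocities c ± v against the one obtained with the relativistically composed velocities, and evaluates the ratio on the left-hand side of (1.1.3).

```python
import math

def mean_travel_time(d, c_out, c_back):
    """Arithmetic mean of the outward and backward travel times, as in (1.1.1)."""
    return 0.5 * (d / c_out + d / c_back)

def compose(u, v, c):
    """Relativistic composition of two collinear velocities u and v."""
    return (u + v) / (1.0 + u * v / c ** 2)

def mean_ratio(c_out, c_back):
    """Left-hand side of (1.1.3): harmonic mean over geometric mean of the two speeds."""
    harmonic = 2.0 * c_out * c_back / (c_out + c_back)
    return harmonic / math.sqrt(c_out * c_back)

c, v, d = 1.0, 0.3, 1.0  # assumed sample values, in units where c = 1

t_absolute = mean_travel_time(d, c + v, c - v)
t_composed = mean_travel_time(d, compose(c, v, c), compose(c, -v, c))

print(t_absolute, d * c / (c ** 2 - v ** 2))
print(t_composed, d / c)
print(mean_ratio(c + v, c - v), mean_ratio(c, c))
```

With unequal out and back speeds the ratio falls below 1, and it reaches 1 only when the two speeds coincide, which is the content of the inequality.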
The inequality in (1.1.3) follows from the arithmetic-geometric mean inequality which becomes an equality only when $c_o = c_b = c$. Thus, if there are no superluminal velocities, the latter case must hold, for if not, one of the two velocities, $c_o$ or $c_b$, must be greater than c. A similar situation occurs for the inhomogeneous dispersion equation of a wave [cf. Sec. 11.5.6], $\omega^2 = c^2\kappa^2 + \omega_0^2$, where ω and κ are the frequency and wave number, and $\omega_0$ is the critical frequency below which the wave becomes attenuated. Differentiation of the dispersion equation gives $\omega\,d\omega = c^2\kappa\,d\kappa$. Introducing the definitions of phase and group velocities, u = ω/κ and w = dω/dκ, it becomes apparent that u > c implies w < c [Brillouin 60]. Since $uw = c^2$, the equivalence of the two velocities requires the critical frequency to vanish and so restores the isotropy of space. Einstein [05] uses absolute velocities to show that two observers traveling at velocities ±v would not find that their clocks are synchronous while those at rest would declare them so. He considers light emitted at A at time $t_A$ to be reflected at B at time $t_B$ which arrives back at A at time $t'_A$. If d is the distance between A and B, the times for the outward and return journeys are
$$t_B - t_A = \frac{d}{c+v} \qquad\text{and}\qquad t'_A - t_B = \frac{d}{c-v},$$
respectively. Since these are not the same, Einstein concludes that what seems simultaneous from a position at rest is not true when in relative motion. But, in order to do so, Einstein is using absolute velocities: the velocity on the outward journey is c + v, and the velocity of the return
journey is c − v, and so violates his second postulate. If the relativistic law of the composition of velocities is used, instead, the total times for outward and return journeys become the same, which is what is found to within the limits of experimental error [Essen 71]. Einstein then attempts to associate physical phenomena with the fact that clocks in motion run slower than their stationary counterparts, and rods contract when in motion in comparison with identical rods at rest. He considers what is tantamount to the Lorentz transformations, as a rotation through an imaginary angle, θ,
$$x' = x\cosh\theta - ct\sinh\theta, \qquad ct' = ct\cosh\theta - x\sinh\theta,$$
at the origin of the system in motion so that x′ = 0. He thus obtains
$$x/t = c\tanh\theta, \qquad t' = t\sqrt{1 - v^2/c^2} = t/\cosh\theta. \tag{1.1.4}$$
He then concludes that clocks transported to a point will run slower by an amount $\tfrac{1}{2}tv^2/c^2$ with respect to stationary clocks at that point, which is valid up to second-order terms. Rather, what Einstein should have noticed is that
$$\theta = \tanh^{-1}(v/c) = \frac{1}{2}\ln\frac{1 + v/c}{1 - v/c}$$
is the relative distance in a hyperbolic velocity space whose ‘radius of curvature’ is c. Space and time have lost their separate identities, and only appear in the ratio v = x/t whose hyperbolic measure is $\theta = \bar{v}/c$. The role of c is that of an absolute constant, whose numerical value will depend on the arbitrary choice of a unit segment. By raising the velocity of light to a universal constant, Einstein implied that the space is no longer Euclidean. Euclidean geometry needs standards of length and time; in this sense Euclidean geometry is relative. In terms of meters and seconds, the speed of light is $3 \times 10^8$ m/s. If there was no Bureau of Standards we would have no way of defining what a meter or second is. Not so in Lobachevskian geometry where angles determine the sides of the triangle. In Lobachevskian geometry lengths are absolute as well as angles. The ‘radius of curvature’ c is no longer an upper limit to the velocities, but, rather, defines the unit of measurement. Lobachevskian geometries with different values of c will not be congruent. As c approaches infinity, Lobachevskian formulas go over into their Euclidean counterparts.
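The statements about the hyperbolic measure θ lend themselves to a quick numerical check. The Python sketch below is illustrative only: the function names and the sample velocities 0.5c and 0.6c are assumptions, not taken from the text. It verifies that θ is additive under the relativistic composition of collinear velocities, that the inverse hyperbolic tangent agrees with its logarithmic form, that 1/cosh θ reproduces the factor in (1.1.4), and that cθ tends to the ordinary velocity v as c grows.

```python
import math

def rapidity(v, c=1.0):
    """Hyperbolic measure theta = artanh(v / c) of the velocity v."""
    return math.atanh(v / c)

def compose(u, v, c=1.0):
    """Relativistic composition of two collinear velocities u and v."""
    return (u + v) / (1.0 + u * v / c ** 2)

u, v = 0.5, 0.6  # assumed sample velocities in units of c
theta_u, theta_v = rapidity(u), rapidity(v)

# Additivity: composing velocities corresponds to adding hyperbolic distances.
additive = math.isclose(math.tanh(theta_u + theta_v), compose(u, v))

# Logarithmic form of the inverse hyperbolic tangent.
log_form = 0.5 * math.log((1.0 + v) / (1.0 - v))
same_measure = math.isclose(theta_v, log_form)

# Factor in (1.1.4): 1 / cosh(theta) equals sqrt(1 - v^2 / c^2).
dilation = math.isclose(1.0 / math.cosh(theta_v), math.sqrt(1.0 - v ** 2))

# Euclidean limit: for very large c, c * theta approaches v.
c_big = 1.0e6
euclidean_limit = c_big * rapidity(v, c_big)

print(additive, same_measure, dilation, euclidean_limit)
```

The additivity check is the numerical counterpart of the remark that the hyperbolic tangent is the function satisfying the composition law.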
The exponential distance,
$$e^{\bar{v}/c} = \left(\frac{1 + v/c}{1 - v/c}\right)^{1/2} = \frac{\nu'}{\nu}, \tag{1.1.5}$$
is the ordinary longitudinal Doppler factor for a shift in the frequency, ν′, due to a moving source at velocity v. In the Euclidean limit, θ ≈ x/ct and (1.1.5) reduces to the usual Doppler formula [Varičak 10]: ν′ = ν(1 + v/c). It is undoubtedly for this reason that both Einstein and Planck found non-Euclidean geometries distasteful. For as Planck remarked [98]:
It need scarcely be emphasized that this new conception of the idea of time makes the most serious demands upon the capacity of abstraction and projective power of the physicist. It surpasses in boldness everything previously suggested in speculative natural phenomena and even in the philosophical theories of knowledge: non-Euclidean geometry is child’s play in comparison. And, moreover, the principle of relativity, unlike non-Euclidean geometry, which only comes seriously into consideration in pure mathematics, undoubtedly possesses a real physical significance. The revolution introduced by this principle into the physical conceptions of the world is only to be compared in extent and depth with that brought about by the introduction of the Copernican system of the universe.
Prescinding from Planck’s degrading remarks concerning non-Euclidean geometries, we can safely conclude that the distortion effects due to the spatial contraction and time dilatation of moving objects can be perceived by an observer using a Euclidean metric and clock. To local observers in hyperbolic space, there is no possible way of discerning these distortions because their rulers and clocks shrink or expand with them. All the ‘peculiar consequences’ are based on the issue of ‘frame of reference.’ What is truly tragic is that Poincaré never realized that his models of non-Euclidean geometries were pertinent to relativity. According to Arthur Miller [73]:
For a scientist of Poincaré’s talents the awareness of Lorentz’s theory should have been the impetus for the discovery of relativity. Poincaré seemed to have all the requisite concepts for a relativity theory: a discussion of the various null experiments to first and second order accuracy in v/c; a discussion of the role of the speed of light in length measurements; the correct relativistic transformation equations for the electromagnetic field and the charge density; a relativistically invariant action
principle; the correct relativistic equation for the addition of velocities; the concept of the Lorentz group; a rudimentary of the four-vector formalism and of four-dimensional space; a correct relativistic kinematics. . . [italics added]
so what went wrong? Miller claims that “his relativity was to be an inductive one with the laws of electromagnetism as the basis of all of physics.” This, according to Miller, prevented him from grasping the “universal applicability of the principle of relativity and therefore the importance of the constancy of the velocity of light in all inertial frames.” In other words, the equations are right but the deductions are wrong. One can deduce what he likes from the equations as long as it is compatible with experiment. While Miller [81] acknowledges that both Poincaré and Einstein, “simultaneously and independently,” derived the relativistic addition law for velocities, “only Einstein’s view could achieve its full potential.” He further claims that Poincaré never proved “the independence of the velocity of light from its source. . ..” These assertions have no justification at all: Poincaré did not have to prove anything, the velocity addition law negates ballistic theories. It is also not true that “Lorentz’s theory contained special hypotheses for this purpose.” No special hypotheses are needed since the velocity addition law is a direct outcome of the Lorentz transformations. Here is a clear intent to disparage Poincaré. And where is the experimental verification of Einstein’s theory as opposed to Poincaré’s? Or, maybe, Poincaré just did not go far enough? According to Scribner [64] the whole of the kinematical part of Einstein’s 1905 paper could have been rewritten in terms of aether theory. So according to him, the aether would play the role of the caloric in Carnot’s theory which, by careful use, did not invalidate his results. Carnot never ‘closed’ his cycle for that would have meant equating the heat absorbed at the hot reservoir with the heat rejected at the cold reservoir since, according to caloric theory, heat had to be conserved. 
Where Einstein puts into quotation marks “stationary” as opposed to “moving” it does not imply a physical difference because one is relative to the other. Moreover, the distinctions between “real” and “apparent” must likewise be abandoned. If there is no distinction between the two, then why should Einstein have taken exception to Varičak’s remark that Einstein’s “contraction is, so to speak, only a psychological and not a physical fact”? This brought an immediate reaction from Einstein to the effect that Varičak’s
note “must not remain unanswered because of the confusion that it could bring about.” After all these years, has the confusion abated? To condemn Lorentz and Poincaré for their belief in the aether is absurd. The aether was for them what the caloric was for Carnot. But did the caloric invalidate Carnot’s principle? And if Carnot has his principle, why does Poincaré not have his? Carnot’s principle still stands when the scaffolding of caloric theory falls. Another analogy associates Poincaré with Weber, and Einstein with Maxwell. Weber needed charges as the seat of electrical force, while Maxwell needed the aether as the medium in which his waves propagate. Maxwell’s circuital equations make no reference to charges as the carriers of electricity. Miller [73] asserts that Poincaré did not realize “in a universal relativity theory the basic role is played by the energy and momentum instead of the force.” But it was Lorentz’s force that was able to bridge Maxwell’s macroscopic field equations with the microscopic world of charges and currents. It is clear that Poincaré did not want to enter into polemics with Einstein. And Einstein, on his part, admits that his work was preceded by Poincaré. After a critical remark made by Planck on Einstein’s first derivation of $m = E/c^2$, to the effect that it is valid to first-order only, Einstein [06b] makes another attempt the following year. In this study he proposes to show that this condition is both necessary and sufficient for the law of momentum, which maintains invariant the center of gravity, citing Poincaré’s 1900 paper in the Lorentz Festschrift. He then goes on to say:
Although the elementary formal considerations to justify this assertion are already contained essentially in a paper of Poincaré, I have felt, for reasons of clarity, not to avail myself of that paper.
Even though Einstein clearly admits to Poincaré’s priority, no one seems to have taken notice of it. On July the 5th 1909, Mittag-Leffler, editor of Acta Mathematica, writes to Poincaré to solicit a paper on relativity:
You know without doubt Minkowski’s Space and Time published after his death, and also the ideas of Einstein and Lorentz on the same problem. Now, Fredholm tells me that you have reached the similar ideas before these other authors in which you
express yourself in a less philosophical, but more mathematical, manner. Would you write me a paper on this subject . . . in a comprehensible language that even the simple geometer would understand.
Poincaré never responded. Then there was Poincaré’s letter of recommendation to Weiss at the ETH, in which he describes Einstein as

one of the most original minds that I have met. I don’t dare to say that his predictions will be confirmed by experiment, insofar as it will one day be possible.
Notwithstanding, Einstein writes in November 1911 that “Poincaré was in general simply antagonistic.” Relativity was probably just a word to him, since it was he who postulated the ‘principle of relativity.’ But it is true that Poincaré looked to experimental confirmation for his principle. Be that as it may, what is truly incomprehensible is Poincaré’s lack of appreciation of the velocity addition law, for that should have put him on the track of introducing hyperbolic geometry. Then the distortions in space and time could be explained as the distortion we Euclideans observe when looking into another world governed by the axioms of hyperbolic geometry. To the end of his life, Poincaré maintained that Euclidean geometry is the stage where nature enacts her play, and it never once occurred to him that his mathematical investigations would have some role in that enactment. Now Poincaré was more than familiar with Lorentz’s contraction of electrons when they are in motion. He even added the additional, nonelectromagnetic, energy necessary to keep the charge on the surface of the electron from flying off in all directions. The contraction of bodies is likened to the inhabitants of this strange world becoming smaller and smaller as they approach the boundary. The absolute constant needed for such a geometry would be the speed of light, which would determine the radius of curvature of this world. In retrospect, it is unbelievable that Poincaré could have missed all this. It is also said that Poincaré was using the principle of relativity as a fact of nature, to be disproved if there is one experiment that can invalidate it. This is not much different from the second law of thermodynamics. In fact, when Kaufmann’s measurements of the specific charge initially tended
to favor the Abraham model of the electron [cf. Sec. 5.4.1], Poincaré [54] appears to have lost faith in his principle, for

[Kaufmann’s] experiments have given grounds to the Abraham theory. The principle of relativity may well not have been the rigorous value which has been attributed to it.
Kaufmann’s experiments were set up to discriminate between the various models proposed for the dependence of the mass of the electron on its speed. And if the Lorentz model had been found wanting, Einstein had much more to lose, since that would certainly have been the death knell of his generalization of Lorentz’s electron theory to all of matter. Einstein had this to say in his Jahrbuch [07] article:

It should also be mentioned that Abraham’s and Bucherer’s theories of the motion of the electron yield curves that are significantly closer to the observed curve than the curve obtained from the theory of relativity. However, the probability that their theories are correct is rather small, in my opinion, because their basic assumptions concerning the dimensions of the moving electron are not suggested by theoretical systems that encompass larger complexes of phenomena.
The last sentence is opaque, for what do the dimensions of a moving electron share with larger complexes of phenomena? And how are both related to Kaufmann’s deflection measurements? Einstein may not have liked Abraham’s model, but Abraham did because, according to him, it was based on common sense. It must be remembered that Lorentz’s theory of the electron was also a model. According to Born and von Laue, Abraham will be remembered for his unflinching belief in “the absolute aether, his field equations, his rigid electron just as a youth loves his first flame, whose memory no later experience can extinguish.” But how rigid could Abraham’s electron be if the electrostatic energy depended on its contraction when in motion? That is, everyone will agree that “Abraham took his electron to be a rigid spherical shell that maintained its spherical shape once set in motion. . . [yet] a sphere in the unprimed coordinate system becomes, in the primed system, an ellipsoid of revolution” [Cushing 81]. The unprimed system is related to the primed system by a dilation factor, equal to the inverse FitzGerald–Lorentz contraction, which elongates one of the axes into the major axis of the prolate ellipsoid. In the Lorentz model, one of the axes is shortened by the contraction factor so that an oblate ellipsoid results. In fact, as we shall see in Sec. 5.4.4,
the models of Abraham and Lorentz are two sides of the same coin, which are related in the same way that hyperbolic geometry is related to elliptic geometry, or a prolate ellipsoid to an oblate ellipsoid. If we take Einstein’s [Northrop 59] remark:

If you want to find out anything about theoretical physicists, about the methods they use, I advise you to stick closely to one principle: don’t listen to their words, fix your attention on their deeds.
at face value, then according to Einstein’s own admission, there is no difference between the Poincaré–Lorentz theory and his. Whether the mass comes from a specific model of an electron in motion, or from general principles which make no use of whether the particle is charged or not, they merge into the exact same formula for the dependence of mass on speed.
1.1.2 Models of the electron
At the beginning of the twentieth century several models of the electron were proposed that were subsequently put to the test by Kaufmann’s experiments involving the deflection of fast-moving electrons by electric and magnetic fields. The two prime contenders were the Abraham and Lorentz models. If the mass of the electron were of purely electromagnetic origin, the electron should fly apart because the negative charges on its surface would repel one another. There is a consensus that it was for this reason that Abraham chose a rigid model of the electron, which would not see the accumulation of charge that a deformed sphere would. Miller [81] contends that Abraham “chose a rigid electron because a deformable one would explode, owing to the enormous repulsive forces between its constituent elements of charge.” Even a spherical electron would prove unstable without some other type of binding force. In that case, “the electromagnetic foundations would be excluded from the outset,” according to Abraham. In order to calculate the electrostatic energy Abraham needed an expression for the capacitance of an ellipsoid of revolution. This he found in an 1897 paper by Searle. The last thing he had to do was to postulate a dependence of the semimajor axis of revolution upon the relative velocity β = v/c. ‘Rigid’ though the electron may be, Abraham evaluated the electrostatic energy in the primed system where a sphere of radius a turns into a cigar-shaped prolate ellipsoid with
semimajor axis $a/\sqrt{1 - \beta^2}$. So Abraham’s rigid electron was not so rigid as he might have thought, for the total electromagnetic energy he found was proportional to [Bucherer 04]:

\[ \frac{1}{2}\ln\frac{1+\beta}{1-\beta} - \beta. \]

This expression happens to be the difference between the measures of distance in hyperbolic and Euclidean velocity spaces. When the radius of curvature, c, becomes infinite, the total electromagnetic energy will vanish, and we return to Euclidean space. So Abraham’s total electromagnetic energy was a measure of the distance into hyperbolic space, which depended on the magnitude of the electron’s velocity. Abraham’s model fell into disrepute, and even Abraham abandoned it in later editions of the second volume of his Theorie der Elektrizität. However, his electron turns out to be a cigar-shaped, prolate ellipsoid when in motion, while Lorentz’s was a pancake-shaped, oblate ellipsoid. So the two models were complementary to one another, the former belonging to hyperbolic velocity space and the latter to elliptic velocity space, with the transition between the two being made by ‘inverting’ the semimajor and semiminor axes.
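As a rough numerical illustration of the expression quoted from Bucherer (a sketch only; the function names are illustrative and not taken from any of the works cited), one can check that half the logarithm is the inverse hyperbolic tangent of β, so that the expression is the excess of the hyperbolic over the Euclidean measure of the velocity, and that this excess falls off like β³/3 for small β:

```python
import math


def hyperbolic_measure(beta):
    """Hyperbolic measure of the relative velocity: (1/2) ln((1+beta)/(1-beta))."""
    return 0.5 * math.log((1.0 + beta) / (1.0 - beta))


def abraham_excess(beta):
    """Hyperbolic minus Euclidean measure of the velocity, artanh(beta) - beta."""
    return hyperbolic_measure(beta) - beta


def semimajor_axis(a, beta):
    """Semimajor axis a / sqrt(1 - beta^2) of the prolate ellipsoid."""
    return a / math.sqrt(1.0 - beta * beta)


if __name__ == "__main__":
    # Compare the excess with its leading small-beta term beta^3 / 3.
    for beta in (0.01, 0.1, 0.5, 0.9):
        print(beta, abraham_excess(beta), beta ** 3 / 3.0, semimajor_axis(1.0, beta))
```

The comparison with β³/3 is only the leading term of the series; at large β the excess grows without bound as β approaches 1.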
1.1.3 Appropriation of Lorentz’s theory of the electron by relativity
Another historian of science, Russell McCormmach [70], claims that:

Einstein recognized that not only electromagnetic concepts, but the mass and kinetic energy concepts, too, had to be changed. Entirely in keeping with his goal of finding common concepts for mechanics and electromagnetism, he deduced from the electron theory elements of a revised mechanics. In his 1905 paper he showed that all mass, charged or otherwise, varies with motion and satisfies the formulas he derived for the longitudinal and transverse masses of the electron. He also found a new kinetic energy formula applying to electrons and molecules alike. And he argued that no particle, charged or uncharged, can travel at a speed greater than that of light since otherwise its kinetic energy becomes infinite. He first derived these non-Newtonian mechanical conclusions for electrons only. He extended them from electrons to material particles on the grounds that any material particle can be turned into an electron by the addition of charge “no matter how small.” It is curious to speak of adding an indefinitely small charge, since the charge of an electron is finite. Einstein could speak this way because he was concerned solely with the “electromagnetic basis of Lorentzian electrodynamics and optics of moving bodies” [italics added].
The argument that takes us from electrodynamic mass to mass in general is the following. Kaufmann and others have deflected cathode rays by electric and magnetic fields to find the ratio of charge to mass. This ratio was found to change with velocity. If charge is invariant, then it must be the mass in the ratio that increases with the particle’s velocity. These measurements cannot be used to confirm that all the mass of the electron is electromagnetic in nature. The reason is that “Einstein’s theory of relativity shows that mass as such, regardless of its origin, must depend on the velocity in a way described by Lorentz’s formula” [italics added] [Born 62]. In a collection dedicated to Einstein, Dirac [86] observed in 1980:

In one aspect Einstein went much farther than Lorentz, Poincaré and others, namely in assuming that the Lorentz transforms should be applicable in all of physics, and not only in the case of phenomena related to electrodynamics. Any physical force, that may be introduced in the future, must be consistent with Lorentz transforms.
According to J. J. Thomson [28],

Einstein has shown that to conform with the principles of Relativity mass must vary with velocity according to the law $m_0/\sqrt{1 - v^2/c^2}$. This is a test imposed by Relativity on any theory of mass. We see that it is satisfied by the conception that the whole of the mass is electrical in origin, and this conception is the only one yet advanced which gives a physical explanation of the dependence of mass on velocity.
So this would necessarily rule out the existence of neutral matter, and, in fact, this is what Einstein [05] says when he remarks that charge “no matter how small” can be added to any ponderable body. The dependencies of mass upon motion arose from the assumption that bodies underwent contraction in the direction of their motion. This follows directly from the nature of the Lorentz transformation. From the geometry of the body one could determine the energy, W, and momentum, G, since the two are related by $dW = v\,dG$, in a single dimension. Then since $G = mv$, the expression for the increment in the energy becomes $dW = v^2\,dm + mv\,dv$.
Introducing $dW = c^2\,dm$ gives $dm/m = v\,dv/(c^2 - v^2)$, and integrating leads to

\[ m/m_0 = 1/\sqrt{1 - \beta^2}, \tag{1.1.6} \]
where $m_0$ is a constant of integration, and β = v/c, the relative velocity. Expression (1.1.6) was derived by Gilbert N. Lewis in 1908. The same proof was adopted by Philipp Lenard, a staunch anti-relativist, in his Über Aether und Uräther, who attributes it to Hasenöhrl’s [09] derivation of radiation pressure. The only verification of a dependence of mass upon velocity at that time was Kaufmann’s experiments on canal rays. Kaufmann was able to measure the ratio e/m, and assuming that the charge is constant, all the variation of this ratio must be attributed to the mass. The mass of the negative particle contains both electromagnetic and non-electromagnetic contributions. However, Lewis contended that, whatever its origin, mass remains mass, so that “it matters not what the supposed origin of this mass may be. Equation (1.1.6) should therefore be directly applicable to the experiments of Kaufmann.” But an accelerating electron radiates, and the radiative force is missing from dG. This did not trouble Lewis, and he went on to compare the observed value of the relative velocity with that calculated from (1.1.6). His results are given in the following table.
m/m0    β (observed)    β (calculated)
1       0               0
1.34    0.73            0.67
1.37    0.75            0.69
1.42    0.78            0.71
1.47    0.80            0.73
1.54    0.83            0.76
1.65    0.86            0.80
1.73    0.88            0.82
2.05    0.93            0.88
2.14    0.95            0.89
2.42    0.96            0.91
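The calculated column can be recovered by inverting (1.1.6) for β. The following is a minimal sketch, assuming the table values exactly as quoted above; the function and variable names are illustrative. The recomputed values agree with the tabulated ones to within about 0.01, which is the rounding of the table.

```python
import math

# Rows of Lewis's table as quoted above: (m/m0, beta observed, beta calculated).
LEWIS_TABLE = [
    (1.00, 0.00, 0.00),
    (1.34, 0.73, 0.67),
    (1.37, 0.75, 0.69),
    (1.42, 0.78, 0.71),
    (1.47, 0.80, 0.73),
    (1.54, 0.83, 0.76),
    (1.65, 0.86, 0.80),
    (1.73, 0.88, 0.82),
    (2.05, 0.93, 0.88),
    (2.14, 0.95, 0.89),
    (2.42, 0.96, 0.91),
]


def beta_from_mass_ratio(ratio):
    """Invert m/m0 = 1/sqrt(1 - beta^2) for beta (requires ratio >= 1)."""
    return math.sqrt(1.0 - 1.0 / (ratio * ratio))


if __name__ == "__main__":
    for ratio, observed, tabulated in LEWIS_TABLE:
        recomputed = beta_from_mass_ratio(ratio)
        print(f"{ratio:5.2f}  observed={observed:.2f}  "
              f"tabulated={tabulated:.2f}  recomputed={recomputed:.3f}")
```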
Although the calculated and observed values of the relative velocities follow the same monotonic trend, the latter are 6–8% larger. Lewis believed that this was within the limits of experimental error in Kaufmann’s experiments. While Kaufmann claimed that a higher degree of accuracy is necessary, Lewis believed that

notwithstanding the extreme care and delicacy with which the observations are made, it seems almost incredible that measurements of this character, which consisted in the determination of the minute displacement of a somewhat hazy spot on a photographic plate, could have been determined with the precision claimed.
So what is Lewis comparing his results to? Kaufmann’s initial results agreed better with the expression,

\[ \frac{m}{m_0} = \frac{3}{4}\,\frac{1}{\beta^2}\left[\frac{1+\beta^2}{2\beta}\ln\frac{1+\beta}{1-\beta} - 1\right], \]

derived from Abraham’s model rather than (1.1.6), which coincides with the Lorentz model, but which has been “derived from strikingly different principles.” Why neutral matter should be subject to deflection by the electromagnetic fields in Kaufmann’s set-up is not broached. But Lewis considers that a positively charged particle emanating from a radioactive source would be a good test-particle because it consists mainly of ‘ponderable’ matter with a very small ‘electromagnetic’ mass. Lewis believed that his non-Newtonian mechanics revived the particle nature of light. From the fact that the mass, according to (1.1.6), becomes infinite as the velocity approaches that of light, it follows that “a beam of light has mass, momentum and energy, and is traveling at the velocity of light would have no energy, momentum, or mass if it were at rest. . ..” This is almost two decades before Lewis [26] was to coin the name ‘photon’ in a paper entitled “The conservation of photons.” The paper was quickly forgotten, but the name stuck.
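To see how the expression derived from Abraham’s model compares with (1.1.6), both can be evaluated at the same β. This is a sketch under the assumption that the bracketed expression above is read as $(3/4)\beta^{-2}$ times the bracket; the function names are illustrative. For small β both ratios tend to 1, and at β = 0.73 Abraham’s expression gives roughly 1.34, while (1.1.6) gives roughly 1.46.

```python
import math


def lorentz_mass_ratio(beta):
    """m/m0 according to (1.1.6): 1/sqrt(1 - beta^2)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def abraham_mass_ratio(beta):
    """m/m0 from Abraham's model (beta must be nonzero):
    (3/4)(1/beta^2)[((1 + beta^2)/(2 beta)) ln((1 + beta)/(1 - beta)) - 1]."""
    log_term = math.log((1.0 + beta) / (1.0 - beta))
    bracket = (1.0 + beta * beta) / (2.0 * beta) * log_term - 1.0
    return 0.75 * bracket / (beta * beta)


if __name__ == "__main__":
    for beta in (0.1, 0.5, 0.73, 0.9):
        print(f"beta={beta:.2f}  Abraham={abraham_mass_ratio(beta):.4f}  "
              f"Lorentz={lorentz_mass_ratio(beta):.4f}")
```

At every β in this range Abraham’s ratio lies below the Lorentz ratio, so a given measured m/m0 corresponds to a larger β in Abraham’s model.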
1.2 Physicists versus Mathematicians
In attempting to unravel the priority rights to the unification of light and electricity we can appreciate a remarkable confluence of physicists and mathematicians in one single arena that was never to repeat itself. On the physics side there were André-Marie Ampère, Ludwig
Boltzmann, Rudolf Clausius, Michael Faraday, Hermann von Helmholtz, James Clerk Maxwell, and Wilhelm Weber, while on the mathematics side there were Carl Friedrich Gauss and Bernhard Riemann, and those that should have been there, but were not: János Bolyai and Nicolai Ivanovitch Lobachevsky. To Ampère must go the credit for the fall of the universal validity of Newton’s inverse square law as a means by which particles interact with one another at a distance. Today, Ampère is remembered as a unit, rather than as the discoverer of that law, and contemporary treatises on electromagnetism present the alternative formulation of Jean-Baptiste Biot and Félix Savart. Although both laws of force coincide when the circuit is closed, they differ on the values that the force takes between two elements of current when it is open. That the interaction of persisting direct (galvanic) currents needed an angular-dependent force was loathed and scorned. Surely, magnetism cannot be the result of the motion of charged particles. Odd as it may seem, Biot, like much of the French physics community, rejected Ampère’s discovery outright. Since the angular dependencies vanish when electric currents appear in complete circuits, it seemed like extra baggage to many, including Maxwell, who reasoned in terms of continuous fields which could store energy and media (i.e. the aether) in which waves could propagate. Yet, it was Ampère’s attempt that would initiate a search for a molecular understanding of what electricity is and how it works.
1.2.1 Gauss’s lost discoveries
It may take very long before I make public my investigations on this issue; in fact, this may not happen in my lifetime for I fear the ‘clamor of the Boeotians.’
Gauss in a letter to Bessel in 1829 on his newly discovered geometry.
Gauss’s seal was a tree but with only seven fruits; his motto read “few, but ripe.” Such was, in effect, an appraisal of Gauss’s scientific accomplishments. Gauss had an aversion to debate and, probably, a psychological difficulty with being criticized by people inferior to him, like the Boeotians of Greece, who were dull and ignorant. Ampère’s discovery would have ended in oblivion had it not caught the eye of Gauss. By 1828 Gauss was resolved to test Ampère’s angle law when he came into contact with a young physicist, Wilhelm Weber. With no
surprise, Weber was offered a professorship at Göttingen three years later, and an intense collaboration between the two began. According to his 1846 monograph, Weber was out to measure the force of one current on another. This was something not contemplated by Ampère, who was satisfied with making static, or what he called ‘equilibrium,’ measurements. When Weber was ready to present his results, he shied away from a discussion of the angular force because he knew it would cause commotion. A letter from Gauss persuaded him otherwise, and insisted that further progress was needed to find a “constructible representation of how the propagation of the electrodynamic interaction occurs.” Weber accepted Fechner’s model in which opposite charges are moving in opposite directions, and interpreted Ampère’s angular force in terms of the force arising from relative motion, depending not only on their relative velocities but also on their accelerations. In so doing, Weber can thus be considered to be the first relativist! The anomaly in Ampère’s law, where there appears a diminution of the force at a certain angle, now appeared as a diminution of the force at a certain speed. That speed later became known as Weber’s constant, and in a series of experiments carried out with Rudolf Kohlrausch it was found to be the speed of light, increased by a factor of the square root of 2. Present at these experiments was Riemann, and Riemann was later to present his own ideas on the matter. In the 1858 paper, “A contribution to electrodynamics,” that was read but not published until after Riemann’s death, Riemann states:

I have found that the electrodynamic actions of galvanic currents may be explained by assuming that the action of one electrical mass on the rest is not instantaneous, but is propagated to them with a constant velocity which, within the limits of observation, is equal to that of light.
Although he errs in referring to φ = −4πρ as Poisson’s law, instead of ∇²φ = −4πρ, Riemann surely did not merit the wrath that Clausius bestowed upon him. Riemann proposes a law of force similar to that of Weber, where the accelerations along the radial coordinate connecting the two particles are replaced by the accelerations projected onto the coordinate axes, and advocates the use of retarded potentials instead of a scalar potential. In his Treatise, Maxwell cites Clausius’s criticisms as proof of the unsoundness of Riemann’s paper. Surely, Maxwell had no need of Clausius’s help, so it was probably used to avoid direct criticism. Moreover,
Clausius’s criticisms are completely unfounded, and what Maxwell found wanting in Weber’s electrokinetic potential actually applies to Clausius’s expression. Whereas Clausius had some grounds for his priority dispute with Kelvin when it came to the second law, here he has none. Weber’s formulation, which today is all but forgotten, held sway in Germany until Heinrich Hertz [93], Helmholtz’s former assistant, verified experimentally the propagation of electromagnetic waves and showed that they had all the characteristics of light. Helmholtz then crowned Maxwell’s theory, and went even a step further by generalizing it to include longitudinal waves, if ever there would be a need of them [cf. Sec. 11.5.5]. Gauss played a fundamental role in bridging the transition from Ampère to Weber. Moreover, Maxwell’s formulation of a wave equation, from his circuit equations, in which electromagnetic disturbances propagate at the speed of light, was undoubtedly what Gauss would have regarded as an oversimplification of the problem. The complexity of the interactions in Ampère’s hypothesis persuaded him that it was not as simple as writing down a wave equation for a wave propagating at the speed of light. This will not be the only time Gauss loses out on a fundamental discovery. Gauss’s letters are more telling than his publications, and if it had not been for his reluctance to publish he would certainly have been the discoverer of what we now know as hyperbolic geometry. Gauss wrote another famous letter, this time to Taurinus in 1824, again reluctant to publish his findings. This is what he said:

. . . that the sum of the angles cannot be less than 180°; this is the critical point, the reef on which all the wrecks occur. . . I have pondered it for over thirty years, and I do not believe that anyone can have given it more thought. . . than I, though I have never published anything on it.
The assumption that the sum of three angles is less than 180° leads to a curious geometry, quite different from ours (the Euclidean), but thoroughly consistent. . .
Gauss is, in fact, referring to hyperbolic geometry, and it is another of his lost discoveries. The credit went instead to Bolyai junior and Lobachevsky. In 1831, Gauss was moved to publish his findings, as appears in a letter to Schumacher:

I have begun to write down during the last few weeks some of my own meditations, a part of which I have never previously put in writing, so that already I have had to think it all through anew three or four times. But I wished this not to perish with me.
But it was too late: before Gauss could finish his paper, a copy of Bolyai’s Appendix arrived. Gauss’s reply to Wolfgang Bolyai senior unveils his disappointment:

If I commenced by saying that I am unable to praise this work, you would certainly be surprised for a moment. But I cannot say otherwise. To praise it, would be to praise myself. Indeed the whole contents of the work, the path taken by your son, the results to which he is led, coincide almost entirely with my meditations, which have occupied my mind partly for the last thirty or thirty-five years. So I remained quite stupefied. . . it was my idea to write down all this later so that at least it should not perish with me. It is therefore a pleasant surprise for me that I am spared the trouble, and I am very glad that it is just the son of my old friend, who takes precedence of me in such a remarkable manner.
Even more mysterious is why Gauss failed to help the younger Bolyai gain recognition for his work. Was it out of jealousy or Gauss’s extreme prudence? Another person who was looking to the stars for confirmation that two intersecting lines can be parallel to another line was Lobachevsky. He, like Gauss, considered geometry to have the same status as electrodynamics, that is, a science founded on experimental fact. Lobachevsky fully realized that deviations from Euclidean geometry would be exceedingly small, and, therefore, would need astronomical observations. Just as Gauss attempted to measure the angles of a triangle formed by three mountaintops, Lobachevsky claimed that astronomical distances would be necessary to show that the sum of the angles of a triangle was less than two right angles. In 1831 Gauss deduced from the axiom that two lines through a given point can be parallel to a third line that the circumference of a circle is $2\pi R\sinh(r/R)$, where R is an absolute constant. By simply replacing R by iR, he obtained $2\pi R\sin(r/R)$, or the circumference of a circle of radius r on the sphere. The former will be crucial to the geometrical interpretation of the uniformly rotating disc that had occupied so much of Einstein’s thoughts. And we will see in Sec. 9.11 that Gauss’s expression for the hyperbolic circumference is what modern cosmologists confuse with the expansion factor of the universe. The first person to show that there was a complete correspondence between circular and hyperbolic functions was Taurinus in 1826, who was in Gauss’s small list of correspondents on geometrical matters. Although this lent credibility to hyperbolic geometry, neither Taurinus nor Gauss
felt confident that hyperbolic geometry was self-consistent. In 1827 Gauss came within a hair’s breadth of what would later be known as the Gauss–Bonnet theorem. This theorem shows that the surfaces of negative curvature produce a geometry in which the angular defect is proportional to the area. Gauss was cognizant that a pseudosphere was such a surface, and Gauss’s student Minding later showed that hyperbolic formulas for triangles are valid on the pseudosphere. But a pseudosphere is not a plane, like the Euclidean plane, because it is infinite only in one direction. The extension of the pseudosphere to a real hyperbolic plane came much later with Eugenio Beltrami’s exposition in 1868. So it was not clear to Gauss and his associates what this new geometry was, and whether, in fact, it was logically consistent. Gauss dabbled in many areas of physics and mathematics, and it would appear that his interests in electricity and non-Euclidean geometries are entirely disjoint. Who would have thought that these two lost discoveries might be connected in some way? Surely Poincaré did not, and it is even more incredible because he developed two models of hyperbolic geometry that would have made the handwriting on the wall unmistakable.
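Gauss’s passage from the hyperbolic circumference to the spherical one by the replacement of R by iR, mentioned earlier in this section, can be checked numerically. The following is only a sketch with illustrative function names; it evaluates the hyperbolic formula at an imaginary radius using complex arithmetic and compares the result with the spherical formula.

```python
import cmath
import math


def hyperbolic_circumference(r, R):
    """Circumference 2 pi R sinh(r/R) of a circle of radius r in the hyperbolic plane."""
    return 2.0 * math.pi * R * math.sinh(r / R)


def spherical_circumference(r, R):
    """Circumference 2 pi R sin(r/R) of a circle of radius r on a sphere of radius R."""
    return 2.0 * math.pi * R * math.sin(r / R)


def continued_circumference(r, R):
    """The hyperbolic formula evaluated at the imaginary radius iR (complex result)."""
    iR = 1j * R
    return 2.0 * cmath.pi * iR * cmath.sinh(r / iR)


if __name__ == "__main__":
    r, R = 1.0, 2.0
    print(hyperbolic_circumference(r, R))  # larger than the Euclidean 2 pi r
    print(spherical_circumference(r, R))   # smaller than the Euclidean 2 pi r
    print(continued_circumference(r, R))   # real part equals the spherical value
```

Both formulas reduce to the Euclidean value 2πr when R is very large compared with r.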
1.2.2 Poincaré’s missed opportunities
Jules-Henri Poincaré began his career as a mathematician, and, undoubtedly, became interested in physics because of the courses he gave at the Sorbonne. Poincaré was not a geometer by trade, but made the miraculous discovery that the Bolyai–Lobachevsky geometry which the geometers, Beltrami and Klein, were trying to construct already existed in mainstream mathematics [Stillwell 96]. The tragedy is that he failed to see that what he called a Fuchsian group was the same type of transform that Lorentz was using in relativity, and that he would be commenting on the latter without any recognition of the former.
1.2.2.1 From Fuchsian groups to Lorentz transforms
Poincaré’s first encounter with hyperbolic geometry came when he was trying to understand the periodicity occurring in solutions to particular differential equations. The single periodicities of trigonometric functions
were well known, and so too the double periodicities of elliptic functions. Double periodicity can be best characterized by tessellations consisting of parallelograms in the complex Euclidean plane whose vertices are multiples of the doubly periodic points. Poincaré found a new type of periodic function, which he called ‘Fuchsian,’ after the mathematician Lazarus Immanuel Fuchs who first discovered them.^d The periodic function is invariant under a group of substitutions of the form

\[ z \to \frac{az + b}{cz + d}, \tag{1.2.1} \]

for which $ad - bc \neq 0$, for otherwise it would result in a lack-luster constant mapping. Poincaré wanted to study this group of transformations by the same type of tessellations by which elliptic functions could be characterized in the complex Euclidean plane. Only now the tessellation consists of curvilinear triangles in a disc, shown in Fig. 1.1, which Poincaré obtained from earlier work by Schwarz in 1872. The curvilinear triangles form right-angled pentagons which are mapped onto themselves by the linear fractional transformation, (1.2.1). As Poincaré tells us:

Just at the time I left Caen, where I was living, to go on a geological excursion . . . we entered an omnibus to go some place or other. At the moment I put my foot on the step the idea came to me, without anything in my former thoughts seeming to have paved the way for it, that the transformation I had used to define Fuchsian functions were identical with those of non-Euclidean geometry.
The linear fractional transformations, (1.2.1), can be used to define a new concept of length for which the cells of the tessellation are all of equal size. The resulting geometry is precisely that of Bolyai–Lobachevsky which, through Klein’s renaming in 1871, has come to be known as hyperbolic geometry. If c = b and d = a, then the fractional linear transformations (1.2.1) become the distance-preserving and orientation-preserving maps, with $a^2 - b^2 = 1$, of Poincaré’s conformal disc model of the hyperbolic plane, the $D^2$-isometries. What Poincaré failed to realize is that, by interpreting z as a relative velocity, the linear fractional transformation (1.2.1), with $a = \cosh\psi$ and $b = \sinh\psi$ for some hyperbolic angle $\psi$, becomes precisely the transformation he named in honor of Lorentz, where the sides of any curvilinear triangle in Fig. 1.1 are proportional to the hyperbolic measures of the three velocities in three different reference frames. Had Poincaré recognized this, it would have changed his mind about the ‘convenience’ of Euclidean geometry, and would have brought hyperbolic geometry into mainstream relativity. That is, given three bodies moving with velocities $u_1$, $u_2$ and $u_3$, the corresponding triangle with curvilinear sides has as its vertices the points $u_1$, $u_2$ and $u_3$. The relative velocities will correspond to the sides of the triangle and the angles between the velocities will add up to something less than two right angles. It should also be appreciated that the square of the relative velocity is invariant under (1.2.1). Suppose that w is a relative velocity formed from the composition of u and v, then if these velocities are replaced by the velocities u′ and v′ relative to some other frame, the value of w will be unaffected by the change. In other words, the square of the relative velocity w is invariant under a Lorentz transformation.

Fig. 1.1. A tiling of the hyperbolic plane by curvilinear triangles that form right-angled pentagons.

^d After Klein informed Poincaré in May 1880 that there were groups of linear fractional transformations, other than those of Fuchs, Poincaré named them ‘groupes kleinéens,’ to the chagrin of Klein.
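The connection between (1.2.1) and the Lorentz transformation can be illustrated for collinear velocities. The sketch below assumes real z, read as a velocity in units of c, and a = d = cosh ψ, b = c = sinh ψ; the function names and the symbol ψ for the hyperbolic argument are illustrative. Under these assumptions the map reproduces the relativistic composition of velocities, adds ψ to the hyperbolic measure of the velocity, and leaves the relative velocity of two bodies unchanged.

```python
import math


def mobius(z, a, b, c, d):
    """Linear fractional transformation z -> (a z + b)/(c z + d), with ad - bc != 0."""
    if a * d - b * c == 0:
        raise ValueError("ad - bc must be nonzero")
    return (a * z + b) / (c * z + d)


def boost(beta, psi):
    """(1.2.1) with c = b, d = a, a = cosh(psi), b = sinh(psi), so a^2 - b^2 = 1."""
    a, b = math.cosh(psi), math.sinh(psi)
    return mobius(beta, a, b, b, a)


def add_velocities(u, v):
    """Relativistic composition of collinear velocities in units of c."""
    return (u + v) / (1.0 + u * v)


def relative_velocity(u, v):
    """Velocity of u as seen from the frame moving with v (collinear case)."""
    return (u - v) / (1.0 - u * v)


if __name__ == "__main__":
    psi = math.atanh(0.3)               # hyperbolic measure of the velocity 0.3
    print(boost(0.5, psi))              # same value as add_velocities(0.5, 0.3)
    print(add_velocities(0.5, 0.3))
    print(relative_velocity(0.6, 0.2))  # unchanged when both velocities are boosted
    print(relative_velocity(boost(0.6, 0.4), boost(0.2, 0.4)))
```

The restriction to one dimension is a simplification made here; the velocity triangles discussed in the text require the full disc model.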
However, it never dawned on Poincaré that these curvilinear triangles might be relativistic velocity triangles, for he kept mathematics and physics well separated in his mind. For he considered

“. . . the axioms of geometry . . . are only definitions in disguise. What then are we to think of the question: Is Euclidean geometry true? We might as well ask if the metric system is true and if the old weights and measures are false; if Cartesian coordinates are true and polar coordinates false. One geometry cannot be more true than another: it can only be more convenient.”

Convenience was certainly not the answer.^e

^e Strangely, we find Einstein [22b] uttering the same words: “For if contradictions between theory and experience manifest themselves, we should rather decide to change physical laws than to change axiomatic Euclidean geometry.”

1.2.2.2 An author of $E = mc^2$

Although this is unquestionably the most famous formula in all of physics, its origins lie elsewhere than in Einstein’s [05b] paper “Does the inertia of a body depend upon its energy content?” John Henry Poynting [07] derived a relation between energy and mass from the radiation pressure around the turn of the twentieth century. Friedrich Hasenöhrl [04] obtained the effective mass of blackbody radiation as $\frac{4}{3}\varepsilon/c^2$, where $\varepsilon = h\nu$. The same factor of $\frac{4}{3}$ was found by Comstock [08] from his electromagnetic analysis, and represents the sum of the energy and the work done by compression, the latter being equal to one-third of the energy in the ultrarelativistic limit. The sum of the two quantities is the enthalpy, as was first clearly stated by Planck [07], so in Einstein’s title ‘heat content,’ or enthalpy, should replace ‘energy content.’ Once again we find evidence of Poincaré’s priority in the derivation of the famous formula, and, as we have mentioned, Einstein’s recognition of it [cf. p. 23]. In the second edition of his text, Électricité et Optique, Poincaré [01] treats the problem of the recoil due to a body’s radiation. He considers the emission of radiation in a single direction, and in order to keep the center of gravity fixed, the body recoils like an ‘artillery cannon’ (pièce d’artillerie). According to the theory of Lorentz, the amount of the recoil will not be negligible. Suppose, says Poincaré, that the artillery piece has a mass of 1 kg, and the radiation that is sent in one direction at the velocity of light has an energy of three million joules. Then, according to Poincaré, it will recoil with a speed of 1 cm/sec. Actually, the relation between ‘electromagnetic momentum’ and Poynting’s vector appears in an 1895 paper by Lorentz, which was commented on and elaborated upon by Poincaré [00] in 1900. He derives the relation between the momentum density, G, and the energy flux, S, as

$$ G = S/c^2. \tag{1.2.2} $$
Even earlier, in 1893, J. J. Thomson refers to ‘the momentum’ arising from the motion of his Faraday tubes. It is only later that Abraham [03] introduced the term ‘electromagnetic momentum.’ Pauli [58] unjustly attributes (1.2.2) to Planck [07] as a theorem regarding the equivalence between momentum density and the energy flux density. According to Pauli,

“This theorem can be considered as an extended version of the principle of the equivalence of mass and energy. Whereas the principle only refers to the total energy, the theorem has also something to say on the localization of momentum and energy.”
Since the magnitude of the energy flux is $S = Ec$, (1.2.2) becomes $mv = E/c$. Then, introducing $m = 10^3$ grams, $E = 3 \times 10^{13}$ ergs, and $c = 3 \times 10^{10}$ cm/sec, Poincaré finds $v = 1$ cm/sec for the recoil speed. Thus, Poincaré derived $E = Gc$, and if G is the momentum of radiation, $G = mc$, so that $m = E/c^2$ is the mass equivalent to the energy of radiation.

Poincaré was infatuated with the breakdown of Newton’s third law, the equality between action and reaction, in his new mechanics. In a follow-up paper entitled “The theory of Lorentz and the principle of reaction,” Poincaré [00] considers electromagnetic energy as a ‘fictitious fluid’ (fluide fictif) with a mass $E/c^2$. The corresponding momentum is the mass of this fluid times c. Since the mass of this fictitious fluid was ‘destructible,’ for it could reappear in other guises, it prevented him from identifying the fictitious fluid with a real fluid. What Poincaré could not rationalize became ‘fictitious’ to him. The lack of conservation of the fictitious mass prevented Poincaré from identifying it with real mass, which had to be conserved under all circumstances. What is conserved, however, is the inertia associated with the radiation that has produced the recoil of the artillery cannon. It is the difference between the initial mass and what is radiated that is equal to the change in energy of the system. Ives [52] showed that

$$ m - m' = \Delta m = E/c^2, \tag{1.2.3} $$

where $\Delta m$ is the change in mass after radiation, and $E/c^2$ is the mass of the radiant energy, which follows directly from Poincaré’s 1904 relativity principle. The difference between the Doppler shift in the frequency due to a source moving toward and away from a fixed observer is:

$$ \Delta\nu = \frac{1}{2}\,\nu\left[\left(\frac{1+\beta}{1-\beta}\right)^{1/2} - \left(\frac{1-\beta}{1+\beta}\right)^{1/2}\right] = \frac{\nu v}{c\sqrt{1-\beta^2}}. \tag{1.2.4} $$

The frequency shift becomes a nonlinear function of the velocity, just like the expression for the relativistic momentum. But here there is no mass present! The relation between frequency and energy was known at the time; it is given by Planck’s law, $E = h\nu$, so that (1.2.4) could be written as

$$ h\,\Delta\nu/c = \frac{E v}{c^2\sqrt{1-\beta^2}} = G, $$

where G is the momentum imparted to the artillery piece due to recoil. It is given by

$$ G = \frac{\Delta m\, v}{\sqrt{1-\beta^2}}, $$

if (1.2.3) holds. The derivation is thus split into two parts: a nonrelativistic relation between mass and energy, (1.2.3), which depends only on the central frequency, $\nu$, and a relativistic part that relates the size of the shift to the velocity, according to (1.2.4). It is through the difference in the Doppler shifts that the momentum acquires a nonlinear dependence upon the velocity,

$$ \frac{1}{2}\left(e^{\bar{v}/c} - e^{-\bar{v}/c}\right) = \sinh(\bar{v}/c) = \frac{\beta}{\sqrt{1-\beta^2}}, \tag{1.2.5} $$
where $\bar{v}$ is the hyperbolic measure of the velocity whose Euclidean measure is v. Equation (1.2.5) also indicates that c is the absolute constant of velocity space. If we multiply (1.2.5) through by $\pi c$, it becomes Gauss’s expression for the semi-perimeter of a non-Euclidean circle of radius $\bar{v}$, and absolute constant c, that he wrote in a letter to Schumacher in 1831 [cf. Eq. (9.11.24)].

Where is the mass dependence on velocity? The Doppler shifts refer to a shift in frequency, the frequency is related to an energy, the energy is related to mass; that is, the mass equivalent of radiation. In fact, the attributed nonlinear dependence of mass on its speed, (1.2.4), can be obtained without mentioning mass at all! Poincaré was ever so close to developing a true theory of relativity, but ultimately could not break loose of the classical bonds which held him. It is an even greater tragedy that he could not bridge the gap between his mathematical studies on non-Euclidean geometries and relativity, which could have unified his lifelong achievements.
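The arithmetic of Poincaré’s recoil example, and the identity between the Doppler difference (1.2.4) and the hyperbolic sine (1.2.5), can be checked numerically. The following is only a minimal sketch: the function names are hypothetical, CGS units are used as in the text, and the hyperbolic measure is taken to satisfy $\bar{v}/c = \tanh^{-1}\beta$, which is what (1.2.5) implies.

```python
import math


def recoil_speed(mass_g, energy_erg, c_cm_s):
    """Recoil speed v obtained from m v = E / c (CGS units)."""
    return energy_erg / (c_cm_s * mass_g)


def doppler_half_difference(beta):
    """Half the difference between the approaching and receding Doppler factors."""
    toward = math.sqrt((1.0 + beta) / (1.0 - beta))
    away = math.sqrt((1.0 - beta) / (1.0 + beta))
    return 0.5 * (toward - away)


def sinh_of_hyperbolic_velocity(beta):
    """sinh(v_bar / c), assuming v_bar / c = artanh(beta)."""
    return math.sinh(math.atanh(beta))


if __name__ == "__main__":
    # Poincare's numbers: 1 kg, three million joules, c = 3e10 cm/sec.
    print("recoil speed (cm/sec):", recoil_speed(1.0e3, 3.0e13, 3.0e10))

    # Compare the three expressions appearing in (1.2.4) and (1.2.5).
    for beta in (0.01, 0.5, 0.9):
        closed_form = beta / math.sqrt(1.0 - beta**2)
        print(beta, doppler_half_difference(beta),
              sinh_of_hyperbolic_velocity(beta), closed_form)
```

For small $\beta$ all three expressions reduce to $\beta$ itself, which is the nonrelativistic limit; the departure from linearity only shows up as $\beta$ approaches unity.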
1.3 Exclusion of Non-Euclidean Geometries from Relativity
Neither Whittaker nor Pais gave any reference to the potential role that non-Euclidean geometries could have played in relativity. Pais pays little tribute to Hermann Minkowski other than saying that Einstein had a change of heart; rather than considering the transcription of his theory into tensorial form as ‘superfluous learnedness’ (überflüssige Gelehrsamkeit), he later claimed it was essential in order to bridge the gap from his special to general theories. Minkowski, in his November 1907 address to the Göttingen Mathematical Society, began with the words “The world in space and time is, in a certain sense, a four-dimensional non-Euclidean manifold” [cf. p. 37]. The hyperboloid of space-time, invariant under the Lorentz transformation, was identified as a pseudosphere of imaginary radius, or a surface of negative, constant curvature. It is plain from Whittaker’s formulas that the Lorentz transformation consists of a rotation through an imaginary angle. Poincaré too viewed the Lorentz transformation as a rotation in four-dimensional space-time through an imaginary angle, and saw that the ratio of the space to time transformations gave the relativistic law of velocity addition.
But he could not bring himself to identify the velocity as a line element in Lobachevsky space. Edwin Wilson, who was J. Willard Gibbs’s last doctoral student, and Lewis [12] felt the need to introduce a non-Euclidean geometry for rotations, but not for translations. They assumed, however, that Euclid’s fifth postulate (the parallel postulate) held, and, therefore, excluded hyperbolic geometry from the outset, even though their space-time rotations are through an imaginary angle. Had they realized that their non-Euclidean geometry was hyperbolic they would have retracted the statement that “Through any point on a given line one and only one parallel (nonintersecting) line can be drawn.” It would have also saved them the trouble of inventing a new geometry for the space-time manifold of relativity. They do, in fact, disagree with Poincaré, stating that

“it is, however, inconsistent with the philosophic spirit of our time to draw a sharp distinction between that which is real and that which is convenient, and it would be dogmatic to assert that no discoveries of physics might render so convenient as to be almost imperative the modification or extension of our present system of geometry.”
Neither their plea nor their paper had a sequel. In the last of his eight lectures, delivered at Columbia University in 1909, we heard Planck’s animosity toward non-Euclidean geometries. Although it was blown up completely out of proportion, Planck was making the statement that he did not want any infringement on the special theory of relativity by mathematicians. Where would this infringement come from? From nowhere else than the Göttingen school of mathematicians, notably Felix Klein. The Hungarian Academy of Science established the Bolyai Prize in mathematics in 1905. The commission was made up of two Hungarians and two foreigners, Gaston Darboux and Klein. The contenders were none other than Poincaré and Hilbert. Although the prize went to Poincaré, his old friend Klein refused to present him with it, citing ill health. According to Leveugle [04] it would have meant that Klein had to publicly admit Poincaré’s priority over Einstein to the principle of relativity, and the group of transformations that has become known as the Lorentz group, a name coined by Poincaré in honor of his old friend. This would not have been received well by the Göttingen school, for not only did Hilbert come in at second place, it would have been a debacle of all their efforts to retain relativity as a German creation.

Arnold Sommerfeld, a former assistant to Klein, showed in 1909 that the famous addition theorem of velocities, to which Einstein’s name was now attached, was identical to the addition formula for the hyperbolic tangent. The velocity parallelogram closes only at low speeds. This was the first demonstration that hyperbolic geometry definitely had a role in relativity, and its Euclidean limit emerged at low speeds. Now Sommerfeld would surely have known that the hyperbolic tangent is the straight line segment in Lobachevsky’s non-Euclidean geometry. Acknowledgment of his former supervisor’s interest in relativity surfaced in the revision of Pauli’s [58, Footnote 111] authoritative Mathematical Encyclopedia article on relativity where he wrote:

“This connection with the Bolyai–Lobachevsky geometry can be briefly described in the following way (this had not been noticed by Varičak): If one interprets $dx^1$, $dx^2$, $dx^3$, $dx^4$ as homogeneous coordinates in a three-dimensional projective space, then the invariance of the equation $(dx^1)^2 + (dx^2)^2 + (dx^3)^2 - (dx^4)^2 = 0$ amounts to introducing a Cayley system of measurement, based on a real conic section. The rest follows from the well-known arguments of Klein.”
Sommerfeld just could not resist rewriting the history of relativity. He changed Minkowski’s opinion of the role Einstein had in formulating the principle of relativity. Quite inappropriately he inserted a phrase praising Einstein for having used the Michelson experiment to show that a state of absolute rest, where the immobile aether would reside, has no effect on physical phenomena [Pyenson 85]. He also exchanged the role of Einstein as the clarifier with that as the originator of the principle of relativity.^f

^f And Sommerfeld’s revisions did not stop at relativity. Writing in the obituary column of the recently deceased Marian von Smoluchowski, Sommerfeld lauds Einstein for his audacious assault on the derivation of the coefficient of diffusion in Brownian motion, “without stopping to bother about the details of the process.” Von Laue, writing in his History of Physics, clearly states that Smoluchowski developed a statistical theory of Brownian motion in 1904 “to which Einstein gave definitive form (1905).”

A much more earnest attempt to draw hyperbolic geometry into the mainstream of relativity was made by Vladimir Varičak. Varičak says that even before he heard Minkowski’s 1907 talk, he noticed the profound analogy between hyperbolic geometry and relativity. At low velocities, the laws of mechanics reduce to those of Newton, just as Lobachevskian geometry reduces to Euclidean geometry when the radius of curvature becomes very large. To Varičak, the Lorentz contraction appears as a deformation of lengths, just as the line segment of Lobachevskian geometry is bowed. Taking the line element of the half-plane model of hyperbolic geometry, Varičak says that it cannot be moved around without deformation. Thus, he queries whether the Lorentz contraction can be understood as an anisotropy of the (hyperbolic) space itself. Varičak also appreciates that in relativity the velocity parallelogram does not close; hence, it does not exist, and must be replaced by hyperbolic addition, which is the addition formula of the hyperbolic tangent. Relativity abandons the absolute, but does introduce an absolute velocity, c; this corresponds to the absolute constant in the Lobachevsky velocity space. Owing to the fact that an inhabitant of the hyperbolic plane would see no distortion to his rulers as he moves about, because his rulers would shrink or expand with him, Varičak questions the reality of the Lorentz contraction. To Varičak, the “contraction is, so to speak, only a psychological and not a physical fact.” Although known non-Euclidean geometries were not entertained by Einstein, Varičak’s formulation should have raised eyebrows. But it did not. The only thing that it would do, by questioning the reality of the space contraction, would be to cause confusion, and this provoked a response by Einstein himself. But whose confusion did he abate? Apart from optical applications referring to the Doppler shift and aberration, which were already contained in Einstein’s 1905 paper in a different form, Varičak produced no new physical relations or new insights into old ones.
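The identification of the velocity addition theorem with the addition formula for the hyperbolic tangent can be illustrated numerically. The sketch below is not taken from Sommerfeld or Varičak; it assumes collinear velocities, speeds measured in units of c, and hypothetical function names, and it takes the hyperbolic measure of a velocity u to be $\bar{u} = c\tanh^{-1}(u/c)$.

```python
import math

C = 1.0  # assumed normalization: speeds in units of c


def compose_velocities(u, v, c=C):
    """Relativistic composition of two collinear velocities."""
    return (u + v) / (1.0 + u * v / c**2)


def compose_via_tanh(u, v, c=C):
    """Add the hyperbolic measures of u and v, then map back with tanh."""
    u_bar = math.atanh(u / c)
    v_bar = math.atanh(v / c)
    return c * math.tanh(u_bar + v_bar)


if __name__ == "__main__":
    for u, v in [(0.5, 0.5), (0.9, 0.8), (1.0e-4, 2.0e-4)]:
        print(u, v, compose_velocities(u, v), compose_via_tanh(u, v), u + v)
```

The last pair shows the Euclidean limit: at low speeds the composed velocity is indistinguishable from the ordinary sum u + v, whereas at high speeds the two differ and the result never exceeds c.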
These factors led to the demise of the hyperbolic approach to relativity, as far as physicists were concerned. However, there was an isolated incident in 1910, where Theodor Kaluza [10] draws an analogy between a uniformly rotating disc and Lobachevskian geometry. Kaluza writes the line element as

$$ \sqrt{\left[1 + \frac{r^2}{1 \pm r^2}\left(\frac{d\varphi}{dr}\right)^2\right] dr}, \tag{*} $$

which at constant radius becomes

$$ \frac{r^2}{\sqrt{1 \pm r^2}}\, d\varphi. \tag{**} $$
If Kaluza wants to show that the circumference of a hyperbolic circle is greater than its Euclidean counterpart, he has to choose the negative sign in expression (*), bring out the dr from under the square root, and remove the square in the numerator of (**). Apart from these typos, and the fact that the first term in (*) had to be divided by $(1 - r^2)^2$, Kaluza was the first to draw attention to the fact that the hyperbolic metric of constant curvature describes exactly a uniformly rotating disc. The paper was stillborn.

Another inexplicable event is that Einstein entered into a mathematical collaboration with his old friend, Marcel Grossmann, to develop a Riemannian theory of general relativity. Grossmann was an expert in non-Euclidean geometries; so why did he not set Einstein on the track of looking at known non-Euclidean metrics instead of putting him on the track of Riemannian geometry? Probably Einstein wanted the general theory to reduce to Minkowski’s metric in the absence of gravity, which meant that the components of the metric tensor reduce to constants. But that meant he was fixing the propagation of gravitational interactions at the speed of light. Grossmann is, however, usually remembered for having led Einstein astray in rejecting the Ricci tensor as the gravitational tensor [Norton 84]. In order for it to reproduce correctly the curvature of ‘space-time,’ the coefficients would have to be (nonlinear) functions of space, and maybe even of time. According to Einstein the Riemannian metric should play the role of the gravitational field. Curvature would be a manifestation of the presence of mass–energy so that if he could find a curvature tensor, composed of the components of the metric tensor, then by setting it equal to a putative energy–momentum tensor he could find the components of the metric tensor, and thereby determine the line element. Such an equation would combine time and space with energy and momentum.
The rest is history and has been too amply described by historians of science. Since the metric has ten components, the search was on for a curvature tensor with the same number of components. The contraction of the Riemann–Christoffel tensor into the Ricci tensor, having ten components, seemed initially a good bet to be set equal to the energy–momentum tensor. Setting the Ricci tensor equal to zero was made a condition for the emptiness of space. It constitutes Einstein’s law of gravitation, and as Dirac [75] tells us

“‘Empty’ here means that there is no matter present and no physical fields except the gravitational field. [italics added] The gravitational field does not disturb the emptiness. Other fields do.”
So gravity can act where matter and radiation are not! When the field is not empty, setting the Ricci tensor equal to the energy–momentum tensor leads to inconsistencies insofar as energy–momentum is not conserved. If the Ricci tensor vanishes then so does all that is related to it, like the scalar, or total, curvature. Einstein found that if he subtracted one-half the curvature-invariant from the Ricci tensor and set it equal to the energy–momentum tensor, then energy–momentum would be conserved. The equipment needed to carry out the program involves curvilinear coordinates, parallel displacement, Christoffel symbols, covariant differentiation, Bianchi relations, the Ricci tensor and its contraction, plus a knowledge of what the energy–momentum tensor is. The only outstanding solution is known as the Schwarzschild metric, in which the metric is constructed by solving for the ‘outer’ and ‘inner’ solutions [cf. Secs. 9.10.3 and 9.10.4].

All the known tests of general relativity are independent of the time-component of the metric, except for the gravitational shift of spectral lines, which is independent of the spatial component. The latter was predicted by Einstein in 1911, prior to his general theory of relativity. However, it does not follow from the Doppler shift, so Einstein was either uncannily lucky, or the true explanation lies elsewhere.

Viewed from a pseudo-Euclidean point of view, there is a clear distinction between special and general relativity. Within the hyperbolic framework, this separation between inertial and noninertial systems becomes blurred. This is because the uniformly rotating disc is, as Stachel [89] claims, the missing link to Einstein’s general theory. That the Beltrami metric describes exactly the uniformly rotating disc means that hyperbolic geometry is also the framework for noninertial systems.

We have already seen Planck’s hostility to non-Euclidean geometries. There was also Wilhelm Wien, Planck’s assistant editor of the Annalen, who insisted that relativity has “no direct point of contact with non-Euclidean geometry,” and Arnold Sommerfeld, who considered that the reinterpretation of relativity in terms of non-Euclidean geometry could “be hardly recommended.” Authoritarianism carried the day and non-Euclidean geometry was shelved for good. It is the purpose of this monograph to show that non-Euclidean geometries make inroads into relativistic phenomena and warrant our attention.
References

[Abraham 03] M. Abraham, “Prinzipien der Dynamik des Elektrons,” Ann. der Phys. 10 (1903) 105–179.
[Abraham 12] M. Abraham, “Relativität und Gravitation. Erwiderung auf eine Bemerkung des Herrn A. Einstein,” Ann. der Phys. 38 (1912) 1056–1058.
[Auffray 99] J.-P. Auffray, Einstein et Poincaré: Sur des Traces de la Relativité (Le Pommier, Paris, 1999), pp. 131, 133.
[Born 51] M. Born, “Physics in my generation, the last fifty years,” Nature 268 (1951) 625.
[Born 62] M. Born, Einstein’s Theory of Relativity (Dover, New York, 1962), p. 278.
[Brillouin 60] L. Brillouin, Wave Propagation and Group Velocity (Academic Press, New York, 1960), p. 143.
[Bucherer 04] A. H. Bucherer, Mathematische Einführung in die Elektronentheorie (Teubner, Leipzig, 1904), p. 50, Eq. (91a).
[Cerf 06] R. Cerf, “Dismissing renewed attempts to deny Einstein the discovery of special relativity,” Am. J. Phys. 74 (2006) 818–824.
[Comstock 08] D. F. Comstock, “The relation of mass to energy,” Phil. Mag. 15 (1908) 1–21.
[Cushing 81] J. T. Cushing, “Electromagnetic mass, relativity, and the Kaufmann experiments,” Am. J. Phys. 49 (1981) 1133–1149.
[de Broglie 51] L. de Broglie, Savants et Découvertes (Albin Michel, Paris, 1951), p. 50.
[Dirac 75] P. A. M. Dirac, General Theory of Relativity (Wiley-Interscience, New York, 1975), p. 25.
[Dirac 86] P. A. M. Dirac, Collection Dedicated to Einstein, 1982-3 (Nauka, Moscow, 1986), p. 218.
[Earman & Glymour 78] J. Earman and C. Glymour, “Lost in the tensors: Einstein’s struggles with covariance principles,” Stud. Hist. Phil. Sci. 9 (1978) 251–278.
[Einstein 05a] A. Einstein, “On the electrodynamics of moving bodies,” Ann. der Phys. 17 (1905); transl. in W. Perrett and G. B. Jeffrey, The Principle of Relativity (Methuen, London, 1923).
[Einstein 05b] A. Einstein, “Does the inertia of a body depend upon its energy content?,” Ann. der Phys. 18 (1905) 639–641; translated in The Collected Papers of Albert Einstein: The Swiss Years, Vol. 2 (Princeton U. P., Princeton NJ, 1989), pp. 172–174.
[Einstein 06a] A. Einstein, “On a method for the determination of the ratio of the transverse and longitudinal mass of the electron,” Ann. der Phys. 21 (1906) 583–586; translated in The Collected Papers of Albert Einstein, Vol. 2 (Princeton U. P., Princeton NJ, 1989), pp. 207–210.
[Einstein 06b] A. Einstein, “Le principe de conservation du mouvement du centre de gravité et l’inertie de l’énergie,” Ann. der Phys. 20 (1906) 627–633.
[Einstein 07] A. Einstein, “On the relativity principle and the conclusions drawn from it,” Jahrbuch der Radioaktivität und Elektronik 4 (1907) 411–462; translated in The Collected Papers of Albert Einstein, Vol. 2 (Princeton U. P., Princeton NJ, 1989), pp. 252–311.
[Einstein 11] A. Einstein, “On the influence of gravitation on the propagation of light,” Ann. der Phys. 35 (1911); translated in W. Perrett and G. B. Jeffrey, The Principle of Relativity (Methuen, London, 1923), pp. 99–108.
[Einstein 16] A. Einstein, “The foundation of the general theory of relativity,” Ann. der Phys. 49 (1916); translated in W. Perrett and G. B. Jeffrey, The Principle of Relativity (Methuen, London, 1923), pp. 111–173.
[Einstein 22a] A. Einstein, “Aether and the theory of relativity,” translated in G. B. Jeffrey and W. Perrett, Sidelights on Relativity (E. P. Dutton, New York, 1922), pp. 1–24.
[Einstein 22b] A. Einstein, “Geometry and experience,” translated in G. B. Jeffrey and W. Perrett, Sidelights on Relativity (E. P. Dutton, New York, 1922), pp. 27–56.
[Essen 71] L. Essen, The Special Theory of Relativity: A Critical Analysis (Clarendon Press, Oxford, 1971).
[Goldberg 67] S. Goldberg, “Henri Poincaré and Einstein’s theory of relativity,” Am. J. Phys. 35 (1967) 934–944.
[Gray 07] J. Gray, Worlds Out of Nothing (Springer, London, 2007), p. 252.
[Hasenöhrl 04] F. Hasenöhrl, “Zur Theorie der Strahlung in bewegten Körpern,” Ann. der Phys. 320 (1904) 344–370; Berichtigung, ibid. 321 589–592.
[Hasenöhrl 09] F. Hasenöhrl, “Bericht über die Trägheit der Energie,” Jahrbuch der Radioaktivität 6 (1909) 485–502.
[Hertz 93] H. Hertz, Electric Waves (Macmillan, London, 1893).
[Holton 88] G. Holton, Thematic Origins of Scientific Thought (Harvard U. P., Cambridge MA, 1988).
[Ives 51] H. Ives, “Revisions of the Lorentz transformations,” Proc. Am. Phil. Soc. 95 (1951) 125–131.
[Ives 52] H. Ives, “Derivation of the mass–energy relation,” J. Opt. Soc. Am. 42 (1952) 540–543.
[Janssen 02] M. Janssen, “Reconsidering a scientific revolution: The case of Einstein versus Lorentz,” Phys. Perspect. 4 (2002) 424–446.
[Kaluza 10] Th. Kaluza, “Zur Relativitätstheorie,” Physik Zeitschr. XI (1910) 977–978.
[Leveugle 94] J. Leveugle, “Poincaré et la relativité,” La Jaune et la Rouge 494 (1994) 31–51.
[Leveugle 04] J. Leveugle, La Relativité, Poincaré et Einstein, Planck, Hilbert. Histoire Véridique de la Théorie de la Relativité (L’Harmattan, Paris, 2004).
[Lewis 08] G. N. Lewis, “A revision of the fundamental laws of matter and energy,” Phil. Mag. 16 (1908) 705–717.
[Lewis 26] G. N. Lewis, “The conservation of photons,” Nature 118 (1926) 874–875.
[Logunov 2001] A. A. Logunov, On the Articles by Henri Poincaré “On the dynamics of the electron”, 3rd ed. (Dubna, 2001).
[Marchal] C. Marchal, “Poincaré, Einstein and the relativity: A surprising secret,” (http://www.cosmosaf.iap.fr/Poincare.htm).
[McCormmach 70] R. McCormmach, “Einstein, Lorentz and the electromagnetic view of Nature,” Hist. Studies Phys. Scis. 2 (1970) 41–87.
[Miller 73] A. I. Miller, “A study of Henri Poincaré’s ‘Sur la Dynamique de l’Électron,’” Arch. Hist. Exact Sci. 10 (1973) 207–328.
[Miller 81] A. I. Miller, Albert Einstein’s Special Theory of Relativity (Addison-Wesley, Reading MA, 1981), p. 254.
[Northrop 59] F. C. Northrop, “Einstein’s conception of science,” in ed. P. A. Schilpp, Albert Einstein Philosopher-Scientist, Vol. II (Harper Torchbooks, New York, 1959), p. 388.
[Norton 84] J. Norton, “How Einstein found his field equations,” Stud. Hist. Phil. Sci. 14 (1984) 253–284.
[Norton 04] J. D. Norton, “Einstein’s investigations of Galilean covariant electrodynamics prior to 1905,” Arch. Hist. Exact Sci. 59 (2004) 45–105.
[Ohanian 08] H. C. Ohanian, Einstein’s Mistakes: The Human Failings of Genius (W. W. Norton & Co., New York, 2008), p. 84.
[Pais 82] A. Pais, Subtle is the Lord (Oxford U. P., Oxford, 1982), p. 381.
[Pauli 58] W. Pauli, Theory of Relativity (Dover, New York, 1958), p. 125.
[Planck 07] M. Planck, “Zur Dynamik bewegter Systeme,” Berliner Sitzungsberichte Erster Halbband (29) (1907) 542–570; see also, B. H. Lavenda, “Does the inertia of a body depend on its heat content?,” Naturwissenschaften 89 (2002) 329–337.
[Planck 98] M. Planck, Eight Lectures on Theoretical Physics (Dover, New York, 1998), p. 120.
[Poincaré 98] H. Poincaré, “La mesure du temps,” Rev. Mét. Mor. 6 (1898) 371–384.
[Poincaré 00] H. Poincaré, “The theory of Lorentz and the principle of reaction,” Arch. Néderland. Sci. 5 (1900) 252–278.
[Poincaré 01] H. Poincaré, Électricité et Optique: La Lumière et les Théories Électrodynamiques, 2 ed. (Carré et Naud, Paris, 1901), p. 453.
[Poincaré 04] H. Poincaré, “L’état actuel et l’avenir de la physique mathématique,” Bulletin des sciences mathématiques 28 (1904) 302–324; translation “The principles of mathematical physics,” Congress of Arts and Science, Universal Exposition, St. Louis, 1904, Vol. 1 (1905), pp. 604–622 (http://www.archive.org/details/congressofartssc01inte).
[Poincaré 05] H. Poincaré, “Sur la dynamique de l’électron,” Comptes Rend. Acad. Sci. Paris 140 (1905) 1504–1508.
[Poincaré 06] H. Poincaré, “Sur la dynamique de l’électron,” Rend. Circ. Mat. Palermo 21 (1906) 129–175.
[Poincaré 52] H. Poincaré, Science and Hypothesis (Dover, New York, 1952), pp. 70–71; translated from the French edition, 1902.
[Poincaré 54] H. Poincaré, Oeuvres (Gauthier-Villars, Paris, 1954), p. 572.
[Poynting 07] J. H. Poynting, The Pressure of Light, 13th Boyle Lecture delivered 30/05/1906 (Henry Frowde, London, 1907); (Soc. Promo. Christ. Know., London, 1910).
[Pyenson 85] L. Pyenson, The Young Einstein: The Advent of Relativity (Adam Hilger, Bristol, 1985).
[Scribner 64] C. Scribner, Jr, “Henri Poincaré and the principle of relativity,” Am. J. Phys. 32 (1964) 672–678.
[Sommerfeld 04] A. Sommerfeld, “Überlichtgeschwindigkeitsteilchen,” K. Akad. Wet. Amsterdam Proc. 8 (1904) 346 (translated from Verslag v. d. Gewone Vergadering d. Wis-en Natuurkundige Afd. 26/11/1904, Dl. XIII); Nachr. Wiss. Göttingen 25/02/1904, 201–235.
[Stachel 89] J. Stachel, “The rigidly rotating disc as the ‘missing link’ in the history of general relativity,” in Einstein and the History of General Relativity, eds. D. Howard and J. Stachel (Birkhäuser, Basel, 1989).
[Stillwell 89] J. Stillwell, Mathematics and Its History (Springer, New York, 1989), p. 311.
[Stillwell 96] J. Stillwell, Sources of Hyperbolic Geometry (Am. Math. Soc., Providence RI, 1996), p. 113.
[Stranathan 42] J. D. Stranathan, The ‘Particles’ of Modern Physics (Blakiston, Philadelphia, 1942), p. 137.
[Thomson 28] J. J. Thomson and G. P. Thomson, Conduction of Electricity Through Gases, 3rd ed. (Cambridge U. P., Cambridge, 1928), p. 439.
[Varičak 10] V. Varičak, “Application of Lobachevskian geometry in the theory of relativity,” Physikalische Zeitschrift 11 (1910) 93–96.
[Walter 99] S. Walter, “The non-Euclidean style of Minkowskian relativity,” in J. Gray, ed. The Symbolic Universe (Oxford U. P., Oxford, 1999), pp. 91–127.
[Weisskopf 60] V. F. Weisskopf, “The visual appearance of rapidly moving objects,” Phys. Today, Sept. 1960, 24–27.
[Whittaker 53] E. Whittaker, A History of the Theories of Aether and Electricity, Vol. II The Modern Theories 1900–1926 (Thomas Nelson & Sons, London, 1953), p. 38.
[Wilson & Lewis 12] E. B. Wilson and G. N. Lewis, “The space-time manifold of relativity. The non-Euclidean geometry of mechanics and electrodynamics,” Proc. Am. Acad. Arts and Sci. 48 (1912) 387–507.
Chapter 2
Which Geometry?
2.1 Physics or Geometry
2.1.1 The heated plane
In La Science et l’Hypothèse Henri Poincaré [68] argued for the ‘passivity’ of physical space. Since all measurements involve both physical and geometrical assumptions, Poincaré considered it meaningless to ask whether space was Euclidean or non-Euclidean. We might try to measure the angles of a triangle formed by three hilltops and check whether their sum is greater or less than $180^\circ$. In fact, Gauss attempted such a measurement. He measured the sum of the angles of the triangle formed by the three peaks of Brocken, Hohehagen and Inselsberg. The sides of the triangle were 69, 85 and 107 km. Gauss determined that the sum exceeded $180^\circ$ by $14.85''$. However, to the chagrin of Gauss, the experiment was inconclusive since the experimental error was greater than the excess he found. In fact, the sum could just as well have been less than $180^\circ$. The triangle was too small since, as Gauss realized, the defect is proportional to its area, and only a big triangle, of astronomical proportions, could be used to settle the question of whether the geometry of the universe is Euclidean or not.

Poincaré was more indecisive in that he reasoned that any defect which could be revealed could equally well be the consequence of the fact that light rays do not always travel in straight paths. It is this type of reasoning that was used against Poincaré, and that led to his being denied the discovery of relativity. For we have seen in Sec. 1.2.2 that many of the concepts that were attributed to Einstein rightly belong to Poincaré, such as the velocity addition theorem, the principle that uniform motion is undetectable as far as physical laws are concerned, and the axiom that nothing can travel faster
than light. His willingness to change a physical law so as to suit Euclidean geometry is responsible for his secondary role in twentieth century science. But not a word was muttered when Einstein [22] was found agreeing with Poincaré’s philosophy.

To make his point, Poincaré considered an imaginary universe in the interior of a sphere of radius $R$. In such a universe, the temperature at any point $p$ would be given by $T(p) = k(R^2 - r^2)$, where $k$ is a positive constant and $r$ is the Euclidean distance from the center of the sphere to the point $p$ in question. He also assumed that the linear dimensions of a body vary with the temperature at the point where the body is found, so that as one moves from the center of the sphere to the surface he becomes colder and contracts. In fact, it would take him an infinite amount of time to reach the surface. Even worse, he cannot detect his shrinkage because the measuring sticks he uses shrink along with him. To our traveler, the universe appears infinite.

We know that in Euclidean space the shortest path between two points is a straight line. But, because of the shrinkage, the geodesics of this universe, which are by definition the paths of shortest distance between pairs of its points, are not straight lines, but are curves bent inward toward the center of the sphere. Actually, they are circular arcs that cut the boundary normally. This is shown in Fig. 2.1 where the bug’s right legs are shorter than his left so even though he thinks he is traveling in a straight path, the unequal lengths of his legs cause him to follow a circular arc $AB$.
Fig. 2.1. A bug’s life in the heated disk; ‘hot’ in the center and ‘cold’ on the disc.
However, to the bug, his right legs do not appear to be shorter than his left legs because his measuring tools also contract as things get colder. But, to us Euclideans, it appears that the bug’s right legs are shorter than his left legs because we are using Euclidean measuring sticks. So even though we owe this model of hyperbolic geometry to Poincaré, he failed to find it physically attractive. The question he posed, “Which geometry is correct?”, was answered by him with another question: “Which geometry is more convenient?” And unhesitatingly Poincaré clung to Euclidean geometry as the true geometry which Nature chooses, so that if we find a discrepancy between a physical law and Euclidean geometry we must be willing to change the former so as to preserve the latter. In effect, Poincaré was debasing his models of hyperbolic geometry, along with those of Beltrami and Klein, as having no physical relevance.

Suppose that we have to deal with a rather large metal sheet which is not at a uniform temperature. Take one edge of the sheet and label it the $x$-axis, and let the temperature vary along the normal $y$-axis in the following way:

$$ T = by - \frac{1}{p}, $$

where $b$ and $p$ are constants. Suppose also that the metallic sheet is fixed in such a way that it cannot bend or buckle. Lastly, we are given a measuring stick made of another metal whose thermal coefficient of expansion is $p$. How can we use this stick to determine the nature of the geometry of the sheet? It would be better to have a measuring rod whose coefficient of thermal expansion were zero, but not having one we are left to make measurements with this imperfect rod. We therefore inquire how to make consistent measurements. There are two ways:

(i) At some standard temperature, which we take to be zero degrees Celsius, the measuring stick has a length $ds$. But because there will be points with higher temperatures, the true length at any point $(x, y)$ will be

$$ ds' = (1 + pT)\,ds = pby\,ds. $$
This choice allows us to maintain our Euclidean measure on the surface by allowing for a temperature correction factor $pby$. This is our physical law which allows us to preserve the Euclidean nature of the geometry.

(ii) We make measurements without taking into account the changes in the length of the rod. Here, we clearly rule out that there are changes in matter due to heat variations, and look to the geometry to make the necessary modifications.

If we opt for the second choice we realize that the $x$-axis represents absolute cold, which corresponds to a line at infinity. If we try to make measurements using a rod parallel to the $y$-axis we find that the stick becomes shorter and shorter as it approaches the $x$-axis, so that the $x$-axis appears as a line at infinity because it is infinitely far away.

The prime interest of a geometer is to create an object, such as a triangle, made up of the measuring sticks, that remains congruent when moved over the surface. We shall refer to such displacements as motions, of which we will be interested primarily in infinitesimal ones. But we must first determine how we measure distance, that is, define a metric for the space. Wanting to keep as close as possible to a Euclidean measure we might try:

$$ pby\,ds = \sqrt{dx^2 + dy^2}. $$

If we agree to a choice of units where $pb = 1$, then

$$ ds = \frac{\sqrt{dx^2 + dy^2}}{y}. \tag{2.1.1} $$
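As a rough numerical illustration of the metric (2.1.1), the following Python sketch integrates $ds = dy/y$ along the vertical line $x = 0$ with a midpoint rule. The function name `vertical_length`, the step count and the sample heights are illustrative choices, not part of the text. Each halving of $y$ should contribute the same length, namely $\ln 2$, although the Euclidean lengths of the segments keep shrinking.

```python
import math

def vertical_length(y_from, y_to, steps=20_000):
    """Length of a vertical segment under ds = sqrt(dx^2 + dy^2) / y.

    Along x = const the metric (2.1.1) reduces to ds = dy / y, which is
    integrated here with a simple midpoint rule.
    """
    lo, hi = sorted((y_from, y_to))
    h = (hi - lo) / steps
    return sum(h / (lo + (k + 0.5) * h) for k in range(steps))

# Heights y = 1, 1/2, 1/4, 1/8 on the line x = 0.
heights = [1.0, 0.5, 0.25, 0.125]
segments = [vertical_length(u, v) for u, v in zip(heights, heights[1:])]

for (u, v), s in zip(zip(heights, heights[1:]), segments):
    print(f"y = {u} -> {v}: hyperbolic length {s:.6f}, Euclidean length {u - v}")
```

The same helper can be called with a lower end ever closer to $y = 0$ to watch the length grow without bound.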
This ‘distance’ increases without limit as $y \to 0$. For $x$ constant, the ‘distance’ along vertical lines increases exponentially in comparison with its Euclidean counterpart. For example, the distances between the adjacent points $y = 1, \tfrac{1}{2}, \tfrac{1}{4}, \ldots$ at $x = 0$ are all equal. We now want the invariance property of this metric to determine the permissible motions. Consider, for instance, a point transformation:

$$ x' = x'(x, y), \qquad y' = y'(x, y). \tag{2.1.2} $$

We want this point transformation to conserve distance; the condition is:

$$ \frac{dx'^2 + dy'^2}{y'^2} = \frac{dx^2 + dy^2}{y^2}. \tag{2.1.3} $$
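Obviously, this implies the invariance of distance, but we want something more. We want it also to preserve angles, meaning we want it to be conformal. For later use, observe that

$$
\begin{aligned}
\frac{1}{y'^2}\left[\left(\frac{\partial x'}{\partial x}\right)^2 + \left(\frac{\partial y'}{\partial x}\right)^2\right] &= \frac{1}{y^2}, \\
\frac{\partial x'}{\partial x}\frac{\partial x'}{\partial y} + \frac{\partial y'}{\partial x}\frac{\partial y'}{\partial y} &= 0, \\
\frac{1}{y'^2}\left[\left(\frac{\partial x'}{\partial y}\right)^2 + \left(\frac{\partial y'}{\partial y}\right)^2\right] &= \frac{1}{y^2}.
\end{aligned}
\tag{2.1.4}
$$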
Now consider two infinitesimal displacements, $(d_1x, d_1y)$ and $(d_2x, d_2y)$, drawn from the point $(x, y)$ and making an angle $\theta$, and the corresponding ones $(d_1x', d_1y')$ and $(d_2x', d_2y')$ drawn from $(x', y')$ making a corresponding angle $\theta'$. In order for the transformation to be conformal, we require the cosines of the two angles,

$$ \cos\theta = \frac{(d_1x\,d_2x + d_1y\,d_2y)/y^2}{\sqrt{(d_1x^2 + d_1y^2)/y^2}\,\sqrt{(d_2x^2 + d_2y^2)/y^2}}, \tag{2.1.5a} $$

and

$$ \cos\theta' = \frac{(d_1x'\,d_2x' + d_1y'\,d_2y')/y'^2}{\sqrt{(d_1x'^2 + d_1y'^2)/y'^2}\,\sqrt{(d_2x'^2 + d_2y'^2)/y'^2}}, \tag{2.1.5b} $$
to be equal, where we have divided numerator and denominator by $y^2$ and $y'^2$, respectively, in order to be able to use (2.1.3). That is, on account of (2.1.3) the denominators in (2.1.5a) and (2.1.5b) are equal so it remains only to show that the condition,

$$ \frac{1}{y'^2}\,(d_1x'\,d_2x' + d_1y'\,d_2y') = \frac{1}{y^2}\,(d_1x\,d_2x + d_1y\,d_2y), \tag{2.1.6} $$

holds. If we introduce,

$$ dx' = \frac{\partial x'}{\partial x}\,dx + \frac{\partial x'}{\partial y}\,dy, \qquad dy' = \frac{\partial y'}{\partial x}\,dx + \frac{\partial y'}{\partial y}\,dy, $$
into the left-hand side of (2.1.6) and use (2.1.4), it becomes evident that the left side coincides with the right side, thereby establishing the conformality of the point transformation (2.1.2).

Regarding motions, it is easily seen that magnification $(x, y) \to (ax, ay)$, with $a > 0$, and translation $x \to x + s$, with $s$ real, are two possible motions. It is well known that in a two-dimensional space, like the one we are considering, if there exist two independent motions then there must be a third. This third is called inversion: two points are inverse with respect to a circle if they lie on the same ray from its center, one inside and one outside the circle, and the product of their distances from the center is equal to the square of the radius. Inversion introduces the notion of anti-congruence. The basic motions are most easily expressed in terms of the complex variables $z = x + iy$ and $w = x' + iy'$, viz.

$$
\begin{aligned}
\text{translation:} \quad & w = z + s, && (s \in \mathbb{R}) \\
\text{magnification:} \quad & w = az, && (a > 0) \\
\text{inversion:} \quad & w = 1/\bar z,
\end{aligned}
\tag{2.1.7}
$$
where $\bar z$ is the complex conjugate of $z$. These three independent motions imply that any two-dimensional object in the space may be shifted, magnified and turned inside-out and still remain congruent if the number of inversions is even, or anti-congruent if the number of inversions is odd. These motions give the object complete freedom of movement. If we generalize the concept of inversion to include the product of an inversion in the unit circle, a translation by an amount $c$, and another inversion,

$$ w = \frac{z}{cz + 1}, $$

then we can construct a generic displacement as a product of the fundamental motions involving an even number of inversions. Such a generalized displacement will have the form [cf. (1.2.1)]:

$$ w = \frac{az + b}{cz + d}, \tag{2.1.8} $$
where $a, b, c, d$ are real numbers and the determinant $\Delta = ad - bc > 0$. The linear fractional transformation is known as a Möbius transform, and it will play a prominent role in what follows. The Möbius transform (2.1.8) can be obtained by the following elementary motions [Archbold 70]:

$$
\begin{aligned}
z_1 &= cz, && \text{(magnification)} \\
z_2 &= z_1 + d, && \text{(translation)} \\
z_3 &= 1/z_2, && \text{(inversion)} \\
z_4 &= \frac{\Delta}{c}\,z_3,
\end{aligned}
$$

and so

$$ w = \frac{a}{c} - z_4, $$

which is (2.1.8).

The displacement of any object in our space requires knowing the position of the object, and the alignment of a particular direction with that of an arbitrarily chosen direction in the space. For this to be accomplished we need three parameters, and the corresponding group is a three-parameter group. A space in which an object enjoys such free mobility is a congruent space and its geometry is a congruent geometry. We will now dig deeper into the notions of these motions by transferring to the complex plane.
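The statements about motions made in this section lend themselves to a quick numerical check. The Python sketch below is only an illustration under stated assumptions: the function names, the sample point and the coefficients are hypothetical, and the fourth elementary step is taken as $z_4 = (\Delta/c)\,z_3$ with $\Delta = ad - bc$, which is what makes the chain close on (2.1.8). It checks three things: that inversion, translation by $c$ and inversion again give $z/(cz + 1)$; that the chain of elementary motions reproduces the Möbius transform; and that the metric (2.1.1) is unchanged, to first order in the displacement, by the transform.

```python
def inversion(z):
    """Inversion in the unit circle, w = 1 / conj(z), as in (2.1.7)."""
    return 1 / z.conjugate()

def mobius(z, a, b, c, d):
    """Generic displacement (2.1.8), w = (a z + b) / (c z + d)."""
    return (a * z + b) / (c * z + d)

def by_elementary_motions(z, a, b, c, d):
    """Rebuild (2.1.8) from the chain z1, ..., z4 (requires c != 0)."""
    delta = a * d - b * c      # determinant, assumed positive
    z1 = c * z                 # magnification
    z2 = z1 + d                # translation
    z3 = 1 / z2                # reciprocal step, labelled 'inversion' in the text
    z4 = (delta / c) * z3      # rescaling by delta / c
    return a / c - z4

def hyperbolic_ds(z, dz):
    """Length of the small displacement dz at z under ds = |dz| / y, Eq. (2.1.1)."""
    return abs(dz) / z.imag

z = 0.3 + 1.2j                     # sample point in the upper half-plane
a, b, c, d = 2.0, 1.0, 0.7, 1.5    # ad - bc = 2.3 > 0

# Inversion, translation by c, inversion again.
sandwich = inversion(inversion(z) + c)
print(sandwich, z / (c * z + 1))

# Chain of elementary motions versus the closed form.
print(by_elementary_motions(z, a, b, c, d), mobius(z, a, b, c, d))

# Metric before and after the displacement, for a small step dz.
dz = 1e-6 * (1 + 1j)
w = mobius(z, a, b, c, d)
dw = mobius(z + dz, a, b, c, d) - w
print(hyperbolic_ds(z, dz), hyperbolic_ds(w, dw))
```

Each printed pair should agree, the last one only up to the finite size of the step `dz`.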
2.2 Geometry of Complex Numbers
2.2.1 Properties of complex numbers
An ordered pair $(x, y)$ is called a complex number, $z = x + iy$. The modulus of $z$ is $r = \sqrt{x^2 + y^2}$. The number $\theta$, defined by $\cos\theta = x/r$ and $\sin\theta = y/r$, is called the amplitude or argument (arg) of $z$. In terms of $r$ and $\theta$ the complex number can be expressed as

$$ z = re^{i\theta} = r(\cos\theta + i\sin\theta), $$

and de Moivre’s theorem follows:

$$ (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta. $$
If $z_1$ and $z_2$ are any two complex numbers then

$$
\begin{aligned}
|z_1 z_2| &= |z_1|\cdot|z_2|, & |z_1/z_2| &= |z_1|/|z_2|, \\
\arg(z_1 z_2) &= \arg z_1 + \arg z_2, & \arg(z_1/z_2) &= \arg z_1 - \arg z_2.
\end{aligned}
$$

The last two properties recall the property of logarithms, to which we shall shortly return. Moreover, the product of a complex number $z$ and its complex conjugate $\bar z$ is $z\bar z = |z|^2$. The square of the absolute value of the sum of two complex numbers is:

$$
\begin{aligned}
|z_1 + z_2|^2 &= (z_1 + z_2)(\bar z_1 + \bar z_2) \\
&= z_1\bar z_1 + z_2\bar z_2 + z_1\bar z_2 + \bar z_1 z_2 \\
&= |z_1|^2 + |z_2|^2 + z_1\bar z_2 + \bar z_1 z_2 \\
&= |z_1|^2 + 2\,\mathrm{Re}(z_1\bar z_2) + |z_2|^2 \\
&\le |z_1|^2 + 2|z_1||z_2| + |z_2|^2 = (|z_1| + |z_2|)^2,
\end{aligned}
$$

since $\mathrm{Re}(z_1\bar z_2) \le |z_1\bar z_2| = |z_1||z_2|$. Taking the positive square roots and observing that $|z| = |\bar z|$ gives the triangle inequality:

$$ |z_1 + z_2| \le |z_1| + |z_2|. \tag{2.2.1} $$

2.2.2 Inversion
The property of inversion can be stated as: If a circle of radius $r$ has a center $O$ and two points $P$ and $P'$ are inverse with respect to the circle then the following conditions must hold:

(i) $O$, $P$, $P'$ lie on the same straight line;
(ii) $O$ does not lie between $P$ and $P'$;
(iii) $OP \cdot OP' = r^2$.

To find the point of inversion $P$ we construct a semicircle with diameter $OP'$. If $Q$ is the point of intersection of this semicircle with the circle whose center is $O$ then $P$ will be the foot of the perpendicular from $Q$ to $OP'$. This is a
Fig. 2.2. Construction of the point of inversion P.
consequence of the fact that $OQP'$ is a right triangle as shown in Fig. 2.2. Because $OPQ$ will also be a right triangle,

$$ \cos\vartheta = \frac{OP}{r} = \frac{r}{OP'}, $$

and, consequently,

$$ OP \cdot OP' = r^2; \tag{2.2.2} $$

$O$ then is the center of inversion and the circle is called the circle of inversion. These conditions can be simply stated as: If $P$ and $P'$ are represented by the complex numbers $z$ and $w$, then

(i) $\arg w = \arg z$,
(ii) $|w| = r^2/|z|$,

where property (i) takes in both properties (i) and (ii) of the above.

Given the point of inversion $P$ we may calculate the coordinates of $P'$, and vice-versa. From Fig. 2.3 it is apparent that the right triangles whose hypotenuses are $OP$ and $OP'$ are similar so that

$$ \frac{x}{y} = \frac{x'}{y'}. $$

Then since (2.2.2) holds,

$$ (x^2 + y^2)(x'^2 + y'^2) = r^4. \tag{2.2.3} $$
Fig. 2.3. Circle of inversion for constructing the inverse P with respect to P′.
Introducing the value of $y$ in (2.2.3) we have

$$ x^2 + x^2\cdot\frac{y'^2}{x'^2} = \frac{r^4}{x'^2 + y'^2}, $$

or

$$ x = \frac{x' r^2}{x'^2 + y'^2}, \qquad y = \frac{y' r^2}{x'^2 + y'^2}. $$

These are the coordinates of the interior points which are fully symmetric to the exterior points,

$$ x' = \frac{x r^2}{x^2 + y^2}, \qquad y' = \frac{y r^2}{x^2 + y^2}. $$
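The coordinate formulas for inversion can be tried out directly. The following Python sketch is a minimal illustration, with the function name `invert` and the sample values chosen only for this example. It maps an interior point to its exterior inverse and prints the product $OP \cdot OP'$, which by (2.2.2) should equal $r^2$.

```python
import math

def invert(x, y, r):
    """Inverse of the point (x, y) in the circle of radius r centred at the origin O."""
    s = x * x + y * y            # OP^2; the centre itself has no inverse
    return x * r * r / s, y * r * r / s

r = 2.0
P = (0.6, 0.8)                   # interior point, OP = 1
P_prime = invert(*P, r)          # exterior point

OP = math.hypot(*P)
OP_prime = math.hypot(*P_prime)
print(P_prime, OP * OP_prime)    # the product should equal r^2 = 4
```

Applying `invert` to `P_prime` should return the original point, which reflects the full symmetry between interior and exterior points.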
Although the method of inversion, apparently introduced by Lord Kelvin, has found extensive use in electrostatics, it seems to be relatively unknown in other branches of science. On closer inspection, however, it appears to have been employed for the first time in optics in a completely novel way by the then twenty-three-year-old Maxwell [Born & Wolf 59]. Since it combines inversion with Fermat’s principle of least time, which we will need later on in Chapter 7, we will now turn to a discussion of it.
2.2.3 Maxwell’s ‘fish-eye’: An example of inversion from elliptic geometry
Light emitted by a point source at $P_0$ will propagate in a medium of index of refraction $\eta(x, y, z)$. Although an infinite number of rays are emitted by our point source, only a finite number will be found to pass through any other point in the medium, with the exception of a point $P_1$ through which an infinite number of rays pass. Such a point is said to be a stigmatic, or sharp, image of $P_0$. An optical instrument which images stigmatically in three dimensions is referred to as absolute. To every point $P_0$ in the object space there corresponds a stigmatic image $P_1$ in the image space. These points in the two spaces are said to be conjugate to one another. It was Maxwell, in 1858, who proved that for an absolute instrument the optical length of any curve in the object space is equal to the optical length of its image, provided both spaces are homogeneous. Maxwell provides us with a simple example of an absolute instrument in a medium which is characterized by a refractive index,

$$ \eta(r) = \frac{\eta_0}{1 + (r/a)^2}, \tag{2.2.4} $$
where $r$ denotes the distance from a fixed point $O$, and $\eta_0$ and $a$ are constants. It is commonly referred to as Maxwell’s ‘fish-eye’, which he first studied in 1854. According to Fermat’s principle, light will propagate between any two points in such a way as to minimize (or at least to extremize) its travel time. In a system of varying index of refraction, (2.2.4), the true path will render the optical length,

$$ I = \int \eta(r)\,ds = \int \eta(r)\sqrt{dr^2 + r^2 d\varphi^2} = \eta_0\int\frac{\sqrt{dr^2 + r^2 d\varphi^2}}{1 + r^2/a^2} = \eta_0\int\frac{\sqrt{1 + r^2\varphi'^2}\,dr}{1 + r^2/a^2}, \tag{2.2.5} $$
an extremum, where the prime indicates differentiation with respect to $r$. Calling the integrand $\mathcal{L}$, and noting that $\varphi$ is a cyclic coordinate, i.e. $\varphi$ is absent but its derivative is not, we immediately obtain a first integral of the motion,

$$ \frac{\partial\mathcal{L}}{\partial\varphi'} = \frac{\eta(r)\,r^2\varphi'}{\sqrt{1 + r^2\varphi'^2}} = c = \text{const}. $$

Solving for $\varphi'$, we get

$$ \varphi = c\int^{r}\frac{dr}{r\sqrt{\eta^2(r)\,r^2 - c^2}}, $$
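on integrating. To perform the integral it will be convenient to set $\rho = r/a$ and $\kappa = c/a\eta_0$. For then we find:

$$
\begin{aligned}
\varphi - \varphi_0 &= \int^{\rho}\frac{\kappa(1 + \rho^2)\,d\rho}{\rho\sqrt{\rho^2 - \kappa^2(1 + \rho^2)^2}} \\
&= \int^{\rho}\frac{d}{d\rho}\left[\sin^{-1}\left(\frac{\kappa}{\sqrt{1 - 4\kappa^2}}\,\frac{\rho^2 - 1}{\rho}\right)\right]d\rho,
\end{aligned}
$$

and, consequently, by inverting,

$$ \sin(\varphi - \alpha) = \frac{c}{\sqrt{a^2\eta_0^2 - 4c^2}}\,\frac{r^2 - a^2}{ar}, \tag{2.2.6} $$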
where $\alpha$ is a constant of integration. Expression (2.2.6) is the equation of a circle in polar coordinates. All rays through the fixed point $P_0(r_0, \varphi_0)$ must be such as to keep the ratio,

$$ \frac{r^2 - a^2}{r\sin(\varphi - \alpha)} = \frac{r_0^2 - a^2}{r_0\sin(\varphi_0 - \alpha)}, $$

constant. The fixed point $P_1(r_1, \varphi_1)$ must also satisfy this ratio for whatever $\alpha$ may be, and this leads to the conditions

$$ r_0 r_1 = a^2, \qquad \varphi_1 = \pi + \varphi_0. \tag{2.2.7} $$
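The conditions (2.2.7) can be checked numerically for a few rays. In the Python sketch below the helper `ray_ratio` and all numerical values are illustrative assumptions. For several choices of the integration constant $\alpha$, the ratio that (2.2.6) keeps fixed along a ray is evaluated at the source $P_0$ and at the candidate image $P_1 = (a^2/r_0, \varphi_0 + \pi)$; the two values should coincide for every $\alpha$.

```python
import math

def ray_ratio(r, phi, alpha, a):
    """The quantity (r^2 - a^2) / (r sin(phi - alpha)) that (2.2.6) fixes along a ray."""
    return (r * r - a * a) / (r * math.sin(phi - alpha))

a = 2.0                                  # constant of the medium in (2.2.4)
r0, phi0 = 0.7, 0.4                      # source point P0 (illustrative values)
r1, phi1 = a * a / r0, phi0 + math.pi    # candidate image P1 from (2.2.7)

for alpha in (0.1, 1.0, 2.5):            # three different rays through P0
    print(alpha, ray_ratio(r0, phi0, alpha, a), ray_ratio(r1, phi1, alpha, a))
```

The values of $\alpha$ are chosen so that $\sin(\varphi_0 - \alpha) \neq 0$; a ray with $\varphi_0 = \alpha$ would need separate treatment.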
All rays from a point $P_0$ meet at $P_1$, which lies on the line connecting $P_0$ to $O$. The points $P_0$ and $P_1$ lie on opposite sides of $O$ such that $OP_0 \cdot OP_1 = a^2$. Consequently, Maxwell’s fish-eye is an absolute instrument where the image is an inversion since the first condition in (2.2.7) is the condition for inversion, (2.2.2). Only this time $O$ lies between the two points, contrary to condition (ii) above. For $\varphi = \alpha$ and $\varphi = \pi + \alpha$, $r = a$ and each ray emanating from a fixed point $P_0$ intersects the circle $r = a$ normally. All Euclidean circles orthogonal
Fig. 2.4. Maxwell’s “fish-eye.”
to the rim of the circle of radius $r = a$ are the routes of geodesics in the elliptic plane $E$. The rays emanating from any fixed point $P_0$ and coalescing at $P_1$, which lies on the line $OP_0$, are geodesics, or paths of shortest distance between the two points. Arcs of a circle replace straight lines in the elliptic plane. We shall return to this point shortly.

To transform Eq. (2.2.6) from polar to Cartesian coordinates we set $x = r\cos\varphi$ and $y = r\sin\varphi$. We then obtain

$$ y\cos\alpha - x\sin\alpha = \frac{c}{a\sqrt{a^2\eta_0^2 - 4c^2}}\,(x^2 + y^2 - a^2), $$

or

$$ (x + b\sin\alpha)^2 + (y - b\cos\alpha)^2 = a^2 + b^2 = \frac{a^2}{4\kappa^2}, \tag{2.2.8} $$
where $b = (a/2c)\sqrt{a^2\eta_0^2 - 4c^2}$. According to the theorem of chords, all chords passing through a fixed interior point, in this case $O$, are divided into two parts whose lengths have constant product: $OP_0 \cdot OP_1 = a^2$. We thus have to set $b = 0$, so that the radius of the circle of inversion is exactly $a = 2c/\eta_0$.

If we do not distinguish between the flat metric and the index of refraction in (2.2.5), then we can write the metric as

$$ d\tilde s^2 = \eta^2(r)\,ds^2 = \frac{dr^2 + r^2 d\varphi^2}{\left(1 + (r/r_0)^2\right)^2}, \tag{2.2.9} $$
Fig. 2.5. The magnification of the inner product as it is projected stereographically onto the Euclidean plane.
where we set the absolute constant $a = r_0$. A simple way to get new geometric structures is to distort old ones. Under stereographic projection, the tangent vectors $\tilde x$ and $\tilde y$ at a point $p$ on the surface of a sphere $S$ are carried to tangent vectors $x$ and $y$ at a point $q$ of the Euclidean plane, as shown in Fig. 2.5. The relation between their inner products is given by the stereographic inner product distortion [O’Neill 66]

$$ x\cdot y = \left(1 + \frac{x^2 + y^2}{r_0^2}\right)\tilde x\cdot\tilde y, \tag{2.2.10} $$

and so transforms the Euclidean plane into the stereographic plane with constant, positive curvature, $1/r_0^2$. To rationalize (2.2.10), we consider the inverse map of a plane onto a sphere, from a horizontal plane of height $r_0$ onto a sphere of radius $r_0$. This is given by the projection along the radius $(x, y, z) \to (\lambda x, \lambda y, \lambda z)$, where $z = r_0$, the domain of the plane, and $(\lambda x)^2 + (\lambda y)^2 + (\lambda r_0)^2 = r_0^2$, the codomain of the sphere. Solving for $\lambda$ results in:

$$ \lambda = \frac{r_0}{\sqrt{r_0^2 + x^2 + y^2}}, \tag{2.2.11} $$
and, consequently, the stereographic inner product can be written as $\lambda x\cdot\lambda y = \tilde x\cdot\tilde y$, which is again (2.2.10). Stereographic projection was one of the topics covered by Riemann in his 1854 lecture for his Habilitation. Although he discusses a space of positive, constant, curvature, $1/r_0^2$, he was undoubtedly aware of what happens
when $r_0$ becomes imaginary. In such an event, the inner product (2.2.10) makes sense only when we restrict it to a disc $x^2 + y^2 < r_0^2$. Inside this region, the two-dimensional space is one of negative, constant, curvature. We shall come back to this in our discussion of the Poincaré disc model in Sec. 9.5.

Stereographic projection possesses two very remarkable properties:

(i) Circles on the sphere are mapped into lines or circles in the plane.
(ii) Angles are preserved: the angle formed from two intersecting circles on the sphere is the same as the angle formed from intersecting lines or circles in the plane that correspond to the former under stereographic projection.

Therefore, by sacrificing straight line geodesics we have been able to preserve angles so the stereographic projection is a conformal map of the surface. Maxwell, unwittingly, discovered that his expression for the refractive index, (2.2.4), was the dilatation factor in

$$ d\tilde s = \eta(r)\,ds, \tag{2.2.12} $$
in which the infinitesimal shape on the surface is represented in the map by a similar shape that differs from the original one only in size. The one on the stereographic plane is just $\eta$ times bigger, and the index of refraction is the stereographic magnification factor! This was indeed a big fish to fry for the 23-year-old: he obtained an image as an inversion using stereographic projection. We will use the metric (2.2.12) in Chapter 7 to derive the tests of general relativity by identifying physically the index of refraction, $\eta$, which is the magnification factor of the flat metric, $ds$.

The essential point is that the point source $P_0$ and its image point $P_1$ are on different collinear rays emanating from $O$, whereas in the case of inversion in a circle, $P_0$ and its image $P_1$ are on the same ray emanating from $O$. This is guaranteed by the form of the index of refraction (2.2.4).

As we have just mentioned, we can get a surface of negative curvature, $-1/r_0^2 = 1/(ir_0)^2$, by allowing the radius of the sphere to take on the imaginary value $ir_0$. Instead of the index of refraction (2.2.4) we now have

$$ \eta(r) = \frac{\eta_0}{1 - (r/r_0)^2}, \tag{2.2.13} $$
which obviously limits us to a (hyperbolic) disc $r < r_0$, where $r_0$ is the absolute constant of the space. Following the same procedure as before, we now find the equation of the circle

$$ (x - \beta\sin\alpha)^2 + (y + \beta\cos\alpha)^2 = \beta^2 - r_0^2, \tag{2.2.14} $$

where $\beta = (r_0/2c)\sqrt{r_0^2\eta_0^2 + 4c^2}$. The circle of inversion, $C$, has a center at $(\beta, \alpha)$, and its distance from the center of the hyperbolic disc is $\beta$. But, in order that (2.2.14) describe a circle, $\beta > r_0$; as can be seen in Fig. 2.6, the center of inversion must lie outside the hyperbolic plane, $H$. Thus, its center does not separate the source $P_0$ and its image $P_1$ along a common line uniting the three points, and, as a consequence, $P_0$ and $P_1$ will not lie on geodesic arcs of a circle. So, it was not at all fortuitous that young Maxwell chose the form (2.2.4) for the index of refraction, and not (2.2.13).

In fact, as we approach the rim of $H$, which does not belong to $H$, the index of refraction (2.2.13) becomes infinite. Since the velocity of propagation is inversely proportional to the index of refraction, it will become very small in the limit. Clocks slow down and rulers shrink as they approach the rim when viewed from our Euclidean perspective. We might expect this shrinking of rulers and slowing down of clocks to have something to do with space contraction and time dilatation. This ‘shrinkage’ of rulers and ‘slowing down’ of clocks is in direct contrast to what happens in the stereographic, or elliptic, plane of constant
Fig. 2.6. In the case of inversion both the point and its image are on the same ray emanating from the center of the disc H.
Fig. 2.7. It appears that rulers get longer as they are moved further from the origin. However, the elliptic distance from x to y is exactly the same as that from X to Y.
positive curvature. The projections of points in the northern hemisphere, X and Y, are much further away from the origin than the projections of points in the southern hemisphere, x and y, as shown in Fig. 2.7. The index of refraction (2.2.4) becomes smaller and smaller the farther we move away from the origin. This means that the velocity increases and clocks speed up, while rulers get longer as they move farther away from the origin. Large circles in the stereographic plane have very small stereographic arclength since they correspond to small circles about the north pole of the sphere S. In consideration of the relationship between hyperbolic and elliptic spaces we might expect phenomena such as time contraction and space dilatation to be characteristic of elliptic spaces when viewed from our Euclidean perspective. An inhabitant of the plane would measure the same distance between x and y in Fig. 2.7 as he would measure between X and Y.
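The contrast between the two planes can be made concrete with a short Python sketch. It is only an illustration: the function names are hypothetical, $\eta_0 = 1$ and $r_0 = 1$ are assumed, and a midpoint rule stands in for the exact integral. The first function measures the circumference of a Euclidean circle of radius $r$ with the elliptic metric (2.2.9), which shrinks for large circles. The second measures the distance from the center out to radius $r < r_0$ with the hyperbolic index (2.2.13), which grows without bound as the rim is approached.

```python
import math

def elliptic_circumference(r, r0=1.0):
    """Length of the Euclidean circle of radius r measured with the metric (2.2.9)."""
    return 2 * math.pi * r / (1 + (r / r0) ** 2)

def hyperbolic_radius(r, r0=1.0, steps=200_000):
    """Distance from the centre out to Euclidean radius r < r0.

    Uses d s~ = dr / (1 - (r/r0)^2), i.e. (2.2.13) with eta_0 = 1,
    integrated by the midpoint rule.
    """
    h = r / steps
    return sum(h / (1 - ((k + 0.5) * h / r0) ** 2) for k in range(steps))

for r in (1.0, 10.0, 100.0):
    print(f"elliptic: Euclidean radius {r:6.1f} -> circumference {elliptic_circumference(r):.4f}")

for r in (0.5, 0.9, 0.99):
    print(f"hyperbolic: Euclidean radius {r:.2f} -> distance from centre {hyperbolic_radius(r):.4f}")
```

For $r_0 = 1$ the hyperbolic distance can be compared with the closed form $\tanh^{-1} r$, available as `math.atanh`.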
2.2.4 The cross-ratio
Now consider a circular arc, defined by $\arg[(z - z_1)/(z - z_2)] = \text{const}$. In addition let there be two fixed points on the arc, $P_1$ and $P_2$, with $P$ lying between them. We let $P$ vary such that the angle, measured in radians, $\angle P_1PP_2 = \theta$ is constant. If $P, P_1, P_2$ are represented respectively by $z, z_1, z_2$, the necessary and sufficient condition that the angle remains constant, as $P$ is varied, is:

$$ \arg(z - z_1) - \arg(z - z_2) = \arg\frac{z - z_1}{z - z_2} = \theta. $$
If $0 < \theta < \pi$, the locus of $P$ is the arc of a circle with endpoints $P_1$ and $P_2$. For the particular value $\theta = \pi/2$, the locus of $P$ is a semi-circle, while for $\theta = \pi$ it is the segment $P_1P_2$. Generalizing to four points lying on an arc we have: If points $P_3$ and $P_4$ lie on an arc whose endpoints are $P_1$ and $P_2$, then

$$ \arg\frac{z_3 - z_1}{z_3 - z_2} = \arg\frac{z_4 - z_1}{z_4 - z_2}, $$

or

$$ \arg\left(\frac{z_3 - z_1}{z_3 - z_2}\cdot\frac{z_4 - z_2}{z_4 - z_1}\right) = 0, $$

where the $z_i$’s represent the $P_i$’s. But this can only be if the number,

$$ \frac{z_3 - z_1}{z_3 - z_2}\cdot\frac{z_4 - z_2}{z_4 - z_1}, $$

is real and positive. This number is the cross-ratio of the four numbers $z_1, z_2, z_3, z_4$. Alternatively, if $P_3$ and $P_4$ lie outside of the arc segment $P_1P_2$, then

$$ \arg\left(\frac{z_3 - z_1}{z_3 - z_2}\cdot\frac{z_4 - z_2}{z_4 - z_1}\right) = \pi, $$

and the corresponding cross-ratio is a negative real number.

The cross-ratio finds its origins in Renaissance art where artists found it necessary to give depth to their two-dimensional drawings. If the points $A, B, C, D$ lie on a line and the pair of points $A, B$ separates $C, D$ then the cross-ratio,

$$ \{A, B|C, D\} = \frac{AC}{BC}\bigg/\frac{AD}{BD}, $$

is positive, while if they do not then the cross-ratio is negative. Four is the minimum number of points for which there is an invariant under projection, and the cross-ratio of four points is that invariant. A correspondence between two straight lines such that for all corresponding quadruples, $A, B, C, D$ and $A', B', C', D'$, their cross-ratios are equal, $\{A, B|C, D\} = \{A', B'|C', D'\}$, is called a projective correspondence. Since the ordinary projection of a line onto a line preserves the cross-ratio, it is an example of a projective correspondence. Such a correspondence is said to be perspective.
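The reality of the complex cross-ratio for four concyclic points is easy to probe numerically. The Python sketch below is an illustration only: the function names and the choice of the unit circle are assumptions made for the example. It places $P_1$ and $P_2$ a quarter turn apart, puts $P_3$ on the arc between them, and compares two positions of $P_4$: on the same arc, where the cross-ratio should come out real and positive, and on the complementary arc, which is the reading adopted here for the case in which the argument equals $\pi$ and the cross-ratio is real and negative.

```python
import cmath
import math

def cross_ratio(z1, z2, z3, z4):
    """Cross-ratio (z3 - z1)/(z3 - z2) * (z4 - z2)/(z4 - z1) of four complex numbers."""
    return (z3 - z1) / (z3 - z2) * (z4 - z2) / (z4 - z1)

def on_unit_circle(angle):
    """Point of the unit circle at the given polar angle (radians)."""
    return cmath.exp(1j * angle)

z1, z2 = on_unit_circle(0.0), on_unit_circle(math.pi / 2)   # endpoints P1, P2
z3 = on_unit_circle(math.pi / 6)                            # P3 on the arc P1P2
z4_same = on_unit_circle(math.pi / 3)                       # P4 on the same arc
z4_other = on_unit_circle(math.pi)                          # P4 on the complementary arc

same = cross_ratio(z1, z2, z3, z4_same)
other = cross_ratio(z1, z2, z3, z4_other)
print(same, other)   # imaginary parts should vanish up to rounding
```

Moving any one of the four points off the circle should produce a cross-ratio with a non-negligible imaginary part.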
A perspective is merely a realistic representation of spatial depth on a plane. Yet, the method for correct perspective was credited to the Florentine painter, Brunelleschi, at the beginning of the fifteenth century. Alberti solved a special case, known as costruzione legittima, whereby nonhorizontal floor tiles are lined up on a base line with progressively smaller tiles placed behind them, converging to a vanishing point on the horizon as in Fig. 2.8. The development of projective geometry followed, mainly through the work of Desargues, with the introduction of ‘vanishing’ points, or points at infinity where parallels meet, and transformations which change lengths and angles, i.e. projections. But, if lengths and angles are not invariant under projection, what is? Since it is possible to project any three points on a line onto any three others, no invariant can be built from three points. The smallest number of points which possesses an invariant is four, and the cross-ratio is a projective invariant.

Following the proof given by Möbius in 1827 that the cross-ratio is a projective invariant, we consider four points on a line, $A, B, C, D$, and a point $O$ not lying on the line as in Fig. 2.9. Drop a normal from $O$ onto the line and let $\delta$ be its length. By computing the areas of the triangles $OCA$, $OCB$, $ODA$ and $ODB$, first using the height $\delta$ and the bases $CA$, $CB$, $DA$ and $DB$, and then using the bases $OA$ and $OB$, and expressing the height in
Fig. 2.8. A tiling of the plane.
Fig. 2.9. Calculation of cross-ratio and perspectivity.
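terms of the sines of the angle at $O$, we find

$$
\begin{aligned}
\tfrac{1}{2}\,\delta\cdot CA &= \text{area } OCA = \tfrac{1}{2}\,OA\cdot OC\,\sin\angle COA, \\
\tfrac{1}{2}\,\delta\cdot CB &= \text{area } OCB = \tfrac{1}{2}\,OB\cdot OC\,\sin\angle COB, \\
\tfrac{1}{2}\,\delta\cdot DA &= \text{area } ODA = \tfrac{1}{2}\,OA\cdot OD\,\sin\angle DOA, \\
\tfrac{1}{2}\,\delta\cdot DB &= \text{area } ODB = \tfrac{1}{2}\,OB\cdot OD\,\sin\angle DOB.
\end{aligned}
$$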
Taking the ratio of the first pair of equations and that of the second pair, and dividing the former by the latter, results in

$$ \frac{CA}{CB}\bigg/\frac{DA}{DB} = \frac{\sin\angle COA}{\sin\angle COB}\bigg/\frac{\sin\angle DOA}{\sin\angle DOB}. $$

Any other four points $A', B', C', D'$ in perspective with the original points $A, B, C, D$, with the same external point $O$, will have the same central angles at $O$, as shown in Fig. 2.9, and, consequently, will have the same cross-ratio.

Projective transformations, or collineations as they are sometimes referred to, can map parallel lines onto intersecting lines thereby providing a sense of depth, like the converging parallel lines in Fig. 2.8. In order to define a projective transformation, we must add points ‘at infinity.’ These are necessary in order to ensure the one-to-one correspondence that arises in connection with the central projection of a plane onto a plane in which
some of the points of the first plane have no images. The straight lines through the center of projection that intersect the plane correspond to their points of intersection with the plane, while those lines parallel to the plane correspond to new points, called points at infinity, where parallels meet. For when the straight line that intersects the plane becomes closer and closer to a parallel line, its point of intersection recedes to infinity. The Euclidean plane is transformed into the projective plane by the addition of points at infinity.

The logarithm of the cross-ratio measures hyperbolic distance. Because of the logarithmic form one might think that it would not satisfy the triangle inequality, (2.2.1). It is well known that logarithmic equations of state in thermodynamics [Lavenda 09], and logarithmic measures of divergence in information theory [Kullback 59], have all the topological requisites of a distance except that of the triangle inequality. However, a remarkable property of the cross-ratio enables the hyperbolic distance to satisfy the triangle inequality, and, therefore, be considered as a bona fide distance.

Consider four collinear points with $a$ and $b$ between $x$ and $y$, and with $b$ between $a$ and $x$. The cross-ratio, $\{a, b|x, y\} > 1$, unless $a = b$. If $d$ is some other interior point,

$$ \{a, d|x, y\}\cdot\{d, b|x, y\} = \{a, b|x, y\}, \tag{2.2.15} $$
and the cross-ratio is associative. Since the distance is the logarithm of the cross-ratio, it is precisely this last property that would lead one to believe that the triangle inequality cannot be satisfied. But wait a moment. What happens if we shorten the interval, say to some $x'$ lying between $b$ and $x$? It can be shown that [Buseman & Kelly 53]:
\[ \{a, b|x', y\} > \{a, b|x, y\}. \tag{2.2.16} \]
So anytime we shorten the interval we increase the cross-ratio and, consequently, the distance from $a$ to $b$. To establish the triangle inequality, consult Fig. 2.10. The perspectivity of the lines $uv$ and $xy$ from the pole $p$, together with the inequality (2.2.16), gives
\[ \{a, c|u, v\} = \{a, d|x', y'\} \ge \{a, d|x, y\}. \]
Fig. 2.10. Seen from the point $p$, the four points $u, a, c, v$ and $a, d, x', y'$ subtend the same angles and hence have the same cross-ratio. This is also true for $c, b, w, z$ and $d, b, x', y'$.
Likewise, the perspectivity of $wz$ and $xy$, together with inequality (2.2.16), gives
\[ \{c, b|w, z\} = \{d, b|x', y'\} \ge \{d, b|x, y\}. \]
Taking the product of the two inequalities, and using property (2.2.15), results in
\[ \{a, c|u, v\} \cdot \{c, b|w, z\} \ge \{a, d|x, y\} \cdot \{d, b|x, y\} = \{a, b|x, y\}. \]
Finally, forming the hyperbolic distances by taking the logarithm of both sides yields the triangle inequality,
\[ h(a, c) + h(c, b) \ge h(a, b), \tag{2.2.17} \]
for the hyperbolic distance as the logarithm of the cross-ratio.
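As a numerical aside, properties (2.2.15) and (2.2.16) can be checked for points on a single line. The sketch below assumes the convention $\{a, b|x, y\} = (ax \cdot by)/(ay \cdot bx)$ for the cross-ratio, chosen so that it exceeds 1 for the ordering of points described above; the function names and sample coordinates are illustrative only.

```python
import math

def cross_ratio(a, b, x, y):
    """Cross-ratio {a, b | x, y} of four collinear points given by real coordinates.

    Assumed convention: (|a-x| * |b-y|) / (|a-y| * |b-x|), which is > 1 when
    a and b lie between x and y, with b between a and x.
    """
    return (abs(a - x) * abs(b - y)) / (abs(a - y) * abs(b - x))

def hyperbolic_distance(a, b, x, y):
    """Logarithm of the cross-ratio, the candidate distance h(a, b)."""
    return math.log(cross_ratio(a, b, x, y))

# Sample points ordered y < a < d < b < x on a line.
y, a, d, b, x = 0.0, 1.0, 1.5, 2.0, 3.0

# Property (2.2.15): the product over the intermediate point d equals {a, b | x, y}.
product = cross_ratio(a, d, x, y) * cross_ratio(d, b, x, y)
print(product, cross_ratio(a, b, x, y))

# Inequality (2.2.16): moving the endpoint x inward to x_short increases the cross-ratio.
x_short = 2.5
print(cross_ratio(a, b, x_short, y) > cross_ratio(a, b, x, y))
```

On a single line the logarithms simply add, so the triangle inequality holds there with equality; the strict inequality comes from the shortening step (2.2.16).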
2.2.5
The Möbius transform
The properties of the Möbius transform that we discuss here will be used in Chapter 8, especially in Sec. 8.2.
2.2.5.1
Invariance of the cross-ratio
We will now show that the Möbius transform leaves the cross-ratio invariant. It is this property that Poincaré used to show that all the cells of the tessellations in the hyperbolic plane are of equal size. Take all four numbers $z_1, z_2, z_3, z_4$ to be different, and $cz + d \neq 0$ for any of them. If
\[ w_i = \frac{az_i + b}{cz_i + d}, \qquad i = 1, \ldots, 4, \]
the $w_i$ are all different and their differences are given by
\[ w_1 - w_2 = \frac{\Delta(z_1 - z_2)}{(cz_1 + d)(cz_2 + d)}, \qquad w_2 - w_3 = \frac{\Delta(z_2 - z_3)}{(cz_2 + d)(cz_3 + d)}, \]
with $\Delta = ad - bc$ denoting, again, the determinant. Dividing the first by the second,
\[ \frac{w_1 - w_2}{w_2 - w_3} = \frac{z_1 - z_2}{z_2 - z_3} \cdot \frac{cz_3 + d}{cz_1 + d}. \]
Likewise,
\[ \frac{w_1 - w_4}{w_4 - w_3} = \frac{z_1 - z_4}{z_4 - z_3} \cdot \frac{cz_3 + d}{cz_1 + d}, \]
and again dividing the first by the second,
\[ \frac{w_1 - w_2}{w_2 - w_3} \bigg/ \frac{w_1 - w_4}{w_4 - w_3} = \frac{z_1 - z_2}{z_2 - z_3} \bigg/ \frac{z_1 - z_4}{z_4 - z_3}. \]
The left- and right-hand sides are the cross-ratios of four numbers, and they are equal. This shows that the Möbius transform preserves cross-ratios.
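The invariance can also be checked numerically with Python's built-in complex numbers. This is only a sketch: the coefficients and the four sample points are arbitrary choices, and the helper names are not taken from the text.

```python
def mobius(z, a, b, c, d):
    """Möbius transform w = (a z + b) / (c z + d), assuming ad - bc != 0."""
    return (a * z + b) / (c * z + d)

def cross_ratio(z1, z2, z3, z4):
    """The combination used in the text: [(z1-z2)/(z2-z3)] / [(z1-z4)/(z4-z3)]."""
    return ((z1 - z2) / (z2 - z3)) / ((z1 - z4) / (z4 - z3))

# Arbitrary sample coefficients and four distinct points.
a, b, c, d = 2 + 1j, 1 - 1j, 0.5j, 3 + 0j
zs = [0.3 + 0.2j, -1 + 2j, 1.5 - 0.7j, 2 + 2j]
ws = [mobius(z, a, b, c, d) for z in zs]

print(cross_ratio(*zs))
print(cross_ratio(*ws))   # agrees with the line above up to rounding
```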
2.2.5.2
Fixed points
A fixed point occurs when $w = z$. Fixed points are, therefore, determined by the equation
\[ cz^2 + (d - a)z - b = 0. \]
It is not difficult to see that the only Möbius transform with more than two fixed points is the identity transform. For if $a = d$ and $b = c = 0$, every point is fixed. The transformation reduces to $w = z$, or the identity
transformation. Now, if $a = d$, $b \neq 0$, and $c = 0$, the quadratic has one root, i.e. $\infty$, which is the only fixed point. Further, if $c = 0$, $a \neq d$, and $b \neq 0$, the fixed points are distinct, $b/(d - a)$ and $\infty$. Now, assume that $c \neq 0$ and $\delta$ is either of the square roots of the discriminant, $(a - d)^2 + 4bc$. If $\delta \neq 0$, the quadratic has two distinct roots, $(a - d \pm \delta)/2c$. If, instead, $\delta = 0$, the roots coalesce to a single fixed point, $(a - d)/2c$. Hence, we have shown that a Möbius transformation other than the identity cannot have more than two fixed points.
2.2.5.3
Associativity
The Möbius transformation is also associative, just like the cross-ratio, (2.2.15). That is, if $T_1$ transforms $z_1$ into $z_2$ and $T_2$ transforms $z_2$ into $z_3$, then the product $T_1 T_2$ transforms $z_1$ into $z_3$. Let $T_1$ be the Möbius transform $w = (az + b)/(cz + d)$, and $T_2$ be the transform $w = (Az + B)/(Cz + D)$; then their product, $T_1 T_2$, is defined as
\[ w = \frac{A[(az + b)/(cz + d)] + B}{C[(az + b)/(cz + d)] + D}, \]
which has the same form,
\[ w = \frac{(Aa + Bc)z + (Ab + Bd)}{(Ca + Dc)z + (Cb + Dd)}. \]
Since the determinant is the product of determinants, $\Delta_1 \Delta_2 = (AD - BC)(ad - bc)$, and does not vanish, $T_1 T_2$ is also a Möbius transformation.

A special product transformation will be of importance in our further developments; that is, when the product $T_1 T_2 = I$, the identity transformation. The identity $w = z$ will result only when the following conditions are met:
\[ Aa + Bc = Cb + Dd, \qquad Ab + Bd = 0, \qquad Ca + Dc = 0. \]
This will happen only when the ratios $A : B : C : D$ are the same as $d : -b : -c : a$. Then there is a unique transform $T_2$, namely the Möbius transform
\[ w = \frac{dz - b}{-cz + a}. \]
It is the inverse to $T_1$, and is written as $T_1^{-1}$. Hence, $T_1$ transforms $z_1$ into $z_2$, and $T_1^{-1}$ transforms $z_2$ back into $z_1$. Moreover, if it happens that $a + d = 0$, $T_1$ and $T_1^{-1}$ are the same, and $T_1$ is called involutory, since it is its own inverse.
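The product and inverse formulas are those of $2 \times 2$ matrices, which the following sketch makes explicit. Representing a transform by the tuple `(a, b, c, d)` is an illustrative choice, not notation from the text, and the sample coefficients are arbitrary.

```python
def apply_map(T, z):
    """Apply the Möbius transform with coefficients T = (a, b, c, d) to z."""
    a, b, c, d = T
    return (a * z + b) / (c * z + d)

def compose(T1, T2):
    """Coefficients of 'T1 first, then T2', as in the product formula above."""
    a, b, c, d = T1
    A, B, C, D = T2
    return (A * a + B * c, A * b + B * d, C * a + D * c, C * b + D * d)

def inverse(T):
    """Coefficients d : -b : -c : a of the inverse transform."""
    a, b, c, d = T
    return (d, -b, -c, a)

T1 = (2, 1, 1, 3)          # arbitrary example with ad - bc = 5
T2 = (1, -1j, 2, 1)        # arbitrary example with AD - BC = 1 + 2j
z = 0.4 + 0.3j

print(apply_map(compose(T1, T2), z), apply_map(T2, apply_map(T1, z)))  # same point
print(compose(T1, inverse(T1)))   # proportional to the identity coefficients (1, 0, 0, 1)

involution = (1, 2, 3, -1)        # a + d = 0, so the map is its own inverse
print(apply_map(involution, apply_map(involution, z)))                 # returns z
```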
2.2.5.4
Transformations for which the unit circle is invariant
These transformations are particularly interesting, for they correspond to the Lorentz transformations in relativity. Considering complex coordinates on the unit disc, a Lorentz transformation corresponds to a Möbius transformation,
\[ w = \frac{az + \bar{c}}{cz + \bar{a}}, \tag{2.2.18} \]
for which $|\bar{a}| > |\bar{c}|$, so that their ratio, $\bar{c}/\bar{a}$, will be a point in the interior of the unit disc. This is a necessary and sufficient condition that $w$ maps the interior of the unit disc onto its interior [Schwerdtfeger 62].

A Möbius transform which transforms three distinct points of the unit circle into three other distinct points of the circle must, obviously, transform the unit circle into itself, since if $z$ describes a circle or a line, so too will $w$. If the Möbius transform $w = (az + b)/(cz + d)$ transforms the unit circle into itself, then $|w| = 1$, implying $|az + b| = |cz + d|$. The latter condition must be the same as $|z| = 1$. Now, the condition $|az + b| = |cz + d|$ is the same as
\[ |z + b/a| = |c/a| \cdot |z + d/c|. \]
This is the equation of a circle having a pair of inverse points, $-b/a$ and $-d/c$. We know that two points inverse with respect to the unit circle must be of the form $z$ and $1/\bar{z}$, which implies $b/a = \bar{c}/\bar{d}$. As a special case we can set $b = \bar{c}$ and $d = \bar{a}$. Then the Möbius transform which carries the unit circle into itself will be of the form (2.2.18). Inverting it we get
\[ z = \frac{\bar{a}w - \bar{c}}{-cw + a}. \]
The family of circles $|z| = \kappa > 0$, where $\kappa$ is real, is transformed into the coaxial circles
\[ |\bar{a}w - \bar{c}| = \kappa |cw - a|. \]
Coaxial circles are a family of circles such that any pair has the same radical axis. The radical axis is the line passing through the two points of intersection of a pair of circles, as the line $PQ$ in
Fig. 8.1. The origin, which is a circle of radius $\kappa = 0$, is transformed into $\bar{c}/\bar{a}$, which lies inside the unit circle when $|c| < |a|$, and outside of it when $|c| > |a|$. The condition that the determinant must not vanish, $ad - bc \neq 0$, prohibits the case $|c| = |a|$.
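A quick numerical illustration of (2.2.18): points on the unit circle stay on it, and the origin goes to $\bar{c}/\bar{a}$ inside the disc. The coefficients below are arbitrary sample values satisfying $|a| > |c|$, and the function name is illustrative.

```python
import cmath
import math

def disc_map(z, a, c):
    """Möbius map (2.2.18): w = (a z + conj(c)) / (c z + conj(a))."""
    return (a * z + c.conjugate()) / (c * z + a.conjugate())

a, c = 2 + 1j, 0.5 - 0.3j        # sample coefficients with |a| > |c|

# Moduli of the images of eight points spread around the unit circle.
moduli = [abs(disc_map(cmath.exp(2j * math.pi * k / 8), a, c)) for k in range(8)]
print(moduli)                     # all equal to 1 up to rounding

# The origin is sent to conj(c)/conj(a), a point inside the unit disc.
print(disc_map(0j, a, c), c.conjugate() / a.conjugate())
```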
2.3
Geodesics
Returning to our hyperbolic model of the heated plane, we have
\[ s = \int_{p_1}^{p_2} \frac{\sqrt{dx^2 + dy^2}}{y} = \int_{p_1}^{p_2} \frac{1}{y}\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{1/2} dx, \tag{2.3.1} \]
as the distance between two points $p_1$ and $p_2$. The pre-factor has the form of a varying index of refraction. For if we suppose that the Earth’s surface is at $y = 0$, the index of refraction $\eta(y)$ will be a function of the height $y$ only. The propagation time, $\tau$, along a ray connecting the two endpoints $p_1$ and $p_2$ is given by
\[ c\tau = \int_{p_1}^{p_2} \eta(y)\sqrt{1 + y'^2}\, dx, \tag{2.3.2} \]
where the prime denotes differentiation with respect to the independent variable, $x$. The product $c\tau$ is known as the optical path length, where $c$ is the velocity of light in vacuum. According to Fermat’s principle of least time, the optical path length is stationary for the true ray path. In terms of the integrand of (2.3.2),
\[ \Phi(y, y') = \eta(y)\sqrt{1 + y'^2}, \tag{2.3.3} \]
the Euler–Lagrange equation can be written as
\[ \Phi - y' \frac{\partial \Phi}{\partial y'} = C, \tag{2.3.4} \]
where $C = \text{const}$ is a first integral of the motion. Explicitly, the Euler–Lagrange equation (2.3.4) is
\[ \frac{\eta}{\sqrt{1 + y'^2}} = C. \tag{2.3.5} \]
So, the constant $C$ is the value of the index of refraction where the ray becomes horizontal. The angle, $\theta$, formed between the tangent to the ray
Fig. 2.11. Derivation of Snell’s law.
and the normal to the horizontal layers of constant $\eta$ (the vertical), shown in Fig. 2.11, is given by
\[ \theta = \arcsin\left(\frac{dx}{\sqrt{dx^2 + dy^2}}\right) = \arcsin\left(\frac{1}{\sqrt{1 + y'^2}}\right), \]
so that the Euler–Lagrange equation (2.3.5) coincides with Snell’s law,
\[ \eta(y) \sin\theta = C. \tag{2.3.6} \]
According to Snell’s law, the sines of the angles which the incident, $\theta_i$, and transmitted, $\theta_t$, rays make with the normal to an interface between two different media are proportional, i.e.
\[ \frac{\sin\theta_i}{\sin\theta_t} = \eta, \tag{2.3.7} \]
where η is the relative index of refraction of the two media. Expression (2.3.6) generalizes Snell’s law to the case where the index of refraction is a function of the height. Ordinarily, the index of refraction decreases with altitude, and this is borne out by the heated plane model since upon comparing terms in the integrands of (2.3.1) and (2.3.2) we find η = 1/y, and, consequently, dη/dy < 0. The true ray that will connect the two points will be concave: Light minimizes its propagation time by arching its path upwards between the endpoints, like a cat ready to attack. As a result, objects do not appear to be where they are but are a little bit lower than our line of sight.
In contrast, the index of refraction will be an increasing function of height in inversion layers, where mirages are formed. The ray is now convex and the images will be higher than our line of sight. In both cases, these distortions are caused by the non-Euclidean nature of the geometry. This will be a recurrent theme throughout.

Even without making the integral (2.3.1) stationary we can get some remarkable properties of hyperbolic geometry, undoubtedly the most important of which is the angle of parallelism. Transforming to polar coordinates, and considering an arc $\gamma$ along which the angle increases from $\alpha$ to $\pi/2$, we get
\[ \int_\gamma \frac{\sqrt{dx^2 + dy^2}}{y} = \int_\alpha^{\pi/2} \frac{\sqrt{r^2 + r'^2}}{r \sin\theta}\, d\theta \ge \int_\alpha^{\pi/2} \frac{d\theta}{\sin\theta} = -\ln\tan(\alpha/2), \tag{2.3.8} \]
where the prime now stands for differentiation with respect to the independent variable, $\theta$. If we set (2.3.8) proportional to the minimum distance from a point $P$ to a line $\ell$, namely the perpendicular distance $d$, as shown in Fig. 2.12, we have one of the most remarkable formulas in all of mathematics. The number $\alpha$ of radians in the angle of parallelism depends only on the distance $d$ from $P$ to $Q$ and not on the particular line $\ell$, or the particular point $P$. The formula was discovered independently by J. Bolyai and N. Lobachevsky. In Euclidean geometry the rays emanating from $P$ must coincide, so $\alpha$, which is usually written as $\Pi(d)$ in the literature, is always a right angle. However, under Lobachevsky’s postulate these lines are distinct and the angle $\Pi(d)$ is necessarily acute. It is a function only of the hyperbolic distance $d$.
Fig. 2.12. Angle of parallelism.
The Euler–Lagrange equation which renders the integral (2.3.1) stationary is
\[ y'' + \frac{1 + y'^2}{y} = 0. \]
The solution to this equation is a family of circles,
\[ (x - a)^2 + y^2 = c^2, \]
whose centers lie on the $x$-axis, where $a$ and $c$ are constants of integration. Restricting ourselves to the half-circles located in the upper half-plane will give us the Poincaré half-plane model. The semi-circles will be the geodesics of our hyperbolic space.

If the plane were Euclidean, we could draw only one line through any given point parallel to a given straight line. This is Euclid’s fifth postulate. In this plane there would be only one geodesic through a given point that would be parallel to another given geodesic. Not so in our heated plane! Because the geodesics are semi-circles, all the geodesics drawn in Fig. 2.13 through a point $P$ not lying on the geodesic $g$ are parallel to $g$, even $h_1$ and $h_2$, which are tangent to it at the points $U$ and $V$, because those points have been excluded by considering them infinitely far away, i.e. points at infinity. To any student of geometry, this smacks of the geometry of Lobachevsky, who only claimed that there exist two lines parallel to a given line through a given point not on the line. However, this does not mean that he did not recognize that there were infinitely many non-intersecting lines. His parallel
Fig. 2.13. The number of lines passing through $P$ that are hyperparallel to the line $g$ is infinite. The lines $h_1$ and $h_2$ are limiting parallels to $g$, while the others are hyperparallel to $g$.
property is what is now usually referred to as ‘asymptotically parallel’ or ‘horoparallel.’ Here, Lobachevsky’s statement is due to the peculiar nature of the Poincaré half-plane model. Nevertheless, the half-plane model illustrates the hallmark of Lobachevskian geometry: the sum of the angles of a triangle is less than two right angles.

The heated plane model also illustrates other properties of projective geometry. We substitute the positive square root, $\sqrt{c^2 - (x - a)^2}$, for $y$ in (2.3.1) and determine the distance $s$ between two points $x_1$ and $x_2$ as
\[ s = \int_{x_1}^{x_2} \frac{c}{c^2 - (x - a)^2}\, dx = \frac{1}{2}\int_{x_1}^{x_2} \left[\frac{1}{c + x - a} + \frac{1}{c - x + a}\right] dx = \frac{1}{2}\ln\left[\frac{x_2 - (a - c)}{(a + c) - x_2} \cdot \frac{(a + c) - x_1}{x_1 - (a - c)}\right]. \tag{2.3.9} \]
Now, let us define two other $x$-coordinates, $x_3$, $x_4$ with $x_4 > x_3$, as the points where the geodesic intersects the $x$-axis, i.e. $x_3 = a - c$ and $x_4 = a + c$. Substituting these values into (2.3.9) results in
\[ s = \frac{1}{2}\ln\left[\frac{x_2 - x_3}{x_4 - x_2} \cdot \frac{x_4 - x_1}{x_1 - x_3}\right]. \tag{2.3.10} \]
This is precisely the logarithm of the cross-ratio,
\[ \frac{x_2 - x_3}{x_4 - x_2} \cdot \frac{x_4 - x_1}{x_1 - x_3}, \]
of four ordered points, $x_1, x_2, x_3, x_4$. For fixed endpoints $x_3$ and $x_4$, (2.3.10) is the hyperbolic distance between $x_1$ and $x_2$, which we know by (2.2.17) satisfies the triangle inequality.
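The closed form (2.3.10) can be compared with a direct numerical integration of the first integral in (2.3.9). The sketch below uses a simple midpoint rule; the function names, the step count and the sample semicircle ($a = 1$, $c = 2$) are illustrative assumptions.

```python
import math

def arc_length_numeric(x1, x2, a, c, n=20000):
    """Midpoint-rule estimate of the integral of c / (c^2 - (x - a)^2) from x1 to x2."""
    h = (x2 - x1) / n
    total = 0.0
    for k in range(n):
        x = x1 + (k + 0.5) * h
        total += c / (c * c - (x - a) ** 2)
    return total * h

def arc_length_cross_ratio(x1, x2, a, c):
    """Closed form (2.3.10), with x3 = a - c and x4 = a + c."""
    x3, x4 = a - c, a + c
    return 0.5 * math.log((x2 - x3) / (x4 - x2) * (x4 - x1) / (x1 - x3))

# Sample semicircle centered at a = 1 with radius c = 2, and two abscissae on it.
print(arc_length_numeric(0.0, 2.5, a=1.0, c=2.0))
print(arc_length_cross_ratio(0.0, 2.5, a=1.0, c=2.0))   # agrees to several digits
```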
2.4
Models of the Hyperbolic Plane and Their Properties
In the half-plane model, studied at the beginning of this chapter, we found it equipped with the distance function $ds = \sqrt{dx^2 + dy^2}/y$. This is only one model of the hyperbolic plane, because anything with the same metric is also a viable model of the hyperbolic plane.
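In search of these other models, we take our cue from Euclidean and spherical geometries, where equivalent metrics, or isometries, are found by using complex functions. In terms of the complex number $z = x + iy$, the distance $ds = \sqrt{dx^2 + dy^2}/y$ becomes $|dz|/\operatorname{Im} z$. The map from the half-plane to the disc is
\[ w = \frac{iz + 1}{z + i}, \qquad \text{or} \qquad z = \frac{-iw + 1}{w - i}, \]
so that
\[ ds = \frac{|dz|}{\operatorname{Im} z} = \left| d\left(\frac{-iw + 1}{w - i}\right)\right| \bigg/ \operatorname{Im}\left(\frac{-iw + 1}{w - i}\right) = \frac{2|dw|}{|w - i|^2} \bigg/ \operatorname{Im}\left[\frac{(1 - iw)(\bar{w} + i)}{|w - i|^2}\right] = \frac{2|dw|}{1 - |w|^2}. \tag{2.4.1} \]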
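The equality of the two line elements can be illustrated with a finite-difference check: a small displacement $dz$ in the half-plane and its image $dw$ in the disc should have the same hyperbolic length. The sample point, the step and the helper names below are arbitrary choices made for this sketch.

```python
def to_disc(z):
    """Map from the upper half-plane to the unit disc, w = (i z + 1) / (z + i)."""
    return (1j * z + 1) / (z + 1j)

def ds_half_plane(z, dz):
    """Half-plane line element |dz| / Im z."""
    return abs(dz) / z.imag

def ds_disc(w, dw):
    """Disc line element (2.4.1): 2 |dw| / (1 - |w|^2)."""
    return 2 * abs(dw) / (1 - abs(w) ** 2)

z = 0.3 + 0.8j                 # sample point with Im z > 0
dz = 1e-7 * (1 + 2j)           # small displacement standing in for the differential
w = to_disc(z)
dw = to_disc(z + dz) - w       # finite-difference image of dz

print(abs(w) < 1)                               # the image lies inside the unit disc
print(ds_half_plane(z, dz), ds_disc(w, dw))     # equal up to the finite-difference error
```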
Thus, we have two models already of the hyperbolic plane:
• the upper half-plane model with distance $ds = |dz|/\operatorname{Im} z$, where the ‘lines’ are semi-circles perpendicular to the real axis, as in Fig. 2.13, and angles are the same as Euclidean angles; and
• the open disc model with metric (2.4.1), and ‘lines’ that are circular arcs orthogonal to the boundary, as shown in Fig. 2.6, with angles the same as Euclidean angles.
The reason for conformality of the Poincaré disc model is that it took two inversions to go from the half-plane to the disc. We will soon meet yet another disc model which straightens out the circular arcs at the cost of losing angle invariance.

A consequence of having a circle at infinity as a natural boundary is that the points on the rim of the disc are actually located at infinity, as the inhabitants of the unit disc, whom we shall refer to affectionately as ‘Poincarites,’ know.[a] The distance from the origin to any point tends to infinity as the modulus of the point tends to 1.

[a] The circle at infinity will take on a physical guise when it is identified as the limit of the inner solution to the Schwarzschild metric in Sec. 9.10.3. The name ‘Poincarites’ was probably first used by Needham [97].

Lines, or rather circular arcs, which have a common point on the circle
at infinity are known as asymptotic lines. The point where they meet is not a point belonging to the lines but, rather, a limit point, because such points are at ‘infinity.’ In contrast, ultraparallels are circular arcs which cut the circle at infinity but have no common point. The distinction between these two kinds of lines in the unit disc is far from academic. A product of reflections in ultraparallel lines constitutes a translation, whereas a product of reflections in asymptotic lines is a ‘limit’ rotation. Limit rotations on the disc are circles tangent to the circle at infinity, and are known as horocycles, or ‘limit’ cycles.

An amazing find was that a horocycle, or a horosphere in three dimensions, is a circle at infinity that obeys Euclidean, and not hyperbolic, geometry. This discovery was made by Wachter, a student of Gauss, way back in 1816. It will allow us to use Euclidean geometry to determine the properties of the hyperbolic plane, notably the angle of parallelism and the necessity of introducing an absolute constant, or a unit of measure, which is completely foreign to Euclidean geometry.

The horocycle, or circle whose center is at infinity, i.e. on the boundary of the unit disc, is most clearly seen by considering the ‘pseudosphere.’ Of all the mappings of surfaces of constant negative curvature onto the unit disc, it is only the middle figure in Fig. 2.14, which has been adapted from Klein’s 1928 book on non-Euclidean geometry, that shows the horocycles as dashed lines. The solid lines are the image of one turn of the covering of the pseudosphere. All three mappings show that surfaces of constant negative curvature are mapped only onto part of the disc. We postpone a discussion of some of the remarkable properties of the pseudosphere, which is to hyperbolic geometry what the plane and sphere are to Euclidean and elliptic geometry, respectively, and use the following property of horocycles to derive the angle of parallelism.
The ratio of any two concentric limiting arcs cut by radii depends only on the distance between them and not on their size or where they are located in the hyperbolic plane.

It is by no means an exaggeration to say that all the trigonometric relations of hyperbolic geometry follow from the fact that the ratio of concentric limiting arcs $l$ and $m$, with $l > m$, intercepted between two radii is
Fig. 2.14. Surfaces of negative constant curvature that are mapped onto part of the hyperbolic plane. The middle figure is the mapping of a pseudosphere that produces horocycles as dashed curves.
given by
\[ l/m = e^{a/\kappa}, \tag{2.4.2} \]
where a is the distance between the arcs and κ is a positive (absolute) constant. In Fig. 2.15 the arcs l, m, and n are cut by two radii. The distance between the first two is a, while the distance between the second and third arcs is b. We know that the ratios depend on the distance between them but
Fig. 2.15. The ratio of concentric limiting arcs depends only on the distance between them.
we do not know the functional form. That is,
\[ l/m = f(a), \qquad m/n = f(b), \qquad l/n = f(a + b), \]
where $f$ is a positive, increasing function. These relations suggest the functional relation
\[ f(a) \cdot f(b) = f(a + b). \]
Such a functional relation can only be satisfied by an exponential function. The transfer from any base $g > 1$ to the constant $e$ entails introducing an absolute constant $\kappa$ such that
\[ g^a = (e^{\ln g})^a = e^{a/\kappa}, \]
and on account of $g > 1$, $\kappa = 1/\ln g > 0$.

If we go back to the pseudosphere, we find that for the coordinates of any point on its surface the following remarkable relationship holds:
\[ x^2 + y^2 < \kappa^2. \]
When we go to plot these points on a Euclidean plane, as in Fig. 2.14, they are constrained to lie within a circle of radius $\kappa$. All points on the entire pseudosphere are thus constrained to lie within a circle of radius $\kappa$ on a Euclidean plane. This radius is called the radius of curvature, or space constant, and is an absolutely determined length. It is the analog of the radius of a sphere in spherical geometry under the transform $\kappa \to i\kappa$. And although it is absolutely determined, its magnitude will depend upon the units chosen.
Fig. 2.16. Using Euclidean geometry to derive the angle of parallelism by considering concentric limiting arcs.
The arc lengths $C'B'$ and $CB$ belong to two concentric horocycles in Fig. 2.16. The angle at $C$ is a right angle, so that by Euclidean geometry $B'C' = l \sin\beta$ and $AC = l \cos\beta$. The distance $b = CC'$ between the concentric arcs is given by
\[ e^{b/\kappa} = e^{CC'/\kappa} = \frac{l}{B'C'} = \frac{l}{l \sin\beta} = \csc\beta. \]
Now, the ratio of the concentric limiting arcs $l + AC$ and $l$ is [Kulczycki 61]
\[ e^{a/\kappa} e^{-b/\kappa} = \frac{l + AC}{l} = 1 + \cos\beta, \]
and consequently,
\[ e^{a/\kappa} = \frac{1 + \cos\beta}{\sin\beta} = \cot(\beta/2). \]
Denoting $\beta = \Pi(a)$ as the angle of parallelism, which can only depend on the hyperbolic distance $a$, we obtain the Bolyai–Lobachevsky formula,
\[ \tan\frac{\Pi(a)}{2} = e^{-a/\kappa}. \tag{2.4.3} \]
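A short numerical illustration of (2.4.3): the angle of parallelism is a right angle at zero distance, decreases monotonically with distance, and satisfies $\cot(\Pi/2) = e^{a/\kappa}$. The function name and the sample distances are illustrative, and $\kappa = 1$ is assumed by default.

```python
import math

def angle_of_parallelism(a, kappa=1.0):
    """Pi(a) from (2.4.3): tan(Pi(a)/2) = exp(-a/kappa); result in radians."""
    return 2.0 * math.atan(math.exp(-a / kappa))

for a in (0.0, 0.5, 1.0, 3.0):
    beta = angle_of_parallelism(a)
    # (1 + cos beta) / sin beta = cot(beta/2) should reproduce exp(a/kappa).
    print(a, math.degrees(beta), (1 + math.cos(beta)) / math.sin(beta), math.exp(a))
```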
Euclidean geometry can be used in the hyperbolic plane to derive non-Euclidean results by considering the properties of concentric horocycles, and from their property (2.4.2) all the trigonometric formulas of hyperbolic geometry follow. Now consider $M_1(x_1, y_1)$ and $M_2(x_2, y_2)$ as any two points on a horocycle lying in the unit disc. Let $M(x, y)$ be any point on the arc $M_1 M_2$. With
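$M_1 M / M M_2 = \lambda$, the relations between the coordinates are known to be
\[ x = \frac{x_1 + \lambda x_2}{1 + \lambda}, \qquad y = \frac{y_1 + \lambda y_2}{1 + \lambda}. \tag{2.4.4} \]
The values $\lambda_1$ and $\lambda_2$ will be the roots obtained when $M$ lies on the unit circle,
\[ x^2 + y^2 = 1, \tag{2.4.5} \]
that is, when $M$ coincides with either $P$ or $Q$ on the boundary. Introducing (2.4.4) into (2.4.5) gives
\[ (x_1 + \lambda x_2)^2 + (y_1 + \lambda y_2)^2 - (1 + \lambda)^2 = 0. \]
Defining
\[ \Omega_{11} = x_1^2 + y_1^2 - 1, \qquad \Omega_{22} = x_2^2 + y_2^2 - 1, \qquad \Omega_{12} = x_1 x_2 + y_1 y_2 - 1, \]
the former equation can be written as the quadratic equation
\[ \Omega_{22}\lambda^2 + 2\Omega_{12}\lambda + \Omega_{11} = 0. \]
The ratio of the roots of this quadratic,
\[ \lambda_{1,2} = \frac{-\Omega_{12} \pm \sqrt{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}}{\Omega_{22}}, \]
is the cross-ratio,
\[ \{M_1, M_2|P, Q\} = \frac{\lambda_2}{\lambda_1} = \frac{\Omega_{12} + \sqrt{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}}{\Omega_{12} - \sqrt{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}}. \]
Thus, the distance between the two points $M_1$ and $M_2$ is
\[ s(M_1, M_2) = \frac{\kappa}{2}\ln\frac{\Omega_{12} + \sqrt{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}}{\Omega_{12} - \sqrt{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}} = \kappa \tanh^{-1}\left(\frac{\sqrt{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}}{\Omega_{12}}\right), \]
where $\kappa$ is the absolute constant.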
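The distance formula is easy to evaluate directly. In the sketch below the absolute value is taken so that the result is non-negative, since $\Omega_{12} < 0$ for two points inside the unit circle; that choice, the function names and the sample points are assumptions of this illustration rather than part of the derivation.

```python
import math

def omega(p, q):
    """Bilinear form Omega = x_p x_q + y_p y_q - 1 attached to the unit circle."""
    return p[0] * q[0] + p[1] * q[1] - 1.0

def disc_distance(m1, m2, kappa=1.0):
    """kappa * |artanh( sqrt(O12^2 - O11 O22) / O12 )| for two points inside the unit circle."""
    o11, o22, o12 = omega(m1, m1), omega(m2, m2), omega(m1, m2)
    root = math.sqrt(max(o12 * o12 - o11 * o22, 0.0))   # guard against rounding below zero
    return kappa * abs(math.atanh(root / o12))

# From the center to a point at Euclidean radius r the distance is artanh(r).
print(disc_distance((0.0, 0.0), (0.5, 0.0)), math.atanh(0.5))

# Distances add along a straight chord of the disc.
m1, m2, m3 = (-0.5, 0.3), (0.0, 0.3), (0.6, 0.3)
print(disc_distance(m1, m2) + disc_distance(m2, m3), disc_distance(m1, m3))
```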
We now transfer this result to velocity space. Let $u$ and $v$ be velocities, so that $s$ becomes the relative velocity, $w$. The equation for the unit disc transforms into
\[ u^2 + v^2 = c^2, \]
where $c$ is the speed of light. Let us consider that $v$ is infinitesimally close to $u$ such that $v = u + du$. Now,
\[ \Omega_{11} = u^2 + v^2 - c^2, \qquad \Omega_{22} = du^2 + dv^2, \qquad \Omega_{12} = u\, du + v\, dv, \]
and for small arguments, the inverse hyperbolic tangent can be approximated by the argument itself, so that
\[ dw^2 = c^2\, \frac{\Omega_{12}^2 - \Omega_{11}\Omega_{22}}{\Omega_{11}^2} = c^2\, \frac{(u\, du + v\, dv)^2 - (u^2 + v^2 - c^2)(du^2 + dv^2)}{(u^2 + v^2 - c^2)^2} = c^2\, \frac{c^2 (du^2 + dv^2) - (v\, du - u\, dv)^2}{(u^2 + v^2 - c^2)^2}. \tag{2.4.6} \]
Equation (2.4.6) is the famous Beltrami metric, although the derivation we followed is that of Klein [71]. If the velocities are infinitesimally close to one another, such that $\mathbf{v} = \mathbf{u} + d\mathbf{u}$, then (2.4.6) becomes
\[ dw^2 = c^2\, \frac{c^2 (d\mathbf{u})^2 - (\mathbf{u} \times d\mathbf{u})^2}{(c^2 - u^2)^2}, \]
or
\[ dw^2 = c^2 \left[\frac{(d\mathbf{u})^2}{c^2 - u^2} + \frac{(\mathbf{u} \cdot d\mathbf{u})^2}{(c^2 - u^2)^2}\right]. \tag{2.4.7} \]
We will come across the Beltrami metric on numerous occasions, for example in the radiation pressure in Sec. 4.2.4, for the uniformly rotating disc in Sec. 9.6, and in the Thomas precession in Sec. 10.1, which was discovered by Borel.
For a finite difference between the velocities $\mathbf{u}$ and $\mathbf{v}$, the relative velocity $w$ is
\[ w^2 = \frac{(\mathbf{u} - \mathbf{v})^2 - (\mathbf{u} \times \mathbf{v})^2/c^2}{(1 - \mathbf{u} \cdot \mathbf{v}/c^2)^2}. \tag{2.4.8} \]
The square of the relative velocity, (2.4.8), is invariant under Möbius transforms (2.2.18) with $|\bar{a}| > |\bar{c}|$. If we replace the velocities $\mathbf{u}$ and $\mathbf{v}$ by $\mathbf{u}'$ and $\mathbf{v}'$ relative to some other frame, the value of (2.4.8) will be unaffected by the change.
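The frame independence of (2.4.8) can be illustrated numerically in two dimensions. The sketch below does not use the Möbius form (2.2.18); instead it assumes the standard relativistic velocity transformation to a frame moving with velocity $\mathbf{V}$, works in units with $c = 1$, and uses illustrative function names and sample velocities.

```python
import math

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def cross(u, v):
    """z-component of the cross product of two planar vectors."""
    return u[0] * v[1] - u[1] * v[0]

def relative_speed_sq(u, v, c=1.0):
    """Right-hand side of (2.4.8) for planar velocities u and v."""
    du = (u[0] - v[0], u[1] - v[1])
    return (dot(du, du) - cross(u, v) ** 2 / c ** 2) / (1 - dot(u, v) / c ** 2) ** 2

def boost(u, V, c=1.0):
    """Velocity u as seen from a frame moving with velocity V (standard transformation)."""
    V2 = dot(V, V)
    if V2 == 0:
        return u
    gamma = 1.0 / math.sqrt(1 - V2 / c ** 2)
    uV = dot(u, V)
    par = (uV / V2 * V[0], uV / V2 * V[1])      # component of u parallel to V
    perp = (u[0] - par[0], u[1] - par[1])       # component of u perpendicular to V
    denom = 1 - uV / c ** 2
    return ((par[0] - V[0] + perp[0] / gamma) / denom,
            (par[1] - V[1] + perp[1] / gamma) / denom)

u, v, V = (0.3, 0.2), (-0.4, 0.5), (0.6, -0.1)
print(relative_speed_sq(u, v))
print(relative_speed_sq(boost(u, V), boost(v, V)))   # same value in the new frame
```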
2.5
A Brief History of Hyperbolic Geometry
The elliptic plane can be developed on a sphere; the hyperbolic plane on a hyperboloid. It took almost two thousand years to appreciate that on such planes Euclid’s fifth postulate would be violated. Hyperbolic geometry was born in 1829 when Lobachevsky showed that in a right triangle with a fixed side $d$, as the opposite vertex $P$ moves infinitely far away, the angle $\alpha$ increases to a limit $\alpha_0 = \Pi(d) < \pi/2$, as shown in Fig. 2.17. With a properly chosen unit of distance, Lobachevsky derived
\[ d = -\ln\tan(\Pi(d)/2). \]
Some forty years later, Beltrami showed that this unit of distance corresponds to a surface of constant negative curvature, $-1$.

The year after his 1829 paper, Lobachevsky was already looking for applications of his ‘imaginary’ geometry. If the universe is indeed non-Euclidean, then the unit of distance must be much larger than our solar system. If the vertex is the star Sirius, and the distance $d$ is that of the Earth’s orbit, the parallax of Sirius would be 1.24″.[b] The parallax of stars is the annual oscillation
Fig. 2.17. A right triangle in hyperbolic space: As $P$ recedes without limit the angle tends to the angle of parallelism, which is a function only of $d$.

[b] Actually, the parallax of Sirius is 0.37″.
of the star’s apparent position due to the Earth’s motion about the sun. It depends on the distance that the star is from the Earth. Bradley, back in 1725, tried to measure the distance to a star using the diameter of the Earth’s orbit as a baseline. He had hoped to determine stellar distances in much the same way that surveyors measure distances by triangulation. However, what he measured was not the parallax, for it depended on the Earth’s motion, and not on its position at a given point in the orbit. We will return to the phenomenon he discovered, stellar aberration, in Sec. 10.1. The parallax is measured by comparing the apparent position of a star $S$ with that of some reference star $S'$ which is much more distant. This is shown in Fig. 2.18, where the angle $\phi$ is the parallax of the star. It is the upper limit on the defect $D$ of the triangle $ASM$,
\[ D = \pi - \left(\frac{\pi}{2} + \alpha + \beta\right) < \frac{\pi}{2} - \alpha = \phi. \]
Three years after Lobachevsky’s first publication, J. Bolyai published his own version of hyperbolic geometry. As we saw in Sec. 1.2.1, Gauss, a friend of Bolyai senior, did nothing to encourage his son to develop his ideas further. Probably the reason can be found in Gauss’s 1824 letter to Taurinus where he writes:

I have sometimes in jest expressed the wish that Euclidean geometry is not true. For then we would have an absolute a priori unit of measurement.
Gauss never published anything during his lifetime on hyperbolic geometry, as it came to be known. We have also mentioned in the preface that
Fig. 2.18. The parallax of a star.
Riemann noted in his Habilitation that the metric of a manifold of constant curvature, $\alpha$, could be written as
\[ ds = \frac{\sqrt{dx_1^2 + \cdots + dx_n^2}}{1 + \frac{\alpha}{4}(x_1^2 + \cdots + x_n^2)}, \]
where $\alpha = +1, -1$ for elliptic and hyperbolic spaces, respectively.

The next advance came in 1868 with two publications by Beltrami [Stillwell 91]. The conclusion of his first paper was that two-dimensional non-Euclidean geometry is simply the study of surfaces of constant negative curvature. He coined the name ‘pseudosphere’ of radius $R$ for a bugle-shaped surface of constant negative curvature, $-1/R^2$. The pseudosphere is the surface of revolution that is obtained by revolving the tractrix about its axis of symmetry, as shown in Fig. 2.19. The pseudosphere has total curvature $-2\pi$, which, when divided by its constant curvature, $-1/R^2$, gives, surprisingly, a finite area $2\pi R^2$. A tractrix is the track left by a dog on a leash of unit length who is being pulled by his master walking along the $x$-axis. The curve is determined by the property that its tangent lines meet the $x$-axis at a unit distance from the point of tangency. The tractrix was known to Newton as far back as 1676, and the pseudosphere was investigated by Huygens as early as 1693. Huygens established that its surface area is finite, and found that the volume and the center of mass of the enclosed solid are also finite.

After having read Riemann’s 1854 inaugural address, which was only published posthumously in 1868, Beltrami realized, in his second paper,
Fig. 2.19. Tractrix and pseudosphere as its surface of revolution.
that the points of an $n$-dimensional non-Euclidean geometry are identical to the interior points of a hemisphere,
\[ y = \sqrt{c^2 - x_1^2 - \cdots - x_n^2}, \qquad y \ge 0, \]
in an $(n + 1)$-dimensional Euclidean space, and provided it with the Riemann metric,
\[ ds = R\, \frac{\sqrt{dx_1^2 + \cdots + dx_n^2 + dy^2}}{y}. \tag{2.5.1} \]
The metric (2.5.1) is an obvious generalization of (2.1.1) to give it $n - 1$ more dimensions. So it was Beltrami who actually discovered the Poincaré disc model some 14 years before Poincaré did! Beltrami also appreciated that the boundary points at $y = 0$ are infinitely far from the interior in this metric. These points are the coldest regions of the heated plane model, and are referred to as points at infinity in the projective plane.

Projecting the hemisphere stereographically onto the disc, $x_1^2 + \cdots + x_n^2 \le c^2$, Beltrami obtained the conformal disc model with metric
\[ ds = \frac{\sqrt{dz_1^2 + \cdots + dz_n^2}}{1 - \frac{1}{4R^2}(z_1^2 + \cdots + z_n^2)}, \]
which had already been obtained by Riemann. Then, performing an inversion in a boundary point of the disc, Beltrami obtained the half-plane model, with coordinates $x_1, \ldots, x_n$ and $y \ge 0$, whose metric is given by (2.5.1). Beltrami gave credit to Liouville for having written down the two-dimensional case earlier, precisely the credit that was denied to him for having discovered the half-plane model in the $n$-dimensional case! The two-dimensional formula, (2.1.1), was derived by Liouville in 1850 by mapping the pseudosphere into the half-plane, but he did not realize that the half-plane with his distance formula was a model of hyperbolic geometry. Twenty-one years were to pass before Klein formulated Beltrami’s projective disc model in the language of projective geometry.

A sphere in elliptic geometry with radius $R$ has constant positive curvature, $1/R^2$. A sphere in hyperbolic space also has constant curvature, but it
is negative, $-1/R^2$. Therefore, if an elliptic space has radius $R$, its hyperbolic counterpart has radius $iR$. To construct an $n$-dimensional sphere we begin with the equation of a circle, to which we have added an extra dimension,
\[ c^2 = x_0^2 + x_1^2 + \cdots + x_n^2. \]
This gives rise to a Euclidean metric,
\[ ds^2 = dx_0^2 + dx_1^2 + \cdots + dx_n^2. \]
Constraining this metric to the unit sphere, $c = 1$, gives a Riemannian metric of constant positive curvature, $+1$. Alternatively, if we begin with the indefinite metric,
\[ ds^2 = -dx_0^2 + dx_1^2 + \cdots + dx_n^2, \tag{2.5.2} \]
which is associated with a hyperbola, $c^2 = -x_0^2 + x_1^2 + \cdots + x_n^2$, then a sphere of radius $i$ centered at the origin is the hyperboloid, $c^2 = -1$. A hyperboloid is the surface of revolution obtained by rotating the hyperbola around $x_0$.

We pause for a moment to relate (2.5.2) to space-time. Consider two inertial frames, $S$ and $S'$, both traveling at the same uniform speed but in opposite directions. The space-time coordinates of one frame, $x$, $t$, must be linear functions of those of the other frame, $x'$, $t'$, viz.
\[ x = Ax' + Bt', \qquad x' = Ax - Bt. \]
If we place ourselves at the origin of $S$, $x = 0$, we measure a velocity $x'/t' = -B/A = -u$ in $S'$. Likewise, if we are at the origin of $S'$, $x' = 0$, we measure a velocity $x/t = B/A = u$ in $S$. Now, we consider the propagation of light signals in both frames; in $S$ we have $x = ct$, while in $S'$, $x' = ct'$. When these equations are introduced into the above pair of linear equations, we get
\[ t = (A + B/c)\,t', \qquad t' = (A - B/c)\,t. \]
The times can be eliminated from these equations to get a condition on the constants, i.e.
\[ c^2 = A^2 (c^2 - u^2), \]
where we used $B/A = u$. Rearranging we find
\[ A = \frac{1}{\sqrt{1 - u^2/c^2}} =: \gamma. \]
Thus, the Lorentz transformation is
\[ x' = \gamma(x - ut), \qquad ct' = \gamma(ct - ux/c). \tag{2.5.3} \]
Now squaring both sides of the first equation, and subtracting it from the square of the second gives
\[ x'^2 - c^2t'^2 = x^2 - c^2t^2. \]
We may now identify $x_0$ with $ct$ in (2.5.2). The Lorentz transformation thus consists in passing from one set to another set comprised of a time-like semi-diameter $ct = 1$, and a space-like semi-diameter $x = 1$ of the hyperboloid,
\[ x^2 + y^2 + z^2 - (ct)^2 = -1, \tag{2.5.4} \]
and taking the lengths of the new semi-diameters for time and space coordinates. Actually, Minkowski wrote the Lorentz transform (2.5.3) in the form:
\[ x' = x\cos\omega + (ct)\sin\omega, \qquad y' = y, \qquad z' = z, \qquad ct' = -x\sin\omega + (ct)\cos\omega, \]
and concluded that the Lorentz transformation may be described as a rotation in a four-dimensional space $x, y, z, ct$, through an imaginary angle $\omega$ in the plane $x, ct$, or ‘round the plane’ $y, z$.
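The invariance of $x^2 - c^2t^2$ under (2.5.3) can be checked numerically. The following is a minimal sketch, not part of the original derivation; the function names `gamma`, `lorentz` and `interval` are illustrative choices, and the transformation is written for $t'$ rather than $ct'$.

```python
import math

C = 299_792_458.0  # speed of light in vacuo, m/s


def gamma(u, c=C):
    """The constant A = 1/sqrt(1 - u^2/c^2) found in the derivation."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def lorentz(x, t, u, c=C):
    """Apply (2.5.3): x' = gamma (x - u t), t' = gamma (t - u x / c^2)."""
    g = gamma(u, c)
    return g * (x - u * t), g * (t - u * x / c**2)


def interval(x, t, c=C):
    """The quadratic form x^2 - (c t)^2 that the transformation leaves unchanged."""
    return x**2 - (c * t) ** 2


if __name__ == "__main__":
    # Units with c = 1 keep the numbers readable.
    x, t, u = 3.0, 5.0, 0.6
    xp, tp = lorentz(x, t, u, c=1.0)
    print(interval(x, t, c=1.0), interval(xp, tp, c=1.0))
```

A light signal, $x = ct$, is mapped to $x' = ct'$ by the same functions, which is the condition that fixed the constant $A$.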
Minkowski took seriously his pseudo-Euclidean space for he wrote in his 1909 “Space and time” paper: The world postulate permits identical treatment of the four coordinates $x, y, z, t$. By this means, as I shall now show, the forms in which the laws of physics are displayed gain in intelligibility. In particular the idea of acceleration acquires a clear-cut character. I will use a geometrical manner of expression, which suggests itself at once if we tacitly disregard $z$ in the triplet $x, y, z$. I take any world-point $O$ as the zero-point of space-time. The cone $(ct)^2 - x^2 - y^2 = 0$ with apex $O$ in Fig. 2.20 consists of two parts, one with values $t < 0$, the other with values $t > 0$. The former, the front cone of $O$ consists, let us say, of all the world-points which “send light to $O$,” the latter, the back cone of $O$, of all the world-points which “receive light from $O$.” The
Fig. 2.20. Minkowski’s vision of space-time.
territory bounded by the front cone alone, we call “before” $O$, that which is bounded by the back cone alone, “after” $O$. The hyperboloid sheet already discussed [(2.5.4)] lies after $O$. The territory between the cones is filled by the one-sheeted hyperboloid figures $x^2 + y^2 + z^2 - (ct)^2 = k^2$ for all constant positive $k$. We are specially interested in the hyperbolas with $O$ as center, lying on the latter figures. The single branches of these hyperbolas may be called briefly the internal hyperbolas with center $O$. One of these branches, regarded as a world-line, would represent a motion which, for $t = -\infty$ and $t = \infty$, rises asymptotically to the velocity of light, $c$.
Minkowski’s view of space-time has been reproduced in almost every book written on the special theory of relativity. It has led to many speculative thought-experiments regarding communication and space travel. Yet, it lies outside the domain of the hyperbolic plane, and it is in this plane where all the physics occurs, including the velocity addition law which stands in the defense of Poincaré’s (Einstein’s) postulate that $c$ be the limiting velocity. In order to get a projective space we have to identify antipodal points. These points lie on disjoint sheets of the hyperboloid, the north, $N$, and south, $S$, poles. Just as we have different types of maps which represent the surface of the Earth, different maps can be used to represent the hyperbolic plane. But, in order to fully understand what Beltrami did let us consider the projection of the sphere onto a plane. The map from the hemisphere onto a plane is called a geodetic projection. A sphere is centered at the origin and has a radius $r$. A map from a horizontal plane at height $\kappa$ to the hemisphere is given by the projection along the radius by the magnification $(u, v, w) \to (\lambda u, \lambda v, \lambda r)$ such that $(\lambda u)^2 + (\lambda v)^2 + (\lambda r)^2 = \kappa^2$. Solving for the magnification
yields [cf. (2.2.11) above]
\[ \lambda = \frac{\kappa}{\sqrt{r^2 + u^2 + v^2}}. \]
The distance between $(u, v)$ and $(u + du, v + dv)$ is given by the first fundamental form
\[ dw^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2 = \kappa^2\,\frac{(r^2 + v^2)\,du^2 - 2uv\,du\,dv + (r^2 + u^2)\,dv^2}{(r^2 + u^2 + v^2)^2}. \]
The absolute constant, $\kappa$, is related to the constant positive curvature, i.e. $1/\kappa^2$. To see the effect on how distances become distorted, just set $v = 0$,
\[ dw = \kappa\,\frac{r\,du}{r^2 + u^2}. \]
In order to get the same increments in $dw$, we have to consider larger and larger increments in $du$, as depicted in Fig. 2.7. This is to say that viewed by us Euclideans, it appears that our rulers become longer and longer the farther we travel from the origin. In contrast, Beltrami considered the projection of a pseudosphere onto a horizontal plane by changing the surface to one of constant negative curvature, $-1/\kappa^2$. Beltrami called the surface of constant negative curvature a ‘pseudosphere,’ changing the name Minding had given his surface, Fig. 2.19. The pseudosphere, as we have mentioned earlier, possesses some remarkable properties. Many geodesics can be drawn through a point on the surface of the pseudosphere that never meet a given geodesic. If three angles of one triangle are equal respectively to three angles of another, the triangles have equal areas. This is true also in elliptic geometry, and shows that in non-Euclidean geometries, the angles determine the sides of a triangle, something that is not true in Euclidean geometry. In other words, the size of a triangle cannot be altered without distorting it. Beltrami writes in his Saggio that
\[ dw^2 = \kappa^2\,\frac{(r^2 - v^2)\,du^2 + 2uv\,du\,dv + (r^2 - u^2)\,dv^2}{(r^2 - u^2 - v^2)^2} \tag{2.5.5} \]
represents the square of a line element on a surface whose spherical curvature is constant, negative, and equal to $[-1/\kappa^2]$. The form of this expression … has the particular advantage (from our point of view) that a linear equation in $u, v$ represents a geodesic and, conversely, any geodesic is representable by a linear equation in these variables.
In order to derive (2.5.5), Beltrami considered the map from a horizontal plane of height $r$ to the hyperboloid $H^+$ given by the projection $(u, v, w) \to (\lambda u, \lambda v, \lambda w)$, where $w = r$, onto the hyperboloid $(\lambda r)^2 - (\lambda u)^2 - (\lambda v)^2 = \kappa^2$. To see what distortion there is, we again set $v = 0$,
\[ dw = \kappa\,\frac{r\,du}{r^2 - u^2}. \]
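The contrast between the two $v = 0$ line elements, $\kappa r\,du/(r^2 + u^2)$ for the sphere and $\kappa r\,du/(r^2 - u^2)$ for the surface of negative curvature, can be made concrete by integrating them from the origin. The sketch below is an illustration only; the closed forms $\kappa\arctan(u/r)$ and $\kappa\tanh^{-1}(u/r)$ follow from elementary integration, and the helper `midpoint_integral` is a hypothetical name for a simple quadrature rule used to confirm them.

```python
import math


def sphere_scale(u, r=1.0, kappa=1.0):
    """dw/du along v = 0 for the sphere: kappa r / (r^2 + u^2)."""
    return kappa * r / (r**2 + u**2)


def hyperbolic_scale(u, r=1.0, kappa=1.0):
    """dw/du along v = 0 for negative curvature: kappa r / (r^2 - u^2), |u| < r."""
    return kappa * r / (r**2 - u**2)


def sphere_distance(u, r=1.0, kappa=1.0):
    """Integral of sphere_scale from 0 to u; bounded by kappa * pi / 2."""
    return kappa * math.atan(u / r)


def hyperbolic_distance(u, r=1.0, kappa=1.0):
    """Integral of hyperbolic_scale from 0 to u; diverges as u -> r."""
    return kappa * math.atanh(u / r)


def midpoint_integral(f, a, b, n=100_000):
    """Midpoint-rule quadrature, used only to cross-check the closed forms."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    for u in (0.5, 0.9, 0.99, 0.999):
        print(u, sphere_distance(u), hyperbolic_distance(u))
```

On the sphere the total distance stays finite however far $u$ runs, whereas in the hyperbolic case it grows without bound as $u \to r$, which is the sense in which the rim of the disc is infinitely far away.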
Now equal increments in $dw$ require smaller and smaller increments in $du$ as we move away from the origin. In other words, our rulers become smaller and smaller as they approach the rim of the disc so that it would appear that the rim is infinitely far away. Beltrami achieved this without producing a surface in three-dimensional Euclidean space that would be analogous to a hemisphere, for no such surface exists. The pseudosphere cannot be considered such a surface since it has a discontinuity where geodesics cannot tread. Klein, however, did indicate how a pseudosphere could be drawn on Beltrami’s disc. Cut it open and place the cusp end on the rim of the disc. The above metric tells us that the rim of the disc is infinitely far away. Now, what Beltrami did was to project stereographically the upper hyperboloid sheet, $H^+$, onto the unit disc so that all the rays would converge at the origin, $O$, as shown in Fig. 2.21. In this way Klein got the projective model, which now bears his name, of rays falling onto the horizontal plane $z_0 = \text{const}$. Somewhat earlier, Weierstrass introduced coordinates, analogous to spherical coordinates, to describe a sphere of imaginary radius. Since the radial coordinate is imaginary it is not difficult to see that the appropriate coordinates are
\[ x = \kappa\sinh(r/\kappa)\cos\varphi, \qquad y = \kappa\sinh(r/\kappa)\sin\varphi, \qquad z = \kappa\cosh(r/\kappa), \]
where $\kappa$ is the absolute constant that sets the scale. If we square the third term and subtract the squares of the other two terms we get
\[ z^2 - x^2 - y^2 = \kappa^2, \]
Fig. 2.21. Projection of the hyperboloid onto the plane.
which is none other than our hyperboloid with $z = ct$. The semi-diameter $\kappa$ sets the scale and is often set equal to unity for mere convenience. On his surface of constant negative curvature, $-1/\kappa^2$, Beltrami used the Weierstrass coordinates in the form of ratios. That is, there is a map from the upper half-hyperboloid, $H^+$, to any disc with its center on the $z$-axis parallel to the $xy$-plane. A convenient choice has the disc touching the upper hyperboloid at its lowest point $z = 1$. The map sends the coordinates, $(a\sinh(r/\kappa)\cos\varphi,\ a\sinh(r/\kappa)\sin\varphi,\ a\cosh(r/\kappa))$, into the homogeneous coordinates $(u, v, 1)$, where
\[ u = a\tanh(r/\kappa)\cos\varphi, \qquad v = a\tanh(r/\kappa)\sin\varphi, \tag{2.5.6} \]
are rectilinear coordinates. Projective geometry allows points at infinity, such as those where parallel train tracks merge in a drawing, to be placed on the same level as any other coordinates in $(X, Y) \in \mathbb{R}^2$. The so-called homogeneous coordinates were introduced by Möbius and Plücker in the early part of the nineteenth century. By extending the coordinates to all real triples, $(x, y, z)$, and dividing through by $z$ to obtain $(x/z, y/z, 1)$, these triples are just the coordinates
of a line in $\mathbb{R}^3$ from 0 to $(X, Y)$, where $X = x/z$ and $Y = y/z$ are the homogeneous coordinates. All horizontal lines whose points have coordinates $(x, y, 0)$ correspond to points at infinity. Thus, the extra coordinate $z$ enables us to create new points, and, in particular, the points at infinity. As $z \to 0$, both $X$ and $Y$ tend to infinity, so it is entirely reasonable to consider them as ‘points at infinity.’ If Weierstrass’s coordinates are interpreted as the space-time coordinates in a two-dimensional space then Beltrami’s coordinates are the velocity coordinates of the fundamental disc. In Beltrami’s own words: If we denote by the letters $x$ and $y$ the rectangular coordinates of points of an auxiliary plane, then the equations
\[ x = u, \qquad y = v \]
determine a representation of the region under investigation in which to every point of the region there corresponds a uniquely determined point of the plane and vice versa; and the whole region turns out to be represented in the interior of a circle of radius [c] with center at the origin that we will call the limit circle. In this representation the chords of the limit circle correspond to the geodesics of the surface and, in particular, the parallels to the coordinate axes correspond to the coordinate geodesic lines.
In fact, Beltrami’s coordinates,
\[ u = c\tanh(r/\kappa)\cos\varphi, \qquad v = c\tanh(r/\kappa)\sin\varphi, \]
satisfy $u^2 + v^2 = c^2\tanh^2(r/\kappa) < c^2$. Taking the square root and inverting we have
\[ r = \kappa\tanh^{-1}\frac{\sqrt{u^2 + v^2}}{c} = \frac{\kappa}{2}\ln\frac{c + \sqrt{u^2 + v^2}}{c - \sqrt{u^2 + v^2}}. \]
The argument of the logarithm is none other than the cross-ratio $\{0, \sqrt{u^2 + v^2}\,|\,c, -c\}$, where $\sqrt{u^2 + v^2}$ is the Euclidean distance from the center of the disc to a given point. Hence, $r$ is the distance between any two arbitrary points in Beltrami’s model; and the whole sphere of imaginary radius, or hyperboloid, is represented by the interior of the circle of radius $c$, the speed of light in vacuo.
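The passage from Weierstrass’s coordinates to Beltrami’s ratios, and back to the hyperbolic distance $r$, can be traced numerically. The sketch below is illustrative only; the function names are hypothetical, and the disc radius `c` and the scale `kappa` are free parameters as in the text.

```python
import math


def weierstrass(r, phi, kappa=1.0):
    """Point on the upper sheet z^2 - x^2 - y^2 = kappa^2."""
    x = kappa * math.sinh(r / kappa) * math.cos(phi)
    y = kappa * math.sinh(r / kappa) * math.sin(phi)
    z = kappa * math.cosh(r / kappa)
    return x, y, z


def beltrami(x, y, z, c=1.0):
    """Ratios of the Weierstrass coordinates, scaled to a disc of radius c."""
    return c * x / z, c * y / z


def distance_from_centre(u, v, c=1.0, kappa=1.0):
    """r = (kappa / 2) ln[(c + s) / (c - s)] with s = sqrt(u^2 + v^2)."""
    s = math.hypot(u, v)
    return 0.5 * kappa * math.log((c + s) / (c - s))


if __name__ == "__main__":
    x, y, z = weierstrass(1.3, 0.7, kappa=2.0)
    u, v = beltrami(x, y, z, c=3.0)
    # The logarithm of the cross-ratio returns the r we started from.
    print(distance_from_centre(u, v, c=3.0, kappa=2.0))
```

Every point of the hyperboloid lands strictly inside the circle of radius $c$, and half the logarithm of the cross-ratio, scaled by $\kappa$, recovers the radial coordinate $r$.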
Fig. 2.22. Geodesics determined by planes cutting the hyperboloid and passing through the center.
Geodesics on the hyperboloid are described by planes that pass through a given point $p$ in the direction of the tangent vector to the hyperboloid and pass through the origin. This is shown in Fig. 2.22. The projection onto the unit disc makes them straight lines, which is what we normally expect of geodesics. But, there is a price to pay, namely, angles become distorted and the model is not conformal. So if we restrict ourselves to the disc, where hyperbolic geometry rules, we cannot reason in terms of space-time, but, rather, in terms of their ratios, the velocities. The distance we want is the distance between two relative velocities. And although Beltrami provided the first proof of the consistency of Lobachevsky’s plane geometry by representing it in the Euclidean plane, he gave no formula for the distance between two arbitrary points. Klein began with Cayley’s expression for the non-Euclidean measure of distance,
\[ \frac{1}{2}\ln\left(\frac{e(ap)}{e(aq)}\cdot\frac{e(bq)}{e(bp)}\right) > 0, \tag{2.5.7} \]
which we recognize as one-half the natural logarithm of the cross-ratio, {p, q|a, b} between two interior points p and q, and two boundary points a and b, as shown in Fig. 2.23. One-half is introduced so that the curvature will be −1, and e(aq) is the Euclidean distance from a to q. The factor of one-half is very important
Fig. 2.23. Cayley’s calculation of distance in the projective disc model.
since it is then distinguished from the Poincaré half-plane model in which the one-half is not present [cf. Chapter 10]. For Poincaré’s model will turn out to be conformal, whereas Klein’s is not. In the same paper Klein [71] also coined the term ‘hyperbolic geometry’ as the non-Euclidean geometry of Lobachevsky and Bolyai. Another model of the hyperbolic plane is obtained by stereographically projecting from the south pole, $S$, of the lower hyperboloid sheet, $H^-$, onto the upper hyperboloid sheet, $H^+$. The stereographic projection of these rays located on the unit disc in the horizontal plane $x_0 = 0$, shown in Fig. 2.24, has come to be known as the Poincaré disc model. There are many proofs that non-Euclidean geometries are consistent. The earliest one was given by Beltrami who represented non-Euclidean geometries on Euclidean surfaces of constant curvature. Beltrami’s favorite was the hemisphere, and he used it to go from his flat model to the Poincaré model of the hyperbolic plane in two maps. In view of the facts that he did this in 1868, and that Poincaré did not get around to doing this till 1882, this model should also be attributed to Beltrami. This is a classic example of Stigler’s law of eponymy, which states that no scientific discovery is named after its rightful discoverer. The maps from the Beltrami flat plane model to the Poincaré model consist of the following. Beltrami’s model is located in the disc $B$ of radius $r$ in (a) of Fig. 2.25. A sphere of the same radius is placed over the disc with its south pole at the center of $B$ in (b). Using vertical parallel, and not stereographic, projection, $B$ will be projected onto the southern hemisphere with the disc radius coinciding with the equator, as shown in (c) of Fig. 2.25. From the north pole, the southern hemisphere is, this time,
Fig. 2.24. The Poincaré disc model as a stereographic projection from the south pole S of the bottom sheet.
stereographically projected back onto the plane, which covers a circular region $P$ of radius $r$ in (d). This is the Poincaré disc. The straight lines of the Beltrami model undergo a transformation under these maps. Under the first map, a chord of $B$ rises to the sphere to become an arc of a circle which intersects the equator at right angles in (e) of Fig. 2.25. Under the second map this arc is mapped back into a circular arc that cuts the boundary of $P$ normally in (f). As a consequence of these two maps, one a vertical parallel projection and the other a stereographic projection, the hyperbolic straight lines are transformed into circular arcs that cut the boundary of $P$ at right angles. Although we lose the characterization of geodesics as straight lines, we still preserve the angles: Euclidean and hyperbolic angles are equal so the model is conformal. The two mappings are summarized in Fig. 2.26.
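The two maps just described can be composed explicitly. The sketch below makes one simplifying assumption that differs from the arrangement in Fig. 2.25: the unit sphere is centred on the plane of the disc, rather than resting on it at its south pole, so that the stereographic image is again the unit disc; this changes the result only by a rescaling. All function names are hypothetical.

```python
import math


def lift_to_hemisphere(x, y):
    """Vertical (parallel) projection of a point of the unit disc onto the
    southern half of the unit sphere centred at the origin (an assumption)."""
    return x, y, -math.sqrt(1.0 - x * x - y * y)


def stereographic_from_north(X, Y, Z):
    """Project from the north pole (0, 0, 1) onto the equatorial plane z = 0."""
    return X / (1.0 - Z), Y / (1.0 - Z)


def klein_to_poincare(x, y):
    """Composite of the two maps: chords become circular arcs."""
    return stereographic_from_north(*lift_to_hemisphere(x, y))


def poincare_to_klein(p, q):
    """Inverse of the composite: (p, q) -> 2 (p, q) / (1 + p^2 + q^2)."""
    s = 1.0 + p * p + q * q
    return 2.0 * p / s, 2.0 * q / s


if __name__ == "__main__":
    a = 0.5  # the vertical chord x = a of the projective disc
    for y in (-0.8, 0.0, 0.8):
        p, q = klein_to_poincare(a, y)
        # Squared distance from (1/a, 0); constant along the image of the chord.
        print((p - 1.0 / a) ** 2 + q**2)
```

The image of the chord $x = a$ lies on the circle with centre $(1/a, 0)$ and squared radius $1/a^2 - 1$. Since the squared distance of the centre from the origin equals one plus the squared radius, this circle meets the unit circle at right angles, as the text asserts.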
Fig. 2.25. Beltrami’s double mapping of Klein and his hyperbolic disc model onto the Poincaré disc model.
Fig. 2.26. The combined vertical orthogonal projection upwards and the stereographic projection downwards.
So who exactly did what, and who should get the credit: is it Beltrami or Klein? Beltrami’s approach was to show that hyperbolic geometry is consistent if Euclidean geometry is. Klein, on the other hand, worked from projective geometry and showed that all three geometries, elliptic, hyperbolic, and Euclidean, are consistent if projective geometry is consistent. And projective geometry comes before Euclidean geometry. To Beltrami the hyperbolic plane lies in a portion of the Euclidean plane, the unit disc. Outside the disc there is nothing. But to Klein, who believed in points at infinity, the plane is the projective, and not the Euclidean, plane. So Klein was able to go beyond the boundary of the disc and showed how distinct forms of rotation could take place within the disc, on the rim of the disc, and outside of the disc. Today, these differences seem marginal since the foundations of all geometries are based on real number theory, where points are $n$-tuples and planes are the equations they satisfy [Stillwell 91]. In 1882 Poincaré studied the Liouville–Beltrami upper half-plane model in regard to fractional linear transformations of a complex variable. Poincaré showed that the cross-ratio of two points, $z$ and $z + dz$, infinitesimally separated on the same geodesic semi-circle with endpoints lying on the $x$-axis, could be written as $1 + |dz|/y$, when higher than linear-order infinitesimals are ignored. The distance is the logarithm of the cross-ratio, which in this case is
\[ \ln\left(1 + |dz|/y\right) = |dz|/y, \tag{2.5.8} \]
to linear-order in the infinitesimal. This connects the geometry of the half-plane, or our heated plane, with the Poincaré metric $|dz|/y$ to the cross-ratio whose logarithm is the distance between any two points in the interior of the unit disc. By mapping the upper half-plane onto the unit disc, Poincaré again obtained the representation of the hyperbolic plane in the interior of a circle. The price is that hyperbolic straight lines, or geodesics, now appear as arcs of circles that cut the boundary of the unit disc orthogonally. This may appear as a blemish, but the model is conformal: Euclidean and hyperbolic measures of an angle are the same. Let $A$ and $B$ be points inside the unit disc and let $P$ and $Q$ be the points where the line intersects the disc, as shown in Fig. 2.27.
Fig. 2.27. Geodesics consist of arcs that cut the disc orthogonally.
Then, Poincaré defined the cross-ratio in the usual way,
\[ \{A, B|P, Q\} = \frac{e(AP)}{e(AQ)}\cdot\frac{e(BQ)}{e(BP)}, \]
where $e(AP)$ is the Euclidean length of $AP$. But, unlike Cayley, Poincaré defined his length $d(AB)$ as
\[ d(AB) = \left|\ln\{A, B|P, Q\}\right|, \]
which is precisely twice that of Cayley’s. Poincaré also studied motions and considered the group of all fractional linear transformations of the form
\[ z \to \frac{az + b}{cz + d}, \tag{2.5.9} \]
which we recognize as Möbius transformations. Poincaré worked with real coefficients, $a, b, c, d$, and the determinant $ad - bc > 1$. Poincaré found in (2.5.9) a new type of periodic function that is invariant under this substitution. Up until this time the only periodic functions that were known were the trigonometric and elliptic functions. The double periodicity of the elliptic functions could be characterized by a tessellation of the Euclidean plane in which the elliptic functions take on the same value at the vertices of parallelograms. As we have seen in Sec. 1.2.2, tessellations consist of curvilinear triangles inside the disc in the hyperbolic plane, as depicted in Fig. 1.1. As mentioned there, the tessellation was originally constructed by H. A. Schwarz in 1872 to explain the periodicities in the solutions of Gauss’s differential equation. In the passage quoted there, Poincaré had the wonderful idea
that these new fractional transformations, (2.5.9), could be used to define a new distance for which the tessellations all become of equal size. His fractional transformation maps the upper half-plane model onto itself if $ad - bc > 0$, which excludes the lower half-plane. We know from (2.1.7) that (2.5.9) can be decomposed into a magnification, translation, and inversion. Each of these preserves the metric (2.5.8), and so does (2.5.9). Poincaré’s measure of distance was twice as great as Cayley’s (2.5.7), and as we have mentioned, it preserves angles and so is conformal. Two Poincaré lines are parallel if and only if they have no point in common. An example is two arcs of a circle that cut the unit circle which do not intersect. This permits all the axioms of hyperbolic geometry to be translated into Euclidean-geometrical statements. Thus, the Poincaré model gives still further proof that if Euclidean geometry is consistent so too is hyperbolic geometry. In the following year, Poincaré studied the motions consisting of all fractional linear transformations of the plane of points at infinity by allowing the coefficients in (2.5.9) to become complex, $z \to e^{i\theta}z$. Poincaré also introduced the half-space and hemisphere models, which we have discussed in the example of the heated plane. They can be considered as ‘first cousins’ of the disc model because they can be derived from one another by inversions [Thurston 97]. They can also be derived from one another by the linear fractional transformation,
\[ z \to \frac{1 - zi}{z - i}. \]
To see this [Stahl 08], consider a point, $z = x + iy$ on the interior of the unit disc, and another point $w = u + iv$ in the upper half-plane. Then
\[ w = \frac{1 - zi}{z - i}, \]
is translated into
\[ u + iv = \frac{1 - (x + iy)i}{x + iy - i}. \]
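The map $w = (1 - zi)/(z - i)$ can be checked numerically before any differentiation is carried out. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the half-plane line element is taken as $|dw|/v$ and the disc line element as $2|dz|/(1 - |z|^2)$, and a small finite displacement stands in for the differential.

```python
def disc_to_half_plane(z: complex) -> complex:
    """w = (1 - z i) / (z - i), the linear fractional map quoted in the text."""
    return (1 - z * 1j) / (z - 1j)


def half_plane_element(w: complex, dw: complex) -> float:
    """|dw| / v, the line element of the upper half-plane (heated plane)."""
    return abs(dw) / w.imag


def disc_element(z: complex, dz: complex) -> float:
    """2 |dz| / (1 - |z|^2), the line element of the unit disc."""
    return 2 * abs(dz) / (1 - abs(z) ** 2)


if __name__ == "__main__":
    z = 0.3 + 0.4j
    dz = 1e-6 * (0.6 - 0.8j)  # small step standing in for a differential
    w = disc_to_half_plane(z)
    dw = disc_to_half_plane(z + dz) - w
    print(w.imag > 0, half_plane_element(w, dw), disc_element(z, dz))
```

The centre of the disc is sent to $w = i$, interior points acquire $v > 0$, and the two line elements agree to within the size of the step, so that lengths of corresponding curves coincide.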
Separating real and imaginary terms, differentiating, and using the Cauchy–Riemann relations, we find that if there is a curve γ in the unit
disc model then there will be another curve, $w(\gamma)$, in the upper half-plane such that
\[ \int_{w(\gamma)}\frac{\sqrt{du^2 + dv^2}}{v} = \int_{\gamma}\frac{2\sqrt{dx^2 + dy^2}}{1 - x^2 - y^2}. \]
This relates the heated plane metric to the unit disc metric, which was first written down by Riemann. The geometric interpretation of all fractional linear transformations of the form (2.5.9) as length-preserving, or isometries of the hyperbolic plane, is just not possible because of the infinite number of them. Poincaré’s idea was to decompose the linear fractional transformations into individual motions and to express them as products of inversions. But what he failed to notice was the one with $a = d$ and $b = c$, such that $b/a = \tanh\psi = z_1$ is an inner point of the unit circle. For then
\[ z \to \frac{z + z_1}{1 + zz_1}, \tag{2.5.10} \]
which maps the unit circle onto itself sending $-z_1$ into 0. And for $z = z_1$, (2.5.10) becomes the isomorphism that takes us from the Poincaré to the Klein, or projective, model. With the relative velocity given by $b/a = \tanh\psi$, (2.5.9) preserves the unit circle $z\bar{z} = 1$, which expresses Lorentz invariance, and (2.5.10) can be identified as the Poincaré composition law for relativistic velocities, which is usually attributed to Einstein. He would then not have needed to introduce his second postulate for it would be already incorporated in the statement that all transformations of (2.5.9), with the above definitions of the coefficients, leave the real unit circle invariant, and all interior points have velocities less than the speed of light. Poincaré would have been able to study them by studying tessellations, which would be curvilinear triangles that make up right-angled polyhedra in relativistic velocity space, referred to as honeycombs [Coxeter 99]. The tessellations do not have the same size but are mapped onto themselves by the Lorentz transform. And since Möbius transforms preserve the cross-ratio, they can be used, according to Poincaré’s wonderful idea, to define a new concept of length under which the cells of the tessellation are all equal. The geometry to which it gives rise is hyperbolic geometry, and it is this geometry that is used by Poincarites. A fortiori, if some of the Poincarites lived in the half-plane and others in the unit disc, and
they could communicate between themselves, there would be nothing that they could do or measure to tell their worlds apart. Only to us Euclideans would differences in their two worlds become noticeable. This is the origin of all relativistic effects. In this the cross-ratio has a fundamental role. And since the cross-ratio is none other than a product of longitudinal Doppler shifts, we must look to the Doppler shift for the origin of all relativistic phenomena. The story of hyperbolic geometry does not end here, but for us it does. New advances involve the definition of a global manifold, for which there remained no satisfactory definition throughout the nineteenth century. A chronological account of all later developments can be found in Milnor [82].
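The role of (2.5.10) as a velocity composition law, and the link between the cross-ratio and the Doppler shift, can be illustrated for real arguments. The following sketch works in units where $c = 1$; the function names are hypothetical, and the longitudinal Doppler factor is taken in its standard form $\sqrt{(1 + \beta)/(1 - \beta)}$, which is an assumption about the convention intended here.

```python
import math


def compose(z, z1):
    """(2.5.10) for real arguments: z -> (z + z1) / (1 + z z1)."""
    return (z + z1) / (1.0 + z * z1)


def rapidity(beta):
    """psi such that beta = tanh(psi)."""
    return math.atanh(beta)


def doppler(beta):
    """Longitudinal Doppler factor sqrt((1 + beta) / (1 - beta)) = exp(psi)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


def cross_ratio(beta):
    """{0, beta | 1, -1} = (1 + beta) / (1 - beta)."""
    return (1.0 + beta) / (1.0 - beta)


if __name__ == "__main__":
    a, b = 0.6, 0.7
    w = compose(a, b)
    print(w, rapidity(w), rapidity(a) + rapidity(b))
    print(doppler(w), doppler(a) * doppler(b), cross_ratio(w))
```

Under composition the rapidities add and the Doppler factors multiply, while the cross-ratio is the square of the Doppler factor. The boundary value $z = 1$ is left fixed, and composing a velocity with itself gives $2z/(1 + z^2)$, the map from the Poincaré to the Klein radius mentioned above.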
References
[Archbold 70] J. W. Archbold, Algebra, 4th ed. (Pitman, London, 1970), p. 91.
[Barankin 42] E. W. Barankin, “Heat flow and non-Euclidean geometry,” Amer. Math. Monthly (1942) 4–14.
[Born & Wolf 59] M. Born and E. Wolf, The Principles of Optics (Macmillan, New York, 1959), pp. 146–148.
[Buseman & Kelly 53] H. Busemann and P. J. Kelly, Projective Geometry (Academic Press, New York, 1953), pp. 157–158.
[Coxeter 99] H. S. M. Coxeter, “Regular honeycombs in hyperbolic space,” in The Beauty of Geometry: Twelve Essays (Dover, New York, 1999), pp. 199–214; see also, C. Criado and N. Alamo, “Relativistic kinematic honeycombs,” Found. Phys. Lett. 15 (2002) 345–358.
[Einstein 22] A. Einstein, “Geometry and experience,” in Sidelights on Relativity, transl. by G. B. Jeffrey and W. Perrett (E. P. Dutton, New York, 1922), p. 34.
[Klein 71] F. Klein, “On the so-called non-Euclidean geometry,” Math. Ann. 4 (1871) 573–625; translated in J. Stillwell, Sources of Hyperbolic Geometry (Am. Math. Soc., Providence RI, 1991), pp. 69–110, especially Sec. 8.
[Kulczycki 61] S. Kulczycki, Non-Euclidean Geometry (Pergamon, Oxford, 1961), p. 138.
[Kullback 59] S. Kullback, Information Theory and Statistics (Wiley, New York, 1959), p. 6.
[Lavenda 09] B. H. Lavenda, A New Perspective on Thermodynamics (Springer, New York, 2009), Sec. 6.11.
[Milnor 82] J. Milnor, “Hyperbolic geometry: The first 150 years,” Bull. Amer. Math. Soc. 6 (1982) 9–24.
[Needham 97] T. Needham, Visual Complex Analysis (Clarendon Press, Oxford, 1997).
[O’Neill 66] B. O’Neill, Elementary Differential Geometry (Academic Press, New York, 1966), p. 314.
[Poincaré 68] H. Poincaré, La Science et l’Hypothèse (Flammarion, Paris, 1968).
[Schwerdtfeger 62] H. Schwerdtfeger, Geometry of Complex Numbers (U. Toronto Press, Toronto, 1962), p. 121.
[Stahl 08] S. Stahl, A Gateway to Modern Geometry: The Poincaré Half-Plane, 2nd ed. (Jones and Bartlett, Sudbury MA, 2008), pp. 193–194.
[Stillwell 91] J. Stillwell, Sources of Hyperbolic Geometry (Am. Math. Soc., Providence RI, 1991), pp. 63–68.
[Thurston 97] W. P. Thurston, Three-Dimensional Geometry and Topology (Princeton U. P., Princeton NJ, 1997), p. 53.
Chapter 3
A Brief History of Light, Electromagnetism and Gravity
Much ado about null results.
3.1 The Drag Coefficient: A Clash Between Absolute and Relative Velocities
Most, if not all, books on the special theory use the famous drag coefficient of Fizeau as an example of the new kinematics that special relativity preaches. That is, the drag coefficient comes out naturally from the velocity addition law, (2.5.10). According to the special theory, the parallelogram rule for the addition of velocities is only approximately true. For two bodies moving at speeds $u$ and $u'$ in opposite directions, the velocity that an observer would register traveling along with the second body is not $u - u'$, but
\[ w = \frac{u - u'}{1 - uu'/c^2}. \]
Now let us apply this modification of the parallelogram law to Fizeau’s experiment. Larmor [00] tells us that the phenomenon to be accounted for is the observation that the motion of the Earth does not affect reflection and refraction of light. With the corpuscular theory then still in vogue, Arago reasoned that insofar as the velocity of light is different in air than it is in glass, the aberration of its path due to the motion of the Earth would also be different in the two media, depending upon the direction of the Earth’s motion. Arago did not find any effect at all, and asked Fresnel whether he could analyze this null result from the point of view of wave theory.
Fresnel replied that the lack of an effect could be explained by assuming that the surrounding aether is carried along with the motion of the Earth. But, if this is true, it goes against the grain of stellar aberrational measurements which could only be explained if the aether was immobile, or stagnant. According to this hypothesis, the velocity of light in vacuo would retain its normal value, but it would change in the body arising from its motion through the aether so as to make refraction and reflection the same as for the matter at rest. Arago had not considered such a possibility. What Fresnel was able to show was that there would be no change if the velocities in vacuo and the prism were $c$ and $c'$, respectively, when at rest, while when in motion they were $c - u$ and $c' - u/\eta^2$, where $\eta$ is the ordinary index of refraction, $\eta = c/c'$. That is, the absolute velocity of light in the moving prism would be
\[ c' + u\left(1 - \eta^{-2}\right). \tag{3.1.1} \]
Fresnel believed that aether permeates moving matter and is partially convected by matter, being absorbed at the front surface and readmitted at the rear surface. In this way the paths of the light rays in moving bodies would be unaltered, and at the same time the known facts of aberration would not be contradicted. The experimental confirmation of Fresnel’s formula had to wait thirty-three years, when Fizeau performed his famous experiment. His arrangement, shown in Fig. 3.1, is essentially an optical interferometer. Rays from a light source are split into two by a halfway mirror. The two beams are made to travel around the same closed circuit, by reflection from mirrors, but in opposite directions. When they arrive back at the halfway mirror one beam carries on while the other beam is reflected, and both are brought together at a telescope. Any given fringe represents an optical path difference between the interfering beams. If $\delta$ is the distance that a ray traverses in a medium of refractive index $\eta$, the optical path is simply $\eta \cdot \delta$. Since water is flowing through the circuit, the rays going against the direction of the water should be retarded, and this produces a dragging effect. The optical path difference concerns only what goes on in the tubes, so that if each tube is of length $\delta$ and the water is flowing at speed $u$, it is
\[ \Delta t = \frac{2\delta}{c' - \gamma u} - \frac{2\delta}{c' + \gamma u}, \tag{3.1.2} \]
Fig. 3.1. Fizeau’s aether-drag apparatus with mirrors placed on corners to reflect light.
where $\gamma$ is the so-called drag coefficient, and recall that $c' = c/\eta$. This gives a first-order effect of amount
\[ \Delta t \approx \frac{4\gamma\delta}{c'}\cdot\frac{u}{c'}. \]
The optical path length difference is $c\,\Delta t$, and expressing this as the number of interference lines $f$ times the wavelength, $\lambda$, of the monochromatic light used, gives the expression
\[ f = \frac{4\gamma\delta}{\lambda}\cdot\frac{u}{c}\,\eta^2. \]
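The first-order expressions follow from (3.1.2) by combining the two fractions. A short sketch of the intermediate steps, assuming only that $\gamma u \ll c'$ and that $c' = c/\eta$:

```latex
\begin{align*}
\Delta t &= \frac{2\delta}{c' - \gamma u} - \frac{2\delta}{c' + \gamma u}
          = \frac{4\gamma\delta u}{c'^2 - (\gamma u)^2}
          \approx \frac{4\gamma\delta u}{c'^2}, \\
f &= \frac{c\,\Delta t}{\lambda}
   \approx \frac{4\gamma\delta u\,c}{\lambda\,c'^2}
   = \frac{4\gamma\delta}{\lambda}\cdot\frac{u}{c}\,\eta^2 .
\end{align*}
```

The factor $\eta^2$ arises because the transit times involve the velocity in the medium, $c'$, whereas the fringe count converts the time difference into a path length with the vacuum velocity, $c$.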
Typical values of Fizeau parameters are: $\delta = 1.5$ m, $u = 7$ m/sec, $\lambda = 5.3 \times 10^{-7}$ m, $\eta = 1.33$, and $f = 0.23$ of a fringe. This gives the observational value of the drag coefficient as $\gamma = 0.48$, while its calculated value from the index of refraction of water alone is $\gamma = 1 - 1/\eta^2 = 0.43$. Thus, to within an error of approximately 10%, Fizeau confirmed Fresnel’s formula for the drag coefficient, but the experiment did not explain it. All it could do was to reinforce the belief that the motion of the aether has no effect on the properties of moving objects, just as it has none on stellar aberration. However, it is claimed that the Fizeau experiment can be explained by the kinematics of special relativity. What special relativity says is that the velocity of light in the direction of the flow of water with velocity $u$ is
not $c' + u$, but, rather

$$c'' = \frac{c' + u}{1 + c'u/c^2} \approx c' + u\left(1 - \frac{c'^2}{c^2}\right).$$
Since $\eta = c/c'$, the velocity composition law, “without any extra assumptions,” gives to lowest order the Fresnel drag expression, which “other aether theorists had to explain in terms of a partial dragging of light by the medium” [French 68]. However, whereas $c' + u$ was an absolute velocity in the determination of the fringes, it has now, miraculously, become the sum of two velocities to which the relativistic law of addition applies! Moreover, the relativistic velocity composition law applies only to relative velocities, and not to absolute and relative velocities. But, if it is true that we are talking about the sum of two velocities, $c'$ and $u$, we should take this into account in determining the time difference, (3.1.2). If this is done we get

$$\Delta t = \frac{4\delta\gamma u\left(1 - 1/\eta^2\right)}{c'^2 - (\gamma u)^2}, \tag{3.1.3}$$
and this decreases both the fringes and the drag coefficient, i.e. $f = 0.1$ fringe and $\gamma = 0.21$. The latter is nowhere near the calculated value of 0.43. Thus, there is a fundamental incapability of using special relativity to deal with the classical experiments that led to the demise of the aether. The only observations which do not give a null effect are those in which absolute velocities are involved, as in the Fizeau experiment.
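The numbers quoted for Fizeau's experiment can be checked with a short calculation. The Python sketch below is only an illustration: it assumes the first-order fringe relation $f = 4\gamma\delta\eta^2 u/(\lambda c)$, which follows from (3.1.2) with $c' = c/\eta$, uses a rounded value of $c$, and all function and variable names are hypothetical. It compares the drag coefficient inferred from the fringe shift, Fresnel's value $1 - 1/\eta^2$, and the first-order coefficient of $u$ in the relativistic composition of $c'$ with $u$.

```python
# Illustrative check of the Fizeau numbers quoted in the text.
# Assumption: first-order fringe relation f = 4*gamma*delta*eta**2*u/(lam*c).
# All names below are hypothetical and chosen for this sketch only.

C = 2.998e8  # speed of light in vacuo, m/s (rounded value)


def drag_from_fringes(f, delta, u, lam, eta, c=C):
    """Invert f = 4*gamma*delta*eta**2*u/(lam*c) for the drag coefficient."""
    return f * lam * c / (4.0 * delta * eta**2 * u)


def fresnel_drag(eta):
    """Fresnel's drag coefficient 1 - 1/eta**2."""
    return 1.0 - 1.0 / eta**2


def composed_speed(c_medium, u, c=C):
    """Relativistic composition of the speed in the medium with the flow speed."""
    return (c_medium + u) / (1.0 + c_medium * u / c**2)


# Values quoted in the text: tube length, flow speed, wavelength, index, fringes.
DELTA, U, LAM, ETA, F = 1.5, 7.0, 5.3e-7, 1.33, 0.23

gamma_obs = drag_from_fringes(F, DELTA, U, LAM, ETA)
gamma_fresnel = fresnel_drag(ETA)

# Coefficient of u implied by the composition law: (c'' - c') / u.
c_medium = C / ETA
gamma_composition = (composed_speed(c_medium, U) - c_medium) / U

print(f"gamma from fringe shift : {gamma_obs:.3f}")
print(f"gamma = 1 - 1/eta^2     : {gamma_fresnel:.3f}")
print(f"gamma from composition  : {gamma_composition:.3f}")
```

With the rounded inputs above, the fringe-shift value comes out close to the quoted 0.48, and the last two numbers agree to first order in $u/c$.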
3.2 Michelson–Morley Null Result: Is Contraction Real?
The experimental set-up shown in Fig. 3.2 appeared in the ‘classic’ 1887 paper of Michelson and Morley. Light is emitted from a source and is reflected by a mirror at distance $\ell_1$. The time it takes for the forward and backward journeys can be found in any text,

$$t_1 = \frac{\ell_1}{c+u} + \frac{\ell_1}{c-u} = \frac{2\ell_1/c}{1 - u^2/c^2}, \tag{3.2.1}$$

where $u$ is the speed that Michelson’s apparatus is moving with respect to the inertial frame defined by the aether. To calculate the time of transit of the light ray perpendicular to the ‘aether wind,’ the Galilean composition of velocities is used. If $c$ is the hypotenuse with base $u$, the decreased velocity between the interferometer and the second mirror is $\sqrt{c^2 - u^2}$. The time of transit of the outward and backward journeys from the second mirror is

$$t_2 = \frac{2\ell_2}{\sqrt{c^2 - u^2}}. \tag{3.2.2}$$

Fig. 3.2. Monochromatic, yellow light is split by a mirror into two beams. These beams cover equal distances to mirrors b and c where they are reflected back to a, and then combined to produce interference fringes. The path lengths are ab = ac = 11 m. A rotation of the apparatus by 90 degrees gave none of the displacement of the interference fringes that was expected on the grounds that the beam traveling along ab is parallel to the Earth’s motion through the aether while the beam along ac is normal to it.
From (3.2.1) and (3.2.2) Michelson determined the time difference. This time difference was compared to that which results when the whole apparatus is rotated by 90°, which interchanges $\ell_1$ with $\ell_2$. The difference between the time differences would produce a shift in the interference pattern, which is calculated in terms of the number of fringes, just
as in the Fizeau experiment. Although the expected effect was predicted to be small, Michelson failed to observe anything. Michelson was led to conclude that

The interpretation of these results is that there is no displacement of the interference bands. The result of the hypothesis of a stationary aether is shown to be incorrect.

The difference in times of the two journeys is

$$\Delta t = \frac{2}{c\left(1 - u^2/c^2\right)}\left\{\ell_1 - \ell_2\sqrt{1 - u^2/c^2}\right\}. \tag{3.2.3}$$
Because no differences were found, FitzGerald and Lorentz independently postulated that there must be a contraction of the arm in the direction of motion by the amount $\sqrt{1 - u^2/c^2}$ so that with equal arm-lengths, (3.2.3) would vanish. Oliver Lodge claimed that this ‘contraction’ hypothesis [cf. (3.2.5) below] was proposed by FitzGerald in a conversation he had with him in order to explain the null result. It was also put forward by Lorentz a short time afterwards, and has come to be known as the FitzGerald–Lorentz contraction, or more succinctly as the Lorentz contraction. However, O’Rahilly [38] warns us: “It is to be gravely doubted that the FitzGerald–Lorentz contraction really does explain Michelson’s null-result.” If the arms’ lengths are equal, $\ell_1 = \ell_2$, dividing (3.2.1) by (3.2.2) results in

$$t_1 = \frac{t_2}{\sqrt{1 - u^2/c^2}}.$$
This can be interpreted as a time dilatation, where clocks in motion supposedly go slower than clocks at rest. According to Larmor, “the change of the time variable, in comparison of radiations in the fixed and moving systems, involves the Doppler effect on the wavelength.” The letter that Maxwell sent to D. P. Todd to thank him for the astronomical tables was read by a young scientist by the name of Michelson, who in the previous year had already performed a measurement of the speed of light. Although we will discuss this in greater detail in Sec. 4.1.3, we mention it here because of the effect it had upon Michelson. Michelson did not rule out the possibility of detecting the motion of the aether as a second-order effect by the amount predicted by Maxwell
[cf. Eq. (4.1.12)]. His first attempt was in 1881, and a more precise experiment carried out with Morley, described above, was done in 1887. According to French [66] “this refined version of the experiment... has long been regarded as one of the main experimental pillars of special relativity.” However, Maxwell was using absolute velocities. If you use relative velocities, you will find $t_1 = 2\ell/c$. There is no effect at all. Since the time difference, $t_1 - t_2$, must be positive or vanish, we must also find $t_2 = 2\ell/c$ so that the aether is stagnant, $u = 0$. One could also query the validity of the use of the Galilean law of the composition of velocities. Light emitted transversely to the direction of the motion can be obtained by an average of measurements on radiation emitted forward and backward with respect to the direction of the motion, as Ives and Stilwell so well realized [cf. Sec. 3.4],

$$\frac{1}{\sqrt{1 - u^2/c^2}} = \frac{1}{2}\left[\left(\frac{c+u}{c-u}\right)^{1/2} + \left(\frac{c-u}{c+u}\right)^{1/2}\right].$$

So, there is nothing wrong with the use of the Galilean composition law of velocities. If there is no change in $t_1$ there can be no change in $t_2$ since $t_2$ cannot be greater than $t_1$. It can only be equal to it when the aether is stagnant. Pillar or not, it is no wonder that some of the best experimentalists of their day, like Herbert Ives and Louis Essen, were staunch antirelativists. Therefore, Michelson, like Fresnel, Fizeau, and Airy before him, could only conclude that the aether remains completely undisturbed by the Earth’s motion. Moreover, the FitzGerald–Lorentz contraction and time dilatation arise from considering the velocities to be absolute, and not relative. The null result is in perfect conformity with the relativistic addition law for velocities, and the existence of a stationary aether. On the basis of Maxwell’s (and Michelson’s) analysis $u$ is absolute, so that if the lengths are the same then there is time dilatation,

$$t_1 = \frac{t_2}{\sqrt{1 - u^2/c^2}}, \tag{3.2.4}$$
while if the epochs are the same, then there is length contraction,

$$\ell_1 = \ell_2\sqrt{1 - u^2/c^2}. \tag{3.2.5}$$
Alternatively, if u is relative there is no effect at all since an absolute speed c cannot be combined with a relative speed u through the relativistic law of the composition of speeds. Here, we find complete agreement with O’Rahilly [38] who contends the electrodynamic contraction is based on the assumption of an Earth-convected aether and involves an ordinary measurable u relative to the laboratory. If this assumption is correct, the FitzGerald–Lorentz contradiction disappears; and the null result of the Michelson–Morley experiment becomes self-evident.
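The algebra connecting (3.2.1), (3.2.2), (3.2.4) and (3.2.5) can be verified numerically. The following Python sketch is an illustration under stated assumptions: the arm length of 11 m is the path length quoted in the caption of Fig. 3.2, while the speed $u = 30$ km/sec (roughly the Earth's orbital speed) is an assumed value that does not appear in the text, and the function names are hypothetical.

```python
import math

C = 2.998e8   # speed of light, m/s (rounded value)
ARM = 11.0    # m, path length quoted in the caption of Fig. 3.2
U = 3.0e4     # m/s, ASSUMED illustrative speed through the aether


def t_parallel(l1, u, c=C):
    """Round-trip time along the arm parallel to the motion, Eq. (3.2.1)."""
    return l1 / (c + u) + l1 / (c - u)


def t_perpendicular(l2, u, c=C):
    """Round-trip time along the arm normal to the motion, Eq. (3.2.2)."""
    return 2.0 * l2 / math.sqrt(c**2 - u**2)


root = math.sqrt(1.0 - U**2 / C**2)

t1 = t_parallel(ARM, U)
t2 = t_perpendicular(ARM, U)

# Equal arms: the ratio of the two times is 1/sqrt(1 - u^2/c^2), Eq. (3.2.4).
ratio = t1 / t2

# Equal times: shortening the parallel arm by the factor in Eq. (3.2.5)
# makes the two round-trip times coincide.
t1_contracted = t_parallel(ARM * root, U)

print(f"t1 - t2 with equal arms   : {t1 - t2:.3e} s")
print(f"t1/t2                     : {ratio:.12f}")
print(f"t1 - t2 after contraction : {t1_contracted - t2:.3e} s")
```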
When dealing with absolute velocities, there is no way that we can avoid having a speed greater than $c$ on one tract if the speed is less than $c$ on the other tract. To see this, do not specify what the speeds are for the outward and return journeys; then

$$t = \frac{\ell}{c'} + \frac{\ell}{c''},$$

where the two unknown speeds are $c'$ and $c''$. But, if light is to travel at the same speed no matter which way we are going then

$$t = \frac{2\ell}{c}.$$

Comparing the two expressions we arrive at the conclusion that

$$\frac{c}{\sqrt{c'c''}} = \frac{\sqrt{c'c''}}{\tfrac{1}{2}\left(c' + c''\right)} \le 1,$$

on the basis of the arithmetic-geometric mean inequality. The equality sign holds if and only if the two speeds coincide with the speed of light. Otherwise, one of the two speeds will necessarily be greater than the speed of light.

We should not leave this section with the impression that relative velocities will have no effect in delaying the round trip times. We must only specify that one of the velocities in the relativistic composition law is not the velocity of light. This we can do by immersing the entire apparatus in a medium whose index of refraction $\eta > 1$. For then the
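time to make a round trip parallel and perpendicular to the aether wind will be

$$\begin{aligned}
t_1 &= \frac{\ell}{c}\left[\frac{\eta + u/c}{1 + u\eta/c} + \frac{\eta - u/c}{1 - u\eta/c}\right] = \frac{2\eta\ell}{c}\,\frac{1 - u^2/c^2}{1 - u^2\eta^2/c^2} \approx \frac{2\eta\ell}{c}\left[1 + \frac{u^2}{c^2}\left(\eta^2 - 1\right)\right],\\
t_2 &= \frac{2\eta\ell}{c}\,\frac{1}{\sqrt{1 - u^2\eta^2/c^2}} \approx \frac{2\eta\ell}{c}\left[1 + \frac{u^2\eta^2}{2c^2}\right].
\end{aligned}$$

The magnitude and sign of the time difference will depend upon the magnitude of $\eta$.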
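The dependence on $\eta$ can be made concrete. Subtracting the two second-order expansions above gives $t_1 - t_2 \approx (2\eta\ell/c)(u^2/c^2)(\eta^2/2 - 1)$, so the sign changes at $\eta = \sqrt{2}$. The Python sketch below evaluates the unexpanded expressions and compares them with this estimate; the values of $\eta$ and of $u/c$ are assumed for illustration, times are measured in units of $\ell/c$, and the function names are hypothetical.

```python
import math


def t_parallel(eta, beta):
    """Round trip parallel to the wind, in units of l/c (beta = u/c)."""
    return (eta + beta) / (1 + eta * beta) + (eta - beta) / (1 - eta * beta)


def t_perpendicular(eta, beta):
    """Round trip perpendicular to the wind, in units of l/c."""
    return 2 * eta / math.sqrt(1 - (eta * beta) ** 2)


def second_order_difference(eta, beta):
    """t1 - t2 to order beta**2, obtained by subtracting the two expansions."""
    return 2 * eta * beta**2 * (eta**2 / 2 - 1)


BETA = 1e-4  # ASSUMED illustrative value of u/c

diff_water = t_parallel(1.33, BETA) - t_perpendicular(1.33, BETA)  # eta < sqrt(2)
diff_glass = t_parallel(1.50, BETA) - t_perpendicular(1.50, BETA)  # eta > sqrt(2)

print(f"eta = 1.33: t1 - t2 = {diff_water:.3e} (estimate {second_order_difference(1.33, BETA):.3e})")
print(f"eta = 1.50: t1 - t2 = {diff_glass:.3e} (estimate {second_order_difference(1.50, BETA):.3e})")
```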
3.3 Radar Signaling versus Continuous Frequencies
The Doppler shift measures the change in frequency due to relative motion. Anyone standing on a platform listening to a train go by will have noticed that the pitch of the whistle of a train is higher when the train is approaching than when it is receding. Only one frequency is recorded; it has nothing to do with the exchange of signals between two different inertial platforms. For the time measurement of the exchange of signals from two different inertial platforms we need two clocks. Consider one clock in motion and the other stationary; the latter sends out a signal which is picked up by the former in time $T$, according to the stationary clock. Now, the new feature is that a clock in motion has a different ticking rate than the clock at rest, so to the observer with his clock in motion it would appear that the time for the signal to reach him will be $KT$, where $K$ is some factor that has to be determined. Every time the signal bounces back it increases by a factor $K$, so the signal that was sent out in time $T$ will be reflected back in time $K^2T$. The moment at which the light is reflected should be the arithmetic mean of these two times: $\frac{1}{2}(T + K^2T)$. During this time interval, the distance that the clock in motion covers is the difference of these two times multiplied by $c$: $\frac{1}{2}(K^2 - 1)cT$. The average velocity is therefore

$$\frac{u}{c} = \frac{\frac{1}{2}\left(K^2 - 1\right)T}{\frac{1}{2}\left(K^2 + 1\right)T},$$
from which $K$ can be determined as

$$K = \left(\frac{c+u}{c-u}\right)^{1/2} = e^{\bar{u}/c}, \quad \text{say.} \tag{3.3.1}$$
The left-hand side is just the longitudinal Doppler effect, but what is the right-hand side? Taking the logarithm of both sides of (3.3.1) gives

$$\bar{u} = \frac{c}{2}\ln\left(\frac{1 + u/c}{1 - u/c}\right) = c\tanh^{-1}(u/c). \tag{3.3.2}$$

For $K \neq 1$, this new feature that a moving clock has its rate changed has catapulted us into a new world, where if $u$ is approaching the speed of light, the new measure of speed, $\bar{u}$ in (3.3.2), approaches infinity. In this new world there is no limit on the speed at which objects can travel! It depends on the exponent of the longitudinal Doppler shift. There is nothing classical about the velocity $\bar{u}$. We will return to a discussion of how radar signaling is performed in Sec. 8.4.

Maxwell realized that radiation causes pressure. It was also known that this pressure was Doppler-shifted, or to use Heaviside’s quaint terminology ‘dopplerized,’ when it hits a moving mirror. These are the prerelativity days when one could think of the “number of waves occupying c.” If stationary it is just $c$, while if the source is moving forward at velocity $u$, the number of waves are “crushed into c − u, or if moving away are lengthened by an amount c + u” [Poynting 10]. If the mirror is moving toward the source at speed $u$, then the wavelength of the incident beam $\lambda$ will be proportional to $c + u$, while the reflected radiation at wavelength $\lambda'$ will be lessened to $c - u$ due to the compression of the spring which is likened to the electromagnetic wave. Thus,

$$\frac{\lambda'}{\lambda} = \frac{c-u}{c+u}, \tag{3.3.3}$$
so that the wavelength becomes dopplerized on reflection due to the motion of the mirror.
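Relations (3.3.1) and (3.3.2) lend themselves to a quick numerical illustration. The Python sketch below tabulates $K$ and $\bar{u}/c$ for a few assumed values of $u/c$, and checks that the radar relation $u/c = (K^2 - 1)/(K^2 + 1)$ recovers the input speed; the function names and sample values are hypothetical.

```python
import math


def k_factor(beta):
    """Longitudinal Doppler factor K = sqrt((c+u)/(c-u)), Eq. (3.3.1), beta = u/c."""
    return math.sqrt((1 + beta) / (1 - beta))


def u_bar_over_c(beta):
    """The exponent u_bar/c = ln K, Eq. (3.3.2)."""
    return math.log(k_factor(beta))


def beta_from_k(k):
    """Radar relation u/c = (K^2 - 1)/(K^2 + 1)."""
    return (k**2 - 1) / (k**2 + 1)


# ASSUMED sample speeds, as fractions of c.
for beta in (0.1, 0.5, 0.9, 0.99, 0.9999):
    k = k_factor(beta)
    print(f"u/c = {beta:<7} K = {k:10.4f}  u_bar/c = {u_bar_over_c(beta):.4f}"
          f"  recovered u/c = {beta_from_k(k):.4f}")
```

The last column reproduces the input, while $\bar{u}/c$ grows without bound as $u/c$ approaches unity.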
3.4 Ives–Stilwell Non-Null Result: Variation of Clock Rate with Motion
According to Ives [51], the spectrum of relativity theory has “one end by the Michelson–Morley experiment with its null result . . . and at the other
by the Ives–Stilwell experiment, which demonstrated a positive result, the variation of the clock rate with motion.”

The classical Doppler shift can be derived by considering the change in frequency of a wave that impinges on a mirror moving at speed $u$. The incoming wave, which impinges on the mirror at an angle $\vartheta$,

$$A\cos\left[\omega\left(t - \frac{x\cos\vartheta + y\sin\vartheta}{c}\right) + \varphi\right],$$

of amplitude $A$ and phase $\varphi$, is reflected by the mirror, thereby producing an outgoing wave of the form

$$A'\cos\left[\omega'\left(t + \frac{x\cos\vartheta - y\sin\vartheta}{c}\right) + \varphi'\right].$$

At the surface of the mirror, $x = ut$, there must hold a certain relation between the incoming and outgoing waves that is valid for all times. This is possible only if the coefficients of $t$ in the arguments are the same, implying

$$\omega' = \omega\,\frac{1 - (u/c)\cos\vartheta}{1 + (u/c)\cos\vartheta}. \tag{3.4.1}$$

This is the ordinary, oblique, Doppler shift. Relativity changes this by relating the two times and two coordinates in space through the transformation

$$t' = t\cosh\psi + \frac{x}{c}\sinh\psi, \qquad x' = x\cosh\psi + ct\sinh\psi, \qquad y' = y,$$

where space and time have been rotated through an ‘imaginary’ angle, $\psi = \bar{u}/c$. Although this is commonly referred to as the Lorentz transformation, it was first derived by W. Voigt in 1887. Only as late as 1909 did Lorentz realize that “to my regret [Voigt’s paper of 1887] has escaped my notice all these years.” Lorentz continues, “The idea of the transformations ... might therefore have been borrowed from Voigt, and the proof that it does not alter the equations for the free aether is contained in that paper” [O’Rahilly 38]. But, the eponym is too engraved in the literature, and we will continue to adhere to it. This is yet another example of Stigler’s law of eponymy.
Now, the incoming wave,

$$A\cos\left[\omega\left(t - \frac{x\cos\vartheta + y\sin\vartheta}{c}\right) + \varphi\right],$$

will be converted into the outgoing wave,

$$A'\cos\left[\omega'\left(t' + \frac{x'\cos\vartheta - y'\sin\vartheta}{c}\right) + \varphi'\right],$$

but the phase is a physical invariant, and if there is no change in energy, the condition for the agreement of the cosines at the surface of the mirror is:

$$\omega t\left(1 - \frac{x}{ct}\cos\vartheta\right) = \omega' t'\left(1 + \frac{x'}{ct'}\cos\vartheta\right). \tag{3.4.2}$$

Let us consider the motion in the unprimed system from the origin of the primed system. Then $x' = 0$, which is equivalent to considering the transverse shift, $\vartheta = \pi/2$. From the second equation of the Lorentz transform we find $x/t = -c\tanh\psi = -u$, and inserting this into the first equation gives $t' = t\,\mathrm{sech}\,\psi$. Thus, condition (3.4.2) is equivalent to:

$$\omega' = \omega\,\frac{1 + (u/c)\cos\vartheta}{\sqrt{1 - u^2/c^2}}. \tag{3.4.3}$$

To test (3.4.3), Einstein in 1907 predicted that it might be possible to observe the transverse shift, $\vartheta = \pi/2$, by examining the light emitted by canal rays in hydrogen, on which Stark had published a paper the year before. No general confirmation was possible, and it had to wait more than thirty years until Ives and Stilwell performed their famous experiment. Realizing that it would be almost impossible to get a direct measurement of the transverse Doppler shift, they resorted to an averaging of the forward and backward radiation that ions would emit by accelerating them through a given voltage. Solving (3.4.3) for the corresponding wavelengths, we get the longitudinal Doppler shifts for $\vartheta = \pi, 0$, and developing these expressions in a series in powers of $u/c$ gives

$$\lambda'(\pi) = \lambda\left(\frac{1 + u/c}{1 - u/c}\right)^{1/2} = \lambda\left(1 + \frac{u}{c} + \frac{1}{2}\frac{u^2}{c^2} + \cdots\right), \tag{3.4.4}$$
and

$$\lambda'(0) = \lambda\left(\frac{1 - u/c}{1 + u/c}\right)^{1/2} = \lambda\left(1 - \frac{u}{c} + \frac{1}{2}\frac{u^2}{c^2} - \cdots\right). \tag{3.4.5}$$
Normally, square and higher powers in the relative velocity are so much smaller than the linear term that they can safely be neglected. This will give a first-order Doppler shift of $\pm(u/c)\lambda$. However, by averaging the forward and backward wavelengths of the emitted radiation, the first-order terms cancel, and to lowest order there results

$$\Delta\lambda_2 := \frac{1}{2}\left(\lambda'(\pi) + \lambda'(0)\right) - \lambda \approx \frac{1}{2}\frac{u^2}{c^2}\lambda. \tag{3.4.6}$$
In comparison, the first-order Doppler shift is $\Delta\lambda_1 = u\lambda/c$, so plotting the second-order shift (3.4.6) in terms of the first-order one should result in a parabolic plot. This is exactly what was found by Ives and Stilwell, as shown in Fig. 3.3, in which a hydrogen discharge tube was the source of $\mathrm{H}_2^+$ and $\mathrm{H}_3^+$ ions.
Fig. 3.3. Second-order wavelength shifts plotted as a function of first-order shifts.
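The parabolic relation between the two shifts, $\Delta\lambda_2 \approx \Delta\lambda_1^2/(2\lambda)$, follows from (3.4.4)–(3.4.6) and can be illustrated numerically. In the Python sketch below the rest wavelength (the hydrogen H-beta line near 486.1 nm) and the ion speeds are assumed illustrative values that are not taken from the text, and the function names are hypothetical.

```python
import math

LAM = 486.1e-9  # m, ASSUMED rest wavelength used only for illustration


def wavelength_at_pi(lam, beta):
    """Longitudinal shift for theta = pi, unexpanded form of Eq. (3.4.4)."""
    return lam * math.sqrt((1 + beta) / (1 - beta))


def wavelength_at_zero(lam, beta):
    """Longitudinal shift for theta = 0, unexpanded form of Eq. (3.4.5)."""
    return lam * math.sqrt((1 - beta) / (1 + beta))


def second_order_shift(lam, beta):
    """Mean of the two shifted wavelengths minus the rest wavelength, Eq. (3.4.6)."""
    return 0.5 * (wavelength_at_pi(lam, beta) + wavelength_at_zero(lam, beta)) - lam


# ASSUMED ion speeds as fractions of c.
for beta in (0.002, 0.004, 0.006):
    d1 = beta * LAM                        # first-order shift
    d2 = second_order_shift(LAM, beta)     # second-order shift
    print(f"u/c = {beta}: first-order {d1:.3e} m, second-order {d2:.3e} m, "
          f"parabola {d1**2 / (2 * LAM):.3e} m")
```

Doubling the first-order shift quadruples the second-order one, which is the parabola described in the text.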
3.5 The Legacy of Nineteenth Century English Physics

The English teach Mechanics as an experimental science. On the continent it is always presented more or less as a deductive science and a priori. The English are right, needless to say... On the other hand, if the principles of Mechanics have no other sources than experiments, they are therefore, only approximate and temporary. New experiments may lead us some day to modify or even abandon them [Poincaré 02]
We are at the end of the nineteenth century. It is known that the energy of motion is proportional to the square of the velocity, i.e. the kinetic energy. Also [Poynting 10] . . . waves contain energy. If we compress them into a shorter length we have to put more energy into them, somewhat as we have to put more energy into the spiral spring when we crush it up.
Since the spring is shortened, its ends will exert a pressure on any surface which it comes into contact with. It was assumed that the energy density was proportional to the inverse square of the wavelength. This assumption is somewhat prophetic since it establishes an inverse dependency of the speed on the wavelength. Introducing the mass, the constant of proportionality had to be an action, and the only action that was around was Planck’s quantum of action. So, de Broglie’s relation could have been derived almost a quarter of a century before he did by people of the likes of Poynting!
3.5.1 Pressure of radiation
With an inverse square dependency of the energy density upon the wavelength, (3.3.3) gives for the ratio of the energies reflected and incident on the mirror

$$\frac{\varepsilon'}{\varepsilon} = \left(\frac{c+u}{c-u}\right)^2.$$

The net rate of energy flow into the mirror is the difference,

$$\varepsilon'(c - u) - \varepsilon(c + u) = 2u\varepsilon\,\frac{c+u}{c-u},$$

and this is the work done against the mirror.
The work is of a compressional nature so it is logical to set it equal to $P'u$, where $P'$ is the pressure. When this is done, it is seen that the pressure has been dopplerized by the amount,

$$P' = P\,\frac{c+u}{c-u}, \tag{3.5.1}$$

where $P = 2\varepsilon$ is the pressure if the mirror were at rest. In Sec. 6.6, we will derive (3.5.1) from electromagnetic theory. Moreover, we can write the Doppler shift, (3.5.1), in the suggestive form

$$P' = \frac{c^2 - u^2}{c^2 + u^2}\left(\varepsilon + \varepsilon'\right) = \sqrt{1 - B^2}\,\left(\varepsilon + \varepsilon'\right), \tag{3.5.2}$$

in terms of the total energy density, where

$$B = \frac{2u/c}{1 + u^2/c^2}. \tag{3.5.3}$$
Relation (3.5.2) expresses the pressure as the Lorentz contraction of the total energy density, where the relative velocity, (3.5.3) is in a frame in which the mirror is initially stationary. It is also the relativistic composition law for collinear velocities.
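The chain of identities leading from the energy ratio to (3.5.1)–(3.5.3) can be checked numerically. The Python sketch below works in units where $c = 1$ and the incident energy density is unity; the mirror speed is an assumed sample value and the function names are hypothetical.

```python
import math


def reflected_energy_density(eps, beta):
    """Reflected energy density, eps' = eps * ((c+u)/(c-u))**2, with beta = u/c."""
    return eps * ((1 + beta) / (1 - beta)) ** 2


def pressure_on_moving_mirror(eps, beta):
    """Dopplerized pressure P' = P (c+u)/(c-u) with P = 2*eps, Eq. (3.5.1)."""
    return 2 * eps * (1 + beta) / (1 - beta)


def composed_speed(beta):
    """B = (2u/c)/(1 + u^2/c^2), Eq. (3.5.3)."""
    return 2 * beta / (1 + beta**2)


EPS, BETA = 1.0, 0.3  # ASSUMED sample values (c = 1)

eps_reflected = reflected_energy_density(EPS, BETA)
p_moving = pressure_on_moving_mirror(EPS, BETA)

# Net rate of energy flow into the mirror, which the text equates to P' u.
net_rate = eps_reflected * (1 - BETA) - EPS * (1 + BETA)

# Right-hand side of Eq. (3.5.2).
b = composed_speed(BETA)
p_from_total_energy = math.sqrt(1 - b**2) * (EPS + eps_reflected)

print(f"P' u             : {p_moving * BETA:.6f}")
print(f"net energy rate  : {net_rate:.6f}")
print(f"P' from (3.5.1)  : {p_moving:.6f}")
print(f"P' from (3.5.2)  : {p_from_total_energy:.6f}")
```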
3.5.2 Poynting’s derivation of $E = mc^2$
An even more profound relation can be obtained following Poynting’s reasoning on light pressure and its relation to corpuscular theory. “Let,” says Poynting,

a beam of light, supposed to consist of corpuscles moving with velocity $c$, be incident perpendicularly on a completely absorbing, that is, a quite black surface. Let $m$ be the mass of the corpuscles in a cubic centimeter. Then the mass coming up to and entering a square centimeter of the surface in one second is that in a column of $c$ centimeters long and 1 sq. cm. cross section. The total mass is therefore $mc$. As it has velocity $c$ the momentum entering per second is $mc^2$. But this momentum entering per second is the pressure $P$ per sq. cm.

Poynting thus equates $P = mc^2$.
Introducing this into (3.5.2) results in

$$\left(\varepsilon + \varepsilon'\right)\sqrt{1 - B^2} = mc^2. \tag{3.5.4}$$

We can therefore think of the total energy as being given by

$$\varepsilon + \varepsilon' = \frac{mc^2}{\sqrt{1 - B^2}} = m'c^2. \tag{3.5.5}$$
Relation (3.5.5) is a clear indication that mass should have a dependency on the speed. This clearly shows that the English physicists, at the turn of the twentieth century, had all the necessary elements to determine the relativistic mass–energy relationship (3.5.5) before their colleagues on the continent.
3.5.3 Larmor’s attempt at the velocity composition law via Fresnel’s drag
Another contender for relativistic glory was Joseph Larmor, who came within a hair’s breadth of the relativistic law of the composition of velocities some time prior to 1900. Bradley’s work on stellar aberration had been known for almost a century when it occurred to Arago that, inasmuch as the velocity of light is different in glass than in air, the aberration of its path caused by the motion of the Earth would also be different in glass. The optical deviation caused by a glass prism would be different depending on whether the light rays are in the direction of the Earth’s motion or in the opposite direction. Arago found that the laws of reflection and refraction are not affected by the motion of the Earth. Since the velocity of light in a motionless medium had its normal value, a way was sought to decrease its velocity of propagation in a medium which was in motion. If $\eta = c/u$ is the index of refraction in the medium then a decrease of the velocity $u$ by the factor $1 - \eta^{-2}$ would do the trick. This supposition on the part of Fresnel left the rays relative to the moving medium unaltered, which at the same time did not contradict any known facts about aberration.

In 1871 Sir George Airy, who was a bitter critic of Faraday’s lines of force and Maxwell’s electromagnetic theory, devised an experiment to determine whether the motion of the Earth through the aether could
be revealed. A telescope is aimed at a star which is directly overhead [cf. Fig. 5.1]. Denote by $\alpha$ the angle of aberration, and $u$ the speed of the Earth through the aether. Now, fill the telescope full of water which has a refractive index $\eta = 1.33$. According to the wave theory of light, light will travel more slowly in water than in air. It will appear that the length of the telescope will be lengthened by a factor of $\eta$. In order to keep the star in sight, the telescope will have to be tilted still further, say to an angle $\beta$. By measuring this angle we might hope to find the speed $u$. We must also take into account the refraction that occurs at the objective lens; on one side we have air and on the other side, water. Using Snell’s law of refraction,

$$\eta = \frac{\sin\beta}{\sin\gamma} \approx \frac{\beta}{\gamma}, \tag{3.5.6}$$

where $\gamma$ is the angle of refraction in water. We would expect that $\gamma \approx \eta\alpha$, where $\alpha \approx u/c$, the angle of aberration. Hence, the difference $\beta - \alpha \approx (\eta^2 - 1)u/c$. Measurement of the angles allows $u$, the speed of the Earth through the aether, to be determined. Again, a null result was obtained: there is no difference between the aberration angles $\alpha$ and $\beta$. Now, since $\alpha$ and $\beta$ are the same, the angle of refraction is smaller than either of these two, namely $\gamma = \alpha/\eta$. The telescope has length $\ell$, and the time it takes for light to pass down is $t = \eta\ell/c$ in the presence of water. The light entering the top of the telescope, as measured from the position of the eyepiece, is the sum of refraction, $\gamma\ell$, and the dragging of the water, $fut$, where if there were no drag it would just be $ut$, the distance that the telescope moves. Hence, this distance can be thought of as the sum of refraction and drag, viz.

$$ut = \gamma\ell + fut.$$

Recalling that $\ell = ct/\eta$ and $\gamma = u/c\eta$ gives Fresnel’s drag coefficient,

$$f = 1 - \frac{1}{\eta^2}. \tag{3.5.7}$$
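The bookkeeping that leads to (3.5.7) is short enough to check directly. The Python sketch below solves $ut = \gamma\ell + fut$ for $f$ with $\ell = ct/\eta$ and $\gamma = u/(c\eta)$, and also evaluates the extra tilt $(\eta^2 - 1)u/c$ that would have been expected without drag. The Earth's speed of 30 km/sec is an assumed illustrative value not given in the text, and the function name is hypothetical.

```python
import math

C = 2.998e8   # speed of light, m/s (rounded value)
U = 3.0e4     # m/s, ASSUMED illustrative speed of the Earth through the aether
ETA = 1.33    # refractive index of water, as quoted in the text


def drag_from_airy(eta, u, c=C, t=1.0):
    """Solve u*t = gamma*l + f*u*t for f, with l = c*t/eta and gamma = u/(c*eta)."""
    length = c * t / eta      # telescope length traversed in time t
    gamma = u / (c * eta)     # angle of refraction, alpha/eta with alpha = u/c
    return (u * t - gamma * length) / (u * t)


alpha = U / C                                  # angle of aberration, radians
expected_extra_tilt = (ETA**2 - 1.0) * alpha   # beta - alpha if there were no drag
f = drag_from_airy(ETA, U)

print(f"aberration angle alpha      : {math.degrees(alpha) * 3600:.1f} arcsec")
print(f"extra tilt expected, no drag: {math.degrees(expected_extra_tilt) * 3600:.1f} arcsec")
print(f"drag coefficient f          : {f:.4f}")
```

The result for $f$ does not depend on the assumed value of $u$ or on the time $t$, since both cancel in the ratio.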
Who would have thought that (3.5.7) would come from something else, and only be an approximation to order $\eta^{-2}$? That person was Larmor. Larmor [00] was fully aware of the FitzGerald–Lorentz contraction hypothesis for which “the dimensions of the moving system are contracted in comparison with the fixed system in the ratio $\sqrt{1 - u^2/c^2}$.” Larmor reasoned that the particle should move with its own velocity $v$ and under a convective velocity $u$ of the medium in the $x$-direction by the amount $x = v(t - ux/c^2)$, or

$$w = \frac{v}{1 + uv/c^2},$$

where $w = x/t$ is the net velocity. He also took into account the FitzGerald–Lorentz contraction, so that the speed is reduced to

$$w = v\left(1 - \frac{u^2}{c^2}\right)^{1/2}\bigg/\left(1 + \frac{uv}{c^2}\right).$$

Then to lowest order Larmor found

$$w \approx v\left(1 - \frac{1}{\eta^2}\right),$$

and identified the coefficient of $v$ with the Fresnel drag coefficient, (3.5.7). Had Larmor simply added the term $ut$ to his displacement, i.e. $x = v(t - ux/c^2) + ut$, he would have come out with

$$w = \frac{x}{t} = \frac{u + v}{1 + uv/c^2}. \tag{3.5.8}$$
This has the advantage of symmetrizing the convective velocity $u$ and the particle’s velocity $v$ so that Larmor could equally well have written $x = u(t - vx/c^2) + vt$, and arrived at (3.5.8). The important point is that to lowest order both $u$ and $v$ must contribute to the speed of the particle, $w$. This is what Larmor missed, and in so doing, failed to derive the collinear addition law for speeds, (3.5.8). Whittaker [53] succinctly sums up Larmor’s contribution:

We are now in a position to show the connection between the Lorentz transformation and FitzGerald’s hypothesis of contraction; this connection was first established by Larmor [00] for his approximate form of the Lorentz transformation, which is accurate only to the second order in (w/c), but the extension to the full Lorentz transform is easy.
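The step from the displacement relation to (3.5.8) can be verified with exact rational arithmetic. The Python sketch below solves $x = v(t - ux/c^2) + ut$ for $w = x/t$, confirms the symmetry in $u$ and $v$, and checks that for $v = c/\eta$ and small $u$ the coefficient of $u$ reduces to Fresnel's $1 - 1/\eta^2$. Units with $c = 1$ and the sample speeds are assumptions made for the illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def larmor_velocity(u, v, c=1):
    """w = x/t from Larmor's displacement x = v*(t - u*x/c**2)."""
    return v / (1 + u * v / c**2)


def composed_velocity(u, v, c=1):
    """w = x/t from x = v*(t - u*x/c**2) + u*t, i.e. Eq. (3.5.8)."""
    return (u + v) / (1 + u * v / c**2)


# ASSUMED sample speeds, as exact fractions of c.
u, v = Fraction(1, 4), Fraction(1, 2)
w = composed_velocity(u, v)

# Substituting x = w*t (with t = 1) back into the displacement relation.
residual = v * (1 - u * w) + u - w
symmetric = composed_velocity(u, v) == composed_velocity(v, u)

# Lowest-order drag for light in a medium: v = c/eta, u small.
eta, u_small = 1.33, 1e-6
drag = (composed_velocity(u_small, 1 / eta) - 1 / eta) / u_small

print(f"Larmor's w        : {larmor_velocity(u, v)}")
print(f"composed w        : {w}")
print(f"residual          : {residual}")
print(f"symmetric in u, v : {symmetric}")
print(f"drag coefficient  : {drag:.4f}")
```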
3.6 Gone with the Aether

3.6.1 Elastic solid versus Maxwell’s equations
It may be said that whereas nineteenth century physicists tried to explain Nature, twentieth century physicists had the more modest goal of attempting to describe her. Kelvin once remarked to his friend Tait, “If you tell me what electricity is, I will tell you everything else.” Unfortunately, today this list has to be lengthened considerably. So after a century and a half of Maxwell, do we know what electricity is? And how best to describe it, whether in terms of fields or their potentials? This was the purpose of the aether: to account for Maxwell’s equations. According to Heaviside,

Let it not be forgotten that Maxwell’s theory is only a first step towards a full theory of the aether; and, moreover, that no theory of the aether can be complete that does not fully account for the omnipresent force of gravitation.
It should be said “Let Maxwell be and all is light,” instead of referring to Newton. In one bold sweep, Maxwell introduced the displacement current to produce a magnetic field, and thereby obtained a wave solution to his equations. Moreover, his equations predicted the velocity at which light should travel. It is given by the ratio of the electromagnetic unit of current to the electrostatic unit of current. By measuring the current and voltage in the lab, and aided with a balance that indicated the equality between electrostatic and magnetic forces, Maxwell found, around 1865, that the numerical value of $c$ was slightly less than $3 \times 10^{10}$ cm/sec. So if electromagnetic waves really did exist they should have a speed of 300,000 km/sec, a mind-boggling number in those days. Waves must travel through something, so what is that ‘something’? In Maxwell’s time, it was believed that visible light was the wave motion of a luminiferous aether that, though weightless, had remarkable elastic properties that could give rise to transverse waves. Maxwell’s equations only reiterated this belief that electromagnetic waves were disturbances of the luminiferous aether. And so Maxwell was led to believe that light was merely a manifestation of an electromagnetic wave. Even earlier, Faraday had speculated upon this possibility reasoning that whereas one all-pervasive and infinite aether was heavy on the mind, all the more so, would be the coexistence of two aethers, one for light and
the other for electricity. On this basis alone, Faraday reasoned that light was an electromagnetic phenomenon. Maxwell’s equations describe the unified electromagnetic and luminiferous aether, which is the “seat and transmitter of not only of electric and magnetic energy, but that of light.” Is this the most general aether there is? Helmholtz devised a more general aether which would generate longitudinal as well as transverse waves, but would do so independently [cf. Sec. 11.5.5]. It was Boltzmann who showed how this could be done by introducing certain substitutions. Moreover, Boltzmann’s substitutions showed how Helmholtz’s aether could be reduced to that of Maxwell. However, all attempts to find light phenomena that are governed by longitudinal waves proved in vain, and so, too, the need to generalize Maxwell’s equations. Even today it is felt [Skilling 42] safest to avoid the embarrassing question of the character of the medium in which such waves are transmitted. The best we can do is to follow Faraday’s “shadow of a speculation” and “dismiss the aether but keep the vibrations.”
Elastic solid theories of light were mechanical and made precise statements of what light is: light is the vibration of an elastic solid. Simple refraction could be explained by changes either in the rigidity of the solid or its density. However, it is known that a longitudinal wave is created in the reflection of a transverse wave polarized in the plane of incidence whereas one oscillating perpendicularly to the plane of incidence does not create a longitudinal wave. The existence of a longitudinal wave would then generate another longitudinal wave as well as a transverse wave in the plane of incidence. In other words, if the elastic solid was a viable model for light then the splitting of light beams at interfaces would necessarily bring in longitudinal waves, which we know from experience not to exist. The splitting of light waves at surfaces separating two different media was known as double refraction of light, and no longitudinal waves have ever been observed. Maxwell’s theory predicted there would be none. If we accept that Maxwell’s theory is a theory of propagation through the aether then the only vestiges are the aetheral constants, the permittivity and inductivity. The luminiferous aether has a long and glorious past, but came to a tragic end when it was interpreted as vacuous by the Michelson–Morley experiment. Why were so many of the great
physicists of the nineteenth century prey to it? And why was it felt necessary to get rid of the longitudinal waves and what seemed an absence of boundary conditions? Maxwell’s integrals of densities were over all space. In a continuous, elastic medium (Green’s aether) with a compressibility λ, and shear modulus, n, measuring its ‘rigidity’ to deformation, there are two kinds of waves that can propagate through it: longitudinal waves in which the medium wiggles back and forth, and transversal waves in which the medium waves back and forth in directions normal to the direction of propagation. These waves can propagate independently of one another, and at different speeds: transversal waves travel at a speed,
$$v_T = \sqrt{\frac{n}{\rho}}, \tag{3.6.1}$$
while longitudinal waves have a speed,
$$v_L = \sqrt{\frac{\lambda + \frac{4}{3}n}{\rho}}, \tag{3.6.2}$$
in a medium of density ρ. We can understand the rigidity and density of a material body, but how do these properties relate to the electromagnetic field? In other words, what values must we substitute for these quantities so that the speed of light comes out? With the discovery of polarization at the beginning of the nineteenth century by Malus, Arago and Fresnel, there was almost unanimous consensus that luminiferous vibrations had to be transversal. The nagging question was how to get rid of the longitudinal waves — if they had to be gotten rid of at all. In fact, their observation was a long awaited event. Initially, it was thought Röntgen’s rays, which he discovered in 1895, were the long awaited aetheral waves. The English physicist Silvanus Thompson called them ‘ultra-violet sound’ due to their very short wavelengths. The new longitudinal radiation was considered as the missing link to the understanding of gravity. It even provoked a reaction in Kelvin who put together a paper entitled “On the generation of longitudinal waves in the aether.” Since Maxwell’s equations only allowed transverse wave propagation, Larmor voiced his opinion that Maxwell’s equations “required some modification” that would allow for longitudinal disturbances
to propagate. Even Helmholtz’s electrodynamic theory allowed for longitudinal waves, and Helmholtz was championed by none other than Boltzmann. It is rather interesting to note how genius is limited to a certain area of expertise. Boltzmann was no match for Heaviside in electromagnetism, but Heaviside was no match for Planck in the arena of thermodynamics. Gibbs, though, held his own in both areas: Kelvin’s claim that Röntgen waves were “condensational waves of the luminiferous aether” was followed by his idea on how to test experimentally for these waves of superluminal speed, which according to (3.6.2) would need a large compressibility factor, λ. Gibbs solved the electromagnetic equations under the conditions cited by Kelvin and showed only transverse waves resulted. Both Boltzmann and Lodge supported Kelvin in the hope that the newly discovered radiation would be longitudinal because it would make the aether much more palatable. Even Rayleigh made his bloomers, as in his paper “The apparent failure of the usual electromagnetic equations” in which he sided with the American Barus that Maxwell’s equations could be interpreted so as to make the waves run ‘backwards.’ Ten years were to pass after Röntgen’s discovery before Barkla proved conclusively that these new X-rays were transversal. What Barkla did was to assume that the rays were transversal and used his experiment to confirm it. Incident light makes dipoles vibrate in the direction of its electric vector. The scattered light will be linearly polarized if the incident light is linearly polarized. The intensity of scattered light will be zero in the direction of the electric vector and maximum at the direction making a right angle to both it and the direction of propagation of the incident wave. Alternatively, if the incident light is unpolarized, the light scattered in the direction normal to the direction of propagation will be linearly polarized.
The components of the electric vectors of the unpolarized light that are normal to both these directions produce scattered light in the direction perpendicular to the direction of propagation of the incident light. In Barkla’s experiment the scatterers were spheres of paraffin; had they been heavier materials, the radiation produced on scattering might have given wrong results. Yet, none of these superstars would have ever questioned the existence of the aether. Whereas aether theories had to have longitudinal waves,
Maxwell’s theory just did not include them. As Heaviside asserted: There are no ‘longitudinal’ waves in Maxwell’s theory analogous to sound waves. Maxwell took good care that there should not be any.
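By way of contrast with Maxwell’s theory, the two wave speeds (3.6.1) and (3.6.2) of an elastic-solid aether can be evaluated numerically. The following is only a minimal sketch: the function names and the numerical values of λ, n and ρ are hypothetical, chosen in arbitrary consistent units, and serve merely to show that the longitudinal speed can exceed the transverse one without limit as λ grows, which is what Kelvin’s condensational waves would have required.

```python
import math


def transverse_speed(n, rho):
    """Transverse wave speed (3.6.1): v_T = sqrt(n / rho)."""
    return math.sqrt(n / rho)


def longitudinal_speed(lam, n, rho):
    """Longitudinal wave speed (3.6.2): v_L = sqrt((lam + 4n/3) / rho)."""
    return math.sqrt((lam + 4.0 * n / 3.0) / rho)


# Hypothetical aether constants in arbitrary units (illustration only).
rho = 1.0
n = 1.0
for lam in (0.0, 1.0, 100.0, 1.0e4):
    ratio = longitudinal_speed(lam, n, rho) / transverse_speed(n, rho)
    print(f"lambda = {lam:10.1f}   v_L / v_T = {ratio:8.3f}")
```

Even for λ = 0 the ratio is the square root of 4/3, so that in this model the longitudinal wave is never slower than the transverse one as long as λ is not negative.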
The vectors E and H oscillate in planes normal to the direction of propagation. All, and that is a pretty big prescription, one must know are the permittivity ε and permeability µ. In contrast, the theories of the aether attributed a difference in the optical properties of two bodies either to their difference in density ρ or to their difference in rigidity, n. Associating either the electric force or the magnetic force with the velocity of the medium serves to reduce an electromagnetic problem to one of an elastic solid which is easier to picture. Experience has taught that nearly all transparent bodies (i.e. non-conducting) have equal permeabilities, µ ≈ 1, and places the blame of optical differences squarely on the shoulders of the permittivities, ε. This is true at least for extremely short electromagnetic waves. Variations in the density are therefore due to variations in the permittivity which for different dielectrics has widely differing values. This means that the electric energy density is a kinetic energy density, and since both are quadratic, the electric field is proportional to the velocity. The magnetic induction is represented by the rotation of the aether particles and the electric vector is proportional to their velocity. This is the way Fresnel pictured light. Although Heaviside would agree entirely with such an analogy, it is not without its difficulties. For an ordinary conductor with constant charge, it would mean that there is a permanent flow of aether. However, following the analogy, the vector potential would be the spatial displacement, instead of the ‘electrokinetic momentum’ designated by Maxwell, so that the displacement current would play the role of momentum, the magnetic force that of a torque, and the magnetic energy would be the potential energy due to rotation.
Rather, if inertia is now attributed to the permeability which should occur in the opposite, long wavelength limit, then the magnetic force would play the role of the velocity of the aether particles and magnetic induction their momentum. The continuous streaming of the aether near a magnet is circuitous, which is not difficult to imagine since it leads to steady flow. Thus, whenever there is a magnetic force, there will be an
aetheral velocity, and the dielectric displacement is the rotation due to that velocity. This is the way Neumann construed light to behave, where the magnetic force lies in the plane of polarization. There is nothing in electromagnetic theory to indicate which force lies in the plane of polarization. Historically, the magnetic vector is called the direction of polarization, and the plane containing this vector and the direction of propagation is referred to as the plane of polarization. But, one could equally choose the plane of polarization to be the electric vector and the direction of propagation. In the case of the reflection of light at the boundary of a transparent dielectric, if we assume that the displacement coincides with the electric displacement, we get Fresnel’s formula for the ratio of the reflected to incident wave thereby indicating that the magnetic vector lies in the plane of polarization. Even though the magnetic force plays a role only when the particle is in motion, Heaviside, while admitting the difficulties in associating the electric force with the velocity of the medium in that a charge must be continually “emitting fluid in all directions,” considered the case far worse if the magnetic force is the velocity, for an impossibility is involved. The electric force becomes rotation or proportional thereto, and the impossibility is that we need to have E both circuital and polar roundabout an isolated charge! Dr. Larmor’s determined attempt to make the rotational aether go, with H as the velocity, labors under this apparently incurable defect.
The situation is not so clear-cut as Heaviside would have us believe, and even he wavered. In the first volume of his Electromagnetic Theory he championed associating a velocity with the magnetic force. By the time he got to the second volume he had changed his mind completely. In the first volume he argues I have shown that when impressed electric forces act it is the curl or rotation of the electric force which is to be considered as the source of the resulting disturbances. Now, on the assumption that the magnetic force is the velocity of the elastic solid, we find that the curl of the impressed electric force is represented simply by [the] impressed mechanical force of the ordinary Newtonian type. This is very convenient.
If we are considering the increase in inertia due to a charge set in motion, the natural thing would be to associate H with the velocity, as we shall do
in Sec. 5.4.3, while if we want to discuss the possibility of compressional waves then E is the more likely candidate for the velocity, as we shall see in Sec. 7.3. Heaviside oscillated between euphoria:
Aether is a wonderful thing. It may exist in the imagination of the wise, being invented and endowed with properties to suit their purposes; but we cannot do without it. . . But admitting the aether to propagate gravity instantaneously, it must have wonderful properties, unlike anything we know.
and depression:
The actual constitution of the aether is unknown. It can never be known.
3.6.2
The index of refraction
The aether had its purpose and maybe it still has. It got the likes of Maxwell to reason in terms of bodily distortions and how they could be modeled by two field equations, even though he preferred his ‘electrokinetic momentum’ to the actual fields. As we shall see in Sec. 7.3, the decomposition of the wave equation into the field equations provides an understanding of what is being propagated and how it is being propagated. Moreover, the product of the two parameters in Maxwell’s theory allows the introduction of a spatially varying index of refraction. Until Hertz’s famous experiment, the index of refraction was a gateway to experimental confirmation. In a transparent isotropic medium with dielectric constant ε, and showing but a negligible difference in µ, Maxwell predicted the speed of light to be $c/\sqrt{\epsilon}$. The refractive index of the dielectric medium is simply $\sqrt{\epsilon}$. Maxwell extrapolated the index of refraction for infinitely long waves so that it would approach a quasi-static process which would facilitate measurements of the dielectric constant. For solid paraffin, whose square-root of the dielectric constant was 1.405, he determined an index of refraction of 1.422. The difference was more than experimental error would allow and did not confirm conclusively his theory. Although Maxwell admitted that the dielectric constant was not the sole contribution to the index of refraction, it was its major contributor. Maxwell expected better agreement when “the grain structure of the medium in question will be taken into account.” Sadly, he died before his theory was confirmed by Hertz.
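The size of the discrepancy in the paraffin figures can be made explicit with a few lines of arithmetic. This is only a sketch using the two numbers quoted above; the variable and function names are our own, and the value taken for c is the modern defined one rather than anything available to Maxwell.

```python
import math

C_VACUUM = 299792458.0       # speed of light in vacuum, m/s (modern defined value)

sqrt_eps_paraffin = 1.405    # square root of the dielectric constant, as quoted
n_optical = 1.422            # index of refraction, as quoted


def light_speed_in_dielectric(eps, mu=1.0, c=C_VACUUM):
    """Maxwell's prediction c / sqrt(eps * mu) for a transparent medium."""
    return c / math.sqrt(eps * mu)


eps_paraffin = sqrt_eps_paraffin ** 2
relative_gap = (n_optical - sqrt_eps_paraffin) / n_optical

print(f"dielectric constant of paraffin : {eps_paraffin:.3f}")
print(f"predicted speed of light in it  : {light_speed_in_dielectric(eps_paraffin):.4e} m/s")
print(f"relative gap between n and sqrt(eps): {100.0 * relative_gap:.2f} %")
```

The gap is a little over one per cent: small, but, as noted above, larger than the experimental error would allow.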
Much of the sequel depends on the form of the index of refraction, and for mass to enter we must consider a dispersive medium. In this respect we may follow Schrödinger’s derivation of the group velocity. He considered the equivalence between Hamilton’s principle,
$$\int 2T\,dt = \text{extremum}, \tag{3.6.3}$$
and Fermat’s principle of least time,
$$\int \frac{ds}{v} = \text{extremum}, \tag{3.6.4}$$
where 2T is twice the kinetic energy, ds is an element of path, and v is the phase velocity. The integrands of the two principles should be proportional to one another. Schrödinger then wrote the kinetic energy as
$$2T = m\left(\frac{ds}{dt}\right)^2 = 2(W - V) = \sqrt{2m(W - V)}\,\frac{ds}{dt}.$$
Inserting this into (3.6.3) gave him
$$\int \sqrt{2m(W - V)}\,ds = \text{extremum}. \tag{3.6.5}$$
The integrands of (3.6.4) and (3.6.5) must be proportional to one another, i.e.
$$v = \frac{C}{\sqrt{2m(W - V)}}, \tag{3.6.6}$$
where the constant of proportionality, C, must be independent of the space coordinates upon which the potential V depends. Hence, it can, at most, be a function of the total energy, W. For a nonrelativistic particle, Schrödinger identified the group velocity u from the expression of the momentum
$$u = \frac{\sqrt{2m(W - V)}}{m}. \tag{3.6.7}$$
If ω is the angular frequency and κ the wavenumber, then this must coincide with the definition of the group velocity as
$$\frac{1}{u} = \frac{d\kappa}{d\omega} = \frac{d}{d\omega}\left(\frac{\omega}{\omega/\kappa}\right) = \frac{d}{dW}\left(\frac{W\sqrt{2m(W - V)}}{C}\right),$$
where ω/κ is the phase velocity, and he availed himself of Planck’s relation between energy and frequency. Upon differentiation, Schrödinger found C = W. For an electromagnetic wave, the phase velocity is c/η, where, if the system is inhomogeneous, the index of refraction η will depend upon the spatial coordinates. The product of the group, (3.6.7), and phase, (3.6.6), velocities is
$$uv = \frac{c}{\eta}\cdot\frac{\sqrt{2m(W - V)}}{m} = \frac{W}{m},$$
from which the expression,
$$\eta = \frac{c\sqrt{2m(W - V)}}{W}, \tag{3.6.8}$$
for the index of refraction follows. Interpreting W as frequency we have
$$\kappa^2 := \left(\frac{\eta\omega}{c}\right)^2 = 2m(W - V),$$
and the reduced, or Helmholtz, equation becomes
$$\nabla^2 E + \kappa^2 E = 0, \tag{3.6.9}$$
which will have a prominent role in our gravitational studies in Chapter 7. Schrödinger then opted to describe the vibratory motion of an electron in the hydrogen atom by finding the “possible movements of an elastic body.” Realizing that this is complicated by the existence of both longitudinal and transverse waves, he decided to “avoid this complication,” and consider only longitudinal waves, thereby missing out on the discovery of spin. Rather, we consider the wave equation obtained from the nondispersive Maxwell equations
$$\nabla \times H = \epsilon\dot{E}, \qquad -\nabla \times E = \mu\dot{H}.$$
We look for a solution where the electric and magnetic forces have the form
$$X(x, y, z)\,e^{-i\omega[t - S(x, y, z)]},$$
where S is known as the eikonal. In the high frequency limit, or short wavelength limit, the wave propagates in the direction ∇S which is perpendicular to both E and H, where
$$(\nabla S)^2 = \epsilon\mu = \frac{1}{v^2} = \frac{\eta^2}{c^2}.$$
Combining this with (3.6.8) gives the expression for the product of the two parameters appearing in Maxwell’s theory as
$$\epsilon\mu = \frac{2m(\omega - V)}{\omega^2}, \tag{3.6.10}$$
in units of action. Unfortunately, there is no known material that has a constitutive relation of the form (3.6.10). It neither reduces to Cauchy’s law of dispersion in an electromagnetic system where $\eta = \sqrt{\epsilon}$ and µ = 1, nor in an electrostatic system where $\eta = \sqrt{\mu}$ and ε = 1. However, it does include the phenomenon of total reflection, where ω < V. Total reflection occurs when the first medium from which light arrives is optically denser than the second medium into which it enters. Take glass and the vacuum, and use primes to indicate the internal incidence within the glass, and the external refraction in the vacuum, β′ and γ′. In applying Snell’s law, (3.5.6), the external angle of refraction γ′ corresponds to our former β, while β′ is equal to our former γ. Total internal reflection occurs when β′ > sin⁻¹(1/η), where the latter is referred to as the critical angle. Beyond this angle, Snell’s law yields values of sin γ′ which are preposterous because sin γ′ = η sin β′ > 1, and, consequently, values of cos γ′ that are imaginary. The Fresnel reflection coefficient from the boundary becomes complex, predicting total reflection, but, more surprisingly, it predicts the correct phase jump that occurs at total reflection. However, even beyond the critical angle there still exists a refracted wave, whose amplitude, however, will decay exponentially, the more rapidly the greater the difference β′ − sin⁻¹(1/η) becomes. The reason for this transmission is exactly the same as for electron waves: In a region of imaginary index of refraction the waves penetrate in an exponentially decaying manner. The inequality ω < V implies (3.6.8) is imaginary.
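The relations (3.6.6) to (3.6.8) and the condition for total reflection lend themselves to a quick numerical check. What follows is a sketch under stated assumptions: units are chosen so that m = c = 1, the values of W and V are hypothetical, the glass index 1.5 is an assumed typical figure, and all function names are our own. Complex square roots are used so that the index of refraction comes out imaginary where W < V, and cos γ′ comes out imaginary beyond the critical angle.

```python
import cmath
import math


def momentum(W, V, m=1.0):
    """sqrt(2m(W - V)); imaginary wherever W < V."""
    return cmath.sqrt(2.0 * m * (W - V))


def refractive_index(W, V, m=1.0, c=1.0):
    """Index of refraction (3.6.8): eta = c sqrt(2m(W - V)) / W."""
    return c * momentum(W, V, m) / W


def phase_velocity(W, V, m=1.0):
    """Phase velocity (3.6.6) with Schroedinger's result C = W."""
    return W / momentum(W, V, m)


def group_velocity(W, V, m=1.0):
    """Group velocity (3.6.7)."""
    return momentum(W, V, m) / m


def critical_angle(eta):
    """Critical angle asin(1/eta), in radians, for a real index eta > 1."""
    return math.asin(1.0 / eta)


def cos_refraction(eta, beta):
    """cos(gamma') from sin(gamma') = eta sin(beta'); imaginary beyond the critical angle."""
    return cmath.sqrt(1.0 - (eta * math.sin(beta)) ** 2)


# Hypothetical energies, in units with m = c = 1.
print("eta for W > V :", refractive_index(2.0, 1.0))
print("eta for W < V :", refractive_index(1.0, 2.0))
print("u * v         :", group_velocity(2.0, 1.0) * phase_velocity(2.0, 1.0))

eta_glass = 1.5  # assumed typical value for glass
print("critical angle (degrees):", math.degrees(critical_angle(eta_glass)))
print("cos(gamma') at 60 degrees:", cos_refraction(eta_glass, math.radians(60.0)))
```

With C = W the product of group and phase velocities reduces to W/m, as stated in the text, and the imaginary value of cos γ′ beyond the critical angle is what produces the exponentially decaying refracted wave.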
3.7
Motion Causes Bodily Distortion
Even before Larmor, Lorentz was contemplating what effects motion has on the change in configuration of bodies. According to Lorentz, the shape and charge distribution of an electron at rest must be correlated with one in motion. In Sec. 5.4.4, we will see that Lorentz assumed that when an electron, with a spherical shape and uniform charge distribution on its surface, is set in motion it will become an oblate spheroid, whose semiminor to semimajor axes are related by the FitzGerald–Lorentz contraction factor. Although Lorentz does not pretend to offer even a partial explanation, he has gone a long way in driving home the point that motion causes distortion. There is no better way of summarizing this than to repeat Cunningham’s [14] words: . . . we are bound to recognize the possibility of changes in the shape and properties of material bodies when their velocity is altered, and also that the Newtonian conception of a rigid body as one having a permanent configuration independent of its velocity is one which is not even approximately realized unless that velocity is very small compared with that of light.
3.7.1
Optical effect: Double refraction experiments
Michelson’s attempt to determine the speed of the Earth relative to the aether was the first of its kind at determining a second-order effect. Even earlier, Mascart [74] sought a first-order effect, which was repeated later by Lord Rayleigh [02], and, still later, by Brace [04] to an even higher degree of precision. Rayleigh reasoned that if an isotropic transparent body actually did contract as a result of its motion through the aether, then its optical properties must be modified. An isotropic, transparent body would no longer be isotropic, and, as a consequence, a beam of light passing through it in an oblique direction would undergo birefringence, or double refraction. However, he could find no trace of such a change. Brace repeated Rayleigh’s experiment to a precision that, if detected, would be one-fiftieth of what would be produced through a mechanical contraction due to pressure. The null result of Rayleigh’s attempt was fully expected by Larmor when his paper was read before the British Association, meeting in Belfast in
September 1902. Whereas “optical measurements are usually made by the null method of adjusting the apparatus so that the disturbance vanishes,” Larmor [00] contended that the “result carries the general absence of the effect of the Earth’s motion in optical experiments, up to second-order of small quantities”; that is, up to terms of order (u/c)². This hardly led to credence in the FitzGerald–Lorentz hypothesis of contraction in the Michelson–Morley experiment. For if there were factors of ‘compensation’ they should have surely been at work in that experiment. However, Lorentz [16] was so attached to his contraction hypothesis that he offered compensating mechanisms, one being the compensation that would ensue if there would be differences in the effective mass of an electron when it vibrates in different directions.
3.7.2
Trouton–Noble null mechanical effect
There is absolutely no evidence of a couple acting on two rigidly connected charges due to the motion of the Earth through the aether.
Trouton, in collaboration with Noble [03], carried on the search for an effect of the motion of the Earth through the aether that was begun by his recently deceased mentor FitzGerald. They set up the apparatus shown in Fig. 3.4 to measure a mechanical effect. Take two equal and opposite charges and set them in motion with the same velocity in parallel directions. The angle between the charges and the direction of the motion should neither be a right angle nor zero. Suppose the charges are moving in the x-direction in the xy-plane. The positive charge will produce a magnetic field at the negative charge in the z-direction. The negative charge will feel a force in the direction of the y-axis and the positive charge will feel an equal force in the −y direction. The charges were then mounted on a platform which could turn freely about its center. The forces acting on the charges would then be transmitted to the platform to provide a couple that would tend to set it in motion at right angles to the motion of the charges. If it exists, the effect would be of second order. The greatest deflection that was measured was 0.36 cm, while the deflection expected theoretically was 6.8 cm. This experiment extended the null results found in optical-electrodynamic phenomena to mechanical-electrodynamic effects. Mysterious compensating factors were at work that led one to believe that there
Fig. 3.4. (a) Schematic arrangement of their experimental set-up. A parallel plate capacitor AB was hung from a 37 cm long phosphor bronze strip PA. The capacitor was charged to voltages up to 3000 V. A mirror attached to the capacitor was viewed through a telescope to see if there were any oscillations. In (b) E is the capacitor’s electric field and v is the direction of the Earth’s motion through the aether. The angle is between the line connecting the opposite charges and the direction of their motion.
was some unknown mechanism upon which the mechanical, optical, and electromagnetic properties of matter depend. This unknown mechanism was whisked away by Poincaré even before many of the experiments had been performed! In his optics course at the Sorbonne in 1899, Poincaré discarded the possibility of ever finding an effect that would provide evidence of the motion of the Earth through the aether, either to first- or second-order in the coefficient of aberration, u/c. In his own words, I regard it as very probable that optical phenomena depend only on the relative motions of material bodies, luminous sources, and optical apparatus concerned,
and that is true not merely as far as quantities of the order of the square of the aberration, but rigorously.
Whittaker [53] drives this point home by mentioning two other occasions where Poincaré lays down his ‘principle of relativity,’ even going so far as to liken it to the second law of thermodynamics, in that it would fall if a single example could be found in which absolute motion were detectable. Apparently, Whittaker was anticipating similar claims made by Einstein in his electromagnetic paper of 1905 in which he “set forth the relativity theory of Poincaré and Lorentz with some amplifications, and which attracted much attention.”
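Returning to the mechanical effect that Trouton and Noble looked for, the couple expected on the naive pre-relativistic argument of this section can be sketched numerically. The sketch assumes the lowest-order field of a slowly moving point charge, B = (µ₀/4π) q v × r / r³, and the magnetic force q v × B on the other charge; the charge, separation and function names are hypothetical, and the speed is merely of the order of the Earth’s orbital speed. It is meant only to exhibit the second-order dependence on v/c and the vanishing of the couple at 0 and at a right angle, not to model the actual capacitor.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
C = 299792458.0           # speed of light, m/s


def cross(a, b):
    """Vector product of two 3-tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def naive_couple(q, r, theta, v):
    """z-component of the couple on a +q/-q pair moving along x with speed v.

    The separation r makes an angle theta with the velocity. Only the
    mutual magnetic forces, to lowest order in v/c, are included.
    """
    mu0_over_4pi = 1.0 / (4.0 * math.pi * EPS0 * C ** 2)
    vel = (v, 0.0, 0.0)
    sep = (r * math.cos(theta), r * math.sin(theta), 0.0)   # from +q to -q
    # Magnetic field of +q at the position of -q.
    B = tuple(mu0_over_4pi * q * comp / r ** 3 for comp in cross(vel, sep))
    # Force on -q; the force on +q is equal and opposite.
    F_neg = tuple(-q * comp for comp in cross(vel, B))
    # Couple about the midpoint: (sep/2) x F_neg + (-sep/2) x (-F_neg).
    return cross(sep, F_neg)[2]


# Hypothetical illustration values.
q, r, v = 1.0e-6, 1.0e-2, 3.0e4
for deg in (0, 30, 45, 60, 90):
    print(f"theta = {deg:2d} deg   couple = {naive_couple(q, r, math.radians(deg), v):.3e} N m")

# Ratio of the greatest observed deflection to the expected one, from the text.
print("observed / expected deflection:", 0.36 / 6.8)
```

In closed form the couple computed here is q² (v/c)² sin 2θ / (8πε₀ r), which makes both the second-order character and the angular dependence explicit.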
3.7.3
Anisotropy of mass
Be that as it may, everyone, including Lorentz and Poincaré, was talking about charged particles, and, in particular, the electron. But, as Thomson [28] notes: Einstein has shown that to conform with the principles of Relativity mass must vary with velocity according to the law $m_0/\sqrt{1 - u^2/c^2}$. This is a test imposed by Relativity on any theory of mass. We see that it is satisfied by the conception that the whole of the mass is electrical in origin, and this conception is the only one yet advanced which gives a physical explanation of the dependence of the mass on velocity.
Einstein [23a] wrote in his 1905 paper on electrodynamics: We remark that these results as to the mass are valid for ponderable material points, because a ponderable material point can be made into an electron (in our sense of the word) by the addition of an electric charge, no matter how small.
This statement makes no sense in itself, except for expressing the desire to incorporate all of matter into a theory of relativity, for which a case could only be made for electrons. Einstein does not even broach the problem of the distinction between the electrostatic and electromagnetic masses. As such, it can be considered nothing less than a leap of faith. Even up to the early 1940’s, there was no example of an uncharged particle whose mass varied with velocity. Stranathan [42] is quick to point out that while relativity theory and the earlier theory of Lorentz predicted the same dependency of the mass on speed, there was one important
difference: Lorentz’s theory treated only electromagnetic mass, of mass attributable to a charged particle because of the energy represented by the fields about it. The relativity theory treats mass in general, with no specification of what may be responsible for the property. It is true that Lorentz presented arguments that all mass is probably electromagnetic in character. If these arguments are accepted, his theory would lead one to suppose that all mass varies with velocity. The relativity theory predicts this directly.
But, from all the experiments to date, all one could deduce is e/m, the ratio of the charge to the mass. So, if there were no charge, there would be no deflection, and hence no dependency of the mass on the speed! Thus, one had to go to extra-terrestrial objects to show that mass depends on speed. The example picked by Stranathan is the advance of the perihelion of Mercury, where, after taking into account all known perturbing forces on the orbit, there still remains a residual quantity that must be explained, that of some 43 seconds of arc per century. According to Stranathan, There was no logical interpretation of this residual rotation before the advent of relativity theory. If, however, account is taken of the variation of the mass of the planet with velocity in its orbit, it turns out that one would expect this mass variation to produce a rotation almost exactly the residual observed. This is rather convincing evidence of the change in mass with velocity for ordinary matter. No one doubts that all mass changes in exactly the same way.
If this were correct, there would have been no need to invent general relativity. In Sec. 7.6.3 we will see that the advance of the perihelion is due to factors other than the increase in mass with speed. It just seems astounding that recourse had to be made to a phenomenon which has nothing to do with the increase in mass when a charged particle undergoes deflection in an electromagnetic field. Then it is a question of mass and its relation to energy and stress. According to the Newtonian viewpoint, the momentum of a particle would always coincide with its direction of motion. That is, the mass, or better the ‘rest’ mass, is a scalar quantity which is necessarily isotropic. However, since the inertia of a body depends on its heat content [cf. Sec. 6.1], there is a part of the mass that depends on the stress. Only in the case where the stress degenerates into the ordinary, scalar, pressure will the mass be isotropic. We may then say that inertia is polarized by its motion. This will occupy our attention in Chapter 11.
The simplest example is a person walking a wheel. If its center of inertia is at rest with respect to the observer, he will see a circle. However, if the observer is at rest with respect to the wheel in motion, he will see an ellipse. The part of the wheel touching the ground will not seem contracted, while the uppermost part of the wheel will move with double the velocity, and thus appear contracted. Consequently, there will be a greater mass density at the upper part of the wheel than at the bottom. A greater amount of inertia above the center of inertia, coinciding with the center of the ellipse, means that inertia has been polarized by the motion of the wheel [Fokker 65]. Momentum will be parallel to the velocity only when the particle moves along one of its principal axes of stress. That is, the torque exerted on the particle will vanish only when its constant velocity coincides with its principal stress-axes. Stress entangles translational and rotational motion so that there will be two differently directed vectors, and, therefore it will neither be a vector nor a scalar but a combination of the two: a quaternion.
3.7.3.1
Quaternionic mass
From the Lorentz transform on momentum and energy, it emerges that the mass is given by the quaternion [Silberstein 14]
$$M = \frac{W}{c^2} + \frac{V}{c^2\gamma}\,\varepsilon P, \tag{3.7.1}$$
where V is the volume at rest, and P is the vector operator, the stress, which is a mixed second-order tensor. The so-called longitudinal stretching vector, which is a Lorentz transform coefficient without rotation, $\varepsilon = \gamma\,\mathbf{i}\mathbf{i} + \mathbf{j}\mathbf{j} + \mathbf{k}\mathbf{k}$, where $\gamma = 1/\sqrt{1 - u^2/c^2}$, stretches vectors parallel to the motion in the ratio γ : 1, while not affecting those normal to the motion. Since $\mathbf{i}$ is the versor for the velocity $\mathbf{u}$, i.e. $\mathbf{i} = \mathbf{u}/u$, the stretching factor can be written as
$$\varepsilon = I + (\gamma - 1)\frac{\mathbf{u}\mathbf{u}}{u^2},$$
where $I = \mathbf{i}\mathbf{i} + \mathbf{j}\mathbf{j} + \mathbf{k}\mathbf{k}$, the idemfactor. Inserting this into the mass (3.7.1) leads to
$$Mc^2 = W + V\left\{\frac{(P\cdot\mathbf{u})\mathbf{u}}{u^2} + \gamma^{-1}\left[P - \frac{(P\cdot\mathbf{u})\mathbf{u}}{u^2}\right]\right\}. \tag{3.7.2}$$
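The action of the stretching factor can be illustrated on an ordinary vector. In the sketch below a plain three-component vector a stands in for the operator P, which is a simplifying assumption made purely for illustration; the function names are our own and the velocity must be nonzero. It checks that ε stretches the component of a along u by γ and leaves the normal components alone, so that the combination ε/γ appearing in (3.7.1) leaves the longitudinal part unchanged and contracts the transverse part by 1/γ, as in (3.7.2).

```python
import math


def dot(a, b):
    """Scalar product of two 3-tuples."""
    return sum(x * y for x, y in zip(a, b))


def lorentz_gamma(u, c=1.0):
    """gamma = 1 / sqrt(1 - u^2/c^2) for the velocity vector u."""
    return 1.0 / math.sqrt(1.0 - dot(u, u) / c ** 2)


def stretch(a, u, c=1.0):
    """Apply epsilon = I + (gamma - 1) uu/u^2 to the vector a (u must be nonzero)."""
    g = lorentz_gamma(u, c)
    coeff = (g - 1.0) * dot(u, a) / dot(u, u)
    return tuple(ai + coeff * ui for ai, ui in zip(a, u))


def contracted(a, u, c=1.0):
    """Apply epsilon / gamma, the combination multiplying V P in (3.7.1)."""
    g = lorentz_gamma(u, c)
    return tuple(x / g for x in stretch(a, u, c))


# Hypothetical example: motion along x at u = 0.6 c, so that gamma = 1.25.
u = (0.6, 0.0, 0.0)
a = (1.0, 2.0, 3.0)
print("gamma          :", lorentz_gamma(u))
print("epsilon a      :", stretch(a, u))
print("epsilon a/gamma:", contracted(a, u))
```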
In general, the mass will have both scalar and vector components. The vector components are further broken down into the longitudinal and transverse components; the latter, according to (3.7.2), undergoes a FitzGerald–Lorentz contraction. If the linear vector operator P degenerates into a simple scalar, the pressure is purely normal and equal in all directions. The mass operator, (3.7.2), degenerates into the scalar rest-mass,
$$M = \frac{W + PV}{c^2} = \frac{H}{c^2}, \tag{3.7.3}$$
where P is the isotropic, or ‘hydrostatic,’ pressure, and H is known as the heat function, or enthalpy. This expression for the energetic equivalence of mass was first advanced by Planck in 1907 [cf. Sec. 6.4]. The principal axes of stress are those for which the pressure becomes purely normal. The stress will have three mutually perpendicular principal axes with the corresponding principal pressures that are represented by the scalars $P_i$, $P_j$ and $P_k$. With the definition of the absolute value of the pressure as:
$$|P| = \sqrt{\sum_{i=1}^{3} P_i^2},$$
the mass operator can be written as:
$$M = \frac{1}{c^2}\begin{pmatrix} W & -|P|V \\ |P|V & W \end{pmatrix} = \frac{1}{c^2}\left(W\,\mathsf{1} + \mathsf{i}\,|P|V\right), \tag{3.7.4}$$
where
$$\mathsf{1} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \mathsf{i} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}.$$
The squared absolute value of the mass is
$$|M|^2 c^4 = W^2 + |P|^2 V^2 = W^2 + V^2\sum_{i=1}^{3} P_i^2, \tag{3.7.5}$$
and not as expected from Planck’s formula (3.7.3).
The second term in (3.7.4) is responsible for the inertia of polarization. Although we will come back to this in Chapter 11, let it suffice to say here that the inertia of polarization destroys the decomposition of the mass into ‘longitudinal’ and ‘transverse’ components in the equation of motion,
$$\frac{d\mathbf{G}}{dt} = \mathbf{F},$$
since the inertial rest mass is no longer a scalar mass. Instead of the matrix representation of the complex mass, (3.7.4), we can form the quaternion,
$$M = [W + (P_1\mathbf{i} + P_2\mathbf{j} + P_3\mathbf{k})V]/c^2. \tag{3.7.6}$$
To this quaternion, there corresponds the matrix:
$$A = \frac{1}{c^2}\begin{pmatrix} W + iP_1V & (P_2 + iP_3)V \\ -(P_2 - iP_3)V & W - iP_1V \end{pmatrix}. \tag{3.7.7}$$
Thus, if we multiply the pressure vector by $-i$, which is equivalent to rotating it through $-\pi/2$, the matrix (3.7.7) becomes
$$\tilde{A} = \frac{1}{c^2}\begin{pmatrix} W + P_1V & (P_3 - iP_2)V \\ (P_3 + iP_2)V & W - P_1V \end{pmatrix}.$$
The determinant of (3.7.7) is the invariant form (3.7.5). Note, there is no mass component for each principal value of the stress [Silberstein 14], for that would involve an energy component in addition. Expression (3.7.6) generalizes Planck’s expression for the scalar rest mass, (3.7.3), and gives a general relation between the principal stresses and the rest mass.

Another interesting point is this. Apart from the energy terms, the matrix (3.7.7) can be decomposed into three components,
$$A(i) = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad A(j) = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad A(k) = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix},$$
which are related to the Pauli spin matrices by
$$\sigma_x = -iA(k), \qquad \sigma_y = -iA(j), \qquad \sigma_z = -iA(i).$$
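The two algebraic statements about the matrix (3.7.7) can be checked with a few lines of complex arithmetic. The sketch below, with illustrative values for W, V and the principal pressures and with c = 1, evaluates the determinant of (3.7.7) and compares it with the sum of squares in (3.7.5), then forms $-iA(k)$, $-iA(j)$ and $-iA(i)$. The helper names `mass_matrix`, `det2` and `scale` are hypothetical.

```python
def mass_matrix(W, P, V, c=1.0):
    """2x2 complex matrix (3.7.7) for energy W, principal pressures P, volume V."""
    P1, P2, P3 = P
    return [[(W + 1j * P1 * V) / c**2, (P2 + 1j * P3) * V / c**2],
            [-(P2 - 1j * P3) * V / c**2, (W - 1j * P1 * V) / c**2]]

def det2(m):
    """Determinant of a 2x2 nested list."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def scale(z, m):
    """Multiply every entry of a 2x2 nested list by the scalar z."""
    return [[z * x for x in row] for row in m]

# The three unit components of (3.7.7), energy terms aside.
A_i = [[1j, 0], [0, -1j]]
A_j = [[0, 1], [-1, 0]]
A_k = [[0, 1j], [1j, 0]]

sigma_x = scale(-1j, A_k)
sigma_y = scale(-1j, A_j)
sigma_z = scale(-1j, A_i)

# Illustrative numbers only (c = 1).
W, P, V = 2.0, (0.3, 0.4, 1.2), 1.0
det_A = det2(mass_matrix(W, P, V))
norm_sq = W**2 + V**2 * sum(p * p for p in P)
```

With c = 1 the determinant is real and equals $W^2 + V^2\sum P_i^2$, and the three scaled components reproduce the standard Pauli matrices.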
If we rotate the imaginary component of the quaternion (3.7.6) by −π/2, then instead of obtaining the equation of a circle, or three-sphere, (3.7.5), we will get a hyperbola, or hyperboloid. It is this rotation that transforms an elliptical quaternion into a hyperbolic Stokes representation when mass is invariant. If energy is invariant then the Stokes representation coincides with an elliptical quaternion known as the Poincaré representation. These topics will be discussed in much greater detail in Secs. 11.2 and 11.3, respectively.

To test for the possibility of mass anisotropy, Poynting and Gray [22] looked at natural crystals, which can harbor enormously large ‘latent stresses.’ They attempted to measure whether a quartz crystal sphere had any directive action on another crystal sphere in its proximity. They did so by attempting to measure any difference in the work needed to rotate the spheres from a configuration where their axes were parallel to one another to the opposite configuration where they were crossed. It would seem reasonable to believe that the crystals exert a greater attraction on one another when their axes are parallel.

Start with the configuration where both crystals lie in the same plane but with their axes perpendicular to one another. To separate them will take work. When they are out of the range of attraction, turn one of the crystals around until its axis becomes parallel to the other one. If the process is done quasistatically it will involve no work. Bringing the crystals together again will show that less work will be needed than on the outgoing journey. When the crystals have come to within the same distance at which they started, rotate one of the crystals so that its axis again becomes normal to the other. Work must be done in order to turn one of the crystals, for, if not, the cycle could be carried out again and again, always yielding a surplus of energy. This would be tantamount to a perpetual motion machine from which energy could be drawn without limit. So the work that is necessary to rotate the crystal axis must be dissipated either in the cooling of the crystal or in a diminution of its mass. Since neither of these possibilities is feasible, the only conclusion we can come to is that it requires work to rotate the crystal from where its axis is parallel to the other crystal to one where it is normal to it. Poynting and Gray failed to find what they called any ‘directive action in gravitation.’ The matter was dropped at that.
3.7.3.2 Vectorial mass
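Another way to define the electromagnetic mass is to consider it as the ratio of the momentum to speed [Schott 12],
$$\mathbf{G} = \mathbf{m}u. \tag{3.7.8}$$
This definition makes mass a vectorial quantity. We will refer to the mass defined by (3.7.8) as the transverse mass, in contrast to the longitudinal mass,
$$\mathbf{m}' = \frac{\partial\mathbf{G}}{\partial u} = \mathbf{m} + u\frac{\partial\mathbf{m}}{\partial u}, \tag{3.7.9}$$
which is also a vector. The different components of the mass will be related to the external force, defined by
$$\frac{d\mathbf{G}}{dt} = \frac{\partial\mathbf{G}}{\partial t} + \boldsymbol{\omega}\times\mathbf{G}. \tag{3.7.10}$$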
Fig. 3.5. Planes formed from a moving trihedron.

We consider a mass in motion, which, at any instant, has a set of three mutually orthogonal axes, (ξ, η, ζ). These axes are the tangent, principal normal, and binormal to the path that the electron traces out, as shown in Fig. 3.5. The partial derivative in (3.7.10) refers to the differentiation in time relative to the axes which rotate at an angular velocity, ω. Since the electromagnetic momentum, G, is not an explicit function of time,
$$\frac{\partial\mathbf{G}}{\partial t} = \dot{u}\,\frac{\partial\mathbf{G}}{\partial u},$$
and the angular velocity will have components $(u/\tau, 0, u/\rho)$, where ρ and τ are the radii of curvature and torsion, respectively. The force, (3.7.10), will have the following components:
$$F_\xi = m'_\xi\dot{u} - m_\eta\frac{u^2}{\rho}, \qquad F_\eta = m'_\eta\dot{u} + m_\xi\frac{u^2}{\rho} - m_\zeta\frac{u^2}{\tau}, \qquad F_\zeta = m'_\zeta\dot{u} + m_\eta\frac{u^2}{\tau}, \tag{3.7.11}$$
implying three different actions [Schott 12]:
(i) A quasi-longitudinal force component, $\mathbf{m}'\dot{u}$, proportional to the acceleration, $\dot{u}$, in the direction of $\mathbf{m}'$.
(ii) A quasi-transversal force component, $(-m_\eta, m_\xi, 0)u^2/\rho$, proportional to the centripetal acceleration, $u^2/\rho$, in the osculating plane normal to the transverse mass vector, $\mathbf{m}$.
(iii) A torsional force component, $(0, -m_\zeta, m_\eta)u^2/\tau$, proportional to the torsion, $1/\tau$, in the normal plane perpendicular to the mass vector, $\mathbf{m}$.

For a symmetrical electron, the torsion vanishes, so the first and second components reduce to the longitudinal and transverse mass components, respectively. If the electron is not symmetrical we need:
(i) components $m'_\eta\dot{u}$ and $m'_\zeta\dot{u}$, in addition to the longitudinal component, $m'_\xi\dot{u}$, in order to keep the electron moving in a straight line, whereas
(ii) in addition to the centripetal component, $m_\xi u^2/\rho$, the longitudinal component, $-m_\eta u^2/\rho$, is necessary to keep the electron moving in a circle of radius ρ at uniform speed, u.

It thus becomes clear that the transverse mass component is related to rotational motion, while the longitudinal mass is involved in rectilinear motion. This gives credence to the statement that the transverse mass arises when the force is perpendicular to the velocity, while the longitudinal mass occurs when the force is parallel to the velocity [Okun 89]. But it is never mentioned that this applies to uniform circular motion.
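The component form (3.7.11) is easy to evaluate directly. The following sketch assumes that the terms proportional to the acceleration carry the longitudinal mass components (called `m_long` here) while the curvature and torsion terms carry the transverse ones (`m_trans`); the function name and the numerical values are illustrative only.

```python
import math

def trihedron_force(m_long, m_trans, u, u_dot, rho, tau):
    """Force components (F_xi, F_eta, F_zeta) of (3.7.11).

    m_long, m_trans: (xi, eta, zeta) components of the longitudinal and
    transverse mass vectors; rho, tau: radii of curvature and torsion.
    """
    ml_xi, ml_eta, ml_zeta = m_long
    mt_xi, mt_eta, mt_zeta = m_trans
    curv = u * u / rho   # centripetal acceleration u^2 / rho
    tors = u * u / tau   # torsional term u^2 / tau (zero when tau is infinite)
    F_xi = ml_xi * u_dot - mt_eta * curv
    F_eta = ml_eta * u_dot + mt_xi * curv - mt_zeta * tors
    F_zeta = ml_zeta * u_dot + mt_eta * tors
    return (F_xi, F_eta, F_zeta)

# Symmetrical electron: both mass vectors along the tangent, no torsion.
F = trihedron_force(m_long=(1.953125, 0.0, 0.0),
                    m_trans=(1.25, 0.0, 0.0),
                    u=0.6, u_dot=2.0, rho=3.0, tau=math.inf)
```

In this symmetrical case only the quasi-longitudinal term survives along the tangent and only the centripetal term along the principal normal, which is the reduction described in the text.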
The momentum of the Lorentz electron,
$$\mathbf{G} = \frac{m_0\mathbf{u}}{\sqrt{1 - \beta^2}}, \tag{3.7.12}$$
implies that G is in the direction of the motion, which need not always be the case. According to (3.7.8) and (3.7.9) this will imply that the two mass vectors, $\mathbf{m}'$ and $\mathbf{m}$, point in the direction of motion. Consequently, $m = m_\xi$ and $m' = m'_\xi$, and all the other components vanish. The mass dependencies are given by $m = m_0\gamma$ and $m' = m_0\gamma^3$, where $m_0$ is $4e^2/5c^2a$ according to Lorentz, with a the radius of the sphere. To the relativists, $m_0$ is just the ‘rest’ mass, although it will have both a longitudinal, $m_0\gamma^3\dot{u}$, component and a centripetal force, $m_0\gamma u^2/\rho$, component. Therefore, the total force acting on a Lorentz electron will be
$$F = m'\dot{u} + m\dot{u}_\rho, \tag{3.7.13}$$
where $\dot{u}_\rho = u^2/\rho$ is the centripetal acceleration. However, this is not what the relativists tell us today. The longitudinal component did not comply with the early e/m measurements, and was quickly swept under the carpet. The transverse component has nothing to do with orbital motion so that the force acting on the electron is $m\dot{u}$, but it still requires u ⊥ F!

The rate of energy loss,
$$Fu = m'u\dot{u} + mu^3/\rho, \tag{3.7.14}$$
shows that part of the energy is stored as
$$W = \int m'u\,du,$$
and part is lost at the rate $mu^3/\rho$ due to its tendency to rotate.ᵃ We will now see that the early measurements on the electron were specifically designed to measure the centripetal force in (3.7.13).

ᵃ Since an accelerating electron radiates, (3.7.14) would have to be supplemented by two other terms: the rate of energy loss due to radiation, and a small term involving the pressure of radiation. We shall consider these terms in Sec. 4.3.
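The relation between the two masses of the Lorentz electron can be checked numerically: differentiating the momentum $m_0\gamma u$ with respect to u should return $m_0\gamma^3$, in line with the definition (3.7.9). The sketch below does this by a central difference and then evaluates the total force (3.7.13). The function names, the units with c = 1 and $m_0$ = 1, and the sample values are assumptions of this example.

```python
import math

def gamma(u, c=1.0):
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

def transverse_mass(u, m0=1.0, c=1.0):
    """m = m0 * gamma."""
    return m0 * gamma(u, c)

def longitudinal_mass(u, m0=1.0, c=1.0):
    """m' = m0 * gamma**3."""
    return m0 * gamma(u, c) ** 3

def momentum(u, m0=1.0, c=1.0):
    """G = m u for motion along the tangent."""
    return transverse_mass(u, m0, c) * u

def total_force(u, u_dot, rho, m0=1.0, c=1.0):
    """(3.7.13): longitudinal term plus centripetal term."""
    return (longitudinal_mass(u, m0, c) * u_dot
            + transverse_mass(u, m0, c) * u**2 / rho)

u, step = 0.6, 1e-6
# Central-difference estimate of dG/du, to be compared with m'.
dG_du = (momentum(u + step) - momentum(u - step)) / (2 * step)
force = total_force(u, u_dot=2.0, rho=3.0)
```

At u = 0.6c the derivative agrees with $m' = m_0\gamma^3 \approx 1.953\,m_0$ to within the accuracy of the finite difference.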
3.7.4 e/m measurements of the transverse mass
It is the purpose of this section to show that the early experiments to measure the ratio e/m were specifically designed to measure the transverse component, that is, the centripetal acceleration term in the force law, (3.7.13). There were two general methods by which the ratio e/m could be measured: the so-called deflection methods, where one observes the bending of an electron beam in electric and magnetic fields, and the spectroscopic method, which measures frequencies in the spectral lines of a radiating atom which depend, among other things, on the mass of the nucleus and the strength of an externally applied magnetic field. We will discuss only the former class of measurements.
3.7.4.1 Thomson’s method
Rather than aiming at a numerical precision of the charge-to-mass ratio, e/m, Thomson was interested in gaining information regarding the nature of the particles in beams that could be deflected by electric and magnetic fields. Thomson carried out his experiments in 1897 before it became the fad to discriminate among the different theories that attempted to account for an increase in inertia with speed. These will be discussed in Sec. 5.4.1. To Thomson’s surprise, he found that a great number of different sources possessed the same type of particle, which was baptized the ‘electron’ by G. J. Stoney, who was FitzGerald’s uncle, in 1891. Although Thomson only made a sure identification of the electron in 1897, from his work on cathode rays, the term ‘electron’ was well amalgamated into the literature by then, having been adopted by Lorentz in 1892, and by Larmor in 1894.

Thomson’s apparatus, shown in Fig. 3.6, has electrons produced at the cathode C passing through slits A and B which then hit a fluorescent screen. Between plates D and E an electric field could be applied, and through a current flowing in two external coils, a magnetic field could be produced that would be perpendicular to the electric field. The coils were arranged so that the particles would experience both electric and magnetic fields simultaneously. The fields were oriented in such a way that, when one field acts alone, the beam would be deflected either upward or downward.
Fig. 3.6. Thomson’s apparatus for determining the ratio e/m for cathode rays.

The fields were so adjusted that the net deflection on the electrons was zero. In this way, Thomson’s pet annoyance with the lack of action and reaction in the Lorentz force could be avoided for there would result
$$\frac{u}{c} = \frac{E}{H}. \tag{3.7.15}$$
The beam of electrons was then deflected by the magnetic field in which mechanical equilibrium is achieved by balancing it with the centripetal force,
$$eH\frac{u}{c} = \frac{mu^2}{R},$$
where we will determine R subsequently [cf. following (3.7.24) below]. No matter what the dependency the transverse mass has upon speed, it clearly selects out the centripetal component of the force in (3.7.13). Eliminating the velocity between these two equations results in
$$\frac{e}{m} = \frac{c^2E}{H^2R}. \tag{3.7.16}$$
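The elimination leading to (3.7.16) can be illustrated with a small round-trip calculation in Gaussian units. The numbers below are synthetic, chosen only to exercise the formulas and not taken from Thomson's data; the function name `thomson_crossed_fields` is likewise hypothetical.

```python
def thomson_crossed_fields(E, H, R, c=2.99792458e10):
    """Return (u, e_over_m) from u/c = E/H and e H u / c = m u^2 / R."""
    u = c * E / H                       # balance condition (3.7.15)
    e_over_m = c**2 * E / (H**2 * R)    # result (3.7.16)
    return u, e_over_m

# Synthetic round trip: pick a ratio and a speed, derive the E and R
# an experimenter would then observe, and recover the inputs.
c = 2.99792458e10          # cm/s
true_ratio = 5.0e17        # illustrative e/m in esu/g
u_true = 3.0e9             # cm/s
H = 10.0                   # gauss
E = u_true * H / c                       # field that gives zero net deflection
R = u_true * c / (true_ratio * H)        # radius from e H u / c = m u^2 / R
u_rec, ratio_rec = thomson_crossed_fields(E, H, R)
```

Only E, H and R enter the final expression; the speed is fixed by the balance condition and never has to be measured separately.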
Since all the quantities on the right-hand side of (3.7.16) could be measured experimentally, the ratio of e/m could therefore be determined. Thomson later modified his apparatus so that after the cathode rays passed through the slit they entered a ‘Faraday chamber.’ A Faraday chamber is an insulated conductor which the electrons in the beam hit after passing through a small aperture in the chamber. If there are N electrons in the chamber, the total charge accumulated will be Q = N e. A beam with the same number of electrons was then aimed at a small thermocouple of known heat capacity. The energy, W , transferred to the
thermocouple could be measured by a rise in its temperature. This energy must be entirely kinetic so that
$$W = \tfrac{1}{2}Nmu^2.$$
The electron beam was then bent by a magnetic field, again selecting the centripetal force in (3.7.13), so that
$$Heu = \frac{mu^2}{\rho}.$$
Eliminating the velocities between these three equations results in
$$\frac{e}{m} = \frac{2W}{H^2\rho^2Q}.$$
The conclusion that “since all quantities on the right can be measured, the ratio e/m can be obtained” [Stranathan 42] is now inaccurate. The first objection is that Q depends on e so that the ratio is $e^2/m$, and the second concerns the validity of W as the correct expression for the kinetic energy. In fact, Thomson measured the velocity of the cathode rays to be one-tenth that of light, where relativistic effects cannot be neglected. But what Thomson did succeed in determining is that the ratio e/m was independent of the gas used, or the composition of the metal electrodes.
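The algebra of this second elimination can be traced with made-up numbers. The sketch below only reproduces the three relations Q = Ne, W = ½Nmu² and Heu = mu²/ρ as written in the text (in arbitrary consistent units) and takes no position on the objections raised against the method; all names and values are hypothetical.

```python
def thomson_calorimetric(W, H, rho, Q):
    """e/m from the collected charge Q, deposited energy W, field H and radius rho."""
    return 2.0 * W / (H**2 * rho**2 * Q)

# Illustrative inputs in arbitrary units.
e, m, N, u, H = 2.0, 0.5, 10, 3.0, 4.0
Q = N * e                    # total charge in the Faraday chamber
W = 0.5 * N * m * u**2       # kinetic energy delivered to the thermocouple
rho = m * u / (H * e)        # radius from H e u = m u^2 / rho
ratio = thomson_calorimetric(W, H, rho, Q)
```

The recovered value equals e/m = 4 for these inputs, which confirms that the three equations combine as stated.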
3.7.4.2 Kaufmann’s method
Kaufmann’s experiments show that the real constant mass of the electron is negligible compared with the apparent [electrodynamic] mass; it can be considered as zero, so that if it is mass which constitutes matter we can almost say that matter no longer exists. . . There are merely holes in the aether. Poincaré
At the turn of the century, Kaufmann came onto the scene with a new method of determining the ratio. Kaufmann found β-rays expelled from radioactive substances to be faster and better suited than cathode rays. The electric and magnetic fields were now parallel so that the electric and magnetic deflections of the electrons were perpendicular to one another. The deviation of the electron from its line of flight, on account of its deflection in an arc of a circle, is inversely proportional to the radius of the circle, provided the deflection is small. Again, Kaufmann selects the centripetal force in (3.7.13), and arrives at similar equations to those of Thomson, except the radii of the circular
arcs are different, i.e.
$$eE = \frac{mu^2}{\rho_E}, \qquad \frac{eH}{c} = \frac{mu}{\rho_H}.$$
These radii can be transformed to Cartesian coordinates, $x = k/\rho_H$ and $y = k/\rho_E$, where k depends upon the specific nature of the geometry. Eliminating the velocity between the two, and replacing the radii by the Cartesian coordinates, Kaufmann obtains the equation of a parabola,
$$y = \frac{c^2E}{kH^2}\,\frac{m}{e}\,x^2. \tag{3.7.17}$$
The ratio,
$$\frac{y}{x} = \frac{c}{u}\,\frac{E}{H}, \tag{3.7.18}$$
is, therefore, a measure of the deviation of the Lorentz force from zero [cf. Eq. (3.7.15) above]. The position of the origin is that of the undeflected beam about to enter the fields. The full parabola, for negative values of x, can be obtained by reversing the direction of the magnetic field, as shown in Fig. 3.7. Electrons of different speeds fall at different points along the parabola so that at any point Kaufmann obtained a specific value of e/m at a given velocity. Kaufmann was the first to obtain results that correlated the ratio e/m with definite electron velocities. Kaufmann found that e/m varied slightly with velocity, decreasing with increasing velocities.
Fig. 3.7. The points on the parabola refer to electrons deflected by parallel and anti-parallel (left side) fields.
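A short numerical sketch shows how electrons of different speeds populate the parabola (3.7.17) when the mass is held constant. It assumes that x is the magnetic deflection $k/\rho_H$ and y the electric deflection $k/\rho_E$, with the magnetic balance written in the Gaussian form $eHu/c = mu^2/\rho_H$; this is the assignment that reproduces both (3.7.17) and (3.7.18). Function names and field values are illustrative.

```python
def kaufmann_point(u, E, H, e_over_m, k=1.0, c=1.0):
    """Deflections (x, y) of an electron of speed u in parallel fields E and H."""
    inv_rho_E = e_over_m * E / u**2        # from e E = m u^2 / rho_E
    inv_rho_H = e_over_m * H / (u * c)     # from e H u / c = m u^2 / rho_H
    return k * inv_rho_H, k * inv_rho_E

def parabola(x, E, H, e_over_m, k=1.0, c=1.0):
    """Right-hand side of (3.7.17) for a speed-independent mass."""
    return c**2 * E / (k * H**2 * e_over_m) * x**2

E, H, e_over_m = 2.0, 5.0, 3.0             # arbitrary illustrative units, c = 1
points = [(u,) + kaufmann_point(u, E, H, e_over_m) for u in (0.3, 0.5, 0.9)]
residuals = [y - parabola(x, E, H, e_over_m) for (_, x, y) in points]
ratios = [(y / x, E / (u * H)) for (u, x, y) in points]   # both sides of (3.7.18), c = 1
```

Slower electrons land farther out along the curve, and every point lies on the same parabola only because e/m was taken to be the same for all speeds.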
Ritz [08] queried the invariancy of Kaufmann’s parabola, (3.7.17), for it will only have that form if m is independent of the speed. If the mass did increase with velocity, say, at some power greater than one, the parabola would tend to a straight line as that power increases. Instead of (3.7.18) Ritz triedᵇ
$$\frac{y}{x} = \frac{\sqrt{c^2 - u^2}}{u}\,\frac{E}{H}, \tag{3.7.19}$$
giving rise to a parabola,
$$y = \frac{mc^2E}{ekH^2}\,x^2 + k\,\frac{eE}{mc^2}, \tag{3.7.20}$$
shifted upward at the origin. The mass is now a constant, independent of the parameter u. The differences between observed and calculated values of y could, according to Ritz, come well within the limits of experimental errors. The surprising aspect occurs at the origin where x vanishes for u = c giving a zero magnetic deviation, while the electric deviation is proportional to the ratio of the electric to rest mass energies [cf. Fig. 3.7].

Ritz concluded that there is a large leeway for hypotheses, and Kaufmann’s experiments can be interpreted equally well by keeping the mass constant and modifying the Lorentz force. Moreover, all radiation effects of the accelerated electrons have been neglected. The latter conclusion probably was attractive to Ritz on account of his modification of Ampère’s law to contain more terms in his Taylor expansion as the velocity of the electron increases, and leaving invariant the charges in that expression. The transverse mass enters only when we establish a mechanical equilibrium between the magnetic part of Lorentz’s force and the centripetal force acting on the electron. Ritz, therefore, was of the opinion that there were no general laws at great speeds, but the phenomena occurring at these speeds could be accounted for by an appropriate series expansion of his law of force in powers of 1/c. Ritz was also correct in observing that it is only the transverse mass that “comes into play.”

ᵇ Actually, Ritz wrote 2c for c, but that does not change anything “in a theory which considers only relative velocities.”
Deflection experiments were designed to measure the transverse mass, where the acceleration is not the longitudinal acceleration, but, rather, the centripetal acceleration which occurs in the osculating plane perpendicular to $\mathbf{m}$. The longitudinal force component is in the direction of $\mathbf{m}'$. These conclusions hold for whatever model one may choose for the vectorial masses, $\mathbf{m}'$ and $\mathbf{m}$. Let us now turn to a microscopic interpretation of the deflection methods to measure the ratio, e/m.
3.7.4.3 Microscopic interpretation of the deflection methods
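Let us resolve the Lorentz force,
$$\mathbf{F} = e\left(\mathbf{E} + \frac{\mathbf{u}}{c}\times\mathbf{H}\right), \tag{3.7.21}$$
into tangent, principal normal drawn to the center of curvature, and binormal erected to form a right-handed system of moving axes (ξ, η, ζ). Accomplishing this we get
$$F_\xi = eE_\xi, \qquad F_\eta = e\left(E_\eta - \frac{u}{c}H_\zeta\right), \qquad F_\zeta = e\left(E_\zeta + \frac{u}{c}H_\eta\right), \tag{3.7.22}$$
which we equate to the mechanical force components, (3.7.11), acting on the electron. We then obtain
$$\begin{aligned}
eE_\xi &= m'_\xi\dot{u} - m_\eta\frac{u^2}{\rho}, & H_\xi &= 0,\\
eE_\eta &= m'_\eta\dot{u} + m_\xi\frac{u^2}{\rho}, & \frac{e}{c}H_\eta &= m_\eta\frac{u}{\tau},\\
eE_\zeta &= m'_\zeta\dot{u}, & \frac{e}{c}H_\zeta &= m_\zeta\frac{u}{\tau}.
\end{aligned} \tag{3.7.23}$$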
We first note that the longitudinal components of the force do not enter into the mechanical conditions of equilibrium. Second, we observe that the magnetic field components stand in the same ratio as
their masses,
$$\frac{H_\eta}{H_\zeta} = \frac{m_\eta}{m_\zeta}.$$
If we consider the motion along the tangential, ξ-axis, Thomson’s arrangement has the electric and magnetic fields normal to this axis. From (3.7.22) it follows that the electric field is pointing in the η-direction, and the magnetic field in the ζ-direction. Eliminating the speed parameter between the two leads to
$$\frac{e}{m_\zeta^2/m_\xi} = \frac{c^2E_\eta}{H_\zeta^2}\,\frac{\rho}{\tau^2}. \tag{3.7.24}$$
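The elimination behind (3.7.24) can be spelled out in two lines. The sketch below assumes steady motion along the tangent, so that the terms in $\dot{u}$ drop out of the η-component equation and of the $H_\zeta$ relation in (3.7.23); that assumption is not stated explicitly in the text and is made here only to reproduce the quoted result.

```latex
% Assumption: steady speed (\dot u = 0), E along \eta, H along \zeta.
\begin{align*}
eE_\eta = m_\xi\,\frac{u^2}{\rho},
\qquad
\frac{e}{c}\,H_\zeta = m_\zeta\,\frac{u}{\tau}
\quad&\Longrightarrow\quad
u = \frac{e H_\zeta\,\tau}{c\,m_\zeta},\\[4pt]
eE_\eta = \frac{m_\xi}{\rho}\,\frac{e^2 H_\zeta^2\,\tau^2}{c^2\,m_\zeta^2}
\quad&\Longrightarrow\quad
\frac{e}{m_\zeta^2/m_\xi} = \frac{c^2 E_\eta}{H_\zeta^2}\,\frac{\rho}{\tau^2}.
\end{align*}
```

Setting the right-hand side against Thomson's $c^2E/H^2R$ then identifies the effective radius as $R = \tau^2/\rho$.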
Comparing Thomson’s result, (3.7.16), with (3.7.24) we conclude that the transverse mass components, $m_\xi$ and $m_\zeta$, should be equal, and $R = \tau^2/\rho$. The distinction of these two radii of curvature will become evident in Kaufmann’s set-up. In Kaufmann’s apparatus the electric and magnetic fields are parallel to one another. If the motion is in the ξ-direction, the only possibility is to eliminate the speed parameter between $E_\eta$ and $H_\eta$. We then get
$$\frac{1}{\rho} = \frac{c^2E_\eta}{H_\eta^2}\,\frac{m_\eta^2/m_\xi}{e}\,\frac{1}{\tau^2}. \tag{3.7.25}$$
In comparison with (3.7.17), we identify the x and y deflections with the torsion, 1/τ, and curvature, 1/ρ, respectively. Moreover, the two transverse mass components, mη and mξ must be equal. This implies that the transverse mass vector, m, cannot be in the direction of the motion for, otherwise, mξ = m and mη = mζ = 0. Ritz’s formula, (3.7.20), would have a non-vanishing value of the curvature at zero torsion, which is entirely reasonable for a symmetrical electron. In some ways, Ritz’s approach is similar to Boltzmann’s view of statistical mechanics where elegance should be left to tailors and cobblers. Granted its lack of uniqueness in that there are arbitrary constants that have to be chosen to get the right numerical results, Ritz’s explanation of the gravitational phenomenon of the advance of the perihelion of Mercury was extremely successful, as well as other gravitational phenomena falling outside the domain of Newtonian mechanics. We now turn to a field versus force approach to relativistic gravitational phenomena.
3.8 Modeling Gravitation

3.8.1 Maxwellian gravitation
The analogy between the inverse square laws of Newtonian and Coulomb forces led to many attempts to describe gravity in electromagnetic terms. The big problem, and the adjective ‘big’ cannot be overstated, was the localization of gravitational energy. For when one brings two like charges together, energy is necessary to overcome repulsion, and it is this energy that “comes from the field,” which implies that space has a positive field energy. However, masses always attract one another so that it takes energy to keep them apart. This would mean that the surrounding field had negative, instead of positive, energy. At first sight one would think that the same would apply to unlike charges which attract one another. Given a universe which is electrically neutral, these two charges would have to be separated from two other charges, and when all the electromagnetic interactions are accounted for, a negative electromagnetic field energy never arises.

The negative gravitational field energy so vexed Maxwell that he abandoned any hope of constructing a gravitational theory that would mimic his electromagnetic theory, and claimed that such a theory was beyond nineteenth century physics. In his own words, Maxwell admits “Since it is impossible for me to understand how a medium could possess such properties, I cannot pursue research, in this vein, into the cause of gravitation.” Specifically what Maxwell was referring to was this: Call F the gravitational force. In a static regime, Maxwell’s equations reduce to
$$\nabla\cdot\mathbf{F} = -\rho, \qquad \nabla\times\mathbf{F} = 0,$$
where the negative sign denotes convergence [-divergence], and ρ is the mass density. Now, the gravitational energy will be
$$W = -\int F^2\,dV + \text{const},$$
where the negative sign denotes attraction. Maxwell reasoned that since the energy is essentially positive, the integration constant must be so large that
W is positive for whatever value the force F can assume. The energy will be a maximum when the force vanishes. For elastic systems, however, the energy is a minimum when the deformations vanish. Hence, the gravitating system will always be in a state of unstable equilibrium, and, therefore, gravity does not fall under the jurisdiction of field equations.

But this did not deter Heaviside, and, in 1893, he had a go at constructing a gravitational theory by modifying Maxwell’s equations of the electromagnetic field. And like Maxwell, Heaviside based his analogy on the presence of an aether, invoking Newton’s authority by referring to a letter written in 1693 to Richard Bentley, then chaplain to the Bishop of Worcester, in which Newton writes:

That gravity should be innate, inherent and essential to Matter, so that one body may act upon another at a Distance thro’ a Vacuum, without the Mediation of anything else, by and through which their Action and Force may be conveyed from one to another, is to me so great an Absurdity, that I believe no Man who has in philosophical Matters a competent Faculty of thinking, can ever fall into it. Gravity must be caused by an Agent acting constantly according to certain Laws; but whether this Agent be material or immaterial, I have left to the consideration of my Readers.
Nothing had changed in the two hundred years since Newton wrote those words, and Heaviside [94] concluded that “It is incredible now as it was in Newton’s time that gravitative influence can be exerted without a medium; and, granting a medium, we may as well consider that it propagates in time, although immensely fast.” Heaviside first states the analogy between electricity and gravitation: both are inverse square laws so that the localization of electromagnetic energy should be the same as gravitational energy. However, there is one big difference: bringing two like charges together requires energy, and it is this energy that ‘goes into the field.’ In contrast, two masses attract one another so that energy is needed to keep them apart. Thus, the field energy density is negative. Negative energy densities are worrisome today, but were even more so to nineteenth century physicists, and their abhorrence, as we have stated, led Maxwell to drop any attempt at grappling with gravitation along the same lines as his electromagnetic theory, but not his protégé, Heaviside! Charge neutrality requires the two charges to be separated from two other opposite charges so there will be a positive energy field density
over all. Not so with mass, for if all the mass in the universe were positive, we would expect to find a negative energy field density. Negative mass must still be relegated to science fiction, but should it exist, the analogy with electromagnetism would be one step nearer.

Heaviside reasoned that if e is the intensity of the gravitational force and when “matter ρ enters any region through its boundary, there is a simultaneous convergence of the gravitational force into that region proportional to ρ.” In other words, the difference between the two currents, $\rho\mathbf{u} - \dot{\mathbf{e}}/G$, must be divergence-free, where $\rho\mathbf{u}$ is the flux of matter. The gravitational constant G is seen to play the role of the inverse permittivity, showing that the electric field has more to do with gravitation than the permeability of the magnetic field that is necessary to close the circuit. This can only be the case if the difference is proportional to the curl of a vector, say h,
$$\nabla\times\mathbf{h} = \rho\mathbf{u} - \dot{\mathbf{e}}/G. \tag{3.8.1}$$
The divergence of (3.8.1) vanishes, and this must be equivalent to the continuity equation. It will be if the intensity of the force satisfies a Gaussian law,
$$\nabla\cdot\mathbf{e} = -G\rho, \tag{3.8.2}$$
so that the divergence of (3.8.1) results in the continuity equation,
$$\nabla\cdot\mathbf{J} + \dot{\rho} = 0, \tag{3.8.3}$$
where $\mathbf{J} = \rho\mathbf{u}$ is the flux. According to Heaviside, if there is instantaneous action, $\nabla\times\mathbf{e} = 0$, because “the gravitational force is exactly dependent on the configuration of matter,” meaning that it is given by the gradient of the Newton potential. Nothing moves, and all interactions occur by the hideous action at a distance. Rather, if e is propagated at a finite speed, v, then e must satisfy the wave equation,
$$v^2\nabla^2\mathbf{e} = \ddot{\mathbf{e}}. \tag{3.8.4}$$
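Then, since
$$\nabla^2 = \nabla(\nabla\cdot\;) - \nabla\times(\nabla\times\;), \tag{3.8.5}$$
it follows that
$$-v^2\nabla\times(\nabla\times\mathbf{e}) = \ddot{\mathbf{e}},$$
in space free of matter. But, we also have by (3.8.1) that
$$\nabla\times\mathbf{h} = -\dot{\mathbf{e}}/G, \tag{3.8.6}$$
in the absence of matter. Differentiating (3.8.6) with respect to time gives
$$\ddot{\mathbf{e}} = -G\,\nabla\times\dot{\mathbf{h}},$$
and by (3.8.4) this becomes
$$(v^2/G)\,\nabla\times\mathbf{e} = \dot{\mathbf{h}}. \tag{3.8.7}$$
Then calling $\mu = G/v^2$, which is analogous to the magnetic permeability, (3.8.7) can be written as
$$\nabla\times\mathbf{e} = \mu\dot{\mathbf{h}}. \tag{3.8.8}$$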
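A transverse plane wave offers a simple consistency check of the pair (3.8.6) and (3.8.8). The sketch below takes $e_y = f(x - vt)$ and $h_z = -(v/G)f(x - vt)$ with $\mu = G/v^2$ and verifies both circuital relations by central differences. The waveform, the amplitude relation and the values of G, v and k are assumptions chosen for the illustration, not quantities given in the text.

```python
import math

G, v = 2.0, 3.0            # illustrative values, arbitrary units
mu = G / v**2              # gravitational analogue of the permeability
k = 1.3                    # wavenumber of the test wave

def f(s):
    return math.sin(k * s)

def e_y(x, t):
    """Transverse component of e for a wave travelling along +x."""
    return f(x - v * t)

def h_z(x, t):
    """Companion component of h; the amplitude -v/G is the assumed relation."""
    return -(v / G) * f(x - v * t)

def d(func, x, t, wrt, step=1e-5):
    """Central-difference partial derivative with respect to 'x' or 't'."""
    if wrt == "x":
        return (func(x + step, t) - func(x - step, t)) / (2 * step)
    return (func(x, t + step) - func(x, t - step)) / (2 * step)

x0, t0 = 0.4, 0.7
# (3.8.8), z-component: (curl e)_z = d e_y/dx should equal mu * d h_z/dt.
lhs_388 = d(e_y, x0, t0, "x")
rhs_388 = mu * d(h_z, x0, t0, "t")
# (3.8.6), y-component: (curl h)_y = -d h_z/dx should equal -(1/G) d e_y/dt.
lhs_386 = -d(h_z, x0, t0, "x")
rhs_386 = -d(e_y, x0, t0, "t") / G
```

Both relations hold only when the permeability is $G/v^2$, which is the numerical counterpart of the statement that the second circuital law follows from a finite speed of propagation.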
Heaviside concludes that the second circuital law (3.8.8) is a consequence of a finite speed of propagation, which could have been inferred straight-off from the analogy with electromagnetism. The subsidiary condition, (3.8.3), and
$$\dot{\mathbf{J}} = -v^2\nabla\rho, \tag{3.8.9}$$
imply that the mass density, ρ, propagates at speed v, since ρ too satisfies a wave equation, $\ddot{\rho} = v^2\nabla^2\rho$. If we introduce b, the induction field, according to $\mathbf{b} = -\mu\mathbf{h}$, the circuital laws (3.8.1) and (3.8.8) become:
$$v^2\,\nabla\times\mathbf{b} = \dot{\mathbf{e}} - G\mathbf{J}, \qquad \nabla\times\mathbf{e} = -\dot{\mathbf{b}}. \tag{3.8.10}$$
The Maxwell gravitational equations (3.8.10) are equivalent to the wave equation, (3.8.4). Now, the point is this: The wave equation (3.8.4) is valid even if b = 0, since
$$\ddot{\mathbf{e}} = v^2\nabla(\nabla\cdot\mathbf{e}) = v^2\nabla^2\mathbf{e}, \tag{3.8.11}$$
on account of the fact that $\nabla\times\mathbf{e} = 0$. This means that e is the gradient of some scalar potential, so that Heaviside was wrong to conclude that the vanishing curl of e means ‘instantaneous action.’ Since polar e is propagated at speed v, without magnetic forces, the electric waves are longitudinal. Moreover, barring rigidity, a generalized displacement g can be written as
$$\rho\ddot{\mathbf{g}} = \lambda\nabla(\nabla\cdot\mathbf{g}) - \nu\nabla\times(\nabla\times\mathbf{g}),$$
where λ and ν are the elastic constants related to compression and rotation. With either constant equal to zero we still obtain a wave equation. Heaviside identifies the velocity $\dot{\mathbf{g}}$ with e. In the case ν = 0, the vibrations are longitudinal and propagate at a velocity $v = \sqrt{\lambda/\rho}$, whereas if λ = 0, the vibrations are transversal and propagate at a speed $v = \sqrt{\nu/\rho}$. In the latter case, we need two circuital laws with a medium ‘waving’ in directions normal to the direction of wave propagation. Whereas, in the former case of longitudinal propagation, all that is necessary is the ‘back-and-forth’ compression and rarefaction motions of the medium. Finally, in the case λ = ν, the vector identity (3.8.5) shows that a wave equation will result with both longitudinal and transverse wave motion. However, polarization occurs only with transversal waves as Young and Fresnel showed back in 1817, but what is polarized?

Nothing is new under the sun, and Carstoiu [69] rediscovered Heaviside’s gravitational equations over three-quarters of a century later. In his nomenclature, h is the gravitational ‘vortex,’ J the gravitational current, and e the (vector) gravitational field. If Carstoiu’s gravitational vortex, h, has any role, it must be related to the curl of the velocity field. The reason why Heaviside introduced a gravitational vortex stemmed from the fact that the difference between the conduction current ρu and the
displacement current εe should be divergenceless, i.e. the divergence of their difference should give the continuity equation (3.8.3) when Gauss’s law is introduced. Without an h field, there is no reason why Einstein should have put electromagnetism and gravitation on the same footing, and propagating with a common speed, c. Einstein does not make any distinction between the velocity of propagation of electromagnetic and gravity waves, although the latter have yet to be observed. Thus, the Einstein equations should be reduced to the circuital equations of Maxwell, indicating that they are transverse waves and polarizable. It is well-known that Einstein’s equations,

$$R_{\alpha\beta} - \frac{1}{2}\,g_{\alpha\beta}R = -\frac{8\pi G}{c^4}\,T_{\alpha\beta},$$

where $R_{\alpha\beta}$ is the Ricci tensor, $g_{\alpha\beta}$ are the components of the metric tensor, R is the scalar curvature, and $T_{\alpha\beta}$ is the energy–momentum tensor, can be linearized to read

$$c^2\nabla^2\varphi_{\alpha\beta} - \frac{\partial^2\varphi_{\alpha\beta}}{\partial t^2} = -16\pi G\,T_{\alpha\beta}. \tag{3.8.12}$$
Equations (3.8.12) predict that the velocity of propagation of the gravitational potential $\varphi_{\alpha\beta}$ is the same as that of light. To zeroth-order in v/c, $T_{00} = \rho c^2$, and to first-order in v/c, $T_{\alpha 0} = -\rho c^2(v_\alpha/c)$, where $v_\alpha$ is the particle velocity, and v that of the ‘aether.’ᶜ Thus, $\varphi_{00}$ corresponds to the scalar, and $\varphi_{\alpha 0}$ to the vector, potentials of electromagnetism and can be expressed in terms of their respective density, ρ, and current, $j_\alpha = \rho v_\alpha$, as

$$\frac{c^2}{4}\,\varphi_{00}(P,t) = \frac{1}{4\pi\epsilon}\int_V \frac{\rho(P',t')}{r}\,dV,$$

and

$$c\,\varphi_{\alpha 0}(P,t) = -\frac{\mu}{4\pi}\int_V \frac{j_\alpha(P',t')}{r}\,dV,$$
ᶜ For a criticism of the energy–momentum tensor see Sec. 6.7.1.
where r is the distance between the point P and the location P′ of the volume element, dV, and t′ = t − r/c, meaning that the potentials are retarded. In analogy with electromagnetism, the permittivity is defined as ε = 1/4πG, and the gravitational permeability is µ = 4πG/c² [Forward 61], such that their product is the inverse of the square of the velocity of light, c.

But, this is much more committal than Heaviside ever wanted to be. By multiplying (3.8.8) through by h, and subtracting e times (3.8.1) he obtains what he refers to as the ‘equation of activity,’

$$-\nabla\cdot(\mathbf e\times\mathbf h) = -\frac{1}{2}\frac{\partial}{\partial t}\left(\mu h^2 + \frac{1}{G}\,e^2\right) + \mathbf F\cdot\mathbf u, \tag{3.8.13}$$
which we would refer to today as a power equation. The first two terms are the rates of decrease of the rotational and kinetic energies, while the last term represents the power of an impressed field where F = ρe is the force whose intensity is e. The vector e × h (“found by Poynting and myself” [Heaviside]) describes the flux of gravitational energy, just as it describes the flux of electromagnetic energy.

However, Heaviside quickly realizes that the gravitational flux of energy is pointing in the wrong direction! Or, at least, it is pointing in the direction opposite to that in electromagnetism. For draw a sphere around a charged particle whose axis of spin coincides with the direction of motion. The positive pole is placed in the forward direction and the negative pole in the back. The magnetic intensity rotates with the lines of latitude in the direction of rotation, while the electric intensity points radially outward. Then, the flux of electromagnetic energy corresponds to lines of longitude from the negative to the positive pole. The only change in the gravitational case is the direction of the electric intensity; it points radially inward! Thus, there is a reversal of direction of the gravitational flux, according to Heaviside, given that “all matter being alike and attractive.” If nothing is being ‘radiated’ away, why then should there be a depletion of energy in (3.8.13)?

Carstoiu’s gravitational vortex, which is supposedly analogous to the magnetic field, is parallel to the gravitational force, which is supposed to play the role of the electric field. This is based on the analogy that there is a magnetic charge, Qₘ, which is the source of a scalar potential in the same way that ordinary charge is the source of the scalar potential. If Qₘ
existed, there would be no need to introduce a vector potential whose curl is the magnetic force. The two fields differ by a mere 1/c. This fact destroys his circuital equations, and his gravitational vortex is not divergence free! Since the gravitational and vortex potentials are parallel, Poynting’s vector vanishes, and so too, the energy flow.

Moreover, u does not give rise to a source term in the energy balance equation (3.8.13). It can either be solenoidal, or what Heaviside calls ‘circuital,’ or it is irrotational, or ‘polar’ in Heaviside’s terminology. In the former case it adds a contribution to the rate of increase in stored energy, while in the latter case it adds contributions to both the energy flux and energy rates of change.

Consider, first, the circuital case. Then we can set u = a, where a plays the role of the vector potential intensity that satisfies e + ȧ/v = 0. Inserting these two terms into the last term in (3.8.13) gives the additional contribution of ½(ρ/v) da²/dt to the rate of storage of energy. Now take u to be polar, i.e. u = −∇(∇ · a). Introducing this into (3.8.13) and noting that

$$-\frac{\rho}{v}\,\dot{\mathbf a}\cdot\nabla(\nabla\cdot\mathbf a) = -\frac{\rho}{v}\,\nabla\cdot\left(\dot{\mathbf a}\,\nabla\cdot\mathbf a\right) + \frac{1}{2}\frac{\partial}{\partial t}\,\frac{\rho}{v}\,(\nabla\cdot\mathbf a)^2,$$

shows that the negative of the first term on the right-side contributes to the flux of energy, while the second term contributes to the rate of change of stored energy. In other words, −∇ · a plays the role of a hydrostatic pressure, and the term which contributes to the energy flux is analogous to a term which would contribute to the energy flux of sound waves. Rather, the quadratic term in the energy, which is proportional to (∇ · a)², is proportional to the energy of compression. The two cases correspond to transverse and longitudinal waves. The former can give rise to polarization, the latter not. So, Heaviside made no commitments to whether the waves he was talking about were transverse or longitudinal, or a combination of both.
In addition, there is no statement that the gravitational waves travel at the speed of light, just that they have a finite speed of propagation, v.
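The contrast with the gravitational permittivity and permeability quoted earlier from [Forward 61] can be made concrete: once ε = 1/4πG and µ = 4πG/c² are adopted, the propagation speed 1/√(εµ) is fixed at c, whereas Heaviside leaves v open. The following is a minimal Python sketch of that check. The numerical values of G and c are standard reference values assumed here, not taken from the text, and the function names are illustrative only.

```python
import math

# Assumed reference values (not given in the text).
G = 6.67430e-11        # Newtonian constant of gravitation, m^3 kg^-1 s^-2
C = 299_792_458.0      # speed of light, m/s


def gravitational_constants(big_g=G, c=C):
    """Return the gravitational analogues (epsilon, mu) = (1/(4 pi G), 4 pi G / c^2)."""
    eps_g = 1.0 / (4.0 * math.pi * big_g)
    mu_g = 4.0 * math.pi * big_g / c**2
    return eps_g, mu_g


def wave_speed(eps, mu):
    """Propagation speed 1/sqrt(eps * mu), as for the electromagnetic constants."""
    return 1.0 / math.sqrt(eps * mu)


eps_g, mu_g = gravitational_constants()
speed = wave_speed(eps_g, mu_g)
print(f"epsilon_g = {eps_g:.4e}, mu_g = {mu_g:.4e}, 1/sqrt(eps*mu) = {speed:.6e} m/s")
```

The product εµ is independent of G, so the speed comes out as c whatever value of G is used; the commitment to c is built into the definitions rather than derived.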
3.8.2 Ritzian gravitation
One of the most extraordinary physicists of the early twentieth century was a young Swiss scientist, Walther Ritz. O’Rahilly [38] sums up this career
most succinctly: And when, in spite of his acknowledged researches in spectroscopy and elasticity, the Swiss physicist, Walther Ritz, expressed heterodox views on electromagnetics in 1908, shortly before his death, his ideas were received with a chill of silence and have ever since been systematically boycotted. He was out of tune with the music, and out of step with the crowd.
In this section we are going to show that young Ritz had created a theory of gravity that not only explained the advance of the perihelia of planets, but also could have accounted for the deflection of light by a massive body and other tests put to Einstein’s general relativity. The only difference is that Ritz preceded Einstein by a good seven years!

Since the middle of the nineteenth century, the anomaly of the advances of the planets was a thorn in the side of the Newtonian theory of gravitation. As we have seen in the last section, the analogy between electricity and gravitation posed too much of an opportunity to be bypassed easily. In 1864 the German astronomer, Seegers, proposed to treat the advance of the planets as a gravitational force in the same way that Weber’s force law holds for electrical attraction between particles. The advance of the perihelion would thus be due to the motion of the other planets, and it would be necessary to take into account their relative velocities, scaled down by the inverse square of the velocity of light, and their accelerations. That is, they were entirely cognizant of the fact that they were looking for second-order effects. By applying Weber’s law, Scheibner found a secular variation of 6.73″, while Tisserand found the value of 6.28″ for the secular variation of Mercury in 1872. In 1890, Lévy noticed that a combination of the Weber, (4.1.5), and Riemann, (4.1.6), potentials could serve as a potential for gravitation.ᵈ Furthermore, Ritz observed that his expression for the force, (4.1.7), could be written as a linear combination of Weber’s (W) and Riemann’s (R) forces,

$$F_x = \frac{1}{2}(1-\lambda)F_{Wx} + \frac{1}{2}(1+\lambda)F_{Rx}, \tag{3.8.14}$$
ᵈ The reader is asked to accept the formulas on faith; they will be developed fully in the next chapter.
in the absence of accelerations, in the x-direction where λ is an undetermined parameter of mixing. Its value will be chosen so that theory corresponds to observation. Ritz’s force (3.8.14) can be derived from the usual Euler–Lagrange equations, (4.1.3), with

$$L = \frac{1}{2}(1-\lambda)L_W + \frac{1}{2}(1+\lambda)L_R, \tag{3.8.15}$$
as the Lagrangian.

Since gravity is a much weaker force than electricity, its effects will be much harder to determine. At least two cases were known where gravity acted in a way which did not comply with Newtonian mechanics: the deflection of light, and the excess rate of turning of the long axis of Mercury’s orbit. The former constituted a prediction made by Johann Söldner [04] in 1801, who calculated that a star viewed near the sun would be shifted by 0.85″. The latter was observed by the French astronomer, Le Verrier, who noticed that the major axis of the elliptical orbit was turning slightly faster than expected from perturbations exerted by the sun and by neighboring planets, as shown in Fig. 3.8. It was realized that a correction to the Newtonian gravitational force,

$$\frac{Mm}{r^2} + \frac{C}{r^4},$$

was needed to explain the excess rate of turning. The unknown parameter, C, had to be determined by the fact that the excess rate of turning amounted to some 41 arc seconds per century. This was its value at the beginning of the twentieth century; more recent data fix it at 43.1″ per century.
Fig. 3.8. Elliptical orbit of Mercury showing the excess rotation of the major axis.
The determination of C is usually accredited to Einstein’s general theory of relativity. The first time we hear of Einstein’s attempt to determine the excess turning of Mercury is in 1912 when he enlisted help from his friend Besso. He did not publish his result until the end of 1915 when he revised his general theory. Yet, in 1908, Ritz published his calculation of the advance of perihelion of Mercury, long before the world heard of Einstein’s general relativity. But, sadly, the world did not hear of Ritz’s achievement. In addition, we shall show that he had in his grasp all the tests that are usually attributed to the confirmation of Einstein’s general relativity, except for the prediction of the gravitational shift of spectral lines.

Unlike his earlier work in which he cites Laplace for predicting that the propagation of gravity is some 10⁷ times faster than the speed of light which would lead to first-order corrections in the relative velocity, Ritz now assumes that gravity propagates at the speed of light which would introduce only second-order corrections. Ritz’s starting point is his second-order force equation in the absence of acceleration terms. He sets the components of the force, $F_x$ and $F_y$, equal to $m\ddot x$ and $m\ddot y$, respectively. He replaces the product of the charges, ee′, by GMm, where M is the central mass and m the peripheral mass which cancels out, and obtains

$$\ddot x = -\frac{GMx}{r^3}\left[1 + \frac{(3-\lambda)}{4c^2}\left(\dot x^2 + \dot y^2\right) - \frac{3(1-\lambda)}{4}\frac{\dot r^2}{c^2}\right] + \frac{GM(1+\lambda)}{2c^2r^2}\,\dot x\,\dot r,$$

$$\ddot y = -\frac{GMy}{r^3}\left[1 + \frac{(3-\lambda)}{4c^2}\left(\dot x^2 + \dot y^2\right) - \frac{3(1-\lambda)}{4}\frac{\dot r^2}{c^2}\right] + \frac{GM(1+\lambda)}{2c^2r^2}\,\dot y\,\dot r,$$

where λ is an arbitrary constant. We have made the identifications $u_r = \dot r$, $u_x = \dot x$, $u^2 = \dot x^2 + \dot y^2$ in his force law, (4.1.7). Multiplying the first equation by y and subtracting x times the second equation from it, Ritz gets

$$\frac{d\ell_z}{dt} = \frac{GM(1+\lambda)}{2c^2r^2}\,\dot r\,\ell_z,$$

where $\ell_z = x\dot y - y\dot x = r^2\dot\phi$ is the z-component of the angular momentum (relative to unit mass) of the planet, and (r, ϕ) are polar coordinates.
Since the force equations contain a component parallel to the velocity, the areal velocity is not conserved. This is also the origin of the longitudinal component of the mass in special relativity. The lack of conservation of the angular momentum is also found in general relativity, as we shall see in Sec. 7.4. Integrating, Ritz gets

$$r^2\dot\phi = \ell_0\,e^{-GM(1+\lambda)/2c^2r} \approx \ell_0\left(1 - \frac{GM(1+\lambda)}{2c^2r}\right), \tag{3.8.16}$$
where he identifies the constant of integration, $\ell_0$, as the conserved angular momentum. Conservation of angular momentum implies $r^2\dot\phi = \ell_0 = \text{const}$, and not (3.8.16). The lack of conservation of momentum is the consequence of choosing a particular hyperbolic stereographic inner product, as we shall see in Secs. 7.4 and 9.6. In any case, the lack of conservation of momentum, (3.8.16), should have disturbed Ritz, but he remained silent.

Now what Ritz needs is the equation of the trajectory which he must subsequently solve. He removes the quantities

$$\frac{GM}{2c^2r}(1+\lambda)\,\ddot x + \frac{GMx}{2c^2r^2}(1-\lambda)\left(\frac{x}{r}\,\ddot x + \frac{y}{r}\,\ddot y\right),$$

from the first of his force equations, and

$$\frac{GM}{2c^2r}(1+\lambda)\,\ddot y + \frac{GMy}{2c^2r^2}(1-\lambda)\left(\frac{x}{r}\,\ddot x + \frac{y}{r}\,\ddot y\right),$$

from the second, and replaces them by the same expressions with the exception that the accelerations $\ddot x$ and $\ddot y$ are replaced by −GMx/r³ and −GMy/r³, respectively. He claims that in the final analysis these changes are equivalent to introducing terms of fourth-order, and “completely negligible.” Then multiplying the first of the modified equations by $\dot x$ and the second by $\dot y$, adding and integrating he gets his equation of energy conservation,

$$\frac{1}{2}u^2 - \frac{1}{2}(1+\lambda)\frac{GM}{r}\left(1 - \frac{u^2}{2c^2}\right) - \frac{1}{2}(1-\lambda)\frac{GM}{r}\left(1 - \frac{\dot r^2}{2c^2}\right) - \frac{G^2M^2}{2c^2r^2} = W = \text{const}. \tag{3.8.17}$$
Next Ritz transforms (3.8.17) to polar coordinates, where r² = x² + y² and $u^2 = \dot x^2 + \dot y^2 = \dot r^2 + r^2\dot\phi^2$. Employing (3.8.16) Ritz arrives at:

$$\frac{\ell_0^2}{r^4}\left(\frac{dr}{d\phi}\right)^2 = \left[1 + \frac{1}{4}(\lambda-1)\frac{\alpha}{r}\right] \times \left[2W - \frac{\ell_0^2}{r^2} + \frac{2GM}{r} + \frac{1}{2}(\lambda+1)W\frac{\alpha}{r} + \frac{1}{4}(\lambda+2)\left(\frac{\alpha c}{r}\right)^2\right], \tag{3.8.18}$$

where α := 2GM/c² is the so-called Schwarzschild radius. Schwarzschild would not come across it till 1916 so that this radius should bear the name of Ritz, and is another example of Stigler’s law of eponymy.
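To give a sense of scale for α = 2GM/c², the short Python sketch below evaluates it for the Sun and compares it with the size of Mercury’s orbit. The constants (heliocentric GM, speed of light, Mercury’s semi-major axis) are standard reference values assumed here; they do not appear in the text, and the function name is illustrative.

```python
# Assumed reference values (not taken from the text).
GM_SUN = 1.32712440018e20   # heliocentric gravitational constant, m^3 s^-2
C = 299_792_458.0           # speed of light, m/s
A_MERCURY = 5.7909e10       # semi-major axis of Mercury's orbit, m


def schwarzschild_radius(gm, c=C):
    """alpha = 2 G M / c^2, the length that multiplies the corrections in (3.8.18)."""
    return 2.0 * gm / c**2


alpha_sun = schwarzschild_radius(GM_SUN)
ratio = alpha_sun / A_MERCURY
print(f"alpha (Sun)         = {alpha_sun:.1f} m")
print(f"alpha / a (Mercury) = {ratio:.2e}")
```

The ratio α/r is of order 10⁻⁸ at Mercury’s distance, which is why the corrections in (3.8.18) can be treated as small perturbations of the Newtonian orbit.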
3.8.2.1 Mass from the gravitational field
Consider the energy conservation equation (3.8.17) in a state at rest,

$$W = -\frac{GM}{r}\left(1 + \frac{1}{4}\frac{\alpha}{r}\right). \tag{3.8.19}$$
As a result of the gravitational field of the sun, a planet orbiting about it will feel a greater attraction than that of the solar mass M alone. We may reason from the analogy with electrostatics [Sexl & Sexl 79] where the electric field is the gradient of e/4πεr, and the electrostatic energy is $W_e = \frac{1}{2}\epsilon E^2$. Heaviside tells us to replace the dielectric constant, ε, by 1/4πG, and substituting GM for the charge e gives the energy of the static gravitational field as $W_g = -GM^2/8\pi r^4$. The negative sign denotes attraction, whereas the plus sign in the electrostatic energy indicates that like charges repel. This is what Maxwell found so unattractive about the gravitational energy. The gravitational energy $W_g$ will contribute to the solar mass by an amount $W_g/c^2$ so that its total mass will be

$$M_{\text{tot}} = M - \frac{1}{c^2}\int W_g\,dV = M + \frac{4\pi}{8\pi c^2}\int_r^\infty \frac{GM^2}{r^4}\,r^2\,dr = M\left(1 + \frac{1}{4}\frac{\alpha}{r}\right).$$

Remarkable as it is, this is just what Ritz could have predicted from (3.8.19)!
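The integration behind the last equality can be written out step by step. This is only a sketch of the intermediate algebra, using the field energy density $W_g$ and α = 2GM/c² as defined in the text; the dummy variable r′ is introduced here for clarity and is not the author’s notation.

```latex
\begin{align*}
-\frac{1}{c^2}\int W_g\,dV
  &= \frac{1}{c^2}\int_r^\infty \frac{GM^2}{8\pi r'^4}\,4\pi r'^2\,dr'
   = \frac{GM^2}{2c^2}\int_r^\infty \frac{dr'}{r'^2} \\
  &= \frac{GM^2}{2c^2 r}
   = \frac{M}{4}\,\frac{2GM}{c^2}\,\frac{1}{r}
   = M\,\frac{\alpha}{4r}.
\end{align*}
```

Adding this to M reproduces the factor 1 + α/4r that appears in the rest energy (3.8.19).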
3.8.2.2 Advance of the perihelion
Now, this is what Ritz did do. Ritz took the positive square root of (3.8.18), and rearranged it to read

$$\phi - \phi_0 = \int \frac{\left[1 - \frac{1}{8}\alpha(\lambda-1)\rho\right]d\rho}{\sqrt{A + B\rho - C^2\rho^2}},$$

where ρ = 1/r,

$$A = \frac{2W}{\ell_0^2}, \qquad B = \frac{2GM + (\lambda+1)\alpha W/2}{\ell_0^2}, \qquad C^2 = 1 - \frac{\alpha^2c^2(\lambda+2)}{4\ell_0^2},$$

and $\phi_0$ is a constant of integration. Performing the integration Ritz found

$$\phi - \phi_0 = \left[1 + \left(\frac{\alpha c}{4\ell_0}\right)^2(\lambda+5)\right]\arcsin\frac{2C^2\rho - B}{\sqrt{B^2 + 4AC^2}} + \frac{\alpha(\lambda+1)}{8}\sqrt{-C^2\rho^2 + B\rho + A}.$$

Hence, the difference in ϕ between two successive perihelia is

$$2\pi\left[1 + \left(\frac{\alpha c}{4\ell_0}\right)^2(\lambda+5)\right].$$

This differs from 2π which is what we would get if there were no advance of the perihelion. The correction,

$$\Delta\phi = \frac{\pi}{8}\left(\frac{\alpha c}{\ell_0}\right)^2(\lambda+5), \tag{3.8.20}$$

is a very small quantity that will make the elliptic orbit turn in its plane. Introducing the relation,

$$\frac{GM}{\ell_0^2} = \frac{1}{l}, \tag{3.8.21}$$

known from the elementary theory of elliptical motion [Born 60], where l is the semi-latus rectum, a(1 − ε²), a and ε being the semi-major axis and eccentricity of the ellipse, respectively, into (3.8.20) results in

$$\Delta\phi = \frac{\pi\alpha(\lambda+5)}{4l}. \tag{3.8.22}$$
As we will see in Sec. 7.6.3 general relativity predicts a shift of Δϕ = 3πα/a. In order for (3.8.22) to produce such a shift, we must set λ ≈ 7. Now let us continue to see what Ritz missed.
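The size of the shift in (3.8.22) can be checked numerically for Mercury. The Python sketch below evaluates πα(λ + 5)/4l per orbit and converts it to arc seconds per century; with λ = 7 the factor (λ + 5)/4 equals 3. The orbital elements and constants are standard reference values assumed here, not taken from the text, and the function names are illustrative.

```python
import math

# Assumed reference values (not given in the text).
GM_SUN = 1.32712440018e20      # heliocentric gravitational constant, m^3 s^-2
C = 299_792_458.0              # speed of light, m/s
A_MERCURY = 5.7909e10          # semi-major axis, m
ECC_MERCURY = 0.2056           # orbital eccentricity
PERIOD_MERCURY_DAYS = 87.969   # orbital period, days
DAYS_PER_CENTURY = 36525.0
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def ritz_shift_per_orbit(lam, gm=GM_SUN, a=A_MERCURY, ecc=ECC_MERCURY, c=C):
    """Perihelion shift per orbit from (3.8.22): pi * alpha * (lam + 5) / (4 l), in radians."""
    alpha = 2.0 * gm / c**2                 # Schwarzschild radius
    semi_latus = a * (1.0 - ecc**2)         # l = a (1 - eps^2)
    return math.pi * alpha * (lam + 5.0) / (4.0 * semi_latus)


def per_century_arcsec(shift_rad, period_days=PERIOD_MERCURY_DAYS):
    """Accumulate a per-orbit shift over one Julian century, in arc seconds."""
    return shift_rad * (DAYS_PER_CENTURY / period_days) * ARCSEC_PER_RAD


shift_lambda7 = per_century_arcsec(ritz_shift_per_orbit(7.0))
print(f"lambda = 7: {shift_lambda7:.1f} arc seconds per century")
```

With these inputs the result is roughly 43″ per century, the modern figure quoted earlier. Since the shift is linear in λ + 5, any other target value fixes a different λ, which is the freedom the mixing parameter gives Ritz.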
3.8.2.3 Deflection of light
The estimate of the advance of the perihelion was the only prediction that Ritz could make. His equation for the trajectory of the motion, (3.8.18), contains a wealth of information that sadly Ritz’s short life did not allow him to uncover. If we retain only linear terms in G, the equation for the trajectory becomes

$$\frac{\ell_0^2}{r^4}\left(\frac{dr}{d\phi}\right)^2 = 2W + \frac{2GM}{r}\left(1 + \frac{\lambda W}{2c^2}\right) - \frac{\ell_0^2}{r^2} - \frac{(\lambda-1)}{4}\frac{\alpha\ell_0^2}{r^3}. \tag{3.8.23}$$

For λ = 0, (3.8.23) has a potential,

$$S = -\frac{GM}{r} + \frac{\ell_0^2}{2r^2} - \frac{\alpha\ell_0^2}{8r^3}, \tag{3.8.24}$$

which, apart from a trivial numerical factor in the last term, is often referred to as the Schwarzschild potential, because it is this potential which results from general relativity of a gravitating body that Schwarzschild solved [cf. Secs. 9.10.3 and 9.10.4]. Just as Coulomb’s law had to be modified in the presence of charges in motion, so too does Newton’s law of gravitation in the presence of orbiting bodies. It is the last term in (3.8.24) which is a coupling between Newton’s radial attraction and centrifugal repulsion that is responsible for the advance of the perihelion and the deflection of light.

In fact, the deflection of light is due entirely to the presence of the coupling term in (3.8.24) in the absence of the gravitational attraction. We therefore set λ = −2, and W = c² thereby obtaining the equation for the orbit of a photon about a massive body as:

$$\left(\frac{d\rho}{d\phi}\right)^2 = \Delta^{-2} - \rho^2 + \frac{3}{4}\alpha\rho^3. \tag{3.8.25}$$

We have again set r = 1/ρ, and $\Delta = \ell_0/\sqrt{2}\,c$ has the dimensions of a collision parameter. It is a function of the conserved angular momentum, $\ell_0$, and can take on any value we assign to it so it does not appear as a distance
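of closest approach. But, general relativity gets the same result. Curiously, the conserved angular momentum is divided by Weber’s constant. By a change of variable, $\sigma = (1 - \frac{3}{4}\alpha\rho)^{1/2}\,\Delta\rho$, we can write the positive square root of (3.8.25) as $d\rho/d\phi = \sqrt{1-\sigma^2}/\Delta$, which upon integration gives:

$$\phi = \int_0^\sigma \frac{1 + \frac{3}{4}\alpha\sigma/\Delta}{\sqrt{1-\sigma^2}}\,d\sigma = \arcsin\sigma - \frac{3}{4}\frac{\alpha}{\Delta}\sqrt{1-\sigma^2} + \frac{3}{4}\frac{\alpha}{\Delta}.$$

At maximum ρ, or the distance of closest approach, dρ/dϕ = 0, σ = 1. The angle at which this occurs is:

$$\phi_m = \frac{\pi}{2} + \frac{3}{4}\frac{\alpha}{\Delta}.$$

The total deflection will, therefore, be:

$$2\left(\phi_m - \frac{\pi}{2}\right) = \frac{3}{2}\frac{\alpha}{\Delta}. \tag{3.8.26}$$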
The value found from general relativity is 2α/Δ, where Δ is taken as the radius of the Sun. But is it the case? The only place where the ratio $\ell_0^2/GM$ appears is in the expression for the semi-latus rectum of an orbital ellipse [cf. (3.8.21)]. It has absolutely nothing to do with the problem of the deflection of a light ray by a massive body. The conserved angular momentum has nothing whatsoever to do with the radius of the Sun! Just because it has the right dimension does not mean that it has the right physical interpretation, and, yet, this is what general relativity asserts [Møller 52].ᵉ This was not the way Einstein originally derived his expression for the deflection, as we shall now see.

ᵉ General relativity modifies (3.8.24) by changing the 1/8 to 1/2. The circular orbit where potential S is a minimum [cf. Fig. 7.6] determines $\Delta = \ell_0/\sqrt{2}\,c = \sqrt{\alpha/2(2r-3\alpha)}\;r \approx \sqrt{\alpha r}$. The Schwarzschild radius for the Sun is 3 × 10³ m, and the Sun has a radius of 7 × 10⁸ m. If Δ were to have this value, it would fix the radius of the orbit r at 10¹⁴ m, which is the Schwarzschild radius of a galaxy.

The ratio of the Schwarzschild radius to the solar radius was predicted to be the magnitude of the deflection of light by the Sun by Söldner [04] back in 1801. Surely, Ritz could not have been aware of that prediction. And excluding Söldner, he would have preceded by three years Einstein, who obtained the same result as Söldner in 1911. Einstein was unaware of it, and it was only brought to his attention in 1921 by Lenard in an effort to discredit him and his theory of relativity. That Einstein’s result could not have been verified at the time was heralded as a fortuitous circumstance since his general theory doubled the value. Observations were made during various solar eclipses with results ranging from 1.5″ to 2.2″. The latter value was found by the Freundlich eclipse expedition, which summarily remarked: “There appears to be no further doubt possible that our series of measurements is not compatible with the value 1.75″ asserted by theory.” Numerical values do not make or break a theory, but their predictions do. Ritz could have never obtained an exact numerical result because he always had an undetermined parameter, λ, at his disposal.

In closing this chapter, perhaps some speculative remarks would be in order. As we have mentioned, Ritz [09] published a joint paper with Einstein in 1909, shortly before his death. Apparently, Ritz was not aware of the deflection of light by a massive object. But was Einstein aware of it, and if so, when? Probably as early as 1907, for as Einstein [23b] writes in his 1911 paper,

In a memoir published four years ago I tried to answer the question whether the propagation of light is influenced by gravitation. I return to this theme, because my previous presentation of the subject does not satisfy me, and for the stronger reason, because I now see that one of the most important consequences of my former treatment is capable of being tested experimentally. For it follows from the theory here to be brought forward, that rays of light, passing close to the Sun, are deflected by its gravitational field, so that the angular distance between the Sun and a fixed star appearing near to it is apparently increased by nearly a second of arc.
The really surprising aspect of Einstein’s 1911 calculation of the deflection of light by a massive body is that it employed a non-constant speed of light, as we have mentioned in Sec. 1.1.1.1. Einstein first draws the ‘analogy’ between the frequency shift caused by the (classical) Doppler shift,

$$\nu = \nu_0(1 - u/c), \tag{3.8.27}$$
and that which could occur in a gravitational field, Φ = −GM/r. To ‘derive’ the expression for the shift, he replaces u by gt, where g is the constant acceleration on the surface of the Earth. With the time t = h/c, where h is the height that a photon falls from where it is emitted to where it is
absorbed, and Φ = −gh, the shift in frequency becomes

$$\nu = \nu_0\left(1 + \frac{\Phi}{c^2}\right). \tag{3.8.28}$$
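The substitutions that lead from (3.8.27) to (3.8.28) can be spelled out in one line. Nothing is assumed beyond the replacements named in the text (u → gt, t = h/c, and the potential difference −gh); the arrow notation is used here only for illustration.

```latex
\[
\nu = \nu_0\left(1 - \frac{u}{c}\right)
 \;\xrightarrow{\;u = gt,\ t = h/c\;}\;
 \nu_0\left(1 - \frac{gh}{c^2}\right)
 = \nu_0\left(1 + \frac{\Phi}{c^2}\right),
 \qquad \Phi = -gh .
\]
```

For a fall of h = 1 m with g ≈ 9.8 m/s², the fractional shift gh/c² is of order 10⁻¹⁶, which indicates how small the effect is at the surface of the Earth.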
Supposedly, (3.8.28) is valid everywhere, and not just for constant acceleration. In fact, it is static, whose only effect is to slow down clocks in a gravitational field, Φ = −GM/r. The Doppler shift, (3.8.27), can also lead to an increase in the frequency by reversing the sign of the velocity. There is no such possibility in (3.8.28). Then there is the nagging question of what causes the Doppler shift: uniform velocity or uniform acceleration? Obviously it is the former. Finally, we know that (3.8.27) is classical so that it would mean that there would be a relativistic generalization of (3.8.28). No one has ever mentioned it. Einstein continues

For measuring time at a place which, relatively to the origin of the coordinates, has the gravitational potential Φ, we must employ a clock which — when removed to the origin of the coordinates — goes (1 + Φ/c²) times more slowly than the clock used for measuring time at the origin of the coordinates. If we call the velocity of light at the origin of coordinates c₀, then the velocity of light c at the place where the gravitational potential is Φ will be given by the relation

$$c = c_0\left(1 + \frac{\Phi}{c^2}\right). \tag{3.8.29}$$
Here, Einstein has to admit that The principle of constancy of the velocity of light holds good according to this theory in a different form from that which usually underlies the ordinary theory of relativity.
Which form he does not say. Moreover, this appears to be a second-order effect, and not the usual Doppler shift, which is linear in the relative velocity, u/c. In addition (3.8.29) presents itself as a cubic equation for determining the speed of light as a function of a static gravitational potential. According to Pais [82], “Einstein restored sanity, but at a price.” On the contrary, Einstein did not restore sanity, but did pay the price! The reason why “we must use clocks of different constitution for measuring the time at places with different gravitational potential” does not seem like clocks at all for there can be no consensus of the time measured nor a law telling
us how to calculate the differences because the clocks are all of “different constitutions.” But let this not prevent us from continuing. Einstein then claims that From the proposition which has just been proved, that the velocity of light in a gravitational field is a function of place, we may easily infer, by means of Huygens’s principle, that light-rays propagated across a gravitational field undergo deflection.
Leaving aside the ‘proof’ of the proposition, Einstein asserts that the rate of change of the velocity with respect to the normal of the wavefront gives the angle of deflection and equates ∂c/∂n with (c₀/c²)∂Φ/∂n, on the strength of (3.8.29), where n is the normal to the surface. Einstein thus proposes a deflection, “on the side directed toward the heavenly body, of magnitude”

$$a = \frac{1}{c^2}\int_{\theta=-\pi/2}^{\theta=\pi/2} \frac{GM}{r^2}\cos\theta\,ds = \frac{2GM}{rc_0^2} = 4\times 10^{-6} = 0.83'', \tag{3.8.30}$$
where the arc ds = r dθ.

Now the standpoints of Ritz and Einstein are even more curious for Ritz contends that the speed of a photon relative to its emitter is c, but relative to an inertial observer is c ± u, where u is the radial speed between the source and observer, while Einstein argues that a photon’s speed is always c, independent of the speed of its emitter. This debate took place in 1909 [cf. Sec. 4.1.3]. Just two years later, Einstein admits that c is no longer an absolute velocity!

It is commonly assumed that the difference between the special and general theories is that the latter supplies a factor of two in (3.8.30). It is not the factor of 2 which is the important point. The corrections to Newtonian physics are all of the magnitude of the ratio of the Schwarzschild radius to the radius of the object under investigation. This is why all of the so-called tests of the general theory can so easily be obtained without any of the heavy machinery of tensorial analysis [Sexl & Sexl 79]. This will be the object of our study in Chapter 7.
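The three deflection formulas discussed above differ only in the numerical coefficient multiplying the ratio of the Schwarzschild radius α to the collision parameter. The Python sketch below evaluates them side by side for a ray grazing the Sun. It assumes that the collision parameter is set equal to the solar radius, which is precisely the identification the text questions, and uses standard reference values for the constants that are not given in the text; the function and variable names are illustrative.

```python
import math

# Assumed reference values (not given in the text).
GM_SUN = 1.32712440018e20   # heliocentric gravitational constant, m^3 s^-2
C = 299_792_458.0           # speed of light, m/s
R_SUN = 6.957e8             # nominal solar radius, m
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def deflection_arcsec(coefficient, gm=GM_SUN, impact=R_SUN, c=C):
    """Deflection coefficient * alpha / impact, in arc seconds, with alpha = 2GM/c^2."""
    alpha = 2.0 * gm / c**2
    return coefficient * alpha / impact * ARCSEC_PER_RAD


einstein_1911 = deflection_arcsec(1.0)       # 2GM/(r c^2) of (3.8.30), i.e. alpha/r
ritz_form = deflection_arcsec(1.5)           # (3/2) alpha/Delta of (3.8.26)
general_relativity = deflection_arcsec(2.0)  # 2 alpha/Delta

for label, value in [("Einstein 1911 / Soldner", einstein_1911),
                     ("Eq. (3.8.26)", ritz_form),
                     ("General relativity", general_relativity)]:
    print(f"{label:24s} {value:.2f} arc seconds")
```

With modern constants the first value comes out near 0.88″, slightly above the 0.83″ that Einstein quoted, and the last near 1.75″. All three are the same ratio α/R up to a factor of order one, which is the point made in the closing paragraph.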
References

[Born 60] M. Born, The Mechanics of the Atom (Frederick Ungar, New York, 1960), p. 141.
[Brace 04] D. B. Brace, “On double refraction in matter moving through the aether,” Phil. Mag. 7 (1904) 317–329.
[Carstoiu 69] J. Carstoiu, “Les deux champs de gravitation et propagation des ondes gravifiques,” Comptes Rendus 268 série A (1969) 201–204; “Nouvelles remarques sur les deux champs de gravitation et propagation des ondes gravifiques,” ibid 268 (1969) 261–264.
[Cuttingham 14] E. Cuttingham, The Principle of Relativity (Cambridge U. P., Cambridge, 1914).
[Einstein 23a] A. Einstein, “On the electrodynamics of moving bodies,” translated by W. Perrett and G. B. Jeffrey from Ann. der Physik 17 (1905) in The Principle of Relativity (Methuen, London, 1923), p. 63.
[Einstein 23b] A. Einstein, “On the influence of gravitation on the propagation of light,” Ann. der Physik 35 (1911) 898–908; translated by W. Perrett and G. B. Jeffrey in The Principle of Relativity (Methuen, London, 1923), p. 99.
[Feynman 64] R. P. Feynman, The Feynman Lectures on Physics, Vol. II (Addison-Wesley, Reading MA, 1964), p. 28–11.
[Fokker 65] A. D. Fokker, Time and Space Weight and Inertia (Pergamon Press, Oxford, 1965).
[Forward 61] R. L. Forward, “General relativity for the experimentalist,” Proc. IRE 49 (1961) 892–904.
[French 68] See, for example, A. P. French, Special Relativity (Van Nostrand Reinhold, London, 1968).
[Heaviside 94] O. Heaviside, Electromagnetic Theory, Vol. I (The Electrician, London, 1894), Appendix B.
[Ives 51] H. E. Ives, “Revisions of the Lorentz transformations,” Proc. Am. Philos. Soc. 95 (1951) 125–131.
[Klein 71] F. Klein, “On the so-called noneuclidean geometry,” Mathematische Annalen 4 (1871) 573–625; transl. in J. Stillwell, Sources of Hyperbolic Geometry (Am. Math. Soc., Providence RI, 1991), pp. 69–111.
[Larmor 00] J. Larmor, Aether and Matter: A Development of the Dynamical Relations of the Aether to Material Systems on the Basis of the Constitution of Matter (Cambridge U. P., Cambridge, 1900).
[Lavenda 09] B. H. Lavenda, A New Perspective on Thermodynamics (Springer, New York, 2009).
[Lorentz 16] H. A. Lorentz, Theory of Electrons (G. E. Stechert, New York, 1916), p. 217.
[Mascart 74] E. E. N. Mascart, “Sur les modifications qu’éprouve la lumière par suite du mouvement de la source lumineuse et du mouvement de l’observateur,” Annales de l’École Normale 3 (1874) 157–214.
[Møller 52] C. Møller, The Theory of Relativity (Oxford U. P., London, 1952), p. 354.
[Okun 89] L. B. Okun, “The concept of mass,” Physics Today June (1989) 31–36.
[O’Rahilly 38] A. O’Rahilly, Electromagnetics (Longmans, Green & Co., London, 1938), p. 325.
[Pais 82] A. Pais, Subtle is the Lord (Oxford U. P., Oxford, 1982), p. 199.
[Poincaré 02] H. Poincaré, La Science et l’Hypothèse (Flammarion, Paris, 1902).
[Poynting 10] J. H. Poynting, The Pressure of Light (Society for Promoting Christian Knowledge, London, 1910).
[Poynting & Thomson 22] J. H. Poynting and J. J. Thomson, Text-book of Physics: Properties of Matter (Charles Griffin, London, 1922), pp. 48–52.
[Rayleigh 02] Lord Rayleigh, “Does motion through the aether cause double refraction,” Phil. Mag. 4 (1902) 678–683.
[Ritz 08] W. Ritz, Gesammelte Werke–Oeuvres (Gauthier Villars, Paris, 1911); see also http://www.datasync.com/rsf1/crit2/1908-2p.htm.
[Ritz & Einstein 09] W. Ritz and A. Einstein, “Zum gegenwärtigen Stande des Strahlungsproblems,” Phys. Z. 10 (1909) 323–324.
[Schott 12] G. A. Schott, Electromagnetic Radiation (Cambridge U. P., Cambridge, 1912), p. 250. Note that the direction of the third component of the force should be m , and not m.
[Sexl & Sexl 79] R. Sexl and H. Sexl, White Dwarfs–Black Holes: An Introduction to Relativistic Astrophysics (Academic Press, New York, 1979), Ch. 2.
[Silberstein 14] L. Silberstein, The Theory of Relativity (MacMillan, London, 1914).
[Skilling 42] H. H. Skilling, Fundamentals of Electric Waves (Wiley, New York, 1942).
[Söldner 04] J. G. v. Söldner, “Über die Ablenkung eines Lichtstrahls von seiner geradlinigen Bewegung, durch die Attraktion eines Weltkörpers, an welchem er nahe vorbei geht,” Berliner Astronomisches Jahrbuch (1804) 161–172; reprinted in P. Lenard, “Über die Ablenkung eines Lichtstrahls von seiner geradlinigen Bewegung durch die Attraktion eines Weltkörpers, an welchem er nahe vorbeigeht von J. Söldner, 1801,” Ann. d. Phys. 65 (1921) 593–604.
[Stranathan 42] J. D. Stranathan, The “Particles” of Modern Physics (Blakiston, Philadelphia, 1942).
[Thomson 28] J. J. Thomson and G. P. Thomson, Conduction of Electricity Through Gases, 3rd ed. (Cambridge U. P., Cambridge, 1928), p. 262.
[Trouton & Noble 03] F. T. Trouton and H. R. Noble, “The mechanical forces acting on a charged electric condenser moving through space,” Philos. Trans. R. Soc. London 202 (1903) 165–181.
[Whittaker 53] E. Whittaker, A History of the Theories of Aether and Electricity, Vol. 2 (Nelson, London, 1953).
Chapter 4
Electromagnetic Radiation
4.1
Spooky Actions-at-a-Distance versus Wiggly Continuous Fields
The early decades of the nineteenth century saw a very diverse set of actors on the stage of electrodynamics. This stage rapidly turned into a battleground between the proponents of actio in distans, Gauss, Clausius, Weber, Riemann and Ritz, and the advocates of continuous action through a medium, Faraday, Maxwell, Lorentz, Heaviside, and Hertz. Maxwell’s idea that “we can scarcely avoid the inference that light consists in the transverse undulation of the same medium which is the cause of electric and magnetic phenomena” is counterpoised by Gauss’s assertion “two elements of electricity in relative motion repel or attract one another differently when in motion than when at rest.” So is it the wiggly nature of elastic undulations or the ballistic nature of electrified particles that interact through action at a distance that held the day? In an 1845 letter to Wilhelm Weber, Gauss admits: I would doubtless have long since published my researches, were it not that at the time I gave them up I had failed to find what I regarded as the keystone: namely, the derivation of the additional forces — to be added to the mutual action of electrical particles at rest when they are in mutual motion — from the action which is propagated not instantaneously but in time as is the case with light.
So it is not only in the discovery of hyperbolic geometry that Gauss lost priority due to his extremely conservative nature. Weber, remembered today only for his “absolute units of measurements of electrical quantities,” is not given credit for his ideas that electricity has an atomic structure, that electrical currents consist of streams of particles, that Coulomb’s law needs to be modified for charges in motion, that Ampère’s law acts directly between the charges and not between the conductors, that there is a limiting
speed at which the force of attraction vanishes, and that their interaction is not instantaneous, as Gauss affirmed. Actually, Ampère’s law was quite revolutionary in its day because it upset the apple cart of Newton’s universal inverse square law. Although Ampère’s law reduces to that of Coulomb for charges at rest, it took into account not only the separation between the charges but also the angular direction of their currents. Ampère also alluded to the creation of magnetism through the motion of charges. Notwithstanding his revolutionary ideas, his work would have fallen into oblivion had it not been for the intervention of Gauss, who with the collaboration of a young physicist, Wilhelm Weber, sought experimental verification of Ampère’s hypothesis. The gap began to widen between the Maxwellian continuity of the electric and magnetic fields and the Weberian belief that forces between charged particles depended on their relative velocities and accelerations, in which Ampère’s angular dependency diminishes the force of attraction when the particles are in relative motion. Weber, and all those who agreed with him, like Rudolf Clausius, were lambasted by the English school, personified by Peter Guthrie Tait. In the first edition of his Sketch of Thermodynamics, Tait insinuates that
Weber’s inadmissible theory of the forces exerted on each other by moving electric charges, for which the conservation of energy is not true; while Maxwell’s result is in perfect consistence with that great principle.
Maxwell, in his Treatise, later admitted that the non-conservation of energy “does not apply to the formula of Weber.” Why this retraction on the part of Maxwell? According to O’Rahilly [38], it is due to the presence of an acceleration term, $\ddot{r}$, in Weber’s force law,
$$F_x = \frac{ee'}{r^2}\cos(rx)\left(1 + r\ddot{r} - \frac{1}{2}\dot{r}^2\right), \tag{4.1.1}$$
where the argument of the cosine denotes the angle formed between r and x, so that it accounts for the energy lost through radiation. In the second edition of Sketch, Tait admits to his mistake, but, now, faults Weber on the fact that his “potential involves relative velocities as well as relative positions, and cannot therefore be called potential energy.” Tait is here saying that only absolute velocities, or velocities with respect to the aether, can be used in
electrodynamics! Next, he attacks Weber on the nature of the force law; any true law of force must only be a function of the mutual distance between the charges $e$ and $e'$, and cannot involve velocity, either relative or absolute. This is nothing but unfettered authoritarianism. Even the ‘man who believed in atoms,’ Ludwig Boltzmann, acquiesced to the continuum model of Maxwell, saying
The hypothesis of electric fluids was brought to high perfection by Wilhelm Weber, and the general recognition given to his work in Germany stood in the way of the study of Maxwell’s theory. . . It is certainly useful if Weber’s theory is held up for ever as a warning example that we should always preserve the required mental elasticity.
In essence, Boltzmann is negating his entire life’s work! But, to Boltzmann, discreteness, or atomism, was a mere way of counting. He always took the continuum limit at the end of his calculations, something Planck was unable to do. Weber’s force law (4.1.1) can be derived from the so-called electrokinetic potential,
$$L = e(\phi - \mathbf{u}\cdot\mathbf{A}/c), \tag{4.1.2}$$
where $\phi$ and $\mathbf{A}$ are the scalar and vector potentials. The idea is to introduce expressions for these potentials, such as:
$$\begin{aligned}
\phi &= \int \frac{\rho}{r}\,dV, &&\text{(Poisson–Gauss)}\\
\mathbf{A} &= \frac{1}{c}\int \frac{\mathbf{j}}{r}\,dV, &&\text{(Ampère)}\\
\phi &= \int \frac{\rho_{t-r/c}}{r}\,dV, &&\text{(Riemann–Lorenz)}\\
\mathbf{A} &= \frac{1}{c}\int \frac{\mathbf{j}_{t-r/c}}{r}\,dV, &&\text{(Riemann–Lorenz)}\\
\phi &= \frac{e}{r - \mathbf{u}\cdot\mathbf{r}/c}, &&\text{(Liénard–Wiechert)}\\
\mathbf{A} &= \frac{e\mathbf{u}/c}{r - \mathbf{u}\cdot\mathbf{r}/c}, &&\text{(Liénard–Wiechert)}
\end{aligned}$$
where the subscripts on the charge density, $\rho$, and current density, $\mathbf{j}$, mean to evaluate them at the earlier time $t' = t - r/c$, and $\mathbf{r}$ is the radius vector, taken from the point where the charge is located to the point where it is
observed. Then (4.1.2) could be treated as a mechanical Lagrangian so that the force is given by the variational expression
$$F_x = -\frac{\partial L}{\partial x} + \frac{d}{dt}\frac{\partial L}{\partial u_x}. \tag{4.1.3}$$
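As a worked illustration of how (4.1.3) is used, the following sketch applies it along the line joining the charges to Weber’s Lagrangian, quoted below as (4.1.5), writing $u_r = \dot{r}$. The radial form of (4.1.3) and the explicit factors of $c$ are assumptions made for this illustration; the result agrees with (4.1.1) if velocities there are taken to be measured in units of $c$.

```latex
\begin{align*}
L_W &= \frac{ee'}{r}\left(1 + \frac{\dot r^{2}}{2c^{2}}\right), \\
\frac{\partial L_W}{\partial r} &= -\frac{ee'}{r^{2}}\left(1 + \frac{\dot r^{2}}{2c^{2}}\right),
\qquad
\frac{\partial L_W}{\partial \dot r} = \frac{ee'\,\dot r}{c^{2} r}, \\
\frac{d}{dt}\frac{\partial L_W}{\partial \dot r} &= \frac{ee'\,\ddot r}{c^{2} r} - \frac{ee'\,\dot r^{2}}{c^{2} r^{2}}, \\
F_r &= -\frac{\partial L_W}{\partial r} + \frac{d}{dt}\frac{\partial L_W}{\partial \dot r}
     = \frac{ee'}{r^{2}}\left(1 - \frac{\dot r^{2}}{2c^{2}} + \frac{r\,\ddot r}{c^{2}}\right).
\end{align*}
```

Projecting on the $x$-axis gives $F_x = F_r\cos(rx)$, so the acceleration term $r\ddot{r}$ arises entirely from the time derivative of $\partial L_W/\partial\dot{r}$.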
Clausius’s Lagrangian,
$$L = \frac{ee'}{r}\left(1 - \sum u_x u'_x/c^2\right), \tag{4.1.4}$$
where the sum is over all particles, introduces absolute velocities, $u$ and $u'$, and not relative ones. Unwittingly, Clausius joined electron theory to the aether, and undoubtedly, his expression (4.1.4) is similar to his virial, where the charges are replaced by the masses. Earlier expressions for the force, derived from Weber’s,
$$L_W = \frac{ee'}{r}\left(1 + u_r^2/2c^2\right), \tag{4.1.5}$$
and Riemann’s,
$$L_R = \frac{ee'}{r}\left(1 + u^2/2c^2\right), \tag{4.1.6}$$
Lagrangians contained only relative velocities, where $u_r$ is the radial velocity. Ritz showed that his second-order force relation for the interaction of two charges in uniform motion,
$$\begin{aligned}
F_x &= e\left[E_x + \frac{1}{c}(\mathbf{u}\times\mathbf{H})_x\right] \\
&= \frac{ee'}{c^2r^2}\cos(rx)\left[c^2 + \frac{1}{4}(3-\lambda)u^2 - \frac{3}{4}(1-\lambda)u_r^2\right] - (1+\lambda)\frac{ee'}{2r^2c^2}u_xu_r - \frac{ee'}{2c^2r}\left[\dot{u}_x + \dot{u}_r\cos(rx)\right],
\end{aligned} \tag{4.1.7}$$
can be expressed as a linear combination of the Weber and Riemann forces, and, hence, be expressed as a linear combination of their Lagrangians, (4.1.5) and (4.1.6) [cf. 3.8.15]. Moreover, for $\lambda = 1$, Ritz’s force law reduces to Liénard’s [98] expression,
$$F_x = \frac{ee'}{c^2r^2}\cos(rx)\left[c^2 + \frac{1}{2}u^2\right] - \frac{ee'}{r^2c^2}u_xu_r - \frac{ee'}{2c^2r}\left[\dot{u}_x + \dot{u}_r\cos(rx)\right], \tag{4.1.8}$$
where the acceleration terms have been included.
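The reduction of Ritz’s force to Liénard’s at $\lambda = 1$ can be checked numerically. The sketch below simply transcribes (4.1.7) and (4.1.8) as read here; the function names, argument names and the sample numbers are hypothetical, chosen only for illustration, and units with $c = 1$ are assumed.

```python
def ritz_force_x(lam, e, e_prime, r, cos_rx, u, u_r, u_x, du_x, du_r, c=1.0):
    """x-component of Ritz's second-order force, as transcribed from (4.1.7)."""
    k = e * e_prime
    radial = k / (c**2 * r**2) * cos_rx * (
        c**2 + 0.25 * (3.0 - lam) * u**2 - 0.75 * (1.0 - lam) * u_r**2
    )
    velocity = (1.0 + lam) * k / (2.0 * r**2 * c**2) * u_x * u_r
    acceleration = k / (2.0 * c**2 * r) * (du_x + du_r * cos_rx)
    return radial - velocity - acceleration


def lienard_force_x(e, e_prime, r, cos_rx, u, u_r, u_x, du_x, du_r, c=1.0):
    """x-component of Lienard's force, as transcribed from (4.1.8)."""
    k = e * e_prime
    radial = k / (c**2 * r**2) * cos_rx * (c**2 + 0.5 * u**2)
    velocity = k / (r**2 * c**2) * u_x * u_r
    acceleration = k / (2.0 * c**2 * r) * (du_x + du_r * cos_rx)
    return radial - velocity - acceleration


# Arbitrary illustrative state (hypothetical numbers, units with c = 1).
state = dict(e=1.0, e_prime=-1.0, r=2.0, cos_rx=0.6,
             u=0.3, u_r=0.2, u_x=0.1, du_x=0.05, du_r=-0.02)

difference = ritz_force_x(1.0, **state) - lienard_force_x(**state)
print("Ritz(lambda=1) - Lienard =", difference)

# For charges at rest every velocity and acceleration term drops out,
# leaving the Coulomb force projected on x, whatever the value of lambda.
at_rest = ritz_force_x(0.5, e=1.0, e_prime=1.0, r=2.0, cos_rx=1.0,
                       u=0.0, u_r=0.0, u_x=0.0, du_x=0.0, du_r=0.0)
print("Force at rest =", at_rest)
```

Other values of `lam` can be passed to see how the velocity-dependent terms shift between the Weber-like and Riemann-like contributions.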
4.1.1
Irreversibility from a reversible theory
Not only was Riemann a great mathematician, but, like the prince of mathematicians, Gauss, he also contributed to electrodynamics. Riemann observed
I have found that the electrodynamic actions of galvanic currents may be explained by assuming that the action of one electrical mass on the rest is not instantaneous, but is propagated to them with a constant velocity which, within the limits of observation, is equal to that of light.
Not only in the field of thermodynamics was there a rivalry between the English and Germans in the nineteenth century [Lavenda 09], but that rivalry spilled over into electrodynamics. Maxwell was quick to object to the notion that “potential is propagated like light” and refers to Clausius’s criticisms of Riemann to the effect that Riemann fails to obtain the known laws of electrodynamics. Clausius here was in no way as successful as he was on the thermodynamics front. The finite time of propagation, referred to by Riemann, could either mean the potential is retarded insofar as the observation of the charge or current is made at a later time, or there is a finite time involved in the interaction of the charges, i.e. there is no action at a distance. Maxwell deals with the first possibility and claims that “the electrical potential, which is the analogue of temperature, is a mere scientific concept.” This is in sharp contrast to his $\mathbf{A}$, the “electrokinetic momentum,” which “may even be called the fundamental quantity in the theory of electromagnetism.” Verily, Maxwell was not aware of the four-vector status of the potentials. The explicit introduction of retarded potentials was made by Ludvig Lorenz in 1867. Lorenz wrote,
$$\phi = \int \frac{[\rho]}{r}\,dV, \qquad \mathbf{A} = \frac{1}{c}\int \frac{[\mathbf{j}]}{r}\,dV, \tag{4.1.9}$$
where the square brackets indicate that the charge, ρ, and current, j = ρu, densities are to be evaluated at an earlier time, t − r/c. Maxwell reacted more kindly to Lorenz than he did to Riemann saying, in his Treatise, that “his conclusions are similar to those in this chapter, though obtained by an entirely different method.” Maxwell then claims priority over Lorenz.
Fig. 4.1.
The configuration for calculating the retarded scalar potential.
The form of the retarded potential in use today was first given by Liénard in 1898, and slightly later by Wiechert in 1900. Liénard’s proof that the scalar potential is given by
$$\phi = \frac{e}{r(1 - v_r/c)}, \tag{4.1.10}$$
and a similar expression for the vector potential, consists in considering the total charge $e$ in a small volume $V$. Suppose that $V$ is so small that the velocity $v_r$ can be considered constant throughout. A sphere of radius $r$ cuts the volume in $\alpha\beta$, which can be considered a plane in Fig. 4.1. An increase in the radius $r$, by an infinitesimal amount $dr$, leads to an increase in volume swept out by $\alpha\beta$ by an amount $\alpha\beta \times dr$. However, the volume swept out with respect to the original volume is $\alpha\beta \times (dr + v_r\,dt)$, where $v_r$ is the rate of change of the radius. The increment in the radial coordinate is $dr = -c\,dt$, looking backwards, so that the volume $V$ expands to $V/(1 - v_r/c)$, and, consequently, the scalar potential where the observer is located, (4.1.10), is greater than its value where the charge is located.[a]

[a] Curiously enough, Heaviside attributes the retarded potentials to his friend FitzGerald, “who first brought the progressive A and φ into electromagnetics . . . But his potentials were not dopplerized. . . .” How Heaviside could have confused retarded propagation with the Doppler effect is, indeed, a mystery. Even correct equations can sometimes be obtained through false physical reasoning.
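The size of the enhancement in (4.1.10) is easy to tabulate. The following minimal sketch evaluates the formula as written; the function name and the sample values are hypothetical, units with $c = 1$ are assumed, and $v_r$ is simply the radial velocity entering (4.1.10), with positive values increasing the potential.

```python
def lienard_wiechert_phi(e, r, v_r, c=1.0):
    """Scalar potential e / (r (1 - v_r/c)) of (4.1.10)."""
    if r <= 0:
        raise ValueError("r must be positive")
    if abs(v_r) >= c:
        raise ValueError("|v_r| must be smaller than c")
    return e / (r * (1.0 - v_r / c))


static = lienard_wiechert_phi(e=1.0, r=2.0, v_r=0.0)   # plain e/r
moving = lienard_wiechert_phi(e=1.0, r=2.0, v_r=0.5)   # enhanced by 1/(1 - v_r/c)
print("static:", static, "moving:", moving, "ratio:", moving / static)
```

The ratio depends only on $v_r/c$, which is the same factor by which the effective volume $V$ is enlarged in Liénard’s argument.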
O’Rahilly [38] makes a good point that whereas Lorenz would write $\mathbf{u}$, Maxwell would write $\mathbf{u} + \dot{\mathbf{E}}$. In so doing, Maxwell shows the dichotomy of the current: a current carrying charged particles, and a more abstract displacement current, so named because Maxwell was under the impression that such a current was the result of polarization, or the displacement of charges in a dielectric. The two are united through the continuity equation and Gauss’s law. However, the real magic of the displacement current is that, when substituted into the second circuital equation, it closed the equations and led to a wave equation for the propagation of the fields at the velocity of light! Although we do not agree with O’Rahilly’s claim that “the view of Lorenz is accepted universally today, while there appears to be little or no realization of the elementary inference that Maxwell’s displacement current is thereby rendered unnecessary,” and that “Maxwell’s so-called displacement current is merely a mathematical equivalent expression without physical significance,” the retarded potential picks out the arrow of time which is lacking in the rest of electrodynamics, be it Maxwell’s or Lorentz’s theory. Whereas the retarded potential corresponds to waves diverging in all directions from electric charges, the advanced potential corresponds to waves coming in from infinity and converging on the electric charges. As Ritz pointed out over a century ago: retarded potentials depend on previous states, advanced potentials depend on future states. He concluded:
Experience shows, and Lorentz admits, that only [retarded] waves can exist, and, furthermore, contrary hypotheses would involve inadmissible consequences, such as the possibility of perpetual motion.
For if we admit the existence of advanced potentials, charges would no longer be the sources of the field. Ritz uses the argument that since advanced, as well as retarded, potentials satisfy Maxwell’s equations, these equations are unable to distinguish between the two, and this is, yet, another reason for preferring the “formulas of elementary actions.” It is rather interesting to see how Einstein vacillated back and forth. Any theory which seeks to unite Maxwell’s equations with the corpuscular theory of light must lead to inconsistencies. But, Einstein divided them into two papers in 1905, one “On the electrodynamics of moving bodies,” and the other “On a heuristic point of view concerning the production and
transformation of light.” According to Planck’s interpretation of Einstein, if light waves have a corpuscular constitution then Maxwell’s equations have to be abandoned. Einstein supported this by saying According to the usual theory an oscillating ion generates a divergent spherical wave. The reverse process does not exist as an elementary process. . . the elementary process of light emission has not as such the character of reversibility. . . Hence the constitution of radiation appears to be different from that deduced by our wave theory.
Undoubtedly swayed by the enormous popularity that special relativity received in the years that followed, Einstein, writing in Maxwell’s Commemoration Volume, again reverted to his previous stance: Since Maxwell’s time physical reality has been thought of as represented by continuous fields, governed by partial differential equations and not capable of any mechanical interpretation.
And so he was to remain for the rest of his life.
4.1.2
From fields to particles
Maxwell took his theory of electromagnetism up to the point of specifying the molecular constituency of matter. For he wrote: Here we may introduce once for all the common phrase ‘the electric fluid’ for the purpose of warning our readers against it. It is one of those phrases which, having been at one time used to denote an observed fact, was immediately taken up by the public to connote a whole system of imaginary knowledge. As long as we do not know whether positive electricity or negative or both should be called a substance or the absence of a substance. . . we must avoid speaking of the electric fluid.
Although Maxwell’s original goal was “to discover a method of forming a mechanical conception of this electrotonic state,” he conceded defeat in that we have made only one step in the theory of the action of the medium. We have supposed it to be in a state of stress, but we have not in any way accounted for this stress or explained how it is maintained. . . I have not been able to make the next step, namely, to account by mechanical considerations for these stresses in the dielectric.
That next step was taken by Hendrik Lorentz, and it was a step backwards to Weber, who considered the action to depend “directly on the relative velocities of the particles, and to Riemann and Lorenz who
envisioned a gradual propagation of something, whether potential or force, from one particle to another.” Lorentz’s force,
$$\mathbf{F} = e\left(\mathbf{E} + \frac{\mathbf{u}}{c}\times\mathbf{H}\right), \tag{4.1.11}$$
is a concoction of Coulomb’s law and J. J. Thomson’s force (1881), which is experienced by a charge e as it moves through a magnetic field, H. Lorentz’s synthesis (1892) is this: It is got by generalizing the results of electromagnetic experiments. The first term represents the force acting on an electron in an electrostatic field [F1 = eE]. . . the part of the force expressed by the second term may be derived from the law according to which an element of wire carrying a current is acted on by a magnetic field [F2 = j ds × H/c]. . . simplifying . . . [to] only one kind of moving electron with equal charges and a common velocity. . . [j ds = eu]. . . we now combine the two in the way shown by the equation. . .
O’Rahilly [38] points out that the two forces are incompatible, for how can an electron be both stationary and moving? The word ‘combine’ leaves something to be desired. Moreover, as Thomson was the first to point out, (4.1.11) violates the law of action and reaction without the presence of the aether. Lorentz’s law is derivable from the ‘electrokinetic’ potential of Schwarzschild, the ‘electrodynamic’ potential of Clausius, or the ‘convection’ potential of Searle, (4.1.2), since, as Schwarzschild showed, (4.1.11) follows from the classical variational equation, (4.1.3). Another criticism that can be lodged against Lorentz’s synthesis is that the electric field due to a charge $e'$ moving with relative velocity $\mathbf{u}'$ on the charge $e$ traveling at relative velocity $\mathbf{u}$, at a distance $r$ apart, is
$$\begin{aligned}
\mathbf{E} &= -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} \\
&= -\nabla\frac{e'}{r} - \frac{1}{c}\frac{\partial}{\partial t}\frac{e'\mathbf{u}'}{cr} \\
&= \frac{e'}{r^2}\hat{\mathbf{r}} - \frac{1}{c^2}\frac{e'}{r}\dot{\mathbf{u}}' + \frac{1}{c^2}\frac{e'u_r}{r^2}\mathbf{u}',
\end{aligned}$$
where $\hat{\mathbf{r}}$ is the unit vector in the direction of $\mathbf{r}$, and the vector potential has been taken in Ampère’s form, $\mathbf{A} = e'\mathbf{u}'/cr$. This shows that Lorentz’s assumption will hold if we neglect accelerations and second-order terms in the relative velocities.
Ritz remarks: This remarkable result, due to Schwarzschild, shows that Lorentz’s theory resembles the older theories much more than we could at first sight believe.
He contends that it is not the fields that determine the force in (4.1.11), but, rather, “we only know F. . . where there is electrified matter, and deduce E and H by reasoning (which is not always so simple when we have to consider absolute motion).” Moreover, when it is realized that $\mathbf{H} = e'\nabla\times\mathbf{u}'/cr$, it becomes apparent that the velocities, $\mathbf{u}$ and $\mathbf{u}'$, enter (4.1.11) “in a non-symmetrical manner which clearly shows the inequality of action and reaction, even when the accelerations are supposedly negligible and there is no radiation.” Ritz concludes by saying “the inequality of action and reaction constitutes, therefore, a serious objection to Lorentz’s theory.” Without the aether, Lorentz’s theory appears lopsided.
4.1.3
Absolute versus relative motion
We return to our discussion in Sec. 3.2 of Maxwell’s role in determining motion through the aether. In the year of his death, Maxwell thanked D. P. Todd, of the U. S. Nautical Almanac Office, for astronomical tables he had sent him. In that letter he brings up the possibility of measuring the velocity of the solar system through the aether. He thought it could be done by measuring the eclipses of Jupiter’s moons over half the period of Jupiter’s orbit about the Sun. That is, by observing the apparent time of the eclipses with the Earth at diametrically opposite ends of its orbit, it would be possible to infer the time it [aether] would take to travel the diameter of the Earth’s orbit about the Sun. If the whole solar system is moving at speed $u$, and the diameter of the Earth’s orbit is $d$, the time it takes light to travel this distance is
$$t_1 = \frac{d}{c+u}, \quad\text{or}\quad t_2 = \frac{d}{c-u},$$
depending on whether the aether is flowing towards, or away from the Earth. There will be a time difference of
$$\Delta t = t_2 - t_1 = \frac{2du}{c^2 - u^2} \approx \frac{2du}{c^2}.$$
Maxwell emphasized that this would be superior to terrestrial measurements, which relied on the completion of a round trip. In such a case, one would have to determine, not the time difference, but the total time for the ‘out and return’ trip,
$$t = \frac{d}{c+u} + \frac{d}{c-u} = \frac{2cd}{c^2 - u^2} \approx \frac{2d}{c}\left(1 + \frac{u^2}{c^2}\right),$$
where the increase in time due to the motion of the aether is
$$\Delta t \approx \frac{2d}{c}\frac{u^2}{c^2}. \tag{4.1.12}$$
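To get a feel for the two orders of magnitude, the sketch below evaluates the one-way time difference and the round-trip excess (4.1.12) over the same distance. The numerical values (a diameter of roughly two astronomical units and the Earth’s orbital speed of about 30 km/s standing in for $u$) are assumptions made here for illustration; they are not figures quoted by Maxwell.

```python
C = 2.998e8    # speed of light in m/s
D = 2.992e11   # assumed diameter of the Earth's orbit in m (about 2 AU)
U = 3.0e4      # assumed speed through the aether in m/s


def one_way_difference(d, u, c=C):
    """t2 - t1 = d/(c - u) - d/(c + u), which equals 2du/(c^2 - u^2)."""
    return d / (c - u) - d / (c + u)


def round_trip_excess(d, u, c=C):
    """Out-and-return time d/(c + u) + d/(c - u) minus the aether-free 2d/c."""
    return d / (c + u) + d / (c - u) - 2.0 * d / c


first_order = one_way_difference(D, U)            # exact one-way difference
second_order = round_trip_excess(D, U)            # exact round-trip excess
approx_first = 2.0 * D * U / C**2                 # leading term, first order in u/c
approx_second = (2.0 * D / C) * (U / C) ** 2      # Eq. (4.1.12), second order in u/c

print("one-way difference  [s]:", first_order, "approx:", approx_first)
print("round-trip excess   [s]:", second_order, "approx:", approx_second)
print("ratio of the two effects:", first_order / second_order)
```

The ratio of the two effects is $c/u$, which is why a one-way astronomical measurement looked so much more promising than a terrestrial round trip.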
In contrast to the extraterrestrial measurement, which is first-order in the ratio $u/c$, we now have a second-order effect. Maxwell realized that a terrestrial effect would be too small to measure; taking $u$ to be the velocity of Earth in its orbit, the ratio $u/c$ is of order $10^{-4}$.
To save the principle of action and reaction, Ritz created a new form of emission theory. Is radiation transmitted according to Poincaré’s analogy between an artillery cannon and the force that a body experiences when it emits radiation, or is radiant energy transmitted by a medium-like disturbance that can be described by a wave equation? The validity of Maxwell’s equations rests on there being an absolute velocity, the velocity of light. But Maxwell’s equations are silent on the existence of advanced potentials, which according to Ritz are unphysical. Ritz assumed that the velocity of a photon relative to its emitter was still $c$, so that Maxwell’s equations are preserved, but that its velocity with respect to an observer would be $c + u_r$, where $u_r$ is the velocity of the source in the direction of the vector joining the source to the point where it is being observed. As we saw in Sec. 3.2, this automatically accounts for the null result in the Michelson–Morley interferometer experiments. It is only when we retain Einstein’s assumption that the velocity of light is constant in whatever frame we are in that we need the hypothesis of a contraction in the direction of motion, like that invented almost contemporaneously by FitzGerald and Lorentz. What Ritz had in mind was modifying the retarded potentials, (4.1.9), so that they would read:
$$\phi = \int \frac{\rho_{t-r/(c\pm u_r)}}{r}\,dV, \qquad \mathbf{A} = \frac{1}{c}\int \frac{\mathbf{j}_{t-r/(c\pm u_r)}}{r}\,dV, \tag{4.1.13}$$
where the ± indicates whether the source is receding or approaching the observer. The new fields, (4.1.13), depend on the motion of the source, and this will increase or decrease depending on where they are being observed. Ritz’s emission theory was not in contradiction with terrestrial measurements since it differs from a constant velocity of light only in terms of second order in $u/c$. If the source is traveling at a (nonrelative) speed $u$, the time it would take light to travel a distance $l$ and back is
$$t = \frac{l}{c+u} + \frac{l}{c-u} = \frac{2lc}{c^2 - u^2}. \tag{4.1.14}$$
If the velocity of light were constant, the transit time would be $2l/c$. We came across (4.1.14) in the explanation of the null result in the Michelson–Morley experiment [cf. (3.2.1)]. But, if Ritz’s theory held sway there would have been no need to invent contraction factors. Ritz’s emission theory put Einstein in a tight squeeze [Ritz and Einstein 09]. On the one hand, Einstein could not negate the validity of Maxwell’s equations for he had recently based his special theory upon them. On the other hand, he could not accept speeds greater than the velocity of light for his explanation of the Fizeau drag coefficient from the relativistic velocity addition law would be evanescent, as we shall see in the next section. In a futile attempt to fend off Ritz’s allegations that advanced potentials would violate the second law, all Einstein could say was “the irreversibility rests exclusively upon the grounds of probability.”[b] The inverse process of a charge absorbing its own radiation was not considered as an elementary process in 1909, though it had to be admitted that it was a solution to Maxwell’s equations. To Einstein the inverse process consisted of an enormous collection of radiating particles that somehow could concentrate all of their radiation in a single point. In order to remove this asymmetry in Maxwell’s equations, Einstein suggested following a corpuscular theory with no corpuscles of light ever exceeding $c$. But that would leave his construct, special relativity, without a firm foundation on Maxwell’s theory.

[b] This is seen here as a retraction of his adage that “God does not play dice.”
Although Maxwell’s calculation is in conformity with Ritz’s, it is in conflict with special relativity. Ritz considers two points $A$ and $B$ which move with constant absolute velocity $v$ in the direction $AB$. A luminous wave starting from $A$ at instant $t$ will arrive at $B$ at instant $t'$. It will have to travel the distance $AB + v(t' - t)$ with speed $c$; we then have
$$t' - t = \frac{AB + v(t' - t)}{c}, \quad\text{or}\quad t' - t = \frac{AB}{c - v}.$$
He then goes on to say that there will be a first-order correction, $(AB/c)v/c$, to the ‘true’ time to get what he calls the ‘local’ time. Not so for terrestrial measurements. Although Ritz is following Maxwell, he quotes Lorentz. For terrestrial measurements of the velocity of light, we are obliged to make it over a closed path which brings it back to its starting point; thus eliminating first-order terms. So, in the example considered, if the wave emitted at $A$ is reflected at $B$, it will arrive at $A$ after a time
$$t' - t = AB\left(\frac{1}{c-v} + \frac{1}{c+v}\right) = \frac{2AB}{c}\left(1 + \frac{v^2}{c^2} + \cdots\right).$$
4.1.4
Faster than the speed of light
In the following chapters, we will see that hyperbolic geometry allows for velocities greater than light. Not to encroach on material of later chapters, suffice it to say here that Felix Klein’s [71] definition of distance is the following: The distance between an element z and the element z1 is the logarithm of the quotient, z/z1 , divided by the constant log λ.
Here, $\lambda$ is the scale factor; it is the unit of measure into which the interval is divided. If $z = c + u$ and $z_1 = c - u$, the ‘distance’ between them is
$$c + u - (c - u) = 2u = c\ln\frac{c+u}{c-u}, \tag{4.1.15}$$
where we have set $\ln\lambda = 1/c$. If we expand the logarithm to first-order we get $2u = 2u$, but the presence of higher-order terms in the expansion means that the ‘$u$’ on the left-hand side of (4.1.15) is not the same as the ‘$u$’ on the right-hand side. Whereas the velocity on the right-hand side is the Euclidean measure, the velocity on the left-hand side is the hyperbolic measure of the velocity.
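The distinction between the two measures can be made concrete numerically. The sketch below reads (4.1.15) as defining the hyperbolic measure $\bar{u} = (c/2)\ln[(c+u)/(c-u)]$, checks that it agrees with $u$ at low speed, and checks that the relativistic composition of Euclidean velocities becomes a plain subtraction of hyperbolic ones. The function names and the sample speeds are hypothetical, and units with $c = 1$ are assumed.

```python
import math

C = 1.0  # speed of light in the units assumed for this sketch


def hyperbolic_velocity(u, c=C):
    """Hyperbolic measure of a Euclidean velocity u, with |u| < c."""
    return 0.5 * c * math.log((c + u) / (c - u))


def euclidean_velocity(u_bar, c=C):
    """Inverse map: Euclidean velocity from its hyperbolic measure."""
    return c * math.tanh(u_bar / c)


def compose(u, v, c=C):
    """Relativistic composition law (u - v) / (1 - u v / c^2)."""
    return (u - v) / (1.0 - u * v / c**2)


u, v = 0.8, 0.5
relative = compose(u, v)

# The hyperbolic measures subtract, while the Euclidean ones do not.
lhs = hyperbolic_velocity(relative)
rhs = hyperbolic_velocity(u) - hyperbolic_velocity(v)
print("hyperbolic measure of composed velocity:", lhs)
print("difference of hyperbolic measures:      ", rhs)

# At low speed the two measures coincide to first order; near c they part ways.
print("u = 1e-6 ->", hyperbolic_velocity(1e-6))
print("u = 0.99 ->", hyperbolic_velocity(0.99))
```

Because `hyperbolic_velocity` grows without bound as `u` approaches `C`, values of the hyperbolic measure larger than $c$ correspond to perfectly ordinary Euclidean speeds below $c$.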
Furthermore, the composition law,
$$u' = \frac{u - v}{1 - uv/c^2},$$
implies that the difference of their hyperbolic counterparts, which we will indicate with a bar, is
$$\bar{u} - \bar{v} = \frac{c}{2}\ln\frac{c + u'}{c - u'} = \frac{c}{2}\ln\left(\frac{c+u}{c-u}\cdot\frac{c-v}{c+v}\right) = \frac{c}{2}\ln\{u, v\,|\,{-c}, c\}, \tag{4.1.16}$$
where the argument of the last logarithm is the cross-ratio. We cannot overstress the fact that whereas hyperbolic velocities are absolute, the absolute constant being the velocity of light, their Euclidean counterparts are relative. And whereas the former are strictly additive, the latter satisfy the relativistic velocity composition law. Multiple longitudinal Doppler shifts give the cross-ratio so that its logarithm, (4.1.16), is a true measure of hyperbolic distance. The velocity of light determines the absolute limit of the Euclidean velocity while it determines the unit of measurement with respect to its hyperbolic measure. To say that the meter is the distance traveled by light in vacuum during a time interval of $(1/2.9979)\times 10^{-8}$ of a second is a convention, and depends on how a second is defined. If all meter sticks and cesium clocks were to disappear from the face of the Earth, there would be no way for us to define a meter nor measure a second. We can thus say that velocities are relative in Euclidean space. Angles, on the other hand, are absolute, and, in non-Euclidean geometries, a triangle is determined by its three angles. This is completely foreign to Euclidean geometry. The angle between the relative velocities $\mathbf{u}_1$ and $\mathbf{u}_2$ is
$$\cos(u_1u_2) = \frac{\mathbf{u}_1\cdot\mathbf{u}_2}{u_1u_2}.$$
If we put our system on a platform moving at constant velocity $\mathbf{v}$, the expression for the angle becomes
$$\cos\theta = \frac{(\mathbf{u}_1 - \mathbf{v})\cdot(\mathbf{u}_2 - \mathbf{v}) - (\mathbf{u}_1\times\mathbf{v})\cdot(\mathbf{u}_2\times\mathbf{v})/c^2}{\sqrt{(\mathbf{u}_1 - \mathbf{v})^2 - (\mathbf{u}_1\times\mathbf{v})^2/c^2}\;\sqrt{(\mathbf{u}_2 - \mathbf{v})^2 - (\mathbf{u}_2\times\mathbf{v})^2/c^2}}, \tag{4.1.17}$$
where $\theta$ is the angle at $\mathbf{v}$ in the triangle formed from the three vertices, $\mathbf{v}$, $\mathbf{u}_1$, and $\mathbf{u}_2$. Expression (4.1.17) is the cosine of the angle in a hyperbolic triangle. In Chapter 8, we will see that the relative velocity is related to the corresponding segment of a hyperbolic straight line by $|u| = c\tanh(|\bar{u}|/c)$. The relative velocity between $u_1$ and $u_2$ will be equal to the difference in the lengths $\bar{u}_2$ and $\bar{u}_1$. With $\bar{u}_2 > \bar{u}_1$, this distance is:
$$v = c\tanh\frac{\bar{u}_2 - \bar{u}_1}{c} = c\,\frac{\tanh(\bar{u}_2/c) - \tanh(\bar{u}_1/c)}{1 - \tanh(\bar{u}_1/c)\tanh(\bar{u}_2/c)},$$
which is none other than the relativistic law for the composition of velocities,
$$v = \frac{u_2 - u_1}{1 - u_1u_2/c^2},$$
as was first pointed out by Arnold Sommerfeld in 1909. The fact that $\bar{u} \to \infty$ implies $u \to c$ means that $c$ loses its primacy in hyperbolic space, and is relegated to an absolutely determined constant, analogous to the radius of curvature, whose numerical value will depend on the choice of units.
Tolman asserted the necessity of an experimental test to decide between Ritz’s hypothesis, that to an observer the velocity of emission depends on its source, and Einstein’s claim that it does not. Ehrenfest’s intervention brought with it an amusing paradox in that the believers in the aether would have to join ranks with the relativists in supporting Einstein’s hypothesis. The hopes of the emission proponents were (temporarily) dashed by de Sitter’s observations of the light emitted from eclipsing binary stars. If the velocity of light depended additively on the velocity of the source, the time for light to reach Earth from an approaching star would be smaller than that from the receding member of the doublet. From the laws of mechanics, de Sitter concluded that the effect, if it existed, would introduce a spurious eccentricity into the orbit. No such peculiarities were observed. However, the propagation of light through a medium, no matter how rare it is, involves a continual process of absorption and reemission of light as secondary radiation. This would have the effect of erasing any ‘memory’ that light has of the original source. This phenomenon, referred
to as extinction and presented as a theorem of Ewald and Oseen, gives light traversing a dispersive medium the velocity characteristic of that medium after a single ‘extinction length.’ In the case of interstellar space, the dispersive medium would be a permeating gas, and this invalidates the conclusion de Sitter drew from the lack of peculiarities. Consider the passage of light through a medium of index of refraction, $\eta$. The speed of light in such a medium would be $c/\eta$, where $\eta > 1$. If the source were moving at a speed $u$ in the direction away from an observer, Ritz would contend that the observer measures the speed of light, $c' = c/\eta + u$, where $u$ is in the line of sight between the source and the observer, while Einstein would contend that they are separate velocities, and that they ‘add’ according to the velocity addition law,

$$c' = \frac{c/\eta + u}{1 + u/c\eta}.$$

Assuming slow motion, $u \ll c/\eta$, the denominator can be expanded in powers of $u/c$, and to first-order Einstein obtained

$$c' \approx \frac{c}{\eta} + \left(1 - \frac{1}{\eta^2}\right)u.$$

This is precisely Fresnel’s result who, as we saw in Sec. 3.1, used it to explain the partial ‘dragging’ of light by the medium. It is the main reason why Einstein could not believe in emission theories, for they would invalidate the composition law of velocities. This law shows that no matter how you combine velocities, and no matter how many you take in the combination, the velocity of light can never be exceeded. This is, however, valid only in Euclidean space.
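As a rough numerical illustration of the comparison just made, the following Python sketch evaluates Ritz’s additive rule, the composition law, and the first-order (Fresnel) expression side by side. The function names, the refractive index (roughly that of water) and the flow speed are illustrative assumptions, not values taken from the text.

```python
# Sketch: light in a medium of index eta whose source/medium moves at speed u.
# All names and sample values below are assumptions made for illustration.

C = 299_792_458.0  # speed of light in vacuo, m/s


def ritz_speed(eta, u):
    """Ritz's additive rule: c' = c/eta + u."""
    return C / eta + u


def einstein_speed(eta, u):
    """Composition law: c' = (c/eta + u) / (1 + u/(c*eta))."""
    w = C / eta
    return (w + u) / (1.0 + w * u / C**2)


def fresnel_speed(eta, u):
    """First-order expansion: c' ~ c/eta + (1 - 1/eta**2) * u."""
    return C / eta + (1.0 - 1.0 / eta**2) * u


if __name__ == "__main__":
    eta = 1.333  # assumed refractive index
    u = 7.0      # assumed speed in m/s
    print("Ritz     :", ritz_speed(eta, u))
    print("Einstein :", einstein_speed(eta, u))
    print("Fresnel  :", fresnel_speed(eta, u))
    # With eta = 1 the composition law returns c itself, whatever u is.
    print("eta = 1  :", einstein_speed(1.0, u))
```

For speeds of this size the composition law and the Fresnel expression differ only at second order in $u/c$, while Ritz’s rule differs from both by about $u/\eta^2$.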
4.2 Relativistic Mass
It is fair to say that Einstein made Lorentz’s theory of the electron “applicable to ‘material points’ without charge” [O’Rahilly 38]. Even Abraham was intent on generalizing all electromagnetic results into those valid for all
particles. Referring to the expressions for the transverse and longitudinal masses he claims that these formulas agree with those deduced for the Lorentz electron. We have derived them here, without making any assumption whatever concerning the configuration or the charge-distribution, solely from the theorem on the momentum of the energy stream. . . . The same equations must hold for the ‘particle’ of elementary mechanics.

Abraham assumes that $W = mc^2$ holds for any radiation process whatsoever. The associated momentum is $G = W/c$ if the particle is traveling at the speed of light, or $G = W/u$ if it is traveling at a speed $u < c$. Then the force is

$$F = \frac{dG}{dt} = \frac{d}{dt}(mu),$$

by definition. The increment in the work necessary to keep the system in motion can only come from a change in the energy,

$$dW = c^2\,dm = Fu\,dt = u^2\,dm + mu\,du,$$

or upon rearranging, $dm/m = u\,du/(c^2 - u^2)$. Integrating leads to

$$m = \frac{m_0}{\sqrt{(c^2 - u^2)}},$$

where $m_0$ is the constant of integration (it equals $c$ times the mass at $u = 0$), as the expression for the transverse mass at constant speed, as we have derived it in Sec. 1.1.3. This derivation was even accepted by Lenard, an arch enemy of relativity, who attributed it to Hasenöhrl. Over the years there have been raging controversies over the validity of the original ‘proof’ of Einstein’s mass–energy equivalence. There is a general consensus that the criticism of Einstein’s [35] proof rests with Ives [52], but Einstein himself found it necessary to offer another proof of the famous relation some thirty years after the original one. And that one was not so original because Poincaré came up with it five years before Einstein, as we saw in Sec. 1.2.2.2. On the strength of Ives’s criticism, Jammer [61] writes: It is a curious incident in the history of scientific thought that Einstein’s own derivation of the formula $E = mc^2$, as published in his article in the Annalen der Physik,
was basically fallacious. In fact, what for the layman is known as “the most famous mathematical formula ever projected,” in science was but the result of a petitio principii. . .
Arzeliès [66] uses Einstein’s derivation as a bulwark for other seedy relations in physics when he states: It might appear rather piquant that a relation of this importance was introduced into physics by this expedient. In fact, it is just one example among others of the slight importance of logic in physical research.
Let us now turn to what Einstein did, or rather did not do.
4.2.1 Gedanken experiments
Einstein [05] considers a body at rest which emits a ‘light wave’ of total energy $L$ into two opposite directions of equal magnitude. The body, which certainly cannot be an electron, conserves momentum but loses energy. If it had energy $E_0$ before the emission took place and $E_1$ after, the conservation of energy demands $L = E_0 - E_1$. Now consider, says Einstein, the same process in another inertial system moving at speed $u$ relative to the former. The same emission will lead to the energy conservation

$$\frac{L}{\sqrt{(1 - u^2/c^2)}} = H_0 - H_1,$$

where $H_0$ and $H_1$ are the energies before and after radiation. This result, he claims, follows from the law of transformation of energy from one inertial frame to another that he derived previously. Subtracting the former from the latter gives

$$(H_0 - E_0) - (H_1 - E_1) = L\left[\frac{1}{\sqrt{(1 - u^2/c^2)}} - 1\right]. \tag{4.2.1}$$

Einstein continues: H and E are the energy values of the same body, referred to two coordinate systems in motion relative to each other, in one of which . . . the body is at rest. It is therefore clear that the difference H − E can differ from the kinetic energy K of the body with respect to the other system . . . only by an additive constant C which depends on the choice of the arbitrary additive constants in the energies H and E.
Thus he sets

$$H_0 - E_0 = K_0 + C, \qquad H_1 - E_1 = K_1 + C, \tag{4.2.2}$$

where the constant $C$ will not be altered by the emission of radiation. This is obvious, for, otherwise, it would not be an arbitrary constant. Then replacing the energy difference in (4.2.1) by these expressions he comes out with

$$K_0 - K_1 = L\left[\frac{1}{\sqrt{(1 - u^2/c^2)}} - 1\right],$$

where $K_0$ and $K_1$ are “the initial and final kinetic energies of the body with respect to the inertial frame in which its velocity is u” [Stachel & Torretti 82]. However, if $u$ is the relative velocity of frame 1 with respect to frame 0, it follows that $K_0 = 0$, for the latter is at rest with respect to the former. What Ives [52] does is to get rid of the difference on the right-hand side of (4.2.1) by introducing the expression for the difference in the kinetic energies,

$$K_i = m_i c^2\left[\frac{1}{\sqrt{(1 - u^2/c^2)}} - 1\right], \qquad i = 0, 1, \tag{4.2.3}$$

thereby obtaining

$$(H_0 - E_0) - (H_1 - E_1) = \frac{L}{(m_0 - m_1)c^2}\,(K_0 - K_1).$$

But, if $K_0 = 0$ it follows that $m_0$ will not appear because what multiplies it in (4.2.3) is zero! Notwithstanding, Ives separates this equation into two equations

$$H_0 - E_0 = \frac{L}{(m_0 - m_1)c^2}\,(K_0 + C), \qquad H_1 - E_1 = \frac{L}{(m_0 - m_1)c^2}\,(K_1 + C), \tag{4.2.4}$$
and claims that they are not (4.2.3) because they differ from them by the multiplicative factor, L/(m0 − m1 )c2 . So, claims Ives, they will become them when we set the multiplicative factor equal to 1. This is the petitio principii, or the begging of the question, to which Jammer refers. Ives concludes: “The relation [E = mc2 ] was not derived by Einstein.” Although the latter is correct, it was not because Ives pin-pointed the error.
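The algebra at stake can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the right-hand side of (4.2.1) as given, reads the kinetic energy in the Newtonian way for slow motion, and evaluates the multiplicative factor $L/(m_0 - m_1)c^2$ that Ives sets equal to 1. The function names and sample numbers are hypothetical, and the code does not adjudicate the dispute.

```python
import math

C = 299_792_458.0  # speed of light in vacuo, m/s


def gamma_minus_one(u):
    """1/sqrt(1 - u^2/c^2) - 1, written so that it stays accurate for small u."""
    b2 = (u / C) ** 2
    s = math.sqrt(1.0 - b2)
    return b2 / (s * (1.0 + s))


def kinetic_energy_drop(L, u):
    """Right-hand side of (4.2.1): K0 - K1 = L * (gamma - 1)."""
    return L * gamma_minus_one(u)


def implied_mass_drop(L, u):
    """Mass decrease obtained by reading K0 - K1 as (1/2) * dm * u^2 (slow motion)."""
    return 2.0 * kinetic_energy_drop(L, u) / u**2


def ives_factor(L, m0, m1):
    """The multiplicative factor L / ((m0 - m1) c^2) appearing in (4.2.4)."""
    return L / ((m0 - m1) * C**2)


if __name__ == "__main__":
    L = 1.0    # assumed radiated energy, joules
    u = 300.0  # assumed slow relative speed, m/s
    print("implied mass drop:", implied_mass_drop(L, u))
    print("L / c^2          :", L / C**2)
    # If the mass drop is L/c^2, the factor in (4.2.4) is unity.
    L_big = 0.01 * C**2  # chosen large enough to be resolved in floating point
    print("Ives factor      :", ives_factor(L_big, 1.0, 1.0 - L_big / C**2))
```

The factor equals 1 exactly when the mass decrease is $L/c^2$, which is the relation to be proved; that is the circularity referred to above.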
The criticism [Riseman & Young 53] lodged against (4.2.4) is that the two equations are not independent. Stated slightly differently, the first equation knows ahead of time that there is going to be an emission of energy leading to a decrease in mass.$^{c}$ Even worse, as we have already mentioned, no motion in O means no $K_0$ and, therefore, the mass $m_0$ will not appear in the equation so that there is no decrease in mass on account of the radiation. Stachel and Torretti [82] emphasize that “though ‘clear’ to Einstein, their validity [referring to (4.2.3)] has not been quite so evident to others.” They hope to remedy the situation by the following considerations. Consider, they say, a body in a rest frame whose internal state is characterized by a set of state parameters, $S$. By the ‘relativity principle’ it must be possible to have the same state in motion with the same state parameters, for, otherwise, we could distinguish motion from rest, or have absolute motion simply by noting a change in $S$. Thus, the energy $E$ can only be a function of the velocity and set of state parameters, $E = E(u, S)$. Now comes the crux of their argument: The kinetic energy of the body, by definition, is equal to the work necessary to bring the body from the state of rest to uniform motion with velocity u. But, by conservation of energy, this must be equal to the difference between its energy for the state S and speed u and its energy for the same internal state when at rest: $K = E(u, S) - E(0, S)$.

They then deduce that

$$E(u, S_0) - E(0, S_0) - [E(u, S_1) - E(0, S_1)] = K(u, S_0) - K(u, S_1) = L\left[\frac{1}{\sqrt{(1 - u^2/c^2)}} - 1\right].$$

We beg to differ with the statement that the “kinetic energy, by definition, is equal to the work. . .” since in order for the system to be brought from a state of rest into one of uniform motion it will have to undergo acceleration, i.e. a change in its velocity. So it is not the kinetic energy Stachel and Torretti are referring to, but, rather, the work, $G \cdot u$, necessary to keep the system in a state of uniform motion, regardless of the manner in which it got there, i.e. by accelerating it.

$^{c}$ This sounds like an advanced potential.
This is to say that the differences in energies of the two states are

$$E(u, S_0) - E(0, S_0) = G_0 u, \qquad E(u, S_1) - E(0, S_1) = G_1 u,$$

and subtracting the latter from the former gives

$$E(u, S_0) - E(0, S_0) - [E(u, S_1) - E(0, S_1)] = (G_0 - G_1)u = \frac{(m_0 - m_1)u^2}{\sqrt{(1 - u^2/c^2)}}.$$

This is certainly not the difference in the kinetic energies of the two states; in particular, for $u/c \ll 1$ it reduces to twice the difference in kinetic energies. According to Jammer, after ‘correctly’ proving (4.2.1), Einstein “mistakenly put” $H - E$ equal to the kinetic energy. Stachel and Torretti rejoin by saying “And yet it is hard to see what else one could mean by kinetic energy of a body with internal state S and speed u.” Rather, the difference between the total energy, what they refer to as $E$, and the internal energy, which can depend only on $S$, is the work necessary to keep the system in uniform motion [Lavenda 02]. The authors then claim that Einstein had proved (4.2.3) for the kinetic energy of an ‘electron,’ i.e. a charged structureless particle in his first relativity paper; but he studiously avoided using it in the derivation of the mass–energy equivalence, even though. . . it would have simplified his task.
Surely, Stachel and Torretti know that bringing the body from a state of rest to uniform motion involves acceleration, and accelerating electrons radiate; this would carry us way beyond the limits of special relativity.
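The claim that the work $(G_0 - G_1)u$ reduces to twice the kinetic-energy difference at low speed can be made concrete. A minimal sketch, assuming the kinetic energy is taken in the form (4.2.3) and writing $\beta = u/c$ (the function name is hypothetical):

```python
import math


def work_to_kinetic_ratio(beta):
    """Ratio of (m0 - m1) u^2 / sqrt(1 - beta^2) to (m0 - m1) c^2 (gamma - 1).

    The common factor (m0 - m1) c^2 cancels, so only beta = u/c enters.
    Valid for 0 < beta < 1.
    """
    s = math.sqrt(1.0 - beta**2)
    gamma = 1.0 / s
    gamma_minus_one = beta**2 / (s * (1.0 + s))  # stable form of gamma - 1
    return beta**2 * gamma / gamma_minus_one


if __name__ == "__main__":
    for beta in (1e-3, 0.1, 0.6, 0.99):
        print(beta, work_to_kinetic_ratio(beta))
```

Algebraically the ratio is $1 + \sqrt{1 - \beta^2}$, which tends to 2 for slow motion and to 1 as $u \to c$, so the two quantities never coincide for $0 < u < c$.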
4.2.2 From Weber to Einstein
The birth of electrodynamics came in 1826 with the publication of Ampère’s memoir on the interaction of two small currents of electricity. Ampère claimed that the force acting between two elements of current was not just proportional to the inverse square of their distance, but, also to the angles which these small elements made with the line connecting their centers. In Ampère’s own words: it is no longer contradictory to admit that from the actions proportional to the inverse square of the distance which each molecule exerts, there can result between
two elements of conducting wires a force which depends not only on their distance but also on the direction of the two elements. . .
Ampère’s memoir caught the eye of Gauss who, in 1835, came to the conclusion that “two elements of electricity in relative motion repel or attract one another differently when in motion and when in relative rest.” We have already quoted Gauss’s letter to his assistant Weber, in Sec. 4.1, where he expressed his view that these interactions are not instantaneous, but propagated “as with the case of light.” At the beginning of the nineteenth century discoveries were seemingly unconnected, for as Fechner wrote in 1845, “Faraday’s phenomena of induction and the electrodynamic phenomena of Ampère have been related only by an empirical rule.” According to Fechner, and subsequently Weber, all electric currents are currents of convection; that is, they are due to the motion of electricity. The corpuscular interpretation of electricity was elaborated upon by Weber himself. He accepted Fechner’s hypothesis that both positive and negative elements of electricity move at equal velocities in opposite directions. On the strength of Coulomb’s law these interactions should cancel out and there should be no motion. But, this contradicted the fundamental experiments conducted by Ampère who demonstrated beyond any doubt that a motion is produced between wires. Therefore, there must be a force not contained in Coulomb’s law. Weber went on to derive a force existing between charged particles that took into account their relative velocities as well as accelerations. If mass is basically electromagnetic in nature, the velocities contribute to the force, and not to any variation of mass with speed. Weber’s ideas held sway on the continent largely due to the authority of Helmholtz. However, Helmholtz found what he believed to be a flaw in Weber’s force law: it could lead to infinite work arising from finite motion of electrical particles. 
Weber replied that in order for this to happen, particles would have to move at enormous speeds surpassing his constant c, the only parameter that appears in his equation. When the speed of the particles reaches this value, the force between the electrical particles vanishes. In the Weber–Kohlrausch experiment, performed in 1854, the constant was determined to have the same value as the product of the speed of light, in vacuo, with the square root of two. Riemann who was present during
the experiment was impressed by the very profound connection between electricity in motion and light. In England a different sort of synthesis was brewing. Maxwell attempted an amalgamation between a tentative explanation of electrical actions through mechanical properties of the aether, and a purely phenomenological description in terms of two fields that satisfied a set of partial differential equations. Gone were the forces by which electrical charges interact whether in motion or at rest. Through the introduction of the displacement current, as an auxiliary means by which charges move, Maxwell was able to close his set of equations into a single wave equation with a velocity of propagation equal to that of light. It took thirty years from Maxwell’s first publication “On Faraday’s lines of force” in 1856 to get even a hearing for his theory on the continent. However, the experiments of Hertz clinched the success of Maxwell’s theory. Maxwell’s theory, or Hertz’s generalization to bodies in motion, does not agree with the well-known optical phenomenon of aberration and experiments like the Fizeau drag and the Eichenwald experiment on the magnetic field produced by the rotation of a dielectric in an electric field. Another step was needed and it was provided by Lorentz who brought together two seemingly disconnected laws into a single force law. In doing so, Lorentz filled the huge abyss between Maxwell’s field equations and the mechanical theories that saw forces resulting from attraction and repulsion, and from motion. It is commonly accepted that Einstein [49] tore down the scaffolding of the Maxwell–Lorentz theory and replaced it by two postulates, which in his own words are: The insight fundamental for the special theory of relativity is this: The assumptions relativity and light speed invariance are compatible if relations of a new type (“Lorentz transformation”) are postulated for the conversion of coordinates and times of events...
The universal principle of the special theory of relativity is contained in the postulate: The laws of physics are invariant with respect to Lorentz transformations (for the transition from one inertial system to any other arbitrarily chosen inertial system). This is a restricting principle for natural laws.
So Einstein accepted the continuous field concept encapsulated in Maxwell’s equations, and, with them, the existence of an absolute velocity.
Both Maxwell’s equations and the Lorentz force are time-symmetric, and, therefore, cannot explain phenomena involving radiation which is a clearly irreversible process. This was Ritz’s contention which was formulated into the “Ritz–Einstein Agreement to Disagree.” Ritz argued that Maxwell’s equations do not discriminate between retarded and advanced potentials, and thus cannot explain phenomena involving radiation. Einstein, wishing to preserve the central role of Maxwell’s equations, insisted that Maxwell’s equations can be solved, in principle, using the end state, instead of the initial state, with the aid of the advanced potential. According to Lanczos [74] “Ritz took strong objection to this view, and Einstein admitted his mistake.” This is rather curious since neither author admitted making a mistake. It could be that Lanczos was referring to Einstein’s opinion in his later years. Einstein goes on to sustain that irreversibility is a statistical effect like fluctuations in blackbody radiation and Brownian motion. However, there is nothing statistical about solving a partial differential equation, and this shows Einstein’s tenacious effort to maintain the unadulterated status of Maxwell’s equations and a constant, limiting speed. However, with age, Einstein had growing disillusionments that differential equations are the correct setting for a unified theory. To Pais [82] he remarked “he was not sure whether differential geometry was to be the right framework for further progress,” while to Besso he wrote the year before his death, “I consider it quite possible that physics cannot be based upon the field concept, i.e. on continuous structures. In that case, nothing remains of my entire castle in the air, gravitation theory included, [and of] the rest of modern physics.”
4.2.3 Maxwell on Gauss and Weber
In the last chapter of his treatise, Maxwell [91] compares his notion of the transmission of radiant energy from one particle to another with those of Gauss and Weber. He titles his chapter “Theories of Action at a Distance,” which is a forewarning of the critical attitude he is to take. Typical texts on electricity and magnetism use the 1820 formulation of Biot and Savart, while Gauss and Weber based their theory on that given by Ampère in 1825. The two expressions differ in their prediction of the
force acting between two current elements in an open circuit, but coincide in a closed circuit because all angle dependent terms disappear. If $ds$ and $ds'$ are two lengths of wire located at a distance $r$ apart, and carrying currents $I$ and $I'$, respectively, then Ampère showed that the force exerted on $ds$ by $ds'$ is

$$\kappa'\,\frac{II'\,ds\,ds'}{r^2}\left(2\cos\varepsilon - 3\cos\varphi\cdot\cos\varphi'\right), \tag{4.2.5}$$

where $\varepsilon$ is the angle between the two elements, $ds$ and $ds'$, and $\varphi$ and $\varphi'$ are the angles formed between the radial vector $r$ connecting them and their orientations, as shown in Fig. 4.2. The constant $\kappa'$ is determined by the units. Another constant, $\kappa$, again determined by the units, appears in Coulomb’s law for the force acting between two charges $e$ and $e'$ at a distance $r$ apart,

$$\kappa\,\frac{ee'}{r^2}. \tag{4.2.6}$$

Fig. 4.2. Orientation of two circuit elements $ds$ and $ds'$.

Once one of the constants is chosen to define a unit of charge, the other must be related to the speed of light, $\sqrt{(\kappa/\kappa')}$. This was determined by Weber in 1852, who fixed the constant at $\sqrt{2}$ times larger than the speed of light. The idea is to consider only the relative motion of the two particles.$^{d}$ If $v$ and $v'$ are their speeds, then the square of their relative speed is

$$u^2 = v^2 - 2vv'\cos\varepsilon + v'^2.$$

$^{d}$ The true relativists belonged to the nineteenth century, not the twentieth!
And the rate of change of the distance between the two particles is

$$\dot{r} = \frac{dr}{dt} = v\,\frac{\partial r}{\partial s} + v'\,\frac{\partial r}{\partial s'}.$$

To get rid of the absolute square of the velocities, recourse was made to Fechner’s hypothesis, in which the electric current consists of positively and negatively charged particles traveling in opposite directions with equal speeds. Introducing these expressions into an equivalent form of Ampère’s law,

$$-\frac{ee'}{r^2}\left(\frac{\partial r}{\partial s}\frac{\partial r}{\partial s'} - 2r\,\frac{\partial^2 r}{\partial s\,\partial s'}\right), \tag{4.2.7}$$

with charges replacing current elements, Gauss and Weber finally came out with

$$\frac{ee'}{r^2}\left[1 + \frac{1}{c^2}\left(u^2 - \frac{3}{2}\dot{r}^2\right)\right], \tag{4.2.8}$$

$$\frac{ee'}{r^2}\left[1 + \frac{1}{c^2}\left(r\ddot{r} - \frac{1}{2}\dot{r}^2\right)\right]. \tag{4.2.9}$$

Both expressions reduce to Coulomb’s law in the static limit. The constant $c$ appearing in these equations is $\sqrt{(\kappa/\kappa')}$. Maxwell attributes the first of these expressions, (4.2.8), to Gauss, found posthumously in his notebooks dated July 1835, and the second to Weber, published in 1867, under the title “Determinations of electrodynamic measure.” According to Gauss, “two elements of electricity in a state of relative motion attract or repel one another, but not in the same way as if they are in a state of relative rest.” Actually, both expressions (4.2.8) and (4.2.9) appear on the same page in Gauss’s notebooks, and probably Maxwell wanted to distinguish the two expressions by giving them two different names, since the latter is what appears in Weber’s book. But, it is on (4.2.8) that he based his view of the non-instantaneous, ballistic transmission of energy, and it is the one which Ritz will re-derive in the absence of accelerations and for a specific value of his parameter, $\lambda = -1$. So, as O’Rahilly [38] remarks, Ritz’s paternity is directly traceable to Gauss. We will come to Ritz’s emission theory in the next section, but, first we want to see what Maxwell had to
say about (4.2.8) and (4.2.9), and how he ultimately distanced himself from them. While Maxwell admits that the two expressions give the same mechanical force between two currents, and, in this sense are identical to Ampère’s law, (4.2.7), they differ when considered as laws of nature. In particular, do they obey the conservation of energy? Here, Maxwell bases himself on conventional wisdom by stating that only those forces that act between particles are a function of distance only, and not upon “the time, or the velocities of the particles, [for which] the proof would not hold.” This rings of Tait’s earlier claim near the start of Sec. 4.1. Maxwell concludes that a law of electrical action, involving the velocity of the particles, has sometimes been supposed to be inconsistent with the principle of the conservation of energy.
In reality, Maxwell is not willing to go that far, and, in the next paragraph, says that only Gauss’s formula, (4.2.8), is inconsistent with the conservation of energy “and therefore must be abandoned.” What, supposedly, saves Weber’s force from the same fate is that it is derivable from a potential,

$$L = \frac{ee'}{r}\left(1 - \frac{1}{2c^2}\,\dot{r}^2\right). \tag{4.2.10}$$
The work involved in moving the particle from the beginning to the end of any path segment is $\psi_1 - \psi_0$, where $\psi$ is Maxwell’s symbol for the potential (4.2.10). Now, says Maxwell, $\psi$ depends only on the distance $r$, and its rate of change, $\dot{r}$, so that when a particle is moved in a closed path, the potential will be the same as when it started. That is, “no work will be done on the whole during the cycle of operations.” Notwithstanding this, Maxwell goes on to cite Helmholtz’s criticism that Weber’s force can become the seat of a perpetual motion machine by the fact that two electrified particles, which move according to Weber’s law, may have at first finite velocities, and yet, while still at a finite distance from each other, they may acquire an infinite kinetic energy, and may perform an infinite amount of work.
So it appears that Sadi Carnot’s use of perpetual motion to outlaw certain forms of cyclic motion lasted well into the nineteenth century. But, here we are not talking about a machine that can operate over a cycle and still have energy to burn, but one of instantaneous motion. Weber’s rebuttal consisted in saying that any such particle would have a velocity greater
than c, and that the distances involved where the particles’ energies would be infinite would be so small as to go unperceived. The former would have been a sufficient answer to Helmholtz, but the latter gave Helmholtz space to maneuver. Consider, says Helmholtz, a non-conducting sphere of radius, $a$, and a uniform surface charge, $\sigma$. A particle whose mass is $m$ and carrying a charge $e$ moving at a speed $v$ will have a potential, according to Weber, given by

$$4\pi\sigma ae\left(1 - \frac{v^2}{6c^2}\right),$$

which is completely independent of the position of the particle within the sphere. Adding this to the kinetic energy, $\tfrac{1}{2}mv^2$, of the particle, and to whatever other potential energies may be present, $V$, the conservation of energy requires

$$\frac{1}{2}\left(m - \frac{4}{3}\frac{\pi\sigma ae}{c^2}\right)v^2 + 4\pi\sigma ae + V = \text{const.} \tag{4.2.11}$$

Interestingly, a mass term of the form of the second term in (4.2.11), but with opposite sign, was found by Thomson in 1881 for a slowly moving charge, as we will see in Sec. 5.9. But, with the sign as indicated in the formula, Helmholtz argued that the second term in the coefficient of $v^2$ could be increased indefinitely by increasing $a$, while keeping the surface density, $\sigma$, constant, so that the coefficient of $v^2$ can become negative. A negative kinetic energy could further be made more negative by frictional terms, which ordinarily oppose the motion thereby decreasing the kinetic energy, and would thereby lead to a perpetual motion machine when carried over a cycle. Any potential which introduces a negative sign in the coefficient of $v^2$ would lead to the same conclusion. Now, Maxwell insists that Weber’s law is “consistent with the principle of the conservation of energy insofar that a potential exists,” while Gauss’s law, (4.2.8), or what was supposedly attributed to him, does not. However, introducing [O’Rahilly 38, p. 525]

$$\dot{r} = v_r - v'_r = u_r, \qquad \ddot{r} = \frac{d}{dt}\sum_x \frac{(v_x - v'_x)(x - x')}{r} = \frac{(u^2 - u_r^2)}{r} + a_r - a'_r,$$
since

$$u^2 = \sum_x (v_x - v'_x)^2, \qquad v_r - v'_r = \sum_x \frac{(v_x - v'_x)(x - x')}{r}, \qquad a_r - a'_r = \sum_x \frac{(\dot{v}_x - \dot{v}'_x)(x - x')}{r},$$
into Weber’s formula, (4.2.9), gives precisely Gauss’s law, (4.2.8), when the accelerations, $a_r$ and $a'_r$, are omitted! So, the whole argument lodged against Gauss’s formula, (4.2.8), is entirely unjustified. To make matters worse, Helmholtz criticized Weber’s formula, (4.2.9), precisely due to the presence of acceleration, which does not apply to Gauss’s (4.2.8). Thus Maxwell cannot maintain that since Gauss’s law is inconsistent with the principle of the conservation of energy it will not explain all the laws of induction, and, hence, is unacceptable. To aggravate matters further, there is a sign error in (4.2.10), which should read [cf. (4.1.5)]

$$L = \frac{ee'}{r}\left(1 + \frac{u_r^2}{2c^2}\right). \tag{4.2.12}$$

Now, we are free to choose $r$ and $u_r$ as the independent variables, as well as $x$ and $v_x$. In the former case we get

$$F_r = -\frac{\partial L}{\partial r} + \frac{d}{dt}\frac{\partial L}{\partial u_r}, \tag{4.2.13}$$

which is Weber’s law, (4.2.9), or Gauss’s, (4.2.8), if the accelerations are negligible. In the second case, we must bear in mind that $r$ is not independent of $x$, i.e.

$$\begin{aligned}
\frac{\partial u_r}{\partial x} &= \frac{\partial}{\partial x}\,\frac{(v_x - v'_x)(x - x')}{r} &\qquad (4.2.14)\\
&= \frac{v_x - v'_x}{r} - \frac{(v_x - v'_x)(x - x')}{r^2}\cos(rx). &\qquad (4.2.15)
\end{aligned}$$

We thus get the force in the x-direction,

$$F_x = -\frac{\partial L}{\partial x} + \frac{d}{dt}\frac{\partial L}{\partial v_x},$$

which introduces a $\cos(rx)$ into the force law, reducing to (4.2.12) when $r$ lies along the x-axis.
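The two statements made above, that Weber’s bracket (4.2.9) collapses to Gauss’s (4.2.8) when accelerations are omitted, and that (4.2.13) applied to (4.2.12) reproduces Weber’s law, can be checked numerically. The sketch below assumes uniform relative motion, units in which $ee' = k$, and a central finite difference for the time derivative; all function names and sample vectors are illustrative assumptions.

```python
import math


def radial_kinematics(R, U, t=0.0):
    """Return (r, r_dot) for relative position R + U*t with constant relative velocity U."""
    pos = [Ri + Ui * t for Ri, Ui in zip(R, U)]
    r = math.sqrt(sum(p * p for p in pos))
    r_dot = sum(p * Ui for p, Ui in zip(pos, U)) / r
    return r, r_dot


def weber_force(k, c, r, r_dot, r_ddot):
    """Expression (4.2.9): (k/r^2) [1 + (r r_ddot - r_dot^2 / 2) / c^2]."""
    return k / r**2 * (1.0 + (r * r_ddot - 0.5 * r_dot**2) / c**2)


def gauss_force(k, c, r, r_dot, u_sq):
    """Expression (4.2.8): (k/r^2) [1 + (u^2 - 3 r_dot^2 / 2) / c^2]."""
    return k / r**2 * (1.0 + (u_sq - 1.5 * r_dot**2) / c**2)


def lagrangian_force(k, c, R, U, h=1e-4):
    """F_r = -dL/dr + d/dt (dL/du_r) for L = (k/r)(1 + u_r^2 / (2 c^2)), as in (4.2.13).

    r and u_r are treated as independent in the partial derivatives; the total
    time derivative is approximated by a central difference of step h.
    """
    r, r_dot = radial_kinematics(R, U)
    minus_dL_dr = k / r**2 * (1.0 + r_dot**2 / (2.0 * c**2))

    def dL_du(t):
        rt, rdt = radial_kinematics(R, U, t)
        return k * rdt / (rt * c**2)

    return minus_dL_dr + (dL_du(h) - dL_du(-h)) / (2.0 * h)


if __name__ == "__main__":
    k, c = 1.0, 1.0            # assumed units
    R = (1.0, 2.0, -0.5)       # assumed relative position
    U = (0.3, -0.2, 0.4)       # assumed constant relative velocity
    r, r_dot = radial_kinematics(R, U)
    u_sq = sum(x * x for x in U)
    r_ddot = (u_sq - r_dot**2) / r  # no accelerations: a_r = a'_r = 0
    print("Weber      :", weber_force(k, c, r, r_dot, r_ddot))
    print("Gauss      :", gauss_force(k, c, r, r_dot, u_sq))
    print("Lagrangian :", lagrangian_force(k, c, R, U))
```

With accelerations switched off the first two numbers agree to rounding, and the third agrees with them up to the finite-difference error.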
Consequently, we must change the negative sign to a positive one in (4.2.11). We must now inquire into what it means. Recalling the definition of the electrokinetic potential, (4.1.2) or $L = e(\phi - (u_r/c)A_r)$, the induced emf is

$$\mathcal{E} = \frac{d}{dt}\frac{\partial L}{\partial u_r}\,dr. \tag{4.2.16}$$

Introducing Weber’s expression for the electrokinetic potential, setting $e = e'$, and considering a spherical conductor of radius $3a$ we come out with

$$\mathcal{E} = \frac{e^2}{3a}\,\frac{u_r^2}{c^2}. \tag{4.2.17}$$

This particular choice allows us to identify it with Thomson’s expression in Sec. 5.9 for a medium of unit permittivity. There will be an additional contribution to the kinetic energy, in which the coefficient of the squared velocity is now

$$\frac{1}{2}\left(m + \frac{2}{3}\,\frac{e^2}{ac^2}\right). \tag{4.2.18}$$

Now, Thomson [20] attributes the increase in the mass to the magnetic field surrounding the charge which has been created by its motion. Here, we appreciate it as due to magnetic induction, which is a term of order $u_r^2/c^2$, that cannot be neglected. So, it is, in fact, in April of 1872 that Helmholtz, unwittingly, discovered the inertia of energy due to a circulating electric charge. Hindsight has indeed twenty-twenty vision! Interestingly, the criticism lodged against Weber by Helmholtz can be used against Clausius’s expression,

$$L = \frac{ee'}{r}\left(1 - v_x v'_x/c^2\right),$$

because, in the words of Maxwell, “This impossible result [acceleration causing a decrease in the kinetic energy] is a necessary consequence of assuming any formula for the potential which introduces negative terms into the coefficient of $v^2$.” But, it cannot be used against Riemann, whose expression is

$$L = \frac{ee'}{r}\left(1 + u^2/c^2\right),$$

and whose formulation Clausius so viciously attacked.
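The sign argument running through this section can be put in numbers. The sketch below is an illustration only: it evaluates the coefficient of $v^2$ in (4.2.11), the radius at which that coefficient changes sign, and the coefficient in (4.2.18). The function names and sample values are assumptions, and the units are arbitrary.

```python
import math


def helmholtz_effective_mass(m, sigma, a, e, c):
    """Twice the coefficient of v^2 in (4.2.11): m - (4/3) pi sigma a e / c^2."""
    return m - 4.0 * math.pi * sigma * a * e / (3.0 * c**2)


def critical_radius(m, sigma, e, c):
    """Sphere radius at which the coefficient in (4.2.11) vanishes."""
    return 3.0 * m * c**2 / (4.0 * math.pi * sigma * e)


def induction_effective_mass(m, e, a, c):
    """Twice the coefficient in (4.2.18): m + (2/3) e^2 / (a c^2)."""
    return m + 2.0 * e**2 / (3.0 * a * c**2)


if __name__ == "__main__":
    m, sigma, e, c = 1.0, 2.0, 3.0, 4.0  # arbitrary illustrative values
    a_star = critical_radius(m, sigma, e, c)
    print("critical radius      :", a_star)
    print("coefficient at a*    :", helmholtz_effective_mass(m, sigma, a_star, e, c))
    print("coefficient at 2 a*  :", helmholtz_effective_mass(m, sigma, 2 * a_star, e, c))
    print("with the sign changed:", induction_effective_mass(m, e, 0.5, c))
```

Beyond the critical radius the coefficient in (4.2.11) is negative, which is Helmholtz’s objection; with the positive sign of (4.2.18) the effective mass can only exceed $m$.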
Maxwell tells us:$^{e}$ The mathematical investigation given by Riemann has been examined by Clausius, who does not admit the soundness of the mathematical processes, and shews that the hypothesis that potential is propagated like light does not lead either to the formula of Weber, or to the known laws of electrodynamics.

What Riemann did was to derive phenomena related to induction from a modified form of Poisson’s law,

$$\nabla^2 V + 4\pi\rho = \frac{1}{\alpha^2}\frac{\partial^2 V}{\partial t^2},$$
where $V$ is the electrostatic potential, $\rho$, the charge density, and $\alpha$ is the velocity of propagation. Riemann was coming too close for comfort, and the only exception Maxwell could take is that he avoids “making explicit mention of any medium through which the propagation takes place.” To modern relativists, this is no exception at all, and it smacks of the petty differences drawn to distinguish Poincaré’s principle of relativity from Einstein’s. Coming closer to the true motivation behind Maxwell’s criticism, Maxwell refers to the 1845 letter of Gauss to Weber in which he considers the action between charged particles not to be instantaneous (action at a distance), but, rather, “propagated in time, in a similar manner to that of light.” A finite-time propagation mechanism would undoubtedly involve wave propagation, but Gauss probably did not think it was that simple, given the complicated angular dependencies of Ampère’s law, (4.2.5). But, Maxwell found it suitable for his needs to claim that Clausius showed that the “potential is propagated like light does not lead either to the formula of Weber, or to the known laws of electrodynamics.” Be that as it may, the exact same criticism can be leveled against Maxwell’s potentials and fields! Maxwell, however, did not take either Weber’s force or Ampère’s law as a criterion that had to be fulfilled. This is clearly seen by the following question posed by Maxwell.

$^{e}$ Riemann presented his article to the Royal Society of Göttingen in 1858 but had to withdraw it because of Clausius’s opposition to its publication. It was later published in the Annalen, posthumously in 1867. This is another example of unjust and biased refereeing where Clausius missed his mark.
“If something is transmitted from one particle to another at a distance,” queries Maxwell, “what is its condition after it has left the one particle and before it has reached the other? If this something is in the form of potential energy how then does it exist in the intervening time, after it left one particle and before it reached its destination?” According to Maxwell, there ought to be a medium which can house and transport this energy, and it is on this medium that we must focus our attention. And, “this has been my constant aim in this treatise.” We have seen in Sec. 3.8.1 that this type of reasoning led Maxwell to drop the whole idea of finding an explanation of gravity as a field effect, and witnessed in Sec. 3.8.2 that Ritz’s approach was far superior in that it could explain all the deviations from Newtonian theory known at that time. What was so fruitful to Maxwell in the derivation of his equations became a deterrent for further progress.
4.2.4 Ritz’s electrodynamic theory of emission
4.2.4.1 Absolute versus relative velocities
What is not intuitive at all is Einstein’s second postulate, which says that all observers will always measure the same speed of light regardless of their motion. It abolishes the parallelogram addition of velocities and replaces it by a composition law which makes sure that the speed of light is never exceeded no matter how many velocities are added together. Einstein needed this composition law because it explained the first-order correction of the Fizeau experiment [cf. Sec. 3.1], but Ritz is quoted as saying “it would be deplorable for our economy of thought if we had to accept such complications.” Walther Ritz was a young Swiss physicist who is best noted for his combination principle, and the Rayleigh–Ritz perturbation technique. As we have seen, he did much more in predicting the advance of the perihelion of Mercury [Sec. 3.8.2] from his force equation, and he formulated the only serious alternative to special relativity and Maxwell’s electrodynamics. Ritz [08] asked the question “Do [Maxwell’s] equations really deserve such extreme confidence?” His immediate response was “The answer to this question is decisively no.” His most damning criticism was that Maxwell’s field equations admit an infinite number of solutions, many of
which are unphysical. To eliminate such solutions we must invoke retarded potentials. And if we are to deal directly with the scalar and vector potentials, what good are the field equations which determine the evolution of the electric and magnetic fields? Moreover, we never observe the fields themselves, but, rather, deduce them from the (Lorentz) force that acts upon charged particles. This force is reversible in time: when we reverse the velocities we also have to reverse the magnetic field. However, radiation is clearly an irreversible phenomenon, so Maxwell’s equations together with Lorentz’s force are unable to cope with the real nature of irreversible radiative phenomena. We will see that Ritz obtained the self-reaction of the electron upon itself when it radiates without the support of either Maxwell or Lorentz.
Ritz assumed that all charged particles constantly emit ‘fictitious’ charged particles which are infinitely small. If the charges are in motion then the velocity of these particles would be the vector sum of the velocity of the charges, at the instant of emission, and the velocity of light. Ritz insisted that his was not a true theory, but only one example where Lorentz invariance is not part and parcel of every relativistic phenomenon. Oddly enough, Lorentz invariance, or better the invariance of the cross-ratio, is at the very heart of the additivity of the velocities when the velocities are hyperbolic ones.
Consider a circle of radius c. The center of the circle is located at the origin and let us calculate the cross-ratio for the collinear points (−c, 0, u, c). As we know from Sec. 2.2.4 it is given by
$$\bar{u} = \frac{c}{2}\ln\left(\frac{c+u}{c}\cdot\frac{c}{c-u}\right) = \frac{c}{2}\ln\frac{c+u}{c-u} = \frac{c}{2}\ln\{0, u|c, -c\},$$
where c is the absolute constant which determines the scale. The Maclaurin expansion of the logarithm shows that for small velocities $\bar{u} = u$, while at large velocities $\bar{u} \to \infty$ as $u \to c$.
Now if we want to add two velocities, $\bar{u}$ and $\bar{v}$, we get
$$\bar{u} + \bar{v} = \frac{c}{2}\ln\left(\frac{c+u}{c-u}\cdot\frac{c+v}{c-v}\right) = \frac{c}{2}\ln\frac{c+w}{c-w}, \tag{4.2.19}$$
where
$$w = \frac{u+v}{1 + uv/c^2},$$
which is precisely the relativistic composition law of velocities. And it is precisely this law that guarantees the additivity of their hyperbolic counterparts! In other words, Lorentz invariance in Euclidean space guarantees the additivity of the velocities in hyperbolic space. Whereas Euclidean velocities cannot exceed c, there are no restrictions placed upon hyperbolic velocities, which are, in fact, additive. Thus, the violation of the parallelogram law for the addition of (Euclidean) relativistic velocities should be an indication of the hyperbolic nature of the velocity space.
Ritz died in 1909 and his ‘ballistic,’ or c + u, model of radiation seems to have met a similar fate in 1913 when de Sitter’s binary star observations failed to show the c + u effect. Basically, de Sitter argued that if the velocity of light emitted by a binary star were additive with the velocity at which the star is moving then certain effects should be observable (which he did not observe). In other words, there would be an interaction between the light that is emitted at the slower velocity, c − u, when the side of the orbit is moving further away from the observer and the faster velocity, c + u, one-half orbit later when it is approaching the observer. At the point where the faster light overtakes the slower light emitted one-half orbit earlier, the light from the star should be observed in two different parts of its orbit simultaneously [Fox 62]. If $\ell$ is the distance between the star and the observer when it is receding from him, the time it will take for slow light to arrive is
$$t_s = \frac{\ell}{c-u},$$
while the time it takes for fast light to reach the observer is
$$t_f = \tau + \frac{\ell}{c+u},$$
where τ is the time for the star to complete one-half of its orbit. At the exact time where $t_s = t_f$, the time to complete one-half of its orbit will be
$$\tau = \ell\left(\frac{1}{c-u} - \frac{1}{c+u}\right) = \frac{2u\ell}{c^2 - u^2},$$
which is the same reasoning used in the Michelson–Morley experiment, in Sec. 3.2, except there the sum, (3.2.1), instead of the difference was taken. This is further evidence that the Michelson–Morley interpretation is a ballistic one. From the above equation one solves for the distance $\ell$, and obtains approximately $\ell \simeq \tau c^2/2u$. The speed of light is the absolute constant of hyperbolic velocity space, and it makes no sense to add to it the hyperbolic velocity $\bar{u}$. We can, however, consider a Euclidean velocity c/η, where η is the index of refraction of the medium that light is propagating in. In fact, the so-called ‘extinction theorem’ is an argument in favor of Ritz’s ballistic theory. The theorem, supposedly put forth by Ewald and Oseen [Born & Wolf 59], shows how an external electromagnetic disturbance traveling with the velocity of light in vacuum is exactly canceled out and replaced in a substance by the secondary disturbance traveling with an appropriately smaller speed.
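The overtaking argument above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the ballistic hypothesis (light speeds c − u and c + u) and purely hypothetical values for the orbital speed and half period; the function names are illustrative and not taken from the text.

```python
# Sketch of de Sitter's overtaking argument under the ballistic (c +/- u) hypothesis.
C = 299_792.458  # speed of light in km/s

def arrival_times(dist_km, u, half_period_s):
    """Arrival times of 'slow' light (emitted while receding) and of
    'fast' light (emitted half an orbit later, while approaching)."""
    t_slow = dist_km / (C - u)
    t_fast = half_period_s + dist_km / (C + u)
    return t_slow, t_fast

def overtaking_distance(u, half_period_s):
    """Distance at which t_slow == t_fast: exact value and the u << c approximation."""
    exact = half_period_s * (C**2 - u**2) / (2.0 * u)
    approx = half_period_s * C**2 / (2.0 * u)
    return exact, approx

u = 30.0            # hypothetical orbital speed, km/s
tau = 5 * 86400.0   # hypothetical half period: 5 days, in seconds
exact, approx = overtaking_distance(u, tau)
t_s, t_f = arrival_times(exact, u, tau)
print(f"overtaking distance: {exact:.6e} km (approx. {approx:.6e} km)")
print(f"t_slow - t_fast at that distance: {t_s - t_f:.3e} s")
```

For u much smaller than c the exact and approximate distances differ only by a relative amount of order u²/c².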
Terrestrial extinction occurs on the surface of a dipole field, within a thin surface layer having an extinction length of $10^{-4}$ cm, whereas interstellar extinction lengths due to gaseous envelopes surrounding binary stars are still small in comparison to the distance, $\tau c^2/2u$. Granted there is no Doppler shift for c + u because c is the absolute constant. But, suppose that the emitter traveling at speed u is Doppler-shifted along with the emitted radiation in a medium whose index of refraction is η. Setting $c/\eta = c'$, Ritz’s addition law would be
$$\bar{c}' + \bar{u} = \frac{c}{2}\ln\frac{c+u'}{c-u'} = \frac{c}{2}\ln\left(\frac{(c+u)}{(c-u)}\cdot\frac{(\eta+1)}{(\eta-1)}\right), \tag{4.2.20}$$
where
$$u' = \frac{c/\eta + u}{1 + u/\eta c}.$$
Although $c > u' > c'$, the left-hand side of (4.2.20) can be of unlimited magnitude, the closer u is to c, or the closer η is to 1. Many criticisms have been lodged against emission theories, notably by Pauli [58]. Pauli questions whether one measures a change in frequency or a change in wavelength, since emission theory does not require their product to be constant, i.e. λν = c. If a star approaching the Earth at a speed u emits radiation at a frequency ν, it will be seen by an earthling at frequency $\nu' = \nu(1 + u/c)$, but with no change in wavelength. According to
Ritz this is due to the fact that the emitter, moving at constant speed, emits concentric spherical wavefronts each one of which is centered on the source as it moves. Thus, the wavefronts will diverge with no interference. Rather, if there is an intervening medium present which re-emits radiation at a speed c, then there will be a decrease in wavelength by the amount λ/(1 + u/c). A change in the speed of light would also occur when light from outer space enters the atmosphere with an index of refraction different from unity. The extra-terrestrial Michelson experiment was performed by Tomaschek in 1924 giving the same null result found in the terrestrial experiment. This provided “telling evidence against the Ritz theory” [Fox 65]. But, as we know from Sec. 3.2, the explanation of the null experiment rests on the supposition that light is traveling at two different speeds, c+u if the emitter is approaching the observer and c − u if it is receding from the observer. Pauli also criticized emission theories in that if a light wave were traveling at velocity c+u, it could not interact with particles scattering radiation at a velocity c. Fox [65] rebuffs this criticism on the basis that the interaction can only occur when their frequencies, and not their velocities, are equal. Scattering of radiation will indeed change velocities of the waves, and there will be a tendency to ‘localize’ their velocities about c with an obliteration of their phase differences in much the same way that the extinction theory predicts.
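The additivity of hyperbolic velocities expressed by (4.2.19), and its use with a Euclidean velocity c/η in (4.2.20), can be verified numerically. This is a minimal sketch in units where c = 1; the function names and the sample values of u, v and η are assumptions chosen only for illustration.

```python
import math

def hyperbolic(u, c=1.0):
    """Hyperbolic measure of a Euclidean velocity u: (c/2) ln((c+u)/(c-u))."""
    return 0.5 * c * math.log((c + u) / (c - u))

def compose(u, v, c=1.0):
    """Relativistic composition law w = (u + v) / (1 + u v / c^2)."""
    return (u + v) / (1.0 + u * v / c**2)

# Two sample velocities (fractions of c): their hyperbolic measures add.
u, v = 0.6, 0.7
w = compose(u, v)
lhs = hyperbolic(u) + hyperbolic(v)
rhs = hyperbolic(w)
print(f"w = {w:.6f}, u_bar + v_bar = {lhs:.12f}, w_bar = {rhs:.12f}")

# Light in a medium of index eta combined with an emitter moving at speed u.
eta = 1.5
c_medium = 1.0 / eta
u_prime = compose(c_medium, u)
print(f"c' = {c_medium:.6f} < u' = {u_prime:.6f} < c = 1")
```

The composed velocity w never exceeds c, while the sum of the hyperbolic measures grows without bound as u or v approaches c.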
4.3 Radiation by an Accelerating Electron
4.3.1 What does the radiation reaction force measure?
The radiation reaction force,
$$F_{rad} = \frac{2e^2}{3c^3}\gamma^2\left[\ddot{u} + \frac{\gamma^2}{c^2}u(u\cdot\ddot{u}) + 3\frac{\gamma^2}{c^2}\dot{u}(u\cdot\dot{u}) + 3\frac{\gamma^4}{c^4}u(u\cdot\dot{u})^2\right], \tag{4.3.1}$$
was first derived by Abraham [05] in 1905, by taking the time-derivative of his ‘electromagnetic momentum,’ and deduced directly by Schott [12], who called it a ‘radiation pressure.’ Schott “dismissed it very briefly” by claiming that

It is a small quantity of the order zero, and depends only on the magnitude of the charge and its mean motion, not at all on its configuration nor on the relative motion of its parts. It is not difficult to prove that the expression for [$F_{rad}$] which is
in question, can be obtained by means of the Lorentz–Einstein transformation from the well known expression for the radiation pressure on an electric charge e, which is vibrating with high frequency but small amplitude, so that its velocity is always very small while its acceleration is finite. The radiation pressure on such a charge is equal to
$$\frac{2e^2}{3c^3}\frac{d^3 r}{dt^3},$$
where r denotes its radius vector and t the time. If we regard this system as moving relative to a fixed system with velocity v, considered as constant for the time being, and transformed by the method of Lorentz and Einstein, we obtain the expression [(4.3.1)] for [$F_{rad}$], for a charge e moving with the velocity v, now regarded as variable.
The transformation from an inertial frame to a non-inertial one after a Lorentz transform has been performed is, indeed, miraculous. We prefer to refer to (4.3.1) as a radiation reaction force rather than as a radiation pressure for it represents the self-interaction of the electron caused by its own radiation.
The radiation reaction force, (4.3.1), can also be derived by an expansion in inverse powers of c either from the Liénard, (4.1.8), or the Ritz, (4.1.7), expression for the force. The first term in (4.3.1) is the force exerted by a charge on itself, which is independent of the dimensions of the body, and represents a sort of ‘friction’ due to loss of energy. The radiation force (4.3.1), unlike the Lorentz force, (4.1.11), is clearly not invariant under time-reversal.
However, if we go back to the original source [Abraham 05], we see that Abraham began with the Liénard expression for the rate of loss of energy due to radiation,
$$\int_{t_1}^{t_2} dt\,\dot{W}_{rad} = -\frac{2}{3}\frac{e^2}{c^3}\int_{t_1}^{t_2} dt\,\gamma^2\left\{\dot{u}^2\gamma^2 + (u\cdot\dot{u})^2\gamma^4/c^2\right\}, \tag{4.3.2}$$
and observed that the reaction force was $u/c^2$ times this quantity, i.e.
$$\int_{t_1}^{t_2} dt\,F_{rad} = -\frac{2}{3}\frac{e^2}{c^5}\int_{t_1}^{t_2} dt\,\gamma^2 u\left\{\dot{u}^2\gamma^2 + (u\cdot\dot{u})^2\gamma^4/c^2\right\}. \tag{4.3.3}$$
Then considering the time intervals of (4.3.2) and (4.3.3) to be short, and observing the following integrations by parts,
$$\int_{t_1}^{t_2} dt\,\gamma^2\ddot{u} = \dot{u}\gamma^2\Big|_{t_1}^{t_2} - \int_{t_1}^{t_2} dt\,2\gamma^4\dot{u}(u\cdot\dot{u})/c^2,$$
$$\int_{t_1}^{t_2} dt\,\gamma^4 u(u\cdot\ddot{u})/c^2 = u(u\cdot\dot{u})\frac{\gamma^4}{c^2}\Big|_{t_1}^{t_2} - \int_{t_1}^{t_2} dt\,\frac{\gamma^4}{c^2}\left\{\dot{u}^2 u + \dot{u}(u\cdot\dot{u}) + 4(u\cdot\dot{u})^2 u\gamma^2/c^2\right\},$$
because
$$\frac{d\gamma}{dt} = \gamma^3\frac{u\cdot\dot{u}}{c^2}.$$
Then assuming uniform acceleration, he sums the two sides to obtain
$$\int_{t_1}^{t_2} dt\,\gamma^2\left[\ddot{u} + u(u\cdot\ddot{u})\gamma^2/c^2\right] = -\int_{t_1}^{t_2} dt\,\frac{\gamma^4}{c^2}\left[u\dot{u}^2 + 3\dot{u}(u\cdot\dot{u}) + 4u(u\cdot\dot{u})^2\frac{\gamma^2}{c^2}\right].$$
If the reaction radiation force is given by (4.3.3), Abraham gets back (4.3.1). But, how is it possible that the integrand of (4.3.3) equals (4.3.1)? It does not! What is equal to (4.3.1) is
$$F_{rad} = \frac{2e^2}{3c^3}\frac{d}{dt}\left[\dot{u}\gamma^2 + u(u\cdot\dot{u})\frac{\gamma^4}{c^2}\right] + \frac{u}{c^2}\dot{W}_{rad}. \tag{4.3.4}$$
However, under the condition of uniform acceleration the terms involving the total derivative vanish, so this cannot be right. Actually, under the condition of uniform acceleration the entire reaction radiation force, (4.3.1), vanishes!
To this already confusing situation Rohrlich [65] adds more. He defines the “energy rate of radiation by a charge,” R, to be the negative of (4.3.2). However, the rate of energy loss by radiation is not (4.3.2), but, rather
$$\dot{W}_{rad} = \frac{2e^2}{3c^3}\left\{\frac{d}{dt}\left[(u\cdot\dot{u})\gamma^4\right] - \left[\dot{u}^2\gamma^4 + (u\cdot\dot{u})^2\frac{\gamma^6}{c^2}\right]\right\}. \tag{4.3.5}$$
Supposedly, the terms in the total differential, when evaluated over the ends of the time interval, will vanish under the assumption of uniform acceleration. But, under this condition (4.3.5) vanishes altogether.
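The claim that (4.3.1) equals the total derivative plus $u\dot{W}_{rad}/c^2$ in (4.3.4), and not the integrand of (4.3.3) alone, can be tested numerically. The sketch below works in units with c = 1 and drops the common factor $2e^2/3c^3$; the trajectory in velocity space is a hypothetical smooth function chosen only so that its derivatives are known in closed form, and the total derivative is approximated by a central difference.

```python
import math

C = 1.0  # units with c = 1 (assumption for this sketch)

def kinematics(t):
    """Hypothetical smooth velocity u(t) with its first and second derivatives."""
    u = (0.3 * math.sin(t), 0.2 * math.cos(2 * t), 0.1 * t)
    ud = (0.3 * math.cos(t), -0.4 * math.sin(2 * t), 0.1)
    udd = (-0.3 * math.sin(t), -0.8 * math.cos(2 * t), 0.0)
    return u, ud, udd

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def gamma2(u):
    return 1.0 / (1.0 - dot(u, u) / C**2)

def abraham_bracket(t):
    """Right-hand side of (4.3.1) without the factor 2e^2/3c^3."""
    u, ud, udd = kinematics(t)
    g2 = gamma2(u)
    return tuple(
        g2 * (udd[i]
              + g2 * u[i] * dot(u, udd) / C**2
              + 3 * g2 * ud[i] * dot(u, ud) / C**2
              + 3 * g2**2 * u[i] * dot(u, ud)**2 / C**4)
        for i in range(3))

def total_derivative_argument(t):
    """Vector inside d/dt in (4.3.4)."""
    u, ud, _ = kinematics(t)
    g2 = gamma2(u)
    return tuple(ud[i] * g2 + u[i] * dot(u, ud) * g2**2 / C**2 for i in range(3))

def lienard_rate(t):
    """Integrand of (4.3.2) without the factor 2e^2/3c^3 (a negative number)."""
    u, ud, _ = kinematics(t)
    g2 = gamma2(u)
    return -g2 * (dot(ud, ud) * g2 + dot(u, ud)**2 * g2**2 / C**2)

def rhs_434(t, h=1e-5):
    """Right-hand side of (4.3.4), with d/dt taken as a central difference."""
    u, _, _ = kinematics(t)
    kp, km = total_derivative_argument(t + h), total_derivative_argument(t - h)
    w = lienard_rate(t)
    return tuple((kp[i] - km[i]) / (2 * h) + u[i] * w / C**2 for i in range(3))

t0 = 0.7
lhs = abraham_bracket(t0)
rhs = rhs_434(t0)
u0, _, _ = kinematics(t0)
integrand_433 = tuple(x * lienard_rate(t0) / C**2 for x in u0)
max_diff = max(abs(a - b) for a, b in zip(lhs, rhs))
gap_433 = max(abs(a - b) for a, b in zip(lhs, integrand_433))
print(f"(4.3.1) vs (4.3.4): max difference {max_diff:.2e}")
print(f"(4.3.1) vs integrand of (4.3.3): max difference {gap_433:.2e}")
```

The first difference is at the level of the finite-difference error, while the second is of order one for a generic motion.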
Rohrlich considers R to be “a Lorentz invariant and constitutes the relativistic generalization of the famous nonrelativistic Larmor formula, $(2e^2/3c^3)\dot{u}^2$,” probably not realizing that it was derived by Liénard before the advent of relativity. He then goes on to say that

one$^{f}$ is tempted to identify the Abraham four-vector [(4.3.1) in vector notation] with the radiation reaction. This, however, leads to various difficulties: it is possible that at a particular instant [$\dot{u} = 0$] but [$\ddot{u} \neq 0$]; this results in no radiation emission, R = 0, but a nonvanishing “radiation reaction,” [(4.3.1)]. Conversely, it is possible that [(4.3.1) = 0] but that radiation is being emitted, $R \neq 0$. This is the case whenever
$$\ddot{u}^{\mu} - \frac{1}{c^2}\dot{u}_{\nu}\dot{u}^{\nu}u^{\mu} = 0, \qquad \dot{u}_{\nu}\dot{u}^{\nu} \neq 0. \tag{4.3.6}$$
We recognize this equation as the condition for uniform acceleration. Thus, in uniformly accelerated motion radiation is emitted while the ‘radiation reaction’ [$F_{rad}$] vanishes. For these reasons the interpretation of [$F_{rad}$] as a radiation reaction force is to be rejected. Obviously, the radiation reaction $-Ru^{\mu}$ vanishes if and only if no radiation is emitted, R = 0.
However, if (4.3.1) vanishes, so, too, will (4.3.5) vanish, and this does not appear to be compatible with the fact that energy is being radiated! Such a radiation loss would surely affect the particle’s motion, but it is nowhere to be found in the Lorentz–Dirac equation [Rohrlich 90, p. 171, formula preceding (6–80)], making that equation extremely dubious to the point where its existence is called into question. This would eliminate, at one stroke, problems related to pre-acceleration, lack of causality, and self-accelerating or run-away solutions.
Denoting $u^{\nu}$ as the velocity four-vector, Abraham’s radiation reaction force (4.3.1) can be written as
$$F^{\mu} = \frac{2e^2}{3c^3}\left(\ddot{u}^{\mu} - \dot{u}_{\nu}\dot{u}^{\nu}u^{\mu}\right) = \frac{2e^2}{3c^3}\left[\ddot{u}\gamma^2 + \frac{3}{2}\frac{d\gamma^2}{dt}\dot{u} + \left(3\dot{\gamma}^2 + (u\cdot\ddot{u})\frac{\gamma^4}{c^2}\right)u\right], \tag{4.3.7}$$
which is the same as (4.3.1), where [Corben 68]
$$\dot{u}_{\nu}\dot{u}^{\nu} = R = \gamma^4\left[\dot{u}^2 + (u\cdot\dot{u})^2\frac{\gamma^2}{c^2}\right]. \tag{4.3.8}$$

f The term $u^{\mu}$ is a velocity four-vector, see (4.3.8) below.
Then, in order for (4.3.7) to be the radiation reaction force (4.3.1), it is necessary that
$$\ddot{u}^{\mu} = \frac{d}{dt}\left[\dot{u}\gamma^2 + u(u\cdot\dot{u})\frac{\gamma^4}{c^2}\right], \tag{4.3.9}$$
and the condition that this term equal Liénard’s radiation rate is precisely the vanishing of (4.3.1). But, it is not Liénard’s radiation rate that remains constant in time [Rohrlich 90, p. 121 formula (5–42) and p. 169 formula (6–114)], but, rather, $\gamma^{-2}$ times that expression, or the Beltrami metric, (4.3.16) below.
With the rate of energy loss being given by
$$\dot{W}_{rad} = \frac{2e^2}{3c^3}\left[c^2\gamma\ddot{\gamma} - \gamma^4\dot{u}^2\right] = \frac{2e^2}{3c^3}\left[c^2\frac{d}{dt}(\gamma\dot{\gamma}) - \dot{u}_{\nu}\dot{u}^{\nu}\right], \tag{4.3.10}$$
and $\dot{G}$ being identified as the radiation reaction force, (4.3.1), we would have for small radiation damping and assumed periodic motion at a constant angular velocity ω [Corben 68]
$$\dot{W} = -\frac{2e^2}{3c^3}\omega^2 u^2\gamma^4, \qquad \dot{G} = -\frac{2e^2}{3c^3}\omega^2 u\gamma^4.$$
These equations imply,
$$u\cdot\dot{G} = \dot{W},$$
which is a mechanical relation, rather than a radiative one. This equation is not of the form
$$F_{rad} = \dot{G} = u\dot{W}_{rad}/c^2 = \dot{m}u, \tag{4.3.11}$$
as the analogy with special relativity would lead us to believe with m as the mass equivalent of radiation. Thus, it makes no sense to consider
$$m\dot{u} = F_{rad} + \cdots \tag{4.3.12}$$
as an equation of motion for the electron where the dots indicate terms like the Lorentz force. To see this, it suffices to consider uniform acceleration where $F_{rad} = 0$, and non-constant, $\dot{G}$. Equation (4.3.12) is often referred to
as the Lorentz–Dirac equation, which is a third-order equation, manifesting unphysical, ‘run-away,’ solutions, pre-acceleration, and the like.
We can, however, identify the acceleration in (4.3.2) with the Lorentz force, in which case it becomes [Landau & Lifshitz 75]
$$\dot{W}_{rad} = -\frac{2e^2}{3m^2c^3}\gamma^2\left[\left(E + \frac{u}{c}\times H\right)^2 + (u\cdot E)^2\frac{\gamma^2}{c^2}\right].$$
The radiation reaction force, according to Abraham, would be $u/c^2$ times this value. Hence, there is no logic to adding the Lorentz force onto an equation which already contains it.
4.3.2 Constant rate of energy loss in hyperbolic velocity space
There is a much more elegant, and intuitive, way to proceed. We want to generalize Larmor’s formula,
$$\dot{W}_{rad} = -\frac{2}{3}\frac{e^2}{m^2c^3}\left(\frac{dG}{dt}\cdot\frac{dG}{dt}\right), \tag{4.3.13}$$
where G = mu. The Lorentz-invariant generalization of (4.3.13) is [Jackson 75]
$$\dot{W}_{rad} = -\frac{2}{3}\frac{e^2}{m^2c^3}\left(\frac{dG_{\mu}}{d\tau}\frac{dG^{\mu}}{d\tau}\right), \tag{4.3.14}$$
where dτ = dt/γ is the proper time element, $G^{\mu}$ is the charged particle’s momentum–energy vector, and the four-vector scalar product is
$$-\frac{dG_{\mu}}{d\tau}\frac{dG^{\mu}}{d\tau} = \left(\frac{dG}{d\tau}\right)^2 - \frac{1}{c^2}\left(\frac{dW}{d\tau}\right)^2.$$
Introducing $W = \gamma mc^2$ and $G = \gamma mu$ into the four-vector scalar product gives Liénard’s expression
$$\dot{W}_{rad} = \frac{2}{3}\frac{e^2}{c^3}\gamma^6\left[\dot{u}^2 - (u\times\dot{u})^2/c^2\right],$$
or, equivalently,
$$\frac{2}{3}\frac{e^2}{c^3}\gamma^4\left[\dot{u}^2 + (\gamma^2/c^2)(u\cdot\dot{u})^2\right]. \tag{4.3.15}$$
Now the remarkable thing about (4.3.15) is that if we evaluate the time-derivatives in the co-moving frame using proper time,
$$\dot{W}_{rad} = \frac{2}{3}\frac{e^2}{c^3}\gamma^2\left[\left(\frac{du}{d\tau}\right)^2 + \gamma^2\left(u\cdot\frac{du}{d\tau}\right)^2/c^2\right], \tag{4.3.16}$$
we clearly see that it is none other than the Beltrami metric! So the Beltrami metric is the rate of energy loss in a frame which is moving with the electron. It is apparent that the rate of energy loss will be decreased in the co-moving frame by an amount $\gamma^{-2}$. It is also prodigious that the Beltrami metric,
$$\left(\frac{ds}{d\tau}\right)^2 = \frac{\gamma^2}{c^2}\left[\dot{u}^2 + (u\cdot\dot{u})^2\frac{\gamma^2}{c^2}\right], \tag{4.3.17}$$
is an electrokinetic potential from which the Euler–Lagrange equations can be derived. Taking the positive square root of (4.3.17) we obtain the shortest arc length of a curve traced out by the system between times $\tau_1$ and $\tau_2$ from the action principle
$$s = c\int_{\tau_1}^{\tau_2}\sqrt{(2L)}\,d\tau, \tag{4.3.18}$$
where
$$L = \frac{\gamma^2}{2c^2}\left[\dot{u}^2 + (u\cdot\dot{u})^2\frac{\gamma^2}{c^2}\right], \tag{4.3.19}$$
and the dot will now stand for the proper time-derivative. The shortest curve that is traced out in the time interval between $\tau_1$ and $\tau_2$ will be that for which the variation of the integrand, (4.3.18), vanishes. The Lagrangian, (4.3.19), bears an uncanny similarity to Liénard’s expression for the rate of energy loss due to radiation, (4.3.2). In fact, it is just $\gamma^{-2}$ times smaller!
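The algebra connecting the two forms of Liénard’s expression with the Beltrami form can be spot-checked numerically. The following is a minimal sketch in units with c = 1, with the common factor $2e^2/3c^3$ omitted; the velocity and acceleration vectors are arbitrary sample values, and the function names are illustrative only.

```python
import math

C = 1.0  # units with c = 1 (assumption for this sketch)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def gamma2(u):
    return 1.0 / (1.0 - dot(u, u) / C**2)

def lienard_cross_form(u, ud):
    """gamma^6 [ud^2 - (u x ud)^2 / c^2]."""
    uxud = cross(u, ud)
    return gamma2(u)**3 * (dot(ud, ud) - dot(uxud, uxud) / C**2)

def lienard_dot_form(u, ud):
    """gamma^4 [ud^2 + gamma^2 (u . ud)^2 / c^2], the bracket of (4.3.15)."""
    g2 = gamma2(u)
    return g2**2 * (dot(ud, ud) + g2 * dot(u, ud)**2 / C**2)

def beltrami_form(u, deriv):
    """gamma^2 [deriv^2 + gamma^2 (u . deriv)^2 / c^2], the bracket of (4.3.16)."""
    g2 = gamma2(u)
    return g2 * (dot(deriv, deriv) + g2 * dot(u, deriv)**2 / C**2)

u = (0.3, -0.4, 0.5)      # sample velocity, |u| < c
ud = (0.1, 0.2, -0.05)    # sample coordinate-time acceleration du/dt
gamma = math.sqrt(gamma2(u))
du_dtau = tuple(gamma * x for x in ud)  # du/dtau = gamma du/dt

form_a = lienard_cross_form(u, ud)
form_b = lienard_dot_form(u, ud)
with_proper_time = beltrami_form(u, du_dtau)
same_numbers = beltrami_form(u, ud)
print(form_a, form_b, with_proper_time, same_numbers * gamma**2)
```

All four printed numbers agree: the cross-product and dot-product forms coincide, inserting du/dτ = γ du/dt into the Beltrami form reproduces (4.3.15), and the Beltrami form evaluated with the same numerical derivatives is smaller by the factor $\gamma^{-2}$.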
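The Lagrangian, (4.3.19), is a second-order homogeneous function of the accelerations,$^{g}$
$$\dot{u}\cdot\frac{\partial L}{\partial\dot{u}} = 2L,$$
and we will now see that the Euler–Lagrange equations require that it be a first integral of the motion. The variational equations (in velocity space!) are
$$\frac{1}{\sqrt{(2L)}}\frac{\partial L}{\partial u} - \frac{d}{d\tau}\left(\frac{1}{\sqrt{(2L)}}\frac{\partial L}{\partial\dot{u}}\right) = 0. \tag{4.3.20}$$
The condition that (4.3.20) becomes the Euler–Lagrange equations is that L must be an integral of the motion. That is to say, the condition that dL/dτ = 0 is
$$\ddot{u} = -2\frac{(u\cdot\dot{u})}{c^2}\gamma^2\dot{u}. \tag{4.3.21}$$
For then (4.3.20) becomes
$$\frac{d}{d\tau}\frac{\partial L}{\partial\dot{u}} - \frac{\partial L}{\partial u} = 0,$$
which is
$$\gamma^2\left[\ddot{u} + \left(\frac{\gamma}{c}\right)^2(u\cdot\ddot{u})u + 2\left(\frac{\gamma}{c}\right)^2(u\cdot\dot{u})\dot{u} + 2\left(\frac{\gamma}{c}\right)^4(u\cdot\dot{u})^2u\right] = 0. \tag{4.3.22}$$
The variational equations (4.3.22) will not be satisfied if there is an external force, $F_{ext}$, acting on the system. In the co-moving frame, $\gamma F_{ext}$ must be added to the right-hand side of (4.3.22), viz.
$$F_{ext} = \alpha\gamma\left[\ddot{u} + \left(\frac{\gamma}{c}\right)^2(u\cdot\ddot{u})u + 2\left(\frac{\gamma}{c}\right)^2(u\cdot\dot{u})\dot{u} + 2\left(\frac{\gamma}{c}\right)^4(u\cdot\dot{u})^2u\right]$$
$$\phantom{F_{ext}} = \alpha\left\{\frac{d}{d\tau}\left[\dot{u}\gamma + (u\cdot\dot{u})u\gamma^3/c^2\right] - \frac{\gamma^3}{c^2}\left[\dot{u}^2 + (u\cdot\dot{u})^2\frac{\gamma^2}{c^2}\right]u\right\}, \tag{4.3.23}$$
where α is a constant of proportionality, yet to be determined.

g ∂L/∂u is a symbolic representation of the vector whose components are the derivatives of L with respect to the corresponding components of u.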
Integrating (4.3.23) over a proper time interval, the first term vanishes leaving
$$\int_{\tau_1}^{\tau_2} F_{ext}\,d\tau = -\alpha\int_{\tau_1}^{\tau_2} d\tau\,\frac{\gamma^3}{c^2}\left[\dot{u}^2 + (u\cdot\dot{u})^2\frac{\gamma^2}{c^2}\right]u. \tag{4.3.24}$$
If we set $\alpha = 2e^2/3c^3$, (4.3.24) is the Abraham radiation force, (4.3.3), in the co-moving frame. Introducing the acceleration [Fock 59],
$$a = \dot{u}\gamma^2, \tag{4.3.25}$$
we can write the Euler–Lagrange equations in the form
$$\gamma F_{rad} = \frac{2e^2}{3c^3}\left[\dot{a} + (u\cdot\dot{a})u\frac{\gamma^2}{c^2}\right]. \tag{4.3.26}$$
In a state of uniform acceleration, a = const., and in a frame co-moving with the radiating electron the radiation reaction force, (4.3.26), vanishes. There will be a constant rate of energy loss,
$$\dot{W}_{rad} = -\frac{2e^2}{3c}\,2L = -\frac{2e^2}{3c^3}\gamma^2\left[\dot{u}^2 + (u\cdot\dot{u})^2\frac{\gamma^2}{c^2}\right] = \text{const.}, \tag{4.3.27}$$
in the co-moving frame which is $\gamma^{-2}$ times smaller than Liénard’s formula, (4.3.15), which is what a stationary observer would measure. We will see in Sec. 9.6 that the Beltrami metric is the metric for a uniformly rotating disc. Give it a charge and its power loss due to radiation will be given by (4.3.16).
4.3.3 Radiation at uniform acceleration
The condition for uniform acceleration, (4.3.21), can be generalized to any form of hyperbolic motion,
$$\ddot{u} = -n\frac{(u\cdot\dot{u})}{c^2}\gamma^2\dot{u}, \tag{4.3.28}$$
for integer n. Abraham’s case is n = 3, while the Beltrami metric corresponds to n = 2. We will begin with the latter.
Consider a uniformly accelerated particle moving in the x-direction, where u is the only component of the velocity in this direction. Criterion (4.3.21), or what is the same, $d(\dot{u}\gamma^2)/d\tau = 0$, can be integrated to give
$$\frac{\dot{u}}{1 - u^2/c^2} = g, \tag{4.3.29}$$
where g is the particle’s uniform acceleration. Equation (4.3.29) will easily be recognized as the equation for the velocity of a body falling under the force of gravity, −g, or under the force of electrical attraction. Multiplying both sides by u gives
$$\frac{u\dot{u}}{c^2 - u^2} = gu/c^2. \tag{4.3.30}$$
The negative of (4.3.30) represents the rate at which a homogeneous compression parallel to the direction of motion is taking place [Schott 12, p. 174]. Where does this strain come from?
Obviously, we need to consider the electron as finite, say with a bounding surface S(x, y, z, t) = 0 that remains stationary in time, i.e.
$$\frac{dS}{dt} = \frac{\partial S}{\partial t} + u_x\frac{\partial S}{\partial x} + u_y\frac{\partial S}{\partial y} + u_z\frac{\partial S}{\partial z} = 0. \tag{4.3.31}$$
We assume that the velocity components are linear functions of the coordinates, viz.
$$u_x = \sigma_{11}x + \sigma_{12}y + \sigma_{13}z,$$
and similar expressions for $u_y$ and $u_z$. Now suppose that the electron undergoes a FitzGerald–Lorentz contraction in the direction of motion, which is the x-direction, so that
$$x = x_0\sqrt{(1 - \beta^2)}, \qquad y = y_0, \qquad z = z_0,$$
where the coordinates $(x_0, y_0, z_0)$ are the coordinates of any point on the electron at rest while (x, y, z) are those of the same point when the electron is in motion, and β = u/c. Consequently, the velocity components are
$$u_x = -\frac{\beta\dot{\beta}}{1 - \beta^2}x, \qquad u_y = 0, \qquad u_z = 0.$$
Hence, the electron undergoes a pure strain,
$$\sigma_{11} = -\frac{\beta\dot{\beta}}{1 - \beta^2},$$
which is what (4.3.30) says. The condition that the bounding surface remains stationary,
$$\frac{\partial S}{\partial t} - \frac{\beta\dot{\beta}x}{1 - \beta^2}\frac{\partial S}{\partial x} = 0,$$
can be solved for its characteristics by means of Lagrange’s method,
$$\frac{dx}{-\beta\dot{\beta}x/(1 - \beta^2)} = \frac{dy}{0} = \frac{dz}{0} = dt.$$
The three independent integrals are:
$$\frac{x}{\sqrt{(1 - \beta^2)}} = \text{const.}, \qquad y = \text{const.}, \qquad z = \text{const.},$$
so that the equation of the surface is
$$S\left(\frac{x}{\sqrt{(1 - \beta^2)}}, y, z\right) = 0,$$
when the electron is in motion, and reduces to S(x, y, z) = 0 when at rest.
Returning to (4.3.29), and assuming that the velocity is zero at time t = 0, we get by integration
$$u = c\tanh(g\tau/c), \tag{4.3.32}$$
where gτ is the hyperbolic measure of the velocity. Equation (4.3.32) expresses the relative velocity in terms of a segment of a Lobachevsky straight line. If we further assume that x = 0 at τ = 0, we obtain
$$x = \frac{c^2}{g}\ln\cosh(g\tau/c).$$
We have set the arbitrary integration constant equal to zero because, for $g\tau \ll c$, it must reduce to the nonrelativistic relation, $x = \frac{1}{2}gt^2$, for the motion of a particle with constant acceleration since there is no longer any distinction between proper and coordinate times.
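The n = 2 solution can be checked directly against (4.3.29) and against its nonrelativistic limit. This is a small numerical sketch in units with c = 1 and a hypothetical value of g; the derivative with respect to proper time is approximated by a central difference.

```python
import math

C = 1.0   # units with c = 1 (assumption for this sketch)
G = 0.5   # hypothetical uniform acceleration g

def velocity(tau):
    """(4.3.32): u = c tanh(g tau / c)."""
    return C * math.tanh(G * tau / C)

def position(tau):
    """x = (c^2 / g) ln cosh(g tau / c), with x = 0 at tau = 0."""
    return C**2 / G * math.log(math.cosh(G * tau / C))

def gamma(tau):
    u = velocity(tau)
    return 1.0 / math.sqrt(1.0 - u**2 / C**2)

def criterion_4329(tau, h=1e-6):
    """Left-hand side of (4.3.29): (du/dtau) / (1 - u^2/c^2)."""
    du = (velocity(tau + h) - velocity(tau - h)) / (2 * h)
    return du / (1.0 - velocity(tau)**2 / C**2)

tau = 1.3
lhs = criterion_4329(tau)                          # should equal g
unit = gamma(tau)**2 / math.cosh(G * tau / C)**2   # gamma^2 sech^2(g tau/c)
small = 1e-3
ratio = position(small) / (0.5 * G * small**2)     # tends to 1 for g tau << c
print(lhs, unit, ratio)
```

The second number is the factor $\gamma^2\,\mathrm{sech}^2(g\tau/c)$, which equals one because γ = cosh(gτ/c) along this motion.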
The rate of energy loss due to the uniform acceleration of the electron through radiation is
$$\dot{W}_{rad} = -\frac{2e^2}{3c^3}g^2\gamma^2\,\mathrm{sech}^2(g\tau/c) = -\frac{2e^2}{3c^3}g^2, \tag{4.3.33}$$
which is precisely Larmor’s formula, (4.3.13), for constant acceleration. The rate of loss of energy, (4.3.33), remains constant in time.
The case n = 3 is well-known and was discovered in the early days of special relativity. For a particle under the influence of a constant gravitational acceleration,
$$g = \frac{d}{dt}\frac{u}{\sqrt{(1 - u^2/c^2)}}, \tag{4.3.34}$$
will be constant so that integration gives simply
$$gt = \frac{u}{\sqrt{(1 - u^2/c^2)}} = c\sinh\bar{u}/c. \tag{4.3.35}$$
Now, the velocity can be written as
$$u = c\tanh\bar{u}/c = \frac{gt}{\sqrt{(1 + (gt)^2/c^2)}} = \frac{dx}{dt}. \tag{4.3.36}$$
If we further assume that x = 0 at t = 0, we get a second integral
$$x = \frac{c^2}{g}\left(\sqrt{(1 + (gt)^2/c^2)} - 1\right) = \frac{c^2}{g}(\cosh\bar{u}/c - 1). \tag{4.3.37}$$
This is the one-dimensional hyperbolic motion found by Born [09] in 1909, and by Sommerfeld one year later. It will be our prototype of a one-dimensional system at constant acceleration.
The measure of the hyperbolic velocity $\bar{u}$ can be obtained from time dilatation. Infinitesimal increments of time in the inertial frame moving with velocity u, dτ, and in the frame at rest, dt, are related by
$$d\tau = dt/\gamma = \sqrt{(1 - u^2/c^2)}\,dt. \tag{4.3.38}$$
Integrating (4.3.38),
$$\tau = \int_0^t\frac{dt}{\sqrt{(1 + (gt)^2/c^2)}} = \frac{c}{g}\sinh^{-1}(gt/c),$$
or
$$gt = c\sinh(g\tau/c), \tag{4.3.39}$$
we obtain the time interval indicated by the moving clock when the elapsed time according to the clock at rest is t. In the derivation of (4.3.39) we have used (4.3.36). On the strength of (4.3.35), we find the hyperbolic measure of the velocity as $\bar{u} = g\tau$. As t → ∞, τ will increase more slowly than t. Dividing (4.3.37) by (4.3.39) we find
$$u = \frac{x}{t} = c\,\frac{\cosh(g\tau/c) - 1}{\sinh(g\tau/c)} = c\tanh(g\tau/2c),$$
which is, again, the line segment in Lobachevsky space. The rate at which energy is lost by a uniformly accelerating electron,
$$\dot{W}_{rad} = -\frac{e^2}{6c^3}g^2, \tag{4.3.40}$$
is, again, constant in time, but only one-quarter as large as Larmor’s formula, (4.3.33).
Finally, in the case where the acceleration is perpendicular to the velocity, which can be the case of a charge moving in a circle of given radius and given angular velocity, the rate of energy loss will be given by (4.3.16),
$$\dot{W}_{rad} = -\frac{2e^2}{3c^3}\dot{u}^2\gamma^2. \tag{4.3.41}$$
Expression (4.3.41) is again γ −2 smaller than the rate of energy loss reported in standard texts [Panofsky & Phillips 55], because our frame of reference coincides with that of the electron. One can argue that in the frame where the electron seems at rest, there would be no magnetic field, and, therefore, Poynting’s vector would vanish indicating that there is no flow of energy. Hence, we will not see any radiation. However, in the transformation to proper time, (4.3.38), the velocity is the electron’s momentary velocity, and since the acceleration is non-vanishing, it will be constantly changing. The very fact that the acceleration is non-vanishing attests to the fact that the electron must be radiating. To summarize we may say that the existence of the metric (4.3.17) shows that there is no meaning to solving a third-order equation, referred to as the Lorentz–Dirac equation. It appears that no one ever thought of
Liénard’s formula for the rate of energy loss by an accelerating electron as something to be rendered an extremum. We have shown that it is not Liénard’s formula that is a first integral of the motion, but, rather, his expression transformed to the frame of the instantaneous velocity of the electron. The paths which render (4.3.19) an extremum are those of hyperbolic motion that is followed by a uniformly accelerating charge. Along such paths the radiation reaction force vanishes. The radiation reaction force is not to be rejected outright [Rohrlich 90], but, rather, has to be interpreted as measuring the deviation from hyperbolic motion executed by a uniformly accelerated charge. Such a situation occurs when the curvature is no longer constant.
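The relations (4.3.35) to (4.3.39) for Born’s hyperbolic motion (the case n = 3) fit together in a way that is easy to confirm numerically. The sketch below uses units with c = 1 and a hypothetical value of g; the function names are illustrative only.

```python
import math

C = 1.0   # units with c = 1 (assumption for this sketch)
G = 0.5   # hypothetical constant acceleration g

def velocity(t):
    """(4.3.36): u = g t / sqrt(1 + (g t)^2 / c^2)."""
    return G * t / math.sqrt(1.0 + (G * t / C)**2)

def position(t):
    """(4.3.37): x = (c^2 / g) (sqrt(1 + (g t)^2 / c^2) - 1)."""
    return C**2 / G * (math.sqrt(1.0 + (G * t / C)**2) - 1.0)

def proper_time(t):
    """Integral of (4.3.38): tau = (c / g) asinh(g t / c)."""
    return C / G * math.asinh(G * t / C)

t = 4.0
tau = proper_time(t)
u = velocity(t)
u_bar = C * math.atanh(u / C)                       # hyperbolic measure of u
check_4339 = C * math.sinh(G * tau / C)             # should equal g t
mean_velocity = position(t) / t                     # x / t
half_angle = C * math.tanh(G * tau / (2.0 * C))     # c tanh(g tau / 2c)
print(u_bar, G * tau, check_4339, G * t, mean_velocity, half_angle)
```

The printed pairs agree: the hyperbolic measure of the velocity is gτ, gt = c sinh(gτ/c), and x/t = c tanh(gτ/2c).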
4.3.4 Curvatures: Turning and twisting
The decomposition of the force into its curvature components of turning and twisting was first carried out by Schott [12]. At any point in the three-dimensional particle trajectory a mutually orthogonal frame can be erected with unit vectors $\hat{q}$, $\hat{n}$, and $\hat{b}$, as shown in Fig. 4.3. As the particle traces out its trajectory, these unit vectors change direction but always remain orthogonal to one another. Denote by r the radius vector to the given point along the curve and let s stand for the arc length. Call $\hat{q} = dr/ds$ the unit vector tangent to the curve, and denote ρ = ds/dϕ as the radius of curvature. Since the angular frequency is $\dot{\varphi} = d\varphi/ds$,
$$\frac{d\hat{q}}{ds} = \hat{n}/\rho, \qquad\text{or}\qquad \dot{\hat{q}} = \hat{n}\omega_{\rho}, \tag{4.3.42}$$

Fig. 4.3. Frenet frame field for a trajectory of the motion.
where the component of the angular velocity, ωρ = u/ρ, and nˆ is normal to q, ˆ tangent to the curve, and is directed toward the center of curvature.h ˆ the binormal, bˆ = qˆ × n. ˆ Finally, denote by b, ˆ Since db/ds is normal to bˆ we can take it along n, ˆ i.e. dbˆ = −n/τ, ˆ ds
or
˙ bˆ = −nω ˆ τ,
(4.3.43)
where ωτ = u/τ, and the minus sign is traditional. The function, τ is the radius of torsion, or torsion for short, and unlike ρ, it may be negative, or even zero at points along the curve. Moreover, since nˆ = bˆ × q, ˆ it follows that dnˆ ˆ − q/ρ, = −nˆ × q/τ ˆ + bˆ × n/ρ ˆ = b/τ ˆ ds or equivalently, ˆ τ − qω n˙ = bω ˆ ρ.
(4.3.44)
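Equations (4.3.42), (4.3.43), and (4.3.44) are known as the Frenet–Serret equations. Now, in order to write the radiation reaction force, (4.3.1), in component form, we introduce the four-vector,
$$u^\nu = (\gamma\boldsymbol{\beta}, \gamma) = (\hat q \sinh\bar\beta, \cosh\bar\beta),$$
since $u_\nu u^\nu = -1$, where $\beta = u/c$, and $\bar\beta = \bar u/c$ is the hyperbolic measure of the relative velocity. Employing the Frenet equations we calculate its derivatives as
$$\dot u^\nu = \left(\hat q\,\dot{\bar\beta}\cosh\bar\beta + \omega_\rho\,\hat n \sinh\bar\beta,\ \dot{\bar\beta}\sinh\bar\beta\right),$$
$$\ddot u^\nu = \left((\omega_\rho\omega_\tau\sinh\bar\beta)\,\hat b + (\dot\omega_\rho\sinh\bar\beta + \omega_\rho\dot{\bar\beta}\cosh\bar\beta)\,\hat n + (\ddot{\bar\beta}\cosh\bar\beta + (\dot{\bar\beta}^2 - \omega_\rho^2)\sinh\bar\beta)\,\hat q,\ \ddot{\bar\beta}\sinh\bar\beta + \dot{\bar\beta}^2\cosh\bar\beta\right). \tag{4.3.45}$$
The square of the first equation in (4.3.45),
$$\dot u_\nu \dot u^\nu = \dot{\bar\beta}^2 + \dot\varphi^2\sinh^2\bar\beta, \tag{4.3.46}$$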
$^h$ The dot denotes the derivative with respect to coordinate time, and not proper time as in Rohrlich [65].
where we have substituted $\dot\varphi$ for $\omega_\rho$. Expression (4.3.46) is none other than the Beltrami metric, (4.3.17). Apart from a multiplicative factor, (4.3.46) is what has been referred to as an ‘invariant radiation rate’ [Rohrlich 90]. The radiation reaction force has three components: (i) the tangential component,
$$F_q = \frac{2e^2}{3c}\gamma^3\left[\ddot{\bar\beta} + (\dot{\bar\beta}^2 - \omega_\rho^2)\beta + 2\dot{\bar\beta}^2\beta^2\gamma^4\right](1 + \beta^2\gamma^4), \tag{4.3.47}$$
(ii) the normal component,
$$F_n = \frac{2e^2}{3c}\gamma^3\left[\dot\omega_\rho\beta + \omega_\rho\dot{\bar\beta}\left(1 + 2\beta^2\gamma^4\right)\right], \tag{4.3.48}$$
and (iii) the binormal component,
$$F_b = \frac{2e^2}{3c}\gamma\,\omega_\rho\omega_\tau, \tag{4.3.49}$$
where we substituted $\gamma$ for $\cosh\bar\beta$ and $\gamma\beta$ for $\sinh\bar\beta$. The condition for the vanishing of the radiation reaction force, (4.3.21), gives two conditions: (i) one in the direction of the principal normal vector field of the trajectory,
$$\dot\omega_\rho\beta + \omega_\rho\dot{\bar\beta}\left(1 + 2\beta^2\gamma^4\right) = 0, \tag{4.3.50}$$
and (ii) one along the tangent vector field,
$$\ddot{\bar\beta} + (\dot{\bar\beta}^2 - \omega_\rho^2)\beta = -2\dot{\bar\beta}^2\beta^2\gamma^4. \tag{4.3.51}$$
Condition (4.3.50) makes (4.3.48) vanish, while (4.3.51) makes (4.3.47) zero. The last remaining component of the self-force, (4.3.49), requires us to set $\omega_\tau = 0$. Since $\omega_\tau$ measures the twisting, or torsion, of the trajectory, its vanishing is a necessary condition for the radiation reactive force to vanish.
The rate at which energy is radiated can be determined from (4.3.46); it is
$$\dot W_{\rm rad} = -\frac{2e^2}{3c^3}\gamma^2\left(\omega_\rho^2\beta^2 + \dot\beta^2\gamma^2\right). \tag{4.3.52}$$
From (4.3.52) we can conclude that unlike the torsion, the curvature function will be involved in the rate at which energy is lost through radiation. Since the last term in (4.3.50) is small, it can be neglected; then integrating the equation we get
$$\omega_\rho = c_1\frac{\sqrt{1 - \beta^2}}{\beta},$$
where $c_1$ is a constant of integration. Introducing $\omega_\rho = u/\rho$ we come out with
$$\dot u_\rho = \frac{u^2}{\rho} = c_2\sqrt{1 - \beta^2},$$
where $c_2 = cc_1$. Consequently, the centripetal acceleration will decrease as the velocity increases. The decrease in the centripetal acceleration, corresponding to an increase in the radius of curvature $\rho$, is due to energy loss through radiation. The particle begins to spiral outwards with the consequence that the angular velocity $\omega_\rho = u/\rho$ decreases. It is rather peculiar that no mention has been made of the decrease in the rest mass due to radiation, and, in fact, no mention has been made of mass at all. Where then is the equivalence of mass loss and radiated energy? Corben [68] associates $W_{\rm rad}$ with the energy of the particle, and the radiation reaction force $F_{\rm rad}$ with the change in momentum of the particle, $\dot G$. Moreover, he considers the case of gyroscopic motion, where $\omega = u/r = mc^2/s$, for a particle moving in a circle of radius $r$ in a plane normal to the spin $\mathbf{s}$, whose magnitude is $s$. As a result of radiation, circular motion is converted into helical motion. However, it all hangs on the association of rest mass with $s\omega/c^2$ that allows him to conclude “the rest-energy and total energy become progressively smaller, corresponding to the fact that electromagnetic energy is being radiated away from the particle.”
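The integration of the truncated condition (4.3.50) can be checked directly. With the term $2\beta^2\gamma^4$ dropped and $\dot{\bar\beta} = \gamma^2\dot\beta$ (because $\bar\beta = \tanh^{-1}\beta$), the condition reads $\beta\,d\omega_\rho/d\beta + \omega_\rho/(1-\beta^2) = 0$. The following sketch, with function names and the unit choices $c = c_1 = 1$ assumed purely for illustration, verifies that $\omega_\rho = c_1\sqrt{1-\beta^2}/\beta$ satisfies it and that the centripetal acceleration $u\omega_\rho$ falls as $\sqrt{1-\beta^2}$.

```python
import math

def omega_rho(beta, c1=1.0):
    """Angular velocity obtained by integrating the truncated (4.3.50)."""
    return c1 * math.sqrt(1.0 - beta * beta) / beta

def residual(beta, c1=1.0, h=1e-6):
    """beta * d(omega_rho)/d(beta) + omega_rho * d(beta_bar)/d(beta); should vanish."""
    d_omega = (omega_rho(beta + h, c1) - omega_rho(beta - h, c1)) / (2.0 * h)
    d_beta_bar = 1.0 / (1.0 - beta * beta)   # derivative of atanh(beta), i.e. gamma^2
    return d_omega * beta + omega_rho(beta, c1) * d_beta_bar

def centripetal(beta, c=1.0, c1=1.0):
    """u * omega_rho = u^2 / rho, which should equal c * c1 * sqrt(1 - beta^2)."""
    return c * beta * omega_rho(beta, c1)

if __name__ == "__main__":
    for beta in (0.1, 0.5, 0.9):
        print(beta, residual(beta), centripetal(beta))
```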
4.3.5 Advanced potentials as perpetual motion machines
Much use and abuse has been made of retarded and advanced potentials in the self-interaction of the electron. The following discussion has been used by Feynman [64] to show the utility of introducing advanced, as well as retarded, potentials. Although it is undoubtedly motivated by the facts that the different forces add, and that the inertial term is proportional to the acceleration, while the Schott term, or the self-reaction of the electron due to its radiation, is proportional to the rate of change of the acceleration, it can find no justification in the original derivations of the force expressions given by Liénard (1898), Heaviside (1902), Schwarzschild (1903), nor Ritz (1908). All of them began with the Lorentz force, (4.1.11), and used the notion of retarded potentials to cast it in a form
$$F_x = \frac{ee'}{r^2}\left[A\cos(rx) - B\frac{u_x}{c} - C\frac{\dot u_x}{c^2}\right],$$
for the x-component, where the coefficients A, B, and C depend on the velocities and accelerations, but not upon higher time-derivatives. However, according to Feynman, the self-interaction of the electron is described by a force which begins with the acceleration and contains higher-order derivatives,
$$F_x = -m\ddot x - \frac{2}{3}\frac{e^2}{c^3}\dddot x + \frac{e^2 a}{c^4}\ddddot x + \cdots, \tag{4.3.53}$$
where $m = \tfrac{2}{3}e^2/ac^2$ is the electrostatic mass, which we will meet in Sec. 5.4.3, and $a$ is the ‘classical’ electron radius. In the limit as $a \to 0$, the third term in (4.3.53) will go to zero, while the first term becomes infinite, i.e. an infinite mass. The second term is the radiation damping, and it is independent of $a$. This term describes the action of the electron on itself, and we want to extract this term from the others in (4.3.53). Let us try changing $c$ into $-c$ for it will change the sign of the second term in (4.3.53). The resulting force from such an operation would be
$$F_x^{\rm adv} = -m\ddot x + \frac{2}{3}\frac{e^2}{c^3}\dddot x + \frac{e^2 a}{c^4}\ddddot x + \cdots. \tag{4.3.54}$$
But, changing the sign of $c$ changes a retarded into an advanced potential. Instead of viewing the charge at the previous time, $t - r/c$, we now view it at a later time, $t + r/c$. Since it is the second term that we want, Dirac said that the electron acts on itself half the time as a retarded field, and half the time as an advanced field [Feynman 64]. The net force is half the difference of the forces, (4.3.53) and (4.3.54),
$$\frac{1}{2}\left(F_x - F_x^{\rm adv}\right) = -\frac{2e^2}{3c^3}\dddot x, \tag{4.3.55}$$
since any higher-order terms that persist will go to zero with $a$. So simply by continually running time forward and backward, we will have simultaneous diverging and converging waves acting on the electron, which will feel a force given by (4.3.55). The electron will, according to Ritz, be excited by its own radiation by the advanced wave, and will radiate by the retarded wave. This can continue indefinitely and constitutes a perpetuum mobile of the second kind. As such it is outlawed by the second law. Although there is nothing to exclude advanced potentials, or linear combinations with retarded potentials, such solutions to Maxwell’s equations should be outlawed on the basis that they violate causality and irreversibility, and that they constitute elements for constructing perpetual motion machines. Expanding Liénard’s force, (4.1.8), to terms in $1/c^3$ gives
$$\mathbf{F} = -m\mathbf{a} + \frac{2e^2}{3c^3}\dot{\mathbf{a}}. \tag{4.3.56}$$
The last term is the so-called Schott [12] radiation term, because it was Schott who first brought out its significance as the friction due to loss of energy by radiation, or the self-reaction of the charge on itself due to radiation. If it could change sign, as Dirac supposes, what was radiation loss would be radiation gain, and an unlimited energy supply would exist leading to perpetual motion of the second kind. However, before we jump to conclusions, let us take a closer look at the origin of this term. Frenkel [26] tells us that the acceleration at time $t$ is not $\mathbf{a}$, but, rather, there is a component coming from an earlier time $t' = t - r/c$, viz. $\mathbf{a}' = \mathbf{a}(t) + (t' - t)\dot{\mathbf{a}}$. This introduces an extra term into the force,
$$\frac{de\,de'}{2c^3}\left[\dot a_x + \dot a_r\cos(rx)\right].$$
If we take the acceleration $\mathbf{a}$ along the x-axis so that $a_r = a_x = a$, then performing the integration gives
$$\frac{e^2}{2c^3}\dot a\left(1 + \frac{1}{3}\right) = \frac{2e^2}{3c^3}\dot a.$$
We know that the charge accelerates because we can see its radiation. A change in sign of the Schott term would mean that there is a contribution to the acceleration at a later time, $t' = t + r/c$. We are consequently viewing something that will happen in the future, and this destroys causality! We can see something that happened in the past, like the stars that glow at night, because it takes light a finite time to reach us. But, to see something in the future would be to have a crystal ball at our disposal. Since there are no crystal balls, there can be no so-called pre-acceleration [Rohrlich 90, p. 151]. Hence, (4.3.54) and (4.3.55) have no meaning since an advanced potential is meaningless. The rate at which energy is lost is obtained by taking the scalar product of (4.3.56) and $\mathbf{u}$; we then obtain
$$\mathbf{F}\cdot\mathbf{u} = -\frac{m}{2}\frac{d}{dt}u^2 + \frac{2e^2}{3c^3}\frac{d}{dt}(\mathbf{a}\cdot\mathbf{u}) - \frac{2e^2}{3c^3}a^2. \tag{4.3.57}$$
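Equation (4.3.57) can be illustrated for a charge in simple harmonic motion, $x = A\cos\omega t$. The sketch below is an assumed example rather than anything taken from the text: the amplitude, frequency and unit choices ($m = e = c = 1$) are arbitrary, and the function name is invented. It averages each of the three terms over one period; the two total time derivatives average to zero, leaving $-(2e^2/3c^3)\langle a^2\rangle = -e^2A^2\omega^4/3c^3$.

```python
import math

def averaged_power_terms(A=1.0, omega=2.0, m=1.0, e=1.0, c=1.0, N=1000):
    """Period averages of the three terms of (4.3.57) for x = A cos(omega t)."""
    k = 2.0 * e * e / (3.0 * c ** 3)
    T = 2.0 * math.pi / omega
    inertial = schott = larmor = 0.0
    for i in range(N):
        t = i * T / N
        u = -A * omega * math.sin(omega * t)          # velocity
        a = -A * omega ** 2 * math.cos(omega * t)     # acceleration
        a_dot = A * omega ** 3 * math.sin(omega * t)  # rate of change of acceleration
        inertial += -0.5 * m * (2.0 * u * a)          # -(m/2) d(u^2)/dt
        schott += k * (a_dot * u + a * a)             # k d(a.u)/dt
        larmor += -k * a * a                          # -k a^2
    return inertial / N, schott / N, larmor / N

if __name__ == "__main__":
    print(averaged_power_terms())
```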
If the charge oscillates back and forth, the time average of the first two terms in (4.3.13) vanishes, thereby leaving a single term that was found by Larmor in 1897. However, we cannot use Maxwell’s method of showing that a term in the Lagrangian of the form $(2e^2/3c^3)(\mathbf{a}\cdot\mathbf{u})$ would lead to an indeterminate sign in the expression for the kinetic energy since this term has no effect upon the Euler–Lagrange equations,
$$F_x = -\frac{\partial L}{\partial x} + \frac{d}{dt}\frac{\partial L}{\partial u_x} - \frac{d^2}{dt^2}\frac{\partial L}{\partial a_x}.$$
The last two terms would lead to an exact cancellation. In closing this chapter, we might mention the anecdote that it was none other than Einstein who, during his Princeton years, brought Ritz’s emission theory to the attention of Wheeler and Feynman [45, 49] in 1941, who were working on a time-symmetric absorber theory which used a combination of retarded and advanced potentials. Was he still carrying on his debate with Ritz, and having second thoughts about emission theories,
as Lanczos seemed to feel? As for their theory, it has all but been forgotten, for a time-symmetric absorber theory risks being branded as a perpetual motion machine.
References
[Abraham 05] M. Abraham, Theorie der Strahlung (Leipzig, 1905), p. 123, Eq. (85); 1st ed. Theorie der Elektrizität, Vol. 2, 5th ed. (Teubner, Leipzig, 1923), p. 115.
[Arzeliès 66] H. Arzeliès, Rayonnement et Dynamique du Corpuscule Chargé Fortement Accéléré (Gauthier-Villars, Paris, 1966), pp. 74–79.
[Born 09] M. Born, “Die Theorie des starren Elektrons in der Kinematik des Relativitätsprinzips,” Ann. der Physik 30 (1909) 1–56; A. Sommerfeld, “Über die Zusammensetzung der Geschwindigkeiten in der Relativtheorie,” Verh. der DPG 21 (1909) 577–582.
[Born & Wolf 59] M. Born and E. Wolf, Principles of Optics (Pergamon Press, New York, 1959), p. 70.
[Corben 68] H. C. Corben, Classical and Quantum Theories of Spinning Particles (Holden-Day, San Francisco, 1968), Sec. 11.
[Einstein 05] A. Einstein, “Ist die Trägheit eines Körpers von seinem Energieinhalt abhängig?,” Ann. Phys. 18 (1905) 639–641.
[Einstein 35] A. Einstein, “Elementary derivation of the equivalence of mass and energy,” Bull. Am. Math. Soc. 22 (1935) 223–230.
[Einstein 49] A. Einstein, Autobiographical notes, 1949.
[Feynman 64] R. P. Feynman, The Feynman Lectures on Physics, Vol. II (Addison-Wesley, Reading MA, 1964), Ch. 28, p. 11.
[Fock 59] V. Fock, The Theory of Space, Time, and Gravitation (Pergamon Press, New York, 1959), p. 41.
[Fox 62] J. G. Fox, “Experimental evidence for the second postulate of special relativity,” Am. J. Phys. 30 (1962) 297–300.
[Fox 65] J. G. Fox, “Evidence against emission theories,” Am. J. Phys. 33 (1965) 1–17.
[Frenkel 26] J. Frenkel, Lehrbuch der Elektrodynamik, Vol. 1 (Berlin, 1926), pp. 208–211.
[Ives 52] H. E. Ives, “Derivation of the mass-energy relation,” J. Opt. Soc. Am. 42 (1952) 540–543.
[Jackson 75] J. D. Jackson, Classical Electrodynamics, 2nd ed. (Wiley, New York, 1975), p. 660.
[Jammer 61] M. Jammer, Concepts of Mass and Modern Physics (Harvard U, Cambridge MA, 1961).
[Lanczos 74] C. Lanczos, The Einstein Decade (1905–1915) (Elek Science, London, 1974), p. 161.
[Landau & Lifshitz 75] L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields (Pergamon Press, Oxford, 1975), p. 195. In (73.7) there is a sign difference and the absence of $\gamma^2$ in the second term.
[Lavenda 02] B. H. Lavenda, “Does the inertia of a body depend on its ‘heat’ content?,” Naturwissenschaften 89 (2002) 329.
[Lavenda 09] B. H. Lavenda, A New Perspective on Thermodynamics (Springer, New York, 2009).
[Liénard 98] A. Liénard, “Champ électrique et magnétique produit par une charge électrique concentrée en un point et animée d’un mouvement quelconque,” L’éclairage électrique 16 (1898) pp. 5, 53, 106.
[Maxwell 91] J. Clerk Maxwell, A Treatise on Electricity and Magnetism, 3rd ed. (Clarendon Press, London, 1891), Ch. 23.
[O’Rahilly 38] A. O’Rahilly, Electromagnetics (Longman, Green & Co., London, 1938).
[Pais 82] A. Pais, Subtle is the Lord (Oxford U. P., Oxford, 1982), p. 467.
[Panofsky & Phillips 55] W. K. H. Panofsky and M. Phillips, Classical Electricity and Magnetism (Addison-Wesley, Reading MA, 1955), p. 307, Eqs. (19) (35).
[Pauli 58] W. Pauli, Theory of Relativity (Pergamon Press, New York, 1958), pp. 5–9.
[Riseman & Young 53] J. Riseman and I. G. Young, “Mass-energy relationship,” J. Opt. Soc. Am. 43 (1953) 618; H. E. Ives, “Note on ‘Mass-energy relationship,’ ” ibid 43 (1953) 618–619.
[Ritz 08] W. Ritz, “Recherches critiques sur l’Électrodynamique Générale,” Ann. Chimie et Physique, 8th series, XIII (1908) 145–275; translated and commented upon by W. Hovgaard, “Ritz’s Electrodynamic Theory,” J. Math. Phys. 11 (1932) 218–254.
[Ritz and Einstein 09] W. Ritz and A. Einstein, “Zum gegenwärtigen Stande des Strahlungsproblems,” Phys. Z. 10 (1909) 323–324.
[Rohrlich 90] F. Rohrlich, Classical Charged Particles: Foundations of Their Theory (Addison-Wesley, Reading, MA, 1990).
[Schott 12] G. A. Schott, Electromagnetic Radiation (Cambridge U. P., Cambridge, 1912), p. 246.
[Stachel & Torretti 82] J. Stachel and R. Torretti, “Einstein’s first derivation of the mass–energy equivalence,” Am. J. Phys. 50 (1982) 760–763.
[Thomson 21] J. J. Thomson, Elements of the Mathematical Theory of Electricity and Magnetism (Cambridge U. P., Cambridge, 1921), p. 388.
[Wheeler & Feynman 45] J. A. Wheeler and R. P. Feynman, “Interaction with the absorber as the mechanism of radiation,” Rev. Mod. Phys. 17 (1945) 157–181.
[Wheeler & Feynman 49] J. A. Wheeler and R. P. Feynman, “Classical electrodynamics in terms of direct interparticle interaction,” Rev. Mod. Phys. 21 (1949) 425–433.
Chapter 5
The Origins of Mass
5.1 Introduction
The hallmark of the potential theory of a long rod is that the attraction of an infinitely long rod for a particle which is at a distance r from it is inversely proportional to r, and not to the inverse of its square. This gives rise to logarithmic potentials and the connection with inverse hyperbolic functions. We may then supplant the long rod by constant spheroidal level layers in which there appears the eccentricity of a meridian section, which is the intersection of a surface of revolution, in this case a prolate spheroid, with a plane that contains the axis of revolution. The axis of revolution coincides with the direction of the rod. The eccentricity, or the ratio between the distance from a point on the conic to the focus and the distance from that point to the directrix, will have exactly the same role as the relative velocity in aberration. Not only will this allow us to draw the parallelism between the motional distortion of stellar aberration and the eccentricity of the potential for a prolate ellipsoid, but, moreover, it will pave the way to determining the mass dependence on the speed once we introduce the relativistic expression that relates the energy, or potential, to the momentum. In this way we will appreciate that the two models of an electron based on the deformation of a sphere into prolate and oblate spheroids are two sides of the same coin. The fact that the eccentricity of the meridian section of the spheroid is identified as the relative speed implies that the deformation of the spherical electron at rest into a spheroid in motion is caused by the FitzGerald–Lorentz contraction, just as Abraham and Lorentz conceived it to be. However, there is no reason to pass summary judgment on the two models and declare Lorentz the winner. For it will turn out that these models are related to one another as elliptic geometry is
related to hyperbolic geometry! And it is our premise that if a phenomenon occurs in one of the two non-Euclidean geometries it will almost certainly occur in the other.
5.2 From Motional to Static Deformation
Consider two incoming light signals that make angles $\varphi_1$ and $\varphi_2$ with respect to the z-axes in two frames that are traveling at a relative velocity $u$ in (b) with respect to one in (a), as shown in Fig. 5.1. If the velocities of the two outgoing signals are $u_1 = \cos\varphi_1$ and $u_2 = \cos\varphi_2$, the velocity composition law in the z-direction is
$$\cos\varphi_2 = \frac{\cos\varphi_1 - u}{1 - u\cos\varphi_1},$$
while, in one of the perpendicular planes,
$$\sin\varphi_2 = \frac{\sqrt{1 - u^2}\,\sin\varphi_1}{1 - u\cos\varphi_1}.$$
Fig. 5.1. Stellar aberration: (a) A telescope at rest, and (b) a telescope aimed at the same star but in relative motion.
With the aid of the trigonometric identity, $\tan\varphi/2 = \sin\varphi/(1 + \cos\varphi)$, we obtain
$$\tan\varphi_2/2 = \left(\frac{1 + u}{1 - u}\right)^{1/2}\tan\varphi_1/2. \tag{5.2.1}$$
Equation (5.2.1) says that the tangents of the half angles in the two inertial frames are related by the longitudinal Doppler factor. We will now see how aberration arises in potential theory when the relative velocity is replaced by the eccentricity. Then we will see how mass depends upon eccentricity. Reverting to relative velocities shows how relativistic mass acquires a dependence on them.
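The passage from the two composition laws to (5.2.1) can be confirmed numerically. The sketch below is illustrative only; it takes $c = 1$ as in the formulas above, and the function names are assumptions introduced for this example. It builds $\varphi_2$ from the cosine and sine laws and compares the ratio of half-angle tangents with the Doppler factor.

```python
import math

def aberrate(phi1, u):
    """Angle phi2 in the moving frame from the cosine and sine composition laws."""
    denom = 1.0 - u * math.cos(phi1)
    cos2 = (math.cos(phi1) - u) / denom
    sin2 = math.sqrt(1.0 - u * u) * math.sin(phi1) / denom
    return math.atan2(sin2, cos2)

def half_angle_ratio(phi1, u):
    """tan(phi2/2) / tan(phi1/2), the left-hand ratio implied by (5.2.1)."""
    phi2 = aberrate(phi1, u)
    return math.tan(phi2 / 2.0) / math.tan(phi1 / 2.0)

def doppler_factor(u):
    """Longitudinal Doppler factor sqrt((1 + u) / (1 - u))."""
    return math.sqrt((1.0 + u) / (1.0 - u))

if __name__ == "__main__":
    for u in (0.1, 0.6, 0.95):
        print(u, half_angle_ratio(1.0, u), doppler_factor(u))
```

The ratio is independent of $\varphi_1$, which is what allows the eccentricity to take over the role of $u$ in the potential theory that follows.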
5.2.1 Potential theory
Consider a rod of length $2\ell$ pointing in the z-direction. It will have a constant, linear mass density $\rho = m_0/2\ell$. If we consider the infinitesimal section $dz$, measured from the center of the rod, it will have an infinitesimal mass of $dm = \rho\,dz$, as shown in Fig. 5.2. The potential that a particle will feel at a point P at a distance $r$ from the rod is determined from the fact that for a rod the attraction is proportional to the inverse of the distance, instead of the inverse square of the distance so that
$$\Phi(r) = G\int\frac{dm}{r} = G\rho\int_{-\ell}^{+\ell}\frac{dz}{r},$$
where $G$ is Newton’s gravitational constant. If $\varphi$ is the angle subtended by the rod and the line to the point P from the rod, then the infinitesimal arc length that is swept out when we move from the origin to a distance $dz$ up the rod is $r\,d\varphi = \sin\varphi\,dz$. This allows us to express the potential as
$$\Phi = G\rho\int_{\varphi_1}^{\varphi_2}\frac{d\phi}{\sin\phi} = G\rho\ln\left(\frac{\tan\varphi_2/2}{\tan\varphi_1/2}\right). \tag{5.2.2}$$
Fig. 5.2. The potential of a homogeneous rod.
If we use the half angle formula, $\tan\varphi/2 = \sqrt{(1 - \cos\varphi)/(1 + \cos\varphi)}$, we can write the potential (5.2.2) as the logarithm of the cross-ratio
$$\Phi = \frac{1}{2}G\rho\ln\left(\frac{1 + \cos\varphi_1}{1 - \cos\varphi_1}\cdot\frac{1 - \cos\varphi_2}{1 + \cos\varphi_2}\right) = \frac{1}{2}G\rho\ln\{\cos\varphi_1, \cos\varphi_2\,|\,{-1}, 1\}. \tag{5.2.3}$$
So from very simple geometrical arguments, we have created for ourselves a hyperbolic space equipped with a cross-ratio, whose logarithm is a measure of hyperbolic distance. The potential (5.2.3) is related to hyperbolic distance. In fact, it is the difference of two hyperbolic lengths,
$$\Phi = G\rho\left[\tanh^{-1}(\cos\varphi_1) - \tanh^{-1}(\cos\varphi_2)\right].$$
The potential vanishes for equal hyperbolic lengths. And for $\varphi_2 = \pi/2$, $\varphi_1$ becomes the angle of parallelism which is a function only of the distance, $(\Phi/Gm_0)2\ell$.
If $x$ is the normal distance between the rod and the point P then
$$\tan\varphi_1 = \frac{x}{z + \ell}, \qquad \tan\varphi_2 = \frac{x}{z - \ell}, \tag{5.2.4}$$
and using the half-angle formula for the tangent, (5.2.2) becomes
$$\Phi = G\rho\ln\left(\frac{\sqrt{(z - \ell)^2 + x^2} - (z - \ell)}{\sqrt{(z + \ell)^2 + x^2} - (z + \ell)}\right).$$
Now, if $r_1$ and $r_2$ are the distances between the ends of the rod and the point P, as shown in Fig. 5.3, viz.
$$r_1^2 = x^2 + (z + \ell)^2, \qquad r_2^2 = x^2 + (z - \ell)^2, \tag{5.2.5}$$
Fig. 5.3. A rod AB has length $2\ell$ with O as its center. P is the attracted point, with an element of mass $dm$ at a distance $r$ from it. $r_1$ and $r_2$ are the lines joining P to the ends of the rod at A and B.
we can write the potential as
$$\Phi = G\rho\ln\left(\frac{r_2 - z + \ell}{r_1 - z - \ell}\right).$$
The difference between the two expressions in (5.2.5) is
$$4\ell z = r_1^2 - r_2^2, \tag{5.2.6}$$
and so,
$$\ell + z = \frac{r_1^2 - r_2^2 + 4\ell^2}{4\ell}, \qquad \ell - z = \frac{4\ell^2 - r_1^2 + r_2^2}{4\ell}.$$
The potential can thus be brought into the form
$$\Phi = G\rho\ln\left(\frac{4\ell r_2 + r_2^2 + 4\ell^2 - r_1^2}{4\ell r_1 - r_1^2 + r_2^2 - 4\ell^2}\right) = G\rho\ln\left(\frac{(r_1 + r_2 + 2\ell)(r_2 - r_1 + 2\ell)}{(r_2 + r_1 - 2\ell)(r_2 - r_1 + 2\ell)}\right) = G\rho\ln\left(\frac{r_1 + r_2 + 2\ell}{r_1 + r_2 - 2\ell}\right). \tag{5.2.7}$$
Surfaces of constant potential, or level surfaces, are defined by the relation
$$r_1 + r_2 = 2a, \tag{5.2.8}$$
for which $\Phi = \text{const}$. The level surfaces are prolate ellipsoids with a major axis $2a$, and foci that are located at the ends of the rod. Therefore, the rod can be supplanted by an infinite number of spheroidal level layers. Introducing the eccentricity, $\varepsilon$, of a meridian section of the level surface by $\ell = \varepsilon a$, into (5.2.7) leads to
$$\Phi = G\rho\ln\left(\frac{1 + \varepsilon}{1 - \varepsilon}\right) = 2G\rho\tanh^{-1}\varepsilon. \tag{5.2.9}$$
When viewed far from the rod, $a$ is large and $\varepsilon$ must be small, if the rod is to have constant length. The equipotential surfaces appear nearly spherical. In contrast, for points on the rod itself $\ell = a$, and with $\varepsilon = 1$, $\Phi$ is infinite. For points in the neighborhood of the rod, $\varepsilon < 1$ and $\Phi$ is very large. At large distances both $\varepsilon$ and $\Phi$ tend to zero.
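The closed forms (5.2.7) and (5.2.9) can be compared with a direct numerical integration of $G\rho\,dz/r$ along the rod. The sketch below is a minimal check under assumed values ($G = \rho = \ell = 1$ and an arbitrary field point); the function names are hypothetical and the midpoint rule is used only for simplicity.

```python
import math

def rod_potential_numeric(x, z, ell=1.0, G=1.0, rho=1.0, N=20000):
    """Midpoint-rule integral of G*rho*dz'/r along the rod from -ell to +ell."""
    h = 2.0 * ell / N
    total = 0.0
    for i in range(N):
        zp = -ell + (i + 0.5) * h            # position of the mass element on the rod
        total += h / math.hypot(x, z - zp)   # 1/r for the field point (x, z)
    return G * rho * total

def rod_potential_closed(x, z, ell=1.0, G=1.0, rho=1.0):
    """Closed form (5.2.7) in terms of the distances r1, r2 to the rod ends."""
    r1 = math.hypot(x, z + ell)
    r2 = math.hypot(x, z - ell)
    return G * rho * math.log((r1 + r2 + 2.0 * ell) / (r1 + r2 - 2.0 * ell))

def rod_potential_eccentricity(x, z, ell=1.0, G=1.0, rho=1.0):
    """Form (5.2.9), with eps = 2*ell/(r1 + r2) on the level surface through (x, z)."""
    r1 = math.hypot(x, z + ell)
    r2 = math.hypot(x, z - ell)
    eps = 2.0 * ell / (r1 + r2)
    return 2.0 * G * rho * math.atanh(eps)

if __name__ == "__main__":
    print(rod_potential_numeric(0.8, 0.3),
          rod_potential_closed(0.8, 0.3),
          rod_potential_eccentricity(0.8, 0.3))
```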
Comparing (5.2.9) with (5.2.2) we obtain the condition for equipotential surfaces, where $\Phi$ is constant, as
$$\tan\varphi_1/2 = D^{-1}\tan\varphi_2/2, \tag{5.2.10}$$
where
$$D = \left(\frac{1 + \varepsilon}{1 - \varepsilon}\right)^{1/2}. \tag{5.2.11}$$
On an equipotential surface, $\Phi$ is constant and $D = e^{\Phi/G\rho} = e^{\bar\varepsilon}$, where $\bar\varepsilon$ is the hyperbolic measure of the eccentricity, which is no longer limited to the interval [0, 1]. But (5.2.10) will be identical to (5.2.1) when we identify the eccentricity, $\varepsilon$, with the relative speed, $u/c$. Both create large distortions for values in the neighborhood of unity. As the eccentricity is varied a sphere is transformed into an infinitely long rod. Of all the shapes that the system can pass through, the sphere has the minimum volume. We can pass through constant level surfaces by varying $\varepsilon$ just as we can by transforming from one inertial frame to another. Gravitational attraction is supplanted by electrostatic attraction. Inside the spheroid, a particle is attracted equally in all directions so the net attraction vanishes. This means that the potential is constant inside the spheroid and if its shell is infinitely thin, the surface is an equipotential surface. At the points exterior to the shell, the potential is the same as though the mass (or charge) were uniformly distributed over the surface. The shell attracts an exterior particle just as the rod does. Following MacMillan [30] we use the double angle formula,
$$\frac{2}{\tan\varphi_2} = \frac{1}{\tan\varphi_2/2} - \tan\varphi_2/2, \tag{5.2.12}$$
and a similar expression for $\varphi_1$. Into the latter expression we introduce the level layer condition (5.2.10) to get:
$$\frac{2}{\tan\varphi_1} = \frac{D}{\tan\varphi_2/2} - \frac{\tan\varphi_2/2}{D}. \tag{5.2.13}$$
Multiplying (5.2.13) by $D$ and subtracting (5.2.12), and then multiplying (5.2.12) by $D$ and subtracting (5.2.13) result in the pair of equations
$$2\left(\frac{D}{\tan\varphi_1} - \frac{1}{\tan\varphi_2}\right) = \frac{D^2 - 1}{\tan\varphi_2/2},$$
$$2\left(\frac{D}{\tan\varphi_2} - \frac{1}{\tan\varphi_1}\right) = -\frac{D^2 - 1}{D}\tan\varphi_2/2.$$
Multiplying the two equations together eliminates the half angle terms and results in
$$\left(\frac{1}{\tan\varphi_2} - \frac{D}{\tan\varphi_1}\right)\left(\frac{D}{\tan\varphi_2} - \frac{1}{\tan\varphi_1}\right) = \frac{(D^2 - 1)^2}{4D}.$$
Finally, introducing (5.2.4) results in the equation for equipotential surfaces in two dimensions:
$$\frac{(D - 1)^2x^2}{4D\ell^2} + \frac{(D - 1)^2z^2}{(D + 1)^2\ell^2} = 1, \tag{5.2.14}$$
which are a family of confocal ellipses. Introducing
$$\frac{4D\ell^2}{(D - 1)^2} = s, \qquad \frac{(D + 1)^2\ell^2}{(D - 1)^2} = \ell^2 + s,$$
in (5.2.14) we get confocal conics,
$$\frac{x^2}{s} + \frac{z^2}{\ell^2 + s} = 1,$$
shown in Fig. 5.4. The hyperbola,
$$\frac{z^2}{\ell^2 - s} - \frac{x^2}{s} = 1, \qquad 0 < s < \ell^2,$$
represents the lines of force which are always normal to curves of equipotential. In three dimensions, we get the equipotential surfaces,
$$\frac{x^2 + y^2}{s} + \frac{z^2}{\ell^2 + s} = 1,$$
which are prolate spheroids.
Fig. 5.4. Family of ellipses and orthogonal confocal hyperbolas.
5.3 Gravitational Mass
5.3.1 Attraction of a rod: Increase in mass with broadside motion
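In order to obtain expressions for the gravitational mass, we calculate the force of attraction in a plane normal to the rod, $F_x$, and the attractive force in the plane parallel to the rod, $F_z$ [MacMillan 30]. These forces are defined by
$$F_x = -\frac{\partial\Phi}{\partial\varepsilon}\frac{\partial\varepsilon}{\partial x}, \qquad F_z = -\frac{\partial\Phi}{\partial\varepsilon}\frac{\partial\varepsilon}{\partial z},$$
where
$$\varepsilon = \frac{2\ell}{r_1 + r_2},$$
where $r_1$ and $r_2$ are given by (5.2.5). With the aid of the expressions,
$$\frac{\partial\Phi}{\partial\varepsilon} = \frac{2G\rho}{1 - \varepsilon^2},$$
$$\frac{\partial\varepsilon}{\partial r_1} = \frac{\partial\varepsilon}{\partial r_2} = -\frac{2\ell}{(r_1 + r_2)^2} = -\frac{\varepsilon}{2a},$$
$$\frac{\partial r_1}{\partial x} = \frac{x}{r_1}, \qquad \frac{\partial r_2}{\partial x} = \frac{x}{r_2}, \qquad \frac{\partial r_1}{\partial z} = \frac{z + \ell}{r_1}, \qquad \frac{\partial r_2}{\partial z} = \frac{z - \ell}{r_2},$$
we find the normal and parallel components of the force as:
$$F_x = \frac{G\rho\varepsilon x}{a(1 - \varepsilon^2)}\left(\frac{1}{r_1} + \frac{1}{r_2}\right), \qquad F_z = \frac{G\rho\varepsilon}{a(1 - \varepsilon^2)}\left(\frac{z + \ell}{r_1} + \frac{z - \ell}{r_2}\right).$$
From (5.2.6), (5.2.8) and $\ell = a\varepsilon$ we have $r_1 = a + \varepsilon z$ and $r_2 = a - \varepsilon z$. These enable the normal and parallel components of the force to be written as
$$F_x = \frac{G\rho}{1 - \varepsilon^2}\cdot\frac{2\varepsilon x}{a^2 - \varepsilon^2z^2}, \qquad F_z = \frac{2G\rho\varepsilon z}{a^2 - \varepsilon^2z^2}. \tag{5.3.1}$$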
Eliminating $x$ in the normal component of the force through the equation of the ellipse,
$$\frac{x^2}{a^2(1 - \varepsilon^2)} + \frac{z^2}{a^2} = 1,$$
gives
$$F_x = \pm\frac{2G\rho\varepsilon^3}{\sqrt{1 - \varepsilon^2}}\cdot\frac{\sqrt{\ell^2 - \varepsilon^2z^2}}{\ell^2 - \varepsilon^4z^2}, \qquad F_z = \frac{2G\rho\varepsilon^3z}{\ell^2 - \varepsilon^4z^2}. \tag{5.3.2}$$
Assuming $z$ is fixed and small, such that $|z| \ll \ell$, and recalling the definition of the mass, $m_0 = 2\ell\rho$, the normal and parallel components of the force become
$$F_x = \frac{Gm_x}{\ell^2}, \qquad F_z = \frac{Gm_zz}{\ell^3}, \tag{5.3.3}$$
where the masses are
$$m_x = \frac{m_0\varepsilon^3}{\sqrt{1 - \varepsilon^2}}, \qquad m_z = m_0\varepsilon^3. \tag{5.3.4}$$
Replacing the eccentricity $\varepsilon$ by the relative velocity, (5.3.4) gives the increase in inertia due to the motion. As the relative velocity tends to zero, the rod shrinks to a point particle. In analogy with the motion of Faraday tubes, broadside motion of the rod should lead to a greater inertial mass because more of the surrounding ‘aether’ is dragged with it than when it performs frontal movement. This prediction has been corroborated by the masses (5.3.4).
5.3.2 Attraction of a spheroid on a point in its axis of revolution: Forces of attraction as minimal curves of convex bodies
As a preliminary, we treat the attraction of a circular disc to a point on the axis of the disc through its center. This will aid us in the subsequent determination of the attraction of a spheroid on a point in its axis of revolution. Consider a disc of radius $R$ and surface density $\rho$. Place the origin of our coordinate system at the center O of the disc and introduce the polar coordinates, $r$ and $\theta$. The attracted point P is a distance $z$ from the center of the disc, as shown in Fig. 5.5. The element of mass of the particle is $dm = \rho r\,dr\,d\theta$, and the distance from the attracted particle on the disc to the point P is given by the Pythagorean theorem
$$h = \sqrt{r^2 + z^2}.$$
Fig. 5.5. Attraction of a circular disc on its axis.
The Newtonian law of attraction is therefore given by
$$\frac{G\,dm}{h^2} = \frac{G\rho r\,dr\,d\theta}{r^2 + z^2}.$$
If $\phi$ is the angle at P, then the component of this force along the axis is
$$\frac{G\rho r\,dr\,d\theta}{r^2 + z^2}\cos\phi = \frac{G\rho r\,dr\,d\theta}{r^2 + z^2}\,\frac{z}{h} = \frac{G\rho zr\,dr\,d\theta}{(r^2 + z^2)^{3/2}}.$$
The total force of attraction is obtained by integrating from the center to the rim of the disc and integrating around the entire disc, viz.
$$F_z = G\rho z\int_0^R\!\!\int_0^{2\pi}\frac{r\,dr\,d\theta}{(r^2 + z^2)^{3/2}} = 2\pi G\rho z\int_0^R\frac{r\,dr}{(r^2 + z^2)^{3/2}} = -2\pi G\rho\left(\frac{z}{\sqrt{z^2 + R^2}} - 1\right).$$
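The disc integral is easy to confirm numerically. In the sketch below, which is only an illustration with assumed unit values $G = \rho = R = 1$ and invented function names, the closed form as written is compared with a midpoint-rule integration for $z > 0$. Because the radial integral equals $1/|z| - 1/\sqrt{z^2 + R^2}$, a sign-aware variant is also included to exhibit the jump in the force across the plane of the disc.

```python
import math

def disc_force_closed(z, R=1.0, G=1.0, rho=1.0):
    """Closed form as written in the text; it applies for z > 0."""
    return -2.0 * math.pi * G * rho * (z / math.hypot(z, R) - 1.0)

def disc_force_numeric(z, R=1.0, G=1.0, rho=1.0, N=20000):
    """Midpoint-rule evaluation of 2*pi*G*rho*z * integral of r dr / (r^2 + z^2)^(3/2)."""
    h = R / N
    total = 0.0
    for i in range(N):
        r = (i + 0.5) * h
        total += r * h / (r * r + z * z) ** 1.5
    return 2.0 * math.pi * G * rho * z * total

def disc_force_signed(z, R=1.0, G=1.0, rho=1.0):
    """Same integral for either sign of z (z != 0), using z/|z| in place of 1."""
    return -2.0 * math.pi * G * rho * (z / math.hypot(z, R) - math.copysign(1.0, z))

if __name__ == "__main__":
    print(disc_force_closed(0.5), disc_force_numeric(0.5))
    print(disc_force_signed(1e-6) - disc_force_signed(-1e-6), 4.0 * math.pi)
```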
As the force (5.3.5) changes sign when $z$ passes from positive to negative values, but does not vanish with it, the attractive force possesses a finite discontinuity at $z = 0$. The force undergoes a finite jump, as $z$ passes from negative to positive values, equal to $4\pi G\rho$. Now, if $R$ were to tend to infinity, as the Earth was once thought of as a flat, infinite, disc, the acceleration due to gravity would, indeed, be constant everywhere. We now let $z$ be the axis of revolution of an oblate ellipsoid, $Z_0Z$, with coordinates $\xi$, $\eta$, $\zeta$ as in Fig. 5.6. For a system of particles that form a continuous density, the attractive force in the z direction is proportional to the difference in the masses averaged over the surface of the ellipsoid
$$F_z = G\rho\int_Z\frac{\zeta - z}{R^3}\,d\xi\,d\eta\,d\zeta.$$
Fig. 5.6. A figure of revolution.
Now, we can use our result of the disc, (5.3.5), to avoid two integrations. To do so, we consider a thin cross-section $d\zeta$ at a distance $\zeta$ from the origin $Z_0$. We must now interpret $\rho$ as a volume density, rather than a surface density, and integrating from $Z_0$ to $Z$, the limits of the ellipsoid along $\zeta$, we find
$$F_z = 2\pi G\rho\int_{Z_0}^{Z}\left(\frac{z - \zeta}{\sqrt{(z - \zeta)^2 + R^2}} - 1\right)d\zeta, \tag{5.3.5}$$
for the total force of attraction at a point $z > Z$. The surface of an oblate ellipsoid with $a = b > c$, shown in Fig. 5.9 (a), is given by the equation,
$$\frac{\xi^2 + \eta^2}{a^2} + \frac{\zeta^2}{c^2} = 1.$$
Introducing the radius of the cross-section at a distance $\zeta$ from the origin,
$$R^2 = a^2\left(1 - \frac{\zeta^2}{c^2}\right),$$
into (5.3.5) gives
$$F_z = 2\pi G\rho\left[\int_{-c}^{c}\frac{(z - \zeta)\,d\zeta}{\sqrt{(z - \zeta)^2 + a^2 - \frac{a^2}{c^2}\zeta^2}} - 2c\right].$$
The rest is a technical matter of integration, which can be found in MacMillan [30]. The final result is
$$F_z = -\frac{3GM}{a^2 - c^2}\left[1 - \frac{z}{\sqrt{a^2 - c^2}}\tan^{-1}\left(\frac{\sqrt{a^2 - c^2}}{z}\right)\right], \tag{5.3.6}$$
where $M$ is the mass of the ellipsoid,
$$M = \frac{4}{3}\pi\rho a^2c.$$
It is apparent from the expansion,
$$\tan^{-1}x = x - \frac{1}{3}x^3 + \cdots,$$
that
$$F_z = -GM\left(\frac{1}{z^2} - \frac{3}{5}\frac{a^2 - c^2}{z^4} + \cdots\right).$$
If the body were spherical, only the first term would subsist so that we may conclude that the force of attraction of an oblate spheroid at a point of its axis is less than it would be for a sphere at the same distance. When (5.3.6) is evaluated at the surface $z = a$ of the ellipsoid there results
$$F_{z=a} = -\frac{3GM}{(a^2 - c^2)^{3/2}}\left\{\sqrt{a^2 - c^2} - a\tan^{-1}\left(\sqrt{1 - c^2/a^2}\right)\right\}. \tag{5.3.7}$$
The terms in the braces of (5.3.7) represent the minimal distance of a convex curve in elliptic space, and are related to the phase of the asymptotic Hankel function whose argument is greater than its order. We will elaborate on this in Sec. 5.4.4. Moreover, the ratio of the two terms will be shown to give the capacitance of an oblate ellipsoid [cf. (5.4.34) below]. Finally, we can express (5.3.7) in terms of the eccentricity $\varepsilon = \sqrt{1 - c^2/a^2}$,$^a$
$$F_{z=a} = -\frac{3GM}{a^2\varepsilon^3}\left(\varepsilon - \tan^{-1}\varepsilon\right).$$
In contrast, for a prolate ellipsoid ($c > a = b$) whose surface is defined by the equation:
$$\frac{\xi^2}{a^2} + \frac{\eta^2 + \zeta^2}{c^2} = 1,$$
the force of attraction of a point on the surface $x = c$ is
$$F_{x=c} = -\frac{3GM}{c^2\varepsilon^3}\left(\tanh^{-1}\varepsilon - \varepsilon\right) = -\frac{3GM}{2c^2\varepsilon^3}\left[\ln\left(\frac{1 + \varepsilon}{1 - \varepsilon}\right) - 2\varepsilon\right], \tag{5.3.8}$$
$^a$ This corrects an error in Landau and Lifshitz [60] who give the eccentricity as $\varepsilon = \sqrt{a^2/c^2 - 1}$. This is obviously incorrect since the eccentricity must vary between 0 and 1. Their expression for the ‘depolarization coefficient’ of an oblate ellipsoid (4.34) should be replaced by
$$n^{(z)} = \frac{1}{\varepsilon^3}\left[\varepsilon - \sqrt{1 - \varepsilon^2}\cos^{-1}\sqrt{1 - \varepsilon^2}\right].$$
expressed in terms of the eccentricity, $\varepsilon = \sqrt{1 - a^2/c^2}$. In Sec. 5.4.4 we will see that the distortion of the sphere into an oblate, or prolate, spheroid is attributed to the motion. This means setting the eccentricity equal to the relative velocity, for when the latter vanishes so, too, will the former. It will then be appreciated that the terms in the parentheses of (5.3.8) represent the difference between total electric and magnetic energies of a charged spheroid and the electrostatic energy it would have had it remained a sphere [Bucherer 04]. The terms also happen to be the difference between hyperbolic and Euclidean distances in velocity space. Consequently, the force is a measure of the deviation from Euclidean geometry due to deformation caused by the motion. For small values of the eccentricity, (5.3.8) reduces to the Newtonian law, $F_{x=c} = -GM/c^2$. In the following, we will replace the uniform mass distribution over the spheroid by a uniform charge distribution, and the eccentricity by the relative velocity. In so doing we will derive the mass dependence on the relative velocity from the relativistic expression for the momentum.
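The limiting statements of this section can be checked numerically. The following is a minimal sketch, not part of the original derivation: it evaluates the closed form (5.3.6) for the oblate spheroid against its two-term expansion, and the pole formula (5.3.8) for the prolate spheroid at small eccentricity. The function names and the choice of units with $GM = 1$ are assumptions made only for illustration.

```python
import math

def oblate_axial_force(z, a, c, GM=1.0):
    """Closed form (5.3.6): attraction of an oblate spheroid (a = b > c) at axial distance z."""
    f = math.sqrt(a * a - c * c)                     # sqrt(a^2 - c^2)
    return -3.0 * GM / f**2 * (1.0 - (z / f) * math.atan(f / z))

def oblate_axial_series(z, a, c, GM=1.0):
    """Two-term large-z expansion quoted after (5.3.6)."""
    return -GM * (1.0 / z**2 - 0.6 * (a * a - c * c) / z**4)

def prolate_pole_force(c, ecc, GM=1.0):
    """Formula (5.3.8): attraction at the pole x = c of a prolate spheroid of eccentricity ecc."""
    return -3.0 * GM / (c * c * ecc**3) * (math.atanh(ecc) - ecc)

if __name__ == "__main__":
    a, c = 1.0, 0.8
    for z in (2.0, 5.0, 20.0):
        # closed form, series, and the point-mass (spherical) value -GM/z^2
        print(z, oblate_axial_force(z, a, c), oblate_axial_series(z, a, c), -1.0 / z**2)
    for ecc in (0.5, 0.1, 0.01):
        # tends to the Newtonian value -GM/c^2 as ecc -> 0
        print(ecc, prolate_pole_force(1.0, ecc))
```

In this sketch the oblate value is smaller in magnitude than the point-mass value at the same distance, and the prolate pole value approaches $-GM/c^2$ as the eccentricity shrinks, as stated in the text.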
5.4 Electromagnetic Mass

. . . the whole idea of electromagnetic mass is based on the view that the forces between point-charges do not obey the principle of action-reaction [O’Rahilly 38]
It may be said that electromagnetism was clarified by relativity whereas mechanics was transformed by it. The idea was that quantities like the Poynting vector could be used in defining mass since it is proportional to the momentum. Relativistic motion of matter led to inconsistencies with Newtonian dynamics, and only increased the need to place classical mechanics on an electrodynamic foundation in order to reconcile its divergence with classical theory. The desire for symmetry in the natural laws led Maxwell and his followers to consider a magnetic pole on the same footing as electric charge. However, it was the discovery of the electron and the failure to find the magnetic monopole that led the mass concept to be associated with the electron and its electric charge. For the sake of symmetry alone, we could restore this equivalence by writing alongside the Lorentz force,
$$\mathbf{F}_e = e\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{B}\right), \tag{5.4.1}$$
the Lorentz force for the magnetic pole,
$$\mathbf{F}_q = q\left(\mathbf{H} - \frac{\mathbf{v}}{c}\times\mathbf{D}\right), \tag{5.4.2}$$
where $\mathbf{E}$ and $\mathbf{H}$ are the electric and magnetic field intensities, $\mathbf{B}$ and $\mathbf{D}$ the magnetic and electric flux densities, and $e$ and $q$ the electric charge and magnetic pole strength, respectively. The Lorentz force (5.4.1) is what the observer at rest would measure on an electron in motion with a velocity $\mathbf{v}$. It can be decomposed into components that are parallel, $\mathbf{F}_\parallel = e\mathbf{E}_\parallel$, and perpendicular,
$$\mathbf{F}_\perp = e\left(\mathbf{E}_\perp + \frac{\mathbf{v}}{c}\times\mathbf{B}\right),$$
to the velocity. If we were to apply Newton’s second law, like the early practitioners of relativity, we would come out with two masses instead of one. These masses were baptized the ‘transverse’ and ‘longitudinal’ masses. At relativistic speeds, the longitudinal mass was much larger than the transverse mass, and since it did not fit the experimental measurements made by Kaufmann in the early part of the last century it was swept under the rug. This is just one example where electromagnetism raised havoc with mechanics. Now, let us consider the mass of a charged conductor at rest. The energy density due to the electrostatic field is
$$W_e = \frac{\epsilon_0}{2}|\mathbf{E}|^2.$$
If, for the sake of simplicity, we assume the conductor to be a sphere of radius $R$, with a uniform surface charge density $\rho = e/4\pi R^2$, then the electric field intensity is
$$\mathbf{E} = \frac{e\,\hat{\mathbf{r}}}{4\pi\epsilon_0 r^2}, \tag{5.4.3}$$
for $r \ge R$, where $\hat{\mathbf{r}}$ is the unit vector and $\epsilon_0$ is the dielectric constant in free space. The total energy over all space is
$$\int_R^\infty \frac{1}{2}\epsilon_0|\mathbf{E}|^2\,4\pi r^2\,dr = \frac{e^2}{8\pi\epsilon_0 R}. \tag{5.4.4}$$
Equating this with the rest energy, $mc^2$, gives the expression
$$m_{el} = \frac{e^2}{8\pi\epsilon_0 Rc^2}, \tag{5.4.5}$$
for the electrostatic mass. Consider, now, the charge to be moving with a uniform velocity $\mathbf{u}$. At low speeds a charge generates an electric field intensity, $\mathbf{E}$, and a magnetic flux density, $\mathbf{B}$, whose magnitudes are related by
$$B(r) = |\mathbf{B}(r)| = \frac{1}{c^2}|\mathbf{u}\times\mathbf{E}| = \frac{e\mu_0}{4\pi r^2}|\mathbf{u}\times\hat{\mathbf{r}}| = \frac{e\mu_0 u\sin\theta}{4\pi r^2}, \tag{5.4.6}$$
for $r > R$, and 0 for $r < R$, where we introduced Coulomb’s law (5.4.3), and $\theta$ is the angle between $\mathbf{r}$ and $\mathbf{u}$. Equation (5.4.6) is known as the Biot–Savart law, named after its discoverers, where $\mu_0$ is the magnetic permeability of free space. Thomson now considers the magnetic field energy as the kinetic energy since it has been generated by the motion of the charge. Thus, according to Thomson, the kinetic energy is
$$\frac{1}{2}mu^2 = \frac{1}{2\mu_0}\int_R^\infty\!\!\int_0^\pi B^2(r)\,2\pi r^2\sin\theta\,d\theta\,dr = \frac{\mu_0e^2}{12\pi R}u^2.$$
A comparison with (5.4.5), using $\mu_0 = 1/\epsilon_0c^2$, readily gives
$$m = \frac{4}{3}m_{el} = m_{em}, \tag{5.4.7}$$
which has been called the electromagnetic mass. From its derivation it would appear that expression (5.4.7) is valid only for small velocities. Instead of electrostatic considerations, we can begin with the Liénard force law of electron theory [cf. Eq. (4.1.8)], e2 e2 cos θ 2 e2 2 2 (u˙ x + u˙ r ) , c + Fx = + u − 3u − u u u u − x r j r j 8π 0 r 4π 0 c2 r2 4π 0 r2 j
(5.4.8)
where $\theta$ is the angle between $\mathbf{r}$ and $x$. If we take $\mathbf{u}$ to be in the $x$-direction, with $u_r = u\cos\theta$, $\dot{u}_x = \dot{u}$, and $\dot{u}_r = \dot{u}\cos\theta$, all terms in the velocities will contain odd powers of $\cos\theta$ and, hence, average to zero, whereas there will
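be a finite contribution coming from the acceleration terms. Then averaging
$$F_x = -\dot{u}\,\frac{e^2}{8\pi\epsilon_0c^2r}(1+\cos^2\theta),$$
over all directions at $r = R$, where the mean value of $\cos^2\theta$ is $\frac{1}{3}$, gives
$$\bar{F}_x = -m_{el}\dot{u}\left(1+\frac{1}{3}\right) = -m_{em}\dot{u}. \tag{5.4.9}$$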
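Both routes to the 4/3 factor reduce to elementary angular integrals, which can be confirmed numerically. The sketch below is only an illustration under stated assumptions: units are chosen so that $\mu_0 = e = u = R = 1$, the radial integral of $r^{-2}$ from $R$ to infinity is taken analytically, and the helper names are hypothetical.

```python
import math

def sphere_average(f, n=20000):
    """Average f(theta) over all directions (weight sin(theta)/2), midpoint rule."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        total += f(theta) * math.sin(theta)
    return 0.5 * total * h

def field_energy_ratio():
    """m/m_el from the magnetic field energy, in units mu0 = e = u = R = 1."""
    sin3_integral = 2.0 * sphere_average(lambda t: math.sin(t) ** 2)  # integral of sin^3 over [0, pi]
    # (1/2 mu0) B^2 = sin^2(theta)/(32 pi^2 r^4); volume element 2 pi r^2 sin(theta) dtheta dr;
    # the radial integral of r^-2 from R = 1 to infinity equals 1.
    kinetic = sin3_integral * 2.0 * math.pi / (32.0 * math.pi**2)
    m = 2.0 * kinetic                     # kinetic energy = m u^2 / 2 with u = 1
    m_el = 1.0 / (8.0 * math.pi)          # e^2/(8 pi eps0 R c^2) = mu0 e^2/(8 pi R)
    return m / m_el

def force_law_ratio():
    """m_em/m_el from averaging the acceleration factor (1 + cos^2 theta)."""
    return sphere_average(lambda t: 1.0 + math.cos(t) ** 2)

if __name__ == "__main__":
    print(field_energy_ratio(), force_law_ratio())   # both should be close to 4/3
```

The two functions share nothing but the angular average, which mirrors the point made in the text that two quite different sets of assumptions lead to the same factor.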
There is consensus that [Yaghjian 92] Lorentz and Abraham were also unconcerned with the electromagnetic mass $m_{em}$ equaling 4/3 the electrostatic mass $m_{el}$, defined as the energy of formation of a spherical charge . . . because they derived the equation of motion before Einstein’s 1905 papers on relativistic electrodynamics . . .
Nothing could be further from the truth, and Einstein’s papers do not provide one iota of insight into the 4/3 factor. Although it does not resolve the 4/3 factor, we have given two diametrically opposite proofs of the electromagnetic mass being 4/3 the electrostatic mass. In the first proof we have assumed a continuous distribution of charge, and the mass derived applies to a state of uniform motion. In the second derivation the spherical symmetry of the electron eliminates any dependency on the velocity in the force law, leaving only the acceleration terms. The origin of the mass is placed squarely on the interaction of the two charges, and the resulting force does not obey Newton’s third law. So it seems that, here again, two proofs with incompatible assumptions lead to the same result. O’Rahilly [38] thinks that the error lies in the definition of the ‘kinetic’ energy, $T$. It should be clear from the expression of the Lagrangian what is the kinetic energy. Starting with the Liénard electrokinetic potential,
$$L = \frac{ee'}{4\pi\epsilon_0c^2r}\left[c^2 + u^2 - u_r^2 - \sum_j u_ju'_j - r\dot{u}_r\right], \tag{5.4.10}$$
from which the equation of motion (5.4.8) follows via the Euler–Lagrange equations,
$$F_x = -\frac{\partial L}{\partial r} + \frac{d}{dt}\frac{\partial L}{\partial u_r},$$
he claims that Thomson’s procedure, “still reproduced in text-books,” consists in setting $u = u'$, $e = e'$, taking the velocity along the $x$-direction so
that $u = u_x$, with $u_r = u\cos\theta$, and is constant, where $\theta$ is the angle between $\mathbf{r}$ and $x$. In this case the Liénard electrokinetic potential reduces to $L = 2W_0 - W_0(1+\cos^2\theta)$. Averaging over $\theta$ gives
$$2W_0 - L = \frac{4}{3}W_0 = \frac{4}{3}m_{el}u^2 = m_{em}u^2.$$
This would identify the left-hand side as $2T$, which makes no sense. Furthermore, the definition of the electrostatic mass, and consequently the electromagnetic mass, depends on our conception of what the electron looks like. For if we consider the charge to be uniformly distributed over the surface, $\sigma = e/4\pi\epsilon_0R^2$, then the field inside the surface is zero and outside is (5.4.3). This gives an electrostatic energy (5.4.4), corresponding to an electromagnetic mass (5.4.7). Alternatively, if there is a volume distribution of charge with a charge density $\rho = 3e/4\pi\epsilon_0R^3$, there will be a contribution to the electrostatic energy for $r < R$, since there is a nonvanishing electric field $E = \frac{1}{3}\rho r = er/4\pi\epsilon_0R^3$. This gives the total electrostatic energy,
$$W_0 = \frac{1}{2}\int_0^R\frac{e^2}{4\pi\epsilon_0R^6}r^4\,dr + \frac{1}{2}\int_R^\infty\frac{e^2}{4\pi\epsilon_0r^2}\,dr = \frac{6}{5}\,\frac{e^2}{8\pi\epsilon_0R}, \tag{5.4.11}$$
which is the same as the average gravitational energy, and gives an electromagnetic mass, $\frac{6}{5}\cdot\frac{4}{3}m_{el} = \frac{6}{5}m_{em}$, greater than (5.4.7) due to the volume contribution to the energy in (5.4.11). There is nothing stranger in considering like charges distributed over a surface than to consider their density distribution in a finite volume. Both make no physical sense since like charges repel one another, and to consider a Poincaré pressure acting on a surface or throughout a volume has no physical relevance. Everything is fine so long as we treat the energy of interaction of two point charges. When we attempt to give the electron ‘body and shape,’ we run into trouble. The energy integrals extend over all of space, where the aether lives, but the electron can only be of finite extension. This is echoed in J. J. Thomson’s view where the mass has been increased by the charge; and since the increase is due to the magnetic force in the space around the charge, the increased mass is in this space and not in the charged sphere.
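The factor 6/5 in (5.4.11) is easy to confirm by quadrature. The following is a minimal sketch, assuming units in which $e^2/4\pi\epsilon_0 = 1$ and $R = 1$; the exterior integral is taken analytically and the function name is hypothetical.

```python
def volume_to_surface_energy_ratio(n=10000):
    """Ratio of the electrostatic energy of a uniformly charged ball to that of a charged shell."""
    h = 1.0 / n
    # Interior part of (5.4.11): (1/2) * integral of r^4 from 0 to R = 1, midpoint rule.
    inside = 0.5 * sum(((k + 0.5) * h) ** 4 for k in range(n)) * h
    # Exterior part: (1/2) * integral of r^-2 from 1 to infinity = 1/2, evaluated analytically.
    outside = 0.5
    return (inside + outside) / outside

if __name__ == "__main__":
    ratio = volume_to_surface_energy_ratio()
    print(ratio)                  # close to 6/5
    print(ratio * 4.0 / 3.0)      # corresponding electromagnetic mass in units of m_el
```

The interior field contributes one fifth of the exterior energy, which is the whole of the difference between the surface and volume models.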
In the force law interpretation, the electric charges retain their point-like character and in defining the electrostatic mass there is no integral over all the volume. But it suffers from being an expansion in the relative velocity only to order $1/c^2$. According to Ritz’s treatment, higher-order terms in the expansion give the next correction to the force as
$$F_x^{(2)} = \frac{e^2}{8\pi\epsilon_0c^3}(\ddot{u}_x + \ddot{u}_r\cos\theta).$$
Taking $\ddot{\mathbf{u}}$ along $x$, and $\ddot{u}_r = \ddot{u}\cos\theta$, and then averaging gives
$$F_x^{(2)} = \frac{e^2}{6\pi\epsilon_0c^3}\ddot{u}. \tag{5.4.12}$$
Expression (5.4.12) is independent of the electron’s classical radius, and represents dissipation due to radiation. According to Larmor [00], who discovered this effect in 1897, the expression for the magnetic field must be modified to read
$$B = \frac{\mu_0e\sin\theta}{4\pi r}\left(\frac{u}{r} + \frac{\dot{u}}{c}\right), \tag{5.4.13}$$
where for periodic motion the second term would average out to zero, but “will be preponderant in the integral of $B^2$ across the shell [of radiation] when $r$ is great.” In fact, the integral over the radial coordinate from $R$ to infinity will diverge. But, since the integral is independent of $r$, the “energy of the expanding shell is conserved as it moves.” Averaging the power density,
$$\mathbf{F}\cdot\mathbf{u} = -\frac{d}{dt}\left(\frac{1}{2}m_{em}u^2 - \frac{e^2}{6\pi\epsilon_0c^3}u\dot{u}\right) - \frac{e^2}{6\pi\epsilon_0c^3}\dot{u}^2, \tag{5.4.14}$$
over the ‘time of motion’ leaves only the last term in (5.4.14) which, according to Larmor [00], represents the amount of energy per unit time that travels away and is lost to the system, the velocity of the electron being as usual taken to be of a lower order than that of radiation.
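That only the last term of (5.4.14) survives the time average can be illustrated for a simple periodic motion. The sketch below is an assumption-laden example rather than Larmor’s argument: the motion is taken to be $u(t) = U\sin\omega t$, the constant $k$ stands for $e^2/6\pi\epsilon_0c^3$, and all numerical values are arbitrary.

```python
import math

def period_averages(U=1.0, omega=2.0, k=1.0, m_em=1.0, n=20000):
    """Average the three pieces of (5.4.14) over one period of u(t) = U sin(omega t)."""
    period = 2.0 * math.pi / omega
    h = period / n
    inertial = reaction = radiated = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        u = U * math.sin(omega * t)
        du = U * omega * math.cos(omega * t)
        ddu = -U * omega**2 * math.sin(omega * t)
        inertial += -m_em * u * du            # -d/dt (m_em u^2 / 2)
        reaction += k * (du * du + u * ddu)   # +d/dt (k u du/dt)
        radiated += -k * du * du              # last term of (5.4.14)
    return inertial * h / period, reaction * h / period, radiated * h / period

if __name__ == "__main__":
    # The two total-derivative terms average to zero; the third gives -k U^2 omega^2 / 2.
    print(period_averages())
```

The surviving average is negative definite, which is why it is read as an energy loss from the system.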
This being so, then the neglect of such terms in the electromagnetic mass would mean that such mass cannot be converted into radiation because the
loss term is absent. That is, if we expand the force,
$$F_x = \frac{ee'}{4\pi\epsilon_0r^2}\left[A\cos(rx) - B\frac{u_x}{c} - C\frac{r\dot{u}_x}{c^2} - D\frac{r^2\ddot{u}_x}{c^3} - \cdots\right],$$
in powers of $1/c$, where the coefficients are functions of the velocities and accelerations, it becomes clear that the radiation terms will enter at higher order. Thus, we come to the conclusion that it is not the electrostatic or electromagnetic mass which can be converted into radiation, but only the contributions coming from the higher-order terms. It would therefore appear that without acceleration, mass cannot be converted into radiation. What then does the equivalence of mass and energy mean? According to O’Rahilly [38] the Lorentz–Einstein theory “has nothing . . . to say about the general interconvertibility of mass and energy.” We may say that $h\nu/c^2$ is the mass equivalent of radiation, but it cannot be identified with the mass of an electron.
5.4.1 What does the ratio e/m measure?
At the beginning of the twentieth century, experiments were devised to measure the charge-to-mass ratio in the hope of discovering the true origin of mass. Associating the kinetic energy of a particle with the energy of the magnetic field produced by an electric charge in motion, J. J. Thomson reasoned that it would take more energy to start or stop an electric charge than if it were neutral. This extra energy could be attributed to the inertia of a particle so that it would appear to have an additional mass due to its charge. This idea was first expressed by Thomson in April 1881, to which we will return in Sec. 5.4.3, but it did not attract much interest until 1897 when he found that cathode rays were negatively charged electrons traveling along the cathode ray tube at very high speeds. A few years later, it was observed that rays emitted from radium salts behaved in electromagnetic fields as if they were composed of negatively charged particles. Experiments, involving the deflection of these rays in electric and magnetic fields, showed that the charge-to-mass ratio was of the same order as that of cathode rays, $\sim 10^7$. They were christened β-rays by Rutherford, and were used interchangeably with (high speed) electrons. The β-rays were deflected less in a magnetic field
than cathode rays and could thus reach higher speeds, even exceeding 0.9 that of light. β-rays were therefore better candidates to determine the variation of the ratio of $e/m$ with velocity than cathode rays which did not register any variation, and which could only be accelerated up to 0.3 the speed of light. The experimental variation with velocity could then be confronted with that predicted by theory. There were two main contenders which viewed the electron in motion as prolate and oblate spheroids, whose axes in the direction of motion were shortened as a result of the FitzGerald–Lorentz contraction. We will discuss these models in greater detail in Sec. 5.4.4; let it suffice here to give their expressions. Whereas the Abraham model predicted the mass should vary as
$$m = \frac{3}{4}\frac{m_0}{\beta^2}\left[\frac{1+\beta^2}{2\beta}\ln\frac{1+\beta}{1-\beta} - 1\right],$$
the Lorentz model predicted
$$m = \frac{m_0}{\sqrt{1-\beta^2}},$$
where $m_0$ is the mass of the electron in a state of rest, and $\beta = u/c$ is the relative velocity. The initial experiments were not of sufficient accuracy to distinguish between these two expressions for the mass, but they did show a very marked change of $e/m$ with the relative speed. This led Thomson [28] to the conclusion that this is “consistent with the view that all the mass is electrical.” For if the mass were not entirely electrical in origin, “a constant term would have to be added [to the mass expressions] to represent the non-electrical mass.” Experiments carried out over a span of more than a decade showed [Thomson & Thomson 28] $e/m\sqrt{(1-\beta^2)}$ plotted against $u$, so the points should lie on a line parallel to the axis if the Lorentz formula is true, and it will be seen that they do so within the errors of experiment.
But, in Fig. 5.7 we see that it is the ratio e/m which is plotted against u/c. The experiments carried out by Bucherer and others use crossed fields.
Fig. 5.7. The ratio of charge to the mass as a function of the relative velocity. The sloping curve is the ratio determined by Abraham while the horizontal curve results from Lorentz’s formula.
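The size of the difference between the two mass formulas quoted above can be tabulated directly. This is only a sketch for orientation: the function names are hypothetical, the sample velocities are arbitrary, and no experimental data are used.

```python
import math

def abraham_mass_ratio(beta):
    """m/m0 for Abraham's electron, from the expression quoted in the text."""
    log_term = 2.0 * math.atanh(beta)               # ln((1 + beta)/(1 - beta))
    return 0.75 / beta**2 * ((1.0 + beta**2) / (2.0 * beta) * log_term - 1.0)

def lorentz_mass_ratio(beta):
    """m/m0 for Lorentz's electron."""
    return 1.0 / math.sqrt(1.0 - beta**2)

if __name__ == "__main__":
    for beta in (0.1, 0.3, 0.5, 0.7, 0.9):
        ma, ml = abraham_mass_ratio(beta), lorentz_mass_ratio(beta)
        # If the measured e/m follows Lorentz, the rest value e/m0 inferred with
        # Abraham's formula drifts with velocity by the factor ma/ml.
        print(f"beta={beta:.1f}  Abraham={ma:.4f}  Lorentz={ml:.4f}  ratio={ma / ml:.4f}")
```

Both expressions tend to $m_0$ at low speed and Abraham’s lies below Lorentz’s at every velocity, so a rest-value ratio that is flat under one formula must slope under the other.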
β-rays are generated between plates of a condenser, and an outer solenoid applies a magnetic field. On emerging from the condenser the electrons strike a photographic plate at a distance $\delta$ from the condenser. The electric field has a sole component, $E_y$, perpendicular to the direction of motion of the electrons along the $x$-axis. The magnetic field has two components, $H_x = H\cos\vartheta$, and $H_z = H\sin\vartheta$. The configuration is shown in Fig. 5.8. There is no acceleration along the $x$- and $z$-directions, and the acceleration along the $y$-direction will vanish when the Lorentz force vanishes, viz. $E_y = \beta H_z = \beta H\sin\vartheta$. This is the condition for the rectilinear motion of the electrons so that they will be able to pass through the narrow gap in the condenser plates. On emerging from the plates they are acted on by the magnetic field causing an acceleration, $a_y$, in the $y$-direction,
$$euH\sin\vartheta = ma_y. \tag{5.4.15}$$
The deviation in the $y$-direction, for a particle moving under constant acceleration, is given by
$$y = \frac{1}{2}a_y\tau^2,$$
Fig. 5.8. The orientation of the fields in Bucherer’s experiment.
where $\tau$ is the time of flight. Using it to eliminate the acceleration in (5.4.15) gives the charge-to-mass ratio as
$$\frac{e}{m} = \frac{2u^2y}{E_y\delta^2}, \tag{5.4.16}$$
where $\tau = \delta/u$ has been used to eliminate the time of flight. However, (5.4.16) does not agree with the experimental results. Rather, if we modify the right-hand side of (5.4.15) by multiplying it by $1/\sqrt{1-\beta^2}$, (5.4.16) becomes
$$\frac{e}{m} = \frac{2u^2y}{E_y\delta^2\sqrt{1-\beta^2}}, \tag{5.4.17}$$
whose constancy does agree with what is observed experimentally. Thus, it is not what Thomson claims that is held constant. Moreover, O’Rahilly [1965b] claims that no Lorentz type of “ad hoc modification” is required in Ritz’s theory since as far as second-order terms in the relative velocity are concerned, there is no change in H but Ey will undergo an increase
by the amount $1/\sqrt{1-\beta^2}$. This does not give the same result as (5.4.17). Finally, if we use the Liénard–Wiechert expression for the field produced by a charge moving at a uniform velocity $u$ [cf. (5.4.21) below], we come out with (5.4.16) when it is introduced into (5.4.17), or if it is introduced into (5.4.16), we come out with a ratio, $e/m$, decreasing as $u$ increases, as that predicted by Abraham’s model in Fig. 5.7! Returning to Thomson, he goes further and claims that
This result might be supposed to prove that the whole of the mass of the electron is electrical. If this is so, and the electron is assumed spherical, its radius $a$ can be found from the equation $m_0 = \frac{2}{3}e^2/a$.
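To get a feeling for the radius implied by Thomson’s relation, it can be evaluated in rationalized (SI) units, where it takes the form $m_0 = e^2/6\pi\epsilon_0ac^2$ used in (5.4.24). The sketch below is illustrative only: the numerical constants are standard CODATA-style values supplied here as assumptions, not quantities taken from the text, and the function name is hypothetical.

```python
import math

# CODATA-style SI constants, supplied for illustration (not taken from the text).
E_CHARGE = 1.602176634e-19      # elementary charge, C
EPS0 = 8.8541878128e-12         # permittivity of free space, F/m
M_ELECTRON = 9.1093837015e-31   # electron rest mass, kg
C_LIGHT = 299792458.0           # speed of light, m/s

def thomson_radius(mass=M_ELECTRON, charge=E_CHARGE):
    """Radius a for which e^2/(6 pi eps0 a c^2) equals the given mass."""
    return charge**2 / (6.0 * math.pi * EPS0 * mass * C_LIGHT**2)

if __name__ == "__main__":
    print(thomson_radius())   # metres
```

With these inputs the radius comes out at two-thirds of what is usually called the classical electron radius, i.e. somewhat under $2\times10^{-15}$ m.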
If we set such a rest mass in motion we should find that it acquires inertia of the order $1/\sqrt{1-\beta^2}$. But, then what is there to be distinguished in the ratio $e/m$? Putting a charge in motion should also increase its inertia by the same amount. We will return to this shortly. What is even more provoking is Thomson’s affirmation that
Einstein has shown that to conform with the principles of Relativity mass must vary with velocity according to the law $m_0/\sqrt{(1-u^2)}$. This is a test imposed by Relativity on any theory of mass. We see that it is satisfied by the conception that the whole of the mass is electrical in origin, and this conception is the only one yet advanced which gives a physical dependency of mass on velocity.
General principles can never be used to determine specific formulas. If the mass is completely electrical then it should have the increased inertia predicted from the relativistic formula, although the distinction between mass and charge has all but disappeared. Moreover, what Thomson is saying is that neutral mass should not manifest any variation with speed, for a neutral particle in motion does not create a magnetic field. How then does relativity distinguish between charged and neutral matter? Kaufmann is slightly more cautious and writes the total mass as the sum of the mechanical (‘real’) mass and the electromagnetic (‘apparent’) mass. It is the latter that contains all the dependency on the speed. If this were the case the ratio $e/m$ would contain the charge in both the numerator and in the denominator so that at zero speed it would reduce to
$$\frac{e}{m + e^2/6\pi\epsilon_0ac^2}.$$
When the particle is set in motion why should only the second term in the denominator acquire a dependency on its speed? According to Millikan, since an electric current, by virtue of the property called self-induction, opposes any attempt to increase or diminish its magnitude, it is clear that an electric charge as such possesses properties of inertia . . . It is clear then that theoretically that an electrically charged pith ball must possess more mass than the same pith ball uncharged.
Until mass is defined in a way which does not employ charge the distinction between the two is more than precarious. As far back as 1911, More suggested that the ratio,
$$e/m = (e/m)_0\sqrt{1-\beta^2}, \tag{5.4.18}$$
can either be interpreted as Lorentz does, $e = e_0$ and $m_0 = m\sqrt{1-\beta^2}$, or $m = m_0$ and $e = e_0\sqrt{1-\beta^2}$. Mass will increase with speed at constant charge, or charge will diminish with speed at constant mass. According to Bridgman, “the operations do not exist by which unique meaning can be given to the question of whether the magnitude of a charge is a function of its velocity.” If we opt for the second choice, we can expect modifications to Coulomb’s law when charges are set into motion. Such corrections were known prior to the advent of the special theory and Lorentz transforms. They were derived by Liénard and Wiechert in 1898 and 1900, respectively. They found the expressions for the vector and scalar potentials as [cf. (4.1.10) with the difference that we are now using rationalized units],
$$\mathbf{a} = \frac{e\boldsymbol{\beta}}{4\pi\epsilon_0(r - \boldsymbol{\beta}\cdot\mathbf{r})}, \qquad \phi = \frac{e}{4\pi\epsilon_0(r - \boldsymbol{\beta}\cdot\mathbf{r})}, \tag{5.4.19}$$
where $\mathbf{u} = c\boldsymbol{\beta}$ is the velocity of the charge, and $\mathbf{r}$ is the radius vector, taken from where the charge is located to where it is observed. The terms on the right-hand sides of (5.4.19) must be evaluated at the earlier time where the charge was located. The negative of the electric field is obtained by taking the sum of the gradient of the scalar potential and the time-derivative of the vector potential,
$$\mathbf{E} = -\frac{1}{c}\dot{\mathbf{a}} - \nabla\phi.$$
In an inertial frame, the same expression is obtained from a Lorentz transform, viz.
$$\mathbf{E} = \frac{e\mathbf{r}}{4\pi\epsilon_0r^3}\,\frac{1-\beta^2}{(1-\beta^2\sin^2\vartheta)^{3/2}}, \tag{5.4.20}$$
where $\vartheta$ is the angle between the radius vector $\mathbf{r}$ and the direction of motion of the charge. If the two should coincide, (5.4.20) reduces to
$$E_\parallel = \frac{e}{4\pi\epsilon_0r^2}(1-\beta^2),$$
while if the two directions are perpendicular to one another,
$$E_\perp = \frac{e}{4\pi\epsilon_0r^2}\,\frac{1}{\sqrt{1-\beta^2}}. \tag{5.4.21}$$
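The two limits just quoted, and the fact that the total flux of (5.4.20) through a sphere is unchanged by the motion, can be checked with a few lines. This is a sketch under the assumption that only the angular factor multiplying the Coulomb field matters; the function names are hypothetical.

```python
import math

def field_factor(beta, theta):
    """Ratio of the field (5.4.20) to the Coulomb value e/(4 pi eps0 r^2) at angle theta."""
    return (1.0 - beta**2) / (1.0 - (beta * math.sin(theta)) ** 2) ** 1.5

def flux_average(beta, n=20000):
    """Average of field_factor over all directions (weight sin(theta)/2), midpoint rule."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        total += field_factor(beta, theta) * math.sin(theta)
    return 0.5 * total * h

if __name__ == "__main__":
    beta = 0.6
    print(field_factor(beta, 0.0))            # longitudinal: 1 - beta^2
    print(field_factor(beta, math.pi / 2))    # transverse: 1/sqrt(1 - beta^2)
    print(flux_average(beta))                 # close to 1: the enclosed charge is unchanged
```

The field is redistributed in angle, weakened along the motion and strengthened across it, while its average over directions stays equal to the Coulomb value.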
As the speed of a charge increases it will have opposing effects on the components of the electric field. The component of the field in the direction of the motion is contracted like that of a sphere into a spheroid, while the normal component is increased: like the mass? or the electric charge? This has not gone unnoticed, for Bridgman claims that Bush has “shown that there are advantages in supposing the charge of an electron to change when it is set in motion.” The electric field entering in Lorentz’s law is (5.4.20), if the acceleration terms are omitted as they must be in an inertial state. Such a state cannot radiate electromagnetic waves, but such a state is precisely that in which the ratio $e/m$ has been determined! The electrokinetic potential for deriving the force is
$$L = e(\phi - \boldsymbol{\beta}\cdot\mathbf{a}) = \frac{e(1-\beta^2)}{4\pi\epsilon_0(r - \boldsymbol{\beta}\cdot\mathbf{r})} \tag{5.4.22}$$
$$\phantom{L} = \frac{e(1-\beta^2)}{4\pi\epsilon_0r'\sqrt{1-\beta^2\sin^2\vartheta'}}, \tag{5.4.23}$$
where $r'$ is the distance from the charge to the observer at the exact moment he observes the charge. The last equality in (5.4.22) follows from $\mathbf{r}' = \mathbf{r} - \mathbf{u}r/c$, from which it also follows that the magnitudes of the two distances, ‘then’ and ‘now,’ are in the inverse ratio of their
respective sines, i.e.
$$\frac{r'}{r} = \frac{\sin\vartheta}{\sin\vartheta'},$$
where $\vartheta$ is the angle between $\mathbf{r}$ and $\mathbf{u}$. The potential (5.4.22) can be traced all the way back to Clausius who introduced it as the ‘electrodynamic’ potential in 1857. Later it was rediscovered by Schwarzschild who named it the ‘electrokinetic’ potential, and it is what Searle referred to as the ‘convection’ potential for a charge moving at a constant velocity. The retarded potentials (5.4.19), or for that matter the potentials themselves, do not enter Maxwell’s equations. Heaviside showed dispraise for them, referring to “the metaphysical nature of the propagation of the potentials,” but they are necessary when Maxwell’s equations are to accommodate mass, as we shall see in Sec. 11.5.2. The force components follow just as in mechanics, viz.
$$F_x = -\frac{\partial L}{\partial x} + \frac{d}{dt}\frac{\partial L}{\partial u_x}.$$
On this expression, Ritz remarked: This expression reduces, in first approximation, to the law of the inverse square of the distance; we can therefore call it the law of Newton generalized . . . In these formulas the notion of field does not intervene . . . This remarkable result, due to Schwarzschild, shows that Lorentz’s theory resembles the older theories much more than we could at first sight believe.
The ‘older’ theories Ritz is referring to are those of action at a distance in which the charged particles retained their corpuscular character and were not something belonging to a region of a continuum — the aether — which would increase or decrease, appear or disappear in a continuous fashion.
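The geometry behind the last equality in (5.4.22), relating the separation ‘then’ (when the signal left the charge) to the separation ‘now’ (when it is observed), can be verified for any configuration. The sketch below assumes a charge moving uniformly along $x$ with speed $\beta$ in units of $c$; the function name and the sample numbers are hypothetical.

```python
import math

def then_and_now(beta, x_then, y_then):
    """Return (r_then, r_now, sin_then, sin_now) for a charge moving uniformly along x.

    (x_then, y_then) is the observer's position relative to the charge at the
    moment of emission; while the signal travels the distance r_then, the charge
    advances by beta * r_then along x.
    """
    r_then = math.hypot(x_then, y_then)
    x_now = x_then - beta * r_then
    r_now = math.hypot(x_now, y_then)
    return r_then, r_now, y_then / r_then, y_then / r_now

if __name__ == "__main__":
    beta, x, y = 0.6, 3.0, 4.0
    r_then, r_now, sin_then, sin_now = then_and_now(beta, x, y)
    # Denominator of the retarded potentials versus its present-position form.
    print(r_then - beta * x, r_now * math.sqrt(1.0 - (beta * sin_now) ** 2))
    # The two distances are in the inverse ratio of the sines.
    print(r_then * sin_then, r_now * sin_now)
```

Both printed pairs agree, which is just the statement that the transverse offset of the observer from the line of motion is the same ‘then’ and ‘now.’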
5.4.2 Models of the electron
We do not have the remotest idea of how the single electron holds together. It ought to be one of the most explosive and unstable things in physics; yet it behaves as a permanent existence in defiance of every known physical law. [Soddy 32]
Thomson [81, 88], as early as 1881, reasoned that a mass should be heavier when it is charged than when it is uncharged. Although this would be considered relativistic heresy today, the reason he gave was essentially
that of self-induction: Just as starting a current creates an instantaneous electromotive force opposing it, so a charge set in motion creates an electric field, together with a changing magnetic field, that act on the charge to retard its motion. Likewise, when the particle decelerates, the electric field produced by the changing magnetic field acts in the direction of motion of the charge, thereby, again, increasing its inertia. Thus, it appears that a charged mass has a greater inertia than when it is uncharged. This idea was further developed by Abraham [03] who transformed it into a model of the ‘rigid’ electron, and thus constructed the first theoretical model of a subatomic particle. Its generality [Cushing 81] and complete absence of ad hoc assumptions make it all the more incredible that it did not correspond to experiment. Today, it is almost forgotten, being surpassed by the Lorentz model, although it was the first field model of an elementary particle. Abraham’s idea that the mass of the electron could be accounted for by the electrodynamic fields meant that its energy and momentum could be determined from these fields in the case where deviations from a spherical form could be explained as a distortion due to the motion. So Abraham’s model is not as ‘rigid’ as it is made out to be. Abraham based his model on a prolate ellipsoid, whose expression for the capacitance, and, hence, the electrostatic energy he found given in Maxwell’s Treatise on Electricity and Magnetism. Lorentz, on the other hand, considered that the motion distorts the spherical electron into a ‘Heaviside’ ellipsoid, as it was referred to in old literature, or what we commonly know today as an oblate ellipsoid. Our aim will be to show that the expressions for the energies of the oblate and prolate ellipsoids are related by ‘analytic continuation,’ R → iR, that converts elliptic into hyperbolic geometry, respectively.
5.4.3 Thomson’s relation between charges in motion and their mass
The calculation of the additional mass in the small motion limit,
$$m' = \frac{e^2}{6\pi\epsilon_0ac^2}, \tag{5.4.24}$$
where a is the radius of a small sphere, had been made by equating the energy in the magnetic field — equal to the energy in the electric field — to
the kinetic energy $\frac{1}{2}m'u^2$. The kinetic energy per unit volume is
$$\frac{1}{2\mu_0}B^2 = \frac{\mu_0e^2u^2\sin^2\theta}{32\pi^2r^4},$$
where $\theta$ is the angle between the direction of motion of the charge, and a point which is a distance $r$ from the center of the sphere. The kinetic energy density is to be integrated over the volume with an element of a ring whose axis is in the line of motion of the charge with a cross section $r\,dr\,d\theta$ and perimeter $2\pi r\sin\theta$, where $r\sin\theta$ is the radius. This gives the element of volume as $2\pi r^2\sin\theta\,d\theta\,dr$. Hence,
$$\frac{1}{2\mu_0}\int B^2\,dV = \frac{\mu_0e^2u^2}{16\pi}\int_0^\pi\sin^3\theta\,d\theta\int_a^\infty\frac{dr}{r^2} = \frac{\mu_0e^2}{12\pi a}u^2, \tag{5.4.25}$$
is the kinetic energy outside of the sphere of radius $a$. Thus, there is an additional mass which is given by (5.4.24), in the small motion limit. In other words, if $m$ is the mass of the uncharged sphere, when it is charged and set into motion at a velocity $u$, it will have a kinetic energy given by
$$\frac{1}{2}\left(m + \frac{\mu_0e^2}{6\pi a}\right)u^2,$$
where the second term in the parenthesis is seen to be $m'$ given by (5.4.24), observing that $\mu_0 = 1/\epsilon_0c^2$. Thus, Thomson [21] concludes that “when a sphere moves through a liquid it behaves as if its mass were $m + m'$, where $m$ is the mass of the sphere, and (5.4.24) the mass of the liquid displaced by it.” Thomson then goes on to replace the ‘liquid’ by the ‘aether,’ saying that it is necessary for the conservation of momentum. In fact, the entire analogy with self-induction is inappropriate since accelerative motion has not been taken into account, nor have transient effects. According to classical theory, an accelerated charge must necessarily radiate, and, hence, mass will be lost. Moreover, (5.4.24) applies to “indefinitely slow motion.” According to a respectable text [Richtmyer & Kennard 42], for finite motion, relativistic effects must be included in which “the sphere becomes contracted in the direction of the motion in the ratio $\sqrt{(1-\beta^2)} : 1$.” Hence,
there is a further increase in the mass bringing it to
$$m' = \frac{\mu_0e^2}{6\pi a}\cdot\frac{1}{\sqrt{1-\beta^2}}. \tag{5.4.26}$$
But, this makes no sense since $m'$ is the additional mass produced by the accompanying electromagnetic fields during the motion. Lorentz’s factor in (5.4.26), therefore, appears as an ad hoc factor appended onto a result which has already taken into account the motion of the mass. The dilatation (5.4.26) is no more fundamental than the hypotheses of Lorentz and Abraham. In fact, Lorentz [52] admitted Thomson’s priority, but his calculation was considered by him to be “somewhat different from that to which one is led in the modern theory of electrons.” In effect, it had nothing to do with it, and Thomson stood by his results well into the mid-1920s, even after the advent of special relativity.
5.4.4 Oblate versus prolate spheroids
The shortening of the electron ‘is true but not really true’ Eddington
Lorentz and Abraham both devised models of the electron as a sphere suffering from (FitzGerald–Lorentz) contraction when in motion. Although Abraham’s model seemed initially the more promising one, it was Lorentz’s model that finally won the almost ten-year-long battle. Abraham’s model fell into oblivion, and even he does not refer to it in the later editions of his second volume of Theorie der Elektrizität. Many an author considered [O’Rahilly 38]
Abraham’s results are largely of merely historical interest. We shall therefore turn to Lorentz’s ‘contractile electron.’
Our path is not to follow the historical evolution of these two models and the particular assumptions that went into their evaluation [for that see Cushing [81]], but, rather, to show they were two sides of the same coin and when flipped they turned into one another by the addition of i, or its removal. Nevertheless, as late as 1938, the matter of deciding between the Abraham and Lorentz models was far from settled. According to Zahn and Spees [38]
So far as is known to the authors, it appears that, at least for higher velocities, no very satisfactory experimental distinction between the two types of electron has as yet been made by direct electric and magnetic deflections. In view of the fundamental importance of such experiments it seems that much is left to be desired.
They isolated the problem in the so-called ‘10% effect.’ Bucherer and Neumann designed their experiments to make a quantitative measurement of the variation of the electron mass with velocity. In so doing they hoped to distinguish between the Abraham and Lorentz electrons. From Neumann’s data, it appeared that the ratio $e/m_0$ calculated from Lorentz’s theory remained practically constant, while in Abraham’s theory it varied about 10% [cf. Fig. 5.7]. The very large spreads, focusing effects, and scattering, could have resulted in errors of 10% in such a way that masked the variation in Lorentz values while causing an observable variation in those of Abraham.

Notwithstanding the merits of the experiments, the Lorentz and Abraham electrons are, in fact, models of elliptic and hyperbolic geometry, respectively, and come out very simply from two different spheroids. Had Abraham only known his was a model of hyperbolic geometry he could have found allies in Varičak and Silberstein. But, everyone was focused on the expression for the mass and how it compared with the experimental measurements of the ratio $e/m$, made by Kaufmann, and later Bucherer and Neumann. Actually the Bucherer and Neumann experiments do little more than Kaufmann’s, which only succeeded in indicating a large, qualitative, increase in mass with velocity. Again, through the prejudice of wanting the relativistic electron to succeed, and along with it all relativistic matter whether charged or not, we see another golden opportunity wasted of investigating the geometrical structure of the two electron models.

The equation of an ellipsoid,
\[ \frac{x^2}{a^2+s}+\frac{y^2}{b^2+s}+\frac{z^2}{c^2+s}=1, \qquad (a>b>c) \tag{5.4.27} \]
is a cubic equation in $s$, having three different real roots lying in the following ranges,
\[ s_1 \ge -c^2, \qquad -c^2 \ge s_2 \ge -b^2, \qquad -b^2 \ge s_3 \ge -a^2. \]
These roots are coordinates of a point $(x, y, z)$; surfaces of constant $s_1$, $s_2$, and $s_3$ are ellipsoids, hyperboloids of one sheet, and hyperboloids of two sheets, respectively, which are confocal
with the ellipsoid,
\[ \frac{x^2}{a^2}+\frac{y^2}{b^2}+\frac{z^2}{c^2}=1. \]
The problem of finding the electric field of a charged ellipsoid reduces to solving Laplace’s equation [Landau & Lifshitz 60],
\[ \frac{d}{ds}\left(R(s)\frac{d\Phi}{ds}\right)=0, \tag{5.4.28} \]
where surfaces of the constant ellipsoidal coordinate, $s$, are equipotential surfaces. In particular, $s = 0$ represents the surface of the ellipsoid itself, and
\[ R(s)=\sqrt{(s+a^2)(s+b^2)(s+c^2)}, \]
where $s \ge a^2 > b^2 > c^2$. The solution to Laplace’s equation,
\[ \Phi(s)=\frac{e}{16\pi\varepsilon_0}\int_s^\infty\frac{ds}{R(s)}=\frac{e}{16\pi\varepsilon_0}\int_s^\infty\frac{ds}{\sqrt{(s+a^2)(s+b^2)(s+c^2)}}, \]
can be simplified by the change of variable,
\[ z=\left(\frac{a^2-c^2}{a^2+s}\right)^{1/2}. \]
For then we have the known integral,
\[ \frac{\sqrt{a^2-c^2}}{2}\int_s^\infty\frac{ds}{\sqrt{(s+a^2)(s+b^2)(s+c^2)}}=\int_0^{\sqrt{\frac{a^2-c^2}{a^2+s}}}\frac{dz}{\sqrt{(1-z^2)(1-\kappa^2z^2)}}, \tag{5.4.29} \]
where
\[ \kappa=\left(\frac{a^2-b^2}{a^2-c^2}\right)^{1/2} \tag{5.4.30} \]
is the modulus. A further change of variable, $z = \sin\phi$, reduces (5.4.29) to
\[ \int_0^{\sqrt{\frac{a^2-c^2}{a^2+s}}}\frac{dz}{\sqrt{(1-z^2)(1-\kappa^2z^2)}}=\int_0^{\varphi}\frac{d\phi}{\sqrt{1-\kappa^2\sin^2\phi}}=F(\varphi,\kappa), \]
where
\[ \sin\varphi=\left(\frac{a^2-c^2}{a^2+s}\right)^{1/2}. \]
The integral, $F(\varphi,\kappa)$, is Legendre’s elliptic integral of the first kind. Putting the pieces together, we find the expression,
\[ \Phi=\frac{e}{8\pi\varepsilon_0\sqrt{a^2-c^2}}\,F(\varphi,\kappa). \tag{5.4.31} \]
The capacity of the conductor is thus
\[ C_0^{-1}=\Phi/e=\frac{1}{8\pi\varepsilon_0\sqrt{a^2-c^2}}\,F(\varphi,\kappa). \tag{5.4.32} \]
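The reduction (5.4.29) can be checked numerically. The following is only a minimal sketch in Python, assuming a plain midpoint quadrature and the substitution $s' = s + \tan^2 t$ to map the infinite range onto a finite one; the function names and the sample semi-axes are illustrative choices, not taken from the text.

```python
import math

def midpoint(f, lo, hi, n=20000):
    """Composite midpoint rule; never evaluates f at the endpoints."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

def lhs_integral(a, b, c, s):
    """Left side of (5.4.29): (sqrt(a^2-c^2)/2) * int_s^inf ds'/R(s').

    The infinite range is mapped to [0, pi/2) by s' = s + tan(t)^2
    (an assumption of this sketch, chosen for numerical convenience).
    """
    def g(t):
        tan_t = math.tan(t)
        sp = s + tan_t * tan_t
        R = math.sqrt((sp + a * a) * (sp + b * b) * (sp + c * c))
        return 2.0 * tan_t * (1.0 + tan_t * tan_t) / R  # ds' = 2 tan t sec^2 t dt
    return 0.5 * math.sqrt(a * a - c * c) * midpoint(g, 0.0, math.pi / 2)

def legendre_F(phi, kappa):
    """Legendre's elliptic integral of the first kind by direct quadrature."""
    return midpoint(lambda p: 1.0 / math.sqrt(1.0 - (kappa * math.sin(p)) ** 2),
                    0.0, phi)

def rhs_F(a, b, c, s):
    """Right side of (5.4.29): F(phi, kappa) with modulus (5.4.30)."""
    kappa = math.sqrt((a * a - b * b) / (a * a - c * c))
    phi = math.asin(math.sqrt((a * a - c * c) / (a * a + s)))
    return legendre_F(phi, kappa)
```

For example, `lhs_integral(3.0, 2.0, 1.0, 0.5)` and `rhs_F(3.0, 2.0, 1.0, 0.5)` should agree to within the quadrature error, and for $a = b$ the modulus vanishes so that $F$ collapses to the amplitude $\varphi$ itself.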
Neither (5.4.31) nor (5.4.32) can be found in closed form. But, the situation changes when any two semi-axes become equal, for then the ellipsoid degenerates into a spheroid. A spheroid is a surface of revolution obtained by revolving an ellipse about one of its axes. When the axis of revolution is the major axis the ellipsoid is ‘cigar-shaped,’ or prolate, while, if it is the minor axis, the ellipsoid is ‘pancake-shaped,’ or oblate. These spheroids are shown in Fig. 5.9.
Fig. 5.9. (a) Oblate ellipsoid with a = b > c; (b) prolate ellipsoid with a = b < c.
The reason for the distortion of the sphere into spheroids is that motion will, in general, distort objects. Whether this is real or apparent is another matter, and depends on the geometric space that the observer is in. For an oblate ellipsoid $a = b > c$, so that we may think of the semi-minor axis $c$ as being due to a FitzGerald–Lorentz contraction,
\[ c=\sqrt{1-\beta^2}\,a. \]
Since the modulus (5.4.30) of the elliptic integral of the first kind $F$ vanishes, the integral in (5.4.32) can easily be performed. We then get
\[ C_o=\frac{8\pi\varepsilon_0\sqrt{a^2-c^2}}{\cos^{-1}(c/a)}. \tag{5.4.33} \]
Consequently,
\[ \Phi_o=\frac{e}{C_o}=e\,\frac{\cos^{-1}(c/a)}{8\pi\varepsilon_0\sqrt{a^2-c^2}}, \tag{5.4.34} \]
is the field at the surface of the oblate ellipsoid, and whose energy is $e$ times (5.4.34),
\[ W_o=e\Phi_o=\frac{e^2}{8\pi\varepsilon_0 a}\,\frac{\cos^{-1}\sqrt{1-\beta^2}}{\beta}. \tag{5.4.35} \]
The momentum, $G_o$, associated with the total energy $W_o$ is$^{b}$
\[ G_o=uW_o/c^2=m_{el}\,c\,\cos^{-1}\sqrt{1-\beta^2}, \]
where we have set $m_{el} = e^2/8\pi\varepsilon_0 ac^2$ [cf. Eq. (5.4.5)]. Actually, $m_{el}$ represents the electrostatic mass. Moreover, if we use our definition of mass as
\[ m=\frac{\partial G_o}{\partial u}=m_{el}\gamma, \tag{5.4.36} \]
we obtain Lorentz’s expression for the ‘transverse’ mass, where $\gamma = 1/\sqrt{1-u^2/c^2}$. Now by the definition of the force,
\[ F=\frac{dG_o}{dt}=\frac{d}{dt}\,m\gamma u=m\gamma\dot{u}+m\gamma^3(u\cdot\dot{u})\,u/c^2, \tag{5.4.37} \]
$^{b}$ Confusion should not arise between the velocity of light and the c-axis of the ellipsoid.
there will be a component of the force in the direction of the velocity. This is non-Newtonian, and leads to a rate of working
\[ F\cdot u=m\gamma(u\cdot\dot{u})\left(1+\frac{u^2}{c^2}\gamma^2\right)=m(u\cdot\dot{u})\gamma^3. \tag{5.4.38} \]
Introducing this into the last term in (5.4.37) leads to
\[ \dot{u}=\frac{F-(F\cdot u)\,u/c^2}{m_{el}\gamma}. \tag{5.4.39} \]
If the velocity is parallel to the force, (5.4.39) becomes
\[ \dot{u}=\frac{F}{m_{el}\gamma^3}, \tag{5.4.40} \]
where $m_l = m_{el}\gamma^3$ is referred to as the longitudinal mass, while if the force is perpendicular to the velocity there results the transverse mass. In Sec. 3.7.3.2 we remarked that such a situation corresponds to a uniformly rotating disc where the velocity is tangent to the disc while the centripetal force is directed inward toward the center of the disc. This can hardly describe the rectilinear motion of an electron. Since $m_l$ is not what Kaufmann’s experiments predicted, the longitudinal mass was quickly forgotten. The vanishing of (5.4.38) gives the condition for the absence of the longitudinal mass: Either the velocity is constant, or the velocity is orthogonal to the acceleration. For radiation phenomena neither of these two conditions is met.

If the ellipsoid of revolution is prolate, (5.4.34) becomes imaginary in form, since $c > a = b$, though not in reality. The potential at any distance $r$ from the surface is:
\[ \Phi_p(r)=\frac{e}{8\pi\varepsilon_0\sqrt{c^2-a^2}}\ln\frac{\sqrt{r+c^2}+\sqrt{c^2-a^2}}{\sqrt{r+c^2}-\sqrt{c^2-a^2}}=\frac{e}{4\pi\varepsilon_0\sqrt{c^2-a^2}}\tanh^{-1}\frac{\sqrt{c^2-a^2}}{\sqrt{r+c^2}}. \]
The eccentricity of the prolate ellipsoid is $\varepsilon = \sqrt{1-a^2/c^2}$, which as $a \to 0$ degenerates into a long thin rod of length $\varepsilon c$. As we have seen in Sec. 5.3, the potential $\Phi_p$ becomes infinite as $\varepsilon \to 1$. Instead, as $\varepsilon \to 0$, $c \approx a$, and
the equipotential surfaces are nearly spheres. We are then far from the rod, and deformations due to motion, or, for that matter, attraction, are hardly perceptible.

Bearing in mind that $\sqrt{a^2-c^2} = i\sqrt{c^2-a^2}$, and $\cos^{-1}(c/a) = i\cosh^{-1}(c/a)$, the capacity (5.4.33) becomes
\[ C_p=\frac{8\pi\varepsilon_0\sqrt{c^2-a^2}}{\cosh^{-1}(c/a)}, \]
so that the electrostatic field at the surface of a prolate ellipsoid is
\[ \Phi_p=\frac{e\cosh^{-1}(c/a)}{8\pi\varepsilon_0\sqrt{c^2-a^2}}=\frac{e}{8\pi\varepsilon_0 c}\,\frac{\tanh^{-1}\varepsilon}{\varepsilon}. \]
The energy at the surface of the prolate ellipsoid in terms of the relative velocity $\beta$,
\[ W_p=e\Phi_p=m_{el}c^2\,\frac{\tanh^{-1}\beta}{\beta}=\frac{m_{el}c^2}{2\beta}\ln\frac{1+\beta}{1-\beta}, \tag{5.4.41} \]
gives a momentum,
\[ G_p=uW_p/c^2=m_{el}\,c\,\tanh^{-1}\beta. \tag{5.4.42} \]
Abraham would not have arrived at this expression for the velocity had he not introduced another contraction in the direction of the motion, and not used the definition of the momentum as the derivative of the Lagrangian with respect to the relative velocity. His definition of the Lagrangian as the negative of the energy which has been Lorentz-contracted is inaccurate. Using the definition of mass as (5.4.36), we now find
\[ m=\frac{dG_p}{du}=\frac{m_{el}}{1-\beta^2}. \tag{5.4.43} \]
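The two mass formulas (5.4.36) and (5.4.43) both follow from differentiating a momentum with respect to $u$, which is easy to confirm numerically. The sketch below is illustrative only: it assumes units with $m_{el} = c = 1$ by default, takes the oblate momentum as $G_o = m_{el}c\cos^{-1}\sqrt{1-\beta^2}$ and the prolate one as (5.4.42), and uses a central finite difference; the function names are hypothetical. Here `c` denotes the speed of light, not the semi-axis.

```python
import math

def G_oblate(u, m_el=1.0, c=1.0):
    """Momentum of the oblate (Lorentz) model: m_el * c * acos(sqrt(1 - beta^2))."""
    beta = u / c
    return m_el * c * math.acos(math.sqrt(1.0 - beta * beta))

def G_prolate(u, m_el=1.0, c=1.0):
    """Momentum of the prolate (Abraham) model, Eq. (5.4.42)."""
    return m_el * c * math.atanh(u / c)

def mass(G, u, h=1e-6):
    """m = dG/du approximated by a central difference (assumed step h)."""
    return (G(u + h) - G(u - h)) / (2.0 * h)

def gamma(u, c=1.0):
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)
```

With these definitions, `mass(G_oblate, 0.6)` should be close to `gamma(0.6)` (that is, $m_{el}\gamma$), while `mass(G_prolate, 0.6)` should be close to `gamma(0.6) ** 2` (that is, $m_{el}/(1-\beta^2)$).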
It was precisely this mass that Lewis and Tolman [09], and Wilson and Lewis [12], found when the units of mass and length vary with a change of axes. The latter even used it to fault Minkowski’s definition which coincided with that of Lorentz, (5.4.36). It may be thought that the electrostatic mass undergoes a Lorentz contraction,
\[ m_{el}=m_0\sqrt{1-\beta^2}, \]
where $m_0$ is the invariant mass. Then, since $G_p u$ is the work necessary to keep the electron in a state of constant motion, the total mass would be
\[ \frac{m_0}{\sqrt{1-\beta^2}}=m_0\left(\frac{\beta^2}{\sqrt{1-\beta^2}}+\sqrt{1-\beta^2}\right). \]
Lewis and Tolman consider the reflection of light off a mirror located on a platform in motion at velocity $u$. They find that the calculated back-and-forth path of light is “greater in the ratio $1/(1-\beta^2)$.” But, this seems to contradict their previous finding that the length should be lengthened only by $1/\sqrt{1-\beta^2}$. Here is how they patch things up:

Now the velocity of light must seem the same to the observer, whether he is at rest or in motion. His measurements of velocity depend upon his units of length and time. We have already seen that a second on a moving clock is lengthened in the ratio $1/\sqrt{1-\beta^2}$, and therefore if the path of the beam of light were also greater in this same ratio, we should expect that the moving observer would find no discrepancy in his determination of the velocity of light. From the point of view of a person considered at rest, however, we have just seen that the path is increased by the larger ratio $1/(1-\beta^2)$. In order to account for this larger difference, we must assume that the unit of length in the moving system has been shortened in the ratio $\sqrt{1-\beta^2}/1$.
Lewis and Tolman fail to appreciate that the to-and-fro motion of light bouncing off a mirror is equivalent to an inelastic collision between two particles each traveling at the same velocity in opposite directions. If their speed be $u$, we can always transfer to a frame in which one of the bodies is stationary and the other moves with speed $U = 2u/(1+\beta^2)$. If $m_0$ is the stationary mass, its energy would have appeared to increase by
\[ E=\frac{m_0c^2}{\sqrt{1-B^2}}=\frac{1+\beta^2}{1-\beta^2}\,m_0c^2=m(B)c^2, \]
where $B = U/c$. Now the total mass, $m_0 + m(B) = M$, would have seemed to increase more than the sum of the stationary masses by the factor $1/(1-\beta^2)$, since
\[ M=\frac{2m_0}{1-\beta^2}. \tag{5.4.44} \]
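The arithmetic behind (5.4.44) can be traced in a few lines. This is a small sketch under the assumption of natural units ($c = 1$); the helper names are illustrative and do not come from the text.

```python
import math

def relative_speed(beta):
    """Speed B of one body in the rest frame of the other: 2*beta / (1 + beta^2)."""
    return 2.0 * beta / (1.0 + beta * beta)

def moving_mass(m0, B):
    """m(B) = m0 / sqrt(1 - B^2), the mass of the body seen in motion."""
    return m0 / math.sqrt(1.0 - B * B)

def total_mass(m0, beta):
    """M = m0 + m(B); Eq. (5.4.44) states this equals 2*m0 / (1 - beta^2)."""
    return m0 + moving_mass(m0, relative_speed(beta))
```

For $\beta = 0.5$ one finds $B = 0.8$, $m(B) = m_0/0.6$, and `total_mass(1.0, 0.5)` reproduces $2m_0/(1-\beta^2) = 8m_0/3$.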
Similar considerations apply to the lengthening of the path of light upon reflection from a moving mirror so there is no need to assume a mass or a length contraction by a factor of $\sqrt{1-\beta^2}$! It is precisely the mass increase (5.4.44) that the prolate model of an ellipsoid predicts, (5.4.43). This is just Poynting’s derivation of the mass–energy relationship that we discussed in Sec. 3.5.2.

The electrostatic energies (5.4.35) and (5.4.41) have already been seen to be related by an imaginary factor. We can show that they are, in fact, related to the phase of the Bessel function in the bright and shadow regions. In order to do so, we place the uncharged conducting ellipsoid in a uniform external electric field which coincides with the major axis of the ellipsoid. The expression for the electrostatic potential is the potential of an electric dipole which is proportional to the electric field. The coefficients of proportionality are the depolarization coefficients, which, if the coordinate axes do not coincide with the spheroid, form a symmetric tensor of rank two.

Consider an electric field directed along the x-axis which coincides with the major axis of the spheroid, the a-axis of the oblate and the c-axis of the prolate. The constant and parallel field to the major axis will induce a non-uniform charge distribution on the surface of the spheroid whose potential is
\[ \Phi(s)=\frac{e}{8\pi\varepsilon_0}\int_s^\infty\frac{ds}{(s+a^2)R(s)}\approx\frac{e}{8\pi\varepsilon_0}\int_{r^2}^\infty\frac{ds}{s^{5/2}}=\frac{e}{12\pi\varepsilon_0r^3}. \]
At large $r$, $s \sim r^2$ and the potential of the induced charge $\Phi(r) \sim e/12\pi\varepsilon_0r^3$ is that of a dipole. Rather, at the surface of the spheroid, the potential of the induced charge is
\[ \Phi_o(0)=\frac{e}{8\pi\varepsilon_0}\,\frac{1}{c(a^2-c^2)^{3/2}}\left\{\sqrt{a^2-c^2}-c\cos^{-1}(c/a)\right\}. \tag{5.4.45} \]
Apart from a constant factor, (5.4.45) is identical to the attractive gravitational force at the surface of an oblate spheroid, (5.3.7). In the next section we will show how it is related to the minimal curve of a convex body in elliptic space. For a prolate ellipsoid (c > a = b) the field is aligned with the z-axis which is parallel to the major axis c. The potential of the induced
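distribution at the surface is
\[ \Phi_p(0)=\frac{e}{8\pi\varepsilon_0c\varepsilon^3}\left\{\frac{1}{2}\ln\frac{1+\varepsilon}{1-\varepsilon}-\varepsilon\right\}=\frac{e}{8\pi\varepsilon_0c\varepsilon^3}\left\{\tanh^{-1}\varepsilon-\varepsilon\right\}=\frac{e}{8\pi\varepsilon_0c(1-a^2/c^2)^{3/2}}\left\{\cosh^{-1}(c/a)-\sqrt{1-a^2/c^2}\right\}, \tag{5.4.46} \]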
where $\varepsilon = \sqrt{1-a^2/c^2}$ is the eccentricity, which is not to be confused with the dielectric constant, $\varepsilon_0$, in free space. Again, apart from a constant factor (5.4.46) is identical to the attractive, gravitational force of a prolate ellipsoid, (5.3.8). This, too, will be related to a minimal curve of a convex body in the next section, but one occurring in hyperbolic space.

Moreover, expression (5.4.45) is the phase of the Debye asymptotic form of the Hankel function for $a > c$ [Babič & Buldyrev 91], or for one whose argument is greater than its order. Hankel functions describe the periodic propagation of a wave field whose wavefronts of constant phase are involutes to the circle of radius $a = c$. The rays are half-lines tangent to the circle $a = c$, as shown in Fig. 5.10. In the shadow region $a < c$, where the rays do not penetrate, the imaginary phase of the Debye asymptotic form of a Hankel function, whose order is greater than its argument, is given by (5.4.46). The Hankel function
Fig. 5.10. The caustic circle of radius c separates the bright (periodic) region a > c from the shadow (exponential) region, a < c.
describes a wave field that decays exponentially in space. Separating bright and shadow regions is the caustic circle of radius $c$.

Even more can be said, and here is the hook-up with hyperbolic geometry. For variable minor axis, $a = x$ say, the terms in the parenthesis of (5.4.46) can be written as
\[ y=c\ln\frac{c+\sqrt{c^2-x^2}}{x}-\sqrt{c^2-x^2}. \tag{5.4.47} \]
Equation (5.4.47) is the equation of a tractrix shown in Fig. 2.1.9. Revolving this curve about its asymptote we obtain a surface of revolution, which we discussed in Sec. 2.5. The surface is none other than the pseudosphere which has a constant, negative curvature −1/c2 .
5.5 Minimal Curves for Convex Bodies in Elliptic and Hyperbolic Spaces
A convex body is characterized by its area $A$, perimeter $L$, diameter $D$, and thickness, $E$. The diameter and thickness are the maximum and minimum widths of the convex body, respectively. Inequalities involving pairs of these quantities are [Sholander 52]
\[ E\le L/\pi\le D. \tag{5.5.1} \]
The reason for the inequalities is that a circle of perimeter $L$ has diameter $L/\pi$. Since a circle has largest area among curves of a given perimeter, the largest value of $L/\pi$ is $D$. In any event, $L/\pi$ cannot be smaller than the minimum width of the convex body. Inequalities involving more than two quantities are
\[ 2\sqrt{D^2-E^2}+2E\sin^{-1}(E/D)\le L, \tag{5.5.2a} \]
\[ L\le 2\sqrt{D^2-E^2}+2D\sin^{-1}(E/D). \tag{5.5.2b} \]
Simply combining the two inequalities gives the first and third inequalities in (5.5.1). However, inequalities (5.5.2a) and (5.5.2b) tell us far more. To see this, we use the relation between the inverse trigonometric functions to write
them as
\[ \sqrt{D^2-E^2}-E\cos^{-1}(E/D)\le\frac{1}{2}(L-\pi E), \tag{5.5.3a} \]
\[ \frac{1}{2}(\pi D-L)\ge D\cos^{-1}(E/D)-\sqrt{D^2-E^2}. \tag{5.5.3b} \]
Introducing the angle,
\[ \hat r=\cos^{-1}(E/D), \tag{5.5.4} \]
measured in radians, enables (5.5.3a) and (5.5.3b) to be written as
\[ 0\le\tan\hat r-\hat r\le\frac{1}{2}(L/E-\pi), \tag{5.5.5a} \]
\[ 0\le\hat r-\sin\hat r\le\frac{1}{2}(\pi-L/D). \tag{5.5.5b} \]
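As a concrete illustration of (5.5.2a), (5.5.2b) and their angular forms (5.5.5a), (5.5.5b), one can test them on an ellipse. The sketch below is an assumption-laden example rather than part of the argument: it takes an ellipse with semi-axes $a > b$, so that $D = 2a$ and $E = 2b$, and computes the perimeter by a midpoint rule; all names are illustrative.

```python
import math

def ellipse_perimeter(a, b, n=4000):
    """Perimeter of x = a cos t, y = b sin t by the periodic midpoint rule."""
    h = 2.0 * math.pi / n
    return h * sum(
        math.hypot(a * math.sin((k + 0.5) * h), b * math.cos((k + 0.5) * h))
        for k in range(n)
    )

def perimeter_bounds(D, E):
    """Lower and upper bounds on L from (5.5.2a) and (5.5.2b)."""
    root = math.sqrt(D * D - E * E)
    lower = 2.0 * root + 2.0 * E * math.asin(E / D)
    upper = 2.0 * root + 2.0 * D * math.asin(E / D)
    return lower, upper

def angular_form(L, D, E):
    """Pairs (left side, right side) of (5.5.5a) and (5.5.5b)."""
    r = math.acos(E / D)
    first = (math.tan(r) - r, 0.5 * (L / E - math.pi))
    second = (r - math.sin(r), 0.5 * (math.pi - L / D))
    return first, second
```

For $a = 2$, $b = 1$ the perimeter is about 9.69, which lies between the two bounds returned by `perimeter_bounds(4.0, 2.0)`, and both pairs returned by `angular_form` have a non-negative left member that does not exceed the right member.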
Whereas the left-hand inequalities are well-known elementary trigonometric inequalities, those on the right-hand side are not. That is, the first inequality in (5.5.5a) guarantees that the isoperimetric quotient of any regular polygon is less than that of a circle. The inequality on the left-hand side of (5.5.5b) guarantees that the area of any sector inscribed in a right triangle is less than the area of the right triangle. Rather, the right-hand inequalities in (5.5.5a) and (5.5.5b) say something different. Adding (5.5.5a) and (5.5.5b) results in
\[ 2D\sin\hat r=2\sqrt{D^2-E^2}\le L. \tag{5.5.6} \]
This inequality is stronger than the inequality for segments which states $2D \le L$ [Sholander 52]. If $E$ is the radius of a circle, and $D$ is a line from the center to any point outside that circle then the half-lines emanating from the point outside the circle that are tangent to the circle plus the arc length on the circle connecting the points where the half-lines touch the circle form the perimeter $L$ of the shaded area in Fig. 5.11. Then inequality (5.5.6) states that the length of the perimeter cannot be inferior to the two half-lines that are tangent to the circle.

Inequalities (5.5.5a) and (5.5.5b) refer to elliptic geometry. In order to demonstrate this we consider a regular $n$-gon, where $L = L'/n$ is the length of a side. For a circumscribed $n$-gon in a circle of elliptic radius $\hat R$, we add
Fig. 5.11. The perimeter L consists of the two half-lines that are tangent to the circle and the arc length between them.
inequalities (5.5.5a) and (5.5.5b) to obtain
\[ \tan\hat r\le\frac{1}{2}\frac{L'}{nE}. \tag{5.5.7} \]
We lose no generality by assuming the width $E = 1$ [Sholander 52]. Consider the right triangle BAO with central angle $\pi/n$ in Fig. 5.12. From spherical geometry, $\tan(\pi/n) = \tan\hat r/\sin\hat R$, and introducing this into (5.5.7) gives
\[ 2\pi(n/\pi)\tan(\pi/n)\sin\hat R\le L'. \]
Now, taking the limit as $n \to \infty$ shows that the spherical length of the circumference of a spherical circle is the lower bound to the perimeter of a regular $n$-gon
\[ 2\pi\sin\hat R\le L'. \tag{5.5.8} \]
If we had, instead, written the sum of (5.5.5a) and (5.5.5b) as the inequality
\[ \sin\hat r\le\frac{1}{2}\frac{L'}{nD}, \]
we would have considered a regular $n$-gon inscribed in a circle of radius $R$. This would again lead to (5.5.8).

The trigonometry of the Euclidean plane is transformed into the trigonometry of the hyperbolic plane simply by allowing the absolute
Fig. 5.12. A circle inscribed in an n-gon.
constant to become imaginary. However, inequalities must be inverted
\[ \frac{\vartheta}{\tan\vartheta}<1,\qquad\frac{\vartheta}{\sin\vartheta}>1, \]
\[ \frac{i\vartheta}{\tan(i\vartheta)}=\frac{i\vartheta}{i\tanh\vartheta}=\frac{\vartheta}{\tanh\vartheta}>1, \tag{5.5.9} \]
\[ \frac{i\vartheta}{\sin(i\vartheta)}=\frac{i\vartheta}{i\sinh\vartheta}=\frac{\vartheta}{\sinh\vartheta}<1. \]
Thus, any self-contradiction that may arise in hyperbolic geometry must also necessarily arise in Euclidean geometry. The second equality, (5.5.9), applies to a tractrix, and shows that it is the result of treating the relative velocity as purely imaginary, just like the transformation from elliptic to hyperbolic geometry by treating the arc length as purely imaginary.

The transition to the hyperbolic realm is easily made by allowing $D$ to become smaller than $E$. Then, instead of (5.5.1) we now have
\[ E\ge L/\pi\ge D. \]
Moreover, inequalities (5.5.3a) and (5.5.3b) become
\[ E\cosh^{-1}(E/D)-\sqrt{E^2-D^2}\ge\frac{1}{2}(\pi E-L), \tag{5.5.10a} \]
\[ \sqrt{E^2-D^2}-D\cosh^{-1}(E/D)\ge\frac{1}{2}(L-\pi D). \tag{5.5.10b} \]
Defining the ‘angle,’ $\bar r = \cosh^{-1}(E/D)$, inequalities (5.5.10a) and (5.5.10b) can be written as
\[ \bar r-\tanh\bar r\ge\frac{1}{2}\left(\pi-\frac{L}{E}\right), \tag{5.5.11a} \]
\[ \sinh\bar r-\bar r\ge\frac{1}{2}\left(\frac{L}{D}-\pi\right). \tag{5.5.11b} \]
Adding (5.5.11a) and (5.5.11b) gives
\[ 2D\sinh\bar r=2\sqrt{E^2-D^2}\ge L. \tag{5.5.12} \]
This says that twice the straight-line segments of rays in a caustic, $D \le E$, cannot be inferior to the length, $L$.

In order to demonstrate that we are in hyperbolic space consider a regular $n$-gon of length $L = L'/n$ inscribed in a circle of hyperbolic radius $\bar R$. Introducing this length into (5.5.12) gives
\[ \sinh\bar r\ge\frac{1}{2}\frac{L'}{nD}. \tag{5.5.13} \]
Again we lose no generality in assuming the minimum width $D = 1$. Consider the right triangle BAO in Fig. 5.13 with central angle $\pi/n$. Using the hyperbolic right angle formula, $\sinh\bar r = \sinh\bar R\,\sin(\pi/n)$, inequality (5.5.13) becomes
\[ 2\pi(n/\pi)\sin(\pi/n)\sinh\bar R\ge L'. \]
Proceeding to the limit as $n \to \infty$, we get
\[ 2\pi\sinh\bar R\ge L', \tag{5.5.14} \]
showing that the perimeter of a polygon is bounded from above by the hyperbolic circumference of a circle. This could have immediately been
Fig. 5.13. A regular n-gon inscribed in a circle.
obtained from the elliptic inequalities by allowing their arguments to become imaginary and inverting the inequalities. Rather, had we written (5.5.13) as
\[ \tanh\bar r\ge\frac{1}{2}\frac{L'}{nE}, \]
we would have been led to consider a regular $n$-gon circumscribed about a circle of hyperbolic radius $\bar R$. Inequalities (5.5.8) and (5.5.14) show that the perimeter of an $n$-gon in the limit where $n \to \infty$ is bounded from below and above by the elliptic and hyperbolic circumferences of a circle, respectively.
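The limits leading to (5.5.8) and (5.5.14) rest on $(n/\pi)\tan(\pi/n) \to 1$ from above and $(n/\pi)\sin(\pi/n) \to 1$ from below. A brief numerical sketch makes the approach visible; the function names and the sample radius are illustrative assumptions.

```python
import math

def elliptic_bound(n, R_hat):
    """Left side 2*pi*(n/pi)*tan(pi/n)*sin(R_hat) preceding (5.5.8)."""
    return 2.0 * n * math.tan(math.pi / n) * math.sin(R_hat)

def hyperbolic_bound(n, R_bar):
    """Left side 2*pi*(n/pi)*sin(pi/n)*sinh(R_bar) preceding (5.5.14)."""
    return 2.0 * n * math.sin(math.pi / n) * math.sinh(R_bar)

def limits(R):
    """The n -> infinity values: elliptic and hyperbolic circumferences."""
    return 2.0 * math.pi * math.sin(R), 2.0 * math.pi * math.sinh(R)
```

Evaluating `elliptic_bound(n, 1.0)` for $n = 3, 6, 12, \ldots$ gives a decreasing sequence tending to $2\pi\sin 1$, whereas `hyperbolic_bound(n, 1.0)` increases toward $2\pi\sinh 1$.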
5.6 The Tractrix
Since the elliptic plane can be represented on a sphere without distortion, it is natural to inquire whether there exists a ‘pseudo’-sphere upon which the hyperbolic plane can be developed. Such a surface in Euclidean space would have all its distances measured on its surface related to distances measured on the hyperbolic plane. The only distance we have is hyperbolic
measure of the relative velocity (in natural units)
\[ \bar u=\frac{1}{2}\ln\frac{1+u}{1-u}=\tanh^{-1}u. \tag{5.6.1} \]
These lines in the hyperbolic plane are geodesics, like the great circles on a sphere. The ‘smoothness,’ or point-to-point homogeneity, of the hyperbolic plane requires a constant, negative, specific curvature, $K$, for the surface. In 1827 Gauss proved that the total curvature of any triangle, with angles $A$, $B$, and $C$, formed by geodesics on a surface $S$ is given by
\[ \iint K\,dS=K\iint dS=K\mathcal{A}=A+B+C-\pi. \]
Since the area must be positive, $\mathcal{A} = (A+B+C-\pi)/K > 0$, the angle ‘excess’ implies $K > 0$, while an angle ‘defect’ necessitates $K < 0$. Consequently, an angle excess implies a surface of positive constant curvature, $K > 0$, while an angle defect implies a surface of negative constant curvature, $K < 0$. An example of the former is a triangle on the surface of a sphere, while that of the latter is a triangle on a pseudosphere. They can also be pictured as ‘fitting errors,’ as in Fig. 9.7.

The simplest example of a pseudosphere was given by Minding in 1839 [cf. Sec. 2.5]. It is a bugle-shaped tractroid, shown in Fig. 2.19, formed by revolving the tractrix around the z-axis. It is the surface of revolution for which the cylindrical coordinates $r = \sqrt{x^2+y^2}$ and $z$ are expressible in terms of the parameter $\bar u$, defined in (5.6.1); specifically, we have
\[ r=a=c\,\mathrm{sech}\,\bar u,\qquad z=c(\bar u-\tanh\bar u). \]
The latter shows that the hyperbolic measure of the velocity can never be inferior to its Euclidean measure.

There are two reasons why this model of hyperbolic geometry is inferior to that of elliptic geometry [Coxeter 65]. First, the maximum and minimum normal curvatures at any point on the pseudosphere have a constant product $K = -1$, though they individually vary from point to point on the surface. By contrast, the maximum and minimum normal curvatures of the sphere are constant everywhere. Second, the tractroid does not represent the entire hyperbolic plane since the bugle has a ‘rim’ of length $2\pi c$ at $\bar u = 0$,
Fig. 5.14. Newton’s tractrix.
though the total area is finite. Beltrami suspected, and Hilbert proved, that there is no smooth surface that can cover the whole hyperbolic plane.

By considering the tractrix itself we can make connection with the depolarization coefficient, introduced earlier, and, consequently, with the phase of the asymptotic form of the Bessel function. The equation for the tractrix, shown in Fig. 5.14, in the xy-plane is
\[ \frac{ds}{dy}=-\frac{c}{y}, \tag{5.6.2} \]
where $ds = \sqrt{dx^2+dy^2}$ is the arc length, and $c$ is the constant slope. Huygens interpreted the curve as the ‘path’ of a stone pulled by a rope of length $c$. Introducing the arc length into (5.6.2) leads to
\[ \sqrt{1+\left(\frac{dx}{dy}\right)^2}=-\frac{c}{y}. \]
Squaring, rearranging, and taking the negative square root give
\[ dx=-\frac{\sqrt{c^2-y^2}}{y}\,dy, \]
and integrating results in
\[ x=c\cosh^{-1}(c/y)-\sqrt{c^2-y^2}. \tag{5.6.3} \]
Comparing (5.6.3) with the last line in (5.4.46) shows that $x$ would correspond to the depolarization coefficient $n^{(x)}$, and $y$ to the semi-minor axis of the ellipsoid, $a$ (cf. Eq. (4.2) of [Landau & Lifshitz 60]). The semi-major axis $c$ would correspond to the slope of the tractrix.
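That (5.6.3) really integrates (5.6.2), and that it coincides with the logarithmic form (5.4.47) once the roles of $x$ and $y$ are exchanged, can be confirmed numerically. The following sketch assumes a central finite difference for $dx/dy$ and takes the magnitude of $ds/dy$; the names and the test point are illustrative.

```python
import math

def tractrix_x(y, c):
    """Eq. (5.6.3): x = c*acosh(c/y) - sqrt(c^2 - y^2), for 0 < y <= c."""
    return c * math.acosh(c / y) - math.sqrt(c * c - y * y)

def tractrix_log_form(y, c):
    """Logarithmic form as in (5.4.47): c*ln((c + sqrt(c^2 - y^2))/y) - sqrt(c^2 - y^2)."""
    root = math.sqrt(c * c - y * y)
    return c * math.log((c + root) / y) - root

def arc_length_rate(y, c, h=1e-6):
    """|ds/dy| = sqrt(1 + (dx/dy)^2), with dx/dy from a central difference."""
    dxdy = (tractrix_x(y + h, c) - tractrix_x(y - h, c)) / (2.0 * h)
    return math.hypot(1.0, dxdy)
```

At, say, $c = 2$ and $y = 0.7$, `arc_length_rate(0.7, 2.0)` should return a value close to $c/y$, which is the content of (5.6.2) up to the sign convention for the direction of increasing arc length.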
Now, the integral of (5.6.2),
\[ y/c=e^{-s/c}=\tan\frac{\Pi(s)}{2}=\sqrt{1-u^2}, \tag{5.6.4} \]
identifies $c$ with the absolute constant of the hyperbolic space, and $\Pi$ is the angle of parallelism. The second equality follows from the Bolyai–Lobachevsky formula, and the last equality follows from identifying $y$ with $a = c\,\mathrm{sech}\,\bar u$. For a system at rest, $\Pi = \pi/2$, and the geometry is Euclidean, while for $\Pi < \pi/2$ and $u < 1$ it is definitely hyperbolic. Since the left-hand side of (5.6.4) satisfies the functional relation $f(s_1)\cdot f(s_2) = f(s_1+s_2)$, it follows that the right-hand side of (5.6.4) is equivalent to the hyperbolic Pythagorean theorem
\[ \cosh\bar u_1\cdot\cosh\bar u_2=\cosh\bar u, \tag{5.6.5} \]
for a right angle hyperbolic triangle with $\bar u_1 \perp \bar u_2$ [Silberstein 14]. Since $\mathrm{sech}\,\bar u = \sin\Pi$, (5.6.5) is equivalent to $\sin\Pi_1\cdot\sin\Pi_2 = \sin\Pi$.
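The hyperbolic Pythagorean theorem (5.6.5) can be restated directly in terms of velocities, since $\mathrm{sech}\,\bar u = \sqrt{1-u^2}$. The sketch below works in natural units and, as an assumption of the example, takes the angle of parallelism from the Bolyai–Lobachevsky formula with the hyperbolic distance set equal to $\bar u$, that is $\tan(\Pi/2) = e^{-\bar u}$; all names are illustrative.

```python
import math

def rapidity(u):
    """Hyperbolic measure of the velocity, Eq. (5.6.1)."""
    return math.atanh(u)

def compose_perpendicular(u1, u2):
    """Resultant speed from cosh(u_bar) = cosh(u1_bar) * cosh(u2_bar), Eq. (5.6.5)."""
    cosh_total = math.cosh(rapidity(u1)) * math.cosh(rapidity(u2))
    return math.tanh(math.acosh(cosh_total))

def angle_of_parallelism(u_bar):
    """Pi(u_bar) from tan(Pi/2) = exp(-u_bar) (Bolyai-Lobachevsky formula)."""
    return 2.0 * math.atan(math.exp(-u_bar))
```

For $u_1 = 0.6$ and $u_2 = 0.5$ the resultant satisfies $1-u^2 = (1-u_1^2)(1-u_2^2)$, and the sines of the three angles of parallelism multiply in the same way.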
5.7 Rigid Motions: Hyperbolic Lorentz Transforms and Elliptic Rotations
It is no mere coincidence that we repeatedly had to deal with double angle formulas of trigonometric and hyperbolic functions. Any motion in Euclidean geometry can be reduced to a pair of reflections [Sommerville 58]. In general, a displacement of an object is equivalent to a pair of inversions in two circles which cut perpendicularly a given circle. And, in any general displacement, there are always two points which are left unaltered. Rigid motions, characterizing different geometries, correspond to different inertial frames. The most important transformation is an involution which is a projectivity of period two. It is a non-trivial projectivity which is its own inverse. The term involution was coined by the French geometer Desargues, which, literally denotes the twisted state of young leaves [Rosenfeld 88]. If the two fixed points are real, the involution is hyperbolic, whereas, if the fixed points are imaginary, the involution is
elliptic. The parabolic case corresponds to a coincidence of the two fixed points. The ‘canonical’ form of an involution is $\dot x\dot x' \pm 1 = 0$, where the plus and minus signs refer to elliptic and hyperbolic involutions, respectively. That is, an elliptic involution, $\dot x\dot x' + 1 = 0$, has the fixed points $\pm i$, while the hyperbolic involution, $\dot x\dot x' - 1 = 0$, has the real fixed points $\pm 1$. For unequal fixed points, $a$ and $b$, the involution becomes
\[ \dot x\dot x'-\frac{1}{2}(a+b)(\dot x+\dot x')+ab=0. \tag{5.7.1} \]
In particular, if $\dot x = \dot x'$, then $(\dot x-a)(\dot x-b) = 0$, while, if we let $a \to \infty$,
\[ \dot x+\dot x'=2b, \tag{5.7.2} \]
which is a reflection.

By a symmetry argument, it can be shown that the conjugate non-real fixed points of an elliptic involution are $u$ and $-1/u$ [Schwerdtfeger 62]. The involution (5.7.1) becomes
\[ \dot x\dot x'-\frac{1}{2}(u-u^{-1})(\dot x+\dot x')-1=0, \tag{5.7.3} \]
which is a reflection. If we let $u \to 0$, we get the reflection in the origin [cf. (5.7.2)]
\[ \dot x+\dot x'=0. \tag{5.7.4} \]
Combining the two reflections, (5.7.3) and (5.7.4), gives the translation [Coxeter 65],
\[ \dot x\dot x'-\frac{1}{2}(u-u^{-1})(\dot x'-\dot x)+1=0, \tag{5.7.5} \]
which can be brought into the form of the addition law for tangent,
\[ \frac{\dot x-\dot x'}{1+\dot x\dot x'}=\frac{2u}{1-u^2}. \tag{5.7.6} \]
If $u = \tan\frac{1}{2}\vartheta$, and $\dot x = \tan\phi$ and $\dot x' = \tan\phi'$, then $\tan(\phi-\phi') = \tan\vartheta$. And in terms of $\vartheta$, the translation (5.7.5) becomes
\[ \dot x\dot x'-\cot\vartheta\,(\dot x-\dot x')+1=0, \tag{5.7.7} \]
which is a camouflaged way of writing the clockwise rigid rotation,
\[ M(\dot x')=\dot x=\frac{\cos\vartheta\,\dot x'+\sin\vartheta}{-\sin\vartheta\,\dot x'+\cos\vartheta}. \tag{5.7.8} \]
The Möbius transform (5.7.8) is a product of rotations, each through an angle $\theta = \frac{1}{2}\vartheta$. For the counter-clockwise rotation we have
\[ M(\dot x)=\frac{\dot x-\tan\vartheta}{\tan\vartheta\,\dot x+1}, \tag{5.7.9} \]
which is the addition law for tangent. Because $M(\tan\vartheta) = 0$, its conjugate non-real fixed point is $-\cot\vartheta$, since $M(-\cot\vartheta) = \infty$.

In just the same way that the cross-ratio of fixed points and their conjugates give the double angle formulas for hyperbolic functions, so the cross-ratio of the non-real fixed points and their conjugates gives the double angle formulas for trigonometric functions. For consider the conjugate points $\tanh\bar u_1$ and $\tanh\bar u_2$ whose conjugate points are $\coth\bar u_1$ and $\coth\bar u_2$. The distance between $\bar u_1$ and $\bar u_2$ is the positive square root of the cross-ratio:
\[ \{\tanh\bar u_1,\coth\bar u_1\,|\,\tanh\bar u_2,\coth\bar u_2\}=\tanh^2(\bar u_1-\bar u_2). \]
In an analogous way, a segment whose ends are $\hat u_1$ and $\hat u_2$ has non-real conjugate points at $\hat u_1 \pm \pi/2$ and $\hat u_2 \pm \pi/2$. The fixed and conjugate non-real points are $\tan\hat u_1$, $\tan\hat u_2$, and $-\cot\hat u_1$ and $-\cot\hat u_2$. Their cross-ratio,
\[ \{\tan\hat u_1,-\cot\hat u_2\,|\,\tan\hat u_2,-\cot\hat u_1\}=\tan^2(\hat u_1-\hat u_2), \]
gives the double angle formula for the tangent. Switching the second and fourth member would make it negative without changing its magnitude.

In contrast, if the fixed points $u$ and $1/u$ are real, the involution is
\[ \dot x\dot x'-\frac{1}{2}(u+u^{-1})(\dot x+\dot x')+1=0. \tag{5.7.10} \]
This is a general reflection, which if $\dot x = \dot x'$ becomes $(\dot x-u)(\dot x-u^{-1}) = 0$. Combining the reflection (5.7.10) with the reflection through the origin,
\[ \dot x+\dot x'=0, \tag{5.7.11} \]
by allowing u → 0, gives 1 x˙ x˙ + (u + u−1 )(˙x − x˙ ) − 1 = 0, 2
(5.7.12)
x˙ − x˙ 2u = = α. 1 − x˙ x˙ 1 + u2
(5.7.13)
or, equivalently,
Expression (5.7.13) can be simplified by a hyperbolic substitution. Setting $\dot{x} = \tanh\bar{u}$ and $\dot{x}' = \tanh\bar{u}'$ with $u = \tanh\tfrac{1}{2}\bar{\alpha}$, we get $\tanh(\bar{u} - \bar{u}') = \tanh\bar{\alpha}$, which can only be the case if
\[ \bar{u} + \bar{u}' = 0, \tag{5.7.14} \]
so that $\bar{\alpha} = 2\bar{u}$. This shows that the relativistic velocity composition law, (5.7.13), is the addition law for equal and opposite velocities, where the relative speed of the two systems is
\[ \alpha = \frac{2u}{1 + u^2} = \tanh(2\bar{u}) = \tanh\bar{\alpha}. \]
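The composition of equal and opposite velocities in (5.7.13) can be checked numerically. The following is only an illustrative sketch in units where c = 1; the function names `compose` and `relative_speed` and the sample value u = 0.6 are assumptions made here and do not appear in the text.

```python
import math

def compose(x, x_prime):
    """Left-hand side of (5.7.13): (x - x') / (1 - x x'), with c = 1."""
    return (x - x_prime) / (1.0 - x * x_prime)

def relative_speed(u):
    """Right-hand side of (5.7.13): alpha = 2u / (1 + u^2)."""
    return 2.0 * u / (1.0 + u * u)

u = 0.6                    # illustrative speed of each system, u = tanh(u_bar)
u_bar = math.atanh(u)      # hyperbolic measure of that speed
alpha = compose(u, -u)     # equal and opposite velocities, x' = -x as in (5.7.11)

# All three numbers should agree: the composed speed, 2u/(1+u^2), and tanh(2 u_bar).
print(alpha, relative_speed(u), math.tanh(2.0 * u_bar))
```

The agreement of the three printed values reflects the fact that the hyperbolic measures simply add, so that doubling $\bar{u}$ corresponds to composing $u$ with itself.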
The Möbius transformation,
\[ M_u(\dot{x}) = \frac{\dot{x} + \tanh\bar{u}}{\tanh\bar{u}\,\dot{x} + 1}, \tag{5.7.15} \]
will readily be appreciated as the Lorentz transform in homogeneous coordinates. In fact, it is a product of Lorentz transforms at relative speeds $u = \tanh\bar{u}$. From this we may safely conclude that space and time are not separate entities, but enter only through their ratio $x/t$, as a homogeneous coordinate. Thus, the Möbius transforms (5.7.8) and (5.7.15) may be combined to read
\[ M(\dot{x}) = \frac{\dot{x} + \tan(u\sqrt{\kappa})/\sqrt{\kappa}}{\dot{x}\,\tan(u\sqrt{\kappa})/\sqrt{\kappa} + 1}, \tag{5.7.16} \]
where $\kappa = \pm 1$ in the elliptic and hyperbolic cases, for which $u = \hat{u}$ and $u = \bar{u}$, respectively. But, whereas the hyperbolic measure of distance is
\[ \bar{u} = \tanh^{-1} u = \frac{1}{2}\ln\frac{1 + u}{1 - u}, \tag{5.7.17} \]
the elliptic measure of distance is
\[ \hat{u} = \sin^{-1} u. \tag{5.7.18} \]
Noting the symmetry between (5.7.18) and $\sinh^{-1}\left[u/\sqrt{(1 - u^2)}\right] = \bar{u}$, and that between (5.7.17) and $\tan^{-1}\left[u/\sqrt{(1 - u^2)}\right] = \hat{u}$, these definitions of hyperbolic and elliptic measures of distance become apparent.
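The symmetry between the two measures of distance can be verified numerically. The sketch below is illustrative only; the function names `hyperbolic_distance` and `elliptic_distance` and the sample values of u are assumptions introduced here, not notation from the text.

```python
import math

def hyperbolic_distance(u):
    """Hyperbolic measure (5.7.17): (1/2) ln((1 + u) / (1 - u))."""
    return 0.5 * math.log((1.0 + u) / (1.0 - u))

def elliptic_distance(u):
    """Elliptic measure (5.7.18): arcsin(u)."""
    return math.asin(u)

for u in (0.1, 0.5, 0.9):          # illustrative speeds, 0 < u < 1
    s = u / math.sqrt(1.0 - u * u)  # the common argument u / sqrt(1 - u^2)
    # asinh(s) should reproduce the hyperbolic measure,
    # atan(s) should reproduce the elliptic measure.
    print(u, math.asinh(s), hyperbolic_distance(u), math.atan(s), elliptic_distance(u))
```

In each row the second and third numbers agree, as do the fourth and fifth, which is the symmetry noted in the text.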
5.8 The Elliptic Geometry of an Oblate Spheroid
In this section we show that the motion of an oblate spheroid describes the phenomenon of aberration within the realm of elliptic geometry. That is, we will derive the expression for the distance,
\[ \vartheta = \cos^{-1}\sqrt{(1 - \alpha^2)}, \tag{5.8.1} \]
whose argument for an oblate ellipsoid is the ratio of the semi-axes, $c/a = \sqrt{(1 - \alpha^2)}$, from the phenomenon of aberration. This presupposes that if there is a physical phenomenon in hyperbolic space there must be a corresponding one in elliptic space. The minimum distance, $\vartheta = 0$, occurs for a stationary system for which $\alpha = 0$. The maximum distance, on the other hand, $\vartheta = \pi/2$, occurs in the ultrarelativistic case $\alpha = 1$, since $\alpha$ is always proportional to the relative velocity. For motion in the x-direction, the composition laws for the velocities in the x- and y-directions are
\[ u_x' = \frac{u_x - u}{1 - u u_x}, \qquad u_y' = \frac{u_y\sqrt{(1 - u^2)}}{1 - u u_x}. \tag{5.8.2} \]
This imposes the constraint that we measure the incoming rays as straight lines inclined to the vertical, instead of the usual convention of expressing their altitude with respect to the plane. Thus, $u_x = -\sin\vartheta$, and $u_x' = -\sin\vartheta'$, so that the composition laws become
\[ \sin\vartheta' = \frac{\sin\vartheta + u}{1 + u\sin\vartheta}, \qquad \cos\vartheta' = \frac{\cos\vartheta\sqrt{(1 - u^2)}}{1 + u\sin\vartheta}. \]
Their ratio,
\[ \tan\vartheta' = \frac{\sin\vartheta + u}{\cos\vartheta\sqrt{(1 - u^2)}}, \tag{5.8.3} \]
describes a plane wave changing its direction on transition from one inertial frame to another. If light propagates along the normal to the direction in which the frame is moving at uniform velocity, $\vartheta = 0$, and $\tan\vartheta' = u/\sqrt{(1 - u^2)} = \sinh\bar{u}$, so that $\vartheta'$ is the angle of aberration. The condition $u_x = 0$ applied to the second equation in (5.8.2) shows that the transverse velocity component becomes contracted in the prime frame. Another important special case occurs when $u_x = -u$, implying that in the prime frame the longitudinal component of the velocity reduces (5.8.3) to the double angle formula,
\[ \tan\vartheta' = \frac{\alpha}{\sqrt{(1 - \alpha^2)}} = \tan(2\vartheta) = \frac{2u}{1 - u^2} =: \lambda_e, \tag{5.8.4} \]
where
\[ \sin\vartheta' = \sin(2\vartheta) = \frac{2u}{1 + u^2} =: \lambda_h, \tag{5.8.5} \]
and
\[ \cos\vartheta' = \cos(2\vartheta) = \frac{1 - u^2}{1 + u^2}. \tag{5.8.6} \]
Finally, equating (5.8.6) with the inverse of (5.8.1) gives $\alpha = \lambda_h$. As ϑ → 0 so, too, does u, while as ϑ → π/2, u → 1. Since aberration occurs in elliptic geometry, it has a size effect associated with it. An observer at the north pole of a sphere will see an object moving toward the equator decrease in size until it reaches it at π/2. Then it will tend to increase in size until it reaches the south pole at π. Klein fixed the maximum distance in elliptic space as π/2, so that an observer will always see an object which is moving away from him diminishing in size. Since aberration was thought to live in hyperbolic space, no size effect was ever predicted. But, as we have seen, there is a size effect to aberration when it resides in elliptic space.
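The two special cases of the aberration formula discussed in this section can be checked with a short numerical sketch. It assumes units where c = 1 and takes the transverse component as cos ϑ; the function name `aberration` and the sample speed u = 0.5 are illustrative choices made here, not part of the text.

```python
import math

def aberration(theta, u):
    """Return (sin, cos) of the primed angle from the composition laws above (5.8.3)."""
    denom = 1.0 + u * math.sin(theta)
    sin_p = (math.sin(theta) + u) / denom
    cos_p = math.cos(theta) * math.sqrt(1.0 - u * u) / denom
    return sin_p, cos_p

u = 0.5  # illustrative relative speed

# Normal incidence, theta = 0: tan(theta') should equal u / sqrt(1 - u^2) = sinh(u_bar).
sin_n, cos_n = aberration(0.0, u)
print(sin_n / cos_n, math.sinh(math.atanh(u)))

# Special case u_x = -u, i.e. sin(theta) = u: compare with (5.8.5) and (5.8.6).
sin_s, cos_s = aberration(math.asin(u), u)
print(sin_s, 2.0 * u / (1.0 + u * u))
print(cos_s, (1.0 - u * u) / (1.0 + u * u))
```

The primed sine and cosine always satisfy $\sin^2\vartheta' + \cos^2\vartheta' = 1$, so the transformed quantity is a genuine angle for any admissible ϑ and u.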
5.9 Matter and Energy
As shown in Sec. 5.4.3, the idea that the inertia of a charged body is increased by its motion can be traced all the way back to a little known paper by J. J. Thomson [81] entitled “On the effects produced by the motion of electrified bodies,” published in April 1881. Thomson suggested that the electrification of particles would affect their inertia in such a way that it would increase with velocity. Back in 1881 no one knew what the carriers of electricity were, only that the ‘fluid’ appeared to be incompressible. Ironically, Hall’s 1879 experiment, in which he attempted to validate the electric fluid model by showing that a current carrying conductor sets up a difference of potential perpendicular to the direction of the current when placed in a magnetic field, actually sounded its death knell. Hall’s supervisor Rowland, slightly earlier, established that a moving charge creates a magnetic field just like a current in a wire. And what Thomson set out to do was to determine “what is the magnetic force produced by such a moving body.” Thomson [81] claims:

I have shown that the kinetic energy of a small sphere of mass m charged with a quantity of electricity e and moving with velocity v is
\[ \left(\frac{1}{2}m + \frac{2}{15}\frac{\mu e^2}{a}\right)v^2, \]
where a is the radius of the sphere and µ the magnetic permeability of the dielectric surrounding it. The existence in the kinetic energy of this term, which is due to the “displacement currents” started in the surrounding dielectric by the motion of the electrification on the sphere, shows that electricity behaves in some respects very much as if it had mass.
From the above relation Thomson concluded that the electromagnetic field created by the moving charge produces a reaction on the charge, thereby increasing its mass by an amount $4\mu e^2/15a$. On account of the conservation of momentum the increase in the mass must be “impulsively diminished.” To correct a historical inaccuracy, it was Thomson, and not Lorentz, who wrote the force on a charge e moving at velocity v as $\tfrac{1}{2}e\,\frac{\mathbf{v}}{c}\times\mathbf{H}$. The removal of the incorrect factor of a half was made by Heaviside eight years later.
It was still another three years before Lorentz wrote down the total force as $e\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{H}\right)$, but that is still no excuse for not giving Thomson the credit for the new term. And it is not a banality to say that the total force is the vector sum of a static one and a motional one. Again Stigler’s law of eponymy has been verified! Moreover, from the vanishing of the total force, Thomson gave a procedure for determining not only the velocity of the particle, but, in addition, the ratio e/m [cf. Sec. (3.7.4.1), Eq. (3.7.15)].

Parenthetically, we might mention another historical inaccuracy. Everyone is familiar with Einstein’s derivation of Avogadro’s number from a dynamic equilibrium between opposing forces, say between a gravitational force dragging particles down and an opposing concentration gradient pushing them up [Lavenda 84]. This was in 1905. But, even earlier, Townsend [Thomson 04, pp. 79–83] showed that the charge on a gaseous ion is equal to that on the ion of hydrogen in electrolysis, by measuring the coefficient of diffusion and comparing it with the velocity that the ion has when acted upon by an impressed electric force. The dynamic equilibrium is between the flux tending to drive the particles and the velocity acquired by the particles under the action of the electric force, or pressure gradient divided by the viscosity. The latter is given by the ratio Ee/v. Then, relating the pressure and number of ions of the gas to the atmospheric pressure and Avogadro’s number, the latter can be determined if we know the charge on the ion, or conversely, we can determine the charge on an ion if we know Avogadro’s number. So, Einstein’s gedanken experiment was well known in scientific circles at the time he incorporated it into his theory of Brownian motion.

At the turn of the twentieth century, the only elementary particle that was known was the electron, a name coined by Stoney in 1891, and found by Thomson six years later.
In any charged system, according to the Victorians, part of the mass will be that of the aether. The aether was the storage place for energy derived from the magnetic and electric fields. The stored energy increases from the mechanical motion of the sphere. The motion of the sphere is met with resistance, and it is through this resistance that mechanical energy is converted into electromagnetic energy. As Thomson argued:

. . . [it] must correspond to the resistance theoretically experienced by a solid moving through a perfect fluid. In other words, it must be equivalent to an increase in the mass of the charged moving sphere.
Thomson carried the analogy further and concluded that the capacity of a condenser in motion will not be the same as that when it is at rest, “but the difference depends on the square of the ratio of the velocity of the condenser to the velocity of light which will be exceedingly small.” Experiments on cathode rays showed that the carriers of electric charge must be considered as particles with very small mass. Thus, the idea evolved that the electromagnetic inertia of a charged particle was comparable to the entire inertia of the particle. This was actually shown by Kaufmann (see Sec. 3.7.4.2), who, experimenting with β-rays emitted from radium, showed that the apparent mass of these particles increased in a regular way with velocity [Cuttingham 14].

In his Yale University lectures, given in May 1903, Thomson [04] returned to the problem of mass due to the motion of electric charges. He reasoned that a mass in motion creates a magnetic field, H. Wherever there is a magnetic field there are $\mu_0 H^2/8\pi$ units of energy in Gaussian units. Averaging the energy over all points exterior to the sphere gives an additional amount of energy $\mu_0 e^2 v^2/3a$, where $\mu_0$ is the magnetic permeability of the medium surrounding the sphere.ᶜ This corrects the numerical factor he found in 1881. Thus, he concludes that the whole of the kinetic energy is $\tfrac{1}{2}\left(m + \tfrac{2}{3}\mu_0 e^2/a\right)v^2$, and that there is a contribution to the mass due to its charge which in motion creates a magnetic field. Notice that Thomson does not claim that all the mass is electromagnetic in origin, just its kinetic part is. Though he leaves out the $c^2$ in the denominator of his ‘extra’ mass, a proof of $E = mc^2$ can be found in these lectures. Thomson relied on imagery where the lines of force between negatively charged particles were called ‘tubes of force,’ or simply ‘Faraday tubes.’ The Faraday tubes have the same direction as the electric force.
The difference between the number of Faraday tubes which leave a closed surface and those which enter it is equal to the number of charges inside the surface. This is what Maxwell referred to as the electric displacement through the surface.

ᶜ This differs from (5.4.25) by a factor of 1/4π, occurring on the transition from Gauss to rationalized units. According to Heaviside, “the effect of changing from irrational to rational units is to introduce 4π … For the unnatural suppression of the 4π in the formulae of the central force, where it has a right to be, drives it into the blood, there to multiply itself, and afterwards break out all over the body of electromagnetic theory.”
Now, if there are N Faraday tubes passing through a unit area at right angles to their direction, and B is the magnetic induction, then the momentum per unit volume will be $NB\sin\theta$, where θ is the angle between the magnetic induction and the Faraday tubes. The direction of the momentum is normal to both the magnetic induction and the Faraday tubes, and parallel to Poynting’s vector, which determines the direction of energy flow in the field. If v is the velocity of the sphere then the magnetic induction is $4\pi\mu Nv$, where µ is the magnetic permeability of the medium surrounding the sphere. Consequently, the momentum per unit volume is $4\pi\mu N^2 v\sin\theta$. If the Faraday tubes were to move in the direction normal to their length they would carry a mass of the surrounding medium equal to $M = 4\pi\mu N^2$ with them, just like a cylinder being dragged broadside in a viscous liquid [cf. Sec. 5.3.1]. This, according to Thomson, was the mass of the bound aether. The aether was necessary in order not to violate Newton’s third law, so as to provide the missing link in the conservation of momentum. The momentum given to the Faraday tubes must be equal and opposite to the momentum lost in the bound aether. Thomson then claims that

It is a very suggestive fact that the electrostatic energy E is proportional to M, the mass of the bound aether in that volume.

He offers the following proof. The electrostatic energy is $E = 2\pi N^2/\varepsilon$, where $\varepsilon$ is the specific inductive capacity of the medium (i.e. its dielectric constant). Introducing the mass, $M = 4\pi\mu N^2$, he comes out with
\[ E = \frac{1}{2}\frac{M}{\mu\varepsilon}. \]
But, from Maxwell’s theory he deduces that $c^2 = 1/\mu\varepsilon$, and so
\[ E = \frac{1}{2}Mc^2, \tag{5.9.1} \]
the $\tfrac{1}{2}$ being the last vestige of the nonrelativistic kinetic energy. Thomson concludes “E is equal to the kinetic energy possessed by the bound mass when moving with the velocity of light.” This is comparable to Poynting’s derivation given in Sec. 3.5.2. We should emphasize that there is no need to postulate a limiting velocity for it comes out directly from Maxwell’s theory, although
Heaviside would not agree. According to Heaviside [99], “If [the electromagnetic laws] are valid at any speed, then there is nothing to prevent speeds of motion greater than light.” This is very unrelativistic, but there must be some very special properties of the aether that make the product of the specific inductive capacity and the permeability always come out to be the same constant, independent of the specific nature of the aether. Thomson also established a proportionality between the tension in a Faraday tube and its mass per unit length of the string. This is the first utterance of the anisotropy of matter.

Okun [89], writing in Physics Today, comments that the “rest energy was one of Einstein’s great discoveries.” We have seen in Sec. 1.2.2.2 that Poincaré [00] also arrived at it by considering a pulse of light like a cannon ball shot from an artillery piece. According to Poynting’s theorem in Sec. 3.5.2, it will carry a momentum $G = E/c$. At the same time, the momentum is by definition $G = Mv$, and using $G = vE/c^2$, he arrived at twice (5.9.1). Although Okun [89] is correct in saying that Thomson’s increase in mass due to motion is velocity-independent, it was Heaviside who extended Thomson’s analysis of a slowly moving charge to one moving at any speed, and even speeds faster than light! He did so by considering Faraday tubes on the surface of a sphere, representing a charged particle. When a Faraday tube is in the equatorial region it imprisons more aether than when it is near the polar regions. The equatorial plane passes through the center of the sphere normal to the direction of motion. If we remember that Faraday tubes repel one another, the crowding together in the equatorial plane would give rise to a pressure that re-establishes the uniform distribution of the tubes. The actual distribution is a balance between the opposing forces.
The excess density of tubes packed into the equatorial region resulting from the increasing speed at which the charged particle is moving results in an additional aether that is imprisoned and this leads to an additional increase in mass. What Heaviside succeeded in doing was to show that, while the projection of the tubes on this plane is the same as that for the uniform distribution of tubes, the distance of every point in the tube from the equatorial plane is reduced by the factor $\sqrt{(c^2 - u^2)}$. From this result, Heaviside concluded that it is only when the velocity of the charged body is comparable to
that of light that the distortion in the distribution of Faraday tubes becomes appreciable. All derivations of the mass variation and energy rely on the conservation of momentum [Ives 52, Rohrlich 90] although motion need not be involved. Some of these derivations we have discussed in Sec. 4.2.1. The idea is to consider a body suspended in the interior of an enclosure in which the system is stationary with respect to the medium transmitting the radiation. The body will emit radiation in the forward and aft directions symmetrically, having energy $\tfrac{1}{2}E$ in each of these directions. The momenta are equal and opposite in the opposite directions so no change in the state of the body will be observed. Now let the whole system be set in motion at a constant velocity v. In a medium, such as water, waves travel at a speed v. If the source emitting waves travels at a speed u, which is less than the wave speed, the observed frequency will be different from the source frequency. Depending on whether the source is approaching or receding from the observer there will be a change in frequency by the amount (1 ± u/v). This is the well-known Doppler effect. However, if the observer is moving at a speed u toward a stationary source emitting sound waves at a speed v he will see wavefronts approaching him that are separated by a constant wavelength at a relative speed v + u. What changes is the frequency at which the observer sees the wavefronts. But if the source is a light source, we cannot have a speed c + u, for that would imply an emission theory. Both frequency and wavelength must shift in order to keep their product, the velocity of light, constant. This is often explained by observing that light waves are not vibrations of the ‘aether’, but, rather, self-maintained oscillations of the electromagnetic fields. So long as the observer’s velocity is small compared with that of light, the linear approximation to the Doppler shift holds.
But when his velocity becomes comparable with c, we must use the full-fledged relativistic formula. It is often said that this follows from the assumed mass dependency on the velocity. Rather, we would like to believe that it comes from the hyperbolic measure of velocity. Thus, the energy emitted in opposite directions will be given by
\[ \frac{1}{2}E\left(\frac{1 + u/c}{1 - u/c}\right)^{1/2} \quad\text{and}\quad \frac{1}{2}E\left(\frac{1 - u/c}{1 + u/c}\right)^{1/2}. \]
We have already seen in Sec. 1.2.2.2 that Poincaré [00], as early as 1900, established that the momentum lost by radiation should be 1/c times the energy of the body E. In other words, Poincaré considered electromagnetic radiation as a ‘fluide fictif’ that has a density $E/c^2$. What Poincaré could not convince himself of was that the mass decrease was due to the loss by radiation. In other words, the mass of such a fluid would be destructible, being able to reappear in other forms. Since such a mass seemed to have little to do with mass, as Poincaré knew it, he referred to it as a ‘fictitious fluid,’ rather than a real one. The corresponding momentum of this ‘fictitious fluid’ would be the density times c. Thus, the momentum from forward and backward radiation would be
\[ \frac{E}{2c}\left(\frac{1 + u/c}{1 - u/c}\right)^{1/2} \quad\text{and}\quad \frac{E}{2c}\left(\frac{1 - u/c}{1 + u/c}\right)^{1/2}, \]
so that the net momentum would be the difference of the two
\[ \frac{Eu}{c^2\sqrt{(1 - (u/c)^2)}}. \]
Although this is momentum because $W/c^2$ has units of momentum, there has been no hint of inertial mass. If $u'$ is the velocity of the body prior to the emission of radiation, when it had mass $m'$, the conservation of momentum demands
\[ \frac{m'u'}{\sqrt{(1 - (u'/c)^2)}} = \frac{mu}{\sqrt{(1 - (u/c)^2)}} + \frac{(E/c^2)u}{\sqrt{(1 - (u/c)^2)}}. \tag{5.9.2} \]
A difference between $u'$ and $u$ would mean that we could observe the motion of the system with respect to that of its enclosure. Relativity forbids this by claiming that $u' = u$, and so the conservation of momentum gives as its condition $m' - m = E/c^2$, a result that is entirely independent of whether the system and enclosure are in motion or not! This ‘proof’ is what Ives [52] attributes to Poincaré’s ‘principle of relativity,’ which he formulated in 1904, to the effect that it is impossible by observation on a body to detect its uniform translational motion. Now, we only know classical mechanics and the Planck relation. We have no knowledge that momentum conservation looks anything like (5.9.2).
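The step from the two Doppler-shifted momenta to the condition $m' - m = E/c^2$ can be followed numerically. The sketch below is illustrative only: it works in units where c = 1 by default, and the function names and the sample values of E, m and u are assumptions made here rather than quantities taken from the text.

```python
import math

def doppler_momenta(E, u, c=1.0):
    """Forward and backward radiation momenta; each pulse carries E/2 at rest."""
    beta = u / c
    k = math.sqrt((1.0 + beta) / (1.0 - beta))
    return E / (2.0 * c) * k, E / (2.0 * c) / k

def net_radiation_momentum(E, u, c=1.0):
    """Closed form of their difference: E u / (c^2 sqrt(1 - (u/c)^2))."""
    return E * u / (c**2 * math.sqrt(1.0 - (u / c) ** 2))

def mass_before_emission(m, E, u, c=1.0):
    """Solve (5.9.2) for m' under the assumption u' = u."""
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    total = m * u * gamma + (E / c**2) * u * gamma
    return total / (u * gamma)

E, m = 0.3, 2.0  # illustrative emitted energy and final mass
for u in (0.1, 0.5, 0.9):
    forward, backward = doppler_momenta(E, u)
    # The mass difference in the last column should not depend on u.
    print(u, forward - backward, net_radiation_momentum(E, u),
          mass_before_emission(m, E, u) - m)
```

The last column stays equal to $E/c^2$ for every speed, which is the velocity independence emphasized in the text.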
A radiation source is mounted onto a moving railroad car which is traveling at a constant velocity u. The radiation source emits two pulses of radiation, both at frequency ν, in the forward and aft directions with regard to the moving car. The frequency in the forward direction will be Doppler-shifted toward the blue by an amount
\[ \nu' = \frac{1}{2}\nu\left(\frac{1 + u/c}{1 - u/c}\right)^{1/2}, \tag{5.9.3} \]
while that shifted downstream will be subjected to a redshift by an amount
\[ \nu'' = \frac{1}{2}\nu\left(\frac{1 - u/c}{1 + u/c}\right)^{1/2}. \tag{5.9.4} \]
Classically, we can measure only differences in energy and momentum. The change in energy, $\Delta E = h(\nu' + \nu'')$, and the change in momentum, $\Delta G = h(\nu' - \nu'')/c$, will be given by [Steck & Route 83]
\[ \Delta E = \frac{h\nu}{\sqrt{(1 - u^2/c^2)}}, \tag{5.9.5a} \]
\[ \Delta G = \frac{E}{c^2}\frac{u}{\sqrt{(1 - u^2/c^2)}}. \tag{5.9.5b} \]
Dividing (5.9.5a) by (5.9.5b) results in
\[ \frac{\Delta E}{c^2}u = \Delta G = \Delta m\,u, \tag{5.9.6} \]
because the only way that the momentum can change at constant velocity is for the mass to change. This is the famous Einstein equivalence between mass and energy which he derived using the relativistic expression for the kinetic energy. Here, we have employed only classical physics. We have used the longitudinal Doppler shift to ‘derive’ the result that a change in mass is measured as a change in the energy of a body. It is surprising that both space contraction and time dilatation do not depend upon whether we are moving toward or away from the source. For slowly moving inertial systems, Einstein predicts that “the time marked by the moving clock, viewed in the stationary system, is slowed by . . . $\tfrac{1}{2}u^2/c^2$ per second,” where u is the relative velocity between the clocks. To lowest order, the Doppler effect gives a correction proportional to u/c, which is larger than that predicted by Einstein.
Rather, it is only when we take the sum and difference of the two frequencies, (5.9.3) and (5.9.4), that the first-order terms cancel, giving a result that is independent of the direction of motion. This is exactly what was done in the 1938 experiment performed by Ives and Stilwell, which we discussed in Sec. 3.4. They wanted to measure the second-order Doppler effect where light is emitted in the transverse direction to the motion. Instead of measuring this, which is nearly impossible, they measured radiation in the forward and backward directions normal to the transverse direction. By averaging the two they obtained a second-order Doppler effect, (3.4.6), instead of the usual, first-order effect, which predicts a shift in wavelength, Δλ ≈ (u/c)λ. It is rather ironic that French [66] in his Special Relativity adds the following anecdote to the Ives and Stilwell experiment:

It is a curious sidelight on this experiment that its authors did not (even as late as 1938) accept special relativity. In their view the results simply demonstrated that a moving clock runs slow (as Larmor and Lorentz suggested) by just the same factor, and in just as real a way, as a moving rod was believed to be contracted if it pointed along its direction of absolute motion through the aether. Old and cherished ideas die hard.
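The way averaging removes the first-order shift can be illustrated with a few lines of arithmetic. This is only a sketch of the kinematics, not a model of the Ives and Stilwell apparatus; the function name `doppler_factors` and the value β = u/c = 0.01 are assumptions chosen here for illustration.

```python
import math

def doppler_factors(beta):
    """Longitudinal Doppler factors for approach (blue) and recession (red)."""
    blue = math.sqrt((1.0 + beta) / (1.0 - beta))
    red = math.sqrt((1.0 - beta) / (1.0 + beta))
    return blue, red

beta = 0.01                      # illustrative value of u/c
blue, red = doppler_factors(beta)

mean = 0.5 * (blue + red)        # equals 1 / sqrt(1 - beta^2)
half_diff = 0.5 * (blue - red)   # equals beta / sqrt(1 - beta^2)

# Each single shift is first order in beta; the mean differs from 1 only at second order.
print(blue - 1.0, red - 1.0)
print(mean - 1.0, 0.5 * beta**2)
print(half_diff, beta)
```

The individual shifts are of order β, while the mean departs from unity by approximately $\tfrac{1}{2}\beta^2$, independently of the sign of β.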
According to Whitrow [80], Einstein would argue that the contraction of a rod or the dilatation of time are only apparent changes, not involving any real change in the constituent of matter. Or would he? Just recall the incident with Varičak [11] who argued that the contraction was “so to speak, only a psychological and not a physical fact.” Although Einstein requested Ehrenfest to respond to Varičak, he could not let something like this go unanswered. Now, Einstein argued that “contraction was completely real,” and subject to physical measurement by an observer not moving with the contracting body [Klein 70]. However, according to the experiment of Ives and Stilwell, contraction of the rod should occur in the transverse direction of the “absolute motion through the aether.” When considering the motion of one clock with respect to another, in either frame the clock will register fewer ticks from the other clock than its own clock. It does not depend on whether the observer is approaching or receding from the other clock, which means that a first-order Doppler effect is not involved. This seems strange at first sight. Moreover, since the clock will register fewer ticks from the moving clock than from its own clock, some ticks will have gotten lost [Essen 78]. On a round trip the two clocks are compared and the moving one is seen to
have gone slower, i.e. the so-called twin paradox. But, when the clocks are finally compared there is no relative velocity between them, so what is being compared? The existence of a second-order time dilatation follows from the Lorentz transformation, “but it is now a real physical effect just as in the Lorentz theory from which Einstein started” [Essen 78]. Do we actually know what is going on? And why should time, unlike frequency, undergo a second-order slowing down when the latter undergoes a first-order shift? Blame it on accelerations! A twin leaves Earth, where his brother remains, and makes a round trip only to find that his brother has aged more than he has. Accelerations and decelerations are needed to complete a round trip, and, somehow, these have shortened time. According to Bondi, this is nonsense since each brother has measured ‘his’ time. But, how do we compare the times? If one brother remains inertial, the only way is to have the other brother undergo acceleration and deceleration. Again, according to Bondi, “the time taken by such an observer is less than the time recorded by the inertial observer.” But, the two clocks are not symmetrical, for acceleration and deceleration have intervened. Unfortunately, acceleration does not enter into special relativity so it can have no effect whatsoever.

Yet, the relation is independent of the motion, and so applies to a stationary system. Hence, the conservation of momentum does not apply, and a valid derivation of the relation should be completely ‘at rest.’ In electrodynamics, the four-force can be derived from the divergence of a stress tensor. In the early days of relativity it was believed that any force can be reduced to the electrodynamic force, apart from the gravitational force [Pauli 58]. It is the symmetry property of the energy–momentum tensor that asserts a proportionality between the momentum density and the energy flux density.
This was first proposed by Planck [07], who set the density of inertial mass equal to the heat content. We have reviewed its present status [Lavenda 02]. This can be considered as a generalization of the equivalence of mass and energy. However, it makes a statement as to the localization of momentum and energy. By a mere integration over the system’s finite volume, the total momentum and energy are recovered.
References
[Abraham 03] M. Abraham, “Prinzipien der Dynamik des Elektrons,” Ann. Phys. 10 (1903) 105–179.
[Babič & Buldyrev 91] V. M. Babič and V. S. Buldyrev, Short-Wavelength Diffraction Theory (Springer-Verlag, Berlin, 1991), Ch. 3.
[Bondi 64] H. Bondi, Relativity and Common Sense: A New Approach to Einstein (Doubleday, New York, 1964).
[Bridgman 27] P. W. Bridgman, The Logic of Modern Physics (Macmillan, New York, 1927).
[Bucherer 04] A. H. Bucherer, Mathematische Einführung in Die Elektronentheorie (B. G. Teubner, Leipzig, 1904), p. 50, Eq. (91a).
[Coxeter 65] H. S. M. Coxeter, Non-Euclidean Geometry, 5th ed. (U. Toronto Press, Toronto, 1965).
[Cushing 81] J. T. Cushing, “Electromagnetic mass, relativity, and the Kaufmann experiments,” Am. J. Phys. 49 (1981) 1133–1149.
[Cuttingham 14] E. Cuttingham, The Principle of Relativity (Cambridge U. P., Cambridge, 1914), p. 152.
[Einstein 05] A. Einstein, “Ist die Trägheit eines Körpers von seinem Energieinhalt abhängig?,” Ann. Phys. 18 (1905) 639–641.
[Essen 78] L. Essen, “Relativity - joke or swindle?,” Wireless World, October 1978, 44–45.
[French 66] A. P. French, Special Relativity (van Nostrand Reinhold, London, 1966).
[Heaviside 99] O. Heaviside, Electromagnetic Theory, Vol. II (The Electrician, London, 1899), pp. 533–534.
[Ives 52] H. E. Ives, “Derivation of the mass–energy relation,” J. Opt. Soc. Am. 42 (1952) 540–543.
[Klein 70] M. J. Klein, Paul Ehrenfest (North-Holland, Amsterdam, 1970).
[Landau & Lifshitz 60] L. D. Landau and E. M. Lifshitz, Electrodynamics of Continuous Media (Pergamon, Oxford, 1960), Sec. 4, Eq. (4.32).
[Larmor 00] J. Larmor, Aether and Matter (Cambridge U. P., Cambridge, 1900), pp. 227–229.
[Lavenda 84] See, for instance, B. H. Lavenda, Nonequilibrium Statistical Thermodynamics (Wiley-Interscience, Chichester, 1985), pp. 22–24.
[Lavenda 02] B. H. Lavenda, “Does the inertia of a body depend on its ‘heat’ content?,” Naturwissenschaften 89 (2002) 329–337.
[Lewis & Tolman 09] G. N. Lewis and R. C. Tolman, “The principle of relativity and non-Newtonian mechanics,” Phil. Mag. 18 (1909) 510–523.
[Liénard 98] A. Liénard, “Champ électrique et magnétique produit par une charge électrique concentrée en un point et animée d’un mouvement quelconque,” L’Éclairage Électrique 16 (1898) 5, 53, 106.
[Lorentz 52] H. A. Lorentz, The Theory of Electrons, 2nd ed. (Dover, New York, 1952), p. 212.
[MacMillan 30] W. D. MacMillan, Theory of the Potential (McGraw-Hill, New York, 1930), pp. 17–18.
[Milne 48] E. A. Milne, Kinematic Relativity (Clarendon Press, London, 1948).
[Okun 89] L. B. Okun, “The Concept of Mass,” Physics Today, June 1989, 31–36.
[O’Rahilly 38] A. O’Rahilly, Electromagnetics (Longman, Green & Co., London, 1938).
[Pauli 58] W. Pauli, Theory of Relativity (Pergamon Press, London, 1958).
[Planck 07] M. Planck, “Zur Dynamik bewegter Systeme,” Berl. Ber. 13 (June, 1907) 542–570; also in Ann. der Phys. Lpz. 76 (1908) 1–34.
[Poincaré 00] H. Poincaré, “The theory of Lorentz and the principle of reaction,” Arch. Nederland 5 (1900) 252–278.
[Richtmyer & Kennard 42] F. K. Richtmyer and E. H. Kennard, Introduction to Modern Physics, 3rd ed. (McGraw-Hill, New York, 1942), pp. 80–82.
[Ritz 08] W. Ritz, “Recherches critiques sur l’Électrodynamique Générale,” Ann. Chimie et Physique, 8th series, XIII (1908) 145–275; translated and commented upon by W. Hovgaard, “Ritz’s electrodynamic theory,” J. Math. Phys. 11 (1932) 218–254.
[Rohrlich 90] F. Rohrlich, “An elementary derivation of E = mc²,” Am. J. Phys. 58 (1990) 348–349.
[Rosenfeld 88] B. A. Rosenfeld, A History of Non-Euclidean Geometry (Springer, New York, 1988).
[Schott 12] G. A. Schott, Electromagnetic Radiation (Cambridge U. P., Cambridge, 1912), p. 246.
[Schwerdtfeger 62] H. Schwerdtfeger, Geometry of Complex Numbers (U. Toronto Press, Toronto, 1962).
[Sholander 52] M. Sholander, “On certain minimum problems in the theory of convex curves,” Trans. Amer. Math. Soc. 32 (1952) 139–173.
[Silberstein 14] L. Silberstein, The Theory of Relativity (Macmillan, London, 1914).
[Soddy 32] F. Soddy, Interpretation of the Atom (John Murray, London, 1932).
[Sommerville 58] D. M. Y. Sommerville, The Elements of Non-Euclidean Geometry (Dover, New York, 1958).
[Thomson 81] J. J. Thomson, “On the effects produced by the motion of electrified bodies,” Phil. Mag. 11 (1881) 229–249.
[Thomson 88] J. J. Thomson, Applications of Dynamics to Physics and Chemistry (Dawsons, London, 1888).
[Thomson 04] J. J. Thomson, Electricity and Matter (Yale U. P., New Haven, 1904).
[Thomson 21] J. J. Thomson, Elements of the Mathematical Theory of Electricity and Magnetism, 5th ed. (Cambridge U. P., London, 1921), p. 388.
[Thomson & Thomson 28] J. J. Thomson and G. P. Thomson, Conduction of Electricity through Gases, Vol. I, 3rd ed. (Cambridge U. P., Cambridge, 1928), Sec. 70.
[Varičak 11] V. Varičak, “Zum Ehrenfestschen Paradoxon,” Phys. Z. 12 (1911) 169.
[Whitrow 80] G. J. Whitrow, The Natural Philosophy of Time, 2nd ed. (Clarendon Press, Oxford, 1980).
[Wilson & Lewis 12] E. B. Wilson and G. N. Lewis, “The space-time manifold of relativity. The non-Euclidean geometry of mechanics and electromagnetics,” Proc. Nat. Acad. Sci. 48 (1912) 387–507.
[Yaghjian 92] A. D. Yaghjian, Relativistic Dynamics of a Charged Sphere: Updating the Lorentz-Abraham Model (Springer-Verlag, Berlin, 1992), p. 11.
[Zahn & Spees 38] C. T. Zahn and A. H. Spees, “A critical analysis of the classical experiments on the relativistic variation of the electron mass,” Phys. Rev. 53 (1938) 511–521.
Chapter 6
Thermodynamics of Relativity

6.1 Does the Inertia of a Body Depend on its Heat Content?
Not only can electromagnetic energy increase the mass: Hasenöhrl [04] pointed out in 1904 that heat energy can also increase the ‘mechanical’ mass of a body. Shortly after Planck’s analysis of blackbody radiation, his students studied the problem of a radiation cavity set in motion and traveling at constant velocity. Building on Hasenöhrl’s [04] result that “to the mechanical mass of our system must be added an apparent mass $m = 8E/3c^2$,” which he later corrected to $m = 4E/3c^2$, Planck’s doctoral student, Mosengeil [07], found thermodynamic expressions for the dynamics of moving systems. These results were later generalized by his mentor, Planck [07]. The nineteenth century saw the equivalence between heat and work, while the twentieth century witnessed the equivalence between heat content and mass.
According to Planck, through every absorption or emission of heat the inertial mass of a body alters, and the increment of mass is always equal to the quantity of heat . . . divided by the square of the velocity of light in vacuo.
The absorption or emission of radiant energy results in the production of heat. The heat is the average kinetic energy of the particles, which is related to the change in mass of the particles. The radiating body loses mass, and the condition must be independent of the velocity at which it is traveling. It is the aim of this chapter to give mathematical substance to this statement.
Einstein’s famous relation,

$$E/c^2 = m, \tag{6.1.1}$$
asserts that the mass of a body is a measure of its energy content. As we saw in Sec. 4.2.1, Einstein’s heuristic derivation used a gedanken experiment in which the emission of radiation from a body causes it to lose mass. It is well known that, under well-defined conditions, external forces can heat a body, thereby increasing its rest mass. In fact, Planck pointed out that the stresses acting on the surface of a body also contribute to its apparent increase in mass, so that the left-hand side of (6.1.1) should be the heat content, or enthalpy, and not the change in energy, divided by the square of the velocity of light in vacuo.

Moreover, the difference between the electromagnetic mass, (5.4.7), and the electrostatic mass, (5.4.5), of an electron has never been cleared up in a completely satisfactory way; like so many other things, it has been swept under the carpet in the course of time. The electrostatic mass is defined as the energy of formation of a spherical charge divided by $c^2$. A factor of $\frac{1}{3}$ separated the two magnitudes. So Lorentz’s conjecture that the origin of the mass of an electron was entirely electromagnetic had to be abandoned because there was something missing that was of a non-electromagnetic nature. The conventional argument, as given by von Laue [19], contends that since the system is not closed, the energy–momentum vector will not transform as a four-vector, as it should. By closing the system with the addition of mechanical components to the energies and momenta of the electron, the correct pre-factor of $\frac{4}{3}$ could be achieved, but the price paid would be extremely high. For this procedure would introduce a negative pressure, known as the Poincaré stress after its author, which is related to the binding potential, or the work done by the internal binding forces as the spherical charge distribution undergoes distortion due to its motion.

As discussed in Sec. 5.4.4, the FitzGerald–Lorentz contraction is used to transform the stationary spherical form of the electron into an oblate ellipsoid when in motion. However, the electron must accelerate in order to achieve a finite velocity, and this has nothing to do with the FitzGerald–Lorentz contraction, which requires an inertial regime. But the work required to accelerate the electron cannot come from “a constant external pressure acting on a deformable and compressible electron, whose work is proportional to the variation in the volume of the electron,” as Poincaré [06] imagined. Slowly the Lorentzian electromagnetic world picture of an electron gave way to a thermodynamic one which was supposedly completely general, independent of any model chosen for an electron. As early as 1907 Planck showed that it was the enthalpy and not the energy that transformed correctly under a Lorentz transformation. Einstein had earlier referred to a ‘strange’ result by remarking:

If a rigid body on which originally no forces are acting is subject to the influence of forces that do not impart acceleration to the body, then these forces, observed from a coordinate system that is moving relative to the body, perform an amount of work $dE$ on the body that depends only on the final distribution of force and the translation velocity.
Mechanical forces that do not impart acceleration to a body are indeed strange. The strange result is that the energy does not transform as it should under a Lorentz transform. There is something left over which, when combined with the work done by compressional forces, yields a Lorentz invariant, namely, the enthalpy. But a Lorentz transform implies that the system is inertial, and this excludes all forces which create accelerations. This is also the limit of a thermodynamic formulation, to which we now turn our attention.
6.2 Poincaré Stress and the Missing Mass

We must return to Lorentz’s theory, but, in order to maintain this free from unacceptable contradictions, a special force must be invoked to account both for the contraction and for the constancy of two of the axes. I have attempted to determine this force, and have found that it can be regarded as a constant external pressure acting upon an electron capable of deformation and compression, the work done being proportional to the change in the volume of the electron.
Henri Poincaré, 23 July 1905
As we have mentioned in Sec. 5.4.4, Kaufmann’s experiments on β-rays showed clearly that mass increases with velocity in a regular way. Abraham concluded that the mass–velocity relation found by Kaufmann is exactly the same as the electromagnetic inertia, which should vary according to his theory. A natural conclusion, which Abraham drew, is that the mass of an electron is entirely of electromagnetic origin.

Abraham’s model views an electron as a sphere, of radius $a$, with a uniform distribution of charge over its surface, as if it were a conductor. The problem of the distribution of the energy of a spherical conductor had been solved earlier, in 1897, by George Searle [96], a close collaborator and friend of Heaviside. So it was only a matter of writing down the field energy,

$$E = \frac{e^2}{8\pi\varepsilon_0 a}\left[\frac{c}{u}\ln\frac{c+u}{c-u} - 1\right],$$

and its momentum,

$$G = \frac{e^2}{16\pi\varepsilon_0 ac}\left[\frac{c^2+u^2}{u^2}\ln\frac{c+u}{c-u} - \frac{2c}{u}\right],$$

which is in the same direction as the velocity. The expression of the mass follows directly from it; that is, the mass is a concept derived from the definition of the momentum. In contrast, according to Lorentz, the momentum is that of a uniform charge,

$$G = \gamma u\,\frac{e^2}{6\pi\varepsilon_0 ac^2},$$
where $\gamma^{-1} = \sqrt{1-u^2/c^2}$ is the FitzGerald–Lorentz contraction factor. Lorentz derives this expression from the Poynting vector for energy flow, $E \times H/c$, integrated over all space. If the motion is along the $x$-axis, Poynting’s vector will be $(\gamma u/c^2)\left(E_y^2 + E_z^2\right)$, where $E_y$ ($E_z$) is the $y$- ($z$-)component of the electric field due to a spherical electron at rest. By symmetry, the integrals of the squares of these components over all space are each equal to $\frac{2}{3}E_{el}$, where $E_{el} = e^2/8\pi\varepsilon_0 a$ is the electrostatic energy of the electron at rest for a uniform distribution of charge over the surface. In this way Lorentz finds

$$G = \frac{4}{3}\,\gamma u E_{el}/c^2. \tag{6.2.1}$$

This is the origin of the famous $\frac{4}{3}$ factor.
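The two momentum formulas can be compared numerically. The following is only a sketch: it measures both momenta in the unit $e^2/16\pi\varepsilon_0 ac$, which is a convenience chosen here and not the author's, and the function names are illustrative. It shows that Abraham's and Lorentz's expressions agree at low velocity, and that Lorentz's momentum exceeds $\gamma u E_{el}/c^2$ by the factor $\frac{4}{3}$ at any velocity.

```python
import math


def abraham_momentum(beta):
    """Abraham's momentum in units of e^2 / (16 pi eps0 a c), with beta = u/c."""
    log_term = math.log((1.0 + beta) / (1.0 - beta))
    return (1.0 + beta**2) / beta**2 * log_term - 2.0 / beta


def lorentz_momentum(beta):
    """Lorentz's momentum gamma*u*e^2/(6 pi eps0 a c^2) in the same units."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (8.0 / 3.0) * gamma * beta


def four_thirds_factor(beta):
    """Ratio of Lorentz's momentum to gamma*u*E_el/c^2, E_el = e^2/(8 pi eps0 a)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    naive = 2.0 * gamma * beta  # gamma*u*E_el/c^2 expressed in the same units
    return lorentz_momentum(beta) / naive


if __name__ == "__main__":
    for beta in (0.01, 0.1, 0.5, 0.9):
        print(beta, abraham_momentum(beta), lorentz_momentum(beta),
              four_thirds_factor(beta))
```

The two models separate as the velocity grows, which is what made Kaufmann's measurements relevant to the choice between them.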
Likewise, Lorentz determines the energy as

$$E = \gamma\,\frac{e^2}{8\pi\varepsilon_0 a}\left(1 + \frac{1}{3}\frac{u^2}{c^2}\right). \tag{6.2.2}$$
Abraham was quick to criticize Lorentz’s model on the grounds that the rate of working,

$$u\cdot\frac{dG}{dt} = \frac{e^2}{6\pi\varepsilon_0 ac^2}\,u\cdot\frac{d}{dt}\left(\frac{u}{\sqrt{1-(u/c)^2}}\right),$$

does not equal the time-derivative of (6.2.2). Either energy conservation fails, or the electron cannot be a purely electromagnetic entity! Now, if the momentum were not given as

$$G' = \frac{u}{c^2}\,\gamma E,$$

in the frame moving with velocity $u$ with respect to a frame at rest, but, rather, by

$$G' = \frac{u}{c^2}\,\gamma(E + PV),$$

and the equation of state for photons,

$$P = \frac{1}{3}E/V, \tag{6.2.3}$$

were used (though we are talking about electrons!), then we would, indeed, find (6.2.1),

$$G' = \frac{4}{3}\frac{u}{c^2}\,\gamma E. \tag{6.2.4}$$
Rather than being valid for electrons, it has been claimed [Landau & Lifshitz 75] that the equation of state (6.2.3) is valid for the electromagnetic interactions between electrons. Be that as it may, Planck was later to identify the pressure, P, as a Lorentz invariant, and this would necessarily imply that E/V would be the density in internal energy, and not the density of the total energy of an electron.
According to Lorentz, the energy transforms as (6.2.2), which we can write as

$$E' = \gamma\left(E + \beta^2 PV\right), \tag{6.2.5}$$

on the strength of (6.2.3). This is what Einstein referred to as a “strange result.” Somehow, the $\beta^2$ had to disappear from (6.2.5), and Poincaré [06] set himself the task of making it disappear. Though Poincaré was much more mathematically than physically minded, he quite ingeniously split the momentum and energy into field, $f$, and mechanical, $m$, components,

$$X' = X'_f + X'_m,$$

where $X$ stands either for $G$ or $E$. Because $G'_f$ gave the correct result, it was necessary that $G'_m = 0$. But, because of the numerical discrepancy in the value of the mass, both components of the energy were required to be non-vanishing. The mechanical component of the energy was set at $E_m = \frac{1}{3}m_{el}c^2 = -P_mV$, where $m_{el} = e^2/8\pi\varepsilon_0 ac^2$ is the electrostatic mass, (5.4.5). Poincaré then added the pressure $P_m$ to the field pressure, $P = P_f + P_m$, in order that it be annulled through the balance $P_f = -P_m$. The negative pressure $P_m$ was referred to as the Poincaré stress by Lorentz.^a Then, since the mechanical component of the energy satisfies the same Lorentz transformation, $E'_m = \gamma\left(E_m + \beta^2 P_mV\right)$, as the field energy (6.2.5), the energy resulting from the application of the Poincaré stress must be given by

$$E'_m = \frac{1}{3}\gamma^{-1}m_{el}c^2. \tag{6.2.6}$$
^a As Whittaker notes, for relativity Lorentz and Poincaré swapped roles: Lorentz became the mathematician, while Poincaré became the physicist.
This is supposedly the non-electromagnetic contribution to the energy that was required to make the energy transform,

$$E' = \gamma m_{el}c^2\left(1 + \frac{1}{3}\beta^2\right) + \frac{1}{3}\gamma^{-1}m_{el}c^2 = \frac{4}{3}\gamma m_{el}c^2,$$

come out just like the Lorentz transform for the momentum, (6.2.4). Even today, (6.2.6) is associated with the work done by the binding forces as a spherical distribution accelerates and contracts [Yaghjian 92]. How (6.2.6) causes accelerations is not broached. But the binding energy is proportional to the volume, and this explains the contraction factor, $\gamma^{-1}$, in (6.2.6). The non-electromagnetic origin of the Poincaré stress sounded the death knell for a purely electromagnetic explanation of the mass of the electron. This is a classic case where prejudice overruled logic, and force was applied to make a preconceived notion come out as desired.

However, Planck was not ruled by any of these prejudices. Transferring attention from energy to heat content, which, as we know from the Joule–Thomson process, is conserved in an adiabatic process, we find

$$H' = E' + PV' = \gamma^{-1}(E + PV) + \beta^2\gamma(E + PV) = \frac{4}{3}\gamma E. \tag{6.2.7}$$
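The arithmetic behind the two routes to the $\frac{4}{3}$ factor is easy to check. The sketch below assumes the photon equation of state (6.2.3), $PV = E/3$, and the contracted volume $V' = V/\gamma$; the function names are illustrative only. It evaluates Poincaré's sum of field and mechanical energies and Planck's enthalpy for the same rest energy.

```python
import math


def gamma_factor(beta):
    """Lorentz factor for beta = u/c."""
    return 1.0 / math.sqrt(1.0 - beta**2)


def field_energy(E_rest, beta):
    """Eq. (6.2.5) with PV = E/3: E' = gamma * (E + beta^2 * E/3)."""
    return gamma_factor(beta) * (E_rest + beta**2 * E_rest / 3.0)


def poincare_energy(E_rest, beta):
    """Eq. (6.2.6): mechanical energy supplied by the Poincare stress."""
    return E_rest / (3.0 * gamma_factor(beta))


def moving_enthalpy(E_rest, beta):
    """Eq. (6.2.7): E' + P V' with invariant P and V' = V / gamma."""
    return field_energy(E_rest, beta) + (E_rest / 3.0) / gamma_factor(beta)


if __name__ == "__main__":
    E_rest, beta = 1.0, 0.8
    target = 4.0 / 3.0 * gamma_factor(beta) * E_rest
    print(field_energy(E_rest, beta) + poincare_energy(E_rest, beta), target)
    print(moving_enthalpy(E_rest, beta), target)
```

Numerically the two prescriptions coincide; what differs is the interpretation of the extra term, as a non-electromagnetic binding energy in one case and as the $PV'$ part of the heat content in the other.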
Dividing through by the contracted volume $V' = \gamma^{-1}V$ in (6.2.7) gives the total enthalpy density, $h' = h + u\cdot g = \gamma^2\rho_{el}c^2$, where $g = \gamma^2\rho_{el}u$ is the momentum density, and

$$\rho = (\varepsilon + P)/c^2 = h/c^2 = \frac{4}{3}\,\varepsilon/c^2, \tag{6.2.8}$$

is the Lorentz-invariant mass density given in terms of the enthalpy density, $h$. This is precisely Planck’s result! Planck, in a talk given in Köln on 23 September 1908, referred to (6.2.8) as the law of inertia of energy: the corresponding momentum density $hu/c^2$ was in the same direction as the Poynting vector, $hu$, for the flow of energy. It is as it should be: mass is related to a conserved quantity in an adiabatic process.

To paraphrase Planck, consider a ponderable flux of energy under a pressure $P$ through a surface element $dA$ normal to the velocity $u$. In a time $dt$ the mechanical work $P\cdot dA\cdot u\,dt$ will be performed. The accompanying energy transferred is $dA\cdot\varepsilon\cdot u\,dt$, where $\varepsilon$ is the energy density in (6.2.8). The momentum density will then be given by their sum, $u(\varepsilon + P)/c^2$, referred to a unit of volume, where $(\varepsilon + P)/c^2$ can be considered the density of mass in (6.2.8), which “is a well-known relation in relativity theory,” but has subsequently been forgotten [Lavenda 02].

Although the presence of the $PV$ term in the heat content means that an electron cannot be compressed to a point particle, no structure has been given to it to date. Equating the electrostatic energy to the rest energy gives a radius of order $10^{-15}$ m, known as the classical radius of the electron. However, modern experiments put the electron radius at much less than $10^{-15}$ m. But, because energy does not transform into energy under a Lorentz transform, it is necessary to consider the enthalpy, with the additional $PV$ term. Moreover, it is entirely reasonable to associate relativistic mass with the heat content, and not with the energy, because only the former is conserved in adiabatic processes where the volume is altered. So even if the electron’s size is extremely small, thermodynamics still demands that it be attributed a volume.
6.3 Lorentz Transforms from the Velocity Composition Law

It may turn out that we shall be compelled to create a totally new mechanics, which we can imagine only vaguely, a mechanics in which inertia would increase with velocity, while the speed of light would be an insuperable limit.
Henri Poincaré, January 1904
In Victorian times, work was associated with controllable coordinates, and heat with uncontrollable ones [Thomson 68]. This decomposition can be extended to the velocity components of a particle. Consider the velocity at which a particle is moving to be comprised of two components, a uniform velocity component, u, and a component w due to random thermal motion.
Since the latter will require averaging, only two components of $w$ will be required: the components in the direction of motion and in the opposite direction to the motion. The composition of collinear velocities,^b

$$v_\pm = \frac{u \pm w}{1 \pm uw/c^2}, \tag{6.3.1}$$

for the addition of the causal, $u$, and random, $w$, velocity components is equivalent to the addition formula [Sommerfeld 09],

$$\tanh(\theta' \pm \theta) = \frac{\tanh\theta' \pm \tanh\theta}{1 \pm \tanh\theta'\tanh\theta}, \tag{6.3.2}$$

for the hyperbolic tangent, with $u/c = \tanh\theta'$ and $w/c = \tanh\theta$. From this observation derives Robb’s [11] definition of ‘rapidity.’ We will consider the motion in a single dimension, indicating where necessary the generalization to higher dimensions. Given the collinear velocities (6.3.1), the total energy of $N$ noninteracting particles of rest mass $m$ is

$$E = \frac{1}{2}\frac{Nmc^2}{\sqrt{1-v_+^2/c^2}} + \frac{1}{2}\frac{Nmc^2}{\sqrt{1-v_-^2/c^2}}.$$

Introducing the composition law (6.3.1) results in

$$E = \frac{\left(mN + K(w)/c^2\right)c^2}{\sqrt{1-u^2/c^2}} = Nmc^2\cosh\theta\cosh\theta', \tag{6.3.3}$$
where

$$K(w) = Nmc^2\left[\frac{1}{\sqrt{1-w^2/c^2}} - 1\right] = Nmc^2(\cosh\theta - 1), \tag{6.3.4}$$

is the random kinetic energy of the thermal motion of $N$ particles.

^b The origin of the idea of using a stochastic component of the velocity is unknown. Ives [44] used it, and it post-dates Becker [33], whom he referenced, but the trail seems to stop there.
Analogously, the momentum of $N$ non-interacting particles,

$$G = \frac{1}{2}\frac{Nmv_+}{\sqrt{1-(v_+/c)^2}} + \frac{1}{2}\frac{Nmv_-}{\sqrt{1-(v_-/c)^2}},$$

becomes

$$G = \frac{\left(mN + K(w)/c^2\right)u}{\sqrt{1-u^2/c^2}} = Nmc\sinh\theta'\cosh\theta, \tag{6.3.5}$$

under the composition law, (6.3.1). The energy, (6.3.3), and momentum, (6.3.5), will not form a two-vector unless the random kinetic energy, (6.3.4), is constant because

$$E^2/c^2 - G^2 = (Nmc)^2\cosh^2\theta.$$

In fact, averaging will be required since $w$ is the random thermal component of the velocity. Notwithstanding this, the total energy and momentum form a two-vector without any averaging. Now consider the differences in energy and momentum. These are

$$\Delta E = \frac{1}{2}Nmc^2\left[\frac{1}{\sqrt{1-v_+^2/c^2}} - \frac{1}{\sqrt{1-v_-^2/c^2}}\right] = \frac{Nmu}{\sqrt{1-u^2/c^2}}\,\frac{w}{\sqrt{1-w^2/c^2}} = Nmc^2\sinh\theta'\sinh\theta, \tag{6.3.6}$$

and

$$\Delta G = \frac{1}{2}Nm\left[\frac{v_+}{\sqrt{1-v_+^2/c^2}} - \frac{v_-}{\sqrt{1-v_-^2/c^2}}\right] = \frac{\left(mN + K(w)/c^2\right)w}{\sqrt{1-u^2/c^2}} = Nmc\sinh\theta\cosh\theta', \tag{6.3.7}$$

respectively. Adding (6.3.3) and (6.3.6) gives the total energy,

$$E' = E + \Delta E = Nmc^2\cosh(\theta + \theta'), \tag{6.3.8}$$

while adding (6.3.5) and (6.3.7) gives the total momentum,

$$G' = G + \Delta G = Nmc\sinh(\theta + \theta'). \tag{6.3.9}$$
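The chain of identities from (6.3.1) to (6.3.9) can be verified numerically. The following sketch assumes units with $c = 1$ by default and uses illustrative function names; it builds the two streams at $v_\pm$, forms their sums and differences, and leaves the comparison with the hyperbolic expressions to the caller.

```python
import math


def compose(u, w, c=1.0):
    """Collinear velocity composition, Eq. (6.3.1) with the upper sign."""
    return (u + w) / (1.0 + u * w / c**2)


def stream(mass, v, c=1.0):
    """Relativistic energy and momentum of a total rest mass moving at v."""
    g = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return g * mass * c**2, g * mass * v


def two_streams(N, m, u, w, c=1.0):
    """Return (E, G, dE, dG): sums and differences of the v+ and v- streams."""
    E_plus, G_plus = stream(0.5 * N * m, compose(u, w, c), c)
    E_minus, G_minus = stream(0.5 * N * m, compose(u, -w, c), c)
    return (E_plus + E_minus, G_plus + G_minus,
            E_plus - E_minus, G_plus - G_minus)


if __name__ == "__main__":
    N, m, u, w = 10.0, 2.0, 0.6, 0.3
    E, G, dE, dG = two_streams(N, m, u, w)
    th_u, th_w = math.atanh(u), math.atanh(w)  # rapidities theta' and theta
    print(E, N * m * math.cosh(th_u) * math.cosh(th_w))      # Eq. (6.3.3)
    print(E + dE, N * m * math.cosh(th_u + th_w))            # Eq. (6.3.8)
    print((E + dE)**2 - (G + dG)**2, (N * m)**2)             # invariant
```

Working with rapidities makes the composition additive, which is why the totals collapse to a single hyperbolic function of $\theta + \theta'$.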
It is now evident that (6.3.8) and (6.3.9) form a two-vector,

$$E'^2/c^2 - G'^2 = (Nmc)^2. \tag{6.3.10}$$

Unwittingly, and most surprisingly, (6.3.8) and (6.3.9) constitute Lorentz transforms for the energy and momentum! All we have to do is to expand the hyperbolic functions of the compound angle, and note that $G_0 = Nmc\sinh\theta$ and $E_0 = Nmc^2\cosh\theta$, with $\gamma = \cosh\theta'$. Implicit in this is the definition of rapidity, $u/c = \tanh\theta'$, for then the Lorentz transformation can be written in the suggestive form of a rotation from $\theta$ to $\theta + \theta'$:

$$E' = \gamma\left(E_0 + uG_0\right) = E + \Delta E, \tag{6.3.11}$$

$$G' = \gamma\left(G_0 + \frac{u}{c^2}E_0\right) = G + \Delta G, \tag{6.3.12}$$

which are none other than the addition formulas for the hyperbolic sine and cosine. We could equally well have assumed the Lorentz transforms and worked our way back to derive the composition law for the velocities. The procedure is completely reversible. These equations should be compared to

$$E' = \int T'_{tt}\,dV = \gamma\int\left(T_{tt} + \beta^2T_{xx}\right)dV_0, \tag{6.3.13}$$

$$G' = \frac{i}{c}\int T'_{xt}\,dV = \gamma\frac{u}{c^2}\int\left(T_{tt} + T_{xx}\right)dV_0, \tag{6.3.14}$$

for a frame moving at constant velocity $u$ in the $x$-direction. $T_{tt}$ and $T_{xx}$ are the energy density and the energy flux density of the stress tensor $T$, respectively, in the frame at rest. According to Planck’s hypothesis [Becker 33], which Pauli [58] refers to as a theorem: to each energy flux density, $S$, there corresponds a momentum density, $g = G_0/V_0$,

$$g = Su/c^2. \tag{6.3.15}$$
However, (6.3.15) had been derived seven years before by Poincaré [00], as we saw in Sec. 1.2.2.2. This is yet another example of Stigler’s law of eponymy. Regardless of who discovered it, (6.3.15) is correct and will satisfy the energy balance equation. However, we will show that it is not correct to relate the energy flux with space components of the stress tensor $T$, such that

$$S = u\cdot T. \tag{6.3.16}$$

We have to thank (6.3.16) for all the inconsistencies in dealing with the resulting balance equations. Considering the body to be isotropic, $T_{xx} = T_{yy} = T_{zz} = P$, where $P$ is the pressure, the Lorentz transforms (6.3.13) and (6.3.14) would result from (6.3.11) and (6.3.12) by setting the momentum in the frame at rest equal to

$$G = \frac{u}{c^2}PV, \tag{6.3.17}$$

so as to give, with $E$ the energy in the frame at rest,

$$E' = \gamma\left(E + \beta^2PV\right), \tag{6.3.18a}$$

$$G' = \gamma\frac{u}{c^2}(E + PV). \tag{6.3.18b}$$
But we are not permitted to write $G' = uPV'/c^2$ because the volume undergoes a FitzGerald–Lorentz contraction,

$$V' = V\gamma^{-1}. \tag{6.3.19}$$

The Lorentz transforms (6.3.18a) and (6.3.18b) were first given by Planck [07] by assuming that his kinetic potential, $K = PV$, was a function of the velocity, as well as the temperature and volume. Adding $PV'$ to both sides of (6.3.18a), and using (6.3.19), results in

$$H' = E' + PV' = \gamma(E + PV). \tag{6.3.20}$$

This is the enthalpy, and introducing it into (6.3.18b) gives

$$G' = \frac{u}{c^2}H', \tag{6.3.21}$$
which can no longer be considered a spatial component of the stress, as Planck intended in (6.3.16). The problem is that (6.3.18a) and (6.3.18b) are not Lorentz transforms, but, rather, definitions of the total energy and momentum. It was Planck [07] who first wrote his “gesamte Energie” (total energy) as

$$E_{tot} = E' + uG' = E/\gamma + \gamma\beta^2H = \gamma\left(E + \beta^2PV\right), \tag{6.3.22}$$

where

$$E = TS - PV, \tag{6.3.23}$$

is the internal energy, and $S$ is the entropy. The transformation in (6.3.22) is the result of the FitzGerald–Lorentz contraction of the volume, (6.3.19), and the fact that objects get cooler as they travel at greater speeds,^c

$$T' = T\gamma^{-1}. \tag{6.3.24}$$
^c Over the years controversies have arisen as to what gets colder when in movement. In 1963 Ott supposedly demonstrated the inverse of Planck’s transformation laws. Following this, Arzeliès independently arrived at the same conclusions. Landsberg dissented from both choices and made temperature an invariant. Somewhat later, van Kampen made both the temperature and heat Lorentz invariants. A history of thermodynamic Lorentz invariants is given by Callen and Horowitz [71], who sided with Landsberg and claimed that enthalpy, and not the energy, is the natural potential for relativistically confined systems. All these proposals overlooked the fact that the entropy must be a Lorentz invariant for, otherwise, we could distinguish rest from motion by the degree of disorder of the system. So heat and temperature must transform the same way, and both decrease when in motion.

Under the Lorentz transformation, the momentum transforms from (6.3.17) to (6.3.21). The energy transforms from the internal energy (6.3.23) to the total energy (6.3.22). In thermodynamics, it is the enthalpy $H = E + PV$, and not the internal energy, $E$, that transforms correctly under a Lorentz transformation,

$$H' = \gamma H, \tag{6.3.25}$$

which, together with the momentum,

$$G' = \gamma\frac{u}{c^2}H, \tag{6.3.26}$$

forms a Lorentz invariant. This is to say that

$$H'^2/c^2 - G'^2 = H^2/c^2, \tag{6.3.27}$$

like (6.3.10), is a transformation from a state at rest to one in relative motion. Introducing (6.3.25) into (6.3.26) gives Planck’s relation,

$$G = u\,\frac{H}{c^2}, \tag{6.3.28}$$
showing that the ratio $H/c^2$ behaves as the mass. The heat content of a body is a measure of its inertia: the mass, in general, will be a function both of the temperature and volume. Planck also derived his relation (6.3.28) on thermodynamic grounds, on the assumption that his kinetic potential, $K$, is a homogeneous function of $T$, $V$, and $\beta$, i.e.

$$K = \frac{\partial K}{\partial T}T + \frac{\partial K}{\partial V}V - \frac{(1-\beta^2)}{\beta}\frac{\partial K}{\partial\beta}. \tag{6.3.29}$$

If we introduce the total energy, $E_{tot} = uG + TS - K$, into (6.3.29) we come out with

$$E_{tot} + PV = \frac{c^2}{u}G,$$

implying that $H_{tot}/c^2$ is the mass, where $H_{tot} = E_{tot} + PV$ is the total enthalpy since $E_{tot} = E + uG$ is the total energy. The internal energy contracts when in motion, but the work necessary to keep the system in a state of uniform motion ensures that the total energy will dilate when in motion. As Planck [07] rightly emphasized, the stresses acting on the surface of a particle also contribute to the mass–energy of the particle, so it is really a mass–enthalpy relationship. In the present formulation, it is the kinetic energy, (6.3.4), resulting from the heat motion that contributes to the mass of the particle. But this motion, even in a state at rest, $u = 0$, should not vanish, as (6.3.10) would have us believe. The reason is that (6.3.4) is defined stochastically, so that without suitable averaging it has no meaning.
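The transformation rules collected in (6.3.19) to (6.3.27) can be checked with a few lines of arithmetic. This is a sketch only: it takes $P$ and $S$ as invariant and $V' = V/\gamma$, as the text does, and the function name and sample numbers are illustrative.

```python
import math


def planck_transform(E, PV, beta, c=1.0):
    """Moving-frame enthalpy, momentum and total energy from rest-frame E and PV."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    u = beta * c
    H = E + PV                       # rest-frame enthalpy
    H_moving = gamma * H             # Eq. (6.3.25)
    G_moving = gamma * u * H / c**2  # Eq. (6.3.26)
    E_total = E / gamma + u * G_moving  # Eq. (6.3.22): contracted E plus u*G'
    return H_moving, G_moving, E_total


if __name__ == "__main__":
    E, PV, beta = 3.0, 1.0, 0.6
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    H_moving, G_moving, E_total = planck_transform(E, PV, beta)
    print(H_moving**2 - G_moving**2, (E + PV)**2)    # Eq. (6.3.27) with c = 1
    print(E_total, gamma * (E + beta**2 * PV))       # Eq. (6.3.22)
    print(E_total + PV / gamma, H_moving)            # E_tot + P V' = H'
```

The last comparison is the statement that the total energy plus $PV'$ reproduces the moving enthalpy, so that $G' = uH'/c^2$ as in (6.3.21).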
If we average (6.3.3) and (6.3.5), $\left(\bar{E}/c, \bar{G}\right)$ does, indeed, form a two-vector since

$$\bar{E}^2/c^2 - \bar{G}^2 = \left(N\bar{m}c\right)^2, \quad\text{where}\quad \bar{m} = m + \overline{K(w)}/Nc^2. \tag{6.3.30}$$

The average random kinetic energy, resulting from random thermal motions, contributes to the increase in the inertial mass.
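A small numerical illustration of (6.3.30) follows. It is a sketch under stated assumptions: the thermal speeds are simply a list of sample values supplied by the caller (here drawn uniformly, which is an arbitrary choice and not a physical distribution), and the names are illustrative.

```python
import math
import random


def averaged_energy_momentum(N, m, u, thermal_speeds, c=1.0):
    """Average Eqs. (6.3.3) and (6.3.5) over samples of the thermal speed w."""
    gamma_u = 1.0 / math.sqrt(1.0 - u**2 / c**2)
    kinetic = [N * m * c**2 * (1.0 / math.sqrt(1.0 - w**2 / c**2) - 1.0)
               for w in thermal_speeds]            # Eq. (6.3.4) per sample
    K_mean = sum(kinetic) / len(kinetic)
    inertia = N * m + K_mean / c**2                # total inertial mass
    E_mean = inertia * c**2 * gamma_u
    G_mean = inertia * u * gamma_u
    m_mean = m + K_mean / (N * c**2)               # Eq. (6.3.30)
    return E_mean, G_mean, m_mean


if __name__ == "__main__":
    random.seed(0)
    speeds = [random.uniform(0.0, 0.5) for _ in range(1000)]
    E_mean, G_mean, m_mean = averaged_energy_momentum(5.0, 1.0, 0.6, speeds)
    print(E_mean**2 - G_mean**2, (5.0 * m_mean)**2)  # equal, with c = 1
```

The averaged mass always exceeds the rest mass, since every sample of $K(w)$ is non-negative.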
6.4 Density Transformations and the Field Picture
We have considered how energy and momentum transform, as well as other thermodynamic quantities. Now we consider how their densities transform in one dimension. The total energy density can be written as

$$\varepsilon = \frac{1}{2}mnc^2\left[\frac{1}{1-v_+^2/c^2} + \frac{1}{1-v_-^2/c^2}\right],$$

where $n$ is the number density. With the aid of the composition law for the velocities, (6.3.1), we find

$$\varepsilon = mnc^2\,\frac{1 + (uw/c^2)^2}{(1-u^2/c^2)(1-w^2/c^2)}. \tag{6.4.1}$$

Because the velocities $u$ and $w$ enter symmetrically, the FitzGerald–Lorentz contraction of the volume should be given by

$$V' = V\sqrt{1-u^2/c^2}\,\sqrt{1-w^2/c^2}. \tag{6.4.2}$$

Since (6.4.2) is not (6.3.19) we can expect that the densities will not give back the transformation laws for their macroscopic counterparts on multiplication by the volume in the moving frame. This can be seen by multiplying both sides of (6.4.1) by (6.4.2), and averaging; it does not yield (6.3.18a), but, rather,

$$\bar{E}' = \gamma\left[Nmc^2 + \overline{K(w)} + \beta^2PV\right], \tag{6.4.3}$$

where

$$P = nm\,\overline{\left(\frac{w^2}{\sqrt{1-w^2/c^2}}\right)} \tag{6.4.4}$$

is the pressure in one dimension. The first term in the parenthesis in (6.4.3) is our definition of the total energy, (6.3.3), in a state of rest. Thus, (6.4.3) can be considered as the Lorentz transform applied to the energy to take it from a state of rest to one of uniform velocity, (6.3.18a). Consider now the momentum density,

$$g = \frac{1}{2}nm\left[\frac{v_+}{1-v_+^2/c^2} + \frac{v_-}{1-v_-^2/c^2}\right].$$

Again making use of the velocity composition law results in

$$g = \frac{mnu}{1-u^2/c^2}\,\frac{1+w^2/c^2}{1-w^2/c^2}. \tag{6.4.5}$$

Multiplying both sides of (6.4.5) by (6.4.2), and averaging, gives

$$\bar{G}' = \gamma\frac{u}{c^2}\left[Nmc^2 + \overline{K(w)} + PV\right]. \tag{6.4.6}$$

The first term in the parenthesis is, again, the energy (6.3.3) in a state of rest. Thus, (6.4.6) can be considered as the Lorentz transform on the momentum, (6.3.18b). The work necessary to keep the platform moving at a constant speed $u$ is $u\cdot G'$. The increment in the heat in a frame moving at this velocity is $dQ' = dE' - dL'$, according to the first law, where the increment in the work is $dL' = -P\,dV' + u\cdot dG'$. Thus,

$$\begin{aligned} dQ' &= dE' + P\,dV' - u\cdot dG' \\ &= \gamma\left[dE + \beta^2P\,dV\right] + \gamma^{-1}P\,dV - \gamma\beta^2\left[dE + P\,dV\right] \\ &= \gamma^{-1}\left[dE + P\,dV\right] = \gamma^{-1}dQ, \end{aligned} \tag{6.4.7}$$

showing that a moving body loses heat because energy must be spent to keep it in motion. This is entirely reasonable, since there are no free lunches.
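The passage from the stream form of the densities to the closed forms (6.4.1) and (6.4.5), and then to the extensive quantities (6.4.3) and (6.4.6), can be checked for a single value of $w$, so that no averaging is needed. The sketch below assumes $N = nV$ and $c = 1$ by default; the function names are illustrative.

```python
import math


def compose(u, w, c=1.0):
    """Collinear velocity composition, Eq. (6.3.1)."""
    return (u + w) / (1.0 + u * w / c**2)


def densities_from_streams(n, m, u, w, c=1.0):
    """Energy and momentum densities written as sums over the v+ and v- streams."""
    v_plus, v_minus = compose(u, w, c), compose(u, -w, c)

    def weight(v):
        return 1.0 / (1.0 - v**2 / c**2)

    eps = 0.5 * m * n * c**2 * (weight(v_plus) + weight(v_minus))
    g = 0.5 * n * m * (v_plus * weight(v_plus) + v_minus * weight(v_minus))
    return eps, g


def densities_closed_form(n, m, u, w, c=1.0):
    """Closed forms, Eqs. (6.4.1) and (6.4.5)."""
    denom = (1.0 - u**2 / c**2) * (1.0 - w**2 / c**2)
    eps = m * n * c**2 * (1.0 + (u * w / c**2)**2) / denom
    g = m * n * u * (1.0 + w**2 / c**2) / denom
    return eps, g


def extensive_from_densities(n, m, u, w, V, c=1.0):
    """Multiply the densities by the contracted volume of Eq. (6.4.2)."""
    eps, g = densities_closed_form(n, m, u, w, c)
    V_moving = V * math.sqrt(1.0 - u**2 / c**2) * math.sqrt(1.0 - w**2 / c**2)
    return eps * V_moving, g * V_moving


if __name__ == "__main__":
    n, m, u, w, V = 3.0, 2.0, 0.5, 0.4, 1.5
    print(densities_from_streams(n, m, u, w))
    print(densities_closed_form(n, m, u, w))
    print(extensive_from_densities(n, m, u, w, V))
```

With $P = nmw^2/\sqrt{1-w^2/c^2}$ and $K(w)$ from (6.3.4), the two returned products equal the right-hand sides of (6.4.3) and (6.4.6) for that value of $w$.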
If objects became hotter the faster they travel, this would constitute a perpetuum mobile, since they could partially transform their heat into work so as to go still faster. The product of the momentum $G'$ and the deterministic velocity $u$ is the work necessary to keep the system in a state of steady motion, while the average of the product of the change in momentum, $\Delta G$, in a state of rest, and the random velocity $w$ is proportional to the static pressure. Equations (6.4.7) make another point: we used the volume contraction law (6.3.19), and not (6.4.2). Although the random thermal velocity was crucial in (6.4.5) in defining the pressure according to (6.4.4), it has henceforth disappeared on the macroscopic scale. Planck did not need (6.4.4) to find that the pressure was a relativistic invariant, the same in all inertial frames. Hence, there is a loss of information as we transform from densities to extensive quantities, and the volume contraction (6.4.2) is the last vestige of the actions of random thermal motions. We shall come back to this point in the next section.

The field picture can be derived from a modified version of Planck’s derivation. We introduce the kinetic potential $K$ as a function of $\dot{x}$, $T$, and $V$. The kinetic potential $K$, which is $PV$, transforms as

$$K' = \left(\frac{\partial\dot{x}}{\partial\dot{x}'}\right)^{1/2}K, \tag{6.4.8}$$

where

$$\left(\frac{\partial\dot{x}}{\partial\dot{x}'}\right)^{1/2} = \frac{1-\dot{x}u/c^2}{\sqrt{1-\beta^2}} = \frac{\sqrt{1-\beta^2}}{1+\dot{x}'u/c^2}, \tag{6.4.9}$$

and the second equality follows from

$$1-\beta^2 = \left(1+\frac{\dot{x}'u}{c^2}\right)\left(1-\frac{\dot{x}u}{c^2}\right), \tag{6.4.10}$$

i.e.

$$u = \frac{\dot{x}-\dot{x}'}{1-\dot{x}\dot{x}'/c^2}.$$
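The kinematic identity (6.4.10) and the equality of the two forms of the factor in (6.4.9) follow from the velocity transformation $\dot{x}' = (\dot{x}-u)/(1-\dot{x}u/c^2)$. The sketch below, with illustrative names and $c = 1$ by default, evaluates both forms and inverts the relation for $u$.

```python
import math


def primed_velocity(xdot, u, c=1.0):
    """Velocity seen from the frame moving at u: (xdot - u)/(1 - xdot*u/c^2)."""
    return (xdot - u) / (1.0 - xdot * u / c**2)


def factor_forms(xdot, u, c=1.0):
    """Both expressions for the transformation factor appearing in Eq. (6.4.9)."""
    root = math.sqrt(1.0 - u**2 / c**2)
    xdot_p = primed_velocity(xdot, u, c)
    first = (1.0 - xdot * u / c**2) / root
    second = root / (1.0 + xdot_p * u / c**2)
    return first, second


def relative_velocity(xdot, xdot_p, c=1.0):
    """Recover u from the two velocities, the relation stated after (6.4.10)."""
    return (xdot - xdot_p) / (1.0 - xdot * xdot_p / c**2)


if __name__ == "__main__":
    xdot, u = 0.7, 0.4
    print(factor_forms(xdot, u))
    print(relative_velocity(xdot, primed_velocity(xdot, u)))
    print(factor_forms(u, u), math.sqrt(1.0 - u**2))  # contraction when xdot = u
```

Setting $\dot{x} = u$, so that $\dot{x}' = 0$, reduces both forms to $\sqrt{1-\beta^2}$, the FitzGerald–Lorentz contraction factor.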
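It is apparent that the kinetic potential undergoes a FitzGerald–Lorentz contraction when transferred to the frame traveling at a relative velocity $u$ with respect to the other frame, if that frame is at rest, $\dot{x}' = 0$, implying that $\dot{x} = u$. This we will do at the end of the calculation. Now we inquire as to how the momentum,

$$G = \frac{\partial K}{\partial\dot{x}}, \tag{6.4.11}$$

transforms. From (6.4.8) we have

$$G' = \frac{\partial K'}{\partial\dot{x}'} = \left(\frac{\partial\dot{x}}{\partial\dot{x}'}\right)^{1/2}\left[\frac{\partial K}{\partial\dot{x}}\frac{\partial\dot{x}}{\partial\dot{x}'} + \frac{\partial K}{\partial T}\frac{\partial T}{\partial\dot{x}'} + \frac{\partial K}{\partial V}\frac{\partial V}{\partial\dot{x}'}\right] + K\,\frac{\partial}{\partial\dot{x}'}\left(\frac{\partial\dot{x}}{\partial\dot{x}'}\right)^{1/2}. \tag{6.4.12}$$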
Both the temperature and volume transform as

$$X' = \gamma\left(1 - \frac{\dot{x}u}{c^2}\right)X, \tag{6.4.13}$$

which becomes the FitzGerald–Lorentz contraction when $\dot{x} = u$. Hence, (6.4.12) is given explicitly by

$$G' = \gamma\left[G + \frac{u}{c^2}(E + PV)\right], \tag{6.4.14}$$

when $\dot{x} = u$, and use has been made of the Euler relation for the internal energy, $E = TS - PV$, and $K = PV$. According to Planck, the total energy is given as a double Legendre transform of the kinetic potential,

$$E_{tot} = T\frac{\partial K}{\partial T} + u\frac{\partial K}{\partial u} - K, \tag{6.4.15}$$

which makes $K$ a function of the internal state variables $T$ and $V$, and also of $\gamma^{-1}$, where

$$S = \frac{\partial K}{\partial T}, \qquad P = \frac{\partial K}{\partial V}, \qquad G = \frac{1}{c}\frac{\partial K}{\partial\beta}.$$

To find the functional dependence of $K$ we take its differential, and use $dE_{tot} = u\,dG + T\,dS - P\,dV$, to obtain $dK = S\,dT + P\,dV + G\,du$.
Finally, if we introduce the Euler relation for $E_{tot}$ into (6.4.15) we find $K = PV$. That $K$ is a function of $V$, $T$, $u$, means that $P = k(T, u)$, where $k$ is the density of the kinetic potential. As such, this contradicts Planck’s finding that $P$ is an invariant. This is precisely what Abraham [20] found, who then goes on to treat blackbody radiation in a moving cavity. There he finds $P = \text{const.}\times T^4\gamma^4$, which on the basis of (6.3.24) makes the pressure an invariant. But the pressure is only a function of $u$ through its dependency on $T$. And if (6.3.24) holds, then $P$ is invariant.

This can be readily seen by considering radiation being reflected off a mirror, which would behave as a monochromatic piece of charcoal in a blackbody. If $\theta$ is the angle that the incoming radiation makes with the velocity vector of the moving blackbody, and $\theta'$ the angle of reflection with respect to the direction of motion, then using the Doppler shift and Wien’s displacement law together with Stefan’s law, we can go a very long way in determining the correct velocity dependencies of the thermodynamic quantities. Stefan’s law says that the intensity of radiation, $I(\theta, \beta)$, varies as $T^4$. Wien’s displacement law says that the ratio of the frequency $\nu$, at which the intensity of radiation peaks as a function of frequency, to the absolute temperature $T$ is constant. Thus, the ratio of the incoming and reflected radiation is given by the Doppler shift,

$$\frac{I(\theta, \beta)}{I(\theta', \beta)} = \left(\frac{\nu}{\nu'}\right)^4 = \left(\frac{1-\beta\cos\theta}{1-\beta\cos\theta'}\right)^{-4}.$$

If we observe the radiation in the direction normal to the motion of the blackbody, then $\theta' = \pi/2$, and the above expression reduces to

$$I(\theta, \beta) = \frac{I(\pi/2, \beta)}{(1-\beta\cos\theta)^4}. \tag{6.4.16}$$

Integrating over the element of solid angle $2\pi\sin\theta\,d\theta$ gives the energy density,

$$\varepsilon = \frac{2\pi}{c}\int_0^\pi\sin\theta\,I(\theta, \beta)\,d\theta. \tag{6.4.17}$$
Moreover, the component of the momentum is smaller, in proportion to $\cos\theta$, than it would be if the incident ray were parallel to the velocity. Consequently, the momentum density is
\[ g = \frac{2\pi}{c^2}\int_0^{\pi}\sin\theta\cos\theta\, I(\theta,\beta)\,d\theta. \tag{6.4.18} \]
These relations were first derived by Mosengeil [07] in his dissertation, and were published posthumously by Planck, and generalized by him. Quite remarkably the relation,
\[ P\beta = gc - \beta\varepsilon = \frac{2\pi}{c}\int_0^{\pi}(\cos\theta-\beta)\sin\theta\, I(\theta,\beta)\,d\theta, \tag{6.4.19} \]
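is another way of writing (6.3.28) in density form. The first published prediction that radiation ‘carries’ mass was made by Hasenöhrl [04, 05] in the years 1904 and 1905.

Introducing (6.4.16) into (6.4.17), (6.4.18), and (6.4.19), and performing the integration lead to
\[ \varepsilon = \frac{2\pi}{c}\,I(\pi/2,\beta)\int_0^{\pi}\frac{\sin\theta\,d\theta}{(1-\beta\cos\theta)^4} = \frac{4\pi}{c}\,I(\pi/2,\beta)\cdot\frac{1+\frac{1}{3}\beta^2}{(1-\beta^2)^3}, \]
\[ g = \frac{2\pi}{c^2}\,I(\pi/2,\beta)\int_0^{\pi}\frac{\sin\theta\cos\theta\,d\theta}{(1-\beta\cos\theta)^4} = \frac{16\pi}{3c^2}\,I(\pi/2,\beta)\cdot\frac{\beta}{(1-\beta^2)^3}, \]
\[ P = \frac{4\pi}{3c}\,I(\pi/2,\beta)\cdot(1-\beta^2)^{-2}. \]
Taking ratios of the terms eliminates the unknown $I(\pi/2,\beta)$ and leads to
\[ gc = \varepsilon\cdot\frac{4\beta}{3+\beta^2} = \frac{\partial k}{\partial\beta}, \qquad P = \varepsilon\cdot\frac{1-\beta^2}{3+\beta^2} = k. \]
These two relations lead to the differential equation,
\[ \frac{\partial k}{\partial\beta} = k\cdot\frac{4\beta}{1-\beta^2}, \]
which can be integrated to give
\[ k = \frac{\varphi(T)}{(1-\beta^2)^2}, \]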
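where the constant of integration, $\varphi(T)$, can be a function only of T. Inserting this expression for k into (6.4.15) leads to the differential equation for the unknown $\varphi$, viz.
\[ T\frac{d\varphi}{dT} = 4\varphi. \]
Integration immediately yields Stefan’s law, $\varphi(T) = \frac{\sigma}{3}T^4$, where $\sigma$ is the radiation constant. Hence, the pressure, energy, momentum, and entropy are found to be
\[ P = \frac{\sigma}{3}\left(\frac{T}{\sqrt{1-\beta^2}}\right)^{4}, \]
\[ E = \frac{\sigma}{3}\,T^4 V\,\frac{3+\beta^2}{(1-\beta^2)^3}, \]
\[ G = \frac{4\sigma}{3c}\,T^4 V\,\frac{\beta}{(1-\beta^2)^3}, \]
\[ S = \frac{4}{3}\,\sigma\left(\frac{T}{\sqrt{1-\beta^2}}\right)^{3}\cdot\frac{V}{\sqrt{1-\beta^2}}. \]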
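The angular integrals behind the energy density, momentum density and pressure can be checked numerically. The following is only a sketch under stated assumptions: the common factor $(2\pi/c)$ times the intensity at $\pi/2$ is dropped, the quadrature is a plain composite Simpson rule, and the function names `simpson` and `angular_integrals` are hypothetical helpers, not anything taken from the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def angular_integrals(beta):
    """Angular integrals of (6.4.17)-(6.4.19), common prefactor dropped (assumption)."""
    weight = lambda th: math.sin(th) / (1.0 - beta * math.cos(th)) ** 4
    eps = simpson(weight, 0.0, math.pi)                               # energy density
    gc = simpson(lambda th: math.cos(th) * weight(th), 0.0, math.pi)  # c * momentum density
    p = (gc - beta * eps) / beta                                      # pressure via (6.4.19)
    return eps, gc, p

beta = 0.6
eps, gc, p = angular_integrals(beta)
print("gc/eps:", gc / eps, "closed form:", 4 * beta / (3 + beta**2))
print("P/eps :", p / eps, "closed form:", (1 - beta**2) / (3 + beta**2))
```

The two printed pairs should agree to quadrature accuracy for any $0 < \beta < 1$; the sketch does not cover $\beta = 0$ because the pressure is recovered by dividing by $\beta$.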
It follows from (6.3.19) and (6.3.24) that the pressure and entropy are relativistic invariants. We have thus shown that the pressure depends on the velocity only through its dependence on the temperature; the fact that the temperature of a body in motion is lower than when it is at rest then implies that the pressure is the same in every inertial frame.

Although the twenty-one-year-old Pauli [58], writing in the Encyklopädie der Mathematischen Wissenschaften, “was still a student at the time, he was not only familiar with the most subtle arguments of the Theory of Relativity through his own research work, but was also fully conversant with the literature on the subject.” The quote is taken from Sommerfeld’s preface to the special German edition. However, when dealing with blackbody radiation in a moving cavity (Sec. 49) Pauli says that “by means of the formulas of Sec. 46” the above relativistic expressions follow. The formulas he is referring to are (6.4.6), (6.4.3), and the volume contraction, (6.4.2), without the stochastic velocity, etc., and are just the Lorentz transforms. They cannot provide the relativistic expressions for cavity radiation. Rather, it is the spectral distribution in the moving cavity argument that he subsequently
develops which yields the relativistic thermodynamic expressions. It is not, as he claims, that these formulas agree with his previous results. The spectral distribution analysis is the only way of arriving at them. Pauli also misses the relation between inertia and heat, (6.3.28). He finishes with the conclusion: “Because of the extreme smallness of the expected effects it seems unlikely that the inertia of radiation energy could be demonstrated experimentally.”
This can also be said of relativity in general. But, as Pauli points out, all results were derived before “the theory of relativity had been formulated,” alluding to the fact that there is more than one way to skin a cat.

The transformation of the total energy can be found from the Euler relation,
\[ E'_{\rm tot} = uG' + ST' - PV'. \tag{6.4.20} \]
Introducing (6.4.14) and (6.4.13) gives
\[ E'_{\rm tot} = \gamma\left(uG + E + \beta^2 PV\right), \tag{6.4.21} \]
on setting $\dot{x} = 0$. The transformation law (6.4.21) shows how the energy necessary to keep the frame at a constant speed u, $uG'$, transforms the internal energy into the total energy, $E'_{\rm tot} - uG' = \gamma^{-1}E$. The transformations (6.4.14) and (6.4.21), with PV added to both sides, can thus be written as
\[ G' = \gamma\left(G + \frac{u}{c^2}H\right), \qquad H' = \gamma\left(uG + H\right). \]
These Lorentz transforms attest to the invariance of
\[ H'^2 - (cG')^2 = H^2 - (cG)^2, \tag{6.4.22} \]
which agrees with (6.3.27). We will now investigate the kinetic origins of the pressure using the relativistic virial theorem.
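The invariance (6.4.22) follows from the pair of transforms for the momentum and enthalpy by direct algebra, and a few lines of arithmetic make this concrete. This is a minimal sketch: `boost` and `invariant` are hypothetical names, and the sample values of G, H, u and c are arbitrary illustrations.

```python
import math

def boost(G, H, u, c=1.0):
    """Transform momentum G and enthalpy H to a frame moving with speed u."""
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    G_prime = gamma * (G + u * H / c**2)
    H_prime = gamma * (u * G + H)
    return G_prime, H_prime

def invariant(G, H, c=1.0):
    """The combination H^2 - (cG)^2 of (6.4.22)."""
    return H**2 - (c * G) ** 2

G, H, u = 0.3, 2.0, 0.8            # arbitrary sample values, units with c = 1
G_prime, H_prime = boost(G, H, u)
print(invariant(G, H), invariant(G_prime, H_prime))
```

Both printed numbers should coincide up to rounding, whatever admissible speed $|u| < c$ is chosen.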
6.5
Relativistic Virial
The usual derivation of the pressure from the virial is to set the pressure equal to one-third the average of twice the kinetic energy density since the pressure will be isotropic in each of the three directions [Clausius 70]. Averaging is carried out over all the momenta with a given probability density function. Why should the deterministic momentum be averaged with respect to a probability distribution? The answer that would be given is that the particles are moving with a distribution of momenta [Einbinder 48a]. But such a distribution cannot be the work of external fields. So appeal is implicitly being made to the action of random thermal motion, and not to a velocity that is determined by external fields. The distinction between the two is never made.

So pressure must be defined in terms of the momenta due to random thermal motion. However, if we define pressure in terms of the product of the stochastic change in the momentum, (6.3.7), times the stochastic velocity, it must be for the state of rest, u = 0. For otherwise it would introduce a dependency of the pressure on the velocity which no averaging can annul, and so destroy its Lorentz invariance. Thus, we define the pressure as
\[ P = \frac{n}{3}\left\langle G(w)\,w\right\rangle = \frac{n}{3}\left\langle\frac{mw^2}{\sqrt{1-w^2/c^2}}\right\rangle, \tag{6.5.1} \]
where the change in the stochastic momentum, G, is given by (6.3.7) in a state at rest, u = 0. The average pressure, (6.5.1), can be written as the virial
\[ 3PV = \langle K(w)\rangle + \langle L(w)\rangle, \tag{6.5.2} \]
where $\langle K(w)\rangle$ is the average of the random kinetic energy, (6.3.4), and
\[ \langle L(w)\rangle = Nmc^2\left\langle 1 - \sqrt{1-(w/c)^2}\right\rangle \]
is the average Lagrangian due to random thermal motion. This is because
\[ \langle Gw\rangle = \left\langle\left(K(w)/c^2 + Nm\right)w^2\right\rangle = \langle K(w)\rangle + Nmc^2\left\langle 1 - \sqrt{1-w^2/c^2}\right\rangle, \]
which gives back the definition of the random kinetic energy, (6.3.4). Moreover, it shows that the virial must be defined in a state of rest, for, otherwise, the pressure would not be a Lorentz invariant. We will see that whether the
system is in motion or at rest is determined by the ideal gas law [cf. (6.7.9) below]. For in that law there are two quantities, (6.3.19) and (6.3.24), that contract when in motion, but compensate one another so that the gas law always remains valid no matter what inertial frame we are in.

The virial (6.5.2) can be written entirely in terms of the average kinetic energy by eliminating the square root using the definition (6.3.4) of the random kinetic energy. We then obtain
\[ 3PV = \left\langle K(w)\,\frac{K(w) + 2Nmc^2}{K(w) + Nmc^2}\right\rangle = \langle K(w)\rangle + \left\langle\frac{Nmc^2\,K(w)}{K(w) + Nmc^2}\right\rangle. \tag{6.5.3} \]
Equation (6.5.3) will only reduce to linear equations of state in the ultra- and non-relativistic limits, or in the small and large mass limits, respectively. Since K(w) is a random function, it makes no sense to say that it is much larger or smaller than the rest energy, $Nmc^2$ [Einbinder 48a]. In the former limit we get
\[ \langle K(w)\rangle = 3PV, \tag{6.5.4a} \]
while in the latter limit,
\[ 2\langle K(w)\rangle = 3PV, \tag{6.5.4b} \]
which are well-known. However, if we appeal to thermodynamics, which is insensitive to fluctuations, an average of a function will be equal to a function of its average. Consequently, thermodynamics interprets (6.5.3) as the Grüneisen equation of state,
\[ PV = s\,\langle K(w)\rangle, \tag{6.5.5} \]
where s is the so-called Grüneisen parameter. On thermodynamic grounds, it is a phenomenological parameter ranging from $s = \frac{2}{3}$ for a perfect, material gas, to $s = \frac{1}{3}$ for a photon gas in three dimensions. A comparison of (6.5.3) with (6.5.5) gives
\[ s = \frac{1}{3}\left(1 + \frac{Nmc^2}{\langle K(w)\rangle + Nmc^2}\right), \tag{6.5.6} \]
for the Grüneisen parameter. In d dimensions it will be given by
\[ s = \frac{1}{d}\left(1 + \frac{Nmc^2}{\langle K(w)\rangle + Nmc^2}\right). \]
In 1908 Grüneisen enunciated his empirical law as “the ratio of the coefficient of expansion of an isotropic solid to its specific heat is independent of the temperature.” The Grüneisen parameter, (6.5.6), will be independent of the average kinetic energy, or equivalently, the absolute temperature, in the extreme nonrelativistic limit, where $Nmc^2 \gg \langle K(w)\rangle$, in which case $s = \frac{2}{3}$, and in the ultra-relativistic limit, where $Nmc^2 \ll \langle K(w)\rangle$, $s = \frac{1}{3}$ in three dimensions.

Equation (6.4.2) predicts a thermal contraction of the volume, in addition to the mechanical contraction of FitzGerald–Lorentz type. Since w is a random speed, only the average of (6.4.2) has meaning. On the basis of the definition of the random kinetic energy, (6.3.4), it is equivalent to
\[ \langle V(w)\rangle = \left\langle\frac{Nmc^2}{Nmc^2 + K(w)}\right\rangle V\gamma^{-1} = (3s-1)\,V\gamma^{-1}, \tag{6.5.7} \]
where the last equality follows in the absence of fluctuations, where an average of a function is equal to a function of the average. This shows that in the nonrelativistic limit, where $Nmc^2 \gg \langle K(w)\rangle$, there will be no thermal effect on the volume contraction. The same volume contraction is obtained as in the absence of random thermal velocities. However, as we move into the relativistic regime, we expect the effect to be ever increasing, leading to smaller and smaller volumes, which are proportional to the ratio, $Nmc^2/\langle K(w)\rangle$. At the end of Sec. 6.7.2, we will provide more quantitative limits.
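The limiting values of the Grüneisen parameter and of the thermal contraction factor can be read off from a short numerical illustration. The sketch below assumes the fluctuation-free reading, in which the average kinetic energy is treated as an ordinary number; `gruneisen` and `thermal_factor` are hypothetical names, and energies are measured in units of the rest energy.

```python
def gruneisen(kinetic, rest_energy, d=3):
    """Gruneisen parameter s of (6.5.6), generalized to d dimensions."""
    return (1.0 + rest_energy / (kinetic + rest_energy)) / d

def thermal_factor(kinetic, rest_energy):
    """Thermal volume contraction factor 3s - 1 appearing in (6.5.7), for d = 3."""
    return 3.0 * gruneisen(kinetic, rest_energy) - 1.0

rest = 1.0  # rest energy N m c^2 taken as the energy unit (assumption)
for kinetic in (1e-6, 1.0, 1e6):
    print(kinetic, gruneisen(kinetic, rest), thermal_factor(kinetic, rest))
```

As the kinetic energy grows from far below to far above the rest energy, s moves from 2/3 towards 1/3 and the contraction factor from 1 towards 0.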
6.6
Which Pressure?
Various pressures have appeared in the literature. There is a pressure associated with the propagation of electromagnetic waves [cf. Sec. 3.5.1], another pressure invented by Poincaré that was supposed to hold the charge to the
surface when an electron is in motion [cf. Sec. 6.2], and another kinetic gas pressure that appears in special relativity [cf. Sec. 6.4].

In Sec. 3.5.1 we derived the pressure of radiation from the Doppler effect applied to radiation reflected off a mirror. Here we derive the radiation pressure from electromagnetism by considering the Lorentz transformations:
\[ E'_x = E_x, \qquad E'_y = \gamma\left(E_y - \frac{u}{c}H_z\right), \qquad E'_z = \gamma\left(E_z + \frac{u}{c}H_y\right), \]
for the components of the electric, E, and magnetic, H, field strengths. For a plane wave of monochromatic light traveling in the x-direction, $E_y = H_z$, the radiation pressure is [cf. (3.5.1)]
\[ P' = \frac{1}{4\pi}\left(E_y'^2 + H_z'^2\right) = P\,\frac{1-u/c}{1+u/c}, \tag{6.6.1} \]
where the pressure in the stationary frame is twice the energy density of the incoming wave, $P = E_y^2/2\pi$, as in (3.5.2) when the mirror is at rest. The radiation pressure (6.6.1) is dependent on the velocity u, and is, therefore, not an invariant.

Then there is the Poincaré stress of Sec. 6.2, which, oddly enough, Einstein [07] attempted to associate with the kinetic gas pressure (6.5.1) in three dimensions. In the xy-plane, the electron in a state of rest will appear as a circle which becomes an ellipse when set in motion. The components of the force per unit charge are $F_x = E_x$, $F_y = E_y/\gamma$. The magnitude of the force normal to the surface of an electron of radius a is
\[ F = \sqrt{F_x^2 + F_y^2} = \sqrt{E_x^2 + E_y^2/\gamma^2} = \frac{e^2}{4\pi a^2}\sqrt{\cos^2\theta + \sin^2\theta/\gamma^2}. \tag{6.6.2} \]
According to an argument by Page and Adams [40], the surface element in motion will be reduced by the same amount as the force, (6.6.2),
\[ \sigma = 4\pi a^2\sqrt{\cos^2\theta + \frac{\sin^2\theta}{\gamma^2}}, \tag{6.6.3} \]
which is supposedly the origin of the invariant pressure. Dividing (6.6.2) by (6.6.3) gives twice the pressure acting on the surface of the electron,
\[ F/\sigma = \frac{e^2}{16\pi^2 a^4} = 2P. \tag{6.6.4} \]
Page and Adams fail to realize that the surface element is in the hyperbolic plane, and not in the Euclidean plane, so that there is no need to invoke charge conservation. We will rectify their situation in Sec. 9.8. The electron’s surface can be considered as a perfect reflector. If the pressure is transmitted by waves [Poynting 10], the reflected waves push back just as much as the waves impinging on the electron’s surface so that the pressure is doubled [cf. Sec. 3.5.1]. Not surprisingly, the work,
\[ \frac{4}{3}\pi a^3 P = \frac{e^2}{24\pi a} = (m_{em} - m_{el})c^2, \]
is the difference in energy between the electromagnetic rest energy, $m_{em}c^2 = e^2/6\pi\varepsilon_0 a$, and the electrostatic rest energy, $m_{el}c^2 = e^2/8\pi\varepsilon_0 a$. It is this mechanical tension that Poincaré had to invent so that $m_{em} = \frac{4}{3}m_{el}$ [cf. (5.4.7)]. This invariant pressure Einstein [07] rederived in his 1907 paper. Such a pressure is a pure constant, independent of the state of the body. Hence, it has nothing whatsoever to do with the kinetic gas pressure (6.5.1) that results from the random thermal motions.
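Two of the statements in this section reduce to simple arithmetic and can be verified in a few lines. The sketch below makes two assumptions that are not spelled out in the text: the magnetic component is taken to transform as $H'_z = \gamma(H_z - (u/c)E_y)$, the standard companion of the electric transformation quoted above, and the electron energies are compared as pure coefficients of $e^2/\pi a$ in units with $\varepsilon_0 = 1$. The name `boosted_pressure_ratio` is hypothetical.

```python
import math
from fractions import Fraction

def boosted_pressure_ratio(beta):
    """(E'_y^2 + H'_z^2) / (E_y^2 + H_z^2) for a plane wave with E_y = H_z."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    Ey = Hz = 1.0
    Ey_p = gamma * (Ey - beta * Hz)
    Hz_p = gamma * (Hz - beta * Ey)   # assumed companion transformation for H_z
    return (Ey_p**2 + Hz_p**2) / (Ey**2 + Hz**2)

beta = 0.5
print(boosted_pressure_ratio(beta), (1 - beta) / (1 + beta))

# Coefficients of e^2/(pi a), with epsilon_0 = 1 (assumption)
pressure_coeff = Fraction(1, 32)           # P = e^2 / (32 pi^2 a^4), from (6.6.4)
work = Fraction(4, 3) * pressure_coeff     # (4/3) pi a^3 P
m_em, m_el = Fraction(1, 6), Fraction(1, 8)
print(work, m_em - m_el, m_em / m_el)
```

The first line of output reproduces the Doppler factor in (6.6.1); the second shows the work equal to the energy difference, together with the ratio 4/3 of the two masses.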
6.7
Thermodynamics from Bessel Functions
Since the relativistic energy is related to the hyperbolic function cosh θ, and the momentum to sinh θ, the former enters into the exponential Boltzmann factor and a finite power of the latter is proportional to the density of states. The modified Bessel function is related to the relativistic partition function, which is a generating function with the only exception that the dummy variable is given a thermodynamic significance, i.e. the inverse temperature.
Therefore, the modified Bessel functions can be used to determine thermodynamic equations of state that are valid in the relativistic regime. This approach dates back to 1911 with the work of Jüttner.
6.7.1
Boltzmann’s law via modified Bessel functions
We use the representation of the modified Bessel function of the second kind of order $r > \frac{1}{2}$,
\[ K_r(x) = \frac{x^r}{(2r-1)!!}\int_0^{\infty} e^{-x\cosh\theta}\sinh^{2r}\theta\,d\theta, \tag{6.7.1} \]
where !! is the double factorial, or semi-factorial, function, i.e. $(2r-1)!! = 1\cdot 3\cdot 5\cdots(2r-1)$. An integration by parts yields
\[ K_r(x) = \frac{x^{r-1}}{(2r-3)!!}\int_0^{\infty} e^{-x\cosh\theta}\sinh^{2(r-1)}\theta\cosh\theta\,d\theta. \tag{6.7.2} \]
Continuing to integrate by parts generates the well-known recursion relations that the $K_r(x)$ satisfy.

The asymptotic forms of the modified Bessel functions of the second kind are important since only in these limits will closed expressions for the probability densities exist. In the large x-limit, the asymptotic form of the modified Bessel function is
\[ \lim_{x\to\infty}\frac{K_r(x)}{\sqrt{\pi/2x}}\,e^{x} = 1, \tag{6.7.3} \]
for any order, r. In the opposite limit of small x, the asymptotic form of the modified Bessel function is
\[ K_r(x) = \frac{1}{2}\,(r-1)!\left(\frac{2}{x}\right)^{r}. \tag{6.7.4} \]
The parameter, x, will now be shown to be inversely proportional to the absolute temperature so that (6.7.3) and (6.7.4) will be the low and high temperature limits, respectively.

In order to show that these are the extreme temperature limits, we must ask ourselves: What does a modified Bessel function have to do with relativistic statistical physics? As we have mentioned in the introduction
to this section, the kinetic energy $E = mc^2(\cosh\theta - 1)$, and the momentum $G = mc\sinh\theta$, so that the density of states for r = 2 is
\[ dN(G) = 2\,\frac{4\pi V}{h^3}\,G^2\,dG = \frac{8\pi V}{\lambda_c^3}\sinh^2\theta\cosh\theta\,d\theta, \tag{6.7.5} \]
where $\lambda_c = h/mc$ is the Compton wavelength. The cube of the Compton wavelength is proportional to the smallest volume of phase space in which a particle can be localized. So (6.7.5) is the ratio of the actual phase volume to that of the smallest possible phase volume.

However, it is not as Chandrasekhar [58] claims that the factor of 2 in (6.7.5) is due to the fact that “the Dirac equation has two (or no) linearly independent solutions according as [the relativistic conservation of energy] is satisfied (or not).” Planck got the same factor in his expression for the density of states, and he had never heard of the Dirac equation. The factor is due to the two independent directions of polarization, and it comes out of classical electrodynamics without any recourse to quantum mechanics, or relativity. Maxwell’s equations contain all the necessary ingredients for characterizing the two states of polarization of a photon as we shall appreciate in Sec. 11.5.1. And any microscopic explanation of a macroscopic phenomenon does not constitute a new phenomenon, as Einstein maintained in his letter to Seelig which we cited in Sec. 1.1.1.3.

We can thus calculate the total number of particles,
\[ N = \frac{8\pi V}{\lambda_c^3}\,e^{x}\int_0^{\infty} e^{-x\cosh\theta}\sinh^2\theta\cosh\theta\,d\theta, \tag{6.7.6} \]
and the kinetic energy,
\[ \langle K(\theta)\rangle = \frac{8\pi V mc^2}{\lambda_c^3}\,e^{x}\int_0^{\infty} e^{-x\cosh\theta}\sinh^2\theta\cosh\theta\,(\cosh\theta - 1)\,d\theta, \tag{6.7.7} \]
as integrals over all values of $\theta$, where $x = mc^2/T$ is the ‘modulus,’ in Gibbs’s terminology, in energy units where Boltzmann’s constant is unity. Both the total number of particles, (6.7.6), and the average kinetic energy, (6.7.7), are Lorentz-invariant since they do not depend on $\gamma$. This will not be true of the total energy (6.3.3).

In the case where the number of particles is conserved, the chemical potential is a function of temperature [Lavenda 91]. However, we have set the chemical potential equal to zero since we will deal only with ratios
of thermodynamic variates. In the case of non-conservation of the particle number, (6.7.6) will be a function of the temperature.

Rather than relying on a specific ensemble for the expression of the pressure, which in the case of a non-constant particle number would be the grand canonical ensemble, we can apply a homogeneity argument and assume that the energy is a homogeneous function of order 3s in the momenta, where s is the Grüneisen parameter that we have introduced earlier in (6.5.6). That is, we can write the virial as
\[ PV = \frac{8\pi V}{3h^3}\,e^{x}\int_0^{\infty} e^{-x\cosh\theta}\,\frac{\partial E}{\partial G}\,G\,G^2\,dG = \frac{8\pi V mc^2}{3\lambda_c^3}\,e^{x}\int_0^{\infty} e^{-x\cosh\theta}\sinh^4\theta\,d\theta = sE. \tag{6.7.8} \]
The Grüneisen parameter is related to the order of homogeneity of the energy with respect to the momentum. In a d-dimensional momentum space, the energy is a homogeneous function of order $d\cdot s$ with respect to the momentum. By an integration by parts in the last integral in (6.7.8), we can write the virial as the ideal gas law:
\[ PV = -\frac{8\pi V}{3\lambda_c^3}\,\frac{mc^2}{x}\,e^{x}\int_0^{\infty}\sinh^3\theta\;d\,e^{-x\cosh\theta} = T\,\frac{8\pi V}{\lambda_c^3}\,e^{x}\int_0^{\infty} e^{-x\cosh\theta}\sinh^2\theta\cosh\theta\,d\theta = NT. \tag{6.7.9} \]
Equation (6.7.9) has the appearance of Mariotte’s law, but appearances can be deceiving especially when the particle number can be a function of the temperature! The validity of Mariotte’s law was proved explicitly by Jüttner [11], but it was implicit in Planck’s papers of 1907. It was, however, certainly not appreciated that N was not a conserved quantity, nor that what was being dealt with was not a material ideal gas. One would have to wait more than a decade until Einstein, prodded by Bose’s ‘misinterpretation’ of Boltzmann’s counting procedure, would realize that one was dealing with a degenerate gas, or one that does not conserve the particle number. Hence, the name ‘Bose–Einstein’ statistics.
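The integration by parts that turns the virial (6.7.8) into Mariotte's law (6.7.9) can be confirmed numerically at any temperature. What follows is a sketch under stated assumptions: the Boltzmann factor is rescaled by $e^{x}$, the upper limit is cut off where the integrand is negligible, and `simpson` and `pv_over_nt` are hypothetical helper names.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def pv_over_nt(x):
    """Ratio PV/(NT) built from the integrals in (6.7.8) and (6.7.6)."""
    theta_max = math.acosh(1.0 + 80.0 / x)              # Boltzmann factor ~ e^-80 beyond this
    boltz = lambda t: math.exp(-x * (math.cosh(t) - 1.0))
    virial = simpson(lambda t: boltz(t) * math.sinh(t) ** 4, 0.0, theta_max)
    number = simpson(lambda t: boltz(t) * math.sinh(t) ** 2 * math.cosh(t), 0.0, theta_max)
    return (x / 3.0) * virial / number                  # (mc^2/3T) * virial / number

for x in (0.1, 1.0, 20.0):
    print(x, pv_over_nt(x))
```

The ratio should come out equal to one, to quadrature accuracy, for every value of $x = mc^2/T$, relativistic or not.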
In order to determine the average kinetic energy, we consider the average
\[ \langle\cosh\theta\rangle = \frac{x^{r-1}}{(2r-3)!!}\int_0^{\infty}\frac{e^{-x\cosh\theta}}{K_r(x)}\,\sinh^{2(r-1)}\theta\left(1+\sinh^2\theta\right)d\theta = \frac{K_{r-1}(x)}{K_r(x)} + \frac{2r-1}{x}. \tag{6.7.10} \]
In three dimensions r = 2, and the average kinetic energy is
\[ \langle K(\theta)\rangle = 3NT + Nmc^2\left(\frac{K_1(x)}{K_2(x)} - 1\right). \tag{6.7.11} \]
On the strength of Mariotte’s law (6.7.9), (6.7.11) can be written as
\[ \langle K(\theta)\rangle - 3PV = Nmc^2\left(\frac{K_1(x)}{K_2(x)} - 1\right) \tag{6.7.12a} \]
or
\[ \langle K(\theta)\rangle/PV = 3 + x\left(\frac{K_1(x)}{K_2(x)} - 1\right) = 1/s. \tag{6.7.12b} \]
The expression for the kinetic energy applies to a state at rest as well as a state in uniform motion; it is a Lorentz invariant. The uniform velocity u does not enter into the average (6.7.10). It is only when we form the total energy by multiplying (6.7.10) by $\cosh\theta$ that we must change x to $x'$ in the average because it now applies to a state of uniform motion. Since $K_1(x) < K_2(x)$ it follows from (6.7.12a) that
\[ \langle K(\theta)\rangle - 3PV \le 0. \tag{6.7.13} \]
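The ratio of average kinetic energy to NT in (6.7.12b) can be tabulated without any special-function library by evaluating the Bessel functions from the integral representation $K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh\nu t\,dt$. This is only an illustrative sketch: the quadrature, the cutoff, and the names `simpson`, `scaled_k` and `kinetic_over_nt` are assumptions made for the example.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def scaled_k(nu, x):
    """e^x K_nu(x) from the integral of e^{-x cosh t} cosh(nu t), truncated at large t."""
    t_max = math.acosh(1.0 + 80.0 / x)
    return simpson(lambda t: math.exp(-x * (math.cosh(t) - 1.0)) * math.cosh(nu * t),
                   0.0, t_max)

def kinetic_over_nt(x):
    """<K>/NT = 3 + x (K_1/K_2 - 1), i.e. 1/s from (6.7.12b) with PV = NT."""
    ratio = scaled_k(1, x) / scaled_k(2, x)   # the common factor e^x cancels
    return 3.0 + x * (ratio - 1.0)

for x in (0.01, 1.0, 200.0):
    value = kinetic_over_nt(x)
    print(x, value, "s =", 1.0 / value)
```

The output stays below 3 for every x, in line with (6.7.13), and runs from about 3 at high temperature down to about 3/2 at low temperature, so that s stays between 1/3 and 2/3.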
We can verify this in the extreme cases; in the low temperature, nonrelativistic limit, $x \gg 1$, and
\[ \frac{K_1(x)}{K_2(x)} = \frac{1 + 3/8x}{1 + 15/8x} = 1 - \frac{3}{2x}, \tag{6.7.14} \]
so that $s = \frac{2}{3}$ in (6.7.12b), and in the high temperature limit, or the $x \ll 1$ limit,
\[ \frac{K_1(x)}{K_2(x)} = \frac{x}{2}, \]
and (6.7.12b) becomes
\[ \langle K(\theta)\rangle/PV - 3 = -x(1 - x/2), \tag{6.7.15} \]
with $s = \frac{1}{3}$ in the extreme ultrarelativistic limit, showing that inequality (6.7.13) still holds. In summary, the average kinetic energy has the limiting expressions
\[ \langle K(\theta)\rangle = 3NT + Nmc^2\left(\frac{K_1(x)}{K_2(x)} - 1\right) \to \begin{cases} \frac{3}{2}NT, & x \gg 1, \\ 3NT, & x \ll 1, \end{cases} \]
in the nonrelativistic and relativistic limits, which coincide with the low and high temperature limits, respectively.

Introducing the kinetic expressions for the average of the random kinetic energy, (6.3.4), and the pressure (6.5.1) in d = 3 dimensions into (6.7.12a), gives
\[ \langle K(\theta)\rangle/V - 3P = \left\langle nmc^2\left(\frac{1}{\sqrt{1-w^2/c^2}} - 1\right) - 3\,\frac{nm}{3}\,\frac{w^2}{\sqrt{1-w^2/c^2}}\right\rangle = nmc^2\left\langle\sqrt{1-\frac{w^2}{c^2}} - 1\right\rangle = nmc^2\left(\frac{K_1(x)}{K_2(x)} - 1\right) \le 0, \tag{6.7.16} \]
where n is again the number density. In the nonrelativistic limit, (6.7.14) shows that the ratio $K_1(x)/K_2(x)$ in (6.7.16) tends to 1, while in the ultrarelativistic limit, (6.7.15) shows that it tends to 0 as it should.

In Classical Theory of Fields, Landau and Lifshitz [75] claim inequality (6.7.13) should be written as
\[ \varepsilon - 3P = nmc^2\sqrt{1 - u^2/c^2} \ge 0, \tag{6.7.17} \]
which is the trace $T^i_i$ of their energy–momentum tensor. This they claim is the result of the kinetic expressions for the energy density and pressure,
\[ \varepsilon = nm\,\frac{c^2}{\sqrt{1-u^2/c^2}}, \qquad P = \frac{nm}{3}\,\frac{u^2}{\sqrt{1-u^2/c^2}}. \]
If this is true then there should appear an average on the right-hand side of (6.7.17). But, these averages, we learn, are over a “certain time interval,” and are not statistical averages.

If we evaluate Mariotte’s law (6.7.9) in the prime system we will have $PV' = NT'$, since P and N are relativistic invariants. That is, Mariotte’s law guarantees, and foresaw, the correct relativistic transform of the variables entering into it. Then, in view of the definition of the total energy, (6.3.3), we multiply (6.7.10) through by $nmc^2\cosh\theta$ to get [cf. second equality in (6.5.1)],
\[ \varepsilon - 3nT\gamma = \varepsilon - 3P = nmc^2\gamma\,\frac{K_1(x)}{K_2(x)} \ge 0, \tag{6.7.18} \]
where we have used (6.3.24). Inequality (6.7.18) is not inequality (6.7.17). The pressure is a Lorentz-invariant, and, thus, can only be compared with another Lorentz-invariant. The total energy, (6.3.3), is not Lorentz-invariant and cannot appear in the expression for the energy–momentum tensor, (6.7.17). Consequently, if the energy–momentum tensor of relativistic mechanics has any meaning at all, it must be the average kinetic energy density, and not the total energy density that appears in their expressions. Since the right-hand side of (6.7.18) contains the factor $\gamma$, the equation of state in the ultrarelativistic limit is not $\varepsilon = 3P$, as Landau and Lifshitz would have us believe, since it diverges as (6.7.18) clearly shows. The correct equation of state in the ultrarelativistic limit is (6.5.4a), with the average kinetic energy, and not the total energy, for it cannot depend upon $\gamma$ in its definition.

A degenerate gas can either condense where the pressure is a sole function of the temperature, independent of the volume, or have a repulsive zero-point energy [Einbinder 48b]. In the latter case, the average kinetic energy, which is the thermodynamic internal energy, is a sole function of the volume, $\langle K(\theta)\rangle = CV^{-s}$. Thermodynamically, the pressure is defined as
\[ P = -\frac{d\langle K(\theta)\rangle}{dV} = sCV^{-(1+s)} = s\,\langle K(\theta)\rangle/V, \]
which is the Grüneisen equation of state, (6.5.5). Then on the strength of (6.7.13), $3PV - \langle K(\theta)\rangle = (3s-1)\langle K(\theta)\rangle \ge 0$, requires $s \ge \frac{1}{3}$.
6.7.2
Asymptotic probability densities
We can form a probability distribution function from the modified Bessel functions as
\[ f(\theta|x)\,d\theta = \frac{x^{r-1}}{(2r-3)!!\,K_r(x)\,e^{x}}\;e^{-x(\cosh\theta-1)}\sinh^{2(r-1)}\theta\cosh\theta\,d\theta, \tag{6.7.19} \]
where f is the probability density function. In this subsection we will see how exact, and well-known, probability density functions arise in the asymptotic limits of large and small x.

In the large x limit,
\[ f(u|x) = \sqrt{\frac{2}{\pi}}\left(\frac{mc^2}{T}\right)^{3/2} e^{-x\left(1/\sqrt{1-u^2/c^2}\,-\,1\right)}\,\frac{u^2/c^3}{(1-u^2/c^2)^{5/2}}, \tag{6.7.20} \]
where the contribution of the Jacobian, $1/c(1-u^2/c^2)$, is included in the last term in (6.7.20). In the limit $u \ll c$, (6.7.20) becomes
\[ f(u|T) = \sqrt{\frac{2}{\pi}}\left(\frac{m}{T}\right)^{3/2} e^{-mu^2/2T}\,u^2, \tag{6.7.21} \]
which is exactly Maxwell’s speed distribution. The speed of light has completely disappeared in (6.7.21) implying that it is not valid in the relativistic region for which we must return to (6.7.20).

In the opposite limit of small x, we have the vibrancy condition given by the transverse Doppler effect,
\[ \omega/\omega_0 = \cosh\theta, \tag{6.7.22} \]
where the proper frequency of the source $\omega_0$ must be given. No matter what it is, it appears from (6.7.22) to be the lower cut-off to the angular velocity, $\omega$. In special relativity the frequency, like the total energy, is increased by the motion. Introducing the vibrancy condition, (6.7.22), into (6.7.19) and taking note of (6.7.4) result in
\[ f(\omega|\omega_0, x) = \frac{1}{2}\,x^3\,e^{-x\omega/\omega_0}\,\frac{\omega}{\omega_0^2}\sqrt{\frac{\omega^2}{\omega_0^2} - 1}, \tag{6.7.23} \]
for r = 2, where the Jacobian, $1/\omega_0\sqrt{\omega^2/\omega_0^2 - 1}$, has been included. If we set $\omega_0 = mc^2/\hbar$, we come out with, in the limit $\omega \gg \omega_0$, the well-known
law of ultraviolet blackbody radiation
\[ f(\omega|\omega_0, T) = \frac{1}{2}\left(\frac{\hbar}{T}\right)^{3} e^{-\hbar\omega/T}\,\omega^2 = \frac{V}{N}\,e^{-\hbar\omega/T}\,\frac{\omega^2}{c^3}, \tag{6.7.24} \]
where the expression for the total, non-conserved, particle number, $N = 2V(T/\hbar c)^3$, has been introduced into the second line. Wien proposed his distribution, (6.7.24), on an apparent analogy with Maxwell’s speed distribution for monochromatic radiation.

We can also use the vibrancy condition [Wilkins & Willams 01] of the longitudinal Doppler effect,
\[ \theta = \ln\omega/\omega_0. \tag{6.7.25} \]
It is quite remarkable that (6.7.25) gives the representation of the modified Bessel function as:
\[ \int_0^{\infty} z^{\nu-1}\exp\left[-\tfrac{1}{2}x\,(z + 1/z)\right]dz = 2K_\nu(x). \tag{6.7.26} \]
From (6.7.26) it is apparent that $\exp[\frac{1}{2}x(z+1/z)]$ is the generating function of the modified Bessel function. In one dimension, the density of states, $\cosh\theta$, and the longitudinal Doppler effect, (6.7.25), give the probability density function,
\[ f_1(\omega|x) = \frac{\omega/\omega_0 + \omega_0/\omega}{2\omega K_1(x)}\;e^{-(x/2)(\omega/\omega_0 + \omega_0/\omega)}, \tag{6.7.27} \]
and the moment equation,
\[ \frac{1}{2}\left\langle\omega/\omega_0 + \omega_0/\omega\right\rangle = -\frac{d\ln K_1(x)}{dx}. \]
However, there is no maximum likelihood estimate, which estimates intensive thermodynamic parameters in terms of samples of extensive thermodynamic variables [Lavenda 91], because there is an insufficient number of states.
Things are different in three dimensions, with a density of states, $\sinh^2\theta\cosh\theta$, and the longitudinal Doppler effect giving a probability density function,
\[ f_3(\omega|\omega_0, x) = \frac{x}{8}\,\frac{1}{\omega}\,\frac{1}{K_2(x)\,e^{x}}\left[\left(\frac{\omega}{\omega_0}\right)^{3} + \left(\frac{\omega_0}{\omega}\right)^{3} - \frac{\omega}{\omega_0} - \frac{\omega_0}{\omega}\right]e^{-\frac{1}{2}x(\omega/\omega_0 + \omega_0/\omega - 2)}, \tag{6.7.28} \]
where we multiplied and divided by $e^{x}$. In the large x-limit (6.7.28) becomes
\[ f_3(\omega|\omega_0, x) = x\sqrt{\frac{2x}{\pi}}\;\frac{1}{\omega}\left[\frac{1}{8}\frac{(\Delta\omega)^6}{\omega^3\omega_0^3} + \frac{3}{4}\frac{(\Delta\omega)^4}{\omega^2\omega_0^2} + \frac{(\Delta\omega)^2}{\omega\omega_0}\right]e^{-\frac{1}{2}x(\Delta\omega)^2/\omega\omega_0}, \]
where $\Delta\omega = \omega - \omega_0$ is the frequency shift. Now, in the limit of small shifts this becomes the probability density function,
\[ f_3(\Delta\omega|\omega_0, x) = \sqrt{\frac{2}{\pi}}\left(\frac{m\lambda^2}{T}\right)^{3/2}(\Delta\omega)^2\,e^{-m(\lambda\Delta\omega)^2/2T}, \tag{6.7.29} \]
where $\lambda = c/\sqrt{\omega\omega_0} \approx c/\omega_0$ is the wavelength corresponding to maximum intensity of the line (center) at $\omega \sim \omega_0$. The probability density function (6.7.29) is again the three-dimensional Maxwell speed distribution with the random speed $w = \lambda\Delta\omega$. The exponential in (6.7.29) was derived by Lord Rayleigh [89] for the distribution of intensities caused by the Doppler shift, where m is the mass of the emitter.

By way of contrast, in the small x-limit, and for high frequencies, the probability density function (6.7.28) becomes
\[ f(\omega|\omega_0, x) = \frac{1}{2}\left(\frac{x}{2}\right)^{3}\frac{\omega^2}{\omega_0^3}\,e^{-(x/2)(\omega/\omega_0)}. \tag{6.7.30} \]
For a lower cut-off half as great as in the case where the vibrancy condition was given by the transverse Doppler effect (6.7.22), viz. $\omega_0 = mc^2/2\hbar$, the probability density function (6.7.30) becomes precisely that of the Wien distribution, (6.7.24).

Finally, we can use the probability density function (6.7.19) to corroborate, and quantify, our results on the thermal volume, (6.5.7). To this end, we average the thermal contraction factor, $\sqrt{1 - w^2/c^2}$, where $w = c\tanh\theta$
is the stochastic rapidity, with respect to the modified Bessel pdf (6.7.19). We then obtain for r = 2:
\[ \left\langle\sqrt{1 - w^2/c^2}\right\rangle = \frac{x}{K_2(x)}\int_0^{\infty} e^{-x\cosh\theta}\sinh^2\theta\,d\theta. \]
If we use the double angle formula, $\sinh^2\theta = \frac{1}{2}(\cosh 2\theta - 1)$, then we can use the representation of the Bessel function as:
\[ \int_0^{\infty} e^{-x\cosh\theta}\cosh\nu\theta\,d\theta = K_\nu(x). \]
We then find
\[ \left\langle\sqrt{1 - w^2/c^2}\right\rangle = \frac{x}{2}\,\frac{K_2(x) - K_0(x)}{K_2(x)}, \]
which is none other than (6.7.16) when the recursion relation,
\[ K_{n+1}(x) = K_{n-1}(x) + \frac{2n}{x}K_n(x), \]
with n = 1 is introduced. The limits can be displayed as
\[ \left\langle\sqrt{1 - w^2/c^2}\right\rangle = \frac{K_1(x)}{K_2(x)} = \begin{cases} 1, & x \gg 1, \\ mc^2/2T, & x \ll 1, \end{cases} \]
where the lower limit goes to zero in the ultrarelativistic limit.

We can now estimate the error that was committed by exchanging the average of a function for the function of the average. If we do this in (6.5.7) then
\[ \left\langle\sqrt{1 - w^2/c^2}\right\rangle = \frac{Nmc^2}{Nmc^2 + \langle K(\theta)\rangle} = \begin{cases} 1, & x \gg 1, \\ Nmc^2/\langle K(\theta)\rangle, & x \ll 1. \end{cases} \]
In the small x-(ultrarelativistic) limit (6.7.11) gives $\langle K(\theta)\rangle = 3NT(1 - mc^2/3T)$ whereas (6.7.4) would give 2NT. Although off by a numerical factor, in the small x-limit, the conclusion remains the same: The thermal
volume varies inversely with the absolute temperature. In the nonrelativistic limit, there is no thermal volume contraction,
\[ \langle V(\theta)\rangle = V\gamma^{-1}, \qquad x \gg 1, \]
while in the ultrarelativistic limit,
\[ \langle V(\theta)\rangle = \frac{1}{2}\,V\gamma^{-1}x, \qquad x \ll 1, \]
the volume tends to zero with x. In the relativistic regime, the temperature and averaged volume are no longer independent variables since the average thermal contraction factor depends on the temperature.
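The average thermal contraction factor can be computed in two independent ways, by averaging $1/\cosh\theta$ directly and through the combination $(x/2)(K_2 - K_0)/K_2$, which gives a check on the recursion step used above. This is an illustrative sketch only: the Bessel functions are obtained from their integral representation with a truncated upper limit, and `simpson`, `scaled_k`, `mean_contraction` and `via_recursion` are hypothetical names.

```python
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def scaled_k(nu, x):
    """e^x K_nu(x) from the integral of e^{-x cosh t} cosh(nu t) over t >= 0."""
    t_max = math.acosh(1.0 + 80.0 / x)
    return simpson(lambda t: math.exp(-x * (math.cosh(t) - 1.0)) * math.cosh(nu * t),
                   0.0, t_max)

def mean_contraction(x):
    """<sqrt(1 - w^2/c^2)> = <1/cosh theta> for the r = 2 density sinh^2 cosh e^{-x cosh}."""
    t_max = math.acosh(1.0 + 80.0 / x)
    boltz = lambda t: math.exp(-x * (math.cosh(t) - 1.0))
    num = simpson(lambda t: boltz(t) * math.sinh(t) ** 2, 0.0, t_max)
    den = simpson(lambda t: boltz(t) * math.sinh(t) ** 2 * math.cosh(t), 0.0, t_max)
    return num / den

def via_recursion(x):
    """The same average written as (x/2) (K_2 - K_0) / K_2."""
    k0, k2 = scaled_k(0, x), scaled_k(2, x)
    return 0.5 * x * (k2 - k0) / k2

for x in (0.01, 1.0, 200.0):
    print(x, mean_contraction(x), via_recursion(x), scaled_k(1, x) / scaled_k(2, x))
```

All three columns should coincide, approaching 1 for large x and x/2 for small x, which is the behaviour of the thermal volume quoted in the two limits.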
References
[Abraham 20] M. Abraham, Theorie der Elektrizität, Vol. 2 (Teubner, Leipzig, 1920), p. 347.
[Becker 33] R. Becker, Theorie der Elektrizität, Vol. 2 (B. G. Teubner, Leipzig, 1933), p. 348.
[Callen & Horowitz 71] H. B. Callen and G. Horowitz, “Relativistic thermodynamics,” Am. J. Phys. 39 (1971) 938–947.
[Chandrasekhar 58] S. Chandrasekhar, An Introduction to the Study of Stellar Structure (Dover, New York, 1958), pp. 394–397.
[Clausius 70] R. Clausius, “On a mechanical law applicable to heat,” Poggendorffs Ann 141 (1870) 124–130.
[Efimov 80] N. V. Efimov, Higher Geometry (Mir, Moscow, 1980), p. 490.
[Einbinder 48a] H. Einbinder, “Generalized virial theorems,” Phys. Rev. 74 (1948) 803–805.
[Einbinder 48b] H. Einbinder, “Quantum statistics and the ℵ theorem,” Phys. Rev. 74 (1948) 805–808.
[Einstein 07] A. Einstein, “Relativitätsprinzip und die aus demselben gezogenen Folgerungen,” Jahrbuch der Radioaktivität und Elektronik 4 (1907) 411–462; 5 98–99 (Berichtigung).
[Fock 66] V. Fock, The Theory of Space Time and Gravitation, 2nd ed. (Pergamon Press, Oxford, 1966), p. 50.
[Hasenöhrl 04] F. Hasenöhrl, “Zur Theorie der Strahlung in bewegten Körpern,” Ann. Phys. 15 (1904) 344–370.
[Hasenöhrl 05] F. Hasenöhrl, “Zur Theorie der Strahlung in bewegten Körpern, Berichtigung,” Ann. Phys. 4 (1905) 4, 16.
[Ives 44] H. E. Ives, “Impact of a wave packet on an absorbing particle,” J. Opt. Soc. Am. 34 (1944) 222–228.
[Jüttner 11] F. Jüttner, “Das Maxwellsche Gesetz der Geschwindigkeitsverteilung in der Relativtheorie,” Ann. d. Physik 34 (1911) 856–882; “Die Dynamik eines bewegten Gases in der Relativtheorie,” ibid, 35 (1911) 145–161.
Aug. 26, 2011
11:16
SPI-B1197
A New Perspective on Relativity
b1197-ch06
Thermodynamics of Relativity
339
[Landau & Lifshitz 75] L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields (Pergamon, Oxford, 1975), Sec. 34. [Laue 19] M. von Laue, Die Relativitätstheorie (Vierweg, Braunschweig, 1919). [Lavenda 91] B. H. Lavenda, Statistical Physics: A Probabilistic Aprroach (WileyInterscience, New York, 1991). [Lavenda 95] B. H. Lavenda, Thermodynamics of Extremes (Horwood, Chichester, 1995), p. 39. [Lavenda 00] B. H. Lavenda, “Special relativity via modified Bessel functions,” Z. Naturforsch. 55a (2000) 745–753. [Lavenda 02] B. H. Lavenda, “Does the inertia of a body depend on its heat content?,” Naturwissenschaften 89 (2002) 329–337. [Mosengeil 07] K. V. Mosengeil, Dissertation, Berlin, 1906; “Theorie der stationären Strahlung,” Ann. der Phys. (Leipzig) 22 (1907) 867–906. [Page & Adams 40] L. Page and N. I. Adams, Jr., Electrodynamics (Van Nostrand, New York, 1940), p. 267. [Pauli 58] W. Pauli, Theory of Relativity (Pergamon Press, New York, 1958). [Planck 07] M. Planck, “Zur Dynamik bewegter Systeme,” Berliner Sitzungsberichte, Erster Halbband 29 (1907) 542–570; Ann. der Phys. Lpz. 76 (1908) 1. [Poincaré 00] H. Poincaré, “The theory of Lorentz and the principle of reaction,” Arch. Néderland. Sci. 5 (1900) 252–278. [Poincaré 06] H. Poincaré, “Sur la dynamique de l’Électron,” Rend. Circ. Mat. Palermo 21 (1906) 129–176. [Poynting 10] J. H. Poynting, The Pressure of Light (Soc. Promotion Christian Knowledge, London, 1910), p. 32. [Robb 11] A. A. Robb, Optical Geometry of Motion (W. Heffer & Sons, Cambridge, 1911). [Searle 96] G. F. C. Searle, “Problems in electric convection,” Phil. Trans. A 187 (1896) 675–713. [Sommerfeld 09] A. Sommerfeld, “Über die Zusammensetzung der Geschwindigkeiten in der Relativtheorie,” Verh. der DPG 21 (1909) 577–582; “On the Composition of Velocities in the Theory of Relativity,” Wikisource translation at en.wikisource.org/wiki/Portal:Relativity. [Steck & Roue 83] D. J. Steck and F. 
Roux, “An elementrary development of mass– energy equivalence,” Am. J. Phys. 51 (1983) 461–462. [Thomson 68] J. J. Thomson, Application of Dynamics to Physics and Chemistry (Dawsons, London, 1968). [Watson 44] G. N. Watson, Bessel Functions, 2nd ed. (Cambridge U. P., Cambridge, 1944), p. 79. [Wilkins & Willams 01] D. Wilkins and D. Williams, “From rapidity to vibrancy (logarithmic vibrancy),” Am. J. Phys. 69 (2001) 158. [Yaghjian 92] A. D. Yaghjian, Relativistic Dynamics of a Charged Sphere: Updating the Lorentz–Abraham Model (Springer, Berlin, 1992).
Aug. 26, 2011
11:16
SPI-B1197
A New Perspective on Relativity
b1197-ch07
Chapter 7
General Relativity in a Non-Euclidean Geometrical Setting
As an older friend I must advise against it... In the first place you won’t succeed; and even if you succeed, no one will believe you.
Planck’s advice to Einstein against trying to formulate a general theory of relativity.
7.1 Centrifugal versus Gravitational Forces
General relativity is based on the notion that gravity, rather than being a force acting between masses, is the curvature of space-time itself. The source of the (positive) curvature is mass itself, just as electric charge is the source of the electromagnetic field. And just as free particles follow straight lines in flat space-time, they follow geodesics in the curved space-time of a gravitational field. Calculations of the gravitational redshift, the time-delay of radar echoes from planets, the bending of light near a massive object, and the geodetic effect all offer their support to general relativity.

Not infrequently, new insights can be gained by looking at old, established results from a new point of view. The old quantum theory of the hydrogen atom, which combined a mixture of continuous and discrete conditions, was subsequently reinterpreted by wave mechanics. And wave mechanics was found to be able to go way beyond the hydrogen atom in explaining the atomic constitution of matter. It was the transition from particle to wave–particle duality that opened up our ability to explore the world of atoms. So too the addition of the wave nature to the study of gravitation will allow us to explain the well-known relativistic effects of the time-delay in radar sounding, the deflection of light and the advance of the perihelion of Mercury. All this can be accomplished by widening our mechanical view
by allowing for the occurrence of optico-gravitational phenomena. And we will not need the entire optical spectrum, for all the tests of general relativity fall within the short-wavelength limit. No recourse to general relativity is necessary. And since these effects are static, no assumption need be made as to how gravitational interactions propagate, with the exception of the gravitational redshift, which does not use the spatial components of the metric.

The trajectory of a light ray will be determined in the same way as in an inhomogeneous refractive medium. Fermat’s principle of least time, in Sec. 2.2.3, relates the length and orientation of a light ray to the time for light to propagate along a path of the ray. The analogy between the index of refraction and the square root of twice the difference between the total and potential energies is also well-known from the quantum mechanical explanation of tunneling. It is precisely this separation of the metric properties from the mechanical properties, which are described by the potential energy, that can be used to distinguish between the centrifugal and gravitational fields which cause acceleration. We shall return to this lack of equivalence between the gravitational and centrifugal forces which cause acceleration in Chapter 9, but we already have seen it in action in Sec. 2.2.3 where Maxwell used this strategy in his classic treatment of inversion in elliptic space.

If we take a flat space metric in the plane and consider a constant index of refraction, we will demonstrate that Fermat’s principle is capable of yielding the phase of the oscillation of a Bessel function of the first kind in the periodic domain in the asymptotic short-wavelength limit. This is identical to the WKB result, and it allows us to associate a wave phenomenon with a geodesic trajectory. The only potential present is the repulsive centrifugal potential, and that keeps the trajectory open.
In contrast to the generalization of special relativity, where gravity is considered to be the curvature of space-time instead of a bona fide force, the centrifugal force is built into the phase of the Bessel function in the periodic domain where the trajectory consists of straight-line segments that are tangent to a caustic circle, whose radius is determined by the magnitude of the angular momentum, and the arc segment joining the points of tangency. Whereas all forces causing acceleration are on the same footing in general relativity, it is the gravitational force — and not the centrifugal
force — that has to be introduced by allowing for a variable index of refraction, implying that the medium through which the light ray propagates is inhomogeneous. In other words, the effect of gravitation is to make the medium optically denser in the neighborhood of a massive body, while the centrifugal force has no effect upon the optical properties of the medium. Centrifugal, Coriolis and gravitational forces are usually considered to be fictitious insofar as they can be eliminated by a change of frame. The centrifugal and Coriolis forces can be transformed away by transforming to a non-rotating frame, while the fictitious force of gravity is transformed away by transforming from a non-free falling frame to a free-falling one. We can easily appreciate that the force of gravity will affect the optical properties of the medium while the centrifugal force affects the geometrical properties by determining the radius of the caustic circle separating bright and shadow regions. All the well-known results of general relativity can be analyzed from this perspective. The presence of a Newtonian potential will cause the modification of the phase of the Bessel function and determine whether the orbit is periodic (bright zone) or aperiodic (shadow zone) depending on whether the total energy is negative or positive, respectively. The bending of light requires a coupling of the gravitational and centripetal potentials that is commonly referred to as Schwarzschild’s potential [cf. Fig. 7.6]. Here, it arises when we determine the extremum condition for Fermat’s principle. It shows that like gravitational radiation, the interaction between a light ray and massive body is a quadrupole interaction without any appeal being made to general relativity. In contrast, we shall find that the advance of the perihelion requires both the gravitational potential and the quadrupole interaction. 
In the general theory, the quadrupole interaction appears as a relativistic correction to the square of the transverse velocity in the conservation of energy. Whereas Newton’s potential is the cause of the closed elliptical orbit, the quadrupole causes the perihelion to slowly rotate in a rosette orbit. A dipole moment would have been sufficient to cause the advance of the perihelion, but since there is conservation of momentum, the center of mass of the system cannot accelerate and so neither can the mass dipole moment.
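The claim made earlier in this section, that the optical path for a constant index of refraction reproduces the phase of a Bessel function in its oscillatory domain, can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are ours, the index of refraction is set to unity, distances are measured in units of the reduced wavelength, and the Bessel function is evaluated from Bessel's integral rather than from any library routine. The radial optical path is taken to be the integral of sqrt(1 - m^2/s^2) from the caustic radius s = m out to x, and the standard short-wavelength form with the usual pi/4 phase loss at the caustic is compared with J_m(x).

```python
import math

def bessel_j(m, x, n=4000):
    """J_m(x) for integer m via Bessel's integral
    (1/pi) * int_0^pi cos(m*t - x*sin(t)) dt, using the midpoint rule."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += math.cos(m * t - x * math.sin(t))
    return total * h / math.pi

def fermat_phase(m, x):
    """Radial optical path int_m^x sqrt(1 - m^2/s^2) ds for unit index of
    refraction; the caustic circle sits at s = m (requires x > m)."""
    root = math.sqrt(x * x - m * m)
    return root - m * math.acos(m / x)

def short_wavelength_bessel(m, x):
    """Short-wavelength form: amplitude times cos(optical path - pi/4)."""
    root = math.sqrt(x * x - m * m)
    amplitude = math.sqrt(2.0 / (math.pi * root))
    return amplitude * math.cos(fermat_phase(m, x) - math.pi / 4.0)

if __name__ == "__main__":
    m = 5  # illustrative angular momentum in units of the reduced wavelength
    for x in (50.0, 100.0, 200.0):
        print(x, bessel_j(m, x), short_wavelength_bessel(m, x))
```

The two columns printed for each x should agree more closely the further x lies beyond the caustic at x = m, which is the sense in which the geodesic (ray) picture and the wave picture coincide in the short-wavelength limit.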
7.2 Gravitational Effects on the Propagation of Light
7.2.1 From Doppler to gravitational shifts
That motion causes changes in the frequency and/or wavelength has been known since the middle of the nineteenth century when J. C. Doppler discovered it. If the relative speed of the waves is not the same for a moving observer as for an observer at rest, there is a frequency shift even though the wavelength remains the same. Not so for light waves, where the speed of light is the same for all observers, whether they be at rest or traveling at nine-tenths the speed of light. The constancy of the speed of light requires that both frequency and wavelength change such that their product, the speed of light, remains c. The distinction between the two is that where frequency and wavelength change independently the vibrations belong to the medium, but where they are negatively correlated the self-contained vibrations are those of the electromagnetic fields, and not of the media through which they propagate, for there may even be none.

In Sec. 2.5 we have described how Einstein [11] considers c as the gravitational potential, thereby negating his principle of the constancy of the speed of light. Instead of inertial frames, he considers one frame to be uniformly accelerated with respect to the other. A light signal of frequency $\nu_0$ is emitted from a source in a frame at rest. If light has traveled a distance h at a speed c, then the accelerated frame will have acquired a speed u = gh/c, which is the product of the (constant) gravitational acceleration, g, and the time, h/c. Thus, the first-order Doppler shift,
\[ \nu = \nu_0 (1 + u/c), \]
is converted into
\[ \nu = \nu_0 \left(1 + gh/c^2\right). \]
Now, on the strength of the equivalence principle, acceleration is equivalent to a gravitational field, and gh can be replaced by the gravitational potential, $\Phi$, to get
\[ \nu = \nu_0 \left(1 + \Phi/c^2\right), \tag{7.2.1} \]
or even,
\[ c = c_0 \left(1 + GM/rc^2\right), \tag{7.2.2} \]
where $c_0$ is the velocity of light in vacuo. Einstein has made the following reduction: relative motion → uniform acceleration → static gravitational field. Thus, the speed of light is like the speed of water waves: it is different for stationary and moving observers, and, even more, it depends on one's position in the gravitational field. From (7.2.2) Einstein deduces that a change in the velocity normal to a wavefront is proportional to the change in the gravitational potential. This makes c a potential for gravitation, and the latter is responsible for the change in the rate at which a clock ticks.

What do the experiments say? To observe the effect of the gravitational field on the rate of a clock, one needs to consider a clock on Earth and another in a satellite orbiting the Earth. Comparison of the two clocks will require ‘back-and-forth’ light signals so in addition to the gravitational effect, (7.2.2), there will be the effect of time dilatation. A round-the-world trip was accepted as the next best substitute for satellite experiments, and in 1971 four atomic clocks were compared with a stationary reference clock after they had made round-the-world trips in eastward and westward directions. The transported clocks feel a smaller gravitational attraction to the Earth than the grounded clock so that they will appear to go faster. The inertial reference system is one that moves with the Earth about the Sun but does not partake in the Earth’s rotation. Because gravity and time dilatation supposedly interfere destructively on the eastward journey the traveling clock should be slowed down, while, on the westward journey they complement one another so that there is a gain. The results were heralded as a great triumph for special relativity [Hafele & Keating 72].
Not only was the difference in time between the westward and eastward journeys of the same order as the difference between the individual clocks [Essen 78], but how does one go about decomposing time dilatation from gravitational effects? The latter certainly does not belong to the realm of special relativity.
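To get a feeling for the size of the first-order shift in (7.2.1), the short sketch below evaluates gh/c^2 with the potential taken as gh, as in the argument above. It is only an order-of-magnitude illustration: the cruising altitude and flight duration are assumed round numbers chosen by us, not data from the 1971 experiment, and the function names are hypothetical.

```python
C = 2.998e8    # speed of light, m/s
G_ACC = 9.81   # gravitational acceleration at the Earth's surface, m/s^2

def fractional_shift(height_m, g=G_ACC, c=C):
    """First-order fractional frequency shift (nu - nu0)/nu0 = g*h/c^2,
    i.e. (7.2.1) with the potential taken as g*h."""
    return g * height_m / c**2

def accumulated_offset_ns(height_m, duration_s):
    """Clock offset, in nanoseconds, accumulated over duration_s at a
    constant height, from the gravitational term alone."""
    return fractional_shift(height_m) * duration_s * 1e9

if __name__ == "__main__":
    altitude = 1.0e4          # assumed cruising altitude, m
    flight_time = 48 * 3600   # assumed total time aloft, s
    print("fractional shift:", fractional_shift(altitude))
    print("offset (ns):", accumulated_offset_ns(altitude, flight_time))
```

With these assumed inputs the gravitational term alone amounts to a couple of hundred nanoseconds, which gives a sense of the precision required to separate it from the velocity-dependent time dilatation.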
7.2.2 Shapiro effect via Fermat’s principle
We will now calculate the increased travel time of light in a gravitational field known as the Shapiro effect. We will not treat it from general relativity, nor by (7.2.2), but, rather, as an example of Fermat’s principle of least time. Fermat’s principle asserts that the ray path connecting two arbitrary points makes the optical path length,
\[ I = c\tau = \int \eta \sqrt{(2T)}\, dt, \tag{7.2.3} \]
stationary, where, as before, $\eta$ is the index of refraction, c the velocity of light, $\tau$ the propagation time, and T the kinetic energy per unit mass.

As a first application of Fermat’s principle (7.2.3) we will determine the time-delay in radar sounding. We will later see, in Sec. 9.10.3, how general relativity accounts for this time-delay by evaluating the Schwarzschild metric on the null geodesic when all the angular dependencies are ignored. If a light signal is sent from Earth, located on the x-axis at $-x_E$, to Venus say, which is located behind the Sun at $x_V$, as shown in Fig. 7.1, the light ray will be bent as it passes the gravitational field of the Sun. Clocks will thus be slowed down, and the time it takes the ray to bounce off the surface of Venus and return to Earth will be longer than if the Sun were not present.

The simplest mechanical analog of the index of refraction is $\eta = \sqrt{(1 - 4\Phi/c^2)}$, where $\Phi$ is the potential energy per unit mass. We will justify this choice shortly. Since the gravitational field of the Sun makes the medium optically denser, $\Phi$ can be identified as the gravitational potential, $-GM/r$, where G is Newton’s gravitational constant, M the mass of the Sun, and $r = \sqrt{(R^2 + x^2)}$ is the distance from the center of the Sun to the point x on the ray, with R as the Sun’s radius.
Fig. 7.1. The set-up for the Shapiro effect.
Now, according to Fermat’s principle, (7.2.3), the propagation time $\tau$ along a ray connecting the two endpoints $-x_E$ and $x_V$ is given by
\[ \tau = \int_{-x_E}^{x_V} \frac{\eta(r)}{c}\, dx = \int_{-x_E}^{x_V} \frac{\sqrt{(1 + 2\alpha/r)}}{c}\, dx, \]
where $\alpha := 2GM/c^2$ is commonly referred to as the Schwarzschild radius. Every mass has its accompanying Schwarzschild radius: a human with a mass of $10^2$ kg, and radius of 1 m has a Schwarzschild radius of $10^{-25}$ m, while the Sun with a mass of $2 \times 10^{30}$ kg, and a radius of $7 \times 10^{8}$ m will have a Schwarzschild radius of $3 \times 10^{3}$ m.

The slowing down of clocks in a gravitational field will result in an apparent reduction in the speed of light. Light will therefore travel at the phase velocity $u(r) = c/\eta(r)$, rather than c as it does in vacuo. The gravitational potential enters through the index of refraction to modify the speed of light, and not through any putative connection with the Doppler effect. Consequently, the travel time will be
\[ \tau + \Delta\tau \approx \tau_N + \frac{\alpha}{c} \int_{-x_E}^{x_V} \frac{dx}{r} = \tau_N + \frac{\alpha}{c} \int_{-x_E}^{x_V} \frac{dx}{\sqrt{(R^2 + x^2)}}, \]
where $\tau_N = (x_E + x_V)/c$ is the Newtonian travel time, and we have used the approximation $\sqrt{(1 + x)} \approx 1 + x/2$. The second term is half the time-delay for a signal to bounce off Venus and return to Earth. Fermat’s principle thus predicts a time dilatation,
\[ 2\Delta\tau = \frac{2\alpha}{c} \ln\left( \frac{x_E + \sqrt{(R^2 + x_E^2)}}{-x_V + \sqrt{(R^2 + x_V^2)}} \right) \approx \frac{2\alpha}{c} \ln\left( \frac{4 x_E x_V}{R^2} \right) = 2.4 \times 10^{-4}\ \mathrm{s}, \]
where the square roots have been expanded to lowest order, $\sqrt{(R^2 + x_E^2)} \approx x_E$ and $\sqrt{(R^2 + x_V^2)} \approx x_V + R^2/2x_V$, since $R \ll x_V, x_E$. The factor 2 is due to the fact that the signal must make a round trip.
This is known as the Shapiro effect, and has been calculated without general relativity. The increased travel time corresponds to an apparent increase in distance of 36 km from Venus to Earth. In a simplified demonstration [Sexl & Sexl 79], the time-delay of radar due to the presence of a massive body is given by $\tau = \int dx/c_{\rm eff}$, where the effective velocity of light is given by Einstein’s expression (7.2.2), which supposedly accounts for both time dilatation and the shrinking of measuring rods in a gravitational field. The final expression for $\tau$ is valid to first-order in $\alpha/r$. Both these factors have been incorporated into the stretching of the line element by the index of refraction $\eta = \sqrt{(1 + \alpha/r)}$. Rather, if the effective velocity $c_{\rm eff}$ were to be identified with the phase velocity, there would be only half of the effect which is within 3% of the experimental uncertainty.
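The numbers quoted in this section can be reproduced with a few lines of arithmetic. The sketch below evaluates the Schwarzschild radius and the round-trip delay, both from the logarithm of the exact integral and from its lowest-order expansion. The physical constants are standard approximate values, and the Earth and Venus distances are taken to be their mean orbital radii; these inputs and all function names are our assumptions, since the text does not list them.

```python
import math

G = 6.674e-11       # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8         # speed of light, m/s
M_SUN = 1.989e30    # mass of the Sun, kg
R_SUN = 6.96e8      # radius of the Sun, m
X_EARTH = 1.496e11  # assumed Earth-Sun distance, m
X_VENUS = 1.082e11  # assumed Venus-Sun distance, m

def schwarzschild_radius(mass_kg):
    """alpha = 2GM/c^2."""
    return 2.0 * G * mass_kg / C**2

def round_trip_delay(alpha, R, x_e, x_v):
    """2*Delta(tau) from the closed form of (alpha/c) * int dx / sqrt(R^2 + x^2)
    between -x_e and x_v, doubled for the return trip."""
    s_e = math.hypot(R, x_e)
    s_v = math.hypot(R, x_v)
    return (2.0 * alpha / C) * math.log((x_e + s_e) / (s_v - x_v))

def round_trip_delay_approx(alpha, R, x_e, x_v):
    """Lowest-order expansion for R much smaller than x_e and x_v."""
    return (2.0 * alpha / C) * math.log(4.0 * x_e * x_v / R**2)

if __name__ == "__main__":
    alpha_sun = schwarzschild_radius(M_SUN)
    delay = round_trip_delay(alpha_sun, R_SUN, X_EARTH, X_VENUS)
    print("Schwarzschild radius of a 100 kg mass (m):", schwarzschild_radius(100.0))
    print("Schwarzschild radius of the Sun (m):", alpha_sun)
    print("round-trip delay (s):", delay)
    print("expanded form (s):", round_trip_delay_approx(alpha_sun, R_SUN, X_EARTH, X_VENUS))
    print("apparent one-way extra distance (km):", C * delay / 2.0 / 1e3)
```

With these inputs the delay comes out slightly below the rounded figure quoted above, and the apparent extra one-way distance is correspondingly a little under 36 km; the difference reflects only the rounding of the input values.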
7.3 Optico-gravitational Phenomena
The wave equation for a wave of definite angular frequency $\omega$ may be written from (3.8.4) as
\[ \nabla^2 A + \frac{\omega^2}{c^2}\, \eta^2 A = 0, \tag{7.3.1} \]
in terms of the vector potential intensity A, or what Maxwell referred to as the ‘electrokinetic momentum’ intensity. This is the equation of light in a medium of index of refraction
\[ \eta^2 = 2m \left[ W - \Phi(r) \right] \frac{c^2}{\omega^2}, \tag{7.3.2} \]
where W is the total energy and $\Phi$ is the potential energy, which is supposed to be a function only of the radial coordinate r. The index of refraction is usually positive, but, under certain circumstances like the total internal reflection of light, it can be imaginary. In this case there is an exponential penetration of light from a denser to a less dense medium. To see what this implies, take the scalar product of the subsidiary condition (3.8.9) with $\dot{J}$; we then obtain
\[ \frac{\eta^2}{c^2} = \frac{-\dot{J} \cdot \nabla\rho}{\dot{J}^2}. \]
Whereas in the normal case where the refractive index is real, the rate of change of the gravitation current is in the opposite direction to the density gradient, the case of an imaginary index of refraction would imply that the two are in the same direction. Although (7.3.1) is what we have derived from the circuital equations, and the definition of the index of refraction, we will find that it is insufficient to account for relativistic gravitational phenomena. This is due to the fact that $\eta$ accounts for the potentials, like the gravitational potential, $-GM/r$, but it cannot account for the metric dependent terms. This will become clear from Fermat’s principle.

Expression (7.3.2) can be shown to agree with Cauchy’s expression for the index of refraction; in this case $\Phi$ represents the gravitational potential. In 1836 Cauchy proposed a simple formula for the variation of the index of optical glasses with wavelength. Cauchy’s formula depends only on two empirical constants, $C_1$ and $C_2$,
\[ \eta = C_1 + C_2\, \omega^2. \tag{7.3.3} \]
For gases in which $\eta$ differs little from unity, we can replace $2\eta$ by $\eta^2 + 1$, and write (7.3.3) as
\[ \eta^2 = 2C_1 - 1 + 2C_2\, \omega^2. \tag{7.3.4} \]
Now consider (7.3.2) in the case where the density $\rho = 3M/4\pi r^3$ is constant. We then obtain
\[ \eta^2 = -A + \frac{8\pi}{3}\, G\rho\, r^2, \tag{7.3.5} \]
where A is an arbitrary constant. But, $G\rho$ is proportional to the square of the frequency of free-fall so that we can write (7.3.5) as
\[ \eta^2 = -A + B(\omega r)^2, \tag{7.3.6} \]
which is tantamount to Cauchy’s formula, (7.3.4), where B is another arbitrary constant. Comparison of our original formula (7.3.2) with (7.3.6) embodies the equivalence relation,
\[ \frac{GM}{r} = \omega^2 r^2. \tag{7.3.7} \]
In the remainder of this section we will use natural units where G = c = 1.
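The step from Cauchy's formula (7.3.3) to the quadratic form (7.3.4), and its comparison with (7.3.6), can be spelled out as follows. This is only a sketch of the algebra as we read it; the identification of the constants in the last line is our inference from comparing the two formulas, not a statement made in the text.

```latex
% For a gas, \eta - 1 is small, so its square may be neglected:
\begin{align*}
(\eta - 1)^2 \approx 0
  \quad &\Longrightarrow \quad 2\eta \approx \eta^2 + 1 . \\
% Doubling Cauchy's formula (7.3.3) and using this replacement:
2\eta = 2C_1 + 2C_2\,\omega^2
  \quad &\Longrightarrow \quad \eta^2 = 2C_1 - 1 + 2C_2\,\omega^2 . \\
% Term-by-term comparison with (7.3.6), \eta^2 = -A + B(\omega r)^2:
A = 1 - 2C_1 , \qquad & B\, r^2 = 2C_2 .
\end{align*}
```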
In the Euclidean plane where $\vartheta = \pi/2$, the kinetic energy, expressed in polar coordinates, is $2T = \dot{r}^2 + r^2 \dot{\varphi}^2$, where $\varphi$ is the azimuthal angle. However, we need not restrict ourselves to such a simple form of the kinetic energy. Rather, we can consider the expression
\[ 2T = E\, \dot{r}^2 + G\, \dot{\varphi}^2, \tag{7.3.8} \]
where E and G are the coefficients of the first fundamental form, which can be functions only of r. Introducing (7.3.8) into (7.2.3) we get
\[ I = \int \eta \sqrt{\left( E\, \dot{r}^2 + G\, \dot{\varphi}^2 \right)}\, dt = \int \eta \sqrt{\left( E + G\, \varphi'^2 \right)}\, dr, \tag{7.3.9} \]
where the prime indicates the derivative with respect to r. The true ray path connecting any two arbitrary points will make (7.3.9) stationary. Observing that $\varphi$ is a cyclic coordinate, inasmuch as it is not present in the integrand while its derivative with respect to r, $\varphi'$, is, we know that a first integral to the motion exists. Calling the integrand $\mathcal{L}$, it is
\[ \frac{\partial \mathcal{L}}{\partial \varphi'} = \frac{\eta\, G\, \varphi'}{\sqrt{\left( E + G\, \varphi'^2 \right)}} = \ell, \]
which is a constant, regardless of whether the medium is homogeneous or not. Recall that in an inhomogeneous medium, the refractive index will be a function of r. For the moment, we will assume, for simplicity, that it is a constant. The constant $\ell$ will be identified as the angular momentum in the natural units we are working in. Solving for $\varphi'$, we obtain the equation of the orbit as
\[ \frac{\dot{\varphi}}{\dot{r}} = \frac{d\varphi}{dr} = \pm \frac{\ell \sqrt{E}}{\sqrt{G}\, \sqrt{\left( \eta^2 G - \ell^2 \right)}}. \tag{7.3.10} \]
What we have derived in (7.3.10) is the famous Clairaut parametrization, which is an orthogonal parametrization in which both parameters of the first fundamental form, E and G, are only functions of r. Solutions to (7.3.10)
are geodesics of constant speed and zero geodesic curvature. They are referred to as ‘pre-geodesics,’ and the geometrical interpretation of the angular momentum, $\ell$, is the slant of the curve. The geodesic equation (7.3.10) contains all the information we need. It may be decomposed into two equations: the definition of the angular momentum,
\[ \ell = \frac{G}{\sqrt{E}}\, \dot{\varphi}, \tag{7.3.11} \]
and the radial equation,
\[ \dot{r} = \pm \frac{\sqrt{\left( \eta^2 G - \ell^2 \right)}}{\sqrt{G}}. \tag{7.3.12} \]
In Euclidean space $G = r^2$ and $E = 1$, (7.3.11) reduces to
\[ \ell = r^2 \dot{\varphi}, \tag{7.3.13} \]
which is Kepler’s law of equal areas in equal times, while (7.3.12) determines two families of curves in the rt-plane, the so-called characteristic curves. Instead of (7.3.1) we now have
\[ \nabla^2 A + \left( \omega^2 \eta^2 - \Omega \right) A = 0, \tag{7.3.14} \]
where $\Omega$ is the centrifugal energy per unit mass,
\[ 2\Omega = \frac{\ell^2}{r^2}. \tag{7.3.15} \]
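The orbit equation (7.3.10) is easy to explore numerically. As a minimal sketch, assuming the Euclidean values E = 1 and G = r^2 and a constant index of refraction, the code below integrates (7.3.10) outward from the caustic and compares the swept angle with the closed form arccos(r0/r), where r0 is the caustic radius, the angular momentum divided by the index of refraction. Agreement means that r cos(phi) = r0 along the ray, i.e. the ray is a straight line tangent to the caustic circle. The function names, the midpoint-rule integrator, and the sample values are our own choices.

```python
import math

def dphi_dr(r, eta, ell, E, G):
    """Positive branch of (7.3.10): ell*sqrt(E) / (sqrt(G)*sqrt(eta^2*G - ell^2))."""
    g = G(r)
    return ell * math.sqrt(E(r)) / (math.sqrt(g) * math.sqrt(eta**2 * g - ell**2))

def swept_angle(r1, r2, eta, ell, E, G, n=20000):
    """Angle swept between radii r1 < r2 (both outside the caustic),
    obtained by the midpoint rule applied to (7.3.10)."""
    h = (r2 - r1) / n
    total = 0.0
    for k in range(n):
        total += dphi_dr(r1 + (k + 0.5) * h, eta, ell, E, G)
    return total * h

def E_flat(r):
    """Euclidean plane: E = 1."""
    return 1.0

def G_flat(r):
    """Euclidean plane: G = r^2."""
    return r * r

if __name__ == "__main__":
    eta, ell = 1.0, 2.0   # illustrative values in natural units
    r0 = ell / eta        # radius of the caustic circle
    r1, r2 = 1.5 * r0, 5.0 * r0
    numeric = swept_angle(r1, r2, eta, ell, E_flat, G_flat)
    closed = math.acos(r0 / r2) - math.acos(r0 / r1)
    print("numerical angle:", numeric)
    print("straight-line angle:", closed)
```

Replacing `E_flat` and `G_flat` by other functions of r, or making `eta` depend on r, turns the same integrator into a ray tracer for the non-Euclidean and inhomogeneous cases discussed in the text, although the closed-form comparison then no longer applies.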
Although we have derived (7.3.14) as a geodesic from Fermat’s principle, it would be instructive to look for its electromagnetic origin, which might shed some light on the appearance of the last term in (7.3.14). Whereas (7.3.1) follows immediately from the circuital equations,
\[ \nabla \times E = -\eta \dot{H}, \qquad \nabla \times H = \eta \dot{E}, \tag{7.3.16} \]
simply by introducing $H = \nabla \times A$, and the subsidiary equations,
\[ \dot{E} + \eta \ddot{A} + \nabla \dot{\phi} = 0, \qquad \nabla \cdot A + \eta \dot{\phi} = 0, \tag{7.3.17} \]
(7.3.14) needs an Ampère current,
\[ J = \Omega A. \tag{7.3.18} \]
This converts the reduced wave equation, $\nabla(\nabla \cdot A) + \omega^2 \eta^2 A = J$, into (7.3.14). Equation (7.3.14) requires the divergence of (7.3.18) to vanish, which is not guaranteed solely by the vanishing of the divergence of the vector potential since the centrifugal energy $\Omega$ is a function of r. An auxiliary condition is needed that requires $r \cdot A = 0$, or that the vector potential be normal to the direction of propagation, which is the case of a transverse wave. A current proportional to the vector potential is a hallmark of superconductivity where the coefficient of the vector potential is proportional to the mass. We will return to this in Sec. 11.5.2.

In his studies of the transmission of electromagnetic waves along cylindrical cables, Heaviside [94] came across (7.3.14) with the centrifugal energy (7.3.15) in which the electric and magnetic fields were zero- and first-order Bessel functions when there were no angular variations. This necessitated considering two components of the electric field vector, E and F.

Let us first consider the cylindrically symmetric case. Let z be the axis of the cable of radius $r_0$, and r the distance from it. Then either E or H will be circular about this symmetry axis. In either case the electric field will have an additional, or radial, component F. If H is circular, and E longitudinal the circuital equations are
\[ \frac{1}{r} \frac{\partial\, rH}{\partial r} = \dot{E}, \qquad -\frac{\partial H}{\partial z} = \dot{F}, \qquad \frac{\partial E}{\partial r} - \frac{\partial F}{\partial z} = \mu \dot{H}, \tag{7.3.19} \]
where $(1/r)(\partial/\partial r)r$ denotes the curl which operates on a solenoidal field. In contrast, $\partial/\partial r$ is the gradient, and it operates on an irrotational field. Differentiating the last equation in time and substituting in the first two equations lead to a Bessel equation for H whose solution is a Bessel function of order one,
\[ H'' + \frac{1}{r} H' + \left( s^2 - \frac{1}{r^2} \right) H = 0, \]
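where the prime stands for differentiation with respect to r, and
\[ s^2 = \frac{\partial^2}{\partial z^2} - \eta^2 \frac{\partial^2}{\partial t^2}, \]
while the solution for E is a Bessel function of order zero. Rather, if H is longitudinal and E circular, the circuital equations are
\[ \frac{\partial H}{\partial r} = \dot{E}, \qquad -\frac{\partial H}{\partial z} = \dot{F}, \qquad \frac{1}{r} \frac{\partial\, rE}{\partial r} - \frac{\partial F}{\partial z} = \mu \dot{H}. \]
Now, H satisfies
\[ \frac{1}{r} \frac{\partial}{\partial r} \left( r \frac{\partial H}{\partial r} \right) + \frac{\partial^2 H}{\partial z^2} = \eta^2 \ddot{H}, \tag{7.3.20} \]
whose solution is a zero-order Bessel function, while E is a first-order Bessel function. Finally, if both E and H are longitudinal, we get
\[ \frac{\partial^2 H}{\partial r^2} + \frac{\partial^2 H}{\partial z^2} = \eta^2 \frac{\partial^2 H}{\partial t^2}. \tag{7.3.21} \]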
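For the first of the three cases above (H circular, E longitudinal), the elimination that leads from the circuital equations (7.3.19) to the Bessel equation of order one can be written out explicitly. The steps below are a sketch of that algebra; identifying $\mu$ with $\eta^2$ is our assumption, suggested by the way $s^2$ is defined.

```latex
\begin{align*}
% time derivative of the third equation in (7.3.19)
\frac{\partial \dot{E}}{\partial r} - \frac{\partial \dot{F}}{\partial z}
  &= \mu \ddot{H} , \\
% insert \dot{E} = (1/r)\,\partial(rH)/\partial r and \dot{F} = -\partial H/\partial z
\frac{\partial}{\partial r}\left( \frac{1}{r}\frac{\partial\, rH}{\partial r} \right)
  + \frac{\partial^2 H}{\partial z^2} &= \mu \ddot{H} , \\
% expand the radial operator: (1/r)\,\partial(rH)/\partial r = H' + H/r
H'' + \frac{1}{r} H' - \frac{1}{r^2} H + \frac{\partial^2 H}{\partial z^2} - \mu \ddot{H} &= 0 , \\
% collect the z and t derivatives into s^2, with \mu = \eta^2 (assumed)
H'' + \frac{1}{r} H' + \left( s^2 - \frac{1}{r^2} \right) H &= 0 ,
  \qquad s^2 = \frac{\partial^2}{\partial z^2} - \eta^2 \frac{\partial^2}{\partial t^2} .
\end{align*}
```

The $-1/r^2$ term, which makes the order one rather than zero, comes entirely from differentiating the factor $1/r$ in the curl.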
Furthermore, if H has a periodic dependency on z, i.e. $e^{iz/\lambda}$, (7.3.21) becomes the Klein–Gordon equation when the wavelength $\lambda$ is identified as the Compton wavelength. We will have much more to say about (7.3.21) in Sec. 9.5, but it should be borne in mind, even at this stage, that the Klein–Gordon equation involves only irrotational fields. Hence, Schrödinger’s conclusion that

Plane waves have only two possible states of polarization, not three, as would be expected for a vector wave (e.g. an elastic wave; remember the historical dilemma concerning the ‘elastic properties of the aether’).
As mass is associated with the longitudinal (helicity zero) state (cf. Sec. 9.6), we see that it is compatible with irrotational fields alone and does not arise from some spontaneous symmetry breaking in which the disappearance of a transverse degree of freedom makes its appearance as a longitudinal mode of vibration of the electromagnetic field.
Heaviside noted that when there are no angular variations the only Bessel functions are J0 and J1 . Since the generalization to include angular dependencies, leading to higher-order Bessel functions is “so easily made
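that it would be inexcusable to overlook it,” we, along with Heaviside, consider it. Take H longitudinal and the electric field with two components, one circular, E, and one radial component, F. The circuital equations are
\[ -\frac{\partial H}{\partial r} = \dot{E}, \qquad \frac{\partial H}{\partial \vartheta} = \dot{F}, \qquad \frac{1}{r} \left( \frac{\partial\, rE}{\partial r} - \frac{\partial F}{\partial \vartheta} \right) = -\mu \dot{H}, \tag{7.3.22} \]
at z = const. Assuming H to be a periodic function of both angle and time, the resulting equation,
\[ \frac{1}{r} \frac{\partial}{\partial r} \left( r \frac{\partial H}{\partial r} \right) + \left( \omega^2 \eta^2 - \frac{m^2}{r^2} \right) H = 0, \tag{7.3.23} \]
is Bessel’s equation of order m. By defining the analytic function G = E + iF, we can write the circuit equations (7.3.22) as
\[ \left( \frac{\partial}{\partial r} - i \frac{\partial}{\partial \vartheta} \right) H = -\frac{\partial G}{\partial t}, \qquad \frac{1}{r} \left( \frac{\partial}{\partial r}\, r + i \frac{\partial}{\partial \vartheta} \right) G = -\mu \frac{\partial H}{\partial t}. \tag{7.3.24} \]
Differentiating the second equation in time gives
\[ \frac{1}{r} \left( \frac{\partial}{\partial r}\, r + i \frac{\partial}{\partial \vartheta} \right) \left( \frac{\partial}{\partial r} - i \frac{\partial}{\partial \vartheta} \right) H = \eta^2 \ddot{H}. \]
Since we have considered all fields to be real, the real and imaginary parts of this equation must be equated to zero separately. The real part gives the equation for the Bessel function of order m, (7.3.23), while the imaginary part, when equated to zero,
\[ -\frac{\partial E}{\partial \vartheta} = \frac{\partial\, rF}{\partial r}, \]
is one of the Cauchy–Riemann conditions for the existence of an analytic function. This just says that the mixed second derivatives of G are equal.

Ironically, Heaviside loathed complex variables, going so far as to write to Bromwich “I could never stomach your complex integral method.” Two years after Heaviside’s death, Jeffreys was to show “that many of Heaviside’s solutions could be obtained easily by workers without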
his amazing skill in manipulation, by using the theory of the complex variable.”

To understand the nature of the individual terms in (7.3.22) we may avail ourselves of Heaviside’s construction of the power equation, or what he referred to as ‘activity.’ Again, Heaviside missed out on the discovery of the energy flux by a matter of months, and seeing that Poynting used Maxwell’s original notation, it appears almost miraculous that he could have made the discovery. Heaviside’s derivation was just two pages! Multiplying the first, second and third equations in (7.3.22) by E, F and H, respectively, and combining them, the following equation results:
\[ \frac{1}{r} \frac{\partial\, rEH}{\partial r} - \frac{1}{r} \frac{\partial\, FH}{\partial \vartheta} = -\frac{1}{2} \frac{\partial}{\partial t} \left[ (E^2 + F^2) + \mu H^2 \right]. \]
The right-hand side is just the decrease in the total energy due to the flow of energy toward the outside of the cylinder which is represented by the left-hand side. The total flow towards the outside, which is responsible for the subsequent decrease in energy, is obtained by multiplying both sides by $r\, dr\, d\vartheta$ and integrating from r = 0 to $r = r_0$ and from $\vartheta = 0$ to $\vartheta = 2\pi$. We then obtain
\[ 2\pi r_0\, H(r_0) E(r_0) = -\frac{d}{dt} \int_0^{r_0} \!\! \int_0^{2\pi} \tfrac{1}{2} \left[ (E^2 + F^2) + \mu H^2 \right] r\, d\vartheta\, dr, \tag{7.3.25} \]
since the term in the derivative with respect to $\vartheta$ averages out to zero. The first term in (7.3.25) is the Poynting flux directed radially outward.$^a$ It involves only the circuital component of the electric field, and vanishes when $r_0$ is a zero of the Bessel function. By contrast, in the case of the Klein–Gordon equation, (7.3.21), we would get an unphysical source term,
\[ -2\pi \int_0^{r_0} EH\, dr, \]
from the necessity of performing an integration by parts. Hence, if the fields are entirely irrotational there will be no energy flux. The case of spherical symmetry can be handled analogously, and in the presence of a gravitational potential it leads to the gravitational analog a This agrees with Heaviside except for a minus sign.
Aug. 26, 2011
11:16
356
SPI-B1197
A New Perspective on Relativity
b1197-ch07
A New Perspective on Relativity
of the nonrelativistic hydrogen atom. Again following Heaviside, the simplest spherical waves are those for which the lines of H are circles of equal latitude, centered on an axis from which the polar angle, $\vartheta$, is measured. Here r is the distance from the origin, and the azimuthal angle will have no role in what follows. H is circuital, while the electric field has two components: a radial component, E, and a circuital component, F, that coincides with a line of longitude. Heaviside writes down directly the wave equation for a spherical harmonic. However, it is of interest to see its origin in the circuital equations. The magnetic field has but one component, which runs along the circles of latitude. The circuital laws are thus
\[
\begin{aligned}
\mu\dot{H} &= -\mathrm{curl}_{\varphi}E = -\frac{1}{r}\left[\frac{\partial (rF)}{\partial r} - \frac{\partial E}{\partial\vartheta}\right],\\
\epsilon\dot{F} &= \mathrm{curl}_{\vartheta}H = -\frac{1}{r}\frac{\partial (rH)}{\partial r},\\
\epsilon\dot{E} &= \mathrm{curl}_{r}H = \frac{1}{r\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,H\right).
\end{aligned}
\tag*{[EM]}
\]
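The circuital equations [EM] can then be combined into a wave equation, or what Heaviside referred to as the ‘characteristic’ equation
\[
\epsilon\mu\ddot{H} = \frac{1}{r}\frac{\partial^2 (rH)}{\partial r^2} + \frac{1}{r^2}\frac{\partial}{\partial\vartheta}\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,H\right). \tag{7.3.26}
\]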
The second term on the right-hand side of (7.3.26) just misses being the $\vartheta$ term in the Laplacian in spherical coordinates by the term $1/r^2\sin^2\vartheta$, viz.
\[
\frac{\partial}{\partial\vartheta}\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta = \frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta\frac{\partial}{\partial\vartheta} - \frac{1}{\sin^2\vartheta}.
\]
This term is analogous to the $m = \pm 1$ term for the magnetic quantum number of an electron. It represents the projection of the angular momentum on the preferred z-axis. For if we call
\[
\Lambda = \frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta\frac{\partial}{\partial\vartheta} + \frac{1}{\sin^2\vartheta}\frac{\partial^2}{\partial\varphi^2},
\]
the ‘surface harmonic,’ $Y(\vartheta, \varphi)$, satisfies the differential equation
\[
\Lambda Y + \lambda Y = 0, \tag{7.3.27}
\]
where $\lambda$ is an eigenvalue which we seek to determine. Now Y depends only on the angles, or equivalently the ratios x/r, y/r, and z/r, say to the power $\ell$. We can thus consider an $\ell$-th degree homogeneous function, $\Psi = r^{\ell}Y$, where $\Psi$ satisfies Laplace’s equation, $\nabla^2\Psi = 0$, which in spherical coordinates reads
\[
\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r}\right) r^{\ell}Y + \frac{1}{r^2}\Lambda\, r^{\ell}Y = r^{\ell-2}\left\{\Lambda + \ell(\ell+1)\right\}Y = 0. \tag{7.3.28}
\]
In view of (7.3.27) we find $\lambda = \ell(\ell+1)$. There are $2\ell + 1$ values of m for each value of $\ell$, and in this case, $\ell = 1$, there should be three values, $m = 0, \pm 1$. The $m = 0$ value is missing. We can find the $m = 0$ value if we consider the propagation of the electric field. To find this equation we take the time-derivative of the third equation in [EM], introduce the first equation, and use the second and third equations to express the circular component of the electric field in terms of the radial component. We then obtain
\[
\epsilon\mu\ddot{E} = \frac{1}{r}\frac{\partial^2 (rE)}{\partial r^2} + \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial E}{\partial\vartheta}\right), \tag{7.3.29}
\]
which has the correct form for the Laplacian in spherical coordinates with $m = 0$. In Sec. 11.5.2 we will show that (7.3.29) can support dispersion. There, we will also provide an interpretation for $\ell$ and m in terms of the properties of photons. Returning to our main theme, and assuming a periodic time dependence, (7.3.26) reduces to
\[
\left[\frac{d^2}{dr^2} - \frac{\ell(\ell+1)}{r^2} + \eta^2\omega^2\right] rH = 0, \tag{7.3.30}
\]
which is Schrödinger’s equation for the hydrogen atom. Moreover, since Coulomb’s law has the same form as Newton’s law, (7.3.30) can be solved
in exactly the same way by replacing $e^2$ with GM in the expression for the index of refraction,
\[
\eta^2 = \frac{2m}{\omega^2}\left[W - \Phi(r)\right] = \frac{2m}{\omega^2}\left(W + \frac{GM}{r}\right).
\]
The square of the momentum is written to show that it has been derived from the square of the curl, and thus polarization has already been included in the solution to Schrödinger’s equation without ever being realized. This is clear from the circuital equations [EM]. These equations describe the propagation of electromagnetic waves. For photons, the projection of the total angular momentum along the polar axis is replaced by the projection of its spin along the direction of motion. This is called helicity, and for a photon the helicity is ±1, but never zero. It is remarkable that this information is included in the classical circuital laws, as is made evident from (7.3.28). The absence of the helicity-0 state guarantees that a photon can never come to rest, i.e. it has zero rest mass. Although we will return to the problem of introducing a longitudinal mode in Sec. 11.5.5, we briefly indicate how this can be done. If G is a generalized displacement, the force acting on it will be made up of shear, compression, and rotation. The force per unit volume that arises from the stress of the medium, which tends to oppose a finite elastic resistance to shear, compression and rotation, is
\[
\mathbf{F} = n\left[\nabla^2\mathbf{G} + \tfrac{1}{3}\nabla(\nabla\cdot\mathbf{G})\right] + \lambda\nabla(\nabla\cdot\mathbf{G}) - \nu\,\nabla\times\nabla\times\mathbf{G}, \tag{7.3.31}
\]
where the elastic constants related to shear, compression, and rotation are n, λ, and ν, respectively. The first term in (7.3.31) is the rigidity, and when there is frictional resistance, its time-derivative is the frictional resistance to distortion. We shall neglect such effects here and concentrate on the other two terms. The second term represents a uniform tension, and its negative represents the hydrostatic pressure. When the material is incompressible, λ becomes infinite, and ∇ · G = 0, but their product remains finite. The last term is the force that tends to rotate the body as a whole. As we discussed in Sec. 3.8.1, there are times when it is more convenient to associate H to the velocity than E. Yet, it was crucial to our derivation of the electromagnetic mass in Sec. 5.4.3 that we associate H
with the velocity. Such will prove to be the case here. So with $\dot{\mathbf{G}} = \mathbf{H}$, and $\nu\nabla\times\mathbf{G} = \mathbf{E}$, we take the time-derivative of the latter and identify $\nu^{-1} = \epsilon$, to get the second circuital equation $\epsilon\dot{\mathbf{E}} = \nabla\times\mathbf{H}$. Introducing these definitions into (7.3.31), and associating the permeability with the density, we get the equation of motion,
\[
\mu\dot{\mathbf{H}} = -\nabla\times\mathbf{E} + \lambda\nabla(\nabla\cdot p^{-1}\mathbf{H}), \tag{7.3.32}
\]
where we also set the rotational constant equal to the inverse of the dielectric constant, and $p^{-1}$ is Heaviside’s notation for the inverse of the time-derivative. In respect to the first equation in [EM], there is an additional term in (7.3.32) which normally should be zero because
\[
\nabla\cdot\mathbf{H} = \frac{1}{r\sin\vartheta}\frac{\partial H}{\partial\varphi} = 0, \tag{7.3.33}
\]
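since H consists of circles on a sphere of constant latitude and has nothing to do with $\varphi$, which measures longitude. It is also required that H be divergenceless. Then, taking the time-derivative of (7.3.32), we get
\[
\epsilon\mu\ddot{H} = \frac{1}{r}\frac{\partial^2 (rH)}{\partial r^2} + \frac{1}{r^2}\frac{\partial}{\partial\vartheta}\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,H\right) - \frac{1}{r^2\sin^2\vartheta}\frac{\partial^2 H}{\partial\varphi^2}. \tag{7.3.34}
\]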
The two numbers in quantum mechanics that describe angular momentum are $\ell$ and m, the angular momentum and its projection on the z-axis, respectively. For photons as well as elementary particles it is more convenient to consider $\ell$ the spin, and m the helicity. The photon is a spin $\ell = 1$ particle, and a spin-one particle has $2\ell + 1$ states with $m = -1, 0, 1$. But this is not so for a photon, since a massless particle cannot have helicity $m = 0$. This is guaranteed by (7.3.33). But, if (7.3.33) does not hold, and H is periodic in $\varphi$, i.e. $e^{im\varphi}$, then for $m = \pm 1$, the last term in (7.3.34) cancels the term in the second expression that prevents it from being the Laplacian. In that
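event (7.3.34) becomes
\[
\epsilon\mu\ddot{H} = \frac{1}{r}\frac{\partial^2 (rH)}{\partial r^2} + \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial H}{\partial\vartheta}\right), \tag{7.3.35}
\]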
for the $m = 0$ longitudinal mode. Now, the right-hand side is not the expression for the Laplacian in spherical coordinates, which for the $m = 0$ state would read
\[
\frac{1}{r^2}\frac{\partial}{\partial r}\,r^2\frac{\partial}{\partial r}.
\]
But, it is the free-particle spatial part of the Schrödinger equation, for the same $m = 0$ state, which uses the radial part of the Laplacian in cylindrical coordinates. This has been derived from the circuital equations [EM], and not from the operator $p_r = -i\,\partial/\partial r$ for the radial momentum. Moreover, the $m = 0$ mode corresponds to the spherical harmonic
\[
\ell(\ell+1) = -\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta\frac{\partial}{\partial\vartheta},
\]
with eigenvalue $\ell = 1$, which is the value of its spin, while
\[
\ell(\ell+1) = -\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\sin\vartheta\frac{\partial}{\partial\vartheta} + \frac{1}{\sin^2\vartheta},
\]
corresponds to the same eigenvalue, but with $m = \pm 1$. It is only the former which is compatible with a space-varying index of refraction. By shifting the quantum numbers $\ell$ and m from angular and azimuthal quantum numbers to spin and helicity, we have obtained the equations governing photons and the corresponding $m = 0$ longitudinal mode. Heaviside certainly could not have foreseen all this, but he should have noted that he did not come out with the correct spherical harmonic in terms of Legendre polynomials, but, rather, with their first derivatives. There is a paucity of equations in physics, owing to the frugality of nature that requires their reinterpretation under different physical circumstances. Heaviside began his investigation by trying to obtain condensation waves from the rapid oscillations of plus and minus charges in a conductor. Obviously, since plane waves are incapable of longitudinal vibrations, he was led to consider the next in the order of simplicity: spherical waves. If electricity can be likened to a fluid, and if it were compressible then
it would naturally give rise to a condensation wave. This would make it unnecessary to consider an electric current as the motion through a space of something since the vibrations of a lattice would do equally well. For a constant index of refraction, Heaviside did not find any condensation waves. If (7.3.14) has anything to do with gravity, it will be the Helmholtz equation for the wave function that determines the gravitational field once the additional potential and index of refraction are specified. This is in contrast to Poisson’s equation where the gravitational potential is determined by the mass density.
7.4 The Models
Our three basic models are:

(i) The Flat Model, where
\[
E = 1, \qquad G = r^2,
\]
for which the wave equation is
\[
\left[\frac{d^2}{dr^2} + \omega^2\eta^2 - \frac{\ell^2}{r^2}\right] u = 0, \tag{7.4.1}
\]
where, without loss of generality, we consider a reduced scalar wave equation. Equation (7.4.1) bears a remarkable resemblance to the normal form of Bessel’s equation, where a periodic solution exists for $r > \ell/\eta\omega$, and an exponentially damped solution for the reverse of the inequality. Even more can be said when we introduce another equivalence principle
\[
\frac{\ell^2}{r^2} = \omega^2 r^2, \tag{7.4.2}
\]
which takes us from constant angular momentum, implying equal areas in equal times, to one of constant angular frequency. This has the effect of converting Bessel’s equation (7.4.1) into the differential
equation for Hermite polynomials
\[
\left[\frac{d^2}{dr^2} + \eta^2 - \omega^2 r^2\right] u = 0. \tag{7.4.3}
\]
A periodic solution exists if r stays within the limits $\pm\eta/\omega$.

(ii) The Beltrami, or Projective, Model, where
\[
E = \frac{1}{(1 - r^2/R^2)^2}, \qquad G = \frac{r^2}{1 - r^2/R^2}, \tag{7.4.4}
\]
with R the absolute constant, or radius of curvature. The wave equation is
\[
\left[\frac{d^2}{dr^2} + \eta^2 - \frac{\ell^2}{r^2}\left(1 - \frac{r^2}{R^2}\right)\right] u = 0. \tag{7.4.5}
\]
If the radius of curvature is $R = 1/\sqrt{\rho}$ [i.e. $R = c/\sqrt{G\rho}$], and the total mass, M, is constant, rather than the mass density, $\rho$, (7.4.5) becomes
\[
\left[\frac{d^2}{dr^2} + \eta^2 - \frac{\ell^2}{r^2}\left(1 - \frac{2M}{r}\right)\right] u = 0. \tag{7.4.6}
\]
For a constant index of refraction, (7.4.6) will give the deflection of light about a massive body, M, while in the case where the index of refraction is given by (7.3.2), it will describe the advance of the perihelion.

(iii) The Stereographic Inner Product Model, where
\[
E = \frac{1}{(1 - r^2/R^2)^2}, \qquad G = \frac{r^2}{(1 - r^2/R^2)^2},
\]
for which the wave equation is
\[
\left[\frac{d^2}{dr^2} + \eta^2 - \frac{\ell^2}{r^2}\left(1 - \frac{r^2}{R^2}\right)^2\right] u = 0. \tag{7.4.7}
\]
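In all three models the coefficient of u has the form $\eta^2 - \ell^2/G(r)$, so the boundary between the periodic and the damped regions is the radius at which this coefficient changes sign. The sketch below locates that radius by bisection for each model and can be compared with the closed forms that follow from (7.4.1), (7.4.5) and (7.4.7). It assumes $\omega = 1$ in (7.4.1) and arbitrary illustrative values of $\eta$, $\ell$ and R; the function names are hypothetical.

```python
import math

def centrifugal_term(model, r, ell, R):
    """ell^2 / G(r) for the three metrics of this section."""
    g = 1.0 - (r / R) ** 2
    if model == "flat":
        return ell ** 2 / r ** 2
    if model == "beltrami":
        return ell ** 2 * g / r ** 2
    if model == "stereographic":
        return ell ** 2 * g ** 2 / r ** 2
    raise ValueError(f"unknown model: {model}")

def radial_k2(model, r, eta, ell, R):
    """Coefficient of u in (7.4.1) [omega = 1 assumed], (7.4.5), (7.4.7)."""
    return eta ** 2 - centrifugal_term(model, r, ell, R)

def turning_point(model, eta, ell, R, tol=1e-12):
    """Bisection on (0, R) for the radius where radial_k2 changes sign."""
    lo, hi = 1e-9 * R, R
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if radial_k2(model, mid, eta, ell, R) < 0.0:
            lo = mid          # still in the damped (shadow) region
        else:
            hi = mid          # already in the periodic region
    return 0.5 * (lo + hi)

eta, ell, R = 1.0, 1.0, 5.0   # illustrative values only
points = {m: turning_point(m, eta, ell, R)
          for m in ("flat", "beltrami", "stereographic")}
print(points)
```

With these values the flat turning point sits at $\ell/\eta$, and both curved models pull it inward, the stereographic model more so than the Beltrami model.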
However, since the Gaussian curvature is no longer constant, this will cause a modification of the conservation of the angular momentum from
its Euclidean expression (7.3.13). It will now be given by
\[
\ell = \frac{\dot{\varphi}\,r^2}{1 - r^2/R^2} = \frac{\dot{\varphi}\,r^2}{1 - 2M/r}, \tag{7.4.8}
\]
where we transferred from one of constant density, $\rho$, to one of constant mass, M, since the absolute constant,
\[
R = \sqrt{\frac{3}{8\pi\rho}}. \tag{7.4.9}
\]
The absolute constant (7.4.9) is proportional to the free fall time, $\rho^{-1/2}$. We did this so that (7.4.8) would correspond to Møller’s [52] expression for the angular momentum that he got in his treatment of the perihelion shift. His conclusion was that the right-hand side of (7.4.8) “cannot in general be interpreted as angular momentum, since the notion of a ‘radius vector’ occurring in the definition of angular momentum has an unambiguous meaning only in a Euclidean space.” This space should rather be a space of constant curvature. We will return to this discussion in Sec. 9.6. Transferring our attention to the phase $S = -i\ln u + \text{const.}$, we get
\[
S' = \pm\sqrt{\eta^2 - \ell^2/G}, \tag{7.4.10}
\]
in the optico-geometric limit where $\nabla^2 S \ll 1$. Equation (7.4.10) can be considered as the generalization of the Poisson equation of the short-wavelength theory of gravitation. The phase S plays the role of an action, and is known as the eikonal in geometrical optics. It is completely determined by the gravitational potential, in the case that the index of refraction is varying, and by the generalized centrifugal potential $\ell^2/G$. Gravitational effects will be reflected in deviations from the Flat Model for S, which is given by ($\eta = 1$)
\[
\begin{aligned}
S_\pm(r, \varphi) = \int \dot{r}\,dr &= \pm\int\frac{\sqrt{r^2 - \ell^2}}{r}\,dr\\
&= \pm\left[\sqrt{r^2 - \ell^2} - \ell\cos^{-1}\frac{\ell}{r}\right]\\
&= \pm\,\ell\int\tan^2\varphi\,d\varphi = \pm\,\ell\,(\tan\varphi - \varphi).
\end{aligned}
\tag{7.4.11}
\]

Fig. 7.2. Rays tangent to a circular caustic of radius $\ell$.

The second line of (7.4.11) describes wavefronts characterized by the condition that $S_\pm = \text{const.}$, and are involutes of the circle $r = \ell$. Now, a
caustic is an envelope of a family of rays. The wavefronts are normal to the rays PA and PB, and the caustic curve coincides with the envelope AB of this family of normals in Fig. 7.2. The wavefronts are the involute to the curve AB. The envelope of the normal to this curve is called the evolute. The rays corresponding to the eikonal $S_-$ are half-lines tangent to the circle $r = \ell$. The first term in $S_-$ represents the length PA, $-\sqrt{r^2 - \ell^2}$, while the second term is the arc length of AC. The negative sign in the former means that the direction of the ray is from P to A. An analogous explanation holds for $S_+$. $\ell/2$ times $S_+$ in the last line of (7.4.11) represents the difference in areas of the triangle AOB, which is $\tfrac{1}{2}\ell^2\tan\varphi$, and the area of the sector COB, $\tfrac{1}{2}\ell^2\varphi$, shown in Fig. 7.3. Thus, $S_+ > 0$ on the strength of the trigonometric inequality $\tan\varphi - \varphi > 0$, and represents distance. Rays do not penetrate into the shadow region, which is the interior of the caustic, $r < \ell$. In this region, the eikonal, (7.4.11), becomes completely imaginary,
\[
S^\dagger_\pm = \pm\, i\left[\ell\cosh^{-1}\frac{\ell}{r} - \sqrt{\ell^2 - r^2}\right]. \tag{7.4.12}
\]
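The closed forms in (7.4.11) and (7.4.12) can be checked against a direct quadrature of the radial integrand on either side of the caustic. The sketch below is one way to do so, under the assumptions $\eta = 1$, the parametrizations $r = \ell\sec\varphi$ (bright zone) and $r = \ell\,\mathrm{sech}\,\varphi$ (shadow zone), and a plain composite Simpson rule; the function names and the numerical values of $\ell$ and $\varphi$ are illustrative only.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def eikonal_bright(r, ell):
    """Integral of sqrt(x^2 - ell^2)/x from the caustic x = ell out to r."""
    return simpson(lambda x: math.sqrt(max(x * x - ell * ell, 0.0)) / x, ell, r)

def eikonal_shadow(r, ell):
    """Integral of sqrt(ell^2 - x^2)/x from r (< ell) up to the caustic."""
    return simpson(lambda x: math.sqrt(max(ell * ell - x * x, 0.0)) / x, r, ell)

ell, phi = 1.5, 0.8                  # illustrative values only
r_bright = ell / math.cos(phi)       # r = ell sec(phi), outside the caustic
r_shadow = ell / math.cosh(phi)      # r = ell sech(phi), inside the caustic

bright_numeric = eikonal_bright(r_bright, ell)
bright_closed = ell * (math.tan(phi) - phi)     # last line of (7.4.11)
shadow_numeric = eikonal_shadow(r_shadow, ell)
shadow_closed = ell * (phi - math.tanh(phi))    # modulus of (7.4.12)

print(bright_numeric, bright_closed)
print(shadow_numeric, shadow_closed)
```

The shadow-zone value is the modulus of the imaginary eikonal; the factor i is what turns the oscillatory phase into exponential damping.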
Fig. 7.3. Sector inscribed in a triangle.

Since the ‘shadow’ intensities vanish exponentially, they are usually ignored. But, because the matching conditions between periodic and aperiodic zones furnish the quantum conditions in quantum mechanics, they
must have some physical meaning. In fact, we will see that the shadow region lies in hyperbolic space. In the shadow zone we are no longer restricted to the relativistic assumption that nothing travels faster than the speed of light. Thus it is entirely feasible to have a velocity of rotation $r\dot{\varphi} > 1$. The phase and group velocities are also problematic in quantum mechanics. Since their product is 1, if the group velocity is less than 1, the phase velocity must be greater than 1. But, this could be rationalized only by excluding their use in signal transmission since no optical effect could propagate faster than the speed of light. Boundary conditions in general relativity usually require space-time to be asymptotically flat. But, in rotating systems, we will learn in Chapter 9 that a cut-off must be introduced for, otherwise, distances $r > 1/\dot{\varphi}$ would make the time component of the metric tensor negative. Distances greater than $1/\dot{\varphi}$ bring us into the caustic region, and since the region is hyperbolic, it does not lead to the Landau and Lifshitz [75] conclusion that “such a system cannot be made up of real bodies.” The action (7.4.12) can also be derived from Fermat’s principle using an indefinite metric. The principle now asserts that
\[
I = \int\eta\sqrt{r^2\dot{\varphi}^2 - \dot{r}^2}\;dt = \int\eta\sqrt{r^2\varphi'^2 - 1}\;dr \tag{7.4.13}
\]
be stationary. Following the same procedure as given above, we find the trajectory is now given by
\[
\varphi - \varphi_0 = \cosh^{-1}\frac{\ell}{\eta r},
\]
for a constant index of refraction. We can always arrange that the integration constant $\varphi_0 = 0$ by fixing the initial point of the measurement of arc length. The extremum of (7.4.13) is just the negative of the hyperbolic distance
\[
I = -\sqrt{\ell^2 - (\eta r)^2} = -\ell\tanh\varphi. \tag{7.4.14}
\]
The extremum of the length of the ray is given by nothing less than the corresponding segment of the Lobachevsky straight line! Canonical parametrization, where we set the radius of the caustic circle equal to $\ell$, and the arc length $s = \sinh\varphi$, enables the profile curve to be written as
\[
\beta(s) = \left(g(s), h(s)\right) = \ell\left(\sinh^{-1}s - \frac{s}{\sqrt{1 + s^2}},\; \frac{1}{\sqrt{1 + s^2}}\right) = \ell\left(\varphi - \tanh\varphi,\; \mathrm{sech}\,\varphi\right).
\]
The term g(s) measures the distance along the axis of revolution, and h(s) measures the distance from the axis of revolution. The action (7.4.12) is merely the distance along the axis of revolution, $S^\dagger = g(s)$, which in terms of $\varphi$ is
\[
S^\dagger = \ell\int\tanh^2\varphi\,d\varphi = \ell\,(\varphi - \tanh\varphi). \tag{7.4.15}
\]
Equation (7.4.15) is the parametric equation for a tractrix, whose surface of revolution resembles a bugle, having constant negative Gaussian curvature $-1/\ell^2$. The segment of the tangent to the tractrix between the point of contact and the x-axis has the constant length $\ell$, as shown in Fig. 7.4. The distance from the origin to the point where this tangent meets the x-axis is $\ell\varphi$. The point on the tractrix which has $\ell$ as the length of its tangent intercepting the x-axis is located at a distance $\ell(\varphi - \tanh\varphi)$ along the x-axis. This is precisely the action (7.4.15).

Fig. 7.4. Newton’s tractrix again.

In contrast to the bright zone, where the action is the difference in areas between a triangle and the sector inscribed in it, in the shadow zone, it is the difference between the distance from the origin to the point of
intersection of the tangent of the tractrix with the x-axis and the distance from the origin to the foot of the perpendicular dropped onto the x-axis from the point of contact of the tangent with the tractrix, as shown in Fig. 7.4. Newton defined the tractrix as the curve for which the length of its tangent from the point of contact to the x-axis is constant. Huygens pointed out that this curve could be interpreted as the path of a dog which is pulled by a leash of length $\ell$. Its primary importance is the role that it plays in hyperbolic geometry where its surface of revolution is the pseudosphere, as we have seen in Fig. 2.19. We recall that the pseudosphere is the negative curvature counterpart of a sphere. One may wonder whether there is a closed figure like a sphere which exhibits negative constant curvature. It was proved by Hilbert at the turn of the twentieth century that there is no smooth unbounded surface of constant negative curvature in ordinary space. Nevertheless, a plane of negative curvature can be obtained by introducing an indefinite metric.
7.5 General Relativity versus Non-Euclidean Metrics

. . . we have no proof of the need for a curved universe (space plus time) and the physical meaning of this theory is very confusing [Brillouin 70]
We may generalize the foregoing analysis to non-Euclidean geometries by writing Fermat’s principle using the first fundamental form,
\[
I = \int\eta\sqrt{E\,dr^2 + G\,d\varphi^2} = \text{extremum}.
\]
The fundamental form is a Clairaut parametrization, which is an orthogonal parametrization where E and G depend only on r. Then since $\varphi$ is a
cyclic coordinate, we obtain the pre-geodesics, or curves of zero geodesic curvature, as
\[
\frac{d\varphi}{dr} = \pm\frac{\ell\sqrt{E}}{\sqrt{G}\,\sqrt{\eta^2 G - \ell^2}}, \tag{7.5.1}
\]
regardless of whether the index of refraction is a constant or not. As we have already said, the angular momentum gives the slant of the pre-geodesic curve. We will determine E and G from the projective metric of a spherical distance on a sphere of radius R, and then let R become imaginary. This will give us a hyperbolic metric of constant curvature, which is obviously the Beltrami metric. Then, as a final step we will consider the mass as constant and not its density. This will have the effect of going to a metric of nonconstant curvature [cf. (7.4.9)]. The line element on a sphere of radius R is
\[
ds^2 = R^2\left(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2\right). \tag{7.5.2}
\]
As is shown in Fig. 7.5, the point P at $(R, \vartheta, \varphi)$, where the azimuthal angle, $\varphi$, is measured around the vertical axis, is projected stereographically onto the plane at point Q with coordinates $(r, \varphi)$. Thus,
\[
\vartheta = 2\tan^{-1}\frac{r}{2R},
\]
whose differential is
\[
R\,d\vartheta = \frac{dr}{1 + r^2/4R^2}.
\]
Fig. 7.5. The stereographic projection of a point on the sphere P onto the plane at point Q.
Introducing this into the expression for the line element (7.5.2) results in
\[
ds^2 = \frac{dr^2}{(1 + r^2/4R^2)^2} + \frac{r^2\,d\varphi^2}{1 + r^2/4R^2}, \tag{7.5.3}
\]
where we have used
\[
\sin\vartheta = 2\sin\tfrac{1}{2}\vartheta\,\cos\tfrac{1}{2}\vartheta = \frac{r/R}{1 + r^2/4R^2}.
\]
Expression (7.5.3) is the metric of the sphere as measured by the coordinates $(r, \varphi)$ in the plane
\[
\tan\tfrac{1}{2}\vartheta = \frac{r}{2R}.
\]
Now, letting the radius become imaginary, $R \to iR$, the metric for a sphere, (7.5.3), transforms into
\[
ds^2 = \frac{dr^2}{(1 - r^2/4R^2)^2} + \frac{r^2\,d\varphi^2}{1 - r^2/4R^2}, \tag{7.5.4}
\]
which identifies the coefficients E and G in the orthogonal, first fundamental form. This is none other than the Beltrami metric that we first met in Sec. 2.4. So the metric for a sphere in elliptic space becomes the metric for a pseudosphere in hyperbolic space when $R \to iR$. Finally, as far as gravity is concerned, (7.5.4) fixes the radius of curvature in terms of the density through (7.4.9). The Gaussian curvature, which has the dimension of inverse square length, is thus proportional to the density. But, (7.4.9) says more. Of the three tests of general relativity, it is the gravitational redshift which is a consequence solely of the equivalence principle, and not of the gravitational field equations. It says that clocks will be slowed down in a gravitational field by an amount
\[
t' = \sqrt{1 - u^2}\;t = \sqrt{1 - \omega^2 r^2}\;t = \sqrt{1 - \frac{2M}{r}}\;t,
\]
where t is the observer’s proper time, and $\omega$ is the angular velocity of rotation. If the density, and not the mass, is constant, the last equality
states
\[
\omega = \sqrt{\frac{8\pi}{3}\rho},
\]
where $\rho^{-1/2}$ is proportional to the Newtonian free fall time. If the equivalence principle,
\[
2M = \omega^2 r^3, \tag{7.5.5}
\]
holds, all the results we get with M = const. should follow with $\rho$ = const. With the total mass M = const., but not $\rho$ = const., (7.5.4) becomes
\[
ds^2 = \frac{dr^2}{(1 - 2M/r)^2} + \frac{r^2\,d\varphi^2}{1 - 2M/r}, \tag{7.5.6}
\]
where we used $M = 4\pi\rho r^3/3$. The Gaussian curvature of the surface,
\[
K = -\frac{M}{r^3}\left(2 - \frac{3M}{r}\right), \tag{7.5.7}
\]
becomes constant, in the large r limit, when the density, $\rho$, does. Since $K = \kappa_1\kappa_2 = LN/EG$, where $\kappa_1$ and $\kappa_2$ are the principal curvatures, and L and N are the orthogonal coefficients of the second fundamental form, $L\,dr^2 + N\,d\varphi^2$, the mean curvature is
\[
H = \frac{1}{2}(\kappa_1 + \kappa_2) = \frac{1}{2}\left(\frac{L}{E} + \frac{N}{G}\right) = -\frac{1}{r}\left(1 - \frac{2M}{r}\right), \tag{7.5.8}
\]
for $L = -(2 - 3M/r)/[r(1 - 2M/r)^2]$ and $N = M/(1 - 2M/r)$. We now compare this with the Schwarzschild metric for the exterior and interior solutions. The planar line elements for the Schwarzschild metric are
\[
ds^2_{\text{ext}} = \frac{dr^2}{1 - 2M/r} + r^2\,d\varphi^2, \tag{7.5.9a}
\]
\[
ds^2_{\text{int}} = \frac{dr^2}{1 - r^2/R^2} + r^2\,d\varphi^2, \tag{7.5.9b}
\]
for the exterior and interior regions, respectively. It would appear that (7.5.9a) and (7.5.9b) are related by the equivalence relation (7.3.7). However, the exterior metric has Gaussian curvature $K_{\text{ext}} = -M/r^3$, while the interior metric has Gaussian curvature $K_{\text{int}} = 1/R^2$. Thus, by going from the exterior to the interior metric what was a surface of negative, non-constant curvature has become one of positive, constant curvature! Now it is well-known that a simple coordinate transformation can eliminate the singularity, $r = 2M$, in (7.5.9a), and the solution can be continued all the way to $r = 0$. But, in no way would we expect that the curvature of the surface changes as we go from the exterior metric, (7.5.9a), to the interior metric, (7.5.9b). This does not happen in the Beltrami metric, for when the principle of equivalence is applied, the curvature becomes $K = -1/R^2$. The radial equation obtained from (7.5.1) for the Beltrami metric is
\[
\dot{r} = \pm\sqrt{\eta^2 - \frac{\ell^2}{r^2}\left(1 - \frac{2M}{r}\right)}, \tag{7.5.10}
\]
and the angular momentum is conserved, $\ell = r^2\dot{\varphi}$. In the case of the deflection of light by a massive body $\eta$ = const., while in the advance of the perihelion, the refractive index is given by (7.6.1) where $W < 0$ would correspond to a complex index of refraction. We can read off from (7.5.10) the potential,
\[
2S = -\frac{2M}{r} + \frac{\ell^2}{r^2}\left(1 - \frac{2M}{r}\right), \tag{7.5.11}
\]
which we shall refer to as Schwarzschild’s, and which is shown in Fig. 7.6 (b). The corresponding Newtonian potential, with = 0, is shown in Fig. 7.6 (a). For $W < 0$, the Newtonian orbits are elliptic and the line W = N in Fig. 7.6 (a) gives the maximum and minimum distances of the particle from the central mass. By contrast, for sufficiently large values of the angular momentum, the Schwarzschild potential has a maximum positive value, as shown in Fig. 7.6 (b). Particles with energy less than this value will not reach the origin. The minimum of S corresponds to a circular orbit.
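The curvatures quoted in this section can be cross-checked with the standard formula for an orthogonal metric $E\,dr^2 + G\,d\varphi^2$ whose coefficients depend only on r, namely $K = -(EG)^{-1/2}\,\partial_r\!\left[E^{-1/2}\,\partial_r\sqrt{G}\right]$. The sketch below evaluates it by finite differences for the Beltrami metric at constant density and at constant mass, and for the two planar Schwarzschild line elements. The numerical values of M, R and r, and all function names, are illustrative assumptions.

```python
import math

def gaussian_curvature(sqrt_E, sqrt_G, r, h=1e-3):
    """K = -(1/sqrt(EG)) d/dr[(1/sqrt(E)) d sqrt(G)/dr], by central differences."""
    def inner(x):
        d_sqrt_G = (sqrt_G(x + h) - sqrt_G(x - h)) / (2.0 * h)
        return d_sqrt_G / sqrt_E(x)
    d_inner = (inner(r + h) - inner(r - h)) / (2.0 * h)
    return -d_inner / (sqrt_E(r) * sqrt_G(r))

M, R, r = 1.0, 20.0, 10.0                   # illustrative values only
f_mass = lambda x: 1.0 - 2.0 * M / x        # constant-mass factor 1 - 2M/r
f_dens = lambda x: 1.0 - (x / R) ** 2       # constant-density factor 1 - r^2/R^2

# Beltrami metric, E = 1/f^2, G = r^2/f
K_beltrami_mass = gaussian_curvature(
    lambda x: 1.0 / f_mass(x), lambda x: x / math.sqrt(f_mass(x)), r)
K_beltrami_density = gaussian_curvature(
    lambda x: 1.0 / f_dens(x), lambda x: x / math.sqrt(f_dens(x)), r)

# Planar Schwarzschild line elements, E = 1/f, G = r^2
K_exterior = gaussian_curvature(
    lambda x: 1.0 / math.sqrt(f_mass(x)), lambda x: x, r)
K_interior = gaussian_curvature(
    lambda x: 1.0 / math.sqrt(f_dens(x)), lambda x: x, r)

print(K_beltrami_mass, K_beltrami_density, K_exterior, K_interior)
```

For these values the four numbers should reproduce $-(M/r^3)(2 - 3M/r)$, $-1/R^2$, $-M/r^3$ and $+1/R^2$ respectively, which makes the sign change between the exterior and interior Schwarzschild elements explicit.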
Fig. 7.6. Comparison of the Newtonian potential (a) with that of the Schwarzschild potential (b). The former is obtained from the latter by setting = 0.
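The qualitative features of the potential (7.5.11) described above can be made quantitative. Setting the r-derivative of (7.5.11) to zero gives $Mr^2 - \ell^2 r + 3M\ell^2 = 0$, so a maximum and a minimum exist only when $\ell^2 > 12M^2$, and a short calculation suggests the maximum is positive only when $\ell > 4M$. The sketch below evaluates these statements; the function names and the sample values of M and $\ell$ are assumptions made for illustration.

```python
import math

def schwarzschild_potential(r, M, ell):
    """Right-hand side of (7.5.11): -2M/r + (ell^2/r^2)(1 - 2M/r)."""
    return -2.0 * M / r + (ell ** 2 / r ** 2) * (1.0 - 2.0 * M / r)

def extrema(M, ell):
    """Radii (r_max, r_min) solving M r^2 - ell^2 r + 3 M ell^2 = 0, or None."""
    disc = ell ** 4 - 12.0 * M ** 2 * ell ** 2
    if disc < 0.0:
        return None                      # no barrier and no circular orbit
    root = math.sqrt(disc)
    return (ell ** 2 - root) / (2.0 * M), (ell ** 2 + root) / (2.0 * M)

M = 1.0                                   # illustrative value only
r_max, r_min = extrema(M, 5.0)            # ell = 5M: barrier and circular orbit
barrier_height = schwarzschild_potential(r_max, M, 5.0)
well_depth = schwarzschild_potential(r_min, M, 5.0)
low_barrier = schwarzschild_potential(extrema(M, 3.8)[0], M, 3.8)
no_extrema = extrema(M, 3.0)              # ell^2 < 12 M^2

print(r_max, r_min, barrier_height, well_depth, low_barrier, no_extrema)
```

The larger root is the radius of the circular orbit mentioned in the text, and the smaller root is the top of the barrier that keeps low-energy particles from reaching the origin.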
The radial equation of the Beltrami metric with non-constant curvature, i.e. constant mass, gives the coupling between rotational repulsion and gravitational attraction that allows the exact calculation of the perihelion shift. General relativity gives the non-conserved angular momentum (7.4.8) and the equation for the radial coordinate as [Møller 52, p. 349]
\[
\dot{r} = \pm\sqrt{A + \frac{2M}{r} - r^2\dot{\varphi}^2},
\]
which upon inserting (7.4.8) becomes
\[
\dot{r} = \pm\left[A + \frac{2M}{r} - \frac{\ell^2}{r^2}\left(1 - \frac{2M}{r}\right)^2\right]^{1/2}, \tag{7.5.12}
\]
where A is a constant. By contrast, the Beltrami metric conserves the angular momentum, $\ell = r^2\dot{\varphi}$, and gives a radial equation,
\[
\dot{r} = \pm\sqrt{\eta^2 - \frac{\ell^2}{r^2}\left(1 - \frac{2M}{r}\right)}, \tag{7.5.13}
\]
which differs from (7.5.12) by the last term. But, since we are usually working at weak fields, we may take $(1 - 2M/r)^2 \approx 1 - 4M/r$, and in the same approximation neglect the second term in the denominator of (7.4.8). But, general relativity comes out with the equation of the trajectory as [Møller 52, Eqn. (28) p. 350]
\[
\frac{1}{r^4}\left(\frac{dr}{d\varphi}\right)^2 = 2A + \frac{2M}{r} - \frac{\ell^2}{r^2} + \frac{2M\ell^2}{r^3}, \tag{7.5.14}
\]
which is what we would get from dividing (7.5.13) by $r^2\dot{\varphi}$, where the index of refraction $\eta = \sqrt{2A + 2M/r}$. Notwithstanding the final equation that general relativity uses, there should have been a discrepancy of a factor of 2 in the last term of (7.5.14). So we may ask where do the general relativistic results come from? Consider a generalization of the inner product. It was Riemann’s idea to generalize the ordinary dot product of two tangent vectors, $v\cdot w$, to the inner product,
\[
v\circ w = \frac{v\cdot w}{g^2}.
\]
In the xy-plane, the inner product is blown up by the factor
\[
g = 1 - \frac{x^2 + y^2}{R^2},
\]
where R is the radius of the disc lying in the xy-plane. This geometric surface is the hyperbolic plane, and, necessarily, we must have $r = \sqrt{x^2 + y^2} < R$. Transforming to polar coordinates, the line element is given by
\[
ds^2 = \frac{dr^2 + r^2\,d\varphi^2}{(1 - r^2/R^2)^2}, \tag{7.5.15}
\]
which just misses being the stereographic projection given by the Beltrami metric (7.5.4) by being a factor of g too high in the denominator of the second term. On the strength of the equivalence principle, (7.5.5), we can write (7.5.15) as
\[
ds^2 = \frac{dr^2 + r^2\,d\varphi^2}{(1 - 2M/r)^2}. \tag{7.5.16}
\]
The equation for the pre-geodesic is
\[
\frac{d\varphi}{dr} = \frac{\dot{\varphi}}{\dot{r}} = \pm\frac{\ell\,(1 - 2M/r)}{r^2\sqrt{\eta^2 - \dfrac{\ell^2}{r^2}\,(1 - 2M/r)^2}}. \tag{7.5.17}
\]
Although this equation splits up into the general relativistic expression for the angular momentum, (7.4.8), and the radial equation, the latter will give twice the value for the rotational-gravitational coupling that (7.5.12) gives in the weak field approximation. This is on account of the fact that it is the same G coefficient in the fundamental form that appears in (7.4.8) and the radial equation. The Schwarzschild exterior solution corresponds exactly to the stereographic metric, (7.5.15), under the transformation (7.5.5). This is the reason for the inability to integrate their equation for the pre-geodesics. Remarkably, (7.5.17), under the equivalence principle, (7.5.5), is one of the few cases which can be integrated in closed form. For a constant index of refraction, and setting R = 2, it becomes
\[
\frac{d\varphi}{dr} = \pm\frac{\ell\,(1 - r^2/4)/r^2}{\sqrt{1 - \left[\ell\,(1 - r^2/4)/r\right]^2}}.
\]
In order to perform the integration we set [O’Neill 66]
\[
u = \frac{b}{r}\left(1 + \frac{1}{4}r^2\right), \qquad\text{where}\quad b = \frac{\ell}{\sqrt{1 + \ell^2}}.
\]
This reduces the pre-geodesic to
\[
d\varphi = \pm\frac{du}{\sqrt{1 - u^2}},
\]
so that the solution is
\[
r^2 + r_0^2 - 2r_0 r\cos(\varphi - \varphi_0) = r_1^2,
\]
where $\varphi_0$ is a constant of integration. This is a Euclidean circle of radius $r_1$, with $r_0^2 = r_1^2 + 4$. The center of the circle, O, having coordinates $(r_0, \varphi_0)$, lies outside the disc since $r_0 > 2$. The circle centered at O has an arc that cuts the rim orthogonally, as shown in Fig. 7.7.

Fig. 7.7. Geodesic curves that cut the rim of the hyperbolic plane orthogonally are arcs of a circle whose center O lies outside the disc.

All geodesics of the hyperbolic plane are either
curved arcs that cut the disc orthogonally or straight lines through the center. If we reinstate the constants, the condition that $\ell$ be real is $\omega r_0 > c$, and such a condition violates the limiting value of the speed of light. The fact that the curved geodesics are fixed by coordinates outside of the hyperbolic plane appears to resemble Mach’s principle, whereby rotation should be reckoned by the distribution of all the masses that make up the universe.
7.6 The Mechanics of Diffraction
Still considering a flat metric where the coefficients of the first fundamental form are E = 1 and $G = r^2$, the simplest generalization is to consider a varying index of refraction,
\[
\eta = \sqrt{2W + \frac{2M}{r}}, \tag{7.6.1}
\]
for a closed orbit having negative total energy, $W < 0$. In the short-wavelength limit, the gravitational potential in (7.6.1) will have the effect of converting the phase of the Bessel function into a Laguerre function. The equation for the pre-geodesic is
\[
\frac{d\varphi}{dr} = \pm\frac{\ell}{r^2\sqrt{-2|W| + 2M/r - \ell^2/r^2}}. \tag{7.6.2}
\]
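With the conservation of the angular momentum, the equation for the radial coordinate is
\[
\frac{dS}{dr} = \dot{r} = \pm\sqrt{-2|W| + \frac{2M}{r} - \frac{\ell^2}{r^2}},
\]
by definition of the eikonal, S. Now, introduce the change of variable, $x = 2r\sqrt{|W|}$, to obtain
\[
\frac{dS}{dx} = \pm\sqrt{-\frac{1}{4} + \frac{\lambda}{x} - \frac{\ell^2}{x^2}},
\]
where $\lambda = M/\sqrt{|W|}$. Its square can be approximated as
\[
\left(\frac{dS}{dx}\right)^2 = \frac{\lambda}{x}\left(1 - \frac{\ell^2}{\lambda x}\right) - \frac{1}{4} \approx \frac{\lambda}{x\,(1 + \ell^2/\lambda x)} - \frac{1}{4}.
\]
With another change of variable, $x = 4\lambda\cos^2\vartheta - \ell^2/\lambda$, we get
\[
\frac{dS}{dx} = \pm\frac{1}{2}\tan\vartheta. \tag{7.6.3}
\]
Noting that $dx = -8\lambda\cos\vartheta\sin\vartheta\,d\vartheta$, we integrate to obtain
\[
S = \pm\lambda\,(2\vartheta - \sin 2\vartheta). \tag{7.6.4}
\]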
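The two steps leading from the approximated square of dS/dx to (7.6.3) and (7.6.4) can be checked numerically. The sketch below takes the approximate expression $\lambda/(x + \ell^2/\lambda) - 1/4$ as given, confirms that the substitution $x = 4\lambda\cos^2\vartheta - \ell^2/\lambda$ turns it into $\tfrac{1}{4}\tan^2\vartheta$, and then confirms that the $\vartheta$-derivative of (7.6.4) equals $|dS/dx|\,|dx/d\vartheta| = 4\lambda\sin^2\vartheta$. The function names and the sample values of $\lambda$, $\ell$ and $\vartheta$ are assumptions for illustration.

```python
import math

def ds_dx_squared(x, lam, ell):
    """Approximate (dS/dx)^2 used in the text: lam/(x + ell^2/lam) - 1/4."""
    return lam / (x + ell ** 2 / lam) - 0.25

def x_of_theta(theta, lam, ell):
    """Change of variable x = 4 lam cos^2(theta) - ell^2/lam."""
    return 4.0 * lam * math.cos(theta) ** 2 - ell ** 2 / lam

def eikonal(theta, lam):
    """Magnitude of (7.6.4): lam (2 theta - sin 2 theta)."""
    return lam * (2.0 * theta - math.sin(2.0 * theta))

lam, ell, theta = 3.0, 1.0, 0.7      # illustrative values only

# Step 1: the substitution reproduces (7.6.3) squared.
step1_gap = abs(ds_dx_squared(x_of_theta(theta, lam, ell), lam, ell)
                - 0.25 * math.tan(theta) ** 2)

# Step 2: dS/dtheta from (7.6.4) matches (1/2) tan(theta) * 8 lam cos sin.
h = 1e-6
dS_dtheta = (eikonal(theta + h, lam) - eikonal(theta - h, lam)) / (2.0 * h)
expected = 0.5 * math.tan(theta) * 8.0 * lam * math.cos(theta) * math.sin(theta)
step2_gap = abs(dS_dtheta - expected)

print(step1_gap, step2_gap)
```

Only the magnitudes are compared here; the overall sign depends on which branch of the square root is followed, which is why the text carries the ± through both equations.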
With $\lambda^{1/3}$ as the absolute constant, $\pi$ times (7.6.4) is precisely the volume of a sphere of radius $\rho = \lambda^{1/3}\vartheta$ in elliptic geometry. The surface distance is proportional to the angle $\vartheta$ subtended at the center so that $\vartheta$ can be used as a measure of surface distance. As $\vartheta$ increases the object you are viewing decreases in apparent size until it arrives at the equator, $\vartheta = \pi/2$. This is the maximum distance in elliptic geometry but not in spherical geometry. Increasing $\vartheta$ still further, the object now seems to be approaching and growing in size until it reaches $\vartheta = \pi$. On completion of the round trip, you return to the north pole but with your head facing in the opposite direction. The elliptic plane is thus said to have only one side [Thurston 97]! The finiteness of lines is the most novel feature of spherical geometry, and no line can be longer than $\pi$. Felix Klein suggested not to consider the whole sphere, but only half of it. This suggestion was made in order to eliminate the one, trivial, blemish of spherical geometry: any two great circles on a sphere meet not just in one point, but in two diametrically opposite points, the so-called antipodes. By restricting the distance to $\pi/2$ the halves
of great circles meet only once. Consider two identical twins placed at antipodal points on each of the hemispheres. The twins regard themselves as a single entity and are not aware of the split. When either twin moves, the other will also move to keep them diametrically separated. Each twin necessarily regards a pair of antipodal points on the sphere as a single location, so that to him every pair of great circles intersects at a single point, while to us it looks like two different points. When one twin traces out a triangle we see two antipodal triangles being traced out. To us Euclideans everything seems to double! This spherical flatland is called elliptic space, and so is superior to spherical geometry. Quite remarkably and unexpectedly, by quantizing, λ = n, we come out with the energy levels of the hydrogen atom,
$$W = -\left(\frac{M}{n}\right)^2,$$
albeit with mass replacing charge. Moreover, the Clairaut equation (7.6.2) is easily integrated to give
$$r = \frac{\ell^2/M}{1 - (\epsilon/M)\sin\vartheta}, \tag{7.6.5}$$
where $\epsilon = \sqrt{M^2 - 2|W|\ell^2}$. This is the equation for an ellipse since the eccentricity,
$$\frac{\epsilon}{M} = \left(1 - \frac{2|W|\ell^2}{M^2}\right)^{1/2} = \left(1 - 2\left(\frac{\ell}{n}\right)^2\right)^{1/2} < 1. \tag{7.6.6}$$
The lengths of the major and minor axes, $M$ and $\ell\sqrt{2|W|}$, clearly show the competing forces of gravitational attraction and centrifugal repulsion. The transition point occurs where $S'(x_0) = 0$, which is
$$x_0 = \frac{(2n)^2 - \ell^2}{n}.$$
For $x > x_0$, or equivalently,
$$r > \frac{(2n)^2 - \ell^2}{2M}, \tag{7.6.7}$$
there is a transition to the exponential region, where (7.6.4) transforms into
$$S^{\dagger}_{\pm} = \pm\lambda(\sinh 2\vartheta - 2\vartheta). \tag{7.6.8}$$
With the same absolute constant $\lambda^{1/3}$, π times (7.6.8) is the volume of a sphere with hyperbolic radius $\lambda^{1/3}\vartheta$. But this is exactly what we should have expected, since the transformation from spherical to hyperbolic geometry is achieved by letting the radius become imaginary, ϑ → iϑ. It is intriguing to ask how the hydrogen atom would behave in a hyperbolic plane. Rearranging inequality (7.6.7) we get
$$\frac{2M}{r} + \frac{\ell^2}{r^2} > \left(\frac{2n}{r}\right)^2,$$
showing that gravitational and centrifugal forces no longer oppose one another. The gravitational potential would become repulsive, and for the hydrogen atom it would mean a repulsive Coulomb potential; the atom would fly apart. All the effects that we have so far discussed are nonrelativistic because c has not made its appearance. The small corrections to celestial mechanics are fully accounted for by the Beltrami metric of hyperbolic geometry, as we shall now go on to show.
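The identification of π times (7.6.4) and (7.6.8) with sphere volumes can be sanity-checked in the small-angle limit, where both should reduce to the Euclidean volume (4/3)πρ³ with ρ = λ^{1/3}ϑ. The sketch below is only an illustration; the function names and sample values are assumptions.

```python
import math

def elliptic_volume(theta, lam):
    """pi times (7.6.4): pi*lam*(2*theta - sin(2*theta))."""
    return math.pi * lam * (2.0 * theta - math.sin(2.0 * theta))

def hyperbolic_volume(theta, lam):
    """pi times (7.6.8): pi*lam*(sinh(2*theta) - 2*theta)."""
    return math.pi * lam * (math.sinh(2.0 * theta) - 2.0 * theta)

def euclidean_volume(theta, lam):
    """(4/3)*pi*rho^3 with rho = lam**(1/3) * theta."""
    rho = lam ** (1.0 / 3.0) * theta
    return 4.0 / 3.0 * math.pi * rho ** 3
```

For any ϑ > 0 the elliptic volume falls below the Euclidean one and the hyperbolic volume lies above it, as the series of sin and sinh suggest.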
7.6.1 Gravitational shift of spectral lines
In Sec. 3.8.2.3 we saw how Einstein replaced the Doppler expression by a new one which predicted the effect that gravity would have upon spectral lines. That is, he replaced the nonrelativistic expression for the Doppler shift,
$$\frac{\nu - \nu_0}{\nu_0} = -u/c, \tag{7.6.9}$$
by
$$\frac{\nu - \nu_0}{\nu_0} = -\alpha/2r. \tag{7.6.10}$$
The confirmation of (7.6.10) is indeed miraculous, given its derivation. We know that motion causes a shift in the frequency, so the right-hand side should be u/c, the relative velocity. But, if light from an unaccelerated frame is emitted, it will arrive at the accelerated frame in time h/c,
whose acceleration is being caused by gravity. So we want to replace u/c by gh/c², where g is the uniform acceleration on the Earth’s surface. Finally, we want to replace this scenario with one governed by a (Newtonian) gravitational potential which depends on the mass and has an inverse dependence on distance. The factor of one-half in (7.6.10) is meant to show that we are in the nonrelativistic region. So, we have gone from the Doppler effect, which depends only on the speed, to one in which mass and distance have entered. The Doppler shift (7.6.9) can also lead to a blueshift if the object is approaching. The gravitational shift cannot, for what was attraction would then become repulsion.
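To get a feel for the magnitudes in (7.6.10), the sketch below evaluates α/2r for light leaving the solar surface and gh/c² for a short vertical path on Earth. It assumes that α denotes the Schwarzschild radius 2GM/c², so that α/2r = GM/c²r; the numerical constants and the 22.5 m height are illustrative values supplied here, not taken from the text.

```python
C = 299_792_458.0            # speed of light, m/s
GM_SUN = 1.32712440018e20    # solar gravitational parameter, m^3 s^-2 (assumed value)
R_SUN = 6.957e8              # solar radius, m (assumed value)
G_EARTH = 9.81               # surface acceleration, m/s^2 (assumed value)

def gravitational_shift(gm, r, c=C):
    """Magnitude of (7.6.10), assuming alpha = 2*GM/c^2."""
    alpha = 2.0 * gm / c ** 2
    return alpha / (2.0 * r)

def uniform_field_shift(g, h, c=C):
    """Magnitude of the shift g*h/c^2 for a height h in a uniform field."""
    return g * h / c ** 2

sun_shift = gravitational_shift(GM_SUN, R_SUN)
tower_shift = uniform_field_shift(G_EARTH, 22.5)  # 22.5 m is an illustrative height
```

With these inputs the solar shift is about two parts in a million, while the terrestrial one is some nine orders of magnitude smaller.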
7.6.2 The deflection of light
Gravity makes the medium optically denser in the vicinity of a large mass, like the Sun, than it would be in its absence. As a result, light rays will be bent toward the Sun rather than being straight lines. As we know from Sec. 3.8.2 this effect was originally predicted by Soldner as far back as 1801; it was rediscovered by Einstein [52] in 1911 by redefining the Doppler effect. The remarkable thing is that the equation for the trajectory of a light ray in a gravitational field can be derived directly from the Beltrami metric once we transform from a constant free-fall time, or frequency, to one of constant mass. This is not to be taken as a physical equivalence, but, rather, a mathematical one. However, the transformed metric of non-constant curvature is as ‘physical’ as the original Beltrami metric of constant curvature. The Beltrami metric at constant curvature describes a uniformly rotating disc, while the same metric at non-constant curvature describes the deflection of light in a gravitational field. The two are related by the mathematical statement of the equivalence principle. Nothing could be so simple nor so beautiful. The rest of the analysis is standard; but, for completeness’ sake we reproduce it here. We first introduce the change of variable, $r = \rho^{-1}$, in the pre-geodesic for the Beltrami metric, (7.5.10), to obtain
$$\frac{d\rho}{d\varphi} = \pm\sqrt{\ell^{-2} - \rho^2 + 2M\rho^3}. \tag{7.6.11}$$
This is identical to the general relativistic equation for the trajectory (7.5.14) with A = 0. Second, we introduce another new variable, $\sigma = \ell\rho\sqrt{1 - 2M\rho}$, and neglecting the small term, $2M\rho^3$, we get
$$\varphi = \int_0^{\rho}\frac{\ell\,d\rho}{\sqrt{1 - \ell^2\rho^2}} = \sin^{-1}\ell\rho.$$
The geodesic $r = \ell/\sin\varphi$, obtained by setting the constant of integration, $\varphi_0 = \pi/2$, is a straight line which passes the origin at a distance $\ell$ when ϕ = π/2, and goes to infinity for ϕ → 0, π. The exact equation (7.6.11) may be cast in the form,
$$\frac{d\rho}{d\varphi} = \frac{1}{\ell}\sqrt{1 - \sigma^2}, \tag{7.6.12}$$
where $\sigma = \ell\rho\sqrt{1 - 2M\rho}$. Since 2Mρ is a small quantity, we can use the approximations:
$$\sigma = \ell\rho(1 - M\rho), \qquad \ell\rho = \sigma(1 + M\rho) = \sigma(1 + M\sigma/\ell),$$
which are valid to first order. Differentiating we find $\ell\,d\rho = (1 + 2M\sigma/\ell)\,d\sigma$. Introducing these approximations into (7.6.12), and integrating we get
$$\varphi = \int_0^{\rho}\frac{\ell\,d\rho}{\sqrt{1 - \sigma^2}} = \int_0^{\sigma}\frac{(1 + 2M\sigma/\ell)\,d\sigma}{\sqrt{1 - \sigma^2}} = \arcsin\sigma - \frac{2M}{\ell}\sqrt{1 - \sigma^2} + \frac{2M}{\ell}.$$
It is apparent from (7.6.12) that ϕ will have an extremum when σ = 1. This value corresponds to the closest approach of the ray to the Sun. At this distance the angle ϕ will be $\varphi_m = \pi/2 + 2M/\ell$. Reinstating all constants we find the total deflection to be
$$2\varphi_m - \pi = \frac{4GM}{\ell c}, \tag{7.6.13}$$
which is twice that obtained by treating the interaction through a Newtonian potential, as in Fig. 7.6 (a). The ratio of angular momentum to the speed of light, $\ell/c$, plays the role of a collision parameter, or the distance of closest approach. As expected, (7.6.13) is twice the ratio of the Schwarzschild radius to a characteristic length, here the collision parameter.
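Two small numerical checks of this result may be useful. The sketch below evaluates the first-order integral for ϕ at σ = 1 by the midpoint rule (after the substitution σ = sin t, which removes the endpoint singularity) and then evaluates (7.6.13) for a ray grazing the Sun, identifying ℓ/c with the solar radius. The constants and function names are assumptions introduced for illustration.

```python
import math

C = 299_792_458.0            # speed of light, m/s
GM_SUN = 1.32712440018e20    # solar gravitational parameter, m^3 s^-2 (assumed value)
R_SUN = 6.957e8              # solar radius used as collision parameter, m (assumed value)

def phi_at_closest_approach(m_over_ell, steps=10_000):
    """Midpoint-rule value of the integral of (1 + 2M*sigma/ell)/sqrt(1 - sigma^2)
    over sigma in [0, 1], written with sigma = sin(t), t in [0, pi/2]."""
    dt = (math.pi / 2.0) / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += (1.0 + 2.0 * m_over_ell * math.sin(t)) * dt
    return total

def total_deflection(gm, b, c=C):
    """Deflection (7.6.13) with ell/c replaced by the collision parameter b."""
    return 4.0 * gm / (c * c * b)

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi
sun_deflection_arcsec = total_deflection(GM_SUN, R_SUN) * ARCSEC_PER_RAD
```

The quadrature reproduces π/2 + 2M/ℓ, and the solar value comes out near the familiar 1.75 seconds of arc.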
We can now show that the time of propagation along a ray will be lengthened in the presence of a massive body. Since the angular momentum has its Euclidean value, $\ell = \dot\varphi r^2$, in the Beltrami model, the action can be read off from (7.5.13) as
$$S_{\pm} = \pm\int\frac{\sqrt{r^2 - \ell^2(1 - 2M/r)}}{r}\,dr,$$
for η = 1. Treating the last term in the numerator as a small quantity we have
$$S_{\pm} \approx \pm\int\frac{\sqrt{r^2 - \ell^2}}{r}\left[1 + \frac{M\ell^2}{r(r^2 - \ell^2)}\right]dr = \pm\left[\sqrt{r^2 - \ell^2}\left(1 + \frac{M}{r}\right) - \ell\cos^{-1}\frac{\ell}{r}\right].$$
The length and the time it takes to propagate along a ray will be lengthened by the amount M/r, for small values of the gravitational potential. The deflection of light attests to this lengthening by curving the ray.
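The closed form of the approximate action can be checked by differentiating it numerically and comparing with the integrand. The following sketch does this with a central difference; the function names and the sample values of r, ℓ and M are illustrative assumptions.

```python
import math

def integrand(r, ell, m):
    """First-order integrand: sqrt(r^2 - ell^2)/r * (1 + M*ell^2/(r*(r^2 - ell^2)))."""
    root = math.sqrt(r * r - ell * ell)
    return root / r * (1.0 + m * ell * ell / (r * (r * r - ell * ell)))

def antiderivative(r, ell, m):
    """Closed form: sqrt(r^2 - ell^2)*(1 + M/r) - ell*arccos(ell/r)."""
    root = math.sqrt(r * r - ell * ell)
    return root * (1.0 + m / r) - ell * math.acos(ell / r)

def numerical_derivative(r, ell, m, h=1e-5):
    """Central finite difference of the closed form with respect to r."""
    return (antiderivative(r + h, ell, m) - antiderivative(r - h, ell, m)) / (2.0 * h)
```

The term proportional to M/r in the closed form is the extra path length attributed to the gravitational potential.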
7.6.3 Advance of the perihelion
We treated the advance of the perihelion in Sec. 3.8.2 by the elegant method devised by Ritz. The only blemish is that the arbitrary parameter appearing in his force law had to be chosen to give the observational result. In this section we determine the advance by our optico-gravitational approach. By doing so we go against the adage of Boltzmann who said that elegance should be left to tailors and cobblers. Taking into account both the gravitational potential and the quadrupole interaction, the index of refraction can be written as
$$\eta = \left[-2W + \frac{2M}{r}\left(1 + \frac{\ell^2}{r^2}\right)\right]^{1/2}.$$
In the unperturbed state, where the quadrupole interaction is absent, the gravitational potential must be large enough to produce a real eccentricity,
(7.6.6). The corresponding action over a period of the motion,
$$S(r, \ell) = \oint\frac{\sqrt{\eta^2 r^2 - \ell^2}}{r}\,dr = \oint\left(-2W + \frac{2M}{r} - \frac{\ell^2}{r^2} + \frac{2M\ell^2}{r^3}\right)^{1/2}dr, \tag{7.6.14}$$
shows that a closed trajectory will result from a dynamic balance between gravitational and centrifugal forces. The quadrupole interaction requires r > 3M in order that the angular momentum be real at the extremum of the potential, $dS'/dr = 0$,
$$\ell^2 = \frac{Mr^2}{r - 3M},$$
as seen in Fig. 7.6(b). Introducing this fact into (7.6.14) by writing $r = r' + 3M$, and retaining only those terms that are at most quadratic in M, gives
$$S(r, \ell) = \oint\left[-2W + \frac{2M}{r}\left(1 - \frac{3M}{r}\right) - \frac{\ell^2}{r^2}\left(1 - \frac{6M}{r}\right) + \frac{2M\ell^2}{r^3} + \cdots\right]^{1/2}dr,$$
where for brevity we have dropped the prime on r. Expanding the integrand in powers of the small correction terms results in $S = S^{(0)} - 3M^2 S^{(1)}$, where the unperturbed action is
$$S^{(0)} = \oint\sqrt{-2W + \frac{2M}{r} - \frac{\ell^2}{r^2}}\,dr = 2\pi\left(\frac{M}{\sqrt{2W}} - \ell\right)$$
and the first-order correction is
$$S^{(1)} = \oint\frac{dr}{r\sqrt{-2Wr^2 + 2Mr - \ell^2}} = -\frac{2\pi}{\ell}.$$
Since the trajectory is defined by the equation,
$$\varphi + \frac{\partial S}{\partial\ell} = \text{const.},$$
the change in the angle ϕ over one revolution is
$$\Delta\varphi = -\frac{\partial S}{\partial\ell} \tag{7.6.15}$$
$$= 2\pi + 3M^2\frac{\partial S^{(1)}}{\partial\ell} = 2\pi\left(1 + \frac{3M^2}{\ell^2}\right). \tag{7.6.16}$$
General relativity gives the shift as 6πM/l, where l is the semi-latus rectum. From the equation of the ellipse, we find $l = \ell^2/M$, so that (7.6.16) is exactly the expression found in general relativity. This can also be compared with Ritz’s result (3.8.22), which we will do in a moment. The rotation of the perihelion of Mercury per revolution amounts to 0.104″. The dimensionless total energy constant is $W = 2.59 \times 10^{-8}$, and the mean motion is $\omega = \ell/ab = (2W)^{3/2}/M = 8.34 \times 10^{-7}\ \mathrm{s}^{-1}$, where $a = M/2W$ and $b = \ell/\sqrt{2W}$ are the semi-major and semi-minor axes. The calculated period of Mercury is τ = 2π/ω = 87.25 days, which is close to the actual value of τ = 88 days. The frequency of rotation of the perihelion is $\omega' = \omega\varphi_1 = 4.25 \times 10^{-13}\ \mathrm{s}^{-1}$. This approach can be compared to Ritz’s calculation of the advance of the perihelion in Sec. 3.8.2. Ritz begins with his law of force, and finds it necessary to consider that angular momentum is not conserved. This he shares with general relativity. By contrast, the Beltrami metric does not lead to any violation of the conservation of angular momentum. The transformation from a metric of constant negative curvature to one of non-constant curvature uses (7.5.5). General relativity would use half that because the metric it uses is the stereographic metric, (7.5.15). The factor of 2 enters when the quadratic term is approximated by a linear one. It is not the numerical factors which make or break a theory. In non-Euclidean geometries the unit of measurement is left to one’s discretion. This allays any suspicion that new metrics must be invented to account for the optico-gravitational phenomena.
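The figures quoted for Mercury can be reproduced approximately from the formula 6πM/l. The sketch below uses standard orbital elements that are supplied here as assumptions (they are not given in the text), takes M = GM/c² as a length, and converts the quoted mean motion into a period.

```python
import math

C = 299_792_458.0            # speed of light, m/s
GM_SUN = 1.32712440018e20    # solar gravitational parameter, m^3 s^-2 (assumed value)
A_MERCURY = 5.7909e10        # semi-major axis, m (assumed value)
E_MERCURY = 0.2056           # orbital eccentricity (assumed value)
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def advance_per_revolution(gm, a, e, c=C):
    """Perihelion advance 6*pi*M/l in radians, with M = GM/c^2 and
    semi-latus rectum l = a*(1 - e^2)."""
    m = gm / c ** 2
    l = a * (1.0 - e ** 2)
    return 6.0 * math.pi * m / l

def period_days(omega):
    """Orbital period 2*pi/omega, converted from seconds to days."""
    return 2.0 * math.pi / omega / 86400.0

advance_rad = advance_per_revolution(GM_SUN, A_MERCURY, E_MERCURY)
advance_arcsec = advance_rad * ARCSEC_PER_RAD
perihelion_frequency = 8.34e-7 * advance_rad   # omega times advance per revolution
```

With these inputs the advance per revolution comes out close to 0.104″, and the period and perihelion frequency agree with the quoted values to within rounding.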
References
[Brillouin 70] L. Brillouin, Relativity Reexamined (Academic Press, New York, 1970).
[Carstoiu 69] J. Carstoiu, “Les deux champs de gravitation et propagation des ondes gravifiques,” Compt. Rend. 268 (1969) 201–263; J. Carstoiu, “Nouvelles remarques sur les deux champs de gravitation et propagation des ondes gravifiques,” Compt. Rend. 268 (1969) 261–264.
[Einstein 11] A. Einstein, “On the influence of gravitation on the propagation of light,” Ann. der Phys. 35 (1911); translated in W. Perrett and G. B. Jeffery, The Principle of Relativity (Methuen, London, 1923), pp. 99–108.
[Einstein 52] A. Einstein, The Principle of Relativity (Dover, New York, 1952), pp. 99, 111.
[Essen 78] L. Essen, “Relativity and time signals,” Wireless World, October 1978, pp. 44, 45.
[Hafele & Keating 72] J. C. Hafele and R. E. Keating, “Around the world atomic clocks: predicted relativistic time gains,” Science 177 (1972) 166–168.
[Heaviside 88] O. Heaviside, “On electromagnetic waves, especially in relation to the vorticity of the impressed forces; and the forced vibrations of electromagnetic systems,” Phil. Mag. May 1888, 380–449.
[Heaviside 93] O. Heaviside, Electromagnetic Theory, Vol. I (The Electrician, London, 1893), Appendix B.
[Heaviside 94] O. Heaviside, Electrical Papers, Vol. 2 (Macmillan, New York, 1894), p. 444.
[Landau & Lifshitz 75] L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields (Pergamon, Oxford, 1975).
[Møller 52] C. Møller, The Theory of Relativity (Oxford U. P., London, 1952), p. 355.
[Sexl & Sexl 79] R. Sexl and H. Sexl, White Dwarfs–Black Holes (Academic Press, New York, 1979), p. 43.
[Sommerfeld 23] A. Sommerfeld, Atomic Structure and Spectral Lines (E. P. Dutton, New York, 1923), p. 466.
[Thurston 97] W. P. Thurston, Three-dimensional Geometry and Topology (Princeton U. P., Princeton, NJ, 1997), p. 34.
Chapter 8
Relativity of Hyperbolic Space
Of course, since Einstein, we do not use hyperbolic geometry to model the geometry of the universe. [Greenberg 93]
8.1 Hyperbolic Geometry and the Birth of Relativity
Almost immediately after the birth of special relativity, Sommerfeld [09] made the interesting observation that the relativistic composition laws of velocities are “no longer the formulas of the plane but those of spherical trigonometry (with imaginary sides),” that is, trigonometrical formulas obtained by replacing the real argument by an imaginary one. Spheres of imaginary radius had been known for a long time, as this identity was pointed out by Lobaschevsky [98] himself. The first explicit connection of Lobaschevsky geometry to relativity was made by Varićak [10]. The hyperbolic geometry of relativity represents the velocity addition law as a triangle on the surface of a pseudosphere, a surface of revolution looking like a bugle [cf. Fig. 2.19], together with the angle of parallelism [cf. Fig. 2.12], which measures the deviation from Euclidean space. As the relative velocity approaches unity, the angle of parallelism approaches zero. The fact that the angle of parallelism provides a unique relation between circular and hyperbolic functions can be found in the early textbook on relativity by Silberstein [14], written in 1914. These developments were not followed up, and no place for hyperbolic geometry could be found in the relativity textbooks that followed. Undoubtedly, this was due to the influence of Einstein’s general relativity, which is based upon Riemannian geometry, where matter and geometry are woven together. Yet, even more astonishing is that Poincaré missed all of this! Over fifty years after the discovery of hyperbolic geometry, Poincaré developed
two models of hyperbolic geometry: the upper half-plane and disc models that we discussed in Sec. 2.4. The inhabitants of the hyperbolic plane, Poincarites, consider geodesics as straight lines, while to us Euclideans they would appear as circular arcs meeting the boundary orthogonally in the disc model, and we would see Poincarites shrink as they approach the real axis in the upper half-plane. Poincarites would not be able to measure their shrinkage because the rulers they use shrink along with them. These distortions are created by motion, and Poincaré was well aware of the contraction that bodies undergo in the direction of the Earth’s motion, of an amount proportional to the square of the aberration. This is the famous FitzGerald–Lorentz contraction that was first postulated independently by FitzGerald and Lorentz, as an explanation of the Michelson–Morley null result, which we discussed in Sec. 3.2. Poincaré [05] was also aware of the relativistic velocity composition law, since it was he who discovered it. Yet, he did not recognize that the longitudinal Doppler law is the invariant cross-ratio if the velocity in that law is that resulting from the relativistic subtraction law. Measurements in relativity consist in sending and receiving light signals. Distances are measured in terms of time differences. At each step, the time at which a signal is received is the time at which it was sent out multiplied by a factor which turns out to be the longitudinal Doppler shift [Whitrow 33]. The space-time transformations from one inertial frame to another, involving Doppler shifts, combine to give the Lorentz transformation [Milne 48].
Had Poincaré realized that his definition of hyperbolic distance in terms of the logarithm of the cross-ratio, which for the distance between any two velocity points on a vertical half-line with an endpoint at infinity, is proportional to the logarithm of the longitudinal Doppler shift, he could have carried over the battery of concepts and tools he developed some twenty years before relativity — without distinguishing between the ‘special’ and ‘general’ theories of Einstein. This we plan to do in this chapter. The chapter is organized as follows. We discuss in Sec. 8.2 the connection between geometrical rigid motions and their relations to particular inertial frames of reference. Compounding Doppler shifts at different velocities yields the Poincaré composition law, which, in terms of homogeneous coordinates shows that the Lorentz transform is a unique Möbius
automorphism that exchanges an inertial frame of equal and opposite velocities with the state at rest. We will also appreciate it as the isomorphism that converts the Poincaré model to the Klein, or projective, model of the hyperbolic plane, as well as establishing the limit for hyperbolic rotations in terms of the angle of parallelism. We then discuss in Sec. 8.3 the relativistic phenomenon of aberration and show that it conforms to the hyperbolic law of sines. In terms of a right triangle inscribed in a unit disc, angular deformations of the noncentral angle and contractions of the side of the triangle perpendicular to the motion will respectively be related to the facts that the sum of the angles of a hyperbolic triangle is less than π, and that a FitzGerald–Lorentz contraction in the direction normal to the motion makes it look more like a rotation than a contraction. The hyperbolic contraction is in the direction normal to the motion, and not in the direction of the motion. It is exactly the same second-order Doppler effect that Ives and Stilwell measured back in 1938, as we have described in Sec. 3.4. Ives considered it a demonstration that clocks in motion run slower, one that has nothing to do with a relativistic time dilatation. The radar method of sending and receiving light signals to measure intervals of time and distance will then be used in Sec. 8.4 to contrast states of uniform motion and uniform acceleration. We confirm Whitrow’s [80] conclusion that acceleration does, indeed, affect the rate of a clock, and convert his inequality for time dilatation into an equality for systems in uniform acceleration. The characteristic means that we find for times of reflection for systems in uniform motion and uniform acceleration imply different temporal scales. Applications to general relativity and cosmology follow.
Beltrami coordinates and logarithmic time are used to derive a metric first investigated by Friedmann which corresponds to ‘dust-like’ matter at zero pressure in terms of Einstein’s energy–momentum tensor. A comparison with the ‘general’ relativity then follows in Sec. 8.5. In Sec. 8.6 we unveil the hyperbolic geometry of relativity, and generalize it to multidimensional velocity space in Sec. 8.7. The Friedmann–Lobaschevsky space finds confirmation in Hubble’s discovery of the redshift in the spectra of galaxies: The greater the shift the more distant the galaxy. This we show in Sec. 8.9 is a consequence of the hyperbolic measure of the
velocity, and its relation to the logarithmic scale of time through Hubble’s law.
8.2 Doppler Generation of Möbius Transformations
It has long been known [Silberstein 14] that the relativistic composition of velocities obeys hyperbolic geometry. Robb [11] proposed to call the hyperbolic measure $\bar u$ of the velocity, defined through its Euclidean measure by
$$u = \tanh\bar u, \tag{8.2.1}$$
the ‘rapidity,’ where u is the relative velocity, having set the absolute constant c = 1. We can invert (8.2.1) to find the expression
$$\bar u = \frac{1}{2}\ln\frac{1+u}{1-u} = \frac{1}{2}\ln\{u, 0\,|\,{-1}, 1\}, \tag{8.2.2}$$
for the hyperbolic measure of the relative velocity $\bar u$, which unlike its Euclidean counterpart is not confined to the closed interval [−1, 1]. Recall from Sec. 2.2.4 that the curly brackets represent the cross-ratio of the distance between x and y in the closed interval [a, b],
$$\{a, b\,|\,x, y\} = \frac{(a - x)(b - y)}{(a - y)(b - x)}. \tag{8.2.3}$$
Exponentiating both sides of (8.2.2) shows that the exponential of the hyperbolic length is given by the longitudinal Doppler shift,
$$e^{\bar u} = \left(\frac{1+u}{1-u}\right)^{1/2} =: K. \tag{8.2.4}$$
We will now show that compounding Doppler shifts generates Möbius transforms thereby relating geometric rigid motions with specific inertial frames. And just as Poincaré found, these linear fractional functions can be used to define the concept of length under which the polygons of the respective tessellation are of equal size.
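Before compounding Doppler shifts, the definitions (8.2.2) to (8.2.4) can be exercised numerically. The sketch below, whose function names are illustrative assumptions, evaluates the cross-ratio (8.2.3), the hyperbolic measure of the velocity, and the Doppler factor K, and checks that multiplying Doppler factors corresponds to the relativistic addition of collinear velocities (with c = 1).

```python
import math

def cross_ratio(a, b, x, y):
    """Cross-ratio {a, b | x, y} as defined in (8.2.3)."""
    return ((a - x) * (b - y)) / ((a - y) * (b - x))

def hyperbolic_measure(u):
    """Hyperbolic measure of the velocity, (8.2.2), with c = 1."""
    return 0.5 * math.log(cross_ratio(u, 0.0, -1.0, 1.0))

def doppler(u):
    """Longitudinal Doppler factor K of (8.2.4)."""
    return math.sqrt((1.0 + u) / (1.0 - u))

def compose(u, v):
    """Relativistic addition of two collinear velocities."""
    return (u + v) / (1.0 + u * v)
```

For example, u = 0.6 gives K = 2, and compounding that shift with itself corresponds to the velocity 2u/(1 + u²).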
Preliminarily, compounding the Doppler shift at velocity u with itself gives
$$\frac{1+u}{1-u} = \left(\frac{1+\lambda}{1-\lambda}\right)^{1/2},$$
where λ is the relative velocity in an inertial frame comprised of equal and opposite velocities
$$\lambda = \frac{2u}{1+u^2} = \tanh(2\bar u). \tag{8.2.5}$$
Next, consider the cross-ratio,
$$\{v, \lambda\,|\,{-1}, 1\} = \frac{1+v}{1-v}\cdot\frac{1-\lambda}{1+\lambda} = \frac{1+v'}{1-v'}.$$
The ‘new’ relative velocity v′ is given in terms of the ‘old’ one by
$$v' = \frac{v - \lambda}{1 - \lambda v}, \tag{8.2.6}$$
which will be easily recognized as the relativistic subtraction law of the velocities. Introducing the second equality in (8.2.5) into (8.2.6) gives the familiar Lorentz ‘rotation,’
$$v = \frac{v'\cosh(2\bar u) + \sinh(2\bar u)}{v'\sinh(2\bar u) + \cosh(2\bar u)},$$
in terms of the homogeneous coordinates v and v′. Multiplying (8.2.6) out gives
$$vv' + \lambda^{-1}(v - v') - 1 = 0. \tag{8.2.7}$$
Two cases are of interest: If the relative velocities are equal, v = v′, (8.2.7) becomes the simplest hyperbolic involution with conjugate points at ±1 [cf. Sec. 2.2.5.2]. This is not of physical interest; rather, what is of physical interest is when the velocities are equal and opposite, v′ = −v. For then (8.2.7) reduces to the quadratic form,
$$v^2 - 2\lambda^{-1}v + 1 = 0,$$
which has two real roots
$$v_{\pm} = \frac{1 \pm \sqrt{1 - \lambda^2}}{\lambda}.$$
Since the relative velocity λ is given by (8.2.5), the two roots are $v_+ = 1/u$ and $v_- = u$. The negative of (8.2.6),
$$M_\lambda(v) = \frac{v - \lambda}{\lambda v - 1}, \tag{8.2.8}$$
is the unique Möbius automorphism which exchanges λ and 0, viz. $M_\lambda(\lambda) = 0$ and $M_\lambda(0) = \lambda$. This can be recognized as a special case of the property that $M_\lambda$ is involutory: $M_\lambda \circ M_\lambda = I$, the identity. A Möbius transform takes any triplet $(v_1, v_2, v_3)$ into any other triplet. The demonstration of the existence of such a transform rests on showing that there is a Möbius transform for which $M(v_1) = 0$, $M(v_2) = 1$ and $M(v_3) = \infty$. The values are not only the range of the conjugate points, u and 1/u, but, moreover, allow the Möbius transform to be written as the invariant cross-ratio
$$M(v) = \{v, v_2\,|\,v_1, v_3\} = \frac{(v - v_1)}{(v - v_3)}\cdot\frac{(v_2 - v_3)}{(v_2 - v_1)}.$$
Thus, by the construction of the cross-ratio, we have $M(v_1) = 0$, $M(v_2) = 1$, and $M(v_3) = \infty$. If we identify M with (8.2.8) then the triplet (0, 1, ∞) occurs when $v_1 = \lambda$, $v_2 = 1$, and $v_3 = 1/\lambda$. The latter would imply that v is unbounded, which contradicts special relativity. It is the symmetry of the hyperbolic involution which implies that if u is a solution, then so is 1/u. We must therefore show that the conjugate point $v_+$ is a repulsive fixed point, implying that repeated mappings of $M_\lambda$ repel it away from $v_+$. The Möbius transform F conjugating the normalized $M_\lambda$ to its standard form sends $v_+$ to zero and $v_-$ to infinity,
$$F(v) = \frac{v - v_+}{v - v_-}, \qquad\text{with}\qquad F^{-1}(v) = \frac{-v_- v + v_+}{-v + 1}.$$
Rather than calculate the composition $F \circ M_\lambda \circ F^{-1}(v)$ to find the standard form of $M_\lambda$, it suffices to calculate it at a point, $F^{-1}(1) = \infty$, so that $M_\lambda(\infty) = 1/\lambda$, and $F(1/\lambda) = -1$. Thus,
$$F \circ M_\lambda \circ F^{-1}(v) = e^{i\pi}v, \tag{8.2.9}$$
is the standard form of $M_\lambda$, and represents a rotation about the fixed point $v_-$ which interchanges λ with O. The Möbius transformation (8.2.8) is therefore elliptic. Several properties are immediate: $v_+$ is the inverse of $v_-$ with respect to the circle of inversion $C_1$ in Fig. 8.1. The fixed point $v_-$ lies inside the unit disc and the other fixed point $v_+$ lies outside except when $v_- = v_+$, and then they both lie on the circle of inversion. Fixed points closer and closer to the center send their conjugate points further and further away. Inversion takes circles orthogonal to the original one into themselves. Two such orthogonal circles, $C_1$ and $C_2$, are shown in Fig. 8.1. The line joining their centers has the fixed points diametrically opposite, lying on the circumference of the circle $C_2$ whose center is $O_2 = \lambda^{-1}$. The arcs of $C_2$ lying in $C_1$, and orthogonal to it at the points of contact, correspond to geodesics in the Poincaré model.
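The properties claimed for the transformation (8.2.8) are easy to confirm with real arithmetic. The sketch below, with illustrative function names, checks that M_λ exchanges λ and 0, that it is an involution, that its fixed points are u and 1/u, and that conjugation by F reduces it to multiplication by −1 as in (8.2.9).

```python
import math

def lam_from_u(u):
    """Relative velocity (8.2.5): lambda = 2u/(1 + u^2)."""
    return 2.0 * u / (1.0 + u * u)

def mobius(v, lam):
    """Moebius automorphism (8.2.8): M(v) = (v - lam)/(lam*v - 1)."""
    return (v - lam) / (lam * v - 1.0)

def fixed_points(lam):
    """Roots (v_minus, v_plus) of v^2 - 2 v/lam + 1 = 0."""
    root = math.sqrt(1.0 - lam * lam)
    return (1.0 - root) / lam, (1.0 + root) / lam

def conj(v, v_minus, v_plus):
    """F(v) = (v - v_plus)/(v - v_minus), sending v_plus to 0 and v_minus to infinity."""
    return (v - v_plus) / (v - v_minus)

def conj_inv(w, v_minus, v_plus):
    """Inverse of F: (-v_minus*w + v_plus)/(-w + 1)."""
    return (-v_minus * w + v_plus) / (-w + 1.0)
```

Taking u = 0.5, for instance, gives λ = 0.8 with fixed points 0.5 and 2, and F ∘ M_λ ∘ F⁻¹ maps 0.3 to −0.3.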
Fig. 8.1. Circles of inversion. The circle C1 cuts the circle of inversion C2 at right angles at P and Q. A line from the origin of C1 intersects C2 at two points: v− and its inverse v+ , which are fixed points of the Lorentz transform. The relative velocity λ lies at an equal hyperbolic distance from v+ that v− lies from the origin. A hyperbolic rotation of π occurs about v− which exchanges the state of uniform velocity λ and the state of rest at O1 .
Fig. 8.2. A more detailed description of the circle containing the fixed points v1 and λ which are uniform states of motion at relative velocities u and 2u/(1 + u²). The Möbius automorphism of the disc may be considered as a composition of two hyperbolic rotations: a rotation of π about the hyperbolic midpoint between the origin and λ, and a rotation about the origin. The maximum angle φ is determined by the angle of parallelism, Π, beyond which no motion can occur.
The intersection of the arc PQ with the line $\ell$, shown in Fig. 8.2, occurs at the hyperbolic midpoint, $v_- = u$, between 0 and λ. This is a direct consequence of the definition of hyperbolic distances: Whereas the hyperbolic distance from 0 to λ is
$$h(0, \lambda) := \ln\{-1, 1\,|\,\lambda, 0\} = \ln\frac{1+\lambda}{1-\lambda} = 2\ln\frac{1+u}{1-u},$$
the distance from 0 to $v_-$ is half as great,
$$h(0, u) = \ln\{-1, 1\,|\,u, 0\} = \ln\frac{1+u}{1-u}.$$
The arc PQ is itself the perpendicular bisector of Oλ. The Möbius transformation (8.2.8) is the composition of two hyperbolic reflections in perpendicular lines through v− . Moreover, it is the unique Möbius automorphism that exchanges O and λ by a rotation through π about the hyperbolic midpoint v− of the hyperbolic line segment. Expression (8.2.5) is the isomorphism that takes the Poincaré model to the Klein model. Both are hyperbolic models of the unit disc, but whereas the Poincaré model is conformal the Klein model is not. The price to be
paid is that geodesics in the Poincaré model are arcs of circles that cut the unit disc orthogonally, while the geodesics in the Klein model are straight lines. The isomorphism λ maps the arc with ends P and Q onto the open chord with the same endpoints. Since $v_-$ is the point where the line $\ell$ cuts the circumference of the orthogonal circle $C_2$, $\lambda(v_2) = \lambda(\bar u)$ is the point at which the line $\ell$ intersects the chord PQ. But this is precisely the definition (8.2.5) of λ. When the orthogonal arcs intersect, there occurs a hyperbolic rotation. At the limit when they are asymptotic, there is a limit rotation, while when they become ultraparallel there is a hyperbolic translation. However, hyperbolic translations in hyperbolic velocity space would contradict the fact that the limiting velocity is that of light. Consider the right triangle OλQ formed by the intersection of lines $\ell$ and c in Fig. 8.2. The angle at the origin is the limiting angle, or the angle of parallelism that we introduced in Sec. 2.5. For angles less than ϕ, the angle of parallelism will not be reached. Consequently, c is the limiting line for hyperbolic rotations. According to the right triangle, $\cos\Pi = \lambda$, while according to the Bolyai–Lobaschevsky formula for the radian measure of the angle of parallelism,
$$\Pi(\bar\lambda) = 2\tan^{-1}e^{-\bar\lambda}. \tag{8.2.10}$$
The angle $\Pi(\bar\lambda)$ is a sole function of the hyperbolic distance $\bar\lambda$. The closer it is to π/2, the less pronounced the hyperbolic distortions become. Thus,
$$\cos\Pi(\bar\lambda) = \tanh\bar\lambda = \frac{2v_-}{1 + v_-^2}$$
is just the isomorphism of the Poincaré model onto the Klein model. Since Π must necessarily be acute, hyperbolic translations are ruled out, and the limiting rotation occurs for asymptotes.
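The chain of equalities relating the angle of parallelism to the Poincaré-to-Klein map can be verified directly. In the sketch below the function names are illustrative assumptions; the check uses the fact that the hyperbolic measure of λ is twice that of u.

```python
import math

def angle_of_parallelism(lam_bar):
    """Bolyai-Lobachevsky formula (8.2.10): Pi = 2*atan(exp(-lam_bar))."""
    return 2.0 * math.atan(math.exp(-lam_bar))

def poincare_to_klein(v):
    """Map (8.2.5) from the Poincare disc to the Klein disc: 2v/(1 + v^2)."""
    return 2.0 * v / (1.0 + v * v)

def cos_parallelism_from_velocity(u):
    """cos(Pi) evaluated at the hyperbolic distance lam_bar = 2*atanh(u)."""
    lam_bar = 2.0 * math.atanh(u)
    return math.cos(angle_of_parallelism(lam_bar))
```

At vanishing hyperbolic distance the angle equals π/2, the Euclidean value, and it decreases monotonically toward zero as the distance grows.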
8.3 Geometry of Doppler and Aberration Phenomena
Consider the triangle in Fig. 8.3 with sides $\bar\alpha$ and $\bar\delta$, and base $\bar u$. The altitude $\bar h$ cuts the base into two parts $\bar\varepsilon$ and $\bar u - \bar\varepsilon$. The angles formed from the sides and the base are $\bar\vartheta$ and $\bar\varphi$. The sines of these angles are
$$\sin\bar\vartheta = h/\alpha = \frac{\tanh\bar h\,\operatorname{sech}\bar\varepsilon}{\tanh\bar\alpha} = \frac{\sinh\bar h}{\sinh\bar\alpha},$$
and
$$\sin\bar\varphi = h/\delta = \frac{\tanh\bar h\,\operatorname{sech}(\bar u - \bar\varepsilon)}{\tanh\bar\delta} = \frac{\sinh\bar h}{\sinh\bar\delta},$$
since deformation only occurs normal to the direction of motion, i.e. $\bar u$.

Fig. 8.3. Extension of hyperbolic trigonometry to general triangles.

Introducing the hyperbolic Pythagorean theorem of the first triangle, $\cosh\bar\alpha = \cosh\bar\varepsilon\cosh\bar h$, into the hyperbolic Pythagorean theorem for the second triangle,
$$\cosh\bar\delta = \cosh\bar h\cosh(\bar u - \bar\varepsilon) = \cosh\bar h\,(\cosh\bar u\cosh\bar\varepsilon - \sinh\bar\varepsilon\sinh\bar u), \tag{8.3.1}$$
results in
$$\cosh\bar\delta = \cosh\bar\alpha\cosh\bar u - \tanh\bar\varepsilon\sinh\bar u\cosh\bar\alpha. \tag{8.3.2}$$
Finally, introducing $\cos\bar\vartheta = \tanh\bar\varepsilon/\tanh\bar\alpha$ gives the hyperbolic law of cosines
$$\cosh\bar\delta = \cosh\bar u\cosh\bar\alpha - \sinh\bar\alpha\sinh\bar u\cos\bar\vartheta. \tag{8.3.3}$$
In an exactly analogous way we find
$$\cosh\bar\alpha = \cosh\bar\delta\cosh\bar u - \sinh\bar\delta\sinh\bar u\cos\bar\varphi. \tag{8.3.4}$$
Now, introducing $\cos\bar\vartheta = \tanh\bar\varepsilon/\tanh\bar\alpha$ into $\cos\bar\varphi = \tanh(\bar u - \bar\varepsilon)/\tanh\bar\delta$ results in
$$\tanh\bar\delta\cos\bar\varphi = \frac{\tanh\bar u - \tanh\bar\alpha\cos\bar\vartheta}{1 - \tanh\bar u\tanh\bar\alpha\cos\bar\vartheta}. \tag{8.3.5}$$
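The relations (8.3.3) to (8.3.5) can be tested on a concrete triangle built from its altitude and the two segments of the base. The sketch below follows the construction in the text (two right triangles sharing the altitude); the function name and the sample lengths are illustrative assumptions.

```python
import math

def triangle(u_bar, eps_bar, h_bar):
    """Sides and base angles of the triangle with base u_bar, altitude h_bar,
    and foot of the altitude at hyperbolic distance eps_bar along the base."""
    alpha = math.acosh(math.cosh(eps_bar) * math.cosh(h_bar))
    delta = math.acosh(math.cosh(h_bar) * math.cosh(u_bar - eps_bar))
    theta = math.acos(math.tanh(eps_bar) / math.tanh(alpha))
    phi = math.acos(math.tanh(u_bar - eps_bar) / math.tanh(delta))
    return alpha, delta, theta, phi

u_bar, eps_bar, h_bar = 1.0, 0.4, 0.7      # illustrative hyperbolic lengths
alpha, delta, theta, phi = triangle(u_bar, eps_bar, h_bar)

# Right-hand sides of the two laws of cosines, (8.3.3) and (8.3.4)
cosines_833 = (math.cosh(u_bar) * math.cosh(alpha)
               - math.sinh(alpha) * math.sinh(u_bar) * math.cos(theta))
cosines_834 = (math.cosh(delta) * math.cosh(u_bar)
               - math.sinh(delta) * math.sinh(u_bar) * math.cos(phi))

# Right-hand side of the composition law (8.3.5)
composition_835 = ((math.tanh(u_bar) - math.tanh(alpha) * math.cos(theta))
                   / (1.0 - math.tanh(u_bar) * math.tanh(alpha) * math.cos(theta)))
```

The same triangle also satisfies sin ϑ̄ sinh ᾱ = sin ϕ̄ sinh δ̄ = sinh h̄, which is the content of the two sine formulas at the start of the section.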
But, this should be a velocity composition law [cf. Fig. 8.4 where the triangle has to be fitted on a surface of a pseudosphere rather than the flat Euclidean plane], and it will become one when we introduce the velocity components $u_1 = \alpha = \tanh\bar\alpha$, and $u_2 = \delta = \tanh\bar\delta$. Inserting these definitions into (8.3.5) gives
$$u_2\cos\bar\varphi = \frac{u - u_1\cos\bar\vartheta}{1 - uu_1\cos\bar\vartheta}. \tag{8.3.6}$$

Fig. 8.4. Hyperbolic velocity triangle.
Equation (8.3.6) is the equation of aberration in the direction of the motion. In the limit as $\bar\alpha, \bar\delta \to \infty$, $u_1, u_2 \to 1$, and they become light signals. The hyperbolic cosine law, (8.3.3), can be written as
$$\frac{\sinh\bar\delta}{\sinh\bar\alpha} = \frac{\tanh\bar\delta}{\tanh\bar\alpha}\,\cosh\bar u\,(1 - \tanh\bar\alpha\tanh\bar u\cos\bar\vartheta) = \frac{\sin\bar\vartheta}{\sin\bar\varphi}, \tag{8.3.7}$$
which is the hyperbolic law of sines,
$$\frac{\sin\bar\vartheta}{\sinh\bar\delta} = \frac{\sin\bar\varphi}{\sinh\bar\alpha}, \tag{8.3.8}$$
showing that sides can be expressed in terms of angles in hyperbolic geometry [Buseman & Kelly 53]. The hyperbolic law of sines, (8.3.8), happens also to be the equation of aberration normal to the motion,
$$u_2\sin\bar\varphi = \frac{u_1\sin\bar\vartheta\,\sqrt{1-u^2}}{1 - uu_1\cos\bar\vartheta}. \tag{8.3.9}$$
Taking the differential of (8.3.6),
$$-\sin\bar\varphi\,d\bar\varphi = \frac{u_1}{u_2}\,\frac{\gamma^{-2}\sin\bar\vartheta}{(1 - uu_1\cos\bar\vartheta)^2}\,d\bar\vartheta, \tag{8.3.10}$$
with $\gamma^{-2} = 1 - u^2$, and introducing (8.3.9) result in
$$d\bar\varphi = -\frac{\sqrt{1-u^2}}{1 - u\tanh\bar\varepsilon}\,d\bar\vartheta, \tag{8.3.11}$$
where we used $\tanh\bar\alpha\cdot\cos\bar\vartheta = \tanh\bar\varepsilon$. Dividing both sides by the time increment gives the Doppler shift as
$$\nu = K\nu_0, \tag{8.3.12}$$
where
$$K = \frac{\sqrt{1-u^2}}{1 - uu_1\cos\bar\vartheta} \tag{8.3.13}$$
is the Doppler factor. A moving object emits a signal at frequency $\nu_0 = d\bar\vartheta/dt$ with velocity $u_1$, and $\nu = -d\bar\varphi/dt$ is the frequency that the observer at rest registers. If the signal is emitted at the velocity of light, $u_1 = 1$, implying that $\bar\alpha \to \infty$, and $\bar\vartheta \to \pi/2$, or equivalently $\bar\varepsilon \to 0$, it follows from (8.3.3) that $\bar\delta \to \infty$ such that the difference $\bar\delta - \bar\alpha$ remains finite,
$$e^{(\bar\delta-\bar\alpha)} = \cosh\bar u = \frac{1}{\sqrt{1-u^2}}. \tag{8.3.14}$$
The Doppler shift (8.3.12) then becomes the exponential Doppler shift,
$$\nu = e^{-(\bar\delta-\bar\alpha)}\,\nu_0. \tag{8.3.15}$$
An exponential law for the longitudinal Doppler shift is known [Prohovnik 67], but not for the transverse redshift. The exponential law would imply that the frequency and energy of light received from receding galaxies depend on an exponential factor which would decrease the luminosity and explain things like Olbers’ paradox, where the decrease in the intensity of the source (inverse square of the distance) is exactly balanced by the increase in the number of sources (square of the distance). The linear approximation,
$$\frac{\Delta\nu}{\nu_0} = -(\bar\delta-\bar\alpha) = -\tfrac{1}{2}u^2, \tag{8.3.16}$$
equates the redshift with the distance, $\bar\delta - \bar\alpha$, and with the second-order relative velocity, obtained by expanding (8.3.14) to first-order. If $u$ is the Earth’s orbital velocity ($3\times10^4$ m/sec, i.e. $u \approx 10^{-4}$ in units of $c$), (8.3.16) would predict a frequency shift of $(\bar\delta-\bar\alpha) = 5\times10^{-9}$. Ordinarily, one writes the Doppler factor (8.3.13) with $u_1 = 1$ without realizing that it requires the limit $\bar\alpha \to \infty$, which, in turn, requires that it be normal to the motion. That is, the Doppler shift (8.3.13) is
$$K = (\cosh\bar u - \sinh\bar u\cos\bar\vartheta)^{-1}.$$
In the line of sight, we get the longitudinal shift $K = e^{\bar u}$, while normal to our sight it becomes (8.3.15). Thus, there is a shift even in the normal direction. However, for $\bar\vartheta \neq 0$, the shift will not be exponential, and consequently the ‘distance’ will not be given by the Lobaschevsky straight line. In this sense, the Lobaschevsky straight line is the ‘shortest distance.’ Aberration equations (8.3.6) and (8.3.9) can be combined in the half-angle formula, $\tan\bar\varphi/2 = \sin\bar\varphi/(1+\cos\bar\varphi)$, to read
$$\tan\bar\varphi/2 = \left(\frac{1-u}{1+u}\right)^{1/2}\cot\bar\vartheta/2 = e^{-\bar u}\cot\bar\vartheta/2. \tag{8.3.17}$$
Equation (8.3.17) will give the well-known expression for angle of parallelism: The ratio of concentric limiting arcs between two radii is the exponential distance between the arcs divided by the radius of curvature. Something very strange occurs at $\bar\varphi = \pi/2$. For we then obtain from (8.3.17),
$$\tan\bar\vartheta/2 = e^{-\bar u}, \tag{8.3.18}$$
showing that $\bar\vartheta$ is the angle of parallelism. The angle becomes a sole function of the hyperbolic distance, $\bar u$. An observer in the frame in which the object is at rest will see it rotated by an amount $\sin\bar\vartheta = \sqrt{1-u^2}$, exactly equal to the FitzGerald–Lorentz contraction, as we will see in Sec. 10.5. Since the angle of parallelism provides a unique link between circular and hyperbolic functions, a rotation and contraction can only be related at the angle of parallelism, if the geometry is indeed hyperbolic. The equivalence of rotations and contractions was first discussed by Terrell [59], but his analysis cannot be extended to the situation where the angle is not acute; the angle of parallelism must always be acute, tending to a right angle only in the limit of Euclidean geometry, as the velocity of light, $c \to \infty$.
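As an illustration of (8.3.17) and (8.3.18), the sketch below evaluates the aberration formulas for a light signal ($u_1 = u_2 = 1$) and compares them with the half-angle form, then locates the angle of parallelism. It also evaluates the transverse shift $\bar\delta - \bar\alpha = \ln\cosh\bar u$ of (8.3.14) for the Earth's orbital speed. The function names and the sample values are assumptions made for the example, with $c = 1$.

```python
import math

def aberrated_angle(u, theta):
    """Angle phi registered by the observer at rest for a light signal (u1 = u2 = 1, c = 1)."""
    denom = 1.0 - u * math.cos(theta)
    cos_phi = (u - math.cos(theta)) / denom                     # (8.3.6)
    sin_phi = math.sin(theta) * math.sqrt(1.0 - u * u) / denom  # (8.3.9)
    return math.atan2(sin_phi, cos_phi)

def angle_of_parallelism(u):
    """Angle theta solving tan(theta/2) = exp(-ubar), Eq. (8.3.18), with ubar = artanh(u)."""
    return 2.0 * math.atan(math.exp(-math.atanh(u)))

def transverse_shift(u):
    """delta - alpha = ln cosh(ubar) = -(1/2) ln(1 - u^2), from (8.3.14)."""
    return -0.5 * math.log1p(-u * u)

u, theta = 0.6, 1.0
phi = aberrated_angle(u, theta)
# Left and right sides of (8.3.17):
print(math.tan(phi / 2), math.exp(-math.atanh(u)) / math.tan(theta / 2))

theta_p = angle_of_parallelism(u)
print(aberrated_angle(u, theta_p), math.pi / 2)   # phi is a right angle at theta_p
print(math.sin(theta_p), math.sqrt(1 - u * u))    # sin(theta_p) equals the contraction factor

print(transverse_shift(1e-4))                     # Earth's orbital speed in units of c
```

The last line reproduces the order of magnitude quoted for (8.3.16), since $-\tfrac12\ln(1-u^2) \approx \tfrac12 u^2$ for small $u$.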
8.4
Kinematics: The Radar Method of Signaling
With the realization that there is no such thing as a rigid body in relativity, Whitrow [33] went on to develop a radar method, or what he called a ‘signal-function method,’ where light signals are transmitted between different inertial frames and non-inertial ones. It was afterward referred to as the ‘K-calculus technique’ by Bondi [60] without a hint of where the original idea came from, although Milne [48] had used it extensively in his own research before Bondi. In fact, the original idea and the definition of rapidity can be traced all the way back to Robb [11] in 1911.
8.4.1
Constant relative velocity: Geometric-arithmetic mean inequality
As in kinematic relativity [Milne 48], time measurements are much more fundamental than distance measurements, and the latter are deducible from the former. In other words, distances are measured by the lapse of time. This has been criticized by Born [43] as being impractical since no one has ever received light signals from nebulae beyond the horizon. However, it is far superior to the usual method in general relativity that uses a metric, or rigid ruler, to measure distance. So what was discarded in special relativity made its comeback in general relativity. The most ideal situation would be to introduce into the fabric of the theory distances measured in brightness, or the difference between apparent and absolute brightness. However, no one has ever succeeded in doing so and we will base all distance measurements on the so-called radar method, where a light signal is sent out and reflected at a later time. All that is needed is that at each reflection a certain retardation factor, $K$, comes in, determined by the clock in the frame that is sending out the light pulse.
Consider two observers, A and B, where observer A sends out a light signal in his time $t_1^A$ which is received by observer B in his time $t_2^B$. In terms of A’s time, B will receive it in time $Kt_1^A$, where $K$ is some constant factor that is a function only of the relative velocity of the two inertial frames. The signal that arrives at B in time $t_2^B$ will be reflected at some later time. The reflected signal leaves B in time $t_3^B$ and arrives at observer A in time $t_4^A$, where $t_4^A = Kt_3^B$. From this it is apparent that both observers will call the reflection time,
$$t_r = \sqrt{t_1^A t_4^A} = \sqrt{t_2^B t_3^B}, \tag{8.4.1}$$
the geometric mean of the time intervals, and it is an invariant independent of the frame. The appearance of the geometric mean, as opposed to the arithmetic mean, implies implicitly the existence of another time scale, namely, a logarithmic one [cf. Eq. (8.4.30) below]. So the ‘signal-function method’ of Whitrow singles out the geometric mean as the time of reflection. The reading shown by a synchronous (stationary) clock at the event should be midway between the observer’s time, $t_1^A$, of sending out the signal, and the time he receives its reflection, $t_4^A$,
$$t = \tfrac{1}{2}(t_1^A + t_4^A). \tag{8.4.2}$$
This was Einstein’s choice, but it is by no means the only choice. The measure of the space interval is the difference between the ‘average’ for the light-signaling process, (8.4.2), and the time the signal was sent out,
$$r = t - t_1^A = \tfrac{1}{2}(t_4^A - t_1^A). \tag{8.4.3}$$
In terms of B’s coordinates, he will measure a time interval
$$t' = \tfrac{1}{2}(t_2^B + t_3^B), \tag{8.4.4}$$
and a space interval
$$r' = \tfrac{1}{2}(t_3^B - t_2^B), \tag{8.4.5}$$
separating the event from where he is located. The two systems of inertial coordinates $(t, r)$ and $(t', r')$ are related by
$$t_2^B = t' - r' = K(t - r) = Kt_1^A, \tag{8.4.6a}$$
$$t_3^B = t' + r' = K^{-1}(t + r) = K^{-1}t_4^A. \tag{8.4.6b}$$
The time $t_2^B$ is the time on B’s clock when the signal is received, and $t_3^B$ is the moment on B’s clock when it is sent back.
If B reflects the signal instantaneously so that $t_2^B = t_3^B$ then from the second equalities in (8.4.6a) and (8.4.6b) it follows that
$$K = \left(\frac{1 + r/t}{1 - r/t}\right)^{1/2} = \left(\frac{1+u}{1-u}\right)^{1/2}, \tag{8.4.7}$$
upon taking the positive square root, and setting $r/t = u$. Now suppose we place ourselves at the origin of B’s frame, $r' = 0$. Then summing (8.4.6a) and (8.4.6b) gives
$$t = \tfrac{1}{2}(t_1^A + t_4^A) = \tfrac{1}{2}(K + K^{-1})\,t' = \frac{t'}{\sqrt{1-u^2}}, \tag{8.4.8}$$
showing that a clock traveling at a uniform velocity goes slower than one at rest. This expression for time dilatation only holds for frames moving at a constant velocity $u$ [cf. Eq. (8.4.27) below]. In terms of the longitudinal Doppler shift, (8.2.4), the two systems of coordinates are related by
$$t = \tfrac{1}{2}(t_1^A + t_4^A) = \tfrac{1}{2}(Kt_3^B + K^{-1}t_2^B) = \tfrac{1}{2}\{(K + K^{-1})t' + (K - K^{-1})r'\} = t'\cosh\bar u + r'\sinh\bar u, \tag{8.4.9}$$
and
$$r = \tfrac{1}{2}(t_4^A - t_1^A) = \tfrac{1}{2}(Kt_3^B - K^{-1}t_2^B) = \tfrac{1}{2}\{(K - K^{-1})t' + (K + K^{-1})r'\} = t'\sinh\bar u + r'\cosh\bar u. \tag{8.4.10}$$
These are none other than the well-known Lorentz transformations. Taking their differentials and forming the difference of their squares show that the hyperbolic distance,
$$dt^2 - dr^2 = dt'^2 - dr'^2, \tag{8.4.11}$$
is invariant. Now, we ask what happens when the light signal is reflected when it arrives at B. In this case, $t_2^B = t_3^B \equiv t_r$ is the time of reflection, and it occurs at the same point in space for B so that $r' = 0$. The Lorentz transformations, (8.4.9) and (8.4.10), reduce to
$$t = t_r\cosh\bar u, \tag{8.4.12a}$$
$$r = t_r\sinh\bar u. \tag{8.4.12b}$$
Equation (8.4.12a) is a statement of the arithmetic-geometric mean inequality: The arithmetic mean $t$ can never be inferior to the geometric mean $t_r$ since $\cosh\bar u \ge 1$. Adding and subtracting (8.4.12a) and (8.4.12b) give
$$t + r = Kt_r, \tag{8.4.13a}$$
$$t - r = K^{-1}t_r. \tag{8.4.13b}$$
Taking the differentials of (8.4.13a) and (8.4.13b), and then the product of the two, without requiring that $K$ be constant, result in
$$dt^2 - dr^2 = dt_r^2 - t_r^2\,d\bar u^2. \tag{8.4.14}$$
A space-time interval has been transformed into a velocity space-time interval.
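The bookkeeping of the radar method for constant relative velocity can be traced with a few numbers. The sketch below is an illustration only: it takes A's emission and reception times, applies (8.4.6a) and (8.4.6b) to obtain B's clock readings, and checks the invariance of the geometric mean (8.4.1) together with the Lorentz transformations (8.4.9) and (8.4.10). The function names and the sample times are assumptions chosen for the example, with $c = 1$.

```python
import math

def doppler_factor(u):
    """K = sqrt((1 + u)/(1 - u)), Eq. (8.4.7), in units c = 1."""
    return math.sqrt((1.0 + u) / (1.0 - u))

def radar_coordinates(t_send, t_receive):
    """Time and distance an observer assigns to an event from its own radar times."""
    return 0.5 * (t_receive + t_send), 0.5 * (t_receive - t_send)

def b_clock_times(t1A, t4A, u):
    """B's reception and re-emission times from (8.4.6a) and (8.4.6b)."""
    K = doppler_factor(u)
    return K * t1A, t4A / K

u = 0.6                      # relative velocity, so K = 2
t1A, t4A = 1.0, 9.0          # A sends at t1A and receives the echo at t4A
t, r = radar_coordinates(t1A, t4A)      # (8.4.2), (8.4.3)
t2B, t3B = b_clock_times(t1A, t4A, u)
tp, rp = radar_coordinates(t2B, t3B)    # (8.4.4), (8.4.5)
ubar = math.atanh(u)

print(math.sqrt(t1A * t4A), math.sqrt(t2B * t3B))         # geometric means, (8.4.1)
print(t * t - r * r, tp * tp - rp * rp)                   # invariant interval
print(t, tp * math.cosh(ubar) + rp * math.sinh(ubar))     # (8.4.9)
print(r, tp * math.sinh(ubar) + rp * math.cosh(ubar))     # (8.4.10)
```

With these inputs A assigns $(t, r) = (5, 4)$ and B assigns $(t', r') = (3.25, 1.25)$, and both observers agree on the reflection time $t_r = 3$.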
8.4.2
Constant relative acceleration
There is general consensus [Møller 52] that acceleration has no effect on the rate of a clock, and that the expression for time dilatation (8.4.8) can be used in its infinitesimal form whether or not $u$ is constant. However, according to Einstein’s equivalence principle uniform acceleration is equivalent to, or indistinguishable from, a uniform gravitational field. It has been shown from the gravitational redshift that the latter, indeed, has an effect on the rate of a clock. This contradiction has been clearly pointed out by Whitrow [80], who shows that the time dilatation is greater when the velocity is varying with time than when it is constant. We convert his inequality into an equality. Consider two observers receding from one another with an average velocity $r/t$. Their identical clocks were synchronized at $t^A = t^B = 0$ when they were at the same point. At time $t_1^A$, A emits a signal which is picked up and immediately reflected by B at time $t_r^B$, and then received back at A at time $t_3^A$. The space interval is
$$t - t_1^A = t_3^A - t = \tfrac{1}{2}(t_3^A - t_1^A) = r.$$
From this it follows that
$$t_1^A = t - r, \tag{8.4.15a}$$
$$t_3^A = t + r, \tag{8.4.15b}$$
and consequently,
$$t_r^A = \sqrt{1 - (r/t)^2}\;t. \tag{8.4.15c}$$
Since the Doppler shift is now given by
$$K = \left(\frac{1+u}{1-u}\right)^{1/2} = \frac{1 + r/t}{1 - r/t} \tag{8.4.16}$$
and not by (8.4.7), we can express (8.4.15a) and (8.4.15b) as $t_3^A = Kt_1^A$, or
$$t_r^B = K^{1/2}t_1^A, \tag{8.4.17a}$$
$$t_3^A = K^{1/2}t_r^B. \tag{8.4.17b}$$
But, from (8.4.15c) it is apparent that $t_r^B = t_r^A$ so that the clocks remain synchronized, and we can drop the superscripts on the time. Expressing $r$ and $t$ in terms of $t_1$ and $t_3$ we find [Page 36]
$$g = \frac{1}{t_1} - \frac{1}{t_3} = 2\,\frac{r}{t_r^2}, \tag{8.4.18}$$
where $g$ is the uniform acceleration due to gravity. Employing (8.4.17a) and (8.4.17b) we write (8.4.18) as
$$g = \frac{K^{1/2} - K^{-1/2}}{t_r}. \tag{8.4.19}$$
Equation (8.4.18) enables us to express the Doppler shift, $K$, in terms of the ratio of the time the signal was received back to that when it was sent out
$$t_3/t_1 = K. \tag{8.4.20}$$
Taking the logarithms of both sides of (8.4.20), and then differentiating with respect to $t$, give
$$\frac{d\ln t_3}{dt} - \frac{d\ln t_1}{dt} = \frac{1+u}{t_3} - \frac{1-u}{t_1} = \frac{1}{1-u^2}\frac{du}{dt},$$
where we have used the differentials of (8.4.15a) and (8.4.15b). Dividing both sides by $\sqrt{1-u^2}$ results in
$$\frac{K}{t_3} - \frac{K^{-1}}{t_1} = \frac{1}{(1-u^2)^{3/2}}\frac{du}{dt} = g, \tag{8.4.21}$$
which is identical to (8.4.18). If the ratio (8.4.20) had been proportional to the square of the Doppler shift, we would have found that (8.4.21) vanishes. If we consider $t'$ to be the time of reflection on B’s clock, we can write (8.4.18) as
$$\frac{1}{t_1} = \frac{1}{t'} + g/2, \tag{8.4.22a}$$
$$\frac{1}{t_3} = \frac{1}{t'} - g/2, \tag{8.4.22b}$$
which can easily be seen by subtraction. Rather, adding (8.4.22a) and (8.4.22b) gives
$$\frac{1}{t'} = \frac{1}{2}\left(\frac{1}{t_1} + \frac{1}{t_3}\right). \tag{8.4.23}$$
The time of reflection on B’s clock is the harmonic mean for uniform acceleration, in contrast with the geometric mean as the time of reflection for uniform motion. Writing $t_1 = t - r$ and $t_3 = t + r$ in (8.4.23) clearly shows that the space-time interval is not invariant,
$$t^2 - r^2 = t\,t' = t_r^2,$$
unless we require the reflection times to be the same, meaning that clocks A and B are synchronous [Prohovnik 67].
Multiplying the left- and right-hand sides of (8.4.22a) and (8.4.22b), rearranging, taking the square roots, and using $gt = \sinh\bar u$, give
$$t' = \operatorname{sech}(\bar u/2)\,t_r = \operatorname{sech}^2(\bar u/2)\,t, \tag{8.4.24}$$
where the second equality follows from (8.4.15c). The equalities in (8.4.24) express quantitatively that the harmonic mean is always smaller than the geometric mean which is smaller than the arithmetic mean, because the equality of times can never apply. The first equality in Eq. (8.4.24) states physically that the time of reflection on B’s clock is always less than on A’s clock. Taking the ratio of (8.4.22a) and (8.4.22b) we get
$$K = \frac{t_3}{t_1} = \frac{1 + gt'/2}{1 - gt'/2}. \tag{8.4.25}$$
Instead of (8.4.7), we now have
$$\tfrac{1}{2}gt' = r/t = \tanh\bar u/2. \tag{8.4.26}$$
Differentiating with respect to the arithmetic time average gives
$$g\,\frac{dt'}{dt} = \operatorname{sech}^2(\bar u/2)\,\frac{d\bar u}{dt} = \frac{\operatorname{sech}^2(\bar u/2)}{1-u^2}\,\frac{du}{dt},$$
and now using (8.4.21), gives
$$dt = \cosh^2(\bar u/2)\,\frac{dt'}{\sqrt{1-u^2}}. \tag{8.4.27}$$
Comparing (8.4.27) with the expression for time dilatation for uniform motion, (8.4.8), leads to the inescapable conclusion that clocks will run even slower in a uniformly accelerating frame than in an inertial frame when viewed from a stationary frame. The consensus of opinion that acceleration will have no effect upon the apparent rate of clocks is inaccurate.
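The ordering of the three means in (8.4.24) can be made concrete. In the sketch below, which is illustrative only, A's emission and reception times $t_1$ and $t_3$ are the inputs; the arithmetic, geometric and harmonic means are formed, and $g$ and $\bar u$ follow from (8.4.18) and (8.4.20) with $K = e^{\bar u}$. The function name and the sample times are assumptions made for the example.

```python
import math

def reflection_times(t1, t3):
    """Arithmetic, geometric and harmonic means of A's radar times, plus g and ubar."""
    t = 0.5 * (t1 + t3)                     # arithmetic mean, A's time for the event
    t_r = math.sqrt(t1 * t3)                # geometric mean
    t_prime = 2.0 / (1.0 / t1 + 1.0 / t3)   # harmonic mean, (8.4.23)
    g = 1.0 / t1 - 1.0 / t3                 # (8.4.18)
    ubar = math.log(t3 / t1)                # K = t3/t1 = exp(ubar), (8.4.20)
    return t, t_r, t_prime, g, ubar

t, t_r, t_prime, g, ubar = reflection_times(1.0, 4.0)
sech_half = 1.0 / math.cosh(ubar / 2)

print(t_prime, sech_half * t_r, sech_half**2 * t)    # the three expressions in (8.4.24)
print(g * t, math.sinh(ubar))                        # g t = sinh(ubar)
print(0.5 * g * t_prime, math.tanh(ubar / 2))        # (8.4.26)
print(t_prime < t_r < t)                             # harmonic < geometric < arithmetic
```

For $t_1 = 1$ and $t_3 = 4$ the means are $t = 2.5$, $t_r = 2$ and $t' = 1.6$, so the reflection time on B's clock is the smallest of the three.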
The transformation laws (8.4.17a) and (8.4.17b) can be expressed as
$$t + r = K^{1/2}t_r, \tag{8.4.28a}$$
$$t - r = K^{-1/2}t_r. \tag{8.4.28b}$$
Taking the differentials of (8.4.28a) and (8.4.28b), and then their product, result in
$$ds^2 := dt^2 - dr^2 = dt_r^2 - \tfrac{1}{4}t_r^2\,(d\ln K)^2 = dt_r^2 - \tfrac{1}{4}t_r^2\,d\bar u^2. \tag{8.4.29}$$
The appearance of $t_r^2$ in the velocity space component of the metric (8.4.29) implies uniform expansion, and solicits the introduction of logarithmic time,
$$\tau = 2\tau_0\ln(t_r/\tau_0), \tag{8.4.30}$$
where $\tau_0$ is an absolute constant, into (8.4.29) so that
$$ds^2 = dt^2 - dr^2 = \frac{e^{\tau/\tau_0}}{4}\{d\tau^2 - \tau_0^2\,d\bar u^2\}. \tag{8.4.31}$$
Thus, since $t_r = \tau_0 e^{\tau/2\tau_0}$ by (8.4.30), the formulas of the transformation of coordinates (8.4.28a) and (8.4.28b) can be written as
$$t + r = \tau_0\,e^{\tau/2\tau_0 + \bar u/2}, \tag{8.4.32a}$$
$$t - r = \tau_0\,e^{\tau/2\tau_0 - \bar u/2}. \tag{8.4.32b}$$
These equations clearly show that logarithmic time, (8.4.30), is hyperbolic time. At constant $t_r$, a surface of revolution is obtained by rotating the hyperbola $t^2 - r^2 = t_r^2$ around the $t$ axis to give a bowl-shaped form. As it should, the relative velocity, $\bar u$, vanishes in the metric. The equivalence relations (8.4.32a) and (8.4.32b) can be combined to read
$$t + r = K(t - r), \tag{8.4.33}$$
which is the square of (8.4.7) for uniform motion. In fact, from the relations at constant velocity,
$$t + r = K(t' - r'), \tag{8.4.34a}$$
$$t - r = K^{-1}(t' + r'), \tag{8.4.34b}$$
we conclude that whereas (8.4.32a) and (8.4.32b) do not retain the invariant hyperbolic form, the latter do:
$$t^2 - r^2 = t'^2 - r'^2. \tag{8.4.35}$$
Adding and subtracting the equations yield the well-known Lorentz transformations (8.4.9) and (8.4.10), from which it can be concluded that the Lorentz transformations leave invariant the hyperbolic ‘distance’ (8.4.35). In terms of radar signaling, (8.4.33) involves a single observer: A light pulse is emitted at time $t_1$, and observed by him at a later time $t_2 = Kt_1$. Alternatively, in the case of constant velocity, (8.4.34a) says a light signal is emitted at time $t_1'$, in the primed inertial frame, and observed in the unprimed frame at a later time $t_2 = Kt_1'$. Whereas (8.4.34b) says that if a signal is emitted at time $t_1$, it will be observed at time $t_2'$ in the primed inertial frame. For uniform motion the geometric mean time remains invariant,
$$\sqrt{t_1t_2} = \sqrt{t_1't_2'}.$$
This is the same as requiring the hyperbolic line element (8.4.35) to be invariant. For uniform acceleration, by contrast, the time of reflection in the B frame is the harmonic mean of the A frame. In his analysis of uniform acceleration, Page [36] attempted to show that the space-time interval between neighboring points is not constant. His analysis replaces (8.4.22b) by
$$\frac{1}{t_3} = \frac{1}{t''} - g/2. \tag{8.4.36}$$
This condition would necessarily imply that the harmonic means in the two frames are equal. Solving (8.4.22a) and (8.4.36) for the times $t'$ and $t''$, with $t'' > t'$, we get
$$\tilde t := \tfrac{1}{2}(t'' + t') = t\,\operatorname{sech}^2(\bar u/2), \tag{8.4.37a}$$
$$\tilde r := \tfrac{1}{2}(t'' - t') = [r - t\tanh(\bar u/2)]\,\operatorname{sech}^2(\bar u/2). \tag{8.4.37b}$$
However, (8.4.37b) vanishes by the definition of the Lobaschevsky line segment [cf. (8.4.26)], and hence $\tilde r = 0$. The time of reflection is given by the harmonic mean (8.4.23). Therefore, for uniformly accelerating systems the point of reflection must occur at the origin of B’s frame, whose time is given by the harmonic mean of A’s clock.
8.5
Comparison with General Relativity
We are convinced that purely mathematical reasoning can never yield physical results, that if anything comes out of mathematics it must have been put in in another form. Our problem is to find out where the physics got into the general theory.
Bridgman
Einstein’s theory of relativity essentially consists of two principles [Fock 69, p. 233]: The unification of space and time into a four-dimensional space with an indefinite metric, and the relation of the curvature of the space to the presence of matter. Einstein also proposed an ‘equivalence’ principle between inertial and gravitational mass, or between acceleration and gravitation. The latter has been criticized by Fock [69, pp. 232, 233]. With a constant index of refraction, gravitational considerations appear only in the specification of the absolute constant, which is related to the constant, negative curvature of the hyperbolic space. A centrifugal potential appears explicitly in the flat metric whereas the gravitational potential does not. In the general case of non-uniform motion the relevant space is the Lobaschevsky–Friedmann velocity space [Fock 69, Sec. 94], which can, and will in Sec. 8.6, be derived without any appeal to Einstein’s equations, and the unphysical assumption that matter must be ‘dust-like’ at zero pressure. Even dust exerts pressure! We will see in Sec. 8.7 that velocity components are related to the sides of a Lambert quadrilateral whose Weierstrass coordinates of the point of the acute angle show that the geometric mean time enters as a magnification of these coordinates, and not as a separate entity. Most importantly, by avoiding the ‘rigid scaffolding’ employed by Einstein, which is applicable to inertial frames of reference only [Fock 69], acceleration has been accounted for as changes in velocity space, where the independence of the ‘coordinates’ and time has disappeared. But before
embarking on our journey into hyperbolic space let us pause to see how general relativity accounts for a state of constant acceleration. The general theory is based on the significance of an indefinite metric tensor, which sets gravity propagating at the speed of light or beyond [cf. below]. Space-time is labeled by a four-dimensional set of coordinates $x^i$ for which three of the coordinates label the position of the event in space, while the fourth one fixes the time coordinate. In general relativity these labels have no physical significance. The proper distances are fixed by the metric tensor,
$$ds^2 = g_{ij}\,dx^i dx^j, \tag{8.5.1}$$
where the Einstein convention of summing over repeated indices has been used. The metric tensor, (8.5.1), says that if $x^i$ and $x^i + dx^i$ are labels for two events in space-time then $ds$ will be the proper distance between the two events. If it turns out that $ds^2 > 0$, then the two labels are separated by a time-like interval whereas if $ds^2 < 0$, the labels are separated by a space-like interval. The metric tensor determines the geodesics of a freely moving particle. They can be derived from Fermat’s principle of least time. If we consider the proper time interval $[\tau_1, \tau_2]$, then the variational principle states
$$\delta\int_{\tau_1}^{\tau_2} ds = \delta\int_{\tau_1}^{\tau_2}\sqrt{g_{ij}\frac{dx^i}{d\tau}\frac{dx^j}{d\tau}}\;d\tau,$$
or equivalently,
$$\int_{\tau_1}^{\tau_2}\left[\frac{d}{d\tau}\left(g_{ik}\frac{dx^k}{d\tau}\right) - \frac{1}{2}\frac{\partial g_{kl}}{\partial x^i}\frac{dx^k}{d\tau}\frac{dx^l}{d\tau}\right]\delta x^i\,d\tau = 0.$$
This must be zero for all virtual variations $\delta x^i$ which vanish at the endpoints. That means that the Euler–Lagrange equations,
$$\frac{d}{d\tau}\left(g_{ij}\frac{dx^j}{d\tau}\right) - \frac{1}{2}\frac{\partial g_{kl}}{\partial x^i}\frac{dx^k}{d\tau}\frac{dx^l}{d\tau} = 0, \tag{8.5.2}$$
must be satisfied at each instant of proper time. In the case of constant gravitational acceleration along the $x$-axis, the coefficients of the metric tensor are:
$$g_{11} = g_{22} = g_{33} = 1, \qquad g_{44} = -c^2(1 + gx/c^2)^2, \tag{8.5.3}$$
and all others are zero. In the $x\tau$-direction, the contracted Ricci tensor vanishes, because each of the components, $R_{11} = R_{44} = 0$, and so too does the Gaussian curvature [cf. Sec. 9.10.3 for details]. It is one thing to assume that gravity propagates at the speed of light, but another to assume that it travels at $c(1 + gx/c^2)$ [Møller 52, p. 257], which is what (8.5.3) claims. Thus, the Euler–Lagrange equations (8.5.2) reduce to a single equation,
$$\frac{d^2x}{d\tau^2} = \frac{1}{2}\frac{\partial g_{44}}{\partial x}\left(\frac{dt}{d\tau}\right)^2 = -g\left(1 + gx/c^2\right)\left(\frac{dt}{d\tau}\right)^2, \tag{8.5.4}$$
where $i = 1$ and $k = l = 4$. The increment in proper time is related to coordinate time by [Møller 52, p. 247]
$$d\tau = dt\,\sqrt{1 + \frac{2\chi}{c^2} - \frac{u^2}{c^2}}, \tag{8.5.5}$$
where
$$\chi = gx\left(1 + \frac{gx}{2c^2}\right),$$
is the so-called scalar gravitational potential [Møller 52, p. 255], and $u = |dx/dt|$ is the speed of the particle measured from the rest frame. If we use coordinate time, the equation of motion (8.5.4), with the help of (8.5.5), becomes
$$\frac{d^2x}{dt^2} - \frac{2g/c^2}{1 + gx/c^2}\left(\frac{dx}{dt}\right)^2 + g(1 + gx/c^2) = 0.$$
The solution to this equation,
$$x = \frac{c^2}{g}\left[\left(1 + \frac{gx_0}{c^2}\right)\operatorname{sech}(gt/c) - 1\right], \tag{8.5.6}$$
for the initial conditions $x = x_0$, and $dx/dt = 0$, clashes with (4.3.37), which, in the present case, is
$$x = \frac{c^2}{g}\left[\cosh\left(\frac{g\tau}{c}\right) - 1\right].$$
In fact, differentiating with respect to proper time gives
$$\frac{d^2x}{d\tau^2} = g\cosh(g\tau/c) = g(1 + gx/c^2) = \frac{d\chi}{dx}, \tag{8.5.7}$$
which is simply Newton’s second law! There is no physical reason why the speed,
$$u = |dx/dt| = c(1 + gx_0/c^2)\tanh(gt/c)\operatorname{sech}(gt/c),$$
should show a maximum in time when the particle is under constant acceleration, nor is there any reason to call $c(1 + gx/c^2)$ the velocity of light when it is apparent that this velocity becomes infinite as $x \to \infty$. There is also no reason to consider (8.5.5) a valid relation between proper and coordinate velocities when the particle is under acceleration. Moreover, it is also apparent from (8.5.6) that proper (hyperbolic) and coordinate times have been confused. This is evident from Møller’s conclusion:
For $t \to \infty$ the particle approaches the singular wall $x = -c^2/g$ of our system of coordinates. At this place also the velocity of light tends to zero, and no signals of any kind will ever reach the boundary plane.
Thus, general relativity has a ‘black hole’ at the boundary of a system in which the particle is merely undergoing uniform acceleration! We are thus forced to conclude that the geodesic equation (8.5.2) does not fix the motion of a comoving observer relative to another. By ‘comoving’ it is meant that the observer is at rest relative to the matter placed at his position, but the observer is allowed to move freely with no forces acting on him. This also puts into serious doubt whether the curvature of space, as contemplated from the metric tensor (8.5.1), really describes the action of gravity!
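The behaviour attributed to (8.5.6) can be inspected numerically. The sketch below is an illustration under stated assumptions: it substitutes (8.5.6) into the coordinate-time equation of motion using central differences, and evaluates the speed $u = |dx/dt|$, which peaks where $\sinh(gt/c) = 1$. The function names, the step size, and the sample values $g = c = 1$, $x_0 = 0.5$ are choices made for the example.

```python
import math

def moller_x(t, g=1.0, x0=0.5, c=1.0):
    """Candidate solution (8.5.6) with x(0) = x0 and dx/dt(0) = 0."""
    return (c * c / g) * ((1.0 + g * x0 / c**2) / math.cosh(g * t / c) - 1.0)

def ode_residual(t, g=1.0, x0=0.5, c=1.0, dt=1e-3):
    """Left-hand side of the coordinate-time equation of motion, by central differences."""
    xm, xc, xp = (moller_x(s, g, x0, c) for s in (t - dt, t, t + dt))
    v = (xp - xm) / (2.0 * dt)             # dx/dt
    a = (xp - 2.0 * xc + xm) / (dt * dt)   # d2x/dt2
    scale = 1.0 + g * xc / c**2
    return a - (2.0 * g / c**2) / scale * v * v + g * scale

def speed(t, g=1.0, x0=0.5, c=1.0):
    """u = |dx/dt| for (8.5.6)."""
    s = g * t / c
    return c * (1.0 + g * x0 / c**2) * math.tanh(s) / math.cosh(s)

for t in (0.0, 0.5, 1.0, 2.0):
    print(t, moller_x(t), ode_residual(t))   # residual stays at discretization level

t_peak = math.asinh(1.0)                     # tanh*sech is largest where sinh = 1 (g = c = 1)
print(t_peak, speed(t_peak))
```

The residuals are zero up to the finite-difference error, and the speed rises to $c(1 + gx_0/c^2)/2$ at $t = (c/g)\sinh^{-1}1$ before falling off, which is the maximum in time referred to above.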
8.6
Hyperbolic Geometry of Relativity
All the relations of relativity can be derived directly from the hyperbolic differential of arc length, or the Beltrami metric, in velocity space. Consider a velocity vector with two components, $u_1$ and $u_2$. Moreover, consider a disc of radius $c$ in which a line passing through the center is cut by another line forming an angle $\theta$ with it. The Euclidean distance from the center of the circle to the point of intersection is $u/c = \sqrt{u_1^2 + u_2^2}/c$. In Euclidean space, distances are relative. Similar triangles of different sizes can all have the same angles. Not so in Lobaschevsky geometry where, as we have seen, the angles determine the sides of the triangle. Hence, lengths are also absolute. The Euclidean length of the increment in the velocity $du = \sqrt{du_1^2 + du_2^2}$ will be related to the hyperbolic measure $d\bar u$ by
$$d\bar u = \frac{\sqrt{1 - u^2\sin^2\theta/c^2}}{1 - u^2/c^2}\,du. \tag{8.6.1}$$
From the definition of the cross product, $|\mathbf{u}\times d\mathbf{u}| = u\,du\sin\theta$, we may write the hyperbolic line element in terms of $u_1$ and $u_2$ as
$$d\bar u = \frac{\sqrt{d\mathbf{u}^2 - (\mathbf{u}\times d\mathbf{u})^2/c^2}}{1 - u^2/c^2}. \tag{8.6.2}$$
Now, if $d\mathbf{u} = \mathbf{u} - \mathbf{w}$, the hyperbolic measure of their difference will be
$$\frac{\sqrt{(\mathbf{u}-\mathbf{w})^2 - (\mathbf{u}\times\mathbf{w})^2/c^2}}{1 - \mathbf{u}\cdot(\mathbf{w} + d\mathbf{u})/c^2} = \frac{\sqrt{(\mathbf{u}-\mathbf{w})^2 - (\mathbf{u}\times\mathbf{w})^2/c^2}}{1 - \mathbf{u}\cdot\mathbf{w}/c^2} + O(du), \tag{8.6.3}$$
on the strength of (8.6.2), where $O(du)$ is an infinitesimally small quantity of higher order than
$$\sqrt{(\mathbf{u}-\mathbf{w})^2 - (\mathbf{u}\times\mathbf{w})^2/c^2}.$$
If the two velocity vectors are parallel, (8.6.3) reduces to (6.3.1). If not, the composition of Lorentz transforms in different directions involves rotations, as Einstein knew in 1905.
If we want to determine the hyperbolic length between $\mathbf{u}_2$ and $\mathbf{u}_1$, we write the velocity vector in parametric form
$$\mathbf{u} = \mathbf{u}_1 + \lambda(t)(\mathbf{u}_2 - \mathbf{u}_1), \qquad 0 \le \lambda(t) \le 1,$$
and introduce it into the expression [Fock 69]
$$\bar u = \int_0^1 \frac{\sqrt{\dot{\mathbf{u}}^2 - (\mathbf{u}\times\dot{\mathbf{u}})^2/c^2}}{1 - u^2/c^2}\,dt,$$
to get
$$\bar u = c\int_0^1 \frac{\sqrt{(\mathbf{u}_2-\mathbf{u}_1)^2 - (\mathbf{u}_1\times\mathbf{u}_2)^2/c^2}}{1 - u^2/c^2}\,d\lambda. \tag{8.6.4}$$
If we set
$$\lambda = \frac{(c^2 - u^2)}{2a}\ln\frac{b + ax}{b - ax},$$
where the constants
$$a = \sqrt{c^2(\mathbf{u}_2-\mathbf{u}_1)^2 - (\mathbf{u}_1\times\mathbf{u}_2)^2},$$
and
$$b = c^2 - \mathbf{u}_1\cdot\mathbf{u}_2,$$
the integral (8.6.4) becomes
$$\bar u = c\int_0^1 \frac{ab\,dx}{b^2 - a^2x^2} = \frac{c}{2}\ln\frac{b + a}{b - a}, \tag{8.6.5}$$
which is the hyperbolic measure of distance. Using the well-known relation for inverse hyperbolic functions, (8.6.5) can be written as
$$\frac{a}{b} = \tanh(\bar u/c). \tag{8.6.6}$$
Squaring both sides of (8.6.6) and multiplying by $c^2$,
$$\frac{(\mathbf{u}_2-\mathbf{u}_1)^2 - (\mathbf{u}_1\times\mathbf{u}_2)^2/c^2}{(1 - \mathbf{u}_1\cdot\mathbf{u}_2/c^2)^2} = c^2\tanh^2(\bar u/c),$$
shows that the left side of (8.6.6) is the ratio of the relative velocity to $c$. Whereas the Euclidean measure of the relative velocity is bounded from above by $c$, there is no limit on its hyperbolic measure $\bar u$.
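The closed form (8.6.5) can be compared with a direct numerical integration of the line element (8.6.2) along the straight segment joining the two velocities. The sketch below is only an illustration: it works in units with $c = 1$, uses a midpoint rule with an arbitrarily chosen number of steps, and the function names and sample vectors are assumptions for the example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def hyperbolic_distance(u1, u2):
    """Closed form (8.6.5)/(8.6.6) with c = 1: ubar = artanh(a/b)."""
    d = tuple(y - x for x, y in zip(u1, u2))
    w = cross(u1, u2)
    a = math.sqrt(dot(d, d) - dot(w, w))
    b = 1.0 - dot(u1, u2)
    return math.atanh(a / b)

def chord_integral(u1, u2, n=2000):
    """Midpoint-rule integral of (8.6.2) along u = u1 + lam*(u2 - u1), with c = 1."""
    d = tuple(y - x for x, y in zip(u1, u2))
    w = cross(u1, u2)
    a = math.sqrt(dot(d, d) - dot(w, w))   # numerator of the integrand, constant in lam
    total = 0.0
    for k in range(n):
        lam = (k + 0.5) / n
        u = tuple(x + lam * dx for x, dx in zip(u1, d))
        total += a / (1.0 - dot(u, u)) / n
    return total

u1, u2 = (0.3, 0.1, 0.0), (-0.2, 0.5, 0.0)
print(hyperbolic_distance(u1, u2), chord_integral(u1, u2))

# Parallel velocities: the hyperbolic measure is the difference of rapidities.
print(hyperbolic_distance((0.2, 0, 0), (0.7, 0, 0)), math.atanh(0.7) - math.atanh(0.2))
```

The two numbers in each line should agree, the first pair to the accuracy of the quadrature and the second to rounding error.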
To see the relation between the hyperbolic measure of distance, (8.6.5), and the Lobaschevskian arc length, (8.6.2), we write the former in the form
$$\bar u = \frac{c}{2}\ln\left(1 + \frac{2\sqrt{c^2(\mathbf{u}_2-\mathbf{u}_1)^2 - (\mathbf{u}_1\times\mathbf{u}_2)^2}}{c^2 - \mathbf{u}_1\cdot\mathbf{u}_2 - \sqrt{c^2(\mathbf{u}_2-\mathbf{u}_1)^2 - (\mathbf{u}_1\times\mathbf{u}_2)^2}}\right).$$
Considering the second term in the argument of the logarithm as small, we approximate $\ln(1 + x) \simeq x$ so that
$$\bar u \simeq c\,\frac{\sqrt{c^2(\mathbf{u}_2-\mathbf{u}_1)^2 - (\mathbf{u}_1\times\mathbf{u}_2)^2}}{c^2 - \mathbf{u}_1\cdot\mathbf{u}_2 - \sqrt{c^2(\mathbf{u}_2-\mathbf{u}_1)^2 - (\mathbf{u}_1\times\mathbf{u}_2)^2}} \simeq \frac{\sqrt{(\mathbf{u}_2-\mathbf{u}_1)^2 - (\mathbf{u}_1\times\mathbf{u}_2)^2/c^2}}{1 - \mathbf{u}_1\cdot\mathbf{u}_2/c^2} + O(|\mathbf{u}_2 - \mathbf{u}_1|), \tag{8.6.7}$$
where the remainder in (8.6.7) contains higher-order terms in the velocity difference. In the case of parallel vectors, (8.6.7) reduces to the velocity composition law, (6.3.1). If we specialize to the case of parallel vectors, (8.6.5) reduces to
$$\bar u = \frac{c}{2}\ln\frac{c^2 - u_1u_2 + c(u_2 - u_1)}{c^2 - u_1u_2 - c(u_2 - u_1)} = \frac{c}{2}\ln\left(\frac{c - u_1}{c - u_2}\cdot\frac{c + u_2}{c + u_1}\right) = \frac{c}{2}\ln\{u_2, u_1\,|\,{-c}, c\}. \tag{8.6.8}$$
The argument of the logarithm is our old friend the cross-ratio, and underscores its fundamental role in defining distance in hyperbolic space. This shows that the Lobaschevsky segment is really the shortest distance between two points on the hyperbolic plane. We recall from Sec. 2.2.4 that projections are transformations that change lengths and angles. No property of three points can be invariant because any three points can be transformed into any other three points on the line. Four collinear points are needed $(c, u_2, u_1, -c)$ for the cross-ratio (8.6.8). Connecting these points by lines emanating from a common point, the invariance of the cross-ratio can be shown as an invariance of the ratio of the angle formed from this vertex, as we have done in Sec. 2.2.4. As $u_2$ moves toward $c$, the cross-ratio, as well as its logarithm, increases indefinitely. Rather, if $u_2$ is found between $u_1$ and $-c$, the cross-ratio will be between 0 and 1. This makes the hyperbolic distance (8.6.8) negative. And as $u_2$ moves along the line toward $-c$, the hyperbolic length will decrease indefinitely. Hence, the points $\pm c$ are ‘infinitely distant.’ Infinitely distant points are upper and lower bounds on the Euclidean measures of distance, in this case velocities, since we are working in velocity space. The same can be said about configuration space, where the points at infinity will be given by $\pm c/\omega$, with the frequency to be defined, for example, in considering gravitational collapse, whose free-fall frequency is $\omega = \sqrt{G\rho}$, where $G$ is the Newtonian gravitational constant and $\rho$ is the density of matter. We can also write the hyperbolic distance (8.6.8) as
$$\bar u = \frac{c}{2}\ln\left(\frac{1}{c - u}\cdot\frac{c + u}{1}\right), \tag{8.6.9}$$
by setting $u_1 = 0$ and putting $u_2 = u$. Alternatively, we can equate (8.6.9) with (8.6.8) without any prior conditions and find the velocity subtraction law,
$$u = \frac{u_2 - u_1}{1 - u_1\cdot u_2/c^2}. \tag{8.6.10}$$
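For parallel velocities, the cross-ratio form (8.6.8) and the subtraction law (8.6.10) can be compared directly. The sketch below is illustrative; the function names and sample velocities are assumptions, and $c$ is kept as a parameter with default value 1.

```python
import math

def cross_ratio_distance(u1, u2, c=1.0):
    """Hyperbolic distance (8.6.8): (c/2) ln{u2, u1 | -c, c} for collinear velocities."""
    ratio = ((c - u1) / (c - u2)) * ((c + u2) / (c + u1))
    return 0.5 * c * math.log(ratio)

def subtract(u2, u1, c=1.0):
    """Velocity subtraction law (8.6.10) for parallel velocities."""
    return (u2 - u1) / (1.0 - u1 * u2 / (c * c))

u1, u2, u3 = 0.2, 0.7, 0.9
# (8.6.8) against (8.6.9) evaluated at the relative velocity from (8.6.10):
print(cross_ratio_distance(u1, u2), math.atanh(subtract(u2, u1)))
# Distances along the line add, unlike the Euclidean velocity differences:
print(cross_ratio_distance(u1, u3),
      cross_ratio_distance(u1, u2) + cross_ratio_distance(u2, u3))
# With u2 between u1 and -c the cross-ratio is below 1 and the distance is negative:
print(cross_ratio_distance(u2, u1))
```

The additivity in the second line is the property that makes the logarithm of the cross-ratio, rather than the cross-ratio itself, the natural measure of distance.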
The subtraction, and not the addition, law for velocities is required for the construction of the cross-ratio. It is also quite remarkable that the triangle defect of hyperbolic geometry corresponds exactly with the angle of aberration. When a moving observer, like us on Earth, tries to determine the position of another object, say a star, the directions to the star will differ by aberration. That is, consider an incoming ray in the xy-plane. With respect to two platforms $S$ and $S'$ moving at a relative velocity $v$ with respect to one another, an incoming light signal will make angles $\alpha$ and $\alpha'$, respectively. The velocity components in the x and y directions will be related by
$$u'_x = \frac{u_x - v}{1 - u_x v/c^2} \quad\text{and}\quad u'_y = \frac{u_y}{\gamma(1 - u_x v/c^2)}.$$
The velocity components of the incoming light signal will be $u_x = -c\cos\alpha$ and $u_y = -c\sin\alpha$, with analogous expressions in the primed platform. Hence, the formulas for aberration are [cf. (8.3.6)]
$$\cos\alpha' = \frac{\cos\alpha + v/c}{1 + (v/c)\cos\alpha}, \tag{8.6.11a}$$
and [cf. (8.3.9)]
$$\sin\alpha' = \frac{\sin\alpha}{\gamma(1 + (v/c)\cos\alpha)}. \tag{8.6.11b}$$
With the aid of the half-angle formula,
$$\tan\tfrac{1}{2}\alpha = \frac{\sin\alpha}{1 + \cos\alpha},$$
the formulas for aberration, (8.6.11a) and (8.6.11b), can be combined to yield [cf. (8.3.17)]
$$\tan\tfrac{1}{2}\alpha = K\tan\tfrac{1}{2}\alpha'. \tag{8.6.12}$$
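The way the two aberration formulas collapse into a single half-angle relation can be illustrated numerically. This is a sketch under stated assumptions: we take $K = \sqrt{(1+v/c)/(1-v/c)}$, which is the value $e^{\bar u}$ with $\tanh\bar u = v/c$ and is not defined in this passage, and the function names and test values are ours.

```python
import math


def aberrate(alpha, beta):
    """Angle alpha' obtained from cos alpha' and sin alpha' of (8.6.11a,b); beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    denom = 1.0 + beta * math.cos(alpha)
    cos_p = (math.cos(alpha) + beta) / denom
    sin_p = math.sin(alpha) / (gamma * denom)
    return math.atan2(sin_p, cos_p)


def doppler_factor(beta):
    # Assumption: K = sqrt((1 + beta)/(1 - beta)) = exp(u_bar), tanh(u_bar) = beta.
    return math.sqrt((1.0 + beta) / (1.0 - beta))


beta = 0.6    # illustrative relative velocity in units of c
alpha = 1.0   # illustrative angle in radians
alpha_p = aberrate(alpha, beta)
lhs = math.tan(alpha / 2.0)
rhs = doppler_factor(beta) * math.tan(alpha_p / 2.0)
print(alpha_p, lhs, rhs)
```

With these sign conventions the primed angle is the smaller one, and the two sides of the half-angle relation agree to rounding error.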
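Consider two types of displacements from $\mathbf{u}$, $d\mathbf{u}$ and $\delta\mathbf{u}$, where
$$d\bar u^2 = c^2\,\frac{c^2(d\mathbf{u})^2 - (\mathbf{u}\times d\mathbf{u})^2}{(c^2 - u^2)^2},$$
and
$$\delta\bar u^2 = c^2\,\frac{c^2(\delta\mathbf{u})^2 - (\mathbf{u}\times\delta\mathbf{u})^2}{(c^2 - u^2)^2}.$$
By the definition of the cosine of the angle between the two displacements [Fock 69],
$$d\bar u\cdot\delta\bar u = d\bar u\,\delta\bar u\cos\alpha = c^2\,\frac{c^2\,d\mathbf{u}\cdot\delta\mathbf{u} - (\mathbf{u}\times d\mathbf{u})\cdot(\mathbf{u}\times\delta\mathbf{u})}{(c^2 - u^2)^2},$$
we have
$$\cos\alpha = \frac{c^2\,d\mathbf{u}\cdot\delta\mathbf{u} - (\mathbf{u}\times d\mathbf{u})\cdot(\mathbf{u}\times\delta\mathbf{u})}{\sqrt{c^2(d\mathbf{u})^2 - (\mathbf{u}\times d\mathbf{u})^2}\,\sqrt{c^2(\delta\mathbf{u})^2 - (\mathbf{u}\times\delta\mathbf{u})^2}}, \tag{8.6.13}$$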
as the expression for the cosine of the angle between the relative velocities of two bodies.
8.7 Coordinates in the Hyperbolic Plane
Consider an orthogonal system through the origin O in Fig. 8.5. Let U and V be the points on the axes where the perpendicular projections from a point P not lying on the axes meet the axes. We then have a Lambert quadrilateral OVPU. If the angle at P were a right angle, we would have Euclidean geometry; if it is acute we have hyperbolic geometry.
Fig. 8.5. A Lambert quadrilateral in velocity space consisting of three right-angles and one acute angle.
Though named after Lambert, it was known to Ibn al-Haytham almost seven hundred years earlier [Rosenfeld 88]. This is yet another example of Stigler’s law of eponymy. The distances to the points U and V are
$$\bar u = \tanh^{-1}u, \quad\text{and}\quad \bar v = \tanh^{-1}v. \tag{8.7.1}$$
The Weierstrass coordinates can now be introduced as
$$X = uT, \quad Y = vT, \quad\text{and}\quad T = \cosh\bar u\cosh\bar w, \tag{8.7.2}$$
where $\bar u$, $\bar v$, $\bar w$, and $\bar z$ are four sides of a Lambert quadrilateral, shown in Fig. 8.5, consisting of three right angles and one acute angle between $\bar w$ and $\bar z$. The condition that the two sides will intersect to form an acute angle is $1 - \tanh^2\bar u - \tanh^2\bar v = 1 - u^2 - v^2 > 0$. The Euclidean measures of the two sides are
$$w = \frac{u}{\sqrt{1 - v^2}} = \tanh\bar u\cosh\bar v, \qquad z = \frac{v}{\sqrt{1 - u^2}} = \tanh\bar v\cosh\bar u. \tag{8.7.3}$$
By giving to each point the triple (X, Y, T) of Weierstrass coordinates, the hyperbolic plane is mapped onto the locus $T^2 - X^2 - Y^2 = 1$, which is one of two sheets of a hyperboloid in Cartesian three-dimensions. The infinitesimal metric,
$$d\bar\sigma^2 = dX^2 + dY^2 - dT^2 = d\bar w^2 + \cosh^2\bar w\,d\bar u^2 = \frac{(1 - v^2)\,du^2 + 2uv\,du\,dv + (1 - u^2)\,dv^2}{(1 - u^2 - v^2)^2}, \tag{8.7.4}$$
is the spatial component of the Lobaschevsky velocity space metric. If we want a full time-velocity space metric we must magnify the T coordinate $t_r > 0$ times, viz. $T = t_r/\sqrt{1 - u^2 - v^2}$. This results in a time-like, indefinite metric,
$$d\tau^2 = dT^2 - dX^2 - dY^2 = dt_r^2 - t_r^2\,d\bar\sigma^2, \tag{8.7.5}$$
which we will meet again in our discussion of cosmological models in Sec. 9.11. A time-velocity metric, similar to (8.7.5), was derived by Friedmann in 1922 using the Einstein equations to relate the coordinates to the Lagrangian variables, $u_i$. It was derived under the condition that matter was ‘dust-like’ exerting zero pressure, which is certainly a dubious assumption and does not correspond to anything physical. It was also assumed that the velocity variables $u_i$ are constants relating the spatial coordinates $x_i$ to time, but, subsequently, they were differentiated to obtain the Lobaschevsky–Friedmann metric (8.7.5) [Fock 69]. As can be seen from the definition of the Weierstrass coordinates, (8.7.2), each of the coordinates becomes magnified $t_r$ times [Fock 69, Eq. (94.47)]. In a multi-dimensional velocity space, the spatial part of the metric (8.7.4) can be written as
$$d\bar\sigma^2 = \frac{(d\mathbf{u})^2 - (\mathbf{u}\times d\mathbf{u})^2}{(1 - u^2 - v^2)^2},$$
by introducing the coordinates $X_i = u_i t_r/\sqrt{1 - u^2 - v^2}$ into the infinitesimal metric,
$$d\bar\tau^2 = dT^2 - \sum_i dX_i^2.$$
In relation to the Robertson–Walker metric, which we will come across in Sec. 9.11, the scale factor R(t) multiplying the spatial part of the metric is just t, which implies uniform expansion. Introducing logarithmic, or hyperbolic, time according to
$$\bar t = t_0\ln(t_r/t_0), \tag{8.7.6}$$
where $t_0$ is the age of the system on the $t_r$ scale (and not (8.4.30)), into (8.7.5) gives
$$d\bar\tau^2 = e^{2\bar t/t_0}\left\{d\bar t^2 - t_0^2\left(d\bar w^2 + \cosh^2\bar w\,d\bar u^2\right)\right\}. \tag{8.7.7}$$
The proper time interval is the quantity $\bar\tau_0$ determined at constant velocity by the equation
$$\bar\tau_0 = \int_0^{\bar t} e^{\bar t/t_0}\,d\bar t = t_0\left(e^{\bar t/t_0} - 1\right).$$
This law could have been anticipated because $t_r$ is the geometric mean. Only for $\bar t \ll t_0$ will the proper time coincide with the variable $\bar t$. The exponential scale factor multiplies both time and velocity increments, and testifies to the fact that they are not independent, but are related by the Beltrami coordinates and logarithmic time. For fixed $t_0$ the velocity space line element is
$$d\bar\sigma^2 = t_0^2\,e^{2\bar t/t_0}\left(d\bar w^2 + \cosh^2\bar w\,d\bar u^2\right). \tag{8.7.8}$$
The terms in the parentheses have the metric form of a pseudosphere in velocity space, with constant negative curvature, −1, that we introduced in Sec. 2.5 and met on many an occasion. The scale factor is the same exponential that appears in the proper time increment. The hallmark of a pseudosphere is that lines which do not intersect are, nevertheless, not parallel. Along a light track (8.7.7) vanishes resulting in
$$d\bar t = t_0\,\frac{\sqrt{(1 - u^2)^2\,dw^2 + (1 - w^2)\,du^2}}{(1 - u^2)(1 - w^2)}.$$
This is a generalization of the well-known one-dimensional expression, whose integral identifies (8.2.1) as the length of the corresponding segment of a Lobaschevsky straight line.
8.8 Limiting Case of a Lambert Quadrilateral: Uniform Acceleration
A limiting case arises when inequality (8.7.3) reduces to an equality
$$u^2 + v^2 = 1, \tag{8.8.1}$$
or $v = \sqrt{1 - u^2} =: u^*$. The velocities $\bar u$ and $u^*$ are said to be complementary [Greenberg 93]. The defining relation for uniform acceleration is (8.4.16), which upon resolving for the velocity gives
$$u = \tanh\bar u = \frac{2(r/t)}{1 + (r/t)^2}. \tag{8.8.2}$$
$r/t$ represents the Euclidean ‘length,’ while $\bar u$ is the length that Poincaré used; the two being related by
$$e^{\bar u} = \frac{1 + (r/t)}{1 - (r/t)}. \tag{8.8.3}$$
The complementary velocity is found to be
$$u^* = \frac{1 - (r/t)^2}{1 + (r/t)^2} = \operatorname{sech}\bar u = \sqrt{1 - u^2},$$
which verifies (8.8.1). The angle of parallelism, (8.2.10),
$$\Pi(\bar u^*) = 2\tan^{-1}e^{-\bar u^*}, \tag{8.8.4}$$
is defined solely in terms of the ‘distance’ $\bar u^*$ from the foot of the perpendicular to the angle of parallelism. The angle of parallelism is the lower bound for the angle of parallax. It was Bernoulli who first showed that
$$2\tan^{-1}e^{-\bar u^*} = \frac{1}{i}\ln\left(\frac{1 + i e^{-\bar u^*}}{1 - i e^{-\bar u^*}}\right). \tag{8.8.5}$$
This is because
$$\Pi(\bar u^*) = \frac{2}{i}\tanh^{-1}\left(i e^{-\bar u^*}\right),$$
which is equal to (8.8.4). In particular,
$$r/t = \tanh(\bar u/2) = \tan\left(\Pi(\bar u^*)/2\right) = e^{-\bar u^*} \tag{8.8.6}$$
shows that the closer the complementary velocity $u^*$ is to zero, the closer $\Pi$ is to being a right angle. For large $u^*$, or nonrelativistic velocities, the angle of parallelism is practically zero. For the Earth’s orbital motion, (8.8.6) is $10^{-4}$, giving an angle of parallelism $\Pi$ = 89° 59′ 39.4″. The deviation from Euclidean space is only 20.6″. However, things change drastically as the velocity of light is approached: for a relative velocity 0.95, the angle of parallelism drops to 18° 12′, and vanishes in the limit [Silberstein 14]. The double angle formula,
$$\tan\Pi(\bar u^*) = \frac{e^{-\bar u^*} + e^{-\bar u^*}}{1 - e^{-2\bar u^*}} = 1/\sinh\bar u^* = \sinh\bar u, \tag{8.8.7}$$
shows that $\Pi$ provides the link between circular and hyperbolic functions. In particular, (8.8.7) relates the angle of parallelism to the particle velocity. Consider a Lambert quadrilateral with three right angles and an ideal point in Fig. 8.6. Equation (8.8.7) implies that the complementary velocities,
Fig. 8.6. A Lambert quadrilateral comprised of complementary segments where the ‘fourth vertex’ is an ideal point.
$\bar u$ and $\bar u^*$, which are adjacent to the two angles of parallelism, are related by
$$e^{\bar u^*} = \frac{1 + \cosh\bar u}{\sinh\bar u} = \coth(\bar u/2) = \frac{1 + e^{-\bar u}}{1 - e^{-\bar u}}, \tag{8.8.8}$$
and an identical expression for $\bar u$ in terms of $\bar u^*$. Equation (8.8.3) implies the addition law for the hyperbolic measure of the complementary velocities, which, in turn, implies the product law for the average velocities. If, instead, $\bar u_1$ and $\bar u_2$ are the components of the hyperbolic measure of the velocity $\bar u$, their composition follows the velocity composition law:
$$e^{-\bar u} = \frac{e^{-\bar u_1} + e^{-\bar u_2}}{1 + e^{-\bar u_1 - \bar u_2}}. \tag{8.8.9}$$
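The numbers quoted for the angle of parallelism, and the relation (8.8.8) between complementary segments, can be reproduced with a few lines of arithmetic. This is a minimal sketch: the function names are ours, velocities are in units of $c$, and the complementary segment is computed from $\sinh\bar u^* = 1/\sinh\bar u$, as read off from (8.8.7).

```python
import math


def parallelism_angle(x):
    """Pi(x) = 2 atan(exp(-x)), the form used in (8.8.4)."""
    return 2.0 * math.atan(math.exp(-x))


def complementary(x):
    """Complementary segment x*, defined here by sinh(x*) = 1/sinh(x), cf. (8.8.7)."""
    return math.asinh(1.0 / math.sinh(x))


def report(u):
    """Return (Pi(u_bar), Pi(u_bar*), u_bar, u_bar*) for a velocity u in units of c."""
    u_bar = math.atanh(u)
    u_star = complementary(u_bar)
    return parallelism_angle(u_bar), parallelism_angle(u_star), u_bar, u_star


pi_earth, pi_star_earth, _, _ = report(1.0e-4)   # Earth's orbital velocity, v/c ~ 1e-4
pi_fast, pi_star_fast, ub, us = report(0.95)     # relative velocity 0.95
deviation_arcsec = math.degrees(math.pi / 2 - pi_earth) * 3600.0

print(math.degrees(pi_earth), deviation_arcsec)  # close to a right angle; deviation ~ 20.6"
print(math.degrees(pi_fast))                     # roughly 18.2 degrees, i.e. about 18 deg 12'
print(math.exp(us), 1.0 / math.tanh(ub / 2.0))   # the two sides of (8.8.8)
```

In this form $\cos\Pi(\bar u) = \tanh\bar u$, so the angle of parallelism is simply the angle whose cosine is the velocity in units of $c$.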
Finally, the generalization of (8.8.6) to n components,
$$\prod_{i=1}^{n}\tan\left(\Pi(\bar u_i^*)/2\right) = e^{-\sum_{i=1}^{n}\bar u_i^*} = G^{-n} = \tanh(\bar u/2), \tag{8.8.10}$$
selects out the geometric mean $G = \left(e^{\bar u_1^*}e^{\bar u_2^*}\cdots e^{\bar u_n^*}\right)^{1/n}$, and the last equality follows from (8.8.8), which is valid for a single complementary velocity or a sum of n velocities. The opposite right angle is divided into two angles of parallelism such that
$$\Pi(\bar u) + \Pi(\bar u^*) = \pi/2. \tag{8.8.11}$$
8.9 Additivity of the Recession and Distance in Hubble’s Law
The fact that the shift $z = \delta\lambda/\lambda_0$ for lines in the spectrum of a given galaxy is independent of the wavelength is a necessary, but not a sufficient condition, that the redshift is due to motion. It was Edwin Hubble who interpreted these redshifts as Doppler shifts, which are indicative of recessional motion. In so doing he obtained a linear relation between the velocity of recession, $\bar u$, and radial distance, $r$, with a constant of proportionality that is the same for all galaxies. We will show that both these quantities are additive.
There will be a redshift if the detected wavelength, $\lambda$, is greater than the emitted wavelength, $\lambda_0$, in
$$1 + z = \frac{\lambda}{\lambda_0} = K = e^{\bar u}. \tag{8.9.1}$$
Now, K is the ratio of the received, $t_2$, to the emitted time, $t_1$. We can therefore define a hyperbolic measure of the time interval as [Milne 48]
$$\bar t/t_0 = \ln\frac{t_2}{t_1} = \ln K. \tag{8.9.2}$$
A comparison of (8.9.1) and (8.9.2) results in
$$\bar u = \bar t/t_0 = H\bar r, \tag{8.9.3}$$
where $\bar u$ is the hyperbolic measure of the velocity, $H = t_0^{-1}$, the Hubble parameter, and $\bar r = \bar t$ is the hyperbolic measure of distance in natural units. Hubble’s law (8.9.3) could have also been derived by setting the one-dimensional velocity space metric (8.4.29) equal to zero, and introducing the logarithmic time (8.7.6). Consequently, (8.9.1) is the exponential law [Prohovnik 67]
$$1 + z = e^{H\bar t}. \tag{8.9.4}$$
Only when $H\bar t \ll 1$ can we neglect powers of $H\bar t$ greater than first so that (8.9.4) reduces to the relation [Hoyle et al. 00]
$$z = H\bar t. \tag{8.9.5}$$
The exponential law (8.9.4) implies that when there is more than one redshift, it is their geometric mean which should be taken. For example, the cluster Group II has n = 21 redshifts, in which case (8.9.1) generalizes to
$$\prod_{i=1}^{n}(1 + z_i) = \prod_{i=1}^{n}\frac{\lambda_i}{\lambda_{0i}} = \prod_{i=1}^{n}K(\bar u_i) = \exp\left(\sum_{i=1}^{n}\bar u_i\right). \tag{8.9.6}$$
Hyperbolic velocities, u¯ i , like the hyperbolic distances r¯i , are therefore additive. The average wavelength is the geometric mean wavelength. This is implied by the exponential law (8.9.6).
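The additivity of the hyperbolic velocities and the appearance of the geometric mean can be illustrated with a short computation. This is a sketch only: the redshift values below are hypothetical, chosen for illustration, and are not data for any cluster mentioned in the text; velocities are in units of $c$.

```python
import math

# Hypothetical redshifts, for illustration only.
redshifts = [0.010, 0.012, 0.015, 0.011]

# Hyperbolic velocity of each member: 1 + z = exp(u_bar), so u_bar = ln(1 + z).
u_bars = [math.log1p(z) for z in redshifts]

# Product of the factors (1 + z_i) equals the exponential of the summed u_bar_i.
product = math.prod(1.0 + z for z in redshifts)
summed = sum(u_bars)

# The geometric mean of (1 + z_i) corresponds to the arithmetic mean of the u_bar_i.
geometric_mean = product ** (1.0 / len(redshifts))
mean_u_bar = summed / len(u_bars)

print(product, math.exp(summed))
print(geometric_mean, math.exp(mean_u_bar))
```

Taking the arithmetic mean of the hyperbolic velocities is thus the same operation as taking the geometric mean of the wavelength ratios.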
References
[Bondi 60] H. Bondi, Cosmology (Cambridge U. P., London, 1960).
[Born 09] M. Born, “Die Theorie des starren Elektrons in der Kinematik des Relativitätsprinzips,” Ann. der Phys. 30 (1909) 1–56.
[Born 43] M. Born, Experiment and Theory in Physics (Cambridge U. P., London, 1943).
[Busemann & Kelly 53] H. Busemann and P. J. Kelly, Projective Geometry and Projective Metrics (Academic Press, New York, 1953).
[Fock 69] V. Fock, The Theory of Space, Time and Gravitation, 2nd ed. (Pergamon Press, Oxford, 1969).
[Greenberg 93] M. J. Greenberg, Euclidean and Non-Euclidean Geometries: Development and History, 3rd ed. (W. H. Freeman, New York, 1993).
[Hoyle et al. 00] F. Hoyle, G. Burbidge and J. V. Narlikar, A Different Approach to Cosmology (Cambridge U. P., Cambridge, 2000).
[Lobaschevsky 98] N. I. Lobaschevsky, Zwei Geometrische Abhandlungen (Leipzig, 1898).
[Milne 48] E. A. Milne, Kinematical Relativity (Oxford U. P., London, 1948).
[Møller 52] C. Møller, Theory of Relativity (Oxford U. P., London, 1952).
[Needham 97] T. Needham, Visual Complex Analysis (Clarendon Press, Oxford, 1997).
[Page 36] L. Page, “A new relativity,” Phys. Rev. 49 (1936) 254–268.
[Poincaré 05] H. Poincaré, “Sur la dynamique de l’électron,” Comptes Rendus Hebdomadaires des séances de l’Académie des Sciences 140 (1905) 1504–1508; extended version in Rend. Circ. Mat. Palermo 21 (1906) 129–176.
[Prohovnik 67] S. J. Prohovnik, The Logic of Special Relativity (Cambridge U. P., London, 1967).
[Rosenfeld 88] B. A. Rosenfeld, A History of Non-Euclidean Geometry (Springer, New York, 1988), pp. 59–64.
[Robb 11] A. A. Robb, Optical Geometry of Motion (W. Heffer & Sons, Cambridge, 1911).
[Silberstein 14] L. Silberstein, The Theory of Relativity (MacMillan, London, 1914).
[Sommerfeld 09] A. Sommerfeld, “Über die Zusammensetzung der Geschwindigkeiten in der Relativtheorie,” Verh. Deutsch. Phys. Ges. XI (1909) 577–582; “On the composition of velocities in the theory of relativity,” Wikisource translation.
[Terrell 59] J. Terrell, “Invisibility of the Lorentz contraction,” Phys. Rev. 116 (1959) 1041–1045.
[Whitrow 33] G. J. Whitrow, “A derivation of the Lorentz formulae,” Quart. Jour. Math (Oxford) 4 (1933) 161–172.
[Whitrow 80] G. J. Whitrow, The Natural Philosophy of Time, 2nd ed. (Clarendon Press, Oxford, 1980).
Chapter 9
Nonequivalence of Gravitation and Acceleration
The treatment of the uniformly rotating rigid body seems to me to be very important because of an extension of the relativity principle to uniformly rotating systems by trains of thought which I attempted to pursue for uniformly accelerated translation in the last section of. . . my paper (of 1907). Einstein in a letter to Sommerfeld dated 29/9/1909.a
9.1 The Uniformly Rotating Disc in Einstein’s Development of General Relativity
According to John Stachel [89], the rigidly rotating disc is the “missing link” in Einstein’s formulation of general relativity for it made him aware of the need for a non-flat metric in a relativistic treatment of the gravitational field. To Einstein, gravitation is a form of acceleration insofar as it can be nullified by another accelerating frame. Einstein’s conclusion that Euclidean geometry does not apply to a reference frame in uniform rotation stems from his belief that a measuring rod applied to the periphery undergoes Lorentz contraction, while the one applied along the radius does not. Hence, Euclidean geometry does not apply [to a system of coordinates in uniform rotation].
a “That isolated remark, important as it is, does not change my opinion that Einstein was concentrating in other directions during this period (12/1907–06/1911).” A. Pais, in Subtle is the Lord.
In a private letter, dated August 19, 1919, to a then well-known philosopher, Joseph Petzoldt, Einstein goes into more detail: Let $U_0$ be the circumference, $r_0$ the radius of the rotating disc, considered from the standpoint of $K_0$ [the rest frame]; then on account of ordinary Euclidean geometry
$$U_0 = 2\pi r_0. \tag{9.1.1}$$
$U_0$ and $r_0$ naturally are to be thought of as measured with non-rotating measuring rods, i.e. at rest relative to $K_0$. Now let me imagine co-rotating measuring rods of rest length $l$ laid out on the rotating disc, both along the radius as well as the circumference. How long are these, considered from $K_0$? Let us imagine, in order to make this clearer to ourselves, a ‘snapshot’ taken from $K_0$ (definite time $t_0$). On this snapshot the radial measuring rods have the length $l$, the tangential ones, however, the length $l\sqrt{1 - v^2/c^2}$. The ‘circumference’ of the circular disc (considered from K) is nothing but the number of tangential measuring rods that are present in the snapshot along the circumference, whose length considered from $K_0$ is $U_0$. Therefore,
$$U = U_0/\sqrt{1 - v^2/c^2}. \tag{9.1.2}$$
On the other hand, obviously
$$r = r_0 \tag{9.1.3}$$
(since the snapshot of the radial unit measuring rod is just as long as that of a measuring rod at rest relative to $K_0$). Therefore, from (9.1.2) and (9.1.3),
$$\frac{U}{r} = \frac{U_0}{r_0\sqrt{1 - v^2/c^2}}, \tag{*}$$
or, on account of (9.1.1),
$$\frac{U}{r} = \frac{2\pi}{\sqrt{1 - v^2/c^2}}. \tag{**}$$
We pause to see the speciousness of Einstein’s reasoning. Instead of setting the disc in uniform rotation, we set it in motion at a constant velocity $v$. We would then expect the radius to contract under the FitzGerald–Lorentz contraction,
$$r = r_0\sqrt{1 - v^2/c^2}, \tag{9.1.4}$$
and expect that the circumference should remain unchanged under uniform motion,
$$U = U_0. \tag{9.1.5}$$
The ratio of (9.1.5) to (9.1.4) is (*), and with (9.1.1), gives (**). So, Einstein was begging the question! He also contended that clocks go slower at the periphery of the rotating disc because clocks in motion go slower than ones at rest: The rotating observer notes very well that, of his two equivalent clocks, that placed on the circumference runs slower than that placed at the center.
This is supposedly a result of special relativity and has absolutely nothing to do with the fact that the disc is accelerating uniformly. From these observations he concluded that In general relativity, space and time cannot be defined in such a way that the differences of the spatial coordinates be directly measured by the unit measuring rod, or difference in the time coordinate by a standard clock.
Let us recall from Sec. 1.1.1.1 that the gravitational form of time dilatation is based upon Einstein’s so-called equivalence principle, and has nothing to do with general relativity! All during the period when the uniformly rotating disc held Einstein’s attention, probably between mid-July to mid-October of 1912, Einstein expressed his confusion of the relationship between coordinates and measurements with rods and clocks. In Einstein’s own words One sees already from the previously treated highly special case of the gravitation of masses at rest that the space-time coordinates lose their simple physical interpretation; and it still cannot be foreseen what form of the general space-time transformation equations may take. I should like to ask all colleagues to have a try at this important problem!
As Stachel correctly observes Curiously enough, a paper actually deriving the metric of the rotating disc had been published two years earlier by Theodor Kaluza (1910). The paper was to have been delivered by Kaluza at the 1910 Naturforscherversammlung in Königsberg, where he was then working; but he took sick and only the published version appeared under the title “Zur Relativitätstheorie,” which gave no idea of its contents. I have found no evidence that Einstein — or anyone else in the long history of the rotating-disc problem for that matter — was aware of the existence of Kaluza’s work.
Kaluza was right to conclude that “On closer examination, the geometry of a rotating disc is non-Euclidean, specifically Lobachevskian geometry,” but he gets the analysis wrong. Instead of deriving Einstein’s result,
$$\int_0^{2\pi}\frac{r}{\sqrt{1 - r^2}}\,d\phi = \frac{2\pi r}{\sqrt{1 - r^2}},$$
where $r$ and $\phi$ are polar coordinates, he writes:
$$\int\frac{r^2}{\sqrt{1 \pm r^2}}\,d\phi. \tag{9.1.6}$$
The square in the numerator is undoubtedly a typo, but the ± in the denominator is perhaps his uncertainty of which non-Euclidean geometry to use. Kaluza’s expressions for the arc length,
$$\int\left[1 + \frac{r^2}{1 \pm r^2}\left(\frac{d\phi}{dr}\right)^2\right]^{1/2}dr, \tag{9.1.7}$$
and phase,
$$\phi - \phi_0 = \cos^{-1}\left(\frac{r_0}{r}\right) \pm r_0\sqrt{r^2 - r_0^2}, \tag{9.1.8}$$
also suffer from the indeterminacy. Moreover they are wrong: Beginning with his expression for the arc length, (9.1.7), he would have found the equation of a straight line,
$$\phi - \phi_0 = \cos^{-1}\left(\frac{r_0}{r}\right), \tag{9.1.9}$$
in polar coordinates, where $r_0$ is the distance from the line to the origin. He would have obtained the same result with a Euclidean metric. Placing the $r_0$ in front of the first term on the right-hand side and deleting it from the second term in (9.1.8), and choosing the negative sign, would give the eikonal of geometrical optics, (7.4.11). Kaluza’s result, (9.1.6), is however more interesting, and puts to rest Einstein’s red herring about the lack of physical significance of coordinates. Consider a sphere of radius $R$. The Euclidean distance on the sphere $r$ has the elliptical counterpart $R\tan(\hat r/R)$, where $\hat r$ is the elliptic measure of the distance on the sphere. The distance between $(r, \phi)$ to $(r + dr, \phi)$ has the elliptic separation
$$d\hat r = \frac{dr}{1 + r^2/R^2}.$$
The transformation from elliptic to hyperbolic geometry consists in replacing the radius $R$ by $iR$ so that the hyperbolic separation is
$$d\bar r = \frac{dr}{1 - r^2/R^2}, \tag{9.1.10}$$
and gives the distance of a hyperbolic straight line segment, $iR\tan(\hat r/iR) = R\tanh(\bar r/R)$.
Likewise, to determine the circumference of a circle in elliptic geometry we introduce the spherical coordinates
$$x = R\cos\phi\,\sin(\hat r/R), \qquad y = R\sin\phi\,\sin(\hat r/R), \qquad z = R\cos(\hat r/R).$$
Then the distance between $(r, \phi)$ to $(r, \phi + d\phi)$ is
$$d\hat s^2 = dx^2 + dy^2 + dz^2 = R^2\sin^2(\hat r/R)\,d\phi^2, \tag{9.1.11}$$
and, consequently, the circumference of the circle will be
$$R\int_0^{2\pi}\sin(\hat r/R)\,d\phi = 2\pi R\sin(\hat r/R).$$
Again making the transition from elliptic to hyperbolic geometry, $R \to iR$, and noting that $\sin(ix) = i\sinh x$, we get the hyperbolic circumference as
$$U = 2\pi R\sinh(\bar r/R) = \frac{2\pi r}{\sqrt{1 - r^2/R^2}}. \tag{9.1.12}$$
Moreover, from (9.1.11) we find that the hyperbolic distance between $(r, \phi)$ and $(r, \phi + d\phi)$ is $d\bar s = R\sinh(\bar r/R)\,d\phi$, and combining it with (9.1.10), the hyperbolic separation $d\bar s$ of the points $(r, \phi)$ and $(r + dr, \phi + d\phi)$ is
$$d\bar s^2 = \frac{dr^2}{(1 - r^2/R^2)^2} + \frac{r^2\,d\phi^2}{1 - r^2/R^2}, \tag{9.1.13}$$
which, from what we know in Sec. 2.5, is what Beltrami had found back in 1868! Kaluza’s error was to use the Euclidean measure of the radial term instead of its hyperbolic measure. Thus we see that there is no truth to Einstein’s assertion that “the radial measuring rods have length $l$; the tangential ones, however, have length $l\sqrt{1 - v^2/c^2}$.” Nor can the observer distinguish his position on the disc by how slow his clock goes with respect to the clock at the center. This “misunderstanding is quite fundamental,” to use Einstein’s own words. The rulers are not shorter, nor the clocks slower in the hyperbolic metric. Recall Poincaré’s beautiful discovery of a distance function that made all his tessellations the same size in Fig. 1.1.
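The passage from the radial separation (9.1.10) to the hyperbolic circumference (9.1.12) can be checked numerically. The sketch below integrates $d\bar r = dr/(1 - r^2/R^2)$ with a simple midpoint rule and compares $2\pi R\sinh(\bar r/R)$ with $2\pi r/\sqrt{1 - r^2/R^2}$; the function names, the step count, and the sample values of $r$ and $R$ are our own choices.

```python
import math


def hyperbolic_radius(r, R, steps=20000):
    """Integrate d r_bar = dr / (1 - r^2/R^2) from 0 to r by the midpoint rule."""
    h = r / steps
    return sum(h / (1.0 - ((k + 0.5) * h / R) ** 2) for k in range(steps))


def circumference_from_hyperbolic_radius(r_bar, R):
    """U = 2 pi R sinh(r_bar / R), the first form in (9.1.12)."""
    return 2.0 * math.pi * R * math.sinh(r_bar / R)


def circumference_beltrami(r, R):
    """U = 2 pi r / sqrt(1 - r^2/R^2), the second form in (9.1.12)."""
    return 2.0 * math.pi * r / math.sqrt(1.0 - (r / R) ** 2)


R, r = 1.0, 0.6   # illustrative values with r < R
r_bar = hyperbolic_radius(r, R)
U_sinh = circumference_from_hyperbolic_radius(r_bar, R)
U_beltrami = circumference_beltrami(r, R)
print(r_bar, R * math.atanh(r / R))
print(U_sinh, U_beltrami)
```

The integrated radius reproduces $R\tanh^{-1}(r/R)$, and the two forms of the circumference agree, both exceeding the Euclidean value $2\pi r$.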
Let us reiterate what we said about the difference in size of the bug’s legs in Sec. 2.1.1. There is a fine line when talking about the metric, length, distance and time. We want to look at the picture in our Euclidean metric, and say “lengths are shorter and time goes more slowly as we proceed further out on our disc.” However, the person who lives there sees no difference, nor can he measure one. Einstein constructed his general theory as a generalization of the flat metric of Minkowski to a non-flat metric. According to Einstein, the uniformly rotating disc was of “decisive importance” because it showed that a gravitational field (equivalent, to him, to a centrifugal field) causes “non-Euclidean arrangements of measuring rods, and thus compels a generalization of Euclidean space.” The four-dimensional formulation of Minkowski, and its non-flat generalization, required Einstein to go beyond Gauss’s two-dimensional theory of surfaces. In his words: I first had the decisive idea of the analogy of mathematical problems connected with the theory and Gauss’s theory of surfaces in 1912 after my return to Zurich without knowing at that time Riemann’s and Ricci’s, or Levi-Civita’s, work.
But once the equation of the phase trajectory is obtained, which is given in terms of the coefficients of the first fundamental form, (7.5.1), imposing the condition of the conservation of angular momentum, or its non-Euclidean generalization, gives the equation of motion, (7.5.10). Integration of the latter gives the relation between time and space. In the Euclidean case, the equation of motion,
$$\dot r = \pm\frac{\sqrt{r^2 - r_0^2}}{r}, \tag{9.1.14}$$
can be integrated at once to give
$$r^2 - t^2 = r_0^2. \tag{9.1.15}$$
In terms of the Minkowski line element, this would be a space-like interval, where no ‘signal’ can be transmitted from the point r to the point r0 in time t between the events that take place at those points. Since the velocity is greater than that of light, there can be no causal relation between the two events. However — and this is a big however — (9.1.15) is not a statement
of the invariance of two inertial frames of reference in which in the $K_0$ frame of reference the two events happen simultaneously. Yet, it gives a relation that is contrary to Einstein’s assumption that general relativity should be a generalization of the flat Minkowski metric to a non-flat metric. Any non-Euclidean generalization of the equation of motion will contain small corrections to the equation of motion (9.1.14), and, thus, cannot change the qualitative nature of the solution, (9.1.15). Hence, there was no need for Einstein to look beyond Gauss’s theory of curvature to look for a four-dimensional generalization to include time. At least in hindsight, Einstein saw the need for introducing a non-Euclidean metric, based on the uniformly rotating disc, as resting upon three restrictions:
(i) The special theory should hold in a ‘global inertial frame,’ where no gravitational field exists.
(ii) Small measuring rods do not change their length in any gravitational field, and the acceleration of a clock has no influence on its rate.
(iii) Any coordinate system may be used, since The general laws of nature are to be expressed by equations which hold good for all systems of coordinates, that is, are covariant with respect to any substitutions whatever (generally covariant).
The first assumption is that special relativity should be a limiting case when the gravitational field vanishes. Since, by the principle of equivalence, uniform acceleration is indistinguishable from a uniform gravitational field, special relativity should be recovered in the limit where all accelerations vanish. However, the second assumption, that rulers and clocks are not affected by acceleration, is contrary to Einstein’s own finding in Sec. 3.8.2.3, and to our conclusions to the contrary in Sec. 8.4.2. The paradox, which is not a paradox at all, lies in the principle of equivalence which states that
$$(\omega r)^2 = \frac{2GM}{r}, \tag{9.1.16}$$
where $\omega = \sqrt{G\rho}$, which is the inverse of the free-fall time. The left-hand side of (9.1.16) is the square of (constant) angular velocity of rotation, while the right-hand side is the gravitational potential. The passage is from one of
constant density, ρ, to one of constant mass, M. Although mathematically equivalent, they are not physically one and the same. Moreover, a constant gravitational field cannot be annihilated in an accelerating frame of reference, such as a free-falling elevator, as we shall see in the next section. Centrifugal forces are not the same as gravitational forces, as we have realized in Sec. 7.5. How rulers become distorted, and how clocks vary in rate, depends on the frame of reference: to the inhabitants of the rotating disc there are no noticeable changes depending on where they are on the disc. It is to us Euclideans that the changes are perceptible because we are using Euclidean rulers whereas the inhabitants are employing hyperbolic ones. So the assumption that the rulers be small enough has no meaning, just like the breaking up of a rigid circular disc when set into motion “on account of the Lorentz contraction of the tangential fibers and the non-contraction of the radial ones.” The theme of this chapter is the nonequivalence of gravitation and acceleration, so the claim that, when the latter vanishes, all the results of special relativity should be recovered is a non-sequitur. Special relativity attributes to gravity the same form as electromagnetism, and there is no justification in this. We have Maxwell’s equations to show that electromagnetic waves travel at the velocity of light, but what equations are there to show that gravitational waves also propagate at the same velocity? All we have to do is to remind ourselves of the unsatisfactory formulation of a Maxwellian theory of gravitation that we discussed in Sec. 3.8.1. The irony of it all is that Einstein sought mathematical help from his friend Grossmann, who was an expert not only in tensor calculus, but also in non-Euclidean geometries.
Grossmann is usually criticized for having led Einstein astray on the Ricci tensor, but what he should have suggested was looking at hyperbolic geometry as a setting for a theory of gravity. And this was also brought to Einstein’s attention by Varičak, who not only questioned the reality of the Lorentz contraction, but who also introduced Lobachevsky geometry into special relativity. Therefore, the true limit of a relativistic theory of gravitation should be the flat metric of Euclidean space, and not the flat metric of Minkowski. The origin of all relativistic corrections is the negative curvature of space. Consider the conservation of energy:
$$\dot r^2 + r^2\dot\phi^2 + 2\Phi(r) = 2W, \tag{9.1.17}$$
where $W$ is the total energy, and $\Phi$ is some scalar potential, per unit mass. The conservation of angular momentum is that
$$r^2\dot\phi = L, \tag{9.1.18}$$
be constant, again per unit mass. Inserting (9.1.18) into (9.1.17) results in:
$$\dot r = \pm\sqrt{2W - \frac{L^2}{r^2} - 2\Phi(r)}. \tag{9.1.19}$$
Suppose further that $\Phi$ has the Beltrami form in (7.5.11),
$$\Phi(r) = -\frac{GM}{r}\left(1 + \frac{\ell^2}{r^2}\right). \tag{9.1.20}$$
Now, start from the (normalized) equation for the trajectory,
$$\frac{d\phi}{dr} = \pm\frac{\ell\sqrt{E}}{\sqrt{G}\,\sqrt{G\eta^2 - \ell^2}}, \tag{9.1.21}$$
where $\ell = L/c$ is also the ‘distance of nearest approach,’ or collision parameter, in scattering theory. For the Beltrami metric, (9.1.13), the coefficients of the fundamental form are [cf. (7.4.4)]
$$E = \frac{1}{(1 - \omega^2r^2/c^2)^2}, \qquad G = \frac{r^2}{1 - \omega^2r^2/c^2}.$$
Introducing the ‘principle of equivalence,’ (9.1.16), the equation of the pregeodesic (9.1.21) becomes
$$\frac{d\phi}{dr} = \frac{\dot\phi}{\dot r} = \pm\frac{\ell}{r\sqrt{\eta^2r^2 - \ell^2(1 - \alpha/r)}}, \tag{9.1.22}$$
where $\alpha := 2GM/c^2$ is the Schwarzschild radius. Imposing conservation of angular momentum, (9.1.18), in (9.1.22) gives the radial equation,
$$\dot r = \pm\sqrt{1 - \frac{\ell^2}{r^2}\left(1 - \frac{\alpha}{r}\right)}, \tag{9.1.23}$$
for a constant index of refraction. For small $\alpha$, (9.1.23) has the solution,
$$\frac{r^2}{1 - \alpha/r} - t^2 = \ell^2. \tag{9.1.24}$$
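This can be verified as follows. To first order in $\alpha$, (9.1.24) is equivalent to
$$r^2\left(1 + \frac{\alpha}{r}\right) - \bar\ell^2 = t^2.$$
Taking the square root and differentiating give
$$\begin{aligned}
\dot r &= \pm\frac{\sqrt{r^2 + \alpha r - \bar\ell^2}}{r + \alpha/2} = \pm\left(1 + \frac{\alpha}{r} - \frac{\bar\ell^2}{r^2}\right)^{1/2}\cdot\left(1 - \frac{\alpha}{2r}\right)\\
&= \pm\left(1 + \frac{\alpha}{r} - \frac{\bar\ell^2}{r^2}\right)^{1/2}\cdot\left(1 - \frac{\alpha}{r}\right)^{1/2}\\
&= \pm\sqrt{1 - \frac{\bar\ell^2}{r^2} + \frac{\alpha\bar\ell^2}{r^3}},
\end{aligned}$$
which is (9.1.23). Expression (9.1.24) has a striking resemblance to the exterior solution of the Schwarzschild metric that we will investigate in Sec. 9.10.3.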
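The first-order claim can also be checked numerically. The following is a minimal sketch, assuming units with $c = 1$ and purely illustrative values of $r$, $\bar\ell$ and $\alpha$; the function names are hypothetical and not taken from the text. It differentiates $t(r)$ read off from (9.1.24) and compares $1/(dt/dr)$ with the right-hand side of (9.1.23). If the solution is correct to first order, the mismatch should vanish at $\alpha = 0$ and grow roughly as $\alpha^2$.

```python
import math

def r_dot_radial(r, ell_bar, alpha):
    """Positive branch of the radial equation (9.1.23), in units with c = 1."""
    return math.sqrt(1.0 - (ell_bar**2 / r**2) * (1.0 - alpha / r))

def t_from_solution(r, ell_bar, alpha):
    """Time read off from (9.1.24): t^2 = r^2 / (1 - alpha/r) - ell_bar^2."""
    return math.sqrt(r**2 / (1.0 - alpha / r) - ell_bar**2)

def r_dot_from_solution(r, ell_bar, alpha, h=1e-4):
    """dr/dt along (9.1.24), obtained from a central difference of t(r)."""
    dt_dr = (t_from_solution(r + h, ell_bar, alpha)
             - t_from_solution(r - h, ell_bar, alpha)) / (2.0 * h)
    return 1.0 / dt_dr

def mismatch(alpha, r=10.0, ell_bar=3.0):
    """Absolute difference between the two expressions for dr/dt."""
    return abs(r_dot_from_solution(r, ell_bar, alpha)
               - r_dot_radial(r, ell_bar, alpha))

if __name__ == "__main__":
    # Illustrative values only; doubling alpha should roughly quadruple the mismatch.
    for alpha in (0.0, 0.01, 0.02, 0.04):
        print(f"alpha = {alpha:5.2f}   mismatch = {mismatch(alpha):.3e}")
```

A finite difference is used instead of the analytic derivative so that the check is independent of the algebra it is meant to confirm.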
9.2 The Sagnac Effect
Whereas special relativity teaches us that all inertial frames are equivalent, all accelerative frames are not. That is, we can determine the absolute motion in a uniformly accelerated frame. This was known since the early days of relativity, and is referred to as the Sagnac effect [13], after the French scientist who discovered the effect. Actually, he claimed that it proved “the reality of the luminiferous aether by the experiment with a rotating interferometer,” which is the title of one of the two publications dealing with his interferometer. The Sagnac effect was used as an experimental test of whether light can propagate with the velocity c on a moving platform. A non-null effect was found, and this was used as an argument to discredit special relativity. It is treated by emission theory, similar to that of the Michelson–Morley experiment in Sec. 3.2, where light in the direction of the aether wind has
velocity $c + u$, while light in the opposite direction travels at velocity $c - u$, where $u$ is the relative velocity of the aether wind. On a disc of radius $r$, rotating at angular velocity $\omega$, the velocity of light in the direction of rotation is $c + r\omega$, whereas light in the opposite direction travels at velocity $c - r\omega$.

The Sagnac effect occurs in ring interferometry where a beam of light is split into two beams that are made to follow opposite trajectories. The light beams travel in circles, or rings, that enclose a given area. Upon returning to the initial point on the ring, the light beams are allowed to interact in such a way that they produce an interference pattern. The position of the interference fringes depends on the angular velocity $\omega$ of the rotating disc. When the platform is in motion one of the two beams will cover less distance than the other. This produces a shift in the interference pattern. The original schematic representation of the Sagnac interferometer is shown in Fig. 9.1.

The Sagnac interferometer has been likened to a gyroscope: a compass points in the same direction after spinning up. Thus, just like a gyroscope, it measures its own angular velocity and can be used in an inertial guidance system. However, whereas a gyroscope conserves angular momentum, the interferometer does not. This fact will be used explicitly in the following.

The shift in fringes can be considered as a simple statement that light travels different distances in the direction of rotation of the disc and in the opposite direction. The times it takes light to travel around the disc in the direction of motion and in the opposite direction are given by
$$ct_\pm = 2\pi r \pm L_\pm,$$
where $L_\pm = \omega rt_\pm$ is the change in the distance that light covers when it travels with, and against, the rotation of the disc. Hence, the time difference is:
$$\Delta t = t_+ - t_- = \frac{2\pi r}{c - \omega r} - \frac{2\pi r}{c + r\omega} = \frac{4\pi r^2\omega}{c^2 - \omega^2r^2}. \tag{9.2.1}$$
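As a quick arithmetic check of (9.2.1), the sketch below solves $ct_\pm = 2\pi r \pm \omega rt_\pm$ for the two transit times and compares their difference with the closed form. The function names and the numerical values are illustrative assumptions, not part of the text; the low-rotation limit $4A\omega/c^2$, with $A = \pi r^2$ the enclosed area, is included for comparison.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def transit_times(r, omega, c=C):
    """Solve c*t = 2*pi*r +/- omega*r*t for the co- and counter-rotating beams."""
    t_plus = 2.0 * math.pi * r / (c - omega * r)   # beam travelling with the rotation
    t_minus = 2.0 * math.pi * r / (c + omega * r)  # beam travelling against it
    return t_plus, t_minus

def sagnac_delay(r, omega, c=C):
    """Closed form (9.2.1) for the time difference t_plus - t_minus."""
    return 4.0 * math.pi * r**2 * omega / (c**2 - omega**2 * r**2)

def sagnac_delay_slow(r, omega, c=C):
    """Low-rotation limit 4*A*omega/c^2 with enclosed area A = pi*r^2."""
    return 4.0 * (math.pi * r**2) * omega / c**2

if __name__ == "__main__":
    # Illustrative numbers: a 1 m ring turning at 10 rad/s.
    r, omega = 1.0, 10.0
    print("closed form :", sagnac_delay(r, omega))
    print("slow limit  :", sagnac_delay_slow(r, omega))
```

At laboratory rotation rates the two transit times differ only in their last digits, so subtracting them directly loses precision; the closed form is the safer expression to evaluate.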
The result (9.2.1) is typical of emission theory, where one of the velocities is greater than the speed of light. The prediction made by Sagnac was that the difference in time traveled by the beams of light to the screen for an
Fig. 9.1. The Sagnac Interferometer as originally depicted in his 1913 article. The horizontally rotating plate has a light source at O, which is a lamp with a horizontal metal filament. The objective of the microscope C0 projects the image of the filament through the Nicol prism N which then falls on a reflecting mirror m. Two beams traveling in opposite directions are reflected on four mirrors M which complete a closed circuit, a1 a2 a3 a4 .
interference pattern to form depends on the area of the ring, $\pi r^2$, and the angular velocity of rotation, $\omega$. The shift in the fringes is $\Delta\varphi = 2\pi\delta = \omega_\lambda\,\Delta t$, where $\omega_\lambda$ is the frequency of the light used. All relativistic frequency shifts depend on the ratio of the pertinent energy involved to the rest energy. For example, the gravitational redshift is proportional to the ratio of gravitational energy to the rest energy. Here, the shift will be proportional to the ratio of the kinetic energy of rotation to the rest energy,
$$\Delta\varphi = 4\pi\omega_\lambda\ell/c^2, \tag{9.2.2}$$
where, as usual, $\ell$ is the angular momentum (per unit mass).
All treatments of the Sagnac effect use the incorrect general relativistic expression for the line element,
$$ds^2 = dr^2 + \frac{r^2\,d\varphi^2}{1 - \omega^2r^2/c^2}, \tag{9.2.3}$$
which we have pointed out, on numerous occasions, is inconsistent with a hyperbolic metric of constant negative curvature. If we use the stereographic model of Sec. 7.4, the metric will be given by
$$ds^2 = \frac{dr^2 + r^2\,d\varphi^2}{(1 - \omega^2r^2/c^2)^2}.$$
This orthogonal metric gives rise to the equation of the geodesic,
$$\frac{d\varphi}{dr} = \frac{\dot\varphi}{\dot r} = \pm\frac{(\ell/c)(1 - \omega^2r^2/c^2)}{r^2\sqrt{1 - (\ell^2/c^2r^2)(1 - \omega^2r^2/c^2)^2}}. \tag{9.2.4}$$
In order that $r$ be a periodic function of time, i.e. $r\omega_\lambda = c\sin(\omega_\lambda t + \vartheta)$, where $\vartheta$ is an arbitrary phase, the angular momentum must be given by [cf. (7.4.8)]
$$\ell = \frac{\omega r^2}{1 - \omega^2r^2/c^2}. \tag{9.2.5}$$
Inserting (9.2.5) into (9.2.2) gives Sagnac’s expression (9.2.1),
$$\Delta\varphi = \frac{4\pi\omega_\lambda\omega r^2}{c^2 - \omega^2r^2} = \omega_\lambda\,\Delta t,$$
for the difference in time traveled by the light beams to the screen in order to have an interference pattern. Under the principle of equivalence, (9.1.16), the same expression for the angular momentum is obtained in general relativity for the advance of the perihelion, (7.4.8), where it is claimed that (9.2.5) [Møller 52] cannot in general be interpreted as angular momentum, since the notion of a ‘radius vector’ occurring in the definition of the angular momentum has an unambiguous meaning only in Euclidean space.
The non-conservation of the angular momentum is now due to the presence of gravity, and vanishes in the absence of mass. This is a summary dismissal if ever there was one!
Introducing the expression for the angular momentum, (9.2.5), into the equation of the trajectory, (9.2.4), gives the rate equation,
$$\dot r = c\sqrt{1 - \frac{\omega^2r^2}{c^2}}, \tag{9.2.6}$$
where we have chosen the positive sign for simplicity. The rate equation (9.2.6) is the momentum per unit mass, and so it is related to the action $S$ by
$$\frac{\partial S}{\partial r} = \dot r.$$
Integrating we find the action,
$$S = c\int\sqrt{1 - \frac{\omega^2r^2}{c^2}}\,dr = \frac{c}{2}\left[r\sqrt{1 - \frac{\omega^2r^2}{c^2}} + \frac{c}{\omega}\sin^{-1}\left(\frac{\omega r}{c}\right)\right]. \tag{9.2.7}$$
Expression (9.2.7) is the phase of the Hermite polynomials in the short-wavelength (WKB) limit. So as long as $r < c/\omega$ we get a periodic solution of a harmonic oscillator which can be quantized,
$$\oint\dot r\,dr = \frac{c^2}{\omega}\int_0^{2\pi}\cos^2\vartheta\,d\vartheta = \pi\frac{c^2}{\omega}, \tag{9.2.8}$$
where we have used the change of variable, $r = (c/\omega)\sin\vartheta$. Thus, the quantum conditions require the right-hand side of (9.2.8) to be semi-integral multiples of Planck’s constant. No quantum condition exists for the angular momentum (9.2.5) because it is not conserved.

What about radii for which $r > c/\omega$? According to special relativity, “for all points with $r < c/\omega$ the rotating system of reference may be represented by a uniformly rotating material,” while if the inequality is reversed, “no absolutely rigid body can exist, since they would provide a means of transmitting signals with velocities greater than $c$” [Møller 52]. The action, (9.2.7), becomes imaginary in form though not in reality, for we now have
$$S = c\int\sqrt{\omega^2r^2/c^2 - 1}\,dr,$$
and introducing the substitution $r = (c/\omega)\cosh\vartheta$ there results
$$\begin{aligned}
S &= \frac{c^2}{\omega}\int\sinh^2\vartheta\,d\vartheta = \frac{c^2}{4\omega}(\sinh 2\vartheta - 2\vartheta)\\
&= \frac{c}{2}\left[r\sqrt{\omega^2r^2/c^2 - 1} - \frac{c}{\omega}\cosh^{-1}\left(\frac{\omega r}{c}\right)\right].
\end{aligned} \tag{9.2.9}$$
In the same way that we transformed from an oblate to a prolate spheroid in Sec. 5.4.4, we have transformed from a periodic to an exponential solution by reversing the inequality $r < c/\omega$. The hyperbolic action, (9.2.9), will be finite because the terms in parentheses, $\sinh 2\vartheta - 2\vartheta$, are proportional to the volume of the pseudosphere of radius $\vartheta$, which we know to be finite. Likewise, if we had used the substitution $r = (c/\omega)\cos\vartheta$ in (9.2.7), we would have obtained the action,
$$S = \frac{c^2}{4\omega}(2\vartheta - \sin 2\vartheta), \tag{9.2.10}$$
which is proportional to the volume of a sphere in elliptic geometry. Consequently, the transition from a uniformly rotating disc with $r < c/\omega$ to one where $r > c/\omega$ is one from elliptic to hyperbolic geometry.
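The closed forms for the action on either side of $r = c/\omega$, and the loop integral (9.2.8), can be compared with direct quadrature. The sketch below is only a numerical check under stated assumptions: the integration constants are fixed so that $S(0) = 0$ inside and $S(c/\omega) = 0$ outside, the outside integral is evaluated in the variable $\vartheta$ to avoid the square-root endpoint at $r = c/\omega$, and all function names and test values are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def action_inside(r, omega, c):
    """Closed form (9.2.7), valid for r < c/omega, normalized so that S(0) = 0."""
    x = omega * r / c
    return 0.5 * c * (r * math.sqrt(1.0 - x * x) + (c / omega) * math.asin(x))

def action_outside(r, omega, c):
    """Closed form (9.2.9), valid for r > c/omega, normalized so that S(c/omega) = 0."""
    x = omega * r / c
    return 0.5 * c * (r * math.sqrt(x * x - 1.0) - (c / omega) * math.acosh(x))

def action_inside_numeric(r, omega, c):
    """Quadrature of c*sqrt(1 - omega^2 s^2 / c^2) ds from 0 to r."""
    return simpson(lambda s: c * math.sqrt(1.0 - (omega * s / c) ** 2), 0.0, r)

def action_outside_numeric(r, omega, c):
    """Quadrature of (c^2/omega)*sinh^2(theta) from 0 to arccosh(omega*r/c)."""
    theta_max = math.acosh(omega * r / c)
    return (c * c / omega) * simpson(lambda th: math.sinh(th) ** 2, 0.0, theta_max)

def loop_integral(omega, c):
    """Left-hand side of (9.2.8) after the substitution r = (c/omega)*sin(theta)."""
    return (c * c / omega) * simpson(lambda th: math.cos(th) ** 2, 0.0, 2.0 * math.pi)

if __name__ == "__main__":
    c, omega = 1.0, 2.0  # illustrative units in which c/omega = 0.5
    print(action_inside(0.3, omega, c), action_inside_numeric(0.3, omega, c))
    print(action_outside(0.8, omega, c), action_outside_numeric(0.8, omega, c))
    print(loop_integral(omega, c), math.pi * c * c / omega)
```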
9.3 Generalizations of the Sagnac Effect
The relativistic Sagnac effect considers the magnitude of the shift in the phase of two light beams as the ratio of the potential energy of the centrifugal force to the rest energy. We can also consider the shift in phase caused by the magnetic energy associated with the angular momentum about the polar axis. In the same way that the angular momentum is not conserved in the Sagnac effect, so we expect that $m = r^2\sin^2\vartheta\,\dot\varphi$ will not be conserved. Instead of a disc, we take a hemisphere with a disc on it that is determined by the angle $\vartheta$ with respect to the normal from the center $O$ of the hemisphere, as shown in Fig. 9.2. The disc will have a perimeter $2\pi r\sin\vartheta$. We now set the hemisphere rotating at a constant angular velocity $\omega$, and again determine the time difference for light to propagate in the forward and reverse directions.
Fig. 9.2. Disc cut out of hemisphere at an angle ϑ.
Light traveling with the hemisphere will cover more than one circumference around the disc and hit the light source from behind in a time $t_1$ given by
$$ct_1 = 2\pi r\sin\vartheta + \omega rt_1\sin\vartheta,$$
while light traveling in the opposite direction of rotation will travel less than the perimeter before colliding with the light source from the front side. The time it takes is
$$ct_2 = 2\pi r\sin\vartheta - \omega rt_2\sin\vartheta.$$
Solving for the times $t_1$ and $t_2$, and forming their difference, give
$$\Delta t = t_1 - t_2 = \frac{4\pi\omega r^2\sin^2\vartheta}{c^2 - r^2\omega^2\sin^2\vartheta}.$$
The accompanying phase shift is
$$\Delta\varphi = \omega_\lambda\,\Delta t = \frac{4\pi r^2\omega\omega_\lambda\sin^2\vartheta}{c^2 - r^2\omega^2\sin^2\vartheta},$$
which we shall now show is
$$\Delta\varphi = 4\pi m\omega_\lambda/c^2. \tag{9.3.1}$$
Consider the line element that is generalized by the stereographic inner product,
$$ds^2 = \frac{d\vartheta^2 + \sin^2\vartheta\,d\varphi^2}{(1 - \kappa^2\sin^2\vartheta)^2},$$
where $\kappa$ is still arbitrary, but which we know to be related to the absolute constant. Because $\varphi$ is cyclic we immediately have a first integral,
$$\frac{\sin^2\vartheta\,\varphi'}{(1 - \kappa^2\sin^2\vartheta)\sqrt{1 + \sin^2\vartheta\,\varphi'^2}} = \frac{m}{\ell} = \sin\vartheta_0,$$
where the prime stands for differentiation with respect to $\vartheta$, and $\vartheta_0$ is the minimum value of the colatitude. The magnetic quantum number, $m$, represents the projection of the angular momentum onto the vertical axis. Rearranging we get
$$\frac{d\varphi}{d\vartheta} = \frac{\dot\varphi}{\dot\vartheta} = \pm\frac{m\,(1 - \kappa^2\sin^2\vartheta)}{\sin^2\vartheta\sqrt{\ell^2 - (m^2/\sin^2\vartheta)(1 - \kappa^2\sin^2\vartheta)^2}}.$$
For $\kappa = 0$ this reduces to the well-known equations of classical mechanics,
$$m = r^2\sin^2\vartheta\,\dot\varphi, \qquad r^2\dot\vartheta = \sqrt{\ell^2 - \frac{m^2}{\sin^2\vartheta}}. \tag{9.3.2}$$
Since $m$, as well as $\ell$, is conserved, (9.3.2) can be immediately integrated to give:
$$\int\frac{r^2\,d\vartheta}{\sqrt{\ell^2 - m^2/\sin^2\vartheta}} = \frac{r^2}{\ell}\cos^{-1}\left(\frac{\cos\vartheta}{\cos\vartheta_0}\right) = t - t_0. \tag{9.3.3}$$
This represents the angular distance of a moving point on a great circle from the line of nodes, measured on the orbital plane, that occurs in time $(t - t_0)$, where $t_0$ has appeared as an arbitrary constant of integration. Inversion gives
$$\cos\vartheta = \cos\vartheta_0\cdot\cos\left[\frac{\ell(t - t_0)}{r^2}\right], \tag{9.3.4}$$
showing that the moving particle will complete a great circle in period $r^2/\ell$. We now compare this with their generalizations in which $m$ is not conserved.
Their generalization for $\kappa \neq 0$ must be
$$m = \frac{r^2\sin^2\vartheta\,\dot\varphi}{1 - \kappa^2\sin^2\vartheta}, \tag{9.3.5}$$
$$r^2\dot\vartheta = \pm\sqrt{\ell^2 - \frac{m^2}{\sin^2\vartheta}(1 - \kappa^2\sin^2\vartheta)^2}. \tag{9.3.6}$$
If $m$ in (9.3.5) is to be the same as in (9.3.1), it requires setting $\kappa = r\omega/c$, and $\dot\varphi = \omega$, so that the angular momentum in the polar direction will not be conserved. Introducing (9.3.5) into (9.3.6) results in
$$r^2\dot\vartheta = \pm\sqrt{\ell^2 - \ell_0^2\sin^2\vartheta}, \tag{9.3.7}$$
which has a different form than (9.3.2). In fact, it will lead to another periodic function whose period depends on the ratio of the conserved, $\ell_0 = r^2\omega$, to the non-conserved, $\ell$, angular momentum. Calling that ratio $\lambda$, and integrating (9.3.7), now lead to
$$\int_0^{\vartheta}\frac{d\vartheta}{\sqrt{1 - \lambda^2\sin^2\vartheta}} = \int_0^{z}\frac{dz}{\sqrt{(1 - z^2)(1 - \lambda^2z^2)}} = \frac{\ell}{r^2}(t - t_0). \tag{9.3.8}$$
Like the integral for arcsine, inverting (9.3.8) gives
$$z = \sin\vartheta = \operatorname{sn}\left[\frac{\ell(t - t_0)}{r^2}\right], \tag{9.3.9}$$
where sn is an elliptic function, coming from the elliptic integral of the first kind, (9.3.8). For $\lambda = \ell_0/\ell < 1$, the period is real and finite. The period $T$ satisfies $\ell T/r^2 = 2K$, where $K$ is the complete elliptic integral of the first kind. It decreases with increasing values of the angular momentum. The particle on the great circle makes complete revolutions; the motion is a libration. Observe that $\dot\vartheta$ never changes sign. Its square is analogous to the kinetic energy of the particle so that the particle can never come to rest, as in the case of a plane pendulum whose total energy is greater than the potential energy. And like the plane pendulum, the period is not independent of the amplitude so that the motion is not isochronous.
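How the period varies with the angular momentum can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the author's procedure: it evaluates the complete elliptic integral $K(\lambda)$ by the standard arithmetic-geometric-mean formula $K = \pi/[2\,\mathrm{AGM}(1, \sqrt{1 - \lambda^2})]$, cross-checks it against a midpoint quadrature of the integrand in (9.3.8), and then uses the relation $\ell T/r^2 = 2K$ quoted above. The function names and sample values are hypothetical.

```python
import math

def ellipk_agm(lam):
    """Complete elliptic integral of the first kind K(lam), modulus 0 <= lam < 1,
    via the arithmetic-geometric mean: K = pi / (2 * AGM(1, sqrt(1 - lam^2)))."""
    a, b = 1.0, math.sqrt(1.0 - lam * lam)
    for _ in range(40):  # the AGM converges quadratically; 40 steps is ample
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def ellipk_quadrature(lam, n=20000):
    """Midpoint-rule value of the first integral in (9.3.8) taken from 0 to pi/2."""
    h = (math.pi / 2.0) / n
    return h * sum(1.0 / math.sqrt(1.0 - (lam * math.sin((i + 0.5) * h)) ** 2)
                   for i in range(n))

def period(ell, ell0, r):
    """Period T from ell*T/r^2 = 2*K(lam), with lam = ell0/ell < 1."""
    lam = ell0 / ell
    if not 0.0 <= lam < 1.0:
        raise ValueError("need 0 <= ell0/ell < 1 for a real, finite period")
    return 2.0 * ellipk_agm(lam) * r * r / ell

if __name__ == "__main__":
    # Illustrative values: fixed ell0 = r = 1, increasing ell.
    for ell in (1.5, 2.0, 4.0, 8.0):
        print(f"ell = {ell:4.1f}   T = {period(ell, 1.0, 1.0):.6f}")
```

With $\ell_0$ and $r$ held fixed, the printed periods shrink as $\ell$ grows, both because $K(\lambda)$ decreases toward $\pi/2$ and because of the explicit factor $1/\ell$.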
9.4 The Principle of Equivalence
The principle of equivalence asserts that, in some sense, a field of acceleration is equivalent to a gravitational field [Fock 66]. That is to say by transforming to an accelerated frame of reference the gravitational field can be made to disappear. As a consequence, a gravitational field can be mimicked by a field of acceleration, and both can be made to disappear by a transformation to a local inertial frame where material particles behave as if they were ‘free’ of gravitational or centrifugal forces. The principle of equivalence has become part of folklore. Gamow [62] tells us that any departure from uniform motion, like a moving car hitting a railing, “will be painfully noticeable.” To explain what is happening, Einstein is said to have devised a gedanken experiment in which he is found in a rocket ship, as shown in Fig. 9.3, far away from any masses which would influence the outcome of any experiment he may perform.
Fig. 9.3. Gamow’s [62] depiction of Einstein’s gedanken experiment showing the equivalence between acceleration and gravity.
Before the rockets are fired, all unattached objects would float freely. However, once the rockets have been turned on, everything that is unattached will suddenly be slammed against the side where the rockets are operating. A gravitational field has been produced. An accelerated rocket ship has remarkably created a gravitational field, and if we can tune the rockets to the same acceleration experienced on Earth, the inhabitants of the spacecraft would believe that they are still on Earth. Now Einstein performs an experiment in which he releases two balls of different materials. While the balls are in his hand, they are accelerating along with him and the spacecraft. But once released there will be no force upon them, and they will move at the constant velocity they had when they were released. In other words, they will be in a state of uniform motion. But the spaceship is still accelerating, so that at some point the accelerating floor will come crashing into them simultaneously. However, to Einstein it will appear that the balls are falling under the influence of gravity and hit the floor at the same time. So an accelerating reference frame can mimic a gravitational field, or even annul one. As Einstein concludes: The implementation of the general theory of relativity [for velocity and for acceleration] leads directly to a theory of gravitation; because we can ‘produce’ a gravitational field by a mere change in the coordinate systems.
In fact, it has been claimed [Stachel 89] that the seeming equivalence between a field of gravity and a non-inertial frame of motion, like a rotating disc, was behind Einstein’s search for a geometrical theory of gravitation. Drawing on the putative analogy between the properties of measuring rods and clocks on a rotating disc with gravity, Einstein [20] writes In the general theory of relativity space and time cannot be defined in such a way that differences in the spatial coordinates can be directly measured by the unit measuring rod, or differences in the time coordinate by a standard clock.
Einstein was troubled with the relationship between measuring rods and clocks, on the one hand, and the appropriate coordinates to describe accelerating frames, on the other hand. Although he realized that the observer located at the center of a rotating disc will admit that the outer edge of the disc requires a non-Euclidean geometry, he did not indicate which non-Euclidean geometry is required [Gray 07]. What we will do here is to show
that a uniformly rotating disc can be described by the Beltrami metric of hyperbolic geometry. Einstein [89] considered the analogy with a rotating disc in two papers published in 1912. The first is an attempt to deal with a stationary, uniform, gravitational field by allowing the speed of light to vary. So he was led to consider an emission theory where a source emits light traveling at a velocity c, but to a stationary observer who views the source traveling at a relative velocity u, the speed of light will appear to be c + u. As we have seen in Sec. 3.2, emission theory is used to analyze the Michelson–Morley interferometer experiment, but is at odds with the Fresnel dragging coefficient in Fizeau’s experiment, which requires the relativistic addition of velocities. Oddly enough this was after he made his ‘agreement’ to disagree with Ritz, which we discussed in Sec. 4.2.2. Einstein then went on to consider spatially inhomogeneous, but again time-independent, gravitational fields. Here, he drew on the analogy with a rotating disc for which he had to downgrade his equivalence principle to infinitesimal regions, or what mathematicians would call tangent spaces. To Einstein gravity acts as a non-uniform force, affecting all bodies equally, but varying from point to point. In his own words: Let us consider a space-time domain in which no gravitational field exists relative to a reference body K whose state of motion has been suitably chosen. . . Let us suppose the same domain referred to a second body of reference K′, which is rotating uniformly with respect to K. In order to fix our ideas, we shall imagine K′ to be in the form of a plane circular disc, which rotates uniformly in its own plane about its center.
An observer who is sitting eccentrically on the disc K′ is sensible of a force which acts outwards in a radial direction, and which would be interpreted as an effect of inertia (centrifugal force) by an observer who was at rest with respect to the original reference body K. But, the observer on the disc may regard his disc as a reference body which is ‘at rest’; on the basis of the general principle of relativity he is justified in doing this. The force acting on himself, and in fact on all other bodies which are at rest relative to the disc, he regards as the effect of a gravitational field. Nevertheless, the space distribution of this gravitational field is of a kind that would not be possible in Newton’s theory of gravitation. (The field disappears at the center of the disc and increases proportionally to the distance from the center as we proceed outwards.) But since the observer believes in the general theory of relativity, this does not disturb him, he is quite in the right when he believes that a general law of gravitation can be formulated by a law which not only explains the motion of the stars correctly, but also the field of force experienced by himself.
So what Einstein is saying is that if we can solve the uniformly rotating disc, we have solved the problem of an inhomogeneous gravitational field. The
confusion that existed in Einstein’s mind is exemplified by the following passage: . . . at this stage the definition of space coordinates also represents insurmountable difficulties. If the observer applies his standard measuring rod tangentially to the edge of the disc, then, as judged from the Galilean system, the length of this rod will be less than 1, since, moving bodies suffer a shortening in the direction of motion. On the other hand, the measuring rod will not experience a shortening in length, as judged from K, if it is applied to the disc in the direction of the radius. If, then, the observer first measures the circumference of the disc with his measuring rod and then the diameter of the disc, on dividing one by the other, he will not obtain as quotient the familiar number π = 3.14 . . ., but a larger number, whereas, of course, for a disc at rest with respect to K, this operation would yield π exactly. This proves that the propositions of Euclidean geometry cannot hold exactly on a rotating disc, nor in general in a gravitational field, at least if we attribute the length 1 to the rod in all positions and in every orientation. Hence the idea of a straight line also loses meaning. We are therefore not in a position to define exactly the coordinates x, y, z relative to the disc by means of the method used in discussing the special theory, and as long as the coordinates and times of events have not been defined, we cannot assign an exact meaning to the natural laws in which these occur.
Granted geodesics are no longer straight lines, but this does not mean that the uniformly rotating disc does not possess a well-defined metric. The distinction between the uniformly rotating disc and gravity will come not from the definition of the metric, but from the non-constancy of the curvature of the metric, as we will appreciate in Sec. 9.6. Rather, Poincaré offers a concrete model in his unevenly heated disc that we discussed in Sec. 2.1.1. Length, as it would appear to us outside of the disc, will become distorted, and how much it will be distorted will depend on the radius $R$ of the disc, which is the absolute constant. Distance, Poincaré now defines as $d\bar r = dr/(1 - r^2/R^2)$, which for a finite tract is $\bar r = R\tanh^{-1}(r/R)$. Einstein [20] had to be familiar with Poincaré’s ideas about space because he uses the unevenly heated slab as an illustration in his presentation of his general theory as he did about time measurements in the special theory by bouncing off light signals between observers in different inertial frames. Einstein now considers a grid of little squares on a marble slab as constituting a “Euclidean continuum with respect to a little rod, which has been used as a ‘distance’ (line-interval).” He then considers the thermal deformation of the rod: We shall suppose that the rods ‘expand’ by an amount proportional to the increase in temperature. We heat the central part of the marble slab, but not the periphery, in
which two of our little rods can still be brought into coincidence at every position on the table. But, our construction of squares must necessarily come into disorder during the heating, because the little rods on the central region of the table expand, whereas those on the outer part do not. With reference to our little rods — defined as unit lengths — the marble slab is no longer a Euclidean continuum, and we are also no longer in the position of defining Cartesian coordinates directly with their aid, since the above construction can no longer be carried out. . . The method of Cartesian coordinates must then be discarded, and replaced by another which does not assume the validity of Euclidean geometry for rigid bodies. The reader will notice that the situation depicted here corresponds to the one brought out by the general postulate of relativity.
If Euclidean geometry is to be discarded, then what must take its place? Einstein goes on to tell us: Gauss indicated the principles according to which we can treat the geometrical relationships in the surface, and thus pointed out the way to the method of Riemann of treating multi-dimensional, non-Euclidean continua. Thus, it is that the mathematicians long ago solved the formal problems to which we are led by the general postulate of relativity.
But, we know Gauss never published anything on non-Euclidean geometry, apart from the occasional correspondence. So it was not Gauss who pointed out the way to Einstein through Riemannian geometry. In other words, Einstein finds the necessity of creating a different edifice than the one which has already been constructed. He chose not to avail himself of the existing non-Euclidean geometries of constant curvature, but chose a path that would appear to be a generalization of his special theory of relativity of space-time, and referred to it as ‘the general principle of relativity.’ Einstein’s assumption that there are no privileged systems of coordinates, which seemingly appears as a generalization of the covariant form of his special relativity, is a red herring. The principle of relativity, which makes equivalent the observer and what he is observing, relies on inertial reference frames. In the presence of gravitation these frames simply do not exist. The inability to distinguish between cause and effect, or the reciprocity of phenomena in electrodynamics, like the observation during the relative motion of a magnet with respect to a conducting circuit, an electric current is induced in the latter. It is all the same whether the magnet is moved or the conductor; only the relative motion counts.
The observation was made by the sixteen year-old Einstein, but has no place in gravitation. In the words of Fock [66] “in the ‘General Theory of Relativity’ there is less relativity and not more than in the ‘special theory’.”
No such dilemma occurs if we can show that the metric of a uniformly rotating disc corresponds exactly to the Beltrami metric. Whether or not it applies to gravity is another matter. As a result of the deformations caused by an accelerated frame of reference, the principle of equivalence had to undergo qualifications and restrictions. First, and foremost, the equivalence was a local one, at a single point in space [Fock 66]. Second, the deformations on the body and on the measuring sticks used in measurement must be small enough so that the notion of a rigid body retains meaning [Møller 52]. Thus, an equivalence between an accelerated frame of reference, caused by a rigid uniformly rotating system and a gravitational field, should hold if the motion of the former is slow enough and the field created by the latter is weak enough. Deformations, whether large or small, cause deviations from Euclidean geometry. If the measuring rods are contracted on a rotating disc in the direction of rotation, we cannot expect the Pythagorean theorem to hold in its Euclidean form for any inscribed right triangle. Thus, what might be locally Euclidean may very well deviate to other geometries when stresses are present. Not long after the advent of special relativity Robb [11] noticed that the Euclidean triangle of velocities must be replaced by a Lobachevsky triangle for large velocities. Distortions which require a geometry different from Euclidean geometry also produce optical effects [Robb 11]. When light encounters a change in density of the medium, the rays bend in such a way that they minimize their propagation time between ray end-points. According to Huygens’s [62] principle, objects are not where they appear to be, but, are slightly displaced due to the curvature of the rays. In an analogous way that a nonconstant index of refraction relates the Euclidean distance to the optical path length, a metric density relates the Euclidean distance to the hyperbolic distance. 
The hyperbolic geometry that describes a uniformly rotating disc also describes the bending of light by a massive body when the transition is made from a constant to a non-constant surface of negative curvature. The transition occurs by relating the absolute constant of the hyperbolic geometry to a free-fall time and then transferring from a system of constant density to one of constant mass. A uniformly rotating disc has constant density, while, in the deflection of light, the mass is constant. In free-falling
frames, the laws of physics are locally the same as in inertial frames, so that this transformation does not introduce anything that would have an effect on non-inertial, or gravitational, forces. We will thus come to appreciate that not all forms of accelerative motion are equivalent.
9.5 Fermat’s Principle of Least Time and Hyperbolic Geometry
As we know from Secs. 2.2.3 and 7.2.2, Fermat’s principle of least time states that light propagates between any two points in such a way as to minimize its travel time. Fermat knew that light travels more slowly in denser materials, but he did not know whether light travels at a finite speed or infinitely fast. The index of refraction, $\eta$, takes into account the inhomogeneities through which light propagates. Over scales in which the Earth appears as a flat surface $y = 0$, $\eta$ is a function only of the height $y$. The optical path length, $I$, or the product of the propagation time and the velocity of light connecting two points, $(x_1, y_1)$ and $(x_2, y_2)$, in a plane extending above and normal to the surface, is
$$I = \int_{x_1}^{x_2}\eta(y)\sqrt{1 + y'^2}\,dx, \tag{9.5.1}$$
where the prime stands for differentiation with respect to $x$. Moreover, if we assume that the index of refraction decreases with height by making it inversely proportional to its distance $y$ from the $x$-axis, Fermat’s principle of least time, (9.5.1), becomes the Poincaré upper half-plane model of hyperbolic geometry. Poincarites from the heated plane model of Sec. 2.1.1 find that their rulers shrink as they do when approaching the $x$-axis, so that the boundary appears infinitely far away. The geodesics look much different than straight lines connecting two points in Euclidean geometry. The geodesics can be found from the condition that Fermat’s principle (9.5.1) be an extremum. With $\eta(y) = 1/y$, the Euler equation for the extremality of (9.5.1) is
$$\frac{d}{dx}\frac{\partial\Lambda}{\partial y'} = \frac{\partial\Lambda}{\partial y}, \tag{9.5.2}$$
where $\Phi = \sqrt{1 + y'^2}/y$ is the integrand of (9.5.1). The solution to the resulting differential equation, $y'' + (1 + y'^2)/y = 0$, is the family of circles, $(x - a)^2 + y^2 = b^2$, where $a$ and $b$ are two constants of integration. These circles are centered on the $x$-axis, and since we are considering only the upper half-plane, $y > 0$, the half-circumferences will be the geodesics of our space. As $x_1 \to x_2$ the geodesics straighten out into lines parallel to the $y$-axis. Employing polar coordinates, with the prime now denoting differentiation with respect to $\theta$, the hyperbolic length of a curve $\gamma$ between $(r_1, \theta_1)$ and $(r_2, \theta_2)$ cannot be less than [cf. (2.3.8)]

$$h(\gamma) = \kappa\int_\gamma \frac{\sqrt{dx^2 + dy^2}}{y} = \kappa\int_{\theta_1}^{\theta_2} \frac{\sqrt{r'^2 + r^2}}{r\sin\theta}\,d\theta \ge \kappa\int_{\theta_1}^{\theta_2}\frac{d\theta}{\sin\theta} = \kappa\ln\left(\frac{\csc\theta_2 - \cot\theta_2}{\csc\theta_1 - \cot\theta_1}\right),$$

where $\kappa$, the radius of curvature, is the absolute constant of the hyperbolic geometry. Different hyperbolic geometries with different values of $\kappa$ are not congruent [Busemann and Kelly 53]. In the limit as $\theta_2 \to \pi/2$, the angle $\theta_1$ becomes the angle of parallelism, $\Pi$,

$$\bar\gamma = \kappa\ln\cot\left[\Pi(\bar\gamma)/2\right]. \tag{9.5.3}$$
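As a numerical check on (9.5.3), the following Python sketch compares the closed form of the lower bound with a direct quadrature of $\kappa\int d\theta/\sin\theta$, and then inverts the formula to recover the angle of parallelism. The function names, the midpoint rule, and the sample values of $\theta_1$ and $\kappa$ are our own illustrative assumptions; they are not part of the derivation.

```python
import math

def hyperbolic_arc_lower_bound(theta1, theta2, kappa=1.0):
    """Closed form kappa * ln[(csc t2 - cot t2) / (csc t1 - cot t1)]."""
    def csc_minus_cot(t):
        return 1.0 / math.sin(t) - 1.0 / math.tan(t)
    return kappa * math.log(csc_minus_cot(theta2) / csc_minus_cot(theta1))

def csc_integral(theta1, theta2, kappa=1.0, n=20000):
    """Midpoint-rule estimate of kappa * integral of d(theta)/sin(theta)."""
    step = (theta2 - theta1) / n
    total = sum(1.0 / math.sin(theta1 + (i + 0.5) * step) for i in range(n))
    return kappa * step * total

def angle_of_parallelism(gamma_bar, kappa=1.0):
    """Invert gamma_bar = kappa * ln cot(Pi/2), i.e. Pi = 2 arctan(exp(-gamma_bar/kappa))."""
    return 2.0 * math.atan(math.exp(-gamma_bar / kappa))

if __name__ == "__main__":
    theta1, kappa = 0.4, 1.5          # illustrative values only
    gamma_bar = hyperbolic_arc_lower_bound(theta1, math.pi / 2, kappa)
    print("closed form :", gamma_bar)
    print("quadrature  :", csc_integral(theta1, math.pi / 2, kappa))
    print("recovered Pi:", angle_of_parallelism(gamma_bar, kappa))
```

Setting $\theta_2 = \pi/2$ makes the upper limit contribute $\ln 1 = 0$, so the recovered angle should coincide with the $\theta_1$ that was put in.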
This is still another way of deriving the Bolyai–Lobachevsky formula, which expresses the angle of parallelism, $\Pi$, as a sole function of the hyperbolic arc length $\bar\gamma$, which is shown in Fig. 9.4 to be the shortest distance connecting the bounding parallels 1 and 2. The angle of parallelism enters in the analysis of the Terrell–Weinstein effect, which relates the FitzGerald–Lorentz contraction to a rotation [cf. Sec. 9.9]. We have already remarked in Sec. 2.1.1 that Poincaré originally conceived of an unevenly heated slab where the $x$-axis is infinitely cold. As the Poincarites approach the $x$-axis, the drop in temperature causes them and their rulers to contract in exactly the same proportion. We also know from
Fig. 9.4. The angle of parallelism between two bounding parallels connected by the geodesic curve γ.
Sec. 2.5 that Poincaré also considered a disc model of hyperbolic geometry. If the Poincarites living in the half-plane and in the disc could communicate with one another, there would be nothing that would allow them to distinguish between these two worlds. We can think of the Poincaré sphere, in three dimensions, as possessing an index of refraction that varies from the center of the sphere to its surface in inverse proportion to $(R^2 - r^2)$, where $r$ is the Euclidean distance from the sphere’s center and $R$ is the radius of the sphere. From the foregoing quote, we know that Einstein was familiar with this model; just when, we do not know. If $R$ happens to be a star’s radius, the temperature at its surface would be infinitely cold, just like the $x$-axis in the upper half-plane model. By rescaling to a unit radius, the Poincaré unit disc model has a hyperbolic length of a curve $\gamma$ given by

$$\bar\gamma = \kappa\int_\gamma \frac{\sqrt{dx^2 + dy^2}}{1 - x^2 - y^2}, \tag{9.5.4}$$

where the ‘stereographic’ inner product of the hyperbolic plane, $1 - x^2 - y^2$, plays the same role as the inverse of the index of refraction in (9.5.1). The hyperbolic length (9.5.4) is the metric that gives the interior of the unit disc its hyperbolic structure. It can be derived by mapping the entire half-plane into the unit disc by means of an inversion [Needham 97], but that will not concern us here since it involves entering the complex plane [cf. Sec. 2.4]. The absolute constant of hyperbolic geometry determines whether we are in configuration or velocity space, and we will have the occasion to
switch back and forth. The radii of the Poincaré discs will set the limitations imposed by relativity: whereas in velocity space the disc will have radius $c$, the radius will be $c/\omega$ in the configuration space of a uniformly rotating disc, where $\omega$ is the constant angular speed of rotation. The curvature of the space is negative and constant [cf. (9.6.17) below]. The transition to non-constant curvature consists in replacing the angular velocity by the free-fall frequency, thereby transferring a system at constant density to one of constant mass. From this we conclude that it is either configuration space or velocity space which determines the metrical properties, and not space-time. Time enters in the magnification of the Beltrami coordinates in velocity space. To the Poincarites living in the $\kappa$-disc their world would appear infinite because their rulers shrink along with them as they approach the rim at radius $\kappa$. This can be seen by introducing the hyperbolic polar coordinates, $x = \kappa\tanh(r/\kappa)\cos\vartheta$ and $y = \kappa\tanh(r/\kappa)\sin\vartheta$, so that the hyperbolic metric (9.5.4) becomes

$$d\bar\gamma^2 = dr^2 + \kappa^2\sinh^2(r/\kappa)\,d\vartheta^2. \tag{9.5.5}$$
In this polar geodesic parametrization, $E = 1$ and $G > 0$, where $\sqrt{G}$ is the measure at which the radial geodesics are spreading out from the origin [O’Neill 66]. Because $\sinh x > x$ for all $x > 0$, the rate at which the geodesics spread out, $\kappa\sinh(r/\kappa)$, will be greater in the hyperbolic plane than in the Euclidean plane, where the rate of spreading is $r$, which is what the hyperbolic rate tends to in the limit as $\kappa \to \infty$. Consequently, distances become larger, or equivalently, measuring sticks shrink as the rim is approached. This shrinkage causes the geodesics to bend in such a way that they cut the rim orthogonally. The Euclidean parallel postulate, that if a point is not on a given line then there is a unique line through this point that does not meet that line, is invalidated in the hyperbolic plane. In fact, there are an infinite number of geodesics that pass through any given point that do not meet another geodesic. The geodesics still appear as straight lines to the Poincarites, whereas, to us Euclideans, they appear to be bent, and things vary in size depending
on where we look. We also see things like the bending of light and the shifting of frequencies in a gravitational field.
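To make the shrinking of the rulers concrete, the short Python sketch below tabulates, for a few hyperbolic radii, the Euclidean distance from the center, $\kappa\tanh(r/\kappa)$, and the spreading rate of the radial geodesics, $\kappa\sinh(r/\kappa)$, taken from (9.5.5). The function names and the sample radii are illustrative assumptions of ours.

```python
import math

def euclidean_radius(r_hyp, kappa):
    """Euclidean distance from the center for hyperbolic radius r_hyp: kappa * tanh(r_hyp / kappa)."""
    return kappa * math.tanh(r_hyp / kappa)

def spreading_rate(r_hyp, kappa):
    """Square root of G in (9.5.5): kappa * sinh(r_hyp / kappa)."""
    return kappa * math.sinh(r_hyp / kappa)

if __name__ == "__main__":
    kappa = 1.0  # illustrative value of the absolute constant
    print("r_hyp  euclidean_radius  spreading_rate")
    for r_hyp in (0.5, 1.0, 2.0, 5.0):
        print(r_hyp, euclidean_radius(r_hyp, kappa), spreading_rate(r_hyp, kappa))
```

The Euclidean radius stays below $\kappa$ however far the Poincarites walk, while the spreading rate exceeds its Euclidean value $r$ and reduces to it only as $\kappa \to \infty$.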
9.6 The Rotating Disc
Consider a rotating $\kappa$-disc, where $\kappa$ is the relativistic limit that is placed on the radius vector. At the center of the disc we have an inertial system which is described by Euclidean geometry. A clock located anywhere else on the disc will have a velocity $r\omega$ relative to the inertial system, and consensus has it that it will be retarded by the amount

$$\tau = t\sqrt{1 - \frac{r^2\omega^2}{c^2}}. \tag{9.6.1}$$

This fixes the absolute constant, or the disc radius, at $\kappa = c/\omega$. Now, it is argued [Møller 52] that any rod in motion should undergo a FitzGerald–Lorentz contraction. This means that a measuring rod connecting any two points on the disc that are at a distance $r$ from the center, say, $(r, \vartheta)$ and $(r, \vartheta + d\vartheta)$, is shortened with respect to the length of the rod in the inertial frame, $dr_0$, by an amount

$$r\,d\vartheta = dr_0\sqrt{1 - r^2/\kappa^2}.$$

From our earlier discussion, we expect the geodesics to be either bow-shaped or straight lines if they pass through the origin. We will now give a geometric explanation of why successive Doppler shifts occur with rotations. In order to do so, we must determine the ratio of the hyperbolic to Euclidean lengths. Consider two variable points, $u$ and $v$, on the interval $(x_1, x_2)$. The hyperbolic distance between $u$ and $v$ is given in terms of the cross-ratio, defined in Sec. 2.2.4,

$$h(u,v) = \frac{\kappa}{2}\ln\left[\frac{e(x_1,u)}{e(x_1,v)}\cdot\frac{e(x_2,v)}{e(x_2,u)}\right] = \frac{\kappa}{2}\ln\left[1 + \frac{e(u,v)}{e(v,x_1)}\right] + \frac{\kappa}{2}\ln\left[1 + \frac{e(u,v)}{e(u,x_2)}\right],$$

where $e(u,v)$ is the Euclidean distance between $u$ and $v$, and $e(x_1,x_2) = e(x_1,v) + e(v,x_2)$. Since we will let $u$ and $v$ tend to a common limit, $p$, we can
expand the logarithms in series and retain only the lowest order to obtain, in the limit, the metric density [Busemann and Kelly 53],

$$\lim_{u,v\to p}\frac{h(u,v)}{e(u,v)} = \frac{\kappa}{2}\left[\frac{1}{e(p,x_1)} + \frac{1}{e(p,x_2)}\right] =: \varrho(p), \tag{9.6.2}$$

which, apart from the factor $\kappa$, is the inverse of the harmonic mean of the two distances. In respect to Fermat’s principle of least time, (9.5.1), the metric density (9.6.2) can be associated with a non-constant index of refraction, for it converts the Euclidean distance, $de$, into a hyperbolic distance, $dh$. Just as the index of refraction varies with height, causing light to arch its path upwards in order to minimize its propagation time between given endpoints, acceleration, in general, creates distortion, causing objects to vary in size and not be where they seem to be [Huygens 62]. In order to obtain an explicit expression for the metric density, (9.6.2), we use two elementary facts about circles, applied to a disc scaled to unit radius: (i) all chords passing through an interior fixed point are divided into two parts whose lengths have a constant product, $e(p,x_1)\,e(p,x_2) = (1 + r/\kappa)(1 - r/\kappa) = 1 - r^2/\kappa^2$, and (ii) the length of a chord is twice the square root of the difference between the squares of the radius and of the perpendicular distance from the center to the chord,

$$e(p,x_1) + e(p,x_2) = e(x_1,x_2) = 2\sqrt{1 - \frac{r^2}{\kappa^2}\sin^2\phi}.$$

Combining these two geometrical facts gives

$$\varrho(u_1,u_2) = \kappa\,\frac{\sqrt{1 - (r^2/\kappa^2)\sin^2\phi}}{1 - r^2/\kappa^2}, \tag{9.6.3}$$
where $\phi$ is the angle formed by the intersection of lines 1 and 2 in Fig. 9.5. The lines intersect at a point $p$ in the $\kappa$-disc. The polar coordinates are

$$u_1 = (r/\kappa)\cos\vartheta, \qquad u_2 = (r/\kappa)\sin\vartheta, \tag{9.6.4}$$

in either velocity or configuration space, where the radius of curvature $\kappa$ has the values $c$ and $c/\omega$, respectively.
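The two chord facts behind the metric density can be checked numerically. The Python sketch below works in a disc scaled to unit radius, places $p$ at the scaled radius `rho` $= r/\kappa$, and takes $\phi$ to be the angle between the chord and the radius through $p$; these conventions, like the function names, are our own assumptions for the illustration. It compares the definition (9.6.2) with the closed form (9.6.3), and also evaluates the cross-ratio quotient $h(u,v)/e(u,v)$ for two nearby points.

```python
import math

def chord_segments(rho, phi):
    """Distances from p to the two ends of the unit-disc chord through p.

    p sits at scaled radius rho = r/kappa; the chord makes angle phi with the radius.
    """
    root = math.sqrt(1.0 - (rho * math.sin(phi)) ** 2)
    return root + rho * math.cos(phi), root - rho * math.cos(phi)

def metric_density_from_chord(rho, phi, kappa=1.0):
    """Definition (9.6.2): (kappa/2) * (1/e(p,x1) + 1/e(p,x2))."""
    e1, e2 = chord_segments(rho, phi)
    return 0.5 * kappa * (1.0 / e1 + 1.0 / e2)

def metric_density_closed_form(rho, phi, kappa=1.0):
    """Closed form (9.6.3)."""
    return kappa * math.sqrt(1.0 - (rho * math.sin(phi)) ** 2) / (1.0 - rho ** 2)

def cross_ratio_quotient(rho, phi, delta=1e-4, kappa=1.0):
    """h(u,v)/e(u,v) for u, v a distance delta on either side of p along the chord."""
    e1, e2 = chord_segments(rho, phi)
    h = 0.5 * kappa * math.log((e1 + delta) / (e1 - delta) * (e2 + delta) / (e2 - delta))
    return h / (2.0 * delta)

if __name__ == "__main__":
    for rho, phi in ((0.3, 0.2), (0.6, 0.7), (0.9, 1.5)):
        print(rho, phi,
              metric_density_from_chord(rho, phi),
              metric_density_closed_form(rho, phi),
              cross_ratio_quotient(rho, phi))
```

The three columns should agree, the last one only up to terms of order `delta` squared.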
Fig. 9.5. Geometric characterization of the metric density.
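Whereas lengths are relative in Euclidean geometry, and angles are absolute, the relation between lengths and angles in hyperbolic geometry makes lengths, as well as angles, absolute. Denoting by $\chi$ the angle of inclination of the tangent line 1, we have

$$\tan\chi = \frac{du_2}{du_1} = \frac{r'\sin\vartheta + r\cos\vartheta}{r'\cos\vartheta - r\sin\vartheta}, \tag{9.6.5}$$

where $r' = dr/d\vartheta$. Moreover, since $\chi = \vartheta + \pi - \phi$, we get

$$\tan\chi = \frac{\tan\vartheta - \tan\phi}{1 + \tan\vartheta\tan\phi}. \tag{9.6.6}$$

Equating the two expressions (9.6.5) and (9.6.6), and writing $\vartheta' = d\vartheta/dr = 1/r'$, we find

$$\tan\phi = -r/r' = -r\vartheta', \tag{9.6.7a}$$

$$-\sin\phi = \frac{u_1\,du_2 - u_2\,du_1}{r\sqrt{du_1^2 + du_2^2}} = \frac{r\vartheta'}{\sqrt{1 + r^2\vartheta'^2}}, \tag{9.6.7b}$$

$$\cos\phi = \frac{u_1\,du_1 + u_2\,du_2}{r\sqrt{du_1^2 + du_2^2}} = \frac{1}{\sqrt{1 + r^2\vartheta'^2}}. \tag{9.6.7c}$$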
Consequently,$^{b}$

$$dh^2(u_1,u_2) = \kappa^2\,\frac{du_1^2 + du_2^2 - (u_1\,du_2 - u_2\,du_1)^2}{(1 - u_1^2 - u_2^2)^2} = \frac{dr^2 + r^2\,d\vartheta^2\,(1 - r^2/\kappa^2)}{(1 - r^2/\kappa^2)^2} = E\,dr^2 + G\,d\vartheta^2, \tag{9.6.8}$$

is the square of the hyperbolic line element expressed in terms of $u_1$ and $u_2$, and polar coordinates, $r$ and $\vartheta$. The metric coefficients in the fundamental form are

$$E = \frac{1}{(1 - r^2/\kappa^2)^2}, \tag{9.6.9a}$$

$$G = \frac{r^2}{1 - r^2/\kappa^2}. \tag{9.6.9b}$$
The coefficients of the fundamental form determine the equation of the trajectory by requiring that the integrand in Fermat’s principle,

$$\Phi = \sqrt{E + G\vartheta'^2},$$

be an extremum. Since the fundamental coefficients will, in general, not contain the variable $\vartheta$, it will be a cyclic coordinate, meaning that there exists a first integral of the motion,

$$\frac{\partial\Phi}{\partial\vartheta'} = \frac{G\vartheta'}{\sqrt{E + G\vartheta'^2}} = \bar\ell = \text{const.},$$

where $\bar\ell = \ell/c$ is, again, the collision parameter, or distance of closest approach [cf. Eq. (9.1.21)]. Solving for $\vartheta'$ gives the equation of the trajectory

$$\vartheta' = \pm\frac{\bar\ell\,\sqrt{E}}{\sqrt{G}\,\sqrt{G - \bar\ell^2}}. \tag{9.6.10}$$

$^{b}$ General relativity proposes a metric [Møller 52, Eq. (7) on p. 224]

$$dh^2 = dr^2 + \frac{r^2\,d\vartheta^2}{1 - r^2\omega^2/c^2},$$

where the first term, at $\vartheta = \text{const.}$, does not integrate to give the hyperbolic measure of distance, but, rather, gives its Euclidean measure, $r$.
Equation (9.6.10) is also known as the equation for the geodesic in the Clairaut parametrization [O’Neill 66]. The first two terms in the numerator of (9.6.8) are twice the Euclidean kinetic energy,

$$2T = \dot r^2 + r^2\dot\vartheta^2, \tag{9.6.11}$$

in the Euclidean limit $\kappa \to \infty$, when the metric is divided through by $dt^2$. The second term in (9.6.11) can be written as $\ell_e^2/r^2$, where

$$\ell_e = r^2\dot\vartheta \tag{9.6.12}$$

is the angular momentum per unit mass. As we have seen, the Beltrami metric (9.6.8) conserves the angular momentum, (9.6.12), while the stereographic inner product model does not. This is a consequence of the factor $1 - r^2/\kappa^2$ in the numerator of (9.6.8). For uniform radial motion, $\vartheta = \text{const.}$, (9.6.8) reduces to

$$dh = d\bar r = \frac{dr}{1 - r^2/\kappa^2}, \tag{9.6.13}$$

whose integral is

$$h = \kappa\tanh^{-1}(r/\kappa). \tag{9.6.14}$$

Alternatively, for uniform circular motion, $r = \text{const.}$, (9.6.8) reduces to

$$dh = \frac{r\,d\vartheta}{\sqrt{1 - r^2/\kappa^2}}. \tag{9.6.15}$$

Integrating over a period,

$$h = \frac{2\pi r}{\sqrt{1 - r^2/\kappa^2}} > 2\pi r, \tag{9.6.16}$$

shows that the length of a hyperbolic circle of radius $\sinh(h/\kappa)$ is greater than that of a Euclidean circle having a radius $r$ $[= \kappa\tanh(h/\kappa)]$. This is none other than Einstein’s old result!
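The radial and circular specializations can be verified numerically. The sketch below, in Python, integrates $\sqrt{E}\,dr$ by the midpoint rule and evaluates $\sqrt{G}$ over one period, taking the metric coefficients in the forms $E = (1 - r^2/\kappa^2)^{-2}$ and $G = r^2/(1 - r^2/\kappa^2)$ that reproduce (9.6.13) and (9.6.15). The function names, step count, and sample values are illustrative assumptions.

```python
import math

def radial_length(r, kappa, n=20000):
    """Midpoint-rule integral of dr / (1 - r^2/kappa^2) from 0 to r, cf. (9.6.13)."""
    step = r / n
    return step * sum(1.0 / (1.0 - ((i + 0.5) * step / kappa) ** 2) for i in range(n))

def hyperbolic_circumference(r, kappa):
    """Length of the circle r = const. over one period, cf. (9.6.16)."""
    return 2.0 * math.pi * r / math.sqrt(1.0 - (r / kappa) ** 2)

if __name__ == "__main__":
    kappa, r = 2.0, 1.2                      # illustrative values with r < kappa
    h = radial_length(r, kappa)
    print("numerical radial length :", h)
    print("kappa * atanh(r/kappa)  :", kappa * math.atanh(r / kappa))
    print("hyperbolic circumference:", hyperbolic_circumference(r, kappa))
    print("2 pi kappa sinh(h/kappa):", 2.0 * math.pi * kappa * math.sinh(h / kappa))
    print("Euclidean circumference :", 2.0 * math.pi * r)
```

The first pair of numbers checks (9.6.14); the second pair shows that (9.6.16) is the circumference $2\pi\kappa\sinh(h/\kappa)$ of a circle whose hyperbolic radius is $h$, which always exceeds $2\pi r$.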
We can further justify inequality (9.6.16) by considering the circumference of a hyperbolic circle with center $O$ and radius (9.6.14) in Fig. 9.5, which is determined by observing that every point $p$ on this circle has $\phi = \pi/2$. Thus, $\varrho = \kappa/\sqrt{1 - r^2/\kappa^2} = \kappa\cosh(h/\kappa)$, and this value multiplied by $2\pi r/\kappa = 2\pi\tanh(h/\kappa)$ gives the hyperbolic circumference $2\pi\kappa\sinh(h/\kappa)$.$^{c}$ The metric coefficients in (9.6.8), (9.6.9a) and (9.6.9b), determine the constant Gaussian curvature,

$$K = -\frac{1}{2\sqrt{EG}}\frac{d}{dr}\left(\frac{G_r}{\sqrt{EG}}\right) = -1/\kappa^2, \tag{9.6.17}$$

where the subscript denotes the derivative. The metric coefficients also determine the geodesics from (9.6.10),

$$\vartheta' = \pm\frac{\bar\ell}{r^2\sqrt{1 - (\bar\ell/r)^2(1 - r^2/\kappa^2)}}. \tag{9.6.18}$$

The geodesics are straight lines,

$$r\cos(\vartheta - \vartheta_0) = \gamma, \tag{9.6.19}$$

where $\vartheta_0$ is the angle that the normal to the line $r = \gamma$ makes with the polar axis, as shown in Fig. 9.5. We already know that the geodesics must pass through the origin, and these have an inclination $\vartheta = \vartheta_0$ with respect to the polar axis. In physical terms, there is nothing to counter the centrifugal force that would allow for the formation of a closed orbit. We now inquire into the physical meaning of $\phi$, whose tangent is related to the equation of the geodesics according to (9.6.7a). The equations of aberration, (8.3.6) and (8.3.9), are here given by

$$u\cos\phi' = \frac{u_1\cos\phi - u_2}{1 - u_1 u_2\cos\phi/c^2}, \tag{9.6.20a}$$

$$u\sin\phi' = \frac{u_1\sin\phi\,\sqrt{1 - u_2^2/c^2}}{1 - u_1 u_2\cos\phi/c^2}. \tag{9.6.20b}$$

$^{c}$ This is what the relativists confuse with the expansion factor of the universe as we shall see in Sec. 9.11.
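Both the constancy of the curvature (9.6.17) and the straightness of the geodesics (9.6.19) lend themselves to a numerical check. The Python sketch below assumes the metric coefficients $E = (1 - r^2/\kappa^2)^{-2}$ and $G = r^2/(1 - r^2/\kappa^2)$, evaluates (9.6.17) by nested central differences, and then confirms that the first integral $G\vartheta'/\sqrt{E + G\vartheta'^2}$ stays constant along a Euclidean straight line $r\cos(\vartheta - \vartheta_0) = d$. The names `E`, `G`, `d`, the finite-difference step, and the sample radii are our own illustrative choices.

```python
import math

def E(r, kappa):
    """Radial metric coefficient, assumed form 1 / (1 - r^2/kappa^2)^2."""
    return 1.0 / (1.0 - (r / kappa) ** 2) ** 2

def G(r, kappa):
    """Angular metric coefficient, assumed form r^2 / (1 - r^2/kappa^2)."""
    return r * r / (1.0 - (r / kappa) ** 2)

def gaussian_curvature(r, kappa, step=1e-4):
    """K = -(1 / (2 sqrt(EG))) d/dr (G_r / sqrt(EG)), by central differences."""
    def ratio(x):
        g_r = (G(x + step, kappa) - G(x - step, kappa)) / (2.0 * step)
        return g_r / math.sqrt(E(x, kappa) * G(x, kappa))
    d_ratio = (ratio(r + step) - ratio(r - step)) / (2.0 * step)
    return -d_ratio / (2.0 * math.sqrt(E(r, kappa) * G(r, kappa)))

def first_integral_on_line(r, d, kappa):
    """G theta' / sqrt(E + G theta'^2) along the straight line r cos(theta - theta0) = d."""
    theta_prime = d / (r * math.sqrt(r * r - d * d))   # d(theta)/dr on the line, r > d
    return G(r, kappa) * theta_prime / math.sqrt(E(r, kappa) + G(r, kappa) * theta_prime ** 2)

if __name__ == "__main__":
    kappa, d = 2.0, 0.5
    for r in (0.3, 0.8, 1.2):
        print("K at r =", r, ":", gaussian_curvature(r, kappa), "vs", -1.0 / kappa ** 2)
    for r in (0.6, 1.0, 1.5):
        print("first integral at r =", r, ":", first_integral_on_line(r, d, kappa))
```

Under these assumptions the first integral takes the same value, $d/\sqrt{1 - d^2/\kappa^2}$, at every radius, so the distance of the line from the origin and the collision parameter are related by $d = \bar\ell/\sqrt{1 + \bar\ell^2/\kappa^2}$.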
The aberration equations (9.6.20a) and (9.6.20b) can be used to derive the most general composition law of velocities. Squaring (9.6.20a) and (9.6.20b), and adding, results in

$$u^2 = \frac{(\mathbf{u}_1 - \mathbf{u}_2)^2 - (\mathbf{u}_1\times\mathbf{u}_2)^2/c^2}{(1 - \mathbf{u}_1\cdot\mathbf{u}_2/c^2)^2} = \frac{u_1^2 + u_2^2 - 2u_1u_2\cos\phi - (u_1u_2\sin\phi)^2/c^2}{(1 - u_1u_2\cos\phi/c^2)^2}. \tag{9.6.21}$$

Dividing (9.6.20b) by (9.6.20a) gives

$$\tan\phi' = \frac{u_1\sin\phi\,\sqrt{1 - u_2^2/c^2}}{u_1\cos\phi - u_2}. \tag{9.6.22}$$

Whereas the composition law of velocities, (9.6.21), invalidates the law of cosines,

$$u^2 = u_1^2 + u_2^2 - 2u_1u_2\cos\phi, \tag{9.6.23}$$

(9.6.22) invalidates the law of sines. We can appreciate this by considering the phenomenon of stellar aberration, where $u_1 = c$ and $u_2$ equals the Earth’s velocity. Equation (9.6.22) becomes

$$\tan\phi' = \frac{\sin\phi\,\sqrt{1 - \beta^2}}{\cos\phi - \beta}, \tag{9.6.24}$$

where $\beta = u_2/c$.

Fig. 9.6. Geometric set-up for stellar aberration.
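A short Python sketch may help to see what the composition law and the aberration formula imply numerically. It evaluates (9.6.21) and (9.6.24), checks that a light ray ($u_1 = c$) keeps the speed $c$ after composition, and compares the exact aberration shift with the first-order estimate $\beta\sin\phi$ for a terrestrial value $\beta = 10^{-4}$. The function names and the choice of units with $c = 1$ are illustrative assumptions.

```python
import math

def composed_speed(u1, u2, phi, c=1.0):
    """Magnitude u from the composition law (9.6.21)."""
    numerator = u1 * u1 + u2 * u2 - 2.0 * u1 * u2 * math.cos(phi) - (u1 * u2 * math.sin(phi) / c) ** 2
    denominator = (1.0 - u1 * u2 * math.cos(phi) / c ** 2) ** 2
    return math.sqrt(numerator / denominator)

def aberration_angle(phi, beta):
    """phi' from tan(phi') = sin(phi) sqrt(1 - beta^2) / (cos(phi) - beta), Eq. (9.6.24)."""
    return math.atan2(math.sin(phi) * math.sqrt(1.0 - beta * beta), math.cos(phi) - beta)

if __name__ == "__main__":
    print("light stays at c:", composed_speed(1.0, 0.3, 1.1))
    beta, phi = 1e-4, 1.0            # Earth's orbital beta and an arbitrary angle
    shift = aberration_angle(phi, beta) - phi
    print("exact shift      :", shift)
    print("first-order shift:", beta * math.sin(phi))
```

For $u_1 = c$ the numerator of (9.6.21) collapses to $c^2(1 - \beta\cos\phi)^2$, which is why the composed speed remains $c$ whatever the angle.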
In Fig. 9.6, $L$ represents the telescope’s lens, and $O$ the eye of the observer at the moment a light ray arrives at $L$ from a star $S$. $OE$ indicates the direction in which the Earth is orbiting about the Sun. In the time $\tau$ that it takes for the light ray to pass through the telescope, the Earth will have moved a distance $u_2\tau$ to position $O'$. The distance between the lens and the new position of the Earth is $c\tau$. If the Earth were stationary then the telescope would be pointed along $O'L$, but, because of the Earth’s motion, it is pointed in direction $OL$. Drawing $O'L'$ and $LL'$ completes the parallelogram. Denote $\angle LO'E$ by $\phi'$ and $\angle L'O'E$ by $\phi$. The difference $\phi' - \phi$ is attributed to stellar aberration, which displaces the star’s actual position toward the direction $OE$ in the plane $SO'E$. The law of sines for the triangle $LO'L'$ is

$$\frac{\sin\angle LO'L'}{LL'} = \frac{\sin\angle LL'O'}{LO'}. \tag{9.6.25}$$

Introducing the facts that $LL' = OO' = u_2\tau$ and $LO' = c\tau$, we get

$$\sin(\phi' - \phi) = \beta\sin\phi. \tag{9.6.26}$$

Since the Earth’s relative velocity, $\beta := u_2/c = 10^{-4}$, is small, we may replace the sine by its argument to get the formula

$$\Delta\phi := \phi' - \phi = \beta\sin\phi, \tag{9.6.27}$$

which is commonly used to calculate aberration [Smart 60], where $\beta$ is called the constant of aberration. This Euclidean approximation is equivalent to approximating the square root in (9.6.22) by unity, and neglecting terms of higher power than first in $\beta$, so that (9.6.22) will reduce to $\tan\phi' \simeq \tan\phi\,(1 + \beta\sec\phi)$. Then performing the expansion of the trigonometric functions to first order in the difference $\Delta\phi$ results in (9.6.27). The violations of the laws of cosines and sines, (9.6.23) and (9.6.25), mean that the addition law of velocities cannot be represented as a triangle in the flat Euclidean plane, but, rather, as a distorted triangle on the surface of a pseudosphere, a surface of revolution with constant negative curvature resembling the bugle surface in Fig. 2.19. The bugle has a rim, so that it occupies only a finite region of the hyperbolic plane, and it obeys the hyperbolic axiom that for any given line $\ell$ and point $p$ not on $\ell$, there are
at least two lines through $p$ that do not intersect $\ell$. The angle defect of the triangle is a direct consequence of this axiom, and the area of the triangle is proportional to its defect, as we have seen in Sec. 2.1.1. Furthermore, the parallelogram rule for the addition of velocities in Newtonian kinematics is no longer valid, and we are left only with the triangle rule. Thus, (9.6.22) can be considered as the relativistic generalization of the law of aberration that would be applicable to relativistic velocities. The plane of the orbit that the Sun traces out in a year is called the ecliptic plane. The great circle in which this plane intersects the celestial sphere, at whose center the Earth is found, is called the ecliptic. If the fixed star is at the pole of the ecliptic, $\phi = \pi/2$ all along the Earth’s orbit. The aberrational orbit will be a circle about the pole of the ecliptic with radius $\beta$. This corresponds to the circle $C$ with center $O$ and Euclidean radius $r$ in Fig. 9.5. For every point $p$ on the locus of points at a given distance from the center, $\phi = \pi/2$, and $\varrho = \kappa/\sqrt{1 - r^2/\kappa^2} = \kappa\cosh(\bar r/\kappa)$. Multiplying this by the Euclidean circumference scaled by $\kappa$, $2\pi r/\kappa = 2\pi\tanh(\bar r/\kappa)$, gives the circumference of the hyperbolic circle, $2\pi\kappa\sinh(\bar r/\kappa)$, a result we found earlier. For stars that lie in the ecliptic, $\phi$ varies between $\pm\pi/2$ and 0 [Sommerfeld 64]. A hyperbolic motion which takes $O$ to $p$ transforms a circle $C$ centered at $O$ into an ellipse $E$ centered at $p$, shown in Fig. 9.5. At the fixed point $p$, the function $\varrho$ reaches its maximum value $\kappa/(1 - r^2/\kappa^2)$ at $\phi = 0$, and its minimum value $\kappa/\sqrt{1 - r^2/\kappa^2}$ at $\phi = \pi/2$. Therefore the semi-major and semi-minor axes of the ellipse are $\sqrt{1 - r^2/\kappa^2}/\kappa$ and $(1 - r^2/\kappa^2)/\kappa$, which are the inverses of the values of $\varrho$ at $\phi = \pi/2$ and $\phi = 0$, respectively. The Euclidean area of the ellipse is

$$\frac{\pi}{\kappa^2}\left(1 - r^2/\kappa^2\right)^{3/2} = \frac{\pi}{\kappa^2}\,\mathrm{sech}^3\!\left(\frac{\bar r}{\kappa}\right).$$

If we use the hyperbolic definition of angular momentum for the stereographic inner product model that we found previously in (9.2.5),

$$\ell_h = \frac{r^2\dot\vartheta}{1 - r^2/\kappa^2}, \tag{9.6.28}$$

then we must modify the equation for the radius of the trajectory,

$$\dot r = \pm c\left(1 - \frac{r^2}{\kappa^2}\right)\sqrt{1 - \frac{\bar\ell^2}{r^2}\left(1 - \frac{r^2}{\kappa^2}\right)}, \tag{9.6.29}$$
if and only if $\bar\ell = \ell_e/c$, where $\ell_e = r^2\dot\vartheta$ is the Euclidean expression for the angular momentum. For the Beltrami metric, (9.6.28) is an option since (9.6.29) must also be modified. However, for the stereographic hyperbolic metric, (9.6.28) is no longer an option, it is a must. Møller [52] claims that the notion of a ‘radius vector’ in the definition of angular momentum can only be defined unambiguously in the Euclidean plane, and assumes that $(1 - r^2/\kappa^2)^{-1}$ is a small correction to the angular momentum, so that there is only a slight violation of the conservation law of angular momentum. Slight or not, it is still a violation of a conservation law! Rather, (9.6.28) is to be considered as the hyperbolic conservation law for angular momentum. General relativity agrees with (9.6.28), but not with (9.6.29). The radial equation must then be

$$\dot r = \pm c\sqrt{1 - \frac{\bar\ell^2}{r^2}\left(1 - \frac{r^2}{\kappa^2}\right)^2}, \tag{9.6.30}$$

if it is to correspond to the stereographic hyperbolic metric. Consequently, it leads to an equation of the trajectory of the form

$$\vartheta' = \pm\frac{\bar\ell\,(1 - r^2/\kappa^2)/r^2}{\sqrt{1 - (\bar\ell/r)^2(1 - r^2/\kappa^2)^2}}, \tag{9.6.31}$$

whose solutions are bowed geodesics whose centers lie outside the disc, as we have shown in Sec. 7.5. The bending of the geodesic is attributed to the rotation of the disc [Grøn 04]. But if the stereographic hyperbolic metric were applicable it would invalidate Einstein’s old result (9.6.16), and it would also be in conflict with his general theory. There is no way for (9.6.28) to hold and yet come out with inequality (9.6.16). To calculate the hyperbolic distance between points $r_1$ and $r_2$,

$$h(r_1,r_2) = \int_{r_1}^{r_2}\frac{\sqrt{dr^2 + r^2\,d\vartheta^2\,(1 - r^2/\kappa^2)}}{1 - r^2/\kappa^2}, \tag{9.6.32}$$

we introduce the ‘effective’ centrifugal potential,

$$V_c(r) = \frac{\ell^2}{2r^2}\left(1 - \frac{r^2}{\kappa^2}\right), \tag{9.6.33}$$
where relativistic effects are accounted for in the second term. Squaring (9.6.29) gives the hyperbolic energy conservation law,

$$\dot r^2 + 2V_c(r) = c^2, \tag{9.6.34}$$

if $\ell = \ell_e$ in (9.6.33), so that the factor in front of the square root in (9.6.29) does not belong there. In terms of the effective potential, we can write the hyperbolic distance (9.6.32) as the logarithm of the cross-ratio,

$$h(r_1,r_2) = \frac{\kappa}{2}\ln\left[\frac{\kappa + r_2\sqrt{1 - 2V_c(r_2)/c^2}}{\kappa - r_2\sqrt{1 - 2V_c(r_2)/c^2}}\cdot\frac{\kappa - r_1\sqrt{1 - 2V_c(r_1)/c^2}}{\kappa + r_1\sqrt{1 - 2V_c(r_1)/c^2}}\right]. \tag{9.6.35}$$

This clearly shows that it is the cross-ratio that determines the hyperbolic distance between any two points. At low angular momentum, the cross-ratio in (9.6.35) simplifies to

$$h(r_1,r_2) = \frac{\kappa}{2}\ln\left[\frac{\kappa + r_2}{\kappa - r_2}\cdot\frac{\kappa - r_1}{\kappa + r_1}\right] = \frac{\kappa}{2}\ln\{r_1, r_2 \mid \kappa, -\kappa\} \tag{9.6.36}$$

between the ordered points $(\kappa, r_2, r_1, -\kappa)$. The hyperbolic distance (9.6.36) vanishes when $r_1 = r_2$ and tends to infinity when either $r_2 \uparrow \kappa$ or $r_1 \downarrow -\kappa$. To the Poincarites, it would seem like the rim is infinitely far away. If the uniform acceleration is caused by gravity, this will fix the radius of curvature as

$$\kappa = \sqrt{\frac{3}{4\pi G\rho}}, \tag{9.6.37}$$

for a mass of constant density $\rho$, where $G$ is the Newtonian gravitational constant. The factor $\kappa$ appears as a free-fall time. Free-falling objects know no restrictions placed on their velocities, like the restriction to regions of the rotating disc where $r < c/\omega$. The frequency $\omega = \sqrt{4\pi G\rho/3}$ is the minimum frequency at which an object must rotate in order to avoid gravitational collapse. But if we apply the free-fall frequency to the same condition as that of the disc, we get

$$r < \sqrt{\frac{3}{8\pi G\rho}}\;c, \tag{9.6.38}$$

which is the density formulation of the Schwarzschild inequality $\alpha/r < 1$.
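The properties claimed for the cross-ratio distance (9.6.36) and for the bound (9.6.38) can be illustrated with a few lines of Python. The sketch checks that the distance equals a difference of inverse hyperbolic tangents, that it is additive along the radius, and that it grows without bound near the rim. It then confirms that the radius in (9.6.38) is where $\alpha/r = 1$, taking $\alpha = 2GM/c^2$ with $M = \tfrac{4}{3}\pi\rho r^3$. The function names, the SI values of $G$ and $c$, and the sample density are assumptions made for the illustration.

```python
import math

def cross_ratio_distance(r1, r2, kappa):
    """Hyperbolic distance (9.6.36) between radial points r1 and r2 in the kappa-disc."""
    return 0.5 * kappa * math.log((kappa + r2) / (kappa - r2) * (kappa - r1) / (kappa + r1))

def max_radius(rho, G=6.674e-11, c=2.998e8):
    """Right-hand side of (9.6.38): c * sqrt(3 / (8 pi G rho)), SI units assumed."""
    return c * math.sqrt(3.0 / (8.0 * math.pi * G * rho))

def schwarzschild_ratio(r, rho, G=6.674e-11, c=2.998e8):
    """alpha / r with alpha = 2 G M / c^2 and M the mass of a uniform sphere of radius r."""
    mass = 4.0 / 3.0 * math.pi * rho * r ** 3
    return 2.0 * G * mass / (c * c * r)

if __name__ == "__main__":
    kappa = 3.0
    print("h(0.5, 2.0)          :", cross_ratio_distance(0.5, 2.0, kappa))
    print("atanh difference     :", kappa * (math.atanh(2.0 / kappa) - math.atanh(0.5 / kappa)))
    print("near the rim         :", cross_ratio_distance(0.0, kappa * (1.0 - 1e-12), kappa))
    rho = 1000.0                 # density of water in kg/m^3, purely illustrative
    print("alpha/r at the bound :", schwarzschild_ratio(max_radius(rho), rho))
```

Because the distance is a difference of $\kappa\tanh^{-1}(r/\kappa)$ evaluated at the two endpoints, it is automatically additive for collinear points, which is the defining property of a distance built from a cross-ratio.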
With inequality (9.6.38) a constant free-fall time is thus compatible with a uniformly rotating system. This choice of the absolute constant sets the Gaussian curvature (9.6.17) directly proportional to the constant mass density, viz.,

$$K = -1/\kappa^2 = -\tfrac{4}{3}\pi G\rho.$$

In the hyperbolic space of constant negative curvature, a gravitational potential cannot be appended onto the energy conservation law (9.6.34) as a separate entity. However, if mass, rather than density, is constant, the curvature will no longer be constant so that gravitational acceleration will not be equivalent to a uniformly rotating disc. Although the curvature will no longer be constant, it will show that gravitational effects enter, not only through a potential in energy conservation, but also in the specification of the absolute constant $\kappa$ that determines the point at infinity, or the ideal point, where two parallel lines intersect. We will come back to this case in Sec. 9.10.
9.7 The FitzGerald–Lorentz Contraction via the Triangle Defect
The fact that a hyperbolic triangle must be fitted onto a pseudosphere in hyperbolic space, rather than lying flat in the Euclidean plane, causes an angle defect where the sum of the angles of the triangle is less than two right angles. The curvature of space implies a ‘fitting error.’ If the curvature is like that of the cylinder of a hat, it will produce an angle defect as shown in the picture on the left in Fig. 9.7. Moreover, since the angles of a hyperbolic triangle determine the sides, we can expect the defect to shorten the length of at least one of its sides. Likewise, a positive curvature also produces a fitting error when one tries to place a flat object on a sphere, as in the picture on the right in Fig. 9.7. This time there will be an angle excess so that the angles of a triangle will add to more than $\pi$. Consequently we can expect a lengthening of the sides of the triangle, which physically corresponds to a space dilatation. Since the triangles lie in velocity space we can expect that the angle defect will be related to the FitzGerald–Lorentz contraction, and the angle excess to the opposite effect of space dilatation. Whereas the contraction
Fig. 9.7. Fokker’s [65] visualization of fitting errors when objects are placed on curved surfaces. The left and right sides correspond to negative and positive curvature, respectively.
is well known (but not in the hyperbolic context to be described here), the dilatation is unknown, and we will defer its discussion until Sec. 11.3. In the early days of relativity, Ehrenfest [09] arrived at the paradoxical conclusion that the circumference of a rotating disc should be shorter than $2\pi r$ due to the FitzGerald–Lorentz contraction. For, according to Ehrenfest, the periphery of a cylinder when set into motion will “show a contraction compared to its state of rest: $2\pi r' < 2\pi r$, because each element of the periphery is moving in its own direction with instantaneous velocity $r'\omega$.” As we have seen, Einstein [Stachel 89] came to the opposite conclusion that the circumference had to be greater than $2\pi r$, claiming that it was necessary to keep the longitudinal FitzGerald–Lorentz contraction as distinct from the shortening of the tangential components of the measuring rods on the disc by a factor of $\sqrt{1 - r^2\omega^2/c^2}$. And because more tangential measuring rods are needed than when the disc is at rest, its circumference should be increased by the factor $1/\sqrt{1 - r^2\omega^2/c^2}$. He concluded that a “rigid disc must break up if it is set into motion, on account of the Lorentz contraction of the tangential fibers and the non-contraction of the radial ones” [Stachel 89]. Einstein’s argument, that the measuring rods contract so that more are needed to measure the circumference of the disc, fails to answer the question of why the periphery
Fig. 9.8. Hyperbolic right triangle inscribed in a unit disc.
of the disc also does not contract when set into motion. And the contraction should be greater the faster the disc rotates! Consider a unit disc$^{d}$ in velocity space with a right triangle inscribed in it, as shown in Fig. 9.8. In view of the Euclidean expression for the kinetic energy, (9.6.11), we may consider that the Euclidean measures of the sides are $\beta = \dot r$ and $\alpha = r\dot\vartheta$, and the hypotenuse $\gamma < 1$, which ensures that the rotating system may be “represented by a uniformly rotating ‘material’ disc since nothing, in Euclidean space, can surpass the velocity of light” [Møller 52]. The angle $A$ at the center of the disc will not be distorted, so that it will obey Euclidean geometry. Thus, its Euclidean measure $A$ will coincide with its hyperbolic measure $\bar A$,

$$\cos A = \cos\bar A = \beta/\gamma = \tanh\bar\beta/\tanh\bar\gamma. \tag{9.7.1}$$

However, because the angle $B$ is non-central, its Euclidean measure, $B$, will not coincide with its hyperbolic measure, $\bar B$. The logarithm of the cross-ratio is the hyperbolic distance,

$$\bar\alpha = \frac{1}{2}\ln\left[\frac{e(c,u)}{e(b,u)}\cdot\frac{e(b,v)}{e(c,v)}\right] = \frac{1}{2}\ln\left[\frac{\sqrt{1-\beta^2}}{\sqrt{1-\beta^2} - \alpha}\cdot\frac{\sqrt{1-\beta^2} + \alpha}{\sqrt{1-\beta^2}}\right].$$

$^{d}$ This implies we are using natural units where the velocity of light $c = 1$.
Exponentiating both sides and solving for $\alpha$ give

$$\alpha = \tanh\bar\alpha\cdot\mathrm{sech}\,\bar\beta. \tag{9.7.2}$$

This shows that the Lobachevsky straight line segment in hyperbolic space, $\tanh\bar\alpha$, has been shortened by the amount $\mathrm{sech}\,\bar\beta$, which is the FitzGerald–Lorentz contraction factor. The contraction, $\sqrt{1-\beta^2} = \mathrm{sech}\,\bar\beta$, implies

$$\beta = \tanh\bar\beta. \tag{9.7.3}$$

This is the set of Beltrami coordinates. In terms of these coordinates the Beltrami metric form is $dh^2 = d\bar\alpha^2 + \cosh^2\bar\alpha\,d\bar\beta^2$. Moreover, the Euclidean Pythagorean theorem, $\gamma^2 = \alpha^2 + \beta^2$, or the Euclidean kinetic energy, $\gamma^2 = 2T$ in (9.6.11), which asserts that $\tanh^2\bar\gamma = \tanh^2\bar\alpha\cdot\mathrm{sech}^2\bar\beta + \tanh^2\bar\beta$, gives way to the hyperbolic Pythagorean theorem

$$\cosh\bar\gamma = \cosh\bar\alpha\cdot\cosh\bar\beta. \tag{9.7.4}$$

Now, the cosine of the hyperbolic measure of the angle $B$, $\cos\bar B = \tanh\bar\alpha/\tanh\bar\gamma$, or the ratio of the adjacent to the hypotenuse, will be related to the cosine of its Euclidean measure by

$$\cos B = \frac{\alpha}{\gamma} = \frac{\tanh\bar\alpha\,\mathrm{sech}\,\bar\beta}{\tanh\bar\gamma} = \cos\bar B\,\mathrm{sech}\,\bar\beta = \cos\bar B\,\sqrt{1-\beta^2}. \tag{9.7.5}$$

Consequently, $\cos\bar B > \cos B$, and since the cosine decreases monotonically on the open interval $(0,\pi)$ it follows that $B > \bar B$. Since $B = \pi/2 - A$, the sum $A + \bar B < \pi/2$, so that the three angles of the triangle add up to less than $\pi$, which is the well-known angle defect. And since the angles of a hyperbolic triangle determine its sides, the side $\alpha$ will appear smaller than its hyperbolic measure, $\bar\alpha$, by an amount given precisely by the FitzGerald–Lorentz contraction factor, $\sqrt{1-\beta^2}$.
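The chain of relations (9.7.2) to (9.7.5) can be followed numerically. Starting from hyperbolic sides $\bar\alpha$ and $\bar\beta$, the Python sketch below builds the Euclidean sides, recovers $\bar\gamma$ from $\gamma = \tanh\bar\gamma$, and evaluates the Euclidean and hyperbolic measures of the angles. The function names and the sample side lengths are illustrative assumptions, and units with $c = 1$ are used throughout.

```python
import math

def euclidean_sides(alpha_bar, beta_bar):
    """Euclidean sides of the inscribed right triangle from (9.7.2), (9.7.3) and gamma^2 = alpha^2 + beta^2."""
    beta = math.tanh(beta_bar)
    alpha = math.tanh(alpha_bar) / math.cosh(beta_bar)
    gamma = math.hypot(alpha, beta)
    return alpha, beta, gamma

def angles(alpha_bar, beta_bar):
    """Return (A, B, B_bar): Euclidean angles A and B, and the hyperbolic measure of B."""
    alpha, beta, gamma = euclidean_sides(alpha_bar, beta_bar)
    angle_a = math.acos(beta / gamma)                    # central angle, same in both measures
    angle_b = math.acos(alpha / gamma)                   # Euclidean measure of B
    angle_b_bar = math.acos(math.tanh(alpha_bar) / gamma)  # cos(B_bar) = tanh(alpha_bar)/tanh(gamma_bar)
    return angle_a, angle_b, angle_b_bar

if __name__ == "__main__":
    alpha_bar, beta_bar = 0.7, 0.4                       # illustrative hyperbolic sides
    alpha, beta, gamma = euclidean_sides(alpha_bar, beta_bar)
    gamma_bar = math.atanh(gamma)
    print("cosh(gamma_bar)                :", math.cosh(gamma_bar))
    print("cosh(alpha_bar) cosh(beta_bar) :", math.cosh(alpha_bar) * math.cosh(beta_bar))
    angle_a, angle_b, angle_b_bar = angles(alpha_bar, beta_bar)
    print("A + B     :", angle_a + angle_b, "(pi/2 =", math.pi / 2, ")")
    print("A + B_bar :", angle_a + angle_b_bar)
    print("defect    :", math.pi / 2 - angle_a - angle_b_bar)
```

The first two printed numbers agree, which is the hyperbolic Pythagorean theorem (9.7.4), and the positive defect in the last line is the shortfall of the angle sum from two right angles.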
The origin of the FitzGerald–Lorentz contraction is to be found in the hyperbolic angle defect, since the sides of a hyperbolic triangle are determined by its angles. The exact same contraction factor is found for the normal, or second-order, Doppler shift. Due to uniform acceleration, the Euclidean measure of $\alpha$ will not be $\tanh\bar\alpha$, but it will be decreased by the factor $\sqrt{1-\beta^2}$. It will appear to us Euclideans that the disc is rotating at a slower rate than it would to the Poincarites, who measure a length $\bar\alpha = \tanh^{-1}\alpha$. By writing (9.6.16) as

$$2\pi r = h\,\mathrm{sech}\,\bar\beta,$$

we can attribute our slower rate to a smaller perimeter, $2\pi r$, to cover than that of the Poincarites, who have to cover the larger perimeter, $h$.
9.8 Hyperbolic Nature of the Electromagnetic Field and the Poincaré Stress
Hyperbolic geometry also applies to Maxwell’s equations, and, in this section, we will show how it can be used to calculate the Poincaré stress that we analyzed in Sec. 6.2. It will make the assumption of charge conservation on the surface of the electron completely superfluous. Consider a charge moving in the $x$-direction at a constant, relative speed $\beta$. The laws of transformation of the electromagnetic fields, $\mathbf{E}$ and $\mathbf{H}$, are

$$E_x' = E_x, \qquad H_x' = H_x, \tag{9.8.1a}$$

$$E_y' = \Gamma(E_y - \beta H_z), \qquad H_y' = \Gamma(H_y + \beta E_z), \tag{9.8.1b}$$

$$E_z' = \Gamma(E_z + \beta H_y), \qquad H_z' = \Gamma(H_z - \beta E_y), \tag{9.8.1c}$$

where we denote $\Gamma = 1/\sqrt{1-\beta^2}$ so as not to be confused with $\gamma$, the hypotenuse of the triangle. We consider the primed inertial system to be at rest in the $xy$-plane. From the last section, we know that the sides of a triangle may be expressed in terms of the angles of the triangle. Consequently, the first two
transformation laws, (9.8.1a) and (9.8.1b) can be stated as ¯ = β/γ = tanh β/tanh ¯ cos A = cos A γ, ¯ cos B = α/γ =
(9.8.2a)
tanh α¯ ¯ sech β¯ = cos B¯ sech β, tanh γ¯
(9.8.2b)
where the latter is the hyperbolic Pythagorean theorem (9.7.4). The components of the force are obtained by multiplying their projections in the x, y, and z planes by the factors $1$, $\Gamma^{-1}$, and $\Gamma^{-1}$, respectively [Lorentz 16]. In the xy-plane the force components will be given by
$$F_x = \frac{e}{4\pi a^2}E_x = \frac{e}{4\pi a^2}\cos A, \tag{9.8.3a}$$
$$F_y = \frac{e}{4\pi a^2}(E_y - \beta H_z) = \frac{e}{4\pi a^2}E'_y/\Gamma = \frac{e}{4\pi a^2}\cos\bar B\,\mathrm{sech}\,\bar\beta, \tag{9.8.3b}$$
where $a$ is the radius of the sphere of charge $e$. The magnitude of the force is
$$F = \sqrt{F_x^2 + F_y^2} = \frac{e}{4\pi a^2}\sqrt{\cos^2 A + \cos^2 B}. \tag{9.8.4}$$
Without realizing that
$$\cos^2 A + \cos^2 B = \frac{1 - \mathrm{sech}^2\bar\beta + (1 - \mathrm{sech}^2\bar\alpha)\,\mathrm{sech}^2\bar\beta}{\tanh^2\bar\gamma} = 1,$$
which follows directly from the hyperbolic Pythagorean theorem, (9.7.4), Page and Adams [40] invent a charge conservation condition, per unit area on the surface of the electron, $\rho\,d\sigma = \rho'\,d\sigma'$, where $\rho' = e/4\pi a^2$. The surface elements, $d\sigma$ and $d\sigma'$, are supposedly related by $d\sigma = d\sigma'\sqrt{\cos^2 A + \cos^2 B}$, so that upon solving for the unknown charge density, $\rho$, the square root in (9.8.4) is eliminated.
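The identity $\cos^2 A + \cos^2 B = 1$ can be checked numerically. The sketch below is only an illustration: it assumes the hyperbolic Pythagorean theorem in its standard form, $\cosh\bar\gamma = \cosh\bar\alpha\cosh\bar\beta$ (taken here to be equivalent to (9.7.4)), and the function name and sample side lengths are hypothetical.

```python
import math

def direction_cosines(alpha_bar, beta_bar):
    """Return (cos A, cos B) from (9.8.2a) and (9.8.2b) for hyperbolic legs alpha_bar, beta_bar."""
    # Assumed form of the hyperbolic Pythagorean theorem: cosh(gamma) = cosh(alpha) cosh(beta).
    gamma_bar = math.acosh(math.cosh(alpha_bar) * math.cosh(beta_bar))
    cos_a = math.tanh(beta_bar) / math.tanh(gamma_bar)
    cos_b = math.tanh(alpha_bar) / (math.cosh(beta_bar) * math.tanh(gamma_bar))
    return cos_a, cos_b

# Hypothetical side lengths, chosen only for illustration.
for alpha_bar, beta_bar in [(0.3, 0.7), (1.2, 0.1), (2.0, 2.5)]:
    cos_a, cos_b = direction_cosines(alpha_bar, beta_bar)
    print(alpha_bar, beta_bar, cos_a**2 + cos_b**2)
```

For every pair of legs the sum of squares equals unity to rounding error, so the square root in (9.8.4) is identically one without any auxiliary condition.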
Since the electromagnetic field vanishes inside the electron, the stress acting on the surface,
$$S = \tfrac{1}{2}F^2 = \frac{e^2}{32\pi^2 a^4}, \tag{9.8.5}$$
is the well-known Poincaré stress that was needed to reduce the $\tfrac{4}{3}$ factor in the expression for the energy of an electron to unity, leading to the conclusion that the mass of an electron is not totally electromagnetic in origin. This corrects the derivation given in Sec. 6.2. There is no need to invoke a hypothetical charge conservation on the surface of the electron when it is realized that the surface element is in the hyperbolic plane, and not the Euclidean plane.
9.9
The Terrell–Weinstein Effect and the Angle of Parallelism
If we want to determine the size of a rod traveling at a relative velocity $\beta$, we have to take into account that the photons we observe emanating from the ends of the rod will arrive at different times. Terrell [59] showed that one can interpret what is usually viewed as a FitzGerald–Lorentz contraction as a distortion due to the rotation of the rod. Weinstein [60] claimed, about the same time, that the length of a rod can appear infinite, which he claimed cannot be due to a mere rotation. Here, we will show it to be due to a phenomenon analogous to stellar parallax, one that involves the angle of parallelism in hyperbolic geometry. In the limiting case we have the Euclidean distance $\beta = \cos A$ [cf. Fig. 9.8], since the maximum length of the hypotenuse of the inscribed right triangle in a unit disc is 1. According to the definition of the angle of parallelism, (9.5.3), the hyperbolic measure of the velocity, $\bar\beta$, whose Euclidean value satisfies $\beta < 1$, is
$$\bar\beta = \frac12\ln\frac{1+\beta}{1-\beta} = \frac12\ln\frac{1+\cos A}{1-\cos A} = \frac12\ln\left(\frac{1+\cos A}{\sin A}\right)^2 = \ln\cot(A/2), \tag{9.9.1}$$
where we have used a half-angle trigonometric formula in writing down the third equality. Exponentiating both sides of (9.9.1) results in the Bolyai–Lobachevsky formula [cf. Eq. (9.5.3)]
$$\cot[\Pi(\bar\beta)/2] = e^{\bar\beta}, \tag{9.9.2}$$
where $A = \Pi(\bar\beta)$ is the angle of parallelism.
Consider a rod moving with relative velocity $\beta$ along the r axis. Light from the trailing and leading edges must travel over different distances, and, hence, arrive at different times. Suppose the distance covered by photons emanating from the trailing edge is $d_1$, while that from the leading edge is $d_2$; they are also their respective times in natural units. Light that the observer sees at time $t$ will have emanated from the trailing and leading edges at $t - d_1$ and $t - d_2$, respectively, because of the finite speed of propagation of light. The Lorentz transformations for the space coordinates will then be
$$r'_1 = \Gamma[r_1 - \beta(t - d_1)], \tag{9.9.3a}$$
$$r'_2 = \Gamma[r_2 - \beta(t - d_2)], \tag{9.9.3b}$$
where, again, $\Gamma = 1/\sqrt{1-\beta^2}$. The difference between (9.9.3a) and (9.9.3b) provides a relation between the lengths of the rod in the system traveling at the velocity $\beta$. At rest, the difference is $\ell' = r'_2 - r'_1$, while in motion, $\ell = r_2 - r_1$, where $\ell' = \Gamma[\ell + \beta(d_2 - d_1)]$. Now, the difference in distances traveled by the photons from the leading and trailing edges is just the length of the rod in the system at rest, so that the length of the rod in motion will be Doppler-shifted by an amount,
$$\ell = \left(\frac{1+\beta}{1-\beta}\right)^{1/2}\ell', \tag{9.9.4}$$
if the rod is approaching the stationary observer. For a rod receding from the observer, the signs in the numerator and denominator must be exchanged because $\beta \to -\beta$. We could have arrived at (9.9.4) directly by observing that in addition to the usual Doppler effect there is a time dilatation between observers located on the moving and stationary frames.
Setting the Euclidean measure of the relative speed, $\beta$, equal to the cosine of the angle of parallelism in the Doppler expression (9.9.4) gives
$$\ell/\ell' = \cot[\Pi(\bar\beta)/2] = e^{\bar\beta}, \tag{9.9.5}$$
if the rod is approaching, while
$$\ell/\ell' = \tan[\Pi(\bar\beta)/2] = e^{-\bar\beta}, \tag{9.9.6}$$
if it is receding. In general, the angle of parallelism, $\Pi$, must be greater than $A$, and the larger the hyperbolic measure of the relative velocity $\bar\beta$ the smaller will be the angle $A$. Thus, we would expect to see a large expansion of the object as it approaches us, and a corresponding large contraction as it recedes from us. These are the conclusions that a single observer would make, and not those of two observers, as in the usual explanation of the FitzGerald–Lorentz contraction. Weinstein came to the same conclusions by plotting the exponent of the hyperbolic arctangent, rather than the tangent, because he did not go to the limit where $\beta = \cos A$, which then defines the angle of parallelism. However, unlike the astronomical phenomenon of parallax, where the radius of curvature is so large and the parallax angle so small as to thwart all attempts to date at measuring a positive defect, the distortions predicted by (9.9.5) and (9.9.6) are actually more dramatic, precisely because of the finite speed of light. Since $\pi - (\pi/2 + A + B) < \pi/2 - A = \varphi$, the defect is smaller than the complementary angle to $A$, known as the parallax angle, $\varphi$, in astronomy. Moreover, since $A\,(= \pi/2 - \varphi) \le \Pi$, or $\varphi > \pi/2 - \Pi$, there exists a lower bound for the parallax of stars, if space is, indeed, hyperbolic. Since $\varphi$ is the upper bound of the defect, the latter may stand a greater chance of being measured. Although no astronomical lower bound for the parallax angle has been found to date, the non-Euclidean nature of light rays may be easier to access because the finite velocity of light is not a constraint on the hyperbolic measure of the velocity.
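The chain of equalities linking the Doppler factor (9.9.4), the exponential of the hyperbolic velocity, and the angle of parallelism (9.9.2) can be illustrated with a few lines of arithmetic. This is a sketch only; the function names and the sample speed $\beta = 0.8$ are hypothetical choices.

```python
import math

def angle_of_parallelism(beta_bar):
    """Solve the Bolyai-Lobachevsky formula cot(Pi/2) = exp(beta_bar) for Pi."""
    return 2.0 * math.atan(math.exp(-beta_bar))

def apparent_length_ratio(beta, approaching=True):
    """Doppler factor of (9.9.4); beta is the Euclidean speed, 0 <= beta < 1."""
    if not approaching:
        beta = -beta
    return math.sqrt((1.0 + beta) / (1.0 - beta))

beta = 0.8                           # hypothetical Euclidean speed
beta_bar = math.atanh(beta)          # hyperbolic measure of the velocity, (9.9.1)
Pi = angle_of_parallelism(beta_bar)

print(math.cos(Pi), beta)                                   # beta = cos(Pi)
print(apparent_length_ratio(beta), math.exp(beta_bar))      # approaching, (9.9.5)
print(apparent_length_ratio(beta, False), math.tan(Pi / 2)) # receding, (9.9.6)
```

Each printed pair agrees to rounding error, which is all that (9.9.5) and (9.9.6) assert once $\beta$ is identified with the cosine of the angle of parallelism.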
9.10
Hyperbolic Geometries with Non-Constant Curvature
9.10.1
The heated disc revisited
We return to the heated disc that was discussed in Sec. 2.1.1. There, it provided us with an example of a two-dimensional geometry which is acted upon by thermal stresses that tend to warp it and, in so doing, modify its Euclidean geometry. Our intention was to determine what the consequences are, for the geometry, of assuming different physical laws for the transport of heat. Rather than considering the temperature as a correction factor in the physical law, we consider it to determine the law itself through the metric [Robertson 50],
$$d\bar r^2 = \frac{dx^2 + dy^2}{T^2(x,y)}, \tag{9.10.1}$$
with the stereographic inner product $T > 0$, but, otherwise, unknown. Under the assumption that the transport of heat is directed radially outward from the center of the disc, (9.10.1) becomes
$$d\bar r^2 = \frac{dr^2 + r^2 d\varphi^2}{T^2(r)}, \tag{9.10.2}$$
under a change to polar coordinates. If there is a heat source, of intensity $\sigma$, located at the center of the disc, the law of stationary heat conduction will be given by Poisson’s law,
$$\frac{k}{r}\frac{d}{dr}\left(r\frac{dT}{dr}\right) = -\sigma, \tag{9.10.3}$$
where $k$ is the thermal conductivity. The solution to (9.10.3) is
$$T = T_0 - \frac{\sigma r^2}{4k}, \tag{9.10.4}$$
where the constant of integration, $T_0$, is the temperature at the center of the disc. The first constant of integration would have led to an infinite temperature at the center of the disc, and, so, has been set equal to zero.
The line element for going from $r$ to $r + dr$ and $\varphi$ to $\varphi + d\varphi$, (9.10.2), is now explicitly given by
$$d\bar r^2 = \frac{dr^2 + r^2 d\varphi^2}{(T_0 - \sigma r^2/4k)^2}. \tag{9.10.5}$$
The Gaussian curvature,
$$K = -\frac{(T_0 - \sigma r^2/4k)^2}{r}\frac{d}{dr}\left(\frac{T_0 + \sigma r^2/4k}{T_0 - \sigma r^2/4k}\right) = -\frac{\sigma T_0}{k},$$
is constant, and negative if there is a heat source at the center, $\sigma > 0$, or positive if it is a sink, $\sigma < 0$. Outside the disc of radius $r_1$, which is at temperature $T_1$, $T$ will behave as a logarithmic potential,
$$T = T_1\left[\ln\left(\frac{r}{r_1}\right) + 1\right], \tag{9.10.6}$$
because it satisfies Laplace’s equation,
$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dT}{dr}\right) = 0,$$
in two dimensions. The Gaussian curvature,
$$K = -\frac{T_1^2}{r^2},$$
is still negative, but is no longer constant. This appears to contradict the fact that thermal stresses distort what would otherwise be flat, Euclidean geometry. If, instead of considering heat sources or sinks, we were to consider the rate of heating, we would have the diffusion equation for heat conduction. In two dimensions it reads
$$\frac{a^2}{r}\frac{\partial}{\partial r}\left(r\frac{\partial T}{\partial r}\right) - \frac{\partial T}{\partial t} = 0, \tag{9.10.7}$$
where $a^2 = k/\rho c$ with $c$ as the specific heat. Eliminating time by looking for a solution whose temporal dependency is exponentially decaying, $e^{-a^2\mu^2 t}$,
where $\mu$ is a completely arbitrary constant, (9.10.7) reduces to
$$T'' + \frac{1}{r}T' + \mu^2 T = 0, \tag{9.10.8}$$
where the prime denotes differentiation with respect to $r$. It will be immediately appreciated that (9.10.8) is the equation for a Bessel function of order zero, $J_0$. Changing our perspective, we now assume that the temperature is finite at the center, $T_0$, of the disc and vanishes at the rim, which is infinitely cold. In this case the solution to (9.10.8) is $T = T_0 J_0(\mu r)$. To take into account that the temperature vanishes on the rim, $r_1$, we set $\mu r_1 = \lambda$, a zero of the Bessel function, i.e. $J_0(\mu r_1) = 0$, and eliminate $\mu$ in the argument. The solution can now be written as
$$T = T_0 J_0\left(\frac{\lambda r}{r_1}\right). \tag{9.10.9}$$
The line element (9.10.2) is now given explicitly as
$$d\bar s^2 = \frac{dr^2 + r^2 d\varphi^2}{T_0^2 J_0^2(\lambda r/r_1)}. \tag{9.10.10}$$
Since $J_0$ has the infinite power series,
$$J_0(x) = 1 - \frac{x^2}{4} + \frac{x^4}{64} - \cdots,$$
it is clear that for discs of large radius, the curvature will be negative and constant. In general, the curvature is
$$K = -T_0^2\frac{\lambda^2}{r_1^2}\left(J_0^2 + J_0'^2\right). \tag{9.10.11}$$
The first term in (9.10.11) represents the heat source, while the second term is proportional to the square of the heat flux, since by Fourier’s law of heat conduction, the heat flux is proportional to the negative of the temperature gradient. For extremely large discs, the first term vanishes, and the negative curvature becomes proportional to the square of the heat flux. In other words, the transport of heat curves space as do heat sources.
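The three curvatures quoted in this subsection can be checked against one formula. For a metric of the form (9.10.2) the Gaussian curvature is $K = T^2\nabla^2\ln T = T(T'' + T'/r) - T'^2$, a standard result for conformally flat metrics that is assumed here rather than taken from the text. The sketch below evaluates it by central differences for the profiles (9.10.4), (9.10.6) and (9.10.9); all parameter values and function names are hypothetical, and the Bessel functions are summed from their power series.

```python
import math

def bessel_j0(x, terms=40):
    """Power series J0(x) = sum_m (-1)^m (x/2)^(2m) / (m!)^2."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((m + 1) ** 2)
    return total

def bessel_j1(x, terms=40):
    """Power series J1(x) = sum_m (-1)^m (x/2)^(2m+1) / (m! (m+1)!)."""
    total, term = 0.0, x / 2.0
    for m in range(terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((m + 1) * (m + 2))
    return total

def curvature(T, r, h=1e-4):
    """K = T (T'' + T'/r) - T'^2 for the metric (dr^2 + r^2 dphi^2) / T(r)^2."""
    d1 = (T(r + h) - T(r - h)) / (2.0 * h)
    d2 = (T(r + h) - 2.0 * T(r) + T(r - h)) / h**2
    return T(r) * (d2 + d1 / r) - d1**2

# Hypothetical parameter values, chosen only for illustration.
T0, sigma, k = 2.0, 1.0, 0.5
T1, r1 = 1.5, 1.0
mu = 2.404825557695773 / r1   # first zero of J0 divided by the rim radius

def source(r):
    return T0 - sigma * r**2 / (4.0 * k)      # (9.10.4)

def exterior(r):
    return T1 * (math.log(r / r1) + 1.0)      # (9.10.6)

def diffusive(r):
    return T0 * bessel_j0(mu * r)             # (9.10.9)

print(curvature(source, 0.7), -sigma * T0 / k)
print(curvature(exterior, 2.0), -(T1 / 2.0) ** 2)
x = mu * 0.5
print(curvature(diffusive, 0.5), -(T0 * mu) ** 2 * (bessel_j0(x) ** 2 + bessel_j1(x) ** 2))
print(curvature(diffusive, r1))   # remains finite and negative at the rim
```

Each numerical value agrees with its closed form to the accuracy of the finite differences: $-\sigma T_0/k$ for the source, minus the square of the prefactor of (9.10.6) divided by $r^2$ for the logarithmic profile, and (9.10.11) with $J_0' = -J_1$ for the Bessel profile.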
Moreover, the curvature (9.10.11) will remain finite at the rim of the disc, which is the coldest possible, because $J_0$ and $J_1 = -J_0'$ have no common zero. Even at zero temperature, there is finite curvature! The curvature (9.10.11) cannot distinguish between the diffusion equation (9.10.7) and a wave equation. This is contained in the multiplicative factor to (9.10.9) that would make it the complete solution. However, if it makes any sense to introduce a time component to the line element (9.10.10), the wave equation would have to be compatible with a hyperbolic-invariant form of the metric. Nevertheless, both the parabolic and hyperbolic equations of motion give the spatial component of the metric. This fact leaves much to be desired in assuming a hyperbolic-invariant form, implying the existence of thermal waves, as opposed to thermal diffusion. In a way analogous to how we went from a space of negative to positive curvature by exchanging a source for a sink, if we make the substitution $\mu \to i\mu$, the Bessel function, $J_0(ix) = I_0(x)$, becomes a modified Bessel function. Since
$$I_0(x) = 1 + \frac{x^2}{4} + \frac{x^4}{64} + \cdots,$$
the line element,
$$ds^2 = \frac{dr^2 + r^2 d\varphi^2}{T_0^2 I_0^2(\lambda r/r_1)},$$
will have positive constant curvature for large disc radii. This is, indeed, surprising inasmuch as one would think that circular and ordinary Bessel functions would apply to bounded, positive curvature, while hyperbolic and modified Bessel functions would be compatible with unbounded, negative curvature.
9.10.2
A matter of curvature
Geometries which are both homogeneous and isotropic have constant (Gaussian) curvature. Gaussian curvature is a measure of a surface’s intrinsic geometry, or the invariance of a surface to bending without stretching. As we know, there are three distinct simply connected isotropic geometries
in any dimension: Euclidean with zero curvature, elliptic with positive curvature, and hyperbolic with negative curvature. Homogeneity implies that there is at least one isometry that takes one point to another so that the points appear to be indistinguishable. Isotropy implies that all directions appear the same. The appearance of non-Euclidean geometries with constant curvature is rather rare because homogeneity and isotropy are very strong conditions which are seldom met with in cosmology. The fact that the inertial mass of a rotating system can be handled within the confines of constant negative curvature, while gravitational mass cannot, leads us to believe that the fields of acceleration of uniform rotation and gravitation are not equivalent. To transform the exterior solution of the Schwarzschild [16] metric into the interior one, possessing constant (negative) curvature, it is necessary to assume that the mass is not a function of the radius $r$. Then for objects in which the density, $\rho$, is essentially uniform, $M = (4\pi/3)\rho r^3$, and introducing this into the Schwarzschild metric renders it equivalent to the hyperbolic metric with constant Gaussian curvature, (9.6.17), where the absolute constant is given by (9.6.37). Gaussian curvature appears here as a relativistic effect, vanishing in the nonrelativistic limit as the speed of light increases without limit. Flatness can be achieved not only in the limit of a vanishing density, but also in the case where relativistic effects become negligible. The two cases of constant density and constant mass are distinguishable by the different slopes of the curve of the velocity of rotation of galaxies as a function of their distance from the galactic center. For distances less than $r_c = 2 \times 10^4$ light years the curve rises with a constant positive slope.
That means, if the centrifugal and gravitational forces just balance one another, the rotational velocity is proportional to the distance from the galactic center, with a slope fixed by the density, which remains essentially uniform. For distances greater than this value, the curve slopes downward, where the rotational velocity is now proportional to the inverse square root of the distance from the galactic center. This implies that the galactic mass is confined to a region whose volume has a radius less than $r_c$, for once outside this volume it appears that the mass is independent of the radius.
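The two regimes of the rotation curve follow from the balance $v^2/r = GM(r)/r^2$. The sketch below assumes a Newtonian, spherically symmetric mass distribution of uniform density inside $r_c$ and no mass outside it; the density value and the function name are hypothetical and serve only to illustrate the change of slope.

```python
import math

G_NEWTON = 6.674e-11   # gravitational constant in m^3 kg^-1 s^-2

def rotation_speed(r, rho, r_c):
    """Circular speed from v^2 / r = G M(r) / r^2 with uniform density rho inside r_c."""
    enclosed_radius = min(r, r_c)   # outside r_c the enclosed mass stops growing
    mass = (4.0 * math.pi / 3.0) * rho * enclosed_radius**3
    return math.sqrt(G_NEWTON * mass / r)

r_c = 1.9e20    # roughly 2 x 10^4 light years, in metres
rho = 1.0e-21   # hypothetical mean density in kg/m^3

for fraction in (0.25, 0.5, 1.0, 2.0, 8.0):
    print(fraction, rotation_speed(fraction * r_c, rho, r_c))
```

Inside $r_c$ the speed is $r\sqrt{4\pi G\rho/3}$ and doubles when the distance doubles; outside it falls as $1/\sqrt{r}$, halving when the distance is quadrupled.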
The transformation from constant density to one of constant mass necessitates replacing the metric coefficients (9.6.9a) and (9.6.9b) by
$$E = \frac{\kappa^2}{(1-\alpha/r)^2}, \tag{9.10.12a}$$
$$G = \frac{\kappa^2 r^2}{1-\alpha/r}, \tag{9.10.12b}$$
respectively. The Gaussian curvature (9.6.17),
$$K = -\frac{\alpha}{\kappa^2 r^3}\left(1 - \frac{3\alpha}{4r}\right), \tag{9.10.13}$$
will be negative provided $r > \tfrac34\alpha$. Although this distance is less than the Schwarzschild radius, the inequality means that the singularity cannot be approached without a change in the sign of curvature. The coexistence of elliptic and hyperbolic spaces depending on the distance from the singularity does seem rather surprising. However, distances less than $\alpha$ invalidate the positive definiteness of the stereographic inner product, and thus ensure the negativeness of the Gaussian curvature, (9.10.13). Under this transformation, the metric (9.6.8) transforms into
$$dh^2 = \frac{dr^2 + (r\,d\vartheta)^2(1-\alpha/r)}{(1-\alpha/r)^2},$$
where $1 - \alpha/r$ is the stereographic inner product for a non-constant, negative, curvature for $r > \alpha$.
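As a check on (9.10.13), the curvature of an orthogonal metric $E\,dr^2 + G\,d\vartheta^2$ can be computed from the standard expression $K = -(EG)^{-1/2}\,\partial_r[(\sqrt{G})_r/\sqrt{E}]$, which is assumed here to be the content of (9.6.17). The sketch below compares a finite-difference evaluation with the closed form for the coefficients (9.10.12a) and (9.10.12b); the values of $\alpha$ and $\kappa$ are hypothetical.

```python
import math

ALPHA, KAPPA = 1.0, 1.0   # hypothetical values, for illustration only

def sqrt_E(r):
    return KAPPA / (1.0 - ALPHA / r)                  # from (9.10.12a), valid for r > ALPHA

def sqrt_G(r):
    return KAPPA * r / math.sqrt(1.0 - ALPHA / r)     # from (9.10.12b), valid for r > ALPHA

def curvature_numeric(r, h=1e-4):
    """K = -(1/sqrt(EG)) d/dr[(sqrt G)_r / sqrt E], by nested central differences."""
    def inner(x):
        return (sqrt_G(x + h) - sqrt_G(x - h)) / (2.0 * h) / sqrt_E(x)
    return -(inner(r + h) - inner(r - h)) / (2.0 * h) / (sqrt_E(r) * sqrt_G(r))

def curvature_closed(r):
    """Closed form (9.10.13)."""
    return -ALPHA / (KAPPA**2 * r**3) * (1.0 - 3.0 * ALPHA / (4.0 * r))

print(curvature_numeric(2.0), curvature_closed(2.0))
print(curvature_closed(1.0), curvature_closed(0.75), curvature_closed(0.5))
```

The numerical and closed-form values agree for $r > \alpha$, and the closed form changes sign at $r = \tfrac34\alpha$, inside the radius $\alpha$ where the square roots above cease to be real.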
9.10.3
Schwarzschild’s metric: How a nobody became a one-body
Surely there has never been a more ludicrous attempt to prove a conclusion in physical science than this arbitrary fixation of a constant [which], with equal justification, might have been given any value we please.[e]
O’Rahilly [38]
[e] O’Rahilly’s comment about the rest energy applies equally to the constant of integration in Schwarzschild’s metric.
The transition from a system of constant mass to one of constant density is exemplified by the exterior and interior solutions to the Schwarzschild
metric. Schwarzschild [16] studied a static spherically symmetric field produced by a spherically symmetric body at rest. The static condition does not mean that $dt = 0$, but, rather, that the coefficients of the fundamental form do not depend upon time. In spherical coordinates, the line element is
$$ds^2 = E\,dr^2 + F r^2 d\sigma^2 - G\,dt^2, \tag{9.10.14}$$
where $d\sigma^2$ is given by
$$d\sigma^2 = d\vartheta^2 + \sin^2\vartheta\,d\varphi^2. \tag{9.10.15}$$
Since spherical symmetry is invoked, the coefficients of the fundamental form, E, G, and F can, at most, be functions of the radial coordinate. Einstein’s condition for empty space is that the Ricci tensor should vanish,
$$R_{\mu\nu} = 0. \tag{9.10.16}$$
According to Dirac [75], this constitutes a law of gravitation. ‘Empty’ here means that there is no matter present and no physical fields except the gravitational field. The gravitational field does not disturb the emptiness. Other fields do.
We repeat our statement made in the Introduction (Chapter 1): That gravity acts where matter and radiation are not does not seem credible. Since we are looking for a spherically symmetric solution, we need not consider the angular dependency in the metric. If we set F = 0, we do not find that the Ricci tensor components vanish, but only the contracted scalar curvature. The unknowns are determined by Einstein’s equations that involve the contracted Ricci tensor,
$$R_{\mu\nu} = \Gamma^\lambda_{\mu\lambda,\nu} - \Gamma^\lambda_{\mu\nu,\lambda} + \Gamma^\rho_{\mu\lambda}\Gamma^\lambda_{\nu\rho} - \Gamma^\rho_{\mu\nu}\Gamma^\lambda_{\rho\lambda}, \tag{9.10.17}$$
where the Einstein convention of summing over repeated suffixes is used, and the comma in the subscript indicates differentiation with respect to the coordinate that follows. The only nonvanishing Christoffel symbols of the
second kind are:
$$\Gamma^1_{11} = \frac{E_r}{2E}, \qquad \Gamma^1_{22} = \frac{G_r}{2E}, \qquad \Gamma^2_{12} = \frac{G_r}{2G}.$$
These expressions are to be substituted into (9.10.17). Since the only surviving components of the Ricci tensor are
$$R_{11} = \Gamma^2_{12,1} - \Gamma^1_{11}\Gamma^2_{12} + \left(\Gamma^2_{12}\right)^2, \qquad R_{22} = -\Gamma^1_{22,1} - \Gamma^1_{22}\Gamma^1_{11} + \Gamma^2_{21}\Gamma^1_{22},$$
where $1 = r$ and $2 = t$, we get
$$R_{11} = \left(\frac{G_r}{2G}\right)_r - \frac{E_r G_r}{4EG} + \frac{G_r^2}{4G^2} = \frac{\alpha r - \tfrac34\alpha^2}{r^4(1-\alpha/r)^2}, \tag{9.10.18}$$
$$R_{22} = -\left(\frac{G_r}{2E}\right)_r + \frac{G_r^2}{4EG} - \frac{G_r E_r}{4E^2} = -\frac{\alpha r - \tfrac34\alpha^2}{r^2(1-\alpha/r)}. \tag{9.10.19}$$
It is apparent that neither (9.10.18) nor (9.10.19) vanishes. But upon dividing (9.10.18) by E and (9.10.19) by G, their sum, or total scalar curvature, R, does vanish, i.e.
$$R = \frac{R_{11}}{E} + \frac{R_{22}}{G} = \frac1E\left(\frac{G_r}{2G}\right)_r - \frac1G\left(\frac{G_r}{2E}\right)_r - \frac{E_r G_r}{2E^2 G} + \frac{G_r^2}{2G^2 E} = \frac{\alpha r - \tfrac34\alpha^2}{\kappa^2 r^4} - \frac{\alpha r - \tfrac34\alpha^2}{\kappa^2 r^4} = 0. \tag{9.10.20}$$
So for a spherically symmetric solution, F = 0, (9.10.18) and (9.10.19) do not vanish. It is only when we set F = 1 in (9.10.14) that they vanish separately. The big question is why should the angular dependence make a difference when gravity acts radially? At least we can say Newtonian gravity acts radially, and if there are angle dependencies, like those in Ampère’s law, these angular dependencies should be universal and follow some law. That the Ricci tensor $R_{\mu\nu} \neq 0$ for the spherically symmetric solution, while $R_{\mu\nu} = 0$ when the angle dependencies are included makes the criterion for empty space, (9.10.16), extremely dubious.
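The statements about (9.10.18) to (9.10.20) can be verified numerically. The sketch below evaluates the two Ricci components from the middle members of (9.10.18) and (9.10.19) by central differences, taking the coefficients E and G from (9.10.12a) and (9.10.12b); the values of $\alpha$ and $\kappa$, the sample radius, and the function names are hypothetical.

```python
ALPHA, KAPPA = 1.0, 1.0   # hypothetical values, for illustration only

def E(r):
    return KAPPA**2 / (1.0 - ALPHA / r) ** 2       # (9.10.12a)

def G(r):
    return KAPPA**2 * r**2 / (1.0 - ALPHA / r)     # (9.10.12b)

def diff(f, r, h=1e-4):
    """Central-difference derivative of f at r."""
    return (f(r + h) - f(r - h)) / (2.0 * h)

def ricci(r):
    """Return (R11, R22) from the middle members of (9.10.18) and (9.10.19)."""
    Er, Gr = diff(E, r), diff(G, r)
    R11 = (diff(lambda x: diff(G, x) / (2.0 * G(x)), r)
           - Er * Gr / (4.0 * E(r) * G(r)) + Gr**2 / (4.0 * G(r) ** 2))
    R22 = (-diff(lambda x: diff(G, x) / (2.0 * E(x)), r)
           + Gr**2 / (4.0 * E(r) * G(r)) - Gr * Er / (4.0 * E(r) ** 2))
    return R11, R22

r = 2.0
R11, R22 = ricci(r)
numerator = ALPHA * r - 0.75 * ALPHA**2
print(R11, numerator / (r**4 * (1.0 - ALPHA / r) ** 2))
print(R22, -numerator / (r**2 * (1.0 - ALPHA / r)))
print(R11 / E(r) + R22 / G(r))   # scalar curvature (9.10.20)
```

Neither component vanishes at the sample radius, yet the weighted sum is zero to the accuracy of the differences, in line with (9.10.20).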
We want now to compare criterion (9.10.16) with what we know from differential geometry. However, the metric (9.10.14) is an indefinite form. It can be made definite by substituting $\tau$ for $it$ as the independent variable. Then since the coefficients of the fundamental form depend only on the radial coordinate we can set F = 0 and obtain $ds^2 = E(r)dr^2 + G(r)d\tau^2$. In the Schwarzschild solution F is set equal to unity, and all the radial dependencies fall on the coefficients E and G of the fundamental form. This has the effect of changing the sign of $\Gamma^1_{22}$ so that $R_{22}$ changes sign, but does not vanish. The total curvature is
$$R = \frac{G_{rr}}{GE} - \frac{G_r E_r}{2GE^2} - \frac{G_r^2}{2EG^2} = \frac{2}{\sqrt{EG}}\left(\frac{(\sqrt G)_r}{\sqrt E}\right)_r = -2K,$$
where K is the Gaussian curvature, (9.6.17). This is a particular form of the general relation,
$$K = -\frac{1}{n(n-1)}R,$$
for n = 2 dimensions. However, we cannot attach any significance to the imaginary time variable, $\tau$, in determining the curvature of space-time, for the time has no significance in terms of curvature. Clocks may run slower in a gravitational field, but that is the effect of the gravitational field on time keepers, and not the effect that time has on the field. And the reason why clocks do run slower in a gravitational field certainly has nothing to do with a shift in frequency due to the Doppler effect because velocities do not enter at all. Even more can be said about the outer solution, when we try to match the two conditions at the radius, $r_1$, of the sphere, i.e.
$$1 - \frac{\alpha}{r_1} = 1 - \frac{r_1^2}{R^2}. \tag{9.10.21}$$
If such a relation were valid for any generic r, we might try and set
$$1 - r^2/R^2 = \sqrt{1-\alpha/r} \simeq 1 - \alpha/2r, \tag{9.10.22}$$
for relatively weak fields. But, if we are to replace this in the spatial metric for the outer solution,
$$dl^2 = \frac{dr^2}{1-\alpha/r} + r^2 d\varphi^2, \tag{9.10.23}$$
the angular term must also change. For if we want to replace $(1-\alpha/r)^2$ by $(1 - 2r/R)$ $(1 - r^2/R^2)^2$, for large R, we must recall that the hyperbolic line element, $dl^2 = dx^2 + dy^2 - dz^2$, becomes the Beltrami metric,
$$dl^2 = \frac{dr^2}{(1 - r^2/R^2)^2} + \frac{r^2 d\varphi^2}{1 - r^2/R^2},$$
or, equivalently,
$$dl^2 = d\bar r^2 + R^2\sinh^2(\bar r/R)\,d\varphi^2, \tag{9.10.24}$$
since $\bar r = R\tanh^{-1}(r/R)$, under the pseudospherical coordinates, $R$, $\bar r$, and $\varphi$, for which
$$z = R\cosh(\bar r/R), \qquad x = R\sinh(\bar r/R)\cos\varphi, \qquad y = R\sinh(\bar r/R)\sin\varphi, \tag{9.10.25}$$
where $0 \le \bar r < \infty$, and $0 \le \varphi < 2\pi$. Thus, for weak fields, $\alpha$, that imply large absolute constant, R, according to (9.10.22), the Beltrami metric, (9.10.24), can be used for the space part of the outer Schwarzschild metric, (9.10.23). We will now show that the transition that occurs at the surface $r_1$ is one from a hyperbolic metric, (9.10.24), for $r > r_1$ to an elliptic metric, [cf. (9.10.29) below] for $r < r_1$.
9.10.4
Schwarzschild’s metric: The inside story
Landau and Lifshitz [75] would contest the existence of the inner solution. For a field in the interior of a spherical cavity in a centrally symmetric distribution, we must have [E = G = 1], since otherwise the metric would be singular at r = 0. Thus the metric inside such a cavity is automatically Galilean, i.e. there is no gravitational field in the interior of the cavity (just as in Newtonian theory).
It is incomprehensible why the boundaries of the disc would pose such a problem as to warrant reducing the geometry on its interior to a Euclidean one. The coefficients of the fundamental form, (9.10.14), are [Møller 52]
$$E(r) = \frac{1}{1 - 2M/r - \lambda r^2/3}, \tag{9.10.26}$$
where M and $\lambda$ are constants, F = 1, and G = 0. For the exterior solution, $\lambda$ is set equal to zero, while for the interior solution, M = 0 [Møller 52]. However, if we set $\lambda = 8\pi\rho$, this can be seen as a transition from one of constant mass, M, to one of constant density, $\rho$. But why should they not be mutually exclusive in (9.10.26)? Or is it an artifice to transfer from the outer to inner solutions? Then the line element to consider is
$$dl^2 = \frac{dr^2}{1 - 2M/r - \lambda r^2/3} + r^2 d\varphi^2, \tag{9.10.27}$$
and to simplify matters still further we have set $\vartheta = \pi/2$, placing us in the plane. The exterior solution, where $\lambda = 0$, falls outside the domain of non-Euclidean geometries of constant curvature, but the interior solution, where M = 0, certainly does come under their jurisdiction. For then (9.10.27) becomes the line element of elliptic space,
$$dl^2 = \frac{dr^2}{1 - r^2/R^2} + r^2 d\varphi^2, \tag{9.10.28}$$
or, equivalently,
$$dl^2 = d\hat r^2 + R^2\sin^2(\hat r/R)\,d\varphi^2, \tag{9.10.29}$$
of positive, constant curvature, $1/R^2$, where
$$\hat r = R\sin^{-1}(r/R), \tag{9.10.30}$$
and $R = \sqrt{3/\lambda}$ is the absolute constant. Now comes the crux of the matter: If we hold r constant in (9.10.27) with M = 0, we obtain the periphery of a circle with length $2\pi r$. Thus comes the conclusion that “the geometry of the surface r = r1 = const. is the same as on a sphere of radius r1 in Euclidean space” [Møller 52]. This is inaccurate since once the radial part of the metric (9.10.28) is given, the angular part is
that in (9.10.29). And at constant r, the angular part can be integrated from 0 to $2\pi$ to give
$$\int_0^{2\pi} R\sin(\hat r/R)\,d\varphi = 2\pi R\sin(\hat r/R) < 2\pi\hat r.$$
We would indeed measure a larger circumference than what the Poincarites would measure. We would say that our rulers have undergone a space dilatation. In contrast to what Møller purported, that we would see no difference in the circumference, our standard rulers do not give $r_1$ as the distance from the origin, $r = 0$, but, rather,
$$\hat r_1 = \int_0^{r_1}\frac{dr}{\sqrt{1 - r^2/R^2}} = R\sin^{-1}\left(\frac{r_1}{R}\right),$$
which is noticeably larger than $r_1$ because our ‘standard’ rulers are Euclidean rulers! Thus, the internal solution to Schwarzschild’s problem is an example of elliptic geometry with constant curvature. It appears as the antithesis of the uniformly rotating disc. Rulers measuring the circumference of the disc appear stretched. The Schwarzschild problem is not a single problem for it entails transiting from a metric where $\lambda = 0$ and $\alpha \neq 0$ to one of $\alpha = 0$ and $\lambda \neq 0$.
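The two inequalities used above, $\hat r_1 > r_1$ and $2\pi R\sin(\hat r/R) < 2\pi\hat r$, can be illustrated with a short calculation based on (9.10.29) and (9.10.30). The function names and the sample values $R = 1$, $r = 0.6$ are hypothetical.

```python
import math

def elliptic_radius(r, R):
    """Radial distance of (9.10.30): r_hat = R asin(r / R), for 0 <= r < R."""
    return R * math.asin(r / R)

def circumference(r_hat, R):
    """Angular part of (9.10.29) integrated over 0 <= phi < 2 pi."""
    return 2.0 * math.pi * R * math.sin(r_hat / R)

R, r = 1.0, 0.6                  # hypothetical absolute constant and coordinate radius
r_hat = elliptic_radius(r, R)

print(r_hat, r)                                       # r_hat exceeds r
print(circumference(r_hat, R), 2.0 * math.pi * r)     # the circumference equals 2 pi r
print(circumference(r_hat, R), 2.0 * math.pi * r_hat) # and is smaller than 2 pi r_hat
```

The circumference is $2\pi r$ in the coordinate $r$, but it falls short of $2\pi$ times the measured radial distance $\hat r$, which is the signature of positive curvature.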
9.11
Cosmological Models
9.11.1
The general projective metric in the plane
Surprisingly, cosmological models with constant densities would correspond to the Schwarzschild inner solution, but with more options available. Everyone, or almost everyone, begins with the Friedmann–Lemaître–Robertson–Walker metric,
$$ds^2 = -dt^2 + R^2(t)\left[\frac{dr^2}{1-kr^2} + r^2 d\sigma^2\right], \tag{9.11.1}$$
where the parameter k determines the spatial curvature, t is ‘cosmic’ time (whatever that is), and R(t) is the scale factor. For $k = +1$ the spatial sections correspond to a sphere, or one of higher dimensions; for $k = -1$ the spatial
sections correspond to a universe with negatively curved sections, and finally for $k = 0$, the spatial sections are flat. As we know, we can write (9.11.1) equally as well as
$$ds^2 = -dt^2 + R^2(t)\left[d\chi^2 + k^{-1}\sin^2\chi\,d\sigma^2\right], \tag{9.11.2}$$
where we introduce the ‘angle,’ $\chi$, in place of the coordinate r according to $\chi = \sin^{-1}(\sqrt{k}\,r)$. Here, the parameter k has turned up elsewhere, and that elsewhere is subsequently set equal to zero on the basis of isotropy, so that the only way to go from a closed to an open model is through the transformation $\chi \to i\chi$. Furthermore, by the transformation [Rindler 77],
$$r = \frac{\rho}{1 + \tfrac14 k\rho^2}, \tag{9.11.3}$$
(9.11.1) can be written as
$$ds^2 = -dt^2 + R^2(t)\left[\frac{d\rho^2 + \rho^2 d\sigma^2}{(1 + \tfrac14 k\rho^2)^2}\right], \tag{9.11.4}$$
whose spatial part is still determined by the sign of k. We recognize the terms in the square brackets of (9.11.4) as the stereographic inner product metric: for $k > 0$ it is elliptic while for $k < 0$ it is hyperbolic. The latter has occupied our attention in Sec. 7.4. If $\rho$ is the hyperbolic distance, the r in the transform (9.11.3) cannot be. While it is true that multiplying the metric by a factor $R^2$ reduces the curvature to $k/R^2$, this says nothing about the size of the disc itself. In both the elliptic and hyperbolic cases the radius,
$$r_0 = 1/\sqrt{k}, \tag{9.11.5}$$
is a constant! So any scale factor, R(t), will have no effect upon the plane where the Poincarites live. It is usually argued that the volume in elliptic space is independent of k, an absolute constant. The volume of a cone of length $r_1$ and solid angle $\omega$ is [Rindler 77]
$$V(t) = \omega R^3(t)\int_0^{r_1}\frac{r^2\,dr}{\sqrt{1-kr^2}} = \omega R^3(t)\int_0^{r_1} r^2\left(1 + \tfrac12 kr^2 + \cdots\right)dr,$$
which to lowest-order is independent of k.
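That the substitution (9.11.3) carries the radial part of (9.11.1) into that of (9.11.4) amounts to the identity $(dr/d\rho)^2/(1 - kr^2) = 1/(1 + \tfrac14 k\rho^2)^2$. The sketch below checks it by finite differences for both signs of k; the function names and the sample value of $\rho$ are hypothetical.

```python
def r_of_rho(rho, k):
    """The transformation (9.11.3)."""
    return rho / (1.0 + 0.25 * k * rho**2)

def radial_factors(rho, k, h=1e-6):
    """Return the radial metric coefficient computed from (9.11.1) and from (9.11.4)."""
    dr_drho = (r_of_rho(rho + h, k) - r_of_rho(rho - h, k)) / (2.0 * h)
    r = r_of_rho(rho, k)
    from_9_11_1 = dr_drho**2 / (1.0 - k * r**2)
    from_9_11_4 = 1.0 / (1.0 + 0.25 * k * rho**2) ** 2
    return from_9_11_1, from_9_11_4

for k in (1.0, -1.0):
    print(k, radial_factors(0.5, k))
```

The angular coefficients agree trivially, since $r^2 = \rho^2/(1 + \tfrac14 k\rho^2)^2$ follows directly from (9.11.3).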
A closer, and more precise, derivation of the expression for the volume in elliptic space shows that this is not true, but only approximately so in the Euclidean limit. For the elliptic line element (9.11.2), the volume element is
$$dV(t) = R^3(t)\,k^{-1}\sin^2(\sqrt{k}\,r)\sin\vartheta\,dr\,d\vartheta\,d\varphi.$$
For a sphere of radius $r < \pi/2\sqrt{k}$, the volume is then [cf. (9.2.10)]
$$R^3(t)\,k^{-1}\int_0^{2\pi}d\varphi\int_0^{\pi}\sin\vartheta\,d\vartheta\int_0^{r}\sin^2(\sqrt{k}\,s)\,ds = 2\pi\left(\frac{R(t)}{\sqrt k}\right)^3\left[\sqrt{k}\,r - \sin(\sqrt{k}\,r)\cos(\sqrt{k}\,r)\right]. \tag{9.11.6}$$
As (9.11.6) clearly shows, only in the limit as $k \to 0$ does the term in the brackets tend to $\tfrac23 k^{3/2}r^3$. This is the Euclidean limit of an infinite radius of curvature in which k disappears from the expression for the volume. So if r is not the radial coordinate in the non-Euclidean plane then just what is it? For the case k = 1, it is given in Fig. 9.9. It supposedly represents a ‘geodesic plane’ through the origin O obtained by setting $\vartheta = \pi/2$ in the metrics (9.11.1) or (9.11.4). The disc of radius 1 is just the elliptic plane where rulers get longer as they move further from O. Points close to the rim have very small stereographic arc lengths since they correspond to circles about the north pole on the sphere.
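The dependence of the volume (9.11.6) on k, and its Euclidean limit, can be made concrete numerically. Expanding the bracket for small argument gives $x - \sin x\cos x \simeq \tfrac23 x^3$, so the volume should approach $(4\pi/3)(Rr)^3$ as $k \to 0$. The sketch below is an illustration with hypothetical function names and sample values.

```python
import math

def elliptic_volume(r, k, scale=1.0):
    """Volume (9.11.6) of a sphere of radius r < pi / (2 sqrt k); scale stands for R(t)."""
    x = math.sqrt(k) * r
    return 2.0 * math.pi * (scale / math.sqrt(k)) ** 3 * (x - math.sin(x) * math.cos(x))

def euclidean_volume(r, scale=1.0):
    """Euclidean value (4 pi / 3) (R r)^3."""
    return 4.0 * math.pi / 3.0 * (scale * r) ** 3

for k in (1.0, 1e-2, 1e-4, 1e-6):
    print(k, elliptic_volume(1.0, k) / euclidean_volume(1.0))
```

The ratio departs from unity at finite k and tends to unity only as k becomes small, so the volume is independent of k only in the Euclidean limit.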
Fig. 9.9. Interpretation of the variables of the two metrics which are the radii of the elliptic plane.
Admittedly, in the hyperbolic case, “no such simple interpretation of ρ and r exist... Light propagates along geodesics, e.g. along great circles on the sphere and straight lines in the plane” [Rindler 77]. But, what does light propagate along in hyperbolic geometry? This answer we already know: along geodesics on the pseudosphere. Let us recall, from Sec. 2.5, what Beltrami did back in 1868. He mapped a negatively curved surface onto a unit disc. He took the disc as the plane, and lines within the disc as a measure of the distance between preimage points on the negatively curved surface. The distance between any two points is thus meaningful for all points in the unit disc. As one of the points tends to the rim of the disc, the distance tends to infinity so that the plane, and the lines in it, are indeed infinite. There is nothing beyond infinity and it makes no sense to consider expansion factors greater than unity. Beyond the rim no stereographic projection takes place so R(t) > 1 would lie outside the elliptic plane, whose geometry is unknown, but certainly not that of the elliptic plane. In Sec. 111 of Landau and Lifshitz [75] it is observed that since the radius of curvature in the ‘closed’ universe metric is (9.10.29), the way to cross over to negative curvature is “by replacing [R] by [iR].” For then
$$dl^2 = \frac{dr^2}{1+kr^2} + r^2 d\sigma^2, \tag{9.11.7}$$
or what should amount to the same thing,
$$dl^2 = R^2\{d\chi^2 + \sinh^2\chi\,d\sigma^2\}, \tag{9.11.8}$$
where $k = -1/R^2 > 0$, $\chi = \sinh^{-1}(\sqrt{k}\,r)$, and the ‘angle,’ $\chi$, can go from 0 to $\infty$. It cannot be over-emphasized that (9.11.7) is not a hyperbolic metric! Landau and Lifshitz, as well as all previous authors, should have realized that $\chi = \tanh^{-1}(\sqrt{k}\,r)$, and not $\chi = \sinh^{-1}(\sqrt{k}\,r)$, is the hyperbolic radius. Landau and Lifshitz also fail to get the surface area of the sphere correctly. The Euclidean radius, $(1/\sqrt{k})\tanh\chi$, has to be multiplied by the ratio of the
hyperbolic to the Euclidean lengths of arc, $1/\sqrt{1 - kr^2} = \cosh\chi$, and this value times $2\pi r$ gives the length of the circumference of a hyperbolic circle of radius $\chi$ as $(2\pi/\sqrt{k})\sinh\chi$. The Euclidean element of area is

\[ \sqrt{EG - F^2}\;dr\,d\varphi, \tag{9.11.9} \]

for $\vartheta = \pi/2$. Thus, Landau and Lifshitz would obtain

\[ \int_0^{2\pi}\!\!\int_0^{r} \frac{r\,dr\,d\varphi}{\sqrt{1 + kr^2}} = \frac{2\pi}{k}\left[\sqrt{1 + kr^2} - 1\right] = \frac{2\pi}{k}\left[\cosh\chi - 1\right] = \frac{4\pi}{k}\sinh^2(\chi/2), \tag{9.11.10} \]

which is correct, but for the wrong reason. The surface area, $(4\pi/k)\sinh^2\chi$, they claim, arises because the radius is $(1/\sqrt{k})\sinh\chi$. The correct expression is obtained by inserting the coefficients of the fundamental form,

\[ E = \frac{1}{(1 - kr^2)^2}, \qquad G = \frac{r^2}{1 - kr^2}, \]

into expression (9.11.9). Integration then gives

\[ \int_0^{2\pi}\!\!\int_0^{r} \frac{r\,dr\,d\varphi}{(1 - kr^2)^{3/2}} = \frac{2\pi}{k}\left[\frac{1}{\sqrt{1 - kr^2}} - 1\right] = \frac{2\pi}{k}(\cosh\chi - 1) = \frac{4\pi}{k}\sinh^2(\chi/2), \tag{9.11.11} \]

which is the same expression as Landau and Lifshitz would have found, (9.11.10), but, again, for the wrong reasons. The metric (9.11.7) can be derived on a geometrical analogy by considering the geometry of an isotropic three-dimensional surface embedded in a fictitious four-dimensional space, where the fourth coordinate, so we are told, has nothing to do with time. Then, by extending our imagination as well as the Pythagorean theorem, we require $R^2 = x^2 + y^2 + z^2 + w^2$ to be constant, where $R$ represents the radius of the hypersphere. The surface of this hypersphere will then be identified with our universe. Set $r^2 = x^2 + y^2 + z^2$
to be the square of the radius of our three-dimensional universe, so that $R^2 = r^2 + w^2 = \text{const}$. Differentiating, we get $r\,dr = -w\,dw$; squaring and eliminating $w^2$ in favor of $R^2$ gives

\[ dw^2 = \frac{r^2\,dr^2}{R^2 - r^2}. \]

Adding this term to the metric $dr^2 + r^2\,d\sigma^2$ gives

\[ dl^2 = \frac{R^2\,dr^2}{R^2 - r^2} + r^2\,d\sigma^2, \]
and the transformation $R \to iR$ reproduces the spatial metric (9.11.7), with $k = 1/R^2$. How does the spatial part of the Robertson–Walker metric (9.11.1) stand up against the most general projective metric for the plane? All we have to do is to consider Beltrami’s derivation of his metric in Sec. 2.4, from whose paper we took the title of this section. Consider the equation for the fundamental conic section, $\Omega = 0$. Then consider two points $x$ and $y$. We will then have three expressions $\Omega_{xx}$, $\Omega_{xy}$, and $\Omega_{yy}$. The cross-ratio of the two points to the two points where the line connecting them meets the conic is given by the quotient of the roots, $\lambda_+/\lambda_-$, of the quadratic equation [cf. Sec. 2.5],

\[ \Omega_{xx}\lambda^2 - 2\lambda\,\Omega_{xy} + \Omega_{yy} = 0. \]

The cross-ratio is

\[ \frac{\lambda_+}{\lambda_-} = \frac{\Omega_{xy} + \sqrt{\Omega_{xy}^2 - \Omega_{xx}\Omega_{yy}}}{\Omega_{xy} - \sqrt{\Omega_{xy}^2 - \Omega_{xx}\Omega_{yy}}}, \]

and its logarithm

\[ k\ln\frac{\Omega_{xy} + \sqrt{\Omega_{xy}^2 - \Omega_{xx}\Omega_{yy}}}{\Omega_{xy} - \sqrt{\Omega_{xy}^2 - \Omega_{xx}\Omega_{yy}}}, \]

is the distance between the two points. The distance depends on the unit of measurement, which is given by the absolute constant, $k$. Since the logarithm is twice $\tanh^{-1}$, we can express the distance between $x$ and $y$ as

\[ 2k\cosh^{-1}\frac{\Omega_{xy}}{\sqrt{\Omega_{xx}\Omega_{yy}}} = \pm 2ki\cos^{-1}\frac{\Omega_{xy}}{\sqrt{\Omega_{xx}\Omega_{yy}}}. \]
If the two points are infinitesimally close together, $y = x + dx$, we can approximate the distance,

\[ 2ik\sin^{-1}\frac{\sqrt{\Omega_{xx}\Omega_{dx\,dx} - \Omega_{x\,dx}^2}}{\Omega_{xx}}, \]

by the argument itself, and come out with the square of the arc length as

\[ d\ell^2 = 4k^2\,\frac{\Omega_{x\,dx}^2 - \Omega_{xx}\Omega_{dx\,dx}}{\Omega_{xx}^2}. \]

Taking the fundamental conic as a circle of radius $2k$, $x^2 + y^2 = 4k^2$, we get

\[ \Omega_{xx} = x^2 + y^2 - 4k^2, \qquad \Omega_{dx\,dx} = dx^2 + dy^2, \qquad \Omega_{x\,dx} = x\,dx + y\,dy. \]

Thus, the square of the line element can be brought into the form

\[ d\ell^2 = 4k^2\,\frac{4k^2(dx^2 + dy^2) - (y\,dx - x\,dy)^2}{(4k^2 - x^2 - y^2)^2}. \tag{9.11.12} \]

Furthermore, if we introduce the polar coordinates, $x = r\cos\sigma$ and $y = r\sin\sigma$, then (9.11.12) becomes

\[ d\ell^2 = \frac{dr^2}{\left(1 - \tfrac{1}{4}r^2/k^2\right)^2} + \frac{r^2\,d\sigma^2}{1 - \tfrac{1}{4}r^2/k^2}. \tag{9.11.13} \]
As (9.11.13) does not correspond to the spatial component of the Robertson–Walker metric, (9.11.1), the latter has led to the confusion of identifying the Euclidean measure of distance, $r = 2k\tanh(\bar{r}/2k)$, with $2\pi\sinh(\bar{r}/2k)$, the circumference of a hyperbolic circle of radius $\bar{r}$.
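The relation between the projective (cross-ratio) distance and the line element (9.11.13) can be checked numerically. The following is only a minimal sketch, under the assumption that the bilinear form of the conic $x^2 + y^2 = 4k^2$ is taken with the sign that makes it positive inside the disc (the helper names `omega`, `cayley_distance` and `radial_length` are hypothetical, not the author's). It compares the $2k\cosh^{-1}$ expression, a direct radial integration of (9.11.13), and the closed form $2k\tanh^{-1}(r/2k)$.

```python
import math

def omega(p, q, k):
    # Bilinear form of the conic x^2 + y^2 = 4k^2.
    # Assumption of this sketch: overall sign chosen so that omega(p, p, k) > 0 inside the disc.
    return 4.0 * k * k - (p[0] * q[0] + p[1] * q[1])

def cayley_distance(p, q, k):
    # Projective distance 2k * arccosh( O_pq / sqrt(O_pp * O_qq) ).
    ratio = omega(p, q, k) / math.sqrt(omega(p, p, k) * omega(q, q, k))
    return 2.0 * k * math.acosh(max(ratio, 1.0))  # clamp guards against rounding below 1

def radial_length(r, k, steps=20000):
    # Midpoint-rule integral of dr / (1 - r^2 / (4 k^2)), the radial part of (9.11.13).
    h = r / steps
    return sum(h / (1.0 - ((i + 0.5) * h) ** 2 / (4.0 * k * k)) for i in range(steps))

k = 1.5
r = 2.0                      # Euclidean radius, must stay below 2k
origin, mid, point = (0.0, 0.0), (1.0, 0.0), (r, 0.0)

d_projective = cayley_distance(origin, point, k)
d_metric = radial_length(r, k)
d_closed = 2.0 * k * math.atanh(r / (2.0 * k))
d_split = cayley_distance(origin, mid, k) + cayley_distance(mid, point, k)

print(d_projective, d_metric, d_closed, d_split)
```

All four numbers should agree to within the quadrature error, which illustrates that the Euclidean radius is the hyperbolic tangent of the hyperbolic length, and that the projective distance is additive along a line.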
9.11.2 The expanding Minkowski universe
The Minkowski metric,

\[ ds^2 = d\tau^2 - d\rho^2 - \rho^2\,d\sigma^2, \tag{9.11.14} \]

may be transformed by

\[ \rho = t\sinh\chi, \tag{9.11.15a} \]

\[ \tau = t\cosh\chi, \tag{9.11.15b} \]

into the metric,

\[ ds^2 = dt^2 - t^2\left[d\chi^2 + \sinh^2\chi\,d\sigma^2\right]. \tag{9.11.16} \]
The metric (9.11.16) appears to have an expansion factor $R(t) = t$ that the flat metric (9.11.14) does not [cf. Eq. (8.7.5) and following discussion]. The space part is the metric of the hyperbolic plane, and $v = \rho/\tau = \tanh\chi$ would be associated with a recessional velocity of a galaxy ‘at coordinate distance’ $\chi$. But, from (9.11.15a) it would appear that $\chi$ is the ‘distance’ $\sinh^{-1}(\rho/t)$. If this were the case, the rim would not be infinitely far away! In contrast, distance is defined in elliptic geometry by showing that it satisfies the triangle inequality, as we shall do in Sec. 9.11.3. Miraculously, we have converted a flat space metric, (9.11.14), into a hyperbolic metric, (9.11.16), which has larger circumferences, areas and volumes than its Euclidean counterparts. We would also observe that (9.11.15b) is the time dilatation of the special theory, i.e. $\tau = t/\sqrt{1 - v^2}$. All this we have obtained from a seemingly innocuous transformation, (9.11.15a). But is it really innocuous? Recall a similar transformation of the pseudospherical coordinates, (9.10.25). There the hyperbolic invariancy condition was $z^2 - x^2 - y^2 = R^2 = \text{const}$. Now, the similar condition on the transformations (9.11.15a) and (9.11.15b) gives $\tau^2 - \rho^2 = t^2 = \text{const}$. We might have imagined this when we were unable to express the metric (9.11.16) in terms of its radial coordinates instead of its ‘angle,’ $\chi$.
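The algebra behind (9.11.15) and (9.11.16) can be spot-checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the angular part is dropped ($d\sigma = 0$) as an assumption. It verifies the invariant $\tau^2 - \rho^2 = t^2$, the identification $v = \tanh\chi$, the time-dilatation form of (9.11.15b), and the equality of the two radial intervals.

```python
import math

def to_minkowski(t, chi):
    # Transformation (9.11.15): tau = t cosh(chi), rho = t sinh(chi).
    return t * math.cosh(chi), t * math.sinh(chi)

def interval_minkowski(t, chi, dt, dchi):
    # d tau^2 - d rho^2, using the exact Jacobian of (9.11.15).
    dtau = math.cosh(chi) * dt + t * math.sinh(chi) * dchi
    drho = math.sinh(chi) * dt + t * math.cosh(chi) * dchi
    return dtau ** 2 - drho ** 2

def interval_expanding(t, dt, dchi):
    # dt^2 - t^2 dchi^2, the radial part of (9.11.16) with d sigma = 0.
    return dt ** 2 - (t * dchi) ** 2

t, chi, dt, dchi = 2.0, 0.8, 1e-3, 2e-3
tau, rho = to_minkowski(t, chi)
v = rho / tau

print(tau ** 2 - rho ** 2, t ** 2)             # hyperbolic invariant
print(v, math.tanh(chi))                       # 'recessional velocity'
print(tau, t / math.sqrt(1.0 - v ** 2))        # time-dilatation form of (9.11.15b)
print(interval_minkowski(t, chi, dt, dchi), interval_expanding(t, dt, dchi))
```

Each printed pair should coincide up to rounding, which is the sense in which the ‘expanding’ form is only a relabeling of flat space.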
Fig. 9.10. The three possible scenarios of closed, flat and open universes. The freckles are the galaxies which are more or less evenly distributed.
9.11.3 Event horizons
Consider the first model, $k > 0$, of a spherical, closed universe in Fig. 9.10. If the sphere is being blown up like a rubber balloon, there will be photons that will never have the chance to reach us. Our own galaxy has been circled and the solid line is the geodesic that a photon would take to reach us. The dashed line separates those photons that reach us from those that do not in time $t = t_0$. In the three-dimensional model this would be represented as a light front called the ‘event horizon.’ The rays, or null geodesics, are determined from (9.11.1) by setting $ds = 0$, and since we can avail ourselves of spherical symmetry, we can put $d\sigma = 0$, leaving

\[ \chi(\rho_0) = \int_0^{\rho_0}\frac{d\rho}{\sqrt{1 - k\rho^2}} = \int_0^{t_0}\frac{dt}{R(t)}, \tag{9.11.17} \]

as the definition of the ‘coordinate’ horizon, corresponding to the distance, in comoving coordinates, that a photon has traveled to arrive at an observer at $t_0$ when it started at the beginning of the universe. The ‘proper’ distance to the event horizon is defined as

\[ \bar{\rho}(\rho_0) = R(t_0)\chi(\rho_0) = R(t_0)\int_0^{\rho_0}\frac{d\rho}{\sqrt{1 - k\rho^2}} = R(t_0)\int_0^{t_0}\frac{c\,dt}{R(t)}. \tag{9.11.18} \]

Now for each of the scenarios depicted in Fig. 9.10, the coordinate horizon will be given by

\[ \chi(\rho_0) = \int_0^{\rho_0}\frac{d\rho}{\sqrt{1 - k\rho^2}} = \begin{cases} \sin^{-1}\rho_0 & (k = 1), \\ \rho_0 & (k = 0), \\ \sinh^{-1}\rho_0 & (k = -1). \end{cases} \tag{9.11.19} \]
But, there are two solutions to $ds^2 = 0$, so we can have, equally as well, the solutions

\[ \chi(\rho_0) = -\int_0^{\rho_0}\frac{d\rho}{\sqrt{1 - k\rho^2}} = \begin{cases} \cos^{-1}\rho_0 & (k = 1), \\ -\rho_0 & (k = 0), \\ \cosh^{-1}\rho_0 & (k = -1). \end{cases} \tag{9.11.20} \]

As we have already mentioned, Landau and Lifshitz [75] always consider the spatial metric written in the form:

\[ d\ell^2 = \frac{d\rho_0^2}{1 - \rho_0^2} + \rho_0^2\,d\sigma^2. \tag{9.11.21} \]
For the closed and flat universes, $\rho_0$ can be considered as the Euclidean distance from the origin, which can be chosen anywhere we please. This is because $\cos^{-1}\rho$ is a distance. Imagine a creature X at the north pole, and another creature Y creeping away from it. As Y heads towards the equator his image as seen by X becomes smaller and smaller. Once south of the equator, Y’s image, as viewed by X, begins to grow again until he reaches the south pole. On completing his trip around the world, Y returns to the north pole only to find his head pointing in the opposite direction. The elliptic plane has only one side! It is easy to show that $\cos^{-1}\rho$ is a distance because it satisfies the triangle inequality [Busemann and Kelly 53, p. 213]. The triangle inequality states that the sum of the lengths of two sides of a triangle can never be less than the length of the third. Consider the normalized coordinates $(x_1, y_1)$ and $(x_2, y_2)$ such that $x_1^2 + x_2^2 = 1$ and $y_1^2 + y_2^2 = 1$, with $x_2, y_2 > 0$. Then the triangle inequality requires

\[ \cos^{-1}x_2 + \cos^{-1}y_2 \ge \cos^{-1}(x_1 y_1 + x_2 y_2). \tag{9.11.22} \]

Since the cosine is a monotonically decreasing function on the interval $(0, \pi)$, if we take the cosine of both sides of (9.11.22) we have to reverse the inequality. We then obtain

\[ x_2 y_2 - \sqrt{1 - x_2^2}\cdot\sqrt{1 - y_2^2} = x_2 y_2 - |x_1 y_1| \le |x_1 y_1 + x_2 y_2|. \]

No such relation holds for $\sinh^{-1}\rho_0$ in the third case in (9.11.19), so it cannot be considered as the distance from the origin. However, as we have seen in Sec. 2.2.4, thanks to the cross-ratio inequality (2.2.17), $\tanh^{-1}\rho$ satisfies the triangle inequality (2.2.18).
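The inequality (9.11.22) can be probed by random sampling. This is a sketch, not a proof: the sampling scheme, the seed, and the helper names are assumptions made for illustration. It draws pairs of unit vectors with positive second component and records the smallest value of the left-hand side minus the right-hand side.

```python
import math
import random

def acos_clamped(x):
    # Guard against arguments drifting just outside [-1, 1] through rounding.
    return math.acos(max(-1.0, min(1.0, x)))

def smallest_slack(trials=10000, seed=0):
    # Returns min over samples of  acos(x2) + acos(y2) - acos(x1*y1 + x2*y2).
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        a = rng.uniform(1e-6, math.pi - 1e-6)   # polar angle keeps x2 = sin(a) > 0
        b = rng.uniform(1e-6, math.pi - 1e-6)   # polar angle keeps y2 = sin(b) > 0
        x1, x2 = math.cos(a), math.sin(a)
        y1, y2 = math.cos(b), math.sin(b)
        lhs = acos_clamped(x2) + acos_clamped(y2)
        rhs = acos_clamped(x1 * y1 + x2 * y2)
        worst = min(worst, lhs - rhs)
    return worst

worst = smallest_slack()
print(worst)   # non-negative up to floating-point rounding of acos near its endpoints
```

The slack vanishes (to rounding) when the two points lie on opposite sides of the pole, which is the case of equality in the triangle inequality.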
Even before Beltrami’s time,

\[ d\ell^2 = d\chi^2 + k^2\sinh^2(\chi/k)\,(\sin^2\vartheta\,d\varphi^2 + d\vartheta^2) \tag{9.11.23} \]

had been known as the line element of a pseudospherical surface. For $\vartheta = \pi/2$ we have, in Beltrami’s [68] own words,

. . . the variable $\varphi$ is taken as the longitude variable of the variable meridian, and consequently the radius of the parallel corresponding to the meridian is $\sinh\chi$. The variation of the radius is therefore $\cosh\chi\,d\chi$, which is $> d\chi$, and this is absurd, because the variation in question is the projection of $d\chi$ onto the plane containing the parallel.

Again in a letter, Gauss tells of a fundamental discovery. In a letter dated 12 July 1831 to Schumacher, Gauss states that the semi-perimeter of a non-Euclidean circle of radius $\chi$ has the value

\[ \tfrac{1}{2}\pi k\left(e^{\chi/k} - e^{-\chi/k}\right), \tag{9.11.24} \]

where $k$ is a constant. It is this constant that Gauss says may perhaps be detectable by measurements over very large distances, and Beltrami made it the radius of his pseudosphere. A fortiori (9.11.23) can be transformed into the metric given in the Preface that was first written down by Riemann in his Habilitation. In order to do so, we write the angular component as $d\Lambda^2 = \sum_j d\lambda_j^2$, where the quantities $\lambda_j$ determine the direction of the radius vector [Beltrami 68],

\[ x_j = r\lambda_j = 2k\tanh(\chi/2k)\,\lambda_j, \tag{9.11.25} \]

such that $\sum_j \lambda_j^2 = 1$. Taking the differential of (9.11.25) and rearranging, Beltrami gets

\[ \cosh^2(\chi/2k)\,dx_j = \lambda_j\,d\chi + k\sinh(\chi/k)\,d\lambda_j, \tag{9.11.26} \]

where we used the double angle formula for $\sinh(\chi/k)$. Now, by definition,

\[ \cosh^2(\chi/2k) = \frac{1}{1 - \frac{1}{4k^2}\sum_j x_j^2}, \]
so that squaring and summing (9.11.26) give

\[ \frac{\sum_j dx_j^2}{\left(1 - \frac{1}{4k^2}\sum_j x_j^2\right)^2} = d\chi^2 + k^2\sinh^2(\chi/k)\,d\Lambda^2, \]
which is precisely Riemann’s formula, (R), in the Preface. It has been derived from the fact that the inverse hyperbolic tangent, and not the inverse hyperbolic sine, is the hyperbolic length. Now turn to the condition imposed by setting $ds^2 = 0$ to get one of the two roots in (9.11.17). Two quantities depending on separate variables can be equal only if both are equal to a constant. The expansion factor $R(t)$ must be determined from other considerations. Those considerations entail the Einstein or Friedmann equations. Instead of (9.11.19) we might be tempted to try

\[ \chi(\rho_0) = \int_0^{\rho_0}\frac{d\rho}{1 + k\rho^2} = \begin{cases} \tan^{-1}\rho_0 & (k = 1), \\ \rho_0 & (k = 0), \\ \tanh^{-1}\rho_0 & (k = -1). \end{cases} \tag{9.11.27} \]
Now, $\chi$ will have the meaning of distance only in the open universe, since $\tanh^{-1}\rho_0$ is a hyperbolic distance thanks to its logarithmic representation. That is, the hyperbolic distance between two elements $\rho$ and $\rho'$ is

\[ \tfrac{1}{2}k\ln\frac{\rho}{\rho'}; \]

the distance from an element to itself,

\[ \tfrac{1}{2}k\ln\frac{\rho}{\rho} = 0; \]

and, finally, the additivity of distances,

\[ \tfrac{1}{2}k\ln\frac{\rho}{\rho''} = \tfrac{1}{2}k\ln\frac{\rho}{\rho'} + \tfrac{1}{2}k\ln\frac{\rho'}{\rho''}. \]

The ratio $\rho/\rho'$ can be expressed as the cross-ratio $\{\rho, \rho'\,|\,0, \infty\}$, and the distance is the logarithm of the cross-ratio.
Hence, there is not a single metric that will encompass all three scenarios, which are distinguished by the three k values, and whose expressions are distances in their respective geometries.
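The two families of integrals, (9.11.19) and (9.11.27), can be compared by simple quadrature. The following sketch assumes unit $|k|$ and a midpoint rule, and the function names are hypothetical. It confirms the closed forms and the logarithmic representation of $\tanh^{-1}$ that underlies the additivity property quoted above.

```python
import math

def midpoint(f, a, b, steps=20000):
    # Composite midpoint rule; adequate here because the integrands are smooth on [a, b].
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def chi_rw(rho0, k):
    # Left-hand integral of (9.11.19).
    return midpoint(lambda r: 1.0 / math.sqrt(1.0 - k * r * r), 0.0, rho0)

def chi_alt(rho0, k):
    # Left-hand integral of (9.11.27).
    return midpoint(lambda r: 1.0 / (1.0 + k * r * r), 0.0, rho0)

rho0 = 0.6
rw = {k: chi_rw(rho0, k) for k in (1, 0, -1)}
alt = {k: chi_alt(rho0, k) for k in (1, 0, -1)}

print(rw[1], math.asin(rho0), rw[-1], math.asinh(rho0))
print(alt[1], math.atan(rho0), alt[-1], math.atanh(rho0))

# Logarithmic form of the inverse hyperbolic tangent.
log_form = 0.5 * math.log((1.0 + rho0) / (1.0 - rho0))
print(log_form, math.atanh(rho0))
```

Only the $k = -1$ entry of (9.11.27) has the logarithmic form, which is the property the text singles out as making it a hyperbolic distance.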
9.11.4 Newtonian dynamics discovers the ‘big bang’
There is a general consensus that the Robertson–Walker metric has two undetermined ‘constants’: $k$ and $R(t)$. We now see how the field equations ‘impose’ conditions on these two ‘elements’ [Rindler 77]. Newton’s second law, applied to the universe, reads

\[ m\ddot{R} = -m\frac{M}{R^2}, \tag{9.11.28} \]

where the expansion factor, $R(t)$, is confused with the radial coordinate separating the masses $m$ and $M$. If the total mass of the universe is constant, then

\[ M = \frac{4\pi}{3}\rho(t)R^3(t) = \frac{4\pi}{3}\rho(t_0)R^3(t_0) = C/2 = \text{const}. \tag{9.11.29} \]

Thus, the equation of motion is

\[ 2\ddot{R} + \frac{C}{R^2} = 0. \]

Multiplying through by $\dot{R}$, we get the first integral of motion,

\[ \dot{R}^2 - \frac{C}{R} + k = 0, \tag{9.11.30} \]
where k is an arbitrary constant of integration. Equation (9.11.30) is known as the Friedmann equation in Newtonian cosmology, which, with the exception of the cosmological constant, holds also in general relativity. Just as in the Schwarzschild outer solution, (9.10.23), we are going to pack a lot of physics into an arbitrary constant of integration. The constant k is known as the ‘energy index,’ and it represents the total energy density of the universe. The only meaning that k can acquire from (9.11.30) is that of a negative energy density. But, if we want to associate it with the absolute constant in the Robertson–Walker metric, (9.11.1), we have to be prepared to assume that there are the possibilities for a positive energy density, as
well as a zero energy density. Never have such exaggerations taken place in physics. Even more can be, and is, said when (9.11.30) is integrated in time. Taking the positive square root in (9.11.30) and integrating gives

\[ \int_0^{R}\frac{d\rho}{\sqrt{C/\rho - k}} = t. \tag{9.11.31} \]
It is immediately apparent that the integral of the Friedmann equation, (9.11.31), is not the same as (9.11.18), for, otherwise, it would specify the expansion factor, $R = 1$. In fact, the solutions, (9.11.19), have nothing in common with the solutions

\[ t = \begin{cases} \dfrac{C}{k\sqrt{k}}\tan^{-1}\sqrt{\dfrac{kR}{C - kR}} - \dfrac{1}{k}\sqrt{R(C - kR)} & (k > 0), \\[2ex] \dfrac{2}{3}\dfrac{R^{3/2}}{\sqrt{C}} & (k = 0), \\[2ex] \dfrac{1}{|k|}\sqrt{R(C + |k|R)} - \dfrac{C}{|k|\sqrt{|k|}}\sinh^{-1}\sqrt{\dfrac{|k|R}{C}} & (k < 0). \end{cases} \tag{9.11.32} \]

The conventional interpretation of all this is: Plotting $R(t)$ versus $t$, as in Fig. 9.11, all three curves coalesce at time $t = 0$. This supposedly represents the explosive birth of the cosmos, warmly referred to as the ‘big bang,’ a name coined by Fred Hoyle. At this point in time the universe started as a primordial fireball with infinite density and no size. As time passed, (9.11.32) predicts that the universe has three possibilities open to it: It can expand forever, with a positive energy index, $k < 0$, or a zero energy index, $k = 0$, or it can fall back on itself with a big ‘crunch’ if it has a negative energy density, with $k > 0$. In the latter case, the energy acquired in the primordial fireball was not sufficient to sustain continual expansion. The Friedmann equation, (9.11.30), can be written as the metric,

\[ ds^2 = dt^2 - \frac{dR^2}{C/R - k}. \tag{9.11.33} \]
In order for the metric to be hyperbolic it is necessary that R < C/k, which substantially limits the evolution of the universe. Moreover, (9.11.33) assumes spherical symmetry so it will not hurt to add a term −R2 dϕ2 to it
Fig. 9.11. The fates of the universe.
in order to determine the space-like curvature. In other words, we consider a space-like slice of the present galactic time, and determine the Gaussian curvature as

\[ K = \frac{C}{2R^3} = \frac{4\pi}{3}\frac{\rho(t_0)R_0^3}{R^3(t)} = \frac{4\pi}{3}\rho(t), \tag{9.11.34} \]
on the strength of the conservation of total mass in the universe, (9.11.29). Equation (9.11.34) shows that the Gaussian curvature is positive if the density of matter is positive. However, with negative energy density we would also have to make leeway for negative densities of matter, even though the total mass of the universe is positive and constant. It should also be borne in mind that the Gaussian curvature, (9.11.34), is independent of the energy index, $k$. This alone makes the whole scenario more than dubious: for, basing it on Newton’s gravitational law and his second law, Newton himself could have arrived at the big bang scenario of the universe! But, (9.11.28) only has the exterior appearance of Newton’s law, where $R$ is the distance between the two masses $m$ and $M$, and not the expansion factor, $R(t)$. So we cannot blame Newton for these scenarios!
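The closed forms (9.11.32) can be compared with a direct numerical evaluation of the integral (9.11.31). The sketch below is illustrative, with assumed values $C = 2$, $R = 1$ and $k = 1, 0, -1$, and hypothetical function names. The substitution $\rho = s^2$ is used only to remove the square-root behaviour of the integrand at $\rho = 0$ before applying the midpoint rule.

```python
import math

def t_numeric(R, C, k, steps=20000):
    # Integral (9.11.31) with rho = s^2:  t = int_0^sqrt(R) 2 s^2 / sqrt(C - k s^2) ds.
    s_max = math.sqrt(R)
    h = s_max / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += 2.0 * s * s / math.sqrt(C - k * s * s)
    return total * h

def t_closed(R, C, k):
    # Closed forms as written in (9.11.32).
    if k > 0:
        return (C / k ** 1.5) * math.atan(math.sqrt(k * R / (C - k * R))) \
            - math.sqrt(R * (C - k * R)) / k
    if k == 0:
        return (2.0 / 3.0) * R ** 1.5 / math.sqrt(C)
    a = abs(k)
    return math.sqrt(R * (C + a * R)) / a \
        - (C / a ** 1.5) * math.asinh(math.sqrt(a * R / C))

C, R = 2.0, 1.0   # assumed sample values; for k > 0 they must satisfy R < C/k
results = {k: (t_numeric(R, C, k), t_closed(R, C, k)) for k in (1.0, 0.0, -1.0)}
for k, (num, closed) in results.items():
    print(k, num, closed)
```

For these sample values the $k > 0$ branch reduces to $\pi/2 - 1$, and all three branches vanish as $R \to 0$, consistent with the curves coalescing at $t = 0$.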
References
[Beltrami 68] E. Beltrami, “Teoria fondamentale degli spazi di curvatura costante,” Annali di Matematica Pura ed Applicata, series II (1868) 232–255; translated in J. Stillwell (ed.), Sources of Hyperbolic Geometry (Amer. Math. Soc., Providence RI, 1996), pp. 41–62.
[Busemann and Kelly 53] H. Busemann and P. J. Kelly, Projective Geometry and Projective Metrics (Academic Press, New York, 1953).
[Ehrenfest 09] P. Ehrenfest, “Gleichförmige Rotation starrer Körper und Relativitätstheorie,” Phys. Z. 10 (1909) 918; “Uniform rotation of rigid bodies and the theory of relativity,” translated by Wikisource.
[Einstein 20] A. Einstein, Relativity: The Special and General Theory (Methuen, London, 1920).
[Einstein 55] A. Einstein, The Meaning of Relativity (Princeton U. P., Princeton, 1955).
[Einstein 89] A. Einstein, “The speed of light and the statics of the gravitational field,” in The Collected Papers of Albert Einstein: The Swiss Years, Vol. 4 (Princeton U. P., Princeton, 1989), pp. 95–106; “On the theory of the static gravitational field,” ibid pp. 107–120.
[Fock 66] V. Fock, The Theory of Space, Time and Gravitation, 2nd ed. (Pergamon Press, Oxford, 1966).
[Fokker 65] A. D. Fokker, Time and Space, Weight and Inertia (Pergamon Press, Oxford, 1965), p. 139.
[Gamow 62] G. Gamow, Gravity (Anchor Books, New York, 1962).
[Gray 07] J. Gray, Worlds Out of Nothing (Springer, New York, 2007), p. 318.
[Grøn 04] Ø. Grøn, “Space geometry in rotating reference frames: A historical appraisal,” in [Rizzi & Ruggiero 04].
[Huygens 62] C. Huygens, Treatise on Light (Dover, New York, 1962).
[Landau & Lifshitz 75] L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields (Pergamon Press, Oxford, 1975), p. 362.
[Langevin 35] P. Langevin, “Remarques au sujet de la Note de Prunier,” Comptes Rendus 200 (1935) 48–51.
[Lorentz 16] H. A. Lorentz, The Theory of Electrons, 2nd ed. (B. G. Teubner, Leipzig, 1916).
[Møller 52] C. Møller, The Theory of Relativity (Oxford U. P., Oxford, 1952).
[Needham 97] T. Needham, Visual Complex Analysis (Clarendon, Oxford, 1997).
[O’Neill 66] B. O’Neill, Elementary Differential Geometry (Academic Press, New York, 1966).
[Page & Adams 40] L. Page and N. I. Adams, Jr, Electrodynamics (Van Nostrand, New York, 1940).
[Rindler 77] W. Rindler, Essential Relativity (Springer-Verlag, New York, 1970), p. 208.
[Rizzi & Ruggiero 04] G. Rizzi and M. L. Ruggiero, Relativity in Rotating Frames (Kluwer, Dordrecht, 2004).
[Robb 11] A. A. Robb, Optical Geometry of Motion (W. Heffer and Sons, Cambridge, 1911).
[Robb 36] A. A. Robb, The Geometry of Time and Space (Cambridge U. P., Cambridge, 1936).
[Robertson 50] H. P. Robertson, “The geometries of the thermal and gravitational fields,” Am. Math. Monthly 57 (1950) 232–245.
[Sagnac 13] G. Sagnac, “Sur la preuve de la réalité de l’éther lumineux par l’expérience de l’interférographe tournant,” Comptes Rendus 157 (1913) 1410–1413.
[Schwarzschild 16] K. Schwarzschild, “Über das Gravitationsfeld einer Kugel aus inkompressibler Flüssigkeit,” Sitzber. Preuss. Akad. Wiss. (1916) 424–434 (presented at the meeting of 24 February 1916).
[Smart 60] W. M. Smart, Textbook on Spherical Astronomy, 4th ed. (Cambridge U. P., Cambridge, 1960).
[Sommerfeld 09] A. Sommerfeld, “Über die Zusammensetzung der Geschwindigkeiten in der Relativtheorie,” Physikalische Zeitschrift 10 (1909) 826–829.
[Sommerfeld 64] A. Sommerfeld, Optics (Academic Press, New York, 1964).
[Stachel 89] J. Stachel, “The rigidly rotating disc as the ‘missing link’ in the history of general relativity,” in eds. D. Howard and J. Stachel, Einstein and the History of General Relativity (Birkhäuser, Basel, 1989).
[Terrell 59] J. Terrell, “Invisibility of the Lorentz contraction,” Phys. Rev. 116 (1959) 1043.
[Weinstein 60] R. Weinstein, “Observation of length by a single observer,” Am. J. Phys. 28 (1960) 607.
Chapter 10
Aberration and Radiation Pressure in the Klein and Poincaré Models
The hyperbolic distance in the Klein model “differs from the formula in the Poincaré disc model by a mere factor of two!” [Needham 97]
10.1 Angular Defect and its Relation to Aberration and Thomas Precession
The angular defect concerns both aberration and parallax, although the two phenomena are quite distinct from each other. In fact, Bradley discovered aberration in 1728 while looking for parallax. Although both phenomena cause the locus of a star to trace out an ellipse, the direction and magnitude of the angular deviation in aberration are quite different from those caused by parallax. The crucial difference is that the magnitude of deviation caused by aberration is independent of the distance to the star, and is much greater than for parallax. In Sec. 9.9 we found the angle of parallax is greater than the defect, and, moreover, the angle of parallax is greater than the complementary angle of parallelism, which is a sole function of distance. In the Klein model, we will appreciate that the angle of parallelism is a limiting angle, while the angular defect is always present. It has also been shown that the angular defect of a hyperbolic triangle is related to the upper bound on the Euclidean measure of relativistic velocities using the conformal Poincaré disc model [Criado & Alamo 01]. On the other hand, if the Klein model is used, which is not conformal, one would find Lorentz contraction in the direction normal to the motion, as we have seen in Sec. 9.7. The angular defect in the hyperbolic triangle, which is proportional to the area, has also been implicated in the determination of the rotation of
axes in successive Lorentz transformations in different planes [Sard 70]. It came as a curious surprise that the composition of successive Lorentz transformations, or ‘boosts’ as they are now referred to, is not simply another boost, but one that involves a rotation. In physics, the angle of rotation is known as Wigner’s angle, and is the kinematic factor underlying Thomas precession. However, what we refer to as the ‘Thomas’ precession falls under Stigler’s law of eponymy because it was actually discovered by Émile Borel [13], a doctoral student of Poincaré. During his exploration of what he referred to as ‘kinematic’ space, Borel discovered that a system whose accelerations are rectilinear for observers in that frame will appear to be rotated with respect to inertial observers. Borel observed that a vector transported parallel to itself over a closed path on the surface of a sphere will be viewed as a change in orientation by an observer at the center of the sphere. The amount of change in orientation is proportional to the enclosed area for the inertial observer whose velocity is equal to the initial and final velocity of the accelerating system. Borel predicted that for a circular orbit of radius $R$ and angular velocity $\omega$, the precession of the orbit would be of the order of $\beta^2 := (\omega R/c)^2$, and its rate would be $\omega\beta^2$. While he attributed this effect to be a direct consequence of the nature of Lorentz transformations, he failed to apply it to any known physical phenomenon, and, undoubtedly, this is why he lost out to Llewellyn Thomas, whose Christmas holiday calculation was done in 1925. Borel’s priority in the Thomas precession has recently been pointed out by Stachel [95]. If $\mathbf{u}$ and $\mathbf{v}$ are two velocities, we know from Sec. 9.6 that the most general composition law is

\[ w = \frac{\sqrt{(\mathbf{u} - \mathbf{v})^2 - (\mathbf{u}\times\mathbf{v})^2/c^2}}{1 - \mathbf{u}\cdot\mathbf{v}/c^2}. \tag{10.1.1} \]

The non-planar aspects of the composition law can be clearly seen in the second term of the numerator of (10.1.1). Expression (10.1.1) can also be derived by differentiating the Lorentz transformations at constant, relative velocity [Fock 66, pp. 46–47]. Then, introducing $\mathbf{v} = \mathbf{u} + d\mathbf{u}$ into (10.1.1), and dividing through by $dt$, the law of acceleration is obtained as

\[ \dot{w} = \frac{\sqrt{\dot{\mathbf{u}}^2 - (\mathbf{u}\times\dot{\mathbf{u}})^2/c^2}}{1 - u^2/c^2}. \tag{10.1.2} \]
This decomposes the acceleration into longitudinal ($\mathbf{u} \parallel \dot{\mathbf{u}}$) and transverse ($\mathbf{u} \perp \dot{\mathbf{u}}$) components, analogous to the longitudinal and transverse masses. Taking the inner product of $\mathbf{u}$ with (5.4.39) yields

\[ \mathbf{u}\cdot\dot{\mathbf{u}} = \frac{\mathbf{F}\cdot\mathbf{u}}{m_{el}}\left(1 - u^2/c^2\right)^{3/2}. \tag{10.1.3} \]

When they are parallel to each other, (10.1.3) gives the longitudinal mass, and when they are perpendicular $\mathbf{F}\cdot\mathbf{u} = 0$, and (5.4.39) gives the transverse mass. It is the second term in the numerator of (10.1.2) that is related to the Thomas precession: the rotation of the electron’s velocity vector,

\[ d\vartheta = (\mathbf{u}\times\dot{\mathbf{u}})\,dt/u^2, \tag{10.1.4} \]

caused by the acceleration, $\dot{\mathbf{u}}$, in time, $dt$. Then, as the velocity turns by $d\vartheta$ along the orbit, the spin projection turns in the opposite direction by an amount equal to the angular defect of the hyperbolic triangle whose vertices are the velocities in three different inertial frames in pure translation with respect to one another. The defect caused by aberration can be readily calculated. Consider the triangle formed by three vertices $\mathbf{u}_1$, $\mathbf{u}_2$, and $\mathbf{u}_3$ in velocity space. By setting $\mathbf{u}_3 = \mathbf{n}c$, where $\mathbf{n}$ is the unit normal in the direction of the light source, we are considering an ideal, or ‘improper,’ triangle [Kulczycki 61], which shares many properties of ordinary triangles, but has the property that the sum of its angles is less than two right angles, the shortfall being its so-called defect. Consequently, there will be two parallel lines forming an ideal vertex $\mathbf{u}_3$ whose angle is zero, so that $\cos\vartheta_3 = 1$. The cosines of the angles are given by the inner products [Busemann & Kelly 53]

\[ \cos\vartheta_i = \frac{(\mathbf{u}_k - \mathbf{u}_i)\cdot(\mathbf{u}_j - \mathbf{u}_i) - (\mathbf{u}_k\times\mathbf{u}_i)\cdot(\mathbf{u}_j\times\mathbf{u}_i)/c^2}{\Delta_{ik}\,\Delta_{ij}}, \tag{10.1.5} \]

where $\Delta_{ik} = \sqrt{(\mathbf{u}_k - \mathbf{u}_i)^2 - (\mathbf{u}_k\times\mathbf{u}_i)^2/c^2}$, with a similar expression for $\Delta_{ij}$. All three angles can be calculated by permuting cyclically the indices, and it is easy to see that $\vartheta_3 = 0$. By choosing a frame where the velocities are equal and opposite in direction, $\mathbf{u}_1 = -\mathbf{u}_2$, we are, in fact, considering a
‘two-way’ Doppler shift. The relative velocity is$^a$

\[ \gamma = \frac{2\beta}{1 + \beta^2}, \tag{10.1.6} \]

where $\beta = u/c$, and $u = |\mathbf{u}_1| = |\mathbf{u}_2|$. The projection of the velocity onto the normal of the wavefront is

\[ \mathbf{n}\cdot\mathbf{u}_1 = -\mathbf{n}\cdot\mathbf{u}_2 = u\cos\vartheta \ge 0. \tag{10.1.7} \]

The cosine law (10.1.5) for angles $\vartheta_1$ and $\vartheta_2$ can be written as

\[ \cos\vartheta_i = \frac{u^2 - c\,\mathbf{n}\cdot\mathbf{u}_i}{u\,(c - \mathbf{n}\cdot\mathbf{u}_i)}, \qquad i = 1, 2. \tag{10.1.8} \]

On account of (10.1.7), the cosine of the first angle,

\[ \cos\vartheta_1 = \frac{\beta - \cos\vartheta}{1 - \beta\cos\vartheta}, \tag{10.1.9} \]

represents the usual formula for aberration, except for the negative sign which implies reflection and guarantees that the angle of parallelism is acute. The second equation of aberration for the first angle is:

\[ \frac{\sin\vartheta_1}{\lambda_1} = \frac{\sin\vartheta}{\lambda}. \tag{10.1.10} \]

The expression for the ratio of the wavelengths,

\[ \frac{\lambda_1}{\lambda} = \frac{\sqrt{1 - \beta^2}}{1 - \beta\cos\vartheta}, \tag{10.1.11} \]

is Doppler’s principle. For the second angle we have

\[ \cos\vartheta_2 = \frac{\beta + \cos\vartheta}{1 + \beta\cos\vartheta}, \tag{10.1.12} \]

again on account of (10.1.7), and, hence,

\[ \sin\vartheta_2 = \frac{\sqrt{1 - \beta^2}}{1 + \beta\cos\vartheta}\sin\vartheta. \tag{10.1.13} \]

Finally, by (10.1.7), we find the relation,

\[ \cos\vartheta_2 = \frac{\gamma - \cos\vartheta_1}{1 - \gamma\cos\vartheta_1}, \tag{10.1.14} \]

between the two cosines, where $\gamma$ is the relative speed given by (10.1.6).

$^a$ It should be clear from the context when $\gamma$ denotes the relative speed of the two systems, and when it denotes the Lorentz factor.
The aberration formula (10.1.10) has the identical form of the law of reflection for a moving mirror. For a stationary mirror, $\lambda_1 = \lambda$ and $\vartheta_1 = \vartheta$, where the angles are subtended by the incoming and outgoing rays, and the surface of the mirror. However, it must be borne in mind that the angles are at the vertices in velocity space so that an angle of $\vartheta = \pi/2$ is parallel to the wavefront, or perpendicular to the motion. The first, (10.1.9) and (10.1.10), and second, (10.1.12) and (10.1.13), pair of aberration equations can be combined to read

\[ \tan(\vartheta_1/2) = \left(\frac{1 - \beta}{1 + \beta}\right)^{1/2}\cot(\vartheta/2), \tag{10.1.15a} \]

\[ \tan(\vartheta_2/2) = \left(\frac{1 - \beta}{1 + \beta}\right)^{1/2}\tan(\vartheta/2), \tag{10.1.15b} \]

respectively. Expression (10.1.15b) is the usual formula given for aberration, and (10.1.15a) is what we found in (8.3.17). By letting the third vertex be the speed of light, we have formed an ideal triangle. In hyperbolic geometry, a transversal which cuts the two parallel lines forms angles in the direction of parallelism such that the sum of the angles is less than two right angles. Two important cases arise: (i) when the vertices, $\mathbf{u}_1$ and $\mathbf{u}_2$, of the ideal triangle are on the same limiting curve, or horocycle, $H$, whose center is at infinity [shown in Fig. 10.1], and (ii) when the transversal is perpendicular to one of the parallel lines [shown in Fig. 10.2].
Fig. 10.1. A segment H of a horocycle with center at infinity with angles of parallelism $\Pi$.
Fig. 10.2. Angle of parallelism with transversal perpendicular to one of the parallel lines.
In the first case, $\vartheta = \pi/2$, $\vartheta_1 = \vartheta_2 = \Pi$, where $\Pi$, the angle of parallelism, is given by

\[ \tan(\Pi(\bar{u})/2) = \left(\frac{1 - \beta}{1 + \beta}\right)^{1/2} = e^{-\bar{u}/c}. \tag{10.1.16} \]

$\Pi$ is a function only of the ‘distance’ $\bar{u}$. From (10.1.9) and (10.1.12) we find $\cos\vartheta_1 = \cos\vartheta_2 = \beta$, which is the Euclidean measure of distance in velocity space. In the second case, one of the angles is $\pi/2$, and the other is necessarily acute, being the angle of parallelism. In other words, lines with a common normal cannot be parallel, so that $\Pi$ must be acute. It is readily seen from (10.1.15b) that $\vartheta_2$ cannot become a right angle because that would imply $\cos^{-1}(-\beta) = \vartheta > \pi/2$, and so violate (10.1.7). Again $\beta$ is the Euclidean measure of length, which is equal to the hyperbolic tangent of its hyperbolic measure [cf. Eq. (10.3.2) below]. Negative values are ruled out in hyperbolic geometry: “the hyperbolic tangent is a function that assumes all values between 0 and 1” [Kulczycki 61, p. 163]. In other words, the angle of parallelism must be an acute angle, for, otherwise, the lines would be divergent. The formation of an ideal triangle is related to the fact that $c$ is the limiting speed. We will return to this point in Sec. 10.3.1. Rather, if $\vartheta_1 = \pi/2$, and (10.1.15a) is introduced into (10.1.15b), we get

\[ \tan(\Pi(2\bar{u})/2) = \frac{1 - \beta}{1 + \beta} = e^{-2\bar{u}/c}, \tag{10.1.17} \]

where the angle of parallelism, $\Pi$, is a function of twice the ‘distance’ $\bar{u}$. From (10.1.14) we find the new hyperbolic measure of distance as $\cos\vartheta_2 = \gamma$, which again is related to the hyperbolic tangent through (10.5.5) below. In contrast to (10.1.16), the hyperbolic measure has become twice as great in (10.1.17). This, as we shall see, is the same as performing a ‘two-way’ Doppler shift.
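The two cases (10.1.16) and (10.1.17) can be illustrated with a few lines of arithmetic. The sketch below assumes units with $c = 1$ and takes the hyperbolic measure to be $\bar{u} = \tanh^{-1}\beta$, as stated in the text; the function name `parallelism_angle` is hypothetical.

```python
import math

def parallelism_angle(u_bar):
    # Angle of parallelism from tan(Pi/2) = exp(-u_bar), with c = 1 (assumed units).
    return 2.0 * math.atan(math.exp(-u_bar))

beta = 0.6
u_bar = math.atanh(beta)                      # hyperbolic measure of the speed beta
gamma_rel = 2.0 * beta / (1.0 + beta ** 2)    # relative speed (10.1.6)

pi_single = parallelism_angle(u_bar)          # case (10.1.16)
pi_double = parallelism_angle(2.0 * u_bar)    # case (10.1.17)

print(math.tan(pi_single / 2.0), math.sqrt((1.0 - beta) / (1.0 + beta)))
print(math.cos(pi_single), beta)              # cos(Pi(u)) = tanh(u) = beta
print(math.tan(pi_double / 2.0), (1.0 - beta) / (1.0 + beta))
print(math.cos(pi_double), gamma_rel)         # cos(Pi(2u)) = tanh(2u) = 2 beta / (1 + beta^2)
```

Doubling the hyperbolic measure squares the tangent of the half-angle, and the cosine of the resulting angle is the relative speed (10.1.6), which is the ‘two-way’ composition of $\beta$ with itself.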
The defect, η = π − ϑ1 − ϑ2 > 0, is expressed in terms of the relative speed β and the angle ϑ subtended by the direction of the light source and the line of sight of the observer, i.e.
\[ \tan(\eta/2) = \frac{\beta}{\sqrt{1-\beta^2}}\,\sin\vartheta. \tag{10.1.18} \]
In the Thomas precession, the velocity turns along the orbit by an amount ϑ, while the spin projection in the orbital plane turns in the opposite direction by the amount η = −dϕ, the hyperbolic defect, where dϕ is the change in the angle that the spin projection makes with the velocity vector in time dt. To calculate this change we consider a triangle with sides $\bar{\beta}_1$, $\bar{\beta}_2$, and $\bar{\beta}_3$ and corresponding angles ϑ1, ϑ2, and ϑ3. According to Gauss’s equation [Greenberg 93]
\[ \begin{aligned}
\sin(\eta/2) &= \cos\tfrac{1}{2}(\vartheta_1+\vartheta_2+\vartheta_3) \\
&= \cos\tfrac{1}{2}(\vartheta_1+\vartheta_2)\cos(\vartheta_3/2) - \sin\tfrac{1}{2}(\vartheta_1+\vartheta_2)\sin(\vartheta_3/2) \\
&= \frac{\cosh\tfrac{1}{2}(\bar{\beta}_1+\bar{\beta}_2) - \cosh\tfrac{1}{2}(\bar{\beta}_1-\bar{\beta}_2)}{2\cosh\tfrac{1}{2}\bar{\beta}_3}\,\sin\vartheta_3 \\
&= \frac{\sinh\tfrac{1}{2}\bar{\beta}_1\cdot\sinh\tfrac{1}{2}\bar{\beta}_2}{\cosh\tfrac{1}{2}\bar{\beta}_3}\,\sin\vartheta_3 \\
&= \frac{\sqrt{\cosh\bar{\beta}_1-1}\cdot\sqrt{\cosh\bar{\beta}_2-1}}{\sqrt{2}\cdot\sqrt{\cosh\bar{\beta}_3+1}}\,\sin\vartheta_3.
\end{aligned} \tag{10.1.19} \]
Now letting $\bar{\beta}_1 \to \bar{\beta}_2$ and $\bar{\beta}_3 \to 0$, with ϑ3 = dϑ, there results [Sard 70]
\[ \sin(\eta/2) = \tfrac{1}{2}(\gamma-1)\sin(d\vartheta), \tag{10.1.20} \]
where γ = 1/√(1 − β²) is the Lorentz factor. For an infinitesimal time interval,
\[ -d\phi = \eta = (\gamma-1)\,d\vartheta = \frac{\gamma-1}{\beta^2}\,\frac{|\mathbf{u}\times\dot{\mathbf{u}}|}{c^2}\,dt, \]
where we have introduced (10.1.4). The angular velocity of the Thomas precession is thus given as
\[ \omega_T = \dot{\phi} = -\frac{\gamma^2}{1+\gamma}\,\frac{|\mathbf{u}\times\dot{\mathbf{u}}|}{c^2}. \tag{10.1.21} \]
However, if the relative speed is that of the two systems, β = γ, given in (10.1.6), (10.1.20) is replaced by
\[ \sin(\eta/2) = \tfrac{1}{2}(\Gamma-1)\sin(d\vartheta) = \left(\gamma^2-1\right)\sin(d\vartheta), \]
with Γ = (1 + β²)/(1 − β²). The angular velocity of the Thomas precession would then be
\[ \omega_T = -2\gamma^2\,\frac{|\mathbf{u}\times\dot{\mathbf{u}}|}{c^2}. \tag{10.1.22} \]
At low speeds, γ ≈ 1, and (10.1.22) would be four times as large as (10.1.21). Comparison can be made with general relativity by associating the acceleration with that of Newtonian gravity [Schiff 60],
\[ \dot{\mathbf{u}} = -\frac{GM}{r^3}\,\mathbf{r}, \]
where if M is the mass of the earth, r would be the radial vector connecting the center of the earth to an orbiting satellite. The satellite would precess in the plane of the orbit at a rate,
\[ \omega_T = n\,\frac{\alpha}{r}\,\frac{|\mathbf{u}\times\mathbf{r}|}{r^2}, \]
where α = 2GM/c² is Schwarzschild’s radius. If the relative speed is β, then n = 1/4, while for the relative speed of γ, we have n = 1. Apart from predicting a precession frequency in the opposite direction, general relativity claims that n = 3/4 [Schiff 60]. We can therefore conclude that anytime a component of the acceleration exists normal to the velocity, “for whatever reason, then there is a Thomas precession, independent of other effects” [Jackson 75], including relativistic ones. This kinematic effect is amplified by the compounding of Doppler shifts, and this will be a recurrent theme throughout this chapter. The fundamental connection between hyperbolic geometry and optical phenomena in general, and relativity in particular, is that compounding longitudinal Doppler shifts gives the cross-ratio, whose logarithm is the hyperbolic distance. As we know from Sec. 2.2.4, the cross-ratio is a projective invariant of four points. This is the smallest number of points that is invariant, since three points on a line may be projected to any other three. Consider two relative velocities, β1 and β2. Compounding their longitudinal Doppler shifts gives
\[ \left(\frac{1+\beta_1}{1-\beta_1}\right)^{1/2}\left(\frac{1-\beta_2}{1+\beta_2}\right)^{1/2} = \left(\frac{1+(\beta_1-\beta_2)/(1-\beta_1\beta_2)}{1-(\beta_1-\beta_2)/(1-\beta_1\beta_2)}\right)^{1/2} = \{\beta_1, \beta_2\,|\,{-1}, 1\}^{1/2}, \tag{10.1.23} \]
whose logarithm is precisely the hyperbolic distance with an absolute constant of unity. If we had considered a velocity addition law rather than a subtraction law, one of the velocities in the cross-ratio would be negative. We know from Sec. 2.4 that the projective or Klein disc is not conformal, except at the origin of the hyperbolic plane, while the Poincaré disc is. The disc models also differ in how hyperbolic distance is measured: the hyperbolic distance is twice as great in the Poincaré disc as it is in the Klein disc. The factor of two is not a mere numerical factor, since it is indicative of reflections and the way velocities are compounded and distances measured. Moreover, it will change the dependencies of energy, momentum, and consequently, mass, on the relative speed. Another possibility of vindicating hyperbolic geometry consists in the distinction between aberration and the pressure of radiation against a moving mirror. Early in the development of the ‘special’ theory, the Lorentz transform and its inverse were used to determine the pressure of radiation on a moving mirror [Abraham 04, Einstein 98]. It is still common to use aberration to determine the radiation pressure, even though Einstein calculated the difference in the energy density after being reflected from the mirror and the initial energy density in order to determine the radiation pressure. A ‘two-way’ Doppler shift is involved, and not a one-way Doppler shift [Terrell 61]. This we will show to be the same distinction as that between the Klein and the Poincaré models of hyperbolic geometry. Moreover, it will turn out that the second-order Doppler effect predicted by the two-way Doppler shift is an experimental test for the angle of parallelism [cf. Sec. 10.6 below].
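As a numerical illustration of (10.1.23), the sketch below compounds two longitudinal Doppler factors and compares the result with the cross-ratio and with the difference of the hyperbolic measures. The speeds 0.8 and 0.5 (in units of c) are arbitrary, and the helper names `doppler` and `cross_ratio` are hypothetical; the cross-ratio is assumed to be built from Euclidean distances as in Sec. 10.2.

```python
import math

def doppler(beta):
    """Longitudinal Doppler factor sqrt((1 + beta) / (1 - beta))."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))

def cross_ratio(a, b, p=-1.0, q=1.0):
    """Cross-ratio {a, b | p, q} from Euclidean distances on a line."""
    return (abs(a - p) / abs(a - q)) * (abs(b - q) / abs(b - p))

beta1, beta2 = 0.8, 0.5                               # illustrative speeds
relative = (beta1 - beta2) / (1.0 - beta1 * beta2)    # subtraction law

compounded = doppler(beta1) / doppler(beta2)          # left side of (10.1.23)
distance = 0.5 * math.log(cross_ratio(beta1, beta2))  # hyperbolic distance

print(compounded, doppler(relative), math.sqrt(cross_ratio(beta1, beta2)))
print(distance, math.atanh(beta1) - math.atanh(beta2), math.atanh(relative))
```

Each printed row should show three equal numbers: the compounded shift is the Doppler factor of the relative velocity, and its logarithm is the hyperbolic distance between the two velocities.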
10.2
From the Klein to the Poincaré Model
The relativistic velocity addition law for two systems moving at equal and opposite speeds, (10.1.6), is the isomorphism from the Klein model of hyperbolic geometry onto the Poincaré disc and upper half-plane models. In the Poincaré disc model, points of the hyperbolic plane are represented by points interior to a fixed Euclidean circle. Lines not passing through the center of the circle are represented by open arcs of circles which cut this fixed circle orthogonally at P and Q in Fig. 2.27. Points lying on the real axis in the half-plane model, called ideal points or points at infinity, become
points on the unit circle in the Poincaré disc model, whose locus represents a circle at infinity, or a ‘horizon’. Not only did Beltrami discover the Poincaré disc model, some fourteen years before Poincaré rediscovered it, he also constructed the Klein, or projective model, by projecting a hemisphere vertically downwards onto the complex plane. We have discussed the Poincaré disc model in Sec. 2.5, where we placed a sphere whose south pole is centered at the origin of the disc in Fig. 2.26, and having the same radius as the disc. We can also place the Beltrami disc on the equator of the sphere as shown in Fig. 10.3. A chord on the disc, PQ, is projected vertically downwards into the southern hemisphere. This chord becomes a semicircular arc dangling vertically downward from the equator. A stereographic projection from the north pole N transforms the semicircular arc into an arc of a circle that cuts the disc normally or a straight line through the center of the equator. Stereographic projection is conformal so that the hanging semicircular arc will produce a circular arc that cuts the equator at right angles, and it projects circles onto circles or straight lines. Thus, what was a non-Euclidean geodesic straight line in the Beltrami–Klein model has become a circular arc in the Poincaré model. Although the projection of a small circle on the hemisphere becomes an ellipse on the disc, so that the Klein model is not conformal, the redeeming feature of the model is that the vertical sections of the hemisphere
Fig. 10.3. Poincaré’s projections of the Beltrami model vertically into the southern hemisphere and stereographically back onto the equator.
Fig. 10.4. Klein model where vertical sections of the hemisphere are projected into straight lines. Geodesics retain their straightness at the cost of not being conformal.
are projected into Euclidean straight lines as shown in Fig. 10.4. In other words, the hyperbolic lines of the Klein model are Euclidean chords of the unit-circle. We discussed the cross-ratio in Sec. 2.2.4. Here, we motivate its logarithm as a measure of hyperbolic distance. The original idea was Cayley’s, in which he started with projective geometry and then introduced the notion of Euclidean distance. But, it was Klein who realized the potency and generality of the idea. Let A, B, and C be ordinary points inside the circle, and P and Q the ends of the chord through A, B, and C. Recall from Sec. 2.2.4 that the cross-ratio of the four points P, A, B, Q is
\[ \{A, B\,|\,P, Q\} = \frac{e(AP)}{e(AQ)}\cdot\frac{e(BQ)}{e(BP)}. \]
Likewise, the cross-ratio of the four points P, B, C, Q is
\[ \{B, C\,|\,P, Q\} = \frac{e(BP)}{e(BQ)}\cdot\frac{e(CQ)}{e(CP)}. \]
Their product,
\[ \{A, B\,|\,P, Q\}\cdot\{B, C\,|\,P, Q\} = \frac{e(AP)}{e(AQ)}\cdot\frac{e(CQ)}{e(CP)} = \{A, C\,|\,P, Q\}, \]
has eliminated the intermediate point B [cf. (2.2.16)]. This motivates Klein’s definition of the length of the segment AC as
\[ h(AC) = \tfrac{1}{2}\,\left|\ln\{A, C\,|\,P, Q\}\right|, \tag{10.2.1} \]
since the distances add, h(AB) + h(BC) = h(AC). But we also know from (2.2.17) that if we shorten the interval we lengthen the distance between two intermediary points so that (10.2.1) satisfies the triangle inequality. If P = 1, Q = −1, A = 0, and B = b, then Klein’s distance is
\[ h(AB) = \frac{1}{2}\ln\left(\frac{1+b}{1-b}\cdot\frac{1}{1}\right) = \tanh^{-1} b. \]
So a point at Euclidean distance b from the origin lies at the hyperbolic distance $\bar{b} = \tanh^{-1} b$. As b varies from 0 to 1, $\bar{b}$ varies from 0 to ∞. Poincaré, on the other hand, determines the distance of the arc from A to B as twice Klein’s distance, viz.
\[ h'(AB) = \left|\ln\{A, B\,|\,P, Q\}\right|. \tag{10.2.2} \]
Again, let the ends of the chord at P and Q be 1 and −1. If A and B have coordinates x and y then the cross-ratio is
\[ \{A, B\,|\,P, Q\} = \frac{1-x}{1+x}\cdot\frac{1+y}{1-y}. \]
If A′ = γ(x) and B′ = γ(y), it follows that h(A′B′) = h′(AB), since
\[ \frac{1-\gamma(x)}{1+\gamma(x)} = \left(\frac{1-x}{1+x}\right)^{2}. \]
Hence, γ given by (10.1.6), is an isomorphism that makes the lengths of the Klein and Poincaré models coincide.
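The coincidence of the two lengths under γ can be verified with a few lines of arithmetic. This is a sketch only: the map is assumed to be γ(x) = 2x/(1 + x²), which is the form consistent with the squared ratio quoted above, the chord ends are taken at P = 1 and Q = −1, and the function names are hypothetical.

```python
import math

def cross_ratio(a, b):
    """{A, B | P, Q} for the chord with ends P = 1 and Q = -1."""
    return ((1.0 - a) / (1.0 + a)) * ((1.0 + b) / (1.0 - b))

def klein_length(a, b):
    """Klein's definition (10.2.1): half the absolute log of the cross-ratio."""
    return 0.5 * abs(math.log(cross_ratio(a, b)))

def poincare_length(a, b):
    """Poincare's definition (10.2.2): the absolute log of the cross-ratio."""
    return abs(math.log(cross_ratio(a, b)))

def gamma_map(x):
    """Assumed form of (10.1.6): composition of two equal speeds x."""
    return 2.0 * x / (1.0 + x * x)

x, y = 0.2, 0.7                 # illustrative coordinates inside the disc
print(klein_length(gamma_map(x), gamma_map(y)), poincare_length(x, y))
print(klein_length(0.0, 0.5), math.atanh(0.5))
```

The first row compares the Klein length of the mapped points with the Poincaré length of the originals; the second confirms that the Klein distance from the origin is the inverse hyperbolic tangent.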
10.3
Aberration versus Radiation Pressure on a Moving Mirror
10.3.1
Aberration and the angle of parallelism
Having derived the formulas for aberration in Sec. 8.6 and Sec. 10.1, we now consider, in greater detail, the limiting forms (10.1.16) and (10.1.17) which are the Bolyai–Lobachevsky formulas for the angle of parallelism. This situation has been widely discussed in the literature [Terrell 59, Weisskopf 60], but without any mention of hyperbolic geometry or of an angle of parallelism.
For ϑ1 = π/2 in (10.1.9), the observer in a frame in which the object is at rest will see the object rotated by an amount sin ϑ = √(1 − β²), just equal to the FitzGerald–Lorentz contraction. The angle of parallelism, ϑ = cos⁻¹ β, provides the link between circular and hyperbolic functions. Only at the angle of parallelism can a rotation be equated with a FitzGerald–Lorentz contraction. Terrell [59] also considers the opposite case where ϑ2 = π/2 in (10.1.14) and ϑ = cos⁻¹(−β). He concludes that to the stationary observer, the object appears “to be rotating about its line of motion in such a way as to appear broadside at ϑ = cos⁻¹(−β), and to present a view of its rear end from that time on.” However, the stationary observer will not see any motion of this sort performed by the moving object because the angle of parallelism, linking circular and (positive) hyperbolic functions, must be acute; otherwise, the hyperbolic measure of distance would turn out to be negative! Terrell’s [59] analysis cannot therefore be extended to angles of parallelism greater than π/2, for such angles do not exist. In other words, the observer must make his observation of the object in the same inertial frame as the object, and the condition ϑ1 = π/2 makes ϑ an angle of parallelism via the equation of aberration, (10.1.9). We recall from Sec. 1.1 that it was the Serbian mathematician, Varičak [11], who dared to question the reality of the Lorentz contraction, and provoked Einstein’s [11] summary response:ᵇ
The question of whether the Lorentz contraction is real or not is misleading. It is not ‘real’ insofar as it does not exist for an observer moving with the object.
We will analyze the angle of parallelism further in terms of the projective disc model, showing that it leads to Lorentz contraction in a direction normal to the motion [cf. Eq. (10.5.2) below]. This will add further support to a contraction normal to the motion that we found using the angle defect in Sec. 9.7. In the next section, we will relate it to the vanishing of the radiation pressure on a moving mirror.
ᵇ It is ironic that Varičak’s works [10,11,12] on hyperbolic geometry went almost completely unnoticed, yet his small note on whether the Lorentz contraction was real or not caused a great deal of commotion and confusion [Miller 81]. Einstein delegated Ehrenfest to answer Varičak, but, then, realizing that it might rock the boat of relativity, decided to answer himself.
The angle of parallelism in (10.1.16) is solely a function of the ‘distance’ $\bar{\beta}$. The latter is the hyperbolic measure of distance in velocity space,
\[ \bar{\beta} = \frac{1}{2}\ln\left(\frac{1+\beta}{1}\cdot\frac{1}{1-\beta}\right) = \tanh^{-1}\beta, \tag{10.3.1} \]
whose Euclidean measure is β. More precisely, (10.3.1) is the Klein length of the velocity segment. On the basis of (10.3.1), we get the basic relation for the measure of a straight line segment in Lobachevsky space,
\[ \beta = \tanh\bar{\beta} = \cos\vartheta(\bar{\beta}), \tag{10.3.2} \]
and
\[ \cosh\bar{\beta} = \frac{1}{\sqrt{1-\beta^2}}, \qquad \sinh\bar{\beta} = \frac{\beta}{\sqrt{1-\beta^2}}. \tag{10.3.3} \]
Whereas the first equality in (10.3.2) and those in (10.3.3) hold for all one-way Doppler shifts, the second equality in (10.3.2) is valid only at the angle of parallelism, where $\vartheta(\bar{\beta})$ is a function only of $\bar{\beta}$.
10.3.2
Reflection from a moving mirror
If ϑ is the angle that a ray makes with the surface of a mirror, and ϑ′ the angle of the reflected ray with respect to the surface of the mirror, then the law of reflection states that ϑ = ϑ′. This changes when the mirror is in motion. As we know from Sec. 3.5.1, radiation pressure has a long history since Maxwell first predicted it. It also constituted one of the early testing grounds of relativity. If the mirror is receding from the radiating source, the ratio of the wavelengths of impinging and reflected radiation is
\[ \frac{\lambda}{\lambda'} = \frac{\cos\vartheta + \beta}{\cos\vartheta' - \beta}, \tag{10.3.4} \]
because the wavelength is lengthened in the forward direction and shortened in the backward direction. The angle of reflection is referred to the frame in which the source is at rest,
\[ \cos\vartheta' = \frac{\cos\vartheta + \gamma}{1 + \gamma\cos\vartheta}, \tag{10.3.5} \]
where γ, given by (10.1.6), is the isomorphism from the Klein to the Poincaré models. It involves a two-step process for carrying a point β in the Poincaré disc to the corresponding point γ in the Klein model.
Introducing (10.3.5) into the ratio (10.3.4), where ϑ1 = ϑ and λ1 = λ′, leads to
\[ \frac{\lambda}{\lambda'} = \frac{1+\beta^2}{1-\beta^2}\,(1 + \gamma\cos\vartheta), \tag{10.3.6} \]
showing clearly that the wavelength of the reflected radiation, λ′, has been shortened with respect to the wavelength of the incoming radiation, λ. In fact, expression (10.3.6) is Doppler’s principle, (10.1.11), obtained by replacing the relative velocity β by −γ. Introducing (10.3.6) into the aberration equation (10.1.10), which just happens to have the same form as the law of reflection from a moving mirror, results in
\[ \tan(\vartheta'/2) = \frac{\sin\vartheta'}{1+\cos\vartheta'} = \frac{1-\beta}{1+\beta}\,\tan(\vartheta/2). \tag{10.3.7} \]
The ratio of the tangents is the square of that for aberration, (10.1.15b)!
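The chain (10.3.4) to (10.3.7) can be spot-checked numerically. The sketch below assumes that γ = 2β/(1 + β²), that the primed angle is the reflected one, and that the wavelength ratio reads λ/λ′ = (cos ϑ + β)/(cos ϑ′ − β); the values of β and ϑ are arbitrary, and the function name `reflected_angle` is hypothetical.

```python
import math

def reflected_angle(theta, beta):
    """Reflected angle from (10.3.5), with gamma = 2*beta / (1 + beta**2)."""
    gamma = 2.0 * beta / (1.0 + beta**2)
    cos_reflected = (math.cos(theta) + gamma) / (1.0 + gamma * math.cos(theta))
    return math.acos(cos_reflected)

beta, theta = 0.3, 1.0          # illustrative speed (units of c) and angle (rad)
gamma = 2.0 * beta / (1.0 + beta**2)
theta_r = reflected_angle(theta, beta)

# (10.3.7): ratio of the half-angle tangents versus the squared Doppler factor.
tangent_ratio = math.tan(theta_r / 2.0) / math.tan(theta / 2.0)
doppler_squared = (1.0 - beta) / (1.0 + beta)

# (10.3.4) evaluated with the reflected angle versus the closed form (10.3.6).
wavelength_ratio = (math.cos(theta) + beta) / (math.cos(theta_r) - beta)
closed_form = (1.0 + beta**2) / (1.0 - beta**2) * (1.0 + gamma * math.cos(theta))

print(tangent_ratio, doppler_squared)
print(wavelength_ratio, closed_form)
```

Both rows should show matching numbers, so the reflection law carries the square of the one-way aberration factor.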
10.4
Electromagnetic Radiation Pressure
Let us briefly summarize our results in Secs. 3.5.1, 6.6 and 9.8. Maxwell showed that the pressure exerted on a square centimeter by a beam of light is numerically equal to the energy in a cubic centimeter of the beam. Consider a plane wave of monochromatic light traveling in the x-direction. The Lorentz transformations of the relevant components of the electric, E, and magnetic, H, fields are
\[ E'_x = E_x, \qquad E'_y = \gamma\left(E_y - \beta H_z\right), \qquad H'_z = \gamma\left(H_z - \beta E_y\right), \]
where for a plane wave propagating in the x-direction, $E_y = H_z$. The radiation pressure, P′, in the frame moving at velocity, u, is related to the pressure in the stationary frame, P, according to (3.5.1)ᶜ
\[ P' = \frac{1}{2\pi}E'^{\,2}_y = P\,\frac{1-\beta}{1+\beta}, \tag{10.4.1} \]
ᶜ This expression also appeared in Abraham’s [23] work, but we have not been able to establish a priority claim with respect to the analysis of Poynting which we discussed in Sec. 3.5.1.
where $P = (1/2\pi)E_y^2$ is Maxwell’s prescription of associating the pressure acting on a square centimeter of surface with the energy density in a cubic centimeter of the beam. Let us remind ourselves that the relativistic Doppler shift in the frequency ν′, from its stationary value, ν,
\[ K := \frac{\nu'}{\nu} = \gamma\,(1 - \beta\cos\vartheta), \tag{10.4.2} \]
combines the ordinary Doppler shift with the relativistic time dilatation factor. Of course, (10.4.2) can be derived from the Lorentz transformation; it can also be derived, however, in more general terms from the relative velocity, w, which is related to the corresponding segment $\bar{s}$ of the Lobachevsky straight line (10.1.1) by $w = c\tanh\bar{s}$. Expression (10.1.1) spans the entire gamut: from a single velocity, $\beta = c\tanh(\bar{u}/c)$ [the first equality in Eq. (10.3.2)], to equal and opposite velocities, $\gamma = c\tanh\bar{\gamma}$ [Eq. (10.4.12) below]. If the energy increases with speed w asᵈ
\[ E = \frac{E_0}{\sqrt{1 - w^2/c^2}}, \]
ᵈ As we discussed in Sec. 5.4.4, Abraham’s [04] model was an early contender to take into account the electron’s energy dependency upon speed, in which he took Searle’s [97] expression for the total energy of a spherical body of radius r with a uniform distribution of charge, e, in motion with a uniform speed w,
\[ E = \frac{e^2}{2r}\left[\frac{c}{w}\ln\left(\frac{1+w/c}{1-w/c}\right) - 1\right], \]
for the energy of an electron. It is commonly believed that Abraham’s model distinguishes itself insofar as the electron remains rigid both in the state of rest and in the state of relative motion. If this were true, its energy would not be a function of the relative velocity. Abraham, as we have seen, obtains this dependency on invoking a dilatation of the semimajor axis that depends on the relative velocity through the Lorentz factor. His expression for the energy shows that the energy is proportional to the difference in the hyperbolic and Euclidean measures of the speed, $(\bar{w} - w)$, where Poincaré’s hyperbolic measure is given by the logarithm of the cross-ratio,
\[ \bar{w} = c\,\ln\left(\frac{1+w/c}{1-w/c}\right). \]
It demonstrates that the body’s energy, and hence its mass, increases as a result of the motion, and shows that such a dependency is tied to the deviation from Euclidean geometry.
then
\[ E' = E\,\frac{1 - \mathbf{u}\cdot\mathbf{v}/c^2}{\sqrt{1 - u^2/c^2}}, \]
where $E/E_0 = 1/\sqrt{1 - v^2/c^2}$. For v = c cos ϑ we get
\[ E' = E\,\frac{1 - \beta\cos\vartheta}{\sqrt{1-\beta^2}}. \tag{10.4.3} \]
The energy, (10.4.3), and amplitude [cf. Eq. (10.4.9) below], transform in the same way as the frequency, (10.4.2). This was stressed by Einstein [98] as being of particular relevance since, according to him, Wien’s distribution is related to it. Because the volume transforms as the inverse of the frequency, the energy density, ε, will transform as the square of the frequency,
\[ \varepsilon' = K^2\varepsilon. \tag{10.4.4} \]
But this is none other than what Poynting claimed in Sec. 3.5! Observing the motion in the line of sight, (10.4.4) reduces to Poynting’s expression (10.4.1) for the energy densities [cf. first equation in Sec. 3.5.1]. In the general case, the radiation falls obliquely on the mirror, making an angle ϑ′ with the normal, as in Fig. 10.5. The energy that falls on a unit area normal to the rays (CB in Fig. 10.5) is spread over an area of magnitude 1/cos ϑ′ on the surface AB. In addition, the component of the momentum is smaller by a factor of cos ϑ′ than if it were directed normal to the surface. Consequently, the momentum per unit area is decreased by a factor of cos² ϑ′, and the energy density must be multiplied by this factor when calculating the pressure. We thus obtain
\[ P' = 2\varepsilon'\cos^2\vartheta' = 2\varepsilon\,\frac{(\cos\vartheta - \beta)^2}{1-\beta^2}, \tag{10.4.5} \]
for the radiation pressure, where the pre-factor 2 comes from the fact that, upon reflection, the mirror receives Poynting’s ‘double dose’ of momentum, and the pressure is, consequently, doubled.
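The two forms of the pressure in (10.4.5) can be compared numerically. The sketch assumes that the aberration formula (10.1.5) has the standard form cos ϑ′ = (cos ϑ − β)/(1 − β cos ϑ) and that K is the Doppler factor of (10.4.2); the function names and the sample values are illustrative only.

```python
import math

def pressure_frame_change(eps, theta, beta):
    """2 * eps' * cos^2(theta') with eps' = K^2 * eps, as in (10.4.4)."""
    k = (1.0 - beta * math.cos(theta)) / math.sqrt(1.0 - beta**2)
    # Assumed standard aberration law for the angle in the moving frame.
    cos_prime = (math.cos(theta) - beta) / (1.0 - beta * math.cos(theta))
    return 2.0 * eps * k**2 * cos_prime**2

def pressure_closed_form(eps, theta, beta):
    """Right side of (10.4.5)."""
    return 2.0 * eps * (math.cos(theta) - beta) ** 2 / (1.0 - beta**2)

eps, theta, beta = 1.0, 0.4, 0.5            # illustrative values
print(pressure_frame_change(eps, theta, beta),
      pressure_closed_form(eps, theta, beta))

# At the critical angle cos(theta) = beta the pressure vanishes.
print(pressure_closed_form(eps, math.acos(beta), beta))
```

The Doppler factor cancels against the denominator of the aberration law, which is why only (cos ϑ − β)² survives in the closed form.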
Fig. 10.5. Radiation falling obliquely on a mirror of length AB.
Whereas the derivation of the radiation pressure on a moving mirror based on aberration is conceptually incoherent [Terrell 61], Einstein’s [98] original derivation is not. From his two-way Doppler shift, and his requirement to calculate the reflected energy in the same frame as the incident energy, he could have deduced many of the results presented here, together with the realization of the intimate relationship between special relativity and hyperbolic geometry that, as we have seen in Chapter 9, applies to relativity in general. We shall now show that whereas (10.1.15b) is related to one-way aberration, its square, (10.3.7), relates to the change in wavelength on reflection from a moving mirror. Although Einstein [98] gets the same result as (10.4.5), he does so by using energy conservation and by transforming to the mirror’s moving frame, reflecting, and transforming back to the stationary frame. The first step would have yielded half the pressure, as shown below, but is more enlightening than the method used above, since it brings out the fact that it is a second-order relativistic effect. Einstein obtains the frequency shift,ᵉ
\[ \nu' = \nu\,\frac{1 + \beta^2 - 2\beta\cos\vartheta}{1-\beta^2}, \tag{10.4.6} \]
upon reflection. In addition, he gives the law of the transformation of the cosine of the angle,
\[ \cos\vartheta' = \frac{(1+\beta^2)\cos\vartheta - 2\beta}{1 + \beta^2 - 2\beta\cos\vartheta}, \tag{10.4.7} \]
which is not the aberration formula (10.1.5), but, rather, (10.3.5) with β → −β. Had Einstein used the above procedure to calculate the radiation pressure, he would have obtained
\[ \begin{aligned}
P' &= 2\varepsilon\left(\frac{1 + \beta^2 - 2\beta\cos\vartheta}{1-\beta^2}\right)^{2}\left(\frac{(1+\beta^2)\cos\vartheta - 2\beta}{1 + \beta^2 - 2\beta\cos\vartheta}\right)^{2} \\
&= 2\varepsilon\left(\frac{1+\beta^2}{1-\beta^2}\right)^{2}(\cos\vartheta - \gamma)^2,
\end{aligned} \tag{10.4.8} \]
ᵉ Einstein later corrects the denominator to read as in expression (10.4.6).
which is certainly not (10.4.5). This is the radiation pressure that a mirror feels when it moves at constant relative speed γ. Pauli [58] uses the fact that the amplitudes, A′ and A, transform as the frequencies, i.e.
\[ A' = A\,\frac{1 - \beta\cos\vartheta}{\sqrt{1-\beta^2}}, \tag{10.4.9} \]
to claim that the radiation pressure,
\[ P = 2A^2\,\frac{(\cos\vartheta - \beta)^2}{1-\beta^2} = 2A'^{\,2}\cos^2\vartheta' = P', \tag{10.4.10} \]
is invariant. This is not, however, what one would conclude from (10.4.1). Recall from Sec. 6.4 that the invariance of the pressure was first established by Planck [08] by studying how thermodynamic densities transform under the Lorentz transformation. Since 1 ≥ cos ϑ ≥ β, we average (10.4.10) over the solid angle with the given limits and get
\[ P_{\rm tot}(\beta) = \frac{1}{4\pi}\int_0^{\cos^{-1}\beta} P\cdot 2\pi\sin\vartheta\,d\vartheta = \frac{\varepsilon}{1-\beta^2}\int_0^{1-\beta} x^2\,dx = \frac{\varepsilon}{3}\,\frac{(1-\beta)^2}{1+\beta}. \tag{10.4.11} \]
This result differs from Terrell [61] in the limits of integration, and it is also at variance with Rindler and Sciama [61], and Schlegel [60]. The total radiation pressure, (10.4.11), tends to its classical value of ε/3 in the limit as β → 0, and vanishes in the limit as β → 1. The former is the blackbody radiation limit, and the latter is completely comprehensible since light waves cannot exert a pressure on an object which is traveling at the same speed. Whereas Terrell finds the same classical limit for the radiation pressure, he concludes that “it becomes infinite for β = 1,” which is (10.4.11) under β → −β, i.e. the mirror is approaching the radiation source. Curiously, the arithmetic average of forward and backward pressures,
\[ \frac{1}{2}\left[P_{\rm tot}(\beta) + P_{\rm tot}(-\beta)\right] = \frac{1}{3}\,\varepsilon\,\frac{1+3\beta^2}{1-\beta^2}, \]
is precisely what von Laue [19] finds for the xx-component of the stress tensor in an inertial frame moving in the x-direction.
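The angular average (10.4.11) and the forward/backward mean can be confirmed by direct quadrature. The following sketch integrates the pressure 2ε(cos ϑ − β)²/(1 − β²) with a composite Simpson rule written out by hand; the step count, the value β = 0.4, and the function names are arbitrary choices made for illustration.

```python
import math

def pressure(theta, beta, eps=1.0):
    """Pressure integrand: 2 * eps * (cos(theta) - beta)^2 / (1 - beta^2)."""
    return 2.0 * eps * (math.cos(theta) - beta) ** 2 / (1.0 - beta**2)

def total_pressure_numeric(beta, eps=1.0, steps=2000):
    """Composite Simpson rule for (1/4pi) * integral of P * 2pi * sin(theta)."""
    upper = math.acos(beta)                 # limit where cos(theta) = beta
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        theta = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * 0.5 * pressure(theta, beta, eps) * math.sin(theta)
    return total * h / 3.0

def total_pressure_closed(beta, eps=1.0):
    """Closed form on the right of (10.4.11)."""
    return eps / 3.0 * (1.0 - beta) ** 2 / (1.0 + beta)

beta = 0.4
print(total_pressure_numeric(beta), total_pressure_closed(beta))

average = 0.5 * (total_pressure_closed(beta) + total_pressure_closed(-beta))
print(average, (1.0 + 3.0 * beta**2) / (3.0 * (1.0 - beta**2)))
```

Setting β = 0 in `total_pressure_closed` returns the classical value ε/3, and the value decreases toward zero as β approaches 1.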
The process of reflection changes the Euclidean measure of the relative speed, β, into
\[ \gamma = \tanh\bar{\gamma}, \tag{10.4.12} \]
which is now the corresponding segment of the Lobachevsky straight line in velocity space. When (10.4.12) approaches cos ϑ, the angle of parallelism is reached and the radiation pressure will vanish [cf. Eq. (10.3.2)]. The relations between first- and second-order relativistic effects are
\[ \cosh 2(\bar{u}/c) = \cosh^2(\bar{u}/c) + \sinh^2(\bar{u}/c) = \cosh\bar{\gamma} = \Gamma = \frac{1+\beta^2}{1-\beta^2}, \tag{10.4.13} \]
\[ \sinh 2(\bar{u}/c) = 2\sinh(\bar{u}/c)\cosh(\bar{u}/c) = \sinh\bar{\gamma} = \frac{2\beta}{1-\beta^2}, \tag{10.4.14} \]
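which were originally derived by Varičak [10] way back in 1910! With $\beta = \tanh(\bar{\gamma}/2)$ as the relative speed, and $\cos\vartheta = \tanh\bar{\delta}$, the conservation of energy demands
\[ P'\beta = \varepsilon(\cos\vartheta - \beta) - \varepsilon'(\cos\vartheta' + \beta), \tag{10.4.15} \]
which is given explicitly by
\[ 2\varepsilon\,\frac{\sinh^2(\bar{\delta} - \bar{\gamma}/2)}{\cosh^2\bar{\delta}}\,\tanh(\bar{\gamma}/2) = \varepsilon\left\{\tanh\bar{\delta} - \tanh(\bar{\gamma}/2) - \frac{\cosh^2(\bar{\delta} - \bar{\gamma})}{\cosh^2\bar{\delta}}\left[\tanh(\bar{\delta} - \bar{\gamma}) + \tanh(\bar{\gamma}/2)\right]\right\}, \]
where the reflected cosine, $\cos\vartheta' = \tanh(\bar{\delta} - \bar{\gamma})$, follows from (10.4.7). Equation (10.4.15) expresses the fact that the difference in energy is equal to the work, P′β. In the limit as $\bar{\delta} \to \infty$, we get the line of sight relation,
\[ P' = 2\varepsilon e^{-\bar{\gamma}} = 2\varepsilon\,\frac{1-\beta}{1+\beta}, \]
which is (10.4.1). When radiation impinges on a forward moving mirror, the wavelength of incident radiation is shortened by the amount proportional to 1 − β, while the reflected radiation is elongated by an amount proportional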
to 1 + β.ᶠ In fact, (10.4.15) is the negative of what Pauli [58] considers as Einstein’s expression for the radiation pressure. P′β is the work that is required to move the mirror backwards. Poynting [10] very vividly describes pressure absorption as the ceasing of wave motion at a black surface, where the waves deliver up all their momentum.
Since the waves press against [the black surface] as much as they pressed against [the source] in being emitted. . . the pressure against [the black surface] is therefore equal to the energy density per cubic centimeter in the beam.
If the source is moving forward at constant relative speed β the work, P′β, is determined by the one-way Doppler shift; that is, the difference between the incident energy density per unit area per unit time, ε(cos ϑ − β), and the energy absorbed by the black surface, ε′ cos ϑ′, viz.
\[ \begin{aligned}
P'\beta &= \varepsilon(\cos\vartheta - \beta) - \varepsilon'\cos\vartheta' \\
&= \varepsilon\left[\cos\vartheta - \beta - \frac{(1 - \beta\cos\vartheta)^2}{1-\beta^2}\cdot\frac{\cos\vartheta - \beta}{1 - \beta\cos\vartheta}\right] \\
&= \varepsilon\left[\tanh\bar{\delta} - \tanh(\bar{\gamma}/2) - \frac{\cosh^2(\bar{\delta} - \bar{\gamma}/2)}{\cosh^2\bar{\delta}}\cdot\tanh(\bar{\delta} - \bar{\gamma}/2)\right].
\end{aligned} \]
This gives a radiation pressure,
\[ P' = \varepsilon\,\frac{(\cos\vartheta - \beta)^2}{1-\beta^2} = \varepsilon\,\frac{\sinh^2(\bar{\delta} - \bar{\gamma}/2)}{\cosh^2\bar{\delta}}, \]
that is exactly half of (10.4.5). It has the same form as (10.4.8), since the latter can be expressed as
\[ P' = 2\varepsilon\,\frac{\sinh^2(\bar{\delta} - \bar{\gamma})}{\cosh^2\bar{\delta}}. \]
The fact that the wavelength at which the radiation is absorbed is greater than that at which it is emitted by the source, i.e.
\[ \lambda'/\lambda = (1 - \beta\cos\vartheta)^{-1}, \]
ᶠ No appeal is being made to emission theory since the frequency varies inversely to the wavelength in order to maintain the speed of light constant. Electromagnetic vibrations are self-contained, and are not those of the medium.
means that less energy is absorbed than was emitted. This is true also for the two-way shift. The factor of 2 has led to the confusion of whether to consider the radiation pressure reflected by a moving mirror as being a one- or two-way Doppler shift, or equivalently, as belonging to the Klein or Poincaré model of the hyperbolic plane.
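The factor of 2 between the one-way and two-way balances can be made concrete. The sketch below evaluates both energy balances in hyperbolic variables, with β = tanh(γ̄/2) and cos ϑ = tanh δ̄. It assumes that the reflected cosine in the two-way case is tanh(δ̄ − γ̄), as (10.4.7) gives, and that the absorbed cosine in the one-way case is tanh(δ̄ − γ̄/2); the function names and sample values are hypothetical.

```python
import math

def two_way_work(delta, gamma_bar, eps=1.0):
    """Energy balance (10.4.15): incident minus reflected energy flux."""
    half = gamma_bar / 2.0
    reflected = (math.cosh(delta - gamma_bar) / math.cosh(delta)) ** 2
    return eps * (math.tanh(delta) - math.tanh(half)
                  - reflected * (math.tanh(delta - gamma_bar) + math.tanh(half)))

def one_way_work(delta, gamma_bar, eps=1.0):
    """Energy balance for absorption at a black surface (one-way shift)."""
    half = gamma_bar / 2.0
    absorbed = (math.cosh(delta - half) / math.cosh(delta)) ** 2
    return eps * (math.tanh(delta) - math.tanh(half)
                  - absorbed * math.tanh(delta - half))

def sinh_form(delta, gamma_bar, eps=1.0):
    """eps * sinh^2(delta - gamma_bar/2) / cosh^2(delta)."""
    return eps * (math.sinh(delta - gamma_bar / 2.0) / math.cosh(delta)) ** 2

delta, gamma_bar = 1.2, 0.8                 # illustrative hyperbolic measures
beta = math.tanh(gamma_bar / 2.0)

p_two_way = two_way_work(delta, gamma_bar) / beta   # pressure = work / beta
p_one_way = one_way_work(delta, gamma_bar) / beta
print(p_two_way, 2.0 * sinh_form(delta, gamma_bar))
print(p_one_way, sinh_form(delta, gamma_bar))
```

Under these assumptions the reflected case returns exactly twice the absorbed case, matching the statement that the one-way result is half of (10.4.5).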
10.5
Angle of Parallelism and the Vanishing of the Radiation Pressure
Consider again a unit disc with center O and some hyperbolic distance $\bar{\beta}$ whose value is (10.3.1). This defines the ‘distance’ $\bar{\beta}$ in terms of the logarithm of the cross-ratio. Now consider the right triangle that has an angle A at the origin, as shown in Fig. 9.8. We recall from Sec. 9.7 that since the angle A is located at the origin, the hyperbolic measure of A will be the same as its Euclidean measure. Also recall that, since hyperbolic tangents correspond to straight lines in Lobachevsky space, the cosine of the angle will be the ratio of the adjacent to the hypotenuse, $\cos A = \cos\bar{A} = \tanh\bar{\beta}/\tanh\bar{\gamma}$, where A is the Euclidean measure of the angle, and we have set the absolute constant equal to one. Now, the Euclidean length of the opposite side, α, can be calculated from the cross-ratio, and what we found was
\[ \alpha = \tanh\bar{\alpha}\,\operatorname{sech}\bar{\beta}, \tag{10.5.1} \]
or (9.7.2). Expression (10.5.1) represents the ratio of the Euclidean to hyperbolic arc lengths, which becomes progressively smaller than 1 the larger the Euclidean distance in velocity space. This is the origin of the Lorentz contraction in the direction normal to the motion that we found in Sec. 9.7. Had one endpoint of the line-segment been located at the origin O, there would have been no distortion of the angle and therefore the line segment would have been $\tanh\bar{\alpha}$. As we have seen in Sec. 9.7, its non-central location is what is responsible for the angle defect that we observe. In the projective model, the hyperbolic measures of all other angles will be different from their Euclidean counterparts. For the angle B,
\[ \cos B = \frac{\alpha}{\gamma} = \frac{\tanh\bar{\alpha}}{\tanh\bar{\gamma}}\,\operatorname{sech}\bar{\beta} = \cos\bar{B}\,\sqrt{1-\beta^2}. \]
Since the last term is less than unity, $\cos\bar{B} > \cos B$. And since cosine is a decreasing function over the open interval (0, π), it follows that $\bar{B} < B$, so that the hyperbolic angle sum will be less than the Euclidean sum, π. Thus, the angular defect is the origin of the FitzGerald–Lorentz contraction in the direction normal to the motion, just like the second-order Doppler shift. The first-order, longitudinal, Doppler shift plays a fundamental role in hyperbolic geometry. It determines the velocity composition law and the cross-ratio, and hence the hyperbolic distance. The second-order, lateral, Doppler shift is the ratio of the Euclidean to hyperbolic line segments and determines the angle defect. Now, the largest value of α occurs when it reaches the chord PQ. Its hyperbolic measure becomes infinite, and the angle B tends to zero for $\bar{\beta} = 0$. However, the Euclidean measure of α is
\[ \alpha = \sin A(\bar{\beta}) = \sqrt{1-\beta_{\max}^2} = \operatorname{sech}\bar{\beta}, \tag{10.5.2} \]
where the maximum relative velocity is $\bar{\delta}$ in Fig. 9.8. Only at the angle of parallelism can rotation be linked to hyperbolic contraction, and this is precisely what happens when $A = \cos^{-1}\beta_{\max}$. Since the Euclidean measure of the hypotenuse, γ = 1, (10.3.1) gives
\[ \bar{\beta} = \frac{1}{2}\ln\left(\frac{1+\cos A}{1-\cos A}\right) = \frac{1}{2}\ln\left(\frac{1+\cos A}{\sin A}\right)^{2} = \ln\cot(A/2), \]
where A is the angle of parallelism, which is a function of the length $\bar{\beta}$ at $\beta_{\max} = \cos A$ in Fig. 9.8. We recall that expression (10.5.2) is what Terrell [59] finds for the rotation of an object that an observer will see in the same frame as the moving object when the stationary observer’s view is in the direction normal to the motion. And since this is a limiting form of aberration it does not depend upon the distance between the observer and the object that is being observed. Associating the angle A with ϑ, the radiation pressure (10.4.5) vanishes for the one-way shift at the critical angle $\vartheta = \cos^{-1}\beta_{\max}$, whereas for a
Fig. 10.6. The Poincaré half-plane model of measuring distances.
two-way shift (10.4.8) vanishes at its critical angle, $\vartheta = \cos^{-1}\gamma_{\max}$. At these critical angles, the waves have ceased to press against the mirror, and, consequently, the radiation pressure vanishes. This is what the Klein disc model predicts. Now, let us see what the Poincaré disc has to say about two-way Doppler shifts. Since the model is conformal there is no need to distinguish between Euclidean and hyperbolic measures of the angles. Consider the hyperbolic arc length, $\bar\gamma$, from A to B in the Poincaré half-plane, in Fig. 10.6, of a semi-circle of radius 1. Its length is determined by the logarithm of the cross-ratio, $\{A', B' | P, Q\}$, where the primes denote the projections of A and B onto the x-axis. Using the Klein definition of hyperbolic distance, (10.2.1), we have
$$\bar\gamma = \frac{1}{2}\ln\{A', B' | P, Q\} = \frac{1}{2}\ln\left(\frac{1+\cos\vartheta}{1-\cos\vartheta}\cdot\frac{1-\cos\vartheta'}{1+\cos\vartheta'}\right), \tag{10.5.3}$$
where $\vartheta = \angle BOQ$ and $\vartheta' = \angle AOQ$. Hence, the Euclidean length of $\gamma$ is
$$\tanh\bar\gamma = \frac{\cos\vartheta - \cos\vartheta'}{1 - \cos\vartheta\cos\vartheta'}, \tag{10.5.4}$$
and when $\bar\gamma$ becomes the hyperbolic length RB, $\vartheta' = \pi/2$, (10.5.4) reduces to [cf. Eq. (10.3.2)]
$$\gamma_{\max} = \tanh\bar\gamma = \cos\vartheta(\gamma_{\max}). \tag{10.5.5}$$
And since
$$\bar\gamma = \tanh^{-1}\gamma = \frac{1}{2}\ln\frac{1+\gamma}{1-\gamma} = \ln\frac{1+\beta}{1-\beta} = 2\bar\beta, \tag{10.5.6}$$
we, in effect, are dealing with Poincaré’s definition of hyperbolic distance [cf. Eq. (10.3.1)]. Thus, $\vartheta$ becomes the angle of parallelism, which is a function solely of the arc length, $\bar\gamma$. This is so because BR is perpendicular to the line $h$ whose bounding parallel through B is $h'$. Hence, the angle between BR and $h'$ is also equal to $\vartheta$.
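The middle step of (10.5.6) can be written out explicitly. The sketch below assumes that $\gamma$ is the composition of two equal velocities $\beta$, i.e. $\gamma = 2\beta/(1+\beta^2)$; that relation is not restated in this passage and is taken here as an assumption consistent with (10.5.6).

```latex
\begin{align*}
\frac{1+\gamma}{1-\gamma}
  &= \frac{1+\beta^{2}+2\beta}{1+\beta^{2}-2\beta}
   = \left(\frac{1+\beta}{1-\beta}\right)^{2}, \\
\bar\gamma
  &= \frac{1}{2}\ln\frac{1+\gamma}{1-\gamma}
   = \ln\frac{1+\beta}{1-\beta}
   = 2\bar\beta ,
\qquad
\bar\beta = \frac{1}{2}\ln\frac{1+\beta}{1-\beta}.
\end{align*}
```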
10.6 Transverse Doppler Shifts as Experimental Evidence for the Angle of Parallelism
The one-way Doppler shift, (10.4.2), predicts a small ‘blueshift’ when $\vartheta = \pi/2$,
$$\nu' = \nu/\sqrt{1 - \beta^2}. \tag{10.6.1}$$
As we know from Sec. 3.4, Ives and Stilwell [38] were the first to test time dilatation by measuring the difference in the Doppler shift of spectral lines emitted in the forward and backward directions by a uniformly moving beam of hydrogen atoms. It might be more advantageous to consider the two-way Doppler shift, where (10.3.6) gives the frequency shift
$$\nu' = \nu\,\frac{1 + \gamma\cos\vartheta}{\sqrt{1 - \gamma^2}}. \tag{10.6.2}$$
The two-way aberration formula,
$$\sin\vartheta' = \frac{\sqrt{1 - \gamma^2}}{1 + \gamma\cos\vartheta}\,\sin\vartheta,$$
together with (10.6.2) leads immediately to the law of sines, (10.1.10), for a moving mirror, which, as we have pointed out, also happens to be the formula for aberration.
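A quick numerical check makes the last statement concrete: multiplying the shifted frequency of (10.6.2) by the aberrated sine should reproduce $\nu\sin\vartheta$, i.e. $\nu'\sin\vartheta' = \nu\sin\vartheta$. The following is only a sketch; the function names and the sample values of $\nu$, $\gamma$ and $\vartheta$ are illustrative assumptions, not taken from the text.

```python
import math


def two_way_shift(nu, gamma, theta):
    """Shifted frequency nu' from (10.6.2)."""
    return nu * (1.0 + gamma * math.cos(theta)) / math.sqrt(1.0 - gamma**2)


def aberrated_sine(gamma, theta):
    """sin(theta') from the two-way aberration formula."""
    return math.sqrt(1.0 - gamma**2) * math.sin(theta) / (1.0 + gamma * math.cos(theta))


# Illustrative values only (units with c = 1, so 0 <= gamma < 1).
nu, gamma, theta = 1.0, 0.6, 1.1

lhs = two_way_shift(nu, gamma, theta) * aberrated_sine(gamma, theta)  # nu' sin(theta')
rhs = nu * math.sin(theta)                                            # nu sin(theta)
law_of_sines_holds = math.isclose(lhs, rhs, rel_tol=1e-12)
```

The factors $(1+\gamma\cos\vartheta)$ and $\sqrt{1-\gamma^2}$ cancel identically, so the check holds for any admissible $\gamma$ and $\vartheta$.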
The two-way Doppler shift, (10.6.2), like its one-way counterpart (10.4.2), predicts a ‘redshift’ as either the transmitter or the receiver recedes from the other. However, for $\vartheta = \pi/2$, a blueshift would remain. The shifted frequency would be
$$\nu' = \nu\,\frac{1 + \beta^2}{1 - \beta^2} = \nu/\sin\vartheta'. \tag{10.6.3}$$
In this limit, (10.3.7) reduces to the angle of parallelism:
$$\tan(\vartheta'/2) = \frac{1-\beta}{1+\beta} = e^{-\bar\gamma}, \tag{10.6.4}$$
which follows from (10.5.6). The angle $\vartheta'$ is, indeed, acute, and $\gamma = \tanh\bar\gamma = \cos\vartheta'$. Therefore, a second-order shift predicted by (10.6.3) would be a direct confirmation that relativity operates in hyperbolic velocity space. At the present time, the experimental evidence is not conclusive. Light pulses reflected from a rotating mirror have not shown relativistic frequency shifts [Davies & Jennison 75], nor have those from dual disks rotating at equal speeds in opposite directions operating in the microwave region [Thim 03]. However, a positive result has been reported by measuring the Mössbauer effect with source and absorber mounted on a rotating disk [Champeney et al. 64]. The null results can possibly be explained by a confusion between one-way and two-way Doppler shifts. In [Thim 03], the one-way, (10.6.1), and two-way, (10.6.3), shifts were placed on equal footing because both predict a frequency shift proportional to $\beta^2$. Hence, it is not clear to experimenters that what they should be looking for is a two-way, second-order Doppler shift, and not a first-order one.
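The chain of equalities in (10.6.3) and (10.6.4) can be checked numerically. This is a minimal sketch under one stated assumption: $\gamma = 2\beta/(1+\beta^2)$, the relation implied by (10.5.6). The function name and the returned dictionary keys are hypothetical.

```python
import math


def transverse_two_way_quantities(beta):
    """Collect both sides of (10.6.3) and (10.6.4) for a given beta (c = 1)."""
    gamma = 2.0 * beta / (1.0 + beta**2)   # assumed relation between gamma and beta
    gamma_bar = math.atanh(gamma)          # hyperbolic measure, as in (10.5.6)
    theta_prime = math.acos(gamma)         # from gamma = cos(theta')
    return {
        "shift": (1.0 + beta**2) / (1.0 - beta**2),   # nu'/nu in (10.6.3)
        "inverse_sine": 1.0 / math.sin(theta_prime),  # 1/sin(theta')
        "tan_half": math.tan(theta_prime / 2.0),      # left side of (10.6.4)
        "ratio": (1.0 - beta) / (1.0 + beta),         # middle of (10.6.4)
        "exponential": math.exp(-gamma_bar),          # right side of (10.6.4)
    }


q = transverse_two_way_quantities(0.3)   # illustrative beta only
```

For any $0 < \beta < 1$ the first two entries should agree with each other, and so should the last three.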
References
[Abraham 04] M. Abraham, Boltzmann-Festschrift (1904), p. 85; “Zur Theorie der Strahlung und des Strahlungsdruckes,” Ann. der Phys. 14 (1904) 236–287.
[Abraham 23] M. Abraham, Theorie der Elektrizität, 5th ed. (Teubner, Leipzig, 1923), p. 316.
[Borel 13] E. Borel, “La théorie de la relativité et la cinématique,” Comptes Rendus des séances de l’Académie des Sciences 156 (1913) 215–217.
[Busemann & Kelly 53] H. Busemann and P. J. Kelly, Projective Geometry and Projective Metrics (Academic Press, New York, 1953), p. 186.
[Champeney et al. 64] D. C. Champeney, G. R. Isaak, and A. M. Khan, “A time dilatation experiment based on the Mössbauer effect,” Proc. Phys. Soc. 85 (1964) 583–593.
[Criado & Alamo 01] C. Criado and N. Alamo, “A link between the bounds on relativistic velocities and the areas of hyperbolic triangles,” Am. J. Phys. 69 (2001) 306–310. The formula for the metric on page 307 in this article is inaccurate. It is given correctly in footnote 10, where r is replaced by v.
[Davies & Jennison 75] P. A. Davies and R. C. Jennison, “Experiments involving mirror transponders in rotating frames,” J. Phys. A 8 (1975) 1390.
[Einstein 11] A. Einstein, “Zum Ehrenfestschen Paradoxon,” Phys. Z. 12 (1911) 509–510.
[Einstein 98] A. Einstein, “On the electrodynamics of moving bodies,” in Einstein’s Miraculous Year, ed. J. Stachel (Princeton U. P., Princeton NJ, 1998), pp. 123–160.
[Fock 66] V. Fock, The Theory of Space, Time, and Gravitation, 2nd ed. (Pergamon Press, Oxford, 1966), pp. 375–383.
[Friedmann 22] A. Friedmann, “Über die Krümmung des Raumes,” Z. Phys. 10 (1922) 377–386.
[Greenberg 93] M. J. Greenberg, Euclidean and Non-Euclidean Geometries, 3rd ed. (W. H. Freeman, New York, 1993), p. 434.
[Ives & Stilwell 38] H. E. Ives and G. R. Stilwell, “An experimental study of rapidly moving objects,” J. Opt. Soc. Amer. 28 (1938) 215–226.
[Jackson 75] J. D. Jackson, Classical Electrodynamics, 3rd ed. (Wiley, New York, 1975), p. 546.
[Kulczycki 61] S. Kulczycki, Non-Euclidean Geometry (Pergamon Press, Oxford, 1961), p. 77.
[Larmor 00] J. Larmor, Aether and Matter (Cambridge U. P., London, 1900), pp. 177–179.
[Laue 19] M. von Laue, Die Relativitätstheorie, Vol. 1 (Vieweg, Braunschweig, 1919), p. 205, first formula in the third line of equation (XXVIII).
[Miller 81] A. Miller, Albert Einstein’s Special Theory of Relativity (Addison-Wesley, Reading MA, 1981), pp. 249–253.
[Needham 97] T. Needham, Visual Complex Analysis (Clarendon Press, Oxford, 2005), p. 307.
[Pauli 58] W. Pauli, Theory of Relativity (Dover, New York, 1958), p. 97.
[Planck 08] M. Planck, “Zur Dynamik Bewegter Systeme,” Ann. d. Phys. 26 (1908) 1–34.
[Poynting 10] J. H. Poynting, The Pressure of Light (Soc. Promoting Christian Knowledge, London, 1910), p. 85.
[Rindler & Sciama 61] W. Rindler and D. W. Sciama, “Radiation pressure on a rapidly moving surface,” Am. J. Phys. 29 (1961) 643.
[Rindler 82] W. Rindler, Introduction to Special Relativity (Clarendon Press, Oxford, 1982), p. 48.
[Sard 70] R. D. Sard, Relativistic Mechanics (Benjamin, New York, 1970), p. 289.
[Searle 97] G. F. C. Searle, “On the motion of an electrified ellipsoid,” Phil. Mag. 44 (1897) 329–341.
[Schiff 60] L. I. Schiff, “Motion of a gyroscope according to Einstein’s theory of gravitation,” Proc. Natl. Acad. Sci. 46 (1960) 871–882.
[Schlegel 60] R. Schlegel, “Radiation pressure on a rapidly moving surface,” Am. J. Phys. 28 (1960) 687–694.
[Stachel 95] J. J. Stachel, “History of relativity,” in Twentieth Century Physics, eds. L. M. Brown et al. (AIP, New York, 1995), Vol. 1, 249–356.
[Terrell 59] J. Terrell, “Invisibility of the Lorentz contraction,” Phys. Rev. 116 (1959) 1041–1045.
[Terrell 61] J. Terrell, “Radiation pressure on a relativistically moving mirror,” Am. J. Phys. 29 (1961) 644.
[Thim 03] W. H. Thim, “Absence of relativistic transverse Doppler effect at microwave frequencies,” IEEE Trans. Instrum. Meas. 52 (2003) 1660–1664.
[Varičak 10] V. Varičak, “Die Reflexion des Lichtes an Bewegten Spiegeln,” Physik Zeitschr. XI (1910) 586–587.
[Varičak 11] V. Varičak, “Zum Ehrenfestschen Paradoxon,” Phys. Z. 12 (1911) 169.
[Varičak 12] V. Varičak, “Über die nichteuklidische interpretation der relativtheorie,” Jber. Dtsch. Mat. Ver. 21 (1912) 103–127.
[Weisskopf 60] V. F. Weisskopf, “The visual appearance of rapidly moving objects,” Phys. Today Sept. 1960, 24–27.
Chapter 11
The Inertia of Polarization
Special relativity killed the classical dream of using the energy–momentum–velocity relations of a particle as a means of probing the dynamic origins of mass. [Pais 82]
11.1 Polarization and Relativity
Polarization is a property of the orientation of oscillators that produces transverse wave motion. In the case of light waves that travel without obstruction, the polarization is always normal to the direction of propagation. The medium ‘wiggles’ back and forth in a direction perpendicular to the direction of motion. For plane waves of electromagnetic origin, the transversality condition demands that the electric and magnetic fields be perpendicular to the direction of propagation, and perpendicular to each other. Traditionally, the electric field vector has been used to describe polarization, since the magnetic field vector is both proportional, and perpendicular, to it. When the wave is polarized, the electric field remains constant both in amplitude and phase; it can be oriented in a single direction, in which case we speak about linear polarization, or it can rotate as the wave progresses, which may either be circular or elliptical polarization.
11.1.1 A history of polarization and some of its physical consequences
The concept that light waves are transverse is due to Thomas Young and it was used by him to explain the phenomena of polarized light. Up until that time, light was thought to be constituted of longitudinal waves. This seemed to be compatible with Huygens’s principle which can be applied to a wavefront as it expands from a point source through the aether. The aether
was deemed necessary as the medium which supported the propagation of waves. At some point a parent wavefront will disappear instantaneously, leaving in its wake a myriad of daughter wavelets which again expand as spherical waves in the aether. Now, the gist of Huygens’s principle is that the disturbances of the daughter wavelets will be observed only on their common forward envelope. Thus, Huygens envisaged both a fission and fusion of wave forms: a fission from a parent into daughter wavefronts, and a fusion of these wavelet motions along a common envelope forming a single wavefront at a later time. However, it was not until the discovery of polarization that a choice had to be made between transverse and longitudinal propagation of the waves. In fact, polarization was used initially to support the corpuscular theory of light. Bartholinus discovered the effect of double refraction in 1669 which occurred when light passed through crystals of calcite, then known as Iceland spar. Somewhat later, Huygens discovered the phenomenon of polarization by passing light in series through two calcite crystals. Although, as we have seen, Huygens relied on wave theory, the phenomenon of polarization was used in favor of a corpuscular theory of light. Earlier Newton had suggested that double refraction and partial reflection could be explained by assuming that the particles of light were asymmetric. Malus took this idea one step further by assuming that the particles of light were initially disoriented, and become ordered, like those of magnetic bodies, only when they pass through a doubly refracting crystal. Carrying the analogy further he assumed that light particles had poles, and that the oriented light be called polarized light. Among the initial proponents of a vibration theory of light were Euler and Young.
They argued that if light were composed of particles, the particles would have to be exceedingly small so that when two light beams cross each other they do not interfere with one another. Even more important, there is no ‘dissipation’ of light when it travels over great distances. If light were composed of particles, the particles making up the rays of light would interfere with one another making the image fuzzy, in contrast to the sharp images that are observed. According to their vibration theory, there would have to be a medium in which the vibrations propagate like that of sound waves. When sound waves propagate in a solid medium, the polarization of the waves is in the direction of the shear stress in the plane
normal to the direction of propagation. In gases and liquids, sound waves are longitudinal with their oscillations in the direction of the motion. Every vibratory source would require a different medium or aether. As Maxwell lamented,

Aethers were invented for the planets to swim in, to constitute electric atmospheres and magnetic effluvia, to convey sensations from one part of our bodies to another, and so on, till all space had been filled three or four times over with aethers.
For light transmission the medium was referred to as a ‘luminiferous aether,’ as opposed to an ‘electric aether’ that was required for the propagation of electrical disturbances. To Fresnel we owe the idea that light waves are completely transverse which was a revolutionary idea since transverse elastic waves in solids were completely unknown at that time. However, this luminiferous aether was no ordinary aether since the theory of elastic waves in solids leads to the conclusion that longitudinal waves are always present in the reflection and refraction of elastic waves. However, the boundary conditions introduced by Fresnel were not the boundary conditions of an elastic medium, but they did account for phenomena associated with the propagation of light. Diffraction phenomena could be explained by vibration theory in which light was seen to be a longitudinal vibration like those of sound waves, or wave theory which advanced the transverse vibrations of the aether. The death knell of corpuscular theory came with the Fizeau and Foucault measurements of the velocity of light which clearly showed that the velocity of light was smaller in liquids than in air, contrary to what Newton predicted. Thereafter the wave theory triumphed, but there was another merger to be made. Maxwell advanced the idea that light was really an electromagnetic wave on the basis of a common velocity of propagation. But, Maxwell did not live to see his idea become reality when Hertz, in 1888, showed that electromagnetic waves stemming from oscillating electric currents can exhibit reflection, diffraction, refraction, interference, and, last but not least, polarization. Polarization is described by two perpendicular components normal to the direction of wave propagation. Plane waves of any polarization can be obtained by combining any two orthogonally polarized waves. So what does this have to do with relativity?
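Lorentz introduced two masses according to the relation between the acceleration $\mathbf{a}$ and the force $\mathbf{F}$ [cf. (5.4.39)] (see footnote a),
$$\mathbf{a} = \sqrt{1 - u^2}\;\frac{\mathbf{F} - (\mathbf{F}\cdot\mathbf{u})\,\mathbf{u}}{m}. \tag{11.1.1}$$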
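Equation (11.1.1) can be explored numerically by applying a unit force first along the velocity and then normal to it, and reading off the effective mass as the ratio of force to acceleration. This is only a sketch in natural units ($c = 1$); the function name and the sample values are illustrative assumptions.

```python
import math


def acceleration(force, u, m):
    """Acceleration from (11.1.1): a = sqrt(1 - u^2) [F - (F.u) u] / m, with c = 1."""
    f_dot_u = sum(f * v for f, v in zip(force, u))
    u_squared = sum(v * v for v in u)
    scale = math.sqrt(1.0 - u_squared) / m
    return tuple(scale * (f - f_dot_u * v) for f, v in zip(force, u))


m, speed = 1.0, 0.8               # illustrative rest mass and speed
u = (speed, 0.0, 0.0)             # motion along x

a_parallel = acceleration((1.0, 0.0, 0.0), u, m)   # unit force along the motion
a_normal = acceleration((0.0, 1.0, 0.0), u, m)     # unit force normal to the motion

m_longitudinal = 1.0 / a_parallel[0]   # force / acceleration, parallel case
m_transverse = 1.0 / a_normal[1]       # force / acceleration, normal case
```

With these conventions the two ratios come out as $m/(1-u^2)^{3/2}$ and $m/\sqrt{1-u^2}$, respectively, which is the familiar pair of velocity-dependent masses.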
[Footnote a: In this chapter we use natural units in which $\hbar = c = 1$. In natural units length and time have the same dimensions while mass has the dimensions of inverse length, e.g. Compton length. Also in this chapter we will use p as the momentum since G is reserved for a generalized displacement.]

If the force is in the direction of the motion, we get the so-called ‘longitudinal’ mass, while if the force is normal to the motion the ‘transverse’ mass results. Since it was the latter which coincided with the mass determined from Kaufmann’s deflection experiments on negatively charged particles that we discussed in Sec. 5.4.1, the former was subsequently forgotten. Furthermore, in the old definition of the electromagnetic mass that we discussed in Sec. 3.7.3.2, the mass was defined as the ratio of the electromagnetic momentum, p, to the speed, u, viz. p = mu, in the case where the electron is asymmetrical. This definition made the mass a vector quantity when the momentum was not in the direction of the velocity. Likewise, the longitudinal mass also became a vector
$$m_l = \frac{\partial p}{\partial u} = m + u\,\frac{\partial m}{\partial u}.$$
In an asymmetrical electron both masses have transverse components. However, in the models set forth by Abraham and Lorentz in Sec. 5.4.4, both masses are in the direction of the motion, and their moduli reduce to the normal transverse and longitudinal components. J. J. Thomson criticized the Lorentz force, which is what a charged particle experiences as it traverses a magnetic field, for violating Newton’s third law of motion. According to him, this could be rectified by considering the existence of momentum in the electric field. In his words, “the loss of momentum in the pulse should be equal to the gain of momentum by the body.” The amount of momentum in the field Thomson found to be proportional to the number of ‘Faraday tubes’ passing through a unit area
drawn at right angles to their direction. It was also shown to be proportional to the magnetic induction, a quantity which Larmor associated with the velocity of the aether. Then, the direction of momentum would be at right angles to both the magnetic induction and the Faraday tubes. Since the momentum is proportional to the Poynting vector, the Faraday tubes would be a materialization of the electric field (see footnote b). As the Faraday tubes move through the aether, the motion of these cylinders normal to their lengths would necessarily lead to an increase in their mass for they would drag the aether with them. We have corroborated in Sec. 5.3.1 that the broadside motion of a rod increases its mass over that of its frontal motion. A moving charge creates a magnetic field whose energy we have shown to be proportional to the kinetic energy in Sec. 5.4.3. So if the geometry of the mass is considered to be a sphere when at rest, there will be an additional increase in its momentum when set in motion. According to Thomson, the additional momentum does not reside in the sphere, but, rather, in the aether surrounding it.

[Footnote b: Again there is a potential priority dispute between Poynting and Heaviside for the discovery of the ‘Poynting’ vector. While it is true that Poynting’s paper was received by the Royal Society on December 17, 1883, and read on January 10 of the following year, it carries a footnote that was subsequently added by Poynting on the 19th of June, leading one to believe that it did not appear in print until after that date [Nahin 88]. In the June 21st edition of The Electrician, meanwhile, Heaviside wrote: “The direction of maximum transference is therefore perpendicular to the plane containing the magnetic force and the current directions, and its amount per second proportional to their strengths and the sine of the angle between their directions.” And it was not until the following year, January 10, 1885 to be precise, that Heaviside actually published a proof of this theorem. It was a two-step proof, thanks to his vector calculus, and not a many-page one containing infinite triple integrals like the one given by Poynting. Moreover, it was Heaviside who got the direction of Poynting’s vector right, as we shall see in Sec. 11.5.1.]

A third type of polarization is well-known in hadron colliders. If $p_l$ is the momentum along the beam direction, the experimental particle physicist’s definition of rapidity is
$$y = \frac{1}{2}\ln\frac{W + p_l}{W - p_l}, \tag{11.1.2}$$
where W is the total energy
$$W^2 = p_l^2 + m_t^2. \tag{11.1.3}$$
The transverse momentum, $p_t$, is related to the transverse mass $m_t$ by
$$m_t = \sqrt{m^2 + p_t^2}. \tag{11.1.4}$$
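Taken together, (11.1.2) to (11.1.4) imply that the transverse mass plays the role of the invariant along the beam direction: $W = m_t\cosh y$ and $p_l = m_t\sinh y$. The sketch below checks this numerically; the function names and the sample values of $m$, $p_l$ and $p_t$ are illustrative assumptions in natural units.

```python
import math


def transverse_mass(m, p_t):
    """Transverse mass from (11.1.4)."""
    return math.hypot(m, p_t)


def total_energy(m, p_l, p_t):
    """Total energy from (11.1.3)."""
    return math.hypot(p_l, transverse_mass(m, p_t))


def rapidity(m, p_l, p_t):
    """Collider rapidity from (11.1.2)."""
    w = total_energy(m, p_l, p_t)
    return 0.5 * math.log((w + p_l) / (w - p_l))


m, p_l, p_t = 0.938, 3.0, 0.5    # illustrative values only
y = rapidity(m, p_l, p_t)
m_t = transverse_mass(m, p_t)
w = total_energy(m, p_l, p_t)

energy_from_rapidity = m_t * math.cosh(y)     # should reproduce W
momentum_from_rapidity = m_t * math.sinh(y)   # should reproduce p_l
```

Equivalently, $\tanh y = p_l/W$, so this rapidity is the hyperbolic measure of the longitudinal velocity alone.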
Expression (11.1.2) differs from the usual definition of rapidity insofar as it replaces the modulus, $|p|$, by $p_l$. In hadron collider physics, this modification is justified by the fact that particle production is a constant function of the rapidity in (11.1.2). We will soon appreciate that the difference between mass and momentum polarizations lies in which quantity is being held constant: for mass polarization it is the total energy, while for momentum polarization it is the mass that is the invariant. In 1852 Sir George Gabriel Stokes [52] showed that a partially polarized light beam could be characterized by four parameters now bearing his name. In substance, Stokes demonstrated that when any two beams of light are superimposed incoherently, the Stokes parameters are additive. Moreover, any arbitrary light beam may be considered as a superposition of an unpolarized beam and an elliptically polarized one. Because of their operational forms, the Stokes parameters have been related to quantities that appear in the quantum-mechanical treatment of light. For example, the equivalence between the Stokes parameters and the components of the density matrix has also been noticed by Perrin [42], and Falkoff and Macdonald [51]. We plan to reinterpret the Stokes parameters to give the relativistic invariant forms
$$W^2 - p^2 = m^2, \qquad p^2 + m^2 = W^2,$$
in the cases where the mass m, or the total energy W, is invariant. The polarization arises by designating the orthogonal components of the mass, momentum, or mass and momentum. In the case of the hadron collider, the latter is realized, which involves the orthogonal components of the total mass and the transverse momentum, i.e. $m_t = p_t + im$ with modulus $\sqrt{m_t m_t^*} = \sqrt{p_t^2 + m^2}$.
The beam momentum and the transverse mass can be represented by the spherical coordinates
$$\begin{aligned} p_l &= W\cos 2\vartheta = W\cos 2\chi\cos 2\psi,\\ p_t &= W\sin 2\vartheta\cos 2\varphi = W\cos 2\chi\sin 2\psi,\\ m &= W\sin 2\vartheta\sin 2\varphi = W\sin 2\chi, \end{aligned} \tag{I}$$
in the case of complete polarization. This shows that the total energy, and not the transverse mass [Jackson 05], is the conserved quantity. The second equality in the first line of (I) is none other than the Pythagorean theorem for elliptic geometry. In experimental particle physics, the rapidity (11.1.2) is replaced by the so-called ‘pseudorapidity.’ If $\vartheta$ is the angle between the particle momentum p and the direction of the momentum of the beam then $\cos\vartheta = p_l/|p|$. In the limit $m \ll |p|$, the rapidity (11.1.2) can be replaced by the pseudorapidity
$$\eta = \frac{1}{2}\ln\frac{|p| + p_l}{|p| - p_l} = \frac{1}{2}\ln\frac{1 + \cos\vartheta}{1 - \cos\vartheta} = -\ln\tan(\vartheta/2). \tag{11.1.5}$$
This identifies the angle $\vartheta$ in the limit $m \ll |p|$ with the Bolyai–Lobachevsky angle of parallelism. In the Euclidean limit the pseudorapidity vanishes, while as the angle of parallelism decreases, the pseudorapidity increases. Since the transverse momentum is related to ‘missing,’ or ‘invisible,’ mass in collider particle production, the infinite limit of the pseudorapidity would be related to the limit where all the masses are accounted for, in which case the particle momentum is directed along the beam momentum. The pseudorapidity (11.1.5) provides a unique link between hyperbolic and circular functions. We saw in Sec. 2.5 that hyperbolic geometry depends on an absolute constant, k, such that the area of any triangle ABC is
$$\text{area}(ABC) = \frac{\pi}{180}\,k^2 \times \text{defect}(ABC).$$
Since the defect, or the (positive) difference between 180° and the sum of the angles of the triangle, measured in degrees, is minutely small on a terrestrial scale, while the area is finite, the constant $k^2$ must be immensely large. As we have seen in Sec. 2.5, the parallaxes of fixed stars serve as lower bounds to $k^2$.
In (11.1.5), we have implicitly assumed k = 1 since we are considering natural units. Recall in Sec. 2.4 we showed this to be equivalent to the choice of our unit of length so that the ratio of corresponding arcs on concentric horocycles is equal to e when the distance between horocycles is 1. This choice is analogous to the choice of the unit of angular measure so that a right angle will have a radian measure of π/2. It makes the area of a triangle equal to its defect, provided the defect is now measured in radians instead of degrees. Using double angle formulas, we may express the angle of parallelism, $\vartheta(\eta)$, measured in radians, in terms of the pseudorapidity as
$$\tan\vartheta(\eta) = 1/\sinh\eta, \qquad \cos\vartheta(\eta) = \tanh\eta, \qquad \sin\vartheta(\eta) = 1/\cosh\eta.$$
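These three relations between circular functions of the angle of parallelism and hyperbolic functions of the pseudorapidity are easy to confirm numerically from (11.1.5). The following is a minimal sketch; the function name and the sample angle are illustrative assumptions.

```python
import math


def pseudorapidity(theta):
    """Pseudorapidity from (11.1.5): eta = -ln tan(theta / 2), for 0 < theta < pi."""
    return -math.log(math.tan(theta / 2.0))


theta = 0.7                      # illustrative acute angle, in radians
eta = pseudorapidity(theta)

# Differences between the two sides of each identity; all should be ~0.
residuals = (
    math.tan(theta) - 1.0 / math.sinh(eta),
    math.cos(theta) - math.tanh(eta),
    math.sin(theta) - 1.0 / math.cosh(eta),
)
```

At $\vartheta = \pi/2$ the pseudorapidity vanishes, and it grows without bound as $\vartheta \to 0$, in line with the limits described above.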
The distinction between the invariancy of the total energy, W, or the total mass, m, is geometrically related to the distinction between elliptic and hyperbolic geometries, and optically connected to the difference between birefringence and dichroism. Dichroism is related to the unequal absorption of two orthogonally polarized light components, while birefringence is the unequal retardation of orthogonal components. If we are considering processes which conserve the total energy, there can occur mass polarization. Denoting by $2\vartheta$ and $2\varphi$ the polar and azimuth angles, respectively, and choosing the momentum to form an angle $2\vartheta$ with the z-axis, we can write the momentum p and mass components $m_l$ and $m_t$ in terms of these angles through spherical coordinates
$$\begin{aligned} p &= W\varepsilon\cos 2\vartheta,\\ m_l &= W\varepsilon\sin 2\vartheta\cos 2\varphi,\\ m_t &= W\varepsilon\sin 2\vartheta\sin 2\varphi. \end{aligned}$$
( )
Since the degree of polarization, $\varepsilon \le 1$, is constant, the equality of the differences,
$$\frac{W'^2 - p'^2 - m'^2}{W'^2} = \frac{W^2 - p^2 - m^2}{W^2} = 1 - \varepsilon^2, \tag{11.1.6}$$
will always hold no matter what frame we are working in, where the mass $m = \sqrt{m_l^2 + m_t^2}$. This was first commented on by Paul Soleillet
in 1929 who also devised 4 × 4 matrices that act on these four-vectors, and can be applied to the description of the polarization of Compton scattering [Fano 49] (see footnote c). Complete polarization, $\varepsilon = 1$, corresponds to ‘on mass shell,’ where the ‘mass shell,’ or ‘mass hyperboloid,’ refers to solutions of $W^2 - p^2 = m^2$, describing the combinations of momentum, p, and energy, W, that are allowed for a relativistic particle of mass m. ‘Virtual’ particles may be ‘off shell,’ or partially polarized. In the sequel we will always treat the ‘on shell’ case, or that of complete polarization, $\varepsilon = 1$. It is apparent from these equations that W and p are invariant for a rotation of the axes through the azimuthal angle. But, $m_l$ and $m_t$ change with the axes, and are related to one another through a rotation about this angle. In birefringent media there is a phase delay. Polarizers exploit the birefringent properties of crystals like quartz and calcite. An ideal birefringent crystal transforms the polarization state of an electromagnetic wave without loss of energy. The crystal has an optical axis for which light has a different index of refraction for light polarized parallel and perpendicular to this axis. A beam of unpolarized light is split by refraction at the surface of these crystals into two rays: light rays polarized parallel to the optic axis are known as the ‘ordinary’ rays, while light rays polarized normal to the optic axis are called ‘extraordinary’ rays. Only the former obey Snell’s law, (3.5.6). A Nicol prism, which was an early prototype of a birefringent polarizer, can be used to measure the degree of plane polarization with respect to two arbitrary orthogonal axes, and the degree of plane polarization with respect to a set of axes oriented at 45° to the right of the previous ones. The measurement of the degree of circular polarization requires a quarter-wave plate. A quarter-wave plate is a phase retarder that can be used to transform circularly polarized light into linearly polarized light or vice versa.

[Footnote c: Again Stigler’s law of eponymy is borne out in that spectroscopists refer to the calculus where light is represented by a vector which is operated on by an optical element as the Jones and Mueller calculus [Kliger et al. 90], and not to its rightful discoverer, Soleillet [29], who more than a decade earlier than Jones [41], and almost two decades earlier than Mueller [48], discovered it.]
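The mass-polarization scheme with degree of polarization $\varepsilon$, and the invariant ratio (11.1.6), can be illustrated with a short numerical sketch. It builds $p$, $m_l$ and $m_t$ from the spherical-coordinate parametrization and then rotates the axes through the azimuth; the function names and sample values are illustrative assumptions in natural units.

```python
import math


def mass_polarization(w, eps, theta, phi):
    """Momentum and mass components for total energy w and polarization degree eps."""
    p = w * eps * math.cos(2 * theta)
    m_l = w * eps * math.sin(2 * theta) * math.cos(2 * phi)
    m_t = w * eps * math.sin(2 * theta) * math.sin(2 * phi)
    return p, m_l, m_t


def unpolarized_fraction(w, p, m_l, m_t):
    """Ratio in (11.1.6): (W^2 - p^2 - m^2) / W^2, with m^2 = m_l^2 + m_t^2."""
    return (w**2 - p**2 - m_l**2 - m_t**2) / w**2


w, eps, theta = 2.0, 0.8, 0.35       # illustrative values only
rows = []
for phi in (0.0, 0.4, 1.3):          # rotating the axes through the azimuth
    p, m_l, m_t = mass_polarization(w, eps, theta, phi)
    rows.append((p, math.hypot(m_l, m_t), unpolarized_fraction(w, p, m_l, m_t)))
```

Each row should show the same $p$ and the same modulus $m$, with only the split between $m_l$ and $m_t$ changing, while the last entry stays at $1 - \varepsilon^2$ and vanishes on the mass shell, $\varepsilon = 1$.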
Fig. 11.1. Spherical right triangle for scheme (II).
The elliptic nature of the phase changes is made apparent by considering the spherical right triangle shown in Fig. 11.1, with coordinates
$$\begin{aligned} p &= W\cos 2\vartheta = W\cos 2\psi\cos 2\chi,\\ m_l &= W\cos 2\varphi\sin 2\vartheta = W\sin 2\psi\cos 2\chi,\\ m_t &= W\sin 2\vartheta\sin 2\varphi = W\sin 2\chi. \end{aligned} \tag{II}$$
According to this spherical right triangle, scheme ( ) corresponds to the Poincaré sphere, shown in Fig. 11.7 below. Electromagnetic waves may be characterized by their electric vectors which can be decomposed into orthogonal components that encounter different propagation effects in the media through which they pass. We have already discussed phase lags between the two components giving rise to birefringence, which can be characterized by a rotation of $2\varphi$ in the plane perpendicular to the direction of momentum propagation. The rotation matrices are unitary. However, it may occur that the amplitude of one of the orthogonal components of the electric vector gets reduced in dichroic media. Radiation filters serve to block all the radiation in one of the modes, and are known as polarizers. In terms of the parameters describing the polarized state, the total intensity is reduced. Translated into relativistic terms, the total energy will no longer be a conserved quantity, and transformations from one inertial frame to another involve a Lorentz boost of $2\beta = \tanh 2\vartheta$ in the direction of propagation. Such transformations are described by Hermitian matrices. We are now dealing with momentum polarization, where the momentum is decomposed into orthogonal components $p_l$ and $p_t$, such that the
Fig. 11.2. Hyperbolic right triangle related to the scheme (III).
momentum is $p = p_l + ip_t$, with modulus $\sqrt{pp^*} = \sqrt{p_l^2 + p_t^2}$. In terms of the polar and azimuthal angles, $2\vartheta$ and $2\varphi$, the energy and momentum are given by
$$\begin{aligned} W &= m\cosh 2\vartheta = m\cosh 2\chi\cosh 2\psi,\\ p_l &= m\sinh 2\vartheta\cos 2\varphi = m\cosh 2\chi\sinh 2\psi,\\ p_t &= m\sinh 2\vartheta\sin 2\varphi = m\sinh 2\chi. \end{aligned} \tag{III}$$
The second equalities in (III) are deduced by considering the hyperbolic right triangle in Fig. 11.2. In particular, the second equality in the first line of (III) will be recognized as the Pythagorean theorem for a hyperbolic right triangle. Thus, whereas birefringence involves phase changes of the orthogonal components of the electric vector and belongs to elliptic space, dichroism involves the reduction in total intensity and lives in hyperbolic space. In comparison to (II), scheme (III) is obtained by allowing the polar angle $\vartheta$ to become imaginary. This is analogous to the transition from elliptic to hyperbolic geometry which is effected by allowing the radius of a sphere to become imaginary and thus transforming a sphere into a ‘pseudosphere’ that we discussed in Sec. 2.5. In Sec. 11.2 we will show how the Stokes parameters can be written in terms of the density matrix, which, in turn, can be expressed in terms of the mass, momentum, and energy terms, or in terms of the components of angular momentum since all can be expressed in terms of a conserved four-vector. In terms of mass, momentum and
energy, the density matrix,

$$
\rho = \frac{1}{2}
\begin{pmatrix}
W + m & p_l - i p_t \\
p_l + i p_t & W - m
\end{pmatrix},
\tag{11.1.7}
$$

has the total energy as its trace, and has a vanishing determinant. In analogy with the three components of linear momentum, we write the generators of rotation as a scheme (II) type

$$
\begin{aligned}
p_x &= W \sin 2\vartheta \cos 2\phi = m \cos 2\phi, \\
p_y &= W \sin 2\vartheta \sin 2\phi = m \sin 2\phi, \\
p_z &= W \cos 2\vartheta.
\end{aligned}
\tag{II′}
$$

The relativistic conservation of energy is

$$
W^2 = p_x^2 + p_y^2 + p_z^2 = m^2 + p_z^2 =: p^2.
\tag{11.1.8}
$$
This reduces to a scheme (I) type when the transverse momentum and mass become zero as it would be for a particle of zero mass.
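The relations in scheme (III) and the two stated properties of the density matrix (11.1.7) can be checked numerically. The following is only a minimal sketch: the function names and the sample values of m, 2χ and 2ψ are illustrative assumptions, not part of the text. It builds W, p_l and p_t from the second equalities of (III), recovers the angles 2ϑ and 2ϕ, and evaluates the trace and determinant of ρ.

```python
import math

def scheme_iii(m, chi2, psi2):
    """Return (W, p_l, p_t) from the second equalities of scheme (III).

    chi2 and psi2 stand for the hyperbolic angles 2*chi and 2*psi.
    """
    W = m * math.cosh(chi2) * math.cosh(psi2)
    p_l = m * math.cosh(chi2) * math.sinh(psi2)
    p_t = m * math.sinh(chi2)
    return W, p_l, p_t

def polar_angles(m, W, p_l, p_t):
    """Recover 2*theta and 2*phi from W = m cosh(2 theta) and p_l + i p_t."""
    theta2 = math.acosh(W / m)
    phi2 = math.atan2(p_t, p_l)
    return theta2, phi2

def density_matrix(m, W, p_l, p_t):
    """2x2 density matrix of Eq. (11.1.7) as nested lists."""
    return [[0.5 * (W + m), 0.5 * complex(p_l, -p_t)],
            [0.5 * complex(p_l, p_t), 0.5 * (W - m)]]

def trace(rho):
    return rho[0][0] + rho[1][1]

def det(rho):
    return rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]

# Arbitrary illustrative values (hypothetical, chosen only for the check).
m, chi2, psi2 = 1.0, 0.7, 1.3
W, p_l, p_t = scheme_iii(m, chi2, psi2)
theta2, phi2 = polar_angles(m, W, p_l, p_t)
rho = density_matrix(m, W, p_l, p_t)

print("W^2 - p_l^2 - p_t^2 =", W**2 - p_l**2 - p_t**2)  # close to m^2
print("cosh(2 theta) vs cosh(2 chi) cosh(2 psi):",
      math.cosh(theta2), math.cosh(chi2) * math.cosh(psi2))
print("trace =", trace(rho), " det =", det(rho))
```

The vanishing determinant is simply the mass-shell condition W² − p_l² − p_t² = m² written in matrix form, which is why the same numbers satisfy both checks.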
11.1.2 Spin
Not long after the proposal of ‘spin’ as an additional degree of freedom of the electron, experimenters were under the belief that there should be an analogy between the behavior of linearly polarized light and the asymmetric orientation of spins in an electron beam [Farago 71]. A spin-1 particle, with a well-defined momentum, p, can have a spin along the direction of the motion, opposed to the direction of motion, or normal to that direction, h = ±1, 0. The new property, h, known as the particle’s helicity, is not confined to spin-1 particles. However, if the particle happens to be a photon its transverse wave property excludes the value 0. This value has been associated with the longitudinal mode, and the presence of mass, in the electroweak interaction [Gottfried & Weisskopf 86]. The orientation of spin along, or opposite to, the direction of propagation n can have only the values h = ±1. Moreover, since the orbital angular momentum, L, vanishes in the direction of propagation n, the helicity is defined as the projection of the total angular momentum J in the direction
of motion, i.e. J · n = (L + S) · n = S · n = h, where S is the spin of the photon. To add to the confusion, instead of helicity, referring to the orientation of spin with respect to the axis of quantization, states of definite helicity are related to left- and right-handed circular polarization, as opposed to linear polarization. Further confusion is incurred by the close formal analogy between spin-1/2 particles and spin-1 photons. Since there are only two helicity states h = ±1, these states can be represented as spinors, just like electrons! We will see that the properties of light can be fully determined by the density matrix (11.1.7), where the Stokes optical parameters will replace the mechanical parameters [cf. (11.2.2) below]. The diagonal elements give the probability of finding a photon in the beam in one of the two helicity states. By allowing the beam to pass through various polarization filters, information can be obtained about its polarization. Since the intensity must be real, the off-diagonal elements must be complex conjugates of one another, i.e. the density matrix must be Hermitian. This reduces the total number of independent parameters to four, if the total intensity of the beam is included. It is quite remarkable that Stokes came to the exact same conclusions way back in 1852, with absolutely no knowledge of the quantum nature of light, or even Maxwell’s theory predicting the transverse nature of wave propagation!

Parenthetically, we may add that associating the longitudinal mode of the state of helicity |0⟩ with mass is not without its problems. For spin-1 particles there need not be a direction in which the spins are pointing, either up or down. Although there can be a preferred direction, it is not possible to specify a projection along this axis so as to obtain helicity. The spin vector of |0⟩ can be thought of as precessing in the direction perpendicular to the motion.
Thus, quantities that characterize particles of spin-1 must not depend upon the preferred direction, and a vector of polarization is not applicable. These quantities must be at least quadratic in the spin components, or second-order tensors. To see that the polarization is insufficient to characterize such states, we could equally as well produce the state of zero polarization by all the
particles in the state |0⟩, or by an equal mixture of the states |1⟩ and |−1⟩. It would then be necessary to construct a monopole, vector, and second-rank tensor in order to obtain a complete characterization of the state of polarization. Thus, the association of a longitudinal mode with the |0⟩ state would mean a complete overhaul of the properties of polarization. The very fact that electrons share both undulatory and corpuscular properties of light, and do have mass, would tend to rule out that a completely new mechanism be added to treat polarization once massless particles acquire mass in the electroweak theory. It would be far simpler to assume that the acquisition of mass is a breakdown in the pure helicity of the state due to dispersion.

Stokes’s analysis of the polarization of electromagnetic radiation has been gaining increasing interest in other branches of physics due, undoubtedly, to its similarity to a rotation in a four-dimensional Minkowski space [Soleillet 29]. This is a consequence of the realization that the Stokes parameters are the components of a conserved four-vector. This column vector can be scattered into a new column vector, with the same conservation properties, by matrices which change the state of the polarization of light. The transformation matrix appears as a generalized matrix of rotation in which two components are rotated through an imaginary angle, and the other two components are rotated through a real angle. The rotation through an imaginary angle is a ‘rotation’ of the total energy and linear angular momentum by a Lorentz transform, while the rotation of the other two components through a real angle is analogous to the introduction of a phase difference between the components of vibration of the electric vectors along mutually perpendicular axes, and is the origin of mass polarization.
The former case provides a physical example of hyperbolic geometry in which there is a contraction of rulers as the boundary of the space is approached as seen, of course, from a Euclidean perspective. An additional representation of the Stokes parameters, proposed by Poincaré in 1892, and which we will discuss in Sec. 11.2, is physically equivalent to a light beam being rotated through an angle around its direction of propagation. It is
related to Rayleigh scattering, where the outgoing linear polarization vector is rotated away from the incoming linear polarization vector, and constitutes an elliptical geometric distortion effect. Just as there are two independent states of light polarization, the density matrix can be represented as the sum of the identity matrix and the inner product of the Stokes parameters and the Pauli spin matrices [Fano 49]. An identical treatment can be given to the weak interaction where the proton and neutron are a ‘charge doublet’ of the nucleon. This doublet can only be distinguished by the weak interaction in which a free neutron decays into a proton, an electron, and an antineutrino. We will return to this shortly in Sec. 11.1.7. Now, the two-dimensional unitary unimodular group, SU(2), can be represented by the three 2 × 2 Pauli spin matrices so that the ordinary spin multiplets of particles like electrons can be derived from this group. It was Heisenberg’s foresight that led him to apply the same group of transforms to the neutron-proton charge doublet, or what has become known as ‘isospin.’
11.1.3 Angular momentum
The Stokes parameters also bear an intimate tie with the angular momentum operators. In exactly the same way that each state can be chosen to be a simultaneous eigenfunction of the square of the total angular momentum, $J^2$, and its projection on the direction of momentum, $J_z$, which we will take as the z-axis, we can a priori conclude that the eigenvalues of $J^2$ will be J(J + 1), where J is either an integer or half-integer, and each multiplet will consist of 2J + 1 states with eigenvalue $j_z$ of the operator $J_z$, varying in unit steps from −J to +J. In this analogy with angular momentum, the total energy, W, corresponds to the total angular momentum, J. Its projection onto the z-axis, $J_z$, corresponds to the operator of linear momentum, $\hat p_z$. It will turn out that $p_z$, or $J_z$, is proportional to the difference between the populations of the two states of isospin, or helicity, or any other two mutually exclusive states. When the populations of the two states become equal the particle’s velocity goes to zero. The remaining two angular momentum operators, $J_\pm = J_x \pm i J_y$, are known as ‘ladder’ operators, since they cause jumps up and down in a
multiplet, creating or destroying a particle as they go. To these operators we will associate the ‘mass’ operators in the longitudinal, $\hat m_l$, and transverse, $\hat m_t$, directions of momentum. They can be considered as the last vestiges of the ‘transverse’ and ‘longitudinal’ masses that, as discussed in Sec. 5.4.4, were introduced early in relativity theory, and then quickly forgotten when it was found that the transverse mass was the mass measured in the e/m experiments described in Sec. 5.4.1. In group theory jargon, we are saying that the Stokes parameters are the operators that generate the SU(2) algebra. Since two helicity, spin, or isospin, states are involved, the Stokes parameters can be represented by the creation of a spin up (down), $a_+^\dagger$ ($a_-^\dagger$), or the annihilation of one, $a_+$ ($a_-$). In terms of these second quantized operators, the total energy W becomes the total number of particles operator,

$$
\hat W = a_+^\dagger a_+ + a_-^\dagger a_-,
$$

and the operators of the mass components and momentum are

$$
\hat m_l = a_+^\dagger a_-, \qquad
\hat m_t = a_-^\dagger a_+, \qquad
\hat p_z = \tfrac{1}{2}\left(a_+^\dagger a_+ - a_-^\dagger a_-\right).
\tag{11.1.9}
$$

The conservation of angular momentum,

$$
\hat W^2 = \tfrac{1}{2}\left(\hat m_l \hat m_t + \hat m_t \hat m_l\right) + \hat p_z^2,
\tag{11.1.10}
$$

is also the square of the total quasi-spin operator of isospin, and it gives rise to the eigenvalue equation $\hat W^2 = W(W + 1)$. The momentum operator (11.1.9) is the difference in the number of the two spin states. In Sec. 11.1.7 we will show that it is proportional to the relative velocity of an electron, which is found to be equal to its longitudinal polarization in the electroweak interaction. Spin states can be classified into multiplets, each characterized by an eigenvalue of the operator $\hat W^2$. The significance of the statement that each state can be chosen to be a simultaneous eigenfunction of both $\hat p_z$ and $\hat W^2$
is that the difference $W^2 - p_z^2$ is invariant under a Lorentz transformation. What one would think of as the space of ‘space-time’ [Dirac 47] is really spanned by the Stokes parameters, and their mechanical counterparts of mass, momentum, and energy. The Stokes parameters play a fundamental role in the characterization of polarized relativistic systems in separating the energy and momentum, which evolve according to Lorentz transformations, and of the polarized mass components, which undergo rotational transformations. No matter how enticing the analogy between the Stokes parameters and angular momentum operators may be, we have to realize that it is less than perfect because the former vary continuously, while the latter are discrete. Particles with zero rest mass can have only two states of polarization, ±W, while particles of finite mass have 2W + 1 states of polarization. The eigenvalues of the operator $\hat p_z$ will have 2W + 1 values of the multiplet from W to −W for massive particles that are aligned parallel, anti-parallel, or normal to the direction of momentum.
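The operator relation (11.1.10) can be checked directly on two-mode number states. The sketch below is illustrative only: the dictionary representation of states and all function names are assumptions made here. It applies the operators of (11.1.9) to a basis state with n₊ and n₋ quanta and evaluates the right-hand side of (11.1.10). In the usual two-oscillator (quasi-spin) convention, which is an assumption of this sketch, the result is j(j + 1) with j equal to half the total number of quanta; this j plays the role of the label written W in the eigenvalue equation of the text.

```python
import math

# A state is a dict {(n_plus, n_minus): amplitude} in the two-mode number basis.

def m_l(state):
    """Apply a_+^dagger a_-: move one quantum from the '-' mode to the '+' mode."""
    out = {}
    for (n_p, n_m), amp in state.items():
        if n_m > 0:
            key = (n_p + 1, n_m - 1)
            out[key] = out.get(key, 0.0) + amp * math.sqrt((n_p + 1) * n_m)
    return out

def m_t(state):
    """Apply a_-^dagger a_+: move one quantum from the '+' mode to the '-' mode."""
    out = {}
    for (n_p, n_m), amp in state.items():
        if n_p > 0:
            key = (n_p - 1, n_m + 1)
            out[key] = out.get(key, 0.0) + amp * math.sqrt(n_p * (n_m + 1))
    return out

def p_z(state):
    """Apply (1/2)(n_plus - n_minus), which is diagonal in this basis."""
    return {(n_p, n_m): 0.5 * (n_p - n_m) * amp
            for (n_p, n_m), amp in state.items()}

def combine(s1, s2, c1=1.0, c2=1.0):
    """Linear combination c1*s1 + c2*s2 of two states."""
    out = {}
    for key, amp in s1.items():
        out[key] = out.get(key, 0.0) + c1 * amp
    for key, amp in s2.items():
        out[key] = out.get(key, 0.0) + c2 * amp
    return out

def w_squared(state):
    """Right-hand side of (11.1.10): (1/2)(m_l m_t + m_t m_l) + p_z^2."""
    symmetric = combine(m_l(m_t(state)), m_t(m_l(state)), 0.5, 0.5)
    return combine(symmetric, p_z(p_z(state)))

def w_squared_eigenvalue(n_p, n_m):
    """Diagonal value of (11.1.10) on the basis state |n_p, n_m>."""
    return w_squared({(n_p, n_m): 1.0}).get((n_p, n_m), 0.0)

for total in range(4):
    j = 0.5 * total
    for n_p in range(total + 1):
        print(total, n_p, w_squared_eigenvalue(n_p, total - n_p), j * (j + 1))
```

Every member of a multiplet with the same total number of quanta returns the same value, while the eigenvalue of the momentum operator steps from −j to +j across the multiplet.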
11.1.4 Elastic strain
The distinction between vibratory motion in the direction of wave propagation and vibratory motions in directions normal to wave propagation can be understood by considering the nature of strain upon an elastic body. If a displacement G satisfies the condition

$$
\nabla \times G = 0,
\tag{11.1.11}
$$

throughout a strained body, then no element in that body experiences rotation. Such a strain is said to be irrotational, or longitudinal. Alternatively, if the displacement G satisfies

$$
\nabla \cdot G = 0,
\tag{11.1.12}
$$

then no element in the strained body undergoes a change in volume. Such a strain is said to be solenoidal, circuital, or transversal.
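Conditions (11.1.11) and (11.1.12) can be illustrated with central finite differences. The two displacement fields below are hypothetical examples chosen for this sketch, each depending on x alone: one points along x and is irrotational, the other lies in the yz-plane and is solenoidal. The step size and the function names are likewise assumptions.

```python
import math

STEP = 1e-3  # finite-difference step; an illustrative choice

def partial(f, r, axis):
    """Central-difference derivative of the scalar function f at r along one axis."""
    forward, backward = list(r), list(r)
    forward[axis] += STEP
    backward[axis] -= STEP
    return (f(forward) - f(backward)) / (2 * STEP)

def component(G, i):
    """Scalar function returning the i-th component of the vector field G."""
    return lambda r: G(r)[i]

def curl(G, r):
    d = lambda i, j: partial(component(G, i), r, j)  # dG_i/dx_j
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def div(G, r):
    return sum(partial(component(G, i), r, i) for i in range(3))

def longitudinal(r):
    """Displacement along x that varies only with x (hypothetical example)."""
    x, y, z = r
    return (math.sin(x), 0.0, 0.0)

def transverse(r):
    """Displacement in the yz-plane that varies only with x (hypothetical example)."""
    x, y, z = r
    return (0.0, math.cos(x), math.sin(x))

point = (0.3, -1.1, 2.0)
print("longitudinal: curl =", curl(longitudinal, point),
      " div =", div(longitudinal, point))
print("transverse:   curl =", curl(transverse, point),
      " div =", div(transverse, point))
```

The longitudinal field has vanishing curl and a non-zero divergence, so volume elements change without rotating; the transverse field has vanishing divergence and a non-zero curl, so elements rotate without changing volume.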
Moreover, any vector field may be decomposed into purely longitudinal and transverse parts so that the most general type of strain is a superposition of the two. It is also possible to treat the two types of strains separately. It will then be found that the two types of disturbances will be propagated at different velocities so that if a single source emits both types of disturbances one will travel faster than the other. Any wave equation with a single speed of propagation must, therefore, contain a single type of disturbance — either longitudinal or transversal. If condition (11.1.11) is met everywhere in the body, the displacement can be represented as the gradient of a scalar potential, φ, viz.

$$
G = \nabla \varphi.
\tag{11.1.13}
$$

The displacement will therefore occur in the direction normal to the surfaces φ = const., and if n is the unit normal we may write (11.1.13) as

$$
G = n \frac{\partial \varphi}{\partial n}.
$$

We will restrict our attention to infinitesimal strains, or those for which the squares and products of the derivatives $\nabla_x = \partial/\partial x$, $\nabla_y = \partial/\partial y$, and $\nabla_z = \partial/\partial z$, of the displacement G will be negligible in comparison with the linear terms. Then, the principal elongations will be

$$
\lambda_x = \nabla_x G_x, \qquad \lambda_y = \nabla_y G_y, \qquad \lambda_z = \nabla_z G_z,
$$

so that the cubic dilatation is simply their sum,

$$
c = \lambda_x + \lambda_y + \lambda_z = \nabla \cdot G = \nabla \cdot \nabla \varphi = \nabla^2 \varphi.
\tag{11.1.14}
$$
In other words, the cubic dilatation is equal to the divergence of the displacement, or to the Laplacian of the field, φ. Neighboring values of φ are potential surfaces which split the body into a series of infinitely thin surfaces. If the displacement remains constant on any one of the surfaces, say x = const., and changes only when passing from one plane to the next, then the cubic dilatation reduces to ∇x Gx . Now, the fact that the curl vanishes, (11.1.11), means that ∇x Gy = ∇x Gz = 0, so that Gy and Gz are constant. That is, the transverse components of the
displacement are arbitrary constants, which, without any loss of generality, we may take as zero. Hence, only the longitudinal displacement $G_x$ remains finite, indicating, for instance, that all the molecules of a lattice vibrate in the direction of wave propagation. The lines corresponding to the principal axes are replaced by rotational ones for transverse displacements, where the lines are in the direction of the axis of rotation. The cubic dilatation, (11.1.14), vanishes, indicating that the volume of any portion of the body remains the same so that (11.1.12) applies. Thus, the generalized displacement can be represented as the curl of a vector A, G = ∇ × A, in contrast to longitudinal strain, (11.1.13). The vector potential, A, plays an analogous role to the scalar potential φ of longitudinal strain. Instead of the cubic dilatation in terms of the Laplacian, we now have the curl as the indicator of the intensity of rotational motion,

$$
J = \nabla \times \nabla \times A = \nabla \times G.
\tag{11.1.15}
$$

Since $\nabla \times \nabla \times = \nabla(\nabla \cdot) - \nabla^2$, if we introduce the auxiliary condition that the vector field is sourceless, ∇ · A = 0, we can write the rotation, or vortex, J as $J = -\nabla^2 A$, which is entirely analogous to (11.1.14) for cubic dilatation. Equations (11.1.14) and (11.1.15) are the well-known Poisson equations, where if dV is an element of volume in which the rotation J does not vanish, (11.1.15) has the solution,

$$
A = \int \frac{J}{r}\, dV,
\tag{11.1.16}
$$

for the vector potential, while, in the exactly analogous way, (11.1.14) has the solution,

$$
\varphi = -\int \frac{c}{r}\, dV,
\tag{11.1.17}
$$
for the scalar potential, if the cubic dilatation, c, does not vanish in the volume element dV. In electrodynamics, the vector J represents the current density, and the scalar c stands for the charge density.

Again assume that the displacement G depends only on the x-coordinate. Since (11.1.12) holds,

$$
\nabla_x G_x = \frac{\partial G_x}{\partial x} = 0.
$$

This means that $G_x$ is constant, which we can conveniently take to be zero. The displacement is therefore normal to the x-axis, lying in the yz-plane, and consisting of two non-zero components. The strain is said to be transversal, where, for instance, the particles ‘wiggle’ in the directions normal to the direction of propagation of the wave disturbance. As an easy reminder, we may say that a longitudinal disturbance needs two components to vanish on account of (11.1.11), whereas a transversal disturbance needs only one component to vanish on account of (11.1.12).

The foregoing discussion elicits an interpretation of longitudinal and transverse wave motion in terms of the underlying medium. It is the reason why the concept of an aether was so well received and widely accepted before relativity. Heaviside’s opinion sums up the tendency of the period to regard the aether with open arms:

Aether is a wonderful thing. It may exist only in the imagination of the wise, being invented or endowed with properties to suit their hypotheses; but we cannot do without it. . . But admitting the aether to propagate gravity instantaneously, it must have wonderful properties, unlike anything we know.
So the aether was the deus ex machina upon which physical theories were built. No matter how unsuccessful were the experiments to “set the aether in motion,” it served both as a guide and crutch upon which to build physical theories so that its demise cannot be entirely looked upon as a positive move. Whether it exists between particles, or within them, it provided the trunk whose branches bore fruit. In particular, it led Maxwell to add on a new current to Ampere’s law, called by him the displacement current, and in so doing ‘closed the circuit,’ and allowed for electromagnetic wave propagation. The aether and its conservation did play a role. It was said that Kelvin could not understand a phenomenon until he made a mental picture of the aether to which it corresponded. And the seminal idea of Kelvin, back in
1853, that energy can be stored in the field implied that there was a medium in which it could be stored. It also led to Maxwell’s abandoning a theory of gravity, as we saw in Sec. 3.8.1, and declaring that such a theory was beyond nineteenth-century physics, for it would imply that, due to the attractive nature of masses, the aether must store ‘negative’ energy! So the aether was the medium in which energy could be stored, like a stretched rubber band that could give up energy upon request. But, since the abolishment of the aether, “we don’t have this invisible, convenient storage vault to make the field energy easier to ‘visualize.’ The field energy is, in this sense, a greater mystery for us today than it was for the Victorians” [Nahin 88]. According to Maxwell [65], the total field energy density,

$$
W = \frac{1}{8\pi}\left(\epsilon_0 E^2 + \mu_0 H^2\right),
\tag{11.1.18}
$$

is localized in space, but it can be far from any material whose dielectric constant and permeability are $\epsilon_0$ and $\mu_0$, respectively. The only remnant of the ‘material’ body lies in their product, $1/\sqrt{\epsilon_0 \mu_0} = c$, the speed of light in vacuo. In many ways, the present-day vacuum in quantum field theory plays the role of the deceased aether. The essential assumption in the Higgs mechanism is that the ground state, or vacuum, is asymmetric, notwithstanding the fact that the Lagrangian is symmetric. The Higgs mechanism with its nonvanishing vacuum expectation value plays the role of the vector potential in an apparent analogy to spontaneous magnetization in ferromagnetism when the temperature is lowered below its critical value [cf. Sec. 11.5.2 below]. So in many ways the vacuum has replaced the aether. There may be many roads to a discovery that destroy the uniqueness of a single theory like that of general relativity [cf. Chapter 7]. Modern day tendencies are to replace aethers by field Lagrangians and let their symmetry, or better symmetry-breaking, be their deus ex machina. It would not be inappropriate to recall the words of Heaviside [12] concerning the Lagrangians and the principle of least action:

Whether good mathematicians, when they die, go to Cambridge, I do not know. But it is well known that a large number of men go there when they are young for the purpose of being converted into senior wranglers and Smith’s prizemen. Now at Cambridge, . . . there is a golden or brazen idol called the Principle of Least Action. Its exact locality is kept secret, but numerous copies have been made and
distributed amongst the mathematical tutors and lecturers at Cambridge, who make young men fall down and worship the idol.
How times have changed — and how times may yet change again!
11.1.5 Plane waves
Since electron spin appears as the counterpart of the polarization of light we might be inclined to use the Stokes parameters to characterize the polarization of elementary particles [Jauch & Rohrlich 55]. Consider a plane wave propagating in the positive z-direction with wave number κ and angular velocity ω, which is completely polarized. In optics it is necessary to consider four ‘amplitudes,’ $E_x$, $E_y$, $H_x$ and $H_y$, each of which satisfies the wave equation. Rather than using the two components of the magnetic field, H, we may use the vector potential A which is related to it by $A = (\nabla \times H)/\kappa^2$. Since $H = \nabla \times A$, this implies that H satisfies the reduced wave, or Helmholtz, equation,

$$
\nabla^2 H = -\nabla \times \nabla \times H = -\kappa^2 H,
$$

since ∇ · H = 0. The non-vanishing components of the vector potential,

$$
A_x = a \sin(\kappa z - \omega t)/\kappa, \qquad
A_y = b \sin(\kappa z - \omega t + \delta)/\kappa,
$$

have amplitudes a and b, and phase δ. Since $E = -\dot{A}$, because φ ≡ 0, the non-vanishing components of the electric field are

$$
E_x = a \cos(\kappa z - \omega t),
\tag{11.1.19a}
$$

$$
E_y = b \cos(\kappa z - \omega t + \delta).
\tag{11.1.19b}
$$
If the electric vector, E = Ex + Ey , rotates counter-clockwise when the observer is facing into the oncoming wave, such a wave is said to be left-circularly polarized. In the jargon of elementary particle physics it means that the particle has positive helicity, or that the spin of the particle is in the direction of the momentum. In contrast, if the rotation of the electric vector is clockwise when looking into the wave, the wave is said to be right-circularly polarized, or, equivalently, that the particle has negative helicity,
in which case the spin is in the opposite direction to the momentum of the particle. The stresses, formed from the products of (11.1.19a) and (11.1.19b), and averaged over a period of the motion, are the way Maxwell accounted for the ponderomotive forces of the electric field. The normal stress,

$$
J_z = \overline{E_x^2} - \overline{E_y^2} = \tfrac{1}{2}\left(a^2 - b^2\right),
\tag{11.1.20}
$$

is related to the (radiation) pressure, while the tangential stress,

$$
J_x = 2\,\overline{E_x E_y} = ab \cos\delta,
\tag{11.1.21}
$$

is the stress due to shearing. We emphasize that it is precisely through the Maxwell stresses, such as (11.1.20) and (11.1.21), that we can account for the actions of inertia in a theory which is otherwise completely devoid of it. Then you ask, stress on what? And here we return to the aether, not as a luminiferous, gaseous, aether, but an elastic, or ‘jelly-like,’ solid as Stokes liked to think of it. The third component,

$$
J_y = ab \sin\delta,
\tag{11.1.22}
$$

is related to the projection of the angular momentum on the z-axis. It is the spin component of the angular momentum [cf. (11.5.47) below],

$$
S = \frac{1}{4\pi} E \times A = \frac{\hat{k}}{4\pi\omega}\, ab \sin\delta,
\tag{11.1.23}
$$

where $\hat{k}$ is the unit vector pointing in the z-direction. Finally, the fourth component, J, is related to the total energy,

$$
W = \frac{1}{8\pi}\left(\overline{E^2} + \overline{H^2}\right) = \frac{1}{8\pi}\left(a^2 + b^2\right).
\tag{11.1.24}
$$

The direction of the flow of energy is determined by Poynting’s vector, which can arguably be also associated with the name of Heaviside,

$$
\bar{P} = \frac{1}{4\pi} E \times H,
\tag{11.1.25}
$$
as we explained in footnote 2 of this chapter. The magnetic intensity H = ∇ × A has components

$$
H_x = -b \cos(\kappa z - \omega t + \delta), \qquad
H_y = a \cos(\kappa z - \omega t).
$$

How are the two vectors (11.1.23) and (11.1.25) related? The latter represents the linear momentum of the electromagnetic field per unit volume. The moment of the linear momentum density is the total angular momentum density,

$$
J = r \times P = \frac{1}{4\pi}\left(r \times E \times H\right).
$$

Expressing the magnetic force in terms of the vector potential, and using the vector identity

$$
E \times \nabla \times A = \nabla A \cdot E - E \cdot \nabla A,
$$

we get

$$
r \times P = r \times \nabla A \cdot E + E \cdot \nabla A \times r.
$$

The first term is analogous to the orbital angular momentum density [Rohrlich 65],

$$
L = r \times \nabla A \cdot E,
\tag{11.1.26}
$$

while the second term can be rewritten as

$$
E \cdot \nabla A \times r = \nabla \cdot (E A \times r) + r \times A\, \nabla \cdot E + E \cdot \nabla r \times A.
$$

When integrated over the volume, the first term vanishes under the assumption that the fields vanish sufficiently fast at infinity (Maxwell’s infinite integrals to get finite quantities!), the second term vanishes in the absence of charges, and the third term is (11.1.23) since ∇r = 1 is the unit dyadic. If (11.1.23) and (11.1.26) are to apply to a photon, they need to be reinterpreted. The spin of a photon is usually assumed to be 1. But, what does this mean in terms of the decomposition of the total angular momentum in terms of its orbital and spin components? If we interpret (11.1.26) as spin itself, then (11.1.23) can be taken as the projection of the spin on a preferred direction, or the two components of the helicity of a photon.
Dividing (11.1.23) by (11.1.24) gives

$$
\frac{S}{W} = \hat{k}\, \frac{2ab \sin\delta}{\left(a^2 + b^2\right)\omega}.
$$

This was first derived by Abraham in the special case of circular polarization, and later generalized to a spherical wave by Sommerfeld [34]. The presence of the vector product in (11.1.23) implies that the spin is different from zero if the wave is other than linearly polarized. The kind of polarization is determined by the phase, δ. A linearly polarized wave has δ = 0, and consequently, the intrinsic spin of the particle vanishes. Rather, for δ = π/2 and a = b, the polarization ellipse degenerates into a circle resulting in a state of right-circular polarization, where the above ratio reaches a maximum of 1/ω. For the same condition on the amplitudes, but with a phase δ = −π/2, a state of left-circular polarization results. Finally, introducing Planck’s relation, W = ω, in natural units, gives the spin states $S = \pm\hat{k}$. These angular momenta correspond to helicities h = ±1, since there is no photon state with h = 0, because electromagnetic waves have only transverse fields. In other words, spin orthogonal to the direction of propagation for photons, as well as for all massless particles, does not exist.
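The period averages behind (11.1.20) to (11.1.24) and the ratio S/W can be reproduced numerically. This is a sketch under stated assumptions: the function names are illustrative, the fourth component J is normalized here as the averaged $E_x^2 + E_y^2$ (so that W = J/4π), and the vector potential is carried as κA so that only the phase θ = κz − ωt enters.

```python
import math

def period_average(f, samples=360):
    """Average f(theta) over one period of the phase theta = kappa*z - omega*t."""
    return sum(f(2.0 * math.pi * k / samples) for k in range(samples)) / samples

def stokes(a, b, delta):
    """Return (J, J_x, J_y, J_z) for the fields (11.1.19a) and (11.1.19b).

    J is normalized as the averaged E_x^2 + E_y^2 (an assumption of this sketch).
    """
    e_x = lambda th: a * math.cos(th)
    e_y = lambda th: b * math.cos(th + delta)
    ka_x = lambda th: a * math.sin(th)            # kappa * A_x
    ka_y = lambda th: b * math.sin(th + delta)    # kappa * A_y
    j = period_average(lambda th: e_x(th) ** 2 + e_y(th) ** 2)
    j_x = period_average(lambda th: 2.0 * e_x(th) * e_y(th))
    # z-component of E x (kappa A); it equals a*b*sin(delta) at every phase
    j_y = period_average(lambda th: e_x(th) * ka_y(th) - e_y(th) * ka_x(th))
    j_z = period_average(lambda th: e_x(th) ** 2 - e_y(th) ** 2)
    return j, j_x, j_y, j_z

def spin_to_energy(a, b, delta, omega):
    """Ratio S/W = 2ab sin(delta) / ((a^2 + b^2) omega), built from the averages."""
    j, _, j_y, _ = stokes(a, b, delta)
    return j_y / (j * omega)

j, j_x, j_y, j_z = stokes(2.0, 1.0, 0.4)
print("J^2 =", j * j, " J_x^2 + J_y^2 + J_z^2 =", j_x**2 + j_y**2 + j_z**2)
print("circular:", spin_to_energy(1.0, 1.0, math.pi / 2, 2.0))  # compare with 1/omega
print("linear:  ", spin_to_energy(1.0, 3.0, 0.0, 2.0))          # compare with 0
```

For a completely polarized wave the four components satisfy J² = J_x² + J_y² + J_z², which is the null four-vector condition that the vanishing determinant of the density matrix expresses.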
11.1.6 Spherical waves
Next in line after plane waves, in regard to their simplicity, are spherical waves. They were originally thought to produce condensation waves. Kelvin suggested that the rapid charging of two conducting spheres connected to an alternating dynamo would produce waves of compression, just as the rapid back-and-forth actions of a piston in a cylindrical cavity would do. Only here, it would be the rapid alternating charging that would be the seat of compressional waves. Compressional waves in electromagnetism were loathsome to Heaviside, and he rejected them outright. There are no ‘longitudinal’ waves in Maxwell’s theory analogous to sound waves. Maxwell took good care that there should not be any.
The ability of a changing electric field to induce a magnetic field, and of a changing magnetic field to induce an electric field, creates radiation and prohibits the formation of condensational waves. The radiation components of the electric and magnetic fields are not dependent upon charge and current, respectively. Rather, they are cut loose of these sources so that electric and magnetic variations influence one another and enable radiation to travel unlimited distances for unlimited amounts of time. This attests to the absence of mass of the photon. Maxwell’s circuital equations inevitably lead to a wave equation. This was an oversimplification for Gauss, but sufficient for Maxwell’s needs [cf. Sec. 1.2.1]. In the spherically symmetric case to be treated in Sec. 11.5.1, the wave equation has the solution of the product of a spherical Bessel function of order ℓ = 1, and a spherical harmonic in which m = −1, 0, 1. But, from its derivation from the circuital equations, the state m = 0 is missing, for if it did exist it would correspond to the state of zero helicity. Yet, if we allow for a new current, which is indicative of compressional waves, the state m = 0 will make its appearance. In Sec. 11.5.2 we will appreciate that a current proportional to the vector potential A introduces mass by introducing dispersion, whereas a term proportional to −∇ · A is analogous to a hydrostatic pressure, which, by itself, is not related to either incompressible or compressible fluid flow [Landau & Lifshitz 59]. In the standard theory of the electroweak interaction, the appearance of the longitudinal mode with h = 0 occurs as a result of the breaking of gauge invariance. In so doing the gauge fields acquire mass, but cannot propagate unless their frequencies exceed the mass created. If our interpretation of longitudinal modes in Maxwell’s equations is correct, the appearance of mass has absolutely nothing to do with the appearance of a longitudinal mode with h = 0. In Sec. 11.5.2 we will analyze the putative analogy between the superconducting state in the Meissner effect and the vacuum state of the Higgs field in the symmetry breaking mechanism of electroweak theory. We will conclude that the analogy is evanescent.
11.1.7 β-decay and parity violation
Another example of the relation between the relative velocity and the normal stress (11.1.20) is afforded by parity violation in the weak interaction.
Fig. 11.3. Weak β-decay of the neutron. In Fermi’s theory this occurs at a single point where the emission of an electron-antineutrino pair is analogous to electromagnetic photon emission.
Weak interactions first made their appearance in nuclear β-decay. Fermi’s theory models β-decay as analogous to an electromagnetic transition of an excited atom. However, instead of ejecting a photon, an electronantineutrino pair is ejected. The most elementary example of β-decay is neutron decay, shown in Fig. 11.3, where a neutron, n, decays into a proton, p, an electron, e, and an antineutrino ν¯ e . n −→ p + e + ν¯ e . The inverse reaction, p −→ n + e¯ + νe , where e¯ is the positron and νe the neutrino, cannot be observed outside of the nucleus because the proton is lighter than the neutron. Inside the nucleus, the proton can ‘borrow’ the necessary energy from the rest of the nucleus. To investigate parity violation one studies the β-decay of an unstable nucleus with a large spin that can be polarized so that it points in a specified direction. Agood candidate is Co60 , which is polarized so that its spin points in the direction of an applied magnetic field, B, as shown in Fig. 11.4. When the nucleus decays it emits an electron with momentum p. The experiment consists in determining the directional distribution of this momentum. The emission probability per unit solid angle, dP/d, is a 2 × 2
Fig. 11.4. The decay of polarized cobalt.
matrix, whose most general form,
\[ \frac{dP}{d\Omega} = A\,I + B\,\mathbf{s}\cdot\mathbf{p}, \]
contains arbitrary, but positive, constants, A and B, where I is the unit matrix, and the spin is defined in terms of the Pauli matrices as s = σ/2. More electrons will be emitted into one of the hemispheres, either above or below the xy-plane. This is a violation of invariance under parity inversion. For if the coordinate axes are inverted, the momentum p, being a polar vector, will change sign, but the spin s will not because it is an axial vector like angular momentum. Hence, under parity inversion the probability per unit solid angle will be $A\,I - B\,\mathbf{s}\cdot\mathbf{p}$, and is not an invariant. Another experimental possibility is to measure the polarization of electrons that are emitted from unpolarized nuclei. In the case of $\mathrm{Co}^{60}$,
Fig. 11.5. The decay plane of cobalt 60.
β-decay would yield
\[ \mathrm{Co}^{60} \longrightarrow \mathrm{Ni}^{60} + e + \bar{\nu}_e. \]
For there to be conservation of momentum, the recoil of the $\mathrm{Ni}^{60}$ nucleus must be such that
\[ \mathbf{p}_{\mathrm{Ni}} + \mathbf{p}_e + \mathbf{p}_{\bar{\nu}_e} = 0. \]
The momenta define a plane called the decay plane, which is shown in Fig. 11.5. Suppose that the initial state of $\mathrm{Co}^{60}$ is unpolarized so that it has no preferential direction. Neither do the linear momenta, so that leaves only the spin of the electron. Being an axial vector, reflection through the origin will have no effect on it, but a rotation about the n-axis will, so that if parity is conserved, the electron spin must be pointing in the direction n, normal to the decay plane. This means that there can be no polarization of the electron along the direction of its momentum. However, parity conservation was found to be broken, and the electron has a longitudinal polarization equal to −u, the negative of the relative speed, u. This means that the state of helicity $h = -\tfrac{1}{2}$ is more populated than the state of helicity $h = +\tfrac{1}{2}$. If $a^2$ denotes the number of electrons with helicity $h = -\tfrac{1}{2}$, and $b^2$ those with helicity $h = +\tfrac{1}{2}$, what is experimentally open to
measurement is the relative velocity,
\[ u = \frac{J_z}{W} = \frac{a^2 - b^2}{a^2 + b^2}, \tag{11.1.27} \]
where $J_z$ and $W$ are given by (11.1.20) and (11.1.24), respectively. Similar experiments involving the conversion of a proton to a neutron show that positrons are also longitudinally polarized, but with opposite polarization, +u. Now, if we solve (11.1.27) for the square of the ratio, b/a, we easily find
\[ \frac{b^2}{a^2} = \frac{1-u}{1+u}, \tag{11.1.28} \]
which is the square of the longitudinal Doppler shift, a result to be expected. Furthermore, if we decompose the wave function into orthogonal components of spin up, $|u\rangle$, and spin down, $|d\rangle$, $\psi = a|d\rangle + b|u\rangle$, we can determine the orientation of spin in relation to the z-axis, say, by solving the eigenvalue equation,
\[ \mathbf{s}\cdot\mathbf{p}\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} p_z & p_x - i p_y \\ p_x + i p_y & -p_z \end{pmatrix}\begin{pmatrix} a \\ b \end{pmatrix} = W\begin{pmatrix} a \\ b \end{pmatrix}. \]
The ratio,
\[ \frac{b}{a} = \frac{W - p_z}{p_x + i p_y} = \frac{p_x - i p_y}{W + p_z}, \tag{11.1.29} \]
can be evaluated by introducing the spherical coordinates
\[ p_x = W\sin\vartheta\cos\varphi, \qquad p_y = W\sin\vartheta\sin\varphi, \qquad p_z = W\cos\vartheta. \tag{‡} \]
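And when this is done, we get the stereographic projection formula,
\[ \frac{b}{a} = \frac{1 - \cos\vartheta}{\sin\vartheta}\, e^{-i\varphi} = \frac{\sin\vartheta}{1 + \cos\vartheta}\, e^{-i\varphi} \]
\[ \phantom{\frac{b}{a}} = \tan(\vartheta/2)\, e^{-i\varphi} = \left(\frac{1 - \cos\vartheta}{1 + \cos\vartheta}\right)^{1/2} e^{-i\varphi}, \tag{11.1.30} \]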
Fig. 11.6. The spherical coordinates used to describe the orientation of spin.
for the orientation of the spin with respect to the z-axis, shown in Fig. 11.6. The two expressions in the first line of (11.1.30) are not only the half-angle formulas for the tangent, but are also transcriptions of (11.1.29). They show that the total energy,
\[ W^2 = p_z^2 + p_x^2 + p_y^2, \tag{11.1.31} \]
is that of a relativistic, massless, particle. But, wait, appearances can be deceiving. The second line of (11.1.30) is the formula for stereographic projection. We know from Sec. 2.2.3 that stereographic projection is a conformal map of the surface. In regard to Fig. 7.5, a would be the diameter of the sphere, 2R, and b would be the point on the plane where the projection is made. Comparing the last expression in (11.1.30) with the square root of (11.1.28) identifies
\[ u = \cos\vartheta. \tag{11.1.32} \]
So the last equation in (‡) is $Wu = p_z$. However, we know that $Wu = p$, so we have to identify the total momentum, p, with the momentum in the z-direction, $p_z$. This is obvious because we arranged our axes so that the momentum will be pointing in the z-direction. Then what are the remaining two terms in (11.1.31)? The electron cannot be ultrarelativistic because u < 1, and because it is an electron it must have mass. We are therefore led to conclude that the last two terms in (11.1.31), if they are non-zero, must be related to the mass of the electron. If we set $p_x^2 + p_y^2 = m^2$, (11.1.31) becomes the relativistic expression for the energy of a massive particle whose momentum is $p_z = p$. This is the origin of mass polarization. Consequently, our transformation to
spherical coordinates becomes
\[ p_x = W\sqrt{1 - u^2}\,\cos\varphi = m\cos\varphi, \qquad p_y = W\sqrt{1 - u^2}\,\sin\varphi = m\sin\varphi, \qquad p_z = Wu = p. \]
Furthermore, the last equality in (11.1.30), with the identification (11.1.32) and the definition of hyperbolic distance, enables it to be written as
\[ \frac{b}{a} = \left(\frac{1-u}{1+u}\right)^{1/2} e^{-i\varphi} = e^{-(\bar{u} + i\varphi)}, \tag{11.1.33} \]
where $\bar{u}$ is the hyperbolic measure of the velocity in a velocity space with absolute constant unity. It also makes $\vartheta = \cos^{-1} u$ the angle of parallelism, and has converted the formula for stereographic projection, (11.1.30), into the Bolyai–Lobachevsky formula for the angle of parallelism,
\[ \tan[\vartheta(\bar{u})/2] = e^{-\bar{u}}, \]
where the angle ϑ is a sole function of the hyperbolic velocity, $\bar{u}$. Moreover, if the ratio is real, φ = 0 and we are dealing with plane polarization; if it is imaginary, $\bar{u} = 0$ and $W^2 = m^2$, and the polarization is circular with ±i for right- and left-circular polarization; and finally, if it is complex, we are dealing with elliptic polarization.
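The chain of identifications running from (11.1.27) to (11.1.33) can be checked numerically. The following is only a minimal sketch: the populations a2 and b2 and the energy W are illustrative values chosen for the check, and the function names are hypothetical, not taken from the text. Only the modulus of b/a is tested, so the sign convention of the phase φ plays no role.

```python
import math


def relative_velocity(a2: float, b2: float) -> float:
    """Eq. (11.1.27): u from the populations a2 (h = -1/2) and b2 (h = +1/2)."""
    return (a2 - b2) / (a2 + b2)


def amplitude_ratio(u: float) -> float:
    """Modulus of b/a from eq. (11.1.33): sqrt((1 - u)/(1 + u))."""
    return math.sqrt((1.0 - u) / (1.0 + u))


# Illustrative populations (assumed values, not data from any experiment).
a2, b2 = 3.0, 1.0
u = relative_velocity(a2, b2)

theta = math.acos(u)      # eq. (11.1.32): u = cos(theta)
u_bar = math.atanh(u)     # hyperbolic measure of the velocity

# Eq. (11.1.28), the half-angle form in (11.1.30), and the
# Bolyai-Lobachevsky form tan(theta/2) = exp(-u_bar) should all agree.
assert math.isclose(b2 / a2, (1.0 - u) / (1.0 + u))
assert math.isclose(amplitude_ratio(u), math.tan(theta / 2.0))
assert math.isclose(amplitude_ratio(u), math.exp(-u_bar))

# Mass polarization: px^2 + py^2 = m^2 = W^2 (1 - u^2) and pz = W u.
W = 2.0                   # illustrative total energy (assumed value)
m = W * math.sqrt(1.0 - u * u)
p = W * u
assert math.isclose(W * W, p * p + m * m)
```

The last assertion is just (11.1.31) read as the energy of a massive particle with momentum p along the z-axis.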
11.2 Stokes Parameters and Their Physical Interpretations
Unwittingly we have derived expressions for the famous Stokes parameters in Sec. 11.1.5. For the derivation of these parameters it is sufficient to consider only the components of the electric field since the effect of light on molecules is to cause a redistribution of static charges. Before it was known that light was an electromagnetic phenomenon, Stokes considered that, for
linearly polarized light, the electric vector is oriented along the polarization axis of the light. For light propagating along the z-axis, (11.1.19a) and (11.1.19b) describe right-linearly polarized light along the x- and y-axes, respectively. The relative magnitude of these two components determines the orientation of the polarization axis. There are various forms of light polarization, and all can be represented as linear combinations of the orthogonal components, (11.1.19a) and (11.1.19b). The extreme cases are linearly polarized light, where one of the components vanishes, and circularly polarized light, where they become equal. In general, light will be elliptically polarized, and the square root of (11.1.20) will be proportional to the eccentricity of the ellipse. According to Stokes’s definition, (11.1.20) represents the difference in intensities between horizontal and vertical linearly polarized components. Stokes interpreted (11.1.21) as the difference in intensities between linearly polarized components oriented at angles ±45°. What we have referred to as spin, (11.1.22), to Stokes was the difference in intensities between right- and left-circularly polarized components. Finally, the total energy, (11.1.24), is proportional to the total intensity. The following table summarizes the Stokes parameters:

  J                       total intensity
  Q ≡ J_h − J_v           difference in horizontally and vertically polarized light intensities
  U ≡ J_{+45} − J_{−45}   difference in intensities of linearly polarized components oriented at ±45°
  V ≡ J_r − J_l           difference in right- and left-circularly polarized light intensities
The three Stokes parameters therefore measure the ‘preference’ of the light wave to be horizontal, linearly-polarized at an angle +45◦ , and right-circularly polarized [Shurcliff 62]. All components can be obtained by combining the orthogonal components of the electric vector. The not so obvious one is (11.1.20), for it appears to require the vector potential. Actually, it represents the difference in intensities of right- and left-circularly polarized light. In this way it makes spin point in the direction of the momentum of a particle and spin point in the opposite direction of a
particle, or what is referred to as ‘helicity,’ synonymous with right-circular and left-circular polarization, respectively. Consider the linear combinations,
\[ C_1 = (E_x - iE_y)/\sqrt{2}, \qquad C_2 = (E_x + iE_y)/\sqrt{2}. \]
The intensities of the right- and left-circular components are $J_r = C_1 C_1^{*}$ and $J_l = C_2 C_2^{*}$. Introducing $E_x = a$ and $E_y = b e^{i\delta}$ results in
\[ J_r = ab\sin\delta + W, \qquad J_l = -ab\sin\delta + W. \]
One-half of their difference, $\tfrac{1}{2}(J_r - J_l)$, is (11.1.22), and one-half their sum is (11.1.24). Actually, the Stokes vector is defined as twice this value. For unpolarized light, the polarization-dependent terms (11.1.20), (11.1.21), and (11.1.22) will all vanish, while for partially polarized light,
\[ J^2 \geq J_x^2 + J_y^2 + J_z^2, \]
which can be understood when one considers partially polarized light as consisting of two beams, one of which is completely polarized and the other unpolarized. The contribution of each of these beams to the magnitude of the total beam determines the degree of polarization. The equality sign applies to the state of complete polarization, where J can be looked upon as a radius vector of a sphere with coordinates $(J_x, J_y, J_z)$. Points on this sphere will correspond to specific states of polarization. Linearly and circularly polarized light can be converted into one another through the use of retarders, such as a quarter-wave plate. A quarter-wave plate increases the phase of one linear component by 90° with respect to the other. Retardation is caused by the refractive index of a material. When light passes from vacuum into matter, the speed of light is reduced in proportion to the inverse of the refractive index. This we saw was the determining factor in accepting the wave theory of light over the corpuscular theory. Since the frequency remains the same, the phase angle changes more rapidly with position inside the body than it does in the vacuum. The increase in the phase of light as it traverses the body appears as a retardation of light. The Poincaré sphere was designed by its discoverer to calculate the effects of the retarder on polarized light.
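The decomposition into circular components can be verified directly with complex arithmetic. This is a sketch under stated assumptions: W is taken to be (a² + b²)/2, which is what the expressions for the two intensities require, and the amplitudes and phase are arbitrary illustrative values; the function name is hypothetical.

```python
import cmath
import math


def circular_intensities(a: float, b: float, delta: float) -> tuple[float, float]:
    """Return (Jr, Jl) for Ex = a and Ey = b * exp(i * delta)."""
    ex = complex(a, 0.0)
    ey = b * cmath.exp(1j * delta)
    c1 = (ex - 1j * ey) / math.sqrt(2.0)   # right-circular amplitude
    c2 = (ex + 1j * ey) / math.sqrt(2.0)   # left-circular amplitude
    return abs(c1) ** 2, abs(c2) ** 2


# Illustrative amplitudes and phase (assumed values).
a, b, delta = 1.0, 0.5, 0.7
j_r, j_l = circular_intensities(a, b, delta)

w = 0.5 * (a * a + b * b)                  # assumption: W = (a^2 + b^2)/2
assert math.isclose(j_r, a * b * math.sin(delta) + w)
assert math.isclose(j_l, -a * b * math.sin(delta) + w)

# Half the difference is a*b*sin(delta); twice that is the Stokes parameter V.
assert math.isclose(j_r - j_l, 2.0 * a * b * math.sin(delta))
```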
Fig. 11.7. The Poincaré sphere is the parametrization of the Stokes parameters in elliptic geometry.
According to Poincaré [92],^d a state of polarization can be represented by a point on a sphere whose radius is given by the intensity εJ, where J is the total intensity, and ε is the degree of polarization. The Poincaré sphere is shown in Fig. 11.7, where each point on the sphere denotes a specific type of polarization. The polarization is specified by the azimuth ψ, ellipticity, and handedness, either left or right. On the sphere this is given by the angles 2ψ and 2χ, which are the longitude and latitude, respectively. The factor 2 in the longitude indicates that any polarization ellipse is indistinguishable from one rotated by π radians. The azimuth, ψ, is the inclination of the semimajor axis of the polarization ellipse with respect to the x-axis, as seen in Fig. 11.8. The factor of 2 multiplying the latitude, χ, indicates that the same polarization ellipse can be obtained by interchanging the semimajor and semiminor axes, and rotating it through π/2 radians. The four Stokes parameters are denoted by J, Q, U, and V, and determine a polarized state on the surface of the sphere according to
\[ J = a^2 + b^2, \tag{11.2.1a} \]
\[ Q = J\varepsilon\cos 2\chi\cos 2\psi = a^2 - b^2, \tag{11.2.1b} \]
^d Poincaré became interested in optics as a result of the lectures he gave at the Sorbonne during the years 1888, 1889, and again in 1899. It seems that each time he taught a new course, new discoveries were to be made.
Fig. 11.8. The polarization ellipse swept out by the electric field vector, which is enclosed by a rectangle of sides 2a and 2b. The transformation to new electric vector components Ex and Ey consists in a counter-clockwise rotation through the angle ψ.
\[ U = J\varepsilon\cos 2\chi\sin 2\psi = 2ab\cos\delta, \tag{11.2.1c} \]
\[ V = J\varepsilon\sin 2\chi = 2ab\sin\delta. \tag{11.2.1d} \]
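As a numerical illustration of (11.2.1a)–(11.2.1d), the Stokes parameters can be computed from the amplitudes a, b and the phase δ, and the degree of polarization ε, the longitude 2ψ, and the latitude 2χ then recovered from them. The sketch below uses hypothetical function names and arbitrary illustrative input values; it treats a single fully polarized wave, for which ε should come out equal to one.

```python
import math


def stokes(a: float, b: float, delta: float) -> tuple[float, float, float, float]:
    """Second set of equalities in (11.2.1a)-(11.2.1d)."""
    j = a * a + b * b
    q = a * a - b * b
    u = 2.0 * a * b * math.cos(delta)
    v = 2.0 * a * b * math.sin(delta)
    return j, q, u, v


def poincare_angles(j: float, q: float, u: float, v: float) -> tuple[float, float, float]:
    """Invert the first set of equalities: return (eps, 2*psi, 2*chi)."""
    radius = math.sqrt(q * q + u * u + v * v)   # radius eps * J of the sphere
    eps = radius / j
    two_psi = math.atan2(u, q)                  # longitude
    two_chi = math.asin(v / radius)             # latitude
    return eps, two_psi, two_chi


# Illustrative amplitudes and phase (assumed values).
j, q, u, v = stokes(1.0, 0.5, 0.7)
eps, two_psi, two_chi = poincare_angles(j, q, u, v)

# A single monochromatic wave is completely polarized.
assert math.isclose(eps, 1.0)

# The spherical-coordinate form reproduces Q, U and V.
assert math.isclose(q, j * eps * math.cos(two_chi) * math.cos(two_psi))
assert math.isclose(u, j * eps * math.cos(two_chi) * math.sin(two_psi))
assert math.isclose(v, j * eps * math.sin(two_chi))
```

For a = b and δ = π/2 the same functions return Q = 0 and V = J, the right-circular state at the pole of the sphere.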
The first set of equalities are spherical coordinates of latitude 2χ and longitude 2ψ, as shown in Fig. 11.7. Any two diametrically opposite points on the sphere represent an orthogonal pair of polarization forms. There is a direct correlation between any point on the sphere and the form of polarization. The second set of equalities expresses the Stokes parameters in terms of the horizontal and vertical components, a and b, of the electric vector and the phase angle, δ, between them. Expression (11.2.1a) is just the total intensity, expressed as the sum of the squares of a and b. If the electric vibration is horizontal, (11.2.1b) becomes 1, while if the vibration is totally vertical, −1. It vanishes for circular polarization, a = b, and for elliptical polarization with a major axis at ±π/4. It thus expresses the preference for a horizontal, as compared to a vertical, vibration. Expression (11.2.1c) expresses the preference for +π/4 vibration, while (11.2.1d) that of right circular polarization. Alternatively, they can be given the density matrix representation,
\[ \rho = \frac{1}{2}\begin{pmatrix} J + Q & U + iV \\ U - iV & J - Q \end{pmatrix}, \tag{11.2.2} \]
whose trace is the total intensity J, and whose determinant is $\tfrac{1}{4}(J^2 - V^2 - Q^2 - U^2) \geq 0$, where the equality sign applies to the case of complete polarization. The property that the Stokes parameters form an invariant four-vector will be of great usefulness to our development. The Poincaré sphere is constructed by projecting points that define a light vector in the complex plane onto a real three-dimensional sphere. As we have seen in Sec. 11.1.5, every type of polarization can be described by orthogonal components of the electric vector. If we represent the ratio of the two components as
\[ \frac{E_y}{E_x} = \frac{b}{a}\, e^{i\delta}, \tag{11.2.3} \]
we can then map the ratio $E_y/E_x$ onto the complex plane consisting of axes u and v, where $E_y/E_x = (b/a)(\cos\delta + i\sin\delta) := u + iv$. Every possible form of polarization is represented in the uv-plane. Any circle whose center is at the origin has radius b/a. Moreover, the phase δ is the angle between the u-axis and the radius vector [cf. Fig. 11.9 below]. All possible values of b/a are obtained by considering an infinite number of concentric circles about the origin, and all possible values of δ are realized by sweeping the radius vector around each of the concentric circles. Whereas the origin has b = 0, and therefore represents horizontally linearly polarized light, values of u and v which are infinite require a = 0, and therefore represent vertically polarized light. All states in the upper half-plane, v > 0, represent phase differences between 0 and π, and are right-handed polarizations, while all states in the lower half-plane correspond to phase differences between −π and 0, and so represent left-handed polarizations. In regard to the Stokes parameters V and U, the intersection of the unit circle (a = b) with the v-axis represents states of right-circularly (north pole) and left-circularly (south pole) polarized light, while intersections of the unit circle with the u-axis represent +π/4 (east) and −π/4 (west) linearly polarized light, as shown in Fig. 11.9. Thus, any polarized state can be identified with a point in the complex plane. We know from Sec. 2.2.3 that by a stereographic projection any polarized state can be projected onto a Riemann sphere, only in this case it is
Fig. 11.9. Complex plane representation of polarized states.
called the Poincaré sphere! Circles in the uv-plane whose centers lie on the u- and v-axes project into lines of longitude or lines of latitude, respectively, on the Poincaré sphere. Consider the former case first. A circle whose center is $(u_0, 0)$ cuts the points (0, 1) and (0, −1), which represent right- and left-circularly polarized light projected onto the north and south poles of the sphere, respectively. Every circle, or longitudinal line, will be characterized by two values of the azimuth ψ of the characterizing polarization ellipse. One value represents points in the right hemisphere, while the other represents points in the left hemisphere. To find the value of $u_0$, which we guess will be given by the formula for stereographic projection, we must solve the equation for a circle whose radius is $\sqrt{u_0^2 + 1}$, i.e.
\[ (u - u_0)^2 + v^2 = u_0^2 + 1. \]
Introducing the definitions of $u = (b/a)\cos\delta$ and $v = (b/a)\sin\delta$ into this formula for a circle results in
\[ \frac{b^2 - a^2}{2ab\cos\delta} = -\frac{Q}{U} = -\cot 2\psi = u_0. \]
Since 2ψ lives in the semi-open interval (−π, π], the two values of the azimuths are $\psi = -\tfrac{1}{2}\cot^{-1} u_0$, and $\psi = -\tfrac{1}{2}(\cot^{-1} u_0 \pm \pi)$, where the + (−) sign applies to $u_0 > 0$ ($u_0 < 0$). Therefore, each circle centered on the u-axis in the complex plane at (−cot 2ψ, 0) projects onto the longitude 2ψ, and has a radius $\sqrt{\cot^2 2\psi + 1} = \csc 2\psi$. Now consider the second case where circles centered on the v-axis at $(0, v_0)$ project into parallels of latitude on the Poincaré sphere, where $|v_0| > 1$. The radius of each circle is $r = \sqrt{v_0^2 - 1}$ so that the equation of the circle is
\[ u^2 + v^2 - 2v_0 v = -1. \]
Again introducing the definitions of u and v in terms of the polarization ellipse leads to $a^2 + b^2 = 2ab\,v_0\sin\delta$, or
\[ \frac{a^2 + b^2}{2ab\sin\delta} = \frac{J}{V} = \csc 2\chi = v_0. \]
The radius of the circle in the uv-plane is
\[ r = \sqrt{\csc^2 2\chi - 1} = \cot 2\chi. \]
Therefore, the latitude to which a circle of radius r = cot 2χ, centered at (0, csc 2χ), is projected onto the Poincaré sphere is given by the angle 2χ. The stereographic projection of points on the uv-plane onto the Poincaré sphere is shown in Fig. 11.10. The complex plane bisects the sphere in such a way that its center coincides with that of the sphere. The orientation of the sphere is such that the +y-axis of the sphere coincides with the +u-axis, and the +z-axis with the +v-axis. A point P on the plane is projected to a point P′ on the sphere by extending the line connecting P and V, where V denotes vertically polarized light, and H, horizontally polarized light. Thus, the latitude of the point P′ is given by the angle 2χ formed by the vector from O to P′ and the projection of this vector onto the xy-plane, where positive angles are measured for increasing z.
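The two families of circles can be checked numerically for a given polarization state. The sketch below is illustrative only: the function name and the input values are assumptions, and the state is taken to be completely polarized (ε = 1) with U ≠ 0 and V > 0 so that both centers are finite and the signs of the radii are unambiguous.

```python
import math


def circle_centres(a: float, b: float, delta: float) -> tuple[float, float, float, float]:
    """Return (u, v, u0, v0) for the polarization state (a, b, delta)."""
    u = (b / a) * math.cos(delta)
    v = (b / a) * math.sin(delta)
    u0 = (b * b - a * a) / (2.0 * a * b * math.cos(delta))   # equals -Q/U
    v0 = (a * a + b * b) / (2.0 * a * b * math.sin(delta))   # equals J/V
    return u, v, u0, v0


# Illustrative state (assumed values) with U != 0 and V > 0.
a, b, delta = 1.0, 0.5, 0.7
u, v, u0, v0 = circle_centres(a, b, delta)

# The point (u, v) lies on both circles.
assert math.isclose((u - u0) ** 2 + v ** 2, u0 ** 2 + 1.0)
assert math.isclose(u ** 2 + (v - v0) ** 2, v0 ** 2 - 1.0)

# Longitude and latitude from the Stokes parameters (11.2.1a)-(11.2.1d).
two_psi = math.atan2(2.0 * a * b * math.cos(delta), a * a - b * b)
two_chi = math.asin(2.0 * a * b * math.sin(delta) / (a * a + b * b))

assert math.isclose(u0, -1.0 / math.tan(two_psi))                       # u0 = -cot 2psi
assert math.isclose(v0, 1.0 / math.sin(two_chi))                        # v0 = csc 2chi
assert math.isclose(math.sqrt(v0 ** 2 - 1.0), 1.0 / math.tan(two_chi))  # r = cot 2chi
```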
Fig. 11.10. Stereographic projection of the complex plane onto the Poincaré sphere.
11.3 Poincaré’s Representation and Spherical Geometry
The mixing of (11.1.21) and (11.1.22) does not reflect their original definitions as Maxwell stresses and (11.1.20) as the momentum. Rather, if we introduce the Poincaré representation with the angle variables (2χ, 2ψ), which are related to (2ϑ, 2ϕ) by the right-spherical triangle shown in Fig. 11.1, we will get
\[ J_z = W\cos 2\chi\cdot\cos 2\psi = \tfrac{1}{2}\left(a_+^{\dagger}a_+ - a_-^{\dagger}a_-\right), \tag{11.3.1a} \]
\[ J_x = W\cos 2\chi\cdot\sin 2\psi = \tfrac{1}{2}\left(a_+^{\dagger}a_- + a_-^{\dagger}a_+\right), \tag{11.3.1b} \]
\[ J_y = W\sin 2\chi = -\tfrac{1}{2}\left(a_+^{\dagger}a_- - a_-^{\dagger}a_+\right), \tag{11.3.1c} \]
in place of (II ). Whereas expression (11.3.1a) corresponds to the xx-component of the Maxwell stress,
\[ \sigma_{xx} = \frac{1}{8\pi}\left(E_y^2 - E_x^2\right), \]
(11.3.1b) is the amount of x-momentum that flows in the y direction,
\[ \sigma_{xy} = \frac{1}{4\pi}E_x E_y. \]
Due to symmetry, this is equal to the amount of y-momentum that flows in the x direction. Finally, expression (11.3.1c) is the angular momentum operator. The same configuration repeats itself by mixing the tangential and normal Maxwell stresses in the plane normal to the invariant momentum operator. The mixing of the normal and tangential stresses with the spin and total energy corresponds to the rotation of light through an angle 2ψ about its direction of propagation. For example, light can be passed through a crystal plate with simple rotatory power [Perrin 42], where
\[ J' = J, \qquad J_z' = J_z\cos 2\psi - J_x\sin 2\psi, \tag{11.3.2a} \]
\[ J_x' = J_z\sin 2\psi + J_x\cos 2\psi, \qquad J_y' = J_y. \tag{11.3.2b} \]
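The action of a plate with simple rotatory power can be sketched as an ordinary rotation in the $(J_z, J_x)$ plane. The code below assumes the reading of (11.3.2a) and (11.3.2b) in which the left-hand sides are the transformed (primed) quantities; the function name and the numerical values are hypothetical, chosen only to exhibit the invariants.

```python
import math


def rotate_stokes(j, jx, jy, jz, two_psi):
    """Apply (11.3.2a)-(11.3.2b): rotate (Jz, Jx) through the angle 2*psi."""
    c, s = math.cos(two_psi), math.sin(two_psi)
    jz_new = jz * c - jx * s
    jx_new = jz * s + jx * c
    return j, jx_new, jy, jz_new   # J and Jy are left unchanged


# Illustrative Stokes components (assumed values).
j, jx, jy, jz = 1.25, 0.6, 0.4, 0.75
two_psi = 0.9

j2, jx2, jy2, jz2 = rotate_stokes(j, jx, jy, jz, two_psi)

# The rotation preserves the total intensity, the spin component Jy,
# and the sum of squares of the two stress components.
assert j2 == j and jy2 == jy
assert math.isclose(jz2 ** 2 + jx2 ** 2, jz ** 2 + jx ** 2)

# Rotating back through -2*psi restores the original state.
_, jx3, _, jz3 = rotate_stokes(j2, jx2, jy2, jz2, -two_psi)
assert math.isclose(jx3, jx) and math.isclose(jz3, jz)
```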
The rotations of the normal and tangential stresses are quite different from those predicted by relativity [McCrea 47]. Whereas (II ) elicits a mechanical interpretation, (11.3.1a)–(11.3.1c) require an electromagnetic interpretation. In the former, the axis normal to the mixing of the two mass components was the momentum, whereas, in the latter, it is the two stress components that lie in a plane normal to the spin. Whereas the square of the orthogonal vectors corresponds to the relativistic mass relation, the sum of the squares of the normal and tangential Maxwell stress components is
\[ J_z^2 + J_x^2 = W^2\cos^2 2\chi = W^2\left(\cos^2 2\vartheta\,\sin^2 2\varphi + \cos^2 2\varphi\right). \tag{11.3.3} \]
Averaging (11.3.3) over all directions of polarization, by integrating over all ϕ, where
\[ \overline{\cos^2 2\varphi} = \overline{\sin^2 2\varphi} = \frac{1}{2}, \]
gives
\[ \overline{J_x^2 + J_z^2} = W^2\,\frac{1 + \cos^2 2\vartheta}{2}. \tag{11.3.4} \]
If 2ϑ is interpreted as the angle of scattering with respect to the direction of propagation of the primary beam whose intensity is $W^2$, then (11.3.4) gives the intensity of a scattered beam. Observing that
\[ \overline{J_y^2} = \frac{1}{2}W^2\sin^2 2\vartheta, \]
we obtain Rayleigh’s expression [Born & Wolf 59],
\[ \varepsilon = \frac{\overline{J_y^2}}{\overline{J_x^2 + J_z^2}} = \frac{\sin^2 2\vartheta}{1 + \cos^2 2\vartheta}, \tag{11.3.5} \]
for the degree of polarization, although Rayleigh derived it in a different way. Moreover, by a change of coordinates we have gone from a hyperbolic to an elliptic space. Consider again the elliptic velocity right triangle in Fig. 11.1. The angle 2ϕ at the origin will have the same elliptic measure as the Euclidean measure. The cosine of the angle is cos 2ϕ = tan 2ψ/tan 2ϑ. The same, however, is not true for the non-central angle 2ω, for it will undergo distortion, and so, too, the side 2χ. Its cosine will be given by
\[ \cos 2\omega = \frac{\tan 2\chi}{\tan 2\vartheta}\,\sec 2\psi = \frac{\sin 2\chi}{\sin 2\vartheta}, \]
where the last relation follows from the elliptic Pythagorean theorem, cos 2ϑ = cos 2ψ · cos 2χ. For the elliptic angle, $\hat{\omega}$, there is no distortion so that $\cos 2\hat{\omega} = \tan 2\chi/\tan 2\vartheta$. It therefore follows that the relation between the two measures of the angle is $\cos 2\omega = \cos 2\hat{\omega}\cdot\sec 2\psi$. Since sec 2ψ ≥ 1, $\cos 2\omega > \cos 2\hat{\omega}$, and because the cosine is a decreasing function on the interval (0, π/2), $\omega < \hat{\omega}$. This is the origin of the angle excess in elliptic geometry. And just as in hyperbolic geometry, the angles of an elliptic triangle also determine the sides.
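The relations of the right spherical triangle, and the averaging that leads to Rayleigh’s expression (11.3.5), can be checked numerically. The sketch below makes one assumption beyond what is quoted in the text: the side 2χ is obtained from the standard spherical relation sin 2χ = sin 2ϑ · sin 2ϕ, which is consistent with (11.3.3). Function names and numerical values are illustrative, and 2ϑ is kept below π/2.

```python
import math


def triangle_sides(two_theta: float, two_phi: float) -> tuple[float, float]:
    """Sides (2*psi, 2*chi) of a right spherical triangle with hypotenuse
    2*theta and angle 2*phi at the origin."""
    two_psi = math.atan(math.tan(two_theta) * math.cos(two_phi))   # cos 2phi = tan 2psi / tan 2theta
    two_chi = math.asin(math.sin(two_theta) * math.sin(two_phi))   # assumed spherical sine rule
    return two_psi, two_chi


def rayleigh_degree(two_theta: float, n: int = 360) -> float:
    """Average Jy^2 and Jx^2 + Jz^2 (with W = 1) over n equally spaced
    values of 2*phi and return their ratio, as in (11.3.5)."""
    num = den = 0.0
    for k in range(n):
        two_phi = 2.0 * math.pi * k / n
        sin2_chi = (math.sin(two_theta) * math.sin(two_phi)) ** 2
        num += sin2_chi            # Jy^2 = sin^2 2chi
        den += 1.0 - sin2_chi      # Jx^2 + Jz^2 = cos^2 2chi, eq. (11.3.3)
    return num / den


two_theta, two_phi = 1.0, 0.6      # illustrative angles in radians (assumed)
two_psi, two_chi = triangle_sides(two_theta, two_phi)

# Elliptic Pythagorean theorem.
assert math.isclose(math.cos(two_theta), math.cos(two_psi) * math.cos(two_chi))

# Non-central angle: (tan 2chi / tan 2theta) sec 2psi = sin 2chi / sin 2theta.
lhs = math.tan(two_chi) / math.tan(two_theta) / math.cos(two_psi)
assert math.isclose(lhs, math.sin(two_chi) / math.sin(two_theta))

# Rayleigh's expression (11.3.5).
expected = math.sin(two_theta) ** 2 / (1.0 + math.cos(two_theta) ** 2)
assert math.isclose(rayleigh_degree(two_theta), expected)
```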
It will appear to us that the side 2χ will undergo a dilatation by the amount
\[ \frac{\cos 2\chi}{\cos 2\vartheta} = \sec 2\psi = \frac{\sqrt{u^2 + (1 - u^2)\cos^2 2\varphi}}{u} \geq 1, \tag{11.3.6} \]
where, again, the first equality is the elliptic Pythagorean theorem, and the inequality, cos 2ϑ < cos 2χ, implies that ϑ > χ. The space dilatation depends upon the polarization which is determined by the relative phase 2ϕ. For a linearly polarized wave, 2ϕ = 0, π, the stretching is maximum, 1/u, while for left- (right-) circular polarization, 2ϕ = −π/2 (2ϕ = +π/2), it vanishes. This occurs when the amplitudes of the orthogonal components of the electric vector become equal. Intermediary, elliptic, polarization occurs in the interval 1 ≤ sec 2ψ ≤ 1/u. We have underscored the analogy between the Stokes parameters and the operators of SU(2). What can we say about the strong interaction which supposedly uses SU(3) whose states are the color charges? Since there are supposedly three ‘colors’ for each of the six quark species, a 3 × 3 matrix is required. This means that there will be eight generators, replacing the three Pauli matrices of SU(2). These are known as the Gell-Mann matrices, named after their inventor. Instead of a single (Casimir) invariant of SU(2), there will be three. But, for a compact group these Casimir invariants can always be written as a sum of squares of generators [cf. (11.3.7) below]. This would imply that the SU(3) group is not elementary, but, rather, the different SU(2) subgroups of SU(3) can be used. Within each subgroup the operators would be those of the ordinary angular momentum algebra [Lipkin 66]. In other words, any two components of the triplet can define isospin leaving the third component invariant. The couplings of these subgroups would be related to the additivity of the Stokes parameters when there is a superposition of two independent light beams. Additivity reflects the lack of interference, or the lack of correlation of the amplitudes and phases. 
It is this additivity principle that makes the scattering parameters of the emergent beam a linear homogeneous function of those of the incident beam, and from it the Lorentz and rotational transformations are immediate consequences.
11.3.1 Isospin and the electroweak interaction
The distinction between elliptic and hyperbolic geometries can be translated into the language of Lie groups. A ‘compact’ Lie group is associated with elliptic geometry, where the parameters of the group can assume values over a closed interval. The group U(1) for the electromagnetic interaction is compact because it is characterized by a unique angle that can take on values in the closed interval [0, 2π]. It is said that U(1) applies to electromagnetic interactions because it represents phase changes; the electromagnetic four-vector potential, $A_\mu$, is determined up to the four-gradient of an arbitrary function. However, even classically, it is known that when a circularly polarized light beam is directed at a target it sets the electrons in the target into circular motion in response to the rotating electric field. Hence, a relationship is suggested between circularly polarized light and photons in a definite state of angular momentum. A fortiori photons have definite states of helicity which are related to states of left- and right-handed circular polarization. Thus, the photon is not a singlet, but a doublet, just like the three doublets of leptons, the electron, muon, and tau, all with their own neutrinos. The doublet structure that defines an SU(2) symmetry for the weak force arises from the leptons’ behavior with respect to weak decays, like the β-decay discussed in Sec. 11.1.7. And just as each doublet belongs to a fundamental representation of weak SU(2) symmetry so, too, the photon has a doublet structure. Analogous to the three weak isotopic spin components of the local gauge, the three elements of the electromagnetic interaction are the Stokes parameters. The compactness of the group ensures that the group is unitary, or that it has a unitary representation. Non-compact groups have parameters that are not restricted to a finite interval.
An example is the Lorentz group, where the ‘boosts,’ or transformations from one inertial frame to another, are represented by non-unitary matrices. In fact, the ‘boost’ parameters are nothing but rapidities, $\bar{u} = \tanh^{-1} u$, which are not restricted to finite intervals. As we know, these belong to hyperbolic geometries.
The distinction is also represented in the quantities that are conserved: Compact Lie groups conserve the total energy, or total momentum, while non-compact ones will conserve mass. Thus far, non-compact Lie groups have not found their way into gauge theory since internal quantum numbers, like isotopic spin, appear to be associated with compact symmetry groups. But, by all of what we have said about the transformation from elliptic to hyperbolic geometry, and back, we expect non-compact groups to find their way into gauge theory, or something more fundamental than it. Generalizing to n dimensions, the unitary group U(n) is represented by n × n unitary matrices. Those with determinant equal to +1 define the special unitary or modular group SU(n). The elements of SU(n) have $n^2 - 1$ independent parameters. Examples of such groups are the SU(2) group of isotopic spin and the SU(3) group associated with color. The unitary transformations of SU(2) are given by $U = e^{-i\boldsymbol{\sigma}\cdot\boldsymbol{\alpha}}$, where σ consists of three generators, which are the Pauli spin matrices, and the components of α are the three weak isotopic spin components of the local gauge,
\[ \boldsymbol{\sigma}\cdot\boldsymbol{\alpha} = \begin{pmatrix} \alpha_3 & \alpha_1 - i\alpha_2 \\ \alpha_1 + i\alpha_2 & -\alpha_3 \end{pmatrix}. \]
The $\alpha_i$ form a linear space, known as the Lie algebra, in which there are both vector and scalar products. The fact that the operators do not commute leads to a form of vector multiplication, or Lie product, while the scalar product, or the negative of the determinant,
\[ \alpha_3^2 + \alpha_1^2 + \alpha_2^2 = \text{const.}, \tag{11.3.7} \]
expresses the conservation of something like total angular momentum, energy, or intensity. This invariant commutes with all the generators. The doublet structure of quarks and leptons that defines an SU(2) symmetry for the weak nuclear force follows from their behavior with respect to weak decay, such as the β-decay discussed in Sec. 11.1.7. Electrons, muons, and tau particles each have their own neutrinos and form three doublets. This carries over to quarks, which again form three distinct doublets.
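The statements about σ · α and the invariant (11.3.7) can be illustrated with explicit 2 × 2 matrices. The sketch below uses plain Python complex numbers rather than a linear-algebra library, and it evaluates the exponential through the closed form cos|α| I − i sin|α| (σ · α)/|α|, which holds because (σ · α)² = |α|² I. The function names and the components of α are hypothetical, chosen only for illustration.

```python
import math


def sigma_dot(alpha):
    """The matrix sigma . alpha for alpha = (a1, a2, a3)."""
    a1, a2, a3 = alpha
    return [[complex(a3, 0.0), complex(a1, -a2)],
            [complex(a1, a2), complex(-a3, 0.0)]]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def su2_element(alpha):
    """U = exp(-i sigma . alpha) via cos|a| I - i sin|a| (sigma . alpha)/|a|."""
    norm = math.sqrt(sum(x * x for x in alpha))
    if norm == 0.0:
        return [[1 + 0j, 0j], [0j, 1 + 0j]]
    c, s = math.cos(norm), math.sin(norm)
    m = sigma_dot(alpha)
    return [[(c if i == k else 0.0) - 1j * s * m[i][k] / norm for k in range(2)]
            for i in range(2)]


def dagger_times(u):
    """Return the product U^dagger U."""
    return [[sum(u[k][i].conjugate() * u[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


alpha = (0.3, -0.4, 1.2)     # illustrative components (assumed values)
U = su2_element(alpha)

# The invariant (11.3.7) is the negative of the determinant of sigma . alpha.
assert abs(-det2(sigma_dot(alpha)) - sum(x * x for x in alpha)) < 1e-12

# U is unitary with unit determinant, i.e. an element of SU(2).
UdU = dagger_times(U)
assert abs(UdU[0][0] - 1) < 1e-12 and abs(UdU[1][1] - 1) < 1e-12
assert abs(UdU[0][1]) < 1e-12 and abs(UdU[1][0]) < 1e-12
assert abs(det2(U) - 1) < 1e-12
```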
According to the ‘standard’ theory, the mediators of the weak SU(2), or spin-1 gauge particles, are by definition massless. However, it has been known since the early 1930s that the force between nucleons has an extremely short range, and this is what led Yukawa to propose his short range potential. It implies that the masses of the spin-1 vector mesons that mediate the weak interaction, or the W-bosons, are anything but zero. So what is done is to ‘mix’ the electromagnetic U(1) symmetry with the SU(2) symmetry. To the masses of the charged bosons $W^1$ and $W^2$, one adjoins a third component $W^3$, corresponding to the third (diagonal) Pauli matrix, $\sigma_3$. The new $W^3$ component would have the same coupling strength as the
\[ W^{\pm} = W^1 \pm iW^2, \tag{11.3.8} \]
bosons, but it would be neutral. Being neutral, $W^3$ would imply a new class of interactions for both the electron and neutrino. These ‘neutral interactions’ were unknown at the time they were predicted, and earlier gauge theories were built to exclude the possibility of such neutral currents. The problem then was to couple the new field $W^3$ to a physical gauge field. This was taken to be the four-vector potential $A_\mu = (\phi, \mathbf{A})$ itself. But this required something more than SU(2). So the combined weak and electromagnetic interactions would be ‘unified’ in the larger gauge group SU(2) × U(1). In the absence of $W^3$, the force between two electrons would be given exactly by Coulomb’s law, while, in its presence, Coulomb’s law must be modified. What was charge and vector potential in electromagnetism must now be modified to contain a touch of the new weak interaction. The simplest way was to consider a linear combination of the two,
\[ \begin{pmatrix} A'_\mu \\ Z^0_\mu \end{pmatrix} = \begin{pmatrix} \cos\vartheta_w & \sin\vartheta_w \\ -\sin\vartheta_w & \cos\vartheta_w \end{pmatrix}\begin{pmatrix} A_\mu \\ W^3_\mu \end{pmatrix}, \tag{11.3.9} \]
where $\vartheta_w$ is the so-called Weinberg angle that is defined in terms of the ‘coupling’ constants of (hyper-) charge and the weak isotopic charge, so as to produce a ‘new’ four-vector potential, $A'_\mu$, in respect to the ‘old’ four-vector potential $A_\mu$, and a new weak field, $Z^0_\mu$. This was a newly hypothesized neutral weak boson that forms the SU(2) triplet of weak bosons together with the original W bosons, (11.3.8). The reason why $W^3_\mu$-field
Fig. 11.11. The scattering of a neutrino and antineutrino emits a $Z^0$ boson which decays into $W$ bosons.
was ousted was due to the definition of the new four-vector potential, $A'_\mu$. This meant that $W^3_\mu$ cannot be considered to be purely weak, but also contains an electromagnetic contribution. Then, $Z^0_\mu$ would be the ‘physical’ neutral weak field. However, since (11.3.9) is invertible, the roles of $W^3$ and $Z^0$ can be interchanged. As it stands, (11.3.8) and the $Z^0$ boson must satisfy a conservation relation of the form (11.3.7). The $Z^0$ boson emitted in neutrino-antineutrino scattering decays into the $W^+$ and $W^-$ bosons shown in Fig. 11.11. In contrast, the components of the four-vector $A_\mu$ transform according to Lorentz,
\[ A'_0 = \frac{A_0 + uA_1}{\sqrt{1-u^2}}, \qquad A'_1 = \frac{A_1 + uA_0}{\sqrt{1-u^2}}, \qquad A'_2 = A_2, \qquad A'_3 = A_3, \tag{11.3.10} \]
which leaves the square magnitude,
\[ A_0^2 - A_1^2 - A_2^2 - A_3^2, \tag{11.3.11} \]
invariant, where $A_0 = \varphi$. The hyperbolic nature of the four-vector $A_\mu$ makes it transform through an imaginary angle in (11.3.10), and leads to a different type of invariant. Any other field which is coupled to it must transform in the same way in order to be compatible with it. In other words, the invariant (11.3.11) is not the same as the invariant (11.3.7), for the former is hyperbolic while the latter is elliptic. We recall that the invariant in hyperbolic space is the mass, whereas the total energy is the invariant in elliptic space.
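The contrast between the two kinds of invariant can be checked numerically. The following is a minimal sketch, assuming units with c = 1 and arbitrary sample components; the function names are illustrative, and the plain sum of squares is used only as a stand-in for an elliptic-type invariant, not as the exact form of (11.3.7).

```python
import math

def boost(A, u):
    """Lorentz transformation (11.3.10) along axis 1, relative velocity u (c = 1)."""
    A0, A1, A2, A3 = A
    gamma = 1.0 / math.sqrt(1.0 - u * u)
    return (gamma * (A0 + u * A1), gamma * (A1 + u * A0), A2, A3)

def hyperbolic_square(A):
    """Square magnitude (11.3.11): A0^2 - A1^2 - A2^2 - A3^2."""
    A0, A1, A2, A3 = A
    return A0**2 - A1**2 - A2**2 - A3**2

def elliptic_square(A):
    """Sum of squares: what a rotation through a real angle would preserve."""
    return sum(x * x for x in A)

A = (2.0, 0.5, -1.0, 0.3)      # arbitrary sample components (assumption)
A_prime = boost(A, 0.6)

# The hyperbolic form agrees before and after the boost (up to rounding);
# the elliptic form does not.
print(hyperbolic_square(A), hyperbolic_square(A_prime))
print(elliptic_square(A), elliptic_square(A_prime))
```

The boost changes the sum of squares while leaving (11.3.11) untouched, which is the incompatibility the text points to.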
It is hard to believe that Nature is such an improviser as to mix hyperbolic with elliptic elements, which would be like gluing two incompatible pieces together. The story is still not over. The charged bosons (11.3.8) weigh in at about 80 GeV, roughly 85 times the mass of a proton, and the neutral $Z^0$ boson is slightly heavier at about 91 GeV. The problem was to get mass out of a theory which apparently forbids it. It was required that the Lagrangian, leading to correct equations of motion, be gauge-invariant, and this prevented mass from appearing explicitly in the Lagrangian through a term of the form $mA_\mu A^\mu$. The rabbit was pulled from the hat by introducing a spin-0 field, together with its accompanying particle, known as the Higgs field and particle, after Peter Higgs who invented them. Then by introducing a potential of the field which undergoes a second-order phase transition, mass would suddenly appear at the onset of the phase transition. Therefore, it was claimed that some new physics is called for, such as spontaneous symmetry-breaking, where the Higgs field allows quarks and electrons to acquire mass. The postulated, but unproven, Higgs field is analogous to Cooper pairs in superconductivity, and like Cooper pairs, is massive.$^{e}$ This is analogous to the Dirac equation where mass is introduced ‘by hand,’ in order to get it to satisfy the relativistic conservation of energy. The mysterious Higgs field has been likened to an aether [Moriyasu 83]. Have we come more than a century after its demise just to return to the aether that was found so useful in electromagnetic theory? The Higgs field of a superconductor was the ensemble of electrons bound into Cooper pairs. Does the Higgs field represent a new binding force that has a range much smaller than the weak interaction, or is it just a figment of the imagination? It is argued that purely transverse waves cannot describe mass because Maxwell’s equations are both transverse and massless.
Any and all attempts to destroy the transverse property of the electric and magnetic fields have met with disastrous consequences [Heaviside 99]. We will analyze those consequences of introducing mass into Maxwell’s equations in Sec. 11.5.

$^{e}$ The question whether or not the spontaneous break-down of the SU(2) × U(1) to the U(1) of electromagnetism occurs depends on the open question whether the Higgs field actually exists. It is claimed that if the Higgs field is mathematical, rather than physical, then there must be some new physics lying around that makes the spontaneous symmetry-breaking such a good description of elementary particles down to distances of the order of $10^{-16}$ cm [Georgi 09]. Though symmetry describes the mechanism, it cannot supply the physics.
11.4 Polarization of Mass
11.4.1 Mass and momentum
The Stokes characterization of the two independent states of light polarization is mathematically identical to the orientation of a spin-$\tfrac12$ particle [Fano 54]. We have seen that a completely polarized beam of light has an electrical vibration which may be represented by its components along two rectangular axes, (11.1.19a) and (11.1.19b). Electromagnetic vibrations change irregularly and erratically. Yet, for elliptically polarized light the irregular vibrations must be such that the ratio of the amplitudes, together with the phase difference, must be absolute constants. Hence, no averaging is required. The average energy and momentum of vibrations are
\[ W = a^2 + b^2, \qquad p = a^2 - b^2. \tag{A} \]
In spherical coordinates of a vector of length $W$, longitude $2\phi$, and colatitude $2\vartheta$, the Stokes parameters are given by scheme (II). These three quantities determine elliptic vibrations, apart from their phase. Dirac [47] made the distinction between the way $W$ and $p$ transform by rotation through a hyperbolic angle, and the rotation of $m_l$ and $m_t$ through a real angle, but thought that the former applies to the space and time coordinates, $x$ and $t$, while the latter to the space coordinates $y$ and $z$. This is unfortunate since it has led to the introduction of space-time invariance which has nothing to do with the theory. From (II) it is at once apparent that $W$ and $p$ are invariant under a rotation of axes, while the mass components change with the axes. If $m'_l$ and $m'_t$ are the values of $m_l$ and $m_t$ after a rotation of axes through an angle $2\phi$ in the clockwise direction,
\[ m'_l = m_l\cos 2\phi + m_t\sin 2\phi, \qquad m'_t = -m_l\sin 2\phi + m_t\cos 2\phi, \tag{a} \]
while $W' = W$, $p' = p$. The rotation (a) can be thought of as a rotation of two nucleons, or any mixture of the two, in isospin space. From these equations it follows that
\[ W'^2 - p'^2 - m_l'^2 - m_t'^2 = W^2 - p^2 - m_l^2 - m_t^2 \tag{11.4.1} \]
is an invariant under rotations. In other words, (11.4.1) is invariant under a rotation of the axes. This has nothing to do with its invariancy under a Lorentz transform! As Dirac pointed out, we can satisfy (11.4.1) when $W \neq W'$ and $p \neq p'$, but with invariant mass, by rotating the axes through a hyperbolic angle. The hyperbolic measure of the relative velocity, $\bar u$, is related to the Euclidean measure $u$, by the usual form of the rapidity,
\[ u = \tanh\bar u, \tag{11.4.2} \]
or equivalently,
\[ \bar u = \tanh^{-1} u = \frac12 \ln\frac{1+u}{1-u}. \tag{11.4.3} \]
Since $p = Wu$, it follows that
\[ W = a^2 + b^2 = m\cosh\bar u, \qquad p = a^2 - b^2 = m\sinh\bar u. \tag{A'} \]
Rotating (A’) through the hyperbolic angle $\bar v$ results in
\[ W' = W\cosh\bar v + p\sinh\bar v, \qquad p' = W\sinh\bar v + p\cosh\bar v, \tag{b} \]
while the total mass,
\[ m = \sqrt{m_l^2 + m_t^2}, \tag{11.4.4} \]
is invariant because each of its components remains invariant, $m'_l = m_l$ and $m'_t = m_t$. The pair of equations (b) is the Lorentz transform in the plane $p$, $W$, and not in the $xt$-plane as Dirac [47] would have us believe.
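The two kinds of rotation can be compared side by side. This is a small numerical sketch, assuming sample values for the mass and velocities; the function names are illustrative only. The hyperbolic rotation (b) changes $W$ and $p$ but keeps $W^2 - p^2$, while the circular rotation (a) reshuffles $m_l$ and $m_t$ but keeps $m_l^2 + m_t^2$.

```python
import math

def rapidity(u):
    """Hyperbolic measure of the velocity, (11.4.3)."""
    return 0.5 * math.log((1.0 + u) / (1.0 - u))

def hyperbolic_rotation(W, p, v_bar):
    """Transformation (b) in the (p, W) plane."""
    return (W * math.cosh(v_bar) + p * math.sinh(v_bar),
            W * math.sinh(v_bar) + p * math.cosh(v_bar))

def circular_rotation(m_l, m_t, angle):
    """Transformation (a): rotation of the mass components through a real angle."""
    return (m_l * math.cos(angle) + m_t * math.sin(angle),
            -m_l * math.sin(angle) + m_t * math.cos(angle))

m, u = 1.0, 0.6                                     # sample mass and velocity (assumptions)
u_bar = rapidity(u)
W, p = m * math.cosh(u_bar), m * math.sinh(u_bar)   # scheme (A')

W2, p2 = hyperbolic_rotation(W, p, rapidity(0.5))   # rotate through v_bar = atanh(0.5)
print(W * W - p * p, W2 * W2 - p2 * p2)             # both equal m^2 up to rounding
print(p2 / W2)                                      # new velocity, tanh(u_bar + v_bar)

ml2, mt2 = circular_rotation(0.3, 0.4, 1.1)
print(math.hypot(0.3, 0.4), math.hypot(ml2, mt2))   # total mass (11.4.4) unchanged
```

The ratio `p2 / W2` reproduces the usual velocity composition, since hyperbolic angles simply add.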
If we insist on the invariance of a four-vector, we can always choose our axes so that one points in the direction of the momentum, thus leaving two slots vacant in the four-vector that need to be filled. On the strength of energy conservation, they can be filled only by the components of the mass such that (11.4.4) holds. And once it is recognized that $p$ is the momentum in the direction of the motion, so that the momentum is not given by its three Cartesian components, Dirac’s theory becomes equivalent to Stokes’s formulation with the transformation (b). The ratio of the semiminor to the semimajor axis of the electric ellipse is [cf. (11.4.13) below]
\[ \frac{b}{a} = \left(\frac{1-\cos 2\vartheta}{1+\cos 2\vartheta}\right)^{1/2} = \tan\vartheta. \tag{11.4.5} \]
The numerical value of $\tan\vartheta$ represents the ratio of the sides of the rectangle, of area $ab$, which encloses the ellipse that the point of the vibrating electric vector traces out in Fig. 11.8. Now, from the relation $p = Wu$ and the relation between Euclidean and hyperbolic measures of the relative velocities, (11.4.3), we find the same ratio of the axes of the ellipse to be [cf. (11.1.30)],
\[ \frac{b}{a} = e^{-\bar u} = \left(\frac{1-u}{1+u}\right)^{1/2}. \tag{11.4.6} \]
Finally, comparing (11.4.5) with (11.4.6), and noting (11.4.2), we find
\[ u = \cos 2\vartheta = \tanh\bar u, \tag{11.4.7} \]
which again identifies $2\vartheta$ with the Bolyai–Lobachevsky angle of parallelism. This angle provides the link between hyperbolic and circular functions as (11.4.7) clearly shows. The ratio of the momentum to the energy,
\[ \frac{p}{W} = \frac{a^2 - b^2}{a^2 + b^2} = \cos^2\vartheta - \sin^2\vartheta = \cos 2\vartheta, \tag{11.4.8} \]
is precisely (11.4.7). Fermi’s original formulation of β-decay took into account five different interactions, called scalar (S), vector (V), tensor (T), axial vector (A), and pseudo-scalar (P). These interactions are distinguished by the way
they transform under Lorentz transformations [Lipkin 62]. Consider Fermi transitions where S and V interactions contribute, as in the case where a left-handed neutrino is ejected. The V interaction will give right-handed electrons, while the S interaction gives left-handed electrons in the extreme relativistic limit. In the helicity plane, the V axis will align itself with the vertical, while the S axis will align itself with the horizontal axis. These two axes are symmetric about the axis that makes a 45° angle which occurs when $u = 0$, and is an even parity s-state. As the electron slows down, these vectors rotate toward one another until they coincide in the zero velocity state at 45° in Fig. 11.12. At some velocity $u$, the V interaction will have an average helicity $+u$, while the S interaction will have a mean helicity $-u$. The mean helicity is given by (11.4.8), where the angle $\vartheta$ represents the angle between the vertical axis and the vector V. The electron state corresponding to the S interaction makes the same angle between the horizontal and the S vector. These vectors play a role analogous to the vibrating electric vector in optics, whose components must always remain orthogonal to one another because photons can only travel at the speed of light.
Fig. 11.12. V and S interactions rotate toward one another as the electron velocity decreases.
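The chain of identities (11.4.5) to (11.4.8) can be verified for any velocity. The sketch below is illustrative only: the function name and the normalization $W = 1$ are assumptions, and $a^2$, $b^2$ are chosen as $W\cos^2\vartheta$ and $W\sin^2\vartheta$ so that $W = a^2 + b^2$.

```python
import math

def ellipse_from_velocity(u, W=1.0):
    """Relate the Euclidean velocity u to the angle theta and the axis ratio b/a."""
    theta = 0.5 * math.acos(u)          # (11.4.7): u = cos 2*theta
    u_bar = math.atanh(u)               # (11.4.3): hyperbolic velocity
    a2 = W * math.cos(theta) ** 2       # a^2 and b^2 with a^2 + b^2 = W
    b2 = W * math.sin(theta) ** 2
    return {
        "theta": theta,
        "tan_theta": math.tan(theta),           # (11.4.5)
        "exp_minus_u_bar": math.exp(-u_bar),    # (11.4.6)
        "b_over_a": math.sqrt(b2 / a2),
        "p_over_W": (a2 - b2) / (a2 + b2),      # (11.4.8)
    }

for u in (0.0, 0.3, 0.9):
    r = ellipse_from_velocity(u)
    # tan(theta), exp(-u_bar) and b/a coincide; p/W returns u.
    print(u, r["tan_theta"], r["exp_minus_u_bar"], r["b_over_a"], r["p_over_W"])
```

At $u = 0$ the angle is $\vartheta = \pi/4$ and the ratio of the axes is unity, which is the 45° configuration of Fig. 11.12.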
The helicity states become orthogonal only in the extreme relativistic limit. In general, the decay probability will not reduce to the sum of the squares of the different helicity states, except in the extreme relativistic limit. If we characterize the decay according to S and V interactions, the decay probability will not consist of independent contributions, except in the ultrarelativistic limit, and the contributions become identical in the nonrelativistic limit. In general, therefore, the S and V interactions will contain energy-dependent cross terms in the decay probability spectrum, which would vanish in the extreme relativistic limit where the S and V states become orthogonal. These energy-dependent cross terms in the decay probability are known as Fierz interference terms. The simplest type of Fierz interference occurs between two channels with opposite electron helicity, and the same values for all other quantum numbers. The wave functions corresponding to these two orthogonal states are entirely analogous to the orthogonal states of longitudinal spin whose spin direction is given by stereographic projection. In the plane represented by the orthogonal axes of positive and negative electron helicity, a vector at an angle $\vartheta$ with respect to the vertical ($h_e = +1$) represents a mixture of states having both positive and negative helicities that have amplitudes proportional to $\cos\vartheta$ and $\sin\vartheta$, respectively. The mean helicity of such a state, $\langle h_e \rangle$, is (11.4.8) [Lipkin 62]. So parity non-conservation is written into the Stokes parameters when we identify the rotated Poincaré representation (II) with the Minkowski representation (A’). In fact, (11.4.8) is the parity violation law of weak interactions [Omnès 70], as we have seen in Sec. 11.1.7. Because particles and antiparticles are oppositely polarized, charge conjugation symmetry has also to be abandoned. In all cases, the degree of polarization is found to be equal to the relative Euclidean velocity, $u$. But, since the angle $\vartheta$ is the angle of parallelism, it will be a function of the hyperbolic velocity, $\bar u$. It is entirely reasonable that longitudinal polarization should tend to zero with the velocity since, in the limiting case of zero momentum there can be no longitudinal polarization. A limiting case occurs in the polarization of muons, where the muon and anti-muon are 100% polarized because they travel at the speed of light. For an electron, the state of helicity $-\tfrac12$ is more heavily populated than the state of helicity $+\tfrac12$. Calling $a^2$ the number of electrons found with
helicity $-\tfrac12$ and $b^2$ the number of electrons with helicity $+\tfrac12$, we immediately find (11.4.8). In other words, if the mean spin of a particle is $\pm\tfrac12$ in the direction of polarization, its mean value in any other direction will be its projection,
\[ \tfrac12\cos 2\vartheta = \tfrac12\,(P_+ - P_-), \]
where the probabilities for finding the particle with spins $\pm\tfrac12$ are $P_+ = \cos^2\vartheta$ and $P_- = \sin^2\vartheta$, which conserve probability, $P_+ + P_- = 1$. In β-decay there are two types of measurements made on leptons. The first consists of polarization measurements that determine the mean helicities of the particles with respect to a specified axis, and the second determines the angular distribution of the emission of a particle with respect to a specified axis. Whereas angular momenta are restricted to discrete values, linear momenta are not, and are known to have a $1 \pm \cos 2\vartheta$ distribution, or more generally $1 + A\cos 2\vartheta$, where $A$, the asymmetry parameter, is a mean value that is determined by the projection of angular momenta on a preferential axis, or the two possible states of helicity of the electron and neutrino [Lipkin 62]. Since $A$ is proportional to $\pm 1$, the probability for any interaction will be $\tfrac12(1 \pm u)$, depending on whether the helicities of the electron and neutrino are equal or opposite. When there is a difference in phase $2\phi'$ between components of vibration along orthogonal axes, we have a counter-clockwise rotation
\[ m'_l = W\sin 2\vartheta\cos 2(\phi + \phi') = m_l\cos 2\phi' - m_t\sin 2\phi', \tag{11.4.9a} \]
\[ m'_t = W\sin 2\vartheta\sin 2(\phi + \phi') = m_l\sin 2\phi' + m_t\cos 2\phi', \tag{11.4.9b} \]
together with the invariancy of $p'_z = p_z$, and $W' = W$. This says that there are two polarization states of mass, both normal to the direction of momentum. The phase $2\phi'$ ‘mixes’ these components according to (11.4.9a) and (11.4.9b). This is somewhat analogous to the early days of relativity where distinction was made between the ‘transverse’ mass, for which the force is normal to the velocity, and the ‘longitudinal’ mass, where the force is parallel to the velocity [cf. Sec. 11.1.1].
Just as in the Trouton–Noble experiment, described in Sec. 3.7.2, the charge on the electron would feel a couple whose axis is perpendicular to the plane formed from the velocity and direction of its motion. The only difference is that the velocity is not due to the Earth’s motion through the aether, but to the mass in motion. This could be the origin of mass polarization at the elementary particle level. Whereas Lorentz provided the bridge from the bulk to the atomic level, this would provide a bridge from the atomic level to that of its elementary particle constituents. We will now show that $\pm m_t$ describe right- (left-) circular polarization, and $m_l$ the polarization at 45° from the two orthogonal components of the electric vector. Squaring (11.4.9a) and (11.4.9b), and adding give
\[ m'^2 = m^2 = W^2 - p^2 = W^2\sin^2 2\vartheta, \tag{11.4.10} \]
on account of (11.4.4) and (11.4.8). In view of (11.4.7), (11.4.10) asserts that
\[ W\sqrt{1-\beta^2} = \text{const.}, \tag{11.4.11} \]
under rotations. Equation (11.4.11) is none other than invariancy of the total mass $m$, and explains the increase of energy with speed. From (11.4.10) we may say that mass is identified with transverse momenta, which in jets provides an estimate of the masses of resonance states. They are invariant under rotation. Resonance states live longer than the time of their creation, and have masses comparable to twice their transverse momenta [Heisenberg 66]. A plane wave will be polarized along the $z$-axis, either in the positive or negative direction. A more general treatment of polarization in any direction is to consider (II) as direction cosines so that polarization in any direction will be given by
\[ \sigma\cdot p = \sigma_x p_x + \sigma_y p_y + \sigma_z p_z = \begin{pmatrix} p_z & p_x - ip_y \\ p_x + ip_y & -p_z \end{pmatrix}, \]
where
\[ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \]
are the Pauli spin matrices, all of which have eigenvalues $\pm 1$. Thus, if the wave function,
\[ \psi = a\psi_+ + b\psi_-, \tag{11.4.12} \]
is a linear combination of $\psi_+$ and $\psi_-$, which represent states of spin $\pm\tfrac12$ in the positive and negative $z$-directions, the relative weights are given by
\[ \sigma\cdot p \begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} a \\ b \end{pmatrix}, \]
or
\[ \frac{a}{b} = \left(\frac{1+\cos 2\vartheta}{1-\cos 2\vartheta}\right)^{1/2} e^{i\phi} = \cot\vartheta\, e^{i\phi}. \tag{11.4.13} \]
Again we find a stereographic projection of an arbitrary spin direction onto the $z$-axis. The equation for the Pauli spinor $\Psi$, analogous to the Dirac equation, is
\[ (\sigma\cdot p)\,\Psi(p) = W\,\Psi(p). \tag{11.4.14} \]
Spin in Stokes’s momentum space (II) requires a Pauli spinor, and its negative energy solutions double that number. There will be non-zero solutions to (11.4.14) if and only if the determinant,
\[ W^2 \begin{vmatrix} 1 + p_z/W & (p_x - ip_y)/W \\ (p_x + ip_y)/W & 1 - p_z/W \end{vmatrix} = W^2 - p_z^2 - p_x^2 - p_y^2, \tag{11.4.15} \]
of the pair of linear homogeneous equations (11.4.14) vanishes. The vanishing of (11.4.15) is precisely the condition for complete polarization, $\varepsilon = 1$, and if the four Stokes parameters satisfy this condition they may be considered the polarization parameters of the light beam.
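A short numerical sketch of the statements above, assuming a unit direction vector with colatitude $2\vartheta$ and azimuth $\phi$ (sample angles and helper names are illustrative). It checks that the unnormalized weights $(a, b)$ are left unchanged by $\sigma\cdot p$, that $|a/b| = \cot\vartheta$ as in (11.4.13), and that the determinant (11.4.15) equals $W^2 - p^2$. Only the modulus of $a/b$ is compared, since the sign of the phase depends on the azimuth convention.

```python
import math

def sigma_dot_p(px, py, pz):
    """The 2x2 matrix sigma . p built from the Pauli matrices."""
    return [[complex(pz, 0.0), complex(px, -py)],
            [complex(px, py), complex(-pz, 0.0)]]

def apply(M, v):
    """Multiply a 2x2 matrix by a two-component column."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def scaled_determinant(W, px, py, pz):
    """W^2 times the determinant written in (11.4.15)."""
    diagonal_product = (1 + pz / W) * (1 - pz / W)
    off_diagonal_product = (px * px + py * py) / (W * W)
    return W * W * (diagonal_product - off_diagonal_product)

theta, phi = 0.4, 0.9                       # sample angles (assumptions)
n = (math.sin(2 * theta) * math.cos(phi),   # direction cosines
     math.sin(2 * theta) * math.sin(phi),
     math.cos(2 * theta))

weights = [complex(n[0], -n[1]), complex(1 - n[2], 0.0)]   # unnormalized (a, b)
image = apply(sigma_dot_p(*n), weights)                    # equals weights again
ratio = abs(weights[0] / weights[1])                       # modulus of a/b

print(ratio, 1 / math.tan(theta))
print(scaled_determinant(1.0, *n))          # vanishes: complete polarization
```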
The terms in the matrix corresponding to (11.4.15) are related to those of the density matrix [McMaster 54], where $p_z/W = \pm 1$ is plane polarization along $h_e = \pm 1$,$^{f}$ right or left-handed helicity ($u = 1$), $p_x/W = \pm 1$ is plane polarization at $\pi/4$, or equal left- and right-handed helicity $u = 0$, and $p_y/W = \pm 1$ is right- or left-handed polarization, respectively. The former pair gives the probabilities for propagating in the direction of the $z$-momentum or opposite to it, $\tfrac12(1 \pm u)$, or $\tfrac12(1 \pm \cos 2\vartheta)$, while the latter pair is proportional to the mass times the phase, and gives the probability for a turn in a ‘space-time path.’
11.4.2 Relativistic space-time paths: An example of mass polarization
Feynman [65], in his visualization of the “space-time paths for the one-dimensional Dirac particle,” wrote the propagator as
\[ K_{+-} = \sum_{\text{zig-zag paths}}^{N} (i\varepsilon m)^R, \tag{11.4.16} \]
for an $N$-segment trajectory in time $t$ with $R$ reversals. In his notes [Schweber 86], Feynman writes that “each turn to + gives $+i\varepsilon$, each turn to − gives $-i\varepsilon$,” where $\varepsilon := t/N$ is the infinitesimally small time interval. Gersh [81] claims that the minus sign should be present in (11.4.16) in order “to get the correct nonrelativistic limit.” In fact, both signs should be present in (11.4.16), which still is only part of the propagator. Feynman associates the probability amplitude for a reversal with mass. But mass does not enter in the way it enters the Dirac equation in configuration space because there it enters in the diagonal terms, and not in the off-diagonal ones. In a one-dimensional stochastic model [Gaveau et al. 84] of an electron shuttling back-and-forth at the speed of light, the energy conservation equation (11.1.8), implying a wave equation, is waived in favor of the telegraph equation, where mass enters through the dissipative term. This is also inaccurate, and is only salvaged formally by an analytic continuation of time. But, this does not explain why an electron should shuttle back and forth at the speed of light. So where does mass enter in the expression for the probability amplitude, and at what speed will the electron travel?

$^{f}$ The helical states stand in for orthogonal components, $E_x$ and $E_y$, of the electric vector, $\mathbf{E}$.

Feynman was correct to associate the probability amplitude for a path reversal with the mass, but this is only part of the story. The total propagator,$^{g}$
\[ K(p) := \sigma\cdot p = \begin{pmatrix} p_z & p_x - ip_y \\ p_x + ip_y & -p_z \end{pmatrix}, \tag{11.4.17} \]
is the same as the matrix of the local weak SU(2) gauge transformation. If (11.4.17) is to reflect Feynman’s rule for a path reversal, it must be given by
\[ \sigma\cdot p = \begin{pmatrix} p & me^{-i\phi} \\ me^{i\phi} & -p \end{pmatrix}, \tag{11.4.18} \]
which is (slightly!) more general than Feynman’s prescription, (11.4.16), since it allows for phases other than $\phi = \pi/2$. Then, the propagator for a path of length $N$ and energy $W$ that will be traversed in time $t$ such that $W = N/t =: 1/\varepsilon$, in natural units, is $e^{i\sigma\cdot p\,\varepsilon}$ for each segment. This propagator propagates the Pauli spinor $\Psi(p, t)$ to
\[ \Psi(p, t + \varepsilon) = e^{i\sigma\cdot p\,\varepsilon}\,\Psi(p, t), \tag{11.4.19} \]
(11.4.19)
in time ε. The matrix exponential function is defined by the infinite series,
n eiσ ·pε = I + iσ · pε + · · · + in σ · p εn /n!, where I is the unit matrix. For small time intervals, this permits us to write (11.4.19) as
(p, t + ε) = I + iσ · pε (p, t). Then, proceeding to the limit as ε → 0 gives ˙− p m e−iϕ − −i = , ˙+ m eiϕ −p +
(11.4.20)
(11.4.21)
g This identifies the Stokes parameters (II ) with the three weak isotopic spin com-
ponents of the local gauge.
Aug. 26, 2011
11:17
SPI-B1197
A New Perspective on Relativity
b1197-ch11
The Inertia of Polarization
587
where the dot denotes differentiation with respect to time. For ϕ = π/2, (11.4.21) becomes the Dirac equation in momentum space,
W = σz p + βm , where p = pz , and
0 β= −1
−1 . 0
In (11.4.21) we have introduced the mass according to the first two equations in scheme (II ), i.e. px ± ipy = me±iϕ . For ϕ = 0, π mass is longitudinal corresponding to linear polarization, while for ϕ = ±π/2, the mass is transverse corresponding to right-(left-) circular polarization. The trace of (11.4.21) vanishes as every SU(2) representation must be symmetrical about 0; the spin varies from −j to +j. The elimination of either component in (11.4.21) by increasing its order gives back the Klein–Gordon equation, ¨ = − p2 + m2 , where represents either component of the spinor. The propagator (11.4.17) is related to the density matrix, 1 W +p ρ= 2 ml + imt
ml − imt . W −p
(11.4.22)
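Because the matrix (11.4.18) squares to $(p^2 + m^2)I = W^2 I$, the exponential series collapses to $\cos(W\varepsilon)\,I + i\sin(W\varepsilon)\,\sigma\cdot p/W$, which is also why second-order elimination returns the Klein–Gordon equation. The sketch below compares a partial sum of the series with this closed form; the sample values of $p$, $m$, $\phi$ and $\varepsilon$, and the helper names, are assumptions for illustration.

```python
import cmath
import math

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def path_matrix(p, m, phi):
    """The matrix (11.4.18)."""
    return [[complex(p), m * cmath.exp(-1j * phi)],
            [m * cmath.exp(1j * phi), complex(-p)]]

def exp_series(M, eps, terms=30):
    """Partial sum of the series for exp(i * M * eps)."""
    identity = [[1.0 + 0j, 0j], [0j, 1.0 + 0j]]
    total, term = identity, identity
    for n in range(1, terms):
        step = [[1j * eps * M[i][j] / n for j in range(2)] for i in range(2)]
        term = matmul(term, step)           # now (i*eps*M)^n / n!
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

def exp_closed(M, W, eps):
    """Closed form cos(W eps) I + i sin(W eps) M / W, valid when M^2 = W^2 I."""
    c, s = math.cos(W * eps), math.sin(W * eps)
    return [[c * (i == j) + 1j * s * M[i][j] / W for j in range(2)]
            for i in range(2)]

p, m, phi, eps = 0.6, 0.8, math.pi / 2, 0.3   # sample values (assumptions)
W = math.hypot(p, m)                          # W^2 = p^2 + m^2
M = path_matrix(p, m, phi)
square = matmul(M, M)                         # equals W^2 times the unit matrix
series, closed = exp_series(M, eps), exp_closed(M, W, eps)
max_diff = max(abs(series[i][j] - closed[i][j]) for i in range(2) for j in range(2))
print(square, max_diff)
```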
Photons show only longitudinal polarization: spins parallel or anti-parallel to the direction of propagation. For photons $u = \pm 1$, and they have two helicities $h = \pm 1$, corresponding to the $J_z$ component of angular momentum, (11.1.20). Since both $m_t$ and $m_l$ vanish, there is no state of helicity $h = 0$. In other words, there are no ladder operators, $J_x \pm iJ_y$, with a multiplet of $2J + 1$ degenerate states, $-J, -J + 1, \ldots, 0, \ldots, J - 1, J$. For complete polarization the determinant of the density matrix, (11.4.22), vanishes. The diagonal terms are the probability for an electron to propagate with its helicity in the direction of the momentum,
\[ P_+ = \frac12\left(\frac{W + p}{W}\right) = \frac12(1 + u) = \cos^2\vartheta, \]
and the probability that it will propagate with its helicity in the anti-parallel direction with respect to the direction of its momentum,
\[ P_- = \frac12\left(\frac{W - p}{W}\right) = \frac12(1 - u) = \sin^2\vartheta. \]
The difference between the two probabilities is the relative velocity, $u = p/W$. This is none other than the parity violation in weak decay, (11.4.8). The helicity goes to zero as its momentum goes to zero, a fact that is well known. In optics, $P_\pm = 1$ would correspond to plane polarization along $h_e = \pm 1$. The mass in Feynman’s formula, (11.4.16), corresponds to the off-diagonal terms, $me^{\pm i\phi}$ in (11.4.17) for a phase, $\phi = \pi/2$. Feynman, with his chess-board approach to the zig-zag motion of the electron, was considering the transverse mass, $m_t = m\sin\phi$ for a phase $\phi = \pi/2$. Consequently, Feynman’s formula (11.4.16) is only part of the propagator (11.4.17), consisting of the off-diagonal terms for a phase, $\phi = \pi/2$. Two dimensions are vital in order to account for the electron’s spin. Comparing (11.4.22) with (11.2.2), the transverse mass corresponds to the Stokes parameter $U$, the difference in intensities of the linearly polarized components oriented at $\pm\pi/4$. Analogously, the longitudinal mass, $m_l = m\cos\phi$, corresponds to $Q$, the difference in horizontal and vertical polarized light intensities. The wave field is transverse and this explains why the density matrix (11.4.22) is two-dimensional in any frame of reference. Mass is normally taken into account in a field which is longitudinal, like sound waves, where the velocity of propagation is inversely proportional to the square root of the density. However, by the fact that helicity can either be in the direction of momentum or in the opposite direction, the difference in the number of particles of opposite helicities, which is proportional to the electron’s relative velocity, possesses inertia. Feynman’s image of an electron shuttling back-and-forth at the speed of light is to be replaced by the electron’s helicity, or its spin axis, that is doing the shuttling in the direction of the electron’s momentum, or in the opposite direction, thereby oscillating between a left and a right-handed screw.
And because helicity is proportional to the velocity of an electron,
the shuttling is done at that velocity, and not the velocity of light. Helicity must therefore have inertial properties. The conservation equation, (11.1.8), leads to two values of the energy, $W = \pm p$. To interpret negative values of $W$, the analogy with the generators of angular momentum, rather than the Stokes parameters where $W > 0$, is the more pertinent one. To the total energy $W$, there corresponds a multiplet of $2W + 1$ states of the same eigenvalue $W$ but with the projection onto the direction of angular momentum, $p_z$, taking on values between $+W$ and $-W$. These states are identified as states of helicities $h = 1$ and $h = -1$, respectively. The extreme states apply to massless particles which are either right or left-handed. This would apply to an ultrarelativistic electron, where the rest mass of the electron can be neglected since it behaves essentially as a zero mass particle. As the velocity of an electron decreases from its ultrarelativistic value to smaller values, the average helicity, $|\langle h \rangle| = u$, would also decrease. The helicity would, therefore, vary in a continuous manner, and not as changes in discrete values, $-s, -s + 1, \ldots, +s$, where $s$ is the spin of the particle [Schweber 61, p. 113]. This would imply that the rest mass varies as
\[ m = W\sin 2\vartheta = W\sqrt{1 - u^2}, \tag{11.4.23} \]
and, since $W = \text{const.}$, the mass becomes increasingly smaller as the relative velocity $u \to 1$. The usual relativistic result, where $m = \text{const.}$, and $W$ becomes infinite in the same limit, does not apply. It would also apply to a left-handed neutrino whose spin is anti-parallel to its momentum, and right-handed antineutrino whose spin is parallel to its momentum. In the hole picture, the antineutrino would have a momentum anti-parallel to the momentum of the negative energy state which has been vacated [Schweber 61]. But, there is no need to consider a Dirac ‘sea’ filled with negative energy states, which can never be neutralized [Oppenheimer 30]. Rather, the anti-particle of the electron is right-handed in the ultrarelativistic limit, corresponding to the eigenvalue $-W$ of the total energy, while the left-handed neutrino would have an energy eigenvalue $+W$. Mass, therefore, is a measure of the correlation between states of positive and negative helicity. The energy eigenvalues, $W_\pm = \pm p$, are analogous for a spin system where $W_+$ and $W_-$ are the energies for the spin to align in
the direction of momentum and in the direction anti-parallel to momentum, respectively. The amplitude of the correlation,
\[ \mu = |\mu|e^{i2\phi} = \frac{me^{i2\phi}}{\sqrt{W - p_z}\cdot\sqrt{W + p_z}}, \tag{11.4.24} \]
is always less than, or equal to, unity by Schwarz’s inequality. It is the square-root of the ratio of the product of off-diagonal to the product of diagonal terms in the density matrix. The inequality $|\mu| < 1$ accounts for ‘off mass-shell,’ or virtual processes, which do not conserve energy, (11.1.8). Expression (11.4.24) is a measure of correlation between states of helicity $+1$ and $-1$. The amplitude $|\mu|$ is a measure of their ‘degree of coherence,’ while $2\phi$ is a measure of their ‘effective phase difference.’
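The density matrix (11.4.22), the helicity probabilities, and the correlation amplitude (11.4.24) can be tabulated together. This is a minimal sketch under assumed sample values, with illustrative function names; it shows $|\mu| = 1$ when $W^2 = p^2 + m^2$ (vanishing determinant, complete polarization) and $|\mu| < 1$ otherwise.

```python
import cmath
import math

def density_matrix(W, p, m_l, m_t):
    """The matrix (11.4.22)."""
    return [[0.5 * (W + p), 0.5 * complex(m_l, -m_t)],
            [0.5 * complex(m_l, m_t), 0.5 * (W - p)]]

def determinant(rho):
    """Determinant of a 2x2 Hermitian matrix (real by construction)."""
    return (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real

def helicity_probabilities(W, p):
    """P+ and P-: the diagonal terms of (11.4.22) divided by W."""
    return 0.5 * (W + p) / W, 0.5 * (W - p) / W

def correlation(W, p, m, phi):
    """Amplitude (11.4.24) of the correlation between opposite helicities."""
    return m * cmath.exp(2j * phi) / (math.sqrt(W - p) * math.sqrt(W + p))

W, p, m, phi = 1.0, 0.6, 0.8, 0.7            # sample values with W^2 = p^2 + m^2
rho = density_matrix(W, p, m * math.cos(phi), m * math.sin(phi))
P_plus, P_minus = helicity_probabilities(W, p)

print(determinant(rho))                      # ~0: complete polarization
print(P_plus - P_minus)                      # equals u = p / W
print(abs(correlation(W, p, m, phi)))        # 1 on the mass shell
print(abs(correlation(W, p, 0.4, phi)))      # 0.5: smaller mass, |mu| < 1
```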
11.5 Mass in Maxwell’s Theory and Beyond
In this section we seek to generalize Maxwell’s electromagnetic theory in three directions: (i) the introduction of the state of helicity $h = 0$, to see how Maxwell’s equations exclude it, (ii) the introduction of mass into these equations, and (iii) the question of whether a generalization of these equations can support compressional waves. In this way we carve out a precise domain of validity of Maxwell’s relations. We begin with a simple radiation mechanism and show that even though there is an $h = 0$ helicity state it cannot propagate.
11.5.1 A model of radiation
As a simple model of radiation [Skilling 42] we consider a short wire of length $\ell$ carrying a current $I\sin\omega t$. The wire is placed at the origin such that $\ell/2$ points up from the equator in the $z$-direction and $-\ell/2$ points down in the opposite direction, as shown in Fig. 11.13. The vector potential is given as the retarded potential integrated along the wire, viz.
\[ A_z = \int_{-\ell/2}^{\ell/2} \frac{I\sin\omega(t - r)}{r}\,dz. \]
Fig. 11.13. A short vertical antenna.
Since current is flowing in the $z$-direction, there will be only one component of the vector potential. If the distance at which the vector potential is measured is much greater than the length of the wire, the denominator in the integrand will remain sensibly constant during the integration. Moreover, if the length of the wire is small compared to the wavelength of the radiation, then so too will be the numerator. Consequently, the integral can easily be performed with the result
\[ A_z = \frac{a}{r}\sin\omega(t - r), \]
where $a = I\ell$. Transforming to spherical coordinates this component of the vector potential will have two components: one radial,
\[ A_r = \frac{a}{r}\sin\omega(t - r)\cos\vartheta, \tag{11.5.1a} \]
and one tangential,
\[ A_\vartheta = -\frac{a}{r}\sin\omega(t - r)\sin\vartheta, \tag{11.5.1b} \]
as shown in Fig. 11.13, with a vanishing third component $A_\phi = 0$.
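The curl of $\mathbf{A}$ gives the circular magnetic field, whose lines are circles of constant latitude,
$$H_\varphi = \mathrm{curl}_\varphi\,\mathbf{A} = -\frac{a\kappa}{r}\sin\vartheta\left[\frac{\sin\kappa(r-t)}{\kappa r} - \cos\kappa(r-t)\right], \tag{11.5.2}$$
where $\kappa$ is the wave number. We can appreciate that the circular magnetic field, (11.5.2), is a spherical Bessel function of order 1. The radial and tangential components of the electric field can be determined from
$$\mathbf{E} = -\dot{\mathbf{A}} + \nabla\int(\nabla\cdot\mathbf{A})\,dt. \tag{11.5.3}$$
These components are given explicitly by
$$E_r = -\frac{2\kappa a\cos\vartheta}{r}\left[\frac{1}{(\kappa r)^2}\cos\kappa(r-t) + \frac{1}{\kappa r}\sin\kappa(r-t)\right], \tag{11.5.4a}$$
$$E_\vartheta = \frac{a\kappa\sin\vartheta}{r}\left[\left(1-\frac{1}{(\kappa r)^2}\right)\cos\kappa(r-t) - \frac{1}{\kappa r}\sin\kappa(r-t)\right], \tag{11.5.4b}$$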
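and $E_\varphi = 0$. The relative magnitudes of the terms in (11.5.4b) can be gleaned from their dependence on the inverse powers of $r$. For instance, if the second term in the first expression in (11.5.4b) is small compared to unity, the tangential component of the electric field will have the same form as that of the magnetic field, i.e. considered as a function of $r$, it will be a spherical Bessel function of order 1. This is precisely what Maxwell's equations predict. Maxwell's equations, in spherical coordinates, are
$$\begin{aligned}
\dot{E}_r &= \mathrm{curl}_r\,\mathbf{H} = \frac{1}{r\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,H_\varphi\right),\\
\dot{E}_\vartheta &= \mathrm{curl}_\vartheta\,\mathbf{H} = -\frac{1}{r}\frac{\partial}{\partial r}\left(rH_\varphi\right),\\
\dot{H}_\varphi &= -\mathrm{curl}_\varphi\,\mathbf{E} = -\frac{1}{r}\left[\frac{\partial}{\partial r}\left(rE_\vartheta\right) - \frac{\partial E_r}{\partial\vartheta}\right].
\end{aligned} \tag{SM}$$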
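The statement that the fields (11.5.2), (11.5.4a) and (11.5.4b) obey the spherical Maxwell equations (SM) can be tested numerically. The following is a minimal sketch, not part of the original derivation: the amplitude `A_AMP`, the wave number `KAPPA`, the helper `diff` and the sample point are illustrative assumptions, and units with $c = 1$ (so $\omega = \kappa$) are taken for granted. Each residual is the left-hand side of one equation in (SM) minus its right-hand side, evaluated with central differences.

```python
import math

A_AMP = 1.0   # a = I * ell (illustrative value, not from the text)
KAPPA = 2.0   # wave number; omega = kappa in units with c = 1 (assumption)


def h_phi(r, theta, t):
    """Circular magnetic field, written as in (11.5.2)."""
    u, x = KAPPA * (r - t), KAPPA * r
    return -(A_AMP * KAPPA / r) * math.sin(theta) * (math.sin(u) / x - math.cos(u))


def e_r(r, theta, t):
    """Radial electric field, written as in (11.5.4a)."""
    u, x = KAPPA * (r - t), KAPPA * r
    return -(2.0 * KAPPA * A_AMP * math.cos(theta) / r) * (
        math.cos(u) / x**2 + math.sin(u) / x
    )


def e_theta(r, theta, t):
    """Tangential electric field, written as in (11.5.4b)."""
    u, x = KAPPA * (r - t), KAPPA * r
    return (A_AMP * KAPPA * math.sin(theta) / r) * (
        (1.0 - 1.0 / x**2) * math.cos(u) - math.sin(u) / x
    )


def diff(f, x, h=1e-5):
    """Central finite-difference derivative of a one-argument function."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def maxwell_residuals(r, theta, t):
    """Return (LHS - RHS) for the three equations in (SM) at one point."""
    res_r = diff(lambda s: e_r(r, theta, s), t) - diff(
        lambda v: math.sin(v) * h_phi(r, v, t), theta
    ) / (r * math.sin(theta))
    res_theta = diff(lambda s: e_theta(r, theta, s), t) + diff(
        lambda q: q * h_phi(q, theta, t), r
    ) / r
    res_phi = diff(lambda s: h_phi(r, theta, s), t) + (
        diff(lambda q: q * e_theta(q, theta, t), r)
        - diff(lambda v: e_r(r, v, t), theta)
    ) / r
    return res_r, res_theta, res_phi


if __name__ == "__main__":
    print(maxwell_residuals(3.0, 0.7, 0.4))
```

All three residuals should be limited only by finite-difference error; any other sample point away from $r = 0$ and the poles can be used in the same way.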
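Differentiating the third equation with respect to time and introducing the first two equations leads to the wave equation,
$$\ddot{H}_\varphi = \frac{1}{r}\frac{\partial^2}{\partial r^2}\left(rH_\varphi\right) + \frac{1}{r^2}\frac{\partial}{\partial\vartheta}\left[\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,H_\varphi\right)\right], \tag{11.5.5}$$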
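where the right-hand side is not exactly the Laplacian in spherical coordinates. It can be brought into that form by using Legendre's equation for $m = \pm 1$,
$$\frac{d}{d\vartheta}\left[\frac{1}{\sin\vartheta}\frac{d}{d\vartheta}\left(\sin\vartheta\,\Theta\right)\right] = \frac{1}{\sin\vartheta}\frac{d}{d\vartheta}\left(\sin\vartheta\,\frac{d\Theta}{d\vartheta}\right) - \frac{\Theta}{\sin^2\vartheta} = -\ell(\ell+1)\,\Theta, \tag{11.5.6}$$
whose solution is the spherical harmonic, $\Theta = Y_\ell^{\pm 1}$. In light of Legendre's equation, the second term in (11.5.5) can be replaced by $-2/r^2$, since $\ell = 1$. For any $m$ in the range $-\ell \le m \le \ell$, Legendre's equation is
$$\frac{1}{\sin\vartheta}\frac{d}{d\vartheta}\left(\sin\vartheta\,\frac{d\Theta}{d\vartheta}\right) + \left[\ell(\ell+1) - \frac{m^2}{\sin^2\vartheta}\right]\Theta = 0. \tag{11.5.7}$$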
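As a quick check of (11.5.6) and (11.5.7) for $\ell = 1$, the two angular factors that occur in this section, $\sin\vartheta$ and $\cos\vartheta$, can be substituted directly. The symbol $\Theta$ for the angular function is a label assumed here; the steps are a sketch rather than part of the original argument.

```latex
% Theta = sin(theta): the m = +-1 case, left-hand side of (11.5.6)
\frac{d}{d\vartheta}\left[\frac{1}{\sin\vartheta}\frac{d}{d\vartheta}\left(\sin^2\vartheta\right)\right]
  = \frac{d}{d\vartheta}\left(2\cos\vartheta\right)
  = -2\sin\vartheta
  = -\ell(\ell+1)\,\Theta , \qquad \ell = 1 .

% Theta = cos(theta): the m = 0 case of (11.5.7) with l = 1
\frac{1}{\sin\vartheta}\frac{d}{d\vartheta}\left(\sin\vartheta\,\frac{d\cos\vartheta}{d\vartheta}\right) + 2\cos\vartheta
  = -\frac{2\sin\vartheta\cos\vartheta}{\sin\vartheta} + 2\cos\vartheta
  = 0 .
```

In both cases the angular operator returns $-2\,\Theta$, which is why the angular term can be traded for $-2/r^2$ in the radial equation.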
Maxwell's equations contain the fact that there are only two helicities: parallel and anti-parallel to the momentum. It is for this reason that Planck was able to obtain the correct density of states of his harmonic oscillators in his study of blackbody radiation, without any knowledge of the polarization of the photon. Assuming the circular magnetic field component, $H_\varphi$, varies periodically in time, (11.5.5) for $\ell = 1$ becomes the 'spherical Bessel differential equation,'
$$\left[\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr} - \frac{2}{r^2} + \omega^2\right]H_\varphi = 0, \tag{11.5.8}$$
where the dispersion relation is $\kappa = \omega$. It is important to observe that because of (11.5.6) we do not have to specify the value of $m$. There are two linearly independent solutions to (11.5.8). They are the spherical Bessel and Neumann functions,
$$H_\varphi(r,t) = -a\omega^2\left[j_1(\kappa r)\cos\omega t + n_1(\kappa r)\sin\omega t\right]\sin\vartheta, \tag{11.5.9}$$
where
$$j_1(x) = \frac{\sin x}{x^2} - \frac{\cos x}{x}, \tag{11.5.10a}$$
and
$$n_1(x) = -\frac{\cos x}{x^2} - \frac{\sin x}{x}, \tag{11.5.10b}$$
are spherical Bessel and Neumann functions of order 1, respectively. In contrast, the radial component of the electric intensity, $E_r$, satisfies the reduced wave equation,
$$\left[\frac{1}{r}\frac{\partial^2}{\partial r^2}\,r + \frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\,\frac{\partial}{\partial\vartheta}\right) + \omega^2\right]E_r = 0. \tag{11.5.11}$$
On the strength of Legendre's equation, (11.5.7) for $\ell = 1$, we must choose $m = 0$ in order to come out with the same spherical Bessel differential equation, (11.5.8), whose solution,
$$E_r = \frac{2a\omega}{r}\left[n_1(\kappa r)\cos\omega t - j_1(\kappa r)\sin\omega t\right]\cos\vartheta, \tag{11.5.12}$$
is again given in terms of the spherical Bessel, (11.5.10a), and Neumann, (11.5.10b), functions of order 1. In comparison to (11.5.9) it is a power higher in $1/r$. For the magnetic field component, $H_\varphi$, we did not have to specify $m$; that is done by $\sin\vartheta$ in (11.5.9), which makes it proportional to either the associated Legendre function $P_1^{1} = -\sin\vartheta$ or $P_1^{-1} = \tfrac{1}{2}\sin\vartheta$. For the electric field component, $E_r$, on the other hand, we had to specify the value $m = 0$ in order that it satisfy the spherical Bessel differential equation, (11.5.8), and this is substantiated by the fact that its solution, (11.5.12), is proportional to the Legendre polynomial $P_1^{0} = \cos\vartheta$. In other words, the circular magnetic field gives the two longitudinal helicity states, $m = \pm 1$, parallel and anti-parallel to the direction of motion, while the radial component of the electric field gives the transverse helicity state, $m = 0$.

We will now show that the radial component of the electric field cannot propagate! Two regions need be considered: one in which $\lambda \gg r$, and the other $\lambda \ll r$, where $\lambda = \kappa^{-1}$. The former occurs near the radiating antenna. There the terms of highest order in $\lambda/r$ in (11.5.4b) dominate, and they describe an oscillating doublet, or dipole. This is the source of electromagnetic radiation. However, in the latter region, where $\lambda/r$ is small, all higher powers of $\lambda/r$ may be neglected so that, in this region far from the oscillating doublet, $E_r$ vanishes and $E_\vartheta$ is given in terms of $P_1^{\pm 1}$, just like $H_\varphi$. These are the
Fig. 11.14. The configuration of electric and magnetic fields on the surface of a sphere. P is Poynting’s vector showing the direction of radiation. In any small portion, a spherical wave cannot be distinguished from a plane wave.
longitudinal helicity states of the photon. This is yet another example where a transverse helicity state is not related to a massive vector field. The tangential electric and circular magnetic force components are given by the common expression,
$$E_\vartheta = H_\varphi = \frac{a\omega}{r}\cos\kappa(r-t)\sin\vartheta, \tag{11.5.13}$$
and are mutually perpendicular, as shown in Fig. 11.14, with $E_r = 0$ together with $E_\varphi = H_r = H_\vartheta = 0$. Each vector varies as the inverse of the wavelength, $\kappa = \omega$, and the solution describes a spherically symmetric wave traveling outward. Both components are inversely proportional to the radius, and, thus, become weaker and weaker as they travel further from the source. The lines of the circular magnetic field are parallels of constant latitude on any sphere of radius $r$, while those of the tangential electric field are the meridians. The time rate of radiation is obtained by integrating Poynting's vector over the surface $S$ of the sphere,
$$\oint \mathbf{P}\cdot d\mathbf{S} = \frac{1}{4\pi}\oint \mathbf{E}\times\mathbf{H}\cdot d\mathbf{S} = \frac{1}{4\pi}\int_0^{\pi}\left[\frac{\omega a}{r}\cos\kappa(r-t)\sin\vartheta\right]^2 2\pi r^2\sin\vartheta\,d\vartheta = \frac{2}{3}\,\omega^2a^2\cos^2\kappa(r-t). \tag{11.5.14}$$
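The passage from the exact fields to the radiation fields used in (11.5.13) and (11.5.14) can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: it takes the standard order-1 spherical Bessel and Neumann functions, evaluates $H_\varphi$ and $E_r$ in the Bessel forms (11.5.9) and (11.5.12), and measures how far they are from the radiation field in units of $a\omega/r$. The function names, default arguments and sample radii are hypothetical.

```python
import math


def j1(x):
    """Spherical Bessel function of order 1 (standard definition)."""
    return math.sin(x) / x**2 - math.cos(x) / x


def n1(x):
    """Spherical Neumann function of order 1 (standard definition)."""
    return -math.cos(x) / x**2 - math.sin(x) / x


def h_phi_bessel(r, theta, t, a=1.0, omega=1.0):
    """H_phi in the form (11.5.9), with kappa = omega (c = 1 assumed)."""
    x = omega * r
    return -a * omega**2 * (
        j1(x) * math.cos(omega * t) + n1(x) * math.sin(omega * t)
    ) * math.sin(theta)


def e_r_bessel(r, theta, t, a=1.0, omega=1.0):
    """E_r in the form (11.5.12), with kappa = omega."""
    x = omega * r
    return (2.0 * a * omega / r) * (
        n1(x) * math.cos(omega * t) - j1(x) * math.sin(omega * t)
    ) * math.cos(theta)


def radiation_field(r, theta, t, a=1.0, omega=1.0):
    """Common far-zone expression (11.5.13) for E_theta and H_phi."""
    return (a * omega / r) * math.cos(omega * (r - t)) * math.sin(theta)


def far_zone_deviations(r, theta=1.0, t=0.3, a=1.0, omega=1.0):
    """Deviation of H_phi from (11.5.13), and size of E_r, in units of a*omega/r."""
    scale = a * omega / r
    dev_h = abs(
        h_phi_bessel(r, theta, t, a, omega) - radiation_field(r, theta, t, a, omega)
    ) / scale
    dev_er = abs(e_r_bessel(r, theta, t, a, omega)) / scale
    return dev_h, dev_er


if __name__ == "__main__":
    for radius in (0.01, 1.0, 1000.0):
        print(radius, far_zone_deviations(radius))
```

Both measures fall off like $1/\kappa r$ at large distance, while close to the origin the radial field is the dominant one.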
In the region where only the radiation components (11.5.13) subsist, the Poynting vector is radially directed, and since it is in the direction of $\mathbf{E}\times\mathbf{H}$ it is pointed outward, as shown in Fig. 11.14. The average power radiated from a small antenna with uniform current distribution, $\tfrac{1}{3}\omega^2I^2\ell^2 = \tfrac{1}{3}\omega^2e^2u^2$, is identical to Larmor's formula (4.3.13) when averaged.

However, Poynting, as well as many other authors, believed that the electric field is always parallel to the wire, assuming it is parallel to the vector potential. This belief is based on Ohm's law $\mathbf{J} = \sigma\mathbf{E}$, where $\sigma$ is the conductivity. Although this is true inside the wire, it is not true outside the wire, where the field is nearly perpendicular to the wire [Nahin 88]. We have Heaviside [92] to thank for this observation:

... the transfer [of energy] ... takes place, in the vicinity of the wire, very nearly parallel to it, with a slight slope towards the wire ... Prof. Poynting, on the other hand (Royal Society Transactions, February 12, 1885), holds a different view, representing the transfer as nearly perpendicular to a wire, i.e. with a slight departure from the vertical. This difference of a quadrant can, I think, only arise from what seems as a misconception on his part as to the nature of the electric field in the vicinity of a wire supporting electric current. The lines of force are nearly perpendicular to the wire. The departure from perpendicularity is usually so small that I have sometimes spoken of them as being perpendicular to it, as they practically are, before I recognized the great physical importance of the slight departure. It causes the convergence of energy into the wire.
The electric vector lies in the plane of the wire and the radius vector $\mathbf{r}$. The magnetic vector is perpendicular to this plane. Because the electric vector is proportional to $\sin\vartheta$, there is no radiation in the direction of oscillation; that is, $\mathbf{E}$ cannot be parallel to $\mathbf{A}$. Poynting would have his vector pointing inward, which compensates energy dissipation through Joule heating, but it would play havoc with the radiation of radio waves. Although this energy compensation is true of a very small part of Poynting's vector, the remainder of $\mathbf{E}\times\mathbf{H}$ is parallel to the wire outside the wire. Hence, energy propagation along the wire occurs outside the wire.

The electric and magnetic fields in (11.5.13) are both proportional to $\sin\vartheta$. This means that no radiation is emitted in the direction of oscillation, while there is maximum radiation in the direction perpendicular to the oscillating dipole. Radiation in any other direction is proportional to the sine of the angle that direction makes with the vertical $z$-axis along which the electric charge is oscillating. The radial component of the electric force, $E_r$, is proportional to $\cos\vartheta$, which corresponds to the spin normal to the direction of propagation. However, it cannot propagate.
Fig. 11.15. The polar plots of the spherical harmonics. Maxwell’s equations prohibit the middle radiation pattern.
The polar plots of the spherical harmonics $Y_1^m$, for $m = 1, 0, -1$, are shown in Fig. 11.15. If radiation could occur in the $z$ direction, there would be maximum radiation in the direction of oscillation and zero at right angles to it. This would be the hallmark of compressible longitudinal waves. Maxwell took great care that his equations should describe an incompressible fluid, and, thus, they are incapable of describing the inductive zone.

The inductive zone can be distinguished from the radiation zone, because the former varies as the inverse cube of the radius while the latter varies as the inverse of the radius. In the inductive region, the electric field and magnetic field components are due to electric charges and current, respectively. The current is encircled by the stationary magnetic field. It is a region in which electromagnetic statics applies. A short wire connecting two metal spheres, acting as a condenser, is visualized as a dipole. The current carried along the wire alternately charges and discharges their capacitance. So in the inductive zone, where the antenna appears as a dipole, and at a distance that is short compared to a wavelength of radiation, there is a radial component of the electric field. But due to its short range it cannot propagate into the radiation zone.

In the inductive zone there are large amounts of energy that are continually transforming back-and-forth between the electric and magnetic fields. The fields are strongest at the equator of any imaginary sphere of radius $r$, and vanish at the poles. In the intermediary zone, where both electric field components of induction and radiation are present, they are out of phase. Only in the radiation zone are the electric and magnetic fields in phase with one another, precisely as Maxwell's equations predict!
We now want to modify Maxwell's equations to take into account the possibility of the existence of a longitudinal mode with $m = 0$. In the absence of shear, a generalized force will be given by
$$\rho\,\ddot{\mathbf{G}} = \eta\nabla(\nabla\cdot\mathbf{G}) - \nu\nabla\times\nabla\times\mathbf{G},$$
where $\mathbf{G}$ is any spatial displacement, $\eta$ and $\nu$ are the elastic constants related to compression and rotation, respectively, and $\rho$ is a density. We now set $\mathbf{G}$ equal to what Maxwell called the 'electrokinetic momentum,' $\mathbf{A}$. Heaviside argued in favor of setting the electric force, $\mathbf{E}$, equal to the velocity $\dot{\mathbf{G}}$. Then, with the simplifications $\rho = \eta = \nu = 1$, we get the first set of generalized Maxwell's circuital equations,
$$\ddot{\mathbf{A}} = \nabla(\nabla\cdot\mathbf{A}) - \nabla\times\mathbf{H}. \tag{11.5.15}$$
From the second circuital equation,
$$\nabla\times\mathbf{E} = -\dot{\mathbf{H}} = -\nabla\times\dot{\mathbf{A}}, \tag{11.5.16}$$
we see that this relation will be satisfied by $\mathbf{E} = -\dot{\mathbf{A}}$. Thus, from (11.5.1a) and (11.5.1b), we find the components of the electric vector are
$$E_r = -\dot{A}_r = -\frac{a\omega}{r}\cos\omega(t-r)\cos\vartheta, \tag{11.5.17a}$$
and
$$E_\vartheta = -\dot{A}_\vartheta = \frac{a\omega}{r}\cos\omega(t-r)\sin\vartheta. \tag{11.5.17b}$$
Expressing (11.5.17a) in terms of spherical Bessel and Neumann functions, we get
$$E_r = a\omega^2\left[n_0(\kappa r)\cos\omega t - j_0(\kappa r)\sin\omega t\right]P_1^{0}, \tag{11.5.18}$$
where
$$j_0(x) = \frac{\sin x}{x}, \tag{11.5.19a}$$
and
$$n_0(x) = -\frac{\cos x}{x}, \tag{11.5.19b}$$
are spherical Bessel and Neumann functions of order 0.
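The equivalence of (11.5.18) and (11.5.17a) follows from a single addition theorem. The short derivation below is a sketch that assumes $\kappa = \omega$ and $P_1^{0} = \cos\vartheta$, as elsewhere in this section.

```latex
n_0(\kappa r)\cos\omega t - j_0(\kappa r)\sin\omega t
  = -\frac{\cos\kappa r\,\cos\omega t + \sin\kappa r\,\sin\omega t}{\kappa r}
  = -\frac{\cos(\kappa r - \omega t)}{\kappa r},

a\omega^2\left[n_0(\kappa r)\cos\omega t - j_0(\kappa r)\sin\omega t\right]\cos\vartheta
  = -\frac{a\omega}{r}\cos\omega(t-r)\cos\vartheta
  \qquad (\kappa = \omega).
```

Only the order-0 functions appear, so the whole expression carries a single power of $1/r$.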
In contrast to (11.5.12), (11.5.18) falls off as inverse distance and when squared and integrated over a surface will be a constant. Thus, (11.5.18) can propagate! From the middle diagram of Fig. 11.15, we see that radiation is being emitted in the direction of the oscillating charges. This was strictly forbidden by Maxwell's equations, which permit only the first and third configurations, i.e. normal to the direction of the oscillating charges. However, the mismatch on the indices is sufficient to indicate that this is an artificial propagation. Consider the power equation,
$$\frac{1}{2}\frac{d}{dt}\left[E^2 + H^2 + (\nabla\cdot\mathbf{A})^2\right] = -\nabla\cdot\left[\mathbf{E}\times\mathbf{H} + \mathbf{E}(\nabla\cdot\mathbf{A})\right], \tag{11.5.20}$$
which is obtained by multiplying (11.5.15) by $\mathbf{E}$ and (11.5.16) by $\mathbf{H}$, and adding them. The new term on the left-hand side of (11.5.20), $\tfrac{1}{2}(\nabla\cdot\mathbf{A})^2$, is the energy of compression, while the new term on the right-hand side, $\mathbf{E}(\nabla\cdot\mathbf{A})$, is the momentum it creates. From the expression of the orbital angular momentum, (11.1.26), we see that $\mathbf{E}\cdot\nabla\mathbf{A}$ is its corresponding linear momentum, just as $\mathbf{E}(\nabla\cdot\mathbf{A})$ is the momentum due to compression by the action of a hydrostatic pressure, $-\nabla\cdot\mathbf{A}$. The radially outward Poynting vector, $\mathbf{E}\times\mathbf{H}$, now has another contribution coming from the radial component of the electric vector, $E_r\nabla\cdot\mathbf{A}$, since $E_\vartheta\nabla\cdot\mathbf{A}$ vanishes on integrating over a spherical surface. Noting that
$$\nabla\cdot\mathbf{A} = a\omega^2\left[j_1(\kappa r)\cos\omega t + n_1(\kappa r)\sin\omega t\right]P_1^{0},$$
the additional power will be
$$\frac{1}{4\pi}\int_0^{\pi}E_r\,\nabla\cdot\mathbf{A}\,(2\pi r^2\sin\vartheta)\,d\vartheta = \frac{1}{3}\,\omega^2a^2\cos^2\kappa(r-t). \tag{11.5.21}$$
The power due to longitudinal waves of compression and expansion, (11.5.21), is exactly half of Poynting’s value, (11.5.14). Longitudinal wave propagation is, therefore, a less efficient means of power radiation than transverse wave propagation. One final point: The reason for splitting the wire into equal halves, one above and one below the equatorial plane in Fig. 11.13, which can be a conducting sheet, is that everything below this plane of symmetry can be
eliminated. The actual antenna and conducting plane can be replaced by an isolated antenna of double length without changing the electromagnetic fields [Skilling 42]. The effect of the antenna above the conducting plane is the same as that below the plane with the current and charges being equal and opposite.
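The factor of one half between the longitudinal power (11.5.21) and Poynting's value (11.5.14) comes entirely from the angular integrals of $\cos^2\vartheta$ and $\sin^2\vartheta$ over the sphere. The following minimal sketch checks this numerically. It assumes the far-zone forms of the integrands, so that the common factor $\omega^2a^2\cos^2\kappa(r-t)$ has been divided out, and the midpoint-rule helper is a hypothetical utility, not anything taken from the text.

```python
import math


def midpoint_integral(f, lo, hi, n=20000):
    """Composite midpoint rule for a smooth integrand on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def transverse_factor():
    """(1/4pi) * integral of sin^2(theta) * 2pi sin(theta) dtheta, as in (11.5.14)."""
    return midpoint_integral(lambda th: math.sin(th) ** 3, 0.0, math.pi) / 2.0


def longitudinal_factor():
    """(1/4pi) * integral of cos^2(theta) * 2pi sin(theta) dtheta, as in (11.5.21)."""
    return midpoint_integral(
        lambda th: math.cos(th) ** 2 * math.sin(th), 0.0, math.pi
    ) / 2.0


if __name__ == "__main__":
    t_fac = transverse_factor()
    l_fac = longitudinal_factor()
    print(t_fac, l_fac, l_fac / t_fac)
```

The two factors are the numerical coefficients 2/3 and 1/3 of (11.5.14) and (11.5.21), so their ratio is the one half quoted in the text.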
11.5.2 Enter mass: Proca's equations
In the 1930s the Romanian physicist Alexandru Proca developed a vector meson theory of nuclear forces that was subsequently used by Yukawa to obtain a Nobel Prize for himself. What Proca did was to modify Maxwell's equations so that they would admit a non-vanishing photon mass through the appearance of the Compton wavelength, $\lambda_c = m^{-1}$. Proca's equations read:
$$\begin{aligned}
\dot{\mathbf{E}} &= \nabla\times\mathbf{H} + m\mathbf{A}, &\qquad \dot{\mathbf{H}} &= -\nabla\times\mathbf{E},\\
\nabla\cdot\mathbf{E} &= -m\phi, &\qquad \nabla\cdot\mathbf{H} &= 0,
\end{aligned} \tag{P}$$
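together with the auxiliary conditions,
$$\nabla\cdot\mathbf{A} = -\dot{\phi}, \qquad \nabla\times\mathbf{A} = m\mathbf{H}, \qquad m\mathbf{E} + \dot{\mathbf{A}} + \nabla\phi = 0. \tag{P$'$}$$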
The last equation follows from taking the time derivative of the first equation on the left, introducing the first equation on the right, and observing that any field will satisfy the Klein–Gordon equation,
$$\ddot{\mathbf{E}} = \nabla(\nabla\cdot\mathbf{E}) - \nabla\times(\nabla\times\mathbf{E}) - m^2\mathbf{E}. \tag{11.5.22}$$
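The steps just described can be written out explicitly. This is a sketch of the chain of substitutions, using only Proca's field equations and the Klein–Gordon equation (11.5.22) as stated above.

```latex
% time derivative of the first Proca equation, then Faraday's law
\ddot{\mathbf{E}} = \nabla\times\dot{\mathbf{H}} + m\dot{\mathbf{A}}
                  = -\nabla\times(\nabla\times\mathbf{E}) + m\dot{\mathbf{A}} .

% comparison with (11.5.22), using \nabla\cdot\mathbf{E} = -m\phi
m\dot{\mathbf{A}} = \nabla(\nabla\cdot\mathbf{E}) - m^2\mathbf{E}
                  = -m\nabla\phi - m^2\mathbf{E}
\quad\Longrightarrow\quad
m\mathbf{E} + \dot{\mathbf{A}} + \nabla\phi = 0 .
```

Read in the opposite direction, the same lines show that the auxiliary condition together with the field equations reproduces the Klein–Gordon equation for $\mathbf{E}$.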
The first equation in the second set is the transversality condition, which in terms of the four-vector potential, $A^\mu$, can be expressed as $\partial_\mu A^\mu = 0$. That mass requires the presence of the potentials, $\phi$ and $\mathbf{A}$, means that the energy densities and momentum will also require them. Scalar multiplication of the first of Proca's equations by $\mathbf{E}$ and the second by $\mathbf{H}$ leads to the power density equation,
$$\nabla\cdot(\mathbf{E}\times\mathbf{H}) + \frac{1}{2}\frac{\partial}{\partial t}\left(E^2 + H^2\right) = -\nabla\cdot(\phi\mathbf{A}) - \frac{1}{2}\frac{\partial}{\partial t}\left(\phi^2 + A^2\right). \tag{11.5.23}$$
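The way the mass drops out of (11.5.23) can be seen by carrying out the multiplication term by term. The following is a sketch that uses only Proca's field equations, the auxiliary condition $m\mathbf{E} + \dot{\mathbf{A}} + \nabla\phi = 0$ and the transversality condition $\nabla\cdot\mathbf{A} = -\dot{\phi}$.

```latex
\mathbf{E}\cdot\dot{\mathbf{E}} + \mathbf{H}\cdot\dot{\mathbf{H}}
  = \mathbf{E}\cdot\nabla\times\mathbf{H} - \mathbf{H}\cdot\nabla\times\mathbf{E} + m\,\mathbf{E}\cdot\mathbf{A}
  = -\nabla\cdot(\mathbf{E}\times\mathbf{H}) + m\,\mathbf{E}\cdot\mathbf{A} ,

m\,\mathbf{E}\cdot\mathbf{A}
  = -(\dot{\mathbf{A}} + \nabla\phi)\cdot\mathbf{A}
  = -\tfrac{1}{2}\,\partial_t A^2 - \nabla\cdot(\phi\mathbf{A}) + \phi\,\nabla\cdot\mathbf{A}
  = -\tfrac{1}{2}\,\partial_t\left(A^2 + \phi^2\right) - \nabla\cdot(\phi\mathbf{A}) .
```

The single factor of $m$ multiplying $\mathbf{E}\cdot\mathbf{A}$ is absorbed by the auxiliary condition, which is why no mass survives in the balance equation.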
Surprisingly, mass will not appear in energetic considerations, as (11.5.23) testifies. Expression (11.5.23) shows that the fields and potentials can be conserved independently,
$$\frac{1}{2}\frac{\partial}{\partial t}\left(\phi^2 + A^2\right) + \nabla\cdot(\phi\mathbf{A}) = 0,$$
or if not, can be combined into the energy density,
$$\frac{1}{8\pi}\left(E^2 + H^2 + \phi^2 + A^2\right), \tag{11.5.24}$$
and energy flux,
$$\frac{1}{4\pi}\left(\mathbf{E}\times\mathbf{H} + \phi\mathbf{A}\right). \tag{11.5.25}$$
On the basis of (11.5.24) and (11.5.25) Bass and Schrödinger [55] were able to discriminate between transverse and longitudinal waves. Maxwell called $\mathbf{A}$ the 'electrokinetic' momentum, and rightly so. For a transverse wave, the contribution from $\mathbf{A}$ will be negligibly small so that it can be considered as an $(\mathbf{E}, \mathbf{H})$-wave, while for a longitudinal wave, the momentum is in the direction of $\mathbf{A}$ almost entirely so that it can be considered a $(\phi, \mathbf{A})$-wave. Longitudinal waves are to be associated with the potentials while transverse waves are to be associated with the fields.

If the fields were to vanish altogether, the components of the four-vector potential would be gradients in the direction of motion and would be ineffective to sustain wave motion since there is no longer induction. Keeping a small, but finite, electric field shows that both the scalar and vector potentials satisfy the Klein–Gordon equation, (11.5.22), which now reduces to
$$\nabla\cdot\nabla\phi - \ddot{\phi} = m^2\phi. \tag{11.5.26}$$
Although Proca's equations are self-consistent insofar as the four-vector potential satisfies the Klein–Gordon equation, (11.5.26), they are not covariant gauge-invariant. If we introduce the definition of the magnetic
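field into Faraday's equation, we can write it as $\nabla\times(\mathbf{E} + m^{-1}\dot{\mathbf{A}}) = 0$. This can be satisfied identically by
$$\mathbf{E} = -m^{-1}\dot{\mathbf{A}} - m^{-1}\nabla\phi, \tag{11.5.27}$$
which is the last equation in (P$'$). Now, if we change the potentials in such a way that
$$\mathbf{A} \to \mathbf{A}' = \mathbf{A} - \nabla\Lambda, \qquad \phi \to \phi' = \phi + \dot{\Lambda},$$
for arbitrary $\Lambda$, there will be no change in (11.5.27). Moreover, if we require $\Lambda$ to satisfy the wave equation, $\nabla^2\Lambda = \ddot{\Lambda}$, then
$$\nabla\cdot\mathbf{A} + \dot{\phi} = \nabla\cdot\mathbf{A}' + \dot{\phi}'. \tag{11.5.28}$$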
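Both invariance statements can be verified in two lines. In this sketch the gauge function is written as $\Lambda$, a label assumed here for the arbitrary function entering the change of potentials.

```latex
% invariance of (11.5.27)
\dot{\mathbf{A}}' + \nabla\phi'
  = \dot{\mathbf{A}} - \nabla\dot{\Lambda} + \nabla\phi + \nabla\dot{\Lambda}
  = \dot{\mathbf{A}} + \nabla\phi ,

% behaviour of the gauge condition
\nabla\cdot\mathbf{A}' + \dot{\phi}'
  = \nabla\cdot\mathbf{A} + \dot{\phi} - \left(\nabla^2\Lambda - \ddot{\Lambda}\right) .
```

The electric field is therefore unchanged for any $\Lambda$, whereas the combination $\nabla\cdot\mathbf{A} + \dot{\phi}$ is unchanged only when $\Lambda$ obeys the massless wave equation.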
Things which are equal but have nothing in common can only be equal if they are equal to a constant. We are free to choose this constant equal to zero, and (11.5.28) becomes the first equation in (P$'$), which is the Lorentz gauge. The only blemish on the Proca equations is that the gauge potential does not satisfy the same wave equation as the four-vector potential. This will be remedied in Sec. 11.5.3. If we choose a non-covariant gauge, $\nabla^2\Lambda = 0$, this would necessarily imply that $\nabla\cdot\mathbf{A} = \nabla\cdot\mathbf{A}'$, or that $\nabla\cdot\mathbf{A} = 0$, which is the Coulomb, or radiation, gauge. This would be counter-productive since it would ensure transverse waves. Also, $\Lambda$ would not satisfy the same wave equation as the four-vector potential. Proca's equations, (P) and (P$'$), preserve the transversality condition, and, thus, can support only transverse waves. It is very enticing, and not new to electroweak theory, to associate the unused 'third degree of freedom'
with a longitudinal mode, and with mass. Bass and Schrödinger [55] did precisely this back in 1955. While admitting that

Plane waves have only two possible states of polarization, not three, as would be expected for a vector wave (e.g. an elastic wave; remember the historical dilemma concerning the 'elastic properties of the ether'),
they contend that a third state of polarization, namely, a longitudinal wave, is possible for any two Maxwellian transversal waves with the same wave normal. The third wave is propagated with the same velocity; it is perfectly respectable, and remains so, however small a value we adopt for the rest-mass.
Thus, they associate the third, unused, degree of freedom of light with a longitudinal mode that would be a massive field. However, there is no reason to believe that the longitudinal and transverse modes will propagate at the same velocity since the mechanism of wave generation is completely different, and so too is what is being propagated. Bass and Schrödinger use Proca's equations to support their assertions. However, the transversality condition has not been affected by the introduction of mass so that a longitudinal mode of propagation is immediately ruled out.

The transverse waves easily follow from the second line of (P): $\nabla\cdot\mathbf{H} = 0$ means that our plane wave traveling in the $z$ direction has $H_z = 0$. The same will be true of the electric field if $\phi = 0$ so that the Proca equations reduce to
$$\begin{aligned}
\dot{\mathbf{E}} &= \nabla\times\mathbf{H} + m\mathbf{A}, &\qquad \dot{\mathbf{H}} &= -\nabla\times\mathbf{E},\\
\nabla\cdot\mathbf{E} &= 0, &\qquad \nabla\cdot\mathbf{H} &= 0,
\end{aligned} \tag{PT}$$
with the auxiliary conditions,
$$\nabla\cdot\mathbf{A} = 0, \qquad \nabla\times\mathbf{A} = m\mathbf{H}, \qquad m\mathbf{E} + \dot{\mathbf{A}} = 0. \tag{P$'$T}$$
From the first equation in (P$'$T), we know that $A_z = 0$, so that $\mathbf{A}$ can either be parallel to $\mathbf{E}$ or $\mathbf{H}$. But because of the last equation in (P$'$T) we set $\mathbf{E} \parallel \mathbf{A}$. We know from Sec. 11.5.1 that this cannot be the case outside of a conducting wire. Taking the time derivative of the last equation in (P$'$T), and eliminating the time derivative of the first term by using the first equation in (PT) and
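the second equation in (P$'$T), gives
$$\ddot{\mathbf{A}} = -\nabla\times\nabla\times\mathbf{A} - m^2\mathbf{A}. \tag{11.5.29}$$
This is the equation of motion of an incompressible elastic solid which shows resistance to both translational and rotational motion. To see this in greater detail, we form the power equation. Multiplying (11.5.29) through by $\dot{\mathbf{A}}$, we get
$$\frac{1}{2}\frac{d}{dt}\left[\dot{A}^2 + m^2A^2 + (\nabla\times\mathbf{A})^2\right] = \nabla\cdot\left(\dot{\mathbf{A}}\times\nabla\times\mathbf{A}\right), \tag{11.5.30}$$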
on the strength of the vector identity,
$$\nabla\cdot(\mathbf{X}\times\mathbf{Y}) = \mathbf{Y}\cdot\mathrm{curl}\,\mathbf{X} - \mathbf{X}\cdot\mathrm{curl}\,\mathbf{Y},$$
for any two vectors $\mathbf{X}$ and $\mathbf{Y}$. The terms in (11.5.30) have the following significances: $\tfrac{1}{2}\dot{A}^2$ is the kinetic energy, $\tfrac{1}{2}m^2A^2$ is the potential energy,$^{h}$ and $\tfrac{1}{2}(\nabla\times\mathbf{A})^2$ is the energy of rotation.$^{i}$ Consequently, (11.5.29) is the equation of motion of a transverse wave.

Next, Bass and Schrödinger consider longitudinal waves. Since the magnetic field will always be solenoidal, they set it equal to zero, $\mathbf{H} = 0$, and the rotational energy vanishes. Now we should expect some form of compressional motion just like in (11.5.15). Let's see. Proca's equations, (P) and (P$'$), then reduce to
$$\dot{\mathbf{E}} = m\mathbf{A}, \qquad \nabla\times\mathbf{E} = 0, \qquad \nabla\cdot\mathbf{E} = -m\phi, \tag{PL}$$

$^{h}$ This is the term by which mass is introduced in the Lagrangian. But because this term is not invariant under a gauge transformation it will introduce additional terms that are linear in the four-vector that are not canceled out in the transformation of the wave function. It is for this reason that such a term is banned from the Yang–Mills Lagrangian, and recourse is made to gauge symmetry-breaking.

$^{i}$ Recall that in (11.5.20) we found the compressional energy, $\tfrac{1}{2}(\nabla\cdot\mathbf{A})^2$, due to a static pressure, $-\nabla\cdot\mathbf{A}$. Now we have rotational energy $\tfrac{1}{2}(\nabla\times\mathbf{A})^2$ due to angular momentum, $\nabla\times\mathbf{A}$.
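with the auxiliary conditions,
$$\nabla\cdot\mathbf{A} = -\dot{\phi}, \qquad \nabla\times\mathbf{A} = 0, \qquad m\mathbf{E} + \dot{\mathbf{A}} + \nabla\phi = 0. \tag{P$'$L}$$
Again, taking the time derivative of the last equation in (P$'$L), and eliminating the time derivatives of the first and third terms, now results in
$$\ddot{\mathbf{A}} = \nabla(\nabla\cdot\mathbf{A}) - m^2\mathbf{A}. \tag{11.5.31}$$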
This is the equation of a compressible elastic solid which offers resistance to translation. That is, $\mathbf{A}$, and also $\mathbf{E}$, are polar, and are propagated without magnetic force. So we conclude, along with Heaviside: "This makes longitudinal electric waves." The scalar potential, $\phi$, satisfies the same equation as (11.5.31), except that the first term on the right-hand side is $\nabla^2\phi$, so this does not tell us anything about the nature of the elastic solid. Again we form the power equation by multiplying (11.5.31) by $\dot{\mathbf{A}}$. We then obtain
$$\frac{1}{2}\frac{d}{dt}\left[\dot{A}^2 + m^2A^2 + (\nabla\cdot\mathbf{A})^2\right] = \nabla\cdot\left[\dot{\mathbf{A}}(\nabla\cdot\mathbf{A})\right], \tag{11.5.32}$$
on the strength of the vector identity,
$$\mathrm{div}(c\mathbf{X}) = c\,\mathrm{div}\,\mathbf{X} + \mathbf{X}\cdot\mathrm{grad}\,c,$$
for any scalar $c$ and vector $\mathbf{X}$. From (11.5.32) it is clear that $\tfrac{1}{2}\dot{A}^2$ is the kinetic energy, $\tfrac{1}{2}m^2A^2$ is the potential energy, $\tfrac{1}{2}(\nabla\cdot\mathbf{A})^2$ is the energy of compression, and $-\dot{\mathbf{A}}(\nabla\cdot\mathbf{A})$ is the energy flux density just like in (11.5.20). Hence, (11.5.31) describes a longitudinal wave, but we did not need Proca's equations to get it.

If we do not set $\phi = 0$ to get transverse waves, or $\mathbf{H} = 0$ to get longitudinal waves, the entire set of Proca's equations gives the Klein–Gordon equation, (11.5.22). The power equation will contain both the energies of rotation and compression, but with equal elastic constants. Equal elastic coefficients allow the first two terms on the right-hand side of (11.5.22) to be combined into a single term, the Laplacian. And the resulting wave is still transverse, exactly as Maxwell predicts. The vanishing of the scalar field converts the Lorentz gauge into the Coulomb gauge and ensures that $\mathbf{A}$ will be solenoidal. Alternatively, if $\mathbf{H}$ is polar, or vanishes, the circuital equations are broken, and only longitudinal waves persist.
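For a plane wave $\mathbf{A}\propto e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}$, both the transverse equation (11.5.29) and the longitudinal equation (11.5.31) reduce to the same dispersion relation, $\omega^2 = k^2 + m^2$, since $\nabla\times\nabla\times\mathbf{A} = k^2\mathbf{A}$ for a transverse wave and $\nabla(\nabla\cdot\mathbf{A}) = -k^2\mathbf{A}$ for a longitudinal one. The sketch below, with $c = 1$ and hypothetical function names, evaluates the phase and group velocities that follow from it; it is an illustration of the role of the mass term rather than a calculation taken from the text.

```python
import math


def omega(k, m):
    """Frequency from the dispersion relation omega^2 = k^2 + m^2 (c = 1)."""
    return math.sqrt(k * k + m * m)


def phase_velocity(k, m):
    """omega / k: exceeds 1 whenever m is nonzero."""
    return omega(k, m) / k


def group_velocity(k, m):
    """d(omega)/dk = k / omega: below 1 whenever m is nonzero."""
    return k / omega(k, m)


if __name__ == "__main__":
    for mass in (0.0, 4.0):
        k = 3.0
        print(mass, phase_velocity(k, mass), group_velocity(k, mass))
```

With $m = 0$ the two velocities coincide. With $m \neq 0$ they separate, and they do so identically for the transverse and longitudinal equations, because the mass enters both only through the restoring term $-m^2\mathbf{A}$.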
All these conclusions hold independently of whether or not the mass $m$ vanishes! For both types of waves, the effect of mass is to introduce a potential energy term, with $m$ playing the role of a spring constant. In regard to the wave equations, (11.5.29) and (11.5.31), the effect of this term is to introduce dispersion so that the group and phase velocities will not be equal. The presence of mass has nothing to do with the existence of longitudinal waves. However, in contrast to Maxwell's equations, the presence of mass requires the potentials, and not just the fields. If mass required a longitudinal mode it could not be polarized; only transverse waves are polarizable.

The weak point in the Proca equations, (P), is the expression for the divergence of the electric field. Instead of setting it equal to the charge density, as Gauss would have done, it is set equal to the scalar field. When the latter vanishes, it makes both $\mathbf{E}$ and $\mathbf{A}$ solenoidal. The electric field will be solenoidal anyway far from electric charges. The presence of $\phi$ is required when we create longitudinal waves.

From what has been said in Sec. 11.1.4, we have no reason to believe that transverse and longitudinal waves will propagate at the same speed. Since we know that the transverse waves propagate at the speed of light, longitudinal waves will either propagate slower or faster. If $\mathbf{G}$ is any generalized displacement, the most general form of the force due to shear, compression, and rotation is
$$\mathbf{F} = \xi\left[\nabla^2\mathbf{G} + \tfrac{1}{3}\nabla(\nabla\cdot\mathbf{G})\right] + \eta\nabla(\nabla\cdot\mathbf{G}) - \nu\nabla\times(\nabla\times\mathbf{G}), \tag{11.5.33}$$
where $\xi$ is the rigidity, $\eta$ the compressive resistivity, and $\nu$ the elastic constant related to rotation. Using the vector identity $\mathrm{curl}^2 = \nabla\,\mathrm{div} - \nabla^2$, the force may be written as
$$\mathbf{F} = (\xi + \nu)\nabla^2\mathbf{G} + \left(\eta + \tfrac{1}{3}\xi - \nu\right)\nabla(\nabla\cdot\mathbf{G}).$$
Neglecting shear, and with $\nu = \eta$, the compressibility vanishes, bringing us back to Maxwell's theory.
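The propagation speeds implied by the rewritten force can be read off by inserting a plane wave. The sketch below assumes the equation of motion $\rho\,\ddot{\mathbf{G}} = \mathbf{F}$ used earlier in this section and a trial solution $\mathbf{G} = \mathbf{G}_0\,e^{i(\mathbf{k}\cdot\mathbf{x} - \omega t)}$; the symbols $c_T$ and $c_L$ for the two speeds are labels introduced here.

```latex
-\rho\,\omega^2\mathbf{G}
  = -(\xi+\nu)\,k^2\,\mathbf{G} - \left(\eta + \tfrac{1}{3}\xi - \nu\right)\mathbf{k}\,(\mathbf{k}\cdot\mathbf{G}) ,

% transverse wave, k . G = 0
c_T^2 = \frac{\omega^2}{k^2} = \frac{\xi+\nu}{\rho} ,
\qquad
% longitudinal wave, G parallel to k
c_L^2 = \frac{\omega^2}{k^2} = \frac{\eta + \tfrac{4}{3}\xi}{\rho} .
```

With shear neglected, $\xi = 0$, the two speeds are $\sqrt{\nu/\rho}$ and $\sqrt{\eta/\rho}$, so the longitudinal speed is set by $\eta$ alone.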
In general, $\eta$ can take on all values from 0 to $\infty$, and since the speed of propagation will be proportional to $\sqrt{\eta}$, longitudinal waves will, in general, propagate faster than transverse waves. This is
the conclusion Heaviside reached, and he was no stranger to tachyons, or particles that travel faster than the speed of light. Writing in 1898 (7 years before special relativity, if you mark its birth with Einstein's 1905 paper) Heaviside [99, Appendix G] remarked that Searle, J. J. Thomson, and FitzGerald all considered that no charged body can travel faster than the speed of light. This is because the energy of a charged body is infinite at the speed of light, "and since this energy must be derived from an external source, an infinite amount of work must be done, that is, an infinite resistance will be experienced." One way of proving this a 'fallacy' is to consider two oppositely charged bodies, both moving at the speed of light, so that "the infinity disappears, and there you are, with finite energy when moving at the speed of light."

Heaviside was considering electromagnetic energy, and not the total mechanical energy, which relativity theory asserts is true for charged, as well as uncharged, matter. The lack of distinction between the two energies would have troubled Maxwell deeply. For Maxwell reasoned, as we have seen in Sec. 3.8.1, that it takes energy to overcome the repulsion when two like charges are brought together. This energy goes "into the field" giving it a positive energy density. But two neutral masses attract one another so that it takes energy to keep them apart, and this would mean that there would be a negative energy density in the field. This so worried Maxwell that he gave up all hope of including gravity as a field theory. This, too, troubled Heaviside, for he wrote in July 1893:

To form any notion at all of the flux of gravitational energy, we must first localize the energy. In this respect it resembles the legendary hare in the cookery book. Whether this notion will turn out to be useful is a matter for subsequent discovery. For this, also, there is a well-known gastronomical analogy.
By making all matter obey relativity, any reasoning of this type becomes completely sterile, together with the notion of how energy is stored in the field.
11.5.3 Proca's approach to superconductivity
We can remedy the fact that any arbitrary gauge in Proca's equations satisfies the wave equation instead of the Klein–Gordon equation by replacing $\mathbf{A}$ by $\mathbf{A} - \nabla\Lambda$, and $\phi$ by $\phi + \dot{\Lambda}$, everywhere in Proca's equations. We then
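obtain
$$\begin{aligned}
\dot{\mathbf{E}} &= \nabla\times\mathbf{H} + m(\mathbf{A} - \nabla\Lambda), &\qquad \dot{\mathbf{H}} &= -\nabla\times\mathbf{E},\\
\nabla\cdot\mathbf{E} &= -m(\phi + \dot{\Lambda}), &\qquad \nabla\cdot\mathbf{H} &= 0,
\end{aligned} \tag{L}$$
with the auxiliary conditions,
$$\nabla\cdot\mathbf{A} = -\dot{\phi} + m^2\Lambda, \qquad \nabla\times\mathbf{A} = m\mathbf{H}, \qquad m\mathbf{E} + \dot{\mathbf{A}} + \nabla\phi = 0. \tag{L$'$}$$
Now all potentials and fields will satisfy the Klein–Gordon equation, (11.5.22). The current,
$$\mathbf{J} = m(\nabla\Lambda - \mathbf{A}), \tag{11.5.34}$$
and charge density,
$$\rho = -m(\phi + \dot{\Lambda}), \tag{11.5.35}$$
satisfy the continuity equation
$$\dot{\rho} = -\nabla\cdot\mathbf{J}, \tag{11.5.36}$$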
which is none other than the first equation in (L$'$). The new potential, $\Lambda$, in this gauge has the significance of an internal, as opposed to the external potential, $\mathbf{A}$, in the Meissner effect.

Quantum mechanics gets projected onto the macroscopic stage when electrons interacting with the lattice produce attractive forces between themselves. When the electron energies are sufficiently small, this attractive force induced by lattice interactions is sufficient to overcome their Coulomb repulsion. Pairs of electrons with their spins in opposite directions lock together to form a spin-0 boson with double negative charge. These 'Cooper' pairs have an enormously large effective size, about $10^{-4}$ cm, due to their very weak binding. Hence, these Cooper pairs will overlap with other Cooper pairs producing a state of coherence due to the locking together of the phases of their wave functions. Instead of dealing with $10^{6}$ pairs, the current of the superconductor acts as if it were a single, free particle.

The Meissner effect results when an external magnetic field interacts with the Cooper pairs. When the magnetic field penetrates into the
superconductor it will create a current resulting in the flow of Cooper pairs. This current, in turn, will generate its own magnetic field so as to oppose the external field. However, since the nullification is not exact, there will be a small magnetic field that seeps into the superconductor, decreasing exponentially with distance. When the quantum mechanical flux,
$$\mathbf{J} = \frac{1}{2mi}\left(\psi^*\nabla\psi - \psi\nabla\psi^*\right) - \frac{e^2}{m}\,\psi^*\psi\,\mathbf{A},$$
is evaluated by a wave function of the form $\psi \sim e^{i\alpha\Lambda}$, where $\alpha = e^2$, the fine-structure constant, we get a current density of the form (11.5.34),
$$\mathbf{J} = \frac{e^2}{m}\left(\nabla\Lambda - \mathbf{A}\right), \tag{11.5.37}$$
except that the coefficient is inversely proportional to the mass, whereas it is proportional to it in (11.5.34). Since the Meissner effect is time-independent, the Coulomb gauge, $\nabla\cdot\mathbf{A} = 0$, is applied to the continuity equation, $\nabla\cdot\mathbf{J} = 0$.$^{j}$ Applying this to (11.5.37) requires $\nabla^2\Lambda = 0$, which means $\nabla\Lambda$ is constant. If we take this constant to be zero, we come out with London's equation,
$$\mathbf{J} = -m\mathbf{A}, \tag{11.5.38}$$
which is the hallmark of superconductivity. The phase, , is an internal phase that depends only on the properties of the superconductor. On the contrary, the magnetic field is an applied external field. The flux, (11.5.37), is the difference between the internal momentum, ∇, and the external momentum A. The vanishing of the internal momentum, ∇ = 0, is precisely the condition for the onset of superconductivity, described by London’s equation, (11.5.38). Hence, j It is amusing that applying the Coulomb gauge in (L ) gives φ = m2 . Introducing
this into the expression for the charge density, (11.5.35), results in a field equation for a forced harmonic oscillator, $\ddot{\phi} + m^{2}\phi = -m\rho$. So it would appear that the charge density makes the gravitational field oscillate at a frequency inverse to the Compton length.
in the modified Proca equations (L) and (L′) has the physical significance of an internal field that is created when the external electromagnetic fields act on a continuous medium, like a composite system of electrons interacting with a lattice so that they are bound together in Cooper pairs. This medium has many properties of the aether, even at relatively short distances. From now on we will omit the internal phase.

There is a great deal of folklore connecting the Meissner effect with spontaneous symmetry-breaking in the electroweak interaction [Gottfried & Weisskopf 86]. The claims are:

(i) Within a superconductor the frequency of any disturbance must exceed a certain threshold and thus correspond to a quantum of energy $\omega_0$ having a finite mass.
(ii) Not only transverse waves, but also longitudinal ones can propagate in a superconductor, and it is the longitudinal waves that carry mass.
(iii) The existence of a threshold, $\omega_0$, and longitudinal fields are both related to the helicity state h = 0.

We will now address these points. Although the definition of the magnetic field in terms of the vector potential leads to the perplexity that it depends upon a finite mass, as we have already mentioned, the really suspicious equation is the modified Gauss law, which is given on the left-hand side of the second line in (P). It is analogous to the potential term in the first set of Maxwell’s equations that was introduced by Helmholtz, as we will discuss in Sec. 11.5.5 below. For if we equate it to Gauss’s law, we get

$$\rho = \nabla\cdot E = -m\phi. \tag{11.5.39}$$
This is certainly not the solution (11.1.17) to Poisson’s equation, (11.1.14), with c = ρ. Moreover, if we take the divergence of the last equation in (P′), and use the Coulomb gauge, we get

$$(\nabla^{2} - m^{2})\phi = 0, \tag{11.5.40}$$
which cannot admit a plane wave solution. This is because the four-vector (k, ω) is time-like, ω ≥ |k|. We can transform the space part to zero, but not the time part.

If we take the divergence of the first equation on the left-hand side in (P), and use Gauss’s law, the first equality in (11.5.39), we come out with the continuity equation (11.5.35), where the flux is given by London’s equation, (11.5.38). Taking the curl of the first equation in (P), we get

$$\nabla^{2}H - \ddot{H} = m^{2}H, \tag{11.5.41}$$
on the strength of the last equation in (P), and the second equation on the first line of (P′). This shows that the magnetic field satisfies the Klein–Gordon equation. But (11.5.41) does not describe the static Meissner effect: static magnetic fields cannot penetrate into a superconductor beyond a layer of thickness $\sim m^{-1} = \lambda_{c}$, the Compton wavelength. The persistent current, (11.5.38), is also confined to this layer. Can we convert (11.5.41) into the stationary equation,

$$(\nabla^{2} - m^{2})H = 0\,? \tag{11.5.42}$$
The dispersion relation corresponding to (11.5.41),

$$\omega^{2} = \kappa^{2} + m^{2}, \tag{11.5.43}$$
says that the four-vector (κ, ω) is time-like. Hence, its space part κ can be transformed to zero, giving us a quantum of mass ω = m. But its time part cannot be transformed to zero. However, if H remains steady in time, E is polar from the second equation on the first line of (P). If H remains steady in time, so too must A, and nothing can propagate. With H solenoidal, $\nabla^{2}H = -\mathrm{curl}^{2}H$, but with E polar, $\nabla^{2}E = \mathrm{grad}\,\mathrm{div}\,E$; the latter too will satisfy (11.5.42), so that both fields will decay exponentially, as we now show. Rayleigh, in the second volume of The Theory of Sound, tells us how to solve this equation. Expressing $\nabla^{2}$ in polar coordinates, he finds:

$$\nabla^{2}\,\frac{e^{-mr}}{r} = \left(\frac{2}{r}\frac{d}{dr} + \frac{d^{2}}{dr^{2}}\right)\frac{e^{-mr}}{r} = \frac{1}{r}\frac{d^{2}}{dr^{2}}\left(r\cdot\frac{e^{-mr}}{r}\right) = m^{2}\,\frac{e^{-mr}}{r}.$$
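Rayleigh's identity can be checked numerically. The following is only a sketch, assuming a central-difference estimate of the radial operator $(1/r)\,d^{2}/dr^{2}(r\,\cdot)$; the function names `yukawa` and `radial_laplacian`, the step size, and the sample values of m and r are illustrative choices, not taken from the text.

```python
import math

def yukawa(r, m):
    """Yukawa profile exp(-m r) / r."""
    return math.exp(-m * r) / r

def radial_laplacian(f, r, h=1e-3):
    """Central-difference estimate of (1/r) d^2/dr^2 [r f(r)],
    the Laplacian of a spherically symmetric function."""
    g = lambda x: x * f(x)
    return (g(r + h) - 2.0 * g(r) + g(r - h)) / (h * h * r)

m = 2.0  # illustrative mass (inverse length), an assumption
for r in (0.5, 1.0, 3.0):
    lhs = radial_laplacian(lambda x: yukawa(x, m), r)
    rhs = m * m * yukawa(r, m)
    print(f"r = {r}: laplacian = {lhs:.6e}, m^2 f = {rhs:.6e}")
```

Away from the origin the two columns should agree to within the discretization error of the finite difference, which is the content of $(\nabla^{2} - m^{2})\,e^{-mr}/r = 0$ for r > 0.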
If J is an impressed electric current in a conductor, the wave equation (11.5.42) becomes $(\nabla^{2} - m^{2})H = -\nabla\times J$, whose solution can be written as

$$H = \int \frac{e^{-mr}}{r}\,\nabla\times J\,dV. \tag{11.5.44}$$
Such solutions were well known long before Yukawa applied them to limit the range of the nuclear binding forces that are mediated by the exchange of a new quantum, which he drew from the analogy with the photon. However, the range of the electromagnetic force is infinite, making the photon massless, while nuclear forces were known to be short range, and in order for their range to be less than one fermi, the mass of the mediating particle had to be greater than 200 MeV, which is not far from the 140 MeV of the known π-meson.

Heaviside referred to (11.1.17) as ‘pot,’ and (11.5.44) as ‘pan,’ although he was not referring to pots and pans. In his own words

. . . pot means “potential,” or “the potential of,” and has no more to do with kettle than the trigonometrical sin has to do with the unmentionable one.
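The numbers quoted for Yukawa's mediator follow from identifying the range with the reduced Compton wavelength, $m^{-1} = \hbar/mc$. A minimal sketch, assuming the conversion constant $\hbar c \approx 197.327$ MeV fm; the helper name `yukawa_range_fm` is hypothetical:

```python
HBAR_C_MEV_FM = 197.327  # hbar * c in MeV * fm (assumed conversion constant)

def yukawa_range_fm(mass_mev):
    """Range 1/m of a Yukawa potential, in fermi, for a mediator
    of rest energy mass_mev (in MeV)."""
    return HBAR_C_MEV_FM / mass_mev

for label, mass in (("threshold mass", 200.0), ("pi-meson", 140.0)):
    print(f"{label}: {mass} MeV -> range {yukawa_range_fm(mass):.3f} fm")
```

A rest energy of about 200 MeV corresponds to a range just under one fermi, while the 140 MeV π-meson gives a somewhat longer range, consistent with the estimate above.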
According to Heaviside, $m^{-1}$ would not be related to the range of the potential; rather, its inverse would represent the space derivative, d/d(ut), which transforms J(t) to J(t − r/u), thereby making (11.5.44) a retarded potential.

Spontaneous symmetry-breaking in the electroweak interaction now exploits Lorentz invariance to show that H must satisfy the Klein–Gordon equation. Admittedly, the constraint of Lorentz invariance is too simplistic for a superconductor since the ions in the conductor supposedly select out a preferred, time-independent, frame. But it is argued [Gottfried & Weisskopf 86] that, since electroweak theory must be Lorentz-invariant, the electric and magnetic fields must propagate, albeit only above a threshold frequency, because, now, the photons have ‘acquired mass.’ Since the dispersion equation of the Klein–Gordon equation is (11.5.43), the quanta of the field have mass m. Gottfried and Weisskopf argue that

As these fields are vectorial, these quanta are conventional spin-1 bosons with helicities h = ±1 and 0. The “lost” degree of freedom [the phase] has reappeared as the
longitudinal mode with h = 0 . . . The phase is not an independent degree of freedom of the electrons; in the superconducting state it is the longitudinal degree of freedom of the electromagnetic field.
This, however, is only wishful thinking for if E and H both satisfy the Klein–Gordon equation, the waves are transverse. Quanta with finite mass can propagate both transversally as well as longitudinally. But, for the latter to take place H must remain steady in time, or vanish altogether. Only in this case will E be polar, and there will be longitudinal electric waves. Arguments related to Lorentz invariance are irrelevant since Lorentz invariance applies to all fields or to none. For a phase to represent a longitudinal mode requires an act of faith! It is as Heaviside says: What happens in an unbounded non-conducting uniform medium is that the circuital E and H make Maxwellian waves which go out to infinity, whilst the polar part of E makes longitudinal waves, which also go out to infinity. Nothing is left behind.
Nothing is left behind if H = 0, but if H is steady and (11.5.42) applies, then there is a fixed, permanent magnetic field. But this would imply longitudinal electric waves, which no one has ever seen.

There is the perennial argument as to which pair of fields, (A, φ) or (E, H), is more fundamental. Maxwell referred to A as the ‘electrokinetic momentum,’ which “may even be called the fundamental quantity in the theory of electromagnetism.” Hertz and Heaviside disagreed. And Heaviside even went so far as to express his desire to “murder” Maxwell’s “monster.” We have seen that Proca’s equations lead at once to London’s equation, (11.5.38), and the Yukawa equation (11.5.42), provided the magnetic field remains steady. Otherwise, (11.5.41) will not reduce to it. If we consider the pair (A, φ) as fundamental, in order to get transverse waves φ must be steady, while in order to get longitudinal waves H must vanish. This is contained in the first two equations of (P′). Rather, if we consider the pair (E, H) to be fundamental, we need φ to vanish, while in order to get longitudinal waves H must remain steady. Thus, the question of whether the waves are transverse or longitudinal lies in the nature of φ
and H. The crucial point, therefore, is not which pair is more fundamental, but rather the nature of the magnetic field. If H propagates along with E we have induction and the waves are transverse; if H is steady, or vanishes, E is left to propagate alone and the waves are necessarily longitudinal.

Maxwell’s equations, on the other hand, are less clear-cut, for they allow the coexistence of ∇ · A and ∇ × A. But, and this is a big ‘but,’ they must contribute equally so that their difference is the Laplacian. If we can get rid of the potentials, then $\frac{1}{8\pi}(E^{2} + H^{2})$ and $\frac{1}{4\pi}(E\times H)$ are the electromagnetic energy density and momentum, respectively. If we get rid of the fields, then $\frac{1}{2}(\phi^{2} + A^{2})$ and $\phi A$ are the energy density and momentum, respectively. However, if the Proca equations (P) and (P′) hold, the field energy density,

$$\frac{1}{8\pi}\left[E^{2} + m^{-2}(\nabla\cdot E)^{2} + A^{2} + m^{-2}(\nabla\times A)^{2}\right], \tag{11.5.45}$$
and momentum density,

$$E\times(\nabla\times A) = (\nabla A)\cdot E - E\cdot(\nabla A), \tag{11.5.46}$$
contain only the fields E and A. The energy density of E has a compressional contribution, while that of A has a rotational contribution. The linear momentum, (11.5.46), may not look like much, but its moment, or the (total) angular momentum, is most suggestive. Taking the moment of the terms gives

$$r\times(\nabla A)\cdot E + E\cdot(\nabla A)\times r = r\times(\nabla A)\cdot E + \nabla\cdot(EA\times r) + E\cdot(\nabla r)\times A + r\times A(\nabla\cdot E).$$

Now, whatever the form of the divergence of E, whether it be given by Gauss’s law, or vanish for a solenoidal field, or be given by Proca’s equation in terms of the scalar potential, it cancels out in the above formula. Since ∇r = 1, the unit dyadic, and we can neglect the divergence since its integral vanishes by assumption,
the total angular momentum density is found to be

$$J = r\times(\nabla A)\cdot E + E\times A \equiv L + S, \tag{11.5.47}$$
where L and S are the ‘orbital’ and ‘spin’ angular momenta densities, respectively. We have come across the latter in (11.1.23). The total angular momentum, (11.5.47), is expressed solely in terms of E and A. We may thus consider the pair E and A as fundamental, which are parallel in free space, but out of phase, and the other pair, φ and H, as the conditioning potential and field.

Moreover, given (11.5.41) we can easily transform the space part to zero, so that (11.5.43) reduces to ω = ±m. This is on account of the fact that (11.5.43) guarantees that (κ, ω) is a time-like four-vector. More drastic action is required to transform the time part to zero, i.e. assume H remains steady so as to produce longitudinal waves. By adding a vector source to (11.5.41),

$$(\nabla^{2} - \kappa^{2})H = -\nabla\times J, \tag{11.5.48}$$

where $\kappa^{2} = \partial^{2}/\partial t^{2} + m^{2}$, we can again write the solution in the form of Yukawa’s potential, (11.5.44); only this time we have

$$H = \mathrm{pan}\,\mathrm{curl}\,J = \int \frac{e^{-\kappa r}}{r}\,\nabla\times J\,dV. \tag{11.5.49}$$

Independent of what κ represents, (11.5.49) says that the magnetic force is the pan-potential of the curl of the impressed current. Since pan curl = curl pan, we can write (11.5.49) in terms of the vector potential,

$$A = \int \frac{e^{-\kappa r}}{r}\,J\,dV. \tag{11.5.50}$$

If the propagation is isotropic and dispersionless, m = 0, and the value of the source J is taken to be the value of the impressed current at time t − r, then (11.5.50) is the retarded potential. This was already appreciated by Heaviside before the turn of the last century.[k]

[k] As we know from Sec. 4.1.1, Eq. (4.1.10), Liénard and, slightly later, Wiechert also introduced retarded potentials about the same time.

But what Heaviside failed to appreciate was that in the presence of dispersion, m ≠ 0, the range of
the retarded potential is curtailed, as a comparison of (11.1.16) and (11.5.50) readily shows. That was Yukawa’s contribution: massive mediators of vector meson interactions have a finite range.

Although a threshold is needed, the effects of such symmetry-breaking are irrelevant if the threshold is high enough. And although this is the universally accepted interpretation, it would mean we expect a symmetry-breaking anytime there is a threshold that separates thermal from non-thermal radiation. Should such a transition be accompanied by symmetry-breaking or by a longitudinal component? Bass and Schrödinger [55] expressed this in the following terms:

If these L-[longitudinal] waves contributed to the heating and pressure effects of black-body, we should expect the constant of Stefan’s law, the constant in front of Planck’s formula, and the measured radiation pressure to be 3/2 times the values we actually find for them. Our actual findings might thus be construed to indicate that we are faced with the limiting case of rest-mass zero. But this would be a poor and, so we believe, a wrong solution to the dilemma. In a reasonable theory we cannot admit even hypothetically that a certain type of modification of Maxwell’s equations, however small, would produce the above grossly discontinuous changes. Even if we had it ‘from the horse’s mouth’ that in Nature the limiting case is realized, we should still feel the urge to adumbrate a theory which agrees with experience on approaching to the limit, not by a sudden jump at the limit.
There are echoes here of how Schrödinger took the continuous limit of a finite-difference equation to arrive at his continuous wave equation [Lavenda 00], even though, fourteen years before his discovery, he had argued that Nature never goes to the limit [cf. footnote n of this chapter]. So Schrödinger would not look too kindly on a mechanism in which there is a “sudden jump at the limit.” There is more than one way to ‘skin a cat,’ and the analogy with the Meissner effect in order to generate mass is not the most aesthetically pleasing one.

In the early 1970s it was believed that the dispersion relation (11.5.43) could place bounds on the mass of the photon [Jackson 75]. If we write it as

$$\omega^{2} = \omega_{0}^{2} + m^{2}, \tag{11.5.51}$$
where ω0 is the frequency of a lumped LC circuit, then the smaller the frequency ω0 , the larger will be the fractional difference between ω and ω0 , so that it would provide a limit for the photon mass. The idea was to
measure the resonant frequencies of a series of circuits whose frequencies $\omega_0$ are in known ratios. If the observed frequencies ω were not in the same ratio, it would be evidence for a finite m in (11.5.51). Two circuits were compared: one with an inductance, L, and capacitance, C, and another circuit with the same L, but two capacitances, C, in parallel. In the first circuit, $\omega_{0}^{2} = 1/LC$, while the square of the observed frequency of the second circuit would be in the ratio 2 : 1 with respect to the square of the frequency of the first circuit, to within experimental error and after correcting for resistance effects. Thus, an upper limit to the photon mass could be inferred.

The fly in the ointment is the observation that any lumped circuit is incapable of setting limits on the photon mass [Jackson 75]. A two-terminal box has a current, I, at one terminal and a voltage, V, between the terminals given by $I = C\dot{V}$. A lumped inductance two-terminal box has a voltage $V = -L\dot{I}$. When two such boxes are connected they have a common I and V, so that inserting the latter into the former gives $I = -LC\ddot{I}$. The only thing that can be surmised from the combined system is the resonant frequency $\omega_{0} = 1/\sqrt{LC}$, and nothing more.
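The arithmetic behind the proposed two-circuit test can be sketched as follows. This illustrates only the premise of the experiment, namely that (11.5.51) would spoil the 2 : 1 ratio if it applied to lumped circuits; it does not address Jackson's objection. The component values and the mass parameter (expressed in frequency units) are arbitrary assumptions, and the function names are hypothetical.

```python
def bare_frequency_sq(inductance, capacitance):
    """Square of the lumped LC resonant frequency, omega_0^2 = 1/(L C)."""
    return 1.0 / (inductance * capacitance)

def observed_frequency_sq(omega0_sq, m):
    """Relation (11.5.51): omega^2 = omega_0^2 + m^2, with m in frequency units."""
    return omega0_sq + m * m

def frequency_sq_ratio(inductance, capacitance, m):
    """Ratio of squared frequencies: circuit with one capacitor C over
    the circuit with two capacitors C in parallel (total 2C)."""
    one_cap = observed_frequency_sq(bare_frequency_sq(inductance, capacitance), m)
    two_caps = observed_frequency_sq(bare_frequency_sq(inductance, 2.0 * capacitance), m)
    return one_cap / two_caps

L_HENRY, C_FARAD = 1.0e-3, 1.0e-6  # illustrative component values (assumed)
for m in (0.0, 1.0e3):
    print(f"m = {m}: ratio of squared frequencies = {frequency_sq_ratio(L_HENRY, C_FARAD, m):.6f}")
```

With m = 0 the ratio is exactly 2; any finite m pulls it below 2, and the deviation grows as the bare frequencies are lowered.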
11.5.4 Phase and mass
Mass can also creep into Maxwell’s equations via phase relations. We return to the Proca equations in the case of transverse waves. Setting φ = 0 and assuming that all fields can be expressed as plane waves with frequency ω and wave number κ, the last relation in (P′) is

$$E = -i\,\frac{\omega}{m}\,A. \tag{11.5.52}$$

Now, by Ohm’s law, E is proportional to the current J, at least if the space is isotropic. A is parallel to J because that is what creates it. So E ∥ A, at least in empty space [cf. Sec. 11.5.1 where this is not true]. But (11.5.52) tells us something more. Because of the i, the two fields will be out of phase by π/2. When this is substituted into Proca’s modification of the first circuital equation, we get

$$\epsilon_{0}\dot{E} = \nabla\times H + i\,\frac{m^{2}}{\omega}\,E. \tag{11.5.53}$$
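The quarter-period phase lag implied by (11.5.52) can be read off directly from complex amplitudes. A small sketch, with arbitrary sample values for ω, m and the amplitude of A (all assumptions); `e_from_a` is a hypothetical helper that simply encodes (11.5.52):

```python
import cmath
import math

def e_from_a(a, omega, m):
    """Complex amplitude of E from that of A via (11.5.52): E = -i (omega/m) A."""
    return -1j * (omega / m) * a

omega, m = 3.0, 0.5          # sample frequency and mass (assumed values)
a = cmath.exp(0.7j)          # arbitrary unit-modulus amplitude for A
e = e_from_a(a, omega, m)

print("phase of E relative to A:", cmath.phase(e / a))
print("minus pi/2 for comparison:", -math.pi / 2)
print("amplitude ratio |E|/|A|:", abs(e) / abs(a))
```

The relative phase is independent of the amplitude chosen for A, and the amplitude ratio is ω/m.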
Were it not for the imaginary factor, the last term in (11.5.53) might be construed as a constitutive relation for the field current, J, e.g. a form of ohmic relation. Phase information is obtained from curls and cross products. Multiplication by i simply means that a right-hand rotation has been performed, a rotation of x → y about z, say. And since the curl and cross product are subject to the right-hand screw rule, we can interpret X × Y as the complex number ±iXY. It would, therefore, make more sense to write the current as an additional curl of the magnetic field. In this way mass would enter as phase information. However, this would make Maxwell’s equations lopsided. If a current is added to the second of Maxwell’s equations, viz.

$$\mu_{0}\dot{H} = -\nabla\times E - J_{m}, \tag{11.5.54}$$

it would require the auxiliary relation $\mu_{0}\nabla\cdot H = \rho_{m}$ in order that the continuity equation be satisfied. But what would happen if the permeability, $\mu_{0}$, were to vary in space? The permeability is analogous to inertia in Maxwell’s equations, and if it were to vary it would be equivalent to a varying index of refraction. Maintaining the solenoidal character of H, the divergence of (11.5.54) would lead to the continuity equation, $\dot{\rho}_{m} = -\nabla\cdot J_{m}$, where the magnetic charge density is $\rho_{m} = \nabla\mu_{0}\cdot H$, corresponding to magnetic charges, or magnetons in Heaviside’s terminology.

The restoration of symmetry in Maxwell’s circuit equations comes, however, at the cost of introducing a magnetic charge density, $\rho_{m}$, and its accompanying current density, $J_{m}$. Heaviside even went so far as to attribute the name ‘duplex equations’ to the symmetrized set, and said that $\rho_{m} = 0$ was merely an ‘experimental input.’ The concept of a magnetic pole is not repugnant in itself; Maxwell and both his predecessors and followers used it freely. It is like the Faraday tube, whose relation to the aether led to its demise. Recall from Sec. 5.4.3 that the Faraday tube was essential to J. J. Thomson’s conclusion that motion increases inertia. The demise of the free magnetic pole, by contrast, was ushered in by the discovery of the electron, and the failure of experiments to discover the corresponding magnetic counterpart.
Observing that

$$\frac{|H|}{|E|} = \frac{|\kappa|}{\omega} < 1,$$

where the inequality follows from the dispersion relation, (11.5.51), we can transform (11.5.53) and (11.5.54) into

$$\epsilon_{0}\dot{E} = \nabla\times H + imH, \tag{11.5.55a}$$

$$\mu_{0}\dot{H} = -\nabla\times E + imE, \tag{11.5.55b}$$

by setting κ = m. Equations (11.5.55a) and (11.5.55b) combine to give the Klein–Gordon equation (11.5.41), for either E or H, e.g. $\ddot{E} = -\nabla\times\nabla\times E - m^{2}E$, where we set the product $\epsilon_{0}\mu_{0} = 1$. The phase terms are what is necessary to ‘couple’ the two relations, as we will now show. If we define a complex vector, $F^{(\pm)} = E \pm iH$, Maxwell’s equations can be written in the compact form,

$$\frac{\partial F^{(\pm)}}{\partial t} \pm i\nabla\times F^{(\pm)} = 0. \tag{11.5.56}$$

Furthermore, if we include the phase relations contained in (11.5.55a) and (11.5.55b), we get the coupled set of equations,

$$\dot{F}^{(\pm)} \pm L\cdot\nabla F^{(\pm)} - mF^{(\mp)} = 0, \tag{11.5.57}$$

where the j = 1 angular momentum matrices are

$$L_{x} = i\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad L_{y} = i\begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \qquad L_{z} = i\begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
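That these three matrices do form a j = 1 representation, and that L · k acts on a vector as i k ×, which is what links (11.5.57) to the curl in (11.5.56), can be verified with a few lines of plain Python. This is a sketch only; the helper names are hypothetical and the test vectors are arbitrary.

```python
def scale(c, a):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(3)] for i in range(3)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    return sub(matmul(a, b), matmul(b, a))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(3) for j in range(3))

# The matrices as given in the text.
LX = scale(1j, [[0, 0, 0], [0, 0, -1], [0, 1, 0]])
LY = scale(1j, [[0, 0, 1], [0, 0, 0], [-1, 0, 0]])
LZ = scale(1j, [[0, -1, 0], [1, 0, 0], [0, 0, 0]])
IDENTITY = [[1 if i == j else 0 for j in range(3)] for i in range(3)]

def l_dot_k_on(k, f):
    """Apply (L . k) to the vector f."""
    mat = add(add(scale(k[0], LX), scale(k[1], LY)), scale(k[2], LZ))
    return [sum(mat[i][j] * f[j] for j in range(3)) for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

casimir = add(add(matmul(LX, LX), matmul(LY, LY)), matmul(LZ, LZ))
print("[Lx, Ly] = i Lz:", close(commutator(LX, LY), scale(1j, LZ)))
print("L^2 = j(j+1) with j = 1:", close(casimir, scale(2, IDENTITY)))
print("(L.k) F:", l_dot_k_on([0, 0, 1], [1, 2, 3]))
print("i k x F:", [1j * c for c in cross([0, 0, 1], [1, 2, 3])])
```

The last two printed vectors should coincide, showing that for a plane wave the operator L · ∇ in (11.5.57) reproduces i∇ × in (11.5.56).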
The rationale behind (11.5.57) is analogous to Weyl’s equations, which couple the left- and right-handed portions of the Dirac wave functions [cf. Sec. 11.6.1 below]. If the fields were uncoupled, then with the mass, m ≠ 0, the particle would propagate at a velocity less than c. By performing
a Lorentz boost, we can overtake the particle and therefore invert its helicity. The helicity would not then be a property of the particle. The coupling would then be proportional to the mass, and this is what (11.5.57) says.
11.5.5 Compressional electromagnetic waves: Helmholtz’s theory
It is well known that plane waves cannot support compressional waves, so the addition of an extra current in Maxwell’s second circuit equation has not really been instructive. Long before Proca modified Maxwell’s equations, our old friend Heaviside, after listing all the reasons why Maxwell should not have introduced longitudinal waves into his theory, asked “Why, then, should he spoil his work by introducing longitudinal waves?” Since competing theories to Maxwell’s, like Hermann Helmholtz’s theory, did include longitudinal waves, Heaviside was prompted to look into the consequences of such waves, if for nothing else than to find arguments with which to criticize Helmholtz’s theory.

Helmholtz’s theory was purported to be a generalization of Maxwell’s theory to include longitudinal waves, if the need ever arose. Many of the nineteenth century ‘giants’ thought that longitudinal waves would explain the recently discovered X-rays. Among those was Ludwig Boltzmann, who claimed that “Whether the longitudinal oscillations and the other generalizations, which Helmholtz had added to Maxwell’s theory, are of great importance or not, is a question that the present stage of science is unable to decide.” Regardless of whether or not Nature actually did admit longitudinal waves in electromagnetism, Heaviside denied their existence outright: “No one has the right to trifle with Maxwell’s equations this way.” To distinguish between Helmholtz’s and Boltzmann’s interpretation of Maxwell, Heaviside referred to his interpretation as “my Maxwell.” Nature did bow to Maxwell, and it was Heaviside who carried the day, but not so in other areas outside his field of expertise.[l]

[l] Having settled his squabble with Boltzmann, Heaviside was ready to take on another soon-to-be-famous German, Max Planck. But this time it was thermodynamics and not electrodynamics that would be the area of contention, and, in particular, the notion of entropy. Here, Heaviside intervened in the nasty debate between Perry and Swinburne, past and present Presidents of the Institution of Electrical Engineers, the latter asserting the textbook definition of entropy to be “fundamentally wrong” [Nahin 88]. Planck was called on to adjudicate, and he wholly sided with Swinburne. Planck had a go at Heaviside’s ‘ghostly’ reference to the entropy’s tendency to increase. Heaviside rather favored Kelvin’s concept of the “universal dissipation of energy,” tending to a minimum, as opposed to Boltzmann’s universal tendency for the entropy to increase. Needless to say, Planck had the upper hand, and left Heaviside, on a rare occasion, speechless. This goes to show that once outside their area of expertise, these geniuses of science were prosaic.

A force moving through the aether would meet with resistance arising from the stresses in the medium: shear, rotation and compression. Disregarding shear, the force, $\rho\ddot{G}$, where ρ is the density of the aether, would be given in terms of the generalized spatial displacement, G, by

$$\rho\ddot{G} = \eta\nabla(\nabla\cdot G) - \nu\nabla\times(\nabla\times G). \tag{11.5.58}$$

The elastic constants, η and ν, are those of compression and rotation, just as in (11.5.33) with $F = \rho\ddot{G}$.

In contrast to J. J. Thomson’s argument of the increase in inertia due to motion in Sec. 5.4.3, where a charge in motion creates a magnetic field which is proportional to the velocity, so that the magnetic energy is proportional to the kinetic energy at low speeds, we follow Heaviside and consider E to be the velocity of the aether, $\dot{G}$. Had we chosen H to represent the velocity, as Larmor insisted, there would be no discussion between transversal and longitudinal waves. Only the former would exist since there are no magnetic charges (poles). Their nonexistence did not stop the likes of Heaviside from including them, if for nothing else than for the sake of symmetry. Whether or not they exist is just a matter of experimental input, like the laws of electrodynamics: “If they are valid at any speed, then there is nothing to prevent speeds of motion greater than light.” However, Heaviside criticized this choice of the velocity, for

if H is velocity, the case is far worse, for an impossibility is involved. The electric force becomes rotation or proportional thereto, and the impossibility is that we need to have E both circuital and polar at the same time roundabout an isolated charge!

With E as velocity,

$$-\nu\nabla\times G = H, \tag{11.5.59}$$
(11.5.58) would give the first circuital equation, while the time derivative of (11.5.59) would give the second one,

$$-\nabla\times E = \mu_{0}\dot{H}, \tag{11.5.60}$$

if $\nu = \mu_{0}^{-1}$. Using the third auxiliary relation in (P′), we can write the first circuital relation in terms of A alone, viz.

$$\ddot{A} = \eta\nabla(\nabla\cdot A) - \nu(\nabla\times\nabla\times A), \tag{11.5.61}$$
provided the scalar potential satisfies the wave equation, $\ddot{\phi} = \eta\nabla^{2}\phi$. The scalar potential, φ, would propagate at a speed $\sqrt{\eta}$, which could be greater than that of light.

The presence of a convergence (= −divergence) in (11.5.61), although analogous to a hydrostatic pressure, does not necessarily mean that there is a longitudinal mode. For an incompressible fluid that satisfies Euler’s equation, the vanishing of the divergence of the velocity is the condition that the fluid velocity is everywhere normal to the wave vector, and, consequently, the wave is transverse. Such a term is in the Lorentz gauge, which is the first equation in (P′). In fact, combining the Lorentz gauge with

$$\dot{A} + E + \nabla\phi = 0, \tag{11.5.62}$$

results in

$$\ddot{A} = \nabla(\nabla\cdot A) - \nabla\times(\nabla\times A). \tag{11.5.63}$$
This is precisely (11.5.61) with η = ν = 1 and our (11.5.15), so Maxwell’s equations do not require a vanishing coefficient of compression! Recalling our discussion in Sec. 11.5.3, folklore has it that the transition to superconductivity occurs with the creation of a longitudinal mode with the simultaneous appearance of a finite mass. If what we have said in Sec. 11.1.4 about two velocities of propagation is correct, then there can be no coexistence of transverse and longitudinal modes of propagation at a single velocity. As in the case of a sound wave, the fluid velocity is in the direction of propagation, and, consequently, sound waves are longitudinal. Furthermore, they need a medium to propagate in. If E is the velocity, then the only way to get
longitudinal propagation is to have H polar or vanish, as Heaviside clearly realized.

Multiplying (11.5.61) through by $\dot{A}$ and performing several integrations by parts we get

$$\frac{1}{2}\frac{d}{dt}\left\{\dot{A}^{2} + \nu(\nabla\times A)^{2} + \eta(\nabla\cdot A)^{2}\right\} = -\nabla\cdot\left\{\nu(\nabla\times A)\times\dot{A} - \eta\dot{A}(\nabla\cdot A)\right\}. \tag{11.5.64}$$

The term $\frac{1}{2}\dot{A}^{2}$ is the kinetic energy density, $\frac{1}{2}(\nabla\times A)^{2}$ is the rotational energy density, while $\frac{1}{2}(\nabla\cdot A)^{2}$ is the energy density of compression [cf. (11.5.20)]. In fact, as we have already mentioned, −∇ · A can be likened to a static pressure. And a pressure exists both when the fluid is incompressible as well as when it is compressible. Moreover, Maxwell’s theory predicts a pressure due to radiation.

The condition that the flow be incompressible is ∇ · E = 0, because we have taken E to be the velocity. Thus, there exists a velocity potential whose gradient is E, and which satisfies Laplace’s equation. In view of condition (11.5.62) this means that $\nabla^{2}\phi = -\nabla\cdot\dot{A} = 0$, and, hence, the third term on the left-hand side of (11.5.64) drops out. That is, there is no rate of change of the compressional energy, so for all intents and purposes it does not exist. According to Heaviside, when the aether is incompressible, η becomes infinite and ∇ · A tends to zero in such a manner that the pressure remains finite. However, A is not E so that the same restriction placed on the latter does not necessarily apply to the former. Moreover, if E is a gradient, the second term on the left-hand side of (11.5.64) is also zero, because H “remains steady in time and place.” E cannot be incompressible flow for, otherwise, the second circuital law, (11.5.60), would vanish. So Maxwell’s aether is not incompressible, but one in which the elastic constants of rotation and compression just happen to be equal.

What if we go to the other extreme where the motion is entirely compressible? Equation (11.5.58) becomes

$$\ddot{A} = \eta\nabla^{2}A, \tag{11.5.65}$$
if ∇ × A = 0 [cf. Sec. 11.1.4], so there is no magnetic field. With φ = 0, (11.5.65) can be split up into the pair of equations:

$$\dot{E} + \eta\nabla(\nabla\cdot A) = 0, \qquad \frac{\partial}{\partial t}(\nabla\cdot A) + \nabla\cdot E = 0. \tag{Z}$$

The second equation follows from (11.5.62) upon taking its divergence. The strains are most easily characterized by what are known as ‘surfaces of discontinuity,’ where a displacement, or its derivatives, may experience a discontinuity, or jump. Using Christoffel’s notation we will indicate a discontinuity in G by [G]. If the body is not torn apart by the strain, we may well suppose that the normal component of the displacement be continuous, or that [G · n] = 0, where n is the unit normal to the surface, S. In contrast, the displacement tangential to the surface may undergo a discontinuity, [T] ≠ 0, where T = G − (G · n)n.

Let E and ∇ · A be continuous across the surface S, whereas the acceleration of the displacement, G, and the gradient of ∇ · A are discontinuous. Writing the first equation in (Z) on both sides of the surface S and subtracting one from the other results in

$$\left[\frac{dE}{dt}\right] = -\eta[\nabla(\nabla\cdot A)].$$

Since [E] = 0, its divergence [∇ · E] = m · n, where the vector m characterizes the discontinuity. Also, since the pressure is continuous across the discontinuity, [∇ · A] = 0, but its gradient is not, [∇(∇ · A)] = αn, where α is a scalar which we will shortly determine. The kinematical condition of compatibility, upon taking the time derivative of both sides, yields

$$\left[\frac{d}{dt}(\nabla\cdot A)\right] = -\alpha u,$$
where $u$ is the speed of propagation of the wave. Now, by the discontinuity in the second equation in (Z),

$$\left[\frac{d}{dt}(\nabla\cdot\mathbf{A})\right]=-[\nabla\cdot\mathbf{E}],$$

there results

$$\alpha=\frac{\mathbf{m}\cdot\mathbf{n}}{u},$$

and, consequently,

$$[\nabla(\nabla\cdot\mathbf{A})]=\frac{\mathbf{m}\cdot\mathbf{n}}{u}\,\mathbf{n}.$$

Introducing this last condition into the first equation in (Z), together with the additional kinematic condition of compatibility,

$$\left[\frac{d\mathbf{E}}{dt}\right]=-u\,\mathbf{m},$$

we obtain the condition $u^2\mathbf{m}=\eta(\mathbf{m}\cdot\mathbf{n})\mathbf{n}$. There are only two ways that this condition can be satisfied: (i) The vector $\mathbf{m}$ is normal to $S$ so that the speed is

$$u=\sqrt{\eta}, \tag{11.5.66}$$

or (ii) $\mathbf{m}$ is tangent to the surface so that $\mathbf{m}\cdot\mathbf{n}=0$ requires $u=0$. This proves a theorem of Hugoniot: There are only two kinds of discontinuities in a compressible, non-viscous fluid: longitudinal discontinuities which propagate with speed (11.5.66), and transversal discontinuities which do not propagate at all.
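The compatibility condition $u^2\mathbf{m}=\eta(\mathbf{m}\cdot\mathbf{n})\mathbf{n}$ can be checked numerically. The sketch below is only an illustration: the value of `eta`, the jump vectors, and the helper `jump_residual` are assumptions chosen for the example and do not come from the text.

```python
import math

def dot(a, b):
    """Euclidean scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def jump_residual(u, m, n, eta):
    """Components of u^2 m - eta (m.n) n; all vanish when the condition holds."""
    mn = dot(m, n)
    return [u * u * mi - eta * mn * ni for mi, ni in zip(m, n)]

eta = 2.5                    # illustrative constant (assumed value)
n = (0.0, 0.0, 1.0)          # unit normal to the surface S

m_long = (0.0, 0.0, 0.7)     # jump along n: longitudinal discontinuity
m_trans = (0.3, -0.4, 0.0)   # jump tangent to S: transverse discontinuity

# (i) normal jump propagates with u = sqrt(eta)
res_long = jump_residual(math.sqrt(eta), m_long, n, eta)
# (ii) tangential jump satisfies the condition only for u = 0
res_trans = jump_residual(0.0, m_trans, n, eta)
# a tangential jump moving at sqrt(eta) violates the condition
res_bad = jump_residual(math.sqrt(eta), m_trans, n, eta)

print(res_long, res_trans, res_bad)
```

The first two residuals vanish to rounding error, while the third does not, which is the content of the two cases (i) and (ii).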
Thus, by eliminating the magnetic field, and demanding that A be irrotational we have made longitudinal electrical waves. The wave equation (11.5.65) is expressed in terms of the vector potential, A. Now, if A were
solenoidal we could write (11.5.65) as

$$\eta\nabla\times\nabla\times\mathbf{E}=-\ddot{\mathbf{E}}.$$

This equation can further be split into two equations of first order,

$$-\nabla\times\mathbf{E}=\dot{\mathbf{H}},\qquad \eta\nabla\times\mathbf{H}=\dot{\mathbf{E}}.$$

The curl suffers a discontinuity, $[\nabla\times\mathbf{v}]=\mathbf{n}\times\mathbf{m}$. Thus, in the first case where $\mathbf{m}$ is normal to the surface, $[\nabla\times\mathbf{v}]=0$, while in the second case where $\mathbf{m}$ is tangent to the surface, $[\nabla\times\mathbf{v}]\neq 0$. Hence, for transverse waves, the vortex velocity $\tfrac12(\nabla\times\mathbf{v})$ jumps, with $\mathbf{m}$ tangent to $S$, while $\nabla\cdot\mathbf{v}$ is continuous. For longitudinal discontinuities the opposite occurs. In the former case, the longitudinal discontinuities are not propagated at all. We can, therefore, have either transverse or longitudinal discontinuities, but not both at the same time. In Helmholtz’s theory, the first circuital law appears in the form,

$$\nabla\times\mathbf{H}=\epsilon_1\dot{\mathbf{E}}-\epsilon_0\nabla\dot{\varphi}, \tag{11.5.67}$$
where, in general, $\epsilon_1$ and $\epsilon_0$ are two components of a dielectric constant $\epsilon$, i.e. $\epsilon_1+\epsilon_0=\epsilon$. The dielectric constant, $\epsilon_1$, is a property of matter, while $\epsilon_0$ belongs to the aether. As Heaviside was quick to point out, the potential term allows the electric current to have divergence. It also allows for the establishment of longitudinal waves without the contradiction that the current be stationary. That is, if we introduce (11.5.62) and use the Lorentz gauge,

$$\dot{\varphi}=-\nabla\cdot\mathbf{A}, \tag{11.5.68}$$
we could get (11.5.63) without the second term if $\mathbf{A}$ were irrotational. The second term in (11.5.67) avoids this contradiction, and gives a wave equation,

$$\nabla\times\nabla\times\mathbf{A}=-\epsilon_1\ddot{\mathbf{A}}-\epsilon\nabla\dot{\varphi}, \tag{11.5.69}$$
where we used $\epsilon_1+\epsilon_0=\epsilon$. Imposing the Lorentz gauge, (11.5.68), on (11.5.69) gives

$$\nabla^2\mathbf{A}=\epsilon_1\ddot{\mathbf{A}}. \tag{11.5.70}$$

Such a wave travels at superluminal speeds since $1/\sqrt{\epsilon_1}>1/\sqrt{\epsilon}$. So depending on whether $\mathbf{A}$ is irrotational or solenoidal, (11.5.67) gives rise to longitudinal or transverse waves, with the former traveling faster than the latter. Since the former are at variance with Maxwell’s theory we are forced to set $\epsilon_0=0$, and with it we must abandon Helmholtz’s generalization of Maxwell.
11.5.6 Directed electromagnetic waves
Mass can also creep in very simply in the propagation of electromagnetic waves along a cable through the assumption that the fields are periodic both in time and along the axis of propagation. We choose the $z$-axis as the axis of symmetry along the cable. The radial coordinate, $r$, measures the distance to any point from the center of the cable. We have discussed such an example in Sec. 11.1.6, but consider it here from a different perspective. There are two components of the electric field: $E$ is along the axis of symmetry, and $F$ points in the radial direction. This fixes the magnetic field to be circular about the symmetry axis. Maxwell’s equations are thus given by

$$\dot{E}=\operatorname{curl}_z H=\frac{1}{r}\frac{\partial}{\partial r}(rH),\qquad \dot{F}=\operatorname{curl}_r H=-\frac{\partial H}{\partial z},\qquad -\dot{H}=\operatorname{curl}_{\phi} E=\frac{\partial F}{\partial z}-\frac{\partial E}{\partial r}. \tag{C}$$
These equations can give wave equations by increasing the order and eliminating one of the variables. For instance, the longitudinal component of the electric field will satisfy

$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\frac{\partial E}{\partial r}\right)+\frac{\partial^2 E}{\partial z^2}=\ddot{E}. \tag{11.5.71}$$
Now assume, as is usually done, that the electric and magnetic fields are simply periodic with respect to $z$ and $t$ so that each will have the form of a plane wave $e^{i(\omega t-mz)}$. The wave equation (11.5.71) will then reduce to Bessel’s equation of order zero,

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dE}{dr}\right)+\left(\lambda^{-2}-m^2\right)E=0, \tag{11.5.72}$$
where $E$ depends only on $r$, $\lambda=\omega^{-1}$, and we have chosen the wavelength along the $z$-axis to be the Compton wavelength, $m^{-1}$. There are two options: (i) If $m^{-1}>\lambda$, the solution will oscillate as a Bessel function of order zero. If the waves were not controlled by the wire we should have $\lambda=m^{-1}$ in the dielectric. (ii) If $\lambda>m^{-1}$, the disturbance decays exponentially in the radial direction. The solution to (11.5.72) is then a modified Bessel function of order zero.

Mass has entered through the assumption that the longitudinal component of the electric field is periodic along the axis of symmetry with a wavelength given by the Compton wavelength. This does not destroy the transverse nature of Maxwell’s equations. The Compton wavelength is the smallest wavelength that the cable is capable of supporting, and its relation to that of the electromagnetic radiation will determine the nature of the propagation of radial disturbances. Even more can be said. Instead of introducing the Compton wavelength, we use the wave number $\kappa$. Then, (11.5.72) becomes

$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dE}{dr}\right)+(\omega^2-\kappa^2)E=0. \tag{11.5.73}$$
We know that the relativistic dispersion equation (11.5.43) applies. This would identify ω in the second expression of (11.5.73) with the inverse Compton wavelength, and hence with the mass, while κ is related to the momentum through the de Broglie relation. But, by the very definition of
the operator of angular momentum, the first term should be proportional to the square of the momentum.[footnote m] That momentum and mass can be construed in two different ways makes it reasonable to consider mass to have tensorial components while momentum can be treated as a scalar, instead of the other way round. We write the coefficient of the second term in (11.5.73) as

$$K^2=(\omega^2-\kappa^2)=\omega^2\left(1-\frac{1}{v^2}\right), \tag{11.5.74}$$
by the definition of the phase velocity $v=\omega/\kappa$. For a given mode of vibration there is a lowest possible frequency, corresponding to infinite wavelength, $\kappa=0$, viz. $\omega_{\rm crit}=K$, which defines $K$ in (11.5.74). Now, rearranging (11.5.74) to read $\omega^2=\kappa^2+K^2$, differentiating with respect to $\kappa$, and using the definition of the group velocity, $u=d\omega/d\kappa$, we find $1=uv$. Thus, if $u<1$, we come to the inevitable conclusion that the phase velocity exceeds the velocity of light, which is a well-known property of waves on wires [Brillouin 60]. Now, $u<v$ implies $m^{-1}>\lambda$.

Footnote m: Dirac [47, p. 153] notes a problem on transforming from Cartesian to spherical coordinates in that the commutator “makes $p_r$ like the momentum conjugate to the $r$ coordinate, but it is not exactly equal to this momentum because it is not real. . .. Thus $p_r-ir^{-1}$ is real and is the true momentum conjugate to $r$.” Yet, he goes on to use $p_r=-i\partial/\partial r$ as if it were the true momentum. Actually, the momentum operator will be different in cylindrical and spherical coordinates.

In a letter to Niels Bohr in March 1930, Werner Heisenberg outlined his picture of a ‘lattice world’ [Gitterwelt]. It was Heisenberg’s idea to introduce a fundamental length, $m^{-1}$, where he supposed the mass was that of the proton, for lack of another elementary particle known at that time. Space, according to Heisenberg, is chopped up
into lengths $m^{-1}$ so that (11.5.72) actually stands for the finite difference equation,

$$(\omega^2-m^2)u_n-\frac{u_{n+\ell}-2u_n+u_{n-\ell}}{\ell^2}=0.$$
Since $\omega>m$, or $\lambda<m^{-1}$, the spacing $\ell<m^{-1}$ is required. But a particle cannot be localized to within dimensions smaller than the Compton length, so Heisenberg set $\ell$ equal to the Compton wavelength of the proton. Over distances smaller than $\ell$ causality breaks down and we have no knowledge of what goes on over such distances. Electrons with wavelengths greater than $\ell$ can propagate along the linear lattice. However, what works in one dimension need not work in three. In fact, Heisenberg found that in three dimensions the lattice picture violates energy and momentum conservation, and he dropped the whole idea of a lattice world picture.

We have subsequently resuscitated Heisenberg’s Gitterwelt [Lavenda 00], and have shown that there is more than meets the eye, even in a single dimension. The recursion relation for a modified Bessel function gives rise to a diffusion process. The transformation from a real diffusion process to quantum mechanics consists in replacing the real probability of transition by the probability amplitude for a reversal of the path, just as in Feynman’s formulation in Sec. 11.4.2. It is the phase factor $e^{i\pi/2}$ that guides the motion of the particle, and converts a modified Bessel function into an ordinary Bessel function. The transformation from a probability to a probability amplitude converts the recursion relation into a finite space-time equation which transforms into Schrödinger’s equation in the limit that the spacing between the lattice points tends to zero.[footnote n]

Footnote n: It is truly ironic that the discoverer of the nonrelativistic wave equation would some fourteen years prior be uttering these words: “It is so to speak part of the creed of the atomist that all partial differential equations of mathematical physics. . . are incorrect in a strictly mathematical sense. For the mathematical symbol of the differential quotient describes the transition in the limit to arbitrary small spatial variations, while we are convinced that in forming such ‘physical’ differential quotients we must stop at ‘physically infinitely small’ regions. . .”

However, there are good reasons for keeping the spacing finite, since it avoids all problems of
non-locality, and it provides relativistic corrections to Schrödinger’s equation, like the Sommerfeld relativistic correction to the Balmer term due to fine-structure.
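Returning to the dispersion relation $\omega^2=\kappa^2+K^2$ of this subsection, the product of group and phase velocities can be verified numerically. The following is a minimal sketch under stated assumptions: the cutoff `K = 1.0`, the sample wave numbers, and the finite-difference step are arbitrary illustrative choices, and natural units ($c=1$) are used as in the text.

```python
import math

def omega(kappa, K):
    """Dispersion relation omega^2 = kappa^2 + K^2 in natural units."""
    return math.sqrt(kappa * kappa + K * K)

def phase_velocity(kappa, K):
    """v = omega / kappa."""
    return omega(kappa, K) / kappa

def group_velocity(kappa, K, h=1e-6):
    """u = d(omega)/d(kappa), estimated by a central finite difference."""
    return (omega(kappa + h, K) - omega(kappa - h, K)) / (2.0 * h)

K = 1.0  # cutoff frequency omega_crit (assumed value)
products = []
for kappa in (0.2, 1.0, 5.0):
    v = phase_velocity(kappa, K)
    u = group_velocity(kappa, K)
    products.append(u * v)
    print(f"kappa={kappa}: u={u:.6f}, v={v:.6f}, u*v={u * v:.6f}")
```

For every wave number the group velocity stays below unity, the phase velocity stays above it, and their product equals one to within the finite-difference error.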
11.6 Relativistic Stokes Parameters
In this section we show that the Stokes parameters are analogous to the four-vector of energy–momentum in the case of complete polarization. The density matrix representation of the Stokes parameters will give us insight into the transformation from the Weyl to the Dirac equations, and show what options are available in the interpretation of the components of the density matrix. One of these options will allow us to account for the microwave Lamb shift through the circularly polarized component of the mass, which the Dirac equation is unable to account for. Moreover, all hydrogen-like splittings, such as the hyperfine and fine-structure splittings, occur from negative to positive $k=\pm(j+\tfrac12)$ values, and are characterized by left-hand elliptical polarization.
11.6.1 Weyl and Dirac versus Stokes
Footnote n (continued): From what has been said earlier, Schrödinger was preoccupied with whether there is a ‘jump’ at the approach of continuous physical laws. Luckily for physics he had a change of heart, and went to the continuous limit in 1926.

It is well-known that Weyl’s equations can be derived from splitting the relativistic conservation of energy, $W^2-p^2=m^2$, into

$$(W-\boldsymbol{\sigma}\cdot\mathbf{p})(W+\boldsymbol{\sigma}\cdot\mathbf{p})\psi=(me^{-i\phi})(me^{i\phi})\psi,$$

since $(\boldsymbol{\sigma}\cdot\mathbf{p})(\boldsymbol{\sigma}\cdot\mathbf{p})=p^2$, where $\psi$ is a wave function with two components. The second-order differential equation may be factored, and since the two resulting first-order equations must give a single second-order equation,
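they must be coupled, viz.

$$\begin{pmatrix} W-V+\boldsymbol{\sigma}\cdot\mathbf{p} & me^{-i\phi}\\ me^{i\phi} & W-V-\boldsymbol{\sigma}\cdot\mathbf{p}\end{pmatrix}\begin{pmatrix}\psi_a\\ \psi_b\end{pmatrix}=0. \tag{11.6.1}$$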
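These are the Weyl equations. The potential $V$ makes $W>W_0=\sqrt{p^2+m^2}$, which is analogous to the case of partially polarized light. The matrix in (11.6.1) is the density matrix representation of the Stokes parameters, where the helicity operator is

$$\boldsymbol{\sigma}\cdot\mathbf{p}=\frac{\boldsymbol{\sigma}\cdot\mathbf{r}}{r^2}(\boldsymbol{\sigma}\cdot\mathbf{r})(\boldsymbol{\sigma}\cdot\mathbf{p})=\frac{\boldsymbol{\sigma}\cdot\mathbf{r}}{r^2}\left(\mathbf{r}\cdot\mathbf{p}+i\boldsymbol{\sigma}\cdot\mathbf{r}\times\mathbf{p}\right). \tag{11.6.2}$$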
The last term in (11.6.2) is the spin-orbit interaction, $\boldsymbol{\sigma}\cdot\mathbf{L}$. We are primarily interested in bounded solutions. To this end we consider the spinor,

$$\psi=\begin{pmatrix} f(r)\,Y^{k}_{jm}\\ i\,g(r)\,Y^{-k}_{jm}\end{pmatrix}, \tag{11.6.3}$$

where $Y^{k}_{jm}$ is a generalized spherical harmonic.[footnote o] For each $j$ there are two $Y$’s. These have orbital angular momenta $\ell$ equal to $j-\tfrac12$ and $j+\tfrac12$. The parity of the spherical harmonics is determined by whether $\ell$ is even or odd. Specifying the parity and $j$ uniquely determines $\ell$. Instead of determining these states by parity it will prove more convenient to introduce a new quantum number $k$ defined as $k=\pm(j+\tfrac12)$. If $k$ is positive then $k=\ell=j+\tfrac12$, while if it is negative then $k=-(\ell+1)$, where $\ell=j-\tfrac12$. The spin-orbit interaction acts on the generalized spherical harmonics to give

$$\boldsymbol{\sigma}\cdot\mathbf{L}\,Y^{k}_{jm}=-(k+1)\,Y^{k}_{jm},$$

and

$$\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}\,Y^{k}_{jm}=-Y^{-k}_{jm},$$

where $\hat{\mathbf{r}}$ is the unit vector.

Footnote o: There should be no confusion between the magnetic quantum number and the mass.
With the spinor wave function given by (11.6.3), the Weyl equations (11.6.1) become

$$\left(W+\frac{\alpha}{r}-\frac{1-k}{r}\right)g-\frac{dg}{dr}+mf=0, \tag{11.6.4}$$

$$\left(W+\frac{\alpha}{r}+\frac{1+k}{r}\right)f+\frac{df}{dr}+mg=0, \tag{11.6.5}$$

if we are willing to lose all phase information, in the presence of the Coulomb potential, and $\alpha=1/137$ is the fine-structure constant in the natural units we are using. The loss of phase information will also plague the Dirac equation. We can get rid of the $1/r$ terms by writing $g=G/r$ and $f=F/r$; for then there results

$$\left(W+\frac{\alpha}{r}+\frac{k}{r}\right)G-\frac{dG}{dr}+mF=0,\qquad \left(W+\frac{\alpha}{r}+\frac{k}{r}\right)F+\frac{dF}{dr}+mG=0.$$

For a bounded solution to exist, both $F,G\to 0$ as $r\to\infty$. When $r\to\infty$, the coupled equations reduce to

$$\frac{dG}{dr}=WG+mF,\quad \frac{dF}{dr}=-WF-mG\quad\Rightarrow\quad \frac{d^2G}{dr^2}-\left(W^2-m^2\right)G=0.$$

There can be no bound state solutions for $W<m$. Moreover, as $r\to 0$ the coupled equations reduce to

$$\frac{dG}{dr}-\frac{(\alpha+k)}{r}G=0,\qquad \frac{dF}{dr}+\frac{(\alpha+k)}{r}F=0,$$

again showing that there is no solution vanishing at the origin. It is often said that Dirac’s equation is equivalent to Weyl’s equation, but this is simply not true. By adding and subtracting the two Weyl
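equations, (11.6.1), Dirac obtained

$$\begin{pmatrix}-W & p\\ -p & W\end{pmatrix}\begin{pmatrix}\varphi_+\\ \varphi_-\end{pmatrix}=-m\begin{pmatrix}\varphi_+\\ \varphi_-\end{pmatrix}, \tag{11.6.6}$$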
where $\varphi_+=\psi_b+\psi_a$ and $\varphi_-=\psi_b-\psi_a$. The density matrix determines the evolution of the wave function. It is equivalent to the coupled set of equations

$$\frac{dF}{dr}+\frac{k}{r}F-\left(W+m+\frac{Z\alpha}{r}\right)G=0,\qquad \frac{dG}{dr}-\frac{k}{r}G+\left(W-m+\frac{Z\alpha}{r}\right)F=0.$$

In (11.6.6) the mass now appears in the diagonal terms, instead of in the off-diagonal terms as in Weyl’s equation, (11.6.1). As such it provides no coupling between the spinor components, and goes against the grain of Feynman’s idea that the probability amplitude for a reversal depends on the mass [cf. (11.4.16)]. In Dirac’s theory, like the Klein–Gordon equation, mass is appended onto the energy, and is not an intrinsic part of the field. Unlike the Weyl equations, (11.6.6) admits a bound state solution; for in the limit as $r\to\infty$, the pair of coupled equations reduce to

$$\frac{dG}{dr}+(W-m)F=0,\quad \frac{dF}{dr}-(W+m)G=0\quad\Rightarrow\quad \frac{d^2F}{dr^2}+\left(W^2-m^2\right)F=0.$$

Hence, a bounded solution exists for $W<m$. But this is surprising since we always have $W^2\ge p^2+m^2$. While this is true for optical phenomena and repulsive mechanical systems, it is not true for attractive potentials which can form orbits. There is no optical analog for such problems. Hence for bounded states, both $F$ and $G$ tend to zero with $e^{-\sqrt{m^2-W^2}\,r}$ as $r\to\infty$. This calls for the coordinate change $\xi=\sqrt{m^2-W^2}\,r$, and the pair of
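coupled equations reduce to

$$\frac{dF}{d\xi}+\frac{k}{\xi}F-\left(\frac{1}{\eta}+\frac{Z\alpha}{\xi}\right)G=0,\qquad \frac{dG}{d\xi}-\frac{k}{\xi}G-\left(\eta-\frac{Z\alpha}{\xi}\right)F=0,$$

where

$$\eta=\sqrt{\frac{m-W}{m+W}}.$$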
It is clear from the smallness of $Z\alpha$ that $\eta\ll 1$. The tried and true method of solving the coupled differential equations is to look for a power series solution which we can cut off at a certain point. The series solutions we are looking for are:

$$F=\xi^{\nu}\sum_{n=0}^{\infty}A_n\xi^{n}e^{-\xi},\qquad G=\xi^{\nu}\sum_{n=0}^{\infty}B_n\xi^{n}e^{-\xi},$$

where $\nu$ will be determined by the indicial equation which guarantees that the wave function goes to zero as some power of $\xi$. Introducing the power series into the coupled equations results in

$$(\nu+n+k)A_n-A_{n-1}-\frac{1}{\eta}B_{n-1}-Z\alpha B_n=0, \tag{11.6.7}$$

$$(\nu+n-k)B_n-B_{n-1}-\eta A_{n-1}+Z\alpha A_n=0. \tag{11.6.8}$$

The indicial equations tell us how the wave function behaves as $r\to 0$. Setting $n=0$ and observing that the coefficients are zero for negative indices, we get

$$(\nu+k)A_0-Z\alpha B_0=0,\qquad Z\alpha A_0+(\nu-k)B_0=0.$$
A non-trivial solution demands the determinant of the coupled equations to vanish, i.e.

$$\nu=\pm\sqrt{k^2-(Z\alpha)^2}. \tag{11.6.9}$$
We must choose the positive solution in order to avoid singularities at the origin. Multiplying (11.6.7) by $\eta$, and subtracting (11.6.8) from it, gives

$$B_n=\frac{\eta(\nu+n+k)-Z\alpha}{\nu+n-k+Z\alpha\eta}\,A_n.$$

Substituting this into the recursion relations (11.6.8) yields

$$\frac{A_{n+1}}{A_n}=\frac{(\nu+n+1-k+Z\alpha\eta)\left[2\nu+2n+Z\alpha(\eta-1/\eta)\right]}{(\nu+n-k+Z\alpha\eta)\left[(\nu+n+1)^2-k^2+(Z\alpha)^2\right]}.$$

The asymptotic behavior will be $e^{\xi}$ unless the series is forced to terminate, and this produces an eigenvalue condition. One of the terms in the numerator must be zero, and this determines the integer $n=N$. It is not difficult to see that it is the second term in the numerator,

$$2(\nu+N)-\frac{Z\alpha}{\eta}\left(1-\eta^2\right)=0,$$

or, in terms of the original variables,

$$\nu+N-\frac{Z\alpha W}{\sqrt{m^2-W^2}}=0. \tag{11.6.10}$$
Introducing the value for $\nu$ found in (11.6.9) into (11.6.10), and after rearranging, we get

$$W=m\left[1-\frac{Z^2\alpha^2}{(N+|k|)^2+2N\left(\sqrt{k^2-(Z\alpha)^2}-|k|\right)}\right]^{1/2}. \tag{11.6.11}$$
The formulation does not allow a solution for $N=0$, $k>0$. In fact, by setting $n=N+|k|\ge 1$, where $-n\le k<n$, and $|k|=j+\tfrac12$, the energy
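levels of the Dirac atom can be cast in the form

$$W=m\left[1-\frac{(Z\alpha)^2}{n^2+2\left(n-\left(j+\tfrac12\right)\right)\left(\sqrt{\left(j+\tfrac12\right)^2-(Z\alpha)^2}-\left(j+\tfrac12\right)\right)}\right]^{1/2}.$$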
The energy levels are only a function of two quantum numbers, $n$, the nonrelativistic principal quantum number, and $j$, the sum of the orbital angular momentum and spin. Since the Lamb shift deals with the splitting of different $\ell$’s with the same $j$, it is beyond the Dirac solution to tackle. Finally, expanding the square root in powers of $\alpha$, the Sommerfeld relativistic correction to the Schrödinger equation

$$W-m=-m\frac{(Z\alpha)^2}{2n^2}-m\frac{(Z\alpha)^4}{2n^4}\left(\frac{n}{j+\tfrac12}-\frac34\right)+\cdots$$
is recovered, valid to order $(Z\alpha)^4$. An essentially identical expression was first given by Sommerfeld in 1916, well before Schrödinger’s mechanics and Dirac’s relativistic mechanics were invented. The expression Sommerfeld found was:

$$W+m=m\left[1+\frac{Z^2\alpha^2}{\left(n_r+\sqrt{n_\phi^2-Z^2\alpha^2}\right)^2}\right]^{-1/2},$$
where $W$ denotes the energy of the bound electron after deducting the rest energy $m$, $n_r$ is the radial quantum number, and $n_\phi=j+\tfrac12$ is Sommerfeld’s notation for the Bohr azimuthal quantum number. It corresponds to $\ell+1$ in Schrödinger’s scheme. Two terms of different $\ell$, but with the same $j$, always coincide so that relativity and spin partly compensate each other. The principal quantum number is the sum $n=n_r+n_\phi$. Thus, Sommerfeld found

$$W=-\frac{RZ^2}{n^2}\left[1+\frac{(Z\alpha)^2}{n^2}\left(\frac{n}{n_\phi}-\frac34\right)+\cdots\right],$$

upon expanding in powers of $\alpha$, where $R=m\alpha^2/2$ is the Rydberg constant. The Balmer term, $-R/n^2$, undergoes a relativistic modification that depends on $j$ and on the fine-structure constant $\alpha$.
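The agreement between the Dirac levels, Sommerfeld’s closed form, and the expansion to order $(Z\alpha)^4$ can be checked numerically. The sketch below is an illustration only: it takes $\alpha=1/137$ as quoted in the text, sets $m=1$, and the function names and the list of sample levels are assumptions made for the example. Here `sommerfeld_1916` returns the total energy, i.e. the quantity written $W+m$ in Sommerfeld’s formula.

```python
import math

ALPHA = 1.0 / 137.0  # fine-structure constant, value quoted in the text

def dirac_energy(n, j, Z=1, m=1.0):
    """Exact Dirac level W(n, j), rest energy included (natural units)."""
    k = j + 0.5                       # |k| = j + 1/2
    za = Z * ALPHA
    nu = math.sqrt(k * k - za * za)   # positive root of (11.6.9)
    denom = n * n + 2.0 * (n - k) * (nu - k)
    return m * math.sqrt(1.0 - za * za / denom)

def sommerfeld_1916(n, j, Z=1, m=1.0):
    """Sommerfeld's closed form with n_phi = j + 1/2 and n_r = n - n_phi."""
    n_phi = j + 0.5
    n_r = n - n_phi
    za = Z * ALPHA
    root = math.sqrt(n_phi * n_phi - za * za)
    return m / math.sqrt(1.0 + za * za / (n_r + root) ** 2)

def expansion_fourth_order(n, j, Z=1, m=1.0):
    """Binding energy W - m through order (Z alpha)^4."""
    za = Z * ALPHA
    return (-m * za ** 2 / (2.0 * n ** 2)
            - m * za ** 4 / (2.0 * n ** 4) * (n / (j + 0.5) - 0.75))

levels = [(1, 0.5), (2, 0.5), (2, 1.5), (3, 2.5)]  # sample (n, j) pairs
for n, j in levels:
    exact = dirac_energy(n, j) - 1.0
    approx = expansion_fourth_order(n, j)
    print(n, j, exact, approx, exact - approx)
```

The residual `exact - approx` is of order $(Z\alpha)^6$, as expected for a series truncated after the $(Z\alpha)^4$ term, and the two closed forms agree to rounding error.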
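What Dirac did, essentially, was to transform Weyl’s equations,

$$\begin{pmatrix} W+Q & U-iV\\ U+iV & W-Q\end{pmatrix}\begin{pmatrix}\psi_a\\ \psi_b\end{pmatrix}=0,$$

into

$$\begin{pmatrix} W+U & Q+iV\\ Q-iV & W-U\end{pmatrix}\begin{pmatrix}\varphi_+\\ \varphi_-\end{pmatrix}=0, \tag{11.6.12}$$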
where $\varphi_\pm=\psi_a\pm\psi_b$. The Stokes parameters are $U=m\cos\phi$, $V=m\sin\phi$, and $Q=\boldsymbol{\sigma}\cdot\mathbf{p}$. We get Dirac’s equation on setting $\phi=0,\pi$, which is linear polarization. The presence of a left-handed elliptically polarized component of the mass will be shown in Sec. 11.6.3 to be related to the microwave Lamb shift, which distinguishes between different $\ell$ values whereas the Dirac equation does not. In its most general form, (11.6.12) becomes
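$$\begin{pmatrix} W+\dfrac{Z\alpha}{r}+m\cos\phi & \boldsymbol{\sigma}\cdot\mathbf{p}+im\sin\phi\\[2mm] \boldsymbol{\sigma}\cdot\mathbf{p}-im\sin\phi & W+\dfrac{Z\alpha}{r}-m\cos\phi\end{pmatrix}\begin{pmatrix}\varphi_+\\ \varphi_-\end{pmatrix}=0,$$

or, equivalently,

$$\begin{pmatrix} W+\dfrac{Z\alpha}{r}+m\cos\phi & i\left[\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}\left(-\dfrac{\partial}{\partial r}+\dfrac{\boldsymbol{\sigma}\cdot\mathbf{L}}{r}\right)+m\sin\phi\right]\\[3mm] i\left[\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}\left(-\dfrac{\partial}{\partial r}+\dfrac{\boldsymbol{\sigma}\cdot\mathbf{L}}{r}\right)-m\sin\phi\right] & W+\dfrac{Z\alpha}{r}-m\cos\phi\end{pmatrix}\begin{pmatrix}\varphi_+\\ \varphi_-\end{pmatrix}=0,$$

where we have introduced

$$\boldsymbol{\sigma}\cdot\mathbf{p}=-i\boldsymbol{\sigma}\cdot\nabla=-i\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}\frac{\partial}{\partial r}+i\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}\,\frac{\boldsymbol{\sigma}\cdot\mathbf{L}}{r}.$$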
The non-zero integer, k, determines whether the spin is parallel (k < 0), or anti-parallel (k > 0), to the momentum in the nonrelativistic limit. It is
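related to the spin-orbit coupling $\boldsymbol{\sigma}\cdot\mathbf{L}$ by

$$k^2=(\boldsymbol{\sigma}\cdot\mathbf{L}+1)^2=\mathbf{L}^2+\boldsymbol{\sigma}\cdot\mathbf{L}+1,$$

where we used $(\boldsymbol{\sigma}\cdot\mathbf{L})^2=\mathbf{L}^2-\boldsymbol{\sigma}\cdot\mathbf{L}$. Now, the total angular momentum is

$$\mathbf{J}^2=\mathbf{L}^2+\boldsymbol{\sigma}\cdot\mathbf{L}+\mathbf{S}^2,$$

where the eigenvalue of the square of the spin is $\tfrac12(\tfrac12+1)=\tfrac34$. Thus, the operator, $K$, whose eigenvalue is $k$, is $K^2=\mathbf{J}^2+\tfrac14$. Consequently, $k^2=j(j+1)+\tfrac14=(j+\tfrac12)^2$, and so

$$k=\pm\left(j+\frac12\right).$$

The coupled set of first-order differential equations is equivalent to the second-order wave equation:

$$\frac{d^2\psi}{dr^2}-\left[\left(m\sin\phi+\frac{k}{r}\right)^2-\left(W+\frac{Z\alpha}{r}\right)^2+m^2\cos^2\phi\right]\psi=0,$$

or, by expanding terms,

$$\frac{d^2\psi}{dr^2}-\left[a^2-2\frac{(Z\alpha W-km\sin\phi)}{r}+\frac{\nu^2}{r^2}\right]\psi=0, \tag{11.6.13}$$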
where $a^2=m^2-W^2>0$, and $\nu^2=k^2-(Z\alpha)^2$. We will employ the old quantum condition that the action, when evaluated over a closed orbit, be an integral multiple, $N$, of Planck’s constant,

$$\oint p\,dr=\oint\sqrt{-a^2+2\frac{(Z\alpha W-km\sin\phi)}{r}-\frac{\nu^2}{r^2}}\;dr=2\pi N, \tag{11.6.14}$$

in natural units. In the method of complex integration [Born 60], $r$ is a line in the complex plane where the integrand is pictured on a Riemann surface of two sheets with branch points at the roots $e_1$ and $e_2$ of the radicand with $e_1>e_2$. The path of integration is taken around the line joining the two roots. In the sheet of the Riemann surface where the root is positive it goes from $e_2\to e_1$ with $dr>0$, while in the sheet with the negative root, the path goes from $e_1\to e_2$ with $dr<0$, as shown in Fig. 11.16.
Fig. 11.16. The diagrams of the original and deformed paths of integration with the pole at r = ∞ as if it were at a finite distance.
In order to evaluate the integral, we distort the path so that it separates into individual contours, each of which encloses one pole of the function. The poles are located at $r=0$ and $r=\infty$. With the direction of rotation given in Fig. 11.16, the value of $N$ is just the negative sum of the residues of the integrand in these poles, where the residue is $2\pi i$ times the coefficient of $1/(r-r_0)$ in the Laurent expansion in the neighborhood of the pole $r_0$. As a check we set $\phi=0$, for which $\sin\phi=0$, and find

$$-\nu+\frac{Z\alpha W}{\sqrt{m^2-W^2}}=N,$$

which is precisely (11.6.10). Now, turning to (11.6.14), a necessary condition that the integrand have real roots is that $Z\alpha W>|k|m\sin\phi$. The eigenvalue condition is now

$$-\nu+\frac{Z\alpha W-km\sin\phi}{\sqrt{m^2-W^2}}=N, \tag{11.6.15}$$
which differs from (11.6.10) by the second term in the numerator. Since W < m and Zα < |k|, we expect the angle will be very small. Just how small it is, the Lamb shift will tell us in Sec. 11.6.3.
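The quantum condition (11.6.14) can also be checked without contour integration. Writing the radicand as $-A+2B/r-C/r^2$ with $A=a^2$, $B=Z\alpha W-km\sin\phi$ and $C=\nu^2$, the closed-orbit integral equals $2\pi(B/\sqrt{A}-\sqrt{C})$, which is (11.6.15). The sketch below evaluates the integral by direct quadrature between the turning points; the substitution $r=\text{centre}+\text{half}\cos\theta$, the function names, and the sample numbers are assumptions made for this illustration and are not the method of the text.

```python
import math

def closed_orbit_action(A, B, C, steps=4000):
    """Integral of sqrt(-A + 2B/r - C/r^2) dr over a closed orbit.

    The radicand equals A (r - r1)(r2 - r) / r^2 between the turning
    points r1 < r2, so r = centre + half*cos(theta) removes the
    square-root endpoint behaviour; the midpoint rule then converges fast.
    """
    centre = B / A
    disc = centre * centre - C / A
    if disc <= 0.0:
        raise ValueError("radicand has no real turning points")
    half = math.sqrt(disc)
    dtheta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * dtheta
        r = centre + half * math.cos(theta)
        total += (half * math.sin(theta)) ** 2 / r
    # factor 2: out along one sheet and back along the other
    return 2.0 * math.sqrt(A) * total * dtheta

def residue_formula(A, B, C):
    """Closed form 2*pi*(B/sqrt(A) - sqrt(C)) implied by (11.6.15)."""
    return 2.0 * math.pi * (B / math.sqrt(A) - math.sqrt(C))

# generic illustrative coefficients
numeric = closed_orbit_action(1.0, 3.0, 2.0)
exact = residue_formula(1.0, 3.0, 2.0)

# Dirac level with N = 1, k = 1, phi = 0, m = 1, Z*alpha = 1/137
za, k, N = 1.0 / 137.0, 1.0, 1
nu = math.sqrt(k * k - za * za)
W = math.sqrt(1.0 - za * za / ((N + k) ** 2 + 2.0 * N * (nu - k)))
quantum_number = closed_orbit_action(1.0 - W * W, za * W, nu * nu) / (2.0 * math.pi)

print(numeric, exact, quantum_number)
```

Inserting the level (11.6.11) for $N=1$ returns an action of $2\pi$, so the quadrature recovers the integer $N$ that was put in.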
11.6.2 Origin of the zero helicity state
The Dirac equations in the presence of a central force are

$$(W-m-V(r))\psi_A+\frac{d\psi_B}{dr}+\frac{1-k}{r}\psi_B=0,\qquad -(W+m-V(r))\psi_B+\frac{d\psi_A}{dr}+\frac{1+k}{r}\psi_A=0.$$
The two-component wave functions, ψA and ψB , will be shown to have opposite parities. These equations are comparable to the Weyl equations (11.6.4) and (11.6.5). Whereas Weyl’s equations are coupled through the
cross terms involving mass, the Dirac equations are coupled through the diagonal terms of the density matrix involving energy as well as mass. Terms have been exchanged in the density matrix as a result of taking linear combinations of the original spinor component wave functions [cf. (11.6.6)]. Eliminating one of these wave functions results in:

$$\left[(W-V)^2-m^2\right]\psi_A-\frac{k(k+1)}{r^2}\psi_A+\frac{2}{r}\frac{d\psi_A}{dr}+\frac{d^2\psi_A}{dr^2}=0. \tag{11.6.16}$$
The constant potential solutions to (11.6.16) are spherical Bessel functions [cf. (11.5.10a) and (11.5.19a)]. The hydrogen-like solutions to this equation, as we have seen, are to be sought in a power series expansion which must terminate, and, thus, provide an eigenvalue condition. In momentum space, the Dirac equation for a central force is

$$(W-V(r))\psi_B=(\boldsymbol{\sigma}\cdot\mathbf{p})\psi_A, \tag{11.6.17}$$

where

$$\boldsymbol{\sigma}\cdot\mathbf{p}=\begin{pmatrix} p_z & p_x-ip_y\\ p_x+ip_y & -p_z\end{pmatrix}=-i\begin{pmatrix}\dfrac{\partial}{\partial z} & \dfrac{\partial}{\partial x}-i\dfrac{\partial}{\partial y}\\[3mm] \dfrac{\partial}{\partial x}+i\dfrac{\partial}{\partial y} & -\dfrac{\partial}{\partial z}\end{pmatrix}. \tag{11.6.18}$$

The mass, which would appear missing in (11.6.17), has been incorporated into the diagonal terms in (11.6.18) since

$$p_z=W\cos\vartheta=W\sqrt{1-u^2}=m,$$
$$p_y=W\sin\vartheta\sin\phi=Wu\sin\phi=p\sin\phi,$$
$$p_x=W\sin\vartheta\cos\phi=Wu\cos\phi=p\cos\phi.$$

If we take $\psi_A$ to be an $s_{1/2}$ state wave function with spin up,

$$\psi_A(W,\mathbf{p})=R(r)\begin{pmatrix}1\\ 0\end{pmatrix}e^{i\mathbf{p}\cdot\mathbf{r}-iWt},$$
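then for $\psi_B$ we find

$$\begin{aligned}
\psi_B&=-\frac{i}{W-V+m}\begin{pmatrix}\dfrac{\partial}{\partial z} & \dfrac{\partial}{\partial x}-i\dfrac{\partial}{\partial y}\\[3mm] \dfrac{\partial}{\partial x}+i\dfrac{\partial}{\partial y} & -\dfrac{\partial}{\partial z}\end{pmatrix}R\begin{pmatrix}1\\ 0\end{pmatrix}e^{i(\mathbf{p}\cdot\mathbf{r}-Wt)}\\
&=-\frac{i}{W-V(r)+m}\,\frac{1}{r}\frac{dR}{dr}\begin{pmatrix} z & x-iy\\ x+iy & -z\end{pmatrix}\begin{pmatrix}1\\ 0\end{pmatrix}e^{i(\mathbf{p}\cdot\mathbf{r}-Wt)}\\
&=-\frac{i}{W-V(r)+m}\,\frac{dR}{dr}\begin{pmatrix}\cos\vartheta & \sin\vartheta\,e^{-i\phi}\\ \sin\vartheta\,e^{i\phi} & -\cos\vartheta\end{pmatrix}\begin{pmatrix}1\\ 0\end{pmatrix}e^{i(\mathbf{p}\cdot\mathbf{r}-Wt)}\\
&=-\frac{i}{W-V(r)+m}\,\frac{dR}{dr}\sqrt{\frac{4\pi}{3}}\left[Y_1^0\begin{pmatrix}1\\ 0\end{pmatrix}-\sqrt{2}\,Y_1^1\begin{pmatrix}0\\ 1\end{pmatrix}\right]e^{i(\mathbf{p}\cdot\mathbf{r}-Wt)},
\end{aligned}$$

where the spherical harmonics are

$$Y_1^0=\sqrt{\frac{3}{4\pi}}\cos\vartheta,\qquad Y_1^{\pm1}=\mp\sqrt{\frac{3}{8\pi}}\sin\vartheta\,e^{\pm i\phi}.$$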
The $p_{1/2}$ state is a linear combination of these spherical harmonics so that $\psi_A$ and $\psi_B$ have opposite parities. Observe that mass does not couple the components of the spinor, making it impossible to use a Lorentz transform to an inertial frame moving faster than the particle so that its helicity would be reversed. Moreover, if $m=0$, the rotation lies in the $p_xp_y$-plane so that the presence of mass tilts the plane of rotation toward the north pole. If the mass were considered separately, as in Dirac’s formulation, it could not be connected to the helicity $h=0$. Even if $m=0$, as for a neutrino, we have no way of eliminating the helicity $h=0$ if the angular momentum is not exactly parallel to the $z$-axis so that $\boldsymbol{\sigma}\cdot\mathbf{p}$ would be the helicity operator with helicities $h=0,\pm\tfrac12$. According to the Dirac equation it is only when $p_z=0$, and rotation occurs in the momentum $p_xp_y$-plane that we get $h=\pm\tfrac12$, whether or not the mass is finite. Thus, finite mass cannot be associated with $h=0$.
With the aid of Legendre’s equation, (11.5.7), (11.6.16) can be written as

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi_A}{\partial r}\right)+\frac{1}{r^2\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial\psi_A}{\partial\vartheta}\right)+\frac{1}{r^2\sin^2\vartheta}\frac{\partial^2\psi_A}{\partial\phi^2}+q^2\psi_A=0,$$

where $q^2=(W-V)^2-m^2$. The solutions to the radial equation are spherical Bessel functions. When the quantum conditions are applied, we will find that the magnetic quantum number satisfies $|m|\le\ell$. Therefore, we now turn our attention to the solution of the reduced wave equation in the short-wavelength diffraction limit. We will again employ Fermat’s principle of least time, just as we did in Sec. 7.2.2, only now constrained to motions on a sphere of radius $\ell$. Fermat’s principle asserts that a ray will follow the path from $(\vartheta_0,\phi_0)$ to $(\vartheta,\phi)$ such that the optical path length,

$$I=\int\sqrt{d\vartheta^2+\sin^2\vartheta\,d\phi^2}, \tag{11.6.19}$$

is an extremum. If the potential $V$ is slowly varying, as we assume it is, we may consider the index of refraction to be constant. Choosing $\vartheta$ to be the independent variable, (11.6.19) can be written as

$$I=\int_{\vartheta_0}^{\vartheta}\sqrt{1+\sin^2\vartheta\,\phi'^2}\;d\vartheta, \tag{11.6.20}$$
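where the prime stands for differentiation with respect to $\vartheta$. Calling the integrand of (11.6.20) $\mathcal{L}$, we have, on account of the cyclic nature of $\phi$, that

$$\frac{\partial\mathcal{L}}{\partial\phi'}=\frac{\sin^2\vartheta\,\phi'}{\sqrt{1+\sin^2\vartheta\,\phi'^2}}=\text{const.} \tag{11.6.21}$$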
is a first integral of the motion. We set the constant equal to $\sin\vartheta_0$ so that Snell’s law is satisfied, and proceed to solve for $\phi'$. We then obtain

$$d\phi=\frac{\sin\vartheta_0\,d\vartheta}{\sin^2\vartheta\,\sqrt{1-\sin^2\vartheta_0/\sin^2\vartheta}},$$
which requires that ϑ > ϑ0 because the sine is an increasing function on the interval (0, π/2). The integration can be carried out straightforwardly
Fig. 11.17. A right-spherical triangle.
giving the difference in longitude,

$$\phi(\vartheta)-\phi_0=\arccos\left(\frac{\tan\vartheta_0}{\tan\vartheta}\right), \tag{11.6.22}$$
of two great circle arcs, $\vartheta_0$ and $\vartheta$. The integration constant, $\phi_0$, has been chosen such that $\phi=\phi_0$ when $\vartheta=\vartheta_0$. Now, (11.6.22) is related to the expressions for the sine and cosine laws of the right-spherical triangle shown in Fig. 11.17, where, for instance,

$$\cos(\phi-\phi_0)=\cos\varepsilon\cdot\sin\varsigma=\frac{\cos\vartheta}{\cos\vartheta_0}\sin\varsigma=\frac{\cos\vartheta\,\sin\vartheta_0}{\cos\vartheta_0\,\sin\vartheta},$$

on account of the elliptic form of the Pythagorean theorem,

$$\cos\vartheta=\cos\vartheta_0\cdot\cos\varepsilon, \tag{11.6.23}$$

and the law of sines,

$$\frac{\sin\vartheta_0}{\sin\varsigma}=\sin\vartheta,$$

for a right triangle. Introducing the extremum condition (11.6.22) into Fermat’s principle (11.6.20) gives the extremum optical path as

$$I_{\rm ext}=\int_{\vartheta_0}^{\vartheta}\frac{d\vartheta}{\sqrt{1-\sin^2\vartheta_0/\sin^2\vartheta}}=\arccos\left(\frac{\cos\vartheta}{\cos\vartheta_0}\right)=\varepsilon(\vartheta,\vartheta_0). \tag{11.6.24}$$

It is quite remarkable that the extremum of the optical path should give the elliptic version of the Pythagorean theorem. $\varepsilon$ is the shortest optical path connecting the latitudes $\vartheta$ and $\vartheta_0$, all three being the sides of a right-spherical triangle.
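The closed forms (11.6.22) and (11.6.24) can be checked by differentiating them numerically and comparing with the integrands from which they were obtained. The sketch below is a minimal illustration: the sample angles `theta0 = 0.4` and `theta = 1.0` (radians), the finite-difference step, and all function names are assumptions made for the example.

```python
import math

def longitude_difference(theta, theta0):
    """Closed form (11.6.22): arccos(tan(theta0) / tan(theta))."""
    return math.acos(math.tan(theta0) / math.tan(theta))

def optical_path(theta, theta0):
    """Closed form (11.6.24): arccos(cos(theta) / cos(theta0))."""
    return math.acos(math.cos(theta) / math.cos(theta0))

def d_dtheta(f, theta, theta0, h=1e-6):
    """Central finite difference of f with respect to theta."""
    return (f(theta + h, theta0) - f(theta - h, theta0)) / (2.0 * h)

theta0, theta = 0.4, 1.0   # sample latitudes with theta0 < theta < pi/2
root = math.sqrt(1.0 - (math.sin(theta0) / math.sin(theta)) ** 2)

# integrands that the closed forms are supposed to integrate
expected_path_slope = 1.0 / root
expected_longitude_slope = math.sin(theta0) / (math.sin(theta) ** 2 * root)

path_slope = d_dtheta(optical_path, theta, theta0)
longitude_slope = d_dtheta(longitude_difference, theta, theta0)

# elliptic Pythagorean theorem (11.6.23): cos(theta) = cos(theta0) * cos(path)
pythagoras_gap = math.cos(theta) - math.cos(theta0) * math.cos(optical_path(theta, theta0))

print(path_slope, expected_path_slope)
print(longitude_slope, expected_longitude_slope)
print(pythagoras_gap)
```

Both derivatives agree with the integrands, and both closed forms vanish at $\vartheta=\vartheta_0$, consistent with the choice of integration constant.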
Now, the law of cosines is

$$\cos\vartheta=\cos\vartheta_0\cos\varepsilon+\sin\vartheta_0\sin\varepsilon\cos\xi.$$

If it turns out that the angle $\xi=\pi/2$, we are talking about a right-spherical triangle, and the law of cosines reduces to the elliptic Pythagorean theorem, (11.6.23). If we set $\sin\vartheta_0$ and $\sin\varepsilon$ equal to the relative velocities $u_0$ and $u$, respectively, the condition that the spherical triangle be a right triangle can be expressed as $\mathbf{u}\cdot\mathbf{u}_0=0$. Then, the relativistic addition of complementary velocities,

$$\sqrt{1-w^2}=\frac{\sqrt{1-u^2}\,\sqrt{1-u_0^2}}{1-\mathbf{u}_0\cdot\mathbf{u}}, \tag{11.6.25}$$
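yields the Pythagorean theorem for spherical trigonometry, (11.6.23), where $w=\sin\vartheta$. The phase, or eikonal, will be a certain linear combination of (11.6.22) and (11.6.24). Just as in Sec. 7.2.2 we define the eikonal, $S$, as the integral of the Legendre transform of $\mathcal{L}$ with respect to $\phi'$ [cf. 7.4.11] viz.

$$\begin{aligned}
S&=\int\left(\mathcal{L}-\phi'\frac{\partial\mathcal{L}}{\partial\phi'}\right)d\vartheta\\
&=\int\left[\frac{d\vartheta}{\sqrt{1-\sin^2\vartheta_0/\sin^2\vartheta}}-\frac{\sin^2\vartheta_0}{\sin^2\vartheta}\,\frac{d\vartheta}{\sqrt{1-\sin^2\vartheta_0/\sin^2\vartheta}}\right]\\
&=\int\sqrt{1-\frac{\sin^2\vartheta_0}{\sin^2\vartheta}}\;d\vartheta
\end{aligned} \tag{11.6.26}$$

$$\phantom{S}=\arccos\left(\frac{\cos\vartheta}{\cos\vartheta_0}\right)-\sin\vartheta_0\arccos\left(\frac{\tan\vartheta_0}{\tan\vartheta}\right)=\varepsilon-m(\phi-\phi_0), \tag{11.6.27}$$

where $m=\sin\vartheta_0$ is the azimuthal quantum number.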
Fig. 11.18. A right-spherical triangle traced out by an orbiting electron where co – denotes the complement of the angle.
The spherical harmonics can be expressed in terms of their corresponding associated Legendre polynomials, $Y_l^{\pm m}(\vartheta,\phi)\sim P_l^{m}(\cos\vartheta)e^{\pm im\phi}$. The associated Legendre polynomials, $P_l^{\pm m}$, have the asymptotic form $P_l^{\pm m}(\cos\vartheta)\sim e^{\pm iS(\vartheta,\vartheta_0)}$, where $S$ is given by (11.6.26). The complement of (11.6.24) is the angular distance of the orbiting electron from the line of nodes, measured on the orbital plane, and (11.6.22) is the projection of this angular distance onto the equator. This is shown in Fig. 11.18, which is a right-spherical triangle equivalent to Fig. 11.17. The cosine of the angle of inclination, $i$, is given by the ratio of the elliptic measures of the adjacent to the hypotenuse,

$$\cos i=\frac{\tan[\pi/2-(\phi-\phi_0)]}{\tan(\pi/2-\varepsilon)}=\frac{\tan\varepsilon}{\tan(\phi-\phi_0)},$$
while for the sine of the angle, we have

$$\sin i=\frac{\sin(\pi/2-\vartheta)}{\sin(\pi/2-\varepsilon)}=\frac{\cos\vartheta}{\cos\varepsilon}=\cos\vartheta_0.$$
The latter implies that the elliptic dilatation factor,

$$\frac{\sqrt{1-m^2/\ell^2}}{\sqrt{1-m^2/(\ell^2\sin^2\vartheta)}} \ge 1,$$

becomes larger the closer ϑ approaches ϑ0. The dilatation factor reaches its smallest value for m = 0.

Even before the advent of quantum theory, the Dutch physicist Zeeman (1896) found a splitting of the spectral lines when atoms are placed in a magnetic field. Zeeman observed that when the magnetic field is placed perpendicularly to the light path, the spectral lines split into three, while when the field is parallel to the light path they split into two. His countryman, Lorentz, showed that a charged oscillator in a magnetic field could explain these splittings. Lorentz accomplished this by decomposing the motion of the oscillator into two opposite circular motions normal to the field, and one linear motion parallel to the field.

When a magnetic field is applied to a rotating electron, it makes the electron precess about the direction of the field. The frequency of precession is known as the Larmor frequency, ωL = eH/2m, where H is the applied magnetic field. Lorentz’s theory predicted that the frequency of one of the circular motions is increased exactly by ωL, while the frequency of the other circular motion is diminished by exactly the same amount. This is referred to as the ‘normal’ Zeeman effect to distinguish it from the ‘anomalous’ effect, in which the splitting pattern consists of more lines than the triplet (or doublet).

In terms of the azimuthal quantum number, m, the above analysis implies that it can change by −1, 0, 1. Transitions that leave m invariant correspond to linear polarization in the direction of the field. For transitions m ± 1 → m, the radiation is circularly polarized about the direction of the field. Even in the absence of an applied magnetic field, a revolving electron produces its own magnetic field. An increase (decrease) in m corresponds to right- (left-) circularly polarized light.
Observations that are made longitudinal to the magnetic field result in a doublet with frequencies displaced
Fig. 11.19. Zeeman splitting: light path parallel (perpendicular) to field results in a doublet (triplet).
in both directions of the central frequency, and the disappearance of the central line. In contrast, transverse observations result in a triplet where the central frequency is plane-polarized parallel to the field, while the outer lines are polarized in a perpendicular direction, as shown in Fig. 11.19. This is yet another example of the myriad of physical meanings that the index m on the associated Legendre polynomials can take on.
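The normal Zeeman pattern described above can be tabulated in a few lines. This is only an illustrative sketch under stated assumptions: the text writes the Larmor frequency as ωL = eH/2m in its own units, whereas the code uses the SI form eB/(2mₑ); the function names, the sample field of 1 T, and the sample line frequency are hypothetical. No handedness is assigned to the circular components, since that depends on sign conventions not fixed here.

```python
E_CHARGE = 1.602176634e-19      # elementary charge in C (exact SI value)
M_ELECTRON = 9.1093837015e-31   # electron mass in kg (CODATA 2018)


def larmor_angular_frequency(b_tesla):
    """Larmor angular frequency e*B/(2*m_e) in rad/s (SI form, an assumption)."""
    return E_CHARGE * b_tesla / (2 * M_ELECTRON)


def zeeman_lines(omega0, b_tesla, view):
    """List (step, angular frequency, polarization) for the normal Zeeman effect.

    step is the displacement in units of the Larmor frequency;
    view is "transverse" (light path normal to the field) or
    "longitudinal" (light path along the field).
    """
    w_l = larmor_angular_frequency(b_tesla)
    lines = []
    for step in (-1, 0, 1):
        if step == 0:
            if view == "longitudinal":
                continue  # the central line disappears along the field
            polarization = "linear, parallel to field"
        elif view == "longitudinal":
            polarization = "circular"
        else:
            polarization = "linear, perpendicular to field"
        lines.append((step, omega0 + step * w_l, polarization))
    return lines


if __name__ == "__main__":
    omega0 = 3.0e15  # hypothetical unperturbed angular frequency in rad/s
    for view in ("transverse", "longitudinal"):
        print(view, zeeman_lines(omega0, 1.0, view))
```

The transverse view returns the triplet and the longitudinal view the doublet, in line with Fig. 11.19.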
11.6.3 Lamb shift and left-hand elliptical polarization
In the late 1940’s Lamb and Retherford were able to detect transitions between the very close $2s_{1/2}$ and $2p_{1/2}$ levels of hydrogen, induced by perturbations using microwaves at frequencies of the order of 1000 Mc/sec. The shift can be explained as arising from the interaction of the atom with its own radiation field. To obtain the observed shift, Hans Bethe had to subtract off the infinite electromagnetic energy of the electron, and then introduce a plausible cut-off on the integral over the energy. The original back-of-the-envelope calculation of Bethe was elaborated on by Welton [Welton 48], who considered fluctuations of the zero-point oscillations of the electromagnetic field. The energy of such oscillations
is proportional to

$$\langle E^2\rangle = \langle H^2\rangle = \frac{2\hbar}{\pi}\int_0^{\infty}\kappa^3\,d\kappa. \tag{11.6.28}$$
This integral clearly diverges. Its integrand is the energy, ℏω = ℏκ, times the number of Planck oscillators between the frequencies κ and κ + dκ, which is proportional to κ² dκ. At absolute zero the integral over all modes is infinite, and this will require some type of cut-off. A free electron, acted upon by an electric field, will have an equation of motion,

$$m\omega^2 q = -eE, \tag{11.6.29}$$
where it is supposed that the electron’s motion is periodic, and q is its displacement from its equilibrium position. Squaring (11.6.29), and using (11.6.28), we find the mean-square displacement is (at least heuristically) given by

$$\langle(\Delta q)^2\rangle = \frac{2}{\pi}\,\frac{e^2\hbar}{m^2}\int_{\kappa_0}^{\kappa_1}\frac{d\kappa}{\kappa}, \tag{11.6.30}$$
which diverges ‘only’ logarithmically if the limits are zero and infinity. The equation of motion (11.6.29) fails to take into account processes regarding the radiation of the electron. The longitudinal recoil of the electron will have to be added to the transverse oscillations which have been included. Recoil will become important for wavelengths smaller than the Compton wavelength, which is the smallest wavelength that can be used to localize the electron. Therefore, to exclude recoil, we set the upper limit at κ1 = κc, the inverse of the Compton wavelength. The lower limit was more problematic since the divergence arises from very large-amplitude, but very low-frequency, oscillations. Electron binding will help eliminate such oscillations, so the lower limit κ0 was taken by Bethe to be slightly greater than the energy of ionization of the hydrogen atom, 13.6 eV, or 1 Rydberg. Bethe [Bethe 47] needed 17.8 times this amount to obtain approximate agreement with the experimental result of Lamb and Retherford [Lamb & Retherford 47]. This alone attests to the ad hoc nature of Bethe’s procedure.
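To see how strongly the result depends on the choice of lower limit, the logarithm of the two cut-offs can be evaluated directly. The sketch below is only an illustration under stated assumptions: both cut-offs are expressed as energies in units with ℏ = c = 1, so that κc is taken as the electron rest energy, and the constant and function names are hypothetical.

```python
import math

ELECTRON_REST_ENERGY_EV = 510_998.95   # m c^2 in eV (CODATA 2018, rounded)
RYDBERG_EV = 13.605693                 # hydrogen ionization energy in eV (rounded)


def cutoff_logarithm(lower_in_rydbergs):
    """ln(kappa_c / kappa_0), with kappa_c the electron rest energy and
    kappa_0 given as a multiple of the Rydberg energy."""
    kappa_c = ELECTRON_REST_ENERGY_EV
    kappa_0 = lower_in_rydbergs * RYDBERG_EV
    return math.log(kappa_c / kappa_0)


if __name__ == "__main__":
    # Lower limit at the ionization energy versus the 17.8 Rydberg quoted above.
    print(cutoff_logarithm(1.0))
    print(cutoff_logarithm(17.8))
```

Because the cut-offs enter only through a logarithm, moving the lower limit from 1 to 17.8 Rydberg changes the result by ln 17.8, a little under 3, out of a total of order 10.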
Due to the zero-point fluctuations, the Coulomb potential,

$$V(q) = -\frac{e^2}{r},$$

will be perturbed so that we may develop the perturbed potential in a Taylor series expansion in the displacement about this value. We then obtain

$$V(q+\Delta q) = \left[1 + \Delta q\cdot\nabla + \tfrac{1}{2}(\Delta q\cdot\nabla)^2 + \cdots\right]V(q).$$

What is important is its average value, and since the spatial distribution is isotropic, it will be given by

$$\langle V(q+\Delta q)\rangle = \left[1 + \tfrac{1}{6}\langle(\Delta q)^2\rangle\nabla^2 + \cdots\right]V(q).$$

Although this series fails to converge in the case of the Coulomb potential, this should not prevent us from retaining only the lowest nonvanishing term, for which we have the expression (11.6.30). Multiplying the result by the probability density, |ψ100(0)|², where ψ100 is the ground state wave function of the hydrogen atom, gives an energy shift of

$$\Delta W = \frac{4}{3}\,\frac{e^2\alpha}{\kappa_c^2}\,|\psi_{100}|^2\,\ln\frac{\kappa_c}{\kappa_0}. \tag{11.6.31}$$

Quantum electrodynamics corroborates (11.6.31), while providing a cosmetic touch-up by adding small terms to the logarithm [Sakurai 67]. This would seem like a great achievement in view of the ad hoc nature of the cut-offs.

Quantum field theory offers still another explanation of the Lamb shift. The quantization of the radiation field leads to fluctuating field strengths in empty space, which can be viewed as a shielding effect on the electron, whereby the surrounding cloud of virtual electron-positron pairs shields its charge so that, at large distances, it appears smaller than it actually is. The virtual electron-positron pairs, which are produced by the electric field acting on the vacuum (aether), are polarized by the single electron of the hydrogen atom, so that the positrons are attracted slightly closer to the electron while the electrons in the virtual pairs are repelled. Thus, the space surrounding the electron appears as a polarized dielectric, as depicted in Fig. 11.20. Rather than invoking the
Fig. 11.20. The conventional explanation of the Lamb shift as the shielding of the electron’s charge by virtual electron-positron pairs that are produced by the vacuum when acted upon by an electric field.
radiation field, we will attempt to explain the Lamb shift as the effect of the electric field acting to circularly polarize the electron mass. According to Dirac, the energy depends only on n and the magnitude of k, so that there is no lifting of the degeneracy of states with the same j values, but with different ℓ values. However, (11.6.15) clearly shows that with a non-vanishing azimuthal angle, ϕ ≠ 0, the energy will also depend on the sign of k, and consequently levels with different values of k but with the same modulus will be split. This will lift the degeneracy and account for the Lamb shift. Solving (11.6.15) for the energy we get

$$W = m\sqrt{1-\frac{(Z\alpha)^2}{(N+|k|)^2+2N\left(\sqrt{k^2-(Z\alpha)^2}-|k|\right)}}\;\times\;\sqrt{1-\frac{k^2\sin^2\phi}{(N+|k|)^2+2N\left(\sqrt{k^2-(Z\alpha)^2}-|k|\right)}} \;+\; \frac{Z\alpha\,km\sin\phi}{(N+|k|)^2+2N\left(\sqrt{k^2-(Z\alpha)^2}-|k|\right)}, \tag{11.6.32}$$
which reduces to Dirac’s result, (11.6.11), when ϕ = 0. We may verify (11.6.32) by squaring both sides to obtain

$$W^2 q - 2Z\alpha W km\sin\phi = m^2 q - m^2\left[(Z\alpha)^2 + k^2\sin^2\phi\right], \tag{11.6.33}$$

where

$$q = (N+|k|)^2 + 2N\left(|\nu| - |k|\right),$$

and we have canceled the common terms (Zα km sin ϕ)². Now adding (Zα)²W² to both sides of (11.6.33) we find

$$\left(Z\alpha W - km\sin\phi\right)^2 = \left(m^2 - W^2\right)\left[q - (Z\alpha)^2\right]. \tag{11.6.34}$$

Observing that q − (Zα)² = (N + |ν|)², and taking the positive square root of (11.6.34), gives us back the eigenvalue condition, (11.6.15). Expanding (11.6.32) in powers of Zα gives

$$W - m = -m\left\{\frac{(Z\alpha - k\sin\phi)^2}{2n^2} + \frac{(Z\alpha)^2}{2n^4}\,(Z\alpha - k\sin\phi)^2\left(\frac{n}{|k|}-\frac{3}{4}\right) - \cdots\right\}, \tag{11.6.35}$$

where n = N + |k|. The nonrelativistic energy is now dependent on two quantum numbers, n and k, and on the value of the azimuthal angle, ϕ. The fact that ϕ ≠ 0 represents a preference for right circular polarization. We must now determine this angle.

Observe that for ϕ = 0, (11.6.35) depends on n and |k|, but not on the sign of k. This implies that energy levels with the same j and same n are degenerate. This degeneracy is peculiar to the Coulomb field. In nonspherically symmetric potentials, as in many-electron atoms, the level with lower ℓ lies below that with higher ℓ, due to screening. Even in hydrogen, where there is no screening, there is still a splitting between the atomic levels with the same total angular momentum, but with different orbital angular momenta. This tiny splitting is between the $2s_{1/2}$ (k = −1, ℓ = 0)
Fig. 11.21. Splitting of energy levels of a hydrogen-like atom (not drawn to scale). All shifts are left-hand elliptical polarizations.
and $2p_{1/2}$ (k = +1, ℓ = 1) levels, with the 2s state being the higher of the two, as shown in Fig. 11.21. It is known as the Lamb shift, and was predicted by S. Pasternack even before the 1940’s. Because (11.6.35) is unable to account for the Lamb shift with ϕ = 0, since it only depends on the modulus of k and not on its sign, recourse had to be made to quantum electrodynamics. However, for ϕ ≠ 0, (11.6.35) predicts a splitting. The splitting where the $2s_{1/2}$ level lies above the $2p_{1/2}$ level corresponds to left-hand elliptical polarization. Moreover, even in the nonrelativistic limit, where we retain only the leading term in (11.6.35), we find the shift in energy between the $2s_{1/2}$ and $2p_{1/2}$ states to be

$$\Delta W = W_{2s_{1/2}} - W_{2p_{1/2}} = -\tfrac{1}{2}\,mZ\alpha\sin\phi, \tag{11.6.36}$$
where Z = 1 for hydrogen. The degeneracy in the Dirac equation has been lifted. The measured frequency shift is 1057 megacycles per second, corresponding to 0.035 cm⁻¹. To get an idea of how small this is, just compare it to the ionization energy of the n = 2 level, 2.7 × 10⁴ cm⁻¹. The shift corresponds to an energy of
4.34 × 10⁻⁶ eV. With ½αm = 1.863 × 10³ eV in (11.6.36), a negative phase angle of ϕ = −0.024 is obtained, indicating left-handed elliptical polarization. All the energy levels of the splittings of a hydrogen-like atom are shown in Fig. 11.21. All splittings occur with a negative difference in the k values. For instance, the fine-structure splitting, which is roughly ten times that of the Lamb shift, has a negative phase angle, ϕ = −0.16. All allowed splittings show left-handed elliptical polarizations, which can be considered a selection rule, and are obtained from (11.6.35) without any recourse to quantum electrodynamics.
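The algebra connecting (11.6.32), (11.6.34) and (11.6.36), together with the unit conversions quoted for the measured shift, can be checked numerically. What follows is a rough sketch under stated assumptions, not a definitive implementation: energies are measured in units of the electron rest mass (m = 1), the value of sin ϕ is an arbitrary test input rather than a fitted one, and all function and constant names are hypothetical.

```python
import math

ALPHA = 1 / 137.035999       # fine-structure constant (approximate value)
H_EV_S = 4.135667696e-15     # Planck constant in eV s
C_CM_S = 2.99792458e10       # speed of light in cm/s


def q_factor(N, k, Z=1):
    """q = (N + |k|)^2 + 2N(|nu| - |k|), with |nu| = sqrt(k^2 - (Z alpha)^2)."""
    a = Z * ALPHA
    return (N + abs(k))**2 + 2 * N * (math.sqrt(k * k - a * a) - abs(k))


def energy(N, k, sin_phi, Z=1):
    """W/m from (11.6.32)."""
    a = Z * ALPHA
    q = q_factor(N, k, Z)
    return (math.sqrt((1 - a * a / q) * (1 - (k * sin_phi)**2 / q))
            + a * k * sin_phi / q)


def residual(N, k, sin_phi, Z=1):
    """Left side minus right side of (11.6.34), with m = 1."""
    a = Z * ALPHA
    W = energy(N, k, sin_phi, Z)
    return (a * W - k * sin_phi)**2 - (1 - W * W) * (q_factor(N, k, Z) - a * a)


def lamb_shift_units(freq_hz=1057e6):
    """Convert a frequency to (wavenumber in cm^-1, energy in eV)."""
    return freq_hz / C_CM_S, H_EV_S * freq_hz


if __name__ == "__main__":
    s = -1e-3  # arbitrary test value of sin(phi)
    # n = 2: N = 1 with k = -1 (2s_1/2) and k = +1 (2p_1/2)
    split = energy(1, -1, s) - energy(1, 1, s)
    print(residual(1, -1, s), residual(1, 1, s))
    print(split, -0.5 * ALPHA * s)   # exact splitting versus leading term (11.6.36)
    print(lamb_shift_units())
```

Since q depends only on |k|, the square-root factor in (11.6.32) is the same for both levels, and the splitting comes entirely from the term linear in k sin ϕ. That is why the leading-order result (11.6.36) matches the closed form so closely.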
References
[Bass & Schrödinger 55] L. Bass and E. Schrödinger, “Must the photon mass be zero?” Proc. Roy. Soc. A 232 (1955) 1–6.
[Bethe 47] H. A. Bethe, “The electromagnetic shift of energy levels,” Phys. Rev. 72 (1947) 339–341.
[Born & Wolf 59] M. Born and E. Wolf, Principles of Optics (Pergamon, Oxford, 1959), p. 653.
[Born 60] M. Born, The Mechanics of the Atom (Frederick Ungar Pub. Co., New York, 1960), Appendix II.
[Brillouin 60] L. Brillouin, Wave Propagation and Group Velocity (Academic Press, New York, 1960), p. 143.
[Dirac 47] P. A. M. Dirac, The Principles of Quantum Mechanics, 2nd ed. (Oxford U. P., London, 1947), pp. 257–258.
[Falkoff & MacDonald 51] D. L. Falkoff and J. E. MacDonald, “On the Stokes parameters for polarized radiation,” J. Opt. Soc. Am. 41 (1951) 861–862.
[Fano 49] U. Fano, “Remarks on the classical and quantum-mechanical treatment of partial polarization,” J. Opt. Soc. Am. 39 (1949) 859–863.
[Fano 54] U. Fano, “A Stokes-parameter technique for the treatment of polarization in quantum mechanics,” Phys. Rev. 93 (1954) 121–123.
[Farago 71] P. S. Farago, “Electron spin polarization,” Rep. Prog. Phys. 34 (1971) 1055.
[Feynman & Hibbs 65] R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1965), p. 35.
[Gaveau et al. 84] B. Gaveau et al., “Relativistic extension of the analogy between quantum mechanics and Brownian motion,” Phys. Rev. Lett. 53 (1984) 419–422.
[Georgi 09] H. Georgi, Lie Algebras in Particle Physics, 2nd ed. (Levant Books, Kolkata, India, 2009).
[Gersh 81] H. A. Gersh, “Feynman’s relativistic chessboard as an Ising model,” Int. J. Theor. Phys. 20 (1981) 491–501.
[Gottfried & Weisskopf 86] K. Gottfried and V. Weisskopf, Concepts of Particle Physics, Vol. II (Oxford U. P., New York, 1986).
[Heaviside 92] O. Heaviside, Electrical Papers, Vol. II (MacMillan, London, 1892), pp. 94–5.
[Heaviside 99] O. Heaviside, Electromagnetic Theory, Vol. II (The Electrician, London, 1899), Appendix D.
[Heaviside 12] O. Heaviside, Electromagnetic Theory, Vol. III (The Electrician, London, 1912), Sec. 514.
[Heisenberg 66] W. Heisenberg, Introduction to the Unified Field Theory of Elementary Particles (Wiley, New York, 1966), p. 131.
[Jackson 05] J. D. Jackson, “Kinematics,” http://pdg.lbl.gov/2005/reviews/kinemarpp.pdf.
[Jackson 75] J. D. Jackson, Classical Electrodynamics, 2nd ed. (Wiley, New York, 1975), Sec. 12.9.
[Jacobson & Schulman 84] T. Jacobson and L. S. Schulman, “Quantum stochastics: The passage from a relativistic to a non-relativistic path integral,” J. Phys. A: Math. Gen. 17 (1984) 375–383.
[Jauch & Rohrlich 55] See, for example, J. M. Jauch and F. Rohrlich, Theory of Photons and Electrons (Addison-Wesley, Reading MA, 1955), Sec. 2.8.
[Jones 41] R. Clark Jones, “A new calculus for the treatment of optical systems. I. Description and discussion of the calculus,” J. Opt. Soc. Am. 31 (1941) 488–493.
[Kliger et al. 90] D. S. Kliger, J. W. Lewis, and C. E. Randall, Polarized Light in Optics and Spectroscopy (Academic Press, Boston, 1990).
[Lamb & Retherford 47] W. E. Lamb, Jr. and R. C. Retherford, “Fine structure of the hydrogen atom by a microwave method,” Phys. Rev. 72 (1947) 241–243.
[Landau & Lifshitz 59] L. D. Landau and E. M. Lifshitz, Fluid Mechanics (Pergamon Press, Oxford, 1959).
[Lavenda 00] B. H. Lavenda, “Heisenberg’s Gitterwelt revisited,” Nuovo Cimento B 115 (2000) 1385–1395.
[Lipkin 62] H. Lipkin, Beta Decay for Pedestrians (North-Holland, Amsterdam, 1962), p. 96.
[Lipkin 66] H. J. Lipkin, Lie Groups for Pedestrians (North-Holland, 1966), p. 138.
[Maxwell 65] J. C. Maxwell, “A dynamical theory of the electromagnetic field,” Phil. Trans. 155 (1865) 459.
[McCrea 47] W. H. McCrea, Relativity Physics, 2nd ed. (Methuen, London, 1947), p. 59.
[McMaster 54] W. H. McMaster, “Polarization and the Stokes parameters,” Am. J. Phys. 22 (1954) 351–362.
[Moriyasu 83] K. Moriyasu, An Elementary Primer for Gauge Theory (World Scientific, Singapore, 1983), p. 120.
[Mueller 48] H. Mueller, “The foundations of optics,” J. Opt. Soc. Am. 37 (1948) 661.
[Nahin 88] P. J. Nahin, Oliver Heaviside: Sage in Solitude (IEEE Press, New York, 1988).
[Omnès 70] R. Omnès, Introduction to Particle Physics (Wiley-Interscience, London, 1970), pp. 81ff.
[Oppenheimer 30] J. R. Oppenheimer, “On the theory of electrons and protons,” Phys. Rev. 35 (1930) 562.
[Perrin 42] F. Perrin, “Polarization of light scattered by isotropic opalescent media,” J. Chem. Phys. 10 (1942) 415–427.
[Poincaré 92] H. Poincaré, Théorie Mathématique de la Lumière (Georges Carré, Paris, 1892), Vol. 2, Ch. 12.
[Rohrlich 65] F. Rohrlich, Classical Charged Particles (Addison-Wesley, Reading MA, 1965).
[Sakurai 67] J. J. Sakurai, Advanced Quantum Mechanics (Addison-Wesley, Reading MA, 1967), p. 293.
[Schweber 61] S. S. Schweber, An Introduction to Relativistic Quantum Field Theory (Harper & Row, New York, 1961), p. 110.
[Schweber 86] S. S. Schweber, “Feynman and the visualization of space-time processes,” Rev. Mod. Phys. 58 (1986) 449–507.
[Shurcliff 62] W. A. Shurcliff, Polarized Light: Production and Use (Harvard U. P., Cambridge MA, 1962).
[Skilling 42] H. H. Skilling, Fundamentals of Electric Waves (Wiley, New York, 1942), Ch. XI.
[Soleillet 29] P. Soleillet, “Sur les paramètres caractérisant la polarisation partielle de la lumière dans les phénomènes de fluorescence,” Ann. de Phys. 12 (1929) 23–97.
[Sommerfeld 34] A. Sommerfeld, Atombau und Spektrallinien, 5th ed. (Teubner, Berlin, 1934), p. 688.
[Stokes 52] G. G. Stokes, “On the composition and resolution of streams of polarized light from different sources,” Trans. Cambridge Phil. Soc. 9 (1852) 399; Mathematical and Physical Papers, Vol. 3 (Cambridge U. P., Cambridge, 1901), p. 233.
[Welton 48] T. A. Welton, “Some observable effects of the quantum-mechanical fluctuations of the electromagnetic field,” Phys. Rev. 74 (1948) 1157–1167.
Index
Green’s, 129 incompressible, 623 increase in mass, 293 jelly-like, 551 luminiferous, 128, 551 in contrast to electric, 531 mass, 292 Maxwell’s equality of compression and rotation, 623 motion, 187 necessity, 264, 292 restores action and reaction, 186 rotation, 131 storage for energy, 290 velocity, 161, 621 wind, 113, 434 Ampère’s law, 153, 201, 207 relation to Biot and Savart, 201 analytic continuation, 585 angle defect, 281, 387, 414, 464, 467, 472, 503, 507 cause of Lorentz contraction, 523 caused by aberration, 503 upper limit on, 89 angle excess, 281, 464, 570 angle of parallelism, 82, 238, 283, 393, 397, 419, 420, 471, 472, 506, 512, 526, 579 Bolyai–Lobaschevsky formula for, 85, 450 for β-decay, 560 from aberration, 397 from pseudorapidity, 535
aberration, 288, 414, 458, 501, 504 angle, 414 constant, 460 in direction of motion, 395 normal to the motion, 395 stellar, 89, 111, 124, 459 two-way, 526 Abraham’s model, 263, 304 absolute constant, see curvature, radius absolute instrument, 61 acceleration, 408 centripetal, 147, 148, 228 effect on clock rate, 387, 401, 404 longitudinal and transverse, 503 role in twin paradox, 298 uniform, 406, 419, 431 accelerative frame, 448 motion, nonequivalent, 449 action and reaction law, 150, 185 violated by Lorentz’s force, 186 action at a distance, 158, 177, 181, 200, 207, 262 action principle, 218 activity, see power density additivity principle, 571 adiabatic process, 308 conservation of enthalpy, 308 aether, 548, 618 absolute motion, 297 conservation, 548 density, 621 dielectric constant, 626
link between hyperbolic and circular functions, 420 lower bound for angle of parallax, 419 angle, critical, 136 angular momentum Euclidean conservation, 433 non-conservation, 166, 363, 437, 442, 462 operators, 543 orbital, 552, 615 spin, 551, 615 total, 541, 615, 639 antineutrino, 589 asymptotic lines, 82 Balmer series, relativistic correction to, see Sommerfeld’s relativistic correction Beltrami coordinates, 467 flat model, 100 metric, 87, 95, 218, 368, 410, 429, 433, 448, 467, 482, 494 for a uniformly rotating disc, 445 pre-geodesic, 371 relation to Liénard’s radiation loss, 224 model, 91, 98, 362, 523 deflection of light, 381 Bessel function, 352, 475 modified, 476 determining thermodynamic properties, 328 generating function for, 335 spherical, 593, 598 β-decay, 555, 582 Fermi’s theory, 555, 580 big bang, 497 Biot–Savart law, 251 birefrigent polarizer, 537 birefringence, 137, 536, 539 black hole, 410 blackbody radiation, 319, 616 blueshift, 526
Boltzmann’s law, 328 Bolyai–Lobaschevsky formula, see angle of parallelism Boyle’s law, see Mariotte’s law Casimir invariant, 571 Cauchy’s formula, 349 caustic, 342, 343, 364, 366 centrifugal force, 343, 432, 439 potential, effective, 462 charge conjugation symmetry, violation, 581 circle at infinity, 81, 509 circuital equations, see Maxwell’s equations Clairaut representation, 350, 367 collision parameter, 380, 433 color charges, 571 compressional waves hallmark of, 597 Kelvin’s electromagnetic analogy, 553 Compton wavelength, 329, 353 Cooper pairs, 576, 608 as spin-0 bosons, 608 coordinates comoving, 410, 492 controllable and uncontrollable, 308 cyclic, 350 homogeneous, 97, 389 pseudospherical, 482 Weierstrass, 97, 98, 416 Coriolis force, 343 Coulomb’s gauge, 602, 605, 610 law, 185, 198, 201, 251, 357, 633, 650 breakdown of, 574 covariance, Einstein’s principle, 431, 447 cross-ratio, 67, 68, 70, 73, 80, 86, 190, 285, 389, 413, 453, 463, 508, 522, 524 Cayley’s definition, 104, 511 Poincaré’s definition, 103, 104, 512, 516
Index relation to Doppler shifts, 190 relation to hyperbolic distance, 466 cubic dilatation, 546 curvature, 476 as fitting error, 464 constant, 477 Gaussian, 362, 366, 369, 370, 409, 458, 464, 474–476, 478, 498 Einstein’s need to go beyond, 430 relation to scalar curvature, 481 geodesic, 351 mean, 370 negative, 97, 464, 477 relation to source, 474 non-constant, 464 positive, 476, 477 relation to sink, 474 versus negative, 465 principal, 370 radius, 147, 191, 225, 228 scalar, 161, 479, 481 as a criterion for emptiness, 480 de Broglie’s relation, 122, 628 density matrix, 540, 541, 585, 641 reduced, 587 representation of Stokes parameters, 564 density of states, 329 depolarization, 248, 282 dichroism, 536, 539 diffraction short-wavelength limit, 342 diffraction, short-wavelength limit, 643 diffusion process, 630 dipole radiation, 597 Dirac’s equation hydrogen-like solutions to, 641 in presence of central force, 640 momentum space, 641
nonequivalence to Weyl’s equations, 633 dispersion, 134, 192, 615 displacement current, 131, 183, 199, 548 dissipation, 585, 596 Kelvin’s principle, 621 Doppler shift, 107, 117, 119, 182, 296, 319, 336, 344, 386, 421, 504, 515, 525 compounding, 388, 508 connection with hyperbolic length, 190, 508 exponential, 396 first-order, 297 for water waves, 294 general, 396 in β-decay, 558 lack of in emission theories, 211 longitudinal, 296, 335, 388, 400, 471, 558 ordinary, 516 relativistic, 516 second-order, 121, 297 related to space contraction, 468 successive occurrence with rotations, 453 transverse, 334 two-way, 504, 507, 518, 526 as experimental test for the angle of parallelism, 509 prediction of redshift, 526 used to distinguish Klein and Poincaré models, 509 duplex equations, 618 dynamic equilibrium, 293 ecliptic plane, 461 Ehrenfest’s paradox, 465 eikonal, 136, 363, 428, 645 Einstein’s addition law, see velocity composition law equations, 161, 417, 479, 495 postulates, 2, 4, 208
elastic constants compressional, 358, 606, 621 equality, 622 rigidity, 358, 606 rotational, 358, 606, 621 electric displacement, 291 waves, longitudinal, 605, 613, 625 electrokinetic momentum, 131, 133, 181, 348, 547, 550, 591, 598, 601 potential, 179, 261, 262 electromotive force, 206 electron classical radius, 308 models, 262 spin, 550 electroweak interaction, 540, 554, 573, 574, 610 Lagrangian, 576 mediators, 616 theory, 542 ellipsoid oblate, 247, 273, 302 prolate, 242, 248, 270 elliptic circumference of circle, 429 distance, 428 geometry, 95, 287, 377, 570 aberration in, 288 transformation to hyperbolic, 428 volume of a sphere, 376 metric, for Schwarschild inner solution, 483 plane, 67, 376 space dilatation, 465, 647 volume, 486 emission theory, 187, 192, 294, 434, 445 Ritz’s, 188, 209 energy binding, 307 blackbody radiation, 321 conservation, 432, 520, 540 density compressional, 605, 623 electromagnetic, 549
negative, 496, 607 positive, 497 rate of change of kinetic, 623 flux, 311 gravitational, 253 hyperbolic conservation, 463 index, 496 inertia, 141 internal, 313 kinetic due to a magnetic field, 264 Euclidean, 457, 467 negative, 204 random thermal motion, 309 relativistic, 331 resulting from random thermal motion, 314 levels of Dirac atom, 637 negative, 549, 589 rotation, 604 radiant, 301 relativistic, 534 conservation, 631 total, 314 energy, relativistic, 576 energy–momentum tensor, 161, 311, 332 enthalpy, 143, 302, 303, 312, 313 density, 307 total, 314 entropy, 313, 620 blackbody radiation, 321 relativistic invariant, 321 equivalence principle, 427, 431, 433, 437, 443 downgraded by Einstein, 445 Euclid’s fifth postulate, see Euclidean, parallel postulate Euclidean distance, 410 geometry, 283, 397, 430 length, 419, 522 metric, 428 parallel postulate, 79, 452 violation, 88
Index Euler’s equation, 409, 622 relation, 318, 322 event horizon, 492 extinction theorem, 192, 211 Faraday’s equation, 602 tubes, 244, 291, 293, 533, 618 crowding, 293 Fechner’s hypothesis, 202 Fermat’s principle, 61, 134, 342, 346, 408, 449, 454, 643, 644 Fermi transitions, 580 fields conditioning, 615 fundamental, 613 solenoidal and irrotational, 352 Fierz interference, 581 first law, 316 FitzGerald–Lorentz contraction, 116, 302, 312, 313, 315, 317, 387, 426, 450, 464 as a rotation, 397 shortening of circumference of rotating disc, 465 Fizeau’s drag coefficient, 109 flat model, 361 flux, quantum mechanical, 609 forces centrifugal and gravitational, 477 centripetal, 150 gravitational, 298 quasi-longitudinal and transversal, 147 radiation reaction, 213, 217 Abraham’s, 215, 220 self-reaction, 209 torsional, 147 four-vector, 574, 575 Abraham’s, 215 dispersion relation, 611 time-like, 611, 615 energy–momentum, 631 invariance of, 579 of electrodynamics, 181 potential, 575, 600–602
of electroweak field, 574 velocity, 226 Frenet–Serret equations, 226 Fresnel’s dragging coefficient, 110, 111, 125, 192, 445 Friedmann’s equation, 496 fundamental form, 483 first, 95, 350, 367, 430, 433, 479, 481 second, 370 gas degenerate, 330, 333 pressure of, 333 ideal, 330 gauge invariance, 554 theories, 574 transformation, 586 Gauss’s law, 606, 611, 614 for gravitation, 158 geodesics, 99, 449, 487 null, 492 spreading of, 452 geodetic projection, 94 Gitterwelt, 629, 630 gravitation Einstein’s theory of, 407 Maxwellian theory of, 156 Ritz’s theory of, 163 gravitational collapse, 463 current, 349 energy flux, 162 field, 432 inhomogeneous, 445 produced by acceleration, 444 potential Newton’s, 381, 431 scalar, 409 Schwarzschild’s, 371 radiation, 343 redshift, 369 vortex, 162 waves, 163 propagation speed, 432
traveling at the speed of light, 161, 163 Grüneisen equation of state, 324, 333 parameter, 330 gyroscopic motion, 228 Hamilton’s principle, 134 heat, 316 conduction, 473 diffusion equation, 475 Fourier’s law, 475 heat content, see enthalpy Heaviside ellipsoid, 263 helicity, 358, 540, 541, 543, 553, 588 inertial properties, 589 longitudinal mode, 541 operator, 632 positive and negative, 550, 581 reversal, 620, 642 synonymous to circular polarization, 562 zero, 554 Helmholtz’s electromagnetic theory, 128, 620, 626 equation, 135, 361, 550 Higgs field, 576 analogous to aether, 576 nonexistence, 554 mechanism, 549 horizon, see circle at infinity horocycle, 82, 85, 505, 536 Hubble’s law, 388, 421, 422 parameter, 422 Hugoniot’s theorem, 625 Huygens’s principle, 174, 448, 529 hydrogen atom, 649, 650, 652 energy level splittings, 654 gravitational analog, 377 hypercharge, 574 hyperbolic circumference of circle, 429 distance, 80, 86, 412, 495, 511
as shortest, 397 Klein’s definition, 511 Poincaré’s definition, 512 triangle inequality for, 72 geodesics, 452 geometry, 91, 100, 535, 572 hallmark, 80 transformation to elliptic, 570 volume of a sphere, 378 involution, 389, 390 law of cosines, 394 law of sines, 387, 395 length, 412 line element, 406 motion, 223 parallel axiom, 461 rotation, 393 transition to elliptic, 96 ideal gas law, see Mariotte’s law ideal point, see point, at infinity ideal triangle, 505 incompressible fluid condition for, 622, 623 induction, 262, 554, 601 inductive zone, 597 inertia of energy Heaviside’s law, 132 Planck’s law, 307 Thomson’s law, 149, 255 inversion, 58, 62 involute, 363 involution, 283 involutory, 75 isoperimetic quotient, 276 isospin, 543, 571, 578, 586 jets, 583 Jones and Mueller calculus, 537 Joule –Thomson process, 307 heating, 596 K-calculus, 398 Kepler’s law, 351
Index kinematic condition of compatibility, 624 relativity, 398 kinetic potential, 317 Klein model, see projective, model Klein–Gordon equation, 353, 587, 600, 608, 611, 612, 619 Lagrangian of random thermal motion, 323 Lamb shift, 638, 640, 653 inability Dirac’s equation to account for, 637 Laplace’s equation, 357, 474, 623 Larmor formula, 223, 254 generalization, 217 reduction, 224 frequency, 647 radiation term, 231 LC lumped circuit, 617 least action, principle, 549 Legendre’s equation, 593, 643 libration, 442 Lie algebra, 573 group compact and non-compact, 572 relation to non-Euclidean geometries, 572 product, 573 Liénard’s force, 180, 213–215, 230 formula, 225 potential, 252 rate of energy loss, 216 relation to Beltrami metric, 216 retarded potential, 182 limit circle, 98 limit cycle, see horocycle limiting curve, see horocycle
Liouville–Beltrami half-plane model, 91, 103 Lobaschevskian geometry, see hyperbolic geometry Lobaschevsky–Friedmann metric, 417 space, 407 London’s equation, 609 hallmark, 609 Lorentz –Dirac equation, 215 boost, 502, 538, 572, 620 electron, 148 force, 154, 185, 217, 249 for magnetic charge, 250 modifications, 153 Thomson’s derivation, 290 vanishing, 150 violation of third law, 150, 532 gauge, 602, 622, 626 invariance, 106, 612, 613 model, 263, 304 Abraham’s criticism, 305 transformation, 93, 311, 313, 391, 400, 471 derivation, 93 for energy and momentum, 311 Mach’s principle, 375 magnetic charge, 162, 249, 621 density, 618 continuity equation, 618 magnetic field, internal, 609 magnetic monopoles, 618 magnetons, see magnetic monopoles Mariotte’s law, 330, 331 mass anisotropy, 293 as a vector, 146, 532 density Lorentz invariant, 307 dipole moment, 343 electromagnetic, 251, 302, 532 electrostatic, 251, 269
elliptically polarized component, 638 enthalpy equivalence, 314 increase by heat, 301 invariancy, 583 longitudinal, 146, 270 measure of correlation of helicity states, 589 missing in collider production, 535 operators, 143, 544 polarization, 536, 542, 559, 583 rest, 143, 144, 589 components, 148 shell, 537, 590 transverse, 146, 153, 193, 269, 534, 535, 588 and longitudinal, 250, 532, 544, 582 as vectors, 532 mass and energy, 289, 296 based on conservation of momentum, 293 Einstein’s equivalence, 193, 296 Poynting’s equivalence, 124, 273 Thomson’s equivalence, 291, 292 mass and heat content, 314 Planck’s equivalence, 298 maximum likelihood estimate, 336 Maxwell’s fish-eye, 61 speed distribution, 334, 336 theorem, 61 Maxwell’s equations, 128, 351, 619 along a cable, 627 and radiation pressure, 515 defect, 183, 200 spherical waves, 592 relation to hyperbolic geometry, 468 transverse, 129 and massless, 576 Meissner effect, 554, 608 Minkowski’s metric, 93, 430, 490, 542 Möbius automorphism, 392
transform, 57, 72, 104, 285, 286, 392 Mössbauer effect, 526 momentum blackbody radiation, 321 conservation, 295 role of aether, 292 electrokinetic, see electrokinetic momentum polarization, 538 transverse, 534 neutral currents, 574 Newton’s law, 132, 410 Ampère’s contradiction, 178 applied to the universe, 496 correction to, 165 nonrelativistic limit, 638 Ohm’s law, 596, 617, 618 Olbers’ paradox, 396 optical path length, 346, 448, 449, 643 optico-gravitational phenomena, 342 parallax, 88, 472, 501, 535 angle, 472 parity inversion, see β-decay parity violation, law, 581 particle number, 329 non-conservation, 330 partition function, relativistic, 328 Pauli equation, 584 spin matrices, 543, 573, 584 spinor, 586 equation, 584 perihelion, advance, 363, 381, 437 Ritz’s priority, 166 perpetual motion, 183, 204, 230, 317 as used by Carnot, 203 as used by Helmholtz, 203 as used by Poynting, 145 as used by Ritz, 183, 230 perspectivity, 69–71 phase transition, second-order, see spontaneous symmetry-breaking
photon spin, 552 Planck’s hypothesis, 311 relation, 314, 553 Poincaré definition of hyperbolic distance, 525 disc model, 65, 81, 100, 101, 392, 451, 508, 524 Beltrami’s discovery, 91 onto Klein model, 393 half-plane model, 79, 80, 100, 449, 450 principle of relativity, 140 representation, 145, 568 sphere, 538, 562, 563 stress, 253, 302, 306, 326, 468, 470 non-electromagnetic origin of, 307 Poincaré, pressure, see Poincaré, stress point antipodal, 94, 377 at infinity, 69, 70, 79, 91, 97, 98, 464 conjugate, 391 fixed, 73, 284, 391 repulsive, 390 Poisson’s equation, 361, 610 generalized, 363 Riemann’s modification, 207 equations, 547 law for thermal conduction, 473 polarization, 129, 160, 163, 329, 531 circular, 537, 553, 587, 647 complete, 535, 587 condition, 584 degree, 581 elementary particles, 550 ellipse, 563 elliptic, 577 inertia, 144 linear, 541, 647 longitudinal, 581, 587 plane, 588
potential advanced, 196, 200 nonexistence, 183 chemical, 330 conditioning, 615 dipole, 273 four-vector of electromagnetic field, 574 internal, 608 irrotational, 625 kinetic, 312, 314 Legendre transform, 318 logarithmic, 235, 474 Newton’s, 371 retarded, 162, 181, 209, 229, 590, 615 FitzGerald’s, 182 Heaviside’s interpretation, 612 irreversibility, 183 Riemann’s first introduction, xlvii Ritz’s modification, 187 transformation into advanced, 229 scalar, 547, 622 solenoidal, see Coulomb gauge vector, see electrokinetic, momentum velocity, 623 power density, 254, 270, 599, 600, 604 Poynting’s vector, 249, 292, 304, 307, 533, 551 gravitational, 162 pre-acceleration, 231 pre-geodesic, 368, 375 pressure, 312, 330, 624 blackbody radiation, 321 due to crowding of Faraday tubes, 293 hydrostatic, 143, 358, 622 analogy to, 554 electromagnetic analog, 163, 623 Lorentz-invariant, 305 negative, 302
radiation, 118, 212, 326, 509, 515, 521, 551, 623 critical angle, 524 Maxwell’s prescription, 515 relativistic invariant, 321, 323, 327 principal elongations, 546 probability amplitude for path reversal, 585 Proca’s equations, 600 modified, 608 projective correspondence, 68 geometry, 80, 91, 94, 97 invariant, see cross-ratio model, 96, 106, 362, 392, 508, 509, 523, 524 plane, 71 projectivity, 283 propagator, 587 pseudorapidity, 535 pseudosphere, 82, 84, 90, 95, 96, 367, 394, 418, 460, 464, 494, 539 as a surface of revolution of the tractrix, 275 volume, 439 Pythagorean theorem elliptic, 535, 570, 644, 645 Euclidean, 448, 467, 488 hyperbolic, 283, 394, 467, 469, 539 quadrilateral, Lambert, 416, 420 quadrupole interaction, 343, 381, 382 quantization condition, 639 quantum number, 632 angular momentum, 357 azimuthal, 356, 637, 645, 647 helicity, 359 magnetic, see azimuthal orbital, 637 principal, 637 radial, 637 spin, 359 radar method, see K-calculus radiation, 294, 554 by an accelerated charge, 264 damping, 229
gauge, see Coulomb’s gauge model, 590 non-thermal, 616 reaction force, 212, 220, 226 components, 227 vanishing, 225, 227 zone, 597 radiation, pressure, see pressure, radiation radius of curvature, 84, 463, 472 as the absolute constant, 450 rapidity, 309, 311, 388, 578 example of a non-compact Lie group, 572 in hadron colliders, 533 stochastic, 337 Rayleigh distribution, 336 scattering, 543, 570 redshift, 421 exponential, 422 gravitational, 436 longitudinal and transverse, 396 reflection law of in motion, 514 total, 136 refraction, 128 index, 61, 76, 133, 192, 211, 342, 348, 358, 562, 618 Cauchy’s expression, 349 elliptic space, 61 mechanical analog, 346 refraction, double, see birefringence relativity and spin, 637 Einstein’s principle, 445 Poincaré’s principle, 295 Ricci tensor, 161, 432 contracted, 409, 479 Riemann’s metric, for non-Euclidean geometries, 90, 495 rigid body, 137, 448 Ritz’s force, 153, 180, 254 gravitational, 166 reduction to Liénard’s, 180 Robertson–Walker metric, 418, 484 rotating disc, 453, 484
Einstein’s analogy with gravitation, 445 Rydberg constant, 637 Schott radiation term, 230 Schrödinger’s equation, 357, 630 Schwarzschild electrokinetic potential, 185 force, 229 metric, 346, 370, 434, 477, 484 exterior solution, 434, 479, 481 interior solution, 477, 484 example of elliptic geometry, 484 radius, 168, 171, 347, 433, 478 in terms of density, 463 second law, 230 Shapiro effect, 348 singularity, 478 Snell’s law, 77, 125, 136, 643 Sommerfeld’s relativistic correction, 637 due to fine-structure, 631 space-time paths, relativistic, 585 spectral lines, gravitational shift of, 378 spin, 135, 541 –orbit interaction, 632, 639 multiplets, 544 spinors, 541 spontaneous symmetry-breaking, 576, 612 Sagnac effect, 434 explained by emission theory, 435 generalized, 439 non-conservation of angular momentum, 435 standard theory, 574 Stefan’s law, 319, 321 stereographic arclengths, 486 inner product, 440, 473 distortion, 64, 451 for non-constant curvature, 478 positive definiteness, 478
model, 362, 437 plane, 64 projection, 65, 91, 96, 369, 509 of spin, 558, 581, 584 of the complex plane onto the Poincaré sphere, 565, 567 Stigler’s law of eponymy, 100, 119, 168, 290, 311, 416, 502, 537 Stokes momentum space, 584 Stokes parameters, 145, 534, 560, 638 and spin, 577 analogy with angular momentum, 545 and polarization of elementary particles, 550 and SU(2), 544 as a four-vector, 542, 565 density matrix representation, 632 the Dirac equation, 638 strain, 222, 545 irrotational, 545 longitudinal, 548 solenoidal, 545 transversal, 548 stress, 142 Maxwell, 551, 568, 569 shear, 358, 531 tangential, 551 tensor, 311 thermal, 474 strong interaction, 571 superconductivity, 576, 609, 622 hallmark, 352, 609 superluminal speeds, 9, 293, 430, 627, 629 surfaces of discontinuity, 624 tachyons, see superluminal speeds telegraph equation, 585 temperature, 328, 329 limits, 328 relativistic variation, 313 tessellations, 106 thermal conductivity, 473 Thomas precession, 502, 507 Borel’s priority, 502
time coordinate, 409 curvature, lack of significance, 481 dilatation, 114, 115, 223, 296, 347, 400, 427, 491, 516, 525 second-order, 298 free-fall, 349, 363, 370, 431, 448, 463 geometric mean, 406, 407, 418 local, 189 logarithmic, 405, 418, 422 proper, 217, 409 reflection, 399 time, hyperbolic, see time, logarithmic torsion, radius of, 147, 226, 227 tractrix, 90, 282, 366 transversality condition, 602 triangle inequality, 58, 71, 72 cross-ratio, 71 inverse cosine, 493 twin paradox, 297 ultraparallels, 82 universe closed, 492 flat, 492 open, 485, 492 vacuum, 549 relation to aether, 650 velocity absolute, 180, 189, 199 versus relative, 115 complementary, 419 composition law, 106, 190, 192, 208, 210, 286, 309, 389, 414, 421, 459, 502, 645 argument against emission theories, 192 as the isomorphism from Klein to Poincaré models, 509 collinear, 126
Lorentz transforms from, 308 group, 134, 629 hyperbolic measure, 189, 560 phase, 134, 629 random, 308 relative, 191 vibrancy condition longitudinal Doppler shift, 335 transverse Doppler shift, 334, 336 virial, 323, 330 Clausius’s, 180 theorem, 323 vortex, 547 velocity, 626 W-bosons, 574 wave compressional, see wave, longitudinal condensation, 361, 553 longitudinal, 128, 358, 553, 605, 614 electric, 160 model of hydrogen atom, 135 transverse, 604 transverse and longitudinal, 129, 163 weak isotopic charge, 574 Weber’s force, 178 applied to gravity, 164 Weinberg angle, 574 Weyl’s equations, 619, 631–633 Dirac’s transformation, 638 Wien’s displacement law, 319 distribution, 335, 336, 517 Wigner angle, 502 Yukawa’s potential, 574, 612, 615 Z-boson, 576 Zeeman effect, 647