PROGRESS IN OPTICS VOLUME VIII
EDITORIAL ADVISORY BOARD
M. FRANÇON,
Paris, France
E. INGELSTAM,
Stockholm, Sweden
K. KINOSITA,
Tokyo, Japan
A. LOHMANN,
San Diego, U.S.A.
W. MARTIENSSEN
Frankfurt am Main, Germany
G. SCHULZ,
Berlin, Germany
M. E. MOVSESYAN,
Erevan, U.S.S.R.
A. RUBINOWICZ,
Warsaw, Poland
W. H. STEEL,
Sydney, Australia
G. TORALDO DI FRANCIA, Florence, Italy
W. T. WELFORD,
London, England
PROGRESS IN OPTICS VOLUME VIII
EDITED BY
E. WOLF
University of Rochester, N.Y., U.S.A.
Contributors
J. W. GOODMAN, G. A. FRY, H. Z. CUMMINS, H. L. SWINNEY, A. MUSSET, A. THELEN, H. RISKEN, T. YAMAMOTO, L. LEVI, C. L. MEHTA
1970
NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM · LONDON
© 1970, NORTH-HOLLAND PUBLISHING COMPANY
All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the Copyright owner
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 61-19297
NORTH-HOLLAND I.S.B.N.: 0 7204 1508 X
AMERICAN ELSEVIER I.S.B.N.: 0 444 10020 2
PUBLISHERS:
NORTH-HOLLAND PUBLISHING COMPANY - AMSTERDAM
NORTH-HOLLAND PUBLISHING COMPANY, LTD. - LONDON
SOLE DISTRIBUTORS FOR THE U.S.A. AND CANADA:
AMERICAN ELSEVIER PUBLISHING COMPANY, INC.
52 VANDERBILT AVENUE
NEW YORK, N.Y. 10017
PRINTED IN THE NETHERLANDS
CONTENTS OF VOLUME I (1961)
I. THE MODERN DEVELOPMENT OF HAMILTONIAN OPTICS, R. J. PEGIS . . . 1-29
II. WAVE OPTICS AND GEOMETRICAL OPTICS IN OPTICAL DESIGN, K. MIYAMOTO . . . 31-66
III. THE INTENSITY DISTRIBUTION AND TOTAL ILLUMINATION OF ABERRATION-FREE DIFFRACTION IMAGES, R. BARAKAT . . . 67-108
IV. LIGHT AND INFORMATION, D. GABOR . . . 109-153
V. ON BASIC ANALOGIES AND PRINCIPAL DIFFERENCES BETWEEN OPTICAL AND ELECTRONIC INFORMATION, H. WOLTER . . . 155-210
VI. INTERFERENCE COLOR, H. KUBOTA . . . 211-251
VII. DYNAMIC CHARACTERISTICS OF VISUAL PROCESSES, A. FIORENTINI . . . 253-288
VIII. MODERN ALIGNMENT DEVICES, A. C. S. VAN HEEL . . . 289-329

CONTENTS OF VOLUME II (1963)
I. RULING, TESTING AND USE OF OPTICAL GRATINGS FOR HIGH-RESOLUTION SPECTROSCOPY, G. W. STROKE . . . 1-72
II. THE METROLOGICAL APPLICATIONS OF DIFFRACTION GRATINGS, J. M. BURCH . . . 73-108
III. DIFFUSION THROUGH NON-UNIFORM MEDIA, R. G. GIOVANELLI . . . 109-129
IV. CORRECTION OF OPTICAL IMAGES BY COMPENSATION OF ABERRATIONS AND BY SPATIAL FREQUENCY FILTERING, J. TSUJIUCHI . . . 131-180
V. FLUCTUATIONS OF LIGHT BEAMS, L. MANDEL . . . 181-248
VI. METHODS FOR DETERMINING OPTICAL PARAMETERS OF THIN FILMS, F. ABELÈS . . . 249-288

CONTENTS OF VOLUME III (1964)
I. THE ELEMENTS OF RADIATIVE TRANSFER, F. KOTTLER . . . 1-28
II. APODISATION, P. JACQUINOT AND B. ROIZEN-DOSSIER . . . 29-186
III. MATRIX TREATMENT OF PARTIAL COHERENCE, H. GAMO . . . 187-332

CONTENTS OF VOLUME IV (1965)
I. HIGHER ORDER ABERRATION THEORY, J. FOCKE . . . 1-36
II. APPLICATIONS OF SHEARING INTERFEROMETRY, O. BRYNGDAHL . . . 37-83
III. SURFACE DETERIORATION OF OPTICAL GLASSES, K. KINOSITA . . . 85-143
IV. OPTICAL CONSTANTS OF THIN FILMS, P. ROUARD AND P. BOUSQUET . . . 145-197
V. THE MIYAMOTO-WOLF DIFFRACTION WAVE, A. RUBINOWICZ . . . 199-240
VI. ABERRATION THEORY OF GRATINGS AND GRATING MOUNTINGS, W. T. WELFORD . . . 241-280
VII. DIFFRACTION AT A BLACK SCREEN, PART I: KIRCHHOFF'S THEORY, F. KOTTLER . . . 281-314

CONTENTS OF VOLUME V (1966)
I. OPTICAL PUMPING, C. COHEN-TANNOUDJI AND A. KASTLER . . . 1-81
II. NON-LINEAR OPTICS, P. S. PERSHAN . . . 83-144
III. TWO-BEAM INTERFEROMETRY, W. H. STEEL . . . 145-197
IV. INSTRUMENTS FOR THE MEASURING OF OPTICAL TRANSFER FUNCTIONS, K. MURATA . . . 199-245
V. LIGHT REFLECTION FROM FILMS OF CONTINUOUSLY VARYING REFRACTIVE INDEX, R. JACOBSSON . . . 247-286
VI. X-RAY CRYSTAL-STRUCTURE DETERMINATION AS A BRANCH OF PHYSICAL OPTICS, H. LIPSON AND C. A. TAYLOR . . . 287-350
VII. THE WAVE OF A MOVING CLASSICAL ELECTRON, J. PICHT . . . 351-370

CONTENTS OF VOLUME VI (1967)
I. RECENT ADVANCES IN HOLOGRAPHY, E. N. LEITH AND J. UPATNIEKS . . . 1-52
II. SCATTERING OF LIGHT BY ROUGH SURFACES, P. BECKMANN . . . 53-69
III. MEASUREMENT OF THE SECOND ORDER DEGREE OF COHERENCE, M. FRANÇON AND S. MALLICK . . . 71-104
IV. DESIGN OF ZOOM LENSES, K. YAMAJI . . . 105-170
V. SOME APPLICATIONS OF LASERS TO INTERFEROMETRY, D. R. HERRIOTT . . . 171-209
VI. EXPERIMENTAL STUDIES OF INTENSITY FLUCTUATIONS IN LASERS, J. A. ARMSTRONG AND A. W. SMITH . . . 211-257
VII. FOURIER SPECTROSCOPY, G. A. VANASSE, H. SAKAI . . . 259-330
VIII. DIFFRACTION AT A BLACK SCREEN, PART II: ELECTROMAGNETIC THEORY, F. KOTTLER . . . 331-377

CONTENTS OF VOLUME VII (1969)
I. MULTIPLE-BEAM INTERFERENCE AND NATURAL MODES IN OPEN RESONATORS, G. KOPPELMAN . . . 1-66
II. METHODS OF SYNTHESIS FOR DIELECTRIC MULTILAYER FILTERS, E. DELANO AND R. J. PEGIS . . . 67-137
III. ECHOES AT OPTICAL FREQUENCIES, I. D. ABELLA . . . 139-168
IV. IMAGE FORMATION WITH PARTIALLY COHERENT LIGHT, B. J. THOMPSON . . . 169-230
V. QUASI-CLASSICAL THEORY OF LASER RADIATION, A. L. MIKAELIAN AND M. L. TER-MIKAELIAN . . . 231-297
VI. THE PHOTOGRAPHIC IMAGE, S. OOUE . . . 299-358
VII. INTERACTION OF VERY INTENSE LIGHT WITH FREE ELECTRONS, J. H. EBERLY . . . 359-415
PREFACE

This volume, like its seven predecessors, presents review articles dealing with various aspects of optics. Some of the articles cover relatively new topics, such as synthetic aperture techniques, light beating spectroscopy, statistical properties of laser light and the theory of photoelectric counting. The other articles deal with multilayer antireflection coatings, the optical performance of the human eye, vision in communication and some aspects of interference microscopy. It is hoped that the comprehensive nature of all these review articles, written by experts in the respective fields, will ensure that they will become useful contributions to the optical literature.

Department of Physics and Astronomy, University of Rochester, N.Y., 14627
September, 1970
EMIL WOLF
CONTENTS

I. SYNTHETIC-APERTURE OPTICS by J. W. GOODMAN (Stanford, Calif.)
1. INTRODUCTION
2. INTERFEROMETRY
  2.1 Basic principles
  2.2 Image formation from interferometric data
  2.3 Practical difficulties in interferometric measurements
  2.4 Fringe detection and sensitivity
  2.5 Intensity interferometry
3. FEEDBACK-CONTROLLED OPTICS
4. IMAGING WITH PARTIALLY FILLED APERTURES
  4.1 General properties of partially filled apertures
  4.2 Temporal synthesis of a filled aperture with a double-objective telescope
5. APERTURE SYNTHESIS WITH COHERENT ILLUMINATION
  5.1 The active interferometer
  5.2 Doppler-spread imaging
  5.3 Holographic arrays
  5.4 Optical synthetic-aperture radars
6. OBJECT RESTORATION BEYOND THE DIFFRACTION LIMIT
  6.1 Noiseless object restoration
  6.2 Restoration in the presence of noise
7. APERTURE SYNTHESIS BY USE OF A PRIORI INFORMATION
  7.1 Objects restricted in polarization
  7.2 Objects restricted in spatial structure
  7.3 Temporally restricted objects
  7.4 Chromatically restricted objects
8. CONCLUDING REMARKS
ACKNOWLEDGMENTS
REFERENCES

II. THE OPTICAL PERFORMANCE OF THE HUMAN EYE by GLENN A. FRY (Columbus, Ohio)
1. INTRODUCTION. INVESTIGATIVE APPROACHES
2. THE ANATOMY AND PHYSIOLOGY OF THE RETINA
3. BASIC CONCEPTS: LINES, POINTS AND BORDERS
4. VISUAL ACUITY, RESOLVING POWER AND CONTRAST AND MODULATION SENSITIVITY
  4.1 Measurement and specification of visual acuity
  4.2 Resolving power
  4.3 The concepts of contrast and modulation
5. SPREAD AND TRANSFER FUNCTIONS
  5.1 Spread functions
  5.2 Convolving a line with a point spread function
  5.3 Convolving a pattern of parallel strips with a line spread function
  5.4 Sine-wave techniques
  5.5 Summary of spread and transfer functions used for the eye
  5.6 Index of blur
6. PSYCHOPHYSICAL TECHNIQUES
  6.1 The role of coherence
  6.2 Special devices for measuring resolving power, modulation thresholds and transfer factors
7. DETERMINATION OF THE MODULATION TRANSFER FUNCTION FOR THE RETINA
  7.1 Derivation from the combined function
  7.2 Direct measurement with a double slit interference pattern
8. PERCEIVED MODULATION OF A SINE-WAVE GRATING AT SUPRATHRESHOLD LEVELS
9. DETERMINATION OF THE MODULATION TRANSFER FUNCTION OF THE OPTICAL SYSTEM OF THE EYE
  9.1 The Campbell-Green method
  9.2 The Arnulf-Dupuy method
10. RESOLVING POWER AT UNIT MODULATION
  10.1 Effect of pupil size on the resolving power
  10.2 The effect of luminance level on resolving power
  10.3 Resolving power for double slit interference patterns
11. DEPENDENCE OF THE RESOLVING POWER ON WAVELENGTH COMPOSITION
12. VISIBILITY OF SQUARE-WAVE GRATINGS
13. THE VISIBILITY OF A BAR
14. THE VISIBILITY OF BORDERS
15. RETINAL REFLECTOMETRY
16. ABERRATIONS OF THE EYE
17. MOTOR ADJUSTMENTS OF THE EYE
18. GENERAL REVIEWS
APPENDIX I
REFERENCES

III. LIGHT BEATING SPECTROSCOPY by H. Z. CUMMINS and H. L. SWINNEY (Baltimore, Maryland)
1. INTRODUCTION
  1.1 Historical review
  1.2 Statistics and spectra
2. THE THEORY OF LIGHT BEATING SPECTROSCOPY
  2.1 Classical coherence theory
  2.2 Homodyne or self-beat detection
  2.3 Heterodyne detection
  2.4 Forrester's approach
  2.5 Quantum theory
3. LIGHT SCATTERING THEORY
  3.1 Scattering by a dilute solution of particles
    3.1.1 Spherical scatterers
    3.1.2 Nonspherical scatterers
    3.1.3 Rigid rods
  3.2 Scattering by pure fluids and liquid mixtures
4. COHERENCE AND SIGNAL TO NOISE CONSIDERATIONS
  4.1 Spatial coherence
  4.2 Source spectrum and statistics
    4.2.1 Phase fluctuation
    4.2.2 Amplitude fluctuation
  4.3 Light beating with a multimode laser source
  4.4 Signal to noise
5. APPARATUS AND PROCEDURE
  5.1 Homodyne spectroscopy
  5.2 Heterodyne spectroscopy
6. REVIEW OF RAYLEIGH LINEWIDTH EXPERIMENTS
  6.1 Dilute solutions of macromolecules
  6.2 Simple fluids near the critical point
  6.3 Binary critical mixtures
  6.4 Fixman's modification
7. CONCLUSIONS
REFERENCES

IV. MULTILAYER ANTIREFLECTION COATINGS by A. MUSSET and A. THELEN (Santa Rosa, Calif.)
1. INTRODUCTION
2. SINGLE LAYER ANTIREFLECTION COATINGS
3. TWO-LAYER ANTIREFLECTION COATINGS ON GLASS
4. THE DESIGN METHOD OF EFFECTIVE INTERFACES
5. TWO-LAYER ANTIREFLECTION COATINGS ON HIGH-INDEX SUBSTRATE
6. THREE-LAYER ANTIREFLECTION COATINGS ON HIGH-INDEX SUBSTRATE
7. THREE- AND FOUR-LAYER ANTIREFLECTION COATINGS ON GLASS
8. SYNTHESIZED LAYERS
9. THE COATING OF REFLECTION REDUCING MULTILAYERS
10. ENVIRONMENTAL STABILITY OF MULTILAYER ANTIREFLECTION COATINGS
11. OPTICAL PERFORMANCE OF COATINGS AND INCREASE IN TRANSMISSION THROUGH AN OPTICAL SYSTEM
12. PHOTOGRAPHIC APPLICATIONS
13. SUPPRESSION OF STRAY LIGHT IN AN OPTICAL SYSTEM
REFERENCES

V. STATISTICAL PROPERTIES OF LASER LIGHT by H. RISKEN (Stuttgart)
1. INTRODUCTION
2. SEMICLASSICAL THEORY
  2.1 Laser equations without noise
  2.2 Laser equation near threshold with noise (Langevin method)
  2.3 Laser equation with noise (Fokker-Planck equation method)
3. SOLUTION OF THE LASER FOKKER-PLANCK EQUATION
  3.1 Stationary solution and its expectation values
  3.2 Expansion in eigenmodes
  3.3 Correlation functions
  3.4 Transient of the laser oscillation
4. PHOTOELECTRON COUNTING DISTRIBUTION
  4.1 General relationships between the photoelectron distribution and intensity distribution
  4.2 Counting distribution for short intervals
  4.3 Expectation values for arbitrary intervals
  4.4 Condensation effect of the counting distribution
5. FULLY QUANTUM MECHANICAL THEORY
  5.1 Introductory remarks
  5.2 Model and derivation of the laser master equation
  5.3 Laser master equation near threshold
  5.4 c-number equation of the laser master equation
REFERENCES

VI. COHERENCE THEORY OF SOURCE-SIZE COMPENSATION IN INTERFERENCE MICROSCOPY by T. YAMAMOTO (Tokyo)
1. INTRODUCTION
2. GENERAL THEORY OF TWO-BEAM INTERFERENCE MICROSCOPES
  2.1 General theory of two-beam interferometers
  2.2 Localized fringes
  2.3 Interference microscope and source-size compensation
  2.4 Delay compensation
  2.5 Use of laser as a source for interference microscopes
3. COHERENCE DIFFRACTION THEORY OF IMAGE FORMATION AND TWO-BEAM INTERFERENCE
  3.1 Steel's unified theory
  3.2 Extension to the polarization interferometer
  3.3 Shearing elements and tilting elements
4. LOCALIZATION OF FRINGES WITH PARTIALLY COHERENT LIGHT
5. SOURCE-SIZE COMPENSATION
6. PRACTICAL METHODS OF SOURCE-SIZE COMPENSATION IN SHEARING INTERFERENCE MICROSCOPE WITH POLARIZED LIGHT
  6.1 Methods for obtaining fringed field of view
  6.2 Methods for obtaining uniform field of view
  6.3 Objective compensation and pupillary compensation
7. IMAGES OF SOURCE-SIZE COMPENSATED INTERFERENCE MICROSCOPES
  7.1 Image intensity
  7.2 Harmonic analysis of image intensity
  7.3 Effects of thickness of the object under observation
8. CONCLUSION
REFERENCES

VII. VISION IN COMMUNICATION by L. LEVI (New York, N.Y.)
1. BASIC CONCEPTS
  1.1 Problems of psychophysical investigations
  1.2 The fundamental characteristics
2. BRIGHTNESS FUNCTION
  2.1 Psychophysical measurement technique (GREEN and SWETS [1966])
  2.2 Instantaneous brightness function
  2.3 Steady-state brightness function
3. SPATIAL FREQUENCY RESPONSE
  3.1 Subsystems of the visual system
  3.2 Optical subsystem
  3.3 Retina-brain portion
  3.4 Total visual system
  3.5 Comparison of results
4. NOISE IN THE VISUAL SYSTEM
  4.1 Threshold measurements
  4.2 Noise sources and luminance dependence
  4.3 Spatial spectrum of noise
  4.4 Amplitude distribution of noise
  4.5 Noise on the object
5. SHAPE OF MTF, LINEARITY AND STATIONARITY
  5.1 Linearity of brightness function
  5.2 Shape of the visual mtf
  5.3 Non-linearities of area effects
  5.4 Stationarity, homogeneity, or isoplanatism
ACKNOWLEDGMENTS
APPENDIX: METHODS OF MEASURING THE MTF OF THE TOTAL VISUAL SYSTEM ABOVE THRESHOLD
REFERENCES

VIII. THEORY OF PHOTOELECTRON COUNTING by C. L. MEHTA (Rochester, N.Y.)
1. INTRODUCTION
2. PHOTOELECTRON COUNTING FORMULA
3. INTENSITY FLUCTUATIONS
  3.1 Polarized thermal light
  3.2 Partially polarized thermal light
  3.3 Laser light
  3.4 Harmonic signal mixed with thermal field
4. PHOTOCOUNTING DISTRIBUTION
  4.1 Polarized thermal light
  4.2 Partially polarized thermal light
  4.3 Laser light
  4.4 Bunching effects
5. DEAD TIME EFFECTS
  5.1 General theory
  5.2 Intensity stabilized laser light
  5.3 Thermal light
6. MULTIPLE CORRELATIONS
7. INVERSION PROBLEM
8. TWO PHOTON ABSORPTION
APPENDIX A: SOME PROPERTIES OF COMPLEX GAUSSIAN DISTRIBUTIONS
APPENDIX B: SOLUTION OF AN INTEGRAL EQUATION
REFERENCES

AUTHOR INDEX
SUBJECT INDEX
I

SYNTHETIC-APERTURE OPTICS

BY

J. W. GOODMAN

Stanford University, Stanford, California, USA

CONTENTS

INTRODUCTION
INTERFEROMETRY
FEEDBACK-CONTROLLED OPTICS
IMAGING WITH PARTIALLY FILLED APERTURES
APERTURE SYNTHESIS WITH COHERENT ILLUMINATION
OBJECT RESTORATION BEYOND THE DIFFRACTION LIMIT
APERTURE SYNTHESIS BY USE OF A PRIORI INFORMATION
CONCLUDING REMARKS
ACKNOWLEDGMENTS
REFERENCES
§ 1. Introduction

The most important practical limitation to the resolving power of any optical system is diffraction of light by the finite size of the primary collector. Often this limitation is not reached, for imperfections of the optical components may introduce fixed wavefront aberrations, and atmospheric turbulence may introduce dynamic distortions, both of which restrict resolution to less than the “diffraction limit”. Nonetheless, with sufficient cost, time and care, it is possible to manufacture large-aperture corrected optical systems which are, for all practical purposes, free from fixed aberrations. Furthermore, in space applications (KOPAL [1968]), or in earth-based applications which utilize appropriate image processing (MORGAN [1967]), atmospheric effects can be negligible and diffraction-limited performance can be closely approached. Thus for a significant class of problems, the primary limit to resolution is aperture size.

In such cases the most obvious means to obtain higher resolution is the construction of larger diffraction-limited optics. However, as aperture size is increased, a number of factors make further increases less and less attractive. First, the sheer weight and size of the optics make their use all the more awkward and difficult. Second, the cost and time associated with the fabrication of large-aperture systems soon become prohibitive for many applications. It is, therefore, important to consider alternative means for obtaining high resolution without the necessity of constructing ever larger optical components.

Our purpose in this review article is to present a survey of a variety of techniques for increasing optical resolution by what may be termed synthetic-aperture optics. This term is defined here in a very broad sense to include any technique for achieving, with one or more small apertures, the resolution normally associated with a single large aperture.

To date, the largest body of experience with aperture synthesis has been in the microwave region of the spectrum, where synthetic-aperture radars (CUTRONA, VIVIAN, LEITH and HALL [1961]) and radio-astronomy telescope arrays (BRACEWELL [1962]) have been used for a number of years. It is significant that most previous experience has been at microwave frequencies, for indeed the primary technical difficulties associated with many optical aperture-synthesis techniques can be traced to the minute size of optical wavelengths, typically some five orders of magnitude shorter than microwave wavelengths.

In the material to follow, the various approaches to optical aperture synthesis have been divided into six classes as follows: (1) interferometry; (2) feedback-controlled optics; (3) imaging with partially filled apertures; (4) aperture synthesis with coherent illumination; (5) object restoration beyond the diffraction limit; and (6) aperture-synthesis techniques based on a priori object information. In several cases a given technique could be placed logically in two or more of these categories, and in such cases we have chosen the particular category which seemed to us to be most appropriate.
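
To put rough numbers on the wavelength comparison made above, the following minimal sketch computes the aperture needed for a given angular resolution, and the path-length tolerance, at an optical and at a microwave wavelength. The Rayleigh estimate 1.22 λ/D for a filled circular aperture, the two sample wavelengths (0.5 µm and 5 cm), the one-microradian target and the one-tenth-wave tolerance are all illustrative assumptions, not values taken from the text.

```python
import math

# Illustrative (assumed) wavelengths: visible light and a centimetre-band radar.
OPTICAL_WAVELENGTH_M = 0.5e-6
MICROWAVE_WAVELENGTH_M = 5.0e-2


def rayleigh_limit_rad(wavelength_m: float, aperture_m: float) -> float:
    """Rayleigh angular resolution (radians) of a filled circular aperture."""
    return 1.22 * wavelength_m / aperture_m


def aperture_for_resolution_m(wavelength_m: float, resolution_rad: float) -> float:
    """Aperture diameter (metres) needed to reach a given Rayleigh resolution."""
    return 1.22 * wavelength_m / resolution_rad


def path_tolerance_m(wavelength_m: float, fraction: float = 0.1) -> float:
    """Path-length error allowed if phase is held to `fraction` of a wave (assumed)."""
    return fraction * wavelength_m


ratio = MICROWAVE_WAVELENGTH_M / OPTICAL_WAVELENGTH_M
target_rad = 1.0e-6  # arbitrary target resolution of one microradian

print(f"wavelength ratio (microwave / optical): {ratio:.1e}")
for name, lam in (("optical", OPTICAL_WAVELENGTH_M),
                  ("microwave", MICROWAVE_WAVELENGTH_M)):
    d = aperture_for_resolution_m(lam, target_rad)
    tol = path_tolerance_m(lam)
    print(f"{name:9s} aperture = {d:.3g} m, path tolerance = {tol:.3g} m")
```

Under these assumptions the same angular resolution calls for a far smaller aperture at optical wavelengths, but the element positions must be controlled correspondingly more tightly, which is the practical difficulty the paragraph above points to.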

§ 2. Interferometry

The oldest embodiments of the principles of optical aperture synthesis are found in the field of interferometry. For completeness, and for the purpose of establishing notation, we first briefly review the properties of interference phenomena (WOLF [1954, 1955], MANDEL and WOLF [1965]).

2.1. BASIC PRINCIPLES

Suppose that, as a result of some distant source, or system of sources, there exists across a certain plane a statistically stationary optical wave. Neglecting polarization effects (which can be included without changing the basic results to follow), we may represent the wave by a complex-valued analytic signal $V(x, t)$, which is a function of position $x$ in the plane and time $t$. The quantity of chief interest in interferometry experiments is the complex degree of coherence, $\gamma_{12}(\tau)$. This quantity is defined in terms of the time-averaged product of the field amplitudes at positions $x_1$ and $x_2$ separated by vector spacing $s = x_2 - x_1$, and at time instants $t_1$ and $t_2$ separated by delay $\tau$. Thus

$$\gamma_{12}(\tau) = \frac{\langle V(x_1, t+\tau)\, V^*(x_2, t) \rangle}{\left[ \langle |V(x_1, t)|^2 \rangle \, \langle |V(x_2, t)|^2 \rangle \right]^{1/2}},$$

where the angle brackets signify an infinite time average.

Fig. 2.1. Young’s interference experiment.

The complex degree of coherence has a direct physical interpretation in terms of the Young’s interference experiment illustrated in Fig. 2.1. Let S be an arbitrary source, and suppose that two pinholes are pierced at points $x_1$ and $x_2$ in an otherwise opaque screen. The intensity of light observed at a point P behind the screen can be expressed as

$$I(P) = I_1 + I_2 + 2\sqrt{I_1 I_2}\; \mathrm{Re}\{\gamma_{12}(\tau)\},$$
where $I_1$ and $I_2$ are the intensities that would be produced from the two pinholes individually, $\tau$ is the relative delay introduced by the difference in propagation times over paths from $x_1$ to P and $x_2$ to P, $c$ is the velocity of light, and Re{ } signifies “real part of”. As indicated in Fig. 2.1, the pattern of interference consists of relatively fine fringes under a coarse spatial envelope. The envelope has fallen appreciably when the delay time $\tau$ approaches a value equal to the reciprocal of the optical bandwidth of the interfering waves. The fine structure of the pattern changes by one full period when $\tau$ changes by one period of the optical oscillation. When the optical wave is quasi-monochromatic (narrowband) and when $\tau$ is much smaller than the reciprocal of the optical bandwidth, the fringes are of approximately constant depth, and the complex degree of coherence may be written

$$\gamma_{12}(\tau) = \mu_{12} \exp(-2\pi i \bar{\nu} \tau),$$

where $\bar{\nu} = c/\bar{\lambda}$ is the mean optical frequency of the waves and $\mu_{12} = \gamma_{12}(0)$ is known as the “complex coherence factor” (also known as the “complex visibility function” in radio astronomy). The modulus $|\mu_{12}|$ of the complex coherence factor may be identified as the “visibility” of the fringes in the usual sense; that is,

$$|\mu_{12}| = \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}},$$
where $I_{\max}$ and $I_{\min}$ signify the maximum and minimum intensities of the fringes for a given spacing $s$. The phase of the complex coherence factor is identical with the spatial phase of the sinusoidal fringe pattern, relative to the phase of the fringe pattern that would be produced by a monochromatic point source of frequency $\bar{\nu}$, located in the plane of S, equidistant from points $x_1$ and $x_2$.

Aside from the basic definitions above, the relation of most importance to us here is the van Cittert-Zernike theorem (BORN and WOLF [1964]), which relates the complex coherence factor to the intensity distribution across an incoherent source of illumination. Let $I(u)$ represent the intensity distribution of that source, where $I(u)$ is normalized such that $\int I(u)\,du$ equals unity, and $u$ is an angular position measured in radians, as seen from the origin of the $x$ plane. The van Cittert-Zernike theorem specifies that $I(u)$ and $\mu_{12}$ are related by

$$\mu_{12} = e^{i\psi_{12}} \int I(u) \exp(2\pi i\, s \cdot u)\, du,$$

where, to take care of scaling factors that would otherwise arise, $|s|$ is interpreted as the spacing of the two pinholes measured in wavelengths. The phase angle $\psi_{12}$ depends on the distances $D_1$ and $D_2$ from the origin of the $u$ plane to the points $x_1$ and $x_2$, respectively, in accord with the formula

$$\psi_{12} = \frac{2\pi (D_1 - D_2)}{\bar{\lambda}}.$$
This angle may be neglected when the source of interest is very distant from the x plane ($D_1 - D_2$ […]
[…] $4\nu_c a$. Using the one-dimensional version of
the Rayleigh criterion, the number of resolvable elements in the unprocessed image is $4\nu_c a$. Thus when resolution beyond the diffraction limit is attempted, very large contributions to the mean-square image error are introduced. It is therefore very important that any background noise be confined to the same spatial region occupied by the object.
(c) White measurement noise (i.e., noise with a constant power spectral density over all frequencies). The mean-square image error is in this case $\sum_{n=0}^{M} (1/\lambda_n)^2$. Thus the errors increase even more rapidly than in the preceding case.
(d) Bandlimited measurement noise. If the spectral density of the measurement noise is uniform over the passband of the system and zero outside that passband, the mean-square image noise is again $\sum_{n=0}^{M} 1/\lambda_n$.
Comparison of cases (c) and (d) shows that it is extremely important to artificially bandlimit any noise produced during the detection process. However, even when such precautions are taken, the mean-square image error increases dramatically when resolutions beyond the diffraction limit are attempted.
Figure 6.1 shows a plot of the r.m.s. image error $(\overline{e^2})^{1/2}$ vs. the parameter $d/R$, where $d$ is the size of a resolution element in the processed image (as determined by the average spacing of the zeros of $\psi_M(x)$) and R is the usual Rayleigh limit to resolution ($R = 1/(2\nu_c)$ in this one-dimensional analysis). The parameter $c = 4\nu_c a$ represents the space-bandwidth product of the unprocessed image. For simplicity, a noise spectral density of unity is assumed in this figure. Lower S/N ratios in the unprocessed image imply correspondingly higher values of $(\overline{e^2})^{1/2}$ than indicated here. Each different value of c yields a different curve, and each curve is characterized by two regions, one of low slope and one of high slope. As the complexity of the unprocessed image increases (i.e., c increases), the curves shift to the right, indicating that improving the resolution in the processed image to some fixed fraction of the Rayleigh limit requires greater and greater signal-to-noise ratio as the space-bandwidth product of the unprocessed image grows.
These results indicate that, if the object is very poorly resolved by the optical system at the start, a significant improvement in resolution (i.e., the addition of a few resolvable elements) can be accomplished if reasonably high signal-to-noise ratios are available before processing. If, however, the object is extremely complex at the start (e.g., an aerial reconnaissance photograph), improvement of resolution by even a small fraction of the Rayleigh limit requires signal-to-noise ratios that are unrealistically high. There are, of course, applications in which the object is very poorly resolved at the start and the signal-to-noise ratio is high. For such applications, object restoration beyond the diffraction limit could play an important role. In addition, in microscope imagery of time-invariant specimens, very large signal-to-noise ratios can be achieved by scanning the image slowly with a very sensitive detector, and therefore object restoration may find some application in this area.
Fig. 6.1. r.m.s. image error as a function of $d/R$ (after RUSHFORTH and HARRIS [1968]).
Finally, it should be appreciated that when information less complete than full object restoration (e.g., discrimination, counting, determination of size and general shape) is desired, significant improvements appear achievable by post-detection processing with far less stringent requirements on signal-to-noise ratio.
§ 7. Aperture Synthesis by Use of a priori Information
A wide variety of techniques for utilizing a priori object information to obtain resolution beyond the usual diffraction limit are described in the literature. Indeed several techniques described in previous sections could be placed in this category. An object is seldom a completely general function of position, polarization and time; when one or more characteristics of the object are specified a priori, such restrictions can often be used to obtain improved resolution (see LOHMANN [1968], LUKOSZ [1966]). In this section we give some examples of the important gains that can be achieved by proper use of a priori information.
7.1. OBJECTS RESTRICTED IN POLARIZATION
When a nonbirefringent object is illuminated by polarized light, proper encoding of the object information into two separate polarization “channels” can double the resolution of the optical system. For example (see LOHMANN [1956], GARTNER and LOHMANN [1963], LOHMANN and PARIS [1964]), as shown in Fig. 7.1, a Wollaston prism can be used to encode spatial frequencies $\nu_x > 0$ into one polarization and spatial frequencies $\nu_x < 0$ into the orthogonal polarization. The full aperture of the optical system is used for each polarization component. A second Wollaston prism recombines the two channels to produce an image with doubled resolution in the x-direction.
Fig. 7.1. Aperture synthesis with Wollaston prisms (LOHMANN and PARIS [1964]). Labeled elements: source, polarizer, object, Wollaston prism, imaging elements, Wollaston prism, analyzer, image.
7.2. OBJECTS RESTRICTED IN SPATIAL STRUCTURE
When the spatial structure of the object is highly constrained, significant improvements of resolution can be attained by taking advantage of these constraints. An example is the use of the moiré effect by Lord Rayleigh to test gratings. When a grating of known periodicity and unknown defects is placed in contact with a perfect grating of the same period, deformations of the coarse moiré fringes indicate the very fine grating defects. A second example is provided by the so-called “super-gain aperture” (TORALDO DI FRANCIA [1952]). If the amplitude and phase transmission across the pupil of an imaging system are varied in the proper prescribed manner, the central maximum of the point-spread function is made narrower, at the price of higher secondary maxima, which may be several resolution units from the central maximum. Such a technique can be used to resolve two point sources separated by less than the Rayleigh limit, at the price of inefficient use of energy and a very limited allowable object field. Other examples are the double star methods of LACOMME [1954] and LAU [1937], which may be used to resolve two closely spaced point sources, and the grating technique utilized by BACHL and LUKOSZ [1967], which increases resolution at the price of a limited object field. The reader may consult the references cited for further details.
7.3. TEMPORALLY RESTRICTED OBJECTS
If the object is time independent, or changes only slowly with time, a scanning system may be employed to transfer spatial information to the time domain. The temporally modulated wave may be sent to the image plane by an optical system with essentially no resolving power, and a second, synchronous scanning system may then be used to reconstruct the image. A second example is the moving-grating technique of LUKOSZ and MARCHAND [1963], which is illustrated in Fig. 7.2 (see also LUKOSZ [1967]). In this case a moving grating is used to deflect the object light through a multiplicity of angles, each angle of deflection being simultaneously coded with a different temporal frequency. The various temporal frequency components each carry through the system a different segment of the spatial-frequency spectrum of the object. At the output of the system a second grating, located in a plane conjugate to that of the first grating, moves in the opposite direction with the same speed. The result is a decoding of the temporal frequency information. The image produced by the first lens is then relayed to a detector which has associated with it a bandpass temporal filter, responding only to frequency $\nu_0$. The final image resolution can far exceed that produced by the unaided optical system.
Fig. 7.2. Aperture synthesis with a moving grating (LUKOSZ [1967]).
7.4. CHROMATICALLY RESTRICTED OBJECTS
If the object is chromatically restricted, i.e., contains only various shades of gray, then spatial information can be transferred to chromatic information. For example, if white light is sent into a prism spectroscope and the displayed spectrum is used for illuminating the black and white object, a specific wavelength is assigned to each spatial x-coordinate (see, for example, KARTASHEV [1960]). The chromatically coded light may be sent through an optical system with spatial resolution in the y-direction only. A second prism spectroscope again displays the various wavelengths in the x-direction, thus recovering spatial resolution in that dimension.
§ 8. Concluding Remarks
In the previous sections we have attempted to outline the present status of the field of synthetic-aperture optics. Some of the techniques mentioned are old, while some are relatively new. Some of the techniques discussed have been put to practical use on a limited basis (e.g., the Michelson stellar interferometer and the intensity interferometer); however, the majority have not yet reached a state of development that allows their use on any significant scale. It is premature to attempt a guess as to which of these various methods will be important in the future and which will remain a theoretical curiosity, for the results will undoubtedly rest on future developments in optical technology and on the ingenuity of optical engineers.
Acknowledgments
The material presented in this article has been drawn largely from a report entitled “Synthetic-Aperture Optics”, which was the product of a Summer Study held in 1967 by the National Academy of Sciences, under the sponsorship of the U.S. Air Force Systems Command, to whom we express our gratitude. While it is impossible to mention the names of all 56 scientists who participated in the Study, special mention should be made of the following: David D. Cudaback, Douglas G. Currie, Richard H. Miller, and David H. Rogstad, who made major contributions to section 2 of this article; David A. Markle, for his contributions to section 3; Janusz S. Wilczynski, for his contributions to section 4; Craig K. Rushforth, who is responsible for much of section 6; and Adolf W. Lohmann, whose contributions are represented in very abbreviated form by section 7. The good qualities of this article are reflections on the competence and enthusiasm of the participants in the Study; the author accepts responsibility for any shortcomings.
References
BACHL, A. and W. LUKOSZ, 1967, J. Opt. Soc. Am. 57, 163.
BARNES, C. W., 1966, J. Opt. Soc. Am. 56, 575.
BEAVERS, W. I., 1963, Astron. J. 68, 273.
BEAVERS, W. I. and W. D. SWIFT, 1968, Appl. Opt. 7, 1975.
BORN, M. and E. WOLF, 1964, Principles of Optics (second ed., Pergamon Press, London, N.Y.).
BRACEWELL, R. N., 1962, in: Handbuch der Physik, Vol. LIV, ed. S. Flügge (Springer-Verlag, Berlin) pp. 42-129.
BROWN, R. Hanbury and R. Q. TWISS, 1956, Nature 177, 27.
BROWN, R. Hanbury, 1968, in: Annual Review of Astronomy and Astrophysics 6, ed. L. Goldberg (Annual Reviews, Inc., Palo Alto, Calif.) pp. 13-38.
BUCK, G. J. and J. J. GUSTINCIC, 1967, IEEE Trans. Ant. Prop. AP-15, 376.
CRANE, R., 1969, Appl. Opt. 8, 538.
CURRIE, D. G., 1968a, in: Synthetic-Aperture Optics, Vol. II (National Academy of Sciences, Washington, D.C.) pp. 79-83.
CURRIE, D. G., 1968b, ibid., pp. 35-41.
CUTRONA, L. J., W. E. VIVIAN, E. N. LEITH and G. O. HALL, 1961, IRE Trans. Mil. Electr. MIL-5, 127.
CUTRONA, L. J. and G. O. HALL, 1962, IRE Trans. Mil. Electr. MIL-6, 119.
CUTRONA, L. J., E. N. LEITH, L. J. PORCELLO and W. E. VIVIAN, 1966, Proc. IEEE 54, 1026.
ELLIOT, J. L., 1965, M.S. Thesis, Dept. of Physics, MIT.
ELLIOT, J. L. and F. SCHERB, 1967, Astron. J. 72, 794.
EVANS, J. V. and T. HAGFORS, 1968, Radar Astronomy (McGraw-Hill Book Co., New York).
FRIEDEN, B. R., 1967, J. Opt. Soc. Am. 57, 1013.
GAMO, H., 1963, J. Appl. Phys. 34, 875.
GARTNER, W. and A. LOHMANN, 1963, Z. Physik 174, 18.
GOODMAN, J. W. and R. W. LAWRENCE, 1967, Appl. Phys. Letters 11, 77.
HARRIS, J. L., 1964, J. Opt. Soc. Am. 54, 931.
HODARA, H., 1965, Proc. IEEE 53, 696.
HUFNAGLE, R. E., 1967, in: Restoration of Atmospherically Degraded Images, Vol. II (National Academy of Sciences, Washington, D.C.) pp. 239-246.
JENNISON, R. C., 1958, Monthly Notices Roy. Astron. Soc. 118, 276.
JENNISON, R. C., 1967, Introduction to Radio Astronomy (Philosophical Library, Inc., New York).
KARTASHEV, A. I., 1960, Opt. Spectry. 9, 204.
KINGSTON, R. H., 1968, in: Synthetic-Aperture Optics, Vol. II (National Academy of Sciences, Washington, D.C.) pp. 73-78.
KOPAL, Z., 1968, Telescopes in Space (Faber & Faber Ltd., London).
KULAGIN, E. S., 1967, Opt. Spectry. 23, 459.
LACOMME, P., 1954, Opt. Acta 1, 33.
LAU, E., 1937, Phys. Z. 38, 446.
LOHMANN, A. W., 1956, Opt. Acta 3, 97.
LOHMANN, A. W. and D. P. PARIS, 1964, Appl. Opt. 3, 1037.
LOHMANN, A. W., 1968, in: Synthetic-Aperture Optics, Vol. II (National Academy of Sciences, Washington, D.C.).
LUKOSZ, W. and M. MARCHAND, 1963, Opt. Acta 10, 241.
LUKOSZ, W., 1966, J. Opt. Soc. Am. 56, 1463.
LUKOSZ, W., 1967, J. Opt. Soc. Am. 57, 932.
MACPHIE, R. H., 1966, IEEE Trans. Ant. Prop. AP-14, 369.
MANDEL, L. and E. WOLF, 1965, Rev. Mod. Phys. 37, 231.
MARKLE, D. A., 1968, in: Synthetic-Aperture Optics, Vol. II (National Academy of Sciences, Washington, D.C.) pp. 133-167.
MEHTA, C. L., 1968, J. Opt. Soc. Am. 58, 1233.
MEYER-ARENDT, J. R. and C. B. EMMANUEL, 1965, Optical Scintillation: A Survey of the Literature, N.B.S. Tech. Note No. 225 (National Bureau of Standards, Washington, D.C.).
MICHELSON, A., 1890, Phil. Mag. 30, 1.
MICHELSON, A., 1920, Astrophys. J. 51, 257.
MICHELSON, A. and F. PEASE, 1921, Astrophys. J. 53, 249.
MILLER, R. H., 1966, Science 153, 581.
MILLER, R. H., 1968, in: Synthetic-Aperture Optics, Vol. II (National Academy of Sciences, Washington, D.C.) pp. 95-105.
MORGAN, S. P., 1967, Nature 213, 465.
MURRAY, B. C., 1968, in: Synthetic-Aperture Optics, Vol. II (National Academy of Sciences, Washington, D.C.) pp. 239-262.
O'NEILL, G. K., 1968, Science 160, 843.
PEASE, F. G., 1931, Ergeb. Exakt. Naturw. 10, 84.
PROTHEROE, N. M., 1955, Contrib. Perkins Obs. 2, 127.
ROGSTAD, D. H., 1968, Appl. Opt. 7, 585.
ROMAN, P. and A. S. MARATHAY, 1963, Nuovo Cimento 30, 1452.
RUSHFORTH, C. K. and R. W. HARRIS, 1968, J. Opt. Soc. Am. 58, 539.
SCHERB, F. and J. L. ELLIOT, 1967, Astron. J. 72, 826.
SHERWIN, C. W., J. P. RUINA and R. D. RAWCLIFFE, 1962, IRE Trans. Military Electron. MIL-6, 111.
SIEGMAN, A. E., 1966, Proc. IEEE 54, 1350.
SLEPIAN, D. and H. O. POLLAK, 1961, Bell Syst. Tech. J. 40, 65.
STROKE, G. W., 1969, Opt. Acta 16, 401.
TORALDO DI FRANCIA, G., 1952, Nuovo Cimento Suppl. 9, 426.
TORALDO DI FRANCIA, G., 1969, J. Opt. Soc. Am. 59, 799.
TWISS, R. Q., 1969, Opt. Acta 16, 423.
WALTHER, A., 1963, Opt. Acta 10, 41.
WILCZYNSKI, J. S., 1967a, J. Opt. Soc. Am. 57, 579.
WILCZYNSKI, J. S., 1967b, J. Opt. Soc. Am. 57, 1415.
WILCZYNSKI, J. S., 1968, in: Synthetic-Aperture Optics, Vol. II (National Academy of Sciences, Washington, D.C.) pp. 211-233.
WOLF, E., 1954, Proc. Roy. Soc. (London) A 225, 96.
WOLF, E., 1955, Proc. Roy. Soc. (London) 230, 246.
WOLF, E., 1962, Proc. Phys. Soc. (London) 80, 1269.
WOLTER, H., 1961, in: Progress in Optics, Vol. I, ed. E. Wolf (North-Holland Publishing Co., Amsterdam) pp. 155-210.
II
THE OPTICAL PERFORMANCE OF THE HUMAN EYE
BY
GLENN A. FRY
College of Optometry, The Ohio State University, Columbus, Ohio, USA
CONTENTS
§ 1. INTRODUCTION, INVESTIGATIVE APPROACHES
§ 2. THE ANATOMY AND PHYSIOLOGY OF THE RETINA
§ 3. BASIC CONCEPTS: LINES, POINTS AND BORDERS
§ 4. VISUAL ACUITY, RESOLVING POWER AND CONTRAST AND MODULATION SENSITIVITY
§ 5. SPREAD AND TRANSFER FUNCTIONS
§ 6. PSYCHOPHYSICAL TECHNIQUES
§ 7. DETERMINATION OF THE MODULATION TRANSFER FUNCTION FOR THE RETINA
§ 8. PERCEIVED MODULATION OF A SINGLE-WAVE GRATING AT SUPRATHRESHOLD LEVELS
§ 9. DETERMINATION OF THE MODULATION TRANSFER FUNCTION OF THE OPTICAL SYSTEM OF THE EYE
§ 10. RESOLVING POWER AT UNIT MODULATION
§ 11. DEPENDENCE OF THE RESOLVING POWER ON WAVELENGTH COMPOSITION
§ 12. VISIBILITY OF SQUARE-WAVE GRATINGS
§ 13. THE VISIBILITY OF A BAR
§ 14. THE VISIBILITY OF BORDERS
§ 15. RETINAL REFLECTOMETRY
§ 16. ABERRATIONS OF THE EYE
§ 17. MOTOR ADJUSTMENTS OF THE EYE
§ 18. GENERAL REVIEWS
APPENDIX I
REFERENCES
§ 1. Introduction, Investigative Approaches
The human eye (Fig. 1) is an optical device which forms an optical image on its retina. The retina (Fig. 2) has many layers, but the first layer to become involved in the act of seeing is the layer of photoreceptors which are called rods and cones. The rods respond at low levels of irradiance so that if two neighboring rods absorb one quantum each within the same tenth of a second, the two rods acting together can generate a signal which can be transmitted through the retina to the brain and can be detected. Two quanta absorbed by a single rod can also generate such a signal. The cones are much less sensitive than the rods, but at the center of the fovea only cones are found, and the sensitivity of this region of the retina has to depend on the cones. A single cone has to absorb many more than two quanta during a short flash of light in order to generate a detectable signal. The same number of quanta can also produce a detectable signal if they are distributed over a small cluster of cones.
Fig. 1. Horizontal section of the human eye. (From FRY [1965a].)
Fig. 2. The retina. A schematic reconstruction showing pathways through the retina and their lateral interconnections. (Drawing from POLYAK [1957] and relabeled by FRY [1965a].) Labeled layers: choroid; pigment epithelium; rods and cones; outer limiting membrane; cell bodies of rods (a) and cones (b); synapses; bipolar (d, e, f, h) and horizontal (c) cells; synapses; amacrine cells (i); ganglion cells (m, n, o, p, s); ganglion cell axons; inner limiting membrane; vitreous.
In dealing with theoretical problems of optical image formation on the retina, one can substitute for the layer of rods and cones a hypothetical retinal layer which is smooth and uniform, infinitesimal in thickness, and capable of absorbing all the light that falls on it.
Although it is true that many problems can be solved by this approach, one can also use other approaches. One can enucleate the eye and cut a window in the back of the eye as DEMOTT [1959] and others have done with eyes of lower animals, and investigate directly the quality of the image formed in the vitreous humor. So far the results obtained in this way have not correlated well with data obtained by other kinds of measurements. Problems have been encountered in preventing changes in the eye between enucleation and measurement. Such eyes are deprived of their blood supply and mechanisms for maintaining pressure, posture and transparency.
Another approach is to study the image formed by light reflected from the retina. This is useful in retinoscopy and objective refractometry where measurements are needed for prescribing glasses to correct refractive errors and to compensate for loss of accommodation. The light is diffusely reflected, and the pigment epithelium appears to be
the layer most involved. Fortunately, this layer lies next to the photoreceptors so that no allowance needs to be made for discrepancies between subjective and objective measurements of the refraction of the eye (FRY [1949]). The refraction of the eye is the reciprocal of the distance measured along the primary line of sight from the spectacle point (14 mm in front of the cornea) to the point conjugate to the retina. It is not possible as of now to evaluate the role played by the retinal elements in degrading the image formed on and reflected back from the pigment epithelium, and it has been difficult to relate the quality of this image to the image picked up by the photoreceptors as the light passes through the retina on the way to the pigment epithelium. Progress, however, is being made in this direction. Vos [1963] in particular has studied various ramifications of this problem.
Many investigators have measured the distribution of flux in the images formed by light reflected from the retina and have used this information to assess the quality of the optical images formed on the retina. These measurements will be considered in more detail later.
In the psychophysical approach the eye is left intact and the subject describes what he sees when certain stimuli are applied to the retina. Among other things, he can make the judgment that he did or did not see something during a specified interval of time, or that he is now seeing this or that, or that two patches are equal in brightness or can be discriminated from each other. If we utilize such judgments in the study of the optical performance of the eye, it is necessary to understand the retinal mechanisms for transforming the optical image to a physiological impression that can be transmitted along isolated optic nerve fibers to the brain. It is very proper, therefore, to include in this discussion the eye-brain mechanisms for processing and transmitting retinal impressions.
§ 2. The Anatomy and Physiology of the Retina
If we are to use the psychophysical approach, it is essential to understand the structure and function of the retina. It is not a matter simply of reviewing well known facts. There has been continuous progress toward a fuller understanding of these mechanisms. These recent developments are the things that I propose to emphasize.
Figure 1 shows a cross section of the human eye. Light entering the eye passes through the cornea on its way to the pupil, which constitutes the aperture-stop, and the light is then refracted by the crystalline lens of the eye, which lies behind the pupil, and proceeds to the retina. As shown in POLYAK'S [1957] diagram (Fig. 2), the photoreceptors lie on the back side of the retina so that light passing through the pupil must also pass through the entire retina before entering the photoreceptors. The photoreceptors funnel the light into their outer segments. This funnelling process amplifies the problem of self-filtering of photopigment that absorbs light and generates signals. In the case of the cones, the funnelling accounts for the directional sensitivity in which cones pointed toward the center of the pupil absorb much more light than those pointing toward the edge of the pupil. This is known as the STILES-CRAWFORD [1933] effect and is illustrated in Fig. 3.
Fig. 3. The Stiles-Crawford effect. Data for the horizontal meridian of the right eye of Subject B.H.C. (From FRY [1955].) Abscissa: distance from the center of the pupil.
Since the cones have a center-to-center distance of about 3 μ, the diameter of a cone is small with respect to the wavelength of light, and it is necessary to treat the photoreceptors as wave guides; thus a special branch of optics called retinal optics (ENOCH [1963]) has grown up to cover this phase of the problem. The important question to be asked in this connection is whether the photoreceptors that fall in the different parts of the area covered by the image of a point respond in proportion to the flux per unit area falling in these parts (FRY [1955]; ENOCH and FRY [1958]).
Retinal densitometry is an objective method of studying the bleaching of the photopigment by light. It has been demonstrated (RUSHTON [1956 and 1961]; FRY [1969a]) that the rate of bleaching in rods for white light (2750 °K) is
$$ \frac{ds}{dt} = -1.62 \times 10^{-7}\, E s, \tag{1} $$
where s is the fraction of the maximum amount of photopigment in a given unit area of the retina, and E is the illuminance falling on the retina, expressed in photopic trolands. When a given molecule of photopigment absorbs a quantum, it decomposes and generates a signal; but the molecule eventually regenerates so that in darkness the total amount of photopigment per rod always returns to its maximal value. As a matter of fact, at ordinary levels of luminance only a small fraction of the pigment is bleached, so that the rate of bleaching is proportional to the retinal illuminance. This fact is of considerable importance in analyzing the effect of optical blur on the response of the eye. Note that when s = 1, eq. (1) becomes a statement of the basic photochemical (Roscoe-Bunsen) principle, that the response is proportional to the intensity times the time, i.e.,
$$ ds \sim E\,dt. \tag{2} $$
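The behaviour described by eqs. (1) and (2) can be illustrated with a short numerical sketch. It integrates eq. (1) for a constant retinal illuminance and compares the result with the linear small-exposure form of eq. (2). Several things here are assumptions rather than statements from the text: the time unit is taken to be seconds, regeneration of the pigment is ignored, and the function names are purely illustrative.

```python
import math

# Rate constant from eq. (1) for rods in white light.
# ASSUMPTION: the unit is per (photopic troland x second); the text does not state the time unit.
K_ROD = 1.62e-7


def fraction_unbleached(E_trolands, t_seconds, k=K_ROD, s0=1.0):
    """Closed-form solution of ds/dt = -k*E*s for constant E.

    ASSUMPTION: regeneration of the pigment is ignored.
    """
    return s0 * math.exp(-k * E_trolands * t_seconds)


def fraction_bleached_linear(E_trolands, t_seconds, k=K_ROD):
    """Small-exposure form corresponding to eq. (2): bleached fraction ~ k*E*t."""
    return k * E_trolands * t_seconds


if __name__ == "__main__":
    for E, t in [(100.0, 1.0), (1000.0, 10.0), (1.0e5, 60.0)]:
        exact = 1.0 - fraction_unbleached(E, t)
        linear = fraction_bleached_linear(E, t)
        print(f"E={E:g} td, t={t:g} s: bleached exact={exact:.6f}, linear={linear:.6f}")
```

Under these assumptions the two forms agree closely as long as the product k E t is small, which is the regime the text describes as "ordinary levels of luminance"; they diverge only for very large exposures.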
The situation is somewhat more complicated with cones. Microspectrophotometry has made it possible to demonstrate that there are three kinds of cones: red-sensitive, green-sensitive and blue-sensitive. Furthermore, retinal densitometry has made it possible to measure the rates of bleaching of the red-sensitive and green-sensitive pigment in the fovea of the eye where only cones are found. For white light (2750 °K) the rate of bleaching (RUSHTON [1958]) is the same for red-sensitive and green-sensitive pigments:
$$ \frac{ds}{dt} = -2 \times 10^{-7}\, E s. \tag{3} $$
Fig. 4. Relative luminosity curves: (a) photopic; (b) scotopic. (From Science of Color.)
The amount of blue-sensitive pigment is so small that up to now its rate of bleaching has not been measured by retinal densitometry.
In studies of image processing by the eye, it is customary to use an artificial aperture-stop which restricts the size of the beam entering the eye to about 2 mm. This eliminates any complications from the Stiles-Crawford effect and keeps the rate of bleaching proportional to the amount of light entering the eye.
The concept of rate of bleaching gives new meaning to the photopic (cone) and scotopic (rod) luminosity curves. These luminosity curves (Committee on Colorimetry of the Opt. Soc. Am. [1953]) are shown in Fig. 4. The retinal illuminance can be re-expressed in terms of retinal irradiance; luminosity for the rods and for the different kinds of cones can be re-expressed in terms of rate of bleaching per unit of retinal irradiance for the different $d\lambda$ bands of the spectrum. The laws of color mixture indicate that the signals generated by the different cones are proportional to the rates of bleaching and are additive; hence, it is assumed that brightness information from red-, green-, and blue-sensitive cones is combined and summated and transmitted over the same channels. Therefore, it may be said that the brightness response of the cones in a given region of the retina is proportional to the sums of the rates of bleaching in the three kinds of cones. In the case of the peripheral retina, the response of the rods is integrated with that of the cones.
Fig. 5. The red (R), green (G) and blue (B) components of the ICI photopic luminosity curve ($\bar{y}$). The graph at the right for the blue end of the spectrum is magnified vertically 100 times. (From FRY [1965b].) Abscissa: wavelength (mμ).
Fig. 6. Concentration of rods and cones at different distances from the center of the fovea (based on the data of ØSTERBERG [1936]). Abscissa: angular displacement from the center of the fovea.
The splitting of the photopic luminosity curve into its red, green, and blue components must conform to the absorption curves found by retinal densitometry and microspectrophotometry of individual photoreceptors, but one can analyze color mixture data by locating the three fundamental colors on a color mixture diagram according to specific criteria and can use this approach to make a precise evaluation of the red, green, and blue components of the luminosity curve. My own analysis (FRY [1965]) of the luminosity curve into its three components is shown in Fig. 5. It is important to keep in mind the relative number of rods and cones at different parts of the retina. ØSTERBERG'S [1935] data are shown in Fig. 6. At the center of the fovea only cones are found. The number of ganglion cells per unit area is higher here than elsewhere, and on this account the fovea is better designed for the perception of fine detail than any other part of the retina. Thus, to see most effectively one must point his eye so that the critical details which need to be seen project their images on the fovea. This pointing skill is a feature of the eye’s performance as an optical instrument which will be considered later. The important thing is that when the eye is used in conjunction with a microscope or telescope or for viewing a photograph made with a camera, it is the foveal region of the retina which is all important. It is proposed, therefore, to cover only foveal vision in this review. Another feature of the retina is that each photoreceptor is not
Fig. 7. Line spread functions for inhibition (I) and excitation (R) in the retina. The area under the I curve is 0.8 times the area under the R curve.
connected by an independent channel to its private optic nerve fiber. Although the number of optic nerve fibers supplying the fovea is about as large as the number of cones in the fovea, the interconnections are very diffuse. As shown in Fig. 2, the rods and cones have to transmit their signals to the ganglion cells. l k ganglion cells give rise to an intermittent all-or-none “spike” response, whereas the photoreceptors give rise to steady potentials. By means of lateral branches or by means of the horizontal and amacrine cells the rods and cones can transmit their excitatiorl to a large cluster of ganglion cells. On the other hand, the rods and cones can also generate inhibitory potentials which are even more widespread. I n the following analysis, which applies to the human retina, it has been assumed that the line spread functions for excitation and inhibition are bilaterally symmetrical and conform to the spread functions shown in Fig. 7 . The spread function for excitation has a sharp peak which prevents the spread of excitation from restricting acuity. The wide bottom provides widespread lateral summation which makes it possible for large objects to be seen a t low levels of illumination. As will be seen later, the diffuse spread function for inhibition enhances contrast. It is assumed that the excitatory and inhibitory effects generated by cones are proportional t to the stimulus applied to the cones and that these effects are strictly additive when they converge at a common pathway. These assumptions make it possible to predict response patterns by convoluting the stimulus pattern with the physiological spread functions. The assumptions that have been made about excitation and inhibition cannot be claimed to have been fully substantiated, although they do provide an explanation of what the eye sees. 
The amount of excitation and inhibition generated in the retina need not be proportional to the stimulus, and the inhibition need not be of the simple forward type in which one part of the retina inhibits activity in another region without inhibiting the capability of that region for inhibiting other regions. The processes of excitation and inhibition could be much more complicated than have been assumed, and in the end these assumptions may have to be modified or revised.
† This relation could be non-linear without affecting too much the calculation of the distributions of excitation and inhibition, but the evidence indicates that the relation is actually linear.
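The prediction of a response pattern by convolution can be sketched numerically. The following is a minimal illustration, not the author's computation: it assumes an exponential line spread for excitation and a Gaussian line spread for inhibition with 0.8 times the excitatory area (as in Fig. 7), and the constants `k` and `sigma` as well as all function names are hypothetical choices made only for the example. Because the effects are taken to be additive, the net response is obtained by convolving a dark/bright border with the difference of the two spread functions.

```python
import math

def excitation_lsf(t, k=2.0):
    """Sharp-peaked exponential line spread with unit area (k is a hypothetical value)."""
    return 0.5 * k * math.exp(-k * abs(t))

def inhibition_lsf(t, sigma=3.0, area=0.8):
    """Broad Gaussian line spread with 0.8 of the excitatory area (sigma is hypothetical)."""
    return area * math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def convolve(stimulus, kernel, dt):
    """Discrete form of response(s) = integral of kernel(t) * stimulus(s - t) dt.

    The stimulus is extended with its end values so the output keeps the same length.
    """
    half = len(kernel) // 2
    last = len(stimulus) - 1
    out = []
    for i in range(len(stimulus)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = min(max(i - (j - half), 0), last)
            acc += weight * stimulus[idx]
        out.append(acc * dt)
    return out

def edge_response(dt=0.1, field=20.0, reach=15.0):
    """Net response (excitation minus inhibition) to a dark/bright border at s = 0."""
    n_s = int(round(field / dt))
    n_t = int(round(reach / dt))
    s = [i * dt for i in range(-n_s, n_s + 1)]
    t = [j * dt for j in range(-n_t, n_t + 1)]
    stimulus = [1.0 if x >= 0.0 else 0.0 for x in s]
    net_kernel = [excitation_lsf(x) - inhibition_lsf(x) for x in t]
    return s, convolve(stimulus, net_kernel, dt)

s, response = edge_response()
print("plateau far inside the bright side:", round(response[-1], 3))
print("maximum near the border:", round(max(response), 3))
print("minimum near the border:", round(min(response), 3))
```

Under these assumed constants the response rises above its bright-side plateau just inside the bright side of the border and dips below zero just inside the dark side, which is the kind of contrast enhancement attributed above to the diffuse spread of inhibition.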
Electrophysiological techniques have been used by ENROTH-CUGELL and ROBSON [1966] to study the mechanisms of spread of inhibition and excitation. The activity of a single ganglion cell was recorded and this technique was used to map out the region of the retina which contributes signals to it. This region is known as the “receptive field”. The effects of stimulating isolated areas can be studied or these various areas can be made to interact with each other. Mechanisms for the spread of excitation and inhibition were demonstrated. Both spread functions, which included blur of the optical image, appeared to be Gaussian with the spread being greater for inhibition than for excitation. Not all of the ganglion cells behaved in the same way. It is to be hoped that further studies along these lines will eventually provide us with a complete picture of the mechanisms of the excitation and inhibition in the retina. The paper by Enroth-Cugell and Robson includes references to many previous investigators who have contributed to our knowledge of retinal spread functions.

The ganglion and bipolar cells that supply the center of the fovea have been pushed sidewise from the center of the fovea so that the cones at the center interact with other cones in only three directions, sidewise and away from the center, but never across the center of the fovea. No specific neural interaction effects have yet been found to be associated with this arrangement.

The close packing of the cones is achieved with a hexagonal array, but such a pattern is not systematically maintained over any large region (ØSTERBERG [1935]). No physiological effect has been definitely shown to be associated with this arrangement, although the possibility for involvement in a kind of retinal astigmatism must be recognized.
§ 3. Basic Concepts: Lines, Points and Borders

The visual field of the eye can be described as a cone which contains all the lines of sight that can pass through the center of the pupil and penetrate the portions of the retina that respond to light. One can relate the distribution of the luminance in the field of view to the distribution of flux on the retina which is characterized by the presence of lines, points, sharp borders, gradients and uniform patches. What the observer sees can also be described as a subjective field of view broken up by points, lines, sharp borders and gradients. The terms point and line are figurative words that have no real meaning in terms
of the optical image formed on the retina; a point is merely a small patch bounded by a border, and a line is a strip bounded on opposite sides by borders. Furthermore, since each border is a special kind of gradient, the distribution of flux on the retina can be described as an array of uniform patches and gradients. The point concept, the line concept and the border concept all have meaning in terms of object space. So far as the perceived distribution of color in the subjective field of view is concerned, there is some justification for differentiating between a sharp border and a gradient, because in the process of degrading a border between two adjacent areas, one can find a threshold level at which the border is perceived as either sharp or blurred.
Fig. 8. Pattern for demonstrating photometric matching of average brightness when the grating is fine and the matching of individual bars when the grating is coarse.
In the case of uniform arrays of points or parallel lines, the eye has the remarkable capacity of assessing directly the average brightness. In Fig. 8 one can vary the luminance of the uniform patch to match the “average” brightness of the grating. The eye can also perform an additive mixing of different juxtaposed hues and saturations while at the same time being able to detect the hues, saturations and brightnesses of the elementary patches of color. Advantage is taken of this remarkable capability of the eye by such artists as Seurat and van Gogh, and it is also put to use in connection with line drawings, etchings, stippling, half-tone reproductions, photographic grain and TV rasters. At a grosser level of detail different contrast borders in the field of view take on different meanings. We regularly speak of striped zebras and spotted cows. These arrays of stripes and spots are considered to be an aspect of the color covering the surfaces of these animals. The borders of the spots on a spotted cow (Fig. 9) have a different meaning than the border of the cow separating the cow from the background.
Fig. 9. Camouflage involving confusion between the border of an object (a cow) and borders between patches of color covering the object and the background. Courtesy of S. Renshaw.
A part of the border of the cow may be invisible and still filled in with an imaginary border by the observer. All of this merely illustrates the point that human performance based on information extracted from displays formed on the retina can be much more complicated than the problem of coding the information transmitted along optic nerve fibers. Thus in this discussion of the optical performance of the eye, we can and we have restricted consideration to the visibility of lines, points, sharp borders, gradients and uniform patches without reference to the meaning they convey. The problem of transmitting the impression of a gradient or a uniform patch along the optic nerve must not be confused with the more general problem of transmitting information. It is helpful to point out at the outset the difficulty which may be
Fig. 10. Perception of border contrast and the difference in brightness between two isolated patches.
encountered in using an expression such as the “perception of a brightness difference” or the “brightness difference threshold”. One can compare the brightnesses of two independent patches of color (Fig. 10A) and subjectively assess the brightness difference, or one can have two contiguous patches (Fig. 10B) separated by a contrast border, as in the case of the border line between the two compared patches of color in a photometer. In this case one can interpret the perception of a border as the perception of a brightness difference, but it ought to be recognized that the physiological mechanisms involved differ in this case as compared with two separate patches of color. In order to avoid confusing the two problems, phraseology such as border contrast sensitivity, border perception, and border contrast thresholds must be used in dealing with the visibility of a contrast border.

It must also be recognized that many of the physiological ramifications involved in the perception of borders and gradients can be set aside if we are interested only in the role played by the quality of the optical image. Consequently, in the following discussion, I shall not attempt to deal with the problems that arise from changes in curvature and length of a border, or the effects of borders crossing each other or forming sharp angles with each other. The expression “perception of fine detail” can include all kinds of combinations of curved and straight borders, but it has become customary in the study of the optical performance of the eye to deal primarily with straight borders which are long enough to make a further increase in length of no importance and with combinations of straight borders to form lines, bars and gratings. Occasional reference will be made to targets of other configuration, such as points, disks and annuli, when information obtained with such targets can throw light on the visibility of borders, bars and gratings.
§ 4. Visual Acuity, Resolving Power and Contrast and Modulation Sensitivity

4.1. MEASUREMENT AND SPECIFICATION OF VISUAL ACUITY
In various branches of human endeavor, it becomes necessary to rate the performance of an observer in terms of a single measure of his capacity for perceiving fine detail. The ophthalmologist and optometrist are confronted with the necessity of rating observers who have to be compensated for eye injury or who have to be classified for the purpose of payment of income tax. This is done by specifying the central (foveal) acuity of each eye. Visual acuity is usually measured with Snellen letters (Fig. 11) or with similar targets, such as the Landolt C (or broken ring). Visual acuity can be rated by specifying
Fig. 11. The critical dimension of commonly used visual acuity test patterns. (Patterns shown: Snellen, Foucault, Landolt, Koenig, Ives.)
the size of some critical detail in minutes of arc. In the case of the broken ring, it is the width of the break in the ring, although in this case the width of the break is equal to the width of the stroke forming the ring and is also equal to one fifth of the outside diameter. In the case of Snellen letters, all of which are capitals, the critical detail is the width of the “strokes” that make up the letters; this is also equal to one fifth of the overall height of the letters. Another popular target is the Koenig bar target (Fig. 11), which consists of two bars separated by an interval equal to their width and having a length equal to three times their width. In this case, the width of the interval between the bars is taken to be the critical detail. In using various kinds of targets for the measurement of visual acuity, it is sometimes necessary to be arbitrary about what constitutes the critical detail. For example, in the case of Bodoni lower-case type, the critical dimension is considered to be one fifth of the height of
lower-case letters like e and o as opposed to l and j. However, one can specify also the size of the type as 8-point, 10-point, etc. This avoids the problem of specifying some critical detail. One must always specify what it is about the target that the observer must detect, identify, or see in a given way in order to pass the test. One must also specify how the results on several trials are to be averaged or assessed in terms of a single measure. The ability to see fine details involving a critical dimension of one minute of arc is considered to be a normal performance. One of the problems is that the measurement of visual acuity with different types of target does not always yield the same answer. The remarkable thing is that a complex, civilized society can get by without insisting on a higher degree of standardization than exists at the present time.
Fig. 12. Specification of period, resolving power, visual acuity, modulation and contrast for various types of gratings: (A) sine wave, (B) sawtooth, (C) square wave. (The average luminance is marked in the figure.)
Gratings can also be used for measuring visual acuity. When a square-wave grating (Fig. 12), which is also called a Foucault target (Fig. 11), is used, the bright and dark bars are usually of equal width. As in the case of the Koenig bar pattern, the width of the individual bars is taken to be the critical detail for specifying visual acuity. This is illustrated by the visual acuity measurements reported by SHLAER [1937-38]. When the dark-light ratio is other than one to one, the critical detail is taken to be one half of the cycle or period or center-to-center distance. This is important because the temptation here is to use the width of the cycle as the index of performance, and this is what is done in the recent studies of visibility of gratings. But this should not be described as a measure of visual acuity. COBB [1911] used an Ives pattern (Fig. 11) produced with two transparent gratings to measure visual acuity. This pattern is equivalent to a sawtooth distribution (Fig. 12), and one half of the width of
a tooth is used as the measure of visual acuity. Similarly, in the case of a sinusoidal grating (Fig. 12), one half of the cycle is the proper dimension to be used in specifying visual acuity.

Although one can specify the linear measurement of the critical dimension, provided that the distance of the test object from the eye is also specified, it is more customary to use the angular measurement expressed in minutes of visual angle measured at the entrance-pupil of the eye. The reciprocal of the critical dimension expressed in minutes of arc is the measure of visual acuity. This is also called decimal acuity.

Another specification of visual acuity is the Snellen notation. In this case the size of the test object is specified by giving the distance at which the critical detail subtends one minute of arc. The visual acuity is specified by the use of a fraction, the denominator of which indicates the distance at which the critical detail subtends one minute of arc and the numerator of which indicates the actual distance at which the critical detail is at the threshold of visibility. The distances can be specified in either feet or meters. The test is usually made at 20 feet and the size of the test characters is varied to measure acuity. The resulting Snellen fractions become 20/20, 20/40, etc.

It is also important to be aware of other specifications of visual acuity. Whenever it is necessary, the Snellen fraction can be immediately reduced to the decimal notation of visual acuity. The decimal notation can easily be converted to a percentage notation known as percentage visual acuity. It is easy to confuse visual acuity with what is known as the visual efficiency (Am. Med. Assoc. Council on Industrial Health [1955]). Visual efficiency is related to decimal acuity ($A$) by the following formula:

$$\text{Visual efficiency} = 0.836^{(1/A - 1)}. \tag{4}$$
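The conversions among these notations can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, and eq. (4) is read as 0.836 raised to the power (1/A - 1), so that 20/20 vision corresponds to an efficiency of 1.

```python
def decimal_acuity_from_critical_detail(minutes_of_arc):
    """Decimal acuity is the reciprocal of the critical detail in minutes of arc."""
    return 1.0 / minutes_of_arc

def decimal_acuity_from_snellen(numerator, denominator):
    """Snellen fraction (e.g. 20/40) reduced to decimal notation."""
    return numerator / denominator

def percentage_visual_acuity(decimal_acuity):
    """Decimal acuity expressed as a percentage."""
    return 100.0 * decimal_acuity

def visual_efficiency(decimal_acuity):
    """Eq. (4), read as 0.836 ** (1/A - 1); returns a fraction between 0 and 1."""
    return 0.836 ** (1.0 / decimal_acuity - 1.0)

for numerator, denominator in [(20, 20), (20, 40), (20, 200)]:
    a = decimal_acuity_from_snellen(numerator, denominator)
    print(f"{numerator}/{denominator}: decimal acuity {a:.2f}, "
          f"percentage acuity {percentage_visual_acuity(a):.0f}, "
          f"visual efficiency {100.0 * visual_efficiency(a):.1f} percent")
```

On this reading the efficiency scale falls much more slowly than percentage acuity: a decimal acuity of 0.5 (20/40) corresponds to an efficiency of 0.836, not 0.5, which is why the two scales should not be confused.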
The idea behind the use of this scale is that a person whose percentage visual efficiency has dropped from 100 percent (20/20) to 50 percent is able to earn only one half as much and should, therefore, in the case of accidental impairment of vision, be compensated in proportion.

4.2. RESOLVING POWER
Physicists are accustomed to the use of the concept of “resolving power” for specifying the performance of an optical instrument. This concept is used in connection with a grating target and represents the
number of cycles per millimeter in the image that can be resolved. The number of cycles per millimeter is called the spatial frequency of the image of the grating. The same concept can be applied to the eye, but instead of specifying the number of cycles per millimeter, it is more usual to specify the resolving power in terms of the number of cycles per degree or per minute of visual angle. In converting from visual angle to distance across the retina, it may be assumed for most purposes that the distance from the second nodal point of the eye to the retina is 17 mm.

The concept of resolving power is also used in connection with a two-point target. The concept is based on the capability of the human eye for resolving the image of a double star. It is assumed that the eye can resolve the images of two stars when the angular separation is such that the distance between the centers of the two images on the retina is equal to the radius of the first dark ring in the Fraunhofer diffraction pattern of a single star (Fig. 13). This is known as Rayleigh’s criterion. Hence, the theoretical capability of two stars of a given wavelength ($\lambda$) being resolved by an eye can be computed from the radius $g$ of the entrance-pupil:

$$\text{Resolution threshold in radians} = 0.61\,\lambda/g. \tag{5}$$
There is a certain amount of interest in the ability of the eye to resolve two points, and OGLE [1951] had some hope of using the two-point approach in fundamental studies of the optical performance of the human eye. However, gratings and bars appear to be the preferred targets for the study of the performance of the eye.
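As a worked illustration of the Rayleigh threshold, the sketch below evaluates 0.61 times wavelength over pupil radius and converts the result to minutes of arc and to distance on the retina. The function names and the sample values (555 nm, 1 mm pupil radius) are assumptions chosen for the example; the 17 mm nodal distance is the figure quoted above.

```python
import math

MINUTES_PER_RADIAN = math.degrees(1.0) * 60.0   # about 3438
NODAL_DISTANCE_MM = 17.0                        # second nodal point to retina

def rayleigh_threshold_radians(wavelength_mm, pupil_radius_mm):
    """Eq. (5): resolution threshold = 0.61 * wavelength / pupil radius (same units)."""
    return 0.61 * wavelength_mm / pupil_radius_mm

def rayleigh_threshold_minutes(wavelength_mm, pupil_radius_mm):
    """The same threshold expressed in minutes of arc."""
    return MINUTES_PER_RADIAN * rayleigh_threshold_radians(wavelength_mm, pupil_radius_mm)

def retinal_distance_microns(angle_radians):
    """Small-angle conversion from visual angle to distance across the retina."""
    return 1000.0 * NODAL_DISTANCE_MM * angle_radians

wavelength_mm = 555e-6    # hypothetical example: 555 nm
pupil_radius_mm = 1.0     # hypothetical example: 2 mm pupil diameter
theta = rayleigh_threshold_radians(wavelength_mm, pupil_radius_mm)
print("threshold (minutes of arc):", rayleigh_threshold_minutes(wavelength_mm, pupil_radius_mm))
print("separation on the retina (microns):", retinal_distance_microns(theta))
```

For these sample values the threshold comes out slightly above one minute of arc, and it shrinks in inverse proportion to the pupil radius.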
Fig. 13. Rayleigh criterion for resolving the images of two monochromatic points. (Abscissa: distance from center, in minutes.)
4.3. THE CONCEPTS OF CONTRAST AND MODULATION
Contrast ($C$) at a border is defined in several ways, but the most widely used definition is in terms of an object (O) and its background (B):

$$\text{Contrast} = (L_B - L_O)/L_B, \tag{6}$$
where $L_B$ and $L_O$ represent the luminances of the background and object respectively. In measuring the threshold it is usually found that the value of $|L_B - L_O|$ is the same for objects brighter and darker than the background, and hence the contrast threshold may be specified as

$$\text{Contrast threshold} = |(L_B - L_O)/L_B| = |\Delta L/L|. \tag{7}$$
In a disk-annulus pattern such as is used in photometry one can determine when the disk is darker or brighter than the annulus, but not the point at which the two are equal; thus, the range between the two levels equals $2|\Delta L|$, and $L$ has to be computed from the relation
In the case of a bipartite pattern or square-wave grating, there is no basis for deciding what parts constitute object and background, and hence contrast has to be defined in terms of the bright and dark parts of the pattern as follows:

$$\text{Contrast} = (L_{\max} - L_{\min})/L_{\min}. \tag{9}$$
Contrast can also be expressed in terms of modulation ($M$):

$$\text{Modulation} = (L_{\max} - L_{\min})/(L_{\max} + L_{\min}) = \text{Amplitude}/\text{Average luminance}. \tag{10}$$
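The two definitions are tied together by simple algebra, sketched below. The sketch assumes that the denominator of eq. (9) is the luminance of the dark part of the pattern; dividing the numerator and denominator of eq. (10) by that luminance then gives M = C/(C + 2). The function names are hypothetical.

```python
def contrast(l_max, l_min):
    """Eq. (9), read here with the minimum luminance in the denominator."""
    return (l_max - l_min) / l_min

def modulation(l_max, l_min):
    """Eq. (10): amplitude divided by average luminance."""
    amplitude = 0.5 * (l_max - l_min)
    average = 0.5 * (l_max + l_min)
    return amplitude / average

def modulation_from_contrast(c):
    """M = C / (C + 2), obtained by dividing eq. (10) through by the minimum luminance."""
    return c / (c + 2.0)

l_max, l_min = 101.0, 100.0   # hypothetical near-threshold grating
c = contrast(l_max, l_min)
m = modulation(l_max, l_min)
print("contrast:", c)
print("modulation:", m)
print("ratio M/C:", m / c)
```

For small contrasts the ratio M/C approaches one half, so a threshold quoted as a contrast and the same threshold quoted as a modulation differ by very nearly a factor of two.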
In the case of sine waves, as well as sawtooth waves and square waves (see Fig. 12), $\tfrac{1}{2}(L_{\max} - L_{\min})$ equals the amplitude (AC component), and $\tfrac{1}{2}(L_{\max} + L_{\min})$ equals the average luminance (DC component). Since contrast thresholds are apt to be of the order of 0.01, it follows that the modulation threshold is numerically about one half as large as the contrast threshold, and it is necessary for visual scientists working in this area to be ever mindful of this difference. The concepts of contrast and modulation can also be applied to
distributions of retinal illuminance on the retina as well as to distributions of luminance in the field of view. These concepts can also be applied to the demodulated image transmitted through the retina to the brain.

§ 5. Spread and Transfer Functions

5.1. SPREAD FUNCTIONS
We have already considered the spread functions of excitation and inhibition in the retina. We can apply the same concept to a radially symmetrical optical image of a point source formed on the retina. Once the distribution of flux for a point is given, the spread function for a line or for a contrast border can be computed. This procedure has come to be known as convolving (or convoluting) a line or contrast border with a point spread function. In the case of a border one can go from a point spread function to a line spread function and then convolve the border with the line spread function.

When the image of any object is formed on the retina, the object is said to be convolved with the point spread function for the optical image. When such an image is in turn transmitted through the retina, it is further convolved with the spread function for excitation and inhibition. The spread of excitation in the retina is superimposed upon the spread involved in the formation of the optical image, and in this kind of situation where two spreading mechanisms occur in tandem, one can convolve the point spread function of the first with that of the second and obtain a single point spread function that represents both. The procedure for doing this has been described by FRY [1955].

If the point spread function is radially asymmetrical, the line spread function can still be bilaterally symmetrical in special cases, but it can also be bilaterally asymmetrical. In this review there has been no need to consider asymmetrical spread functions because we have been primarily concerned with axial chromatic aberration, spherical aberration, and the effect of changing pupil size and throwing the eye out of focus. In the study of these effects consideration can be limited to radially symmetrical beams of light.

5.2. CONVOLVING A LINE WITH A POINT SPREAD FUNCTION
The concentration of flux at a given part of the retinal image of a point can be described in terms of the lumens per square minute measured at the second nodal point. Flux per square minute can also be expressed in trolands: one troland = $8.46 \times 10^{-14}$ lumens per square minute. In the case of a radially symmetrical retinal image of a point the flux per square minute $[E(r)]$ is a function of the distance ($r$) from the center of the image (Fig. 14), where $r$ is measured in minutes of angular length† subtended at the second nodal point.
Fig. 14. Gaussian and Fraunhofer point spread functions which approximate each other; $g$ = 1 mm, $\lambda$ = 492 nm, $\sigma$ = 0.4 min (from FRY [1965c]). (Abscissa: distance from center, in minutes.)
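The size of the troland in lumens per square minute follows from its definition, as the sketch below shows. It assumes the usual definition (a luminance of one nit seen through one square millimeter of pupil) and takes 3438 minutes per radian; the function name is hypothetical.

```python
MINUTES_PER_RADIAN = 3438.0

def lumens_per_square_minute(luminance_nits, pupil_area_mm2):
    """Flux entering the pupil from one square minute of a field of the given luminance.

    flux = luminance (lm / m^2 / sr) * pupil area (m^2) * solid angle (sr)
    """
    pupil_area_m2 = pupil_area_mm2 * 1.0e-6
    solid_angle_sr = (1.0 / MINUTES_PER_RADIAN) ** 2   # one square minute
    return luminance_nits * pupil_area_m2 * solid_angle_sr

one_troland = lumens_per_square_minute(luminance_nits=1.0, pupil_area_mm2=1.0)
print("one troland in lumens per square minute:", one_troland)
```

The result is close to 8.46e-14 lumens per square minute, in agreement with the leading digits 8.46 quoted in the text.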
Once the distribution of flux for a point is given, the distribution for a straight line of constant candlepower per unit of length can be computed (FRY [1955]). The first step is to normalize $E(r)$ by dividing it by $F$, the total amount of flux in the image. The normalized value of $E(r)$ is designated $G(r)$ and is defined as

$$G(r) = E(r)/F, \tag{11}$$

where

$$F = 2\pi \int_0^{\infty} r\,E(r)\,dr. \tag{12}$$

The distribution of flux across the image of a line can be described in terms of flux per unit area $E(t)$ as a function of the distance ($t$) from the center of the image. The formula for this distribution is

$$E(t) = 2D \int_t^{\infty} \frac{G(r)\,r}{\sqrt{r^2 - t^2}}\,dr, \tag{13}$$
† Distance across the retina can be expressed in minutes or in microns (µ). If the secondary nodal point lies 17 millimeters in front of the retina, one minute = 4.92 microns.
where $D$ is the flux per minute of length of the image. The next step is to normalize $E(t)$ by dividing it by the flux per minute of length of the image ($D$). The normalized value of $E(t)$ is designated $H(t)$ and is defined as

$$H(t) = E(t)/D, \tag{14}$$

where

$$D = \int_{-\infty}^{\infty} E(t)\,dt. \tag{15}$$
The quantity $D$ can also be described as the flux entering the eye from a segment of the object line one minute long measured at the first nodal point. Actually, the normalized spread function for a line can be determined from the point spread function without giving any consideration to either $D$ or $E(t)$:

$$H(t) = 2 \int_t^{\infty} \frac{G(r)\,r}{\sqrt{r^2 - t^2}}\,dr. \tag{16}$$
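The integral for the normalized line spread function can be evaluated numerically without trouble from the singularity at r = t. The sketch below is one possible approach, not a procedure from the cited papers: substituting r = sqrt(t^2 + u^2) turns r dr / sqrt(r^2 - t^2) into du, so that H(t) = 2 times the integral of G(sqrt(t^2 + u^2)) over u from 0 to infinity. The function names, the midpoint rule and the Gaussian test case are assumptions made for the example.

```python
import math

def gaussian_point_spread(r, sigma):
    """Normalized Gaussian point spread G(r); its integral over the plane is 1."""
    return math.exp(-0.5 * (r / sigma) ** 2) / (2.0 * math.pi * sigma ** 2)

def line_spread_from_point_spread(G, t, u_max, steps=4000):
    """H(t) = 2 * integral_0^u_max G(sqrt(t^2 + u^2)) du, by the midpoint rule.

    u_max must be large enough that G is negligible beyond it.
    """
    du = u_max / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        total += G(math.sqrt(t * t + u * u))
    return 2.0 * total * du

sigma = 0.4   # minutes, the value quoted for Fig. 14
G = lambda r: gaussian_point_spread(r, sigma)
for t in (0.0, 0.3, 0.8):
    numeric = line_spread_from_point_spread(G, t, u_max=8.0 * sigma)
    closed_form = math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    print(t, numeric, closed_form)
```

For a Gaussian point spread the numerical result reproduces a Gaussian line spread with the same standard deviation, which serves as a check on the sketch.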
Procedures for working backward to determine the point spread function from a line spread function have been described by MARCHAND [1964, 1965].

5.3. CONVOLVING A PATTERN OF PARALLEL STRIPS WITH A LINE SPREAD FUNCTION
We can now use the image of a line to convolute a pattern, such as a bar or a straight contrast border or a grating, which can be divided into long parallel strips, each having constant luminance from end to end. The distribution of luminance across such a pattern can be described as a function $L(s)$ of $s$, where $s$ represents the distance in minutes from some arbitrary starting point. The distribution of flux in the convoluted image on the retina is given by the following general formula:

$$E(s) = A \int_{-\infty}^{\infty} H(t)\,L(s - t)\,dt, \tag{17}$$
where A is the area of the pupil expressed in square millimeters and where E is expressed in trolands and L in nits. In the case of an abrupt contrast border (step function) between a
dark border on the left and a bright border on the right, eq. (17) reduces to

$$E(s) = AL \int_{-\infty}^{s} H(t)\,dt. \tag{18}$$
One can differentiate eq. (18) to obtain the line spread function from the border spread function:

$$H(t) = [dE(s)/ds]/(AL). \tag{19}$$
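The relation between the border spread function and the line spread function can be illustrated numerically. The sketch below accumulates a line spread function into a border spread function and then differentiates it again; the Gaussian line spread, the pupil area, the luminance and all function names are assumptions chosen only for the example.

```python
import math

def gaussian_lsf(t, sigma=0.4):
    """Normalized Gaussian line spread function (hypothetical example, sigma in minutes)."""
    return math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def border_spread(H, s_values, pupil_area_mm2, luminance_nits):
    """E(s) = A * L * integral_{-inf}^{s} H(t) dt, accumulated with the trapezoid rule.

    s_values must be uniformly spaced and start where H is already negligible.
    """
    ds = s_values[1] - s_values[0]
    scale = pupil_area_mm2 * luminance_nits
    E = [0.0]
    for a, b in zip(s_values[:-1], s_values[1:]):
        E.append(E[-1] + scale * 0.5 * (H(a) + H(b)) * ds)
    return E

def line_spread_from_border(E, s_values, pupil_area_mm2, luminance_nits):
    """H = (dE/ds) / (A * L), by central differences at the interior points."""
    ds = s_values[1] - s_values[0]
    scale = pupil_area_mm2 * luminance_nits
    return [(E[i + 1] - E[i - 1]) / (2.0 * ds * scale) for i in range(1, len(E) - 1)]

s = [-4.0 + 0.01 * i for i in range(801)]   # minutes
A, L = math.pi, 100.0                       # 1 mm pupil radius, 100 nits
E = border_spread(gaussian_lsf, s, A, L)
H_recovered = line_spread_from_border(E, s, A, L)
print("retinal illuminance far on the bright side (trolands):", E[-1])
print("retinal illuminance at the geometrical border (trolands):", E[400])
print("recovered H(0):", H_recovered[399], "original H(0):", gaussian_lsf(0.0))
```

With a symmetrical line spread function the illuminance at the geometrical border is half of its value far inside the bright side, and differentiation recovers the line spread function to within the accuracy of the grid.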
5.4. SINE-WAVE TECHNIQUES
A sine-wave grating continues to be a sine wave after it has been convolved with any point spread or line spread function, and the only effect is a loss in amplitude. This remarkable property of a sine wave is the basis for the development of a large set of sine-wave techniques used in the study of the performance of optical systems and other spreading mechanisms. The use of these techniques in the study of image formation and transmission of images through the retina constitutes the major wave of progress in the optics and the related physiology of the eye during the past decade. This era was initiated by SCHADE [1956]. The general procedure for convolving a sine wave with a line spread function is illustrated below for the case of an optical image formed on the retina. The formula for the distribution of luminance ($L$) across a sinusoidal grating is
$$L(s) = \tfrac{1}{2}(L_{\max} + L_{\min}) + \tfrac{1}{2}(L_{\max} - L_{\min}) \cos(2\pi s/\delta), \tag{20}$$

where

$\tfrac{1}{2}(L_{\max} + L_{\min})$ = average luminance,
$\tfrac{1}{2}(L_{\max} - L_{\min})$ = amplitude,
$\delta$ = wavelength (or peak to peak distance) in minutes,
$1/\delta$ = spatial frequency in cycles per minute, and
$s$ = distance across the grating from a given peak used as an arbitrary reference point.

As explained by PERRIN [1960], the general formula [eq. (17)] for convolving a strip pattern with a symmetrical line spread function $H(t)$ reduces in the case of a sinusoidal grating to
$$E(s) = A\left[\tfrac{1}{2}(L_{\max} + L_{\min}) + \tfrac{1}{2}\,T\,(L_{\max} - L_{\min}) \cos(2\pi s/\delta)\right], \tag{21}$$

where

$T$ = modulation transfer function = $\int_{-\infty}^{\infty} H(t) \cos(2\pi t/\delta)\,dt$,
$t$ = distance across the retina from the center of the line spread function in minutes,
$A$ = area of the pupil in square millimeters and is the factor for converting from luminance ($L$) in nits to retinal illuminance ($E$) in trolands, and
$H(t)$ = normalized line spread function.

As long as $s$ and $t$ are expressed in minutes measured at the primary and secondary nodal points, we do not have to be concerned about differences in $s$ in object space and on the retina. It should be noted that $T(1/\delta)$ is the Fourier transform of the normalized line spread function $H(t)$.

When a sine wave is convolved with a line spread function, the ratio of the after and before modulations represents the modulation transfer factor ($T$):

$$T = \frac{\text{Modulation after convolution}}{\text{Modulation before convolution}}. \tag{22}$$

If these factors for different spatial frequencies are plotted on a graph, the resulting curve is called the modulation transfer function. This function can be computed directly from the point or line spread function by the use of the Fourier transforms (GIVENS [1966]). When the line spread function is bilaterally symmetrical, the inverse Fourier transform can also be used to work backward from the modulation transfer function to compute the line spread function. When the line spread function is bilaterally asymmetrical, special techniques (PERRIN [1960]; LAMBERTS [1958]) can be used to go back and forth between it and the modulation transfer function.

Various psychophysical methods have been developed for determining experimentally the modulation transfer functions and from these deriving the line spread functions for optical blur and the spread of excitation and inhibition.

5.5. SUMMARY OF SPREAD AND TRANSFER FUNCTIONS USED FOR THE EYE
In this section of the review I have summarized some point spread functions involving radial symmetry and the corresponding modulation transfer functions which have been found useful in dealing with problems related to the eye. When an eye is free from spherical and other monochromatic
aberrations, it is called a diffraction-limited eye. When a point source of coherent light is conjugate to the retina of a diffraction-limited eye, the image formed on the retina is a Fraunhofer diffraction pattern. The point spread function for the Fraunhofer diffraction pattern (Fig. 14) is given by the following formula (FRY [1955]):

$$E = E_0\left[(2/\gamma)\,J_1(\gamma)\right]^2, \tag{23}$$

where $J_1(\gamma)$ is the first order Bessel function of ($\gamma$) and where

$$\gamma = (1/3438)(2\pi g/\lambda)\,r. \tag{24}$$
In eq. (24) $r$ is the distance from the center of the image measured in minutes of arc at the second nodal point, $\lambda$ the wavelength, $g$ the radius of the entrance-pupil, and 3438 the factor for converting from radians to minutes. The same units are used for $g$ and $\lambda$. The formulas for the corresponding line and contrast border spread functions can be found elsewhere (FRY [1955]). The corresponding modulation transfer function (Fig. 15) is

$$T = (2/\pi)(\theta - \sin\theta \cos\theta), \tag{25}$$

where

$$\cos\theta = 3438\,\lambda/(2g\delta), \tag{26}$$
and where $\delta$ is the center-to-center distance between the bright lines in the retinal image of the grating measured in minutes of arc at the second nodal point. The derivation of eq. (25) has been explained by GIVENS [1966]. In a diffraction-limited eye out of focus, the point spread function
Fig. 15. Gaussian and Fraunhofer modulation transfer functions corresponding to the two line spread functions in Fig. 14. (Ordinate: T; abscissa: cycles per degree.)
Fig. 16. Modulation transfer functions for a diffraction-limited eye. The pupil diameter is 2.5 mm and the wavelength is 578 nm. The different curves are for the eye in focus (0) and for the eye out of focus to various degrees. The numbers indicate diopters (from CAMPBELL and GUBISCH [1967]). (Abscissa: spatial frequency, c/deg.)
for monochromatic light is called a Fresnel image. The formula for a Fresnel point spread function can be found elsewhere (FRY [1955]). Linfoot has worked out modulation transfer functions for Fresnel images slightly out of focus. A graph showing certain of these functions (Fig. 16) has been presented by CAMPBELL and GUBISCH [1967]. It would be helpful if tables for these functions were available.

As shown in Fig. 14, the Fraunhofer diffraction pattern can be approximated by a Gaussian spread function:

$$E = E_0 \exp\left[-\tfrac{1}{2}(r/\sigma)^2\right], \tag{27}$$
where $r$ is the distance from the center of the image and $\sigma$ is the standard deviation of the distribution. The Gaussian spread function can also be used to express the spread of inhibition (Fig. 7), although the value for sigma is higher. The line spread function has the same form as the point spread function, and the border spread function corresponds to the integral of the probability curve, values for which are widely available in table form. The corresponding modulation transfer function (GIVENS [1966]) is

$$T = \exp\left[-2(\pi\sigma/\delta)^2\right]. \tag{28}$$
Figure 15 shows the modulation transfer for both the Fraunhofer and the Gaussian point spread functions in Fig. 14, which approximate each other.
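The comparison in Fig. 15 can be reproduced approximately from the two closed-form transfer functions. The sketch below evaluates eqs. (25) and (26) for the diffraction-limited eye and eq. (28) for the Gaussian approximation, using the parameters quoted for Fig. 14 (g = 1 mm, wavelength 492 nm, sigma = 0.4 min). The function names are hypothetical, the spatial frequency is written as the reciprocal of the peak-to-peak distance, and the transfer function is set to zero beyond the diffraction cutoff.

```python
import math

MINUTES_PER_RADIAN = 3438.0

def fraunhofer_mtf(cycles_per_minute, wavelength_mm, pupil_radius_mm):
    """Eqs. (25)-(26): T = (2/pi)(theta - sin(theta) cos(theta)),
    with cos(theta) = 3438 * wavelength * frequency / (2 * g)."""
    cos_theta = (MINUTES_PER_RADIAN * wavelength_mm * cycles_per_minute
                 / (2.0 * pupil_radius_mm))
    if cos_theta >= 1.0:
        return 0.0   # beyond the diffraction cutoff nothing is transferred
    theta = math.acos(cos_theta)
    return (2.0 / math.pi) * (theta - math.sin(theta) * cos_theta)

def gaussian_mtf(cycles_per_minute, sigma_minutes):
    """Eq. (28): T = exp[-2 (pi * sigma * frequency)^2]."""
    return math.exp(-2.0 * (math.pi * sigma_minutes * cycles_per_minute) ** 2)

def cutoff_cycles_per_degree(wavelength_mm, pupil_radius_mm):
    """Frequency at which cos(theta) reaches 1 in eq. (26), in cycles per degree."""
    return 60.0 * 2.0 * pupil_radius_mm / (MINUTES_PER_RADIAN * wavelength_mm)

g_mm, wavelength_mm, sigma_min = 1.0, 492e-6, 0.4
print("diffraction cutoff (c/deg):", cutoff_cycles_per_degree(wavelength_mm, g_mm))
for cycles_per_degree in (0, 10, 20, 30, 40, 50, 60):
    f = cycles_per_degree / 60.0
    print(cycles_per_degree,
          round(fraunhofer_mtf(f, wavelength_mm, g_mm), 3),
          round(gaussian_mtf(f, sigma_min), 3))
```

The two curves stay close over the middle of the frequency range, while the Fraunhofer curve reaches zero at a finite cutoff and the Gaussian curve only approaches zero asymptotically.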
The line spread function shown in Fig. 7, which is used to represent the spread of excitation, conforms to the following equation:

$$R = R_0 \exp(-kt), \tag{29}$$
where $k$ is a constant which varies with the luminance level and where $t$ is the distance from the center of the image. The corresponding modulation transfer function is as follows:

$$T = \left[1 + (2\pi/k\delta)^2\right]^{-1}. \tag{30}$$
The curve for this function is shown in Fig. 17. The “pillbox” distribution representing the out-of-focus blur circle is an approximation or substitute for Fresnel distributions. The image of a point source represents a disk-shaped area over which the flux is uniformly distributed.
Fig. 17. Modulation transfer functions for physiological spread of excitation and inhibition corresponding to the line spread functions in Fig. 7. (Abscissa: cycles per degree.)
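The transfer function for the exponential spread of excitation can be checked by carrying out the cosine transform of the line spread function numerically. The sketch below is an illustration only: it normalizes the exponential of eq. (29) to unit area, treats it as symmetrical about the center, and compares the numerical transform with eq. (30). The value of k, the integration limits and the function names are hypothetical.

```python
import math

def exponential_lsf(t, k):
    """Exponential line spread of eq. (29), normalized to unit area and taken as symmetrical."""
    return 0.5 * k * math.exp(-k * abs(t))

def modulation_transfer(H, delta, t_max, steps=20000):
    """T = integral of H(t) cos(2 pi t / delta) dt for a symmetrical H.

    Uses the midpoint rule on 0..t_max and doubles the result; t_max must be
    large enough that H is negligible beyond it.
    """
    dt = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        total += H(t) * math.cos(2.0 * math.pi * t / delta)
    return 2.0 * total * dt

def exponential_mtf(delta, k):
    """Eq. (30): T = [1 + (2 pi / (k delta))^2]^(-1)."""
    return 1.0 / (1.0 + (2.0 * math.pi / (k * delta)) ** 2)

k = 2.0   # per minute, hypothetical
for cycles_per_degree in (10, 30, 60):
    delta = 60.0 / cycles_per_degree   # peak-to-peak distance in minutes
    numeric = modulation_transfer(lambda t: exponential_lsf(t, k), delta, t_max=20.0)
    print(cycles_per_degree, numeric, exponential_mtf(delta, k))
```

The same `modulation_transfer` routine can be applied to any symmetrical line spread function supplied as a Python callable, which makes it a convenient cross-check for the other closed forms in this section.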
The formulas for the corresponding line and border spread functions* are:
$$H(t) = [2/(\pi\bar{r}^2)]\sqrt{\bar{r}^2 - t^2}, \tag{31}$$
where r̄ is the radius of the blur circle and t is the distance from the center of the image; for a border
$$E(w) = (AL/\pi)[\tfrac{1}{2}\pi + (w/\bar{r}^2)\sqrt{\bar{r}^2 - w^2} + \sin^{-1}(w/\bar{r})], \tag{32}$$
where L is the luminance on the bright side of the border, A is the area of the pupil, and w is the distance from the border.
* Equation (32) for the border spread function includes corrections for the errors which appear in eq. (7.10) in Fry [1955].
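The line and border spread functions of the pillbox distribution can be checked against each other, since the border spread function is the running integral of the line spread function. The sketch below assumes a unit-area line spread function and a border spread function normalized to run from 0 to 1 (the scale factor AL of eq. (32) is dropped); the function names are illustrative and not taken from the original text.

```python
import math

def pillbox_line_spread(t, radius):
    """Unit-area line spread function of a uniform blur circle, eq. (31)."""
    if abs(t) >= radius:
        return 0.0
    return 2.0 * math.sqrt(radius ** 2 - t ** 2) / (math.pi * radius ** 2)

def pillbox_border_spread(w, radius):
    """Border spread function normalized to the range 0..1.

    This is the bracketed expression of eq. (32) divided by pi, i.e. the
    integral of the line spread function from -radius to w.
    """
    if w <= -radius:
        return 0.0
    if w >= radius:
        return 1.0
    chord = (w / radius ** 2) * math.sqrt(radius ** 2 - w ** 2)
    return (0.5 * math.pi + chord + math.asin(w / radius)) / math.pi

if __name__ == "__main__":
    radius = 2.0  # arbitrary units, illustrative value
    h = 1e-5
    slope = (pillbox_border_spread(h, radius)
             - pillbox_border_spread(-h, radius)) / (2.0 * h)
    print(slope, pillbox_line_spread(0.0, radius))
```

The slope of the border spread function at the geometrical border equals the central ordinate of the line spread function, 2/(π r̄), which is the quantity that enters the index of blur for this distribution.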
Fig. 18. Modulation transfer function for an out-of-focus blur circle. The symbol r̄ represents the radius of the blur circle.
The formula for the modulation transfer function (GIVENS [1966]) is
$$T = 2[J_1(2\pi\bar{r}/S)]/(2\pi\bar{r}/S), \tag{33}$$
where r̄ is the radius of the blur circle and J_1(2πr̄/S) is the first order Bessel function of (2πr̄/S). This function [eq. (33)] is shown in Fig. 18. The phenomenon of spurious resolution which occurs with an eye out of focus (FRY [1953, 1961]) finds its counterpart in the oscillation of the modulation transfer function above and below the zero level.
Fig. 19. Geometry involved in computing the size of the blur circle for an out-of-focus eye (from FRY [1955]). (Labeled in the figure: plane of the exit pupil.)
For over a century the concept of the blur circle has been widely used in providing approximate answers to out-of-focus imagery in the
eye, and it is still used for quick checks in the most sophisticated studies. It is assumed that when the eye goes out of focus, the distance (v) from the retina to the point of best focus (M') in millimeters is three tenths of the number of diopters out of focus and the radius (r̄) of the blur circle is related to the radius (g') of the exit-pupil by the following equation:
$$\bar{r} = [g'/(\overline{O'M'})]\,|v|, \tag{34}$$
where O'M' is the distance from the exit-pupil to the point of best focus. The relationships are shown in Fig. 19. See also Appendix I. In astigmatism the blur ellipse is the counterpart of the blur circle. Problems in astigmatism can be solved by using approaches (FRY [1955]) similar to those used for simply throwing the eye out of focus.
5.6. INDEX OF BLUR
The FRY-COBB [1935] index of blur is useful in dealing with various types of problems. It can be defined in terms of point, line, or border spread functions (FRY [1955]), but the most meaningful definition is in terms of the spread function for a border (Fig. 20). It represents the ratio of the slope (dE/ds) at the midpoint to the difference in retinal illuminance on the two sides of the border. In terms of the normalized point spread function,
$$1/\phi = 2\int_0^\infty G\,dr. \tag{35}$$
In terms of the normalized line spread function,
Fig. 20. Relation of the index of blur to the point, line and border spread functions of the eye (from FRY [1959b]). (Panels: point, line, border.)
It turns out also that the index of blur can be related to the modulation transfer function:
$$1/\phi = 2\int_0^\infty T(1/S)\,d(1/S). \tag{37}$$
This means that φ is the reciprocal of the area under the curve representing the modulation transfer function. For example, the two transfer functions shown in Fig. 15 have the same φ-value because the areas under the two curves are equal. In the case of the Fraunhofer distribution for a point, it turns out that
$$\phi \text{ in radians} = 0.59\,\lambda/g. \tag{38}$$
If we compare this equation with eq. (5), it is apparent that by chance the value of φ is almost numerically equal to the radius of the first dark ring. Hence, it may be said, in accordance with the Rayleigh criterion, that two points can be resolved when they are separated by an amount equal to φ. One of the advantages of the Fry-Cobb index of blur is that it can also be easily computed for a monochromatic Fresnel image, and thus it can be used to assess the performance of the eye for any wavelength, pupil size, or lack of focus. The formula for φ for a monochromatic Fresnel image (FRY [1955]) is
$$\phi \text{ in minutes} = 3438\,\lambda c/(f'g'V), \tag{39}$$
Fig. 21. The V(ω) function used in computing the index of blur (from FRY [1955]). (Abscissa: |ω|.)
where c is the distance from the exit-pupil to the retina, g' the radius of the exit-pupil, f' the secondary focal length of the eye, λ the wavelength of the light, and V a function of ω defined as follows:
2 2.5! 3.5!
(G)
+
2
4
4 ~
4.5!5.5!
..
.],
(40)
where
$$\omega = [2\pi(g')^2](n'/\lambda)(1/a - 1/c), \tag{41}$$
and where a is the distance from the exit-pupil to the plane of the Fraunhofer image; n' is the index of the vitreous; and ω is a measure of the extent to which the eye is out of focus. For an emmetropic eye in focus for an infinitely distant object,
$$g' = (c/f')\,g, \tag{42}$$
where g is the radius of the entrance-pupil. Using this equation, eq. (39) reduces to
$$\phi \text{ in minutes} = 3438\,\lambda/(gV). \tag{43}$$
Fig. 22. The effect of pupil size and improper focus on the amount of blur as measured by φ (from FRY [1955]). φ expressed in microns; g' = radius of exit-pupil. See Appendix I.
TABLE I
Values of V for different values of |ω|

|ω|   V         |ω|   V         |ω|   V         |ω|   V
 0    1.6977     8    0.6677    16    0.2741    24    0.1663
 2    1.5973    10    0.4397    18    0.2617    26    0.1522
 4    1.3314    12    0.3208    20    0.2322    28    0.1494
 6    0.9881    14    0.2825    22    0.2017    30    0.1472
When the Fraunhofer image falls on the retina, ω reduces to zero and eq. (43) reduces to
$$\phi \text{ in minutes} = 2028\,\lambda/g. \tag{44}$$
This is equivalent to eq. (38). The constant V has been evaluated for various values of ω, and the results are tabulated in Table I and plotted in Fig. 21. Figure 22 shows for monochromatic yellow light how a change in pupil size affects the depth of focus and how it affects the amount of blur when the eye is in or out of focus. According to eq. (44), 1/φ varies in proportion to the radius of the entrance-pupil when the eye is in focus.
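Equation (43) and Table I together allow the index of blur to be estimated for a given wavelength, entrance-pupil radius and defocus parameter. The sketch below is illustrative only: it copies the tabulated values of V from Table I and interpolates linearly between them, which is an assumption (the original gives only the tabulated points and the curve of Fig. 21). The function names and the sample values (555 nm, 1-mm pupil radius) are likewise assumptions.

```python
# Values of V copied from Table I, as (|omega|, V) pairs.
TABLE_I = [
    (0, 1.6977), (2, 1.5973), (4, 1.3314), (6, 0.9881),
    (8, 0.6677), (10, 0.4397), (12, 0.3208), (14, 0.2825),
    (16, 0.2741), (18, 0.2617), (20, 0.2322), (22, 0.2017),
    (24, 0.1663), (26, 0.1522), (28, 0.1494), (30, 0.1472),
]

def v_of_omega(omega):
    """Linear interpolation in Table I (interpolation scheme is assumed)."""
    w = abs(omega)
    if w > TABLE_I[-1][0]:
        raise ValueError("|omega| lies outside the range of Table I")
    for (w0, v0), (w1, v1) in zip(TABLE_I, TABLE_I[1:]):
        if w0 <= w <= w1:
            return v0 + (v1 - v0) * (w - w0) / (w1 - w0)
    return TABLE_I[0][1]

def index_of_blur_minutes(wavelength, pupil_radius, omega=0.0):
    """Eq. (43): phi in minutes = 3438 * lambda / (g * V).

    wavelength and pupil_radius must be in the same length unit.
    """
    return 3438.0 * wavelength / (pupil_radius * v_of_omega(omega))

if __name__ == "__main__":
    wavelength_mm = 0.000555  # 555 nm, illustrative
    for omega in (0, 4, 10):
        print(omega, index_of_blur_minutes(wavelength_mm, 1.0, omega))
```

Because V falls as |ω| grows, the computed φ rises as the eye goes out of focus, in line with the behavior plotted in Fig. 22.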
Fig. 23. Comparison of the φ-values for the physical and the geometrical images of a monochromatic point for an eye thrown out of focus to various degrees (from FRY [1955]). φ expressed in microns. g' = radius of exit-pupil. See Appendix I. (Abscissa: diopters out of focus, from -1.00 to +1.00.)
The index of blur can also be used to demonstrate how well the “pillbox” distribution representing a blur circle approximates a Fresnel image. For the pillbox distribution,
$$\phi = \tfrac{1}{2}\pi\bar{r} = \tfrac{1}{2}[\pi g'/\overline{O'M'}]\,|a - c|, \tag{45}$$
where r̄ is the radius of the blur circle and O'M' is the distance from the exit-pupil to the optical image. Figure 23 shows the φ-values for the pillbox distributions compared with the φ-values for Fresnel images. For small values of (a-c), Fresnel images are needed to assess the amount of blur; but for large values of (a-c), the pillbox distributions serve as useful approximations. In the case of the Gaussian spread function,
$$\phi = 2.5066\,\sigma. \tag{46}$$
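The constant in eq. (46) can be recovered from eq. (37) by integrating the Gaussian modulation transfer function of eq. (28) over spatial frequency. The following sketch does this with the trapezoidal rule; the truncation frequency, step count and function names are assumptions made for illustration. For the pillbox distribution the same index follows from the first equality of eq. (45), which is included for comparison.

```python
import math

def index_of_blur_from_mtf(mtf, max_freq, steps=20000):
    """Eq. (37): 1/phi = 2 * integral of T over spatial frequency.

    The integral is truncated at max_freq (an assumption that is safe
    only when T has become negligible by then).
    """
    dnu = max_freq / steps
    area = 0.0
    for i in range(steps + 1):
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end points
        area += weight * mtf(i * dnu)
    return 1.0 / (2.0 * area * dnu)

def gaussian_mtf_of_frequency(sigma):
    """Eq. (28) written as a function of spatial frequency nu = 1/S."""
    return lambda nu: math.exp(-2.0 * (math.pi * sigma * nu) ** 2)

def pillbox_index_of_blur(blur_radius):
    """First equality of eq. (45): phi = (pi / 2) * blur-circle radius."""
    return 0.5 * math.pi * blur_radius

if __name__ == "__main__":
    sigma = 0.4  # minutes of arc, illustrative
    phi = index_of_blur_from_mtf(gaussian_mtf_of_frequency(sigma), 10.0 / sigma)
    print(phi / sigma)  # compare with the constant 2.5066 in eq. (46)
```

The ratio printed is close to the square root of 2π, which is the source of the constant 2.5066.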
§ 6. Psychophysical Techniques
6.1. THE ROLE OF COHERENCE
Since in most situations the eye has to cope with images formed by coherent light from point sources, it is desirable to evaluate the performance of the eye in terms of such images. At any rate, the interpretation of data dealing with the performance of the eye must take into consideration the coherence of the light. Some of the problems are illustrated by SHLAER'S [1937-38] study. The pupil of the eye was made conjugate to a source of light (the incandescent ball of a tungsten arc). Two plates of ground glass were placed between the source and the condensers to diffuse the image of the source. A transmitting grating which served as the test object was placed in this beam at a point conjugate to the retina. In this arrangement coherence and diffraction are difficult to assess, and the principles involved in interpreting the performance of the eye differ considerably from the situation in which the grating used as a test object is made up of points of coherent light. VAN NES and BOUMAN [1967] used a dispersing prism type of monochromator to obtain monochromatic light. They focused the source on the entrance-slit of the monochromator and used a pair of lenses to form an image of the exit-slit in the plane of an artificial pupil placed just in front of the eye. A transparent sinusoidal grating was placed close to the primary focus of the lens system used to focus the exit-slit of the monochromator on the artificial pupil. This made the source conjugate to the plane of the pupil and the grating conjugate to the retina. This arrangement could have been modified so that the data could be interpreted in terms of ordinary seeing if a ribbon filament had been used as the source and if lenses had been mounted at the entrance and exit slits to make the source conjugate to the plane of the grating. This would have made the grating consist of points of coherent light. This method of illuminating a monochromator has been described by SAWYER [1966].
LE GRAND [1936] called attention to a method of viewing objects which he called directed vision. He used a lens to focus a point source at the center of the pupil of the eye and then placed a small opaque object between the lens and the eye. This produced a sharper image than when a diffusing surface constituted the background for the object. According to Le Grand, this avoids the possibility of having the image degraded by spherical aberration. He showed that the resolving power for two parallel threads was much better in directed vision than in normal vision. In Fig. 24 the light from the source at A is focused on the slit at C by the lens B. The lens D focuses an image of the slit at the entrance-pupil of the eye. This arrangement conforms to Le Grand's condition of directed vision. BERGER-LHEUREUX-ROBARDEY [1965] considers the beam entering the eye in this case to be coherent. According to Berger-Lheureux-Robardey, one can transform such a beam to an incoherent beam simply by increasing the width of the slit so that the image of the source completely covers the pupil. This raises the question of what constitutes coherence in a beam of light.
Fig. 24. Arrangement for "directed vision". The source A is conjugate to the pupil, and the target at E is conjugate to the retina.
Fig. 25. Modification of the arrangement in Fig. 24 which makes the source (A) conjugate to the target (E) and to the retina. The diaphragm at C is conjugate to the pupil. This corresponds to ordinary seeing in that each point in the target plane is a point source of coherent light.
Berger-Lheureux-Robardey claims that the arrangement in Fig. 24 with a wide slit is equivalent to ordinary seeing. This does not appear to be correct. She could have modified the arrangement as shown in Fig. 25 to make it correspond to ordinary seeing. This arrangement makes the source (A) conjugate to the target (E), so that each point in the target which transmits light becomes a source of coherent light. The images of these points formed on the retina are Fraunhofer images. If they are slightly out of focus, we call them Fresnel images of coherent point sources; but when the source itself is conjugate to the pupil of
the eye, objects placed in the plane at E conjugate to the retina can only produce shadow images on the retina. Let us look again at Fig. 24. If instead of putting fine threads in the plane conjugate to the retina as Le Grand did, we place a grating in this plane, this restructures the beam entering the eye, and one can no longer assume that the beam entering the eye is a simple image of the slit (C). If the arrangement shown in Fig. 24 is used for stimulating the retina, then introducing into the beam a Babinet compensator between two polarizers, a slit, a transmitting grating, an opaque point or bar, or a dispersing prism will affect the distribution of flux in the plane of the pupil, as well as in the plane of the retina. Furthermore, the distribution of flux in the plane of the retina can be manipulated by spatial filtering in the plane of the pupil. No attempt has been made to cover all of the ramifications of this problem in the present review. However, this problem is involved in some of the various arrangements for measuring the optical performance of the eye, and we need to be aware of when and how the measurements are being affected. Needless to say, if we place in the plane conjugate to the retina a self-luminous surface such as the face of a TV tube or a print illuminated from in front, or a diffusely reflecting or transmitting surface upon which falls an optical image of a grating, any of these arrangements simulates or constitutes ordinary seeing. All of these arrangements have been used for measuring the resolving power and modulation thresholds of the human eye. DEPALMA and LOWRY [1962] used transparent gratings placed at some distance in front of a diffusely reflecting surface which formed a luminous background. The interpretation of the results in this case presents somewhat of a problem.
6.2. SPECIAL DEVICES FOR MEASURING RESOLVING POWER, MODULATION THRESHOLDS AND TRANSFER FACTORS
LE GRAND[1935] was apparently the first investigator to use interference fringes formed on the retina to measure the image degradation produced by the retina and brain independently of the image-forming mechanism of the eye. In the discussion that follows, the degradation produced by the combination of the retina and brain will be referred to as the retinal degradation to simplify phraseology, if for no other reason. There is also reason to believe that most of this
degradation does occur at the retina, but we do not have to consider this issue settled. Le Grand looked at a point source through two small apertures of variable separation placed in front of one eye and was able to produce fringes of various spacings on the retina. He was able to measure the resolving power using fringes of high contrast and variable spatial frequency, and he was also able to test the effect of reducing the contrast by attenuating one of the beams. He compared the results obtained in this way with the results obtained with a grating formed by a Babinet compensator placed between two polarizers and illuminated by light from a source which was conjugate to the pupil and which produced an image large enough to cover the pupil. With the latter arrangement the contrast could be varied by manipulating one of the polarizers. This grating pattern is not equivalent to a grating made up of points of coherent light
Fig. 26. Westheimer's arrangement for producing interference fringes on the retina (from WESTHEIMER [1960]).
WESTHEIMER [1960] has developed an ingenious arrangement (Fig. 26) for producing sinusoidal interference fringes on the retina. The lens L, produces an image of the source V on a slit S, and this is re-imaged at M by the lenses at L, and L,. The interference filter at IF makes the light monochromatic (555 nm), and the grating (6 lines per mm) at RR forms a grating diffraction image at M. A diaphragm is used to screen out all but the zero and the two first order images. These are covered with polaroids with the axis for the zero order image perpendicular to the axes for the first order images. The lens L, and the eye can be moved in a fore and aft direction to change the spacing, and the Polaroid analyzer (A) can be rotated to control the modulation. This device permits the measurement of the resolving power and the modulation thresholds at various spatial frequencies. By means of a field splitter ARNULF and DUPUY [1960] presented
to the eye beams of light which produced sinusoidal grating patterns with the lines of both grating patterns perpendicular to the dividing line of the field splitter. In this way the modulations of the two gratings could be compared and could be made to be perceived as equal by manipulating the modulation difference. One of the grating images formed on the retina was an interference pattern produced by two rectangular beams entering the eye at two different parts of the pupil. The separation of the two beams was varied to manipulate the spatial frequency, and the long dimension of one of the beams was varied to degrade the modulation. The second fringe pattern was a multiple interference pattern produced by a wedge of two plane surfaces. The spacing of the fringes in this case could be varied by changing the angle of the wedge. For various spacings, Arnulf and Dupuy were able to measure the contrast loss produced by the optical system of the eye for the multiple reflection pattern by reducing the contrast of the double slit pattern to the same level. Using the double slit pattern by itself, they were able to measure the threshold of visibility for various spacings by manipulating the modulation. They also measured the resolving power using the multiple reflection fringe pattern. Unfortunately, the grating pattern produced by multiple reflection does not conform to ordinary seeing because the source is conjugate to the plane of the pupil.
Fig. 27. Arrangement used by Campbell and Green to produce interference fringes on the retina (from CAMPBELL and GREEN [1966]).
CAMPBELL and GREEN [1965] used the arrangement shown in Fig. 27 to produce a set of sinusoidal interference fringes on the retina, the spacing and modulation of which can be varied. The lens L, in front of the laser source produces a divergent beam of coherent light from a point source at the secondary focal point. The pellicle beam splitter RP and the first surface mirror M, produce a second source, and the
two sources are imaged by the objective lens L, in the plane of the pupil. The spacing of the fringes can be varied by changing the lateral displacement of the two sources, and the modulation can be manipulated by superimposing a patch of veiling luminance. With this arrangement they were able to measure the modulation thresholds for the retina and brain. They also used a TV display for generating a sinusoidal grating pattern consisting of points of coherent light. This is the same technique which had been introduced by SCHADE [1956]. The spacing of the grating could be varied and the modulation could be manipulated to measure the threshold. The threshold in this case is determined by the retina and the brain combined with the image-forming mechanism of the eye. By comparing the thresholds obtained by the two methods and by assuming that the modulations arriving at the brain are the same in the two cases for a given frequency, Campbell and Green were able to assess the contrast transfer function for the optical system of the human eye.
§ 7. Determination of the Modulation Transfer Function for the Retina
7.1. DERIVATION FROM THE COMBINED FUNCTION
One can start with a modulation threshold curve for the retina and brain combined with the image-forming mechanism of the eye and then divide the ordinates of this curve by the modulation transfer factors for the optical system of the eye. The resulting curve should represent the modulation threshold obtained when the optical system is by-passed. In making this analysis, I have used data obtained in my own laboratory (FRY [1969b]). The grating pattern was generated by starting with a square-wave grating and convoluting it with a Gaussian spread function to produce a sine wave having a modulation of 0.73. ENGEL [1968] and FRY [1968a] have demonstrated that a sine-wave grating of known modulation can be produced in this way. The contrast was reduced to the threshold level by a patch of veiling luminance. The grating pattern was 87' high and 174' wide and had a constant average luminance of 21.7 fL. The surround had the same luminance, and the whole pattern was viewed through a 2-mm artificial pupil at a distance of 1 meter. This gives an average retinal illuminance of 233 trolands. The results for subject DK are shown in Fig. 28. The modulation thresholds have been plotted as a function of
Fig. 28. Modulation threshold data of DK. The dashed curve shows the modulation of the retinal image. These are the values for threshold which would be expected if the optics of the eye were by-passed (from FRY[1969b]).
center-to-center separations (S) of the bright bars in minutes as measured at the second nodal point of the eye. The reason for using S rather than the spatial frequency (1/S) is that threshold data for bars and disks, etc. are all plotted as a function of size with size increasing from left to right, and hence this facilitates comparison.
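The retinal illuminance quoted for this experiment (233 trolands for 21.7 fL viewed through a 2-mm artificial pupil) can be reproduced from the usual definition of the troland as luminance in candelas per square meter multiplied by pupil area in square millimeters. The sketch below is a worked check under that definition; the conversion factor of about 3.426 cd/m² per footlambert is a standard value, and the function name is illustrative.

```python
import math

# Standard conversion: 1 footlambert is about 3.426 cd/m^2.
CD_PER_M2_PER_FOOTLAMBERT = 3.426

def retinal_illuminance_trolands(luminance_fl, pupil_diameter_mm):
    """Trolands = luminance (cd/m^2) x pupil area (mm^2)."""
    luminance_cd_m2 = luminance_fl * CD_PER_M2_PER_FOOTLAMBERT
    pupil_area_mm2 = math.pi * (pupil_diameter_mm / 2.0) ** 2
    return luminance_cd_m2 * pupil_area_mm2

if __name__ == "__main__":
    # Values taken from the description of the experiment above.
    print(retinal_illuminance_trolands(21.7, 2.0))
```

The result rounds to the value stated in the text.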
Fig. 29. Image formed on the retina by a point source of white light with a 2-mm exit-pupil and the Gaussian spread function which approximates it (σ = 0.4 min). (From FRY [1969b].) (Curves: diffraction theory, Gaussian approximation. Abscissa: distance from center of image (r), minutes.)
White light was used in the experiment. The point spread function for white light for a diffraction-limited eye with a 2-mm exit-pupil has been computed by Fry and is shown in Fig. 29. This differs somewhat from the spread function for a 2-mm artificial pupil placed at the anterior focal point of the eye, but this difference was ignored. The theoretical point spread function can be approximated by a Gaussian spread function as shown in Fig. 29. Equation (27) represents this function. The corresponding modulation transfer function [eq. (28)] is included in Fig. 28 and is used in deriving the modulation threshold curve for the retina. The advantage in using logarithmic scales for plotting this type of data is that a shift sidewise of a modulation transfer curve without a change in form corresponds to a simple change in the value of σ. A shift of the threshold curve up or down indicates a general change in the threshold level at all frequencies. Furthermore, the threshold curve can be inverted to show its relation to the modulation transfer curve. The same ordinate scale can be used for modulation thresholds and transfer factors. The next step is to make allowance for the effect of the slope of the gradients between the troughs and peaks at low frequencies. The upturn of the curve in Fig. 28 at low spatial frequencies reflects the fact that the threshold depends not only upon the luminance difference between the peaks and troughs of the sine wave, but also upon the slope of the gradients between the peaks and troughs. One can demonstrate this by showing that the visibility of a single blurred border is dependent on the slope of the gradient at the border as well as the luminance difference on the opposite sides of the border. The same apparatus that was used to produce a sine-wave grating
Fig. 30. Basis for comparing the modulation threshold for a single blurred border to the modulation threshold for a sine-wave grating (from FRY[1969b]).
was also used to produce a bipartite pattern with a dividing border that could be blurred to various degrees. The contrast threshold for such a blurred border can be compared with the thresholds for a grating. One can express the contrast for a blurred border in terms of modulation. Figure 30 shows the basis for comparing a blurred border to a sine-wave grating. In the case of the single border
$$dL/ds = (0.40/\sigma)(L_{max} - L_{min}). \tag{47}$$
In the case of a grating
$$dL/ds = (\pi/S)(L_{max} - L_{min}). \tag{48}$$
Thus, in order for the (dL/ds)/(L_max - L_min) value for a single border to be equal to that for a sine-wave grating, the σ value for the border must be such that
$$\sigma = 0.127\,S. \tag{49}$$
In Fig. 31 the modulation thresholds for a single border have been plotted (curve A) as a function of σ and those for a sine-wave grating (curve B) as a function of S. The σ and S scales have been adjusted so that the (dL/ds)/(L_max - L_min) values are equivalent.
Fig. 31. Modulation thresholds for subject DK for a single blurred border (A) and a sine-wave grating (B). Curve C is the modulation threshold for a sine-wave grating corrected for the border gradient effect (from FRY [1969b]). (Lower abscissa: center-to-center separation, minutes. Upper abscissa: sigma values for a single blurred border.)
It is obvious that
the modulation threshold for the single border becomes independent of the amount of blur for small amounts of blur, and hence allowance
for blur can be made in the threshold values for a sine wave by multiplying the threshold values (curve B) by a factor defined as follows:
$$\text{Factor} = \frac{\text{Threshold value for a sharp border}}{\text{Threshold value for a blurred border}}. \tag{50}$$
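The matching condition of eq. (49), on which the σ and S scales of Fig. 31 are aligned, follows from equating the normalized maximum slopes of eqs. (47) and (48). The short sketch below works through that arithmetic; it assumes a border blurred by a Gaussian spread function, for which the coefficient 0.40 in eq. (47) is the rounded value of 1/√(2π). The function names are illustrative.

```python
import math

def border_slope_coefficient():
    """Normalized maximum slope of a Gaussian-blurred border times sigma.

    For a Gaussian line spread function this is 1/sqrt(2*pi), which
    rounds to the 0.40 used in eq. (47).
    """
    return 1.0 / math.sqrt(2.0 * math.pi)

def grating_slope_coefficient():
    """Normalized maximum slope of a sine wave times S: pi, as in eq. (48)."""
    return math.pi

def equivalent_sigma(separation):
    """Sigma of the blurred border whose slope matches a grating of spacing S."""
    return border_slope_coefficient() / grating_slope_coefficient() * separation

if __name__ == "__main__":
    print(border_slope_coefficient())  # about 0.40
    print(equivalent_sigma(1.0))       # about 0.127, as in eq. (49)
```

For the grating spacing of 12 min, for example, the equivalent border blur on this basis would be about 1.5 min.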
The resultant curve (curve C) is shown in Fig. 31. Allowance for blur does not eliminate entirely the upturn in the modulation threshold curve. As we shall see later, part of the upturn can be attributed to the inhibitory network in the retina. Let us now examine the role played by physiological irradiation and inhibition. If it can be assumed that the excitation (R) transmitted from the photoreceptors to the ganglion cells in the retina is linearly related to the intensity of the stimulus (E) and if part of the excitation spreads or irradiates laterally, one can treat irradiation in exactly the same way as optical blur. Let us assume that the line spread function for excitation conforms to eq. (29). If we convolute the distribution of retinal illuminance in the retinal image of a sine-wave grating with the line spread function defined in eq. (29), we obtain the distribution of excitation (R) transmitted to the ganglion cells, which is
$$R = \mu\bar{E} + \mu T_R\,\Delta E \cos(2\pi s/S), \tag{51}$$
where Ē is the average retinal illuminance, ΔE is the amplitude of the modulation, and s represents the distance across the grating from an arbitrary starting point at the center of one of the bright bars. The symbol μ represents a constant. The symbol T_R represents the modulation transfer function for the spread of excitation which is given by eq. (30). The response of the ganglion cells is also controlled by the inhibition (I) generated by the photoreceptors. RATLIFF [1965, p. 157] has assumed that the inhibition (I) is proportional to the stimulus applied to the retina and has a Gaussian line spread function conforming to the following equation:
$$I = I_0 \exp[-\tfrac{1}{2}(t/\sigma_I)^2], \tag{52}$$
where t is the distance from the center of the line image. If the distribution of retinal illuminance in the retinal image of a
sine-wave grating is convoluted with the spread function defined by eq. (52), we obtain the distribution of inhibition, which is
$$I = \gamma\bar{E} + \gamma T_I\,\Delta E \cos(2\pi s/S). \tag{53}$$
The symbol γ represents a constant. The symbol T_I represents the modulation transfer function for the spread of inhibition and is
$$T_I = \exp[-2(\pi\sigma_I/S)^2]. \tag{54}$$
It may now be assumed that the net excitation transmitted to the ganglion cells is proportional to (R - I):
$$(R - I) = (\mu - \gamma)\bar{E} + \Delta E(\mu T_R - \gamma T_I)\cos(2\pi s/S). \tag{55}$$
It follows, therefore, that the modulation transfer function [T_(R-I)] for physiological irradiation of excitation and inhibition is
$$T_{(R-I)} = (1 - \gamma/\mu)^{-1}[T_R - (\gamma/\mu)T_I]. \tag{56}$$
The total modulation transfer function (T) for the human eye including optical blur as well as physiological irradiation of excitation and inhibition is the product of T_E and T_(R-I),
$$T = T_E\,T_{(R-I)}, \tag{57}$$
where T_E is the modulation transfer function for the image-forming mechanism of the eye. Figure 32 provides a useful way of visualizing the components which
Fig. 32. Interrelations between the components contributing to the modulation transfer function for the human eye (from FRY [1969b]). (Abscissa: center-to-center separation, minutes.)
Fig. 33. Distributions across the retina of excitation (R) and inhibition (I) produced by a sine-wave grating of constant modulation and variable frequency. Each half cycle is 1.25 times wider than the half cycle which precedes it. The numbers above and below the (R - I) curve indicate the lengths of the half cycles. The transfer curves are shown in Fig. 32 (from FRY [1969b]). (Abscissa: distance across the retina, minutes.)
demodulate the image of a sine-wave grating. In the same graph are shown plots of T_E, T_R, T_I, (γ/μ)T_I and [T_R - (γ/μ)T_I]. The constants σ_E, k and σ_I shift the T_E, T_R and T_I curves sidewise without changing their forms. The ratio (γ/μ) controls the ratio of inhibition to excitation and raises or lowers the values of T by changing the DC component of the excitation transmitted to the ganglion cells. Figure 33 shows the interrelation of excitation and inhibition at the various frequencies in the case of a sine-wave grating. A sine wave of variable frequency has been convoluted with the E and R spread functions to show the distribution of excitation, and it has been convoluted with E and I spread functions to show the distribution of inhibition. In the lower curve the two distributions have been combined by subtracting the inhibition from the excitation. It should be noted that at low frequencies the AC component of inhibition partially neutralizes the AC component of excitation. This effect accounts for
part of the upturn in the modulation threshold curve at low frequencies. However, at high frequencies the AC component of inhibition is reduced to zero, and it no longer has the effect of lowering the AC component of excitation. At all frequencies inhibition enhances modulation by lowering the DC component. It is possible that the visibility of each border in the grating is affected by lying close to other parallel borders.
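The way the component transfer functions combine in eqs. (28), (30), (54), (56) and (57) can be sketched numerically. The code below is an illustration only: the parameter values (σ_E = 0.4 min, k = 1 per min, σ_I = 5 min, γ/μ = 0.5) are hypothetical and are not the constants fitted in the original work, and the function names are assumptions.

```python
import math

def t_optical(separation, sigma_e):
    """Eq. (28): Gaussian transfer function of the image-forming mechanism."""
    return math.exp(-2.0 * (math.pi * sigma_e / separation) ** 2)

def t_excitation(separation, k):
    """Eq. (30): transfer function for the spread of excitation."""
    return 1.0 / (1.0 + (2.0 * math.pi / (k * separation)) ** 2)

def t_inhibition(separation, sigma_i):
    """Eq. (54): Gaussian transfer function for the spread of inhibition."""
    return math.exp(-2.0 * (math.pi * sigma_i / separation) ** 2)

def t_physiological(separation, k, sigma_i, ratio):
    """Eq. (56), with ratio standing for gamma/mu (must be below 1)."""
    t_r = t_excitation(separation, k)
    t_i = t_inhibition(separation, sigma_i)
    return (t_r - ratio * t_i) / (1.0 - ratio)

def t_total(separation, sigma_e, k, sigma_i, ratio):
    """Eq. (57): product of the optical and physiological factors."""
    return t_optical(separation, sigma_e) * t_physiological(
        separation, k, sigma_i, ratio)

if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to show the behavior.
    sigma_e, k, sigma_i, ratio = 0.4, 1.0, 5.0, 0.5
    for separation in (1.0, 3.0, 12.0, 60.0, 300.0):
        print(separation, t_total(separation, sigma_e, k, sigma_i, ratio))
```

With parameters of this kind the physiological factor tends to unity at very large separations, exceeds unity at intermediate separations where inhibition has lost its AC component but still lowers the DC component, and falls off at small separations with the spread of excitation.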
Fig. 34. The relation of the modulation threshold curve (curve C in Fig. 31) to the modulation transfer curve of the eye (T) defined by eq. (56). (From FRY [1969b].)
If the variation in the modulation threshold for the eye at different spatial frequencies represents compensation for the demodulation imposed by optical and physiological blur, the reciprocals of the modulation transfer values (T) at the various frequencies should be proportional to the modulation threshold values corrected for blur (curve C in Fig. 31). Figure 34 shows the reciprocals of the modulation transfer values multiplied by a constant (K) plotted on the same graph with the corrected threshold values. The values of the constants K, σ_E, k, σ_I and (γ/μ) have been chosen to obtain a reasonable agreement between the experimental and theoretical results. The failure to obtain perfect agreement at the high frequencies could indicate that the assumed spread function for physiological irradiation of excitation needs to be modified. It could also be attributed to spherical aberration of the eye which would increase the value of σ_E. Another factor to be considered is the blur produced by the high frequency micronystagmoid movements of the eye.
Figure 31 brings out the fact that the threshold for a single sharp border is much higher than the minimum threshold for a sine-wave grating which occurs at a frequency of about five cycles per degree (S = 12 min). Figure 35 illustrates a possible basis for this difference. This figure shows a contrast border and a sine wave at a frequency of five cycles per degree which have been convoluted with the total spread function for the human eye. The difference in excitation on the two sides of the contrast border is less than the difference between the peaks and troughs of the grating.
Fig. 35. The effect of demodulation by the eye on the image of a sharp contrast border and the image of a sine-wave grating with a center-to-center spacing of 12 min. The transfer curves are shown in Fig. 32. (From FRY [1969b].) (Panels: contrast border; sine wave, S = 12 min.)
Other investigators have attempted to analyze the transfer function for the optical image forming mechanism of the eye and retina into its components. RATLIFF [1965, pp. 146-159] used only two components, optical blur and inhibition. What Ratliff calls "optical blur" includes physiological blur and optical blur. PATEL [1966] used data obtained by measuring the light reflected from the retina to deduce the role played by the retina in determining the spread function of the eye as a whole. SCHADE [1956] and NACHMIAS [1968] have used a Gaussian function to represent optical blur and excitation and a second Gaussian function to represent inhibition. Since the optical transfer function is not affected by changes in the luminance level and stimulus duration, the changes in the form of the modulation threshold curve which have been reported by VAN NES [1968], SCHOBER and HILZ [1965], NACHMIAS [1967], and others have to be explained by changes in (γ/μ), k and σ_I, and also by changes in the form of the excitation and inhibition spread functions.
7.2. DIRECT MEASUREMENT WITH A DOUBLE SLIT INTERFERENCE PATTERN
One way to demonstrate that the spread of excitation in the retina constitutes a blur component independent of optical blur is to by-pass the optics of the eye altogether and measure directly the transfer function for the retina and brain. This is exactly what LE GRAND [1935], WESTHEIMER [1960], ARNULF and DUPUY [1960] and CAMPBELL and GREEN [1965] have done by measuring the modulation thresholds for Young interference fringes formed on the retina. The data for Westheimer, Arnulf and Dupuy, and Campbell and Green have been replotted in Fig. 36. The points representing the data for Campbell and Green are points taken from the smooth curves which they used to represent their data. The points for Arnulf and Dupuy represent average values for three subjects. Also, for each subject the experiment was repeated several times using artificial pupils of different size. Since the results are based on two small beams passing through the center of the artificial pupil, changing the size of the pupil should have no effect on the results. As a matter of fact, the size of the pupil was found to have no effect, and the data points for artificial pupils of 4, 2.5, 1.6 and 1 mm in diameter have been averaged.
Fig. 36. Modulation threshold data for Young interference fringes formed on the retina so that the optical system of the eye is by-passed. Different levels of average retinal illuminance were used: Westheimer, 2200 trolands; Campbell and Green, 500 trolands; Arnulf and Dupuy, 50 trolands.
The dashed curve used by Westheimer to fit his data (Fig. 36) conforms to the following equation:

$$M_{\text{threshold}} = K/T, \qquad (58)$$
where K is a constant and T is defined by eq. (25) and represents the modulation transfer function for the optical system of a diffraction-limited eye with a 2-mm pupil exposed to 555 nm. Westheimer stated that his data could be fitted in this way, but since the measurements by-pass the optics of the eye, it is obvious that the spreading mechanisms have to be described in terms of physiological mechanisms in the retina. Westheimer’s data have also been fitted with a second curve B, which conforms to eq. (58), but this time T represents the modulation transfer function for the combination of physiological excitation and inhibition as defined by eq. (56). Westheimer concluded from his data that the modulation transfer function for the eye and brain is monotonic for the range of frequencies investigated, and that his data show no upturn in the modulation threshold at low spatial frequencies, which might indicate inhibition. The data of Arnulf and Dupuy and those of Green do show an upturn. The curve C used to fit the data of Arnulf and Dupuy is the same as curve B, and has merely been displaced down and to the right. Since logarithmic scales are used for the center-to-center separation and M, sliding a curve down corresponds to a change in K, and moving it sidewise involves a change in k and a change in σ_E and σ_I. The data for Campbell and Green do not conform to curve B. The difference in general threshold levels for the investigators is puzzling. They worked with different field sizes and luminance levels, but there appears to be no simple explanation for the differences in thresholds.

§ 8. Perceived Modulation of a Sine-Wave Grating at Suprathreshold Levels

In deriving the modulation transfer function for the human eye, the writer and others, e.g., MENZEL [1959] and PATEL [1966], have assumed that the modulation of the output from the retina at the threshold of visibility is constant from one spatial frequency to another, and that the variation in input required for the threshold reflects the different amounts of demodulation produced by the system at the different frequencies.
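The constant-output assumption just stated can be illustrated with a minimal sketch. It assumes, purely for illustration, that the whole system has a Gaussian spread function, so that its transfer factor is exp(-2π²σ²f²); the function names and the numerical values of `k_out` and `sigma_min` are hypothetical and are not taken from the data discussed here.

```python
import math


def gaussian_mtf(freq_cpd, sigma_min):
    """Transfer factor of an assumed Gaussian spread function.

    freq_cpd  : spatial frequency in cycles per degree
    sigma_min : standard deviation of the spread function in minutes of arc
    """
    sigma_deg = sigma_min / 60.0
    return math.exp(-2.0 * (math.pi * sigma_deg * freq_cpd) ** 2)


def threshold_modulation(freq_cpd, k_out, sigma_min):
    """Input modulation needed for the output modulation to equal k_out.

    This is the relation M_threshold = K / T with T taken as Gaussian.
    """
    return k_out / gaussian_mtf(freq_cpd, sigma_min)


if __name__ == "__main__":
    # Hypothetical constants chosen only to show the trend with frequency.
    for f in (6, 10, 20, 30, 40, 60):
        m = threshold_modulation(f, k_out=0.002, sigma_min=0.5)
        print(f"{f:3d} c/deg   threshold modulation = {m:.4f}")
```

Under this assumption the required input modulation rises steadily with frequency, which is the behavior the threshold curves are taken to reflect in the range where spreading mechanisms dominate.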
As shown above, this relation may be complicated at low frequencies by mechanisms other than those that can be described as spreading mechanisms, but in the range from 6 to 60 cycles per degree the basic assumption probably holds. One can do such obvious things as showing that an increase in optical blur increases the modulation threshold at the higher frequencies. Another approach to this problem is to check whether the amplitude of the perceived modulation at suprathreshold levels correlates with the thresholds at different frequencies. BRYNGDAHL [1966] has measured the perceived brightness of bright and dark bars of a grating at suprathreshold levels for different frequencies. He showed that the perceived modulation varies with frequency in accordance with what might be expected from the changes in threshold. He also showed that the perceived modulation is greater than the actual modulation of the grating. This may be related to the lowering of the DC component of the eye’s response by inhibition. Bryngdahl used a stimulus pattern similar to that shown in Fig. 8 and photometrically matched the upper half first to the bright lines and then to the dark lines. He used center-to-center separations of 7 minutes and greater. With spacing finer than this, it becomes difficult for the eye to assess anything other than the average brightness (FRY [1959]). ARNULF and DUPUY [1960] had to match two suprathreshold sinusoidal gratings on the basis of equal perceived modulation. They reported measurements for center-to-center spacing as low as 1.74 minutes.

§ 9. Determination of the Modulation Transfer Function of the Optical System of the Eye

9.1. THE CAMPBELL-GREEN METHOD
CAMPBELL and GREEN [1965] simply measured the modulation threshold for Young interference fringes for different spatial frequencies and then separately measured the modulation thresholds for a sinusoidal grating generated with a TV tube and viewed through artificial pupils of different size. The same spatial frequencies were used in the two experiments. Since the only effect that the optical system of the eye can have on a sinusoidal grating stimulus exposed to the eye is to reduce its modulation, it may be assumed that if the
image of the TV display on the retina and the Young fringes have the same spatial frequency, they must have the same modulation threshold. Thus, the ratio of the modulation reductions necessary to bring the two patterns to their thresholds must represent the optical transfer function for the image-forming mechanism of the eye. Unfortunately, the monochromatic light emitted by the laser (632.8 nm) differed from the wavelength composition of the TV display, which peaked at 530 nm and had a half-width of ±30 nm. It can likely be assumed that a 632.8 nm set of interference fringes has the same threshold as a 530 nm grating. The average retinal illuminance for both types of gratings was kept constant at 500 trolands. The data for subject Green for a pupil 2 mm in diameter are given
Fig. 37. Threshold data for subject DG for Young fringes (solid line) and for a sinusoidal grating (open circles) viewed through a 2-mm artificial pupil with monochromatic light of 530 nm. The filled circles represent modulation transfer factors deduced from the data for the image-forming mechanism of the eye (from CAMPBELL and GREEN [1965]).
Fig. 38. The data of subject DG for the modulation transfer factors for the image-forming mechanism of the eye for a 2-mm pupil (0) (same as in Fig. 37) and for three other pupil sizes: V, 2.8 mm; m, 3.8 mm; A, 5.8 mm. The abscissa values represent normalized spatial frequency. The Fraunhofer transfer function is also shown. The wavelength (530 nm) was the same for all curves (from CAMPBELL and GREEN [1965]).
in Fig. 37. The accommodation of Green’s eye was paralyzed, and the retina was made conjugate to the TV display with a +1.50 lens mounted before the eye. The solid curve without dots represents the data obtained with double slit interference fringes. This figure shows how the modulation transfer factors for the optical mechanism are determined. The ratio of the two thresholds at a given frequency gives the transfer factor for that frequency. The curve representing these ratios at different frequencies is the modulation transfer function. In Fig. 38 a normalized frequency scale is used:

$$\text{Normalized frequency} = \frac{\lambda}{2g}\,\frac{1}{S}, \qquad (59)$$

where λ and g are expressed in the same units and S is expressed in radians. This scale has the merit that all diffraction-limited eyes fall on the one theoretical line regardless of wavelength or pupil size. The curves for actual eyes of different pupil size are not to be compared
Fig. 39. The data of subject DG for the modulation transfer factors of the image-forming mechanism of the eye for the eye in focus (0) and for the eye out of focus to various degrees measured in diopters: 0.50; A, 1.00; 0, 2.00. The pupil diameter = 2 mm and λ = 530 nm. The broken curves represent the Fraunhofer and the Fresnel transfer functions (from CAMPBELL and GREEN [1965]).
with each other, but each is compared to the theoretical curve to determine whether it measures up to its own standard of performance. The theoretical curve in Fig. 38 is the same as the curve in Fig. 15 for the Fraunhofer image. The only difference is that in Fig. 38 a normalized frequency scale is used. The discrepancies which are found must be traced to spherical aberration and other factors. Campbell and Green also presented data for an eye thrown out of focus to various degrees by changing the lens mounted in front of the eye. The modulation transfer functions are shown in Fig. 39 along with the theoretical curves for a diffraction-limited eye. These curves were computed by Linfoot according to the theoretical method described by HOPKINS [1962].
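The procedure described above (a ratio of two thresholds, plotted against normalized frequency and compared with the diffraction-limited curve) can be sketched as follows. The diffraction-limited expression used here is the standard textbook transfer function for an in-focus circular pupil with incoherent light, and it is assumed rather than confirmed to be the curve plotted in Fig. 38. The function names and the threshold values in the usage lines are hypothetical.

```python
import math


def normalized_frequency(freq_cpd, wavelength_nm, pupil_diameter_mm):
    """Spatial frequency divided by the diffraction cutoff of the pupil."""
    freq_per_radian = freq_cpd * 180.0 / math.pi
    cutoff_per_radian = (pupil_diameter_mm * 1e-3) / (wavelength_nm * 1e-9)
    return freq_per_radian / cutoff_per_radian


def diffraction_mtf(s):
    """Transfer factor of an in-focus, diffraction-limited circular pupil.

    s is the normalized frequency; the factor is 1 at s = 0 and 0 for s >= 1.
    """
    if s <= 0.0:
        return 1.0
    if s >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(s) - s * math.sqrt(1.0 - s * s))


def transfer_factor(fringe_threshold, grating_threshold):
    """Optical transfer factor as the ratio of the two modulation thresholds.

    The fringes by-pass the optics, so their threshold is the lower one.
    """
    return fringe_threshold / grating_threshold


if __name__ == "__main__":
    s = normalized_frequency(30.0, wavelength_nm=530.0, pupil_diameter_mm=2.0)
    # Hypothetical thresholds, for illustration only.
    measured = transfer_factor(fringe_threshold=0.010, grating_threshold=0.030)
    print(f"normalized frequency   = {s:.3f}")
    print(f"measured factor        = {measured:.3f}")
    print(f"diffraction-limited    = {diffraction_mtf(s):.3f}")
```

Comparing the measured factor with `diffraction_mtf(s)` at the same normalized frequency is the sense in which each pupil size is judged against its own standard of performance.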
The Campbell-Green method should be tested at different luminance levels. If it is a valid method for determining the transfer function for the image-forming mechanism, the results should not be affected by changing the luminance level. By working at a relatively low luminance one could avoid the problems encountered with the coarseness of the retinal mosaic at high frequencies.

9.2. THE ARNULF-DUPUY METHOD
ARNULF and DUPUY [1960] measured the modulation transfer function for the image-forming mechanism of the eye by comparing the fringes formed by interreflection at a glass wedge with a set of Young fringes of the same frequency as described above. The beam forming the wedge fringe pattern fills the artificial pupil mounted in
Fig. 40. Data of ARNULF and DUPUY [1960] for the modulation transfer function of the optical system of the eye for different pupil sizes; λ = 546 nm. The Fraunhofer modulation transfer curve is included for comparison.
front of the eye, and the modulation of this pattern is affected by changing the size of the artificial pupil. The modulation of the Young fringes can be varied from unity to zero and can be used to measure directly the modulation of the wedge fringes. The data have been presented by Arnulf and Dupuy as shown in Fig. 40. The abscissa scale is a normalized scale in which the spatial frequency 1/S in lines per radian is multiplied by the normalizing factor λ/(2g). One can represent on such a graph by a single curve [eq. (19)] the theoretical modulation transfer function for an optical system which is in focus and limited only by diffraction. This one curve applies to pupils of all sizes and to all wavelengths.
Fig. 41. Modulation transfer data of ARNULF and DUPUY [1960] for pupils 2.5 mm in diameter and smaller plotted as functions of the center-to-center separation.
Fig. 42. Modulation transfer data of ARNULF and DUPUY [1960] for 2.5-mm and 4-mm pupils plotted as functions of the center-to-center separation.
The advantage in plotting the data in this way is that if the optical system is diffraction-limited, the curves for all pupil sizes must coincide with the theoretical curve. In fact, the curves for pupils under 0.87 mm do coincide, and for pupils of this size and smaller the eye functions as a perfect optical system. The discrepancies become progressively worse for larger pupils indicating that the eye is subject to spherical aberration and other sources of optical blur. In Figs. 41 and 42 the data for the different pupil sizes have been replotted with the modulation transfer factors as a function of the center-to-center separation. The curves used to represent pupil sizes of 0.76 mm and smaller are the theoretical Fraunhofer curves. The theoretical curves all have the same form and are merely displaced sidewise from each other. Displacement to the left indicates improved performance, and hence the amount of improvement produced by a change in pupil size is indicated. The optical system of the eye achieves
its maximum performance when the pupil is 2.5 mm in diameter. The data in Fig. 41 can be compared directly with the data of Campbell and Green in Fig. 38.
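The statement that the theoretical Fraunhofer curves share one form and are merely displaced sidewise on a logarithmic separation scale follows from the cutoff separation being inversely proportional to pupil diameter. A minimal sketch under that assumption is given below; the function names are hypothetical, and the 546 nm default is taken from the wavelength quoted for the Arnulf-Dupuy data.

```python
import math

ARCMIN_PER_RADIAN = 180.0 * 60.0 / math.pi


def cutoff_separation_min(wavelength_nm, pupil_diameter_mm):
    """Center-to-center separation (minutes of arc) at the diffraction cutoff.

    The cutoff frequency is D / lambda cycles per radian, so the cutoff
    separation is lambda / D radians.
    """
    radians = (wavelength_nm * 1e-9) / (pupil_diameter_mm * 1e-3)
    return radians * ARCMIN_PER_RADIAN


def log_shift(d_from_mm, d_to_mm, wavelength_nm=546.0):
    """Sidewise displacement, in log10 units, between two theoretical curves.

    A negative value is a displacement to the left (finer separations).
    """
    return math.log10(
        cutoff_separation_min(wavelength_nm, d_to_mm)
        / cutoff_separation_min(wavelength_nm, d_from_mm)
    )


if __name__ == "__main__":
    for d in (0.5, 1.0, 1.6, 2.5):
        print(f"pupil {d:3.1f} mm   cutoff separation = "
              f"{cutoff_separation_min(546.0, d):.3f} min")
    print(f"shift from 1.0 mm to 2.5 mm = {log_shift(1.0, 2.5):+.3f} log units")
```

The shift depends only on the ratio of the two diameters, which is why the theoretical curves keep their shape when the pupil is changed.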
§ 10. Resolving Power at Unit Modulation

10.1. EFFECT OF PUPIL SIZE ON THE RESOLVING POWER
COBB [1915] made a careful study of the effect of pupil size on visual acuity measured with an Ives pattern illuminated with white light. As explained above, this pattern is equivalent to a sawtooth distribution. He used artificial pupils of various sizes and kept the average retinal illuminance constant at 148 trolands. The results are shown in Fig. 43. The results have been expressed in terms of resolving power (1/S) instead of in terms of visual acuity. Cobb pointed out that if the eye were diffraction-limited, it should perform much better than it does. The resolving power (1/S) should vary directly as the diameter of the pupil, as indicated by the dotted line. In adding this dotted line to the figure, it has been assumed that the two curves should come together for small pupils. Cobb concluded that the performance of the eye with large pupils must be seriously impaired by spherical and chromatic aberration and other defects.
Fig. 43. Resolving power of the eye for a sawtooth grating for pupils of different size. The average retinal illuminance was kept constant at 148 trolands. (Data from COBB [1915].)
BERGER-LHEUREUX-ROBARDEY [1965] has measured the resolving power of the eye at unit modulation with the Wollaston prism technique described below. The beam of light used in this measurement was obtained by focusing an enlarged image of a mercury source on the artificial pupil mounted just in front of the eye. The arrangement is similar to that shown in Fig. 24, except that in this figure the artificial pupil is not shown. A filter rendered the light monochromatic (546.3 nm). Artificial pupils of different size were used as indicated in Fig. 44, and the resolving power at unit modulation was measured by placing a Wollaston prism in the beam at the point conjugate to the retina and rotating the Wollaston prism around an axis parallel to the fringes. The grating pattern was sinusoidal.
Fig. 44. Resolving power of the eye for a sinusoidal grating for pupils of different size and different levels of retinal illuminance. (Data from BERGER-LHEUREUX-ROBARDEY [1965].)
The results obtained at three luminance levels are shown in Fig. 44. Although the role of diffraction is different when the source is conjugate to the aperture-stop instead of to the retina as in Cobb’s case, the data do show that with monochromatic light the performance of the eye is still impaired by spherical aberration and other defects when the pupil is large; they also show that the performance is affected by
Fig. 45. Effect of the level of retinal illuminance on the resolution threshold for the fovea of the eye for a square-wave grating. (White light with a 2-mm artificial pupil.) (Data from SHLAER [1937-381.)
[Fig. 46: modulation threshold for the retina plotted against minutes, for three opaque-to-clear ratios.]
the source in the plane of the pupil. The grating was conjugate to the retina. The data for white light are shown in Fig. 45. The resolving power reaches an upper limit where further increase in retinal illuminance has no effect. Shlaer believed that this upper limit was dependent on the coarseness of the retinal mosaic. In order to check this possibility further, he investigated the effect of changing the modulation, because if the resolving power were limited by the coarseness of the retinal mosaic, it should not be affected by changing the modulation. In order to manipulate the modulation, he changed the relative widths of the light and dark bars of his square-wave grating and found for the three black-white ratios which he used only a small difference, which he considered negligible. In a separate paper (FRY [1968c]) I have described a more detailed analysis of Shlaer’s modulation data, in which I have assumed that the image-forming mechanism of the eye has a Gaussian spread function with a sigma value of 0.212; and I have computed the effect of optical blur on the retinal images of his three gratings. I have plotted Shlaer’s data in Fig. 46, which shows the thresholds in terms of the modulation of the image formed on the retina. The curve through the data has a slope similar to that found for the retinal threshold curve in Figs. 28 and 36. Hence, it is assumed that the limit encountered by Shlaer is not due to the coarseness of the retinal mosaic. Furthermore, he was able to improve the acuity beyond the level indicated in Fig. 46 by increasing the size of his pupil. It must be noted that the lowest resolution threshold found by Shlaer, namely 0.94 min, is better than the best measurements (Table 2) reported by several investigators who used double slit interference patterns.

TABLE 2
Resolution thresholds measured with a double slit diffraction pattern

Investigator   Resolution threshold   Retinal illuminance   Wavelength
               (minutes)              (trolands)            (nm)
Le Grand       1.25                   13 600                546
               1.45                   2.72
Campbell       0.98                   500                   633
Green          0.95                   500                   633
Westheimer     1.25                   2 200                 555
Byram          0.50                   not specified         not specified

10.3. RESOLVING POWER FOR DOUBLE SLIT INTERFERENCE PATTERNS
The resolving power as measured with a double slit interference pattern must be better than that obtained with normal seeing in order to use the modulation thresholds to measure the optical transfer function. This relation holds in Fig. 37, where the modulation threshold curves cross the unit modulation line at different points. We must face the possibility that the upper limit of resolution is determined by the spread of excitation in the retina when a double slit pattern is used. It may be noted that the threshold curves (Fig. 36) for the double slit pattern cross the unit modulation line at a slope, indicating that the limit of resolving power has not yet been reached. The values for resolving power at unit modulation as found by different investigators using the double slit interference pattern are shown in Table 2. BYRAM [1944] reported that whereas the transition from resolution to fusion was smooth in ordinary seeing, it was not smooth with a double slit pattern; as the lines approached the limit of resolution, they broke up into segments and appeared to shimmer or flutter. CAMPBELL and GREEN [1965] pointed out that just before fusion the central patch scintillates and appears brighter and more desaturated than the surround. These effects may be taken as evidence that the structure of the retinal mosaic is involved. HELMHOLTZ [1924] reported seeing similar effects when a fine grating is viewed with the natural pupil. WESTHEIMER [1960] reported that his subjects found the transition to be smooth. LE GRAND [1935] reported that at 13 600 trolands the difference between the Young fringes and the Babinet fringes was negligible, but the Babinet fringes suffered more impairment than the Young fringes as the luminance level or contrast was reduced.
§ 11. Dependence of the Resolving Power on Wavelength Composition

The existence of three types of cones in the fovea poses a problem for resolving power because they do not respond equally to all monochromatic stimuli, and thus the resolving power of the eye should depend on the extent to which the three types of cones are involved. Are the cones of a kind arranged in clusters or do the different kinds of cones interdigitate so that cones of each of the three types are uniformly distributed? Are the three types equally numerous? How do the responses of the three kinds of cones interact at the level of the bipolars and ganglion cells? O’BRIEN and MILLER [1952] investigated this problem by using double slit monochromatic interference fringes, but they were unable by this method to throw any light on the arrangement or relative numbers of the three receptors. HARTRIDGE [1947] used a single point and reported a scintillation of changing colors as the image moved across the retina. At ordinary luminance levels visual acuity with high contrast gratings is independent of the wavelength for a given pupil size when monochromatic stimuli are used, provided that allowance is made for the effect of wavelength on diffraction. Yellow light, which stimulates red- and green-sensitive cones, yields the same result as red light, which stimulates only red-sensitive cones. One way of explaining this independence is to say that the three kinds of cones channel their brightness responses through the same ganglion cells. As long as the luminance levels are above the thresholds of the three kinds of cones, it is reasonable to expect a simple summation of excitation and inhibition at the level of interaction between the retino-cortical pathways. The spread of excitation and inhibition is the same as if each cone had an ICI luminosity curve. In a previous publication (FRY [1955]), I have assumed that the retinal image of a polychromatic point source represents a superposition of the monochromatic images, each of which is dependent upon the total flux for that part of the spectrum and the extent to which the eye is out of focus for that wavelength. 
The ICI photopic luminosity curve was used in computing the amount of flux for the different parts of the spectrum. In accordance with this straightforward analysis of the problem, I have computed the distribution of retinal illuminance in an image for white light having a uniform energy spectrum. It was assumed that the eye was focused for 555 nm, and the data of WALD and GRIFFIN [1947] for the chromatic aberration of the eye were used in computing the extent to which the eye was out of focus for other wavelengths. The distribution of flux in such an image is shown in Fig. 47. Included in the same figure is the distribution of flux for the
Fig. 47. The theoretical optical images formed on the retina for a white point and for a monochromatic point (555 nm) with a 2-mm exit-pupil and with chromatic aberration and diffraction taken into account. Abscissa: distance across the retina (microns). (From FRY [1955].)
image of a monochromatic point of 555 nm. The total flux in these two images is not the same; the amounts of flux have been adjusted so that the retinal illuminances at the two peaks are equal. It is obvious that the white image shows more spread than the yellow one. I proposed a simplified method of evaluating the effects of chromatic aberration for a diffraction-limited eye through the use of the Fry-Cobb index of blur, φ. It turns out that for a polychromatic point image in a diffraction-limited
Fig. 48. The theoretical effect of throwing the eye out of focus on the value of φ (expressed in microns) for white light and yellow light (555 nm) with a 2-mm exit-pupil. Abscissa: diopters out of focus for yellow light. (From FRY [1955].) Eye thrown out of focus by moving the retina.
Fig. 49. The theoretical effect of varying the pupil size on the value of φ (expressed in microns) for white light and yellow light (555 nm). Abscissa: radius of exit-pupil. Eye focused for 555 nm. (From FRY [1955].)
where F_λ represents the total flux in the image per unit wavelength, and F represents the total flux in the image. Table 1, which gives values of V for various values of a, can facilitate the computation of φ. Figure 48 shows a plot of φ expressed in micron units of distance across the retina as a function of diopters out of focus for yellow light and for white light. When the eye is in focus, the image for monochromatic yellow light is sharper than the image for white light; but for values of φ greater than 12 μ, the depth of focus is better for white light than for monochromatic light. The index of blur approach can also be used to show the effect of pupil size on blur. The results are shown in Fig. 49. For monochromatic light, the reciprocal of φ increases in proportion to the radius of the exit-pupil, whereas for white light, the image becomes sharper as the radius increases up to about 1.5 mm and then becomes poorer as the effect of chromatic aberration offsets the effect of pupil size on diffraction. CAMPBELL and GUBISCH [1967] have approached the problem by treating the modulation transfer function for heterochromatic light as the weighted sum of the several transfer functions for the different parts of the spectrum. These component parts are weighted according to the luminosity curve, and allowance is made in the transfer functions for the extent to which the eye is out of focus. They have used the average measurements of IVANOFF [1947], HARTRIDGE [1947] and CAMPBELL [1957] for the chromatic aberration of the eye. Using this approach, Campbell and Gubisch showed that the
modulation transfer functions for white and yellow light converge at high and low frequencies but diverge from each other in the middle of the frequency range with the maximum difference falling around 30 to 40 cycles per degree, depending on the size of the pupil. Hence, one cannot expect to find differences in the performance of the eye for white and yellow light at unit modulation. One must look for the difference in the frequency range of about 30 cycles per degree. Campbell and Gubisch measured the modulation thresholds for white and yellow light at a frequency of 30 cycles per degree with the eye thrown out of focus various amounts. The pupil diameter was 2.5 mm. The results are shown in Fig. 50 and can be compared to the curves in Fig. 48.
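The weighted-sum treatment of heterochromatic light described above can be sketched in a few lines. This is only an outline of the arithmetic: the function name is hypothetical, and the weights and component transfer factors in the usage lines are made-up numbers standing in for luminosity weights and for defocused monochromatic transfer functions, which would have to be supplied from actual data.

```python
def heterochromatic_mtf(weights, component_mtfs):
    """Weighted mean of monochromatic transfer functions.

    weights        : one luminosity weight per spectral band
    component_mtfs : one list of transfer factors per band, all sampled at
                     the same spatial frequencies
    Returns the transfer factors of the mixture at those frequencies.
    """
    if len(weights) != len(component_mtfs):
        raise ValueError("need one weight per component transfer function")
    total = float(sum(weights))
    n_freq = len(component_mtfs[0])
    return [
        sum(w * mtf[i] for w, mtf in zip(weights, component_mtfs)) / total
        for i in range(n_freq)
    ]


if __name__ == "__main__":
    # Hypothetical three-band example: the in-focus band carries most weight,
    # the two out-of-focus bands have lower transfer factors.
    weights = [0.2, 1.0, 0.3]
    mtfs = [
        [1.0, 0.50, 0.10],   # band out of focus on one side
        [1.0, 0.80, 0.40],   # band for which the eye is focused
        [1.0, 0.55, 0.15],   # band out of focus on the other side
    ]
    print(heterochromatic_mtf(weights, mtfs))
```

Dividing by the sum of the weights keeps the mixture's transfer factor equal to unity at zero frequency, so the result can be compared directly with a monochromatic curve.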
Fig. 50. Data for RWG for contrast sensitivity (1/S) for a sine-wave grating for various diopters out of focus for white light (filled circles) and yellow (555 nm) light (open circles) with a 2.5 mm artificial pupil and a spatial frequency of 30 c/degree. Abscissa: lens power, diopters. (From CAMPBELL and GUBISCH [1967].)
They also showed that for a small pupil (1.5 mm) the difference between monochromatic yellow and white light is negligible, but is measurable with a 2.5 mm pupil and again disappears with a 4.0 mm pupil. This agrees with the predictions in Fig. 49, except that at a pupil diameter of 4.0 mm, it must be assumed that spherical aberration is so bad that the difference between white and yellow light is negligible. Although the use of the CIE photopic luminosity curve for weighting the different parts of the spectrum in computing the modulation transfer function for heterochromatic light is satisfactory if one is looking for theoretical values to compare with the experimental data obtained by reflection from the retina, Campbell and Gubisch have questioned whether the CIE photopic luminosity curve can be used
in computing theoretical values to compare with the experimentally obtained values for modulation thresholds. By comparing the results obtained with white light with the results obtained with monochromatic light (546 and 578 nm), they concluded that it is better to use the green component of the luminosity curve for weighting the different parts of the spectrum when the eye is in focus for 546 nm and to use the red component of the luminosity curve when the eye is in focus for 578 nm. The theory involved is that the red-sensitive cones dominate the situation in one case and the green ones dominate in the other. This conclusion is based in part on VON BAHR’S [1946] finding that if grating acuity is measured with a mixture of two colors, the performance is determined by the color for which the eye is focused. One can raise the question as to what would happen if one could restrict the input to the retina to one kind of cone. In the different types of dichromatic forms of color blindness, it is often, but not universally, assumed that one of the three kinds of cones is missing. But the chances are that these are replaced by cones of the other two types, and hence the resolving power need not be affected because the population of photoreceptors is more sparse. In blue cone monochromatism (BLACKWELL and BLACKWELL [1961]), visual acuity is impaired; however, information about the distribution and number of photoreceptors in such retinas and about the neural organization is lacking. GREEN [1968] has approached this problem by using normal eyes and the two-color and bleaching techniques developed by STILES [1949] and BRINDLEY [1954] for isolating the responses of the red, blue, and green photoreceptors. In this technique, the modulation thresholds for red and green light were measured on red and green backgrounds. It was found that the modulation threshold for green light was dependent upon the extent to which green light stimulates green receptors. 
Since red light stimulates only red receptors, the modulation thresholds measured with red light are dependent upon red-sensitive photoreceptors acting alone. The modulation curves for red and green light are very similar, and this helps to explain the eye’s response to yellow light, which involves a combination of the red and green photoreceptors. In studying the isolated response of the blue receptors, Green first bleached the red and green receptors by exposing the retina to yellow light and then measured the modulation thresholds for blue light with a blue sinusoidal grating superimposed on a yellow background. The resolving power of the eye for high levels of modulation is no better
than 10 cycles per degree. One could conclude from this that the blue receptors are less numerous and more sparsely distributed than the red and green receptors, and the high levels of acuity achieved with blue light under ordinary conditions must be explained by the involvement of the red and green photoreceptors. The bleaching and two-color techniques used by Green involve high levels of luminance. At ordinary luminance levels, below 1000 trolands, the red and green photoreceptors involve little or no bleaching; thus, changes in adaptation dependent upon mechanisms at the level of the bipolars and higher could affect equally the responses of the red, green and blue photoreceptors regardless of the wavelength composition involved in producing the state of adaptation. An important point which may be deduced from Green’s data is that below its threshold a cone yields no response which can be summated with the responses from other photoreceptors. Consequently, it may be assumed that there is no such thing as a sub-threshold response of the red photoreceptors which can affect the supra-threshold response of the green photoreceptors.

§ 12. Visibility of Square-Wave Gratings
Fig. 51. Modulation threshold curve for a square-wave grating. Average data for two subjects. White light and a 2-mm entrance pupil. The dashed curve represents the modulation threshold for a single border. Legend: varying magnification, varying contrast; abscissa in minutes. (From LEVY and FRY [1968].)

Although sine-wave gratings appear to be ideal, in that such a pattern retains the form of a sine wave no matter how much it is demodulated, there are nevertheless things to be learned from the study of the eye's response to square waves. At high frequencies, it may be assumed that the optical image of a square-wave grating formed on the retina has become transformed to a sine wave. Since for an eye in focus the point spread function can be approximated by a Gaussian spread function, it follows that the modulation of the retinal image for a square wave of high frequency is precisely 4/π times that of a sine wave. Using a formula derived by ENGEL [1968] for the modulation of a line grating convoluted with a Gaussian spread function, FRY [1968a] has shown that the modulation of a high frequency square wave convoluted with a Gaussian spread function is 4/π times that for a sine wave. CAMPBELL and ROBSON [1964 and 1968] have verified by experiment that this is true. They had arrived at the conclusion that this relationship must exist by assuming that the eye filters out all but the first harmonic of the square wave. Figure 51 shows the modulation threshold curve (average for two subjects, DL and DK) obtained in a study by LEVY and FRY [1968]
Fig. 52. Distributions across the retina of excitation (R) and inhibition (I) produced by a square-wave grating of constant modulation and variable frequency. Each half cycle is 1.25 times wider than the half cycle which precedes it. The numbers below the (R-I) curve indicate the lengths of the half cycles. The transfer curves are shown in Fig. 32. Abscissa: distance across the retina (minutes). (From LEVY and FRY [1968].)
for a luminance level of 21.7 fL and an artificial pupil of 2 mm. Shown also by the dotted line is the level of the modulation threshold for a single border between a bright and a dark area. In order to make the comparison between the grating and a single border meaningful, the grating was always presented with a border between a dark bar on the left and a bright bar on the right at the center of the pattern; the subject was asked to fixate a point on this border. This corresponds to fixating a point on a single border. It is interesting to show the effect of convoluting a square-wave grating with a Gaussian spread function, which simulates the optical spread function of the eye, and an exponential spread function, which represents physiological irradiation of excitation, and a negative Gaussian spread function, which represents the spread of inhibition. This is shown in Fig. 52 and represents the same kind of analysis as that shown in Fig. 33 for a sine wave. The thing that makes the square wave differ from the sine wave is that the effects at the borders between the bright and dark bars eventually become the same from border to border and are equivalent to those for a single straight border between extended bright and dark areas; whereas in the case of the sine wave, the gradient between the peaks and the troughs gradually flattens, and eventually the blurred bars become washed out to a uniform patch of brightness. It is obvious from Fig. 52 that the improved visibility at a frequency of about 3 cycles per degree is dependent on the fact that the modulation of the inhibitory component has been reduced to a low level, whereas at lower frequencies, the modulation of the inhibitory component reduces the modulation of the excitation transmitted through the retina. The upturn in the modulation curve shown in Fig. 51 can, therefore, be partially attributed to the inhibitory mechanism, but it actually rises above the level for a single border.
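The 4/π relation between square-wave and sine-wave modulation at high frequencies can be checked numerically. The following is only a sketch under stated assumptions: the spread function is taken to be purely Gaussian with a standard deviation of 0.4 minute (the value quoted elsewhere in this chapter for the image-forming mechanism), the excitation and inhibition components are ignored, and all function names are hypothetical. Each odd harmonic of a unit-modulation square wave is attenuated by the Gaussian transfer function and the series is summed at the center of a bright bar.

```python
import math

def gaussian_mtf(freq_cpm, sigma_min):
    """Transfer factor of a Gaussian line spread function (standard deviation
    sigma_min, minutes of arc) at freq_cpm cycles per minute of arc."""
    return math.exp(-2.0 * (math.pi * sigma_min * freq_cpm) ** 2)

def blurred_sine_modulation(freq_cpd, sigma_min=0.4):
    """Image modulation of a unit-modulation sine grating (cycles per degree)."""
    return gaussian_mtf(freq_cpd / 60.0, sigma_min)

def blurred_square_modulation(freq_cpd, sigma_min=0.4, max_harmonic=399):
    """Image modulation of a unit-modulation square-wave grating.

    The square wave is expanded in odd harmonics with amplitudes 4/(pi*k);
    the blurred series is evaluated at the center of a bright bar, where
    sin(k*pi/2) alternates between +1 and -1.
    """
    freq_cpm = freq_cpd / 60.0
    total = 0.0
    for k in range(1, max_harmonic + 1, 2):
        sign = 1.0 if k % 4 == 1 else -1.0
        total += sign * (4.0 / math.pi) / k * gaussian_mtf(k * freq_cpm, sigma_min)
    return total

def square_to_sine_ratio(freq_cpd, sigma_min=0.4):
    """Ratio of square-wave to sine-wave image modulation at one frequency."""
    return (blurred_square_modulation(freq_cpd, sigma_min)
            / blurred_sine_modulation(freq_cpd, sigma_min))

if __name__ == "__main__":
    for f in (1.0, 10.0, 30.0, 60.0):
        print(f, round(square_to_sine_ratio(f), 4))
```

At low frequencies the ratio stays close to one, since both patterns are passed almost unchanged; as the frequency rises, the higher harmonics are removed and the ratio tends to 4/π.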
If the spacing of the grating were gradually increased, the threshold for the grating would eventually have to return to the level for a single border. The question needs to be raised, therefore, as to whether the mechanism subserving border contrast makes it possible for the visibility of a given border in the grating to be affected by the presence of adjacent borders. If we consider the case of a single border, we can convolute it with the spread functions involved in optical blur and in the spread of excitation and inhibition, the same as in the case of a square-wave or sine-wave grating, as shown in Fig. 35. The excitation transmitted through the retina is considerably enhanced on the bright side of the border and
Fig. 53. Cornsweet's pattern (RATLIFF [1965]) (A), which when rotated gives the illusion illustrated in (B) of a uniform disk surrounded by a uniform annulus of lower brightness.
depressed on the dark side. Although some observers claim to see bright and dark bands on the two sides of a contrast border, these are probably caused by poor focus, which results in gradients which produce definite bright and dark Mach bands. If the eye is in focus, the observer sees a contrast border as an abrupt transition from bright to dark, and the color on the bright side of the border is perceived as uniform or else as gradually decreasing from the border outward. The counterpart of this occurs on the dark side of the border. The extreme case of this phenomenon can be demonstrated by rotating the pattern shown in Fig. 53A, which when rotated creates the illusion of a contrast border between uniform patches of color of different brightness, as shown in Fig. 53B. If we can think of frequency of impulses in the optic nerve fibers as the correlate of brightness, then it is necessary to postulate some kind of frequency-equalizing mechanism in the retina which tends to wash out gradients but which can maintain an abrupt frequency difference at the midpoint of an S-shaped gradient. It must be supposed that this frequency-equalizing mechanism operates at a higher level than the mechanism involved in the spread of excitation and inhibition. It is conceivable that this mechanism would permit the presence of one border to affect the visibility of a neighboring border. If this kind of mechanism operates in the case of a grating, the remarkable thing is that it must operate when the inhibiting border lies at its own threshold of visibility. This problem has been studied by FIORENTINI [1968] by measuring the effect of a single inhibiting border on a nearby, parallel, narrow bar used as a test stimulus. WILD [1959] has also studied this problem.
§ 13. The Visibility of a Bar
It is interesting to compare the results obtained with a square-wave grating with those obtained with a single bar (LEVY and FRY [1968]). The open circles in Fig. 54 represent measurements made with single bright bars having various widths. Reducing the width of a single bright bar has no effect on its visibility down to a width of about 8 minutes. As width is further reduced, the threshold curve eventually conforms to a straight line with a slope of minus one.

Fig. 54. Contrast thresholds for bright and dark bars of various widths. Bar height = 78 min. Background luminance = 21.7 fL. Artificial pupil diameter = 2 mm. Averaged data for two subjects. Abscissa: width of bar (minutes). (From LEVY and FRY [1968].)
If a dark bar is used instead of a bright bar, the results show, as indicated by the filled circles in Fig. 54, that the dark bar has a minimum threshold at a bar width of about 10 minutes. The difference between bright and dark bars at the wide end of the scale indicates that two borders with their dark sides facing each other interfere more with each other than two borders with their bright sides facing each other. When the bars are narrow, the opposite relation exists. These findings are not consistent with the assumption that the contrast threshold is the same for dark objects on a given background as it is for bright objects. In this experiment the subject fixated the center of the bar both when the bar was dark and when it was bright. It has been proposed by FRY and COBB [1935] that the index of blur defined above can be assessed from a set of threshold data for bright bars on a darker background. The data in Fig. 54 for bright bars have been replotted in Fig. 55B to illustrate the procedure. According to the theory, a straight line with a slope of minus one tangent
Fig. 55. Data in Fig. 54 analyzed to determine the index of blur. (A) Dark bars; (B) bright bars. Abscissa: width of bar (minutes).
to the curve where the bars are narrow and a horizontal straight line tangent to the curve where the bars are wide must intersect at the point which has an abscissa value equal to φ. As a bar increases in width, the illuminance at the center of the image increases. For narrow bars this increase is proportional to the width, but as the bar gets wider, the illuminance increases less rapidly and eventually levels off to a constant value. Figure 56 shows the effect of increasing the width of a bar on the distribution of flux across the image. In particular, the change in concentration of flux at the center of the image can be noted. In this figure the line spread function is Gaussian, and the distance across the retina is expressed in microns.
Fig. 56. Bars of various widths convoluted with a Gaussian spread function having a σ-value of 5 microns. The abscissas (μ) indicate distance from the center of the image measured in microns, from -40 μ to +40 μ. (From FRY [1955].)
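The construction of the two tangent lines can be illustrated with a short calculation. This is a sketch only: it assumes a purely Gaussian line spread function and the simple Fry-Cobb hypothesis that threshold is reached at a fixed illuminance at the center of the image, so that threshold contrast varies inversely with the central illuminance; the function names are hypothetical. Under these assumptions the central illuminance of a bar of width w is erf(w / (2√2 σ)), and the slope-minus-one and horizontal asymptotes of the threshold curve meet at w = σ√(2π).

```python
import math

def center_illuminance(width, sigma):
    """Relative illuminance at the center of the image of a bar of the given
    width blurred by a Gaussian line spread function (standard deviation
    sigma). A very wide bar gives 1."""
    return math.erf(width / (2.0 * math.sqrt(2.0) * sigma))

def threshold_contrast(width, sigma, c_wide=1.0):
    """Threshold contrast under the assumption of a fixed central illuminance;
    c_wide is the threshold for a very wide bar."""
    return c_wide / center_illuminance(width, sigma)

def asymptote_intersection(sigma, c_wide=1.0):
    """Width at which the narrow-bar asymptote (C = k / width, slope minus one
    on log-log axes) crosses the wide-bar asymptote (C = c_wide)."""
    narrow = 1e-4 * sigma                      # a width well inside the narrow-bar regime
    k = threshold_contrast(narrow, sigma, c_wide) * narrow
    return k / c_wide

if __name__ == "__main__":
    sigma = 0.4                                # minutes of arc
    print(asymptote_intersection(sigma))       # numerical intersection
    print(sigma * math.sqrt(2.0 * math.pi))    # closed form for a Gaussian
```

For σ = 0.4 minute the intersection falls very close to one minute.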
Fig. 57. Distributions of excitation (R) and inhibition (I) across images of bright bars on a bright background; ΔL/L = 0.286. The R-I curves represent the net excitation transmitted through the retina. The transfer curves are shown in Fig. 32. Abscissa: distance across the retina (minutes). (From LEVY and FRY [1968].)
Fry and Cobb made the simple assumption that the threshold for a given background level depends on having a fixed amount of retinal illuminance at the center of the image. For the case in which the bar is very wide, this can be easily computed. It was later found (FRY [1946, 1965d]) that the index of blur as measured by the Fry-Cobb method varies with the luminance level; it had to be concluded that physiological irradiation is involved. Figure 57 shows bright bars of various widths on a background of a given luminance convolved with spread functions which give curves for R, I and R-I. This figure indicates that for a very wide bar the threshold no longer depends upon the response at the center of the bar, but rather upon the peak and trough at each of the two borders. The implication of Fig. 57 is that as the bar increases in width, the threshold should go through a minimum before it becomes independent of width. With a bright bar (Fig. 55B) the phenomenon of a minimum does not occur, but it does occur with a dark bar as shown in Fig. 55A. In this case the horizontal line used to assess φ is drawn tangent to the threshold curve at its minimum.
It should be noted in Fig. 57 that at a width of 10 minutes the peak in the response is still a single peak and has not broken into a double peak. It is as if a response with a single peak rises out of the center of a crater of inhibition; the floor of the crater is the background, and it is the extent of this rise from the floor of the crater to the peak of the excitation which determines the threshold. The floor-to-peak rise would be nearly the same with or without inhibition, and hence inhibition can be ignored in analyzing threshold data for bars narrower than 10 minutes. It should be noted in passing that φ as derived from the threshold data for bright and dark bars is of the order of four minutes of arc; but in the analysis of the modulation data for a sine-wave grating it was assumed that the point spread function for the image-forming mechanism is Gaussian and has a standard deviation of 0.4 minute, which gives it a φ-value of one minute. The implication is that the major source of the spread is physiological irradiation. One can use threshold data for a single bar not only to determine the value of φ, but also to determine the entire line spread function (FRY [1955]) for the combination of optical blur and spread of excitation. The first step in deriving the line spread function is to plot the reciprocal of the threshold values of contrast as a function of the width G of the bar and then differentiate to obtain values of d(1/C)/dG for various values of G. Since in this case G = 2t, one can now compute the normalized line spread function H(t) for the combination of optical blur and spread of excitation:

$$H(t) = C_{\min}\,\frac{d(1/C)}{dG}. \qquad (62)$$
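The procedure of eq. (62) can be tried on synthetic data. The sketch below is illustrative only: it generates thresholds from an assumed Gaussian line spread function (standard deviation 0.73 minute, the figure reported for the Fry and Cobb data) under the fixed-central-illuminance hypothesis, and then recovers the spread function by numerical differentiation of 1/C with respect to the bar width G. The function names and the central-difference step are assumptions, not part of the original method.

```python
import math

def gaussian_lsf(t, sigma):
    """Normalized Gaussian line spread function."""
    return math.exp(-0.5 * (t / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def synthetic_threshold(width, sigma, c_min):
    """Threshold contrast for a bar of the given width, assuming threshold is
    reached at a fixed illuminance at the center of the blurred image."""
    return c_min / math.erf(width / (2.0 * math.sqrt(2.0) * sigma))

def lsf_from_thresholds(threshold, width, c_min, step=1e-3):
    """Eq. (62): H(t) = C_min * d(1/C)/dG, evaluated at t = G/2 by a central
    difference; `threshold` is any callable returning C for a width G."""
    slope = (1.0 / threshold(width + step) - 1.0 / threshold(width - step)) / (2.0 * step)
    return c_min * slope

if __name__ == "__main__":
    sigma, c_min = 0.73, 0.01
    curve = lambda g: synthetic_threshold(g, sigma, c_min)
    for g in (0.5, 1.0, 2.0, 4.0):
        t = g / 2.0
        print(t, lsf_from_thresholds(curve, g, c_min), gaussian_lsf(t, sigma))
```

With measured thresholds the differentiation would be applied to a smoothed curve, since numerical derivatives amplify scatter in the data.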
It is important to note that this whole procedure can be based on threshold measurements for bars that range in width from 0 to 10 minutes; and as indicated in Fig. 57, one can assume that in this range the threshold is dependent on a constant peak to floor difference in R. For a bright white bar on a background of 2293 trolands with a pupil 2.33 mm in diameter, FRY and COBB [1935] found that the line spread function approximates a Gaussian distribution having a standard deviation of 0.73 minute (φ = 2.92 minutes).
§ 14. The Visibility of Borders
The visibility of borders is an intriguing problem in itself, with many
facets. It merits consideration here because LOWRY and DEPALMA [1961] have worked backward from measurements of the Mach band effect to determine the modulation transfer function of the eye, including the optical as well as the physiological components. Lowry and DePalma used a uniform gradient between two uniform areas and obtained bright and dark Mach bands at the two edges of the gradient. RATLIFF [1965] has also used the Mach band phenomenon in connection with his study of the mechanisms of lateral inhibition in the Limulus eye (RATLIFF et al. [1963]). MARIMONT [1963] used the asymmetry of the bright and dark Mach bands studied by Lowry and DePalma as evidence that the system is non-linear. We could explain this effect by assuming that the part of the system beyond the retina is non-linear, but we must face the possibility that our concepts at the retinal level may have to be revised to take non-linearities into account.
§ 15. Retinal Reflectometry
As pointed out above, one of the methods of studying the quality of the image formed on the retina is to measure and analyze the flux reflected from the retina when the image of a line or border or grating is formed on the retina. Various investigators (WESTHEIMER and CAMPBELL [1962]; KRAUSKOPF [1962]; FLAMANT [1955]; ROHLER [1962]) have contributed studies of this kind, but one of the more recent and more successful of these studies is the one made by CAMPBELL and GUBISCH [1966]. White light was used, and the image of a narrow bar was formed on the retina with artificial pupils of different size. The distributions of flux deduced from light reflected from the retina are shown in Fig. 58. Shown also in each of the diagrams for the different pupil sizes is the theoretical perfect image of a white slit formed by a diffraction-limited eye corrected for chromatic aberration. As pointed out by Campbell and Gubisch, the measured image conforms to the theoretical image when the pupil size is reduced to 1.5 mm, except at the bottom of the distribution, where it flares out. This flaring out has been attributed to stray light. Campbell and Gubisch explained that the stray light component is not involved in the psychophysical approach described above because stray light is involved with the Young fringes formed on the retina as well as with the image of a sinusoidal grating formed in the ordinary way. Consequently, when the one modulation transfer
function is subtracted from the other, the stray light is eliminated. One can make an analysis of the role played by chromatic aberration by computing the distribution of flux in the image on the basis of
Fig. 58. Optical line spread functions of the human eye for various pupil diameters. The continuous curves are derived from measurements of light reflected from the retina. The dotted curves are theoretical curves for white light for an eye corrected for chromatic and spherical aberration and limited only by diffraction. Panels are labelled by pupil diameter (mm); abscissa: angular distance (minutes of arc). (From CAMPBELL and GUBISCH [1966].)
diffraction theory on the assumption that the eye is diffraction-limited. The CIE luminosity curve can be used in this analysis, and one does not have to be concerned with the splitting of this curve into red, green and blue components as in the psychophysical method of measuring optical blur. In this way, one can account for about one half of the discrepancy between the theoretical and measured distributions. The remainder of the discrepancy must be attributed to spherical aberration, irregular astigmatism, stray light (FRY [1965c]), and the factors which could make the distribution of flux to which the photoreceptors respond differ from the distribution of flux in the reflected image. From the data in Fig. 58, GUBISCH [1967] has computed Strehl ratios for the eye for different pupil sizes and has found these to range from 0.025 for a 6.6-mm pupil to 0.65 for a 1.5-mm pupil. These ratios permit the eye to be compared to other optical instruments.
§ 16. Aberrations of the Eye

Only a limited number of aberrations has to be studied in order to investigate image quality of the eye. As explained above, the eye turns toward whatever has to be seen critically, and we do not have to pay attention to the quality of the images falling anywhere on the retina other than at the center of the fovea. Consequently, we do not have to be concerned with the variations in radial astigmatism and coma corresponding to the secondary lines of sight. The beam centered on the line of sight is not always radially symmetrical, but once the distribution of flux formed on the retina by this beam has been determined, the procedure does not have to be repeated for other beams. Since the primary line of sight deviates from the optic axis by about 5°, it is oblique to each of the refracting surfaces; and the images formed by different wavelengths of light are laterally displaced from each other. Since only one beam is involved, it is more appropriate to describe this type of aberration in terms of dispersion rather than in terms of chromatic differences in magnification. This aspect of image quality can be manipulated by changing the centration of the pupil and by placing dispersing prisms in front of the eye. The problems of chromatic dispersion have been ignored in this review, but they have been treated elsewhere (FRY [1955]). The problems of chromatic aberration have been handled in this review by collecting data relative to the axial chromatic aberration
of the eye and using this data to compute the effect of combining images of different wavelength. The major problem which has not yet been covered in this review is that of determining the monochromatic aberrations of the beam corresponding to the primary line of sight. One obvious approach to this problem is to trace beams of light through different parts of the pupil, which converge at the center of the fovea, and then to use this data in reverse to determine the course of the rays transmitted through the exit-pupil from a point in front of the eye. One can then reconstruct the wave front emerging from the exit-pupil and from this compute the distribution of flux on the retina. In order to demonstrate the feasibility of the procedure, the writer [1955] used IVANOFF'S data [1947] and computed the distribution of flux in images of monochromatic light out of focus to various degrees. The results obtained have been useful in demonstrating many aspects of the role played by spherical aberration, but as pointed out by WESTHEIMER [1955], Ivanoff's analysis of his data involves an artifact, and the whole procedure should be repeated using more valid data for spherical aberration. This kind of study should be extended to demonstrate the effects obtained with pupils of different size and with heterochromatic point sources, although experimentally this complication can be avoided by using monochromatic light. The changes in the spherical aberration of the eye with changes in accommodation should also be investigated. Attention needs to be called to the recent work of ARNULF et al. [1956] and of BERNY [1968] on the use of the knife-edge technique for the study and quantitative assessment of the spherical aberration of the eye. This technique supplements the ray tracing techniques used by the earlier investigators (AMES and PROCTOR [1923]; FRY [1949]; IVANOFF [1953]; KOOMEN et al. [1949]).
Spectacle lenses have to be maintained rigidly with respect to the head while the eyes move, and hence the problems of design are somewhat similar to those involved in the design of eyepieces for a roving eye. A contact lens mounted on the cornea could be designed to correct for the spherical aberration of a roving eye. It also permits the use of spatial filtering.

§ 17. Motor Adjustments of the Eye
In physiology the words motor and sensory differentiate two major
functions of a nervous system. In this sense, motor adjustments of the eye under nervous control include the pointing of the eye in a given direction, the focusing of the eye, and the regulation of the size of the pupil. These problems have been reviewed recently by FIORENTINI [1961], WESTHEIMER [1963], ALPERN [1962] and LOWENSTEIN and LOWENFIELD [1962]. All that needs to be said here is that micronystagmoid movements can either blur the retinal image or improve the visibility when they are slow enough. Researches in this field form a large and separate branch of visual science. The subject of the nervous control of accommodation and the art and science of pupillography have similarly been developed into well identified, separate branches of visual science.
§ 18. General Reviews

Attention is also called to several reviews of both the sensory and motor mechanisms involved in the perception of fine detail (WESTHEIMER [1965]; LIT [1968]; BOYNTON [1962]; ONLEY [1964]; RIGGS [1965]).
Appendix I

Although the distance from the second nodal point to the retina is now generally assumed to be about 17 mm, it is 15 mm in the Helmholtz schematic eye. The secondary focal length f′ is 20 mm and the index of the vitreous (n′) is 4/3 (FRY [1959b]). In Figs. 22 and 23, it has been assumed that R′ coincides with F′, that f′ = 20 mm and that the distance from the second nodal point to the retina is 15 mm. The effect of throwing the eye out of focus can be explained in terms of Fig. 19. In Figs. 22 and 23 the eye without a lens is in focus for an object at R which lies at an infinite distance and is then thrown out of focus by a thin lens at the primary focal point (F). The number of diopters out of focus is the power of the lens, and it follows from the Newtonian formula for conjugate foci that

Diopters out of focus = II/(0.001 f f′) = 3.33 II,

where f, f′ and II are expressed in mm. In eq. (41) a = O′M′, c = O′R′ and (a-c) = 2).
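The conversion factor of 3.33 can be verified by direct arithmetic. In the sketch below the quantity in the numerator is read as a displacement along the axis on the image side, expressed in mm; this reading, the value f = 15 mm for the primary focal length (taken from the nodal-point distance quoted above), and the function name are all assumptions made for illustration.

```python
def diopters_out_of_focus(displacement_mm, f_mm=15.0, f_prime_mm=20.0):
    """Newtonian relation x * x' = f * f' rearranged to give the lens power
    (diopters) that corresponds to an image-side displacement in mm.
    The 0.001 factor converts mm to m."""
    return displacement_mm / (0.001 * f_mm * f_prime_mm)

if __name__ == "__main__":
    # One millimetre of displacement corresponds to about 3.33 diopters.
    print(diopters_out_of_focus(1.0))
    # The ratio f'/f reproduces the index of the vitreous, 4/3.
    print(20.0 / 15.0)
```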
References

ALPERN, M., 1962, Movements of the Eyes, in: The Eye, Vol. 3, ed. H. Davson (Academic Press, New York) Chs. 1-8.
American Medical Association Council on Industrial Health, 1955, A.M.A. Archives of Industrial Health 12, 439-449.
AMES Jr., A. and C. A. PROCTOR, 1923, Am. J. Phys. Optics 4, 3-37.
ARNULF, A. and O. DUPUY, 1960, C. R. Acad. Sc. 250, 2757-2759.
ARNULF, A., O. DUPUY and F. FLAMANT, 1956, Méthode objective pour l'étude des défauts du système optique de l'oeil, in: Problems in Contemporary Optics (Istituto Nazionale di Ottica, Arcetri-Firenze) pp. 330-335.
BERGER-LHEUREUX-ROBARDEY, S., 1965, Rev. Opt. 44, 294.
BERNY, F., 1968, Formation des Images Rétiniens: détermination de l'aberration sphérique du système optique de l'oeil, Doctoral Thesis, University of Paris.
BLACKWELL, H. R. and O. M. BLACKWELL, 1961, Vision Research 1, 62-107.
BOYNTON, R. M., 1962, Ann. Rev. Psych. 13, 171-200.
BRINDLEY, G. S., 1954, J. Physiol. 124, 400-408.
BRYNGDAHL, O., 1966, J. Opt. Soc. Am. 56, 811.
BYRAM, G. M., 1944, J. Opt. Soc. Am. 34, 718-724.
CAMPBELL, F. W., 1957, Optica Acta 4, 157-164.
CAMPBELL, F. W. and D. G. GREEN, 1965, J. Physiol. 181, 576-593.
CAMPBELL, F. W. and R. W. GUBISCH, 1966, J. Physiol. 186, 558-578.
CAMPBELL, F. W. and R. W. GUBISCH, 1967, J. Physiol. 192, 345-358.
CAMPBELL, F. W. and J. G. ROBSON, 1964, J. Opt. Soc. Am. 54, 581.
CAMPBELL, F. W. and J. G. ROBSON, 1968, J. Physiol. (London) 197, 551.
COBB, P. W., 1911, Am. J. Physiol. 29, 76.
COBB, P. W., 1915, Am. J. Physiol. 36, 335.
Committee on Colorimetry of the Opt. Soc. Am., 1953, The Science of Color (Thomas E. Crowell, New York) pp. 223-228.
DEMOTT, D. W., 1959, J. Opt. Soc. Am. 49, 571.
DEPALMA, J. J. and E. M. LOWRY, 1962, J. Opt. Soc. Am. 52, 328-335.
ENGEL, G. R., 1968, J. Opt. Soc. Am. 58, 1416.
ENOCH, J. M., 1963, J. Opt. Soc. Am. 53, 71.
ENOCH, J. M. and G. A. FRY, 1958, J. Opt. Soc. Am. 48, 899.
ENROTH-CUGELL, C. and J. G. ROBSON, 1966, J. Physiol. 187, 517.
FIORENTINI, A., 1961, Dynamic Characteristics of Visual Processes, in: Progress in Optics, Vol. 1, ed. E. Wolf (North-Holland Publishing Co., Amsterdam) Ch. 7.
FIORENTINI, A., 1968, Excitatory and Inhibitory Interactions in the Human Eye (to be published in the Trans. of the Intern. Conf. on Visual Science, Indiana University, April 2-4, 1968).
FLAMANT, F., 1955, Rev. Opt. 34, 433.
FRY, G. A., 1946, Optometric Weekly 37, 1521.
FRY, G. A., 1949, The O-Eye-O 15, 8.
FRY, G. A., 1953, Am. J. Optom. 30, 22-37.
FRY, G. A., 1955, Blur of the Retinal Image (The Ohio State University Press, Columbus).
FRY, G. A., 1959a, The Relation of Blur and Grain to the Upper Limit of Useful
Magnification, Report (RADC-TN-59-267) from the Ohio State University Research Foundation to the Rome Air Development Center under Contract No. AF30(602)-1580.
FRY, G. A., 1959b, The Image-Forming Mechanism of the Eye, in: Handbook of Physiology, Vol. 1 (American Physiological Society, Washington, D.C.) pp. 647-670.
FRY, G. A., 1961, J. Opt. Soc. Am. 51, 560-563.
FRY, G. A., 1965a, The Eye and Vision, in: Applied Optics and Optical Engineering, Vol. 2, ed. [...]

[...] 0, and $P_i(\omega)$ is set equal to zero for $\omega < 0$. Now the photocurrent $i(t)$ actually consists of a series of discrete pulses which we will assume to be infinitely narrow. Therefore $C_i(\tau)$ has two distinct contributions: If the electrons at $t$ and $t+\tau$ are distinct,
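$$\langle W^{(1)}(t)\,W^{(1)}(t+\tau)\rangle = \langle W^{(2)}(t,t+\tau)\rangle = \sigma^2\langle I\rangle^2 g^{(2)}(\tau),$$

while if the same electron occurs at $t$ and $t+\tau$,

$$\langle W^{(1)}(t)\,W^{(1)}(t+\tau)\rangle = \langle W^{(1)}(t)\rangle\,\delta(\tau) = \sigma\langle I\rangle\,\delta(\tau).$$

Therefore,

$$C_i(\tau) = e^2\sigma\langle I\rangle\,\delta(\tau) + e^2\sigma^2\langle I\rangle^2 g^{(2)}(\tau) = e\langle i\rangle\,\delta(\tau) + \langle i\rangle^2 g^{(2)}(\tau). \qquad (2.9)$$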
We consider below two different detection schemes which were analyzed from a different point of view by FORRESTER [1961].

2.2. HOMODYNE OR SELF-BEAT DETECTION
Suppose that the optical field is a narrow band Gaussian random process. (For monochromatic light scattered by a dilute solution of scatterers, for example, the Gaussian statistics of the scattered field follows from the central limit theorem.) The field is characterized by an autocorrelation function

$$C_E(\tau) = \langle E^*(t)\,E(t+\tau)\rangle = \langle I\rangle\,g^{(1)}(\tau).$$

For random Gaussian fields, the second-order correlation function $g^{(2)}(\tau)$ is related to $g^{(1)}(\tau)$ by (MANDEL [1963], REED [1962]):

$$g^{(2)}(\tau) = 1 + |g^{(1)}(\tau)|^2. \qquad (2.10)$$

Whence

$$C_i(\tau) = e\langle i\rangle\,\delta(\tau) + \langle i\rangle^2\left(1 + |g^{(1)}(\tau)|^2\right). \qquad (2.11)$$
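The relation (2.10) for Gaussian fields can be illustrated by a small simulation. The following is a sketch under assumptions that are not part of the text: the field is modelled as a discrete first-order autoregressive complex Gaussian sequence, for which $g^{(1)}$ at a lag of k samples equals $a^k$, and all names and parameter values are hypothetical. The estimated intensity correlation is compared with one plus the squared modulus of the estimated field correlation.

```python
import math
import random

def simulate_gaussian_field(n_samples, a=0.8, seed=1):
    """Complex Gaussian field samples with unit mean intensity and field
    correlation a**k at lag k (first-order autoregressive model)."""
    rng = random.Random(seed)
    sd = math.sqrt(0.5)                       # each quadrature has variance 1/2
    kick = math.sqrt(0.5 * (1.0 - a * a))     # keeps the variance stationary
    re, im = rng.gauss(0.0, sd), rng.gauss(0.0, sd)
    field = []
    for _ in range(n_samples):
        field.append(complex(re, im))
        re = a * re + kick * rng.gauss(0.0, 1.0)
        im = a * im + kick * rng.gauss(0.0, 1.0)
    return field

def g1(field, lag):
    """Normalized first-order correlation <E*(t) E(t+lag)> / <I>."""
    n = len(field) - lag
    mean_intensity = sum(abs(e) ** 2 for e in field) / len(field)
    corr = sum(field[j].conjugate() * field[j + lag] for j in range(n)) / n
    return corr / mean_intensity

def g2(field, lag):
    """Normalized second-order correlation <I(t) I(t+lag)> / <I>^2."""
    intensity = [abs(e) ** 2 for e in field]
    n = len(intensity) - lag
    mean_intensity = sum(intensity) / len(intensity)
    corr = sum(intensity[j] * intensity[j + lag] for j in range(n)) / n
    return corr / mean_intensity ** 2

if __name__ == "__main__":
    field = simulate_gaussian_field(200000)
    for lag in (0, 1, 3, 10):
        print(lag, g2(field, lag), 1.0 + abs(g1(field, lag)) ** 2)
```

The two printed columns agree to within the statistical scatter of the estimates; at zero lag both are close to 2.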
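In the experiments which we will be discussing, the normalized correlation function $g^{(1)}(\tau)$ will usually have the form

$$g^{(1)}(\tau) = e^{-i\omega_0\tau}\,e^{-\gamma|\tau|}. \qquad (2.12)$$

The optical spectrum of a field described by eq. (2.12) is

$$I(\omega) = \langle I\rangle\,\frac{\gamma/\pi}{(\omega-\omega_0)^2+\gamma^2}, \qquad (2.13)$$

which is a Lorentzian of half-width at half-maximum $\Delta\omega_{1/2}$ (optical) $= \gamma$, centered at $\omega = \omega_0$, with total intensity $\langle I\rangle$. The photocurrent spectrum associated with this field is found from eqs. (2.7), (2.11) and (2.12):

$$P_i(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega\tau}\left\{e\langle i\rangle\,\delta(\tau) + \langle i\rangle^2 + \langle i\rangle^2 e^{-2\gamma|\tau|}\right\}d\tau = \frac{e\langle i\rangle}{2\pi} + \langle i\rangle^2\,\delta(\omega) + \langle i\rangle^2\,\frac{2\gamma/\pi}{\omega^2+(2\gamma)^2}. \qquad (2.14a)$$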
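Since $P_i(\omega)$ is symmetric about $\omega = 0$, we combine the positive and negative frequency parts to obtain a spectrum for positive frequencies only:

$$P_i^{+}(\omega) = \frac{e\langle i\rangle}{\pi} + \langle i\rangle^2\,\delta(\omega) + 2\langle i\rangle^2\,\frac{2\gamma/\pi}{\omega^2+(2\gamma)^2}, \qquad \omega \ge 0. \qquad (2.14b)$$

[...] where $\phi(t)$ is the stochastic phase modulation. For this field

$$g^{(1)}(\tau) = e^{-i\omega_0\tau}\,\langle \exp\{i\phi(t)\}\exp\{-i\phi(t+\tau)\}\rangle.$$

The optical spectrum, which is the Fourier transform of $g^{(1)}(\tau)$, will be centered at $\omega_0$, but will be broadened due to the decay of the phase autocorrelation $\langle \exp\{i\phi(t)\}\exp\{-i\phi(t+\tau)\}\rangle$ with increasing $\tau$. Since the intensity of this field is constant, however,

$$g^{(2)}(\tau) = 1, \qquad (2.16)$$

so that from eq. (2.9), the photocurrent autocorrelation function is $C_i(\tau) = e\langle i\rangle\,\delta(\tau) + \langle i\rangle^2$, and the photocurrent spectrum is

$$P_i^{+}(\omega) = \frac{e\langle i\rangle}{\pi} + \langle i\rangle^2\,\delta(\omega). \qquad (2.17)$$

Comparison of eqs. (2.17) and (2.14) shows that the source we are discussing here produces the same d.c. and shot noise terms as the Gaussian random field, but that the "light beating" term is completely missing! Thus the photocurrent spectrum of an amplitude stabilized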
single mode laser exhibits no light beats even though the optical spectrum has a finite linewidth due to the random phase modulation. This result, which can be attributed to the lack of phase sensitivity in the photoemission process, is equivalent to the assertion in § 1.2 that there is no "excess noise" in the photoelectric emission current produced by a constant intensity source. Thus the total power (neglecting shot noise) in the spectrum produced by a Gaussian field is $2\langle i\rangle^2$, while for the "coherent" source, the power is $\langle i\rangle^2$. This result also follows from the requirement that $\int P_i(\omega)\,d\omega = C_i(0) = \langle i^2\rangle$, and $\langle i^2\rangle = \langle i\rangle^2$ for a constant intensity source, while for a Gaussian source $\langle i^2\rangle = 2\langle i\rangle^2$.
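The contrast between a finite optical linewidth and the absence of light beats can be seen in a simple numerical model. The sketch below is an illustration under assumed conditions: the amplitude-stabilized field is represented by a unit-amplitude phasor whose phase performs a Gaussian random walk (a phase-diffusion model chosen here for convenience), and the names and step variance are hypothetical. The field correlation decays with lag, while the intensity correlation stays equal to one.

```python
import math
import random

def simulate_phase_diffusing_field(n_samples, step_variance=0.1, seed=1):
    """Unit-amplitude field exp(-i*phi) whose phase phi takes independent
    Gaussian steps of the given variance between successive samples."""
    rng = random.Random(seed)
    step_sd = math.sqrt(step_variance)
    phase = 0.0
    field = []
    for _ in range(n_samples):
        field.append(complex(math.cos(phase), -math.sin(phase)))
        phase += rng.gauss(0.0, step_sd)
    return field

def g1_magnitude(field, lag):
    """Modulus of the normalized field correlation at the given lag."""
    n = len(field) - lag
    corr = sum(field[j].conjugate() * field[j + lag] for j in range(n)) / n
    mean_intensity = sum(abs(e) ** 2 for e in field) / len(field)
    return abs(corr) / mean_intensity

def g2(field, lag):
    """Normalized intensity correlation at the given lag."""
    intensity = [abs(e) ** 2 for e in field]
    n = len(intensity) - lag
    mean_intensity = sum(intensity) / len(intensity)
    corr = sum(intensity[j] * intensity[j + lag] for j in range(n)) / n
    return corr / mean_intensity ** 2

if __name__ == "__main__":
    field = simulate_phase_diffusing_field(200000)
    for lag in (0, 5, 10, 20):
        # For this model the field correlation is exp(-lag * step_variance / 2).
        print(lag, g1_magnitude(field, lag), math.exp(-0.05 * lag), g2(field, lag))
```

Because the intensity never fluctuates, the mean square current equals the square of the mean current in this model, in line with the constant intensity case described above.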
2.3. HETERODYNE DETECTION
In § 2.2, we considered the spectrum of the photocurrent with the detector illuminated only by the field under study. As an alternative procedure, the detector can be illuminated simultaneously by this field and by a coherent local oscillator signal. This heterodyne detection scheme was utilized in the original polystyrene diffusion broadening experiment of Cummins, Knable and Yeh (CUMMINS et al. [1964]) and was extended to the study of critical opalescence by ALPERT et al. [1965]. (Methods for producing local oscillator signals will be discussed in § 5.) Let the field under study be $E_S(t)$ and the local oscillator field be $E_{LO}(t) = E_{LO}^0\exp\{-i\omega_{LO}t\}$. The photocurrent produced by either $E_S$ or $E_{LO}$ in the absence of the other is $i_S$ or $i_{LO}$, where
$\langle i_S(t)\rangle = e\sigma\langle E_S^*(t)E_S(t)\rangle$,
$\langle i_{LO}(t)\rangle = e\sigma\langle E_{LO}^*(t)E_{LO}(t)\rangle = e\sigma|E_{LO}^0|^2 = i_{LO}$,
since $i_{LO}$ is constant. The current autocorrelation function
$C_i(\tau) = e^2\sigma\,\delta(\tau)\langle E^*(t)E(t)\rangle + e^2\sigma^2\langle E^*(t)E(t)E^*(t+\tau)E(t+\tau)\rangle$
simplifies considerably if the local oscillator field is much stronger than the scattered field so that $I_{LO} \gg \langle I_S\rangle$. When $\langle E^*(t)E(t)E^*(t+\tau)E(t+\tau)\rangle$ is expanded using $E(t) = E_S(t)+E_{LO}^0\exp\{-i\omega_{LO}t\}$, the result contains 16 terms of which ten are zero, and three are time independent terms whose sum is $I_{LO}^2+2I_{LO}\langle I_S\rangle$. The remaining three terms give
$I_{LO}\{[\exp(i\omega_{LO}\tau)]\langle E_S^*(t)E_S(t+\tau)\rangle + [\exp(-i\omega_{LO}\tau)]\langle E_S(t)E_S^*(t+\tau)\rangle\} + \langle E_S^*(t)E_S(t)E_S^*(t+\tau)E_S(t+\tau)\rangle$.
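If $I_{LO} \gg \langle I_S\rangle$, we can neglect the last term in the preceding line (which is just the homodyne spectrum discussed earlier) and also keep only $I_{LO}^2$ of the d.c. terms. Thus:
$C_i(\tau) = e^2\sigma I_{LO}\,\delta(\tau) + e^2\sigma^2I_{LO}^2 + e^2\sigma^2I_{LO}\langle I_S\rangle\{[\exp(i\omega_{LO}\tau)]g_S^{(1)}(\tau) + [\exp(-i\omega_{LO}\tau)]g_S^{(1)*}(\tau)\}$
$\qquad\ = e\,i_{LO}\,\delta(\tau) + i_{LO}^2 + i_{LO}\langle i_S\rangle\{[\exp(i\omega_{LO}\tau)]g_S^{(1)}(\tau) + [\exp(-i\omega_{LO}\tau)]g_S^{(1)*}(\tau)\}$. (2.18)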
The photocurrent spectrum produced by the combined signal and local oscillator field is again found from eq. (2.18) via the Wiener-Khintchine theorem, eq. (2.7):
$P_i(\omega) = \frac{e\,i_{LO}}{2\pi} + i_{LO}^2\,\delta(\omega) + \frac{i_{LO}\langle i_S\rangle}{2\pi}\int_{-\infty}^{\infty}e^{i\omega\tau}\{[\exp(i\omega_{LO}\tau)]g_S^{(1)}(\tau) + [\exp(-i\omega_{LO}\tau)]g_S^{(1)*}(\tau)\}\,d\tau$. (2.19)
Again the photocurrent spectrum consists of three terms. The first is the shot noise, the second is the d.c., and the third term gives the heterodyne light beating spectrum. Notice that in this case the light beating spectrum depends on $g_S^{(1)}$ rather than $g_S^{(2)}$, and is therefore determined uniquely by the spectrum of the signal field independent of the statistics.
Characteristically $g_S^{(1)}(\tau) = [\exp(-i\omega_0\tau)][\exp(-\gamma|\tau|)]$ (as in eq. (2.12)), whence the integral in eq. (2.19) can be evaluated in closed form.
(If $\omega_{LO} = \omega_0$, the last term should be doubled for $\omega \ge 0$, and set = 0 for $\omega < 0$.) Note that the heterodyne light beating spectrum is a Lorentzian identical in shape to the optical spectrum but centered at $\omega = |\omega_0-\omega_{LO}|$.
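As a rough numerical illustration of this statement, the sketch below takes the beat term of eq. (2.18) for the Lorentzian-line form $g_S^{(1)}(\tau) = \exp(-i\omega_0\tau)\exp(-\gamma|\tau|)$, Fourier transforms it by direct quadrature, and compares the result with a pair of unit-area Lorentzians of half-width $\gamma$ centered at $\pm(\omega_0-\omega_{LO})$. The $(1/2\pi)$ normalization of the transform, the function names and the parameter values are assumptions made for this illustration only.

```python
import math

def lorentzian(x, center, hwhm):
    """Unit-area Lorentzian with half-width at half-maximum `hwhm`."""
    return (hwhm / math.pi) / ((x - center) ** 2 + hwhm ** 2)

def beat_term_closed_form(omega, gamma, delta):
    """Two Lorentzians of half-width gamma at +delta and -delta,
    where delta = omega_0 - omega_LO is the frequency offset."""
    return lorentzian(omega, delta, gamma) + lorentzian(omega, -delta, gamma)

def beat_term_quadrature(omega, gamma, delta, n_steps=40000):
    """(1/2 pi) * integral of exp(i omega tau) * 2 cos(delta tau) exp(-gamma |tau|).

    The integrand is even in tau, so the transform reduces to a cosine
    integral over tau >= 0, evaluated here with the trapezoidal rule.
    """
    tau_max = 40.0 / gamma            # exp(-40) makes the truncation negligible
    h = tau_max / n_steps

    def integrand(tau):
        return (2.0 * math.cos(delta * tau) * math.exp(-gamma * tau)
                * math.cos(omega * tau))

    total = 0.5 * (integrand(0.0) + integrand(tau_max))
    for k in range(1, n_steps):
        total += integrand(k * h)
    return total * h / math.pi

if __name__ == "__main__":
    gamma, delta = 1.0, 10.0          # hypothetical half-width and offset
    for omega in (8.0, 10.0, 11.0):
        print(omega, beat_term_quadrature(omega, gamma, delta),
              beat_term_closed_form(omega, gamma, delta))
```

For an offset large compared with $\gamma$, only the Lorentzian at $+|\omega_0-\omega_{LO}|$ contributes appreciably at positive frequencies, which is the shifted replica of the optical line described in the text.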
This characteristic is one major advantage of the heterodyne detection scheme. The light beating spectrum is an exact replica of the optical spectrum shifted to a center frequency $|\omega_0-\omega_{LO}|$, and it does not depend on the statistics of the field. Thus, for example, two single mode lasers can produce a heterodyne beat signal even though, as we have seen, neither will produce a homodyne signal.
2.4. FORRESTER'S APPROACH
A useful heuristic discussion of the light beating effect was given by
FORRESTER [1961]. Following Forrester, we consider an optical source of spectral density $I(\beta)$ which illuminates a photodetector. We will assume that the optical field is spatially coherent over the illuminated portion of the photocathode, and that the detector response is independent of frequency. Divide the frequency range into intervals of width $\Delta\beta$, as shown in Fig. 1. Then the intensity in the $m$th frequency interval is given by $I(\beta_m)\Delta\beta$, and the corresponding electric field can be written
$E_m = [I(\beta_m)\Delta\beta]^{1/2}\exp\{i(\beta_mt+\phi_m)\}$, (2.21)
where the different phase factors $\phi_m$ are assumed to be random. (This assumption is equivalent to asserting that the field is a Gaussian random process.) The photocurrent is then $i = \sigma|E|^2$, where $\sigma$ is the quantum efficiency, and $E = \sum_{m=1}^{\infty}E_m$. Hence the photocurrent is given by
$i(t) = \sigma\,\Delta\beta\sum_{m,n}[I(\beta_m)I(\beta_n)]^{1/2}\exp\{i(\beta_m-\beta_n)t+i(\phi_m-\phi_n)\}$. (2.22)
Fig. 1. Distributed optical spectrum $I(\beta)$ (abscissa: angular frequency $\beta$).
We can write the sum over $m$ and $n$ as three separate sums:
$\sum_{m,n} = \sum_{m>n} + \sum_{m=n} + \sum_{m<n}$. (2.23)
The $m = n$ sum is the d.c. photocurrent,
$\langle i\rangle = \sigma\,\Delta\beta\sum_{n=1}^{\infty}I(\beta_n)$, (2.24)
which we now drop since we are interested in the spectrum of the photocurrent. If we reverse the roles of $m$ and $n$ in the third sum in eq. (2.23) and then combine the first and third sums, we have
$i(t) = 2\sigma\,\Delta\beta\sum_{m>n}[I(\beta_m)I(\beta_n)]^{1/2}\cos[(\beta_m-\beta_n)t+(\phi_m-\phi_n)]$. (2.25)
Now let $b = m-n$, $\omega = \beta_m-\beta_n$, and $\gamma_n = \phi_{n+b}-\phi_n$. Then, dropping the sum over $b$, we have the component of the current at frequency $\omega$ (in bandwidth $\Delta\beta$):
$i(\omega,t) = 2\sigma\,\Delta\beta\sum_{n=1}^{\infty}[I(\beta_n+\omega)I(\beta_n)]^{1/2}\cos(\omega t+\gamma_n)$. (2.26)
Since the phase factors $\gamma_n$ are assumed to be random, the time average of the current squared is the sum of the squares of the terms in eq. (2.26). Therefore, since $\langle\cos^2(\omega t+\gamma_n)\rangle = \tfrac{1}{2}$, the power $P(\omega,\Delta\beta)$ of the photocurrent in bandwidth $\Delta\beta$ at frequency $\omega$ is
$P(\omega,\Delta\beta) = 2\sigma^2(\Delta\beta)^2\sum_{n=1}^{\infty}I(\beta_n+\omega)I(\beta_n)$. (2.27)
The experimentally determined quantity is the power spectral density of the photocurrent, which is obtained by dividing eq. (2.27) by the bandwidth $\Delta\beta$:
$P_i(\omega) = 2\sigma^2\sum_nI(\beta_n+\omega)I(\beta_n)\Delta\beta$, (2.28)
where $P_i(\omega)$ is the photocurrent power spectrum neglecting d.c. and shot noise components. In the limit $\Delta\beta \to 0$, eq. (2.28) becomes
$P_i(\omega) = 2\sigma^2\int_0^{\infty}I(\beta+\omega)I(\beta)\,d\beta$. (2.29)
For a Lorentzian line of half-width $\gamma$ centered at $\omega_0$, we have (see eq. (2.13))
$I(\beta) = \langle I\rangle\,\frac{\gamma/\pi}{(\beta-\omega_0)^2+\gamma^2}$. (2.30)
Substituting (2.30) in (2.29) and evaluating the integral, we obtain, with $\langle i\rangle = \sigma\langle I\rangle$,
$P_i(\omega) = 2\langle i\rangle^2\,\frac{2\gamma/\pi}{\omega^2+(2\gamma)^2}$,
which is just the light beating term in eq. (2.14). The convolution equation derived by Forrester (eq. (2.29)) was also utilized by ALKEMADE [1959]. Through the convolution theorem for Fourier transforms Forrester's result can be shown to be equivalent to the result of the preceding section,
since $I(\beta)$ and $\langle E^*(t)E(t+\tau)\rangle$ are a Fourier transform pair (MANDEL [1963] p. 200). The convolution equation (eq. (2.29)) is often quite convenient for finding the light-beating component of the photocurrent when the optical spectrum is known. Three examples were considered by Forrester, each pairing an optical spectrum $I(\beta)$ with its light-beating spectrum $P_i(\omega)$:
A: Rectangular
B: Gaussian
C: Lorentzian
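To make the use of the convolution equation concrete, the following sketch evaluates eq. (2.29) by a simple midpoint rule for the three line shapes listed above and compares each result with the closed form of the autocorrelation of that shape (a triangle, a Gaussian broader by $\sqrt{2}$, and a Lorentzian of twice the width). Frequencies are measured from the line center, the spectra are normalized to unit area with $\sigma = 1$, and all function names and parameter values are assumptions chosen for illustration; the closed forms are standard results and are not transcribed from Forrester's table.

```python
import math

def rectangular(beta, width=2.0):
    """Unit-area rectangular line of full width `width`, centered at zero."""
    return 1.0 / width if abs(beta) <= width / 2.0 else 0.0

def gaussian(beta, s=1.0):
    """Unit-area Gaussian line with standard deviation `s`."""
    return math.exp(-beta ** 2 / (2.0 * s ** 2)) / (s * math.sqrt(2.0 * math.pi))

def lorentzian(beta, gamma=1.0):
    """Unit-area Lorentzian line with half-width `gamma`."""
    return (gamma / math.pi) / (beta ** 2 + gamma ** 2)

def beat_spectrum(line, omega, half_range, n=40000, sigma=1.0):
    """Midpoint-rule estimate of 2 sigma^2 * integral I(beta+omega) I(beta) d(beta)."""
    h = 2.0 * half_range / n
    total = 0.0
    for k in range(n):
        beta = -half_range + (k + 0.5) * h
        total += line(beta + omega) * line(beta)
    return 2.0 * sigma ** 2 * total * h

def beat_rectangular(omega, width=2.0):
    """Triangle: overlap of two rectangles displaced by omega."""
    return 2.0 * max(0.0, width - abs(omega)) / width ** 2

def beat_gaussian(omega, s=1.0):
    """Gaussian whose standard deviation is sqrt(2) * s."""
    return 2.0 * math.exp(-omega ** 2 / (4.0 * s ** 2)) / (2.0 * s * math.sqrt(math.pi))

def beat_lorentzian(omega, gamma=1.0):
    """Lorentzian of half-width 2 * gamma centered at zero frequency."""
    return 2.0 * (2.0 * gamma / math.pi) / (omega ** 2 + (2.0 * gamma) ** 2)

if __name__ == "__main__":
    print(beat_spectrum(rectangular, 0.5, 3.0), beat_rectangular(0.5))
    print(beat_spectrum(gaussian, 1.0, 10.0), beat_gaussian(1.0))
    print(beat_spectrum(lorentzian, 1.5, 200.0), beat_lorentzian(1.5))
```

Integrating any of the three closed forms over $\omega \ge 0$ gives $\sigma^2\langle I\rangle^2 = \langle i\rangle^2$, since the factor 2 in eq. (2.29) is offset by taking only positive beat frequencies.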
The rectangular spectrum (A), for example, is much easier to evaluate through the convolution equation than through the Wiener-Khintchine theorem, although for the Lorentzian (C), the converse is true. Note that in each case the total power in the light beating spectrum is equal to the d.c. power $\langle i\rangle^2$, a result which must always hold for fields with Gaussian statistics regardless of the actual spectrum.
2.5. QUANTUM THEORY
The quantum theory of optical coherence has been developed during the past five years by GLAUBER [1963a, 1963b, 1964, 1965, 1966, 1968]. The starting point of the quantum theoretical treatment is the photoelectric interaction in the detection process (eq. (2.1)) but with the field quantities treated as operators rather than as classical variables.
GLAUBER [1963] separated the field operator $E(r,t)$ into positive and negative frequency parts which are photon annihilation and creation operators, and showed that the probability per unit time that a photon is absorbed at a point $r$ and time $t$ by an "ideal" photodetector is proportional to the intensity
$I(r,t) = \langle\psi|E^{(-)}(r,t)E^{(+)}(r,t)|\psi\rangle$, (2.31)
where $|\psi\rangle$ is the initial state of the radiation field. Similarly, the rate per unit (time)$^n$ of detecting $n$-fold delayed counting coincidences is proportional to
$W^{(n)}(r_1t_1\ldots r_nt_n) = \langle\psi|E^{(-)}(r_1t_1)\ldots E^{(-)}(r_nt_n)E^{(+)}(r_nt_n)\ldots E^{(+)}(r_1t_1)|\psi\rangle$. (2.32)
For the more general case where the field may not be in a pure state $|\psi\rangle$, the density operator $\rho$ is used to define a series of quantum-mechanical correlation functions:
$G^{(1)}(rt, r't') = \mathrm{Tr}\{\rho E^{(-)}(rt)E^{(+)}(r't')\}$,
$G^{(n)}(r_1t_1\ldots r_{2n}t_{2n}) = \mathrm{Tr}\{\rho E^{(-)}(r_1t_1)\ldots E^{(-)}(r_nt_n)E^{(+)}(r_{n+1}t_{n+1})\ldots E^{(+)}(r_{2n}t_{2n})\}$. (2.33)
GLAUBER [1966] has specifically considered the problem of evaluating the photocurrent autocorrelation function for an $N$-atom model photodetector. If $N$ is sufficiently large,
$\langle i(t)i(t+\tau)\rangle = e^2\sigma\,\delta(\tau)G^{(1)}(0) + e^2\sigma^2G^{(2)}(\tau)$, (2.34)
where $\sigma$ is again the quantum efficiency of the detector and is assumed to be constant over the bandwidth of the signal. For the case of fields with Gaussian statistics, $G^{(2)}(\tau) = [G^{(1)}(0)]^2(1+|g^{(1)}(\tau)|^2)$ (GLAUBER [1965] p. 150), while $\sigma G^{(1)}(0)$ is the average photoemission rate, so that $\langle i\rangle = e\sigma G^{(1)}(0)$, whence
$\langle i(t)i(t+\tau)\rangle = e\langle i\rangle\delta(\tau) + \langle i\rangle^2(1+|g^{(1)}(\tau)|^2)$, (2.35)
where $g^{(1)}(\tau)$ is the normalized form of $G^{(1)}(\tau)$:
$g^{(1)}(\tau) = G^{(1)}(\tau)/G^{(1)}(0)$. (2.36)
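The Gaussian-field factorization used in eq. (2.35) can be illustrated with a small simulation. The sketch below generates a complex Gaussian field as a discrete Ornstein-Uhlenbeck process, for which $g^{(1)}$ at a lag of $k$ samples is $a^k$, and estimates the normalized intensity correlation, which should approach $1+|g^{(1)}|^2 = 1+a^{2k}$. The discrete-time model, the parameter $a$, the record length and the function names are assumptions introduced purely for illustration; they do not come from the treatments cited above.

```python
import math
import random

def simulate_gaussian_field(n_samples, a, seed=0):
    """Complex Gaussian field with g1(k) = a**k (discrete Ornstein-Uhlenbeck).

    Returns the list of intensities |E_k|^2, scaled so that <I> = 1.
    """
    rng = random.Random(seed)
    s = math.sqrt(0.5)                 # each quadrature has variance 1/2
    kick = math.sqrt(1.0 - a * a)      # keeps the variance stationary
    re, im = rng.gauss(0.0, s), rng.gauss(0.0, s)
    intensities = []
    for _ in range(n_samples):
        intensities.append(re * re + im * im)
        re = a * re + kick * rng.gauss(0.0, s)
        im = a * im + kick * rng.gauss(0.0, s)
    return intensities

def normalized_intensity_correlation(intensities, lag):
    """Estimate <I(t) I(t+lag)> / <I>^2 from one long record."""
    n = len(intensities) - lag
    mean = sum(intensities) / len(intensities)
    corr = sum(intensities[k] * intensities[k + lag] for k in range(n)) / n
    return corr / (mean * mean)

if __name__ == "__main__":
    a = 0.9
    record = simulate_gaussian_field(400000, a, seed=1)
    for lag in (0, 5, 40):
        print(lag, normalized_intensity_correlation(record, lag), 1.0 + a ** (2 * lag))
```

The estimate carries statistical scatter of a few percent for a record of this length, so agreement with $1+a^{2k}$ is only expected to that accuracy.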
The final step in evaluating (2.35) for a specific Gaussian field requires a relation between $g^{(1)}(\tau)$ and the optical spectrum. GLAUBER ([1965] p. 168) showed that $g^{(1)}(\tau)$ is just the Fourier transform of the normalized optical spectrum. For a Lorentzian spectrum of half-width $\gamma$, $1+|g^{(1)}(\tau)|^2 = 1+\exp(-2\gamma|\tau|)$, which together with (2.35) recovers our classical result (eq. (2.14)). Thus, for narrow band Gaussian random fields, the quantum theory and the semiclassical theory lead to identical photocurrent spectra.
We now consider the relation between the quantum theory and the semiclassical theory starting with eq. (2.34), which gives the current autocorrelation function (and thus the photocurrent spectrum) in terms of the two quantum mechanical correlation functions $G^{(1)}(0)$ and $G^{(2)}(\tau)$. For each mode of the radiation field, a convenient set of basis states are the coherent states $|\alpha_k\rangle$ which are related to the number states $|n\rangle$ by
$|\alpha\rangle = \exp(-\tfrac{1}{2}|\alpha|^2)\sum_n\frac{\alpha^n}{(n!)^{1/2}}|n\rangle$. (2.37)
The $|\alpha\rangle$ states are readily shown to be eigenstates of the annihilation operator:
$a_k|\alpha_k\rangle = \alpha_k|\alpha_k\rangle$, (2.38)
and also,
$\langle\alpha_k|a_k^{\dagger} = \alpha_k^*\langle\alpha_k|$.
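The expansion (2.37) and the eigenvalue property (2.38) are easy to check numerically in a truncated number-state basis. In the sketch below the amplitudes $\langle n|\alpha\rangle$ are built recursively; since $a|n\rangle = n^{1/2}|n-1\rangle$, the eigenvalue equation is equivalent to $(n+1)^{1/2}c_{n+1} = \alpha c_n$ for the amplitudes $c_n$. The truncation at `n_max`, the sample value of $\alpha$ and the function name are assumptions made for illustration.

```python
import math

def coherent_state_amplitudes(alpha, n_max):
    """Number-state amplitudes <n|alpha> for n = 0..n_max, from eq. (2.37)."""
    prefactor = math.exp(-0.5 * abs(alpha) ** 2)
    amplitudes = []
    term = complex(1.0, 0.0)           # alpha**n / sqrt(n!), built up recursively
    for n in range(n_max + 1):
        amplitudes.append(prefactor * term)
        term = term * alpha / math.sqrt(n + 1)
    return amplitudes

if __name__ == "__main__":
    alpha = complex(1.5, -0.8)         # hypothetical complex amplitude
    c = coherent_state_amplitudes(alpha, 60)
    norm = sum(abs(x) ** 2 for x in c)
    mean_n = sum(n * abs(x) ** 2 for n, x in enumerate(c))
    print("norm:", norm)
    print("mean photon number:", mean_n, "expected |alpha|^2:", abs(alpha) ** 2)
```

The squared amplitudes form a Poisson distribution with mean $|\alpha|^2$, so a truncation well above $|\alpha|^2$ leaves the norm equal to unity to machine precision.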
GLAUBER [1963b] discusses various representations of the density operator $\rho$ using the coherent states $|\alpha\rangle$ as basis states. (See also KLAUDER and SUDARSHAN [1965].) A particularly important class of fields exists for which the density operator can be written in the “P-representation” which for a single mode has the form:
$\rho = \int P(\alpha)\,|\alpha\rangle\langle\alpha|\,d^2\alpha$, (2.39)
where $P(\alpha)$ is a “quasi-probability” function. Note, however, that $P(\alpha)$ is not restricted to positive values. (For narrow band classical fields $E(r,t) = A\exp\{-i(\omega_0t-k_0\cdot r)\}$, the (complex) amplitude $A$ is described by a probability function $W_c(A)$ which is always positive.) For Gaussian fields, the QM quasiprobability function is
$P(\alpha) = \frac{1}{\pi\langle|\alpha|^2\rangle}\exp\{-|\alpha|^2/\langle|\alpha|^2\rangle\}$, (2.40a)
while the classical probability distribution is
$W_c(A) = \frac{1}{\pi\langle|A|^2\rangle}\exp\{-|A|^2/\langle|A|^2\rangle\}$. (2.40b)
Thus for this case $P(\alpha)$ and $W_c(A)$ are formally identical and lead to the same photocurrent spectra, as we have seen.
Now consider the coherent state. Classically, $E(r,t) = A_0\exp\{-i(\omega_0t-k_0\cdot r)\}$, so
$W_c(A) = \delta^{(2)}(A-A_0)$. (2.41a)
Since the intensity $E^*(r,t)E(r,t)$ of this field is constant, the photocurrent autocorrelation function is given by
$\langle i(t)i(t+\tau)\rangle = e\,i\,\delta(\tau)+i^2$.
Quantum mechanically, the coherent state is described by
$P(\alpha) = \delta^{(2)}(\alpha-\alpha_0)$. (2.41b)
Thus
$G^{(1)}(0) = \int P(\alpha)|\alpha|^2\,d^2\alpha = |\alpha_0|^2 = \langle I\rangle$,
and in the same way $G^{(2)}(\tau) = |\alpha_0|^4 = \langle I\rangle^2$, so that, from eq. (2.34)
$\langle i(t)i(t+\tau)\rangle = e^2\sigma\,\delta(\tau)\langle I\rangle + e^2\sigma^2\langle I\rangle^2 = e\langle i\rangle\delta(\tau)+\langle i\rangle^2$, (2.42)
which is just the result of our classical calculation. Similarly, the model of a laser which we have utilized in § 2.2 (an amplitude stabilized field with random phase) is described classically by
$W_c(A) = \frac{1}{2\pi|A_0|}\,\delta(|A|-|A_0|)$, (2.43a)
while the QM equivalent is (GLAUBER [1965] p. 164)
$P(\alpha) = \frac{1}{2\pi|\alpha_0|}\,\delta(|\alpha|-|\alpha_0|)$. (2.43b)
Since the classical distribution function (2.43a) again represents a constant intensity source, the photocurrent spectrum is identical to that for a coherent source of the same intensity. Equivalently, $P(\alpha)$ of (2.43b) and (2.41b) lead to identical photocurrent spectra.
Thus, for all homodyne cases which we considered in § 2.2, the semiclassical and quantum theories lead to identical predictions. Extension to the heterodyne problem for the case in which a large (coherent) local oscillator field is at the center frequency of the superimposed Gaussian random field has been discussed by CUMMINS [1968] and again leads to predictions which agree with the semiclassical theory.
In conclusion, we point out once again that the quantum theory prediction for the current autocorrelation function, eq. (2.35), and the semiclassical prediction, eq. (2.9), cannot be automatically equated despite their similarity in form, because the quantum mechanical and the classical correlation functions need not be identical. However, for the cases of interest in this review the P-representation always exists, and the quasi-probability $P(\alpha)$ is found to be identical to the classical amplitude probability function $W_c(A)$. For these cases, the quantum mechanical and the classical values of $G^{(1)}(0)$ and $G^{(2)}(\tau)$ are the same, and thus both theories lead to the same predictions for the photocurrent spectrum regardless of the intensity of the optical field. This result can, in fact, be shown to hold for any case in which the P-representation exists if $P(\alpha)$ is nonnegative for all $\alpha$ (SUDARSHAN [1963]).
§ 3. Light Scattering Theory
In the preceding sections we have discussed the photoelectric current spectra produced by various optical fields without specifying the source of the fields. We now specialize to light scattering experiments and present a brief summary of light scattering theory which will serve as a basis for the discussion of recent experiments in § 6. We will assume that the scattering medium under study is illuminated by a monochromatic exciting source $E(t) = E_0\exp(-i\omega_0t)$ and that the detector is small enough so that the scattered field is spatially coherent over the entire detector. The effects of relaxing these conditions will be considered in § 4.
3.1. SCATTERING BY A DILUTE SOLUTION OF PARTICLES
Consider a large volume (filled with a solvent) which contains $N$ identical scatterers. The volume is illuminated with a monochromatic plane wave of frequency $\omega_0$, polarized perpendicular to the scattering plane, and light scattered at an angle $\theta$ is observed at a distant point $R$ (see Fig. 2).
Fig. 2. Geometry of a light scattering experiment.
The field observed at $R$ due to the $j$th scatterer will be
$E_j = A_j(t)\,e^{i\phi_j}e^{-i\omega_0t}$, (3.1)
where the amplitude $A$ may depend on the orientation of the scatterer. If we let the position of the $j$th scatterer be $r_j$ and we choose the phase $\phi = 0$ for a scatterer at the origin, then
$\phi_j \approx (K_0-K_S)\cdot r_j = q\cdot r_j$, (3.2)
where $K_0$ and $K_S$ are the wave vectors of the incident and scattered light respectively. If the scatterer moves slowly (v ].
(3.4)
The average total scattered intensity is given by $I_S = \langle|E_S|^2\rangle$, where the angular brackets denote a time average. Since the scatterers are not correlated, all cross terms average to zero, whence
We now invoke the statistical independence of the scatterers to eliminate cross terms ($j \ne m$), the statistical independence of position and orientation to factor amplitudes and phases, and finally the fact that the $N$ scatterers are identical so that each must have the same autocorrelation function. Thus
$C(\tau) = N\exp(-i\omega_0\tau)\,\langle A^*(t)A(t+\tau)\rangle \times \langle[\exp\{-iq\cdot r(t)\}][\exp\{iq\cdot r(t+\tau)\}]\rangle$. (3.7)
Since we are now dealing with single particle correlation functions, the subscripts have been eliminated. The optical spectrum is then given by the Wiener-Khintchine theorem as:
$I(\omega) = (N/2\pi)\int_{-\infty}^{+\infty}[\exp\{i(\omega-\omega_0)\tau\}][C_A(\tau)][C_\phi(\tau)]\,d\tau$, (3.8)
where
$[C_A(\tau)] = \langle A^*(t)A(t+\tau)\rangle$,
$[C_\phi(\tau)] = \langle[\exp\{-iq\cdot r(t)\}][\exp\{iq\cdot r(t+\tau)\}]\rangle$.
3.1.1. Spherical scatterers
If the scatterers are spherical, then the scattering amplitude $A(t)$ is a constant and the amplitude autocorrelation function $[C_A(\tau)] = |A|^2$. From eq. (3.8), the optical spectrum is then given by
$I(\omega) = \frac{N|A|^2}{2\pi}\int_{-\infty}^{+\infty}[\exp\{i(\omega-\omega_0)\tau\}][C_\phi(\tau)]\,d\tau$. (3.9)
The phase autocorrelation function $[C_\phi(\tau)]$ is easily analyzed for three important cases (cf. PECORA [1964], ARECCHI [1967], CUMMINS et al. [1968]):
a. Static scatterers (fixed random positions),
$[C_\phi(\tau)] = 1$. (3.10a)
b. Scatterers moving with constant velocity $v$,
$[C_\phi(\tau)] = \exp(iq\cdot v\tau)$. (3.10b)
c. Scatterers undergoing translational diffusion characterized by the diffusion constant $D_T$,
$[C_\phi(\tau)] = \exp(-D_Tq^2|\tau|)$. (3.10c)
For each case, the phase autocorrelation function can then be put into eq. (3.9) to find the optical spectrum:
a. Static scatterers,
$I(\omega) = N|A|^2\,\delta(\omega-\omega_0)$, (3.11a)
so the total scattered intensity $N|A|^2$ appears at the frequency of the incident light $\omega_0$, and the scattering is perfectly elastic.
b. Constant velocity $v$,
$I(\omega) = N|A|^2\,\delta(\omega-\omega_0+q\cdot v)$. (3.11b)
Again the total scattered intensity appears at a single frequency, but now is Doppler shifted to $\omega = \omega_0-q\cdot v$. This case can be easily extended to the dilute gas where the velocities are distributed according to Maxwell-Boltzmann statistics, and leads to the usual Doppler-broadened (Gaussian) spectrum.
c. Translational diffusion,
$I(\omega) = N|A|^2\left[\frac{1}{2\pi}\int_{-\infty}^{+\infty}[\exp\{i(\omega-\omega_0)\tau\}][\exp\{-D_Tq^2|\tau|\}]\,d\tau\right] = N|A|^2\left[\frac{D_Tq^2/\pi}{(\omega-\omega_0)^2+(D_Tq^2)^2}\right]$. (3.11c)
The quantity in brackets in eq. (3.11c) is a normalized Lorentzian centered at $\omega = \omega_0$, with half-width at half-maximum of
$\Delta\omega_{1/2} = D_Tq^2$. (3.12)
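The exponential phase autocorrelation function of eq. (3.10c), which is responsible for the Lorentzian of half-width $D_Tq^2$, can be checked with a short Monte Carlo sketch. It assumes free Brownian motion, for which the displacement component along $q$ after a time $\tau$ is Gaussian with variance $2D_T\tau$, and averages $\exp\{iq\,\Delta x\}$ over many independent particles. The function name, particle number and parameter values are illustrative assumptions.

```python
import math
import random

def phase_autocorrelation(q, diffusion, tau, n_particles=20000, seed=0):
    """Monte Carlo estimate of <exp{i q [x(t+tau) - x(t)]}> for free diffusion.

    Only the displacement component along q matters; for Brownian motion it
    is Gaussian with variance 2 * diffusion * tau.
    """
    rng = random.Random(seed)
    step = math.sqrt(2.0 * diffusion * tau)
    re = im = 0.0
    for _ in range(n_particles):
        dx = rng.gauss(0.0, step)
        re += math.cos(q * dx)
        im += math.sin(q * dx)
    return complex(re / n_particles, im / n_particles)

if __name__ == "__main__":
    q, diffusion = 1.0, 0.5            # hypothetical values in arbitrary units
    for tau in (0.5, 1.0, 2.0):
        estimate = phase_autocorrelation(q, diffusion, tau)
        print(tau, estimate.real, math.exp(-diffusion * q * q * tau))
```

The imaginary part averages to zero within the sampling noise, and the real part follows $\exp(-D_Tq^2\tau)$, whose Fourier transform is the Lorentzian in eq. (3.11c).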
Equation (3.11c) for the quasielastic scattering from scatterers undergoing translational diffusion has been utilized in the analysis of the experiments which will be discussed in § 6. For the case of spheres in water at 20 °C, assuming $D_T = kT/6\pi\eta r$ (Stokes' law; $\eta$ = viscosity) and $\lambda_{vac} = 6328$ Å, it was shown that the linewidth (eq. (3.12)) becomes
$\Delta\omega_{1/2} = \frac{49\pi\sin^2(\theta/2)}{r}$, (3.13)
where $r$ is the radius of the sphere in microns (CUMMINS et al. [1964]).
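The order of magnitude of this linewidth can be reproduced directly from Stokes' law. The sketch below is an independent estimate, not the original calculation: it assumes handbook values for the viscosity and refractive index of water at 20 °C and the standard relation $q = (4\pi n/\lambda_{vac})\sin(\theta/2)$ for quasielastic scattering, none of which are quoted in the passage above. All names and constants in the code are therefore assumptions.

```python
import math

BOLTZMANN = 1.380649e-23      # J/K
TEMPERATURE = 293.15          # K (20 degrees C)
VISCOSITY = 1.002e-3          # Pa s, water at 20 C (assumed handbook value)
REFRACTIVE_INDEX = 1.333      # water (assumed)
WAVELENGTH_VAC = 632.8e-9     # m (6328 angstroms)

def diffusion_constant(radius_m):
    """Stokes-Einstein diffusion constant D_T = kT / (6 pi eta r)."""
    return BOLTZMANN * TEMPERATURE / (6.0 * math.pi * VISCOSITY * radius_m)

def scattering_wavevector(theta_rad):
    """q = (4 pi n / lambda_vac) sin(theta / 2), assumed quasielastic scattering."""
    return (4.0 * math.pi * REFRACTIVE_INDEX / WAVELENGTH_VAC
            * math.sin(theta_rad / 2.0))

def half_width(radius_m, theta_rad):
    """Half-width at half-maximum D_T q^2 of eq. (3.12), in rad/s."""
    return diffusion_constant(radius_m) * scattering_wavevector(theta_rad) ** 2

if __name__ == "__main__":
    # Coefficient of sin^2(theta/2) / r for a sphere of 1 micron radius.
    coefficient = half_width(1.0e-6, math.pi)
    print("D_T q^2 coefficient (rad/s):", coefficient)
    print("49 pi:", 49.0 * math.pi)
```

With these inputs the coefficient comes out near 150 rad/s, within a few percent of the $49\pi \approx 154$ of eq. (3.13); the small difference depends on the assumed viscosity and refractive index.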
3.1.2. Nonspherical scatterers
Many important macromolecules are physically anisotropic but optically isotropic. For optically isotropic particles which are very small compared to the wavelength, the scattering amplitude is independent of orientation, and the spectrum will be identical to that of spheres regardless of the shape (Rayleigh limit). If the scatterer is too large to meet the Rayleigh criterion (L ,
(4.24)
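$\langle i_{LO}\rangle = e\sigma|K|^2\langle|E_L(t)|^2\rangle = i_{LO}$, (4.25)
where the factoring of the correlation functions follows from the independence of the laser field and the scattering process, and the equality $\langle i_{LO}\rangle = i_{LO}$ follows from the time independence of $|E_L(t)|^2 = \sum_j|E_{0j}|^2$. The photocurrent autocorrelation function for the field $E(t)$ is
$C_i(\tau) = e^2\sigma\langle|E_L(t)|^2\rangle\langle|f(t)+Ke^{-i\omega_Mt}|^2\rangle\delta(\tau) + e^2\sigma^2\langle|E_L(t)|^2|E_L(t+\tau)|^2\rangle\langle|f(t)+Ke^{-i\omega_Mt}|^2\,|f(t+\tau)+Ke^{-i\omega_M(t+\tau)}|^2\rangle$.
Ten of the 16 terms in the last factor are zero, and by eq. (4.21a), $\langle|E_L(t)|^2|E_L(t+\tau)|^2\rangle = \langle|E_L(t)|^2\rangle^2$, so we have
$C_i(\tau) = e(\langle i_S\rangle+i_{LO})\delta(\tau) + e^2\sigma^2\langle|E_L(t)|^2\rangle^2\{\langle|f(t)|^2|f(t+\tau)|^2\rangle + 2|K|^2\langle|f(t)|^2\rangle + |K|^2[e^{i\omega_M\tau}\langle f^*(t)f(t+\tau)\rangle + e^{-i\omega_M\tau}\langle f(t)f^*(t+\tau)\rangle] + |K|^4\}$
$\qquad\ = e(\langle i_S\rangle+i_{LO})\delta(\tau) + \langle i_S\rangle^2g_S^{(2)}(\tau) + 2i_{LO}\langle i_S\rangle + i_{LO}\langle i_S\rangle[e^{i\omega_M\tau}g_S^{(1)}(\tau)+e^{-i\omega_M\tau}g_S^{(1)*}(\tau)] + i_{LO}^2$.
If $i_{LO} \gg \langle i_S\rangle$, we are left with
$C_i(\tau) = e\,i_{LO}\,\delta(\tau) + i_{LO}^2 + i_{LO}\langle i_S\rangle[e^{i\omega_M\tau}g_S^{(1)}(\tau)+e^{-i\omega_M\tau}g_S^{(1)*}(\tau)]$, (4.26)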
which is exactly the same current autocorrelation function that was obtained in § 2.3 for heterodyning with a monochromatic source for which the current from either the local oscillator or sample in the absence of the other was $i_{LO}$ or $\langle i_S\rangle$. It therefore follows that the photocurrent spectrum is the same for heterodyning with both the scattered field and the local oscillator signal derived from a common amplitude-stabilized multimode laser source as it is for heterodyning with a monochromatic source which produces the same scattered field intensity (or photocurrent $\langle i_S\rangle$) and the same local oscillator intensity (or photocurrent $i_{LO}$). Thus we conclude that for both the homodyne and heterodyne detection techniques the light beating spectrum obtained with a multimode laser source is the same as that obtained with a monochromatic source of the same intensity, except for the presence of the normally unimportant intermode beats.
4.4. SIGNAL TO NOISE
So far, we have computed the photocurrent spectrum $P_i(\omega)$ in terms of the ensemble or time-averaged photocurrent $\langle i\rangle$. For the case of homodyne detection of a Lorentzian optical spectrum when there are $N$ coherence areas on the detector, for example, we have from eq. (4.9):
$P_i(\omega) = \frac{e\langle i\rangle}{\pi} + 2\,\frac{\langle i\rangle^2}{N}\,\frac{2\gamma/\pi}{\omega^2+(2\gamma)^2}$,
where we have dropped the d.c. term which is normally blocked in an experiment. The spectral power at frequency $\omega$ consists of two terms: the signal term, and the shot noise term, $e\langle i\rangle/\pi$. The shot noise term, being constant, poses no limitation in principle to the precision with which the signal term can be measured. (A method for accurately subtracting out the shot noise term is discussed in § 5.1.) In practice, measurements of $P_i(\omega)$ are performed in finite time intervals, so that the observed spectrum is never identical to the prediction based on infinite time averages. The observed spectrum will exhibit fluctuations which will set the limit in practice on the precision with which the signal term can be measured. A meaningful index of the precision which can be expected in a given experiment is the ratio of the signal term to the r.m.s. uncertainty in the combined signal plus shot noise, which we shall henceforth refer to as the signal to noise ratio S/N.
HAUS [1968] has discussed the signal to noise problem from the point of view of the statistics of the individual photoelectric emission events. Alternatively, the photocurrent can be considered as a continuous process, and the S/N ratio evaluated by standard methods of electrical signal analysis (FORRESTER et al. [1955], BENEDEK [1968b]), which is the procedure we shall follow.
Fig. 4. Block diagram of an electronic spectrum analyzer.
A block diagram of an electronic spectrum analyzer is shown in Fig. 4. The optical signal $I(\omega)$ is detected by a phototube which has an output current $i_1(t)$ with a corresponding power spectrum $P_1(\omega)$. The photocurrent $i_1(t)$ passes through a narrow bandpass filter (filter A, the spectrum analyzer IF bandpass), is detected and squared and then integrated by the filter B, whose output current $i_4(t)$ is recorded. The squared signal $i_3(t)$ is a noise process and will contain fluctuations, no matter how large $i_3(t)$ may be. The signal $i_4(t)$ corresponds to a finite time average of $i_3(t)$, unlike the infinite time average spectra derived for various cases in § 2. The magnitude of the fluctuations in $i_4(t)$ depends on the product $\delta T$ of the bandwidth $\delta$ of the spectrum analyzer IF bandpass filter (filter A) and the time constant $T$ of the averaging filter (filter B), as we shall now show using well-known results from linear circuit theory (see, for example, ASELTINE [1958]). We will calculate for the measured current $i_4(t)$ the ratio of the mean square fluctuation to the mean value squared,
$\langle(\Delta i_4)^2\rangle/\langle i_4\rangle^2$, (4.27)
by tracing the photomultiplier current $i_1(t)$ through the spectrum analyzer to the recorder. The photocurrent $i_1(t)$ is assumed to be a Gaussian random variable. If the photocathode area $A$ is large compared to a coherence area $A_{coh}$, then the photocurrent will have Gaussian statistics as a consequence of the central limit theorem. For very small photocathodes the photocurrent statistics will depart from Gaussian, going over to the exponential distribution $W(i) = \langle i\rangle^{-1}\exp(-i/\langle i\rangle)$ in the limit $A \ll A_{coh}$. The photocurrent $i_1(t)$ is filtered by the bandpass filter A. The spectral density $P_2(\omega)$ of the filter output is related to the input spectrum $P_1(\omega)$ by
$P_2(\omega) = |Y_A(\omega)|^2P_1(\omega)$, (4.28)
where $Y_A(\omega)$ is the admittance of filter A. (This theorem and other results from linear circuit theory which we will use are proved in ch. 15 of ASELTINE [1958].) In order to carry out the calculation explicitly we assume that the spectrum analyzer has a Lorentzian-shaped IF bandpass with a center frequency $\beta$ and full-width $\delta$. Therefore
$|Y_A(\omega)|^2 \propto [(\omega-\beta)^2+(\delta/2)^2]^{-1}$. (4.29)
Further, we assume that the filter bandwidth $\delta$ is small compared to the width of the signal spectrum $P_1(\omega)$, so that $P_1(\omega)$ varies slowly over the width of the filter and hence to a good approximation
$P_2(\omega) \approx |Y_A(\omega)|^2P_1(\beta)$. (4.30)
The autocorrelation function $C_2(\tau)$ for the current $i_2(t)$ can now be calculated by the Wiener-Khintchine theorem:
$C_2(\tau) = \int_{-\infty}^{\infty}P_2(\omega)e^{-i\omega\tau}\,d\omega = C_2(0)\,e^{-i\beta\tau}e^{-\delta|\tau|/2}$. (4.31)
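The effect of the detector and squaring circuit which follow filter A is to remove the carrier and square $i_2(t)$, so that $i_3(t) = |i_2(t)|^2$, and thus we have, again applying the Wiener-Khintchine theorem,
$P_3(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\langle|i_2(t)|^2|i_2(t+\tau)|^2\rangle e^{i\omega\tau}\,d\tau$.
If the input $i_1(t)$ of a linear system (such as filter A) is a Gaussian random variable, so is the output $i_2(t)$, and therefore we have (cf. eq. (2.10))
$\langle|i_2(t)|^2|i_2(t+\tau)|^2\rangle = \langle|i_2(t)|^2\rangle^2 + |\langle i_2^*(t)i_2(t+\tau)\rangle|^2 = [C_2(0)]^2 + |C_2(\tau)|^2$
and, writing $\delta(\omega)$ for the Dirac delta function and $\delta$ alone for the filter bandwidth,
$P_3(\omega) = [C_2(0)]^2\left[\delta(\omega) + \frac{\delta/\pi}{\omega^2+\delta^2}\right]$. (4.32)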
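The current $i_3(t)$ is integrated by filter B, and the spectrum $P_4(\omega)$ of the output current $i_4(t)$ is (as in eq. (4.28))
$P_4(\omega) = |Y_B(\omega)|^2P_3(\omega)$.
Filter B is an RC integrator with a time constant $T = RC$, so
$|Y_B(\omega)|^2 = (1+\omega^2T^2)^{-1}$.
Hence
$\langle(\Delta i_4)^2\rangle = [C_2(0)]^2(1+\delta T)^{-1}$. (4.33)
To complete the calculation of eq. (4.27) we need to find $\langle i_4(t)\rangle$. The Fourier transform $i_4(\omega)$ of $i_4(t)$ is related to the Fourier transform of $i_3(t)$ by
$i_4(\omega) = Y_B(\omega)\,i_3(\omega)$,
but
$i_3(\omega) = \frac{1}{2\pi}\int_{-\infty}^{\infty}|i_2(t)|^2e^{-i\omega t}\,dt$,
so that
$\langle i_4(t)\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty}Y_B(\omega)e^{i\omega t}\int_{-\infty}^{\infty}\langle|i_2(t')|^2\rangle e^{-i\omega t'}\,dt'\,d\omega = C_2(0)$, (4.34)
since only $i_2(t')$ in the integrand is affected by the ensemble average. We have finally the result
$\langle(\Delta i_4)^2\rangle/\langle i_4\rangle^2 = (1+\delta T)^{-1}$. (4.35)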
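The result (4.35) is easily evaluated for practical settings. The sketch below computes $(1+\delta T)^{-1}$ for an IF bandwidth $\delta/2\pi = 33$ Hz and the four time constants quoted for Fig. 6, and also checks the closed form against a direct quadrature of the RC-integrator response $(1+\omega^2T^2)^{-1}$ weighted by a unit-area Lorentzian of half-width $\delta$, which is the integral that yields the factor $(1+\delta T)^{-1}$ under the Lorentzian filter assumption made in the text. The function names and the quadrature parameters are illustrative assumptions.

```python
import math

def normalized_fluctuation(delta, time_constant):
    """Closed form (1 + delta * T)^-1 of eq. (4.35)."""
    return 1.0 / (1.0 + delta * time_constant)

def normalized_fluctuation_quadrature(delta, time_constant,
                                      omega_max=1.0e5, n=100000):
    """Integrate |Y_B|^2 times a unit-area Lorentzian of half-width delta.

    The integrand is even in omega, so the midpoint rule is applied on
    [0, omega_max] and the result is doubled.
    """
    h = omega_max / n
    total = 0.0
    for k in range(n):
        omega = (k + 0.5) * h
        response = 1.0 / (1.0 + (omega * time_constant) ** 2)
        lorentzian = (delta / math.pi) / (omega ** 2 + delta ** 2)
        total += response * lorentzian
    return 2.0 * total * h

if __name__ == "__main__":
    delta = 2.0 * math.pi * 33.0              # rad/s, from delta/2pi = 33 Hz
    for t_msec in (10.0, 27.0, 100.0, 270.0):
        T = t_msec * 1.0e-3
        print(t_msec, normalized_fluctuation(delta, T),
              normalized_fluctuation_quadrature(delta, T))
```

Note that $\delta$ enters in rad/s, so a bandwidth quoted in Hz must be multiplied by $2\pi$ before forming the product $\delta T$.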
The relation between the mean square fluctuation in the detected current and the mean current squared (eq. (4.35)) has been investigated in our laboratory by Mr. Ronald Reese and Mr. Donald Henry. The photocurrent produced by a white light bulb source was analyzed with a variable bandwidth spectrum analyzer (Panoramic SB-15) and averaged by a variable time constant RC integrator. The fluctuations in the observed current $i_4(t)$ were monitored on an oscilloscope and a digital record was obtained with a signal averager (Fabri-Tek Model 1052).
Fig. 5. The spectrum analyzer output before averaging, for different bandwidths $\delta/2\pi$: (a) 7.5 Hz, (b) 18 Hz, (c) 33 Hz and (d) 68 Hz.
Fig. 6. The spectrum analyzer output after averaging, for different time constants: (a) T = 10 msec, (b) T = 27 msec, (c) T = 100 msec and (d) T = 270 msec. In each case the IF bandwidth $\delta/2\pi$ = 33 Hz.
Before presenting the quantitative results of the noise measurements let us examine the qualitative time dependence of the signal current on the bandwidth $\delta$ and time constant $T$. Oscillograms of the spectrum analyzer output after detection but before averaging (current $i_3(t)$ in Fig. 4) are shown in Fig. 5 for different bandwidths $\delta/2\pi$: (a) 7.5 Hz, (b) 18 Hz, (c) 33 Hz and (d) 68 Hz. (The oscilloscope time base is the same for each case, 50 msec/cm.) Note how the rate of fluctuation in the (Gaussian) signal increases with increasing bandwidth. Oscillograms of the signal after averaging (current $i_4(t)$ in Fig. 4) are shown in Fig. 6 for different time constants, with an IF bandwidth $\delta/2\pi$ = 33 Hz in each case: (a) T = 10 msec, (b) T = 27 msec, (c) T = 100 msec and (d) T = 270 msec. Equation (4.35) predicts that the normalized mean square fluctuation in $i_4(t)$ should be equal to $(1+\delta T)^{-1}$, and clearly the fluctuations decrease monotonically for (a) through (d), for which $(1+\delta T)^{-1}$ = 0.33, 0.15, 0.046 and 0.018, respectively.
The quantitative results of the noise measurements are summarized in Fig. 7, a graph of $\langle(\Delta i_4)^2\rangle/\langle i_4\rangle^2$ as a function of $\delta T$. The solid line is the theoretical value $(1+\delta T)^{-1}$ and the points are the measured values. For wide spectrum analyzer bandwidths the fluctuations are small even for fairly short time constants, but the fluctuations increase markedly when the bandwidth is made very narrow, unless long time constants are used.
Fig. 7. Normalized fluctuations in the observed signal as a function of the product of the spectrum analyzer bandwidth $\delta$ and the integration time $T$. The dots are the experimental points and the solid line is a plot of eq. (4.35).
Now we will apply the results we have obtained to the real problem of interest, the signal to noise ratio for light beating signals. We have from eqs. (4.30), (4.31), (4.34) and (4.35)
$[\langle(\Delta i_4)^2\rangle]^{1/2} = B\,P_i(\omega)(1+\delta T)^{-1/2}$, (4.36)
$\langle i_4\rangle = B\,P_i(\omega)$, (4.37)
where $B$ is a constant, $P_i(\omega)$ is the photocurrent spectrum, and $\omega$ now denotes the center of the spectrum analyzer bandpass. The photocurrent spectrum in every case (§§ 2.2-2.3) consists of a signal $P_S(\omega)$ plus a white shot noise background $P_{SN} = e\langle i\rangle/\pi$, so that
$P_i(\omega) = P_S(\omega)+P_{SN}$. (4.38)
Only the fraction $P_S/P_i$ of the observed current $\langle i_4\rangle$ is the desired signal (eq. (4.37)), but both $P_S(\omega)$ and $P_{SN}$ contribute to the fluctuations in $i_4$. Thus we have finally that the true signal to noise ratio in the observed spectrum, the ratio of the signal amplitude at a frequency $\omega$ to the r.m.s. fluctuation in the signal plus shot noise at that frequency, is
$(S/N)_\omega = \frac{P_S(\omega)}{P_S(\omega)+P_{SN}}\,(1+\delta T)^{1/2}$. (4.39)
(The same result is given in the review by BENEDEK [1968].) The signal to noise ratio as a function of frequency, $(S/N)_\omega$, attains its highest value when $P_S(\omega)/P_{SN}$ is a maximum, that is, at the center frequency $\omega_c$ of the photocurrent spectral line. Thus
$(S/N)_{max} = \frac{P_S(\omega_c)/P_{SN}}{1+P_S(\omega_c)/P_{SN}}\,(1+\delta T)^{1/2}$.
The signal to shot noise ratio at $\omega = \omega_c$ for the photocurrent spectra obtained in §§ 2.2-2.3 for homodyne and heterodyne detection of a Gaussian random optical field with $g^{(1)}(\tau) = \exp(-\gamma|\tau|)$ is
$P_S(\omega_c)/P_{SN} = \mu\langle i_S\rangle/Ne\gamma$, (4.40)
where $N$ is the number of coherence areas subtended by the detector, $\langle i_S\rangle$ is the photocurrent produced by the scattered field alone, and $\mu$ is a constant which has the values $\mu = 1$ for homodyning, $\mu = 2$ for heterodyning with $\omega_M \gg \gamma$ ($\omega_M$ is the local oscillator frequency), and $\mu = 4$ for heterodyning with $\omega_M = 0$. In the usual case $\delta T \gg 1$, so that
$(S/N)_{max} = \frac{\mu\langle i_S\rangle/Ne\gamma}{1+\mu\langle i_S\rangle/Ne\gamma}\,(\delta T)^{1/2}$. (4.41)
(If $\delta T \lesssim 1$, then $(S/N)_{max} \lesssim 1$, even if $P_S/P_{SN} \gg 1$, so that investigation of a spectral line would be extremely difficult.) If the signal to shot noise ratio is very large, $(\mu\langle i_S\rangle/Ne\gamma) \gg 1$, then from eq. (4.41) we have
$(S/N)_{max} = (\delta T)^{1/2}$. (4.42)
Thus for a fixed relative resolution $(\delta/\gamma)$ of lines of varying widths $\gamma$,
$(S/N)_{max} = \gamma^{1/2}[(\delta/\gamma)T]^{1/2}$, (4.43)
and the signal to noise ratio is proportional to the square root of the width of the line studied. This is a quite surprising result, one that is not intuitively obvious: for a fixed relative resolution $(\delta/\gamma)$, the signal to noise ratio decreases with decreasing linewidth $\gamma$, even though the power per unit bandwidth in the signal spectrum $P_S(\omega)$ increases!
Fig. 8. Homodyne spectra for a xenon sample at a temperature 0.30 C° above the critical temperature, illustrating the dependence of the signal to noise ratio on linewidth when $P_S \gg P_{SN}$. (a) $\theta$ = 30°, $2\gamma/2\pi$ = 3.9 kHz, (b) $\theta$ = 150°, $2\gamma/2\pi$ = 57 kHz. In both (a) and (b) the relative resolution $(\delta/\gamma)$ = 0.032 and the time constant T = 0.4 sec. (Abscissa: frequency in kHz; the shot noise level is marked.)
The homodyne spectra in Fig. 8, recorded for a xenon sample at a temperature 0.30 C° above the critical temperature for two scattering angles, (a) $\theta$ = 30° and (b) $\theta$ = 150°, illustrate the dependence of the signal to noise ratio on linewidth predicted by eq. (4.43). The scattering volume, scattered light signal $\langle i_S\rangle$, relative resolution $(\delta/\gamma)$, and time constant $T$ are the same for the two spectra, and in both cases $P_S \gg P_{SN}$. Since the linewidth is given by $\gamma = \chi q^2$ (eq. (3.21)) and $q^2(150°)/q^2(30°) = 13.9$, the spectral density $P_S(\omega)$ at $\omega = 0$ is 13.9 times greater for the $\theta$ = 30° spectrum than it is for the $\theta$ = 150° spectrum, yet eq. (4.43) predicts that the signal to noise ratio for the $\theta$ = 150° spectrum should be $(13.9)^{1/2} = 3.7$ times that for the $\theta$ = 30° spectrum. The spectra are seen to exhibit the predicted behavior.
If the signal to shot noise ratio is small, then the signal to noise ratio is
$(S/N)_{max} = \frac{\mu\langle i_S\rangle}{Ne\gamma}\,(\delta T)^{1/2} = \frac{\mu\langle i_S\rangle}{Ne}\,\gamma^{-1/2}[(\delta/\gamma)T]^{1/2}$. (4.44)
Thus in this case the signal to noise ratio for a fixed relative resolution (Sly) is inversely proportional to the square root of the linewidth y. Since the (S/N)maxis directly proportional to y* for large signal to shot noise ratios and inversely proportional to y* for small signal to shot noise ratios, it is of interest to find the optimum (S/N)maxwhen y can be varied:
where
(S/N)max = and the relative resolution (Sly) is held constant. The result is that the
184
LIGHT BEATING SPECTROSCOPY
[IIL
s
5
optimum signal to noise ratio is attained for PJP,, = 1. Thus if the physical quantity to be determined from a linewidth measurement can be determined equally well for different linewidths (e.g., by changing the scattering angle), then the proper choice of linewidth is the one that yields a signal to shot noise ratio of unity. I n three cases discussed in 9 3 the Rayleigh linewidth was proportional t o q2 : y = D,q2 for a dilute solution of spherical macromolecules, y = xq2 for a simple fluid, and y = Dq2 for a binary mixture. Thus in these cases the same information about the desired quantities D,, x and D can be obtained for different scattering angles (this is not true if y does not have a simple q2 dependence - see 9 3.1.2, 0 3.1.3 and 9 6.4). It follows from the preceding paragraph that the optimum signal to noise ratio is obtained when the scattering angle is chosen so that, if possible,
$$P_s/P_{SN} = \mu \langle i_s \rangle / N e D q^2 = 1, \qquad (4.45)$$
where $D$ denotes $D_T$, $\chi$ or $D$. Note that the laser beam power that can be effectively used in increasing the signal to noise ratio of a light beating experiment can be estimated from eq. (4.41). A laser beam power that yields a signal to shot noise ratio of unity results in a signal to noise ratio only a factor of two smaller than that produced with infinite beam power. Thus if the laser power were increased without limit from the level for which $\langle i_s \rangle = N e \gamma/\mu$ (i.e., $P_s/P_{SN} = 1$), the ultimate signal to noise ratio would be the same as that obtainable (with $\langle i_s \rangle = N e \gamma/\mu$) by increasing the integration time $T$ by a factor of four. In most experiments the small gain in signal to noise ratio obtained by increasing the laser power far beyond the level for which $P_s/P_{SN} = 1$ would be more than offset by the resultant detrimental sample heating.
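The three statements made above (a factor of two at unity signal to shot noise ratio, the equivalence to a fourfold increase in integration time, and an optimum linewidth at $P_s/P_{SN} = 1$) can be checked against one simple scaling. The sketch below assumes, purely for illustration, that the signal to noise ratio varies as $(\gamma T)^{1/2}\, r/(1+r)$ with $r = P_s/P_{SN} \propto 1/\gamma$ at fixed relative resolution. This form is chosen because it reproduces the limits quoted in the text; it is not a transcription of eq. (4.41), and the names `relative_snr`, `snr_vs_linewidth` and `gamma_unity` are hypothetical.

```python
import math


def relative_snr(r):
    """S/N at signal-to-shot-noise ratio r, as a fraction of the
    infinite-power limit. ASSUMED form r / (1 + r), chosen to match the
    limits quoted in the text; not copied from eq. (4.41)."""
    return r / (1.0 + r)


def snr_vs_linewidth(gamma, gamma_unity, integration_time=1.0):
    """ASSUMED scaling (arbitrary units) at fixed relative resolution.

    gamma_unity is the hypothetical linewidth at which P_s/P_SN = 1,
    so that r = gamma_unity / gamma.
    """
    r = gamma_unity / gamma
    return math.sqrt(gamma * integration_time) * relative_snr(r)


# Unity signal-to-shot-noise ratio gives half of the infinite-power S/N.
half = relative_snr(1.0)

# S/N grows as sqrt(T), so recovering a factor 1/half costs (1/half)**2 in T.
time_factor = (1.0 / half) ** 2

# Scan the linewidth over four decades and locate the best S/N.
gamma_unity = 100.0  # hypothetical value, arbitrary units
gammas = [gamma_unity * 10 ** (k / 50) for k in range(-100, 101)]
best_gamma = max(gammas, key=lambda g: snr_vs_linewidth(g, gamma_unity))

print(half, time_factor, best_gamma)
```

Under this assumed scaling the scan peaks at the linewidth for which the signal to shot noise ratio is unity, and the S/N falls off as $\gamma^{1/2}$ on the narrow side and as $\gamma^{-1/2}$ on the broad side, as described in the text.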
§ 5. Apparatus and Procedure
The spectrum of light scattered by a sample can be measured by mixing the light with itself (homodyne technique) or with a local oscillator (heterodyne technique) in a square law detector (for example, a photomultiplier). The power spectrum of the photocurrent, $P_i(\omega)$, is then normally measured with a spectrum analyzer, which determines the Fourier transform of the current autocorrelation function $\langle i(t)\, i(t+\tau) \rangle$, which in turn is proportional to the second order correlation function of the optical field (§ 2.1). Alternatively, the optical spectrum can be deduced from a direct measurement of the photocurrent autocorrelation function using an autocorrelation function computer.

5.1. HOMODYNE SPECTROSCOPY
The basic spectrometer used for homodyne measurements of an optical spectrum is diagrammed in Fig. 9 (FORD and BENEDEK [1965]). The output of a laser is focused on a sample and the light scattered through an angle $\theta$, defined by apertures, is collected and focused onto the cathode of a photomultiplier. The output photocurrent is Fourier analyzed with an electronic spectrum analyzer (various commercial spectrum analyzers are available which cover the spectrum from the subaudio region out to the microwave region). The output of spectrum analyzers is usually a voltage spectrum, proportional to $[P_i(\omega)]^{1/2}$, which must be squared to give the desired $P_i(\omega)$.

Typical spectra obtained with large and small signal to shot noise ratios are shown in Fig. 10. These spectra were obtained with a Singer-Panoramic SB-15 spectrum analyzer for a carbon dioxide sample at the critical density and near the critical temperature (SWINNEY [1968]). After subtraction of the flat shot noise background, the spectra accurately follow the predicted Lorentzian line shape (eq. (2.14)), and the linewidth $\gamma = \chi q^2$ (eq. (3.21)) yields a value for the thermal diffusivity $\chi$ in good agreement with thermodynamic measurements (SWINNEY and CUMMINS [1968]).

The current autocorrelation function can be measured directly by substituting a correlation function computer for the spectrum analyzer and squaring circuit in Fig. 9. Figure 11 shows the correlation function obtained using a P.A.R. Model 100 Correlator, again for scattering from carbon dioxide near the critical point (CUMMINS [1968]). The correlation function is exponential as predicted (eq. (3.19)), and the correlation time is within a few percent of the value $\gamma^{-1}$ obtained from linewidth measurements for the same sample conditions.
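The correspondence between the two kinds of measurement described above, an exponential correlation function on the one hand and a Lorentzian spectrum on the other, can be checked numerically. The following is a minimal sketch rather than the authors' procedure: the decay rate `rate`, the sampling grid and the helper names are illustrative assumptions, and no attempt is made to reproduce the numerical factors of eqs. (2.14) and (3.19), which are not shown in this section.

```python
import numpy as np


def lorentzian(omega, rate):
    """Analytic Fourier transform of exp(-rate * |tau|)."""
    return 2.0 * rate / (rate**2 + omega**2)


def spectrum_from_correlation(tau, corr, omega):
    """Cosine transform of an even correlation function sampled for tau >= 0.

    The integral over all tau equals twice the integral over tau >= 0,
    evaluated here with the trapezoidal rule.
    """
    dtau = np.diff(tau)
    values = []
    for w in omega:
        integrand = corr * np.cos(w * tau)
        integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * dtau)
        values.append(2.0 * integral)
    return np.array(values)


rate = 2.0 * np.pi * 1000.0               # hypothetical decay rate, rad/s
tau = np.linspace(0.0, 20.0 / rate, 20001)  # delays out to 20 decay times
corr = np.exp(-rate * tau)                # exponential correlation function

omega = rate * np.array([0.0, 0.5, 1.0, 2.0])
numeric = spectrum_from_correlation(tau, corr, omega)
analytic = lorentzian(omega, rate)

# The spectrum falls to half its peak value at omega = rate.
half_ratio = numeric[2] / numeric[0]
print(numeric / analytic, half_ratio)
```

The half-width of the computed spectrum equals the decay rate of the correlation function, which is why a correlation time and a linewidth carry the same information.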
Direct measurement of the correlation function is a far more efficient technique than the measurement of the spectrum with the usual swept-frequency spectrum analyzer (such as the Panoramic SB-15 or General Radio 1900 A), because these spectrum analyzers scan in frequency, measuring only one bandwidth at a time, whereas a correlation function computer (at least in principle) analyzes all of the signal all of the time. Thus the rate of data collection with a correlation function computer is faster than data collection with a swept-frequency
Fig. 9. Homodyne spectrometer
Fig. 10. Typical spectra with large and small signal to shot noise ratios. The spectra were obtained for scattering from a carbon dioxide sample at the critical density and near the critical temperature.
Fig. 11. Photocurrent autocorrelation function $\langle i(t)\, i(t+\tau) \rangle$ versus $\tau$ ($\tau$ increment: 5 μsec/channel) for light scattered from carbon dioxide near the critical point.
spectrum analyzer by a factor which is of the order of the number of bandwidths swept. However, there exist commercial “real time” spectrum analyzers (e.g., Hewlett-Packard Model 8054, Federal Scientific Model PSD-g/AU) which simultaneously sample all frequency intervals. Such “real time” spectrum analyzers are in principle as efficient as correlation function computers.
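The efficiency argument above amounts to simple bookkeeping, sketched below under idealized assumptions: a swept analyzer dwells for the same time on each resolution bandwidth, while a parallel ("real time" or correlation) instrument covers all bandwidths during a single dwell. The function names and the numerical values are hypothetical.

```python
import math


def bandwidths_swept(span_hz, resolution_hz):
    """Number of resolution bandwidths needed to cover the frequency span."""
    return math.ceil(span_hz / resolution_hz)


def swept_measurement_time(span_hz, resolution_hz, dwell_s):
    """Total time for a swept analyzer that dwells dwell_s on each bandwidth."""
    return bandwidths_swept(span_hz, resolution_hz) * dwell_s


# Hypothetical settings: 10 kHz span, 50 Hz resolution, 2 s per bandwidth.
span_hz, resolution_hz, dwell_s = 10_000.0, 50.0, 2.0

n_bandwidths = bandwidths_swept(span_hz, resolution_hz)
swept_time = swept_measurement_time(span_hz, resolution_hz, dwell_s)
parallel_time = dwell_s  # idealized: all bandwidths analyzed simultaneously
speedup = swept_time / parallel_time

print(n_bandwidths, swept_time, speedup)
```

In this idealization the gain in data collection rate is exactly the number of bandwidths swept, which is the order-of-magnitude factor stated in the text.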
Experimental considerations

The signal to shot noise ratio for the photocurrent is inversely proportional to the number of coherence areas within the solid angle subtended by the collection optics (§ 4.1). Since the number of coherence areas in a given solid angle decreases as the scattering volume decreases, it is desirable to make the scattering volume as small as possible. Therefore, the laser beam should have a small angular divergence and the focusing lens should have a short focal length.

In order to have a well-defined scattering angle the collection aperture must be small. Also, the short focal length focusing lens requirement of the preceding paragraph must be relaxed for very small scattering angles, since the spread in angles in the focused incident beam can introduce a significant spread in the scattering angle.

Multiple scattering is frequently a problem in scattering intensity measurements, particularly for highly turbid samples such as fluids near the critical point. The effect of multiple scattering on the spectral distribution of scattered light has been calculated for one case of interest: double scattering from thermal diffusion type density fluctuations in a fluid (FERRELL [1968]). Ferrell found that in the two limiting cases of forward and backward scattering the shape of the spectrum is insensitive to the presence of a double-scattering component.
The spectrum of the photocurrent contains the desired signal spectrum plus a white shot noise background. The shot noise level can be determined by examining the spectrum at high frequencies, beyond the range in which the signal level is significant.

An alternative technique for measuring the shot noise level is to substitute a light bulb (noise source) for the sample. The intensity of the light bulb is adjusted to produce an average photocurrent $\langle i \rangle$ equal to that which was produced by the sample. The signal to shot noise ratio for the homodyne spectrum produced by the light bulb source is (§ 2.2)

$$P_s(\omega = 0)/P_{SN} = \langle i \rangle / N e \gamma,$$

where $N$ is the number of coherence areas subtended by the detector (§ 4.1), and $\gamma$ is the linewidth of the light bulb spectrum (within a numerical factor, of the order of 1, which depends on the shape of the spectrum). Typical values are: photocathode current $\langle i \rangle \approx 10^{-11}$ A, $N \approx 100$, and for a 3000 °K tungsten bulb, $\gamma \approx 10^{15}$ sec$^{-1}$. Thus $P_s(\omega \approx 0)/P_{SN} \approx 10^{-9}$. Therefore, the signal spectrum for a light bulb source is entirely negligible compared to the shot noise, so that the light bulb effectively serves as a white noise generator. With a light bulb source the shot noise level can be determined at any frequency, so this technique imposes less stringent requirements on the frequency response of the spectrum analyzer than the technique of determining the shot noise level by examining the spectrum at frequencies much greater than the linewidths of the signals of interest. Furthermore, the white noise produced by the light bulb source can also be used to calibrate the frequency response of the entire spectrometer system, from the photodetector to the recorder.

Photomultiplier dark current is usually not a problem in light beating experiments because signals which are intense enough to yield usable signal to shot noise ratios result in a cathode photocurrent far greater than the dark current (see eq. (4.45)). Hence the principal criterion for selecting a photomultiplier tube for optical beating experiments is that it should have a high quantum efficiency in the optical region investigated.

5.2. HETERODYNE SPECTROSCOPY
Heterodyne spectroscopy differs from homodyning only in that the optical signal whose spectrum is to be determined is mixed with a monochromatic local oscillator signal at the photodetector. We will
discuss the various techniques that have been developed for producing the local oscillator signal.

Our discussion of heterodyne detection in § 2.3 started with the assumption that the local oscillator is a monochromatic signal. In practice the local oscillator signal is always derived from the same laser which illuminates the scattering sample, since phase and frequency variations in the laser, which would otherwise distort the spectrum, are then automatically cancelled out. This fact has led to some confusion in terminology, since a "superheterodyne" spectrometer in which the local oscillator is derived from the same source as the signal (in contrast to an independent source) could be designated a homodyne (as opposed to heterodyne) spectrometer (cf. DELANGE [1968]). In the terminology which we are following in this review, a spectrometer employing a local oscillator (of any derivation) is designated a heterodyne spectrometer, the term homodyne being reserved for the direct detection scheme of § 2.2 in which only the signal field is present.

Various techniques have been developed for obtaining the local oscillator signal. Most of them utilize the incident light at frequency $\omega_0$ directly as the local oscillator, so that $\omega_{LO} = \omega_0$. From eqs. (2.19)-(2.20) we see that the spectrum will then be centered at $\omega = 0$, as in the homodyne case. The unshifted local oscillator can be obtained in several ways. LASTOVKA and BENEDEK [1966a] used some of the light transmitted through the cell, recombining it with the scattered light on a beam splitter. BERGE [1967] put a piece of milky quartz in the scattering cell; by adjustment of the optics to locate the effective scattering volume in the sample very close to the surface of the milky quartz (which produces strong elastic scattering), the scattered signal and local oscillator signal can be made to originate within a single coherence volume, which leads to a high mixing efficiency.
LASTOVKA and BENEDEK [1966b] utilized the scattering from dust on the surface of the cell in the same way. A similar scheme which provides a more uniform distribution of the local oscillator power was employed in the TMV experiment of CUMMINS et al. [1968], who placed a Teflon wedge in the scattering volume with its edge partially intersecting the beam.

Alternatively, some of the laser signal can be split off before the scattering volume, shifted in frequency, and recombined with the scattered light at a beam splitter. Since $\omega_{LO} \neq \omega_0$, the heterodyne current spectrum for this case is not centered at zero, but at $|\omega_{LO} - \omega_0|$. Thus the entire optical spectrum is reproduced in the photocurrent spectrum, making it possible to study dissymmetry in the optical
spectrum which is, of course, lost if $\omega_{LO} = \omega_0$. Frequency shifting of the local oscillator from $\omega_0$ to $\omega_{LO}$ can be achieved by Bragg reflection from travelling sound waves in an ultrasonic tank (cf. CUMMINS and KNABLE [1963]). Two "Bragg tanks" were used in the polystyrene diffusion broadening experiment of CUMMINS et al. [1964].

Despite the advantages of having a displaced local oscillator ($\omega_{LO} \neq \omega_0$), there is a particular difficulty with techniques in which the local oscillator beam traverses a different optical path than the scattered light, since the two signals must be recombined with perfect spatial cross-coherence in order to achieve the full signal of eq. (2.20). Any optical imperfection will tend to reduce the "mixing efficiency", and thus reduce the light beating component of the photocurrent spectrum, leading to a reduction in the signal to noise level. The schemes discussed above, in which the signal and local oscillator are produced within the same volume, do not suffer from this difficulty since the signal and local oscillator light traverse identical optical paths. In that case, it is possible to obtain a mixing efficiency close to unity and to achieve signal to noise levels comparable to those obtained with homodyne detection (ADAM et al. [1969]).
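The statement that a frequency-shifted local oscillator places the photocurrent spectrum at $|\omega_{LO} - \omega_0|$ can be illustrated with a toy square-law detector. The sketch below is only an illustration under stated assumptions: audio-range frequencies stand in for the optical frequencies, both fields are taken as monochromatic with perfect mixing efficiency, and all names and numbers are hypothetical.

```python
import numpy as np

sample_rate = 8192                          # samples per second (hypothetical)
t = np.arange(sample_rate) / sample_rate    # 1 s record, so FFT bins are 1 Hz

f_signal = 1000.0   # stand-in for the scattered light frequency
f_lo = 1050.0       # stand-in for the shifted local oscillator frequency

# Superpose the two fields and detect with a square-law detector.
field = np.cos(2 * np.pi * f_signal * t) + np.cos(2 * np.pi * f_lo * t)
current = field**2

# Power spectrum of the "photocurrent".
power = np.abs(np.fft.rfft(current)) ** 2
freqs = np.fft.rfftfreq(current.size, d=1.0 / sample_rate)

# Look below the "optical" frequencies, excluding the DC term.
low_band = (freqs > 0.0) & (freqs < 500.0)
beat_frequency = freqs[low_band][np.argmax(power[low_band])]

print(beat_frequency)
```

Squaring the field also produces components at the sum and double frequencies; in a real detector these lie at optical frequencies and are not passed, so only the difference-frequency term survives in the photocurrent. Setting `f_lo` equal to `f_signal` moves the beat to zero frequency, which is the unshifted case described earlier.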
§ 6. Review of Rayleigh Linewidth Experiments

In this section we review the light beating measurements of the linewidth of light scattered quasi-elastically by dilute solutions of macromolecules and by simple fluids and binary mixtures near the critical point. This technique for studying the dynamic properties of critical fluctuations can also, in principle at least, be extended to solids. However, the conditions under which solid state transitions may produce critical opalescence are rather restrictive, and no unambiguous case is yet known to exist (cf. SHAPIRO and CUMMINS [1968]). For a discussion of other applications of the light beating technique to the study of distributed spectra, see BENEDEK [1969b].

6.1. DILUTE SOLUTIONS OF MACROMOLECULES
The first experimental light beating measurement of a Rayleigh linewidth, reported in 1964, utilized dilute monodispersed solutions of polystyrene latex spheres (commercially available from the Dow Chemical Company). That experiment was performed with a heterodyne spectrometer, and the results exhibited the behavior predicted by eqs. (3.11 c)-(3.13) and (2.20) (CUMMINS et al. [1964]). Homodyne
measurements on the same material were reported by ARECCHI [1967]. Arecchi's results verified the prediction of eq. (2.14 b) for the optical spectrum (3.11 c), thus also demonstrating the random Gaussian nature of the scattered field.

In 1967 Dubin, Lunacek and Benedek extended the technique to the study of several biological macromolecules (DUBIN et al. [1967]). Their homodyne measurements of light scattered by bovine serum albumin, ovalbumin, lysozyme, tobacco mosaic virus (TMV), DNA and polystyrene latex spheres were all analyzed in accord with eq. (2.11 c), which is appropriate for spherical scatterers as we have seen. For the two largest macromolecules studied (TMV and DNA) the spectra were in fact found to give poor fits to single Lorentzians, as we should expect from the discussion of § 3.1.2 for nonspherical scatterers.

Recently, CUMMINS et al. [1968] have studied the spectrum of light scattered from TMV, utilizing the multi-Lorentzian spectrum predicted by PECORA [1964]. Both homodyne and heterodyne detection were utilized in this experiment, which gave results for both the translational and rotational diffusion constants of TMV. With some additional sophistication in technique, the analysis of the spectrum of light scattered from complex macromolecules should provide parameters characterizing internal degrees of freedom in addition to the diffusion constants. Similarly, the spectral analysis of light scattered from self-propelled microorganisms can be used to study their movements, an effect which BERGE et al. [1967] utilized in a study of the salinity dependence of the motility of fish spermatozoa.

6.2. SIMPLE FLUIDS NEAR THE CRITICAL POINT
As a consequence of the ordinary linearized hydrodynamic equations, the Rayleigh linewidth for a simple fluid is given by the Landau-Placzek equation, $\gamma = \chi q^2$ (eq. (3.21)), where $\chi$, the thermal diffusivity, equals $\Lambda/\rho c_p$, where $\Lambda$, $\rho$ and $c_p$ are the thermal conductivity, density, and specific heat at constant pressure, respectively. The specific heat $c_p$ diverges much more strongly than $\Lambda$ in the critical region, so the Rayleigh linewidth should go to zero at the critical point. Thus measurements of the Rayleigh linewidth can be used to study the detailed temperature and density dependence of $\chi$ in the critical region. Many of the thermodynamic properties of systems in the critical region are found to exhibit simple power law dependences on the reduced differential temperature $\epsilon \equiv |T - T_c|/T_c$ (FISHER [1967],
HELLER [1967]), and experimental determinations of the numerical values for the exponents serve as crucial tests for the theories of critical phenomena. Above the critical temperature on the critical isochore $c_p$ (which diverges in the same way as the isothermal compressibility) is proportional to $\epsilon^{-\gamma}$, while $\Lambda \sim \epsilon^{-\psi}$. Therefore $\chi \sim \epsilon^{\gamma - \psi}$. Similarly, below the critical temperature on the coexistence curve, $\chi \sim \epsilon^{\gamma' - \psi'}$. The exponent $\gamma$ is 1.00 for "classical" models (e.g., the van der Waals fluid) and is 1.25 for the 3-dimensional Ising model, which is mathematically identical to the lattice gas model; experiments indicate $\gamma = 1.3 \pm 0.1$ (HELLER [1967]).

Linewidth measurements combined with the known critical behavior of $c_p$ offered a way to determine the critical behavior of the thermal conductivity $\Lambda$, which was not known, although thermodynamic measurements of $\Lambda$ for CO$_2$ by SENGERS [1965] did show that $\Lambda$ has a critical anomaly. Linewidth measurements can be performed for an isothermal sample, unlike thermodynamic measurements of $\Lambda$, which require a temperature gradient in the sample and hence are limited to temperatures not too close to $T_c$.

The first measurements of the Rayleigh linewidth in a simple fluid were reported by FORD and BENEDEK [1965, 1966] for sulphur hexafluoride, and shortly afterwards ALPERT et al. [1966] reported linewidth measurements for carbon dioxide near the critical point. These experiments showed that the Rayleigh linewidth follows the $\sin^2(\theta/2)$ dependence predicted by the Landau-Placzek equation (where $\theta$ is the scattering angle), and the linewidth was observed to approach zero approximately linearly with $T - T_c$, indicating that $\chi$ goes to zero with an exponent $\gamma - \psi \approx 1$. However, these measurements were performed over small temperature ranges for samples whose mean density was not precisely known; therefore, the result $\gamma - \psi = 1$ had a large uncertainty.
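The angular dependence quoted above follows from the Landau-Placzek relation once the scattering wavevector is written in terms of the scattering angle. The sketch below assumes the standard relation $q = (4\pi n/\lambda)\sin(\theta/2)$, which is not written out in this section, and uses hypothetical values for the wavelength, refractive index, and fluid properties; the function names are likewise illustrative.

```python
import math


def thermal_diffusivity(conductivity, density, specific_heat_p):
    """chi = Lambda / (rho * c_p), in SI units."""
    return conductivity / (density * specific_heat_p)


def scattering_wavevector(theta_deg, wavelength, refractive_index=1.0):
    """ASSUMED standard relation q = (4*pi*n/lambda) * sin(theta/2)."""
    half_angle = math.radians(theta_deg) / 2.0
    return 4.0 * math.pi * refractive_index / wavelength * math.sin(half_angle)


def rayleigh_linewidth(theta_deg, chi, wavelength, refractive_index=1.0):
    """Landau-Placzek linewidth gamma = chi * q**2 (rad/s for SI inputs)."""
    q = scattering_wavevector(theta_deg, wavelength, refractive_index)
    return chi * q**2


# Hypothetical fluid properties, for illustration only.
chi = thermal_diffusivity(conductivity=0.1, density=500.0, specific_heat_p=2000.0)
wavelength = 632.8e-9  # metres; a He-Ne line is assumed here

gamma_30 = rayleigh_linewidth(30.0, chi, wavelength)
gamma_150 = rayleigh_linewidth(150.0, chi, wavelength)

# The ratio depends only on the angles: sin^2(75 deg) / sin^2(15 deg).
ratio = gamma_150 / gamma_30
print(ratio, math.sqrt(ratio))
```

The ratio reproduces the value $q^2(150^\circ)/q^2(30^\circ) = 13.9$ used in § 4, and its square root is the factor 3.7 quoted there for the signal to noise ratio.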
Recent precision measurements of the Rayleigh linewidth in SF$_6$ (SAXMAN and BENEDEK [1968], BENEDEK [1968a, 1968b]) and in CO$_2$ (SWINNEY and CUMMINS [1968]) have shown that $\chi$ for these two fluids does indeed go to zero as the critical point is approached, but the manner in which $\chi$ approaches zero along the critical isochore is quite different for the two fluids, and in neither case does $\chi$ depend linearly on $T - T_c$. The results for the thermal diffusivity of CO$_2$ are shown in Fig. 12, which includes, in addition to the data of Swinney and Cummins, light beating linewidth measurements of $\chi$ very near
$T_c$ (SEIGEL and WILCOX [1967]) and thermodynamic data for $\chi$ far from $T_c$ (compiled by Dr. J. V. Sengers; cf. refs. 44-51 in MOUNTAIN [1966]). The exponent $\gamma - \psi$ for CO$_2$ (slope of Fig. 12), given by combining the three independent sets of data, is $0.73 \pm 0.02$, while for SF$_6$ Saxman and Benedek found that $\gamma - \psi = 1.26 \pm 0.02$. These different results for SF$_6$ and CO$_2$ are startling in view of the expected universality of the critical exponents. For both CO$_2$ and SF$_6$ below the critical temperature, it was found that $\gamma' - \psi' \approx 2/3$, along both the gas and liquid sides of the coexistence curve. Thus the Rayleigh linewidth measurements indicate that the thermal conductivity of CO$_2$ has a strong critical point singularity, $\psi \approx 0.6$, both above and below $T_c$, while SF$_6$ has this singularity below $T_c$ but is only weakly divergent if at all above $T_c$.

Recently KADANOFF and SWIFT [1968], using scaling law techniques, have predicted that the thermal conductivity has a strong critical point singularity, $\Lambda \sim c_p \xi^{-1}$, where $\xi$ is the two-body correlation
Fig. 12. The thermal diffusivity of CO$_2$ along the critical isochore versus the temperature difference $T - T_c$ (abscissa in C°, logarithmic scale from 0.001 to 100). ○, SWINNEY and CUMMINS [1968]; ■, thermodynamic data; ▲, SEIGEL and WILCOX [1967]. The slope given by combining the three independent sets of data is $0.73 \pm 0.02$.
length. On the critical isochore, $\xi \sim \epsilon^{-\nu}$, and on the coexistence curve, $\xi \sim \epsilon^{-\nu'}$, where $\nu \approx \nu' \approx 2/3$. Thus $\Lambda \sim \epsilon^{-2/3}$ and $\chi \sim \epsilon^{2/3}$. The results for CO$_2$ are in good agreement with the prediction of Kadanoff and Swift, and are also in accord with the approximate equality expected for the exponents above and below $T_c$ (KADANOFF et al. [1967]). Preliminary results for xenon also indicate $\gamma - \psi \approx 2/3$ (HENRY et al. [1969]). In contrast, extensive linewidth measurements of Saxman and Benedek for SF$_6$, which were made for several isochores over a range of two orders of magnitude in $\epsilon$, clearly show a behavior above $T_c$ which is dramatically different from that observed for CO$_2$ and for binary mixtures (§ 6.3). The cause of this difference is not presently understood.
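The exponents quoted in this section are slopes on a log-log plot of $\chi$ against $T - T_c$ (or against $\epsilon$). The following minimal sketch shows how such a slope can be extracted by a least-squares line fit in logarithmic coordinates. The data are synthetic, generated here with an assumed exponent and 2% multiplicative noise; they are not the measurements of Fig. 12, and the prefactor and range are hypothetical.

```python
import numpy as np


def fit_power_law(epsilon, chi):
    """Least-squares fit of chi = prefactor * epsilon**exponent on log-log axes."""
    exponent, log_prefactor = np.polyfit(np.log10(epsilon), np.log10(chi), 1)
    return exponent, 10.0**log_prefactor


# Synthetic data only: an assumed exponent with small multiplicative noise.
rng = np.random.default_rng(0)
epsilon = np.logspace(-5, -2, 30)     # reduced temperature (hypothetical range)
assumed_exponent = 0.73
noise = 1.0 + 0.02 * rng.standard_normal(epsilon.size)
chi = 2.0e-3 * epsilon**assumed_exponent * noise

exponent, prefactor = fit_power_law(epsilon, chi)
print(exponent, prefactor)
```

A fit over three decades with a few percent scatter recovers the slope to a few parts in a thousand, which gives a sense of why the wide range of $T - T_c$ covered by the combined data sets matters for distinguishing exponents such as 2/3, 0.73 and 1.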
6.3. BINARY CRITICAL MIXTURES
The Rayleigh linewidth for a binary mixture is given by $\gamma = D q^2$ (eq. (3.21)), where the binary diffusion coefficient $D$, which plays the role that $\chi$ plays for simple fluids, goes to zero at the critical point. The first measurement of the Rayleigh linewidth for a binary mixture near the critical point was by Alpert and collaborators for the mixture aniline-cyclohexane (ALPERT et al. [1965], ALPERT [1966]). The Rayleigh linewidth was observed to vary approximately linearly with $T - T_c$ and with $\sin^2(\theta/2)$ (cf. DEBYE [1965]). Similar behavior was observed by WHITE et al. [1966] for a critical mixture of polystyrene macromolecules in cyclohexane.

Recent extensive measurements of the Rayleigh linewidth by Chu and collaborators (CHU [1967a], [1967b], CHU et al. [1968]) for the system isobutyric acid and water and by BERGE and VOLOCHINE [1968] for an aniline-cyclohexane mixture have revealed a $\gamma \sim \epsilon^{2/3}$ behavior for these mixtures rather than $\gamma \sim \epsilon$ as indicated by the early experiments; hence the binary diffusion coefficient $D \sim \epsilon^{2/3}$ in the critical region.

SWIFT [1968] has shown that a binary mixture is the dynamic analog of a simple fluid, and, in particular, that the mass diffusion mode of a mixture corresponds to the heat conduction mode of a simple fluid. Thus Swift predicted that the binary diffusion coefficient $D$ for a mixture and the thermal diffusivity $\chi$ for a simple fluid should exhibit the same critical behavior, which would be $\chi \sim D \sim \epsilon^{\nu}$, $\nu \approx 2/3$. The $D \sim \epsilon^{2/3}$ behavior observed for two binary mixtures supports Swift's prediction, as does the agreement between the exponents obtained for binary mixtures and carbon dioxide.
6.4. FIXMAN'S MODIFICATION
The equations of motion used in deriving the Landau-Placzek equation for the Rayleigh linewidth (eq. (3.21)) were the linearized hydrodynamic equations. FIXMAN [1960] has shown that these equations must be modified slightly in the immediate neighborhood of the critical temperature in order to include the effects of long-range density correlations. BOTCH [1963] has incorporated Fixman's modification in a derivation of the Rayleigh linewidth, and has obtained for the linewidth

$$\gamma = \chi q^2 (1 + q^2 \xi^2). \qquad (6.1)$$

Since the correlation length $\xi$ diverges as $T \to T_c$, the $q^2 \xi^2$ term should become significant very near $T_c$, while far from the critical point $q^2 \xi^2$
(6.2)
+ 1 and $\xi'$ is a correlation length different from the correlation
* Dr. Yeh has recently reanalyzed his xenon data, including in the analysis an aperture correction formerly omitted, and has obtained a correlation length an order of magnitude smaller than his earlier reported value (private communication).
length $\xi$. Thus further experiments for different systems covering a range of temperatures very near $T_c$ and a large range of scattering angles are clearly needed.

§ 7. Conclusions
We have analyzed many of the technical problems that must be considered in any light beating experiment. For example, it has been shown that for both the homodyne and heterodyne detection techniques the light beating spectrum obtained with a multimode laser source is the same as that obtained with a monochromatic source or single mode laser with the same intensity, except for the presence of the normally unimportant intermode beats. It has also been shown that the signal to noise ratio for the light beating spectrum cannot be increased without limit by increasing the intensity of the exciting source; rather, only a small gain in S/N is realized by increasing the laser power beyond the level for which the signal to shot noise ratio is unity. The S/N (for a fixed relative resolution and a large signal to shot noise ratio) is proportional to the square root of the linewidth, so very precise measurements of extremely narrow spectral lines (of the order of Hz) will be difficult, no matter how powerful the laser source. On the other hand, if the linewidth can be broadened without decreasing the signal to shot noise ratio below unity by, for example, increasing the scattering angle, then the signal to noise ratio will be increased, even though the power per unit bandwidth in the signal spectrum is reduced. The applications of the technique of light beating spectroscopy in the past five years have demonstrated the value of the technique in the study of spectral lines far too narrow to be measured by conventional spectroscopic techniques. Light beating measurements of the Rayleigh linewidth in simple fluids and binary mixtures have yielded valuable new information on the dynamics of the critical region. For example, the critical behavior of both the thermal diffusivity of simple fluids and the diffusion coefficient of mixtures has been found to be described by a two-thirds power law, in agreement with the prediction of dynamical scaling. 
Although significant results have already been obtained with the light beating technique, the application of light beating spectroscopy to the study of critical systems and other systems must be considered to be in its infancy. Systematic light beating studies of the temperature, density and $q$ dependence of the Rayleigh linewidth lie ahead; work is in progress in many laboratories and doubtless new results will soon be reported.

Finally, the authors would like to acknowledge the help provided by members of the light scattering group at Johns Hopkins University, and to thank Professors E. Wolf, L. Mandel, and G. Benedek for reading the preliminary manuscript of this article and making many helpful suggestions which have been incorporated in the final version. We also wish to thank the Advanced Research Projects Agency and the Army Research Office (Durham) for financial support of our experimental light scattering spectroscopy program.
References

ADAM, M., A. HAMELIN and P. BERGE, 1969, Optica Acta 16, 337.
ALKEMADE, C. T. J., 1959, Physica 25, 1145.
ALPERT, S. S., 1966, Time-Dependent Concentration Fluctuations Near the Critical Temperature, in: Critical Phenomena, Proc. Conf. Washington, D.C., April 1965, eds. M. S. Green and J. V. Sengers (Natl. Bur. Standards Miscellaneous Publication 273, Washington) pp. 157-160.
ALPERT, S. S., D. BALZARINI, R. NOVICK, L. SEIGEL and Y. YEH, 1966, Observation of Time-Dependent Density Fluctuations in Carbon Dioxide Near the Critical Point Using an He-Ne Laser, in: Physics of Quantum Electronics, Conf. Proc. San Juan 1965, eds. P. L. Kelley, B. Lax and P. E. Tannenwald (McGraw-Hill Book Co., New York) pp. 253-259.
ALPERT, S. S., Y. YEH and E. LIPWORTH, 1965, Phys. Rev. Letters 14, 486.
ARECCHI, F. T., 1965, Phys. Rev. Letters 15, 912.
ARECCHI, F. T., 1967, Phys. Rev. 163, 186.
ARECCHI, F. T., 1968, Photocount Distributions and Field Statistics, in: Intern. School of Phys. "Enrico Fermi" XLII Course, Varenna 1967, ed. R. Glauber (Academic Press, New York) in press.
ARECCHI, F. T., A. BERNE and A. SONA, 1966, Phys. Rev. Letters 17, 260.
ASELTINE, J. S., 1958, Transform Method in Linear System Analysis (McGraw-Hill Book Co., New York) Ch. 15.
BÉDARD, G., 1967a, Phys. Rev. 161, 1304.
BÉDARD, G., 1967b, J. Opt. Soc. Am. 57, 1201.
BENEDEK, G. B., 1968a, Thermal Fluctuations and the Scattering of Light, in: Statistical Physics, Phase Transitions, and Superfluidity, Vol. 2 - 1966 Brandeis University Summer Institute in Theoretical Physics, eds. M. Chrétien, S. Deser and E. P. Gross (Gordon and Breach Science Publishers, Inc., New York) p. 1.
BENEDEK, G. B., 1968b, Optical Mixing Spectroscopy, with Applications to Problems in Physics, Chemistry, Biology, and Engineering, in: Polarisation Matière et Rayonnement, Livre de Jubilé en l'honneur du Professeur A. Kastler (Presses Universitaires de France, Paris) p. 49.
BERGE, P., 1967, La Diffusion Inélastique des Photons, Communication presented at the meeting of the French Crystallographic Society, Lyon, April 1967.
BERGE, P., P. CALMETTES and B. VOLOCHINE, 1968a, Phys. Letters 27A, 637.
BERGE, P. and B. VOLOCHINE, 1967, Compt. Rend. 264B, 1200.
BERGE, P. and B. VOLOCHINE, 1968, Phys. Letters 26A, 267.
BERGE, P., B. VOLOCHINE, R. BILLARD and A. HAMELIN, 1967, Compt. Rend. 265D, 889.
BERGE, P., B. VOLOCHINE, M. ADAM, P. CALMETTES and A. HAMELIN, 1968b, Compt. Rend. 266B, 1575.
BOLWIJN, P. T., C. T. J. ALKEMADE and G. A. BOSCHLOO, 1963, Phys. Letters 4, 59.
BORN, M. and E. WOLF, 1964, Principles of Optics, 2nd ed. (MacMillan Co., New York). (See also 3rd ed., 1965.)
BOTCH, W. D., 1963, Studies of some Critical Phenomena, unpublished Ph.D. thesis, University of Oregon.
BROWN, R. HANBURY and R. Q. TWISS, 1958, Proc. Roy. Soc. (London) 243A, 291.
CHU, B., 1967a, Phys. Rev. Letters 18, 200.
CHU, B., 1967b, J. Chem. Phys. 47, 3816.
CHU, B. and F. J. SCHOENES, 1968, Phys. Rev. Letters 21, 6.
CHU, B., F. J. SCHOENES and W. P. KAO, 1968, J. Am. Chem. Soc. 90, 3042.
CUMMINS, H. Z., 1968, Laser Light Scattering Spectroscopy, in: Intern. School of Phys. "Enrico Fermi" XLII Course, Varenna 1967, ed. R. Glauber (Academic Press, New York) in press.
CUMMINS, H. Z., F. D. CARLSON, T. J. HERBERT and G. WOODS, 1969, Biophys. J. 9, 518.
CUMMINS, H. Z. and N. KNABLE, 1963, Proc. IEEE 51, 1246.
CUMMINS, H. Z., N. KNABLE, L. GAMPEL and Y. YEH, 1963, Appl. Phys. Letters 2, 62.
CUMMINS, H. Z., N. KNABLE and Y. YEH, 1964, Phys. Rev. Letters 12, 150.
CUMMINS, H. Z. and H. L. SWINNEY, 1966, J. Chem. Phys. 45, 4438.
DELANGE, O. E., 1968, IEEE Spectrum 5, 77.
DEBYE, P., 1965, Phys. Rev. Letters 14, 783.
DUBIN, S. B., J. H. LUNACEK and G. B. BENEDEK, 1967, Proc. Natl. Acad. Sci. U.S. 57, 1164.
EINSTEIN, A., 1910, Ann. Physik 33, 81.
FERRELL, R. A., 1968, Phys. Rev. 169, 199.
FISHER, M. E., 1967, Rept. Progr. Phys. 30, 615.
FIXMAN, M., 1960, J. Chem. Phys. 33, 1357.
FORD, N. C. and G. B. BENEDEK, 1965, Phys. Rev. Letters 15, 649.
FORD, N. C. and G. B. BENEDEK, 1966, The Spectrum of Light Inelastically Scattered by a Fluid Near its Critical Point, in: Critical Phenomena, Proc. Conf. Washington, D.C., April 1965, eds. M. S. Green and J. V. Sengers (Natl. Bur. Standards Miscellaneous Publication 273, Washington) p. 150.
FORRESTER, A. T., 1961, J. Opt. Soc. Am. 51, 253.
FORRESTER, A. T., R. A. GUDMUNDSEN and P. O. JOHNSON, 1955, Phys. Rev. 99, 1691.
FREED, C. and H. A. HAUS, 1965, Phys. Rev. Letters 15, 943.
FREED, C. and H. A. HAUS, 1966, Amplitude Noise in Gas Lasers Below and Above The Threshold of Oscillation, in: Physics of Quantum Electronics, Conf. Proc. San Juan 1965, eds. P. L. Kelley, B. Lax and P. E. Tannenwald (McGraw-Hill Book Co., New York) p. 715.
GLAUBER, R. J., 1963a, Phys. Rev. 130, 2529.
GLAUBER, R. J., 1963b, Phys. Rev. 131, 2766.
GLAUBER, R. J., 1964, Quantum Theory of Coherence, in: Quantum Electronics III, Proc. Third Intern. Conf. Paris 1963, eds. P. Grivet and N. Bloembergen (Columbia University Press, New York) p. 111.
GLAUBER, R. J., 1965, Optical Coherence and Photon Statistics, in: Quantum Optics and Electronics, Les Houches Summer School of Theoretical Physics, Grenoble 1964, eds. C. De Witt, A. Blandin and C. Cohen-Tannoudji (Gordon and Breach, New York) p. 63.
GLAUBER, R. J., 1966, Photon Counting and Field Correlations, in: Physics of Quantum Electronics, Conf. Proc. San Juan 1965, eds. P. L. Kelley, B. Lax and P. E. Tannenwald (McGraw-Hill Book Co., New York) p. 788.
GLAUBER, R. J., 1968, Coherence and Quantum Detection, in: Intern. School of Phys. "Enrico Fermi" XLII Course, Varenna 1967, ed. R. Glauber (Academic Press, New York) in press.
GOLAY, M. J. E., 1961, Proc. IRE 49, 958.
GREYTAK, T. J. and G. B. BENEDEK, 1966, Phys. Rev. Letters 17, 179.
HALPERIN, B. I. and P. C. HOHENBERG, 1967, Phys. Rev. Letters 19, 700.
HAUS, H. A., 1968, The Measurement of $G^{(2)}$ and its Signal to Noise Ratio, in: Intern. School of Phys. "Enrico Fermi" XLII Course, Varenna 1967, ed. R. Glauber (Academic Press, New York) in press.
HELLER, P., 1967, Rept. Progr. Phys. 30, 731.
HENRY, D. L., H. Z. CUMMINS and H. L. SWINNEY, 1969, Bull. Am. Phys. Soc. 14, 73.
JAKEMAN, E., C. J. OLIVER and E. R. PIKE, 1968, J. Phys. A [Proc. Phys. Soc. (London)] 1, 406.
JAKEMAN, E. and E. R. PIKE, 1968, J. Phys. A [Proc. Phys. Soc. (London)] 1, 128; 1, 625.
JAVAN, A., E. A. BALLIK and W. L. BOND, 1962, J. Opt. Soc. Am. 52, 96.
JAVAN, A., W. R. BENNETT and D. R. HERRIOTT, 1961, Phys. Rev. Letters 6, 106.
KADANOFF, L. P., W. GÖTZE, D. HAMBLEN, R. HECHT, E. A. S. LEWIS, V. V. PALCIAUSKAS, M. RAYL, J. SWIFT, D. ASPNES and J. KANE, 1967, Rev. Mod. Phys. 39, 395.
KADANOFF, L. P. and J. SWIFT, 1968, Phys. Rev. 165, 310.
KLAUDER, J. R. and E. C. G. SUDARSHAN, 1968, Fundamentals of Quantum Optics (W. A. Benjamin Inc., New York).
KOMAROV, L. I. and I. Z. FISHER, 1962, Zh. Eksperim. i Teor. Fiz. 43, 1927 (English Transl. Soviet Phys. - JETP 16, 1358).
LASTOVKA, J. B. and G. B. BENEDEK, 1966a, Light Beating Techniques for the Study of the Rayleigh-Brillouin Spectrum, in: Physics of Quantum Electronics, Conf. Proc. San Juan 1965, eds. P. L. Kelley, B. Lax and P. E. Tannenwald (McGraw-Hill Book Co., New York) p. 231.
LASTOVKA, J. B. and G. B. BENEDEK, 1966b, Phys. Rev. Letters 17, 1039.
MANDEL, L., 1958, Proc. Phys. Soc. (London) 72, 1037.
MANDEL, L., 1963, Fluctuations of Light Beams, in: Progress in Optics, Vol. 2, ed. E. Wolf (North-Holland Publ. Co., Amsterdam) p. 181.
MANDEL, L., 1964, Some Coherence Properties of Non-Gaussian Light, in: Quantum Electronics III, Proc. 3rd Intern. Conf., Paris 1963, eds. P. Grivet and N. Bloembergen (Columbia University Press, New York) p. 101.
MANDEL, L., 1966, J. Opt. Soc. Am. 56, 1200.
MANDEL, L., E. C. G. SUDARSHAN and E. WOLF, 1964, Proc. Phys. Soc. (London) 84, 435.
MANDEL, L. and E. WOLF, 1963, J. Opt. Soc. Am. 53, 1315.
MANDEL, L. and E. WOLF, 1965, Rev. Mod. Phys. 37, 231.
MANDEL, L. and E. WOLF, 1966, Phys. Rev. 149, 1033.
MARTIENSSEN, W. and E. SPILLER, 1966, Phys. Rev. 145, 285.
MORGAN, B. L. and L. MANDEL, 1966, Phys. Rev. Letters 16, 1012.
MOUNTAIN, R. D., 1966, Rev. Mod. Phys. 38, 205.
PECORA, R., 1964, J. Chem. Phys. 40, 1604.
PECORA, R., 1968, J. Chem. Phys. 48, 4126.
PHILLIPS, D. T., H. KLEIMAN and S. P. DAVIS, 1967, Phys. Rev. 153, 113.
REED, I. S., 1962, IRE Transactions on Information Theory IT-8, 194.
SAXMAN, A. C. and G. B. BENEDEK, 1968, to be published.
SEIGEL, L. and L. R. WILCOX, 1967, Bull. Am. Phys. Soc. 12, 525.
SENGERS, J. V., 1965, J. Heat Mass Transfer 8, 1103.
SHAPIRO, S. M. and H. Z. CUMMINS, 1968, Phys. Rev. Letters 21, 1578.
SUDARSHAN, E. C. G., 1963, Phys. Rev. Letters 10, 277.
SWIFT, J., 1968, Phys. Rev. 173, 257.
SWINNEY, H. L., 1968, The Spectrum of Light Scattered by Carbon Dioxide in the Critical Region, unpublished Ph.D. thesis, The Johns Hopkins University.
SWINNEY, H. L. and H. Z. CUMMINS, 1968, Phys. Rev. 171, 152.
TOWNES, C. H., 1961, Some Applications of Optical and Infrared Masers, in: Advances in Quantum Electronics, ed. J. R. Singer (Columbia University Press, New York) p. 8.
VAN DE HULST, H. C., 1957, Light Scattering By Small Particles (John Wiley and Sons, New York) ch. 7.
VAN HOVE, L., 1954, Phys. Rev. 95, 249.
WHITE, J. A., J. S. OSMUNDSON and B. H. AHN, 1966, Phys. Rev. Letters 16, 639.
WOLF, E., 1964, Recent Researches on Coherence Properties of Light, in: Quantum Electronics 111, Proc. 3rd Intern. Conf. Paris 1963, eds. P. Grivet and N. Bloembergen (Columbia University Press, New York) p. 13. WOLF,E., 1966, Optica Acta 1 3 , 281. WOLF,E. and C. L. MEHTA,1964, Phys. Rev. Letters 1 3 , 705. YEH, Y., 1967, Phys. Rev. Letters 1 8 , 1043. YEH,Y. and H. Z.CUMMINS,1964, Appl. Phys. Letters 4, 176.
IV

MULTILAYER ANTIREFLECTION COATINGS

BY

A. MUSSET and A. THELEN

Optical Coating Laboratory, Inc., Santa Rosa, California, U.S.A.
CONTENTS

INTRODUCTION
SINGLE LAYER ANTIREFLECTION COATINGS
TWO-LAYER ANTIREFLECTION COATINGS ON GLASS
THE DESIGN METHOD OF EFFECTIVE INTERFACES
TWO-LAYER ANTIREFLECTION COATINGS ON HIGH-INDEX SUBSTRATE
THREE-LAYER ANTIREFLECTION COATINGS ON HIGH-INDEX SUBSTRATE
THREE- AND FOUR-LAYER ANTIREFLECTION COATINGS ON GLASS
SYNTHESIZED LAYERS
THE COATING OF REFLECTION REDUCING MULTILAYERS
ENVIRONMENTAL STABILITY OF MULTILAYER ANTIREFLECTION COATINGS
OPTICAL PERFORMANCE OF COATINGS AND INCREASE IN TRANSMISSION THROUGH AN OPTICAL SYSTEM
PHOTOGRAPHIC APPLICATIONS
SUPPRESSION OF STRAY LIGHT IN AN OPTICAL SYSTEM
REFERENCES
§ 1. Introduction

Advances in thin film technology and a continuing need for better performance of optical systems have created a strong interest in reducing the reflection of optical surfaces beyond that achievable by single layers. In Fig. 1 the reflectance of an uncoated glass surface of index 1.52 is compared with the reflectance of the same surface when overcoated with a single layer of index 1.38 (magnesium fluoride) and optical thickness 125 nm. Although the coating reduces the reflectance substantially, there are two major deficiencies:
1. the residual reflectance at the minimum is not low enough for many applications;
2. while the reflected light from the uncoated surface is neutral in color, that from the coated surface is not.
[Figure 1: reflectance (0 to 4 %) versus wavelength (400 to 700 mμ).]

Fig. 1. Comparison of the reflectance of a single layer antireflection coating (solid curve) with the reflectance of the uncoated surface (dotted curve), $n_S = 1.52$, $n_M = 1.0$, $n_1 = 1.38$, $4n_1d_1 = 510$ nm.
There are basically two ways one can improve the performance of a single layer, viz.
1. by varying the index of refraction of the film continuously with its thickness (inhomogeneous antireflection coatings) or
2. by constructing the antireflection coating of several homogeneous layers with different indices of refraction (multilayer antireflection coatings).

Although the design, construction, and use of inhomogeneous antireflection coatings is still in the experimental state, multilayer antireflection coatings are widely used. It is for this reason that we shall restrict the discussions of this paper to homogeneous multilayer antireflection coatings.

The literature on antireflection coatings is very extensive and scattered. No attempt will be made to trace the various designs historically. We shall restrict ourselves to the practically important designs and cover current design methods.

Throughout the theoretical section of this paper we shall consider absorption-free materials, no dispersion, and normal light incidence. Another major restriction is imposed by the fact that only a few materials lend themselves to deposition as thin films. Table 1 contains a list of commonly used materials. Table 2 lists some optical substrates for which antireflection coatings are desired.

TABLE 1
Coating materials

Material    Approximate refractive index at 550 nm    Useful wavelength region
Na₃AlF₆     1.35                                      <0.2 to 10 μm
MgF₂        1.38                                      …
SiO₂        1.45                                      …
Si₂O₃       …                                         …
CeF₃        …                                         …
Nd₂O₃       2.0                                       0.4 to >2 μm
ZrO₂        2.1                                       0.25 to 7 μm
CeO₂        2.30                                      0.4 to 5 μm
ZnS         2.30                                      0.4 to 15 μm
TiO₂        2.30                                      0.4 to 12 μm
Si          3.40 (over useful range)                  0.9 to 8 μm
Ge          4.0 (over useful range)                   1.3 to 35 μm

TABLE 2
Commonly used substrate materials

Material            Approximate refractive index    Useful wavelength region
Sapphire            1.7                             <0.2 to 5.5 μm
Synthetic quartz    1.45                            <0.2 to 4.0 μm and from 45 μm on
Glass               1.5-1.7                         0.35 to 2.5 μm
Silicon             3.45                            1.2 to 50 μm
Germanium           4.0                             1.8 to 25 μm
MgF₂ (Irtran 1)     1.38                            0.5 to 9 μm
ZnS (Irtran 2)      2.2                             0.7 to 14.5 μm

The process of designing multilayer filters is still an art. There are no methods yet of finding a certain design which can be proven to be optimum in its class. So the usual procedure is to use a design method which relates simple designs to more complicated ones. Several methods are available and the selection for a particular problem is quite arbitrary and highly personal. The presentation in this paper is no exception. It is to be understood that the derivations are half-logical and half-intuitive. Some long formulas cannot be avoided.

§ 2. Single Layer Antireflection Coatings
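The reflectance of a single homogeneous film on a glass surface is given by the formula (SMAKULA [1941])

$$R_{\mathrm{film}} = \frac{n^2 (n_M - n_S)^2 - (n_M^2 - n^2)(n^2 - n_S^2)\sin^2(2\pi n d/\lambda)}{n^2 (n_M + n_S)^2 - (n_M^2 - n^2)(n^2 - n_S^2)\sin^2(2\pi n d/\lambda)}, \tag{1}$$

with
$n$: refractive index of the film,
$d$: geometrical thickness of the film,
$n_S$: refractive index of the substrate,
$n_M$: refractive index of the surrounding medium,
$\lambda$: wavelength.

Since the reflectance of the uncoated surface is

$$R_{\mathrm{uncoated}} = \left(\frac{n_M - n_S}{n_M + n_S}\right)^2, \tag{2}$$

it follows that

$$R_{\mathrm{film}} \le R_{\mathrm{uncoated}} \quad\text{for}\quad n_M \le n \le n_S, \tag{3}$$

and particularly that

$$R_{\mathrm{film}} = 0 \quad\text{for}\quad n = \sqrt{n_M n_S} \quad\text{and}\quad \frac{2\pi n d}{\lambda} = \tfrac{1}{2}\pi,\ \tfrac{3}{2}\pi,\ \text{etc.} \tag{4}$$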
Unfortunately there is no coating material with a refractive index low enough to fulfill condition (4) for normal glass surfaces ($n \approx 1.5$). Magnesium fluoride with a refractive index of 1.38 is the best acceptable material. This leads to the limitations discussed in the introduction. For substrates with higher indices of refraction, especially $n \ge 1.38^2$, single film antireflection coatings can be very effective. A significant advantage of a single layer antireflection coating is the fact expressed in relation (3) that its reflectance never exceeds the uncoated surface reflectance. This is not the case for multilayer antireflection coatings which contain materials with refractive indices higher than the substrate index.
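As a numerical check of eq. (1), the following minimal Python sketch evaluates the single-film reflectance for the magnesium fluoride example of Fig. 1 ($n = 1.38$ on glass of index 1.52, quarter-wave at 510 nm). The function and argument names are illustrative choices, not notation from the text.

```python
import math

def single_layer_reflectance(n, nd, wavelength, n_m=1.0, n_s=1.52):
    """Eq. (1): reflectance of one homogeneous, absorption-free film.

    n is the film index, nd its optical thickness (same unit as wavelength),
    n_m and n_s are the indices of the surrounding medium and the substrate.
    """
    s2 = math.sin(2.0 * math.pi * nd / wavelength) ** 2
    q = (n_m**2 - n**2) * (n**2 - n_s**2) * s2
    return (n**2 * (n_m - n_s) ** 2 - q) / (n**2 * (n_m + n_s) ** 2 - q)

# Uncoated glass surface, eq. (2)
r_bare = ((1.0 - 1.52) / (1.0 + 1.52)) ** 2

# Quarter-wave MgF2 film: 4*n*d = 510 nm, evaluated at 510 nm
r_min = single_layer_reflectance(1.38, 127.5, 510.0)

print(f"uncoated: {100 * r_bare:.2f} %   coated minimum: {100 * r_min:.2f} %")
```

The coated minimum comes out near 1.3 %, against roughly 4.3 % for the bare surface, which illustrates why a single layer of index 1.38 cannot bring the reflectance of ordinary glass to zero.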
§ 3. Two-Layer Antireflection Coatings on Glass
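For normal glass surfaces ($n \approx 1.5$) two-layer antireflection coatings can eliminate one or the other deficiency of the single layer antireflection coating described in the introduction. Complete narrow band reflection reduction is accomplished by the so-called V-coating and a broadening of the minimum by the so-called W-coating. The reflectance of a two-layer coating is given by (THELEN [1956])

$$R = \frac{X}{1+X}, \tag{5}$$

with

$$X = \frac{n_S}{4n_M}\left\{\left[\left(\frac{n_M}{n_S}-1\right)\cos\varphi_1\cos\varphi_2 - \left(\frac{n_M n_2}{n_S n_1}-\frac{n_1}{n_2}\right)\sin\varphi_1\sin\varphi_2\right]^2 + \left[\left(\frac{n_M}{n_1}-\frac{n_1}{n_S}\right)\sin\varphi_1\cos\varphi_2 + \left(\frac{n_M}{n_2}-\frac{n_2}{n_S}\right)\cos\varphi_1\sin\varphi_2\right]^2\right\}, \tag{6}$$

where
$n_1$: refractive index of the film next to the surrounding medium, $\varphi_1 = 2\pi n_1 d_1/\lambda$,
$n_2$: refractive index of the film next to the substrate, $\varphi_2 = 2\pi n_2 d_2/\lambda$.

For small $R$,

$$R \approx X. \tag{7}$$

In the V-type coating we set $R = X = 0$, which leads to

$$\tan^2\varphi_1 = \frac{(n_S - n_M)(n_2^2 - n_M n_S)\,n_1^2}{(n_1^2 n_S - n_M n_2^2)(n_M n_S - n_1^2)}, \tag{8}$$

$$\tan^2\varphi_2 = \frac{(n_S - n_M)(n_M n_S - n_1^2)\,n_2^2}{(n_1^2 n_S - n_M n_2^2)(n_2^2 - n_M n_S)}. \tag{9}$$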
These relations allow a wide variety of solutions. Figs. 2 and 3 give some examples. From all the combinations of $n_1$ and $n_2$ compatible with equations (8) and (9) the ones with the lowest values are preferred because they yield the broadest reflectance minimum (THELEN [1956]).

[Figure 2: reflectance (0 to 4 %) versus wavelength.]

Fig. 2. Reflectance of various two-layer V-type coatings on a substrate with index $n_S = 1.52$, $n_M = 1.0$; solid curve: $n_1 = 1.38$, $4n_1d_1 = 540$ nm, $n_2 = 1.701$, $4n_2d_2 = 540$ nm; dotted curve 1: $n_1 = 1.38$, $4n_1d_1 = 698.5$ nm, $n_2 = 2.30$, $4n_2d_2 = 113.3$ nm; dotted curve 2: $n_1 = 1.865$, $4n_1d_1 = 540$ nm, $n_2 = 2.3$, $4n_2d_2 = 540$ nm.

[Figure 3: reflectance versus wavelength.]

Fig. 3. Reflectance of various two-layer V-type coatings on a substrate with index $n_S = 1.60$, $n_M = 1.0$; solid curve: $n_1 = 1.38$, $4n_1d_1 = 540$ nm, $n_2 = 1.746$, $4n_2d_2 = 540$ nm; dotted curve 1: $n_1 = 1.38$, $4n_1d_1 = 671.5$ nm, $n_2 = 2.30$, $4n_2d_2 = 113.2$ nm; dotted curve 2: $n_1 = 1.818$, $4n_1d_1 = 540$ nm, $n_2 = 2.30$, $4n_2d_2 = 540$ nm.

[Figure 4: $A_2$ versus $n_2$ (1.2 to 3.0).]

Fig. 4. Coefficient of $\cos^2\varphi$ in equation (10), $A_2$, as a function of $n_2$. $n_1$ = constant = 1.38, $n_M = 1$, $n_S = 1.52$ (solid curve) and 1.60 (dotted curve).

[Figure 5: reflectance versus wavelength.]

Fig. 5. Reflectance of various two-layer W-type coatings on a substrate with index $n_S = 1.52$ …

[Figure 6: reflectance (0 to 4 %) versus wavelength (400 to 700 mμ).]

Fig. 6. Reflectance of various two-layer W-type coatings on a substrate with index $n_S = 1.60$, $n_M = 1.0$; solid curve: $n_1 = 1.38$, $4n_1d_1 = 510$ nm, $n_2 = 1.70$, $4n_2d_2 = 1020$ nm; dotted curve 1: $n_1 = 1.38$, $4n_1d_1 = 510$ nm, $n_2 = 2.10$, $4n_2d_2 = 1020$ nm; dotted curve 2: $n_1 = 1.38$, $4n_1d_1 = 510$ nm, $n_2 = 2.40$, $4n_2d_2 = 1020$ nm.

In the W-type coating we insert a half wave film ($n_2d_2 = \tfrac{1}{2}\lambda$) between a quarter wave single layer antireflection coating and the substrate. By setting $\varphi_2 = 2\varphi_1 = 2\varphi$ equation (6) yields:

$$X = A_0 + A_2\cos^2\varphi + A_4\cos^4\varphi + A_6\cos^6\varphi, \qquad A_0 = \frac{n_S}{4n_M}\left(\frac{n_M}{n_1}-\frac{n_1}{n_S}\right)^2. \tag{10}$$

At $\varphi = \tfrac{1}{2}\pi$ this reduces to $X = A_0$, which is the reflectance of a single film with $n = n_1$. But while for the single film this point is a minimum, it is now a maximum as long as the coefficient of $\cos^2\varphi$, $A_2$, is negative. Fig. 4 shows the coefficient $A_2$ as a function of $n_2$ for $n_1 = 1.38$, $n_M = 1$ and $n_S = 1.52$ and 1.60. Fig. 5 gives some examples for a substrate of index 1.52 and Fig. 6 for a substrate of index 1.60. While two-layer antireflection coatings cannot eliminate both previously mentioned deficiencies of the single layer antireflection coating on normal glass surfaces, they can provide excellent antireflection for substrates with higher indices of refraction, as will be shown later.
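The V-coating condition can be checked numerically. The Python sketch below solves the zero-reflectance condition $R = X = 0$ of a two-layer coating for the two phase thicknesses in closed form and then confirms the result with an independent characteristic-matrix calculation. The closed-form expressions are the standard ones for a non-absorbing two-layer coating and are assumed here to be what eqs. (8) and (9) express; all function names are illustrative.

```python
import math

def stack_reflectance(indices, phases, n_m, n_s):
    """Reflectance of a lossless layer stack at normal incidence.

    indices and phases are listed from the medium side to the substrate side;
    each phase is 2*pi*n*d/lambda for that layer.
    """
    m11, m12, m21, m22 = 1.0 + 0j, 0j, 0j, 1.0 + 0j
    for n, phi in zip(indices, phases):
        c, s = math.cos(phi), math.sin(phi)
        a11, a12, a21, a22 = c, 1j * s / n, 1j * n * s, c
        m11, m12, m21, m22 = (m11 * a11 + m12 * a21, m11 * a12 + m12 * a22,
                              m21 * a11 + m22 * a21, m21 * a12 + m22 * a22)
    b = m11 + m12 * n_s
    c_ = m21 + m22 * n_s
    return abs((n_m * b - c_) / (n_m * b + c_)) ** 2

def v_coat_phases(n1, n2, n_m, n_s):
    """Phase thicknesses (phi1, phi2) giving zero reflectance, if they exist.

    Assumes n2 != n1*sqrt(n_s/n_m) and n1**2 != n_m*n_s; those quarter-wave
    cases make the denominators vanish and need separate treatment.
    """
    den = n1**2 * n_s - n_m * n2**2
    t1_sq = (n_s - n_m) * (n2**2 - n_m * n_s) * n1**2 / (den * (n_m * n_s - n1**2))
    t2_sq = (n_s - n_m) * (n_m * n_s - n1**2) * n2**2 / (den * (n2**2 - n_m * n_s))
    if t1_sq < 0 or t2_sq < 0:
        raise ValueError("no V-coating solution for these indices")
    phi1 = math.atan(math.sqrt(t1_sq))
    phi2 = math.atan(math.sqrt(t2_sq))
    # tan(phi1)*tan(phi2) has the sign of (n_s - n_m)/den; keep the thin
    # inner layer in the first quadrant and move phi1 if the sign is negative.
    if (n_s - n_m) / den < 0:
        phi1 = math.pi - phi1
    return phi1, phi2

lambda0 = 540.0  # design wavelength in nm, as in Fig. 2
phi1, phi2 = v_coat_phases(1.38, 2.30, 1.0, 1.52)
qw1 = lambda0 * phi1 / (math.pi / 2)  # 4*n1*d1 in nm
qw2 = lambda0 * phi2 / (math.pi / 2)  # 4*n2*d2 in nm
residual = stack_reflectance([1.38, 2.30], [phi1, phi2], 1.0, 1.52)
print(f"4n1d1 = {qw1:.1f} nm, 4n2d2 = {qw2:.1f} nm, R = {residual:.1e}")
```

For $n_1 = 1.38$, $n_2 = 2.30$ on glass of index 1.52 this reproduces the thicknesses quoted for dotted curve 1 of Fig. 2 to within rounding.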
§ 4. The Design Method of Effective Interfaces

For the design of multilayer antireflection coatings it is very effective to consider two effective interfaces inside the multilayer (DUFOUR [1954]; SMITH [1958]; THELEN [1960]; THETFORD [1970]). Two adjacent interfaces inside the multilayer system (Fig. 7) are selected and the Fabry-Perot formula is applied. One arrives at the following formula

$$T(\lambda) = T_0(\lambda)\,F(\lambda), \qquad T_0(\lambda) = \frac{T_1(\lambda)\,T_2(\lambda)}{[1-R(\lambda)]^2}, \qquad F(\lambda) = \frac{1}{1+\dfrac{4R(\lambda)}{[1-R(\lambda)]^2}\sin^2\tfrac{1}{2}(\phi_1+\phi_2-\beta)}, \tag{11}$$

with
$T(\lambda)$: transmission through the total multilayer system,
$T_1(\lambda)$: transmission through subsystem I,
$T_2(\lambda)$: transmission through subsystem II,
$R(\lambda) = \sqrt{R_1(\lambda)\,R_2(\lambda)}$ with $R_1$, $R_2$ being the reflectivities of the subsystems I, II for light incidence from the spacer,
$\phi_1$, $\phi_2$: phase changes upon reflection associated with $R_1$, $R_2$,
$\beta = 4\pi n d/\lambda$ with $n$, $d$ being the index of refraction and physical thickness of the spacer layer.

[Figure 7: schematic of a multilayer on a substrate showing subsystem I, the spacer layer and subsystem II.]

Fig. 7. Effective interfaces inside a multilayer.

The important feature of this formula is that the amplitude and phase relationships can be considered separately: $T_0(\lambda)$ depends on the amplitudes of the reflectivities of the subsystems only and $F(\lambda)$ depends primarily on the phase upon reflection of the two subsystems and the thickness of the spacer. Both factors are always $\le 1$. For high transmittance ($T \approx 1$) the amplitudes have to be adjusted so that $T_0(\lambda) \approx 1$ and the phases so that $F(\lambda) \approx 1$. From the structure of equation (11) it can be seen that

$$T_0(\lambda) = 1 \quad\text{only for}\quad R_1(\lambda) = R_2(\lambda) \tag{12}$$

and

$$F(\lambda) = 1 \quad\text{only for}\quad R(\lambda) = 0 \quad\text{or}\quad \sin^2\tfrac{1}{2}(\phi_1+\phi_2-\beta) = 0. \tag{13}$$
The application of this method to two- and three-layer systems is relatively simple because the subsystems I and I1 reduce to single films.
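For the simplest case, a single film acting as the spacer between two bare interfaces, the factorisation into an amplitude factor and a phase factor can be compared directly with eq. (1). The Python sketch below does this; it assumes the usual Fabry-Perot (Airy) form $T = T_0 F$ with $T_0 = T_1T_2/(1-R)^2$ and $F = [1 + 4R(1-R)^{-2}\sin^2\tfrac12(\phi_1+\phi_2-\beta)]^{-1}$, and the function names are hypothetical.

```python
import math

def film_reflectance_eq1(n, beta, n_m, n_s):
    """Eq. (1) written with beta = 4*pi*n*d/lambda (so 2*pi*n*d/lambda = beta/2)."""
    q = (n_m**2 - n**2) * (n**2 - n_s**2) * math.sin(beta / 2.0) ** 2
    return (n**2 * (n_m - n_s) ** 2 - q) / (n**2 * (n_m + n_s) ** 2 - q)

def effective_interface_factors(n, beta, n_m, n_s):
    """Amplitude factor T0 and phase factor F for a single-film spacer.

    Subsystem I is the bare spacer/medium interface and subsystem II the bare
    spacer/substrate interface, both seen from inside the spacer.
    """
    r1 = (n - n_m) / (n + n_m)          # amplitude reflection toward the medium
    r2 = (n - n_s) / (n + n_s)          # amplitude reflection toward the substrate
    big_r1, big_r2 = r1 * r1, r2 * r2
    phase1 = 0.0 if r1 >= 0 else math.pi
    phase2 = 0.0 if r2 >= 0 else math.pi
    r = math.sqrt(big_r1 * big_r2)
    t0 = (1.0 - big_r1) * (1.0 - big_r2) / (1.0 - r) ** 2
    f = 1.0 / (1.0 + 4.0 * r / (1.0 - r) ** 2
               * math.sin(0.5 * (phase1 + phase2 - beta)) ** 2)
    return t0, f

# MgF2 film on glass, swept through several phase thicknesses
max_difference = 0.0
for k in range(0, 41):
    beta = 0.1 * k
    t0, f = effective_interface_factors(1.38, beta, 1.0, 1.52)
    direct = 1.0 - film_reflectance_eq1(1.38, beta, 1.0, 1.52)
    max_difference = max(max_difference, abs(t0 * f - direct))
print(f"largest |T0*F - (1 - R_film)| = {max_difference:.1e}")
```

At the quarter-wave point ($\beta = \pi$) the phase factor equals 1 and the residual loss is entirely due to $T_0 < 1$, that is, to the mismatch $R_1 \ne R_2$ of the two interfaces.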
There are two groups of designs for which the amplitude condition (12) can be fulfilled for all wavelengths. It can be shown (THELEN [1969]) that the transmittance and reflectance of a multilayer are invariant to multiplying all indices of refraction (multilayer, substrate and medium) with a constant factor or replacing them with their reciprocal value while keeping the optical thicknesses constant.

Group I: For the purpose of this discussion we consider subsystem I to have a substrate index of $n_M$ and a medium index (spacer) of $n_x$. We can shift one system into the other by multiplying all indices with a shift factor $\gamma$:

$$n_x = \gamma n_M \quad\text{and}\quad n_S = \gamma n_x. \tag{14}$$

This leads to

$$\text{spacer index: } n_x = \sqrt{n_S n_M}, \qquad \text{transformation: } n_{\mathrm{sub\,II}} = \gamma\, n_{\mathrm{sub\,I}} = (n_S/n_M)^{1/2}\, n_{\mathrm{sub\,I}}. \tag{15}$$

Group II: Instead of shifting subsystem I into II by multiplying by a shift factor alone we now take the reciprocal values and then shift. The result is

$$n_x = \gamma/n_x \quad\text{and}\quad n_S = \gamma/n_M, \tag{16}$$

which leads to

$$\text{spacer index: } n_x = \sqrt{n_S n_M}, \qquad \text{transformation: } n_{\mathrm{sub\,II}} = \gamma/n_{\mathrm{sub\,I}} = n_M n_S/n_{\mathrm{sub\,I}}. \tag{17}$$

Because of the previously discussed lack of coating materials with very low indices of refraction the direct application of these concepts is limited to substrates with indices of refraction well above 2. Antireflection coatings of this type were extensively discussed by MUCHMORE [1948] and YOUNG [1961]. They used electrical network synthesis for the derivation.
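§ 5. Two-Layer Antireflection Coatings on High-Index Substrate

Group I-type coatings: By setting $\varphi = \varphi_1 = \varphi_2$ and $n_2 = n_1 (n_S/n_M)^{1/2}$ eq. (6) changes into

$$X = \frac{n_S}{4n_M}\cos^2\varphi\left[\left(1-\frac{n_M}{n_S}\right)^2\cos^2\varphi + \left(1+\sqrt{\frac{n_M}{n_S}}\right)^2\left(\frac{n_M}{n_1}-\frac{n_1}{\sqrt{n_M n_S}}\right)^2\sin^2\varphi\right], \tag{18}$$

which yields a single broad minimum with zero reflectance at $\varphi = \tfrac{1}{2}\pi$. Fig. 8 shows two examples.

[Figure 8: reflectance versus $\lambda_0/\lambda$.]

Fig. 8. Reflectance of two-layer group I-type coatings on a high-index substrate with $n_S = 3.45$, $n_M = 1$; solid curve: $n_1 = 1.56$, $4n_1d_1 = \lambda_0$, $n_2 = 2.896$, $4n_2d_2 = \lambda_0$; dotted curve: $n_1 = 1.38$, $4n_1d_1 = \lambda_0$, $n_2 = 2.56$, $4n_2d_2 = \lambda_0$.

Group II-type coatings: With $\varphi = \varphi_1 = \varphi_2$ and $n_2 = n_S n_M/n_1$ we arrive at (eq. (6))

$$X = \frac{n_S}{4n_M}\left[\left(1-\frac{n_M}{n_S}\right)\cos^2\varphi - \left(\frac{n_1^2}{n_M n_S}-\frac{n_M^2}{n_1^2}\right)\sin^2\varphi\right]^2. \tag{19}$$

This coating has a small maximum at $\varphi = \tfrac{1}{2}\pi$ with

$$R_{\max} = \left(\frac{n_1^4 - n_M^3 n_S}{n_1^4 + n_M^3 n_S}\right)^2 \tag{20}$$

and two zero reflectance points at

$$\tan^2\varphi_0 = \frac{(n_S - n_M)\,n_M n_1^2}{n_1^4 - n_M^3 n_S} = \left[\frac{R_S\,(1-R_{\max})}{R_{\max}\,(1-R_S)}\right]^{1/2}, \tag{21}$$

where $R_S$ is the reflectance of the uncoated substrate. The total bandwidth for which $R \le R_{\max}$ is given by:

$$\Delta\varphi = \pi - 2\varphi_e, \qquad \cos^2\varphi_e = \frac{2\sqrt{X_{\max}}}{\sqrt{X_S}+\sqrt{X_{\max}}}, \qquad X_S = \frac{R_S}{1-R_S}, \quad X_{\max} = \frac{R_{\max}}{1-R_{\max}}. \tag{22}$$

Fig. 9 shows two examples.

[Figure 9: reflectance versus $\lambda_0/\lambda$.]

Fig. 9. Reflectance of two-layer group II-type coatings on a high-index substrate with $n_S = 3.45$, $n_M = 1$; solid curve: $n_1 = 1.56$, $4n_1d_1 = \lambda_0$, $n_2 = 2.21$, $4n_2d_2 = \lambda_0$; dotted curve: $n_1 = 1.38$, $4n_1d_1 = \lambda_0$, $n_2 = 2.50$, $4n_2d_2 = \lambda_0$.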
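The two index rules for a high-index substrate can be tried out numerically. The following Python sketch builds a group I and a group II two-layer coating for silicon ($n_S = 3.45$) with an outer layer of index 1.38 and evaluates the reflectance with a characteristic-matrix routine. The location of the group II zeros, $\tan^2\varphi_0 = (n_S-n_M)\,n_M n_1^2/(n_1^4 - n_M^3 n_S)$, and the central value are derived here from the quarter-wave condition and should be read as a worked check under those assumptions; function names are illustrative.

```python
import math

def stack_reflectance(indices, phases, n_m, n_s):
    """Reflectance of a lossless layer stack at normal incidence
    (layers listed from the medium side, phase = 2*pi*n*d/lambda)."""
    m11, m12, m21, m22 = 1.0 + 0j, 0j, 0j, 1.0 + 0j
    for n, phi in zip(indices, phases):
        c, s = math.cos(phi), math.sin(phi)
        a11, a12, a21, a22 = c, 1j * s / n, 1j * n * s, c
        m11, m12, m21, m22 = (m11 * a11 + m12 * a21, m11 * a12 + m12 * a22,
                              m21 * a11 + m22 * a21, m21 * a12 + m22 * a22)
    b = m11 + m12 * n_s
    c_ = m21 + m22 * n_s
    return abs((n_m * b - c_) / (n_m * b + c_)) ** 2

n_m, n_s, n1 = 1.0, 3.45, 1.38
half_pi = math.pi / 2

# Group I: inner index scaled by sqrt(n_s/n_m); single zero at phi = pi/2
n2_group1 = n1 * math.sqrt(n_s / n_m)
r_center_group1 = stack_reflectance([n1, n2_group1], [half_pi, half_pi], n_m, n_s)

# Group II: inner index n_s*n_m/n1; small central maximum, two zeros beside it
n2_group2 = n_s * n_m / n1
r_center_group2 = stack_reflectance([n1, n2_group2], [half_pi, half_pi], n_m, n_s)
tan_sq = (n_s - n_m) * n_m * n1**2 / (n1**4 - n_m**3 * n_s)
phi0 = math.atan(math.sqrt(tan_sq))
r_zero_group2 = stack_reflectance([n1, n2_group2], [phi0, phi0], n_m, n_s)

print(f"group I : n2 = {n2_group1:.3f}, R(center) = {r_center_group1:.1e}")
print(f"group II: n2 = {n2_group2:.3f}, R(center) = {r_center_group2:.2e}, "
      f"zeros at lambda0/lambda = {phi0 / half_pi:.3f} and {2 - phi0 / half_pi:.3f}")
```

The computed inner indices agree with the dotted curves of Figs. 8 and 9 (2.56 and 2.50).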
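§ 6. Three-Layer Antireflection Coatings on High-Index Substrate

In the previous section the thickness of the spacer layer (with refractive index $\sqrt{n_S n_M}$) was zero. By giving it a finite thickness we can further reduce the residual reflectance and increase the bandwidth. The actual amount of the thickness depends on the phase factor and is difficult to optimize. Setting the thickness equal to the thicknesses of the other two layers is an obvious choice, mostly because the designs are then more symmetrical and the analysis is simple. The reflectance of a three-layer coating is given by the formula (THELEN [1956])

$$R = \frac{X}{1+X}, \tag{23}$$

with

$$X = \frac{n_S}{4n_M}\,(A^2+B^2) \tag{24}$$

and

$$A = \left(\frac{n_M}{n_S}-1\right)\cos\varphi_1\cos\varphi_2\cos\varphi_3 - \left(\frac{n_M n_2}{n_S n_1}-\frac{n_1}{n_2}\right)\sin\varphi_1\sin\varphi_2\cos\varphi_3 - \left(\frac{n_M n_3}{n_S n_2}-\frac{n_2}{n_3}\right)\cos\varphi_1\sin\varphi_2\sin\varphi_3 - \left(\frac{n_M n_3}{n_S n_1}-\frac{n_1}{n_3}\right)\sin\varphi_1\cos\varphi_2\sin\varphi_3,$$

$$B = \left(\frac{n_M}{n_3}-\frac{n_3}{n_S}\right)\cos\varphi_1\cos\varphi_2\sin\varphi_3 + \left(\frac{n_M}{n_2}-\frac{n_2}{n_S}\right)\cos\varphi_1\sin\varphi_2\cos\varphi_3 + \left(\frac{n_M}{n_1}-\frac{n_1}{n_S}\right)\sin\varphi_1\cos\varphi_2\cos\varphi_3 - \left(\frac{n_M n_2}{n_1 n_3}-\frac{n_1 n_3}{n_2 n_S}\right)\sin\varphi_1\sin\varphi_2\sin\varphi_3.$$

Group I-type coatings: By setting $\varphi = \varphi_1 = \varphi_2 = \varphi_3$ and $n_2 = \sqrt{n_S n_M}$, $n_3 = n_1 (n_S/n_M)^{1/2}$ eq. (24) yields

$$X = \frac{n_S}{4n_M}\left[\left(1+\sqrt{\frac{n_M}{n_S}}\right)\cos^2\varphi - \left(\frac{n_M}{n_1}+\frac{n_1}{\sqrt{n_S n_M}}\right)\sin^2\varphi\right]^2\left[\left(1-\sqrt{\frac{n_M}{n_S}}\right)^2\cos^2\varphi + \left(\frac{n_M}{n_1}-\frac{n_1}{\sqrt{n_S n_M}}\right)^2\sin^2\varphi\right]. \tag{25}$$

This coating produces two zero reflectance points at

$$\tan^2\varphi_0 = \frac{1 + n_M/\sqrt{n_M n_S}}{n_M/n_1 + n_1/\sqrt{n_S n_M}} \tag{26}$$

and the reflectance in the center at $\varphi = \tfrac{1}{2}\pi$ is

$$R = \left(\frac{n_1^4 - n_M^3 n_S}{n_1^4 + n_M^3 n_S}\right)^2. \tag{27}$$

Fig. 10 shows two examples.

[Figure 10: reflectance versus $\lambda_0/\lambda$.]

Fig. 10. Reflectance of three-layer group I-type coatings on a high-index substrate with $n_S = 3.45$, $n_M = 1$; solid curve: $n_1 = 1.56$, $4n_1d_1 = \lambda_0$, $n_2 = 1.8574$, $4n_2d_2 = \lambda_0$, $n_3 = 2.896$, $4n_3d_3 = \lambda_0$; dotted curve: $n_1 = 1.38$, $4n_1d_1 = \lambda_0$, $n_2 = 1.8574$, $4n_2d_2 = \lambda_0$, $n_3 = 2.56$, $4n_3d_3 = \lambda_0$.

Group II-type coatings: By setting $\varphi = \varphi_1 = \varphi_2 = \varphi_3$ and $n_2 = \sqrt{n_S n_M}$, $n_3 = n_S n_M/n_1$ eq. (24) changes into

$$X = \frac{n_S}{4n_M}\cos^2\varphi\left[\left(1-\frac{n_M}{n_S}\right)\cos^2\varphi - D\sin^2\varphi\right]^2, \qquad D = \frac{n_1^2}{n_M n_S}-\frac{n_M^2}{n_1^2}+2\sqrt{\frac{n_M}{n_S}}\left(\frac{n_1}{n_M}-\frac{n_M}{n_1}\right). \tag{28}$$

Now we have three zero reflectance points, one in the center at $\tfrac{1}{2}\pi$, and two at the points given by

$$\tan^2\varphi_0 = \frac{1 - n_M/n_S}{D}. \tag{29}$$

[Figure 11: reflectance versus $\lambda_0/\lambda$.]

Fig. 11. Reflectance of three-layer group II-type coatings on a high-index substrate with $n_S = 3.45$, $n_M = 1$; solid curve: $n_1 = 1.56$, $4n_1d_1 = \lambda_0$, $n_2 = 1.8574$, $4n_2d_2 = \lambda_0$, $n_3 = 2.21$, $4n_3d_3 = \lambda_0$; dotted curve: $n_1 = 1.38$, $4n_1d_1 = \lambda_0$, $n_2 = 1.8574$, $4n_2d_2 = \lambda_0$, $n_3 = 2.5$, $4n_3d_3 = \lambda_0$.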
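The zero positions of the three-layer coatings can be verified in the same way. This Python sketch takes the dotted-curve designs of Figs. 10 and 11 ($n_1 = 1.38$ on $n_S = 3.45$), computes where the zeros should lie from closed-form expressions derived for equal phase thicknesses, and checks them with a characteristic-matrix calculation. The closed forms are a derivation under the stated index relations, not a quotation, and the names are illustrative.

```python
import math

def stack_reflectance(indices, phases, n_m, n_s):
    """Reflectance of a lossless layer stack at normal incidence
    (layers listed from the medium side, phase = 2*pi*n*d/lambda)."""
    m11, m12, m21, m22 = 1.0 + 0j, 0j, 0j, 1.0 + 0j
    for n, phi in zip(indices, phases):
        c, s = math.cos(phi), math.sin(phi)
        a11, a12, a21, a22 = c, 1j * s / n, 1j * n * s, c
        m11, m12, m21, m22 = (m11 * a11 + m12 * a21, m11 * a12 + m12 * a22,
                              m21 * a11 + m22 * a21, m21 * a12 + m22 * a22)
    b = m11 + m12 * n_s
    c_ = m21 + m22 * n_s
    return abs((n_m * b - c_) / (n_m * b + c_)) ** 2

n_m, n_s, n1 = 1.0, 3.45, 1.38
g = math.sqrt(n_s / n_m)
n2 = math.sqrt(n_s * n_m)          # spacer index, 1.8574 for silicon in air
half_pi = math.pi / 2

# Group I: n3 = n1*sqrt(n_s/n_m); two zeros, finite reflectance in the center
n3_group1 = n1 * g
tan_sq_1 = (1.0 + 1.0 / g) / (n_m / n1 + n1 / n2)
phi0_1 = math.atan(math.sqrt(tan_sq_1))
r_zero_1 = stack_reflectance([n1, n2, n3_group1], [phi0_1] * 3, n_m, n_s)
r_center_1 = stack_reflectance([n1, n2, n3_group1], [half_pi] * 3, n_m, n_s)

# Group II: n3 = n_s*n_m/n1; three zeros, one of them in the center
n3_group2 = n_s * n_m / n1
d = n1**2 / (n_m * n_s) - n_m**2 / n1**2 + 2.0 / g * (n1 / n_m - n_m / n1)
phi0_2 = math.atan(math.sqrt((1.0 - n_m / n_s) / d))
r_zero_2 = stack_reflectance([n1, n2, n3_group2], [phi0_2] * 3, n_m, n_s)
r_center_2 = stack_reflectance([n1, n2, n3_group2], [half_pi] * 3, n_m, n_s)

print(f"group I : zeros at lambda0/lambda = {phi0_1 / half_pi:.3f}, "
      f"{2 - phi0_1 / half_pi:.3f}; R(center) = {r_center_1:.2e}")
print(f"group II: zeros at lambda0/lambda = {phi0_2 / half_pi:.3f}, 1, "
      f"{2 - phi0_2 / half_pi:.3f}")
```

With $n_1 = 1.38$ both designs place their outer zeros close to $\lambda_0/\lambda = 0.5$ and 1.5, which is what makes them broad-band.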
§ 7. Three- and Four-Layer Antireflection Coatings on Glass

Again, due to the lack of coating materials with a low enough index of refraction, it is not possible to design group I- and group II-type coatings for glass substrates directly. We can, though, design antireflection coatings from air into a high index substrate and from this high index substrate into glass. We then can let the thickness of the high index substrate shrink to zero and we end up with a composite antireflection coating for glass. As a first combination let us use a two-layer group I-type coating to match from air into the dummy high-index medium with index $n_D$ and a single layer coating to match from the dummy medium to glass. This construction requires the following index relations (eqs. (15) and (4)):

$$n_2 = n_1 (n_D/n_M)^{1/2}, \tag{30}$$

$$n_3 = (n_D n_S)^{1/2}, \tag{31}$$

or

$$n_2/n_1 = n_3/(n_M n_S)^{1/2}. \tag{32}$$

The resulting three-layer coating has a broad minimum for $\varphi = \tfrac{1}{2}\pi$. In Fig. 12 the reflectance of three different designs is shown. It appears that the design with $n_2$ = … has the broadest minimum.
[Figure 12: reflectance versus wavelength.]

Fig. 12. Reflectance of three-layer antireflection coatings on glass with $n_S = 1.52$, $n_M = 1$; solid curve: $n_1 = 1.38$, $4n_1d_1 = 510$ nm, $n_2 = 2.1$, $4n_2d_2 = 510$ nm, $n_3 = 1.88$, $4n_3d_3 = 510$ nm; dotted curve 1: $n_1 = 1.38$, $4n_1d_1 = 510$ nm, $n_2 = 1.9$, $4n_2d_2 = 510$ nm, $n_3 = 1.698$, $4n_3d_3 = 510$ nm; dotted curve 2: $n_1 = 1.38$, $4n_1d_1 = 510$ nm, $n_2 = 2.30$, $4n_2d_2 = 510$ nm, $n_3 = 2.055$, $4n_3d_3 = 510$ nm.

[Figure 13: reflectance (0 to 4 %) versus wavelength.]

Fig. 13. Reflectance of four-layer group II-type coatings on glass with $n_S = 1.52$, $n_M = 1$; solid curve: $n_1 = 1.38$, $4n_1d_1 = 485$ nm, $n_2 = 2.2$, $4n_2d_2 = 485$ nm, $n_3 = 2.43$, $4n_3d_3 = 485$ nm, $n_4 = 1.887$, $4n_4d_4 = 485$ nm; dotted curve: $n_1 = 1.38$, $4n_1d_1 = 485$ nm, $n_2 = 2.1$, $4n_2d_2 = 485$ nm, $n_3 = 2.322$, $4n_3d_3 = 485$ nm, $n_4 = 1.887$, $4n_4d_4 = 485$ nm.

[Figure 14: reflectance versus wavelength.]

Fig. 14. Reflectance of three-layer group I-type coatings on glass with $n_S = 1.52$, $n_M = 1$; solid curve: $n_1 = 1.38$, $4n_1d_1 = 525$ nm, $n_2 = 2.15$, $4n_2d_2 = 1050$ nm, $n_3 = 1.70$, $4n_3d_3 = 535$ nm; dotted curve: $n_1 = 1.38$, $4n_1d_1 = 525$ nm, $n_2 = 2.35$, $4n_2d_2 = 1070$ nm, $n_3 = 1.70$, $4n_3d_3 = 535$ nm.

[Figure 15: reflectance versus wavelength.]

Fig. 15. Reflectance of a detuned version of one of the coatings of Fig. 14, $n_S = 1.52$, $n_M = 1$; solid curve: $n_1 = 1.38$, $4n_1d_1 = 510$ nm, $n_2 = 2.15$, $4n_2d_2 = 1020$ nm, $n_3 = 1.62$, $4n_3d_3 = 510$ nm; dotted curve: $n_1 = 1.38$, $4n_1d_1 = 510$ nm, $n_2 = 2.15$, $4n_2d_2 = 1020$ nm, $n_3 = 1.70$, $4n_3d_3 = 510$ nm.
As a second example we use two group II-type coatings. The resulting four-layer coating has to satisfy the following two conditions (eq. (17)):

$$n_1 n_2 = n_M n_D, \qquad n_3 n_4 = n_D n_S, \tag{33}$$

which leaves the indices of three layers open. As an additional requirement we can specify that the reflectance in the center is zero. By a straightforward expansion of eq. (24) the reflectance of a four-layer coating for $\varphi = \varphi_1 = \varphi_2 = \varphi_3 = \varphi_4 = \tfrac{1}{2}\pi$ is found to be

$$R \approx \frac{n_S}{4n_M}\left(\frac{n_M n_2 n_4}{n_S n_1 n_3}-\frac{n_1 n_3}{n_2 n_4}\right)^2. \tag{34}$$

$R$ is zero for

$$\frac{n_1 n_3}{n_2 n_4} = \left(\frac{n_M}{n_S}\right)^{1/2}. \tag{35}$$

Combination of eqs. (33) and (35) leads to

$$n_4^2 = n_1^2 (n_S/n_M)^{3/2}, \qquad n_2^2 = n_3^2 (n_M/n_S)^{1/2}. \tag{36}$$

The two examples of Fig. 13 show that very broad band antireflection coatings can indeed be designed this way.

As a third example we use two group I-type coatings. With eq. (15) we can establish the two index relations

$$n_2/n_1 = (n_D/n_M)^{1/2}, \qquad n_3/n_4 = (n_D/n_S)^{1/2}, \tag{37}$$

or

$$n_1 (n_S)^{1/2}/n_2 = n_4 (n_M)^{1/2}/n_3, \tag{38}$$

which is the same as eq. (35). By setting $n_2 = n_3$ we can reduce the four-layer coating to a three-layer quarter-half-quarter coating. The index relation is now

$$n_4/n_1 = (n_S/n_M)^{1/2}. \tag{39}$$

The optimum bandwidth should be determined by varying $n_2$. Fig. 14 indicates that the optimum occurs for $n_2 = 2.15$. In order to further study the quarter-half-quarter coating let us set $\varphi_1 = \varphi$, $\varphi_2 = 2\varphi$, and $\varphi_3 = \varphi$ in eq. (24), with $n_1$, $n_2$, $n_3$ now denoting the three remaining layers, and evaluate the reflectance for $\varphi = \tfrac{1}{2}\pi + \varepsilon$ with $\varepsilon \to 0$. We arrive at a relation of the following form:

$$R_{\varepsilon\to 0} \approx \frac{n_S}{4n_M}\left\{\left(\frac{n_M n_3}{n_S n_1}-\frac{n_1}{n_3}\right)^2 + C\,\varepsilon^2\right\}, \tag{40}$$

with

$$C = (b_1 + b_3 + 2b_{123})^2 - 2a_{13}\left(\frac{n_M}{n_S} - 1 + 2a_{12} + 2a_{23} + 3a_{13}\right),$$

$$a_{ij} = \frac{n_M n_j}{n_S n_i}-\frac{n_i}{n_j}, \qquad b_i = \frac{n_M}{n_i}-\frac{n_i}{n_S}, \qquad b_{123} = \frac{n_M n_2}{n_1 n_3}-\frac{n_1 n_3}{n_2 n_S},$$

which indicates that by slightly deviating from eq. (39) a broadening can be accomplished: when eq. (39) holds, $a_{13} = 0$ and the center is a single minimum, whereas a small negative $a_{13}$ can make $C$ negative, so that the center becomes a shallow maximum flanked by two minima. The result of this detuning is shown in Fig. 15 for the practically important case of the quarter-half-quarter coating with $n_1 = 1.38$ (magnesium fluoride), $n_2 = 2.15$ (zirconium oxide) and $n_3 = 1.62$ (cerium fluoride) (THELEN [1965]; COX et al. [1962]). Curves for detuned four-layer coatings are given in Figs. 16 and 17. Three-layer antireflection coatings can of course also be designed by applying the general method of effective interfaces directly (THELEN [1960]; THETFORD [1970]). An interesting design by Thetford is given in Fig. 18.

[Figure 16: reflectance versus wavelength.]

Fig. 16. Reflectance of a detuned version of one of the coatings of Fig. 13, $n_S = 1.52$, $n_M = 1$; solid curve: $n_1 = 1.38$, $4n_1d_1 = 485$ nm, $n_2 = 2.2$, $4n_2d_2 = 485$ nm, $n_3 = 2.435$, $4n_3d_3 = 485$ nm, $n_4 = 1.837$, $4n_4d_4 = 485$ nm; dotted curve: $n_1 = 1.38$, $4n_1d_1 = 485$ nm, $n_2 = 2.2$, $4n_2d_2 = 485$ nm, $n_3 = 2.435$, $4n_3d_3 = 485$ nm, $n_4 = 1.887$, $4n_4d_4 = 485$ nm.

[Figure 17: reflectance versus wavelength.]

Fig. 17. Reflectance of a detuned four-layer coating on medium index glass, $n_S = 1.62$, $n_M = 1.0$; solid curve: $n_1 = 1.38$, $4n_1d_1 = 485$ nm, $n_2 = 2.176$, $4n_2d_2 = 485$ nm, $n_3 = 2.45$, $4n_3d_3 = 485$ nm, $n_4 = 1.91$, $4n_4d_4 = 485$ nm; dotted curve: $n_1 = 1.38$, $4n_1d_1 = 485$ nm, $n_2 = 2.176$, $4n_2d_2 = 485$ nm, $n_3 = 2.45$, $4n_3d_3 = 485$ nm, $n_4 = 1.98$, $4n_4d_4 = 485$ nm.

[Figure 18: reflectance versus wavelength.]

Fig. 18. Reflectance of two three-layer antireflection coatings for glass with odd thickness ratios, $n_S = 1.52$, $n_M = 1.0$; solid curve: $n_1 = 1.38$, $4n_1d_1 = 567.2$ nm, $n_2 = 2.10$, $4n_2d_2 = 212.3$ nm, $n_3 = 1.80$, $4n_3d_3 = 731.4$ nm; dotted curve: $n_1 = 1.38$, $4n_1d_1 = 500.4$ nm, $n_2 = 2.10$, $4n_2d_2 = 862.3$ nm, $n_3 = 1.8$, $4n_3d_3 = 324.5$ nm.
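The index relations for the four-layer group II-type coating lend themselves to a short calculation. The Python sketch below assumes the relations $n_4 = n_1 (n_S/n_M)^{3/4}$ and $n_3 = n_2 (n_S/n_M)^{1/4}$ (the square roots of eq. (36)), picks $n_1$ and $n_2$ freely, and confirms with the quarter-wave admittance rule that the reflectance at the center wavelength vanishes. Function names are hypothetical.

```python
def four_layer_group2_indices(n1, n2, n_m, n_s):
    """Return (n3, n4) for a four-layer quarter-wave coating with zero
    reflectance at the center wavelength, given the two outer indices."""
    ratio = n_s / n_m
    n3 = n2 * ratio ** 0.25
    n4 = n1 * ratio ** 0.75
    return n3, n4

def quarter_wave_stack_reflectance(indices, n_m, n_s):
    """Reflectance at the wavelength where every layer is a quarter wave.

    Each quarter-wave layer of index n transforms the admittance Y behind it
    into n**2 / Y; indices are listed from the medium side.
    """
    y = n_s
    for n in reversed(indices):
        y = n * n / y
    return ((n_m - y) / (n_m + y)) ** 2

n_m, n_s = 1.0, 1.52
n1, n2 = 1.38, 2.2
n3, n4 = four_layer_group2_indices(n1, n2, n_m, n_s)
n_dummy = n1 * n2 / n_m                      # dummy index from n1*n2 = n_M*n_D
r_center = quarter_wave_stack_reflectance([n1, n2, n3, n4], n_m, n_s)
print(f"n3 = {n3:.3f}, n4 = {n4:.3f}, n_D = {n_dummy:.3f}, R(center) = {r_center:.1e}")
```

For $n_1 = 1.38$ and $n_2 = 2.2$ on glass of index 1.52 this gives indices close to those listed for the solid curve of Fig. 13; the small differences are presumably rounding or deliberate adjustment in the published design.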
§ 8. Synthesized Layers
Most of the designs described so far require high indices of refraction which are still low enough to be possible but which are difficult to achieve. There are methods, though, to replace these less available materials with combinations of easily available ones. A graphical method was recently presented by ROCK [1968]. OSTERBERG [1962] discussed a method based on admittance matching.
[Figure 19: reflectance versus wavelength.]

Fig. 19. Reflectance of a synthesized four-layer antireflection coating on glass, $n_S = 1.52$, $n_M = 1$; $n_1 = 1.384$, $4n_1d_1 = 510$ nm, $n_2 = 2.35$, $4n_2d_2 = 1020$ nm, $n_3 = 1.55$, $4n_3d_3 = 510$ nm, $n_4 = 1.384$, $4n_4d_4 = 510$ nm.

[Figure 20: reflectance (0 to 4 %) versus wavelength.]

Fig. 20. Reflectance of a synthesized four-layer antireflection coating on glass, $n_S = 1.52$, $n_M = 1.0$; $n_1 = 1.38$, $4n_1d_1 = 510$ nm, $n_2 = 2.30$, $4n_2d_2 = 1105$ nm, $n_3 = 1.38$, $4n_3d_3 = 170$ nm, $n_4 = 2.30$, $4n_4d_4 = 85$ nm.
A quarter-wave layer next to the substrate (index $n_F$) can be replaced by two quarter-wave layers with indices $n_A$ and $n_B$ if the following relation is fulfilled:

$$\frac{n_A}{n_B} = \frac{n_F}{n_S}. \tag{41}$$

Fig. 19 gives as an example a coating derived by OSTERBERG [1962] from one of the designs given in Fig. 15. A third method is based on the concept of Herpin equivalent layers (EPSTEIN [1952]). In this way YOUNG [1965] derived the design shown in Fig. 20, again using one of the designs of Fig. 15 as starting design. Since all synthesizing methods really only provide equivalent performance at the wavelength point of matching, large deviations from the starting design can result. It is normally necessary to further optimize the thicknesses. Refining methods (BAUMEISTER [1958]; THELEN [1969]) are generally used for this purpose (YOUNG [1965]).
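The replacement of one quarter-wave layer by two can be illustrated with the quarter-wave admittance rule. The Python sketch below assumes that $n_B$ is the layer in contact with the substrate and that the matching relation reads $n_A/n_B = n_F/n_S$; it then checks that the pair presents the same admittance as the original layer at the matching wavelength. The helper names are illustrative.

```python
def outer_index_for_pair(n_f, n_b, n_s):
    """Index n_A of the outer quarter-wave layer such that the pair (n_A, n_B)
    on a substrate n_s matches a single quarter-wave layer of index n_f."""
    return n_f * n_b / n_s

def quarter_wave_admittance(indices, n_s):
    """Admittance seen at the top of a quarter-wave stack on substrate n_s
    (indices listed from the top layer down to the substrate)."""
    y = n_s
    for n in reversed(indices):
        y = n * n / y
    return y

n_s = 1.52
n_f = 1.70     # layer to be replaced (third layer of the Fig. 15 design)
n_b = 1.384    # readily available low-index material next to the substrate
n_a = outer_index_for_pair(n_f, n_b, n_s)

y_single = quarter_wave_admittance([n_f], n_s)
y_pair = quarter_wave_admittance([n_a, n_b], n_s)
print(f"n_A = {n_a:.3f}; admittance single = {y_single:.4f}, pair = {y_pair:.4f}")
```

With these inputs the outer index comes out near 1.55, the value used for the third layer in Fig. 19. The equivalence holds only at the matching wavelength, which is why the thicknesses usually need further refinement.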
§ 9. The Coating of Reflection Reducing Multilayers

The individual layers in a multilayer system are usually coated in succession by now-conventional thermal evaporation of the coating materials in a high vacuum chamber. An elevated substrate temperature of up to 300 °C and glow discharge cleaning contribute greatly to the adhesion and durability of the multilayer. Techniques for the controlled evaporation of some of the commonly used coating materials have been summarized by COX et al. [1962], COX and HASS [1964] and BALL and PUTNER [1966]; certain mixed oxides have also been found to be suitable evaporants (THELEN [1965]). Proper control of the thickness of each layer may be afforded by measuring the reflectance of some suitably positioned monitor plate in the evaporation chamber and stopping the deposition when the reflectance has reached a predetermined level. A separate monitor plate area is recommended for each layer. To improve the control sensitivity for layers whose refractive indices are close to that of the monitor plate, the plate surface may be precoated with some suitable material in order to simulate a relatively wide difference between the indices of the plate and film. Visual monitoring of layer thicknesses, though practised for single magnesium fluoride films and suggested for certain two-layer coatings (SAWAKI and KUBOTA [1953]), is inadequate for the control of multilayers.
§ 10. Environmental Stability of Multilayer Antireflection Coatings

It is desirable for any optical coating to show good chemical and mechanical stability in addition to its optical performance; unless the coated surface is to be hermetically sealed, good coating stability is essential. Single layer coatings of magnesium fluoride only achieved widespread usefulness when, by glow discharge cleaning the substrate and depositing the magnesium fluoride at an elevated substrate temperature, good substrate adhesion and mechanically hard coatings could be produced. Multilayer coatings can also be made durable by employing these two basic techniques in the deposition process, together with proper choice of coating material. Environmental stability tests commonly applied include the (Scotch Tape) adhesion test and tests for abrasion resistance, exposure to humidity at elevated temperatures including temperature-humidity cycling, exposure to salt fog and salt spray and exposure to extreme temperatures (−260° to +300 °C). Commercial multilayer coatings may be repeatedly cleaned without damage, exercising only the care usual in the handling of coated optics. This ability to remove greasy contaminants (thumb prints) from multilayer antireflection films is very important since a thumb print shows up vividly by reflection against the almost black background although on an uncoated glass surface it might pass unnoticed.

§ 11. Optical Performance of Coatings and Increase in Transmission Through an Optical System

Measured spectral reflectance curves of properly prepared films conform closely to computed data. Fig. 21 shows the measured spectral performance of a two-layer V-coating; the single surface reflectance has been reduced to the extremely low value of 0.02 % at the design wavelength. This figure may be contrasted to the value of 1.3 % for a magnesium fluoride single layer on crown glass.
Which of these two coatings is superior depends upon the application, for the reflectance of the V-coated surface rises steeply away from the minimum while the single layer reflectance curve is much flatter. Thus for white light applications the single layer will be more effective and the V-coating better for monochromatic light at its design wavelength. The V-coating of Fig. 21 was made for optics operating at the 633 nm helium-neon laser line.
[Figure 21: measured reflectance versus wavelength (nm).]

Fig. 21. Measured reflectance of a two-layer TiO₂ + MgF₂ V-coating giving 0.02 % reflectance at the 633 nm helium-neon laser line.

[Figure 22: measured reflectance versus wavelength.]

Fig. 22. Measured reflectance of a three-layer antireflection coating consisting of MgF₂ + ZrO₂ + CeF₃ on crown glass. The coating was a quarter-half-quarter wave design at 550 nm. (After COX et al. [1962].)

[Figure 23: measured reflectance versus wavelength.]

Fig. 23. Measured reflectance of a three-layer antireflection coating consisting of MgF₂ + Nd₂O₃ + CeF₃ on crown glass. The coating was a quarter-half-quarter wave design at 1100 nm. (After COX et al. [1962].)
Figs. 22 and 23 show the spectral performance of prepared threelayer films in the visible and near infrared wavelength regions; the reflectance is suppressed to lower than 4% over a span of several hundred nanometers. Fig. 24 is a scan of an experimental multilayer coating which shows suppression of the reflectance to below 0.8 yo over an extended spectral range of about 420 to 900 nm. Such wide-band coatings find application in instrumentation where both visual and near-infrared sighting is employed. In computing the effectiveness of systems in transmitting white light, rather than energy of a discrete wavelength, or the effectiveness of a coating in suppressing unwanted reflections, we need to integrate over the spectral region concerned. Thus the energy effectively reflected from a surface is given by
$$\int R_\lambda E_\lambda S_\lambda \, d\lambda,$$

where $R_\lambda$ is the reflectance of the surface at wavelength $\lambda$, $E_\lambda$ is the energy distribution of the incident light, and $S_\lambda$ is the spectral response of the detector. We thereby obtain the area below the spectral reflectance curve weighted according to the characteristics of the source and detector.

Fig. 24. Measured reflectance of a very wide band multilayer antireflection coating effective over the range 400-900 nm.

The fractional reduction of reflected light on coating a surface is thus
and it follows that the reflectance values for the coated surface can rise to quite high values in the spectral regions where $E_\lambda$ and $S_\lambda$ are small without impairing the usefulness of the coating. Thus, the coating of Fig. 22 is visually effective because the relatively high reflectivity at 400 nm occurs where the visual response is near-zero. Suppression of the reflectance of an optical surface leads to an increased transmission of light through that surface. For $2N$ surfaces the overall useful transmittance of an imaging system is $T^{2N}$, $T$ being the single surface transmittance. Fig. 25 shows how $T^{2N}$ varies with $T$ for selected values of $2N$. The improvement in transmittance can easily amount to one or more photographic stop numbers in a several-element system. For a non-imaging system it is relevant to include the effect of intra-lens reflections, and Fig. 26 shows how the overall system transmittance then varies with $T$ for selected values of $2N$.
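The spectral weighting and the $T^{2N}$ rule described above can be tried numerically. The following is a minimal Python sketch and is not taken from the original text: the function names, the flat 4% reflectance, the unit source and detector weights, and the 30-surface system are illustrative assumptions only.

```python
import numpy as np

def weighted_reflected_energy(wavelength_nm, R, E, S):
    """Trapezoidal estimate of the integral of R*E*S over wavelength."""
    x = np.asarray(wavelength_nm, dtype=float)
    integrand = np.asarray(R, dtype=float) * np.asarray(E, dtype=float) * np.asarray(S, dtype=float)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(x)))

def system_transmittance(T_single, n_surfaces):
    """Useful (image-forming) transmittance T**(2N) for 2N = n_surfaces surfaces."""
    return T_single ** n_surfaces

def stops_gained(T_before, T_after, n_surfaces):
    """Gain in photographic stops (factors of two) when every surface improves."""
    return n_surfaces * np.log2(T_after / T_before)

# Assumed example: a flat 4% reflectance seen with flat source and detector weights.
wl = np.linspace(400.0, 700.0, 301)
flat = np.ones_like(wl)
energy_uncoated = weighted_reflected_energy(wl, 0.04 * flat, flat, flat)

# Assumed example: 15 elements in air (30 surfaces), T raised from 0.96 to 0.995.
gain = stops_gained(0.96, 0.995, 30)
```

With these assumed numbers the gain comes out at somewhat more than one stop, in line with the statement that the improvement can amount to one or more stop numbers in a several-element system.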
Fig. 25. Overall useful transmittance of image-forming light for a system containing N elements in air as a function of the transmittance of a single surface.
It will be seen that for any selected value of the parameter $2N$ greater than unity the overall transmittance of the system is greater when the intra-surface reflections are considered. This transmittance increase is far from desirable for image-forming systems for it represents potential stray light in the image plane. Examination of Figs. 25 and 26 will show that the transmittance increase is greatest for large values of $2N$ and low values of the single surface transmittance. For any particular $2N$ value the increase is least when the single surface transmittance is greatest and it follows that the more effectively we can suppress surface reflections in a refracting system the less stray light will arise. This is discussed more fully below.

Fig. 26. Overall transmittance of light for a system containing N elements in air as a function of the transmittance of a single surface. Both image-forming and stray light arising from intra-surface reflections have been included.
§ 12. Photographic Applications

In the foregoing discussion of multilayer designs and their spectral characteristics the region of interest mainly considered was the visual, 400-700 nm region. If the multilayer coatings are to be used in visual or pseudo-visual instrumentation it is appropriate to use a spectral reflectance curve which best fits this region. On the other hand, if the response of the system to be coated extends beyond these wavelength limits, close consideration must be given to the overall spectral response and the multilayer coating characteristics can critically affect this. With single layer magnesium fluoride coatings it is common practice to center the film at the wavelength of greatest interest, be this visual or photographic, and to stagger the coating thicknesses in a many-element system to avoid strong coloration in the transmitted light. The color effects likely to result when a many-element system is coated with single magnesium fluoride layers have been studied by MURRAY [1956]. SCHARF [1952] proposed a color index based on the total axial glass thickness and the difference between the transmission densities at 400 and 750 nm. With multilayer coatings the proper selection of spectral response can be much more critical because, unlike single layers, the reflectivity can rise to values substantially in excess of the bare substrate values outside the low reflectivity zone. Thus, for example, a multilayer-coated lens reflecting a total of less than 1% of the incident light in the 400-790 nm region can reflect a total of 25% of the light at 1500 nm. The interference color by reflection has been calculated for 2- and 3-layer coatings by KUBOTA [1961]. A lens manufacturer faced with the problem of coating a many-element zoom lens for color photography must use a coating which produces acceptable color rendition in the final photograph. The spectral reflectance and absorbance of the coatings, the spectral transmittance of the lens elements and the response of the film all contribute to the system response and variations of the coating characteristics in the violet and near ultraviolet wavelengths can critically affect the color rendition.
Some otherwise suitable coating materials become strongly absorbing at shorter wavelengths and those that remain transparent can not always be deposited controllably.
§ 13. Suppression of Stray Light in an Optical System

Although the advantages of reflection reducing coatings are nearly always explained in terms of the resulting increase of light transmission through the system, in nearly every case the chief advantage lies in the reduction of unwanted light in the image plane (Fig. 27). This generalization holds good whether we are considering a single lens (say an eyeglass lens) or a system containing very many lenses (say a flexible endoscope).

Fig. 27. Intra-lens reflections in an optical system. One possible path for stray light arriving at the image plane is depicted for a photographic triplet.

The simplest case is perhaps that of the protective glass in front of an instrument dial. Sometimes proper optical design can lessen annoying reflections from the glass; for example, the glass can be formed concave to the observer. The real solution, though, lies in effective use of surface coatings. Fig. 28 illustrates the increase in visibility of an instrument dial when the frontal glasses are multilayer-coated. For a sensitive instrument movement whose pointer position may be influenced by the pressure of a static charge on the cover glass, the multilayer coating may be rendered conductive by incorporating a metal or conductive metal-salt layer. With proper design a resistivity of less than 350 000 Ω per square may be realized with only imperceptibly increased reflectivity; the absorption of the conductive layer affects only the transmission of the coated surface and this typically falls to 30-90%.

Fig. 28. The multilayer-coated instrument glass on the left greatly improves the visibility of the meter face by reducing unwanted reflections from the glass to negligible levels.

Intra-surface reflections in an eyeglass lens can give rise to irrelevant images when a bright source of light is in the scene, a situation habitually encountered under night-driving conditions. If the surface reflectivity of the lens is reduced to ½% these irrelevant images cease to be visible. Since the production of stray light by intra-surface reflections involves a double reflection (we do not normally consider any more than two reflections because the stray light intensity is then negligible), the effect of reducing the surface reflectivity from 5% to ½% is to reduce the stray light intensity by two orders of magnitude. The successful suppression of stray light in the image by the use of multilayers stems from this fact.

For a system involving many refracting elements the stray light increases because of the large number of surfaces that may form pairs. Moreover there is greater likelihood of a pair of surfaces causing the stray light to come to a focus in the region of the image plane. If this be the case the partially-focussed stray light may be of such brightness as to obliterate some wanted detail in the scene. We illustrate the increase of stray light with the number of elements-in-air of a refracting
system in Fig. 29. We have here considered light passing axially through a centered system and computed the transmission factors for the system (a) when we ignore and (b) when we include the effect of intra-lens reflections. The increase of the system transmission for case (b) represents the potential stray light in the image plane. Whereas a single layer of magnesium fluoride reduces the stray light by a factor of about 4 for a 10-element system, for example, use of multilayers effects a further reduction by a factor of 10. Some of these effects are dramatically demonstrated by Figs. 30 through 33. The circular halo of stray light in Fig. 30 leads to an overall reduction of image contrast over a large area of the scene. The 15-element system was coated throughout with single magnesium fluoride layers and reference to Fig. 29 indicates that an uncoated system would produce a stray light intensity about three times as great. An identical zoom system was multilayer coated and Fig. 31 was secured under identical conditions of scene, exposure and processing. There is a major reduction in the level of stray light and contrast over much of the scene has been greatly improved. We should note here that the exposure times for the two photographs were identical but a more meaningful comparison would have been made had the exposure of Fig. 30 been increased (nearly doubled) to compensate for the lesser transmission of the magnesium fluoride-coated system. The contrast between the two photographs might then be expected to be even more striking. Figs. 32 and 33 were similarly taken with the same pair of zoom systems. With the zoom setting (focal length setting) chosen, several of the doubly-reflected stray light beams come to a best focus in the region of the image plane and generate an objectionable array of irrelevant images.

Fig. 29. The stray light in the image plane as a function of the number of elements-in-air of a lens system. The single surface reflectivity has been taken as 4.0% for the uncoated system, 1.4% for the magnesium fluoride-coated system, 0.35% for the multilayer-coated system. The stray light has been expressed as a percentage of the wanted light in the image.

Fig. 30. Night time scene taken with a 15-element zoom lens coated throughout with single layers of magnesium fluoride. A 150 watt PAR spotlamp in the center of the scene was aimed directly at the camera lens.

Fig. 31. The same scene as in Fig. 30 has been taken under identical conditions of scene, exposure and processing except that the zoom lens was now multilayer-coated.

Fig. 32. San Francisco's Golden Gate Bridge at night taken with a 15-element zoom lens coated throughout with single layers of magnesium fluoride. The driving lights of the parked car at the foot of the picture were aimed directly at the camera lens. Many of the irrelevant stray light images come to a focus in the region of the film plane.

Fig. 33. The identical bridge scene but with the zoom lens multilayer-coated.
Even quite minor images in a static scene become very obtrusive when the scene is panned; sunset scenes are classic cases where unwanted images intrude into the scene and it is not always feasible to keep localized glare sources out of the object field. Fig. 33 shows how a high degree of stray light suppression can be achieved by using multilayers.
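The double-reflection argument of this section lends itself to a quick numerical check. The sketch below is an illustrative first-order estimate only: it counts surface pairs and scales each doubly reflected beam by the square of the single-surface reflectivity, ignoring the transmission factors of the intervening surfaces. It is not the transmission-factor model behind Fig. 29, and all function names are assumptions.

```python
from math import comb

def surface_pairs(n_elements):
    """Number of distinct pairs among the 2N surfaces of N elements in air."""
    return comb(2 * n_elements, 2)

def double_reflection_reduction(r_before, r_after):
    """Factor by which one doubly reflected beam is weakened when the
    single-surface reflectivity drops from r_before to r_after."""
    return (r_before / r_after) ** 2

def first_order_stray_fraction(n_elements, r):
    """Rough stray-to-wanted light ratio: (number of pairs) * r**2."""
    return surface_pairs(n_elements) * r ** 2

# A triplet has 6 surfaces and therefore 15 possible reflecting pairs.
pairs_triplet = surface_pairs(3)

# Going from 5% to 0.5% per surface weakens each ghost beam about 100-fold.
reduction = double_reflection_reduction(0.05, 0.005)
```

The pair count grows roughly as the square of the number of elements, which is why many-element zoom lenses benefit most from low single-surface reflectivity.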
References

BAUMEISTER, P., 1958, J. Opt. Soc. Am. 48, 955.
COX, J. T., G. HASS and A. THELEN, 1962, J. Opt. Soc. Am. 52, 965.
COX, J. T. and G. HASS, 1964, Antireflection Coatings for Optical and Infrared Optical Materials, Vol. 2, in: Physics of Thin Films, eds. G. Hass and R. E. Thun (Academic Press, New York and London).
DUFOUR, C. and A. HERPIN, 1954, Optica Acta 1, 1.
EPSTEIN, L. I., 1952, J. Opt. Soc. Am. 42, 806.
KUBOTA, H., 1961, Interference Color, in: Progress in Optics, Vol. 1, ed. E. Wolf (North-Holland Publishing Co., Amsterdam).
MUCHMORE, R. B., 1948, J. Opt. Soc. Am. 38, 20.
MURRAY, A. E., 1956, J. Opt. Soc. Am. 46, 790.
OSTERBERG, H., 1962, Military Standardization Handbook 141, Optical Design, Defence Supply Agency Washington D.C., Chapt. 21.
PUTNER, T. and R. BALL, 1966, Vacnique 5, 3.
ROCK, F., 1968, U.S. patent pending.
SAWAKI, T. and H. KUBOTA, 1953, Sci. Light (Tokyo) 2, 128.
SCHARF, P. T., 1952, J. Soc. Motion Picture Television Engineers 59, 191.
SMAKULA, A., 1941, Glastech. Ber. 19, 377.
SMITH, S. D., 1958, J. Opt. Soc. Am. 48, 43.
THETFORD, A., 1970, A Method of Designing Three Layer Antireflection Coatings, Optica Acta, to be published.
THELEN, A., 1956, Optik 13, 537.
THELEN, A., 1960, J. Opt. Soc. Am. 50, 509.
THELEN, A., 1965, U.S. Patent 3,185,020.
THELEN, A., 1969, Design of Multilayer Interference Filters, in: Physics of Thin Films, Vol. 5, eds. G. Hass and R. E. Thun (Academic Press, New York and London).
YOUNG, L., 1961, J. Opt. Soc. Am. 51, 967.
YOUNG, L., 1965, Appl. Opt. 4, 366.
V

STATISTICAL PROPERTIES OF LASER LIGHT*

BY

H. RISKEN

I. Institut für theoretische Physik der Universität, Stuttgart, Germany
* This article was written during the author’s stay at the Department of Electrical Engineering, University of Minnesota, Minneapolis, Minnesota and was supported by Project Themis, Office of Naval Research.
CONTENTS

§ 1. INTRODUCTION
§ 2. SEMICLASSICAL THEORY
§ 3. SOLUTION OF THE LASER FOKKER-PLANCK EQUATION
§ 4. PHOTOELECTRON COUNTING DISTRIBUTION
§ 5. FULLY QUANTUM MECHANICAL THEORY
REFERENCES
§1. Introduction
In physics one may distinguish two developments. After the laws for certain quantities (e.g., the position of a small particle, intensity of an optical field, a current in a circuit) have been discovered, the fluctuation properties of these quantities (e.g., Brownian motion, fluctuations of light, noise in circuits) have been investigated several decades later on. The same development occurred in the laser field, only in a much shorter time period. After SCHAWLOW and TOWNES [1958] had shown how to extend the maser principle to the optical region, the first laser action was observed by MAIMAN [1960] and by COLLINS et al. [1960]. Then a large number of papers appeared dealing for instance with mode competition, frequency pulling and pushing and hole burning effects. A microscopic theory of these effects was given by several authors; the most complete investigations are the ones of HAKEN and SAUERMANN [1963a, b] and LAMB [1964]. After the first laser was built, not much was known about the statistical properties of the laser light (e.g., intensity fluctuation, linewidth) except that the linewidth was very small. One of the first attempts to derive the statistical properties of laser light was made by WAGNER and BIRNBAUM [1961] using a linear theory. As we know today the laser oscillation can only be described by nonlinear equations and therefore their results apply only to a light amplifier or a laser below threshold. The first microscopic theory of laser fluctuations seems to be the one of HAKEN [1964a, b] and soon afterwards several other papers appeared (HAKEN [1965, 1966], HAKEN and WEIDLICH [1966], LAX [1966a, b], SAUERMANN [1965, 1966], RISKEN et al. [1966a, b, c], PAUWELS [1966], KORENMAN [1965]). In these investigations the nonlinear laser equations were solved by a linearization procedure, similar to the one used by BLACQUIÈRE [1953] and GRIVET and BLACQUIÈRE [1963] in the noise investigation of a maser oscillator.
The results of the above laser theories agreed very well with noise measurements of FREED and HAUS [1965, 1966], ARECCHI et al. [1966], SMITH and ARMSTRONG [1966a], and others. However, the linearized theories are not applicable near laser threshold. A more general solution of the nonlinear laser equation with noise, which was valid below, at and above threshold was derived by RISKEN [1965, 1966] and by LAX and HEMPSTEAD [1966], using a Fokker-Planck equation. The stationary solution of the laser Fokker-Planck equation was substantiated experimentally by SMITH and ARMSTRONG [1966b], who showed that near threshold this solution is superior to those which followed from the linearized laser equations or from the signal plus noise model of LACHS [1965] and of GLAUBER [1966]. (For a review article pertaining to experimental studies of intensity fluctuation in lasers see ARMSTRONG and SMITH [1967].) After the appearance of these early papers, fluctuation phenomena in a laser were derived with more advanced methods. In the papers of WEIDLICH et al. [1967a, b] and HAKEN et al. [1967] the master equation of WEIDLICH and HAAKE [1965a, b] and the coherent state representation of GLAUBER [1963a, b] and SUDARSHAN [1963] were used. SCULLY and LAMB [1966, 1967a] and FLECK [1966a, b] derived and solved the master equation in the occupation number representation; MCCUMBER [1966], WILLIS [1967], GORDON [1967] and BRUNNER [1967] also developed a theory of laser noise. On the experimental side more detailed investigations were performed by ARECCHI et al. [1967a, b, c], CHANG et al. [1967] and DAVIDSON and MANDEL [1967]. Very good agreement was found not only with the stationary solution but also with the correlation function (derived from the Fokker-Planck equation by RISKEN and VOLLMER [1967a] and HEMPSTEAD and LAX [1967]) and with the transient behavior of the laser oscillation (calculated by RISKEN and VOLLMER [1967b] and SCULLY and LAMB [1967b]).
The main result of the theoretical and experimental investigation is that the statistical properties of laser light differ essentially from the properties of ordinary, i.e., thermal light, even if the linewidth of the thermal light would have been made as narrow as that of laser light (e.g., by use of filters) and if the intensity would have been made as high as that of laser light. For instance, the photoelectron counting probability distribution (for time intervals much shorter than the reciprocal linewidth) is a Bose-Einstein distribution for thermal light and for laser radiation far below threshold, whereas for laser light well above threshold the photoelectron counting distribution is a Poisson distribution. In the interesting threshold region one has a continuous transition between these two distributions. Because the laser threshold region is very small one may regard this transition as a phase transition.
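The contrast between the two limiting counting distributions can be made concrete with a short Python sketch. The formulas are the standard Bose-Einstein and Poisson distributions for a given mean count; the function names, the mean of 3 counts, and the truncation at 200 counts are illustrative assumptions.

```python
from math import exp, lgamma, log, log1p

def bose_einstein(n, mean):
    """Bose-Einstein probability of n counts: mean**n / (1 + mean)**(n + 1)."""
    return exp(n * log(mean) - (n + 1) * log1p(mean))

def poisson(n, mean):
    """Poisson probability of n counts: exp(-mean) * mean**n / n!."""
    return exp(-mean + n * log(mean) - lgamma(n + 1))

def variance(pmf, mean, n_max=200):
    """Variance of the count distribution, summed up to n_max counts."""
    m1 = sum(n * pmf(n, mean) for n in range(n_max))
    m2 = sum(n * n * pmf(n, mean) for n in range(n_max))
    return m2 - m1 * m1

# For the same mean count the thermal-like distribution is much broader:
# variance = mean + mean**2 (Bose-Einstein) against variance = mean (Poisson).
var_thermal = variance(bose_einstein, 3.0)
var_laser = variance(poisson, 3.0)
```

The excess variance of the Bose-Einstein case is the large relative intensity fluctuation found below threshold; well above threshold only the Poisson (shot noise) part remains.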
For instance, below threshold the relative intensity fluctuations are large (phase a); above threshold the relative intensity fluctuations are small (phase b). (For a more detailed discussion of the relation between phase transition and laser threshold see GRAHAM and HAKEN [1968].) The following two points make it difficult (and therefore interesting) to investigate the statistical properties of laser light. First, one has to treat a nonlinear equation with noise, preferably without a linearization procedure. Secondly, the noise originates from the quantum behavior of the light field (i.e., spontaneous emission). From the last point it follows that a rigorous description of the electromagnetic field should be given by quantum electrodynamics. Because of the large number of photons present in the laser resonator (number at threshold is of the order 4000), it turns out, however, that to a high accuracy (relative error of the order 1/4000) the light field can be treated classically provided a proper noise force is added, whereas the atoms are treated quantum mechanically (semiclassical theory). Near threshold the error due to this procedure is much smaller than the error due to the linearization of the laser equations. Although the accuracy of the semiclassical theory is high enough for a comparison with experiments, a more rigorous treatment is desirable from a theoretical point of view. The present review is divided into two main parts. In the first part (§§ 2-4) the semiclassical theory is developed. The laser Fokker-Planck equation is derived in § 2; this equation is solved in § 3. Using a connection between photoelectron counting distribution and intensity distribution given by MANDEL [1958, 1963], the counting distribution is calculated and compared with measurements in § 4. In the second part (§ 5) the fully quantum mechanical laser equations are derived using the master equation method of WEIDLICH and HAAKE [1965a, b].
In § 5.3 it is shown explicitly how the Fokker-Planck equation of the semiclassical theory can be obtained from the master equation without introducing any additional noise forces. Moreover first order quantum corrections are calculated in § 5.3 for the purpose of supplementing the results of § 3. It should be noted that the more logical approach would have been to start with § 5 and then proceed with § 3, where the laser Fokker-Planck equation is solved. Because the semiclassical theory is much simpler and because the small quantum correction to this theory has not been measurable up to now, we have chosen the semiclassical laser theory as the starting point. In the whole article we have restricted ourselves to a homogeneously broadened two level laser system with one traveling mode operating not too far from threshold. The restrictions of homogeneously broadened, two level systems, and a running mode were made because of the simplification in deriving the laser Fokker-Planck equation. Without these restrictions one can derive the same Fokker-Planck equation as discussed in § 2. We have restricted ourselves to the region not too far from threshold for the following reasons: First, because of the low stabilization force near threshold, fluctuations have a large influence only near and below threshold; they have a negligible influence very high above threshold. Secondly, the theory is now well established not too far from threshold and it agrees with experiments very well. Third, for high pumping powers the one mode assumption is not valid even for a homogeneously broadened laser system with a traveling wave because in the high pumping region off resonance modes begin to oscillate too (see RISKEN and NUMMEDAL [1968a, b, c], GRAHAM and HAKEN [1968]).
§ 2. Semiclassical Theory

2.1. LASER EQUATION WITHOUT NOISE
In this paragraph we derive the semiclassical laser equations. Semiclassical means that we neglect the operator character of the light field and treat it as a classical variable but that the atoms are treated quantum mechanically. Since even at laser threshold a large number (of the order 4000) of photons are in the cavity, the relative error due to neglecting the operator character is very small (of the order 0.1%) and has, up to now, no measurable influence on the photon statistics, see ARMSTRONG and SMITH [1967]. We postpone the more complicated fully quantum mechanical treatment to § 5. In order to derive these equations we confine ourselves to a running wave, single mode ring laser model with $N$ two-level homogeneously broadened atoms. It turns out that more general models (standing wave type resonator, inhomogeneous broadening, multi-level systems) lead, near threshold, to the same equation; only the basic parameters of this equation are of a slightly different form. The equation of motion of the density matrix $\rho^{(\mu)}$ for the $\mu$'th atom under the influence of an electric field $E$ is given by
$$\dot\rho^{(\mu)} = -\frac{i}{\hbar}\,[H, \rho^{(\mu)}], \tag{2.1}$$

where the Hamilton operator

$$H = H_0 - exE \tag{2.2}$$
consists of a free field part $H_0$ and an interaction part, $-exE$. The electric field is assumed to be polarized perpendicular to the laser axis $z$. Because we have assumed a two level system the density operator can be expanded in the two eigenstates $|1\rangle$ and $|2\rangle$ of the Hamilton operator $H_0$

$$H_0|1\rangle = E_1|1\rangle, \quad H_0|2\rangle = E_2|2\rangle, \quad \langle i|j\rangle = \delta_{ij}, \quad \hbar\omega_0 = E_2 - E_1. \tag{2.3}$$
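From (2.1) it follows that the equation of motion for the elements $\rho_{ij}^{(\mu)} = \langle i|\rho^{(\mu)}|j\rangle$ of the density operator $\rho^{(\mu)}$ reads

$$\dot\rho_{12}^{(\mu)} = \dot\rho_{21}^{(\mu)*} = i\omega_0\,\rho_{12}^{(\mu)} + i(e/\hbar)\,x_{12}E\,(\rho_{22}^{(\mu)} - \rho_{11}^{(\mu)}) - \underline{\gamma_2\,\rho_{12}^{(\mu)}} \tag{2.4}$$

$$\dot\rho_{22}^{(\mu)} - \dot\rho_{11}^{(\mu)} = 2i(e/\hbar)\,x_{12}E\,(\rho_{12}^{(\mu)} - \rho_{21}^{(\mu)}) - \underline{\gamma_1(\rho_{22}^{(\mu)} - \rho_{11}^{(\mu)})} + \underline{\gamma_1\sigma_0/N}. \tag{2.5}$$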
In deriving (2.4) and (2.5) we have assumed that there is no permanent dipole moment in the ground and excited state ($\langle 1|x|1\rangle = \langle 2|x|2\rangle = 0$). The phase factors of the ground and excited states were chosen in such a manner that the matrix element $x_{12} = \langle 1|x|2\rangle$ is real. The underlined terms in (2.4) and (2.5) were added in order to describe damping of the off-diagonal elements ($-\gamma_2\rho_{12}^{(\mu)}$) and of the diagonal elements ($-\gamma_1(\rho_{22}^{(\mu)} - \rho_{11}^{(\mu)})$) and in order to include pumping ($\gamma_1\sigma_0/N$). In § 5 we will see that these terms can be derived by adding reservoirs to the system. Eqs. (2.4) and (2.5) are the Bloch equations ($\gamma_2 = 1/T_2$, $\gamma_1 = 1/T_1$) of the spin resonance theory (WANGSNESS and BLOCH [1953]) applied to the case of a dipole moment under the influence of an electric field. Eqs. (2.4) and (2.5) determine the density matrix elements under the influence of an electric field $E$. By the wave equation ($\kappa$ describes the losses of the electric field)

the atoms react back on the electric field via the polarization

$$P(z, t) = \frac{1}{\Delta}\sum_{(z_\mu - z)\,\in\,\Delta} e\,x_{12}\,\bigl(\rho_{12}^{(\mu)}(t) + \rho_{21}^{(\mu)}(t)\bigr). \tag{2.7}$$
The summation in (2.7) means that we have to sum up over a number of active atoms situated at $z_\mu$ which in turn are placed in a volume element $\Delta$ around $z$. This volume element may be so small that in it the electric field $E(z, t)$ is practically constant, but so large that it contains a large number of active atoms. Since we are concerned with a one dimensional ring laser with one allowed direction of propagation, the electric field has the form of a traveling wave. Because of the relatively weak interaction with the active atoms, the amplitudes will change slowly. Therefore we can make the ansatz ($V$ is the cavity volume)
where $b(t)$ is a slowly varying complex function with respect to the period $2\pi/\omega_0$ ...

$$\sigma_{\rm thr} = \kappa\gamma_2/g^2, \tag{2.13}$$

which is the SCHAWLOW-TOWNES [1958] formula. The steady state intensity is then given by (2.14)
Near threshold ($\sigma_0 \approx \sigma_{\rm thr}$) one can simplify (2.11) further. If the time variation of $b$ and $s$ is slow compared to the times $1/\gamma_1$ and $1/\gamma_2$, the inversion is approximately (2.15)

Inserting the approximate inversion (2.15) leads, after neglecting the (small) derivative $\dot s$, to the rotating-wave VAN DER POL [1927] or RAYLEIGH [1894] equation

$$\dot b - (\gamma - \beta b^*b)\,b = \dot b - \beta(d - b^*b)\,b = 0 \tag{2.16}$$
with
In deriving (2.16) and (2.17) we have assumed $\kappa$ ... $(t_l > t_{l-1} > \ldots > t_1)$

$$W_l(b^{(l)}, t_l; \ldots; b^{(1)}, t_1) = \prod_{i=2}^{l} G(b^{(i)}, b^{(i-1)}, |t_i - t_{i-1}|)\; W(b^{(1)}, t_1), \tag{2.38}$$

where $G$ is the Green's function of the Fokker-Planck eq. (2.34), i.e., $G$ has the initial value

$$G(b^{(2)}, b^{(1)}, 0) = \delta(b^{(2)} - b^{(1)}). \tag{2.39}$$
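The noise-free equation (2.16) can be explored numerically before any noise is added. The sketch below is an illustration under stated assumptions, not part of the original derivation: it takes $\beta$ and $d$ to be real constants, integrates $\dot b = \beta(d - b^*b)\,b$ with a classical fourth-order Runge-Kutta step, and compares the intensity $b^*b$ with the logistic closed form that follows for $I = b^*b$ from $\dot I = 2\beta(d - I)I$. All names and parameter values are hypothetical.

```python
import cmath
import math

def rhs(b, beta, d):
    """Right-hand side of the rotating-wave van der Pol equation, beta and d real."""
    return beta * (d - abs(b) ** 2) * b

def integrate_b(b0, beta, d, t_end, steps=10_000):
    """Fourth-order Runge-Kutta integration of db/dt = beta*(d - |b|^2)*b."""
    h = t_end / steps
    b = complex(b0)
    for _ in range(steps):
        k1 = rhs(b, beta, d)
        k2 = rhs(b + 0.5 * h * k1, beta, d)
        k3 = rhs(b + 0.5 * h * k2, beta, d)
        k4 = rhs(b + h * k3, beta, d)
        b += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return b

def logistic_intensity(i0, beta, d, t):
    """Closed-form solution of dI/dt = 2*beta*(d - I)*I for d != 0."""
    g = math.exp(2.0 * beta * d * t)
    return d * i0 * g / (d + i0 * (g - 1.0))

# Above threshold (d > 0) the intensity settles at b*b = d; the phase stays put.
b_final = integrate_b(0.1 + 0.1j, beta=1.0, d=2.0, t_end=5.0)
i_exact = logistic_intensity(0.02, beta=1.0, d=2.0, t=5.0)
```

For $d > 0$ the intensity saturates at $d$ while the phase is left undetermined by the dynamics; for $d < 0$ it decays to zero. This is the deterministic backdrop against which the noise terms of the Fokker-Planck treatment act.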
§ 3. Solution of the Laser Fokker-Planck Equation

3.1. STATIONARY SOLUTION AND ITS EXPECTATION VALUES
The stationary solution $W(\hat r, \varphi, \hat t)$ of the Fokker-Planck equation (2.37) surely does not depend on the time $\hat t$. It does not depend on the phase $\varphi$ either, because no direction of the $b$-plane is preferred. Hence the stationary solution $W_{\rm st}(\hat r)$ has to obey the equations

Here $S$ can be interpreted as a probability current in the $\hat r$ direction, which is a constant because of the first part of (3.1). This current must originate either from the origin or from infinity. In the present case, it has no physical meaning to assume a current $S$ different from zero. Furthermore, it can be shown that for $S \neq 0$ the distribution function $W_{\rm st}(\hat r)$ does not go to zero sufficiently fast for $\hat r \to \infty$. When $S = 0$ the stationary distribution function follows immediately from (3.1), viz.,

$$W_{\rm st}(\hat r) = \frac{\mathcal N}{2\pi}\exp\{-\tfrac14\hat r^4 + \tfrac12 a\hat r^2\} = \frac{\mathcal N}{2\pi}\,e^{a^2/4}\exp\{-\tfrac14(\hat r^2 - a)^2\}, \qquad \frac{1}{\mathcal N} = \int_0^\infty \exp\{-\tfrac14\hat r^4 + \tfrac12 a\hat r^2\}\,\hat r\,d\hat r = \tfrac12 F_0(a). \tag{3.2}$$
This stationary distribution function is shown in Fig. 1 for different pump parameters near threshold. It is a Gaussian distribution in the intensity $\hat I = \hat r^2$ with the mean value $\hat I = a$, truncated at $\hat I = 0$. The moments $M_n$ and the generating function $M(s)$ of the moments of the stationary distribution function are given by

$$M_n = \langle \hat I^n\rangle = F_n(a)/F_0(a) \tag{3.3}$$

$$M(s) = \langle e^{s\hat I}\rangle = F_0(a+2s)/F_0(a), \tag{3.4}$$

where the integrals

$$F_n(a) = \int_0^\infty \hat I^n \exp\{-\tfrac14\hat I^2 + \tfrac12 a\hat I\}\,d\hat I \tag{3.5}$$

can be reduced to the error integral

$$\Phi(a) = \frac{2}{\sqrt\pi}\int_0^a e^{-x^2}\,dx \tag{3.6}$$

by the following recurrence relations

$$F_n(a) = 2(n-1)F_{n-2}(a) + aF_{n-1}(a) \quad (n \ge 2); \qquad F_1(a) = 2 + aF_0(a); \qquad F_0(a) = \sqrt\pi\,\exp(\tfrac14 a^2)\,[1 + \Phi(\tfrac12 a)]. \tag{3.7}$$

Fig. 1. The stationary distribution (3.2) as a function of the normalized intensity $\hat I = \hat r^2$.
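The recurrence relations (3.7) give the moments without any numerical integration. The Python sketch below is a cross-check under one stated assumption: the integrals are taken in the form $F_n(a) = \int_0^\infty \hat I^n \exp(-\hat I^2/4 + a\hat I/2)\,d\hat I$, which is the form consistent with (3.7). Function names, the integration cutoff and the step count are illustrative choices.

```python
import math

def F0(a):
    """F_0(a) = sqrt(pi) * exp(a^2/4) * [1 + Phi(a/2)], using 1 + erf(x) = erfc(-x)."""
    return math.sqrt(math.pi) * math.exp(0.25 * a * a) * math.erfc(-0.5 * a)

def F(n, a):
    """F_n(a) from the upward recurrence F_n = 2(n-1) F_{n-2} + a F_{n-1}."""
    f_prev = F0(a)
    if n == 0:
        return f_prev
    f_curr = 2.0 + a * f_prev  # F_1 = 2 + a F_0
    for k in range(2, n + 1):
        f_prev, f_curr = f_curr, 2.0 * (k - 1) * f_prev + a * f_curr
    return f_curr

def moment(n, a):
    """n-th moment of the normalized intensity, M_n = F_n(a) / F_0(a)."""
    return F(n, a) / F0(a)

def F_numeric(n, a, upper=40.0, steps=100_000):
    """Midpoint-rule value of the assumed integral, for moderate |a| only."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** n * math.exp(-0.25 * x * x + 0.5 * a * x)
    return total * h

mean_at_threshold = moment(1, 0.0)  # equals 2 / sqrt(pi)
```

One caveat of this sketch: for strongly negative pump parameters the step $F_1 = 2 + aF_0$ subtracts two nearly equal numbers, so the upward recurrence loses accuracy far below threshold and an asymptotic expansion is the safer route there.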
The integrals $F_n$ are related to the parabolic cylinder function $D_n(z)$ by

$$F_n(a) = n!\;2^{(n+1)/2}\exp(\tfrac18 a^2)\,D_{-n-1}(-a/\sqrt2) \tag{3.8}$$

(see GRADSHTEYN and RYZHIK [1965]). Because the generating function of the cumulants $K_n$ of the stationary distribution (3.2) can be expressed by (see for instance STRATONOVICH [1963] for a definition of cumulants)

$$\ln M(s) = \sum_{n=1}^{\infty} K_n(a)\,\frac{s^n}{n!} = \ln F_0(a+2s) - \ln F_0(a), \tag{3.9}$$

the cumulants $K_n(a)$ are derivatives of each other

$$K_{n+1}(a) = \frac{d^n}{ds^n}\frac{d}{ds}\ln F_0(a+2s)\Big|_{s=0} = 2^n\frac{d^n}{da^n}\frac{d}{ds}\ln F_0(a+2s)\Big|_{s=0} = 2^n\frac{d^n}{da^n}K_1(a) = 2\frac{d}{da}K_n(a). \tag{3.10}$$

Fig. 2. The first moment $\langle\hat I\rangle = M_1(a) = K_1(a)$ (solid line) and the asymptotic expansions $K_1(a) = 2/|a| - 8/|a|^3$ for $a \ll -1$ and ... for $a \gg 1$ (broken line) as a function of the pump parameter $a$. The unnormalized intensity $\langle I\rangle$ (in photon numbers) is only valid for threshold photon number = 4000 found by ARECCHI et al. [1967c].
The first moment $M_1(a) = K_1(a)$ is shown in Fig. 2 as a function of the pump parameter; in Fig. 3 the first four cumulants are plotted. The stationary distribution (3.2) was found by RISKEN [1965], LAX and HEMPSTEAD [1966], FLECK [1966a, b], SCULLY and LAMB [1966, 1967a] and WEIDLICH et al. [1967a]. The result of Scully and Lamb has a different form which, however, is practically identical to (3.2). For unnormalized units the intensity is $\langle I(a)\rangle = \sqrt{q/\beta}\,\langle\hat I(a)\rangle$. Thus $n_s = \sqrt{q/\beta}\,\langle\hat I(0)\rangle = \sqrt{q/\beta}\;2/\sqrt\pi$ is the number of photons at threshold, which ARECCHI et al. [1967c] have found for their laser to be 4000.
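The first two cumulants and the conversion to photon numbers can be sketched in a few lines of Python. The block assumes $F_0(a) = \sqrt\pi\,e^{a^2/4}[1 + \Phi(a/2)]$ and the recurrence $F_1 = 2 + aF_0$, $F_2 = 2F_0 + aF_1$, so that $K_1 = a + 2/F_0$ and $K_2 = 2 + aK_1 - K_1^2$; the threshold photon number of 4000 is the value quoted from ARECCHI et al. [1967c]. The function names are hypothetical.

```python
import math

def F0(a):
    """F_0(a) = sqrt(pi) * exp(a^2/4) * erfc(-a/2)."""
    return math.sqrt(math.pi) * math.exp(0.25 * a * a) * math.erfc(-0.5 * a)

def K1(a):
    """First cumulant (mean normalized intensity): F_1/F_0 = a + 2/F_0."""
    return a + 2.0 / F0(a)

def K2(a):
    """Second cumulant (variance): F_2/F_0 - K1^2 with F_2/F_0 = 2 + a*K1."""
    m1 = K1(a)
    return 2.0 + a * m1 - m1 * m1

def mean_photon_number(a, n_threshold=4000.0):
    """Mean photon number, scaled so that the value at a = 0 is n_threshold."""
    return n_threshold * K1(a) / K1(0.0)

def relative_variance(a):
    """Relative intensity fluctuation K2 / K1^2."""
    return K2(a) / K1(a) ** 2

# Numerical check of K_2 = 2 dK_1/da at a = 1 by central differences.
h = 1e-5
k2_from_derivative = 2.0 * (K1(1.0 + h) - K1(1.0 - h)) / (2.0 * h)
```

The relative variance computed this way is close to one well below threshold and falls toward zero well above it, which is the change from large to small relative intensity fluctuations described in the introduction.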
3.2. EXPANSION IN EIGENMODES
In order to obtain the joint distribution of $l$'th order in eq. (2.38) or in order to give a transient solution of the Fokker-Planck eq. (2.37) it is sufficient to know the Green's function of the Fokker-Planck equation. This Green's function can be expressed in the following way (RISKEN and VOLLMER [1967a, b], HEMPSTEAD and LAX [1967])

$$G(\hat r, \varphi; \hat r', \varphi'; \tau) = \frac{\psi_{00}(\hat r)}{\psi_{00}(\hat r')}\,\frac{1}{2\pi\hat r}\sum_{m=0}^{\infty}\sum_{n=-\infty}^{\infty}\psi_{nm}(\hat r)\,\psi_{nm}(\hat r')\,e^{in(\varphi-\varphi')}\exp\{-\lambda_{nm}\tau\}, \tag{3.11}$$

where $\psi_{nm}$ are the eigenfunctions and $\lambda_{nm}$ are the eigenvalues of the one dimensional Schroedinger equation

$$\psi''_{nm} + [\lambda_{nm} - V_n(\hat r)]\,\psi_{nm} = 0 \tag{3.12}$$

with the potential
256
[v.
STATISTICAL PROPERTIES O F LASER LIGHT
s
3
-
yoo(f)= V‘FM exp{-+~*+$a~2).
(3.14)
The yrLnl(F) are assumed to be normalized. Thus, because of the orthogonality, we have (3.15)
The expression (3.11) is a Green’s function of the Fokker-Planck equation because, as one may verify by insertion, G is a solution of (2.37) and because of the completeness relations l o o
M
s(Y-Y’)
=
2 y n m ( f )y n m ( y ’ ) ; ~ ( v - q ~=’ ) 27G 2 -
m=O
n=-w
ein+P‘)
2
(3.16)
the initial conditions are given by G(Y, 31; F’, p’; 0)
1
= zS(f-F’) S(y-p’)
Y
=
S(6-6’).
(3.17)
In order to obtain numerical results, one has to calculate the eigenvalues and eigenfunctions of the Schroedinger eq. (3.12). Only below ($a \ll -1$) and above ($a \gg 1$) threshold can this Schroedinger equation be solved analytically (except of course for the stationary solution $\psi_{00}$). Thus one has to use either approximations (for instance a variational method, RISKEN [1966]) or a numerical integration of the Schroedinger equation (RISKEN and VOLLMER [1967a, b], HEMPSTEAD and LAX [1967]).

3.3. CORRELATION FUNCTIONS
The two most important correlation functions for the light field are the correlation function of the amplitude ($d^2b = d(\mathrm{Re}\,b)\,d(\mathrm{Im}\,b)$)
$$g(a, \tau) = \langle b^*(t+\tau)\,b(t)\rangle = \iint b^*\,b'\;W_2(b, b'; \tau)\,d^2b\,d^2b' \qquad (3.18)$$
and the correlation function of the intensity fluctuation
$$K(a, \tau) = \langle(|b(t+\tau)|^2 - \langle|b|^2\rangle)(|b(t)|^2 - \langle|b|^2\rangle)\rangle = \iint (|b|^2 - \langle|b|^2\rangle)(|b'|^2 - \langle|b|^2\rangle)\,W_2(b, b'; \tau)\,d^2b\,d^2b' \qquad (3.19)$$
(see BORN and WOLF [1964], MANDEL and WOLF [1961]). In the stationary state, which will be treated here, these correlation functions and the joint distribution function of second order $W_2$ depend only on the time difference $\tau$. Using (2.38), where $W$ is now the stationary distribution (3.2), we obtain by inserting the explicit expression (3.11)
$$W_2(\hat r, \varphi; \hat r', \varphi'; \tau) = \sum_{m=0}^{\infty}\sum_{n=-\infty}^{\infty} \frac{\psi_{00}(\hat r)}{2\pi\hat r}\,\frac{\psi_{00}(\hat r')}{2\pi\hat r'}\,\psi_{nm}(\hat r)\,\psi_{nm}(\hat r')\,e^{in(\varphi - \varphi')}\exp\{-\lambda_{nm}|\tau|\}. \qquad (3.20)$$
Carrying out the integration leads to (in normalized units)
(3.21)
(3.22)
The correlation functions for $\tau = 0$ were already calculated in § 3.1. Since all matrix elements $V_m$ are positive and their sum is one,
$$\sum_{m=0}^{\infty} V_m = 1, \qquad (3.24)$$
the $V_m$ give the relative influence of the $m$'th order damping term. The Fourier transform of the correlation function gives the spectral profile. Because the correlation functions are sums of exponential functions, the spectral profile is a sum of Lorentzian functions. However, the influence of Lorentzian lines beyond the lowest order is not very pronounced for the correlation function of the amplitude. Calculation of $V_0$ shows that $1 - V_0 = \sum_{m=1}^{\infty} V_m$ is of the order of 2% near threshold and smaller outside. Therefore the spectral profile is nearly a Lorentzian with a linewidth (in unnormalized units)
$$\Delta\nu = \sqrt{\beta q}\,\lambda_{10} = \alpha_L\,q/\langle n\rangle; \qquad \alpha_L = \lambda_{10}\langle I\rangle. \qquad (3.25)$$
The decay constant $\lambda_{10}$ is plotted in Fig. 4 as a function of the pump parameter, and in Fig. 5 we have plotted $\alpha_L$ as a function of the pump parameter. The factor $\alpha_L$ varies continuously from 2 to 1 on passing through the threshold region (RISKEN [1966], see also LAX [1967c], LAX and LOUISELL [1967]); thus it connects the linearized laser and oscillator theories below and above threshold (BLAQUIÈRE [1953], GRIVET and BLAQUIÈRE [1963]). The ratio of 2 to 1 occurs because above threshold the laser amplitude is stabilized and therefore only half of the noise power (in the $\varphi$ direction) contributes to the linewidth.

Fig. 4. The eigenvalue $\lambda_{10}$ (solid line) and the asymptotic expansion $\lambda_{10} = |a| + 4/|a|$ for $-a \gg 1$ (broken line) as a function of the pump parameter $a$.

Fig. 5. The linewidth factor $\alpha_L = \lambda_{10}\langle I\rangle$ as a function of the pump parameter $a$.

Fig. 6. The first four matrix elements as functions of the pump parameter $a$ (see (3.22)).

Fig. 7. The first four non-zero eigenvalues $\lambda_{0m}$ and the effective eigenvalue $\lambda_{eff}$ (see (3.26)) as functions of the pump parameter $a$. Far below threshold ($-a \gg 1$) the effective eigenvalue can be approximated by $\lambda_{eff} = 2|a|$.

The correlation function of the intensity fluctuation does not contain only the lowest decay constant, as was approximately the case for the correlation function of the amplitude. One sees in Fig. 6 that approximately 4 terms of the series (3.22) have to be included for pump parameters slightly above threshold. A closer inspection of the potential $V_0(\hat r)$ of the Schroedinger equation shows (see RISKEN and VOLLMER [1967a]) that for high pump parameters the eigenvalues are pairwise degenerate. (See also the plots of these eigenvalues in Fig. 7 and the eigenfunctions and the potential in Fig. 8.) For large pump parameters only one decay constant prevails, in agreement with the linearized laser theories (see HAKEN [1964a]). As was shown by RISKEN and VOLLMER [1967a], one may introduce an effective correlation function and decay constant, defined by
which agrees well with the exact expression (3.22) even slightly above threshold. The effective width $\lambda_{eff}$ is, however, 25% larger than the lowest decay constant for $a \approx 4.5$. This deviation of $\lambda_{eff}$ from $\lambda_{01}$ was substantiated experimentally by measurements of ARECCHI et al. [1967b]. (A table of eigenvalues, matrix elements and $\lambda_{eff}$ is contained in RISKEN [1968].) For unnormalized units the linewidth of the spectrum of the intensity fluctuation is thus given by
$$\Delta\nu_K(a) = \sqrt{\beta q}\,\lambda_{eff}(a).$$
At threshold a typical value of $\Delta\nu_K/(2\pi)$ is 1400 c/s (see ARECCHI et al. [1967c]). Below threshold (a ...

Fig. 8. The potential $V_0$ of the Schroedinger eq. (3.12) and the first five eigenvalues and eigenfunctions for the pump parameter $a = 10$.

... (in photon numbers) and $t$ (in milliseconds) are valid for a threshold photon number 4000 and for a threshold linewidth of the intensity fluctuation 1400 c/s as found by ARECCHI et al. [1967a]. The three indicated regions are: I, region of spontaneous emission; II, region in which spontaneous quanta are amplified by induced emission; III, saturation region. The solution $I(\hat t) = a\,I(0)\exp(2a\hat t)/[a - I(0) + I(0)\exp(2a\hat t)]$ of the normalized eq. (2.16) ($\beta = 1$; $d = a$; $I = b^*b$) without noise is dotted in for $a = 8$. The initial value $I(0)$ was chosen in such a manner that the solutions with and without noise agreed at the end of region II.
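The closed-form noise-free solution quoted in the caption can be checked against a rate equation. The sketch below assumes that the normalized noise-free equation reads $dI/d\hat t = 2(a - I)I$; this form is inferred here from the quoted solution and is not copied from eq. (2.16), which lies outside this section. The function names and the initial value are illustrative.

```python
import math

def noise_free_intensity(t, a, i0):
    """Closed-form solution quoted in the Fig. 10 caption (normalized units)."""
    e = math.exp(2.0 * a * t)
    return a * i0 * e / (a - i0 + i0 * e)

def assumed_rate(i, a):
    """Assumed noise-free rate equation dI/dt = 2 (a - I) I (inferred, see lead-in)."""
    return 2.0 * (a - i) * i

def max_relative_residual(a, i0, times, h=1e-6):
    """Largest relative mismatch between a central-difference derivative
    of the closed form and the assumed rate equation."""
    worst = 0.0
    for t in times:
        deriv = (noise_free_intensity(t + h, a, i0)
                 - noise_free_intensity(t - h, a, i0)) / (2.0 * h)
        rate = assumed_rate(noise_free_intensity(t, a, i0), a)
        worst = max(worst, abs(deriv - rate) / abs(rate))
    return worst

residual = max_relative_residual(a=8.0, i0=1e-3, times=[0.1, 0.4, 0.8])
```

The solution starts at $I(0)$, grows exponentially with rate $2a$ while $I \ll a$, and saturates at $I = a$, which corresponds to regions II and III of the figure.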
and the distribution function can thus be approximated by the Gaussian function
The distribution (3.30) does not depend on the phase $\varphi$ because the initial distribution does not depend on it. The transient distribution of the intensity (3.33) is shown in Fig. 9. In Fig. 10 and Fig. 11 the transient mean intensity $\langle I(\hat t)\rangle$ and the transient variance $\langle (I(\hat t) - \langle I(\hat t)\rangle)^2\rangle$ are plotted. The moments $\langle I^k(\hat t)\rangle$ obey the equations ($I^0 = 1$, $k \geq 1$)
$$\frac{d}{d\hat t}\langle I^k(\hat t)\rangle = 2ka\,\langle I^k(\hat t)\rangle - 2k\,\langle I^{k+1}(\hat t)\rangle + 4k^2\,\langle I^{k-1}(\hat t)\rangle, \qquad (3.34)$$
which can be derived directly from the Langevin eq. (2.20) or the Fokker-Planck eq. (2.36). The initial condition that no photons are present for $\hat t = 0$ is
$$\langle I^k(0)\rangle = 0, \qquad k \geq 1. \qquad (3.35)$$
Using (3.34) and (3.35) the first two coefficients of the Taylor expansion for $\langle I(\hat t)\rangle$ are easily found to be
$$\langle I(\hat t)\rangle = 4\hat t + 4a\hat t^2 + \ldots \qquad (3.36)$$
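The Taylor coefficients in (3.36) can be generated mechanically from the moment hierarchy. The following is a minimal sketch, assuming the hierarchy (3.34) and the initial condition (3.35) as written above: since $\langle I^k\rangle$ starts at zero for $k \geq 1$, it is of order $\hat t^k$, so the power-series recursion closes exactly at any finite order. The function name and array layout are illustrative.

```python
def mean_intensity_taylor(a, order):
    """Taylor coefficients c_0 .. c_order of <I(t)> from the hierarchy (3.34).

    c[k][j] is the coefficient of t^j in <I^k(t)>. Because <I^k(0)> = 0 for
    k >= 1, c[k][j] vanishes for j < k, so moments above order + 1 never
    contribute and the truncation is exact.
    """
    kmax = order + 1
    c = [[0.0] * (order + 1) for _ in range(kmax + 2)]
    c[0][0] = 1.0                                  # <I^0> = 1 for all times
    for j in range(order):
        for k in range(1, kmax + 1):
            rhs = (2 * k * a * c[k][j]
                   - 2 * k * c[k + 1][j]
                   + 4 * k * k * c[k - 1][j])
            c[k][j + 1] = rhs / (j + 1)            # d/dt of t^(j+1) gives (j+1) t^j
    return c[1]

coeffs = mean_intensity_taylor(a=2.0, order=3)
```

For any pump parameter the first two non-zero coefficients are 4 and $4a$, in agreement with (3.36); the recursion also yields a cubic coefficient $(8a^2 - 64)/3$, which is a consequence of (3.34) as reconstructed here rather than a value stated in the text.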
Thus, in normalized units, all curves of the transient mean intensity start with the same finite slope. In order to find the physical significance of this effect, we rewrite the expression in unnormalized quantities. We obtain (neglecting the number of thermal quanta)
$$\langle I(t)\rangle = A N_2 t + B (N_2 - N_1) N_2 t^2 - B (N_2 - N_1)_{thr} N_2 t^2 + \ldots \qquad (3.37)$$
$$A = 2g^2/\gamma_2; \qquad B = 2g^4/\gamma_2^2; \qquad (N_2 - N_1)_{thr} = \kappa\gamma_2/g^2. \qquad (3.38)$$
Fig. 11. The transient variance of the distribution (3.33) as a function of time for various pump parameters. The unnormalized intensity is valid for the same threshold values as in Fig. 10.
We see that the term linear in $t$ stems from the spontaneous emission rate. The second term stems from the spontaneous emission which is amplified by induced emission. The third term describes the losses of the spontaneous quanta due to their finite lifetime in the cavity. Above threshold, the sum of the second and third terms, which is quadratic in time, is positive. Higher expansion terms of (3.37) stem from induced emission and from the change of inversion. Thus we can mainly distinguish three regions: spontaneous emission (region I), amplification of spontaneous emission by induced emission (region II), and saturation effects of the inversion (region III). This is also shown in Fig. 10.
The transient mean squared deviation or variance has the following interesting property above threshold ($a > 0$): it reaches a maximum value at the beginning of the saturation region III, as shown in Fig. 11. This means of course that at this time the spread of the distribution function is largest (see Fig. 9). For larger times the variance becomes smaller and finally reaches its stationary value. This can be physically interpreted as follows. Since in the switching-on process the spontaneous photons are amplified, small fluctuations of the spontaneous photons lead, because of the large amplification, to large fluctuations at the beginning of the saturated region III. Because of the strong stabilization by the nonlinearity, these fluctuations are then diminished and finally reach their relatively low values. At threshold ($a = 0$) no amplification occurs. Due to the low stabilization effect of the nonlinearity for $a = 0$, the variance reaches its largest value for $t \to \infty$. No maximum occurs for finite $t$ and $a < 0$. The transient calculations are in agreement with measurements performed by ARECCHI et al. [1967a].
§ 4. Photoelectron Counting Distribution
4.1. GENERAL RELATIONSHIPS BETWEEN THE PHOTOELECTRON DISTRIBUTION AND INTENSITY DISTRIBUTION
The distribution function of the light field inside the laser cavity is not measured directly. Usually one measures the intensity outside the laser cavity with a photon detector. Because only a small fraction of the light intensity is transmitted by the mirror and finally absorbed by the detector, one usually counts only a few photoelectrons in a given time interval $T$. For this reason one cannot directly apply the results of § 3, where the intensity was treated as a continuous variable, but must look for the discrete distribution of the photoelectrons.
The connection between the continuous intensity distribution and the discrete photoelectron distribution was given by MANDEL [1958, 1963]. (See also KELLEY and KLEINER [1964], MANDEL and WOLF [1965] and § 5.3 for a quantum mechanical derivation.) His result is that the probability $p(n, T, t)$ of finding $n$ photoelectrons in the time interval $t$, $t+T$ for a given intensity $I(t)$ is given by
$$p(n, T, t) = \frac{1}{n!}\Big[\alpha'\int_t^{t+T} I(t')\,dt'\Big]^n \exp\Big\{-\alpha'\int_t^{t+T} I(t')\,dt'\Big\}. \qquad (4.1)$$
In (4.1), $\alpha'$ is the factor relating the average number of photoelectrons in the interval $T$ and the time averaged intensity, i.e.,
$$\bar n = \sum_{n=0}^{\infty} n\,p(n, T, t) = \alpha'\int_t^{t+T} I(t')\,dt'. \qquad (4.2)$$
The factor $\alpha'$ is determined by the mirror transmittance, by a geometrical factor giving the fraction of the intensity which falls on the photodetector, and by the quantum efficiency of the photocounter. Since $I(t)$ is not a given quantity but a stochastic variable, one must average (4.1) over the distribution of this stochastic variable in order to get the measured photon distribution, i.e.,
$$P(n, T) = \langle p(n, T, t)\rangle. \qquad (4.3)$$
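The averaging in (4.3) can be illustrated numerically for a short counting interval, where the integrated intensity is simply $\alpha' T I$. The sketch below is not the procedure of the original text: it assumes a stationary intensity density proportional to $\exp(-I^2/4 + aI/2)$ (the distribution (3.2) rewritten in the variable $I = \hat r^2$), discretizes it on a grid, and averages the Poisson weights of (4.1). All function names, the grid parameters and the value of `alpha_T` are illustrative.

```python
import math

def mandel_poisson(n, w):
    """p(n) of (4.1) for a fixed integrated intensity w = alpha' * integral of I."""
    return math.exp(-w) * w**n / math.factorial(n)

def laser_intensity_grid(a, i_max=30.0, steps=3000):
    """Assumed stationary intensity density ~ exp(-I^2/4 + a I/2) on a midpoint grid."""
    h = i_max / steps
    xs = [(k + 0.5) * h for k in range(steps)]
    ws = [math.exp(-x * x / 4.0 + a * x / 2.0) for x in xs]
    return ws, xs

def counting_distribution(n, weights, intensities, alpha_T):
    """Average of (4.1) over the discretized intensity distribution, as in (4.3)."""
    total = sum(weights)
    acc = sum(wt * mandel_poisson(n, alpha_T * i)
              for wt, i in zip(weights, intensities))
    return acc / total

ws, xs = laser_intensity_grid(a=2.0)
probs = [counting_distribution(n, ws, xs, alpha_T=0.5) for n in range(80)]
mean_count = sum(n * p for n, p in enumerate(probs))
mean_intensity = sum(w * x for w, x in zip(ws, xs)) / sum(ws)
```

Two consistency checks follow directly from (4.1) and (4.2): the averaged probabilities sum to one, and the mean count equals $\alpha' T$ times the mean intensity.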
In (4.3) $\langle\ \rangle$ denotes the averaging process of the stochastic light intensity. The $N$-fold photoelectron counting distribution $P(n_1, T_1; n_2, T_2; \ldots; n_N, T_N)$ of finding in counter 1, $n_1$ photoelectrons in the interval $(t_1, t_1+T_1)$, in counter 2, $n_2$ photoelectrons in the interval $(t_2, t_2+T_2)$, and so on reads
$$P(n_1, T_1; n_2, T_2; \ldots; n_N, T_N) = \langle p(n_1, T_1, t_1)\,p(n_2, T_2, t_2) \cdots p(n_N, T_N, t_N)\rangle, \qquad (4.4)$$
see BEDARD [1967b]. Generally, the calculation of the average in (4.3) and (4.4) is complicated, mainly because one needs the joint distribution function for all $I(t')$ with $\min t_i \leq t' \leq \max (t_i + T_i)$, i.e., a joint distribution of infinite order.
4.2. COUNTING DISTRIBUTION FOR SHORT INTERVALS
In this subsection we assume that the time interval T is short compared to the time in which the intensity changes its value appreciably
$$\rho = \mathrm{Tr}_R\{\rho_{tot}\} \qquad (5.11)$$
where all the reservoir variables are averaged over. The equation for $\rho$ is the master equation and is derived by the following steps. First one assumes that the total density matrix factorizes at a certain time $t_0$ into a density operator of the atoms and of the light field times a density operator containing the reservoir variables only,
$$\rho_{tot}(t_0) = \rho(t_0)\,\rho_R(t_0). \qquad (5.12)$$
An iteration procedure for $\rho_{tot}(t)$ leads after two iteration steps to
$$\dot\rho^{(2)}_{tot}(t) = -\frac{i}{\hbar}\,[H_I(t), \rho_{tot}(t_0)] + \Big(-\frac{i}{\hbar}\Big)^2 \int_{t_0}^{t} d\tau\,[H_I(t), [H_I(\tau), \rho_{tot}(t_0)]]. \qquad (5.13)$$
The master equation for $\rho$ is obtained by taking the trace of (5.13) with respect to the reservoir system, then setting it equal to the time derivative of $\rho$ and letting $t$ go to $t_0$,
$$\dot\rho(t_0) = \lim_{t \to t_0} \mathrm{Tr}_R\{\dot\rho^{(2)}_{tot}(t)\}. \qquad (5.14)$$
The term containing the simple commutator in (5.14) vanishes for interaction operators containing reservoir variables, and the double commutator vanishes for all terms except those where both interaction operators contain variables of the same reservoir. This follows immediately from relations of the form $\mathrm{Tr}_R\{c_\mu \rho_R\} = \mathrm{Tr}_R\{c_\mu^+ c_{\mu'} \rho_R\} = 0$ ($\mu \neq \mu'$). Therefore the master equation (5.14) can be reduced to
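Inserting (5.9) into (5.16), neglecting antiresonant terms (i.e., terms with a time dependence of the form $\exp\{i(\omega_0 + \Omega_\mu)t\}$), and using
$$\sum_\mu g_\mu^2 \exp\{i(\Omega_\mu - \omega_0)(t - \tau)\} = 2\kappa\,\delta(t - \tau), \qquad (5.18)$$
$$\mathrm{Tr}_R\{c_\mu^+ c_\mu\,\rho_R(t_0)\} = n_{th}; \qquad \mathrm{Tr}_R\{c_\mu c_\mu^+\,\rho_R(t_0)\} = n_{th} + 1, \qquad (5.19)$$
we finally end up with
$$\Big(\frac{\partial\rho}{\partial t}\Big)_L = \kappa(n_{th}+1)\{[b\rho, b^+] + [b, \rho b^+]\} + \kappa n_{th}\{[b^+\rho, b] + [b^+, \rho b]\} = \kappa\{[b\rho, b^+] + [b, \rho b^+]\} + 2\kappa n_{th}[[b, \rho], b^+]. \qquad (5.20)$$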
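A quick consistency check of (5.20) is that the thermal (Bose-Einstein) photon distribution should be stationary. The sketch below works only with the diagonal elements $p_n = \langle n|\rho|n\rangle$; the rate equation used in the code is derived here from (5.20) by taking matrix elements in the number basis and is not quoted from the text. The truncation size and parameter values are illustrative.

```python
def damping_rates(p, kappa, n_th):
    """dp_n/dt from the diagonal part of (5.20) in a truncated number basis.

    Derived form (an assumption of this sketch):
      dp_n/dt = 2 kappa (n_th + 1) [(n + 1) p_{n+1} - n p_n]
              + 2 kappa n_th       [n p_{n-1} - (n + 1) p_n]
    Transitions that would leave the truncated basis are switched off.
    """
    size = len(p)
    dp = []
    for n in range(size):
        above = p[n + 1] if n + 1 < size else 0.0
        below = p[n - 1] if n > 0 else 0.0
        upward = (n + 1) if n + 1 < size else 0    # no jump out of the basis
        dp.append(2.0 * kappa * (n_th + 1.0) * ((n + 1) * above - n * p[n])
                  + 2.0 * kappa * n_th * (n * below - upward * p[n]))
    return dp

def bose_einstein(n_th, size):
    """Normalized thermal distribution p_n ~ (n_th / (n_th + 1))^n."""
    x = n_th / (n_th + 1.0)
    raw = [x**n for n in range(size)]
    total = sum(raw)
    return [v / total for v in raw]

thermal = bose_einstein(n_th=0.7, size=40)
thermal_rates = damping_rates(thermal, kappa=1.0, n_th=0.7)

# A three-photon number state at zero temperature loses energy at rate 2 kappa n.
fock3 = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
energy_rate = sum(n * r for n, r in enumerate(damping_rates(fock3, 1.0, 0.0)))
```

With these conventions the photon number decays as $d\langle n\rangle/dt = -2\kappa(\langle n\rangle - n_{th})$, so $\kappa$ plays the role of the amplitude damping constant.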
The $\delta$-function in (5.18) appears because the frequencies $\Omega_\mu$ are assumed to be continuously distributed in the reservoir, and therefore the summation over $\mu$ can be replaced by an integration. The constant $\kappa$ is the damping constant of the resonator, and $n_{th} = [\exp(\hbar\omega_0/kT) - 1]^{-1}$ is the occupation number of photons with frequency $\Omega_\mu = \omega_0$ of the reservoir of temperature $T$. Because the reservoirs are assumed to be infinitely large, $n_{th}$ does not change with time. The evaluation of (5.17) is made in a similar manner. Neglecting antiresonant terms, the equations corresponding to (5.18) and (5.19) read in this case
$$\mathrm{Tr}_R\{W^{(\mu)}_{ij}(t)\,W^{(\mu)}_{kl}(\tau)\,\rho_R(t_0)\} = \gamma_{ij}\,\delta_{ik}\,\delta_{jl}\,\delta(t - \tau). \qquad (5.21)$$
In deriving (5.21), we have assumed that the energy levels are distributed irregularly in such a manner that the equation $\varepsilon_i - \varepsilon_j = \varepsilon_k - \varepsilon_l$ is only solvable for $i = k$, $j = l$. The $\gamma_{ji}$ are reservoir constants; they describe real transitions from level $j \to i$ for $j \neq i$ and virtual, phase destroying transitions for $j = i$. After considering (5.21) and the corresponding expressions in which the factors in the trace are changed cyclically, we finally end up with
For a three-level system eq. (5.22) was derived by WEIDLICH and HAAKE [1965b]. The time change of the expectation value $\langle A_{ij}\rangle = \mathrm{Tr}\{A_{ij}\rho\}$ due to the terms (5.22) is
$$\frac{d}{dt}\langle A_{ij}\rangle = \sum_r \gamma_{rj}\langle A_{rr}\rangle\,\delta_{ij} - \tfrac{1}{2}\sum_r (\gamma_{ir} + \gamma_{jr})\langle A_{ij}\rangle. \qquad (5.23)$$
Thus the $\gamma_{ij}$ describe transitions from the level $i$ to the level $j$. For $i = j$, (5.23) are the rate equations for the occupation numbers. Terms of the form $\gamma_{ii}$ only enter in the nondiagonal terms $\langle A_{ij}\rangle$ ($j \neq i$), where they describe damping by virtual processes additional to the real transitions. For the real transitions only outgoing processes are responsible for damping, as shown in Fig. 17. (See also SCHMID and RISKEN [1966].) For a two-level system (5.22) reads explicitly

Fig. 17. Transitions leading to a damping of the off-diagonal element $A_{ij}$ (solid lines) and transitions leading to no damping of the off-diagonal element $A_{ij}$ (broken lines).
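$$\Big(\frac{\partial\rho}{\partial t}\Big)_A = \sum_\mu \Big\{ \tfrac{1}{2}\gamma_{21}\big([A^{(\mu)}_{12}\rho, A^{(\mu)}_{21}] + [A^{(\mu)}_{12}, \rho A^{(\mu)}_{21}]\big) + \tfrac{1}{2}\gamma_{12}\big([A^{(\mu)}_{21}\rho, A^{(\mu)}_{12}] + [A^{(\mu)}_{21}, \rho A^{(\mu)}_{12}]\big) + \tfrac{1}{2}\gamma_{11}\big([A^{(\mu)}_{11}\rho, A^{(\mu)}_{11}] + [A^{(\mu)}_{11}, \rho A^{(\mu)}_{11}]\big) + \tfrac{1}{2}\gamma_{22}\big([A^{(\mu)}_{22}\rho, A^{(\mu)}_{22}] + [A^{(\mu)}_{22}, \rho A^{(\mu)}_{22}]\big) \Big\}. \qquad (5.24)$$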
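For a two-level system it is convenient to introduce spin operators
$$A^{(\mu)}_{21} = s^+_\mu; \quad A^{(\mu)}_{12} = s^-_\mu; \quad A^{(\mu)}_{22} + A^{(\mu)}_{11} = 1; \quad \tfrac{1}{2}(A^{(\mu)}_{22} - A^{(\mu)}_{11}) = s_{z\mu}; \quad A^{(\mu)}_{22} = \tfrac{1}{2} + s_{z\mu}; \quad A^{(\mu)}_{11} = \tfrac{1}{2} - s_{z\mu}. \qquad (5.25)$$
The product of two operators can be reduced to one operator according to (see also LAX [1963])
$$A^{(\mu)}_{ij} A^{(\mu)}_{kl} = A^{(\mu)}_{il}\,\delta_{jk}. \qquad (5.26)$$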
In a two-level system the reduction relations of the spin operators
$$s^-_\mu s^+_\mu = \tfrac{1}{2} - s_{z\mu}; \qquad s^+_\mu s^-_\mu = \tfrac{1}{2} + s_{z\mu} \qquad (5.27)$$
correspond to (5.26). Thus the total master equation for a two-level, resonant system ($\varepsilon_2 - \varepsilon_1 = \hbar\omega_0$), where anti-resonant terms are neglected, finally reads (5.28)
5.3. LASER MASTER EQUATION NEAR THRESHOLD
The purpose of this section is to derive a master equation containing only the light field. To this end we introduce the following expressions, which depend only on the light operators $b$, $b^+$ (the index $A$ indicates averaging over the atoms):
$$\rho = \mathrm{Tr}_A\{\rho\}, \qquad \rho^{\pm} = \frac{1}{N}\,\mathrm{Tr}_A\Big\{\sum_{\mu=1}^{N} s^{\pm}_\mu\,\rho\Big\}, \qquad \rho_z = \frac{1}{N}\,\mathrm{Tr}_A\Big\{\sum_{\mu=1}^{N} 2 s_{z\mu}\,\rho\Big\}. \qquad (5.32)$$
In (5.32) $\rho$ is the density operator of the light field alone; $\rho^{\pm}$ and $\rho_z$ are the expectation values of the positive and negative frequency parts of the polarization and of the inversion. Using the master equation (5.28) for a two-level system, the following equations for $\rho$, $\rho^{\pm}$, $\rho_z$ can be derived:
(5.37)
(b+Tr,(sLs; p}-TrA{sLs; p)b++ b Tr,(s;s;
p) -Tr,{sis;
p}b) (5.38)
PlfV
in (5.34) and similar terms in (5.35) and (5.36). This neglect is similar to the procedure of WEIDLICH et al. [1967a], where the corresponding expansions of the density operator were neglected. However, the difference is that $\rho$, $\rho^{\pm}$, $\rho_z$ are still operators in the light field domain and thus lead to results that are equivalent for normal and antinormal distribution functions. A next step of the present method would be to include terms of the form (5.38) but neglect expectation values of the atomic system of spin operators belonging to three different atoms. For further considerations we restrict ourselves to the case of small K ( K
$$\rho = \int W_n(u, u^*)\,|u\rangle\langle u|\,d^2u, \qquad (5.45)$$
where the $|u\rangle$ are the eigenstates of the annihilation operator
$$b\,|u\rangle = u\,|u\rangle. \qquad (5.46)$$
Note that $W_n$ is not necessarily positive. It therefore does not have the usual meaning of a probability distribution, but serves merely as a tool to calculate all quantum mechanical expectation values in the c-number domain. Continuous functions similar to (5.42) were first introduced by WIGNER [1932] and further investigated by MOYAL [1949]. The continuous functions used here were introduced and studied in detail by KLAUDER [1960], GLAUBER [1963a, b] and SUDARSHAN [1963]. (For a review see the book of KLAUDER and SUDARSHAN [1968]; for recent investigations see the paper of AGARWAL and WOLF [1968].)
The next task is to find an equation of motion for the distribution functions $W_{n,a}$ from the master eqs. (5.39) and (5.40). For this purpose we multiply (5.39) and (5.40) with $O_{n,a}$ and take the trace. By a proper cyclical permutation of the factors under the trace and by using
$$[b, O_{n,a}] = i\alpha^*\,O_{n,a}; \qquad [O_{n,a}, b^+] = i\alpha\,O_{n,a},$$
$$\mathrm{Tr}\{b^+ O_n b\,\rho\} = \frac{\partial^2}{\partial(i\alpha)\,\partial(i\alpha^*)}\,f_n; \qquad \mathrm{Tr}\{b\,O_a b^+ \rho\} = \frac{\partial^2}{\partial(i\alpha)\,\partial(i\alpha^*)}\,f_a$$
we finally end up with the following equation
,g,
a
[. LU+iu* aiu* f i a ia*+2 1f,, (5.49) a a g2 ia-+iu*) +ia ia*]gna. [4 ‘2j aia aia* g2
= oAfna- __
iu-
__
YlYZ
a2
+l
__ Y1Y2
The equation for the distribution function $W_{n,a}$ follows from (5.48) and (5.49) by the replacement
$$i\alpha \to -\partial/\partial u; \qquad i\alpha^* \to -\partial/\partial u^*; \qquad \partial/\partial(i\alpha) \to u; \qquad \partial/\partial(i\alpha^*) \to u^*. \qquad (5.50)$$
Using real variables in vector notation
$$u_1 = \tfrac{1}{2}(u + u^*), \qquad u_2 = \tfrac{1}{2i}(u - u^*), \qquad \mathbf{u} = \{u_1, u_2\}, \qquad (5.51)$$
the master equation for the distribution function has the form
+ V (urx2N G ,,-KW~,])
=
at
(5.53) The upper (lower) signs are valid for the normal (antinormal) distribution function $W$. Because of the denominator in (5.53), the master equation contains derivatives up to infinite order. In the next step, we evaluate the denominator in (5.53). In order to see the magnitudes of the different terms, it is convenient to introduce the normalized coordinates (2.35), where $\beta$, $q$ and $d$ are now defined by ($\sigma_0 = \sigma_A N = (N_2 - N_1)$ is the total inversion)
B = 4 g 2 ~ / ( ~ 1 y 2 ) ;= g2N/(4y2), g2(Ni-NY)/y2 = 4q0, = K+,I?d,
-
u =dp/qd.
(5.54)
For the sake of simplicity we further put nth = 0. In the normalized equation, terms of the form U, Vii, d,JU/2 are all of the same ordernear threshold. Retaining only terms up to the order d ( p / q ) , we obtain
(a/ai)wna + ~ { u [ a - - l u l 2 ] ~ ~ , } - = d~~~ dPx{ { lu 1 '/ (4c+4)f ( -vu)/ (40,) Q ( f u [ (a I U I WnaI 1. - I U 1 */ (4uA) IWna}
-
-
)
(5.55)
$\sqrt{q/\beta}$ is approximately the number of photons at threshold and is, for a typical laser (ARECCHI et al. [1967c]), of the order 4000. Therefore the Fokker-Planck equation (2.36) is a very good approximation. The right-hand side of (5.55) contains the first order quantum mechanical corrections to this Fokker-Planck equation. (By evaluating the denominator in (5.53) further, higher order quantum corrections can be found.) Some of these corrections have different signs for $W_n$ and $W_a$ and thus lead to slightly different solutions for $W_n$ and $W_a$, which the classical Fokker-Planck eq. (2.36) cannot describe. In the Fokker-Planck eq. (2.36), the $q$ value was defined ($n_{th} = 0$) by $q = g^2 N_2/(2\gamma_2)$. The differences between this definition and (5.54) lead to terms of the order $\sqrt{\beta/q}$ and thus are irrelevant in the classical Fokker-Planck eq. (2.36). It is worthwhile to mention that in the quantum corrections a term of the form $\nabla\{\hat u\,|\hat u|^4\,W\}$ appears. This stems from the second expansion term of the denominator in (5.53). In unnormalized coordinates the distribution functions $W_n$ and $W_a$ are connected by (see GLAUBER [1963a, b]; $W_a(u) = \langle u|\rho|u\rangle/\pi$)
$$W_a(u) = \frac{1}{\pi}\int e^{-|v|^2}\,W_n(u + v)\,d^2v, \qquad (5.56)$$
which reduces, in first order quantum corrections and in normalized coordinates, to
$$W_a(\hat u) = W_n(\hat u) + \tfrac{1}{4}\sqrt{\beta/q}\;\hat\Delta\,W_n(\hat u). \qquad (5.57)$$
Because the difference between $W_n$ and $W_a$ is of the same order as the right-hand side of eq. (5.55), there is no need for distinguishing between $W_n$ and $W_a$ in the semiclassical limit, i.e., in the case where the right-hand side of eq. (5.55) is neglected. By a somewhat lengthy calculation it can be shown that the solutions $W_n$ and $W_a$ of (5.55) obey the relation (5.57) to first order corrections for all times. Eq. (5.55) can be solved in a similar manner as the Fokker-Planck eq. (2.36). A perturbation procedure seems to be particularly appropriate for expressing the new eigenfunctions and eigenvalues in terms of the old ones. In this work we want to solve the more general eq. (5.55) only in the stationary state. The stationary distribution function depends only on the field amplitude $\hat r$. As in § 3 we can introduce a probability current $S$. The relation $S = 0$ reads now
(a-P )f -W’,,/ Wna-fd/rB/4{f
+ ($-$a)
W’/,, W },,
+fl9/(4c~~){-aT~+P~+PW’~,/W =, ,0. }
(5.58)
Because we want to calculate only first order quantum corrections, we may insert the classical result W’/W in the curly parenthesis. The coefficient of the term . \ / c / o A then disappears and we obtain
w ,,= ( 2 n ) - 1 , ~,{1h$v‘a P[~-+(P-U)~I)
exp ( - ~ P + + u P } . (5.59)
Fig. 18. The normal ($W_n$) and antinormal ($W_a$) ordered stationary distribution functions (5.59) of the laser master equation near threshold (5.39), (5.40), for a pump parameter $a = 3$ and a threshold photon number $\sqrt{q/\beta} = 20$. The solution $W_{st}$ which follows from the semiclassical calculations is also shown. For a threshold photon number $\sqrt{q/\beta} = 4000$ as measured by ARECCHI et al. [1967c], the differences between the three distribution functions are down by a factor of 200 and lie, in the drawing, completely inside the line thickness of $W_{st}$.
The normalization constant JV normalization constant of $ 3 Jlr a,
nL
can be expressed by the uncorrected
=N ( l . f i a d 6 ) .
(5.60)
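In Fig. 18 the distribution functions $W_{n,a}$ are plotted. Using the integral (3.5) and the recursion relation (3.7), we obtain
$$2\pi\int \hat r^2\,W_{n,a}\,\hat r\,d\hat r = \langle I\rangle \mp \tfrac{1}{2}\sqrt{\beta/q} \qquad (5.61)$$
or in unnormalized coordinates
$$\langle b^+ b\rangle = \sqrt{q/\beta}\,\langle I\rangle - \tfrac{1}{2}, \qquad \langle b\,b^+\rangle = \sqrt{q/\beta}\,\langle I\rangle + \tfrac{1}{2}, \qquad (5.62)$$
where $\langle I\rangle$ is the uncorrected moment of § 3. It should be recalled that only quantum corrections up to first order can be calculated with (5.59). For instance we obtain
$$\langle b^+ b^+ b\,b\rangle = (q/\beta)\langle I^2\rangle - 2\sqrt{q/\beta}\,\langle I\rangle, \qquad \langle b\,b\,b^+ b^+\rangle = (q/\beta)\langle I^2\rangle + 2\sqrt{q/\beta}\,\langle I\rangle, \qquad (5.63)$$
which are correct to first order but not to second order. Accidentally the commutation relation
$$\langle b\,b\,b^+ b^+ - b^+ b^+ b\,b\rangle = 4\langle b^+ b\rangle + 2 \qquad (5.64)$$
is even fulfilled for the second order term 2.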
Because $\sqrt{\beta/q}$ is of the order 1/4000, the difference between $W_n$ and $W_a$ is very small near threshold, i.e., where $\langle \hat r^2\rangle$ is of the order one. For much larger pumping parameters, where for instance $\langle \hat r^2\rangle\sqrt{\beta/q}$ is of the order one (the photon number is then the square of the photon number at threshold), quantum corrections may play an important role. For this case more expansion terms of the denominator in eq. (5.53) must be used.

Photon counting distribution
Using the Glauber P-representation (5.45) of the density operator we obtain
$$p(n) = \langle n|\rho|n\rangle = \int |\langle n|u\rangle|^2\,W_n(u)\,d^2u = \int \frac{|u|^{2n}}{n!}\,e^{-|u|^2}\,W_n(u)\,d^2u. \qquad (5.65)$$
Since usually only a small fraction of the intensity inside the laser falls on the photon detector during a time interval $T$, the intensity is attenuated by a factor $\alpha' T$. Thus instead of $W_n(u)$, $W_n(u/\sqrt{\alpha' T})$ must be inserted (see also GLAUBER [1966], ARECCHI and DEGIORGIO [1968]). Then (5.65) is exactly Mandel's equation for the photoelectron counting distribution for short intervals (4.8). For normalized coordinates, we thus obtain by using (5.59), (5.65), (4.9) and (3.5) Vn
P (n) = n!F,o{ (1-%ad%)F , (a -2Y) +dfi [*-+a21 F,+l +%a4%iFn+z(a-W
-
&dB/qFv+,(a-2v)),
(a-2v) (5.66)
which differs from (4.10) by the small correction terms of the order $\sqrt{\beta/q}$. SCULLY and LAMB [1967a] have derived a master equation in the occupation number representation for the light field alone. Writing their equation in operator form and applying the same technique, we find that without quantum corrections their equation agrees exactly with our Fokker-Planck eq. (2.36). The first order quantum corrections disagree with the ones derived here. This may be attributed to the fact that these authors use a slightly different pumping model (not an ideal two-level system with a constant number of atoms as used here).
5.4. c-NUMBER EQUATION OF THE LASER MASTER EQUATION
In § 5.3 we have seen that the solution of the operator eqs. (5.39), (5.40) can be reduced to the solution of the c-number equation (5.52), (5.53). The operator $\rho$ is then given by (5.45). Following a method of HAKEN et al. [1967] (see also GORDON [1967], LAX [1967b], and LAX and LOUISELL [1967]), we derive in this section a c-number equation of the master eq. (5.28), which now contains all the atomic variables too. However, as we will see below, we do not need all $N$ atomic variables, but only the ones which refer to macroscopic quantities, namely the total atomic dipole moment and the total inversion
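$$S^+ = \sum_\mu s^+_\mu, \qquad S^- = \sum_\mu s^-_\mu, \qquad S_z = \sum_\mu s_{z\mu}. \qquad (5.67)$$
We define the distribution function $f$ by
$$f(u, u^*, v, v^*, I; t) = \mathcal{N}' \int\!\cdots\!\int \exp\{-iv\xi - iv^*\xi^* - iI\zeta - iu\alpha - iu^*\alpha^*\}\;F(\xi, \xi^*, \zeta, \alpha, \alpha^*, t)\,d^2\xi\,d^2\alpha\,d\zeta \qquad (5.68)$$
where
$$F = \mathrm{Tr}\{e^{i\xi S^-} e^{i\zeta S_z} e^{i\xi^* S^+} e^{i\alpha^* b^+} e^{i\alpha b}\,\rho(t)\} = \mathrm{Tr}\{O\rho\}. \qquad (5.69)$$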
(For the sake of simplicity we only treat the normally ordered distribution function.) For the following analysis, it is convenient to use a decomposition of the operator $O$ in the form
$$O = O_A O_L; \qquad O_A = \prod_{\mu=1}^{N} e^{i\xi s^-_\mu} e^{i\zeta s_{z\mu}} e^{i\xi^* s^+_\mu}, \qquad O_L = e^{i\alpha^* b^+} e^{i\alpha b}. \qquad (5.70)$$
$\mathcal{N}'$ is a normalization constant, so that $\int f\,d^2u\,d^2v\,dI = 1$. $u$, $u^*$ are the macroscopic variables associated with the light field operators $b$, $b^+$; $v$, $v^*$ and $I$ are quantities corresponding to the total complex dipole moments $S^-$ or $S^+$ and the inversion $S_z$, respectively. The definition (5.68) ensures that $f$ is real provided $\rho$ is a hermitian operator. Expectation values of the light operator can be obtained as in § 5.3 by the first part of eq. (5.43). The same procedure may be applied to expectation values which are functions of the macroscopic variables $S^+$, $S^-$, $S_z$. If the resulting order of the operators does not agree with the order wanted, the usual commutation relations must be applied. Multiplying (5.28) with the operator $O$ and taking the trace we have (5.71)
with (5.72) (5.73) (5.74)
Using (5.30) we find
N
fYl2
[ 2 Tr~s,O*,spa,P) -Tr{s,sp*,oa,d
TrP*,s,spa,
-
+
(711+Y22)
['
PI1
Tr{s,,oA~cs~/~o~,p)~Tr{s~~UA~o'*~p>
-Tr{o.4pS~po'App)l
with 0
(5.75)
OA,O'*,;
oa,
=
JJ O,,O,.
1 (5.76)
v#a
Using the property of the trace that an operator product under it may be rearranged in a cyclic manner, we have brought the $\rho$ in every term to the right-hand side. Because $s^-_\mu$, $s^+_\mu$ and $s_{z\mu}$ obey eq. (5.27), we may express terms such as $s^-_\mu s^+_\mu$ and $s^+_\mu s^-_\mu$ by the single operators. For our further analysis it is most important that the operators which occur in front of the density matrix under the trace can be expressed as linear combinations of derivatives of the operator $O_{A\mu}$ with respect to the variables $\xi$, $\xi^*$, $\zeta$ and of the operator $O_{A\mu}$ itself,
$$O_{A\mu} = e^{i\xi s^-_\mu} e^{i\zeta s_{z\mu}} e^{i\xi^* s^+_\mu}. \qquad (5.77)$$
We exhibit these linear combinations explicitly by the following formulas:
s+O P .4P s-P
=
[+e-'c++eic(it)2(iE*)2-t
(it)(it*)]O,,
(5.78)
+ [2 ( e d -
(5.79)
We now calculate the term of eq. (5.73) which stems from the interaction between the field and the atoms. Note that in this interaction only the operators $S^+$, $S^-$ of the total atomic dipole moment occur, so that we may immediately write
$$\Big(\frac{\partial F}{\partial t}\Big)_{AL} = -ig\,\{\mathrm{Tr}\{[O S^+ b + O S^- b^+]\rho\} - \mathrm{Tr}\{[S^+ b\,O + S^- b^+ O]\rho\}\}. \qquad (5.80)$$
Using the operator relations
(5.81)
we find
aF
(z)AL,
=
-ig
a a (-a(il*) a(itc) ~
-
~
a a($)
a ~
a(itc*)
(5.82)
The field term was already calculated in § 5.3 (put $g = 0$ in (5.48)):
$$\Big(\frac{\partial F}{\partial t}\Big)_L = -\kappa\Big[i\alpha\,\frac{\partial}{\partial(i\alpha)} + i\alpha^*\,\frac{\partial}{\partial(i\alpha^*)}\Big]F + 2\kappa n_{th}\,(i\alpha)(i\alpha^*)\,F. \qquad (5.83)$$
According to the definition of $f$, this distribution function is a Fourier transform of $F$. Thus the expressions (5.79), (5.82) and (5.83) may be expressed by $f$ and its derivatives. We write the resulting equation for $f$ in the form
$$\frac{\partial f}{\partial t} = L f \qquad (5.84)$$
where the Liouville operator $L$ consists of the atomic part, the atom-field interaction and the field part,
$$L = L_A + L_{AL} + L_L. \qquad (5.85)$$
The different contributions are defined as follows:
(5.86) +fryl2
[ N(e-a/aI-
+ ava v+ av*a v*
1)
-
~
-
I
2 (e-a/aI- 1)I
(5.87)
(5.88)
Taking into account only first and second derivatives, the leading terms of this equation agree with the Fokker-Planck equation derived and solved below and above threshold by RISKEN et al. [1966a, b, c]. For a complete solution of the two-level laser problem (without linearization of the coefficients or adiabatic approximation), however, the full eq. (5.84) must be investigated.
References
AGARWAL, G. S. and E. WOLF, 1968, Phys. Rev. Letters 21, 180.
ARECCHI, F. T., A. BERNE and P. BURLAMACCHI, 1966, Phys. Rev. Letters 16, 32.
ARECCHI, F. T. and V. DEGIORGIO, 1968, Phys. Letters 27A, 429.
ARECCHI, F. T., V. DEGIORGIO and B. QUERZOLA, 1967a, Phys. Rev. Letters 19, 1168.
ARECCHI, F. T., M. GIGLIO and A. SONA, 1967b, Phys. Letters 25A, 341.
ARECCHI, F. T., G. S. RODARI and A. SONA, 1967c, Phys. Letters 25A, 59.
ARMSTRONG, J. A. and A. W. SMITH, 1967, in: Progress in Optics, Vol. 6, ed. E. Wolf (North-Holland Publishing Co., Amsterdam; John Wiley and Sons, New York) p. 211.
ARZT, V., H. HAKEN, H. RISKEN, H. SAUERMANN, C. SCHMID and W. WEIDLICH, 1966, Z. Physik 197, 207.
BÉDARD, G., 1967a, Phys. Letters 24A, 613.
BÉDARD, G., 1967b, Phys. Rev. 161, 1304.
BÉDARD, G., 1967c, J. Opt. Soc. Am. 57, 120.
BHARUCHA-REID, A. T., 1960, Elements of the Theory of Markov Processes and Their Applications (McGraw-Hill Book Company, Inc., New York-Toronto-London).
BLAQUIÈRE, A., 1953, Ann. Radio Elect. 8, 36.
BORN, M. and E. WOLF, 1964, Principles of Optics (Pergamon Press, London).
BROWN, R. Hanbury and R. Q. TWISS, 1956, Nature 177, 27.
BROWN, R. Hanbury and R. Q. TWISS, 1957a, Proc. Roy. Soc. London A242, 300.
BROWN, R. Hanbury and R. Q. TWISS, 1957b, Proc. Roy. Soc. London A243, 291.
BRUNNER, W., 1967, Ann. Physik 20, 53.
CHANDRASEKHAR, S., 1943, Rev. Mod. Phys. 15, 1; this article is contained in: Selected Papers on Noise and Stochastic Processes, ed. Nelson Wax (Dover Publications, Inc., New York, 1954).
CHANG, R. F., R. W. DETENBECK, V. KORENMANN, C. O. ALLEY and U. HOCHULI, 1967, Phys. Letters 25A, 272.
COLLINS, R. J., D. F. NELSON, A. L. SCHAWLOW, W. BOND, C. G. B. GARRETT and W. KAISER, 1960, Phys. Rev. Letters 5, 303.
DAVIDSON, F. and L. MANDEL, 1967, Phys. Letters 25A, 700.
FLECK, J. A., 1966a, Phys. Rev. 149, 309.
FLECK, J. A., 1966b, Phys. Rev. 149, 322.
FREED, C. and H. A. HAUS, 1965, Appl. Phys. Letters 6, 85.
FREED, C. and H. A. HAUS, 1966, Phys. Rev. 141, 287.
GLAUBER, R. J., 1963a, Phys. Rev. 130, 2529.

$$I = \sum_{ij} \Gamma_{ij}(P, Q, \tau); \qquad i, j = 1, 2, \qquad (3.2)$$
where
$$\Gamma_{ij}(P, Q, \tau) = \langle V_i(P, t)\,V_j^*(Q, t+\tau)\rangle, \qquad (3.3)$$
and the angular bracket denotes a time average and the asterisk the complex conjugate. If the inverse Fourier transform with respect to $\tau$ is taken, we have a fundamental expression of two-beam interference in terms of the mutual spectral density (BORN and WOLF [1959], p. 501),

$$\hat{\Gamma}(P, Q, \nu) = \sum_{ij} \hat{\Gamma}_{ij}(P, Q, \nu), \tag{3.4}$$

where

$$\hat{\Gamma}_{ij}(P, Q, \nu) = \int_{-\infty}^{\infty} \Gamma_{ij}(P, Q, \tau) \exp(i 2\pi\nu\tau)\, d\tau \tag{3.5}$$
and the circumflex denotes the time Fourier transform. On the other hand, as ZERNIKE [1938] has shown, each mutual spectral density is propagated according to the generalized Huygens-Fresnel principle (BORN and WOLF [1959], pp. 514, 531). With the approximation that reduces the Huygens principle for amplitudes to a two-dimensional transform, the mutual spectral density on one reference surface is related to that on the subsequent reference surface by a four-dimensional Fourier transform, since it is a function of the four spatial coordinates that specify the positions of the two points in a plane (BLANC-LAPIERRE and DUMONTET [1955]). In Fig. 3, a general microscope system with Köhler illumination is considered. In most two-beam interference microscopes, two microscopes may be regarded as arranged in parallel for the two paths so that their optical axes virtually coincide. Consider now, for each path, the field stop plane, the reference sphere touching the entrance pupil plane at its center, the object plane, the reference sphere touching the exit pupil plane at its center, and the image plane
308
SOURCE-SIZE COMPENSATION
[VI,
§ 3
as shown in Fig. 3. We take these surfaces as the reference surfaces. We have assumed here that the separate optical systems are aligned on the same optical axis * and have the same number of stops. The two optical systems need not be identical but may differ, partly or totally, in magnification, so that the double-focusing interferometer is included in the theory.
To condense the notation, points are specified by a position vector normal to the optic axis and we use the following reduced coordinates (see Fig. 3) for the path i (i = 1, 2) , instead of geometrical coordinates (in capitals) :
* A slight misalignment can be interpreted as a difference in aberration between the two systems.
COHERENCE DIFFRACTION THEORY [VI, § 3

$$u_i = \frac{1}{T_i \sin\alpha_i}\, U_i, \qquad \bar{u}_i = \frac{1}{\bar{T}_i \sin\bar{\alpha}_i}\, \bar{U}_i, \qquad u'_i = \frac{1}{T'_i \sin\alpha'_i}\, U'_i, \tag{3.7}$$
where the subscript $i$ denotes a quantity relating to the path $i$; $n_i$, $\bar{n}_i$, $\bar{n}'_i$, $n'_i$ are the refractive indices of the spaces as shown in Fig. 3; $\alpha_i$, $\bar{\alpha}_i$, $\bar{\alpha}'_i$, $\alpha'_i$ are the angular radii of the circular pupils * in conformity with the usual sign convention; and $T_i$, $\bar{T}_i$, $T'_i$ are the distances, measured from left to right, from the centre of the field stop, of the object and of the image to their corresponding reference spheres respectively. For systems obeying the isoplanatic condition or corrected for the transversal aberrations on the reference sphere (HOPKINS [1951, 1965, 1966]), the Gaussian image of a point at $\bar{x}_i$ in the object plane is at $x'_i = \bar{x}_i$ in the image plane, and similarly $u'_i = \bar{u}_i$. If the systems for the two paths differ neither in magnification nor in the angle the pupil subtends at the object (image) plane, then $x = x_1 = x_2$, $u = u_1 = u_2$, etc. hold. Thus, the scheme of the unified theory may be expressed as follows:

For the illumination system:
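$$\hat{\Gamma}_{ij}(\bar{u}_i, \bar{v}_j, \nu) = \iint_{-\infty}^{\infty} \hat{\Gamma}_{ij}(x_i, y_j, \nu) \exp\{i 2\pi(\bar{u}_i x_i - \bar{v}_j y_j)\}\, dx_i\, dy_j, \tag{3.8}$$

$$\hat{\Gamma}'_{ij}(\bar{u}_i, \bar{v}_j, \nu) = f_i(\bar{u}_i, \nu)\, f_j^{*}(\bar{v}_j, \nu) \cdot \hat{\Gamma}_{ij}(\bar{u}_i, \bar{v}_j, \nu), \tag{3.9}$$

For the observation system:

$$\hat{\Gamma}_{ij}(u'_i, v'_j, \nu) = \iint_{-\infty}^{\infty} \hat{\Gamma}'_{ij}(\bar{x}_i, \bar{y}_j, \nu) \exp\{i 2\pi(u'_i \bar{x}_i - v'_j \bar{y}_j)\}\, d\bar{x}_i\, d\bar{y}_j, \tag{3.12}$$

$$\hat{\Gamma}'_{ij}(u'_i, v'_j, \nu) = f'_i(u'_i, \nu)\, f'^{*}_j(v'_j, \nu) \cdot \hat{\Gamma}_{ij}(u'_i, v'_j, \nu), \tag{3.13}$$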
where the mutual spectral density through one of the reference surfaces is denoted by a plain symbol before the plane and by the same symbol with a prime after the plane. $L_i(\bar{x}_i, \nu)$ denotes the spectral amplitude transmission of the object placed in the path $i$. $f_i(\bar{u}_i, \nu)$ or $f'_i(u'_i, \nu)$ is the spectral amplitude transmission, or pupil function, of the condenser or objective for the path $i$. For example, if the condenser has no absorption and a wave aberration $W_i(\bar{u}_i)$ for a circular pupil, it is

$$f_i(\bar{u}_i, \nu) = \begin{cases} \exp\{i (2\pi\nu/c)\, W_i(\bar{u}_i)\}, & |\bar{u}_i| \le \rho_i; \\ 0, & |\bar{u}_i| > \rho_i. \end{cases} \tag{3.15}$$

* The aperture stops are assumed to coincide with the pupils in position, but not always in size. They may sometimes be larger than the pupils.
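As a minimal numerical sketch of a pupil function of the form (3.15), the fragment below evaluates a non-absorbing circular pupil carrying a given wave aberration. The function names, the defocus coefficient `w20` and the use of a wavelength (with $2\pi/\lambda = 2\pi\nu/c$) are illustrative assumptions and do not appear in the original text.

```python
import cmath
import math


def pupil_function(ux, uy, wave_aberration, wavelength, rho=1.0):
    """Pupil function of a non-absorbing circular pupil of reduced radius rho.

    wave_aberration(ux, uy) is assumed to return the wave aberration W in the
    same length unit as `wavelength`.
    """
    if math.hypot(ux, uy) > rho:
        return 0j  # outside the pupil nothing is transmitted
    phase = 2.0 * math.pi * wave_aberration(ux, uy) / wavelength
    return cmath.exp(1j * phase)  # unit modulus: no absorption


def defocus(ux, uy, w20=0.25e-6):
    """Hypothetical aberration: pure defocus W = w20 * (ux^2 + uy^2)."""
    return w20 * (ux * ux + uy * uy)


inside = pupil_function(0.5, 0.0, defocus, 0.5e-6)
outside = pupil_function(1.5, 0.0, defocus, 0.5e-6)
```

Inside the pupil the modulus stays equal to one and only the phase carries the aberration; outside the function vanishes.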
We assume again that the image planes for both paths are so closely situated that one of them (say that of $x'_1$) may be taken as the image plane common to both beams, the focal shift of the other being regarded as a defocusing aberration. Hence, we have the reduced coordinates for the point $X'$ in the image plane

$$x'_1 = \frac{n' \sin\alpha'_1}{\lambda}\, X', \qquad x'_2 = \frac{n' \sin\alpha'_2}{\lambda}\, X'. \tag{3.16}$$

Putting

$$\mu' = \sin\alpha'_2 / \sin\alpha'_1, \tag{3.17}$$

we get

$$x'_2 = \mu' x'_1. \tag{3.18}$$

By using eqs. (3.8)-(3.14), we can calculate all of $\hat{\Gamma}_{ij}(x'_i, y'_j, \nu)$ $(i, j = 1, 2)$ in the image plane from the knowledge of $\hat{\Gamma}_{ij}(x_i, y_j, \nu)$ on the source. Then, by summing them up to get $\hat{\Gamma}(P, Q, \nu)$ given in (3.4) and taking its Fourier transform, $\Gamma(P, Q, \tau)$ is obtained:

$$\Gamma(P, Q, \tau) = \Gamma_{11}(x'_1, y'_1, \tau) + \Gamma_{22}(\mu' x'_1, \mu' y'_1, \tau) + \Gamma_{12}(x'_1, \mu' y'_1, \tau) + \Gamma_{21}(\mu' x'_1, y'_1, \tau). \tag{3.19}$$

When the two points $P$ and $Q$ coincide, for $\tau = 0$ the intensity at the point $P$ is obtained as

$$I(P) = \Gamma(P, P, 0) = \Gamma_{11}(x'_1, x'_1, 0) + \Gamma_{22}(\mu' x'_1, \mu' x'_1, 0) + 2\,\mathrm{Re}\{\Gamma_{12}(x'_1, \mu' x'_1, 0)\}. \tag{3.20}$$
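The two-beam intensity law of eq. (3.20) can be illustrated with a few lines of Python. This is only a sketch: the function names are hypothetical, and the visibility is taken in the usual sense $2|\Gamma_{12}|/(\Gamma_{11}+\Gamma_{22})$, which is a standard definition assumed here rather than quoted from the text.

```python
import cmath
import math


def two_beam_intensity(gamma11, gamma22, gamma12):
    """Intensity I = Gamma11 + Gamma22 + 2 Re{Gamma12}, as in eq. (3.20)."""
    return gamma11 + gamma22 + 2.0 * gamma12.real


def fringe_visibility(gamma11, gamma22, gamma12):
    """Assumed standard visibility 2|Gamma12| / (Gamma11 + Gamma22)."""
    return 2.0 * abs(gamma12) / (gamma11 + gamma22)


# Two beams of unit intensity with a cross-coherence of modulus 0.5
# and phase pi/3 (illustrative numbers only).
g12 = 0.5 * cmath.exp(1j * math.pi / 3)
intensity = two_beam_intensity(1.0, 1.0, g12)
visibility = fringe_visibility(1.0, 1.0, g12)
```

The phase of the cross term shifts the fringes, while its modulus alone fixes their contrast.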
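It is worthwhile to note that for quasi-monochromatic light of frequency $\bar{\nu} \pm \Delta\nu$, the mutual coherence function or the mutual spectral density can be approximately expressed by the mutual intensity function, as $\Gamma(P, Q, 0) \exp\{-i 2\pi\bar{\nu}\tau\}$ or $\Gamma(P, Q, 0)\, \delta(\nu - \bar{\nu})$, provided $|\tau| \ll 1/\Delta\nu$.

$$\Phi_{11} = q(x, u)\, q^{*}(y, v), \quad \Phi_{12} = q(x, u)\, q(y, v), \quad \Phi_{21} = q^{*}(x, u)\, q^{*}(y, v), \quad \Phi_{22} = q^{*}(x, u)\, q(y, v), \tag{3.29}$$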
with

$$q(x, u) = \exp\{i 2\pi \hat{s} x\} \cdot \exp\{-i 2\pi s u\}, \tag{3.30}$$
where $2\sigma$ denotes the splitting angle † and $\zeta$ the distance from the object plane to the Wollaston prism. So far we have neglected the following causes which affect the coherence of polarized light: the variation of path difference, denoted by $\Phi^{(2)}_{ij}$, which depends on the second power of the incidence angle and gives the curved fringes at infinity of a Savart plate and the hyperbolic fringes at infinity of a Wollaston prism when it is illuminated by a slit just behind it and parallel to its localized fringes (FRANÇON [1952]; FRANÇON and SERGENT [1955]); and the depolarization due to the rotation of the plane of vibration inside a crystal element cut at a large angle, such as 45°, to the optic axis. Both restrict the usable source size and should be examined to the second approximation, after some practical method of source-size compensation is devised. It is shown that the influence on the coherence is approximately given by the integral of $\Phi^{(2)}_{ij}(u', v', x', y')$ taken over the pupil, and that it is equivalent to the influence of astigmatism of the optical system (YAMAMOTO [1968]).
* $S = \sqrt{2}\, e\, (n_e^2 - n_o^2)/(n_e^2 + n_o^2)$, where $2e$ is the thickness of the Savart plate, $n_o$ and $n_e$ the ordinary and extraordinary indices of the crystal and $n$ the index of refraction of the surrounding medium.

† $\sigma = \{(n_e - n_o)/n\} \cdot \tan\varepsilon$, where $\varepsilon$ is the wedge angle.
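The two footnote formulas lend themselves to a quick order-of-magnitude check. The sketch below assumes they read $S = \sqrt{2}\,e\,(n_e^2-n_o^2)/(n_e^2+n_o^2)$ and $\sigma = \{(n_e-n_o)/n\}\tan\varepsilon$; the overall sign depends on the convention chosen for the two indices, so only the magnitudes should be relied upon. The calcite indices used (about 1.658 and 1.486 in the visible) are approximate values supplied here for illustration.

```python
import math


def savart_shear(e, n_o, n_e):
    """Shear S of a Savart plate of total thickness 2e (same unit as e)."""
    return math.sqrt(2.0) * e * (n_e**2 - n_o**2) / (n_e**2 + n_o**2)


def wollaston_half_angle(epsilon, n_o, n_e, n=1.0):
    """Half splitting angle sigma (radians) for a wedge angle epsilon (radians)."""
    return (n_e - n_o) / n * math.tan(epsilon)


# Approximate calcite indices (assumed values, visible light).
N_O, N_E = 1.658, 1.486

shear_mm = savart_shear(1.0, N_O, N_E)                      # e = 1 mm
sigma_rad = wollaston_half_angle(math.radians(10.0), N_O, N_E)
```

For these numbers the shear is roughly 0.15 mm per millimetre of half thickness and the half splitting angle about 0.03 rad for a 10° wedge.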
§ 4. Localization of Fringes with Partially Coherent Light

In a general arrangement, which is illustrated in Fig. 5 for the case of a Savart plate, the object plane $X$ is supposed to be an extended source M of arbitrary coherence $\Gamma_{ij}(x_i, y_j, 0)$ between the two beams $i$ and $j$. The optical system O forms the image of M at M'. S is generally considered as a shearing element, whose coherence transmission is given by eq. (3.25). By using the method given in the preceding section, the mutual intensity in the plane $X'$ is represented by *

$$\begin{aligned} \Gamma(x', y', 0) = \sum_{ij} \iint &\Gamma_{ij}(x_i, y_j, 0) \cdot \Phi_{ij}(x'_i - s_i - x_i,\; y'_j - s_j - y_j) \\ &\times \exp\{-i 2\pi(\hat{s}_i x_i - \hat{s}_j y_j)\} \cdot \exp\{-i 2\pi(\hat{s}_i s_i - \hat{s}_j s_j)\}\, dx_i\, dy_j, \end{aligned} \tag{4.1}$$

where

$$\Phi_{ij}(x_i, y_j) = g_i(x_i) \cdot g_j^{*}(y_j). \tag{4.2}$$
Here, $\Phi_{ij}$ is the coherence spread function for the beams $i$ and $j$, $g_i(x_i)$ being the amplitude spread function of the system for the beam $i$.

Fig. 5. Calculation of the coherence of a radiation field passing through (a) a shearing element and an image-forming system, or (b) an image-forming system and a shearing element, in the order named.

* Hereafter, the integral sign without limits will stand for $\int_{-\infty}^{\infty}$.
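The intensity at the point $X'$ becomes, if the system is assumed stigmatic, i.e. $g_i(x_i) = \delta(x_i)$,

$$\begin{aligned} I(X') = {} &\Gamma^{X}_{11}(x'_1 - s_1,\; x'_1 - s_1,\; 0) + \Gamma^{X}_{22}(\mu' x'_1 - s_2,\; \mu' x'_1 - s_2,\; 0) \\ &+ 2\,\mathrm{Re}\bigl[\Gamma^{X}_{12}(x'_1 - s_1,\; \mu' x'_1 - s_2,\; 0) \times \exp\{-i 2\pi((\hat{s}_1 - \mu' \hat{s}_2) x'_1 - \hat{s}_1 s_1 + \hat{s}_2 s_2)\}\bigr], \end{aligned} \tag{4.3}$$

where $\mu = \sin\alpha_2/\sin\alpha_1$, $\mu' = \sin\alpha'_2/\sin\alpha'_1$, and $\Gamma^{X}_{ij}(P, Q, 0)$ denotes the value of $\Gamma_{ij}(x_i, y_j, 0)$ in the plane $X$ when $x_i$ is replaced by $P$ and $y_j$ by $Q$. The visibility is determined by the modulus of the cross-coherence between two points in the source such that

$$x_1 = x'_1 - s_1, \qquad y_2 = \mu' x'_1 - s_2. \tag{4.4}$$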
An incoherent source being assumed, it follows that unit visibility is obtained only if the (overall) magnification of the system is equal for the two paths,

$$\mu = \mu', \tag{4.5}$$

and

$$\mu s_1 - s_2 = 0; \qquad \text{if } \mu = 1, \quad s_1 - s_2 = 0. \tag{4.6}$$
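The localization argument can be sketched numerically: for each image point, eq. (4.4) names the two source points whose cross-coherence sets the visibility, and for an incoherent source these must be the same physical point for every image point. The fragment below treats the shears as scalars, which is a simplifying assumption (in the text they are vectors), and the function names are hypothetical.

```python
def sampled_source_points(x_img, s1, s2, mu_prime):
    """Source-plane points reached by the two beams, following eq. (4.4)."""
    x1 = x_img - s1
    y2 = mu_prime * x_img - s2
    return x1, y2


def is_localized(s1, s2, mu, mu_prime, tol=1e-12):
    """Unit visibility with an incoherent source: equal magnification
    for the two paths and mu * s1 - s2 = 0 (eq. (4.6))."""
    return abs(mu - mu_prime) <= tol and abs(mu * s1 - s2) <= tol


# Equal shears in a common-path arrangement (mu = mu' = 1).
x1, y2 = sampled_source_points(0.7, 0.2, 0.2, 1.0)
ok = is_localized(0.2, 0.2, 1.0, 1.0)
not_ok = is_localized(0.2, 0.1, 1.0, 1.0)
```

When the condition fails, the two beams sample different source points, and with an incoherent source the fringes lose contrast.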
Eq. (4.6) is equivalent to the classical rule of Raveau (COTTON [1934]), which states that the plane of localization lies at the intersection of two rays derived from a single incident ray. In the case of a Savart plate, eq. (4.6) means that $s_1 = s_2 = 0$, for the two shears are perpendicular to each other: $s_1 \cdot s_2 = 0$. The fringes are localized in the focal plane. When we place a Savart plate in front of an aberrant system with a finite aperture, it is shown that the visibility of the fringes in the focal plane is equal to the modulus of the optical transfer function of the system at the normalized spatial frequency $(\hat{s}_1 - \hat{s}_2)$ with respect to infinity, while the variation in fringe position indicates its phase variation. Similarly, in the analogous arrangement shown in Fig. 5(b), we find the mutual intensity across the image plane as

$$\Gamma(X', Y', 0) = \sum_{ij} \exp\{-i 2\pi(\hat{s}'_i x'_i - \hat{s}'_j y'_j)\} \cdot \exp\{i 2\pi(\hat{s}'_i s'_i - \hat{s}'_j s'_j)\} \times \cdots \tag{4.7}$$
Although the form is quite similar to eq. (4.1), the essential difference is the definition of the shears. In eq. (4.1), $s_i$ and $\hat{s}_i$ are a reduced shear and a reduced tilt in the object space, whereas in eq. (4.7) $s'_i$ and $\hat{s}'_i$ are those in the image space. It is impossible for $s'_i$ of a Savart plate to tend to zero unless the image plane goes to an infinite distance. The fringes are localized at infinity.
Fig. 6. Localized fringes by a Wollaston prism.
For a Wollaston prism as shown in Fig. 6, the condition for localization, eq. (4.6), is expressed as $\sin\sigma \cdot \zeta = 0$, where $\zeta$ is the distance from the source (object) to the Wollaston prism; it is impossible to make $\sigma$ equal to zero while keeping $\zeta$ definite. Consequently, the localization is obtained in the image plane only when $\zeta = 0$.

§ 5. Source-Size Compensation

The combination of the two arrangements shown in Figs. 5(a) and (b) enables us to constitute a source-size compensated interference microscope. There are four possible modes according to which arrangement is placed before the other, the image plane and the exit pupil of the preceding system being made to coincide with the object plane and the entrance pupil of the subsequent system respectively. We call the coincident planes the intermediate planes, denoted by $X_1$ or $X_2$ for the beam 1 or 2, and place the object to be observed in either plane.
Fig. 7. Source-size compensation in the field plane: illustration of the case when two shearing elements are placed outside a microscope (condenser-objective system).
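It is to be noted that the source plane $X$ and the (final) image plane $X'$ are common to both beams in the sense explained before (§ 3). If we take as an example the case shown in Fig. 7, where the one (Fig. 5(a)) is placed before the other (Fig. 5(b)) with each pupil superposed, similar operations as before lead to the mutual intensity in the image plane being represented by

$$\begin{aligned} \Gamma(x', y', 0) = \sum_{ij} \iint &\Gamma_{ij}(x_i, y_j, 0) \times \tilde{\Phi}_{ij}(x'_i - (s_i + s'_i) - x_i,\; y'_j - (s_j + s'_j) - y_j) \\ &\times \exp\{-i 2\pi((\hat{s}_i x_i + \hat{s}'_i x'_i) - (\hat{s}_j y_j + \hat{s}'_j y'_j))\}\, dx_i\, dy_j \\ &\times \exp\{-i 2\pi(\hat{s}_i s_i - \hat{s}_j s_j - \hat{s}'_i s'_i + \hat{s}'_j s'_j)\}, \end{aligned} \tag{5.1}$$

where

$$\tilde{\Phi}_{ij}(x_i, y_j) = \tilde{g}_i(x_i) \cdot \tilde{g}_j^{*}(y_j), \tag{5.2}$$

and $\tilde{g}_i(x_i)$ denotes the overall spread function of the composite system (condenser-objective system), with $\tilde{\Phi}_{ij}$ the overall transmission of coherence. When the approximation that the whole system is stigmatic is made, one finds easily

$$\begin{aligned} \Gamma(x', y', 0) = \sum_{ij} &\Gamma^{X}_{ij}(x'_i - (s_i + s'_i),\; y'_j - (s_j + s'_j),\; 0) \\ &\times \exp\{-i 2\pi((\hat{s}_i + \hat{s}'_i) x'_i - (\hat{s}_j + \hat{s}'_j) y'_j)\} \\ &\times \exp\{i 2\pi(s'_i(\hat{s}_i + \hat{s}'_i) - s'_j(\hat{s}_j + \hat{s}'_j))\}. \end{aligned} \tag{5.3}$$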
Thus it is concluded that the one system can compensate the loss of cross-coherence due to the other, if the following two conditions are both fulfilled:
(I) the overall magnification between the planes X and X' must be equal for the two paths:
$$m_1 = m_2 = m \tag{5.4}$$

or

$$\mu = \mu', \tag{5.5}$$

where $\mu = \sin\alpha_2/\sin\alpha_1$, $\mu' = \sin\alpha'_2/\sin\alpha'_1$;
(II) the ratio of the sums of the reduced shears for the two paths must be equal to the ratio of the apertures subtended at the source plane:

$$\mu(s_1 + s'_1) - (s_2 + s'_2) = 0. \tag{5.6}$$

If $\mu = 1$ *,

$$(s_2 - s_1) + (s'_2 - s'_1) = 0, \tag{5.7}$$

or

$$(s_1 + s'_1) = (s_2 + s'_2). \tag{5.8}$$

Using condition (I), this is also expressed in terms of the relative shear with respect to one path, measured in geometrical coordinates,

$$(S_2 - S_1) + m(S'_2 - S'_1) = 0. \tag{5.9}$$
(III) If, in addition to these conditions, the condition for zero phase of the cross-coherence,

$$(\hat{s}_1 + \hat{s}'_1) - \mu'(\hat{s}_2 + \hat{s}'_2) = 0, \tag{5.10}$$

or, if $\mu' = 1$ *,

$$(\hat{s}_2 - \hat{s}_1) + (\hat{s}'_2 - \hat{s}'_1) = 0, \tag{5.11}$$

or

$$(\hat{s}_1 + \hat{s}'_1) = (\hat{s}_2 + \hat{s}'_2), \tag{5.12}$$

is satisfied, the fringes disappear. Using condition (I), this is also represented as (5.13), where $T_1$, $T'_1$, $T_2$, $T'_2$ are the aperture distances for each path, as defined in § 3. STEEL [1959] derived these conditions in a slightly different way. It is shown that they never change even when the finite aperture and aberrations of the system are taken into consideration, or when any other mode of combination of the two is adopted. From the two conditions (I) and (II) alone, source-size compensated interference microscopes presenting a fringed field can be derived. All the conditions (I), (II) and (III) being satisfied, the ones showing a uniform field will result. In order to find polarization interferometers belonging to the latter, FRANÇON [1956, 1966] showed the following compensation principle, equivalent to eqs. (5.7) and (5.11). We may call this the principle of localized fringe transfer: (a) the two fringe systems of the birefringent elements, which may be either fringes at infinity or localized fringes, must be exactly superposed in one plane (we will call it the plane of superposition); (b) the two birefringent systems must be so oriented that the splitting produced by one is exactly cancelled by the other. When eqs. (5.7) and (5.11) are perfectly established, as well as the condition $\mu = \mu' = 1$, the two exit pupils and the two exit windows coincide exactly. Each of the rays emitted from the source proceeds as if the shears and tilts never occurred. Therefore, the validity of the conditions (5.7) and (5.11) is not confined to the planes of the source and the image. They are virtually valid for any other pair of conjugate planes; that is, the interference phenomena are "non-localized". It is this fact that justifies the equivalence of Françon's conditions to ours.

* In a common-path interferometer, $\mu = \mu' = 1$. The same condition must be valid for maximum contrast in interferometers of a different type.
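The three conditions can be collected into a small classifier. The sketch below is only a bookkeeping aid under stated assumptions: shears and tilts are treated as scalars, condition (II) is taken as $\mu(s_1+s'_1)-(s_2+s'_2)=0$ and condition (III) as $(\hat s_1+\hat s'_1)-\mu'(\hat s_2+\hat s'_2)=0$, and the function name and return strings are hypothetical.

```python
def compensation_state(mu, mu_prime, s, s_prime, t, t_prime, tol=1e-9):
    """Classify a two-element arrangement.

    s, s_prime: reduced shears (path 1, path 2) of the first and second element.
    t, t_prime: reduced tilts  (path 1, path 2) of the first and second element.
    """
    cond_1 = abs(mu - mu_prime) <= tol
    cond_2 = abs(mu * (s[0] + s_prime[0]) - (s[1] + s_prime[1])) <= tol
    cond_3 = abs((t[0] + t_prime[0]) - mu_prime * (t[1] + t_prime[1])) <= tol
    if cond_1 and cond_2 and cond_3:
        return "uniform field"
    if cond_1 and cond_2:
        return "fringed field"
    return "not compensated"


# Second element exactly undoes shear and tilt of the first one.
uniform = compensation_state(1.0, 1.0, (-0.1, 0.1), (0.1, -0.1),
                             (-0.2, 0.2), (0.2, -0.2))
# Shear cancelled, tilt left over: reference fringes remain.
fringed = compensation_state(1.0, 1.0, (-0.1, 0.1), (0.1, -0.1),
                             (-0.2, 0.2), (0.0, 0.0))
# No second element at all.
uncompensated = compensation_state(1.0, 1.0, (-0.1, 0.1), (0.0, 0.0),
                                   (-0.2, 0.2), (0.0, 0.0))
```

The middle case corresponds to the fringed-field instruments, the first to the uniform-field ones.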
Fig. 8. Source-size compensation on the pupil surface.
To complete the study of the compensation, suppose that another objective is placed in the image plane, and assume that it forms the image of the exit pupil plane $U'$ at $U''$ and that $\mu = \mu'$ equals unity. In Fig. 8, the complex system is represented by a simple objective O. By taking the Fourier transform of the mutual intensity $\Gamma(X', Y', 0)$, we get eq. (5.11) as the condition for recovering the modulus of cross-coherence and eq. (5.7) as the condition for zero phase, so that the two conditions are interchanged. The fringes that appear in the pupil are just what are called "source fringes", while the fringes that appear in the image plane are called "test fringes" by STEEL [1965, 1967]. These are specified as complementary. These considerations not only give general grounds for the principle of the practical methods of source-size compensation in any type of two-beam interferometer, some of which will be given in the next section, but also provide a useful tool for approaching new designs of interference microscopes. Last but not least, the analysis based on eqs. (5.1) and (5.2) will be used to study the influence of imperfect compensation upon image formation by an interference microscope (§ 7).
§ 6. Practical Methods of Source-Size Compensation in Shearing Interference Microscope with Polarized Light

In this section, we will confine ourselves to presenting all the practical methods of source-size compensation in a shearing interference microscope. Most of them were described by KRUG, RIENITZ and SCHULZ [1964], FRANÇON [1961, 1966, 1967] and BRYNGDAHL [1965], but the purpose of our study is somewhat different from theirs. We will consider them as theoretical consequences of the discussion made in the previous section, classify them systematically into several modes of construction, add some which have been derived analytically, and finally make a simple comparison from a theoretical point of view. As stated earlier (§ 2.4), in interference microscopes the interferometer with dissimilar arms is rare, because matching of the dispersion in the two arms is difficult. In interferometers of symmetrical form, there is no particular problem about source-size compensation except the matching of aberrations and the adjustment of the optics, which was discussed partly by HOPKINS [1957b, 1967]. Interference microscopes in common-path form can be well realized with polarization interferometers. Double-refracting systems such as Savart plates or Wollaston prisms can divide the light beam into two beams polarized at right angles with exactly equal amplitude, independent of the wavelength and nearly independent of the incidence angle. At the same time, they produce shearing (lateral displacement) or double focusing (longitudinal displacement) * between the two polarized beams.
* The best known form is described by SMITH [1955].
Ordinarily, the double-refracting system is satisfactorily self-compensated, and the delay need not be taken into consideration. In the present state of the technique, the double-focus method is not so suitable for interference microscopy as the shearing one, because the possible defocusing cannot be so large as to give an undisturbed reference wave-front (KRUG, RIENITZ and SCHULZ [1964], p. 136); moreover, as STEEL [1967] pointed out, it is also difficult to form reference fringes without affecting the visibility. This is why it is excluded from this section.
Fig. 9. Practical methods for obtaining the fringed field: (a) NOMARSKI and WEIL [1955], (b) NOMARSKI and WEIL [1955], (c) GONTIER [1957], GUILD [1957], (d) GONTIER [1957], GUILD [1957], (e) UHLIG [1965]; (a') YAMAMOTO [1968], (b') YAMAMOTO [1968], (c') FRANÇON and PRAT [1964], (d') YAMAMOTO [1968], (e') YAMAMOTO [1968].

6.1. METHODS FOR OBTAINING FRINGED FIELD OF VIEW

6.1.1. In the object plane

When only eq. (5.7) is valid, fringes of spacing inversely proportional to $|(\hat{s}_2 - \hat{s}_1) + (\hat{s}'_2 - \hat{s}'_1)|$ appear across the field. The optical-path difference due to the object can be measured by reading the deformation of the fringes, which play the role of reference fringes. When each of the two Savart plates or the two Wollaston prisms is placed outside of a microscope (condenser-objective system), one before and the other after, eq. (5.7) is rewritten as

$$s + s' = 0 \tag{6.1}$$
in terms of the scalar reduced shears defined by eq. (3.28) or (3.31). From the relation (6.1), we get

for Savart plates

$$S \cdot m = -S', \tag{6.2}$$

for Wollaston prisms

$$\sigma\zeta \cdot m = -\sigma'\zeta', \tag{6.3}$$
where $m$ is the magnification of the condenser-objective system between the planes $X$ and $X'$. The simplest solution is that $m$ is unity. Then the illuminating system and the image-forming system must be identical and arranged symmetrically with respect to the intermediate (object) plane, and the two birefringent systems also have to be identical and placed in an appropriate orientation. The system with transmitted light can immediately be converted to a reflection system. All the methods shown in Figs. 8 and 9 follow this principle and can use a large light source. NOMARSKI and WEIL [1955] were the first to describe this kind of system, with Nomarski's modified Wollaston prism (NOMARSKI [1955]), as shown in Figs. 9 (a), (b). The fringe spacing $i$ is given by

$$i = (\lambda/2\sigma) \cdot T/(T - \zeta), \tag{6.4}$$

where $2\sigma$ is the splitting angle of the Wollaston prism. The equivalent Savart systems are shown in Figs. 9 (a'), (b').

6.1.2. In the pupil surface
As illustrated earlier, the complementary fringes of spacing inversely proportional to $|(s_2 - s_1) + (s'_2 - s'_1)|$ appear across the pupil when eq. (5.11) is established. The fringes also represent the wave-front aberration, with respect to a chosen position of the object plane, which will affect the visibility of the fringes that must appear in the image plane when the same optical system is used as a part of an interference microscope. The equation corresponding to eq. (6.1) is given by

$$\hat{s} + \hat{s}' = 0 \tag{6.5}$$

in terms of the scalar reduced tilts defined by eq. (3.28) or (3.31). From this, we find relations quite similar to eqs. (6.2) and (6.3):

for Savart plates

$$S/\mathcal{M} = -S', \tag{6.6}$$

for Wollaston prisms

$$\sigma(T - \zeta)/\mathcal{M} = -\sigma'(T' - \zeta'), \tag{6.7}$$

where $\mathcal{M}$ is the magnification for the entrance and exit pupil planes of the whole system. If $\mathcal{M}$ is assumed to be unity, the pupils lie in the principal planes of the system. GONTIER [1957] and GUILD [1957] described independently the system shown in Figs. 9 (c), (d) for the study of the irregularity on the pupil surface. FRANÇON and PRAT [1964] described its equivalent shown in Fig. 9 (c'), where two of Françon's modified Savart plates are used. It may be converted to the reflection system shown in Fig. 9 (d'). In these cases, ordinary Savart plates are used with a light source of more limited extent because of the curvature of the fringes. For the same reason, Françon's modified Savart plate may be used in Figs. 9 (a'), (b'). Françon and Prat applied the system to an interferential focometer for the lens L, because the fringe spacing $i$ is proportional to the focal length $f$ of L:

$$i = (\lambda/2S) f. \tag{6.8}$$
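To get a feeling for the magnitudes involved, the two fringe-spacing formulas can be evaluated directly. The sketch assumes the focometer relation $i = (\lambda/2S)f$ of eq. (6.8) and, for the Wollaston arrangement of § 6.1.1, a spacing of the form $i = (\lambda/2\sigma)\,T/(T-\zeta)$; the function names and the numerical values are illustrative only.

```python
import math


def savart_fringe_spacing(wavelength, shear, focal_length):
    """Spacing i = (lambda / 2S) * f in the focal plane of a lens."""
    return wavelength * focal_length / (2.0 * shear)


def wollaston_fringe_spacing(wavelength, sigma, T, zeta):
    """Spacing i = (lambda / 2 sigma) * T / (T - zeta) (assumed form)."""
    return wavelength / (2.0 * sigma) * T / (T - zeta)


# Illustrative values in metres and radians.
i_savart = savart_fringe_spacing(0.5e-6, 0.1e-3, 0.1)
i_wollaston = wollaston_fringe_spacing(0.5e-6, 1.0e-3, 0.2, 0.1)
```

With these numbers the Savart fringes are 0.25 mm apart, and moving the Wollaston prism towards the pupil (larger $\zeta$) widens its fringes, which is the position dependence exploited in the testing arrangements.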
For optical testing of the microscope objective, the systems shown in Fig. 9 (e) and Fig. 9 (e') can be used. UHLIG [1965] tested the mechanical tube length by the former system, because the fringe spacing varies in proportion to $\zeta^{-1}$. YAMAMOTO [1968] utilized the latter to test the aberration of an objective with respect to the image plane conjugate to the reflecting object plane. This is no more than a polarization version of the classical Waetzman interferometer (WAETZMAN [1912]).

6.1.3. Comparison between the uses of Savart plates and Wollaston prisms

The position of a Savart plate along the optical axis is completely free, but the fringe spacing is constant, unless the plate is split into two wedged plates which slide with respect to each other (TSURUTA [1963]) or is replaced with two counter-rotating Savart plates (STEEL [1964]). On the contrary, the spacing of the fringes given by the Wollaston prism varies as we move it along the axis.

6.2. METHODS FOR OBTAINING UNIFORM FIELD OF VIEW
When both the conditions indicated by eqs. (5.7) and (5.11) are satisfied, the two beams become coherent and the intensity distribution becomes uniform everywhere, even with a large light source. Since both eqs. (6.1) and (6.5) are satisfied, eqs. (6.2) and (6.6), or eqs. (6.3) and (6.7), hold. In order that eqs. (6.2) and (6.6) be satisfied for two Savart plates, the optical system placed between them should be afocal, namely telescopic, and the ratio of the shearing distances $S$, $S'$ of the Savart plates should be equal to the constant magnification of the afocal system,

$$S'/S = -m = -1/\mathcal{M}. \tag{6.9}$$

For two Wollaston prisms to be used, they should be placed in conjugate planes of the optical system placed between them, and the ratio of the splitting angles $\sigma$, $\sigma'$ of the Wollaston prisms should be connected with the magnification $\bar{m}$ between the two planes where the Wollaston prisms are placed,

$$\sigma'/\sigma = -1/\bar{m}. \tag{6.10}$$
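The two matching rules can be turned into a small design helper. The sketch assumes that the Savart rule reads $S' = -mS$ (shear ratio equal to the constant magnification of the afocal system) and that the Wollaston rule reads $\sigma' = -\sigma/\bar m$ for prisms placed in conjugate planes; the function names are hypothetical and the numbers are illustrative.

```python
import math


def matching_savart_shear(shear, m):
    """Shear S' of the second Savart plate behind an afocal system
    of magnification m, from S * m = -S'."""
    return -m * shear


def matching_wollaston_angle(sigma, m_bar):
    """Half splitting angle sigma' of the second Wollaston prism placed in
    the plane conjugate to the first one, with magnification m_bar."""
    return -sigma / m_bar


s_second = matching_savart_shear(0.1, 10.0)           # e.g. millimetres
sigma_second = matching_wollaston_angle(0.01, 10.0)   # radians
residual_shear = 0.1 * 10.0 + s_second                 # should vanish
```

A magnification of ten thus calls for a second Savart plate with ten times the shear but a second Wollaston prism with one tenth of the splitting angle, both with reversed orientation.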
In Fig. 10, the methods for obtaining a uniform field are classified into four modes of combination of the two fundamental forms shown in Figs. 5 (a) and (b), where the exit pupil of the first system and the entrance pupil of the second are usually taken at infinity:
- mode (1): two birefringent systems are both outside the condenser-objective system (Figs. 10 (a)-(e));
- mode (2): two birefringent systems are both between the condenser and the objective (Figs. 10 (f), (g));
- modes (3) and (4): one birefringent system is outside the system and the other inside it (Figs. 10 (h), (i)).
The uniform field can be converted into the fringed one without spoiling the compensation, when we place some additional interferometer which forms its localized fringes, real or virtual, in the image plane.

6.2.1. Mode (1)

The simplest method, which is shown in Fig. 10 (a), was described by FRANÇON and YAMAMOTO [1962]. (a) and (b), the latter described by FRANÇON and CATALAN [1960], follow the principle given by eq. (6.9). The adjustment for obtaining afocality was made by moving the condenser. (c) is the well-known Smith's method (SMITH [1947]), where the two Wollaston prisms are placed in the focal planes of the condenser and objective ($T = \zeta$ and $T' = \zeta'$). Since the focus of a high-aperture objective or condenser is not accessible, Nomarski realized this principle with his modified Wollaston prism. FRANÇON and YAMAMOTO [1962] extended Smith's method to the case where $T - \zeta \neq 0$ and $T' - \zeta' = 0$ are valid, which is shown in (d). (c) and (d) follow eq. (6.10). A mixed system consisting of a Savart plate and a Wollaston prism was described by FRANÇON [1957], which is shown as (e). In this case, a modified Wollaston prism should be used so that both the direction of vibration and the direction of shear may be parallel to those of the Savart plate.

Fig. 10. Practical methods for obtaining the uniform field (transmission system): (a) FRANÇON and YAMAMOTO [1962], (b) FRANÇON and CATALAN [1960], (c) SMITH [1947], (d) FRANÇON and YAMAMOTO [1962], (e) FRANÇON [1957], (f) LEBEDEFF [1930], (g) LEBEDEFF [1930], (h) FRANÇON [1957], (i) LINDBERG [1952].

6.2.2. Mode (2)

LEBEDEFF [1930] used the well-known interferometer by JAMIN [1868]. This is the first system which was source-size compensated
for a uniform field. It consists of two identical crystal plates cut at 45° to the optic axis, with a half-wave plate between them. The object is also introduced between them. We can dispense with the half-wave plate by replacing the two identical plates with two identical Savart plates, but the non-straightness of their fringes reduces the usable aperture (LEBEDEFF [1930]); see Figs. 10 (f), (g).

6.2.3. Modes (3) and (4)

FRANÇON [1957] used two Savart plates, one under the object and the other behind the afocal system consisting of the objective and an auxiliary lens (Fig. 10 (h)). This system enables one to convert the well-known Françon interferential ocular (FRANÇON [1952, 1954]) into a source-size compensated one, just as the system shown in Fig. 10 (e) does. LINDBERG [1952] described a mixed form as shown in Fig. 10 (i).

6.2.4. Reflection system
Since some of the above-mentioned systems are symmetrical with respect to the object plane, they can be adapted immediately to a reflection system as shown in Figs. 11 (a) and (c) (SMITH [1947], NOMARSKI and WEIL [1955], FRANÇON [1953]). Later, DYSON [1963] made the beam reflect back and traverse the optical system twice so that the interferometer became an extremely stable one (Fig. 11 (b)). More recently, FRANÇON and YAMAMOTO [1962] proposed the system shown in Fig. 11 (d), where an auxiliary lens is added to make the whole catadioptric system afocal. So far, reflection systems have been described which serve to observe a plane reflecting object. To observe a curved object, YAMAMOTO [1968] described the systems shown in Figs. 11 (e) and (f). A varifocal system is introduced, which consists of two lenses whose powers are equal but opposite in sign. As the distance between them varies, the focal length of the composite system varies in inverse proportion to it. Its principal planes H and H', however, always stay at the focal planes of the two lenses respectively. In the system shown in Fig. 11 (e), if the principal plane H lies in the plane M, conjugate with the image plane M' of the microscope, the whole system including the reflecting surface can be made afocal with unit magnification * by making the focus of the varifocal system coincide with that of the catadioptric system. To do so, one may move the diverging lens relative to the converging one while keeping the latter fixed. Consequently, by placing two identical Savart plates, one before the varifocal system and the other after the objective, source-size compensation will be established, irrespective of the magnification of the objective and nearly irrespective of the curvature of the object. In the system shown in Fig. 11 (f), by placing the first Wollaston prism at an appropriate distance from the diverging lens and moving them as a body, the first Wollaston prism W may be imaged on the second W' with a constant magnification, independent of the magnification of the objective, over a large range of curvature of the object.

Fig. 11. Practical methods for obtaining the uniform field (reflection system): (a) SMITH [1947], (b) DYSON [1963], (c) FRANÇON [1953], (d) FRANÇON and YAMAMOTO [1962], (e) YAMAMOTO [1968], (f) YAMAMOTO [1968].

6.3. OBJECTIVE COMPENSATION AND PUPILLARY COMPENSATION
In the source-size compensation for obtaining a uniform field of view, two different types of compensation may be distinguished by the place where two rays derived from a single ray intersect at an angle, i.e. the position of the plane of superposition (§ 5). The case when this position is in the exit pupil of the objective may be called the pupillary compensation, while the case when it is close to the object plane may be called the objective compensation (FRANÇON and YAMAMOTO [1962]) †. The important thing is that the aberrations of the objective affect the interference in quite different manners in these two cases. Consider the two methods which were shown in Figs. 10 (a) and (c). We illustrate them again in Fig. 12, the objective compensation by solid lines and the pupillary compensation by dotted lines. To obtain the same shear $P_1P_2$ in the object plane, the two rays are separated by a finite distance in the pupil plane in the former, but by none in the latter. In other words, the two images of the exit pupil are sheared with respect to the pupil in the former. Since this shear, as a fraction of the pupil diameter, corresponds to a normalized spatial frequency $\hat{s}_2 - \hat{s}_1$, it is shown that the visibility reduces to the value of the optical transfer function at the frequency $\hat{s}_2 - \hat{s}_1$ with respect to the object $P_1$ or $P_2$ (YAMAMOTO [1965, 1968]). Even when the condenser-objective system is corrected perfectly, the visibility never attains unity in the case of objective compensation. The larger the shear, the more serious the loss. Aberrations accelerate the deterioration. On the other hand, in the pupillary compensation, the surface of the exit pupil is really curved and astigmatic; moreover, large amounts of distortion are present in the pupil plane. Consequently, the superposition of the localized fringes becomes imperfect, and so the pupillary compensation will be difficult with a high aperture.

Fig. 12. Comparison between the objective compensation and the pupillary compensation.

* Because M and M' are the principal plane pair of the catadioptric system.

† The first objective compensation was described for double-focusing interference microscopes by PHILPOT [1951].
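The statement that the visibility stays below unity even for a perfectly corrected system can be illustrated with the textbook transfer function of an aberration-free circular pupil. This is a sketch under assumptions not taken from the text: the relative shear is expressed as a frequency `nu` normalized to the cut-off (so `nu = 1` means the two pupil images no longer overlap), and the closed-form autocorrelation of a uniform disc is used.

```python
import math


def diffraction_limited_mtf(nu):
    """Modulus of the transfer function of an aberration-free circular pupil
    at the normalized frequency nu (cut-off at nu = 1)."""
    nu = abs(nu)
    if nu >= 1.0:
        return 0.0
    # Normalized overlap area of two unit discs displaced by 2 * nu.
    return (2.0 / math.pi) * (math.acos(nu) - nu * math.sqrt(1.0 - nu * nu))


visibility_no_shear = diffraction_limited_mtf(0.0)
visibility_half_cutoff = diffraction_limited_mtf(0.5)
visibility_at_cutoff = diffraction_limited_mtf(1.0)
```

Under these assumptions a pupil shear of half the cut-off already reduces the visibility to about 0.39, and aberrations can only lower it further.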
§ 7. Images of Source-Size Compensated Interference Microscopes
If the source-size compensation is attained, the object can be observed through a microscope with a large light source which is usually imaged in the entrance pupil of the condenser. For simplicity, we assume that the optical system satisfies p = p' (eq. (5.5)) and p equals unity. Then, it is not necessary to distinguish the variables for beam 1 from those for beam 2. As stated earlier, if an object is introduced which is characterized by amplitude transmission $L_i(x)$ for the path i, the coherence transmission is given by eq. (3.11),
Then the mutual intensity in the image plane is represented by the following double convolution integral, using eqs. (4.1) and (4.7):
r&d,y‘, 0) =
/!
ri.(x, y, o)@zj(B-si-x, y-sj-y) I
X A i j ( X , p)@ i j ( X ’ - s ; - X ,
-
y’-sj-y) x exp{-i2z((dix-diy) (Bix’-$y’))} x exp{-i2n ( (siBi-sjdj)- (s:B ~ - . s ~ i ~ ) ) } d x d y d ~ d ~ .
+
After calculating this integral for all pairs of i, j, i.e. $\Gamma_{11}$, $\Gamma_{12}$, $\Gamma_{21}$, $\Gamma_{22}$, and putting $x' = y'$, we obtain the intensity distribution in the image plane, $I(x') = \Gamma(x', x', 0)$, following the scheme of the unified theory. Since the object is partially coherently illuminated by each beam alone, the $\Gamma_{ii}$ (i = 1, 2) agree with the intensities which were given with the transfer factors or transmission cross-coefficients (HOPKINS [1953], BORN and WOLF [1959] p. 523) by HOPKINS [1953, 1957b], with the exception of a bodily displacement due to $s_i$. But the cross-coherence such as $\Gamma_{12}$ differs slightly from them, as we will discuss below. If the object is sufficiently low contrast in the sense that
Li@)= \Li(X)l exp{iyi(n)}
(7.3) (7.4)
llLi(%)l-ll p2
(7.22)
where /?= - (nsin u)2/n,,and p is the reduced radius of the condenser aperture. Consider the image produced by a shearing interference microscope. Substituting eq. (7.22) for L i ( X ) in eq. (7.1) and calculating with appropriate approximation, the image intensity is obtained as follows:
I&’)
= T(x’,x’,0)
=
2 exp(i(2nvlc)(1-iflp2)
(p(x’-s~)-p(x’-ss;))}. (7.23)
ij
The factor (1-@p2) is the “obliquity effect” of illumination aperture size, for which INGELSTAM [1960], INGELSTAM and JOHANSSON [1958], TOLMAN and WOOD [1956], GATES [1956], BRUCE [1955, 1957] and THORNTON [1957] have derived the correction formulae. With reflected light microscopy, we may put $n_0 = -n$ in eq. (7.20).
* The tilt will also take place in the exit pupil. This effect was described recently by GUILLARD [1963-1964], and was ignored here.
§ 8. Conclusion
An outline of a general theory of source-size compensation in two-beam interference microscopy has been presented. It is formulated in terms of four coherence functions $\Gamma_{ij}$ between the radiation from the beams i and j at two points. The representation of these functions was investigated on any one of the reference surfaces (pupil planes or window planes) in the image-forming system including several shearing and/or tilting elements. This representation applies to both the localization of fringes and the source-size compensation in any plane of interest, whereas the latter was described formerly as the superposition of two systems of localized fringes. These studies have yielded simple formulae which serve as an analytic tool for devising new designs, some of which were described above. It was also deduced that the transfer properties of source-size compensated interference microscopes can be expressed by the newly defined transfer coefficients of the interference microscope, whose characteristics are somewhat different from those of ordinary microscopes. They can be utilized for examining the influence of imperfect compensation and fixing permissible aberrations for high visibility. This is indispensable information for an attempt to produce any kind of interference microscope. The theory explained here, which we owe principally to Steel, unifies the theory of the interferometer and that of image formation by a microscope. It has been shown that the theory throws new light on the difficult theoretical and practical problems encountered in interference microscopy.
References
ALLEN, R. and J. W. BRAULT, 1966, Image Contrast and Phase-Modulated Light Methods in Polarization and Interference Microscopy, in: Advances in Optical and Electron Microscopy, eds. R. E. Barer and V. E. Cosslett (Academic Press, New York) p. 77.
ARMITAGE, J. D. and A. LOHMANN, 1964, Optica Acta 12, 185.
BERAN, M. and G. B. PARRENT, 1964, Theory of Partial Coherence (Prentice-Hall, Englewood Cliffs, New Jersey) p. 111.
BLANC-LAPIERRE, A. and P. DUMONTET, 1955, Rev. Opt. 34, 1.
BORN, M. and E. WOLF, 1959, Principles of Optics (Pergamon, London).
BREMMER, H., 1951, Physica 17, 63.
BRUCE, C. F., 1955, Aust. J. Phys. 8, 224.
BRUCE, C. F., 1957, Optica Acta 4, 127.
BRYNGDAHL, O., 1965, Applications of Shearing Interferometry, in: Progress in Optics, Vol. 4, ed. E. Wolf (North-Holland Publishing Co., Amsterdam) p. 37.
CONNES, P., 1956, Rev. Opt. 35, 37.
COTTON, A., 1934, Rev. Opt. 13, 153.
DUMONTET, P., 1955, Optica Acta 2, 53.
DYSON, J., 1950, Proc. Roy. Soc. 204, 170.
DYSON, J., 1963, J. Opt. Soc. Am. 53, 690.
ELLIS, G. W., 1966, Science 154, 1195.
FRANÇON, M., 1952, Rev. Opt. 31, 65.
FRANÇON, M., 1953, Rev. Opt. 32, 349.
FRANÇON, M., 1954, Optica Acta 1, 53.
FRANÇON, M., 1956, Interférences, diffraction et polarisation, in: Handbuch der Physik, Vol. 24, ed. S. Flügge (Springer, Berlin) p. 171.
FRANÇON, M., 1957, J. Opt. Soc. Am. 47, 528.
FRANÇON, M., 1961, Progress in Microscopy (Pergamon, Oxford).
FRANÇON, M., 1966, Optical Interferometry (Academic Press, London).
FRANÇON, M., 1967, Einführung in die neueren Methoden der Lichtmikroskopie (Verlag G. Braun, Karlsruhe) p. 102.
FRANÇON, M. and L. CATALAN, 1960, Rev. Opt. 39, 1.
FRANÇON, M. and R. PRAT, 1964, Optica Acta 11, 252.
FRANÇON, M. and B. SERGENT, 1955, Compt. Rend. Acad. Sci. 241, 27.
FRANÇON, M. and S. SLANSKY, 1965, Cohérence en Optique (C.N.R.S., Paris).
FRANÇON, M. and T. YAMAMOTO, 1962, Optica Acta 9, 395.
GABOR, D. and W. P. GOSS, 1966, J. Opt. Soc. Am. 56, 849.
GATES, J. W., 1956, J. Sci. Instr. 33, 507.
GERHARDT, U. and H. LENK, 1960, Feingerätetechnik 9, 529.
GERHARDT, U. and H. LENK, 1962, Feingerätetechnik 11, 208.
GERHARDT, U., 1967, Feingerätetechnik 16, 505.
GONTIER, M. G., 1957, Compt. Rend. Acad. Sci. Paris 244, 1019.
GUILD, F., 1957, Phys. Soc. Year Book, p. 30.
GUILLARD, M., 1963-1964, Rev. Opt. 42, 463; 43, 27, 64, 349.
HANSEN, G., 1930, Z. Instrumentenk. 50, 460.
HANSEN, G., 1942, Zeiss Nachr. 4, 109.
HANSEN, G., 1954, Z. Angew. Phys. 6, 203.
HANSEN, G., 1955, Optik 12, 5.
HANSEN, G. and W. KINDER, 1958, Optik 15, 560.
HOPKINS, H. H., 1943, Proc. Phys. Soc. 55, 116.
HOPKINS, H. H., 1951, Proc. Roy. Soc. A208, 263.
HOPKINS, H. H., 1953, Proc. Roy. Soc. A217, 408.
HOPKINS, H. H., 1957a, Proc. Phys. Soc. B70, 449, 1162.
HOPKINS, H. H., 1957b, J. Opt. Soc. Am. 47, 508.
HOPKINS, H. H., 1965, Japan J. Appl. Phys. 4, suppl. 1, 31.
HOPKINS, H. H., 1966, Optica Acta 13, 343.
HOPKINS, H. H., 1967, The Theory of Coherence and Its Applications, in: Advanced Techniques of Optics (Van Nostrand, London) p. 189.
INGELSTAM, E. and L. P. JOHANSSON, 1958, J. Sci. Instr. 35, 15.
INGELSTAM, E., 1960, Problems Related to the Accurate Interpretation of Microinterferograms, in: Interferometry (H.M.S.O., London) p. 137.
INOUÉ, S. and W. L. HYDE, 1957, J. Biophys. Biochem. Cytol. 3, 831.
IWATA, G., 1949, Proc. Phys. Soc. Japan 4, 195 (in Japanese).
JAMIN, J., 1868, Compt. Rend. Acad. Sci. Paris 67, 814.
KIND, E. G. and G. SCHULZ, 1959, Optik 16, 2.
KRUG, W., J. RIENITZ and G. SCHULZ, 1964, Contributions to Interference Microscopy (Hilger and Watts, London).
LEBEDEFF, A., 1930, Rev. Opt. 9, 385.
LINDBERG, O., 1952, Z. Physik 131, 231.
LINNIK, W., 1934, Z. Instrumentenk. 54, 462.
LOHMANN, A., 1956, Optica Acta 3, 97.
MAGILL, P. J. and A. D. WILSON, 1968, J. Appl. Phys. 39, 4717.
MARÉCHAL, A. and M. FRANÇON, 1960, Diffraction (Rev. Opt., Paris) p. 97.
MENZEL, E., 1958, Optik 15, 460.
MENZEL, E., 1960, Die Abbildung von Phasenobjekten in der optischen Übertragungstheorie, in: Optics in Metrology, ed. P. Mollet, p. 283.
MERTZ, L., 1959, J. Opt. Soc. Am. 49, No. 12, p. iv.
MINKWITZ, G. and G. SCHULZ, 1964, Optica Acta 11, 89.
MURTY, M. V. R. K., 1964, J. Opt. Soc. Am. 54, 1187.
MURTY, M. V. R. K. and D. MALACARA-HERNANDEZ, 1965, Japan J. Appl. Phys. 4, suppl. 1, 106.
NOMARSKI, G., 1955, J. Phys. Radium 16, 9 (S).
NOMARSKI, G. and A. R. WEIL, 1955, Rev. Métallurgie 52, 121.
PHILPOT, J. St., 1951, Some New Form of Interference Microscope, in: Le Contraste de Phase et le Contraste par Interférences, ed. M. Françon (Rev. Opt., Paris) p. 45.
POLZE, S., 1965, Monatsberichte der Deutschen Akademie der Wissenschaften zu Berlin 7, 631.
RÄNTSCH, K., 1949, Die Optik in der Feinmesstechnik (Carl Hanser, München).
SCHULZ, G. and G. MINKWITZ, 1961, Ann. Physik (7) 7, 371.
SCHULZ, G., 1964, Optica Acta 11, 43, 131.
SLANSKY, S., 1959, J. Phys. Radium 20, 13 (S).
SLANSKY, S., 1960, Rev. Opt. 39, 555.
SLANSKY, S., 1962, Optica Acta 9, 277.
SLEVOGT, H., 1954, Optik 8, 366.
SMITH, F. H., 1947, Brit. Pat. 639014, U.S.P. 2,601,175.
SMITH, F. H., 1955, Microscopic Interferometry, in: Modern Methods of Microscopy, ed. A. E. J. Vickers (Butterworth Scientific Publications, London) p. 76.
SNOW,
$\nu_0$ where *
$\nu_0 = D/\lambda f \approx 113\,D_a$ cycles/mm (9)
and $D_a$ is the apparent pupil diameter in mm. Aberrations in the eye's optical system lower the mtf below the value given by eq. (7) (WESTHEIMER [1963]). This effect is quite small when the pupil diameter is below 2 mm and increases in significance as the pupil diameter increases. Thus, instead of increasing continually with pupil diameter, as implied by eq. (7), the mtf begins to decrease beyond a pupil diameter of about 2.4 mm (CAMPBELL and GUBISCH [1966]).
* This estimate is based on Gullstrand's schematic eye (SOUTHALL [1937] pp. 56-59), which places the iris about 21 mm from the fundus. Data given there for the lens imply that this enlarges the pupil about 5%. On the other hand, the corneal curvature enlarges the pupil diameter by about 6.5%. The value of $\nu_0$ in the text is based on 0.555 μm light and a refractive index of 1.336 for the image space.
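The numbers quoted in the footnote can be combined into a cutoff frequency with a few lines of arithmetic. The sketch below is one possible reading, not the author's stated procedure: it assumes the effective pupil diameter is the apparent diameter reduced by the 6.5% corneal enlargement and increased by the 5% lens enlargement, and that the wavelength is taken inside the eye (vacuum wavelength divided by 1.336). The function and parameter names are hypothetical.

```python
def cutoff_frequency_cycles_per_mm(
    apparent_pupil_mm: float,
    wavelength_um: float = 0.555,       # wavelength in vacuum, micrometres
    n_image: float = 1.336,             # refractive index of the image space
    pupil_to_fundus_mm: float = 21.0,   # iris-to-fundus distance
    lens_enlargement: float = 1.05,     # pupil enlarged about 5% by the lens
    cornea_enlargement: float = 1.065,  # pupil enlarged about 6.5% by the cornea
) -> float:
    """Diffraction cutoff nu_0 = D / (lambda * f) on the retina.

    Assumption: D is the apparent pupil divided by the corneal
    enlargement and multiplied by the lens enlargement.
    """
    effective_pupil_mm = apparent_pupil_mm * lens_enlargement / cornea_enlargement
    wavelength_in_eye_mm = wavelength_um * 1e-3 / n_image
    return effective_pupil_mm / (wavelength_in_eye_mm * pupil_to_fundus_mm)


if __name__ == "__main__":
    for d in (1.0, 2.0, 2.4):
        nu0 = cutoff_frequency_cycles_per_mm(d)
        print(f"apparent pupil {d} mm -> cutoff about {nu0:.0f} cycles/mm")
```

With these assumptions a 1 mm apparent pupil gives a cutoff of roughly 113 cycles/mm, and the cutoff scales linearly with pupil diameter.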
A representative set of curves, obtained with white light from a high-pressure mercury lamp is shown in Fig. 2.
Fig. 2. Mtf of eye optics for various pupil diameters (2, 3, 3.8, 4.9, 5.8 and 6.6 mm), plotted against retinal spatial frequency in cycles/mm. (From CAMPBELL and GUBISCH [1966].)
In addition to the techniques just described, more direct measurements have been made using excised steers' and pigs' eyes (ROHLER [1962], DE MOTT [1959]). These showed poorer results, but would not seem to be immediately applicable to a live, human eye. Another method (ARNULF and DUPUY [1960]) used to measure the mtf of the optical portion employs an ingenious technique to form on the retina a sinusoidal pattern of known modulation. Using a split-field technique, the observer views a sinusoidal target of unity modulation, restricted to one half of his field of view. The other half is covered by a sinusoidal pattern formed directly on his retina. This latter pattern is adjustable to match the modulation of the former. When a match is established, the known modulation of the pattern will give directly the mtf of the optical portion. This adjustable pattern may be formed as follows (ARNULF and DUPUY [1960], BERGER-L'HEUREUX-ROBARDEY [1965], WESTHEIMER [1960], CAMPBELL [1968]). A pair of narrow, parallel strips of coherent, monochromatic light is imaged in the pupil. This results in the appearance of regularly spaced (Young's) interference fringes on the retina of the observer.
The fringe spacing is controlled by the spacing of the strips and, at any one spacing, the contrast is adjusted until a match is obtained. The contrast may be varied by changing the amount of flux entering through one strip relative to that through the other (e.g. by shortening one slit) or by adding a controlled amount of incoherent light to the incident radiation. Extensive results obtained with this method have been published (BERGER-L'HEUREUX-ROBARDEY [1965]).
3.3. RETINA-BRAIN PORTION
The method just described has also been used to find, indirectly, the mtf ($T_{RB}$) of the retina-brain portion. On the assumption that, at threshold, the sensed modulation has some fixed value (K), independent of spatial frequency, we need measure only the values of the modulation on the retina ($M_r(\nu)$) at threshold at various spatial frequencies. Clearly $K = M_r(\nu)\,T_{RB}(\nu)$, and hence $M_r(\nu) = K/T_{RB}(\nu)$; that is, $T_{RB}(\nu)$ is inversely proportional to the threshold modulation. This technique has indeed been used (WESTHEIMER [1960], CAMPBELL and GREEN [1965], CAMPBELL [1968]). Representative results for the retina-brain portion are shown in Fig. 3.
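The threshold relation can be turned into a relative mtf in a couple of lines. The following is a minimal sketch under the stated assumption that the sensed modulation at threshold is the same constant K at every spatial frequency; since K is unknown, the result is normalized to its own maximum. The threshold values used are invented for illustration and are not measured data.

```python
def retina_brain_mtf(threshold_modulations):
    """Relative retina-brain mtf from threshold retinal modulations.

    Uses K = M_r(nu) * T_RB(nu) with K unknown but constant, so that
    T_RB is proportional to 1 / M_r. The output is scaled so that its
    largest value is 1.
    """
    inverse = [1.0 / m for m in threshold_modulations]
    peak = max(inverse)
    return [value / peak for value in inverse]


if __name__ == "__main__":
    # Hypothetical threshold modulations at three spatial frequencies.
    thresholds = [0.02, 0.01, 0.04]
    print(retina_brain_mtf(thresholds))
```

Because only ratios survive the normalization, curves obtained this way can be compared in shape but not in absolute level.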
3.4. TOTAL VISUAL SYSTEM
The mtf of the total visual system may be obtained as the product of the mtf's of the subsystems just discussed. On the other hand, a variety of methods have been used to measure the mtf of the total visual system directly. Here again the most popular method seems to be threshold measurements. A sinusoidal luminance variation at some fixed spatial frequency is presented to the observer and the modulation is reduced until it is no longer detectable. The required luminance patterns have been obtained by modulating a crt raster (CAMPBELL [1968], SCHADE [1956], PATEL [1966]), by generating an interference pattern (ARNULF and DUPUY [1960], BERGER-L'HEUREUX-ROBARDEY [1965]), by means of photographic transparencies (ROSENBRUCH [1959], DE PALMA and LOWRY [1962], VAN NES and BOUMAN [1967], FRY [1969]), and by using the moiré effect (MENZEL [1959]).
Three supra-threshold methods have been used to obtain the visual mtf. Two of these use split-field photometry; that is, the region whose luminance is to be measured (the test pattern) is placed adjacent to a region of controllable and known luminance, and the luminance of the latter is adjusted until it appears to match the test pattern luminance. This technique has been used to measure the mtf of the visual system directly (BRYNGDAHL [1964, 1966]) and also to measure its step-function response (LOWRY and DE PALMA [1961]), which, upon differentiation and subsequent Fourier transform, yields the line spread function and the mtf, respectively. (Cf. also MENZEL [1959].)
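The step-response route (differentiate, then Fourier transform) can be sketched numerically as follows. This is an illustration under simplifying assumptions, not the procedure of the cited workers: the system is treated as linear, the actual edge is taken to be an ideal step, and the synthetic edge response is a Gaussian-blurred step chosen only because its mtf is known in closed form. All names are hypothetical.

```python
import cmath
import math


def mtf_from_edge_response(edge, dx):
    """Return (frequencies, mtf) from a sampled step-function response.

    edge: apparent luminance sampled at spacing dx across the edge.
    Step 1: finite differences give the line spread function.
    Step 2: the modulus of its discrete Fourier transform,
            normalized to the zero-frequency value, gives the mtf.
    """
    lsf = [(edge[i + 1] - edge[i]) / dx for i in range(len(edge) - 1)]
    n = len(lsf)
    spectrum = []
    for k in range(n // 2 + 1):
        total = sum(lsf[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
        spectrum.append(abs(total))
    freqs = [k / (n * dx) for k in range(n // 2 + 1)]
    return freqs, [s / spectrum[0] for s in spectrum]


# Synthetic example: a step blurred by a Gaussian of unit standard deviation.
dx = 0.1
xs = [-10.0 + i * dx for i in range(201)]
edge = [0.5 * (1.0 + math.erf(x / math.sqrt(2.0))) for x in xs]
freqs, mtf = mtf_from_edge_response(edge, dx)

if __name__ == "__main__":
    for k in (0, 2, 4):
        expected = math.exp(-2.0 * math.pi ** 2 * freqs[k] ** 2)
        print(f"nu = {freqs[k]:.2f}: mtf = {mtf[k]:.4f}, Gaussian theory = {expected:.4f}")
```

For a real measurement the apparent edge profile would replace the synthetic one, and the transform of the actual edge's derivative would be divided out rather than assumed to be flat.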
The third method (DAVIDSON [1968], WATANABE et al. [1968]) is more direct and similar to the magnitude estimation method. It is based on an observer rating the relative contrast of two sinusoidal patterns of different spatial frequencies. Results for the split-field method at high luminance (BRYNGDAHL [1966]) are shown in Fig. 4. These methods are described in more detail in Appendix 1; some of their shortcomings, too, are described there.
3.5. COMPARISON OF RESULTS
The mtf's for the two fractions of the visual system can be combined and compared with those found for the total visual system. Such a comparison does not seem to have been made.* It is not attempted here in view of the large discrepancies between the data obtained by the various workers for the total visual system. These discrepancies make such a comparison too uncertain to be of much value.
* One author evaluated the optical portion directly (CAMPBELL and GUBISCH [1966]) and also as the ratio between the mtf's obtained for the total system and for the retina-brain portion (CAMPBELL [1968]). However, a comparison of these was not included and, indeed, the discrepancies seem quite large, especially at the higher spatial frequencies.
In attempting to compare the results published by the numerous investigators, it must be noted that many factors affect the performance of the visual system in general (in addition to the special factors associated with threshold measurements). Among these are pupil diameter, the state of accommodation, and retinal illumination and adaptation level. The effect of pupil diameter must, at least in part, be due to the changed optical mtf: a reduction in the amount of longitudinal spherical aberration, and in the effect of both this and chromatic aberration, with reduction in pupil diameter. The effect of accommodation, too, has been investigated (SCHOBER and HILZ [1965]): a significant deterioration of the performance has been noted as the eye accommodates to shorter viewing distances. This may be due to increased aberrations contributed by the eye's lens, whose anterior curvature increases significantly with such accommodation.
The effect of the retinal illumination level on the mtf is less obvious. We would, of course, expect this level to affect the mtf via the noise level for those data which were obtained by the threshold method. On the other hand, if this noise has a generally uniform spectrum, the detected noise should have a spectrum matching the visual mtf at all levels. The change in noise level should, therefore, affect the level of the threshold but not its relative spectral pattern. The effect of the illumination level on the mtf shape as measured by supra-threshold methods is even more difficult to explain. These variations imply that the visual spread function changes, perhaps in a manner analogous to changes in the spectral sensitivity curve, in the Purkinje effect (SOUTHALL [1937] pp. 274-275), which is explained in terms of a gradual transfer from rod to cone vision with increasing illumination. In § 5 it is shown that a major part of these changes may be due to the non-linearity in effects responsible for the shape of the mtf.
On the whole, the published data on the visual system mtf agree qualitatively: at low spatial frequencies the mtf rises with increasing frequency until a maximum is attained; as the spatial frequency increases further, the mtf levels off and begins to drop.* (Some workers report a secondary maximum beyond the primary peak (PATEL [1966], LOWRY and DE PALMA [1961]).) At still higher frequencies, the mtf drops at an approximately exponential rate. Quantitatively, however, agreement between the various measurements is extremely poor, even though these can be compared only on a relative scale; each is known only within a proportionality factor which can be chosen independently for each set of data. Even the choice of a reference point, where all the curves could be made to agree, is not obvious. The techniques of two authors (MENZEL [1959], BRYNGDAHL [1964, 1966]) ensure that the mtf at zero spatial frequency is unity. The data of other workers cannot, however, be extrapolated reliably to zero frequency, and this can therefore not be used as a reference point. Another reasonable choice of reference point would seem to be the mtf peak. But even the location of the peak varies widely among the reports.
Indeed, rather than treating it as a reference point, the location of the peak may be used as an index for illustrating the extent of the discrepancies. Figure 5 shows the spatial frequency at the mtf peak, as a function of the mean retinal illumination, as reported by nine different workers. Only those who reported results with sinusoidal, rather than “square-wave”, test patterns are included. Retinal illumination was used as the independent variable, because dependence on it seems to have been investigated more fully than dependence on pupil diameter and accommodation. The values of these latter parameters, for each plot, are listed in the box insert in the figure. To note the magnitude of the inconsistencies, compare, for instance, the results of PATEL [1966] with those of VAN NES and BOUMAN [1967], obtained under almost identical conditions and yet differing by a factor of almost two over most of the range. An extreme discrepancy appears when the results of SCHADE [1956] are compared with those of BRYNGDAHL [1964]; they differ by a factor of four throughout and at low light levels by a factor of eight.
* It has been stated (LOWRY and DE PALMA [1961], RONCHI and VAN NES [1966]) that some observers (WESTHEIMER [1960], ROSENBRUCH [1959]) have failed to confirm the existence of a peak away from the origin. These statements are irrelevant, however, since the observers cited did not investigate the low spatial frequencies at which the low-frequency depression occurs.
Fig. 5. Location of the peak of the visual mtf as a function of mean retinal illumination (Td, logarithmic scale from 0.1 to 10k), as reported by a number of workers. Pupil diameters and viewing distances used are listed in the insert. 1. SCHADE [1956], 2. PATEL [1966], 3. VAN NES and BOUMAN [1967], 4. BRYNGDAHL [1964, 1966], 5. DE PALMA and LOWRY [1962], 6. LOWRY and DE PALMA [1961], 7. CAMPBELL [1968], 8. DAVIDSON [1968], 9. FRY [1969].
Such discrepancies are unusual even for psychophysical measurements, especially in view of the obviously great care taken in obtaining the data. Undoubtedly, part of the explanation lies in the great variations between individuals, but the available data do not indicate that such variations can account completely for the discrepancies. Perhaps there are other variables, not yet considered, which influence the results so drastically, and a search for such variables would seem to be in order. One factor, the orientation of the test pattern, has already been shown to affect the threshold. Visual sensitivity is highest for vertical and horizontal lines and significantly lower for lines at 45° to these (WATANABE et al. [1968]).
§ 4. Noise in the Visual System
The third fundamental characteristic of the visual system is its noise. We recall that this includes any factor which causes an unpredictable deviation of the detected value from the actual one. This includes not only measurable physiological quantities, such as spurious neural pulses, but also random factors occurring at the higher, cortical level. Thus we might say that noise varies inversely with attention and that the experience of an observer in an experiment reduces the noise level in his visual system, for the particular task involved in that experiment.
4.1. THRESHOLD MEASUREMENTS
Since the observer does not sense this noise directly, we must use indirect methods for evaluating it. Although estimation experiments could be used to determine noise levels, detection experiments, determining the threshold signal levels, are easier to apply. In its simplest form, this approach equates the just noticeable difference (jnd) with the noise level as mapped into the stimulus domain. Much work has been done to establish the luminance jnd. The results of one major investigation (BLACKWELL [1946]) covering the threshold contrast, C, for circular discs over a large range of illumination levels are shown in Fig. 6. Contrast is defined as
$C = \Delta E/E$ (10)
where E is the retinal illumination due to the background and $\Delta E$ is an illumination increment. For threshold contrast, $\Delta E$ is the just noticeable increase in illumination. The retinal illumination is given in troland.* These data correspond to a 50% detection probability.
* The illumination in troland is the object luminance (in cd/m²) multiplied by the pupil area (in mm²). The original report presented the data in terms of luminance. Published data (DE GROOT and GEBHARD [1952]) concerning pupil diameter were used to convert to retinal “illumination”. The troland is popularly referred to as a unit of illumination, though, dimensionally, it is a unit of intensity and represents the intensity at the observer's pupil.
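The two definitions above (the troland and the contrast of eq. (10)) amount to simple arithmetic, sketched here for concreteness. The circular pupil and the function names are assumptions introduced for this illustration; the numerical inputs are arbitrary.

```python
import math


def retinal_illumination_td(luminance_cd_m2: float, pupil_diameter_mm: float) -> float:
    """Retinal illumination in troland: luminance (cd/m^2) times pupil area (mm^2).

    Assumes a circular pupil of the given diameter.
    """
    pupil_area_mm2 = math.pi * (pupil_diameter_mm / 2.0) ** 2
    return luminance_cd_m2 * pupil_area_mm2


def contrast(delta_e: float, background_e: float) -> float:
    """Contrast C = delta_E / E, with both quantities in the same units."""
    return delta_e / background_e


if __name__ == "__main__":
    e = retinal_illumination_td(100.0, 2.0)
    print(f"100 cd/m^2 through a 2 mm pupil: {e:.1f} Td")
    print(f"an increment of 5 Td on that background: C = {contrast(5.0, e):.4f}")
```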
Fig. 6. Threshold contrast for uniformly luminous circular discs of various diameters, as a function of log retinal illumination (log Td, from -4 to 4). (From BLACKWELL [1946].)
The absolute thresholds, i.e. the illumination required for detection against an absolutely dark background, are listed in Table 1.
TABLE 1
Retinal illumination at absolute threshold
Target diameter (minutes of arc)    Illumination
3.6                                 77.6
9.68                                9.42
18.2                                3.66
55.2                                0.518
121                                 0.210
The fact that the threshold contrast varies significantly with luminance level and target size demonstrates that no single number can describe the noise characteristics in any useful way. If the target had not been circular, or not been uniform, what would the threshold contrast have been? What is the threshold contrast for 99% detection probability? To answer such questions, the spatial spectrum of the noise and its dependence on the luminance level must be investigated, and it is not enough to obtain an “effective” noise level; the distribution of noise levels must be determined. Once all this is known, we can calculate the detection probability for any target assuming an optimum detection strategy, or any other specific detection strategy. But what strategy does the visual system use? Clearly, much more must be known about the higher-level functioning of vision before detailed noise characteristics can be extracted from threshold data. It is therefore with a severely limited amount of experimental data that we approach the investigation of noise characteristics.
4.2. NOISE SOURCES AND LUMINANCE DEPENDENCE
On the basis of physical theory, we must ascribe to the visual system at least three noise sources.
1. Detector noise
By whatever mechanism the eye converts radiation into a neural impulse, we must expect this mechanism to operate occasionally even in the absence of incident radiation, and, the more sensitive the detector, the more likely it is to be triggered spontaneously. This prediction has been confirmed experimentally and the effect has been called “dark light” (RUSHTON [1963]), $n_0$.
2. Sensation and neural noise
At the other end of the neural pathway, we must expect spontaneous stimulation of sensation, even in the absence of neural activity in the optic nerve: the process which translates neural impulses into sensation must be expected to take place occasionally even in the absence of neural impulses (LEVI [1969]), resulting in sensation noise, $n_s$. In addition to this “sensation noise”, spurious pulses must be expected to occur at all neural terminals, such as in the plexiform layers in the retina and in the lateral geniculate body. We shall not discuss these further, however, because at the present state of knowledge they may be treated together with either of the preceding two noise sources.
3. Radiation noise
In addition to the noise sources internal to the visual system, there is noise superimposed on the entering radiation at the time it enters the eye. This noise is due to the quantum nature of light: the fact that light arrives in individual quanta. This noise has been analyzed most extensively and is often referred to as “quantum noise” (SCHADE [1956], ROSE [1957], BOUMAN [1961], MORGAN [1965]).
These three types of noise differ significantly in their dependence on luminance level. The sensation noise may be expected to remain constant, independent of luminance. The other noise terms are affected by the “gain” of the visual system. Since the gain factor (k in eq. (3)) depends strongly on the adaptation level and, therefore, on the retinal illumination, these terms, too, must be expected to vary with illumination. In addition, whereas the detector noise should remain constant, the radiation noise varies directly with the square root of the illumination, as shown by statistical considerations. Thus there are at least three terms in the description of the illumination dependence of noise, $N_\psi$, in the sensation domain:
$N_\psi(E) = k(E)\,n_0^{\beta} + k(E)\,(a\sqrt{E})^{\beta} + n_s$, (11)
where a is a constant and $n_0$, $n_s$ are the noise levels defined above. We drop the $L_0$-term in eq. (3), because it is negligible when operating near the adaptation luminance level. It is usually more convenient to express the dependence in the stimulus domain, and this is readily written by dividing each term by k and raising it to the $\beta^{-1}$ power:
$N_E(E) = n_0 + a\sqrt{E} + [n_s/k(E)]^{1/\beta}$. (12)
If we substitute for k its value as given in eq. (4), we find, finally:
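As a numerical aside on eqs. (11) and (12), the sketch below evaluates both forms for a user-supplied gain function. The form of k(E) from eq. (4) is not reproduced here, so the gain is passed in as an arbitrary callable; the constant gain and all parameter values in the example are placeholders, and the function names are hypothetical.

```python
import math
from typing import Callable


def noise_sensation_domain(e: float, gain: Callable[[float], float],
                           beta: float, n0: float, ns: float, a: float) -> float:
    """Eq. (11): detector, radiation and sensation terms in the sensation domain."""
    k = gain(e)
    return k * n0 ** beta + k * (a * math.sqrt(e)) ** beta + ns


def noise_stimulus_domain(e: float, gain: Callable[[float], float],
                          beta: float, n0: float, ns: float, a: float) -> float:
    """Eq. (12): each term of eq. (11) divided by k and raised to the 1/beta power."""
    return n0 + a * math.sqrt(e) + (ns / gain(e)) ** (1.0 / beta)


if __name__ == "__main__":
    constant_gain = lambda e: 2.0  # placeholder; the real k(E) varies with E
    args = dict(gain=constant_gain, beta=0.5, n0=1.0, ns=4.0, a=0.1)
    print(noise_sensation_domain(100.0, **args))
    print(noise_stimulus_domain(100.0, **args))
```

Note that eq. (12) maps the three terms separately; it is not the exact inverse of the summed expression in eq. (11).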
In the limits, then, we have for the illumination-to-noise ratio:
E - N
E/cl
for E $\left|\int L(x)\,dx\right|$. The triangle inequality shows that this is impossible with $L(x)$ real and positive: $\left|\int L(x)\exp(i2\pi\nu x)\,dx\right| \le \int \left|L(x)\exp(i2\pi\nu x)\right| dx = \int L(x)\left|\exp(i2\pi\nu x)\right| dx = \int L(x)\,dx$.
** These have been called “finite-spread” non-linearities in contrast to the “zero-spread” non-linearities treated in the preceding section (INGELSTAM [1965]).
inhibitory effect grows super-linearly with the illumination at the point causing the inhibition or with the adaptation level. The inhibitory effect seems to decrease with decreasing mean illumination to such a degree that one worker (PATEL [1966]) has found the low-frequency depression of the mtf to disappear entirely at 3 Td. Such a non-linearity would also account for the broadening of the low-frequency depression as the luminance rises, as reported by many who have studied the effects of mean luminance on mtf, regardless of the method used (SCHADE [1956], PATEL [1966], DE PALMA and LOWRY [1962], VAN NES and BOUMAN [1967], BRYNGDAHL [1966]). In Fig. 5 this finds expression in an upward shift of the peak frequency with rising luminance. Only one worker (BRYNGDAHL [1966]) seems to have reported separate brightness data for peaks and troughs of sinusoidal test patterns, permitting a more detailed study of non-linearities. He shows the apparent luminance of peaks and troughs as a function of modulation; one typical example is reproduced in Fig. 8.
Fig. 8. Apparent (solid lines) and actual (broken lines) peak and trough luminance in a sinusoidal test pattern of 15 cycles/mm on the retina, plotted against modulation. (From BRYNGDAHL [1966].)
At photopic brightness levels, the curves for the apparent trough luminance show a descent rate which is much higher than that for the actual trough luminance (shown in a broken line in Fig. 8). But when the modulation exceeds about 0.3, the descent begins to slow down progressively. (For modulations of 0.5 and larger, the apparent trough luminance remains almost constant at zero.) This is the behavior expected on the basis of the earlier hypothesis of “effect sub-linearity”. The initial high descent rate is simply a manifestation of the inhibition effect due to the neighboring luminance peaks, as these become more and more pronounced. The subsequent flattening is the result of the saturation effect. The super-linearity of the apparent peak luminance is less obvious. It may be explained, however, on considering the effect of the comparison field on adaptation. When this field is less luminous than the test pattern, as it is during trough measurements, it will not have much effect on adaptation. When it is brighter, however, as it is during peak measurements, it will affect adaptation significantly (STEVENS [1966]) and, hence, enhance the inhibition effect and, with it, the apparent relative peak luminance, due to the “cause super-linearity” hypothesized earlier.
5.4. STATIONARITY, HOMOGENEITY, OR ISOPLANATISM
Fourier transform analysis is applicable only if the system characteristics are spatially constant. Clearly, if the mtf changes in the distance covered by one signal cycle, it cannot be very meaningful to speak of an mtf. In vision this may mean that such analysis must always be restricted to rather small area elements. If this restriction is observed, however, the mtf concept can still be very useful (DAVIDSON [1968]). This requirement, for mtf invariant with location, is analogous to the stationarity requirement in statistical analyses. It has also been called homogeneity (DAVIDSON [1968]) and isoplanatism (BORN and WOLF [1965]).
Acknowledgments

The work for this review was supported by the Office of Naval Research. The encouragement by Dr. G. C. Tolhurst, Chief, Physiological Psychology Branch, ONR, and Prof. H. Lustig, Chairman, Dept. of Physics, City College, is gratefully acknowledged.

APPENDIX
Methods of Measuring the MTF of the Total Visual System above Threshold

Three methods of measuring the visual mtf above threshold were mentioned. Here these are described in some more detail.
In the first method (BRYNGDAHL [1964, 1966]) the field is split down the middle. In one half of the field the observer sees a sinusoidally varying luminance pattern and in the other half a uniformly illuminated field. The observer then adjusts the luminance of the uniform field until it matches the peaks of the sinusoidal luminance pattern, and the known luminance $L_{o\,\max}$ of the uniform field is recorded. This is then repeated for a match to the troughs, yielding the minimum apprehended luminance equivalent ($L_{o\,\min}$). These measurements yield the luminance-equivalent perceived modulation:

$$M_o = (L_{o\,\max} - L_{o\,\min})/(L_{o\,\max} + L_{o\,\min}).$$

When this is compared with the objectively measured modulation, $M$, in the test pattern, the value of the luminance-equivalent mtf at that frequency is obtained:

$$T(\nu) = M_o/M,$$

where the modulation

$$M = (L_{\max} - L_{\min})/(L_{\max} + L_{\min}),$$

and $L_{\max}$, $L_{\min}$ are, respectively, the actual luminances at the peaks and troughs of the test pattern.

In the other split-field approach the observed luminance variations are measured across an edge which, objectively, represents a rapid transition from light to dark. When such a transition is viewed, an “overshoot” phenomenon is observed, i.e. an even lighter band appears in the light region near the transition and an even darker band in the dark region. These bands are referred to as Mach bands (GRAHAM [1965]). The resulting apparent luminances were measured by means of a controllable comparison field, and the results of these measurements yield the integral of the line spread function of the visual system. The Fourier transforms were found for the derivatives of both the apparent and the actual luminance functions. Their ratio yields the mtf (LOWRY and DE PALMA [1961]).

In the direct method, two sinusoidal patterns are presented to the observer successively, and he is asked to indicate which has the greater modulation. One of these is at the “standard” spatial frequency, which is picked arbitrarily, and the other is at the frequency at which the mtf is to be found. This method yields, then, the mtf relative to that at the “standard” frequency (DAVIDSON [1968], WATANABE et al. [1968]).
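The arithmetic of the first split-field method can be illustrated with a minimal sketch. The function names and the luminance values below are hypothetical and chosen only for illustration; they are not measurements from the cited work.

```python
def modulation(l_max, l_min):
    """Modulation (L_max - L_min) / (L_max + L_min) of a luminance pattern."""
    if l_max + l_min <= 0:
        raise ValueError("luminances must have a positive sum")
    return (l_max - l_min) / (l_max + l_min)


def luminance_equivalent_mtf(match_peak, match_trough, actual_peak, actual_trough):
    """Ratio M_o / M of perceived (matched) to actual modulation."""
    m_o = modulation(match_peak, match_trough)   # from the comparison-field matches
    m = modulation(actual_peak, actual_trough)   # measured in the test pattern
    return m_o / m


# Hypothetical example: matches at 130 and 70 units against a test pattern
# whose actual peak and trough luminances are 120 and 80 units.
t_nu = luminance_equivalent_mtf(130.0, 70.0, 120.0, 80.0)
print(t_nu)
```

A value above unity, as in this made-up example, corresponds to the apparent modulation “enhancement” discussed in the text.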
In this context a brief criticism of the various techniques may be appropriate. The threshold techniques suffer from a weakness analogous to the one mentioned in connection with the brightness function. In these techniques, the actual quantity measured is the threshold contrast. The reciprocal of this contrast, as a function of spatial frequency, is then taken to be identical with the mtf, within a proportionality factor. This conclusion is based on the tacit assumption that the “noise” in the visual system affects the detection of all spatial frequencies equally. The validity of this assumption is the subject of Section 4.3.

In the supra-threshold split-field methods, luminance is compared to luminance so that the non-linearities of the brightness function (Section 2.2) are canceled out of the results. Therefore we should not expect the mtf values obtained to correspond to the perceived brightness mtf. (For small modulations, the perceived brightness mtf should be β times the measured luminance mtf.) In the split-field method employing the sinusoidal pattern, the experiment is clearly set up so that it must yield unity mtf at the origin (ν = 0), so that the absolute values of the mtf obtained there have no physical significance. Thus, the modulation “enhancement” apparent at intermediate frequencies may not be an enhancement in the absolute sense. In view of the heavy dependence of the mtf on adaptation level (cf. Fig. 5), the major weakness of this method would seem to be its interference with adaptation. Changes in adaptation level, as measurements are made, would seem to be inevitable and this, in turn, must distort the results obtained. This is discussed further in Section 5.3.
The supra-threshold method in which modulations at different frequencies are compared successively would seem to suffer from the same weakness as the magnitude estimation methods - inherent inaccuracies and large scatter.
References

ARNULF, A. and O. DUPUY, 1960, Compt. Rend. Acad. Sci. Paris 250, 2757.
BERGER-L’HEUREUX-ROBARDEY, S., 1965, Rev. Opt. 44, 294.
BLACKWELL, H. R., 1946, J. Opt. Soc. Am. 36, 624.
BLACKWELL, H. R., 1953, J. Opt. Soc. Am. 43, 456.
BORN, M. and E. WOLF, 1965, Principles of Optics, 3rd ed. (Pergamon, London) p. 482.
BOUMAN, M. A., 1961, History and Present Status of Quantum Theory in Vision, in: Sensory Communication, ed. W. A. Rosenblith (MIT, Cambridge) p. 377.
BRYNGDAHL, O., 1964, J. Opt. Soc. Am. 54, 1152.
BRYNGDAHL, O., 1966, J. Opt. Soc. Am. 56, 811.
CAMPBELL, F. W., 1968, Proc. IEEE 56, 1009.
CAMPBELL, F. W. and D. G. GREEN, 1965, J. Physiol. 181, 576.
CAMPBELL, F. W. and R. W. GUBISCH, 1966, J. Physiol. 186, 558.
COLTMAN, J. W. and A. E. ANDERSON, 1960, I.E.E.E. Proc. 48, 858.
DAVIDSON, M., 1968, J. Opt. Soc. Am. 58, 1300.
DE GROOT, S. G. and J. W. GEBHARD, 1952, J. Opt. Soc. Am. 42, 492.
DEMOTT, D. W., 1959, J. Opt. Soc. Am. 49, 571.
DEPALMA, J. J. and E. M. LOWRY, 1962, J. Opt. Soc. Am. 52, 328.
EKMAN, G., 1958, J. Psychol. 45, 287.
FLAMANT, F., 1955, Rev. d’Opt. 34, 433.
FRY, G. A., 1969, J. Opt. Soc. Am. 59, 610.
GRAHAM, C. H., 1965, Vision and Visual Perception (Wiley, New York) p. 549.
GREEN, D. M. and J. A. SWETS, 1966, Signal Detection Theory and Psychophysics (Wiley, New York) Ch. 4.
GROSSKOPF, H., 1963, Vision Res. 3, 457.
INGELSTAM, E., 1965, Japan J. Appl. Phys. 4, Suppl. 1, 15.
KRAUSKOPF, J., 1962, J. Opt. Soc. Am. 52, 1046.
LEVI, L., 1969, Nature 223, 396.
LOWRY, E. M. and J. J. DE PALMA, 1961, J. Opt. Soc. Am. 51, 740.
MARIMONT, R. B., 1963, J. Opt. Soc. Am. 53, 400.
MENZEL, E., 1959, Naturwissenschaften 46, 316.
MORGAN, R. H., 1965, Am. J. Roentgenol. Radium Therapy Nucl. Med. 93, 982.
O’NEILL, E. L., 1963, Introduction to Statistical Optics (Addison-Wesley, Reading, Mass.) Appendix A-3.
PATEL, A. S., 1966, J. Opt. Soc. Am. 56, 689.
ROHLER, R., 1962, Vision Res. 2, 391.
ROHLER, R., U. MILLER and M. ABERL, 1969, Vision Res. 9, 407.
RONCHI, L. and F. L. VAN NES, 1966, Atti Fond. Giorgio Ronchi Contrib. Ist. Nazl. Ottica 21, 218.
ROSE, A., 1967, Advan. Biol. Med. Phys. 5, 211.
ROSENBRUCH, I = uAluA). (2.15) n
(2.16) (2.17)
If we make use of the diagonal representation of the density operator (SUDARSHAN [1963]; GLAUBER [1963]; MEHTA [1967]; KLAUDER and SUDARSHAN [1965]) (2.18) we may express the expectation value on the right-hand side of (2.12) as an integral
(2.19)
where
$$W' = W'(t, T) = \int d\mathbf{r} \int_t^{t+T} dt'\, V^*(\mathbf{r}, t')\, V(\mathbf{r}, t').$$
(2.20)
In obtaining (2.19), we also used eq. (2.16) and its Hermitean adjoint. Further, if we define
we may rewrite eq. (2.19) in the form (2.22)
The functional $\phi(\{v_k\})$ is properly normalized, but, in general, it is not a positive definite functional. Consequently, in general, $P(W)$ is not a positive definite function. However, for radiation fields produced from most of the available sources one may interpret $P(W)$ as a probability function. In general, of course, one must regard it as a generalized function. We thus find that the basic formula of photoelectron counting is essentially unaltered by field quantization.

Let us consider some general consequences of the photoelectron counting formula (2.1) or (2.22). The average and the mean square of the number of photoelectrons are given by

$$\langle n \rangle = \sum_n n\, p(n, t, T) = \alpha \langle W(t, T) \rangle, \tag{2.23}$$

$$\langle n^2 \rangle = \sum_n n^2\, p(n, t, T) = \alpha \langle W \rangle + \alpha^2 \langle W^2 \rangle. \tag{2.24}$$
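From (2.23) and (2.24), we obtain the following expression for the variance, first given by MANDEL et al. [1964],

$$\langle (\Delta n)^2 \rangle = \langle n^2 \rangle - \langle n \rangle^2 = \langle n \rangle + \alpha^2 \langle (\Delta W)^2 \rangle, \tag{2.25}$$

where

$$\langle (\Delta W)^2 \rangle = \langle W^2 \rangle - \langle W \rangle^2. \tag{2.26}$$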
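As an illustrative check of the variance formula (2.25), the sketch below builds the counting distribution as a Poisson mixture over a simple two-valued distribution of $W$ and compares the variance of $n$ with $\langle n \rangle + \alpha^2 \langle (\Delta W)^2 \rangle$. The two-valued $P(W)$, the value of $\alpha$ and all function names are assumptions made for this example only.

```python
import math


def poisson_pmf(n, mean):
    """Poisson probability of n counts for a given (positive) mean."""
    return math.exp(-mean + n * math.log(mean) - math.lgamma(n + 1))


def count_distribution(alpha, w_values, w_probs, n_max):
    """p(n) as a Poisson mixture: sum over W of Poisson(n; alpha*W) * P(W)."""
    return [
        sum(q * poisson_pmf(n, alpha * w) for w, q in zip(w_values, w_probs))
        for n in range(n_max + 1)
    ]


alpha = 0.5                 # assumed quantum efficiency factor
w_values = [2.0, 10.0]      # assumed: W takes only these two values
w_probs = [0.5, 0.5]

p = count_distribution(alpha, w_values, w_probs, n_max=80)

mean_n = sum(n * pn for n, pn in enumerate(p))
var_n = sum(n * n * pn for n, pn in enumerate(p)) - mean_n ** 2

mean_w = sum(w * q for w, q in zip(w_values, w_probs))
var_w = sum(w * w * q for w, q in zip(w_values, w_probs)) - mean_w ** 2

# Particle-like term <n> plus wave-like term alpha^2 <(Delta W)^2>.
predicted = alpha * mean_w + alpha ** 2 * var_w
print(var_n, predicted)
```

The first term of `predicted` is the Poisson (particle) contribution and the second is the excess due to fluctuations of $W$.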
Formula (2.25) shows that the variance of the fluctuations in the number of photoelectrons may be regarded as consisting of two parts: (1) the fluctuations in the number of classical particles obeying a Poisson distribution (term $\langle n \rangle$); (2) the fluctuations in the classical wave field (the wave interference term $\alpha^2 \langle (\Delta W)^2 \rangle$). This result is strictly analogous to a celebrated result of EINSTEIN [1909a, b] relating to energy fluctuations in an enclosed blackbody radiation.* A similar formula was later derived by FURTH [1928b] for energy fluctuations in a thermal radiation field of arbitrary spectral profile. More recently GHIELMETTI [1964] derived an expression analogous to eq. (2.22), with $n$ representing the total number of photons (rather than the number of photoelectrons emitted in a detector) in a radiation field. Thus we find that strictly analogous formulae hold for both the energy fluctuations in a closed radiation field and those in the photoelectric counts registered in a photo-detector.

It was noted earlier in the quantum mechanical derivation of formula (2.22) that $P(W)$ is, in general, not a positive definite function. Thus the result that $\langle (\Delta W)^2 \rangle \ge 0$, which holds for all classical probability distributions $P(W)$, is not necessarily true in general. One may therefore expect to find cases where the variance of the fluctuations in the number of photoelectric emissions becomes smaller than that expected from classical particle statistics. For example, when the radiation field has a well defined number of photons (i.e. when the density operator corresponds to an eigenstate of the number operator), $\langle (\Delta n)^2 \rangle = 0$, and from (2.25) we see that $\langle (\Delta W)^2 \rangle$ is then negative. For radiation fields from a well stabilized laser, the intensity is essentially constant and $\langle (\Delta W)^2 \rangle = 0$. The variance $\langle (\Delta n)^2 \rangle$ is then seen to be equal to $\langle n \rangle$, i.e. the same as that for a system of classical particles. For fields obtained from thermal sources, $P(W)$ is always positive so that for such fields $\langle (\Delta W)^2 \rangle \ge 0$ and hence the variance of the fluctuation in the number of photoelectrons is always greater than $\langle n \rangle$. One may also relate various other moments of $n$ to those of $W$.
Thus in particular, one finds that the kth factorial moment of $n$ is simply proportional to the kth moment of $W$:

$$\langle n^{(k)} \rangle \equiv \langle n(n-1) \cdots (n-k+1) \rangle = \alpha^k \langle W^k \rangle. \tag{2.27}$$
* Einstein gave a purely dimensional argument to interpret the second term in his formula for the variance of the energy fluctuations. LORENTZ [1916] later verified the correctness of this interpretation (see also the footnote on p. 434 of this article). Einstein’s formula may thus be regarded as a reflection of the wave-particle dualism of the radiation field. For a lucid account of the significance of Einstein’s result, see BORN [1949].
§ 3. Intensity Fluctuations

The basic quantity which enters in the formula for the photoelectron counting distribution is the probability density of the integrated light intensity $W$. In this section we will investigate the form of this probability density for a few of the typical cases of interest.

3.1. POLARIZED THERMAL LIGHT
Let us assume that the light beam is completely polarized. In this case we can describe the wave field by a scalar random process $V(t)$ in the form of an analytic signal.* We also assume that the light beam falling on the detector originates from a thermal source. The random function $V(t)$ may then be represented as a stationary complex Gaussian process. The instantaneous intensity $I(t)$ is given by

$$I(t) = V^*(t)\, V(t). \tag{3.1}$$

Since $V$ is distributed according to a Gaussian distribution, the probability density of $I$ is an exponential function

$$P(I) = \frac{1}{\langle I \rangle} \exp(-I/\langle I \rangle). \tag{3.2}$$

We are interested in the statistical properties of the integrated light intensity

$$W = \int_0^T I(t)\, dt. \tag{3.3}$$

From (3.1) and (3.3) it follows immediately that the mean and variance of $W$ are given by

$$\langle W \rangle = \langle I \rangle T, \tag{3.4}$$

$$\langle (\Delta W)^2 \rangle = \int_0^T \!\! \int_0^T |\Gamma(t - t')|^2\, dt\, dt', \tag{3.5}$$

where $\Gamma(\tau)$ is the correlation function

$$\Gamma(\tau) = \langle V^*(t)\, V(t + \tau) \rangle. \tag{3.6}$$
* For an introduction to the statistical description of wavefields see MANDEL and WOLF [1965] § 3.
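In deriving eq. (3.5) we have made use of the moment theorem relating to the complex Gaussian random process [cf. eq. (A.5b) of the appendix]. It is shown later in this section [eq. (3.31)] that the nth cumulant of $W$ can be expressed as

$$\kappa_n = (n-1)! \int_0^T \!\cdots\! \int_0^T \Gamma(t_1 - t_2)\, \Gamma(t_2 - t_3) \cdots \Gamma(t_n - t_1)\, dt_1 \cdots dt_n. \tag{3.7}$$

A result similar to this was conjectured for a real Gaussian random process by RICE [1945] and was later proved by MIDDLETON [1957] and SLEPIAN [1958]. Eqs. (3.4) and (3.5) are special cases of (3.7). From the knowledge of the cumulants, one can determine the moments of $W$. Thus in particular

$$\langle W \rangle = \kappa_1,$$
$$\langle W^2 \rangle = \kappa_2 + \kappa_1^2,$$
$$\langle W^3 \rangle = \kappa_3 + 3\kappa_2 \kappa_1 + \kappa_1^3,$$
$$\langle W^4 \rangle = \kappa_4 + 4\kappa_3 \kappa_1 + 3\kappa_2^2 + 6\kappa_1^2 \kappa_2 + \kappa_1^4,$$
$$\langle W^n \rangle = \sum_{k=1}^{n} \frac{1}{k!} \sum \frac{n!}{n_1!\, n_2! \cdots n_k!}\, \kappa_{n_1} \kappa_{n_2} \cdots \kappa_{n_k}, \tag{3.8}$$

where the summation on the right-hand side of the last expression includes all possible positive integers $n_1, n_2, \ldots, n_k$ such that

$$n_1 + n_2 + \ldots + n_k = n. \tag{3.9}$$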
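The conversion from cumulants to moments can also be carried out recursively. The following sketch uses the standard recursion $\langle W^n \rangle = \sum_{k=1}^{n} \binom{n-1}{k-1} \kappa_k \langle W^{n-k} \rangle$, which is an equivalent textbook identity and not the form given in the text; the function name is hypothetical.

```python
from math import comb


def moments_from_cumulants(kappa):
    """Raw moments <W>, <W^2>, ... from cumulants kappa_1, kappa_2, ...

    kappa[0] holds kappa_1. Uses the recursion
    m_n = sum_{k=1..n} C(n-1, k-1) * kappa_k * m_{n-k}, with m_0 = 1.
    """
    m = [1.0]  # m[0] = <W^0>
    for n in range(1, len(kappa) + 1):
        m.append(sum(comb(n - 1, k - 1) * kappa[k - 1] * m[n - k]
                     for k in range(1, n + 1)))
    return m[1:]


# Arbitrary test cumulants, compared with the explicit fourth-moment formula.
k1, k2, k3, k4 = 2.0, 3.0, 5.0, 7.0
moments = moments_from_cumulants([k1, k2, k3, k4])
explicit_m4 = k4 + 4 * k3 * k1 + 3 * k2 ** 2 + 6 * k1 ** 2 * k2 + k1 ** 4
print(moments, explicit_m4)
```

For an exponential density with unit mean, whose cumulants are $\kappa_n = (n-1)!$, the same function returns the moments $n!$.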
So far we have considered only the statistical constants of $W$. It is very difficult to derive an exact expression for the probability density of $W$ in which $T$ is arbitrary. In fact no simple expression for $P(W)$ is known for any case of direct physical interest. However, it is not difficult to derive asymptotic expressions for $P(W)$ when the parameter $T$ is either very small or very large. When $T$ is very small compared to the coherence time $T_c$, the intensity $I(t)$ may be considered to be constant in the time interval of duration $T$ and $W$ is then, approximately, equal to $IT$. Hence from eq. (3.2) we may write

$$P(W) = \frac{1}{\langle W \rangle} \exp(-W/\langle W \rangle), \tag{3.10}$$

where $\langle W \rangle = T \langle I \rangle$.
On the other hand, when $T$ is very large compared to the coherence time, we may divide this interval into a large number of sub-intervals each of which is greater than or of the order of $T_c$. The contributions to $W$ from each of these sub-intervals are random variables and may be considered as statistically independent. From the central limit theorem, one may then conclude that $W$ is approximately normally distributed (cf. RICE [1945]):

$$P(W) = \{2\pi \langle (\Delta W)^2 \rangle\}^{-1/2} \exp\{-\tfrac{1}{2}(W - \langle W \rangle)^2 / \langle (\Delta W)^2 \rangle\}. \tag{3.11}$$

The mean $\langle W \rangle$ and the variance $\langle (\Delta W)^2 \rangle$ may be determined from eqs. (3.4) and (3.5) respectively. In the limit when $T$ becomes very large, all the fluctuations in the intensity may be expected to be smoothed out on integration. $W$ may then be regarded as a constant corresponding to a δ-function distribution:

$$P(W) = \delta(W - \langle W \rangle). \tag{3.12}$$

Such a distribution is expected to be appropriate for light from an incandescent lamp, for example, for which even the fastest available detectors will average out all the fluctuations present in the light beam.

One may also obtain an approximate expression for $P(W)$ when $T$ is arbitrary. Let us divide the time interval $T$ into, say, $N$ sub-intervals each of length $\delta t$ ($T = N \delta t$). Let us further assume that it is possible to choose $\delta t$ so small that there are no appreciable fluctuations in the wave field during this time and also so large that the wavefields belonging to different time intervals are uncorrelated. Thus $\delta t$ is of the order of the coherence time. We may then regard $V(n \delta t)$; $n = 0, 1, \ldots, N-1$, as $N$ independent complex Gaussian random variables with equal variance and we may write

$$W \approx \sum_{n=0}^{N-1} V^*(n \delta t)\, V(n \delta t)\, \delta t. \tag{3.13}$$

From eq. (A.16) of the appendix, it then readily follows that

$$P(W) = \frac{a^N W^{N-1}}{2^N (N-1)!}\, e^{-\frac{1}{2} a W}. \tag{3.14}$$
The constants $a$ and $N$ may be determined from the requirement that the mean and the variance of $W$ should agree with those given by eqs. (3.4) and (3.5). Thus one obtains

$$N = \left[ \frac{1}{T^2} \int_0^T \!\! \int_0^T |\gamma(t - t')|^2\, dt\, dt' \right]^{-1}, \tag{3.15}$$

and

$$a = 2N/\langle W \rangle, \tag{3.16}$$

where $\gamma(\tau)$ is the normalized correlation function

$$\gamma(\tau) = \Gamma(\tau)/\Gamma(0). \tag{3.17}$$
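To make the role of the parameter $N$ concrete, the sketch below evaluates eq. (3.15) numerically and evaluates the Gamma density (3.14) with $a$ taken from (3.16). As an assumed example it uses the exponential correlation $|\gamma(\tau)| = e^{-\sigma|\tau|}$, for which carrying out the double integral gives $N = 2\beta^2/(2\beta - 1 + e^{-2\beta})$ with $\beta = \sigma T$; this closed form is worked out here for the illustration. The function names are hypothetical, and $(N-1)!$ is replaced by $\Gamma(N)$ so that non-integer $N$ can be used.

```python
import math


def cells_numeric(gamma_abs, T, steps=400):
    """N from eq. (3.15) by a midpoint-rule double integral of |gamma|^2."""
    h = T / steps
    pts = [(i + 0.5) * h for i in range(steps)]
    total = sum(gamma_abs(abs(t1 - t2)) ** 2 for t1 in pts for t2 in pts) * h * h
    return T * T / total


def cells_exponential(beta):
    """Closed form of N for |gamma(tau)| = exp(-sigma |tau|), beta = sigma*T."""
    return 2.0 * beta ** 2 / (2.0 * beta - 1.0 + math.exp(-2.0 * beta))


def gamma_density(w, n_cells, mean_w):
    """Gamma density of eq. (3.14), with a = 2N/<W> and Gamma(N) for (N-1)!."""
    a = 2.0 * n_cells / mean_w
    log_p = (n_cells * math.log(a / 2.0) + (n_cells - 1.0) * math.log(w)
             - a * w / 2.0 - math.lgamma(n_cells))
    return math.exp(log_p)


sigma, T = 1.0, 1.0  # assumed values, so beta = 1
n_num = cells_numeric(lambda tau: math.exp(-sigma * tau), T)
n_closed = cells_exponential(sigma * T)
print(n_num, n_closed)
```

For small $\beta$ the closed form tends to 1 and for large $\beta$ it grows like $\beta$, which is the behavior of $N$ in the two limits of the counting time.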
The distribution (3.14) of the integrated intensity $W$, which is seen to be a Gamma distribution, was first suggested by RICE [1945] as an appropriate approximate distribution. It agrees very well with the exact distribution in the two extreme limits when $T$ is either very small or very large. When $T$ is very small, we see from (3.15) that $N$ is nearly unity and $P(W)$ then reduces to an exponential function in agreement with eq. (3.10). When $T$ is very large, $N$ is also very large and in this limit $P(W)$ may be approximated by a Gaussian distribution and eventually tends to a δ-function in agreement with eqs. (3.11) and (3.12). The time interval $\delta t$ chosen in deriving the gamma distribution (3.14) has the physical interpretation of being of the order of the coherence time. From (3.15) we find on simplification that (3.18)

When $T$ is large, (3.18) reduces to Mandel’s definition of the coherence time (MANDEL [1959]), (3.19)

and appears to be a reasonable measure of coherence time for at least thermal radiation (see also WOLF [1958]; MANDEL and WOLF [1962]; MEHTA [1963]).

The problem of determining the probability density $P(W)$ may be reduced to solving an associated integral equation (SLEPIAN [1958]; KAC and SIEGERT [1947]). Let us assume that $\lambda_1, \lambda_2, \ldots$ are the eigenvalues of the integral equation

$$\int_0^T \Gamma(t - t')\, \phi^{(k)}(t')\, dt' = \lambda_k\, \phi^{(k)}(t), \tag{3.20}$$
arranged in decreasing order $\lambda_1 \ge \lambda_2 \ge \ldots$. The kernel $\Gamma(t - t')$ is Hermitian and positive definite (MEHTA et al. [1966]) and hence the eigenfunctions can be chosen to form an orthonormal set

$$\int_0^T \phi^{*(k)}(t)\, \phi^{(l)}(t)\, dt = \delta_{kl}, \tag{3.21}$$

and we may write

$$\Gamma(t - t') = \sum_k \lambda_k\, \phi^{*(k)}(t')\, \phi^{(k)}(t). \tag{3.22}$$

Let us also express the random function $V(t)$ as

$$V(t) = \sum_k c_k\, \phi^{(k)}(t), \tag{3.23}$$

where $c_k$ are random coefficients.* We may then write

$$\Gamma(t - t') = \langle V^*(t')\, V(t) \rangle = \sum_k \sum_l \langle c_k^* c_l \rangle\, \phi^{*(k)}(t')\, \phi^{(l)}(t). \tag{3.24}$$

On comparing (3.22) and (3.24) we obtain

$$\langle c_k^* c_l \rangle = \lambda_k \delta_{kl}. \tag{3.25}$$

Further, since $V(t)$ is a Gaussian random process, the coefficients $c_k$, which are linear functions of $V(t)$, are distributed according to a multivariate Gaussian distribution and it follows from (3.25) that

$$p(\{c_k\}) = \prod_k (\pi \lambda_k)^{-1} \exp\Big(-\sum_k \lambda_k^{-1} |c_k|^2\Big). \tag{3.26}$$

Now from eqs. (3.23), (3.3) and (3.21) it follows that

$$W = \sum_k |c_k|^2. \tag{3.27}$$

From eq. (A.14) of the appendix we therefore obtain the following expression for the characteristic function of $W$:

$$C(h) \equiv \langle e^{ihW} \rangle = \prod_k (1 - i \lambda_k h)^{-1}. \tag{3.28}$$
We see that the problem of finding the characteristic function associated with the random variable $W$ is reduced to finding the eigenvalues $\lambda_k$ of the integral equation (3.20) and then evaluating the product (3.28). From (3.28) we may also write down the cumulant generating function of $W$:

$$\ln C(h) = -\sum_k \ln(1 - i \lambda_k h). \tag{3.29}$$

Hence the nth cumulant of $W$ is given by

$$\kappa_n = (n-1)! \sum_k \lambda_k^n, \tag{3.30}$$

from which it follows that

$$\kappa_n = (n-1)! \int_0^T \Gamma^{(n)}(t, t)\, dt, \tag{3.31}$$

where

$$\Gamma^{(1)}(t, t') = \Gamma(t - t'), \tag{3.32}$$

and

$$\Gamma^{(n)}(t, t') = \int_0^T \Gamma^{(n-1)}(t, t'')\, \Gamma^{(1)}(t'', t')\, dt'', \qquad n \ge 2. \tag{3.33}$$

* Representation (3.23) of the random function $V(t)$ is known as the Karhunen-Loève expansion (see for example DAVENPORT and ROOT [1958] p. 96).
The evaluation of the eigenvalues $\lambda_k$ of the integral equation (3.20) can be carried out explicitly for the interesting case when the spectral profile of the light is Lorentzian* (SLEPIAN [1958]), i.e. when (3.34)

In eq. (3.34), $\nu_0$ is the mid-frequency and $\sigma$ is the half-width of the spectral line. In this case the correlation function $\Gamma(\tau)$ is given by

$$\Gamma(\tau) = \Gamma(0)\, e^{-\sigma|\tau|}\, e^{-2\pi i \nu_0 \tau}. \tag{3.35}$$

The integral equation with this kernel** is solved in Appendix B. From eqs. (B.12) and (B.19) of that appendix we obtain the following expression for the characteristic function:

$$C(h) = e^{\sigma T} \left[ \cosh z + \frac{1}{2} \left( \frac{z}{\sigma T} + \frac{\sigma T}{z} \right) \sinh z \right]^{-1}, \tag{3.36}$$

where

$$z = \{\sigma^2 T^2 - 2 i \sigma T \langle W \rangle h\}^{1/2}. \tag{3.37}$$

* Strictly speaking $g(\nu)$ must be zero when $\nu < 0$. However, for the quasi-monochromatic case $\nu_0 \gg \sigma$, we may take (3.34) to hold approximately for all frequencies.

** Actually the solution to the integral equation with kernel $\Gamma(t - t') \exp\{2\pi i \nu_0 (t - t')\}$ is given in that appendix. However, the eigenvalues $\lambda_k$ are the same in both cases.
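The exact characteristic function for the Lorentzian profile can be compared numerically with the characteristic function $(1 - 2ih/a)^{-N}$ of the Gamma density (3.14). The sketch below is a minimal illustration under stated assumptions: the exact expression is taken in the form $e^{\beta}[\cosh z + \tfrac{1}{2}(z/\beta + \beta/z)\sinh z]^{-1}$ with $z = (\beta^2 - 2i\beta x)^{1/2}$, $\beta = \sigma T$ and $x = \langle W \rangle h$, and $N$ is taken as $2\beta^2/(2\beta - 1 + e^{-2\beta})$, the value of (3.15) for $|\gamma(\tau)| = e^{-\sigma|\tau|}$. Function names are hypothetical.

```python
import cmath
import math


def char_fn_exact(x, beta):
    """Characteristic function for a Lorentzian profile; x = <W> h, beta = sigma*T."""
    z = cmath.sqrt(beta * beta - 2j * beta * x)
    denom = cmath.cosh(z) + 0.5 * (z / beta + beta / z) * cmath.sinh(z)
    return cmath.exp(beta) / denom


def char_fn_gamma(x, beta):
    """Characteristic function (1 - i x / N)^(-N) of the Gamma approximation."""
    n_cells = 2.0 * beta ** 2 / (2.0 * beta - 1.0 + math.exp(-2.0 * beta))
    return (1.0 - 1j * x / n_cells) ** (-n_cells)


# Moduli of the two functions at x = 1 for a few assumed values of beta.
for beta in (0.01, 1.0, 50.0):
    print(beta, abs(char_fn_exact(1.0, beta)), abs(char_fn_gamma(1.0, beta)))
```

The two moduli should agree closely for very small and very large $\beta$ and differ somewhat in between.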
In Fig. 1 the moduli of the characteristic functions calculated from the approximate expression (3.14), viz.

$$C_N(h) = (1 - 2ih/a)^{-N}, \tag{3.38}$$

are compared with the exact values calculated from (3.36). As expected the two curves coincide in the two extreme limits $\sigma T \ll 1$ and $\sigma T \gg 1$.

[Fig. 1. Curves of $|C(h)|$ for $h$ = 1.0, 2.0, 4.0 and 10.0.]

Fig. 2. The probability density of the normalized integrated intensity of completely polarized ($\mathcal{P} = 1$) thermal light. (After JAISWAL and MEHTA [1969].) The spectral profile is Lorentzian.

The probability density $P(W)$ may be obtained by taking the Fourier inverse of (3.36). The results of numerical computations are shown in Fig. 2. It is found that even for $\sigma T \sim 1$ the values of $P(W)$ given by the approximate expression (3.14) agree very well with those given in Fig. 2. For a Lorentzian spectral profile, one may also evaluate the first few cumulants of $W$ explicitly. From (3.7) one obtains after straightforward but lengthy calculations the following expressions for the first five cumulants (MIDDLETON [1957]):
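$$\kappa_1 = \langle W \rangle,$$

$$\kappa_2 = \langle W \rangle^2 (2\beta - 1 + e^{-2\beta})/(2\beta^2),$$

$$\kappa_3 = 3 \langle W \rangle^3 \{(\beta - 1) + (\beta + 1) e^{-2\beta}\}/\beta^3,$$

$$\kappa_4 = 6 \langle W \rangle^4 \left[ \frac{2 e^{-2\beta}}{\beta^2} + \frac{10 e^{-2\beta} + 5}{2\beta^3} + \frac{28 e^{-2\beta} + e^{-4\beta} - 29}{8\beta^4} \right],$$

$$\kappa_5 = 5 \langle W \rangle^5 \left[ \frac{8 e^{-2\beta}}{\beta^2} + \frac{36 e^{-2\beta}}{\beta^3} + \frac{3(20 e^{-2\beta} + e^{-4\beta} + 7)}{\beta^4} + \frac{3(12 e^{-2\beta} + e^{-4\beta} - 13)}{\beta^5} \right], \tag{3.39}$$

where

$$\beta = \sigma T. \tag{3.40}$$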
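The closed-form cumulants for the Lorentzian profile can be cross-checked against the iterated-kernel formula (3.31), $\kappa_n = (n-1)!\int_0^T \Gamma^{(n)}(t,t)\,dt$, by discretizing the kernel on a grid. The following is only a rough sketch: it assumes units with $T = 1$ and $\langle I \rangle = 1$ (so $\langle W \rangle = 1$), uses the real kernel $e^{-\beta|t-t'|}$ because the phase factor of $\Gamma$ cancels in the cyclic products, and relies on a simple midpoint rule. The closed forms coded below are the expressions for $\kappa_2$, $\kappa_3$ and $\kappa_4$ as read here from (3.39).

```python
import math


def cumulants_numeric(beta, steps=80):
    """kappa_2, kappa_3, kappa_4 from traces of powers of the discretized kernel."""
    h = 1.0 / steps
    t = [(i + 0.5) * h for i in range(steps)]
    idx = range(steps)
    k1 = [[math.exp(-beta * abs(a - b)) * h for b in t] for a in t]
    # k2 is the matrix product k1 * k1, the discrete analogue of Gamma^(2).
    k2 = [[sum(k1[i][m] * k1[m][j] for m in idx) for j in idx] for i in idx]
    tr2 = sum(k1[i][j] * k1[j][i] for i in idx for j in idx)
    tr3 = sum(k2[i][j] * k1[j][i] for i in idx for j in idx)
    tr4 = sum(k2[i][j] * k2[j][i] for i in idx for j in idx)
    return tr2, 2.0 * tr3, 6.0 * tr4  # (n-1)! times the trace of the nth power


def cumulants_closed(beta):
    """kappa_2, kappa_3, kappa_4 for <W> = 1 from the closed-form expressions."""
    e2, e4 = math.exp(-2.0 * beta), math.exp(-4.0 * beta)
    k2 = (2.0 * beta - 1.0 + e2) / (2.0 * beta ** 2)
    k3 = 3.0 * ((beta - 1.0) + (beta + 1.0) * e2) / beta ** 3
    k4 = 6.0 * (2.0 * e2 / beta ** 2
                + (10.0 * e2 + 5.0) / (2.0 * beta ** 3)
                + (28.0 * e2 + e4 - 29.0) / (8.0 * beta ** 4))
    return k2, k3, k4


numeric = cumulants_numeric(1.0)
closed = cumulants_closed(1.0)
print(numeric, closed)
```

For $\beta \to 0$ the closed forms tend to $(n-1)!\,\langle W \rangle^n$, the cumulants of the exponential density (3.10).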
If one uses (3.8) and (3.39) one may also evaluate the first few moments of $W$.

3.2. PARTIALLY POLARIZED THERMAL LIGHT
Next let us consider a plane stationary, partially polarized wave traveling in the z-direction, and let $\mathbf{V}(t)$ be the vector analytic signal describing the field at a fixed point in space at time $t$. The integrated light intensity is then given by

$$W(T) = \int_0^T \mathbf{V}^*(t) \cdot \mathbf{V}(t)\, dt. \tag{3.41}$$
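The Cartesian components $V_x(t)$ and $V_y(t)$ of $\mathbf{V}$ are assumed to be distributed according to a complex Gaussian random process. The statistical properties of $\mathbf{V}$ are then completely characterized by the 2 × 2 coherence matrix

$$\mathbf{J}(\tau) = \begin{pmatrix} \Gamma_{xx}(\tau) & \Gamma_{xy}(\tau) \\ \Gamma_{yx}(\tau) & \Gamma_{yy}(\tau) \end{pmatrix}, \tag{3.42}$$

where

$$\Gamma_{ij}(\tau) = \langle V_i^*(t)\, V_j(t + \tau) \rangle. \tag{3.43}$$

Let $\phi_i^{(m)}$ be the eigenfunctions of the matrix integral equation

$$\int_0^T \sum_{j=x,y} \Gamma_{ij}(t_1 - t_2)\, \phi_j^{(m)}(t_2)\, dt_2 = \nu_m\, \phi_i^{(m)}(t_1), \qquad (i = x, y), \tag{3.44}$$

corresponding to the eigenvalues $\nu_m$. The functions $\phi_i^{(m)}$ may be chosen to satisfy the orthonormality condition

$$\int_0^T \sum_{i=x,y} \phi_i^{*(m)}(t)\, \phi_i^{(n)}(t)\, dt = \delta_{mn}. \tag{3.45}$$

Following a method strictly analogous to that used in deriving eq. (3.28) we may express the characteristic function $C(h)$ of $W$ in the form

$$C(h) \equiv \langle e^{ihW} \rangle = \prod_m (1 - i \nu_m h)^{-1}. \tag{3.46}$$

The cumulants of $W$ are now given by the formula

$$\kappa_k = (k-1)! \int_0^T dt_1 \cdots \int_0^T dt_k \sum_{i_1} \cdots \sum_{i_k} \tilde{\Gamma}_{i_1 i_2}(t_1 - t_2)\, \tilde{\Gamma}_{i_2 i_3}(t_2 - t_3) \cdots \tilde{\Gamma}_{i_k i_1}(t_k - t_1), \tag{3.47}$$

where $\tilde{\Gamma}_{ij}(\tau) = \Gamma_{ji}(\tau)$.

When $T$ is very small in comparison with the coherence time, we may write eq. (3.44) in the form (3.48)

Since the two eigenvalues of the matrix $\mathbf{J}(0)$ are $\tfrac{1}{2}(1 + \mathcal{P}) \langle I \rangle$ and $\tfrac{1}{2}(1 - \mathcal{P}) \langle I \rangle$ [...]

$$P(W) = \frac{1}{\mathcal{P} \langle W \rangle} \left\{ \exp\left[ -\frac{2W}{(1 + \mathcal{P}) \langle W \rangle} \right] - \exp\left[ -\frac{2W}{(1 - \mathcal{P}) \langle W \rangle} \right] \right\}. \tag{3.51}$$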
Fig. 3 shows the behavior of this distribution for various values of the degree of polarization P.
$$\rho_c = \int \delta^{(2)}(v - v_c)\, |v\rangle \langle v|\, d^2 v, \tag{3.64}$$
whereas the thermal part has the usual Gaussian diagonal representation (3.65) The density operator of the superposed field is then given by
Assuming that the counting interval is small compared to the coherence time and using eqs. (3.66) and (2.1) one finds that the counting distribution may be expressed in terms of the Laguerre polynomial
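Here $\langle n_c \rangle = \alpha T |v_c|^2$ and $\langle n_T \rangle = \alpha T \langle I_T \rangle$ are the average counts obtained from the coherent and from the thermal fields respectively. As may be verified, the average and the variance of $n$ are given by $\langle n \rangle = \langle n_c \rangle + \langle n_T \rangle$;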
+(WT> ( W 2 )= (W)2+
soTs
II’(r-d)12drdz’+
The results of this section may readily be extended to include the case when the incident light is in any particular state of polarization.
The nth cumulant of the integrated intensity may in this case be shown to be given by
S, 7I‘;,) t ) dt + T
K,
=
(n- 1 ) !
(t,
(3.80)
where
I - p (t, t’) = (V&(t’)V.&)),
(3.81)
§ 4. Photocounting Distribution

As pointed out earlier, the light fluctuations are not measured directly, but are usually inferred from the photoelectric measurements. According to Mandel’s formula, the probability $p(n)$ of detecting $n$ photoelectrons in a time interval $T$ is given by

$$p(n) = \int_0^\infty \frac{(\alpha W)^n}{n!}\, e^{-\alpha W} P(W)\, dW. \tag{4.1}$$

In § 3 we evaluated the probability density $P(W)$ in a few typical cases, and for various values of the counting time $T$. In this section we will study the corresponding behavior of $p(n)$.

4.1. POLARIZED THERMAL LIGHT
Let us first consider a plane wave beam of completely polarized thermal light. When T > T,, namely
where
$$c = \{(1/T_c^2) + 2 \langle n \rangle/(T T_c)\}^{1/2}, \tag{4.5}$$

$$S_n(x) = (2x/\pi)^{1/2}\, e^{x} K_{n-\frac{1}{2}}(x), \tag{4.6}$$

and $K_{n-\frac{1}{2}}$ is the modified Hankel function of order $n - \tfrac{1}{2}$. This formula is in good agreement with the exact expression for $p(n)$ only when $T \ge 10 T_c$. Since no explicit expression is available for $P(W)$ for arbitrary values of $T$, it is not possible to give the corresponding expression for $p(n)$. However, if we use Rice’s approximate expression for $P(W)$ [eq. (3.14)], we obtain from (4.1) the following formula for $p(n)$ (MANDEL [1959]):

$$p(n) = \frac{\Gamma(n + N)}{n!\, \Gamma(N)}\, \frac{1}{(1 + \langle n \rangle/N)^N}\, \frac{1}{(1 + N/\langle n \rangle)^n}. \tag{4.7}$$
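The approximate counting formula (4.7) is straightforward to evaluate numerically. The sketch below works with logarithms of the Gamma function so that non-integer $N$ and large $n$ cause no overflow, then checks the normalization, the mean, and the variance $\langle n \rangle + \langle n \rangle^2/N$ that this distribution implies. The function name and the parameter values are assumptions for illustration.

```python
import math


def p_count(n, mean_n, n_cells):
    """Approximate counting probability of eq. (4.7) for mean count <n> and N cells."""
    log_p = (math.lgamma(n + n_cells) - math.lgamma(n + 1) - math.lgamma(n_cells)
             - n_cells * math.log1p(mean_n / n_cells)
             - n * math.log1p(n_cells / mean_n))
    return math.exp(log_p)


mean_n, n_cells = 3.0, 2.5   # assumed values; N need not be an integer
probs = [p_count(n, mean_n, n_cells) for n in range(400)]

total = sum(probs)
mean = sum(n * q for n, q in enumerate(probs))
var = sum(n * n * q for n, q in enumerate(probs)) - mean ** 2
print(total, mean, var, mean_n + mean_n ** 2 / n_cells)
```

With $N = 1$ the same function reduces to $\langle n \rangle^n/(1 + \langle n \rangle)^{n+1}$, the form obtained by inserting the exponential density (3.10) into (4.1).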
The parameter $N$ is given by eq. (3.15). Formula (4.7) is encountered in statistical mechanics in connection with the fluctuation of bosons in $N$ cells of phase space; in the present context $N$ is, however, not necessarily an integer. The expression (4.7) is valid in the two extreme limits $T \ll T_c$ and $T \gg T_c$, and appears to be a fairly good approximation for the intermediate values of $T$ also, as is evident from our earlier discussion in connection with the corresponding probability density $P(W)$. For the special case when the spectral profile of the light beam is either Gaussian or rectangular, BÉDARD et al. [1967] have evaluated the first few factorial moments using eq. (4.7) and compared them with the exact values obtained from eqs. (2.27), (3.7) and (3.8). Their results are reproduced in Fig. 5.

Fig. 5. A comparison of the factorial moments calculated from the approximate probability [eq. (4.7)] (broken curves), with the exact values calculated from the cumulants [eq. (3.7)]: (a) Rectangular spectral profile; (b) Gaussian spectral profile. (After BÉDARD et al. [1967].)

It is to be noted that the generating function

$$G(s) = \sum_n (1 - s)^n\, p(n) \tag{4.8}$$

is related to the characteristic function

$$C(h) = \int_0^\infty e^{ihW} P(W)\, dW$$

of $W$ by the formula

$$G(s) = C(i \alpha s). \tag{4.9}$$
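From eq. (3.28) we may therefore write

$$G(s) = \prod_k (1 + \alpha \lambda_k s)^{-1}, \tag{4.10}$$

where $\lambda_k$ are the eigenvalues of the integral equation (3.20). The product on the right-hand side of eq. (4.10) is evaluated explicitly in Appendix B for light with Lorentzian spectral profile. From eqs. (4.9) and (3.36), we find that

$$G(s) = e^{\sigma T} \left\{ \cosh z + \frac{1}{2} \sinh z \left( \frac{z}{\sigma T} + \frac{\sigma T}{z} \right) \right\}^{-1}, \tag{4.11}$$

where

$$z = \{\sigma^2 T^2 + 2 \sigma T \langle n \rangle s\}^{1/2}. \tag{4.12}$$

The counting distribution $p(k)$ and the factorial moments may be obtained from $G(s)$ by repeated differentiation:

$$p(k) = \frac{(-1)^k}{k!} \left. \frac{d^k}{ds^k} G(s) \right|_{s=1}, \tag{4.13}$$

$$\langle n^{(k)} \rangle = (-1)^k \left. \frac{d^k}{ds^k} G(s) \right|_{s=0}. \tag{4.14}$$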
The factorial moments in this case may, of course, be directly obtained from eqs. (2.27), (3.8) and (3.39). In Fig. 6 we compare the first few counting distributions and the factorial moments obtained from eqs. (4.4) and (4.7) with those obtained from the exact formulas (4.13) and (4.14).
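Repeated differentiation of the generating function is awkward numerically. An equivalent route, sketched below, treats $G(1 - u) = \sum_n p(n)\, u^n$ as a power series in $u$ and recovers the coefficients $p(n)$ with a discrete Fourier sum over the unit circle. This is a swapped-in numerical technique, not the procedure used in the text. The generating function for the Lorentzian profile is assumed in the form $e^{\beta}[\cosh z + \tfrac{1}{2}(z/\beta + \beta/z)\sinh z]^{-1}$ with $z = (\beta^2 + 2\beta\langle n \rangle s)^{1/2}$ and $\beta = \sigma T$; function names and parameter values are hypothetical.

```python
import cmath
import math


def generating_function(s, beta, mean_n):
    """G(s) for a Lorentzian profile; beta = sigma*T, mean_n = <n>."""
    z = cmath.sqrt(beta * beta + 2.0 * beta * mean_n * s)
    denom = cmath.cosh(z) + 0.5 * (z / beta + beta / z) * cmath.sinh(z)
    return cmath.exp(beta) / denom


def counting_distribution(beta, mean_n, n_terms=128):
    """p(0..n_terms-1) as Fourier coefficients of G(1 - u) on the circle |u| = 1."""
    samples = [
        generating_function(1.0 - cmath.exp(2j * math.pi * j / n_terms), beta, mean_n)
        for j in range(n_terms)
    ]
    probs = []
    for n in range(n_terms):
        acc = sum(g * cmath.exp(-2j * math.pi * j * n / n_terms)
                  for j, g in enumerate(samples))
        probs.append((acc / n_terms).real)
    return probs


beta, mean_n = 1.0, 2.0   # assumed values
probs = counting_distribution(beta, mean_n)
mean = sum(n * q for n, q in enumerate(probs))
var = sum(n * n * q for n, q in enumerate(probs)) - mean ** 2
# Variance expected from (2.25) with kappa_2 of the Lorentzian profile.
expected_var = mean_n + mean_n ** 2 * (2 * beta - 1 + math.exp(-2 * beta)) / (2 * beta ** 2)
print(probs[:4], mean, var, expected_var)
```

The number of sample points must be large enough that $p(n)$ is negligible beyond the last retained term, otherwise the coefficients are aliased.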
4.2. PARTIALLY POLARIZED THERMAL LIGHT
When the counting time $T$ is small compared with the coherence time, the probability density $P(W)$ for partially polarized thermal light of degree of polarization $\mathcal{P}$ is given by eq. (3.51). From (3.51) and (4.1) we then obtain the following expression for the counting distribution $p(n)$ (MANDEL [1963b]):

(4.15)

One may interpret this result as a distribution of $n$ bosons in two cells of phase space with occupation numbers in the ratio $(1 - \mathcal{P})$ to $(1 + \mathcal{P})$. The variance $\langle (\Delta n)^2 \rangle$ is given by

$$\langle (\Delta n)^2 \rangle = \langle n \rangle \{1 + \tfrac{1}{2}(1 + \mathcal{P}^2) \langle n \rangle\}. \tag{4.16}$$
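The two-cell interpretation can be illustrated numerically. The sketch below assumes that, for short counting times, the count is the sum of two independent contributions with mean counts $\tfrac{1}{2}(1 \pm \mathcal{P})\langle n \rangle$, each distributed as $m^n/(1+m)^{n+1}$ (the short-time thermal form). It convolves the two and compares the resulting variance with eq. (4.16). Function names and parameter values are hypothetical.

```python
def bose_einstein(n, mean):
    """Short-time thermal counting probability mean^n / (1 + mean)^(n + 1)."""
    return mean ** n / (1.0 + mean) ** (n + 1)


def partially_polarized_counts(mean_n, pol, n_max):
    """Counting distribution as the convolution of two independent thermal modes."""
    m1 = 0.5 * (1.0 + pol) * mean_n
    m2 = 0.5 * (1.0 - pol) * mean_n
    p1 = [bose_einstein(k, m1) for k in range(n_max + 1)]
    p2 = [bose_einstein(k, m2) for k in range(n_max + 1)]
    return [sum(p1[k] * p2[n - k] for k in range(n + 1)) for n in range(n_max + 1)]


mean_n, pol = 2.0, 0.6   # assumed mean count and degree of polarization
probs = partially_polarized_counts(mean_n, pol, n_max=200)

mean = sum(n * q for n, q in enumerate(probs))
var = sum(n * n * q for n, q in enumerate(probs)) - mean ** 2
expected_var = mean_n * (1.0 + 0.5 * (1.0 + pol ** 2) * mean_n)
print(mean, var, expected_var)
```

Setting `pol` to 1 leaves only one occupied cell and gives back the single-mode thermal distribution.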
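When $T$ is large compared to the coherence time, $P(W)$ is approximately Gaussian and $p(n)$ is therefore of the form given by eq. (4.3). It is possible to obtain an approximate expression for $p(n)$ analogous to (4.7). The coherence matrix $\mathbf{J}(0)$ of eq. (3.42) may be diagonalized by a unitary transformation. The intensity $I(t)$ may thus be divided into two parts $I_1(t)$ and $I_2(t)$, which, on account of the Gaussian nature of the wavefield, are statistically independent random variables. Their averages are the eigenvalues of $\mathbf{J}(0)$:

$$I(t) = I_1(t) + I_2(t), \tag{4.17}$$

$$\langle I_1 \rangle = \tfrac{1}{2}(1 + \mathcal{P}) \langle I \rangle, \qquad \langle I_2 \rangle = \tfrac{1}{2}(1 - \mathcal{P}) \langle I \rangle. \tag{4.18}$$

The contributions $p_1(k)$ and $p_2(k)$ to the photocounting distribution from $I_1(t)$ and $I_2(t)$ separately may be approximated by (4.7). We may therefore write (see also HELSTROM [1964])

$$p(n) = \sum_{k=0}^{n} p_1(k)\, p_2(n - k) = \sum_{k=0}^{n} \frac{\Gamma(k + N)\, \Gamma(n - k + N)}{k!\, (n - k)!\, [\Gamma(N)]^2} \left[ 1 + \frac{\langle n \rangle (1 + \mathcal{P})}{2N} \right]^{-N} \left[ 1 + \frac{2N}{\langle n \rangle (1 + \mathcal{P})} \right]^{-k} \left[ 1 + \frac{\langle n \rangle (1 - \mathcal{P})}{2N} \right]^{-N} \left[ 1 + \frac{2N}{\langle n \rangle (1 - \mathcal{P})} \right]^{-n+k}. \tag{4.19}$$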
It is to be noted that this expression holds for light beams with arbitrary spectral profile and that it leads to the correct distribution in the two extreme limits $T \gg T_c$ and T