ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS
VOLUME 76
EDITOR-IN-CHIEF
PETER W. HAWKES Laboratoire d’Optique Electronique du Centre National de la Recherche Scientifique Toulouse, France
ASSOCIATE EDITOR
BENJAMIN KAZAN Xerox Corporation Palo Alto Research Center Palo Alto, California
Advances in
Electronics and Electron Physics EDITED BY PETER W. HAWKES Laboratoire d’Optique Electronique du Centre National de la Recherche Scientifique Toulouse, France
VOLUME 76
ACADEMIC PRESS, INC. Harcourt Brace Jovanovich, Publishers Boston San Diego New York Berkeley London Sydney Tokyo Toronto
COPYRIGHT © 1989 BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.
ACADEMIC PRESS, INC. 1250 Sixth Avenue, San Diego, CA 92101
United Kingdom Edition published by ACADEMIC PRESS INC. (LONDON) LTD. 24-28 Oval Road, London NW1 7DX
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 49-7504
ISBN 0-12-014676-2
PRINTED IN THE UNITED STATES OF AMERICA
89 90 91 92 9 8 7 6 5 4 3 2 1
CONTENTS

CONTRIBUTORS
PREFACE

The Optics of Round and Multipole Electrostatic Lenses
L. A. BARANOVA AND S. YA. YAVOR
I. Introduction
II. Equations of Motion of Charged Particles and Methods for Field Distribution Calculation
III. The Basic Concepts of Paraxial Optics
IV. Aberrations of Electrostatic Lenses
V. Phase-Space Approach to Particle Beams
VI. Current Density Distribution and Frequency-Contrast Characteristics
VII. Round Lenses
VIII. Quadrupole Lenses
IX. Two-Dimensional (Cylindrical) Lenses
X. Transaxial Lenses
XI. Crossed Lenses
XII. New Types of Lenses. Aberration Correctors
XIII. Conclusion
References

Electron Microscopy of Fast Processes
O. BOSTANJOGLO
I. Introduction
II. Interactions of Electrons with Matter
III. Modes of Electron Microscopy
IV. Time Resolved Electron Microscopy
V. Application of Real-Time Electron Microscopy to Fast Laser-Induced Processes
VI. Space-Time Resolution of Real-Time Microscopy
VII. Summary
Acknowledgment
References

High Resolution Transmission Electron Microscopy and Geology
MARCELLO MELLINI
I. Introduction
II. Technical and Experimental Aspects
III. Structure and Microstructure of Minerals
IV. Structural Control Over Microstructure
V. Mineral Reactions
VI. Extraterrestrial Mineralogy
VII. Conclusions
References

On Generalized Information Measures and Their Applications
INDERJEET TANEJA
I. Introduction
II. Shannon’s Entropy and Its Generalizations
III. Generalized Distance Measures
IV. Generalized Measures of Directed Divergence
V. Generalized Divergence Measures
VI. Generalized Entropies for Multivariate Probability Distributions
VII. Applications to Statistical Pattern Recognition
Entropy Graph
References

INDEX
CONTRIBUTORS

Numbers in parentheses refer to the pages on which the authors’ contributions begin.

L. A. BARANOVA (1), A. F. Ioffe Physico-Technical Institute, Academy of Sciences of the USSR, 194021 Leningrad K-21, USSR
O. BOSTANJOGLO (209), Optisches Institut der Technischen Universität Berlin, D-1000, 12, Strasse des 17. Juni 135, Federal Republic of Germany
MARCELLO MELLINI (281), Dipartimento di Scienze della Terra, Università di Perugia, Italy
INDERJEET TANEJA (327), Departamento de Matematica, Universidade Federal de Santa Catarina, 88.035, Florianopolis, S.C., Brazil
S. YA. YAVOR (1), A. F. Ioffe Physico-Technical Institute, Academy of Sciences of the USSR, 194021 Leningrad K-21, USSR
PREFACE

This volume of the Advances is biased strongly toward particle optics and electron microscopy. The first chapter, long enough to form a short monograph, did in fact begin life as a Russian book, in which the authors brought together much of the work of the Leningrad school led by S. Ya. Yavor, who is herself co-author with V. M. Kel’man of the standard Russian textbook on electron optics. I felt that the text of L. A. Baranova and S. Ya. Yavor deserved a wider audience, and the result is the English-language version published here.

The second chapter, on the study of very fast processes in the electron microscope, is written by a specialist who has made numerous significant contributions to this difficult subject. The information that can be obtained in this way is not only of the greatest importance in microcircuit engineering but also sheds light on many fundamental physical processes.

The third chapter was solicited in the spirit of a number of reviews in earlier volumes, in which we asked a specialist in a particular field to examine the benefits of using a particular technique or type of instrument. In this chapter, M. Mellini considers the contribution of high-resolution electron microscopy to geology. The wide range of examples shows convincingly how useful this technique is proving in this field.

The final chapter is concerned with an aspect of statistics that is of particular relevance for pattern recognition: the use of generalized information measures. Much of the material in this article originated in the author’s own research group, and I am very happy to include this account in which the newer results are set in context.

It is a pleasure to thank all the contributors for the trouble they have taken over their chapters. As usual, I conclude with a list of forthcoming reviews.

Peter W. Hawkes
FORTHCOMING REVIEWS

Parallel Image Processing Methodologies (J. K. Aggarwal)
Image Processing with Signal-Dependent Noise (H. H. Arsenault)
Pattern Recognition and Line Drawings (H. Bley)
Bodo von Borries, Pioneer of Electron Microscopy (H. von Borries)
Signal Analysis in Seismic Studies (J. F. Boyce and L. R. Murray)
Magnetic Reconnection (A. Bratenahl and P. J. Baum)
Sampling Theory (J. L. Brown)
Finite Algebraic Systems and Trellis Codes (H. J. Chizeck and M. Trott)
Electrons in a Periodic Lattice Potential (J. M. Churchill and F. E. Holmstrom)
The Artificial Visual System Concept (J. M. Coggins)
Corrected Lenses for Charged Particles (R. L. Dalglish)
A Gaseous Detector Device for ESEM (G. D. Danilatos)
The Development of Electron Microscopy in Italy (G. Donelli)
The Study of Dynamic Phenomena in Solids Using Field Emission (M. Drechsler)
Amorphous Semiconductors (W. Fuhs)
Resonators, Detectors and Piezoelectrics (J. J. Gagnepain)
Median Filters (N. C. Gallagher and E. Coyle)
Bayesian Image Analysis (S. and D. Geman)
SEM and the Petroleum Industry (J. Huggett)
Emission Electron Optical System Design (V. P. Il'in)
Statistical Coulomb Interactions in Particle Beams (G. H. Jansen)
Number Theoretic Transforms (G. A. Jullien)
Phosphor Materials for CRTs (K. Kano et al.)
Tomography of Solid Surfaces Modified by Fast Ions (S. B. Karmohapatro and D. Ghose)
The Scanning Tunnelling Microscope (H. Van Kempen)
Scanning Capacitance Microscopy (P. J. King)
Applications of Speech Recognition Technology (H. R. Kirby)
Multi-Colour AC Electroluminescent Thin-Film Devices (H. Kobayashi and S. Tanaka)
Spin-Polarized SEM (K. Koike)
The Rectangular Patch Microstrip Radiator (H. Matzner and E. Levine)
Active-Matrix TFT Liquid Crystal Displays (S. Morozumi)
Electronic Tools in Parapsychology (R. L. Morris)
Image Formation in STEM (C. Mory and C. Colliex)
Low-Voltage SEM (J. Pawley)
Languages for Vector Computers (R. H. Perrott)
Electron Scattering and Nuclear Structure (G. A. Peterson)
Electrostatic Lenses (F. H. Read and I. W. Drummond)
CAD in Electromagnetics (K. R. Richter and O. Biro)
Scientific Work of Reinhold Rüdenberg (H. G. Rudenberg)
Atom-Probe FIM (T. Sakurai)
Metaplectic Methods and Image Processing (W. Schempp)
X-Ray Microscopy (G. Schmahl)
Applications of Mathematical Morphology (J. Serra)
Focus-Deflection Systems and Their Applications (T. Soma et al.)
Electron Gun Optics (Y. Uchikawa)
Thin-Film Cathodoluminescent Phosphors (A. M. Wittenberg)
Electron Microscopy and Helmut Ruska (C. Wolpers)
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 76
The Optics of Round and Multipole Electrostatic Lenses
I. Introduction
II. Equations of Motion of Charged Particles and Methods for Field Distribution Calculation
    A. Equations of Motion
    B. Potential Distribution in Electrostatic Lenses
III. The Basic Concepts of Paraxial Optics
    A. The Paraxial Equations of Trajectory
    B. The Paraxial Characteristics of Electron Lenses
    C. Cardinal Elements
    D. The Matrix Method
IV. Aberrations of Electrostatic Lenses
    A. Third-Order Geometrical Aberrations
    B. Additional Data on Geometrical Aberrations
    C. Chromatic Aberration
    D. Distortions Due to Mechanical Defects
    E. Experimental Methods for Determining Electron Optical Characteristics
V. Phase-Space Approach to Particle Beams
    A. The Conception of Phase Space and the Liouville Theorem
    B. Beam Emittance and Its Transformation in Electron Optical Systems
    C. The Beam Envelopes
    D. Crossover
VI. Current Density Distribution and Frequency-Contrast Characteristics
    A. Calculation of Current Distribution in Space Beyond the Lens
    B. Frequency-Contrast Characteristics of Electron Optical Systems
VII. Round Lenses
    A. Field Distribution and the Paraxial Optics of Round Lenses
    B. Spherical Aberration of Round Lenses
    C. Field Aberrations
    D. Chromatic Aberration
    E. Two-Electrode Immersion Lenses
    F. Einzel Lenses
    G. Multielectrode Immersion Lenses
    H. Some Applications of Round Lenses
VIII. Quadrupole Lenses
    A. Fields of Quadrupole Lenses
    B. The Paraxial Properties of Quadrupoles
    C. Quadrupole Systems
    D. Geometrical Aberrations of Quadrupole Lenses
    E. Chromatic Aberrations of Quadrupoles. Achromatic Lenses
IX. Two-Dimensional (Cylindrical) Lenses
    A. Optical Properties of Two-Dimensional Lenses
    B. The Parameters of Some Two-Dimensional Lenses
X. Transaxial Lenses
    A. Potential Distribution and Focusing in Transaxial Lenses
    B. Geometrical Aberrations of Transaxial Lenses
    C. Chromatic Aberration
    D. Transaxial Lenses Formed by Parallel Plates
XI. Crossed Lenses
    A. A Three-Electrode Einzel Crossed Lens with Identical Rectangular Apertures
    B. Modifications of an Einzel Crossed Lens
    C. Systems of Crossed Lenses
    D. Correctors of Geometrical Aberrations
XII. New Types of Lenses. Aberration Correctors
    A. Coaxial Lenses with Transverse Fields
    B. Radial Lenses
    C. Correction of Geometrical Aberrations by Means of Octupoles
    D. Lenses with Partial Aberration Correction
XIII. Conclusion
References

English translation copyright © 1989 by Academic Press, Inc. All rights of reproduction in any form reserved. ISBN 0-12-014676-2
This chapter is concerned with electrostatic focusing of charged particles. The description is largely restricted to low-intensity beams, when the effect of intrinsic space charge of the beam can be neglected. Both round and astigmatic electron optical lenses are considered.

Electrostatic lenses find wide application in many fields of science and technology. Along with the traditional applications (cathode-ray devices, input devices in spectrometers, etc.), newer areas for their use have emerged. In recent years, methods have been devised to probe matter by charged particle beams. The data obtained in this field have been used to design instruments for effective technological control of the production of microelectronic devices. One example is instruments for the control of solid surfaces. The problems of charged particle focusing that arise here are primarily resolved with electrostatic lenses.

The number of publications on electrostatic lenses and their optimization is very large. There are a few books that tackle these questions to some extent. However, there is no monograph that covers all of the lens types of interest and that includes recent results of original research. This article considers the present state of the theory and application of electrostatic lenses. The methods for field calculation, the theory of focusing, and the aberrations of electrostatic electron lenses are discussed; recent data on round and quadrupole lenses are presented; and two types of astigmatic lenses, transaxial and crossed ones, which are not commonly known, are considered in detail. An outline of some new lens types and aberration correctors is also included.

The article consists of 12 sections. After the Introduction, the next five sections are concerned with basic concepts relevant to all lens types: potential distribution (Section II), the theory of first-order focusing (Section III), geometric and chromatic aberrations (Section IV), application of phase-space theory to problems of focusing charged particle beams (Section V), and current-density distribution in beams with finite phase space (Section VI). The sections that follow (Sections VII-XII) deal with each type of lens individually, providing the basic characteristics, the methods to calculate lens systems, and ways of designing electron optical systems with corrected aberrations.

Our objective is to give a coherent treatment of the theory of electrostatic lenses on a unified basis and to systematize the available data concerning their electron optical properties. The authors hope that this monograph will provide guidance to the extensive literature on this subject and will stimulate the solution of problems associated with the selection, calculation, and design of electron optical systems.
I. INTRODUCTION
Electron lenses¹ are the basic components of most electron optical devices. Depending on the application, a device may also contain deflecting components, various mass and energy analyzers of the charged particles, and correctors to cancel the focusing and deflection errors. But practically all schemes using charged particle beams include focusing elements, that is, lenses. The available monographs on electron optics all give much attention to lenses.

¹ The terms electron lenses and electron optics originated early in the history of this field. They are not quite adequate, and it would be more suitable to speak of lenses for charged particles and of charged particle optics. However, for brevity and for the sake of tradition, we use here the commonly accepted terminology.

It is generally believed that the history of electron lenses started with the publication of H. Busch's work in 1926 (Busch, 1926), which showed that electric and magnetic fields with rotational symmetry could focus beams of charged particles, that is, they could act as lenses. An experimental investigation of electrostatic round and two-dimensional lenses was conducted by Davisson and Calbick (1931). The further development of electron optics resulted in a detailed theory and design of both electrostatic and magnetic electron lenses.

In practice, the choice between the two types of lenses is based on the ability of a lens to meet most of the requirements that a particular problem implies. Each of the two lens types possesses some advantages and disadvantages. The advantages of electrostatic lenses are their smaller weight and size, the lack of power consumption (which facilitates stabilization and reduces the weight of the voltage supply), and the simplicity of their manufacture. They have zero response time and, for this reason, may be used to work with fast processes. Unlike iron-free magnetic lenses, electrostatic lenses provide a higher field precision. Another merit of these lenses, compared to iron magnetic lenses, is the absence of residual fields and, therefore, a better reproduction of the field distribution. The optical power of electrostatic lenses does not depend on the mass of the charged particles, but is determined only by their energy; so for focusing heavy particles of moderate energies, they should be preferred to magnetic lenses. On the other hand, for the focusing of light particles, magnetic lenses possess a greater optical power, which is practically attainable at high energies. Therefore, electrostatic lenses should only be used at low and moderate energies. A traditional application of magnetic lenses is the focusing of high-energy electrons. As a rule, magnetic lenses show a lower level of aberrations.

The development of electron optics has been closely related to its various practical applications. As far back as 1931, the idea of focusing charged particle beams to obtain an adequate electron optical image was realized in the first electron microscope by E. Ruska. Electron microscopy has made great progress since that time. The resolution of a transmission electron microscope is about 0.1 nm, which is close to the theoretical limit.
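The remark above that the action of an electrostatic field on a beam is set by the particle energy and not by the mass can be checked on the simplest possible field. The sketch below is illustrative only and is not taken from the chapter: it assumes nonrelativistic motion, singly charged particles, and a uniform transverse field; the function name `deflection` and all numerical values are hypothetical choices made for the example.

```python
import math

E_CHARGE = 1.602176634e-19                   # elementary charge, C
ELECTRON_MASS = 9.1093837015e-31             # kg
ARGON_ION_MASS = 39.948 * 1.66053906660e-27  # kg, singly charged Ar ion (approx.)


def deflection(mass_kg, accel_voltage, field, length):
    """Transverse displacement after crossing a uniform transverse field.

    Nonrelativistic sketch: the particle enters with the speed gained by
    falling through `accel_voltage` (V) and crosses a region of length
    `length` (m) with a transverse field `field` (V/m).
    """
    speed = math.sqrt(2.0 * E_CHARGE * accel_voltage / mass_kg)
    transit_time = length / speed
    acceleration = E_CHARGE * field / mass_kg
    return 0.5 * acceleration * transit_time ** 2


# Assumed example values: 10 kV beam, 10 kV/m field, 2 cm field region.
y_electron = deflection(ELECTRON_MASS, 10e3, 1e4, 0.02)
y_argon = deflection(ARGON_ION_MASS, 10e3, 1e4, 0.02)

# Analytically y = field * length**2 / (4 * accel_voltage): the mass cancels.
print(y_electron, y_argon)
```

The mass drops out because a heavier particle is accelerated less but spends proportionally longer in the field. The same cancellation does not occur for a magnetic force, which is why the text recommends electrostatic lenses for heavy particles.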
Various modifications of the electron microscope have been designed (high-voltage and scanning microscopes, and others) that have found wide application in many fields. It should be noted that the recent modifications are based on magnetic focusing. The electrostatic lens is used only in the microscope electron gun to form an accelerated electron beam. In a high-voltage microscope, the electrons are accelerated by a series of electrostatic lenses. At present, ion scanning microscopes with an ion probe formed by a system of electrostatic lenses are in wide use (Levi-Setti, 1980). A schematic diagram of such a device is shown in Fig. 1. The objective and projector of the microscope are round einzel lenses. There is a field-emission ion source, with an emitter radius of 150 nm and great brightness ($5 \times 10^{3}$ A/cm²·sr for Ar ions). This provides a total probe current of $2 \times 10^{-11}$ A at a probe radius of 100 nm and an accelerating voltage of 10-20 kV.

FIG. 1. Electron optical diagram of a scanning ion microscope: (1) ion source; (2) field emission cathode; (3) gas inlet; (4) liquid nitrogen; (5) objective; (6) aperture diaphragm; (7) double deflection plates; (8) projector; (9) deflecting plates; (10) sample; (11) detector.

FIG. 2. Principal diagram of a wideband oscillograph tube: (1) electron gun; (2) einzel crossed lenses; (3) deflecting plates; (4) post-accelerating system; (5) screen.

There is another class of conventional electron optical devices that use electrostatic lenses as the focusing system: cathode-ray tubes. At present, the tubes are an integral part of various devices produced commercially, e.g., TV
sets, oscilloscopes, etc. Figure 2 shows a diagram of a modern wideband oscilloscope tube. Its focusing system consists of three crossed electrostatic lenses, which have partly replaced round lenses, because they increase the deflection and provide a smaller spot on the screen.

Electrostatic lenses find new applications in investigations of solid surfaces. Surface research is one of the most important fields in modern science. Solid-surface physics is associated with a large number of fundamental and applied problems in microelectronics, catalysis, adhesion, friction, etc. Numerous techniques have been suggested to solve these problems, most of which are based on charged particle beams probing the surface under study. The installations that have been designed are, as a rule, sophisticated and envisage the use of several techniques simultaneously (Cherepin and Vasil’ev, 1982). A schematic representation of a microprobe for secondary ion mass spectrometry (SIMS) is given in Fig. 3 (McHugh, 1975). It uses electrostatic round lenses to form the primary ion beam, which scans the specimen, and to focus secondary ions. The SIMS technique allows one to find the surface distribution of a specific chemical element with a resolution of a few microns.

FIG. 3. Diagram of an ion microprobe for surface analysis: (1) primary ion source; (2, 10) mass analyzers; (3) condenser; (4) deflecting plates; (5) objective; (6) optical system; (7) sample; (8) secondary ion focusing system; (9) electrostatic analyzer; (11) ion detector.

One should keep in mind that progress in electron optical device design is closely associated with the development of the adjacent areas of science and technology. It has been stimulated by the improvement of high-vacuum technology, fabrication of vacuum materials, design of new types of cathodes (e.g., field-emission cathodes), etc.

There are several types of electrostatic lenses, each of which has found specific applications associated with its electron optical properties. Round lenses are most commonly used because the study of their properties has the longest history. This is the only type of lens capable of uniformly converging charged particle beams in any direction, creating a correct electron optical image. All the other lenses are astigmatic, so it is necessary to use a system of lenses in order to get a point image of a point object. Two-dimensional (cylindrical) lenses converge particles in only one direction; for this reason, they are largely employed to focus ribbon beams, for example, in ion sources. Application of quadrupoles in electron optical devices started in the 1950s (Courant et al., 1952). These lenses have a high optical power and are used to focus high-energy beams. Much effort has been made to create quadrupole-octupole correctors. A wide fan-shaped beam or a set of beams lying in the same plane can be conveniently controlled by transaxial lenses, which combine well with a prism spectrometer. Cathode-ray devices use crossed lenses, which permit correction of spherical aberration and, at the same time, are comparatively simple to produce and adjust.

One can see from the foregoing discussion that the applications of electrostatic lenses and, therefore, the requirements placed on them are very diverse. One or the other lens parameter may become essential for the solution of a particular problem. For instance, transportation of high-energy particle
beams requires a system of very high optical power. The lenses used in a microscope must form a correct image, the objective must have low spherical aberration, and the projector lens must possess low distortion. The focusing of ion beams, known for their large energy spread, requires lenses with low chromatic aberration. The various requirements have given rise to a large number of modifications of electrostatic lenses, which differ in both design and optical characteristics. At present, much work is being done to optimize the properties of the available types of lenses and to design new ones.

There is a profound analogy between light propagation in an optical medium and the motion of charged particles in electric and magnetic fields, which is based on the analogy between the Fermat principle for light propagation and the least action principle for particle motion. Many of the available electron optical devices have analogues among optical devices (microscopes, spectrometers). Their principal designs are, as a rule, similar and contain similar components (lenses, prisms). Since light optics has a long history, some of its basic results have been borrowed by electron optics. For this reason, much of the theoretical treatment of electron optical components and the terminology in electron optics are basically similar to those in light optics.

However, there is a fundamental difference between the two fields of knowledge. In light optics, we deal with well-bounded homogeneous media, whose boundaries may be chosen arbitrarily. In electron optical systems, the media are essentially inhomogeneous, their transitions are continuous, and the electron optical refractive index $n = \sqrt{\varphi}$ is fixed by a potential $\varphi$ that obeys the Laplace equation, and thus $n$ cannot be given arbitrarily. The refractive index variation in glass lenses is comparatively small, less than one order of magnitude, while in electron optical systems it varies over a very wide range of values.
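The optical analogy can be made concrete with the idealized case of a sharp potential step, which plays the role of a refracting boundary. The sketch below is not from the chapter: it assumes nonrelativistic particles, potentials measured from the point where the particle is at rest, and a refractive index proportional to the square root of the potential; the function name `refraction_angle` and the sample values are hypothetical.

```python
import math


def refraction_angle(theta_in_deg, phi_1, phi_2):
    """Angle to the normal (degrees) after crossing a potential step.

    The momentum component tangential to the step is conserved, which gives
    the Snell-type relation sqrt(phi_1)*sin(theta_1) = sqrt(phi_2)*sin(theta_2).
    Both potentials must be positive (measured from the point of zero speed).
    """
    if phi_1 <= 0.0 or phi_2 <= 0.0:
        raise ValueError("potentials must be positive")
    n_1, n_2 = math.sqrt(phi_1), math.sqrt(phi_2)
    sin_out = n_1 * math.sin(math.radians(theta_in_deg)) / n_2
    if abs(sin_out) > 1.0:
        # Analogue of total internal reflection: the normal energy is too
        # small to climb a retarding step, so the particle turns back.
        raise ValueError("particle is reflected by the retarding step")
    return math.degrees(math.asin(sin_out))


# Accelerating step from 100 V to 400 V: the index doubles, and the ray
# bends toward the normal.
theta_out = refraction_angle(30.0, 100.0, 400.0)
print(round(theta_out, 3))
```

A real lens field has no sharp step, of course; the continuous variation of the potential is exactly the difference from glass optics that the text emphasizes, so this picture is only a limiting case.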
The implications from these differences will be considered with reference to round-lens optics. Since we can take the refractive index and boundary shape of a glass lens arbitrarily, it is not difficult to make a diverging lens or lenses with corrected spherical and chromatic aberrations. This cannot be done for round electron optical lenses. Here, we have an unambiguous correlation between the field distribution in space and its distribution along the axis due to the superposition of the axial symmetry conditions on the Laplace equation, which completely defines the shape of equipotential surfaces. Therefore, there are no diverging round electron lenses or lenses with corrected spherical or chromatic aberrations. Similar conclusions can be drawn from comparison of two-dimensional (cylindrical) electron and light lenses. In some types of astigmatic electron lenses, where the field has a different symmetry, these limitations may be removed. An advantage of electron lenses is the possibility of the electric control of their parameters, while glass optics permit only mechanical adjustments to be
made. So the requirements for the calculation accuracy of electron optical design may be somewhat lower because there is an opportunity to obtain the necessary parameters by a slight adjustment of the electrode potentials. The development of electron optics is much affected by the variety of its applications. As a result, there is a certain disintegration of electron optical studies and a variability in the terminology used, as well as some difficulties in the use of achievements in related areas. In this monograph, the theory of electrostatic lenses is treated consistently from a unified point of view; the basic characteristics of the lenses are compared to point out their common properties and specific features, together with their essential applications.
II. EQUATIONS OF MOTION OF CHARGED PARTICLES AND METHODS FOR FIELD DISTRIBUTION CALCULATION

The various types of available electron lenses are commonly considered in terms of the general theory of motion of charged particles in electric and magnetic fields. To determine the trajectory of a charged particle, one must know the equation of motion of a particle in static fields and the potential distribution in the lens in question. The first problem is discussed in Section II.A; a review of the field calculations is given in Section II.B.

A. Equations of Motion
The relativistic equation of motion of charged particles in an arbitrary electric field $\mathbf{E}$ and in a magnetic field $\mathbf{H}$ is as follows:

$$\frac{d}{dt}\left[\frac{m\mathbf{v}}{\sqrt{1 - v^{2}/c^{2}}}\right] = e(\mathbf{E} + \mathbf{v} \times \mathbf{H}), \tag{1}$$

where $t$ denotes the time and $\mathbf{v}$ the charged particle velocity; $c$ is the speed of light; $m$ and $e$ are the rest mass and the charge, respectively. In the static case, the field $\mathbf{E}$ relates to the potential $\varphi$ in the following way:

$$\mathbf{E} = -\operatorname{grad} \varphi. \tag{2}$$

Here, the energy conservation law allows one to relate the particle velocity and the electrostatic potential at a certain point in space:

In this equation, $\varphi = 0$ at the point where the particle velocity is zero.
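The link between particle speed and potential that follows from energy conservation can be illustrated numerically. The sketch below uses the standard textbook relations for a particle that starts from rest and gains kinetic energy equal to its charge times the potential difference it has crossed; it is not necessarily written in the same form as the chapter's own equation. The function names, the use of the magnitude of the accelerating voltage, and the sample voltages are assumptions made for the example.

```python
import math

C = 299_792_458.0                  # speed of light, m/s
E_CHARGE = 1.602176634e-19         # elementary charge, C
ELECTRON_MASS = 9.1093837015e-31   # electron rest mass, kg


def speed_nonrelativistic(accel_voltage, mass=ELECTRON_MASS, charge=E_CHARGE):
    """Speed from m v^2 / 2 = charge * accel_voltage (valid for v << c)."""
    return math.sqrt(2.0 * charge * accel_voltage / mass)


def speed_relativistic(accel_voltage, mass=ELECTRON_MASS, charge=E_CHARGE):
    """Speed from (gamma - 1) m c^2 = charge * accel_voltage."""
    gamma = 1.0 + charge * accel_voltage / (mass * C ** 2)
    return C * math.sqrt(1.0 - 1.0 / gamma ** 2)


# Compare the two expressions for electrons at a low and a high voltage.
for volts in (1.0, 100e3):
    v_nr = speed_nonrelativistic(volts)
    v_rel = speed_relativistic(volts)
    print(f"{volts:>9.0f} V: v_nr/c = {v_nr / C:.4f}, v_rel/c = {v_rel / C:.4f}")
```

At a few volts the two expressions agree to within about one part in a million, while at 100 kV the nonrelativistic formula overestimates the electron speed noticeably, which is why the text distinguishes the two cases.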
In the nonrelativistic case (ti2 V,).The work we have cited also presents data on the spherical
0
E
9 a
0
5 L
THE OPTICS OF ROUND AND MULTIFQLE ELECTROSTATIC LENSES 155
a FIG.61. A two-dimensional einzel lens with a thick intermediate electrode: (a) lens section by the yOz-plane; (b) relationship between the positions of the object P and image Q: h,/u, = h J u , = h,/uz = 0.5; a , la, = u3Juz= 0.5.
aberrations of two-dimensional lenses. A comparison with round lenses shows that the spherical aberration of slit lenses in the yOz-plane is similar in order of magnitude to the aberration of lenses formed by circular apertures (for the same electrical and geometrical parameters). A thick intermediate electrode is used to increase the optical power of a three-slit einzel lens. Such a modification also allows the spherical aberrations to be reduced. The calculation of this type of lens [Fig. 61(a)] has been made using a wide variation of the intermediate electrode thicknesses, interelectrode distances, and gap widths (Afanas'ev et al., 1975). The field has been found by approximate calculations based on the series expansion of the potential in each electrode slit in a complete set of orthonormalized functions. In the interslit space, Dirichlet's problem can be solved exactly, and the unknown expansion coefficients can be found from the continuity condition of the normal potential derivative in each slit. This method provides a high accuracy of the results with minimum computations. The lens parameters were calculated by numerical integration of the trajectory equations. The calculations of the paraxial properties are represented as curves of equal magnifications and equal excitations [Fig. 61(b)]. By the excitation $\beta$ we understand here the ratio of the electrode potential difference to the accelerating voltage, $\beta = (V_2 - V_1)/V_1$. The calculations were made for seven lens geometries; the excitation, object, and image positions varied widely. Figure 61(b) illustrates the relationship between the image position Q
and the object position P measured from the lens center to the right and to the left, respectively. The geometrical parameters of the lens are given in the figure caption. Note that if the object and image were in the lens field, the values of P and Q were determined from their real position, and not from the apparent position seen from the outside (asymptotic position). Because of the lens symmetry, the plots for the positive and negative excitations are given in the same picture, one half for each. So, if $Q > P$ for $\beta > 0$ or $Q < P$ for $\beta < 0$, the parameters can be found by interchanging $P \leftrightarrow Q$ and $M \to 1/M$. The data analysis shows that the lens optical power grows with the distance between the electrodes and with the intermediate electrode thickness, as well as with decreasing slit widths at the end electrodes. The same authors have calculated the coefficients of spherical and chromatic aberrations. The spherical aberration is shown to decrease with increasing slit width of the end electrodes. The effect of the other dimensions is less significant. Two-dimensional lenses with a retarding intermediate electrode have, like round lenses, much greater spherical aberration than lenses with an accelerating electrode (at the same optical power). In the region $P \geq 5a_2$, the spherical aberration coefficients can be well approximated (with an error of about 10%) by the formula
-
The work of Alexandrov et al. (1977) catalogs the parameters of einzel slit lenses with thick intermediate electrodes. The lenses are symmetrical relative to the plane passing through the middle of the intermediate electrode. The potential distribution was obtained by numerical integration of the Laplace equation or by modeling on a resistance network. The optical parameters were found by numerical integration of the trajectory equations. The catalog was made for a wide range of geometrical parameters of the lens, object plane positions, and electrode potential ratios. It includes the following lens characteristics: the cardinal elements, linear and angular magnifications, the Gaussian image positions, maximum trajectory deviations from the axis, as well as the coefficients of spherical and chromatic aberrations. The fields of two-dimensional lenses formed by pairs of parallel plates have been carefully studied because of the relative simplicity of analytical calculations. The potential distribution in two- and three-electrode lenses with an equal distance a between the parallel plates [Figs. 62(a), (b)] was calculated by the method of separation of variables (Tsyrlin, 1977), based on the assumption that the gaps between the electrodes were small and the plates of the outer electrodes were semi-infinite. The Fourier integrals obtained were transformed into series by the residue theorem. In both cases, the series can be
summed to obtain the solution in closed form. The existence of the closed form is accounted for by the possibility of solving the problem by means of a conformal transformation based on an elementary function. For a two-electrode lens [Fig. 62(a)] the potential distribution has the form
$$\varphi(y,z) = \frac{1}{2}(V_1 + V_2) + (V_2 - V_1)\,\frac{1}{\pi}\arctan\frac{\sinh(\pi z/a)}{\cos(\pi y/a)}. \tag{324}$$
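The closed form for the two-electrode lens is easy to evaluate numerically. The sketch below assumes the standard conformal-mapping result $\varphi = \tfrac{1}{2}(V_1 + V_2) + \frac{V_2 - V_1}{\pi}\arctan[\sinh(\pi z/a)/\cos(\pi y/a)]$, with the plates at $y = \pm a/2$, an infinitely narrow gap at $z = 0$, and $V_1$ on the side $z < 0$; the sign convention and the function name are assumptions made for illustration.

```python
from math import atan2, cos, pi, sinh


def two_electrode_potential(y, z, v1, v2, a):
    """Potential between two pairs of parallel plates with a narrow gap at z = 0.

    y, z   -- coordinates, with |y| <= a/2 (plates located at y = +-a/2)
    v1, v2 -- potentials of the electrodes at z < 0 and z > 0 (assumed ordering)
    a      -- distance between the plates
    """
    # atan2 equals arctan(sinh/cos) for cos >= 0 and stays finite on the plates.
    # Note: math.sinh overflows for |pi*z/a| above roughly 710.
    angle = atan2(sinh(pi * z / a), cos(pi * y / a))
    return 0.5 * (v1 + v2) + (v2 - v1) * angle / pi


# Axial distribution (y = 0) across the gap for V1 = 100 V, V2 = 300 V, a = 1.
for z in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(z, round(two_electrode_potential(0.0, z, 100.0, 300.0, 1.0), 3))
```

On the axis the expression reduces to $\tfrac{1}{2}(V_1 + V_2) + \frac{V_2 - V_1}{\pi}\arctan\sinh(\pi z/a)$, so the field is confined to a region of a few plate separations around the gap.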
For a three-electrode lens the potential distribution is written as the sum of the symmetrical and antisymmetrical functions:
$$\varphi(y,z) = \varphi_+ + \varphi_-. \tag{325}$$
These functions are expressed as follows:
where
The potential distribution on the z-axis can be obtained by setting y = 0 in Eqs. (324), (328), and (329). Two-dimensional lenses can be designed in which each pair of plates remains symmetrical relative to the xOz-plane but the distances between the plates in the pairs are different [Figs. 62(c), (d)]. For instance, Glikman and Yakushev (1967) used the method of conformal transformations to calculate the potential distribution in the three-electrode lens shown in Fig. 62(d). The authors used the Schwarz-Christoffel method; the result is given in parametric form. The three-electrode two-dimensional lenses represented in Fig. 62(b) are employed in prism spectrometers as separate elements of these devices or as part of electrostatic prisms. As a rule, they operate in the telescopic mode, in which a parallel incident beam remains parallel after it has passed through the field. Since two-dimensional lenses are always convergent, the telescopic regime can be provided only by creating an intermediate focus inside the system. For given geometrical parameters, this mode is determined by suitable selection of the potentials on the electrodes. The work of Glikman et al. (1967b) presents the calculations for a three-electrode lens operating in the telescopic mode and having equal distances between the parallel electrode plates. The relationships between the electrode potentials, cardinal elements, positions of the intermediate focus, and magnifications of the beam cross sections have been found and presented in numerous plots. When a two-dimensional system is used in a prism and performs the focusing and energy separation of the beam, the charged particles are directed onto the system at a large angle to the yOz-plane.
To determine the position of the cardinal elements in lenses with oblique beams, one can use the formulae and plots for a normal beam, in which the potential $\varphi(y,z)$ is replaced by the value $\varphi^*(y,z) = \varphi(y,z) - \varphi_0\sin^2\delta$, where $\delta$ is the angle formed by the beam axis and the yOz-plane. The potential distribution in a three-electrode einzel lens was calculated by Vukanić et al. (1976), taking into account the gap widths between the electrodes; the calculation was made by separation of variables. For the potential distribution on the z-axis, the following simple expression was obtained based
on the assumption that the interelectrode gaps were small:
(330)
For finite interelectrode gaps s, the expression for the axial field E was found in closed form:
sin h(rcz/a) - arctan cos h(7r(l + s)/a)
(331)
A comparison with experimental data obtained by measuring the field in an electrolytic tank showed good agreement. These analytical expressions for the field distribution were used by Ćirić et al. (1976b) to calculate a few hundred trajectories in the lens in question and to find its cardinal elements for a wide range of geometrical parameters and electrode potential ratios. The authors analyzed only lenses with a retarding intermediate electrode, as these are the most commonly used. The results were presented in the form of plots, from which it is clear that the optical power grows with increasing interelectrode gap in the range under study (s/a = 0 to 0.2) for constant intermediate electrode length. The problem of finding the field between two infinite planes parallel to each other with an arbitrary potential distribution was solved by De Wolf (1978). In another work (De Wolf, 1981), the same problem was considered for a symmetrical electrode potential distribution that is stepwise constant in rectangular areas. A semianalytical method using Green's functions was suggested. The method can be applied in approximate calculations of the field between parallel plates of finite size. Thus, for the lenses discussed in this chapter, it is possible to determine the field deviation from the two-dimensional pattern, which is due to the finite size of the electrodes, and to estimate the effect of this factor on the lens optics.
X. TRANSAXIAL LENSES
The term transaxial has been introduced to describe lenses in which the field has axial symmetry but the beam axis does not coincide with the symmetry axis; the optic axis is now normal to the latter (Strashkevich, 1962). Such lenses are represented schematically in Fig. 63. They can be formed from a series of coaxial circular cylinders with ring-like slits [or parts of such
FIG. 63. Modifications of immersion transaxial lenses: (a) cylindrical electrodes; (b) planar electrodes.
cylinders, see Fig. 63(a)], with the beam axis lying in the plane of symmetry normal to the generating lines of the cylinders. In another modification [Fig. 63(b)], which has turned out to be of greater practical interest, the lens is formed from pairs of parallel plates with circular gaps between the pairs. In this case the beam axis lies in the plane of symmetry parallel to the electrodes. In a transaxial lens, as in a two-dimensional lens, the aperture has very different dimensions in two mutually perpendicular directions (in the planes xOz and yOz; see Fig. 63). In the xOz-plane, which is usually called the midplane, the aperture is large, and the size of the beam focused in this plane may also be large. For this reason, such lenses are convenient for handling disc beams, as well as fan-shaped beams. Unlike the two-dimensional lens, the transaxial lens has a nonzero optical power in all the planes of symmetry. Such a lens converges charged particles in two perpendicular directions. However, its optical power in the midplane is considerably lower than that in the normal direction. As a result, a transaxial lens possesses, as a rule, considerable astigmatism. If a transaxial lens has one circular gap, it represents an immersion lens; if there is more than one gap, it may be an einzel lens. In the midplane the beams usually form an angle less than 360°, so the electrodes need not be axially symmetric in the whole region. Their angular size is selected so as to provide, with sufficient accuracy, the axial symmetry of the field in the region through which the beam passes. In a sense, the transaxial lens is closer to a glass lens than any other type of electrostatic lens. In the midplane, its equipotentials have the form of concentric circles (or their segments); the radius of curvature and electron optical refractive index can be regulated independently. In the midplane, this
lens may be convergent or divergent; its spherical and chromatic aberrations can be corrected. An advantage of transaxial lenses is the considerable simplicity of calculation of the electron optical properties in the midplane. It should be emphasized that the transaxial lens, unlike other types of lenses, has no magnetic analogue. Although the lenses in question are comparatively new, they have attracted much attention due to their properties and have already been applied in electrostatic prism spectrometers to improve their characteristics (Kel'man et al., 1979). The paraxial optics of transaxial lenses will be considered in Section X.A, their aberrations will be considered in Sections X.B and X.C, and some modifications of these lenses will be described in Section X.D.
A. Potential Distribution and Focusing in Transaxial Lenses
The theoretical foundations of transaxial lenses have been laid by Brodsky and Yavor (1970, 1971) and by Karetskaya et al. (1970, 1971); the first two also take into account the relativistic effect. Here we shall restrict ourselves to a description of the optical properties of transaxial lenses transmitting low-energy beams. Let us consider the field distribution in the vicinity of the lens optic axis, which represents, as usual, the intersection line of two planes of field symmetry. We shall locate the origin at the center of curvature of the circular gaps between the electrodes and direct the z-axis along the lens axis. Because the lens field at a sufficiently large distance from the edges possesses rotational symmetry, the potential expansion coefficients in Eq. (33) in the vicinity of the lens axis are not independent, but are interrelated by the following expressions obtained from the Laplace equation:
One can see from Eq. (332) that all the coefficients are expressed in terms of the potential distribution on the axis $\phi(z)$, which completely determines the field distribution throughout the space. Considering these relations, we can obtain from Eq. (38) the paraxial trajectory equations:
In the relativistic case, the second and third terms of Eqs. (333) contain an additional factor
where $\varepsilon = -e/2mc^2$. As before, the potential is taken to be zero at the point where the particle velocity is zero. A comparison with the paraxial trajectory equations for a two-dimensional lens (314) shows that Eqs. (333) contain new terms with the factor 1/z. These are responsible for the particle focusing in the xOz-plane, producing a certain difference in the optical properties of two-dimensional and transaxial lenses in the yOz-plane. However, for a large gap curvature radius, this difference is only slight, and for practical applications, the first-order properties of transaxial lenses in the yOz-plane can be found using the formulae and data for two-dimensional lenses. If a more accurate calculation of paraxial properties in the direction parallel to the yOz-plane is necessary, they should be found from the second expression of Eqs. (333) by the conventional method described in Section III. The second equation of (40), taking into account Eq. (332), can be written as follows:
where $Y = y\phi^{1/4}$. Equation (335) differs from the analogous Eq. (319) for a two-dimensional lens in the additional term in the right-hand side. By integrating Eq. (335) in the thin-lens approximation and taking into account $\phi'(z_1) = \phi'(z_2) = 0$, we obtain the following expression for the focal length in image space:
Here the coordinate z was taken to be constant within the lens and equal to R, the radius of curvature of the refractive layer. Of primary importance is the focusing parallel to the midplane that is described by the first of Eqs. (333). Its solution can be reduced to quadratures as follows:
Here the subscript 0 designates the initial values of the corresponding parameters. Expression (337) completely defines the trajectory projections onto the midplane, and it can yield the values for all the cardinal elements in this plane.
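The quadrature solution lends itself to direct numerical evaluation. The sketch below is an illustration under stated assumptions rather than a transcription of Eq. (337): it uses the form that follows from conservation of angular momentum about the center of curvature in the nonrelativistic paraxial case, $x(z) = z\left[x_0/z_0 + (x_0' z_0 - x_0)\sqrt{\phi_0}\int_{z_0}^{z} dz/(z^2\sqrt{\phi})\right]$, with a hand-written Simpson rule. The function names, the integration routine, and the test potential are hypothetical.

```python
from math import sqrt


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def midplane_x(z, z0, x0, slope0, phi):
    """Paraxial x-coordinate at z for a ray starting at (z0, x0) with slope dx/dz = slope0.

    phi -- axial potential as a function of z (positive everywhere on the path).
    The origin is the center of curvature, so [z0, z] must not contain z = 0.
    """
    integral = simpson(lambda t: 1.0 / (t * t * sqrt(phi(t))), z0, z)
    return z * (x0 / z0 + (slope0 * z0 - x0) * sqrt(phi(z0)) * integral)


# Check in a field-free region: the trajectory must be the straight line
# x = x0 + slope0 * (z - z0), i.e. close to 0.2 at z = 3 for the values below.
print(midplane_x(3.0, 1.0, 0.1, 0.05, lambda t: 1.0))
```

A ray aimed through the center of curvature has $x_0' z_0 = x_0$, so the integral term drops out and the ray stays on the radial line $x = z\,x_0/z_0$ whatever the potential; this is the undeflected central ray used in the discussion of magnification.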
The relationship between the coordinates z of a point object lying on the lens axis and its image can be obtained from Eq. (337) by setting $x_0 = 0$ and $x(z_i) = 0$. Then we have
From expression (338) it is easy to determine the position of foci in object space $z(F_{ox})$ and in image space $z(F_{ix})$ by setting $z_i \to \infty$ or $z_0 \to -\infty$, respectively. Then we shall get
For a trajectory parallel to the z-axis in object space, from Eq. (337) we find:
Hence, taking into account Eqs. (50) and (57), we shall find the focal lengths in the midplane of a transaxial lens
From comparison of Eqs. (339) and (341), one can see that
$$z(F_{ox}) = -f_{ix}, \qquad z(F_{ix}) = f_{ox}. \tag{342}$$
The positions of the principal planes can be found using expression (49) as well as (339) and (341):
It is clear that the principal planes $H_{ox}$ and $H_{ix}$ of the lens coincide, so it can be regarded as a thin lens. Let us consider an immersion lens formed from a pair of electrodes separated by a narrow circular gap with radius R. When the gap width is much smaller than its radius of curvature, the integral in expression (333) can be found in the thin-lens approximation. Considering the value of z constant and equal to R within the field, let us factor it outside the integral sign to obtain the expression
(344)
One can see that the position of the principal planes coincides with the z-coordinate of the gap if the latter is sufficiently narrow. Using Eq. (344), we can rewrite expressions (338), (339), and (340) in a form analogous to the formulae of glass optics for a cylindrical boundary between two media. These expressions will completely coincide if we introduce the electron optical refractive index $n = \sqrt{\phi}$. When the interelectrode gap is not narrow, or when the lens contains several concentric gaps, one can introduce the effective radius of curvature $R_e = z(H_x)$, which is defined by expression (343). Then the formulae of paraxial optics in the midplane will also coincide, in the general case, with those of glass optics for a cylindrical surface. Expression (338) will take the form:
Correspondingly, the expressions for the positions of foci and focal lengths can be written as follows:
It follows from this that a transaxial lens with concentric gaps can be replaced by one refractive surface with a radius of curvature equal to $R_e$ and potentials $\phi_0$ and $\phi_i$ on either side of it. It is clear from Eq. (346) that the sign of the focal lengths depends on the sign of the effective radius of curvature and the potential ratio in object and image space. By varying these values, we can easily pass from a converging lens to a diverging one and vice versa. For instance, if it is necessary to change the sign of the focal length in an immersion lens with one gap, leaving the potential ratio $\phi_i/\phi_0$ constant, we should merely change the direction of the gap curvature relative to the beam direction. The magnification of a transaxial lens is given by the expression
$$M = \frac{z_i}{z_0}. \tag{347}$$
It is easy to see this if we draw a trajectory through the center of curvature and the object point and recall that such a trajectory is not refracted by a transaxial lens. For a lens formed from a few pairs of electrodes separated by narrow concentric gaps, the effective radius of curvature can be obtained by dividing
the integration interval in Eq. (343) into the corresponding number of parts
Here $R_k$ is the radius of curvature of the k-th gap; $\phi_k$ and $\phi_{k-1}$ represent the electrode potentials on the right and on the left of this gap, respectively; and N is the number of gaps, with $\phi_N = \phi_i$. When deducing this formula, it was assumed that the distance between the gaps was much larger than the gap width and the interelectrode distance in the direction normal to the midplane. The optical power of a transaxial lens was calculated by Brodsky and Yavor (1970), assuming that the gap was not narrow and that the potential in the refractive layer was distributed linearly. The above formulae cannot be used straightforwardly to describe multielectrode transaxial systems with nonconcentric gaps. Such a system should be subdivided into elements in which all the gaps are concentric; the optical parameters of each element should be calculated individually, and then the parameters of the system as a whole should be determined by the methods of matrix algebra described in Section III.D. It is evident that one can proceed in this way only if the fields of the neighboring gaps do not overlap and if the subdivision into elements is justified. If, however, the fields of nonconcentric gaps overlap, the axial symmetry of the system is disturbed, so it cannot be regarded as transaxial. Nonconcentric systems possess greater flexibility, but their calculation is more sophisticated (Nevinny et al., 1985).
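The thin-gap relations of this subsection can be collected in a short numerical sketch. It is written under explicit assumptions: each narrow gap is treated as a refracting cylindrical surface with index $n = \sqrt{\phi}$, the gap contributions add as $\sum_k (1/R_k)(1/n_k - 1/n_{k-1})$, the conjugate positions measured from the common center satisfy $n_0/z_i - n_i/z_0 = (n_0 - n_i)/R_e$, and $M = z_i/z_0$. The sign conventions and all function names are hypothetical choices made for illustration.

```python
from math import sqrt


def lens_power(radii, potentials):
    """Sum over the gaps of (1/R_k) * (1/n_k - 1/n_{k-1}), with n = sqrt(phi).

    radii      -- R_1..R_N, gap radii measured from the common center of curvature
    potentials -- phi_0..phi_N, electrode potentials from object to image side
    """
    n = [sqrt(p) for p in potentials]
    return sum((1.0 / n[k + 1] - 1.0 / n[k]) / r for k, r in enumerate(radii))


def effective_radius(radii, potentials):
    """Radius of the single refracting surface equivalent to the set of gaps."""
    n_o, n_i = sqrt(potentials[0]), sqrt(potentials[-1])
    return (1.0 / n_i - 1.0 / n_o) / lens_power(radii, potentials)


def image_position(z_o, radii, potentials):
    """Paraxial image position from 1/(n_i z_i) - 1/(n_o z_o) = lens power.

    Raises ZeroDivisionError when the object sits at the object-side focus.
    """
    n_o, n_i = sqrt(potentials[0]), sqrt(potentials[-1])
    return 1.0 / (n_i * (1.0 / (n_o * z_o) + lens_power(radii, potentials)))


def magnification(z_o, z_i):
    """Midplane magnification; the ray through the center is not deflected."""
    return z_i / z_o


# Single retarding gap: R = 1, phi_0 = 4, phi_i = 1, object at z_o = -3.
z_i = image_position(-3.0, [1.0], [4.0, 1.0])
print(z_i, magnification(-3.0, z_i), effective_radius([1.0], [4.0, 1.0]))
```

Working with the summed power rather than with $R_e$ avoids a division of zero by zero for an einzel arrangement, where $\phi_i = \phi_0$ and the effective radius defined this way vanishes.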
B. Geometrical Aberrations of Transaxial Lenses
Third-order geometrical aberrations of transaxial lenses can be calculated using Eqs. (85) by the method described in Sections IV.A and IV.B. In the general case, the aberration blurring is characterized by 10 coefficients in the direction parallel to the midplane and by 10 coefficients in the perpendicular direction. By choosing the aperture position, some of the coefficients can be reduced to zero. The calculation of the aberration blurring $\Delta x$ for transaxial lenses can be considerably simplified by writing the x-projection of a trajectory in terms of quadratures [see (335)]. Then the two linearly independent solutions of the paraxial equation will take the form
Expressions for all the aberration coefficients are given in the work of Kel'man et al. (1979).
FIG. 64. A diagram for the calculation of spherical aberration in the midplane of a transaxial lens.
An important property of a transaxial lens is the possibility of determining exactly its spherical aberration in the midplane without restriction to the third-order term. For this purpose, an original approach to the calculation of the lens optical properties has been developed that is based on axial field symmetry (Brodsky and Yavor, 1970, 1971). Let us assume that the lens field in the midplane is enclosed between the arcs of two concentric circles with the radii $R_1$ and $R_2$ (Fig. 64) and that the potentials on both sides of the lens are constant and equal to $\phi_0$ and $\phi_i$, respectively. Having passed from the Cartesian coordinates x, z to the cylindrical coordinates r, $\psi$, we can write the exact expression for the trajectory in the midplane.
Here, the subscript 0 marks the parameters on the left of the lens, the subscript s marks the parameters on the right of it, and $h_0$ and $h_s$ represent the “impact” parameters of the incident and outgoing trajectory, respectively. In an axially symmetric field, the generalized angular momentum of a charged particle $P_\psi$ is constant:
$$P_\psi = mr^2\dot{\psi} = \text{const}. \tag{351}$$
Hence, we have
Let us denote by $z_s$ the exact value of the distance between the lens center and the point of intersection with the lens axis of an outgoing trajectory leaving, at an arbitrary angle, a point object located on the axis. As before, the positions of the point object and its paraxial image will be denoted by $z_0$ and $z_i$. This notation has been retained, in spite of the use of cylindrical coordinates, in order to have a certain uniformity in the formulae of paraxial optics and spherical aberration. The exact value of the transverse spherical aberration is given by the expression
$$\Delta x_i = (z_i - z_s)\tan\gamma_s. \tag{353}$$
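The definition (353) can be explored numerically in the simplest case. The sketch below assumes a single infinitely narrow gap of radius R, so that the trajectory consists of two straight segments; conservation of angular momentum then reduces to $\sqrt{\phi_0}\,h_0 = \sqrt{\phi_i}\,h_s$ for the signed impact parameters $h = z\sin\gamma$, and the ray direction changes by the difference of the two angles of incidence. The thin-gap model, the sign conventions, and the function names are assumptions made for illustration, not the general expressions (357) and (358).

```python
from math import asin, sin, sqrt, tan


def paraxial_image(z_o, radius, phi_o, phi_i):
    """Gaussian image position from n_o/z_i - n_i/z_o = (n_o - n_i)/R."""
    n_o, n_i = sqrt(phi_o), sqrt(phi_i)
    return n_o / (n_i / z_o + (n_o - n_i) / radius)


def exact_crossing(z_o, gamma_o, radius, phi_o, phi_i):
    """Axis crossing z_s and slope angle gamma_s of the exact outgoing ray.

    z_o     -- object position on the axis (origin at the center of curvature)
    gamma_o -- angle between the incident ray and the axis, radians
    """
    n_o, n_i = sqrt(phi_o), sqrt(phi_i)
    h_o = z_o * sin(gamma_o)        # signed impact parameter of the incident ray
    h_s = h_o * n_o / n_i           # angular momentum conservation at the gap
    gamma_s = gamma_o - asin(h_o / radius) + asin(h_s / radius)
    return h_s / sin(gamma_s), gamma_s


def transverse_aberration(z_o, gamma_o, radius, phi_o, phi_i):
    """Transverse spherical aberration (z_i - z_s) * tan(gamma_s)."""
    z_i = paraxial_image(z_o, radius, phi_o, phi_i)
    z_s, gamma_s = exact_crossing(z_o, gamma_o, radius, phi_o, phi_i)
    return (z_i - z_s) * tan(gamma_s)


# Retarding gap: R = 1, phi_0 = 4, phi_i = 1, object at z_o = -3 (image near z = 3).
for gamma in (1e-3, 2e-3, 4e-3):
    print(gamma, transverse_aberration(-3.0, gamma, 1.0, 4.0, 1.0))
```

For small angles the printed values scale roughly as the cube of the aperture angle, as expected for a third-order aberration, while the function itself remains valid at larger angles as long as both arcsine arguments stay within the unit interval.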
For zs, we have from Eq. (352)
It can be seen from Fig. 64 that the angles $\gamma_s$ and $\gamma_0$ are related as follows:
$$\gamma_s = \gamma_0 + \psi_s - \psi_0 + \arcsin\left(\frac{h_s}{R_2}\right) - \arcsin\left(\frac{h_0}{R_1}\right). \tag{355}$$
Substituting the expression for the trajectory from Eq. (350) into Eq. (355) and integrating it by parts, we will have
Expression (353) for the spherical aberration, after substitution of $z_i$ and $z_s$ from Eqs. (338) and (354), will be written as follows:
(357) where
Since it has been assumed that the field is enclosed within the radii $R_1$ and $R_2$, but the object and image are outside the field, the extension of the integration limits in Eq. (358) will not change the result. Expressions (357) and (358) give the exact value of the spherical aberration. By series expansion of the expressions in powers of the small parameter $h_0$,
one obtains the spherical aberration with accuracy up to terms of any order:
Here A, =
(21 - l)!! (21)!!(21+ 1)
1
[-F+
(21 + l ) q q +l i 2 ) (360)
where Ek are Euler numbers, and Bk are Bernoulli numbers. The expression for the spherical aberration up to fifth-order accuracy has the form Ax
=
&[(% 2)h i + -
(&A:
-$)h:].
(361)
Let us consider, as earlier, a lens formed from several pairs of electrodes separated by narrow concentric gaps. By integrating Eqs. (358) and (360) in the thin-lens approximation, we shall obtain an expression for a lens containing N narrow gaps: 1 A o = - - - J & ZO
(21 - l)!! A , = (21)!!(21+ 1)
N
1
1
k = l Rk
1
(E-c)’ 1
[- p f (R,)
Z1+t
-k=l
(362)
We shall write out the expression for $A_1$ because, together with $A_0$, it defines the third-order aberration
By choosing the values of the electrode potentials and the radii of circular gaps suitably, we can minimize the spherical aberration. If we halt at third-order aberration, it is clear that a system with the desired first-order properties and compensated third-order spherical aberration can be obtained by combining the refractive layers, very much like the way in which this is done in glass optics.
C. Chromatic Aberration
The chromatic aberration of transaxial lenses can be found from expressions (99) by substituting the linearly independent solutions of paraxial equations (333) into them. Then the expression for the chromatic aberration has the form of Eq. (100). In the yOz-plane, the general expressions for linearly independent solutions of the paraxial equation in explicit form are unknown. For this reason, the aberrations are usually calculated by numerical methods or semianalytically, using some field distribution models. For the trajectory projection onto the midplane, the independent solutions are written as quadratures (349), and the expression for $\Delta x_i$ has the form (Karetskaya et al., 1970):
xbz,)P--,A 4
(364)
40
where
Here, as earlier, it has been assumed that the object is located on the concave side of the lens. For a parallel incident beam the expression in this case is p = p - - (40)3’2[(34i{0B F-2 4 i
,4512
*)
z4312
- 11.
(366)
Hence, we can derive an expression for an arbitrary position of the object
It follows from Eq. (364) that the coefficients of axial chromatic aberration and chromatic aberration of magnification are interrelated and can vanish simultaneously at P = 0. Moreover, as was pointed out in Section IV, the aberration of magnification can be corrected by choosing the aperture position suitably. In a transaxial lens the correction requires $z_B = 0$.
In a multielectrode transaxial lens with narrow concentric gaps, PF can be easily found:
From Eqs. (367) and (368), the conclusion can be drawn that the chromatic aberration in a lens formed by one refractive layer cannot be cancelled. One exception is the case when the object is at the center of curvature (zo = 0); however, there is then no focusing. Let us consider a lens formed by electrodes with two concentric gaps. The expression for the effective radius of curvature, from Eq. (348), is
while PF is equal to
A lens consisting of two refractive layers has enough free parameters for it to be possible to cancel the chromatic aberration. For instance, for a parallel beam incident from the convex side, the condition of achromatism will take the form
Thus, the chromatic aberration in the midplane of a transaxial lens can be completely eliminated by selection of relations between the electrical and geometrical parameters of the lens, provided that it consists of at least two refractive layers.
D. Transaxial Lenses Formed by Parallel Plates
Here we shall discuss transaxial lenses in which the electrodes lie in two parallel planes and are separated by narrow circular gaps (Glikman et al., 1971). Such a lens having one gap is illustrated in Fig. 63(b). The field of a transaxial lens with planar electrodes was found by the method of separation of variables, assuming that the gaps between the electrodes are infinitesimal. For the case of two concentric gaps, the expression
is [in polar coordinates (r,z)]
Here $I_0$ and $I_1$ are the Bessel functions, $R_1$ and $R_2$ are the gap radii of curvature, d is the distance between the planes on which the electrodes are located, and $\phi_1$ is the intermediate electrode potential. In Cartesian coordinates [see Fig. 63(b)] the expression for the potential distribution along the optical axis has the form
$$\phi(z) = (\phi_0 - \phi_1)F(z,R_1) + (\phi_1 - \phi_i)F(z,R_2) + \phi_i, \tag{373}$$
where
Because the lenses in question were designed to operate in prism spectrometers, two operational regimes were studied. In one regime the lens focuses the incident parallel beam in the midplane, leaving it parallel in the perpendicular plane; the latter is achieved by producing an intermediate focus in this plane. In the other regime (anamorphotic), the beam going out from a point source is transformed to a parallel beam in both directions. These regimes are provided by selection of the intermediate electrode potential at a given energy of the outgoing beam. The expression for the potential distribution (373) has been used to find the required values of the potential on the intermediate electrode, the cardinal elements, and the geometrical and chromatic aberrations. These data have been summarized in numerous plots and tables. It should be noted that they are in good agreement with the calculations obtained in the thin-lens approximation (see above). The transaxial lenses we have described have been used as collimating and focusing lenses in prism instruments, such as the electron Auger spectrometer (Bobykin et al., 1978) and the mass spectrometer (Kel'man et al., 1976). The design of transaxial lenses makes them suitable devices to combine with prism analyzers. Both transmit wide beams in the midplane, while in the perpendicular plane the beam cross section is very small. The astigmatism of transaxial lenses is necessary to match the particle sources with the prisms. An advantage is the low aberration of the lens in the midplane, which favors a higher resolving power of the device.
FIG. 65. Electron optical diagram of a prism Auger spectrometer: (1, 2, 3) prism electrodes; (3, 4, 5) transaxial lenses; (8) source; (9) detector.
Figure 65 shows a schematic diagram of an electron prism spectrometer. One can see that all the elements form one unit placed on two parallel planes. The parameters of the three-electrode transaxial lenses are as follows: the distance between the parallel electrodes is d = 12 mm; the average gap radius between electrodes 4 and 5 is $R_1 = 5d$ and that between electrodes 3 and 4 is $R_2 = 7d$; and the circular gap width between the electrodes is 0.25d. The lenses operate in the anamorphotic mode, and the ratio of the electrode potentials is $\phi_4/\phi_5 = 3.22$ at $\phi_3/\phi_5 = 0.16$. The focal lengths of the lenses in the midplane and in the plane normal to it are very different. For the collimating lens $f_{ox} = 15.5d$, $f_{oy} = 4.51d$; the focusing lens has the same focal lengths in image space. Due to this relation between the focal lengths, it is possible to use large divergence angles in the yOz-plane (normal to the midplane). The large focal length in the midplane increases the linear dispersion of the instrument. The basic characteristics of the instrument are as follows. The relative line half-width of the spectrometer measured from the peak of the elastically scattered electrons is 0.2% when the diameter of the primary electron beam is 1 mm and the exit slit width of the energy analyzer is 3 mm. The aperture ratio of the analyzer is 0.12% of $4\pi$, and the luminous emittance L = 0.32% mm². The analyzed electron energy varies from 150 to 2300 eV.
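Since the lens parameters above are quoted in units of the plate separation d, a short conversion makes the physical scale of the instrument explicit. The sketch below only restates the quoted figures in millimetres and forms the ratio of the two focal lengths; the dictionary keys are hypothetical labels.

```python
# Parameters quoted in the text, in units of the plate separation d.
D_MM = 12.0
params_in_d = {
    "R1": 5.0,          # average gap radius between electrodes 4 and 5
    "R2": 7.0,          # average gap radius between electrodes 3 and 4
    "gap_width": 0.25,  # circular gap width
    "f_ox": 15.5,       # focal length in the midplane
    "f_oy": 4.51,       # focal length in the plane normal to the midplane
}

# Convert every length to millimetres.
params_mm = {name: value * D_MM for name, value in params_in_d.items()}
# Ratio of the midplane focal length to the focal length in the normal plane.
focal_ratio = params_in_d["f_ox"] / params_in_d["f_oy"]

for name, value in params_mm.items():
    print(f"{name:9s} = {value:7.2f} mm")
print(f"f_ox / f_oy = {focal_ratio:.2f}")
```

The midplane focal length is thus more than three times the focal length in the normal plane, which quantifies the strong astigmatism that the text describes as necessary for matching the source to the prism.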
FIG. 66. Diagram of a prism mass spectrometer with transaxial lenses: (1-4) electrodes of a transaxial lens; (4-6) electrodes of a telescopic system; (M) magnetic prism; (S) source; (D) detector; (A) aperture.
A schematic diagram of a prism mass spectrometer is shown in Fig. 66. The spectrometer uses four-electrode transaxial immersion lenses to collimate and focus the beams. The distance between the parallel plates is 8 mm, and the interelectrode gaps are 1 mm. The focal length in the midplane is 460 mm. The linear dispersion of the mass spectrometer is 11.4 mm per 1% mass change. The resolving power is 120,000 at the peak half-width and 50,000 at 10% height. Transaxial lenses can be used to increase dispersion in spectrographs. The electron optical scheme of a magnetic spectrograph (Fig. 67) consisting of a sector analyzer and a four-electrode transaxial einzel lens was calculated by
FIG. 67. A magnetic spectrograph with higher dispersion: (1-4) electrodes of a transaxial lens; (5) sector magnet.
Afanas'ev et al. (1982). The dispersion is increased because the transaxial lens increases the angle between the axial trajectories of different monoenergetic components of the particle beam without violating the focusing of these monoenergetic components. In the system shown in Fig. 67, the center of curvature of the first refractive layer coincides with the magnet deflection center, so the first layer does not change the dispersion. It serves to retard the beam as a whole and to focus individual monoenergetic components. The main contribution to dispersion occurs in the second refractive layer, whose radius of curvature is small, but the potential ratio is large. The third refractive layer further increases the dispersion and retards the beam down to the initial energy. The calculations show that the system provides a 4.5- to 5.0-fold increase in the angular dispersion. The foregoing discussion offers guidance about the advisability of creating and combining various electron optical elements, including transaxial lenses, on two parallel plates.
XI. CROSSED LENSES
Crossed lenses are a new type of lens (Afanas'ev et al., 1980). Their basic difference from those described earlier is the intrinsic three-dimensionality of the field, which makes their calculation much more sophisticated. A crossed lens consists of a series of parallel plates with apertures having two planes of symmetry and adjusted coaxially. The dimensions of the apertures in two perpendicular directions are different; in the neighboring plates the apertures are rotated through 90° relative to each other. Figure 68 illustrates a simple crossed einzel lens with rectangular apertures. It consists of three electrodes. Equal potentials V_1 are applied to the two outer electrodes, and the potential V_2 is applied to the inner electrode. Sometimes einzel lenses have more complex aperture shapes or a larger number of electrodes with alternating applied potentials V_1 and V_2. A crossed lens, like any other lens with a varying axial potential, may be of immersion type. A simple immersion lens contains only two electrodes with different potentials. Crossed lenses are simple in design and construction and are easily adjustable. For these reasons, such lenses are widely used in cathode-ray tubes. Qualitatively, the optics of a crossed lens can be regarded as the combined effect of a round lens, a quadrupole, and an octupole; generally, therefore, crossed lenses are astigmatic. The relative contribution of the quadrupole component to the lens optical power is largely determined by the difference
FIG. 68. A three-electrode einzel crossed lens.
between the transverse and longitudinal aperture lengths, and commonly it exceeds the contribution of the round-lens component. A single crossed lens is thus convergent in one of the planes but divergent in the perpendicular plane. If the electrodes are arranged as in Fig. 68, the focusing occurs in the xOz-plane for (V_2 - V_1)/V_1 > 0 or in the yOz-plane for (V_2 - V_1)/V_1 < 0. A beam can be focused in all directions by a system of crossed lenses, just as in quadrupole optics. The presence of an octupole component in the potential distribution offers the possibility of correcting the lens aberrations. The term crossed lens was first introduced by Yavor (1970), where the electron optical properties of these lenses were considered in the approximation of narrow and infinitely long electrode apertures. These lenses differ from a series of two-dimensional lenses turned 90° to each other in that the field in the latter is created by plates with parallel slits, while the plates with perpendicular slits have equal potentials and field-free space between them. It was shown that crossed lenses may be divergent and achromatic. In the design of a commercial quadrupole lens, Himmelbauer (1969) also used a set of plate electrodes with curved apertures possessing two planes of symmetry (Fig. 69). A series of such electrodes alternately rotated 90° to each other models the aperture of a quadrupole lens. However, in an attempt to exactly reproduce a quadrupole field, the author failed to reveal the advantages of a crossed lens over a quadrupole, in particular, the possibility of correcting third-order aberrations. An einzel lens is an integral part of most systems formed from crossed lenses, so Section XI.A describes its simple modification and Section XI.B discusses possible optimization of its parameters.
FIG. 69. An electrode of a crossed lens with a curved aperture.
Crossed lenses, like many other astigmatic lenses, are usually used in various combinations. This question is discussed in Section XI.C. In Section XI.D we shall describe various correctors designed on the basis of plate electrodes normal to the axis.
A. A Three-Electrode Einzel Crossed Lens with Identical Rectangular Apertures
The field of a crossed lens has two planes of symmetry, and its potential distribution is described by expression (33), in which all the coefficients are nonzero. Such a field cannot be approximated by any two-dimensional models. The calculations are a very complex problem, even for a lens with simple rectangular apertures. The potential distribution of an einzel crossed lens with equally spaced electrodes (Fig. 68) was calculated using potential expansion in the apertures in terms of orthogonal functions (Afanas’ev and Yavor, 1973, 1977). In each aperture the potential was expanded in the complete orthonormalized set of functions. The Dirichlet problem for the regions between the apertures with a common term of this series as the boundary value can be solved exactly, after which the unknown coefficients of the expansion are found from the continuity of the normal potential derivative in each aperture. The order of the set of linear equations, to the solution of which the problem is reduced, is much lower than in the integral equation approach, let alone in the finite difference method. However, the calculation of the coefficients becomes a
more complicated task. This method allows us to find the potential and its derivatives on the system axis with high accuracy, using a small computer, in an acceptable time; this would not be possible with the finite difference method or the integral equation approach. The field calculations for a modification of a crossed einzel lens are given in Fig. 70, which shows the potential distribution on the lens axis φ_00(z) = φ(z) - V_1, as well as the transverse derivatives of the potential φ(x, y, z) inclusive up
FIG. 70. Potential distribution components of an einzel crossed lens.
to the fourth order:
$$\varphi_{2p,2q}(z) = \left.\frac{\partial^{2p+2q}\varphi(x, y, z)}{\partial x^{2p}\,\partial y^{2q}}\right|_{x=y=0}.$$
It follows from the expression for the potential distribution in Eq. (33) that φ_00(z) and (φ_20 + φ_02)/2 correspond to the axially symmetric component of the field, while (φ_20 - φ_02)/2 represents the quadrupole component. We should recall that here, as usual, the potential φ(x, y, z) is zero where the particle velocity is zero. The electrode potentials are measured from this point, that is, from the cathode. If the potential distribution is known, the first- and third-order optical characteristics of the lenses can be obtained by solving Eqs. (38) and (85). Numerical integration was used by Afanas'ev and Yavor (1977) to find the paraxial properties and spherical aberration coefficients for a three-electrode einzel lens with rectangular apertures, as well as for doublets formed from such lenses. Due to the complexity of these calculations for crossed lenses, they were largely studied experimentally. First-order focusing properties of a crossed einzel lens were analyzed by Petrov and Yavor (1975), and by Petrov (1976). The measurements were made on an electron optical bench by the two-grid shadow method. The lens was formed from three plate electrodes with identical rectangular apertures. The outer electrodes had the same potentials, equal to the anode potential V_1, while the inner electrode had a potential V_2. The potential ratio V_2/V_1 on the lens electrodes and its geometrical parameters, namely, the ratio of the aperture sides a/b and the interelectrode distance h, varied in a wide range. Figure 71 illustrates the measurements of the distance from the lens center
FIG. 71. First-order parameters of an einzel crossed lens.
to the image z_i/2b, of the angular magnification Γ, the focal length f/2b for the aperture side ratio a/b = 2, and the distance between the object and the lens center z_0/2b = 13. The abscissa shows the potential ratio V_2/V_1. The dashed curves represent the xOz-plane (the inner electrode aperture is extended in the direction Ox), and the solid curves refer to the yOz-plane. Curves 1, 2, and 3 correspond to the interelectrode distances h/2b = 1.0, 0.5, and 0.3. From the experimental data and calculations, one can conclude that the optical power of the lens grows in the given range of geometrical parameters with increasing a/b ratio and decreasing distance between electrodes h/2b (h/2b ≥ 0.2). This is where a crossed lens essentially differs from a quadrupole, whose optical power drops with decreasing length; in addition, a crossed lens is weaker than a quadrupole lens. A comparison with two-dimensional and round lenses consisting of three plates shows that the latter are weaker than a crossed lens for identical electrode potentials and identical distances between them. Baranova et al. (1987a) made detailed measurements of the focal lengths of an einzel crossed lens in the converging and diverging planes. Approximate functions were found that permit calculation of first-order properties with 4-5% error. The spherical aberration of a crossed einzel lens was studied by Petrov and Yavor (1975, 1976). The coefficient C_1x, responsible for the aberration of the line image in the midplane, was measured. Figure 72(a) shows the dependence
FIG. 72. The spherical aberration coefficient C_1x of an einzel crossed lens as a function of (a) the image position z_i; (b) the aperture side ratio a/b.
of C_1x on the distance between the lens center and the Gaussian image plane. The dashed curves describe the case when the inner electrode potential is lower than the potential on the outer electrodes (V_2 < V_1); the solid curves correspond to the case V_2 > V_1. The lens parameters are: a/b = 2 and h/2b = 0.33, 0.5, and 1.0 for curves 1, 2, and 3, respectively. It is clear from the picture that the coefficient C_1x may change its sign and become negative. This happens only when the potential on the inner electrode is higher than the potential on the outer electrodes (V_2 > V_1). For V_2 < V_1 the spherical aberration coefficient is always positive. These potential relations are valid for electrons. For ions they are reversed; for V_2 < V_1 the coefficient C_1x may change its sign, but for V_2 > V_1 its sign is constant. This conclusion holds for crossed lenses in the whole range of the measurements and calculations. Figure 72(b) shows the dependence of the spherical aberration coefficient C_1x on the ratio of the aperture sides a/b, which considerably affects the aberration value. At a/b = 1 (square aperture), the quadrupole component of the lens field is zero. For this reason, the spherical aberration coefficients in the planes xOz and yOz (the same as the paraxial properties) are identical. However, even a slight growth of a/b sharply reduces the coefficient C_1x; when V_2 > V_1 it passes through zero near a/b = 1. For large values of a/b the spherical aberration depends but slightly on this parameter.
B. Modifications of an Einzel Crossed Lens
The effect of aperture shape on the electron optical parameters of a three-electrode einzel lens was studied experimentally by Baranova et al. (1982). The optical power and spherical aberration coefficients C_1x, C_2x in lenses with rectangular and curved apertures were compared (Fig. 69). The measurements showed that at the same values of a/b, the optical power of the lens with rectangular apertures was higher than that of the lens with curved apertures. The aberration characteristics in the midplanes of both lenses are similar. If the inner electrode potential is higher than the potential on the outer electrodes, then the coefficient C_1x in either lens grows with the interelectrode distance, going to zero under certain conditions. Therefore, by varying this distance, one can obtain C_1x = 0 for different positions of the image. In a lens with rectangular apertures, the variation limits for the coefficient C_1x are somewhat smaller. The coefficient C_2x contributes to the aberration of the image outside the midplane. In contrast to C_1x, it decreases with the distance between the lens electrodes. However, the coefficient C_2x of a lens with curved apertures changes its sign under certain conditions, whereas in a lens with rectangular apertures it remains positive in all the operational regimes under study. The sign of the coefficient C_2x in lenses with rectangular apertures can be
changed by reducing the aperture length in the outer electrodes down to a certain value, dependent on the inner electrode geometry. However, C_2x and C_1x do not vanish simultaneously. The aberration properties of a crossed lens can be considerably improved by choosing the conditions under which both coefficients are sufficiently small. For instance, at h/2b = 0.45 and with the ratio a/b = 2.5 for the inner electrode, the ratio a/b ≈ 1.9 should be used for the outer electrodes. If the inner electrode has a/b = 2.0, the outer ones should have an optimal value for a/b of about 1.65; for a/b = 1.6 in the inner electrode, the a/b value for the outer electrodes is about 1.3. Methods to increase the optical power of crossed lenses were considered in the work of Baranova et al. (1986a). It was pointed out earlier that the value 1/f for these lenses grows with increasing a/b ratio. According to the measurements, this value increases twofold in three-electrode lenses if the a/b ratio is raised from 1.6 to 3.0. However, for a given aperture width, this increases the transverse dimensions of the lens, which is not always feasible. There are two other ways of increasing the optical power: by decreasing the interelectrode distance or by using a larger number of electrodes in the series. The measurements made on an electron optical bench by the two-grid shadow method have shown that for a/b = 2.0-3.0 the optical power of a three-electrode einzel lens grows 1.5-1.7 times when the interelectrode distance h/2b changes from 0.5 to 0.16. Hence, the application of lenses with large interelectrode distances is unjustifiable, because this simultaneously reduces the optical power and increases the system length. However, the dependence of 1/f on h is not monotonic; it reaches its maximum at an interelectrode distance on the order of (0.1-0.2) 2b. When the electrodes approach each other more closely, 1/f again begins to drop due to the overlapping of the fields.
The spherical aberration coefficient C_1x passes through zero at a certain interelectrode distance. Further increase of the lens optical power can be achieved by adding one or more pairs of serially arranged electrodes. As earlier, the apertures in the neighboring electrodes are rotated 90° to each other. All the odd electrodes have the anode potential V_1; the potential V_2 is applied to the even electrodes. It should be emphasized that the constructional complexity of crossed lenses depends little on the number of electrodes. The investigations have shown that for a sufficiently large interelectrode distance (h/2b ≥ 0.5), the optical powers nearly add; in a five- or seven-electrode lens, 1/f grows about 2-3 times, respectively. However, with decreasing interelectrode distances, this effect becomes weaker, which seems to be due to the field overlap. When the aperture lengths become greater, the field overlap also grows. Spherical aberration measurements show that in this case there are regimes with small or negative C_1x. A larger number of electrodes at a constant focal length commonly reduces the spherical aberration.
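The statement that the optical powers nearly add can be turned into a rough rule of thumb. The sketch below assumes, as the text reports only for h/2b ≥ 0.5, that each even-numbered electrode contributes the power of one three-electrode lens; the function names are ours, and field overlap at smaller spacings is deliberately ignored.

```python
def inner_electrode_count(n_electrodes):
    """Number of even-numbered electrodes in an einzel crossed lens."""
    if n_electrodes < 3 or n_electrodes % 2 == 0:
        raise ValueError("an odd number of electrodes (>= 3) is assumed")
    return (n_electrodes - 1) // 2


def relative_power_estimate(n_electrodes):
    """Estimated 1/f relative to a three-electrode lens, assuming additive powers.

    This is only a crude estimate for well-separated electrodes; it does not
    model the weakening caused by field overlap at small h/2b.
    """
    return inner_electrode_count(n_electrodes) / inner_electrode_count(3)


for n in (3, 5, 7):
    print(n, "electrodes ->", relative_power_estimate(n), "x the three-electrode power")
```

With five and seven electrodes the estimate gives factors of 2 and 3, consistent with the measured growth of about 2-3 times quoted above.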
We have so far considered the properties of einzel lenses. Immersion lenses, however, have not received that much attention from researchers. The work of Yavor (1970) gives an expression for the focal length of a two-electrode immersion lens regarded as a thin lens:
$$f_x = \frac{2V_1 h}{V_2 - V_1}, \qquad f_y = -\frac{2V_1 h}{V_2 - V_1}. \tag{374}$$
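A minimal sketch of Eq. (374), as we read it, may make the sign behavior explicit. The function name is hypothetical; V_1 and V_2 are taken to be the potentials of the two electrodes measured from the cathode, and h the interelectrode distance, so the focal lengths come out in the same units as h.

```python
def thin_immersion_focal_lengths(v1, v2, h):
    """Thin-lens focal lengths (f_x, f_y) of a two-electrode immersion crossed lens.

    Implements f_x = 2*V_1*h / (V_2 - V_1) and f_y = -f_x, so the lens converges
    in one plane and diverges by the same amount in the perpendicular plane.
    """
    if v1 == v2:
        raise ValueError("equal electrode potentials give no lens action")
    f_x = 2.0 * v1 * h / (v2 - v1)
    return f_x, -f_x


# Illustrative values only: V_1 = 1000 V, V_2 = 1500 V, h = 2 mm.
f_x, f_y = thin_immersion_focal_lengths(1000.0, 1500.0, 2.0)
print(f"f_x = {f_x} mm, f_y = {f_y} mm")
```

In this approximation the two focal lengths are equal in magnitude and opposite in sign, which is the purely quadrupole-like behavior expected when the round-lens contribution is neglected.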
A numerical calculation of such lenses has been made by Gritsuk and Lachashvili (1979), who calculated the electric field by the method of integral equations with boundary collocation. The paraxial properties and spherical aberration were determined from a family of trajectories calculated by integration of the equations of motion. A series of trajectories enters the system for a fixed position of the object. The image position was regarded as the limit to which the point of intersection of the trajectories with the system axis tended as the trajectory inclination in object space tended to zero. Analogous limit values were used to find the spherical aberration coefficients.
C. Systems of Crossed Lenses
Like other astigmatic lenses, crossed lenses are most often used as doublets and triplets. Such lens systems converge the beam in all directions, permitting the creation of a stigmatic image or formation of a beam with controlled astigmatism. A doublet is the simplest system of crossed lenses capable of focusing charged particles in all directions. Figure 73 shows schematic diagrams of five-electrode and six-electrode doublets. In a five-electrode doublet, the central electrode is common to both lenses [Fig. 73(a)]. In this system the first, third, and fifth electrodes have the same
FIG. 73. Schematic diagrams of doublets formed by crossed lenses: (a) five-electrode; (b) six-electrode.
potentials V_1. The second and fourth electrodes have the potentials V_2 and V_4, one of which is to be higher than V_1, and the other lower, so that a five-electrode doublet converges the particles in two perpendicular directions. A six-electrode doublet may be formed either by two equally oriented crossed lenses or by lenses rotated 90° relative to each other [Fig. 73(b)]. In the first case, one of the potentials on the inner electrodes of the lenses must be higher than V_1, and the other must be lower. In the second case, the inner electrode potentials (V_2 and V_5) must be simultaneously higher or lower than V_1 and may have the same value. Although the design of a five-electrode doublet is a little simpler, the six-electrode structure possesses a number of essential advantages. A six-electrode doublet permits a unipolar power supply to be used for the lenses (for example, V_2 > V_1, V_5 > V_1). It has a higher optical power and permits a more complete compensation of the spherical aberration. In a five-electrode doublet, only one of the lenses may possess a negative coefficient of the spherical aberration in the midplane. In a six-electrode doublet, if V_2 > V_1 and V_5 > V_1, one lens may possess a negative spherical aberration in the xOz-plane, the other in the yOz-plane. Thus, in both midplanes one can achieve at least a considerable reduction of the spherical aberration, if not a complete compensation of it. First-order electron optical parameters and spherical aberration of stigmatic six-electrode doublets have been measured by Petrov et al. (1978). The inner electrode potentials of the doublet lenses providing the stigmatic mode are shown in Fig. 74 as a function of the image position. Figure 75 presents the measurements of the spherical aberration of a stigmatic doublet
FIG. 74. Potentials on the lenses of a six-electrode doublet providing the formation of a stigmatic image: curves 1, 2, 3, and 4 correspond to z_0/2b = 10.3, 8.3, 6.7, and 5.7, respectively.
FIG. 75. Spherical aberration coefficient of a six-electrode doublet in the stigmatic mode: curves 1, 2, and 3 correspond to z_0/2b = 10.3, 8.3, and 6.7, respectively.
in the xOz-plane at various positions of the object and image. In one of the regimes, measurements of the spherical aberration coefficients were made in both planes xOz and yOz. Both coefficients, C_1x and C_1y, were shown to be negative and small in their absolute values. A triplet of crossed lenses represents an optically more flexible system, which allows, in particular, similar magnifications to be obtained in both planes of symmetry. Such systems have found wide applications in oscillographs in designs with increasing deflection. They have replaced quadrupoles possessing more complex designs and thus have permitted higher performance characteristics of the instrument to be achieved. Questions concerning the application of crossed lenses for focusing electron beams in vidicons are discussed in the work of Petrov (1982).
D. Correctors of Geometrical Aberrations
Correctors may have a construction very much like that of crossed lenses. Plate electrodes can be used to design electron optical elements, which allow us to correct geometrical aberrations of various orders. The properties of a planar electrode with an aperture are largely determined by the degree of the aperture symmetry. In a crossed lens the electrode aperture has two planes of symmetry; the basic harmonic in the potential expansion of such lenses is thus the quadrupole harmonic, in addition to the axially symmetric one.
FIG. 76. Modifications of planar correctors of geometrical aberrations: (a) with four planes of symmetry; (b) with three, five, and six planes of symmetry, respectively.
An electrode with an aperture having four planes of symmetry [Fig. 76(a)] is analogous to an octupole. The electrostatic field it creates contains the axially symmetric and octupole components as the first two harmonics. Such an element can be used to correct third-order geometrical aberrations. Generally, a planar electrode with an aperture possessing N planes of symmetry corrects aberrations of the (N - 1)-th order. Figure 76(b) shows possible types of such correctors that are easy to combine with crossed lenses, broadening their functional potentialities. Third-order aberration correctors have been studied experimentally in combination with an einzel crossed lens and a doublet of such lenses (Baranova et al., 1978; Baranova et al., 1982). It is shown that the axially symmetric component of a corrector field is not large and does not affect first-order focusing within the experimental error. The spherical aberration of a corrector, as well as that of a standard octupole, depends linearly on the voltage applied to it. We shall illustrate this by describing the result of the corrector action on the spherical aberration of a crossed lens. Measurements were made of the spherical aberration of a crossed lens, together with two types of correctors: one with a cross-shaped aperture, and the other with a square one. They involved the whole length of the line image created by the lens. The lens with
FIG. 77. The spherical aberration coefficients as a function of the potential U = V_c - V_1: (a) cross-shaped; (b) square correctors.
rectangular apertures had an aperture side ratio a/b = 1.6 and an interelectrode distance of h/2b = 0.45. The corrector was placed behind the lens at a distance s/2b = 1.75 from its central electrode. The smaller aperture dimension of the corrector was equal to the smaller dimension of the lens aperture. The measurements of the spherical aberration coefficients of such a system are shown in Fig. 77 as a function of the corrector potential V_c (on the abscissa, the potential U = V_c - V_1). The curves correspond to a positive potential U on the corrector with a cross-shaped aperture and to a negative potential U on a corrector with a square aperture, because the octupole components of the two elements are rotated 45° to each other for the electrode position indicated in Fig. 77, when potentials of the same sign are applied. If one of the correctors is rotated 45° relative to the positions shown in the picture, both electrodes will correct at the same polarity of the power supply. One can see from the illustrations that the corrector changes the coefficients C_1x and C_2x in opposite directions, so that if one coefficient is negative and the other positive, the corrector can considerably reduce the spherical aberration of the line image in the lens. The aberration minimum is achieved at smaller potentials if a corrector with a square aperture is used. Such a corrector should be preferred if we recall that it also has a simpler aperture configuration. A similar effect was obtained in the correction of spherical aberration Δx in a stigmatic doublet, when the corrector was placed between the lenses at the same distance from each of them. The coefficients C_1x and C_2x in the conditions under investigation do not vanish simultaneously but vanish at different values of the corrector potential. The minimum image width was obtained for an intermediate value of the corrector potential.
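Because the corrector contribution depends linearly on the applied voltage, each coefficient of the combined system can be modeled as a straight line in U, and its zero crossing follows at once. The sketch below is purely illustrative: the intercepts and slopes are hypothetical numbers in arbitrary units, not values read from Fig. 77, and the function names are ours.

```python
def coefficient_at(u, value_at_zero, slope):
    """Linear model of an aberration coefficient versus corrector potential U."""
    return value_at_zero + slope * u


def zero_crossing(value_at_zero, slope):
    """Corrector potential U at which the linear model vanishes."""
    if slope == 0.0:
        raise ValueError("a coefficient that does not depend on U never vanishes")
    return -value_at_zero / slope


# Hypothetical coefficients (arbitrary units) with opposite signs at U = 0 and
# slopes of opposite sign, mimicking the qualitative behavior described above.
first_coefficient = {"value_at_zero": -2.0, "slope": 0.01}
second_coefficient = {"value_at_zero": 3.0, "slope": -0.02}

u_first = zero_crossing(**first_coefficient)
u_second = zero_crossing(**second_coefficient)
print(f"first coefficient vanishes at U = {u_first:.1f}")
print(f"second coefficient vanishes at U = {u_second:.1f}")
```

In this toy model the two coefficients vanish at different potentials, so a compromise setting between the two crossings would be the natural place to look for the minimum image width, in line with the observation reported above.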
It should be noted that in addition to the correctors we have described, crossed lenses easily combine constructionally with round and two-dimensional lenses formed from planar electrodes arranged in a similar way. This principle can be used to design various electron optical systems with a large variation of their focusing and aberrational properties.
XII. NEW TYPES OF LENSES. ABERRATION CORRECTORS
In Sections VII to XI, we have discussed in detail the properties of the principal electrostatic lenses. In this section we shall deal with lenses that have not, for some reason, found wide application but are, however, of interest and worth special consideration. In recent years there have been some problems requiring high-quality focusing of high-energy beams as well as high current density on the target. A possible approach to such a problem is to use lenses with transverse fields and hollow beams. Electrostatic lenses that can provide focusing of such beams will be described in Section XII.A. A description of radial lenses, whose electrodes consist of segments of a cone or wedge, is given in Section XII.B. Finally, Sections XII.C and XII.D consider some methods for the correction of geometrical aberrations. An account is given of the designs of some electrostatic systems used for this purpose.
A. Coaxial Lenses with Transverse Fields
Focusing of high-energy beams usually requires the use of quadrupole lenses possessing predominantly a transverse field. However, in order to provide stigmatic focusing with slightly differing magnifications in two perpendicular directions, it is necessary to use systems consisting of several quadrupoles. The adjustment and matching of such systems present a certain difficulty. It is quite clear that stigmatic focusing can be obtained much more easily in round lenses, but usually such lenses exhibit a small optical power, because their fields are predominantly longitudinal. A strong-focusing round lens with a transverse field can be designed on the basis of two coaxial electrodes, one inside the other. By applying a potential difference to these electrodes, we produce a transverse axially symmetric field. Because the electrode axis lies outside the lens field, the lens is suitable for focusing hollow beams only. Calculations of the optical properties of coaxial lenses cannot be made directly in the framework of the paraxial optics theory developed earlier. If we expand the trajectory with respect to the lens axis, as we did before, the
restriction to the first expansion term will result in large errors, because the beam travels far from the axis. If the meridian trajectory is taken to be the beam axis, the calculations of the optical properties become considerably more complicated because of its curvature. The theory of systems with a curved axis has been developed in the work of Grinberg (1948) and Vandakurov (1957). A simple example of coaxial lens design is a lens consisting of two coaxial circular cylinders, with the radius of the inner cylinder being small. The potential distribution within the lens away from the edges has the form
$$\varphi(r) = V_1 + (V_2 - V_1)\,\frac{\ln(r/r_1)}{\ln(r_2/r_1)}. \tag{375}$$
Here, r_1, r_2 and V_1, V_2 are the radii and potentials of the inner and outer cylinders, respectively. The trajectory of a charged particle in such a field is expressed in terms of quadratures. Without loss of generality, we can set V_2 = 0, so that for the meridian plane we have (see Kel'man and Yavor, 1968):
$$z - z_0 = r_0 \int_1^{\rho} \frac{\cos\nu_0 \, d\rho}{\sqrt{\sin^2\nu_0 + D \ln\rho}}, \qquad D = \frac{2eV_1}{m v_0^2 \ln(r_2/r_1)}. \tag{376}$$
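The quadrature in Eq. (376) is easy to evaluate numerically. The sketch below is one possible approach under stated assumptions, not the procedure of the cited authors: it uses a composite Simpson rule, treats only the branch on which the radius grows monotonically (ρ = r/r_0 ≥ 1, with ν_0 the initial angle between the velocity and the z axis), and rewrites D in terms of the initial kinetic energy, D = V_1/(E_0 ln(r_2/r_1)) with E_0 expressed in volts per unit charge, which follows from the nonrelativistic relation E_0 = m v_0²/2. All function names and numerical values are hypothetical.

```python
import math


def lens_strength(v1_volts, energy_volts, r1, r2):
    """Parameter D of Eq. (376), with the kinetic energy per unit charge in volts."""
    return v1_volts / (energy_volts * math.log(r2 / r1))


def axial_distance(rho_end, nu0, d_param, r0=1.0, n_steps=2000):
    """z - z0 from Eq. (376) by composite Simpson integration over [1, rho_end]."""
    if rho_end < 1.0:
        raise ValueError("only the outward branch rho >= 1 is handled here")
    if n_steps % 2:
        n_steps += 1  # Simpson's rule needs an even number of intervals

    def integrand(rho):
        radicand = math.sin(nu0) ** 2 + d_param * math.log(rho)
        if radicand <= 0.0:
            raise ValueError("turning point reached; the integrand is singular")
        return math.cos(nu0) / math.sqrt(radicand)

    width = (rho_end - 1.0) / n_steps
    total = integrand(1.0) + integrand(rho_end)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(1.0 + i * width)
    return r0 * total * width / 3.0


# Field-free check: the trajectory is a straight line, z - z0 = r0 (rho - 1) cot(nu0).
straight = axial_distance(1.5, nu0=0.1, d_param=0.0)
# Illustrative field pushing the particle away from the axis (D > 0).
deflected = axial_distance(1.5, nu0=0.1, d_param=0.05)
print(straight, deflected)
```

With D = 0 the routine reproduces the straight-line result, and with D > 0 the particle reaches a given radius over a shorter axial distance, as expected for a field that accelerates it away from the axis.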
Here, the subscript 0 denotes the initial parameter values, ρ = r/r_0 and cos ν_0 = ż_0/v_0. Expression (376) allows us to calculate particle trajectories in the lens in question and to determine its optical properties using trajectory analysis. One can also calculate a single trajectory taken to be axial and expand the others in small parameters (angles and distances) around the axial trajectory. Further, the conditions for the beam focusing with respect to the curved axis are determined by conventional methods. One of the problems is to select lens parameters in such a way that the plane of focusing of the hollow beam round the axial trajectories coincides with the point of intersection of the latter with the lens axis. The optical properties of a lens with a cylindrical rod on the axis have been analyzed in a number of studies (e.g., Krejcik et al., 1979, 1980a, 1980b; Liebl, 1979). A parallel beam entering the lens through a circular slit is considered. Numerical calculations show that the position of the point of intersection of the trajectory with the lens axis strongly depends on the initial distance between the particle and axis. This results in increasing spot size in the vicinity of the point of intersection. The spot can be made smaller by using a doublet of coaxial lenses with opposite polarity on the outer electrodes and a common axial electrode (see Fig. 78). A calculation was made of two doublet variants with different lens sequences. In one variant, the electrode potentials were such that the particles
FIG. 78. A doublet of round lenses with an electrode on the axis.
in the first lens were deflected towards the axis, while in the other lens they were deflected away from the axis due to the opposite field direction. In the other variant, the lenses were arranged in the reverse order. The focusing quality is better in the second variant, in spite of the fact that the trajectories in it travel farther from the axis than in the first one. The numerical results were verified on an experimental device designed for focusing beams of protons with an energy of 400 keV. A system of four coaxial lenses with a common axial electrode of 0.2 mm in diameter was used. The outer electrodes were 16 mm in diameter; their lengths were 50 and 100 mm. The lens potentials varied between 1.0 and 2.7 kV. The intermediate radius of the circular aperture varied from 3.0 to 3.3 mm. The spot diameter was found experimentally to be 1.3 mm for a circular aperture width of 0.6 mm, which agrees well with the calculations. Thus, coaxial lenses are suitable for the focusing of ion beams of several MeV, the lens potentials being a few tens of kV.

B. Radial Lenses
We shall discuss here the focusing properties of electron optical elements in which the potential distribution does not depend on the value of the radius vector in spherical coordinates. Such fields are created, for example, by conical electrodes, which can be cut along the generating lines [Fig. 79(a)]. Another possibility is to use planar electrodes forming a wedge and cut along straight lines emerging from the same point on the wedge edge [Fig. 79(b)]. Radial systems were suggested for use as charged particle spectrometers (wedge-like and cone-like prisms) in the studies of Glikman et al. (1973b, 1977). The possibility of focusing charged particles by means of such systems was considered by Yavor (1984), Baranova and Yavor (1984), and Baranova et al. (1985). The potentials must be applied to the electrodes in such a way that
FIG. 79. Radial lenses: conic (a); wedge-like (b) (the potentials on the central, intermediate, and outer electrodes are indicated).
the field possesses two planes of symmetry, and the lens axis then coincides with their line of intersection. The potential distribution $\varphi(\psi, \nu)$ far from the lens edges is found by separation of variables, because the lens electrodes coincide with the coordinate surfaces of the spherical coordinate system. In a radial conical lens cut into $m$ electrodes, it has the form (Baranova et al., 1986b):
Here, $2\theta$ is the cone angle and $\pm V$ are the potentials on the main electrodes. The coefficients $a_{2(2k+1)}$ depend on the number of electrodes, their angular dimensions, and the potentials on the additional electrodes. The simplest conical lens consists of four electrodes with alternating potentials on them, which can be regarded as a quadrupole lens with a varying aperture. If the value of the cone angle $2\theta$ tends to zero, the potential distribution in Eq. (377) is transformed into the corresponding distribution for a quadrupole with concave electrodes. By increasing the number of electrodes into which the cone has been cut, we can control the field configuration, for example, linearizing it or creating an additional octupole component. Series (377) can be reduced to closed form; in the four-electrode conical lens, we obtain the following expression (Baranova et al., 1987b):

$$\varphi(\nu, \psi) = \frac{2V}{\pi}\arctan\left[\frac{2\tan^2(\theta/2)\,\tan^2(\nu/2)\,\cos 2\psi}{\tan^4(\theta/2) - \tan^4(\nu/2)}\right]$$
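The closed-form potential of the four-electrode conical lens, $\varphi(\nu,\psi) = (2V/\pi)\arctan\{2\tan^2(\theta/2)\tan^2(\nu/2)\cos 2\psi / [\tan^4(\theta/2) - \tan^4(\nu/2)]\}$, can be checked numerically with a short sketch. The function names are illustrative assumptions, and `atan2` is used in place of `arctan` only so that the electrode surface $\nu = \theta$ (zero denominator) is handled without a division. The second function is the familiar potential of a planar quadrupole with concave electrodes of radius $R$, used here to illustrate the small-cone-angle limit mentioned in the text.

```python
import math


def conical_quadrupole_potential(nu, psi, theta, V=1.0):
    """Closed-form potential of the four-electrode conical lens.

    nu    : polar angle measured from the lens axis, in radians
    psi   : azimuthal angle, in radians
    theta : half of the cone angle, so the electrodes lie on nu = theta
    V     : magnitude of the alternating electrode potentials (+V, -V)
    """
    T = math.tan(theta / 2.0) ** 2
    t = math.tan(nu / 2.0) ** 2
    numerator = 2.0 * T * t * math.cos(2.0 * psi)
    denominator = T * T - t * t
    # Inside the cone the denominator is positive, so atan2 equals arctan of
    # the ratio; on the electrode surface it returns the limit +-pi/2.
    return 2.0 * V / math.pi * math.atan2(numerator, denominator)


def planar_quadrupole_potential(r, psi, R, V=1.0):
    """Potential of a two-dimensional quadrupole with concave electrodes."""
    numerator = 2.0 * R**2 * r**2 * math.cos(2.0 * psi)
    return 2.0 * V / math.pi * math.atan2(numerator, R**4 - r**4)
```

On the electrode surface the expression returns $+V$ at $\psi = 0$ and $-V$ at $\psi = \pi/2$, and for a narrow cone it approaches the planar quadrupole value with $r/R$ replaced by $\nu/\theta$.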
The field of a wedge-like radial lens was calculated by Baranova et al. (1985), who also found it in closed form. Unlike the conical lens, the wedge field has no planes of antisymmetry, which gives rise to a nonzero potential on the axis. A wedge-like lens can be compared to an asymmetrical quadrupole lens, which has no planes of field antisymmetry either. The equations for paraxial trajectories in radial lenses can be obtained from Eq. (38) by substituting the corresponding expressions for $\varphi_2(z)$. We shall assume that the potential distribution within the lens is described by the expression for the two-dimensional case and that it drops to zero on the edges. Then we shall have

$$z^2 x'' + \beta^2 x = 0, \qquad z^2 y'' - \beta^2 y = 0, \qquad (378)$$

where $\beta^2$ is the lens excitation. In a four-electrode conical lens

(379)

In a three-electrode wedge-like lens [V , = V0 = 0; see Fig. 79(b)], we have
Here $K_0$ and $K_2$ represent the coefficients in the potential series expansion, equal, respectively, to
2 K O =-arctansinh II
n
K2
=z 0
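The trajectory equations (378) are the Euler equations and have analytical solutions. For $\alpha^2 = \beta^2 - 1/4 > 0$, we obtain

$$x = \left(\frac{z}{z_0}\right)^{1/2}\left[x_0\cos\left(\alpha\ln\frac{z}{z_0}\right) + \frac{z_0 x_0' - x_0/2}{\alpha}\sin\left(\alpha\ln\frac{z}{z_0}\right)\right]. \qquad (382)$$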
The reference point for the z-coordinate is the tip of the cone or wedge. In order to obtain the trajectory projections on the yOz-plane, it is necessary to replace the trigonometric functions by the hyperbolic ones in expression (382). The condition $\alpha^2 \le 0$ corresponds to a weak lens, but we shall not deal with this case here.
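The Euler-type equation $z^2 x'' + \beta^2 x = 0$ with initial values $x_0$, $x_0'$ at $z_0$ has, for $\alpha^2 = \beta^2 - 1/4 > 0$, the solution $x = (z/z_0)^{1/2}[x_0\cos(\alpha\ln(z/z_0)) + \alpha^{-1}(z_0 x_0' - x_0/2)\sin(\alpha\ln(z/z_0))]$. A minimal sketch comparing this closed form with a direct Runge-Kutta integration is given below; the function names and the choice of a fixed-step integrator are assumptions of this example, not part of the original analysis.

```python
import math


def paraxial_x(z, z0, x0, dx0, beta2):
    """Closed-form solution of z^2 x'' + beta2 * x = 0 for beta2 > 1/4."""
    alpha = math.sqrt(beta2 - 0.25)
    s = math.log(z / z0)
    c = (z0 * dx0 - 0.5 * x0) / alpha      # fixed by the initial slope dx0
    return math.sqrt(z / z0) * (x0 * math.cos(alpha * s) + c * math.sin(alpha * s))


def integrate_rk4(z0, z_end, x0, dx0, beta2, steps=20000):
    """Integrate x'' = -beta2 * x / z^2 with the classical RK4 scheme."""
    def acc(zz, xx):
        return -beta2 * xx / (zz * zz)

    h = (z_end - z0) / steps
    z, x, v = z0, x0, dx0
    for _ in range(steps):
        k1x, k1v = v, acc(z, x)
        k2x, k2v = v + 0.5 * h * k1v, acc(z + 0.5 * h, x + 0.5 * h * k1x)
        k3x, k3v = v + 0.5 * h * k2v, acc(z + 0.5 * h, x + 0.5 * h * k2x)
        k4x, k4v = v + h * k3v, acc(z + h, x + h * k3x)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        z += h
    return x
```

Because the coordinate $z$ is measured from the tip of the cone or wedge, both functions assume $z_0 > 0$ and $z > 0$.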
The expression for the focal lengths of a radial lens in the xOz-plane, in which the lens is convergent, is
The length of the region in which the field is considered to be nonzero is called the effective length L of the lens, and somewhat exceeds the length of the electrodes. The values of all coordinates at the lens entrance are denoted by the subscript 0. A specific feature of radial lenses, as well as of other asymmetric lenses, is the shift of their optical center relative to the geometrical center. By the optical center we understand a point to which both principal lens planes move as the optical power of the lens tends to zero. The optical center is shifted relative to the geometrical center towards the smaller aperture, and the larger the electrode divergence angle, the greater the shift (Baranova et al., 1986c). The aberrations in conical radial lenses are described by the general expressions for aberration integrals obtained from calculations of electron optical systems with two planes of symmetry and two planes of antisymmetry. Therefore, general conclusions from these expressions also hold for them. The fundamental conclusion concerns the impossibility of cancelling third-order spherical aberration. In calculations of geometrical aberrations in wedge-like lenses, account should be taken of the fact that due to the absence of planes of field antisymmetry, supplementary fourth-power terms (octupole) appear in the expansion. These terms are determined by the system geometry and contribute to the third-order aberrations. Because of the large choice of the geometrical and electrical parameters of wedge-like lenses, the additional octupole component may vary in sign and, over a wide range, in value, permitting us to raise the question of correcting geometrical aberrations. Radial systems can be used to form achromatic lenses if electric and magnetic fields are applied simultaneously (Baranova and Yavor, 1984; Baranova et al., 1986d).
Equations (378) preserve their form if the lines of force of the electric and magnetic fields in the paraxial region are mutually perpendicular, which happens only when the planes of symmetry of one field coincide with the planes of antisymmetry of the other. In this case a particle traveling along the z-axis is affected by parallel forces from the electric and magnetic fields; the polarity of the electrodes and poles must be chosen in such a way that the forces act in opposite directions. Like quadrupoles, conical achromatic systems have the poles of the magnetic lens rotated 45° relative to the electrodes of the electrostatic lens. In wedge-like achromatic systems, the poles in the magnetic lens are shifted with respect to the electrostatic lens electrodes in the direction normal to the z-axis.
The condition of achromaticity can be obtained by differentiation of the expression for the excitation $\beta^2$ of a compound lens with respect to energy and setting the derivative equal to zero, as we did for a quadrupole lens in Section VIII [see Eq. (309)]. Radial systems allow deflecting and focusing fields to be easily combined in one unit. One of their possible applications is the focusing of deflected beams, because in such systems the varying aperture can be matched with the trajectory path.

C. Correction of Geometrical Aberrations by Means of Octupoles
Correction of the geometrical aberrations of focusing systems would allow us to improve the parameters of a large number of electron optical devices. Scherzer (see Section IV.B) has shown that a possible way of achieving such correction is the application of elements not possessing axial symmetry. It can be seen from the expressions for the third-order geometrical aberrations that their value can be changed by changing the potential expansion term, which depends on the fourth powers of transverse coordinates, that is, the value $\varphi_4(z)$. The electron optical element whose first term in the potential expansion depends on the fourth coordinate powers is the octupole. It is formed by eight symmetrically arranged electrodes, to which alternating potentials $\pm U$ are applied (Fig. 80). The potential distribution in the octupole can be found from
FIG. 80. An electrostatic octupole.
Eq. (33), setting $\varphi(z) = \text{const}$ and $\varphi_2(z) = 0$:

(384)
Here, $t(z)$ characterizes the dependence of the potential distribution on the $z$-coordinate; the value of $t(z)$ is normalized to unity. The coefficient $K_4$ is related to the electrode profile; it is equal to unity if the pole profile is described by the function $f(x, y) = (x^4 - 6x^2y^2 + y^4)/R^4$. The octupole field is predominantly transverse and has four planes of symmetry and four planes of antisymmetry. We shall recall that the question of aberration correction has been partly discussed in Section XI.D with reference to specially designed octupoles (on the basis of planar electrodes with apertures), which combined well with crossed lens systems. It can be seen from the trajectory equation (85) that the octupole field does not affect the first-order focusing properties, but contributes only to aberrations; this contribution can be varied by altering the potential $U$ on the electrodes. It can be shown that the combination of octupoles with a round lens does not allow its geometrical aberrations to be completely corrected. This is due to the fact that the field of a round lens does not depend on the azimuthal angle, while the octupole field is periodic in angle with a period of $\pi/2$ and changes sign within this period. Therefore, while reducing the aberration in two perpendicular directions, the octupole increases it in directions rotated 45°. When correcting the spherical aberration, the octupole transforms the aberration disc formed by the round lens into a rosette. It was shown by Scherzer and Typke (1967/68) that in this case, however, there is some total gain in the resolving power due to the current density redistribution in the Gaussian image plane. To provide complete correction of the third-order spherical aberration, astigmatic elements have to be introduced into the system.
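The angular behaviour described above follows from the identity $x^4 - 6x^2y^2 + y^4 = r^4\cos 4\psi$ for the octupole profile function. The sketch below checks this identity, the sign change at 45°, and the vanishing transverse Laplacian by finite differences. The function names and the step size of the difference stencil are assumptions chosen for this illustration.

```python
import math


def octupole_profile(x, y, R=1.0):
    """f(x, y) = (x^4 - 6 x^2 y^2 + y^4) / R^4, the profile function in the text."""
    return (x**4 - 6.0 * x**2 * y**2 + y**4) / R**4


def octupole_polar(r, angle, R=1.0):
    """The same function written in polar form, (r / R)^4 * cos(4 * angle)."""
    return (r / R) ** 4 * math.cos(4.0 * angle)


def transverse_laplacian(f, x, y, h=1e-3):
    """Five-point finite-difference estimate of f_xx + f_yy at (x, y)."""
    return (f(x + h, y) + f(x - h, y) + f(x, y + h) + f(x, y - h)
            - 4.0 * f(x, y)) / h**2
```

The polar form makes the period of $\pi/2$ explicit: the function is positive along the coordinate axes and negative along the diagonals, which is why a reduction of the aberration in two perpendicular directions is accompanied by an increase in the directions rotated by 45°.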
Such elements deform the beam, making it astigmatic, and the octupoles are arranged in such a way that the directions in which they increase the aberration coincide with those in which the beam cross section is small. One should bear in mind that the number of aberration coefficients then increases, and the number of octupoles must be equal to it. In aberration correction design, the octupoles may be combined with lenses or used separately. A design is much simpler if the octupole is combined with a quadrupole lens. For this, an octupole like the one shown in Fig. 80, or similar to it but having different electrode profiles, receives additional potentials, which create the quadrupole field component. For instance, the potential $+V$ is applied to the two side electrodes, while $-V$ is applied to the upper and lower electrodes. The electric supply in such a system can
independently control the octupole and quadrupole field components. The trajectory equations in a separate octupole written up to third-order terms have the form [see Eq. (85)]

$$x'' = \gamma t(z)\,x(x^2 - 3y^2), \qquad y'' = \gamma t(z)\,y(y^2 - 3x^2). \qquad (385)$$

Here, $\gamma$ represents the octupole excitation related to its parameters as follows:
The contribution of the octupole to the third-order geometrical aberrations can be found from expressions (385). For simplicity, we shall restrict ourselves to consideration of the spherical aberration only. Correction of geometrical aberrations in electron optical systems containing octupoles has been studied in great detail by Baranova and Ovsyannikova (1971) and Baranova et al. (1971). The spherical aberration in systems with two planes of symmetry but containing no apertures is given by expression (297). The spherical aberration coefficients of a separate octupole have the form
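$$C_{1x} = -\gamma\int_{-\infty}^{\infty} t(z)\,x_\alpha^4\,dz, \qquad C_{1y} = -\gamma\int_{-\infty}^{\infty} t(z)\,y_\beta^4\,dz, \qquad C_{2x} = C_{2y} = 3\gamma\int_{-\infty}^{\infty} t(z)\,x_\alpha^2 y_\beta^2\,dz, \qquad (387)$$

where $x_\alpha$ and $y_\beta$ are independent solutions of the differential equations (385) without the right-hand sides. They depend linearly on $z$ and satisfy the following initial conditions at the entrance to the octupole:

$$x_\alpha(z_1) = a_x, \qquad y_\beta(z_1) = a_y, \qquad x_\alpha'(z_1) = y_\beta'(z_1) = 1. \qquad (388)$$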
It has been assumed that the beam at the octupole entrance is astigmatic; $a_x$ and $a_y$ represent the distances from the object to the entrance in the xOz- and yOz-planes, respectively. The total spherical aberration of the system can be determined from formulae (97). If the octupole is superimposed on the lens, the trajectory equations taking into account the third-order terms have the form of Eq. (85). The image distortion due to the presence of spherical aberration is described by the same expression [see Eq. (297)], in which the coefficients are determined by mere summation of the lens and octupole coefficients. The latter can be found using formulae (387), where $x_\alpha$ and $y_\beta$ represent independent solutions of the paraxial trajectory equation for the lens. In the thin-lens approximation, the spherical aberration coefficients are
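identical for both the separate octupole and the one superimposed on the lens:

$$C_{1x} = -\gamma\ell a_x^4, \qquad C_{1y} = -\gamma\ell a_y^4, \qquad C_{2x} = C_{2y} = 3\gamma\ell a_x^2 a_y^2, \qquad (389)$$

where $\ell$ is the effective length of the octupole:

$$\ell = \int_{-\infty}^{\infty} t(z)\,dz. \qquad (390)$$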
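From the rectangular approximation for the function $t(z)$, we can obtain the following expressions for the coefficients of the separate octupole:

$$C_{1x} = -\frac{\gamma}{5}\left[(\ell + a_x)^5 - a_x^5\right], \qquad C_{1y} = -\frac{\gamma}{5}\left[(\ell + a_y)^5 - a_y^5\right],$$

$$C_{2x} = C_{2y} = 3\gamma\left[\frac{1}{5}\ell^5 + \frac{1}{2}\ell^4(a_x + a_y) + \frac{1}{3}\ell^3(a_x^2 + 4a_x a_y + a_y^2) + \ell^2 a_x a_y(a_x + a_y) + \ell a_x^2 a_y^2\right]. \qquad (391)$$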
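The rectangular-model coefficients can be cross-checked against the defining integrals (387). In the sketch below, the field function is taken as $t(z) = 1$ over a length $\ell$ and zero elsewhere, and the linear solutions are written as $a_x + s$ and $a_y + s$, with $s$ measured from the octupole entrance, in accordance with conditions (388). The function names and the use of Simpson's rule are assumptions of this example.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def c1_rectangular(gamma, ell, a):
    """C_1x (or C_1y) of a separate octupole in the rectangular model."""
    return -gamma / 5.0 * ((ell + a) ** 5 - a ** 5)


def c2_rectangular(gamma, ell, ax, ay):
    """C_2x = C_2y of a separate octupole in the rectangular model."""
    return 3.0 * gamma * (ell**5 / 5.0
                          + ell**4 * (ax + ay) / 2.0
                          + ell**3 * (ax**2 + 4.0 * ax * ay + ay**2) / 3.0
                          + ell**2 * ax * ay * (ax + ay)
                          + ell * ax**2 * ay**2)


def c1_numeric(gamma, ell, a):
    """Direct quadrature of -gamma * integral of t(z) * x^4 with t = 1 on [0, ell]."""
    return -gamma * simpson(lambda s: (a + s) ** 4, 0.0, ell)


def c2_numeric(gamma, ell, ax, ay):
    """Direct quadrature of 3 * gamma * integral of t(z) * x^2 * y^2."""
    return 3.0 * gamma * simpson(lambda s: (ax + s) ** 2 * (ay + s) ** 2, 0.0, ell)
```

As $\ell$ becomes small compared with $a_x$ and $a_y$, the rectangular-model values tend to the thin-lens expressions $-\gamma\ell a_x^4$ and $3\gamma\ell a_x^2 a_y^2$.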
When, however, the octupole is superimposed on a quadrupole lens, and the field distributions in them are approximated by the rectangular model, the aberration coefficients of the octupole can be calculated using formulae (387) and taking into account Eq. (271). In order to find the conditions for correction of spherical aberration in a system, it is necessary to write down the expressions for its coefficients derived from formulae (97) and to equate them to zero. The set of equations thus obtained is used to derive the required octupole excitations. We shall illustrate this by giving the condition for correction of the coefficient C,, in a system consisting of a quadrupole lens and an octupole that follows it at a distance 1. In the thin-lens approximation involving Eqs. (300) and (389), this condition will take the form
Comprehensive information about aberration correction is contained in the work of Hawkes (1977) and Yavor et al. (1969). Later results concerning the design of systems including correctors have been given by Pohner (1977), Bernhard (1980), and Hely (1982). The octupole component can be introduced into the field by changing the lens design, which in fact amounts to superposition of an octupole lens, rather than by introducing additional elements. This can be illustrated by the crossed lens described in Section XI. Here we shall dwell on the designs based on quadrupole lenses. An octupole component arises in a quadrupole if
FIG. 81. Quadrupole-octupole lenses: (a) five-electrode; (b) asymmetrical with planar electrodes; (c) three-electrode lenses.
it has no plane of field symmetry or of field antisymmetry. This situation can be achieved either by disturbing the lens geometrical symmetry or (in some cases) by changing the power supply. The potential distribution in such a lens is described by the series in Eq. (33), in which $\varphi_4(z) \neq 0$. In this case a potential $\varphi(z)$ usually arises on the axis. The fourth-order term with respect to the x- and y-coordinates coincides with the first term in the potential expansion of the octupole and can be employed for correction of third-order geometrical aberrations. The value and sign of $\varphi_4(z)$ can be varied by varying the electrode profiles. One of the simplest designs of a combined quadrupole-octupole lens permitting electrical control of the octupole field component is the five-electrode lens (Baranova et al., 1968), whose possible modification is given in Fig. 81(a). The lens consists of four inner electrodes and one outer electrode embracing the other four. The quadrupole component is determined by the potential $V$. The octupole component is formed due to the bulging of the field between the inner electrodes, and its value varies with the potential $\pm U$. The octupole that is formed in this case differs from the one shown in Fig. 80 in that it has no planes of field antisymmetry. In the general case, the axial potential of a five-electrode lens is nonzero, which gives rise to an axially symmetric field component. The relationships between the components in the lens potential expansion characterized by the functions $\varphi_i(z)$ [see Eq. (33)] can be varied by varying the geometrical parameters of the lens: the angular dimensions of the interelectrode gap $2\delta$ and the ratio of the radii $R_1/R_2$. The potential that arises on the axis can be compensated by applying a potential that is equal in magnitude but opposite in sign to all the lens electrodes.
This lens compares favorably with an eight-electrode quadrupole-octupole lens because it is simpler in design but retains independent electrical control of the quadrupole and octupole components. Partial correction of third-order aberrations is also possible in quadrupole
lenses, which have only two planes of geometrical symmetry instead of four. This can be achieved, for example, by moving apart one pair of electrodes or by changing the electrode size ratios. Figure 81(b) illustrates an asymmetric lens with planar electrodes. A similar design can be made on the basis of a lens with concave electrodes by increasing the angular dimensions of one pair of electrodes at the expense of the other pair. A three-electrode quadrupole lens formed from the embracing outer electrode and two inner electrodes is also possible [Fig. 81(c)]. The advantage of asymmetrical lenses is the design simplicity and the absence of additional potentials for creating the octupole component. One disadvantage is that no electrical adjustment is possible. A system of two sextupoles has also been suggested as a corrector of third-order spherical aberration (Crewe and Kopf, 1980). One example of a lens with a transverse field possessing two planes of symmetry and having no planes of antisymmetry is the biplanar lens (Afanas’ev, 1982; Afanas’ev and Sadykin, 1982). Its electrodes lie in two parallel planes, and the potentials are applied to them symmetrically relative to the midplanes. Such a lens is similar in design to a two-dimensional einzel lens [Fig. %(a)]; the difference is only in the direction of particle movement: the axis of a biplanar lens is parallel to the x-axis, while that of a two-dimensional lens coincides with the z-axis. The methods for calculation of the field distribution in these lenses are the same, but the power expansion in small parameters is made around different axes. The basic field component of a biplanar lens in the vicinity of the axis is the quadrupole. The field also contains a weak axially symmetric component associated with the edge effects, as well as all even harmonics, since the planes of field antisymmetry are absent.
Calculations of the lens paraxial properties made in the rectangular model approximation for the quadrupole component and assuming linear variation for the axial potential at the lens edges are consistent with the experimental results obtained by the shadow projection method. The advantages of this lens are design simplicity and the possibility of correcting the geometrical aberrations.

D. Lenses with Partial Aberration Correction
Here we will discuss some unusual electrostatic lens designs, in which third-order aberrations can be corrected. The correction is possible due to the octupole component arising in the lens. The work of Okayama and Kawakatsu (1982) describes a new electron optical element consisting of an electrostatic quadrupole lens and a round aperture placed coaxially at some distance from it. When a nonzero potential is applied to the aperture diaphragm, an
octupole component arises in the region where the quadrupole and aperture fields overlap. The potential distribution of such an element was found by numerical solution of the three-dimensional Laplace equation using the relaxation method. It was shown that the effective octupole lies near the edge of the quadrupole lens. The paraxial properties and third-order aberration coefficients were calculated as a function of the excitation of the quadrupole and the potential on the aperture diaphragm. It was found that in such an element the spherical aberration in the converging plane could be completely corrected. This conclusion was confirmed experimentally using the shadow projection method. An advantage of this design is the automatic adjustment of the effective octupole with respect to the quadrupole lens, which permits the image errors due to imperfect adjustment to be eliminated. A way to correct the aberrations of lenses whose electrodes are cylindrical or resemble elongated boxes with a rectangular cross section was described in the monograph by Klemperer and Barnett (1971). For this purpose, the gaps between the electrodes were curved (lipped lenses). Such lenses were studied in detail by Glikman and Sekunova (1981) and Glikman and Iskakova (1982). Depending on the degree of symmetry, the lens field may include, besides an axially symmetric component, a quadrupole and/or an octupole component. The first work describes a box-like lens with a square cross section that has the gaps on each face cut along a circle. Its field possesses four planes of symmetry and represents a superposition of axially symmetric and octupole components. The lens forms a correct electron optical image, but its spherical aberration is different from that of a round lens, having two coefficients instead of one. It has been pointed out earlier that the spherical aberration of an axially symmetric system cannot be compensated by octupoles unless astigmatic elements are introduced.
However, partial reduction of the aberration is possible. Astigmatic einzel tube lenses formed by three electrodes are considered by Glikman and Iskakova (1982). The lines separating the electrodes represent the lines formed from the intersection of two cylinders of identical radius (Fig. 82). The lens is astigmatic because of the presence of only two planes of symmetry. The authors found the parameters that describe the paraxial properties of such lenses and the coefficients of their spherical aberrations. It is shown that the coefficients are of opposite signs and that one of them passes through zero. Tube lenses with interelectrode gaps having a rectangular tooth-like profile have been described by Glikman et al. (1984) and Glikman and Iskakova (1985). These lenses can be recommended when high-quality focusing in one direction is necessary. Tube lenses can be used as stigmators.
FIG. 82. An astigmatic einzel tube lens.
XIII. CONCLUSION

Electron optics, which at its initial stage of development served mainly as a basis for electron microscopy and mass spectrometry, has expanded its range of application considerably. During the last decade, interest in electrostatic electron optics has increased greatly. This interest has been largely stimulated by new applications of electrostatic optical devices and instruments, in particular, to solve important technological problems in the field of solid state electronics. To cope with these problems, it is necessary to increase the lens gathering power, to optimize charged particle convergence, to create the desired current density distribution in the beam cross section, etc. Investigations, primarily of theoretical character, have been concerned with both conventional systems and the design of new ones. At present, there is a great variety of lenses and lens systems. The type of electron optical element and its basic properties are determined by such parameters as field symmetry, independence of the potential distribution from any of the coordinates, as well as the mutual positions of the field symmetry axis and the axis of the focused beam. The effect of the latter factor can be illustrated with reference to various classes of electron optical elements created on the basis of an axially symmetric field. When the beam axis coincides with the axis of field rotational symmetry, we have a standard round lens. When, however, the axes are normal to each other, we obtain a transaxial lens. The fundamental difference in their electron optical properties is quite evident. The material presented here shows that the requirements that arise in each particular case cannot be satisfactorily met by any one class of lenses. Each class of lenses has a predominant sphere of application. The choice is primarily made on the basis of lens paraxial properties.
(The choice between electrostatic and magnetic systems is made on the basis of their properties discussed in the introduction.) When a sharp, undistorted image is desired,
round lenses are undoubtedly preferable. If, however, a perfect image is unnecessary, astigmatic lenses should be preferred in many cases. Within each class of lenses, various designs are possible, which are determined, on the one hand, by the permissible dimensions and production technology and, on the other hand, by the requirements on the optical characteristics. Of great importance is lens optimization in order to obtain the desired values of the parameters of the instruments to be designed. Extensive investigations of various lens types in a wide range of geometrical and electrical parameters have provided a basis for a possible choice of suitable modifications. This book has given a few illustrations to show how optical characteristics can be substantially improved by selection of lens designs, in particular, by introducing asymmetry, by increasing the number of electrodes, and by using a complex power supply. Another straightforward approach to optimization problems is the direct search for a potential distribution satisfying the requirements imposed on the system, in particular, the requirement of minimum aberrations. To minimize the spherical and chromatic aberrations in a system of quadrupole-octupole lenses, this approach has involved application of variational methods. However, optimization problems cannot be considered to have been solved, and the potentialities of both trends in theoretical research are still great. Of importance also is the improvement of production technology (machining and adjustment of lenses), which may affect the “mechanical aberrations.” This question has many aspects and presents certain difficulties in calculations. In addition to comparatively new types of lenses, transaxial and crossed, which have found application, other promising electron optical systems are emerging. These are, for example, coaxial cylindrical lenses with a hollow beam, which provide high optical power.
Such lenses, as well as some other new lens types, have been described in Section XII. New theoretical and computational methods are currently being developed to calculate the electrostatic fields. Much work is being done to standardize the program packages for computer analysis of fields and properties of electron optical systems. Progress in this area is directly related to further advancement in the study of electron optical systems, in particular, lenses. We may expect further improvement of electron optical characteristics and more extensive applications of electrostatic lenses.
REFERENCES

Adams, A., and Read, F. H. (1972a). J. Phys. E: Sci. Instr. 5, 150.
Adams, A., and Read, F. H. (1972b). J. Phys. E: Sci. Instr. 5, 156.
Afanas’ev, V. P. (1982). Zh. Tekh. Fiz. 52, 945; Sov. Phys. Tech. Phys. 27, 604.
Afanas’ev, V. P., and Sadykin, A. D. (1982). Zh. Tekh. Fiz. 52, 1213, 1226; Sov. Phys. Tech. Phys. 27, 735, 737.
Afanas’ev, V. P., and Yavor, S. Ya. (1973). Zh. Tekh. Fiz. 43, 1371; Sov. Phys. Tech. Phys. 18, 872.
Afanas’ev, V. P., and Yavor, S. Ya. (1977). Zh. Tekh. Fiz. 47, 908; Sov. Phys. Tech. Phys. 22, 544.
Afanas’ev, V. P., Glukhoi, Yu. O., and Yavor, S. Ya. (1975). Zh. Tekh. Fiz. 45, 1526, 1973; Sov. Phys. Tech. Phys. 20, 969, 1240.
Afanas’ev, V. P., Baranova, L. A., Ovsyannikova, L. P., and Yavor, S. Ya. (1979). Zh. Tekh. Fiz. 49, 733; Sov. Phys. Tech. Phys. 24, 425.
Afanas’ev, V. P., Baranova, L. A., Petrov, I. A., and Yavor, S. Ya. (1980). Optik 56, 261.
Afanas’ev, V. P., Baranova, L. A., Ovsyannikova, L. P., and Yavor, S. Ya. (1982). Tenth Proc. Int. Congr. Electron Microscopy, Hamburg.
Alexandrov, M. L., Gall’, L. N., Lebedev, G. V., and Pavlenko, V. A. (1977). Zh. Tekh. Fiz. 47, 241; Sov. Phys. Tech. Phys. 22, 139.
Aničin, B., Terzić, I., Vukanić, J., and Babović, V. (1976). J. Phys. E: Sci. Instr. 9, 837.
Artamonov, O. M., Bolotov, B. B., and Smirnov, O. M. (1976). Prib. Tekh. Exp. N3, 207.
Augustyniak, W. M., Betteridge, D., and Brown, W. L. (1978). Nucl. Instr. Methods 149, 669.
Balandin, G. D., Gaydukova, I. S., Ignat’ev, A. N., and Der-Shvarts, G. V. (1977). Elektronnaya Tekhnika Ser. 4, N1, 29.
Ballu, Y. (1980). In “Applied Charged Particle Optics,” Part B. Adv. in Electronics and Electron Physics (A. Septier, ed.), Suppl. 13B, p. 257. Academic Press, New York.
Banford, A. P. (1966). “The Transport of Charged Particle Beams.” E. & F. N. Spon Ltd., London.
Baranova, L. A., and Ovsyannikova, L. P. (1971). Zh. Tekh. Fiz. 41, 2182; Sov. Phys. Tech. Phys. 16, 1730.
Baranova, L. A., and Yavor, S. Ya. (1984). Zh. Tekh. Fiz. 54, 1999; Sov. Phys. Tech. Phys. 29, 1173.
Baranova, L. A., Fishkova, T. Ya., and Yavor, S. Ya. (1968). Radiotekh. Electron. 13, 2108; Radio Eng. Electron. Phys. 13.
Baranova, L. A., Ovsyannikova, L. P., and Yavor, S. Ya. (1971). Zh. Tekh. Fiz. 41, 1323; Sov. Phys. Tech. Phys. 16, 1040.
Baranova, L. A., Petrov, I. A., and Yavor, S. Ya. (1978). Zh. Tekh. Fiz. 48, 2588; Sov. Phys. Tech. Phys. 23, 1481.
Baranova, L. A., Sadykin, A. D., Muchin, V. M., and Yavor, S. Ya. (1982). Zh. Tekh. Fiz. 52, 246; Sov. Phys. Tech. Phys. 27, 161.
Baranova, L. A., Narylkov, S. G., and Yavor, S. Ya. (1985). Zh. Tekh. Fiz. 55, 2209; Sov. Phys. Tech. Phys. 30, 1303.
Baranova, L. A., Sadykin, A. D., and Yavor, S. Ya. (1986a). Radiotekh. Electron. 31, 365; Radio Eng. Electron. Phys. 31.
Baranova, L. A., Narylkov, S. G., and Yavor, S. Ya. (1986b). Radiotekh. Electron. 31, 778; Radio Eng. Electron. Phys. 31 (8), 169.
Baranova, L. A., Narylkov, S. G., and Yavor, S. Ya. (1986c). Zh. Tekh. Fiz. 56, 2075; Sov. Phys. Tech. Phys. 31, 1246.
Baranova, L. A., Narylkov, S. G., and Yavor, S. Ya. (1986d). Zh. Tekh. Fiz. 56, 2279; Sov. Phys. Tech. Phys. 31, 1366.
Baranova, L. A., Bublyaev, R. A., and Yavor, S. Ya. (1987a). Zh. Tekh. Fiz. 57, 430; Sov. Phys. Tech. Phys. 32, 261.
Baranova, L. A., Narylkov, S. G., and Yavor, S. Ya. (1987b). Zh. Tekh. Fiz. 57, 156; Sov. Phys. Tech. Phys. 32, 91.
Berger, C., and Baril, M. (1982). J. Appl. Phys. 53, 3950.
Bernhard, W. (1980). Optik 57, 73.
Binns, K. J., and Lawrenson, P. J. (1963). “Analysis and Computation of Electric and Magnetic Field Problems.” Pergamon Press, Oxford.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 76
Electron Microscopy of Fast Processes
O. BOSTANJOGLO
Optisches Institut der Technischen Universität Berlin
Berlin, Federal Republic of Germany
I. Introduction
II. Interactions of Electrons with Matter
   A. Single Scattering of Primary Electrons
   B. Secondary and Backscattered Electrons
   C. Auger Electron and Characteristic X-Ray Emission
   D. Optical Photons
   E. Electron-Beam-Induced Conductivity (EBIC)
III. Modes of Electron Microscopy
   A. Conventional Electron Microscopy
   B. Scanning Electron Microscopy
IV. Time-Resolved Electron Microscopy
   A. Generation of Pulsed Electron Beams
   B. Image Detector for Short-Exposure Electron Microscopy
   C. Periodic Processes and Stroboscopic Electron Microscopy
   D. Fast Nonrepetitive Processes and Real-Time Electron Microscopy
V. Application of Real-Time Electron Microscopy to Fast Laser-Induced Processes
   A. Time Scale of Fast Laser-Induced Processes
   B. Explosive Crystallization of Amorphous Films
VI. Space-Time Resolution of Real-Time Microscopy
VII. Summary
References
Copyright © 1989 by Academic Press, Inc. All rights of reproduction in any form reserved. ISBN 0-12-014676-2

I. INTRODUCTION

Fast processes proceeding on the microsecond to picosecond time scale are not only of fundamental scientific interest but also of great technological importance. Autocatalytic and laser-driven phase transitions, magnetic reversal in ferromagnetic materials, and switching of integrated electronic circuits are the most conspicuous examples of fast processes on this time scale. A large number of very different time-resolving techniques have been introduced to analyze their dynamics.

Real-time measurements of reflectivity, transmission, and scattering of light have been extensively used by many authors (Auston et al., 1978; Baeri et al., 1985; Bergner et al., 1987; Von der Linde and Fabricius, 1982; Lo and Compaan, 1981; Lompre et al., 1984; Shank et al., 1983) to trace laser-induced semiconductor-metal transitions in the context of laser annealing. In addition to the annealing beam, a second laser beam of a different wavelength, chosen to discriminate against the scattered annealing radiation, was used as a probe. Its intensity changes caused by phase transitions were sensed by fast photodiodes.

A time-resolving electron diffraction camera, based on a modified streak camera, has been reported by Mourou and Williamson (1982) and Williamson et al. (1984), and laser-induced melting of thin metal films was studied by transmission diffraction in the subnanosecond time regime. The short illuminating electron beam pulses were produced by driving the photocathode with picosecond laser pulses. Pulsed reflection electron diffraction was employed by Khaibullin (1984), and crystal structure and lattice temperature of bulk silicon during pulsed laser annealing were investigated.

Facilities for X-ray diffraction and shadow microscopy at nanosecond exposure times have been realized with voltage-pulsed field emitters (Jamet and Thomer, 1976), laser-produced plasma sources (Rosser et al., 1985), and synchrotron radiation (Larson et al., 1983). Applications concentrated on transformations of crystal structure and changes of temperature during pulse stressing of the specimen by shock waves and laser flashes.

Other fast but less direct methods based on photoelectron emission and electric conductivity were introduced by Eberhardt et al. (1982) and Galvin et al. (1982), respectively. The latter method is particularly sensitive to the transitions insulator → semiconductor → metal, which occur, e.g., when certain semiconductors (germanium, silicon) melt, but melting and solidification of metals can also be traced (Tsao et al., 1986).

All these different techniques have specific advantages and drawbacks.
The light optical methods are the fastest, reaching down into the femtosecond regime. They measure almost exclusively reflectivity and absorption, giving refraction and extinction indices, from which, e.g., density and relaxation time of free charge carriers are deduced with an adequate theory of the electronic structure. Surface microscopy can be readily performed. Electron and X-ray diffraction are slower, and instrumentation is more involved than for light optical methods of similar time resolution. For instance, high vacuum is a must. The higher complexity, especially in the case of synchrotron radiation, may be outweighed by the direct access to the atomic packing without the uncertainties of an interposed theory. A common drawback of the above methods is the limited spatial resolution on the order of 1 μm and the inability to measure electric and magnetic specimen fields.

An alternative is imaging electron probes. Potentially, they reach a spatial resolution on the subnanometer scale, they have direct access to crystal structure and morphology, and they sense electric and magnetic specimen fields. A further outstanding feature is the large number of interactions of electrons with matter, which supply a vast number of imaging modes yielding different specific information.
II. INTERACTIONS OF ELECTRONS WITH MATTER
Some possible interactions of a probing electron with an atom are shown in Fig. 1. Interactions that depend only to a minor extent on the specific nature of the atom are omitted. Such effects, which are less useful for analysis, are bremsstrahlung of the decelerating primary electron and transition radiation. The latter is emitted from metals when the dipole formed by the impinging electron and its mirror charge collapses. Based on the interactions shown in Fig. 1, very different types of signals can be collected to give an “image” of the specimen. The different interaction processes are briefly discussed in the following. A fuller treatment is given by Reimer (1985).

A. Single Scattering of Primary Electrons
FIG. 1. Some interactions of a fast probing electron with an atom. $E_i$ are ionization energies of the atomic levels. (Labels in the figure: probing electron; Auger electrons; secondary and backscattered electrons; low-angle scattering.)

According to quantum mechanics, the scattering amplitude for a single scattering process of an electron by an atom is proportional to

$$\langle \psi_f \,|\, H \,|\, \psi_i \rangle,$$

where $|\psi_i\rangle$ and $|\psi_f\rangle$ denote the initial and final eigenfunctions of the electronic states of the atom and $H$ is the interaction Hamiltonian. Depending on whether the two atomic states are equal or different, scattering is elastic or inelastic.

In the case of elastic scattering, any statistical phase factors of the eigenfunctions cancel and the scattering amplitudes of different atoms are coherent. The specimen acts as a three-dimensional diffraction grating, and the waves scattered by different atoms interfere. The total scattering amplitude contains information on the spatial atomic distribution. Elastically scattered electrons are the probe of choice if crystal structure and orientation, grain boundaries, and lattice defects are to be investigated. Bragg diffraction is then the main cause of image contrast. As fields due to electric and magnetic polarization of the specimen shift the phase of the probing electron waves, their distribution can also be imaged.

If the primary electron is inelastically scattered by the atom, a bound electron is excited into upper levels or into the continuum. In addition to such intraband or interband transitions and ionization processes, scattering can also excite quasiparticles, such as plasmons in the conduction electron gas of metals and excitons in semiconductors and insulators. In summary, the primary electron suffers an energy loss that is characteristic of the electronic structure of the specimen. Inelastically scattered electrons may be exploited for imaging combined with chemical analysis, as was demonstrated by several authors (Colliex et al., 1975; Isaacson and Johnson, 1975; Zanchi et al., 1978).

B. Secondary and Backscattered Electrons
Electrons ejected by a specimen that is bombarded by primary electrons with energy in the keV regime have a broad energy distribution, which can be divided into three main parts. The high-energy part, extending from the energy of the primary electrons down to about 1 keV, is mainly due to backscattered electrons. A central part in the range 50 eV to 1 keV is characterized by peaks due to Auger electrons. Finally, there is a low-energy part with a pronounced maximum at several electron volts. Somewhat arbitrarily, these electrons with exit energies below 50 eV are called true secondary electrons.

Most of the secondary electrons have energies of 5-10 eV and come, therefore, from within several nanometers below the surface. They are efficiently used for surface topographical investigations, as their yield depends decisively on the tilt angle $\varphi$ of the trajectory of the impinging electron, as measured against the surface normal. An approximate expression for the yield $\delta_{SE}$ of secondary electrons can be derived with the Bethe stopping law (Bethe, 1930), which gives the energy loss per unit path length $dE/ds$ of a primary electron with energy $E$ as

$$\frac{dE}{ds} \propto -\frac{1}{E}\,\ln\!\left(\frac{E}{E_i}\right).$$

Energy dissipation is assumed to be caused by ionization of atoms having a mean ionization energy $E_i$. Secondary electrons generated at a depth $x$ below the surface escape with a probability $\exp(-x/D)$, with $D$ the exit depth, so that

$$\delta_{SE} \propto \frac{1}{E}\,\ln\!\left(\frac{E}{E_i}\right)\frac{1}{\cos\varphi}\int_0^{\infty}\exp\!\left(-\frac{x}{D}\right)dx.$$
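Carrying out the escape integral gives $\int_0^\infty \exp(-x/D)\,dx = D$, so the yield expression reduces to $\delta_{SE} \propto D\,\ln(E/E_i)/(E\cos\varphi)$. The following Python sketch evaluates this proportionality. The function name, the omitted prefactor, and the example numbers are illustrative assumptions and do not come from the text, so only ratios of the returned values are meaningful.

```python
import math

def relative_se_yield(energy_ev, tilt_deg, mean_ionization_ev, exit_depth_nm):
    """Secondary-electron yield up to an unknown constant factor.

    Implements delta_SE ~ (D / (E cos(phi))) * ln(E / E_i), i.e. the
    Bethe-law expression with the escape integral evaluated analytically.
    """
    if energy_ev <= mean_ionization_ev:
        raise ValueError("the logarithmic Bethe term needs E > E_i")
    if not 0.0 <= tilt_deg < 90.0:
        raise ValueError("tilt angle must lie in [0, 90) degrees")
    cos_phi = math.cos(math.radians(tilt_deg))
    return exit_depth_nm * math.log(energy_ev / mean_ionization_ev) / (energy_ev * cos_phi)

# Assumed example values (hypothetical): 10 keV beam, E_i = 100 eV, D = 5 nm.
flat = relative_se_yield(10_000.0, 0.0, 100.0, 5.0)
tilted = relative_se_yield(10_000.0, 60.0, 100.0, 5.0)
print(f"yield ratio, 60 deg tilt vs. normal incidence: {tilted / flat:.2f}")
```

In this model the ratio depends only on the tilt, $1/\cos 60^\circ = 2$, which is the origin of the topographic contrast mentioned above: inclined surface facets appear brighter than flat ones.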
Because of their low energies, the secondary electrons are very susceptible to electric (and magnetic) surface fields. This makes them an ideal voltage probe for integrated circuits. In this context, the total yield $\delta$ of ejected electrons as a function of the energy $E$ of the primary electrons (Fig. 2) is of decisive importance. Since the absorbed current is a fraction $(1 - \delta)$ of the impinging electron current, a floating specimen can be charged positively ($\delta > 1$) or negatively ($\delta < 1$), depending on whether $E_1 < E < E_2$, or $E < E_1$ or $E_2 < E$, respectively. Irradiation with primary electrons of energy $E = E_1$ or $E = E_2$ leaves the specimen neutral, as $\delta(E_1) = \delta(E_2) = 1$, but only the energy $E_2$ gives stable operation. If, for example, $\delta$ exceeds 1 by some fluctuation, i.e., the energy drops below $E_2$, the surface is positively charged. Consequently, the exciting electrons are accelerated before impact and their yield is decreased again.

The energy spectrum of the backscattered electrons consists of an elastic peak and a broad low-loss maximum, which shifts towards the zero-loss peak with increasing atomic number of the scatterer. The elastic fraction is due predominantly to single scattering. Accordingly, these low-loss backscattered electrons come from a thin surface layer of several nanometers. The broad tail of the spectrum down to about 50 eV is due to multiple inelastic scattering.
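The charge-balance argument for a floating specimen (neutral at both crossover energies, but stable only at the upper one) can be made concrete with a small simulation. The Python sketch below is a toy model under stated assumptions: the yield curve `total_yield`, its parameters `E_MAX_EV` and `DELTA_MAX`, and the relaxation gain are hypothetical choices that merely reproduce the qualitative shape of Fig. 2, not measured data.

```python
import math

E_MAX_EV = 500.0    # assumed energy of maximum yield (hypothetical)
DELTA_MAX = 2.0     # assumed maximum total yield (hypothetical)

def total_yield(energy_ev):
    """Toy yield curve: zero at E = 0, maximum DELTA_MAX at E_MAX_EV, decaying above."""
    x = energy_ev / E_MAX_EV
    return DELTA_MAX * x * math.exp(1.0 - x)

def crossover(lo, hi, tol=1e-6):
    """Bisection for delta(E) = 1 on a bracket where delta - 1 changes sign."""
    f_lo = total_yield(lo) - 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = total_yield(mid) - 1.0
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def settle(landing_ev, gain_ev=50.0, steps=2000):
    """Relax the landing energy of electrons hitting a floating specimen.

    delta > 1: net electron loss, surface charges positively, landing energy rises.
    delta < 1: net electron gain, surface charges negatively, landing energy falls.
    The landing energy is clamped at zero (beam fully decelerated).
    """
    for _ in range(steps):
        landing_ev = max(0.0, landing_ev + gain_ev * (total_yield(landing_ev) - 1.0))
    return landing_ev

E1 = crossover(1.0, E_MAX_EV)
E2 = crossover(E_MAX_EV, 10.0 * E_MAX_EV)
print(f"E1 = {E1:.0f} eV, E2 = {E2:.0f} eV")
print(f"start just above E1 -> {settle(E1 + 5.0):.0f} eV")
print(f"start above E2      -> {settle(E2 + 200.0):.0f} eV")
print(f"start just below E1 -> {settle(E1 - 5.0):.0f} eV")
```

In this model, starting points on either side of $E_2$ relax back to $E_2$, whereas a small deviation from $E_1$ grows: above $E_1$ the landing energy runs up to $E_2$, below $E_1$ it runs down to zero.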
FIG. 2. Yield $\delta$ of backscattered and secondary electrons as a function of the energy $E$ of the primary electron. (Marks on the energy axis of the figure: $E_1$, $E_2$.)
The angular distribution of the backscattered electrons depends on the angle of incidence of the primary electrons. At normal incidence an approximate Lambert cosine law is observed, caused by electrons most of which have experienced numerous scattering events and move isotropically, as in a diffusion process. In addition, there is a fraction of electrons that leave the specimen after one or two large-angle scattering events. With increasing angle of incidence (measured against the normal to the surface), a reflection-type distribution evolves, with a growing portion of electrons escaping from the target after only a few small-angle scattering processes.

The elastic fraction of the scattered electrons carries information on crystal structure and surface morphology. It is exploited in diffraction and microscopy with low- and high-energy reflected electrons. Reflection electron microscopy has recently received interest as a high-resolution imaging technique for surfaces, as demonstrated by the work of Telieps (1983), Telieps and Bauer (1985), Ishitsuka et al. (1986), and Ogawa et al. (1986).

C. Auger Electron and Characteristic X-Ray Emission
If the exciting primary electron has a kinetic energy that exceeds the ionization energy $E_i$ of an inner, normally filled shell, for instance a K shell, this shell can be ionized. The generated vacancy attracts an electron, which moves in from an upper level, e.g., the $L_1$ level. The remaining energy difference $E_{iK} - E_{iL_1}$ can either be emitted as a characteristic X-ray photon $\hbar\omega = E_{iK} - E_{iL_1}$, or, in the Auger process, be transferred to an electron with a lower binding energy, e.g., an $L_3$ electron. This $L_3$ electron is then ejected with a nonzero kinetic energy. Since the ejection of electrons causes a redistribution of the Coulomb field in the atom, its energy levels are displaced in the Auger process. A satisfactory approximation for the exit energy $E_A$ of an Auger electron, emitted by an atom with atomic number $Z$ in a $KL_1L_3$ process, is

$$E_A = E_{iK}(Z) - E_{iL_1}(Z) - E_{iL_3}(Z+1).$$

The Auger electrons appear in the energy spectrum of the ejected electrons as weak peaks in the range 40 eV-2 keV. In contrast to the peaks due to electrons that suffered characteristic energy losses by exciting interband and intraband transitions or plasmons, the Auger peaks do not move when the energy of the primary electrons is varied.

As characteristic X-ray and Auger electron emission are alternate processes, their probabilities, i.e., yields, are complementary. An atom with a large yield $\delta_X$ for emission of characteristic X-rays, i.e., an atom with large atomic number, has a low yield $\delta_A = 1 - \delta_X$ for Auger electron emission and vice versa.
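The two relations of this subsection, the approximate $KL_1L_3$ exit energy and the complementarity of the yields, amount to simple arithmetic. The Python sketch below shows it. The function names are hypothetical, and the ionization energies and X-ray yield used in the example are rounded values assumed for illustration only; they should be checked against a binding-energy table before any real use.

```python
def auger_kl1l3_energy_ev(e_k_z, e_l1_z, e_l3_z_plus_1):
    """Exit energy E_A = E_iK(Z) - E_iL1(Z) - E_iL3(Z+1), all energies in eV."""
    return e_k_z - e_l1_z - e_l3_z_plus_1

def auger_yield(xray_yield):
    """Complementary yield delta_A = 1 - delta_X for a given inner-shell vacancy."""
    if not 0.0 <= xray_yield <= 1.0:
        raise ValueError("a yield is a probability and must lie in [0, 1]")
    return 1.0 - xray_yield

# Rounded, assumed inputs for silicon (Z = 14) and phosphorus (Z + 1 = 15).
E_K_SI = 1839.0    # K-shell ionization energy of Si, eV (approximate)
E_L1_SI = 150.0    # L1 ionization energy of Si, eV (approximate)
E_L3_P = 135.0     # L3 ionization energy of P, eV (approximate)

print(f"E_A(Si, KL1L3) ~ {auger_kl1l3_energy_ev(E_K_SI, E_L1_SI, E_L3_P):.0f} eV")
print(f"Auger yield for an assumed X-ray yield of 0.05: {auger_yield(0.05):.2f}")
```

Using the $L_3$ energy of the neighboring element $Z + 1$ is the text's way of accounting for the displaced energy levels of the atom after the first vacancy has been created.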
Both Auger electrons and characteristic X-ray photons have energies that are characteristic of the electronic structure, i.e., chemical nature, of the specimen. Since the yield of X-ray emission increases with the atomic number, heavy atoms are best analyzed by X-ray fluorescence and light atoms are best analyzed by Auger electrons. The probe volumes are also complementary. As most Auger electrons have energies well below 1 keV, they have small scattering lengths, and most unscattered Auger electrons originate from only a few atomic layers beneath the surface. They are an excellent probe for chemical mapping of surfaces. In contrast, scattering of X-ray photons is much weaker than the Coulomb scattering of Auger electrons. Accordingly, the X-ray photons stem from depths comparable to the range of the primary electrons, usually several micrometers. Characteristic X-ray analysis thus gives information on the chemical nature of the bulk.
D. Optical Photons

Light emission from solids that are bombarded by electrons is due to various processes depending on the electronic band structure. In semiconductors and insulators, electrons are excited from the valence band, leaving free holes there, or from donor atoms across the band gap into the conduction band. The charge carriers thermalize within a time roughly coinciding with the Debye period $1/f_D \approx 1$ ps and crowd at the bottom of the conduction band or at the top of the valence band, respectively. Further relaxation is based on recombination of free electrons with holes or acceptor atoms. Most recombinations are nonradiative, and deexcitation proceeds by multiphonon production. The radiative transitions are classified as intrinsic and extrinsic processes. Intrinsic photon emission is due to recombination of free electrons with holes across the band gap with or without exchange of a phonon. The former compensates for a change of electron momentum in materials with an indirect band gap. Apart from recombination with a hole, an electron can create a bound state, the exciton, having a hydrogen-like emission spectrum within the band gap. Extrinsic emission results from recombination of electrons with holes at trapping centers such as acceptor atoms or lattice defects. Consequently, extrinsic luminescence can be exploited to image lattice defects. Cathodoluminescence of metals is caused by quite different processes, as there are no holes formed. It consists of the low-frequency part of bremsstrahlung, of transition radiation, and of decaying plasmons of the conduction electron gas.
E. Electron-Beam-Induced Conductivity (EBIC)

An electron with an energy of several keV impinging on a semiconductor can generate up to several thousand free electron-hole pairs, thus locally modulating the electric conductivity to an appreciable extent. This effect is measured as the EBIC signal in an external circuit, for instance as a current modulation, if a constant voltage source is driving the circuit. Since the conductivity depends on the density and velocity of the free charge carriers, it is strongly influenced by traps and local electric fields. Consequently, electron-beam-induced currents are efficiently used to image electrically active lattice defects, pn-junctions, and sites of avalanche multiplication of charge carriers in semiconductor devices.

In principle, all the different types of signals presented in Sections A to E can be used to study time-varying processes. But, as resolution is ultimately determined by shot noise, signals with a large efficiency, such as elastically scattered and secondary electrons, are preferred as time-resolving probes for very fast processes. Therefore, only applications of these two types of signals for electron microscopy of fast processes will be discussed.
III. MODES OF ELECTRON MICROSCOPY
There are two distinctly different modes of microscopy, conventional and scanning, depending on whether the signal detection and processing are parallel or sequential.

A. Conventional Electron Microscopy
In conventional microscopy, the signals are picked up simultaneously from all object points and are simultaneously processed to an image by lens or mirror imaging or simply by central projection. There are various modes of conventional electron microscopy based on different signals, as shown in Fig. 3.

1. Transmission Electron Microscopy (TEM)
Transmission microscopy is by far the most widespread conventional mode, as it allows investigations of crystal structure and crystal defects at high resolution in routine operation. The specimen, a thin film, is illuminated in transmission by a focused electron beam from a thermal or field emission gun and imaged by an objective lens and several projective lenses. In addition, selected area diffraction can be performed either with a focused beam or with an area-selecting aperture in the image plane of the objective lens. The contrast that is most frequently exploited is due to scattering (Bragg scattering in crystals) and absorption at the objective lens aperture. Strongly scattering parts in a weakly scattering matrix appear dark in a bright-field image (Fig. 4).

FIG. 3. Different modes of conventional electron microscopy: (a) transmission, (b) reflection, (c) mirror, (d) secondary, and (e) field emission electron microscopy.
FIG.4. Imaging of a thin film with conventional transmission microscopy. Contrast is generated by scattering of the illuminating electrons by the specimen and absorption of the scattered radiation by an aperture at the back focal plane of the objective lens.
FIG. 5. Defocused imaging of a phase object consisting of magnetic domains with antiparallel in-plane magnetization. The image intensity is sketched for the dashed plane.
Local fields due to electric and magnetic polarization scatter by rather small angles ($10^{-5}$ to $10^{-4}$ rad), so that imaging with an objective aperture is inconvenient. On the other hand, the pertaining potentials produce a phase shift of the electron wave that is determined by the magnetic vector potential $\mathbf{A}$ of the magnetization, the electric potential $U$ of the specimen with thickness $D$, the kinetic energy $E_0$ ($\gg eU$) and wavelength $\lambda$ of the imaging electron, and its trajectory $C$. This phase shift causes pronounced interference patterns outside the object plane, so that magnetic and electric domain boundaries may easily be visualized by defocused imaging (Fig. 5). Because of the small scattering angles, a very small illumination divergence ($\leq 10^{-5}$ rad) must be used, and, in the case of magnetic domains, the specimen must be screened from the magnetic field of the objective lens.

The spatial resolution is determined by spherical and chromatic aberration of the objective lens and by diffraction at the objective aperture, giving a disc of confusion with diameter $\delta$ (in the object plane) that depends on the spherical and chromatic aberration constants $C_s$ and $C_c$, the maximum semiangle $\alpha$ passed by the objective aperture, and the spread of electron energy $\Delta E$.
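How the competing terms set an optimum aperture can be illustrated numerically. The Python sketch below assumes, purely for illustration, that for $\Delta E = 0$ the spherical-aberration disc and the diffraction disc add in quadrature with prefactors 0.5 and 0.6. Neither the quadrature form nor the prefactors are taken from the text, and all function names are hypothetical. Under that assumption the smallest disc diameter scales as $C_s^{1/4}\lambda^{3/4}$.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J s
M_E = 9.1093837015e-31      # electron rest mass, kg
Q_E = 1.602176634e-19       # elementary charge, C
C_LIGHT = 2.99792458e8      # speed of light, m/s


def electron_wavelength(energy_ev):
    """Relativistic de Broglie wavelength (m) for a kinetic energy in eV."""
    energy_j = energy_ev * Q_E
    rel = 1.0 + energy_j / (2.0 * M_E * C_LIGHT**2)
    return H_PLANCK / math.sqrt(2.0 * M_E * energy_j * rel)


def disc_diameter(alpha, cs, lam, a_sph=0.5, b_diff=0.6):
    """Assumed quadrature sum of spherical and diffraction discs (Delta E = 0)."""
    return math.hypot(a_sph * cs * alpha**3, b_diff * lam / alpha)


def optimum(cs, lam, a_sph=0.5, b_diff=0.6):
    """Aperture semiangle minimizing disc_diameter, and the minimum itself.

    Setting the derivative of the squared diameter to zero gives
    alpha^8 = (b_diff * lam)^2 / (3 * (a_sph * cs)^2).
    """
    alpha_opt = (b_diff * lam / (math.sqrt(3.0) * a_sph * cs)) ** 0.25
    return alpha_opt, disc_diameter(alpha_opt, cs, lam, a_sph, b_diff)


if __name__ == "__main__":
    lam = electron_wavelength(100e3)   # 100 keV electrons (example value)
    cs = 2e-3                          # C_s = 2 mm (example value)
    alpha_opt, d_min = optimum(cs, lam)
    print(f"lambda = {lam * 1e12:.2f} pm")
    print(f"alpha_opt = {alpha_opt * 1e3:.2f} mrad, delta_min = {d_min * 1e9:.2f} nm")
```

The example values of 100 keV and $C_s$ = 2 mm are arbitrary. Changing $C_s$ by a factor of 16 changes the minimum only by a factor of 2, which is why the wavelength is the more effective lever.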
Blurring due to inelastic scattering and chromatic aberration can be appreciably reduced by energy filtering. The maximum resolution in the elastic image ($\Delta E = 0$) is then given by

$$\delta_{\min} = \mathrm{const}\cdot C_s^{1/4}\,\lambda^{3/4}, \qquad (5)$$

reaching down to 0.2 nm for nonperiodic objects.

2. Reflection Electron Microscopy (REM)

Transmission electron microscopy is limited to thin films. This restriction is overcome by reflection electron microscopy. Using high-energy electrons, the setup is similar, except that the specimen is illuminated by a grazing electron beam. Furthermore, the surface of bulk material is usually imaged with electrons scattered by small angles only. A small angle of incidence ($\approx 10^{-2}$ rad) is used to sense even small surface modulations. Small takeoff angles are utilized for imaging in order to provide as many elastically scattered electrons as possible. Contrast is due to shadow casting and scattering within the uppermost layers, and spatial filtering of the scattered electrons by the objective lens aperture. In conjunction with ultrahigh vacuum techniques, surface morphology, even oligoatomic steps, can be successfully studied (Ishitsuka et al., 1986; Ogawa et al., 1986). However, due to the oblique takeoff angle of the imaging electrons, the field of view is extremely anisotropic. This can be corrected with an added cylinder lens or by image processing. The resolution is primarily limited by chromatic aberration, as a substantial fraction of the "reflected" electrons has suffered inelastic collisions. In addition to imaging, the crystal lattice structure of the top layers can be studied by reflected electron diffraction in selected areas.

3. Electron Mirror and Emission Microscopy
Electron mirror microscopy exploits electrons that are reflected by an equipotential at a variable distance in front of the specimen surface. The object, being at a potential slightly more negative than the cathode, is part of an electrostatic immersion lens. Atomic steps, electric fields, and magnetic fields can be imaged, the latter with a considerably reduced resolution and only outside the center. The setup can also be operated as a powerful low-energy electron reflection or emission microscope by adjusting the potential of the specimen positive relative to the cathode (Telieps, 1983; Telieps and Bauer, 1985).

In electron emission microscopy, the object is the source of a cathode lens. The imaging electrons are emitted by heating or bombardment with electrons, positive ions, or photons. The emitted current density carries local information on material parameters such as atomic number and work function, and on the surface relief. The latter acts by local microlenses due to deformation of the electric field at the surface.

The lateral resolution of mirror, emission, and low-energy electron reflection microscopy is restricted by the large spherical and chromatic aberration of the electrostatic immersion lens at the specimen. Optimizing the electron optics, Telieps (1983) achieved a lateral resolution of $\approx 4$ nm, at least with the low-energy reflection microscope.

4. Field Electron Microscopy
The field electron microscope operates without lenses. The object is a microtip with a radius $r \approx 50$ nm at a high negative potential of several kV. Electrons are field emitted and centrosymmetrically accelerated towards the viewing screen. Contrast is generated by differences in work function or electron affinity in the case of adsorbed molecules. The resolution is determined by a compromise between geometric imaging of, and diffraction at, the atoms, with best values of the minimum resolved distance at around 0.5 nm.

B. Scanning Electron Microscopy (SEM)
Figure 6 schematically shows the setup for scanning electron microscopy. The electron beam from a thermal or field emission gun is focused by several lenses down to a diameter of 0.5 to 10 nm in the plane of the specimen. The beam is raster scanned across the specimen synchronously with the electron beam of a cathode-ray tube (CRT). The image is generated by modulating the intensity of the CRT beam with one of the various signals recorded. Information depends on the operation mode and on the signal. Emission modes comprise secondary, backscattered, and Auger electrons; characteristic X-rays; and visible light photons. Absorbed electron current and beam-induced conductivity belong to the absorption mode.

FIG. 6. Scheme of a scanning electron microscope.

Image resolution is limited by the diameter $\delta$ of the electron probe on the specimen and by the exit volume of the signal used. The former is determined by the geometric optical source size, the spherical and chromatic aberration, and diffraction [Eq. (6)], where $\delta_0$ is the diameter of the crossover, $\alpha_0$ is the semiangle subtended by the crossover at the first aperture, $\alpha$ is the semiangle of the beam converging on the specimen, and $C_s$ and $C_c$ are the spherical and chromatic aberration constants of the probe-forming lens. At high currents and in the case of a thermal electron source, its large crossover and emission angle exceed the aberration terms and determine the final spot size. The exit volume of the signal used increases with growing penetration depth, i.e., the energy of the primary electrons, thus reducing resolution more seriously than the spot size $\delta$ due to Eq. (6). There is, however, no point in reducing the acceleration voltage below a certain limit. Resolution also decreases with decreasing signal/noise ratio. Since the yield of signal particles falls with decreasing primary energy, a compromise between shot noise and exit volume must be sought. The resolution ranges from 0.1 nm to about 1 µm, the poorer value applying to X-ray imaging.

The crystal lattice structure can be analyzed by keeping the electron beam fixed on the probed area and scanning the angle of incidence through the Bragg angles. In this way, so-called electron channeling patterns are generated that are the result of multiple diffraction at the lattice.
IV. TIME-RESOLVED ELECTRON MICROSCOPY

Time-resolved electron microscopy can be classified into stroboscopic and real-time microscopy. Stroboscopic microscopy is chosen when periodic processes are to be investigated. The signal is picked up periodically in exact synchronism with the periodic process. Access to all phases of the process is realized by varying the phase of the sampling procedure. Because of the additive collection of signals over many periods of the process, noise, being random, is effectively eliminated. Both conventional and scanning microscopes can be operated in the stroboscopic mode. Magnetic reversals induced by high-frequency magnetic fields in low-loss ferromagnetic materials and switching of electric potentials in clocked electronic circuits are important examples of application that will be discussed in Section IV.C.

The stroboscopic technique is limited to periodic processes. Nonrecurrent events are the domain of real-time microscopy. Fast processes, proceeding on the microsecond time scale and below, can only be traced by conventional microscopy, as scanning is far too time consuming to keep pace with the rapid events. Real-time microscopy is applicable to all time-varying processes. The main applications are fast phase transitions, which will be discussed in Section IV.G.

Time-resolved microscopy is distinguished by the fact that the electron beam should be pulsed, and this is for three reasons. In the first place, pulsing of illumination is recommended in order to avoid radiation damage to the specimen and/or detector at the high intensities $I$ used for short recording times $\Delta t$ to maintain a sensible signal/noise ratio $I\,\Delta t/\sqrt{I\,\Delta t} = \sqrt{I\,\Delta t}$. Secondly, pulsing the illumination is a convenient way for time-resolved imaging. In addition, by pulsing the illumination periodically, the stroboscopic operation is easily realized.
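The shot-noise argument can be made concrete with a few lines of Python. The sketch below counts the electrons delivered during one exposure and takes the square root as the signal/noise ratio, assuming Poisson statistics and an even spread of the electrons over the image elements. The beam current, exposure time, and pixel count are hypothetical example values, not figures from the text.

```python
import math

Q_E = 1.602176634e-19  # elementary charge, C


def electrons_in_exposure(current_a, exposure_s):
    """Number of electrons delivered by a beam current during one exposure."""
    return current_a * exposure_s / Q_E


def shot_noise_snr(current_a, exposure_s, n_pixels=1):
    """Signal/noise ratio sqrt(N) per image element, assuming Poisson statistics
    and electrons shared equally among n_pixels image elements."""
    n_per_pixel = electrons_in_exposure(current_a, exposure_s) / n_pixels
    return math.sqrt(n_per_pixel)


if __name__ == "__main__":
    # Hypothetical example: 1 mA for 10 ns, spread over 100 x 100 image elements.
    n = electrons_in_exposure(1e-3, 10e-9)
    snr = shot_noise_snr(1e-3, 10e-9, n_pixels=100 * 100)
    print(f"electrons per exposure: {n:.3g}, SNR per pixel: {snr:.1f}")
```

Because the ratio grows only with the square root of the dose, shortening the exposure by a factor of 100 at fixed signal/noise ratio requires a 100 times higher intensity.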
Of course, real-time and stroboscopic microscopy can also be accomplished by gating the detector, but the problem of radiation damage remains. The third reason for pulsing the beam is the possibility of increasing the emissivity of the electron source far beyond its stationary value.

In the short-exposure mode of real-time microscopy, the image signal is to be collected within only a single short period of time. Accordingly, the illuminating electron source must provide a high-intensity pulse and the detector must be very sensitive. Before discussing time-resolving microscopy in detail, the two prerequisites, electron beam pulsers and an adequate detector, will be described.

A. Generation of Pulsed Electron Beams
Electron beams for electron microscopy and analysis are generated by thermal emission, field emission, and photoemission. Tungsten and LaB$_6$ are preferred as thermal emitter materials. Usually, a thermal gun is built as a three-electrode system consisting of an electron emitter, a current regulating and focusing Wehnelt electrode, and an accelerating anode. The actual source is the crossover, with a diameter of about 50 to 100 µm, formed by the Wehnelt electrode. The thermal electron emission current density $j$ at the temperature $T$ is given by the Richardson-Dushman law:

$$j = C\,T^{2}\exp\!\left(-\frac{W}{kT}\right), \qquad (7)$$

where $C$ is a constant, $W$ is the work function of the emitter, and $k$ is Boltzmann's constant. At the usual operating temperature of 2800 K for tungsten, the values of current density $j$ and axial brightness $j/\pi\alpha^{2}$ ($2\alpha$ is the FWHM value of the beam divergence angle) are several A/cm² and $10^{5}$ A/cm²·sr. Due to its lower work function, a LaB$_6$ cathode has values that are an order of magnitude higher. However, this cathode must be used at pressures below lo-’ mbar to avoid rapid corrosion of LaB$_6$ by oxidizing residual gases.

Field emitters are distinguished by a very high emissivity of several 10’ A/cm²·sr due to their small emitting area. Conventional solid-state field emitters are, however, extremely sensitive to ion bombardment and can be safely operated only in ultrahigh vacua. A possible alternative is the liquid metal source. Here, a regenerating liquid tip is drawn by field forces from the liquid layer covering a robust, pointed wire. Stable field electron emission was recently observed by Hata et al. (1987), even at mbar, if certain conditions are fulfilled. The field intensity at the tip must stay below 10, V/mm and the radius of the supporting solid tip must be less than 10 µm.

The usual photoemitters are very sensitive to oxidizing gases and can therefore be operated only in ultrahigh vacua. Recently, however, a photocathode with a 20-nm thin, rough gold film as photoemitter that worked in a conventional vacuum of mbar was introduced by May et al. (1987). Electron emission was achieved by focusing an ultraviolet laser beam to a diameter of 10 µm on the gold film from the back side. The electrons escaped by field-assisted photoemission. Photoemission alone would be inferior by four orders of magnitude.

All three cathode types can be used for production of electron pulses, and this is done in several ways.

1. Modulation of the Wehnelt Voltage
The electron beam is turned off in the quiescent state by a negative Wehnelt bias. An electron pulse is produced by applying a short positive voltage pulse of an adequate amplitude to the Wehnelt electrode via a high-tension capacitor. In this way, the pulse generator is decoupled from the high acceleration voltage of the electron gun. Electron beam pulse widths of 5 ns at operating frequencies up to 100 MHz have been reported by Szentesi (1972). The method suffers from the serious drawback that an energy spread of the beam electrons is produced at higher frequencies, as reported by Schief and Steiner (1973). This effect becomes appreciable when the time of flight of the electrons in the accelerating field becomes comparable to the period of the modulating voltage. Image resolution is then seriously degraded by chromatic aberration of the imaging lenses.
2. Pulsing of a Filter Lens

An electrostatic three-electrode Einzel lens with a sufficiently high negative potential at the central electrode is an electron mirror. According to Plies (1982), it can be used to switch off the electron beam. Due to the energy spread of the electron beam, the blocking voltage must exceed the acceleration voltage applied to the electron gun. To produce an electron pulse, an opening voltage pulse is supplied to the central lens electrode via a capacitor. The blanking lens must be shaped such that the potential saddle is very flat in the radial direction. Then the lens can be switched by small voltage pulses at the central electrode. The lens proposed by Plies is designed to pulse 20-keV electron beams with a switching voltage amplitude of only 10 V at frequencies up to 100 MHz.

3. Deflection across an Aperture

Most often, electron pulses are generated by deflecting the beam with a transverse electric or magnetic field across an aperture. As well-shaped rapid voltage pulses are more easily produced than current pulses, electric fields are preferred for deflection at high frequencies. Magnetic field pulses are disfavored, as it is rather inconvenient to eliminate damping due to eddy currents induced in neighboring metals and ringing caused by self-inductance and winding capacitance of the coils.¹

Beam pulsing is achieved in two ways. Either the electron beam is deflected off an aperture with a dc bias voltage in the quiescent state and an electron pulse is produced by switching it off and on again for the wanted pulse duration, or, more frequently, the electron beam is swept across the aperture by a sinusoidal deflection voltage $U_d = U_0\cos\omega t$. The pulse duration is then determined by the slope of the sweeping voltage at the zero crossing points. In order to produce electron pulses with the very frequency $\omega$ of the sinusoidal voltage, crossing of the aperture must be allowed only during one slope of the voltage. This is achieved by superimposing an appropriate second sweeping voltage.

The deflecting transverse electric fields are produced either with a lumped parallel plate capacitor, a microwave cavity, or a traveling wave structure. The lumped plate capacitor is the simplest system. As the deflection angle $\theta$ for electrons of energy $eU$ ($\gg eU_0\cos\omega t$) entering the capacitor at time $t = 0$ is given by

$$\theta = \frac{l\,U_0}{2\,s\,U}\,\frac{\sin(\omega t_f)}{\omega t_f},$$

effective deflection is realized only as long as the time of flight $t_f$ of the electron through the capacitor is negligible against the period $2\pi/\omega$ of the deflecting sine voltage. Here $l$ and $s$ are length and mutual distance of the capacitor plates. The limiting frequency is increased by reducing the length $l$ of the capacitor to a minimum. Electron pulse widths of 10 ps at deflection frequencies up to 10 GHz were achieved by Menzel and Kubalek (1979) with 1-30 keV electrons. There is a second reason to keep the length of the capacitor at a minimum at high frequencies. If the time of flight is comparable to the period of the deflection voltage, electron acceleration at the entrance and deceleration at the exit of the capacitor due to axial stray fields from the capacitor plates to the earthed surroundings do not cancel anymore. This results in a shift and spread of the electron energy.

Microwave cavities tuned to the working frequency of the device under study were used as effective deflectors and bunchers in the gigahertz region. Electron pulses of 0.2 ps with a repetition rate of 1 GHz could be realized by Hosokawa et al. (1978). The microwave cavity, however, has the disadvantage that it can be used at single frequencies only.
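The transit-time factor $\sin(\omega t_f)/(\omega t_f)$ in the deflection angle $\theta = l\,U_0\sin(\omega t_f)/(2sU\,\omega t_f)$ of the lumped capacitor can be checked numerically. The Python sketch below evaluates the closed form and compares it with a direct integration of the transverse force along the flight path. It is a non-relativistic estimate with an idealized fringe-free field, and the plate length, plate distance, and voltages are hypothetical example values.

```python
import math

Q_E = 1.602176634e-19   # elementary charge, C
M_E = 9.1093837015e-31  # electron rest mass, kg


def velocity(u_acc):
    """Non-relativistic electron velocity (m/s) after acceleration through u_acc volts."""
    return math.sqrt(2.0 * Q_E * u_acc / M_E)


def deflection_angle(u_acc, u0, length, gap, freq_hz):
    """Closed form: theta = l*U0/(2*s*U) * sin(w*t_f)/(w*t_f), entry at t = 0."""
    x = 2.0 * math.pi * freq_hz * length / velocity(u_acc)
    transit = 1.0 if x == 0.0 else math.sin(x) / x
    return length * u0 / (2.0 * gap * u_acc) * transit


def deflection_angle_numeric(u_acc, u0, length, gap, freq_hz, steps=20000):
    """Same quantity from a midpoint-rule integration of the transverse force."""
    v = velocity(u_acc)
    t_f = length / v
    omega = 2.0 * math.pi * freq_hz
    dt = t_f / steps
    impulse = sum(
        Q_E * u0 / gap * math.cos(omega * (i + 0.5) * dt) for i in range(steps)
    ) * dt
    return impulse / (M_E * v)


if __name__ == "__main__":
    # Hypothetical example: 10 keV beam, 10 V amplitude, 5 mm plates 1 mm apart.
    for f in (0.0, 1e9, 5e9, 10e9):
        theta = deflection_angle(1e4, 10.0, 5e-3, 1e-3, f)
        print(f"f = {f / 1e9:4.1f} GHz  theta = {theta * 1e3:.3f} mrad")
```

The deflection vanishes whenever the time of flight equals a whole number of half periods, which is the quantitative reason for keeping the capacitor short at high frequencies.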
¹ Magnetic field deflectors, however, have the advantage of being significantly less sensitive to fluctuating charges on electron-beam-induced contamination layers.
FIG. 7. Traveling wave deflector for an electron beam.
A disadvantage of the lumped capacitor is the fact that in the case of a pulsed deflection voltage the electrons fall behind the latter at short pulse widths, as the electric signal propagates with the vacuum velocity of light, which exceeds the velocity of the beam electrons. In order to fully exploit the deflecting action at short pulse widths, a meander-type traveling wave deflector system was introduced (Fig. 7) by Feuerbaum and Otto (1978). The deflecting voltage is forced on a detour, giving a decreased effective propagation velocity along the electron trajectory, which can be made to coincide with the electron velocity at a particular acceleration voltage. The traveling wave deflecting structure shown in Fig. 7 can be thought of as being a coaxial-cable waveguide that is cut apart along the axis, with the two halves placed opposite and equal electrodes facing one another. In order to avoid reflections, the waveguides are terminated with a resistor $R$ equal to the wave impedance $Z$. Ambipolar pulses $\pm U$ are fed to the two waveguides, generating a transverse deflecting field. This field travels along the axis with a velocity made to coincide with that of the beam electrons, which accordingly experience the full deflecting field all the way through the deflector. In contrast to microwave cavities, which are tuned to a single frequency, and unscreened traveling wave systems, which reveal dispersion (Meinke and Gundlach, 1968), the short capacitor and the trough-type traveling wave deflector are applicable to all frequencies within their bandwidth of several gigahertz.

In cases where the image signal is obtained from a probe focused onto the specimen, pulsing by simple deflection across an aperture has the serious disadvantage that the effective probe size is increased along the direction of deflection (Fig. 8). The circular probe of original size $d_0$ is deformed to an ellipse with axes $d_0$ and $d'$. Disregarding the possibility of using unrealistically small chopping apertures, the deformation can be almost eliminated by placing the deflecting capacitor and the chopping aperture in suitable planes between the probe-forming lenses (Menzel and Kubalek, 1979).

FIG. 8. Elliptic deformation of a circular electron spot of original diameter $d_0$ by deflection of the probing beam across an aperture.
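For the traveling wave deflector described earlier in this section, the required slowing of the deflecting signal follows directly from the electron velocity. The Python sketch below computes the relativistic velocity ratio $v/c$ for a given acceleration voltage and, from it, the factor by which the axial propagation of the signal must be reduced. It assumes that the signal runs along the meander at the vacuum velocity of light unless another fraction is passed in; the function names and the listed voltages are illustrative only.

```python
import math

ELECTRON_REST_ENERGY_EV = 510998.95  # m c^2 of the electron in eV


def beta(u_acc_volts):
    """Electron velocity as a fraction of c after acceleration through u_acc_volts."""
    gamma = 1.0 + u_acc_volts / ELECTRON_REST_ENERGY_EV
    return math.sqrt(1.0 - 1.0 / gamma**2)


def required_slowdown(u_acc_volts, line_velocity_fraction=1.0):
    """Ratio of signal velocity along the line to the axial velocity needed.

    line_velocity_fraction is the signal velocity along the conductor in units
    of c (1.0 is an assumption for a vacuum-filled structure).
    """
    return line_velocity_fraction / beta(u_acc_volts)


if __name__ == "__main__":
    for u in (1e3, 10e3, 30e3):
        print(f"U = {u / 1e3:4.0f} kV  v/c = {beta(u):.3f}  "
              f"detour factor = {required_slowdown(u):.2f}")
```

Since the detour factor is fixed by the geometry, exact matching holds at one acceleration voltage only, in line with the remark that the velocities coincide "at a particular acceleration voltage".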
4. Pulsed Electron Emission

Pulsed electron beams can be generated by initiating pulsed emission of electrons from the source. This is done by supplying pulsed energy to the electrons, either by light, heat, or accelerating voltage.

a. Pulsed Photoelectron Emission. In order to achieve pulsed photoelectron emission, the photocathode is irradiated with focused ultraviolet light pulses from a laser. Usually, stable photoelectron emission necessitates an ultrahigh vacuum. Recently, suitable photocathodes were introduced that operated efficiently, even in normal high vacuum. Akhmanov et al. (1985) used bulk tantalum, and May et al. (1987) used thin, rough gold films as cathode material. The thin-film cathode is illuminated for convenience from the rear. Accordingly, its thickness must approximate the exit depth of the photoelectrons ($\approx 20$ nm) but should exceed the absorption length of light.
The film must be coarse grained, containing numerous microtips to provide field-assisted photoelectron emission, as photoemission from smooth gold films is inferior by a factor of $10^{4}$ at the used wavelength of 266 nm (Marcus et al., 1986). The light pulses were derived from an actively mode-locked CW Nd:YAG laser. They were passed through a light guide, where they were compressed from 70 ps to 3 ps by chirp and dispersion. Then they were frequency quadrupled and thereby sliced to 1.5 ps. These ultraviolet pulses were finally focused to a spot of 10 µm in diameter on a rough-gold-film cathode, from which electron pulses were extracted with a two-anode structure. The electron pulses had a width of 1 ps, peak values of 0.5 mA, and a repetition rate of 100 MHz. An emissivity of $10^{8}$ A/cm²·sr was reported.
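The quoted pulse parameters translate into a rather small number of electrons per pulse, which matters for the shot-noise budget. The Python sketch below makes the estimate under the simplifying assumption of a rectangular pulse shape; the function names are illustrative, and the inputs are the width, peak current, and repetition rate quoted above.

```python
Q_E = 1.602176634e-19  # elementary charge, C


def electrons_per_pulse(peak_current_a, width_s):
    """Electrons in one pulse, treating the pulse as rectangular (an assumption)."""
    return peak_current_a * width_s / Q_E


def average_current(peak_current_a, width_s, rep_rate_hz):
    """Time-averaged beam current of a periodic train of rectangular pulses."""
    return peak_current_a * width_s * rep_rate_hz


if __name__ == "__main__":
    n = electrons_per_pulse(0.5e-3, 1e-12)          # 0.5 mA peak, 1 ps width
    i_avg = average_current(0.5e-3, 1e-12, 100e6)   # 100 MHz repetition rate
    print(f"electrons per pulse: {n:.0f}")
    print(f"average current: {i_avg * 1e9:.0f} nA")
```

A few thousand electrons per pulse are far too few for a single-shot image, so a source of this kind lends itself to stroboscopic operation, where the signal is accumulated over many periods.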
b. Pulsed Thermal Electron Emission (Bostanjoglo and Heinricht, 1987). The most widely used electron guns are based on thermal electron emission from tungsten. Their emission current density, as given by Richardson's law [Eq. (7)], increases by one order of magnitude for every 200 K increase in temperature near their normal operating temperature of 2800 K. However, a substantial stationary increase of temperature above 3000 K, even when staying below the melting point, results in rapid destruction of the cathode wire by field-assisted flow. As, however, the involved hydromechanics proceed on the 100-ns time scale (Bostanjoglo and Heinricht, 1986), heating of the cathode with focused nanosecond laser pulses is not expected to destroy the emitter, even if the melting point is exceeded. Destruction is further obstructed by the fact that laser pulse heating is localized.

The setup of a laser-flash driven thermal electron gun, which can operate in a technical vacuum of mbar, is shown in Fig. 9. It is a commercial three-electrode tungsten hairpin gun, installed in any commercial electron microscope, which has been modified for additional laser flash heating. The usual constant electron beam current operation is performed by conventional heating, focusing with the Wehnelt electrode, and accelerating with the anode. In addition, green laser pulses (532-nm wavelength, 5-ns FWHM) from a Q-switched, frequency-doubled Nd:YAG laser can be focused to a spot with a diameter of 100 µm onto the tip of the tungsten hairpin. When a 5-ns laser pulse is applied, an electron pulse is emitted, having its maximum at the end of the laser pulse and being significantly broader. Because of the delay and the increased pulse width, the majority of electrons escape by thermal emission, not photoemission. The pulse shape and, in particular, the amplitude depend, as expected, on the energy of the light pulse and on the stationary background temperature (Figs. 10 and 11).
Peak heights of focused electron beam pulses up to 5 mA were achieved, which is to be compared to the stationary value of 20 µA with the employed acceleration voltage. At higher laser pulse energies, a delayed and broad satellite pulse piles up, with its maximum occurring about 100 ns after the main pulse. As the laser-pulsed cathode wire shows shallow molten regions with numerous protrusions (Fig. 12), electrons are emitted from liquid material. The satellite pulse is probably caused by field-assisted thermal emission from the liquid when it disintegrates due to temperature-induced gradients of the surface energy. The delay in the order of 100 ns is typical for flow processes (see Section V.A). Of course, this irregular satellite pulse must be suppressed, and this is done with a blanking unit consisting of a deflecting capacitor and a chopping blade.

When laser pulses are applied to a cold cathode, the leading electron pulse splits into several bursts. In order to obtain a smooth shape, the emitter must be kept at a stationary background temperature above 2500 K. The bursts are thought to be caused by laser-induced desorption of adsorbed oxygen and accompanying changes in the work function. The latter is known to depend heavily on temperature below 2800 K in an ordinary high vacuum of lo-’ mbar, because of oxygen adsorption (Gmelin, 1978).

As shown in Fig. 11, the electron pulse amplitude at first increases with laser pulse energy up to about 150 µJ. Above this value, however, the amplitude decreases again and the leading pulse is found to become very irregular too. Simultaneously, the electron emission is succeeded by arcing of the electron gun. Apparently, laser-supported combustion and detonation waves (Pirri, 1971, 1973; Raizer, 1965) are produced. The generated tungsten plasma absorbs light by inverse bremsstrahlung and thus screens the cathode.

The attractive features of this gun are ease of operation; robustness; no need for ultrahigh vacuum, as with field-emission cathodes; and operation as a pulsed cathode without obstructing the standard constant current mode.

FIG. 9. Scheme of a laser pulse-excited thermal tungsten hairpin electron gun. (From Bostanjoglo and Heinricht, 1987.)

FIG. 10. Electron current pulses as emitted by the gun shown in Fig. 9, using laser pulses of different energies and a steady-state temperature of 2800 K of the cathode. The pulses are shown at two different time scales. (From F. Heinricht, PhD thesis.)

FIG. 11. Electron pulse amplitude as a function of the applied laser pulse energy (horizontal axis, 0 to 400 µJ) and steady-state temperature $T_0$ (3000 K, 2800 K, and 2400 K). The pronounced scattering of the measured values is caused by scattering of the deposited laser pulse energies as the shape of the emitter is changed with each shot.

FIG. 12. Tip of a tungsten hairpin emitter after several thousand shots. The spherical protrusions show that the material is molten by the laser pulses.

c. Cold Cathode Semiconductor Junction Gun. If the electric field in a reverse-biased p-n junction of a semiconductor is high enough, the mobile charge carriers are so heavily accelerated that avalanche breakdown occurs. The resulting heating of electrons may stimulate a significant emission into the vacuum if the p-n junction is located close enough to the surface of the specimen. Using this effect, van Gorkom and Hoeberechts (1987) developed an efficient cold cathode semiconductor junction gun.
Figure 13 shows the principle. Conduction electrons, coming from the p-doped region into the reverse-biased depletion layer, are accelerated there by the high electric field and produce electron-hole pairs by collisions with bound electrons. The electrons crowd in the conduction band of the thin n-doped layer. Those that have a kinetic energy along the surface normal exceeding the work function escape into the vacuum. Doping and geometry must be such that the bias field provides efficient avalanching while tunneling across the energy gap remains negligible. This is necessary, as most electrons arriving at the conduction band of the n-layer by the Zener effect would have low energies and could not leave the semiconductor. A simplified scheme of the p-n junction electron source is sketched in Fig. 13(b). The emitter is produced by photolithography on a silicon chip. Driving this source in a two-electrode configuration with an acceleration voltage of 10 kV, a current density of 1500 A/cm², and a brightness of
10⁷ A/(cm² sr) have been achieved. Another very attractive feature, apart from these high values, is the fact that the emission current can be modulated with the reverse bias at frequencies up to the gigahertz region. There is, however, the serious drawback that the high current densities are reached only by lowering the work function with a cesium layer on the silicon surface. This cathode, therefore, requires a vacuum free from oxidizing gases such as oxygen, water, or carbon dioxide.

FIG. 13. Semiconductor p-n junction electron gun (from van Gorkom and Hoeberechts, 1987). (a) Energy band model of a reverse-biased p-n junction showing avalanche multiplication of electrons in the depletion region (E_C, E_V, bottom of the conduction band and top of the valence band; E_F, Fermi level; U, reverse voltage; W, work function). (b) Schematic setup of a semiconductor junction gun.
B. Image Detector for Short-Exposure Electron Microscopy (Bostanjoglo et al., 1987a, 1987c)
Short exposure times can be realized in two ways. Either the specimen is illuminated with a pulsed electron gun, having a brightness that exceeds that of a constant current gun by several orders of magnitude, or the image signal is collected with a gated image intensifier. Though the first method is to be preferred as the only way to reduce shot noise in the image, there are also several points that may favor a gated detector. In the first place, damage to the specimen through thermal and electronic decomposition under electron bombardment can be a problem at the high current densities supplied by a superradiant pulsed electron gun. Secondly, it is easier to realize a short gating time than a short electron beam pulse of high intensity. Thirdly, the gain of a detector may be increased in the gated mode far beyond its stationary value.
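The shot-noise argument can be made concrete with a minimal sketch. It assumes Poisson statistics, so that a picture element receiving N electrons has a signal-to-noise ratio of √N. The function names and all numerical values (current density, picture element size, exposure time) are hypothetical and chosen only for illustration.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in coulombs


def electrons_per_pixel(current_density_a_cm2, pixel_area_cm2, exposure_s):
    """Mean number of electrons hitting one picture element during the exposure."""
    return current_density_a_cm2 * pixel_area_cm2 * exposure_s / E_CHARGE


def shot_noise_snr(n_electrons):
    """Signal-to-noise ratio for Poisson-distributed counts (assumption)."""
    return math.sqrt(n_electrons)


# Hypothetical example: 10 nm x 10 nm picture element, 10 ns exposure.
pixel_area = (10e-7) ** 2            # 10 nm expressed in cm, squared
for j in (1.0, 1000.0):              # assumed current densities in A/cm^2
    n = electrons_per_pixel(j, pixel_area, 10e-9)
    print(f"j = {j:7.1f} A/cm^2 -> N = {n:8.3f}, SNR = {shot_noise_snr(n):5.2f}")
```

In this picture a gated detector leaves N unchanged for a given illumination; only a brighter pulsed gun raises the number of electrons collected during the exposure.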
FIG. 14. Closed-type microchannel plate image converter with proximity focusing (FP, fiber plates; PC, photocathode; MCP, microchannel plate; SC, phosphor screen).
Image converters that incorporate microchannel plates as main amplifiers are distinguished by their high gain and compact construction, especially converters of the proximity-focusing type (Fig. 14). The microchannel plate is a continuous-dynode secondary electron multiplier, consisting of an array of about 10⁶ hollow conducting glass channels. Channel diameter and length are typically 10 µm and 0.5 to 1 mm. A survey of properties is given by Wiza (1979). The sealed image converter consists of a photocathode, one or several cascaded microchannel plates, and an output scintillator. The open converter lacks the photocathode. The open type is more versatile and economical, as it can be assembled from selected components, preserving full access to all parts. Gating can be accomplished in principle
by pulsing any of the voltages. One must, however, bear in mind that gating, apart from determining the sampling time, must serve two further purposes: to increase the gain beyond the stationary value and to protect the detector against overload during the intensive illumination. These goals, in particular the first, are reached only by pulsing the voltage across the microchannel plate.

FIG. 15. Output current versus electron input current (A) (Bostanjoglo et al., 1987c) of a microchannel plate (Varian VUW 8922, 40 x 0.5 mm) for pulsed (both voltage and input current) and conventional dc operation (shown in the insert as given by the manufacturer). The dashed line indicates the gain defined as the ratio of output to input.

FIG. 16. Gain of a pulsed microchannel plate as function of (a) the MCP voltage pulse amplitude (kV) and (b) the input current (A) (from Bostanjoglo et al., 1987c). Panel (b) is labelled: MCP voltage pulse 2.1 kV; 5, 50 ns; input current pulse length 1 µs.
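The effect of pulsing the plate voltage can be estimated with a minimal sketch. It assumes that the gain follows G ∝ exp(V/V0), takes V0 from the figures quoted in this section (roughly two orders of magnitude in gain between 1 kV and 2.1 kV), and assumes a Gaussian voltage pulse; the pulse shape, the function names, and the neglect of gain saturation at high input currents are assumptions made for illustration only.

```python
import math


def efolding_voltage(v_low_kv, v_high_kv, gain_ratio):
    """Voltage step V0 that multiplies the gain by e, assuming G ~ exp(V / V0)."""
    return (v_high_kv - v_low_kv) / math.log(gain_ratio)


def gate_fwhm_ns(pulse_fwhm_ns, v_peak_kv, v0_kv):
    """Full width at half maximum of the gain for a Gaussian voltage pulse."""
    # The gain falls to half its peak where the voltage has dropped by V0 * ln 2.
    relative_drop = v0_kv * math.log(2.0) / v_peak_kv
    if relative_drop >= 1.0:
        raise ValueError("gain never falls to half its peak value")
    # Gaussian pulse: V(t) = v_peak * exp(-4 ln2 * t^2 / T^2), T = pulse FWHM.
    t_half = pulse_fwhm_ns * math.sqrt(
        -math.log(1.0 - relative_drop) / (4.0 * math.log(2.0))
    )
    return 2.0 * t_half


v0 = efolding_voltage(1.0, 2.1, 100.0)      # values quoted in the text
gate = gate_fwhm_ns(50.0, 2.1, v0)          # assumed 50 ns Gaussian pulse
print(f"V0 = {v0:.3f} kV, effective gate width = {gate:.1f} ns")
```

Under these assumptions the effective exposure is considerably shorter than the voltage pulse itself, which is the reason a nonrectangular pulse can still define a short gate.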
As the gain of a microchannel plate depends exponentially on the voltage, a moderate increase of the latter has a tremendous effect on the gain. If short gating pulses are applied, the voltage amplitudes can exceed the safe maximum dc ratings very significantly, as was demonstrated by Bostanjoglo et al. (1987c). The gain can be safely increased by orders of magnitude. A channel plate with a maximum safe dc voltage and input current density of 1 kV and 0.1 nA/cm², respectively, can be operated for at least 50 ns at 2.1 kV to amplify input currents of densities up to 0.1 µA/cm² (electron current pulse width 1 µs) without damage. Thereby the gain is increased by roughly two orders of magnitude above the stationary value (Fig. 15). The gain of a pulsed microchannel plate depends exponentially on the exciting voltage pulse amplitude, even at high input current densities, although it gradually decreases with growing input (Fig. 16). The exponential dependence of the gain on the voltage is additionally exploited to realize short exposure times with readily produced nonrectangular exciting voltage pulses (Fig. 17).

These advantages of pulsed channel plates cannot be provided by sealed image converters for the following reasons. The capacitance between photocathode and microchannel plate is at least one order of magnitude smaller than the capacitance of the latter. Furthermore, the thin-film electrodes of the channel plate are not a good electric shield. Accordingly, when the voltage pulse is applied across the channel plate, a significant fraction is capacitively coupled to the photocathode. This causes arcing, with permanent damage to the voltage stability of the gap and to the photocathode layer. Therefore, the open channel-plate image converter (Bostanjoglo et al., 1987a, 1987c) was used as gated detector in short-exposure transmission microscopy, as described in Section IV.D.b.

A certain disadvantage of the open image converter must be remembered, however. Due to the decreasing secondary electron emission with increasing energy of the impinging electron (Fig. 2), only about 25% of the incoming high-energy electrons are registered.

C. Periodic Processes and Stroboscopic Electron Microscopy
The main applications of stroboscopic microscopy have been investigations of periodic changes induced by high-frequency magnetic fields in ferromagnetic films and of electric potentials in semiconductor devices. Electron-beam stroboscopy is of particular technical interest as a fast and nonloading voltage probe for testing large-scale integrated circuits.

1. Applications to Magnetic Reversal
FIG. 17. Generation of short exposure times with a microchannel plate, based on the exponential dependence of the gain on the voltage. (a) Exciting voltage and (b) output current. (From R. P. Tornow, PhD thesis.)

The macroscopic distribution of the magnetization vector in thin magnetic films is conveniently visualized in transmission microscopy by defocused imaging (Fig. 5). Domain walls, their substructures, and fluctuations of the magnetization direction within the domains show up as bright and dark lines. The mean direction of the local magnetization vector within the domains may be determined with the “right-hand rule” of the Lorentz force. Applying a magnetic field H to a magnetic structure sets the local magnetization vectors M into rotation due to the torque M × H_e, where the effective field H_e is the sum of the applied field and the interactions due to exchange, anisotropy, demagnetization, and magnetostriction. As the magnetization is magnetostrictively coupled to the lattice, it dissipates its precession energy by spin-phonon interactions and finally aligns with the effective field. The temporal change dM/dt of the local magnetization M is given by the gyroscopic equation of Landau and Lifshitz (1935)

\[ \frac{d\mathbf{M}}{dt} = \gamma\,\mathbf{H}_e \times \mathbf{M} - \frac{\beta\gamma}{M_s}\,\mathbf{M} \times (\mathbf{M} \times \mathbf{H}_e), \tag{9} \]
with γ = μ₀e/2m the gyromagnetic ratio, μ₀ the vacuum permeability, e and m the charge and mass of the electron, M_s the saturation magnetization, and β an empirical damping constant. The two terms on the right-hand side describe two fundamentally different modes of response of a magnetic structure to an applied field. The first term gives an inertia-type response consisting of precession around the effective field, whereas the second one comprises relaxation into the new equilibrium.

A magnetic structure frequently consists of various domains separated by domain walls, with a gradual change of the local magnetization direction. Such a magnetic structure can respond to an applied field simply by displacing the domain walls. Thereby the magnetization within the wall rotates, generating a stray field energy that is proportional to the square of the propagation velocity. In analogy to the kinetic energy of a true mass, an inertial mass m can be ascribed to the moving domain wall and, in general, to any propagating structure containing a gradient ∂θ/∂z of the magnetization angle θ. As was shown by Becker (1951) and Konishi et al. (1975), a 180° Bloch wall in a thin film [Fig. 18(a)] has a mass m per wall area such that
which is caused by the first term in Eq. (9). Here N is the in-plane demagnetization factor of the wall and the z-axis is the in-plane normal to the wall. The second expression in Eq. (9) is a damping term, which shows that the magnetization relaxes within a time τ = 1/(βγH_e) into equilibrium. Values typical for many low coercivity materials are β = 0.001 to 0.1 and H_e =
10 to 100 A/cm. They give τ = 10 to 1000 ns as relaxation times for the motion of the magnetization vector.

FIG. 18. 180° magnetic domain wall in a thin ferromagnetic film with in-plane magnetization. (a) Bloch wall and (b) crosstie wall with Bloch lines and crossties (the local magnetization is indicated by arrows).

The magnetic structure can be quite complicated [Fig. 18(b)], so that magnetic reversal comprises various mechanisms, such as domain rotation, displacement of walls, and displacement of wall substructures. The different mechanisms proceed with different relaxation times and can therefore be observed individually, using exciting fields that change on different time scales. As complete decoupling is not feasible, transient stray fields build up, causing nonlinearities and irreversibilities. Using stroboscopic transmission microscopy in the defocused mode, Golubkov et al. (1979) and Bostanjoglo and Rosin (1980, 1981a, 1981b) investigated the different fast magnetic reversal effects in thin films. If a magnetic field is applied along a 180° domain wall, the latter tends to move in such a way that the domain with the favored magnetization grows. If the wall is fixed at two points, e.g., by film inhomogeneities causing a higher local coercivity, the wall will bow. A sinusoidal field H = H₀ sin ωt will cause forced vibrations of the wall. Figure 19 shows the maximum excursion of a domain wall at three different exciting frequencies ω. The amplitude H₀ of the exciting sinusoidal magnetic field was kept constant. The dependence of the vibration amplitude on the frequency is not monotonic, but has a resonance at 8 MHz. The resonance curve, showing the dependence in detail, is given in
Fig. 20.

FIG. 19. Stroboscopic defocused electron images of forced vibrations of a crosstie wall (Bostanjoglo and Rosin, 1980a, 1980b, 1981a, 1981b) excited by a magnetic sine field H₀ sin ωt along the wall (H₀ = 190 A/m). The wall was imaged at its two maximum excursions and the negative prints were copied on top of one another to demonstrate the vibration amplitude at different frequencies ω/2π. (a) 1 MHz; (b) 7 MHz; (c) 18 MHz.

FIG. 20. Amplitude x₀ of forced vibrations of the crosstie wall shown in Fig. 19 versus frequency ω/2π (MHz) of the exciting magnetic field H₀ sin ωt (directed along the wall). The asymmetric shape near the resonance is typical for anharmonic vibrations.

Starting at zero frequency, the amplitude x₀ of the forced oscillation gradually increases with growing frequency. At about 10 MHz, the wall executes large random jumps and a resonance collapse occurs, resulting in a new domain structure. If, on the other hand, the resonance curve is swept coming from the high-frequency region, the amplitude at first grows steeply with decreasing frequency. Then, below about 10 MHz, the amplitude decreases again. Obviously, resonance is passed in this direction without a collapse. Such an asymmetric resonance curve is typical for anharmonic oscillators with an additional cubic restoring force mAω₀²x³ in the equation of motion

\[ m\frac{d^2x}{dt^2} + \frac{m}{\tau}\frac{dx}{dt} + m\omega_0^2\,(1 + A x^2)\,x = 2 M_s H_0 \sin\omega t. \tag{11} \]
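The asymmetric resonance curve of such an anharmonic oscillator can be illustrated with a short sketch. It is not taken from the cited work: it divides the equation of motion by m, inserts a single-harmonic trial solution of amplitude a (harmonic balance, in which x³ contributes (3/4)a³ at the driving frequency), and solves the resulting amplitude equation. Here `damping` stands for 1/τ, `drive` for 2M_sH₀/m, and `anharm` for A; the units are normalized so that ω₀ = 1, and all parameter values are hypothetical.

```python
import math


def steady_amplitudes(omega, omega0=1.0, damping=0.1, anharm=1.0,
                      drive=0.2, n_grid=4000):
    """All steady-state amplitudes at angular frequency omega > 0.

    Solves ((omega0^2 - omega^2 + k*y)^2 + (damping*omega)^2) * y = drive^2
    for y = amplitude^2, with k = 0.75 * omega0^2 * anharm, by bracketing
    sign changes on a grid and refining each bracket by bisection.
    """
    delta = omega0 ** 2 - omega ** 2
    k = 0.75 * omega0 ** 2 * anharm
    c2 = (damping * omega) ** 2

    def g(y):
        return ((delta + k * y) ** 2 + c2) * y - drive ** 2

    y_max = drive ** 2 / c2      # g(y) >= c2*y - drive^2, so no root lies above
    roots = []
    y_lo, g_lo = 0.0, g(0.0)
    for i in range(1, n_grid + 1):
        y_hi = y_max * i / n_grid
        g_hi = g(y_hi)
        if (g_lo < 0.0) != (g_hi < 0.0):
            a, b = y_lo, y_hi
            for _ in range(60):
                mid = 0.5 * (a + b)
                if (g(mid) < 0.0) == (g_lo < 0.0):
                    a = mid
                else:
                    b = mid
            roots.append(math.sqrt(0.5 * (a + b)))
        y_lo, g_lo = y_hi, g_hi
    return roots


# Sweep the driving frequency: three coexisting amplitudes mark the range in
# which the oscillator can jump between a large and a small vibration state.
for step in range(8, 19):
    w = 0.1 * step
    branches = steady_amplitudes(w)
    print(f"omega/omega0 = {w:.1f}: " + ", ".join(f"{a:.3f}" for a in branches))
```

For A > 0 the multivalued range lies above ω₀, so the response depends on the sweep direction, in qualitative agreement with the different behavior observed for increasing and decreasing frequency. A very narrow multivalued range may be missed by the finite grid.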
Here x, ω₀, m, and τ are displacement, eigenfrequency, mass per area, and relaxation time of the wall; M_s is the saturation magnetization; and A > 0 is an anharmonicity constant. The right-hand side of Eq. (11) is the exciting force exerted by the applied field. The anharmonicity term mAω₀²x³ is due to stray fields generated within the domains near the displaced wall. Values m ≈ 4.10-’ kg/m² and τ ≈ 30 ns can directly be inferred for the mass and relaxation time of the wall that is shown in Fig. 19.

Bloch lines within the domain walls are the one-dimensional analogue to Bloch walls. They constitute the boundary between two antiparallel magnetized parts of the domain wall [Fig. 18(b)]. Just like domain walls, they can respond as an entirety to a time-varying magnetic field. If a step-like magnetic field with a rise time exceeding the spin-spin relaxation time (≈ 4 ns) is applied perpendicular to the wall, the Bloch line responds by damped oscillations
about its new equilibrium (Fig. 21). At small amplitudes of the Bloch line oscillations, the neighboring crossties remain fixed. If, however, the Bloch line approaches a crosstie beyond a critical distance, the latter is at first repelled and then the pair is annihilated. If the rise time of the magnetic field is shorter than the spin-spin relaxation time, the magnetization vectors cannot respond coherently. The domains on both sides of the wall break up into subdomains, creating stray fields that, in turn, force the domain wall to subdivide into numerous transient Bloch lines and crossties. These are gradually annihilated within the first 10 ns of such an incoherent switching due to a steep step-type magnetic field.

The stroboscopic pictures demonstrate that magnetic switching comprises reversible processes, which can be imaged, and irreversible changes, which can result in resonance collapses of the original magnetic structure. Substructures, consisting of gradients of the magnetization direction, can move as entireties, in close analogy to particles in mechanics with an associated inertial mass and friction. What these entireties, which perform an individual motion, actually are depends on the relation of the rise and fall times of the exciting magnetic fields to the spin-spin relaxation time.

FIG. 21. Stroboscopic defocused electron images of the ringing response of a Bloch line on a magnetic wall, being excited by a rectangular magnetic field pulse H = 110 A/m (Bostanjoglo and Rosin, 1980a, 1980b, 1981a, 1981b). The field has a duration of 90 ns, rise/fall times are 5 ns, and the repetition rate is 5 MHz. The sampling electron pulse width is 4 ns. Imaging points of time, as related to the leading edge of the magnetic field pulse, are indicated in the upper left corners.

2. Application to Time-Varying Electric Fields: Electron Beam Testing
The complexity of integrated circuits is continuously growing. In order to increase speed, the size of individual gates is steadily reduced and, simultaneously, the scale of integration is increased. Classical testing with mechanical probes is growing more and more prohibitive because of oversized capacitive
and mechanical influence on micrometer and submicrometer devices. Electron and light beams have been successfully introduced as alternative probes, being non-mechanical, fast, and of low loading. As there are several articles reviewing electron beam testing by Feuerbaum (1983), Menzel and Kubalek (1983), and Wolfgang (1986), only a brief survey will be given here.

In the early days of electron beam testing, conventional emission (Spivak et al., 1966) and mirror electron microscopy (Gvosdover et al., 1970; Szentesi, 1972) were used in the stroboscopic mode. Rapid field changes in p-n junctions and propagation of surface acoustic waves on piezoelectric devices were successfully visualized up to 100 MHz. However, as periodic processes can be equally well analyzed with scanning microscopy, the latter was soon preferred because of its significantly larger variety of image signals and better resolution. The main interest in beam testing is directed toward the local electric potentials in large-scale integrated circuits and their time dependence. The basis of present fast beam testing modes is the “voltage contrast” effect of secondary and photoemitted electrons.

a. Voltage Contrast. Signals that depend on specimen voltage are Auger electrons, secondary electrons, and photoelectrons. Auger electrons have the advantage that topographical and material contrast can be avoided. Since, however, their yield is several orders of magnitude lower than that of secondary electrons and photoelectrons, the latter are preferred. The number of the secondary electrons or photoelectrons generated by the primary beam and collected by an electron detector is determined by the local electric fields on the specimen. Regions with a positive voltage attract a portion of the emitted electrons with low kinetic energy back to the specimen. These electrons are missing in the image signal, and, consequently, areas with positive potential will show up as dark.
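The darkening of positively biased regions can be illustrated with a minimal sketch. It assumes, as a model not taken from the text, that the secondary electron spectrum has the form N(E) ∝ E/(E + W)⁴ with a surface barrier parameter W, and that every electron emitted with an energy below eU_sp returns to a region at the positive voltage U_sp. The value W = 4.5 eV and the function name are hypothetical, and local-field geometry and detector extraction fields are ignored.

```python
def escaping_fraction(u_specimen_v, w_ev=4.5):
    """Fraction of secondaries that escape from a region at voltage u_specimen_v.

    Closed-form integral of N(E) ~ E / (E + W)^4 from the barrier e*U to
    infinity, normalized to the full spectrum (model assumption).
    """
    e0 = max(0.0, u_specimen_v)      # only a positive voltage retains electrons
    return w_ev ** 2 * (3.0 * e0 + w_ev) / (e0 + w_ev) ** 3


for u in (0.0, 1.0, 2.0, 5.0, 10.0):
    print(f"U_sp = {u:4.1f} V -> collected fraction = {escaping_fraction(u):.3f}")
```

Equal voltage steps produce unequal changes of the collected fraction in this model, so the raw signal is suited only for qualitative imaging.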
However, this voltage contrast signal depends in a nonlinear way on the specimen voltage, due to the peculiar energy distribution N(E) of the secondary electrons (Fig. 22). It can therefore be used only for qualitative investigations. The basis of quantitative voltage contrast is the linear shift of the energy spectrum of the secondary electrons with the applied specimen voltage (Fig. 23). The energy spectrum can be analyzed differentially, using the linear shift of the peak of the spectrum with applied voltage as the image signal. More commonly, the shifted spectrum is registered integrally with a retarding field analyzer (Fig. 24). The current I collected at a fixed retarding voltage -|U_G0| is a nonlinear function of the specimen voltage U_sp. A linear signal can, however, be generated by installing a feedback, which keeps the collector current I constant by varying the retarding potential -|U_G| [Fig. 23(b)]. The change ΔU_G of the retarding voltage for keeping the emission current from the
specimen constant when the specimen potential changes by ΔU_sp is given by Δ|U_G| = -ΔU_sp. Therefore, this change in the retarding voltage can be used as an imaging signal for quantitative measurements.

FIG. 22. Energy distribution N(E) of the ejected electrons at different potentials -|U_sp|, 0, and +|U_sp| of the specimen. U_extr is the extraction voltage applied to the detector; E is the energy of an electron reaching the detector.

FIG. 23. Shift of the energy spectrum N(E) of ejected electrons by a specimen potential U_sp (a) and electron current I (b), as detected by a retarding field analyzer at different retarding voltages U_G.

FIG. 24. Scheme of a retarding field energy analyzer with hemispherical electrodes (labelled parts: probing beam, specimen, retarding grid). The probing beam may be either an electron or laser beam, ejecting secondary or photoelectrons of energy E, respectively.

In order to measure the potential distribution on large-scale integrated circuits quantitatively, which is a goal of highest technical importance, several conditions must be fulfilled. The interaction of local electric fields at the surface of the circuits must be reduced, and topographical and material contrast must be suppressed. Local electric fields, generated by a difference of voltage at neighboring devices on the integrated circuit, block the emission of low-energy electrons and alter the angular distribution of all emitted electrons. The influence of local fields on the energy spectrum can be effectively reduced by applying a strong extracting field at the specimen. Using favorably curved extracting and retarding field electrodes, the energy analyzer can be made insensitive to the angular distribution of the emitted electrons.

A highly advanced spectrometer for electron beam testing, which solves the problem of local fields, has been developed by Plies and Schweizer (1987). It is schematically shown in Fig. 25. The spectrometer is optimized to achieve low aberration focusing of the primary electrons onto the specimen, and it minimizes the influence of local electric fields on the secondary electrons. It consists of a combination of a probe-deflecting system, a retarding-field spectrometer, and a microchannel plate detector. Primary electrons of low energy (≈ 700 eV) are used in order to avoid electrical charging by working near the E , point (Fig. 2) and to keep radiation damage low. A low chromatic aberration constant is achieved using a short focal length, which resulted in a probe diameter of 0.15 µm at a current of 0.5 nA.
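Returning to the linearizing feedback of the retarding-field analyzer, the relation Δ|U_G| = -ΔU_sp can be checked with a minimal numerical sketch. It assumes that only electrons emitted with an energy above e(U_sp + |U_G|) reach the collector and, as a model not taken from the text, that the emitted spectrum has the form N(E) ∝ E/(E + W)⁴. The set point, the value W = 4.5 eV, and the function names are hypothetical.

```python
def transmitted_fraction(u_sp, u_ret, w_ev=4.5):
    """Fraction of the emitted spectrum passing the retarding grid.

    u_sp is the specimen voltage and u_ret the magnitude |U_G| of the
    retarding voltage (both in volts); the threshold emission energy is
    assumed to be e * (u_sp + u_ret).
    """
    e0 = max(0.0, u_sp + u_ret)
    return w_ev ** 2 * (3.0 * e0 + w_ev) / (e0 + w_ev) ** 3


def feedback_retarding_voltage(u_sp, setpoint=0.5, u_max=50.0):
    """Retarding voltage magnitude that holds the collected fraction at setpoint."""
    lo, hi = 0.0, u_max
    for _ in range(60):                       # bisection plays the role of the loop
        mid = 0.5 * (lo + hi)
        if transmitted_fraction(u_sp, mid) > setpoint:
            lo = mid                          # too much current: retard more
        else:
            hi = mid
    return 0.5 * (lo + hi)


reference = feedback_retarding_voltage(0.0)
for u_sp in (-2.0, 0.0, 1.0, 3.0):
    shift = feedback_retarding_voltage(u_sp) - reference
    print(f"U_sp = {u_sp:+.1f} V -> change of |U_G| = {shift:+.3f} V")
```

The change of the retarding voltage mirrors the specimen voltage regardless of the spectrum shape chosen, because the feedback only tracks the rigid shift of the spectrum.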
The secondary electrons are extracted with a high voltage U_E = 2 kV applied to a planar extraction grid, furnishing an extraction field of 1 kV/mm, which is far superior to local fields at the specimen surface. The high extraction voltage also helps to focus the secondary electrons, which have a finite energy spread ΔE ≈ 10 eV, as the chromatic aberration of the focusing magnetic lens scales with ΔE/eU_E [see Eq. (6)]. The focus of the secondary electrons is nearly outside the magnetic field and within a region of constant electrical potential. Accordingly, the secondary electrons travel on straight lines passing their crossover. The latter is made to coincide with the center of the spherical retarding field, so that all secondary electrons move parallel to the electric field within the retarding field region. The spectrometer is therefore largely independent of the angular distribution of the secondary electrons. Those overcoming the retarding field are amplified by the microchannel plate detector.

FIG. 25. Retarding-field-spectrometer objective for electron beam testing according to Plies and Schweizer (1987) with a channel-plate detector and electromagnetic probe-forming lens (U_D, U_E, and U_R are deflection, extraction, and retarding voltages, respectively).

Topographical contrast arises as the secondary electron yield varies with the angle φ of incidence of the primary beam with respect to the surface normal as 1/cos φ [Eq. (2)]. With increasing inclination φ, the yield increases and the energy spectrum is displaced towards smaller energies (Koshikawa and Shimizu, 1973), thus causing faulty measurements.

Material contrast is caused by the dependence of the secondary electron yield on atomic number, density, and work function. Even equal materials of a specimen can give different yields due to adsorbed gases or contamination by beam-cracked hydrocarbons from the vacuum system. The influence of these effects can be eliminated by electronic processing of the signal. Two signals are obtained: one contains topographical and material contrast plus voltage information and the other is taken at zero voltage. As the first two contrast mechanisms are independent of the specimen voltage, they are eliminated by subtracting the two signals.

b. Electron Beam Testing Modes. The aim of beam testing is to provide visual information on the spatial distribution of time-periodic voltage signals
within the device under study in the time and frequency domain. There are several complementary testing methods, all exploiting the voltage contrast but processing the signal in very different ways (Wolfgang, 1986; Brust and Fox, 1985).

Waveform Sampling. This mode is particularly suited for verification of the performance of large-scale integrated circuits, where the time dependence of the potential at particular test points is wanted. The probing beam is focused onto the selected area and pulsed in synchronism with the frequency of the voltage. The emitted electrons carry information on the voltage value at the pulsing time. By sweeping the phase of the beam pulser through 360°, the complete waveform is obtained. Waveforms with rise times down to a few picoseconds were resolved at working frequencies up to several gigahertz (Fujioka and Ura, 1981; Hosokawa et al., 1978). Figure 26 shows an example of the waveform recording mode that was applied to study the motion of the high-field domain in a Gunn device. The waveform was measured at successive points A to E along the active region of the device. The anode was biased slightly below the threshold voltage of the Gunn effect and a 1 GHz rf triggering voltage was added. Each time
the threshold voltage was exceeded, a high-field domain nucleated near the cathode and started to drift towards the anode. The arrival of the domain at a test point is signaled by a drop in the potential. A propagation velocity of 10⁵ m/s is deduced.

FIG. 26. Waveform (a) sampled at points A to E in the active region of a Gunn diode, shown in (b); time scale 100 ps/div (from Fujioka and Ura, 1981).

FIG. 27. Propagation of the high-field domain in a Gunn diode, visualized by stroboscopic scanning microscopy in the voltage contrast mode. The imaging points of time after triggering of the domain motion are shown in the upper right corners. (From Hosokawa, Fujioka, and Ura, 1978.)

Stroboscopic Imaging. If an image of the spatial distribution of a voltage with a particular frequency is wanted at particular points of time, the probing beam is pulsed with this frequency and, in addition, scanned across the device, keeping the pulsing phase constant. Figures 27 and 28 give examples of such stroboscopic images of time-varying voltage distributions. Figure 27 shows, as an example of processes on the picosecond time scale, the propagation of the high-field domain in a Gunn device. Slower variations of electric potentials, proceeding on the nanosecond time scale but in a device of significantly higher complexity, are shown in Fig. 28.

Frequency Tracing. Frequency tracing is used to visualize all points of a device carrying a voltage of a wanted frequency ω₀. The probing beam scans the device and ejects electrons, which are exploited as an image signal. Since their current depends on the local voltage, it is modulated by the local frequency. The signal is passed through a narrow bandpass filter centered on
ω₀ before reaching the brightness control of the CRT of the scanning microscope. Thus, only those device points appear bright that carry a potential of the wanted frequency.

FIG. 28. Voltage distribution in a switching NAND gate at different points of time, indicated in the lower left corners, as visualized by stroboscopic SEM in the voltage contrast mode. (From Menzel and Kubalek (1983), reprinted with permission of the publisher, FACM, Inc. The JBI Building, Box 832, Mahwah, N. J. 07430, USA.)

In order to circumvent limitations due to the bandwidth of the detector system, which may be significantly smaller than the wanted frequency ω₀, the heterodyne technique is applied. The probing beam is pulsed with a convenient, fixed frequency ω₀ + ω_i. The ejected electron signal current is proportional to the pulsing beam intensity and, in addition, depends on the local voltage with frequency ω₀. Accordingly, the ejection process mixes the frequencies ω₀ + ω_i and ω₀. The signal, collected by the energy-analyzing electron detector, contains, among other things, a Fourier component of the difference frequency (ω₀ + ω_i) - ω₀ = ω_i, i.e., the intermediate frequency, whenever a voltage with the wanted frequency ω₀ is present at the object points. This component is extracted with a bandpass filter. The intermediate frequency is selected to fall well within the bandwidth of the detector system. Limits are now set not by the detector but by the bandwidth of the beam pulser, which must exceed ω₀ + ω_i. This, however, is much more easily realized than increasing the bandwidth of the detector.
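The mixing step of the heterodyne technique can be reproduced with a minimal sketch. It models the beam intensity as a sinusoidal modulation at ω₀ + ω_i (a simplification of a real pulse train) and the ejected signal as that intensity multiplied by a yield that varies weakly with the local voltage. The function names, the modulation depth, and the frequencies, given in cycles per record, are hypothetical.

```python
import cmath
import math


def fourier_amplitude(samples, freq):
    """Amplitude of the component at an integer frequency (cycles per record)."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * freq * i / n)
              for i, s in enumerate(samples))
    return 2.0 * abs(acc) / n


def detected_signal(f_pulse, f_voltage, modulation=0.2, n=4096):
    """Beam intensity modulated at f_pulse times a voltage-dependent yield."""
    out = []
    for i in range(n):
        t = i / n
        beam = 0.5 * (1.0 + math.cos(2.0 * math.pi * f_pulse * t))
        yield_factor = 1.0 + modulation * math.cos(2.0 * math.pi * f_voltage * t)
        out.append(beam * yield_factor)
    return out


F0, FI = 100, 3      # wanted frequency and intermediate frequency (assumed)
present = fourier_amplitude(detected_signal(F0 + FI, F0), FI)
absent = fourier_amplitude(detected_signal(F0 + FI, 80), FI)
print(f"intermediate-frequency amplitude, wanted frequency present: {present:.4f}")
print(f"intermediate-frequency amplitude, other frequency present:  {absent:.2e}")
```

Only a point carrying the wanted frequency produces a component at the intermediate frequency, which is why a slow detector followed by a bandpass filter is sufficient.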
Frequency Mapping. If the voltage frequency ω₀ is unknown, it can be determined by the frequency mapping mode. In this case, the probing beam scans the device along only one line. Simultaneously, a spectral analysis of the voltage contrast signal is carried out and displayed on the CRT of the scanning microscope. The y-axis coincides with the line scanned and the x-axis corresponds to the frequency axis. Frequency mapping and tracing function without synchronization with the device under study. They can be applied to asynchronously driven circuits, e.g., free-running oscillators.

Logic State Tracing. This mode was introduced by Brust and Fox (1985) to trace a specific periodic voltage, e.g., a particular bit pattern, in a circuit. The probing beam scans the device and ejects electrons, which carry information on the local time-varying voltage, consisting of locally different bit patterns. This signal is fed to a correlator, which compares it to a particular stored bit pattern, coinciding with the wanted signal. The output of the correlator controls the CRT brightness of the scanning microscope. The stored reference signal can either be derived from a pattern generator or from a preceding measurement by the probing beam itself. Figure 29 shows an example of logic-state tracing in a clocked integrated circuit.
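The correlation step of logic-state tracing can be sketched as follows. This is an illustration under simplifying assumptions, not the correlator of Brust and Fox: the measured signals are taken to be already converted to voltage levels and aligned in phase with the stored pattern, a normalized correlation coefficient is used as the similarity measure, and the bit patterns, threshold, and function names are hypothetical.

```python
import math


def correlation(signal, reference):
    """Normalized correlation coefficient between a signal and a reference."""
    n = len(reference)
    mean_s = sum(signal) / n
    mean_r = sum(reference) / n
    cov = sum((s - mean_s) * (r - mean_r) for s, r in zip(signal, reference))
    var_s = sum((s - mean_s) ** 2 for s in signal)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_s == 0.0 or var_r == 0.0:
        return 0.0                      # a constant signal carries no pattern
    return cov / math.sqrt(var_s * var_r)


def brightness(signal, reference, threshold=0.8):
    """CRT brightness: bright only where the wanted bit pattern is found."""
    return 1.0 if correlation(signal, reference) >= threshold else 0.0


stored_pattern = [1, 0, 1, 1, 0, 0, 1, 0]
node_a = [0.9, 0.1, 0.8, 0.9, 0.2, 0.1, 0.9, 0.2]   # noisy copy of the pattern
node_b = [0, 1, 0, 1, 0, 1, 0, 1]                   # unrelated clock line

for name, node in (("A", node_a), ("B", node_b)):
    print(name, round(correlation(node, stored_pattern), 3),
          brightness(node, stored_pattern))
```

Scanning the beam and evaluating the brightness at every picture element would yield an image in which only interconnections carrying the stored pattern appear bright.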
c. Resolution of Electron Beam Testing. An important question is the resolution, which is threefold: spatial, temporal, and voltage resolution.

Spatial Resolution. Spatial resolution is limited by the usual parameters of scanning microscopy, such as the spot size of the probe on the specimen and the exit volume of the secondary electrons, as discussed in Section III.B. In addition, further degradation of resolution is caused by beam pulsing and by aberrations due to the extraction field of the detector. Degradation of the spot size due to electron beam pulsing depends on the pulsing technique. It can be made negligibly small in the case of a properly designed beam blanking system based on a capacitor deflector and chopping aperture (Menzel and Kubalek, 1979). High extraction fields of the electron detector,
FIG. 29. Logic-state tracing of an integrated circuit. (a) Conventional SEM image of several interconnections of the circuit, produced with secondary electrons. (b) Image of only those two interconnections that carry the wanted signal [marked by arrows in (a)]. (From Brust and Fox, 1985.)
being inevitable for breaking down microfields at the specimen surface, introduce only defocusing if the electrodes are planar. This is easily compensated with the probe-forming lens. Nonplanar electrodes additionally cause astigmatism, which must be corrected by an additional stigmator. A resolution of about 0.1 μm is achievable in the voltage contrast mode with advanced detectors (Plies and Schweizer, 1987).

Time Resolution of Voltage Measurements. The time resolution is determined in principle by the generation time and the dispersion of the time of flight of the secondary electrons, by the bandwidth of the detector, and, in the stroboscopic mode, by the exciting beam pulse width. Secondary electron emission occurs within s, thus allowing voltage measurements up to the terahertz range. Photoelectron emission with light of frequency $f$ occurs within a time $1/f$, according to the uncertainty relation, and is even less limiting. The time of flight of a secondary electron of energy $E$ is
where $e$ and $m$ are the charge and mass of the electron, $U_e$ is the extraction voltage, $s$ is the distance between specimen and detector, and $E$ is the energy of the secondary electrons. Due to the energy spread $\Delta E$ of the latter, there is a dispersion,
$$\Delta t_f \approx s\,\sqrt{\frac{m}{2E}}\;\frac{\Delta E}{eU_e},$$
of the time of flight, which limits the maximum voltage frequency $f_{max}$ to a value $f_{max} \approx 1/(2\Delta t_f)$. The time dispersion is reduced by keeping $s$ as small as possible and the extraction voltage $U_e$ as high as possible. With $\Delta E \approx E \approx 10$ eV, $U_e \approx 2$ kV, and $s \approx 2$ mm, maximum frequencies of several gigahertz are attainable.

The bandwidth of the detecting system is of major importance in real-time measurements. It is determined primarily by the maximum operating frequency of the linearizing feedback system used in the quantitative voltage mode (Feuerbaum, 1983), which is about 300 kHz. The rise times of the electron detectors (plastic scintillators and photomultiplier tubes or microchannel plates) are on the order of 1 ns and can be neglected here. The limitation due to the small bandwidth of the feedback system is overcome by the stroboscopic or sampling technique. In both modes, the generated secondary or photoelectron signal is averaged over a long time $t$ comprising many cycles of the periodic voltage. The bandwidth $\Delta f \approx 1/t$ may be chosen conveniently small to eliminate noise without degrading the signal. Time resolution is now given by the exciting pulse width. A resolution in the picosecond and subpicosecond regime has been attained at fixed frequencies of several gigahertz (Fujioka and Ura, 1981; Gopinath and Hill, 1973, 1977; Hosokawa et al., 1978), and a resolution of several picoseconds was achieved at variable frequencies up to several hundred megahertz or even gigahertz (Bokor et al., 1986; Feuerbaum and Otto, 1978; Marcus et al., 1986; May et al., 1987; Menzel, 1981; Menzel and Kubalek, 1979; Weiner et al., 1987).

Voltage Resolution. The minimum detectable voltage $U_{min}$ at the specimen is limited by the noise of the secondary current $I_{SE}$. This current is derived by amplification of the primary current $I_p$ with an average yield $\delta$, giving $I_{SE} = T\delta I_p$, where $T$ is the transmission of the spectrometer grids.
Part of the noise is therefore caused by shot noise in the primary current, with a mean power density $\overline{\Delta I_p^2}$, which is amplified, giving
$$T^2\delta^2\,\overline{\Delta I_p^2} = 2e\,\Delta f\,I_p T^2\delta^2, \qquad (15)$$
with $\Delta f$ the bandwidth of the detector. In addition, there are fluctuations of amplification, showing up as shot noise of the secondary current,
$$2e\,I_{SE}\,T\,\Delta f = 2e\,\delta\,I_p T^2\,\Delta f. \qquad (16)$$
The total mean square of the noise is then
$$\overline{\Delta I_{SE}^2} = 2e\,\Delta f\,I_p T^2\,\delta\,(1+\delta). \qquad (17)$$
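The noise budget of Eqs. (15)-(17) is easy to check numerically. The following is a minimal sketch, not part of the original treatment: it reads the three expressions as $2e\,\Delta f\,I_pT^2\delta^2$, $2e\,\delta\,I_pT^2\Delta f$, and $2e\,\Delta f\,I_pT^2\delta(1+\delta)$, the function name is hypothetical, and the operating point (1 nA primary current, yield 1.2, grid transmission 0.5) is assumed purely for illustration; only the 300 kHz bandwidth echoes the feedback-loop figure quoted above.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in coulombs


def secondary_noise_terms(i_p, delta, transmission, bandwidth):
    """Return the mean-square noise terms of the secondary current in A^2.

    i_p          primary beam current in A
    delta        average secondary electron yield
    transmission transmission T of the spectrometer grids
    bandwidth    detector bandwidth in Hz
    """
    # Shot noise of the primary current, amplified by (T * delta)^2, Eq. (15).
    amplified_primary = 2 * E_CHARGE * bandwidth * i_p * transmission**2 * delta**2
    # Shot noise of the secondary current I_SE = T * delta * I_p, Eq. (16).
    i_se = transmission * delta * i_p
    secondary_shot = 2 * E_CHARGE * i_se * transmission * bandwidth
    # Closed form of the sum of both terms, Eq. (17).
    total = 2 * E_CHARGE * bandwidth * i_p * transmission**2 * delta * (1 + delta)
    return amplified_primary, secondary_shot, total


# Assumed, illustrative operating point (not values from the text).
primary, secondary, total = secondary_noise_terms(
    i_p=1e-9, delta=1.2, transmission=0.5, bandwidth=300e3
)
signal = 0.5 * 1.2 * 1e-9  # secondary current I_SE in A for the same values
print(f"rms noise current: {math.sqrt(total):.3e} A")
print(f"signal-to-noise ratio: {signal / math.sqrt(total):.1f}")
```

The two contributions add up to the closed form exactly, which is a quick way to confirm that the factor $\delta(1+\delta)$ has been read correctly.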
With a retarding field spectrometer detector, the minimum detectable voltage $U_{Sp\,min}$ is determined as follows. Operating the detector with a retarding voltage $U_G$ at a current level $I_{SE}$ results in a shift of the spectrum by $\Delta U_{Sp}$ when the specimen voltage is abruptly changed by $\Delta U_{Sp}$ (Fig. 30). This shift is accompanied by a change $\Delta I_{SE}$ of the emitted current. For this change of current to be detected, it should be at least three times the noise amplitude. With
one finally gets
A minimum detectable voltage of $U_{Sp\,min} \approx 0.1$ mV is expected, according to Eq. (19), under usual testing conditions, and an only slightly worse value of 0.5 mV was actually observed (Menzel, 1981).
d. Photoemission Sampling. Recently, time-resolved photoemission was introduced as an alternative to electron beam testing (Blacha et al., 1987; Bokor et al., 1986; Marcus et al., 1986; Weiner et al., 1987). Using focused visible or ultraviolet laser radiation, photoelectrons are ejected from the probed area by multiphoton or single-photon processes, respectively. Like secondary electrons, they carry information on the local electric potential and are processed in an analogous way. A resolution of 17 ps was achieved for potential variations of several tens of millivolts on the submicrometer scale.
FIG. 30. Change of ejected electron current $I_{SE}$ by a shift $\Delta U_{Sp}$ of the specimen potential $U_{Sp}$. (Axis annotations: $\Delta U_G$, $\Delta U_{Sp\,min}$.)
D. Fast Nonrepetitive Processes and Real-Time Electron Microscopy

Fast nonrecurrent processes can be caused either by triggering the collapse of a highly metastable phase or by rapidly transferring a stable phase far into a metastable region, from which the system relaxes, again by a fast process. In the first case, only an activation energy for nucleation of a more stable phase need be supplied. Once present, the latter can grow precipitously due to excess energy liberated in the collapse. The second process usually requires more energetic stimulating pulses, as the total transformation energy must be supplied. Both types of fast processes can be initiated, among others, by ion, electron, and laser beams. These stimulants have a great advantage over other possible stimulants in that they can be focused in space and time. Accordingly, the fast processes can be started at a predetermined point of time and in a localized area, and even high converted power densities with the associated unusual states of matter can be handled in the laboratory.

Using an electron microscope as the analyzing tool, it is straightforward to exploit the electron beam itself for initiating the fast process. However, laser beams have several advantages as compared to electron or ion beams. Very short and powerful pulses can be obtained much more easily with lasers. Even small-sized solid-state laser oscillators give pulses with a duration of 5-50 ns and power densities up to 10" W/cm², when Q-switched and focused by a single lens (Koechner, 1976). With some additional effort, using mode coupling and pulse slicing, single picosecond pulses with power densities up to $10^{12}$ W/cm² are readily produced. Commercial lasers provide radiation with photon energies ranging from the far infrared, e.g., 0.1 eV for the carbon dioxide laser, up to the ultraviolet at ≈5 eV for excimer lasers. The interaction of laser light with matter can therefore be selected to range from purely thermal to purely electronic.
Of course, at high pulse intensities the probability for multiphoton absorption rises sharply and electronic transitions are excited besides the thermal vibrations, even with low-energy quanta (details on multiphoton absorption rates are given by Tozer (1965), Keldish (1965), and Bebb and Gold (1966)). In addition, excitation may be extremely selective due to the narrow bandwidth of the laser light. Finally, targets can be treated by laser pulses in ambient atmospheres. There is no limitation to vacuum or low-pressure atmospheres, as with electron or ion beams. This fact greatly facilitates large-scale industrial laser machining.

There are also, however, problems with laser beams. In the first place, the beam energy is deposited very inhomogeneously in the target, according to Beer's exponential law. This results from the fact that light energy is deposited by statistical absorption of photons. Treatment, therefore, is confined to
surfaces in most cases. Secondly, special conditions must be met to treat highly reflecting or transparent materials effectively. Nevertheless, the advantages of laser beams prevail and there is a rich body of applications to material processing, comprising etching, ablation, deposition, and transformation (Appleton and Celler, 1982; Bauerle, 1984; Poate and Mayer, 1982).

As the electron microscope is an effective instrument for structure analysis and the laser is a powerful tool for treating matter, it is attractive to combine both techniques for investigations of fast processes. Thus far, two modes of electron microscopy have been adapted for real-time investigations, the transmission and reflection modes, suitable for thin films and surfaces of bulk materials, respectively.

1. Real-Time Techniques
Two distinctly different techniques are feasible for time-resolved investigations. In the continuous mode, the image signal is followed with a high time resolution over an extended period of time, giving the total history of the process. In the short-exposure mode, an image or diffraction pattern is registered with a very short exposure time, storing a selected intermediate state. Both techniques have been realized in transmission and reflection electron microscopes.

2. Real-Time Transmission Electron Microscopy

a. Continuous Mode. Figure 31 shows a commercial transmission electron microscope, as adapted by Bostanjoglo et al. (Bostanjoglo and Endruschat, 1984, 1985; Bostanjoglo et al., 1985, 1986) for investigations of laser-induced phase transformations. For this purpose, modifications were introduced at the electron illumination system, the specimen chamber, and the detector. The illuminating electron beam is pulsed using a deflecting capacitor and a chopping blade. The beam is deflected off the specimen in the quiescent state and switched on for only several microseconds. Pulsing is a necessary precaution to avoid radiation damage of the specimen and the detector.

A pulsed laser beam is used to trigger the fast processes to be studied by electron microscopy. A Q-switched, frequency-doubled Nd:YAG laser is used, producing green pulses of 532 nm wavelength with a duration of 20-30 ns (FWHM). The laser beam is first expanded and then focused with a laser objective lens and a dielectric mirror onto the thin-film specimen to a spot of about 15 μm in diameter. The mirror substrate, which should be polished to $\lambda/10$, is heavily doped silicon to avoid charging by the electron beam. The mirror has a central bore for transmission of the electron beam and can be tilted about two axes and translated in three directions for adjustment.
FIG. 31. Transmission electron microscope for the continuous-mode, time-resolved investigations of laser-pulse-induced phase transitions in thin films. (Schematic labels: electron gun, condenser lens, blanking system, dielectric mirror, specimen, objective lens aperture, bright-field image, scintillator, photomultiplier, phase transition.)
A green HeNe laser is employed for adjusting the light beam, whereby the specimen is viewed in the backscattered light with a telescope.

The experiment runs as follows. A master pulse generator switches the electron beam on. About 100 ns later, when the electron beam has safely centered on the specimen, the Q-modulating Pockels cell of the laser is switched off. A giant laser pulse is then emitted after about 150 ns, initiating the phase transition. During illumination with the electron beam, a bright-field image (or diffraction pattern) is generated by conventional imaging on the detector, which is a combination of a fast plastic scintillator and a photomultiplier tube, placed beneath the final viewing screen. The signal of the photomultiplier is stored by storage oscilloscopes. The rise time of the total detecting system is 3 ns. Two storage oscilloscopes are used in order to display the change of contrast due to the phase transformation at two widely differing time scales.

The triggering of the oscilloscopes deserves some consideration. Oscilloscope 1 covers the whole period of the electron beam illumination of 1-10 μs and is triggered at its beginning. A phase transition will show up as a step in the image intensity. Although it will be detected, its rise time may not be resolved.
As the transition may occur with some unpredictable delay after the initiating laser pulse, the recording oscilloscope should be triggered by the transition itself for high resolution. Because of the heavy shot noise, the shaping of a trigger pulse from the image signal is somewhat involved. In the simplest case, if only one phase transition is present, the image signal is a double step with superimposed noise (Fig. 32). The first step is due to the switching on of the electron beam, and the second one is caused by the phase transition. The signal is fed to a trigger pulse generator (Bostanjoglo and Horinek, 1983), having a high-impedance input, and to the oscilloscopes via a low-loss 50 Ω transmission cable. The latter compensates for the propagation delay of the trigger pulse generator. A differentiating network at the input of the pulse shaper transforms the double-step signal into two single pulses plus differentiated shot noise. This is fed after amplification to an ultrafast ECL comparator, the reference voltage of which is adjusted slightly above the maximum noise pulses. Accordingly, the comparator delivers a well-shaped pulse at any change of the image intensity that exceeds the shot noise. A following type-D flip-flop
FIG. 32. Typical example of the continuous-mode, time-resolved TEM. (a) Final structure of laser-pulse-crystallized amorphous Si-Al layer. (b) Change of the bright-field image intensity due to crystallization within the circled area. The intensity levels of the amorphous and crystalline phase are denoted by a and c, respectively. LP is the laser pulse. The noise is due to shot noise in the electron image. (c) The crystallization step of (b) at a magnified time scale. (From Bostanjoglo et al., 1986.)
transforms the double pulse into a single, broad square pulse with a duration equal to the interval between switch-on of the electron beam and the phase transition. The trailing edge of this pulse coincides with the onset of the phase transition and is used to trigger oscilloscope 2, which displays the transition on a magnified time scale. If the phase transition consists of more than one step, any one of these can be readily selected to trigger the oscilloscope with a preset down-counter replacing the D flip-flop. Using laser pulses of higher power density, the phase transition is initiated with negligible delay, so that oscilloscope 2 can be triggered directly with a fast photodiode, which monitors the laser pulse shape. Of course, the setup in Fig. 31, omitting the laser, can be used to study fast phase transitions initiated by the illuminating electron beam itself (Bostanjoglo and Liedtke, 1980; Bostanjoglo and Schlotzhauer, 1981; Bostanjoglo, 1982, 1983; Bostanjoglo et al., 1982; Bostanjoglo and Hoffmann, 1982; Bostanjoglo and Horinek, 1983).

A variant of the detecting method described above was introduced by Takaoka et al. (1986). The image intensity is picked up simultaneously at three different points of the final image with three scintillator/photomultipliers. This technique can be applied if a phase front moves across macroscopic distances. The direction of propagation and the mean velocity are then determined with high accuracy.

b. Short-Time Exposure Imaging. Figure 33 shows the commercial transmission microscope of Fig. 31 adapted for short-exposure photography (Bostanjoglo et al., 1987a, 1987b, 1987c). Electron beam illumination and laser pulsing are as before, but now the image detector is a gated image converter, as described in Section IV.B. It consists of a voltage-pulsed microchannel plate and an output scintillator at a constant acceleration potential.
By pulsing the channel plate, a short-exposure, highly amplified image is displayed on the output scintillator, from which it is picked up by a television camera and transferred to a monitor and a memory for image processing. The electron beam pulser, the annealing laser, and the image converter are driven by a master pulse generator with appropriate pulse delays.

The complementary character of the information furnished by the two modes of real-time transmission microscopy is demonstrated in Fig. 34. It shows the disruption of a germanium film by an intense laser pulse. According to the oscillogram, the disruption starts quite abruptly after a delay of ≈120 ns after the laser pulse and clears the field of view of 0.5 μm diameter within 20 ns. The short-exposure micrograph, taken 140 ns after the laser pulse, shows that the film disintegrates in a very inhomogeneous way by breaking up into fragments.
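The trigger-shaping chain described above for the continuous mode (differentiating network, comparator with a reference just above the noise, and a flip-flop whose trailing edge starts oscilloscope 2) can be mimicked on a sampled signal. The sketch below is only an illustration of that logic under stated assumptions: the signal levels, the Gaussian noise standing in for shot noise, the threshold, and all function names are hypothetical, and the flip-flop is modeled as a simple toggle.

```python
import random


def make_double_step(n=2000, beam_on=200, transition=1200,
                     levels=(0.0, 1.0, 1.6), noise=0.03, seed=0):
    """Synthetic image signal: beam switch-on step, then phase-transition step."""
    rng = random.Random(seed)
    signal = []
    for i in range(n):
        if i < beam_on:
            level = levels[0]
        elif i < transition:
            level = levels[1]
        else:
            level = levels[2]
        # Gaussian noise is an assumed stand-in for the shot noise of the image.
        signal.append(level + rng.gauss(0.0, noise))
    return signal


def trigger_index(signal, threshold):
    """Return the sample index at which the flip-flop output falls again."""
    q = False          # flip-flop output
    prev_cmp = False   # previous comparator state
    for i in range(1, len(signal)):
        # Differentiating network followed by a comparator on the magnitude.
        cmp_out = abs(signal[i] - signal[i - 1]) > threshold
        if cmp_out and not prev_cmp:   # rising comparator edge clocks the flip-flop
            was_high = q
            q = not q
            if was_high and not q:     # trailing edge of the broad square pulse
                return i
        prev_cmp = cmp_out
    return None


samples = make_double_step()
fired_at = trigger_index(samples, threshold=0.3)
print("trigger at sample", fired_at)
```

With the threshold set well above the differentiated noise, the first comparator pulse (beam switch-on) raises the flip-flop output and the second (phase transition) lowers it, so the returned index marks the onset of the transition.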
3. Real-Time Reflection Electron Microscopy

Transmission electron microscopy is limited to thin-film specimens with thicknesses below ≈0.5 μm. In order to investigate fast processes on bulk
FIG. 33. Transmission electron microscope for short-exposure time imaging of laser-pulse-induced phase transitions in thin films. (Schematic labels: electron gun, condenser lens, master pulse generator, Nd:YAG laser (532 nm, 20 ns), blanking system, dielectric mirror, HeNe laser, telescope, exposure pulse generator, specimen, objective lens aperture, intermediate lens, projector lens, bright-field image, microchannel plate, scintillator.)
FIG. 34. Time-resolved transmission electron microscopy of the perforation of a thin film by a laser pulse. (a) Short-exposure time micrograph of a transient state during perforation, taken 140 ns after the laser pulse. Exposure time was 15 ns. (b) Change of bright-field image intensity within an area of 0.5 μm in diameter due to perforation, observed in the continuous mode (LP, laser pulse). The two complementary results were obtained on similar films with coinciding laser pulse energies.
FIG. 35. Time-resolving reflection electron microscope for investigations of laser-pulse-induced modifications of surfaces. (From Bostanjoglo and Heinricht, 1988.) (Schematic labels: laser-pulsed thermal electron gun, condenser lens, electron blanking unit, specimen (tiltable, translatable), objective lens aperture, intermediate lens, projector lens, MCP image converter.)
material, a time-resolving reflection electron microscope was assembled from components of a commercial transmission microscope (Bostanjoglo and Heinricht, 1988). The setup is shown schematically in Fig. 35. The electron optics consist of the conventional electromagnetic lenses, whereas the electron gun was modified for laser pulsing, as described in Section IV.A.4.b. It allows conventional constant and pulsed high-current electron illumination. An electron beam blanking unit is installed to cut out a short illuminating pulse. Laser pulsing of the specimen is as described before. The electron image of the surface, or the high-energy reflected electron diffraction pattern, is picked up by a closed-type channel-plate image converter, operated in the conventional constant-voltage mode. The electron illumination time is determined by the duration of the sliced electron pulse and is ≈20 ns. The two lasers and the beam blanking unit are driven by a master pulse generator.

Figure 36 shows as an example the short-exposure image of a crater, shot into the surface of a silicon crystal by a 25 ns laser pulse. At the laser power densities used, 90 MW/cm², the hydrodynamic processes are confined to almost the first hundred nanoseconds after the initiating laser pulse.
FIG. 36. A crater shot by a laser pulse (532 nm, 25 ns FWHM, 170 μJ, 90 MW/cm²) into the (111) surface of a silicon crystal. (a) Transient shape of the crater 40 ns after the laser pulse, as imaged by the time-resolving reflection electron microscope (REM) with an exposure time of 20 ns. (b) Final shape of the crater, imaged as in (a). (c) A crater shot with a similar laser pulse energy (180 μJ), as imaged with REM but with a conventional long exposure time. (d) The crater of (c), conventionally imaged with scanning electron microscopy using secondary electrons. (From Bostanjoglo and Heinricht, 1988.)
In addition to imaging, thermal and secondary electrons emitted by the laser-pulsed specimen can be picked up with an Everhart-Thornley detector (Everhart and Thornley, 1960), an electrically screened scintillator/photomultiplier, so that surface processes can be traced continuously.
V. APPLICATION OF REAL-TIME ELECTRON MICROSCOPY TO FAST LASER-INDUCED PROCESSES

Before discussing applications of real-time electron microscopy to fast processes induced by laser pulses, estimates of the relevant time scales are presented.

A. Time Scale of Fast Laser-Induced Processes
Adiabatic deposition of high-density laser power in a solid results at first in highly excited electrons and holes. Existing free charge carriers absorb the light energy directly by inverse bremsstrahlung, and bound electrons
are excited by multiphoton absorption. At very high power densities above $10^{14}$ W/cm², corresponding to electric field amplitudes of the light on the order of the intra-atomic Coulomb fields ($10^9$ V/cm), field ionization and electron tunneling occur in addition. The time of interaction $\Delta t_i$ between laser light of frequency $\omega$ and atomic electrons is estimated with the uncertainty relation to be $\Delta t_i \approx \hbar/\hbar\omega \approx$ s. Equilibrium between the free charge carriers is reached theoretically within s (Yoffa, 1980) by distributive collisions, plasmon production, and electron-hole production and recombination in semiconductors. The laser-induced generation rate of electrons and holes and their transient concentration are usually very high. For instance, if a green laser pulse ($\hbar\omega \approx 2$ eV) with energy $E \approx 1$ mJ and duration ≈10 ns is focused to a spot with area $A \approx$ cm² on silicon (absorption length $d \approx 0.2$ μm, reflectivity $R \approx 0.8$), free charge carriers with a concentration $N = (1-R)E/(Ad\hbar\omega) \geq$ 10’’ cm⁻³ are produced. At these high concentrations, recombination proceeds by the Auger process, as this is a three-particle collision process with a probability rising as a power of $N$. In this process an electron-hole pair annihilates and the band gap energy $E_g$ is transferred as kinetic energy to a third free charge carrier. At lower concentrations, when the plasmon energy $\hbar\omega_p$ is smaller than the band gap energy $E_g$,
$e$, $m_e$, and $m_h$ are the charge and the effective masses of electrons and holes, and $\varepsilon$ is the permittivity. Plasmons are also excited by electron-hole recombination.

Thermalization of the electronic system with the crystal lattice proceeds by a multiphonon process within the electron/hole-lattice relaxation time $\tau$. Its value can be deduced from the imaginary part of the complex refractive index $\tilde{n} = n - i\kappa$ with
$$n^2 - \kappa^2 \approx n_n^2\left[1 - \left(\frac{\omega_p}{\omega}\right)^2\right],$$
where $\omega$ is the frequency of the light and $n_n$ is the refractive index of the nonexcited material. A value of $\tau \approx 1$ ps is deduced for silicon from recent time-resolved transmission measurements with picosecond laser pulses (Baeri et al., 1985; Lompre et al., 1984), in agreement with estimates of $\tau$ based on the Hall mobility $\mu = e\tau/m$. Thus, the initially decoupled electronic system returns into equilibrium with the lattice within a picosecond, and any slower processes are expected to obey classical thermodynamics with the usual lattice temperature.
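The mobility-based estimate can be reproduced in a few lines by solving $\mu = e\tau/m$ for $\tau$. This is a rough sketch under assumptions that are not taken from the text: a mobility of 1400 cm²/(V·s), a commonly quoted room-temperature electron mobility of lightly doped silicon, is used as a stand-in for the Hall mobility, and the carrier mass is taken either as the free-electron mass or as an assumed effective mass of 0.26 free-electron masses. The function name is hypothetical.

```python
E_CHARGE = 1.602176634e-19     # elementary charge in C
M_ELECTRON = 9.1093837015e-31  # free-electron mass in kg


def relaxation_time_s(mobility_cm2_per_vs, mass_ratio=1.0):
    """Relaxation time tau = mu * m / e, with m = mass_ratio * free-electron mass."""
    mobility_si = mobility_cm2_per_vs * 1e-4  # convert cm^2/(V s) to m^2/(V s)
    return mobility_si * mass_ratio * M_ELECTRON / E_CHARGE


# Assumed input values, for illustration only.
tau_free = relaxation_time_s(1400.0)                  # free-electron mass
tau_eff = relaxation_time_s(1400.0, mass_ratio=0.26)  # assumed effective mass
print(f"tau (free-electron mass): {tau_free * 1e12:.2f} ps")
print(f"tau (effective mass 0.26): {tau_eff * 1e12:.2f} ps")
```

Both choices give a sub-picosecond to picosecond relaxation time, the same order of magnitude as the value quoted from the optical measurements.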
Once the energy has been transferred to the lattice, it is dissipated by thermal diffusion, as latent heat by phase transformations, and by radiation. The time scale $\Delta t$ characterizing these dissipative processes and the accompanying material transport, proceeding on a micrometer level $\Delta L \approx 1$ μm, is expected to be significantly slower than the picosecond electronic relaxations. Estimates will now be given that use material parameters typical for semiconductors and metals.

Dissipation of energy by heat conduction across a distance $\Delta L \approx 1$ μm is characterized by a diffusion time $\Delta t_d = (\Delta L)^2/2D_{th} \approx 0.5$ μs with the thermal diffusivity $D_{th} \approx 0.01$ cm²/s (e.g., for amorphous Si). Phase transitions propagate at a temperature $T$ with a velocity
$$v \approx a f_D \exp\left(-\frac{E_a}{kT}\right) = v_s \exp\left(-\frac{E_a}{kT}\right). \qquad (22)$$
Here $a$ is the distance between two neighboring atomic sites, $f_D$ is the Debye frequency, i.e., the most frequent jump frequency of an atom, $\exp(-E_a/kT)$ is the Boltzmann probability factor for a successful crossing of the potential wall with activation energy $E_a$, and $v_s$ is the velocity of sound in the condensed phase. Thus, the minimum propagation times $\Delta t_p$ of phase transitions on the micrometer scale are $\Delta t_p = \Delta L/v \geq \Delta L/v_s \approx 1$ ns, with $v \leq v_s \approx 1000$ m/s. Eq. (22) also holds for removal of condensed material by vaporization, with $E_a$ roughly coinciding with the sublimation heat per atom. So, the maximum velocity of ablation by boiling is again the velocity of sound. The time $\Delta t_r$ of cool-down due to heat radiation is estimated from the Stefan-Boltzmann law
$$\rho c d\,\frac{dT}{dt} \approx -\alpha\sigma T^4,$$
giving
with emissivity $\alpha$; $\sigma$ is the Stefan-Boltzmann constant; $\rho$, $c$, and $d$ are the density, heat capacity, and absorption length of the target for visible light. Inserting the values $\alpha \approx 1$, $c\rho \approx 5$ J/(K·cm³), and $d \approx 20$ nm typical for metals, one gets $\Delta t_r \geq 10$ μs for temperatures $T$ up to the highest possible boiling point (≤6000 K), so that radiation can usually be neglected. Transport of material, such as flow of a liquid layer driven by surface forces, propagates with the Rayleigh velocity
where $\rho$, $\gamma$, and $d$ are the density, surface energy, and thickness of the crumbling liquid layer. A sensible value for the latter is the absorption length of light. Using mean values for liquid metals, $\gamma \approx 0.1$ N/m, $\rho \approx 5$ g/cm³, $d \approx 20$ nm, the time scale for hydromechanical fluctuations across distances exceeding $\Delta L \approx 1$ μm is $\Delta t_f \approx \Delta L/v_R \approx 20$ ns.

In summary, roughly two time regimes can be associated with high-power laser-induced processes. The faster regime comprises excitation and thermalization of the electronic system, which proceed within $10^{-14}$-$10^{-12}$ s. The slower one concerns dissipation of energy within the lattice and structural relaxations, which are characterized by a nanosecond time scale, at least for processes occurring on the micrometer level. These slower lattice processes are presently within the realm of real-time electron microscopy.

Laser-initiated explosive crystallization of amorphous films is a particular fast process that has been investigated in some detail by electron microscopy. The results will now be discussed as an example of the application of real-time electron microscopy.

B. Explosive Crystallization of Amorphous Films
1. Experimental Results

Amorphous films of a variety of elements [Sb (Bostanjoglo and Schlotzhauer, 1981; Bostanjoglo et al., 1982; Gotzeberger, 1955), Ge (Bostanjoglo, 1982; Bostanjoglo and Endruschat, 1984, 1985; Endruschat, 1986; Mineo et al., 1973; Takamori et al., 1973), Si (Andra et al., 1982; Bostanjoglo, 1982; Geiler et al., 1982, 1986; Gotz, 1986; Koester, 1978; Wagner et al., 1985)], alloys [e.g., Fe$_x$Ni$_{1-x}$ with $x \geq 0.6$ (Bostanjoglo and Liedtke, 1980)], and compounds [SiO$_2$ (Aleksandrov, 1984)] prepared in a particular metastable state exhibit self-sustained crystallization. Once crystallization is locally started by a δ-like mechanical shock, light, or electron beam pulse, it spreads explosively across large areas, yielding a centrosymmetric crystal texture.

Figure 37 shows a typical threefold structure generated by a laser-pulse-induced explosive crystallization of an amorphous germanium film. Explosive amorphous films of other materials, e.g., Si and Sb, give similar structures. They usually consist of a fine-grained central area (region 1), which approximately coincides with the laser spot. Then comes a region 2 with large radial crystals. Finally, there is a third region (3), consisting of several concentric bands of tilted crystals. Only region 2 is present in all explosive crystallization events. If the triggering pulse has a sufficiently low energy, region 1 shrinks to a single crystallite and the concentric bands may be missing altogether. The three regions differ not only in their texture; their temporal formation is just as dissimilar.
FIG. 37. Typical threefold structure of an explosively crystallized amorphous germanium film. The dark surrounding material has remained in the original amorphous state.

FIG. 38. Explosive crystallization of amorphous germanium within region 1, initiated by a low-energy laser pulse. (a) Final structure. (b) Abrupt change of image intensity within the circle in (a) due to crystallization. (c) Crystallization step of (b) at a larger time scale. (From Endruschat, 1986.)
FIG. 39. Long-term crystallization of amorphous germanium within region 1, initiated by a high-energy laser pulse. (a) Final structure. (b) Gradual change of image intensity within the circle in (a) due to crystallization. Crystallites are smaller and crystallization time is markedly larger than in Fig. 38. (From Endruschat, 1986.)
a. Region 1. Figures 38 and 39 show the crystallization texture and formation dynamics of the fine-grained region 1. This region is characterized by a small crystal size $D_c \leq 0.1$ μm and a delay of crystallization on the order of 10 ns, which decreases with increasing laser pulse energy. At low pulse energies, crystal growth is completed within $\Delta t_c \approx 20$-80 ns, giving crystallization velocities $v_c = D_c/2\Delta t_c \approx 5$-10 m/s. At high pulse energies, crystallization decelerates appreciably during its course down to 0.5 m/s and lasts up to 500 ns. At even higher laser pulse energies, a hole is opened at the center by melting during laser irradiation. The hole continues to grow, even after the laser pulse, for several hundred nanoseconds by a propagating molten rim (Fig. 40) with a velocity of 50-60 m/s in films with a thickness of $d \approx 100$ nm. Identifying this measured value with the Rayleigh velocity in Eq. (25) gives a surface tension $\gamma \approx 0.5$ N/m, which is approximately the value for liquid germanium near the melting point $T_{mc} = 1210$ K of the crystal. Thus, as long as region 1 is not utterly destroyed, its temperature is in the vicinity of $T_{mc}$.
b. Region 2. This is the region that is always present after explosive crystallization. It is characterized by radial growth of large crystals with sizes in the micrometer range. Growth starts here with a considerably larger delay, of 100-200 ns after the laser pulse, than in the central region; but once nucleated, the crystals grow very fast. Figures 41 and 42 show the evolution dynamics. Crystallization proceeds by the advance of a diffuse phase boundary for approximately 400-500 ns. The velocity of the radial crystal growth is $v_c = 12$-15 m/s in all cases and, in contrast to region 1, it was not observed to depend on the laser pulse energy.
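As a quick consistency check on the figures quoted for region 2, the radial growth length follows from velocity times duration. The short sketch below is only that arithmetic; the function name is hypothetical and the assumption is a constant growth velocity over the whole growth period.

```python
import math


def growth_length_um(velocity_m_per_s, duration_ns):
    """Length grown at constant velocity; 1 m/s sustained for 1 ns is 1e-3 um."""
    return velocity_m_per_s * duration_ns * 1e-3


shortest = growth_length_um(12, 400)  # slowest quoted growth, shortest duration
longest = growth_length_um(15, 500)   # fastest quoted growth, longest duration
print(f"radial crystal length: {shortest:.1f} to {longest:.1f} um")
```

The resulting lengths of a few micrometers agree with the statement that the radial crystals have sizes in the micrometer range.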
FIG. 40. Formation of a hole in region 1 by melting and capillarity effects. (a) Intermediate state 500 ns after the laser pulse. Exposure time was 15 ns. (b) Motion of the molten rim across the viewing field of the photomultiplier (0.5 μm in diameter). The intensity dip prior to the rise marks the entrance of the molten rim into the viewing field. (From Endruschat, 1986, and R. P. Tornow, PhD thesis.)

FIG. 41. Explosive crystallization of amorphous germanium in region 2, initiated by a laser pulse. (a) Final shape of a radially grown crystal. (b) Change of image intensity within the circle in (a) due to crystallization. (c) Crystallization step of (b) at an increased time scale. (From Endruschat, 1986.)
FIG. 42. Intermediate states (a1 to a5) of explosive crystal growth in regions 2 and 3 of amorphous germanium, initiated by a laser pulse. The points of time after the laser pulse are indicated in the upper right corners. Exposure time was 15 ns. Below (b1 to b5), the final states are shown. The images were taken from neighboring areas of the same film at the indicated times after a laser pulse of equal energy. Figures a1 to a4 show the radial crystal growth in region 2. Figure a5 shows a transient state of the delayed helical growth in region 3. The pronounced tilting of the radial crystals in region 2 of Fig. a5, generating radial grain boundaries, occurs at least several microseconds after crystallization. (From R. P. Tornow, PhD thesis.)
0. BOSTANJOGLO
Though the phase transformation starts significantly earlier in region 1, it may still be in full swing there while it has entirely come to an end in region 2. This is substantiated by the delayed opening of a hole in region 1 (Fig. 42), which means that this hottest region consists of a liquid/solid mixture at a temperature near $T_{mc}$, gradually solidifying during 0.5-1 μs, well after the surrounding region 2 has transformed into a stable crystal structure.

c. Region 3. This region does not necessarily emerge in an explosive crystallization. If it is present, it consists of one or more concentric bands of rotated crystals, which are well separated from the radial crystals by a fine-grained layer (Bostanjoglo, 1982). The growth of the bands does not follow that of the radial crystals right away. Actually, there is a substantial break of about 1 μs between the end of radial growth and the appearance of the first rotated crystals. Crystallization in this region lasts for about 5 μs.
2. Model Based on Nucleation and Growth

These electron microscopical results support the following model of explosive crystallization. It is based on the assumption that the amorphous film has a roughly defined melting temperature $T_{ma}$, which is below the melting temperature $T_{mc}$ of the stable crystal phase (Aleksandrov, 1983, 1984; Baeri and Campisano, 1982; Spaepen and Turnbull, 1982). Actually, amorphous films with an atomic packing different from that of the liquid, such as Ge or Si, are expected to have a melting point. Heating in this case leads to a first-order phase transition with a latent heat of fusion instead of gradual softening, as is observed with glasses. If an amorphous film is now adiabatically heated with a δ-shaped energy pulse within a limited area of several micrometers in diameter to an intermediate temperature between $T_{ma}$ and $T_{mc}$, the amorphous film is superheated locally. This is the starting point for explosive crystallization, which, being very fast, is certainly not a solid phase transition, but is believed to proceed via a transient liquid according to the scheme:

$$\text{superheated amorphous solid} \xrightarrow{\ \text{melting}\ } \text{supercooled liquid} \xrightarrow{\ \text{epitaxial solidification}\ } \text{crystal}.$$
These processes may be described by the usual kinetic theory of nucleation and growth (Bostanjoglo and Endruschat, 1985; Endruschat, 1986; Feder et al., 1966; Geiler et al., 1982, 1986; Gotz, 1986; Porter and Easterling, 1981). Before a phase transition from a nonordered to an ordered structure can actually set in on a macroscopic scale, critical nuclei of the ordered phase must appear, i.e., nuclei of the new phase that are large enough to grow spontaneously. According to the above scheme, there are three processes
ELECTRON MICROSCOPY OF FAST PROCESSES
involved in producing a critical crystal nucleus:

1. Appearance, after a waiting time $t_{1l}$, of the first critical liquid nucleus in a volume V, later filled by a crystal.
2. Total melting of the volume V by growth of the liquid nucleus, consuming a time $t_m$.
3. Generation of the first critical crystal nucleus in the supercooled liquid volume V, consuming a time $t_{1c}$.
The appearance times $t_{1l}$, $t_{1c}$ are determined by

$$\int_0^{t_{1l},\,t_{1c}} V \dot N \, dt = 1, \tag{26}$$

with $\dot N$ the appropriate nonstationary nucleation rate, given by Feder et al. (1966) as a relaxation-type expression:

$$\dot N = \dot N_s \left[1 - \exp(-t/\tau)\right]. \tag{27}$$

The asymptotic value $\dot N_s$ is the stationary nucleation rate (Porter and Easterling, 1981) at the temperature T,

$$\dot N_s = n f_a \exp\!\left[-\frac{16\pi \gamma_{12}^3 T_m^2}{3 k T H_m^2 (T - T_m)^2}\right], \tag{28}$$

with $f_a$ the adsorption frequency of atoms at the nucleus and n the density of nucleation sites, which coincides with the atomic density $n_A$ if nucleation is homogeneous. The exponential is the thermodynamic probability of generation of a critical spherical nucleus, with $\gamma_{12}$ the interface energy between phases 1 and 2, $T_m$ the melting temperature, $H_m$ the enthalpy of melting per volume, $(T - T_m)$ the supercooling or superheating, and k Boltzmann's constant. The adsorption frequency $f_a$ is usually written as
where $f_D$ is the mean atomic jumping frequency (roughly the Debye frequency), $E_D$ is the activation energy for self-diffusion in the parent phase, $w$ (≈ 0.5) is the probability that the atom swings in a direction suitable for a transition, $D_d$ is the coefficient for self-diffusion, and a is the lattice constant. In the case of a liquid parent phase, the diffusion constant $D_d$ can be replaced by the viscosity coefficient η according to Einstein's relation $D = kT/6\pi\eta a$. The relaxation time τ of the nonstationary nucleation rate in Eq. (27) is
given for spherical nuclei as (Feder et al., 1966)
It turns out to fall below 1 ns for germanium films (Endruschat, 1986) at the relevant temperatures, so that nucleation is approximately stationary on the investigated time scale. The time spent to melt a volume V of superheated amorphous material by a liquid nucleus growing with velocity $v_m$ is $t_m = \sqrt[3]{V}/2v_m$. As the boundary between the two disordered phases is diffuse, the propagation velocity of the melting front is given by classical kinetic theory (Porter and Easterling, 1981) as
where $H_{ma}$ is the melting enthalpy per volume, $a = 1/\sqrt[3]{n_A}$ is the mean atomic distance, and $(T - T_{ma})$ is the superheating of the amorphous phase. The two exponentials, $\exp(-E_D/kT)$ and $\exp[-(E_D + \Delta G)/kT]$, express the jump probabilities of an atom across the potential wall between two states of different stability, which is characterized by the difference in free enthalpy per atom $\Delta G = H_{ma}(T - T_{ma})/n_A T_{ma}$.

Once a supercritical crystal nucleus is formed in the supercooled liquid, it will continue to grow by one of the following mechanisms: migration of a diffuse phase boundary, dendritic crystallization, transverse growth with nucleation of ledges or dislocations on an atomically smooth boundary, or twin-assisted atomic attachment. The boundary of a crystal during growth or in the final stage was not observed to be either dendritic or regularly shaped. Therefore, crystal growth is believed to proceed mainly by atomic attachment to a diffuse phase boundary. The growth velocity is then given by an expression analogous to Eq. (31):

where $H_{mc}$ is the melting enthalpy per volume of the crystal, η is the viscosity of the supercooled liquid, and $(T_{mc} - T)$ is the supercooling. The propagation velocity $v_c$ has a maximum at a certain temperature $T_{max}$ (Fig. 43). The reason is that at lower temperatures the diffusion slows down, and at higher temperatures the supercooling and, accordingly, the instability of the liquid are reduced. Moreover, propagation at temperatures $T < T_{max}$ is unstable. If the
FIG. 43. Velocity $v_c$ of crystallization by explosive liquid-phase epitaxy in amorphous germanium and propagation velocity $v_m$ of melting of the amorphous phase superheated by a laser pulse, plotted against temperature (K). $T_{ma}$ and $T_{mc}$ are the melting temperatures of amorphous and crystalline germanium, respectively. (From Endruschat, 1986.)
velocity is somewhat increased by an instability, the production rate of heat $v_c H_{mc}$ by crystallization and the local temperature are also increased. This, in turn, causes a further increase of velocity due to $dv_c/dT > 0$. Correspondingly, an incremental transient depression of velocity leads to a steady decrease of the latter. On the other hand, operating temperatures above $T_{max}$ yield a stable velocity because of "negative feedback" $dv_c/dT < 0$.
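The strong temperature dependence of the stationary nucleation rate in Eq. (28) comes almost entirely from its exponent. The following is a minimal Python sketch of that exponent, assuming the classical barrier for a spherical critical nucleus, $16\pi\gamma^3 T_m^2/[3H_m^2(T - T_m)^2]$, divided by kT. The function name is illustrative and not from the source; the numerical inputs are the liquid/crystal values quoted for germanium in the next subsection.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def nucleation_exponent(T, gamma, H_m, T_m):
    """Return (barrier of a critical spherical nucleus) / (k T).

    T      temperature in K
    gamma  interface energy between the two phases in J/m^2
    H_m    enthalpy of melting per volume in J/m^3
    T_m    melting temperature in K
    Assumes the classical form behind Eq. (28); not the author's code.
    """
    if T == T_m:
        return math.inf  # no driving force at the melting point
    barrier = 16.0 * math.pi * gamma**3 * T_m**2 / (3.0 * H_m**2 * (T - T_m) ** 2)
    return barrier / (K_B * T)


# Supercooled liquid germanium crystallizing (values quoted in the text)
exp_900 = nucleation_exponent(900.0, 0.18, 2.67e9, 1210.0)
exp_1100 = nucleation_exponent(1100.0, 0.18, 2.67e9, 1210.0)
print(f"exponent at  900 K: {exp_900:.1f}")
print(f"exponent at 1100 K: {exp_1100:.1f}")
```

The exponent grows rapidly as T approaches $T_m$, which is why nucleation is suppressed at small supercooling.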
3. Comparison of the Model with Experiments on Germanium Films

Applying the above theory to germanium, the following material parameters were used, as given by Aleksandrov (1983) (a, l, c standing for amorphous, liquid, and crystalline):

$\gamma_{al} = 0.04$ J/m², $\gamma_{lc} = 0.18$ J/m²,
$H_{ma} = 1.8 \times 10^9$ J/m³, $H_{mc} = 2.67 \times 10^9$ J/m³,
$T_{mc} = 1210$ K, $E_D$ = 0.87 x J, $f_D = 10^{13}$ Hz, and
$n = n_A = 4.4 \times 10^{28}$ m⁻³

(i.e., homogeneous nucleation is assumed) and $T_{ma} = 850$ K, as adopted by Baeri and Campisano (1982). The viscosity η(T) of supercooled liquid germanium was obtained by extrapolating measured values below $T_{mc}$ (Endruschat, 1986). The hottest region, i.e., region 1, is characterized by a dense nucleation of crystals, with a volume $V = (0.1\ \mu\text{m})^3$ per crystal, which is to be used
FIG. 44. Appearance times $t_{1l}$ and $t_{1c}$ of the first critical liquid and crystalline nucleus in the superheated amorphous and supercooled liquid material, respectively, plotted against temperature (K). (From Endruschat, 1986.)
in Eq. (26) (film thickness $d \approx 0.1$ μm and crystal diameter $D_c \approx 0.1$ μm). Obviously, this region is transformed by the laser pulse into a slush consisting of liquid nuclei at mean distances $D_c$ in an amorphous matrix. A total melting of region 1 is to be excluded, since this area would then not crystallize, owing to additional heating by the liberated latent heat $H_{mc}$ by as much as $H_{mc}/\rho_l c_l \approx 1500$ K ($\rho_l$ and $c_l$ are the density and heat capacity of the liquid). Figure 44 shows the appearance times $t_{1l}$, $t_{1c}$ of the first critical nuclei in the volumes V. Their sum and, in particular, the melting time $t_m$ of the volume V may well account for the observed delay of crystallization. When the first critical nuclei have appeared, crystallization sets in by solidification of the supercooled liquid and by melting a part of the surrounding amorphous material. At lower laser pulse energies, where region 1 is heated only slightly above $T_{ma}$, the observed velocity of crystallization is about 5 m/s, which in fact agrees with the computed value at $T_{ma}$, as shown in Fig. 43. The decelerating effect of high laser pulse energies is also understood with the same figure. The film is heated to temperatures near the upper melting point $T_{mc}$, where supercooling of the melt and, accordingly, the crystal growth velocity $v_c$ drop to zero.

The fast radial crystallization in region 2 may be understood as a continued explosive liquid-phase epitaxial process. This region is outside the main part of the Gaussian laser pulse and is not heated by the latter up to $T_{ma}$. So it is still amorphous after the laser pulse. As region 1 crystallizes, heat diffuses outward, heats the adjacent amorphous material up to $T_{ma}$, and melts it locally. An additional heat source is solid-state crystallization at the inner rim of region 2. As the temperature of the melt is not much above $T_{ma}$ initially,
crystallization starts with a velocity of $v_c \approx 5$ m/s. The crystallizing material ejects latent heat of fusion at a rate of $H_{mc} v_c$, which is used to heat adjacent amorphous material to a temperature $T \ge T_{ma}$ and transform it to a supercooled liquid, whereby heat is dissipated at a rate of $[H_{ma} + \rho_a c_a (T_{ma} - T_0) + \rho_l c_l (T - T_{ma})]\, v_m$, with $T_0$ the local starting temperature, and c and ρ the heat capacity and density. Because of the positive feedback $dv_c/dT > 0$ at $T_{ma} < T < T_{max}$, the crystal growth velocity $v_c$ rapidly settles at the first stable value, which actually is the maximum value $v_{c,max} \approx 12$ m/s. This stationary state is attained quite independently of what has been going on in region 1. This state and the high value of the growth velocity are exactly the observed features of region 2. The fast autocatalytic liquid-phase epitaxial crystal growth stops when the crystallization front runs into the area with too low a starting temperature $T_0$, such that the liberated heat of fusion $H_{mc}$ does not suffice to melt the amorphous material.

As heat continues to diffuse out from regions 1 and 2, the amorphous material in region 3 is slowly heated, whereby solid-state crystallization can be activated, supplying additional heat. If melting of the amorphous film is achieved, explosive liquid-phase epitaxial crystal growth can again set in. But now it cannot spread across larger areas. It stops periodically and starts again each time the slow heat diffusion catches up, giving the periodic crystalline band structure observed in region 3. The delay of crystal growth in this region is then determined by heat diffusion across region 2, being on the order of
with the width of region 2 being $R \approx 5$ μm and the thermal diffusivity of crystalline germanium $D_{th} \approx 0.12$ cm²/s. This theoretical value is in rough agreement with the delay observed to occur between the end of radial crystal growth in region 2 and the beginning of crystallization in region 3.
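As a rough numerical check of this delay, the sketch below evaluates the generic diffusion time scale $R^2/D$ for the quoted width and diffusivity. This assumes the simplest order-of-magnitude estimate; the exact prefactor of the expression used in the text is not reproduced here, and the function name is illustrative.

```python
def diffusion_delay(width_m, diffusivity_m2_per_s):
    """Order-of-magnitude time for heat to diffuse across a given width.

    Uses t ~ R^2 / D, an assumed generic estimate (prefactor of order one).
    """
    return width_m**2 / diffusivity_m2_per_s


R = 5e-6      # width of region 2: 5 micrometers, in m
D = 0.12e-4   # thermal diffusivity: 0.12 cm^2/s, in m^2/s
delay = diffusion_delay(R, D)
print(f"estimated delay: {delay * 1e6:.1f} microseconds")
```

With these inputs the estimate is about 2 μs, the same order as the break of about 1 μs reported for region 3.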
VI. SPACE-TIME RESOLUTION OF REAL-TIME MICROSCOPY

In real-time electron microscopy, the image signal is collected in a single pulse. In order to reduce shot noise, a high-current electron pulse must be used for illumination. This, in turn, causes adiabatic heating of the specimen, since the energy deposited by the electrons is not dissipated by heat conduction, as it is in stationary microscopy. Ultimate resolution, therefore, is determined by a compromise between shot noise and radiation damage. As reflection microscopy suffers more from chromatic aberration than transmission microscopy, a higher resolution may be reached with the latter. It will
be discussed in the limiting case of a weakly scattering, thin-film specimen. The bright-field image current of a selected homogeneous specimen area of diameter D is then

$$J = \frac{\pi}{4} D^2 j_0 \exp\!\left(-n_A d \int_\alpha^\pi \sigma\, 2\pi \sin\theta \, d\theta\right), \tag{34}$$

where $j_0$ is the impinging electron current density, d is the film thickness, $n_A$ is the number of atoms per volume, σ is the effective atomic differential elastic scattering cross section (taking into account Bragg scattering), and α is the half-aperture angle. A phase transition changes the scattering cross section and consequently causes a change, ΔJ, of the image current. The image signal, as picked up by a detector with gain G, is $J_s = GJ$. It is superimposed by noise with an average amplitude $\sqrt{\overline{\delta J_s^2}}$. A phase transition is resolved when the phase-induced change $\Delta J_s$ of the image signal is roughly three times larger than the noise, i.e., $\Delta J_s \ge 3\sqrt{\overline{\delta J_s^2}}$. Now the current noise amplitude is composed of fluctuations of the gain and of shot noise of the image signal, having the rms values $J\sqrt{\overline{\delta G^2}}$ and $G\sqrt{2eJ\Delta f}$, respectively. Here Δf is the detector bandwidth and e is the electron charge. The rms of the total noise amplitude is then

$$\sqrt{\overline{\delta J_s^2}} = \sqrt{J^2\,\overline{\delta G^2} + 2eJ\Delta f\,G^2} \approx G\sqrt{2eJ\Delta f}. \tag{35}$$
Since the detectors used are based on high-gain secondary electron emission, so that $\overline{\delta G^2} \approx G \gg 1$ and, furthermore, shot noise and image current are of the same order of magnitude near the resolution limit, Eq. (35) simplifies, as indicated on the right-hand side. Therefore, the resolution limit is determined by

$$\Delta J \ge 3\sqrt{2eJ\Delta f}. \tag{36}$$

Inserting the minimum detectable rise/fall time $\Delta t \approx 1/\Delta f$ of a transition due to the finite bandwidth Δf, one gets the condition for mutual space-time resolution:

$$D^2 \Delta t \ge \frac{72\,e}{\pi\,(\Delta J/J)^2\, j_0}, \tag{37}$$
being valid in the limit of weak scattering ($n_A d \sigma \to 0$). Assuming a change of contrast $\Delta J/J = 0.5$ and a current density at the object $j_0 \approx 10$ A/cm², which is achieved with a thermal tungsten electron gun, Eq. (37) states that phase transitions with durations $\Delta t \ge 3$ ns can be resolved in specimen areas down to $D \approx 0.1$ μm in diameter. The experimentally reached resolution is actually very near to this theoretical prediction (see Fig. 32b).
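The algebra leading from the detection criterion to Eq. (37) can be sketched as follows. This is a reconstruction of the intermediate steps, not a quotation of the source: it assumes the simplified noise amplitude $G\sqrt{2eJ\Delta f}$ from Eq. (35), the weak-scattering limit $J \approx (\pi/4) D^2 j_0$ of Eq. (34), $\Delta J_s = G\,\Delta J$, and $\Delta t \approx 1/\Delta f$.

```latex
\begin{align*}
G\,\Delta J \;\ge\; 3\,G\sqrt{2 e J \Delta f}
  \quad&\Longrightarrow\quad
  \left(\frac{\Delta J}{J}\right)^{2} J \;\ge\; 18\, e\, \Delta f, \\[4pt]
\frac{\pi}{4}\, D^{2} j_0 \left(\frac{\Delta J}{J}\right)^{2} \;\ge\; \frac{18\, e}{\Delta t}
  \quad&\Longrightarrow\quad
  D^{2}\, \Delta t \;\ge\; \frac{72\, e}{\pi\, (\Delta J / J)^{2}\, j_0}.
\end{align*}
```

The factor 72 is thus the product of the squared detection threshold (9), the shot-noise factor (2), and the inverse of the area factor (4).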
Space and time resolution can be mutually increased only by increasing the electron current density, $j_0$, impinging on the object. But here radiation damage sets an absolute limit. As heat is dissipated by conduction on the microsecond time scale in thin films, nanosecond electron pulses deposit energy adiabatically. An illuminating electron pulse of duration $t_p \ge \Delta t$ then heats the specimen of thickness d by ΔT according to

$$\frac{1}{e}\, j_0\, t_p\, \Delta E = c \rho d\, \Delta T \ge \frac{1}{e}\, j_0\, \Delta t\, \Delta E, \tag{38}$$

where ΔE is the mean energy loss per beam electron. The simple Bethe stopping power formula (Reimer, 1984) gives a satisfactory approximation for the mean energy loss

$$\Delta E = K \rho d, \tag{39}$$
where K is approximately a constant for atoms with mean atomic number, $K \approx 5 \cdot 10^{-13}$ J cm²/g. The maximum tolerable current density $j_{0,max}$ is reached when melting sets in, i.e., $\Delta T \ge T_m$,

where the high-temperature limit $c = 3k/m_A$ of the specific heat of solids was used (k is the Boltzmann constant; $m_A$ is the atomic mass). Combining Eqs. (37) and (40) gives the achievable absolute spatial resolution limit in the case of adiabatic electron pulse illumination.
Inserting mean values for mass and melting point of heavy atoms ($m_A \approx 100\, m_{proton}$, $T_m \approx 2000$ K) and an optimistic value for the change of contrast $\Delta J/J \approx 0.5$, the spatial resolution limit is estimated to be $D_{min} = 4$ nm. The absolute limit of time resolution $\Delta t_{min}$, in the case of the continuous mode of time-resolved microscopy, is given by the geometric sum Atnlin = I t j c , + O,s>O,
(6)
respectively, for all $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$. Sharma and Mittal's main motivation was to generalize the three entropies, $H_r^1(P)$, $H_s^s(P)$, and ${}_tH(P)$. With this aim, they arrived at $H_r^s(P)$. $H_r^s(P)$ reduces to $H_s^s(P)$ and ${}_tH(P)$ when $r = s$ and $r^{-1} = t = 2 - s$, respectively. $H_r^s(P)$ reduces to $H_1^s(P)$ and $H_r^1(P)$ when $r \to 1$ and $s \to 1$, respectively. Also, $H_1^s(P)$ reduces to Shannon's entropy, $H(P)$, when $s \to 1$. Thus, we can see that the entropy of order r and degree s contains, either as a limiting case or as a particular case, Shannon's entropy, the entropy of order r, the entropy of degree s, the entropy of kind t, and the entropy of order 1 and degree s. The entropies $H_r^s(P)$ and $H_1^s(P)$ are not additive, not recursive, and do not have sum representations. Before proceeding further, we shall present in the next section a list of most of the generalized entropies known in the literature. For convenient reference, the entropies listed above are also written again.
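The reductions described above can be checked numerically. The sketch below is a minimal Python illustration, assuming base-2 logarithms and the standard forms of the Shannon entropy, the Rényi entropy of order r, the entropy of degree s, and the Sharma-Mittal entropy of order r and degree s; the function names are illustrative and not from the source.

```python
import math


def shannon(p):
    """Shannon entropy in bits."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def order_r(p, r):
    """Entropy of order r (Renyi), r > 0, r != 1."""
    return math.log2(sum(x**r for x in p if x > 0)) / (1.0 - r)


def degree_s(p, s):
    """Entropy of degree s, s != 1."""
    return (sum(x**s for x in p if x > 0) - 1.0) / (2.0 ** (1.0 - s) - 1.0)


def order_r_degree_s(p, r, s):
    """Entropy of order r and degree s, r != 1, s != 1, r > 0."""
    power_sum = sum(x**r for x in p if x > 0)
    return (power_sum ** ((s - 1.0) / (r - 1.0)) - 1.0) / (2.0 ** (1.0 - s) - 1.0)


P = [0.5, 0.25, 0.25]
eps = 1e-6  # small offset used to approach the limits r -> 1 and s -> 1

print(order_r_degree_s(P, 2.0, 2.0), degree_s(P, 2.0))        # r = s case
print(order_r_degree_s(P, 2.0, 1.0 + eps), order_r(P, 2.0))   # s -> 1
print(order_r(P, 1.0 + eps), shannon(P))                      # r -> 1
```

Each printed pair should agree to within the small error introduced by approximating the limits with a finite offset.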
F. List of Generalized Entropies
For all P = ( p 1 , p 2 , . . , p n ) E A:, the following generalized entropies are known in the literature by their respective authors, starting with Shannon (1948). In some cases it is understood that P E An. By no means can we say that the list is complete. At the end of this chapter it is shown in a graphic way, how these entropies reduce to Shannon's case either in the limiting or in the particular case. Shunnon (1948)
INDER JEET TANEJA
Rényi (1961)
Aczél and Daróczy (1963)

$$\phi_4(P) = (s - r)^{-1} \log\!\left(\frac{\sum_{i=1}^{n} p_i^{r}}{\sum_{i=1}^{n} p_i^{s}}\right), \qquad r \ne s,\ r > 0,\ s > 0$$
Varma (1966)
Havrda and Charvát (1967)

$$\phi(P) = (2^{1-s} - 1)^{-1}\left[\sum_{i=1}^{n} p_i^{s} - 1\right], \qquad s \ne 1,\ s > 0$$
Belis and Guiaşu (1968)

$$\phi(P) = -\frac{\sum_{i=1}^{n} p_i w_i \log p_i}{\sum_{i=1}^{n} p_i w_i}, \qquad w_i > 0,\ i = 1, 2, \ldots, n$$
INFORMATION MEASURES AND THEIR APPLICATIONS
Rathie (1970)
si 2 1 , i = 1,2,..., n , r # l , r > 0
Arimoto (1971)

$$\phi(P) = (2^{t-1} - 1)^{-1}\left[\left(\sum_{i=1}^{n} p_i^{1/t}\right)^{t} - 1\right], \qquad t \ne 1,\ t > 0$$
Sharma and Mittal (1975)

$$\phi_{13}(P) = (2^{1-s} - 1)^{-1}\left[\left(\sum_{i=1}^{n} p_i^{r}\right)^{(s-1)/(r-1)} - 1\right]$$
Taneja (1975) (refer also to Sharma and Taneja, 1975, 1977)
s # kn, k = 0, 1,2,..., r
Picard (1979)
i= t
>0
s#l,.s>0
r # 1,s# I,r>O,s>O where vi > 0, i = 1,2,. . . ,n and P = ( p l , p 2 , .. .,p,)
E
A,.
Ferreri (1980)
Sant 'anna and Taneja (1983) 423(p)
f pi);[gol
= -i =
sin sp,
1
i= 1
0<s
1,
s > 1,
=
Proof. Parts (i)-(v) are easy verifications. (vi) It follows from the known result (Hardy et al., 1934, pp. 106, Theorem 150),
lnv
5v
-
I,
u
2 0,
where we substitute v = 2" -'Ix. (vii) It follows from the result (Hardy et al., 1934, pp. 40, Theorem 42),
2 y(v
-
11,
S y ( v - I), u
2 0, where we substitute v
=
2' -s and y
7 2 1,
0 2 7 5 1, = x.
H. Analytic and Algebraic Properties of Unified (r, s)-Entropy
In this subsection, we shall study some properties of the unified (r, s)-entropy given in Eq. (7). Some of these properties can be seen in Capocelli and Taneja (1985). Unless otherwise specified, it is understood that the results given below are true for all r > 0 and any s.

Property 1. Nonnegativity. For all $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$, $\mathcal{E}_r^s(P) \ge 0$.

Property 2. Continuity. For all $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$, $\mathcal{E}_r^s(P)$ is a continuous function of P.
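Property 3. Symmetry. For all $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$, $\mathcal{E}_r^s(P)$ is a symmetric function of its arguments, i.e.,

$$\mathcal{E}_r^s(p_1, p_2, \ldots, p_n) = \mathcal{E}_r^s(p_{\tau(1)}, p_{\tau(2)}, \ldots, p_{\tau(n)}),$$

where τ is an arbitrary permutation of {1, 2, ..., n}.

Property 4. Normality.

$$\mathcal{E}_r^s\!\left(\tfrac{1}{2}, \tfrac{1}{2}\right) = 1.$$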
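Property 6. Limiting case. For all $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$, we have

$$\lim_{r \to \infty} \mathcal{E}_r^s(P) = \begin{cases} (2^{1-s} - 1)^{-1}\left[p_{max}^{\,s-1} - 1\right], & s \ne 1, \\ -\log p_{max}, & s = 1. \end{cases}$$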
For s = 1, the result, i.e., $\lim_{r \to \infty} H_r^1(P) = -\log p_{max}$, can be seen in Shiva et al. (1973). For s ≠ 1, the result can be proved by the composition relation given in Eq. (9).

Property 7. Monotonicity. For all $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$, $\mathcal{E}_r^s(P)$ is a decreasing function of r (s fixed).

For s = 1, refer to Shiva et al. (1973). For s ≠ 1, s > 0, refer to Sharma and Mittal (1975). The extension to s ≤ 0 is an easy verification.

Property 8. Concavity. For all $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$, $\mathcal{E}_r^s(P)$ is a concave function of P for all $(r, s) \in \Gamma_1$, where

$$\Gamma_1 = \{(r, s) : r > 0 \text{ with } s \ge r \text{ or } s \ge 2 - 1/r\}. \tag{12}$$

It is already known that $H_s^s(P)$ and ${}_tH(P)$ are concave functions of P for all the values of the parameters. $H_r^1(P)$ is a concave function of P for 0 < r < 1. The concavity of $H(P)$ is already known. The concavity of $H_1^s(P)$ for s > 1 is a
direct consequence of the concavity of $H(P)$ because of relation (10) and propositions 2.1(iii) and 2.1(v). The concavity of $H_r^s(P)$ for $s \ge 2 - 1/r$, r ≠ 1, s ≠ 1, r > 0 can be seen in Van der Pyl (1977). The concavity of $H_r^s(P)$ for $s \ge r > 0$ follows on the lines of proposition 4.2(ii), and it can be seen in Taneja (1988c). Thus, combining all these, we get the required result.

Property 9. Pseudoconcavity. For all $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$, $\mathcal{E}_r^s(P)$ is a pseudoconcave function of P for all r > 0 and for any s.

Pseudoconcavity of $H(P)$ follows from the concavity of $H(P)$, because every concave function is pseudoconcave. For pseudoconcavity of $H_r^1(P)$, r ≠ 1, r > 0, refer to Ben-Bassat and Raviv (1978). The pseudoconcavity of $H_r^s(P)$ (r ≠ 1, s ≠ 1, r > 0) and $H_1^s(P)$ follows from proposition 2.1(iii) with the composition relations (9) and (10), respectively.

Property 10. Quasiconcavity. For all $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$, $\mathcal{E}_r^s(P)$ is a quasiconcave function of P for all r > 0 and any s.

The proof follows from property 9, because every pseudoconcave function is quasiconcave (Mangasarian, 1961, pp. 143, Theorem 5).

Property 11. Schur concavity. For all $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$, $\mathcal{E}_r^s(P)$ is a Schur-concave function of P for all r > 0 and any s.
Prooj. In case of Shannon's entropy, H ( P ) , the result is already known (Csiszir and Korner, 1961). In view of relations (9), (lo), and proposition 2,l(iii), it is sufficient to prove the result only for Hf(P) ( r # 1, r > 0). Let P, U E A: such that
and
2 cit 2 cik
i= 1
We can write
=
k= 1
= I,
cik 2 0, i, k = 1,2,..., n.
We know that (Gallager, 1968, pp. 523)
for all i = 1,2,.. . , n. Taking log(.) on both sides of Eqs. (14), multiplying by ( 1 - r ) - ' ( r # l), and using the expression (1 3), we get H,'(P) 2 H,'(U),
r # 1, r > 0.
This completes the proof. Property 12. Maximality. For all P = ( p l , p 2 , . . .,p,) E A:, 8 s ( P ) is maximum when the probability distribution is uniform, i.e.,
for all r > 0 and any s. The proof follows from the property 1 I (Marshall and Olkin, 1979, pp. 7).
I. Inequalities and Bounds on Generalized Entropies
In this subsection we shall provide some inequalities involving generalized entropies. Upper and lower bounds on the unified (r,s)-entropy in terms of maximum probability are given. Some bounds on the entropy series in the case of Shannon's entropy are also given. Inequality I. we have
Inequalities among entropies. For all P = ( p p z ,. . .,p,)
5 H";P), 2 H",f), (ii) H:(P){ (iii) H ; ( P )
forallr# 1,r>0.
5 H(P), H(P),
r > 1,s # 1, 0 < r < I , s # 1, r > 1, 0 < r < 1,
2 N(s)* Hf(P),
5 N ( s ) . H,'(P),
s < 1, s > 1,
EAt,
where N ( s ) appearing in (5)and (iv) is given by Eq. (1 I). (v) H m {
I H,!(P), 5 H,!(P),
( H , ' ( P )2 I , s < 1) ( H , ' ( P )S I , s < I )
or or
(H,!(P) $ I , s > I), (H,!(P)>= I, s > 1)
for all r # 1, r > 0. (vi) H ; ( P ) { 2 H(P)' 5 H(P),
( H ( P )2 I, s < 1) ( H ( P )5 1, s < 1)
or or
( H ( P ) g 1, s > l), ( H ( P )2 1, s > 1).
The proof of parts (i) and (ii) follows from property 7. Parts (iii) and (iv) follows from proposition 2. I (vi). Parts (v) and (vi) follow from proposition 2.1 (vii).
For the proof for s = I , it., in case of entropy of order r, H : ( P ) ( r # 1, r > 0) refer to Shiva et al. (1973). The other cases follow from relations (9) and (lo), and from proposition 2.1(iii). Inequality 3 .
Bounds on b:(P). For all P
= ( p , , p 2 , .. ., p n ) E
A:, we have
...
(Ill)
1 1 - pmax5 -a;(P). 2
Inequalities 3.(i)and 34i) are true for all r > 0 and any s, while inequality 3.(iii) is true for (r,s) E r2,where 1 or pmax2 -, (s 2 2 or (r,s) E rl 2
for all r > 0.
Proof. (i) In case of Shannon's entropy, the left-hand side of the inequality can be proved by recursivity property and the right-hand side follows from Jensen's inequality (McEliece, 1977). In view of relations (9) and (lo), and proposition 2.l(iii), it is sufficient to prove the result only for the entropy of order r, i.e., we need to show that
n-o ( n - o) times We know that (Gallager, 1968, pp. 523)
,
1 5 o 5 n. (17)
n
where 1 5 o S n. Taking log(.) on both sides of Eq. (18), multiplying by - r ) - ' ( r # 1) and simplifying, we get the right-hand side of inequality (1 7). Let us now prove the left-hand side. Again, we know that (Gallager, 1968, PP. 523)
(1
Similarly,
where 1 2 0 5 n in Eqs. (29) and (20). Adding Eqs. (19) and (20), we get
Taking log( .) on both sides of Eq. (21), multiplying by (1 - r ) - ' ( r # I), and simplifying, we get the left-hand side of Eq. (17). (ii) Without loss of generality we can suppose that p n = pmax.Let 0 = n - I , then 1 - pmax= 1 - pn = 1 p i = 1 - C= : I pi. Making these substitutions in part (i), we get the required result. ( =
(iii) The proof of this part is divided into two parts. First part. In this part we will show that
for all r > 0. Consider a function
and
where
(2' -s - l)-l(s - l)p"-2, In2 p '
s = 1,
and
Also,
G>
[ - = ((1)
=o.
s # 1,
If s < 2, t c ( p ) < 0 for all p E (0,1]. This implies that the function [ , ( p ) is strictly concave and attains its single maximum at t : ( p ) = 0, i.e., when [2(1 - s)-'(2'-&- I)] 1i s - 2 , SZ1, s = 1.
p = L 2 2
Thus the only zeros of t , ( p ) are when p -m(s 5 1). Thus for s < 2, we have
1
LO,
Similarly for s > 2, we have
I
10, -
For s = 2, we have
+ 0,
(,(p)
+
1 ifOO, r = l , s # 1, r # 1,s= 1,r>0, r = 1,s= 1
for LY
= 1
and 2, with
and n
H ( P J 1 U )= -
1 pi10gui.
i- 1
(30)
Proof. Nath (1975) and Van der Pyl ( I 978) proved the following inequalities: H,'(P) 5 "H,(PI(U),
r # 1, r > 0
(31)
for all P, U E An, LY = 1 and 2, where
and
In the limiting case we have lim 'H,(PI/U) = lim 'H,(PII U ) = H(P11 U ) , r+ 1
r-I
where H ( P 1 J U )given in Eq. (30) is the well-known inaccuracy measure (Kerridge, 1961). In this case, we know that
H ( P ) 5 H(PI1U ) ( 34) for all P, U E An is the well-known Shannon's or Gibb's inequality. The remaining part of the proof follows from relations (9) and (lo), and proposition 2. I(iii) applied to Eqs. (31) and (34).
for all P, U E A,,, where 'Hr(PII V )and *H,(PII V )are given in Eqs. (32) and (33), respectively, and Df(PIlU)=(l-r)-'log
)
cpiu,'-' ,
(i11
r#l,r>O
is the directed divergence of order r (Renyi, 1961) given in Section IV.
Proof. We know that [Van der Pyl(1978)l
i:PI
i= 1
Taking log(·) on both sides of Eq. (36), multiplying by $(1 - r)^{-1}$ (r ≠ 1), and simplifying, we get the required result.

Inequality 5. Bounds on the entropy series. Let $P_\infty = (p_1, p_2, \ldots)$ be a sequence of probability distribution such that $p_n \ge 0$, $n \ge 1$, $\sum_{n=1}^{\infty} p_n = 1$, with $p_n \ge p_{n+1}$ for all n. It is well known (Wyner, 1973) that the entropy series $H(P_\infty)$ given by

$$H(P_\infty) = -\sum_{n=1}^{\infty} p_n \log p_n$$

converges if and only if the series

$$S(P_\infty) = \sum_{n=1}^{\infty} p_n \log n$$

converges. Moreover, the following inequalities (Capocelli, Santis, and Taneja, 1988) hold:

(i) $H(P_\infty) \ge S(P_\infty) \ge 0$,
(ii) $H(P_\infty) \ge L(P_\infty)$,
(iii) H ( f , )
5 W k ( f x )k, = 1, 2,...,
where
and
+ 0.766k + 8.531 is a constant independent of the probability distribution $P_\infty$, and for x ≥ 0, i = 1, 2, ...,

$$\log^0 x = x, \qquad \log^i x = \begin{cases} \log(\log^{i-1} x), & \text{if } \log^{i-1} x \ge 1, \\ 0, & \text{otherwise,} \end{cases}$$

and

$$\log^* x = \begin{cases} 0, & 0 < x \le 1, \\ \log x + \log^*(\log x), & x > 1, \end{cases}$$

i.e.,

$$\log^* x = \log x + \log(\log x) + \log[\log(\log x)] + \cdots$$

with addends all positive.
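The recursive definition of $\log^* x$ translates directly into a short loop. The sketch below is a minimal Python illustration, assuming base-2 logarithms (the base is an assumption here) and an illustrative function name.

```python
import math


def log_star(x):
    """Sum log x + log(log x) + ... over the positive addends (base 2 assumed).

    Follows the recursion: log* x = 0 for 0 < x <= 1,
    and log* x = log x + log*(log x) for x > 1.
    """
    if x <= 0:
        raise ValueError("log* is defined here only for x > 0")
    total = 0.0
    term = x
    while term > 1.0:
        term = math.log2(term)  # positive, because the argument exceeds 1
        total += term
    return total


print(log_star(16.0))  # addends 4, 2, 1
print(log_star(0.5))   # no positive addends
```

For x = 16 the addends are 4, 2, and 1; the next iterate, log 1 = 0, is not positive and ends the sum.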
lim [ L ( f , ) - Wl(f,)]
=
a.
s(P)- r
(vi) lim [Wk(fx)- Wk+l(P,)] = m. S(P)+ n
(vii)
lim [W,(P,)
s(P)- x
-
V(Pm)]= m.
111. GENERALIZED DISTANCE MEASURES
r > 0 plays a n important role in the enThe quantity (Z:= tropy of order r and degree s. Let us write it in a simplified form, given by
for all P = ( p l , p 2 , . . .,pn) E A:. The quantity G;(P) given in Eq. (37) is either called the generalized distance measure (Boekee and Van der Lubbe, 1979; Capocelli et al., 1985) or the generalized certainty measure (Van der Lubbe et al., 1984). Another generalized distance measure considered by Capocelli et al. (1985) is given by
T;(p)=[g] 1/r-p
,
r # p, r L 0 , p 2 0
(38)
i= 1
for all P = ( p 1 , p 2 , .. . , p , ) E A,". The quantities (37) and (38), in particular, contain the distance measures considered by Trouborst et al. (1974), Gyorfi and Nemetz (1975), Devijver ( 1974), and Vajda (1 968). The generalized distance measures (37) and (38) satisfy some properties (Capocelli et al., 1985) given in the following two propositions: For all P = ( p l , p 2 , .. . ,p , ) E A,", we have
Proposition 3.1.
( el )(i) Gf(P) is a convex function of P for r > 1, rp 2 1 or 0 < r < 1, p < 0.
(ii) G,P(P)is a concave function of P for 0 < r < 1, p > 0, rp i1. (e2)(i) G;(P) is a decreasing function of r ( p fixed and p > 0).
(ii) G f ( P ) is an increasing function of r ( p fixed and p < 0). (iii) G;(P) is a decreasing function of p (r fixed and r > 1). (iv) G;(P) is an increasing function of p ( r fixed and 0 < r < 1).
(e3) (i) Gf(1 - Pmax, Pmax) 5 G ; ( p )
Ol,pl,p>O(orO 1 or w < 0 and is
The proof of Lemma 4.1 can be seen in Ferentinos and Papaioannou (1983) and in Csiszar (1972). Lemma 4.2 is already known in the literature.
Proof sf (ii). Let P, = ( p Z lp, a z , . . . ,p X n E ) An and U, = (uO1,ua2,. . . ,uun)E A,, 2. From Lemma 4.1, we have
a = 1 and
E.,
1 p;pf;' + jL,C ~ ; ~ u : ; ' n
n
i=1
i= 1
where > 0, Ebz > 0, and i1 + EL, tion, we can write
=
1. By the concavity of logarithmic func-
Inequality (54) is obtained from inequality (53), where we have used the fact that the logarithmic function is increasing. Multiplying by ( r - l)-'(r # 1) on both sides of (54), we get
+
E.,D~(PlllU1) l.,D~(P2IlU2) 2 D,!(EL,P,+ izP2llE.,U1 O 0, r # 1, s # 1. Finally, combining all these results we get the convexity of g > ( PIlU) in A, x Al, for s 2 r > 0. (iii) In view of proposition 4.l(iii) and the composition relation (44), it is sufficient to prove that D,!(Pl[U)is an increasing function of r. In order to prove this, let us write
for all $P, U \in \Delta_n$. Let $u_i/p_i = w_i$, $i = 1, 2, \ldots, n$. Then from Eq. (56) we can write

where $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$ and $W = (w_1, w_2, \ldots, w_n)$, $w_i > 0$, $i = 1, 2, \ldots, n$. We know (Hardy et al., 1934, p. 15, Theorem 5) that the function given in Eq. (56) is increasing in $r$. Since $\log(\cdot)$ is an increasing function, this proves the required result.
(iv) It can be proved on similar lines as property 3(i).
(v) It follows on the lines of property 11.
(vi) The inequalities given in (e1) and (e2) are due to Proposition 4.1(vi) and the composition relations (44) and (45), respectively. The inequalities given in (e3) and (e4) are due to part (iii).
V. GENERALIZED DIVERGENCE MEASURES
This section deals with the generalizations of two different kinds of divergence measures. One is known as the information radius or the Jensen difference divergence measure (Sibson, 1969), and the other is well known as the J-divergence (Kullback and Leibler, 1951; Jeffreys, 1946).

A. Information Radius and the J-Divergence

By using the concavity of Shannon's entropy, we can write
\[ \frac{H(P) + H(U)}{2} \le H\!\left(\frac{P+U}{2}\right) \tag{57} \]
for all $P, U \in \Delta_n$. The difference
\[ R(P\|U) = H\!\left(\frac{P+U}{2}\right) - \frac{H(P) + H(U)}{2} \tag{58} \]
for all $P, U \in \Delta_n$ is known as the information radius (Sibson, 1969) or Jensen difference divergence measure (Burbea and Rao, 1982). For simplicity, we shall call $R(P\|U)$ the R-divergence. Another measure of divergence known in the literature is the J-divergence (Kullback and Leibler, 1951; Jeffreys, 1946), given by
\[ J(P\|U) = D(P\|U) + D(U\|P) = \sum_{i=1}^{n} (p_i - u_i)\log\frac{p_i}{u_i} \tag{59} \]
for all $P, U \in \Delta_n$, where $D(P\|U)$ is as given in Eq. (39). By simple calculations, we can write
\[ R(P\|U) = \frac{1}{2}\left[ D\!\left(P \,\Big\|\, \frac{P+U}{2}\right) + D\!\left(U \,\Big\|\, \frac{P+U}{2}\right) \right] \tag{60} \]
for all $P, U \in \Delta_n$.
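The relation between the R-divergence and the directed divergence can be checked numerically. The following is a minimal sketch, assuming logarithms to base 2 and two illustrative distributions that are not taken from the text; it compares $R(P\|U)$ computed from Shannon's entropy with $\frac{1}{2}[D(P\|(P+U)/2) + D(U\|(P+U)/2)]$.

```python
import math

def shannon_entropy(p):
    """Shannon entropy H(P) in bits; zero-probability terms contribute 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def directed_divergence(p, u):
    """Directed divergence D(P||U) in bits; assumes u_i > 0 wherever p_i > 0."""
    return sum(x * math.log2(x / y) for x, y in zip(p, u) if x > 0)

def r_divergence(p, u):
    """R-divergence: H((P+U)/2) - [H(P) + H(U)]/2."""
    mid = [(x + y) / 2 for x, y in zip(p, u)]
    return shannon_entropy(mid) - (shannon_entropy(p) + shannon_entropy(u)) / 2

def j_divergence(p, u):
    """J-divergence: D(P||U) + D(U||P)."""
    return directed_divergence(p, u) + directed_divergence(u, p)

# Illustrative distributions (assumed for this sketch only).
P = [0.5, 0.3, 0.2]
U = [0.2, 0.2, 0.6]
M = [(x + y) / 2 for x, y in zip(P, U)]
r_via_d = 0.5 * (directed_divergence(P, M) + directed_divergence(U, M))
```

Both routes to the R-divergence should agree up to rounding, and the value is nonnegative, as the concavity of $H$ requires.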
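B. Generalizations of R-Divergence

In this subsection, we shall present three different ways to generalize the R-divergence, i.e., the Jensen difference divergence measure given in Eq. (58). These generalizations are as follows. Taking $\mathscr{E}_r^s(P)$ in place of $H(P)$ in Eq. (58), we have
\[ {}^1\mathscr{V}_r^s(P\|U) = \mathscr{E}_r^s\!\left(\frac{P+U}{2}\right) - \frac{\mathscr{E}_r^s(P) + \mathscr{E}_r^s(U)}{2} \tag{61} \]
for all $P, U \in \Delta_n$, where $\mathscr{E}_r^s(P)$ is the unified (r,s)-entropy given in Eq. (7). More clearly, we have
\[ {}^1\mathscr{V}_r^s(P\|U) = \begin{cases} {}^1R_r^s(P\|U), & r \neq 1,\ s \neq 1,\ r > 0, \\ {}^1R_1^s(P\|U), & r = 1,\ s \neq 1, \\ {}^1R_r^1(P\|U), & r \neq 1,\ s = 1,\ r > 0, \\ R(P\|U), & r = 1,\ s = 1 \end{cases} \]
for all $P, U \in \Delta_n$, where
\[ {}^1R_r^s(P\|U) = (1 - 2^{1-s})^{-1}\left\{ \frac{1}{2}\left[ \left(\sum_{i=1}^{n} p_i^r\right)^{\frac{s-1}{r-1}} + \left(\sum_{i=1}^{n} u_i^r\right)^{\frac{s-1}{r-1}} \right] - \left[\sum_{i=1}^{n} \left(\frac{p_i + u_i}{2}\right)^{r}\right]^{\frac{s-1}{r-1}} \right\}. \]
When $r = s$ in Eq. (62), we have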
An alternative way to generalize $R(P\|U)$ is to replace $D(P\|U)$ by $\mathscr{F}_r^s(P\|U)$ in Eq. (60). Then we get
\[ {}^2\mathscr{V}_r^s(P\|U) = \frac{1}{2}\left[ \mathscr{F}_r^s\!\left(P \,\Big\|\, \frac{P+U}{2}\right) + \mathscr{F}_r^s\!\left(U \,\Big\|\, \frac{P+U}{2}\right) \right] \tag{66} \]
for all $P, U \in \Delta_n$, where $\mathscr{F}_r^s(P\|U)$ is given in Eq. (49). More clearly, we have
r = I,s= 1
for all $P, U \in \Delta_n$, where
r # 1,s # l , r > 0
s # 1,
When $r = s$ in Eq. (67), we have
s f 1. s > 0.
There is also a third way to generalize the R-divergence, similar to Eqs. (67) and (69), based on an expression given in Eq. (70). These generalizations are as follows:
1'
# 1, s # 1, r > 0,
r # I,r>O.
The following limits hold: lim 3R";PllU) = 3R:(PIIU);
lim3R:(PI(U) = 3 R ; ( P l l U )= 'R;(PIIU),
s- 1
r- I
where ' R i ( P J J U )is given in Eq. (63). When r
= s in
Eq. (71), we have
3R:(PllU)= *R;(PIIU),
where 2Ri(PllU)is as given in Eq. (70). The last generalizations can be unified as follows:
for all $P, U \in \Delta_n$.
Remarks. The generalized measures given in Eqs. (61), (66), and (73) are the author's contributions and are presented here for the first time, except Eqs. (64) (Rao, 1982) and (65) (Burbea and Rao, 1982). The measure given in Eq. (70) can be seen in Taneja (1988a). The following proposition holds:

Proposition 5.1. For all $P, U \in \Delta_n$, we have
(i) ${}^1\mathscr{V}_r^s(P\|U) \ge 0$ for $(r,s) \in \Gamma_1$, where $\Gamma_1$ is given by Eq. (12).
(ii) ${}^2\mathscr{V}_r^s(P\|U) \ge 0$ for all $r > 0$ and any $s$.
(iii) ${}^3\mathscr{V}_r^s(P\|U) \ge 0$ for all $r > 0$ and any $s$.
(74)
Proof. (i) This follows from the concavity of $\mathscr{E}_r^s(P)$ ($P \in \Delta_n$) for all $(r,s) \in \Gamma_1$ given in property 8.
(ii) This follows by the nonnegativity of $\mathscr{F}_r^s(P\|U)$ given in Proposition 4.2(i).
(iii) We can write
and where qs is as given in Eq. (46). In view of relations (75) and (76), and proposition 4.l(iii), it is sufficient to prove the nonnegativity of 3R:(Pl/U)(r # 1, r > 0) because the nonnegativity of R(PIIU)is obvious from Eq. (57). Let us now prove the nonnegativity of 3 ~ ; ( ~ l l By Lemma 4.2, we can write
v).
for all $i = 1, 2, \ldots, n$, $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$ and $U = (u_1, u_2, \ldots, u_n) \in \Delta_n$. Multiplying Eq. (77) by $(p_i + u_i)/2$ and summing over all $i = 1, 2, \ldots, n$, we get

Taking $\log(\cdot)$ on both sides of Eq. (78) and multiplying by $(r-1)^{-1}$ ($r \neq 1$), we get the required result.
(iv) Again using Lemma 4.2, we can write
s-I r- 1
-21
or
1
s-
r-l
(79)
< 0.
Subtracting 2 from both sides of Eq. (79), multiplying by $(1 - 2^{1-s})^{-1}$ ($s \neq 1$), and simplifying we get
for all r > 0. Using the concavity property of the logarithmic function we can write
for any $r > 0$. Multiplying Eq. (81) by $(r-1)^{-1}$ ($r \neq 1$), we get
s
~ R ; ( P I I U ) , > 1, 'R:(PlIU){ 2 3 R : ( ~ l I ~ ) , 0 < r < 1.
In a similar way we can prove that
{:
~ R ; ( P ~ I U ) , o < s < I, 'RS(PI I U ) = 3R:(PIIU), s > 1.
Combining Eqs. (80)-(83), we get the required result.
C. Generalizations of J-Divergence

In this subsection we shall present two different ways to generalize the J-divergence given in Eq. (59), involving one and two scalar parameters. The generalizations involving one scalar parameter (Rathie and Sheng, 1981; Burbea and Rao, 1982; Taneja, 1983; Burbea, 1984) are given by
J:(PIIU) = (1 - 2l-')-'
[
+
p;.i'-s
c n
pi'-";
-
i= 1
s # l,S>O
1
2 , (84)
and 2Jf(PllU)= ( r
-
1)-'210g
2
9
(86)
r#l,r>O
for all $P, U \in \Delta_n$. We can easily verify that
limJ:(PIIU)
s- 1
= lim r-
1
'J:(PIIU) = 21im2J,'(PIIU)= J(PIIU). r-1
The generalizations involving two scalar parameters considered by Taneja (1983) are given by
s-
Ijr- 1
and
The following limits are easy to verify:
We can also write
where D f ( P I I U ) and qs are given by Eqs. (40) and (46), respectively. Also
and
Both the generalizations of J-divergence involving one and two scalar parameters can be unified in the following way:
for all $P, U \in \Delta_n$, and $\alpha = 1$ and 2. The following proposition holds:

Proposition 5.2. For all $P, U \in \Delta_n$, we have
(i) ${}^\alpha\mathscr{W}_r^s(P\|U) \ge 0$ ($\alpha = 1$ and 2) for all $r > 0$ and any $s$.
(ii) ${}^\alpha\mathscr{W}_r^s(P\|U)$ ($\alpha = 1$ and 2) are convex functions of the pair of distributions $(P, U) \in \Delta_n \times \Delta_n$ for all $s \ge r > 0$.
(iii)

Proof. (i) In view of Proposition 4.2(i) and relation (89), the nonnegativity of ${}^1J_r^s(P\|U)$ is clear. In view of relation (90), it is sufficient to prove the nonnegativity of ${}^2J_r^1(P\|U)$ given in Eq. (86). Its proof is as follows: By Lemma 4.2, we can write
i.e.,
Taking $\log(\cdot)$ on both sides of Eq. (93), multiplying by $(r-1)^{-1}$ ($r \neq 1$), and simplifying we get
(ii) It can be proved on lines similar to Proposition 4.2(ii), where instead of using Lemma 4.1, we use the fact that the function $\sum_{i=1}^{n} (p_i^r u_i^{1-r} + p_i^{1-r} u_i^r)$, $r \neq 1$, is convex in the pair $(P, U) \in \Delta_n \times \Delta_n$ for $r > 1$ or $r < 0$ and is concave for $0 < r < 1$.
(iii) Again by the use of Lemma 4.2, we have

for all $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$ and $U = (u_1, u_2, \ldots, u_n) \in \Delta_n$. Subtracting 2 on both sides of Eq. (94), multiplying by $(1 - 2^{1-s})^{-1}$ ($s \neq 1$), and simplifying we get

for all $P, U \in \Delta_n$ and $r > 0$, $r \neq 1$. Using the concavity of the logarithmic function, we can write
(96)
Multiplying Eq. (96) by $(r-1)^{-1}$ ($r \neq 1$), we obtain
(97)
In a similar way we can show that

Combining Eqs. (95)-(98), we get the required result. For statistical applications of the measures given in Eq. (91), refer to Taneja (1987).
VI. GENERALIZED ENTROPIES FOR MULTIVARIATE PROBABILITY DISTRIBUTIONS

The idea of an entropy measure needs to be developed for multivariate probability distributions, in particular for the bivariate case, especially in communication problems that require analysis of messages sent over a channel and received at the other end. The same is also required in bounding the Bayesian probability of error. In order to develop this idea, let us consider two discrete finite random variables $X = \{1, 2, \ldots, n\}$ and $Y = \{1, 2, \ldots, m\}$, or a joint experiment $(X, Y)$, with joint and individual (marginal) probabilities denoted by
\[ a_{ij} = \Pr\{X = i, Y = j\}, \quad A = (a_{11}, a_{12}, \ldots, a_{1m}, \ldots, a_{n1}, a_{n2}, \ldots, a_{nm}) \in \Delta_{nm}, \]
\[ p_i = \Pr\{X = i\}, \quad P = (p_1, p_2, \ldots, p_n) \in \Delta_n, \]
and
\[ q_j = \Pr\{Y = j\}, \quad Q = (q_1, q_2, \ldots, q_m) \in \Delta_m, \]
for all $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$. The conditional probability of $Y = j$ given $X = i$ is denoted by
\[ b_{j|i} = \Pr\{Y = j \mid X = i\}, \quad B_i = (b_{1|i}, b_{2|i}, \ldots, b_{m|i}) \in \Delta_m \]
for all $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$. Similarly, the conditional probability of $X = i$ given $Y = j$ is denoted by
\[ b_{i|j} = \Pr\{X = i \mid Y = j\}, \quad B_j = (b_{1|j}, b_{2|j}, \ldots, b_{n|j}) \in \Delta_n \]
for all $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$. Let us also denote
\[ P*Q = (p_1 q_1, p_1 q_2, \ldots, p_1 q_m, \ldots, p_n q_1, \ldots, p_n q_m) \in \Delta_{nm}. \]
The following relations are well known in the literature:

for all $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$. If $X$ and $Y$ are independent random variables, then
\[ a_{ij} = p_i q_j, \quad i = 1, 2, \ldots, n;\ j = 1, 2, \ldots, m. \]
Based on the above notations, the joint and individual unified (r,s)-entropies can be written as
\[ \mathscr{E}_r^s(X, Y) = \mathscr{E}_r^s(A), \quad \mathscr{E}_r^s(X) = \mathscr{E}_r^s(P), \]
and
\[ \mathscr{E}_r^s(Y) = \mathscr{E}_r^s(Q), \]
where $\mathscr{E}_r^s$ is the unified (r,s)-entropy given in Eq. (7). Also, we can easily write $\mathscr{E}_r^s(X, Y, Z)$, etc. Similarly, the individual conditional unified (r,s)-entropies are given by
\[ \mathscr{E}_r^s(Y \mid X = i) = \mathscr{E}_r^s(B_i), \quad i = 1, 2, \ldots, n \]
and
\[ \mathscr{E}_r^s(X \mid Y = j) = \mathscr{E}_r^s(B_j), \quad j = 1, 2, \ldots, m. \]
There is no unique way to define the conditional generalized entropies; different authors have defined them in different ways. We shall specify here five different ways to define conditional generalized entropies. One is restricted to entropies of degree s given in Eq. (3), and the other four are for the unified (r,s)-entropy given in Eq. (7). We shall observe that these different approaches reduce in the limiting case to the well-known Shannon's conditional entropy. These five approaches are divided into two subsections: the first deals only with the entropy of degree s and the second with the unified entropy. Henceforth, unless otherwise specified, the letters $X, Y, Z, \ldots, X_1, X_2, \ldots$ etc. will represent discrete finite random variables.

A. Entropy of Degree s for Multivariate Probability Distributions
In this subsection, we shall define a conditional entropy of degree s, which in the limiting case contains Shannon's conditional entropy. This definition was first considered by Daroczy (1970) and satisfies many of the properties of Shannon's case. In order to simplify the results, let us unify these two entropies in the following way:
\[ C^s(P) = \begin{cases} (2^{1-s} - 1)^{-1}\left[\sum_{i=1}^{n} p_i^s - 1\right], & s \neq 1,\ s > 0, \\ -\sum_{i=1}^{n} p_i \log p_i, & s = 1 \end{cases} \]
for all $P = (p_1, p_2, \ldots, p_n) \in \Delta_n$.
Define
\[ C^s(X \mid Y) = \sum_{j=1}^{m} q_j^s\, C^s(X \mid Y = j), \quad s > 0, \]
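where
\[ C^s(X \mid Y = j) = \begin{cases} (2^{1-s} - 1)^{-1}\left[\sum_{i=1}^{n} b_{i|j}^s - 1\right], & s \neq 1,\ s > 0, \\ -\sum_{i=1}^{n} b_{i|j} \log b_{i|j}, & s = 1. \end{cases} \tag{99} \]
In a similar way, we can define $C^s(Y \mid X)$. Define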
\[ C^s(X, Y \mid Z) = \sum_{l=1}^{v} c_l^s\, C^s(X, Y \mid Z = l), \]
where

and $c_l = \Pr\{Z = l\}$ for all $l = 1, 2, \ldots, v$.
Also define
\[ I^s(X \wedge Y) = C^s(X) - C^s(X \mid Y), \quad s > 0 \]
and

where

and
\[ C^s(X \mid Y = j, Z = l) = \]
for all $j = 1, 2, \ldots, m$ and $l = 1, 2, \ldots, v$. The measure $I^s(X \wedge Y)$ is known in the literature as the mutual information of degree s. Based on the definitions given above, the following propositions hold (Taneja, 1988b).
Proposition 6.1. For all $s > 0$, we have
(i) $C^s(X, Y) = C^s(X) + C^s(Y \mid X) = C^s(Y) + C^s(X \mid Y)$.
(ii) $C^s(X) = C^s(X \mid Y) + I^s(X \wedge Y)$.
(iii) $C^s(X, Y, Z) = C^s(X) + C^s(Y, Z \mid X) = C^s(X, Y) + C^s(Z \mid X, Y) = C^s(X) + C^s(Y \mid X) + C^s(Z \mid X, Y)$.
(iv) $C^s(X_1, X_2, \ldots, X_n) = C^s(X_1) + C^s(X_2 \mid X_1) + C^s(X_3 \mid X_1, X_2) + \cdots + C^s(X_n \mid X_1, X_2, \ldots, X_{n-1}) = \sum_{i=1}^{n} C^s(X_i \mid X_1, X_2, \ldots, X_{i-1})$.
(v) $C^s(Y, Z \mid X) = C^s(Y \mid X) + C^s(Z \mid X, Y)$.
(vi) $C^s(X \mid Z) = C^s(X \mid Y, Z) + I^s(X \wedge Y \mid Z)$.
(vii) $I^s(X, Y \wedge Z) = I^s(X \wedge Z) + I^s(Y \wedge Z \mid X)$.
(viii) $I^s(X \wedge Y) = C^s(X) + C^s(Y) - C^s(X, Y) = I^s(Y \wedge X)$.
(ix) $C^s(X) + C^s(Y) + C^s(Z) - C^s(X, Y, Z) = I^s(X, Y \wedge Z) + I^s(X \wedge Y)$.
(x) $I^s(X \wedge Z) + I^s(X \wedge Y \mid Z) = I^s(X \wedge Y) + I^s(X \wedge Z \mid Y)$.
(xi) $I^s(X_1, X_2 \wedge X_3 \mid X_4) = I^s(X_1 \wedge X_3 \mid X_4) + I^s(X_2 \wedge X_3 \mid X_1, X_4)$.

The proof of these properties is a simple verification.

Proposition 6.2. For all $s \ge 1$, we have
(i) $I^s(X \wedge Y) \ge 0$, i.e., $C^s(X \mid Y) \le C^s(X)$.
(ii) $I^s(X \wedge Y \mid Z) \ge 0$, i.e., $C^s(X \mid Y, Z) \le C^s(X \mid Z)$.

The proof of part (i) can be seen in Daroczy (1970), and part (ii) follows from part (i).
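The nonnegativity of the mutual information of degree s for $s \ge 1$ can be illustrated numerically. The following is a minimal sketch, assuming the entropy of degree s has the form $(2^{1-s} - 1)^{-1}[\sum_i p_i^s - 1]$ with base-2 logarithms at $s = 1$; the helper `random_joint` and the random test distributions are hypothetical and serve only as an illustration, not as a proof.

```python
import math
import random

def entropy_degree_s(p, s):
    """C^s(P): entropy of degree s for s != 1, Shannon entropy (bits) for s = 1."""
    if s == 1:
        return -sum(x * math.log2(x) for x in p if x > 0)
    return (sum(x ** s for x in p if x > 0) - 1.0) / (2.0 ** (1.0 - s) - 1.0)

def conditional_entropy_degree_s(joint, s):
    """C^s(X|Y) = sum_j q_j^s C^s(X|Y=j), with joint[i][j] = Pr{X=i, Y=j}."""
    total = 0.0
    for column in zip(*joint):
        q_j = sum(column)
        if q_j > 0:
            total += q_j ** s * entropy_degree_s([a / q_j for a in column], s)
    return total

def mutual_information_degree_s(joint, s):
    """I^s(X ^ Y) = C^s(X) - C^s(X|Y)."""
    p = [sum(row) for row in joint]
    return entropy_degree_s(p, s) - conditional_entropy_degree_s(joint, s)

def random_joint(n, m, rng):
    """Hypothetical helper: a random n x m joint distribution."""
    w = [[rng.random() for _ in range(m)] for _ in range(n)]
    z = sum(map(sum, w))
    return [[x / z for x in row] for row in w]

rng = random.Random(0)
values = [mutual_information_degree_s(random_joint(3, 4, rng), s)
          for s in (1, 1.5, 2, 3) for _ in range(50)]
```

Every entry of `values` should be nonnegative up to rounding error, in line with Proposition 6.2(i).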
(iv) $C^s(Y \mid X) + C^s(Z \mid X) \ge C^s(Y, Z \mid X)$, $s \ge 1$.
(v) $I^s(X, Y \wedge Z) \ge I^s(Y \wedge Z \mid X)$, $s \ge 1$.
(vi) If C s ( X , , X 2 )# 0, then C S ( XI Y ) cyx, Y )
+
I
CS(Y Z ) > C S ( X12)
s
C S ( Y , Z )= C S ( X , Z ) '
2 1.
Proof. (i) This is obvious from Proposition 6.1(i).
(ii) This is obvious from Proposition 6.1(v).
(iii) For all $s \ge 1$, we have
\[ C^s(X \mid Y) + C^s(Y \mid Z) \ge C^s(X \mid Y, Z) + C^s(Y \mid Z) \quad \text{(Proposition 6.2(ii))} \]
\[ = C^s(X, Y \mid Z) \ge C^s(X \mid Z). \]
(iv) For all $s \ge 1$, we have
\[ C^s(Y \mid X) + C^s(Z \mid X) \ge C^s(Y \mid X, Z) + C^s(Z \mid X) = C^s(Y, Z \mid X) \quad \text{(Proposition 6.1(v))}. \]
(v) For all $s \ge 1$, we have
\[ I^s(Y \wedge Z \mid X) \le I^s(X \wedge Z) + I^s(Y \wedge Z \mid X) = I^s(X, Y \wedge Z). \]
C S ( XI Y ) CS(Y I Z ) C S ( XI Y ) CS(Y I Z ) CS(Z)'
+
+
d t ( X , Y ) = d",x,y , C"X, Y )'
C"X, Y ) # 0,
and
Then for all $\alpha = 1, 2$, and 3, we have
(i) $d_\alpha^s(X, Y) \ge 0$, $d_\alpha^s(X, X) = 0$, $s > 0$.
(ii) $d_\alpha^s(X, Y) = d_\alpha^s(Y, X)$, $s > 0$.
(iii) $d_\alpha^s(X, Y) + d_\alpha^s(Y, Z) \ge d_\alpha^s(X, Z)$, $s \ge 1$.

This means that for $s \ge 1$, $d_\alpha^s(X, Y)$ ($\alpha = 1, 2$, and 3) form pseudometric spaces among the random variables.

Proof. For $\alpha = 1$ and 2, the proof follows from Proposition 6.3(iii) and (iv), respectively. For $s = 1$, when $\alpha = 1, 2$, and 3, refer to Horibe (1973, 1985). Let us prove the result for $\alpha = 3$. We will prove this in three different cases.
Case 1. When $C^s(X) \ge C^s(Y) \ge C^s(Z) > 0$, we have

Case 2. When $C^s(X) \ge C^s(Z) \ge C^s(Y) > 0$, we have
Case 3. When $C^s(Z) \ge C^s(X) \ge C^s(Y) > 0$, we have
This completes the proof of the proposition.
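The triangle inequality can be illustrated for the first distance. The following is a minimal sketch, assuming $d_1^s(X, Y) = C^s(X \mid Y) + C^s(Y \mid X) = 2C^s(X, Y) - C^s(X) - C^s(Y)$ and the entropy of degree s in the form $(2^{1-s} - 1)^{-1}[\sum_i p_i^s - 1]$; the helpers `marginal` and `random_joint` are hypothetical names introduced for this sketch.

```python
import itertools
import random

def entropy_degree_s(p, s):
    """Entropy of degree s (s != 1): (2^(1-s) - 1)^(-1) (sum p_i^s - 1)."""
    return (sum(x ** s for x in p if x > 0) - 1.0) / (2.0 ** (1.0 - s) - 1.0)

def marginal(joint, axes):
    """Marginalise a dict {(x, y, z): prob} onto the given coordinate positions."""
    out = {}
    for outcome, prob in joint.items():
        key = tuple(outcome[a] for a in axes)
        out[key] = out.get(key, 0.0) + prob
    return list(out.values())

def d1(joint, a, b, s):
    """d_1^s between coordinates a and b: 2 C^s(A,B) - C^s(A) - C^s(B)."""
    return (2 * entropy_degree_s(marginal(joint, (a, b)), s)
            - entropy_degree_s(marginal(joint, (a,)), s)
            - entropy_degree_s(marginal(joint, (b,)), s))

def random_joint(rng, size=3):
    """Hypothetical helper: a random joint distribution of (X, Y, Z)."""
    cells = list(itertools.product(range(size), repeat=3))
    w = [rng.random() for _ in cells]
    z = sum(w)
    return {c: x / z for c, x in zip(cells, w)}

rng = random.Random(1)
gaps = []
for s in (1.5, 2.0, 3.0):
    for _ in range(50):
        joint = random_joint(rng)
        # Positions 0, 1, 2 stand for X, Y, Z.
        gaps.append(d1(joint, 0, 1, s) + d1(joint, 1, 2, s) - d1(joint, 0, 2, s))
```

Each entry of `gaps` is $d_1^s(X,Y) + d_1^s(Y,Z) - d_1^s(X,Z)$ and should be nonnegative for $s \ge 1$.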
Proof. We have
(i)
\[ d_1^s(X, Y) = C^s(X \mid Y) + C^s(Y \mid X) = 2C^s(X, Y) - C^s(X) - C^s(Y) \ge \max\{C^s(X) - C^s(Y),\ C^s(Y) - C^s(X)\} = |C^s(X) - C^s(Y)|, \quad s > 0. \]
(ii) For $s \ge 1$, we know that
\[ C^s(X \mid Z) \le C^s(X \mid Y) + C^s(Y \mid Z), \]
i.e.,
\[ C^s(X \mid Z) - C^s(Y \mid Z) \le C^s(X \mid Y) \le C^s(X \mid Y) + C^s(Y \mid X) = d_1^s(X, Y). \]
(100)
Since $C^s(X_2 \mid X_1) \ge 0$, this gives
\[ C^s(X_1 \mid X_2) \ge C^s(X_1) - C^s(X_2). \]
Expressions (106) and (107) together give the required result.
B. Unified (r,s)-Conditional Entropies

In the previous subsection, the definition of $C^s(X \mid Y)$ is based on the well-known property of Shannon's entropy, i.e., it is especially defined to satisfy the following property:
\[ C^s(X, Y) = C^s(Y) + C^s(X \mid Y), \quad s > 0. \tag{108} \]
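The role of the weights $q_j^s$ in this property can be seen in a short numerical check. The following is a minimal sketch, assuming the entropy of degree s in the form $(2^{1-s} - 1)^{-1}[\sum_i p_i^s - 1]$ and an illustrative $2 \times 3$ joint distribution that is not taken from the text.

```python
def entropy_degree_s(p, s):
    """Entropy of degree s (s != 1): (2^(1-s) - 1)^(-1) (sum p_i^s - 1)."""
    return (sum(x ** s for x in p if x > 0) - 1.0) / (2.0 ** (1.0 - s) - 1.0)

def conditional_entropy_degree_s(joint, s):
    """C^s(X|Y) = sum_j q_j^s C^s(X|Y=j), with joint[i][j] = Pr{X=i, Y=j}."""
    total = 0.0
    for column in zip(*joint):
        q_j = sum(column)
        if q_j > 0:
            total += q_j ** s * entropy_degree_s([a / q_j for a in column], s)
    return total

# Illustrative joint distribution (rows index X, columns index Y).
joint = [[0.10, 0.25, 0.05],
         [0.30, 0.10, 0.20]]
q = [sum(column) for column in zip(*joint)]   # marginal distribution of Y
flat = [a for row in joint for a in row]      # joint distribution as one vector

# C^s(X,Y) - C^s(Y) - C^s(X|Y) for several degrees s.
residuals = [entropy_degree_s(flat, s) - entropy_degree_s(q, s)
             - conditional_entropy_degree_s(joint, s)
             for s in (0.5, 2.0, 3.0)]
```

The residuals vanish up to rounding; replacing `q_j ** s` by `q_j` in the conditional entropy would break the identity for $s \neq 1$.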
Some authors (Sahoo, 1983; Van der Lubbe et al., 1987) extended Eq. (108) to other entropies, but this did not give a simplified expression, as in the case of $C^s(X \mid Y)$ given in Eq. (99). In this subsection, we shall use four different ways to define the unified (r,s)-conditional entropies. When $s = 1$ in Eq. (99), we have
\[ H(X \mid Y) = \sum_{j=1}^{m} q_j H(X \mid Y = j), \tag{109} \]
where
\[ H(X \mid Y = j) = -\sum_{i=1}^{n} b_{i|j} \log b_{i|j}, \quad j = 1, 2, \ldots, m. \]
Let us replace $H(X \mid Y = j)$ given in Eq. (99) by the unified (r,s)-conditional (individual) entropy $\mathscr{E}_r^s(X \mid Y = j)$ ($j = 1, 2, \ldots, m$). Then we have
\[ {}^1\mathscr{E}_r^s(X \mid Y) = \sum_{j=1}^{m} q_j\, \mathscr{E}_r^s(X \mid Y = j) \tag{110} \]
for all $r > 0$ and any $s$. More clearly, we have the following individual expressions:
s # l , r # 1,r>0,
Y ) = (2'-l - l)-'
qj [jI1
1 b$ ) - 1 ],
(iI1
t # 1 , t > 0.
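The first definition, a $q_j$-weighted average of the individual conditional entropies, can be sketched in code. The following is a minimal sketch, assuming the unified (r,s)-entropy of Eq. (7) has the generic form $(2^{1-s} - 1)^{-1}[(\sum_i p_i^r)^{(s-1)/(r-1)} - 1]$ for $r \neq 1$, $s \neq 1$; the limiting case is only approximated by taking $r$ and $s$ close to 1, and the joint distribution is illustrative.

```python
import math

def unified_entropy(p, r, s):
    """Generic case r != 1, s != 1, r > 0 of the (assumed) unified (r, s)-entropy."""
    power_sum = sum(x ** r for x in p if x > 0)
    return (power_sum ** ((s - 1.0) / (r - 1.0)) - 1.0) / (2.0 ** (1.0 - s) - 1.0)

def unified_conditional_entropy_1(joint, r, s):
    """First definition: sum_j q_j E_r^s(X|Y=j), with joint[i][j] = Pr{X=i, Y=j}."""
    total = 0.0
    for column in zip(*joint):
        q_j = sum(column)
        if q_j > 0:
            total += q_j * unified_entropy([a / q_j for a in column], r, s)
    return total

def shannon_conditional_entropy(joint):
    """Shannon's conditional entropy H(X|Y) in bits."""
    total = 0.0
    for column in zip(*joint):
        q_j = sum(column)
        total -= sum(a * math.log2(a / q_j) for a in column if a > 0)
    return total

# Illustrative joint distribution (rows index X, columns index Y).
joint = [[0.10, 0.25, 0.05],
         [0.30, 0.10, 0.20]]
near_limit = unified_conditional_entropy_1(joint, 1 + 1e-6, 1 + 2e-6)
shannon = shannon_conditional_entropy(joint)
```

With $r$ and $s$ this close to 1, `near_limit` and `shannon` should agree to several decimal places, which illustrates the reduction to Shannon's conditional entropy.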
We shall now use the expressions given in Eqs. (114) and (115) as the basis for writing an alternative way of defining unified (r,s)-conditional entropies. Let us define
\[ {}^2H_r^1(X \mid Y) = (1 - r)^{-1} \log\left\{ \sum_{j=1}^{m} q_j \sum_{i=1}^{n} b_{i|j}^r \right\}, \quad r \neq 1,\ r > 0 \tag{116} \]
and

The definition of ${}^2H_r^1(X \mid Y)$ is based on expression (114), and the definition of ${}^3H_r^1(X \mid Y)$ is based on expression (115). In the limiting case we have
\[ \lim_{r \to 1} {}^2H_r^1(X \mid Y) = \lim_{r \to 1} {}^3H_r^1(X \mid Y) = H(X \mid Y), \]
where H ( X 1 Y ) is as given in Eq. (109). We shall now use expressions (116) and (117) to define the conditional entropies of order r and degree s, using the compositivity relation given in Eq. (9). These definitions are as follows: 'H:w
I Y) = YA2H,'(X I Y)) = (21 - s
-
1)-
1
{(jtl
qjbilj)-l'r-' -
I),
s # 1, r # 1, r > 0, ( 1 18)
and
"XX
I Y ) = YA3Hf(XI Y ) ) ,
In the limiting case we have lim 'H;(x1 Y ) = lim 3H;(X 1 Y), r-
I
r-l
Also we can check that
'HE(X I Y ) = 2H:(X I Y) and : H ( X I Y ) = : H ( X I Y ) . The exact expressions of : H ( X 1 Y ) and 3 H 3 X 1 Y ) are given by $4(XIY)=(2'-'-
( 120)
l)-l
and
(121)
respectively. Expression (120) is obtained from (118) by taking $r = 1/t$ and $s = 2 - t$. Expression (121) is obtained from (119) by taking $r = s$.
We know that
\[ I(X \wedge Y) = H(X) - H(X \mid Y), \]
where $I(X \wedge Y)$ is the well-known mutual information (Ash, 1965) between the random variables $X$ and $Y$. Based on the definitions of unified (r,s)-conditional entropies given above, we can generalize $I(X \wedge Y)$ in the following way:
\[ {}^\alpha\mathscr{I}_r^s(X \wedge Y) = \mathscr{E}_r^s(X) - {}^\alpha\mathscr{E}_r^s(X \mid Y), \]
where $\alpha = 1, 2$, and 3. By simple calculations we can write
\[ I(X \wedge Y) = D(A \| P*Q), \]
where
\[ D(A \| P*Q) = \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij} \log \frac{a_{ij}}{p_i q_j} \]
is a directed divergence between the distributions $A$ and $P*Q$. We shall now present a fourth way to define the unified (r,s)-conditional entropy. This is based on the generalization of $D(A \| P*Q)$ in terms of the unified (r,s)-directed divergence $\mathscr{F}_r^s(A \| P*Q)$ given in Eq. (49). This definition is as follows:
\[ {}^4\mathscr{E}_r^s(X \mid Y) = \mathscr{E}_r^s(X) - {}^4\mathscr{I}_r^s(X \wedge Y), \]
where
\[ {}^4\mathscr{I}_r^s(X \wedge Y) = \mathscr{F}_r^s(A \| P*Q). \]
Thus we can write
\[ {}^\alpha\mathscr{I}_r^s(X \wedge Y) = \mathscr{E}_r^s(X) - {}^\alpha\mathscr{E}_r^s(X \mid Y), \]
where
and
for $\alpha = 1, 2, 3$, and 4.

Remarks. ${}^1H_r^1(X \mid Y)$ is defined in a natural way, as is Shannon's entropy. ${}^2H_r^1(X \mid Y)$ can be found in Aczel and Daroczy (1963) and Behara and Nath (1970). ${}^3H_r^1(X \mid Y)$ has been taken by Arimoto (1975) to relate it to Gallager's random coding exponent function. ${}^4H_r^1(X \mid Y)$ has been adopted by Renyi (1960) and is based on the definition of mutual information between random variables.

Based on the above definitions the following proposition holds:

Proposition 6.6. We have
(i) $\mathscr{E}_r^s(X) \ge 0$, ${}^\alpha\mathscr{E}_r^s(X \mid Y) \ge 0$ ($\alpha = 1, 2, 3$, and 4).
(ii) $\mathscr{E}_r^s(X, Y) \ge \mathscr{E}_r^s(X)$ or $\mathscr{E}_r^s(Y)$.
(iii) If $X$ and $Y$ are independent random variables, then
\[ \mathscr{E}_r^s(X, Y) = \mathscr{E}_r^s(X) + \mathscr{E}_r^s(Y) + (2^{1-s} - 1)\, \mathscr{E}_r^s(X)\, \mathscr{E}_r^s(Y). \]
(iv) ${}^\alpha\mathscr{E}_r^s(X \mid Y) \le \mathscr{E}_r^s(X)$;
(e1) for $\alpha = 1$ it is true for $(r,s) \in \Gamma_1$,
(e2) for $\alpha = 2, 3$, and 4 it is true for all $r > 0$ and any $s$.
(vi) ${}^2\mathscr{E}_r^s(X \mid Y) \le {}^3\mathscr{E}_r^s(X \mid Y)$.
Parts (i), (ii), (iii), and (vi) are true for all $r > 0$ and any $s$.
(122)
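The behaviour of the unified entropy under independence, part (iii) above, can be checked directly. The following is a minimal sketch, assuming the unified (r,s)-entropy of Eq. (7) has the generic form $(2^{1-s} - 1)^{-1}[(\sum_i p_i^r)^{(s-1)/(r-1)} - 1]$ for $r \neq 1$, $s \neq 1$; under that assumption the cross term carries the coefficient $(2^{1-s} - 1)$. The distributions are illustrative.

```python
def unified_entropy(p, r, s):
    """Generic case r != 1, s != 1, r > 0 of the (assumed) unified (r, s)-entropy."""
    power_sum = sum(x ** r for x in p if x > 0)
    return (power_sum ** ((s - 1.0) / (r - 1.0)) - 1.0) / (2.0 ** (1.0 - s) - 1.0)

# Illustrative marginals of two independent random variables X and Y.
P = [0.5, 0.3, 0.2]
Q = [0.6, 0.4]
product = [p * q for p in P for q in Q]   # joint distribution P*Q

def additivity_gap(r, s):
    """E(X,Y) - [E(X) + E(Y) + (2^(1-s) - 1) E(X) E(Y)] under independence."""
    ex, ey = unified_entropy(P, r, s), unified_entropy(Q, r, s)
    return unified_entropy(product, r, s) - (ex + ey + (2.0 ** (1.0 - s) - 1.0) * ex * ey)

gaps = [additivity_gap(r, s) for r, s in ((2.0, 3.0), (0.5, 2.0), (3.0, 0.5))]
```

The gaps vanish up to rounding because $\sum_{i,j} (p_i q_j)^r = (\sum_i p_i^r)(\sum_j q_j^r)$.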
Proof. Parts (i), (ii), and (iii) are easy to verify. Part (iv)(e1) follows from the concavity of $\mathscr{E}_r^s(X)$ for $(r,s) \in \Gamma_1$ given in property 8. For part (iv)(e2) when $\alpha = 2$ and 3, it is sufficient to prove the results ${}^2H_r^1(X \mid Y) \le H_r^1(X)$ and ${}^3H_r^1(X \mid Y) \le H_r^1(X)$. The first follows from Van der Lubbe et al. (1982), and the second follows from Arimoto (1975). For $\alpha = 4$, part (iv)(e2) holds because of the nonnegativity of $\mathscr{F}_r^s(A \| P*Q)$ (i.e., $\mathscr{F}_r^s(P \| U)$) for all $r > 0$ and any $s$ given in Section IV. Let us now prove parts (v) and (vi). (iv) From Lemma 4.2, we can write
\s-l/r-l
,.
> - I
1
-> r-1
1,-
s- 1
'-
s-1 0 0.
This gives 9s(2Hf(x1 Y)) 5 9s(3Hf(xI Y ) ) ,
(129)
i.e.,
\[ {}^2H_r^s(X \mid Y) \le {}^3H_r^s(X \mid Y), \quad r \neq 1,\ s \neq 1,\ r > 0. \tag{130} \]
When $r = 1$, we have
\[ {}^2H_1^s(X \mid Y) \le {}^3H_1^s(X \mid Y), \quad s \neq 1. \tag{131} \]
Combining Eqs. (129)-(131), we get the required result.
I
Proposition 6.7. We have
H”X, Y )
s-1
5 H ; ( Y )+ ‘H;(X I Y),
s
2 r, r r - l 2 1,
2 H s ( Y ) + ‘H:(XI Y ) ,
r
5 s, r - -
~
s-1 r-1
(132)
5 1.
for all r # 1, s # 1, r > 0 Proof. We know (Behara and Chawla, 1974; Rathie and Taneja, 1989) that
where 0
Sj
5 y j , j = 1,2,.. .,m. We also know (Gallager, 1968, pp. 523) that
1
5
u; i= 1
2
uij]
= 45,
o < r 5 1, (1 34)
(
aijJ = 45,
r
21
for all j = 1,2,.. . ,m. Case 1. 0 < r 5 1. In this case, substituting n
h j = C u L , j = l , & ...,m, i= 1
yj=q5,j=
1,2,..., m,
and s-1
p=l_1 ,
r#l,s#l
in Eq. ( 1 33), we get
I
s-l/r-1
s- l/r- 1
2 1 or-
, m
s-l/,-l
r-1
(135) s- l / r - 1
m
s-1 I 1. r-1-
0 < r < 1,0 1. In this case, substituting
6.,= q',, yj
=
,Ia:j,
I=
1
j =
LZ...,m, j = I , 2,. . .,m,
and s-
p
=
1 z,
s#l,rfl,
into Eq. (133), we get
1.e..
(138)
\
s-1 r > 1,0 l,p>O,rp> l ( o r O < r < l,p 1 , r p g 1.
5 [(n
- 1)'
Proposition 7.3. We have the following bounds: (i) P,
sI
-
Tf(X I Y ) ,
r 2 0, p 2 0, r # p ,
The proof of propositions 7.2 and 7.3 is based on propositions 3.1 and 3.2, respectively, and it can be seen in Capocelli et al. (1985). The particular cases of propositions 7.1-7.3 involve known entropies and distance measures and are as follows: Shannon's entropy. (Chu and Chien, 1966; Hellman and Raviv, 1970). We have 1 P 0 i n Eqs. (147), (148). and (149), we get Eqs. (167), (168). and (169), respectively. For the bound (168), also refer to Toussaint (1977) and Taneja (1983). The first condition for the bound given in (167) is 0 < I' < I , which is because of Eq. (147),but it has been proved independently by Ben-Bassat and Raviv (1978) that i t holds for 0 < r 5 2 ( r # 1). Entr-opj,of' kind
t.
(Boekee and Van der Lubbe, 1980;Taneja, 1982). We have 1 P 0,
r#l,r>O,
r # l,s#I,r>O,
and
r#l,s#l,r>O.
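Let us write these generalizations in a unified way:
\[ {}^\alpha\mathscr{V}_r^s = \begin{cases} {}^\alpha R_r^s, & r \neq 1,\ s \neq 1,\ r > 0, \\ {}^\alpha R_1^s, & r = 1,\ s \neq 1, \\ {}^\alpha R_r^1, & r \neq 1,\ s = 1,\ r > 0, \\ R, & r = 1,\ s = 1, \end{cases} \]
where $\alpha = 2$ and 3. We have the following relation [refer to expression (74)]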
where CI = 2 and 3. We have the following relation [refer to expression (74)]
z3d ;,
ssr,
29"S,
s
2 r.
Let us write the measures relating to "9'; in a more general form for the two-class case as follows:
and
r # 1,sf l,r>O.
Let us write these measures in a unified way:
When $p_1 = p_2 = 1/2$ in Eq. (183), we have
Thus we have
where
and
Using the concavity of the logarithmic function, we can write
From Eqs. (186) and (187), we have
where
and
= (;)'-'c1
-
'H;(X I Y ) ] ,
where
s - I/*- 1
-p(y)dy)
-
I]},
By Lemma 4.2, we can write 1s -
lir- 1
r # 1, s # 1, I > 0.
(190)
From Eqs. (190) and (1 91), we obtain
where
and
\[ H_r^s(X \mid Y = y) = (2^{1-s} - 1)^{-1}\left\{ \left[ p(x_1 \mid y)^r + p(x_2 \mid y)^r \right]^{\frac{s-1}{r-1}} - 1 \right\}, \quad r \neq 1,\ s \neq 1,\ r > 0. \]
Unifying the results given in Eqs. (185), (188), (189), and (192), we have
where ' Y : ( X [Y ) is as given in Eq. (149) with X = (x1,x2)and Y as a continuous random variable. Based on the relations given above we shall now present some error bounds. 1. Upper Bounds on the Probability of Error in Terms of
3 * y X ~ lPJ , and 3C
Proposition 7.4. We have
and
where 3 Y 3 p , , p 2 ) and
33yi are
given by Eqs. (183) and (181), respectively.
Proof. From Eq. (147), we have
By Eqs. (193) and (196) we get Eq. (194), while Eq. (195) follows from Eqs. (184) and (194).

2. Lower Bounds on ${}^3\mathscr{V}_r^s(p_1, p_2)$ and ${}^3\mathscr{V}_r^s$ in Terms of the Probability of Error

When $n = 2$ in Eq. (156), we have
\[ H(X \mid Y) \le H(P_e), \tag{197} \]
where
\[ H(P_e) = -P_e \log P_e - (1 - P_e) \log(1 - P_e). \]
From Eqs. (185) and (197), we have
\[ R(p_1, p_2) \ge 1 - H(P_e). \tag{198} \]
Using Lemma 4.1, we can write
Let p(e I y ) = min
then from Eq. (199), we get
S P : + ( l -Pe)', ( 1 - Pe)',
2 P:
+
O < r < 1, r > 1.
(200)
Taking $\log(\cdot)$ on both sides of Eq. (200), multiplying by $(1 - r)^{-1}$ ($r \neq 1$), and simplifying, we obtain
\[ {}^3R_r^1(p_1, p_2) \ge 1 - H_r^1(P_e), \quad r \neq 1,\ r > 0, \tag{201} \]
where
\[ H_r^1(P_e) = (1 - r)^{-1} \log\left[ P_e^r + (1 - P_e)^r \right], \quad r \neq 1,\ r > 0. \]
When $r \to 1$, Eq. (201) reduces to Eq. (198).
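The Shannon case underlying these bounds, $H(X \mid Y) \le H(P_e)$ for two classes, can be illustrated numerically. The following is a minimal sketch, assuming a discrete observation $Y$ in place of the continuous one used in the text, base-2 logarithms, and illustrative priors and likelihoods; $P_e$ is taken to be the Bayes probability of error $\sum_y \min\{p_1 p(y \mid x_1),\ p_2 p(y \mid x_2)\}$.

```python
import math

def binary_entropy(p):
    """H(p) = -p log2 p - (1 - p) log2 (1 - p), with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def conditional_entropy_and_error(priors, likelihoods):
    """Return (H(X|Y), P_e) for two classes; likelihoods[k][y] = p(y | x_k)."""
    h_cond, p_err = 0.0, 0.0
    for l1, l2 in zip(*likelihoods):
        joint1, joint2 = priors[0] * l1, priors[1] * l2
        p_y = joint1 + joint2
        if p_y > 0:
            h_cond += p_y * binary_entropy(joint1 / p_y)  # entropy of the posterior
            p_err += min(joint1, joint2)                  # Bayes error contribution
    return h_cond, p_err

# Illustrative two-class problem with three observation values.
priors = (0.5, 0.5)
likelihoods = ([0.7, 0.2, 0.1], [0.1, 0.3, 0.6])
h_cond, p_err = conditional_entropy_and_error(priors, likelihoods)
```

Here `p_err` equals 0.2, and `h_cond` stays below `binary_entropy(p_err)`, as the concavity of the binary entropy function implies.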
We can write
=
(')
1 --s
~ 1 ~-? ( p e ) 1 9
(202)
where H"Pe)
= (21-3 -
1)[2(1-"'H'Pe' - 11,
s # 1.
Also,
V~,C~R,'(P~,PJI,
3 R X ~ 1 , ~= 2)
2 qs[1
- Hf(Pe)I,
(i)
1 -s
=
r # 1 , s # 1, r > 0,
[l
-
H"Pe)],
+ (1
-
Pe)*]s-l'r-l - l},
(203)
where HpJJ
= ( 2 1 - s-
l)-l{[P;
r#l,s#I,r>O.
Combining Eqs. (198), (201), (202), and (203), we have proved the following proposition: Proposition 7.5. We have
Particular case of Eq. (204)
When $p_1 = p_2 = 1/2$, then from Eqs. (204) and (184), we have
2
3v;
(3'~ s[I
-
a;(Pe)].
From relation (182) and result (205), we can also obtain

where ${}^2\mathscr{V}_r^s$ is given in Eq. (181) for $\alpha = 2$. From the inequalities given in Eq. (182), it is quite clear that the result, Eq. (205), is better than Eq. (206).

C. Generalized Measure of Chernoff, Bhattacharya Distance, and the Probability of Error

Let

$r > 0$.
When $r = 1/2$ in Eq. (207), we have
\[ K_{1/2} = F, \tag{208} \]
where $F$ is the well-known Bhattacharya distance or Matusita's measure of affinity (Matusita, 1967). The measure
\[ \int p(y \mid x_1)^r\, p(y \mid x_2)^{1-r}\, dy \tag{209} \]
is known as the Chernoff measure (Kailath, 1967). Thus, based on Eq. (209), we call $K_r$ given in Eq. (207) the generalized measure of Chernoff. Let us write $K_r$ and $F$ in a more general form involving the prior probabilities $p_1$ and $p_2$ given by
and
When $p_1 = p_2 = 1/2$, we have

and

We can simplify measures (210) and (211) in the following way:

and

where

and $F(y) = \sqrt{p(x_1 \mid y)\, p(x_2 \mid y)}$.
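The link between the Bhattacharya measure and the probability of error can be illustrated with a discrete observation. The following is a minimal sketch, assuming $F(p_1, p_2) = \sum_y p(y) \sqrt{p(x_1 \mid y)\, p(x_2 \mid y)} = \sum_y \sqrt{p_1 p(y \mid x_1)\, p_2 p(y \mid x_2)}$ as the discrete analogue of the prior-weighted measure; it uses the classical bound $P_e \le F(p_1, p_2)$, which follows from $\min\{a, b\} \le \sqrt{ab}$ and is quoted here as a well-known fact rather than as a result of this section. All numbers are illustrative.

```python
import math

def chernoff_measure(l1, l2, r):
    """Discrete analogue of Eq. (209): sum_y p(y|x1)^r p(y|x2)^(1-r)."""
    return sum(a ** r * b ** (1.0 - r) for a, b in zip(l1, l2))

def bhattacharyya_with_priors(priors, l1, l2):
    """Assumed F(p1, p2): sum_y sqrt(p1 p(y|x1) p2 p(y|x2))."""
    return sum(math.sqrt(priors[0] * a * priors[1] * b) for a, b in zip(l1, l2))

def bayes_error(priors, l1, l2):
    """P_e = sum_y min{p1 p(y|x1), p2 p(y|x2)}."""
    return sum(min(priors[0] * a, priors[1] * b) for a, b in zip(l1, l2))

# Illustrative priors and class-conditional distributions.
priors = (0.3, 0.7)
l1 = [0.7, 0.2, 0.1]
l2 = [0.1, 0.3, 0.6]
p_err = bayes_error(priors, l1, l2)
f_prior = bhattacharyya_with_priors(priors, l1, l2)
f_equal = bhattacharyya_with_priors((0.5, 0.5), l1, l2)
```

At $r = 1/2$ the Chernoff measure reduces to $\sum_y \sqrt{p(y \mid x_1)\, p(y \mid x_2)}$, and with equal priors `f_equal` is one half of that value.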
Based on the above notations, we have the following proposition: Proposition 7.6. We have
Proof. We have
:;:p( I:;)]
+ p 2E , [
'"]2(1
-r),
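(215)
where $E_1$ and $E_2$ represent the expected values in their respective forms. Using Lemma 4.2 in Eq. (215), we have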
where E, and E 2 represent the expected values in their respective forms. Using Lemma 4.2 in Eq. (215), we have
2(1 - r) 2 1 or (2(1 - r) 5 0, 2(1 - r )
0 5 2(1
-
r) 5 1.
Simplifying Eq. (216) we obtain the required result.

Particular cases of Eq. (214)

(i) When $r = 1/2$, we have $K_{1/2}(p_1, p_2) = F(p_1, p_2)$.
(ii) When $p_1 = p_2 = 1/2$, we have
(2 F2(1-r),
Proposition 7.7. We have
1 0 < r 5 x,r 2 1 ,
where
\[ K_r(P_e) = \frac{1}{2}\left[ P_e^r (1 - P_e)^{1-r} + (1 - P_e)^r P_e^{1-r} \right], \quad r > 0. \tag{218} \]
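The proof of this proposition uses the shape of $K_r(p)$ as a function of $p$: concave for $0 < r < 1$ and convex for $r > 1$. The following is a minimal sketch that evaluates Eq. (218) and checks this shape by a midpoint test on an illustrative grid; the grid and the chosen values of $r$ are assumptions of the sketch, and the test is an illustration rather than a proof.

```python
import math

def k_r(pe, r):
    """K_r(P_e) = (1/2)[P_e^r (1-P_e)^(1-r) + (1-P_e)^r P_e^(1-r)], 0 < P_e < 1."""
    return 0.5 * (pe ** r * (1.0 - pe) ** (1.0 - r)
                  + (1.0 - pe) ** r * pe ** (1.0 - r))

def midpoint_gap(a, b, r):
    """K_r at the midpoint minus the average of K_r at the end points."""
    return k_r((a + b) / 2.0, r) - (k_r(a, r) + k_r(b, r)) / 2.0

grid = [0.05 * k for k in range(1, 10)]   # error probabilities in (0, 1/2)
concave_gaps = [midpoint_gap(a, b, 0.3) for a in grid for b in grid]
convex_gaps = [midpoint_gap(a, b, 2.0) for a in grid for b in grid]
```

A nonnegative midpoint gap indicates concavity and a nonpositive one convexity; at $r = 1/2$ the measure reduces to $\sqrt{P_e(1 - P_e)}$.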
Proof. We have
where
Let
It is easy to verify that K r ( p )is a convex function of p for r > 1 and a concave function of p for 0 < r < 1. Therefore, we can write
i.e.,
Also, we can write
Expressions (219), (220), and (221) together give the required result. Particular case of Eq. (217) (i) When p1 = p2 = 1/2, then from Eqs. (217) and (212), we have
2 P:(l 5 P:(l
-
+
Pe)' ( I - P,)'P,' - I , Pe)l-r+ (1 - P J P ; - ~ ,
r
1,
O O,
r
+ 1 , s # 1 , r > 0.
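The measures given above can be unified in the following way:
\[ {}^\alpha\mathscr{W}_r^s = \begin{cases} {}^\alpha J_r^s, & r \neq 1,\ s \neq 1,\ r > 0, \\ J_1^s, & r = 1,\ s \neq 1, \\ {}^\alpha J_r^1, & r \neq 1,\ s = 1,\ r > 0, \\ J, & r = 1,\ s = 1, \end{cases} \]
where $\alpha = 1$ and 2. The following inequalities also hold [refer to expression (92)]: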
Let us write the measures given in Eq. (223) for a = 2 in the more general case involving prior probabilities p1 and p 2 in the following way:
(J(Pl,P2),
where
r = 1 , s = 1,
and 2J;(pl,p2) =
( 1 - 2'-")-'
{[
(y(lPIP(Y
I X1)l"P2P(Y I X 2 ) 1 1 - r
+ CPIP(Y I X l ) l 1 - T P 2 P ( Y I X2)l')dY = (1 - 2'
{(Iy
I
CP(X1 Y)'P(X2
I Y)'
'-ll, --*
s- l j r - 1
+ P(X,lY)'P(X,
IY)1-rlP(Y)dY)
r#l,s#l,r>O. When p1 = p2 = 112 in Eq. (225), we have
If we write
then
and
where
- 11,
and JS(Y) = (1 - 21-s)-1{cP(x,JY)'P(X,lY)'-'
+ p(x, 1 y)' -'p(x, I y ) q -
I/'-
-
I},
r # L s # 1,r>0,
respectively. In a unified way we can write:
where
\[ {}^2\mathscr{W}_r^s(y) = \begin{cases} J_r^s(y), & r \neq 1,\ s \neq 1,\ r > 0, \\ J_1^s(y), & r = 1,\ s \neq 1, \\ J_r^1(y), & r \neq 1,\ s = 1,\ r > 0, \\ J(y), & r = 1,\ s = 1. \end{cases} \]
It is easy to check that the following inequalities hold:
Based on the above considerations we shall now present relations between generalizations of J-divergence, Bhattacharya distance, and the probability of error. Proposition 7.8. We have 2WS(Pl,P2) L I l Y - V e ) ,
3 W S ( p l , ~L2* p)S ( ~ e ) ,
(230) s L r > 0,
2w; 2 2WS(Pe),
(231) (232)
and where
I
' W ; 2 2W;(pe),
s 2 r > 0,
(233)
r#l,s#l,r>O, J;(Pe) = ( 1 - 2l-s)-'[K,(Pe)S-'/'-1 - 11, J",P,) = ( 1 - 21-s)-1[2(5-1)J(pe) - 11, r = 1,s # I, r # 1 , s = 1 , r > 0, Ws(Pe)= J ; ( P e ) = ( r - l ) - ' logK,(Pe), (234) J ( P e ) = (2Pe - 1)log
~
and K:(Pe) is as given in Eq. (218).
",>.
r = 1 , s = 1,
The proof of Eq. (230) follows from relation (217) given in proposition 7.7. Equation (232) follows from (230) and (226) by taking p1 = p 2 = 1/2. Equation (231) follows from relations (229) and (230). Equation (233) follows from relations (232) and (224). Proposition 7.9. We have
for any s, where
and $F(p_1, p_2)$ is as given in Eq. (211). The proof follows from inequalities (214) given by Proposition 7.6.

Particular case of Eq. (235). When $p_1 = p_2 = 1/2$, we have

where $F$ is as given in Eq. (208). The particular cases of Propositions 7.8 and 7.9 for $r = 1$ and $s = 1$ can be seen in Toussaint (1974) and in Devijver and Kittler (1981).
ENTROPY GRAPH
The following graph indicates how all the entropies given in Section II.F reduce to Shannon's case in the limiting or in the particular case:

[Figure: entropy graph]
REFERENCES

Aczel, J., and Daroczy, Z. (1963). Publicationes Mathematicae 10, 171-190.
Aczel, J., and Daroczy, Z. (1975). "On Measures of Information and their Characterizations," Academic Press, New York.
Arimoto, S. (1971). Information and Control 19, 181-190.
Arimoto, S. (1975). Colloq. on Information Theory, Keszthely, Hungary, 41-52.
Arimoto, S. (1976). IEEE Trans. on Inform. Theory IT-20, 460-473.
Ash, R. (1965). "Information Theory," Interscience, New York.
Behara, M., and Chawla, J. M. S. (1974). In "Entropy and Ergodic Theory: Selecta Statistica Canadiana," II, 15-38.
Behara, M., and Nath, P. (1970). In "Probability and Information Theory II" (M. Behara, K. Krickeberg, and J. Wolfowitz, eds.). Springer Verlag, Berlin, pp. 102-137.
Belis, M., and Guiasu, S. (1968). IEEE Trans. on Inform. Theory IT-14, 591-592.
Ben-Bassat, B. (1978). Information and Control 39, 227-242.
Ben-Bassat, B., and Raviv, J. (1978). IEEE Trans. on Inform. Theory IT-24, 324-331.
Blumer, A. (1982). Ph.D. Thesis, University of Illinois at Urbana-Champaign, Department of Mathematics.
Boekee, D. E., and van der Lubbe, J. C. A. (1979). Pattern Recognition 11, 353-360.
Boekee, D. E., and van der Lubbe, J. C. A. (1980). Information and Control 45, 136-155.
Burbea, J. (1984). Utilitas Mathematica 26, 171-192.
Burbea, J., and Rao, C. R. (1982). IEEE Trans. on Inform. Theory IT-28, 489-495.
Campbell, L. L. (1965). Information and Control 23, 423-429.
Campbell, L. L. (1985). Information Sciences 25, 199-210.
Campbell, L. L. (1987). Queen's Mathematical Preprint, No. 12.
Capocelli, R. M., and Taneja, I. J. (1984). Proc. IEEE Intern. Conf. on Systems, Man and Cybernetics, Oct. 9-12, Halifax, Canada, pp. 43-47.
Capocelli, R. M., and Taneja, I. J. (1985). Cybernetics and Systems 16, 341-376.
Capocelli, R. M., Gargano, L., Vaccaro, U., and Taneja, I. J. (1985). Proc. IEEE Intern. Conf. on Systems, Man and Cybernetics, Arizona, U.S.A., November 12-15, pp. 78-82.
Capocelli, R. M., de Santis, A., and Taneja, I. J. (1988). IEEE Trans. on Inform. Theory IT-34, 134-138.
Chen, C. H. (1976). Information Sciences 10, 159-171.
Chu, J. T., and Chueh, J. E. (1966). J. Franklin Inst. 282, 121-125.
Csiszar, I. (1972). Periodica Math. Hung. 2, 191-213.
Csiszar, I. (1974). Trans. of the 7th Prague Conf., pp. 83-86, Prague, Czechoslovakia.
Csiszar, I., and Korner, J. (1981). "Information Theory: Coding Theorems for Discrete Memoryless Systems." Academic Press, New York.
Daroczy, Z. (1970). Information and Control 16, 36-51.
Devijver, P. A. (1974). IEEE Trans. on Comp. C-23, 70-80.
Devijver, P. A. (1977). Information and Control 34, 222-226.
Devijver, P. A., and Kittler, J. V. (1982). "Pattern Recognition: A Statistical Approach." Prentice Hall, London.
Ferentinos, K., and Papaioannou, T. (1983). J. Comb. Inform. and Syst. Sci. 8, 286-294.
Ferguson, T. S. (1967). "Mathematical Statistics," pp. 284-308, Academic Press, New York.
Ferreri, C. (1980). Statistica XL, 155-168.
Gallager, R. G. (1968). "Information Theory and Reliable Communication." John Wiley and Sons, New York.
Gallager, R. G. (1978). IEEE Trans. on Inform. Theory IT-29, 668-674.
Guiasu, S. (1977). “Information Theory with Applications.” McGraw Hill, New York.
Györfi, L., and Nemetz, T. (1975). Colloq. on Inform. Theory, Keszthely, Hungary, pp. 309-331.
Hardy, G. H., Littlewood, J. E., and Pólya, G. (1934). “Inequalities.” Cambridge University Press, London.
Hartley, R. V. L. (1928). Bell System Tech. J. 7, 535-563.
Hellman, M. E., and Raviv, J. (1970). IEEE Trans. on Inform. Theory IT-16, 368-372.
Horibe, Y. (1973). Information and Control 22, 403-404.
Horibe, Y. (1985). IEEE Trans. on Systems, Man, and Cybernetics SMC-15, 641-642.
Jeffreys, H. (1946). Proc. Royal Soc. A186, 453-561.
Jelinek, F. (1968a). “Probabilistic Information Theory.” McGraw Hill, New York.
Jelinek, F. (1968b). IEEE Trans. on Inform. Theory IT-IS, 765-774.
Jelinek, F., and Schneider, K. (1972). IEEE Trans. on Inform. Theory IT-18, 765-774.
Kailath, T. (1967). IEEE Trans. on Commun. Tech. COM-15, 52-60.
Kanal, L. N. (1974). IEEE Trans. on Inform. Theory IT-20, 687-722.
Kapur, J. N. (1967). The Math. Seminar 4, 78-94.
Kapur, J. N. (1983). J. Inform. and Optim. Sci. 4, 207-232.
Kapur, J. N. (1986). Indian J. Pure & Appl. Math. 17, 429-449.
Kerridge, D. F. (1961). J. Royal Statist. Soc. B23, 184-194.
Kieffer, J. C. (1979). Information and Control 41, 136-146.
Kovalevski, V. A. (1968). In “Character Readers and Pattern Recognition,” pp. 3-30, V. A. Kovalevsky, Ed. New York: Spartan.
Kullback, S., and Leibler, R. A. (1951). Ann. Math. Statist. 22, 79-86.
Longo, G. (1980). “Information Theory” (in Italian). Boringhieri, Torino, Italy.
Mangasarian, O. L. (1969). “Nonlinear Programming,” Tata McGraw Hill, New Delhi/Bombay.
Marshall, A. W., and Olkin, I. (1979). “Inequalities: Theory of Majorization and Its Application,” Academic Press, New York.
Mathai, A. M., and Rathie, P. N. (1975). “Basic Concepts in Information Theory and Statistics.” Wiley and Sons, New York.
Matusita, K. (1967). Ann. Inst. Statist. Math. 19, 181-192.
McEliece, R. J. (1977). “The Theory of Information and Coding.” Encyclopedia of Mathematics and its Applications, Vol. 3. Addison Wesley, Reading, Massachusetts.
Nath, P. (1975). Information and Control 29, 234-242.
Nyquist, H. (1924). Bell Syst. Tech. J. 3, 324-.
Nyquist, H. (1928). AIEEE Trans. 47, 617-.
Parker, J., D. S. (1979). SIAM J. Comput. 9, 470-489.
Picard, C. F. (1979). J. Comb. Inform. and Syst. Sci. 4, 343-356.
Rao, C. R. (1982). Theor. Popul. Biology 21, 24-43.
Rathie, P. N. (1970). J. Appl. Probl. 7, 124-133.
Rathie, P. N., and Sheng, L. T. (1981). J. Comb. Inform. and Syst. Sci. 6, 197-205.
Rathie, P. N., and Taneja, I. J. (1989). Information Sciences, to appear.
Rényi, A. (1960). MTA III Osztályának Közl. 10, 251-282.
Rényi, A. (1961). Proc. 4th Berk. Symp. Math. Statist. and Probl., Vol. 1, pp. 547-561, University of California Press, Berkeley, California.
Sahoo, P. K. (1983). J. Comb. Inform. and Syst. Sci. 8, 263-270.
Sant’anna, A. P., and Taneja, I. J. (1985). Information Sciences 35, 145-155.
Shannon, C. E. (1948). Bell System Tech. J. 27, 379-423; 623-656.
Sharma, B. D., and Mittal, D. P. (1975). J. Math. Sci. 10, 28-40.
Sharma, B. D., and Mittal, D. P. (1977). J. Comb. Inform. and Syst. Sci. 2, 122-133.
Sharma, B. D., and Taneja, I. J. (1975). Metrika 22, 205-215.
Sharma, B. D., and Taneja, I. J. Elect. Inform. Kybern. 13, 419-433.
Shiva, S. S. G., Ahmed, N. U., and Georganas, N. D. (1973). J. Appl. Probl. 10, 666-670.
Sibson, R. (1969). Z. Wahrs. und Verw. Geb. 14, 149-160.
Taneja, I. J. (1975). Ph.D. Thesis, University of Delhi.
Taneja, I. J. (1979). J. Comb. Inform. and Syst. Sci. 4, 253-274.
Taneja, I. J. (1982). Proc. IEEE Intern. Conf. on Cybern. and Soc., Washington, D.C., October 28-30, pp. 463-466.
Taneja, I. J. (1983a). IEEE Trans. on Systems, Man, and Cybernetics SMC-13, 241-242.
Taneja, I. J. (1983b). J. Comb. Inform. & Syst. Sci. 8, 206-212.
Taneja, I. J. (1984a). Matemática Aplicada e Computacional 3, 199-204.
Taneja, I. J. (1984b). J. Comb. Inform. and Syst. Sci. 9, 169-174.
Taneja, I. J. (1985a). Pattern Recognition Letters 3, 361-368.
Taneja, I. J. (1985b). Proc. Intern. Conf. on Telecommunication and Control, Rio de Janeiro, Brazil, December 9-12, pp. 48-51.
Taneja, I. J. (1986a). Information Sciences 39, 211-216.
Taneja, I. J. (1986b). J. Comb. Inform. & Syst. Sci. 11, 99-109.
Taneja, I. J. (1987). Statistical Planning and Inference 16, 137-145.
Taneja, I. J. (1988a). Information Sciences, to appear.
Taneja, I. J. (1988b). Tamkang J. Math. 19.
Taneja, I. J. (1988c). Tamkang J. Math. 19.
Toussaint, G. T. (1974). Proc. Second Intern. Joint Conf. on Pattern Recog., Copenhagen, Denmark.
Toussaint, G. T. (1977). IEEE Trans. Systems, Man, and Cybernetics SMC-7, 300-302.
Trouborst, P. M., Backer, E., Boekee, D. E., and Boxma, Y. (1974). Proc. Second Intern. Joint Conf. on Pattern Recog., Copenhagen, Denmark.
Vajda, I. (1968). Inform. Trans. Problems 4, 9-19.
van der Lubbe, J. C. A. (1978). Proc. 8th Prague Conf., pp. 253-266, Prague, Czechoslovakia.
van der Lubbe, J. C. A., Boxma, Y., and Boekee, D. E. (1984). Information Sciences 32, 187-215.
van der Lubbe, J. C. A., Boekee, D. E., and Boxma, Y. (1987). Information Sciences 41, 139-169.
van der Pyl, T. (1977). Colloq. Intern. du C.N.R.S., No. 276, Théorie de l’Information, Cachan, France, 4-8 July, pp. 161-171.
Varma, R. S. (1966). J. Math. Sci. 1, 34-48.
Wiener, N. (1948). “Cybernetics.” M.I.T. Press, Cambridge.
Wyner, A. D. (1972). Information and Control 20, 176-181.
INDEX
A
Aberrations
  astigmatism, 40, 89
  chromatic, 44
    quadrupole lens, 143
    round lens, 92
    transaxial lens, 169
    two-dimensional lens, 152
  coma, 40, 88
  distortion, 40, 91
  field curvature, 40, 90
  geometrical, 36, 41
    coefficients, 39
    correction of, 43, 184, 193, 197, 198
    quadrupole lens, 139
    round lens, 88
    third order, 39
    transaxial lens, 165
  mechanical, 45, 118
  spherical, 40
    correction of, 43, 185, 194, 198, 199
    crossed lens, 179
    measurement of, 50
    quadrupole lens, 140
    round lens, 84
    transaxial lens, 166
    two-dimensional lens, 152
Absorption, laser, 261
Acceptance, 60
Additivity, 330, 331
Algebraic property, 327, 339
Amorphous Ge, crystallization, 263-273
Analysis of message, 368
Analytic property, 327, 339
Antiphase domains, 309
Arithmetic, 331
Auger process, 214
Auxiliary
  criterion, 386
  function, 407
Average codeword length, 331
B
Backscattered electrons, 212-214
Bayes
  decision rule, 387
  rule, 387
Bayesian distance, 393
Bayesian probability of error, 332, 368
Beam damage, 285, 314
Beam-device matching, 61
Bench, electron optical, 47
Bhattacharya distance, 327, 401, 405, 408
Binary digit, 328
Bivariate, 368
Bounds, in information theory, 327, 331, 342, 343, 350, 353, 386, 387, 388, 389, 390, 391, 393
Branching, 330
  property, 390
Brightness, 70, 74
C
Calculated images, 289, 293
Capacitor, deflecting, 266
Cardinal elements, 27
  measurement of, 48
  quadrupole lens, 126
  transaxial lens, 163
Cathodoluminescence, 215
Cation ordering, 292
Channeling, 291
Characterization (of information), 330, 331, 355
Chernoff measure, 401
Class, 386, 387, 395
Class-conditional
  density, 386
  probability density function, 387
Classification problem (in information theory), 386
Clays, 308
Coding
  information, 328
  problem, 328
  theorem, 328, 331
Communication, 328, 368
  channel, 328
  theory, 328
Composition relation, 337, 340, 341, 358, 359
Compositivity relation, 337, 377
Compressing systems, 62, 69
Concave, 337, 357, 367
  function, 339, 340, 341, 348, 352, 354, 404
Concavity, 340, 341, 357, 359, 362, 364, 367, 380, 388, 396
Conditional, 369, 387
  entropy of degree s, 369
  entropy of order r and degree s, 377
  generalized entropy, 369
  probability, 368
Contamination, 296
Continuity, 339
Continuous
  distribution, 405
  function, 339, 348
  probability distribution, 393
  random variable, 398
Convergent beam electron diffraction, 290
Convex, 337, 357, 367
  function, 339, 352, 354, 356, 357, 366, 404
Convexity, 358
Coordination number, 291
Cosmic dust, 322
Crossover, 65
Crystal growth, 264-267, 270, 299
Crystallization, amorphous Ge, 263-273
Cubic entropy, 390
D
Decision making, 328
Decision region, 386
Decision rule, 387
Decision theory, 332
Decreasing function, 340, 352
Defects, 297, 299
Deflection, electrons, 225-228
Density
  current, 69, 73
  distribution, 69, 74, 75
  phase-space particles, 55
Diagenesis, 308
Directed divergence, 329, 353, 378
  of order r, 350, 378
Dirichlet problem, 176
Disc of least confusion, 87
Discrete
  finite, 393, 405
  (n-ary) probability distribution, 329
  random variable, 368, 369
Distance measure, 327, 352, 386, 387, 389
Divergence, 359
  measure, 359, 360, 387
Doubly stochastic matrix, 338
E
Effective length, 122
Electrolytic tank, 13
Electron, interaction with matter, 211-216
Electron beam
  induced conductivity, 216
  testing, 214-254
Electron diffraction, 289
Electron emission
  photo, pulsed, 228
  thermal, pulsed, 224, 229-232
Electron energy loss spectrometry, 291
Electron energy spectrometer, 243, 245
Electron gun, 228-233
Electron microscopy
  emission, 218, 220, 221
  mirror, 217, 220, 221
  reflection (REM), 217, 220
  scanning (SEM), 221-223
  transmission (TEM), 216-220
Emittance, 57
Encode (message), 328
Energy dissipation, electrons, 213
Energy spectrum, 213, 243
Entropy, 332, 333, 336, 337, 342, 368, 369, 376, 389, 410
  of degree s, 327, 331, 332, 333, 369, 390
  graph, 327, 410
  of kind t, 327, 332, 333, 391
  of order 1 and degree s, 333, 392
  of order r, 327, 329, 330, 331, 332, 333, 343, 344, 391
  of order r and degree s, 327, 333, 352, 392
  series, 329, 342, 350
Envelope, 61, 64, 86
  angular, 62
  equation, 62
  linear, 62
Equality, 347, 380, 384
Equation
  Hamiltonian, 53
  Laplace, 13, 15, 18, 161
  motion, 9, 124
  trajectory, 11
    paraxial, 22, 23, 81, 125, 150, 161
    third order, 37, 140
Error, 386
  bound, 327, 387, 393, 398
  probability, 386
Expected probability of error, 387
Expected value, 403
Experiment, 330
Explosive crystallization, 263-273
Exponential average codeword length of order r, 331
Exposure, short time, 233, 257-260, 266, 267
Exsolution, 315
F
Fano, bound, 388
Fano, inequality, 388
Fano-type bound, 387, 390
Faulted sequences, 287
Feature, 328, 386, 387
  selection, 386
Field (potential) distribution, 12
  coaxial lens, 188
  crossed lens, 177
  quadrupole lens, 116, 119, 121
    nonlinearity, 119
    rectangular model, 121, 124
  round lens, 79
    axial, 94, 99, 102
  transaxial lens, 161, 171
  two-dimensional lens, 149, 157, 159
Focal length, 28, 83, 126, 151, 163
Focusing
  astigmatic, 26, 115, 160, 174
  stigmatic, 26
Frequency
  mapping, 249
  tracing, 247
Frequency-contrast characteristic, 75
Function, 332, 338, 346, 347, 357, 358, 367
Function, of discrimination, 353
Fuzzy sets theory, 328
G
Gain, MCP, 235, 236
Ge, crystallization, 263-273
Generality, 345
Generalized
  certainty measure, 352
  coordinates, 52
  distance measure, 327, 352
  divergence measure, 327, 329, 359
  entropy, 327, 329, 332, 333, 336, 342, 354, 368, 387
  exponential length, 331
  f-entropy, 332
  information measure, 327, 328, 329
  Jensen difference divergence measure, 393
  length, 331
  measure, 353, 362
    of Chernoff, 327, 401
    of directed divergence, 353
  momenta, 52
  Shannon’s or Gibb’s inequality, 348
  velocities, 52
Geology, 283
Gradient operator, 337
Graphs (in information theory), 348, 410
Gunn diode, 246, 247
H
Hamiltonian, 53
Huffman algorithm, 331
I
Image, 25
  converter, 234
  formation, TEM, 218, 219
  line, 26
Imaging artifact, 288
Inaccuracy measure, 349
Increasing, 358, 359
  function, 331, 339, 352, 353, 354, 356, 358
Independent, 351
  random variable, 368, 379
Individual, 368, 369
  conditional unified (r, s)-entropy, 369
Inequality, 327, 329, 337, 342, 343, 344, 345, 348, 349, 350, 355, 357, 358, 359, 375, 388, 393, 401, 406, 408, 409
  among entropies, 342, 393
Information theory, application, 327, 328, 329, 331, 353, 355, 386
Information, 328
  amount of, 328, 330
  distributions, 330, 378
  measure, 331, 386, 387
  processing, 328
  radius, 327, 329, 359
  retrieval, 328
  source, 328
  storage, 328
  theory, 328
Interaction, electron with matter, 211-216
J
J-divergence, 327, 329, 359, 364, 366, 405, 408
Jacobian, ?2
Jensen
  difference divergence measure, 359, 360
  inequality, 344
Joint, 386
  experiment, 368
K
Kinetic theory, phase transition, 270
L
Lagrangian, 52
Language, of information theory, 328
Laser
  annealing, 255, 258, 259, 263-273
  driven electron gun, 228-232
  induced processes, 260-263
Lattice imaging, 287
Lemma, 357, 358, 363, 366, 367, 380, 397, 403
Lens
  astigmatic, 26, 115, 160, 174
  box-like, 199
  coaxial, 187-189
  crossed, 174-187
    systems, 182
  einzel, 29, 101, 155, 176, 180
  immersion, 29, 94, 110, 152
  quadrupole, 115-148
    achromatic, 144
    doublet, 129
    quadruplet, 135
    systems, 128
    triplet, 132
  radial, 189-193
  round, 78-115
  stigmatic, 26
  transaxial, 159-174
  tube, 199
  two-dimensional, 148-159
  zoom, 79, 112
Lens aberration, 219, 221
Limit, 354, 362, 365
Limiting case, 329, 333, 340, 349, 369, 377, 410
Literature, of information theory, 328, 329, 330, 333, 338, 353, 356, 357, 358, 359, 368, 370, 386
Logarithmic function, 357, 364, 367, 380, 396
Logarithmic nature, 328
Logic state tracing, 249
Lower bound, 342, 387, 399
M
Magnetic wall, imaging, 219
  oscillation, 237-240
Magnification
  angular, 26
  crossover, 66
  linear, 26, 28, 130
Majorization, 338
Matrix, 31, 129, 136
  determinant, 32
  drift space, 32
  einzel lens, 32
  inverse, 33
  mirror, 34
  multiplication, 33
Matusita’s measure of affinity, 401
Maximality, 342
Maximum, 342, 347
Maximum probability, 342
Measure, 328, 330, 349, 355, 359, 362, 367, 370, 386, 393, 401, 402, 406
  of divergence, 359
  of information, 328, 329, 353
  of separability, 386
  of uncertainty, 330
Metamorphism, 309
Meteorites, 317
Method
  charge density, 17, 95
  conformal transformation, 15, 121
  finite-difference, 18
  finite elements, 20
  separation of variables, 15, 94, 156, 170, 190
  shadow, 48
Mica, 299
Microanalysis, 291
Microchannel plate (MCP), 234-237
Mineral reactions, 305
Model, explosive crystallization, 268-273
Modulated structures, 303
Monotonicity, 340
Multidimensional space, 386
Multiple-class, 387
Multivariate probability distribution, 327, 368, 369
Mutual information, 378, 379
  of degree s, 370
N
Nonnegativity, 339, 340, 356, 362, 366, 380
Normality, 340
Nuclear waste, 314
Nucleation, 269, 222
Numerical
  differentiable function, 337
  function, 337, 338
O
Octupole, 193
One-to-one code, 331
Oscillation
  Bloch line, 241
  magnetic wall, 237-240
Oxidation state, 291
P
Pair, 356, 357, 367
  of distribution, 366
Parameter, 340
Parametric generalization, 353
Particular, 352, 368
  case, 333, 337, 387, 389, 401, 403, 404, 409, 410
Pattern
  class, 386, 387
  misclassification, 386
  recognition, 328
Phase shift, by potential, 219
Phase space, 53
  contour, 57
  ellipse, 57, 68
  parallelogram, 57, 68
  properties, 54
  six-dimensional, 53
  trajectories, 53
  two-dimensional, 54
  volume, 55
Phase transition, theory, 268-271
Photo emission, 215
Plane
  focal, 28
  image, 25
  object, 25
  phase-space, 54, 57
  principal, 28
Polysomatism, 301
Polytypism, 289, 297, 299
Positivity, 336
Posterior probability, 387
Prior probability, 401, 406
Probabilistic model, 328
Probability, 342, 355, 368, 386, 387
  a posteriori, 387
  a priori, 386
  distribution, 342, 350, 351, 353, 368, 393
  of error, 327, 353, 386, 387, 388, 398, 399, 401, 408
Problems (related to information theory), 328, 331, 336, 386, 387
Propagation
  crystallization, 265-267, 270
  high-field domain, 246, 247
  phase transition, 262
Property, 328, 329, 330, 331, 336, 337, 339, 340, 341, 342, 343, 344, 347, 348, 352, 353, 355, 358, 362, 364, 369, 370, 371, 376, 380
Proposition, 338, 341, 342, 344, 349, 352, 353, 354, 355, 356, 358, 359, 362, 363, 366, 367, 370, 371, 372, 374, 379, 381, 384, 385, 387, 388, 389, 393, 398, 400, 402, 403, 408, 409
Pseudoconcave, 337, 341
  function, 341
Pseudoconcavity, 341
Pseudoconvexity, 337
Pseudometric space, 373
Pulsed detector, 233-237
Pulsing electron beam, 224-233
Q
Quadratic entropy, 390
Quantity, 328, 331, 352, 357
Quasi-concave, 337, 341
  function, 341
Quasi-concavity, 341
Quasi-convexity, 337, 341
Quasi-linearity, 331
R
R-divergence, 327, 359, 360, 361
Radioactivity, 314
Random
  coding exponent function, 379
  variable, 373, 378, 379
Real
  function, 332
  parameter, 331, 332
Real-time microscopy, 223
  REM, 257, 259
  TEM, 254-258, 263-268
Recognition system, 386
Recursive, 332, 333, 344
Recursivity, 330, 331, 344
  of degree s, 331
Redundancy
  of degree s, 332
  of order r, 331
Reflection microscopy, 217, 220, 257
Refractive index, 8
Relative information, 353
Remark, 355, 362
Resistance network, 14
Resolution
  electron beam testing, 249-252
  real-time microscopy, 273-276
  spatial, SEM, 221, 249
  spatial, TEM, 219
Results, summaries of, 329, 339, 340, 341, 344, 348, 350, 355, 358, 363, 364, 367, 369, 373, 374, 380, 384, 398, 401, 403, 404
S
Scalar
  parameter, 329, 332, 364, 365, 366
  parametric entropy, 329
Scanning microscopy, 221-223
Scattering
  electrons, 211-214
  function, R
Schur
  concave, 338
    function, 341
  concavity, 338, 341
  convexity, 338
Secondary electrons, 212-214
Semiconductor junction gun, 232
Serpentine, 282, 285, 290, 304, 312
Shannon
  information measure, 328
  theory, 328
Shannon’s case, 369, 410
Shannon’s conditional entropy, 369
Shannon’s entropy, 327, 329, 330, 333, 336, 341, 342, 344, 359, 375, 376, 379, 388, 389, 390
Shannon’s inequality, 390
Shannon’s or Gibb’s inequality, 349
Shot noise, 251, 274
Similarity theory, 11
Solidification, 270-273
Specimen preparation, 284
Spinelloids, 319
Statistical applications, 367
Statistical concepts, 330
Statistical pattern recognition, 327, 329, 386
Strictly concave, 347
Stroboscopic microscopy, 223, 237-241, 247
Structure determination, 292, 294
Sum
  property, 331
  representation, 330, 332, 333
Supercooled liquid, 270-273
Superposition principle, 13
Symmetric function, 340
SYNROC, 314
T
Theorem
  Helmholtz-Lagrange, 27, 29
  Liouville, 55
  Liouville, corollary, 55, 60
Thin lens approximation, 30, 32
  quadrupole lens, 128
  round lens, 83
  transaxial lens, 163
  two-dimensional lens, 151
Time-resolving
  REM, 257, 259
  techniques, 210, 254-268
  TEM, 254-258, 263-268
Transformation
  Fourier, 76
  linear, 31, 55, 57
Transient states, laser-induced, 258, 260, 266, 267
Transmission, 328
  of information, 328
Transmission microscopy, 216-220, 254-258
Transmit, 328
U
Uncertainty
  amount of, 330
Uncertainty measure, 328
Unconditional density, 387
Unified
  entropy, 329, 339, 369
  expression, 329, 337, 355
  (r, s)-conditional entropy, 327, 376, 377, 378, 388
  (r, s)-directed divergence, 355, 378
  (r, s)-entropy, 327, 336, 337, 339, 342, 360, 368, 369, 387
  way, 329, 355, 369, 394, 395, 408
Upper bound, 342, 387, 388, 390, 393, 398
V
Variable length, 331
Velocity
  crystallization, 264-268, 270-273
  flow, 262, 265
Verification, 339, 340, 355, 371
Voltage contrast, 242-245
W
Waveform sampling, 246
Weathering, 306
Wronskian, 24, 32, 56
X
X-ray emission, 214
Y
Yield, electron emission, 213
Z
Zero, 347, 355
Zirconolite, 314