ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS VOLUME 23
CONTRIBUTORS TO THIS VOLUME
Eugene R. Chenette, P. A. Grivet, W. C. Livingston, E. A. Lynton, W. L. McLean, L. Malnar, H. Motz, C. J. H. Watson

Advances in Electronics and Electron Physics

EDITED BY L. MARTON
National Bureau of Standards, Washington, D.C.

Assistant Editor CLAIRE MARTON

EDITORIAL BOARD
T. E. Allibone, H. B. G. Casimir, L. T. DeVore, W. G. Dow, A. O. C. Nier, E. R. Piore, M. Ponte, A. Rose, L. P. Smith, F. K. Willenbrock
VOLUME 23
1967
ACADEMIC PRESS
New York and London
COPYRIGHT © 1967, BY ACADEMIC PRESS INC. ALL RIGHTS RESERVED. NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM, BY PHOTOSTAT, MICROFILM, OR ANY OTHER MEANS, WITHOUT WRITTEN PERMISSION FROM THE PUBLISHERS.
ACADEMIC PRESS INC. 111 Fifth Avenue, New York, New York 10003
United Kingdom Edition published by ACADEMIC PRESS INC. (LONDON) LTD. Berkeley Square House, London W.1
LIBRARY OF CONGRESS CATALOG NUMBER: 49-7504
PRINTED IN THE UNITED STATES OF AMERICA
CONTRIBUTORS
Numbers in parentheses refer to the pages on which the authors' contributions begin.

EUGENE R. CHENETTE* (303), Electrical Engineering Department, University of Minnesota, Minneapolis, Minnesota
P. A. GRIVET (39), University of Paris, Institut d'Electronique Fondamentale, Orsay, France
W. C. LIVINGSTON (347), Kitt Peak National Observatory, Tucson, Arizona
E. A. LYNTON (1), Department of Physics, Rutgers-The State University, New Brunswick, New Jersey
W. L. MCLEAN (1), Department of Physics, Rutgers-The State University, New Brunswick, New Jersey
L. MALNAR, C.F.S. (39), Dept. de Physique Appliquée, Corbeville par Orsay, France
H. MOTZ (153), Department of Engineering Science, Oxford University, Oxford, England
C. J. H. WATSON (153), Merton College, Oxford, England

* Present address: Bell Telephone Laboratories, Allentown, Pennsylvania.
FOREWORD

For the first time in this serial publication I am in a position to present two reviews of a subject which we failed to treat earlier: superconductivity. The two reviews contained in this volume cover both some theoretical (Lynton and McLean) and some practical aspects (Laverick). Because of the direction of development, I hope to come back to different aspects of this subject in future volumes.

In the foreword to Volume 21, I mentioned that while four reviews on plasmas appeared in Volumes 20 and 21, in view of the importance of the subject, further reviews were planned. The article by Motz and Watson covers an important phase of plasma research.

Although three special volumes (12, 16, and 22) were devoted to photoelectronic imaging devices and techniques, it seemed advisable to present a slightly dissident viewpoint by Livingston.

It has been a long time since the subject of magnetic field measurements was treated here (Volume 4). The review by Grivet and Malnar couples this classical subject with a very modern technique: magnetic resonance.

Last, but not least, the review by Chenette covers an equally neglected subject in this series: noise in semiconductor devices. The last review on this subject appeared in Volume 4 (van der Ziel).

As done in previous volumes, I would like to list the titles and authors of future reviews:

Cooperative Phenomena (J. L. Jackson and L. Klein)
Progress in Microwave Tubes (O. Doehler and G. Kantorowicz)
Application of Group Theory to Waveguides (D. Kerns)
Optimization of Control (A. Blaquiere)
Thermal Energy Ion-Molecule Reaction Rates (E. E. Ferguson)
Novel High Frequency Solid State Ultrasonic Devices (N. G. Einspruch)
The Analysis of Dense Electron Beams (K. Amboss)
Ion Waves and Moving Striations (N. L. Oleson and A. W. Cooper)
The Electron Beam Shadow Methods of Investigating Magnetic Properties of Crystals (A. E. Curzon and N. D. Lisgarten)
Ion Beam Bombardment and Doping of Semiconductors (D. B. Medved)
Nuclear and Electronic Spin Resonance (E. R. Andrew and S. Clough)
Josephson Effect and Devices (J. E. Mercereau and D. N. Langenberg)
Linear Ion Accelerators (E. L. Hubbard)
Electron Spin Resonance: A Tool in Mineralogy and Geology (W. Low)
Linear Ferrite Devices for Microwave Applications (W. H. von Aulock and C. E. Fay)
Reactive Scattering in Molecular Beams (S. Datz)
Thermionic Cathodes (P. Zalm)
Radio Wave Fading (M. Philips)
Photoelectric Emission from Solids (F. Allen)
Dielectric Breakdown (N. Klein)
The Hall Effect and Its Applications (S. Stricker)
Electrical Conductivity of Gases (J. M. Dolique)
Progress in Traveling Wave Devices (W. E. Waters)
Electromagnetic Radiation in Plasmas (J. R. Wait)
Millimeter and Submillimeter Wave Detectors (G. I. Haddad)
Luminescence of Compound Semiconductors (F. E. Williams)
The Statistical Behavior of the Scintillation Counter: Theories and Experiments (E. Gatti and V. Svelto)
Radio Backscatter (M. Philips)
Studies of Thin Polycrystalline Films by Electron Beams (C. W. B. Grigson)
Gas Lasers and Conventional Sources in Interferometry (K. D. Mielenz)
Application of Lasers to Microelectronic Fabrication (M. I. Cohen and J. P. Epperson)
Theory of the Unrippled Space-Charge Flow in General Axially Symmetric Electron Beams (W. E. Waters)
Study of Ionization Phenomena by Mass Spectroscopy (H. M. Rosenstock)
Recent Advances in Circular Accelerators (J. P. Blewett)
Image Formation at Defects in Transmission Electron Microscopy (S. Amelinckx)
Quadrupoles as Electron Lenses (P. W. Hawkes)
Resolution in the Electron Microscope (E. Zeitler)
Nonlinear Electromagnetic Waves in Plasmas (J. Rowe)
Ion Bombardment Doping of Semiconductors (V. S. Vavilov)
Space-Charge Limited Corona Current (A. Langsdorf, Jr.)
Molecular Reactions in Glow Discharges (R. A. Hartunian)
In addition, we expect to publish our third supplement volume soon: "Narrow Angle Electron Guns and Cathode Ray Tubes" by H. Moss.
It is my pleasure to announce that Dean F. K. Willenbrock, of the State University of New York at Buffalo, has joined our Editorial Board, filling the gap created by the death of Professor W. B. Nottingham.

Washington, D. C.
May, 1967
L. MARTON
CONTENTS

LIST OF CONTRIBUTORS
FOREWORD

Type II Superconductors
E. A. LYNTON AND W. L. MCLEAN
I. Introduction
II. Basic Properties of Superconductors
III. London Equation
IV. Quantization of Flux
V. The Ginzburg-Landau Equations
VI. The Interphase Surface Energy
VII. The Static Properties of the Mixed State
VIII. Surface Superconductivity
IX. Dynamic Effects
Appendix
References

Measurement of Weak Magnetic Fields by Magnetic Resonance
P. A. GRIVET AND L. MALNAR
I. Introduction
II. Order of Magnitude and Main Characteristics of Natural Fields
III. Nuclear Resonance
IV. Optical Detection of an Electron Nuclear Resonance
V. An Example of Design: The Cesium Vapor Magnetometers
VI. Superconducting Interferometers as Magnetometers
References

The Radio-Frequency Confinement and Acceleration of Plasmas
H. MOTZ AND C. J. H. WATSON
Introduction
1. Single Particle Motions
2. The Theory of Radio-Frequency Confinement of Plasma
3. Theory of Combined Radio-Frequency and Magnetostatic Confinement of Plasma
4. Stability Theory
5. Application to Fusion Reactors
6. Experiments Related to Radio-Frequency Confinement
7. The Theory of Radio-Frequency Acceleration of Plasma
8. Experiments on Radio-Frequency Acceleration of Plasma
References

Noise in Semiconductor Devices
EUGENE R. CHENETTE
I. Introduction
II. Theory of Noise in Semiconductor Devices
III. Experimental Verification of the Theory
IV. Practical Low-Noise Amplifiers
V. Summary
References

Properties and Limitations of Image Intensifiers Used in Astronomy
W. C. LIVINGSTON
I. Introduction
II. Qualitative Comparison Between the Photographic Plate and the Image Tube
III. Quantitative Evaluation of Image Tubes
IV. Description of Tubes and Results
V. Prospects for Future Developments
VI. Summary and Conclusions
References

AUTHOR INDEX
SUBJECT INDEX
Type II Superconductors

E. A. LYNTON AND W. L. MCLEAN
Department of Physics, Rutgers-The State University, New Brunswick, New Jersey

I. Introduction
II. Basic Properties of Superconductors
III. London Equation
IV. Quantization of Flux
V. The Ginzburg-Landau Equations
VI. The Interphase Surface Energy
VII. The Static Properties of the Mixed State
VIII. Surface Superconductivity
IX. Dynamic Effects
    A. Steady-State Flux Flow
    B. Vortex Motion
    C. Vortex Waves
    D. The Surface Barrier
    E. Vortex Motion in Thin Films
Appendix
References
I. INTRODUCTION

During the past few years, a great deal of fundamental experimental and theoretical research in superconductivity has concentrated on so-called type II superconductors. This work is of interest both because it has renewed interest in macroscopic quantum phenomena and in certain basic electrodynamic problems, and because of its relevance to the application of superconductivity to magnets and related devices. We shall, therefore, review the principal aspects of this work after a brief summary of basic superconducting properties.

II. BASIC PROPERTIES OF SUPERCONDUCTORS [1]

[1] References to most of the original work which is summarized in this chapter can be found in Lynton (1).

A superconductor is a substance which below some well-defined critical temperature, $T_c$, has zero electrical resistance for low-frequency currents and in which, therefore, the electric field $E$ vanishes. At this time, 24 metallic elements are known to become superconducting, with values of $T_c$ ranging from 11.2°K down to about 0.01°K. In addition, there are more than 500 superconducting compounds and alloys, with $T_c$ as high as 18°K
The first of these fundamental equations describes the equilibrium spatial variation of the order parameter; the second, the current distribution, i.e., the diamagnetic response of the superconductor to the external field. The Ginzburg-Landau equations give rise to two fundamental lengths characteristic of a given superconductor. One of these we have already encountered: the penetration depth $\lambda$. In a weak field, and to first order in $B$, $|\psi|^2$ can be replaced by its equilibrium value in the absence of a field, $\psi_0^2$, which is independent of position. To this order, the second G-L equation reduces to one formally equivalent to the London equation:
$$\nabla \times J = -(q^2/mc)\,\psi_0^2\,B,$$

leading to an exponential decay of an applied field with a characteristic penetration depth

$$\lambda^2(T) = mc^2/4\pi q^2\psi_0^2(T) \propto 1/(T_c - T).$$

The other characteristic length follows from the appearance of the gradient term in the first G-L equation, which prevents, by making it too costly in energy, any rapid spatial variation of $\psi(r)$. The scale on which this variation occurs can be seen as follows: We write the first G-L equation in the absence of a field for a semi-infinite slab with its surface lying in the plane $z = 0$; we assume that $\psi$ depends on $z$ only. $\psi$ may be taken to be real if the field and current are zero. We introduce a reduced order parameter $f$ and a quantity $\xi(T)$ having the dimensions of length. With this, the equation reduces to

$$-\xi(T)^2\,(d^2f/dz^2) - f + f^3 = 0.$$

Clearly $\xi(T)$ is the natural unit for the spatial variation of $f(z)$; and from the temperature variation of $\alpha$ and its value in terms of microscopic quantities, one can show that

$$\xi(T) = \xi_0\,[T_c/(T_c - T)]^{1/2},$$
where $\xi_0 = 0.18\,\hbar v_F/k_B T_c$, $v_F$ is the Fermi velocity, and $k_B$ is the Boltzmann constant. $\xi_0$ is the so-called coherence length of a superconductor at absolute zero [see Goodman (9)]. Note that both $\lambda(T)$ and $\xi(T)$ diverge as $T \to T_c$, but that their ratio

$$\kappa = \lambda(T)/\xi(T) = mc\,\beta^{1/2}/q\hbar(2\pi)^{1/2}$$

is, in the same limit, a constant. This is the so-called Ginzburg-Landau parameter of a superconductor, to which we will refer in a later section.

The existence and even the approximate magnitude of the coherence length follow from quite general phenomenological considerations [Pippard (10)]. It is, as was just stated, the minimum length over which variations can occur in the order parameter. Specifically this means that the smallest natural size of a superconducting region surrounded by normal material is $\xi_0$. Pippard pointed out that the finite and in fact considerable value of this size follows from the extreme sharpness of the superconducting transition. For example, in a well annealed sample of tin, the transition is observed to occur within less than a millidegree. The small limit to the statistical fluctuations, inferred from this result, indicates that superconductivity nucleates in a region with a diameter of approximately ... cm.

The same magnitude of $\xi_0$ also follows from a simple argument based on the uncertainty principle. The coherence length can be regarded as characterizing the spatial definition of the superconducting electrons, related to the uncertainty in their momentum by

$$\xi_0\,\Delta p \sim \hbar.$$

But the electrons involved in the superconducting condensation are those within an energy $k_B T_c$ of the Fermi surface, so that

$$\Delta p \sim k_B T_c/v_F$$

and

$$\xi_0 \sim \hbar v_F/k_B T_c.$$

For most superconducting elements, $\xi_0 \sim 10^{-5}$ to $10^{-4}$ cm. For transition metals such as niobium and tantalum, however, the Fermi velocity $v_F$ is very low (due to the high density of electron states at the Fermi surface) and $T_c$ is unusually high. Thus for these substances $\xi_0$ is only of the order of ... cm. As a result $\kappa = \lambda/\xi$ for transition metals is close to unity, whereas, e.g., $\kappa \sim 0.01$ for aluminum.

Both the range of coherence and the penetration depth vary with the normal electronic mean free path. As this becomes shorter with decreasing purity of the metal, $\lambda$ increases and $\xi$ decreases. Thus, $\kappa$ increases with decreasing mean free path. We will discuss below how the basic characteristics of a superconductor with small $\kappa$ values differ in a fundamental way from those of one with large $\kappa$ values. The change from one type of behavior to the other can be produced by alloying a superconducting element, thereby decreasing its mean free path.
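The reduction to the dimensionless equation for $f$ can be sketched as follows. This is only a hedged reconstruction: it assumes the standard one-dimensional form of the first G-L equation with coefficients $\alpha < 0$ and $\beta > 0$ and a gradient coefficient $\hbar^2/2m$. These choices are assumptions, adopted because they reproduce both the reduced equation and the expression for $\kappa$ quoted in this section.

```latex
\begin{align}
% assumed field-free first G-L equation for real psi(z)
0 &= -\frac{\hbar^2}{2m}\frac{d^2\psi}{dz^2} + \alpha\psi + \beta\psi^3, \\
% assumed definitions of the equilibrium value, reduced order parameter, and length
\psi_0^2 &= \frac{|\alpha|}{\beta}, \qquad
f(z) = \frac{\psi(z)}{\psi_0}, \qquad
\xi^2(T) = \frac{\hbar^2}{2m|\alpha|}, \\
% dividing the first line by |alpha| psi_0 gives the reduced equation of the text
0 &= -\xi(T)^2\frac{d^2 f}{dz^2} - f + f^3, \\
% combining with lambda^2 = m c^2 / (4 pi q^2 psi_0^2)
\kappa^2 &= \frac{\lambda^2}{\xi^2}
 = \frac{mc^2\beta}{4\pi q^2|\alpha|}\cdot\frac{2m|\alpha|}{\hbar^2}
 = \frac{m^2c^2\beta}{2\pi q^2\hbar^2}.
\end{align}
```

With $|\alpha| \propto (T_c - T)$ near $T_c$, both $\lambda^2$ and $\xi^2$ scale as $1/(T_c - T)$, so the temperature dependence cancels in $\kappa$, in agreement with the statement that the ratio is constant in this limit.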
VI. THE INTERPHASE SURFACE ENERGY

The gradual spatial variation of the superconducting order parameter together with the finite penetration depth give rise to a contribution to the specimen energy for every unit area of surface separating superconducting and normal material [Pippard (11)]. Consider a unit area of interphase boundary in a plane normal to the plane of the paper. The order parameter $\psi(r)$ decreases from its equilibrium value in the superconducting material down to zero in the normal material over a distance of the order of $\xi$. The flux density falls to zero from its value $H_c$ in the normal material over a distance of the order of $\lambda$. Representing the gradual variation of $\psi$ and $B$ by abrupt changes, it is as if the configurational or ordering boundary occurred at C and the magnetic boundary at M (see Fig. 1). Thus in the diagonally shaded volume one loses the advantage of superconducting order, and there is a consequent increase in energy equal to this volume times the energy difference per unit volume between normal and superconducting material, $H_c^2/8\pi$. On the other hand, the field is not excluded from the crosshatched volume, leading to a decrease of the energy equal to the product of this volume times the energy per unit volume owing to the exclusion of flux of density $H$, $H^2/8\pi$. When the field in the normal region equals $H_c$, we thus have

Increase in energy due to loss of order          $\xi H_c^2/8\pi$
Decrease in energy due to field penetration      $\lambda H_c^2/8\pi$
Net increase in energy per unit area             $(\xi - \lambda) H_c^2/8\pi$

FIG. 1. Variation of the superconducting order parameter $\psi(r)$ and the flux density $B$ at the boundary between normal and superconducting regions.
Thus there is a net contribution per unit area of interphase boundary, the sign and magnitude of which depend on the relative sizes of the coherence length $\xi$ and the penetration depth $\lambda$. It is evident that the sign of this surface energy determines whether or not it is energetically favorable to form normal inclusions in the superconducting matrix below the field $H_c$ at which the volume energies of the two phases become equal. Thus there will be very different magnetic behavior for superconductors with positive and for those with negative surface energy, and these are accordingly differentiated by being called, respectively, type I and type II superconductors.

Let us use the same simple approach to investigate at which exterior field a normal inclusion becomes energetically favorable. Consider such an inclusion in the form of a thin normal thread, at the center of which we have a flux density equal to the external field, $H$, and a vanishing order parameter $\psi = 0$. The former decreases over a distance $\lambda$, the latter increases over a distance $\xi$. The inclusion becomes favorable when the energy due to flux exclusion becomes equal to or greater than the energy decrease due to the condensation into the more ordered state, i.e., when

$$\pi\lambda^2 H^2/8\pi \ge \pi\xi^2 H_c^2/8\pi,$$

i.e.,

$$H \ge (\xi/\lambda) H_c.$$

FIG. 2. Variation of the order parameter $\psi(r)$ and the flux density $B$ in a cylindrical normal region surrounded by superconducting material.

Thus, there are two cases:

(a) Type I. $\xi > \lambda$: The field at which isolated inclusions become possible exceeds $H_c$; thus the material is entirely diamagnetic for $H \le H_c$, as indicated in Fig. 3 by the dashed lines.

(b) Type II. $\xi < \lambda$: Normal regions appear at

$$H_{c1} = (\xi/\lambda) H_c \le H_c$$
and by the same token, superconducting regions persist to fields greater than $H_c$, as the negative surface energy compensates for the greater volume energy. Hence the magnetization curve for type II is as shown by the solid line.

FIG. 3. The magnetization curves for type I (dashed lines) and type II (solid curve) superconductors.

The limiting field $H_{c2}$ for the persistence of a mixture of normal and superconducting material can be found from the G-L equation. For this purpose we take the $z$-axis perpendicular to the field and suppose that $\psi$ depends on one coordinate, $z$, only. It may easily be verified that, if $\psi$ varies with $x$ or $y$, the value of $H_{c2}$ obtained in that case is not higher than the value obtained here. Also, we consider either an infinite medium or parts of the superconductor much farther than $\xi$ from the surface so that there is no need to take the surface into account. In the normal state, the field is uniform and so in fields with $B = H \ge H_{c2}$, a suitable vector potential is $A = (zH, 0, 0)$. The G-L equation can thus be written in terms of the zero field equilibrium order parameter $\psi_0$ and the quantities $\kappa$ and $\lambda$ in the form ...

Near $H_{c2}$, it is reasonable to assume $\psi(H)$ ... $f$. It may be noted that the relation between $H_{c2}$ and $H_c$ is independent of the assumption that $\psi$ depends on $z$ only, and that a semi-infinite slab is not a necessary requirement for the result obtained. The more general derivation of this result is mathematically equivalent to the calculation of the Landau levels of an electron in a uniform magnetic field given by Dingle (12).
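The simple energy bookkeeping of this section can be put into a few lines of code. The sketch below is illustrative only: the function names and the numerical inputs are hypothetical, Gaussian units are assumed (lengths in cm, fields in gauss, energies in erg), and the classification uses the rough criterion $\xi$ versus $\lambda$ of the text rather than an exact G-L result.

```python
import math

def surface_energy_per_area(xi_cm, lam_cm, hc_gauss):
    """Net interphase energy per unit area, (xi - lambda) * Hc^2 / (8*pi), in erg/cm^2."""
    return (xi_cm - lam_cm) * hc_gauss ** 2 / (8.0 * math.pi)

def classify(xi_cm, lam_cm):
    """Rough classification by the sign of the surface energy (xi > lambda means type I)."""
    return "type I" if xi_cm > lam_cm else "type II"

def thread_entry_field(xi_cm, lam_cm, hc_gauss):
    """Field at which a thin normal thread becomes favorable, (xi / lambda) * Hc."""
    return (xi_cm / lam_cm) * hc_gauss

if __name__ == "__main__":
    # Hypothetical parameter sets, chosen only to show both signs of the surface energy.
    for name, xi, lam, hc in [("long coherence length", 1e-4, 5e-6, 100.0),
                              ("short coherence length", 1e-6, 4e-6, 2000.0)]:
        print(name, classify(xi, lam),
              surface_energy_per_area(xi, lam, hc),
              thread_entry_field(xi, lam, hc))
```

For the second parameter set the thread entry field lies below $H_c$, which is the situation described under case (b).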
VII. THE STATIC PROPERTIES OF THE MIXED STATE

From the preceding section it is apparent that for external fields $H$ such that $H_{c1} < H < H_{c2}$, a type II superconductor is in a state which is neither entirely superconducting nor entirely normal. The detailed nature of this state, called the mixed state, can, in principle, be deduced from a solution of the G-L equations under appropriate conditions. The nonlinearity of these equations makes it necessary to use certain approximations, as was first done by Abrikosov (13). He deduced the nature of the mixed state near $H_{c2}$ by an iterative procedure, substituting into the nonlinear G-L equation the solution to the linearized one, modified by a small additive function. This yielded the remarkable result that in a plane normal to the applied field the order parameter is a doubly periodic function, varying from a zero value on a lattice of points, as shown in Fig. 4, to a maximum value midway between neighboring zeros. Near the zeros, the symmetry is circular. There is a gradual transition to an almost square pattern at the boundary of the primitive cell centered on a zero.

In three dimensions the pattern of field penetration into the mixed state is that of a grid of normal filaments, each surrounded by surfaces of equal order which also correspond to sheets along which the supercurrents flow, perpendicular to the direction of the field. Near each of the normal filaments, the current density varies in a similar fashion to the velocity near a vortex in a classical fluid. Thus, in the mixed state the interior of the superconductor contains a set of parallel current vortices. From our discussion in Section VI, it is evident that each such vortex, being topologically equivalent to a superconductor with a cylindrical filamentary normal hole along the axis, must carry an integral number of flux quanta $\phi_0$.
A remarkable feature of this vortex structure predicted by Abrikosov for the mixed state is that the superconducting order is finite everywhere except along the filaments of negligible volume. The mixed state can, therefore, be considered as everywhere superconducting and can be characterized by an average order parameter. An early verification of this came from thermal conductivity measurements, the results of which were successfully analyzed in terms of such an average [Dubeck et al. (BG)].

FIG. 4. Contours of equal $|\psi|^2$ (which are also lines of current flow, the tangent at any point giving the direction of $J$ at that point) according to the Abrikosov model. [From A. A. Abrikosov, Phys. Chem. Solids 2, 199 (1957).]

FIG. 5. Lines of magnetic flux, current flow, and the variation of order parameter and flux density near the center of a vortex.

Let us first look at the properties of an isolated vortex, following essentially the treatment of deGennes (5), and making the simplifying assumption that $\kappa \gg 1$, i.e., that $\lambda \gg \xi$. The description of a vortex is as follows: The order parameter and the effective density of superconducting carriers rise from zero at the center to their equilibrium values over a distance of the order of $\xi$. The flux density $B$ is maximum at the center, and it extends over a distance of the order of $\lambda$, being screened by circular current loops also extending over a distance $\lambda$. Neglecting the energy of the "normal core" of radius $\xi$, we have for the vortex energy per unit length
$$F = \int_{r>\xi} 2\pi r\,dr\,\left[B(r)^2/8\pi + \tfrac{1}{2}\,n_s m\,v(r)^2\right],$$

where $n_s$ = constant for $r > \xi$. From Maxwell's equations

$$\nabla \cdot B = 0, \qquad \nabla \times B = 4\pi J/c = 4\pi n_s e v/c,$$

so that, putting $\lambda_L^2 = mc^2/4\pi n_s e^2$, we get

$$F = \int_{r>\xi} 2\pi r\,dr\,\left[B(r)^2 + \lambda_L^2(\nabla \times B)^2\right]/8\pi.$$

The condition that this be a minimum yields

$$B + \lambda_L^2\,\nabla \times \nabla \times B = 0.$$

This is again the London equation, showing that as expected the field penetrates a distance $\lambda_L$. However, this equation can be expected to be valid only where there is nonvanishing order, i.e., not at the center. The equation applicable throughout the whole region is

$$B + \lambda_L^2\,\nabla \times \nabla \times B = \phi_0\,\delta(r),$$
where $\delta(r)$ is the two-dimensional delta function, and $\phi_0$ is a vector in the field direction, of magnitude equal to the quantum of flux. The validity of this form of the equation can be easily verified by integrating this equation over the surface bounded by a circle of radius $r$ encircling the vortex axis in a plane normal to the axis. The general solution of the equation is

$$B(r) = \phi_0 K_0(r/\lambda_L)/2\pi\lambda_L^2,$$

where $K_0$ is a Bessel function of imaginary argument and of order zero. The asymptotic values are, for $\xi$ ... $B(r)$ ... and for $r \gg \lambda_L$, $B(r) =$ ... $> \lambda$ and as $\ln(\lambda/r_{12})$ for $\xi < r_{12}$ ... vortex lines per unit area perpendicular to the lines is

$$G = nF + \sum_{ij} F_{ij} - HB/4\pi.$$

As every vortex carries a single flux quantum $\phi_0$, we have for the average flux density,

$$B = n\phi_0.$$
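The field profile of an isolated vortex, $B(r) = \phi_0 K_0(r/\lambda_L)/2\pi\lambda_L^2$, can be evaluated numerically. The following is a minimal sketch, not the authors' procedure: it computes $K_0$ from the integral representation $K_0(x) = \int_0^\infty e^{-x\cosh t}\,dt$ with a plain trapezoidal rule, and it assumes the value $\phi_0 = hc/2e \approx 2.07 \times 10^{-7}$ G cm$^2$ for the flux quantum. The function names, the integration cutoff, and the sample lengths are hypothetical choices. The limiting forms used for comparison, $K_0(x) \approx -\ln(x/2) - \gamma$ for small $x$ and $K_0(x) \approx (\pi/2x)^{1/2} e^{-x}$ for large $x$, are standard properties of the function rather than expressions taken from the text.

```python
import math

PHI0 = 2.07e-7          # assumed flux quantum hc/2e, in gauss * cm^2
EULER_GAMMA = 0.5772156649

def k0(x, t_max=20.0, steps=20000):
    """Modified Bessel function K0(x) for x > 0, from the integral of exp(-x*cosh(t)) over t >= 0."""
    h = t_max / steps
    # Trapezoidal rule; the integrand underflows harmlessly to 0.0 at large t.
    total = 0.5 * (math.exp(-x) + math.exp(-x * math.cosh(t_max)))
    for i in range(1, steps):
        total += math.exp(-x * math.cosh(i * h))
    return total * h

def vortex_field(r_cm, lam_cm):
    """Flux density of an isolated vortex at distance r from the axis (valid outside the core)."""
    return PHI0 * k0(r_cm / lam_cm) / (2.0 * math.pi * lam_cm ** 2)

def line_density(b_gauss):
    """Number of vortex lines per cm^2 for average flux density B = n * phi0."""
    return b_gauss / PHI0

if __name__ == "__main__":
    lam = 1e-5  # hypothetical penetration depth in cm
    for r in (1e-6, 1e-5, 5e-5):
        print(r, vortex_field(r, lam))
    print(k0(0.01), -math.log(0.005) - EULER_GAMMA)          # small-argument limit
    print(k0(5.0), math.sqrt(math.pi / 10.0) * math.exp(-5))  # large-argument limit
```

The logarithmic rise at small $r$ and the exponential cutoff beyond $\lambda_L$ are both visible in the printed values.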
The first question of interest is at what field $H_{c1}$ it is energetically favorable to have flux penetration. At this field the line density will be low, so that we neglect the interaction term and write

$$G \approx nF - BH/4\pi = B(F/\phi_0 - H/4\pi).$$
The dependence of $G$ on $B$ changes when $H = 4\pi F/\phi_0$. For $H < 4\pi F/\phi_0$, $G$ increases with $B$, and the lowest energy is obtained for $B = 0$, the so-called Meissner state. For $H > 4\pi F/\phi_0$, $G$ decreases with increasing $B$, so that flux penetration becomes advantageous. Clearly, therefore,

$$H_{c1} = 4\pi F/\phi_0 = (\phi_0/4\pi\lambda_L^2)\,\ln(\lambda_L/\xi).$$

For $\kappa = \lambda/\xi \gg 1$, $H_{c1}$ ... $\gg \xi_0$, and again this holds only near $T_c$. Finally, a local relation between superconducting current density and the vector potential can be used only when the latter varies slowly over distances of the order of $\xi_0$, i.e., if $\lambda(T) \gg \xi_0$, which happens near $T_c$.

In recent years the theory has been extended to lower temperatures [Gor'kov (22); deGennes (23); Maki (24)], yielding different temperature dependences for $\kappa$ values deduced from various features of the magnetization curve. It is useful to follow Maki in defining
$\kappa_1(T)$ as the value deduced from $dM/dH$ near $H_{c2}$, and $\kappa_2(T)$ as that
obtained from $H_{c1}(T)$. In the limit of low electronic mean free path, theory predicts that both $\kappa_1$ and $\kappa_2$ rise slowly with decreasing temperature by unequal but similar amounts. This has been verified by experiments on alloys. However, in the pure limit, there is no such agreement. Both calorimetric and magnetic measurements on pure niobium [McConville and Serin (26), Strnad and Kim (25)] show that $\kappa_1(T)$ increases by about ... as $T \to 0$, whereas the calculation predicts a behavior much closer to that in the impure limit, in which $\kappa_1(T)$ increases by only about 20%. Furthermore, $\kappa_2(T)$ is experimentally found to increase with lowering temperature also in the pure case, whereas it is predicted to decrease.

FIG. 6. The relation of the vortices to the thin film used by Parks et al. (28). (a) At low fields the vortices are too large to fit into the narrow arm joining the two more extensive parts of the film. (b) At a certain field the vortices can just fit inside the arm and do so to reduce the free energy of the system.
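Returning to the lower critical field obtained earlier in this section, the argument from the Gibbs function can be checked numerically. The sketch below is illustrative and rests on stated assumptions: the line energy $F = (\phi_0/4\pi\lambda_L)^2 \ln(\lambda_L/\xi)$ is inferred from the quoted result $H_{c1} = 4\pi F/\phi_0 = (\phi_0/4\pi\lambda_L^2)\ln(\lambda_L/\xi)$, the flux quantum is taken as $2.07 \times 10^{-7}$ G cm$^2$, and the lengths used in the example are hypothetical.

```python
import math

PHI0 = 2.07e-7  # assumed flux quantum hc/2e, in gauss * cm^2

def line_energy(lam_cm, xi_cm):
    """Vortex energy per unit length, F = (phi0 / (4*pi*lambda))^2 * ln(lambda / xi), in erg/cm."""
    return (PHI0 / (4.0 * math.pi * lam_cm)) ** 2 * math.log(lam_cm / xi_cm)

def hc1(lam_cm, xi_cm):
    """Lower critical field, Hc1 = 4*pi*F / phi0, in gauss."""
    return 4.0 * math.pi * line_energy(lam_cm, xi_cm) / PHI0

def gibbs(b_gauss, h_gauss, lam_cm, xi_cm):
    """Low-density Gibbs function G = B * (F/phi0 - H/(4*pi)), vortex interactions neglected."""
    return b_gauss * (line_energy(lam_cm, xi_cm) / PHI0 - h_gauss / (4.0 * math.pi))

if __name__ == "__main__":
    lam, xi = 1e-5, 1e-6          # hypothetical lengths in cm (kappa = 10)
    h1 = hc1(lam, xi)
    print("Hc1 in gauss:", h1)
    # G rises with B just below Hc1 and falls with B just above it.
    print(gibbs(10.0, 0.9 * h1, lam, xi), gibbs(10.0, 1.1 * h1, lam, xi))
```

The sign change of $G$ at fixed $B$ as $H$ crosses $H_{c1}$ is the numerical counterpart of the statement that flux penetration becomes advantageous above that field.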
We conclude this section by noting that current vortices can also occur in type I superconducting materials under special conditions: namely, when the superconductor is in the form of a thin film normal to a magnetic field of magnitude slightly less than that required to destroy completely the superconductivity. Tinkham (27) has found that fluxoid quantization plays a dominant role in this situation and has carried out an analysis using the Ginzburg-Landau theory, predicting the variation of the critical field with angle of inclination to the surface and the dependence on temperature of the critical field when the field is perpendicular to the film. Parks et al. (28) have recently carried out experiments in which the size of the vortices in a thin film perpendicular to a strong field was observed to have an effect on the transition temperature of part of the film. The film was of uniform thickness but was of the shape shown in Fig. 6. An increase in the transition temperature of the bridge joining the two more massive parts was observed when the magnetic field reached a value at which the vortices were of small enough diameter to fit into the bridge, presumably because the free energy of the system was reduced once the vortices were able to spread into the bridge. We shall return later to discuss further experiments on vortices in thin films that have considerably clarified our understanding of the mixed state.

VIII. SURFACE SUPERCONDUCTIVITY
For many years experimenters have observed that in certain metals superconductivity, detected by resistance measurements, persisted in magnetic fields higher than the field required to restore the diamagnetic moment of the sample to its normal state value. This phenomenon was generally ascribed to strains or inhomogeneities which cause the metal to behave in a way different from that of a pure system. However, Saint-James and deGennes (29) have shown that when the externally applied field is parallel to the surface of a superconductor, a layer of thickness $\sim\xi$ at the surface remains superconducting up to the field $H_{c3} = 1.69\sqrt{2}\,\kappa H_c$. This result was obtained by a treatment similar to that described in Section VII for obtaining $H_{c2} = \sqrt{2}\,\kappa H_c$, the maximum field at which superconductivity can nucleate from the normal metal. Surface effects were not considered in that derivation. However, the solution of the Ginzburg-Landau equations near the surface is quite different from that in the interior, and if the magnetic field is parallel to the surface, superconductivity can nucleate at a maximum field $H_{c3}$. Numerous experiments have since been carried out verifying the existence of the surface layer of superconducting material and in many cases confirming the quantitative estimate for $H_{c3}$. It was found by Rosenblum and Cardona (30) that surface superconductivity could also occur in type I superconductors since, although in these $\kappa < 1/\sqrt{2}$, it is possible for $1.69\sqrt{2}\,\kappa$ to be greater than 1, i.e., $\sqrt{2}\,\kappa H_c < H_c < 1.69\sqrt{2}\,\kappa H_c$. If the magnetic field makes an angle $\theta$ with respect to the surface, the superconducting layer is destroyed at a field lower than $H_{c3}$. When $\theta = 90^\circ$, the critical field is $H_{c2}$. Saint-James (31) has extended the solution of the Ginzburg-Landau equations in the vicinity of the surface to cover $0 \le \theta \le 90^\circ$. Tinkham's (27) treatment of the transition in thin films, mentioned in the last section, agrees well with this exact solution only when the film thickness is much less than $\xi$.

In experiments carried out on cylindrical samples with their axes parallel to the magnetic field, it might be thought that the annular region of superconducting material surrounding the cylinder would act like a perfectly conducting shield and prevent magnetic flux from entering the interior of the rod until the external field had reached $H_{c3}$. A simple calculation shows that the positive magnetic contribution to the free energy because of flux exclusion is much greater than the negative contribution from the condensation energy of the surface layer ($H_{c2} < H < H_{c3}$), and so exclusion of the flux in the way envisaged would be thermodynamically unstable. We are thus led to the concept of a maximum or critical net current that can be carried by the surface sheath. If the flux density were the same on both sides of the layer (in the interior of and outside the cylinder), there would be diamagnetic currents shielding the inside of the layer from the magnetic field, but no net current. A gradual increase of the external field would induce a net current, proportional to the difference between the external and internal flux densities, but the current eventually would reach its critical value, whereupon the flux inside the cylinder would increase.
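The field relations quoted in this section lend themselves to a quick numerical check. The following Python sketch, offered only as an illustration, evaluates $H_{c2} = \sqrt{2}\,\kappa H_c$ and $H_{c3} = 1.69\,H_{c2}$ and reports whether a surface sheath can persist above the thermodynamic critical field. The function names and the sample values of $\kappa$ and $H_c$ are assumptions introduced here, not quantities taken from the experiments cited.

```python
import math

SQRT2 = math.sqrt(2.0)
SURFACE_RATIO = 1.69  # H_c3 / H_c2 for a field parallel to the surface


def critical_fields(kappa, h_c):
    """Return (H_c2, H_c3) in the same units as h_c."""
    h_c2 = SQRT2 * kappa * h_c
    h_c3 = SURFACE_RATIO * h_c2
    return h_c2, h_c3


def classify(kappa):
    """Classify a material by kappa (hypothetical helper, not from the text)."""
    if kappa >= 1.0 / SQRT2:
        return "type II"
    if SURFACE_RATIO * SQRT2 * kappa > 1.0:
        # H_c3 exceeds H_c even though H_c2 does not.
        return "type I with surface sheath above H_c"
    return "type I without surface sheath above H_c"


if __name__ == "__main__":
    for kappa in (0.3, 0.5, 1.0):  # illustrative values only
        h_c2, h_c3 = critical_fields(kappa, h_c=100.0)
        print(f"kappa={kappa}: H_c2={h_c2:.1f}, H_c3={h_c3:.1f}, {classify(kappa)}")
```

Under these relations the sheath can exist above $H_c$ in a type I material only when $\kappa$ exceeds $1/(1.69\sqrt{2})$, roughly 0.42.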
Except for the very careful measurements of Sandiford and Schweitzer (32), bulk magnetic moment measurements on samples which should have had superconducting surface layers have never revealed any shielding by the surface layers. The shielding would manifest itself by an obvious hysteresis pattern. In the mixed state with $H_{c1} < H < H_{c2}$, the absence of vortices from the region of thickness $\xi$ near the surface implies that there should be small shielding effects there just as in the range $H_{c2} < H < H_{c3}$. In practice these are obscured in bulk magnetic moment measurements by flux trapped in the interior which dominates the hysteresis. Rothwarf (33) has developed the simple model mentioned in Section VII, which treats the orbital motion of electron pairs about their centers of mass differently from the motion of their centers of mass, and he has predicted that the high-frequency surface impedance should vary linearly with field between $H_{c2}$ and $H_{c3}$. Such behavior has recently been observed
in the surface reactance at 2 Mc/sec by Carlson (34) and in the microwave surface resistance by Gittleman and Rosenblum (35). A linear variation in the surface impedance has also been predicted by Maki (36), from a solution to the Gor'kov equations (8), for the surface sheath of very impure superconductors. Even the most impure systems studied so far appear to be relatively too pure for the theory to apply.
IX. DYNAMIC EFFECTS
The phenomena discussed in previous chapters have pertained to the equilibrium states of a type II superconductor: the order parameter has been considered to vary spatially but not with time, and the appropriate free energy of the system minimized to determine the state of thermodynamic equilibrium. In this chapter, we consider effects in which there is also a temporal variation of the order parameter. Ideally one would like to obtain from the microscopic theory not only the solution given by Abrikosov, discussed in Section VII, which applies to the system in equilibrium, but also solutions applicable when the equilibrium is disturbed by the application of external fields or which would describe the transient behavior when the magnetic field is changed from one static value to another. Although some advances have been made from the standpoint of the microscopic theory, a clearer physical picture of dynamic effects is emerging from a hydrodynamic treatment of the motion of a vortex through the superfluid. Experimental results are complicated in many cases by structural defects in the metal. We shall restrict our discussion mainly to situations where there is a clear-cut and significant agreement or disagreement with the predictions of the vortex model.
A. Steady-State Flux Flow
It has been found that when a steady current is passed through a type II superconductor in the mixed state, electrical resistance may appear, in spite of the large fraction of the material that is superconducting [Kim et al. (37)]. The interpretation of this result in terms of the motion of magnetic flux has been important in the development of models of vortex motion [Anderson and Kim (38)]. Figure 7 shows the potential drop across the type II superconductors $\mathrm{Nb_{50}Ta_{50}}$ and $\mathrm{Pb_{83}In_{17}}$ as a function of current for various values of the magnetic field. At low currents there is no measurable resistance. As the current is increased a potential drop appears, indicating an electric field inside the superconductor. This field is generated by electromagnetic induction owing to the movement of vortices, either individually or in linked groups. The vortices are driven by what has been misleadingly called the “Lorentz force,” which has the form $\mathbf{J} \times \boldsymbol{\phi}_0/c$ per unit length
of the vortex, where $\mathbf{J}$ is the superimposed or transport current density and $\phi_0$ the flux in a vortex. This expression has been derived by a thermodynamic argument by Friedel et al. (39). It may be noted that this force is in the opposite direction from the reaction to the Lorentz force per unit volume $\mathbf{J} \times \mathbf{B}/c$ which acts on the current carriers passing through the flux of the vortex. That some caution is needed in using the simple classical treatment of a charged particle moving in a magnetic field should be apparent from an attempt to consider the motion of the electrons in a single vortex. Some force besides the Lorentz force must be invoked in order to explain the motion of the current in circular loops. We return to the nature of the forces in the next section.

FIG. 7. Flux-flow resistance of two different type II superconductors, after Kim et al. (37). [Horizontal axis: I (amp).]

At low currents, the “Lorentz force” is insufficient to overcome the forces which “pin” the vortices to imperfections in the crystal lattice. The probability of the vortices leaving their potential wells is increased by raising the current and also by raising the temperature. At high currents, the trapping mechanism has little effect and the vortices “flow,” i.e., move through the metal with a speed limited by a viscous force that we discuss later. Motion of the flux occurs at intermediate values of the current by the “flux creep” process in which the vortices spend part of their time “flowing” and the rest of it trapped by imperfections.

In addition to the characteristics which stand out in Fig. 7, it has been found that the resistivity varies with field according to $\rho/\rho_n = H/H_{c2}$. From the Abrikosov theory, $H \approx \phi_0/d^2$, where $d$ is the average distance between vortices, while $H_{c2} \approx \phi_0/\xi^2$, so that $H/H_{c2} = \xi^2/d^2$ (the fraction of the volume that is in the normal state). This assumes, as has been demonstrated by Caroli et al. (14), that the vortex core can be regarded as a cylinder of normal material of radius $\xi$. The field variation of the resistivity is thus proportional to the fraction of normal metal, and leads to the surprising conclusion that the current flows uniformly through the normal and superconducting parts, instead of avoiding the normal cores and passing through the superconducting parts only. Microwave surface resistance measurements [Rosenblum and Cardona (40)] have also supported this conclusion. We return to a discussion of this result in the next section. The fact that the electrical resistance in type II superconductors apparently arises from electromagnetic induction has been a considerable source of confusion and discussion.
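The order-of-magnitude relations for the flux-flow resistivity can be made concrete with a short calculation. The Python sketch below is only an illustration: it takes $H \approx \phi_0/d^2$ and $H_{c2} \approx \phi_0/\xi^2$ as exact equalities (the text gives them only up to numerical factors), works in Gaussian units, and uses function names and sample field values that are assumptions introduced here.

```python
import math

PHI0 = 2.07e-7  # flux quantum hc/2e in gauss cm^2 (Gaussian units)


def vortex_spacing_cm(h_gauss):
    """Average vortex spacing d from H ~ phi0 / d^2 (numerical factors ignored)."""
    return math.sqrt(PHI0 / h_gauss)


def coherence_length_cm(h_c2_gauss):
    """Core radius xi from H_c2 ~ phi0 / xi^2 (numerical factors ignored)."""
    return math.sqrt(PHI0 / h_c2_gauss)


def flux_flow_resistivity_ratio(h_gauss, h_c2_gauss):
    """rho / rho_n = H / H_c2, valid only in the mixed state below H_c2."""
    if not 0.0 <= h_gauss <= h_c2_gauss:
        raise ValueError("field must lie between 0 and H_c2")
    return h_gauss / h_c2_gauss


if __name__ == "__main__":
    h, h_c2 = 5.0e3, 2.0e4  # illustrative fields in gauss, not measured values
    xi = coherence_length_cm(h_c2)
    d = vortex_spacing_cm(h)
    print("rho/rho_n       =", flux_flow_resistivity_ratio(h, h_c2))
    print("normal fraction =", (xi / d) ** 2)
```

With these definitions the normal-core fraction $\xi^2/d^2$ and the resistivity ratio coincide by construction, which is the point made in the text.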
The experiments referred to above were carried out under steady-state conditions so that the flux through the measuring circuit did not change with time. On the other hand, there was a movement of flux through the superconductor, from which an electric field arose within the superconductor. The clearest general explanation of the paradox has been given by Josephson (41), who has shown that in the steady state the difference in the electrochemical potential between two points in a superconductor is equal to the rate at which flux crosses a line in the superconductor joining the two points. This treatment is based on the assumption that the driving force for the current in a superconductor is proportional to the electric field and to the gradient of the chemical potential $\mu_0$, i.e., that
$$\mathbf{F} = e\mathbf{E} - \nabla\mu_0 = -\nabla\mu,$$

where $\mathbf{F}$ is the driving force, $\mu = \mu_0 + eV$ is the electrochemical potential and $V$ the scalar electric potential. This assumption appears to lead to a correct explanation of the observed thermoelectric and magnetoelectric behavior of superconductors [see also Luttinger (42)], one which is in accord with the theory mentioned in the last section of this chapter. It may be noted that the motion of flux in quantized amounts through the superconductor does not imply that flux enters at one side and leaves at the other side of the superconductor. When a vortex approaches the boundary, its flow pattern is no longer circular since no current can flow across the boundary. Just as boundaries may be taken into account in electrostatics and magnetostatics by the method of images, the actual flow pattern in the superconductor can be synthesized by the superposition of the patterns of two undistorted vortices, the first with its center at the center of the real vortex, the second the image of the first in a mirror coincident with the boundary of the superconductor. As the center of the vortex approaches the boundary, there is an overlap between the flow patterns of the two auxiliary vortices. Since their senses of rotation are opposite, they tend to cancel each other. Finally, the first vortex is completely annulled by the second when they both reach the boundary. The flux bundle does not pass out of the superconductor but merely dies away.
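Josephson's statement that the potential difference equals the rate at which flux crosses the line joining two points can be illustrated numerically. The sketch below is a minimal example under stated assumptions: it uses SI units for convenience (the text works in Gaussian units), so that the potential difference in volts equals the rate of flux crossing in webers per second, and it assumes a uniform vortex density $B/\phi_0$ moving with a single velocity component perpendicular to the line. The function names and the sample field, speed, and length are hypothetical.

```python
PHI0_WB = 2.067833848e-15  # flux quantum h/2e in webers


def voltage_from_crossing_rate(vortices_per_second):
    """Potential difference (V) when vortices cross the line at the given rate."""
    return PHI0_WB * vortices_per_second


def crossing_rate(b_tesla, v_perp_m_per_s, length_m):
    """Vortices per second crossing a line of given length (uniform flow assumed)."""
    vortex_density = b_tesla / PHI0_WB  # vortices per square metre
    return vortex_density * v_perp_m_per_s * length_m


def flux_flow_voltage(b_tesla, v_perp_m_per_s, length_m):
    """Potential difference between the ends of the line, equal to B * v * L."""
    return voltage_from_crossing_rate(crossing_rate(b_tesla, v_perp_m_per_s, length_m))


if __name__ == "__main__":
    # Illustrative numbers only: 0.1 T, vortices drifting at 1 m/s, 1 cm between contacts.
    print(flux_flow_voltage(0.1, 1.0, 0.01), "V")
```

The flux quantum cancels in the final result, so the voltage depends only on the flux density, the vortex speed, and the contact separation.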
B. Vortex Motion
Many of the conclusions drawn from the experiments above have been derived from a hydrodynamic treatment of the motion of a single vortex moving through the superfluid of electrons [Bardeen and Stephen (43); Nozières and deGennes (44)]. The starting point in these theories has been the application of Euler's equation, $\rho[(\partial\mathbf{v}/\partial t) + \mathbf{v}\cdot\nabla\mathbf{v}] =$ the total force density, to the superfluid, which has been assumed to follow London's equation (see Section III) at every point, except in the vortex core. The self-consistent internal electromagnetic field is in this case the analog of the pressure gradient which makes the important contribution to the force density in an uncharged, nonviscous classical fluid. The situation studied has been a single vortex with a transport current superimposed upon it. From the Euler equation, the fields in the superfluid have been related to the vortex motion: the field and current in the vortex core have been connected with the fields in the surrounding superfluid by use of continuity conditions at the core boundary. The first of the two treatments cited above arrives at the unexpected result mentioned in the previous section that the applied or transport current density is the same inside the core as outside it: the current flows equally through both normal and superconducting parts. This equality is assumed in the second treatment as the only quantitatively simple assumption that will lead to
resistance of the same order as that observed in practice. The dependence of the flux flow resistance on magnetic field is also explained. We summarize here the theoretical predictions regarding the actual motion of the vortex under the driving force produced by the transport current and a viscous force arising from power dissipation in and around the core. Bardeen and Stephen have obtained $v_{L\parallel T} = v_{L\perp T}\tan\alpha = v_T H/H_{c2}$, where $v_{L\parallel T}$ and $v_{L\perp T}$ are the components of the velocity of the vortex line parallel and perpendicular to $\mathbf{v}_T$, and $v_T$ is the drift velocity of the transport current. The relative orientations of these two velocities and the velocity of the electrons in the core, $\mathbf{v}_c$, are shown in Fig. 8. Also, $\mathbf{E}_c$ is the uniform electric field in the core of the vortex, $\mathbf{J}_T$ is the transport current density (parallel to $\mathbf{v}_T$), and $\mathbf{F}_L$ is the “Lorentz force.” $\alpha$ is the Hall angle for the normal core and is given by $\tan\alpha = \omega_c\tau$, where $\omega_c = eB/m$ and $\tau$ is the relaxation time for the normal state. In the very impure case, $\omega_c\tau \ll 1$, the line moves nearly at right angles to the transport current; in the pure case, $\omega_c\tau \gg 1$, the line moves with the transport current, as expected by analogy with the classical fluid.

FIG. 8. The relative orientations of the “Lorentz force” $\mathbf{F}_L$, the electric field $\mathbf{E}_c$ and the drift velocity $\mathbf{v}_c$ in a vortex core, the velocity of the vortex line $\mathbf{v}_L$, and the superimposed superfluid flow velocity $\mathbf{v}_T$ (parallel to the transport current $\mathbf{J}_T$), according to the analysis of Bardeen and Stephen (43).

A solid cylinder placed in an originally uniform flow of fluid of velocity $\mathbf{v}_1$, with its axis perpendicular to the fluid flow direction, is subject to a force along a direction mutually perpendicular to the cylinder axis and $\mathbf{v}_1$. This force is called the Magnus force and its magnitude is $\rho k v_1$ per unit length of the cylinder, where $k = \oint \mathbf{v}\cdot d\mathbf{l}$ is the circulation of fluid around the cylinder, the contour
of integration being just outside the cylinder, and $\rho$ is the fluid density. If the cylinder is in motion with velocity $\mathbf{v}_L$, the Magnus force becomes $\rho k|\mathbf{v}_1 - \mathbf{v}_L|$. In the steady state, the total force is zero so the cylinder moves with the fluid, with $\mathbf{v}_L = \mathbf{v}_1$. Similarly the core of a vortex in a classical fluid may be treated as a solid cylinder and the same behavior deduced for the motion of a vortex line. It should be noted that the Magnus force is of the same nature as the centripetal force in circular motion and has to be supplied by some physical means. In the classical fluid case, it arises from the gradient in pressure. In the superconductor it is produced by the action of the electromagnetic field. The analogy with the classical fluid suggests that, in a pure type II superconductor, it should be possible to detect the Hall effect in the mixed state, with a large Hall angle [Vinen (45)]. (The Hall angle in a type I superconductor is zero.) The theoretical investigation mentioned above predicts the Hall angle to be that of the normal metal in a magnetic field equal to the field in the core. Recent observations in type II superconductors [Niessen and Staas (46); Reed et al. (47)] have shown the Hall angle to be large but not as large as in the normal state, possibly because of pinning effects preventing free motion of the vortices under the action of the driving and viscous forces. No detailed comparison with theory of the complicated results reported in the former reference has yet been made.

C. Vortex Waves
A taut string plucked aside from its equilibrium position undergoes vibrations which can be described in terms of waves traveling along the string. Similarly, a vortex line disturbed from its equilibrium position in a magnetic field, by, for instance, passing a localized current impulse near one part of the vortex, is expected to undergo a precession about the steady field direction which can be described in terms of circularly polarized waves traveling along the vortex. A derivation of the wave equation has been given by deGennes (5). A search for such waves in the mixed state of type II superconductors has been carried out in many laboratories without success [see, for instance, Borcherds et al. (48)]. The analyses of Nozières and deGennes (44), and of Bardeen and Stephen (43), have indicated that the criterion for the propagation of these waves is $\omega_c\tau \gg 1$, where $\omega_c$ is the cyclotron resonance frequency, $eB/m$, at an induction $B$ equal to the flux density in the cores of the vortices in the mixed state, and $\tau$ is the relaxation time in the normal state of the metal. In all the cases studied so far, $\omega_c\tau$ has been much less than unity (in many cases the type II materials were alloys with low $\tau$), mainly because the highest field at which the mixed state can be studied, $H_{c2}$, is relatively small in comparison with the values at which the magnetic properties of normal metals are observed.

Consideration of the motion of a vortex is of course a convenient model for analyzing the time variation of the order parameter in the mixed state. Nozières and deGennes (49) have also treated the motion of a “flux-tube,” such as could be formed in the intermediate state of a type I superconductor [cf., e.g., Faber (50)], and have predicted that the phase boundary in that situation moves in such a way that again circularly polarized waves can propagate with the dispersion formula $q^2 = \omega/\lambda_L^2\omega_0$. Here $q$ and $\omega$ are the wavevector and the angular frequency of the waves, respectively, $\lambda_L^2 = mc^2/4\pi ne^2$, and $\omega_0$ is the cyclotron resonance frequency at a flux density $H_c$ (the critical field), regardless of the external field $H$, provided $H \le H_c$. This formula is similar to the dispersion formula for helicon waves (51) in the normal metal except that $\omega_0$ is replaced by the cyclotron frequency at a field $H$. The methods of excitation and detection of such waves are similar to those for helicons. Circularly polarized waves with a dispersion which depends on the external field in the manner predicted by Nozières and deGennes have been observed in very pure indium by Hays (52) and more recently by Kushnir (53), and by Maxfield and Johnson (54). The damping of these waves was much less than might be expected from a set of unconnected cylindrical superconducting regions, suggesting that a filamentary structure, of the type observed in aluminum by Faber (50), forms also in indium in the intermediate state.
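To give a feeling for the scales implied by the dispersion formula $q^2 = \omega/\lambda_L^2\omega_0$, the following Python sketch evaluates the wavevector and wavelength for a chosen frequency. It is a rough illustration under stated assumptions: SI units are used, the cyclotron frequency is computed with the free-electron mass, and the penetration depth, critical field, and excitation frequency in the example are placeholder values, not data from the indium experiments cited. All function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge in coulombs
E_MASS = 9.1093837015e-31   # free-electron mass in kg (an assumption for the metal)


def cyclotron_frequency(b_tesla):
    """Angular cyclotron frequency eB/m in rad/s."""
    return E_CHARGE * b_tesla / E_MASS


def wavevector(omega, lambda_l_m, b_c_tesla):
    """q from q^2 = omega / (lambda_L^2 * omega_0), with omega_0 taken at the critical field."""
    omega_0 = cyclotron_frequency(b_c_tesla)
    return math.sqrt(omega / (lambda_l_m ** 2 * omega_0))


def wavelength(omega, lambda_l_m, b_c_tesla):
    """Wavelength 2*pi/q in metres."""
    return 2.0 * math.pi / wavevector(omega, lambda_l_m, b_c_tesla)


if __name__ == "__main__":
    omega = 2.0 * math.pi * 1.0e3  # 1 kHz excitation, illustrative
    lam_l = 5.0e-8                 # penetration depth in metres, illustrative
    b_c = 0.02                     # critical flux density in tesla, illustrative
    print("q          =", wavevector(omega, lam_l, b_c), "1/m")
    print("wavelength =", wavelength(omega, lam_l, b_c), "m")
```

Because $q$ varies as $\sqrt{\omega}$, quadrupling the frequency doubles the wavevector, the same scaling that characterizes helicon waves in the normal metal.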
An elegant demonstration of the existence of circularly polarized or precessional modes in the intermediate state has been given by Haenssler and Rinderer (55), who observed that when a field was applied normal to a superconducting indium disk which had been sprinkled with fine diamagnetic niobium powder, a spiral pattern of flux entry (or flux escape when the field was reduced) was formed. A similar demonstration has been given by DeSorbo (56) from moving photographs taken of a surface just above which was a cerous phosphate plate of high Faraday coefficient. Linearly polarized light has its plane of vibration rotated by an amount which depends on the density of flux emerging from the surface of the superconductor immediately below.
D. The Surface Barrier
It had been noted in experiments on type II superconductors that there was considerable irreversibility; for instance, the magnetization depended not only on the value of the magnetic field and the temperature but also on previous values of the magnetic field, i.e., whether it was being increased or decreased. Bean and Livingston (57) suggested that apart from the effects of structural imperfections, there was an intrinsic
irreversibility caused by a potential energy barrier for vortices at the metal surface. As a vortex approaches the boundary, it becomes distorted as discussed earlier and is attracted towards the boundary (there is an attraction between the two counterflowing vortices: the undistorted vortex and its image). However, if the superconducting order parameter is nonzero, the magnetic induction falls off below the surface, providing a magnetic pressure gradient which repels the vortex from the boundary. The second of these two forces is dominant except when the vortex is close to the boundary. deGennes and Matricon (58) have considered the barrier by taking into account the presence of the surface in solving the Ginzburg-Landau equations. Their result is that, although it may be energetically favorable for a vortex to form in the body of the material once $H$ is greater than $H_{c1}$, nucleation cannot occur at the surface until $H = H_c$, the thermodynamic critical field. A convincing verification of this result has been obtained by DeBlois and DeSorbo (59). From this model we might expect that near the surface there would always be a lower vortex density than in the body of the superconductor. In other words, the order parameter to a depth approximately $\xi$ near the surface should not be depressed as it is in the vicinity of the vortex core, although the body of the material may be closely packed with vortices. This is borne out by detailed calculations from the Ginzburg-Landau equations [Fink (60)].
E. Vortex Motion in Thin Films
As has already been mentioned in Section VII, vortices can form not only in type II superconductors but also in thin films of type I superconductors, with the strong magnetic field perpendicular to the plane of the film. An interesting connection between vortex motion and the more fundamental theoretical formulation of dynamic effects in terms of the time dependence of the order parameter $\psi(\mathbf{r})$ has been deduced from experiments on thin film bridges [Anderson and Dayem (61)]. According to the microscopic theory of superconductivity [Gor'kov (8)], the energy gap function $\Delta$, and hence the order parameter for the electrons in a superconductor, contains a time-varying phase factor $\exp(-2i\mu t/\hbar)$, where $\mu$ is the electrochemical potential. In many circumstances, the phase has no observable effect since it is usually quantities like $|\psi|^2$ or $\psi^*\nabla\psi$, etc., which govern the behavior of the measurable properties of a system. A case where the phase factor does matter is in the Josephson effect in the tunneling of a current from one superconductor to another through an insulating layer separating the two (62). In addition to other effects, if a potential difference $V$ is established between the two superconductors
across the insulating layer, an alternating current of frequency $f$, given by $hf = 2eV$, passes through the insulating barrier. The thin film bridge also evidently acts in the same way as the insulating layer in a tunneling experiment, allowing weak coupling of the two more massive pieces of superconductor on either side. When a current is passed across the bridge, a potential difference may arise, as it does in the flux-flow experiments, through motion of the vortices in a direction that has a component perpendicular to the superimposed current flow. In Anderson and Dayem's experiment, the superposition of an oscillating current of frequency $f$ caused a resonance (presumably with the Josephson-type oscillation) which was detectable in the dc voltage-current curves at voltage intervals $\delta V$ given by $hf = 2e\,\delta V$. Anderson and Dayem have suggested that such effects may be understood in terms of what they call the “other” Ginzburg-Landau equation
which explicitly gives the time variation of $\psi$. Here $\lambda_D$ is the Debye screening length, and is related to the London penetration depth $\lambda_L$ and the Fermi velocity $v_F$ by $\lambda_D = \lambda_L v_F/\sqrt{3}\,c$. A phenomenological derivation of this equation has recently been given by Anderson et al. (63). A similar result has been obtained from an extension of the Gor'kov (8) theory by Abrahams and Tsuneto (64).

APPENDIX
The Free Energies of a Superconductor in a Magnetic Field
Magnetism problems are usually formulated in terms of either magnetic poles or Amperian currents. In the latter approach, the magnetic material is hypothetically replaced by a nonmagnetic medium in which there is a distribution of currents which produces the same flux density at every point in space as exists in the actual case. If the magnetic moment per unit volume of the magnetic material is $\mathbf{M}(\mathbf{r})$, the Amperian current density is given by $\mathbf{J}(\mathbf{r}) = \operatorname{curl}\mathbf{M}(\mathbf{r})$. Both the Amperian currents and the conduction currents (of density $\mathbf{J}_{\text{cond.}}$) are coupled to the magnetic flux so that $\operatorname{curl}\mathbf{B} = 4\pi(\mathbf{J} + \mathbf{J}_{\text{cond.}})$. The magnetic properties of superconductors arise from the fields generated by the supercurrents which flow in the superconductor. These must not be counted both as Amperian and conduction currents. Here we prefer to consider them as Amperian currents and to relate a density of magnetization to them by the equation given above. To avoid the complications arising from demagnetization effects, we
shall restrict ourselves to specimens of negligible demagnetizing coefficient, such as long thin cylinders with the external field parallel to their axes. We now derive first of all the Helmholtz free energy per unit volume, $F$, using the result that the change in $F$ is equal to the work done on the system during an isothermal change. From Poynting's theorem, the rate of flow of energy across a closed surface just outside the boundary of the superconductor is
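$$\frac{dW}{dt} = -\frac{c}{4\pi}\int \mathbf{E}\times\mathbf{B}\cdot\mathbf{n}\,dS = \frac{1}{4\pi}\int\left(\mathbf{B}\cdot\frac{\partial\mathbf{B}}{\partial t} + \mathbf{E}\cdot\frac{\partial\mathbf{E}}{\partial t}\right)d^3r + \int \mathbf{E}\cdot\mathbf{J}\,d^3r.$$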
Assuming that the electric field exists only during the transient stage of applying the magnetic field, we get for the work done on the superconductor
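$$\int\bigl(F(B,T) - F(0,T)\bigr)\,d^3r = \int d^3r\left\{\frac{1}{4\pi}\int_0^B \mathbf{B}\cdot d\mathbf{B} + \int_{-\infty}^{0}\mathbf{E}\cdot\mathbf{J}\,dt\right\} = \int d^3r\left\{\frac{B^2}{8\pi} + \int_{-\infty}^{0}\mathbf{E}\cdot\mathbf{J}\,dt\right\},$$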
where the upper limit of integration $B$ is the steady flux density that has been reached at $t = 0$ in the volume element $d^3r$. Thus we may take⁵

$$F(B,T) - F(0,T) = \frac{B^2}{8\pi} + \int_{-\infty}^{0}\mathbf{E}\cdot\mathbf{J}\,dt.$$
The second term represents the density of kinetic energy of the supercurrent, being the work done in setting up the current. The thermodynamic potential whose minimum gives the condition for equilibrium in the presence of a fixed external field is the Gibbs free energy
$$G = F - \mathbf{H}\cdot\mathbf{B}/4\pi$$
where H is the flux density of the external field alone or the magnetic intensity. Thus
$$G(B,T) - G(0,T) = (B^2 - 2\mathbf{H}\cdot\mathbf{B})/8\pi + \text{kinetic energy density}.$$
The part of this energy that the system would have if it were nonmagnetic, that is if $\mathbf{B} = \mathbf{H}$, is not of interest and we subtract it off, leaving

$$G(B,T) - G(0,T) = (\mathbf{B} - \mathbf{H})^2/8\pi + \text{kinetic energy density}.$$
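The kinetic energy density can be expressed in a number of different forms. For instance, if the supercurrent satisfies the London equation $\Lambda\,\partial\mathbf{J}/\partial t = \mathbf{E}$, where $\Lambda = 4\pi\lambda_L^2/c^2$, then

$$\int\mathbf{E}\cdot\mathbf{J}\,dt = \int\Lambda\frac{\partial\mathbf{J}}{\partial t}\cdot\mathbf{J}\,dt = \int\frac{\partial}{\partial t}\left(\tfrac{1}{2}\Lambda J^2\right)dt = \tfrac{1}{2}\Lambda J^2.$$

⁵ The energy of a system in general cannot be considered to be localized in particular parts of the system [see Heine (65)].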
Using $J = nev$ and $\Lambda = m/ne^2$, we get

$$\tfrac{1}{2}\Lambda J^2 = n\left(\tfrac{1}{2}mv^2\right).$$
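The identification of $\int\mathbf{E}\cdot\mathbf{J}\,dt$ with the kinetic energy density can be verified numerically. The Python sketch below is an illustration only: it ramps the current from zero along an arbitrary path, takes the field at each step from the London equation $\Lambda\,\partial J/\partial t = E$, and accumulates the work $E\,J\,dt$. The function names, the ramp shape, and the carrier density and drift velocity (in cgs units) are assumptions introduced here.

```python
def london_lambda(n, m, e):
    """London parameter Lambda = m / (n e^2)."""
    return m / (n * e ** 2)


def work_to_set_up_current(j_values, lam):
    """Accumulate E.J dt along a current ramp, with E dt = lam * dJ (London equation)."""
    work = 0.0
    for j_prev, j_next in zip(j_values, j_values[1:]):
        # E dt over the step is lam * (j_next - j_prev); J is taken at the step midpoint.
        work += lam * (j_next - j_prev) * 0.5 * (j_next + j_prev)
    return work


if __name__ == "__main__":
    # Illustrative cgs values: carrier density, electron mass, charge, drift velocity.
    n, m, e, v = 1.0e22, 9.109e-28, 4.803e-10, 1.0e3
    lam = london_lambda(n, m, e)
    j_final = n * e * v
    ramp = [j_final * (k / 100) ** 2 for k in range(101)]  # arbitrary smooth ramp
    print("integral of E.J dt :", work_to_set_up_current(ramp, lam))
    print("(1/2) Lambda J^2   :", 0.5 * lam * j_final ** 2)
    print("n (1/2) m v^2      :", n * 0.5 * m * v ** 2)
```

The three printed quantities agree to rounding error whatever ramp is chosen, since the work depends only on the final current.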
Alternatively, from the other London equation,
$$c\Lambda\mathbf{J} = -\mathbf{A}, \qquad \tfrac{1}{2}\Lambda J^2 = ne^2A^2/2mc^2.$$
This should be compared with the expression on p. 7, recalling that in the Ginzburg-Landau theory, $n = |\psi|^2$. Finally we show the relation between

$$dW = \int\left[\frac{1}{4\pi}\mathbf{B}\cdot d\mathbf{B} + \int\mathbf{E}\cdot\mathbf{J}\,dt\right]d^3r$$

and a more usual form for $dW$.
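$$\begin{aligned}
dW &= \int d^3r\left[\frac{1}{4\pi}\mathbf{B}\cdot d\mathbf{B} + \int\mathbf{E}\cdot\operatorname{curl}\mathbf{M}\,dt\right]\\
&= \int d^3r\left[\frac{1}{4\pi}\mathbf{B}\cdot d\mathbf{B} + \int\mathbf{M}\cdot\operatorname{curl}\mathbf{E}\,dt\right] - \int dt\int\mathbf{n}\cdot\mathbf{E}\times\mathbf{M}\,dS\\
&= \int d^3r\left[\frac{1}{4\pi}\mathbf{B}\cdot d\mathbf{B} - \int\mathbf{M}\cdot\frac{\partial\mathbf{B}}{\partial t}\,dt\right]\\
&= \int d^3r\,\frac{1}{4\pi}(\mathbf{B} - 4\pi\mathbf{M})\cdot d\mathbf{B}\\
&= \int d^3r\,\frac{1}{4\pi}\mathbf{H}\cdot d\mathbf{B}.
\end{aligned}$$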
($\mathbf{M} = 0$ over the surface of integration, which is outside the superconductor.) Again, subtracting off the work that would be done if the superconductor were nonmagnetic, we get for the contribution per unit volume $\mathbf{H}\cdot d\mathbf{M}$.

ACKNOWLEDGMENTS
We are very grateful to the persons mentioned in the list of references who have privately communicated results of their work prior to publication. We wish to thank Mr. D. E. Carlson for help with the drawings. We are grateful to Professor E. Abrahams and Professor P. R. Weiss for pointing out some errors and obscurities in the original manuscript.
REFERENCES
1. E. A. Lynton, “Superconductivity,” 2nd ed. Methuen, London, 1964.
2. A. B. Pippard, The Dynamics of Conduction Electrons, in “Low Temperature Physics” (C. DeWitt, B. Dreyfus, and P. G. deGennes, eds.). Gordon and Breach, New York, 1962.
3. J. Bardeen, L. N. Cooper, and J. R. Schrieffer, Phys. Rev. 108, 1175 (1957).
4. F. London, “Superfluids,” Vol. I. Wiley, New York, 1950.
5. P. G. deGennes, Troisième Cycle Notes, “Métaux et Alliages Supraconducteurs,” Vol. II. Paris, 1962-1963. Now available in P. G. deGennes, “Superconductivity of Metals and Alloys.” Benjamin, New York, 1966.
6. V. L. Ginzburg and L. D. Landau, Soviet Phys. JETP 20, 1064 (1950).
7. L. D. Landau and E. M. Lifshitz, “Statistical Physics.” Pergamon Press, Oxford, 1958.
8. L. P. Gor'kov, Soviet Phys. JETP 9, 1364 (1959); 10, 998 (1960).
9. B. B. Goodman, Rev. Mod. Phys. 36, 12 (1964).
10. A. B. Pippard, Proc. Roy. Soc. A216, 547 (1953).
11. A. B. Pippard, Proc. Cambridge Phil. Soc. 47, Pt. 3, 617 (1951).
12. R. B. Dingle, Proc. Roy. Soc. A211, 500 (1952).
13. A. A. Abrikosov, Soviet Phys. JETP 5, 1174 (1957); Phys. Chem. Solids 2, 199 (1957).
14. C. Caroli, P. G. deGennes, and J. Matricon, Phys. Letters 9, 307 (1964).
15. B. Serin, Phys. Letters 16, 112 (1965).
16. D. Cribier, B. Jacrot, B. Farnoux, and L. Madhav Rao, 11th Conf. on Magnetism and Magnetic Materials, San Francisco, 1965. J. Appl. Phys. 37, 952 (1966).
17. P. G. deGennes, Troisième Cycle Notes, “Métaux et Alliages Supraconducteurs,” Vol. IV. Paris, 1963-1964. Now available in P. G. deGennes, “Superconductivity of Metals and Alloys.” Benjamin, New York, 1966.
18. A. Rothwarf, Phys. Letters 16, 217 (1965).
19. M. Cardona, J. Gittleman, and B. Rosenblum, Phys. Letters 17, 92 (1965).
20. T. Kinsel, E. A. Lynton, and B. Serin, Rev. Mod. Phys. 36, 105 (1964).
21. G. Bon Mardion, B. B. Goodman, and A. Lacaze, Phys. Chem. Solids 26, 1143 (1965).
22. L. P. Gor'kov, Soviet Phys. JETP 10, 593 (1960).
23. P. G. deGennes, Phys. Condensed Matter 3, 79 (1964).
24. K. Maki, Physics (N.Y.) 1, 127 (1964).
25. T. McConville and B. Serin, Phys. Rev. 140, A1169 (1965).
26. A. R. Strnad and Y. B. Kim, Private communication, preprint, 1965.
27. M. Tinkham, Phys. Rev. 129, 2413 (1963); Rev. Mod. Phys. 36, 268 (1964).
28. R. D. Parks, J. M. Mochel, and L. V. Surgent, Jr., Phys. Rev. Letters 13, 331a (1964).
29. D. Saint-James and P. G. deGennes, Phys. Letters 7, 306 (1963).
30. B. Rosenblum and M. Cardona, Phys. Letters 9, 220 (1964).
31. D. Saint-James, Phys. Letters 16, 218 (1965).
32. D. J. Sandiford and D. G. Schweitzer, Phys. Letters 13, 98 (1964).
33. A. Rothwarf, Private communication, 1965.
34. D. E. Carlson, Private communication, 1965.
35. J. Gittleman and B. Rosenblum, Private communication, 1965.
36. K. Maki, On surface superconductivity in the sub-critical region, Publ. COO-264. Univ. of Chicago, Chicago, Illinois, 1964.
37. Y. B. Kim, C. F. Hempstead, and A. R. Strnad, Phys. Rev. 139, A1163 (1965).
38. P. W. Anderson and Y. B. Kim, Rev. Mod. Phys. 36, 39 (1964).
39. J. Friedel, P. G. deGennes, and J. Matricon, Appl. Phys. Letters 2, 119 (1963).
40. B. Rosenblum and M. Cardona, Communicated at Conf. Phys. Type II Superconductivity, Cleveland, 1964.
41. B. D. Josephson, Phys. Letters 16, 242 (1965).
42. J. M. Luttinger, Phys. Rev. 136, A1481 (1964).
43. J. Bardeen and M. J. Stephen, Phys. Rev. 140, A1197 (1965).
44. P. Nozières and P. G. deGennes, “Magnus Force and Flux Flow in Superconductors.” Private communication, preprint, 1964.
45. W. F. Vinen, Rev. Mod. Phys. 36, 48 (1964).
46. A. K. Niessen and F. A. Staas, Phys. Letters 16, 26 (1965).
47. W. A. Reed, E. Fawcett, and Y. B. Kim, Phys. Rev. Letters 14, 790 (1965).
48. P. H. Borcherds, C. E. Gough, W. F. Vinen, and A. C. Warren, Phil. Mag. 10, 349 (1964).
49. P. Nozières and P. G. deGennes, Phys. Letters 16, 216 (1965).
50. T. E. Faber, Proc. Roy. Soc. A248, 460 (1958).
51. For review papers on this topic, see “Plasma Effects in Solids,” Proc. 7th Intern. Conf. Phys. Semicond., Paris, 1964. Academic Press, New York, 1964.
52. D. A. Hays, Private communication, 1965.
53. A. J. Kushnir, Private communication, 1965.
54. B. W. Maxfield and E. J. Johnson, Phys. Rev. Letters 15, 677 (1965).
55. F. Haenssler and L. Rinderer, Phys. Letters 16, 29 (1965).
56. W. DeSorbo, Phil. Mag. 11, 853 (1965).
57. C. P. Bean and J. D. Livingston, Phys. Rev. Letters 12, 14 (1964).
58. P. G. deGennes and J. Matricon, Rev. Mod. Phys. 36, 45 (1964).
59. R. W. DeBlois and W. DeSorbo, Phys. Rev. Letters 12, 499 (1964).
60. H. J. Fink, Phys. Rev. Letters 14, 853 (1965).
61. P. W. Anderson and A. H. Dayem, Phys. Rev. Letters 13, 195 (1964).
62. B. D. Josephson, Phys. Letters 1, 251 (1962); Rev. Mod. Phys. 36, 216 (1964).
63. P. W. Anderson, N. R. Werthamer, and J. M. Luttinger, Phys. Rev. 138, A1157 (1965).
64. E. Abrahams and T. Tsuneto, Phys. Rev., to be published.
65. V. Heine, Proc. Cambridge Phil. Soc. 52, 546 (1956).
66. L. Dubeck, P. Lindenfeld, E. A. Lynton, and H. Rohrer, Phys. Rev. Letters 10, 98 (1963).
This Page Intentionally Left Blank
Measurement of Weak Magnetic Fields by Magnetic Resonance

P. A. GRIVET
University of Paris, Institut d'Electronique Fondamentale, Orsay (91-Essonne), France

AND

L. MALNAR
C.F.S., Dépt. de Physique Appliquée, Corbeville par Orsay (91-Essonne), France
I. Introduction
   A. Outline of This Study
   B. Comparison with Nonresonant Methods
   C. Scalar and Vector Measurements
   D. Units
II. Order of Magnitude and Main Characteristics of Natural Fields
   A. Geomagnetic Field
   B. Interplanetary Field
III. Nuclear Resonance
   A. Prepolarization and Free Nuclear Precession
   B. Overhauser Polarization and Spin Oscillator
IV. Optical Detection of an Electron Nuclear Resonance ($\Delta m_F = \pm 1$)
   A. Optical Pumping
   B. Zeeman Excitation
   C. Experimental Orders of Magnitude
   D. Helium Magnetometer
V. An Example of Design: The Cesium Vapor Magnetometers
   A. Introduction
   B. The Magnetometers
   C. Optimum Working Conditions and Limitations in Use
   D. Examples of Designs
VI. Superconducting Interferometers as Magnetometers
   A. Principle
   B. The First Practical SQUID
References
I. INTRODUCTION
A. Outline of This Study

In 1946 the discovery of nuclear magnetic resonance by Purcell and Bloch opened a new era in the domain of accurate magnetic field measurement. Indeed, the new phenomenon links accurately and linearly the value of a field B to a circular frequency $\omega$, by introducing a new physical constant, the gyromagnetic ratio $\gamma$:

$$\omega = \gamma B \tag{1}$$

where $\omega$ is the angular velocity of precession of the magnetic moment M of the specimen around the vector B (Fig. 1), and $\gamma$ characterizes the substance in which the resonance is observed or possibly, with some minor corrections, the nucleus whose magnetic moment is put into resonance.

FIG. 1. Free precession of a magnetic moment M around a magnetic field B according to the law $d\mathbf{M}/dt = \gamma(\mathbf{M} \times \mathbf{B})$, for $\gamma > 0$.
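As a rough numerical illustration of Eq. (1), the short sketch below converts a field value into an ordinary precession frequency $f = \omega/2\pi$. It assumes the proton value of $\gamma$ for water quoted in this section (cgs units); the function name and the list of sample fields are illustrative choices only.

```python
import math

# Proton gyromagnetic ratio in water, cgs value quoted in the text
# (rad per second per gauss).
GAMMA_PROTON = 2.67513e4


def precession_frequency_cps(b_gauss, gamma=GAMMA_PROTON):
    """Ordinary frequency f = omega / (2*pi) = gamma * B / (2*pi), in cps."""
    return gamma * b_gauss / (2.0 * math.pi)


# Sample fields chosen for illustration: a laboratory field, the earth's
# field at middle latitudes, and the interplanetary level.
for label, b in [("500 gauss", 500.0),
                 ("0.5 gauss", 0.5),
                 ("50 microgauss", 50e-6)]:
    print(f"{label}: {precession_frequency_cps(b):.4g} cps")
```

The proton frequency falls from the megacycle range at 500 gauss to about 2 kc/sec in the earth's field and to a fraction of a cycle per second at the interplanetary level.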
At high fields, protons are ordinarily provided by such convenient liquids as water or benzene, and then one has

(cgs)  $\gamma = 2.67513 \times 10^{4}$ gauss$^{-1}$ sec$^{-1}$
(mks)  $\gamma = 2.67513 \times 10^{8}$ tesla$^{-1}$ sec$^{-1}$

for pure deoxygenated water. The passage to $\gamma_p$, the value characterizing the bare protons of the theory, would involve a few local field corrections, of the order of a few parts per million in relative value. They are of no importance for the type of measurements involved here, but their small magnitude explains why the $\gamma$ of any diamagnetic liquid is very near the proton's value, so that for measurements which do not seek an accuracy beyond this level the choice of the specimen is not critical and the resonance method is a very convenient one. The number of significant figures given for $\gamma$ shows that the resonance is a very sharp one, and that Hipple et al. (1) and later Bender and Driscoll (2) had to resort to the most painstaking care in their classical measurements of B at the National Bureau of Standards in order to obtain $\gamma$ with such accuracy. As long as B is steady during some 10 sec, the measurement of $\omega$ presents no difficulty.

The only limiting condition for this technique to maintain all its advantages is that the field be sufficiently high: higher than 500 gauss, to state a definite limit. For lower fields the resonance frequency still obeys Eq. (1) and the line remains sharp, but the signal-to-noise ratio deteriorates and progressively impairs the accuracy. Considering, for simplicity, the case where the sample used is of constant volume, the signal decreases as $B^2$. Moreover, when the 10-gauss region is reached, the frequency enters the 20-kc/sec range and flicker noise appears. Geophysical fields are still an order of magnitude lower, near 0.5 gauss in the middle latitude countries, and the direct detection of a nuclear resonance signal succeeds only in exceptionally good "natural" conditions (see Section II). This tour de force was accomplished only very recently in the geophysical observatory at Jussy, near Geneva, by the specialized team of Béné et al. (3, 4).

In 1957, clever use of the Overhauser effect led Abragam et al. (5) to introduce a new technique of "double irradiation" in nuclear resonance, and so obtain a signal which decreases as the first power of B. It will be shown in Section III that, by this method, a fair signal-to-noise ratio is produced, of the order of 100 or more, for fields of geophysical interest.

In the same year, 1957, the flight of the first Sputnik focused attention on interplanetary conditions. In the nearest region of space, and in first approximation, the earth acts as a magnetic dipole, and its field decreases rapidly as $(r/R)^{-3}$ with the distance r to the center, R being the earth radius ($R \approx 6400$ km). So, at some 30 earth radii, the influence of the earth practically disappears, reaching the 20-μG (microgauss) level, and one reaches the "interplanetary level"; in this region the sources of the field are largely unknown. Today it is only known that it is characterized by a very low amplitude of some 50 μG and that the currents due to ionic winds are the probable cause of the interplanetary field. In this range, the linear law in B, characteristic of "double irradiation" in nuclear resonance, becomes unfavorable too, and moreover the resonance frequency lies in a very low and also unfavorable range: 0.2 cps for 50 μG. Then, as explained later, the spin of the space ship interferes badly with the spin of the nuclei: nuclear resonance in liquids is no longer useful.

TABLE I

Atom      Resonant frequency for      Frequency at the residual space
          B = 1 gauss (cps)           level of 50 μG (cps)
85Rb      4.67 × 10^5                 23
133Cs     3.5 × 10^5                  17
4He       2.8 × 10^6                  140
3He       3.8 × 10^6                  190
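The two columns of Table I are related by the linear law of Eq. (1), and the sketch below checks this relation and compares it with the proton case. The per-gauss values are those read from Table I, with the exponents for Rb and Cs taken as $10^5$, which is an assumption made here because it is what the 50 μG column implies. The dictionary and function names are illustrative only.

```python
import math

# Resonance frequency per gauss (cps), as read from Table I. The exponents
# for 85Rb and 133Cs are assumed to be 1e5, consistent with the second column.
TABLE_I_CPS_PER_GAUSS = {
    "85Rb": 4.67e5,
    "133Cs": 3.5e5,
    "4He": 2.8e6,
    "3He": 3.8e6,
}

# Proton value in water, derived from gamma = 2.67513e4 rad/(sec gauss).
PROTON_CPS_PER_GAUSS = 2.67513e4 / (2.0 * math.pi)

SPACE_FIELD_GAUSS = 50e-6  # residual space level used in Table I


def frequency_cps(field_gauss, cps_per_gauss):
    """Linear law of Eq. (1), written for the ordinary frequency."""
    return cps_per_gauss * field_gauss


print(f"proton: {frequency_cps(SPACE_FIELD_GAUSS, PROTON_CPS_PER_GAUSS):.2f} cps")
for atom, k in TABLE_I_CPS_PER_GAUSS.items():
    print(f"{atom}: {frequency_cps(SPACE_FIELD_GAUSS, k):.1f} cps")
```

With these inputs the optical resonances stay in the range of tens to hundreds of cycles per second at 50 μG, whereas the proton resonance drops to about 0.2 cps.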
But in 1957, "optical detection" of electron[1] resonance in an alkali metal vapor, as proposed some time before by the discoverer of "optical pumping," Kastler[2a] (7), appeared for the first time as feasible. This was due to the important improvements suggested by Dehmelt (8) and implemented by Bell and Bloom (9) at Varian Associates. For the simple free electron resonance $\gamma_e$ would be 660 times larger than $\gamma_p$. In fact, here, for the coupled electron-nucleus resonance, the so-called $\Delta m_F = \pm 1$ transition (see Section IV), $\gamma$ is of the same order of magnitude or even larger; for example, see Table I.

Remark. As regards frequency, helium magnetometers are comparable to the alkali metal vapor type and are included in the table. The 4He nucleus is devoid of any nuclear spin and the metastable level[2] $2^3S_1$ is split into three equidistant $m_J = 0, \pm 1$ levels, so that the magnetic transitions $\Delta m_J = \pm 1$ all have the same frequency. For 3He, which has a nuclear spin of 1/2, the metastable $2^3S_1$ level is split into two sublevels, F = 3/2 and F = 1/2, by hyperfine coupling. Two magnetic transitions are available. The one mentioned in Table I, relative to F = 1/2, is the highest in frequency. The other transition is less favorable: it pertains to the F = 3/2 level, and occurs at half the frequency of the first.

Table I shows clearly that the resonance frequency still amounts to usable values and that 3He offers the most favorable conditions in this respect. Of still greater importance is a general advantage of Kastler's optical method valid for all its variants: the signal intensity is independent of B. This is a "trigger" method: one does not observe directly the quantum emitted in the magnetic transition, which would be $\hbar\omega = \hbar\gamma B$, i.e., proportional to B. Instead, one receives in a photocell an optical quantum $h\nu_D$ of constant energy, $\nu_D$ being the frequency of the D line of the alkali metal. The emission of the big and constant optical quantum is "triggered" by each[3] appearance of a small magnetic quantum irrespective of the latter's size, which is proportional to B. This high sensitivity at very low fields is the chief advantage of the "optical pumping method." The efficiency of optical magnetometers would only wane for extremely low fields in the neighborhood of 100 μG. This is an order-of-magnitude estimate of the line width of the optical transition, when translated from the ordinary frequency domain ($\Delta\nu = \Delta\omega/2\pi$) to the magnetic field scale by use of law (1).

In this review, we shall concentrate on the two types of low-field magnetometers which we have presented in this historical introduction. But before starting with the main subject, we shall recall briefly, in Section II, the characteristics of the natural magnetic fields which bear on the accuracy of the measurements. Indeed, geomagnetic fields, like interplanetary fields, are not steady and their rates of change influence the technique of measurement. On the other hand, the study of their variation in time is of high interest. The conditions in geophysics are very different from those which prevail for high fields in the laboratory: there, as soon as the physicist has accomplished some noticeable progress in accuracy, he uses this knowledge in determining the residual fluctuations of the field. He is then able to compensate them by some kind of feedback process: by so doing, he is paving the way for new progress in the measuring art. In the natural field domain, fluctuations are both a factor limiting the accuracy of the measuring process and also an interesting object of study. They cannot be suppressed and for this reason a rough knowledge of the natural and often erratic variations of the natural fields seems in order before studying the instruments and methods of measurement.

[1] Here the phrase "electron resonance" is an oversimplification: in a classical vector model (6, Chapter 8), the vector precessing around B would not be a pure electron spin moment, but the resultant of both an electron and a nuclear moment.
[2] See Leighton (6, pp. 204, 205) for a clear explanation of atomic level symbolism.
[2a] Nobel prize in physics, 1966.
[3] Neglecting relaxation "leaks."
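The three signal laws met in this outline (direct nuclear resonance falling as $B^2$ at constant sample volume, Overhauser double irradiation as the first power of B, optical detection independent of B) can be compared with a few lines of arithmetic. The sketch below only compares signal amplitudes relative to a 500-gauss reference; noise is not modeled, and the names and the choice of reference field are assumptions made for illustration.

```python
# Exponent n in "signal proportional to B**n" for each detection scheme,
# as stated in the text (constant sample volume assumed).
SCALING_EXPONENT = {
    "direct nuclear resonance": 2,
    "Overhauser double irradiation": 1,
    "optical detection": 0,
}

B_REF_GAUSS = 500.0  # reference field, taken here as the high-field limit


def relative_signal(b_gauss, exponent, b_ref_gauss=B_REF_GAUSS):
    """Signal at b_gauss divided by the signal at the reference field."""
    return (b_gauss / b_ref_gauss) ** exponent


for b in (0.5, 50e-6):  # earth field and interplanetary level, in gauss
    for method, n in SCALING_EXPONENT.items():
        print(f"B = {b:g} gauss, {method}: {relative_signal(b, n):.1e}")
```

Going from 500 gauss to the earth's field costs a factor of a million in the direct method, a factor of a thousand with double irradiation, and nothing with optical detection, which is the ordering of the three techniques discussed in this review.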
B. Comparison with Nonresonant Methods

In the past decade, older methods were greatly perfected too, and for some applications they look more attractive than the resonance method, even today. In these paragraphs, we indicate some references in order to offer modern points of comparison with resonance.

1. Induction. Induction coils equipped with ferrite cores have high sensitivity for fast variations of the field: they are used for the study of the geomagnetic spectrum in the range 1-50 cps, as was done, for example, by Stefant (10). The modern integrating fluxmeter, for example, makes the value of B directly available, instead of dB/dt, as shown by Grivet et al. (11). In space probes, the natural spin of the space ship may be put to good use to rotate the sensing coil and to obtain high sensitivities, as shown in the equipment of Pioneer I:

Sensor, a coil with 30,000 turns of No. 40 wire on a ferronickel core.
Angular velocity in space, 2 cps.
Dynamic range, 6 μG to 12 mG.
Bandwidth of the useful spectrum, 1.25-2.5 cps.
Threshold of sensitivity, 6 μG.
Source power, 25 mw.
(For other data see 12-15.)

2. Hall Effect. Hall plates associated with a ferrite static field concentrator reach a threshold of 1 μG in sensitivity, for a signal-to-noise ratio of 10 db. Further details on this device can be found in the review by Grivet (16) and in Epstein et al. (17).

3. Fluxgate. This method is also known as that using the "second harmonic." This form of signal is induced in a coil wound around a magnetic yoke, by nonlinear addition. The combined effects of the unknown field B and of an auxiliary sinusoidal field of frequency $\omega$ produce a signal, proportional to B, of frequency $2\omega$. This technique has been fully developed since World War II (18) and its modern applications were described in Vol. 9 of this series by Melton (19). An elaborate example is offered today by the type of apparatus which complemented an optical magnetometer on the Interplanetary Monitoring Platform, IMP-I. This satellite, also known as Explorer 18, was launched in November 1963. The monoaxial fluxgate had the following characteristics (20): dynamic range, ±400 μG; sensitivity, ±2.5 μG. (For a description of other recent devices of this type see 21-23.) It is worthwhile to remark here that the fluxgate principle was transposed to the resonance domain in one recent proposal for a vector magnetometer (24).
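The second-harmonic mechanism of the fluxgate can be illustrated numerically. The sketch below is a toy model, not a description of any actual instrument: the core is given a hypothetical tanh saturation law, the drive amplitude is arbitrary, and all quantities are dimensionless. It computes the amplitude of the $2\omega$ Fourier component of the core response when a small steady field b is added to the sinusoidal drive.

```python
import math


def second_harmonic(b, drive=2.0, n=4096):
    """Amplitude of the 2*omega component of tanh(b + drive*sin(omega*t)).

    The tanh law is an assumed stand-in for a saturating magnetic core;
    b is the steady field to be measured, in the same arbitrary units
    as the drive amplitude.
    """
    c = 0.0
    s = 0.0
    for k in range(n):
        t = 2.0 * math.pi * k / n          # phase omega*t over one period
        y = math.tanh(b + drive * math.sin(t))
        c += y * math.cos(2.0 * t)
        s += y * math.sin(2.0 * t)
    return 2.0 * math.hypot(c, s) / n


for b in (0.0, 0.01, 0.02):
    print(f"b = {b}: second-harmonic amplitude = {second_harmonic(b):.3e}")
```

In this model the second harmonic vanishes for b = 0, because the response then contains odd harmonics only, and it grows in proportion to b for small b, which is the property exploited by the method.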
C. Scalar and Vector Measurements

The magnetic field is an axial vector, B, and in order to determine this vector completely, measurements of the intensity of three noncoplanar components must be made. This is necessary when one is using resonance magnetometers, which give "scalar" measurements; indeed, formula (1) links the frequency to the intensity of the field vector, or, in other words, the measured frequency is completely independent of the orientation of the magnetometer with respect to the field. The angular positioning of the magnetometer affects the amplitude of the signal only. One must then resort to the classical methods using compensating coils in order to obtain the three components. Such methods are discussed in the literature for fixed stations, for example, Minitrack stations (25), and the recent proposal (26) embodies a high degree of automation. The same scheme is actually used on mobile observatories and on satellites (27).

D. Units

A traditional unit in geomagnetism is the "gamma" ($\gamma$), which in the recent past represented roughly the smallest measurable variation of the field B. In terms of the B vector, we have

$$1\ \text{gamma} = 10\ \text{microgauss (UEM cgs)} = 10^{-5}\ \text{gauss} = 10^{-9}\ \text{tesla (mks)} = 1\ \text{nanotesla}$$

the tesla being the internationally adopted name for the weber per square meter. In terms of the H vector one has

$$1\ \text{gamma} = 10\ \text{microoersteds} = 0.79\ \text{milliampere-turn/meter (mks)}$$

The use of the gamma is inconvenient here, because we shall also have to deal often with $\gamma$, the gyromagnetic ratio. We shall therefore adopt the microgauss (μG) as the usual unit, which fits the accuracy of present measurements well.
II. ORDER OF MAGNITUDE AND MAIN CHARACTERISTICS OF NATURAL FIELDS

A. Geomagnetic Field
1. Normal Conditions (Quiet Days). a. Spatial variations. We describe here the average characteristics of the earth field; they have been easily observable, since the pioneer work of Gauss, Weber, and Schuster (see Runcorn 28, for a short history), during the many days when variations with time are largely absent and which for this reason are specified as "quiet days."

Intensity, direction; dipolar approximation (Gauss): When geographical variations are averaged out, the field is nearly the same as would be produced by a suitable dipole[3a] located at the center of the earth; its axis is contained in the meridian plane defined by longitude 69°W, and is slightly inclined to the geographical south-north axis, by 11.5°; in other words the magnetic north pole is located in northern Canada, at a point defined by latitude 78.5°N, longitude 69°W (Greenwich). It should be remarked that the south pole of the equivalent dipole points to the north direction (Fig. 2).

FIG. 2. The earth as a magnetic dipole.

[3a] If M is the magnitude of the magnetic moment and a the earth's radius, $Ma^{-3}$ is a magnetic field equal to the intensity at the magnetic equator, or one half of that at the magnetic north pole. Between 1962 and 1965, one had $Ma^{-3} = 311{,}110$ microoersteds and the secular variation of this quantity was of the order of 160 microoersteds per year (28a).

Uniformity: The field is spatially highly uniform, even very near the surface of the earth; heterogeneity of the magnetic properties of the soil, as well as telluric currents, may spoil this homogeneity locally; raising the magnetometer probe some 5 meters above the ground does away with a good part of the perturbation in magnetically smooth regions. On the other hand, anomalies may be very strong. They can be divided into two categories: interesting geological anomalies, pointing, e.g., to the existence of interesting ores; or, on the contrary, artifacts due, for example, to the residual magnetization by currents once produced by ancient lightning striking the soil. Near latitude 45°N (F = 0.45 gauss), the average degree of uniformity of the field may be characterized by the following values of the gradient: 200 μG per kilometer in altitude; 50 μG per kilometer in latitude.

Nomenclature: In order to make reference to the introductory books on the earth's field (29-31) and to the literature easier, we collect the standard notation in Table II.

TABLE II

Quantity                                        Standard letter   Average value near
                                                symbol            latitude 45°N
Total field                                     F                 0.45 gauss
Horizontal component                            H                 0.15-0.25 gauss
Vertical component                              Z                 0.4 gauss
Declination (angle between H and the            D                 Depends on location
  geographical north direction)
Inclination or dip (angle between F and         I                 60-70° (F pointing down)
  the horizontal plane)
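The dipolar approximation described above lends itself to a quick numerical check. The sketch below uses the standard formula for the intensity of a centered dipole, $F = (M/r^3)\sqrt{1 + 3\sin^2\lambda}$, where $\lambda$ is the magnetic latitude; this formula is not written out in the text and is supplied here as an assumption, as is the rounded equatorial value of 0.311 gauss taken from the order of magnitude of $Ma^{-3}$ quoted in the footnote on the dipole.

```python
import math

# M / a**3 in gauss: rounded from the value of about 311,000 microoersteds
# quoted for the equivalent dipole (assumed equal to the equatorial field).
EQUATORIAL_FIELD_GAUSS = 0.311


def dipole_intensity(mag_latitude_deg, r_over_a=1.0,
                     b_equator=EQUATORIAL_FIELD_GAUSS):
    """Total intensity of a centered dipole field, in gauss.

    mag_latitude_deg is the magnetic (not geographic) latitude and
    r_over_a the distance to the earth's center in earth radii.
    """
    lam = math.radians(mag_latitude_deg)
    return b_equator * r_over_a ** -3 * math.sqrt(1.0 + 3.0 * math.sin(lam) ** 2)


print(f"equator, surface : {dipole_intensity(0.0):.3f} gauss")
print(f"45 deg, surface  : {dipole_intensity(45.0):.3f} gauss")
print(f"pole, surface    : {dipole_intensity(90.0):.3f} gauss")
print(f"equator, 30 radii: {dipole_intensity(0.0, 30.0) * 1e6:.1f} microgauss")
```

The model reproduces the factor of two between pole and equator, gives a value of the same order as the 0.45 gauss of Table II at middle latitudes, and falls to the level of 10 to 25 μG at 30 earth radii.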
b. Variations in time. Nowadays the average value of the earth's field is thought to stem mainly from currents in the earth's interior; indeed, the core is highly conducting (as much as the sea at a depth of 700 meters) owing to the high temperature (1500°C at 700 meters) (68). On the contrary, fluctuations in time of the field vector are 75% due to external currents in the ionosphere; only 25% is linked directly to internal currents. A coupling exists between both kinds of currents; but the qualitative duality of sources[4] makes it easy to understand that the fluctuation in time may be either slow, and very slow, or fast. In fact, one observes:

(a) Secular variations at a rate of 300 μG per year.
(b) Diurnal variations; the intensity F decreases rapidly during the early morning, soon after sunrise, reaches a minimum at noon, and increases during the afternoon and night. The amplitude of this daily oscillation is at its highest in June and reaches ±250 μG; its minimum value is in January, when it amounts to ±25 μG. The direction of the vector fluctuates also, as indicated by the correlated variation of H and Z which was recorded, for example, in the U.S. Coast and Geodetic Charts 3077Z and 3077H. We need not go deeper into this subject here since resonance methods are only sensitive to the variation of H.
(c) Fast variations; actually, there are a few types of well-characterized quick variations. Long known are the Eschenhagen oscillations: amplitude, 5 to 50 μG; period, 25 sec; total duration, a few minutes. All the newly discovered ones (4, 10, 32, 33, 33a) seem to be linked with ionospheric activity and are then the indirect result of solar activity. Their common characteristics are a very small amplitude and a relatively high frequency; for example, see the accompanying tabulation.

                                                          Frequency (cps)   Amplitude (μG)
Hydromagnetic "pearls"                                    0.3-3             0.2
Oscillation of the cavity between earth and ionosphere    5-40              0.02

They are most often observed by induction (10) and some of them lie at the limit of sensitivity of resonance apparatus. They offer a wide choice of natural signals useful to test the resonance magnetometer, as shown, for example, by Arnold et al. (34).

[4] The duality of sources may be proved by measurements on the earth's surface only, as shown theoretically by Gauss and first measured by Schuster. On this point, see Chapman (29, p. 54) or Massey and Boyd (30, p. 186).

2. Exceptional Variations; Magnetic Storms. Magnetic storms happen several times in a year: their occurrence is linked with solar activity and they occur after strong eruptions at the surface of the sun in the form of sunspots (35). The associated magnetic variations on earth are of an order of magnitude of 5 to 15 mG, i.e., 100 or 1000 times larger than the interesting details in a magnetic exploration. Thus the occurrence of magnetic storms precludes any other magnetic study than that of the storm itself. Magnetic storms have been studied since the beginning of the 19th century and are of widely varying intensities. The one that occurred in 1859 was so strong as to be visible by looking at a magnetic needle; it involved variation of F of the order of 10%. This by far exceeds the dynamic range of standard magnetometers and the study of storms requires special instruments. Optical (pumping) magnetometers are an exception to this rule; when equipped with digital registration they show a dynamic range of 0.5 gauss.
3. Magnetic Exploration with Mobile Magnetometers. Magnetic exploration is done in a variety of ways, with magnetometers carried by automobiles, ships, or planes. The movement of the carrier transforms the spatial variations into time variations, so that the useful variation of the signal is mixed with parasitic variations: some of them arise from the regular geographical variation of the averaged field; others stem from the irregular time variations. Corrections are necessary, and in the second case they can only be made by comparison with a fixed station operating at the same time.
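The correction by comparison with a fixed station can be sketched very simply: the time variation recorded at the fixed station, taken as its deviation from its own mean, is subtracted sample by sample from the mobile record. This is a minimal sketch that assumes both records are sampled at the same instants and that the time variation is the same at both places; the function name and the numbers in the example are invented for illustration only.

```python
def correct_survey(mobile_readings, base_readings):
    """Remove the common time variation from a mobile magnetometer record.

    Both sequences must be sampled at the same instants. The base-station
    deviation from its mean is taken as the time variation to subtract.
    """
    if len(mobile_readings) != len(base_readings):
        raise ValueError("records must have the same number of samples")
    base_mean = sum(base_readings) / len(base_readings)
    return [m - (b - base_mean)
            for m, b in zip(mobile_readings, base_readings)]


# Invented readings, in microgauss relative to an arbitrary reference.
mobile = [10.0, 12.0, 15.0]
base = [0.0, 2.0, 1.0]
print(correct_survey(mobile, base))
```

In this made-up example the apparent rise between the first two mobile samples is entirely attributed to the time variation seen at the fixed station, and only the last sample retains a spatial anomaly.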
The procedures, as well as the categories of signal strength which may occur, are numerous. Two very different examples may illustrate the variety of cases. One is a deep sea signal recorded in the course of the magnetic search for the hull of the submarine Thresher with a proton precession magnetometer (36) (Fig. 3). The other is given by an airborne magnetometer during the broad-scale magnetic exploration of the Arctic, which resulted in the characterization of two different halves of the submerged continental shelf (Fig. 4). Further illustrations may be found in Raff (37) and King et al. (38).

FIG. 3. The deep sea signal given by a proton precession magnetometer at the site of the Thresher accident. (Horizontal axis: time, 0430 to 0500.)
4. Geophysical Observatory. Since 1830, when Gauss initiated high-accuracy measurement of the geomagnetic field, this aspect of the science of magnetism has developed steadily. Now, in many countries, there are several geomagnetic observatories, as free as possible from any man-made magnetic disturbance. This implies mainly the following conditions:

(1) A wooden building, where all metallic parts, including the small ones (nails, wires, etc.), are copper or aluminum, and verified to be nonmagnetic material (for example, brass, which should be nonmagnetic, nevertheless often is magnetic, owing to the practice of recasting old scrap, possibly nickel plated).
(2) A safe distance (40-60 km) from the nearest high-voltage line, usually an electric railway line.
(3) A suitably screened bifilar or coaxial feed system for the power line.

Apart from large and well-known national laboratories such as that near Fredericksburg, Virginia (see also 38a), universities may build smaller but efficient laboratories at relatively low cost, such as the Jussy laboratory, which has been described by the Geneva physicists (39).
FIG. 4. Aerial magnetic exploration of the Arctic shelf.
In such a location, the gradient of F is as low as 1 μG per meter; an experienced geophysicist may claim to measure F with an accuracy of 0.1 μG. He should work during a quiet day and have at his disposal an auxiliary continuously operating survey apparatus: he may then be able to choose the most favorable period (an hour perhaps) for the measurement. More often, he may be happy to reach the microgauss limit only, and this is now feasible with any commercial resonance magnetometer.

5. Ordinary Laboratories. Compensating Natural and Man-Made Fluctuations. If one turns to ordinary physical laboratories, the conditions are far worse. An unforeseeable value of the spatial gradient is produced by the ferromagnetic girders embedded in the concrete, or by open beams (elevators): values ranging from 1000 to 5000 μG per meter are not exceptional. On the other hand, sinusoidal or transient fields at the mains frequency and its harmonics also reach a level of some $10^3$ μG and are very disturbing. It was long thought that such an unfortunate location simply makes any experimentation on high-accuracy magnetometers impossible. But recently successful attempts have been made to quiet the field fluctuations with time, so it may become profitable to further improve the conditions by compensating the static gradient to first order. The proposals are all based on a high-sensitivity pick-up system, which measures the fluctuations: this signal is linearly amplified, and then fed back with the proper phase to a pair or quadrupole of compensating coils of large dimensions. Three types of pick-ups were successfully used: induction coils (40), fluxgate (41), and a helium resonance magnetometer (42). The induction system has a large bandwidth (10-10,000 cps) and reduces ac hum by a factor which may be better than 120. But the lower cut-off frequency is between 10 and 5 cps, and the device is rather inefficient against low-frequency fluctuations. These may be reduced by careful choice of the room in the building. Fluxgate and resonance magnetometer show complementary qualities; they are efficient from 10 cps down to zero frequency. For resonance studies, it is sufficient to stabilize F in magnitude: in other words, fluctuations in the direction of F are of secondary importance. Figure 5 shows the efficiency of such a device described by Schearer (42). Wolff (41) described a more complete system regularizing the three components of the vector, bracketed within ±100 μG.

FIG. 5. Automatic balancing of the fluctuations of the apparent earth field with a helium magnetometer (42).

More recently, a simpler technique was used with success: a large multiple magnetic screen surrounds the system to be studied (43-45a). One meets the following difficulties:
(1) Economic considerations limit the volume to roughly 1 m³ or less.
(2) One has to demagnetize the screen carefully, and this operation is necessary after each important change in the external magnetic environment (moving of iron masses); permanent demagnetizing coils are a necessary part of the screen.
(3) Inside the screen, the field is limited in intensity to some 10 gauss for Helmholtz coils (44) and to 100 gauss for solenoids (46, 46b).
This solution looks very promising and will take a still more convenient form with forthcoming improvements in the art of magnetic joints; Fig. 6 shows a faint nuclear line, which could only be observed with such a protection.

FIG. 6. A faint component of the nuclear resonance spectrum of POBH2, which would remain hidden in external noise in the absence of a magnetic shield around the low-field spectrograph.
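The feedback compensation of field fluctuations described in this subsection (a pick-up whose amplified signal drives compensating coils) can be illustrated by a toy first-order loop. The sketch below is only a schematic model under stated assumptions: the loop is represented by a single time constant, the loop gain of 119 is chosen arbitrarily so that the reduction factor equals the figure of 120 quoted for the induction system, and the function name and parameters are hypothetical.

```python
def residual_fluctuation(disturbance, loop_gain, tau=1.0, dt=1e-3, steps=5000):
    """Residual field seen by the pick-up once the loop has settled.

    Model: the coil field c obeys tau * dc/dt = -loop_gain * (d + c) - c,
    where d is a steady external disturbance. The steady state gives a
    residual d + c = d / (1 + loop_gain). Euler integration is stable
    here only if dt * (1 + loop_gain) / tau < 2.
    """
    coil = 0.0
    for _ in range(steps):
        residual = disturbance + coil
        coil += dt * (-loop_gain * residual - coil) / tau
    return disturbance + coil


# A 1000-microgauss disturbance and a loop gain of 119 (assumed values).
print(f"{residual_fluctuation(1000.0, 119.0):.2f} microgauss")
```

With these assumed values the residual is 1000/120 μG, i.e., the disturbance is reduced by the factor 1 + G, which is the usual result for proportional negative feedback.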
B. Interplanetary Field (46a)

1. Order of Magnitude. Three main types of measurements have been actively made for the past few years:

(1) Survey of the "classical" geomagnetic field in space, by means of low-altitude satellites. One hopes for a more thorough, fast, and accurate survey; the magnetic conditions remain similar to those at the surface of the earth. The main interest lies in the small Gaussian departures from the dipolar field. The altitude is between 300 and 800 km, where the magnetic conditions are better than at sea level because of the smoothing by distance of a good part of the influence of geographical heterogeneity. On the other hand, one has to face the general difficulties which occur in spatial experimentation and which are described later. The recent program is ably described by Heppner et al. (46-49).
(2) Near zone exploration, which extends from 1000 km to 15 earth radii, and includes the region of the Van Allen belts. There, the field is mainly the dipolar field and its intensity decreases, following a $(r/R)^{-3}$ law, reaching some 3 mG at 5R and 100 μG at 20R. In this range one is interested in the departures due to the intense ionic currents (Fig. 7) produced by the ionic belts, such as the one discovered at 8R by Pioneer V. It produced a "bump" of some 200 μG on the dipolar field curve. Reviews of the magnetic data for this zone are given by Harrison (50) and Cahill (51).
(3) Farther than 20R, the conditions are less accurately known and Pioneer V data (1961) are now considered as inaccurate. The most recent experiment [IMP-I (20)] provides more reliable data and indicates a field of 50 μG located in the ecliptic plane. In this region many spatial and time variations of an intensity of some 10 μG are related to interesting aspects of the solar ionic wind; a discussion of the difficulties encountered in the exploration of this far region is to be found also in Ness et al. (20).

FIG. 7. The earth field in the "near zone" as measured by Pioneer V. (Horizontal axis: distance in earth radii, 4 to 12.)
2. Spatial Requirements. This brief summary of the results shows that magnetic measurements in space are of a very delicate nature; moreover, the space ship is moving at high speed (some 2 km per second) and genuine time variations are mixed with artificial ones, produced by the movement through heterogeneous ion clouds and currents. We refer the reader to the last-mentioned references and to Scull and Ludwig (52) for a discussion of these interesting problems. We shall only summarize here the final requirements which actually shape the construction of a spatial resonance magnetometer:

(a) Sensitivity, 0.1 µG.
(b) Frequency band of the order of 10 cps.
(c) Weight, 2.5 pounds (sensor); 10 pounds total with converters and container.
(d) Power, 6 watts + 2 watts (thermal control).
(e) Insensitive to the spinning of the satellite.
(f) Insensitive to the “attitude” of the satellite, i.e., to the orientation of an internal reference direction with respect to the astronomical reference frame.
The last two requirements need some explanation.

(1) Modern space probes are stabilized in direction; for magnetic measurements, the preferred solution to this problem is to have the craft spin at moderate speeds (22.3 rpm for IMP.1). One ensures by this means that the axis of rotation keeps a constant orientation with respect to the astronomical reference frame. The consequence for the magnetic precession magnetometer is important. Indeed, it may be proved generally (a detailed analysis is given by Heims and Jaynes (53)) that a rotation of the laboratory defined by the rotation vector $\Omega$ has the same action on the magnetometer as an additional fictitious field

$$ b = \Omega/\gamma \tag{2} $$

would have on the same apparatus supposed at rest: if $\gamma$ is small, as in pure nuclear resonance, the error field $b$ is high.⁵ For example, for proton resonance, an angular speed of 1 turn per second, when $\Omega$ and $B$ are parallel, means an error of 1 part in 2000 on the earth's field at sea level and middle latitudes, $|b|$ reaching some 235 µG. Such a systematic error is unacceptable for the low space fields, and pure nuclear resonance is ruled out for this reason. Optical resonances correspond to $\gamma$'s some $10^2$ to $10^3$ times higher and are perfectly suitable.

⁵ Only in the exceptional case $\Omega \perp B$ would the error be negligible.
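Equation (2) lends itself to a quick numerical check. The sketch below evaluates $|b| = \Omega/\gamma$ for a spin axis parallel to the field; the gyromagnetic ratios are standard rounded values supplied here as assumptions, and the free-electron value is used only as a stand-in for an optical (electronic) resonance.

```python
import math

# Assumed rounded gyromagnetic ratios in rad s^-1 gauss^-1 (not from the text).
GAMMA_PROTON = 2.6752e4
GAMMA_ELECTRON = 1.7609e7  # magnitude, free electron


def rotation_error_gauss(turns_per_second, gamma):
    """Fictitious field |b| = Omega / gamma for rotation parallel to B."""
    omega = 2.0 * math.pi * turns_per_second  # rad/s
    return omega / gamma


if __name__ == "__main__":
    proton = rotation_error_gauss(1.0, GAMMA_PROTON)
    electron = rotation_error_gauss(22.3 / 60.0, GAMMA_ELECTRON)
    print(f"proton, 1 turn/s:    {proton * 1e6:.1f} microgauss")
    print(f"electron, 22.3 rpm:  {electron * 1e6:.3f} microgauss")
```

For the proton at 1 turn per second the error is a few hundred microgauss, far above the 0.1-µG sensitivity required, whereas for an electronic resonance at the IMP.1 spin rate it falls to the order of the required sensitivity.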
(2) In space, the most interesting contribution to the field, due to ionic currents, shows no preferred direction. On the other hand, the intensity of the signal due to magnetic precession depends on the direction of the measured field with respect to the magnetometer. The dependence law, as will be explained later, leaves one with a dead angle of some ±15° with respect to B. The only present solution to this difficulty is to use “double” optical magnetometers and to add complementary magnetometers: in IMP.1, a monoaxial Rb resonance magnetometer is coupled
FIG.8. A nondirectional, triple-head, helium resonance magnetometer for satellite exploration (courtesy of Texas Instr. Co.).
with two fluxgates (20), and Fig. 8 shows a triple system of helium magnetometers for spatial use.

III. NUCLEAR RESONANCE
A. Prepolarization and Free Nuclear Precession

1. Method. In 1953, Packard and Varian (64) disclosed a free-precession experiment performed in the earth's field and started the development of modern resonance magnetometers. Moreover, the first commercial model proved to be so efficient in the field that it still remains of great practical value in nearly its original form. In order to reach the low-field domain, the inventors separated the free-precession experiment already known for high fields (55) into two successive processes (Fig. 9). For this purpose, they took advantage of the long nuclear relaxation time of such ordinary liquids as water ($T_1 \approx 3$ sec) or benzene ($T_1 \approx 10$ sec) when purified from the dissolved oxygen.
In a first step, the sample is strongly polarized in an intense field $B_p$ of the order of 100 to 800 gauss; the 250-ml water sample then acquires a sizable bulk magnetic moment M which will only slowly decrease in magnitude during the 10 sec necessary for an experiment. During this lapse of
FIG. 9. Timing of the main steps in the Packard-Varian proton precession magnetometer (not to scale).
time the moment decreases following an exponential law $e^{-t/T_1}$. The establishment of polarization itself requires a time of a few times $T_1$, i.e., a few seconds, to be fully accomplished.

The second step begins with the abrupt breaking of the current producing the polarizing field $B_p$; the direction of $B_p$ is chosen approximately at right angles to the unknown $B$. Then, at the instant that $B_p$ disappears, the magnetization $M = \chi B_p$ begins to precess around $B$ (Fig. 10), inducing a signal in the pick-up coils at the frequency $\omega = \gamma B$. In fact the switching out of the polarizing current and the breakdown of the field $B_p$ proceed continuously during a certain time, say 1 µsec, during which the total field vector quickly, but continuously, evolves from the state $B_p + B \approx B_p$ to $B$. If this change is not made swiftly enough the magnetization $M$ will follow smoothly (“adiabatically”), the field remaining at every time nearly aligned with it; and no precession will be observed.

FIG. 10. Free precession of the magnetic moment M during the measuring period.

In order that $M$ will surely be left behind, the change in the direction of $B$ must occur in a time short compared to the time constant of the movement of $M$. As is known from Bloch's equations, and as has long been known from the experiments of Rabi's school, this is adequately measured by the period of precession of $M$ in the actual field at the time $t$. A safe limit is the period in the polarizing field $B_p$; for $B_p = 100$ gauss the period is near 2 µsec and the cut-off must occur in 1 or 2 µsec; these were the approximate conditions in the first experiment. As the magnetizing current reaches the 5- to 15-amp range (feeding 200-500 turns of 135 w.g. wire wound around a 500-ml bottle), this is an exacting condition on the current breaker. Today (66,
67) one softens this condition by first decreasing the current slowly (in a few tenths of a second) to an intermediate value, 0.5 amp for example; at this stage, $B_p' = 6$ gauss and is still much larger than the earth's field; $M$ remains aligned with $B_p'$. At this instant one only has to cut the smaller current of 0.5 amp rapidly; moreover one now has more time at one's disposal, the new value of the limit being 10 times larger; indeed it is the precession period in the $B_p'$ field, i.e., 50 µsec.

2. Frequency Measurement. The gain in intensity produced by the prepolarization process is $B_p/B$, ranging from 200 to 400 in the earth's field ($B \approx 0.5$ gauss) for $100 < B_p < 200$ gauss; this is large enough to get a good signal-to-noise ratio during, maybe, 10 sec. Indeed, the signal decreases following an exponential law $\exp(-t/T_2^*)$; $T_2^*$ is of the same order of magnitude as $T_1$ if the field is highly homogeneous spatially, but frequently $T_2^* < T_1$. This inequality expresses the lack of uniformity of the field; in fact,

$$ \frac{1}{\gamma T_2^*} \simeq \langle \delta H^2 \rangle^{1/2} \tag{3} $$

where $\langle \delta H^2 \rangle^{1/2}$ is the rms value of the spatial variation of the field in the volume of the sample; in an ordinary laboratory, $T_2^*$ may range from 0.1 to 1 sec for a 500-cm³ sample. A simple calculation (cf. 13) shows that the optimum observation time for the minimum error in frequency, taking account of the decrease of the signal-to-noise ratio at the end of the process, is approximately given by $T_2^*$. Measuring the average period over 1 sec with an electronic counter permits one to reach relatively good accuracy with ease; the sample is water with a practical limit for $T_2^*$ given by the natural $T_2$, i.e., 2.1 sec for distilled and 3.1 sec for deoxygenated water; for benzene, $T_2$ is 14 sec. In the latter case the accuracy is increased by nearly an order of magnitude if the field remains steady for 10 sec. One can easily show, as is done, for example, by Blaquiere et al. (58), that the error in $B$, $\pm\delta B$, is given by

$$ \delta B = \frac{1}{\rho\,\gamma\,\Theta} \tag{4} $$
where $\Theta$ is the measuring time, and $\rho$ the signal-to-noise ratio. Counting the number of periods of the signal for a few seconds is considered nowadays as a rather crude manner of using the precious information contained in this continuous signal. Italian scientists (59) have tried to remedy this situation; they were looking for feeble, but predictable, variations of $F$, such as those produced by the perturbation of the ionosphere that occurs during an eclipse or due to an atomic explosion. They sought to increase substantially the accuracy of their precession magnetometer by processing the signal more carefully. The block diagram of their device is shown in Fig. 11. The method makes
use of synchronous detection. This mode of detection is in current use in radiospectroscopy, for example, where one knows in advance both the frequency and phase of the signal. Here, however, the frequency is only approximately known; but this is not an insuperable difficulty if one has to deal with slow and slight deviations of the frequency. On the other hand, the knowledge of the initial phase of the signal is assured if one prepares the initial state not by fast removal of the polarizing field only,

FIG. 11. Block diagram of the method of Faini and Svelto for optimal processing of the information of the free precession signal (blocks: SSB modulator, SSB filter, detector-integrator, error signal).

FIG. 12. A scheme of the accurate timing necessary in the optimal processing.
but by a $\pi/2$ “Hahn pulse” (60, 61), following a rather slow break of current (Fig. 12). It appears possible, then, to heterodyne the signal at frequency $\omega$ by mixing it with an auxiliary oscillation at frequency $\omega_0$. The chief experimental difficulty is to generate this heterodyne oscillation with an adjustable frequency and still keep the same high frequency stability as in a quartz clock. This accounts for most of the rather great complication of the system. In order to measure $\omega$ one has to adjust $\omega_0$ so that $\Omega = \omega - \omega_0$ will be as near zero as possible; this occurs when the output of the integrator is zero; Fig. 13 shows the error signal. The time of integration should be chosen between 1 and 10 sec, so as to be
much longer than the coherence time of the incoming signal, which is equal to $1/\Delta\omega$, $\Delta\omega$ being the bandwidth of the sensor's circuit at the input. Under these conditions the theory of the “synchronous detector” still applies. (See, for example, 66, pp. 96-100, and 63, pp. 389-393.) The detector-integrator combination cuts the rms value of the noise by a factor of the order of $[1/(\Theta\,\Delta\omega)]^{1/2}$. An exact theory, which may be found in Blaquiere et al. (58) and Faini et al. (59), gives the improvement factor
FIG.13. The error signal for a good adjustment in the Faini-Svelto method.
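$\delta B'/\delta B$, $\delta B'$ being the error in $B$ in the new method, $\delta B$ that in the old one; one has

$$ \frac{\delta B'}{\delta B} \simeq \left(\frac{1}{\Theta\,\Delta\omega}\right)^{1/2} \tag{5} $$

One gains practically an order of magnitude in accuracy for $\Theta = 10$ sec, $\Delta\omega = 2\pi \times 20$, reaching the 0.1-µG range. In fact, the indirect effect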
of the eclipse on 15 February 1961 could not be detected with this equipment because the residual magnetic activity was not “quiet” enough (64). On the other hand, H-bomb explosions in the atmosphere are now known to be much easier to observe; one does not require such elaborate means, and the diagram in Fig. 14 was registered on 9 July 1962 at Chambon-la-Forêt with a simple but sensitive induction head (65). The chief interest of this analysis is to show that notable improvements in accuracy are possible but at the expense of a considerable sophistication
of the instrumentation and of a slowing down of the measuring process (one measurement every 15 sec).

3. Modern Realizations. The gist of the Varian-Packard method rests in an original mixture of simplicity and accuracy, so that most modern versions keep to the original scheme and try to adapt it to various
FIG. 15. A double-head free precession “gradiometer” (68) (blocks: detector, loudspeaker).
specialized functions. We have already mentioned in Section I,C the methods used to apply it to make a vector magnetometer; we may indicate a few additional uses:

(1) Accurate data on the standard realization may be found in Waters and Francis (66), Klose (57), Faini and Svelto (66), and in the numerous references cited by Blaquiere et al. (58).

(2) Gradiometer: feeding the same amplifier with two heads in series connection (Fig. 15) and mixing the two signals gives a beat frequency
proportional to the scalar product of the gradient of the field and the vector distance between the two heads; the sign of the gradient is determined in an auxiliary experiment by adding a known gradient produced by a coil; details are to be found in Rikitake and Tanoka (67) and Aitken and Tite (68).

(3) Check of the accuracy by comparison with other magnetometers (69, 70).

(4) Rocket and satellite magnetometers. The system has been successfully used on rockets and satellites, for example, in Vanguard III. Mansir (71) describes the device, which shows the following performance:

(a) $B_p = 600$ gauss.
(b) Dynamic range on Vanguard III: 0.07-0.375 gauss, corresponding to altitudes of 510 to 3750 km.
(c) Sensitivity: 30 µG.
(d) Energy: 200 joules per measuring cycle.
(e) Spin of the satellite: less than 0.09 cps.
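The figures quoted in items (2) and (4) can be related to signal frequencies with the relation $\omega = \gamma B$. The sketch below is only illustrative: it assumes the rounded proton value $\gamma/2\pi \approx 4257.6$ cps per gauss, and the function names are hypothetical.

```python
# Assumed rounded proton gyromagnetic ratio over 2*pi, in cps per gauss.
GAMMA_P_OVER_2PI = 4257.6


def precession_frequency_cps(b_gauss):
    """Free precession frequency of protons in a field of b_gauss."""
    return GAMMA_P_OVER_2PI * b_gauss


def gradiometer_beat_cps(gradient_gauss_per_m, separation_m):
    """Beat frequency between two heads: gamma/2pi times |grad B . d|.

    Both arguments are 3-component sequences (gauss per metre, metres).
    """
    delta_b = sum(g * d for g, d in zip(gradient_gauss_per_m, separation_m))
    return abs(GAMMA_P_OVER_2PI * delta_b)


if __name__ == "__main__":
    for b in (0.07, 0.375):
        print(f"B = {b} gauss -> {precession_frequency_cps(b):.1f} cps")
    beat = gradiometer_beat_cps((1e-5, 0.0, 0.0), (2.0, 0.0, 0.0))
    print(f"beat for 10 microgauss/m over 2 m: {beat:.4f} cps")
```

Under these assumptions the Vanguard III dynamic range corresponds to audio frequencies from a few hundred to about 1600 cps, while a gradient of 10 µG per metre over a 2-m base gives a beat of less than 0.1 cps, which indicates why the beat must be followed over many seconds.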
4. Use of Coupled Spins in Very Low Fields. The Varian-Packard prepolarization scheme provides a law linear in $B$ for the amplitude $S$ of the signal. The decrease of $S$ at very low field is due to Faraday's law of induction: a constant magnetic moment precessing with an angular speed $\omega$ induces an emf proportional to $\omega$, and nature shows no exception to this law. But one may add to the very feeble unknown field $B$ a constant and accurately known auxiliary field $B_a$. The resultant $B + B_a$ is now intense enough to be measured accurately, and one gets $B$ back by difference, measuring $B_a$ in an auxiliary experiment. The process would be a safe one if $B_a$ were very steady. No man-made field easily meets this last requirement, but atomic fields would. Along this line of thought, Thompson and Brown (72) proposed, and realized in 1961, a free precession experiment using two⁶ coupled spins P and H, in HPO(OH)₂, instead of one. In the absence of any external field, one may rather loosely think of each of the two nuclei P and H as precessing in a field indirectly produced by the other through “scalar coupling.” When a low external field $B$ is introduced, it splits the resonance and changes the frequency. Quantum-mechanically, the system offers the simplest example of coupled spins and in the general magnetic case, i.e., in the presence of an external magnetic field, its wave function and energy levels are easily calculated, as shown, for example, by Pople et al. (73, pp. 119-121). The diagram of the levels is shown in Fig. 16, where the zero-field transition is indicated by an arrow. It is of the same family as the $\Delta F = 1$, $\Delta m = 0$ transition used in atomic clocks,⁷ occurring here between one of the triplet states ($m_F = 0$) and the singlet state ($m_F = 0$); its frequency is $\nu_3 = J$ if the coupling energy between spins is written as

$$ \mathcal{H} = J\,\mathbf{I}_1 \cdot \mathbf{I}_2 \tag{6} $$

FIG. 16. The Breit-Rabi level diagram for the two coupled spins H and P in HPO(OH)₂ (ordinate: E/J; J = 695 cps).

⁶ Owing to the rapid exchange between the H of the (OH)₂ radical and the solvent water, these two protons play no role; their field averages out.
It is a parallel transition, and in a weak external field B it couples best
to a coil whose axis is parallel to $B$. When the receiving coil axis is at a right angle to $B$, it is sensitive to the ordinary transitions, which are two in number here, $\Delta m_F = \pm 1$ (broken arrows in Fig. 16); their frequencies are, to first order in $B$,

$$ \nu_{1,2} \simeq J \pm \tfrac{1}{2}(\nu_H + \nu_P) \tag{7} $$

and $\nu_1 - \nu_2 = \nu_H + \nu_P$, where we designate by $\nu_H$ and $\nu_P$ the ordinary resonant frequencies of isolated atoms H and P in a field $B$. When the coil axis is at an intermediate angle to $B$, all three signals are induced simultaneously; the low frequency $\nu_1 - \nu_2$ appears as a modulation on the slightly damped carrier at frequency $\nu_3$. A rigorous theory of this free precession experiment (74) explains it in detail. Thus the signal remains strong at very low fields because the induction law is used at the constant and convenient carrier frequency $\nu_3 = 695$ cps; the frequency measuring the field appears as an amplitude modulation. At present the limit of this technique lies at some 10 µG, where the period of the modulation becomes long compared with the damping time $T_2^*$ of the signal. Nowadays this method appears interesting for the solution of a current problem: to check how accurately one can realize a field-free space, inside a screen or by dynamic compensation.

⁷ Figure 16 shows the same general shape as that of H atoms, except for the crossing of levels $m_F = -1$ and $m_F = 0$, which does not occur for H. The reason is the big difference between the values of the $\gamma$'s for the P nucleus and for the lone electron in H.
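The level scheme of two coupled spins 1/2 is simple enough to evaluate in closed form. The sketch below assumes the Hamiltonian (in frequency units) $J\,\mathbf{I}_1\cdot\mathbf{I}_2 - \nu_H I_{1z} - \nu_P I_{2z}$ with $J = 695$ cps; the gyromagnetic constants are rounded standard values for the proton and for ³¹P, and all names are illustrative.

```python
import math

J_CPS = 695.0            # scalar coupling quoted in the text
GAMMA_H = 4257.6         # cps per gauss, assumed rounded proton value
GAMMA_P31 = 1723.5       # cps per gauss, assumed rounded 31P value


def transition_frequencies(b_gauss, j_cps=J_CPS):
    """Return (nu1, nu2, nu3) in cps, measured from the singlet-like level."""
    nu_h = GAMMA_H * b_gauss
    nu_p = GAMMA_P31 * b_gauss
    root = math.hypot(j_cps, nu_h - nu_p)
    e_singlet = -j_cps / 4.0 - root / 2.0      # lowest level
    e_triplet0 = -j_cps / 4.0 + root / 2.0     # m = 0 partner
    e_upper = j_cps / 4.0 + (nu_h + nu_p) / 2.0
    e_lower = j_cps / 4.0 - (nu_h + nu_p) / 2.0
    return e_upper - e_singlet, e_lower - e_singlet, e_triplet0 - e_singlet


def modulation_period_s(b_gauss):
    """Period of the nu1 - nu2 modulation on the carrier."""
    nu1, nu2, _ = transition_frequencies(b_gauss)
    return 1.0 / (nu1 - nu2)


if __name__ == "__main__":
    print(transition_frequencies(0.0))
    print(f"modulation period at 10 microgauss: {modulation_period_s(1e-5):.1f} s")
```

In zero field the three transitions coincide at $J$, and the difference $\nu_1 - \nu_2$ equals $\nu_H + \nu_P$ at any field in this model. At 10 µG the modulation period is of the order of ten seconds or more, which is the sense in which it becomes long compared with $T_2^*$.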
B. Overhauser Polarization and Spin Oscillator

1. Variant of the Overhauser Effect. Since 1955 (75) the Overhauser effect [see the introductory treatments (76, 77) and the reviews (78-80)] has provided a means to obtain continuously, in a liquid sample, a considerable enhancement of nuclear polarization. Practically, in the earth's field,
FIG. 17. The opposite directions of the vectors B and M characteristic of “negative polarization” and “negative susceptibility.”
the gain obtained in this way is of the same order of magnitude as that achieved by the discontinuous Varian technique. The chief advantage of the new process rests in the possibility of continuously monitoring the value of the field. Nevertheless it should be remarked that the nuclear resonance frequencies involved remain of the same low value for the geophysical fields: the accurate determination of the frequency still requires a time $\Theta$ ($\Theta$ of the order of 0.1 to 1 sec). Each point of a continuous plot of $B(t)$ in fact measures the average value of $B(t)$, $\langle B(t)\rangle_\Theta$, over $\Theta$; in other words, the bandwidth of the device is still limited to 10 cps at most. Moreover, in his 1955 analysis, Abragam showed how to obtain a negative polarization, i.e., a vector M pointing in the direction opposite to that of the measured field B (Fig. 17). This situation still provides, at these acoustical frequencies, the possibility of building a self-oscillator
of the maser type: its frequency is given with good accuracy by Eq. (1). Indeed, not only does the analysis of the process ordinarily made for centimeter waves remain valid [see Singer's review in this series (81) or Gordon et al. (82) and Shimoda et al. (83)], but also the threshold condition is quantitatively unchanged, as it does not include any factor that would be strongly frequency dependent: nuclear magnetic masers are good, sturdy oscillators. The simple layout of the experiment is shown in Fig. 18. The sample is a water solution and the solute, vehicle of the Overhauser effect, is a paramagnetic ion: the effect is obtained by exciting the paramagnetic
FIG. 18. The scheme of a “double-resonance” spectrograph for the study of the Overhauser effect in free radicals.
resonance of the ion so strongly as to saturate the transition. Practically, saturation is realized with a moderate power of some 5 to 10 watts, because one chooses an ion with a narrow resonance line, such as the semiquinone ion of tetrachlorohydroquinone. Such an ion shows a simple resonance spectrum, with a single line only, at a frequency $\omega_e/2\pi$ given by

$$ \omega_e = |\gamma_e|\,B, \qquad |\gamma_e| = (2\pi) \times 2.80453 \times 10^{6}\ \text{gauss}^{-1}\,\text{sec}^{-1} \tag{8} $$

the so-called “free electron” resonance frequency.⁸

⁸ $\gamma_e$ is negative, which means that vectorially $\omega = -|\gamma_e| B$, and $B$ and the angular speed vector are antiparallel.

By such a choice of a single-line ion, one obtains what may be called a “simple” Overhauser effect. When one simultaneously observes the nuclear resonance of the proton in the water molecule of the solvent, this signal appears strongly enhanced. One obtains a multiplication of the natural nuclear magnetic susceptibility $\chi_0$ by an important but constant magnification factor $m$ [Eq. (9)], in which $\gamma_e$ is the negative ($\gamma_e = -|\gamma_e|$) gyromagnetic ratio of the free electron, and $\gamma_p$ the gyromagnetic ratio of the proton. Changing $\chi_0$ to $m\chi_0$ one gets a bulk nuclear magnetic moment per cubic centimeter of

$$ M_0' = m\,\chi_0\,B \tag{10} $$
$M_0'$ is still proportional to $B$, although by a larger proportionality factor. Therefore one preserves the classical and unfavorable law for the signal amplitude, which varies as $B^2$. In 1957 this situation was improved and it was shown possible to restore a law linear in $B$. In fact, Abragam et al. (5) could make efficient use of an idea put forward by Kittel (84). Some ions show not a single line, but a multiplet. This is the case for the peroxylamine disulfonate ion, ON(SO₃)₂²⁻, where⁹ the splitting is due to a “hyperfine” interaction between the ionic magnetic moment and the nuclear moment of ¹⁴N. This nucleus has a spin $I = 1$ and can take three different orientations with respect to the measured field $B$: parallel, perpendicular, and antiparallel. In each case, the magnetic moment of ¹⁴N, by what is known as an “indirect action” (see, for example, 73, p. 184 et seq.), slightly changes the field experienced by the magnetic moment of the ion, and this by a quantity which is dependent on the orientation of the nucleus. The result is conspicuous when one is observing the paramagnetic resonance spectrum of the ion: at high field, one observes three lines, the central one at the frequency defined by Eq. (8) and the two satellites at frequencies $\omega_e \pm \tfrac{2}{3}\Omega$ with $\Omega/2\pi = 54.7$ Mc/sec; in usual spectrometers $B = 3000$ gauss and the splitting is 13 gauss (Fig. 19). The interest of this kind of splitting here is that it survives at low fields, as shown in Fig. 20; the frequency of the central component decreases indeed to zero, but the two others coalesce in a doubly degenerate single line at $\Omega/2\pi = 54.7$ Mc/sec. In the earth's field, near 0.5 gauss, these two components are well separated and may be separately saturated to induce the Overhauser effect; in the original experiment (5, 85, 86), the transition $\omega_1$ between a level of the $F = 3/2$ multiplet and a level of the $F = 1/2$ multiplet was used. What will be the enhancement of the nuclear signal under these last conditions?
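The numbers quoted for the hyperfine structure are mutually consistent, as a short calculation suggests. The sketch below converts the 13-gauss high-field splitting into frequency with the constant of Eq. (8) and then applies the standard zero-field result for an electron spin 1/2 coupled to a nuclear spin $I$, namely a separation of $(I + 1/2)$ times the hyperfine constant; function names are illustrative.

```python
# Constant of Eq. (8), expressed in Mc/sec per gauss.
GAMMA_E_MC_PER_GAUSS = 2.80453


def hyperfine_constant_mc(splitting_gauss):
    """High-field line spacing converted from gauss to Mc/sec."""
    return splitting_gauss * GAMMA_E_MC_PER_GAUSS


def zero_field_splitting_mc(a_mc, nuclear_spin=1.0):
    """Separation of the F = I + 1/2 and F = I - 1/2 levels at zero field."""
    return a_mc * (nuclear_spin + 0.5)


if __name__ == "__main__":
    a = hyperfine_constant_mc(13.0)
    print(f"hyperfine spacing: {a:.2f} Mc/sec")
    print(f"zero-field line:   {zero_field_splitting_mc(a):.2f} Mc/sec")
    print(f"central line at 3000 gauss: {3000 * GAMMA_E_MC_PER_GAUSS:.0f} Mc/sec")
```

A spacing of 13 gauss thus corresponds to about two thirds of 54.7 Mc/sec, and the zero-field line falls close to 54.7 Mc/sec, in agreement with the values given for ¹⁴N ($I = 1$).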
⁹ Long-term stability of peroxylamine sulfate is not satisfactory; new suitable radicals (84a) are highly stable and allow long storage.

FIG. 19. The hyperfine splitting in the high-field electron resonance spectrum of the peroxylamine sulfate ion, ON(SO₃)₂²⁻, at 3000 gauss (8414 Mc/sec).

FIG. 20. The Zeeman levels of the peroxylamine sulfate ion in a magnetic field ($F = 3/2$ and $F = 1/2$ multiplets).

Multiplying $\gamma_e$ and $\gamma_p$ by $B$ in Eq. (9) one puts this relation
in the form of Eq. (11), written in terms of $\omega_e$ and $\omega_p$.

Following Kittel's analysis (84), one then remarks that Eq. (11) is the appropriate form, from a physical point of view. In fact, $\omega_e$ and $\omega_p$ are the relevant parameters for a description of the Overhauser effect. Indeed, it consists essentially in a peculiar regime of exchange of energy between three thermodynamic systems: the protons of the solution, the ions, and the saturating radiation field. In all these processes, energy is exchanged by elementary quanta, $\hbar\omega_e$ or $\hbar\omega_p$, and it is then necessary that the law (11) allow these units of energy to stand in the final result of the detailed analysis of the energy balance. Looking at formula (11), in the low-field region, where $\omega_e$ nears its zero-field limit $\Omega$, one might hope to obtain an enhancement of the order of $\Omega/2\omega_p$, which would amount to about 11,500 in a field of 0.5 gauss. This much is not achieved in reality, because the multiplet structure offers numerous channels for energy exchange and the analysis of the complicated situation (86, 87) changes formula (11) to

$$ m = \frac{4}{27}\,\frac{\Omega}{\omega_p} + 330 \simeq 3880 \quad \text{for } B = 0.5\ \text{gauss} \tag{12} $$
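The orders of magnitude discussed here can be reproduced with elementary arithmetic. The sketch below is not the authors' formula: it assumes the textbook estimate $|\gamma_e|/2\gamma_p$ for a fully saturated single-line ion with purely dipolar coupling, and the rough low-field figure $\Omega/2\omega_p$ mentioned in the text. The proton constant is a rounded standard value and the function names are hypothetical.

```python
GAMMA_E_CPS_PER_GAUSS = 2.80453e6   # from Eq. (8), divided by 2*pi
GAMMA_P_CPS_PER_GAUSS = 4257.6      # assumed rounded proton value
OMEGA_ZERO_FIELD_CPS = 54.7e6       # zero-field line quoted in the text


def single_line_enhancement():
    """Magnitude |gamma_e| / (2 gamma_p): field independent."""
    return GAMMA_E_CPS_PER_GAUSS / (2.0 * GAMMA_P_CPS_PER_GAUSS)


def low_field_enhancement(b_gauss):
    """Magnitude Omega / (2 omega_p): grows as 1/B at low field."""
    return OMEGA_ZERO_FIELD_CPS / (2.0 * GAMMA_P_CPS_PER_GAUSS * b_gauss)


if __name__ == "__main__":
    print(f"single-line ion:        {single_line_enhancement():.0f}")
    for b in (0.5, 0.05):
        print(f"multiplet ion, {b} gauss: {low_field_enhancement(b):.0f}")
```

With these rounded constants the single-line figure is close to 330 and the low-field figure at 0.5 gauss is of the order of $10^4$, the same order as the value quoted above. The second quantity scales as $1/B$, which is the property that restores a signal linear in $B$.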
If we adopt an image currently used for masers, dynamic polarization by the Overhauser effect is a “pumping” process and ordinary relaxation acts as a “leak” for the pumped energy. When multilevel ions are used, the channels for relaxation are also multiplied in number and it is not possible to reach the limit calculated for $m$ by Eq. (12). Indeed Eq. (12) gives the result for a “relaxation-free” model; in fact, one obtains high values of the order of 1000 to 1500 for $m$, when $B = 0.5$ gauss. An essential property appearing in formula (12) is that now the magnification factor varies approximately as $1/\omega_p \propto 1/B$, and finally the signal varies linearly as $B$. This advantage remains for fields much lower than 0.5 gauss. It may be inferred from the data in (87) that when the two electronic lines $\omega_1$ and $\omega_2$ coalesce (their width amounts to 0.7 Mc/sec and their difference in frequency is $\Delta\omega = -\gamma_e B/3$), one may safely saturate both transitions simultaneously without impairing the efficiency.

2. Maser Oscillations. An important property of the coupled ion-proton system appears also in formulas (11) and (12) through the minus sign: it implies a negative value for $m$ and therefore for $M_0'$, owing to relation (10). The significance of the minus sign is that the moment $M_0'$ points in the direction opposite to that of $B$, the measured field: one obtains a “negative” or “inverted” moment, as in Fig. 17. Such a relative disposition
of the vectors $M_0$ and $B$ is not an exclusive feature of the Overhauser method: the same state could be obtained discontinuously, by the Packard-Varian technique, simply by orienting the prepolarization field $B_p$ just in the direction opposite to $B$ (Fig. 21). It could also be obtained
FIG.21. A possible Varian process for obtaining negative polarization.
FIG. 22. Scheme of the nuclear maser of Abragam et al. (86, 86, 83).
continuously by a liquid flow method due to Benoit (88), which will be examined later. Irrespective of the choice of the preparation method, the inverted state offers the opportunity to build simply a maser oscillator whose frequency, determined by Eq. (1), is proportional to $B$. The schematic of the device is shown in Fig. 22. This possibility stems
from the “conditional radiation instability” of the negative $M_0'$ state; by this term one understands the following properties:
(1) If the inverted sample is left alone, it is stable and $M_0'$ decreases in the ordinary way, i.e., exponentially following the law $\exp(-t/T_1)$, just as a positive moment would do.

(2) If the inverted sample is contained in a resonant radio circuit tuned to $\omega = \gamma_p B$ and if the electrical quality $Q$ of this circuit is high enough, then the negative $M_0'$ is “radiation unstable” and the strongly coupled system ($M_0'$ + tank circuit) starts oscillating at a frequency nearly equal to $\omega$. If $M_0'$ is produced by the Overhauser effect continuously and at a rate high enough to overcome the radiation losses, i.e., to provide the Joule energy lost in the tank, a steady state of oscillation is established and offers new possibilities for measuring $\omega$ and $B$ accurately. The threshold of instability is determined by Townes' condition (89)

$$ Q \ge \frac{1}{2\pi\,\eta\,\gamma_p\,|M_0'|\,T_2^*} \qquad \text{(cgs emu)} \tag{13} $$
where all symbols have been previously defined except $\eta$, a numerical coefficient less than 1, the filling factor, which measures the degree of geometrical coupling between coil and sample; $\eta = 1$ if the liquid fills all the coil volume [note that formula (13) is valid in the mksa system, provided the right-hand side is multiplied by $4\pi$].¹⁰ There are many ways to prove relation (13). The easiest to understand is the original quantum-mechanical demonstration given by Townes (89). One remarks that a proton in a field $B$ is a two-level quantum system (Fig. 23) with an upper level, 2, of higher energy and a lower level, 1, of smaller energy; by Boltzmann's law, in thermal equilibrium the number $n_2$ of protons in level 2 is a little less than the number $n_1$ in level 1. According to Langevin's theory of paramagnetism (see 81, p. 96) $M_0$ is then positive. In the steady state of Overhauser pumping the proton system is not in a state of thermal equilibrium. In this mode the populations of the levels are “inverted,” i.e., $n_2 > n_1$, and instability by radiation appears intuitively to be more possible. It is just the mechanism available to a system so prepared as to be in a high-energy nonequilibrium state; it is natural that it follows an ordinary evolution to a state of minimum energy.

Macroscopic reasoning also leads to condition (13) but in a much less transparent manner. One combines the circuit equation with Bloch's equations, as first done by Bloembergen and Pound (90), and one looks for the evolution of a solution defined by $m_x$, $m_y$, $m_z$. These quantities represent small departures from the ordinary (91) exponential evolution from an initial state $M_x = M_y = 0$, $M_z = M_{z0}$: $M_z = M_{z0}\exp(-t/T_1) + M_0$. Such a theory was first published by Vladimirsky (92) and was also developed later by Solomon (93) and Combrisson (94).

¹⁰ If $Q$ is smaller than the limit expressed in Eq. (13), then notwithstanding the presence of the circuit, a negative $M_0'$ is stable and decreases exponentially without oscillations.

FIG. 23. The relevant nuclear magnetic levels and their populations, in the normal (thermal equilibrium) and inverted or “pumped” states.
Finally the complex susceptibility formulation as applied to the transverse nuclear susceptibility (76, 91) offers a short route (95, 96) to Eq. (13): Bloch's equations are used to express $\chi_\perp = \chi_\perp' + j\chi_\perp''$, which appears as simply proportional to $\chi_0$; one has for $\chi_\perp''$ at exact resonance

$$ \chi_\perp'' = \tfrac{1}{2}\,\chi_0\,\omega\,T_2^* \tag{14} $$

It then appears natural that a negative $\chi_0$ involves negative values for $\chi_\perp'$ and $\chi_\perp''$. On the other hand, it is well known (see 67 or 76) that a positive $\chi_\perp''$ entails the appearance of a positive resistance $r_n$ in the coil of a resonant circuit coupled to the sample:

$$ r_n = 4\pi\,\eta\,\chi_\perp''\,L\,\omega \tag{15} $$

where $L$ is the self-inductance of the coil. Therefore a negative $\chi_\perp''$ produces a negative resistance $r_n$, and the threshold of oscillation is reached when the negative resistance just compensates the positive ohmic resistance of the coil ($r = L\omega/Q$); if we take into account relations (1), (14), and (15), the condition $r_n + r < 0$ leads directly to Townes' condition (13).

¹¹ The occurrence of instability was overlooked in the first analysis by this method.
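The resistance argument can be followed numerically. The sketch below takes Eqs. (14) and (15) together with $r = L\omega/Q$ and $\omega = \gamma_p B$, so that $r_n + r < 0$ becomes $Q > 1/(2\pi\eta\gamma_p |M_0'| T_2^*)$ with $|M_0'| = |\chi_0| B$. All numerical inputs in the usage lines (susceptibility, filling factor, inductance) are illustrative assumptions, not data from the text.

```python
import math

GAMMA_P = 2.6752e4  # rad s^-1 gauss^-1, assumed rounded proton value


def net_resistance(chi0, omega, t2_star, inductance, q, eta):
    """r_n + r in cgs emu, with r_n from Eqs. (14)-(15) and r = L*omega/Q."""
    chi_second = 0.5 * chi0 * omega * t2_star           # Eq. (14)
    r_n = 4.0 * math.pi * eta * chi_second * inductance * omega  # Eq. (15)
    return r_n + inductance * omega / q


def threshold_q(m0_abs, t2_star, eta, gamma=GAMMA_P):
    """Quality factor at which r_n + r changes sign."""
    return 1.0 / (2.0 * math.pi * eta * gamma * m0_abs * t2_star)


if __name__ == "__main__":
    b, chi0, t2, eta, ind = 0.5, -8e-7, 1.0, 0.5, 1e-3  # assumed values
    q_l = threshold_q(abs(chi0) * b, t2, eta)
    print(f"threshold Q: {q_l:.1f}")
    for q in (0.5 * q_l, 2.0 * q_l):
        sign = "unstable" if net_resistance(chi0, GAMMA_P * b, t2, ind, q, eta) < 0 else "stable"
        print(f"Q = {q:6.1f}: {sign}")
```

The inductance cancels out of the threshold, which depends only on the product $\eta\,\gamma_p\,|M_0'|\,T_2^*$; a weaker inverted moment or a shorter $T_2^*$ raises the required $Q$ in proportion.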
Experiments beautifully check these predictions in the case of the nuclear maser (see 5, 85, 86, 88).
3. Pulling Effect. The sole important drawback of the nuclear maser results directly from the threshold condition: the Overhauser effect does not provide a sufficiently high value of $M_0$ to enable one to reduce $Q_l$ below a value of 150. In practice the $Q$ of the circuit must reach a value about twice as high as the minimum limit $Q_l$, so that $Q$ must approach values of the order of 300. Under these conditions an error appears because the tank circuit cannot be tuned to the precession frequency $\omega = \gamma_p B$ with sufficient accuracy. In practice it is adjusted at $\omega_c$, different from $\omega$, and for this reason the maser oscillates at a frequency $\omega_m$ which differs both from $\omega$ and $\omega_c$. It is given by
$\omega_m - \omega = (\omega_c - \omega)/q$   (16)
where $q = \omega T_2^*/2Q$. $Q$ is the quality factor of the circuit and $Q > Q_l$; $Q_l$ is the value of $Q$ given by formula (13) when it reduces to an equality:
Exact tuning cannot be achieved for two reasons: (1) lack of a suitable criterion; (2) necessity to adjust the tuning in order to follow the variations in intensity of the natural fields. Relation (16) may be established in a simple manner by interpreting the buildup of the oscillations in the following way:
(1) In the initial state ($M_0$ lying along $-B$) there exist small microscopic transverse components isotropically distributed so as to have a zero resultant.
(2) The noise level of oscillation in the circuit produces a small rotating transverse field $B_1$, and $B_1$ gives rise in the usual manner (following Bloch's steady-state solution) to a small transverse macroscopic moment $M_1$ rotating with angular velocity $\omega$; this occurs because $B_1$ introduces order in the distribution of microscopic transverse components, i.e., it “phases” their precession: $M_1$ induces an emf $e$ in the tank and produces a current $i$; $i$ generates a rotating field $B_1$ which couples to $M_1$.
This may be shown by the following feedback diagram: $M_1 \to e \to i \to B_1 \to$ increase in $M_1$; for this phasing process to be most successful the sum of the phase differences along the feedback loop must be a multiple of $2\pi$. This condition is achieved at exact tuning12; in the case of a slight mistuning, the maser still oscillates; therefore the phase difference between $e$ and $i$ in the circuit, $\delta\varphi_c$, must be compensated by an opposite phase shift between $B_1$ and $M_1$, $\delta\varphi_N$; this condition reads
$\delta\varphi_N + \delta\varphi_c = 0$   (18)
12 The calculation in this case is easy.
and by reference to the Bloch and circuit equations one has
$\delta\varphi_N = (\omega_m - \omega)\,T_2^*, \qquad \delta\varphi_c = 2Q(\omega_m - \omega_c)/\omega_c$   (19)
Combining Eqs. (18) and (19) one obtains Eq. (16). A discussion of the practical values of the parameters leads to the conclusion that owing to this “pulling” error, the nuclear maser has too small a dynamic range; it would be useful only if provided with an automatic compensation system for the variation of B.
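A short numerical sketch may make the size of the pulling error concrete. It assumes the phase balance is written as $(\omega_m - \omega)T_2^* + 2Q(\omega_m - \omega_c)/\omega_c = 0$ (the sign convention for the nuclear phase shift is an assumption made here), solves it exactly for $\omega_m$, and compares the result with the approximate relation (16). The function names and the 1 cps mistuning are hypothetical choices for illustration.

```python
import math

def pulling_factor(omega, t2_star, q_circuit):
    """q = omega * T2* / (2 Q), as defined after Eq. (16)."""
    return omega * t2_star / (2.0 * q_circuit)

def maser_frequency_exact(omega, omega_c, t2_star, q_circuit):
    """Solve (w_m - w) * T2* + 2Q * (w_m - w_c) / w_c = 0 for w_m.

    The equation is linear in w_m, so the solution is a weighted mean of
    the precession frequency w and the circuit frequency w_c.
    """
    weight_spin = t2_star
    weight_circuit = 2.0 * q_circuit / omega_c
    return (weight_spin * omega + weight_circuit * omega_c) / (weight_spin + weight_circuit)

def pulling_error_approx(omega, omega_c, t2_star, q_circuit):
    """Eq. (16): w_m - w = (w_c - w) / q, valid when q is much larger than 1."""
    return (omega_c - omega) / pulling_factor(omega, t2_star, q_circuit)

if __name__ == "__main__":
    w = 2.0 * math.pi * 2000.0        # precession at 2000 cps
    w_c = 2.0 * math.pi * 2001.0      # tank mistuned by 1 cps (assumed)
    for q_circuit in (300.0, 5.0):
        exact = maser_frequency_exact(w, w_c, 1.0, q_circuit) - w
        approx = pulling_error_approx(w, w_c, 1.0, q_circuit)
        print(f"Q = {q_circuit:5.0f}: exact {exact / (2 * math.pi):.2e} cps, "
              f"Eq. (16) {approx / (2 * math.pi):.2e} cps")
```

Under these assumptions the oscillation frequency always lies between $\omega$ and $\omega_c$, and lowering $Q$ from 300 to 5 reduces the error by roughly the same factor, which is the motivation for the low-$Q$ spin oscillator discussed next.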
FIG. 24. The basic scheme of a nuclear spin oscillator using decoupled Bloch coils.
4. Spin Oscillator. Automatic compensation of the variations $\delta B$ is a possible solution. A … pair of magnetically …
Bonnet (98). They avoided use of the negative sign of $M_0$ and the maser effect. From the Overhauser effect they retained the enhancement of the absolute value of the polarization; in other words, they utilized the high value of $m$ in Eq. (12) but no longer its negative sign. They still kept the continuous oscillator principle but now resorted to a spin-coupled tube oscillator. Indeed, in his first disclosure of the discovery of “nuclear induction,” F. Bloch stressed the advantages of electromagnetically decoupled “Bloch coils.” These are two coils with perpendicular axes and a fine electrical adjustment in order to further reduce the residual coupling. Salvi made such a set of highly decoupled coils for frequencies in the neighborhood of 1900 cps (suitable in the earth's field); the residual coupling may be kept more than 80 dB down over a wide frequency band. If one then connects a pair of such coils to the input and output of an amplifier (Fig. 24) the system does not oscillate, even if the phase balance is properly adjusted: the coupling of the coils is too low. This state of
affairs changes drastically if a polarized sample is put inside the input coil and if an external field $B$ is adjusted so that the precession frequency $\omega$ of the moment is located in the middle of the amplifier's passband: the rotating moment efficiently couples the two coils at frequency $\omega$, and a steady oscillation occurs at this frequency. This spin oscillator also shows a threshold; the threshold condition is similar to that of the maser but includes as a new and very useful parameter the voltage gain $G$ of the amplifier. It reads
and the product $QG$ of the quality factor of the circuit and the gain $G$ replaces the single parameter $Q$ in the old formula (13). One can adopt a very low $Q$ and still reach threshold by increasing $G$ to a large value. Under these conditions the pulling error, proportional to $Q$, may be made negligible. Practical orders of magnitude are $Q = 5$ and $Q_l = 100$. No other form of pulling occurs if the linear amplifier is of the wide-band type and shows a phase shift independent of frequency near $\omega$ (cf. 58). The practical limit to the possible high values of $G$ is imposed by the residual electromagnetic coupling: the device should not be able to oscillate in the absence of the $B$ field, by sole virtue of this residual coupling. If one expresses it by the ordinary coupling coefficient $k_c$, one can show that the spin coupling may also be represented by a similar coefficient $k_s$; the value of $k_s$ is
…
where $\chi_0' = m\chi_0$, and $\eta_1$, $\eta_2$ are filling factors for the sensor and exciting coils. The condition for good performance then reads $k_s \gg k_c$; it can be easily realized; $k_s$ may reach values of the order … (for $T_2^* = 1$ sec; $\omega = 2\pi \cdot 2000$ cps; $m = 1500$; $\chi_0' = 1500 \times 3.3 \times 10^{-10}$ cgs emu), and $k_c$ may be reduced to ….
Spin oscillators were used earlier in the high-field domain; they were introduced by Schmelzer at CERN in Geneva in 1952 (99), and the theory was also developed by Kurochkin (100). The industrial resonance magnetometer built in France by Sud Aviation actually works on this principle; the original apparatus built by Salvi had the following characteristics:
(1) Sensitivity, 0.2 μG.
(2) Dynamic range, 0.4-0.5 gauss (cut in 10 ranges).
(3) Bandwidth, 0-1 cps.
(4) Overhauser hf power, 2.5 watts.
(5) Easily transformed to be a gradiometer.
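The operating frequency “in the neighborhood of 1900 cps” and the dynamic range quoted above are tied together by the proton precession relation $f = (\gamma_p/2\pi)B$. The sketch below uses the standard value $\gamma_p/2\pi \approx 4257.7$ cps per gauss; the function names and the 1 μG test increment are illustrative assumptions.

```python
# Proton gyromagnetic ratio divided by 2*pi, in cps per gauss
# (42.577 Mc/sec per tesla).
GAMMA_P_HZ_PER_GAUSS = 4257.7

def precession_hz(b_gauss):
    """Proton precession frequency in a field of b_gauss."""
    return GAMMA_P_HZ_PER_GAUSS * b_gauss

def frequency_shift_hz(delta_b_gauss):
    """Frequency change produced by a small field change delta_b_gauss."""
    return GAMMA_P_HZ_PER_GAUSS * delta_b_gauss

if __name__ == "__main__":
    for b in (0.40, 0.45, 0.50):
        print(f"B = {b:.2f} gauss -> f = {precession_hz(b):7.1f} cps")
    # A field change of one microgauss (assumed increment for illustration).
    print(f"1 microgauss -> {frequency_shift_hz(1e-6) * 1e3:.2f} millicycles/sec")
```

A field of 0.45 gauss gives a little over 1900 cps, and a change of one microgauss moves the frequency by only a few thousandths of a cycle per second, which shows why pulling errors of comparable size matter.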
5. Benoit's Liquid Flow Method. As mentioned before, the Overhauser effect is not the sole method available to obtain a marked enhancement of the nuclear resonance signal in a liquid. Benoit (88) devised a very efficient procedure, in which a liquid, tap water for example, is strongly polarized in the strong field of an electromagnet and then flows to the sensor; this system may be designed as a maser or as a spin oscillator. Indeed, between polarizer and sensor one may, at will, invert the moment by “fast passage” through an auxiliary hf coil, exciting the inversion in the inhomogeneous leakage field of the magnet. The mechanism of the device is explained at length in the review (16), where one may find a complete bibliography on this system. The enhancement factor $m$ may in practice reach very high values of the order of 10,000-20,000, and with a spin oscillator sensor one obtains a very sensitive, robust, and reliable magnetometer. But the device by necessity involves the use of a rather bulky and heavy piece of apparatus: the polarizing electromagnet. For this reason it can be used only in stationary observatories, and even there one needs to adjust carefully the position of the sensor with respect to that of the polarizer, so as to make the leakage field of the polarizer inoperative; this can be achieved in a systematic manner, as shown by Hennequin (101). The inconvenient necessity of this procedure has until now hindered the development of the device as a commercial magnetometer. But the experiment shown schematically in Fig. 25 is a simple and efficient one, and offers the simplest and most flexible demonstration of the properties of masers and spin oscillators.
FIG. 25. Principle of Benoit's experiment for continuously obtaining prepolarization and inversion by the use of liquid flow. (Labels: pump; magnet; inversion by adiabatic fast passage; maser's coil.)
IV. OPTICAL DETECTION OF AN ELECTRON NUCLEAR RESONANCE ($\Delta m_F = \pm 1$)
A. Optical Pumping
FIG. 26. The basic setup for studying an optical Zeeman spectrum by “optical pumping” of an alkali vapor. (Labels: Rb vapor cell; photocell; oscilloscope.)
1. Preliminary Survey. a. Optical and magnetic levels. The vivid expression “optical pumping,” coined by Prof. Kastler, calls to mind a rather accurate picture of this kind of experiment. To understand it, one may, for example, look at the usual pumping scheme, from the bottom of a valley, to a high-altitude lake, and back. Describing the process in the more abstract terms of energy transformation, one could then say:
Energy is pumped from a low level to a high level, stored for a while, and then allowed to flow down to another low level. This description may look rather dry for everyday life; its advantage is that it describes optical pumping equally well. In this process the energy is that of a few cubic centimeters of alkali metal vapor contained in a bulb at low pressure and subjected to a magnetic field of 1 gauss. The pump takes the unusual form of a circularly polarized beam of light, illuminating the alkali metal bulb (Fig. 26). The useful levels of energy are those of the alkali atoms and here they are many in number. The levels appear to fall in one or the other of two classes when the energies involved in possible transitions between them are considered:
(1) Optical transitions correspond to energies of the order of 500,000 Gc/sec (gigacycle = $10^9$ cps) when expressed in frequency units.13
(2) Zeeman or magnetic transitions. Ordinarily each optical level is “degenerate”; for a given energy $E_i$ the level contains a few states which one is justified in describing as different because their detailed properties are different; this is especially clear in theory, because they belong to different wave functions $\psi$. As regards energy in field-free space, this multiplicity is hidden and does not appear on the diagram, as long as one does not consider the effect of a magnetic field on the sample. In this case the field “removes” the degeneracy and instead of the single level line $E_i$ there now appear $d_i$ distinct levels $E_{ij}$, with energy differences proportional to the field; this magnetic splitting is described by a Zeeman diagram, Fig. 27a, which shows accurately the dependence of the energy on the field intensity in the low-field domain. A more extended graphical description, Fig. 27b, shows in general that the splitting obeys different laws in the low-field region (under 1 gauss), in the intermediate region (from 1 gauss upward to a few hundred), and in the high-field region (from 1000 gauss upward). Optical pumping can occur generally in the low-field region, where the splitting is in equal steps of value $\Delta\nu_i$ given by Kastler (102, 102a, 102b):
$\Delta\nu_i = \gamma_i B / 2\pi, \qquad \gamma_i = g_F \gamma_e / 2$
where $\gamma_i$ is the apparent gyromagnetic ratio of the level, $\gamma_e$ is the gyromagnetic ratio of the free electron considered in the first part, which appears with a small correction in the electron paramagnetic resonance of a free radical (such as DPPH), and $g_F$ is linked with the multiplicity of the level (it is also called, for historic14 reasons, the “Landé factor”). To the same approximation, one may draw straight lines on the diagram. The magnetic transitions generally, and more markedly so in the low-field domain, are characterized by the order of magnitude of their frequencies, which lie in the radioelectric domain (more specifically, they range from 100 kc/sec to a few megacycles per second; they are typically “radioelectric” frequencies).15 This low order of magnitude has an important consequence: the $d_0$ sublevels16 of the “optical ground” state (i.e., the lowest of the energy levels in the absence of an external magnetic field) are equally populated by thermal excitation; indeed, the average thermal energy per particle and per translational degree of freedom ($\tfrac{1}{2}kT$) at 300°K amounts to a much higher value of energy, some 3100 Gc/sec in frequency units; under such a strong thermal excitation all the sublevels of the ground state are equally populated in the ordinary equilibrium. On the other hand, all the other optical levels and sublevels of interest here are located some 500,000 Gc/sec higher, and are practically empty in the thermal equilibrium state.
A last important order of magnitude concerns the pumping light. The source is an ordinary high-frequency discharge; it must emit strongly on the frequency of the transition to be pumped. For this aim, one uses the same alkali metal in the lamp as for the pumped sample and, if need be, filters out of the spectrum the relevant single line corresponding to the transition studied in the absorber. But the conditions in the source (full Doppler effect, higher pressures, strong hf excitation) are very different from those in the absorption bulb: the line is broad (1000 to 3000 Mc/sec), much broader than the total width of the magnetic splitting occurring in the corresponding levels of the absorbers; therefore in the presence of a magnetic field the relevant sublevels of the absorber are all subjected to the same optical excitation by the pumping light. These simple conditions occur in the first absorbing layers of the sample: the very process of absorption itself rapidly changes the line shape of the excitation light when the beam has penetrated deeper into the absorber. Effects of this kind will be examined later.
13 The equivalence between the frequency unit of energy and the mechanical one is given by Planck's law $h\nu = E$, which results in: 1 cps = $6.625 \times 10^{-27}$ erg. Other useful equivalences are detailed in table 7a-2, p. 7.4, of the American Institute of Physics Handbook. The introduction of the frequency unit is convenient for measuring Zeeman energies later on. It may still be worth mentioning that the relation $E = kT$ introduces the correspondence 1°K ↔ 20.835 Gc/sec. Finally, as visible lines of the optical spectrum are known by their wavelength, the wave number $\tilde{\nu} = 1/\lambda$ [cm$^{-1}$] is a useful intermediate; $\tilde{\nu}$ is related to frequency by $\tilde{\nu} = \nu/(3 \times 10^{10})$: 1 cm$^{-1}$ ↔ 30 Gc/sec.
14 Historically, the genuine “Landé factor” described the degeneracy for L-S coupling, ignoring the nuclear spin.
15 Some frequencies occur in the microwave spectrum; such are the $\Delta F = 1$, $\Delta m_F = 0$ transitions used in atomic clocks (105), but they play no role in magnetometry, their frequencies being insensitive to $B$ to the first order.
16 We call $d$ the number of degenerate levels; the standard notation in statistical physics is $g$, which could be confused with the Landé factor here.
FIG. 27. (a) The splitting of the ground state in the Na spectrum: the two hyperfine levels $F = 1$ and $F = 2$ and their Zeeman splitting in low fields. The upper “optical” levels are indicated symbolically only. (b) The three typical regions in the full Zeeman spectrum: in the low-field region only, the splitting is linear in $B$. (Labels: D lines; optical pumping frequency 509,000 Gc/sec; hyperfine interval 1772 Mc/sec; Zeeman transitions; clock transitions; low fields, intermediate region, high fields.)
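The orders of magnitude used in this survey can be reproduced in a few lines. The sketch below converts energies to frequency units with the value of Planck's constant quoted in the footnote, evaluates the low-field Zeeman step $g_F\gamma_e B/4\pi$, and compares Boltzmann factors for a Zeeman interval and for the optical interval. The Boltzmann constant, the value $\gamma_e/2\pi \approx 2.8025$ Mc/sec per gauss, and $g_F = 1/2$ for the $F = 2$ level of Na are standard values supplied here as assumptions; the function names are hypothetical.

```python
import math

H_ERG_SEC = 6.625e-27        # Planck's constant as quoted in the footnote
K_ERG_PER_K = 1.3805e-16     # Boltzmann constant (assumed standard value)
GAMMA_E_HZ_PER_GAUSS = 2.8025e6  # free-electron gamma/2pi (assumed standard value)

def erg_to_gc(energy_erg):
    """Express an energy in gigacycles per second through E = h * nu."""
    return energy_erg / H_ERG_SEC / 1e9

def zeeman_step_hz(g_f, b_gauss):
    """Low-field Zeeman interval: (g_F * gamma_e / 2) * B / (2 pi)."""
    return abs(g_f) * GAMMA_E_HZ_PER_GAUSS / 2.0 * b_gauss

def boltzmann_ratio(delta_nu_hz, temperature_k):
    """Equilibrium population ratio of two levels separated by h * delta_nu."""
    return math.exp(-H_ERG_SEC * delta_nu_hz / (K_ERG_PER_K * temperature_k))

if __name__ == "__main__":
    print(f"kT per kelvin : {erg_to_gc(K_ERG_PER_K):.2f} Gc/sec")
    print(f"kT at 300 K   : {erg_to_gc(300 * K_ERG_PER_K):.0f} Gc/sec")
    step = zeeman_step_hz(0.5, 1.0)   # assumed g_F = 1/2, B = 1 gauss
    print(f"Zeeman step   : {step / 1e3:.0f} kc/sec")
    print(f"Zeeman ratio  : {boltzmann_ratio(step, 300.0):.9f}")
    print(f"optical ratio : {boltzmann_ratio(5.09e14, 300.0):.1e}")
```

The Zeeman sublevels come out equally populated to about one part in ten million, while the optical level at 509,000 Gc/sec is empty to dozens of orders of magnitude, which is the contrast the text relies on.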
b. The elementary processes in the mechanism of pumping. To summarize, in the simplest conditions one pumps with a broad line, and this illuminates evenly the atoms in the sublevels of the two relevant optical states. Nevertheless, equal illumination corresponds to different rates of excitation in the sublevels of the ground state for two reasons:
(1) The light is circularly polarized; it is, for example, $\sigma^+$ light, for which the $E$ vector rotates in the positive direction around the direction of propagation. For such $\sigma^+$ light, the total transition probabilities from these sublevels are different; therefore the different sublevels empty at unequal rates.
(2) By the same token, the magnetic sublevels of an excited optical level (the “upper” levels in the pump image) are populated at still different, but also unequal, rates.
The elementary probability of transition is relative to a pair of sublevels, one in the optical ground state, the other in the optical excited state: these “transition probabilities” are accurately given by the quantum theory (8, 104). The total probabilities out of (or to) any sublevel are obtained by summing the elementary probabilities for all the allowed transitions stemming from that level (or ending at it). The rules defining the allowed transitions depend on the polarization of the beam ($\sigma^+$ light, $\Delta m_F = +1$) and on the quantum state considered (elementary rules $\Delta F = 0, \pm 1$); later on, as in Fig. 31, relative total probabilities for $\sigma^+$ light are indicated by numbers on each level.
Each of the atoms “pumped” to an excited state remains in this upper state for a short while and then returns to a lower state essentially by fluorescence; this radiative process belongs partly to “stimulated emission” under the action of the pumping beam and partly, and mostly, to “spontaneous emission.” The distinction is important: the first process is the strict inverse of the pumping action and involves the same selection rules and probabilities; in contrast, spontaneous emission obeys different rules and opens new downward channels.17 The two processes are pictured in Fig. 28 by a type of level diagram that is very convenient to describe such processes and will be linked to the features of the Zeeman effect in alkali atoms later. For the present it is sufficient to know that energies are measured along a vertical axis, and that magnetic sublevels are characterized by their magnetic quantum numbers $m_F$, which are plotted along a horizontal direction: the big energy difference between two “optical” levels appears clearly; the much smaller one between magnetic sublevels is also shown but not to scale (current practice in the specialized literature is to neglect it on such diagrams).
17 This may be easily understood by considering spontaneous radiation as stimulated by zero-point energy fluctuations of the vacuum; this radiation shows no preferred direction or polarization.
FIG. 28. Heisenberg's form of the Zeeman diagram: it is convenient for understanding the pumping action of polarized light; it shows the action of $\sigma^+$ light, of spontaneous emission, and of relaxation; stimulated emission is proved to be quantitatively negligible here. (Labels: Zeeman $\Delta\nu$; relaxation; $m_F = -2, -1, 0, +1, +2$.)
The scheme here is purely hypothetical and does not pertain to any real alkali metal, but its simplicity lets some important features of the pumping scheme appear clearly:
(a) Absorption. The selection rule $\Delta m_F = +1$ characterizes the pumping effect of $\sigma^+$ light: by absorbing it, atoms are transferred from level $a$ to level $b'$ at a rate of $N_{ab'}$ per second; $N_{ab'}$ is given by the product of the relevant probability of transition $w_{ab'}$ by the actual population $n_a$ of the initial sublevel $a$. This is a big population because $a$ is a sublevel of the ground state and its Zeeman energy is small compared with the average thermal energy ($\approx kT$).
(b) Stimulated emission. This process is completely negligible here. Indeed the pumping beam may also induce the reverse transition $b' \to a$, and by the principle of “microreversibility” $w_{b'a} = w_{ab'}$. But the “optical” level $b'$ is so high in energy compared with the thermal excitation that, by
Boltzmann's formula, it remains practically empty at equilibrium; pumping does not change these conditions significantly because spontaneous emission empties the upper sublevels quickly. Finally, the rate of transfer by stimulated emission $N_{b'a} = w_{b'a} n_{b'}$ remains negligible, as does $n_{b'}$.
(c) Spontaneous emission. In contrast, spontaneous emission links the upper level $b'$ equally well with any of the low levels $a$, $b$, $c$; it is an important effect, which determines the order of magnitude of the lifetime in the upper level, $\tau$ between 10 and 100 nsec.
(d) Relaxation. Practically, spontaneous emission does not link any pair of Zeeman neighbors, $b'$ to $a'$ or $c'$ to $b'$; the probabilities of transition in such cases still exist but are very low because they are proportional to $(\Delta\nu)^3$, and the Zeeman $\Delta\nu$'s are so small in comparison with the optical $\Delta\nu$'s that this theoretical possibility may be safely neglected in practice. Then relaxation transitions become important; they are due to mechanical collisions between atoms or of the atoms with the wall. Very conspicuous is the efficiency of relaxation between Zeeman sublevels of the upper optical levels: relaxation can equalize the population of the sublevels very quickly if the pressure in the bulb is higher than … mm of mercury, owing to a higher temperature or to the presence of an inert “buffer” gas; this offers certain advantages as explained later.
This description of the main elementary processes in the pumping scheme is sufficient to introduce the origin of the practical classification of the modes of pumping. In practice one distinguishes between:
(1) Kastler pumping. The bulb contains alkali vapor only, at the right temperature to maintain a very low overall pressure (… mm Hg). By the choice of a low pressure and the use of a favorable coating on the walls of the bulb, relaxation between the sublevels of the upper optical level is minimized as far as possible. One can then understand the pumping effect by considering two processes only: absorption of the $\sigma^+$ light and spontaneous emission from the upper level. This mechanism is shown for the levels $a$, $b'$, $b$, $c$ in Fig. 28; one sees that the population of level $a$ tends to migrate to the right into levels $b$ and $c$. This is a general trend for any level of the ground state: for each of them, three arrows move the atoms to the right and only one backward to the original level. Practically, this useful effect may be a feeble one; nevertheless, after a buildup time a permanent regime is established where the levels at the right, in the ground state, are significantly more populated than those at the left of the diagram. The buildup time measures the overall efficiency of the scheme and ranges around … sec or more.
(2) Dehmelt pumping. The bulb contains alkali vapor at the same pressure as before, because technological reasons impose adoption of the
same working temperature, but in principle it could be higher. An inert (nonparamagnetic) buffer gas fills the bulb at a well-chosen pressure (a few millimeters of mercury; see later discussion); the result is that relaxation operates efficiently between the Zeeman sublevels of the upper optical level, but not between those of the ground level. The populations are equalized by relaxation in the sublevels of the excited state, but they remain unequal in the sublevels of the ground state. This is the result of the inequalities in the probabilities of emptying the different sublevels of the ground state; theory shows that these probabilities decrease when one goes from the left to the right on the diagram of Fig. 28. The drawing itself enables us to foresee that the probability for emptying sublevel $d$ is zero, because no upper level is available on which a $\sigma^+$ arrow starting from $d$ could end.
The details of the mechanism of pumping are evidently different in the two cases, but happily enough both schemes may be made very efficient in practice as regards magnetometer construction, because in this kind of application one is interested in the sublevels of the ground state only. In any case, the process of pumping appears as continuous and the atoms circle quickly around a “closed cycle” of states. When the permanent regime is established (this requires a buildup time in the range of a fraction of a second), one observes strong inequalities in the populations of the sublevels, at least in the ground state.
c. Detection of the “orientation” in the pumped state. The occurrence of strong inequalities in the populations of the magnetic sublevels represents the essential result of the “pumping cycle.” An important problem is to determine the efficiency of the process, i.e., to measure the permanent differences in populations: magnetometry will appear as an interesting by-product of the solution.
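The two pumping modes can be illustrated by a toy rate-equation model. The sketch below is purely hypothetical, in the same spirit as the diagram of Fig. 28: three ground sublevels ($m_F = -1, 0, +1$), $\sigma^+$ absorption lifting each atom by one unit of $m_F$, made-up pump rates and branching ratios, an excited state short-lived enough to be eliminated, and a slow ground-state relaxation toward equal populations. None of the numbers belong to a real alkali atom.

```python
# Indices 0, 1, 2 stand for ground sublevels m_F = -1, 0, +1.
PUMP_RATE = [1.0, 0.5, 0.0]   # assumed relative sigma+ absorption rates

# Assumed branching of spontaneous emission from excited sublevel e
# (row) to ground sublevel m (column); each row sums to one.
DECAY = [[0.5, 0.5, 0.0],
         [1 / 3, 1 / 3, 1 / 3],
         [0.0, 0.5, 0.5]]

def return_table(mode):
    """Probability that an atom absorbed from ground k falls back to ground m."""
    dark = [0.0, 0.0, 1.0]    # never used: sublevel +1 does not absorb
    if mode == "kastler":
        # No excited-state mixing: ground k goes to excited k+1, decays from there.
        return [DECAY[1], DECAY[2], dark]
    if mode == "dehmelt":
        # Buffer gas equalizes the excited sublevels before emission.
        mixed = [sum(DECAY[e][m] for e in range(3)) / 3 for m in range(3)]
        return [mixed, mixed, dark]
    raise ValueError(f"unknown mode: {mode}")

def simulate(mode, relax=0.01, dt=0.05, t_end=1500.0):
    """Integrate the ground-state populations (explicit Euler) to the permanent regime."""
    table = return_table(mode)
    n = [1 / 3, 1 / 3, 1 / 3]             # thermal equilibrium: equal populations
    for _ in range(int(t_end / dt)):
        absorbed = [PUMP_RATE[k] * n[k] for k in range(3)]
        rate = []
        for m in range(3):
            refill = sum(absorbed[k] * table[k][m] for k in range(3))
            rate.append(-absorbed[m] + refill - relax * (n[m] - 1 / 3))
        n = [n[m] + dt * rate[m] for m in range(3)]
    return n

def absorption(n):
    """Total absorption rate, the sum of population times pump rate."""
    return sum(r * x for r, x in zip(PUMP_RATE, n))

if __name__ == "__main__":
    for mode in ("kastler", "dehmelt"):
        n = simulate(mode)
        print(mode, [round(x, 3) for x in n], round(absorption(n), 4))
```

In both modes of this toy model the population accumulates in the non-absorbing sublevel at the right of the diagram and the bulb becomes more transparent, even though the return paths differ; the final degree of orientation is set by the ratio of the assumed relaxation rate to the pump rate.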
This is possible in a variety of ways, and we begin by recalling two methods which are not used here, although they are very popular in other domains, where one deals with condensed matter; here they are ruled out because the sample is a vapor at low pressure.
(1) Enhanced paramagnetism. The alkali atoms in the states considered here have a magnetic moment, and the artificial inequalities in the $n_i$ produced by pumping are the cause of strongly enhanced paramagnetism. But the direct measurement of paramagnetic resonance is made difficult by the low density of the vapor.
(2) Detection of the “orientation” by nuclear methods. The strongly enhanced paramagnetism explains the expression “oriented state” often used to describe the pumped state. Indeed in the magnetic field $B$, the magnetic moments and, correspondingly, the atomic spins are oriented in
the direction of the field: the degree of orientation may be high if all the atoms are pumped to a single sublevel of the ground state. As will appear later, this is nearly possible. Such a “completely pumped” sample could be a useful oriented target in nuclear physics (106) if the density of the vapor could be increased; this has been recently achieved in the closely related case of optically oriented $^3$He (206) (the reference offers an accurate discussion of this case).
(3) Maser action. The differences in population between sublevels may show one sign or the other; a priori, it does not look impossible to choose two levels that show an inverted population, i.e., a negative paramagnetism, and to try to obtain maser oscillations by a suitable coupling with a resonator. Practically, this experiment is very difficult for Zeeman transitions at radio frequencies, because the alkali vapor is at too low a pressure: the inequalities in population may be very strong, but the total number of atoms available per cubic centimeter is too low in a vapor at a pressure of some … mm Hg. This kind of experiment was first attempted for microwave clock transitions, for which it is essential to reach the regime of self-oscillation; success was achieved only recently and with great difficulty (107, 107a). Nevertheless $^3$He was shown very recently to offer an exceptionally favorable case; a nuclear maser was operated successfully with optically pumped $^3$He (108); the nuclear relaxation time is very long and this permits an efficient “motion narrowing” of the nuclear resonance (see 91, p. 57); the result is a high value of $T_2^*$, which in formula (13) overcompensates the influence of a low $M_0$, owing to the low gas density.
In the general case, Kastler's discovery of the indirect optical methods offers easy solutions to these difficulties. Only two of these procedures are used in the magnetometer technique.
The principle of these “trigger methods,” whose great sensitivity was stressed before, is the following:
(a) One submits the sample to a second18 and stronger excitation, at a radio frequency corresponding to a Zeeman transition between two sublevels, for example, $i$ and $j$ of the ground state; this is called the “double resonance” method;
(b) The result is “saturation” of the transition (see 109, p. 168); in other words, there is a trend toward equalizing the populations of the sublevels $i$ and $j$. Correlatively, it markedly perturbs the pumped regime and changes the permanent populations of all the sublevels; this is often called a “reorientation” of the atoms.
18 For this reason this type of experiment is often called “double resonance,” i.e., optical and radio resonance.
The optical consequences are numerous and varied; generally speaking, the radio excitation changes
(1) The polarization and intensity of the fluorescent light, which one may observe to be emitted sideways. The determination of the polarization of the fluorescent light was widely used by Kastler's school (7) for scientific research. It is not popular as regards magnetometers for two reasons: (a) experimentally, it is complicated, and (b) fundamentally, it may be shown that differences in polarization arise from unequal populations of the sublevels of the optically excited state. These are $p$ states for alkali metals; the $p$-electron cloud is geometrically asymmetrical, assuming an elongated shape and showing an orbital magnetic moment. For this reason, the $p$ states are very sensitive to mutual atomic collisions, which disturb the orbital magnetic moment and, by “fine structure” and “hyperfine structure” coupling, induce transitions between the sublevels; one says that the sublevels are “mixed” or “reoriented,” and naturally this averaging of the populations strongly deteriorates the efficiency of the method at high vapor pressures or in the presence of “buffer gases” (otherwise interesting for high-intensity effects).
FIG. 29. Probing the effect of Zeeman rf excitation on a pumped vapor by the use of an auxiliary crossed beam: the secondary beam is amplitude modulated at the rf Zeeman frequency. (Label: sodium.)
(2) The optical absorption of the bulb. One may observe the absorption on the pumping beam itself, receiving it finally, as proposed by Dehmelt (110), on a photomultiplier. One may also use an auxiliary beam directed at right angles to the first (Fig. 29) and much less intense (in
order that it will exert a negligible “pumping” action); this case is very advantageous because, as discovered by Dehmelt (111), the beam appears as amplitude modulated at the Zeeman frequency. The absorption depends on the populations of the sublevels of the ground state only. These substates are spherically symmetrical $S$ states with zero orbital magnetic moment: thus an important link between the orientation of the magnetic moment (which in this case originates in the electron spin or in the coupled electron and nuclear spins) and the perturbation during an encounter is mostly suppressed. The mixing of sublevels of the ground state does not occur easily, and a useful variation of absorption under the action of Zeeman saturation still occurs at the highest pressures in use. The process is simple in principle: “optical pumping” by its very nature selectively empties some sublevels of the ground state; this effect diminishes the overall absorption of the bulb, which is proportional to
$\sum_{i,j} n_i w_{ij}$   (24)
$n_i$ being the population of sublevel $i$ of the ground state in the permanent regime and $w_{ij}$ the transition probability to a sublevel $j$ of the excited state. Saturating a Zeeman transition changes the values of the $n_i$ in a complicated manner: experiments and theory (9) show that in general it increases the absorption. This result is nearly evident in the simple and easily realizable case when the Zeeman equalization occurs between a level $i$ included in the summation and a level $k$ which does not figure in summation (24). Level $k$ does not participate in optical absorption because of a prohibition expressed by the so-called “selection rules”; such a level automatically fills up because the pump cannot empty it (the relaxation “leaks,” neglected here, will be considered later on; in fact, in a few cases, they can be made of minor importance). In this important case, Zeeman excitation appears clearly as a “depumping action.” Such a case is easily achieved for any alkali metal by a suitable choice of the pumping light, as first proved by Franzen (112). This simple case will be examined in more detail in the next paragraph. For the general case, the reader may find a more detailed introduction in recent reviews on “optical pumping” by Kastler (102, 102a) and Bloom et al. (113-118), and a detailed treatment in the references cited there, and more especially in Brossel (119, 120, 120a) and Cohen-Tannoudji (121).
2. Alkali Atoms: the Simplified Case of Na. Actually, rubidium and cesium exclusively are used in commercial magnetometers; this choice results from technological advantages of Cs and Rb that will appear later
MEASUREMENT OF WEAK MAGNETIC FIELDS
on. In principle, any member of the first column of Mendeleev’s table (122) would work equally well in a pumping cycle. At first sight, one might think that the member in the first row, hydrogen, would prove a simple and typical example. Experimentally, this is not true because the appropriate line, called Lyman α, which corresponds to the transition from the ground state, n = 1 (n radial quantum number), to the first excited state, n = 2, is situated in the far ultraviolet (λ = 1215.7 Å), where the construction of sources and lenses meets great difficulties. Moreover, the natural species is the H2 molecule, which must first be dissociated into H atoms by some kind of Langmuir arc; the experiment was attempted only very recently (123). On the other hand, the theory would appear both singular and oversimplified; it would be of little help in understanding the other cases. Indeed, the hydrogen atom is the only one where a pure Coulomb force acts between nucleus and electron. For this reason, the n levels all show a peculiar degeneracy, the S state (l = 0) and the P state (l = 1) having the same energy. As a result, the Lyman α line is a singlet (neglecting the very small unresolved splittings corresponding to the Dirac relativity correction and to the Lamb shift), corresponding to the “radial” transition n = 1 → n = 2. All the other members of the first column are true alkali metals and the structure of the relevant levels is markedly different. In these cases, one still deals with an unpaired electron; but its lowest level corresponds to at least n = 2 (n = 2 for Li and n = 3 for Na); we will take Na as an example, because it is the simplest¹⁹ of the experimentally studied alkali metals. Between the lone electron and the nucleus, other shells completely filled with electrons are interposed.
In the present problem, the presence of these complete shells is of no importance, except that the central force is no longer Coulombic, and now states with the same n, but different l’s, have significantly different energies. Consequently the ground state remains an S state (n = 3, l = 0 for Na) but the next excited state is a P state still with n = 3 but with l = 1 (for Na); the relevant optical transitions depend essentially on the azimuthal quantum number (Δl = ±1), not on n, and as the various alkalis differ essentially in n and the radial part of ψ, this explains why the optical transitions are similar in all of them. The spin of the lone electron shows its existence by the spin-orbit coupling, which splits this P state (l = 1) into two substates labeled by the quantum number J: (J = l + 1/2 = 3/2
19 The only abundant isotope of Na is 23Na, with the smallest value for the nuclear spin (I = 3/2); 87Rb shows the same nuclear spin and for this reason the same set of levels; but natural Rb is a mixture of 87Rb (27%) and 85Rb (73%), the latter with I = 5/2. Pure 87Rb is available and is used in magnetometers, giving strong signals; 23Na for technological reasons gives poor signals.
P. A. GRIVET A N D L. MALNAR
and J = l − 1/2 = 1/2); this “fine structure” splitting is the origin of the well-known yellow doublet (D1, λ1 = 5896 Å; D2, λ2 = 5890 Å) of the sodium light; similar doublets occur in the visible spectrum of all the alkali metals as shown in Table III.

TABLE III

                            23Na    39K     85Rb    87Rb    133Cs
Isotopic abundance (%)      100     93.1    72.8    27.2    100
Nuclear spin I              3/2     3/2     5/2     3/2     7/2
D1 (Å), P1/2 → S1/2         5896    7699    7948    7948    8943.5
D2 (Å), P3/2 → S1/2         5890    7665    7800    7800    8521

a The Li doublet at 6708 Å (Δλ = 0.14 Å) is not resolved by current optical spectrographs (these data are excerpts from H. G. Kuhn, “Atomic Spectra.” Longmans, Green, New York, 1962).
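As a quick arithmetic check on Table III, the doublet interval D1 − D2 and the corresponding fine-structure splitting of the P state can be computed from the tabulated wavelengths. The sketch below is only illustrative: the helper names are hypothetical, the wavelengths are the rounded table values, and the difference between air and vacuum wavelengths is ignored, so the wavenumbers are approximate.

```python
# Doublet wavelengths in angstroms, copied from Table III: (D1, D2).
DOUBLETS = {
    "23Na": (5896.0, 5890.0),
    "39K": (7699.0, 7665.0),
    "85Rb": (7948.0, 7800.0),
    "87Rb": (7948.0, 7800.0),
    "133Cs": (8943.5, 8521.0),
}


def doublet_interval_angstrom(d1, d2):
    """Wavelength interval D1 - D2 in angstroms."""
    return d1 - d2


def fine_structure_wavenumber(d1, d2):
    """Approximate P3/2 - P1/2 splitting in cm^-1 (1 angstrom = 1e-8 cm)."""
    return 1.0 / (d2 * 1.0e-8) - 1.0 / (d1 * 1.0e-8)


for isotope, (d1, d2) in DOUBLETS.items():
    interval = doublet_interval_angstrom(d1, d2)
    splitting = fine_structure_wavenumber(d1, d2)
    print(f"{isotope:>6}: D1 - D2 = {interval:6.1f} A, splitting ~ {splitting:6.1f} cm^-1")
```

The wide intervals found for Rb and Cs are what make it practical to separate D1 from D2 with an interference filter, whereas the Na doublet is only a few angstroms wide.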
High-resolution optical spectroscopy allows a still finer splitting of the levels to appear, the “hyperfine” splitting. As explained by Leighton (6) and Mockler ( I O S ), this arises from the dipole coupling between the intrinsic magnetic moments of the electron and of the nucleus. It exists in its simplest form in the H spectrum, where I = 1/2 for the proton; it is more complicated for 23Na (I = 3/2) and still more so for Rb (for 85Rb, I = 5/2; for 87Rb, I = 3/2) and Cs (for 133Cs, I = 7/2). Each of the previously considered J levels is split into several components corresponding to the values of the “total” quantum number F (vectorially F = I + J; the quantum number F runs in integer steps between I + J and |I − J|); the resulting spectrum may be quite complicated, as shown by Skalinski (164) for Cs. But the mechanism of pumping is not essentially changed because one pumps with a beam of circularly polarized light shining along the B direction and called σ+ illumination. For this kind of polarization the selection rules of importance are the same for F (ΔmF = +1) as for J (ΔmJ = +1). Therefore, the discussion and the drawings can be simplified by ignoring the hyperfine structure and discussing an approximate model where all the hyperfine levels are lumped together at their center of gravity. Figure 30a shows a model for Na; it may be considered as a typical one for the alkali metals because the principal quantum number n plays no role in the process. The selection rules invoked here are simply expressions of the conservation of angular momentum: for σ+ illumination (the E vector
rotating in the same sense as a magnetizing current producing B), each photon carries a quantum of angular momentum ħ and can only disappear by absorption in a collision that induces a transition in which the atom gains a unit of angular momentum ħ: these permitted transitions are defined by

$$\Delta m_F = +1 \quad \text{or} \quad \Delta m_J = +1$$

This conservation law is of general validity and applies to the simplified model, as well as to the real situation.
[Figure 30: level diagrams labeled “Fine structure,” “Zeeman levels,” and “σ+ pumping for 23Na (simplified),” with D1 light and D2 light transitions; horizontal axes mJ = −3/2, −1/2, 1/2, 3/2; panels (a), (b), (c).]
FIG. 30. The action of the D1 and D2 components in σ+ pumping, explained on a simplified level diagram: hyperfine structure is ignored. (a) Zeeman diagram; (b) Heisenberg diagram, D1 action; (c) Heisenberg diagram, D2 action (114).
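The selection rule ΔmJ = +1 stated above can be applied mechanically to the simplified level scheme of Fig. 30. The following sketch, with hypothetical function names and with hyperfine structure ignored as in the figure, lists for each ground sublevel the excited sublevel reached under σ+ light, and marks a ground sublevel that has no allowed transition.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def sublevels(j):
    """Magnetic sublevels m = -j, -j + 1, ..., +j for angular momentum j."""
    count = int(2 * j) + 1
    return [-j + k for k in range(count)]


def sigma_plus_transitions(j_ground, j_excited):
    """Map each ground sublevel m to the excited sublevel m + 1, or None.

    None means that no excited sublevel satisfies the rule delta m = +1,
    so sigma-plus light cannot empty that ground sublevel.
    """
    excited = set(sublevels(j_excited))
    return {m: (m + 1 if (m + 1) in excited else None)
            for m in sublevels(j_ground)}


d1_lines = sigma_plus_transitions(HALF, HALF)      # S1/2 -> P1/2 (D1)
d2_lines = sigma_plus_transitions(HALF, 3 * HALF)  # S1/2 -> P3/2 (D2)

for name, lines in (("D1", d1_lines), ("D2", d2_lines)):
    for m_ground, m_excited in lines.items():
        target = "no transition" if m_excited is None else f"mJ = {m_excited}"
        print(f"{name}: ground mJ = {m_ground} -> {target}")
```

With D1 light the sublevel mJ = +1/2 of the ground state has no partner, while with D2 light both ground sublevels are emptied by the pump.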
Looking then at the Zeeman diagram for the model in the presence of B, it is convenient to arrange the levels to correspond to a given value of B, in the manner shown in the right-hand part of Fig. 30. For improved clarity, the sublevels there all correspond to the given B, and those situated on the same vertical line correspond to the same value of mJ. Therefore, the only allowed transitions for σ+ excitation according to the rule (ΔmJ = +1) are pictured by arrows inclined toward the right. It
then appears that if pumping is done with the D1 line only (filtering out the D2 line), there is no transition to empty the sublevel mJ = +1/2 of the ground state; this level then fills up because the atoms pumped to the mJ = +1/2 sublevel of P1/2 fall down to the mJ = +1/2 sublevel of S1/2 by the phenomenon of “spontaneous emission”²⁰ (broken arrow, Fig. 30b). Figure 30c displays the pumping scheme for D2 light. This is a more complicated case and since there is a pump transition emptying the mJ = +1/2, S1/2 level, the result can no longer be foreseen so simply. Pumping produces a more complicated nonequilibrium population pattern which
[Figure 31: level diagram; horizontal axis mF = −3, −2, −1, 0, 1, 2, 3.]
FIG. 31. Complete Heisenberg diagram for Na, including hyperfine structure: the pumped transition using σ+, D1 light.
can be calculated by rate equations, using the transition probabilities of quantum theory. This is shown in many of the above-mentioned references, especially Franzen and Emslie (112) and Alley (126). It is easy to show that the simplicity and efficiency of (D1, σ+) pumping is retained when hyperfine splitting is taken into account: the full Na sublevel structure is shown in Fig. 31. One may check that the P1/2 and S1/2 levels each split into the same number of mF sublevels because the relevant values of F are the same in both cases; e.g., F = 2 or F = 1 or more generally F = I ± 1/2. For this reason the sublevel now labeled by mF = 2, F = 2, S1/2, is not pumped, the selection rule ΔmF = +1 here taking the form of a prohibition.
20 As shown by Einstein, the probability of spontaneous emission gains rapidly in importance as the frequency increases (proportional to the cube of the frequency). For this reason “spontaneous emission” is negligible in the microwave and radioelectric range of frequencies but important for visible light (see Siegman, 109, p. 212).
This (D1, σ+) process is also very interesting because it demonstrates that as the pump depletes all the available sublevels, absorption of the pumping light strongly decreases: the bulb becomes much more transparent as soon as all atoms are collected in the insensitive mF = 2 sublevel. The efficiency of the pumping scheme can easily be checked by measuring the decrease in absorption; for example, one may observe an absorption coefficient of 50% when pumping is switched on, and it decreases within a second to 40% or less when the pumping cycle is fully established. A correlative and important aspect of this mechanism explains why the entire volume of the bulb is pumped: during the buildup of the cycle, only the first layers are pumped; they soon become transparent and allow the source to pump progressively deeper layers. With this last argument, it appears that pumping with mixed D1 and D2 light could occur by the same simple mechanism: the D2 light²¹ has a transition starting from the level mF = 2, S1/2, soon filled up by D1. Consequently, it still suffers strong absorption after the sample becomes transparent for D1. D2 would be absorbed in a first thin part of the sample; the remaining volume of the bulb would be pumped by D1 as described above. This would explain an apparent contradiction: the D1D2 pumping appears experimentally efficient even in those cases where theory would predict zero efficiency. It is always possible to filter out D2 by interference filters for Rb and Cs, the interval between D2 and D1 being sufficiently large (see tabulation).

                          85Rb    87Rb    133Cs
D1 − D2 interval (Å)      148     148     422
Finally, Franzen’s mechanism also shows clearly the efficiency of radioelectric “depumping.” In the pumped state, level mF = 2 is full, and saturating the transition ΔmF = −1 by a hf magnetic excitation transverse to B sends atoms to the neighboring sublevel, mF = 1, from which they make transitions to other sublevels and thus restore absorption.
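The pumping and depumping cycle described above can be illustrated with a deliberately crude rate-equation sketch for the two ground sublevels of the simplified model of Fig. 30b. Everything specific in it is an assumption made for illustration: the function names, the branching fraction of 1/3 for decay into the non-absorbing sublevel, the symmetric exchange rate standing in for both relaxation and hf excitation, and the numerical rates. Only the population of the absorbing sublevel mJ = −1/2 is tracked, since the D1, σ+ absorption is proportional to it.

```python
def absorbing_fraction_steady(pump, exchange, branch=1.0 / 3.0):
    """Steady-state population of the absorbing sublevel mJ = -1/2.

    pump     : optical absorption rate per atom in that sublevel (1/s)
    exchange : symmetric exchange rate between the two ground sublevels,
               lumping relaxation and hf excitation together (1/s)
    branch   : assumed fraction of excited atoms decaying to mJ = +1/2
    """
    return exchange / (pump * branch + 2.0 * exchange)


def absorbing_fraction_simulated(pump, exchange, branch=1.0 / 3.0,
                                 dt=1.0e-5, steps=20000):
    """Integrate the rate equation with explicit Euler steps from equal populations."""
    n_minus = 0.5  # unpumped vapor: both sublevels equally populated
    for _ in range(steps):
        n_plus = 1.0 - n_minus
        optical_transfer = pump * branch * n_minus   # net flow -1/2 -> +1/2
        mixing = exchange * (n_plus - n_minus)       # flow back toward equality
        n_minus += (mixing - optical_transfer) * dt
    return n_minus


PUMP_RATE = 1000.0   # illustrative value, 1/s
RELAXATION = 10.0    # illustrative slow relaxation "leak", 1/s
HF_RATE = 5000.0     # illustrative saturating hf excitation, 1/s

pumped = absorbing_fraction_simulated(PUMP_RATE, RELAXATION)
depumped = absorbing_fraction_simulated(PUMP_RATE, RELAXATION + HF_RATE)
print(f"absorbing fraction, pump only:      {pumped:.4f}")
print(f"absorbing fraction, pump + hf field: {depumped:.4f}")
```

In this toy model the absorbing fraction falls far below one half under pumping alone, and is driven back toward one half when the hf exchange rate dominates, which is the restoration of absorption used as the optical signal.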
21 Moreover D2 alone suffers twice as much absorption as D1 alone in the same sample.

B. Zeeman Excitation
1. Zeeman Resonance Frequency. The pumping beam therefore suffers a strong absorption when the hf excitation is tuned to the transition frequency. The latter’s value is, in general,

$$\nu = \frac{2.80\,B}{2I + 1} \quad [\text{Mc/sec, } B \text{ in gauss}] \qquad (25)$$
I being the nuclear spin value for the alkali metal chosen. The order of magnitude of ν is shown in Table IV.

TABLE IV

           Isotopic          Number of lines (a)    Δν (cps) (b)    ΔB (μG) (b)
           abundance (%)     2I + 1
23Na       100               4                      138             198
39K        93.1              4                      531             760
41K        6.9               4                      100             144
85Rb       72.8              6                      36              77
87Rb       27.2              4                      36              52
133Cs      100               8                      6.7             19

a Level F = I + J = I + 1/2.
b For B = 0.5 gauss.
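A short numerical sketch may help in reading formula (25) and Table IV together. It assumes that B in formula (25) is expressed in gauss, which is consistent with the coefficient of about 0.7 Mc/sec per gauss quoted later for 87Rb; the function names are hypothetical, and the nuclear spins are the standard values for these isotopes. The last helper converts a frequency interval Δν into the equivalent field interval ΔB by dividing by the first-order slope of formula (25).

```python
from fractions import Fraction

NUCLEAR_SPIN = {
    "23Na": Fraction(3, 2),
    "39K": Fraction(3, 2),
    "41K": Fraction(3, 2),
    "85Rb": Fraction(5, 2),
    "87Rb": Fraction(3, 2),
    "133Cs": Fraction(7, 2),
}


def zeeman_frequency_mc(b_gauss, spin):
    """First-order Zeeman frequency of formula (25), in Mc/sec."""
    return 2.80 * b_gauss / float(2 * spin + 1)


def lines_in_upper_level(spin):
    """Number of Zeeman lines in the F = I + 1/2 level, that is 2I + 1."""
    return int(2 * spin + 1)


def field_equivalent_microgauss(delta_nu_cps, spin):
    """Field interval (microgauss) equivalent to a frequency interval (cps)."""
    slope_cps_per_microgauss = 2.80 / float(2 * spin + 1)
    return delta_nu_cps / slope_cps_per_microgauss


for isotope, spin in NUCLEAR_SPIN.items():
    nu = zeeman_frequency_mc(0.5, spin)
    print(f"{isotope:>6}: I = {spin}, lines = {lines_in_upper_level(spin)}, "
          f"nu(0.5 gauss) = {nu:.3f} Mc/sec")
```

Applied to the Δν entries of Table IV, the conversion gives field intervals close to the tabulated ΔB values, for example roughly 77 μG for the 36 cps interval of 85Rb.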
Formula (25) would be very accurate for B < 0.1 gauss, but in the earth’s field it is only approximate, to about 0.1%; this accuracy is sufficient for measuring variations in B around a mean value, such as for the earth’s field. If one looks for higher accuracy and absolute measurements in the same range (≈0.5 gauss), corrections to formula (25) are necessary and proceed from the Breit-Rabi formula, as explained in detail by Parsons and Wiatr (126). It means that on the Zeeman diagram the vertical B = 0.5 gauss is already located in the “intermediate” region where the curves are no longer rigorously straight lines. The intersection points are no longer equidistant with Δν given by Eq. (25). The so-called Back-Goudsmit effect begins to bend the curves slightly, and this phenomenon is accurately described by the theoretical Breit-Rabi formula (127): by its use an accuracy of one part in 10^6 for absolute measurements may be reached, if the various component lines are sufficiently narrow to be resolved. The factors determining line width will be discussed in the next paragraph: it will then appear that complete resolution is possible for Rb under carefully chosen conditions. Among others, a few relate to external parameters: a very low Zeeman excitation is necessary in order to avoid any saturation broadening; a very uniform field B is required. Commercial magnetometers and ordinary observations do not meet these conditions: in order to get a good signal-to-noise ratio, a strong signal is obtained, at the expense of broadening, by using an intense excitation. Under these conditions the various lines corresponding to different pairs of sublevels ΔmF = 1, in the F = I + J and F = I − J levels, overlap, and in effect a center of gravity peculiar to the apparatus is used; the
value of the absolute accuracy must be determined experimentally in each case. These possible differences appear clearly when the signals of 85Rb and 87Rb are compared in the same apparatus or when the same field is measured with Cs and Rb magnetometers. Practically (49), the frequency of the center of gravity of the aggregate of lines is given by formulas such as the following:

$$\nu = 466744\,B \pm K\,(359\,B^2) \quad \text{for } {}^{85}\text{Rb}$$
$$\nu = 699585\,B \pm K\,(216\,B^2) \quad \text{for } {}^{87}\text{Rb} \qquad (26)$$
where K is an empirical coefficient experimentally determined, which characterizes a given instrument; K < 0.4 generally. The minus sign in this formula is the proper one when B points in the direction of propagation of the pumping light; the plus sign is used when B and the propagation vector are antiparallel. Full resolution was achieved early by Bender in his pioneer work on the subject [see, for example, Bender et al. (128)], and a theory of line width was given by Bloom (129). Bender (130) analyzed the practical aspects of magnetometers; a detailed analysis is to be found in Bloom’s now classical description of the technology of the Rb magnetometer (131); data on Cs magnetometers will be given in the last chapter of this review. The quantitative aspects of the problem will become clear if one considers the following:
(1) The number of magnetic lines. Each of the two hyperfine levels of the optical ground state, F = I + J = I + 1/2 and F = I − J = I − 1/2, splits into 2F + 1 magnetic levels, i.e., into (2I + 2) and 2I levels, respectively, and therefore there are (2I + 1) and (2I − 1) lines (rule ΔmF = ±1).
(2) The spacing of the lines on the frequency scale. In the intermediate region the curves on the Zeeman diagram can be considered as slightly parabolic (the first-order correction to linearity); then the second-order difference of the energy terms is constant: it means that the component lines differ in frequency by constant amounts; these are to be compared to the line width. Table IV is an excerpt of Bloom’s calculation (131) for the F = I + 1/2 level. It appears that the components of Cs show the least differences in frequencies and for this reason its use is advantageous as regards absolute measurements.
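In practice a magnetometer reading is a frequency, and formula (26) has to be inverted to recover B. The sketch below shows one way of doing this; the function names are hypothetical, the coefficients are those quoted in formula (26) (ν in cps, B in gauss), and the value of K must be supplied by the user since it is an instrument constant. The inversion uses the root of the quadratic that tends to ν divided by the linear coefficient when the quadratic term vanishes, written in a form that avoids subtracting two nearly equal numbers.

```python
import math

# (linear coefficient in cps/gauss, quadratic coefficient in cps/gauss^2)
COEFFICIENTS = {
    "85Rb": (466744.0, 359.0),
    "87Rb": (699585.0, 216.0),
}


def center_frequency(b_gauss, isotope, k, parallel=True):
    """Formula (26): minus sign when B is parallel to the light propagation."""
    linear, quadratic = COEFFICIENTS[isotope]
    sign = -1.0 if parallel else 1.0
    return linear * b_gauss + sign * k * quadratic * b_gauss ** 2


def field_from_frequency(nu_cps, isotope, k, parallel=True):
    """Invert formula (26) for B (gauss), given the measured frequency (cps)."""
    linear, quadratic = COEFFICIENTS[isotope]
    c = (-1.0 if parallel else 1.0) * k * quadratic
    # Solve c*B^2 + linear*B - nu = 0 for the root near nu / linear.
    return 2.0 * nu_cps / (linear + math.sqrt(linear * linear + 4.0 * c * nu_cps))


nu_example = center_frequency(0.5, "87Rb", k=1.0, parallel=True)
b_recovered = field_from_frequency(nu_example, "87Rb", k=1.0, parallel=True)
print(f"nu = {nu_example:.1f} cps, recovered B = {b_recovered:.9f} gauss")
```

Setting k = 1 with the minus sign reproduces the form of a single resolved line rather than a center of gravity; for an actual instrument the empirically determined K would be used instead.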
Shifts of a different origin but probably of a smaller order of magnitude (a few cycles per second) (132) are known to exist in theory. They appear in the clock transition in the microwave range and are easier to measure in this case; they correspond to small perturbations of the energy levels due to the pumping light itself (an effect similar to the Bloch-Siegert
effect in magnetic resonance) or to the collisions between atoms, or with the “buffer” gas molecules or the bulb walls; we refer the reader to the review by Bender (133) and to Arditi and Carver (134) for information on these phenomena, which gain importance in absolute measurements; see the detailed theory due to Barrat and Cohen-Tannoudji (135). The 4He type of apparatus was thoroughly studied and shows a notable defect (a few tens of cycles per second) of this general kind (136). The 3He nuclear resonance magnetometer shows an effect of the same nature but peculiar to the exchange of spin states which characterizes this type of instrument (137, 138).
2. Linewidth; Relaxation. a. Pump leaks; mixing of sublevel populations. For clarity, let us consider the simple case of (D1, σ+) pumping on the simplified Na diagram of Fig. 30; the essential results are as follows:
(1) Pumping alone results in filling up the sublevel mJ = +1/2 of the ground state, emptying all the others.
(2) Switching on radio excitation first populates the adjacent level mJ = −1/2 at the expense of mJ = +1/2, and afterward the combined action of the optical pump and of the radio excitation redistributes the population among all levels of the ground state; this general change in population indirectly produces the optical signal.
Any other cause transferring atoms from the privileged mJ = +1/2 level to others will perturb the scheme and diminish the signal; it will act as a leak does for an ordinary pump. This appears clearly when one examines the initial effect of radio excitation: if a parasitic “relaxation” process partially fills the level mJ = −1/2, the first action of radio excitation will be impaired. More generally relaxation is due to transitions occurring in collisions between atoms and with the walls; these processes belong to thermal excitation and their trend is to restore the populations occurring at thermal equilibrium, a state where Zeeman resonance is completely obscured by noise. These difficulties made the initial experiments on optical pumping very difficult; for example, pumping mercury was considered to be impossible for several years. An obvious remedy would be to pump with brighter sources, but the physics of the source is complicated and progress is slow in this domain. Moreover, powerful laboratory lamps are rather bulky and would not fit into portable instruments. Nevertheless, the study of sources is active, and the recent ones, in which a 10-cm cw magnetron powers a hf discharge in a flat, thin tube, are efficient; such lamps are described by Cagnac et al. (139-144). A second way is to diminish the relaxation scrambling of population
between levels. Progress of great practical importance was the proof that the mixing of the sensitive sublevels of the excited optical states was of little consequence in the (D1, σ+) pumping scheme; quantum theory is used to calculate both cases, and Franzen has shown (112) that at most a factor of 2 in efficiency is lost; consequently, the buildup time of the oriented state is longer, as shown in Fig. 32. The chief relaxation process for the ground state, as explained before, now resides in collisions of atoms with the walls; mutual collisions are inefficient. Bloom (113) states the problem in these vivid terms: each atom suffers some 10,000 collisions per second with the wall at the low
[Figure 32: buildup curves; horizontal axis: time in units of 1/wt, wt being the total probability of absorption per unit time.]
FIG. 32. Franzen’s calculation of two extreme cases in σ+, D1 pumping: negligible or complete mixing caused by relaxation in the sublevels of the upper optical level.
pressure in use (10^-6 mm Hg); each collision may have a depumping action. On the other hand, the most powerful illumination system provides only 1000 useful photons per second to hit and pump this atom; the pump is largely overcome. A first remedy was found in Kastler’s laboratory²² (145): Adding to the metal vapor a magnetically inactive “buffer gas” such as argon (with zero atomic magnetic moment), at relatively high pressure (a few millimeters or centimeters of mercury), replaces the straight path of the alkali atoms by a complicated zig-zag trajectory; encounters with the wall seldom occur and the numerous collisions with Ar atoms are largely innocuous, entailing no magnetic effect.
22 The interesting story of this discovery and of its development is told in Bloom (113) and Carver (114).
This clearing of magnetic defects
was made so efficient by Dehmelt (146-148) that a residual effect of the glass wall appeared. Dehmelt and others (149-151) successfully suppressed it by coating the wall with long-chain hydrocarbons or silicones. It seems that under the best conditions the only relaxation process still active is the collision with very small metal droplets on the cold parts of the wall. A detailed theory of these processes and an extensive bibliography are to be found in Bernheim and others (152-154).
b. Line width. Relaxation processes not only scramble the populations but also significantly disturb the energy levels, usually broadening them. The resonance frequency becomes less sharp, or in other words the line is
[Figure 33: two walls labeled “Coated glass wall,” separated by the distance a.]
FIG. 33. Dicke’s simplified model of the buffer gas or coating action, in reducing the broadening of lines by the Doppler effect.
broadened. The use of a buffer gas and of coatings is beneficial in this respect but experimentally the advantage would not even appear if these two processes did not have yet another virtue: they dramatically suppress the chief cause of broadening, the Doppler effect (which of course does not count as a relaxation process). This is Dicke’s theoretical discovery (155), and was proved experimentally by his students (156, 157). We analyze here the basic mechanism using a simple linear and classical model introduced by Dicke. Consider an atom flying along the x axis and bouncing back and forth between two glass walls of separation a (Fig. 33). An observer located on the positive part of the x axis receives a wave which is frequency modulated. One assumes complete efficiency of the wall coating: the radiation mechanism is not perturbed at all by the collision. Moreover, the wall in the drawing may symbolize the real wall
of the bulb or the average argon atom of the buffer gas in a highly simplified model which is accurate enough for our purpose. The classical theory may be used to obtain the frequency modulation. The time diagram of the apparent frequency radiated by the atom is simple: it is made up of alternately positive and negative rectangular pulses; the height of the positive crests is +δν, and the height of the negative
[Figure 34: spectrum with marks at ν(1 − v/c) and ν(1 + v/c); horizontal axis: emitted frequency.]
FIG. 34. Splitting due to frequency modulation in Dicke’s model.
crests is −δν. Taking the unmodified frequency as the reference, δν is the elementary Doppler shift value (|δν| = (v/c)ν, where v is the velocity of the atom, c the velocity of light, and ν the frequency). The Fourier spectrum of such a frequency-modulated wave (Fig. 34) is well known, if rather complicated. Around the central line (“carrier”) at frequency ν, we observe satellites (sidebands); the first two are centered on ν + δν and ν − δν; these two satellites show an amplitude J1(T δν), which may be compared with J0(T δν) for the central component; T is the period of
the modulation (T = 2a/v), δν the elementary Doppler shift, and J0 and J1 are standard Bessel functions of order 0 and 1. For small values of the argument, J1/J0 ≈ ½ T δν = a δν/v = a/λ, where λ is the wavelength of the unmodified line (the last step obtained by taking into account Doppler’s law, δν/ν = v/c). A small relative amplitude of the satellites with respect to the central line thus depends on the quotient a/λ, being equal to it for small values; considering the values of the parameter a/λ, very different but typical conditions appear. When the bulb contains alkali vapor only, a is simply the diameter of the bulb, which we take as 5 cm; with a buffer gas, a is the mean free path length L; at high pressures of the buffer gas, L becomes much smaller. A specific example is argon at a pressure of 1 mm Hg, as used in Arditi’s clock experiment (158) with a = 0.1 mm. Typical cases are then:
(1) Magnetometer transition in the earth’s field; λ ≈ 100 meters; λ >>> a for any conditions. It makes no difference whether or not a buffer gas is used: a/λ is always very small and the average continuous spectrum reduces to the central component. This “motional narrowing” results in very thin lines and sharp resonances. The best results were obtained by Bender (159, 160, 160a, 160b) taking great care to eliminate external disturbances; he obtained a line width of 2.3 cps for Rb. Using the extremely favorable conditions at the Fredericksburg observatory for measuring the lowest-frequency transition in 87Rb, he obtained
$$\nu = 699585\,B - 216\,B^2 \qquad (F = 2,\ m_F = -2 \;\leftrightarrow\; F = 2,\ m_F = -1) \qquad (27)$$
(ν in cps, B in gauss)
(2) Optical clocks; λ ≈ 10 cm. The narrowing effect would essentially disappear for a coated bulb (λ/a = 2), but the use of a buffer gas restores the narrowing, and with L ≈ 0.1 mm, and λ/L = 1000, one reaches line widths of the order of 20 cps for Na and 40 cps for Rb and Cs. The narrowness is essential here, because the clock’s longitudinal transition is a weak line in comparison with ordinary Zeeman lines arising from transverse transitions.
(3) Optical transitions (D lines for example). The wavelength, λ ≈ 0.5 μ, is so small that any narrowing would be difficult to obtain, requiring a high pressure for the buffer gas.
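The three typical cases can be compared numerically through the single parameter a/λ of Dicke’s model. The sketch below uses the lengths quoted in the text (bulb diameter 5 cm, mean free path 0.1 mm, and the three wavelengths); the function names and the threshold of 0.1 used to call the narrowing effective are assumptions made for illustration, and the identification of the sideband-to-carrier ratio with a/λ holds only when that ratio is small.

```python
BULB_DIAMETER_M = 0.05       # uncoated or coated bulb without buffer gas
MEAN_FREE_PATH_M = 1.0e-4    # argon buffer gas at about 1 mm Hg

WAVELENGTHS_M = {
    "magnetometer transition": 100.0,
    "clock transition": 0.10,
    "optical D line": 0.5e-6,
}


def sideband_to_carrier(a_m, wavelength_m):
    """Small-argument estimate of J1/J0 in Dicke's model, equal to a / lambda."""
    return a_m / wavelength_m


def narrowing_effective(ratio, threshold=0.1):
    """Hypothetical criterion: sidebands negligible when a / lambda is small."""
    return ratio < threshold


for label, wavelength in WAVELENGTHS_M.items():
    for confinement, a in (("bulb", BULB_DIAMETER_M), ("buffer gas", MEAN_FREE_PATH_M)):
        ratio = sideband_to_carrier(a, wavelength)
        verdict = "narrowed" if narrowing_effective(ratio) else "not narrowed"
        print(f"{label:<24} {confinement:<10} a/lambda = {ratio:10.3g}  {verdict}")
```

The output reproduces the qualitative conclusions of the list: the magnetometer line is narrowed in every case, the clock line only with a buffer gas, and the optical lines in neither case at these pressures.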
3. Transverse Modulation. a. Bloch’s equations and density matrix theory. In the previous paragraphs, the role of the radio field was described in a rather crude manner: it was considered as inducing transitions between pairs of adjacent magnetic levels, changing their populations. For example, the same basic process occurs in a nuclear resonance experiment with water protons: here, the level structure is the simplest one with two levels, +1/2 and −1/2; the consideration of populations and of selection rules would lead to a good description of the fundamentals of the nuclear resonance. But many experiments, such as adiabatic fast passage, Varian-Packard free precession, and Hahn’s echo method, do not find an easy explanation that way: for all these processes, involving the “phasing” of transverse atomic components of the magnetic moment and the “coherence” properties resulting in the creation of a bulk magnetic moment transverse to the polarizing field B, Bloch’s equations are of great value. How to bridge the two points of view is also known now and was due, in the field of nuclear resonance, largely to the efforts of Bloch himself (see 161 for the most “introductory” treatment), who succeeded in a precise but difficult theory by means of the “density matrix.” In the domain of optical pumping, the historical evolution followed the same path: “coherence” effects embodied in the movement of components of the moment M perpendicular to the steady field B were discovered by Dehmelt (111) and explained by Bell and Bloom on the basis of equations of the Bloch type (9) before the establishment of an accurate density matrix theory by Barrat and Cohen-Tannoudji (121, 162, 163).
Recognizing the simplest features of the Bloch equations in nuclear resonance helps to understand the modulation of a transverse beam: the basic property of a macroscopic moment M is to exist as a definite physical entity. In other words, it possesses definite properties irrespective of the method employed to produce it: for example, in the Packard-Varian experiment one may produce the initial state M⊥ in the direction perpendicular to the earth’s field B by prepolarizing in a static transverse field B⊥ (this scheme is easily described in population language), or use the more complicated process where one changes the orientation from longitudinal to transverse by a suitable hf pulse (π/2 pulse), or use other dynamical processes such as adiabatic fast passage: this last procedure is easy to explain starting with Bloch’s equations, but more difficult to understand when one is working with populations (see 16). These many possible choices are all equivalent in the end, and the moment M at a given stage obeys the Bloch equations linking its movement to the actual value of the total magnetic field. Distinguishing transverse components M⊥ and longitudinal ones may be very convenient for the clarity of the description as is the separation of the field into B0 = B∥ and B⊥, but these analytical distinctions do not imply any difference in basic properties.
23 Saturation is not considered here and the amplitudes are assumed very small.
On the other hand, the Bloch equations are linear²³ and distinguishing
various components in B in order to separate the various possible excitations shows that their effects add linearly²⁴; interference (164) terms do not appear in most instances. A well-founded example of this linear addition appears in Bloch’s initial treatment of resonance: he simply adds to the effect of the hf coherent excitation those of the stochastic fields responsible for relaxation:

$$\frac{d\mathbf{M}}{dt} = \underbrace{\gamma(\mathbf{M} \times \mathbf{B})}_{\text{static term}} + \underbrace{\gamma(\mathbf{M} \times \mathbf{B}_1)}_{\text{hf excitation}} \underbrace{{} - \frac{1}{T_2}\,\mathbf{i}M_x - \frac{1}{T_2}\,\mathbf{j}M_y - \frac{1}{T_1}\,\mathbf{k}(M_z - M_0)}_{\text{relaxation}} \qquad (28)$$
Applying these results to nuclear resonance, one may predict the effect of radio excitation by a field rotating about the z axis in the plane xy with angular velocity ω = γB; acting on a sample pumped by σ+ along the z axis, it moves the moment M away from its z′z direction in a spiral path, inducing it to rotate around the z axis, in synchronism with the hf field. At any instant during the process, M remains endowed with the same properties as in the initial position Mz; when M is collinear with the y′y axis and points in the positive direction, the sample will strongly absorb the light of an auxiliary (σ+) beam that is collinear with the y′y axis and used to optically probe the sample; the absorption of the beam will vary in intensity in proportion to the magnitude of the component of M along the direction of the probing beam, My.²⁴ᵃ The effect is that of a radio-frequency modulation of the amplitude of the light at the precession frequency of M. This is a low frequency compared with that of the light. This “adiabatic transformation” does not have any influence on the light other than the hf modulation. The success of this kind of theory, well verified by experiment, should not lead one to think that density matrix theory represents a superfluous complication. Indeed, the conditions in the optical experiment are original ones: for example, the magnetic moment in the optical experiment is a composite one (F = I + J = I + L + S) and this vectorial addition is made flexible through the Back-Goudsmit effect: the conditions are markedly different from those obtained with a rigid nuclear moment I, and the Back-Goudsmit transitions may show slightly different frequencies.
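The modulation of the probing beam at the precession frequency can be visualized by integrating the torque term of the Bloch equation alone. The sketch below is a minimal illustration under stated assumptions: relaxation and the hf excitation are omitted, the units are arbitrary (γ = 1, B = 1 along z), the moment starts along x, and the function names are hypothetical. The component My, to which the probe absorption is proportional, then oscillates at ω = γB.

```python
import math


def cross(a, b):
    """Vector product a x b for 3-component tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torque(m, field, gamma):
    """Right-hand side dM/dt = gamma * (M x B), relaxation omitted."""
    return tuple(gamma * component for component in cross(m, field))


def rk4_step(m, field, gamma, dt):
    """One classical fourth-order Runge-Kutta step."""
    def advance(base, slope, h):
        return tuple(x + h * s for x, s in zip(base, slope))

    k1 = torque(m, field, gamma)
    k2 = torque(advance(m, k1, dt / 2.0), field, gamma)
    k3 = torque(advance(m, k2, dt / 2.0), field, gamma)
    k4 = torque(advance(m, k3, dt), field, gamma)
    return tuple(x + dt / 6.0 * (s1 + 2.0 * s2 + 2.0 * s3 + s4)
                 for x, s1, s2, s3, s4 in zip(m, k1, k2, k3, k4))


def precess(m0, field, gamma, dt, steps):
    """Return the list of moment vectors over the requested number of steps."""
    history = [m0]
    m = m0
    for _ in range(steps):
        m = rk4_step(m, field, gamma, dt)
        history.append(m)
    return history


GAMMA = 1.0
FIELD = (0.0, 0.0, 1.0)
OMEGA = GAMMA * FIELD[2]
STEPS_PER_PERIOD = 1000
DT = 2.0 * math.pi / (OMEGA * STEPS_PER_PERIOD)

trace = precess((1.0, 0.0, 0.0), FIELD, GAMMA, DT, STEPS_PER_PERIOD)
probe_signal = [m[1] for m in trace]  # My, proportional to the probe absorption
print(f"My after a quarter period: {probe_signal[STEPS_PER_PERIOD // 4]:+.6f}")
print(f"My after a full period:    {probe_signal[-1]:+.6f}")
```

With the sign convention dM/dt = γ(M × B) used here, a moment starting along +x reaches −y after a quarter period and returns to its starting point after one period, so the probe signal is a sinusoid at the precession frequency.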
For these reasons it is very interesting to use a complete theory; here the density matrix calculations are notably simpler than in statistical mechanics: the reader may judge for himself, consulting Cagnac (I%), Cohen-Tannoudji (166), and Winter (166), who introduce one to the complete treatment found in Cohen-Tannoudji (121).
24 For small excitations, of course; disregarding any saturation phenomenon, for example.
24a When My < 0 the auxiliary beam is absorbed as a (σ−) excitation.
Recently, Professor Carver has given a new formulation to this theory,
which is both clear and efficient, and he has presented it in a pedagogical form, very suitable for a first study (121a).

[Figure 35: block diagram with elements labeled “Interference filter (pass λ1 = 7948 Å),” “Circular polarizer,” “Rubidium vapor cell,” “RF coil,” “Phase shift,” “90° phase shift,” “Field to light,” and “Output signal.”]
FIG. 35. Scheme of the one-beam optical spin oscillator.

b. Experimental aspects. This modulation is used for measuring ω through the construction of a spin-coupled oscillator. In principle it is simple to transform the light into an electrical oscillation with a photomultiplier, to amplify it, and feed it back to the excitation coils with the proper phase to obtain a self-oscillator of the spin-coupled type. The output of the photomultiplier occurs at frequency ω and not at 2ω because the depth of modulation amounts to a few percent. The optical spin oscillator is very similar to the nuclear Bloch coil oscillator, except for one point: the photocurrent is directly proportional to My, at variance with the induced emf which is proportional to dMy/dt. This introduces a π/2 phase shift in the feedback loop that must be compensated by one means or another. If one looks at the scheme of Fig. 29 where the pumping beam (along Oz) and the probing beam (along Oy) are spatially separated one could simply “cross” the axis of the hf coils with the y beam as in the Bloch coil arrangement for resonance. This is not possible in practice: the general setup is always simplified by using the same light beam for pumping and probing (Fig. 35). It must then be inclined at 45° to the field to be measured, and the simplest mechanical arrangement of the coils leads to alignment of their axes along the beam; for this rigid geometry the π/2 phase shift must be compensated by external electrical means. The π/2 phase shift is of importance in geomagnetic exploration: for example, when the observation plane turns back the magnetometer,
the field B is inverted, and this negative value of B amounts to a change in sign of ω, the angular speed in the precession of M; for the electrical signal too, it is equivalent to a change in sign of the circular frequency ω, the various phase shifts keeping their sign. An equivalent electrical situation is obtained by keeping ω positive and changing the sign of the phase shift: this operation is of no importance for the small parasitic shifts, but the ad hoc π/2 phase shifter must be switched so as to produce +π/2 instead of −π/2. This inconvenience is suppressed if a single pump lamp feeds two magnetometers symmetrically disposed back to back on the same axis (131). The connection shown on the scheme makes the two π/2 phase shifts relative to each half of the apparatus cancel one another (Fig. 36). This setup has another marked advantage: the Breit-Rabi formula, developed to second order as in Eq. (26), gives terms linear in B and also quadratic terms; inverting B changes the magnitude of ν because the quadratic terms do not change sign, but ν and B do; this fact is expressed in ordinary frequency formulas, which link the absolute values of ν and B, by double signs such as those of Eq. (26). This double valuedness would be a source of errors in practice; it is removed in the double magnetometers, which oscillate on the mean frequencies given by formula (26) with the quadratic terms cut off.

FIG. 36. Elimination of the π/2 phase shifter by a double-head arrangement due to Bloom (131): the phase shifter is the source of troublesome effects due to rotation in mobile equipment.
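The cancellation of the quadratic terms in the double-head arrangement can be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: the linear coefficient is the 700 kc/sec per gauss quoted for Rb in this chapter, while the quadratic coefficient `B_QUAD` is a hypothetical placeholder, not a value taken from Eq. (26).

```python
# Minimal sketch: a signed frequency law with a linear and a quadratic term.
# A_LIN is the Rb figure quoted in the text; B_QUAD is an assumed,
# purely illustrative stand-in for the second-order Breit-Rabi term.
A_LIN = 700e3   # cps per gauss
B_QUAD = 70.0   # cps per gauss squared (hypothetical)


def signed_frequency(b_gauss):
    """Signed frequency: the linear term flips with B, the quadratic one does not."""
    return A_LIN * b_gauss + B_QUAD * b_gauss ** 2


def double_head_frequency(b_gauss):
    """Mean of the frequency magnitudes of two heads mounted back to back."""
    head_a = abs(signed_frequency(b_gauss))
    head_b = abs(signed_frequency(-b_gauss))
    return 0.5 * (head_a + head_b)


if __name__ == "__main__":
    field = 0.5  # gauss
    print("single head, B > 0:", abs(signed_frequency(field)))
    print("single head, B < 0:", abs(signed_frequency(-field)))
    print("double head mean  :", double_head_frequency(field))
```

With these assumed numbers the two single-head readings differ by twice the quadratic term, while their mean depends on the linear term alone.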
C. Experimental Orders of Magnitude

The choice made in practice for the chief parameters of optically pumped magnetometers often results from the imperatives that appeared in the preceding paragraphs. For example, the usefulness of "coatings" made of straight chain hydrocarbons, with low melting points, favors the selection of Rb or Cs as an active medium: these two metals alone offer a high enough vapor pressure at sufficiently low temperatures, as shown by the following table,²⁵ giving the temperature in degrees Celsius at which the vapor pressure is 10⁻⁶ mm Hg:

    ⁷Li     ²³Na    ³⁹K     ⁸⁵Rb    ⁸⁷Rb    ¹³³Cs
    304     126     63      39      39      22
The intensity of the pumping light may be kept at a relatively low level in order to reduce the source power and simplify its construction, especially its stabilization mechanism; this is possible because optical detection is highly efficient. For the same reason, when using the hf modulation principle it is also possible to maintain a low level of oscillation (hf field of the order of 100 µG), and this precaution eliminates the possibility of spurious double quantum jump resonances (reviewed in 166) and assures narrow Zeeman lines (Δν = 50-100 cps). On the other hand, polarizers are standard components²⁶ and quarter-wave plates for this favorable part of the spectrum may be made easily from mica or cellophane sheet.

Less well determined is the choice between steady absorption and self-oscillator systems for detection: it now seems commonly accepted that the former system is better suited for stationary and aerial observation, and the latter for rocket and satellite experimentation: an auto-oscillator at a high frequency leads more readily to short over-all time constants. Further details will be given later in the section on the Cs magnetometer and are also to be found in Bender (130).²⁷ Still undecided is the choice between use of a buffer gas and of a coating alone, as well as between Rb and Cs. Such a comparison will not be attempted because the over-all performances are of the same quality, and result more from careful engineering than from anything else.

We will conclude this analysis by describing the recognized characteristics of two types of Rb magnetometer, a laboratory one and a commercial one, both using a buffer gas. The last part of this review will give a detailed description of a Cs magnetometer, planned and constructed by one of us (Malnar). Thorough discussion of the technology is to be found in Colegrove et al. (106) and of the sensitivity in Bender (130) and Bloom (131). It should be remarked that the physics of optical magnetometers is much more complicated than that of nuclear resonance; the relaxation processes are not completely understood or measured. For these reasons, one cannot actually make as clear-cut an analysis as for the nuclear magnetometer.

25 An excerpt of the extensive data given by R. E. Honig, RCA Rev. 18, 195 (1957).
26 Marketed by the Polaroid Company.
27 The reasons are set forth in Section V, B.
FIG. 37. Scheme of the automatic control of the frequency of an rf oscillator for locking it to the center frequency of a Zeeman line: phase modulation is used to scan the line, but a pure carrier is retained for accurate frequency measurement without any filtering. (Blocks shown: photocell, circular polarizer, recorder, frequency counter, phase modulator, Larmor frequency oscillator, low frequency oscillator, frequency control signal.)
P. L. Bender has summarized his experiments (130) on a Rb magnetometer of the dc type in the following data:
(1) Line width, 15-20 cps or 20-30 µG for B = 0.5 gauss.
(2) Absolute accuracy, 12 parts in
(3) Line width at low field, 3 cps or 3 µG for B = 50 mG.
(4) Bandwidth for commercial apparatus, 1 cps.
(5) Practical sensitivity, 0.1 µG.
(6) Frequency, 700 kc/sec per gauss.
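The line widths in this list are quoted both in cycles per second and in field units; the two are tied together by the 700 kc/sec per gauss figure of the last item. A minimal sketch of that conversion follows (the function name is ours, chosen for illustration only).

```python
# Minimal sketch: convert a Zeeman line width from cps to microgauss,
# using the 700 kc/sec per gauss figure quoted for the Rb magnetometer.
RB_CPS_PER_GAUSS = 700e3


def linewidth_in_microgauss(width_cps):
    """Field-equivalent width (in microgauss) of a line width_cps wide."""
    return width_cps / RB_CPS_PER_GAUSS * 1e6


if __name__ == "__main__":
    for width in (3.0, 15.0, 20.0):
        print(width, "cps ->", round(linewidth_in_microgauss(width), 1), "microgauss")
```

Under this conversion 15-20 cps corresponds to roughly 21-29 µG, in line with the 20-30 µG quoted; the low-field entry (3 cps or 3 µG) agrees only to within the rounding used in the list.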
Figure 37 shows the electronics used by Bell and Bloom for locking the frequency of an oscillator on the Zeeman frequency (131). In Ref. (131), A. L. Bloom describes the hf modulation magnetometer of the portable type:
(1) Buffer gas, neon, p = 3 cm Hg.
(2) D1 light, λ1 = 7948 Å (λ2 = 7800 Å rejected by an interference filter).
(3) Average photocurrent, 100 µA.
(4) Average signal current, 1 µA.
(5) Signal-to-noise ratio, 600.
(6) Ultimate sensitivity, 0.03 µG.
(7) Line width, 100 cps for 140 µG.
(8) Bandwidth, linked with sensitivity, 1 cps for 0.01 µG.
(9) Amplitude orientation dependence (double sensor), sin 2θ.

The low-field limit may be taken approximately as the line width, e.g., 100 µG. The technology of space magnetometers of this type is described in Heppner et al. (167) and Ruddock (168).
D. Helium Magnetometer

1. Helium-4. Helium-4 is a two-electron atom, appearing in column 0 and in the first row of Mendeleev's table; it has no nuclear moment. Its optical spectrum, generally speaking, is very different from that of an alkali metal. Its energy terms fall into one or the other of two clear-cut categories: for para helium the spins of the two electrons cancel (S = 0), while for ortho helium they add, giving S = 1. Helium atoms cannot change by a radiative process from one of these categories of states to the other; in other words, ortho helium and para helium behave markedly as distinct atoms. For example, the ground state ¹S₀ belongs to para helium and the first excited state to ortho helium. Under electric excitation in a hf discharge (a nonradiative process), an atom may reach the ³S₁ state; it remains in this "metastable" excited state a long time, of the order of 10 msec, because it cannot return to the ground state by emitting radiation; this would violate both selection rules Δl = ±1 and ΔS = 0. These rules express the conservation of angular momentum and the prohibition is a strong one: the life of the metastable state is long. The metastable state ³S₁ may be considered as a pseudo-ground state for ortho helium: it is obtained by a mild hf discharge in He at 1 to 5 mm pressure. The density of metastable ortho helium is of the order of 5 × 10¹⁰ per cubic centimeter, comparable to that of the alkali metal vapor at 10⁻⁶ mm.

The spectrum of ortho helium is similar to that of alkali metals except that the P levels are three in number, corresponding to S = 1, m_S = +1, 0, −1, and there are in principle three D lines, D0, D1, D2. But D1 and D2 are separated by Δν = 1800 Mc/sec (i.e., Δλ = 0.091 Å only) and for simplicity may be considered here as unresolved. The aggregate D1D2 is called D3 and the spectrum when reduced to D0 and D3 is very similar to the alkali case, except that D0 is eight times less intense than D3 in theory and three times in practice; the detailed analysis is given by Franken and Colegrove (169); the diagram is shown in Fig. 38. The physical conditions also remain nearly the same as for the alkalis; the pumping light is a line in the near infrared (λ ≈ 10,830 Å), where standard optical equipment is available; lead sulfide photosensitive conductors or silicon photodiodes are used in the dc type of magnetometer; for the hf modulation type an S-1 response phototube works well. There is no need for the addition of a buffer gas; indeed the ground state para helium gas (at a few millimeters Hg pressure, p < 30 mm Hg) plays the role of a buffer gas for the active metastable ortho helium atoms. Coatings are superfluous (170) too, because the line width of the Zeeman transition is broadened to a notable extent by a secondary effect of the hf auxiliary excitation which produces the metastables; this broadening would make illusory any advantage hoped for from a coating. The broadening is the direct consequence of the finite lifetime of the metastables, due to the uncertainty relation, ΔE · τ ~ 1, which links level width ΔE measured in cycles per second and lifetime τ. The lifetime of "metastability" itself, i.e., of the whole level ³S₁, is long, a few tens of milliseconds; but, as shown in (160), the lifetimes of the individual magnetic sublevels are significantly shorter; this is natural for several reasons, among others, because the Δm_J = 1 transitions are "allowed" ones. The result is a line width of the order of 3000 cps or 1 mG.

FIG. 38. Zeeman diagram for ordinary ⁴He: the 2³S₁ level of orthohelium is so long lived that it is called "metastable." (Labels: m_J sublevels, ortho helium, para helium; 1 cm⁻¹ = 30 Gc/sec.)

The main qualities of this type of instrument are simplicity and ruggedness. Extreme simplicity may be achieved by pumping with natural light; one may even pump without lenses or polarizers, by putting the source in contact with the absorbing bulb. This possibility arises here because the total transition probabilities away from the sublevels m_J = 1 and m_J = −1 are some 3% lower than for the m_J = 0 level; by this "intensity pumping" the atoms accumulate significantly in the two m_J = ±1 sublevels²⁸ (Fig. 39). The possibility is not used in actual commercial instruments because ordinary pumping (171) produces a stronger orientation, i.e., a better signal-to-noise ratio (× 40), and also a slightly sharper Zeeman resonance.

FIG. 39. Pumping metastable ortho helium is efficient, even when using natural unpolarized light: a simple device for displaying a Zeeman line on a cathode ray oscilloscope. (Labels: RF oscillator, high voltage RF, variable RF signal generator, weak electrodeless helium discharge, CRO.)

The second main advantage is of higher value, especially in space research: there is no need, as with the alkali vapors, for temperature regulation, which takes considerable power and is a source of troubles
(.w.
The helium magnetometer seems very attractive and was built in a very practical omnidirectional form (49), embodying three sensors around one source; its chief defects are a large frequency shift (some 10 cps) under variation in light intensity, and a large line width, which corresponds to a high value for the low-field limit of space. Typical characteristics are the following:
(1) Frequency, 2.8 Mc/sec per gauss.
(2) Signal-to-noise ratio, 500.
(3) Sensitivity, 0.1 µG.
(4) Bandwidth, 1 cps (the bandwidth may be increased to 10⁴ cps at the expense of sensitivity).
(5) Minimum width, 700 µG.
(6) Temperature range, −50° to +50°C.

28 It is possible to show by a symmetry argument that this procedure produces "alignment" and not "polarization"; in other words, the gas is not made paramagnetic, because there are as many aligned atoms pointing in the positive as in the negative direction.
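The orders of magnitude quoted for helium-4 can be cross-checked against one another. The sketch below assumes only the figures given in the text (2.8 Mc/sec per gauss, a line width of the order of 3000 cps, a minimum width of 700 µG) and the rough uncertainty relation width × lifetime ~ 1; the lifetime it prints is therefore an order-of-magnitude estimate of ours, not a measured value from the references.

```python
# Minimal sketch: consistency of the helium-4 figures quoted in the text.
HE4_CPS_PER_GAUSS = 2.8e6


def width_in_milligauss(width_cps):
    """Field-equivalent line width, in milligauss."""
    return width_cps / HE4_CPS_PER_GAUSS * 1e3


def width_in_cps(width_microgauss):
    """Frequency-equivalent line width, in cps, of a width given in microgauss."""
    return width_microgauss * 1e-6 * HE4_CPS_PER_GAUSS


def lifetime_estimate_sec(width_cps):
    """Order-of-magnitude lifetime from (width in cps) x (lifetime) ~ 1."""
    return 1.0 / width_cps


if __name__ == "__main__":
    print("3000 cps in field units (mG):", round(width_in_milligauss(3000.0), 2))
    print("700 microgauss in cps       :", round(width_in_cps(700.0)))
    print("lifetime estimate (sec)     :", lifetime_estimate_sec(3000.0))
```

A 3000-cps line thus corresponds to about 1 mG, as stated, and to a sublevel lifetime of a fraction of a millisecond, much shorter than the tens of milliseconds of the metastable level as a whole.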
2. Helium-3. The rare but stable isotope ³He is the final product of tritium's radioactive transformation; it is now commercially available at a price which makes it suitable for magnetometer construction. The change from ⁴He to ³He would be very similar to that from the simplified Na model to the real Na. Indeed, the only peculiarity of ³He is its nuclear spin and its nuclear magnetic moment. Replacement of ⁴He by ³He would offer no great advantage: the presence of the nuclear spin would be only a complicating factor in the Zeeman diagram (Fig. 40); the multiplication of levels would not improve pumping efficiency; the apparent gyromagnetic ratio could be increased by a factor 4/3 by choosing the magnetic resonance of the F = 1/2 sublevel; it is reduced by a factor 2/3 for the F = 3/2 level. But ³He may be used more cleverly, as shown by Colegrove et al. (106, 172), to observe purely nuclear resonance of the nuclear magnetic moment of ³He, which shows a high and accurately known gyromagnetic ratio:

$$-\gamma = 2.03795 \times 10^{4}\ \mathrm{gauss^{-1}\,sec^{-1}}$$
This new possibility is opened by the "spin exchange" process. This phenomenon has been much studied since 1956, when it was shown (173) to be responsible for the occurrence of the celestial 21-cm hydrogen emission line. The spin temperature of H atoms in emitting clouds reaches 50°K, some 40° higher than the average radiation temperature in ordinary regions in deep space; through "spin exchange" H atoms are able to gain magnetic internal energy at the expense of the kinetic energy of H atoms in these peculiar clouds (see 122). Spin exchange has been the subject of much research and the source of important discoveries such as that (174) of Dehmelt. Generally the object of these studies is the transfer of orientation from one species of atom to another by collision. The theory of the process is very interesting too; an introductory treatment may be found in Kastler (102) and in Carver (114).
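To fix ideas on what the ³He gyromagnetic ratio quoted above means for a magnetometer, the following sketch converts it into a nuclear precession frequency. It assumes the ratio is expressed in radians per second per gauss and uses 0.5 gauss as a typical value of the earth's field, a figure used elsewhere in this chapter; the function name is illustrative only.

```python
import math

# Minimal sketch: nuclear precession frequency of helium-3 in a given field.
# The magnitude of the gyromagnetic ratio is the value quoted in the text,
# taken here to be in radians per second per gauss (an assumption).
HE3_GAMMA = 2.03795e4


def he3_precession_cps(b_gauss):
    """Precession frequency in cps: nu = gamma * B / (2 * pi)."""
    return HE3_GAMMA * b_gauss / (2.0 * math.pi)


if __name__ == "__main__":
    print("cps per gauss :", round(he3_precession_cps(1.0), 1))
    print("cps at 0.5 G  :", round(he3_precession_cps(0.5), 1))
```

The nuclear resonance therefore falls in the kilocycle range in the earth's field, far below the Mc/sec range of the electronic resonance of the metastable atoms.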
Spin exchange occurs here when, for example, a ³He atom in the ground state S = 0 (para helium), m_I = −1/2, collides with another ³He atom in one of the substates of the oriented metastable state S = 1, m_S = 1; after the encounter the spin states of both atoms are exchanged: the para helium atom is in the state S = 0, m_I = +1/2, and the metastable in the S = 1, m_S = 0 state; the probability, and correspondingly
FIG. 40. Zeeman diagram of ³He, with the hyperfine structure due to coupling between orbital electron and nucleus (I = 1/2). (Labels: ionization limit, F = 3/2 sublevels; 1 cm⁻¹ = 30 Gc/sec.)
angular momentum. Pumping with polarized light favors the accumulation of metastables in the m_S = 1 substate, and ultimately results in a population of para helium atoms, mostly in the m_I = +1/2 state. In other words, the lower nuclear state is more populated than the upper one; the corresponding bulk nuclear moment is not only enhanced in magnitude by pumping and exchange but also inverted. These are the proper conditions for obtaining maser oscillation at the nuclear resonance frequency.
Such a maser oscillator has recently been built and operated with full success (108) at Harvard. The amazing aspect of the experiment is the high degree of nuclear orientation obtained, around 40% (106). This occurs because of the high efficiency of optical pumping and the high mobility of the gas atoms: between two encounters, the pump is able to restore the proper orientation m_S = 1 in the metastable atom; it moves rapidly and may reverse the nuclear spins of many para helium atoms. The conditions are similar to those occurring in the Overhauser effect of a dilute water solution of a paramagnetic radical.

As regards magnetometry, notable progress is achieved toward absolute measurement: frequency shifts are induced by the pumping light only in the levels involved in the optical pumping transitions. The nuclear levels of the ground state para helium are free from this direct disturbance. Unfortunately, a residual perturbation still exists: these para helium atoms need activation, i.e., during the short time they are coupled to the ortho helium metastables the resonance frequency measured is not purely that of the free para helium-3. It can be shown that a small correction, depending on the rate of activation, i.e., on the intensity of the pumping light, must be introduced in the frequency law (1). The Harvard experiment has not yet been fully exploited for magnetometers, but it affords an effective remedy for the latter defect: nuclear resonance is not induced in the pumped bulb itself, but in an auxiliary one a few centimeters away; the two bulbs (Fig. 41) are connected by a pipe, and the para helium atoms diffuse into the second bulb without losing their nuclear polarization: the nucleus is so well protected by the S = 0 electronic shell that it does not feel the collisions with the wall during diffusion for some 10 minutes. On the other hand, the enhancement of the polarization magnitude is large enough to reach a signal level better than that of the protons in a benzene sample of the same volume. This is a very promising experiment.

FIG. 41. The extremely long life of nuclear orientation of ³He (some 10 sec) permits the mechanical transport of the oriented nuclei by diffusion over distances of a few inches: one can separate the bulb into two distant parts, one for optical orientation, the other for resonance. (Label: circularly polarized σ+ pumping light.)

Finally, pumping ³He with a ⁴He lamp has appeared recently (174a) to be more efficient than the use of a ³He lamp; the strong unresolved component in the ⁴He light (³S₁-³P₂ and ³S₁-³P₁) overlaps with one of the ³S₁-³P₀ transitions ( F : Q -+ 8) of ³He only and pumps it very strongly.
V. AN EXAMPLE OF DESIGN: THE CESIUM VAPOR MAGNETOMETERS
A. Introduction

The object of this section is to give some idea of the practical problems met with in designing optically pumped magnetometers. Questions of sensitivity and technological problems are dealt with for the controlled system and for the self-oscillator system in Sections V, B, 2 and V, B, 3, respectively. Certain characteristics of these instruments are concerned with the very principle of optical pumping. Special utilization conditions result therefrom. These matters are considered in Section C.

The magnetometers described below utilize cesium vapor. Various commercial versions of these instruments are at present in use in Europe. With respect to their principles they fall into the general class of instruments described in Section IV, so that the matters dealt with below are qualitatively common to all optically pumped magnetometers. But they differ in regard to one feature mentioned below.

As shown in Fig. 42, which illustrates schematically a practical arrangement of optical pumping, cesium vapor magnetometers do not make use of an interference filter for separating the D1 component; indeed, σ+ light is used for pumping, but with the two components D1 and D2; the effects of relaxation are avoided by the use of a paraffin coating only (153, 175, 176); no buffer gas (145, 146, 153, 177) is used, as explained in Section IV, B, 2, b. So Kastler's mode of pumping (as described in Section IV, A, 1, b) is used.

FIG. 42. Practical optical pumping arrangement. (Labels: lens, resonance cell, photoelectric cell, circular polarizer.)

It is worthwhile to mention that, for this type of pumping, experiment appears to offer conclusions that are at variance with theoretical prediction. All that has been said about orientation remains valid, but, as regards detection, theory states (178) that it is impossible to show a difference of population by a variation of light absorption if the intensities are equal on D1 and D2. But in practice, experiment shows that with bulbs containing no foreign gas beyond a small residual pressure, and merely coated with paraffin, it is preferable to use the two lines simultaneously. An explanation of this discrepancy, already mentioned, may be put forward: the first vapor layers through which the light passes provide an effective selection of D1 which is even more efficient than the use of an interference filter, especially when the losses inherent in these devices are taken into consideration. Although this point has not been fully cleared up, the cesium magnetometers described here use the two lines of equal over-all intensity.
B. The Magnetometers

1. Principles of Magnetometers. Considered in a very simple way, the magnetic resonance phenomenon detected optically can be described as follows (Fig. 43). The pumping light orients the magnetic moments of the cesium atoms contained in the resonance cell in the direction of the external field B. This orientation affects light transmission through the bulb, because absorption is substantially proportional to the magnetic moment in the direction of the light beam. If a magnetic field 2B1 oscillating at frequency ν is applied simultaneously at right angles by means of coils, the dipoles are acted upon and also precess at frequency ν about B; their movement locks on one of the rotating components²⁹ B1 of the oscillating field. Consequently, the moment M_z along B decreases, and light transmission also. The effect is a maximum at resonance, i.e., when ω = 2πν = γ_i B, where γ_i is a proportionality constant defined by Eq. (23). Figure 44 shows the variation of the transmitted light intensity I against the frequency ν_g of the applied hf field.

29 As shown in Fig. 1, the active component rotates in the negative sense around B if γ is positive and in the positive sense if γ is negative.
As explained in Section IV, two techniques are possible for monitoring the magnetic resonance frequency automatically. In the first case, the component M_z parallel to the field B is used. This component passes through a minimum at resonance; the minimum is used for driving a generator, thus leading to the so-called "controlled" magnetometer. In the second case, the perpendicular component of M is used to modulate the light in the manner explained in Section IV, B, 3.

FIG. 43. Schematic description of optical pumping.

FIG. 44. Optical detection of magnetic resonance. (Labels: transmitted intensity I, ΔI ~ I/100, frequency ν_g.)

2. The Controlled Magnetometer. a. Principle. The pumping light is provided by a source consisting of a cell containing cesium vapor raised to luminescence by high-frequency excitation. This light is focused by lenses and circularly polarized (σ+). The light then passes through a second bulb, known as the resonance cell, which also contains cesium vapor and is subjected to an rf magnetic field provided by an adjustable frequency generator. The transmitted light is then collected by a photoelectric cell. As already mentioned, the amount of light absorbed, proportional to M_z, depends on the difference between the frequency supplied by the generator and the resonance frequency. The principle of control is shown in Fig. 45.
FIG. 45. Principle of the controlled magnetometer. (Labels: rotation frequency ν_g + Δν_g sin Ωt, circular polarizer, amplifier, AF oscillator, synchronous detector.)
Figure 46 explains the elaboration of the error signal; the intensity of the light transmitted by the resonance cell varies with the frequency ν_g supplied by the generator. The latter is frequency modulated at a low frequency Ω so as to scan the resonance line about its mean value. Owing to this scanning, the light transmission coefficient of the resonance cell is also modulated and a signal of frequency Ω is then collected at the photoelectric cell. The phase of the detected signal with respect to the initial modulating signal depends on the position of the mean generator frequency with respect to the middle of the resonance line. When the two low-frequency signals are applied to a synchronous detector the latter delivers a signal s = dI sin φ, where dI is the amplitude of the fundamental frequency of the detected signal and φ is its phase. The signal s shows the proper behavior to constitute the error signal. This is used in a control loop to maintain the generator frequency ν_g coincident with the resonance frequency ν, hence the term "controlled magnetometer."

FIG. 46. Principle of detection by frequency modulation. (Label: s = dI sin φ.)
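The behavior of the error signal can be illustrated numerically. The sketch below is a simplified model, not the authors' circuit: the resonance is taken as a Lorentzian dip in the transmitted light, with a depth of 1% and a half-width of 50 cps chosen to match the orders of magnitude quoted in this chapter; the modulation deviation and the sample count are arbitrary assumptions. Synchronous detection is modeled as multiplication by the modulating sinusoid and averaging over one modulation period.

```python
import math


def transmitted_intensity(freq, center, half_width, depth):
    """Transmitted light (arbitrary units) with a Lorentzian absorption dip."""
    x = (freq - center) / half_width
    return 1.0 - depth / (1.0 + x * x)


def error_signal(mean_freq, center, half_width=50.0, depth=0.01,
                 deviation=20.0, samples=1000):
    """Component of the photocell signal in phase with the modulation.

    The generator frequency is mean_freq + deviation * sin(phase); the
    product with sin(phase) is averaged over one full modulation period.
    """
    total = 0.0
    for k in range(samples):
        phase = 2.0 * math.pi * k / samples
        freq = mean_freq + deviation * math.sin(phase)
        light = transmitted_intensity(freq, center, half_width, depth)
        total += light * math.sin(phase)
    return 2.0 * total / samples


if __name__ == "__main__":
    line_center = 350000.0  # cps, an arbitrary illustrative resonance frequency
    for offset in (-10.0, 0.0, 10.0):
        print(offset, error_signal(line_center + offset, line_center))
```

In this model the detected component changes sign as the mean generator frequency crosses the line center and vanishes at the center itself, which is the property a control loop needs from an error signal.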
b. Limit sensitivity. The problem has been thoroughly discussed by Malnar (179). It will be approached here in a slightly different way. The main considerations are the following:

(1) It is assumed that the center of the resonance line does not depend on physical parameters other than the magnetic field. Some reservations have to be made as regards this statement; they are examined later on. The problem merely amounts to marking the middle of the resonance line, and sensitivity will be limited by the precision with which this can be done.

(2) Sensitivity is defined as the minimum value of the detectable field variation (root-mean-square value); it is readily deduced from the control principle. Near the center of the line one has (Fig. 47)

$$\delta B = \frac{N}{ds/dB} \qquad (29)$$

where ds/dB is the derivative of the signal s with respect to the field, and N is the sum of all background noise which accompanies the signal s. (It is immaterial whether one considers the frequency or the field, since both are related by the relation 2πν = γ_i B, where γ_i = 0.35 cps per microgauss.)

FIG. 47. Determination of the center of a resonance line.

Looking at Fig. 47, one sees that the slope ds/dB is proportional to S/ΔB, S being proportional to ΔI/ΔB, and hence

$$\delta B \propto N\,\frac{\Delta B^{2}}{\Delta I}$$

but ΔB, ΔI, and N depend on the quantity of light I, on the hf field B1, and on the frequency and amplitude of the modulation used. As regards the amplitude of modulation, it can be chosen with the help of theory, which shows that maximum slope corresponds to an appreciable widening of the resonance line by modulation (180, 181); a compromise should be adopted.
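The definition of sensitivity as noise divided by discriminator slope, and the equivalence between field and frequency units, can be made concrete with a short sketch. The noise and slope values used below are hypothetical numbers chosen only to exercise the formula; the 0.35 cps per microgauss factor is the cesium figure quoted in the text.

```python
# Minimal sketch of delta_B = N / (ds/dB), with purely illustrative inputs.
CS_CPS_PER_MICROGAUSS = 0.35


def min_detectable_field(noise_rms, slope_per_microgauss):
    """Smallest detectable field change, in microgauss."""
    return noise_rms / slope_per_microgauss


def field_to_frequency(delta_b_microgauss):
    """Equivalent change of the cesium resonance frequency, in cps."""
    return CS_CPS_PER_MICROGAUSS * delta_b_microgauss


if __name__ == "__main__":
    noise = 1.0e-3   # rms noise on the signal s (arbitrary units, assumed)
    slope = 1.0e-2   # ds/dB in the same units per microgauss (assumed)
    delta_b = min_detectable_field(noise, slope)
    print("delta B (microgauss):", delta_b)
    print("equivalent shift (cps):", field_to_frequency(delta_b))
```

With these assumed inputs the detectable field change is 0.1 µG, i.e., a shift of a few hundredths of a cycle per second in the resonance frequency.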
A calculation of the optical pumping cycle would be necessary to make S and ΔB optimum, but it is as yet not generally possible, especially for Cs owing to its complicated structure (I = 7/2; see Fig. 48). What can reasonably be done is to plot experimentally the evolution of ΔI and ΔB against I and B1 to determine the optimum values.

FIG. 48. Energy levels of the ground state and of the excited state of cesium. (The Zeeman splitting of the excited state is not shown.)
As regards background noise, it is possible to identify the different causes and to assess the respective contribution of each (179). The following conclusion is finally attained. The intensity of the resonance line ΔI and the width of the line ΔB are simple functions of the light intensity I and of the rf field B1. The minimum theoretical detectable δB can be determined entirely from the two parameters I and B1. Experimental measurements permit plotting the useful quantity ds/dB that appears in formula (29) and which in a way represents the slope of the discriminator constituted by the resonance line (see Fig. 47). This slope ds/dB is represented by the curve of Fig. 49, plotted at constant light against 2B1.
FIG. 49. Variation of the slope ds/dB against the hf field 2B1.

FIG. 50. Variation of line width against light intensity.
The top of the curve shows the optimum working point, which permits choosing the value of the field 2B1 for a given amount of light I. At this point it is possible to give a simple expression for the quantity ds/dB as a function of the light intensity only, for line widths plotted against I lie substantially along a straight line (Fig. 50). A straight line is also obtained for the curve that represents ΔI as a function of I² (Fig. 51).
These results permit writing the following empirical law of evolution for the quantity ΔI/ΔB², which is proportional to ds/dB:

$$\frac{ds}{dB} \propto \frac{I^{2}}{(\Delta B_0 + \alpha I)^{2}}$$

If it is assumed that the sole cause of noise is the ultimate one, i.e., luminous shot effect, the noise N can be written in the form

$$N \propto (I\,df)^{1/2}$$

where df is the information passband used, i.e., the passband of the final recorder.

FIG. 51. Absorption ΔI near and at resonance against light intensity.

The minimum detectable field then takes the following simple expression:

$$\delta B \propto (df)^{1/2}\,\frac{(\Delta B_0 + \alpha I)^{2}}{I^{3/2}}$$

As for any measuring instrument, one may use this result to define a figure of merit, Q, for the magnetometer:

$$Q = \frac{\delta B}{(df)^{1/2}} \propto \frac{(\Delta B_0 + \alpha I)^{2}}{I^{3/2}}$$

Q may be made a minimum by differentiating it with respect to I; a very "broad" minimum is obtained for

$$\alpha I = 3\,\Delta B_0$$
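The position and the breadth of this minimum are easy to verify numerically. The sketch below evaluates Q, up to a constant factor, as (ΔB₀ + αI)²/I^(3/2) on a fine grid of light intensities; the units are arbitrary, and ΔB₀ = 1 and α = 1 are normalizations chosen for convenience, so that the optimum is expected at I = 3.

```python
# Minimal sketch: locate the minimum of Q(I) = (dB0 + alpha*I)^2 / I^(3/2).
# Units are arbitrary; dB0 = 1 and alpha = 1 are convenient normalizations.


def figure_of_merit(light, width0=1.0, alpha=1.0):
    """Q up to a constant factor, for a given light intensity."""
    return (width0 + alpha * light) ** 2 / light ** 1.5


def best_light(width0=1.0, alpha=1.0, steps=100000, max_light=20.0):
    """Light intensity minimizing Q, found by a plain grid search."""
    best_i, best_q = None, float("inf")
    for k in range(1, steps + 1):
        light = max_light * k / steps
        q = figure_of_merit(light, width0, alpha)
        if q < best_q:
            best_i, best_q = light, q
    return best_i


if __name__ == "__main__":
    optimum = best_light()
    print("optimum alpha*I / dB0:", round(optimum, 3))
    for factor in (0.5, 2.0):
        ratio = figure_of_merit(factor * optimum) / figure_of_merit(optimum)
        print("Q penalty at", factor, "x optimum light:", round(ratio, 3))
```

In this normalization, halving or doubling the light around the optimum degrades Q by only about 10%, which illustrates how broad the minimum is.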
This shows that the amount of light is not very critical and that there is no need to exceed certain light intensities. When adjusted to satisfy these optimum conditions, the cesium magnetometer would provide a theoretical minimum figure of merit :
Such an extremely good quality would be difficult to assess in practice. In actual fact this limit is never attained, because the actual background noise is substantially higher than the shot noise contribution of the photocell; the input of the amplifier makes an important contribution (low-frequency flicker noise). It should also be noted that in order to avoid certain rotation effects, magnetometers are never adjusted to the optimum point, but to higher values of the field B1. This point is explained later (Section C, 1). For these various reasons, sensitivities observed in practice are of the order of 0.1 µG with 1-cps passbands; this order of magnitude appears quite sufficient for current geophysical problems.

c. Associated electronic problems. (1) Requirements. In practice, optimum pumping conditions, described in the preceding section, are readily obtained. But it is difficult to attain the theoretically predictable sensitivities. Experience shows that the performance obtained is most usually limited by the quality of the associated electronics. It therefore seems useful to give it some consideration and to examine the engineering problems raised in the design. One must weigh the following considerations:
(i) The control error, which is unavoidable, has to be a minimum since it brings about a systematic error in absolute magnetometers. In the case of mobile instruments the control error may introduce additional rotation effects on account of gain variations related to the orientation (this point is taken up again in the next paragraph).
(ii) The dynamic range has to be sufficient to avoid excessively frequent adjustments; for example, a magnetometer used for measuring the earth’s field has to cover one octave (from 200,000 to 700,000 μG).
(iii) The frequency generator has to cover this range by means of a simple control; it is therefore, in principle, not very stable, and the control has to be sufficiently effective to correct this instability.
(iv) The closed-loop cutoff frequency that finally determines the information passband has to be compatible with long storage in open loop. This storage is necessary because a controlled instrument requires a search system during the starting phase so as to bring about coincidence
between the generator frequency and the resonance frequency. This search demands a fairly long time, about 1 to 2 minutes. In operation, it may happen that the control loop fails temporarily for one reason or another; in this case the generator returns to its resting frequency with a time constant that is that of open-loop control. If this time constant is sufficient to ensure that the generator frequency remains within the line width for the duration of the interruption, the system will automatically resume operation as soon as normal conditions are re-established. In this way the loss of time in the search system will be obviated.

The type of control system that would best correspond to these conditions would undoubtedly be an integrator system. But in practice these systems demand rotary machines and their drawback is that they are magnetic: this excludes the possibility of a compact design. Moreover, for reasons of inertia and friction they possess a threshold and fail to respond to too weak an electrical action. This excludes any fine control, unless the machine is associated with a second electronic control acting as a vernier. This combination may be a very useful compromise when the electronics can be separate from the actual magnetic probe. In the design example that will now be described, the solution adopted uses a purely electronic system with a double time constant; this solution, which somewhat approaches ideal integration, is analyzed below.

FIG. 52. Control system.

(2) Control analysis. (i) The servo’s equation: Figure 52 shows the control circuit. The frequencies are converted to a magnetic field in accordance with the fundamental law (23):

$$\omega = 2\pi\nu = \gamma B$$
Using it, we may define various values of B which will also be useful later on: B, corresponds to the resting generator frequency, Bi is the field read on the instrument, and B is the field to be measured. G is the
gain on open loop; it can be written in the following form:
G0 is the total gain of the loop on direct current, taking into account both the actual amplifier gain and the transfer coefficient of the optical pumping system as a whole. The term (1 + jωτ′) relates to the phase lead network (rc = τ′) whose role will be explained later. The term in the denominator takes account of the dependence of gain on frequency; the factor (1 + jωτ)² relates to the two time constants RC actually inserted in the loop; they are of the order of 60 seconds. The product Πi relates to all the time constants inherent in the actual circuit; the main ones are those of magnetic resonance (relaxation time), of the photoelectric cell, and of the low-frequency amplifier. These various time constants are short compared with those corresponding to the information passband chosen, so that the circuit can be analyzed while neglecting them. Finally, the expression for the gain becomes
$$G = G_0\,\frac{1 + j\omega\tau'}{(1 + j\omega\tau)^2}$$
and the magnetometer response may be written
These expressions describe the behavior of the loop as a function of gain Go as well as of the time constants chosen. The main results are the following. (ii) Static control error: The static control error is the difference between the values read and the real values:
This error is a maximum when Bi is at one end of the range; if a total dynamic range of 2 × 250,000 μG is chosen, the gain G0 required in order to maintain an error of the same order as the minimum detectable difference has to be greater than 2.5 × 10⁶:

$$G_0 > 2.5 \times 10^6 \qquad (31)$$
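The gain requirement can be reproduced with a few lines of arithmetic. The sketch below assumes the usual relation for a proportional loop of d-c gain G0, in which the static error equals the offset between the field to be measured and the generator's resting field divided by (1 + G0); this relation and the function names are illustrative assumptions, since only its consequence, condition (31), is used here.

```python
def static_error_uG(offset_uG, g0):
    """Static control error (microgauss) for a given offset from the resting field.

    Assumes error = offset / (1 + G0), the standard result for a loop of d-c gain G0.
    """
    return offset_uG / (1.0 + g0)


def required_gain(max_offset_uG, max_error_uG):
    """Smallest d-c gain that keeps the static error below max_error_uG."""
    return max_offset_uG / max_error_uG - 1.0


# Half of the 2 x 250,000 microgauss range, with a 0.1 microgauss tolerated error
g0_min = required_gain(250_000.0, 0.1)
print(f"required G0 is about {g0_min:.3g}")
print(f"error at G0 = 2.5e6: {static_error_uG(250_000.0, 2.5e6):.4f} microgauss")
```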
(iii) Generator’s frequency noise: This noise is due to fluctuation of the generator’s mean resting frequency ; it is obtained by differentiating
formula (30) with respect to B,. Its rms value is
or where
R appears as a reduction factor giving the amount of attenuation of generator noise by the control.
FIG.53. Reduction factor R of generator noise.
The curves of Fig. 53 show the values of R against frequency for two special values of τ′. Curve (a) corresponds to τ′ = 0; in this case there is no phase correction. The reduction factor shows a sharp maximum for the value ωτ = G0^(1/2). Ordinarily the maximum corresponds to unstable conditions in which generator noise is actually amplified. This defect can be avoided by a suitable choice of τ′. Curve (b) corresponds to the special value τ′ = τ(2/G0)^(1/2), which is the minimum value of τ′ that will ensure that the reduction factor is always less than 1, i.e., that it actually produces a reduction. R then takes the simple form

$$R = \left[1 + \left(\frac{G_0}{\omega^2\tau^2}\right)^2\right]^{-1/2} \qquad (34)$$
The attenuation exceeds 20 db when ωτ is less than (G0/10)^(1/2). In practice generator noise is negligible at frequencies that satisfy this inequality:

$$\omega\tau < (G_0/10)^{1/2} \qquad (35)$$
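The behavior of the reduction factor can be illustrated numerically. The sketch below assumes the open-loop gain G = G0(1 + jωτ′)/(1 + jωτ)² described in the text and takes R = |1/(1 + G)|, the usual attenuation of a disturbance injected at the generator; the function names and the scan range are illustrative.

```python
import math


def open_loop_gain(x, g0, lead_ratio):
    """G(j*omega) with x = omega*tau and lead_ratio = tau'/tau (assumed form)."""
    return g0 * (1 + 1j * x * lead_ratio) / (1 + 1j * x) ** 2


def reduction_factor(x, g0, lead_ratio):
    """R = |1 / (1 + G)|: attenuation of generator noise by the closed loop."""
    return abs(1.0 / (1.0 + open_loop_gain(x, g0, lead_ratio)))


G0 = 2.5e6
LEAD = math.sqrt(2.0 / G0)            # tau'/tau for curve (b)

# Curve (a), no phase lead: R evaluated at omega*tau = sqrt(G0) is far above 1
peak_a = reduction_factor(math.sqrt(G0), G0, 0.0)

# Curve (b): R stays below 1 over a logarithmic scan of omega*tau from 1 to 1e6
scan = [10 ** (k / 200.0) for k in range(1201)]
max_b = max(reduction_factor(x, G0, LEAD) for x in scan)

# Curve (b) at the limit of condition (35): attenuation close to 20 db
r_limit = reduction_factor(math.sqrt(G0 / 10.0), G0, LEAD)

print(f"curve (a) at omega*tau = sqrt(G0): R = {peak_a:.0f}")
print(f"curve (b) maximum over scan:       R = {max_b:.6f}")
print(f"curve (b) at limit of (35):        {-20 * math.log10(r_limit):.1f} db")
```

Without phase lead the noise is amplified by several hundred near ωτ = G0^(1/2), which is why the lead network is needed.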
(iv) Open-loop storage: A precise meaning can be given to storage by defining it in the following way: it is the time taken by the generator to sweep across the width of the resonance line when left to itself after a sudden break in the control chain. The law of evolution of the generator frequency is then of the form
(This is the law of evolution of a system with two time constants subjected to a sudden variation.) The most unfavorable case occurs when the generator stands at the end of the range. In the case Bi(0) − B, = 250,000 μG, and since the line width at resonance is of the order of 500 μG, Bi(t1) − Bi(0) has to be not more than 500 μG if storage is to be at least equal to t1. In view of the law of evolution a third condition results for τ, which is

$$\tau > 15\,t_1 \qquad (36)$$
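Condition (36) can be recovered numerically. The sketch below assumes that, after the loop opens, the offset from the resting field decays as (1 + t/τ)·exp(−t/τ), the step response of two equal time constants starting with zero slope; this specific law and the function names are assumptions made for illustration.

```python
import math


def drift_uG(t, tau, offset_uG=250_000.0):
    """Field drift (microgauss) a time t after the loop opens (assumed decay law)."""
    u = t / tau
    return offset_uG * (1.0 - (1.0 + u) * math.exp(-u))


def storage_time(tau, line_width_uG=500.0, offset_uG=250_000.0):
    """Time for the drift to reach one line width, found by bisection."""
    lo, hi = 0.0, 10.0 * tau
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if drift_uG(mid, tau, offset_uG) < line_width_uG:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


TAU = 60.0                                  # seconds
t1 = storage_time(TAU)
print(f"storage t1 = {t1:.2f} s, tau/t1 = {TAU / t1:.1f}")
```

With τ = 60 seconds the storage comes out close to 4 seconds and τ/t1 close to 15, in line with condition (36).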
(v) Passband: This band is the information passband, i.e., that over which the instrument remains accurate. It is equal to the value of ω at which the closed-loop gain drops by 3 db or, practically,

$$|y(j\omega)| = 0.707$$

As a result of the calculations, the frequencies transmitted with an attenuation less than 3 db have to satisfy the inequality:

$$\omega\tau < 2\,G_0^{1/2} \qquad (37)$$
It will be noticed that condition (35) is more stringent than condition (37). As an illustration, the following is a practical example of a control system of the type discussed above. To ensure a dynamic range of 2 × 250,000 μG with a control error less than 0.1 μG and a storage of 4 seconds, the following must obtain:

G0 = 2.5 × 10⁶
τ = 60 seconds
τ′ = 60 msec

Generator noise is then negligible up to a frequency f = 1.5 cps (condition (35)), and the useful passband is about 6 cps (condition (37)).
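The orders of magnitude in this example can be checked directly. The sketch below reads conditions (35) and (37) as ωτ < (G0/10)^(1/2) and ωτ < 2·G0^(1/2), and takes τ′ = τ(2/G0)^(1/2); placing ωτ on the left-hand side of (37) is an assumption made for dimensional consistency, and the variable names are illustrative.

```python
import math

G0 = 2.5e6        # d-c loop gain
TAU = 60.0        # loop time constant, seconds

# Minimum phase-lead time constant, tau' = tau * (2/G0)**0.5
tau_lead = TAU * math.sqrt(2.0 / G0)

# Upper frequency for negligible generator noise, condition (35)
f_noise = math.sqrt(G0 / 10.0) / (2.0 * math.pi * TAU)

# Closed-loop 3-db passband, condition (37) as read above
f_pass = 2.0 * math.sqrt(G0) / (2.0 * math.pi * TAU)

print(f"tau' = {tau_lead * 1e3:.0f} msec")
print(f"noise negligible below {f_noise:.1f} cps")
print(f"passband about {f_pass:.1f} cps")
```

Under these readings the results are about 54 msec, 1.3 cps, and 8 cps, of the same order as the rounded figures quoted in the example.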
d. An example of practical design. (1) General scheme. Figure 54 shows the circuit diagram of a complete, controlled magnetometer including, in addition to the control electronics, the auxiliary circuits, such as the starting arrangements, the light source excitation circuits, and the thermal regulation circuits. These various components will be briefly considered.

(2) Control. (i) Generator: In order to cover a wide dynamic range, the generator consists of two oscillators whose mutual beat frequency is collected.
FIG.54. Complete controlled magnetometer arrangement.
The frequencies to be obtained extend from 70 kc/sec to 250 kc/sec and the oscillator frequencies are around 2 Mc/sec. The use of two identical oscillators practically avoids fluctuations of external origin. The frequency of each oscillator is controlled by varicaps which receive either a linearly variable voltage during the search phase, or an error voltage during the normal working phase.

(ii) Modulator: One needs to scan the line in the resonance cell, and also to measure accurately the central frequency. For this purpose, the system adopted is of the phase modulation type. This permits two separate channels at the output of the carrier generator; one feeds the frequency counter directly; the second leads to the resonance cell,
through a low-frequency phase modulator. One then gets for measurement a signal free of the modulation required for control. This last kind of modulation would not be troublesome in the case of analog discrimination, for which it is possible to apply filters. But it may introduce an error when the frequency is counted digitally, for if we write
$$s(t) = \sin\left[2\pi\nu t + \frac{\Delta\nu}{F}\,\sin 2\pi F t\right]$$

for the expression of a signal modulated at frequency F with an excursion Δν, a digital counter gives the number of maxima of a sinusoidal function during the counting time Δt. The frequency read is then

$$\nu + \Delta\nu\,\frac{\sin 2\pi F\,\Delta t}{2\pi F\,\Delta t}$$
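The size of this counting error is easy to evaluate. The sketch below assumes an instantaneous frequency ν + Δν·cos(2πFt) and a counting gate that opens at t = 0, and compares the closed-form mean with a direct numerical average; the numerical values of ν, Δν, F, and the gate times are hypothetical and chosen only for illustration.

```python
import math


def counted_frequency(nu, dnu, f_mod, gate):
    """Mean frequency over the gate: nu + dnu * sin(2*pi*F*dt) / (2*pi*F*dt)."""
    arg = 2.0 * math.pi * f_mod * gate
    return nu + dnu * math.sin(arg) / arg


def counted_frequency_numeric(nu, dnu, f_mod, gate, steps=100_000):
    """Same mean obtained by midpoint integration of nu + dnu*cos(2*pi*F*t)."""
    h = gate / steps
    modulation = math.fsum(
        dnu * math.cos(2.0 * math.pi * f_mod * (k + 0.5) * h) for k in range(steps)
    )
    return nu + modulation * h / gate


NU, DNU, F_MOD = 157_441.0, 50.0, 30.0      # cps, hypothetical values

err_bad = counted_frequency(NU, DNU, F_MOD, 0.105) - NU    # gate not a whole number of periods
err_good = counted_frequency(NU, DNU, F_MOD, 0.100) - NU   # gate spans exactly 3 periods

print(f"error with 0.105 s gate: {err_bad:+.2f} cps")
print(f"error with 0.100 s gate: {err_good:+.2e} cps")
```

The error vanishes when the gate spans a whole number of modulation periods, which shows why the error depends on both F and Δt.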
The second term represents an error which depends both on the modulating frequency and on the counting time.

(iii) Amplifier chain and synchronous detector: To avoid unwanted phase rotation in the control it is useful to provide wide passbands, but such passbands emphasize noise in the receiver. For example, fast-changing magnetic fields of industrial origin are picked up at the resonance cell and are liable to cause troublesome interference at the output of the synchronous detector. This error may be important if the modulating frequency is in a simple ratio to the interference frequencies, and compromise is a difficult matter.

(3) Starting arrangements. For starting purposes, a single sweep causes the generator to cover the whole of the dynamic range of frequencies; when it crosses the resonance line a signal of frequency twice the modulating frequency appears. This signal stops the sweep and closes the control loop. The start of the sweep is controlled by the temperature of the light source, for it is that temperature that requires the longest time for establishment.

(4) Light source. The light source is the most important part in systems utilizing optical pumping. The main problems that arise are the following:
(a) The firing of the lamps is aided by strong electric fields, but the discharge is more stably sustained by high-frequency excitation of the magnetic type.
(b) The impedance of a gas discharge tube is quite different in the hot and in the cold states; and since it is impossible to insert the lamps directly in the circuits of the excitation oscillators on account of the disturbing magnetic fields, a return loop has to be provided.
(c) Basically, better results are obtained by controlling the lamp temperature by heating independently of the high-frequency excitation.
(d) The heating arrangements have to ensure heat distribution such that drops of alkali metal will not mask the light in front of the output pupil. This is specially important in the case of mobile equipment, which may take almost any configuration in the gravity field.
(e) The light spectrum emitted depends strongly on the optical paths covered and on the distribution of liquid masses inside the lamp. But this distribution may vary with time or with the orientation, and so on.

Most of these problems have to be solved by cut-and-try methods. Various solutions have been described. Gourber (144) described that which has been adopted for the cesium magnetometer. It enables lamps to be produced with a life of several thousand hours, with noise not exceeding the minimum theoretical level, viz., the Schottky noise. Temperature control is one of the conditions of lamp stability. Further, detected signals pass through a maximum at a given temperature of the resonance cell, a temperature which is the result of a compromise between a sufficiently high vapor pressure and reasonably long relaxation times. For feeding the thermostats it is convenient to use variable power oscillators controlled by thermistor bridges, for they are progressive and cause no magnetic disturbance provided some elementary circuit design precautions are taken.

3. Self-Oscillator System. a. Principle. Figure 55 shows the circuit diagram of the system. Here, in principle, a second light beam is set at right angles to the first and the variation of a component Mx of the magnetization is observed on this beam. But this component oscillates at the frequency of the signal injected in the coil and the amplitude of oscillation passes through a maximum at resonance.
The signal collected in the photoelectric cell is amplified, its phase is shifted, and it is reinjected in the coils. The system then oscillates at the resonance frequency. This is the principle of the self-oscillator magnetometer; in theory it appears simply as an optical kind of “spin coupled oscillator”; the use of an optical sensor in the feedback loop does away with the presence of residual coupling by induction, which is troublesome in other spin oscillators. The optical spin oscillator is a very good one. In practice it is possible in this system to use a single light beam set at 45° to the direction of the field to be measured. In this case the components involved are the components of B in the direction of the light beam and the perpendicular direction. The rf field B1 may be applied in a direction perpendicular or parallel to the light beam,
for the active component is perpendicular to B. But for reasons of mechanical symmetry it is preferable to place the resonance coil parallel to B.

b. Sensitivity. This point has been discussed by Bloom (131). In practice the sensitivity reaches the same value as in the case of controlled systems, e.g., some 0.1 μG for an information passband of 1 cps.

c. Electronic problems. (1) Amplifier equivalent scheme. The single-beam self-oscillator magnetometer can be described schematically as follows. The subensemble of the system comprising the excitation coil, the precessing spins, the probing beam, and the photocell may be simulated by a simple series resonant circuit; this is inserted in the return path of the feedback loop, connecting output and input of a wide-band amplifier (Fig. 56). To complete the reaction circuit a black box represents the whole of the auxiliary functions met with in the physics of the self-oscillator system, in particular the ±π/2 phase shift, whose sign depends on whether the angle between the magnetic field and the light beam is greater or less than 90°.

FIG. 55. Principle of self-oscillator magnetometer.

(2) Requirements. The system represented by Fig. 56 has to oscillate as nearly as possible at the proper frequency of the oscillating circuit.
Two conditions have to be satisfied:
(a) The amplifier gain has to be sufficient to ensure oscillation. This inequality condition is readily satisfied.
(b) The phase shift of the amplifier has to be strictly equal to ±π/2 over the whole of the frequency band used; otherwise the system will oscillate at the frequency at which the closed-loop phase shift is zero (provided the gain is adequate). This is an equality condition which must be satisfied with great accuracy.
FIG.56. Equivalent circuit for the self-oscillator magnetometer.
Indeed, the error in the magnetic field due to the phase error is approximately given by the following expression:
where ΔB is the line width and dφ the phase error. In view of the line width used, about 500 μG, phase errors would have to be less than 0.02° to ensure errors of measurement less than 0.1 μG. In practice amplifiers³⁰ can be constructed with a phase error of about 1° over a frequency band of one octave. Beyond this limit important difficulties appear owing to reproducibility problems.
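These figures can be reproduced under a simple assumption. The sketch below takes the resonance to be a Lorentzian line of full width ΔB, for which a phase error dφ in the loop pulls the oscillation off center by (ΔB/2)·tan(dφ); this relation is a standard property of a single-pole resonance and is assumed here, and the function names are illustrative.

```python
import math


def field_error_uG(phase_error_deg, line_width_uG=500.0):
    """Field error (microgauss) caused by a loop phase error, Lorentzian line assumed."""
    return 0.5 * line_width_uG * math.tan(math.radians(phase_error_deg))


def max_phase_error_deg(field_error, line_width_uG=500.0):
    """Largest phase error (degrees) compatible with a given field error."""
    return math.degrees(math.atan(2.0 * field_error / line_width_uG))


print(f"0.02 deg -> {field_error_uG(0.02):.3f} microgauss")
print(f"1 deg    -> {field_error_uG(1.0):.2f} microgauss")
print(f"0.1 microgauss allows {max_phase_error_deg(0.1):.4f} deg")
```

A phase error of 1°, the practical figure for an octave-wide amplifier, then corresponds to a few microgauss.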
³⁰ Constant phase amplifiers are less common than the constant gain or constant delay types, but the same principle applies for obtaining maximum flatness of the phase characteristic; phase compensating equalizers are used.

C. Optimum Working Conditions and Limitations in Use

1. Mobile Equipment. This class includes airborne magnetometers or those carried in vehicles or drogues for detecting anomalies in geomagnetic surveys, and space magnetometers for measuring extraterrestrial fields. The importance of dynamic range in this type of instrument is obvious. But in most cases absolute precision is of secondary importance; what is really required is to compare magnitudes at different points, so that sensitivity is the chief quality. But since magnetometers may assume varying configurations with respect to the direction of the magnetic field, secondary effects may become important. In particular, the instrument’s sensitivity may depend on its orientation with respect to the field to be measured; in addition, the measurement of the field can be affected by rotating the instrument, which brings about systematic errors related either to the orientation or to the speed of rotation.

a. Dynamic range. This is the range of magnetic field over which the instrument is capable of taking measurements without readjustment. As explained before (see Sections I, A and IV, A, 1, c), optically pumped magnetometers are unaffected by a weakening of the field observed under magnetic resonance until one reaches the low level of 100 μG (i.e., one line width). In practice, however, the electronic system will limit the dynamic range. For the case of the controlled system this limit, which is fixed by control error conditions, has already been discussed in the preceding section. In the case of the self-oscillator system, the dynamic range is related to the width of the passband in which constant phase can be secured. As soon as the phase differs from the required value, a systematic error is obtained in the absolute value measured. The consequences of this effect are thus identical with those in the controlled system. In both cases the dynamic range can be defined only by fixing the error which can be tolerated in the measurement of the absolute value of the field. Taking this condition into consideration, it is technically much easier to secure a wide dynamic range with the controlled system. For example, if it is required to cover the band of 200,000 to 700,000 μG, the control error is easily held at a value of 1 μG with a controlled magnetometer, whereas with a self-oscillator magnetometer it is difficult to secure a phase error corresponding to a setting error of 10 μG.

b. Dependence of sensitivity on orientation.
The theory of the magnetometer, given in Section IV, is based on the assumption that the pumping light is transmitted in the direction of the magnetic field B and that the rf field B1 is applied at right angles to this direction. If such is not the case, light and the rf field are involved only in proportion to their components in the ideal directions. On this account the values of the detected signals, and the line widths, vary with the orientation of the magnetometer, and the same applies to sensitivity. These variations may be predicted from the results given in Section V, B, 2. In particular in the case of the controlled magnetometer an emf proportional to the signal is easily measured at twice the modulation frequency near resonance. In the notations of Section B, 2, this emf is substantially proportional to ds/dB, s being the signal. It can therefore be used for measuring sensitivity. An experimental diagram of sensitivity
against orientation obtained in this way is shown in polar coordinates by the curve of Fig. 57, where the modulus of the vector in direction θ is proportional to the signal, i.e., to ds/dB. This diagram refers to rotation about a fixed axis perpendicular to B. This is the most unfavorable case in practice since the field B1 is taken as very strong in the falling part of the curve of Fig. 49. In particular the diagram shows that the detectable variation in strength does not exceed twice the ideal minimum in a cone of 70° apex angle. This permits using a single magnetometer in the airborne version over the whole region of the globe where the field inclination exceeds 50°, without appreciably affecting performance.

FIG. 57. Dependence of sensitivity on orientation in the case of a controlled magnetometer.

The diagram of the self-oscillator system would be of the same type if it included two light beams at right angles, one for pumping and the other for detection. Since in practice a single beam is used, giving optimum sensitivity at 45° to the field, the sensitivity diagram becomes zero at 0° and 90° from the field. The experimental curve of Fig. 58 gives a general idea of this diagram. The space diagram is symmetrical about B. Consideration of the difference between Figs. 57 and 58 helps in choosing between the two systems for any specific application. For example, the use of a single self-oscillator as an airborne magnetometer appears unfavorable, unless an orientable system is used.
FIG. 58. Dependence of sensitivity on orientation for a single-beam self-oscillator magnetometer.

On the other hand, in the case of space applications, in which the “attitude” of the carrier craft is unknown, the probability of correct operation is the same with both types of diagrams. This point has been discussed by Bloom (131). In actual fact, this probability is immensely greater in the case of the self-oscillator, which oscillates instantly as soon as it returns to a working zone, whereas the controlled system demands a nonnegligible search time, as has already been seen. Naturally, with both systems it is possible to cover the whole space by using several magnetometers set up in different directions. The installation of such systems raises some rather considerable practical
problems, for, while the total bulk must be kept reasonably small, coupling between magnetometers, arising through the resonance coils, has to be avoided.

c. Turnover effects. One of the qualities required of magnetometers is that they be isotropic, i.e., they must provide a measurement that is independent of their orientation with respect to the magnetic field. All effects related to anisotropy are generally designated by the term turnover effects and constitute a severe limitation in their use, especially in the case of mobile instruments. These effects are systematic errors which depend either on the magnetometer’s orientation in the field, or on the speed of rotation as considered below.

(1) Purely magnetic effects. (i) Dc fields: If the materials used in the construction of the probes carry either remanent or induced magnetization, they are the sources of a field b which is added vectorially to the field B to be measured; the sum of the two (b + B) depends on the orientation, thus producing a rotation effect. The problem, which is common to all magnetometers, is far from being negligible, though the necessary precautions are clear. In the case of optical pumping magnetometers special attention has to be given to secondary circuits for temperature control.

(ii) Ac fields: A more interesting effect occurs when an alternating field adds vectorially to the dc field that is to be measured. If the frequency of the ac field is higher than the passband of the magnetometer, one observes an error due to the nonlinearity of vector addition. If the ac field shows a component b1 cos ωt at right angles to the dc field B, one measures effectively the mean value, equal to B[1 + (b1/2B)²] to first order. This correction may be a troublesome error. On the other hand, comparing the measures for two orientations of the magnetometer, one for which the two fields are parallel and the other for which they are perpendicular, one gets an accurate measure of small ac fields by this second-order effect.

(2) Line asymmetry. The apparent resonance line of an alkali atom shows, in weak fields, an asymmetry that depends on the polarization of the pumping light. This asymmetry is related to the Back-Goudsmit effect which disturbs the linearity of the energy levels with respect to the magnetic field, as explained in Section IV, B, 1. The energy of each level shown on Fig. 48 as a function of the magnetic field is given by the Breit-Rabi formula:
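$$E\left(F = I \pm \tfrac{1}{2},\, m\right) = -\frac{\Delta E_0}{2(2I+1)} + g_I \mu_B B\, m \pm \frac{\Delta E_0}{2}\left[1 + \frac{4m}{2I+1}\,x + x^2\right]^{1/2} \qquad (38)$$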
where ΔE0 is the hyperfine difference; I is the nuclear spin (in this case, 7/2); μB is the Bohr magneton; gJ and gI are the Landé factors which represent the electronic and nuclear magnetic moments with respect to the Bohr magneton; and x is the quantity [(gJ − gI)μB B]/ΔE0. Nonlinearity is introduced by the term under the radical in formula (38). One of the consequences of this fact is that the frequencies of the different Zeeman components are not coincident. For example, in the case of cesium the difference between lines is 19 μG in a field of 500,000 μG. This difference is less than each line width, so that the spectrum is not resolved. What is actually observed is an over-all line composed of all the elementary lines. But optical pumping with σ⁺ light has the advantage of populating levels with the highest value of m. This produces inequality between the different components (Fig. 59), and asymmetry in the over-all line to the
right. Population is reversed toward negative values of m if the light is σ⁻, and the asymmetry also has its sense changed. Since changing the sense of the polarization and the sense of the magnetic field are equivalent operations, an identical effect is obtained if the magnetometer is turned around in the field to be measured. This produces a rotation effect. If this effect were pure, the Breit-Rabi formula would enable its upper limit to be calculated. This upper limit would be reached in the extreme case for which the over-all resonance line is coincident with the strongest of the elementary lines.

FIG. 59. Combination of lines of unequal amplitudes.
Transformation of the Breit-Rabi formulas shows that in the case of weak fields the relative difference between the extreme frequencies, 2 × (νmax − νmin)/(νmax + νmin), is independent of the value of nuclear spin and depends only on the Zeeman parameter x:

$$2\,\frac{\nu_{max} - \nu_{min}}{\nu_{max} + \nu_{min}} = \frac{2(g_J - g_I)\mu_B B}{\Delta E_0} = 2x$$
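The coefficient 2x/B can be estimated for the common alkali atoms from their ground-state hyperfine splittings. In the sketch below the hyperfine frequencies, the value of μB/h, and the electronic g-factor are standard reference values rather than data from this text; the nuclear factor gI is neglected against gJ, and the choice of the isotope ⁴¹K for potassium is an assumption.

```python
MU_B_OVER_H = 1.399_624     # Bohr magneton / h, in MHz per gauss
G_J = 2.0023                # approximate electronic g-factor of an S1/2 ground state

# Ground-state hyperfine splittings in MHz (standard reference values)
HYPERFINE_MHZ = {
    "Cs-133": 9192.632,
    "Rb-87": 6834.683,
    "Rb-85": 3035.732,
    "K-41": 254.014,
}


def two_x_per_gauss(hfs_mhz):
    """Coefficient 2x/B in 1/gauss, neglecting g_I against g_J."""
    return 2.0 * G_J * MU_B_OVER_H / hfs_mhz


coefficients = {atom: two_x_per_gauss(f) for atom, f in HYPERFINE_MHZ.items()}
for atom, value in coefficients.items():
    print(f"{atom:7s} 2x = {value * 1e4:6.1f} x 10^-4 B")
```

The resulting coefficients are close to 6, 8.2, 18.5, and 221 in units of 10⁻⁴ per gauss.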
The relative difference of fields measured before and after rotation of 180° would then have the following value for the various alkali metals:

Cs      6 × 10⁻⁴ B
⁸⁷Rb    8.2 × 10⁻⁴ B
⁸⁵Rb    18.6 × 10⁻⁴ B
K       220 × 10⁻⁴ B

where B is in gauss. This extreme difference would be encountered in the limit for an infinite light intensity I of the optical pumping and for a near zero value of the Zeeman field B1; for the very principle of optical detection of magnetic resonance suggests competition between the light intensity, which tends to accumulate the atoms in a single level, and the Zeeman field, which tends, on the contrary, to equalize the population in all the levels. These antagonistic effects are also apparent in the shape of the resonance lines. In particular, one would expect more symmetrical lines, and so smaller rotation effects, as the light intensity drops or as the field B1 increases.
FIG. 60. Line asymmetry as a function of the field B1 and resulting field displacement.
This tends to be confirmed by curve (D) of Fig. 60, which shows line asymmetry against the field B1. Thus an asymmetry parameter D is defined as the difference between the frequencies at the top and at the center of the line, this difference being referred to half the line width. The changes in the value of the indicated magnetic field, dB, corresponding to this asymmetry are shown on the same graph. Curve (b) is quite compatible with rotation effects actually encountered with the cesium magnetometer, which are of the order of 10 to 20 μG in a field of 450,000 μG. When these results are compared to the predictions of the theory summarized in the preceding table, it appears that asymmetry effects are not the only ones taking part.

(3) Residual effects. The frequency shifts related to optical pumping (121) may contribute to the discrepancy although they are not at present
known quantitatively. Variations in the light source are linked with the orientation on account of the different distribution of the unvaporized masses of metal within the lamp. This effect becomes marked in self-oscillator magnetometers, which require very accurate definition of the optical axis; it somewhat restricts the efficacy of compensating devices aimed at suppressing rotation effects by the use of opposite optical beams obtained from the same light source. The above-mentioned effects depend only on the direction of the optical pumping with respect to the magnetic field. There is a further effect related to Ω, the speed of rotation of the magnetometer about B (see Section I, A, a), which can be simply described as follows, with reference to the diagram of Fig. 43. The vapor’s magnetization M is caused to precess by a rotating field whose angular speed is increased (or decreased) by the rotation of the magnetizing coils. In accordance with Eq. (2), this produces the following relative error:

$$dB/B = \Omega/2\pi\nu$$
ν is the resonance frequency corresponding to B. This effect is not easily observed with optical resonance magnetometers, in which the relevant frequencies are very high as compared with the values of Ω reached in practice.

2. Resettability; Stability. a. Resettability. Often, the interest in absolute precision is not very great. But it is still important to have available an instrument which can be substituted without having to make a fresh calibration. From this point of view resettability can be defined as the ability of several instruments of the same type placed in the same conditions to supply the same readings. It can be numerically stated by the relative difference between these readings. Cesium magnetometers of the controlled type possess a resettability of about 10 μG. The following table offers an example of the results obtainable when checking resettability by comparing the readings of five different cesium magnetometers in the same location. The present experiments (181) were made in the French Geomagnetic Observatory of Chambon-la-Forêt, in a field of approximately 0.45 gauss; the five instruments were later used for establishing a new magnetic map of France during the year 1964, by aerial exploration. Many series of readings were made successively, and for ease of comparison, the readings were reduced to the common value B = 0.45 gauss; here is a randomly chosen reading:

Magnetometer    f (cps)
No. 1           157441
No. 2           157442
No. 3           157443
No. 4           157439
No. 5           157439.5
The conclusion of the trials for this set of apparatus was expressed as a frequency versus field law:

$$f\ [\text{cps}] = 3.49869 \times 10^5 \times B\ [\text{gauss}]$$
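The five readings can be converted to field values to check the quoted resettability. The sketch below takes the coefficient of the law as 3.49869 × 10⁵ cps per gauss, the value consistent with about 157,441 cps at 0.45 gauss; the variable and function names are illustrative.

```python
CPS_PER_GAUSS = 3.49869e5       # cesium frequency-to-field coefficient used above

readings_cps = {
    "No. 1": 157441.0,
    "No. 2": 157442.0,
    "No. 3": 157443.0,
    "No. 4": 157439.0,
    "No. 5": 157439.5,
}


def field_uG(freq_cps):
    """Convert a resonance frequency (cps) to a field value in microgauss."""
    return freq_cps / CPS_PER_GAUSS * 1e6


fields = {name: field_uG(f) for name, f in readings_cps.items()}
spread_uG = max(fields.values()) - min(fields.values())
mean_gauss = sum(fields.values()) / len(fields) * 1e-6

print(f"mean field = {mean_gauss:.6f} gauss")
print(f"spread between instruments = {spread_uG:.1f} microgauss")
```

The spread of 4 cps between instruments corresponds to roughly 11 μG, in agreement with the resettability of about 10 μG stated for the controlled type.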
b. Long-term stability. The stability of an instrument can be defined as its ability to retain its initial calibration. So the problem of stability is the problem of long-term drift, which may be caused by a variation of the instrument in time due to aging or through the action of external factors such as temperature, pressure, and humidity. In the present state of experience, the chief external cause of drift is temperature. It makes its effect felt through the electronic equipment, whose characteristics are liable to vary:
(1) Temperature causes variations in the intensity of the high-frequency excitation of the light source, or in the amplitude of the Zeeman field; these variations cause frequency shifts. But these effects are very slight and are negligible compared to the following ones.
(2) In the controlled magnetometer, the most important cause of drift is related to the very principle of the determination of the center of the resonance line by means of a synchronous detector. The latter delivers an output signal by comparing not only the fundamental frequencies of the modulation, but also the various harmonics if they exist in both channels. These harmonics are present in the return channel, at least in principle, and may occur in the reference channel if the modulating signal is distorted owing to stray coupling and so on. This is therefore the origin of a false error signal which may vary with temperature. This effect is at present responsible for drift of about 10 μG for a temperature change of 40°C.
(3) In the self-oscillator systems the dominant effect is phase rotation in the amplifier, caused by variation of the characteristics of reactive components with temperature. This effect causes drift of about 10 μG for a temperature change of 10°C.
All these causes of drift can be considerably reduced by thermal regulation of the electronic components.

3. Limits of the Passband. Magnetometers are essentially continuous instruments, i.e., designed for measuring slowly varying phenomena. They are generally used with time constants greater than 0.1 sec. More rapidly varying phenomena could be measured with probe coils, whose sensitivity increases with the frequency to be measured. Nevertheless, these devices measure only the field components, so it may be useful to know the limiting passband of optical pumping magnetometers.
P. A. GRIVET AND L. MALNAR
Controlled magnetometers possess a passband limited by the control system itself. As shown by the analysis of Section B, 2, c, it is difficult in practice to obtain a cut-off frequency higher than a few cycles per second for closed-loop control; this will therefore be the cut-off frequency for the magnetometer itself. In the self-oscillator system the only limitation is the width of the resonance line. One can expect, at the instrument output, a passband approximately equal to half the line width, measured in cycles per second. This is confirmed by experiment, as shown by the following table, which gives the cut-off frequency of a self-oscillator for different amplitudes of the applied sinusoidal field.

Amplitude (μG)    Cut-off frequency (cps)
350               75
800               70
1700              65
4500              55
10000             50
This problem is discussed theoretically in Ref. (131).
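The measured cut-off frequencies of the self-oscillator can be put in a small lookup for intermediate amplitudes. The sketch below interpolates linearly in the logarithm of the amplitude, which is an assumption made here for illustration and not a law stated in the text. It also applies the rule quoted above, that the passband is approximately half the line width, to infer an effective line width.

```python
import math
from bisect import bisect_right

# Measured points from the table: amplitude in microgauss, cut-off in cps.
AMPLITUDE_UG = [350, 800, 1700, 4500, 10000]
CUTOFF_CPS = [75, 70, 65, 55, 50]


def cutoff_cps(amplitude_ug):
    """Interpolate the cut-off frequency (assumed linear in log10 of amplitude)."""
    if not AMPLITUDE_UG[0] <= amplitude_ug <= AMPLITUDE_UG[-1]:
        raise ValueError("amplitude outside the measured range")
    # Index of the upper bracketing point, clamped to the last table entry.
    i = min(bisect_right(AMPLITUDE_UG, amplitude_ug), len(AMPLITUDE_UG) - 1)
    x0 = math.log10(AMPLITUDE_UG[i - 1])
    x1 = math.log10(AMPLITUDE_UG[i])
    frac = (math.log10(amplitude_ug) - x0) / (x1 - x0)
    return CUTOFF_CPS[i - 1] + frac * (CUTOFF_CPS[i] - CUTOFF_CPS[i - 1])


def implied_line_width_cps(amplitude_ug):
    """Line width inferred from 'passband is about half the line width'."""
    return 2.0 * cutoff_cps(amplitude_ug)


print(cutoff_cps(1000), implied_line_width_cps(1000))
```

The interpolated values between table entries are only indicative; the table itself supports no more than the statement that the cut-off falls slowly as the field amplitude grows.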
D. Examples of Designs

The following figures illustrate the preceding sections by a few examples of design. Figures 61, 62, and 63 relate to a magnetometer of the controlled type used for geophysical measurements. The same instrument can be used either as a fixed station or, in its drogue version, as a mobile instrument. Figure 61 shows the complete instrument. In the foreground is the actual magnetometer, which is basically an epoxy resin cylinder 1.5 meters long; at one end is the control electronics unit, at the other the detector probe, i.e., the optical pumping unit. This lengthwise arrangement permits enclosing the system in a bird-shaped envelope, forming a magnetometer which can be towed by an aircraft or a helicopter. Further back in Fig. 61 is a unit comprising the power supply and the counter system. The latter can convert the received frequency, which is proportional to the magnetic field, either to a coded signal which can be stored on magnetic tape, or to an analog signal to be recorded graphically. The counter is also provided with a high-stability clock for synchronizing two stations separated from one another, so as to obtain simultaneous records. This unit (counter and power supply) is connected to the magnetometer itself by a single coaxial cable which may be 100 meters long. This cable is used as the towing cable when the magnetometer is converted to the drogue version. Figure 62 shows the drogue under a towing aircraft, on the ground, and Fig. 63 shows the same instrument towed by a helicopter, in its normal use.
Figure 64 shows one example of simultaneous recording in absolute value of the magnetic field measured by two independent magnetometers. These recordings are shown on the same graph. The two magnetometers were 30 meters apart.
FIG.61. The geophysical magnetometer.
FIG.62. The drogue on its adjustable nonmagnetic checkout fixture.
Figure 65 gives a similar recording for two magnetometers 300 km apart. Figure 66 illustrates the measurement method used in the operation of the French magnetic map, with instruments similar to that shown in Fig. 62. An aerial magnetometer is flown along a predetermined axis and records both the magnetic profile and the time-dependent fluctuations.
Simultaneously a ground station records these fluctuations only. The difference between the two records shows only the magnetic profile. The smoothing effect obtained can be observed in the figure. Figure 67 shows a self-oscillator magnetometer for space measurements. It consists essentially of the measuring probe (optical pumping unit) and two air-tight boxes containing the electronic unit and the power supply. The frequency-measuring system, which may be placed on the ground, is not shown. Finally, Fig. 68 is an example of a record of geomagnetic fluctuations obtained with an instrument of this type. Table V summarizes the typical characteristics of the cesium vapor magnetometers.

FIG. 63. The magnetometer in flight.
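The air-minus-ground differencing described for Fig. 66 can be sketched as follows. The sketch assumes that the two records are sampled at the same synchronized instants (which the high-stability clock of the counter is meant to allow) and that the time-dependent fluctuation is identical at the aircraft and at the ground station. All numerical values are hypothetical and expressed in gammas.

```python
def magnetic_profile(air_record, ground_record):
    """Subtract the ground-station record from the aerial record, sample by sample."""
    if len(air_record) != len(ground_record):
        raise ValueError("records must cover the same synchronized time samples")
    return [a - g for a, g in zip(air_record, ground_record)]


# Hypothetical data (gammas): a common time fluctuation and a spatial anomaly
# seen only by the aircraft as it flies along its axis.
time_fluctuation = [0, 3, 5, 2, -1, -4]
spatial_anomaly = [0, 10, 25, 40, 25, 10]
air_base, ground_base = 45200, 45180

air = [air_base + s + t for s, t in zip(spatial_anomaly, time_fluctuation)]
ground = [ground_base + t for t in time_fluctuation]

profile = magnetic_profile(air, ground)
print(profile)
```

The common fluctuation cancels in the difference, leaving the spatial profile plus a constant offset between the two sites.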
FIG. 64. Example of simultaneous absolute recording at same place. [Plot: field versus time in minutes.]
[Plot residue for Fig. 65. Panel titles: "Absolute value, Salles Curan, May 5, 1964" (axis marks 45263 to 45283) and "Absolute value, Condom, May 5, 1964" (axis marks 45171 to 45191).]
FIG. 66. Simultaneous aerial and ground record. [Traces: air record; ground station.]
FIG.67. The self-oscillator magnetometer.
FIG. 68. Example of high sensitivity relative recording. [Plot: field in gammas versus time in minutes.]
TABLE V

                                        Controlled type     Auto-oscillating type
Sensitivity (for 1 cps bandwidth)       0.1 μG              0.1 μG
Absolute accuracy in relative value     10^-6               Not measured
Stability (for ΔT = 40°C)               10 μG               20 μG
Possible bandwidth                      5 cps               50 cps
Dynamic range                           0.5 gauss           0.5 gauss
Temperature range                       -20° to 40°C        -20° to 40°C
Turn-over effect (one head)             10 μG max           ±30 μG max
VI. SUPERCONDUCTING INTERFEROMETERS AS MAGNETOMETERS

A. Principle

A very interesting effect was discovered recently by Jaklevic, Mercereau et al. (182, 183, 184) in the domain of superconductivity. They were able to build a device which may be called an "interferometer in the time domain for de Broglie's electron waves," and the structure of the "interference pattern" is highly sensitive to the intensity and direction of the ambient magnetic field. The magnetic sensor of this device is simply a small loop of superconducting wire of area $A$. The ultimate sensitivity of the measurement $|\Delta H|$ is linked, in order of magnitude, to the quantum of flux by the relation

$|\Delta H| \approx \frac{1}{p}\,\frac{\phi_0}{A}$  [Oersted]    (39)

where $\phi_0$, given by

$\phi_0 = h/2e = 2.1 \times 10^{-7}$  [Maxwell]    (40)

is very small ($h$ is Planck's constant, $e$ the charge of the electron). Moreover, the numerical factor $p$ may reach values of the order of 10 to 100; it measures the accuracy obtained in determining electronically the location of a minimum in a "dark fringe" of the interference pattern; in fact, the interference appears as a beat frequency and evolves in time. Actually the accuracy is limited by noise and stability problems, and the useful values for the surface of the loop do not exceed 1 cm²; the speed of response is good. In these conditions one may hope to conveniently reach the G level. This gain would be specially valuable for interplanetary fields, which are very low (20 to 50 μG). On the other hand, miniature helium refrigerators have already been developed for use on rockets and satellites, and an experiment combining both devices aboard a satellite seems a promising possibility. Already, today, a laboratory prototype has been constructed in the same place as the initial discovery, thoroughly tested and compared with optical magnetometers, and it shows at least the same quality (183). The device is named SQUID by its inventors (Superconducting Quantum Interference Device) and its properties will be briefly described in the following paragraph. No attempt will be made to survey the basic theory of the interferometer, as it is masterfully expounded in the third volume of Feynman's lectures on physics (185, lecture 21) and brilliantly too in the reviews and articles (186, 187, 184).
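The orders of magnitude in relations (39) and (40) can be checked with a short calculation. The sketch assumes that relation (39) reads |ΔH| ≈ φ0/(pA) in Gaussian units (flux in maxwells, area in cm², field in oersteds), and it uses present-day SI values of h and e, which are not taken from the text. The function name and the chosen values of p and A are illustrative.

```python
# Present-day SI values of the constants (an assumption of this sketch).
H_PLANCK = 6.62607015e-34      # J s
E_CHARGE = 1.602176634e-19     # C
WEBER_TO_MAXWELL = 1.0e8       # 1 Wb = 1e8 Mx

# Flux quantum h/2e, converted from webers to maxwells.
phi0_maxwell = H_PLANCK / (2.0 * E_CHARGE) * WEBER_TO_MAXWELL


def field_resolution_oersted(area_cm2, p):
    """Assumed reading of relation (39): |dH| ~ phi0 / (p * A)."""
    return phi0_maxwell / (p * area_cm2)


print(f"phi0 = {phi0_maxwell:.3e} Mx")
for p in (10, 100):
    print(f"A = 1 cm^2, p = {p}: |dH| ~ {field_resolution_oersted(1.0, p):.1e} Oe")
```

For a loop of 1 cm² the resolution scales inversely with p, so the quoted range of p from 10 to 100 spans one decade of field resolution.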
FIG.69. SQUID construction.
However, the following point should be stressed: the superconducting ring sensor of the SQUID, shown in Fig. 69, is not simply a homogeneous wire bent in a circle. On the contrary, it includes two "weak links." Until now, the theory has considered only one type of such "weak link," the so-called Josephson diode. In practice, however, the "weak link" is simply made of a point contact under moderate elastic pressure, and its diode characteristic differs notably from the curve for the Josephson diode.
B. The First Practical SQUID

The ring sensor shown in Fig. 69 is made of vanadium (or niobium) wire. When the pressure is properly adjusted, one observes the V-I characteristic shown in Fig. 70. Starting with I = 0, one moves along the axis V = 0 in the first, superconductive region a, but a current of the order of 1 mA is sufficient to partially destroy the superconductivity and to progressively restore resistance: one crosses the useful region b; finally one reaches the normal region c. The remarkable point is that in the useful region, the position of the curve along the I axis depends on the magnetic flux threading the aperture of the sensor: if the flux equals $\pm n\phi_0/2$, n being an even integer, one describes branch I of the curve; if the flux is changed to one of the values $\pm n\phi_0/2$, n being an odd integer, one describes branch II. For intermediate values of the field, the figurative point describes the "load line," oscillating back and forth between points M and N as shown by the arrows. The absolute value of the slope of the load line is the resistance of the diode in the normal state (region c). When the current is biased in the useful region b, the signal V is a smooth oscillatory function of the magnetic flux $\Phi$, as shown in Fig. 71. The magnetometer works on the null principle, and a servoloop regulates a compensating field so as to lock the figurative point on a position very near a minimum of the curve of Fig. 71. The block diagram of the system is given in Fig. 72.

FIG. 70. Voltage-current characteristics for a typical SQUID.

FIG. 71. Voltage versus magnetic flux characteristics for a proper choice of biasing current. [Flux axis marked from $-2\phi_0$ to $+2\phi_0$.]

FIG. 72. Lock-on SQUID magnetometer block diagram. [Blocks: bias current; amplifier; synchronous detector; SQUID; field coil; balancing + local scanning; 10 kc oscillator.]

The performance obtained with the first SQUID (October 1965) is as follows:

Modulation frequency for local scanning of the diode    10 kc
SQUID diameter                                          6 or +‘B in.
Short term noise                                        lo-’ (G) for a 4 cps bandwidth
Dynamic range                                           0.01 G to 1 mG
Feeding power                                           1.5 W
Possible long term drift                                anything to 0.01 mG
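The null principle of the lock-on magnetometer can be illustrated by a toy simulation. The sketch assumes a voltage-flux curve of the form V = (1 - cos 2πx)/2, with x the flux in units of the flux quantum, chosen only because it is smooth, periodic, and has minima at integer x; the real characteristic of Fig. 71 is not specified analytically in the text. The modulation amplitude, loop gain, and number of steps are likewise hypothetical.

```python
import math


def squid_voltage(flux):
    """Assumed V(flux): smooth, periodic in one flux quantum, minima at integers."""
    return 0.5 * (1.0 - math.cos(2.0 * math.pi * flux))


def synchronous_detect(total_flux, mod_amplitude, n_samples=64):
    """Demodulate V at the modulation frequency over one full modulation period."""
    acc = 0.0
    for k in range(n_samples):
        s = math.sin(2.0 * math.pi * k / n_samples)
        acc += squid_voltage(total_flux + mod_amplitude * s) * s
    return acc / n_samples


def lock_on(external_flux, gain=2.0, mod_amplitude=0.02, steps=200):
    """Integrating servo: adjust the compensating flux until the error vanishes."""
    feedback_flux = 0.0
    for _ in range(steps):
        error = synchronous_detect(external_flux + feedback_flux, mod_amplitude)
        feedback_flux -= gain * error
    return feedback_flux


external = 0.3  # hypothetical external flux, in flux quanta
feedback = lock_on(external)
print(f"compensating flux: {feedback:.6f} (external flux: {external})")
```

The detector output is proportional to the local slope of V, so the loop settles where the slope vanishes with positive curvature, that is, at a minimum. The compensating flux then equals the external flux with opposite sign and serves as the measurement.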
Instabilities are presumably due to the use of point contact diodes. This defect may be cured in the future by the development of thin film diodes.

REFERENCES

1. H. A. Thomas, R. L. Driscoll, and J. A. Hipple, J. Res. Natl. Bur. Std. 44, 569 (1950).
2. P. L. Bender and R. L. Driscoll, IRE Trans. Instr. 1, 176 (1958).
3. G. Hochstrasser, Helv. Phys. Acta 34, 189 (1960).
4. G. Hochstrasser and A. Erbeia, Bull. Ampere 10, 280 (1961).
5. A. Abragam, J. Combrisson, and I. Solomon, Compt. Rend. 246, 157 (1957).
6. R. B. Leighton, "Principles of Modern Physics," McGraw-Hill, New York, 1959.
7. A. Kastler, Physica 17, 191 (1951); Proc. Phys. Soc. (London) 867, 853 (1954); J. Opt. Soc. Am. 47, 460 (1957).
8. H. G. Dehmelt, Phys. Rev. 106, 1487 (1956).
9. W. E. Bell and A. L. Bloom, Phys. Rev. 107, 1559 (1957).
10. R. Stefant, Ann. Geophys. 19, 250 (1963).
11. P. Grivet, M. Sauzade, and R. Stefant, Rev. Gen. Elec. 70, 317 (1961).
12. C. P. Sonnett, Advan. Space Sci. 2, 3 (1960).
13. D. L. Judge, M. G. McLeod, and A. B. Sims, Report GM-41, 1-588 Space Technol. Lab., Los Angeles (195!1).
14. J. P. Heppner, N. F. Ness, C. S. Scearce, and T. L. Skillman, J. Geophys. Res. 68, 1 (1963).
15. S. S. Dolginov, L. N. Zhuzgov, and N. V. Puskov, Artificial Satellites 2, 63 (1960); 6, 16 (1961). Dokl. Akad. Nauk S.S.S.R., Geofizika 170, 574 (1966).
16. P. Grivet, Bull. Ampere 9, 567 (1960).
17. M. Epstein, L. S. Greenstein, and H. M. Sachs, Proc. Natl. Electron. Conf., 1959 16, 24 (1960).
18. W. E. Fromm, Advan. Electron. 4, 257 (1952).
19. B. S. Melton, Advan. Electron. Electron Phys. 9, 297 (1957).
20. N. F. Ness, C. S. Scearce, and J. B. Seek, Initial results of the IMP-1 magnetic field experiment. J. Geophys. Res. 69, 3531 (1964).
21. P. H. Serson and W. L. W. Hannaford, Can. J. Technol. 34, 232 (1956); P. H. Serson, Can. J. Phys. 36, 1387 (1957).
22. K. J. Burrows, J. Brit. IRE 19, 769 (1959).
23. W. A. Geyger, AIEE Trans. Paper 58-1277 (1958).
24. J. Jung and J. Cackenberghe, Bull. Ampere 10, 132 (1961).
25. L. Kurwitz and J. Nelson, J. Geophys. Res. 66, 1759 (1960).
26. L. R. Alldredge, J. Geophys. Res. 66, 3777 (1960).
27. I. R. Shapiro, J. Stolarik, and J. P. Heppner, J. Geophys. Res. 66, 913 (1960).
28. S. K. Runcorn, Discovery 26, 20 (1964).
28a. B. R. Leaton and S. R. C. Malin, Nature 213, 1110 (1967).
29. S. Chapman, "The Earth Magnetism." Methuen, London, 1951.
30. H. S. Massey and R. L. F. Boyd, "The Upper Atmosphere." Hutchison, London, 1958.
31. S. Chapman, "Solar Plasma, Geomagnetism and Aurora," pp. 373-502. Gordon & Breach, New York, 1963.
32. H. Benioff, J. Geophys. Res. 66, 1413 (1960).
33. W. J. Campbell, J. Geophys. Res. 66, 1819 (1960).
33a. L. J. Cahill, Jr., Science 147, 991 (1965); Sci. American 212, No. 3, 58 (March 1965).
34. J. T. I. Arnold, W. E. Bell, A. L. Bloom, and L. R. Sarles, J. Geophys. Res. 66, 2472 (1960).
35. J. Winckler, Discovery 23, 20 (1963).
36. F. N. Spiess and A. E. Maxwell, Science 146, 349 (1964).
37. A. D. Raff, Sci. Am. 206, 146 (1961).
38. E. R. King, I. Zietz, and L. R. Alldredge, Science 144, 1551 (1964).
38a. F. K. Harris, A nonmagnetic laboratory for the National Bureau of Standards. IEEE Spectrum pp. 85-87 (1966).
39. G. J. Bene, Une station d'études du magnétisme nucléaire dans la forêt de Jussy, près de Genève. 86ème Congrès des Sociétés Savantes p. 30 (1960).
40. L. A. Marzetta, Rev. Sci. Instr. 32, 1192 (1961).
41. N. Wolff, IEEE Intern. Conv. Record Part 8, Instrument., p. 149 (1964).
42. L. D. Schearer, Rev. Sci. Instr. 32, 1190 (1961).
43. F. Salle and M. Sauzade, Compt. Rend. 268, 73 (1964).
44. A. Lory, P. Grivet, and M. Sauzade, A secondary standard of current based on a nuclear spin oscillator. IEEE Trans. Inst. Meas. IM-13, 231 (1964).
45. M. F. Pipkin and R. J. Hanson, A magnetically shielded solenoid with field of high homogeneity. Rev. Sci. Instr. 36, 79 (1965).
45a. J. Patton and J. L. Fitch, J. Geophys. Res. 67, 1117 (1962).
45b. W. R. Hindmarsh, F. J. Lowes, P. H. Roberts, and S. K. Runcorn, "Magnetism and the Cosmos." Oliver & Boyd, Edinburgh and London, 1963.
46. J. P. Heppner, J. D. Stolarik, I. R. Shapiro, and J. C. Cain, Proc. 1st Intern. Space Sci. Symp., Nice, 1960 p. 982. North-Holland Publ., Amsterdam, 1961.
47. J. P. Heppner, T. L. Skillman, and J. C. Cain, Proc. 2nd Intern. Space Sci. Symp., Florence, 1961 p. 681. North-Holland Publ., Amsterdam, 1962.
48. G. H. Ludwig, Space Sci. Rev. 2, 175 (1963).
49. J. P. Heppner, Space Sci. Rev. 2, 315-354 (1963).
50. E. R. Harisson, Geophys. J. 6, 479 (1962).
51. L. J. Cahill, Jr., J. Geophys. Res. 68, 1835 (1963).
52. W. E. Scull and G. H. Ludwig, Proc. IRE 60, 2287 (1962).
53. S. P. Heims and E. T. Jaynes, Rev. Mod. Phys. 34, 143 (1962).
54. M. Packard and R. Varian, Phys. Rev. 93, 941 (1954).
55. E. L. Hahn, Phys. Rev. 77, 297 (1950).
56. G. S. Waters and P. D. Francis, J. Sci. Instr. 36, 88 (1938).
57. G. Klose, Z. Angew. Phys. 10, 495 (1958).
58. A. Blaquiere, G. Bonnet, and P. Grivet, Proc. 3rd Quantum Electron. Congr., 1963 p. 231. Dunod, Paris, 1964.
59. G. Faini, A. Fuortes, and O. Svelto, Energie Nucl. 7, 705 (1960).
60. E. L. Hahn and D. E. Maxwell, Phys. Rev. 88, 1070 (1952).
61. A. Losche, "Kerninduktion," pp. 102-128. Verlag der Wissenschaften, Berlin, 1957.
62. A. Blanc Lapierre and B. Picinbono, "Propriétés statistiques du bruit de fond." Masson, Paris, 1961.
63. P. Grivet and A. Blaquiere, "Le bruit de fond." Masson, Paris, 1958.
64. G. Faini and O. Svelto, Energie Nucl. 8, 295 (1961).
65. R. Gendrin and R. Stefant, Compt. Rend. 266, 2273 (1962).
66. G. Faini and O. Svelto, Nuovo Cimento Suppl. 23, 55 (1962).
67. T. Rikitake and I. Tanoka, Bull. Earthquake Res. Inst., Tokyo Univ. 38, 319 (1960).
68. M. J. Aitken and M. S. Tite, J. Sci. Instr. 39, 625 (1962).
69. S. Narayans and G. Woolard, J. Geophys. Res. 66, 2548 (1961).
70. A. Erbeia and G. Hochstrasser, Bull. Ampere 10, 280 (1961).
71. D. Mansir, Electronics 33, 47 (1960).
72. D. D. Thompson and R. J. S. Brown, J. Chem. Phys. 36, 1894 (1961).
73. J. A. Pople, W. G. Schneider, and H. J. Bernstein, "High Resolution Nuclear Magnetic Resonance." McGraw-Hill, New York, 1959.
74. D. D. Thompson and R. J. S. Brown, J. Chem. Phys. 40, 3076 (1964).
75. A. Abragam, Phys. Rev. 98, 1729 (1955).
76. G. E. Pake, Solid State Phys. 2, 87 (1956).
77. J. Uebersfeld, Bull. Ampere 10, 456 (1961).
78. W. A. Barker, Rev. Mod. Phys. 34, 173 (1962).
79. C. D. Jeffries, "Dynamic Nuclear Orientation." Wiley (Interscience), New York, 1963.
80. A. Abragam and M. Borghini, "Dynamic Polarization of Nuclear Targets," Vol. 4, Chapter VIII. North-Holland Publ., Amsterdam, 1964.
81. J. R. Singer, Advan. Electron. Electron Phys. 16, 73 (1961).
82. J. P. Gordon, H. J. Zeiger, and C. H. Townes, Phys. Rev. 99, 1264 (1955).
83. M. Shimoda, C. H. Townes, and T. C. Wang, Phys. Rev. 102, 1308 (1958).
84. C. Kittel, Phys. Rev. 96, 589 (1954).
84a. R. Besson, H. Lemoine, A. Rassat, A. Sator, and P. Servoz-Gavin, Proc. Colloq. AMPERE 327 (Atomes Mol. Etudes Radio Elec.) 12, (1964).
85. I. Solomon, J. Phys. Radium 19, 837 (1958).
86. J. Combrisson, J. Phys. Radium 19, 840 (1958).
87. A. J. Landesman, J. Phys. Radium 20, 937 (1959).
88. H. Benoit, Compt. Rend. 246, 3053 (1958); Ann. Phys. (Paris) [13] 4, 1440 (1959).
89. J. Combrisson, A. Honig, and C. H. Townes, Compt. Rend. 242, 2451 (1956).
90. N. Bloembergen and R. V. Pound, Phys. Rev. 96, 8 (1954).
91. P. Grivet, Cahiers Phys. 66, 20 (1956).
92. R. V. Vladimirsky, Nucl. Instr. Methods 1, 329 (1957).
93. I. Solomon, Comm. Energie At. (France), Rappt. CEA (SACLAY) No. 346 (1961).
94. J. Combrisson, Quantum Electron., Symp., High View, N.Y., 1959 Vol. 1, p. 167. Columbia Univ. Press, New York, 1960.
95. A. M. Prokhorov and N. G. Basov, Discussions Faraday Soc. 19, 96 (1955).
96. H. Benoit, J. Phys. Radium 21, 212 (1960).
97. A. Salvi, An absolute magnetometer based on magnetic nuclear resonance. Thesis, Grenoble (1961); Comm. Energie At. (France) Rappt. No. 2383 (1964).
98. G. Bonnet, Ann. Geophys. 18, 62 and 150 (1962).
99. C. Schmelzer, CERN (Geneva) Rept. No. PS/CS-2 (1952); No. PS/CS-/1 (1953).
100. S. Kurochkin, Radiotekhn. i Elektron. 3, 198 (1958).
101. J. Hennequin, Ann. Phys. (Paris) [13] 6, 949 (1961).
102. A. Kastler, "La spectroscopie en radiofréquence," p. 7. Revue d'Optique, Paris, 1957.
102a. C. Cohen-Tannoudji and A. Kastler, Optical pumping, in "Progress in Optics" (E. Wolf, ed.), Vol. 5, pp. 3-85. North-Holland Publ., Amsterdam, 1966.
102b. R. A. Bernheim, "Optical Pumping: An Introduction." Frontiers in Chemistry, monograph 711. Benjamin, New York, 1965.
103. R. C. Mockler, Advan. Electron. Electron Phys. 16, 1-74 (1961).
104. W. B. Hawkins, Phys. Rev. 98, 478 (1955).
105. M. E. Rose, ed., "Nuclear Orientation," Intern. Sci. Rev. Ser. No. 6. Gordon & Breach, New York, 1963.
106. F. D. Colegrove, L. D. Schearer, and G. K. Walters, Polarization of He³ gas by optical pumping. Phys. Rev. 132, 2561 (1963).
107. M. Arditi and T. R. Carver, Phys. Rev. 136A, 643 (1964); J. Appl. Phys. 36, 443 (1965).
107a. P. Davidovits, Appl. Phys. Letters 6, 15 (1964).
108. H. G. Robinson and T. Myint, Appl. Phys. Letters 6, 116 (1964).
109. A. E. Siegman, "Microwave Solid State Masers." McGraw-Hill, New York, 1964.
110. H. G. Dehmelt, Phys. Rev. 103, 1125 (1956).
111. H. G. Dehmelt, Phys. Rev. 106, 1924 (1957).
112. W. Franzen and A. G. Emslie, Phys. Rev. 108, 1453 (1957).
113. A. L. Bloom, Sci. Am. 203, 72 (1960).
114. T. R. Carver, Science 141, 599 (1963).
115. R. L. de Zafra, Am. J. Phys. 26, 646 (1960).
116. T. Skalinski, in "Topics of Radiofrequency Spectroscopy" (A. Gozzini, dir.), pp. 212-239. Academic Press, New York, 1962.
117. Ann Arbor Conf. Opt. Pumping 1959. Report edited by Univ. of Michigan, 1960.
118. G. V. Skrotskii and T. G. Izyumova, Soviet Phys.-Usp. (English Transl.) 4, 177 (1961).
119. J. Brossel, Quantum Electron. Symp. High View, N.Y., 1959 pp. 81-92. Columbia Univ. Press, New York, 1960.
120. J. Brossel, in "Advances in Quantum Electronics" (J. R. Singer, ed.), pp. 95-113. Columbia Univ. Press, New York, 1961.
120a. J. Brossel, Optical pumping, in "Quantum Optics and Electronics," pp. 189-327. Gordon & Breach, New York, 1964.
121. C. Cohen-Tannoudji, Ann. Phys. (Paris) [13] 7, 423 (1962).
122. A. R. von Hippel, "Foundations of Future Electronics" (D. B. Langmuir and W. D. Hershberger, eds.) pp. 1-35. Wiley, New York, 1961.
123. I. M. Popesco and L. N. Novikov, Compt. Rend. 269, 1321 (1962).
124. T. Skalinski, J. Phys. Radium 18, 890 (1958).
125. C. O. Alley, Thesis, Princeton University (1961).
126. L. W. Parsons and Z. M. Wiatr, J. Sci. Instr. 39, 292 (1962).
127. N. F. Ramsey, "Nuclear Moments." Wiley, New York, 1953.
128. P. L. Bender, E. C. Beaty, and A. R. Chi, Phys. Rev. Letters 1, 311 (1958).
129. A. L. Bloom, Phys. Rev. 118, 664 (1960).
130. P. L. Bender, Bull. Ampere 9, 621 (1960).
131. A. L. Bloom, Appl. Opt. 1, 61 (1961).
132. C. Cohen-Tannoudji, Compt. Rend. 262, 394 (1961).
133. P. L. Bender, Proc. 3rd Quantum Electron. Congr., 1963 p. 263. Columbia Univ. Press, New York, 1964.
134. M. Arditi and T. R. Carver, Phys. Rev. 124, 800 (1961).
135. J. P. Barrat and C. Cohen-Tannoudji, Compt. Rend. 262, 93 (1961); J. Phys. Radium 22, 329 and 433 (1961).
136. L. D. Schearer, Phys. Rev. 127, 512 (1962).
137. L. D. Schearer, F. D. Colegrove, and G. K. Walters, Rev. Sci. Instr. 36, 767 (1964).
138. H. G. Dehmelt, Rev. Sci. Instr. 36, 768 (1964).
139. B. Cagnac, Ann. Phys. (Paris) [13] 6, 467 (1961).
140. W. E. Bell, A. L. Bloom, and J. Lynch, Rev. Sci. Instr. 32, 688 (1961).
141. G. R. Brewer, Rev. Sci. Instr. 32, 1356 (1961).
142. V. B. Gerard, J. Sci. Instr. 39, 217 (1962).
143. F. A. Franz, Rev. Sci. Instr. 34, 589 (1963).
144. J. P. Gourber, Proc. 3rd Quantum Electron. Congr., 1963 p. 325. Columbia Univ. Press, New York, 1964.
145. J. Brossel, A. Kastler, and J. Marjerie, Compt. Rend. 241, 865 (1955).
146. H. G. Dehmelt, Phys. Rev. 106, 1487 (1957).
147. P. L. Bender, Ph. D. Thesis, Princeton University (1956).
148. C. Cohen-Tannoudji, Diplôme Etudes Supérieures, ENS, Paris (1956).
149. H. G. Dehmelt, E. S. Ensberg, and H. G. Robinson, Bull. Am. Phys. Soc. [2] 3, 9 (1958).
150. W. Franzen, Phys. Rev. 116, 859 (1959).
151. C. O. Alley, in "Advances in Quantum Electronics" (J. R. Singer, ed.), p. 120. Columbia Univ. Press, New York, 1961.
152. R. A. Bernheim, J. Chem. Phys. 36, 135 (1962).
153. M. A. Bouchiat, J. Phys. Radium 24, 379 and 611 (1963).
154. F. Grossetete, J. Phys. Radium 26, 383 (1964).
155. R. H. Dicke, Phys. Rev. 89, 472 (1953); 96, 340 (1954).
156. R. H. Dicke and J. P. Wittke, Phys. Rev. 96, 530 (1954).
157. J. P. Wittke, Ph. D. Thesis, Princeton Univ. (1955).
158. M. Arditi, Ann. Phys. (Paris) [13] 6, 973 (1960).
159. P. L. Bender, Ann Arbor Conf. Opt. Pumping, 1959 p. 111. Report edited by Univ. of Michigan, 1960.
160. P. L. Bender and T. L. Skillman, J. Geophys. Res. 63, 513 (1958).
160a. P. L. Bender, Phys. Rev. 128, 2 and 218 (1962).
160b. R. L. Driscoll, Phys. Rev. 136A, 54 (1964).
161. C. P. Slichter, "Principles of Magnetic Resonance." Harper, New York, 1963.
162. J. P. Barrat, Proc. Roy. Soc. A263, 371 (1961).
163. J. P. Barrat and C. Cohen-Tannoudji, J. Phys. Radium 22, 329 and 443 (1961).
164. M. I. Podgoretskij and O. A. Krustalev, Soviet Phys.-Usp. (English Transl.) 6, 682 (1964).
165. C. Cohen-Tannoudji, in "Topics of Radiofrequency Spectroscopy" (A. Gozzini, dir.), p. 240. Academic Press, New York, 1962.
166. J. Winter, in "Topics of Radiofrequency Spectroscopy" (A. Gozzini, dir.), p. 259. Academic Press, New York, 1962.
167. J. P. Heppner, N. F. Ness, C. S. Scearce, and T. L. Skillman, J. Geophys. Res. 68, 1 (1963).
168. K. A. Ruddock, Proc. 2nd Intern. Space Sci. Symp., Florence, 1961 p. 692. North-Holland Publ., Amsterdam, 1962.
169. F. D. Colegrove and P. A. Franken, Phys. Rev. 119, 680 (1960).
170. W. L. Clark, Rev. Sci. Instr. 33, 560 (1962).
171. L. D. Schearer, in "Advances in Quantum Electronics" (J. R. Singer, ed.), pp. 239-251. Columbia Univ. Press, New York, 1961.
172. F. D. Colegrove, L. D. Schearer, and G. K. Walters, Rev. Sci. Instr. 34, 1363 (1963).
173. G. B. Field and E. M. Purcell, Astrophys. J. 124, 1542 (1956).
174. H. G. Dehmelt, Phys. Rev. 109, 381 (1958).
174a. R. C. Greenhow, Phys. Rev. 136A, 660 (1964).
175. N. F. Ramsey, Rev. Sci. Instr. 28, 57 (1957).
176. H. G. Robinson, E. S. Ensberg, and H. G. Dehmelt, Bull. Am. Phys. Soc. [2] 3, 9 (1958).
177. C. Cohen-Tannoudji and J. Brossel, Compt. Rend. 244, 1027 (1957).
178. W. B. Hawkins, Phys. Rev. 123, 544 (1961).
179. L. Malnar, Proc. 3rd Quantum Electron. Congr., 1963 p. 305. Columbia Univ. Press, New York, 1964.
180. R. Karplus, Phys. Rev. 73, 9 (1948).
181. E. le Borgne and J. le Mouel, Inst. Phys. Globe, Paris, No. 2 (not published).
182. J. Lambe, A. H. Silver, J. E. Mercereau, and R. C. Jaklevic, Phys. Letters 11, 16 (1964); 12, 159 (1964).
183. J. E. Zimmerman and A. H. Silver, Phys. Letters 10, 47 (1964).
184. J. E. Zimmerman and A. H. Silver, Phys. Rev. 141, 367 (1966).
185. R. P. Feynman, R. B. Leighton, and M. Sands, "The Feynman Lectures on Physics," Vol. 3, p. 21-1. Addison-Wesley, New York, 1965.
186. G. F. Zharkov, Soviet Phys.-Usp. (English Transl.) 9, 198 (1966).
187. R. de Bruyn-Ouboter, M. H. Omar, A. J. P. T. Arnold, T. Guinau, and K. W. Taconis, Physica 32, 1448 (1966).
The Radio-Frequency Confinement and Acceleration of Plasmas

H. MOTZ
Department of Engineering Science, Oxford University, Oxford, England

AND

C. J. H. WATSON
Merton College, Oxford, England

Introduction
1. Single Particle Motions
   A. Introduction
   B. Guiding Center Theory for Arbitrary Radio-Frequency Fields
   C. Radio-Frequency Plus a Uniform Magnetostatic Field
   D. Radio-Frequency Plus a Nonuniform Magnetostatic Field
   E. The Cyclotron Resonance. (i) Standing Waves
   F. Particle Motion in a Traveling Electromagnetic Wave
   G. The Cyclotron Resonance. (ii) Traveling Waves
2. The Theory of Radio-Frequency Confinement of Plasma
   A. Derivation of the Self-consistent Field Equations
   B. The Energy-Momentum Tensor Approach
   C. One-Dimensional Equilibria
   D. Infinite Cylindrically Symmetric Equilibria
   E. Three-Dimensional Confinement
3. Theory of Combined Radio-Frequency and Magnetostatic Confinement of Plasma
   A. Derivation of the Self-consistent Field Equations
   B. One-Dimensional Equilibria with a Uniform Magnetostatic Field
   C. Low Pressure Plasma Distributions
4. Stability Theory
5. Application to Fusion Reactors
6. Experiments Related to Radio-Frequency Confinement
   A. Introduction
   B. Single Particle Confinement
   C. Electron Beam Focusing
   D. Direct Evidence of Radio-Frequency Confinement of Plasma
   E. Indirect Support for the Quasi-Potential Concept from Breakdown Measurements
7. The Theory of Radio-Frequency Acceleration of Plasma
   A. Purely Radio-Frequency Acceleration
   B. Acceleration Using Combined Radio-Frequency and Magnetic Fields
8. Experiments on Radio-Frequency Acceleration of Plasma
References
H. MOTZ AND C. J. H. WATSON
INTRODUCTION Most of the work described in this review was inspired by one or the other of two unattained goals of modern physics and engineering-the control of nuclear fusion and the acceleration of high density matter to relativistic velocities. The common feature of these two apparently unrelated projects is that in both cases the most hopeful working substance is a fully ionized plasma, which has to be stably confined in isolation from the material walls of the apparatus. The conventional approach is to use magnetostatic or quasi-magnetostatic forces for this purpose. As early* as 1956 however, the Soviet physicist Veksler (1956) pointed out that the force exerted on a plasma by a radio-frequency electromagnetic field could be quite substantial, and he suggested that a suitable rf field configuration might be capable of simultaneously confining a plasma and accelerating it. The following year, Knox (1957a,b) independently proposed that the rf fields which can be set up in a spherical resonant cavity should be used to confine a thermonuclear plasma. In the years immediately following this, a number of theoretical physicists in the Soviet Union, Britain, and America explored these possibilities in some detail. In the first instance, very crude models were used to represent the behavior of the plasma. On the one hand, it was taken to be a perfectly conducting fluid with sharp boundaries, at which all transverse electric fields had to vanish; on the other hand, it was supposed that the plasma could be treated as a bounded uniform dielectric medium of fixed dielectric coefficient (the value chosen being 1 - wp2/w2, where w was the frequency of the rf field and up some average plasma frequency). Soon, however, these “quasi-metallic” and “quasi-dielectric” models were replaced by an approximate magnetohydrodynamic theory. 
I t was shown that, provided the electric field gradient remained everywhere relatively small, it was possible to write down approximate expressions for the charge and current densities in the plasma, in terms of the local value of the electric field strength, and hence to obtain a set of self-consistent equations which determined the plasma and rf field configurations. At this time (19581960), attention was focused upon those solutions of these self-consistent equations which tended to confirm the validity of the quasi-metallic model, for the plausible reason that this model predicted the existence of confined equilibria and was significantly simpler than the quasi-dielectric model. Accordingly, stability analyses of these equilibria were carried out on the assumption that the use of this model had been justified-i.e., that
* There had indeed been earlier proposals, circulated as classified reports, e.g., Good (1953), but Veksler appears to have priority of publication.
R F CONFINEMENT AND ACCELERATION OF PLASMAS
the force which maintained the equilibrium was the radiation pressure of the magnetic component of the rf field alone. The results of this work were very discouraging to those with thermonuclear interests. It was shown that many of the confined plasma configurations, including the spherical configuration of Knox (1957a,b) and the cylindrical configuration of Boot et al. (1958), were unstable against certain deformations of shape. Furthermore, it was pointed out that the rf field strengths required to confine a thermonuclear plasma were so large that the dissipation of energy in the walls of the cavity would exceed the thermonuclear output, except possibly if the frequency were kept very low or the plasma density very high. The calculations of Boot et al. in this respect, which as we shall see were overoptimistic, indicated that a positive net power output could only be obtained if the value of $\omega_p^2/\omega^2$ was greater than about $10^7$. Unfortunately, as Weibel (1958b) pointed out, for such very large values of $\omega_p^2/\omega^2$ the electric field gradient at the plasma-radiation boundary, which is proportional to $\omega_p$, becomes sufficiently steep to invalidate the assumption underlying the self-consistent MHD theory upon which the calculations were based, and consequently (he implied) such configurations could not exist. As a result of this theoretical work, interest in the thermonuclear applications of rf confinement declined rapidly, and the few experiments which had been started were discontinued before any extensive results had been obtained. Fortunately the subject was kept open by those whose interests lay in the direction of acceleration. Accelerator physicists have never been deterred by a lavish consumption of rf power and, as a group of accelerator theorists (Levin et al., 1959; Askaryan et al., 1961) showed, it is by no means true that all confined plasma configurations are unstable on the quasi-metallic model.
Consequently, in 1961 some experiments were started at the Physics Institute of the Academy of Sciences in Moscow to test the principle of rf acceleration. One effect of these experiments, which have already produced some quite promising practical results, has been to reawaken the interest of a number of theoreticians in problems connected with the interaction of strong electromagnetic fields with plasmas. Perhaps the most important recent advance has been the demonstration by Silin that it is possible to solve the Vlasov equation for a plasma in a strong but spatially uniform rf electric field. As one of the present authors has shown, his approach can be extended to give an approximate kinetic theory of plasma in slowly varying nonuniform rf fields. This theory, which is described in Section 2, draws on the earlier work of Miller (1958~) on the motion of individual particles in rf fields, and it confirms the validity of the self-consistent MHD equations for plasmas with Maxwellian distributions. Examination of the conditions
H. MOTZ AND C. J. H. WATSON
under which these are valid however, shows that this equation can never be used to justify the quasi-metallic model of a plasma confined by rf fields. Essentially this is because the gradient of the electric field is related, through Maxwell’s equations, to the strength of the magnetic field; so whenever the magnetic field is so large compared with the electric field that the latter can be neglected (the condition which is assumed to hold at the plasma boundary on the quasi-metallic model), the electric field gradient is so large that the self-consistent field theory is inapplicable. Conversely, whenever the electric field gradient is sufficiently small for the theory to be applicable, the electric field penetrates into the plasma to a significant extent, in the manner qualitatively described by the quasi-dielectric model. An immediate consequence of this is that all stability analyses based on the quasi-metallic model are suspect, since a stability theory cannot be more reliable than the theory of the equilibrium upon which it is based. In the view of the present authors it is now an open question whether there exist configurations in which a plasma with some given plasma frequency is confined by an rf field of very much lower frequency and whether, if they exist, they are stable. Weibel’s argument (1958b) shows only that one cannot use the approximate self-consistent field equations to determine the equilibrium. As we shall see, however, there are qualitative arguments, based on the behavior of individual particles in the presence of steeply sloping electric fields, which suggest that such equilibria may exist but that the width of the boundary layer between the plasma and the radiation has to be rather greater than that predicted by the approximate self-consistent field theory. This possibility has led us to reconsider the feasibility of a thermonuclear reactor working on this basis. 
In Chapter 4 we show that, contrary to the optimistic conclusions of Boot et al. (1958), whatever assumptions one makes about the ratio $\omega_p^2/\omega^2$, the rf losses in the cavity walls would preclude a positive power balance at any plasma density which could be confined by attainable rf field strengths, even if the cavity is made of a highly conducting metal such as silver. However, the position has been transformed by a recent technological breakthrough: it has been established that rf cavities can be constructed from superconducting metals, kept at a temperature below their critical point, and that the cavity losses can then be reduced by a factor of over $10^6$. Research in this field is still in a very preliminary state, and there are theoretical indications that even greater reductions in the losses might be obtained; our calculations indicate that although the achievements so far are not quite sufficient, a 50-fold increase in the product of the surface admittance of the superconductor and its temperature of operation might make possible a reactor with positive power balance. In such a reactor both the bremsstrahlung and the rf dissipation (both in the plasma and in the walls of the cavity) would be negligible compared with the refrigeration effort required to keep the walls superconducting! The technological problems connected with the design of such a reactor would of course be formidable, but the same is true of any realistic proposal for a reactor using magnetic confinement.
In addition to proposals for purely rf confinement and acceleration, it has been suggested that the rf fields should be combined with a stationary magnetic field. In this context a distinction has to be made between two alternatives, which we shall describe as the "resonant" and "nonresonant" approaches respectively, which differ in the relationship chosen between the rf frequency $\omega$ and the electron cyclotron frequency $\Omega$. If $|\omega - \Omega|$ is greater than a quantity which is difficult to estimate precisely but is roughly of order $\omega v_0/c$, where $v_0$ is the thermal or directed velocity of the particles in the plasma, the oscillation of the rf field and the gyrations of the electrons in the magnetic field do not remain in phase for a significant length of time. Under these "nonresonant" conditions, if the magnetic field is uniform, the behavior of the plasma is qualitatively the same as in the rf field only, but the magnitude of the force exerted by the rf fields on the plasma is amplified by a factor $\omega/(\omega - \Omega)$ and it changes sign when $\Omega$ becomes greater than $\omega$. The application of this amplification effect in the theory of plasma confinement is discussed in Section 3, and its application to acceleration in Section 7. Its usefulness is unfortunately restricted by the fact that, as the plasma reaches high temperatures or velocities, the amplification factor decreases.
A further difficulty, mentioned in Section 4, is that the presence of a magnetic field considerably increases the tendency for microinstabilities to develop in the plasma, the consequences of which for confinement are rather difficult to predict. Nevertheless, the reality of this amplification effect has been demonstrated in the Russian experiments on plasma acceleration. If the magnetic field is nonuniform, it is possible to use the (likewise amplified) rf forces to improve the confinement properties of the magnetic field alone. Thus, for example, Johnston (1960) has suggested that one might hope to block up the loss cones of a mirror machine in this way. The chief difficulty with this proposal is that the magnetic field in any confinement device of thermonuclear interest is rather strong (~10 kG) and the associated cyclotron frequency is of order $10^{11}$ sec$^{-1}$. Thus the components of the rf electric field perpendicular to the static magnetic field (which alone are subject to the amplification effect) only enhance confinement if the rf frequency is somewhat greater than this: i.e., if its wavelength is less than about 1 cm. This raises both geometric problems and a very serious difficulty in supplying enough rf power to be useful.
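The orders of magnitude quoted above can be checked with a few lines of arithmetic. The following Python sketch evaluates the electron cyclotron frequency $\Omega = eB/m$ for a field of 10 kG (1 T) and the vacuum wavelength of radiation at that angular frequency. The SI constants, the function names, and the choice of exactly 1 T are illustrative assumptions, not part of the original discussion.

```python
import math

# SI values of the physical constants (assumed here for illustration).
E_CHARGE = 1.602176634e-19   # elementary charge, C
E_MASS = 9.1093837015e-31    # electron rest mass, kg
C_LIGHT = 2.99792458e8       # speed of light, m/s


def electron_cyclotron_frequency(b_tesla):
    """Angular electron cyclotron frequency Omega = e B / m, in rad/s."""
    return E_CHARGE * b_tesla / E_MASS


def vacuum_wavelength_cm(omega):
    """Vacuum wavelength (cm) of radiation of angular frequency omega."""
    return 2.0 * math.pi * C_LIGHT / omega * 100.0


b_field = 1.0  # 10 kG expressed in tesla
omega_c = electron_cyclotron_frequency(b_field)
wavelength_cm = vacuum_wavelength_cm(omega_c)

print(f"Omega      = {omega_c:.3e} rad/s")
print(f"wavelength = {wavelength_cm:.2f} cm")
```

Under these assumptions the cyclotron frequency comes out between $10^{11}$ and $2 \times 10^{11}$ sec$^{-1}$ and the corresponding wavelength close to 1 cm, in line with the estimate in the text.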
When the value of $|\omega - \Omega|$ becomes small enough for particles to remain in resonance for a significant period of time, the approximate self-consistent field theory of the behavior of the plasma and the fields breaks down. Under these circumstances it is exceedingly difficult to give an adequate account of the motion of even a single particle, let alone a whole plasma, except in situations of unrealistically simple geometry. In Section 1 we summarize the special cases in which the equations of motion of a single particle have been solved, either exactly or in an approximation whose validity can be assessed, and in Section 7 we indicate the way in which these restricted results have been used to predict the behavior of a plasma accelerator based on this "resonance" principle. In Section 8 we describe the experiments, on resonant confinement and acceleration, conducted by Consoli and his co-workers at Saclay; some of these have been strikingly successful and offer considerable prospects of further development.
The practical applications of the ideas which we have outlined above are many and various. Apart from the reopened question of the possibility of an rf thermonuclear machine, rf confinement may well prove useful in machines for the direct conversion of thermal energy into electricity. Rf plasma accelerators may be used to investigate statistically improbable fundamental particle transformations, which require very high beam densities if they are to be observed; or to inject hot plasma into a more conventional thermonuclear machine; or for space propulsion; or even [as suggested by Consoli (1963a)] to create very high vacua, by ionizing the residual gas and then accelerating it out of the cavity. But even if it should turn out that there are practical obstacles to all these applications, the theory of the interaction of rf fields with plasma may prove important in the understanding of certain astronomical phenomena, for example, in radio-stars.
It would not be the first time that human inventive activity has suggested mechanisms which explain cosmic phenomena. Finally, the forces exerted by rf fields on solid state plasmas have hardly been discussed in the existing literature. In the present review we do not attempt an exhaustive treatment of the whole literature. In particular, we have restricted ourselves to the interaction of natural rf modes with a plasma. For such modes the electric and magnetic field strengths, averaged over the space within which the mode is set up, are equal. It is of course possible to excite a cavity with forced oscillations, imposed by alternating currents in the cavity walls, such that the magnetic field strength is everywhere much larger than the electric field. Such “quasi-magnetostatic” rf modes exert forces on the plasma which cannot be analyzed by the methods described in this review, and we have accordingly excluded this case from our discussion.
1. SINGLE PARTICLE MOTIONS
A. Introduction
The exact motion of a charged particle in an arbitrary time-dependent electromagnetic field is one of the notoriously insoluble problems of classical dynamics. Until recently, however, the amplitudes of electromagnetic fields which either were known to arise in nature or could easily be generated in the laboratory were relatively small, and one could plausibly use a linearized approach: i.e., expand the actual particle orbit as a power series in the amplitude of the electric field $E$ about some unperturbed orbit, rectilinear or (in the presence of a uniform magnetic field) spiral. This approach clearly becomes inapplicable if the maximum velocity acquired by the particle under the influence of the field during one cycle, which can be estimated as $v_E \approx eE/m\omega$, becomes comparable with its unperturbed velocity. The practically realizable values of $v_E$ have risen sharply as a result of recent developments in microwave and laser technology. The present position might be summarized by saying that, for an electron, velocities $v_E \approx c$ are attainable at frequencies up to about $10^{11}$; and even at $\omega = 10^{15}$, the field strength from a ruby laser gives $v_E \approx 10^5$ cm/sec. For such field strengths a nonlinear approach is required. Fortunately, in recent years a number of analytical techniques have been developed which yield approximate nonlinear solutions of specifiable precision in all but the most intractable cases. For the most part, these techniques are variants of the method of averaging, first popularized in mathematical physics by Bogolyubov (1955). This method is in fact of very wide applicability; however, on the principle that approximate solutions are more credible if they reproduce exact solutions in special cases, we shall begin by considering a few cases where the symmetry properties of the fields make it possible to solve the equations of motion exactly.
We shall therefore discuss the motion of a particle of charge $e$ in the field of a plane electromagnetic wave which is derivable from a vector potential $\mathbf{A}_\perp(z, t)$ lying in the x,y plane. Since in this case the Hamiltonian

$$\mathcal{H} = (\mathbf{p} - e\mathbf{A}_\perp(z, t))^2/2m \qquad (1)$$

is independent of the coordinates x and y, we immediately obtain two constants of the motion, $p_x$ and $p_y$, related to the dynamic variables by

$$\mathbf{p}_\perp = m\mathbf{v}_\perp + e\mathbf{A}_\perp. \qquad (2)$$

This determines the motion in the x,y plane. In the z direction we have

$$\dot{p}_z = m\ddot{z} = -\partial\mathcal{H}/\partial z. \qquad (3)$$
At this point it becomes necessary to decide whether $\mathbf{A}_\perp$ represents a traveling or standing wave. This distinction is unimportant in linearized theory, where indeed it is customary to use a complex representation in which no such distinction is made; in nonlinear theory the difference is fundamental, as may be seen from the fact that for a traveling wave the quantity $A^2$ averaged over one cycle, $\overline{A^2}$, is constant, whereas for a standing wave it is a function of $z$. At this stage we shall consider only standing waves, since the traveling wave case can only be discussed adequately in a relativistic framework, which would obscure the argument at this point. We shall return to it later. A second decision which is required concerns the state of polarization of the wave. We shall later develop a method which makes it possible to consider quite arbitrary polarizations; here we shall consider only plane or circularly polarized waves and shall write
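$$\mathbf{A}_\perp(z, t) = A(z)[\cos\omega t,\ \nu\sin\omega t,\ 0], \qquad (4)$$

where $\nu = 0, \pm 1$ for plane and circularly polarized waves, respectively. Equation (3) then gives

$$m\ddot{z} = \frac{e}{m}\frac{dA}{dz}(p_x\cos\omega t + \nu p_y\sin\omega t) - \frac{\partial}{\partial z}\frac{e^2A^2}{2m}(\cos^2\omega t + \nu^2\sin^2\omega t). \qquad (5)$$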
This equation simplifies significantly if we choose $\mathbf{p}_\perp = 0$. The physical significance of this choice becomes clear if we average Eq. (2) over one cycle: $\mathbf{p}_\perp$ is the average particle momentum in the x,y plane. (It is tempting to suppose that in consequence one can always choose $\mathbf{p}_\perp = 0$ simply by transforming to a frame of reference in which $\bar{\mathbf{v}}_\perp = 0$. However, such a transformation also changes $\mathbf{A}_\perp$ in such a way that the form of (5) is unaltered.) If in addition we consider circularly polarized waves ($\nu = \pm 1$), Eq. (5) becomes exactly soluble and gives

$$\tfrac{1}{2}m\dot{z}^2 + e^2A^2(z)/2m = \mathscr{E} = \text{const}; \qquad (6)$$

that is, $\psi = e^2A^2/2m = e^2E^2/2m\omega^2$ acts as a potential well which tends to confine the particle in the neighborhood of a node of the wave. To consider the resulting motion in more detail, let us expand the vacuum wave

$$\mathbf{E} = E_0\sin kz\,[\cos\omega t,\ \pm\sin\omega t,\ 0], \qquad k = \omega/c, \qquad (7)$$

about one of its nodes. We obtain

$$\psi = \tfrac{1}{2}m(eE_0/mc)^2z^2 = \tfrac{1}{2}m\omega_J^2z^2.$$

Thus the particle oscillates harmonically with a frequency

$$\omega_J = eE_0/mc = (v_E/c)\,\omega. \qquad (8)$$
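As a rough numerical illustration of Eqs. (6)-(8), the Python sketch below evaluates $v_E = eE_0/m\omega$ and $\omega_J = eE_0/mc$ for an electron, and then integrates the motion in the full well $\psi \propto \sin^2 kz$ of a circularly polarized standing wave. In the scaled variables $\zeta = kz$, $\tau = \omega_J t$ the equation of motion implied by (6) is $d^2\zeta/d\tau^2 = -\tfrac{1}{2}\sin 2\zeta$, so that a small oscillation should reach the node after a quarter period $\tau = \pi/2$. The field strength (1 MV/m), the frequency (3 GHz), the SI constants, the velocity-Verlet integrator, and all function names are assumptions chosen for illustration only.

```python
import math

# SI constants (assumed values for illustration).
E_CHARGE = 1.602176634e-19   # C
E_MASS = 9.1093837015e-31    # kg
C_LIGHT = 2.99792458e8       # m/s


def quiver_velocity(e0, omega):
    """v_E = e E0 / (m omega): peak oscillation velocity in the rf field."""
    return E_CHARGE * e0 / (E_MASS * omega)


def trapping_frequency(e0):
    """omega_J = e E0 / (m c): small-oscillation frequency about a node."""
    return E_CHARGE * e0 / (E_MASS * C_LIGHT)


def quarter_period(zeta0, dtau=1.0e-3):
    """Scaled time for a particle released at rest at zeta0 > 0 to reach
    the node zeta = 0, from d2(zeta)/d(tau)2 = -0.5 sin(2 zeta),
    integrated with the velocity-Verlet scheme."""
    zeta, vel, tau = zeta0, 0.0, 0.0
    acc = -0.5 * math.sin(2.0 * zeta)
    while True:
        vel_half = vel + 0.5 * dtau * acc
        zeta_new = zeta + dtau * vel_half
        acc_new = -0.5 * math.sin(2.0 * zeta_new)
        vel = vel_half + 0.5 * dtau * acc_new
        if zeta_new <= 0.0:
            # Linear interpolation for the crossing time.
            return tau + dtau * zeta / (zeta - zeta_new)
        zeta, acc, tau = zeta_new, acc_new, tau + dtau


e0 = 1.0e6                      # hypothetical field amplitude, V/m
omega = 2.0 * math.pi * 3.0e9   # hypothetical rf angular frequency, rad/s

v_e = quiver_velocity(e0, omega)
omega_j = trapping_frequency(e0)
tau_quarter = quarter_period(0.01)

print(f"v_E/c        = {v_e / C_LIGHT:.4f}")
print(f"omega_J/omega = {omega_j / omega:.4f}")
print(f"quarter period (units of 1/omega_J) = {tau_quarter:.5f}")
```

The two printed ratios coincide, as Eq. (8) requires, and the quarter period found for a small initial displacement agrees with $\pi/2$ to within the accuracy of the integration. Larger initial displacements lengthen the period, since the well is only approximately parabolic.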
If on the other hand we consider a plane polarized wave ($\nu = 0$) and again consider the motion in the neighborhood of a node, Eq. (5) becomes

$$\ddot{z} + 2\omega_J^2 z\sin^2\omega t = \ddot{z} + \omega_J^2(1 - \cos 2\omega t)z = 0 \qquad (9)$$

(where $\omega_J$ has the same value as before if $E_0$ is interpreted as the rms value). Equation (9) is Mathieu's equation, and we can take over its known solutions. A general property of these solutions, demonstrable by Floquet's theorem, is that they can be written in the form $z = P(\omega t)\exp(\pm i\mu t)$, where $P$ is a periodic function of $\omega t$ and $\mu$ is a (possibly complex) constant. If $\mu$ has an imaginary part, the solution is described as unstable; the physical significance of this in the present context is that, whatever the initial energy of the particle, it is able to move arbitrarily far from the node (at least until the expansion of the field strength about the node breaks down). Conversely, if $\mu$ is real the particle remains trapped. The stability and instability regions for a given value of the parameter $\omega_J/\omega$ can be obtained from standard tables; in particular, the solution is stable for $0 \le \omega_J/\omega \le 0.83$. The nature of the solutions is discussed in (for example) Erdelyi (1953) (Section 16-2), who shows that for small $\omega_J/\omega$, $\mu \simeq \omega_J$ and $P(\omega t) = 1 + (\omega_J^2/\omega^2)P'(\omega t)$. Thus, in this limit the solution is strikingly similar to the exact solution obtained above in the case of circularly polarized waves; the only difference is a small rapidly oscillating correction of order $\omega_J^2/\omega^2$. As this parameter increases, however, the correction term likewise increases and the motion becomes significantly nonsinusoidal. Finally, when $\omega_J/\omega$ exceeds 0.83, $z$ begins to grow exponentially with time and the particle escapes. We shall now show that for small $\omega_J/\omega$ all the properties of this exact solution can be obtained by the method of averages. We attempt a solution of (9) of the form $z = z_{\rm osc} + z_s$, where $z_{\rm osc}$ represents the oscillatory motion on the time scale $2\pi/\omega$ and $z_s$ represents the smooth motion on the much longer time scale $2\pi/\omega_J$. Treating $z_{\rm osc}$ as

$$\ldots\ E_0^2, \qquad (18)$$
where $E_0$ is defined by $e^2E_0^2/4m\omega^2T = \mathscr{E}_0$, and $\alpha = \tfrac{1}{2}$ and $\tfrac{3}{2}$ for the monochromatic and Fermi distributions, respectively.
It is convenient at this stage to introduce dimensionless variables in place of $z$ and $E$; measuring $z$ on the scale of $c/\omega$ and $E$ on the scale of $(4m\omega^2T/e^2)^{1/2}$, we can rewrite (15) as

$$E'^2 + \psi(E) = \text{const} = C^2, \qquad (19)$$

where

$$\psi(E) = E^2 + (\omega_p^2/\omega^2)\exp(-E^2)$$
in the Maxwellian case and
in the case of the truncated distributions considered. In each case, the general form of $\psi(E)$ is seen to depend in a similar manner upon the value of $\omega_p^2/\omega^2$: for $\omega_p^2/\omega^2 \le 1$ it has a single minimum at $E = 0$, but for larger values it has two minima and a maximum at $E = 0$, and this property is readily shown from (16) to hold for all distribution functions which are functions of $\mathscr{E}$ only. To complete the solution we need to integrate (19). This topic has been discussed by Volkov (1959a), Sagdeev (1959), Weibel (1957), Self (1960), Cushing and Sodha (1959) and Motz (1963a,b), but none of these authors gives a wholly satisfactory account of it. Sagdeev considered only truncated distributions and only one special value of the constant $C$. The others were restricted by their MHD treatments to considering Maxwellian distributions. Of these, Volkov gives the most nearly comprehensive account, though his analysis of the case $C^2 = \omega_p^2/\omega^2$ is incomplete, and his scheme of dimensionless parameters obscures the role played by the ratio $\omega_p^2/\omega^2$. Cushing and Sodha's paper (1959) is justly criticized by Self (1960), who, however, bases his treatment on the rather unhelpful parameter $E_{\max}$ and gives little physical interpretation. Motz (1963) considers only the application to plasma confinement and in this context draws conclusions about plasma leakage which need modification in the light of the kinetic analysis, but he introduces the following mechanical analogy. Equation (19) is analogous to the first (energy) integral of the equation of motion of a classical mechanical particle in the potential well $\psi(E)$. The qualitative features of the motion thus follow by inspection from the form of the potential well, once the particle energy (corresponding to $C^2$) is given. In Fig. 2 we represent the shape of $\psi$ for the two cases: (i) $\omega_p^2/\omega^2 < 1$, (ii) $\omega_p^2/\omega^2 > 1$.
The dotted lines represent the contribution of the second term and consequently give a measure of the plasma density corresponding to that value of $E$. In case (i) the results are relatively simple. The minimum value of $C^2 = C_1^2$ gives us $E = 0$ everywhere. For $C^2$ slightly larger than $C_1^2$, $E \ll 1$ throughout the motion, and we can use the linearized equation, obtaining

$$E \simeq (C^2 - C_1^2)^{1/2}\sin[(1 - \omega_p^2/\omega^2)^{1/2}z]. \qquad (20)$$

In this limit the plasma density remains constant everywhere. As $C^2$ increases, we move out of the region where the linearized equation holds everywhere. The solution remains periodic, but the wavelength decreases and the waveform ceases to be sinusoidal, becoming most severely altered near the nodes [the analysis and a figure are given by Volkov (1959a)]. For $C^2 \gg 1$, the presence of the plasma (now confined to the neighborhood of the nodes) becomes unimportant in the wave equation, and we approach the vacuum solution $E = C\sin(\omega z/c)$ almost everywhere.
FIG.2
In case (ii) ($\omega_p^2/\omega^2 > 1$), we have to distinguish four classes of solution, corresponding to the four distinct choices for $C^2$: (a) $C^2 > C_1^2$, (b) $C^2 = C_1^2$, (c) $C_1^2 > C^2 > C_2^2$, (d) $C^2 = C_2^2$, where $C_2^2$ is the minimum possible choice of $C^2$, given by $1 + \log(\omega_p^2/\omega^2)$ for a Maxwellian and by $1 - [\alpha/(1 + \alpha)](\omega^2/\omega_p^2)^{1/\alpha}$ for a truncated distribution. With choice (a) the solution is periodic and qualitatively resembles the solution for large $C^2$ in the case $\omega_p^2/\omega^2 < 1$. Here, however, as $C^2$ is decreased toward $C_1^2$ the wavelength increases to infinity. That this must happen follows at once from the mechanical analogy: for small $C^2 - C_1^2$, the kinetic energy of the particle near the peak at $E = 0$ becomes small, and the period of its motion in the well increases rapidly, reaching infinity when $C^2 = C_1^2$, since it then comes to rest at the point $E = 0$. Mathematically, this follows from the linearized treatment, valid locally near $E = 0$, which gives

$$E \simeq (C^2 - C_1^2)^{1/2}\sinh\{[(\omega_p^2/\omega^2) - 1]^{1/2}z\} \qquad (21)$$

(we are restricted to this hyperbolic function since by hypothesis there exists a point where $E = 0$), showing that the value of $z$ for which $E$ becomes of order unity (which gives a lower bound to the wavelength) approaches infinity as $C^2 - C_1^2$ tends to zero.
Choice (b), $C^2 = C_1^2$, has the particular significance that for this value of $C^2$ only is it possible for both $E$ and $E'$ to vanish simultaneously at some point or points. Since $E'$ is proportional to $B$, the electromagnetic field vanishes altogether at these points, leaving pure plasma. Since, as we have seen, the wavelength is in this case infinite, there can at most be two such points. If there are two, they must be at $z = \pm\infty$, and we have the case considered by Volkov: two semi-infinite plasmas separated by (i.e., confining) a slab of radiation. If there is only one such point, we have the case considered by Sagdeev: a semi-infinite plasma separated from a semi-infinite domain of vacuum radiation by a single boundary layer. We may note that in the former case the width of the slab is indeterminate. The field profile in the boundary layer(s) grows exponentially away from the point(s) where the field vanishes until it leaves the linear region, when it continues to grow nonlinearly up to the value $E_{\max}$ determined by the equation $\psi(E_{\max}) = C_1^2$. Although the full width of the boundary layer is infinite, its effective width is of order $c/(\omega_p^2 - \omega^2)^{1/2}$. For still lower values of $C^2$, i.e., choice (c) ($C_1^2 > C^2 > C_2^2$), we once more obtain periodic solutions, but the oscillations are now between two positive (or negative) values of $E$, $E_{\min}$ and $E_{\max}$. The sign is not significant, as is clear if we remember that $E$ represents only the amplitude of the electric field; the field itself is $E$ multiplied by a sinusoidal function of time. Finally, for the choice (d) ($C^2 = C_2^2$), we have $E' = 0$ everywhere and hence the plasma is of constant density and the electric field everywhere has the same amplitude and sign, given by $\pm E_m$, where $\psi(E_m) = C_2^2$. The above discussion classifies the qualitatively different solutions of the nonlinear wave equation (14), using only two parameters $C^2$ and $\omega_p^2/\omega^2$, which characterize the field amplitude and plasma density, respectively.
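The classification just described can be summarized compactly for the Maxwellian case. The following Python sketch uses the dimensionless pseudo-potential $\psi(E) = E^2 + (\omega_p^2/\omega^2)\exp(-E^2)$, for which $C_1^2 = \psi(0) = \omega_p^2/\omega^2$ and, when $\omega_p^2/\omega^2 > 1$, $C_2^2 = 1 + \log(\omega_p^2/\omega^2)$. It returns the class of solution for a given pair ($C^2$, $\omega_p^2/\omega^2$) and locates the largest field amplitude from $\psi(E_{\max}) = C^2$ by bisection. The function names, the label strings, the tolerance, and the bisection routine are illustrative assumptions rather than part of the original analysis.

```python
import math


def psi(e, ratio):
    """Pseudo-potential psi(E) = E^2 + ratio * exp(-E^2), ratio = wp^2/w^2."""
    return e * e + ratio * math.exp(-e * e)


def critical_constants(ratio):
    """Return (C1^2, C2^2): psi at E = 0 and the minimum of psi."""
    c1_sq = ratio
    c2_sq = 1.0 + math.log(ratio) if ratio > 1.0 else ratio
    return c1_sq, c2_sq


def classify(c_sq, ratio, tol=1.0e-12):
    """Name the class of solution of E'^2 + psi(E) = C^2."""
    c1_sq, c2_sq = critical_constants(ratio)
    if c_sq < c2_sq - tol:
        return "no solution"
    if ratio <= 1.0:
        if abs(c_sq - c1_sq) <= tol:
            return "E = 0 everywhere"
        return "periodic, alternating sign"
    if abs(c_sq - c2_sq) <= tol:
        return "(d) uniform amplitude"
    if c_sq < c1_sq - tol:
        return "(c) periodic, fixed sign"
    if abs(c_sq - c1_sq) <= tol:
        return "(b) aperiodic boundary layer"
    return "(a) periodic, alternating sign"


def e_max(c_sq, ratio, iterations=200):
    """Largest amplitude reached, from psi(E_max) = C^2, by bisection on
    the branch where psi increases with E (assumes C^2 >= C2^2)."""
    lo = math.sqrt(max(0.0, math.log(ratio)))
    hi = math.sqrt(c_sq) + 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if psi(mid, ratio) < c_sq:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


ratio = 4.0  # hypothetical value of wp^2/w^2
for c_sq in (2.0, 1.0 + math.log(ratio), 3.0, 4.0, 5.0):
    print(f"C^2 = {c_sq:.3f}: {classify(c_sq, ratio)}")
print(f"E_max for C^2 = 3: {e_max(3.0, ratio):.4f}")
```

The sketch covers only the Maxwellian form of $\psi$; for the truncated distributions the values of $C_1^2$ and $C_2^2$ differ, although the ordering of the classes is the same.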
The classification turns out to be applicable to any distribution function; the effect of different choices of distribution function only becomes apparent when one considers the distribution in space of the plasma corresponding to these various classes of solution. Since the plasma density is given by Eq. (17), we see that for a Maxwellian, although the plasma is never wholly confined to a finite region of space, its density is exponentially small if (momentarily reverting to dimensional variables) $(\epsilon_0E^2/4n_0T)(\omega_p^2/\omega^2) \gg 1$. If $\omega_p^2 \le \omega^2$ there is no restriction on the maximum value of $E^2$, so the plasma can always be localized by making $E^2$ large enough, the bulk of the plasma sitting at the nodes of the wave, which is periodic but not exactly sinusoidal. For $\omega_p^2 > \omega^2$ the position is more complicated and in order to interpret it physically we need to know the relationship between $C^2$ and the maximum value of $E^2$ which occurs in a solution specified by $C^2$. It proves convenient to work with a quantity $M = (\omega^2/\omega_p^2)E^2$; in dimensional form $M = \epsilon_0E^2/4n_0T$ and is a measure of the electric pressure. By (19) we have that both the maximum and minimum values of $E^2$, corresponding to a solution specified by given $C^2$ and $\omega_p^2/\omega^2$, are determined by the transcendental equation

$$M + \exp[-(\omega_p^2/\omega^2)M] = (\omega^2/\omega_p^2)C^2. \qquad (22)$$
The qualitative features of the solution of this equation are most readily seen graphically. For the periodic solutions with alternating signs, $C^2 > C_1^2 = \omega_p^2/\omega^2$, so there is no upper bound on $M$ and the plasma density can again be made exponentially small at the antinodes of the wave simply by taking $E_{\max}^2$ large enough. For the (critical) nonperiodic case $C^2 = C_1^2$, we have $M + \exp(-\omega_p^2M/\omega^2) = 1$, a transcendental equation whose solution is always $M \simeq 1$ and hence $n \simeq n_0\exp(-\omega_p^2/\omega^2)$. Thus for this class of solutions the plasma density is only exponentially small if $\omega_p^2/\omega^2 \gg 1$. For the periodic but nonalternating solutions $C_1^2 > C^2 > C_2^2$, Eq. (22) has two solutions $M_{\max}$ and $M_{\min}$ (as may be seen by plotting the left-hand side graphically) corresponding to the maximum and minimum values of $E^2$. Since there now exists a minimum value of $E^2$, $\omega_p$ can no longer be interpreted as the maximum local value of the plasma frequency, which is given instead by the smaller quantity $\omega_{p\,\max}^2 = \omega_p^2\exp[-(\omega_p^2/\omega^2)M_{\min}]$. This is nevertheless still larger than $\omega^2$, as follows from the fact that $C^2 > C_2^2 = 1 + \log(\omega_p^2/\omega^2)$, which combined with the definition of $\omega_{p\,\max}$ and (22) gives

$$\frac{\omega_{p\,\max}^2}{\omega^2} - \log\frac{\omega_{p\,\max}^2}{\omega^2} = 1 + C^2 - C_2^2 > 1 \qquad (23)$$
and hence $\omega_{p\,\max}^2 > \omega^2$. Incidentally, $\omega_{p\,\min}^2 = \omega_p^2\exp[-(\omega_p^2/\omega^2)M_{\max}]$, which is in fact given by the smaller of the two roots of this equation, is less than $\omega^2$. Finally for $C^2 = C_2^2$, $\omega_{p\,\max}$ and $\omega_{p\,\min}$ coincide and equal $\omega$. The above analysis applies to Maxwellian plasmas; we should consider briefly what changes one should expect for a truncated equilibrium. It is not difficult to show that apart from the unimportant changes in the values of $C_1$ and $C_2$, the above analysis goes through unaltered, so the spatial distribution of $E$ in the Maxwellian and truncated cases is essentially the same. However, there are major differences in the plasma distribution because of the different dependence of the plasma density on $E$. Since for the truncated distributions $n(E)$ vanishes for $E \ge E_0$, it follows that any solution for which $E > E_0$ anywhere will exclude the plasma from the regions concerned. In particular, if $E_0 < E_{\max}$ is chosen, the periodic solutions will have separate plasma slabs located near the nodes (for the solution with alternating sign) or near the minima (for those with fixed sign) and the aperiodic solutions will consist of plasma half-spaces, fully confined by the radiation.
The extension of the above analysis to cover waves of arbitrary polarization is mathematically elementary but leads to a rather unexpected conclusion. In general, we can always write
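$$\mathbf{E} = \sqrt{2}\,[E_x(z)\sin(\omega t + \phi_x),\ E_y(z)\sin(\omega t + \phi_y),\ 0],$$

where we have added a normalization constant $\sqrt{2}$ so that $\overline{E^2} = E_x^2 + E_y^2$. Equation (13) now gives

$$\frac{d^2E_i}{dz^2} + \frac{\omega^2}{c^2}E_i = \frac{\omega_p^2}{c^2}E_i\exp\Bigl(-\sum_i\frac{e^2E_i^2}{2m\omega^2T}\Bigr)$$

or, in the appropriate dimensionless variables,

$$E_i'' + E_i = (\omega_p^2/\omega^2)E_i\exp\Bigl(-\sum_i E_i^2\Bigr).$$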
This system of equations possesses an integral (which is in fact its Hamiltonian)

$$\mathcal{H} = \sum_i E_i'^2 + \sum_i E_i^2 + (\omega_p^2/\omega^2)\exp\Bigl(-\sum_i E_i^2\Bigr).$$

Writing $E_x = A\cos\theta$, $E_y = A\sin\theta$, we have

$$\mathcal{H} = A'^2 + A^2\theta'^2 + A^2 + (\omega_p^2/\omega^2)\exp(-A^2).$$

Since $\mathcal{H}$ is independent of $\theta$ we have at once the analog of conservation of angular momentum, $L = A^2\theta' = \text{constant}$, and

$$\mathcal{H} = A'^2 + (L^2/A^2) + A^2 + (\omega_p^2/\omega^2)\exp(-A^2) = A'^2 + \psi(A^2),$$

where $\psi$ now possesses a centrifugal barrier at $A = 0$ except when $L = 0$, a condition which is easily interpreted as requiring a plane or circularly polarized wave. In consequence, $A$ can only oscillate between maximum and minimum values of the same sign; i.e., the alternating periodic and aperiodic solutions become impossible, and the field cannot vanish anywhere within the plasma. Transforming back to the variables $E_x$ and $E_y$ we obtain an oscillatory but nonperiodic motion of these variables precisely analogous to the motion of a classical particle in a non-Keplerian central field of force. The physical significance of these nonperiodic solutions may be seen from the observation that they could be Fourier-analyzed into a continuous spectrum of periodic waves. Thus they represent what might be described as a monochromatic turbulent state of the plasma.
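The conservation laws invoked here are easy to check numerically. The Python sketch below integrates the dimensionless system $E_i'' + E_i = (\omega_p^2/\omega^2)E_i\exp(-E_x^2 - E_y^2)$ with a fourth-order Runge-Kutta step and monitors the Hamiltonian, the quantity $L = E_xE_y' - E_yE_x'$ (which equals $A^2\theta'$), and the smallest amplitude $A$ reached. The value $\omega_p^2/\omega^2 = 4$, the initial conditions, the step size, and the function names are hypothetical choices made only for illustration.

```python
import math


def derivs(state, ratio):
    """Right-hand side for state = (Ex, Ey, Ex', Ey')."""
    ex, ey, dex, dey = state
    g = ratio * math.exp(-(ex * ex + ey * ey)) - 1.0
    return (dex, dey, g * ex, g * ey)


def rk4_step(state, dz, ratio):
    """One classical fourth-order Runge-Kutta step of length dz."""
    k1 = derivs(state, ratio)
    k2 = derivs(tuple(s + 0.5 * dz * k for s, k in zip(state, k1)), ratio)
    k3 = derivs(tuple(s + 0.5 * dz * k for s, k in zip(state, k2)), ratio)
    k4 = derivs(tuple(s + dz * k for s, k in zip(state, k3)), ratio)
    return tuple(
        s + dz * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def hamiltonian(state, ratio):
    """H = Ex'^2 + Ey'^2 + Ex^2 + Ey^2 + ratio * exp(-(Ex^2 + Ey^2))."""
    ex, ey, dex, dey = state
    a_sq = ex * ex + ey * ey
    return dex * dex + dey * dey + a_sq + ratio * math.exp(-a_sq)


def angular_momentum(state):
    """L = Ex * Ey' - Ey * Ex', equal to A^2 * theta'."""
    ex, ey, dex, dey = state
    return ex * dey - ey * dex


ratio = 4.0                    # hypothetical wp^2/w^2
state = (1.0, 0.0, 0.0, 0.3)   # hypothetical initial (Ex, Ey, Ex', Ey')
dz, steps = 0.01, 2000

h0 = hamiltonian(state, ratio)
l0 = angular_momentum(state)
min_amp = math.hypot(state[0], state[1])
for _ in range(steps):
    state = rk4_step(state, dz, ratio)
    min_amp = min(min_amp, math.hypot(state[0], state[1]))

h_drift = abs(hamiltonian(state, ratio) - h0)
l_drift = abs(angular_momentum(state) - l0)
print(f"L = {l0:.3f}, drift in H = {h_drift:.2e}, drift in L = {l_drift:.2e}")
print(f"smallest amplitude reached = {min_amp:.4f}")
```

With $L \neq 0$ the amplitude stays bounded away from zero throughout the run, which is the behavior the centrifugal barrier in $\psi$ implies. Setting the initial $E_y'$ to zero removes the barrier and allows the field to pass through zero when $C^2$ is large enough.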
Before leaving one-dimensional equilibria, some discussion should be given of the way in which such equilibria might be established. Under normal experimental conditions the plasma is either injected before switching on the rf power; or it is injected into a preexistent standing wave; or it is created by the rf fields, ionizing neutral gas in a resonant cavity. In each case, the electromagnetic fields will initially have a time dependence which is more complicated than the dependence $\sin\omega t$ which has been assumed in the above discussion, though it might be expected to settle down into one of the equilibria eventually. So far, no analysis of the transient behavior of the fields appears to have been attempted and the actual behavior is at best a matter for speculation. It is particularly difficult to be certain what would happen if rf power were applied to a plasma such that $\omega_p^2 > \omega^2$ and the power level corresponded to a periodic but nonalternating equilibrium. On linear theory, such radiation would not penetrate into the plasma, and it is not clear what conclusion one should draw from the fact that a nonlinear equilibrium exists in which the radiation has penetrated.
D. Infinite Cylindrically Symmetric Equilibria
We shall begin with the energy-momentum tensor approach. The representation of the divergence of a Cartesian tensor in a curvilinear coordinate system (such as cylindrical polar coordinates) requires some caution. The correct expression is given in (for example) McConnell (1957, p. 313). The interesting component is that which expresses conservation of radial momentum. This gives
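$$\frac{1}{r}\frac{\partial}{\partial r}\,rT_{rr} + \frac{1}{r}\frac{\partial}{\partial\theta}T_{r\theta} - \frac{1}{r}T_{\theta\theta} + \frac{\partial}{\partial z}T_{rz} + \frac{\partial}{\partial t}T_{rt} = 0. \tag{24}$$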
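If we take all quantities to be functions of r only and average over time we obtain for the TM$_0$ mode at cutoff
$$\frac{1}{2r}\frac{d}{dr}\,r(\epsilon_0E_z^2 + \mu_0H_\theta^2) - \frac{1}{2r}(\epsilon_0E_z^2 - \mu_0H_\theta^2) + \frac{dp}{dr} = 0 \tag{25}$$
or
$$\frac{d}{dr}\left[\tfrac{1}{2}(\epsilon_0E_z^2 + \mu_0H_\theta^2) + p\right] + \frac{\mu_0H_\theta^2}{r} = 0, \tag{26}$$
and for the TE$_0$ mode at cutoff,
$$\frac{1}{2r}\frac{d}{dr}\,r(\epsilon_0E_\theta^2 + \mu_0H_z^2) + \frac{1}{2r}(\epsilon_0E_\theta^2 - \mu_0H_z^2) + \frac{dp}{dr} = 0 \tag{27}$$
or
$$\frac{d}{dr}\left[\tfrac{1}{2}(\epsilon_0E_\theta^2 + \mu_0H_z^2) + p\right] + \frac{\epsilon_0E_\theta^2}{r} = 0, \tag{28}$$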
H . MOTZ AND C. J. H. WATSON
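where p = 2nm~v2Jd3v. With the help of Maxwell's equations, one can show that Eqs. (26) and (28) (with f chosen to be Maxwellian for simplicity), respectively, reduce to
$$\frac{1}{r}\frac{d}{dr}\left(r\frac{dE_z}{dr}\right) + \frac{\omega^2}{c^2}\left[1 - \frac{\omega_p^2}{\omega^2}\exp\left(-\frac{e^2E_z^2}{4m\omega^2T}\right)\right]E_z = 0 \tag{29}$$
and
$$\frac{d}{dr}\left[\frac{1}{r}\frac{d}{dr}(rE_\theta)\right] + \frac{\omega^2}{c^2}\left[1 - \frac{\omega_p^2}{\omega^2}\exp\left(-\frac{e^2E_\theta^2}{4m\omega^2T}\right)\right]E_\theta = 0, \tag{30}$$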
which are also the equations obtainable from (7). We shall deal with the TM$_0$ mode first, since several discussions of it are to be found in the literature; in particular, Boot et al. (1958), Weibel (1958), and Clauser and Weibel (1959) have undertaken numerical solutions of Eq. (29) for certain ranges of values of the parameters $\omega_p^2/\omega^2$ and $E_z(r = 0)$; Weibel has criticized the earlier work on the grounds that many of the solutions are incompatible with the assumptions upon which (29) is based, a criticism which is in our view well founded. We shall therefore consider the solutions in some detail, introducing an approximate analytic method of solution which puts the detailed numerical solutions into perspective. It will be seen that Eq. (26) differs from the otherwise analogous Eq. (10a) in that it does not make the quantity $\tfrac{1}{2}(\epsilon_0E^2 + \mu_0H^2) + p$ a constant of the motion. However, an analogous constant of the motion can be obtained by integrating (26) from 0 to r; this gives
$$\tfrac{1}{2}(\epsilon_0E_z^2 + \mu_0H_\theta^2) + p + \mu_0\int_0^r (H_\theta^2/r')\,dr' = \text{const}. \tag{31}$$
If we use the Maxwell equation $\partial E_z/\partial r = -\partial B_\theta/\partial t$ to eliminate $H_\theta$, transform to dimensionless variables ($E_z$ being measured on the scale $(4m\omega^2T/e^2)^{1/2}$ and r on the scale $c/\omega$), and drop the subscript z on E, Eq. (31) becomes
$$E'^2 + E^2 + (\omega_p^2/\omega^2)\exp(-E^2) + 2\int_0^r (E'^2/r')\,dr' = C^2. \tag{32}$$
In selecting the lower limit of integration as r = 0, it is necessary to confirm that the integrand is well behaved at this point. However, if we solve (29) (in dimensionless variables) for small r by a power series expansion in r, we obtain
$$E = E_0\left(1 - \tfrac{1}{4}\left[1 - (\omega_p^2/\omega^2)\exp(-E_0^2)\right]r^2 + O(r^4)\right) \tag{33}$$
and hence $E' = O(r)$, showing that $E'^2/r$ vanishes at r = 0 as required. Equation (32) has again a mechanical analog which enables us to classify its various types of solution; if we replace E by x and r by $t\sqrt{2/m}$ we obtain
$$\tfrac{1}{2}mv^2 + \phi(x) = \mathcal{E} - m\int_0^t (v^2/t')\,dt'; \tag{34}$$
i.e., it is the motion of a particle in the potential well $\phi = x^2 + (\omega_p^2/\omega^2)\exp(-x^2)$ subject to a dissipative force proportional to its velocity, but for which the drag coefficient is inversely proportional to time. Thus, qualitatively, the particle starts from some point on the edge of the well (at t = 0, v = 0, and $\int_0^t (v^2/t')\,dt' = 0$) and accelerates across the well, dissipating energy at a rate $mv^2/t$ as it does so, and after performing a number of oscillations of different periods and peak amplitudes finally comes to rest at the bottom of the well.
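The behavior described by the mechanical analog can be checked numerically. The following is a minimal sketch, not the procedure used by any of the authors cited: it assumes the dimensionless radial equation $E'' + E'/r + [1 - (\omega_p^2/\omega^2)\exp(-E^2)]E = 0$ implied by the small-r series (33), starts from that series just off the axis, and advances with a plain fourth-order Runge-Kutta step. The function name `integrate_tm0` and the parameters `wp2` (standing for $\omega_p^2/\omega^2$), `r_max`, and `h` are hypothetical. The left-hand side of Eq. (32) is accumulated along the way, so that its constancy can be inspected.

```python
import math

def integrate_tm0(E0, wp2, r_max=20.0, h=1e-3):
    """Integrate E'' + E'/r + [1 - wp2*exp(-E^2)] E = 0 outward from the axis.

    Returns lists (r, E, C2), where C2 is the left-hand side of Eq. (32):
    E'^2 + E^2 + wp2*exp(-E^2) + 2*int_0^r E'^2/r' dr'.
    """
    a = 0.25 * (1.0 - wp2 * math.exp(-E0 * E0))  # coefficient of r^2 in Eq. (33)
    r = h
    E = E0 * (1.0 - a * r * r)                   # series start, Eq. (33)
    dE = -2.0 * E0 * a * r                       # derivative of the series
    Q = 4.0 * (E0 * a * r) ** 2                  # 2*int_0^h E'^2/r' dr' from the series

    def rhs(r, E, dE):
        eps = 1.0 - wp2 * math.exp(-E * E)
        # derivatives of (E, E', Q)
        return dE, -dE / r - eps * E, 2.0 * dE * dE / r

    rs, Es, C2s = [], [], []
    n_steps = int(round((r_max - h) / h))
    for _ in range(n_steps + 1):
        rs.append(r)
        Es.append(E)
        C2s.append(dE * dE + E * E + wp2 * math.exp(-E * E) + Q)
        # classical fourth-order Runge-Kutta step
        k1 = rhs(r, E, dE)
        k2 = rhs(r + h / 2, E + h / 2 * k1[0], dE + h / 2 * k1[1])
        k3 = rhs(r + h / 2, E + h / 2 * k2[0], dE + h / 2 * k2[1])
        k4 = rhs(r + h, E + h * k3[0], dE + h * k3[1])
        E += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        dE += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        Q += h / 6 * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2])
        r += h
    return rs, Es, C2s
```

With `wp2 = 0` the equation is Bessel's equation of order zero, so the computed profile should follow $E_0J_0(r)$; with `wp2 > 1` the returned `C2` list should stay constant to within the integration error while the amplitude of E decays, which is the "friction" of the analogy.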
As in the one-dimensional problem of the previous section, the potential well has one minimum at the origin if $\omega_p^2/\omega^2 \le 1$ and two minima displaced from the origin if $\omega_p^2/\omega^2 > 1$ (see Fig. 3). In the former case, only one class of (oscillatory) motion exists, which is qualitatively similar (as regards the maximum and minimum values of E) to the Bessel function $J_0(r)$ and indeed goes over to $J_0$ as $\omega_p \to 0$. For $\omega_p^2/\omega^2 > 1$, a large number of distinct classes of motion exist, examples of which are illustrated in Figs. 3(i) to 3(iii). Class (i) corresponds to the case $C^2 > C_1^2$ in the one-dimensional problem above, but differs from it in that after a finite number of oscillations with alternating sign the particle must either come to rest at the origin or transfer to oscillations without alternation of sign. Class (ii) corresponds to the case $C^2 = C_1^2$ (though clearly in the present context $C^2$ will have to be larger than $\omega_p^2/\omega^2$ to allow for the dissipation of "energy" en route); the equilibrium described is a radiation column confined by a plasma pressure which decreases as one moves inwards, initially exponentially but with a nonlinear cutoff as E approaches $E_{\max}$. Class (iii) corresponds to the case $C^2 < C_1^2$. Here, however, there are two distinct types of solution, depending on whether $E_0 = E(r = 0)$ is the larger or smaller solution of the equation $\phi(E_0) = C^2$. The two possibilities are illustrated in the regions (a) and (b) of Fig. 3(iii). It will be seen that in
case (a) the plasma is excluded from the origin (since it is the point where E is maximum) and it is therefore concentrated in concentric shells at varying radii, the distinctness of which gets blurred as r increases, merging eventually into a uniform plasma of density $n_0\exp(-E_\infty^2)$. In case (b), on the other hand, the plasma density is maximum on the axis, though there is again an infinite set of concentric shells which eventually merge into the same density. All the computed solutions described in the literature appear to describe equilibria of class (iii). Both authors, however, seek to avoid the awkward fact that these have plasma and a constant field amplitude at infinity instead of the combination of vacuum radiation and no plasma which would enable one to insert the perfectly conducting walls of a waveguide at some suitable radius. They do so by cutting off the plasma at some radius where its density is exponentially small and then matching the radiation into a solution of the vacuum field equations at that radius. These solutions are not true equilibria, since for a Maxwellian distribution the plasma density nowhere vanishes, but the leakage which would occur can be made negligible if $\omega_p^2/\omega^2 \gg 1$. Naturally, if the distribution function were non-Maxwellian, and such that the plasma density were exactly zero at some radius, there would be no approximation involved in performing this matching operation. Graphs obtained by the numerical integration of Eq. (29) are given in the papers of both authors; Weibel considers only a solution of class (iii) (b) in which the matching operation is performed at the first minimum of the plasma density, whereas Boot et al. plot solutions of class (iii) (a) or (b), with the matching performed at the first or second minimum. (In interpreting their paper, it is necessary to read U = E, $n_0/n_c = (\omega_p^2/\omega^2)\exp(-E_0^2)$, and $n/n_c = (\omega_p^2/\omega^2)\exp(-E^2)$.) Some of the graphs which they obtained are illustrated in Fig. 4. They also give a chart showing the plasma radius, maximum electric field, and "fractional tuning" (a measure of the displacement of the wall of the surrounding waveguide required by the presence of the plasma) for a large spectrum of values of $n_0/n_c$ in the range $n_0/n_c > 1$; i.e., in our notation $(\omega_p^2/\omega^2)\exp(-E_0^2) > 1$. Inserting this condition into the series solution for small r [Eq. (33)] we see that this describes a mode in which E increases from r = 0, and hence it identifies the solution as being of class (iii) (b). The qualitative features of their chart can all be derived from the mechanical analogy. Both Weibel and Boot et al. comment on the fact that for the case $\omega_p^2/\omega^2 \gg 1$, E, 1. In consequence, the role played by the "dissipative" term $2\int_0^r (E'^2/r')\,dr'$ is rather unimportant; it somewhat reduces
the maximum values of $E'$ and E for given $C^2$ but does not affect the qualitative features of the solution. If for simplicity we neglect it, we can use (32) to determine the maximum value of $E'$ and the value of E at which this occurs:
$$(E')^2_{\max} \equiv B_c^2 = C^2 - \log(\omega_p^2/\omega^2) - 1, \qquad E_c^2 = \log(\omega_p^2/\omega^2),$$
and hence
$$\frac{B_c^2}{E_c^2} = \frac{C^2 - \log(\omega_p^2/\omega^2) - 1}{\log(\omega_p^2/\omega^2)} \sim \frac{\omega_p^2/\omega^2}{\log(\omega_p^2/\omega^2)}.$$
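As a quick arithmetic check of these expressions, the sketch below evaluates $E_c^2$, $B_c^2$, and their ratio directly from Eq. (32) with the integral term neglected. The function name `critical_fields` and its arguments are hypothetical; `wp2` stands for $\omega_p^2/\omega^2$ and must exceed unity. The sample call assumes $E_0 = 0$, for which Eq. (32) evaluated on the axis gives $C^2 = \omega_p^2/\omega^2$.

```python
import math

def critical_fields(wp2, C2):
    """Return (Ec2, Bc2, Bc2/Ec2) from Eq. (32) without the integral term.

    wp2 -- omega_p^2 / omega^2 (assumed > 1)
    C2  -- constant of integration in Eq. (32)
    """
    Ec2 = math.log(wp2)      # value of E^2 at which E'^2 is largest
    Bc2 = C2 - Ec2 - 1.0     # the largest value of E'^2
    return Ec2, Bc2, Bc2 / Ec2

# Sample case: field vanishing on the axis, so C2 = wp2.
Ec2, Bc2, ratio = critical_fields(10.0, 10.0)
print(round(ratio, 1))       # about 2.9
```

For much larger `wp2` the same function shows the ratio growing roughly as `wp2 / log(wp2)`, which is the asymptotic form quoted above.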
Thus although the maximum values of E and B are both of order $\omega_p/\omega$, the maximum in B is reached first. The corresponding density profile can be inferred from the fact that for Eo > 1, n, > El and that the electric field is small both at its surface and within it. This confirms analytically what Weibel and Boot et al. showed numerically: that for $\omega_p^2/\omega^2 \gg 1$, the equilibria are quasi-metallic. The same argument shows, however, that as $\omega_p^2/\omega^2 \to 1$ the equilibrium loses this quasi-metallic character. The boundary can still be quite sharp, as a result of the nonlinear variation of E as it approaches $E_{\max}$, and the radiation can still be quite effectively excluded from the center of the plasma (if $C^2$ is suitably chosen), but E and B are of the same order of magnitude everywhere. For example, if $E_0 = 0$ and $\omega_p^2/\omega^2 = 10$, $(B_c/E_c)^2 = 2.9$, but $E_{\max}^2 = 10$ and at this point $n = n_0e^{-10}$. Such equilibria are radically different from those considered by Weibel and Boot et al. and might appropriately be described as "quasi-dielectric." It is important to note that strictly speaking the self-consistent field theory developed in this chapter is only valid for these quasi-dielectric equilibria, since for large $\omega_p^2/\omega^2$ the length scale $L_E$ for the nonuniformity of E near the plasma boundary can be estimated from the above expression as
$$L_E \simeq \left[\frac{\omega_p^2/\omega^2}{\log(\omega_p^2/\omega^2)}\right]^{-1/2}\frac{c}{\omega}$$
which becomes much less than the length scale $c/\omega$ for vacuum radiation and hence invalidates the method of averaging upon which the concept of the quasi-potential depends. Since no alternative technique has been developed, what happens as $\omega_p^2/\omega^2$ is increased is a matter for speculation. It is unfortunately very important that we should express a view on this question, however, for, as we shall see, the feasibility or otherwise of an rf thermonuclear reactor turns precisely upon it. In the early papers on the applications of rf confinement to thermonuclear fusion (e.g., Boot et al.), reactor designs were proposed in which $\omega_p^2/\omega^2$ was as large as $10^7$. These designs were criticized by Weibel on the grounds that they depended on a theory of the rf equilibrium which was invalid for such large values of $\omega_p^2/\omega^2$. This objection is, as we have seen, well founded; it does not, however, follow that there cannot exist high density confined equilibria which are qualitatively similar to those obtained from this (strictly inapplicable) theory. Some insight into the question can be obtained by considering the exact theory of particle motions in a plane rf wave, discussed in Section 1. We saw then that particles in the neighborhood of the node of a plane wave $E_0\sin kz$ remain confined if $e^2E_0^2k^2/m^2\omega^4 < 0.7$; that is, if $(v_E/c)(kc/\omega) < 0.83$. Thus, although the averaging theory upon which the quasi-potential concept depends is valid only if $(v_E/c)(kc/\omega)$ 1 there exist equilibria in which the electric field on the axis can be made arbitrarily small. These correspond to a choice of the constant of integration $C^2$ of Eq. (32) very slightly less than $\omega_p^2/\omega^2$. From the analogy of the motion of a particle it is clear that the duration of the first transit of the particle across the well increases sharply as $(\omega_p^2/\omega^2) - C^2$ tends to zero and hence that the
radius of plasma over which the radiation fields remain small (and hence the plasma density stays nearly constant) increases correspondingly. Nevertheless, the maximum field amplitude eventually reached is as large as is compatible with $E_0 \approx 0$, and it is of order $\omega_p^2/\omega^2$; so, provided that this is somewhat larger than unity (say ~10) we can take the plasma density to be exponentially small at the first maximum of E and match on to a vacuum mode at this point. As we have seen, we cannot trust the quasi-potential theory of the structure of the boundary layer for values of $\omega_p^2/\omega^2$ much greater than 10, but there is still hope that such configurations, consisting of a large cylinder of plasma isolated from a conducting wall by a relatively thin rf layer, might exist for much larger values of $\omega_p^2/\omega^2$. These equilibria, in which the ratio of the plasma volume to the volume of space filled with rf can in principle be made indefinitely large, were overlooked by Boot et al., presumably because they only exist for a narrow range of values of the constant $C^2$. Indeed, one practical problem which they raise is the delicate adjustment of the rf power level needed to maintain them, and some feedback mechanism would presumably be required. One feature of Weibel's numerical calculations should be mentioned. Throughout the above discussion we have assumed that quasineutrality is maintained not only in the region containing the bulk of the plasma (where it undoubtedly is) but also in the low density tail (where it is not). Weibel has integrated the more exact field equation, with
and Poisson’s equation
concurrently by an iterative numerical method and shown that an ion sheath develops in the low density tail, as one would expect. We now turn to the cutoff cylindrical TE$_0$ mode where the wave equation is given by Eq. (30). As before, we can obtain a first integral from the energy-momentum tensor theorem; in the (by now) familiar dimensionless units we have
$$E'^2 + E^2 + (\omega_p^2/\omega^2)\exp(-E^2) + 2\int_0^r (E^2/r')\,dr' = C^2 \tag{35}$$
in analogy with Eq. (32). As before we can show by series solution for small r that E vanishes as r for small r, and hence that the integral is well behaved near r = 0, and indeed that the contribution of this term to the left-hand side of (35) goes to 0 with r. Thus it again acts as a kind of pseudofriction (to draw on the mechanical analogy once more), though the rate of dissipation of energy is now proportional to the square of the displacement of the particle instead of the square of its velocity. It is clear that this makes only minor quantitative differences to the actual motion of the representative particle; however, we must now start the motion from points where E = 0, so the classification of solutions becomes different. Examples of the various classes are illustrated in Fig. 5 for the case $\omega_p^2/\omega^2 > 1$. The physically interesting class is (iii): by choosing $C^2$ slightly larger than $\omega_p^2/\omega^2$ we can again construct a solution in which there is a plasma of large radius with a very small trapped rf field, and we can match into the vacuum at the first field maximum.
We can rapidly dismiss cutoff cylindrical waveguide modes of lower symmetry; since these necessarily have azimuthal nodes, the corresponding quasi-potentials do not possess absolute minima and consequently cannot be used to confine plasma. Combinations of such modes could be used, but the need to use two generators of different frequency and the extra mathematical complexity of the theoretical analysis render this an unattractive approach.
E. Three-Dimensional Confinement
Three classes of three-dimensional confinement configuration which rely entirely upon rf confinement appear to have been proposed in the literature, and several more have been proposed in which the rf field supplements a magnetostatic field. The simplest of the former is a configuration in which one of the cutoff TE$_0$ or TM$_0$ modes discussed in the preceding section is bent around into a torus. No detailed calculations appear to have been made, but it seems probable that for sufficiently small torus aspect ratio the cylindrical theory would be approximately applicable. If this were correct, the difficulty raised by the condition $(\mathbf{E}\cdot\nabla)E^2 = 0$ need not arise, since, as we have seen, Poincaré's theorem shows that this condition can be met on surfaces of toroidal topology. For all the other equilibria discussed below, the use of the equation
$$\nabla\times\nabla\times\mathbf{E} = (\omega^2/c^2)\,\epsilon\,\mathbf{E} \tag{7}$$
is open to objection for the reasons we gave in Section A. Nevertheless it seems plausible that this might give an approximate representation of the field configuration which would arise if the condition $(\mathbf{E}\cdot\nabla)E^2 = 0$ were relaxed as a result of one of the effects discussed above. The two further classes of three-dimensionally confined plasmas involve rf modes in cylindrical cavities of finite length and spherical cavities respectively. For such equilibria, the energy-momentum tensor approach used above is of little value, since two or more components of the divergence of the tensor have nontrivial content. One is therefore forced to work with Maxwell's equations in their fully differential form. This has the unappealing consequence that one has to solve Eq. (7), a nonlinear, partial differential equation in at least two independent variables and (usually) two or more dependent variables, or some equivalent equation or set of equations. The number of independent variables is irreducible in the nature of the problem. (It can be shown from $\nabla\cdot\mathbf{B} = 0$ that no single mode exists for which the fields have spherical symmetry.) Attention has therefore been directed towards reducing the number of dependent variables, if possible down to a single scalar function from which all the fields can be derived. For vacuum modes there exists a considerable literature on this subject [see, for example, Nisbet (1955) for a comprehensive discussion, or Stratton (1941) or Panofsky and Phillips (1962) for simpler accounts]. Unfortunately, the methods developed in that context are of little value when $\epsilon$ is a (nonlinear) function of E, and the most that they can supply is a classification of the vacuum modes to which the solutions of (7) might go over in the limit $\omega_p \to 0$.
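However, as we shall see, there may be solutions of (7) which cannot exist in the limit $\omega_p \to 0$ (e.g., if they only exist for $\omega_p^2 > \omega^2$) or for which the topology of the fields changes discontinuously when $\omega_p^2 > \omega^2$, so such classifications could be misleading. We shall first consider equilibria in cylindrical cavities. In cylindrical polar coordinates, the wave Eq. (7) has the form (with $\partial/\partial\theta = 0$)
$$-\frac{\partial^2E_r}{\partial z^2} + \frac{\partial^2E_z}{\partial r\,\partial z} = \frac{\omega^2}{c^2}\,\epsilon\,E_r, \tag{36}$$
$$-\frac{1}{r}\frac{\partial}{\partial r}\,r\frac{\partial E_z}{\partial r} + \frac{1}{r}\frac{\partial}{\partial r}\,r\frac{\partial E_r}{\partial z} = \frac{\omega^2}{c^2}\,\epsilon\,E_z, \tag{37}$$
$$-\frac{\partial}{\partial r}\frac{1}{r}\frac{\partial}{\partial r}(rE_\theta) - \frac{\partial^2E_\theta}{\partial z^2} = \frac{\omega^2}{c^2}\,\epsilon\,E_\theta. \tag{38}$$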
It is clear from these equations that $E_\theta$ is decoupled from the $E_r$ and $E_z$ components, but that the latter are conditionally coupled, in the sense that $E_z$ can exist without $E_r$ only if $\partial^2E_z/\partial r\,\partial z = 0$ and $E_r$ can exist without $E_z$ only if $(1/r)(\partial/\partial r)\,r\,\partial E_r/\partial z = 0$. The former condition is met in the TM$_0$ mode at cutoff, a case which we have already discussed; the latter requires $E_r = A(z)/r$, which excludes such modes for an empty cylinder but not in a coaxial line, for which it gives the electric field of the TEM (Lecher) mode. This raises the question as to whether a plasma column can act as the central conductor for a TEM mode. The answer is negative if a TEM mode is defined as one for which $E_z = 0$, for if $E_r = A(z)/r$ is substituted in (36) it is seen that no such solution exists, since $\epsilon = \epsilon(r, z)$. Nevertheless, it does appear possible that a TEM mode, in the sense of a mode possessing no frequency cutoff but having both $E_r$ and $E_z$ components, might exist; it would be analogous to a vacuum TEM mode in a coaxial line with a bumpy inner cylinder. Equation (38) is the equation for the total electric field of the TE$_0$ modes. Thus, to summarize, all the fields of a TE$_0$ mode at or above cutoff can be derived from the scalar $E_\theta$ which satisfies (38); the fields of a TM$_0$ mode at cutoff can be derived from the scalar $E_z$ which satisfies (37) with $E_r = 0$; but neither the TM$_0$ modes above cutoff nor the TEM modes are determined by a single component of E, and for both of these one needs the coupled equations (36) and (37). This is not to say that it is in principle impossible to find a scalar function for these modes, and indeed, as Motz has shown, the quantity $H_\theta$ can under some circumstances be used in this way to derive the TM$_0$ modes. To see this we note that the wave equation for the magnetic field
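$$\nabla\times\frac{1}{\epsilon}\nabla\times\mathbf{H} = \frac{\omega^2}{c^2}\mathbf{H} \tag{39}$$
for the TM$_0$ mode ($\mathbf{H} = H_\theta$) gives
$$\frac{\partial}{\partial r}\frac{1}{\epsilon r}\frac{\partial}{\partial r}(rH_\theta) + \frac{\partial}{\partial z}\frac{1}{\epsilon}\frac{\partial H_\theta}{\partial z} + \frac{\omega^2}{c^2}H_\theta = 0. \tag{40}$$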
This equation is unfortunately of very little analytic value for two reasons. First, it has singularities at points where $\epsilon = 0$. This means that considerable mathematical difficulties arise for any confined plasma whose maximum density is greater than critical. (It does not follow that such plasmas cannot exist; indeed the converse is demonstrable, for the singularity in (40) is still present if the mode is excited at cutoff, whereas for this case we have already shown that confined supercritical plasma equilibria exist.) Secondly, $\epsilon$ is not an explicit function of $H_\theta$, but is determined by the intractable implicit equation
$$\epsilon = 1 - \frac{\omega_p^2}{\omega^2}\exp(\,\cdots). \tag{41}$$
Nevertheless, (40) is a very convenient form for a numerical integration, provided that the plasma remains subcritical. The results of such an integration will be outlined shortly. This difference between the TE$_0$ and TM$_0$ modes (that the former can and the latter in general cannot be derived from a single component electric field) is related to another difference, first commented on by Miller: the quasi-potentials of the vacuum fields have quite different topological properties in the two cases. The TE quasi-potential $\psi_{TE} = R(r)\sin^2 kz$ has a factor R(r) which has a minimum at r = 0. However, since $\psi_{TE}$ vanishes at the planes where $\sin kz = 0$, it does not possess any absolute minimum and hence cannot confine single particles or low density plasmas. The TM quasi-potential, however, is $\psi_{TM} = R_1(r)\sin^2 kz + R_2(r)\cos^2 kz$, since the two electric field components $E_r$ and $E_z$ are out of phase, and hence it does possess an absolute minimum.
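The topological difference can be made concrete with a small numerical sketch. It assumes the standard vacuum mode shapes of a cylindrical cavity, which the text does not write out: $E_\theta \propto J_1(\kappa r)\sin kz$ for the TE$_0$ modes, and $E_z \propto J_0(\kappa r)\cos kz$, $E_r \propto (k/\kappa)J_1(\kappa r)\sin kz$ for the TM$_0$ modes, so that $R = J_1^2$, $R_1 = (k/\kappa)^2J_1^2$, and $R_2 = J_0^2$ up to a common constant factor that is dropped here. The function names and the parameter `k_over_kappa` are hypothetical, and the Bessel functions are evaluated from their power series to keep the example self-contained.

```python
import math

def j0(x, terms=30):
    """Bessel function J0 from its power series (adequate for small x)."""
    return sum((-1) ** k * (x * x / 4.0) ** k / math.factorial(k) ** 2
               for k in range(terms))

def j1(x, terms=30):
    """Bessel function J1 from its power series (adequate for small x)."""
    return sum((-1) ** k * (x / 2.0) ** (2 * k + 1)
               / (math.factorial(k) * math.factorial(k + 1))
               for k in range(terms))

def psi_te(kr, kz):
    """TE0 quasi-potential shape R(r) sin^2(kz), with R = J1^2 (kr = kappa*r)."""
    return j1(kr) ** 2 * math.sin(kz) ** 2

def psi_tm(kr, kz, k_over_kappa=1.0):
    """TM0 quasi-potential shape R1(r) sin^2(kz) + R2(r) cos^2(kz)."""
    return ((k_over_kappa * j1(kr)) ** 2 * math.sin(kz) ** 2
            + j0(kr) ** 2 * math.cos(kz) ** 2)

# TE0: the potential is zero all along the plane kz = 0, so a particle
# can drift radially without climbing any barrier.
print([psi_te(x, 0.0) for x in (0.5, 1.0, 1.5)])

# TM0: the zero at (r = 0, kz = pi/2) is surrounded by positive values.
ring = [psi_tm(0.3 * math.cos(t), math.pi / 2 + 0.3 * math.sin(t))
        for t in [-math.pi / 2 + i * math.pi / 20 for i in range(21)]]
print(min(ring) > 0.0)
```

Under these assumptions the TE$_0$ shape has a whole surface of zeros, whereas the TM$_0$ shape has an isolated zero on the axis midway between the planes where $\sin kz = 0$.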
TABLE I
For $\omega_p^2/\omega^2$ = 0.920 825

0.079 175   0.318 351   0.564 830   0.753 057   0.875 505   0.942 354   0.973 457   0.986 441   0.991 456
0.561 113   0.652 824   0.768 057   0.860 625   0.922 462   0.955 815   0.977 910   0.987 013   0.991 024
0.874 243   0.900 640   0.932 928   0.957 063   0.972 292   0.981 688   0.987 263   0.990 306
0.986 909   0.988 516   0.991 031   0.992 854   0.993 854
Self has investigated the shape of this minimum numerically and shown that it is of optimum depth for single particle confinement when the ratio of the radius to the length of the cylindrical cavity is 0.6. However, the presence of a plasma of significant density must alter the shape and depth of the minimum; a numerical calculation of this effect has been made by Motz, who computed solutions to (40) and (41) for this optimum cavity shape by an iterative procedure, obtaining the modified cavity resonance frequency and plasma and field distributions for a number of subcritical plasmas. (As previously indicated, (40) is invalid for supercritical plasmas, so no results for such plasmas were obtained.) The solution for one particular set of initial conditions is given in Table I, which gives the value of $\epsilon$ at various points of a discrete lattice, with rows and columns corresponding to the z and r directions respectively. The resonance frequency obtained was $\omega^2 = 0.06453c^2/a^2$ as against the vacuum resonance frequency of $0.06437c^2/a^2$, showing that a high Q cavity would be significantly detuned by even this amount of plasma. The consequent need to retune the oscillator during the build-up of the plasma is a matter of some experimental difficulty at the present time. It is clear that in this case, and indeed for all subcritical plasmas, the ratio of the plasma volume to the volume filled with rf field is very unfavorable. No attempt appears to have been made, however, to compute supercritical equilibria, in which the ratio might be much more acceptable, on the basis of Eqs. (36) and (37). The supercritical radially confined plasma equilibria obtained from Eq. (38) at cutoff (see above) give grounds for hope that a TE$_0$ mode sufficiently above cutoff might confine a supercritical plasma even though it cannot confine single particles; i.e., the plasma would have to make its own field minimum at the center of the cavity. Ultimately, it would be necessary to test this suggestion by the numerical solution of (38); that a positive result is likely can be made plausible by the following considerations. For small r, we can solve (38) approximately by expanding $E_\theta = \sum_i E_ir^i$, where the $E_i$ are arbitrary functions of z. This at once gives $E_i = 0$, i < 1. Thus $E_\theta$ vanishes on the axis, and hence for sufficiently small r, $\epsilon = 1 - \omega_p^2/\omega^2$. However, under this assumption (38) becomes separable, giving (with $E_\theta = R(r)Z(z)$ and in suitable dimensionless units)
$$\frac{d^2R}{dr^2} + \frac{1}{r}\frac{dR}{dr} - \frac{R}{r^2} + aR = 0, \tag{42}$$
$$\frac{d^2Z}{dz^2} + bZ = 0, \tag{43}$$
where $a + b = 1 - \omega_p^2/\omega^2$. The separation constants a and b are determined by the boundary conditions, which unfortunately have to be applied outside the region of validity of (42) and (43). However, it seems probable that for sufficiently large $\omega_p^2/\omega^2$, b can be negative, which corresponds to an absolute minimum in $E^2$ at the center of the cavity, since by symmetry dE/dz = 0 there. It is clear that a certain difficulty would arise in the experimental realization of such an equilibrium, since the corresponding vacuum mode does not trap single particles, and hence the plasma could not be built up gradually. One possible technique would be to create the plasma at the cavity center with the help of a laser. Another alternative would be to confine the plasma in the z direction during the build-up phase by means of an electrostatic potential, giving the cavity end plates a negative charge. Unlike the final equilibrium, which does not appear to have been investigated numerically, the characteristics of such temporary confinement have been computed by Motz. The calculation assumed that the electrons of the plasma would distribute themselves in the z direction in a manner determined by the electrostatic potential due to the charged end plates, and that in a time which is short compared with the ambipolar diffusion time the ions would be trapped by the space charge created by the electrons, and it was shown that supercritical plasma distributions could be confined in this way. We now turn to equilibria set up in spherical cavities. The wave equation (7) in spherical polar coordinates with $\partial/\partial\phi = 0$ becomes
$$\frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left\{\sin\theta\left[\frac{\partial}{\partial r}(rE_\theta) - \frac{\partial E_r}{\partial\theta}\right]\right\} = \frac{\omega^2}{c^2}\,\epsilon\,E_r, \tag{44a}$$
$$-\frac{1}{r}\frac{\partial}{\partial r}\left[\frac{\partial}{\partial r}(rE_\theta) - \frac{\partial E_r}{\partial\theta}\right] = \frac{\omega^2}{c^2}\,\epsilon\,E_\theta, \tag{44b}$$
$$-\frac{1}{r}\frac{\partial^2}{\partial r^2}(rE_\phi) - \frac{1}{r^2}\frac{\partial}{\partial\theta}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\,E_\phi)\right] = \frac{\omega^2}{c^2}\,\epsilon\,E_\phi. \tag{44c}$$
As in the cylindrical case, we see that $E_\phi$ is decoupled from $E_r$ and $E_\theta$, but now these latter are unconditionally coupled to each other and no solutions exist in which any of the field components are independent of $\theta$. Thus there is no analog of the cylindrical cutoff modes, which have an additional symmetry not possessed by the more general modes. Consequently, no analytic solutions of these equations have been found, and no numerical solutions appear to have been attempted. True, a modified version of Eq. (44c) exists in the literature [Consoli et al. (1964b)] which would be analytically soluble if it were correct, but in its derivation it is unjustifiably assumed that $E_\phi = E_\phi(r)\sin\theta$. One can see from the fact that $\epsilon$ is an unfactorizable function of r and $\theta$ that Eq. (44c) is not separable. Thus all that can be said about spherical equilibria at the present time has to be derived from the known solutions of Eqs. (44a) to (44c) when $\epsilon$ = constant. In what follows we shall assume that $\epsilon = 1$, though the modification for other, including negative, values of $\epsilon$ would not be difficult and might indeed be used to give a qualitative discussion of supercritical solutions, in the manner indicated above for finite cylindrical modes. Just as in cylindrical cavities, the vacuum modes can be classified as TE, TM, or (presumably, although the literature does not seem to mention them) TEM modes if there is an inner sphere. The vacuum fields for the TE and TM modes in a cavity of radius a are given in the tabulation of Eq. (45) [Panofsky and Phillips (1962)].
$$E_\phi = -ik\,z_l(kr)\,\frac{\partial}{\partial\theta}Y_l^m(\theta, \phi) \tag{45}$$
In Eq. (45), $k = k_{n,l}$ is defined so that ka is the nth zero of the Bessel function $J_l$, $z_l$ is the spherical Bessel function $J_{l+1/2}(x)/x^{1/2}$, and $Y_l^m$ are the spherical harmonics. From these we can at once derive the quasi-potential created by any given mode. In particular, for the m = 0 modes ($\partial/\partial\phi \equiv 0$) we have for the TE modes
and for the T M modes
Expanding the Bessel function for small r, we obtain
where $P_l(\cos\theta)$ are Legendre polynomials. Since the spherical harmonics for $l \ge 1$ all have radial nodes, we see that the quasi-potentials $\psi_{TE}$ cannot confine single particles. On the other hand, the expression in square brackets in $\psi_{TM}$ is always positive and nonzero. However, for l = 1, the dominant term in $\psi_{TM}$ is independent of r, and the next term is negative. Thus in this case $\psi_{TM}$ has a maximum at the origin. For all higher modes, however, there is a minimum which could confine particles. In view of these facts it is somewhat strange that all the proposals in the literature, Knox (1957a,b), Butler et al. (1958), and Consoli (1962), are based on the TE modes, using a combination of two or more such modes with different frequencies or azimuthal variation in order to achieve a confining configuration. The explanation would appear to be that these proposals were made before the quasi-potential theory had been developed, at a time when the plasma was regarded as a metallic conductor, upon which the radiation exerted pressure. Knox proposed the superposition of three TE110 modes with their orientations chosen to give a spherical over-all configuration; Butler et al. (1958) proposed a combination of TE110 and TEl,, modes. Consoli et al. proposed, and have investigated experimentally, a TE110 mode alone, accepting a certain plasma loss along the axis which, they claim, would be reduced by the fact that the oscillations of the particles in the rf field about the position of their guiding centers would be such that the loss cone for particles might be less than one would infer from guiding center theory alone.
Before leaving the subject of three-dimensional confinement, we should make some mention of a few papers which discuss this subject from an oversimplified viewpoint and reach conclusions which are, in our view, erroneous. There is first a paper by Johnston (1960) which purports to show that a high density plasma (for which $\omega_p^2/\omega^2 > 1$) cannot be confined at an absolute minimum of $E^2$. Johnston takes as a model of the plasma a system consisting of two particles, of masses m and M, interacting through a force $-\omega_p^2z$, where z is the separation between them and is taken to be in the direction of $\nabla E^2$. He then shows by a trivial extension of the methods of Section 1 that the center of mass of such a system would move according to
where $\nabla_\parallel$ and $\nabla_\perp$ are the components of $\nabla$ parallel to and perpendicular to the direction of E, respectively. It follows from (47) that although a configuration possessing an absolute minimum in $E^2$ would always confine a plasma with $\omega_p^2 < \omega^2$ and would confine a plasma with $\omega_p^2 > \omega^2$ provided that $\mathbf{E}\cdot\nabla E^2 = 0$ everywhere, the presence of a component of E parallel to $\nabla E^2$ would lead to a loss of confinement in this direction for plasmas with $\omega_p^2 > \omega^2$. This state of affairs has a certain plausibility, in the sense that we have only actually succeeded in proving above that three-dimensional confinement is possible for low density plasmas or if $\mathbf{E}\cdot\nabla E^2 = 0$. However, it will be observed that Johnston's choice of model is a highly arbitrary one: he assumes without any convincing justification that the system shows plasmalike properties only in the direction of $\nabla E^2$ (his argument in this context, that one only expects space charge effects to be important in the direction of $\nabla n$, which is parallel to $\nabla E^2$, would if correct show that a uniform plasma could not sustain plasma oscillations!). We are therefore in no way compelled to accept his conclusion, and the extreme crudity of the model, which makes no allowance for changes in the plasma density resulting from the (near the hypothetical resonance) very large forces acting on it, makes it doubtful whether much significance should be attached to the result. A slightly more plausible model has been proposed by Askaryan (1958) and Gildenburg and Miller (1960), who treat the plasma as a rigid dielectric sphere of radius much less than the wavelength of the confining field, having a dielectric coefficient $\epsilon = 1 - \omega_p^2/\omega^2$. Since we believe that this model may be relevant to the theory of plasma acceleration, we discuss it in some detail in Section 7; here we shall simply state their conclusion that the center of mass of such a sphere situated in a standing rf field
should experience an acceleration
where $\omega_0^2 = \omega_p^2/3$ (the factor reflecting the spherical geometry of the plasma) and $\gamma$ is the damping coefficient for plasma oscillations due to collisions or radiation. It will be seen that this theory predicts a resonance, and a reversal in the direction of action of the quasi-potential force as $\omega$ passes through $\omega_0$, whether $\mathbf{E}\cdot\nabla E^2 = 0$ or not. Thus (48) is absolutely incompatible with the self-consistent field theory of the equilibrium of a high density plasma confined by rf fields as discussed in this chapter. In the present authors' view, this resonance is a spurious effect resulting from the inadequacy of the plasma model adopted. Finally, there is a paper by Knox (1961) which undertakes to show, on the assumption that the self-consistent field equations remain valid where $\mathbf{E}\cdot\nabla E^2 \neq 0$, that in this case absolute confinement is impossible if $\omega_p^2/\omega^2 > 1$. Essentially his approach is to consider a particular configuration for which $\mathbf{E}$ has a component parallel to $\nabla E^2$: a TM$_0$ mode in a finite cylinder closed at one end by a plasma. He starts from our Eq. (40) for this mode, which he simplifies by assuming that $\epsilon$ is a given function of $z$ instead of being a self-consistently determined function of $z$ and $r$, and he chooses the functional form of $\epsilon(z)$ so that the equation is soluble analytically in the neighborhood of the point where $\epsilon = 0$. He shows that the general solution is a sum of a singular solution and a well-behaved solution, but that the boundary conditions require an admixture of the singular solution; hence the component of $\mathbf{E}$ parallel to $\nabla E^2$ becomes much larger than the perpendicular component and the dominant force is deconfining. In the present authors' view, this result is due to his simplifying assumption that $\epsilon$ is a given function of $z$ alone; it seems reasonable to expect that, if $\epsilon$ can vary in a self-consistent manner, a nonsingular (and hence confined) solution can exist in the present, as in the cutoff, case.
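The sign reversal at $\omega_0^2 = \omega_p^2/3$ predicted by the rigid-sphere model can be illustrated numerically. The sketch below is an illustration only, not Eq. (48) itself: it assumes the standard electrostatic result that the polarizability of a small dielectric sphere is proportional to $(\epsilon - 1)/(\epsilon + 2)$, neglects the damping $\gamma$, and uses hypothetical function names.

```python
def epsilon(omega, omega_p):
    """Cold-plasma dielectric coefficient, eps = 1 - omega_p^2 / omega^2."""
    return 1.0 - (omega_p / omega) ** 2


def sphere_factor(omega, omega_p):
    """Assumed small-sphere polarizability factor (eps - 1)/(eps + 2), damping neglected."""
    eps = epsilon(omega, omega_p)
    return (eps - 1.0) / (eps + 2.0)


omega_p = 1.0
omega_0 = omega_p / 3 ** 0.5          # resonance frequency, omega_0^2 = omega_p^2 / 3

# The factor changes sign as omega passes through omega_0.
below = sphere_factor(0.9 * omega_0, omega_p)
above = sphere_factor(1.1 * omega_0, omega_p)
```

With $\epsilon = 1 - \omega_p^2/\omega^2$ the factor reduces algebraically to $-\omega_p^2/[3(\omega^2 - \omega_0^2)]$, which makes both the resonance and the sign change explicit.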
3. THEORY OF COMBINED RADIO-FREQUENCY AND MAGNETOSTATIC CONFINEMENT OF PLASMA
A. Derivation of the Self-Consistent Field Equations
We proceed by analogy with the analysis given in the preceding section for pure rf confinement. In the section on single particle motions, we showed that the quantities $\mu_\pm = \tfrac12 m_\pm v_\Omega^2/B_0$ and
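are constants of the motion to first order in $\alpha$ and $\beta$. If we generalize this by including an electrostatic potential $\phi$ and we remember that $\mathbf{v} = \mathbf{v}_c + \mathbf{v}_\Omega + \mathbf{v}_\omega$ and that $\mathbf{v}_c$ is parallel to the magnetostatic field and $\mathbf{v}_\Omega$ perpendicular to it, we can write $\mu_\pm$ and $\mathcal{E}_\pm$ in terms of the particle coordinates as

$$\mu_\pm = \tfrac12 m_\pm v_\perp^2/B_0 \quad\text{and}\quad \mathcal{E}_\pm = \tfrac12 m_\pm v^2 + \psi_\pm \pm e\phi, \qquad (1)$$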
where to first order in $\alpha$ and $\beta$ we can take $\psi_\pm$ and $\phi$ to be functions of the actual particle coordinates. We now consider distribution functions $f_\pm(\mathcal{E}_\pm, \mu_\pm)$; such functions can only describe plasmas in which to first order in $\alpha$ and $\beta$ no stationary currents are flowing, and hence cannot be used to consider situations in which the presence of the plasma modifies the stationary magnetic field, but they are sufficiently general for our present purposes. We now have, as in the preceding section,

$$\mathbf{j} = e\int \mathbf{v}\,(f_+ - f_-)\,d^3v = e\Big[\mathbf{v}_{\omega+}\,n_{0+}\int f_+\,d^3v - \mathbf{v}_{\omega-}\,n_{0-}\int f_-\,d^3v\Big], \qquad (2)$$

where $f_\pm = f_\pm(\tfrac12 m_\pm v^2 + \psi_\pm \pm e\phi,\ \tfrac12 m_\pm v_\perp^2/B_0)$ is normalized to unity at some reference point. If as before we write
$$\mathbf{E}(\mathbf{r}, t) = \mathbf{E}\exp(i\omega t) + \mathbf{E}^*\exp(-i\omega t),$$
the wave equation becomes
vxvx(E”4
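This equation can be put into the same form as Eq. (7) of the previous section:

$$\nabla\times\nabla\times\mathbf{E} = (\omega^2/c^2)\,\boldsymbol{\epsilon}\,\mathbf{E}, \qquad \boldsymbol{\epsilon} = \Big(1 - [\omega_p^2/\omega(\omega \pm \Omega)]\int f\,d^3v\Big), \qquad (4)$$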
if the scalar $\epsilon$ appropriate there is replaced by the tensor $\boldsymbol{\epsilon}$, which is diagonal (though not a multiple of the unit tensor) only in the $(x \pm iy)$ representation. We can as before use the assumption of quasi-neutrality to eliminate $\phi$ from Eq. (4), by using

$$\int f_+(\tfrac12 m_+v^2 + \psi_+ + e\phi,\ \mu_+)\,d^3v = \int f_-(\tfrac12 m_-v^2 + \psi_- - e\phi,\ \mu_-)\,d^3v, \qquad (5)$$

but this only leads to an explicit expression for $\phi$ if we make definite assumptions about the functional form of the $f_\pm$. In what follows we shall for the most part assume for simplicity that the $f_\pm$ are independent of $\mu_\pm$
and are the same function. We then obtain $e\phi = \tfrac12(\psi_- - \psi_+)$ as before. In the absence of an rf field, such an assumption is inadequate, since if $f_\pm$ is to describe a confined (and hence nonuniform) plasma, some $\mu$ dependence is essential. This is reflected in the fact that a plasma confined in a mirror machine has a distribution possessing a "loss cone" in velocity space. If an rf field is present, however, the plasma can be confined by it even if $f_\pm$ has no $\mu$ dependence, and calculations are simplified if this assumption is made. We shall now give certain particular applications of Eq. (4).
B. One-Dimensional Equilibria with a Uniform Magnetostatic Field
It is clear that the solution of (4) for arbitrary directions of propagation with respect to a uniform magnetostatic field cannot be simpler than the corresponding linear problem, which already has a certain complexity. The most interesting case is the one in which the direction of propagation is parallel to the magnetic field, which we shall take to be the $z$ direction, and we shall therefore concentrate on this particular case. We then obtain the two coupled equations:
These equations were first obtained by Volkov, using a magnetohydrodynamic approach. If we take $f$ to be Maxwellian, and work with the usual dimensionless variables, Eq. (6) becomes

$$\frac{d^2E_\pm}{dz^2} = -\left(1 - \frac{\omega_p^2}{\omega(\omega \pm \Omega)}\exp\left[-\left(\frac{|E_+|^2}{1 + \Omega/\omega} + \frac{|E_-|^2}{1 - \Omega/\omega}\right)\right]\right)E_\pm. \qquad (7)$$
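The structure of Eq. (7) can be explored numerically. The following is a minimal sketch, assuming a real circularly polarized field ($E_- = 0$, $E_+ = E$) and assuming that Eq. (7) then reads $E'' = -\{1 - [\omega_p^2/\omega(\omega + \Omega)]\exp[-E^2/(1 + \Omega/\omega)]\}E$ in dimensionless variables. Under that assumption the quantity $E'^2 + E^2 + (\omega_p^2/\omega^2)\exp[-E^2/(1 + \Omega/\omega)]$ should stay constant along $z$. The function names, parameter values, and the choice of a fixed-step Runge-Kutta scheme are illustrative assumptions.

```python
import math


def first_integral(E, dE, wp2, h):
    """E'^2 + E^2 + (wp^2/w^2) exp(-E^2/(1 + h)), with wp2 = wp^2/w^2 and h = Omega/omega."""
    return dE * dE + E * E + wp2 * math.exp(-E * E / (1.0 + h))


def accel(E, wp2, h):
    """Right-hand side d2E/dz2 for a real circularly polarized field (assumed form)."""
    b = 1.0 + h
    # wp^2 / (w (w + Omega)) equals (wp^2 / w^2) / (1 + Omega/w)
    return -(1.0 - (wp2 / b) * math.exp(-E * E / b)) * E


def integrate(E, dE, wp2, h, dz=1e-3, steps=10000):
    """Classical fourth-order Runge-Kutta on the pair (E, dE/dz); returns the final state."""
    for _ in range(steps):
        k1e, k1v = dE, accel(E, wp2, h)
        k2e, k2v = dE + 0.5 * dz * k1v, accel(E + 0.5 * dz * k1e, wp2, h)
        k3e, k3v = dE + 0.5 * dz * k2v, accel(E + 0.5 * dz * k2e, wp2, h)
        k4e, k4v = dE + dz * k3v, accel(E + dz * k3e, wp2, h)
        E += dz * (k1e + 2 * k2e + 2 * k3e + k4e) / 6.0
        dE += dz * (k1v + 2 * k2v + 2 * k3v + k4v) / 6.0
    return E, dE


# Illustrative parameters: wp^2/w^2 = 4, Omega/omega = 0.5, field starting at rest.
c_start = first_integral(1.5, 0.0, 4.0, 0.5)
E_end, dE_end = integrate(1.5, 0.0, 4.0, 0.5)
c_end = first_integral(E_end, dE_end, 4.0, 0.5)
```

Comparing `c_start` and `c_end` gives a quick check that the integration respects the conservation law implied by the assumed form of the equation.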
Since the complex conjugate quantities $E_\pm^*$ satisfy the same equation, if we multiply (7) by $dE_\pm^*/dz$ and its conjugate by $dE_\pm/dz$ and add, we obtain

$$|E_+'|^2 + |E_-'|^2 + |E_+|^2 + |E_-|^2 + \frac{\omega_p^2}{\omega^2}\exp\left[-\left(\frac{|E_+|^2}{1 + \Omega/\omega} + \frac{|E_-|^2}{1 - \Omega/\omega}\right)\right] = \text{const}, \qquad (8)$$

an equation which can at once be recognized as the dimensionless form of the equilibrium condition derivable from the energy-momentum tensor theorem:
To complete the solution of the problem we have to obtain expressions for the four independent quantities $E_\pm$ and $E_\pm^*$. We can in principle still draw on the mechanical analogy used in the preceding section to discuss the qualitative features, but since the motion of a particle in a four-dimensional potential well is not easily visualizable, it is convenient at this point to restrict attention to either plane or circularly polarized waves. In the former case, $E_+ = E_- = E$; in the latter, either $E_+$ or $E_- = 0$, and we can take the nonvanishing quantity to be real. In this case, the equivalent mechanical problem becomes one dimensional, and the arguments of the preceding section can be taken over almost unaltered. One interesting difference arises, however: the effective potential seen by the particle representing the electric field is
For $\omega > \Omega$ the resulting motion is largely unaltered, though the nonlinear regime is reached at much lower values of $E_\pm$ as $\omega$ approaches $\Omega$. For $\omega < \Omega$, however, the topology of the effective potential $\psi(E)$ is altered; it no longer has a central hump, whatever the value of $\omega_p^2/\omega^2$. This reflects the fact that a strongly magnetized plasma can transmit even linear waves at frequencies below the plasma frequency. Under these conditions the only effect of the nonlinearity of the fields is to distort the wave form and alter the wavelength of the waves. The effect on the plasma distribution is striking, however; it is now concentrated at the antinodes of the wave. One feature which is obvious from Eq. (8) and not from Eq. (3) is that the condition that the reaction of the plasma back on the rf field distribution should be negligible everywhere does not depend upon the presence or absence of the magnetostatic field, and it is (as one might expect) $\tfrac12\epsilon_0E^2 \gg n_0KT$. It is clear that the tensor character of $\boldsymbol{\epsilon}$ in Eq. (4) seriously complicates the form of the equation when the rf field geometry is anything other than one dimensional. In what follows, we shall evade this complexity by considering low pressure plasmas (for which $\tfrac12\epsilon_0E^2 \gg n_0KT$ and hence $\boldsymbol{\epsilon} = \mathbf{I}$) and shall discuss the plasma distributions which are obtained with various combinations of rf and magnetostatic fields.
C. Low Pressure Plasma Distributions
The assumption that the plasma pressure is negligible in comparison with the rf pressure enables us to calculate the rf field configuration with the vacuum field equations, satisfying the appropriate boundary conditions at the walls of the cavity in which it is set up. From this, given the
geometry of the magnetostatic field, the quasi-potentials $\psi_\pm$ can be calculated and the electrostatic field and the resulting plasma distribution can then be derived from Eq. (5). Now the advantage of combined rf-magnetostatic confinement over pure magnetostatic confinement is that it is in principle possible to confine an isotropic velocity distribution in this way and hence to avoid loss cone instabilities and other anisotropy instabilities. To achieve this, we must take $f_\pm = f(\mathcal{E}_\pm)$ only and this, as we have seen, implies that $e\phi = \tfrac12(\psi_- - \psi_+)$ and a density distribution

$$n(\mathbf{r}) = n_0\int f\big(\tfrac12 mv^2 + \tfrac12(\psi_+ + \psi_-)\big)\,d^3v.$$

Thus, contours of constant $n$ coincide with contours of constant $\psi \equiv \tfrac12(\psi_+ + \psi_-)$, and, to ensure confinement, we require a set of nested surfaces of constant $\psi$, with $\psi$ everywhere positive and increasing outwards. This requirement is most easily met if $\omega > \Omega$ everywhere within the region of confinement of the plasma. The required increase in $\psi$ can then be achieved either by an increase in one or more of the $|E_\pm|$ toward the periphery or by an increase in $B_0$ toward the periphery in such a way that $\Omega$ approaches $\omega$. Naturally, this second alternative economizes in rf power; it should be remembered, however, that we have assumed in this section that this power level is high enough for the rf field distribution to be unaffected by the plasma. For any given rf and magnetostatic field distribution, the $\psi$ contours can be calculated from Eq. (52) or (53) of Section 1.
4. STABILITY THEORY
The literature on the stability of rf confined plasmas makes rather depressing reading, both because of the heaviness of the algebra and the uniformity with which instability is reported (Knox, 1957a,b; Weibel, 1957; Whipple, 1959; Sagdeev, 1959; Yankov, 1959; etc.). The cumulative effect of this theoretical work has been a serious discouragement to experiment in this field. However, all the stability analyses to which we have referred are based on the assumption that the quasi-metallic model of the plasma provides an adequate description both of the equilibrium and of the perturbed configurations. We saw in Section 2, however, that the use of this model of the plasma is never justified. The assumption that transverse electric fields vanish at the plasma boundary only corresponds to the field configuration derived from the approximate self-consistent field equation

$$\nabla\times\nabla\times\mathbf{E} = (\omega^2/c^2)\,\epsilon\mathbf{E}, \qquad (1)$$
where $\omega_p^2/\omega^2 \gg 1$. However, whenever $\omega_p^2/\omega^2$ is sufficiently large to justify this neglect of the electric field at the plasma boundary, the electric field gradient obtained from (1) is so large as to invalidate the assumption upon which it was based. True, we argued in Section 2 that there may exist rf confined configurations for which $\omega_p^2/\omega^2$ is much greater than unity, but for such configurations the depth of penetration of the rf into the plasma would have to be much greater than the depth given by (1), and again the use of the quasi-metallic model could not be justified. Furthermore, in such equilibria, if they exist, one could not distinguish the time scale for the motion of the plasma particles in the boundary layer from the time scale for the oscillations of the rf field, so the concept of radiation pressure could not be used to describe the confining force or to analyze the stability of the configuration. (It may be noted that these remarks do not in any way invalidate the use of Eq. (1) or the concept of radiation pressure to describe the interaction of an rf wave with a true metal, since, in that case, lattice forces ensure the confinement of the electrons.) In this chapter, therefore, we shall ignore all stability analyses based on the quasi-metallic model and shall deal exclusively with the stability of those equilibria which satisfy Eq. (1) and for which that equation is in fact valid. We shall begin by exploiting the interesting, but by no means perfect, analogy between rf and magnetostatic confinement of plasma. The subject of magnetostatic confinement is complicated by the fact that the pressure is usually not scalar; for the purposes of the analogy, however, we shall restrict ourselves to the case where it is. It then follows, as is well known, that an equilibrium is only possible if there exist nested surfaces formed by magnetic field lines (so-called magnetic surfaces) on which the plasma pressure is constant.
As we have seen, exactly the same is true (to first order in $v/c$) of the rf confinement of quasi-neutral plasma: we analogously have electric surfaces on which the plasma pressure is constant. A second common feature of the two confinement systems is the existence of flux tubes formed by lines of force, which are continuous except at singular points where $\mathbf{E} = 0$ and contain a constant flux across any cross section. (This follows from $\nabla\cdot\mathbf{E} = 0$, the equation of quasi-neutrality.) In the electric case, however, this constancy of the flux results trivially from the fact that the tubes are of constant cross section. A third common feature is the fact that in both systems the plasma particles are undergoing high-frequency low amplitude oscillations about the position of their guiding centers, intermediate-frequency oscillations of their guiding centers in the quasi-potential well created by the nonuniformity of the confining field and (though we have not actually proved it for the electric case) a slow precession or drift motion as well. [Higher order terms in the motion of a charged particle in an rf field are discussed in Litvak et al. (1962).]
Consequently, we might expect to find an analogous series of instabilities: low-frequency interchange or drift instabilities, intermediate-frequency instabilities due to resonance with the well frequency or improper velocity distributions, and high-frequency resonance instabilities. As regards the low-frequency instabilities, all the detailed analyses carried out to date have (as we saw above) been based on an inadequate model of the equilibrium. However, it appears to be difficult to give a more satisfactory treatment. There is a temptation to provide a general stability criterion for interchange instabilities along the lines of the $\int dl/B$ criterion of Rosenbluth and Longmire (1957) for magnetostatic confinement. It is easily shown, however, that one cannot simply take over their argument, replacing magnetic flux tubes by electric flux tubes, since for an rf confined plasma, the plasma is not "frozen" to the electric field, so there are no grounds for excluding nonconvective interchanges. Furthermore, the field energy within a flux tube cannot be expressed in terms of $E^2$ alone. An alternative approach has been suggested by Fowler (1962), who pointed out that insofar as one can replace the rf field by an equivalent fixed quasi-potential $\psi$, his proof of the stability of any distribution function $f_0(\tfrac12 mv^2 + \psi)$, which is a monotonically decreasing function of its argument, would be applicable. Since, for such a distribution function, the condition for plasma confinement requires that the electric field configuration should have the minimum $E$ property, Fowler's argument appears to provide the analog of Taylor's proof of the stability of certain minimum $B$ configurations. However, as we saw in Section 2, a satisfactory treatment of the self-consistent rf field within a plasma requires that we should take the distribution function to be a function of $\tfrac12 m\big(\mathbf{v} - (e/m)\int\mathbf{E}\,dt\big)^2 + \psi$, and it is not clear that Fowler's proof can be generalized to cover such distributions.
We now turn to the theory of intermediate- and high-frequency instabilities. Here, as in the case of magnetically confined plasmas, almost all the theory has been given for spatially uniform equilibria, it being plausibly maintained that the results can be taken over in some WKB sense for weakly nonuniform plasmas. The danger of this assumption in the magnetic case has already been pointed out (Watson, 1964), but one has to start somewhere. The basic kinetic theory of a uniform plasma containing a uniform rf field $\mathbf{E}_0\sin\omega_0t$ has been given by Silin (1965), who considered its application to the theory of high-frequency instabilities, and it has been used to discuss intermediate-frequency (streaming) instabilities by Gorbunov and Silin (1965). Its application to high-frequency instabilities in a uniformly magnetized plasma has been given by Aliev et al. (1966). The following discussion is based on the mathematical techniques of this last paper, from
which the results of the other two papers follow as special cases. The equilibrium configuration of a plasma in the presence of a uniform rf field and a uniform constant magnetic field $\mathbf{B}_0$ is obtained by solving the Vlasov equation for each species:
which gives $f_0 = f_0(\mathbf{v} - \mathbf{v}_0(t))$, where $f_0$ is an arbitrary function and $\mathbf{v}_0$ satisfies $d\mathbf{v}_0/dt = (e/m)[\mathbf{E}_0 + \mathbf{v}_0\times\mathbf{B}_0]$, an equation which can be solved exactly since $\mathbf{E}_0$ and $\mathbf{B}_0$ are spatially uniform. The velocity $\mathbf{v}_0$ depends upon the sign of the charge and the magnitude of its mass; consequently, for $\mathbf{B}_0 = 0$ the equilibrium contains a uniform distribution of ions and electrons streaming past each other in an oscillatory manner; on this may be superimposed a net drift of ions with respect to electrons or of electrons past each other by choosing $f_0$ suitably. The effect of $\mathbf{B}_0$ is to superimpose a Larmor motion in addition to the above motions. Since $\mathbf{E}_0$ is assumed uniform, no drift of guiding centers occurs and no deviations from neutrality are required in the equilibrium. To discuss its stability we linearize the Vlasov equation about this equilibrium; the resulting linear equation can be Fourier-transformed in space (since the equilibrium is uniform) but not in time (since the equilibrium is time dependent). However, since the equilibrium is a periodic function of time with period $2\pi/\omega_0$, an analog of Floquet's theorem shows that the solution can be expanded,

$$f_1 = \sum_{n=-\infty}^{\infty} f_1^{(n)}\exp[i(n\omega_0 + \omega)t]. \qquad (3)$$
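For the unmagnetized case the equilibrium oscillation velocity can be written down in closed form. The sketch below assumes $\mathbf{B}_0 = 0$, a field $E_0\sin\omega_0t$ along a fixed direction, and the zero-mean choice of integration constant; it also evaluates the corresponding excursion amplitude $|e|E_0/m\omega_0^2$. The function names are hypothetical and the formulas are the elementary solution of $dv_0/dt = (e/m)E_0\sin\omega_0t$, not a transcription of the source.

```python
import math


def v0(t, e_over_m, E0, w0):
    """Zero-mean solution of dv0/dt = (e/m) E0 sin(w0 t) when B0 = 0."""
    return -(e_over_m * E0 / w0) * math.cos(w0 * t)


def excursion_amplitude(e_over_m, E0, w0):
    """Maximum displacement from the mean position, |e| E0 / (m w0^2)."""
    return abs(e_over_m) * E0 / w0 ** 2


def dv0_dt_numeric(t, e_over_m, E0, w0, h=1e-6):
    """Central-difference derivative of v0, used only to check the closed form."""
    return (v0(t + h, e_over_m, E0, w0) - v0(t - h, e_over_m, E0, w0)) / (2.0 * h)
```

Because $v_0$ scales as $e/m$, electrons and ions oscillate with opposite phase and very different amplitudes, which is the relative streaming described above.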
It proves convenient to work with each Vlasov equation in the (oscillating) frame in which the species concerned is at rest in the equilibrium. If one considers only perturbations in which the perturbed electric field is derivable from a scalar potential, one obtains the following set of coupled equations for the $n$th components (in the above sense) of the perturbed ion and electron charge densities in the moving frames, $\rho_i^{(n)}$ and $\rho_e^{(n)}$, respectively,
where $J_r(a)$ is a Bessel function of order $r$ and argument $a = \mathbf{k}\cdot\mathbf{L}$, $\mathbf{L}$ being the maximum displacement of a particle from its equilibrium position
during one cycle of the rf field; and $\delta\epsilon_i(\omega, \mathbf{k})$, $\delta\epsilon_e(\omega, \mathbf{k})$ are the ion and electron contributions to the linear dielectric coefficient appropriate to the electrostatic oscillations of a uniform plasma in the absence of the rf field (the total dielectric coefficient then has the form $\epsilon = 1 + \delta\epsilon_i + \delta\epsilon_e$). Thus, in particular, for a cold unmagnetized plasma, $\delta\epsilon_e = -\omega_{pe}^2/\omega^2$ and $\delta\epsilon_i = -\omega_{pi}^2/\omega^2$, where $\omega_{pe}$, $\omega_{pi}$ are the plasma frequencies for the two species. If we define quantities $R_i^{(n)}$, $R_e^{(n)}$ by
Eq. (4) can be reduced to
and the spectrum of the perturbed oscillations can formally be obtained from the determinantal consistency condition for (5), $|\mathsf{M} - \mathsf{1}| = 0$. In practice, however, some approximation has to be made; to show that this is possible let us consider Eq. (5). If we restrict attention to the case where $\omega_0$ is of order $\omega_{pe}$ or higher (i.e., to those values of $\omega_0$ for which quasi-potential theory is valid), we see that all the quantities $R_i^{(n)}$ are much less than unity (by a factor of order $m_e/m_i$) except possibly for $R_i^{(0)}$, whose value depends upon $\omega$, the perturbed oscillation frequency. If this is large ($\omega \gtrsim \omega_0$), $R_i^{(0)}$ is also $\ll 1$, the remaining $R_e^{(n)}$ being $\sim 1$. In this case, $\rho_e^{(n)} \gg \rho_i^{(l)}$ $(l \neq n)$, and Eq. (5) approximates to
If, on the other hand, w > up,. If there exists a current in the equilibrium, so that the electrons are moving with respect to the ions with velocity $\mathbf{v}$, the dispersion relation becomes approximately (the terms with $n \geq 1$ are negligible)
and the instability criterion becomes
Thus, although this streaming instability still exists, the range of $k$-values which are unstable is reduced by the rf field, and it can be shown that the maximum growth rate is also reduced. An analogous analysis (by Gorbunov and Silin, 1965) of the case of two neutral plasmas streaming past each other, however, shows that in this case the field does not exercise a stabilizing role; nor does it do so if two electron beams stream past each other in a stationary neutralizing background. The implications of Eq. (6) for non-Maxwellian distributions do not appear to have been analyzed in the limit $\omega_0 < \omega_{pe}$ in any detail, though it can, for example, be shown that an rf field can stabilize an anisotropic distribution in a uniform magnetic field against the Harris instability (resonance between the electron plasma and ion cyclotron frequencies). Much remains to be done in this region, but the general impression appears to be that apart from the parametric resonance the rf field does not have a strong destabilizing influence on plasmas which would be stable in its absence. As regards the low-frequency dispersion relation, the picture is rather less clear. Equation (7), which can be solved for a cold magnetized plasma for all angles of propagation except the narrow cone around $\theta = \pi/2$ (which causes trouble even in the absence of an rf field), gives approximately
where

$$A(\omega_0, \mathbf{k}) = \sum_n \frac{n^2\omega_0^2\,(n^2\omega_0^2 - \Omega_e^2)\,J_n^2(a)}{(n^2\omega_0^2 - \omega_+^2)(n^2\omega_0^2 - \omega_-^2)}$$
and $\omega_\pm$ are the roots of Eq. (8), which determines the characteristic frequencies for electron wave propagation. The expression for an unmagnetized plasma is again obtained by taking the limit $\Omega \to 0$. It is clear from (12) that the stability condition is $A \geq 0$. This is a criterion which, however, is not too easy to state as a necessary condition upon $\omega_0$, though a sufficient condition is clearly $\omega_0^2 > \Omega_e^2,\ \omega_\pm^2$; in the limit $\Omega \to 0$, $\omega_0 > \omega_{pe}$ (though the actual threshold is substantially lower than this). It will be observed that $A$ has singularities as $n^2\omega_0^2 \to \omega_\pm^2$. This reflects the fact that both the high-frequency and low-frequency dispersion relations approach the same expression in the neighborhood of the parametric resonances. Except in this resonant region, the growth rate for the low-frequency instabilities never exceeds a quantity of order $\omega_{pi}$. To summarize, the current position as regards high and intermediate frequency instabilities of a plasma in an rf field would appear to be that there exists a dangerous, but avoidable, parametric resonance instability when the rf frequency $\omega_0$ approaches one of the frequencies at which the plasma could propagate waves in the absence of the rf field, but that there are no clear signs that otherwise the situation is any more dangerous than for a plasma confined in any other way. The low-frequency instability regime has scarcely been investigated effectively, but such indications as we have are reasonably encouraging.
5. APPLICATION TO FUSION REACTORS
The motive behind much of the early work on rf confinement was the hope that it might be made the basis of a thermonuclear reactor. This hope was attacked as early as 1957 by Osovets, who pointed out that the rf cavities available at that time seldom had a quality factor $Q$ better than a few thousand and that, taking this figure, the dissipation of rf energy in the walls of the cavity would be prohibitively large. Indeed, not only would this loss exceed the thermonuclear power output, but for any reasonable size of machine it would exceed the power which was available from any existing rf generating device. This conclusion was confirmed by the calculations of Weibel (1957) and, in consequence, interest in rf confinement declined. The position has recently improved in three important respects. First, the upper limit on the rf power available at any given frequency has increased by several orders of magnitude, so that, for example, megawatts of cw power are available in the 10-cm waveband. Second, the $Q$ factors of cavities have been raised by over five orders of magnitude by constructing them from superconducting metals kept at 2°K. Third, as we saw in Section 2, we now know that there exist equilibria in which the ratio of the volume filled by plasma to the volume filled by rf can in principle be made very large. We must therefore re-examine the question of the feasibility of an rf thermonuclear reactor. We may begin by considering what factors determine the viability of any given reactor design. There is necessarily an initial investment of energy, equal to $3nT$ per unit volume, where $n$ is the
mean ion or electron density and $T$ is the temperature. If there is either a steady particle loss mechanism or if confinement breaks down after some finite time, there will exist an effective confinement time $t$ for this energy, and one can therefore regard the initial investment as equivalent to a steady dissipation of energy at a rate $P_i = 3nT/t$. In addition to this initial investment, there are running costs: the plasma radiates by bremsstrahlung at a rate $P_B$, and the confining fields must be maintained at a level that ensures confinement, which involves dissipating energy at a rate $P_w$ in the cavity walls and at a rate $P_p$ in the plasma itself. Provided that confinement is maintained, however, thermonuclear reactions generate power at a rate $P_F$ and there is the possibility of a positive net output $P_o$. These six quantities $P_i$, $P_B$, $P_w$, $P_p$, $P_F$, and $P_o$ are functions of the parameters of the reactor $n$, $T$, $\omega$, etc. Our first step will be to show that practical considerations rapidly narrow down the interesting range of variation of these parameters. A useful reactor must have a reasonably large power output. Conventional reactors have a power density of around 100 W/cm$^3$; this is perhaps uncomfortably high, but certainly a power density below 1 W/cm$^3$ would make a reactor of reasonable total output (say $10^9$ W) unmanageably large. This immediately sets a lower limit to the plasma density and pressure. The fusion power generated by a Maxwellian plasma is given by
$$P_F = n^2\langle\sigma v\rangle\mathcal{E}_R/4,$$

where $\sigma$ is the reaction cross section, $\langle\sigma v\rangle$ the appropriate average of $\sigma v$ over the distribution function [evaluated by, for example, Thompson (1957)] and $\mathcal{E}_R$ the energy liberated per reaction. Taking the D-T reaction, for which $\mathcal{E}_R = 17.6$ MeV, the graph of $\langle\sigma v\rangle$ as a function of $T$ given by Post (1956) [confirmed by more recent work: Eder and Motz (1958), Wandel et al. (1958)] shows that the maximum value of $P_F$ is $7.02 \times 10^{-28}\,n^2$ W/cm$^3$ (where $n$ is measured in particles/cm$^3$) and that the corresponding temperature is 60 keV. Thus the minimum plasma density (giving $P_F = 1$ W/cm$^3$) is $n = 3.78 \times 10^{13}$ cm$^{-3}$. The corresponding plasma pressure, which must be balanced by electromagnetic pressure, is

$$p = 2nT = 7.25 \times 10^6\ \text{dyn/cm}^2 = 7.2\ \text{atm}.$$
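The density and pressure just quoted follow directly from $P_F = 7.02 \times 10^{-28}\,n^2$ W/cm$^3$ and $p = 2nT$. A short numerical check is sketched below; the unit-conversion constants and function names are assumptions introduced for the illustration.

```python
import math

KEV_TO_ERG = 1.602e-9            # 1 keV expressed in erg
DYN_PER_CM2_PER_ATM = 1.013e6    # 1 atm expressed in dyn/cm^2


def min_density(pf_coeff, target_w_per_cm3=1.0):
    """Density (cm^-3) at which P_F = pf_coeff * n^2 reaches the target power density."""
    return math.sqrt(target_w_per_cm3 / pf_coeff)


def pressure_dyn(n, T_keV):
    """Plasma pressure p = 2 n T in dyn/cm^2 (ions plus electrons at temperature T)."""
    return 2.0 * n * T_keV * KEV_TO_ERG


n_60 = min_density(7.02e-28)             # density for 1 W/cm^3 at 60 keV
p_60 = pressure_dyn(n_60, 60.0)          # corresponding pressure, dyn/cm^2
p_60_atm = p_60 / DYN_PER_CM2_PER_ATM    # the same pressure in atmospheres
```

The same two functions can be reused at any other operating temperature once the coefficient of $n^2$ in $P_F$ is known.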
This pressure can be reduced slightly by working at a somewhat lower temperature: to determine the minimum plasma pressure we can use the fact that, for temperatures below about 20 keV, it is possible to use the analytical approximation of Gamow to represent the reaction rate (Thompson) :
$$\langle\sigma v\rangle = 3.7 \times 10^{-12}\,T^{-2/3}\exp(-19.9\,T^{-1/3})\ \text{cm}^3\,\text{sec}^{-1}$$
(with $T$ in keV). One can then write the plasma pressure at which any given level of fusion power $P_F$ is maintained in the form $p = 2nT = 4T(P_F/\langle\sigma v\rangle\mathcal{E}_R)^{1/2} \propto T^{4/3}\exp(19.9\,T^{-1/3}/2)$, and this has a minimum when $T = (19.9/8)^3 = 15.3$ keV. At this temperature, $P_F = 1.42 \times 10^{-28}\,n^2$ W/cm$^3$ and this reaches 1 W/cm$^3$ when $n = 8.42 \times 10^{13}$ cm$^{-3}$; the corresponding pressure is $p = 4.16 \times 10^6$ dyn/cm$^2$. The electric field strength at which $\tfrac12\epsilon_0E^2 = p$ is $E = 3.06 \times 10^6$ V/cm; such field strengths, though large, are almost within the reach of current rf technology. However, it would clearly be undesirable to work at a higher plasma pressure than is absolutely necessary to achieve a reasonable $P_F$. We shall therefore take $n = 8.42 \times 10^{13}$ cm$^{-3}$ and $T = 15.3$ keV as the basic design parameters for an rf thermonuclear machine, though we may note that there is the possibility of raising the power level of operation if the maximum field strength could be increased. In the above discussion we assumed a Maxwellian distribution. As is well known, such a distribution is particularly suitable for thermonuclear purposes: the tail wags the thermonuclear dog. It is therefore very important that at least a reasonable fraction of the tail should be confined. It will be seen from the graphs in Eder and Motz that particles with energies up to about 150 keV should be confined if a 15-keV plasma is to react at a rate which approaches that derived on the assumption of a Maxwellian distribution. This condition in turn sets a lower limit on the ratio $\omega_p^2/\omega^2$. To see this, we may refer back to Section 2, where we showed that (for an equilibrium computable by the quasi-potential approach at least) the density distribution is given by

$$n = n_0\exp(-e^2E^2/4m\omega^2T) = n_0\exp(-\omega_p^2\epsilon_0E^2/\omega^2 4n_0T)$$
and that, for $\omega_p^2/\omega^2 > 1$, the maximum value of $E^2$ in a confining cylindrical mode is such that $\epsilon_0E^2/4n_0T \sim 1$. Thus, for any given $\omega_p^2/\omega^2$, particles are confined which have an energy less than the maximum quasi-potential $(\omega_p^2/\omega^2)T$; so if we are to confine 150-keV particles in a 15-keV plasma, $\omega_p^2/\omega^2$ must be at least 10. By the same token, if $\omega_p^2/\omega^2 > 230$, we will confine the 3.5-MeV He$^4$ reaction products as well. We shall see shortly that in practice it is essential for $\omega_p^2/\omega^2$ to be larger than this; we should emphasize, however, that whereas quasi-potential theory undoubtedly remains valid for a 15-keV plasma with $\omega_p^2/\omega^2 = 10$, for much larger values of $\omega_p^2/\omega^2$ the plasma-radiation boundary cannot be described by this theory, and the uncertainty discussed in Section 2 hangs over such equilibria. For a plasma with $n = 8.42 \times 10^{13}$ cm$^{-3}$, $\omega_p = 5.16 \times 10^{11}$ sec$^{-1}$. If we take $\omega_p^2/\omega^2 = 10$, this gives $\omega = 1.63 \times 10^{11}$ sec$^{-1}$, or a wavelength $\lambda = 1.16$
cm. For several reasons, this wavelength is unacceptably small. I n the first place, the rf power at present available in the 1-cm waveband is of the order of hundreds of watts (cw) only, whereas (as we shall see) an rf power which is not enormously smaller than the fusion power is certainly required. Secondly, it implies a very thin isolating layer of rf between the plasma and the cavity wall (for reasons connected with the near-degeneracy of modes in a large cavity, this layer cannot be more than a few wavelengths thick), and hence small amplitude oscillations of the plasma about its equilibrium position, even though stable, would tend to cause loss of plasma to the walls. Thirdly, the rf losses in the cavity walls, which increase as w3'2 for conventional conductors and as w 3 for superconductors, 10" sec-I. As we shall see, are in either case prohibitively high at w these problems become much less intractable for X > 20 cm: in this waveband 1 MW of cw per klystron is available, the isolating layer is of reasonable dimensions, and the cavity losses might just be brought under control. 1000, which stretches quasi-potential This, however, implies wp2/w2 theory beyond its limit so we cannot describe the equilibrium with any precision. In what follows, however, we draw only on the qualitative features of the quasi-potential equilibria. We shall first show that at the chosen temperature and density, the bremstrahlung is totally negligible and that the initial investment can be made so, given a confinement time of a few seconds. Spitzer gives the Born approximation :
$$P_B = (2^6e^6/3hm_ec^3)(2\pi T/m_e)^{1/2}Z^2n_in_e = 5.35 \times 10^{-31}\,n^2T^{1/2}\ \mathrm{W/cm^3}$$
($T$ in keV). Various corrections to this have been proposed (see, for example, Wandel et al.) but none make any significant difference at 15 keV. For $n = 8.42 \times 10^{13}$ cm$^{-3}$ we obtain
$$P_B = 1.74 \times 10^{-2}\ \mathrm{W/cm^3} = 0.017\,P_F.$$
On the other hand,
$$P_i = 3nT/t = 0.62/t\ \mathrm{W/cm^3}.$$
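The plasma frequency, operating frequency, and wavelength quoted for this density can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming Gaussian units and rounded physical constants; the function names are illustrative and not taken from the article.

```python
import math

# Gaussian (cgs) constants, rounded; adequate for a three-figure check.
E_CHARGE = 4.803e-10    # electron charge, statcoulomb
M_ELECTRON = 9.109e-28  # electron mass, g
C_LIGHT = 2.998e10      # speed of light, cm/s


def plasma_frequency(n_cm3):
    """Electron plasma frequency (rad/s) for a density given in cm^-3."""
    return math.sqrt(4.0 * math.pi * n_cm3 * E_CHARGE**2 / M_ELECTRON)


def wavelength_cm(omega):
    """Vacuum wavelength (cm) of a wave with angular frequency omega (rad/s)."""
    return 2.0 * math.pi * C_LIGHT / omega


def min_frequency_ratio(particle_kev, temperature_kev):
    """Smallest omega_p^2/omega^2 for which the maximum quasi-potential
    (omega_p^2/omega^2) * T still exceeds the particle energy."""
    return particle_kev / temperature_kev


n = 8.42e13                                   # cm^-3, density used in the text
omega_p = plasma_frequency(n)
ratio = min_frequency_ratio(150.0, 15.0)      # 150-keV particles, 15-keV plasma
omega = omega_p / math.sqrt(ratio)
lam = wavelength_cm(omega)
ratio_alpha = min_frequency_ratio(3500.0, 15.0)  # 3.5-MeV He4 products

print(f"omega_p = {omega_p:.3g} rad/s, omega = {omega:.3g} rad/s")
print(f"wavelength = {lam:.3g} cm, ratio needed for He4 = {ratio_alpha:.0f}")
```

The results agree with the quoted figures to within about one percent, the residual difference coming from the rounding of the constants.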
In view of the inefficiency of most plasma heating processes, it would clearly be desirable to keep $t$ reasonably long, say 100 sec; but if this were not possible one might hope to improve the efficiency with which this energy was recovered instead.

We shall now consider the heating of the plasma which results from the penetration of the rf fields into it. There does not appear to exist any theory of the effects of strong electromagnetic fields on collision processes in a plasma or of the heating which might be expected to result; we shall therefore begin by exploring the consequences of the theory appropriate
to weak rf fields in a plasma with screened Coulomb collisions. This has been given by (for example) Silin and Ruchadze (1961), who show that the dielectric coefficient of the plasma acquires an imaginary part: $\mathrm{Im}\,\epsilon = (\omega_p^2/\omega^2)(\nu_{\mathrm{eff}}/\omega)$, where $\nu_{\mathrm{eff}} = 2(32\pi/9m)^{1/2}(e^4/T^{3/2})\,n\log\Lambda$, $\log\Lambda$ being the usual Coulomb logarithm, which we take to equal 20. If we note that $n$, which appears explicitly in $\nu_{\mathrm{eff}}$ and implicitly in $\omega_p$, should be taken as the local value, i.e., $n = n_0\exp[-\epsilon_0(\omega_p^2/\omega^2)(E^2/4n_0T)]$, we see that the power dissipated in the plasma is given by
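$$P_r = \int \omega\epsilon_0\,(\mathrm{Im}\,\epsilon)\,E^2\,d^3r = 4\nu_0n_0T\int \frac{1}{4}\frac{\omega_p^2}{\omega^2}\frac{\epsilon_0E^2}{n_0T}\exp\left(-2\,\frac{\omega_p^2}{\omega^2}\frac{\epsilon_0E^2}{4n_0T}\right)d^3r = 4\nu_0n_0T\int E^2\exp(-2E^2)\,d^3r$$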
where we have reverted to the dimensionless electric field of Section 2 and have written $\nu_0 = \nu_{\mathrm{eff}}(n = n_0)$. With $T$ in keV,
$$P_r = 1.17 \times 10^{-26}\,\frac{n_0^2\log\Lambda}{T^{1/2}}\int E^2\exp(-2E^2)\,d^3r,$$
in watts, and, with the chosen values of n and T ,
$P_r = 4240\int E^2\exp(-2E^2)\,d^3r$. In order to compare this figure with $P_F = 1$ W/cm$^3$, we need to know the precise plasma configuration, i.e., the plasma volume $V_p$ and the value of the integral $\int E^2\exp(-2E^2)\,d^3r$. Even for those configurations for which quasi-potential theory is valid, this integral can only be performed numerically or by elaborate approximation. Since for $\omega_p^2/\omega^2 \sim 1000$ quasi-potential theory cannot be better than a first approximation and must break down badly in the region of space which contributes most to this integral, we shall simply estimate the integral as $4\pi r^2sa$, where $r$ is the radius of the plasma (assumed spherical for simplicity), $s$ is the depth of penetration of the rf into the plasma (a depth of order $c/\omega_p$ according to quasi-potential theory but presumably here somewhat greater), and $a$ is the maximum value of the integrand, numerically 0.183. Thus, if we define the volume of the boundary layer $V_s = 4\pi r^2s$,
$$P_r = 4240\,aV_s/V_p = 704\,V_s/V_p\ \mathrm{W/cm^3}.$$
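The constant $a$, the maximum of the dimensionless integrand $E^2\exp(-2E^2)$, can be confirmed numerically. The sketch below is illustrative only: it scans the integrand on a grid, compares the result with the analytic maximum $1/2e$ at $E^2 = 1/2$, and evaluates the thin-shell volume ratio for a sphere. The helper names and the sample values of $s$ and $r$ are assumptions, not values from the article.

```python
import math


def integrand(e2):
    """Dimensionless heating integrand E^2 * exp(-2 E^2), written in terms of E^2."""
    return e2 * math.exp(-2.0 * e2)


def shell_volume_fraction(s, r):
    """V_s / V_p for a thin boundary layer of depth s on a sphere of radius r:
    (4 pi r^2 s) / (4 pi r^3 / 3) = 3 s / r."""
    return 3.0 * s / r


# Brute-force scan of E^2 over (0, 5] in steps of 1e-4.
grid = [i * 1.0e-4 for i in range(1, 50001)]
e2_peak = max(grid, key=integrand)
a_numeric = integrand(e2_peak)

# Setting the derivative (1 - 2 E^2) exp(-2 E^2) to zero gives E^2 = 1/2.
a_exact = 1.0 / (2.0 * math.e)

# Hypothetical geometry, for illustration only: s = 1 cm, r = 100 cm.
fraction = shell_volume_fraction(1.0, 100.0)

print(f"peak at E^2 = {e2_peak:.3f}, a = {a_numeric:.4f} (exact {a_exact:.4f})")
print(f"V_s/V_p for s = 1 cm, r = 100 cm: {fraction:.3f}")
```

The scan reproduces the value of about 0.18 used for $a$ in the estimate.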
It may be remarked that this very large quantity (very large on any reasonable assumption about $V_s/V_p$), deduced from linear theory, probably enormously exaggerates the heating effect, since (as the theory of electron runaway shows) the viscous drag of a plasma on an electron moving in a strong electric field, such that the ratio of its oscillation velocity to the thermal velocity is large when $\omega_p^2/\omega^2 \gg 1$, is very much less than
the drag on a thermal electron. Thus, although on the above linear theory the plasma heating would exceed the fusion power and would in consequence require a very efficient heat recovery procedure at the end of the reaction cycle, it is probable that in practice $P_r$ has a negligible effect on the power balance, though it remains a serious burden on the rf generator.

Finally, let us consider the rf losses in the cavity walls. Since these depend upon geometric factors and are roughly proportional to the surface area of the cavity, it is necessary to decide upon its shape. It is clear that the most favorable arrangement would be a large spherical plasma, isolated from the cavity walls by a thin rf layer of thickness $d = \lambda/2$. As the radius $r$ of the plasma tends to infinity (though practical considerations would probably dictate an upper limit of, say, $r \sim 10\lambda$) the rf configuration in this layer would asymptotically approach that of a plane wave incident on a plane conducting surface. The energy loss resulting from the finite conductivity $\sigma$ of this surface is given by Panofsky and Phillips (1962) as
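$$d\mathcal{E}/dt = \int \mathbf{N}\cdot d\mathbf{S} = \omega\delta\int \tfrac{1}{2}\mu_0H^2\,\mathbf{n}\cdot d\mathbf{S} = \omega\delta\,\tfrac{1}{2}\epsilon_0E_{\max}^2\int \mathbf{n}\cdot d\mathbf{S} = 2\omega\delta n_0TS$$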
where $\mathbf{N}$ is the Poynting vector, $\int d\mathbf{S}$ is a surface integral over the cavity wall, $\mathbf{n}$ is a unit vector in the direction of $\mathbf{N}$ (which in the plane wave limit is normal to the wall), and $\delta = (2/\mu_0\omega\sigma)^{1/2}$ is the depth of penetration of the rf into the wall as a result of its finite conductivity. In deriving the last two expressions, we have used the fact that for a plane wave $\tfrac{1}{2}\mu_0H^2$ is maximum at the wall and equals $\tfrac{1}{2}\epsilon_0E^2$ at one-quarter wavelength inside the wall, where $E = E_{\max}$, and have used the result of quasi-potential theory to replace $\tfrac{1}{2}\epsilon_0E_{\max}^2$ by $2n_0T$. For a spherical plasma of radius $r$ we can express the surface area $S = 4\pi(r + d)^2$ in the form $S = (3/r)(1 + d/r)^2V_p$, and hence we can obtain the power per unit volume of plasma $P_c$ required to maintain the cavity losses:
$$P_c = 1.25\,(\omega\delta/d)\,[(d/r)(1 + d/r)^2]\ \mathrm{W/cm^3}.$$
The expression in square brackets, which depends simply on the size of the configuration, cannot for practical reasons be indefinitely small. If we take 1/20 as its lower limit, we see that for positive power balance ($P_F > P_c$) there is a very stringent upper limit, $\omega\delta/d < 16$. If we take $d = \lambda/2$, we obtain $\omega^2\delta < 1.5 \times 10^{12}$. For copper, $\delta = 16.5/\omega^{1/2}$, and we obtain a maximum frequency $\omega = 2.0 \times 10^7$ sec$^{-1}$ and, consequently, $\omega_p^2/\omega^2 \sim 10^9$. Frequencies as low as this could not be used to excite the natural modes of a cavity of reasonable size; it would therefore be necessary to use forced
oscillations, and the electromagnetic field distribution would be quasi-magnetostatic. Such an approach, though possibly feasible, could not be discussed by any of the methods developed in this article, and we shall not consider it further. Since for $d = \lambda/2$, $\omega\delta/d \sim \omega/Q$, this argument is essentially the same as the one which led Osovets to conclude that there was no possibility of using rf confinement as the basis of a thermonuclear reactor.

If we allow the possibility of using superconducting cavity walls, the position at once looks more attractive. The theory of superconducting rf cavities, and some very promising experimental results, are described in an article by Schwettman et al. (1964). Since the falloff of the electric field in a superconductor is not exponential, one cannot use the analysis given above to calculate the power loss, but from measurements of the $Q$ of normal and superconducting cavities excited in the same mode, it is possible to determine an experimental effective $\delta$. In the experiment quoted, the $Q$ of a superconducting lead cavity at 1.5°K excited in a TE$_{011}$ mode at 2856 Mc/sec was a factor $1.2 \times 10^5$ better than that of a copper cavity at room temperature, and even this was a factor of 25 lower than might be expected theoretically. This gives an effective $\delta$ of $10^{-9}$ cm, or $\omega\delta/d = 3.4$ sec$^{-1}$. Furthermore, theoretically, the effective skin depth should increase linearly with $\omega$, so $\omega\delta/\lambda \propto \omega^3$, and by working with frequencies even slightly lower than that quoted (for which $\lambda = 10$ cm) a dramatic improvement should be obtained. Electric field strengths of $10^5$ V/cm have already been realized; these correspond to magnetic field strengths of 300 G at the cavity walls, and the limit appears to have been set by the magnetic field at which the superconducting property was lost. The rather low temperature of operation was likewise dictated by the low electron pairing energy in lead.
Thus there is hope that both the critical field strength and the operating temperature could be raised somewhat by a judicious choice of superconducting materials. The question of the operating temperature is very important, since the rf energy which is dissipated in the walls must be pumped away if they are to be kept superconducting, and this involves a refrigeration effort which is by no means trivial. Carnot's theorem shows that there is a limit to the efficiency with which this can be done, equal to $T_1/(T_2 - T_1)$, where $T_1$ is the temperature of the superconductor and $T_2$ is room temperature. At 1.5°K this is already a factor of 1/200, and existing refrigerators do not approach even this low efficiency, though refrigerators acting on the Stirling cycle will soon be available which approach the Carnot efficiency. Thus the energy which must be supplied to maintain the superconductivity is $\eta P_c$, where $\eta \geq 200$ at 1.5°K, although it would be reduced substantially if one could use superconductors with a higher critical temperature. Thus if we take $\lambda = 10$ cm, $d = 5$ cm, $r = 100$ cm, and $\eta = 250$, we obtain
$P_c = 53.8$ W/cm$^3$. This figure is still unfavorable, but improvements in the working temperature of the superconductor, or even the realization of the theoretical effective skin depth at 1.5°K, might bring this loss down to the break-even point, and a reduction in frequency (with a consequent scaling up of the size of the reactor, if we keep $d/r$ fixed) would then give a positive net output $P_0$.

Finally, some comment should be made on the permissibility of maintaining a superconductor close to a thermonuclear plasma, since the above calculations have been made on the assumption that the cavity is entirely lined with superconductor and contains no absorbent material other than plasma within it. This assumption is, as it stands, quite unrealistic, since although the charged reaction products remain confined, the neutrons and bremsstrahlung would strike the cavity walls. Although it might be feasible to allow the neutrons to pass through a thin superconducting layer, the bremsstrahlung would almost inevitably be absorbed, and the need to pump its energy away would again make the reactor unworkable. Two alternative approaches to this problem suggest themselves.

(i) Insert a cooled heat shield of some rf transparent material between the plasma and the cavity wall. For cavity walls of low conductivity such a shield, if made of a low loss dielectric such as quartz or titanium oxide (characterized by its loss tangent $\tan\Delta$), might even enhance the $Q$ value, in the manner discussed by Walker and Hyman (1958); for a superconductor, however, it would cause an unacceptably large loss (this can be seen from the fact that it would give the part of the rf filled volume occupied by it an imaginary dielectric coefficient $\mathrm{Im}\,\epsilon = \mathrm{Re}\,\epsilon\,\tan\Delta$ which was much larger than that of the plasma).

(ii) Break up the cavity surface into a finite number of filamentary superconductors. The theoretical problems connected with such an approach have still to be investigated.
We may conclude that the advent of superconductors, though it reopens what appeared to be a closed issue, by no means solves the problem of designing an rf thermonuclear reactor.
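The wall-loss figures used in this section follow from short calculations, sketched here for reference. The sketch assumes the copper skin depth $\delta = 16.5/\omega^{1/2}$ cm, the limit $\omega\delta/d < 16$, and the effective $\delta = 10^{-9}$ cm quoted in the text; the room temperature of 300°K and all function names are assumptions made for illustration.

```python
import math

C_LIGHT = 2.998e10  # speed of light, cm/s


def copper_skin_depth(omega):
    """Skin depth of copper in cm, using delta = 16.5 / omega**0.5 from the text."""
    return 16.5 / math.sqrt(omega)


def max_copper_frequency(limit=16.0):
    """Largest omega (rad/s) satisfying omega*delta/d < limit with d = lambda/2.

    With d = pi*c/omega the condition reads omega**2 * delta < limit*pi*c,
    and delta = 16.5/omega**0.5 turns it into omega**1.5 < limit*pi*c/16.5.
    Returns (omega_max, bound) where bound = limit*pi*c.
    """
    bound = limit * math.pi * C_LIGHT
    return (bound / 16.5) ** (2.0 / 3.0), bound


def loss_parameter(freq_hz, delta_cm):
    """omega*delta/d in 1/s for an isolating layer of thickness d = lambda/2."""
    omega = 2.0 * math.pi * freq_hz
    d = math.pi * C_LIGHT / omega
    return omega * delta_cm / d


def carnot_work_per_unit_heat(t_cold, t_hot=300.0):
    """Ideal work per unit heat removed at t_cold: the inverse of T1/(T2 - T1)."""
    return (t_hot - t_cold) / t_cold


omega_max, bound = max_copper_frequency()
sc_loss = loss_parameter(2856e6, 1.0e-9)   # lead cavity figures quoted in the text
eta_ideal = carnot_work_per_unit_heat(1.5)

print(f"omega^2 * delta bound = {bound:.3g}")
print(f"maximum copper frequency = {omega_max:.3g} rad/s")
print(f"superconducting omega*delta/d = {sc_loss:.2f} per sec")
print(f"ideal refrigeration factor at 1.5 K = {eta_ideal:.0f}")
```

Under these assumptions the sketch returns a bound of about $1.5 \times 10^{12}$, a copper frequency limit of about $2 \times 10^7$ sec$^{-1}$, a superconducting loss parameter of about 3.4 sec$^{-1}$, and an ideal refrigeration factor of about 200.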
6. EXPERIMENTS RELATED TO RADIO-FREQUENCY CONFINEMENT
A. Introduction
One of the striking features of the subject of rf confinement is the very limited extent to which any of the (by now) very extensive body of theory has been tested experimentally; furthermore, the measurements made in the few experiments which have been reported are in almost every case very incomplete. This state of affairs is perhaps hardly surprising in view of the doubts which have been felt about the practicability of devices based upon rf confinement, but it seriously complicates the task of the
present authors in giving a straightforward account of the extent to which the theory has been confirmed, since the evidence is for the most part indirect and ambiguous and was in several cases collected for other purposes. We shall begin by reviewing the experiments which establish that, in plane and cylindrical geometries at least, rf waves do set up a quasi-potential barrier $\psi = e^2\overline{E^2}/2m\omega^2$ which is capable of confining or repelling single particles. We shall then describe the experiment which shows that a cylindrical barrier can be used to focus electron beams by balancing the dispersive space charge forces, and the very few experiments which provide direct evidence for at least partial confinement of a quasi-neutral plasma. Finally we shall mention an experiment which provides indirect evidence for the existence of a quasi-potential relief.
B. Single Particle Confinement
The most convincing quantitative experiments on the confinement of single particles were performed by Bravo-Zhivotovsky et al. (1959) of Gorky University. The first experiment, which was designed to test the ability of the quasi-potential barrier to cut off an electron beam, used a rectangular resonance cavity 2.85 × 1.25 cm in cross section, iris-matched to a waveguide feed and tuned by a plunger in the manner indicated in Fig. 6.

FIG. 6

The cavity was excited in TE$_{10n}$ ($n = 1, 2, \ldots, 5$) modes at a frequency $\omega = 6 \times 10^{10}$ sec$^{-1}$. This configuration has a quasi-potential maximum at the center of the cavity. The height of the barrier is related to the power $P$ supplied to the cavity according to
where $Q$ is the quality factor of the cavity, $V$ its volume, and $a$ a factor relating the maximum value of $E^2$ to the average value over the cavity, which in the present experiments had the value 0.47. Electrons from a gun mounted in a 6-mm-diameter (cutoff) cylinder in the broad face of the cavity could be shot across the cavity to be collected by an electrode at the end of a similar cutoff metal tube in the opposite face. A weak focusing magnetic field of between 10 and 100 G was used to reduce dispersal of the beam in transit. For each of the modes referred to, the rf power $P$ required to cut off the beam was measured as a function of the voltage $V_g$ on the electron gun. The results for $n = 1$ and $n = 5$ are shown in Fig. 7, together with the measured values of $Q$ in the two cases. The power level is in watts for $n = 1$ and kilowatts for $n = 5$; the solid line gives the theoretical value obtained from the above formula. It will be seen that the experimental results agree with the theory to within experimental error (~7%).

FIG. 7

Another experiment carried out by this group involved a cylindrical quasi-potential well, established by exciting a helical slow wave structure of diameter 0.59 cm and pitch 0.03 cm with a 10-cm traveling wave. The use of such helical structures leads to a great economy in the use of rf power, as we shall see shortly; their disadvantage, from the point of view of conducting an experiment designed to test the quasi-potential concept, is that they create slow traveling waves. Such waves, as we saw in Section 1, have a spatial nonuniformity on a length scale much shorter than the vacuum length scale, and, in consequence, the analysis given there does not prove that the quasi-potential concept is applicable in this case. However, as shown by Litvac et al. (1962), it is still applicable provided that the velocity of the particle not only remains much less than $c$ (the condition assumed in Section 1) but also much less than the phase velocity of the slow wave $v_\phi$. (Essentially this is because, under this stricter condition, the force exerted by the magnetic field is still much smaller than that
of the electric field and the particle still executes high-frequency oscillations of small amplitude about a guiding center.) In this experiment $v_\phi \sim c/30$ and the electron velocities were always less than, and for the most part much less than, $v_\phi$, so the experiment does provide a test of quasi-potential theory. Electrons from a gun were fired along the axis of the spiral. Over a short section of their path ($l = 3$ cm) they were subjected to a weak perpendicular magnetic field ($B \sim 1$ G), as a result of which they acquired a directed perpendicular motion $v_\perp = eB_0l/m$ that could be varied in the range $10^6$-$10^9$ cm/sec and was always significantly greater than the random perpendicular velocity of the electrons resulting from imperfect beam collimation and space charge effects. In the absence of an rf field on the helix, this perpendicular motion led to a loss of electrons to the walls of the tube, and no electron current was detected. An rf field on the helix, however, creates electric fields
$$E_z = E_0I_0(\gamma r)\exp[i(\omega t - k_\parallel z)]$$
and
$$E_r = (ik_\parallel/\gamma)E_0I_1(\gamma r)\exp[i(\omega t - k_\parallel z)].$$
Hence a quasi-potential well given by
$$\psi = (e^2E_0^2/2m\omega^2)[I_0^2(\gamma r) + I_1^2(\gamma r)] \qquad (2)$$
is set up, where $I_0$, $I_1$ are modified Bessel functions; $\gamma$ is the perpendicular wave number, given by $\gamma^2 = k_\parallel^2 - k^2 = (\omega^2/v_\phi^2)(1 - v_\phi^2/c^2)$; and $E_0$ is the electric field strength on the axis, a quantity which is related to the power $P$ supplied to the helix by $E_0^2 = \alpha P$, where $\alpha$ is a constant expressible in terms of the dimensions of the helix. [Expressions for the spatial variation of the fields and the impedance of the helix are to be found in, for example, Pierce (1950).] It was found that a sufficiently large quasi-potential prevented loss of electrons and led to the detection of the full beam current at the far end of the helix; the power needed to achieve this was shown to be proportional to the magnitude of the perpendicular energy $V$ given to the beam by the uniform magnetic field until the latter was so large that the condition $v_\perp > v_\phi$ (the electron gun used worked poorly at higher voltages).

In discussing these experiments, we shall begin by considering how the interpretation given by Birdsall and Rayfield, based on the analysis given in the theoretical part of their paper, is related to quasi-potential theory. As we have seen, for $v_\phi \gg v_{00}$, the results of quasi-potential theory should be applicable and hence the cylindrical potential well given by (2) should be set up. In the absence of perpendicular motion resulting from imperfect
collimation, the condition for beam focusing is that the quasi-potential force at the edge of the beam should balance the electrostatic force on an electron there. This force is given by (see the paper of Weibel and Clark for a detailed discussion) $F = eE_r = (e^2/r_0\epsilon_0)\int_0^{r_0} n(r)\,r\,dr$, where $n$ is the electron density at the point $r$ and $r_0$ is the radius of the beam (a quantity which is perfectly well defined, since by hypothesis there is no thermal spread). This can be re-expressed in terms of the total beam current $I_{00}$ and its velocity $v_{00}$ as $F = eI_{00}/2\pi r_0\epsilon_0v_{00}$. Hence the quasi-potential force balances the space charge force at the boundary if
For $v_{00} \gtrsim v_\phi$, the quasi-potential approach becomes inapplicable, since the Taylor series expansion of $E(\mathbf{R} + \boldsymbol{\epsilon})$ about $E(\mathbf{R})$ fails. However, as Birdsall and Rayfield point out, an approximate method of solution can still be developed, since for the slow waves set up within a helix, the magnetic field strength is very small. This is most easily seen by observing that, apart from quantities of order $v_\phi^2/c^2$, the electric field components quoted above can be derived from a scalar potential
$$\phi = V_1I_0(\gamma r)\cos(\omega t - k_\parallel z).$$
Hence $\nabla \times \mathbf{E} = 0$ apart from terms of order $v_\phi^2/c^2$, and $\mathbf{B}$ vanishes in the same approximation. Thus, as we showed in Section 1, by making a Galilean transformation to the frame in which the wave is at rest, one can demonstrate that the actual particle motion is approximately the motion of a particle in an electrostatic potential field $\phi = V_1I_0(\gamma r)\cos(k_\parallel z')$. However, the problem of the motion of a charged particle in a periodic electrostatic field has been discussed by several authors (e.g., Clogston and Heffner, 1954; Tien, 1954); they show that the field has a focusing effect on a particle moving parallel to the symmetry axis. In Tien's analysis, which is more general than Clogston and Heffner's and gives their results as a special case, a beam of electrons carrying a current $I_0$ is considered to move along the symmetry axis in an electrostatic potential $V_1\sin\beta z$, with a velocity $v_z$ which is modulated as a consequence of the electrostatic potential about an unperturbed velocity $v_0$. The modulations are treated as small ($|v_z - v_0|$ > $v_0 \gg v_{00}$, we recover quasi-potential theory, as we should since with this ordering of velocities both the quasi-potential and electrostatic focusing approaches are valid. For $v_{00} \gg v_\phi$, on the other hand, the electrostatic theory shows that the quasi-potential force is diminished by a factor $(v_\phi/v_{00})^2$. Provided that this force is sufficient to maintain confinement, the assumptions underlying (5) remain valid. As $v_{00}$ tends toward $v_\phi$, however, the assumption that the electrostatic field only modulates predominantly rectilinear orbits becomes invalid, and it is clear that the particles become free to escape in the radial direction at each of the nodal planes of the electrostatic potential.
It will be seen that the predictions of both these approaches, which coincide when their conditions of applicability overlap, are in qualitative agreement with the experimental results obtained by Birdsall and Rayfield. In view of the partial beam loss which occurred even under optimum conditions, it does not appear to be possible to claim quantitative confirmation. In their summary, these authors make some remarks about the power requirements of this method of focusing, as compared with focusing by fast traveling waves, which are somewhat misleading. They point out that the power required (~15 W) to confine their beam was smaller than the power used by Weibel and Clark to confine an "almost identical" beam using fast waves by a factor of 16,000, and they speculate about the possibility of turning this power saving to good account in a fusion machine. However, this factor of 16,000 arises from a combination of two circumstances. In the first place, they arranged by beam collimation for the random perpendicular motion of the electrons to be negligible (i.e., they used an essentially "cold" electron beam), whereas in the experiments of Weibel and Clark, the thermal motion was the dominant cause of beam dispersal and hence required a substantially deeper quasi-potential well. Thus, the beams were not "almost identical" in the relevant respect. In the second place, it is characteristic of a helical slow wave structure that it can be excited at a much lower frequency than a waveguide of the same dimensions. In the experiment of Weibel and Clark, for example, the frequency was 9.3 times higher than in that of Birdsall and Rayfield. The power required is, of course, sensitively dependent on the frequency of operation, and this is the main reason why slow waves are a more economical means of focusing electron beams than fast waves. This dependence of the power required upon frequency is a combination of two factors: the power loss for a given electric field strength varies as $\omega^{3/2}$ for normal conductors and as $\omega^3$ for superconductors (as we saw in Section 5), whereas the mean square field strength required to maintain a given height of quasi-potential barrier increases as $\omega^2$. Thus the power required to confine single particles of given transverse energy increases as $\omega^{7/2}$ for normal metals and as $\omega^5$ for superconductors. A factor of $(9.3)^{7/2}$ already accounts for most of the difference between the power consumptions in the two experiments. However, in a fusion machine, it is necessary to confine not single particles but a high density plasma, and its pressure has to be balanced by electromagnetic pressure, so the extra factor of $\omega^2$ associated with the quasi-potential does not arise. Nevertheless, the possibility of using a helical structure instead of a waveguide in a fusion machine needs examination.
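The size of the frequency factor can be evaluated directly. The sketch below simply adds the exponents given in the text (3/2 or 3 for the wall loss at fixed field strength, 2 for the field needed to hold a given barrier height) and raises the frequency ratio 9.3 to the total; the function name is illustrative.

```python
def power_scaling_exponent(superconducting=False):
    """Exponent p in (required power) ~ omega**p for single-particle confinement.

    The wall loss at fixed field strength scales as omega**1.5 (normal metal)
    or omega**3 (superconductor); the mean square field needed for a given
    quasi-potential barrier adds a further omega**2.
    """
    wall_loss = 3.0 if superconducting else 1.5
    barrier = 2.0
    return wall_loss + barrier


frequency_ratio = 9.3  # Weibel and Clark relative to Birdsall and Rayfield
factor_normal = frequency_ratio ** power_scaling_exponent(superconducting=False)

print(f"exponent (normal metal) = {power_scaling_exponent():.1f}")
print(f"exponent (superconductor) = {power_scaling_exponent(True):.1f}")
print(f"(9.3)^(7/2) = {factor_normal:.0f}")
```

For normal metals the frequency ratio alone gives a factor of roughly $2.5 \times 10^3$, to be compared with the quoted overall factor of 16,000.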
D. Direct Evidence of Radio-Frequency Confinement of Plasma
There has been a tendency in the literature to interpret all observations of plasma confinement in which an rf field is present as resulting from the establishment of a quasi-potential well. Thus, for example, Thompson's dark zones in electrodeless discharges, the confinement of plasmoids between the plates of a condenser fed with rf, and the experiments of Birdsall and Lichtenberg on plasma confinement on the axis of a 3-cm helix fed with rf power in the 3- to 25-Mc/sec frequency range have all been discussed from this point of view. An interesting paper by Butler and Kino (1963) points out the wrongness of these interpretations; in each of these cases, the wavelength of the rf is so large compared with the dimensions of the apparatus that the rf fields are not even approximately natural modes for the system. Thus the field nonuniformity is on a length scale which is so short compared with the vacuum length scale
that the quasi-potential concept becomes inapplicable. True, we have just seen that it may still be used to discuss the action of a helical slow wave structure provided that $c \gg v_\phi \gg v_e$; however, in the experiment of Birdsall and Lichtenberg (1959) $v_\phi$ is so much smaller than the thermal velocity of the electrons that not only is the quasi-potential concept inapplicable, but the electrostatic periodic focusing would be negligible as well. An alternative explanation of the plasma confinement in this and other experiments, in terms of the formation of a positive ion sheath, was suggested by Sturrock (1959); his theory has been extended and experimentally corroborated by Butler and Kino.

FIG. 11 (labels: measurement of plasma concentration; electromagnetic field; to the vacuum system)

Nevertheless, in the view of the present authors, a number of experiments have confirmed the possibility of confining a quasi-neutral plasma in a quasi-potential well. The earliest of these are the Russian experiments reported at the Geneva and Salzburg conferences by Glagolev and his co-workers (see Vedenov et al., 1959; Arsenev et al., 1961). Their Geneva paper describes an experiment in which a uniform magnetic field was used to confine a cylindrical plasma in the radial direction, and two rf cavities were used to ensure confinement in the axial direction. A schematic representation of their apparatus is given in Fig. 11. The two rf resonant cavities, which had cutoff openings to receive the quartz tube within which the plasma was created, were excited in the TE$_{101}$ mode by pulses of up to 400 kW of rf power of duration 120 μsec supplied by a magnetron operating in the 10-cm waveband. The peak rf magnetic field was 60 G, whereas the uniform magnetic field was in the range 0-2000 G. Thus the electron cyclotron frequency could be made greater or less than the rf
frequency, though the significance of this for confinement is not discussed in the experimental part of the paper. The plasma was created within the quartz tube (of length 40 cm and diameter 1.5 cm) by means of the rf pulse subsequently used to confine it. The ionization process was observed to initiate at the ends of the tube and to spread down it to the middle. The plasma density was determined by measuring the frequency shift of the measuring resonator shown and by measuring the cutoff frequency for propagation of rf waves with wavelengths in the range 3 to 0.8 cm. The behavior of the apparatus depended on the gas pressure and rf power used. For example, with argon at 3 X mm Hg a plasma of density $10^{13}$ cm$^{-3}$ was formed with an rf pulse giving a maximum magnetic field of 30 G; and under these conditions the confining resonators were highly detuned during part of the pulse, indicating penetration of the plasma into them. When the rf field strength was increased to 60 G, the plasma density reached at this gas pressure was unaltered whereas the detuning of the resonators was reduced almost to zero, though at higher gas pressures, a substantial detuning associated with a higher plasma density was again observed. Similar results were obtained with other gases. To assess the significance of these observations, they measured at low rf power the detuning of the cavity produced by a measured density of plasma in it, and they established that a plasma of density above $10^{10}$ cm$^{-3}$ shifted the resonator frequency outside the reception band for the rf supply system. Thus, their experiments showed that a 60-G rf field could confine a plasma of density $10^{13}$ cm$^{-3}$ in such a way that the density in the cavities did not exceed $10^{10}$ cm$^{-3}$. The temperature of the plasma was estimated from the duration of the afterglow as being of order 5 eV. This gives a plasma pressure of 80 dyn/cm$^2$. The rf pressure was estimated from the condition which would hold in plane geometry:
$$\tfrac{1}{2}\epsilon_0E^2 + \tfrac{1}{2}\mu_0H^2 = \tfrac{1}{2}\mu_0H_{\max}^2 = 70\ \mathrm{dyn/cm^2}.$$
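Both sides of this pressure balance can be reproduced approximately from the quoted density, temperature, and peak rf field. The sketch below works in Gaussian units and assumes that the rf pressure is the time-averaged magnetic pressure $B_{\mathrm{peak}}^2/16\pi$ and that the plasma pressure is $nT$ for a single species; both are assumptions about how the quoted figures were obtained, and the function names are illustrative.

```python
import math

ERG_PER_EV = 1.602e-12  # energy conversion, erg per eV


def plasma_pressure(n_cm3, t_ev):
    """Kinetic pressure n*T in dyn/cm^2 (one species; assumed convention)."""
    return n_cm3 * t_ev * ERG_PER_EV


def rf_pressure(b_peak_gauss):
    """Time-averaged magnetic pressure <B^2>/(8 pi) = B_peak^2/(16 pi), dyn/cm^2."""
    return b_peak_gauss**2 / (16.0 * math.pi)


p_plasma = plasma_pressure(1.0e13, 5.0)  # density 1e13 cm^-3, temperature 5 eV
p_rf = rf_pressure(60.0)                 # peak rf magnetic field of 60 G

print(f"plasma pressure = {p_plasma:.1f} dyn/cm^2")
print(f"rf pressure     = {p_rf:.1f} dyn/cm^2")
```

With these assumptions the two pressures come out near 80 and 72 dyn/cm$^2$, close to the values quoted.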
Thus, the condition for pressure balance is seen to be satisfied to within experimental error.

In their second (Salzburg conference) paper they described two further experiments. The first was so similar to the experiment described at the Geneva conference that it is not clear what motivated it. The chief differences were the use of a single resonant cavity in place of the two used previously and the employment of a separate plasma injector instead of using the same rf field both to create the plasma and confine it. This latter feature had the advantage that it was possible to vary the plasma density independently of the rf field strength and hence to give a more unambiguous demonstration of confinement. Its disadvantage was that the maximum plasma density attained was an order of magnitude lower. In this experiment, a TE$_{111}$ mode was used, and the maximum field strength was raised to 250 G ($E \sim 100$ kV/cm) and the duration of the pulse extended to 1 msec. Furthermore, the plasma density within the resonator was determined by exciting it simultaneously in a low amplitude TM$_{010}$ mode and measuring the frequency shift. It was shown that at full rf power a plasma of density $10^{12}$ cm$^{-3}$ could be expelled from the cavity, the leakage of plasma back into the cavity not exceeding 0.01%. The temperature of the plasma is unfortunately not quoted, but it was presumably of the same order as in the previous experiment, so the published measurements on this apparatus do no more than confirm the interpretation given to the measurements made previously. Their second experiment was primarily related to the theory of rf stabilization of plasmas confined by magnetic fields, a subject which has received considerable attention at the Kurchatov Institute (Volkov, 1959; Osovets, 1959; Volkov and Kadomtsev, 1962), but which lies outside the scope of the present review, since the rf frequency or field configuration are chosen in such a way that the dominant force acting on the plasma is due to the magnetic component of the rf field.

In the experiments described above, the rf field was used to maintain confinement in one direction only, that of the magnetostatic field. A more ambitious experiment has been undertaken by Consoli's group at Saclay, with the object of demonstrating three-dimensional confinement exclusively by rf fields and by rf fields in combination with a magnetic mirror field. In their experiments a spherical cavity was used; in their original design of the experiment it was intended that this should be used to realize the field configuration proposed by Butler et al. (1958), the rotating mode obtained by exciting the TE$_{111}$ and TE$_{110}$ modes in quadrature.
However, they encountered experimental difficulties in achieving this at high frequencies, and the work reported in the literature describes almost exclusively the effect of a TE$_{110}$ mode in combination with a magnetic mirror field orientated along the symmetry axis of the rf mode. In 1962, Consoli and Le Gardeur (1962b) reported their first experiment on combined rf magnetic mirror confinement; they used a plasma created by a discharge passing through a spherical resonator supplied with rf from power triodes at 1135 Mc/sec, which is the resonant frequency of a sphere of diameter 38 cm excited in the TE$_{110}$ mode. In a regime with pulses of 100 μsec the power level was 20 kW, while in the continuous regime the power level was 200 W. The apparatus is shown in Fig. 12. The plasma from the discharge is sketched in, as well as field coils for a static magnetic field and a probe which measured the radial flux of particles escaping from the resonator. The arc discharge was run at 100 A in the pulsed regime and at 8 A for
continuous operation. The density in the center region was measured by microwave interferometry (2-mm wave equipment) and reached $7 \times 10^{13}$ electrons per cubic centimeter. The ion temperature was measured spectroscopically, by observation of the H$\alpha$ line, to be $\sim$2 eV. The gases used were H$_2$, He, and Ar. In the pulsed regime, the following evidence for confinement was obtained. Application of the rf increased the density by a factor of 2 to 3 and increased the decay period of the density when it was switched on during the afterglow of the discharge. The density increase could not be ascribed to ionization by the rf because at pressures of 4 x mm even full ionization would only give an increase of 10%. In the continuous regime, the particle flux measured by the probe decreased when the rf was applied. The experiment was repeated for various values of the magnetic field, and Fig. 13 shows the results registered on an oscilloscope with a persistent screen so that the various deflections form a continuous curve. The uppermost trace shows the particle flux in the presence of the magnetic field but without rf. The trace below shows the reduction in flux obtained by the application of rf. The lowest trace shows the ionization obtained by the rf alone when the discharge was switched off. High ionization was achieved when the rf frequency approached the cyclotron frequency, which happens near 400 G. The effect of this ionization tends, of course, to diminish the reduction in flux due to confinement.

FIG. 12. Experimental apparatus. Labels legible in the figure: triode generator (50 kW pulsed), particle collector, insulation, variable feedback, cathode.
FIG. 13. Escaping particle flux as a function of the magnetic field $B$ (gauss). Gas: argon; $p = 3 \times 10^{-4}$ mm; $I_d = 4$ A. Traces: without rf field; with rf field (showing a reversal due to the resonance); ionization due to rf alone.
In order to discuss the significance of these results, we need to calculate the confining pressure due to the rf by means of the formula

$$\tfrac{1}{2}\epsilon_0 E_{\max}^2 = \frac{QP}{\alpha\,\omega V},$$

where $P$ is the rf power, $Q$ is the quality factor of the cavity, and $V$ is the volume; the constant $\alpha$ relates the maximum field strength to the volume integral of the squared field, i.e., $\alpha E_{\max}^2 V = \int E^2\, d^3r$. For the empty cavity $Q$ may be calculated to be $6 \times 10^4$ for the TE$_{110}$ mode. The paper does not indicate a measured value. Inserting this value of $Q$ we find $\tfrac{1}{2}\epsilon_0 E_{\max}^2 = 50/\alpha$ dyn/cm$^2$, while the particle pressure on the axis amounts to $n(T_e + T_i) = 224[(T_e + T_i)/\mathrm{eV}]$ dyn/cm$^2$. The factor $\alpha$ is approximately $\tfrac{1}{2}$, and although the ion temperature $T_i$ was 2 eV the electron temperature $T_e$ may be higher than this.
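The order of magnitude of the confining pressure can be checked numerically. The sketch below is not taken from the paper under discussion: it assumes that the energy stored in the cavity is $QP/\omega$ and that this equals $\tfrac{1}{2}\epsilon_0 \alpha E_{\max}^2 V$, and it uses the quoted values (1135 Mc/sec, sphere of diameter 38 cm, $Q = 6 \times 10^4$, 20 kW pulsed and 200 W continuous). The function name and its arguments are illustrative only.

```python
import math


def rf_pressure_dyn_cm2(power_w, q_factor, freq_hz, radius_m, alpha=1.0):
    """Estimate (1/2)*eps0*E_max^2 in dyn/cm^2 for a spherical cavity.

    Assumption of this sketch: stored energy = Q*P/omega, and
    stored energy = (1/2)*eps0*alpha*E_max^2*V.
    """
    omega = 2.0 * math.pi * freq_hz
    volume_m3 = 4.0 / 3.0 * math.pi * radius_m ** 3
    stored_energy_j = q_factor * power_w / omega
    pressure_pa = stored_energy_j / (alpha * volume_m3)
    return 10.0 * pressure_pa  # 1 Pa = 10 dyn/cm^2


# Quoted parameters: 1135 Mc/sec, 38 cm diameter sphere, empty-cavity Q.
pulsed = rf_pressure_dyn_cm2(20e3, 6e4, 1135e6, 0.19)
continuous = rf_pressure_dyn_cm2(200.0, 6e4, 1135e6, 0.19)

print(f"pulsed regime:     {pulsed:.1f}/alpha dyn/cm^2")
print(f"continuous regime: {continuous:.2f}/alpha dyn/cm^2")
```

Under these assumptions the pulsed value comes out within about 20% of the $50/\alpha$ dyn/cm$^2$ quoted above, and the continuous value is smaller by the power ratio of 100.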
We first consider the pulsed regime. The confining pressure is $\sim$100 dyn/cm$^2$ while the plasma pressure is 500 dyn/cm$^2$ or higher. It is therefore not surprising that the plasma escaped. In the afterglow, however, when the plasma density has decayed to one-fifth of the original density, equilibrium might have been reached, which would explain the increase in the decay time. In the continuous regime, however, the rf pressure is only $\sim$1 dyn/cm$^2$ and is negligible compared with the plasma pressure. We therefore have to explain why the rf reduces the particle flux at, say, 250 G. A qualitative explanation may perhaps be given if we consider the quasi-potential acting on single particles, which as we have seen is increased by the magnetic field by a factor $(1 - \Omega/\omega)^{-1}$, which is 2.8 for the present experiment. The quasi-potential in this case is $\sim$1 eV without magnetic field and 2.8 eV at 250 G. Thus, if we assume that the electron temperature was of order 2 eV, the quasi-potential of the vacuum field in the cavity would be sufficient to confine single particles. We know, however, that a plasma of pressure greater than the rf pressure must eventually modify the vacuum field configuration in such a way that it can escape; but it seems reasonable to expect this process to occur more slowly than the free escape of individual particles, as the experimental results indicate. It is clear, however, that such experiments in the continuous regime do not demonstrate plasma confinement. The same objection applies to similar experiments by Consoli et al. (1962b) reported in the same year.
Further experiments with plasma created in a similar spherical cavity by the rf itself, also in the presence of a dc magnetic field, were reported at the International Colloquium held at Saclay in 1964 (Consoli et al. 1964b). The power had been increased to 80 kW, and the pressure was $2 \times 10^{-4}$ mm.
Photographs of the plasma luminosity were taken by means of an image converter which could be switched on for a very short time; they are shown in Figs. 14 and 15. The oscillograms above each photograph show three traces. The uppermost one shows the pulse that switched on the image converter at a time which varies progressively in relation to the start of the rf pulse; the middle one shows a quantity corresponding to the escaping particle flux; and the lowest one shows the rf pulse, which is seen to be fairly reproducible except in two cases and which in every case cuts off as a result of cavity detuning due to plasma buildup in it. In Fig. 14 the power was 80 kW and the pressure $2 \times 10^{-4}$ mm, while in the case of Fig. 15 the power was 20 kW and the pressure $2 \times 10^{-3}$ mm. Figure 14 shows, according to the authors, that the plasma is confined during the rf pulse, but it expands somewhat until, after the pulse, it is no longer confined and subsequently decays. Looking at Fig. 15 we see
that no confinement was obtained with the lower rf power and higher gas pressure. Unfortunately, no data concerning the temperature and plasma density are given in the paper. It is hard to base conclusions on the observation of the distribution of luminosity alone because this merely indicates the
FIG. 14. Image-converter photographs with the corresponding oscillograms (horizontal axis: time).
regions where conditions favorable for the generation of visible light exist. In conjunction with the particle loss measurements, however, the photographs provide reasonable evidence for confinement and we can use the rf pulse trace to draw further conclusions. When breakdown is produced in a gas by microwaves, at a power level such that the rf pressure is not enormously larger than the plasma pressure, the ionization must cease once the plasma frequency exceeds the microwave frequency, because the rf can no longer penetrate into the cavity. The electron density
could not exceed 1-2 $\times 10^{10}$/cm$^3$ under these circumstances, and the plasma pressure $2nT$ could not exceed $3.8 \times 10^{-2}\,(T/\mathrm{eV})$ dyn/cm$^2$. At the power level of 80 kW, however, the rf pressure is 400 dyn/cm$^2$. We shall see later that the temperature is likely to be of the order of 10 eV, so that the rf should be able to localize the plasma near the axis. Indeed, at this power level, it would be possible to achieve an equilibrium with a 1000-fold increase of
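FIG. 15. Image-converter photographs with the corresponding oscillograms at the lower power and higher pressure (horizontal axis: time).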
plasma density at the axis. It is certain, however, that the cavity detuned before such densities were reached, and more elaborate frequency tracking would be needed to reach this equilibrium. In fact, one can see from Table I that, in cylindrical geometry at least, a detuning of 0.5% is reached when $\omega_p^2/\omega^2$ is as low as 0.9. Consoli et al. indicated at the colloquium that their rf supply system could track up to about 0.5% in frequency. If the contraction of luminosity is due to plasma confinement it must be assumed that, at the higher pressure of $2 \times 10^{-3}$ mm, despite the lower power, the rate
of ionization is so fast that luminosity appears almost simultaneously throughout the cavity. The breakdown mechanism is different at the higher pressures, as we shall see below. This would explain why Fig. 15 shows no confinement of luminosity to a central region. The authors state that no light output was observed with the lower pressure for a time of 7 $\mu$sec after the onset of the rf pulse, and the first photograph of Fig. 14 shows the beginning of the light output. It appears that the collisions leading to light output were most frequent in the axial region, where the particles oscillating in the quasi-potential well have maximum energy. We do not know whether the light output is larger than that observed in Fig. 15 throughout the pulse. The buildup and decay of rf fields in a cavity follows a law $\exp(\pm(\omega/Q)t) = \exp\{\pm(t/8.5 \times 10^{-6}\ \mathrm{sec})\}$ in vacuo in the present case. Looking at the photographs of Fig. 14, we see that the buildup time is of the order of 6 $\mu$sec, indicating merely that the loaded $Q$ is in fact lower than $6 \times 10^4$, but the decay time is even shorter, i.e., only 2 $\mu$sec. In Fig. 15 the buildup time is shorter than in Fig. 14 and the decay time is equal to the buildup time. This indicates that losses in the plasma due to ionization, excitation, and heating are of the order of the copper losses. The sharp onset of the decay in Fig. 14 indicates that the cavity is by this time filled with plasma exceeding the critical density, and the decay time is even shorter than in the case of Fig. 15. If there is confinement, the changed field configuration also lowers $Q$, because the volume accessible to the fields is lower. But we may merely see enhanced collision losses.
In yet another experiment by Consoli et al. (1964c), an axial and an equatorial electron collector allowed the determination of the time dependence of the escaping flux, which was carried out at gas pressures from 1 to 1.7 X 10P mm.
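Two of the numbers used in this argument, the density at which the plasma frequency reaches the microwave frequency and the empty-cavity time constant $Q/\omega$, follow directly from the quoted frequency and $Q$. The following is a minimal sketch in SI units with standard values of the physical constants; the function names are illustrative and do not come from the papers discussed.

```python
import math

E_CHARGE = 1.602176634e-19   # C
E_MASS = 9.1093837015e-31    # kg
EPS0 = 8.8541878128e-12      # F/m
EV_TO_ERG = 1.602176634e-12  # erg per eV


def cutoff_density_cm3(freq_hz):
    """Electron density (cm^-3) at which omega_p equals the applied omega."""
    omega = 2.0 * math.pi * freq_hz
    n_m3 = EPS0 * E_MASS * omega ** 2 / E_CHARGE ** 2
    return n_m3 * 1.0e-6


def plasma_pressure_dyn_cm2(n_cm3, temp_ev):
    """Pressure 2nT (equal electron and ion temperatures) in dyn/cm^2."""
    return 2.0 * n_cm3 * temp_ev * EV_TO_ERG


def cavity_time_constant_s(q_factor, freq_hz):
    """Field e-folding time Q/omega, following the law exp(-(omega/Q) t)."""
    return q_factor / (2.0 * math.pi * freq_hz)


n_c = cutoff_density_cm3(1135e6)
tau = cavity_time_constant_s(6e4, 1135e6)
print(f"cutoff density: {n_c:.2e} cm^-3")
print(f"2nT at cutoff, T = 10 eV: {plasma_pressure_dyn_cm2(n_c, 10.0):.2f} dyn/cm^2")
print(f"Q/omega: {tau * 1e6:.1f} microseconds")
```

The cutoff density falls in the 1-2 $\times 10^{10}$/cm$^3$ range quoted above, and $Q/\omega$ comes out close to $8.5 \times 10^{-6}$ sec.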
The setup was presumably similar to that of the previous experiment. The particle flux arrived at the collector after a time delay $t_r$ with respect to the onset of the rf pulse. When the electrons arrive at the collector, the rf ceases to flow into the cavity and the rf pulse measured by a loop inserted into the cavity stops. The authors interpret this as the detuning of the cavity by the plasma accumulating in the cavity. Photographs of the image converter, which is switched on during the pulse, show a cigar-shaped luminosity, elongated along the axis (where, we recall, the TE$_{110}$ mode has a node and so does not confine in the axial direction); the luminosity develops just before the rf pulse cuts off. The authors have measured the rf pulse length $t_r$ as a function of the gas pressure and find a law
To explain this result they proposed a somewhat unsatisfactory ad hoc model. They calculate the time dependence of the ionization and find, for the time needed to reach a density of $10^{11}$ by volume ionization from an initial density of $2 \times 10^8$,

$$1/t_r = 0.16\,(pSv_e - 1/\tau), \tag{8}$$
where $S$ is the ionization cross section, $\tau$ the lifetime of electrons, and $v_e$ their velocity. They conclude from the form of the experimental law that the buildup does indeed occur by ionization in the cavity volume and, by identifying the theoretical and the experimental expressions, determine T = 5 X sec and $Sv_e = 2 \times 10^9$ sec$^{-1}$/Torr. They state that this cross section corresponds to electrons of 30 eV. On the other hand, the energy of the electrons collected is of the order of 10 eV. The initial density of $2 \times 10^8$ is assumed because this is the density required for quasi-neutrality in the quasi-potential field. Some other mechanism is made responsible for generating this initial density. The present authors think that a more appropriate breakdown theory for these very low pressures is that of Self and Boot (1959) reported in the next section. The quasi-potential for the experiment under discussion is 400 V. As the electrons oscillate in the quasi-potential well they soon acquire energies sufficient for ionizing collisions. The ionization probably builds up exponentially with time, but the phenomena are complicated by the plasma accumulation at the center. At higher pressures, the diffusion-controlled mechanism of breakdown involving outward particle flux takes over, as seen in the next section.
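The numerical factor 0.16 in the buildup law can be understood from a simple exponential-growth model. The sketch below assumes $dn/dt = (pSv_e - 1/\tau)\,n$, so that the time to grow from $n_0$ to $n_1$ is $\ln(n_1/n_0)/(pSv_e - 1/\tau)$; taking $n_1 = 10^{11}$ and $n_0 = 2 \times 10^8$ gives $1/\ln 500 \approx 0.16$. The pressure and lifetime used in the example call are hypothetical placeholders, not values from the paper.

```python
import math


def inverse_buildup_time(p_torr, s_ve, tau_s, n_initial=2.0e8, n_final=1.0e11):
    """Return 1/t for exponential growth dn/dt = (p*S*v_e - 1/tau) * n.

    p_torr : gas pressure in Torr
    s_ve   : ionization rate coefficient S*v_e in sec^-1 per Torr
    tau_s  : electron lifetime in seconds
    """
    growth_rate = p_torr * s_ve - 1.0 / tau_s
    return growth_rate / math.log(n_final / n_initial)


# The prefactor depends only on the density ratio.
prefactor = 1.0 / math.log(1.0e11 / 2.0e8)
print(f"prefactor: {prefactor:.3f}")

# Hypothetical example: p = 1e-4 Torr, S*v_e = 2e9 sec^-1/Torr, tau = 1e-5 sec.
rate = inverse_buildup_time(1.0e-4, 2.0e9, 1.0e-5)
print(f"buildup time: {1.0e6 / rate:.0f} microseconds")
```

In this model breakdown can only occur when $pSv_e > 1/\tau$, which is why the law takes the form of a linear function of pressure with a negative intercept.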
E. Indirect Support for the Quasi-Potential Concept from Breakdown Measurements

Experiments on breakdown of gas in microwave cavities were carried out by Self and Boot (1959), and their theoretical interpretation adduces an interesting confirmation of the quasi-potential concept. They find results similar to those of Brown and Macdonald (1949) within the pressure range used by the latter authors, i.e., within the range of validity of their theory. At lower pressures, Self and Boot get very different results which must be explained by quasi-potential theory. In the experiments of Brown and Macdonald, the dominating electron-producing process is ionization by collision in the cavity volume, while the principal process of electron removal is diffusion to the boundary and subsequent recombination at the cavity walls. Secondary processes at the walls are unimportant unless the cavity dimensions are very small compared to the free space wavelength $\lambda$, in which case secondary electron resonance (multipactor) occurs. In the cases considered by Brown, the
electron energy of oscillation is significantly less than the energy required for ionization. Electrons acquire energy from the rf field sufficient for ionization by a stochastic process involving collision with the neutral molecules. According to Brown and Macdonald, under these circumstances breakdown occurs as soon as the rate of electron production exceeds the rate of removal by diffusion. Stipulating as the breakdown condition that these two rates are equal, they arrive at the equation
$$\nabla^2 (Dn) + \bar{\nu}_i n = 0, \tag{9}$$

where $\bar{\nu}_i$ is the mean ionization rate, i.e., the average rate of production of new electrons per electron, and $D$ is the diffusion constant for electrons. The solutions of this equation, subject to the boundary condition that the electron density $n$ vanishes at the cavity walls, form a set of eigenfunctions with eigenvalues for $\bar{\nu}_i/D$. The smallest eigenvalue is written

$$\bar{\nu}_i/D = 1/\Lambda^2 \tag{10}$$

and corresponds to the breakdown condition. If the field is uniform, $\bar{\nu}_i$ and $D$ are independent of position, so that (9) becomes

$$\nabla^2 n + (\bar{\nu}_i/D)\,n = 0. \tag{11}$$
The smallest eigenvalue for a cylindrical cavity with radius $a$ and length $l$ becomes

$$1/\Lambda^2 = (2.405/a)^2 + (\pi/l)^2. \tag{12}$$
$\Lambda$ is a characteristic diffusion length for the cavity, which in the case of a uniform field we shall call the geometrical diffusion length $\Lambda_g$. A high frequency ionization coefficient $\zeta = \bar{\nu}_i/DE^2$ may be defined, and dimensional considerations show that breakdown will depend on three variables which may be chosen as $E\Lambda$, $E/p$, and $p\lambda$, where $p$ is the gas pressure. We can write $\zeta = 1/(E^2\Lambda^2)$, with $\Lambda$ given by (12). From breakdown measurements, $\zeta$ has been determined as a function of $E/p$ and $p\lambda$ for various gases, using short cylindrical cavities with $a/l > 15$ for which the field may be assumed uniform. For nonuniform fields, (9) may be written

$$\nabla^2 \psi + \zeta E^2 \psi = 0, \tag{13}$$
with $\psi = Dn$ satisfying $\psi = 0$ at the walls. Since $\zeta$ is experimentally known as a function of $E$ for uniform fields, if we assume that for nonuniform fields $\zeta$ is determined by the local value of $E$, it is known as a function of position in the cavity, and, in principle, (13) may be solved as an eigenvalue problem for the maximum value of $E$, $E_m$. Boot and Self have shown that the maximum field required for breakdown in nonuniform fields, e.g.,
in a cavity with length equal to radius, is always greater than the field that would be required if the field were uniform in the same cavity. Certain (but, as we shall see, not all) limits of the theory sketched above, the diffusion theory, were discussed by Brown and Macdonald and plotted as limit lines in a $p\Lambda$, $p\lambda$ plane. The limits which are relevant for our further discussion are the mean free path limit and the uniform field limit. The field is no longer uniform when $\Lambda/\lambda$ is no longer small, and Brown and Macdonald consider the theory to be valid when $\Lambda/\lambda < 1/2\pi = 0.159$. The measurements of Boot and Self were carried out in a cavity with $\Lambda/\lambda = 0.135$, and since in this case they can also show that $\Lambda = \Lambda_g$, the fields are sufficiently uniform for diffusion theory to be valid. Another limit of validity of the diffusion theory is reached when the pressure becomes so low that the electron mean free path becomes equal
FIG. 16. Electric and magnetic field configurations of the two cavity modes, panels (a) and (b).
to $\Lambda$. Brown and Macdonald, assuming that the average electron energy is one-third of the ionization energy (for hydrogen the ionization energy = 15.6 eV), obtain for this limit $p\Lambda = 0.02$ mm Hg cm. Boot and Self have carried out experiments with microwaves of free space wavelength $\lambda = 3$ cm and a cavity for which $\Lambda = 0.405$ cm, so that the mean free path limit is reached and diffusion theory is not applicable for pressures smaller than $\sim$0.05 mm. They used two different field modes, TM$_{010}$ and TM$_{011}$. The field configurations for these modes are shown in Figs. 16a and 16b. It is seen that the TM$_{011}$ mode has a valley in the center, while the TM$_{010}$ mode has a field maximum near the axis. Detuning of the cavity and consequent decay of the cavity field indicates breakdown in these experiments. The maximum fields in the cavity for breakdown obtained experimentally by Boot and Self are plotted against pressure in Fig. 17. Above 0.3 mm pressure, where the diffusion theory is certainly valid, the measured breakdown fields for the TM$_{010}$ mode agree with those of Brown and Macdonald (suitably scaled) obtained at $\lambda = 10$ cm with a TM$_{010}$ cavity of similar dimensions. The breakdown fields in the case of
TM$_{011}$ are higher, and Boot and Self are able to show that $\Lambda$ is less than $\Lambda_g$ in this case.
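The two validity limits of diffusion theory used above reduce to short arithmetic on the diffusion length. The sketch below uses the standard result for the lowest diffusion mode of a cylinder of radius $a$ and length $l$, $1/\Lambda^2 = (2.405/a)^2 + (\pi/l)^2$. The cavity dimensions are not given in the text, so the radius computed here (for a cavity with length equal to radius) is purely a hypothetical illustration.

```python
import math

J0_FIRST_ZERO = 2.404825557695773  # first zero of the Bessel function J0


def diffusion_length(radius, length):
    """Geometrical diffusion length of a cylinder (same units as inputs)."""
    inv_sq = (J0_FIRST_ZERO / radius) ** 2 + (math.pi / length) ** 2
    return 1.0 / math.sqrt(inv_sq)


wavelength_cm = 3.0   # free space wavelength quoted for Boot and Self
big_lambda_cm = 0.405  # diffusion length quoted for their cavity

# Uniform field limit: Lambda/lambda must stay below 1/(2*pi).
ratio = big_lambda_cm / wavelength_cm
uniform_field_limit = 1.0 / (2.0 * math.pi)

# Mean free path limit p*Lambda = 0.02 mm Hg cm (hydrogen).
p_limit_mm = 0.02 / big_lambda_cm

# Hypothetical geometry: radius of a cavity with l = a giving this Lambda.
radius_cm = big_lambda_cm * math.sqrt(J0_FIRST_ZERO ** 2 + math.pi ** 2)

print(f"Lambda/lambda = {ratio:.3f} (limit {uniform_field_limit:.3f})")
print(f"mean free path limit: p = {p_limit_mm:.3f} mm Hg")
print(f"radius for l = a: {radius_cm:.2f} cm")
```

The ratio 0.135 lies inside the uniform field limit of 0.159, and the mean free path limit falls at about 0.05 mm, as stated in the text.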
For low pressures the results differ radically for the two cavities. Breakdown could not be reached with the fields available in the experiment of Boot and Self (31,000 V/cm) in the case of the TM$_{010}$ cavity, while in the case of the TM$_{011}$ cavity breakdown occurred at fields lower than 21,000 V/cm. For low pressures, the minimum cavity field for breakdown asymptotically reached a value, corresponding to a well depth of the quasi-potential $V$ calculated from the field value, which is equal to
FIG. 17. Maximum cavity field for breakdown plotted against pressure $p$ (mm Hg) for the two cavities, together with Brown and MacDonald's results scaled from $\lambda = 10.0$ cm. The field values at which $V = 15.6$ eV are marked.
the ionization potential 15.6 eV. (The height $V$ of the hill of quasi-potential in the case of the TM$_{010}$ mode becomes equal to 15.6 eV for a somewhat lower maximum field, which has also been marked on the figure.) These results are clearly in accordance with quasi-potential theory. In the case of a quasi-potential well, trapped electrons will describe orbits which eventually will lead to collisions with neutral molecules. If the well depth is equal to the ionization energy or greater, the probability of ionizing collisions will become considerable. In the case of a quasi-potential hill, initiatory electrons are accelerated by the quasi-potential and are lost to the wall before they have an appreciable chance of an ionizing collision. According to the mean free path criterion, however, diffusion theory should be valid at pressures below 0.3 mm, in fact down to $5 \times 10^{-2}$ mm. This discrepancy may be explained by pointing out that diffusion theory assumes the oscillation energy of the electrons to be small compared to the
ionization energy. When the quasi-potential hill or valley can accelerate the electrons to energies approaching the ionization potential this assumption is violated.
7. THE THEORY OF RADIO-FREQUENCY ACCELERATION OF PLASMA

We have seen above that a plasma in an rf field undergoes internal microscopic motions, as a result of which it acquires certain quasi-dielectric or quasi-metallic properties. In consequence, the rf fields can exert macroscopic forces upon it, which can qualitatively be attributed to "radiation pressure," and can under some circumstances confine it in stable equilibrium. The possibility naturally suggests itself that one might use the same forces to accelerate the plasma as a whole. In addition to this purely rf acceleration procedure, it has been proposed that one should use the enormous enhancement of the effectiveness of an rf field induced by the presence of a not-quite-resonant stationary magnetic field to increase the rate of acceleration, or that one should adapt the exactly resonant principle of operation of conventional particle accelerators to accelerate the plasma in a resonant manner. In each case, the suggestion is easily made but is difficult to discuss in a precise quantitative manner, and the rather primitive state of the existing theory of both the nonresonant and resonant acceleration methods will be evident in what follows.
A. Purely Radio-Frequency Acceleration

We shall begin by considering the main proposals which rely exclusively on rf fields to accelerate the plasma. These differ with respect to the type of radiation used and the time scale on which it is proposed that the acceleration should take place. Decisions on these two questions cannot be taken independently. A plasma in its natural state is uniform and of infinite extent; to accelerate a finite amount of it as a whole, it is necessary either to do so very quickly, so that the "plasmoid" does not have time to disperse, or to supply fields other than those strictly necessary for acceleration in order to preserve its coherence. Methods of the former type, which might be called "impulsive," are subject to the difficulty that one must ensure that no component of the necessarily very large accelerative forces has a tendency to break up the plasmoid; the latter type, however, which we shall call "continuous," raise all the problems of confinement, and of stability of confinement, which we considered in Sections 2 and 4.
This decision on the time scale for acceleration affects the model of the plasma which it is appropriate to use in designing the accelerator. In continuous acceleration devices, the plasmoid will have time to adjust its volume and shape to the contours of the force fields being used to confine it; one should therefore use a self-consistent quasi-equilibrium theory of
the plasma, not unlike the theory given in Section 2. In impulsive devices, such a model would be inappropriate; ideally one should use the kinetic theory of an expanding plasma, but this theory is only beginning to be developed [see, for example, Cheremisin (1965)], and in all the discussions of impulsive accelerators in the literature, it is assumed that the plasma is a rigid object of fixed shape which happens to possess a dielectric coefficient $\epsilon = 1 - \omega_p^2/\omega^2$. Both the mechanical and the electrical aspects of this assumption are rather problematical. On the mechanical side, it is clear that if one attempts to accelerate a fluid object such as a plasmoid by exerting external forces upon it, the initial effect will be to cause internal motions, and only if and when internal equilibrium has been restored will the plasmoid accelerate as a whole. For the rigid dielectric model to be applicable, it must be supposed that such a readjustment has successfully taken place; for consistency, therefore, one should at least show that the surface force acting on it (given by $F_i = T_{ik} n_k$, where $\mathbf{n}$ is a unit normal to the surface and $T_{ik}$ is the electromagnetic stress tensor) is everywhere of a suitable direction and magnitude to maintain adequate coherence during acceleration. On the electrical side, the assumption of a dielectric coefficient $\epsilon = 1 - \omega_p^2/\omega^2$ presupposes that the electron density is high enough, and the duration of the acceleration long enough, for collective electron oscillations to be important, and it neglects all nonlinear effects of the (by hypothesis) very large electric fields on the dielectric coefficient. As we have seen, for a stationary rf confined plasma, large electric fields completely alter the effective dielectric coefficient of a plasma; their effect on its nonequilibrium properties is at best a matter for speculation.
We shall now consider the various types of rf wave that have been proposed for acceleration purposes. Although,
as we have seen, one should use the appropriate model of plasma behavior to evaluate them, it is useful to start by considering their effect on single charged particles.
(i) Standing Waves. In this case, the theory given in Section 1 shows that the rf field exerts a force $\mathbf{F} = -\nabla(e^2E^2/2m\omega^2) = -\nabla\psi$. Thus if one were to inject a particle at a point where $E = E_{\max}$ and withdraw it at a point where $E = E_{\min}$, it would have increased in energy by an amount

$$\Delta\mathcal{E} = e^2(E_{\max}^2 - E_{\min}^2)/2m\omega^2 = \Delta\psi.$$

Qualitatively speaking, if the single particle were replaced by a quasi-neutral rarefied plasmoid, each electron would gain this energy and would share it with the associated ion, and the velocity of ejection of the whole plasmoid would be

$$v = (2\,\Delta\psi_-/M_+)^{1/2}. \tag{2}$$
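The energy gain and ejection velocity above are easy to evaluate for trial parameters. The following sketch takes the formula $\psi = e^2E^2/2m\omega^2$ as written, in SI units; whether $E$ denotes a peak or an rms amplitude is not settled here and is treated as an assumption. The field strength and frequency in the example are hypothetical and are not taken from any experiment discussed in this review.

```python
import math

E_CHARGE = 1.602176634e-19      # C
E_MASS = 9.1093837015e-31       # kg
PROTON_MASS = 1.67262192369e-27  # kg


def quasi_potential_j(e_field_v_per_m, freq_hz, mass=E_MASS):
    """Quasi-potential energy psi = e^2 E^2 / (2 m omega^2), in joules."""
    omega = 2.0 * math.pi * freq_hz
    return E_CHARGE ** 2 * e_field_v_per_m ** 2 / (2.0 * mass * omega ** 2)


def ejection_velocity(e_max, e_min, freq_hz, ion_mass=PROTON_MASS):
    """Plasmoid velocity v = sqrt(2 * delta_psi / M_ion), in m/s.

    delta_psi is the electron quasi-potential difference between the
    injection point (E = e_max) and the extraction point (E = e_min).
    """
    delta_psi = quasi_potential_j(e_max, freq_hz) - quasi_potential_j(e_min, freq_hz)
    return math.sqrt(2.0 * delta_psi / ion_mass)


# Hypothetical example: 10 kV/cm peak field at 3 Gc/sec, hydrogen plasma.
psi_ev = quasi_potential_j(1.0e6, 3.0e9) / E_CHARGE
v = ejection_velocity(1.0e6, 0.0, 3.0e9)
print(f"electron quasi-potential: {psi_ev:.0f} eV")
print(f"ejection velocity: {v:.3e} m/s")
```

Because $\psi$ scales as $E^2/\omega^2$, the ejection velocity is linear in $E/\omega$, which is the origin of the inverse relationship between attainable velocity and usable frequency (and hence density) noted in the text.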
This method, first introduced for quasi-neutral plasmoids by Askaryan (1959), is applicable only over one quarter wavelength of a standing wave, or over one e-folding length of an exponentially decaying wave, and is accordingly only suitable as an impulsive accelerator. On single particle theory at least, it should be possible to ensure coherence of the plasmoid in the direction at right angles to the axis of acceleration by ensuring that $\psi$ has a minimum on the axis, but it is not possible to confine the plasmoid along the axis. For a hydrogen plasma, the maximum velocity of ejection is given by $v/c = 800E_{\max}/\omega$, where $E$ is measured in volts per centimeter and $\omega$ in cycles per second, and, as we shall see subsequently, it is doubtful whether (2) is valid when $\omega_p \geq \omega$. Thus the method is characterized by a regrettable inverse relationship between the maximum density and maximum velocity attainable. Possible techniques for extending the applicability to continuous acceleration have, however, been proposed by Askaryan (1959) and Gildenburg and Miller (1960).
(ii) Nearly Standing Waves. If one modulates the amplitude of a standing wave in such a way that the wave form moves with respect to the laboratory frame, but at a velocity very much less than the phase velocity of the wave, the quasi-potential theory given in Section 1 is still applicable. One can therefore imagine trapping a particle or a plasmoid in a three-dimensional quasi-potential well and then accelerating this well with respect to the laboratory. This method, first described by Gaponov and Miller (1958b), is clearly a continuous method and its applicability depends upon the realization of stable confinement. A suitably modulated wave could be obtained by propagating two traveling waves of slightly different frequency in opposite directions along the accelerator axis.
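The drift velocity of the nearly standing wave pattern follows from elementary wave superposition. The sketch below assumes two counter-propagating plane waves obeying the vacuum dispersion relation $k = \omega/c$; in a real guiding structure the phase velocity differs from $c$ and the expression would change accordingly. The sum $\cos(k_1 z - \omega_1 t) + \cos(k_2 z + \omega_2 t)$ has nodes that move at $(\omega_1 - \omega_2)/(k_1 + k_2)$. The function names are illustrative.

```python
C_LIGHT = 299792458.0  # m/s


def pattern_velocity(f_forward_hz, f_backward_hz, c=C_LIGHT):
    """Velocity of the node pattern of two counter-propagating waves.

    Assumes vacuum dispersion (k = omega / c) for both waves.
    Positive result: pattern moves in the forward direction.
    """
    return c * (f_forward_hz - f_backward_hz) / (f_forward_hz + f_backward_hz)


def frequency_offset_for_velocity(v_m_per_s, f_mean_hz, c=C_LIGHT):
    """Frequency difference needed to move the pattern at velocity v."""
    return 2.0 * f_mean_hz * v_m_per_s / c


# Hypothetical example: 3 Gc/sec waves detuned by 600 kc/sec.
v_pattern = pattern_velocity(3.0003e9, 2.9997e9)
print(f"pattern velocity: {v_pattern:.0f} m/s")
print(f"offset for 1e5 m/s: {frequency_offset_for_velocity(1.0e5, 3.0e9):.3e} c/sec")
```

Sweeping the frequency difference in time would then accelerate the trapped plasmoid, subject to the limit on the rate of acceleration discussed in the text.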
There is an upper limit to the rate of acceleration set by the requirement that the plasmoid remains trapped; for a plasma of thermal velocity $v_\theta$ confined near the node of a wave, so that its particles oscillate with a frequency $\omega_J$, the upper limit on the acceleration would be of order $\omega_J v_\theta$.
(iii) Slow Traveling Waves. If one sets up a traveling wave in a delay line of varying characteristics, so that the phase velocity of the wave increases along the axis, one can inject a particle with a velocity close to the minimum phase velocity of the wave, which is then trapped in the manner indicated at the beginning of Section 1, F, and accelerated. This is the principle of ordinary linear accelerators and will therefore not be considered in this review.
(iv) Fast Traveling Waves. For such waves, unlike the slow traveling waves just considered, the nonuniformity of the field is on a length scale of order $c/\omega$, so quasi-potential theory should be applicable. On this theory,
as we have seen, there is no force on a particle in the direction of propagation of the wave, in apparent contradiction to the well-known result that an electron in a traveling wave causes Thomson scattering and therefore experiences a force. As we mentioned in Section 1, however, this is a second-order effect; the force exerted by a plane traveling wave is

$$\mathbf{F}_T = \sigma_T \epsilon_0 E^2 \mathbf{n}, \tag{3}$$
where $\mathbf{n}$ is a unit vector in the direction of propagation and $\sigma_T$ is the Thomson scattering cross section; this is smaller than the force due to the quasi-potential of a standing wave of the same amplitude and frequency by a factor $r_0\omega/c$, where $r_0$ is the classical radius of the electron. In view of the smallness of the effect, it might be supposed that the force exerted by a traveling wave can only be important in, for example, stellar interiors where the radiation density is sufficiently high. Such a conclusion is, however, incorrect, as can be seen by considering the implications of the fact that a traveling light wave can be scattered by, and can exert a measurable force on, a thin piece of metal foil. For, if we were to regard this as an assembly of independent electrons of density $\sim 10^{23}$ particles/cm$^3$, it would be virtually transparent, since the Thomson cross section is of order $10^{-24}$ sq cm. In fact, a metal is opaque to any radiation whose frequency is significantly below its plasma frequency. As is well known, this is interpreted to mean that the electrons in it respond collectively to the incident radiation and in consequence have a much higher effective cross section than they have individually. Unfortunately, this interpretation is based on linear theory, and on the assumption that the density distribution of the plasma is maintained uniform in a finite region of space by solid state forces, and we are therefore left to speculate as to what an adequate nonlinear theory of the interaction of traveling waves with a plasma not confined in any other way would predict. The position is complicated by the fact that it would be unrealistic to assume that only a traveling wave was present, except initially. In any steady state there would be at least partial reflection of the incident wave, and hence at least a certain admixture of a standing wave, whose nonlinear effects could be predicted on quasi-potential theory.
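The smallness of the single-electron Thomson force, and the transparency of a foil treated as independent electrons, can be illustrated numerically. The sketch below uses standard values of the classical electron radius and the Thomson cross section; the foil thickness and the electron density are assumed round numbers chosen only for illustration, as is the microwave frequency.

```python
import math

C_LIGHT = 299792458.0        # m/s
R0_M = 2.8179403262e-15      # classical electron radius, m
SIGMA_T_CM2 = (8.0 * math.pi / 3.0) * (R0_M * 100.0) ** 2  # Thomson cross section, cm^2


def thomson_to_quasi_potential_ratio(freq_hz):
    """Order-of-magnitude ratio r0 * omega / c of the two forces."""
    return R0_M * 2.0 * math.pi * freq_hz / C_LIGHT


def independent_electron_optical_depth(n_cm3, thickness_cm):
    """Optical depth n * sigma_T * d if electrons scattered independently."""
    return n_cm3 * SIGMA_T_CM2 * thickness_cm


# Assumed illustration: 3 Gc/sec microwaves; a 1-micron foil with
# 1e23 electrons/cm^3 (a typical order of magnitude for a metal).
ratio = thomson_to_quasi_potential_ratio(3.0e9)
depth = independent_electron_optical_depth(1.0e23, 1.0e-4)
print(f"sigma_T = {SIGMA_T_CM2:.3e} cm^2")
print(f"r0*omega/c = {ratio:.2e}")
print(f"independent-electron optical depth of foil: {depth:.2e}")
```

An optical depth far below unity means that independent electrons would let nearly all the light through, so the observed opacity of the foil must come from the collective response described in the text.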
It is indeed possible that the most significant nonlinear effects of a traveling wave are exerted in this way. However, the feeling remains that the collective electron motions, which as we have seen play such an important role in the linear regime at a metal surface, might also play an important part. The one paper in the literature which gives a plausible discussion of this question is that of Gildenburg and Miller (1960), to which we referred briefly in Section 2. We shall therefore give a fairly detailed account of this paper, drawing on various others for the justification of certain assertions made in it. [It
should be remarked that its chief results coincide with those of Askaryan (1958, 1959) who, however, did not give any clear derivation of them.] It will be assumed that any plasma accelerator based on fast traveling waves must work impulsively. Certainly, no suggestion has been made in the literature that such a wave would tend to confine a plasmoid along the axis of acceleration, although radial focusing could certainly be achieved by quasi-potential forces. We shall therefore adopt the rigid dielectric model, with all the possible defects of that model outlined above. For simplicity we shall assume a spherical plasmoid, though discussions of ellipsoidal plasmoids have been given by Levin et al. (1959) and Veksler et al. (1963).
We shall begin by calculating the electromagnetic response of such a dielectric sphere to an incident wave, which, with a view to maintaining maximum generality, we shall not specify to be either traveling or standing. This response turns out to be related to the fact that such a sphere possesses a set of natural modes of electromagnetic oscillation. The theory of these is given in Stratton (1941, p. 554); in outline, the analysis consists of an expansion of the electromagnetic fields at any point outside the sphere in terms of the vector spherical harmonics which represent an outgoing wave satisfying Maxwell's equations in vacuo, and a similar expansion of the fields inside the sphere in terms of vector spherical harmonics which represent a wave which is well behaved at the origin and satisfies Maxwell's equations for a medium of dielectric coefficient $\epsilon$. On specifying the boundary conditions which must be met at the dielectric-vacuum interface, it becomes clear that these pose an eigenvalue problem, and there consequently exists a set of eigenfrequencies $\omega_{m,n}$, which are the natural modes of electromagnetic oscillation of this system.
All these modes turn out to be very heavily damped (i.e., all the $\omega_{m,n}$ possess imaginary parts which are of the same order as the real parts); this is to be expected, since in such a mode of oscillation, energy is being radiated away from the central sphere. These modes are qualitatively similar to the radiation fields which would be set up by the oscillations of certain charge distributions at the origin, and they can accordingly be classified by type (transverse electric or magnetic) and by the character of the multipole (dipole, etc.) whose oscillations would generate them. In consequence, if at some initial time we were to switch on an rf field external to such a dielectric sphere we would expect initially to excite oscillations both at the external frequency $\omega$ and at the natural frequencies $\omega_{m,n}$, the relative amplitudes of these being determined by the boundary condition at the surface of the sphere. After a short time, however, these natural oscillations would decay, leaving only the forced oscillations. Since the force exerted upon the sphere depends upon the total rf field acting, we need to determine the steady state of this system. To solve this
one proceeds as before, but now expanding the field outside the sphere as a sum of two components, the given external field (assumed to be generated by sources which are not affected by the scattered radiation) and the scattered field. Thus, symbolically,
(for the significance of the vector spherical harmonics $\mathbf{M}_n$, $\mathbf{N}_n$ see Stratton, 1941) and

$$\mathbf{E}_{\text{dielectric}} = \sum_n a_n^{(d)}\mathbf{M}_n^{(d)} + b_n^{(d)}\mathbf{N}_n^{(d)},$$
where we have put the superscript (d) on the vector spherical harmonics inside the dielectric to indicate that they are the eigenfunctions appropriate to the dielectric medium and are well behaved at the origin. On imposing the boundary conditions at the surface, we find that the expansion coefficients $a_n^{(d)}$, $b_n^{(d)}$, $a_n^{(sc)}$, and $b_n^{(sc)}$ are all determined in terms of the expansion coefficients $a_n^{(ext)}$ and $b_n^{(ext)}$. Thus, for example, for an incident plane traveling wave, one obtains:

$$\frac{a_n^{(sc)}}{a_n^{(ext)}} = \frac{j_n(\varepsilon^{1/2}\rho)\,[\rho j_n(\rho)]' - j_n(\rho)\,[\varepsilon^{1/2}\rho\, j_n(\varepsilon^{1/2}\rho)]'}{D_n(\rho)} \tag{4}$$
where $j_n(x)$ is a spherical Bessel function, $\rho = \omega a/c$, and $D_n(\rho) = 0$ is the dispersion relation for the eigenfrequencies of the natural oscillations of the sphere. We may note that as $\omega$ approaches $\omega_{m,n}$, the amplitude of the scattered wave for that mode increases resonantly; since the $\omega_{m,n}$ are all complex, however, whereas $\omega$ is by hypothesis a real quantity, the expansion coefficient never actually becomes infinite. The above procedure is perfectly general and leads to precise expressions for the total electromagnetic field after all the initially excited natural modes have died away. However, for any given external electromagnetic field, even the calculation of its expansion coefficients $a_n^{(ext)}$, $b_n^{(ext)}$ is by no means trivial, and the resulting expression for the total field is very unwieldy. It is therefore desirable to introduce an approximation. The most useful approximation is to assume that the radius of the sphere, $a$, is much less than the wavelength of the radiation both in vacuo ($c/\omega$) and in the dielectric ($c/\omega\varepsilon^{1/2}$). In this case, provided that we are not too close to a resonance we can use the expansion for small $\rho$:

$$a_n^{(sc)} = -\frac{i(\varepsilon - 1)\rho^{2n+3}}{(2n + 1)(2n + 3)[(2n + 1)!!]^2}, \qquad b_n^{(sc)} = \frac{i(\varepsilon - 1)\rho^{2n+1}}{(\varepsilon n + n + 1)(2n + 1)[(2n - 1)!!]^2 + i(\varepsilon - 1)(n + 1)\rho^{2n+1}} \tag{5}$$
(Gildenburg and Kondratev, 1963). The largest coefficient is clearly (apart from resonances)
Using the fact that in this approximation the external field is uniform at the location of the sphere, one can easily show that the scattered radiation field is identical with that of a point dipole of moment
an expression which differs from the well-known result of Rayleigh for the scattering of radiation by atmospheric particles [Landau and Lifschitz (1951) Electrodynamics of continuous media, p. 380] only by the presence of a small but physically important correction in the denominator. If we write $\varepsilon = 1 - \omega_p^2/\omega^2$, $E = E_0 \exp(i\omega t)$, $\omega_0 = \omega_p/\sqrt{3}$, $\gamma = (2/9)(\omega_p^2/\omega)\rho^3 = (2/3)(Ne^2\omega^2/mc^3)$, we see that $p$ can be written in the form

$$p = -\frac{(Ne^2/m)E}{\omega^2 - \omega_0^2 - i\gamma\omega} \tag{8}$$

and satisfies the differential equation

$$\ddot{p} + \omega_0^2 p - (\gamma/\omega^2)\dddot{p} = (Ne^2/m)E, \tag{9}$$
an equation which is the starting point for the discussion of Askaryan (1958) and Gildenburg and Miller (1960). An interesting property of Eq. (9) is obtained by taking the limit in which the number of particles is reduced to one electron-ion pair; in this case we can write $p = er$ and we obtain

$$\ddot{r} + \omega_0^2 r - (\gamma/\omega^2)\dddot{r} = (e/m)E \tag{10}$$

where $\gamma/\omega^2 = \tfrac{2}{3}e^2/mc^3$, which is precisely the equation from which one derives the theory of Thomson scattering [Landau and Lifschitz (1951), Classical theory of fields, p. 264]. Thus by retaining the small imaginary term in the denominators of (7) and (8), it would appear that we are working in an approximation in which Thomson scattering is included; and we see that for a plasma describable by a real dielectric coefficient $\varepsilon$, the Thomson scattered wavelets of different particles add coherently, at least for the fundamental (dipole) mode. If the dielectric coefficient were made complex, so as to include the effect of collisions or thermal motion, the above expression for $\gamma$ would have to be modified accordingly.
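The algebra linking the definitions of $\omega_0$ and $\gamma$ to Eqs. (8) and (9) can be checked numerically. The following Python sketch (Gaussian units) is illustrative only: it assumes that the dipole moment of the sphere has the dielectric form $p = a^3E(\varepsilon - 1)/[\varepsilon + 2 + i\tfrac{2}{3}(\varepsilon - 1)\rho^3]$, which is the $n = 1$ denominator of Eq. (5) rewritten, and it uses $Ne^2/m = \omega_p^2a^3/3$ for a uniformly filled sphere. The function names and the sample numbers are hypothetical.

```python
import math

C_LIGHT = 2.998e10  # speed of light in cm/s (Gaussian units)


def plasmoid_parameters(omega, omega_p, a):
    """Return (omega_0^2, gamma, N e^2/m) for a plasma sphere of radius a."""
    rho = omega * a / C_LIGHT                    # rho = omega a / c
    omega0_sq = omega_p**2 / 3.0                 # omega_0 = omega_p / sqrt(3)
    gamma = (2.0 / 9.0) * (omega_p**2 / omega) * rho**3
    ne2_over_m = omega_p**2 * a**3 / 3.0         # assumes a uniformly filled sphere
    return omega0_sq, gamma, ne2_over_m


def dipole_resonant_form(E, omega, omega_p, a):
    """Dipole moment in the resonance-denominator form of Eq. (8)."""
    omega0_sq, gamma, ne2_over_m = plasmoid_parameters(omega, omega_p, a)
    return -ne2_over_m * E / (omega**2 - omega0_sq - 1j * gamma * omega)


def dipole_dielectric_form(E, omega, omega_p, a):
    """Assumed dielectric-sphere form with the radiative correction term."""
    rho = omega * a / C_LIGHT
    eps = 1.0 - omega_p**2 / omega**2
    return a**3 * E * (eps - 1.0) / (eps + 2.0 + 1j * (2.0 / 3.0) * (eps - 1.0) * rho**3)


def residual_of_eq9(p, E, omega, omega_p, a):
    """Left side minus right side of Eq. (9) for exp(i omega t) time dependence."""
    omega0_sq, gamma, ne2_over_m = plasmoid_parameters(omega, omega_p, a)
    second_derivative = -(omega**2) * p          # (i omega)^2 p
    third_derivative = -1j * omega**3 * p        # (i omega)^3 p
    lhs = second_derivative + omega0_sq * p - (gamma / omega**2) * third_derivative
    return lhs - ne2_over_m * E


if __name__ == "__main__":
    omega, omega_p, a = 2 * math.pi * 3e9, 5e10, 0.1  # hypothetical values
    print(dipole_resonant_form(1.0, omega, omega_p, a))
    print(dipole_dielectric_form(1.0, omega, omega_p, a))
```

Both forms should agree to rounding error for any real $\omega$ away from $\omega = 0$, which is simply a restatement of the substitution $\varepsilon = 1 - \omega_p^2/\omega^2$.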
If we now assume that the external electromagnetic fields are nonuniform but slowly varying and that their frequency is sufficiently far from that of any multipole resonance, we can describe the motion of the plasmoid in very much the same manner as we discussed the motion of single particles in Section 1. If $\mathbf{r}$ is the location of the center of the plasmoid we have

$$N(m_+ + m_-)\ddot{\mathbf{r}} = (\mathbf{p}\cdot\nabla)\mathbf{E}(\mathbf{r}) + \dot{\mathbf{p}} \times \mathbf{H}(\mathbf{r})/c. \tag{11}$$
If we impose the restriction that the electrons should not execute motions on a scale comparable with the dimensions of the plasmoid (|p| […] analysis (24, 25). Van der Ziel's original discussions were concerned mainly with junction FET's.
Jordan and Jordan (30) used a similar technique to calculate the drain noise of an enhancement type insulated gate FET with very similar results. It seems there are only small numerical differences between the magnitudes of the noise sources in junction and in MOS FET's. An excellent survey of noise in MOS FET's was presented by Johnson in his book (31). Here is a brief summary of the derivation of the magnitude of the drain-source noise current generator as first presented by van der Ziel. The basic model of the junction FET is that presented by Shockley (32). Figure 7 shows a sketch of the longitudinal cross section of such an FET with dc bias voltages applied and with both the gate and drain ac short-circuited to the source. The conducting channel is assumed to be p-type
FIG. 7. Sketch of the longitudinal cross section of a junction FET. The incremental resistance between $x_0$ and $(x_0 + \Delta x)$ gives rise to a thermal noise emf $\overline{\Delta v^2} = 4kT\,\Delta r(x_0)\,df$.
and of uniform conductivity $\sigma_0$. Field dependence of the mobility is neglected. The length of the channel is $L$; its thickness is $2a$ and it is of unit width. The depletion layers of the gate junction reduce the effective thickness of the channel to $2b(x)$ at a point $x$ units from the source contact. The resistance of an element of the channel between $x_0$ and $x_0 + \Delta x$ is

$$\Delta r(x_0) = \frac{\Delta x}{2\sigma_0 b(x_0)}. \tag{21}$$
According to Shockley's model, $b(x)$ and $v(x)$, the volt-drop across the gate junction, are related by

$$b(x) = a[1 - (v(x)/V_{00})^{1/2}], \tag{22}$$
where $V_{00}$ is the bias required to cut off the channel. Figure 8 shows an approximate sketch of $v(x)$ as a function of position from the source
($x = 0$) to the drain ($x = L$). $v(x)$ varies from $v(0) = (V_{GS} + \varphi)$ to $v(L) = (V_{GS} + \varphi - V_{DS})$. $\varphi$ is the diffusion potential of the gate junction.

The drain current, $I_D$, is related to the gate volt-drop at $x = 0$ and $x = L$ by the expression

$$I_D = g_c V_{00}\left[y - z - \tfrac{2}{3}\left(y^{3/2} - z^{3/2}\right)\right] \tag{23}$$

where $g_c = 2a\sigma_0/L$ is the conductance of the open channel, $y = [v(L)/V_{00}]$ is the normalized gate volt-drop at the drain, and $z = [v(0)/V_{00}]$ is the normalized gate volt-drop at the source. Thus, the transconductance is

$$g_m = -(\partial I_D/\partial V_{GS}) = g_c(y^{1/2} - z^{1/2}); \tag{24}$$

and the output conductance is

$$g_d = \partial I_D/\partial V_{DS} = g_c(1 - y^{1/2}). \tag{25}$$

When $V_{DS} = 0$, $v(0) = v(L)$ or $y = z$ and

$$g_d = g_{d0} = g_c[1 - z^{1/2}]. \tag{26}$$

At saturation $y$ approaches unity and

$$\lim_{y \to 1} g_m = g_{m(\max)} = g_c[1 - z^{1/2}] = g_{d0}. \tag{27}$$
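The relations between Eqs. (23), (24), and (27) are easy to check numerically. The Python sketch below is a minimal illustration, not part of Shockley's or van der Ziel's treatment: it evaluates the drain current of Eq. (23), differentiates it numerically with respect to the gate bias (a change in $V_{GS}$ shifts both $v(0)$ and $v(L)$ by the same amount), and compares the result with the closed form of Eq. (24). The function names and the sample values of $g_c$ and $V_{00}$ are hypothetical.

```python
import math


def drain_current(y, z, g_c, v00):
    """Drain current from Eq. (23); y and z are the normalized gate volt-drops."""
    return g_c * v00 * ((y - z) - (2.0 / 3.0) * (y**1.5 - z**1.5))


def transconductance(y, z, g_c):
    """Closed-form transconductance, Eq. (24)."""
    return g_c * (math.sqrt(y) - math.sqrt(z))


def zero_bias_conductance(z, g_c):
    """Output conductance at zero drain bias, Eq. (26)."""
    return g_c * (1.0 - math.sqrt(z))


def numerical_transconductance(y, z, g_c, v00, dv=1e-6):
    """Central-difference estimate of -dI_D/dV_GS.

    Raising V_GS by dv raises both v(0) and v(L) by dv, so y and z
    each change by dv / V00.
    """
    dy = dv / v00
    di = drain_current(y + dy, z + dy, g_c, v00) - drain_current(y - dy, z - dy, g_c, v00)
    return -di / (2.0 * dv)


if __name__ == "__main__":
    g_c, v00 = 1e-3, 4.0   # hypothetical open-channel conductance (S) and cut-off bias (V)
    print(transconductance(0.8, 0.2, g_c))
    print(numerical_transconductance(0.8, 0.2, g_c, v00))
    print(transconductance(1.0, 0.2, g_c), zero_bias_conductance(0.2, g_c))
```

The last line illustrates Eq. (27): with $y = 1$ the transconductance coincides with the zero-bias output conductance $g_{d0}$.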
The magnitude of $\overline{i_d^2}$ is calculated as follows: The basic noise source is the thermal noise of the channel. Thus, by Nyquist's theorem, the noise of a narrow region between $x_0$ and $x_0 + \Delta x$ can be represented by the noise emf of the resistance $\Delta r(x_0)$, Eq. (21):

$$\overline{\Delta v(x_0)^2} = 4kT\,\Delta r(x_0)\,df = 4kT\,df\,\frac{\Delta V}{I_D}. \tag{28}$$

Here $\Delta V$ is the change in volt-drop across the element of resistance $\Delta r(x_0)$ because of the drain current. The perturbation in the channel voltage at $x_0$ produces a perturbation in the channel current $\Delta i_d$ (and also a perturbation $\Delta i_g$ in the gate current). A sketch showing the effect of a small emf in the channel at $x_0$ is shown in Fig. 8. This small jump in $v(x)$ at $x_0$ must cause a small jump in the drain current. It can be shown that $\Delta i_d$ and $\Delta v(x_0)$ are related by the expression

$$\Delta i_d = g_c[1 - (v(x_0)/V_{00})^{1/2}]\,\Delta v(x_0), \tag{29}$$

and hence

$$\overline{\Delta i_d^2} = g_c^2[1 - (v(x_0)/V_{00})^{1/2}]^2\,\overline{\Delta v(x_0)^2}. \tag{30}$$
The total fluctuation in the drain current is obtained by combining Eqs. (22), (28), and (30) and integrating over the length of the channel. The result can be written as

$$\overline{i_d^2} = 4kT g_{m(\max)}\,df\,Q(V_{GS}, V_{DS}) \tag{31}$$
where $Q(V_{GS}, V_{DS})$ is a complicated function of the gate and drain biases. It lies between 1/2 and 2/3 for a junction FET and is equal to 2/3 for an MOS FET at normal bias. This result is amazingly close to the most simple guess for the drain-source noise current. At very low drain and source voltages one would
FIG. 8. Approximate sketch of the volt-drop across the gate junction as a function of distance from the source. A noise emf at $x_0$ can give rise to a perturbation as shown. This “jump” $\Delta v(x_0)$ must cause a jump in the drain current $\Delta i_d$.
expect the drain-source conductance to show full thermal noise, i.e., one would expect $\overline{i_d^2} = 4kT g_{d0}\,df$ for zero bias. The simple theory shows (Eq. (27)) $g_{d0} = g_{m(\max)}$. The result of Eq. (31) is very close to this value. A similar calculation can be made to show the effect of the coupling of the thermal noise emf through the gate junction capacitance into the gate circuit. Calculating the $\Delta i_g$ from each incremental noise emf of the channel and then integrating over the length of the channel yields

where $g_{11} = (g_{gs} + g_{gd})$ is the input conductance with the output short-circuited and $P(V_{GS}, V_{DS})$ takes one value for a junction FET and another for an MOS FET.
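The channel integration behind Eq. (31) can be sketched explicitly. The Python example below is an illustration under stated assumptions: it sums the contributions of Eq. (30), with the emf of Eq. (28), over the gate volt-drop from $v(0)$ to $v(L)$, takes $I_D$ from Eq. (23), and divides by $4kT g_{m(\max)}\,df$ with $g_{m(\max)} = g_c(1 - z^{1/2})$. The resulting closed form for $Q(y, z)$ and the function names are derived here for illustration and are not quoted from van der Ziel.

```python
import math


def _integral_linear(u):
    """Antiderivative of (1 - sqrt(u)), as it appears in Eq. (23)."""
    return u - (2.0 / 3.0) * u**1.5


def _integral_squared(u):
    """Antiderivative of (1 - sqrt(u))**2."""
    return u - (4.0 / 3.0) * u**1.5 + 0.5 * u**2


def bias_factor_q(y, z):
    """Bias factor Q for normalized gate volt-drops z (source) < y (drain) <= 1.

    Assumes Q = int_z^y (1 - sqrt(u))^2 du / [(1 - sqrt(z)) * int_z^y (1 - sqrt(u)) du].
    """
    if not (0.0 <= z < y <= 1.0):
        raise ValueError("require 0 <= z < y <= 1")
    numerator = _integral_squared(y) - _integral_squared(z)
    denominator = (1.0 - math.sqrt(z)) * (_integral_linear(y) - _integral_linear(z))
    return numerator / denominator


if __name__ == "__main__":
    print(bias_factor_q(1.0, 0.0))          # saturation, zero normalized source drop
    print(bias_factor_q(1.0, 0.999))        # saturation, device close to cut-off
    print(bias_factor_q(0.3 + 1e-6, 0.3))   # nearly zero drain bias
```

Under these assumptions $Q$ tends to unity at vanishing drain bias, which is the full thermal noise of $g_{d0}$, and stays between 1/2 and 2/3 at saturation.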
Equation (32) is in excellent agreement with the value for $\overline{i_g^2}$ which is obtained with Eq. (5),

$$\overline{i_g^2} = 4kT g_{11}\,df - 2qI_G\,df. \tag{33}$$
Polder and Baelde's basic assumptions allow for variations with position of the electric field in the “bulk region of a junction” and for the variation of material parameters with position. The presence of drift current in the conducting channel does not violate these assumptions. At high frequencies Eq. (33) will be dominated by the thermal noise term since $g_{11}$ increases with $\omega^2$ and the gate bias current $I_G = -I_{G0}$ is usually very small. The gate- and drain-source noise current generators are partially correlated since they are caused by the same incremental noise emf's in the conducting channel. Van der Ziel has calculated the correlation coefficient and obtained

where $0.393 < |K| < 0.446$ for normal bias conditions. It is also convenient to know how to divide $\overline{i_g^2}$ between $\overline{i_{gs}^2}$ and $\overline{i_{gd}^2}$. It can be shown that

$$\overline{i_{gd}^2} = 4kT g_{gd}\,df \tag{35}$$

and

$$\overline{i_{gs}^2} = \overline{i_g^2} - \overline{i_{gd}^2} = 4kT(g_{11} - g_{gd})\,df = 4kT g_{gs}\,df. \tag{36}$$
Thus the gate-drain and gate-source noise current generators both show about full thermal noise of their corresponding conductances.

III. EXPERIMENTAL VERIFICATION OF THE THEORY

As the title of this section indicates, there is excellent agreement between theory and experiment for semiconductor devices over a wide range of operating conditions. The problem of comparing theoretical and experimental noise performance of a device can be a formidable one. The reader who wishes to verify the agreement for himself faces an interesting problem. Methods of noise measurement are discussed in detail in several texts (33, 3). A noise measurement system requires good linear amplifiers. It is helpful if the system noise is negligible with respect to the noise of the device under test or at least constant so it can be easily subtracted. True quadratic or square-law detectors are also useful and precise noise calibration standards are essential. It is important that noise measurements be made over a wide frequency range to guard against experimental errors, such as are caused by excess noise in the device under test or by systematic problems. Detailed experimental noise studies are often tedious and time consuming because of the care required to verify the precision of the measurements. Another problem in attempting to compare theoretical and experimental performance is designing the experiment so that it yields maximum information about the noise of the device. Simple noise factor measurements, attractive as they are to the engineer applying a device, seldom are the most useful experiment. The reason for this is that noise factor depends not only on the noise generators but also on the small-signal performance of the device under test. If a discrepancy occurs it is difficult, in most cases, to determine whether the reason is an error in the noise theory or simply (or not so simply) a problem with the small-signal characterization. This problem will be mentioned again in the discussions of both bipolar and field effect transistors.
A. Experiments on Semiconductor Diodes

Studying the noise of junction diodes in the laboratory usually does not allow much freedom in the design of experiments, since there are only two terminals. The noise can be represented either as an emf in series with the impedance of the device or a noise current generator in parallel with the device conductance. The experiment is fairly difficult, however, because of the low impedance level of the forward biased junction and because the open circuit noise signal of a well-behaved diode is only expected to be about one-half the thermal noise of a resistor of the same impedance. Champlin circumvents this problem with a measurement technique which compares the noise of the diode under test with the thermal noise of a lumped R-C circuit of the same impedance (34). Guggenbuehl and Strutt used transformer coupling between the diode under test and the preamplifier to increase the signal amplitude (35). The results obtained by these workers show good agreement between theory and experiment. Schneider and Strutt found good agreement between theory and experiment for diodes with substantial recombination current (17). At high injection levels Schneider and Strutt expected to find junction noise negligible with respect to the “thermal noise of the material of the junction.” Their results were inconclusive because of the large amount of 1/f noise at the high bias currents (18).
B. Experiments on Bipolar Transistors

As has been mentioned previously, the problem of comparing the measured noise performance of a transistor with that calculated with the theory requires not only information on the magnitude of the noise generators but also information on the small-signal performance of the transistor. Calculations required for this purpose can be simplified somewhat by converting the noise equivalent circuit of Fig. 4 to the one shown in Fig. 9. The two almost completely correlated noise current generators in Fig. 4 are replaced by an output noise current generator $i_o = i_2 - \alpha i_1$ and an input noise emf $e = i_1 z_e = i_1/y_e$. The expression for the common-base current gain is

$$\alpha = \alpha_0 \frac{\exp(-jKf/f_\alpha)}{1 + jf/f_\alpha}. \tag{37}$$

The mean-squared values of the generators are […]

FIG. 9. Modified noise equivalent circuit for comparison between theoretical and experimental performance.
In this author's opinion the most satisfactory approach to comparing the theoretical and experimental noise performance of bipolar transistors is that followed by van der Ziel and his students (19, 36-38). The basic experiment is measuring the equivalent noise current at the output of a transistor with its input ac open-circuited. This yields information almost directly about $\overline{i_o^2}$ as long as $z_c \gg r_{b'b}$. After this basic verification van der Ziel turns to measurements at the input with source impedance as a parameter. Measurements with $R_s \simeq 0$ are most sensitive to small errors in $\overline{e^2}$ and $\overline{i_o^2}$; the effect of correlation between $e$ and $i_o$ can be seen by tuning the input for minimum noise (38) and by varying $R_s$. It is convenient to define the equivalent saturated diode current of $\overline{i_o^2}$. If Eqs. (39) and (41) are combined one can obtain the expression

$$I_{eq} = I_C - |\alpha|^2(I_E - I_R). \tag{43}$$
The equivalent noise resistance is calculated by referring all the noise sources of Fig. 9 to a single apparent noise emf in series with the input. The result is

$$R_n = \overline{e_{in}^2}/4kT\,df = R_s + r_{b'b} + r_{s1} + g_{s1}|Z_s + r_{b'b} + z_e + z_{sc}|^2, \tag{44}$$

where $r_{s1}$ is the noise resistance of the spectral density of $\overline{e^2}$; $g_{s1}$ is the equivalent noise conductance of $\overline{i_o^2}$ referred to the input; and $z_{sc}$ is the correlation impedance. These parameters can be evaluated by the relations
-
+ 2ze*gea - z e * g ( l E + ~ I E E ) / ( ~ T ) I ~ 29s arei* - [-I + - Z&(IE + 2 I E E ) / ( k T ) ] . 71-1
1
2ec
=
22egeo
2951
z
(46)
(47)
Neglecting $I_{CS}(1 - \alpha_R)$ and $I_{EE}$, which amounts to neglecting hole-electron pair generation in the base region, introducing the dc current amplification factor $\alpha_B$ by the definition

$$\alpha_B = I_C/I_E, \tag{48}$$

and eliminating $I_R$ with the help of the expression

$$I_R = 2(\alpha_0 - \alpha_B)I_E/\alpha_0, \tag{49}$$

one obtains instead of Eq. (43)

$$I_{eq} = I_E[\alpha_B - |\alpha|^2(-\alpha_0 + 2\alpha_B)/\alpha_0]. \tag{43a}$$
The low-frequency values $I_{eq0}$, $z_{sc0}$, $r_{s10}$, and $z_{e0}$ of $I_{eq}$, $z_{sc}$, $r_{s1}$, and $z_e$ thus become, respectively,

$$I_{eq0} = I_E[\alpha_B(1 - \alpha_B) + (\alpha_0 - \alpha_B)^2],$$
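The step from Eq. (43) to Eq. (43a) and to the low-frequency value $I_{eq0}$ is a short piece of algebra that can be confirmed numerically. The Python sketch below is illustrative only; it assumes $I_C = \alpha_B I_E$ from Eq. (48) and $I_R$ from Eq. (49), and takes $|\alpha| = \alpha_0$ in the low-frequency limit. The function names and sample values are hypothetical.

```python
def recombination_current(i_e, alpha0, alpha_b):
    """I_R from Eq. (49)."""
    return 2.0 * (alpha0 - alpha_b) * i_e / alpha0


def i_eq_from_eq43(i_e, alpha, alpha0, alpha_b):
    """Equivalent saturated diode current, Eq. (43), with I_C = alpha_B * I_E."""
    i_c = alpha_b * i_e
    i_r = recombination_current(i_e, alpha0, alpha_b)
    return i_c - abs(alpha) ** 2 * (i_e - i_r)


def i_eq_from_eq43a(i_e, alpha, alpha0, alpha_b):
    """Equivalent saturated diode current in the form of Eq. (43a)."""
    return i_e * (alpha_b - abs(alpha) ** 2 * (-alpha0 + 2.0 * alpha_b) / alpha0)


def i_eq_low_frequency(i_e, alpha0, alpha_b):
    """Low-frequency value I_eq0, obtained by setting |alpha| = alpha_0."""
    return i_e * (alpha_b * (1.0 - alpha_b) + (alpha0 - alpha_b) ** 2)


if __name__ == "__main__":
    i_e, alpha0, alpha_b = 1e-3, 0.990, 0.985      # hypothetical operating point
    alpha_hf = 0.7 - 0.5j                          # hypothetical high-frequency alpha
    print(i_eq_from_eq43(i_e, alpha_hf, alpha0, alpha_b))
    print(i_eq_from_eq43a(i_e, alpha_hf, alpha0, alpha_b))
    print(i_eq_low_frequency(i_e, alpha0, alpha_b))
```

At low frequencies the first two functions reduce to the third when called with `alpha = alpha0`.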
If trapping-recombination effects in the emitter-base transition region can be neglected, one has $I_R = 0$. It is then convenient to redefine $\alpha_B$ by requiring $\alpha_B = \alpha_0$. If $I_{EE}$ and $I_{CS}(1 - \alpha_R)$ are not neglected, one then obtains

Since $g_{e0} = q(I_E + I_{EE})/kT$, one obtains

$$I_{eq} = I_C - |\alpha|^2 I_E = (\alpha_0 - |\alpha|^2)I_E + I_{C0}, \tag{43c}$$
where $I_{C0}$ is the collector-saturated current. The expression for $I_{eq}$ and the corresponding expressions for $g_{s1}$, $z_{sc}$, and $r_{s1}$ coincide with expressions given earlier by van der Ziel (3, 17). From the above it is clear that a complete calculation of the value of $R_n$ requires a knowledge of $I_{eq}$, $\alpha_B$, $\alpha$, $r_{b'b}$, and of the source impedance $R_s$ seen by the transistor. Figures 10 and 11 show typical results of the measurements of $I_{eq}$. Figure 12 shows the results of the measurement of $R_n$ as a function of operating point. There is good agreement between the theoretical and experimental curves. These results are typical of those obtained by Hanson (36), Chenette (19), and Bruncke (37) with van der Ziel for a wide range of operating conditions. Guggenbuehl and Strutt and Schneider and Strutt have also reported good agreement between theory and experiment. There is however a small discrepancy between Schneider and Strutt's noise equivalent circuit and van der Ziel's. The reason for it is that, at low frequencies, van der Ziel attributes full shot noise to the recombination current as outlined above, while Schneider and Strutt assign only 4 shot noise to this current. Schneider and Strutt's expression should be valid for frequencies high
FIG. 10. $I_{eq0}$ as a function of frequency (Mc).

FIG. 11. $I_{eq0}$ as a function of bias current. Theoretical curve I includes recombination current; curve II neglects it completely; curve III uses the high-frequency approximation of Schneider and Strutt.
with respect to the lifetime of a trapped carrier while van der Ziel's should hold at lower frequencies. The effect of this discrepancy is shown on Fig. 11. The theoretical curve attributed to Schneider and Strutt was calculated by converting their expression for the noise factor to the output noise current with input ac open-circuited. The most important point to be made here is not the small discrepancy between the two theories. The important point is that the experiment which reveals the discrepancy is the measurement of $I_{eq}$. Noise factor
FIG. 12. $R_{n0}$ as a function of emitter current in the white noise region for two values of source resistance. $r_{b'b}$ was determined from measurements of $h_{ib0}$.
measurements, as presented by Schneider and Strutt, do not allow one to discriminate between errors in the noise model and in the small-signal equivalent circuit. At very high frequencies it becomes impossible to follow the procedure recommended above. A satisfactory “open-circuit” for the input does not exist, so that accurate measurements of $I_{eq}$ cannot be made. One must, therefore, abandon the above scheme and turn to measurements of noise resistance or noise factor for a much more limited range of source impedances. Fukui (39) and Policky and Cooke (40) have made extensive measurements of the noise performance of transistors in the microwave region. They find essential agreement between theory and experiment as
long as an adequate characterization of the effect of the transistor header is included in the noise equivalent circuit. This is a tedious procedure! Fukui's equivalent circuit is the “low-frequency” hybrid-pi circuit shown in Fig. 5. He modifies this circuit to include the effects of the transistor package and then finds excellent agreement between theory and experiment. The most important point of his work is that the simplest low-frequency model for the noise is useful at microwave frequencies. This must mean that the $f_\alpha$ of the transistors studied by Fukui lie well above 1.3 GHz. Schneider and Strutt (18) have studied the effect of high injection levels on noise in transistors. The main modification of their equivalent circuit was to modify the small-signal impedance of the junctions. Their results, again noise figure measurements at nominal source impedance, show good agreement with the model. Johnson [Johnson et al. (50)] has made measurements of $I_{eq}$ at high-current densities. Modern high-frequency transistors such as were studied by Fukui and by Policky and Cooke are operating at very high-current densities. Agouridis (66) has investigated the high frequency-high injection level problem and found good agreement between theory and experiment when small signal effects were properly interpreted.
C. Experiments on Field Effect Transistors

Bruncke and van der Ziel (23) have demonstrated that the noise performance of junction FET's is in good agreement with the thermal noise theory. They find it necessary to take into account that the gate extends only over part of the conducting channel. The inactive part of the channel must be represented by resistances $r_s$ and $r_d$ in series with the source and drain respectively. The most straightforward measurement to check the validity of the theory is that of measuring the noise current between drain and source with the gate ac short-circuited to the source. This is a direct measurement of $\overline{i_d^2}$. Figure 13 shows a comparison of theoretical and experimental values of the spectral density of $\overline{i_d^2}$ as a function of frequency. The low-frequency value of this curve agrees with Eq. (31); the increase at high frequencies is the result of the increase in $g_{11}$ and hence of $\overline{i_g^2}$. Figure 14 is another of Bruncke and van der Ziel's results showing the effect of drain volt-drop on the spectral density of $\overline{i_d^2}$. The theoretical curve includes the effects of the bias dependent factor $Q(V_{GS}, V_{DS})$ and of the extrinsic resistances $r_s$ and $r_d$. Unfortunately, many FET's show excess noise as is shown in the data of Fig. 15. The $1/(1 + \omega^2\tau^2)$ excess noise is apparently associated with
FIG. 13. Spectral density of $\overline{i_d^2}$ as a function of frequency, $f$ (Mc/s), for a junction FET at saturation with $V_{GS} = 0$.

FIG. 14. Spectral density of $\overline{i_d^2}$ in the “white noise” region as a function of $V_{DS}$ (volts) with $V_{GS} = 0$. The theoretical curve includes the effect of extrinsic source resistance.
trapping of carriers (26-28, 54). Shoji (29) has measured the noise of FET's as a function of temperature. At low temperatures, he has observed a white excess noise which is several times the theoretical thermal noise level. He identified experimentally that this was generation-recombination noise with relatively short time constants. Additional studies of the problem are under way at the University of Minnesota.
FIG. 15. Spectral density of $\overline{i_d^2}$ as a function of frequency, $f$ (Kc/s), for two FET's showing excess noise with a $1/(1 + \omega^2\tau^2)$ characteristic. The curves approach the theoretical thermal noise level at high frequencies ($I_{eq}$(theo) = 63 μamp and $I_{eq}$(theo) = 31 μamp).
D. Experiments on Flicker Noise

At low frequencies the noise performance of all semiconductor devices seems to be dominated by noise with a spectral density which varies as 1/f. This is commonly referred to as “flicker noise.” Most early experimental studies of noise in diodes and transistors were necessarily studies of the characteristics of the 1/f noise sources (41). Modern technology has made possible the fabrication of devices with greatly reduced 1/f noise. It will be seen later that 1/f noise is relatively more troublesome in modern FET's than in modern bipolar transistors. Schottky barrier diodes apparently have much less 1/f noise than is found in point contact diodes. The theoretical models discussed above have nothing to say about this 1/f or flicker noise. It is not possible at the present time to predict the magnitude of the 1/f noise sources in any semiconductor device. Fonger (42) found in studying the flicker noise of diodes and transistors that some of this noise was associated with leakage around a junction
and that some was apparently associated with modulation of the normal processes, possibly the result of fluctuating surface recombination velocities. A phenomenological equivalent circuit for 1/f noise in a junction transistor is shown in Fig. 16. Experiments by Plumb and Chenette (43, 44) show these two 1/f noise generators to be almost completely correlated for a transistor with negligible ohmic leakage at the collector and that one of the two is much larger than the other for most bias conditions. On the basis of these results
FIG. 16. Noise equivalent circuit in the 1/f noise region. In many transistors $i_{f1}$ and $i_{f2}$ are almost completely correlated and one of the two is much larger than the other.
the 1/f noise of a transistor can be described with a single 1/f noise current generator as shown in Fig. 5. This generator is connected in parallel with $\overline{i_B^2}$ and has the formal mean-squared value

$$\overline{i_f^2} = K I_B^{\gamma} f^{-\alpha}\,df. \tag{51}$$

Here $\gamma$ and $\alpha$ are often both about unity and $K$ varies greatly from one transistor to another. The only practical way of determining the magnitude of this 1/f noise generator for any transistor is by measuring the noise in the 1/f region.
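Equation (51) is often used to estimate the frequency below which flicker noise dominates. The Python sketch below is a hedged illustration: it assumes that the flicker generator competes with a base-current shot noise generator of spectral density $2qI_B$ (the value quoted for the simplified circuit of Fig. 5) and solves for the frequency at which the two are equal. The idea of a "corner frequency", the function names, and the sample values of $K$ and $I_B$ are introduced here and are hypothetical.

```python
Q_ELECTRON = 1.602176634e-19  # elementary charge in coulombs


def flicker_spectral_density(k_flicker, i_b, f, gamma=1.0, alpha=1.0):
    """Spectral density of the flicker generator, Eq. (51), per unit bandwidth."""
    return k_flicker * i_b**gamma * f ** (-alpha)


def shot_spectral_density(i_b):
    """Assumed competing base-current shot noise, 2 q I_B per unit bandwidth."""
    return 2.0 * Q_ELECTRON * i_b


def corner_frequency(k_flicker, i_b, gamma=1.0, alpha=1.0):
    """Frequency at which the flicker and shot noise densities are equal."""
    ratio = k_flicker * i_b ** (gamma - 1.0) / (2.0 * Q_ELECTRON)
    return ratio ** (1.0 / alpha)


if __name__ == "__main__":
    k_flicker = 2.0 * Q_ELECTRON * 1.0e3   # hypothetical K chosen to give a 1 kHz corner
    i_b = 1.0e-6                           # hypothetical base current in amperes
    f_c = corner_frequency(k_flicker, i_b)
    print(f_c)
    print(flicker_spectral_density(k_flicker, i_b, f_c), shot_spectral_density(i_b))
```

Because $K$ varies greatly from one transistor to another, such an estimate is only as good as the measured value of $K$ in the 1/f region.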
IV. PRACTICAL LOW-NOISE AMPLIFIERS

Since the theoretical models of noise in the devices discussed have been shown to agree well with the results of basic noise studies it is worthwhile to use the noise equivalent circuits to predict the performance of several practical amplifiers. Both bipolar and FET circuits will be included in the following paragraphs.
A. Bipolar Transistor Amplifiers

An expression was derived above for the equivalent noise resistance of a bipolar transistor using the circuit of Fig. 9. This expression was recommended for comparison with experiment after verifying $I_{eq}$. The expression can also be used, of course, to predict the noise performance of a practical amplifier. If one is interested in the noise factor, it is given by the relation

$$F = R_n/R_s. \tag{52}$$

Thus Eq. (44) yields (19)

$$F = 1 + \frac{r_{b'b} + r_{s1} + g_{s1}|Z_s + r_{b'b} + z_e + z_{sc}|^2}{R_s}. \tag{53}$$
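Equations (44), (52), and (53) translate directly into a short calculation. The Python sketch below is a minimal illustration that assumes $R_s$ is the real part of the complex source impedance $Z_s$; the function names and the numerical parameter values are hypothetical and are not taken from any measured transistor.

```python
def noise_resistance(z_s, r_bb, r_s1, g_s1, z_e, z_sc):
    """Equivalent noise resistance R_n of Eq. (44); z_s, z_e, z_sc may be complex."""
    r_s = complex(z_s).real                       # R_s assumed to be Re(Z_s)
    return r_s + r_bb + r_s1 + g_s1 * abs(z_s + r_bb + z_e + z_sc) ** 2


def noise_factor(z_s, r_bb, r_s1, g_s1, z_e, z_sc):
    """Noise factor F = R_n / R_s, Eq. (52)."""
    return noise_resistance(z_s, r_bb, r_s1, g_s1, z_e, z_sc) / complex(z_s).real


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only (ohms and siemens).
    z_s = 500.0 + 0.0j
    r_bb, r_s1, g_s1 = 100.0, 13.0, 2.0e-4
    z_e, z_sc = 26.0 + 0.0j, 0.0 + 0.0j
    print(noise_resistance(z_s, r_bb, r_s1, g_s1, z_e, z_sc))
    print(noise_factor(z_s, r_bb, r_s1, g_s1, z_e, z_sc))
```

Setting `z_sc` to zero corresponds to the simplification used by Nielsen, mentioned in the text that follows.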
This expression is valid in the shot-noise region for common-base, common-emitter, and common-collector connection. It is useful to frequencies near $f_\alpha$ (with the sensible inclusion of package parameters). The various terms have been defined above. It was essentially this expression with $z_{sc} = 0$ which Nielsen used in discussing the behavior of noise figure in transistors (45). The only objection to using this expression for calculating the noise performance of a transistor amplifier is that it is relatively complicated. All of the terms, including $r_{b'b}$, are more or less complicated functions of the operating point and transistor parameters. If one is willing to determine these parameters accurately this expression will yield accurate prediction of the noise performance for a wide range of transistors over a wide range of operating conditions. Such problems as high $I_{C0}$ and the transit time effects which become important when operating near $f_\alpha$ are included in the model. It is obvious that the calculations required to predict the noise performance also require accurate characterization of the circuit in which the transistor is imbedded. If package parasitics affect this circuit, as they do at high frequencies, they must be included. If one assumes in advance that the transistor to be used satisfies the requirements that operation is well below $f_\alpha$ and that leakage currents are negligible, it is possible to use the somewhat simpler noise equivalent circuit of Fig. 5 with the mean-squared values of the noise generators (11, 46)

$$\overline{i_B^2} \simeq 2qI_B\,df, \tag{18}$$

$$\overline{i_C^2} \simeq 2qI_C\,df, \tag{19}$$

$$\overline{i_A^2} \simeq 0, \quad […] \simeq 0, \tag{20}$$

and

$$\overline{e_b^2} = 4kTr_x\,df. \tag{12}$$
The noise factor of the circuit of Fig. 5 is given by the expression

where $\overline{i_f^2}$ is the mean-squared value of the flicker noise generator (Eq. 51). If this is negligible Eq. (54) can be rewritten as

$$F = F_0 + F_1(\omega^2). \tag{54b}$$
$R_s$ is the real part of the source impedance $Z_s$. The hybrid-pi parameters are given by the expressions

$$g_m = (q/kT)I_C, \qquad r_\pi = \beta_0/g_m, \qquad \text{and} \qquad \omega_T = g_m/(C_\pi + C_\mu). \tag{55}$$

$\beta_0 = h_{fe0} = g_m r_\pi$ is the low-frequency common-emitter current gain; $r_x$ is the base spreading resistance and must be determined by experiment. It is equivalent to the $r_{b'b}$ used earlier. $F_0$ is the low-frequency asymptotic value of the noise figure.

FIG. 17. Noise figure as a function of frequency (cps) for a typical silicon transistor (Type 2N3117). (Courtesy of Fairchild Semiconductor Products.)
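The hybrid-pi relations of Eq. (55) are simple enough to evaluate directly. The Python sketch below is illustrative; the function name, the default temperature of 300 K, and the sample values of $\beta_0$, $C_\pi$, and $C_\mu$ are assumptions made here and do not describe the 2N3117 or any other particular device.

```python
K_BOLTZMANN = 1.380649e-23    # Boltzmann constant in J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge in coulombs


def hybrid_pi_parameters(i_c, beta0, c_pi, c_mu, temperature=300.0):
    """Return (g_m, r_pi, omega_T) from Eq. (55).

    i_c in amperes, capacitances in farads, temperature in kelvin.
    """
    g_m = Q_ELECTRON * i_c / (K_BOLTZMANN * temperature)  # g_m = (q/kT) I_C
    r_pi = beta0 / g_m                                    # r_pi = beta_0 / g_m
    omega_t = g_m / (c_pi + c_mu)                         # omega_T = g_m / (C_pi + C_mu)
    return g_m, r_pi, omega_t


if __name__ == "__main__":
    # Hypothetical operating point: 1 mA collector current, beta_0 = 400.
    g_m, r_pi, omega_t = hybrid_pi_parameters(1.0e-3, 400.0, 20.0e-12, 4.0e-12)
    print(g_m, r_pi, omega_t)
```

Since $g_m$ is proportional to $I_C$, both $r_\pi$ and $\omega_T$ move with the operating point, which is one reason the noise factor is a complicated function of bias.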
Even Eq. (54a) is fairly complex. The noise factor is a complicated function of $R_s$ and the operating point. However, some insight into the shot-noise performance can be gained by examining the expression. For example, Eq. (54a) shows that the noise figure increases as $\omega^2$ at high frequencies. This behavior is seen in the curves of Fig. 17 which show the noise figure as a function of frequency for a typical transistor. The location of the break to the $\omega^2$ dependence is a function of the operating point and source impedance as well as the $\omega_T$ of the transistor. It is clear that it makes good sense to choose a transistor with a high $f_T$ (or, by Eq. (53), a high $f_\alpha$) for good low-noise performance over a wide frequency band.

FIG. 18. Contours of constant noise figure in the $R_s$-$I_C$ plane for several frequencies, panels (a) to (d), plotted against collector current $I_C$ (Type 2N3117). (Courtesy of Fairchild Semiconductor Products.)

It is a straightforward procedure to calculate the optimum source resistance for a given bias current or the optimum bias current for a given source resistance if one assumes the transistor parameters to be constant. The results are, for the low-frequency shot-noise region (47)
and when

$$I_C = \frac{kT}{qR_s}\,(h_{FE})^{1/2}. \tag{59}$$

Here $h_{FE} = I_C/I_B$ is the dc current gain of the transistor. These expressions make clear that low-noise performance requires high $h_{FE}$ and low $r_x$, as well as large $\omega_T$. The minimum possible noise factor is $[1 + 1/(h_{FE})^{1/2}]$ when $r_x$ is negligible.

One of the most useful ways of displaying data about the noise figure is that of showing contours of constant noise figure (46) in the $R_s$-$I_C$ plane as shown in Fig. 18. The 2N3117 is a good example of the transistors currently available for use in low-noise amplifiers at low frequencies. The typical minimum noise figure of 3.3 dB with $R_s = 500\ \Omega$ and $I_C = 1$ ma indicates a base resistance of about 600 Ω. The contours are in good agreement with this value.

In the $1/f$ noise region the noise equivalent circuit is dominated by $\overline{i_f^2}$. Then the noise figure is given by the expression
$$F(1/f) = 1 + \frac{\overline{i_f^2}}{\overline{e_s^2}}\,(R_s + r_x)^2 = 1 + \frac{K I_B^{\gamma}}{R_s f}\,[R_s + r_x]^2 \tag{60}$$

if $\gamma \approx$ unity. This expression has the minimum value for a fixed $I_C$ when $R_s = r_x$. This agrees with the results of measurements of noise figure contours reported by Gibbons (48).
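The bias and source-resistance conditions quoted above lend themselves to a quick numerical check. The sketch below evaluates Eq. (59) and the limiting noise factor $1 + 1/(h_{FE})^{1/2}$, and then scans $R_s$ to confirm that an excess term of the form $(R_s + r_x)^2/R_s$ is smallest at $R_s = r_x$. The function names, the value $h_{FE} = 400$, and the 290 K temperature are assumptions made for illustration; the flicker term is represented only by its dependence on $R_s$, with the constant factor left out.

```python
# Sketch: optimum bias for the shot-noise region and the 1/f-region minimum.
# h_fe = 400 and T = 290 K are assumed, not taken from the text.
import math

Q_E = 1.602176634e-19   # electron charge, coulombs
K_B = 1.380649e-23      # Boltzmann constant, joules per kelvin


def optimum_collector_current(r_s, h_fe, temperature=290.0):
    """Eq. (59): I_C = (kT / q R_s) * sqrt(h_FE), in amperes."""
    return K_B * temperature / (Q_E * r_s) * math.sqrt(h_fe)


def minimum_noise_factor(h_fe):
    """Limiting noise factor 1 + 1/sqrt(h_FE), valid for negligible r_x."""
    return 1.0 + 1.0 / math.sqrt(h_fe)


def flicker_excess_shape(r_s, r_x):
    """R_s dependence of F(1/f) - 1 at fixed I_C, up to a constant factor."""
    return (r_s + r_x) ** 2 / r_s


if __name__ == "__main__":
    i_opt = optimum_collector_current(r_s=500.0, h_fe=400.0)
    f_min_db = 10.0 * math.log10(minimum_noise_factor(400.0))
    # Coarse scan of source resistances (ohms) for r_x = 600 ohms.
    best_r_s = min(range(100, 2001, 10), key=lambda r: flicker_excess_shape(r, 600.0))
    print(f"I_C(opt) = {i_opt:.3e} A, F_min = {f_min_db:.2f} dB, best R_s = {best_r_s} ohms")
```

The scan is deliberately crude; setting the derivative of $(R_s + r_x)^2/R_s$ to zero gives $R_s = r_x$ directly.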
B. Field Effect Transistor Amplifiers

Figure 19 shows a schematic diagram of a typical n-channel junction FET amplifier.

FIG. 19. Schematic diagram of a typical junction FET amplifier.

The noise figure can be calculated in much the same way as it was for the bipolar transistor. The noise equivalent circuit to be used is that given in Fig. 20. The values of the noise generators $\overline{i_g^2}$, $\overline{i_d^2}$, and $\overline{i_g i_d^{*}}$ are given by Eqs. (31), (32), and (34). A slight modification of these values may be necessary to account for the extrinsic source and drain resistances. The noise figure can be calculated by evaluating the short-circuit output noise current of each of the noise sources. The result is
$$F = 1 + \frac{G_c}{G_s} + \frac{\overline{\left|\,i_g + i_d\,(G_s + Y_c + y_{in})/Y_{tr}\right|^2}}{\overline{i_s^2}} \tag{61}$$
where $G_c$ is the real part of the input circuit admittance and $G_s$ is the signal source conductance. $y_{in} = g_{in} + jb_{in}$ is the input admittance of the FET with its output short-circuited and $Y_{tr} = y_{21} - y_{12} = (g_{tr} + jb_{tr})$ is the net transfer admittance.

FIG. 20. Noise equivalent circuit for the FET amplifier. $\overline{i_s^2} = 4kTG_s\,df$ and $\overline{i_c^2} = 4kTG_c\,df$ are the result of thermal noise of the source and input circuit conductances.
This expression may be simplified by dividing $i_g$ into two parts, $i_g'$, which is fully correlated with $i_d$, and $i_g''$, which is completely uncorrelated with $i_d$:

$$i_g = i_g' + i_g'' \qquad \text{and} \qquad \overline{i_g^2} = \overline{i_g'^2} + \overline{i_g''^2}. \tag{62}$$
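Equation (61) can then be written as

where $g_n$ is the noise conductance of the uncorrelated part of $i_g$. It is defined by

$$4kT\,df\,g_n = \overline{i_g''^2} = \overline{i_g^2} - \overline{i_g'^2} = \overline{i_g^2} - \left|\overline{i_g i_d^{*}}\right|^2 / \,\overline{i_d^2}. \tag{63}$$

$r_n$ is the equivalent noise resistance of $i_d$ referred to the input, defined by

$$4kTr_n\,df = \overline{i_d^2}/|Y_{tr}|^2. \tag{64}$$

Combining Eqs. (64) and (31) yields

$$r_n = (g_{m(\max)}/|Y_{tr}|^2)\,Q(V_{GS}, V_{DS}). \tag{65}$$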
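Equations (63) and (64) translate measured or calculated noise spectra into the two parameters $g_n$ and $r_n$. The following sketch shows the arithmetic under stated assumptions: the function names are hypothetical, the spectra are passed as mean-squared values in a bandwidth $df$, the cross term is the (possibly complex) average $\overline{i_g i_d^{*}}$, and the numbers in the example are invented for illustration.

```python
# Sketch: noise conductance g_n (Eq. 63) and noise resistance r_n (Eq. 64).
# Input values in the example are hypothetical.

K_B = 1.380649e-23      # Boltzmann constant, joules per kelvin


def noise_conductance(ig2, id2, cross, bandwidth=1.0, temperature=290.0):
    """Eq. (63): 4kT df g_n = ig2 - |cross|^2 / id2, returned in siemens."""
    uncorrelated = ig2 - abs(cross) ** 2 / id2   # mean square of i_g''
    return uncorrelated / (4.0 * K_B * temperature * bandwidth)


def equivalent_noise_resistance(id2, y_tr, bandwidth=1.0, temperature=290.0):
    """Eq. (64): 4kT r_n df = id2 / |Y_tr|^2, returned in ohms."""
    return id2 / (abs(y_tr) ** 2 * 4.0 * K_B * temperature * bandwidth)


if __name__ == "__main__":
    four_kt = 4.0 * K_B * 290.0
    id2 = four_kt * 2e-3                  # drain noise equal to that of 2 mS
    ig2 = 1e-26                           # assumed gate noise, A^2 per cps
    cross = 0.5j * (ig2 * id2) ** 0.5     # assumed partial, imaginary correlation
    print("g_n =", noise_conductance(ig2, id2, cross), "S")
    print("r_n =", equivalent_noise_resistance(id2, 3e-3 + 0j), "ohms")
```

With zero cross-correlation $g_n$ reduces to $\overline{i_g^2}/4kT\,df$, and with full correlation it vanishes, which are the two limits one would expect from Eq. (62).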
$y_{cor}$ is the correlation admittance relating the magnitude of $i_g'$ to $i_d$.
It is clear that the noise figure can be tuned to the value

$$F' = 1 + \frac{G_c}{G_s} + \frac{g_n}{G_s} + \frac{r_n}{G_s}\,|G_s + G_c + g_{in} + g_{cor}|^2 \tag{67}$$

when

$$(B_c + b_{in} + b_{cor}) = 0. \tag{68}$$
The source conductance can be chosen to yield minimum noise figure. The result from Eq. (67) is

$$F'_{\min} = 1 + 2r_n[G_c + g_{in} + g_{cor}] + 2[r_n(G_c + g_n) + r_n^2(G_c + g_{in} + g_{cor})^2]^{1/2} \tag{69}$$

for

$$G_s = G_{s(\mathrm{opt})} = [(G_c + g_n)/r_n + (G_c + g_{in} + g_{cor})^2]^{1/2}. \tag{70}$$
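The optimum can be checked numerically by evaluating the tuned noise factor of Eq. (67) on either side of the closed-form optimum obtained by setting $dF'/dG_s = 0$. The sketch below does this; the function names and every numerical value are assumptions chosen only to make the arithmetic easy to follow.

```python
# Sketch: tuned FET noise factor (Eq. 67) and its minimum over G_s.
# The closed forms follow from minimizing Eq. (67) with respect to G_s.
# All parameter values in the example are hypothetical.
import math


def tuned_noise_figure(g_s, g_c, g_n, r_n, g_in, g_cor):
    """Eq. (67): noise factor with the net input susceptance tuned out."""
    total = g_s + g_c + g_in + g_cor
    return 1.0 + g_c / g_s + g_n / g_s + (r_n / g_s) * total ** 2


def optimum_source_conductance(g_c, g_n, r_n, g_in, g_cor):
    """Source conductance (siemens) that minimizes Eq. (67)."""
    a = g_c + g_in + g_cor
    return math.sqrt((g_c + g_n) / r_n + a * a)


def minimum_noise_figure(g_c, g_n, r_n, g_in, g_cor):
    """Value of Eq. (67) at the optimum source conductance."""
    a = g_c + g_in + g_cor
    return 1.0 + 2.0 * r_n * a + 2.0 * math.sqrt(r_n * (g_c + g_n) + (r_n * a) ** 2)


if __name__ == "__main__":
    params = dict(g_c=1e-6, g_n=2e-6, r_n=500.0, g_in=1e-6, g_cor=5e-7)
    g_opt = optimum_source_conductance(**params)
    print("G_s(opt) =", g_opt, "S")
    print("F'_min   =", minimum_noise_figure(**params))
    for scale in (0.5, 0.9, 1.0, 1.1, 2.0):
        print(scale, tuned_noise_figure(scale * g_opt, **params))
```

The printed table should show the noise factor rising on both sides of $G_{s(\mathrm{opt})}$, and rather slowly near the optimum, so that a modest mismatch in source conductance costs little.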
If, as is often the case, $(G_c + g_{in} + g_{cor})^2$