ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS VOLUME XI
Advances in Electronics and Electron Physics
EDITED BY L. MARTON
National Bureau of Standards, Washington, D. C.
Assistant Editor CLAIRE MARTON
EDITORIAL BOARD
T. E. Allibone, H. B. G. Casimir, L. T. DeVore, W. G. Dow, A. O. C. Nier, W. B. Nottingham, E. R. Piore, M. Ponte, A. Rose, L. P. Smith
VOLUME XI
1959
ACADEMIC PRESS
New York and London
COPYRIGHT © 1959, BY ACADEMIC PRESS INC.
ALL RIGHTS RESERVED
NO PART OF THIS BOOK MAY BE REPRODUCED IN ANY FORM, BY PHOTOSTAT, MICROFILM, OR ANY OTHER MEANS, WITHOUT WRITTEN PERMISSION FROM THE PUBLISHERS.
ACADEMIC PRESS INC., 111 FIFTH AVENUE, NEW YORK 3, N. Y.
United Kingdom Edition Published by ACADEMIC PRESS INC. (LONDON) LTD., 40 PALL MALL, LONDON S.W. 1
Library of Congress Catalog Card Number 49-7504
PRINTED IN THE UNITED STATES OF AMERICA
CONTRIBUTORS TO VOLUME XI
G. E. BARLOW, Australian Joint Service Staff, Washington, D. C.
W. BRAUER, Institute for Solid State Research, Theoretical Department, German Academy of Sciences, Berlin
P. GÖRLICH, Friedrich-Schiller-University, Jena, Germany
O. HACHENBERG, Heinrich-Hertz-Institute, German Academy of Sciences, Berlin

[…] 90°):
Thus, for ξ = +1, we have N_F/N_B = […]. For an unpolarized μ⁻ beam, the momentum spectrum N_e(x) is obtained by integrating N(x,θ) [Eq. (18)] over dΩ_e, giving

$$N_e(x)\,dx = dx \int N(x,\theta)\,d\Omega_e = 2x^2(3 - 2x)\,dx \qquad (23)$$

The μ⁺-e⁺ decay is completely analogous to the μ⁻-e⁻ decay. The μ⁺-decay proceeds by emission of ν̄ and ν according to

$$\mu^+ \rightarrow e^+ + \nu + \bar{\nu} \qquad \text{(XI)}$$
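As a quick consistency check on Eq. (23), the following sketch integrates the spectrum exactly with rational arithmetic. It assumes that x is the electron momentum in units of its maximum value, so that 0 ≤ x ≤ 1 (the definition of x precedes this excerpt); the function names are illustrative only.

```python
from fractions import Fraction


def spectrum(x):
    """Eq. (23): N_e(x) = 2 x^2 (3 - 2x); x is assumed to run from 0 to 1."""
    return 2 * x**2 * (3 - 2 * x)


def integral_of_spectrum():
    """Exact integral of N_e(x) = 6x^2 - 4x^3 over [0, 1]."""
    # Antiderivative: 2x^3 - x^4, evaluated at the end points.
    antiderivative = lambda x: 2 * x**3 - x**4
    return antiderivative(Fraction(1)) - antiderivative(Fraction(0))


def mean_x():
    """Exact mean of x weighted by N_e(x): integral of 6x^3 - 4x^4 over [0, 1]."""
    # Antiderivative: (3/2)x^4 - (4/5)x^5, which vanishes at x = 0.
    return Fraction(3, 2) - Fraction(4, 5)
```

Under the stated assumption the integral is 1, so Eq. (23) is normalized to one electron per decay, and the mean value of x is 7/10.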
The spectrum N(x,θ) for the positrons is given by Eq. (18), in which θ is the angle between the positron momentum (in the μ⁺ rest system) and the momentum of the μ⁺-meson. Thus, the positron spectrum is also peaked backwards with respect to the μ⁺-meson momentum. The angular distribution N(θ) irrespective of energy is again given by Eq. (20). We shall point out here an important prediction of the two-component neutrino theory concerning the β-decay of unpolarized nuclei, which was obtained independently by Landau (16), Jackson, Treiman, and Wyld (25), Wolfenstein and Page (26), and Curtis and Lewis (27). For a β-decay in which an electron is emitted, the electron has a longitudinal polarization P (spin polarization along the direction of the momentum vector p) of −v_e/c, where v_e is the velocity of the electron. Thus for high-energy electrons (v_e ≈ c), the spin is completely polarized opposite to the direction of motion. For a β-decay where a positron is emitted, the positron polarization P is +v_e/c along the direction of the momentum vector, i.e., for a high-energy positron, the spin is completely polarized along the direction of motion. Lee and Yang (28)² have discussed an important concept in connection
² See also Konopinski and Mahmoud, who were the first to consider a law of conservation of leptons (28a).
PARITY NONCONSERVATION IN WEAK INTERACTIONS
with the β and meson decays, namely the concept of lepton conservation. According to this hypothesis, each of the particles ν, ν̄, e±, and μ± is given a lepton number l which is either +1 or −1. For l = +1, the particle is called a lepton; for l = −1, the particle is an antilepton. The principle of lepton conservation states that in a particle decay (due to a weak interaction) the total lepton number is conserved, i.e., the lepton number l of the decaying particle is equal to the sum of the lepton numbers l_i of the decay products: l = Σ_i l_i.
In the assignment of lepton numbers, the hyperons, nucleons, K and π-mesons, and γ-rays are given l = 0; the μ⁻, e⁻, and ν are leptons (l = +1), while the μ⁺, e⁺, and ν̄ are antileptons (l = −1). It is easily verified that all the reactions which have been proposed in the preceding discussion [(V), (VI), (VII), (X), and (XI)] satisfy the principle of lepton conservation. On the other hand, the possibilities (VIII) and (IX) for the μ-decay (which have been excluded on the basis of the resulting e⁻ spectra) would not satisfy lepton conservation; e.g., for (VIII), l(μ⁻) = +1 on the left-hand side, whereas l(e⁻) + 2l(ν) = +3 on the right-hand side. Assuming lepton conservation, we can write down the reactions for various K meson decays, on the assumption that l(K) = 0:

$$K_{\mu 2}^{+} \rightarrow \mu^+ + \nu \qquad \text{(XII)}$$
$$K_{\mu 2}^{-} \rightarrow \mu^- + \bar{\nu} \qquad \text{(XIII)}$$
$$K_{\mu 3}^{+} \rightarrow \mu^+ + \nu + \pi^0$$
$$K_{e3}^{+} \rightarrow e^+ + \nu + \pi^0$$
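The lepton-number bookkeeping described above can be sketched in a few lines. The particle labels (`"nubar"` for the antineutrino, and so on) and the function name are illustrative choices, not notation from the text; the numerical assignments are those stated in the preceding paragraph.

```python
# Lepton numbers as assigned in the text: leptons +1, antileptons -1,
# K mesons, pi mesons, and gamma rays 0.
LEPTON_NUMBER = {
    "mu-": +1, "e-": +1, "nu": +1,
    "mu+": -1, "e+": -1, "nubar": -1,
    "K+": 0, "K-": 0, "pi+": 0, "pi-": 0, "pi0": 0, "gamma": 0,
}


def conserves_lepton_number(parent, products):
    """True if l(parent) equals the sum of l over the decay products."""
    return LEPTON_NUMBER[parent] == sum(LEPTON_NUMBER[p] for p in products)


# Reaction (XI): mu+ -> e+ + nu + nubar is allowed.
allowed_xi = conserves_lepton_number("mu+", ["e+", "nu", "nubar"])
# Possibility (VIII): mu- -> e- + 2 nu gives +3 on the right-hand side.
allowed_viii = conserves_lepton_number("mu-", ["e-", "nu", "nu"])
```

The same function shows why lepton conservation fixes the neutral particle in the two-body K decay: `conserves_lepton_number("K+", ["mu+", "nu"])` holds, while replacing `"nu"` by `"nubar"` does not.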
Of course, the important point in these decays is that lepton conservation enables us to decide whether a ν or ν̄ is emitted in a particular decay. This has special importance for the two-body decay of K_μ2±, since the direction of the spin polarization of ν or ν̄ then determines the direction of the polarization of the μ±, in the same manner as for the π-μ decay (assuming that the spin of the K meson is zero). Thus, for the K_μ2⁺ decay at rest, the μ⁺ is polarized antiparallel to its momentum p_μ⁺, whereas for the K_μ2⁻ decay, the μ⁻ spin is polarized along the direction of motion, according to (XII) and (XIII). In both cases, the muon polarization is 100%. These predictions have been recently verified by Coombes et al. (29), using K_μ2⁺'s produced in the 6.2-Bev proton beam of the Berkeley Bevatron.

VI. THE EXPERIMENT OF GARWIN, LEDERMAN, AND WEINRICH ON THE π-μ-e DECAY

Following the success of the experiment of Wu et al. (2), Garwin et al. (3) carried out a very important experiment, also suggested by Lee and Yang (1), to determine the angular distribution of the positrons from the
R. M. STERNHEIMER
μ⁺-decay. The experimental arrangement is schematically shown in Fig. 5. A beam of 85-Mev π⁺ mesons containing a ~10% μ⁺ contamination, obtained from the Columbia Nevis cyclotron, is allowed to impinge on an 8-in. carbon absorber, which is followed by a thin carbon target. The purpose of the carbon absorber is to stop the pions (the range of an 85-Mev π⁺ is ~5 in. of carbon). As noted above, the beam contains ~10% μ⁺-mesons, most of which originate from π⁺-μ⁺ decays occurring in the vicinity of the cyclotron target. The thickness of the carbon absorber was chosen to be
FIG. 5. Schematic view of the experimental arrangement of Garwin et al. [R. L. Garwin, L. M. Lederman, and M. Weinrich, Phys. Rev. 105, 1415 (1957)] used to measure the angular distribution of the positrons from the decay of polarized μ⁺ mesons. (Labeled in the figure: 85-Mev π⁺ beam, shielding, counters, carbon absorber, magnetizing coil, carbon target, magnetic shield.)
8 in., so that a maximum number of the μ⁺, whose range is ~8 in., come to rest in the carbon target following the absorber. The passage of a fast μ⁺ through the carbon absorber is indicated by a fast coincidence count of counters No. 1 and No. 2 (see Fig. 5). Around the carbon target, there is a magnetizing coil which produces a vertical magnetic field (i.e., perpendicular to the plane of the figure). If the μ⁺ is longitudinally polarized along its direction of motion, as predicted by the two-component theory, this field H will make the spin precess around H; i.e., the spin vector will rotate through an angle ωt, where ω = (μ/sħ)H, with μ = magnetic moment of muon, s = spin of μ⁺ = ½; t = time
elapsed between the arrival of the μ⁺ at the carbon target and the μ⁺ decay. The decay positrons are counted in the counter telescope No. 3-4 (see Fig. 5). Between counters No. 3 and No. 4, an absorber can be interposed in order to exclude positrons with energies below a certain value (e.g., 35 Mev in some of the tests in this experiment). As the magnetic field H is increased, for a given fixed time delay t, the μ⁺ spin will precess by an increasing amount proportional to H. With the angular distribution of Eq. (20), there will be a variation of the counting rate in the No. 3-4 telescope, with maxima and minima depending on the orientation of the μ⁺ spin with respect to the counters. Garwin et al. (3) thus obtained the sinelike dependence of the counting rate on the magnetizing current, which is shown in Fig. 6. The theoretical curve shown in Fig. 6 includes the effect
FIG. 6. The variation of the counting rate in the counter telescope No. 3-4 of Garwin et al. [R. L. Garwin, L. M. Lederman, and M. Weinrich, Phys. Rev. 105, 1415 (1957)] as a function of the current (in amperes) in the magnetizing coil around the carbon target.
of the smearing due to the finite gate width of the counter telescope. (Positrons which are emitted within a time interval from t = 0.75 to 2.0 μsec after the μ⁺ stops in the carbon target are counted.) The calculated curve, which is based on Eq. (20) with ξ = +1, is in good agreement with the data, thus providing a strong confirmation of the two-component neutrino theory, which in turn implies nonconservation of both P and C in the π-μ-e decay. In obtaining the amount of precession corresponding to a given magnetizing field H, Garwin et al. (3) assumed that the gyromagnetic ratio of
the μ⁺ meson is 2.00, the same as for the electron. Thus, the agreement obtained implies that the actual gyromagnetic ratio of the μ⁺ is 2.00 ± 0.10, i.e., the magnetic moment is eħ/(2m_μc) within the experimental uncertainties, where m_μ is the μ-meson mass. As a check on their experiment, Garwin et al. (3) decreased the thickness of carbon absorber to ~5 in., so that most of the π⁺ mesons stopped in the carbon target inside the magnetic field, instead of the carbon absorber. In this case, no variation of counting rate with field H was observed, and the positron counting rate increased by a factor of ~10. These effects would be expected, since pions decaying at rest in the target now provide most of the muons, which are emitted in all directions and therefore give no variation of the positron counting rate as H is increased. For negative muons, Garwin et al. (3) have also detected a backward-peaked asymmetry, and have verified that the magnetic moment is negative and approximately equal to that of the μ⁺.

Following the experiments of Wu et al. (2), Garwin et al. (3), and Friedman and Telegdi (4), a great number of experiments on parity nonconservation have been carried out. In the following discussion, we shall restrict ourselves to experiments on the longitudinal polarization of β-particles from unpolarized nuclei.³ Thus, we shall not discuss the numerous experiments on the asymmetries in π-μ-e decay, β-decay of oriented nuclei, K-particle and hyperon decay, which furnish important information on parity nonconservation in the weak decay interactions.

VII. CIRCULAR POLARIZATION OF THE BREMSSTRAHLUNG EMITTED BY LONGITUDINALLY POLARIZED ELECTRONS
A method of demonstrating the longitudinal polarization of electrons emitted in β-decay consists in measuring the circular polarization of the bremsstrahlung photons which are radiated by the electrons. The circular polarization of the bremsstrahlung was first discussed by McVoy (30), in connection with the experiment of Goldhaber, Grodzins, and Sunyar (31) on the polarization of the electrons from the β-decay of Y⁹⁰. McVoy's (30) calculation proceeds in the same general manner as the original Bethe-Heitler calculation (32) for unpolarized electrons, except for the fact that the initial and final electron spin states are not summed over, so that one obtains the dependence of the cross section on the polarization state of the γ-ray for a given direction of polarization of the incident electron. McVoy (30) has carried out this calculation only for the case that the photon is emitted in the forward direction. This case is of greatest
³ The survey of the literature pertaining to these experiments on the longitudinal polarization of β-particles was completed in April, 1958.
interest for relativistic electrons, since the photons are then emitted predominantly at small angles. The definition of right-circular polarization adopted by McVoy (30) (and also by other workers in this field) is the opposite of the old definition used in optical work. Thus, if the z axis is along the direction of propagation of the photon (propagation vector k), the polarization vector is e = (e_x + iδ e_y)/√2 for circularly polarized light, where e_x and e_y are unit polarization vectors in the x and y directions, respectively (the xyz coordinate system is taken as right-handed). According to the present definition, δ = +1 corresponds to right-circularly polarized light, while δ = −1 corresponds to left-circular polarization.
FIG. 7. The circular polarization P of bremsstrahlung emitted in the forward direction by forward-spin electrons of kinetic energy T₀ = 2 Mev, as a function of the photon energy (in Mev). This figure is taken from the work of K. W. McVoy [Phys. Rev. 106, 828 (1957); 110, 1484 (1958), Fig. 1] and is reprinted with the permission of the author and the Editor of the Physical Review.
For a given incoming electron kinetic energy T₀, the polarization P increases with increasing photon energy hν from P = 0 at the lower end of the bremsstrahlung spectrum (hν = 0) to a maximum value P_max at the upper end (hν = T₀). Figure 7 shows the curve for P vs. hν for T₀ = 2 Mev, as obtained by McVoy (30). The polarization P is defined as

$$P = \frac{R - L}{R + L} \qquad (24)$$
where R and L are the cross sections for producing a right-handed and left-handed photon, respectively, in the forward direction from an electron with forward spin (spin parallel to direction of motion). The maximum polarization P_max (for hν = T₀) is a relatively simple function of the incident electron energy. P_max is given by

[…] (25)

where E₀ (= T₀ + m) is the total energy of the incident electron,⁴ whose momentum is denoted by p₀. Figure 8 shows a plot of P_max vs. T₀, as obtained
FIG. 8. The maximum circular polarization P_max of bremsstrahlung emitted in the forward direction by forward-spin electrons, as a function of the electron kinetic energy T₀ (in Mev). P_max is obtained from the equation given by McVoy [Phys. Rev. 106, 828 (1957); 110, 1484 (1958)]; see Eq. (25) of text.
from Eq. (25). It is seen that P_max rapidly approaches 100% as the incident electron energy becomes relativistic. At high electron energies, the photons of maximum energy in the forward direction are essentially 100% right-circularly polarized for incident forward-spin electrons and 100% left-circularly polarized for incident backward-spin electrons (spin antiparallel to direction of motion).

⁴ It is assumed that the units are such that c = 1.

In a recent paper, Fronsdal and Überall (33) have extended McVoy's results to obtain the circular polarization of the bremsstrahlung emitted at an arbitrary angle θ to the direction of the incident electron. Their results for the polarization P as a function of photon energy hν, for finite θ, are in general similar to those obtained by McVoy (30) for θ = 0°. In an important experiment of Goldhaber et al. (31), the circular polarization of the bremsstrahlung was used to detect the longitudinal polarization of the β-rays from the decay of Y⁹⁰. The circular polarization of the bremsstrahlung γ-rays was established by measuring the Compton scattering of the γ-rays in an iron electromagnet which was magnetized to saturation either parallel or antiparallel to the direction of the γ-rays. As was
FIG. 9. Schematic view of the experimental arrangement of Goldhaber et al. used to demonstrate the circular polarization of the bremsstrahlung emitted by the electrons from the β decay of Y⁹⁰. (Labeled in the figure: analyzing magnet, double mu-metal shield.) This figure is taken from the paper of Goldhaber et al. [M. Goldhaber, L. Grodzins, and A. W. Sunyar, Phys. Rev. 106, 826 (1957), Fig. 1] and is reprinted with the permission of the authors and the Editor of the Physical Review.
first demonstrated experimentally by Gunst and Page (34, see also 35), the Compton scattering cross section for right- or left-circularly polarized γ-rays has a spin-dependent part, which reverses sign when the orientation of the spin of the target electron is reversed. For iron, there are two electrons per atom (3d electrons) whose spin aligns itself with the direction of the external magnetic field. The arrangement of the apparatus used by Goldhaber et al. (31) is shown in Fig. 9. The Sr⁹⁰ + Y⁹⁰ source was encased in
Monel metal (Z_eff = 28; 60% Ni, 33% Cu, 6.5% Fe) in which most of the bremsstrahlung was produced. The β⁻-rays from the decay of Y⁹⁰ have a maximum energy of 2.24 Mev, for which (v/c)_max = 0.98. According to the two-component neutrino theory, the β⁻-rays should have a polarization −v/c, i.e., the spin vector of the electrons should be approximately opposite to their direction of motion. The bremsstrahlung γ-rays are filtered in the iron of the magnet; i.e., a fraction of the γ's suffers Compton scattering and thereby disappears from the beam (which goes downward in Fig. 9). The γ-rays are then counted in a 3 × 3-in. NaI(Tl) scintillation counter, for which the pulse height gives directly the energy of the γ-ray. (All of the energy of the γ-ray is deposited in the NaI(Tl) crystal and therefore contributes to the pulse height.) The counting rate was found to be different for the field direction up and field direction down; these two counting rates will be denoted by N₊ and N₋, respectively. At 1.8-Mev photon energy, Goldhaber et al. (31) obtained a relative difference δ = 0.07 ± 0.005, where δ = (N₋ − N₊)/[½(N₋ + N₊)]. Assuming a reasonable value for the effective path length in the magnet (4% in.), the value of δ should be δ = 0.08, if the photons were completely circularly polarized. The actual observed value of δ therefore indicates a high degree of circular polarization (~90%) for the high-energy photons (~1.8 Mev). From the fact that δ is positive (N₋ > N₊), one can conclude that the photons are left-circularly polarized. This result is expected from the theory of McVoy (30), if the β⁻-particles are polarized antiparallel to their direction of motion (backward spin), as is required by the two-component neutrino theory. Thus, Goldhaber et al. (31) have established the longitudinal polarization of the β⁻-rays from Y⁹⁰, which is a first-forbidden transition of unique shape ΔJ = 2, yes (meaning parity change of the nuclear states).
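The numbers quoted in this paragraph can be reproduced with a short sketch. The electron rest energy of 0.511 Mev is a standard value that is not quoted in the text, and the function names are illustrative; dividing the observed δ by the value expected for complete polarization is a simple linear estimate rather than the authors' own analysis.

```python
import math

M_E_MEV = 0.511  # electron rest energy in Mev (assumed standard value)


def beta_from_kinetic_energy(t_mev):
    """v/c of an electron with kinetic energy t_mev (in Mev)."""
    gamma = 1.0 + t_mev / M_E_MEV
    return math.sqrt(1.0 - 1.0 / gamma**2)


def relative_difference(n_minus, n_plus):
    """delta = (N- - N+) / [(1/2)(N- + N+)], as defined in the text."""
    return (n_minus - n_plus) / (0.5 * (n_minus + n_plus))


def polarization_estimate(delta_observed, delta_full):
    """Linear estimate: observed delta over delta for complete polarization."""
    return delta_observed / delta_full


beta_max = beta_from_kinetic_energy(2.24)           # end point of the Y90 spectrum
photon_polarization = polarization_estimate(0.07, 0.08)
```

With these inputs `beta_max` rounds to 0.98 and `photon_polarization` is 0.875, in line with the roughly 90% circular polarization stated above.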
It should be noted that the experiment of Goldhaber et al. (31) was one of the first to demonstrate the longitudinal polarization of β-particles from unpolarized nuclei. The circular polarization of the internal bremsstrahlung has been discussed by Cutkosky (36), Ford (37), and Pytte (38). The internal bremsstrahlung is an effect caused by the changing dipole moment of the atom as the electronic charge of the β-particle is suddenly shifted from the nucleus to the external region of the atomic electrons. A summary of the experimental and theoretical information on the inner bremsstrahlung has been given by Wu (39). Cutkosky (36) has discussed in detail the internal bremsstrahlung accompanying K capture. This author has also pointed out that for a general β-decay (not involving K capture) the energetic inner bremsstrahlung γ-rays will be right- or left-circularly polarized, if they accompany slow positrons or slow electrons, respectively. On the other hand, low-energy γ-rays have a linear polarization correlated with the direction of the β-particle.
Ford (37) has shown that the degree of circular polarization P(k) of the inner bremsstrahlung increases rapidly with increasing photon energy k (= hν). For the inner bremsstrahlung from P³² (maximum electron energy T_max = 1.70 Mev), P(k) is 0.72 for k/T_max = 0.5, P(k) = 0.96 for k/T_max = 0.8, and P(k) = 1 for k/T_max = 1. Thus, the dependence of P(k) on k is very similar to the corresponding k dependence of P for the external bremsstrahlung, as discussed by McVoy (30) (see Fig. 7). The results of Pytte (38) are similar to those obtained by Ford (37). Pytte has calculated the polarization P(k) of the inner bremsstrahlung from P³² and S³⁵ (T_max = 167 kev). This author has also considered the effects of the nuclear Coulomb field on the γ-ray spectrum and the polarization. In a recent experiment, Schopper and Galster (40) have verified the predictions of the theory of P(k) for the inner bremsstrahlung from a Sr⁹⁰ + Y⁹⁰ source. The calculated values of P(k) obtained by Ford (37) and Pytte (38) are in good agreement with the observations of Schopper and Galster (40), who measured the circular polarization of the inner bremsstrahlung by means of the Compton scattering in magnetized iron (34, 35) for photon energies from 0 to ~1.8 Mev. The circular polarization is nearly complete at the upper end of the spectrum, showing that there is a maximum violation of parity conservation in the β-decay interaction. In the same experiment (40), the polarization of the ordinary (external) bremsstrahlung emitted by the β-decay electrons was also investigated and was found to be in reasonable agreement (especially at high energies, ~2 Mev) with the calculations of McVoy (30) and the assumption that the polarization P of the electrons is −v/c.
VIII. DETERMINATION OF THE LONGITUDINAL POLARIZATION OF β-RAYS BY THE METHOD OF SCATTERING ON POLARIZED ELECTRONS (MØLLER SCATTERING)

Aside from the detection of the circular polarization of the bremsstrahlung, an independent way of establishing the longitudinal polarization of the β-decay electrons is the Møller (41) scattering of the polarized electrons on polarized electrons in the target material (e.g., the ferromagnetic 3d electrons in an iron sample). The dependence of the relativistic Møller electron-electron scattering (41) on the directions of polarization of the two electrons has been investigated by Bincer (42) and by Ford and Mullin (43). Bincer (42) has investigated the electron-electron scattering for the special case that both electrons are longitudinally polarized along their relative direction of motion before the scattering, and that the initial spin directions (before the scattering) are either parallel or antiparallel. These two cross sections will be denoted by φ_p and φ_a, respectively. Bincer (42)
found that φ_p/φ_a is different from 1 for all energies and scattering angles (θ ≠ 0°), and can be as low as 0. The ratio φ_p/φ_a is given by

$$\frac{\phi_p}{\phi_a} = \frac{2x + (3x + x^2)\beta^2 + (1 + x)\beta^4}{1 + x + (2 + 3x - x^2)\beta^2 + (5 - 4x + x^2)\beta^4} \qquad (26)$$
where β is the velocity of either electron in the center-of-mass system (to be abbreviated as c.m. system), and x ≡ cos² θ, where θ is the scattering angle in the c.m. system. Equation (26) shows that φ_p/φ_a = 1 for θ = 0°, independently of β. φ_p/φ_a decreases with increasing θ up to θ = π/2, where it reaches its minimum value, which is given by
FIG. 10. The ratio φ_p/φ_a for electron-electron scattering, as a function of the relative kinetic energy transfer w. This figure is taken from the work of A. M. Bincer [Phys. Rev. 107, 1434 (1957), Fig. 1], and is reprinted with the permission of the author and the Editor of the Physical Review.
$$\left(\frac{\phi_p}{\phi_a}\right)_{\min} = \frac{\beta^4}{1 + 2\beta^2 + 5\beta^4} \qquad (27)$$

Thus, (φ_p/φ_a)_min approaches 0 as β → 0 and becomes 1/8 for β → 1. Figure 10 shows a plot of φ_p/φ_a vs. w for several values of γ. Here w is the relative kinetic energy transfer in the laboratory system; i.e., w = W/T, where W is the kinetic energy lost by the incident electron in the collision (W equals the kinetic energy of the secondary electron) and T is the initial kinetic energy of the incident electron. In Fig. 10, the values of γ affixed to the various curves represent γ = E/m = (T/m) + 1, where E is the total energy of the incident electron in the laboratory system. In view of the symmetry of the problem with respect to the two electrons, φ_p/φ_a is the same for θ and 180° − θ, and also for w and 1 − w. We have

[…] (28)
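Equation (26) can be rewritten as follows:

$$\frac{\phi_p}{\phi_a} = \frac{\gamma^2(1 + 6x + x^2) - 2\gamma(1 - x) + 1 - x^2}{8\gamma^2 - 2\gamma(4 - 5x + x^2) + 4 - 6x + 2x^2} \qquad (29)$$

The expression for (φ_p/φ_a)_min [Eq. (27)] can also be rewritten in terms of γ:

$$\left(\frac{\phi_p}{\phi_a}\right)_{\min} = \frac{(\gamma - 1)^2}{8\gamma^2 - 8\gamma + 4} \qquad (30)$$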
For w = 0.5, the two outgoing electrons are symmetric with respect to the incident direction and make an angle θ_s (in the laboratory system) with this direction, which is given by

$$\sin^2\theta_s = \frac{2}{\gamma + 3} \qquad (31)$$
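The formulas of this section lend themselves to a numerical cross-check. The sketch below evaluates the ratio φ_p/φ_a in the c.m. form of Eq. (26) and in the laboratory form of Eq. (29), together with the symmetric angle of Eq. (31). The relation β² = (γ − 1)/(γ + 1) between the c.m. velocity and the laboratory γ is the standard result for two equal masses with the target at rest; it is used here as an assumption, and all function names are illustrative.

```python
import math


def ratio_cm(beta, x):
    """phi_p/phi_a from Eq. (26); beta = c.m. velocity, x = cos^2(theta)."""
    b2, b4 = beta**2, beta**4
    num = 2 * x + (3 * x + x**2) * b2 + (1 + x) * b4
    den = (1 + x) + (2 + 3 * x - x**2) * b2 + (5 - 4 * x + x**2) * b4
    return num / den


def ratio_lab(gamma, x):
    """phi_p/phi_a from Eq. (29); gamma = E/m in the laboratory system."""
    num = gamma**2 * (1 + 6 * x + x**2) - 2 * gamma * (1 - x) + 1 - x**2
    den = 8 * gamma**2 - 2 * gamma * (4 - 5 * x + x**2) + 4 - 6 * x + 2 * x**2
    return num / den


def beta_cm_squared(gamma):
    """Assumed kinematics: c.m. velocity squared for equal masses, target at rest."""
    return (gamma - 1.0) / (gamma + 1.0)


def symmetric_angle_deg(gamma):
    """Laboratory angle of Eq. (31), in degrees, for w = 0.5."""
    return math.degrees(math.asin(math.sqrt(2.0 / (gamma + 3.0))))
```

For example, `ratio_cm(1.0, 0.0)` returns 1/8, the limiting minimum quoted in the text, and `symmetric_angle_deg(1.0)` gives the familiar nonrelativistic value of 45°.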
The strong dependence of the electron-electron scattering cross section on the initial directions of polarization of the two electrons indicates that it should be possible to measure the longitudinal polarization of electrons from β-decay by scattering from an iron target in which the spins of the two ferromagnetic 3d electrons have been aligned parallel or antiparallel to the direction of the incident (β-decay) electron. This method has been used successfully in a few experiments and confirms that the longitudinal polarization is −v/c within the experimental uncertainties (see below). The same spin-dependent effect is expected to occur when longitudinally polarized positrons (from β decay) are scattered on polarized electrons. This effect for positrons has also been considered by Bincer (42). The general expression for φ_p/φ_a will not be reproduced here, since it is quite complicated. This complication arises partly from the fact that φ_p/φ_a is no longer invariant for θ ↔ 180° − θ (θ = c.m. angle of scattering), since the positron and electron are distinguishable particles. Figure 11 shows the curves of φ_p/φ_a vs. w for various values of γ of the incident positron. In the nonrelativistic case (γ = 1), φ_p/φ_a = 1 at all angles θ, i.e., for all values of the energy transfer w (0 ≤ w ≤ 1). As can be seen from Fig. 11, however, for relativistic energies (γ > 1), φ_p/φ_a deviates appreciably from 1, except near w = 0 (for w = 0, φ_p/φ_a = 1 for all γ). At very high positron energies (γ → ∞), φ_p/φ_a is given by

$$\frac{\phi_p}{\phi_a} = \frac{1}{8}\left(1 + 6\cos^2\theta + \cos^4\theta\right) = 1 - 4w + 6w^2 - 4w^3 + 2w^4 \qquad (32)$$
This expression (for γ → ∞) is symmetric with respect to w = 0.5. It is 1 for w = 0 and w = 1 and attains its minimum value, 1/8, for w = 0.5.
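The two forms of Eq. (32) can be compared numerically. The sketch assumes the relation cos θ = 1 − 2w between the c.m. scattering angle and the fractional energy transfer, which is what makes the two expressions agree algebraically; this relation is not written out in the text above, and the function names are illustrative.

```python
def ratio_from_angle(cos_theta):
    """High-energy limit of phi_p/phi_a in terms of the c.m. angle, Eq. (32)."""
    return (1 + 6 * cos_theta**2 + cos_theta**4) / 8


def ratio_from_transfer(w):
    """The same limit written as a polynomial in the energy transfer w."""
    return 1 - 4 * w + 6 * w**2 - 4 * w**3 + 2 * w**4


def max_mismatch(steps=100):
    """Largest difference between the two forms for w on a grid in [0, 1]."""
    worst = 0.0
    for i in range(steps + 1):
        w = i / steps
        cos_theta = 1 - 2 * w  # assumed kinematic relation
        diff = abs(ratio_from_angle(cos_theta) - ratio_from_transfer(w))
        worst = max(worst, diff)
    return worst
```

The polynomial is symmetric about w = 0.5, where `ratio_from_transfer(0.5)` evaluates to 0.125.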
FIG. 11. The ratio φ_p/φ_a for positron-electron scattering, as a function of the relative kinetic energy transfer w. This figure is taken from the work of A. M. Bincer [Phys. Rev. 107, 1434 (1957), Fig. 2] and is reprinted with the permission of the author and the Editor of the Physical Review.
The large deviations of φ_p/φ_a from 1 for positron-electron scattering indicate that the scattering on polarized electrons can also be used to detect the longitudinal polarization of positrons from β-decay. Bincer (42) has also investigated the spin dependence of the scattering cross section for two fermions which are neither identical to each other, nor each other's antiparticle. This theory is applicable to the μ-e scattering, for example, and indicates large deviations of φ_p/φ_a from 1, provided that the energy of the μ-meson is sufficiently high (T_μ ≳ 2 Bev). In a separate publication, Bincer (44) has studied the influence of a possible anomalous magnetic moment (e.g., for the neutron, proton) on the polarization effects in the scattering of fermions by fermions. These results are applicable, for example, to p-p, p-n, and e-n scattering.
Ford and Mullin (43) have obtained a general expression for the scattering of electrons by electrons for arbitrary spin orientations of the two electrons. In general, they find two effects: (1) a dependence of the cross section on the relative spin orientation and (2) an up-down asymmetry in the cross section. For the special case that the incident electron is polarized either parallel or antiparallel to its direction of motion, but for arbitrary direction of the spin of the target electron, Ford and Mullin (43) obtain for the cross section

$$\frac{d\sigma^-}{d\Omega} = \frac{r_0^2}{4\gamma^2(\gamma^2 - 1)^2 \sin^4\theta}\Big\{\big[(2\gamma^2 - 1)^2(4 - 3\sin^2\theta) + (\gamma^2 - 1)^2(\sin^4\theta + 4\sin^2\theta)\big] - \big[(2\gamma^2 - 1)(4\gamma^2 - 3)\sin^2\theta - (\gamma^4 - 1)\sin^4\theta\big]\cos\psi - 2\gamma(\gamma^2 - 1)\cos\theta\,\sin^3\theta\,\sin\psi\cos\varphi\Big\} \qquad (33)$$
FIG. 12. Diagram showing spins and momenta of incident and target electrons in electron-electron scattering and notation used by G. W. Ford and C. J. Mullin [Phys. Rev. 108, 477 (1957), Fig. 3].
where r₀ = e²/mc² (classical electron radius), γ = Ē/m, where Ē is the total energy of either electron in the c.m. system, θ is the c.m. scattering angle; ψ and φ are the polar and azimuthal angles, respectively, of the spin ζ₂ of electron 2 (target electron) with respect to the incident direction and the scattering plane. Figure 12 shows the notation used. In this figure, p and ζ₁ are the c.m. momentum and spin, respectively, of the incident electron;
p′ is the momentum of one of the outgoing electrons in the c.m. system. For the case ψ = 0° or 180°, Eq. (33) reduces to the results obtained by Bincer (42). The spin of the incident electron is assumed to be parallel to its direction of motion, so that the cross sections φ_p and φ_a of Bincer (42) correspond to ψ = 0° and 180°, respectively. The spin dependence of dσ⁻/dΩ is manifested by the terms proportional to cos ψ and to sin ψ cos φ. Both terms change sign when either of the spin directions is reversed. (A reversal of the spin ζ₂ corresponds to changing ψ to 180° − ψ, and φ to 180° + φ.) In addition, the term proportional to sin ψ cos φ represents an up-down asymmetry with respect to the yz plane in Fig. 12. (A reflection with respect to the yz plane changes φ to 180° − φ, but obviously leaves ψ unchanged.) As pointed out by Ford and Mullin (43), the coefficient of cos ψ can be obtained by measuring the cross sections for parallel and antiparallel spins, whereas the coefficient of the sin ψ cos φ term can be determined by obtaining the difference in the cross sections at φ = 0 and φ = π when the two spins are perpendicular to each other (ψ = 90°). In order to discuss these effects, Ford and Mullin (43) introduce two quantities A^±(θ,γ) and B^±(θ,γ) defined as follows:
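$$A^{\pm}(\theta,\gamma) \equiv \frac{d\sigma^{\pm}(\psi = 0) - d\sigma^{\pm}(\psi = \pi)}{d\sigma^{\pm}(\psi = 0) + d\sigma^{\pm}(\psi = \pi)} \qquad (34)$$

$$B^{\pm}(\theta,\gamma) \equiv \frac{d\sigma^{\pm}(\psi = \pi/2, \varphi = 0) - d\sigma^{\pm}(\psi = \pi/2, \varphi = \pi)}{d\sigma^{\pm}(\psi = \pi/2, \varphi = 0) + d\sigma^{\pm}(\psi = \pi/2, \varphi = \pi)} \qquad (35)$$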
where the superscripts + and − refer to incident positrons and electrons, respectively. [For positrons, Ford and Mullin (43) have also obtained the explicit expression for the scattering cross section dσ⁺/dΩ for arbitrary ψ and φ, similar to Eq. (33) for incident electrons.] A⁻ and B⁻ measure the strengths of the cos ψ and sin ψ cos φ terms in (33), respectively. At θ = π/2, both A⁻ and A⁺ have their maximum values and are given by

[…] (36)

[…] (37)

For γ → ∞, both Eqs. (36) and (37) approach the same value, −7/9. In the nonrelativistic limit (γ → 1), A⁻(π/2,γ) → −1, i.e., dσ⁻(ψ = 0) = 0, which is due to the effect of the Pauli exclusion principle (42). The asymmetry coefficient B⁻(θ,γ) for electron-electron scattering is given by
$$B^-(\theta,\gamma) = -\frac{2\gamma(\gamma^2 - 1)\sin^3\theta\,\cos\theta}{(2\gamma^2 - 1)^2(4 - 3\sin^2\theta) + (\gamma^2 - 1)^2(\sin^4\theta + 4\sin^2\theta)} \qquad (38)$$
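The asymmetry coefficients can be evaluated directly from the definitions (34) and (35) applied to Eq. (33). The sketch below does this for incident electrons; it relies on a particular reading of the printed cross section, namely a cos ψ coefficient of (2γ² − 1)(4γ² − 3) sin²θ − (γ⁴ − 1) sin⁴θ and a sin ψ cos φ coefficient proportional to cos θ sin³θ, with γ the c.m. energy of either electron in units of m. These readings are assumptions, and the function names are illustrative.

```python
import math


def unpolarized_bracket(theta, gamma):
    """Spin-independent bracket of Eq. (33) (assumed reading)."""
    s2 = math.sin(theta) ** 2
    return ((2 * gamma**2 - 1) ** 2 * (4 - 3 * s2)
            + (gamma**2 - 1) ** 2 * (s2**2 + 4 * s2))


def a_minus(theta, gamma):
    """A^-(theta, gamma) from Eq. (34): minus the cos(psi) coefficient over the bracket."""
    s2 = math.sin(theta) ** 2
    coeff = (2 * gamma**2 - 1) * (4 * gamma**2 - 3) * s2 - (gamma**4 - 1) * s2**2
    return -coeff / unpolarized_bracket(theta, gamma)


def b_minus(theta, gamma):
    """B^-(theta, gamma) from Eq. (35): minus the sin(psi)cos(phi) coefficient over the bracket."""
    coeff = 2 * gamma * (gamma**2 - 1) * math.cos(theta) * math.sin(theta) ** 3
    return -coeff / unpolarized_bracket(theta, gamma)


# Example from the text: theta = pi/3 at gamma^2 = 2 (1 Mev laboratory energy).
b_example = b_minus(math.pi / 3, math.sqrt(2.0))
```

Under these assumptions `b_example` is about −0.05, `a_minus(math.pi / 2, 1.0)` is −1, and `a_minus` at θ = π/2 tends to −7/9 for large γ, consistent with the limits quoted in the text.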
At θ = π/3 and a laboratory energy of 1 Mev (corresponding to γ² ≈ 2), we have B⁻ = −0.05. Similarly small values are obtained for B⁺ at 1 Mev. It may be noted that B⁻ and B⁺ vanish both at nonrelativistic energies (γ → 1) and at very high energies (γ → ∞). It can be concluded that the coefficient of the up-down asymmetry term (∝ sin ψ cos φ) is generally much smaller than the coefficient of the cos ψ term. Ford and Mullin (43) have also considered depolarization effects in electron-electron and muon-electron scattering. For an electron with initial spin direction along the direction of motion, the probability that the final spin be parallel (ε = +1) or antiparallel (ε = −1) to the initial spin is given by

$$P(\epsilon,w) = \frac{1 + \epsilon}{2} - \epsilon\,\frac{(\gamma - 1)^2(2\gamma + 1)^2}{(2\gamma^2 - 1)^2}\,w + O(w^2) \qquad (39)$$
where w is the fractional energy transfer in the laboratory system. Equation (39) shows that the probability of a spin flip is proportional to the energy transfer w, for small values of w. The term O(w²) represents a quantity of order w², which is therefore unimportant for small w. For high-energy electrons, the great majority of the collisions correspond to a very small energy loss, and therefore the depolarization effects are expected to be negligible. Thus, as pointed out by Ford and Mullin (43), a 1-Mev electron has a most probable fractional loss w ≲ … . On the other hand, at low energies, as the electron is brought to rest and therefore suffers large fractional energy losses, the depolarization becomes important. In the nonrelativistic limit, the exact expression for P(ε,w) is given by

$$P(\epsilon,w) = \frac{1 + \epsilon}{2} - \frac{\epsilon}{2}\,\frac{w^{2}}{1 - 3w + 3w^{2}} \tag{40}$$
For nonrelativistic longitudinally polarized muons, the probability that a scattering will result in no spin-flip (ε = +1) or in a spin-flip (ε = −1) is given by
&(€,a)=
~
+ '2
E
[
(3
!f /34 sin2 P2
(i)+ (i)]
- sin4
sin6
(41)
where μ = muon mass, β = velocity of the μ in the laboratory system, and θ = c.m. scattering angle of the muon. The fractional energy loss w is given by
Thus, the second term of Eq. (41) (involving the square bracket) is of order εwβ²(m/μ), which is extremely small. Hence, for muons originating from pions which decay at rest, the depolarization is expected to be negligible
R. M. STERNHEIMER
until the muon is brought to rest. After the muon is brought to rest, there could, however, be some depolarization (by a spin-flip interaction of the μ with the surrounding electrons) before the μ meson decays.

The first experiment using the Møller (electron-electron) scattering to detect the electron polarization was carried out by Frauenfelder et al. (45) for the electrons from the decay of P³² and Pr¹⁴⁴. The experimental arrangement is schematically shown in Fig. 13. The electrons from the source are first collimated and then impinge on a magnetized Deltamax foil of thickness 2.7 mg/cm² and having an induction B = 15,000 gauss. The plane of the foil and therefore the direction of the polarization of the ferromagnetic (3d) electrons is at an angle α = 30° to the direction of the incident electron beam. The scattered electrons are recorded in two anthracene scintillation counters which are in "fast-slow" coincidence.

FIG. 13. Schematic view of the experimental arrangement of Frauenfelder et al., used to demonstrate the longitudinal polarization of electrons from the β-decay of P³² and Pr¹⁴⁴, by means of the Møller (electron-electron) scattering [H. Frauenfelder, A. O. Hanson, N. Levine, A. Rossi, and G. De Pasquali, Phys. Rev. 107, 643 (1957)].

The advantage of using two counters in coincidence to detect both scattered electrons (having about equal energies) is that this procedure eliminates the undesired large background due to Rutherford scattering as well as the complications due to plural scattering in the foil. This latter difficulty is present when the Mott (nuclear) scattering is used to detect the electron polarization (see Sec. IX). The fraction f of the electrons of the foil which are magnetized is f = 0.055 ± 0.004 under the conditions of the experiment (B = 15,000 gauss). The relative difference of the counting rates δ is defined as

$$\delta \equiv 2(C_p - C_a)/(C_p + C_a) \tag{43}$$
where C_p and C_a are the numbers of coincidences when the incident electron momentum and the polarizing magnetic field in the scattering foil are parallel and antiparallel, respectively. (Actually, these two directions are
not exactly parallel and antiparallel to each other; the angle between them is α = 30° and 180° − α = 150° in the two cases.) In terms of the longitudinal polarization P of the incident electrons, δ is given by

$$\delta = 2f\cos\alpha\,P\,(1 - e)/(1 + e) \tag{44}$$
where e = ϕ_p/ϕ_a is the ratio of the scattering cross sections for longitudinally polarized electrons with parallel and antiparallel spins, as given by Bincer (42). For P³² (1⁺ → 0⁺ transition), two energy groups of electrons were investigated. Group (1) has energies between 0.3 and 1.0 Mev, with an average (v/c)_av = 0.85. For this group, δ = −0.064 ± 0.007, whence from Eq. (44), P = −0.85 ± 0.11. Group (2) extends from 0.8 to 1.6 Mev, with (v/c)_av = 0.94. The observed δ = −0.069 ± 0.010 gives P = −0.94 ± 0.16. A check was obtained by substituting an aluminum foil for the Deltamax foil. In this case, the dependence on the magnetic field direction was zero within the experimental uncertainties (δ = −0.002 ± 0.009). For Pr¹⁴⁴ (0⁻ → 0⁺ transition), electron group (1) has energies ranging from 0.4 to 1.1 Mev, with (v/c)_av = 0.86, while the value of P obtained from the measured δ is −0.66 ± 0.18. Group (2) has energies between 1.2 and 3.0 Mev, with (v/c)_av = 0.97. The experimental value of P for this group is −1.05 ± 0.25. With the Pr¹⁴⁴ source, the aluminum scatterer again gave a negligible value of δ. It can be concluded from this experiment that both for P³² and Pr¹⁴⁴ the electrons are polarized opposite to their direction of motion, and that the magnitude of the polarization is v/c within the limits of the experimental errors.

By using the same method of the Møller electron-electron scattering, Benczer-Koller et al. (46) have recently shown that the polarization P for the electrons from Y⁹⁰ and Au¹⁹⁸ is −v/c. In this experiment, the scattering foil was a piece of Supermendur, 2 × … in. thick, which was mounted on a steel frame, making an angle of 30° with respect to the incident electron beam. The distance from the source to the center of the foil could be varied from 11 to 29 cm. A strong-focusing lens consisting of two quadrupole magnets was also used to focus the high-energy electrons.
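The relation between the counting-rate difference and the polarization in the Frauenfelder experiment, δ = 2f cos α P (1 − e)/(1 + e) [Eq. (44)], can be inverted numerically as in the following sketch. The cross-section ratio e depends on the electron energy and is not quoted for each group in the text; the value e = 0.117 used in the example is an assumption chosen only to illustrate the arithmetic, and the function names are hypothetical.

```python
import math


def delta_from_polarization(p, f, alpha_deg, e):
    """Counting-rate difference delta of Eq. (44) for longitudinal polarization p.

    f: fraction of polarized foil electrons, alpha_deg: foil angle in degrees,
    e: ratio of parallel to antiparallel Moller cross sections."""
    return 2.0 * f * math.cos(math.radians(alpha_deg)) * p * (1.0 - e) / (1.0 + e)


def polarization_from_delta(delta, f, alpha_deg, e):
    """Inverse of Eq. (44): polarization P from a measured delta."""
    analyzing_power = 2.0 * f * math.cos(math.radians(alpha_deg)) * (1.0 - e) / (1.0 + e)
    return delta / analyzing_power


if __name__ == "__main__":
    # P-32 group (1): delta = -0.064, f = 0.055, alpha = 30 deg; e = 0.117 is assumed.
    print(round(polarization_from_delta(-0.064, 0.055, 30.0, 0.117), 2))
```

Because f is only about 0.055, a fully polarized beam produces a δ of well under 10%, which is why the statistical errors on P are comparatively large.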
The detectors were two plastic scintillators in fast-slow coincidence, which were placed at an angle of 35 ± 11° to the incident electron beam in a symmetric arrangement [w = 0.5, see Eq. (31)]. Extensive tests were made on the collimation of the electron beam arriving at the magnetic foil. It was found that the use of the strong-focusing magnets increased the intensity of the high-energy electrons and suppressed the low-energy end of the spectrum, as is desired, since the high-energy electrons are the ones which give rise to the electron-electron scattering,
whereas the low-energy electrons give unwanted Coulomb (nuclear) scattering, which constitutes an undesirable background. By the use of the strong-focusing magnets, the intensity of the electrons from Au¹⁹⁸ in the high-energy region (500-960 kev) was increased by a factor of 4. The observed differences δ were as follows: For the β⁻ from Y⁹⁰ (2⁻) → Zr⁹⁰ (0⁺), with velocities v between 0.95c and 0.98c, δ with the use of the focusing magnets was (+6.95 ± 1.60)%, which gives a polarization P = (−0.93 ± 0.21)v/c. A control experiment in which an aluminum foil replaced the Supermendur gave δ = (−0.89 ± 1.11)% for electrons from Y⁹⁰ in the same velocity range (0.95c to 0.98c). For the β⁻ from Au¹⁹⁸ (2⁻) → Hg¹⁹⁸ (2⁺), with velocities v between 0.89c and 0.94c, δ with the focusing magnets was (+6.65 ± 1.33)%, giving P = (−1.02 ± 0.19)v/c. Again a control experiment with an aluminum foil gave zero difference within the experimental errors [δ = (−0.21 ± 1.30)%], as is, of course, to be expected, since Al has no polarized electrons which could be aligned with the external magnetic field. The difference δ used above is defined as

$$\delta \equiv 2(N_p - N_a)/(N_p + N_a) \tag{45}$$
where N_p and N_a are the numbers of coincidence counts when the spins of the incident electrons and the (ferromagnetic 3d) target electrons are parallel and antiparallel, respectively. Since the spin aligns itself in a direction opposite to the magnetic field direction, the present N_p corresponds to C_a of Eq. (43), and N_a corresponds to C_p, so that the present δ is equivalent to −δ of Frauenfelder et al. (45). The polarization P is obtained from the relation
$$P = -\frac{\delta}{2}\left[f\cos\alpha\,\frac{1 - \phi_p/\phi_a}{1 + \phi_p/\phi_a}\right]^{-1} \tag{46}$$
where f is the fraction of polarized electrons per atom, α = angle between the incident beam and the plane of the foil (α = 30°), and ϕ_p/ϕ_a is the ratio of the scattering cross sections determined by Bincer (42), which for the conditions of this experiment (γ ~ 3, w between … and …) is of the order of 0.1 (see Fig. 10). The fraction f was obtained by the authors (46) from the relationship:
$$B = H + 4\pi(26 f \mu_0 N_0) \tag{47}$$
where B is the induction, B = 13,000 gauss, which was obtained for a magnetic field H = 2.5 oersteds. (The remanence of the Supermendur was 11,000 gauss.) In Eq. (47), μ₀ = Bohr magneton = 0.9270 × 10⁻²⁰ gauss cm³, and N₀ = number of atoms per cm³ = 8.46 × 10²² cm⁻³. Thus, one obtains f = 0.051, which shows that the effective f is somewhat less than
the maximum expected value 2/26 = 0.077, assuming two polarized 3d electrons. The result P = −v/c of the experiment of Benczer-Koller et al. (46) provides a strong confirmation of the prediction of the two-component neutrino theory. For the case of Au¹⁹⁸, which is a first-forbidden transition with ΔJ = 0, yes (meaning parity change), the experiments of Boehm and Wapstra (47) on the β-γ circular polarization correlation (see Sec. X) show that there is a large amount of interference between the Gamow-Teller and the Fermi interaction terms. For such a β-transition, according to the twin-neutrino theory of Goeppert-Mayer and Telegdi (48), and Preston (49), the polarization should be much less than v/c. Thus, the experimental result of full (v/c) polarization for Au¹⁹⁸ provides a strong argument against the validity of the twin-neutrino theory.
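The determination of the polarized-electron fraction f from B = H + 4π(26 f μ₀N₀) [Eq. (47)] can be checked with a few lines of arithmetic. The sketch below assumes the Bohr magneton 0.9270 × 10⁻²⁰ gauss cm³ and N₀ = 8.46 × 10²² atoms per cm³ (the powers of ten are taken here as assumptions, being the values that reproduce the quoted f = 0.051); the function name is hypothetical.

```python
import math

BOHR_MAGNETON = 0.9270e-20   # gauss cm^3 (erg/gauss); exponent assumed
ATOMS_PER_CM3 = 8.46e22      # cm^-3; exponent assumed
ELECTRONS_PER_ATOM = 26      # iron, as in the factor 26 of Eq. (47)


def polarized_fraction(induction_gauss, field_oersted):
    """Fraction f of foil electrons that are polarized, solved from Eq. (47)."""
    magnetization_term = induction_gauss - field_oersted
    return magnetization_term / (
        4.0 * math.pi * ELECTRONS_PER_ATOM * BOHR_MAGNETON * ATOMS_PER_CM3
    )


if __name__ == "__main__":
    f = polarized_fraction(13000.0, 2.5)
    print(round(f, 3), round(2.0 / ELECTRONS_PER_ATOM, 3))
```

The second printed number is the ceiling 2/26 that would apply if two 3d electrons per atom were fully aligned.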
IX. DETERMINATION OF THE POLARIZATION OF ELECTRONS FROM β-DECAY BY MOTT SCATTERING OF THE ELECTRONS ON NUCLEI
The Mott scattering (50) of the electrons from β-decay has also been used to measure their polarization. In this type of experiment, the longitudinal polarization of the electrons is first transformed into a transverse polarization, for instance, by deflecting the electrons through ~90° by means of an electrostatic field. Another method consists in scattering the electrons by a large angle, as will be discussed below. After the particles have thus acquired a substantial amount of transverse polarization (σ ⊥ p), they are scattered by an angle θ in the plane perpendicular to the (σ,p) plane, which we assume to be horizontal, and the up-down asymmetry is observed. That is, the intensity of the electrons scattered through an angle θ in the upward direction is different from the intensity of the electrons scattered through the same angle θ in the downward direction. As was first shown by Mott (51) in 1929, the asymmetry in the scattering of transversely polarized electrons is largest for heavy elements and for large scattering angles (θ ~ 90°-150°).

In connection with the transformation of the longitudinal polarization of the beta particles into a transverse polarization, Case (52) has given a simple discussion of the behavior of the spin of a Dirac particle in an external electromagnetic field. The results of Case (52) have been previously derived by Tolhoek and de Groot (53). Let p be the (ordinary) kinetic momentum of the particle (electron, muon, etc.). In the presence of an electromagnetic field with vector potential A, p is given by

$$\mathbf{p} = \mathbf{P} - (e/c)\mathbf{A} \tag{48}$$

where
P is the total momentum of the particle [which is represented by
the quantum mechanical operator (ħ/i)∇, where ∇ is the gradient operator]. The Hamiltonian ℋ of the system particle + field is given by

$$\mathcal{H} = c\rho_1\,\boldsymbol{\sigma}\cdot\mathbf{p} + \rho_3 mc^{2} + e\phi \tag{49}$$
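where ρ₁ and ρ₃ are the usual matrices introduced by Dirac (54), and ϕ is the scalar potential for the external field. From the commutator [ℋ, σ·p], Case (52) obtains

$$\frac{d(\boldsymbol{\sigma}\cdot\mathbf{p})}{dt} = e\,\boldsymbol{\sigma}\cdot\mathbf{E} \tag{50}$$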
where E is the external electric field. Thus, for a pure magnetic field (E = 0), σ·p is a constant of the motion. From this property, two conclusions can be drawn:
1. A state of longitudinal polarization cannot be changed to a state of transverse polarization by using purely magnetic fields.
2. A longitudinally polarized beam will never be depolarized on passing through a purely magnetic field.
For the case of a pure electric field, Case (52) considers the commutator [ℋ, p] and finds

$$\frac{d\mathbf{p}}{dt} = e\mathbf{E} \tag{51}$$

From Eqs. (50) and (51), one obtains

$$\mathbf{p}\cdot\frac{d\boldsymbol{\sigma}}{dt} = 0 \tag{52}$$
An important application of this equation concerns the μ-meson experiment of Garwin et al. (3). If in slowing down the μ-mesons, only those moving in the initial direction (z direction) are considered, Eq. (52) becomes

$$d\sigma_z/dt = 0 \tag{53}$$
Hence, if these μ's are originally longitudinally polarized, they will remain so, with the same amount of polarization, after the slowing down. For the situation where both electric and magnetic fields are present, Case (52) has also derived appropriate equations for dp/dt and p·(dσ/dt). He has thus shown that when a longitudinally polarized beam moves perpendicular to a magnetic field, Eq. (52) is still valid.

Tolhoek (35) has discussed in detail the rotation of the spin vector in transverse and longitudinal electric and magnetic fields. We shall first consider the case of a transverse electric field, such as exists between the plates of a cylindrical condenser (electrostatic deflector). For this case, Tolhoek (35) obtains
$$\Delta\alpha/\Delta\gamma = T_e/E_e \tag{54}$$
where Δα is the angle by which the spin vector σ is rotated, Δγ is the angle of deflection of the beam, and T_e and E_e are the kinetic and total energies of the electron, respectively. For nonrelativistic electrons, T_e/E_e ≈ v²/2c² is negligible, so that Δα ≅ 0. Hence, in this case, the spin direction remains unchanged, and by deflecting the beam through 90°, an initial longitudinal polarization can be transformed into a transverse polarization. In order to accomplish the same objective for relativistic electrons, the deflection angle must be larger than 90°, namely, (π/2)(1 − T_e/E_e)⁻¹. For a transverse magnetic field, Tolhoek (35) gives the result

$$\Delta\alpha/\Delta\gamma = 1 \tag{55}$$
independently of the electron energy. Thus, for a pure transverse magnetic field, the spin vector follows exactly the momentum vector, and therefore the degree of polarization (longitudinal or transverse) is unchanged. This result is, of course, identical with that obtained from Eq. (50) with E = 0.

For a beam with transverse polarization P (and σ perpendicular to the plane of scattering) the ratio of scattered intensities in both azimuthal directions (upward and downward in the example discussed above) is given by
where a(θ) is a function, first calculated by Mott (51), which depends on the atomic number of the scatterer, the incident electron energy, and the angle of scattering θ. The most complete recent calculation of a(θ) has been carried out by Sherman (55), who has tabulated a(θ) at intervals of 15° for various values of the electron velocity β, for three elements: mercury (Z = 80), cadmium (Z = 48), and aluminum (Z = 13). In addition to the function a(θ), the real and imaginary parts of the Coulomb wave functions F and G are also tabulated, together with the differential cross section dσ/dΩ (for unpolarized incident electrons). As mentioned above, |a(θ)| is largest for heavy elements and large values of θ. Thus, for β = 0.6 (T_e = 128 kev) and Z = 80, a(θ) = −0.062 at θ = 60°, −0.271 at 90°, −0.424 at 120°, and −0.337 at 150°. The function a(θ) is zero at θ = 0° and 180° for all energies, and at β = 1 for all angles θ. In spite of the increase of |a(θ)| from 0.271 at 90° to 0.424 at 120° and 0.418 at 135° in the above example (β = 0.6), it has been found desirable to work at ~90° because of the rapid decrease of the cross section with increasing angle. Among the earlier determinations of a(θ), we may mention the calculations of Mott (51), Bartlett and Watson (56), and Mohr and Tassie (57).
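The kinematic quantities and the two analyzer relations discussed in this passage lend themselves to a short numerical sketch. It assumes an electron rest energy of 511 kev, uses the deflection condition (π/2)(1 − T_e/E_e)⁻¹ that follows from Eq. (54), and assumes that the intensity ratio of Eq. (56) has the form R = [1 + P a(θ)]/[1 − P a(θ)], a form which reproduces the polarizations quoted later in this section for the Co⁶⁰ data. All function names are hypothetical.

```python
import math

REST_ENERGY_KEV = 511.0  # assumed electron rest energy


def beta_from_kinetic(t_kev):
    """Velocity v/c of an electron with kinetic energy t_kev."""
    gamma = 1.0 + t_kev / REST_ENERGY_KEV
    return math.sqrt(1.0 - 1.0 / gamma**2)


def kinetic_from_beta(beta):
    """Kinetic energy in kev of an electron with velocity beta = v/c."""
    return REST_ENERGY_KEV * (1.0 / math.sqrt(1.0 - beta**2) - 1.0)


def electrostatic_deflection_deg(t_kev):
    """Deflection angle (degrees) that leaves the spin perpendicular to the
    momentum, i.e. 90 deg / (1 - T_e/E_e), from Eq. (54)."""
    ratio = t_kev / (t_kev + REST_ENERGY_KEV)
    return 90.0 / (1.0 - ratio)


def mott_intensity_ratio(p, a):
    """Assumed form of Eq. (56): ratio of intensities scattered to the two sides."""
    return (1.0 + p * a) / (1.0 - p * a)


def polarization_from_ratio(r, a):
    """Inverse of the assumed Eq. (56): transverse polarization from the ratio r."""
    return (r - 1.0) / (a * (r + 1.0))


if __name__ == "__main__":
    print(round(kinetic_from_beta(0.6)))                    # kev for beta = 0.6
    print(round(electrostatic_deflection_deg(100.0), 1))    # degrees at 100 kev
    print(round(mott_intensity_ratio(-0.6, -0.271), 3))     # beta = 0.6, 90 deg, Z = 80
```

The deflection angle grows with energy because the spin is dragged along by the fraction T_e/E_e of the orbital rotation, so a plain 90° deflector is adequate only for slow electrons.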
It may be remarked that the function a(θ) also enters into the related problem (not directly applicable here) of the double scattering of an initially unpolarized beam of electrons (51). In this case, the polarization P after a single scattering through an angle θ₁ is given by a(θ₁), and the direction of σ after the scattering is perpendicular to the plane of the scattering. After a second scattering, the relative intensity of the beam as a function of the angle φ between the first and the second plane of scattering is given by

$$I(\theta_1,\theta_2,\varphi) = 1 + a(\theta_1)\,a(\theta_2)\cos\varphi \tag{57}$$

where θ₂ is the angle of the second scattering.
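To give a feeling for the size of the double-scattering effect, the sketch below evaluates I(θ₁,θ₂,φ) = 1 + a(θ₁)a(θ₂) cos φ [Eq. (57)] using Sherman's value a(90°) = −0.271 for β = 0.6 and Z = 80 for both scatterings. Treating both foils as identical ideal single scatterers is an assumption made only for this illustration.

```python
import math


def double_scattering_intensity(a1, a2, phi_deg):
    """Relative intensity of Eq. (57) for asymmetry functions a1 = a(theta_1),
    a2 = a(theta_2) and angle phi_deg between the two scattering planes."""
    return 1.0 + a1 * a2 * math.cos(math.radians(phi_deg))


if __name__ == "__main__":
    a_90 = -0.271  # a(90 deg) for beta = 0.6, Z = 80
    same_side = double_scattering_intensity(a_90, a_90, 0.0)
    opposite_side = double_scattering_intensity(a_90, a_90, 180.0)
    asymmetry = (same_side - opposite_side) / (same_side + opposite_side)
    print(round(same_side, 4), round(opposite_side, 4), round(asymmetry, 4))
```

The resulting asymmetry is simply the product a(θ₁)a(θ₂), about 7% here, which shows why a beam that is already polarized by β-decay gives a much larger signal than an unpolarized beam analyzed by double scattering.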
FIG. 14. Schematic view of the experimental arrangement of de-Shalit et al. [A. de-Shalit, S. Kuperman, H. J. Lipkin, and T. Rothem, Phys. Rev. 107, 1459 (1957)] used to determine the longitudinal polarization of electrons from the β-decay of P³², by means of the Mott (nuclear) scattering.
Equation (57) applies if there is no magnetic field between the two scatterers. The situation where there is a magnetic field between scatterers 1 and 2 has been discussed by Mendlowitz and Case (58). This type of experiment (with magnetic field) can be used to determine the anomalous magnetic moment of the electron (59), i.e., the deviation from 2 of the gyromagnetic ratio g.

As an example of the detection of the polarization of β particles by means of the Mott scattering, we shall discuss the experiment of de-Shalit et al. (60) on the polarization of the electrons from the β-decay of P³². The
experimental arrangement is schematically shown in Fig. 14. The electrons are first scattered through 90° by a semicircular aluminum foil σ₁ (0.05 cm thick), which transforms their longitudinal polarization into a partial transverse polarization. In Fig. 14, the foil σ₁ is in a plane perpendicular to the plane of the paper. The diameter of the semicircle describing the foil is along the line joining the source S to the foil σ₂ (2.5 mg/cm² Au) where the electrons undergo a second scattering. This property of σ₁ ensures that the first scattering angle is 90°. A lead shield placed midway between S and σ₂ prevents direct (nonscattered) electrons emitted by the source from reaching σ₂. The second scattering (at σ₂) takes place in the plane of the paper in Fig. 14, and the electrons which are scattered to the right and to the left by 75° are recorded in the counters C_R and C_L.
FIG. 15. Diagram showing momentum and magnetic moment of electrons in the double scattering experiment of de-Shalit et al. [A. de-Shalit, S. Kuperman, H. J. Lipkin, and T. Rothem, Phys. Rev. 107, 1459 (1957)]. This figure applies for the nonrelativistic electrons from the β-decay. For relativistic energies, the situation is more complicated [see references (61) and (62)].
The situation as concerns the relative direction of the magnetic moment and the electron momentum is shown schematically in Fig. 15 for the case of nonrelativistic electrons. The electron is initially polarized with its spin s in the direction opposite to its direction of motion. Thus, since the magnetic moment is μ = (eħ/mc)s, where e, the charge of the electron, is negative, μ will be parallel to the electron momentum p, as shown in Fig. 15. After the scattering at σ₁, p is rotated by 90°, but the direction of μ is unchanged, so that μ is then at right angles to p (transverse polarization).
Finally, with μ pointing upward, the right-left asymmetry is measured. A simple qualitative argument given by de-Shalit et al. (60) shows that with the magnetic moment direction as shown in Fig. 15, there will be more particles scattered to the left (into the plane of the paper) than to the right (out of the plane). This is indeed observed, as will now be discussed. The measured right-left asymmetry was

$$\frac{N_L - N_R}{\tfrac{1}{2}(N_L + N_R)} = (5.1 \pm 0.6) \times 10^{-2} \tag{58}$$
where N_R and N_L are the counting rates in the counters C_R and C_L, respectively. The sign of the asymmetry (N_L > N_R) shows that the electrons are initially polarized longitudinally, with the spin pointing backwards. The magnitude of the asymmetry is compatible with full polarization, i.e., P = −v/c.

As noted above, the considerations presented in connection with Fig. 15 apply to nonrelativistic electrons. However, in the experiment of de-Shalit et al. (60), only relativistic electrons with energies T_e between 0.9 Mev and the maximum energy 1.7 Mev were included. Gürsey (61) has given a treatment of the Coulomb scattering of polarized relativistic electrons (see also Tassie, 62). This author has calculated that for the experiment of de-Shalit et al. (60), with a mean kinetic energy ⟨T_e⟩ = 1.15 Mev, the asymmetry δ is expected to be 9% for complete longitudinal polarization of the incident electron beam. The difference between the observed value, (5.1 ± 0.6)%, and the theoretical result can probably be attributed to effects of plural scattering in the two scatterers σ₁ and σ₂.

In a later experiment, Lipkin et al. (63) measured the β-ray polarization for Au¹⁹⁸ by the same method as used by de-Shalit et al. (60). They found that both for Sn and Au foils used as scatterers σ₂, the asymmetry was the same for Au¹⁹⁸ electrons as for P³² electrons in the same energy range. Thus, for a gold foil of thickness 1.3 × … cm, δ = (8.6 ± 1.0)% for Au¹⁹⁸, as compared with δ = (8.7 ± 0.7)% for P³². This result implies that the Au¹⁹⁸ electrons are fully polarized, if one accepts the result of full polarization for the P³² electrons obtained by Frauenfelder et al. (45). By comparing in this manner the observed asymmetries for two nuclei, it is not necessary to make complicated corrections for various experimental effects (e.g., plural scattering) which would enter into an absolute determination of the polarization (61).
As discussed above, the result of full polarization for Au¹⁹⁸ was also obtained by Benczer-Koller et al. (46) from a measurement of the Møller (electron-electron) scattering of the β-particles in magnetized iron.

One of the first experiments on the electron polarization by means of the Mott scattering was carried out by Frauenfelder et al. (64), who measured the polarization of the electrons from the Co⁶⁰ decay. In this experiment, the electrons were deflected by 108° in an electrostatic field, so that the spin was approximately perpendicular to the direction of motion after the deflection. The polarization analyzer consisted of a gold scattering foil (0.15 or 0.05 mg/cm²), and the asymmetry in the scattering was measured for scattering angles θ in the region from ~95° to 140°. The measurements were carried out for three different groups of Co⁶⁰ electrons having energies T_e = 50, 68, and 77 kev. The values of the polarization P obtained from the data are −0.04 for T_e = 50 kev (v/c = 0.41), −0.16 for T_e = 68 kev (v/c = 0.47), and P = −0.40 and −0.35 for the two runs at T_e = 77 kev (v/c = 0.49). The left-right asymmetry N_L/N_R of the counting rates N_L and N_R was very pronounced for the two runs at 77 kev. The two values of N_L/N_R were 1.35 ± 0.06 and 1.30 ± 0.09; these asymmetry ratios will be denoted by R_a and R_b, respectively. In view of Eq. (56), P is given by

$$P = \frac{1}{a(\theta)}\,\frac{R - 1}{R + 1} \tag{59}$$
where R = N_L/N_R. From the tables of Sherman (55), the value of a(θ) for Z = 80, θ = 110°, β = 0.49, is a(θ) = −0.37. One obtains for R_a: P(R_a) = −0.40 ± 0.06, and for R_b: P(R_b) = −0.35 ± 0.09. Thus, the observed polarization is a large fraction (70-80%) of the predicted value, P = −v/c = −0.49. The relatively small discrepancy could be due to depolarization effects in the source and in the analyzer.

In an experiment similar to that of Frauenfelder et al. (64), De Waard and Poppema (65) have measured the longitudinal polarization of the β-particles from Co⁶⁰, P³², Tm¹⁷⁰, and Au¹⁹⁸. The electrons were deflected by 90° in an electrostatic field, and were subsequently scattered from a gold foil (thickness ~200 μg/cm²) at angles from 50° to 87°. The electron velocity in this experiment was v = 0.66c (T_e = 168 kev). For Co⁶⁰, the experimental value of the polarization was P = −0.49 ± 0.11, as compared with the theoretical prediction, P = −0.66. It was suggested by Cavanagh et al. (66) that the observed |P| would be increased to a value close to 0.66, if the necessary correction for plural scattering in the gold foil were applied to the measured asymmetry.

Cavanagh et al. (66) have pointed out that, in order to obtain more accurate results with the Mott scattering method, it is advantageous to use a system of crossed electric and magnetic fields, instead of a pure electrostatic field, to produce the transverse polarization. In this case, the focusing condition for the particles becomes identical with the condition for turning the spin through 90°. Cavanagh et al. (66) have used this method to determine the longitudinal polarization of the electrons from Co⁶⁰. The crossed fields satisfy the condition E/H = β, where E = electric field, H = magnetic field, and β = v/c of the electrons. As a result, the electrons having the desired velocity travel along the central axis of the crossed-field region without any deflection. The electrons were injected from a thin-lens spectrometer. The value of β was taken as 0.6, to make use of the fact that the asymmetry function |a(θ)| for 90° and Z = 80 is highest for β = 0.6 [a(90°) = −0.271] (55). This corresponds to a kinetic energy T_e = 128 kev. The electrons are then scattered through 90° by a thin gold foil, which is placed in a transmission position at 60° to the incident beam, in order to reduce effects of plural scattering in the foil (35). Concerning the spin rotator, the two plates providing the electric field were 20 cm long, with a gap distance of 2.8 cm. The voltage could be varied up to 70 kv applied symmetrically to both plates, and the magnetic field H was of the order of 100 oersteds. The required electric field for H = 100 oersteds and β = 0.6 is
$$E = 300 \times 100 \times 0.6 = 18{,}000\ \text{v/cm} \tag{60}$$
and the required potential difference ΔV across the 2.8-cm gap is, therefore,

$$\Delta V = 18 \times 2.8 = 50.4\ \text{kv} \tag{61}$$

The angle χ (in radians) by which the spin is turned is given by (67)

$$\chi = 300\,H_{\text{oerst}}\,l_{\text{cm}}\,(1 - \beta^{2})^{1/2}/p_{\text{ev}/c} \tag{62}$$
where H_oerst is the field in oersteds, l_cm = length of the plates in cm, and p_ev/c is the momentum in ev/c. The detector consists of a scintillation counter which counts electrons scattered by 90° in the gold foil. The reason for choosing 90° and not some larger angle where |a(θ)| is higher than for 90° is that the cross section dσ/dΩ decreases rapidly with angle, as was mentioned above. Thus, for θ = 120°, where |a(θ)| has its maximum value for β = 0.6 and Z = 80 [a(120°) = −0.424], dσ/dΩ is down by a factor 2.1 from its value at 90° (dσ/dΩ = 2.00 × 10³ barn/ster at 120° as compared with 4.29 × 10³ barn/ster at 90°) (55). As a result, the observed asymmetry would be changed appreciably from the single-scattering value by the admixture of some plural scattering, if the angle θ were too large, so that the plural scattering would predominate. The thickness of the gold foil must also be held small enough so that the corrections due to secondary processes (plural scattering) will be negligible. A discussion of these effects is given in the review article of Tolhoek (35). In their double scattering experiment, Ryu, Hashimoto, and Nonaka (68) found that the thickness t of the gold foil for the scattering (analyzer) should not exceed 100 μg/cm², to ensure that secondary corrections are not excessive. Actually, aside from several runs with t ~ 100 μg/cm², the experiment of Cavanagh et al. (66) was also
carried out for several thicknesses t between 100 μg/cm² and 1 mg/cm² to estimate the corrections due to plural scattering. Cavanagh et al. (66) obtained approximately the expected sin φ dependence of the asymmetry of the counting rate on the angle φ between the spin direction and the azimuthal angle of the detector. The latter could be rotated in a plane perpendicular to the central axis of the spin rotator, so as to vary φ. Thus, for a gold foil of thickness 180 μg/cm², the counting rate N was given by

$$N = N_0[1 + A\sin(\varphi + \delta)] \tag{63}$$
where the size of the asymmetry A was 0.11, and the constant phase angle δ = 25° arose from certain instrumental misalignments. [The beam made a small angle (~3°) with the axis of the spin rotator and was also displaced from the axis by a small amount.] With increasing thickness t, δ increases. Thus, δ = 60° for t = 770 μg/cm² of Au. However, such large thicknesses were not weighted strongly in obtaining the asymmetry A(t) extrapolated to zero thickness t. The latter, A(0), is 0.159. This value must be increased by 10% to take into account the multiple scattering and the wide-angle scattering in the source. This gives A = 0.173 ± 0.035. In obtaining A(0), the authors (66) used the approximate relation A(t) = A(0)/(1 + ct), where c is a constant if double scattering is the dominant process producing the secondary effects. From the experimental value of A and from the value (55) of a(90°) = −0.267 for Z = 79, β = 0.6, one obtains P = A/a = −0.65 ± 0.13, which is consistent with the prediction of the two-component neutrino theory, P = −β = −0.6. Cavanagh et al. (66) have also measured P for 129-kev electrons from Au¹⁹⁸, and have obtained P = (−0.97 ± 0.20)v/c, in good agreement with the theoretical prediction and with the experimental results of Benczer-Koller et al. (46) and Lipkin et al. (63).

An experiment similar to that of Cavanagh et al. (66), using crossed electric and magnetic fields, and Mott scattering, has been recently carried out by Alikhanov et al. (67). These workers have measured the longitudinal polarization P of the electrons from a Sr-Y source, corresponding to the transitions: Sr⁹⁰(0⁺) → Y⁹⁰(2⁻) → Zr⁹⁰(0⁺), and Sr⁸⁹(5/2⁺) → Y⁸⁹(1/2⁻). The energy of the electrons involved in the experiment was T_e = 300 kev (β = 0.78). The effective length of path in the crossed fields was l = 27 cm. The gap between the condenser plates was 12 mm. The gold scatterer was placed in the transmission position at 45° to the beam axis.
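The crossed-field relations of Eqs. (60) to (62) and the final step P = A/a of the Cavanagh analysis can be reproduced with the sketch below. It assumes an electron rest energy of 511 kev when forming the momentum in ev/c, and it uses H = 100 oersteds with the 20-cm plates purely as illustrative inputs, since the text states only that H was of the order of 100 oersteds. The function names are hypothetical.

```python
import math

REST_ENERGY_EV = 511.0e3  # assumed electron rest energy in ev


def required_electric_field(h_oersted, beta):
    """Electric field in v/cm satisfying E/H = beta, as in Eq. (60)."""
    return 300.0 * h_oersted * beta


def plate_voltage_kv(e_field_v_per_cm, gap_cm):
    """Potential difference across the gap in kv, as in Eq. (61)."""
    return e_field_v_per_cm * gap_cm / 1000.0


def spin_rotation_angle(h_oersted, length_cm, beta):
    """Spin rotation angle chi in radians from Eq. (62)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    momentum_ev = beta * gamma * REST_ENERGY_EV          # p in ev/c
    return 300.0 * h_oersted * length_cm * math.sqrt(1.0 - beta**2) / momentum_ev


if __name__ == "__main__":
    e_field = required_electric_field(100.0, 0.6)
    print(round(e_field), round(plate_voltage_kv(e_field, 2.8), 1))
    chi = spin_rotation_angle(100.0, 20.0, 0.6)
    print(round(chi, 3), round(math.degrees(chi), 1))
    # Field that would give a full 90-degree rotation over the same plates:
    print(round(100.0 * (math.pi / 2.0) / chi, 1))
    # Polarization from the extrapolated asymmetry, P = A / a(90 deg):
    print(round(0.173 / -0.267, 2))
```

With these illustrative inputs the rotation is somewhat less than 90°, so a field slightly above 100 oersteds (with E raised in proportion) would be needed for a full quarter turn; this is consistent with the statement that H was only of the order of 100 oersteds.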
Electrons scattered through an angle of 90 ± 4° were counted by two Geiger counters in coincidence. The counters could be rotated about the beam axis in a plane perpendicular to the direction of the beam before the scattering. The calculated asymmetry for the scattering of 300-kev electrons at
angles φ = 90° and 270° to the spin direction was δ_calc = 21.8%, assuming that the polarization P = −v/c. The actual measured asymmetry was δ_exp = (17.4 ± 4.3)%. Thus, the experimental value of the polarization |P| is

$$|P| = \frac{(17.4 \pm 4.3)}{21.8}\,\frac{v}{c} = (0.80 \pm 0.20)\,\frac{v}{c}$$
However, the measured asymmetry should be increased by 13% to correct for multiple scattering effects. This gives |P| = (0.90 ± 0.23)v/c, which is in essential agreement with the theoretical prediction. An additional experiment was carried out by Alikhanov et al. (67) using 750-kev electrons from the decay of Y⁹⁰ and Sr⁸⁹. In this case, the measured asymmetry was δ_exp = (7.8 ± 2.5)%, as compared with the calculated value δ_calc = 6.8%. This gives |P| = (1.15 ± 0.4)v/c for 750-kev electrons. The experiment was also repeated for T_e = 300 kev under slightly different conditions (E = 20 kv/cm, H = 86 oersteds, scattering angle = 105°). The azimuthal asymmetry was 35.5%. The resulting electron polarization |P| is (1.10 ± 0.19)v/c. The mean value of |P| for both experiments at 300 kev is |P| = (1.02 ± 0.15)v/c, in good agreement with the prediction of the two-component neutrino theory. In all cases, the sign of the asymmetry was that to be expected for electrons whose spin direction is opposite to the direction of motion.
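The arithmetic behind the Alikhanov results can be retraced as follows. The sketch forms |P| as the ratio of measured to calculated asymmetry and then combines the two 300-kev determinations with inverse-variance weights. The text does not say how the mean was formed, so the weighting scheme is an assumption; it happens to reproduce the quoted (1.02 ± 0.15)v/c. The function names are hypothetical.

```python
import math


def polarization_from_asymmetry(measured, measured_err, calculated):
    """|P| in units of v/c, and its error, from measured and calculated asymmetries."""
    return measured / calculated, measured_err / calculated


def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its standard error (assumed method)."""
    weights = [1.0 / err**2 for err in errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, 1.0 / math.sqrt(total)


if __name__ == "__main__":
    p300, dp300 = polarization_from_asymmetry(17.4, 4.3, 21.8)
    p750, dp750 = polarization_from_asymmetry(7.8, 2.5, 6.8)
    print(round(p300, 2), round(dp300, 2))
    print(round(p750, 2), round(dp750, 2))
    mean, err = weighted_mean([0.90, 1.10], [0.23, 0.19])
    print(round(mean, 2), round(err, 2))
```

Note that the first 300-kev value entering the mean is the one already corrected by 13% for multiple scattering, not the raw ratio 0.80.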
X. EXPERIMENTS ON THE LONGITUDINAL POLARIZATION OF POSITRONS FROM β-DECAY. β-γ CIRCULAR POLARIZATION CORRELATION EXPERIMENTS

The circular polarization of the bremsstrahlung, as well as the Møller and the Mott scattering, which were discussed above, have been used primarily for the electron β-emitters. Various other methods based on the annihilation properties of positrons have been used for the positron β-emitters. In this section we shall describe these experiments on the longitudinal polarization of positrons and shall also give a brief discussion of the β-γ circular polarization correlation experiments, which have given valuable information on the β-decay interaction for various electron emitters.

The experiment of Page and Heinberg (69) on the polarization of positrons from Na²² is based on the properties of the triplet (1 ³S₁,₀) and singlet (1 ¹S₀,₀) states of positronium (70). In certain gases such as argon or propane, the positronium is formed with rather large kinetic energies, and is subsequently thermalized (i.e., slowed down) at such a rate that the singlet state, with lifetime τ ~ 10⁻¹⁰ sec, retains most of its initial motion at the time of annihilation, whereas for the triplet state, with τ ~ 3 × … sec,
PARITY NONCONSERVATION IN WEAK INTERACTIONS
most of the initial motion is lost by the time it undergoes two-photon annihilation (of course, in the presence of a magnetic field). Therefore, the angle θ_γγ between the two annihilation photons will be on the average closer to 180° for the ^3S_{1,0} states than for the ^1S_{0,0} states. If one requires strict angular correlation at θ_γγ = 180°, with a suitable gas (e.g., argon) the relative efficiency for triplet/singlet states can be made ~1.5, in the presence of the background of the other two-quantum annihilation events. When a magnetic field H is applied to the gas sample, the positrons are preferentially captured into the triplet state if the positron spin is antiparallel to H and into the singlet state if the spin is parallel to H. Making use of this property, Page and Heinberg (69) applied fields H of 10-15 kilogauss to various gas samples and obtained the relative difference δ in the annihilation yield (for θ_γγ very close to 180°) with the field H parallel and antiparallel to the direction of motion of the positrons from the Na22 source. From the observed values and the sign of δ, Page and Heinberg (69) concluded that the positrons are polarized along their direction of motion, the value of the polarization P being greater than 0.4 (v/c), where (v/c) = 0.75 is the average value of v/c for the e+ from Na22. The fact that less than the full expected value of P (= (v/c)) was obtained may be due to several interfering effects: (1) backscattering from the Na22 source and its mounting; (2) partial depolarization of the positrons prior to the formation of positronium; (3) depolarization of the e+ in the positronium "atom" before the annihilation process takes place (mixing of the magnetic substates of 1S positronium).
Hanna and Preston (71) have demonstrated the longitudinal polarization of the positrons from Cu64 by annihilation of the positrons in magnetized iron.
A suitable part of the annihilation spectrum was selected by appropriate collimation, namely that part which corresponds to a large momentum for the electrons of the target material (Fe) against which the positrons annihilate. This was done by obscuring the central part of the angular distribution of the annihilation radiation (angle θ_γγ between the two γ's = 180°), so that only values of θ_γγ which differ from 180° by more than 8.5 milliradians were included. With this arrangement, it was found that the annihilation (two-γ) yield Y is consistently higher by (5 ± 1)% with the magnetizing field H (around the iron) parallel to the positron direction of motion than with H antiparallel. As a check on the experiment, the Fe sample was replaced by a Cu sample, and the field-dependent effect on Y was found to vanish. Hanna and Preston (71) have interpreted their results as follows. The high-momentum Fe electrons which are involved in this experiment are mostly the polarized 3d electrons, whose spin is aligned antiparallel to the direction of the applied field H. Thus the fact that the annihilation rate is larger for H parallel to the direction of motion of the positrons indicates that the positrons are polarized parallel to their direction of motion and still retain a substantial part of their original polarization at the time of the annihilation.
R. M. STERNHEIMER
In a recent paper, Hanna and Preston (72) have given the results of additional experiments with their arrangement, using samples of Fe, Fe-Co, Ni, Cu, and Gd, in which the annihilation takes place. The asymmetry in this experiment was defined as δ = (N+ - N-)/N-, where N+ and N- denote the counting rates for field parallel and antiparallel, respectively, to the direction of motion of the positrons. The eclipsing angle was 8 milliradians, i.e., only annihilation events with angles α = 180° - θ_γγ larger than the eclipsing angle were included. The values of δ were obtained as a function of positron energy by interposing various thicknesses of Al foil between the Cu64 source and the sample. Thus, for an Fe-Co sample, it was found that δ increases from (5.4 ± 0.8)% at T_e = 0.33 Mev to (11 ± 2.5)% at T_e = 0.50 Mev. This increase is not primarily due to the variation of v/c (= polarization P), which increases by only 9% (from v/c = 0.79 at 0.33 Mev to 0.86 at 0.50 Mev). The increase of δ is rather due to the improved directionality of the high-energy positrons, which results in an increase of the polarization along the direction of the magnetic field in the sample. With a thin source (0.002 in.), the Fe-Co sample gives a somewhat smaller δ at 0.33 Mev: δ = (4.4 ± 1.2)%, as compared with (5.4 ± 0.8)% for the thick source (0.005 in.). The reason for this difference is that with increasing thickness, the emerging positrons have on the average a higher energy at creation, and a correspondingly higher polarization P. A steel sample and an Armco sample gave values of δ of the same order as for Fe-Co. By contrast, a Ni sample gave δ = 0 within the experimental errors [δ = (-0.3 ± 0.9)% at 0.33 Mev], even though the Curie temperature for Ni, T_C = 631° K, is considerably above room temperature.
Hanna and Preston (72) attribute this unexpected result for Ni to a possible difference of the spatial and momentum distribution of the polarized electrons in the solid, as compared to Fe and Co. The authors (72) also mention the possibility that the positron waves inside the Ni sample may not penetrate the regions where the polarized electrons are found with appreciable probability. For gadolinium at -100° C, δ was also zero [δ = (0.0 ± 1.8)% at 0.43 Mev], even though Gd is strongly ferromagnetic at this temperature (Curie temperature T_C = 289° K). This result is probably due to the fact that the ferromagnetic electrons of Gd, being 4f electrons, are localized in the internal regions of the atom and are therefore very effectively shielded from the incident positrons. A Cu sample gave no effect, as would be expected from the absence of ferromagnetism. It is apparent from these results that investigations with polarized positrons can give valuable information about the spatial and momentum distribution of the polarized electrons in ferromagnetic materials, and hence ultimately about the wave functions of these electrons.
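The statement that v/c rises by only about 9% between 0.33 and 0.50 Mev follows from relativistic kinematics, and the asymmetry definition δ = (N+ - N-)/N- is easily evaluated. The following is a minimal sketch, assuming a rest energy of 0.511 Mev for the positron; the function names and the counting rates in the last line are hypothetical and serve only as an illustration.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.511  # m c^2 of the electron or positron (assumed value)


def beta_from_kinetic_energy(t_mev):
    """Return v/c for an electron or positron of kinetic energy t_mev (Mev)."""
    gamma = 1.0 + t_mev / ELECTRON_REST_ENERGY_MEV
    return math.sqrt(1.0 - 1.0 / gamma ** 2)


def asymmetry(n_plus, n_minus):
    """delta = (N+ - N-) / N-, for field parallel (N+) and antiparallel (N-)."""
    return (n_plus - n_minus) / n_minus


beta_low = beta_from_kinetic_energy(0.33)
beta_high = beta_from_kinetic_energy(0.50)
print(f"v/c at 0.33 Mev: {beta_low:.2f}")
print(f"v/c at 0.50 Mev: {beta_high:.2f}")
print(f"relative increase: {100 * (beta_high / beta_low - 1):.0f}%")

# Hypothetical counting rates, chosen only to give a 5.4% asymmetry.
print(f"delta = {100 * asymmetry(1054, 1000):.1f}%")
```

The computed velocities agree with the values 0.79 and 0.86 quoted in the text, which supports the argument that the doubling of δ cannot be attributed to v/c alone.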
In an important experiment, Deutsch et al. (73) showed that the positrons from Ga66 and Cl34 are polarized along their direction of motion. This was accomplished by making use of the fact that high-energy photons from two-quantum annihilation are almost 100% circularly polarized in the direction of the positron spin (74). The circular polarization was detected by means of the Compton scattering in magnetized iron, i.e., from the dependence of the transmission of the γ-rays on the direction of the applied magnetic field (34, 35). The annihilation took place in a Lucite converter. As a typical result, for γ-rays of energy T_γ = 3 Mev from Ga66, the observed difference in transmission with a thick iron analyzer (12 cm) was (8.4 ± 0.5)%, as compared with the theoretical value (8.8 ± 1.0)%, which assumes full polarization of the positrons (P = +v/c). These values pertain to annihilation quanta with energies above 2 Mev. The results for Cl34 are somewhat less certain, but they do indicate that the average polarization is again along the direction of the positron spin. The Cl34 data are also compatible with full polarization within the experimental uncertainties. Since Cl34 is a pure Fermi transition, this experiment shows that parity nonconservation is not restricted to Gamow-Teller transitions, but is a property of the general β-decay interaction, as predicted by the two-component neutrino theory. Ga66 is also probably a pure Fermi transition, although no definite conclusions can be drawn until the parity of this nuclide is definitely established as positive. By the method of annihilation-in-flight in a magnetic material (74), Frankel et al. (75) have also found that the positrons from Ga66 are highly polarized along the direction of motion. These experimental results for Ga66 are important, since an earlier experiment by Frauenfelder et al. (76) indicated little or no polarization.
Boehm et al.
(77) have measured the positron polarization for the mirror transition N13 by observing the circular polarization of the annihilation-in-flight quanta. The photons were produced in a carbon sample which contains a small amount of N13, obtained by deuteron bombardment of the sample in a 3-Mev Van de Graaff generator. In the same manner as in the experiment of Goldhaber et al. (31), the circular polarization was detected by means of the Compton scattering in an Armco iron magnet, which was magnetized by means of two coils. The difference in counting rate for the two opposite field directions was measured for the following γ-energies: 620, 830, 1,040, and 1,140 kev. The calculated values of the circular polarization at these energies are 36%, 59%, 70%, and 74%, respectively, assuming full polarization for the positrons (P = +v/c). From these values and from the energy dependence of the Compton scattering (34), one finds that the relative counting rate difference δ should increase from ~0 at 620 kev to 2.1% at 1,140 kev. The experimental results are in good agreement with the theoretical curve and indicate that the positron polarization is (+0.93 ± 0.20)v/c for N13. From the ft value (lifetime) of this
transition, Winther and Kofoed-Hansen (78) have deduced that the ratio of the squares of the matrix elements, |M_GT|²/|M_F|², is 0.40. The result of full polarization therefore shows that the Fermi part of this transition contributes strongly to the observed polarization and, in fact, the measurements are compatible with full polarization for the Fermi part. If only the Gamow-Teller interaction contributed to the polarization, the calculated difference δ would be only 0.65% at 1,140 kev, in definite disagreement with the observed result (2.0 ± 0.6)%. It may be noted that the results of Boehm et al. (77) concerning full polarization for the Fermi interaction are in good agreement with the results of Deutsch et al. (73), which have been discussed above. A large positive polarization for the positrons from N13 has also been observed by Hanna and Preston (79), who used their method of annihilation in magnetized iron, which has been described above (71, 72). These authors obtained comparable values for the counting rate ratios N+/N- (~1.1) for N13 and for Cu64 (pure Gamow-Teller transition) with the same experimental arrangement.
Important information on the nature of the β-decay interaction has been obtained from the β-γ circular polarization correlation experiments, which have been carried out by Boehm and Wapstra (47), Schopper et al. (80), and Lundby et al. (81). The basic idea underlying these experiments is the following. On account of the nonconservation of parity in the β-decay, the residual nucleus after β-decay will be polarized, even if the initial nucleus was unpolarized (as is generally the case). If the residual nucleus is in an excited state and emits a γ-ray, the γ-ray will be circularly polarized. The angular distribution of circularly polarized γ-rays emitted at an angle θ to the preceding β-rays is given by
$$W(\theta, \pm) = 1 \pm A\,(v/c)\cos\theta \tag{65}$$
where A is a constant coefficient, v is the electron velocity, and the + sign applies to right-hand and the - sign to left-hand circular polarization. The theoretical expressions for A for various types of β-transitions have been derived by Alder, Stech, and Winther (82). The experimental arrangement used by Boehm and Wapstra (47) is shown in Fig. 16. Above the source, there is a polarization-analyzer magnet, which serves to determine the circular polarization of the γ-rays by means of the Compton scattering in magnetized iron (34) in the same manner as in the experiment of Goldhaber et al. (31). The magnet consists of a hollow Armco core which is magnetized by means of a coil. The γ-rays from the source are scattered through an angle of ≈52° on the inside of the Armco cylinder and are then counted in a NaI crystal, which is connected by a light pipe to a photomultiplier. The direct γ-rays (from the source to the
NaI crystal) are suppressed by a lead absorber. The @-rays(traveling downward in Fig. 16) are counted in an anthracene crystal, also connected to a light pipe and a photomultiplier. A metal shield around each photomultiplier is used to eliminate the effect of stray magnetic fields. As a result, a reversal of the field direction in the analyzing magnet changed the single pand y-counting rates by only (0.02 f 0.02)y0and (0.07 f 0.02)70, respectively.
FIG. 16. Schematic view of the experimental arrangement of Boehm and Wapstra [Phys. Rev. 106, 1364; 107, 1202, 1462 (1957); 109, 456 (1958)], used to measure the β-γ circular polarization correlation for several β emitters. Labeled components in the figure include the lead absorber, the anthracene crystal, and the light pipe to the photomultiplier.
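The angular distribution of Eq. (65), W(θ, ±) = 1 ± A(v/c) cos θ, can be illustrated numerically. The sketch below evaluates W for right-hand and left-hand circular polarization and forms the normalized difference, which reduces to A(v/c) cos θ. The function names and the numerical inputs (A = -0.33, v/c = 0.6, θ = 180°) are chosen for illustration only and are not taken from a particular measurement.

```python
import math


def angular_distribution(theta_deg, a_coeff, beta, helicity):
    """W(theta, +/-) = 1 +/- A (v/c) cos(theta), Eq. (65).

    helicity is +1 for right-hand and -1 for left-hand circular polarization.
    """
    return 1.0 + helicity * a_coeff * beta * math.cos(math.radians(theta_deg))


def circular_polarization(theta_deg, a_coeff, beta):
    """Normalized right/left difference, which equals A (v/c) cos(theta)."""
    w_right = angular_distribution(theta_deg, a_coeff, beta, +1)
    w_left = angular_distribution(theta_deg, a_coeff, beta, -1)
    return (w_right - w_left) / (w_right + w_left)


# Illustrative inputs: A = -0.33, v/c = 0.6, gamma ray emitted opposite to the beta ray.
print(f"{circular_polarization(180.0, -0.33, 0.6):+.3f}")
```

A negative A combined with back-to-back emission gives a net right-hand circular polarization, since cos θ is then also negative.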
The coincidences between the β-particles and the scattered γ-rays were measured with a fast-slow coincidence circuit with a resolving time of 0.02 μsec. The efficiency of the analyzer in this arrangement was calculated by Alder (see Boehm and Wapstra, 47) and is given by
$$\epsilon = \frac{2.90k\,(1 + 0.13k)}{1 + 0.36k + 0.09k^{2}} \tag{66}$$
where ε is defined as the percentage difference of the counting rates for opposite directions of the (saturated) magnetic field, for completely circularly polarized γ-rays of energy kmc². ε must be multiplied by the cosine of the average angle between the β- and γ-radiations (148°). This efficiency
function ε was checked by measuring the counting rate differences δ for the bremsstrahlung emitted by the β-particles from P32 and Tm170. The expected value of δ was obtained from the circular polarization of the bremsstrahlung spectrum, as calculated by McVoy (30), and from the theoretical efficiency function ε [Eq. (66)]. In this manner, it was found that the longitudinal polarization P of the β-decay electrons is (-0.97 ± 0.06)v/c for P32 and (-0.93 ± 0.07)v/c for Tm170. These results for P are in good agreement with those obtained from other experiments, thus providing a check on the accuracy of the function used for ε [Eq. (66)].
Boehm and Wapstra (47) have obtained the coefficients A for the following nuclides: Na24, Sc44, Sc46, V48, Co58, Co60, and Au198. In general, the value of A gives some information about the ratio x, defined as x = a^{1/2}|M_GT|/|M_F|, where M_GT and M_F are the Gamow-Teller and Fermi matrix elements, respectively, for the β-transition, and a = C_GT²/C_F², where C_GT and C_F are the Gamow-Teller and Fermi coupling constants in the fundamental β-decay interaction. An experimental value of a was deduced by Kofoed-Hansen and Winther (83) from a study of the ft values (β-decay lifetimes) in mirror transitions. These authors obtain a = 1.3. The ratio R = x²/(1 + x²) represents the fractional amount of Gamow-Teller interaction for the particular β-transition. (R can vary between 0 and 1.) The theoretical expression (82) for A involves an interference term I, which takes on a particularly simple form if the two-component neutrino theory is valid and if, moreover, either only S and T, or only V and A, interactions occur. In this case, the absolute value of I is |I| = (C_GT/C_F)^{-1} = a^{-1/2}. For the case of Sc46, the value of A is particularly large: A = +0.33 ± 0.04.
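The analyzer efficiency of Eq. (66) and the Gamow-Teller fraction R can both be evaluated directly. The sketch below is hedged in two respects: it assumes that x has the form x = a^(1/2)|M_GT|/|M_F| with a = 1.3, and it applies the cosine factor for the 148° mean angle exactly as the text prescribes, which makes the effective efficiency negative. All function names are illustrative.

```python
import math


def analyzer_efficiency(k):
    """Eq. (66): percentage counting-rate difference for fully circularly
    polarized gamma rays of energy k, with k in units of m c^2."""
    return 2.90 * k * (1 + 0.13 * k) / (1 + 0.36 * k + 0.09 * k ** 2)


def effective_efficiency(k, mean_angle_deg=148.0):
    """Eq. (66) multiplied by the cosine of the average beta-gamma angle."""
    return analyzer_efficiency(k) * math.cos(math.radians(mean_angle_deg))


def gamow_teller_fraction(matrix_ratio, a=1.3):
    """R = x^2 / (1 + x^2), assuming x = sqrt(a) * |M_GT| / |M_F|."""
    x = math.sqrt(a) * matrix_ratio
    return x ** 2 / (1 + x ** 2)


print(f"efficiency at k = 2: {analyzer_efficiency(2.0):.2f}%")
print(f"with cos(148 deg) factor: {effective_efficiency(2.0):.2f}%")
# |M_GT|/|M_F| = 2.2 is the estimate the text quotes for Sc46.
print(f"R for |M_GT|/|M_F| = 2.2: {gamow_teller_fraction(2.2):.2f}")
```

With these assumptions a matrix-element ratio of 2.2 corresponds to a transition that is predominantly, but not purely, Gamow-Teller.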
From the Sc46 result (A = +0.33 ± 0.04), Boehm and Wapstra (47) deduce that |I| must be larger than 0.5 and obtain the following estimate of the ratio of the Gamow-Teller to Fermi matrix element: |M_GT|/|M_F| = 2.2. It may be noted that the theoretical value of A for a pure Gamow-Teller transition for Sc46 would be +0.08, showing that the admixture of Fermi interaction results in a sizable change of A (from 0.08 to 0.33). For Co60, Boehm and Wapstra (47) have obtained A = -0.41 ± 0.08, which is in good agreement with the experimental values A = -0.34 ± 0.04 of Schopper et al. (80) and A = -0.32 ± 0.07 of Lundby et al. (81). These results are also in good agreement with the theoretical value (82) A = -0.33. For Sc44 and V48, the values of A are (algebraically) larger than for a pure Gamow-Teller interaction (e.g., the measured value for Sc44 is -0.02 ± 0.04, as compared with the theoretical prediction A = -0.17 for a pure Gamow-Teller transition). This deviation indicates an appreciable amount of interference between the Gamow-Teller and Fermi interactions. Assuming the maximum value for I, the authors (47) find |M_GT/M_F| = 5 for both Sc44 and V48. The maximum asymmetry is found for Au198, with a value of A = +0.52 ± 0.09.
The work of Boehm and Wapstra (47) (particularly the experiment on Sc46) excludes a pure VT and a pure SA interaction. On the other hand, the results are in good agreement with the pure VA interaction which has been proposed in references 18-20. Moreover, these β-γ circular polarization correlation experiments disprove the validity of the twin-neutrino theory (48, 49) and can also be used to rule out a large breakdown of time-reversal invariance.
ACKNOWLEDGMENTS
I wish to thank Dr. G. Feinberg for several very helpful discussions concerning parity nonconservation. I am also indebted to Dr. S. Pasternack and Dr. L. C. L. Yuan for valuable comments.
REFERENCES
1. T. D. Lee and C. N. Yang, Phys. Rev. 104, 254 (1956).
2. C. S. Wu, E. Ambler, R. W. Hayward, D. D. Hoppes, and R. P. Hudson, Phys. Rev. 105, 1413 (1957).
3. R. L. Garwin, L. M. Lederman, and M. Weinrich, Phys. Rev. 105, 1415 (1957).
4. J. I. Friedman and V. L. Telegdi, Phys. Rev. 105, 1681 (1957).
5. R. H. Dalitz, Phil. Mag. 44, 1068 (1953); Phys. Rev. 94, 1046 (1954); see also E. Fabri, Nuovo cimento 11, 479 (1954).
6. T. D. Lee and C. N. Yang, "Elementary Particles and Weak Interactions," p. 10. Brookhaven Natl. Laboratory Report BNL 443 (T-91), 1957.
7. J. M. Blatt and V. F. Weisskopf, "Theoretical Nuclear Physics," pp. 24, 431, and 798. Wiley, New York, 1952.
8. See, for example, L. Wolfenstein, Ann. Rev. Nuclear Sci. 6, 43 (1956).
9. E. Ambler, M. A. Grace, H. Halban, N. Kurti, H. Durand, C. E. Johnson, and H. R. Lemmer, Phil. Mag. 44, 216 (1953).
10. T. D. Lee, R. Oehme, and C. N. Yang, Phys. Rev. 106, 340 (1957).
11. W. Pauli, in "Niels Bohr and the Development of Physics." Pergamon Press, London, 1955; G. Lüders, Kgl. Danske Videnskab. Selskab, Mat.-fys. Medd. 28, No. 5 (1954); J. Schwinger, Phys. Rev. 91, 720, 723 (1953); 94, 1366, 1576 (1954).
12. E. Ambler, R. W. Hayward, D. D. Hoppes, R. P. Hudson, and C. S. Wu, Phys. Rev. 106, 1361 (1957).
13. H. Postma, W. J. Huiskamp, A. R. Miedema, M. J. Steenland, H. A. Tolhoek, and C. J. Gorter, Physica 23, 259 (1957).
14. J. I. Friedman and V. L. Telegdi, Phys. Rev. 106, 1290 (1957).
15. T. D. Lee and C. N. Yang, Phys. Rev. 105, 1671 (1957).
16. L. D. Landau, Nuclear Phys. 3, 127 (1957); A. Salam, Nuovo cimento 5, 299 (1957).
17. B. M. Rustad and S. L. Ruby, Phys. Rev. 97, 991 (1955).
18. R. P. Feynman and M. Gell-Mann, Phys. Rev. 109, 193 (1958).
19. E. C. G. Sudarshan and R. E. Marshak, Proc. Padua-Venice Conf. on Mesons and Recently Discovered Particles, p. V-14 (1957); Phys. Rev. 109, 1860 (1958).
20. J. J. Sakurai, Bull. Am. Phys. Soc. [2] 3, 10 (1958); Nuovo cimento 7, 649 (1958); see also R. E. Behrends, Phys. Rev. 109, 2217 (1958).
21. M. Goldhaber, L. Grodzins, and A. W. Sunyar, Phys. Rev. 109, 1015 (1958); L. Grodzins, ibid. 109, 1014 (1958).
22. W. B. Herrmannsfeldt, D. R. Maxson, P. Stähelin, and J. S. Allen, Phys. Rev. 107, 641 (1957).
23. G. Culligan, S. G. F. Frank, J. R. Holt, J. C. Kluyver, and T. Massam, Nature 180, 751 (1957).
24. L. Michel, Proc. Phys. Soc. A63, 514 (1950).
25. J. D. Jackson, S. B. Treiman, and H. W. Wyld, Phys. Rev. 106, 517 (1957).
26. L. Wolfenstein and L. A. Page, Bull. Am. Phys. Soc. [2] 2, 190 (1957).
27. R. B. Curtis and R. R. Lewis, Phys. Rev. 107, 543 (1957).
28. T. D. Lee and C. N. Yang, "Elementary Particles and Weak Interactions," p. 54. Brookhaven Natl. Laboratory Report BNL 443 (T-91), 1957.
28a. E. Konopinski and H. M. Mahmoud, Phys. Rev. 92, 1045 (1953).
29. C. A. Coombes, B. Cork, W. Galbraith, G. R. Lambertson, and W. A. Wenzel, Phys. Rev. 108, 1348 (1957).
30. K. W. McVoy, Phys. Rev. 106, 828 (1957); 110, 1484 (1958); see also K. W. McVoy and F. J. Dyson, ibid. 106, 1360 (1957).
31. M. Goldhaber, L. Grodzins, and A. W. Sunyar, Phys. Rev. 106, 826 (1957).
32. H. A. Bethe and W. Heitler, Proc. Roy. Soc. A146, 83 (1934).
33. C. Fronsdal and H. Überall, Phys. Rev. 111, 580 (1958).
34. S. B. Gunst and L. A. Page, Phys. Rev. 92, 970 (1953).
35. H. A. Tolhoek, Revs. Modern Phys. 28, 277 (1956).
36. R. E. Cutkosky, Phys. Rev. 107, 330 (1957).
37. G. W. Ford, Phys. Rev. 107, 321 (1957).
38. A. Pytte, Phys. Rev. 107, 1681 (1957).
39. C. S. Wu, in "Beta- and Gamma-Ray Spectroscopy" (K. Siegbahn, ed.), p. 649. Interscience, New York, 1955.
40. H. Schopper and S. Galster, Nuclear Phys. 6, 125 (1958).
41. C. Møller, Ann. Physik [5] 14, 531 (1932).
42. A. M. Bincer, Phys. Rev. 107, 1434 (1957).
43. G. W. Ford and C. J. Mullin, Phys. Rev. 108, 477 (1957); 110, 1485 (1958). See also P. Stehle, ibid. 110, 1458 (1958); A. Rączka and R. Rączka, ibid. 110, 1469 (1958).
44. A. M. Bincer, Phys. Rev. 107, 1467 (1957).
45. H. Frauenfelder, A. O. Hanson, N. Levine, A. Rossi, and G. De Pasquali, Phys. Rev. 107, 643 (1957).
46. N. Benczer-Koller, A. Schwarzschild, J. B. Vise, and C. S. Wu, Phys. Rev. 109, 85 (1958).
47. F. Boehm and A. H. Wapstra, Phys. Rev. 106, 1364; 107, 1202, 1462 (1957); 109, 456 (1958).
48. M. Goeppert-Mayer and V. L. Telegdi, Phys. Rev. 107, 1445 (1957).
49. M. A. Preston, Can. J. Phys. 35, 1017 (1957).
50. N. F. Mott, Proc. Roy. Soc. A126, 259 (1930).
51. N. F. Mott, Proc. Roy. Soc. A124, 425 (1929); A135, 429 (1932).
52. K. M. Case, Phys. Rev. 106, 173 (1957).
53. H. A. Tolhoek and R. S. de Groot, Physica 17, 17 (1951).
54. P. A. M. Dirac, "The Principles of Quantum Mechanics." Oxford University Press, London and New York, 1935.
55. N. Sherman, Phys. Rev. 103, 1601 (1956).
56. J. H. Bartlett and R. E. Watson, Proc. Am. Acad. Arts Sci. 74, 53 (1940).
57. C. B. O. Mohr and L. J. Tassie, Proc. Phys. Soc. A67, 711 (1954); C. B. O. Mohr, Proc. Roy. Soc. A182, 189 (1943).
58. H. Mendlowitz and K. M. Case, Phys. Rev. 97, 33 (1955); 100, 1551 (1955).
59. W. H. Louisell, R. W. Pidd, and H. R. Crane, Phys. Rev. 94, 7 (1954).
60. A. de-Shalit, S. Kuperman, H. J. Lipkin, and T. Rothem, Phys. Rev. 107, 1459 (1957).
61. F. Gürsey, Phys. Rev. 107, 1734 (1957).
62. L. J. Tassie, Phys. Rev. 107, 1452 (1957).
63. H. J. Lipkin, S. Kuperman, T. Rothem, and A. de-Shalit, Phys. Rev. 109, 223 (1958).
64. H. Frauenfelder, R. Bobone, E. von Goeler, N. Levine, H. R. Lewis, R. N. Peacock, A. Rossi, and G. De Pasquali, Phys. Rev. 106, 386 (1957).
65. H. De Waard and O. J. Poppema, Physica 23, 597 (1957).
66. P. E. Cavanagh, J. F. Turner, C. F. Coleman, G. A. Gard, and B. W. Ridley, Phil. Mag. [8] 2, 1105 (1957).
67. A. I. Alikhanov, G. P. Eliseiev, V. A. Lubimov, and B. V. Ershler, Nuclear Phys. 6, 588 (1958).
68. N. Ryu, K. Hashimoto, and I. Nonaka, J. Phys. Soc. Japan 8, 575 (1953).
69. L. A. Page and M. Heinberg, Phys. Rev. 106, 1220 (1957).
70. M. Heinberg and L. A. Page, Phys. Rev. 107, 1589 (1957).
71. S. S. Hanna and R. S. Preston, Phys. Rev. 106, 1363 (1957).
72. S. S. Hanna and R. S. Preston, Phys. Rev. 109, 716 (1958); see also R. S. Preston and S. S. Hanna, Phys. Rev. 110, 1406 (1958).
73. M. Deutsch, B. Gittleman, R. W. Bauer, L. Grodzins, and A. W. Sunyar, Phys. Rev. 107, 1733 (1957).
74. L. A. Page, Phys. Rev. 106, 394 (1957).
75. S. Frankel, P. G. Hansen, O. Nathan, and G. M. Temmer, Phys. Rev. 108, 1099 (1957).
76. H. Frauenfelder, A. O. Hanson, N. Levine, A. Rossi, and G. De Pasquali, Phys. Rev. 107, 910 (1957).
77. F. Boehm, T. B. Novey, C. A. Barnes, and B. Stech, Phys. Rev. 108, 1497 (1957).
78. A. Winther and O. Kofoed-Hansen, Kgl. Danske Videnskab. Selskab, Mat.-fys. Medd. 27, No. 14 (1953).
79. S. S. Hanna and R. S. Preston, Phys. Rev. 108, 160 (1957).
80. H. Schopper, Phil. Mag. [8] 2, 710 (1957); H. Appel, H. Schopper, and S. D. Bloom, Phys. Rev. 109, 2211 (1958).
81. A. Lundby, A. P. Patro, and J. P. Stroot, Nuovo cimento 6, 745 (1957); 7, 891 (1958).
82. K. Alder, B. Stech, and A. Winther, Phys. Rev. 107, 728 (1957); see also M. Morita and R. S. Morita, Phys. Rev. 107, 1316 (1957).
83. O. Kofoed-Hansen and A. Winther, Kgl. Danske Videnskab. Selskab, Mat.-fys. Medd. 30, No. 20 (1956).
Quantum Efficiency of Detectors for Visible and Infrared Radiation
R. CLARK JONES
Research Laboratory, Polaroid Corporation, Cambridge, Massachusetts
I. Introduction and Summary
  A. Introduction
  B. Summary
  C. Elementary Detector Concepts
II. Responsive Quantum Efficiency
III. Detective Quantum Efficiency
  A. Introduction
  B. Elementary Discussion
    1. The Ideal Detector
    2. Definition of the Detective Quantum Efficiency
    3. The Quasi-Ideal Detector
    4. Alternative Expression for ...
  C. A More Rigorous Discussion
    1. The Ideal Detector
    2. Definition of the Detective Quantum Efficiency
    3. The Quasi-Ideal Detector
IV. Detectivity and Contrast Detectivity
  A. Introduction
  B. Some Definitions
  C. Properties of the ...-Ma Curve
  D. Increasing QD by a Local Source of Radiation
  E. Increasing QD by a Neutral Filter
  F. The Useful Range of a Detector; Underloading and Overloading
  G. Method of Comparing Television Camera Tubes with Photographic Films
V. Photoemissive Tubes
  A. Introduction
  B. Responsive Quantum Efficiency
  C. Detective Quantum Efficiency
VI. Photoconductive Cells
  A. Introduction
  B. Responsive Quantum Efficiency
  C. Detective Quantum Efficiency
    1. Cadmium Sulfide Cells
    2. Lead Sulfide Cells
    3. Conclusions
VII. Television Camera Tubes
  A. Introduction
  B. Responsive Quantum Efficiency
  C. Detective Quantum Efficiency
    1. The Basic Data
    2. Derivation of a Working Formula for QD
    3. Results for Image Orthicons
    4. Detective Quantum Efficiency of the 6326 Vidicon
    5. Discussion
VIII. Photographic Negatives
  A. Introduction
  B. Responsive Quantum Efficiency
  C. Detective Quantum Efficiency
    1. Derivation of a Working Formula for QD
    2. Description of the Films
    3. Sensitometric Data
    4. Granularity
    5. Wavelength Dependence
    6. Numerical Results
    7. Discussion
IX. Human Vision
  A. Introduction
  B. Responsive Quantum Efficiency
  C. Detective Quantum Efficiency
    1. The Signal-to-Noise Ratio k
    2. The Experimental Data
    3. Derivation of a Working Formula for QD
    4. Numerical Results
    5. Discussion
X. Other Detectors
  A. Heat Detectors in General
  B. The Golay Pneumatic Heat Detector
  C. Thermocouples and Bolometers
  D. Back-Biased p-n Junctions
  E. Photovoltaic Cells
  F. Photoelectromagnetic Detectors
  G. Photosynthesis
References
I. INTRODUCTION AND SUMMARY
A. Introduction
In this review of the quantum efficiency of radiation detectors, chief emphasis is given to photoemissive detectors, photoconductive detectors, television camera tubes, photographic negatives, and the human eye.
Writers have employed many different kinds of quantum efficiency. Nearly all of them relate to responsivity and are accordingly called a responsive quantum efficiency. There is one kind of quantum efficiency, however, the kind here called detective quantum efficiency, that is of particular importance in connection with the detecting ability of detectors. The concept of detective quantum efficiency was first formulated by Rose,
and his review (1) of image detectors in Vol. I of this series makes extensive use of the concept.

The definition of the detective quantum efficiency involves the concept of “ambient radiation.” The ambient radiation is now defined as a certain part of the total radiation incident on the detector. There are three principal sources of such radiation: (1) the signal radiation that is to be detected, usually varying with time; (2) the blackbody radiation field produced by the detector and its environment; (3) other steady radiation, such as daylight, moonlight, or steady manmade illumination. The combination of (2) and (3) we call “ambient radiation.” Ambient radiation is the steady radiation that falls on the detector.

There are two quite different kinds of situations in which one may be interested in the detectivity of a radiation detector: the photon noise of the ambient radiation may be negligible, or it may be the dominant noise. In the first situation the noise produced in the output of the detector by the quantum fluctuations of the steady ambient radiation is small compared with the other noises in the output. Thermocouples, bolometers, and photoconductive cells are usually operated under this condition. The extreme case of the first kind of situation occurs when the only radiation incident on the detector is the blackbody radiation appropriate to the temperature of the detector and the signal radiation that is to be detected. This extreme situation was the only one considered in the writer’s review (2) of detector performance in Vol. V of this series.

In contrast with the first situation, the second situation involves an ambient radiation field whose intensity is such that the statistical fluctuations in this “steady” field produce the dominant noise in the output of the detector; that is to say, the detectivity is limited by the photon noise of the ambient radiation field. Human vision and multiplier phototubes usually operate under this condition.
The intensity of the incident radiation required to make the photon noise the dominant noise depends, of course, on the kind of detector. It is difficult to make the photon noise dominant in the output of a lead sulfide cell, whereas it is only under unusual circumstances that the photon noise is not the dominant noise in the output of a multiplier phototube. The concept of detective quantum efficiency is the appropriate means to characterize the detecting ability of a detector in the second kind of situation, and the primary purpose of this review is to describe the detecting performance of a substantial number of detectors from this point of view.
B. Summary

Some of the quantitative results of this article are summarized in Figs. 1 and 2. Figure 1 shows the detective quantum efficiency (DQE) of three RCA television camera tubes, four different Kodak films, and human
foveal vision plotted against the intensity of the ambient radiation as measured in ergs per square centimeter. In Fig. 1 all of the other parameters, such as radiation wavelength, area of the signal, etc., have been adjusted to the value that maximizes the DQE. Figure 2 shows the DQE of the same group of detectors plotted against the wavelength of the radiation, with all of the other parameters adjusted to the value that maximizes the ordinate. Data on the way that the DQE depends on the other parameters are given in the appropriate sections.
[Figure 1: curves labeled IMAGE ORTHICONS, VISION, and FILM; horizontal axis: exposure in ergs/cm^2.]
FIG. 1. The detective quantum efficiency $Q_D$ plotted versus the ambient exposure. Only the three image-forming detectors are summarized in this figure: television camera tubes, photographic negatives, and human vision, which are the subjects of Secs. VII, VIII, and IX. The dashed lines have slopes of plus or minus 1 and indicate the way that the performance can be improved by the use of added ambient illumination or by the use of neutral filters, as discussed in Sec. IV. The results shown are for optimum choice of all of the other parameters that affect the DQE, such as wavelength, size of signal area, signal duration, etc. For human vision the method of calculating the exposure U from the luminance B is described in Sec. IX,C,4. For the camera tubes, U is calculated from the irradiation H of Sec. VII,C on the basis of an exposure duration of $50 sec.
The concepts of responsive and detective quantum efficiency are the subjects of Secs. II and III. Section IV shows in a quite general way how the detective quantum efficiency is related to detectivity and “contrast detectivity” (3). Two kinds of non-image-forming detectors (photoemissive tubes and photoconductive cells) are discussed in Secs. V and VI, and this is followed by the discussion of three kinds of image-forming detectors (television camera tubes, photographic negatives, and human vision) in Secs. VII, VIII, and IX. A number of other detectors are considered in Sec. X.
C. Elementary Detector Concepts

This section is now concluded with a brief summary of the more elementary concepts relating to radiation detectors. The responsivity R of a detector is the ratio of the detector output (usually in volts or amperes) to detector input (usually in watts or lumens). Thus, the responsivity may be expressed in volts per watt. The responsivity of a detector usually depends on the operating temperature, the wavelength of the radiation, the modulation frequency of the radiation, the sensitive area of the detector, and the speed of response.
FIG. 2. The detective quantum efficiency $Q_D$ plotted versus the wavelength of the signal radiation and ambient radiation. The detectors are the same as those involved in Fig. 1. The results are for optimum choice of all of the other parameters that affect the DQE, such as amount of ambient radiation, size of signal area, signal duration, etc.
The noise N of a detector is the rms fluctuation in the output expressed in terms of the detector output. Thus, N may be expressed in rms volts, or rms amperes. The noise equivalent input (NEI) is the rms fluctuation in the output expressed in terms of the detector input. Thus, the NEI may be expressed in watts or in lumens. When the NEI is expressed in watts, it is called the
noise equivalent power and is denoted $P_N$. Sometimes, as with photographic negatives, it is convenient to introduce the noise equivalent energy $E_N$. In terms of the responsivity R and the noise N, the noise equivalent input is defined by

$$\mathrm{NEI} = N/R \tag{1.1}$$
The noise equivalent input is psychologically upside-down. A detector with a greater detecting ability has a smaller noise equivalent input. To avoid this difficulty it has been customary to use the term sensitivity to denote the right-side-up concept. But the term sensitivity has been used to denote both the reciprocal of noise equivalent input and the simple concept of responsivity. To avoid the ambiguity of the term sensitivity, the author in 1952 introduced (4) the term detectivity to denote the reciprocal of the noise equivalent input. The detectivity D is defined by

$$D = R/N \tag{1.2}$$
The detectivity depends on the five quantities listed above on which the responsivity depends and also on the frequency bandwidth of the noise. If the radiation input is measured in terms of its power P, then the responsivity R determines the signal output S:

$$S = RP \tag{1.3}$$

and the detectivity D determines the signal-to-noise ratio:

$$S/N = DP$$
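As a quick numerical check of Eqs. (1.1) to (1.3), the following Python sketch evaluates the noise equivalent input, the detectivity, and the signal-to-noise ratio of a hypothetical detector. The numerical values (a responsivity of 50 volts per watt, an rms noise of 2 microvolts, and a signal power of 1 microwatt) are assumptions chosen only for illustration; they do not describe any detector discussed in this article.

```python
# Hypothetical detector parameters, assumed only for illustration.
R = 50.0      # responsivity, volts per watt
N = 2.0e-6    # rms noise in the output, volts
P = 1.0e-6    # radiation signal power, watts

NEI = N / R   # Eq. (1.1): noise equivalent input, here in watts
D = R / N     # Eq. (1.2): detectivity, here in reciprocal watts
S = R * P     # Eq. (1.3): signal output, volts
snr = D * P   # signal-to-noise ratio S/N = DP

print(f"NEI = {NEI:.3g} W, D = {D:.3g} 1/W, S = {S:.3g} V, S/N = {snr:.3g}")
```

The detectivity is simply the reciprocal of the noise equivalent input, so a better detector has a larger D and a smaller NEI.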
The operation of many kinds of detectors is described in the recent book by Smith, Jones, and Chasmar (5). A review of detectors with emphasis on their detecting ability was given in Vol. V of this series (2). Neither of these references considers detectors from the point of view of detective quantum efficiency.

II. RESPONSIVE QUANTUM EFFICIENCY

The term efficiency without the term quantum appended usually means the ratio of an output power to an input power, or perhaps the ratio of an output energy or output free energy to an input energy. But the term quantum efficiency means something else. The responsive type of quantum efficiency is always the ratio of the numbers of two kinds of countable events. For example, the quantum efficiency of a simple vacuum phototube is usually defined as the ratio of the number of electrons that reach the anode and flow in the external circuit to the number of photons that are incident on the photocathode.
Thus, we define the responsive quantum efficiency (RQE) of a detector as the ratio of the number of countable output events to the number of photons that act on the device. It may serve a useful purpose to review briefly the various definitions of RQE that have been used for detectors.

For any detector, the number of input events may be either the number of incident photons or the number of absorbed photons. Historically, the choice has been different for different types of detectors. With photoemissive tubes the incident photons are usually counted, whereas with photoconductive cells it is the absorbed photons that are usually counted.

With photoconductive cells, the method used in all of the early work of Gudden and Pohl is to consider the output event as the flowing of an electron in the external circuit. This, of course, made the RQE depend on the applied voltage. All workers today consider the creation of an electron-hole pair to be the output event. The changeover of point of view in this respect is well described by Rose (6).

With respect to human vision, various workers have studied the minimum number of absorbed photons that are required to elicit a sensation of vision in the dark-adapted eye. The reciprocal of this number is the ratio of the number of perceptions of light to the number of absorbed photons and thus may be considered to be an RQE. Rose in 1942 (7) defined the performance in terms of the number of photons required to produce a just detectable signal in a single resolution element. This method of evaluation was abandoned by Rose (8) in 1946 in favor of the detective quantum efficiency that is defined below in Sec. III.

Photographic negatives have been left until last because with this detector we find the greatest variety of possible definitions of responsive quantum efficiency. With this detector the input event may be either the incident photon, the absorbed photon, or the photon that is absorbed in a photographically relevant manner.
The output event may be either the absorption of a photon in a photographically relevant manner or the event of a grain’s becoming developable. Furthermore, the number of input events may be either the actual number under a given set of conditions or the minimum number required to effect the chosen output event. There are undoubtedly other possible definitions of the RQE of a photographic negative, but the list given is sufficient to show the complexity.

These examples are perhaps sufficient to indicate the wide variety of the possible definitions of an RQE. Thus, when anyone says, “The quantum efficiency of detector A is 20%,” very little information is carried by this statement until he states just what kind of (responsive) quantum efficiency he is talking about.

The RQE may be greater than unity. For example, the responsive quantum efficiency of a multiplier phototube may be 100,000, if this number of electrons reach the anode for each photon incident on the photocathode. In contrast, the detective quantum efficiency defined in the next section cannot be greater than unity.
III. DETECTIVE QUANTUM EFFICIENCY

A. Introduction
As we saw in the preceding section, the term quantum efficiency has usually meant responsive quantum efficiency, with the result that a simple phototube may have a quantum efficiency of 0.1, whereas a multiplier phototube with the same kind of sensitive surface may have a quantum efficiency of 100,000. This situation is one that cries for some unifying concept, and I believe that the concept of detective quantum efficiency introduced by Rose (8) is the answer to this need.¹

The reader may have in mind the notion of a “fundamental” kind of quantum efficiency, which is probably expressed in some intuitive terms such as the following. Suppose $N_i$ photons of a certain wavelength are incident on a detector. Of these photons, a certain number $N_a$ will be absorbed. Usually, not all of the absorbed photons will be effective in stimulating an electrical output of the detector; specifically, suppose that $N_e$ of the photons are effective in producing the excitation that contributes to the electrical output. Then most of us would be willing to agree that the ratio $N_e/N_i$, the ratio of effective to incident quanta, is a good measure of a “fundamental” kind of quantum efficiency.

The definition of a “fundamental” quantum efficiency in the preceding paragraph is not an operational one. With many kinds of detectors, including the human eye and photographic negatives, it is difficult or impossible to measure the number $N_e$, or even to say exactly what we mean by “effective” photons. The concept of detective quantum efficiency, however, does have a clean-cut operational definition: The detective quantum efficiency of an actual detector is defined as the square of the ratio of the measured detectivity of the detector to the maximum possible detectivity on the given signal in the presence of the given ambient radiation. The method of calculation is suggested by the following example.
Suppose we have an “ideal” detector of unit quantum efficiency, by which we mean in this paragraph that one electron flows in the external circuit for each incident photon. Since the photons are statistically independent (see below for qualifications), the electrons that flow in the external circuit are statistically independent, which is another way of saying that the noise in the output is related to the average current by the ordinary shot noise formula.

Second, suppose that the detector is then changed so that it responds to only one-half of the incident quanta (accomplished, for example, by placing a filter with a transmission of 0.5 over the detector) and that an amplifier with a gain of 2 is added. Then the responsive quantum efficiency of the second detector is the same as that of the first, but observation of the noise shows that the mean square noise current of the second detector is twice that of the first. Thus, the detectivity of the second detector is less than that of the first by the factor 0.707.

If we postulate that the “ideal” detector just described has the maximum possible detectivity in the presence of the given ambient radiation, then it follows from the definition of detective quantum efficiency that the second detector has a detective quantum efficiency of 0.5. This conclusion obviously accords with our intuitive feeling that the “fundamental” quantum efficiency of the second detector is only half that of the first, since only one-half of the incident photons are actually used in the second detector.

In Sec. B we define carefully the concept of an ideal detector, and we then immediately define the detective quantum efficiency in terms of this ideal detector. Then finally we define a quasi-ideal detector as the equivalent of an ideal detector that is degraded by a filter of transmittance F, so that F is the fractional utilization, and then show that the detective quantum efficiency of this quasi-ideal detector is equal to F. The discussion in Sec. B is throughout elementary, in the sense that the emphasis is on the physical concepts. In Sec. C, the concept of detective quantum efficiency is redefined in a more rigorous way.

¹ The concept was formulated by Rose. The name “detective quantum efficiency” was introduced by the author. It is called “equivalent quantum efficiency” by Fellgett (9).
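The filter-and-amplifier example of Sec. A can be followed numerically. The Python sketch below is only an illustration under stated assumptions: the photon arrivals are taken to be Poisson distributed, so that the number of photons actually used in a period T has a variance equal to its mean, and the photon rate of 10^6 per second is a hypothetical value. The function name `shot_noise_stats` is introduced here for illustration only.

```python
import math

def shot_noise_stats(photon_rate, fraction_used, gain, T=1.0):
    """Mean and rms of the amplified output count in a period T.

    Assumes Poisson photon arrivals, so the number of photons actually
    used in time T has mean and variance both equal to
    fraction_used * photon_rate * T.
    """
    mean_used = fraction_used * photon_rate * T
    mean_out = gain * mean_used
    rms_out = gain * math.sqrt(mean_used)
    return mean_out, rms_out

rate = 1.0e6  # hypothetical ambient photon rate, photons per second

# First detector: every photon used, no amplifier.
mean1, rms1 = shot_noise_stats(rate, fraction_used=1.0, gain=1.0)
# Second detector: filter of transmission 0.5 followed by a gain of 2.
mean2, rms2 = shot_noise_stats(rate, fraction_used=0.5, gain=2.0)

responsivity_ratio = mean2 / mean1                       # same output per photon
noise_power_ratio = (rms2 / rms1) ** 2                   # mean-square noise ratio
detectivity_ratio = responsivity_ratio / (rms2 / rms1)   # second relative to first
dqe_second = detectivity_ratio ** 2                      # square of detectivity ratio

print(responsivity_ratio, noise_power_ratio, detectivity_ratio, dqe_second)
```

Under these assumptions the second detector has the same responsive quantum efficiency as the first, twice the mean-square noise, a detectivity lower by the factor 0.707, and therefore a detective quantum efficiency of 0.5.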
B. Elementary Discussion

1. The Ideal Detector. An ideal detector is one that makes fully effective use of every photon incident upon the sensitive area of the detector. In this section the detecting ability of an ideal detector is derived. It is supposed that the signal and the ambient radiation both consist of radiation within a narrow band of radiation frequencies and that the band is the same for the signal and for the ambient radiation. It is further supposed that the (modulation) frequency response of the detector may be characterized by an integration time T; this assumption is made only to simplify the presentation, and the final results will be expressed in a form independent of this assumption. The strength of the ambient radiation will be specified by the average
number $M_a$ of photons that reach the sensitive area of the detector in the period of duration T. Similarly, the strength of the steady signal is defined by the average number $M_s$ of signal photons that reach the sensitive area in the integration time T. For the sake of simplicity, it is supposed that $M_s$ is small compared with $M_a$.

Because of the statistical independence (or near independence) of the individual photons, the number of ambient photons will not be the same in successive integration periods. The average number will be $M_a$ as defined above, but the actual number $m_a$ in any given period of length T will usually differ from $M_a$. The deviation from the mean value in any given period is $m_a - M_a$. The average value of the square of this deviation is called the mean-square deviation, and the square root of the mean-square deviation is called the root-mean-square (rms) deviation, and is here denoted by $\Delta M_a$:

$$\Delta M_a = \left[\left\langle (m_a - M_a)^2 \right\rangle\right]^{1/2} \tag{3.1}$$

where the angle brackets indicate that the quantity within is averaged over a large number of integration periods. Lewis (10) has presented results showing the magnitude of the fluctuation $\Delta M_a$ for radiation from a thermal source. His results are equivalent to

$$(\Delta M_a)^2 = \Omega M_a \tag{3.2}$$

where $\Omega$ is a factor that may usually be taken to be unity in the visible and infrared regions. $\Omega$ is defined and discussed below in Sec. C. In the remainder of this Sec. B, the factor $\Omega$ will be set equal to unity. Thus, one has

$$\Delta M_a = M_a^{1/2} \tag{3.3}$$
This result may be derived simply if we assume that the individual photons occur randomly and independently. The number $m_a$ then has a Poisson distribution, and the result (3.3) follows at once. It turns out, however, that the concept of a sequence of random events is not so simple after all. Fry (11) has examined this concept in detail and finds that there are two ways in which a sequence of events may be random. The events may be “individually at random” or they may be “collectively at random.” The reader is referred to Fry’s lucid account for definitions and examples of these concepts. Fry shows that if and only if a sequence of events is random in both of these senses do the events have a Poisson distribution. By a Poisson distribution, we mean that if $M_a$ is the mean number in intervals of length T, the probability that exactly $m_a$ events occur in this interval is given by

$$p(m_a) = \frac{M_a^{m_a}\, e^{-M_a}}{m_a!} \tag{3.4}$$
It is easily confirmed that this expression gives $M_a$ as the mean number, and more detailed calculation shows that the rms deviation from the mean is given by

$$\Delta M_a = M_a^{1/2} \tag{3.5}$$

in confirmation of (3.3). (The calculation makes use of the fact that the mean-square deviation from the mean is the difference between the mean squared number and the square of the mean number.) The rms noise N, measured in photon numbers, is thus

$$N = \Delta M_a = M_a^{1/2} \tag{3.6}$$

and the signal S measured in photon numbers is equal to the number $M_s$ of signal photons:

$$S = M_s \tag{3.7}$$

All these photon numbers are, of course, those that pertain to the integration period T. The signal-to-noise ratio is thus given by

$$S/N = M_s / M_a^{1/2} \tag{3.8}$$

and the noise equivalent number of signal photons is given by

$$S_N = M_{sN} = M_a^{1/2} \tag{3.9}$$
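The chain from the Poisson law (3.4) to the ideal signal-to-noise ratio (3.8) can be checked numerically. The following Python sketch sums the distribution directly, using the identity quoted in the text (mean-square deviation equals the mean of the square minus the square of the mean). The photon numbers 400 and 30 are hypothetical, and the helper names are introduced here only for illustration.

```python
import math

def poisson_pmf(m, mean):
    """Probability of exactly m events when the mean number is `mean`, Eq. (3.4)."""
    # Evaluate through logarithms so that large m does not overflow.
    return math.exp(m * math.log(mean) - mean - math.lgamma(m + 1))

def mean_and_rms(mean, m_max):
    """Mean and rms deviation of the distribution, summed from 0 to m_max."""
    probs = [poisson_pmf(m, mean) for m in range(m_max + 1)]
    first = sum(m * p for m, p in enumerate(probs))
    second = sum(m * m * p for m, p in enumerate(probs))
    # Mean-square deviation = mean of the square minus square of the mean.
    return first, math.sqrt(second - first ** 2)

M_a = 400.0   # hypothetical mean number of ambient photons in the period T
M_s = 30.0    # hypothetical number of signal photons in the period T

mean_number, delta_M_a = mean_and_rms(M_a, m_max=1000)

snr_ideal = M_s / delta_M_a   # Eq. (3.8)
M_sN = delta_M_a              # Eq. (3.9): signal photons that give S/N = 1

print(mean_number, delta_M_a, snr_ideal, M_sN)
```

The sum is truncated at 1000 events, which lies far beyond the range where the distribution has appreciable weight for a mean of 400, so the truncation does not affect the result at the precision shown.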
Usually, because of imperfections in the detector, the actual number of photons required to produce a noise equivalent signal output is larger than the value just given. But we here define an ideal detector as a detector that does achieve the detecting ability corresponding to the last equation.

2. Definition of the Detective Quantum Efficiency. We are now prepared to define the detective quantum efficiency (DQE) of an actual detector. Suppose that we measure the signal-to-noise ratio $(S/N)_m$ of an actual detector on a given signal and in the presence of a given ambient radiation. Then the detective quantum efficiency $Q_D$ is the square of the ratio of the measured signal-to-noise ratio $(S/N)_m$ to the signal-to-noise ratio of the ideal detector on the given signal in the presence of the given ambient radiation:

$$Q_D = \left[\frac{(S/N)_m}{M_s/M_a^{1/2}}\right]^2 = \frac{M_a}{M_s^2}\,(S/N)_m^2 \tag{3.10}$$
It is clear from this definition that the DQE cannot be greater than unity. The DQE is unity only for the (nonexistent) ideal detector described above and is less than unity for any detector that is less than ideal. We recall from Sec. II that the RQE is often greater than unity. It is important to note that $M_a$ is the mean number of photons incident
on the sensitive area. If the detector and environment are all at the same temperature, this incident flux will be balanced by an exactly equal flux of reflected or emitted photons, so that the net flux is zero. But $M_a$ is not the net flux; it is the incident flux. It may help the reader to imagine that the ideal detector is perfectly black and at absolute zero; then the only photons are the incident photons. In fact, an ideal detector would have to be perfectly black and at absolute zero, but this fact is a consequence of the definition, not a part of the definition (of an ideal detector).

The definition of DQE given above is general in that there is no restriction on the amount of the ambient radiation. It is important to realize, however, that the DQE will accord with our intuitive understanding of its significance only when the ambient radiation is sufficient in amount that its fluctuation produces the dominant noise in the output of the detector. The quantum fluctuations in the ambient radiation will always produce noise in the output of the detector, but in practical situations it may be that the noise due to these fluctuations (photon noise) is buried under other noises of larger magnitude. Only when the dominant noise in the detector output is photon noise may we expect to obtain a value for the DQE that is independent of the amount of the ambient radiation and that accords with the intuitive notion of a “fundamental” quantum efficiency.

The definition of the DQE given in the text immediately preceding Eq. (3.10) is the same as the definition in Sec. III,A only if the ideal detector does have the maximum possible detectivity in the presence of the given ambient. The ideal detector gives equal weight to every incident photon, so one might ask whether an even higher detectivity might be achieved by giving unequal weights to the various incident photons.
For example, the detector might split the integration period T into 100 equal subperiods and give the number of photons that are incident in each subperiod a weight that depends on the number of photons that occur in that subperiod. Such a detector would be nonlinear in the sense that the output voltage would not be proportional to the incident power. Thus, the question here emphasized is whether the maximum conceivable detectivity is given by a linear detector that gives equal weight to every incident photon.

I believe the answer is yes: the ideal detector defined above does have the maximum possible detectivity. But I do not have a formal proof, nor does this question appear to be discussed in the literature. My intuitive proof is based on the fact that both the signal photons and the noise photons, considered either separately or in combination, occur collectively and individually at random, with the result that there is simply no way of making an a priori judgment that any given photon is more likely to be a signal photon than any other photon. If, for example, all the signal photons arrived singly and the background photons arrived two at a
time, one could easily devise a detector that would reject all of the noise photons. But the essential fact here is that there is no basis for making such a distinction. So, in the absence of a basis for discrimination, the detector must give equal weight to every incident photon. I am indebted to three specialists in noise theory for discussion of this question: Dr. David Middleton, Dr. David Van Meter, and Professor Norbert Wiener. The considerations of the preceding paragraph are perhaps most nearly implicit in a recent paper by Middleton (12).

3. The Quasi-Ideal Detector. The quasi-ideal detector is defined as the combination of an ideal detector, as defined above, covered by a filter that transmits a fraction F of the incident photons. Thus, the quasi-ideal detector is a detector that makes maximal use of the fraction F of the incident photons. We shall show that the detective quantum efficiency $Q_D$ of a quasi-ideal detector is equal to F. This is the basic justification for calling $Q_D$ a “quantum efficiency.” Speaking loosely, we may say that the DQE is equal to the fraction of incident photons that are actually utilized by the detector. This demonstration is also the justification for defining the DQE as the square of the ratio of signal-to-noise ratios. If any other power than the second were used in the definition, the DQE would not be equal to the fractional utilization F of a quasi-ideal detector.

For the quasi-ideal detector, the effective number S of signal photons is not $M_s$ (as it is for the ideal detector), but is rather $FM_s$:

$$S = FM_s \tag{3.11}$$

Similarly, the effective mean number of ambient photons is not $M_a$, but is rather $FM_a$. Then the rms fluctuation N in the effective number of background photons is given by

$$N = (FM_a)^{1/2} \tag{3.12}$$

The square of the signal-to-noise ratio is therefore given by

$$(S/N)^2 = FM_s^2/M_a \tag{3.13}$$

and if this signal-to-noise ratio be considered to be the “measured” signal-to-noise ratio of the quasi-ideal detector, then it follows at once from Eq. (3.10) that the detective quantum efficiency $Q_D$ of the quasi-ideal detector is given by

$$Q_D = F \tag{3.14}$$
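The argument of Eqs. (3.11) to (3.14), including the remark that only the second power yields the fractional utilization, can be illustrated with a few lines of Python. This is a minimal sketch: the photon numbers are hypothetical, and the `power` argument is introduced here only to show what would happen with an exponent other than 2.

```python
import math

def snr_ideal(M_s, M_a):
    """Signal-to-noise ratio of the ideal detector, Eq. (3.8)."""
    return M_s / math.sqrt(M_a)

def snr_quasi_ideal(M_s, M_a, F):
    """Signal-to-noise ratio behind a filter of transmittance F."""
    S = F * M_s               # Eq. (3.11)
    N = math.sqrt(F * M_a)    # Eq. (3.12)
    return S / N

def dqe(snr_measured, M_s, M_a, power=2):
    """Eq. (3.10) when power is 2; other powers are shown only for contrast."""
    return (snr_measured / snr_ideal(M_s, M_a)) ** power

M_a, M_s = 1.0e6, 100.0   # hypothetical ambient and signal photon numbers

for F in (0.1, 0.25, 0.5, 1.0):
    measured = snr_quasi_ideal(M_s, M_a, F)
    print(F, dqe(measured, M_s, M_a), dqe(measured, M_s, M_a, power=1))
```

With the second power the computed value reproduces F for every transmittance; with the first power it gives the square root of F instead, which is why the square is the natural choice.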
4. Alternative Expressions for the Detective Quantum Efficiency. We now proceed to express the DQE in terms of quantities that are more directly related to experiment than the photon numbers $M_a$ and $M_{sN}$. In
various applications, we may be concerned with the energy E, the exposure U, the power P, or the irradiation H. These quantities are related to the number of photons M by

$$E = PT = HAT = \mathcal{E}M \tag{3.15}$$

where $\mathcal{E}$ is the energy of a single photon:

$$\mathcal{E} = (1.9857 \times 10^{-12}\ \text{erg})/\lambda \tag{3.16}$$

where $\lambda$ is the wavelength in microns. In some applications, the signal may be expressed in terms of its contrast C defined by

$$C = M_s/M_a \tag{3.17}$$
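Equations (3.15) to (3.17) amount to a unit conversion from radiometric quantities to photon numbers. The Python sketch below shows one way to carry it out in the cgs units used here. The function names and the numerical inputs (wavelength, area, integration time, and irradiations) are hypothetical and serve only as an illustration.

```python
HC_ERG_MICRON = 1.9857e-12  # numerator of Eq. (3.16), erg times micron

def photon_energy(wavelength_microns):
    """Energy of one photon in ergs, Eq. (3.16)."""
    return HC_ERG_MICRON / wavelength_microns

def photons_from_energy(E, wavelength_microns):
    """Photon number M for an energy E in ergs, Eq. (3.15)."""
    return E / photon_energy(wavelength_microns)

def photons_from_power(P, T, wavelength_microns):
    """Photon number M for a power P (erg/sec) acting for T seconds."""
    return photons_from_energy(P * T, wavelength_microns)

def photons_from_irradiation(H, A, T, wavelength_microns):
    """Photon number M for an irradiation H (erg/sec/cm^2) on an area A (cm^2)."""
    return photons_from_energy(H * A * T, wavelength_microns)

def contrast(M_s, M_a):
    """Contrast of the signal, Eq. (3.17)."""
    return M_s / M_a

# Hypothetical example values.
wavelength = 0.5      # microns
A, T = 0.01, 0.2      # sensitive area in cm^2, integration time in seconds
H_a, H_s = 1.0e-4, 1.0e-6   # ambient and signal irradiation, erg/sec/cm^2

M_a = photons_from_irradiation(H_a, A, T, wavelength)
M_s = photons_from_irradiation(H_s, A, T, wavelength)
C = contrast(M_s, M_a)
print(M_a, M_s, C)
```

Because signal and ambient radiation are assumed to have the same wavelength, the contrast is the same whether it is formed from photon numbers, energies, powers, or irradiations.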
In terms of the contrast C, the DQE may be written in the following four different forms: (3.18) (3.19) (3.20) (3.21) In terms of noise equivalent parameters, the DQE may be written in the following four different forms: (3.22) (3.23) (3.24) (3.25) Other forms are also possible, but these eight forms are more than sufficient for the purposes of this article. For electrical detectors, it is convenient to transform the expression so that it involves a frequency bandwidth $\Delta f$ rather than an integration time T. Since the relation between these two parameters is $2T\Delta f = 1$, Eq. (3.24) becomes

$$Q_D = 2\mathcal{E}P_a\Delta f/P_N^2 = 2\mathcal{E}P_aD^2\Delta f \tag{3.26}$$
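One consistent reading of the noise equivalent forms can be sketched in Python. The sketch rests on assumptions that should be stated plainly: setting the measured signal-to-noise ratio equal to unity in Eq. (3.10) gives $Q_D = M_a/M_{sN}^2$, and the energy and power forms below are obtained by substituting Eq. (3.15) into that expression; they are assumed here to correspond to the forms referred to in the text, and the function names are hypothetical. Only the bandwidth form is taken directly from Eq. (3.26).

```python
def dqe_from_photons(M_a, M_sN):
    """Q_D from photon numbers: Eq. (3.10) with a measured S/N of unity."""
    return M_a / M_sN ** 2

def dqe_from_energy(eps, E_a, E_N):
    """Assumed energy form, from M = E / eps."""
    return eps * E_a / E_N ** 2

def dqe_from_power(eps, P_a, P_N, T):
    """Assumed power form, from M = P * T / eps."""
    return eps * P_a / (T * P_N ** 2)

def dqe_from_bandwidth(eps, P_a, P_N, delta_f):
    """Eq. (3.26), using 2 * T * delta_f = 1."""
    return 2.0 * eps * P_a * delta_f / P_N ** 2

# Hypothetical example: an ideal detector would need 1000 signal photons
# for S/N = 1 against this ambient level; the measured detector needs 5000.
eps = 1.9857e-12 / 0.5     # photon energy at 0.5 micron, ergs, Eq. (3.16)
T = 0.1                    # integration time, seconds
M_a, M_sN = 1.0e6, 5.0e3

E_a, E_N = eps * M_a, eps * M_sN   # ambient and noise equivalent energy
P_a, P_N = E_a / T, E_N / T        # ambient and noise equivalent power
delta_f = 1.0 / (2.0 * T)          # equivalent bandwidth

q_values = [
    dqe_from_photons(M_a, M_sN),
    dqe_from_energy(eps, E_a, E_N),
    dqe_from_power(eps, P_a, P_N, T),
    dqe_from_bandwidth(eps, P_a, P_N, delta_f),
]
print(q_values)
```

All four expressions return the same value for this example, which is the consistency one expects if the forms differ only by the substitutions of Eq. (3.15) and by $2T\Delta f = 1$.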
where $D = 1/P_N$ is the detectivity. This is the form of the expression for the detective quantum efficiency for detectors with an electrical output. The detective quantum efficiency may be defined in still another way, as the ratio of the mean-square fluctuation in the incident power to the square of the measured noise equivalent power of the detector:

$$Q_D = \left\langle (p_a - P_a)^2 \right\rangle / P_N^2$$

In this form, it is perhaps most clear that the reciprocal of $Q_D$ is a kind
of noise figure. $1/Q_D$ differs from the usual noise figure, however, in that the reference noise is radiation noise, that is, photon noise, whereas the reference noise of the ordinary noise figure is Johnson noise. Photon noise and Johnson noise differ in concept and in amount. (In the special case of the radio antenna, they become identical in concept.)

The detective quantum efficiency has been defined only for nearly monochromatic radiation. Both the signal radiation and the ambient radiation must be nearly monochromatic and of the same wavelength. This is not essential to the definition of the detective quantum efficiency; the ambient radiation could be permitted to have an arbitrary spectrum. But this degree of freedom would greatly complicate the expressions and the concepts, and all in all it does seem preferable to restrict the ambient radiation to being nearly monochromatic and of the same wavelength as the signal. A slight departure from this principle is made only in Sec. X, in connection with heat detectors.

Several persons have suggested that the word “quantum” be removed from detective quantum efficiency so that it becomes “detective efficiency.” The concept behind the suggestion is that the word “quantum” implies that the concept is applicable only to “quantum” detectors, by which is meant detectors like photoemissive and photoconductive detectors, as distinct from detectors like thermocouples and radio antennas. Actually, the concept of a “quantum” detector cannot be made rigorous except by enumeration. I feel that the advantage of retaining the close relation of the concept to the responsive quantum efficiency makes it desirable to retain the word “quantum,” but the reader should be prepared to find other writers using the phrase “detective efficiency.”

The remainder of this section may be omitted by those who are satisfied with the definitions given so far.
For those, however, who are uncomfortable about the lack of rigor, we now present a more rigorous discussion.
C. A More Rigorous Discussion

In this part we derive the expression for the DQE of a detector with more attention to some kinds of details than in the preceding part. The same order of discussion will be employed: the ideal detector, the definition of the DQE, and the quasi-ideal detector.

1. The Ideal Detector. The results of Lewis (10), based on the quantum statistics of a unidirectional stream of photons, state that the mean-square fluctuation in the power incident on the surface of a detector is given by

$$\Delta P^2 = 2\mathcal{E}P_\nu\Omega\, d\nu\, df \tag{3.27}$$
where $\Delta P^2$ is the mean-square fluctuation of the power in the radiation frequency band of width $d\nu$, and in the fluctuation frequency band of width
$df$. $\mathcal{E}$ is the energy $h\nu$ of a photon. $P_\nu$ is the power per unit radiation frequency. $\Omega$ is called the coherence factor and is defined by

$$\Omega = \left(1 - e^{-h\nu/kT}\right)^{-1} \tag{3.28}$$
where $\nu$ is the radiation frequency and T is the thermodynamic temperature of the radiation. (For a detailed discussion of the significance of the radiation temperature T, see Planck's book, ref. 13, or an article, ref. 14, by the writer.) If the radiation reaches the detector from a blackbody of absolute temperature T and if the radiation suffers neither absorption nor scattering in its path from the source to the detector, then the temperature of the radiation is equal to the temperature of the source.

The factor $\Omega$ is a measure of the degree to which the photons are clumped in the radiation from a thermal source. Photons are Bose particles, and Bose particles in thermal equilibrium have a very fundamental tendency to clump. The average occupation number of the quantum states of the radiation field is equal to $\Omega - 1$.

A few numerical values follow. At a radiation wavelength of 10 μ and a temperature T = 300° K, $\Omega$ has the value 1.007. $\Omega$ has the same value for λ = 1.0 μ and T = 3000° K. For both of these cases, $h\nu/kT$ has the value 5. But for λ = 10 μ and T = 3000° K, $h\nu/kT$ is about 0.5, and $\Omega$ is therefore about 2.5. Thus, for high temperatures and long wavelengths, the factor $\Omega$ departs significantly from unity. $\Omega$ reaches very large values in the radio region. At the wavelength of 1 meter and T = 300° K, $h\nu/kT$ is about $5 \times 10^{-5}$, and $\Omega$ has a value of about 20,000. The large value means that the photons in thermal radiation at radio wavelengths are very highly clumped. The importance of the factor $\Omega$ for the detailed understanding of the ideal heat detector is discussed in Sec. II of ref. 2.

The coherence factor $\Omega$ may also be expressed in terms of the spectral radiance of the radiation:

$$\Omega = 1 + c^2N_\nu/2h\nu^3 \tag{3.29}$$
where $N_\nu$ is the spectral radiance of the radiation (assumed to be unpolarized). For a given geometry of the radiation incident on the detector the spectral radiance is proportional to the power $P_\nu$, so that one may also write
$$\Omega = 1 + \text{constant} \cdot P_\nu \tag{3.30}$$
where the "constant" is independent of the radiation temperature $T$ and of the power $P_\nu$. The total ambient power is given by
$$P_a = \int_0^\infty P_\nu(\nu)\,d\nu \tag{3.31}$$
and the total fluctuation of the power in the fluctuation bandwidth $df$ is given by
$$\overline{\Delta P^2} = 2\,df \int_0^\infty h\nu\,P_\nu(\nu)\,\Omega(\nu, T(\nu))\,d\nu \tag{3.32}$$
where the symbolism $\Omega(\nu, T(\nu))$ is intended to indicate that the radiation temperature $T$ of the radiation depends only on the radiation frequency $\nu$. ($T$ could also depend on the direction of incidence of the radiation and could differ for the two opposite states of polarization (14), but no useful purpose would be served by introducing this degree of generality in the present discussion.)
We now define the ideal radiation detector as the combination of a detector proper and an ideal amplifier of gain $G$. The gain $G$ is supposed to depend on the frequency $f$ of the modulation of the incident power. The detector proper causes one electron to flow in the electrical output circuit for each photon that is incident on the detector. The combination, which we call the ideal detector, thus converts changes in incident power into changes in current in accordance with the transfer ratio $I/P = eG(f)/\mathcal{E}$, where $e$ is the charge of the electron. It then follows that the mean-square fluctuation in the current in the output of the ideal detector is given by
$$\overline{\Delta I^2} = 2 \int_0^\infty (e^2/h\nu)\,P_\nu(\nu)\,\Omega(\nu, T(\nu))\,d\nu \int_0^\infty G^2(f)\,df \tag{3.33}$$
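The numerical values of the coherence factor quoted earlier in this section can be checked directly from Eqs. (3.28) and (3.29). The following Python sketch is an illustration only and is not part of the original article. It assumes a blackbody source and modern (2019 SI) values of the physical constants, and the function names are hypothetical. With these constants, $h\nu/kT$ for 10 $\mu$ and 300° K is about 4.8 rather than the rounded value 5, so the computed coherence factor comes out near 1.008.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K = 1.380649e-23     # Boltzmann constant, J/K


def coherence_factor(wavelength_m, temp_k):
    """Coherence factor of Eq. (3.28): 1 / (1 - exp(-h nu / k T))."""
    x = H * C / (wavelength_m * K * temp_k)   # h nu / k T
    return 1.0 / (-math.expm1(-x))            # -expm1(-x) = 1 - exp(-x)


def planck_radiance(wavelength_m, temp_k):
    """Unpolarized blackbody spectral radiance per unit frequency."""
    nu = C / wavelength_m
    x = H * nu / (K * temp_k)
    return (2.0 * H * nu**3 / C**2) / math.expm1(x)


def coherence_factor_from_radiance(wavelength_m, radiance):
    """Coherence factor of Eq. (3.29): 1 + c^2 N_nu / (2 h nu^3)."""
    nu = C / wavelength_m
    return 1.0 + C**2 * radiance / (2.0 * H * nu**3)


# Wavelength (m) and temperature (K) pairs discussed in the text.
for lam, temp in [(10e-6, 300.0), (1e-6, 3000.0), (10e-6, 3000.0), (1.0, 300.0)]:
    omega = coherence_factor(lam, temp)
    omega_n = coherence_factor_from_radiance(lam, planck_radiance(lam, temp))
    print(f"lambda = {lam:g} m, T = {temp:g} K: "
          f"Omega = {omega:.4g} from (3.28), {omega_n:.4g} from (3.29)")
```

Both routes give the same number for blackbody radiation, which is a quick consistency check on the two forms of the definition.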
We now suppose that the ambient power described by $P_\nu$ is confined to a narrow band of radiation frequencies centered at the frequency $\nu_0$. Then the last expression may be written in the compact form
$$\overline{\Delta I^2} = (2e^2 G_m^2/h\nu_0)\,P_a\,\Omega\,\Delta f \tag{3.34}$$
where $\Delta f$ and $\Omega$ are defined by
$$\Delta f = \frac{1}{G_m^2} \int_0^\infty G^2(f)\,df \tag{3.35}$$
$$\Omega = \frac{1}{P_a} \int_0^\infty P_\nu(\nu)\,\Omega(\nu, T(\nu))\,d\nu \tag{3.36}$$
and where $G_m$ is the maximum value of $G(f)$.
We now consider the response to a radiation signal. Let $P_s$ be the power of a radiation signal that is modulated sinusoidally at the frequency $f_0$. More precisely, $P_s$ is the rms value of the deviation of the instantaneous power from its mean value. Like the ambient power, the signal power $P_s$ is supposed to be confined to a narrow band of radiation frequencies centered at the frequency $\nu_0$. The rms current output due to the radiation signal is then given by
$$I_s = eG(f_0)\,P_s/h\nu_0 \tag{3.37}$$
The noise equivalent value $P_N$ of the radiation power $P_s$ is obtained by setting $I_s^2$ equal to $\overline{\Delta I^2}$. One thus finds
$$P_N^2 = 2\mathcal{E} P_a\,\Omega\,\Delta f\,[G_m^2/G^2(f_0)] \tag{3.38}$$
where $\mathcal{E}$ now denotes $h\nu_0$. Finally, if the frequency $f_0$ of the radiation signal is equal to the frequency that maximizes $G(f)$, the square bracket is unity and the last equation becomes
$$P_N^2 = 2\mathcal{E} P_a\,\Omega\,\Delta f \tag{3.39}$$
This is the desired expression for the noise equivalent power of an ideal detector. The last three factors are defined by Eqs. (3.31), (3.36), and (3.35).
2. Definition of the Detective Quantum Efficiency. We are now ready to define the detective quantum efficiency $Q_D$. We suppose that the noise equivalent power is measured with a given amount of ambient radiation incident on the detector. The measured value is denoted $P_{Nm}$ and its reciprocal, the measured detectivity, is denoted $\mathcal{D}_m$. Then the DQE is defined by
$$Q_D = (P_N/P_{Nm})^2 \tag{3.40}$$
This relation may be written in two other ways:
$$Q_D = \mathcal{D}_m^2 P_N^2 = 2\mathcal{E} P_a\,\mathcal{D}_m^2\,\Omega\,\Delta f \tag{3.41}$$
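As an illustration of how Eqs. (3.39) to (3.41) are used together, the following Python sketch computes the noise equivalent power of the ideal detector and then the DQE of a detector whose noise equivalent power has been measured. The operating point (wavelength, ambient power, bandwidth, measured value) is hypothetical and is not taken from the article; the function names are likewise only illustrative.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s


def ideal_nep(wavelength_m, ambient_power_w, omega, bandwidth_hz):
    """Noise equivalent power of the ideal detector, Eq. (3.39), in watts."""
    photon_energy = H * C / wavelength_m      # script E = h nu_0, in joules
    return math.sqrt(2.0 * photon_energy * ambient_power_w * omega * bandwidth_hz)


def detective_quantum_efficiency(measured_nep_w, wavelength_m, ambient_power_w,
                                 omega, bandwidth_hz):
    """DQE of Eq. (3.40): the square of (ideal NEP / measured NEP)."""
    p_n = ideal_nep(wavelength_m, ambient_power_w, omega, bandwidth_hz)
    return (p_n / measured_nep_w) ** 2


# Hypothetical operating point: 0.5 micron radiation, 1 nW of ambient power,
# coherence factor of unity, 1 Hz bandwidth, measured NEP of 1e-13 W.
p_n = ideal_nep(0.5e-6, 1e-9, 1.0, 1.0)
q_d = detective_quantum_efficiency(1e-13, 0.5e-6, 1e-9, 1.0, 1.0)
print(f"ideal NEP = {p_n:.3e} W, DQE = {q_d:.3f}")
```

For this made-up case the ideal noise equivalent power is a few times $10^{-14}$ watt and the DQE is about 8 per cent.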
There is a hidden subtlety, however, in the result just given, that relates to the factor $\Omega$. This subtlety is discussed in connection with the quasi-ideal detector, in Sec. 3 below. Except for the factor $\Omega$, the expression just found for the detective quantum efficiency is formally the same as that given by (3.26). There are real differences, however. Perhaps the most important difference is that (3.41) has been established for a detector of arbitrary frequency response; the bandwidth $\Delta f$ has a precise definition (3.35) in terms of the frequency response of the detector. The derivation of (3.41) has made clear that it holds only for narrow radiation frequency bandwidths, and thus the detective quantum efficiency defined by (3.41) may be and should be considered to depend on the wavelength of the radiation used to measure it.
3. The Quasi-Ideal Detector. Just as in Sec. B, we here consider a quasi-ideal detector defined as a detector that is ideal in every respect except that it makes use of only a fraction $F$ of the incident photons, instead of making use of all of them. Also as in Sec. B, the purpose of this Sec. 3 is primarily to justify the definition of the detective quantum efficiency as the square of the ratio of the two signal-to-noise ratios. Still another purpose is to bring out an ambiguity in the definition of the quasi-ideal detector that has an important
bearing on the suitability of the definition of detective quantum efficiency given in Sec. 2 above. In order to find the noise equivalent power $P_N$ of the quasi-ideal detector we must replace $P_\nu$, $P_a$, and $P_s$ in Sec. III,C,2 by $FP_\nu$, $FP_a$, and $FP_s$. We must also consider how the introduction of the factor $F$ influences the quantity $\Omega$. This is a subtle question. We assume that the quasi-ideal detector is the combination of an ideal detector and a filter of transmittance $F$. Suppose (first supposition) that the filter is a homogeneous layer of absorbing material. Then the filter reduces the spectral radiance of the ambient radiation, and the proper expression for the coherence factor of the radiation that has passed through the filter is
$$\Omega_F = 1 + c^2 F N_\nu / 2h\nu^3 \tag{3.42}$$
where $N_\nu$ is the spectral radiance of the incident ambient radiation. But now suppose alternatively (second supposition) that the filter is a wire screen that blocks off a fraction $1 - F$ of the cross section of the incident radiation. Then the spectral radiance is unchanged by the filter, and the coherence factor is as given by Eq. (3.29). On the basis of the first supposition, the noise equivalent power of the quasi-ideal detector is given by
$$P_N^2 = 2\mathcal{E} P_a\,\Omega_F\,\Delta f / F \tag{3.43}$$
whereas on the second supposition it is given by
$$P_N^2 = 2\mathcal{E} P_a\,\Omega\,\Delta f / F \tag{3.44}$$
Then on the basis of the first supposition the DQE of the quasi-ideal detector is
$$Q_D = F\Omega/\Omega_F \tag{3.45}$$
whereas on the second supposition it is given by
$$Q_D = F \tag{3.46}$$
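The size of the discrepancy between the two suppositions can be seen numerically. The Python sketch below is an illustration only, with hypothetical function names; it uses the relation $\Omega - 1 = c^2 N_\nu / 2h\nu^3$ from Eq. (3.29) to write the filtered coherence factor of Eq. (3.42) in terms of $\Omega$ and $F$.

```python
def dqe_absorbing_filter(f, omega):
    """First supposition: the filter scales the spectral radiance by f."""
    omega_f = 1.0 + f * (omega - 1.0)   # Eq. (3.42), since Omega - 1 = c^2 N_nu / 2 h nu^3
    return f * omega / omega_f          # Eq. (3.45)


def dqe_wire_screen(f, omega):
    """Second supposition: the radiance is unchanged, so the DQE equals f, Eq. (3.46)."""
    return f


# Compare the two suppositions for a filter transmittance of 10 per cent.
for omega in (1.007, 2.5, 100.0):
    q1 = dqe_absorbing_filter(0.1, omega)
    q2 = dqe_wire_screen(0.1, omega)
    print(f"Omega = {omega:g}: first supposition {q1:.4f}, second supposition {q2:.4f}")
```

When the coherence factor is close to unity the two results are nearly equal, so the ambiguity matters only when the photons are appreciably clumped.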
Thus, the success of our aim to show that the detective quantum efficiency of a quasi-ideal detector is equal to $F$ turns out to depend on how we imagine our quasi-ideal detector to be constructed. This difficulty is closely related to a current controversy. Both Fellgett (15) and this writer (16) have considered the ultimate performance of heat detectors as given by a statistical mechanical argument, and (probably incorrectly) we both indicated that these results were very general and should hold for all radiation detectors, including photoemissive tubes. In a recent article Hanbury-Brown and Twiss (17) disagreed with our conclusions and indicated a different conclusion for photoemissive tubes.
Now the difference between our two different conclusions is just the difference between the two kinds of filters indicated above. Fellgett and the writer have shown that the noise equivalent power of an "ideal" heat detector with emissivity $\epsilon$ is given by Eq. (3.44) with $F$ set equal to $\epsilon$. Hanbury-Brown and Twiss find that the noise equivalent energy of a photoemissive surface with responsive quantum efficiency $q$ is given by Eq. (3.43) with $F$ set equal to $q$. Thus, the Fellgett-Jones result accords with the second supposition, whereas the Hanbury-Brown and Twiss result accords with the first supposition.
With hindsight we can see that these results are reasonable. In the case of a heat detector, the temperature of the detecting element is necessarily in equilibrium with its surroundings; the chief ambient radiation is the blackbody radiation appropriate to the temperature of the environment. If now we imagine placing a filter over a heat detector of unit emissivity, it is clear that the filter must have the same temperature as the detector, since otherwise it would change drastically the operating temperature of the detector. But if the filter has the same temperature as the detector, then its interposition does not change the ambient radiation that falls on the detector: the filter radiates just as much radiation as it absorbs. Thus, it is clear that the filter does not change the spectral radiance of the ambient radiation, and therefore only the second supposition is valid.
But the situation is different with a photoemissive tube. Here the ambient radiation is not blackbody radiation of the temperature of the photoemissive surface; the phototube does not respond significantly to such radiation. Thus, the argument just given for the second supposition does not carry through for the phototube.
My conclusion is as follows. For heat detectors, the expression (3.41) with $\Omega$ defined by (3.29) is the correct expression for the DQE. For photoemissive tubes and other detectors not in thermal equilibrium with the ambient radiation, perhaps the factor $\Omega$ should be replaced by (3.47)
We choose not to make this replacement; if we did, we would find that $Q_D$ would occur on both sides of Eq. (3.41), and we would find that $Q_D$ would be given as the positive root of a quadratic equation. We wish to avoid this complication. Furthermore, the decision to stick with Eq. (3.41) leaves us with a single formula for the DQE, albeit a somewhat arbitrary formula. (The current status of the controversy mentioned above is as follows: I prepared a report dated March 1, 1958, and circulated it to everyone I knew to be interested in the subject; I described the area of agreement, and showed just where the two groups disagreed. I emphasized that both arguments seem unassailable in their respective fields of application and
that the real problem that remained was to reconcile the two results. Dr. Hanbury-Brown replied in a letter dated April 10, 1958, agreed that there was a real problem, and indicated that he knew of two graduate students who were working on the problem, without success so far. Dr. Twiss replied in a letter dated May 13, 1958, and suggested what seems to me a convincing resolution of the problem. His solution suggests that the second supposition is correct for a detector in which the ingoing and outgoing radiation fluxes are equal, whereas the first supposition holds when these fluxes are very unequal. His result also includes the intermediate case where the fluxes have an arbitrary ratio.)*
IV. DETECTIVITY AND CONTRAST DETECTIVITY
A. Introduction
In this section we shall discuss the relation between the detective quantum efficiency and the other two useful approaches to detector performance: detectivity and contrast detectivity. We shall also show that some detectors have a restricted range of useful ambient radiation. For example, if the exposure of a photographic negative is less than a certain amount, the DQE can actually be increased by preexposing the film. On the other hand, with some detectors such as the vidicon, photographic negatives, and (probably) human vision, there is a critical value of background radiation above which the DQE can actually be increased by covering the detector with a filter that attenuates both the signal radiation and the ambient radiation. It will be shown that these two limits, the lower and the upper limit on the ambient radiation, correspond to the values of the ambient radiation that maximize, respectively, the detectivity and the contrast detectivity.
B. Some Definitions
For reference, we repeat here the definition of the detective quantum efficiency:
$$Q_D = \frac{(S/N)^2}{M_s^2/M_a} \tag{4.1}$$
The detectivity $\mathcal{D}$ for the purposes of this section is now defined as the reciprocal of the noise equivalent number of signal photons:
* Note added in proof: This resolution of the controversy is described in a manuscript “Fluctuations in photon streams,” by P. B. Fellgett, R. Clark Jones, and R. Q. Twiss, which has been submitted to Nature.
The contrast detectivity $\mathcal{D}_c$ is the noise equivalent value of the ratio $M_a/M_s$:
Thus, if, for example, the noise equivalent contrast is 0.8%, the contrast detectivity is 125. It is immediately obvious that $Q_D$ is equal to the product of $\mathcal{D}$ and $\mathcal{D}_c$:
$$Q_D = \mathcal{D}\mathcal{D}_c \tag{4.4}$$
[Figure 3: curve plotted against log $M_a$, with tangent lines of slope +1 and -1.]
FIG. 3. A schematic plot showing the detective quantum efficiency $Q_D$ plotted against the amount of the irradiation of the detecting surface. If the irradiation is less than the amount at the point A (where the curve has a slope of +1), the detector is underloaded, and the DQE can be increased by adding additional ambient radiation. If the ambient irradiation is more than that of the point C (where the curve has a slope of -1), the detector is overloaded, and the DQE can be increased by placing a neutral filter over the detector. All detectors can be overloaded, but only some detectors can be underloaded (see Sec. IV for a full discussion).
C. Properties of the $Q_D$-vs-$M_a$ Curve
Consider the imaginary detector whose detective quantum efficiency $Q_D$ is plotted versus the ambient photon number $M_a$ with logarithmic coordinates in Fig. 3, and consider further the two straight lines with slopes of plus and minus one that are tangent to the curve (at the points A and C). It is not meant to imply that all detectors have a curve with slopes of both +1 and -1. A multiplier phototube has no point A, and furthermore, it
probably has no point C until the radiation is so intense that it heats up the photocathode. Photographic negatives always have points A and C. The point B represents the maximum value of the DQE with respect to variation of the ambient radiation $M_a$. It will now be shown that the points A and C correspond to the ambients that maximize $\mathcal{D}$ and $\mathcal{D}_c$, respectively. In general the slope of the curve in Fig. 3 is given by
$$\frac{d \log Q_D}{d \log M_a} = \frac{M_a}{Q_D}\frac{dQ_D}{dM_a} \tag{4.5}$$
At the point A, this slope is +1. If one sets the above expression equal to +1, and performs a little reduction, one finds
$$\frac{d\mathcal{D}}{dM_a} = 0 \tag{4.6}$$
Similarly, at the point C, the slope given by (4.5) is -1. One finds
$$\frac{d\mathcal{D}_c}{dM_a} = 0 \tag{4.7}$$
The last two equations are, of course, the formal conditions that $\mathcal{D}$ and $\mathcal{D}_c$, respectively, be stationary with respect to variation of $M_a$.
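The correspondence between the tangent points and the two maxima can be verified numerically. The Python sketch below is an illustration only: the DQE curve is a made-up parabola in logarithmic coordinates, not data for any real detector. It assumes, from the definitions of Sec. B and Eq. (4.4), that the contrast detectivity equals the ambient photon number times the detectivity, so that the DQE is the ambient photon number times the square of the detectivity.

```python
import math


def dqe_curve(m_a):
    """Hypothetical DQE curve: a parabola in log-log coordinates, peak of 1% at M_a = 1e6."""
    u = math.log10(m_a) - 6.0
    return 0.01 * 10.0 ** (-0.5 * u * u)


def detectivity(m_a):
    """D, assuming Q_D = D * D_c with D_c = M_a * D, so that Q_D = M_a * D**2."""
    return math.sqrt(dqe_curve(m_a) / m_a)


def contrast_detectivity(m_a):
    """D_c under the same assumption, so that Q_D = D_c**2 / M_a."""
    return math.sqrt(dqe_curve(m_a) * m_a)


def log_slope(m_a, eps=1e-4):
    """Central-difference estimate of d(log Q_D) / d(log M_a)."""
    upper = math.log10(dqe_curve(m_a * 10.0 ** eps))
    lower = math.log10(dqe_curve(m_a * 10.0 ** (-eps)))
    return (upper - lower) / (2.0 * eps)


# Scan the ambient photon number from 1e3 to 1e9 in steps of 0.01 decade.
grid = [10.0 ** (3.0 + 0.01 * i) for i in range(601)]
point_a = max(grid, key=detectivity)            # ambient that maximizes D
point_c = max(grid, key=contrast_detectivity)   # ambient that maximizes D_c

print(f"A: M_a = {point_a:.3g}, slope = {log_slope(point_a):+.3f}")
print(f"C: M_a = {point_c:.3g}, slope = {log_slope(point_c):+.3f}")
```

For this curve the detectivity peaks one decade below the DQE maximum, where the slope is +1, and the contrast detectivity peaks one decade above it, where the slope is -1.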
D. Increasing $Q_D$ by a Local Source of Radiation
We shall show that if the ambient radiation is less than the amount that corresponds to the point A in Fig. 3, the DQE of the detector can be increased by deliberately increasing the steady radiation that falls on the detector. This can be done, for example, by letting a local source of steady radiation act on the detector. Consider a point (on the curve in Fig. 3) that is to the left of point A, such as the point D. We shall show that the effective value of the DQE can be raised vertically from D to the point E if, without changing the number of signal photons, the number of ambient photons is increased (by a local source) so that the total number of ambient photons is the same as at the point A. In this Sec. D and also in Sec. E, we shall simplify the calculations as much as possible by supposing the number of signal photons in the original signal to be a given fixed number. Subscripts will be added to $M_a$ to indicate the point to which the value of $M_a$ refers. In the argument to be given, we shall throughout use the definition (4.1) of the DQE: the ratio of the signal-to-noise ratio squared $(S/N)^2$ in the detector output to the square of the signal-to-noise ratio, $M_s^2/M_a$, in the radiation input to the detector. At the point A, the DQE is the ratio of the output signal-to-noise squared $(S/N)_A^2$ to the input signal-to-noise squared, $M_s^2/M_{aA}$:
$$Q_A = \frac{(S/N)_A^2}{M_s^2/M_{aA}} \tag{4.8}$$
To compute the value of the DQE at the point D or E, we note that the input signal-to-noise squared is $M_s^2/M_{aD}$. Furthermore, if the ambient number of photons is increased by the local source so that the number of photons is the same as at A, both the number of signal photons and the number of ambient photons are the same as at A, and therefore the output signal-to-noise squared is the same as at A. Thus, the value of the DQE with the local source is
$$Q_{LS} = \frac{(S/N)_A^2}{M_s^2/M_{aD}} \tag{4.9}$$
By eliminating $M_s$ between the last two equations, one finds
$$Q_{LS} = (M_{aD}/M_{aA})\,Q_A \tag{4.10}$$
In words, the value of the DQE with the local source is the value at A multiplied by the ratio $M_{aD}/M_{aA}$, whence by simple geometrical reasoning it follows that $Q_{LS}$ is the value at the intersection of the line with slope +1 and the vertical line through D; that is to say, $Q_{LS}$ is the value of the DQE at the point E. This is what we set out to prove.
E. Increasing $Q_D$ by a Neutral Filter
We shall show that if the ambient radiation is greater than the amount corresponding to the point C, the DQE of the detector can be increased by placing a neutral filter over the detector. The filter attenuates the signal radiation and the ambient radiation by the same amount. Consider a point (on the curve in Fig. 3) that is to the right of point C, such as the point F. We shall show that the effective value of the DQE can be raised vertically from the point F to G if a neutral filter is placed over the detector that reduces the number of ambient photons that reach the detector to the amount at C. The value of the DQE at the point C is the ratio of the output signal-to-noise squared $(S/N)_C^2$ to the input signal-to-noise squared $M_s^2/M_{aC}$:
$$Q_C = \frac{(S/N)_C^2}{M_s^2/M_{aC}} \tag{4.11}$$
The transmittance $T$ of the filter that reduces the ambient $M_{aF}$ to $M_{aC}$ is
$$T = M_{aC}/M_{aF} \tag{4.12}$$
To compute the value of the DQE with the ambient $M_{aF}$ and with the filter in place, we note that the effect of the filter is to reduce the number of signal photons by the factor $T$. Thus the output signal-to-noise squared is $T^2$
times $(S/N)_C^2$. The input signal-to-noise ratio squared is $M_s^2/M_{aF}$. The value of the DQE with the filter in place is therefore
$$Q_{\mathrm{fil}} = \frac{T^2 (S/N)_C^2}{M_s^2/M_{aF}} \tag{4.13}$$
By eliminating $T$ and $M_s$ among the last three equations, one finds
$$Q_{\mathrm{fil}} = (M_{aC}/M_{aF})\,Q_C \tag{4.14}$$
In words, the value of the DQE with the filter in place is the value at C multiplied by the ratio $M_{aC}/M_{aF}$. From simple geometrical reasoning it follows that $Q_{\mathrm{fil}}$ is the value of the DQE at the intersection of the line with slope -1 and the vertical line through F; that is to say, $Q_{\mathrm{fil}}$ is the value of the DQE at the point G. This is what we desired to prove.
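The arguments of Secs. D and E can be followed step by step in a few lines of Python. This sketch is an illustration only: the function and argument names are hypothetical, and the numbers in the usage lines are made up. Each function forms the output and input signal-to-noise ratios squared, as in the text, and returns their ratio, which reduces to Eq. (4.10) or Eq. (4.14) whatever the number of signal photons.

```python
def dqe_with_local_source(q_a, ambient_a, ambient_d, m_s=1.0):
    """Effective DQE at ambient_d (below point A) when a local source raises the ambient to A."""
    snr2_out = q_a * m_s**2 / ambient_a      # output (S/N)^2 at A, from the definition of Q_A
    snr2_in = m_s**2 / ambient_d             # input (S/N)^2 of the original radiation
    return snr2_out / snr2_in                # equals (ambient_d / ambient_a) * q_a, Eq. (4.10)


def dqe_with_neutral_filter(q_c, ambient_c, ambient_f, m_s=1.0):
    """Effective DQE at ambient_f (above point C) when a neutral filter lowers the ambient to C."""
    t = ambient_c / ambient_f                         # filter transmittance, Eq. (4.12)
    snr2_out = t**2 * (q_c * m_s**2 / ambient_c)      # T^2 times the output (S/N)^2 at C
    snr2_in = m_s**2 / ambient_f                      # input (S/N)^2 ahead of the filter
    return snr2_out / snr2_in                         # equals (ambient_c / ambient_f) * q_c, Eq. (4.14)


# Made-up example: Q_A = Q_C = 1%, ambient one decade outside the useful range.
print(dqe_with_local_source(0.01, 1e5, 1e4))     # underloaded detector with local source
print(dqe_with_neutral_filter(0.01, 1e7, 1e8))   # overloaded detector with neutral filter
```

In both cases the effective DQE falls off only in proportion to the ambient ratio, which is the straight line of slope +1 or -1 in Fig. 3.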
F. The Useful Range of a Detector; Underloading and Overloading
It is a simple consequence of the results established in the preceding two parts that the useful range of the detector itself is limited to the range of ambients lying between A and C. If the ambient radiation is greater than the amount at C, the detector is "overloaded": the performance of the detector can be improved by placing over the detector a filter that reduces the ambient radiation to the value that maximizes the contrast detectivity. If, on the other hand, the ambient radiation is less than that at A, the detector is "underloaded": the performance of the detector can be improved by deliberately increasing the ambient radiation by a local source to the value that maximizes the energy detectivity. All detectors without exception can be overloaded: all detectors can be destroyed by sufficiently intense radiation. But not all detectors can be underloaded. Human vision and photoemissive tubes cannot be underloaded. But photographic negatives can always be underloaded.
G. Method of Comparing Television Camera Tubes with Photographic Films
An underloaded detector may show strikingly poor performance. A good example is presented by the series of pictures shown in Fig. 4, reproduced with permission of Rose, Weimer, and Law (18). These pictures compare the performance of the image orthicon and Super-XX film. (We shall show that there is a sense in which the comparison favored the image orthicon in that no effort was made to correct for the underloading of the film.) In each of the four pictures shown in Fig. 4, the model is on the right, and an image, picked up by an image orthicon and shown on the screen of a television receiver, is shown on the left. The four pictures cover a hundredfold change in the illumination of the model. The photographs in Fig. 4 were all taken with a 35-mm still camera, with Super-XX film and
with an exposure duration of 1/30 sec. Both the television camera and the 35-mm camera used an f/2 lens. Under these conditions both the image orthicon and the film receive approximately equal numbers of signal photons and equal numbers of ambient photons in the 1/30-sec frame time; and, therefore, the signal-to-noise ratios in the images were proportional to the square roots of the detective quantum efficiencies. The four pictures show that the image orthicon continues to provide a clear signal even when the photographic system fails to provide any image at all. Now this is precisely what we would expect from the results shown in Fig. 1, unless measures are taken to correct the underloading of the film.
[Figure 4: four photographs, labeled with luminances of 0.02, 0.07, 0.2, and 2 footlamberts.]
FIG. 4. Four pictures showing the comparison made by Rose, Weimer, and Law of the performance of image orthicons and Super-XX film. A 35-mm photographic camera and a television camera both viewed the same subject, and the pictures show the view obtained with the photographic camera. The model is on the right in each picture, and a television receiver showing the image picked up by the television camera is on the left. The two cameras used lenses of the same aperture and focal length. The luminance of the model varied over a hundredfold range and is indicated under each picture, in footlamberts. The text explains how this comparison is partial to the image orthicon, since no effort was made to correct the underloading of the film in these tests.
(Films and image orthicons have been much improved since 1946; thus, in invoking data from Fig. 1 we must rather arbitrarily select the items to be compared. We choose the Royal-X film and the 5820 image orthicon.) Figure 1 shows that although the Royal-X film has its highest DQE at about $10^{-3}$ erg/cm², the DQE has a very low value indeed at one-third of this ambient exposure. Thus, for values of the ambient exposure less than about $3 \times 10^{-4}$ erg/cm², the film gave no image at all. But at the same ambient exposure, the 5820 image orthicon has a DQE greater than 1%, and the DQE is falling with a slope that is less than +1.
Consider, however, the effect of suitable additional ambient exposure, provided, for example, by either preexposing or postexposing the film. The Royal-X curve is thereby converted to the curve shown by the dashed line. At the ambient exposure of $3 \times 10^{-4}$ erg/cm², the Royal-X film then has a DQE of about 0.3%, compared with about 1.6% for the 5820. Under these conditions, both the film and the image orthicon would provide a substantial signal-to-noise ratio, the S/N for the image orthicon being 2.3 times the S/N of the film. For lower ambient exposures, the two curves both decrease, with the separation between them increasing slightly and with an asymptotic separation by the factor ten, so that the image orthicon provides about three times the signal-to-noise ratio of the film.
If we are to obtain a good image at low ambient exposures, the film must be suitably preexposed, and the resulting negative must be printed with higher than normal contrast. It may at first seem that these measures are more complex than those used with an image orthicon at low light levels, but it should be recalled that a reduction of ambient light requires adjustments of the image orthicon also; the beam current must be reduced, the video gain must be increased, and it may be necessary to retrim the shading adjustments.
But none of these considerations is intended to obscure the fact that the curve for the 5820 image orthicon shown in Fig. 1 is for all ambients well above the curve for Royal-X film. In summary, this section has called attention to the serious consequences of underloading photographic films. When the underloading is left uncorrected, dramatic demonstrations of the poor performance of photographic films are possible: Fig. 4, for example.
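The signal-to-noise figures quoted in this comparison follow from the statement that, for equal numbers of signal and ambient photons, the signal-to-noise ratio is proportional to the square root of the DQE. A minimal Python check, with a hypothetical function name, is:

```python
import math


def snr_ratio(dqe_1, dqe_2):
    """Ratio of the signal-to-noise ratios of two detectors that receive the same radiation."""
    return math.sqrt(dqe_1 / dqe_2)


# DQE of about 1.6% (5820 image orthicon) against about 0.3% (preexposed Royal-X film).
print(f"{snr_ratio(0.016, 0.003):.2f}")
# Asymptotic separation of the two DQE curves by a factor of ten.
print(f"{snr_ratio(10.0, 1.0):.2f}")
```

The two results, roughly 2.3 and 3.2, agree with the factors of 2.3 and "about three" given in the text.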
V. PHOTOEMISSIVE TUBES
A. Introduction
Of all radiation detectors, photoemissive tubes are the most simple to discuss in terms of quantum efficiency. The specification of the responsive quantum efficiency of the photocathode is an almost complete specification of the performance in the presence of ambient radiation.
The discovery of photoemissive surfaces that have substantial quantum efficiency in the visible region of the spectrum is a fascinating story. The history is described by Sommer (19) and by Zworykin and Ramberg (20). The investigation of the photoelectric properties of bulk metals was followed by studies of thin films of alkali metals by Ives (21), Campbell (22), and K. T. Bainbridge. These investigations led to the development of three photosurfaces that have been of outstanding importance during the last twenty to thirty years: the silver-cesium-oxygen surface (used for the S-1 response) developed by Koller (23) in 1928, the cesium-antimony surface (used for the S-4 response) developed by Gorlich (24) in 1935, and the bismuth-silver-cesium-oxygen surface (widely used for television camera tubes) developed by Sommer (25, 26) in 1939. The Ag-Cs-O surface is outstanding in its unique response in the infrared, the Cs-Sb surface for its high quantum efficiency in the blue, and the Ag-Bi-O-Cs surface for its moderately high quantum efficiency throughout the visible spectrum.
Several recent developments will now be described. One of the difficulties in making surfaces of high quantum efficiency is that the mean free path for the photoelectrons is shorter than the mean free path for the exciting photons. Thus, if the layer is made thick enough to absorb most of the light, many of the photoelectrons are stopped inside the layer. An ingenious way of mitigating this limitation is employed in the 7029 multiplier phototube: the cesium-antimony is deposited on top of an opaque aluminum mirror. This arrangement effectively doubles the optical thickness of the layer without increasing the path length for the photoelectrons. A markedly superior photosurface has recently been developed by Sommer (27, 28). At all wavelengths this surface has a higher responsive quantum efficiency than the antimony-cesium surface.
This new surface is variously called a multialkali surface or a trialkali surface. It employs the elements antimony, potassium, sodium, and cesium. The responsive quantum efficiency of this surface, based on data given in the RCA HB-3 Tube Handbook, is shown by curve A in Fig. 5.
The last decade has seen an impressive development in the use of scintillation counters in nuclear research. These counters involve a scintillating material, and one or more multiplier phototubes. Tubes have been developed for this application that have photocathodes of very large area, reduced dispersion of transit time, and very high gain and may deliver peak currents of many amperes. A group of articles on this subject is collected in the November, 1956, issue of Nuclear Science Transactions (29).
A special problem surrounds the use of multiplier phototubes that have Ag-Cs-O photocathodes (S-1 response). These tubes are very difficult to make, and although quite a number of such tubes have been manufactured and tested (30-32) they have shown a persistent tendency to lose infrared response during operational life.
Mr. Bennett Sherman (Farrand Optical Company) examined in 1955 a number of such tubes manufactured up to 12 months or more previously by DuMont, Farnsworth, RCA, and Cinetel and found that none of them showed the classical S-1 response; they had no response at 8,000 A (33). Dr. Gerald E. Kron (Lick Observatory) has done precision photometry and colorimetry since 1938 with Ag-Cs-O surfaces. Until 1955, he used gas-filled phototubes with an electrometer amplifier (34). Beginning in 1955, he has been using a 12-stage Lallemand multiplier phototube with a Ag-Cs-O photocathode and with solid silver-magnesium-alloy dynodes. In a letter dated October 30, 1957, Dr. Kron wrote:
Our techniques in using these photosurfaces (Ag-Cs-O) differ quite a bit from the usual, I think. First of all, we do not even turn on voltage unless the tube is refrigerated with dry ice. The room temperature dark current is so large that I believe operation even in the unilluminated condition with full cathode voltage would probably spoil a tube. Secondly, our illumination level is very low. We almost never operate with a cathode current of more than 10-l’ amp. Thirdly, we keep the cathode voltage no higher than 90 volts; this means running the first stage of a multiplier at a lower voltage than the others but it does preserve the cathode. I think it possible that the chief cause for the deterioration of the cathodes in commercial multipliers may be simply operating them without refrigeration. It may be that the normal thermal dark current, which we know to be large, is so large that it alone is enough to spoil the cathode during service periods.
Both Dr. Sommer and Dr. Engstrom (RCA) have written (35) that with Ag-Cs-O surfaces the loss of infrared response with use is usually found only in multiplier tubes and that the loss may be associated with release of oxygen from the dynodes during electron bombardment. Dr. Engstrom says further,
While we have not entirely solved the problem of loss of infrared sensitivity in the new RCA 7102, we have very much reduced it. In fact, in many applications, particularly those involving low current levels, the tube is quite satisfactory and will give reasonably long life. The improvement in this characteristic has been obtained by a very strenuous processing procedure.
B. Responsive Quantum Efficiency
If $R$ is the responsivity (in amperes per watt) of the photocathode for radiation of the wavelength $\lambda$ (in microns), the responsive quantum efficiency $Q_R$ is given by
$$Q_R = 1.2396\,R/\lambda \tag{5.1}$$
where the number 1.2396 is hc/e in appropriate units. This relation permits the immediate calculation of the RQE from data on the radiant responsivity of the photocathode, in amperes per watt.
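Equation (5.1) is simple enough to apply directly. The following Python sketch is an illustration only, with hypothetical names. It keeps the constant 1.2396 used in the text as a default; the present-day (2019 SI) constants give $hc/e$ closer to 1.2398 volt-microns, a difference that is negligible here. The sample input, 0.0027 ampere per watt at 8,000 angstroms, is chosen because it reproduces the figure of 0.42 per cent listed for the S-1 response in Table I.

```python
H = 6.62607015e-34          # Planck constant, J s
C = 2.99792458e8            # speed of light, m/s
E_CHARGE = 1.602176634e-19  # electron charge, C


def responsive_quantum_efficiency(responsivity_a_per_w, wavelength_um, hc_over_e=1.2396):
    """Eq. (5.1): Q_R = (hc/e) * R / lambda, with R in A/W and lambda in microns."""
    return hc_over_e * responsivity_a_per_w / wavelength_um


# hc/e in volt-microns from modern constants, for comparison with 1.2396.
hc_over_e_modern = H * C / E_CHARGE * 1e6

q_r = responsive_quantum_efficiency(0.0027, 0.8)
print(f"hc/e = {hc_over_e_modern:.4f} V micron, Q_R = {100 * q_r:.2f} per cent")
```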
Through the courtesy of Dr. Ralph W. Engstrom and Mr. R. G. Stoudenheimer (RCA at Lancaster), the writer has received a substantial amount of unpublished information about the characteristics of RCA phototubes. This section is confined to data on RCA phototubes. The response-vs-wavelength curves of RCA phototubes are given as one of a number of S-responses. A given S-response denotes a specific relative response-vs-wavelength curve (plotted in the RCA HB-3 Tube Handbook) and does not denote any particular kind of surface. The writer here follows this convention.

TABLE I. PROPERTIES AND NATURE OF THE PHOTOCATHODES USED IN RCA PHOTOTUBES

Designation   Cathode surface used in 1958                      Wavelength of       Median responsive
of response                                                     maximum response,   quantum efficiency,
                                                                angstroms           per cent (a)
S-1           Ag-Cs-O                                           8,000               0.42
S-3           Ag-Rb-O                                           4,200               0.57
S-4           Sb-Cs (opaque layer)                              4,000               14.0
S-5           Sb-Cs (opaque layer in 9741 glass bulb)           3,400               18.2
S-8           Bi-Cs (opaque layer)                              4,200               0.68
S-9           Sb-Cs (semitransparent)                           4,800               6.45
S-10          Ag-Bi-O-Cs (semitransparent)                      5,400               3.6
S-11          Same as for S-9 but a thinner layer               4,400               15.8
S-13          Sb-Cs (semitransparent on fused silica window)    4,400               13.2
S-17          Sb-Cs (thin layer on aluminum mirror)             4,900               21.5
S-20          Sb-K-Na-Cs                                        4,200               18.9

(a) The value tabulated is the maximum value listed in Table II.
At any one date, of course, a given kind of surface is used to obtain each of the S-responses. Table I lists for each of the current S-responses (used for photoemissive tubes) the surface that was employed in 1958, the wavelength of maximum response, and the highest value of the median RQE that is obtained with that surface. Inspection of the second column of Table I reveals the importance of Gorlich's cesium-antimony surface: six of the eleven responses were obtained with various forms of that surface in 1958. The properties of most RCA phototubes, including gas and multiplier types, are listed in Table II. Only cathode responsivities are shown. The values of the RQE listed in the last column are calculated by Eq. (5.1)
from the radiant responsivities listed in the second column. The data in Table II were assembled from a variety of sources listed in the statement at the end of the table. One should note particularly in Table II the very high responsivity in microamperes per lumen (of 2870 °K radiation) of the trialkali surface; it is two or three times the responsivity obtainable with prior surfaces. All of the RQE values given in Tables I and II are given for the (design-center) wavelength at which the responsivity has its maximum value. Since the RQE is proportional to the responsivity divided by the wavelength, the RQE will be slightly higher than the values given for a wavelength slightly shorter than the wavelength given in Table I. The only phototube for which
FIG. 5. The responsive quantum efficiency $Q_R$ of four important photosurfaces plotted versus the wavelength in angstroms. Curve A: The RQE of the 7265 multiplier tube with an S-20 response and a trialkali photocathode; Curve A also represents the RQE of the 7037 image orthicon. Curve B: The RQE of the 6810-A multiplier tube with an S-11 response and a Sb-Cs photocathode. Curve C: The RQE of the 6217 multiplier tube with an S-10 response and a Ag-Bi-O-Cs photocathode. Curve D: The RQE of the 5820 and 6849 image orthicons with Ag-Bi-O-Cs photocathodes. The figure shows clearly the markedly superior performance of the new trialkali surface in the important red region from 6,000 to 7,000 A (see also Figs. 2 and 8). The circles indicate the point where the responsivity has its maximum value. (Horizontal axis: wavelength in angstroms.)
TABLE II. RESPONSIVITY AND QUANTUM EFFICIENCY OF RCA PHOTOTUBES

                          Cathode responsivity
Tube                   Radiant,     Luminous,           Ratio,      Responsive
designation (a)        amperes      microamperes per    lumens      quantum efficiency,
                       per watt     lumen (2870 °K)     per watt    per cent

S-1 phototubes:
  1P40         G       0.0019       23*                 90          0.30
  868          G       0.0016       18*                 90          0.25
  917, 919     V       0.0018       20                  90          0.28
  918          G       0.0024       27*                 90          0.38
  920          G       0.00135      15*                 90          0.21
  921          G       0.0016       18*                 90          0.25
  922          V       0.0018       20                  90          0.28
  925          V       0.0018       20                  90          0.28
  927          G       0.0016       18*                 90          0.25
  930          G       0.0020       23*                 90          0.32
  6570         V       0.0027       30                  90          0.42
  7102         M       0.0027       30                  90          0.42
S-3 phototubes:
  1P29         G       0.0019       7*                  270         0.57
  926          V       0.0018       6.5                 280         0.53
S-4 phototubes:
  1P21         M       0.04         40                  1000        12.4
  1P37         G       0.04         40*                 1000        12.5
  1P39         V       0.045        45                  1000        14.0
  929          V       0.045        45                  1000        14.0
  931-A        M       0.03         30                  1000        9.3
  934          V       0.03         30                  1000        8.7
  5581         G       0.035        35*                 1000        11.9
  5582         G       0.03         30*                 1000        9.3
  5583         G       0.035        35*                 1000        11.9
  5584         G       0.03         30*                 1000        9.3
  5652         V       0.045        45                  1000        14.0
  6323, 6328   M       0.02         20*                 1000        6.2
  6472         M       0.02         20*                 1000        6.2
S-5 phototubes:
  1P28         M       0.05         40                  1250        18.2
  935          V       0.043        35                  1230        15.6
S-8 phototubes:
  1P22         M       0.0023       3                   768         0.68
S-9 phototubes:
  1P42         V       0.025        37                  675         6.45
S-10 phototubes:
  6217         M       0.0156       40                  390         3.6
S-11 phototubes:
  2020         M       0.04         50                  800         11.3
  5819         M       0.04         50                  800         11.3
  6199         M       0.036        45                  800         10.2
  6342         M       0.048        60                  800         13.5
  6372         M       0.027        33                  800         7.5
  6655         M       0.04         50                  800         11.3
  6810-A       M       0.056        70                  800         15.8
S-13 phototubes:
  6903         M       0.047        60                  780         13.2
S-17 phototubes:
  7029         M       0.085        125                 680         21.5
S-20 phototubes:
  7265         M       0.064        150                 426         18.9
The following three entries give maximum observed responsivities:
  (S-11) 6810-A  M     0.08         100                 800         22.5
  (S-17) 7029    M     0.122        180                 680         30.8
  (S-20) 7265    M     0.10         225                 444         29.5

(a) The letter following the tube designation indicates whether the tube is a vacuum phototube (V), a gas phototube (G), or a multiplier phototube (M). The values given for the cathode responsivity are median values, except for the last three rows, where maximum observed values are given. The values of the radiant cathode responsivity and the values of the responsive quantum efficiency are for the wavelength of maximum response given in Table I. The wavelength of maximum response is the design-center value. Most of the data are taken from the 1955 RCA Publication No. CRPD-105, "Photosensitive Devices and Cathode-Ray Tubes." Data for the 6810-A, 6903, 7029, and 7265 are from the RCA HB-3 Tube Handbook. The maximum observed values in the last three rows are given in a letter dated Sept. 2, 1958 from Dr. Ralph W. Engstrom. The cathode responsivities of the gas phototubes are not given in the publications cited; these were supplied in a letter dated May 21, 1957, from Mr. R. G. Stoudenheimer, along with the values for a few of the multiplier phototubes; values so obtained are indicated by an asterisk.
this method of calculation gives a slightly misleading result is the 6217 multiplier phototube with the S-10 response; this tube has an RQE of 3.6% at the wavelength 5,400 A, where the responsivity has its maximum value, but has an RQE of 4.7% at 3,800 A. Figure 5 shows the RQE of four photosurfaces plotted against the wavelength of the radiation. From the top down, the three solid curves are for the trialkali surface (Sb-K-Na-Cs), the Sb-Cs surface, and the Ag-Bi-O-Cs surface, as represented respectively by the 7265, 6810-A, and 6217 multiplier phototubes, and representing, respectively, the S-20, S-11, and S-10 responses; all these tubes have semitransparent photocathodes. The dashed curve represents the 5820 and 6849 image orthicons. The strikingly superior performance of the trialkali surface in the important red region from 6,000 to 7,000 A is well shown by curve A in Fig. 5. All of the results shown in the figure are based on the median responsivity as given in the RCA HB-3 Tube Handbook. One notes in Fig. 5 that all of the curves drop sharply between 4,000
and 3,000 A. A recent article by Spicer (36) indicates that this drop for the two upper curves (and probably for the other two) is due entirely to the absorption of the glass envelope. Spicer's paper contains a number of plots of the responsive quantum efficiency versus the energy of the photon for various alkali-metal-antimony photocathodes; all the curves rise toward a horizontal asymptote as the photon energy increases to the maximum measured energy of 4.5 volts (2,750 A).
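The RQE values quoted in this section can be checked numerically. The following Python sketch assumes that Eq. (5.1) has the form Q_R = 1.24 R/λ, with R in amperes per watt and λ in microns (the form implied by Eq. (6.5) with unit gain). The function name `rqe_percent` is illustrative only and does not come from the original text; the sample rows are taken from Tables I and II.

```python
def rqe_percent(responsivity_amp_per_watt, wavelength_angstrom):
    """Responsive quantum efficiency in per cent.

    Assumed form of Eq. (5.1): Q_R = 1.24 * R / lambda,
    with R in amperes per watt and lambda in microns.
    """
    wavelength_micron = wavelength_angstrom * 1e-4  # 10,000 angstroms = 1 micron
    return 100.0 * 1.24 * responsivity_amp_per_watt / wavelength_micron


# (response, median radiant responsivity in amp/watt, wavelength in angstroms)
sample_rows = [
    ("S-1", 0.0027, 8000),
    ("S-4", 0.045, 4000),
    ("S-11", 0.056, 4400),
    ("S-20", 0.064, 4200),
]

for name, responsivity, wavelength in sample_rows:
    print(name, round(rqe_percent(responsivity, wavelength), 2))
```

Under this assumption the sketch agrees, to within rounding, with the tabulated values of 0.42, 14.0, 15.8, and 18.9 per cent.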
C. Detective Quantum Efficiency

It is often possible in practice to achieve a detective quantum efficiency that is as high as 0.8 or 0.9 times the responsive quantum efficiency. The first necessary condition is that the ambient radiation be sufficient to make the photocurrent large compared with the thermionic dark current. If $I_p$ is the photocurrent and $I_d$ the dark current and if everything else is ideal, the detective quantum efficiency $Q_D$ is related to the responsive quantum efficiency $Q_R$ by

$$\frac{Q_D}{Q_R} = \frac{I_p}{I_p + I_d} \tag{5.2}$$
If one is concerned with a simple vacuum phototube (not a multiplier phototube), another necessary condition is that the shot noise of the photocurrent must be large compared with the Johnson noise of the load resistance R. In a unit frequency bandwidth, the former mean-square noise voltage is $2eI_pR^2$ and the latter is $4kTR$. If everything else is ideal, then the relation between $Q_D$ and $Q_R$ is

$$Q_D = Q_R \, \frac{eI_pR}{eI_pR + 2kT} \tag{5.3}$$
When T is 300 °K, the last equation becomes

$$\frac{Q_D}{Q_R} = \frac{E_p}{E_p + 0.0518\ \text{volt}} \tag{5.4}$$
where $E_p$ is the IR drop across the load resistor. Thus, the photocurrent must produce a drop across the load resistor that is large compared with 0.0518 volt. If the amplifier has additional noise above the noise of the load resistor, the voltage drop must be correspondingly larger. When the photocurrents are very small, as in astronomical work, the tubes must be refrigerated to reduce the thermionic dark current and very large load resistances must be used. Kron (34) in his work with a simple S-1 phototube used load resistances as large as $2.5 \times 10^{13}$ ohms. A very good answer to the problem of uncomfortably large load resistances is the multiplier phototube and, to a lesser extent, the gas phototube.
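Equations (5.2) through (5.4) are easy to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, the physical constants are modern values rather than those available in 1959 (which is why the voltage term comes out near 0.0517 rather than 0.0518 volt), and the photocurrent of 1e-13 ampere in the usage line is an assumed figure, not a measured one.

```python
BOLTZMANN_J_PER_K = 1.380649e-23     # modern value, not from the original text
ELECTRON_CHARGE_C = 1.602176634e-19  # modern value, not from the original text


def dqe_ratio_dark_current(i_photo, i_dark):
    """Eq. (5.2): Q_D / Q_R = I_p / (I_p + I_d)."""
    return i_photo / (i_photo + i_dark)


def johnson_voltage_term(temperature_k=300.0):
    """The voltage 2kT/e appearing in Eq. (5.4), in volts."""
    return 2.0 * BOLTZMANN_J_PER_K * temperature_k / ELECTRON_CHARGE_C


def dqe_ratio_load_resistor(i_photo, load_ohms, temperature_k=300.0):
    """Eq. (5.3) written in terms of the drop E_p = I_p * R across the load."""
    e_p = i_photo * load_ohms
    return e_p / (e_p + johnson_voltage_term(temperature_k))


# Voltage term at 300 deg K (close to the 0.0518 volt quoted in the text).
print(johnson_voltage_term(300.0))

# Hypothetical photocurrent of 1e-13 amp into a 2.5e13-ohm load resistance.
print(dqe_ratio_load_resistor(1e-13, 2.5e13))
```

With a load as large as the one used by Kron, even this very small assumed photocurrent produces a drop of 2.5 volts, so that the Johnson noise of the resistor costs only a few per cent of the RQE.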
The dynode chain in a multiplier tube and the gas in a gas phototube provide amplification and introduce relatively little additional noise. In most cases the additional noise serves to reduce the detective quantum efficiency by a factor of not less than about 0.7. The theory of the noise produced by the amplification process in a multiplier phototube is developed in a fundamental paper by Shockley and Pierce (37). They find that if (1) the noise in the cathode current is shot noise, if (2) at each dynode the number of secondary electrons for each primary electron has a Poisson distribution, and if (3) the gain of each dynode is the same, then the amplification process increases the mean-square noise more than the signal squared by the factor

$$\frac{Mm - 1}{M(m - 1)} \tag{5.5}$$
where M is the total gain of the dynode chain and m is the gain of each dynode. In practical multiplier tubes, where M is very large compared with m, the factor reduces to

$$\frac{m}{m - 1} \tag{5.6}$$
If everything else is ideal, the relation between the responsive quantum efficiency of the cathode $Q_R$ and the detective quantum efficiency $Q_D$ is

$$Q_D / Q_R = 1 - m^{-1} \tag{5.7}$$
For a typical dynode with gain m = 4, $Q_D$ is 0.75 times $Q_R$. The corresponding theory for the noise produced by the amplification process in a gas phototube was developed by Rajchman and Snyder (38). They suppose that the number of secondary electrons produced at each collision has a Poisson distribution. Their result, stated without proof, is that the amplification increases the mean-square noise more than the square of the signal by the factor $1 + G^{-1}$, where G is the total current gain produced by the gas. If G = 5, the factor is 1.2 and $Q_D$ is 0.83 times $Q_R$. In summary, the use of a multiplier phototube or a gas phototube reduces the DQE that may be attained to a value that is not less than 0.7 or 0.8 times the RQE of the photocathode. The values of RQE shown in Tables I and II are numbers that the DQE can approach but not quite achieve.
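The noise factors of Eqs. (5.5) through (5.7) and the gas-phototube factor can be tabulated with a few lines of Python. This is a sketch under stated assumptions: the total gain is taken to be M = m^n for n identical stages (which follows from condition (3) above), the stage count of 10 used in the example is hypothetical, and all function names are illustrative.

```python
def multiplier_noise_factor(stage_gain, n_stages):
    """Eq. (5.5): (M*m - 1) / (M*(m - 1)), assuming M = m**n_stages."""
    total_gain = stage_gain ** n_stages
    return (total_gain * stage_gain - 1.0) / (total_gain * (stage_gain - 1.0))


def multiplier_noise_factor_limit(stage_gain):
    """Eq. (5.6): limiting value m / (m - 1) when M is very large."""
    return stage_gain / (stage_gain - 1.0)


def multiplier_dqe_ratio(stage_gain):
    """Eq. (5.7): Q_D / Q_R = 1 - 1/m."""
    return 1.0 - 1.0 / stage_gain


def gas_noise_factor(gas_gain):
    """Gas-phototube factor 1 + 1/G quoted from Rajchman and Snyder."""
    return 1.0 + 1.0 / gas_gain


# Dynode gain m = 4 with a hypothetical chain of 10 stages.
print(multiplier_noise_factor(4.0, 10), multiplier_noise_factor_limit(4.0))
print(multiplier_dqe_ratio(4.0))
# Gas gain G = 5: noise factor and the resulting Q_D / Q_R.
print(gas_noise_factor(5.0), 1.0 / gas_noise_factor(5.0))
```

For equal stage gains, the expression in Eq. (5.5) is the same as the geometric sum 1 + 1/m + ... + 1/m^n, which is why it approaches m/(m - 1) so quickly as stages are added.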
VI. PHOTOCONDUCTIVE CELLS
A. Introduction

Since World War II a number of different kinds of photoconductive cells have become important in industrial and military technology. These
include cadmium sulfide and selenide cells for the visible spectrum; lead sulfide, telluride, and selenide cells for the region out to not more than 10 µ; and doped germanium cells with response extending out to 50 µ or more. Cells that have significant response beyond 3 or 4 µ must usually be cooled to attain their best performance.

In any discussion of the properties of photoconductive cells, the concept of the absorption edge plays a major role: the absorption coefficient for radiation is small for wavelengths longer than that of the edge and rises rapidly for shorter wavelengths. The rising absorption for wavelengths shorter than the edge is due to the production of free electron-hole pairs, and the photoconduction is due to the movement of these charge carriers. One of the two carriers (electron or hole) usually has a very short lifetime in photoconductors and is trapped at once. The other carrier has a longer lifetime and is responsible for the photoconduction.

We now sketch briefly some of the other concepts, including transit time and photoconductive gain. At equilibrium in a given volume of the photoconductor, the number of free carriers (produced by the radiation) is

$$M = F\tau \tag{6.1}$$

where F is the number of pairs produced per second and $\tau$ is the lifetime of the carrier. The transit time $T_r$ required for the carrier to travel from one electrode to the other under the influence of the bias voltage V is
$$T_r = L^2 / \mu V \tag{6.2}$$

where $\mu$ is the mobility of the carrier and L is the interelectrode distance. The photocurrent produced by the radiation is then

$$I = eF\tau / T_r \tag{6.3}$$
where e is the magnitude of the charge of the electron. If just one electronic charge were transferred from one electrode to the other for each pair that was produced by the radiation, the photocurrent would be I = eF. But Eq. (6.3) indicates that the actual photocurrent is greater than this by the factor $\tau/T_r$. This ratio may thus be considered as the "gain" G of the photoconductor:

$$G = \tau / T_r = \tau \mu V / L^2 \tag{6.4}$$
The gain G may also be written in a form that is analogous to Eq. (5.1):

$$G = 1.24 R / \lambda Q_R \tag{6.5}$$

where R is the responsivity of the cell in amperes per watt, $\lambda$ is the wavelength in microns, and $Q_R$ is the RQE defined by Eq. (6.6) below.
The photoconductive gain can be quite large. A developmental RCA cadmium selenide cell C7218 has a median responsivity of 13,500 amp per incident watt at 0.72 µ with a polarizing voltage of 75 volts. If one supposes that all the incident photons produce pairs ($Q_R$ = 1), one calculates with Eq. (6.5) that the gain G is 23,300. If the more plausible assumption is made that roughly one-half of the incident photons produce pairs ($Q_R$ = 0.5), the computed photoconductive gain G is 46,500. The theory given above is due to Rose (8).

When one compares the RQE and the DQE of a photoconductive cell, it is important to note that, following established practice, the RQE is defined in terms of the absorbed photons, whereas the DQE is defined in terms of the incident photons. Thus, the DQE cannot be greater than the absorptance.² Even if a cell with a RQE of unity is ideal in every other way, the DQE cannot be greater than the absorptance. This Sec. VI is confined to true photoconductive cells, cells that at a given irradiation obey Ohm's law; p-n junctions are considered separately in Sec. X.
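The two expressions for the photoconductive gain, Eqs. (6.4) and (6.5), can be sketched in Python as follows. The function names are illustrative, and the lifetime, mobility, and electrode spacing used in the first example are hypothetical values chosen only to show the units; the second example uses the C7218 figures quoted in the text.

```python
def gain_from_lifetime(lifetime_s, mobility_cm2_per_volt_s, bias_volts, spacing_cm):
    """Eq. (6.4): G = tau / T_r = tau * mu * V / L**2."""
    transit_time_s = spacing_cm ** 2 / (mobility_cm2_per_volt_s * bias_volts)  # Eq. (6.2)
    return lifetime_s / transit_time_s


def gain_from_responsivity(responsivity_amp_per_watt, wavelength_micron, rqe):
    """Eq. (6.5): G = 1.24 * R / (lambda * Q_R)."""
    return 1.24 * responsivity_amp_per_watt / (wavelength_micron * rqe)


# Hypothetical cell: 1-msec lifetime, mobility 100 cm^2/volt-sec, 75 volts, 1-mm spacing.
print(gain_from_lifetime(1e-3, 100.0, 75.0, 0.1))

# RCA C7218 figures from the text: 13,500 amp/watt at 0.72 micron.
print(gain_from_responsivity(13500.0, 0.72, 1.0))
print(gain_from_responsivity(13500.0, 0.72, 0.5))
```

The last two lines give gains of about 23,300 and 46,500, in agreement with the values computed in the text for $Q_R$ = 1 and $Q_R$ = 0.5.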
B. Responsive Quantum Efficiency

Since the event produced by a photon is the creation of an electron-hole pair, it is clear that responsive quantum efficiency should be defined as the ratio of the number of pairs produced per second to the number n of photons absorbed per second:

$$Q_R = F / n \tag{6.6}$$
(As stated in Sec. II, it is customary with photoconductive detectors to consider the absorption of a photon as the input event, instead of the incidence of a photon.) This, however, was not the definition of the RQE that was used prior to 1937 by Gudden and Pohl (see Nix, 39) in their extensive pioneer work on photoconductivity. They defined the RQE as the ratio of the photocurrent (measured in electronic charges per second) to the number n of photons absorbed per second:

$$Q_R = I / en \tag{6.7}$$

By comparing the last two equations with Eqs. (6.3) and (6.4), one sees that the RQE used by Gudden and Pohl is equal to the product of $Q_R$ as properly defined and the photoconductive gain G. Since G may be large compared with unity, we can see why the early workers were puzzled by "anomalous" photocurrents, currents for which $Q_R$ as defined by Gudden and Pohl was greater than unity.
² The absorptance is defined as the fraction of the incident light that is absorbed. The absorptance plus the transmittance plus the reflectance is equal to unity.
The first careful measurement of the RQE of a photoconductor was carried out by Goucher (40) on a sample of nearly intrinsic germanium. This was a sample in which the lifetime and mobilities of both of the carriers had been measured. Goucher found that the RQE was unity over the range from 1.0 to 1.7 µ, with a probable error of 10 or 15%. Goucher's careful experiment was carried out to test a hypothesis that has come to be widely accepted by solid-state physicists: the hypothesis that all photoconductors with a sharp absorption edge have a RQE of unity for wavelengths just shorter than the edge and for some distance toward shorter wavelengths. The basis for this hypothesis is easy to understand: as the wavelength moves into the absorption edge from longer wavelengths, the absorption coefficient increases by several orders of magnitude, and by six or seven orders of magnitude for germanium (41, 42). All this extra absorption is due to the much increased cross section for pair production. Thus, if the absorption coefficient has increased by three orders of magnitude, all but one part in $10^3$ is due to pair production. This is merely another way of saying that 99.9% of the absorbed photons produce pairs and that the RQE is therefore 99.9%.

This hypothesis must be used with care, of course. As the wavelength is decreased through the visible and into the ultraviolet, other electronic absorption mechanisms will set in and will compete with pair production. The net result is a drop in the RQE. And if one is dealing with a complex photoconductor of unknown structure, such as lead sulfide evaporated films, one cannot be sure that other absorption mechanisms may not be setting in just inside the absorption edge. This hypothesis may be in error in the opposite direction, also. Smith and Dutton (43) present evidence that the RQE of lead sulfide films rises above unity and is inversely proportional to the wavelength for wavelengths between 0.6 and 0.2 µ.
For wavelengths shorter than 0.6 µ, this corresponds, on the average, to the production of one electron-hole pair per 2.1 electron volts of energy in the absorbed radiation. In summary, the hypothesis is probably true for most photoconductors under most conditions, but is far from being a law of nature.

We now describe a few specific results about the RQE of photoconductors. In 1957 Lummis and Petritz (44) reported a RQE of about 60% for lead sulfide films, and a year later Spencer (45) reported a RQE of nearly 100% for the same kind of films.³ By combining the results of photoconductive and photoelectromagnetic measurements, Moss (47) has found that the RQE of single crystals of lead sulfide is roughly unity at 2 µ. Moss has also reported (48) a similar finding about the RQE of single crystals of indium antimonide. Mollwo (49) has found rough confirmation of unity RQE in single crystals of zinc oxide.

Significant measurements have also been made over a wide range of energy in germanium. To be sure, the measurements now to be reported were made on p-n junctions, but such measurements relate to the RQE of germanium as a material even though the germanium was in the form of a junction rather than in the form of a photoconductive cell. While studying the photovoltaic effect in germanium p-n junctions excited with X-rays, Backovsky, Malkovska, and Tauc (50) found the short-circuit current to be linearly proportional to the absorbed power of the X-rays and not proportional to the number of X-ray photons. The conclusion is that the RQE is proportional to the energy of the X-ray photons. According to Drahokoupil, Malkovska, and Tauc (51), the absorbed energy required to form one electron-hole pair is about 2.5 electron volts, which energy corresponds to a 0.5-µ photon. Similar results were obtained by McKay (52) using excitation by alpha particles; the result was an energy of 3.0 ± 0.4 electron volts per pair. Finally, Koc (53) studied the RQE of germanium p-n junctions over the wavelength range 0.3 to 2.0 µ; he found an RQE of unity for wavelengths longer than 0.6 µ and an RQE greater than unity and equal to 0.6/λ for wavelengths between 0.6 and 0.3 µ. Goucher's pioneer measurement of the RQE of photoconductivity in germanium over the range from 1.0 to 1.7 µ has already been reported.

In summary, there is good reason to suppose that the RQE of all photoconductors that show a marked absorption edge is substantially 100%, particularly for the octave between the edge wavelength and one-half of the edge wavelength.

³ Earlier, in 1956, Wolfe (46) reported a value of only 0.25%, but this report has since been found (46) to be based on an incorrect premise as to the origin of the noise in these films.
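The wavelength dependence reported by Koc, and the correspondence between photon energy and wavelength used throughout this section, can be written as a short Python sketch. The conversion constant 1.24 (electron volts times microns) is the same one that appears in Eq. (6.5); the function names and the default threshold argument are illustrative choices, and the piecewise form is only a summary of the reported result for germanium junctions between 0.3 and 2.0 µ, not a general law.

```python
def photon_energy_ev(wavelength_micron):
    """Photon energy in electron volts for a wavelength in microns (E = 1.24 / lambda)."""
    return 1.24 / wavelength_micron


def rqe_koc(wavelength_micron, threshold_micron=0.6):
    """RQE as summarized from Koc: unity above 0.6 micron, 0.6/lambda below it."""
    if wavelength_micron >= threshold_micron:
        return 1.0
    return threshold_micron / wavelength_micron


for wavelength in (2.0, 1.0, 0.6, 0.5, 0.3):
    print(wavelength, round(photon_energy_ev(wavelength), 2), round(rqe_koc(wavelength), 2))
```

A photon of 0.6 µ carries about 2.1 electron volts, and a photon of 0.5 µ about 2.5 electron volts, which are the energies per pair quoted above.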
C. Detective Quantum Efficiency

Rose (6) has pointed out that in a photoconductive cell, there is a statistical fluctuation in the number of the free carriers and also a fluctuation in their lifetime. These two fluctuations contribute equally to the mean-square noise voltage in the output. The result is that the mean-square noise in the output, when referred to the input, is never less than twice the noise in the steady ambient radiation. The consequence is that the maximum possible detective quantum efficiency of a photoconductive cell is one-half. In this respect photoconductive cells must be distinguished from back-biased p-n junctions (Sec. X,D), in which there is no corresponding fluctuation in the lifetime; the lifetime of a carrier is equal to the transit time between electrodes. Thus, in back-biased p-n junctions, the maximum possible DQE is unity. Measurements of the detective quantum efficiency (DQE) of photoconductive cells are few in number.

1. Cadmium Sulfide Cells. Shulman (54) has reported values of the DQE close to 100% for a cadmium sulfide crystal cell. His values are valid at not just one light level, but over a range of 5 to 1 in cell illumination. He assumed a unit responsive quantum efficiency, measured the photoconductive gain G, and calculated the power spectrum of the noise in the cell output that would be produced by photon noise alone. This power spectrum is compared with the measured power spectrum in Shulman's Fig. 1. The measured power spectrum is about twice the calculated spectrum, except in the vicinity of 100 cps, where the ratio is 1.5. Shulman states that if the correction due to surface reflection is made, the computed curve must be multiplied by the factor 1.25. One concludes, therefore, that the DQE is about 60% and would be higher if a nonreflecting coating were used.

The fact that the observed noise was in fact photon noise was confirmed by Shulman in an interesting way. If one has an ideal photoconductor of constant DQE and varies the photocurrent at constant bias voltage by varying the intensity of the light, the mean-square noise should be proportional to the current. But if one varies the current at constant light by varying the bias voltage, one is effectively varying the photoconductive gain G, and one would expect the mean-square noise to vary as the square of the photocurrent. These behaviors were in fact observed, over a 10-to-1 range of current for both kinds of variation. The absolute values of the light intensity and of the currents involved are not given by Shulman.

Van Vliet et al. (55) have reported similar and quite extensive measurements on the noise characteristics of cadmium sulfide cells. With respect to order of magnitude, they estimate a DQE of about 10% for modulation frequencies below 1,000 cps and for light within the absorption edge. The exact value of the DQE is found to depend slightly on the amount of ambient light. For frequencies larger than 1,000 cps, the DQE begins to decrease because of spontaneous trapping fluctuations.

2. Lead Sulfide Cells. Some information is available also about the DQE of lead sulfide cells. Wolfe's conclusion (46), which he presented as a determination of the RQE, was actually close to a determination of the DQE. He found that the detectivity of the cells he measured was 5% of that of an ideal detector operating at room temperature and limited by fluctuations in both the incident and emitted photons. This corresponds to a detectivity 3.5% of that of an ideal detector limited only by the fluctuations in the incident
photons. We conclude that the detective quantum efficiency of his cells was 0.12% [0.0012 = $(0.035)^2$]. A lead sulfide cell with a typical responsivity versus wavelength curve (such as that shown in Fig. 13 of ref. 2) responds effectively to only 1/20,000 of the total power in room temperature blackbody radiation. Since the latter is 0.05 watt/cm², the effective fraction of the blackbody radiation is $2.5 \times 10^{-6}$ watt/cm². From this we conclude that if a lead sulfide cell does have a DQE close to its absorptance, it will do so only for ambient irradiations greater than $800 \times 2.5 \times 10^{-6} = 2 \times 10^{-3}$ watt/cm².

Free (56) has measured the DQE of a group of lead sulfide cells for blue light; he found values lying in the range from 25 to 100%. The source of the ambient blue light was an overvoltaged ribbon filament lamp, monochromatized by a (quartz prism) Perkin Elmer monochromator with 2-mm wide slits. A separate chopped source was used to measure the detectivity. The irradiation of the ambient light was not independently measured, but from the fact that it reduced the resistance of the lead sulfide cells to 85% of the dark resistance, the irradiation is computed to be 0.5 × 10^[exponent illegible] watt/cm². The tests of Wolfe and Free were carried out on Eastman Kodak chemically deposited lead sulfide cells operated at room temperature. In summary, we conclude that lead sulfide cells have a DQE close to their absorptance, but only for irradiations greater than about 10^[exponent illegible] watt/cm². For smaller irradiations, the DQE is proportional to the irradiation.

In marked contrast to the low detective quantum efficiency of room-temperature lead sulfide cells, Watts (57) has described a cooled lead sulfide cell that, as interpreted by Moss (58), has a DQE of $(1.3)^{-2}$ = 59%. This result, of course, breaches the theoretical limit of 50%, but the accuracy of the result is not such that the discrepancy is significant. This cell was at a temperature of 110 °K, but it was in an enclosure whose temperature was at 200 °K. Thus, the ambient radiation was 200 °K blackbody radiation. Fellgett (59) has described a lead telluride cell at 90 °K, in a room temperature enclosure, that has a DQE of $(1.9)^{-2}$ = 28%. Further details about these cells, and a derivation of the noise figures of 1.3 and 1.9, will be found in reference 2, pp. 71 and 75.

This writer was rather disturbed by the wide difference between the low DQE of the room temperature cells and the high DQE of the cooled cells as described above. It was therefore gratifying to receive a letter dated February 20, 1959 from Dr. Harry E. Spencer of the Eastman Kodak Company in which he reported the calculation of the DQE for two lead sulfide cells over a wide range of temperature, as shown in the following table:
Cell          T (deg. K)    Q_D (%)
PbS-6-4       302           0.14
              275           0.46
              207           36
              100           78
N179-8-4      302           0.028
              275           0.18
              207           14
              100           100
Dr. Spencer reported that the values of $Q_D$ in this table are approximate only; the extreme error is probably less than a factor of two. The data in this table indicate that the DQE of the same lead sulfide cell can vary from $10^{-4}$ to unity as the temperature drops from room temperature to 100 °K.

3. Conclusions. From the limited number of results presented above, perhaps one is permitted to speculate that many photoconductive cells have values of the DQE close to their absorptance for sufficiently high values of the ambient radiation. As improvements are made in these cells, it is to be expected that the amount of ambient radiation required for a DQE close to the absorptance will be reduced. This reduction will also increase the detectivity in the presence of a negligible amount of ambient radiation.
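The arithmetic behind the lead sulfide and lead telluride figures quoted in this section can be retraced in a few lines of Python. The sketch assumes, as the bracketed expressions in the text indicate, that the DQE is the square of the ratio of the measured detectivity to that of the ideal detector, or equivalently the inverse square of the noise figure; the function names are illustrative, and the factor of 800 is simply taken over from the text.

```python
def dqe_from_detectivity_ratio(ratio_to_ideal):
    """DQE as the square of (measured detectivity / ideal detectivity)."""
    return ratio_to_ideal ** 2


def dqe_from_noise_figure(noise_figure):
    """DQE as the inverse square of the noise figure."""
    return noise_figure ** -2


# Wolfe's room-temperature cells: detectivity 3.5% of the ideal value.
wolfe_dqe = dqe_from_detectivity_ratio(0.035)

# Effective part of the 0.05 watt/cm^2 room-temperature blackbody radiation.
effective_irradiation = 0.05 / 20000.0           # watt/cm^2
threshold_irradiation = 800.0 * effective_irradiation  # factor 800 as given in the text

# Cooled cells: noise figures of 1.3 (Watts) and 1.9 (Fellgett).
watts_dqe = dqe_from_noise_figure(1.3)
fellgett_dqe = dqe_from_noise_figure(1.9)

print(wolfe_dqe, effective_irradiation, threshold_irradiation)
print(round(100 * watts_dqe), round(100 * fellgett_dqe))
```

These lines give about 0.0012 for the room-temperature cells, 2.5e-6 and 2e-3 watt/cm² for the two irradiations, and 59% and 28% for the cooled cells.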
VII. TELEVISION CAMERA TUBES

A. Introduction

In this section we consider the two kinds of television camera tubes that are of commercial importance: the image orthicon and the vidicon. The responsive quantum efficiency (RQE) of several image orthicons is described. The detective quantum efficiency (DQE) of two RCA image orthicons (the 5820 and 6849) is computed from unpublished signal, noise, and resolution data generously supplied by the Radio Corporation of America. We present also (Sec. C,5) the DQE of the RCA 6326 vidicon. To be sure, DQE is not a fully appropriate criterion for the vidicon, since the noise of a vidicon and its amplifier is quite independent of the amount of ambient radiation. But in order to be able to compare the performance of vidicons with that of image orthicons, we must discuss the vidicon from the point of view of the DQE. The DQE is discussed as a function of the wavelength of the photocathode irradiation, of the line number of the target, and of the amount of the ambient irradiation of the cathode. The chief results are presented in Table III and in Figs. 8 through 14. The maximum value of the DQE with
TABLE III. THE SIGNAL-TO-NOISE RATIO R, THE PHOTOCATHODE IRRADIATION H, AND THE DETECTIVE QUANTUM EFFICIENCY, ALL FOR THE IRRADIATION THAT MAXIMIZES THE DQE OF THE TWO IMAGE ORTHICONS

             5820                        6849
R            16.2                        8.1
H            5.65 × 10^-9 watts/cm²      8.63 × 10^-10 watts/cm²
Q_D (max)    2.65%                       4.35%
respect to wavelength, line number, and ambient radiation is found to be about 2.5% for the 5820, about 4.5% for the 6849, and only about 0.1% for the 6326 vidicon.
B. Responsive Quantum Efficiency

So far as this writer knows, the responsive quantum efficiency of image orthicons has been defined in only one way: as the responsive quantum efficiency of the photocathode, that is, the ratio of the number of photoelectrons to the number of incident photons. Accordingly, the responsive quantum efficiency of several RCA image orthicons has already been covered in Sec. V,B. Curve D of Fig. 5 shows the RQE of the 5820 and the 6849; curve A shows the RQE of the new 7037 image orthicon with the trialkali photocathode: the photocathode of the 7037 is identical with that of the 7265 multiplier tube.

C. Detective Quantum Efficiency

In this section the detective quantum efficiency (DQE) is calculated for two RCA image orthicons and one RCA vidicon as a function of photocathode illumination, size of signal area, and the radiation wavelength. The two image orthicons are the RCA 5820 and the RCA 6849, the latter being the wide-spaced version of the former. The vidicon is the RCA 6326. The vidicon is treated separately in Sec. 4.

1. The Basic Data. The writer is much indebted to Mr. F. David Marschka, Dr. George A. Morton, and Dr. Benjamin H. Vine, of the Radio Corporation of America for supplying unpublished information on the performance of these image orthicons. Most of this information is shown in Figs. 6 and 7. Figure 6 shows the electrical signal-to-noise ratio R of the two camera tubes as a function of the photocathode irradiation in watts per square centimeter of 395-mµ monochromatic radiation. The signal is the peak-to-peak output when the orthicon sees a pattern that has high contrast between areas of large angular subtense. (Such a pattern will be called briefly a large-area black-to-white transition.) The noise is the rms electrical noise
voltage in a bandwidth that is slightly greater than 4.5 Mc. The beam current was separately adjusted for each of the experimental points on which the curves in Fig. 6 are based. The original data supplied by RCA involved an abscissa equal to the photocathode illumination in lumens per square foot of radiation from a bank of Sylvania "white" fluorescent lamps. Dr. Keith Butler of Sylvania Electric Products has kindly supplied the relative radiant output of these lamps per unit wavelength interval. By combining this information with the relative spectral sensitivity of the image orthicon as given by the RCA HB-3 Tube Handbook, I calculate that 480 lumens are equivalent to 1 watt of 395-mµ radiation. (This ratio is slightly greater than the ratio found below of 450 lumens of 2870 °K tungsten radiation per watt of 395-mµ radiation.)

FIG. 6. The signal-to-noise ratio of the two image orthicons plotted versus the cathode illumination in lumens (of "white" fluorescent light) per square foot. An alternative scale indicates the cathode irradiation in watts (of 395-mµ radiation) per square centimeter. The signal-to-noise ratio is the ratio of the peak signal voltage (for a black-to-white transition) to the rms noise voltage in a bandwidth slightly greater than 4.5 Mc. (Horizontal axes: illumination in lumens/ft² and irradiation H in watts/cm².)
Figure 7 shows the way that the resolution of the two camera tubes depends on the photocathode irradiation. The four curves, two for each tube, show the line numbers at which the response has decreased to 0.5 and 0.25 of the large-area response. These data also were originally supplied with the abscissa given in lumens per square foot of fluorescent radiation. Curve D of Fig. 5 in Sec. V, which is based on data in the RCA HB-3 Tube Handbook, indicates that the RQE of the two image orthicons has its maximum value at 395 mµ, where the RQE is 5.21% and the responsivity is 0.0166 amp/watt. RCA has supplied the additional information that the response-versus-wavelength curve in the HB-3 Tube Handbook corresponds to a responsivity of 36.6 µa/lumen of 2870 °K radiation, from which we conclude that 453 lumens of 2870 °K radiation is equivalent to 1 watt of 395-mµ radiation. RCA has supplied the additional information that the image orthicons now being manufactured have a higher responsivity than is indicated above, and that the particular tubes used to obtain the data in Fig. 6 had a responsivity of about 60 µa/lumen and about 0.027 amp/watt. Thus, for these newer tubes the RQE is about 8.1% and the ratio of responsivities is 450 lumens/watt. Thus, the abscissa in Figs. 6 and 7 may be converted to photocathode current in amperes per square centimeter by multiplying the abscissa by 0.027 amp/watt. Alternatively, the abscissas may be converted to lumens per square centimeter by multiplying the abscissa by 480 or 450 lumens/watt for the two kinds of light mentioned above.
FIG. 7. The television line number of the two image orthicons for two different amplitude responses plotted versus the cathode illumination (lumens/ft²) and the cathode irradiation H (watts/cm² of 395-mμ radiation). The curves labeled "25 percent amplitude" indicate the line number at which the electrical response is 1/4 of the response for any very small line number, and similarly for the other two curves.
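The responsivity figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the function names are illustrative, and it uses modern values of the physical constants, so the last digit may differ slightly from the chapter's 1950s values.

```python
# Physical constants (modern SI values; an assumption, the chapter used 1950s values).
PLANCK_H = 6.62607015e-34      # joule-second
LIGHT_C = 2.99792458e8         # meter/second
ELECTRON_Q = 1.602176634e-19   # coulomb

def responsive_quantum_efficiency(amp_per_watt, wavelength_m):
    """Electrons released per incident photon for a given responsivity."""
    photon_energy_j = PLANCK_H * LIGHT_C / wavelength_m
    return amp_per_watt * photon_energy_j / ELECTRON_Q

def lumens_per_watt(amp_per_watt, amp_per_lumen):
    """Lumens that produce the same photocurrent as 1 watt."""
    return amp_per_watt / amp_per_lumen

WAVELENGTH_M = 395e-9  # 395 millimicrons

# Handbook tube: 0.0166 amp/watt and 36.6 microamp/lumen.
rqe_handbook = responsive_quantum_efficiency(0.0166, WAVELENGTH_M)
ratio_handbook = lumens_per_watt(0.0166, 36.6e-6)

# Newer tubes: about 0.027 amp/watt and 60 microamp/lumen.
ratio_newer = lumens_per_watt(0.027, 60e-6)

print(f"RQE (handbook tube): {100 * rqe_handbook:.2f} %")
print(f"lumens per watt, handbook tube: {ratio_handbook:.1f}")
print(f"lumens per watt, newer tubes:   {ratio_newer:.1f}")
```

The handbook figures reproduce the quoted 5.21% and a ratio between 453 and 454 lumens per watt; the newer-tube figures give 450 lumens per watt.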
2. Derivation of a Working Formula for $Q_D$. The photocathode of the image orthicons has the dimensions 2.44 by 3.25 cm and thus has the area
$$A_c = 7.93\ \text{cm}^2 \tag{7.1}$$
The integration time $T$ of the normal mode of operation of the image orthicon is
$$T = 1/30\ \text{sec} \tag{7.2}$$
At 395 mμ, the energy of a single photon as given by Eq. (3.16) is
$$\mathcal{E} = 5.025 \times 10^{-19}\ \text{joule} \tag{7.3}$$
With the help of the last two equations and Eq. (3.21), one has the following preliminary expression for the detective quantum efficiency:
$$Q_D = 1.508 \times 10^{-17}\,(S/N)_m^2 / (H A C^2) \tag{7.4}$$
where $A$ is the photocathode area (in square centimeters) that is illuminated by the signal, $C$ is the contrast, and $H$ is the ambient irradiation of the cathode in watts per square centimeter.
If the image orthicon had no resolution limitations, then we could say that the electrical signal-to-noise ratio $CR$ in a 4.5-Mc bandwidth would be equal to the signal-to-noise ratio $(S/N)_m$ for a signal spot of size equal to the smallest area that can be resolved by such a bandwidth. If $A_c$ is the photocathode area, this smallest area $A_{\min}$ is given by
$$A_{\min} = A_c/(2 \times 4.5 \times 10^6/30) \tag{7.5}$$
$$A_{\min} = A_c/300{,}000 \tag{7.6}$$
To derive the signal-to-noise ratio $(S/N)_m$ for larger signal areas, we note that the frequency bandwidth required to transmit a signal spot of area $A$ is inversely proportional to $A$. The required bandwidth is 4.5 Mc for the area $A_{\min}$ and is only 15 cps for a signal that uniformly covers the photocathode. Since, furthermore, the noise has a flat spectrum, the signal-to-noise ratio for a signal spot of area $A$ is given by
$$(S/N)_m = (A/A_{\min})^{1/2}\,CR \tag{7.7}$$
or by
$$(S/N)_m = (300{,}000\,A/A_c)^{1/2}\,CR \tag{7.8}$$
where $A_c$ is the area of the photocathode. If the area of the photocathode $A_c = 7.93$ cm² is inserted in the last expression and the result substituted in Eq. (7.4), one finds
$$Q_D = 5.71 \times 10^{-13}\,R^2/H \tag{7.9}$$
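The numerical constants in Eqs. (7.4) through (7.9) can be retraced step by step. The sketch below is illustrative only: the function names and the sample values of R, H, A, and C are hypothetical. It also shows numerically that the signal area A cancels out of the final expression.

```python
import math

PHOTON_ENERGY_J = 5.025e-19         # Eq. (7.3), 395 millimicrons
INTEGRATION_TIME_S = 1.0 / 30.0     # Eq. (7.2)
CATHODE_AREA_CM2 = 2.44 * 3.25      # Eq. (7.1), about 7.93 cm^2
BANDWIDTH_CPS = 4.5e6
FRAMES_PER_SEC = 30.0

# Number of resolvable elements per frame, Eqs. (7.5) and (7.6).
elements = 2.0 * BANDWIDTH_CPS / FRAMES_PER_SEC

# Constant of Eq. (7.4): photon energy divided by integration time.
k_74 = PHOTON_ENERGY_J / INTEGRATION_TIME_S

# Constant of Eq. (7.9).
k_79 = k_74 * elements / CATHODE_AREA_CM2

def dqe_from_eq_74(area_cm2, contrast, r, h):
    """Eq. (7.4) with (S/N)_m taken from Eq. (7.8)."""
    s_over_n = math.sqrt(elements * area_cm2 / CATHODE_AREA_CM2) * contrast * r
    return k_74 * s_over_n ** 2 / (h * area_cm2 * contrast ** 2)

def dqe_working(r, h):
    """Eq. (7.9): Q_D as a fraction, H in watts per square centimeter."""
    return k_79 * r ** 2 / h

# Hypothetical operating point, for illustration only.
R_EXAMPLE, H_EXAMPLE = 20.0, 1.0e-8
q_small_spot = dqe_from_eq_74(0.01, 0.1, R_EXAMPLE, H_EXAMPLE)
q_large_spot = dqe_from_eq_74(2.0, 0.5, R_EXAMPLE, H_EXAMPLE)
q_working = dqe_working(R_EXAMPLE, H_EXAMPLE)

print(f"elements per frame: {elements:.0f}")
print(f"constant of Eq. (7.4): {k_74:.4e}")
print(f"constant of Eq. (7.9): {k_79:.3e}")
print(f"Q_D at the example point: {q_working:.4f}")
```

With unrounded intermediate values the last constant comes out near 5.70 × 10⁻¹³, which agrees with the printed 5.71 × 10⁻¹³ to within rounding.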
This is the final “working” expression for the DQE in terms of the measured quantities $R$ and $H$. In this expression, $Q_D$ is a fraction (not in per cent), and $H$ is in watts per square centimeter. It is particularly to be noted that the area $A$ of the signal does not appear in this expression. The area $A$ fails to appear because of our assumption that the performance is not limited by the resolution capability; for sufficiently small areas the DQE will decrease, and this decrease is examined in Sec. 3.
3. Results for Image Orthicons. The fact indicated by Eq. (7.9) that the DQE varies as $R^2/H$ means that the curves of constant DQE on the logarithmic plot of Fig. 6 are straight lines with a positive slope of one-half. The point where each of the two curves has a slope of one-half is indicated by the open circles in Fig. 6.
The values of $R$ and $H$ at these two points are indicated in Table III. The last row of Table III shows the value of the DQE computed for these two points by Eq. (7.9), in per cent. These values of the DQE are of course the maximum values with respect to radiation wavelength, image size, and photocathode irradiation. The value of the DQE under other conditions is related to its maximum value by
$$Q_D = Q_{\max} F_\lambda F_N F_H \tag{7.10}$$
where the three $F$'s are factors with a maximum value of unity, which depend respectively on the radiation wavelength, line number, and illumination.
FIG. 8. The relative detective quantum efficiency of the 5820 and 6849 image orthicons plotted versus wavelength in millimicrons (300 to 700 mμ). The curve is normalized so that its maximum value is unity. The curve applies to both kinds of image orthicon. The nominal responsive quantum efficiency is shown by curve D in Fig. 5.
The factor $F_\lambda$ that takes into account the variation of $Q_D$ with the wavelength $\lambda$ is easily shown to vary with wavelength in proportion to the responsive quantum efficiency of the photocathode, which in turn is proportional to the responsivity (in amperes per watt) divided by the wavelength, as indicated by Eq. (5.1). Suppose, for example, that we consider a wavelength where the responsive quantum efficiency is just half its value at 395 mμ. This means that both the signal and ambient photon fluxes $M_s$ and $M_a$ must be doubled in order to produce the same photocurrents as before. It then follows directly from Eq. (3.10) that the DQE is reduced to half its previous value. (The experimental data on the responsivity used to construct Fig. 8 were taken from the HB-3 Tube Handbook.)
To assure that $F_\lambda$ represents the variation of the DQE with wavelength, the ambient photon flux must vary inversely as the function $F_\lambda$ in order that the photocathode current be held at its optimum value. With reference to Fig. 8, one sees that this is equivalent to the statement that the photocathode current be held at 0.00150 μa for the 5820 and at 0.000237 μa for the 6849.
FIG. 9. The relative detective quantum efficiency plotted versus the television line number. The ordinate is equal to the square of the amplitude line-number response. The two image orthicon curves apply only when the cathode illumination is such as to maximize the detective quantum efficiency (see Fig. 10).
The detective quantum efficiency $Q_{\max}$ applies to a target so large as to be completely resolved. The DQE is smaller for targets that are not fully resolved. Schade (60) has shown how the response of an image system for a target of any size and shape may be calculated from the line-number response of the system by Fourier methods. For the sake of brevity, we omit this transformation and simply show how the DQE depends on the line number of a simple pattern that may be described adequately by a narrow range of line numbers. The factor $F_N$ shown in Fig. 9 is proportional to the square of the line-number response. The shape of this curve depends on the irradiation level, as shown by Fig. 7. The two curves in Fig. 9 are both for the irradiation level shown in Table III, for which $Q_D$ is a maximum for large-area targets.
The development of the full significance of the function $F_N$ would require the introduction of two-dimensional Fourier analysis (61, 62), and the discussion would be longer than its importance for this section warrants.
Figure 10 shows the function $F_H$ plotted as a function of the irradiation $H$ for a wavelength of 395 mμ. This function is equal to the ratio of $R^2/H$ as given by Fig. 6 to the maximum value of $R^2/H$. For other wavelengths, the curve should be shifted to the right by a factor equal to the reciprocal of $F_\lambda$.
FIG. 10. The relative detective quantum efficiency of the two image orthicons plotted versus the cathode illumination (lumens/ft²) and the cathode irradiation H (watts/cm² of 395-mμ radiation). The curves are normalized. With the data in this figure and the data in Figs. 7, 8, and 9 and Table III, one may compute the detective quantum efficiency of either image orthicon for any combination of radiation wavelength, line number, and cathode illumination.
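The way the illumination factor is formed from a measured signal-to-noise curve, and the way the three factors of Eq. (7.10) combine, can be sketched as follows. The sample points for H and R below are hypothetical and are not read from Fig. 6; the function names are likewise illustrative.

```python
def illumination_factor(h_values, r_values):
    """F_H: the ratio R^2/H at each point divided by its maximum value."""
    merit = [r * r / h for h, r in zip(h_values, r_values)]
    peak = max(merit)
    return [m / peak for m in merit]

def dqe(q_max, f_wavelength, f_line_number, f_illumination):
    """Eq. (7.10): each factor has a maximum value of unity."""
    return q_max * f_wavelength * f_line_number * f_illumination

# Hypothetical samples of a signal-to-noise curve (H in watts/cm^2).
h_samples = [1e-10, 1e-9, 1e-8, 1e-7]
r_samples = [2.0, 10.0, 30.0, 40.0]

f_h = illumination_factor(h_samples, r_samples)
best_index = f_h.index(max(f_h))

# Example: a maximum DQE of 4.35% degraded by three illustrative factors.
q_example = dqe(0.0435, 0.5, 0.25, f_h[2])

for h, f in zip(h_samples, f_h):
    print(f"H = {h:.0e} watt/cm^2  ->  F_H = {f:.2f}")
print(f"example Q_D = {100 * q_example:.3f} %")
```

Because every factor is at most unity, the product can only lower the DQE from its maximum value; the optimum irradiation is the sample at which the normalized ratio equals one.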
4. Detective Quantum Efficiency of the 6326 Vidicon. The detective quantum efficiency is best adapted to describing the performance of radiation detectors whose noise is due to the fluctuation in the arrival of the ambient photons at the sensitive surface. The vidicon is not a member of this class of detectors. The noise at the output of the amplifier associated with the vidicon is quite independent of the level of ambient illumination. Accordingly, if the vidicon were being compared with detectors in general, it would be more suitable to evaluate it by the methods of reference 2. But in this section we are not interested in comparing the vidicon with detectors in general. Rather, we wish to compare its performance with that of other camera tubes, the image orthicon in particular. Since the image orthicon is suitably evaluated by means of the detective quantum efficiency, we shall use this method also for the vidicon.
The signal-to-noise ratio of the vidicon camera tube is substantially degraded by the noise of the best available video amplifier. The noise level depends to some extent on the degree of frequency compensation used in the amplifier, which compensation corrects the horizontal response for the aperturing effect of the scanning beam and for the shunt capacity of the tube. The electrical signal-to-noise ratio $R$ shown in Fig. 11 and the $F_N$ function shown in Fig. 9 are for no compensation and represent the situation in which the spectrum of the noise at the output of the amplifier is approximately flat. The curves in Figs. 9 and 11 are based on Figs. 8 and 10 of the RCA Bulletin describing the 6326 vidicon and on the information (kindly supplied by Mr. A. D. Cope and Dr. Benjamin H. Vine of RCA) to the effect that the noise current referred to the output of the tube is about 1.5 X amp in a 4.5-Mc bandwidth. The curves in Fig. 11 are for a large-area black-to-white transition and are for monochromatic light of wavelength 435 mμ. From the data in Figs. 8 and 10 of the 6326 Bulletin one finds that the tube's responsivity/wavelength ratio has a maximum at this wavelength and that 1 watt of 435-mμ radiation produces the same response as 520 lumens.
FIG. 11. The signal-to-noise ratio $R$ of the 6326 vidicon for a black-to-white transition and also the quantity $\gamma R$ to be used for calculating the signal-to-noise ratio for a small signal, plotted versus the cathode illumination in lumens per square foot of 2870° K radiation and the cathode irradiation in watts per square centimeter of 435-mμ radiation.
Unlike the response of the image orthicon, which is linear for illuminations below the “knee” of the characteristic curve, the response of the vidicon is nonlinear, with a “gamma” less than unity. Thus, the signal-to-noise ratio from a large-area target of small contrast $C$ will not be $CR$, but will rather be $\gamma CR$, where $\gamma$ is the gradient of the log-current-output-vs-log-irradiation curve. A plot of $\gamma R$ vs $H$ is also shown in Fig. 11. At 435 mμ, the energy of a photon is
$$\mathcal{E} = 4.56 \times 10^{-19}\ \text{joule} \tag{7.11}$$
We shall base our calculation on the assumption that the integration time of the vidicon is 1/30 sec. This assumption, which is very sound for image orthicons, is not exact for vidicons. These camera tubes have appreciable carryover from one frame to the next, which is small at the recommended illumination of 30 lumens/ft², but which becomes quite marked at much lower illuminations; this integration increases significantly the signal-to-noise ratio at lower illuminations, at the cost of blurring rapid motion. In the absence of detailed information about the amount of temporal integration as a function of illumination, we make the simple assumption that the integration time is
$$T = 1/30\ \text{sec} \tag{7.12}$$
As will be apparent shortly, the detective quantum efficiency has a maximum value for an illumination of 4 lumens/ft², and thus this assumption is sound for the illuminations of greatest interest. By exactly the same type of argument as that used in Sec. VII, 2, we find
$$(S/N)_m = (300{,}000\,A/A_c)^{1/2}\,\gamma CR \tag{7.13}$$
where $A_c$ is the sensitive area of the vidicon, which has the dimensions 0.5 by 0.375 in.:
$$A_c = 1.342\ \text{cm}^2 \tag{7.14}$$
With the help of Eq. (3.21), the last four equations yield
$$Q_D = 3.06 \times 10^{-12}\,\gamma^2 R^2/H \tag{7.15}$$
This is the “working” equation for the DQE of the vidicon. The DQE varies as $\gamma^2 R^2/H$. The point on the curve in Fig. 11 where this ratio has its maximum value is indicated by an open circle. For this point, $\gamma R = 47.5$ and $H = 8.28 \times 10^{-6}$ watt/cm², whence one has
$$Q_{\max} = 0.084\% \tag{7.16}$$
This is the maximum value of the DQE with respect to the radiation wavelength, size of the signal image, and amount of ambient radiation. The factors $F_\lambda$, $F_N$, and $F_H$, defined as in the preceding section, are plotted in Figs. 9, 12, and 13. The function $F_\lambda$ is proportional to the responsivity (plotted in Fig. 10 of the Tube Bulletin) divided by the wavelength. The function $F_N$ is proportional to the square of the response versus line number. The function $F_H$ is equal to the ratio of $\gamma^2 R^2/H$ to its maximum value. Many of the comments made in the preceding section about the $F$ functions apply also to the $F$ functions of this section.
FIG. 12. The relative detective quantum efficiency of the 6326 vidicon plotted versus the radiation wavelength. The curve is normalized.
FIG. 13. The relative detective quantum efficiency of the 6326 vidicon plotted versus the cathode illumination (lumens/ft²) and the cathode irradiation H (watts/cm²). The curve is normalized.
The fact that the limiting noise of the vidicon is amplifier noise has an important consequence. If an image orthicon had a DQE of only 0.084% (the value for the vidicon) under the best conditions, the RQE of the photocathode would have to be increased by the factor 52 to equal the 4.35% DQE of the 6849. But since the noise of the vidicon is amplifier noise, the signal-to-noise ratio rises in direct proportion to the responsivity and the DQE rises as its square. The responsivity of the vidicon's photocathode would therefore have to be increased only by the factor $7.2 = (52)^{1/2}$ in order that the DQE rise to 4.35%, provided that the noise of the vidicon proper continues to remain below the amplifier noise. This is one of the consequences of the fact (mentioned at the beginning of this Sec. 4) that the vidicon is not a member of the class of detectors that are best described in terms of their detective quantum efficiency.
FIG. 14. A summary plot showing both the detective quantum efficiency $Q_D$ (solid curves) and the signal-to-noise ratio (dashed curves) of all three camera tubes plotted versus the cathode illumination and the cathode irradiation. The values of $Q_D$ assume optimum choice of radiation wavelength and television line number.
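The vidicon figures of Eqs. (7.11) through (7.16) can be retraced in the same way as those for the image orthicon. The sketch below is illustrative: the variable names are not from the text, and the conversion of 4 lumens/ft² to watts/cm² assumes the stated equivalence of 520 lumens per watt at 435 mμ.

```python
import math

PHOTON_ENERGY_J = 4.56e-19        # Eq. (7.11), 435 millimicrons
INTEGRATION_TIME_S = 1.0 / 30.0   # Eq. (7.12)
SENSITIVE_AREA_CM2 = 1.342        # Eq. (7.14), value as printed
ELEMENTS_PER_FRAME = 300000.0     # from Eq. (7.13)

# Constant of Eq. (7.15).
k_715 = PHOTON_ENERGY_J / INTEGRATION_TIME_S * ELEMENTS_PER_FRAME / SENSITIVE_AREA_CM2

# Irradiation equivalent to 4 lumens per square foot.
CM2_PER_FT2 = 30.48 ** 2
LUMENS_PER_WATT = 520.0
h_opt = 4.0 / CM2_PER_FT2 / LUMENS_PER_WATT

# Maximum DQE at the open-circle point, where gamma * R = 47.5.
GAMMA_R = 47.5
q_max = k_715 * GAMMA_R ** 2 / h_opt

# Improvement needed to reach the 4.35% DQE of the 6849.
dqe_ratio = 0.0435 / q_max
responsivity_factor = math.sqrt(dqe_ratio)

print(f"constant of Eq. (7.15): {k_715:.3e}")
print(f"H at 4 lumens/ft^2: {h_opt:.3e} watt/cm^2")
print(f"Q_max: {100 * q_max:.3f} %")
print(f"DQE ratio: {dqe_ratio:.1f}, responsivity factor: {responsivity_factor:.2f}")
```

The illumination of 4 lumens/ft² works out to about 8.28 × 10⁻⁶ watt/cm², and the resulting maximum DQE is close to the printed 0.084%. The square-root relation reflects the fact that, with fixed amplifier noise, the DQE varies as the square of the responsivity.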
5. Discussion. Figure 14 shows the electrical signal-to-noise ratio and the DQE of image orthicons and vidicons when the camera sees a pattern with contrast between large areas, which pattern is illuminated by light of the optimum wavelength. This figure summarizes the most important results of this paper.
The accuracy of the results presented here is not high. The data supplied by RCA were laboratory data not obtained for the purposes of this review; the noise levels are based on peak-to-peak noise amplitudes as observed on an oscilloscope. Since the DQE is inversely proportional to the mean-square noise voltage, it is clear that there is room for appreciable error in the results. I would guess that the numbers found are probably between % and 35 of the correct values for the image orthicons, and the result for the vidicon may have a somewhat larger range of probable error.
I feel confident that the values of the DQE found for the image orthicons are approximately correct: I had expected the detective quantum efficiency to be about of the responsive quantum efficiency, and the result found accords with this expectation. It is also in agreement with expectations based on the internal parameters of the image orthicons. The values of DQE found in Table III, about 2.5 and 4.5%, are from 45 to $6 of the responsive quantum efficiency of about 8%. I share with Dr. Rose the feeling that these high efficiencies are a technical accomplishment of the first rank. The writer knows of no other image system that can quite come up to this performance. As indicated in Secs. VIII and IX, the human eye and photographic negatives both have a DQE under the best conditions of about 1%.
The new image orthicon with the improved cathode (RCA 7037) has an RQE of 19.2% at 400 mμ, as indicated by curve A of Fig. 5, and there is every reason to expect that such tubes will have a DQE approaching 10%.
To summarize, Fig. 14 shows the signal-to-noise ratio and the DQE of the three image tubes discussed in this report. The maximum signal-to-noise ratio of the vidicon is slightly higher than that of either of the image orthicons, but the DQE of the vidicon is much lower, because the illumination required to achieve the high signal-to-noise ratio is so much larger.
VIII. PHOTOGRAPHIC NEGATIVES
A. Introduction
Nearly all the fundamental investigations on the behavior of photographic materials and of the individual grains have in one way or another contributed to our understanding of the responsive quantum efficiency. Prominent in this field are the names of Silberstein, Trivelli, and Webb in this country, and Berg, Burton, Gurney, Mitchell, and Mott in England. Important in these investigations have been the shape of the density-vs-log-exposure curve, the intermittency effect, and reciprocity law failure. An important tool has been the counting of developed grains in single-grain-layer films.
The broad outline of the Mott-Gurney theory (63, 64) of the photographic process, announced in 1938, is still the accepted theory. This theory views the silver halide grain as a photoconductor. The absorbed photon produces an electron-hole pair. The electron is mobile. Interstitial silver ions are also mobile. At special sites within or on the surface of the grain, the electrons and silver ions combine to form a collection of silver atoms. This bit of metallic silver is the latent image speck.
Sixteen years before the pioneer publication by Gurney and Mott, Silberstein (65) in 1922 made the first effort to understand the photographic process in terms of the fact that light arrives at the film in discrete bundles (the photons). He assumed that the effective absorption of a single photon was sufficient to make the grain developable. In the same year, Svedberg (66) established that incidence of a single alpha particle is sufficient to make a grain developable; the same fact was established for X-rays by Silberstein and Trivelli (67) in 1930. In 1928 Silberstein (68) generalized this 1922 concept by the assumption that a small but finite number of photons must be effectively absorbed to make a grain developable. Silberstein and Webb (69) reported in 1934 that the intermittency effect could be understood only in terms of the quantum nature of light.
In 1927 Wightman and Quirk (70) suggested that the formation of a developable grain proceeds in two stages. In the first stage, a nondevelopable “subspeck” is formed. The subspeck is later converted into a developable “full speck” by further action of light. A large literature has developed about this concept, and the concept is now fully accepted and has a great deal of evidence to support it. In 1938 Webb and Evans (71) and Berg and Mendelssohn (72) showed that the reciprocity law failure was due to the instability of the subspeck in its initial stage of formation. In 1946 and 1948 Burton and his co-workers reported (73-77) an impressive series of experiments using the “double exposure” technique. From these experiments it was concluded that the formation of the subspeck required the effective absorption of at least two photons and that the full speck required at least two more photons.
(In the more recent literature, the subspeck is called the latent subimage speck and the full speck is called the latent image speck.) In 1950, Webb (78) and Katz (79) showed conclusively, on the basis of measurements on reciprocity law failure, that the effective absorption of two photons within a critical period of length $\tau$ is required to form a latent subimage speck. Webb indicated that after the latent subimage is formed, the grain must absorb about six more photons to make the grain developable. In 1954 Maerker (80) determined that the length of the critical period was 3 or 4 sec at room temperature. In 1957 Mitchell and Mott (81) and Mitchell (82) published a refinement of the Mott-Gurney theory, as part of which the latent subimage speck requires the absorption of two photons, and the latent image speck requires the absorption of one more; the minimum latent image speck consists of four silver atoms with a unit positive charge.
The effort to use the shape of the density-vs-log-exposure curve to estimate the number of photons required to make a grain developable was carried further by Webb (83, 84) in 1939 and 1941, and by Burton (85) in 1951, but the conclusion of all of this work was that the shape of this curve is determined primarily by the fact that the required number is widely different for different grains within a given emulsion.
Several workers have measured the number of incident photons required to make a grain developable. As summarized by Webb (86) in 1948, the published measurements range from 200 to 1,350 photons/grain for wavelengths near 4,000 A. Taking the number to be 400, Webb indicates that 1/10 of these incident photons are absorbed by the grain and that of the 40 absorbed, only 10 are effective photographically.
For further information on the theory of the photographic process, the reader is referred to the excellent reviews by Berg (87) and by Mitchell (88), and to the incomparable book (89) edited by Mees.
B. Responsive Quantum Efficiency
One definition of the responsive quantum efficiency is the ratio of the number of developed grains to the number of incident photons. Under optimum conditions of wavelength, exposure, and development, this ratio was found above to be about 1/400 ≈ 0.2%. Another definition is the ratio of the number of grains made developable to the number of effectively absorbed photons. The ratio predicted by the Mott-Mitchell theory is 1/3 = 33%. Actual measurements carried out under optimum conditions indicate the ratio 1/10 = 10%. These results, which vary from 0.2 to 33%, may be compared with the results of 0.3 to 0.9% for the detective quantum efficiency (see Sec. C).
C. Detective Quantum Efficiency
In this section the detective quantum efficiency (DQE) of four Eastman Kodak films (abbreviated names: Royal-X, Tri-X, Plus-X, and Pan-X) is computed from sensitometric and granularity data generously supplied by the manufacturer. The DQE of a film depends on the wavelength of the radiation and on the amount of the preexposure. For each of the films, a curve of the DQE versus the preexposure (for a radiation wavelength of 430 mμ) is presented in Fig. 18. The DQE passes through a maximum as the preexposure is increased. The maximum values of the DQE for 430-mμ radiation are found to be 0.90, 0.59, 0.62, and 0.30% for the four films, and these maxima occur for preexposures of 0.0011, 0.0040, 0.010, and 0.018 erg/cm².
1. Derivation of a Working Formula for $Q_D$. The DQE of a given photographic negative may be expected to depend on the following:
1. The amount of ambient exposure
2. The spectral distribution of the radiation signal
3. The method of development
In this Sec. C, the dependence of the DQE on the first two items will be discussed. Standard developing conditions are assumed.
In application to photography, the concepts formulated in Sec. III may be interpreted as follows. The “noise” in the photographic negative is the density fluctuation from place to place on the surface: if one measures the density with an aperture of area $A$ at a large number of different places on the developed negative, the measured densities will not all be the same. The set of measured densities may be characterized by a mean density that will be denoted simply by $D$ and by an rms deviation from $D$ that is denoted by $\sigma$.
Suppose, then, that the entire surface of a negative is uniformly preexposed by the “ambient” radiation and that on one small region of area $A$, a small additional radiation “signal” is incident. The noise equivalent value of this signal is the value that produces a density increment equal to the rms density fluctuation $\sigma$ measured with apertures of the same area $A$. The value of $\sigma$ will depend, of course, on the area $A$ of the aperture, and indeed there is now good evidence (90) that $\sigma$ varies as $A^{-1/2}$ for apertures substantially larger than the size of the grains in the emulsion. The numerical values given in this Sec. C are for an arbitrarily chosen aperture 10 μ in diameter.
The symbol $U$ is used to indicate the exposure of the film in ergs per square centimeter. The preexposure is denoted by $U_a$. The amount of additional exposure that produces a density increment equal to the rms fluctuation may be called the noise equivalent exposure $U_N$ and is given by
$$U_N = \sigma\,dU/dD \tag{8.1}$$
where $dU/dD$ is the slope of the $U$-vs-$D$ curve at the point where the exposure $U$ is equal to $U_a$. In practice, the slope $dD/dU$ is determined by taking the ratio of small finite increments $\Delta D$ and $\Delta U$. Thus, $U_N$ may be written
$$U_N = \sigma\,\Delta U/\Delta D \tag{8.2}$$
The detective quantum efficiency as given by (3.24) now may be written
$$Q_D = \frac{\mathcal{E}\,U_a}{A}\left(\frac{\Delta D}{\sigma\,\Delta U}\right)^2 \tag{8.3}$$
where $U_a$ is the exposure at the middle of the range $\Delta U$. The chief results derived in this section are for 430-mμ radiation. At this wavelength, the energy of a photon is
$$\mathcal{E} = 4.6180 \times 10^{-12}\ \text{erg} \tag{8.4}$$
based on $h = 6.6238 \times 10^{-27}$ erg-sec and $c = 2.9979 \times 10^{10}$ cm/sec. The area of the 10-μ circular aperture is
$$A = 78.54 \times 10^{-8}\ \text{cm}^2 \tag{8.5}$$
The last three relations then yield
$$Q_D = 0.5880 \times 10^{-5}\,U_a\,(\Delta D/\sigma_{10}\,\Delta U)^2 \tag{8.6}$$
This is the “working” expression for the DQE of photographic negatives. $U_a$ and $\Delta U$ must be expressed in ergs per square centimeter.
2. Description of the Films. The Eastman Kodak Company has very generously provided sufficient data for the calculation of the DQE of four of its current films. The four films are:
1. Kodak Royal-X Pan Film, Code 6128
2. Eastman Tri-X Panchromatic Negative Film, Type 5233
3. Eastman Plus-X Panchromatic Negative Film, Type 4231
4. Kodak Panatomic-X Film, Code 5240
The data supplied by Eastman Kodak for these films are representative of the films at the time of manufacture, but it should be recognized that their characteristics can be expected to vary with manufacturing tolerances and may change as improvements are made. The first is sheet film, and the last three are 35-mm roll films. The films were manufactured in February, 1957. Table IV shows the abbreviations used in this review for these films, the developing conditions, and the gamma to which the films were developed.
TABLE IV. DATA CONCERNING THE FOUR PHOTOGRAPHIC FILMS
Material   Abbreviation   Time      Developer   Gamma
1          Royal-X        5 min     DK-50       0.65
2          Tri-X          6.5 min   SD-28       0.54
3          Plus-X         6.5 min   SD-28       0.79
4          Pan-X          6 min     D-76        0.67
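The numerical constant in the working expression, Eq. (8.6), follows from the photon energy at 430 mμ and the area of the 10-μ aperture. The sketch below retraces that arithmetic with the values of h and c quoted in the text; the function name `dqe_film` and its argument names are illustrative.

```python
import math

PLANCK_H_ERG_SEC = 6.6238e-27    # value quoted in the text
LIGHT_C_CM_SEC = 2.9979e10       # value quoted in the text
WAVELENGTH_CM = 430e-7           # 430 millimicrons
APERTURE_DIAMETER_CM = 10e-4     # 10 microns

# Eq. (8.4): energy of a photon in ergs.
photon_energy_erg = PLANCK_H_ERG_SEC * LIGHT_C_CM_SEC / WAVELENGTH_CM

# Eq. (8.5): area of the circular aperture in square centimeters.
aperture_area_cm2 = math.pi * (APERTURE_DIAMETER_CM / 2.0) ** 2

# Constant of Eq. (8.6): photon energy divided by aperture area.
k_86 = photon_energy_erg / aperture_area_cm2

def dqe_film(u_a, delta_d, sigma_10, delta_u):
    """Eq. (8.6): Q_D as a fraction; u_a and delta_u in ergs/cm^2."""
    return k_86 * u_a * (delta_d / (sigma_10 * delta_u)) ** 2

print(f"photon energy: {photon_energy_erg:.4e} erg")
print(f"aperture area: {aperture_area_cm2:.4e} cm^2")
print(f"constant of Eq. (8.6): {k_86:.4e}")
```

The constant comes out at 0.5880 × 10⁻⁵, in agreement with the printed value.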
3. Sensitometric Data. Eastman Kodak has supplied density-vs-log-exposure curves for each of the materials. The exposures were made to radiation that had passed through a narrow-band Wratten filter whose effective wavelength was 430 mμ. The exposures through the filter were calibrated against exposures made with a prism monochromator. The duration of the sensitometric exposures was 15 sec. There was a significant reciprocity failure at this duration, and the exposures were corrected at a density of 0.38 above base to the exposure that would be required for a 0.1-sec exposure. The reciprocity correction amounted to 0.45, 0.34, 0.29, and 0.29 log exposure units for materials 1 through 4. The density of the film base was 0.22 for all of the materials, and all of the densities in this Sec. C are densities above base (except in Sec. 5 and Fig. 17).
The density $D$ was read off the curves at points separated by 0.1 in $\log_{10} U$. As indicated by Table V, the values of $\log U$ at which the densities were read contained 5 in the second decimal place, in order that the midpoints of the intervals between successive values of $\log U$ would have zero in the second decimal place. In Tables VIII through XI, the value of $\log U_a$ in the first column is the logarithm of the value of $U_a$ at the midpoints of the intervals. One hundred times $U_a$ is given in the second column, and the third column shows 100 times $\Delta U$, where $\Delta U$ is defined as the difference between the two adjacent values of $U_a$ in Table V. (For example, for $\log_{10} U_a = -2.00$, the value of $100U_a$ is 1.00, and the value of $100\Delta U$ in the third column is obtained from the values of $U_a$ corresponding to $\log U_a = -1.95$ and $\log U_a = -2.05$.)
TABLE V. FILM DENSITY VERSUS EXPOSURE
log10 U_a (U_a in ergs/cm²): -3.55 -3.45 -3.35 -3.25 -3.15 -3.05 -2.95 -2.85 -2.75 -2.65 -2.55 -2.45 -2.35 -2.25 -2.15 -2.05 -1.95 -1.85 -1.75 -1.65 -1.55 -1.45 -1.35 -1.25 -1.15 -1.05 -0.95 -0.85 -0.75 -0.65 -0.55 -0.45
Density D, Royal-X: 0.105 0.105 0.110 0.121 0.141 0.170 0.207 0.251 0.302 0.360 0.422 0.487 0.552 0.617 0.681 0.742 0.801 0.859 0.913 0.964
Density D, Tri-X: 0.055 0.055 0.060 0.073 0.095 0.124 0.161 0.204 0.251 0.300 0.350 0.401 0.453 0.506 0.560 0.614
Density D, Plus-X: 0.051 0.051 0.054 0.063 0.079 0.103 0.134 0.173 0.218 0.269 0.326 0.388 0.455 0.526 0.601 0.679 0.758 0.837 0.916
Density D, Pan-X: 0.009 0.009 0.010 0.014 0.022 0.035 0.054 0.078 0.107 0.141 0.180 0.223 0.270 0.322 0.378 0.438 0.500 0.565 0.631 0.698 0.765
The densities were read off the curves with an estimated reading accuracy of 0.001 density unit. The densities so read were differenced, the differences $\Delta D$ were smoothed, and resummed. The resulting smoothed densities $D$ are shown in Table V and in the fourth columns of Tables VIII through XI. Then the differences $\Delta D$ were themselves differenced, smoothed, and resummed. The resulting smoothed $\Delta D$ values are shown in the fifth columns of Tables VIII through XI. This two-step smoothing operation was found necessary to yield values of $\Delta D$ that would plot smoothly against $\log U$. The value of $\Delta D$ shown in the tables, when multiplied by 10, is the finite difference approximation to the derivative $dD/d\log U$ of the $D$-vs-$\log U$ curve. The characteristic curves of the four materials are plotted in Fig. 15.
FIG. 15. The optical density of the developed films as a function of the logarithm of the 430-mμ exposure $U_a$ expressed in ergs per square centimeter. The curves are plotted from the data in Table V.
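The tabulation procedure described above (midpoint exposure, exposure increment, and finite-difference slope) can be sketched for a single interval as follows. The two density readings and the value of the density fluctuation used here are hypothetical, chosen only to illustrate the arithmetic; the constant is the one printed in Eq. (8.6).

```python
K_EQ_86 = 0.5880e-5   # constant of Eq. (8.6)
LOG_STEP = 0.1        # spacing of the density readings in log10 U

def interval_quantities(log_u_mid):
    """Return U_a at the midpoint and the increment of U across the interval."""
    u_low = 10.0 ** (log_u_mid - LOG_STEP / 2.0)
    u_high = 10.0 ** (log_u_mid + LOG_STEP / 2.0)
    return 10.0 ** log_u_mid, u_high - u_low

# Example from the text: the interval centered on log10 U_a = -2.00.
u_a, delta_u = interval_quantities(-2.00)

# Hypothetical smoothed density readings at log U = -2.05 and -1.95.
d_low, d_high = 0.740, 0.800
delta_d = d_high - d_low

# Finite-difference approximation to dD/d(log U): 10 times delta D.
slope_per_decade = delta_d / LOG_STEP

# Hypothetical rms density fluctuation for the 10-micron aperture.
sigma_10 = 0.16
q_d = K_EQ_86 * u_a * (delta_d / (sigma_10 * delta_u)) ** 2

print(f"100 U_a = {100 * u_a:.2f}, 100 delta U = {100 * delta_u:.4f}")
print(f"dD/dlogU ~ {slope_per_decade:.2f}")
print(f"Q_D = {100 * q_d:.3f} % (hypothetical inputs)")
```

For the interval of the text's example, the sketch gives 100 U_a = 1.00 and 100 ΔU of about 0.2308. The smoothing steps are not reproduced here because the text does not specify the smoothing rule.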
4. Granularity. With the aid of the automatic scanning microdensitometer described by Altman and Stultz (91),the density fluctuation of each type of film was measured-on specimens of four different densities with apertures 10, 20, and 40 p in diameter. The individual density readings were recorded on punched cards, and the calculation of u was made on an IBM 705. The 40 values of u supplied to the writer by Eastman Kodak are shown in Table VI. Each of the 40 values involves 2,000 individual density measurements. Actually, the entries in the table are ulo, 2 ~ 2 0 ,and 4 ~ 4 0 .If there
were no sampling error in the results and no effects of diffraction, one would expect these three values for the same film to be identical. With two exceptions, this expectation is fairly well confirmed by the numbers in the table. There is a noticeable tendency for $\sigma_{10}$ to be perhaps 10% lower than $2\sigma_{20}$ and $4\sigma_{40}$. This would be expected on the basis of the inevitable blurring of the edges of the circular spot by diffraction and the finite thickness of the developed layer. The three values are averaged with equal weight in the last column of Table VI, except that the entry 0.260 was given zero weight, and the entry 0.172 was given reduced weight relative to unit weight for the other two entries.

TABLE VI. MEAN-SQUARE DENSITY FLUCTUATION FOR CIRCULAR APERTURES 10, 20, AND 40 μ IN DIAMETER
Royal-X
  $D$                        0.10    0.34    0.80    1.18
  $\sigma_{10}$              0.097   0.147   0.260   0.186
  $2\sigma_{20}$             0.102   0.168   0.206   0.230
  $4\sigma_{40}$             0.100   0.164   0.192   0.232
  $(\sigma_{10})_{Av}$       0.100   0.160   0.199   0.216

Tri-X
  $D$                        0.06    0.36    0.74    1.14
  $\sigma_{10}$              0.056   0.085   0.090   0.122
  $2\sigma_{20}$             0.074   0.104   0.112   0.122
  $4\sigma_{40}$             0.064   0.128   0.116   0.172
  $(\sigma_{10})_{Av}$       0.065   0.106   0.106   0.129

Plus-X
  $D$                        0.06    0.30    0.80    1.40
  measured (two apertures)   0.030   0.061   0.074   0.069
                             0.030   0.058   0.074   0.062
  $(\sigma_{10})_{Av}$       0.030   0.060   0.074   0.066

Pan-X
  $D$                        0.04    0.26    0.66    1.10
  measured (two apertures)   0.029   0.048   0.057   0.076
                             0.034   0.054   0.062   0.066
  $(\sigma_{10})_{Av}$       0.032   0.051   0.060   0.071
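The averaging rule described above can be illustrated with the Royal-X entries of Table VI. The following Python sketch is only an illustration: the helper name `weighted_mean` and the data layout are assumptions, and the only weights used are those stated in the text (equal weights, with zero weight for the entry 0.260).

```python
def weighted_mean(values, weights):
    """Weighted mean of the aperture-normalized density fluctuations."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Royal-X entries of Table VI: (sigma_10, 2*sigma_20, 4*sigma_40) at each density D.
royal_x = {
    0.10: (0.097, 0.102, 0.100),
    0.34: (0.147, 0.168, 0.164),
    0.80: (0.260, 0.206, 0.192),
    1.18: (0.186, 0.230, 0.232),
}
# Equal weights everywhere, except zero weight for the outlying entry 0.260.
weights = {0.10: (1, 1, 1), 0.34: (1, 1, 1), 0.80: (0, 1, 1), 1.18: (1, 1, 1)}

averages = {d: round(weighted_mean(royal_x[d], weights[d]), 3) for d in royal_x}
print(averages)
```

The four averages agree with the last Royal-X line of the table (0.100, 0.160, 0.199, 0.216), which supports reading that line as the weighted mean of the three aperture-normalized values.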
The 16 values of $\sigma_{10}$ in the last column are plotted in Fig. 16 versus the density, and the indicated smooth curve was drawn through the points for each of the four materials. These curves were used to find the values of $\sigma_{10}$ for intermediate densities. The value of $\sigma_{10}$ was read from the curve for each of the densities shown in the fourth column of Tables VIII through XI. The values so read were differenced, smoothed, and re-summed to obtain the values of $\sigma_{10}$ shown in the sixth column of the tables.

5. Wavelength Dependence. The Eastman Kodak Company has also
supplied data on the absolute sensitivity-vs-wavelength characteristics of these four films. The data supplied, however, are not for the same emulsion numbers as those for the other data in this report. The data were supplied in the form of curves showing the sensitivity (reciprocal of the energy per unit area in ergs per square centimeter required to produce a density of 1.0 above the gross fog density) versus the wavelength. This density is of course much higher than the densities at which the detective quantum efficiency is a maximum. Since there is some change in the shape of the H and D curve as the wavelength is varied, the curves should be used with caution.
[Figure 16: log-log plot. Ordinate: rms density fluctuation $\sigma_{10}$ (0.02 to 0.2); abscissa: density $D$ (0.02 to 2.0).]
FIG. 16. Showing the rms density fluctuation as a function of the mean optical density for the four films. The films were uniformly exposed to 430-mμ radiation. The ordinate is the rms density fluctuation measured with a circular aperture 10 μ in diameter. The points represent the data in the last column of Table VI.
As the wavelength is varied, one would expect the detective quantum efficiency to vary in proportion to the sensitivity divided by the wavelength. If we denote the sensitivity as defined above by $S(\lambda)$, then the function

$$W(\lambda) = S(\lambda)/\lambda \tag{8.7}$$

will vary with wavelength in the same way as the detective quantum efficiency. That is to say, the detective quantum efficiency at the wavelength $\lambda$ is given by

$$Q_D(\lambda) = Q_D(430)\,W(\lambda)/W(430) \tag{8.8}$$
We might have chosen to plot $Q_D(\lambda)$, $W(\lambda)$, or $W(\lambda)/W(430)$. We chose to plot $W(\lambda)$ in Fig. 17 because this choice not only preserves all of the information in the original data but also provides a separation among the four curves. The values of $W(430)$ for materials 1 through 4 are 78.8, 12.67, 2.196, and 1.078 cm²/erg/μ. Thus, if one wishes to find the values of $Q_D(\lambda)$ that
correspond to the values of $Q_D(430)$ shown in Table VII, the values of $W(\lambda)$ shown in Fig. 17 must be multiplied by the factors 0.01137, 0.0464, 0.281, or 0.274, respectively.

6. Numerical Results. The second, third, fifth, and sixth columns of Tables VIII through XI contain the information required for the calculation of the DQE by Eq. (8.6).

TABLE VII. PARAMETERS CORRESPONDING TO THE MAXIMUM VALUE OF $Q_D$

Film       $Q_{\max}$   $g$      $g/\gamma$   $\log_{10} U_a$   $D$
Royal-X    0.895        0.416    0.640        -2.94             0.211
Tri-X      0.588        0.370    0.685        -2.40             0.142
Plus-X     0.618        0.310    0.392        -2.00             0.118
Pan-X      0.295        0.260    0.388        -1.75             0.078
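The multiplying factors quoted for Fig. 17 follow directly from Eq. (8.8): each is the maximum $Q_D(430)$ of Table VII divided by the corresponding $W(430)$. A short Python sketch of that arithmetic is given here. The dictionary and function names are illustrative, and the sketch assumes that the Table VII maxima are in per cent, as in Tables VIII through XI.

```python
# W(430) in cm^2/(erg-micron) and maximum Q_D(430), from the text and Table VII.
w_430 = {"Royal-X": 78.8, "Tri-X": 12.67, "Plus-X": 2.196, "Pan-X": 1.078}
q_max = {"Royal-X": 0.895, "Tri-X": 0.588, "Plus-X": 0.618, "Pan-X": 0.295}

# Eq. (8.8): Q_D(lambda) = [Q_D(430) / W(430)] * W(lambda).
factors = {film: q_max[film] / w_430[film] for film in w_430}


def q_d(film, w_lambda):
    """Q_D at a wavelength for which Fig. 17 gives W(lambda) = w_lambda."""
    return factors[film] * w_lambda


for film, factor in factors.items():
    print(f"{film}: {factor:.4g}")
```

The computed factors agree with the quoted 0.01137, 0.0464, 0.281, and 0.274 to within rounding of the last digit.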
FIG.17. Showing the sensitivity divided by the wavelength as a function of the wavelength. The sensitivity is the reciprocal of the number of ergs per square centimeter required to produce a density of 1.0 above gross fog. The sensitivity divided by the wavelength is in the units: cm2/(erg-micron).
The result of the calculation is shown in the seventh column of the tables; the value of $Q_D$ is tabulated in per cent. The DQE is plotted in Fig. 18 versus the exposure $U$ and in Fig. 19 against the gradient $g = dD/d\log_{10} U$, approximated by 10 times $\Delta D$. Both of the figures indicate clearly that the DQE has a maximum value within the range of exposures included in the calculations. Table VII lists
TABLE VIII. DATA AND RESULTS FOR ROYAL-X FILM^a

$\log_{10} U_a$   $100U_a$   $100\,\Delta U$   $D$      $10\,\Delta D$   $\sigma_{10}$   $Q_D$, per cent
-3.4              0.03981    0.009191          0.1075   0.045            0.1040          0.0519
-3.3              0.05012    0.01156           0.1155   0.115            0.1070          0.255
-3.2              0.06311    0.01456           0.1310   0.197            0.1120          0.541
-3.1              0.07944    0.01834           0.1555   0.282            0.1200          0.767
-3.0              0.1000     0.02307           0.1885   0.365            0.1294          0.879
-2.9              0.1259     0.02910           0.2290   0.444            0.1392          0.889
-2.8              0.1585     0.03650           0.2765   0.514            0.1487          0.836
-2.7              0.1995     0.04610           0.3310   0.573            0.1575          0.730
-2.6              0.2512     0.05791           0.3910   0.618            0.1654          0.615
-2.5              0.3162     0.07301           0.4545   0.643            0.1726          0.484
-2.4              0.3981     0.09191           0.5195   0.650            0.1790          0.365
-2.3              0.5012     0.1156            0.5845   0.644            0.1848          0.268
-2.2              0.6311     0.1456            0.6490   0.625            0.1900          0.189

^a Exposure $U$ is in ergs/cm².
TABLE IX. DATA AND RESULTS FOR TRI-X FILM^a

$\log_{10} U_a$   $100U_a$   $100\,\Delta U$   $D$      $10\,\Delta D$   $\sigma_{10}$   $Q_D$, per cent
-2.8              0.1585     0.03650           0.0575   0.050            0.0645          0.0420
-2.7              0.1995     0.04610           0.0665   0.125            0.0667          0.194
-2.6              0.2512     0.05791           0.0840   0.210            0.0706          0.390
-2.5              0.3162     0.07301           0.1095   0.295            0.0754          0.534
-2.4              0.3981     0.09191           0.1425   0.370            0.0804          0.587
-2.3              0.5012     0.1156            0.1825   0.430            0.0854          0.559
-2.2              0.6311     0.1456            0.2275   0.470            0.0902          0.475
-2.1              0.7944     0.1834            0.2755   0.490            0.0946          0.372
-2.0              1.000      0.2307            0.3250   0.504            0.0986          0.289
-1.9              1.259      0.2910            0.3755   0.515            0.1021          0.222
-1.8              1.585      0.3650            0.4270   0.525            0.1051          0.174
-1.7              1.995      0.4610            0.4795   0.534            0.1076          0.136
-1.6              2.512      0.5791            0.5330   0.540            0.1096          0.107

^a Exposure $U$ is in ergs/cm².
a number of properties associated with the maximum value of $Q_D$; the table includes the maximum value of $Q_D$, the gradient $g$ at the maximum, the ratio $g/\gamma$ of this gradient to gamma, the exposure $U$, and the density $D$. The numerical values of all these quantities, except the DQE itself, are not too well determined by the data, since a slight change in the smoothing operations could conceivably shift the position of the maximum by an appreciable amount.
TABLE X. DATA AND RESULTS FOR PLUS-X FILM^a

$\log_{10} U_a$   $100U_a$   $100\,\Delta U$   $D$      $10\,\Delta D$   $\sigma_{10}$   $Q_D$, per cent
-2.3              0.5012     0.1156            0.0585   0.090            0.0295          0.205
-2.2              0.6311     0.1456            0.0710   0.162            0.0327          0.430
-2.1              0.7944     0.1834            0.0910   0.237            0.0369          0.573
-2.0              1.000      0.2307            0.1185   0.312            0.0417          0.618
-1.9              1.259      0.2910            0.1535   0.383            0.0467          0.588
-1.8              1.585      0.3650            0.1955   0.450            0.0516          0.532
-1.7              1.995      0.4610            0.2435   0.512            0.0561          0.460
-1.6              2.512      0.5791            0.2975   0.570            0.0600          0.398
-1.5              3.162      0.7301            0.3570   0.624            0.0632          0.340
-1.4              3.981      0.9191            0.4215   0.673            0.0657          0.291
-1.3              5.012      1.156             0.4905   0.716            0.0675          0.248
-1.2              6.311      1.456             0.5635   0.752            0.0688          0.209
-1.1              7.944      1.834             0.6400   0.776            0.0697          0.172
-1.0              10.00      2.307             0.7185   0.790            0.0703          0.140
-0.9              12.59      2.910             0.7975   0.790            0.0706          0.109
-0.8              15.85      3.650             0.8765   0.790            0.0706          0.088

^a Exposure $U$ is in ergs/cm².
TABLE XI. DATA AND RESULTS FOR PAN-X FILM^a

$\log_{10} U_a$   $100U_a$   $100\,\Delta U$   $D$      $10\,\Delta D$   $\sigma_{10}$   $Q_D$, per cent
-2.2              0.6311     0.1456            0.0120   0.040            0.0237          0.0498
-2.1              0.7944     0.1834            0.0180   0.080            0.0262          0.129
-2.0              1.000      0.2307            0.0285   0.126            0.0291          0.207
-1.9              1.259      0.2910            0.0445   0.177            0.0323          0.262
-1.8              1.585      0.3650            0.0660   0.230            0.0356          0.292
-1.7              1.995      0.4610            0.0925   0.283            0.0388          0.294
-1.6              2.512      0.5791            0.1240   0.336            0.0419          0.283
-1.5              3.162      0.7301            0.1605   0.388            0.0448          0.262
-1.4              3.981      0.9191            0.2015   0.437            0.0475          0.235
-1.3              5.012      1.156             0.2465   0.483            0.0500          0.206
-1.2              6.311      1.456             0.2960   0.524            0.0523          0.176
-1.1              7.944      1.834             0.3500   0.562            0.0544          0.148
-1.0              10.00      2.307             0.4080   0.595            0.0564          0.123
-0.9              12.59      2.910             0.4690   0.623            0.0582          0.100
-0.8              15.85      3.650             0.5325   0.645            0.0599          0.0811
-0.7              19.95      4.610             0.5980   0.661            0.0614          0.0640
-0.6              25.12      5.791             0.6645   0.670            0.0628          0.0501
-0.5              31.62      7.301             0.7315   0.670            0.0640          0.0382

^a Exposure $U$ is in ergs/cm².
7. Discussion. All the curves of the DQE in Figs. 18 and 19 show a maximum for an intermediate value of the preexposure $U_a$. The qualitative reasons for the existence of this maximum are easy to understand. It is well known that in order for a grain to become developable, the grain must be acted on by a number of photons, of the order of magnitude 10. At exposures much less than that for $Q_{\max}$, the large majority of the incident photons are wasted on grains that receive less than the necessary number to become developable. On the other hand, for exposures much greater than that of the maximum, a large majority of the incident photons are wasted on grains that have already received a number sufficient to make them
[Figure 18: log-log plot of the detective quantum efficiency. Abscissa: exposure $U_a$ in ergs/cm² ($10^{-4}$ to $10^{-1}$).]
FIG. 18. Showing the detective quantum efficiency of the four films as a function of the exposure in ergs per square centimeter for 430-mμ radiation. The results are for the standard developing conditions shown in Table IV. The data are from the last columns of Tables VIII through XI.
developable. At some intermediate exposure, corresponding to the maximum, there is a situation in which these two tendencies are balanced. It must be understood that this argument is qualitative, and cannot be made the basis of a detailed analysis. The detective quantum efficiency is one of the best single measures of the detecting ability of a photographic negative. More precisely, as one changes the size of the grains in an emulsion, a constant value of the DQE would indicate an optimum swapping of freedom-from-granularity for detectivity. In the light of this comment, it is interesting to observe that the maximum value of the DQE for the Royal-X, Tri-X, and Plus-X films
is within ±17% of the value 0.75%. The maximum value of the DQE for the Pan-X film is definitely lower. Indeed, the inferior performance of the Pan-X film is also noticeable in the energy detectivity and contrast detectivity of this same group of films (91a).
[Figure 19: plot of the detective quantum efficiency against the gradient. Ordinate: detective quantum efficiency, logarithmic scale (0.1 to 1.0); abscissa: gradient (0 to 0.6).]
FIG. 19. Showing the detective quantum efficiency of the four films as a function of the gradient of the $D$-vs-$\log_{10} U$ curve (H and D curve) for 430-mμ radiation. The quantity 10 times the value of $\Delta D$ (that corresponds to an increment of 0.1 in $\log_{10} U$) is the finite increment approximation to the gradient $dD/d\log_{10} U$ of the H and D curves. All of the curves are multivalued as a function of the gradient, but only the curve for Royal-X is shown explicitly as a multivalued function. The data plotted are the same as those shown in Fig. 18; however, a different abscissa is used here.

To be sure, the scattering of light in the undeveloped film is an important limitation on the ability of these films to resolve detail, and the detective quantum efficiency fails to take this aspect of image structure into account. And indeed, examination of the resolving power values for Plus-X
and Pan-X (95 and 130 lines/mm, respectively) indicates that the particular usefulness of Pan-X film lies in its smaller light-scattering power relative to Plus-X. This smaller light-scattering power is not evident in the granularity data and indicates a basic difference in the structure of the emulsion.

The value of the DQE for a given film depends on the ambient exposure $U_a$, the wavelength $\lambda$ of the radiation, and the method of development. The dependence on $U_a$ is shown explicitly in the tables and in Figs. 18 and 19. The dependence on the wavelength is shown in Fig. 17. The value of $Q_D$ is independent of the area $A$. In this paper only the standard method of development specified in Table IV is considered. The writer believes that the DQE is a sufficiently fundamental property of a film that it will be found to be relatively insensitive to the method of development, but at present this is only a hypothesis.

We conclude this section with a discussion of the data illustrated in Figs. 15 and 16, relating to the sensitometric curve, and to the granularity. The sensitometric curves of the four films are all different. The Royal-X curve has a gradient that falls off rapidly after reaching its maximum, but the other three curves, particularly the Tri-X and Plus-X, have a long straight-line portion after the maximum gradient is reached. In the toe region, the Tri-X curve is different from the others in that the gradient increases rapidly at first, and then increases at a much slower rate; at $\log U = -2.1$, the gradient is already 0.49, but it does not reach its maximum value of 0.54 until $\log U = -1.6$.

In the granularity curves ($\sigma$ vs $D$) in Fig. 16, there is a clear tendency for the Tri-X and Pan-X data to have lower slopes than the other two films. On the other hand, the data for the Plus-X film are unique in indicating a maximum in $\sigma$ near the density $D = 1.0$.
All of the curves show a lower average slope than the slope 0.4 indicated by the granularity data published in 1946 by L. A. Jones and G. C. Higgins (92), although the beginning portions of the Royal-X and Plus-X curves do have a slope of about 0.4. (The slope here considered is the derivative $d\log\sigma/d\log D$.)

IX. HUMAN VISION

A. Introduction
In this section we present results on the responsive and detective quantum efficiencies of human vision. The information on the responsive quantum efficiency is for the fully dark-adapted eye, whereas the results for the detective quantum efficiency are for the light-adapted eye. New results are given on the DQE; these new results are confined to the photopic range of adaptation (cone vision), but older results are also given that cover the range of adapting luminance from to foot-lamberts.
The new results for foveal vision are based on flash perception data due to Blackwell and McCready (93) and kindly supplied to the writer in advance of publication.* The new values of the DQE are tabulated in Table XVII for various values of target diameter $\alpha$, light pulse duration $T$, and background luminance $B$. The maximum values of the DQE (with respect to variation of $\alpha$ and $T$) range from about 0.25% to about 1.0% over the range from 0.1 to 100 foot-lamberts, with the maximum value occurring at about 1.0 foot-lambert. The computed values of the DQE are free of the questionable assumption previously used by Rose (94) and by Jones (95) that the eye has an assignable integration time.
B. Responsive Quantum Efficiency

The response of the eye to light is not usually expressed as a quantum efficiency, and thus we must inquire just what we mean by responsive quantum efficiency (RQE). With vision, as with photographic films, there are many different ways of defining the responsive quantum efficiency, and as we shall see, the different methods yield greatly dissimilar results. The response event may be either the perception of a flash of light or the excitation of a rod or cone. The input event may be an incident photon, or the effective absorption of a photon. From these two pairs of possibilities, we have three different definitions of the RQE: (1) the ratio of the number of flash perceptions to the number of incident quanta, (2) the ratio of the number of flash perceptions to the number of photons that are effectively absorbed, and (3) the ratio of the number of rod excitations to the number of effectively absorbed photons. We now examine the experimental data that indicate the values of these ratios.

We shall, of course, have the highest values for these ratios if the eye is completely dark-adapted and if the photons to be detected are the only photons in the field of view. The results given here are confined to the fully dark-adapted state. Results could be given for various light-adapted conditions also, but these results would be primarily a mere restatement of the data on contrast perception of the eye.

According to the careful measurements of Hecht, Shlaer, and Pirenne (96), the mean energy incident on the pupil of the eye corresponding to a 60% probability of seeing varied among the subjects from 2.1 to 5.7 × 10⁻¹⁰ erg. The conditions of the tests were: test field 1/6° in diameter; monochromatic light of $\lambda$ = 510 mμ; stimulus duration 0.001 sec; the test field was placed at an angle of 20° from the fixation point; the subjects were thoroughly dark-adapted.

* Some of these data are published in graphical form by Dr. Blackwell (93a).
The energies given above correspond to the energies of 54 to 148 photons. On the basis of Pirenne's estimate (97) that 10% of the incident photons are absorbed by the retina, the corresponding number of absorbed photons is 5 to 14. The most startling conclusion about the dark-adapted eye, however, is the indubitable fact that individual rods are often triggered by absorption of a single photon (96, 97). Probability considerations establish conclusively that a test field is often seen when none of the subject's retinal rods has absorbed more than one photon. Using rough mean values, we see that the above results lead to a responsive quantum efficiency of 1, 10, or 100%, where the first number is the ratio of flash sensations to incident photons, the second is the ratio of flash sensations to absorbed photons, and the third is the ratio of rod excitations to absorbed photons.
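The photon counts quoted above can be checked with a few lines of arithmetic. The Python sketch below assumes threshold energies of 2.1 and 5.7 × 10⁻¹⁰ erg (an exponent inferred here from the quoted range of 54 to 148 photons) and uses modern values of Planck's constant and the speed of light; the function name is illustrative.

```python
H_ERG_SEC = 6.62607e-27   # Planck's constant, erg-sec
C_CM_SEC = 2.99792e10     # speed of light, cm/sec


def photons(energy_erg, wavelength_mmu):
    """Number of photons of the given wavelength (millimicrons) in energy_erg."""
    wavelength_cm = wavelength_mmu * 1e-7
    return energy_erg * wavelength_cm / (H_ERG_SEC * C_CM_SEC)


ABSORBED_FRACTION = 0.10  # Pirenne's estimate quoted in the text

for energy in (2.1e-10, 5.7e-10):   # assumed threshold energies, erg
    incident = photons(energy, 510.0)
    print(f"{energy:.1e} erg: {incident:.0f} incident, "
          f"{ABSORBED_FRACTION * incident:.0f} absorbed")
```

The incident counts come out close to the quoted 54 and 148 photons. With roughly 100 incident and 10 absorbed photons per perceived flash, the first two ratios are about 1/100 and 1/10, which is the origin of the rough figures of 1% and 10%.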
C. Detective Quantum Efficiency

The new experimental data of Blackwell and McCready (93) are used to derive an improved estimate of the detective quantum efficiency (DQE) of human photopic vision. The new data are of a different kind than those used by Rose (94) and by the writer (95), and are better suited to the problem at hand. In particular, questions about the temporal integration ability of the eye may be avoided.

With all nonbiological detectors, the output of the detector can be considered to be a signal-to-noise ratio, and it was in terms of this output signal-to-noise ratio that the concept of detective quantum efficiency was formulated in Sec. III. But with the detection process of human vision, the output is not a signal-to-noise ratio. Rather, the output is the probability of detecting the given radiation signal. Thus, the output is the result of a decision-making process. In order, therefore, to define the detective quantum efficiency of vision, we must compare the decision-making abilities of human vision with the decision-making abilities of an ideal decision-making device.

Suppose that to make a given kind of discrimination, an ideal device would require the signal-to-noise ratio $k$. Suppose further that in order that the human observer be able to make the same kind of discrimination, the human observer requires a signal-to-noise ratio in the incident radiation denoted as in Sec. III by $M_s/M_a^{1/2}$. Then the detective quantum efficiency is defined by

$$Q_D = \frac{k^2 M_a}{M_s^2} \tag{9.1}$$
It is a matter of taste whether this be regarded as a new definition of the
detective quantum efficiency. One could define the detective quantum efficiency in a more general way that would include both the above definition and the definition used in Sec. III. Up to this Sec. IX, however, the relatively more simple definition used in Sec. III was sufficient, and therefore we choose to introduce the more general definition at this point.

The first suggestion that statistical fluctuation in the arrival of photons might limit the contrast perception of the eye was made by Barnes and Czerny (98) in 1932. Incomplete approaches to the quantitative effect of these fluctuations were made by Hecht et al. (96), Rose (7), and de Vries (99), in 1942 and 1943, but the full use of the statistical approach awaited the work of Rose (8, 94) published in the period 1946-1948. In the last ten years there have been many publications (100-104) that consider visual performance as a signal-to-noise problem. The references cited are not at all complete, but they are sufficient to lead to this literature.

1. The Signal-to-Noise Ratio k. The purpose of this Sec. 1 is to show how one may calculate the minimum signal-to-noise ratio $k$ that must be presented at the input of an ideal device if that device is to be able to make certain specific kinds of discrimination, and to present the result of such calculations.

We will suppose that the noise involves a Gaussian distribution of amplitudes. (To be sure, the distribution of photons is Poisson rather than Gaussian, but except when the number of photons is small, the two distributions are practically indistinguishable.)

Suppose now that we wish to detect a signal in the presence of Gaussian noise. The situation is as illustrated in Fig. 20. The Gaussian curve on the left labeled “noise only” shows the distribution of amplitudes when the signal is not present, and the Gaussian curve on the right labeled “signal plus noise” shows the distribution of amplitudes when the signal is present.
The signal itself is supposed to have a fixed amplitude; the result of this assumption is that the two Gaussian curves in Fig. 20 have the same width. We suppose the left Gaussian curve to be expressed by

$$y = (2\pi)^{-1/2} e^{-x^2/2} \tag{9.2}$$

and the right Gaussian curve by

$$y = (2\pi)^{-1/2} e^{-(x-k)^2/2} \tag{9.3}$$
So written, both Gaussians have unit area and unit standard deviation, whence it follows that k is the ratio of the signal amplitude to the rms noise amplitude. We now suppose that the ideal device has a threshold; that is to say, if the measured amplitude is above the decision threshold, the device concludes that the signal is present and otherwise concludes that only noise is
present. The position of the threshold is denoted by $T$. Then the shaded area to the right of the vertical line at $T$ is the probability $f$ that the device falsely concludes that a signal is present when it is not, and the shaded area to the left of the vertical line is the probability $1 - p$ that the device concludes a signal is not present when it is. The probability $p$ is then the probability that the signal is detected when it is present. We call $f$ the false-alarm fraction and $p$ the (unnormalized) reliability of detection.
[Figure 20: two Gaussian curves of equal width, labeled "noise alone" (centered at 0) and "signal plus noise" (centered at $k$), with the decision threshold $T$ marked between them.]
FIG. 20. The Gaussian bell on the left represents the distribution of amplitudes when the signal is absent and the similar bell on the right is the distribution of signal-plus-noise amplitudes when the signal is present. If the decision threshold is at the position $T$, the shaded area to the left of the line represents the probability that the device fails to detect a signal that is present, and the shaded area to the right of the line represents the probability that the device judges a signal present when it is not (a false alarm).
Actually, the unnormalized reliability $p$ just defined is not what one would logically define as the reliability. One notes that to achieve an unnormalized reliability equal to the false-alarm fraction, the signal-to-noise ratio $k$ required is zero! Accordingly, it is customary in situations of this kind to introduce a renormalized reliability $q$ that is zero when the probability of recognizing a signal is equal to the false-alarm fraction. A suitable definition of $q$ is

$$q = (p - f)/(1 - f) \tag{9.4}$$
In formulating the concept of renormalized reliability $q$, we are following the practice of Blackwell and McCready and many other authors. The discussion could equally well be carried out in terms of $p$. We see that Fig. 20 establishes a graphical relation between the signal-to-noise ratio $k$, the false-alarm fraction $f$, and the reliability $p$. To obtain
the corresponding analytic relation, we define $x = \operatorname{erf}^{-1} y$ as the relation that is inverse to

$$y = \operatorname{erf} x = (2\pi)^{-1/2} \int_{-\infty}^{x} e^{-u^2/2}\,du \tag{9.5}$$

where erf $x$ is the well-known error function, tabulated in many well-known tables. Then it is easy to show that the signal-to-noise ratio $k$ required by the ideal device of Fig. 20 to achieve a reliability $q$ with a false-alarm fraction $f$ is given by

$$k = \operatorname{erf}^{-1}(1 - f) + \operatorname{erf}^{-1} p \tag{9.6}$$

or

$$k = \operatorname{erf}^{-1}(1 - f) + \operatorname{erf}^{-1}[f + (1 - f)q] \tag{9.7}$$
Thus, by inverse interpolation in a table of the error function, one may determine the signal-to-noise ratio required by an ideal device to detect a signal with a given reliability $p$ and a given false-alarm fraction $f$.

TABLE XII. THE ENTRIES ARE THE VALUES OF THE SIGNAL-TO-NOISE RATIO $k$ REQUIRED BY AN IDEAL DETECTOR TO ACHIEVE A RELIABILITY $q$ AND A FALSE-ALARM FRACTION $f$

False-alarm fraction $f$, per cent   $q$ = 50%   $q$ = 90%   $q$ = 99%
50                                   0.67        1.64        2.58
10                                   1.41        2.62        3.65
1                                    2.34        3.62        4.66
0.1                                  3.09        4.37        5.42
0.01                                 3.72        5.00        6.05
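The entries of Table XII can be regenerated without inverse interpolation in printed tables. The Python sketch below is a minimal illustration; it assumes that the function called erf in Eq. (9.5) is the cumulative distribution of a unit normal variable (an interpretation that reproduces the tabulated values), and the function name `required_snr` is hypothetical.

```python
from statistics import NormalDist

_UNIT_NORMAL = NormalDist()   # zero mean, unit standard deviation


def required_snr(f, q):
    """Signal-to-noise ratio k of Eq. (9.7) for false-alarm fraction f and reliability q."""
    p = f + (1.0 - f) * q     # unnormalized reliability, from Eq. (9.4)
    return _UNIT_NORMAL.inv_cdf(1.0 - f) + _UNIT_NORMAL.inv_cdf(p)


for f in (0.5, 0.1, 0.01, 0.001, 0.0001):
    row = [required_snr(f, q) for q in (0.50, 0.90, 0.99)]
    print(f"f = {100 * f:g}%: " + "  ".join(f"{k:.2f}" for k in row))
```

The computed values agree with Table XII to within about 0.01, the small differences being of the size expected from interpolation in a printed table.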
Table XII contains a set of $k$ values determined in this way. Inspection of the table shows that if one simultaneously requires a small false-alarm fraction and reliability close to unity, a signal-to-noise ratio in the range from 4 to 6 is required. At the other end of the scale, if one permits a 50% false-alarm fraction and requires only a 50% reliability of detection, a signal-to-noise ratio of only 0.67 is required. This concludes the considerations relating to Fig. 20.

Next we consider multiple-choice alternatives. Suppose, for example, that a signal is presented in one and only in one of $M$ similar channels. We suppose that the rms amplitude of the noise is the same in each of the $M$ channels and that the device has no bias in favor of any of the channels. Accordingly, the false-alarm fraction of each channel is just $f = 1/M$.
We suppose further that the device examines the amplitude in each channel and selects in each trial the channel in which the amplitude is greatest. In this situation there is no fixed threshold; the device selects the channel whose amplitude is largest. The mathematical analysis is more complicated in this multiple-choice situation. The problem has been formulated, and results obtained by numerical integration, by Birdsall and Peterson (106). Their results include $M$ = 2, 3, 4, 8, 16, 32, 256, and 1,000 and are presented in the form of a plot of $k$ vs $p$, with $M$ as the parameter of the separate curves. Table XIII was constructed by reading points off their curves.

TABLE XIII. THE ENTRIES ARE THE VALUES OF THE SIGNAL-TO-NOISE RATIO $k$ REQUIRED BY AN IDEAL DETECTOR TO ACHIEVE A RELIABILITY $q$ IN AN $M$-CHANNEL FORCED-CHOICE SITUATION

                              Signal-to-noise ratio $k$
Reliability $q$, per cent     $M$ = 2    $M$ = 4    $M$ = 8
50                            0.95       1.22       1.52
90                            2.36       2.65       2.93
99                            3.25       3.80       4.11
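The forced-choice calculation can be sketched numerically in Python. The sketch assumes independent Gaussian noise of unit standard deviation in each channel, so that the probability of choosing the signal channel is the integral of pdf(t) · cdf(t + k)^(M−1) over the noise deviation t; this standard expression is not written out in the text and is an assumption here, as are the function names. The renormalization follows Eq. (9.4) with $f = 1/M$.

```python
from statistics import NormalDist

_UNIT_NORMAL = NormalDist()


def p_correct(k, m, step=0.02, span=8.0):
    """Probability that the signal channel shows the largest of m amplitudes.

    Trapezoidal integration of pdf(t) * cdf(t + k)**(m - 1) over the noise
    deviation t in the signal channel.
    """
    n = int(round(2.0 * span / step))
    total = 0.0
    for i in range(n + 1):
        t = -span + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * _UNIT_NORMAL.pdf(t) * _UNIT_NORMAL.cdf(t + k) ** (m - 1)
    return total * step


def k_forced_choice(q, m):
    """Signal-to-noise ratio k giving renormalized reliability q with m channels."""
    target = 1.0 / m + (1.0 - 1.0 / m) * q   # p from Eq. (9.4) with f = 1/m
    low, high = 0.0, 10.0
    for _ in range(40):                      # bisection; p_correct increases with k
        mid = 0.5 * (low + high)
        if p_correct(mid, m) < target:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


for m in (2, 4, 8):
    print(m, [round(k_forced_choice(q, m), 2) for q in (0.50, 0.90, 0.99)])
```

For $M$ = 2 and $q$ = 50% the sketch gives $k$ of about 0.95, and for $M$ = 4 about 1.2, in line with Table XIII. Since the table was read off published curves, agreement to the last digit should not be expected for every entry.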
2. The Experimental Data. We here describe the experimental measurements of Blackwell and McCready (93). The observer looked at a uniformly illuminated surface of luminance $B$. This uniformly illuminated surface subtended about 40° at the eye of the observer. The uniformity was interrupted by faint steady fixation sources that directed the observer's eye to a given point at the center of this uniform surface. At certain times the luminance of a circular area of diameter $\alpha$ centered at the fixation point was increased by the amount $CB$ for a period of duration $T$, where $C$ is called the contrast.

The total duration of each observation period was 10 sec. This period was divided into four successive 2.5-sec periods, in one and only one of which the signal of duration $T$ was presented. The observer was required to guess in which one of the four periods the signal was present. Acoustical signals were provided to indicate the beginning and end of each 10-sec period and to indicate the beginning of each 2.5-sec period (one click at the beginning and end of the 10-sec period; two clicks between the 2.5-sec periods). The center of the signal duration was always at the center of one of the 2.5-sec periods.

In summary, the task of the observer was to detect a signal presented with contrast $C$, for the duration $T$, covering an area of angular diameter $\alpha$ on a background of luminance $B$. The type of detection may be called
loosely “flash perception,” in order to distinguish it from the earlier “contrast perception” work of Blackwell in which steady signals were presented to the observer.

The measurements covered all of the combinations of five background luminances ($B$ = 100, 10, 1, 0.1, and 0.001 foot-lamberts), four target diameters ($\alpha$ = 1, 4, 16, 64 min), and seven durations ($T$ = 1/1,000, 1/300, 1/100, 1/30, 1/10, 1/3, and 1.0 sec). There were thus 140 measuring conditions in all. For each of these measuring conditions, the contrast $C$ was varied by small increments in order to provide the data for a psychometric function. All of the contrast thresholds given in this report are for a renormalized reliability $q$ = 50%.

TABLE XIV. THE ENTRIES ARE THE VALUES OF THRESHOLD CONTRAST $C$ (FOR A RELIABILITY $q$ = 50% IN A FOUR-CHANNEL FORCED-CHOICE SITUATION) WITH A TARGET DIAMETER $\alpha$ IN MINUTES AND A BACKGROUND LUMINANCE $B$ IN FOOT-LAMBERTS. ALL THE ENTRIES ARE FOR A LIGHT PULSE DURATION OF 0.1 SEC. THE VALUES OF $C$ GREATER THAN 0.5 ARE IN PARENTHESES.

$\alpha$   $B$ = 100   $B$ = 10    $B$ = 1     $B$ = 0.1   $B$ = 0.01   $B$ = 0.001
1          0.3459      (0.5808)    (1.380)     (5.808)     (42.56)      (413.0)
2          0.08650     0.1452      0.3451      (1.452)     (10.64)      (103.3)
4          0.02624     0.04217     0.09772     0.3972      (2.911)      (28.31)
10         0.01462     0.01977     0.03802     0.1291      (0.9462)     (9.204)
20         0.01164     0.01489     0.02535     0.04713     (0.5432)     (5.272)
60         0.009376    0.01140     0.01637     0.03917     0.2871       (2.793)
The fixation of the eye was such that the target images always fell on the center of the fovea. The target was viewed with both eyes; no artificial pupil was employed. The color temperature of the light was about 2850° K.

The data of Blackwell and McCready are presented in the form of tables of the threshold contrast $C$ for a reliability $q$ = 50%, as a function of the background luminance $B$, target diameter $\alpha$, and flash duration $T$. A sample of their data is shown in Table XIV. The sample is for the duration $T$ = 0.1 second, with six luminances $B$ and six diameters $\alpha$. Some of the 36 values of threshold contrast $C$ in Table XIV were obtained by interpolation, and all the values were adjusted when necessary to provide smooth plots of $C$ vs $\alpha$, $T$, and $B$. The values of $C$ larger than 0.5 are placed in parentheses; these values violate the condition that $M_s$ be small compared with $M_a$.

3. Derivation of a Working Formula for $Q_D$. The number of photons that enter the pupil of the eye from a background area of angular diameter $\alpha$ during the pulse duration $T$ is given by
$$M_a = (\pi/16)\,\alpha^2 D^2 S B n T \tag{9.8}$$

where $B$ is the background luminance in lambert-type units, $D$ is the diameter of the pupil of the observer's eye, and $n$ is the number of photons in a lumen-second of the light. $S$ is the Stiles-Crawford factor, defined as the ratio of the sensitivity of the fovea for light entering the entire pupil to the sensitivity the fovea would have if all of the light entered the center of the pupil. We introduce the factor $S$ in order that the value of the DQE that we compute is for light that enters the center of the pupil. The number of signal photons is then given by

$$M_s = C M_a \tag{9.9}$$

where $C$ is the contrast. By Eq. (9.1), the DQE is given by

$$Q_D = \frac{16 k^2}{\pi \alpha^2 C^2 D^2 S B n T} \tag{9.10}$$
If we choose special units so that $\alpha$ is in angular minutes, $D$ is in millimeters, and $B$ is in foot-lamberts, the last expression becomes

$$Q_D = \frac{5.59 \times 10^{12}\,k^2}{\alpha^2 C^2 D^2 S B n T} \tag{9.11}$$
We saw in Sec. 1 that in a four-channel forced-choice system, a signal-to-noise ratio of k = 1.22 is required by an ideal device for a 50% reliability of detection. Thus, if we are to use values of contrast C that are for a 50% reliability, Table XIII indicates that the value of k to be set in Eq. (9.11) is

$$k = 1.22 \tag{9.12}$$
Although the results reported by Blackwell and McCready were obtained with white light of a color temperature of about 2850° K, we here make the specific assumption that the same performance measured in luminous units would have been obtained if the measurements had been carried out with green light of wavelength 555 mμ. At this wavelength, the number of photons per lumen-second is

$$n = 4.073 \times 10^{15}\ \text{photons/lumen-sec} \tag{9.13}$$
This result is based on the current value of 680 lumens/watt for 555-mμ light. On placing these values of k and n in Eq. (9.11), one finds

$$Q_D = \frac{0.0020434}{\alpha^2 C^2 D^2 S B T} \tag{9.14}$$
where $Q_D$ and C are fractions, α is in minutes of angle, D is in millimeters, B is in foot-lamberts, and T is in seconds. This is the "working" equation for the DQE.

This equation for the DQE involves the product $D^2S$. The factor S depends on the pupillary diameter D. The actual diameters of the observers' pupils were not measured by Blackwell and McCready. Thus, it is necessary to estimate the diameters. The values of D shown in the second column of Table XV were taken from the review by Reeves (106), and the values of the factor S in the third column were taken from the review by Moon and Spencer (107). The last column shows the values of $D^2S$ used in the calculations reported here.

TABLE XV. PUPIL DIAMETERS AND STILES-CRAWFORD FACTORS

B, foot-lamberts    D, mm    S       D²S, mm²
100                 2.8      0.93     7.3
10                  4.0      0.84    13.4
1                   5.0      0.76    19.0
0.1                 6.1      0.66    24.5
0.01                7.0      0.58    28.6
0.001               7.4      0.55    30.0
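The numerical factors in Eqs. (9.11) and (9.14) can be checked from Eq. (9.10) by converting units. The following is a minimal sketch, not the author's computation: the conversion factors (radians per angular minute, 1 mm² = 0.01 cm², 1 ft² = 929.0304 cm²) and the function names are assumptions introduced here, and "lambert-type units" is read as lamberts with D in centimeters.

```python
import math

# Unit conversions assumed for this sketch (they are not quoted in the text).
ARCMIN_TO_RAD = math.pi / (180.0 * 60.0)   # angular minutes -> radians
MM2_TO_CM2 = 1.0e-2                        # square millimeters -> square centimeters
FT_LAMBERT_TO_LAMBERT = 1.0 / 929.0304     # 1 ft^2 = 929.0304 cm^2


def constant_9_11():
    """Numerical factor of Eq. (9.11): 16/pi divided by the unit conversions."""
    units = ARCMIN_TO_RAD ** 2 * MM2_TO_CM2 * FT_LAMBERT_TO_LAMBERT
    return (16.0 / math.pi) / units


def constant_9_14(k=1.22, n=4.073e15):
    """Numerical factor of Eq. (9.14), using k from (9.12) and n from (9.13)."""
    return constant_9_11() * k ** 2 / n


if __name__ == "__main__":
    print(f"factor in (9.11): {constant_9_11():.3e}")
    print(f"factor in (9.14): {constant_9_14():.7f}")
```

Under these assumptions the two factors come out close to the 5.59 × 10¹² and 0.0020434 quoted in the text, which also supports the presence of π in the denominator of Eq. (9.10).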
4. Numerical Results. The DQE has been calculated for each of the 252 values of the threshold contrast C that were supplied by Blackwell and McCready. The results for B = 0.1, 1.0, 10, and 100 foot-lamberts are shown in Table XVI. This table constitutes the chief result of this section.

We now indicate a simple calculation. We consider the case B = 1.0 foot-lambert, α = 10 min, and T = 0.1 sec. From Table XIV we then find that C = 0.0380 and from Table XV, $D^2S$ = 19.0 mm². Substitution of these values in Eq. (9.14) yields

$$Q_D = 0.007441 = 0.7441\% \tag{9.15}$$
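The substitution above is easy to repeat for other entries. A minimal sketch of Eq. (9.14) follows; the function name `dqe_vision` and the dictionary `D2S_MM2` are hypothetical names introduced here, and the dictionary simply copies the last column of Table XV.

```python
# Table XV: background luminance B (foot-lamberts) -> D^2 S (mm^2).
D2S_MM2 = {100: 7.3, 10: 13.4, 1: 19.0, 0.1: 24.5, 0.01: 28.6, 0.001: 30.0}


def dqe_vision(alpha_min, contrast, b_ftl, t_sec, d2s_mm2=None):
    """Eq. (9.14): detective quantum efficiency of vision, as a fraction.

    alpha_min : target diameter in angular minutes
    contrast  : threshold contrast C (fraction)
    b_ftl     : background luminance in foot-lamberts
    t_sec     : flash duration in seconds
    d2s_mm2   : D^2 S in mm^2; looked up in Table XV when omitted
    """
    if d2s_mm2 is None:
        d2s_mm2 = D2S_MM2[b_ftl]
    return 0.0020434 / (alpha_min ** 2 * contrast ** 2 * d2s_mm2 * b_ftl * t_sec)


if __name__ == "__main__":
    q = dqe_vision(alpha_min=10, contrast=0.0380, b_ftl=1, t_sec=0.1)
    print(f"Q_D = {100 * q:.3f} per cent")
```

With the three-figure contrast C = 0.0380 the sketch gives a value near 0.745 per cent; the small difference from the 0.7441 per cent of Eq. (9.15) is presumably due to rounding of C in the sample table.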
In Table XVI, the values of $Q_D$ that correspond to a value of the contrast C greater than 0.5 have been placed in parentheses. The values in parentheses do not represent a correct calculation of the DQE, since they violate the condition that the number of signal photons must be small compared with the number of background photons. The error, however, is always a conservative one; the parenthesized values of the DQE are all smaller than the values that would be computed by a calculation that took into account the fact that the signal is not very small compared with the background.

For the background luminance of B = 1.0 foot-lambert, the DQE is plotted in Figs. 21 and 22. In Fig. 21 the abscissa is the target diameter α, and in Fig. 22 the abscissa is the pulse duration T. (The curves in these figures are based entirely on the unparenthesized values in Table XVI.) The top curve in each of these figures was drawn through points that are the maximum values of the curves in the other figure. Thus, the top curve represents the maximum value of the DQE for the value of the abscissa in question. The maximum value of the DQE for the top curve is the maximum value of the DQE for the luminance of B = 1.0 foot-lambert. Similar plots, not shown here, have been constructed for the luminances B = 0.1, 10, and 100 foot-lamberts. The maximum values of the DQE, and also the values of α and T for which the maximum is attained, are shown in Table XVII.

TABLE XVI. THE ENTRIES SHOW THE DETECTIVE QUANTUM EFFICIENCY Q_D IN PER CENT FOR THE TARGET DIAMETER α (IN ANGULAR MINUTES) SHOWN IN THE FIRST COLUMN, FOR THE TARGET DURATION T (IN SECONDS) SHOWN AT THE HEAD OF THE COLUMN, AND FOR THE LUMINANCE B INDICATED ABOVE THAT SECTION OF THE TABLE. THE ENTRIES THAT DERIVE FROM A CONTRAST C GREATER THAN 0.5 ARE PLACED IN PARENTHESES

α      T=1/1000     T=1/300     T=1/100     T=1/30      T=1/10      T=1/3       T=1

B = 100 foot-lamberts
1      (0.000764)   (0.00247)   (0.00764)   (0.0184)    0.0234      0.0159      0.0110
2      (0.00306)    (0.00988)   0.0306      0.0735      0.0935      0.0637      0.0442
4      (0.00818)    0.0265      0.0818      0.197       0.254       0.166       0.106
10     (0.00480)    0.0155      0.0480      0.119       0.131       0.0762      0.0376
20     (0.00225)    0.00729     0.0225      0.0550      0.0516      0.0276      0.0112
60     0.000523     0.00169     0.00523     0.0109      0.00885     0.00419     0.00140

B = 10 foot-lamberts
1      (0.00138)    (0.00446)   (0.0126)    (0.0293)    (0.0452)    0.0462      0.0279
2      (0.00551)    (0.0178)    (0.0503)    0.117       0.181       0.185       0.112
4      (0.0169)     (0.0546)    0.154       0.362       0.536       0.481       0.292
10     (0.0158)     0.0512      0.144       0.328       0.390       0.237       0.125
20     (0.00894)    0.0289      0.0815      0.176       0.172       0.0896      0.0412
60     0.00240      0.00778     0.0219      0.0447      0.0326      0.0152      0.00551

B = 1.0 foot-lambert
1      (0.00102)    (0.00329)   (0.0102)    (0.0276)    (0.0564)    (0.0612)    (0.0350)
2      (0.00407)    (0.0132)    (0.0407)    (0.110)     0.226       0.246       0.139
4      (0.0131)     (0.0424)    (0.131)     0.354       0.704       0.756       0.424
10     (0.0166)     (0.0536)    0.166       0.427       0.744       0.690       0.334
20     (0.0115)     0.0371      0.115       0.278       0.418       0.342       0.155
60     (0.00409)    0.0132      0.0409      0.0943      0.112       0.0718      0.0248

B = 0.1 foot-lambert
1      (0.000314)   (0.00102)   (0.00314)   (0.0108)    (0.0247)    (0.0293)    (0.0148)
2      (0.00126)    (0.00407)   (0.0126)    (0.0433)    (0.0989)    (0.117)     (0.0590)
4      (0.00428)    (0.0138)    (0.0428)    (0.137)     0.330       0.370       0.187
10     (0.00666)    (0.0215)    (0.0666)    0.221       0.500       0.417       0.190
20     (0.00493)    (0.0160)    (0.0493)    0.166       0.379       0.233       0.0993
60     (0.00203)    (0.00626)   0.0203      0.0661      0.151       0.0602      0.0238

FIG. 21. Showing the detective quantum efficiency $Q_D$ of human vision as a function of the target diameter α in angular minutes. The separate curves are for different pulse durations and the label on each curve gives the pulse duration T in seconds. All of the curves are for the background luminance of 1.0 foot-lambert. The top curve is drawn through the highest points of the separate curves of Fig. 22. (Abscissa: target diameter α in minutes.)

FIG. 22. Showing the detective quantum efficiency $Q_D$ of human vision as a function of the light pulse duration T in seconds. The separate curves are for different target diameters and the label on each curve gives the target diameter α in angular minutes. All of the curves are for a background luminance of 1.0 foot-lambert. The data plotted in this figure are the same as those plotted in Fig. 21, but with a different abscissa. The top curve is drawn through points that represent the maximum heights of the separate curves in Fig. 21. (Abscissa: pulse duration T in seconds.)

TABLE XVII. THE MAXIMUM VALUES OF THE DETECTIVE QUANTUM EFFICIENCY Q_D FOR EACH OF FOUR BACKGROUND LUMINANCES, ALONG WITH THE TARGET DIAMETERS α AND THE PULSE DURATIONS T FOR WHICH THE MAXIMUM IS ACHIEVED

B, foot-lamberts    U, ergs/cm²    Q_D, per cent    α, minutes    T, seconds
100                 1.28           0.255            4.4           0.09
10                  0.236          0.575            5.5           0.15
1                   0.0334         0.92             6.0           0.18
0.1                 0.00431        0.525            8.5           0.15

In order to be able to compare human vision with the other kinds of detectors considered in this report, it is desirable to express the ambient luminance B, which has been expressed in foot-lamberts, in terms of the exposure U of the retina in ergs per square centimeter of 555-mμ radiation. In making this translation, we ignore the absorption within the eye and calculate the exposure for a hypothetical eye that is free of absorption and has a back focal length of 1.5 cm. We furthermore calculate the exposure (for all of the luminances) for an exposure time of 0.1 sec, which duration is close to the value that produces the maximum DQE. Finally, we calculate the exposure using $\pi D^2 S/4$ as the effective area of the pupil. One then finds
$$U = 0.00176\,B D^2 S \tag{9.16}$$

where B is in foot-lamberts, $D^2S$ is in square millimeters, and U is in ergs per square centimeter. Values of U are tabulated in Table XVII. The conclusions shown in Tables XVI and XVII and in Figs. 21 and 22 with regard to the DQE of vision are discussed from a number of points of view in the next section.
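The constant in Eq. (9.16) follows from the stated assumptions (back focal length 1.5 cm, exposure time 0.1 sec, effective pupil area πD²S/4, and 680 lumens/watt). The sketch below reconstructs it; the photometric conversions (1 foot-lambert = 1/π candela per square foot, 1 ft² = 929.0304 cm²) and the function names are assumptions of this example, not statements from the text.

```python
import math


def exposure_constant(focal_cm=1.5, t_sec=0.1, lumens_per_watt=680.0):
    """Retinal exposure in ergs/cm^2 per (foot-lambert x mm^2 of D^2 S)."""
    luminance_cd_cm2 = (1.0 / math.pi) / 929.0304    # 1 foot-lambert in cd/cm^2
    pupil_cm2 = (math.pi / 4.0) * 1.0e-2             # pi D^2 S / 4 for D^2 S = 1 mm^2
    lumens_per_cm2 = luminance_cd_cm2 * pupil_cm2 / focal_cm ** 2
    ergs_per_lumen_sec = 1.0e7 / lumens_per_watt     # 1 watt-sec = 1e7 ergs
    return lumens_per_cm2 * t_sec * ergs_per_lumen_sec


def exposure(b_ftl, d2s_mm2):
    """Eq. (9.16) with the constant quoted in the text."""
    return 0.00176 * b_ftl * d2s_mm2


if __name__ == "__main__":
    print(f"constant: {exposure_constant():.5f}")
    # B and D^2 S pairs taken from Table XV.
    for b, d2s in [(100, 7.3), (10, 13.4), (1, 19.0), (0.1, 24.5)]:
        print(f"B = {b:>5} ft-L  U = {exposure(b, d2s):.3g} ergs/cm^2")
```

The four exposures printed by the loop can be compared directly with the U column of Table XVII.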
5. Discussion. Physicists may be interested in these results because they show how a biological detector (the human eye) may be compared meaningfully with purely physical detectors, such as image orthicons and photographic negatives. It is perhaps less clear why these results may be of interest to biologists. In a discussion of this question, Dr. H. B. Barlow of Cambridge University wrote me in November, 1957, to this effect: "My main interest in these problems (detective quantum efficiency) is that they throw light on the efficiency with which a nervous structure, the retina, performs certain none-too-easy tasks."

We now turn to a more specific discussion of various aspects of these results. We shall discuss (1) existence of maxima with respect to pulse duration T, target diameter α, and luminance B, (2) comparison with prior results, (3) scotopic quantum efficiency, and (4) nonlinear ideal detectors.

a. Existence of Maxima. The dependence of the DQE on the target diameter α and on the pulse duration T is shown in Figs. 21 and 22 for the luminance B = 1.0 foot-lambert. The curves for greater or smaller luminances, which may be plotted from the data in Table XVI, are quite similar, although there is markedly less crossing-over of the separate curves for B = 100 foot-lamberts. Both figures show a maximum value of the DQE for an intermediate value of the abscissa. The elementary explanation of this maximum is the same for both figures: the eye has inadequate resolution for the smaller values and inadequate integration for the larger values of the abscissa. If the visual process had perfect integration for a range of abscissa values for which there was good resolution, one would expect that the curve would show a flat maximum. There is no such flat maximum; the integration ability seems to fail at about the same value of the abscissa as that at which good resolution is achieved.

Casual inspection of the two figures shows that the curves versus α in Fig. 21 peak more sharply than the curves versus T in Fig. 22. But this is an artifact of the method of plotting. If the data of Fig. 21 were plotted against the area of the target (instead of the diameter), the peaks would be of breadth comparable with those in Fig. 22. The slope of the curves in Fig. 21 for small values of the diameter approaches the value +2, and this slope is just what one would predict on the basis of a resolution limitation that causes all very small targets to be spread into a blur circle of the same diameter. Similarly, the asymptotic slope in Fig. 22 approaches +1 for small pulse durations, and this also is just what one would predict for a resolution limitation that causes all very short pulses to appear to have the same length. These asymptotic slopes are even more clear in the data for the luminances 10 and 100 foot-lamberts.

Finally, we consider the maximum of the DQE with respect to the luminance B, as suggested by the values of $Q_D$ in Table XVII and as plotted in Fig. 23 below. Photographic materials and image orthicons also show a maximum detective quantum efficiency at some intermediate scene luminance. I have little doubt that the general nature of the explanation of the maximum is the same as given above for the photographic negative. A similar formulation could be offered for the performance of the retina, but in the present state of ignorance of the detailed mechanism of the retina such a formulation would necessarily be extremely vague.
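The asymptotic slopes of +2 and +1 mentioned in this subsection can be made explicit with a short derivation. It is a sketch under one assumption that is not stated in the text: below the resolution limit (a fixed blur diameter or a fixed effective pulse length), the threshold is reached at a constant number of signal photons $M_s$, for given B. Equations (9.8) through (9.10) then give

```latex
\begin{align*}
Q_D &= \frac{k^2}{C^2 M_a} = \frac{k^2 M_a}{M_s^2},
      \qquad M_s = C M_a, \qquad M_a \propto \alpha^2 T, \\
M_s &= \text{const} \;\Longrightarrow\; Q_D \propto M_a \propto \alpha^2 T, \\
\frac{\partial \log Q_D}{\partial \log \alpha} &= 2,
      \qquad \frac{\partial \log Q_D}{\partial \log T} = 1 .
\end{align*}
```

On logarithmic axes this is a slope of +2 against target diameter and +1 against pulse duration, in agreement with the small-α and small-T behavior described for Figs. 21 and 22.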
FIG. 23. A summary of published data on the detective quantum efficiency of human vision plotted versus the background luminance. Curve A is the result of Rose (1957), and the curves B through D are the results of Jones (1957). Curve E shows the results of this paper. Curves A through D are based mainly on Blackwell's contrast perception data, and all involve the questionable assumption of an integration period. Curve E is based on the flash perception data of Blackwell and McCready and is free of this assumption. (Abscissa: luminance in foot-lamberts. Curve A is labeled "Rose, k = 5.")
b. Comparison with Prior Results. So far as I know, there are only five prior publications that give numerical results for the detective quantum efficiency of vision. These are the three papers by Rose (1, 8, 94) in 1946-1948, one by the writer (95) in 1957, and one by Rose (108) in 1957. The last paper gives the same results as those in the 1946-1948 papers, except that the results are given for green light and are given more explicitly. Figure 23 shows the results found in refs. 5 and 12, plus those of this paper. The DQE (for green light) is plotted versus the adapting luminance in foot-lamberts. Curve A shows Rose's 1957 results, which are calculated with k = 5. Curves B, C, and D show the writer's 1957 results, which are calculated with k = 1.47. Curve E shows the results of the present paper, which are calculated with k = 1.22.
Curve E excepted, all the results in Fig. 23 are based mainly on the contrast perception measurements reported by Blackwell (109), and to a minor degree on earlier contrast perception data and on some flicker perception measurements by the writer. Curve B is based on Part III of ref. 109, dealing with tests in which the observer was given 15 sec to determine the presence or absence of a target at a single position. Curve C is based on Part I of ref. 109, where the observer was given 6 sec to make a forced choice among eight possible positions of the target, and curve D uses the same data and supposes that the integration time of the eye is 0.2 sec.

In the 1957 papers by Rose and by me, as well as in Rose's 1946-1948 papers, it was necessary to assume an integration period for the eye: we assumed that the eye was able to sum perfectly the photons received during a short period and was completely unable to integrate the received information over a longer period. The integration period was taken by Rose to be 0.2 sec and was taken by me to be a somewhat shorter period, the exact length depending on the background in the manner suggested by my flicker perception measurements. This assumption of an integration period was a questionable one. Dr. H. B. Barlow in a private communication has questioned the assumption of an integration time of 0.2 sec or shorter on the ground that the observers actually had 6 or 15 sec to view the target and that the observers may have been able to make some integration over this much longer period. The chief superiority of the methods and results described in the present paper is that, by relying on flash perception data rather than contrast perception data, they entail no assumption as to the length of the integration period.
Rose, in all of his publications on this subject, has assumed that under the conditions of Blackwell's measurements (109), the threshold signal-to-noise ratio was k = 5, whereas in this paper we employ the much lower value k = 1.22. This difference corresponds to a factor of 16.8 in the DQE! Rose's value is based on measurements of the threshold of signals in grainy photographic images and in noisy television images. It is clear from inspection of Table XII that a value of k as large as 5 could be valid for an ideal detector only when the reliability q is high and the false alarm fraction f is small (q = 90% and f = 0.01%, for example). But the Blackwell data (Part I of ref. 109) used in Rose's calculations involve a condition (q = 50%, M = 8 channels) for which Table I indicates k = 1.52. Thus, we conclude that the value of k used by Rose is much larger than the value required by an ideal device. In this section the value of k used in calculating the DQE of vision is based on an ideal device.

The calculations of the values of k for an ideal device that are reported in Sec. C,1 were made in response to a Sept. 18, 1957, letter from Dr. Barlow, and I feel much indebted to him for emphasizing the need for basing the calculation on the value of k appropriate to an ideal detecting device, instead of on the signal-to-noise ratio required by the eye for some other visual task.

Since the above section was written, an article by P. B. Fellgett has come to my attention: "Investigations of image detectors," in The Present and Future of Telescopes of Moderate Size (University of Pennsylvania Press, Philadelphia, 1958), pp. 51-86. Dr. Fellgett refers to some unpublished results in which Dr. H. B. Barlow "obtains peak monochromatic values (of the DQE) of 5% at threshold, and 1% at 0.01 and 0.1 foot-lambert. The determination (as also in Sec. IX,C) is based on the detectability of bright targets for definite times against a uniform background. The value quoted is a maximum with respect to target size and duration."

c. Scotopic Quantum Efficiency. The flash perception data were obtained only with foveal fixation, and are therefore relevant for luminances above about 0.1 foot-lambert. So far as I know, no such data exist for peripheral vision at lower luminances. It is highly desirable that flash perception data be obtained at luminances below 0.1 foot-lambert with the observer permitted to fixate optimally for each luminance. The measurements should preferably be made with monochromatic green light, with monocular vision, and with a small artificial pupil to eliminate uncertainty about the actual pupil diameter. But the observer should be free to fixate the target as he wishes.

d. Nonlinear Detectors. In a discussion in 1957, Professor H. R. Blackwell inquired about the possible effect of the nonlinear nature of the human nervous system. In particular, he raised the question of whether the nervous system might be able to achieve a higher performance than the best linear system. This is clearly an important question, since there is no doubt that the nervous system is in fact highly nonlinear. This question was discussed in Sec. III,B,2, where it is concluded that a linear detector is in fact superior to any nonlinear detector.

X. OTHER DETECTORS

In this final section we discuss more briefly a number of other kinds of radiation detectors. Inclusion of a detector in this section carries no implication that it is not important. Inclusion of a detector in this section means only that the brevity with which it may be discussed would not justify a separate major section. The detectors included in this section are the heat detectors (including thermocouples, bolometers, and the Golay detector), back-biased p-n junctions, photovoltaic cells, and photoelectromagnetic detectors. Photosynthesis is mentioned briefly.
A. Heat Detectors in General

In order to fit heat detectors into the pattern established in this report, and yet to avoid considerable complexity in the formulation, it is desirable in these general considerations to idealize slightly the concept of heat detectors. In this section we define a heat detector as a detector that satisfies the following two conditions:

1. The noise equivalent power is independent of the wavelength for all wavelengths much longer than the wavelength λ₁ defined below.
2. The emissivity of the detector is independent of wavelength.

Most readers will agree that this definition is in reasonable accord with the ordinary idea of a heat detector. We further assume that the only radiation incident on the detector is monochromatic or, at worst, narrow-band radiation. And it is assumed that the amount of this incident radiation is just sufficient to maintain the detector in thermal equilibrium at the temperature T. The steady incident power is therefore
$$P_a = A\sigma T^4 \tag{10.1}$$

where σ is the Stefan-Boltzmann radiation constant, but this formula should not mislead the reader to suppose that the incident radiation is blackbody radiation. These assumptions are quite remote from the ordinary operating conditions of heat detectors, but they are necessary to conform to the fact that the detective quantum efficiency is not defined for broad bands of radiation, but is defined only for monochromatic radiation. (We shall later see how it is possible to define something like a DQE for blackbody radiation.) By combining Eq. (10.1) with (3.26), one finds the following expression for the detective quantum efficiency of a heat detector:

$$Q_D = 2\mathcal{E}A\sigma T^4\,\Delta f / P_N^2 \tag{10.2}$$

This formula indicates that $P_N^2$ can never be smaller than the numerator on the right in Eq. (10.2). This numerator represents the mean-square fluctuation in the incident power. If we use $P_{0N}$ to denote the (constant) noise equivalent power of the heat detector for wavelengths large compared with λ₁, the noise equivalent power is given in general by

$$P_N^2 = P_{0N}^2 + 2\mathcal{E}A\sigma T^4\,\Delta f \tag{10.3}$$
since the photon noise and internal noise will add incoherently. The wavelength λ₁ is now defined as the wavelength that makes equal the two terms on the right in (10.3) and thus by (10.2) makes $Q_D$ equal to one-half. Since $\mathcal{E} = hc/\lambda$, the wavelength λ₁ is given explicitly by

$$\lambda_1 = 2hcA\sigma T^4\,\Delta f / P_{0N}^2 \tag{10.4}$$
Equation (2.78) of ref. 2 indicates that the mean-square fluctuation in the power of blackbody radiation of temperature T incident on the area A is

$$\langle(\Delta P)^2\rangle = 8AkT\sigma T^4\,\Delta f \tag{10.5}$$

We compare this result with the mean-square fluctuation in the monochromatic power incident on the area A given by the numerator on the right in (10.2):

$$\langle(\Delta P)^2\rangle = 2\mathcal{E}A\sigma T^4\,\Delta f \tag{10.6}$$

We see that the two fluctuations are equal if

$$\mathcal{E} = 4kT \tag{10.7}$$

In words, the fluctuation in the power incident on a heat detector is the same whether the radiation be blackbody radiation of temperature T or monochromatic radiation with photon energy given by $\mathcal{E} = 4kT$. For monochromatic radiation that satisfies the last equation, the DQE defined by (10.2) becomes

$$Q_{DT} = 8AkT\sigma T^4\,\Delta f / P_N^2 \tag{10.8}$$

There is a sense in which one may say that $Q_{DT}$ is the DQE for blackbody radiation of temperature T. It is now convenient to introduce the abbreviation z for the ratio $Q_D/(1 - Q_D)$:

$$z = Q_D/(1 - Q_D) = 2\mathcal{E}A\sigma T^4\,\Delta f / P_{0N}^2 \tag{10.9}$$
Unlike $Q_D$, the ratio z is rigorously proportional (for a heat detector) to the photon energy $\mathcal{E}$. For this reason, it satisfies simpler relations. As indicated by the definition, z and $Q_D$ are substantially equal when both are small compared with unity. Given z, the corresponding value of the DQE is found by the relation $Q_D = z/(1 + z)$. At the wavelength λ₁ defined by (10.4) that makes equal the two terms on the right in (10.3), $Q_D$ is one-half and z is unity. If $z_\lambda$ denotes the value of the ratio z for a given wavelength λ, the wavelength λ₁ is given by

$$\lambda_1 = \lambda z_\lambda \tag{10.10}$$

The value of z that corresponds to $Q_{DT}$ is

$$z_T = 8AkT\sigma T^4\,\Delta f / P_{0N}^2 \tag{10.11}$$
In reference 2 a figure of merit $M_1$ was introduced as an invariant measure of the detectivity of Class I radiation detectors. $M_1$ is defined by

$$M_1 = (16AkT\sigma T^4\,\Delta f)^{1/2} / P_{0N} \tag{10.12}$$

where, however, in this equation the temperature T is required to be T = 300° K. In the remainder of this Sec. A, we assume that the temperature of the detector is 300° K; this is not a severe restriction for this review, since all of the heat detectors to be discussed do operate at room temperature. Comparison of the last two equations yields the simple relation

$$z_T = \tfrac{1}{2} M_1^2 \tag{10.13}$$

Although the definition of $M_1$ holds for any Class I detector, whether it be a heat detector or not, the relation (10.11) holds only for a heat detector and therefore (10.13) holds only for a heat detector. [The factor 16 occurs in Eq. (10.12) instead of the factor 8 in (10.11) because the former is intended to take account of fluctuations in both the incident and the emitted power, whereas the latter takes account of only the incident power; this is the source of the factor ½ in Eq. (10.13).]

Most heat detectors are Class II detectors. The figure of merit $M_2$ for Class II detectors is related to $M_1$ by

$$M_1^2 = 0.846\,\tau M_2^2 \tag{10.14}$$

where τ is the reference time constant defined by Eq. (3.4) of ref. 2. The last two equations then yield

$$z_T = 0.423\,\tau M_2^2 \tag{10.15}$$
This expression also holds only for heat detectors. The wavelength of a photon that has the energy 4kT with T = 300° K is $\lambda_{300}$ = 12.0 μ. From Eqs. (10.10), (10.13), and (10.15), one then has

$$z = 6M_1^2/\lambda \tag{10.16}$$

and

$$z = 5.07\,\tau M_2^2/\lambda \tag{10.17}$$

for heat detectors operating at the temperature 300° K, where λ is in microns. If one wishes to avoid the assumption that the heat detector is in thermal equilibrium at the temperature 300° K, one can do so by introducing in the right side of Eqs. (10.13), (10.15), (10.16), and (10.17) the factor $(T/300)^6$.
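The relations of this Sec. A reduce to a few one-line formulas, sketched below. The physical constants are modern SI values assumed for the example (the text does not list the values it used), the function names are hypothetical, and Eq. (10.17) is written with the reference time constant τ of Eq. (10.15) made explicit, which is the form that Eqs. (10.19) and (10.20) require.

```python
# Physical constants (SI); modern values assumed here, not taken from the text.
H = 6.62607015e-34     # Planck constant, J s
C = 2.99792458e8       # speed of light, m/s
K_B = 1.380649e-23     # Boltzmann constant, J/K


def wavelength_4kt_microns(temp_k=300.0):
    """Wavelength (microns) of a photon whose energy is 4kT, cf. Eq. (10.7)."""
    return H * C / (4.0 * K_B * temp_k) * 1.0e6


def z_from_m1(m1, wavelength_um):
    """Eq. (10.16): z = 6 M1^2 / lambda, lambda in microns, detector at 300 K."""
    return 6.0 * m1 ** 2 / wavelength_um


def z_from_m2(m2, tau_sec, wavelength_um):
    """Eq. (10.17) with tau explicit: z = 5.07 tau M2^2 / lambda."""
    return 5.07 * tau_sec * m2 ** 2 / wavelength_um


def dqe_from_z(z):
    """Inverse of Eq. (10.9): Q_D = z / (1 + z)."""
    return z / (1.0 + z)


if __name__ == "__main__":
    print(f"lambda_300 = {wavelength_4kt_microns():.2f} microns")
    z = z_from_m2(m2=1.0, tau_sec=0.01, wavelength_um=0.5)
    print(f"z = {z:.4f}, Q_D = {dqe_from_z(z):.4f}")
```

The first function reproduces the 12.0 μ quoted for 300° K to within rounding; working with z rather than $Q_D$ keeps every relation a simple proportionality, and the conversion to $Q_D$ is applied only at the end.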
B. The Golay Pneumatic Heat Detector

The Golay pneumatic heat detector (110-112) is the only heat detector that was listed as a Class I detector in ref. 2. This detector consists of a closed chamber. The front wall is a window that is transparent to the radiation to be detected; the rear wall is a thin aluminized distendable membrane. Within the chamber is a thin radiation-absorbing membrane. Absorption of radiation by the membrane increases the temperature and therefore the pressure of the gas within the chamber. The increase of pressure distends the rear membrane. The movement of the membrane is transduced into an electrical signal by suitable means. The figure of merit $M_1$ of this detector is found in ref. 1 to be $M_1$ = 0.275. By the equations of Sec. A, one then finds

$$\begin{aligned}
z &= 0.451/\lambda \\
z_T &= 0.0378 \\
\lambda_1 &= 0.451\ \mu \\
Q_D &= 0.451/(0.451 + \lambda)
\end{aligned} \tag{10.18}$$

where λ is in microns. For wavelengths in the visible range, it follows that the DQE of the Golay detector is roughly one-half.
C. Thermocouples and Bolometers

Thermocouples and bolometers are the most widely employed types of heat detectors. The detectivity of these detectors was discussed at length in reference 2, where it was found that the very best representatives of both kinds of detectors had a figure of merit $M_2$ of about unity. The figure of merit $M_2$ of most of the widely used commercial detectors lies in the range from 0.1 to 1.0. In this section, we shall merely discuss the detective quantum efficiency of these detectors on the basis that their figure of merit $M_2$ is in fact equal to 1.0. From the equations of Sec. A, one then finds

$$\begin{aligned}
z &= 5.07\,\tau/\lambda \\
z_T &= 0.423\,\tau \\
Q_D &= 5.07\,\tau/(5.07\,\tau + \lambda)
\end{aligned} \tag{10.19}$$

where λ is in microns and the reference time constant τ is in seconds. Many of the commercial thermocouples and bolometers are designed to operate with radiation that is chopped at a frequency near 10 or 15 cps. Such detectors have a reference time constant of about 0.01 sec. One then finds
$$\begin{aligned}
z &= 0.0507/\lambda \\
z_T &= 0.00423 \\
Q_D &= 0.0507/(0.0507 + \lambda) \\
\lambda_1 &= 0.0507\ \mu
\end{aligned} \tag{10.20}$$

One concludes that the very best thermocouples and bolometers (with τ = 0.01 sec) have a DQE of about 0.1 at the wavelength 0.5 μ.
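The closing statements of Secs. B and C can be compared side by side. Since z = λ₁/λ by Eq. (10.10), the DQE of any heat detector is λ₁/(λ₁ + λ), so only λ₁ is needed. The sketch below takes λ₁ = 0.451 μ for the Golay detector from Eq. (10.18) and λ₁ = 5.07τ (with $M_2$ = 1) for thermocouples and bolometers from Eq. (10.19); the function names and the choice of sample wavelengths are illustrative assumptions.

```python
def dqe_heat(lambda_1_um, wavelength_um):
    """Q_D = lambda_1 / (lambda_1 + lambda), both wavelengths in microns."""
    return lambda_1_um / (lambda_1_um + wavelength_um)


GOLAY_LAMBDA_1_UM = 0.451   # microns, from Eq. (10.18)


def bolometer_lambda_1_um(tau_sec, m2=1.0):
    """lambda_1 = 5.07 tau M2^2 (microns), the form behind Eq. (10.19)."""
    return 5.07 * tau_sec * m2 ** 2


if __name__ == "__main__":
    lam1_bolo = bolometer_lambda_1_um(tau_sec=0.01)
    for wl in (0.5, 2.0, 10.0):   # sample wavelengths in microns
        golay = dqe_heat(GOLAY_LAMBDA_1_UM, wl)
        bolo = dqe_heat(lam1_bolo, wl)
        print(f"lambda = {wl:>4} microns  Golay Q_D = {golay:.3f}  bolometer Q_D = {bolo:.4f}")
```

At 0.5 μ the two values correspond to the "roughly one-half" and "about 0.1" stated in the text; at longer wavelengths both fall off roughly as 1/λ.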
D. Back-Biased p-n Junctions

In a back-biased p-n junction, substantially all the applied voltage is across the thin layer of intrinsic material. In the absence of radiation, the only current is that due to the small number of electrons that exist in thermal equilibrium in the p material and the small number of holes that exist in equilibrium in the n material. When one of these carriers diffuses into the intrinsic layer, it is seized by the electric field and is transferred to the other side of the layer. The result is the transfer of one electronic charge between the electrodes.

When a photon is absorbed in the intrinsic layer, an electron-hole pair is produced. The field causes the carriers to move in opposite directions, and the net effect is again that one electronic charge is transferred between the two electrodes. If the photon is absorbed near the layer, the minority carrier may diffuse into the layer and thus transfer a unit charge between the electrodes.

It was established in Sec. VI that the responsive quantum efficiency (RQE) of all good photoconductive materials is unity for some range of wavelengths shorter than the wavelength of the absorption edge. It follows that the RQE of back-biased p-n junctions is unity for radiation that is absorbed in the intrinsic layer, and decreases as the site of absorption moves away from the layer. The responsivity data obtained by T. C. Anderson and reported by Shive (113) for a back-biased germanium p-n junction are consistent with a RQE of unity.

How about the detective quantum efficiency (DQE)? This also will be unity if the noise in the photocurrent is no more than the noise in a shot noise current of the same value, if the photocurrent is large compared with the dark current, and if all the incident radiation is absorbed.
This conclusion is of course based on the assumption that only one charge is transferred for each electron-hole pair that is produced, and this assumption is based on the theory of p-n junctions. (To be sure, there are at least two different kinds of breakdowns in semiconductors (see ref. 114), but we shall suppose that the bias voltage is well below the breakdown voltage.) For germanium p-n junctions, Slocum and Shive (115) have, in fact, reported that for a limited range of radiation intensity and bias voltage the noise in the photocurrent does not exceed the amount calculated from the shot-noise formula. Thus, Slocum and Shive's result confirms the fact that the DQE, for a restricted range of ambient radiation, is close to the fraction of incident radiation that is absorbed.

In summary, both the responsive and the detective quantum efficiencies of p-n junctions may be close to unity, subject to the conditions stated above.
E. Photovoltaic Cells

A photovoltaic cell is a p-n junction that is used as a transducer of radiation power into electrical power. It differs from a back-biased p-n junction only in that no external bias is applied to the cell. At first sight, it is difficult to see why anyone would employ a p-n junction as a photovoltaic cell, since the cell could be employed as a back-biased junction. But the fact is that certain kinds of junctions, notably those in indium antimonide, are used as photovoltaic cells. The reason is that excessive noise is produced if a substantial bias is applied to the cell. But even so it is found (116) that some indium antimonide cells have a marked increase in their detectivity if a small back bias, of about 0.1 volt, is applied to the cell. Perhaps the effect of this small bias is to reduce the recombination of carriers in the intrinsic layer. Or perhaps the bias reduces the current through the shunt resistance and thereby reduces the current noise in this resistance.*

In order to discuss some of the necessary conditions that must be satisfied if a photovoltaic cell is to have a high DQE, we shall use as an example the silicon photovoltaic cell described by Prince (117). This cell, to be sure, was designed as an energy transducer, not a detector, but we use this cell in the discussion because its properties are so completely described. The cell has an area of 6.8 cm², a parasitic shunt resistance of 1,000 ohms, and a series resistance of 1.8 ohms. Its open-circuit output voltage with high illumination is 0.50 volt. Its short-circuit current in sunlight is about 100 ma. For irradiation levels that produce a short-circuit current of less than 20 ma, the same current is produced with an output voltage of 0.25 volt.

In using this cell as a radiation detector, we would normally connect the cell to a high impedance circuit, so that the effective load on the cell would be its internal shunt resistance of 1,000 ohms. In order that the DQE be as large as is possible, two conditions must be satisfied:

1. Since the noise in the output current is not less than the shot noise of the current, the output voltage must be large compared with 0.0518 volt (see Sec. V-C).
2. The output voltage must not be so large that the current is reduced below the short-circuit current.

To produce a voltage drop of 0.25 volt across the internal shunt resistance of 1,000 ohms requires only 0.25 ma. With this output voltage, the effect of the Johnson noise of the load resistance is to decrease the DQE that would otherwise be obtainable by the factor 0.83. At the wavelength 0.5 μ, the absorbed flux required to produce the current of 0.00025 amp is 0.00062 watt, or 0.000091 watt/cm². This is about 1/500 of the 300° K blackbody flux of 0.0459 watt/cm². If the absorptance of the detector is unity and if there are no sources of noise other than the photon noise of the absorbed radiation and the Johnson noise of the load resistance, then we would expect the DQE of this photovoltaic cell to be 0.83. The same DQE could be obtained for higher incident fluxes if an external shunt resistor is used to hold the voltage down to 0.25 volt.

* Note added in proof. According to a letter dated March 25, 1959 from Dr. Werner J. Beyen, of Texas Instruments, and subsequent discussion, recent indium antimonide photovoltaic cells show a substantial increase in noise when the potential across the cell is as small as 30 millivolts. Potentials substantially greater than 30 millivolts are usually produced when daylight falls on the detector. Thus if one wishes to obtain the highest detectivity from these cells, one must take measures that eliminate the dc potential across the cell. One convenient way to do this is to connect the cell to the primary of a transformer. Similar results were described by W. A. Craven and R. H. Genoud at the April 2-4, 1959 meeting of the Optical Society of America in New York City. For cells that do show the phenomenon of extra noise with a small dc potential, the argument given below must be modified to provide for shorting out the dc potential.
But the flux may not be increased by a factor as large as 100, because if the shunt resistance is reduced t o 10 ohms while the current rises to 25 ma, two undesirable effects begin to be important: the output current ceases to be equal to the short-circuit current, and the series resistance of 1.8 ohms begins to be comparable with the load resistance. Similarly, if the ambient radiation is so low that the voltage produced is less than 0.05 volt, the DQE will be small and proportional to the amount of ambient radiation.
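The figures quoted for the silicon cell can be checked with a short calculation. The sketch below is only a plausibility check, not the derivation of Sec. V-C: it assumes that the factor 0.83 is the ratio of shot-noise power to shot-plus-Johnson noise power, which reduces to V/(V + 2kT/e), and that each absorbed photon yields one electron. The function names are illustrative, and modern values of the constants give 2kT/e of about 0.0517 volt rather than the 0.0518 volt quoted.

```python
# Physical constants (SI units, modern values; an assumption of this sketch).
K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C
H_PLANCK = 6.62607015e-34     # J s
C_LIGHT = 2.99792458e8        # m/s
SIGMA_SB = 5.670374419e-8     # W m^-2 K^-4


def shot_reference_voltage(temp_k=300.0):
    """Voltage 2kT/e with which the output voltage is compared."""
    return 2.0 * K_BOLTZMANN * temp_k / Q_ELECTRON


def johnson_reduction_factor(v_out, temp_k=300.0):
    """Assumed form: shot noise 2eI against Johnson noise 4kT/R, with V = IR."""
    return v_out / (v_out + shot_reference_voltage(temp_k))


def absorbed_power_watt(current_a, wavelength_m):
    """Radiant power needed for the current, assuming one electron per photon."""
    photon_energy_j = H_PLANCK * C_LIGHT / wavelength_m
    return current_a / Q_ELECTRON * photon_energy_j


def blackbody_flux_w_per_cm2(temp_k=300.0):
    """Total blackbody flux from the Stefan-Boltzmann law, per cm^2."""
    return SIGMA_SB * temp_k ** 4 / 1.0e4


factor = johnson_reduction_factor(0.25)        # output voltage of 0.25 volt
power = absorbed_power_watt(0.00025, 0.5e-6)   # 0.25 ma at 0.5 micron
flux = power / 6.8                             # cell area of 6.8 cm^2
ratio = blackbody_flux_w_per_cm2() / flux      # text quotes "about 1/500"

print(f"reduction factor  {factor:.2f}")
print(f"absorbed power    {power:.5f} watt")
print(f"absorbed flux     {flux:.6f} watt/cm^2")
print(f"blackbody / flux  {ratio:.0f}")
```

Under these assumptions the sketch gives values close to those in the text: a factor near 0.83, about 0.00062 watt, about 0.000091 watt/cm², and a ratio of roughly 500.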
F. Photoelectromagnetic Detectors

The photoelectromagnetic effect is observed when a slab of semiconductor, illuminated on one surface, is placed in a magnetic field perpendicular to the direction of illumination. An electromotive force appears in the third perpendicular direction. Pincherle (118) has reviewed the effect. Hilsum and Ross (119) have described a useful detector made of indium antimonide that is based on this effect. This detector is unusual in that at room temperature its response extends to about 7 µ. (At 6.8 µ, the responsivity is one-half of the maximum responsivity, which occurs at 6 µ.) At 6 µ, the noise equivalent input of this detector is $10^{-9}$ watt, for a bandwidth of 1.0 cps. The sensitive area of the detector is 1.0 by 2.0 mm. The sensitive element is 0.1 mm thick and is placed in a magnetic field of 10,000 gauss. The responsivity versus wavelength curve of this detector, as given by ref. 119, permits one to calculate that the detector responds to 7% of the total 300° K blackbody radiation incident on the detector and that most of the incident power to which it does respond lies between 5 and 7 µ. Seven percent of 0.02 × 0.0459 watt is 0.000064 watt, the area of the detector being 0.02 cm² and the blackbody flux 0.0459 watt/cm². One thus has
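$$P_N = 10^{-9}\ \text{watt}, \qquad \Delta f = 1.0\ \text{cps}, \qquad P_a = 6.4 \times 10^{-5}\ \text{watt}, \qquad E_p = 3.31 \times 10^{-20}\ \text{joule} \tag{10.21}$$

whence by Eq. (3.26)

$$Q_D = 4.24 \times 10^{-6} \tag{10.22}$$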
This value of $Q_D$ would be increased by the factor 100/7 if the blackbody radiation normally incident on the detector were replaced with monochromatic 6-µ radiation of the same radiant flux.
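The arithmetic behind Eqs. (10.21) and (10.22) can be retraced numerically. Equation (3.26) is not reproduced in this section, so the sketch below assumes the form Q_D = 2 E_p P_a Δf / P_N², chosen because it reproduces the quoted value from the quoted inputs; the function names are illustrative and the constants are modern values.

```python
H_PLANCK = 6.62607015e-34   # J s
C_LIGHT = 2.99792458e8      # m/s
SIGMA_SB = 5.670374419e-8   # W m^-2 K^-4


def photon_energy_joule(wavelength_m):
    """Energy of one photon at the given wavelength."""
    return H_PLANCK * C_LIGHT / wavelength_m


def effective_ambient_power_watt(area_cm2, fraction, temp_k=300.0):
    """Part of the blackbody power on the detector to which it responds."""
    flux_w_per_cm2 = SIGMA_SB * temp_k ** 4 / 1.0e4
    return fraction * area_cm2 * flux_w_per_cm2


def detective_quantum_efficiency(e_photon, p_ambient, bandwidth, p_noise):
    """Assumed form of Eq. (3.26): 2 * E_p * P_a * delta_f / P_N**2."""
    return 2.0 * e_photon * p_ambient * bandwidth / p_noise ** 2


e_p = photon_energy_joule(6.0e-6)               # 6 micron
p_a = effective_ambient_power_watt(0.02, 0.07)  # 1.0 by 2.0 mm, 7% response
q_d = detective_quantum_efficiency(e_p, p_a, 1.0, 1.0e-9)
q_d_monochromatic = q_d * 100.0 / 7.0           # factor stated in the text

print(f"E_p = {e_p:.3e} joule")
print(f"P_a = {p_a:.2e} watt")
print(f"Q_D = {q_d:.2e}")
print(f"Q_D for monochromatic 6-micron radiation = {q_d_monochromatic:.1e}")
```

With unrounded inputs the result is about 4.26 × 10⁻⁶, which agrees with Eq. (10.22) to within the rounding of P_a to 6.4 × 10⁻⁵ watt.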
G. Photosynthesis

Photochemistry makes very frequent use of the concept of responsive quantum efficiency, defined as the ratio of the number of molecules that are transformed to the number of absorbed photons. The reciprocal of the RQE is the so-called quantum requirement, or the quantum demand. A famous controversy concerns the RQE of photosynthesis in algae under optimum conditions. (Here the RQE is the ratio of the number of CH2O groups formed to the number of absorbed photons.) One school claims the value 25%, and most other investigators find about 12.5% (120).
ACKNOWLEDGMENTS

This review could not have been written without the unpublished information so generously supplied by RCA, by Kodak, and also by Professor H. R. Blackwell, and used in Secs. V, VII, VIII, and IX. The writer wishes to express his sincere gratitude to A. D. Cope, R. W. Engstrom, F. D. Marschka, G. A. Morton, Albert Rose, A. H. Sommer, R. G. Stoudenheimer, and B. H. Vine of RCA; also Joseph Altman, John Tupper, J. H. Webb, Robert Wolfe, and Hans Zweig of Kodak; also H. R. Blackwell, Keith Butler, David Dutton, L. R. Koller, G. E. Kron, David Middleton, M. B. Prince, Bennett Sherman, J. N. Shive, H. F. Spencer, David Van Meter, K. M. van Vliet, and Norbert Wiener.
REFERENCES

General
1. A. Rose, Television pickup tubes and the problem of vision. Advances in Electronics 1, 131-166 (1948).
2. R. Clark Jones, Performance of detectors for visible and infrared radiation. Advances in Electronics 6, 1-96 (1953).
3. R. Clark Jones, On the minimum energy detectable by photographic materials. Phot. Sci. Eng. 2, 191-197, 198-204 (1958).
4. R. Clark Jones, "Detectivity": the reciprocal of the noise equivalent input of radiation. Nature 170, 937-938 (1952).
5. R. A. Smith, F. E. Jones, and R. P. Chasmar, "The Detection and Measurement of Infrared Radiation." Oxford University Press, London and New York, 1957.

Definition of Quantum Efficiency
6. A. Rose, Performance of photoconductors. Proc. I. R. E. 43, 1850-1869 (1955).
7. A. Rose, Relative sensitivities of television pickup tubes, photographic film, and the human eye. Proc. I. R. E. 30, 293-300 (1942).
8. A. Rose, A unified approach to the performance of photographic film, television pickup tubes, and the human eye. J. Soc. Motion Picture Engrs. 47, 273-294 (1946).
9. P. B. Fellgett, Equivalent quantum efficiencies of photographic emulsions. Monthly Notices Roy. Astron. Soc. 118, 224-233 (1958).
10. W. B. Lewis, Fluctuations in streams of thermal radiation. Proc. Phys. Soc. 69, 34-40 (1947).
11. T. C. Fry, "Probability and Its Engineering Uses," pp. 216-227. Van Nostrand, New York, 1929.
12. D. Middleton, On the detection of stochastic signals in additive normal noise. Part I. Trans. I. R. E. IT-3, 86-121 (1957).
13. M. Planck, "The Theory of Heat Radiation." Blakiston, Philadelphia, 1914. Translated by Morton Masius.
14. R. Clark Jones, On reversibility and irreversibility in optics. J. Opt. Soc. Am. 43, 138-144 (1953).
15. P. B. Fellgett, On the ultimate sensitivity and practical performance of radiation detectors. J. Opt. Soc. Am. 39, 970-976 (1949).
16. R. Clark Jones, A new classification system for radiation detectors. J. Opt. Soc. Am. 39, 327-356 (1949).
17. R. Hanbury Brown and R. Q. Twiss, Interferometry of the intensity fluctuations in light. I. Proc. Roy. Soc. 242A, 300-324 (1957).
18. A. Rose, P. K. Weimer, and H. B. Law, The image orthicon, a sensitive television pickup tube. Proc. I. R. E. 34, 424-432 (1946).

Photoemissive Tubes
19. A. H. Sommer, "Photoelectric Tubes." Methuen, London, 1951.
20. V. K. Zworykin and E. G. Ramberg, "Photoelectricity and Its Application," Chapter 3. Wiley, New York, 1949.
21. H. E. Ives, Photoelectric properties of thin films of alkali metals. Astrophys. J. 60, 209-230 (1924).
22. N. R. Campbell, Photoelectric properties of thin films of alkali metals. Phil. Mag. [7] 6, 633-648 (1928).
23. L. R. Koller, Photoelectric emission from thin films of cesium. Phys. Rev. 36, 1639-1647 (1930).
24. P. Gorlich, Composite transparent photoelectric cathodes. Z. Physik 101, 335-342 (1936).
25. A. H. Sommer, Photoelectrically sensitive electrodes. British Patent 532,259 (1940).
26. A. H. Sommer, Photoelectrically sensitive electrodes. British Patent 540,739 (1941).
27. A. H. Sommer, New photoemissive cathodes of high sensitivity. Rev. Sci. Instr. 26, 725-26 (1955).
28. A. H. Sommer, Multi-alkali photocathodes. Trans. I. R. E. NS-3, 8-12 (1956).
29. Professional Group on Nuclear Science. Trans. I. R. E. NS-3, 3-144 (1956).
30. G. E. Kron, Test of an RCA Type C-7050 red-sensitive multiplier phototube. Harvard College Observatory, Circular No. 451, 37-38. About 1947: no date is given.
31. C. C. Larson and H. Salinger, Photocell multiplier tubes. Rev. Sci. Instr. 11, 226-29 (1940).
32. R. C. Winans and J. R. Pierce, Operation of electrostatic photomultipliers. Rev. Sci. Instr. 12, 269-77 (1941).
33. B. Sherman, private communication, 1958.
34. G. E. Kron, Developments in the practical use of photocells for measuring faint light. Astrophys. J. 116, 1-13 (1952).
35. A. H. Sommer and R. W. Engstrom, private communications, 1958.
36. W. E. Spicer, Photoemissive, photoconductive and optical absorption studies of alkali-antimony compounds. Phys. Rev. 112, 114-122 (1958).
37. W. Shockley and J. R. Pierce, Theory of noise for electron multipliers. Proc. I. R. E. 26, 321-32 (1938).
38. J. A. Rajchman and R. L. Snyder, An electrically-focused multiplier phototube. Electronics 13, 20-23, 58, 60 (1940).

Photoconductive Cells
39. F. C. Nix, Photoconductivity. Revs. Modern Phys. 4, 723-766 (1932).
40. F. S. Goucher, The photon yield of electron-hole pairs in germanium. Phys. Rev. 78, 816(L) (1950).
41. H. Y. Fan, M. L. Shepherd, and W. Spitzer, Infrared absorption and energy-band structure of germanium and silicon. In "Atlantic City Photoconductivity Conference, 1954" (R. G. Breckenridge, B. R. Russell, and E. E. Hahn, eds.), pp. 184-203. Wiley, New York, 1956.
42. E. Burstein, G. Picus, and N. Sclar, Optical and photoconductive properties of silicon and germanium. In "Atlantic City Photoconductivity Conference, 1954" (R. G. Breckenridge, B. R. Russell, and E. E. Hahn, eds.), pp. 353-413. Wiley, New York, 1956.
43. A. Smith and D. Dutton, Behavior of lead sulfide photocells in the ultraviolet. J. Opt. Soc. Am. 48, 1007-1009 (1958).
44. F. L. Lummis and R. L. Petritz, Noise, time constant, and Hall studies on lead sulfide photoconductive films. Phys. Rev. 106, 502-508 (1957).
45. H. E. Spencer, Quantum efficiency of photoconductive lead sulfide films. Phys. Rev. 109, 1074-75 (1958).
46. B. Wolfe, On the specific noise of lead sulfide photodetectors. Rev. Sci. Instr. 27, 60-61(L) (1956).
47. T. S. Moss, Photoelectromagnetic and photoconductive effects in lead sulphide single crystals. Proc. Phys. Soc. 66B, 993-1002 (1953).
48. T. S. Moss, Absorption and photoconductivity in InSb. In "Atlantic City Photoconductivity Conference, 1954" (R. G. Breckenridge, B. R. Russell, and E. E. Hahn, eds.), pp. 427-448. Wiley, New York, 1956.
49. E. Mollwo, Electrical and optical properties of ZnO. In "Atlantic City Photoconductivity Conference, 1954" (R. G. Breckenridge, B. R. Russell, and E. E. Hahn, eds.), pp. 509-528. Wiley, New York, 1956.
50. J. Backovsky, M. Malkovska, and J. Tauc, Photo-voltaic effect caused by X-rays at p-n junctions in germanium. Czech. J. Phys. 4, 98(L) (1954).
51. J. Drahokoupil, M. Malkovska, and J. Tauc, Quantum efficiency of the photoelectric effect in germanium for X-rays. Czech. J. Phys. 7, 57-65 (1957).
52. K. G. McKay, Electron-hole production in germanium by alpha-particles. Phys. Rev. 84, 829-832 (1951).
53. S. Koc, The quantum efficiency of the photoelectric effect in germanium for the 0.3-2.0 micron wavelength region. Czech. J. Phys. 7, 91-95 (1957).
54. C. I. Shulman, Measurement of shot noise in CdS crystals. Phys. Rev. 98, 384-386 (1955).
55. K. M. van Vliet, J. Blok, C. Ris, and J. Steketee, Measurements of noise and response to modulated light of cadmium sulphide single crystals. Physica 22, 723-740 (1956).
56. L. Free, New London Underwater Sound Laboratory, unpublished (1957).
57. B. N. Watts, Increased sensitivity of infrared photoconductive receivers. Proc. Phys. Soc. 62A, 456-7(L) (1949).
58. T. S. Moss, The ultimate limits of sensitivity of lead sulfide and telluride photoconductive detectors. J. Opt. Soc. Am. 40, 603-607 (1950).
59. P. B. Fellgett, On the ultimate sensitivity and practical performance of radiation detectors. J. Opt. Soc. Am. 39, 970-76 (1949).

Television Camera Tubes
60. O. H. Schade, Electro-optical characteristics of television systems. RCA Rev. 9, 5-37, 245-286, 490-530, 653-686 (1948).
61. P. Elias, D. S. Grey, and D. Z. Robinson, Fourier treatment of optical processes. J. Opt. Soc. Am. 42, 127-134 (1952).
62. R. Clark Jones, New method of describing and measuring the granularity of photographic materials. J. Opt. Soc. Am. 46, 799-808 (1955).

Photographic Negatives
63. R. W. Gurney and N. F. Mott, Theory of the photolysis of silver bromide and photographic latent images. Proc. Roy. Soc. 164A, 151-167 (1938).
64. N. F. Mott, Photographic latent image. Phot. J. B81, 63-69 (1941).
65. L. Silberstein, Quantum theory of photographic exposure. Phil. Mag. [6] 44, 257-273 (1922).
66. T. Svedberg, Relation between sensitiveness and size of grain in photographic emulsions. Phot. J. B62, 186-196 (1922).
67. L. Silberstein and A. P. H. Trivelli, The quantum theory of X-ray exposures on photographic emulsions. Phil. Mag. [7] 9, Suppl., 787-800 (1930).
68. L. Silberstein, Contribution to the theory of photographic exposure. Phil. Mag. [7] 6, 464-489 (1928).
69. L. Silberstein and J. H. Webb, Photographic intermittency effects and the discrete structure of light. Phil. Mag. [7] 18, 1-24 (1934).
70. E. P. Wightman and R. F. Quirk, Intensification of the latent image on photographic plates and films. J. Franklin Inst. 203, 261-287 (1927).
71. J. H. Webb and C. H. Evans, An experimental study of latent-image formation by means of interrupted and Herschel exposures at low temperature. J. Opt. Soc. Am. 28, 249-263 (1938).
72. W. F. Berg and K. Mendelssohn, Photographic sensitivity and the reciprocity law at low temperatures. Proc. Roy. Soc. 168A, 168-175 (1938).
73. P. C. Burton and W. F. Berg, A study of latent image formation by a double exposure technique. Phot. J. B86, 2-24 (1946).
74. P. C. Burton, A study of latent image fading and growth by a double exposure technique. Part I. Phot. J. B86, 62-70 (1946).
75. P. C. Burton, A study of latent image fading and growth by a double exposure technique. Part II. Phot. J. B88, 13-17 (1948).
76. W. F. Berg and P. C. Burton, Study of latent image formation by a double exposure technique. Part II. Phot. J. B88, 84-88 (1948).
77. P. C. Burton, A two-stage theory of the density-intensity-time relation for single and double photographic exposures. Phot. J. B88, 123-136 (1948).
78. J. H. Webb, Low intensity reciprocity-law failure in photographic exposure; energy depth of electron traps in latent image formation: number of quanta required to form the stable sublatent image. J. Opt. Soc. Am. 40, 3-13 (1950).
79. E. Katz, On the photographic reciprocity law failure and related effects. II. J. Chem. Phys. 18, 499-506 (1950).
80. R. E. Maerker, Estimation of the critical T period in latent-image formation by intermittent exposures. J. Opt. Soc. Am. 44, 625-629 (1954).
81. J. W. Mitchell and N. F. Mott, The nature and formation of the photographic latent image. Phil. Mag. [8] 2, 1149-1170 (1957).
82. J. W. Mitchell, The nature of photographic sensitivity. J. Phot. Sci. 6, 49-70 (1957).
83. J. H. Webb, Graphical analysis of photographic exposure and a new theoretical formulation of the H and D curve. J. Opt. Soc. Am. 29, 314-326 (1939).
84. J. H. Webb, Number of quanta required to form the photographic latent image as determined from mathematical analysis of the H and D curve, Parts I and II. J. Opt. Soc. Am. 31, 348-354, 559-569 (1941).
85. P. C. Burton, Interpretation of the characteristic curves of photographic materials. In "Fundamental Mechanisms of Photographic Sensitivity" (J. W. Mitchell, ed.), pp. 188-207. Butterworths, London, 1951.
86. J. H. Webb, Absolute sensitivity measurements on single-grain layer photographic plates for different wave-lengths. J. Opt. Soc. Am. 38, 312-323 (1948).
87. W. F. Berg, Latent image formation in photographic silver halide gelatine emulsions. Repts. Progr. in Phys. 11, 248-297 (1948).
88. J. W. Mitchell, Photographic sensitivity. Repts. Progr. in Phys. 20, 433-515 (1957).
89. C. E. K. Mees, ed., "The Theory of the Photographic Process." Macmillan, New York, 1954. See especially Chapter 5 by P. C. Burton.
90. H. J. Zweig, Autocorrelation and granularity. J. Opt. Soc. Am. 46, 805-820 (1956).
91. J. H. Altman and K. F. Stultz, Microdensitometer for photographic research. Rev. Sci. Instr. 27, 1033-1036 (1956).
91a. R. Clark Jones, On the minimum energy detectable by photographic materials. Phot. Sci. and Eng. 2, 191-204 (1958).
92. L. A. Jones and G. C. Higgins, Photographic granularity and graininess. Part II. J. Opt. Soc. Am. 36, 203-227 (1946).

Human Vision
93. H. R. Blackwell and D. W. McCready, Jr., to be published.
93a. H. R. Blackwell, Brightness discrimination data for the specification of quantity of illumination. Illum. Eng. 47, 602-609 (1952).
94. A. Rose, Sensitivity performance of the human eye on an absolute scale. J. Opt. Soc. Am. 38, 196-208 (1948).
95. R. Clark Jones, On the quantum efficiency of scotopic and photopic vision. J. Wash. Acad. Sci. 47, 100-108 (1957).
96. S. Hecht, S. Shlaer, and M. H. Pirenne, Energy, quanta, and vision. J. Gen. Physiol. 26, 819-840 (1942).
97. M. H. Pirenne, Physiological mechanisms of vision and the quantum nature of light. Biol. Revs. 31, 194-241 (1956).
98. R. B. Barnes and M. Czerny, Lässt sich ein Schroteffekt der Photonen mit dem Auge beobachten. Z. Physik 79, 436-449 (1932).
99. H. de Vries, The quantum nature of light and its bearing upon the threshold of vision, the differential sensitivity, and the visual acuity of the eye. Physica 10, 553-564 (1943).
100. H. B. Barlow, Retinal noise and absolute threshold. J. Opt. Soc. Am. 46, 634-639 (1956).
101. H. B. Barlow, Increment thresholds at low intensities considered as signal/noise discriminations. J. Physiol. 136, 469-488 (1957).
102. E. Baumgardt, Sehmechanismus und Quantenstruktur des Lichtes. Naturwissenschaften 39, 388-393 (1952).
103. H. K. Hartline, L. J. Milne, and I. H. Wagman, Fluctuation of response of single visual sense cells. Federation Proc. 6, 124 (1947).
104. W. P. Tanner, Jr. and J. A. Swets, The human use of information, Part I. Prof. Group on Information Theory. Trans. I. R. E. PGIT-4, 213-221 (1954).
105. T. G. Birdsall and W. W. Peterson, The probability of a correct decision in a forced choice and among M alternatives. University of Michigan Electronic Defense Group, Quart. Prog. Rept. No. 10, April 1954.
106. P. Reeves, Response of the average pupil to various intensities of light. J. Opt. Soc. Am. 4, 35-43 (1920).
107. P. Moon and D. E. Spencer, On the Stiles-Crawford effect. J. Opt. Soc. Am. 34, 319-329 (1944).
108. A. Rose, Quantum effects in vision. Advances in Biol. and Med. Phys. 6, 211-242 (1957).
109. H. R. Blackwell, Contrast thresholds of the human eye. J. Opt. Soc. Am. 36, 624-643 (1946).

Other Detectors
110. H. A. Zahl and M. J. E. Golay, Pneumatic heat detector. Rev. Sci. Instr. 17, 511-515 (1946).
111. M. J. E. Golay, A pneumatic infrared detector. Rev. Sci. Instr. 18, 357-362 (1947).
112. M. J. E. Golay, Theoretical and practical sensitivity of the pneumatic infrared detector. Rev. Sci. Instr. 20, 816-820 (1949).
113. J. N. Shive, The properties of germanium phototransistors. J. Opt. Soc. Am. 43, 239-244 (1953).
114. A. G. Chynoweth and G. L. Pearson, Effects of dislocations on breakdown in silicon p-n junctions. J. Appl. Phys. 29, 1103-1110 (1958).
115. A. Slocum and J. N. Shive, Shot dependence of p-n junction phototransistor noise. J. Appl. Phys. 26, 406(L) (1954).
116. A. J. Cussen, private communication (1955).
117. M. B. Prince, Silicon solar energy converters. J. Appl. Phys. 26, 534-540 (1955).
118. L. Pincherle, The photoelectromagnetic effect. In "Atlantic City Photoconductivity Conference, 1954" (R. G. Breckenridge, B. R. Russell, and E. E. Hahn, eds.), pp. 307-320. Wiley, New York, 1956.
119. C. Hilsum and I. M. Ross, An infrared photocell based on the photoelectromagnetic effect in indium antimonide. Nature 179, 146(L) (1957).
120. E. I. Rabinowitch, "Photosynthesis and related processes," Vol. II, Part 2, pp. 1940-1969. Interscience, New York, 1956.
Automatic Data Processing in the Physical Sciences

G. E. BARLOW
Australian Joint Service Staff, Washington, D. C.

J. A. OVENSTONE
Department of Defence, Victoria Barracks, Victoria, Australia

F. F. THONEMANN
Weapons Research Establishment, Department of Supply, Salisbury, South Australia
I. Introduction
II. Automatic Data Processing and the Experiment
III. An Example of a Data-Processing System
    A. Design of the Experiment
    B. General Description of the ADP System
    C. Economy and Efficiency
IV. The Acquisition, Storage, and Conversion of Data
. . .
VII. Conclusion
Acknowledgment
I. INTRODUCTION

Data processing, or the purposeful manipulation of signals and symbols, is common to all the sciences. While mechanical aids such as the recording meter, the slide rule, and the desk calculator have long been familiar in this activity, its large-scale mechanization has been achieved only in the last decade by linking electronic and electromechanical devices into complex, automatic systems.
The first stage of the thorough mechanization of scientific practice was the use of automatic electronic computers for the analysis of experimental data and the simulation of physical systems. More recently, because an experiment neither begins nor ends with calculation and analysis, these computers have been used as the central units of automatic systems which also take measurements and even partially control the course of an experiment. This is possible because the routine part of experiments, including the retrieval of literature, measurement, calculation, and the tabulation of results, can be delegated to machines. In this review we shall discuss the extent and use of this delegation and shall describe some techniques and systems in order to show the scope for automatic data processing in the physical sciences.

II. AUTOMATIC DATA PROCESSING AND THE EXPERIMENT
The data process begins with the conception of the experiment and ends with the presentation of intelligible results. Evidently, there is no quicker way to get the results of an experiment than by processing data as they are acquired. Such "real-time," or "on-line," operation is exemplified in control systems, where the data process must keep pace with the controlled elements in some well-defined sense, and in simulators which incorporate physical parts of the system being studied. In fire control, weather forecasting, and other systems where current data are used to predict events, data processing in real time may also be necessary. In many other applications, however, real-time processing is unjustified because data are generated faster than they can be processed economically, or because a fault in the system may result in the irretrievable loss of data, or because the time and effort required for the assessment of results discount the speed of real-time processing. The last reason is important in scientific work because the final decisions about an experiment are made by the experimenter and not by machines. The main advantages of "off-line" processing are that the recorded data may be replayed at different rates, that the system may be time-shared, and that all the data are available for other investigations which may arise in the course of the experiment.

Regarded as a data process, any experiment seems to divide naturally into four stages:

1. Design.
2. Data acquisition and conversion.
3. Calculation and display.
4. Evaluation.

In the first stage, the objectives of the experiment are defined and means are set up to reach them. The second stage includes doing the experiment, recording observations, and changing data representations into the forms
required later in the processing. In the third stage, the data are analyzed and the results displayed, while the last stage has to do with the assessment and interpretation of results as a whole. These stages are illustrated in Fig. 1 as a circular flow diagram to emphasize that evaluation is often followed by the redesign and repetition of the experiment. It is usual, of course, for the results from any stage to modify operations in a preceding one, but the over-all feedback to the design stage implies a controlled interaction of theory and experiment with the object of bringing about a correspondence between experimental results and theoretical expectations. Like all purposeful activities, an experiment bears a general resemblance to an error-actuated servomechanism, and this suggests that automatic data processing might be taken much further than it has. Usually, however, the feedback is so complicated as to escape analysis at present and is therefore seldom amenable to mechanization.

FIG. 1. The stages of an experiment.

The essentials of an automatic data-processing system are almost self-explanatory. As can be seen from Fig. 2, current data, or data retrieved from the store, pass to the operation unit and are returned to the store or the input/output as determined by the control unit. The sequence and kind of operations performed by the system may be preset in the equipment or, as in automatic sequence-controlled computers, may be determined by a program of instructions which have been planted in the store.
In the latter case, the program may be modified, or another sequence of operations may be initiated by data as they flow through the operation unit or the store. The functions of the input/output units are obvious, but because of the variety of forms in which data are communicated, they can be very difficult to instrument.

The three major applications of automatic data processing in experimental work are, broadly,

1. The solution of mathematical problems and models.
2. The simulation of systems.
3. The conversion and analysis of experimental data.
FIG. 2. The essentials of an ADP system. (Legend: control signal paths; data paths.)
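The arrangement of store, operation unit, control unit, and input/output described above can be sketched as a very small stored-program interpreter. This is an illustrative toy and not a model of any machine discussed in the review: the instruction names, the store layout, and the function `run` are all hypothetical. It shows the program held in the same store as the data, and the data themselves (here, the sign of a reading) selecting a different sequence of operations.

```python
def run(store, inputs):
    """Execute the program planted in `store`, reading from `inputs`.

    The control unit is the instruction counter `pc`; the operation unit
    is the single working register `acc`; `outputs` is the output unit.
    """
    outputs = []
    readings = iter(inputs)
    pc = 0
    acc = 0
    while True:
        op, arg = store[pc]          # control unit fetches an instruction
        pc += 1
        if op == "READ":             # input unit -> operation unit
            acc = next(readings)
        elif op == "LOAD":           # store -> operation unit
            acc = store[arg]
        elif op == "ADD":            # operate on a stored datum
            acc += store[arg]
        elif op == "STORE":          # operation unit -> store
            store[arg] = acc
        elif op == "WRITE":          # operation unit -> output unit
            outputs.append(acc)
        elif op == "JUMP_IF_NEG":    # the data alter the sequence
            if acc < 0:
                pc = arg
        elif op == "HALT":
            return outputs
        else:
            raise ValueError(f"unknown operation {op!r}")


# Instructions and data share one store (addresses are hypothetical).
PROGRAM = {
    0: ("READ", None),
    1: ("JUMP_IF_NEG", 6),   # negative readings are rejected
    2: ("ADD", 100),         # otherwise add the stored offset
    3: ("STORE", 102),
    4: ("WRITE", None),
    5: ("HALT", None),
    6: ("LOAD", 101),        # rejected: output the stored flag instead
    7: ("WRITE", None),
    8: ("HALT", None),
    100: 10,                 # datum: offset
    101: -1,                 # datum: rejection flag
    102: 0,                  # datum: last accepted result
}

print(run(dict(PROGRAM), [5]))    # accepted reading
print(run(dict(PROGRAM), [-3]))   # rejected reading
```

A positive reading follows the first branch and is written out with the offset added, while a negative reading diverts control to the second branch; nothing in the equipment is preset apart from the interpreter loop itself.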
Since electronic digital computers and differential analyzers were designed originally for the solution of mathematical problems, we shall not dwell on the first application. Modern machines are fundamentally similar to those of a decade ago, although their capacity, reliability, and speed (particularly for digital machines) have been greatly improved.

The simulation of a real or hypothetical system means that it is represented by an analog which may be of more interest than the explicit equations describing the system. Simulation is often a quick and inexpensive way of exploring the possibilities of complex designs and systems, and permits "pilot runs" to be made in the laboratory in an attempt to avoid gross errors in the field. Because a simulation can be run on a time scale chosen by the experimenter, it is a most flexible means for the study of all kinds of engineering and physical problems, e.g., nuclear reactors (1), electrical networks (2), optical systems (3), neurological models (4), and particle accelerators (5).

In the processing of experimental data, the operations of acquisition, recording, conversion, calculation, and display can each be mechanized readily. But mechanization will be ineffective unless the communication links between the various machines insure a fast, reliable, and continuous flow of data. The design of automatic systems therefore demands very careful attention. As these systems are less flexible than equivalent manual ones, every contingency must be allowed for, and, with the increasingly complex tasks being delegated to machines, the subject of system engineering (6) has become of major importance.
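The remark that a simulation runs on a time scale chosen by the experimenter can be illustrated with a digital simulation of the simplest electrical network, a capacitor discharging through a resistor. The sketch below is a hypothetical example, not one taken from the works cited: the component values and the function name are assumptions, and the forward Euler method is used only because it is the shortest to write down.

```python
import math


def simulate_rc_discharge(v0, resistance_ohm, capacitance_f, duration_s, steps):
    """Integrate dV/dt = -V / (R C) by the forward Euler method.

    The simulated duration and the number of steps are chosen freely;
    neither is tied to the time the computation itself takes.
    """
    dt = duration_s / steps
    tau = resistance_ohm * capacitance_f
    voltage = v0
    history = [voltage]
    for _ in range(steps):
        voltage += dt * (-voltage / tau)
        history.append(voltage)
    return history


# Hypothetical network: 1,000 ohms and 1,000 microfarads (time constant 1 sec).
history = simulate_rc_discharge(1.0, 1000.0, 1.0e-3, duration_s=1.0, steps=10000)
exact = math.exp(-1.0)
print(f"simulated {history[-1]:.5f}, exact {exact:.5f}")
```

The same run could represent a microsecond transient or a discharge lasting hours simply by changing `duration_s`, which is the flexibility the text refers to; the comparison with the exact exponential is included only to show that the step chosen is fine enough.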
III. AN EXAMPLE OF A DATA-PROCESSING SYSTEM
This example of a data-processing system for guided missile trials is chosen because it is familiar to the authors and because much of the development in automatic data processing (ADP) began in this field. The system will be described in outline only, as details are given in later sections and elsewhere (7).
A. Design of the Experiment

Guided missile flight trials are expensive, and an evident economy is to gather as much pertinent data as possible from each trial. Because the results of one trial can determine the conditions of the next, it is necessary to process data at a rate which does not delay successive trials. Although "real-time" processing is mandatory for some trials, it is not justified for most of them, and full advantage can be taken of the flexibility of an "off-line" system.

For the observation of a missile in flight, trajectory and displacement measuring instruments, ground-based and airborne instruments for the observation of events, and radio telemeters are used. The main characteristics of some of these instruments are given in Table I, which shows also that a very large volume of data is produced in a few minutes. The rate and amount of data is such as to make automatic recording essential on media with high storage capacity and transfer rate.

TABLE I. TYPICAL CHARACTERISTICS OF SOME RANGE INSTRUMENTS

| Instrument | Type | Description of data | Rates | Approximate reading precision | Basic calculation and results required |
|---|---|---|---|---|---|
| Kinetheodolite | Trajectory | Image giving azimuth, elevation and misalignment of missile relative to optical axis | Up to 60 frames per second | About 3-20 sec of arc depending on type of theodolite mounting | Evaluate probable intersection of skew rays in space; thence velocity, aerodynamic derivatives, guidance system parameters, etc. |
| Radio-doppler | Displacement | C-W signal describing radio path-length between transmitter, missile, and receiver | Up to 5,000 doppler cycles per sec | 0.01 to 0.1 of the doppler wavelength | Evaluate intersection of ellipsoids, then as for kinetheodolites |
| Radar | Trajectory | Range and direction of missile relative to transceiver axis | Up to 100 sets of data per sec | Varies from 0.1 to 1 min of arc and from 1 to 100 yards in range depending on radar system | Usually coordinate transformations and then as for kinetheodolites |
| Long-focal-length camera | Ground event measurement | As for kinetheodolites in general | Up to 200 frames per sec | 10-20 sec of arc | As for kinetheodolites with modifications and additions |
| AMPOR | Airborne camera cluster | Image of missile relative to target axis | Up to 100 frames per sec | 1 min of arc | As for kinetheodolites with additions |
| F.M. telemetry | Telemetry | Time-multiplexed, frequency-modulated signal related to transducer inputs | About 5,000 samples per sec but may be higher | 1% of full scale reading | Calibration, correlation, and miscellaneous comparisons, integrations, and frequency analysis |

Coarse-scale examination of the records gives an over-all, qualitative picture of the experiment and suffices for the human assessment of behavior and for the selection of relevant data for further analysis. In the detailed assessment and analysis of the records, however, the examination is localized and can be done by an automatic computer. It follows that the data should be recorded in forms suitable for both human observation and automatic analysis.
B. General Description of the ADP System

Originally, cameras of various kinds were used to record data from all the instruments. The developed films were read manually, and calculations were done with the assistance of punched-card machines. In the ADP system (see Fig. 3), all instrument data, with the exception
FIG. 3. Diagram of a working ADP system.
of those from kinetheodolites, high-speed cameras and “quick-look” records, are recorded on magnetic tape together with elapsed time codes and standard reference frequencies. The primary magnetic tape records are converted automatically to secondary magnetic tape records on which the data are expressed in a standard digital representation for input to an automatic digital computer. The primary photographic records are converted by means of semi-automatic film readers into punched paper tape
records in another standard digital representation. Relevant data are selected for conversion with the help of the “quick-look” records, and the conversion equipment is flexible enough to cope with new or special instruments which may be used in some experiments. Both kinds of secondary records are fed into the digital computer. If high-speed transfer of large amounts of data from the computer is required, its output is recorded on tertiary magnetic tapes; otherwise, the record is made on punched paper tape. The tertiary records are played into automatic tabulating and/or graphing units which provide displays suitable for immediate copying and publication. If necessary, the tertiary records can be used for further automatic calculation and analysis, since standard digital representations have been used throughout the system. Qualitative assessing and monitoring is performed with, or immediately after, each recording, and the digital computer then effects any detailed analysis. This transfer of large amounts of routine analysis and assessment from men to machines greatly increases the processing speed and makes possible the more effective employment of scientific staff.
C. Economy and Efficiency
It took about four years to design and construct the ADP system and to put it into full operation. Of this period some 18 months and 30 man-years were occupied in changing over from the earlier semiautomatic system to routine automatic operation of the new one. Even now, as experience accumulates on automatic assessment, monitoring, and editing, the revision of methods is a continuous task to which must be added the processing requirements for new, more complex missiles. Generally, it seems that one always knows what should have been done after the first few routine runs have been made! A typical example of the efficiency of the system is given in Fig. 4. Without any increase in staff, the automatic system has reduced the elapsed time for analyzing trials by a factor of 10 while dealing with missiles at least five times as complex as originally envisaged. The over-all costs of processing data have been reduced by a factor of 60, and the system paid for itself in its first year of operation.

IV. THE ACQUISITION, STORAGE, AND CONVERSION OF DATA

Because the bulk of quantitative data from modern instrumentation is first accessible as electrical waveforms, the electrical analog suggests itself as a general method of automatic data processing. It has been used in some radiotelemetry systems where the number of separate analog transformations may approach 10, for example, acceleration-to-voltage-to-frequency-to-carrier-to-frequency-to-voltage-to-graphical display. Each transformation, as well as the analog computation, contributes its quota of random error to the final result, a degradation which could be avoided in a digital system. But a more compelling reason for breaking an analog data-processing chain at some point is that computations, which are best done digitally, are often required on at least some of the data. If the only record is on film or paper, its conversion to digital form can be done by automatic curve followers, but the 1 to 2% precision of these devices may not be good enough. Precisions of 0.1% or better can be obtained with semiautomatic readers which are in common use as a link between film or paper records and digital computers. These readers are expensive and, with a maximum reading rate of about one point in 5 sec,
FIG. 4. Comparative times for the manual and automatic processing of telemetry data.
fail to match the rate at which instruments can produce and digital computers can accept data. If, however, electrical analog data are taken from some point in the system, or from a magnetic record, the conversion to digital form can be done easily and precisely. For monitoring and “quick-look” purposes, ancillary analog or digital displays can still be used. Although an analog system is sometimes simple and expedient for the processing of experimental data, a digital system is superior in scope, flexibility, and precision. For processes in which an over-all precision better than 1% is called for, the utility of an analog system becomes questionable because of the unavoidably cumulative “instrument errors” of analog computation and the susceptibility of analog data to distortion and drift
in recording and transmission. The round-off and truncation errors of digital computations can both be made negligible, the first by providing enough significant digit places in the machine and the second by correct programming. As is well known, the recording and transmission of digital data can be made free of error, provided the conditions on pulse-code modulation (PCM) are satisfied (8). At present, there are no precise analog data stores combining the permanence, capacity, and quick access of the existing forms of digital stores, and this lack alone places a severe restriction on the use of analog systems for “off-line” data processing. These are some of the reasons why automatic data processing has a strongly digital trend, particularly in large-scale experiments, and explain why we have mainly confined discussion to systems in which a digital computer is the central element.
FIG. 5. A typical system for the automatic processing of instrument data.
In its simplest form the ADP system might be represented by the block diagram of Fig. 5. At this general level, little comment is necessary because the operations are familiar and common to all experiments. The system acquires data by means of the sampling switch either directly from the observing instruments or from analog records of their outputs. The analog-to-digital converter translates the input samples into the common code of the system, and the buffer stores provide for the automatic selection, reordering, and control of data entering and leaving the computer. Because much of the data derived from experiments are either unexpected or suspect, editing operations are most important and must be based on decisions which cannot be made until the experimenter knows
how the experiment is behaving. Although manual editing could be done at the output of the ADP system, much of the processing would often turn out to have been irrelevant and the input data would have to be reprocessed. Thus, the fully automatic process is usually interrupted while the raw data are edited manually. Two common arrangements providing for the manual editing of raw data are also illustrated in Fig. 5. In the first, analog samples are recorded and then replayed to give a “quick-look” display, and only the selected parts of the record are processed further. In the second arrangement, the record is digital, and the “quick-look” display is made by way of a digital-to-analog converter. With the development of fast, reliable analog-to-digital converters and high-speed magnetic recorders for digital data, the second arrangement is nearly always practicable. It also has the merit that digital, unlike analog, recording does not introduce errors. It is, however, more expensive to implement and, because the storage density of analog data is an order of magnitude higher than that of digital data, it makes a less efficient use of recording media.
A. Storage of Data

In the last few years, the use of magnetic materials for the storage of both analog and digital data has become commonplace. Magnetic tapes (9), cards (10), drums (11), and matrices (12) are together unrivaled in storage capacity, transfer rate, and access time and, because data are entered into and extracted from them directly in electrical form, they are naturally fitted for use in electronic data-processing systems.

1. Magnetic Tape Recording. Currently available magnetic tape machines allow for the direct recording of electrical waveforms in the frequency band 100 cps to 100 kc on each of up to 8 tracks of standard half-inch tape at a recording speed of about 60 in./sec. A reduction of tape transport speed proportionately reduces the recording bandwidth by about 2 kc/in./sec. It is possible to extend the recording bandwidth downward by the use of flux-sensitive reproducing heads (13), and upward to several megacycles by the use of microgap ferrite heads (14) and tape transport mechanisms of high quality. Video recording techniques (15) in which a number of heads are moved laterally across the tape at speeds of up to 1,500 in./sec have recently come into use for the recording of television signals, but have not yet been used in other applications. Direct electrical analog recording on magnetic tape is restricted by the intermittent occurrence of amplitude “drop-outs” and fluctuations of the reproduced signals arising from imperfect tape motion and inhomogeneities of the oxide coating. These factors seem to place a limit of about 5% rms amplitude error on reproduction of signals recorded in this way. Several
carrier recording methods are in common use which have precisions of about 1% of full scale. Wide-deviation frequency modulation with signal frequencies in the band zero to 10 kc is used (16), but requires care in preventing drifts in the average tape speed. If a constant frequency is recorded with the signal, it can be used on replay to compensate for wow and flutter in the tape motion. Similar remarks apply to narrow-band frequency-modulation methods in which more than 10 carriers in the band 0.4 to 80 kc, each with a deviation of ±7.5%, are multiplexed and recorded on one tape track (17). In compound modulation (CM), suppressed carrier amplitude modulation precedes frequency modulation (18). The recorded information is, therefore, the magnitude of the frequency deviation and is not the instantaneous frequency as in pure FM. Center frequency drift arising from any source (including tape speed) does not introduce errors, but the signal frequency bandwidth is only about 1 kc. Pulse-width modulation readily lends itself to time-sharing and is probably the best recording technique for quasi-static inputs (19). Signals can be reproduced with a precision of better than 1% of full scale, and wide variations in tape recording or replay speed do not introduce significant errors. When precisions of much better than 1% are required, digital recording is necessary. The maximum rate at which binary coded samples may be recorded on a single track of tape is nv/N, where n bits per inch per track is the maximum digit packing density, v is the recording speed in inches per second, and N is the number of bits in the word, i.e., the number of binary digits used to encode each analog sample. Evidently, if the digits are recorded in parallel on N tracks, the maximum recording rate is nv samples a second, and the samples are reproduced in each case with a precision of one part in $2^N$.
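The rate expressions nv/N and nv can be evaluated directly. The following is a minimal sketch in Python, not part of the original text; the function names are illustrative, and the figures n = 500, v = 60 in./sec, and N = 7 are used only as sample values.

```python
def serial_sample_rate(n_bits_per_inch, v_inches_per_sec, word_bits):
    """Samples per second when the N bits of a word follow one another on one track (nv/N)."""
    return n_bits_per_inch * v_inches_per_sec / word_bits


def parallel_sample_rate(n_bits_per_inch, v_inches_per_sec):
    """Samples per second when the N bits of a word lie side by side on N tracks (nv)."""
    return n_bits_per_inch * v_inches_per_sec


def precision_fraction(word_bits):
    """Resolution of an N-bit word, i.e. one part in 2**N, as a fraction of full scale."""
    return 1.0 / (2 ** word_bits)


# Sample values (assumed for illustration): packing density, tape speed, word length.
n, v, N = 500, 60, 7
serial = serial_sample_rate(n, v, N)
parallel = parallel_sample_rate(n, v)
print(f"serial: {serial:.0f} samples/sec, parallel: {parallel} samples/sec")
print(f"precision: {100 * precision_fraction(N):.2f}% of full scale")
```

With these figures the expressions give about 4,286 samples a second for serial recording and 30,000 for parallel recording, and a 7-bit word resolves one part in 128, or about 0.78% of full scale.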
For example, if n = 500, v = 60 in./sec, and N = 7, the maximum recording rate is about 4,300 samples a second for serial recording and 30,000 samples a second for parallel recording. In both cases, the precision is better than 1%. It is worth noting that an increase in recording speed by the small factor of $(N + 1)/N$ doubles the precision, since an additional binary place becomes available. The most efficient method for the recording of binary coded data is to saturate the tape in one direction or the other so that a change from positive to negative saturation indicates “1” and change in the reverse direction indicates “0.” Compared with return-to-zero recording, in which the digits are represented by recording positive or negative-going pulses, the nonreturn-to-zero (NRZ) method almost doubles the effective pulse packing density. Unless the tape transport speed is constant both for recording and replay, however, it is necessary to record a timing track by means of
which individual digits of the code may be reconstituted. It is evident that digital recording lends itself easily to time-sharing and that tape speed variations are not a source of error so long as the replay tape speed does not fall to a value below which the recorded digits cannot be recovered reliably.

2. Magnetic Drum Recording. A magnetic drum store consists of a cylinder of magnetic material whose surface is coated with a thin film of magnetic alloy or magnetic oxide. At about 500 µin. from the surface are the gaps of a number of read/write heads. When the drum is spun about its axis, data may be recorded on, or recovered from, the magnetizable tracks under each head in the same way as with digital magnetic tape recorders. Data are erased by overwriting with a field strong enough to saturate a track in one direction. Storage densities of about 150 bits/in./track are readily obtained with NRZ recording, and a typical magnetic drum 10 in. in diameter, 10 in. long with 10 tracks/in., can store well over a quarter of a million bits. The maximum access time to a stored bit is inversely proportional to the speed of rotation of the drum and is usually of the order of 10 msec. In capacity and access time, magnetic drums and disks (20) are intermediate between magnetic tape and magnetic core high-speed stores.

3. Magnetic Core Matrices. The magnetic core matrix store (21) depends on the fact that the B-H loop of magnesium-manganese ferrite is almost rectangular and that the remanent flux density is nearly equal to the total flux density. A small toroidal core (about 2 mm in outer diameter) can be switched in about 1 µsec from one state of remanence to the other by a magnetomotive force of about 1 amp-turn when an emf of about 100 mv/turn is induced. The repeated application of magnetomotive forces insufficient to cause switching takes the core round a minor hysteresis loop without progressively reducing the remanent flux density.
A 2 × 2 plane matrix is illustrated in Fig. 6, with each core threaded with one horizontal and one vertical wire and all the cores threaded by the diagonal output wire. To read a “1” into any core, current pulses producing a little more than half the maximum magnetizing field are applied simultaneously to the horizontal and vertical wires threading the selected core. Since it is the only one subjected to the maximum magnetizing field, it is switched (if it was in the “0” state) to the state of positive remanence representing “1.” The readout of a core storing “1” is done similarly, except that the applied currents are reversed so that the selected core is switched to the “0” state and the emf induced in the output wire represents the “1” readout. The other cores disturbed by the reading operation induce a much smaller emf in the output wire than the core switched and, by threading the output wire in opposite sense through adjacent cores, unwanted signals tend to cancel. Moreover, since the unwanted signals fall almost to zero
by the time the “1” signal reaches its peak, the discrimination (or the size of the matrix) may be increased by gating the output at the proper time. While a plane matrix of this kind must evidently be read serially, a number of them driven simultaneously by common addressing equipment provide parallel readout.
FIG. 6. A 2 × 2 magnetic core matrix.
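The coincident-current selection described for the core matrix can be illustrated with a toy model. This is a sketch under stated assumptions, not a circuit simulation: the class name `CoreMatrix`, the normalized switching threshold of 1.0, and the half-select drive of 0.55 (standing for “a little more than half” the switching field) are all hypothetical choices made for illustration.

```python
class CoreMatrix:
    """Toy model of a coincident-current core plane (illustrative only)."""

    SWITCH_THRESHOLD = 1.0  # normalized field needed to switch a core (assumed)
    HALF_SELECT = 0.55      # field from one energized wire, a little over half (assumed)

    def __init__(self, rows, cols):
        self.rows, self.cols = rows, cols
        self.state = [[0] * cols for _ in range(rows)]

    def _drive(self, row, col, polarity):
        """Pulse one row wire and one column wire; return the cores that switched."""
        switched = []
        for r in range(self.rows):
            for c in range(self.cols):
                # A core sees one half-select field per energized wire threading it.
                field = polarity * self.HALF_SELECT * ((r == row) + (c == col))
                if field >= self.SWITCH_THRESHOLD and self.state[r][c] == 0:
                    self.state[r][c] = 1
                    switched.append((r, c))
                elif field <= -self.SWITCH_THRESHOLD and self.state[r][c] == 1:
                    self.state[r][c] = 0
                    switched.append((r, c))
        return switched

    def write_one(self, row, col):
        """Enter a 1: only the core at the intersection sees the full field."""
        self._drive(row, col, +1)

    def read(self, row, col):
        """Destructive readout: a stored 1 switches to 0 and yields an output pulse."""
        return 1 if self._drive(row, col, -1) else 0


matrix = CoreMatrix(2, 2)
matrix.write_one(0, 1)
first = matrix.read(0, 1)   # the stored 1 is sensed and cleared
second = matrix.read(0, 1)  # nothing left to switch, so no output pulse
print(first, second)
```

The model reproduces two points made in the text: half-selected cores on the same row or column are not switched, and reading a “1” leaves the core in the “0” state, so the digit must be rewritten if it is to be kept.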
4. Other Recording Devices. Fast, nonvolatile, immediate-access matrix stores have been constructed of ferroelectric crystals (22) and superconducting elements (23), but the magnetic core matrix is in current and increasing use throughout ADP systems. For high storage density, however, photographic media are potentially superior. Permanent photographic digital stores of very high storage capacity and quick access have an evident application in information retrieval and language translating machines because there is no requirement for high-speed writing. Recent work on the optical scanning of photographically stored digital data (24) indicates that a storage density of about 1 million bits per square inch and a readout rate of about 10 million bits per second might soon be achieved. If comparable writing speeds are ever reached on erasable photographic stores, e.g., by
reversibly changing the color of an emulsion by strobing with light, the photon might take over many of the present functions of the electron in ADP.
B. Analog-to-Digital Conversion

Most observing instruments do not measure data but only make measurement convenient. As the processing of experimental data is largely the manipulation of measurements, it is essential to have automatic measuring devices which match the observing instruments in speed and precision. For brevity, analog-to-digital conversion is often called coding, its inverse is called decoding, and the instruments may be called coders and decoders, respectively. Digital-to-digital conversion is also coding or decoding, but the context should make clear what is meant. The digital description of a physical quantity results from the operation of sampling, quantizing, and encoding. Sampling is the capture of a continuous, time-varying physical quantity at successive instants (or within intervals during which the quantity changes insignificantly). It is often associated with the holding, or temporary storage, of the sample while the conversion is completed. Quantizing is the splitting of the sample size into separate levels whose number is necessarily finite because there are no infinitely small units of measurement. The “quantization error” can be made negligible for a fine enough subdivision of the sample size, since noise limits the definition of physical quantities. Coding is the labeling of the levels occupied by the samples. For automatic digital processors, the codes are commonly binary, or binary coded-decimal, but some converters deliver their outputs in unit-distance codes, or in incremental (unweighted) digits. It is easy to mechanize the translation of any one of these codes into any of the others or into an alpha-numeric code for display to human observers.
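As an illustration of how simply such a translation can be mechanized, the sketch below converts between natural binary and one unit-distance code, the reflected (Gray) code taken up later in this section. The function names are illustrative, and the exclusive-or rule used here is the standard one for the reflected code rather than a method described in the text.

```python
def binary_to_gray(value):
    """Natural binary to reflected (Gray) code: each bit is XORed with the next higher bit."""
    return value ^ (value >> 1)


def gray_to_binary(gray):
    """Reflected (Gray) code back to natural binary by a running XOR of the shifted code."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


# Print the first sixteen code words side by side.
for level in range(16):
    code = binary_to_gray(level)
    print(f"{level:2d}  binary {level:04b}  gray {code:04b}")
```

Adjacent levels differ in exactly one digit of the Gray code, which is the unit-distance property, and the round trip through both functions returns the original number.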
If the frequency band occupied by the class of analog data is known in advance, the minimum sampling rate necessary for exact interpolation between samples is given by the Whittaker-Shannon sampling theorem (25, 26). If the sampling rate is high enough, interpolation becomes unnecessary in practice. For example, a measuring instrument with a total bandwidth B will have a full-scale response time of T = 1/B sec, and if its resolution restricts it to n significant amplitude levels, then a significant change in reading cannot occur in less than T/(n − 1) sec. Thus, if the full-scale deflection of the instrument is divided into parts of equal size (quantized) and readings are sampled at the rate of (n − 1)/T a second, the quantized samples fully represent the analog output. The n significant sample amplitudes can be labeled by a pattern of symbols (coding) and, for a binary code, $\log_2 n$ digits are required to represent any quantized sample.
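The relations in this example, T = 1/B, a sampling rate of (n − 1)/T, and $\log_2 n$ binary digits per sample, can be put into a short sketch. The function names and the sample figures (a 10-cps instrument resolved to 101 levels) are assumptions chosen for illustration; the digit count is rounded up to a whole number of binary places.

```python
def sampling_rate(bandwidth_cps, levels):
    """Samples per second, (n - 1)/T with T = 1/B, i.e. (n - 1) * B."""
    return (levels - 1) * bandwidth_cps


def bits_per_sample(levels):
    """Smallest whole number of binary digits that can label n levels (log2 n rounded up)."""
    return (levels - 1).bit_length()


def quantize(reading, full_scale, levels):
    """Map a reading in 0..full_scale to the nearest of n equally spaced level numbers."""
    step = full_scale / (levels - 1)
    index = round(reading / step)
    return min(max(index, 0), levels - 1)  # clamp readings outside the scale


# Assumed figures: 10-cps bandwidth, 101 significant levels (1% steps).
rate = sampling_rate(10, 101)
bits = bits_per_sample(101)
level = quantize(0.5, 1.0, 101)
print(rate, bits, level, format(level, f"0{bits}b"))
```

For these assumed figures the instrument would be sampled 1,000 times a second and each sample would occupy 7 binary digits.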
Analog-to-digital conversion rates range from a few samples a second in converters using cams, gears, or relays to a few million samples a second in converters using electronic weighting circuits or coding tubes. The choice of a converter is naturally determined by the system in which it is to be used, but the main factors to be taken into account are
1. The number of data sources the converter will time-share.
2. The rate at which each source will be sampled and the time distribution of the samples.
3. The kind of output code required, and the form and rate of readout.
The matching of the conversion rate with other operations entails the use of data stores, recorders, buffers, or holding devices whose operation may be computer-controlled. The output code of the converter should be that used to represent numbers in the computer and is usually either binary or binary-coded decimal. With other codes, one has the choice of using computer time to translate them or providing a special unit for translation. For the human monitoring of data, an easily legible display will require additional equipment unless the output units of the computer are used for this purpose. The three physical quantities for which direct conversion devices have been made are time (including phase and frequency), voltage, and displacement (linear and angular). Primary (source) data occur in many other forms, but most observing instruments deliver data in one or other of these three forms. Voltage converters appear to be limited in precision to 0.01%, but the precision of time and displacement converters increases with the time resolution of electronic devices and the extensions in the applications of optical interferometry.

1. Time-Interval Converters. The encoding of time intervals is straightforward; the beginning of the interval is marked by a sharply rising pulse which opens an electronic gate to pulses derived from a stable frequency source or clock.
The pulses accumulate in a counter until the gate is shut by a second signal marking the end of the interval. By recording the clock frequency together with the time intervals on magnetic tape, the coder can be operated on a changed time scale by replaying the tape slower or faster than the recording speed. No significant error is introduced by tape speed variations which are common to the recorded time intervals and the recorded clock frequency. On replay, by frequency multiplication of the clock signal to a value higher than could be recorded originally, the accuracy of conversion can be increased up to the limit set by the resolution of the magnetic tape and reproducing head (about 10 µin.). Similar methods may be used for the conversion of the phase difference of two signals and for the period of a sinusoid by generating interval marks at the instants of zero amplitude.
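The gated counting just described, and the reason a changed replay speed does not affect the result, can be sketched as follows. The function names and the sample figures (a 0.125-sec interval and a 100-kc clock) are assumptions for illustration, and the model ignores noise and gate rise time.

```python
import math


def count_clock_pulses(interval_sec, clock_cps):
    """Whole clock periods that pass through the gate while it is open."""
    return math.floor(interval_sec * clock_cps)


def count_on_replay(interval_sec, clock_cps, speed_ratio, multiplier=1):
    """Count obtained when the tape is replayed at speed_ratio times the recording speed.

    The recorded interval is shortened by speed_ratio while the recorded clock
    frequency is raised by the same factor, so the ratio cancels.  The optional
    multiplier models frequency multiplication of the replayed clock signal.
    """
    replay_interval = interval_sec / speed_ratio
    replay_clock = clock_cps * speed_ratio * multiplier
    return math.floor(replay_interval * replay_clock)


# Assumed figures: a 0.125-sec interval timed against a 100-kc recorded clock.
direct = count_clock_pulses(0.125, 100_000)
half_speed = count_on_replay(0.125, 100_000, speed_ratio=0.5)
multiplied = count_on_replay(0.125, 100_000, speed_ratio=2, multiplier=8)
print(direct, half_speed, multiplied)
```

The count is the same at any replay speed, and multiplying the replayed clock by 8 gives 8 times as many counts for the same interval, that is, a finer time resolution.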
The limitations of time-interval converters are imposed by the stability with which reference marks can be selected in the presence of noise and fluctuating signal amplitude, the resolving power of coding counters, and the rise times of gating circuits. Gated-beam tubes, secondary emission tubes, and transistors operated in the “avalanche” mode (27) all have gating resolutions of a few millimicroseconds, so that, as the effective resolving power of coding counters can be increased by a vernier technique (28), only the first mentioned limitation is of practical importance.

2. Voltage Converters. Voltage converters are important in data-processing systems because there are few experiments which do not produce voltage as primary data or which do not convert primary data to voltage waveforms for electrical transmission. Many voltage converters work through an intermediate conversion of voltage to time or displacement; in the ramp
FIG. 7. A ramp converter.
converter, for example, there is an intermediate conversion to time and, in the coding tube, a displacement of an electron beam. Other forms of voltage coders using self-balancing potentiometers in which decimal readout is obtained from rotary switches or coded disks are well known and need not be discussed here.

a. Ramp converters. The form of this converter is shown in Fig. 7. A ramp function is generated, and as it crosses a reference voltage level (corresponding to the base voltage level of the input), a pulse is generated. When the ramp crosses the level of the voltage sample, a second pulse is generated. The time interval defined by the edges of these two pulses is converted in the way described above. If the ramp generator and the voltage comparators are stable, the number of clock pulses accumulated in the counter will evidently be proportional to the voltage amplitude of the sample. This kind of converter has a relatively long conversion time which depends on the dynamic range of the input signal and the slope of the ramp. The precision of the conversion may be made as high as 0.02% if circuit refinements are added with the purpose of correcting for drifts in the slope of the ramp (29). A commercial version of a ramp converter using a Miller time-base generator and diode amplitude comparators has a quoted full-scale precision of 0.1%. Transistor versions have also been built successfully (30).

b. Feedback converters. The arrangement for this kind of converter is illustrated in Fig. 8. Suppose the register (a series of flip-flops) is initially set to zero. A sequencing unit sets the flip-flop corresponding to the most significant digit to “1,” and a voltage equal to one-half of full scale is switched to the comparator. If this exceeds the amplitude of the sample,
FIG. 8. A feedback converter.
this flip-flop is reset; if it is less, the flip-flop remains set to “1.” The operation is repeated on each successive flip-flop, the reference voltage being half the preceding one on each comparison. The final state of the flip-flops represents the binary-digital value of the sample. This can now be read out either serially or in parallel, and the flip-flops are reset in readiness for the next sample. Conversion by this method is fast, requiring only $m/f_s$ seconds for each sample, where $f_s$ is the repetition rate of the sequencing unit and $2^m \ge p$, with $p$ the number of quantization levels. The precision of conversion depends on the comparator, which must retain its sensitivity to better than one quantum over the complete range of input amplitudes. It is possible to use a series of m comparators, each one optimized for a particular static voltage if necessary. An alternative method is to make the series of comparisons against one reference voltage equal to $2^{m-1}$ quanta and, if the comparison results in “1,” to subtract the reference voltage from the input sample and multiply the difference by a factor of 2 before the next comparison (31). Such a converter is extremely fast and
is limited in speed only by the rise time of the circuits. But because of drift in the independent comparators and amplifiers, the precision of conversion is likely to be poor. In another form of feedback converter, the series of flip-flops is arranged as a forward-backward counter. The comparator detects whether or not the feedback voltage is larger than the input sample and selects the “count-up” or the “count-down” line to the counter accordingly. Commercial units¹ using this backward-forward counter will follow a 4-cps sine wave without significant lag to a precision of 0.01% or, trading precision for speed, a 512-cps sine wave to 1%. A commercial version² of the first kind of feedback converter has a quoted speed of 4 µsec per bit plus 4 µsec per conversion, i.e., a conversion to 0.01% occupies 60 µsec.

c. The coding tube. Converters for mechanical displacement are at least an order of precision better than the most precise voltage converters, which suggests that speed combined with high precision might be had by the electrostatic displacement of an electron beam. The first converter of this kind was a special cathode-ray tube originally developed by the Bell Telephone Laboratories for a pulse-code modulation communication system (32). Conversion rates of the order of a million a second are easy to obtain, but the method does not appear to have been developed to give precisions of better than 1-2%. The major sources of error are the nonlinear relation between the deflection voltage and the beam displacement, the finite size of the cathode-ray light spot, and the difficulty in aligning the coding mask to the deflection plates.

3. Displacement Converters. In the conversion of angular or linear displacement, one mechanical member is usually marked by lines, slits, contacts, or a code pattern, and the other carries a sensing element by way of which the numbers of marks passing it, or the code pattern in the field of view, may be registered.
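Returning briefly to the feedback converter of Fig. 8, its digit-by-digit procedure and the timing figure quoted for the commercial unit can be sketched in a few lines. This is an idealized model with a perfect comparator and digital-to-analog converter; the function names are illustrative, and the 4-µsec defaults are simply the quoted figures.

```python
def feedback_convert(sample, full_scale, bits):
    """Digit-by-digit conversion; returns the register (most significant digit first) and code."""
    register = [0] * bits
    feedback = 0.0             # output of the digital-to-analog converter for the kept digits
    weight = full_scale / 2.0  # first trial voltage: one-half of full scale
    for i in range(bits):
        register[i] = 1                 # the sequencing unit sets the next flip-flop
        if feedback + weight > sample:  # the feedback voltage exceeds the sample
            register[i] = 0             # so this flip-flop is reset
        else:
            feedback += weight          # otherwise the digit remains set
        weight /= 2.0                   # the next reference is half the preceding one
    code = int("".join(str(digit) for digit in register), 2)
    return register, code


def bits_for_precision(fraction):
    """Smallest m with 2**m at least equal to the number of levels, 1/fraction."""
    levels = round(1.0 / fraction)
    return (levels - 1).bit_length()


def conversion_time_usec(bits, per_bit_usec=4.0, per_conversion_usec=4.0):
    """Conversion time for the quoted speed of 4 usec per bit plus 4 usec per conversion."""
    return bits * per_bit_usec + per_conversion_usec


register, code = feedback_convert(0.3, 1.0, 4)
m = bits_for_precision(0.0001)  # 0.01% of full scale
print(register, code, m, conversion_time_usec(m))
```

A precision of 0.01% calls for 14 binary digits, since $2^{13}$ = 8,192 is too few levels and $2^{14}$ = 16,384 is enough, and 14 digits at 4 µsec each plus 4 µsec gives the 60 µsec quoted.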
The most accurate and precise conversion is obtained by the use of diffraction gratings in which the number of interference bands passing the sensing elements are counted.³ Because the methods used for the conversion of either kind of displacement are similar, only three types of shaft converters need to be discussed here. These are those in which
1. The shaft position is measured in terms of increments of rotation from a reference position.
2. A coded pattern is rotated with the shaft to give a direct digitally coded output.
3. An analog-to-analog conversion precedes digital conversion.

¹ The Multiverter M-1, Packard-Bell Corp., Los Angeles, California.
² The Multiverter M-2, Packard-Bell Corp., Los Angeles, California.
³ Ferranti Co., Edinburgh (1957).
The incremental converters generate signals which are accumulated in counters capable of counting up or down on command from a device which detects reversals in the sense of rotation. The counter, of course, cannot be time-shared. Incremental signals have been generated by
1. The passage of a light beam through radial slots cut in a disk. (An asymmetry in the shape of the slots makes possible the direct detection of reversals of rotation.)
2. The interruption of current by alternately conducting and nonconducting disk segments. (This requires the use of brushes with their attendant problems of friction and wear.)
3. The detection of prerecorded magnetic marks (many magnetic drum stores select the addresses of tracks by this method), but magnetostatic heads, or other methods (33) must be used unless the shaft rotates continuously.
4. The detection of the phase changes of the modulation envelope of a carrier applied to a multipole stator, with a probe winding on the shaft.
Multiple reading stations and simple logical circuits can be devised to sense the direction of rotation and to detect and correct errors which, with one reading station only, would accumulate until the shaft returned to its reference position.

a. Coded disks. Most of the functions of the external circuits of incremental converters are intrinsic to the coded disk, so that, for conversions which do not call for the resolutions of optical interferometry, the coded disk converter is generally used. The segments of the disk may be of insulating and conducting materials sensed by brushes, or opaque and transparent materials sensed optically. The main difficulty in reading coded disks occurs at the sector boundaries. If a reading is taken near a boundary, gross errors may be caused by the finite brush (or optical slit) width, zone-to-zone misalignment of the pattern boundary with respect to a disk radius, and brush (or optical slit) misalignment with respect to a disk radius.
These errors are avoided by the use of unit-distance codes in which not more than one digit changes in crossing a zone boundary. It is not possible to do arithmetic directly with such codes, but the Gray (also called cyclic, reflected, or progressive) unit distance code is easily converted to the natural binary code and is frequently etched on disks. With binary-coded decimals, the unit distance property within decades does not prevent gross errors, since an ambiguity of one digit in the tens decade is equivalent to an error of 10 quanta. This may be avoided by reflecting alternate decades; i.e., the Gray code for 9 is repeated for 0, 8 for 1, 7 for 2, etc., in successive decades. Thus there is no ambiguity in reading a 9 for a 0 transition, but external logical circuits must be used to provide the nines complement of any digit in a decade which has been reflected in this way. Another
method of resolving the ambiguities of reading makes use of two brushes (or optical slits) for each zone and makes a logical choice of which one to read. In its most common form it is called the V-scan method (34). The Gray-coded disk, however, despite the fact that an external digital-to-digital conversion is required before the data are arithmetically usable, is simple and effective, and the additional circuitry hardly counts against its use now that transistors are available. The V-scan method, with at least two additional accurately placed brushes or slits for each zone, seems complicated and expensive by comparison.

A series of disk coders may, of course, be connected through gear trains. In this way, for example, two identical coders each with a resolution of 1 part in $2^{7}$ might be linked through a $2^{7}$:1 step-up gear train to provide a resolution of 1 part in $2^{14}$. Unit-distance codes will not resolve the intercoder ambiguity, but a V-brush arrangement, or a logical operation involving two redundant digits, can be used.

Optically read coded disks are free of the backlash and wear of commutator disks and associated gear trains. Moreover, being frictionless, the optical disk places an inertial load only on the shaft it is connected to. Commutator disks are available with precisions up to 1 part in $2^{10}$ and optical disks up to 1 part in $2^{16}$. Mathematical functions have been coded directly on disks, the most useful being sine and cosine (35). With such disks, the quantum size varies around the disk circumference, and gross reading errors can be avoided only by the use of unit-distance codes.

b. Phase-shift converters. In these converters, the shaft rotation shifts the phase of a signal with respect to a reference, and the phase shift is measured by accumulating clock pulses in an external counter. Converters of this kind using synchronous motors or resolvers have an rms precision as high as 1 part in $2^{13}$.
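Returning briefly to the coded disks of subsection a, the unit-distance property of the Gray code, and the digital-to-digital conversion it requires, can be illustrated by a minimal Python sketch. The function names and the 10-digit disk used in the check are illustrative assumptions and are not taken from any converter described here.

```python
def binary_to_gray(n: int) -> int:
    """Natural binary -> reflected (Gray) code."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Reflected (Gray) code -> natural binary, by XOR-ing g with all its right shifts."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def digits_changed(a: int, b: int) -> int:
    """Number of digit positions in which two code words differ."""
    return bin(a ^ b).count("1")


def worst_boundary(codes) -> int:
    """Largest number of digits changing across any zone boundary of a disk,
    including the boundary between the last sector and the first."""
    count = len(codes)
    return max(digits_changed(codes[i], codes[(i + 1) % count]) for i in range(count))


# Hypothetical 10-digit disk: 1,024 sectors.
natural_disk = list(range(1024))
gray_disk = [binary_to_gray(i) for i in natural_disk]
```

With natural binary, as many as all ten digits change at one boundary, so a reading taken there may be grossly wrong; with the Gray pattern, only one digit changes at every boundary, and the reading is in error by at most one quantum.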
Two recently developed versions of phase-shift converters have a precision of 1 part in $2^{18}$, which makes them the most precise automatic analog-to-digital converters known to the authors. One of these (36) uses a pair of capacitively coupled phonic wheels, each with 360 “teeth.” The stator of one is fixed, and the stator of the other is clamped to the shaft whose rotation is to be converted. Both wheels are spun at 1,000 rpm, and a roughly sinusoidal signal of 0.1-volt peak-to-peak amplitude and 6-kc frequency appears across each load. This arrangement averages out random layout errors of the 360 teeth in a way similar to that in multipolar resolvers. A coded disk aligned with the “teeth” and clamped to the shaft is used to register the number of degrees the shaft is displaced from its reference position. Fine resolution is made by using the zero crossings of the two signals to open and shut an electronic gate placed between a 6-Mc clock pulse source and a counter. The spinning toothed wheels are synchronized to the clock by driving them from signals obtained by frequency division of the clock frequency. Angles converted by this device have been compared with angles read from a high-precision theodolite; the difference, around a circle, had a standard deviation of 1 sec of arc.
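The figures quoted for the phonic-wheel converter can be checked with a short Python sketch. It assumes (this is our reading, not a statement in the source) that a shaft displacement of one tooth pitch shifts the phase of the signal by one full period, so that the gated clock count interpolates within one degree. All function and parameter names are hypothetical.

```python
import math


def phonic_wheel_figures(teeth=360, rpm=1000.0, clock_hz=6.0e6):
    """Return (signal frequency in c/s, clock counts per tooth pitch,
    quantum size in seconds of arc, quanta per revolution)."""
    signal_hz = teeth * rpm / 60.0            # one cycle per tooth passing
    counts_per_pitch = clock_hz / signal_hz   # clock pulses in one signal period
    pitch_deg = 360.0 / teeth                 # shaft angle of one tooth pitch
    quantum_arcsec = pitch_deg / counts_per_pitch * 3600.0
    quanta_per_rev = teeth * counts_per_pitch
    return signal_hz, counts_per_pitch, quantum_arcsec, quanta_per_rev


def shaft_angle_deg(whole_pitches, gated_count, teeth=360, rpm=1000.0, clock_hz=6.0e6):
    """Combine the coarse reading (whole tooth pitches, from the coded disk)
    with the fine reading (clock pulses passed by the gate)."""
    _, counts_per_pitch, _, _ = phonic_wheel_figures(teeth, rpm, clock_hz)
    pitch_deg = 360.0 / teeth
    return (whole_pitches + gated_count / counts_per_pitch) * pitch_deg
```

Under these assumptions the signal frequency is 6 kc, the gate passes 1,000 clock pulses per degree, and one revolution is divided into 360,000 quanta of 3.6 sec of arc each, that is, somewhat finer than 1 part in $2^{18}$.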
C. Digital-to-Analog Conversion

In ADP systems, digital-to-analog converters are usually used to display digital data in the form of graphs or to couple analog and digital computers. As one might expect, there is a reciprocity between coders and decoders. A system with the one in its feedback loop may be made to operate as the other. Of the many decoders that have been described in the literature, we shall mention only two kinds which are of general use in data-processing systems: the network decoder, which generates a voltage analog of digitally coded numbers, and decoders for use with digitally controlled plotting equipment.

FIG. 9. A network decoder. (Labeled elements: number register, reference voltage.)
1. Network Decoders. With these units, all the digits of the number must be available simultaneously for decoding. This is not a serious limitation, because a serial-to-parallel conversion of the digits can always be made, and the decoder itself is simple, precise, and fast. Let us represent the value of the kth digit in a number system of radix $r$ by $a_k r^k$, where $a_k = 0, 1, 2, \ldots, (r - 1)$. The decoder selects one of $r$ voltage or current levels for each digit $a_k$ and sums these with the appropriate weights $r^k$ in a resistance network whose output is the voltage analog of the number. If the number is represented in natural binary digits, only two voltage levels are needed, and the weighting resistance is switched to one level if the digit is a “1,” and to the other level if the digit is “0.” A complete decoder is illustrated in Fig. 9, where the feedback amplifier is a conventional operational amplifier of the kind used in electronic differential analyzers. If the
open-loop gain of the amplifier is much greater than unity, the output voltage is

$$-m\sum_{k=0}^{n-1} a_k 2^k = -m\left(a_0 + a_1 2 + a_2 2^2 + \cdots + a_{n-1} 2^{n-1}\right)$$

which is proportional to the number which was to be decoded. The precision of the decoder depends mainly on the stability of the resistance network, amplifier, and switches. Transistors of very low forward resistance are fast and efficient switches, and, once the number is set up in the register, the decoding speed is limited only by the resistance network time constants and the bandwidth of the amplifier. A precision as high as 12 bits can be had from network decoders of this kind.

FIG. 10. A digitally controlled plotter.
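The action of the binary network decoder can be sketched numerically in Python. The sketch assumes an ideal inverting summing amplifier in which digit $a_k$ switches a resistance $R/2^k$ to the reference voltage when it is 1 and to earth when it is 0, so that the constant of proportionality is $m = R_f V_{ref}/R$. The component values, and all names, are illustrative assumptions and are not taken from Fig. 9.

```python
def to_digits(number: int, n: int):
    """Natural binary digits a_0 .. a_(n-1) of number, least significant first."""
    return [(number >> k) & 1 for k in range(n)]


def network_decoder_output(digits, v_ref=10.0, r_unit=4.096e6, r_feedback=1.0e3):
    """Output voltage of an idealized binary-weighted resistance network
    feeding an operational amplifier of very high open-loop gain.

    digits     : a_0 .. a_(n-1), each 0 or 1, least significant first
    v_ref      : reference voltage switched in by a digit equal to 1
    r_unit     : weighting resistance of the least significant digit (assumed)
    r_feedback : feedback resistance of the amplifier (assumed)
    """
    current = 0.0
    for k, a_k in enumerate(digits):
        if a_k not in (0, 1):
            raise ValueError("binary digits must be 0 or 1")
        r_k = r_unit / (2 ** k)          # weight 2^k is realized as R / 2^k
        current += a_k * v_ref / r_k     # current into the summing junction
    return -r_feedback * current         # equals -m * (a_0 + a_1*2 + ...)
```

With the values assumed here, $m$ is 10/4096 volt per unit, so a 12-digit number at full scale (4095) gives an output just short of -10 volts, and the half-scale number 2048 gives -5 volts.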
2. Position Decoders. In the sort of data process we consider here, position decoders are likely to occur only in digitally controlled plotting equipment. In most cases they will consist of a coder in a feedback loop, and, since the output shaft will nearly always be voltage driven, they will be associated with a voltage decoder. (The exception occurs when incremental data is used to drive a stepping motor directly.) When it is not necessary to position the shaft, but merely to make a mark when the shaft is in a position corresponding to a stored number, the system of Fig. 10 can be used. The shaft rotates continuously, and a pulse is generated when the coded pattern is the same as the binary number stored. This pulse then energizes the raised spiral on the drum, and a mark is made on the paper along its Y coordinate at a position proportional to the stored number.
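The coincidence principle of the plotter of Fig. 10 can be imitated in a few lines of Python. This is only a sketch: it assumes that the coded pattern is the natural binary number of the sector under the reading station, and it represents the paper as a row of character cells. The function names are hypothetical.

```python
def plot_row(stored_number: int, n_bits: int = 6) -> str:
    """One revolution of the drum: a mark is made in the single sector whose
    coded pattern coincides with the stored number."""
    sectors = 2 ** n_bits
    row = []
    for sector in range(sectors):       # the shaft rotates continuously
        pattern = sector                # assumed natural-binary coded pattern
        row.append("*" if pattern == stored_number else ".")
    return "".join(row)


def plot(numbers, n_bits: int = 6):
    """One row of paper per stored number (the paper advances between rows)."""
    return [plot_row(value, n_bits) for value in numbers]
```

Each row contains exactly one mark, and its distance from the edge of the paper is proportional to the stored number; no positioning servo is needed.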
V. THE CALCULATION AND DISPLAY STAGE

This stage is the nucleus of most ADP systems in that it provides most of the facilities for simulating human actions. In particular, it effects automatic supervision and provides relevant results in an easily intelligible form. Much of the evaluation of the experiment can be performed in this stage although, as has been noted earlier, it is usual in scientific work for the experimenter to make the final decisions. The automatic computing equipment used can range from modified mechanical accounting machines to large-scale electronic digital and analog computers, the choice of a particular equipment being determined by the volume, rate, and kind of data delivered to it. Since similar remarks apply to the display equipment, the interrelation of data input, calculation, output, and display must be carefully studied for a balanced system (37). This study must include the human operators employed for monitoring, editing, and setting up equipment and should allow for their proneness to error. Further, means for editing, checking, and manipulating data without using the computer are often desirable, so that the output at any stage of processing should be suitable for automatic display, and data transfer (including editing) should not delay automatic monitoring and calculation.
A. Data Input, Output, and Display

Some typical forms of digital input, output, and display are given in Tables II and III together with transfer rates, packing densities, and some comments. Since most digital scientific processing systems do not need to work in real time (whence input/output also act as data stores), punched cards, perforated tape, and magnetic tape are frequently used for input and output. They can also provide a common language between different processing systems and display equipment, a factor which can be of major economic and engineering importance when designing a new system. It is usual to link these digital input/output units to the computer by buffer stores which permit overlapping of input, supervision, calculation, and output. These stores may be magnetic cores, magnetic drums, or arrays of bistable elements; they may be housed in the computer, in the individual units, or as separate entities, or they may be the immediate access store of the computer itself. The last offers a most acceptable and flexible solution for general use, particularly when block transfers of data are involved. While the input/output media for analog data are similar to those for digital data, the emphasis on recording and transfer techniques differs considerably. For example, essentially digital media such as perforated tape and punched cards are used via digital-to-analog converters operating in the time scale of the analog computer. Other media such as magnetic tape, film,
paper, etc., which are used to record analog data in the form of frequency, displacement, or intensity, are well known. It should be noted, however, that the analog data often have to be converted to another analog form for computer input or output. For this reason, and for ease of editing, recording, communication, and display, digital recording and transmission of analog data and digital input/output for analog computers should become common in data-processing systems.

The two common forms of display are graphs, which usually describe qualitative behavior on a compressed scale, and tabulations for detail. In scientific processing, the display is often produced “off-line” using the input/output media noted above. This allows the displayed data to be used for further automatic analysis and calculation if necessary. If “on-line” display is used, it is preferable to present the results as deviations from some norm, since this minimizes delays in processing and gives pertinent data in an acceptable form. Some typical displays have been given in Table III, and to these must be added devices such as typotrons, galvanometers, and digital plotters. There are also two additions to the existing range of units which would be most useful. The first of these is a fast printer with upper- and lower-case alphabet and easily interchangeable type; the second is a fast (10 to 20 points/sec), two-dimensional, typing plotter which would automatically annotate graphs as they were produced.

Since the kind of display depends on the design requirements of the system and the kind of publication process envisaged for the results, further detailed discussion is not warranted here. It is worth remarking, however, that one frequently sees high-speed printers feeding the results of an elaborate process almost directly into waste-paper baskets.
This is obviously unjustified, particularly in “off-line” processing, and emphasizes that display is just as important as any part of an automatic process. It follows, therefore, that automatic selection of significant results according to stored criteria should be provided as part of the normal display facilities for the ADP system (38, 39). Data-transfer and display devices, which are mainly electromechanical, are the weak links in ADP and require much more development. Otherwise, with possible developments in cryogenic and chemical stores, nearly all the unreliability, power, and bulk for any processing system will be concentrated in the input/output and display units and the full potential of ADP in science may not be realized.
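As a minimal sketch of what automatic selection according to stored criteria might amount to, the Python fragment below passes on for display only those results which deviate from a stored norm by more than a stored tolerance, and presents them as deviations. The criterion, the function name, and the sample figures are assumptions made for illustration; the systems cited may use quite different criteria.

```python
def select_significant(results, norm, tolerance):
    """Return (index, deviation) for every result whose deviation from the
    stored norm exceeds the stored tolerance; all other results are suppressed."""
    selected = []
    for index, value in enumerate(results):
        deviation = value - norm
        if abs(deviation) > tolerance:
            selected.append((index, deviation))
    return selected


# Hypothetical readings around a norm of 10.0 with a tolerance of 0.75.
readings = [10.0, 10.5, 12.0, 9.0, 10.25]
to_display = select_significant(readings, norm=10.0, tolerance=0.75)
```

Here only the third and fourth readings would reach the printer, as deviations of +2.0 and -1.0.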
B. Calculation

While there are many forms of computers which can be used in automatic processing, the most important for scientific systems are automatic
TABLE II. TYPICAL DIGITAL INPUTS

Columns: type; maximum rate (characters per sec); packing density (characters per sq in.); comments.

1. Manual keyboard or switches
   Maximum rate: 10
   Packing density: -
   Comments: Used mainly for presetting parameters, interrogation, monitoring, and manual insertion of information

2. Printed pages
   Maximum rate: 200
   Packing density: 100
   Comments: Basic reference document with character reading by optical means, or with magnetized or magnetizable ink used in printing for reading by multitrack heads: editing and correction simple manually, but font must be constant for each page

3. Punched cards
   Maximum rate: 800
   Packing density: 3-4
   Comments: Used as low-capacity store, but supply rate usually lower than indicated here, since data must be preset in size and position on each card: 3.5-, 80-, 90-, 120-column cards are standard with one column per character: easy to edit and correct manually but cards can be lost or disarranged

4. Perforated paper tape
   Maximum rate: 1,000
   Packing density: 10-13
   Comments: Used as low-capacity store, but errors difficult to correct without breaking tape: 5-, 6-, 7-, or 8-hole tapes are standard with one row per character: tape insures storage of data without fear of partial loss

5. Magnetic tape
   Maximum rate: 80,000
   Packing density: 500
   Comments: Used as large-capacity store and can be overwritten and reused as required: tape usually 0.5 to 1.0 in., or 8 to 16 channels, wide on 2,400-ft spools: editing and corrections performed by computer systems: if random access to data required, access time may be unduly long but this can usually be overcome in scientific processing

6. Photographic film
   Maximum rate: 200,000
   Packing density: 2,000
   Comments: Used as large-capacity store usually on 100-ft spools of 35-mm film: editing and correcting requires special photographic techniques with inherent delays

7. Digital inputs from buffer stores
   Maximum rate and packing density: Limited by computer cycle time and response of the mechanism feeding the buffer store
   Comments: These inputs are generally provided by any of the media above or by digital transducers or electrical and/or mechanical linkages: the buffer stores may range from immediate access magnetic cores down to low-speed, high-capacity, magnetic drums or disks: editing and corrections must be effected by the computer system
TABLE III. TYPICAL DIGITAL OUTPUTS

Columns: type; maximum rate (characters per sec); comments.

1. Graphs on paper
   (a) Incremental: Facsimile plotters with electrosensitive paper can offer automatic graticule generation and digital plotting of up to 20 points per sec to 0.1%
   (b) General: Two-dimensional graphs at up to 5 points per sec to 0.1% are possible: speeds are limited by acceleration and inertia problems for plotting head or by steady motion of paper past fixed head
   (c) Owing to slow speed, all plotters are normally used “off-line” with the inputs of Table II

2. Tabulation on paper
   (a) Character printers
       Maximum rate: 1,000
       Comments: Fast speeds obtained by using wire matrices to give alphabet and decimals, low speeds use type face: magnetizable inks can be used for re-input: can often be used “on-line” or as monitoring printer on computer
   (b) Line printers
       Maximum rate: 1,500
       Comments: Usually use up to 150 printing wheels in a line, each line printing 10 times per sec: each wheel has upper-case alphabet and decimal digits: magnetizable inks can be used for re-input: usually used “off-line” because fixed format required for each printing run
   (c) Page printers
       Maximum rate: 20,000
       Comments: Usually use CRT character display and xerographic printing: CRT character display may be upper-case alphabet and decimals in either line or page layout: normally used “on-line” and as monitor display

3. Film
   (a) Graphical display: Two-dimensional display on CRT photographed continuously or frame-wise: limited only by computer and film speeds: normally used “on-line” for high-frequency results
   (b) Page printing
       Maximum rate: 20,000
       Comments: As in 2c above: can be used for re-input to computer system

4. Digital stores
   (a) Perforated paper tape
       Maximum rate: 300
       Comments: Normal punching speeds 25 to 300 characters per second: see also Table II
   (b) Punched cards
       Comments: Usually 2 cards/minute: see also Table II
   (c) Magnetic tape
       Maximum rate: 80,000
       Comments: See Table II

5. Analog representation
   Maximum rate: Limited by computer cycle and digital-to-analog converter response
   Comments: Usually voltage, frequency, or current proportional to digital result for use in “on-line” equipment: may also be provided from buffer stores on request by external equipment such as analog computers
digital computers, analog computers (or simulators), digital differential analyzers (DDA), or combinations of these. Indeed, if it were not for automatic electronic digital computers, ADP would be of very restricted interest. The speed, versatility, and practically unlimited accuracy of these machines for both logical and arithmetical work have made them the central unit of most large-scale ADP systems. Many scientific systems reduce to just this type of computer with suitably selected input/output.

Progress in digital computers over the last five years (40, 41) has been very rapid in all branches. In components, the commercial production of ferrite cores and high-frequency transistors (42) has led to improvements in reliability and operating speeds while decreasing space and power requirements. With the expected introduction of superconducting and ferroelectric elements, and of fast-access, large-capacity chemical and solid-state stores (43), steady progress should continue and make ADP less expensive than it is at present. Probably the most noticeable development has been in the direction of larger, more complex, and more ambitious digital computer systems such as the LARC (44), SAGE (45), STRETCH (46), and TX-2. However, with the exceptions of “program interrupt” facilities and the use of existing computers to design more complex systems, it seems that no new principles of logical design have been introduced and that late developments have been elaborations or extensions made possible by advances in components. The main trends have been in the formation of computer systems designed as part of some large processing scheme. This has meant that more multistage operations and instructional flexibility have been built into the system instead of leaving them to be programed as earlier.
But the extremely complex logical problem involved in the programing of such equipment requires serious mathematical investigation, and one feels that a period of performance evaluation of these new systems is necessary before even more ambitious and more expensive projects are undertaken.

From the programing viewpoint, the techniques of microprograming (47), interpretive, compiler, and autocoding routines have removed much of the tedium of coding and debugging for these computers, while permitting efficient exchange of basic processing sequences between various machines and computer groups. It must be emphasized, however, that coding and debugging are only two phases of programing: they are preceded by mathematical description, numerical analysis, and system analysis of the problem. There have been some interesting attempts at programing models of learning (48), but any practical applications for programing itself are still remote. Also, autocoding techniques are helpful only in programing an immediate problem or one which has to be treated once only. Any program which is part of the base load of the computer (and this load should
be about 60% of the normal computer production time in a processing system) should be as efficient as possible. This means that continual refinements have to be made to any production program and there is a constant demand on programing and coding effort even with the most standard processing. Hence, autocoding techniques are limited in general processing, although they do permit a form of working system to be introduced quickly.

While it is apparent that digital computers have the speed, flexibility, and operational characteristics for most scientific ADP systems, there are some outstanding questions still open. What can be done to increase reliability and ease maintenance, what speed and amount of immediate access store are really needed, what is the most suitable machine instruction code for general use, and what operations should be supplied as hardware and how much should be left to the program? These questions must be answered before the next generation of digital computers with multiple instruction registers, variable word lengths, ultralarge capacity stores, and time-sharing of programs can be produced.

It may happen that the precision and flexibility of a digital computer are unnecessary or that direct physical simulation of part of the experiment is required in the calculation. In this case, electronic analog computers (or differential analyzers) can be used, although they are not usually flexible enough for general scientific processing systems because of their poor decision facilities, lack of logical operations, restricted input/output precision, and the necessity for working with time as the independent variable. Against this, particularly for certain processings, they possess the advantages of simplicity in setting up for long runs, ease in changing problem parameters, low capital cost, and their simulation property.
But when complex problems involving large-scale computers are considered, or when many different short runs have to be made in a given time, most of these advantages disappear in comparison with the equivalent digital processing system. Nevertheless, analog computers have a secure place in ADP, as is evidenced by their use for model analysis and on-line operation in many specialized fields. The main development in electronic analog computers and simulators has been the advent of the digital control unit (41). All major installations and manufacturers have now introduced methods of setting up, programing, checking, and reading out results by means of perforated paper tape and electric typewriters, and it seems certain that this trend will eventually include all forms of digital input/output. The inherent accuracy of analog machines has not changed appreciably, but stability and operational bandwidths have improved markedly. The union of analog and digital computers into one calculating facility has been achieved in some ADP systems (49), but it is doubtful whether this development will continue because of the
greatly increased speeds of digital machines and the trend toward digital input/output.

The digital differential analyzer (50) is a development which shows promise of replacing the existing analog computer in almost all of its ADP applications because it combines the precision of the digital computer with the ease of programing previously associated with the analog computer. For ADP use, it also possesses the advantages that the independent variable in integration need not be time, that speed is inversely proportional to the precision required for the calculation, and that megacycle clock rates are relatively easy to obtain for real-time simulation. It is, of course, possible for a DDA to be amalgamated with a digital computer into one computing system. But, as with the analog computer, the time-sharing and parallel programing problems indicate that a suitable digital computer system would be more useful in practice.

It appears, therefore, that digital computers will have increasing use in scientific ADP systems because the computing facility has to perform not only the required calculation, but also the automatic editing, monitoring, and correction of data. Another reason for this bias to digital machines is their potential application in automatic design and evaluation. While present mathematical techniques have rather limited application for the multiparameter, nonlinear problems of modern science, they can indicate optimum actions or designs for an experiment or analysis. With probable advances in combinatorial analysis (51), games theory, and discontinuous mathematics (52), this application will become increasingly important for large-scale research. It must be emphasized, however, that these techniques will not endow equipment with the abilities of the research worker: all they can do is suggest a connected line of approach which might otherwise be overlooked.

VI. EXAMPLES OF AUTOMATIC SYSTEMS

In this chapter we shall describe some techniques and systems which have been used successfully in processing data derived from a number of different kinds of experiments. Some of these systems are elaborate and complicated, but it should not be supposed that ADP is confined to large-scale experiments. It is common to find old and new scientific instruments fitted with electronic devices which acquire and record, even if they do not also check and evaluate, the instrument readings.
A. Guided-Missile Testing

The automatic processing of experimental data has probably been carried furthest in the flight testing of guided missiles. Although such tests are of restricted interest in themselves, they involve the observation and
measurement of a comprehensive range of physical quantities. These ADP systems and techniques, therefore, have applications in many kinds of scientific work.

1. Radio Doppler. In a Doppler tracking system (53) one is interested in determining a number of Doppler cycles as a function of time. These data are obtained from a set of at least three ground receivers and are sufficient, given an initial position, to determine the trajectory of a missile in flight. Because of the accurate and lengthy computation required, automatic digital processing of the primary Doppler data is employed on most missile ranges. In modern systems the output of the Doppler receivers is in the frequency range 0 to 50 kc and is recorded on magnetic tape. This tape is replayed at a convenient speed into automatic conversion equipment which records the measurements on a second magnetic tape in a form which may be entered directly into a digital computer.

The automatic digital measurement of the primary data may be done either by counting the number of Doppler cycles in successive fixed intervals of time or by measuring the successive time intervals occupied by a fixed number of Doppler cycles. The measurement of fractions of a cycle required by the first method has been evaded by frequency multiplication of the primary signal in a very wide-band multiplier. The second method is simple to mechanize, but the digital data are generated at unequal time intervals, which slightly complicates the calculation process. If a timing frequency is recorded on the tape together with the Doppler signals and is used to measure time, then the measurements will be independent of tape speed variations. A hybrid of both methods has also been described (54) in which the number of cycles in a fixed time interval is counted and is followed by the counting of timing pulses until the last cycle of the group is completed.
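The two basic methods of measuring the Doppler data can be compared in a short Python sketch. It assumes that the replayed signal has already been reduced to a list of cycle-start times expressed as whole numbers of timing pulses (so that tape speed does not enter); the function names and the sample figures are hypothetical.

```python
def cycles_per_interval(cycle_times, interval, n_intervals):
    """First method: count the Doppler cycles that begin in each of a
    succession of fixed time intervals. Fractions of a cycle are lost."""
    counts = [0] * n_intervals
    for t in cycle_times:
        index = t // interval
        if 0 <= index < n_intervals:
            counts[index] += 1
    return counts


def time_per_cycle_group(cycle_times, cycles_per_group):
    """Second method: measure the time occupied by each successive group of
    a fixed number of cycles. The end of one group is the start of the next,
    so no error accumulates from group to group."""
    marks = cycle_times[::cycles_per_group]
    return [later - earlier for earlier, later in zip(marks, marks[1:])]


# Hypothetical steady signal: one cycle every 300 timing pulses.
times = [300 * k for k in range(10)]
```

For this steady signal the first method, with intervals of 1,000 timing pulses, returns unequal counts of 4, 3, and 3 cycles because of the uncounted fractions, whereas the second method returns 900 timing pulses for every group of three cycles, although at instants which are not in general equally spaced.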
Since the Doppler signal is continuous, the end of one measurement must be the beginning of the next; otherwise, there would be an accumulation of error. Continuous measurement is effected by the use of two counters, one counting while the other is scanned. Another method is to “flash” the contents of one counter into an auxiliary shift register.

2. Kinetheodolite and Radar Data. Triangulation using kinetheodolites has provided much of the trajectory data of missile flights. The kinetheodolite provides a 35-mm film on each frame of which is the missile image and the azimuth and elevation of the line of sight as indicated by a set of cross hairs. These frames are then laboriously read by manually operated film readers to provide accurate azimuth and elevation and a boresight correction which allows for the operator’s inability to keep the missile image on the theodolite axis. The automatic measurement of the boresight correction, either in the
focal plane of the theodolite or indirectly from the film, has not yet been effected. However, the more recent shaft digitizers do permit recording of the azimuth and elevation shaft angles to the required precision. These can increase processing speed by a factor of two or three, but the boresight correction must still be done manually. The development of instrumentation radars (55) with shaft digitizers having a precision of up to one part in $2^{18}$ for slant range and direction angles makes possible an accurate automatic trajectory system. The outputs of these digitizers are usually recorded on magnetic tape with the elapsed time of observation and can be directly processed by a digital computer.

3. Telemetry. In modern telemeters (56), the number of independent data channels ranges from about 20 for small missiles to several hundred for large missiles, and the channel bandwidths range from a few cycles to a few kilocycles. As the transmission of a million or more data points to the ground in the course of 3 min of flight is common, automatic processing of the received data is mandatory if quantitatively assessed results are to be promptly available. For FM/FM, PAM/FM/FM, and PDM/FM telemeters, the data process begins with the recording of the total output of the telemetry receiver. If the process is “on-line,” the record is made concurrently with the rest of the processing. More often a magnetic tape record serves as a link between the receiving station and the ADP system, particularly if these are widely separated. Elapsed time is recorded on the tape together with a constant reference frequency which may be used to correct for difference in recording and replay speeds and for tape speed variations. The time- or frequency-shared channels are separated on replay, the former by a synchronizer and the latter by wave filters. At this point, the signals are demodulated and the uncalibrated data are displayed on a multichannel, direct-writing recorder.
This display permits quick manual selection (usually about 20-30% of the total) of the received data for further processing. From this point there is the choice of one of two kinds of processing. If computer time can be spared to correct for drifts and transducer nonlinearities, the selected data is digitized by counting the number of reference cycles in a fixed number of data cycles for PAM/FM and FM telemetry (57), or for the duration of the data pulse in PDM telemetry. This method of conversion is independent of tape speed. Continuous FM channels, which usually carry vibration data, are rarely digitized but are replayed from the tape into a harmonic analyzer.

In the alternative processing scheme (58), demodulators are used to correct for system drifts and for errors introduced by variations of tape speed. The drifts are compensated by comparison of the demodulated values of the periodically transmitted “fixed” levels with the known reference voltages used initially to establish them. Errors due to tape speed variations
can be removed by dividing the data by the reference signals on a cycle-by-cycle basis, and demodulators which approximate this process have been used. A simpler, partially correcting method is the subtraction of the reference voltage from the data voltage after demodulation. Calibration of the demodulated and drift-corrected signals against the calibration curves of the corresponding transducers is done with analog function generators such as the photoformer (59) or diode curve-fitting networks. The calibrated data from most telemeters are accurate to about 1% of full scale, and a high-quality direct-writing recorder may be used to display the results. Usually about 30% of the calibrated data will be used for digital computations, and these will be in the form of sample voltages from the demodulator. The samples are applied to a voltage digitizer whose numerical output is recorded on the input medium of the computer.
A variation used in the telemetry processing system for Vanguard (60) interchanges the position of the calibrator and the digitizer. All data are demodulated and digitized, and the digits are recorded on magnetic tape. The tape is replayed into a multichannel calibrator, and the corrected data are displayed or recorded on a second magnetic tape. The calibrator is a magnetic core matrix store in which each datum “looks up” its corresponding corrected value. The store has been “loaded” before the trial with calibration data prepared by a computer. The complicated setting-up procedure seems a high price to pay for the freedom from calibrator drift and errors which the system affords.
Digital (PCM) telemeters have not been used in missiles until recently (61). The development of small transistor and magnetic core sampling and encoding units has made it practicable to fit a PCM system into the restricted space of most guided missiles. The main advantage to be had from a PCM telemeter is that system drifts after the transducers do not affect the data.
It would be simplest if the receiver output could be recorded in a form ready for direct insertion into a computer or for the digital-to-analog converter used in the making of the “quick-look” display. In practice, some additional processes may be necessary, including serial-to-parallel conversion, change of bit rate, change of code, insertion of a time code, and change of format. For example, the AKT-14 PCM telemetry system encodes each of 32 channels to 10 bits 750 times a second, giving a rate of $24 \times 10^{4}$ bits/sec. With the possible exception of video magnetic recorders, no magnetic tape machines can record at this rate serially on one track. A serial-to-parallel conversion, which is easy enough to do, would therefore be required “on-line” before recording.
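The arithmetic in this example can be checked with a short sketch. The channel count, word length, and sampling rate are those quoted in the text; the function names and the choice of 10 parallel tracks are assumptions made here purely for illustration and are not details of the actual system.

```python
# Figures quoted in the text for the PCM telemetry example.
CHANNELS = 32
BITS_PER_SAMPLE = 10
SAMPLES_PER_SECOND = 750

# Serial bit rate: 32 channels x 10 bits x 750 samples/sec = 240,000 bits/sec.
serial_rate = CHANNELS * BITS_PER_SAMPLE * SAMPLES_PER_SECOND


def serial_to_parallel(bits, n_tracks):
    """Deal a serial bit stream across n_tracks tape tracks.

    Bit k goes to track k % n_tracks, so each track carries every
    n_tracks-th bit. (Hypothetical layout, assumed for illustration.)
    """
    if len(bits) % n_tracks != 0:
        raise ValueError("bit stream must fill a whole number of frames")
    return [bits[track::n_tracks] for track in range(n_tracks)]


def per_track_rate(rate, n_tracks):
    """Bit rate each track must sustain after the conversion."""
    return rate / n_tracks


if __name__ == "__main__":
    print("serial rate (bits/sec):", serial_rate)
    # Assumed: one 10-bit word written across 10 tracks at a time.
    print("rate per track (bits/sec):", per_track_rate(serial_rate, 10))
    print(serial_to_parallel([1, 0, 1, 1, 0, 0], 3))
```

Spreading the stream over several tracks divides the rate each track must handle, which is the reason the conversion is needed before recording.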
B. Flight Testing of Aircraft
The traditional method of gathering data on aircraft performance during flight tests was to photograph a panel of 30 or more dials or gauges and to record other data on a battery of oscillographs. As the manual
processing of these records was slow and tedious, it is not surprising that automatic techniques have been enthusiastically applied in this field. Basically, the processing problem is similar to that of missile telemetry in that in-flight observations of temperatures, pressures, flows, speeds, stresses, etc., have to be delivered to the ground, where appropriate editing and monitoring of data precedes calculation. Unlike missiles, however, aircraft are expected to return to earth intact and usually have enough space available for measuring and recording equipment. Hence, data can be recorded at their source in the aircraft simultaneously with telemetry transmission to the ground. The large number of systems which have been designed for this kind of processing record data in FM form (62) or digitize them before recording (63). One system (64) uses three methods according to the frequency of the data, namely, 10-bit analog-to-digital conversion for data up to 6 cps, FM for data between 6 and 1,000 cps, and direct recording above 1,000 cps. The ground equipment for these systems is much the same as for missile telemetry. However, only a very small amount of the available data is ever required for calculation and later analysis in these experiments.
C. Static Testing
Wind tunnel (65), engine, and rocket motor tests are of comparatively short duration, do not require miniaturized equipment, and yield data (except for vibration) at low frequency. Hence, many of the processing systems for these tests sample and time-multiplex the outputs of the numerous transducers, digitize them, and record them in an intermediate store. After the tests, the records are replayed into a “quick-look” display unit and selected data are entered into the computer. Data processing in these experiments has been relatively simple to mechanize, and many kinds of sampling and multiplexing techniques have been used to obtain digital data (66). Magnetic drums (67), cores (68), magnetic tape, and perforated tape have been used for temporary storage. When high-frequency or transient phenomena are examined, magnetic tape is employed with harmonic analyzers and direct-writing oscillographs as part of the replay facilities (69).
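The sample, time-multiplex, digitize, and store sequence described above can be sketched as follows. This is only an illustration under stated assumptions: the 10-bit word length, the 5-volt full scale, the round-robin scan order, and all function names are hypothetical and are not taken from any of the systems cited.

```python
def quantize(voltage, full_scale=5.0, bits=10):
    """Convert a voltage to an integer code (assumed 10-bit, 0..full_scale)."""
    levels = 2 ** bits
    clipped = min(max(voltage, 0.0), full_scale)
    return min(int(clipped / full_scale * levels), levels - 1)


def multiplex(transducers, n_frames, dt):
    """Scan the transducers in turn and return the stored records.

    transducers: list of callables mapping time (sec) to volts.
    dt: time between successive samples (one channel per sample slot).
    Each record is (frame number, channel number, digital code).
    """
    store = []  # stands in for the intermediate store (drum, cores, or tape)
    n_channels = len(transducers)
    for frame in range(n_frames):
        for channel, read in enumerate(transducers):
            t = (frame * n_channels + channel) * dt
            store.append((frame, channel, quantize(read(t))))
    return store


if __name__ == "__main__":
    # Two hypothetical slowly varying transducer outputs.
    sources = [lambda t: 2.5, lambda t: 1.0 + 100.0 * t]
    for record in multiplex(sources, n_frames=3, dt=0.001):
        print(record)
```

Because each channel is sampled only once per frame, the scheme suits the low-frequency data mentioned in the text; vibration channels would need a different treatment.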
D. Pulse-Counting and Measuring
The analysis of the frequency, amplitude, and duration of electrical pulses is a common requirement in many experiments. In nuclear physics, for example, automatic processing has been applied to pulse-height analysis (70) and neutron flight-time determination (71). Both of the systems digitize the quantity of interest and add unity to a location in a random-access magnetic-core store. The address of this location is determined by
the value of the digitized quantity. Thus, in one case, the pulse height is quantized to 1 part in 256 and there are 256 store locations of 16 bits, giving a maximum capacity of $2^{16}$ counts. “Quick-look” is available on a CRT as a histogram plot of counts (or duration of flight) against channel number. The contents of the store can be plotted or punched on paper tape as required for further analysis by a digital computer.
Another system uses similar methods for the analysis of extensive cosmic-ray showers (72). Here an array of Geiger counters is laid out in rows and columns with a proportional counter and paper-tape punch associated with each row. When a shower occurs, the tape provides a record of which counters in the row were fired and which of 10 levels was indicated by the proportional counter. These perforated tapes are then analyzed by a digital computer.
In the study of the transmission characteristics of a forward-scatter radio link (73), 5-μsecond pulses are transmitted at a repetition rate of 100 per sec and are sorted into one of 30 amplitude levels at the receiver. At the end of the sampling interval, the 30 counters associated with the amplitude levels are automatically read out onto adding machine tape or punched paper tape. Each readout also includes the time of day, setting of the receiver attenuator, and identification of the median (the highest level which contains fewer than 500 pulses in this case). The remaining steps in the automatic data process are obvious.
The flexibility of these pulse and magnetic tape recording techniques is illustrated by a miniature magnetic tape unit which records a digital code of the thermal radiation encountered by an artificial satellite (74). In this unit, the tape is driven forward in 0.004-in. steps and simultaneously winds a spring each time a binary digit is recorded. At appropriate times the satellite is interrogated from the earth by a radio signal and the spring ratchet is released.
This pulls the tape back over the reading head, allowing the data to be transmitted in condensed form and re-recorded at the ground station.
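The pulse-height analysis scheme described earlier in this section (digitize the pulse, then add unity to the store location addressed by the digitized value) can be sketched in a few lines. The 256 channels and 16-bit locations are from the text; the 10-volt full scale, the saturation of a full location, and the function names are assumptions introduced here for illustration.

```python
N_CHANNELS = 256           # pulse height quantized to 1 part in 256
MAX_COUNT = 2 ** 16 - 1    # largest value a 16-bit location can hold


def new_store():
    """One counter per channel, all initially zero."""
    return [0] * N_CHANNELS


def digitize(pulse_height, full_scale):
    """Map a pulse height in [0, full_scale] to a channel number 0..255."""
    if not 0.0 <= pulse_height <= full_scale:
        raise ValueError("pulse height outside analyzer range")
    channel = int(pulse_height / full_scale * N_CHANNELS)
    return min(channel, N_CHANNELS - 1)


def record(store, pulse_height, full_scale=10.0):
    """Add unity to the location addressed by the digitized pulse height.

    A location that is already full is left unchanged (assumed behavior).
    """
    channel = digitize(pulse_height, full_scale)
    if store[channel] < MAX_COUNT:
        store[channel] += 1
    return channel


if __name__ == "__main__":
    store = new_store()
    for height in (5.0, 5.01, 2.5, 9.99):
        record(store, height)
    # "Quick-look": list the non-empty channels as a crude histogram.
    for channel, count in enumerate(store):
        if count:
            print(channel, "*" * count)
```

The store itself is the histogram, so no sorting is needed after the run; the contents can be read out directly for plotting or further analysis.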
E. The Processing of Optical and Photographic Data
The reader will have noticed that little has been said about the automatic processing of data which are first accessible in the image plane of optical equipment or on photographic film. Apart from television and the automatic counting and sizing of particles (75), however, there is little progress in this field. The processing of optical data often involves the recognition of complex patterns, and manual reading, assisted by semiautomatic film readers, is still resorted to. The amount of information yielded by the photoelectric scansion of an image is usually very large and, as much of it is noise, is usually irrelevant. It is only recently that the automatic reading of typescript has been achieved (76), but here one has a short alphabet of
ideal characters always available for comparison. With the often dubious pictorial data of experiments, the experimenter does not know what data are relevant until he has inspected the picture, and his selection criteria are often invented ad hoc. The analysis of nuclear particle tracks is a case in point (77): once the observer has selected and measured significant data from the bubble chamber photographs, most of the detailed work can be done by a digital computer. The problem in this kind of experiment, therefore, is how to scan and measure the relevant parts of an image without using fast, uneconomically large data stores, and much development and investigation are still required to solve it.
VII. CONCLUSION
An automatic data processor for large-scale experiments is one of the most complex of artifacts. Once its purpose has been defined, which is a considerable task itself, detailed design may occupy several man-years of skilled effort and is followed by the still larger effort of construction and testing. But if an ADP system can simulate other physical systems, it can also simulate members of its own class and can do this in greater detail and at greater speed than is practicable manually. One can therefore conceive of much of the design, assembly, and testing of new systems being delegated to machines. Indeed, it is only a matter of time before this is accomplished and, even if it only leads to reduced costs, the applications of ADP will be broadened. Although ADP systems are generally more reliable than people in similar tasks, much remains to be done. So far improvements in the reliability of components seem to have kept pace with the increasing complexity of automatic systems, and sophisticated schemes for the control of error have not been necessary (78, 79). Preventive maintenance, marginal checking, and the repetition of readings and experiments are the usual ways of combating equipment failure.
Error control can be had by use of redundant parts (which make the equipment bulkier) or by the use of redundant error-correcting codes (80) (which make the equipment slower), but by the time adequate statistics on component failures have been obtained, the components are often outmoded. The general methods of experimental science have not altered since their invention centuries ago, and ADP is not likely to alter them radically. The prodigious increase in routine processing speeds which electronic machines have made possible can blind enthusiasts to the proper application of ADP, and their speed can be used to augment the production of scientifically trivial results by orders of magnitude. These systems should be employed to extend and exploit the familiar methods of scientific
practice in the detailed design of experiments, and in the performance of well-planned routine and repetitive tasks. ADP, therefore, should give the scientist more time for thinking and sharper results to think about.
ACKNOWLEDGMENT
The authors wish to thank the Chief Scientist, Australian Department of Supply, for permission to publish this paper.
REFERENCES
1. J. J. Stone, B. B. Gordon, and R. S. Boyd, Proc. Eastern Joint Computer Conf., Washington, D.C. p. 80 (1957).
2. S. Y. Wong and M. Kochen, AIEE Trans. 72, Part 1, 172 (1956).
3. M. Hertzberger, J. Opt. Soc. Amer. 41, 805 (1951).
4. N. Rochester, J. H. Holland, L. H. Haibt, and W. L. Duda, Trans. IRE IT-2, 580 (1956).
5. G. G. Alway, Proc. IEE 103, Part B, Suppl. No. 1, 12 (1956).
6. H. H. Goode and R. E. Machol, “System Engineering.” McGraw-Hill, New York, 1957.
7. W. R. E. Salisbury, ed. Conf. on Computers and Data Processing, South Australia (1957).
8. H. F. Mayer, Advances in Electronics 3, 111, 221 (1951).
9. C. D. Mee, Proc. IEE 106, Part B, 373 (1958).
10. R. M. Hayes and J. Wiener, Convention Record IRE Part 4, 205 (1957).
11. D. G. N. Hunter and D. S. Ridler, Electronic Eng. 29, 490 (1957).
12. E. Foss and R. S. Partridge, IBM J. Research & Development 1, 102 (1957).
13. E. D. Daniel, Proc. IEE 102, Part B, 442 (1955).
14. O. Kornei, Electronics 29 (11), 172 (1956).
15. E. L. Koller, IRE WESCON Part 5, 43 (1957).
16. R. L. Peshel, Tech. Information Brochure No. 2, Ampex Corp., Cal.
17. W. H. Foster, Convention Record IRE Part 1, 133 (1956).
18. G. B. Newhouse, Convention Record IRE Part 10, 86 (1955).
19. M. L. Van Doren, Electronics 27 (5), 232 (1954).
20. W. A. Farrand, IRE WESCON Part 4, 227 (1957).
21. J. W. Forrester, J. Appl. Phys. 22, 44 (1951).
22. W. J. Merz and J. R. Anderson, Bell Lab. Record 33, 335 (1955).
23. J. W. Crowe, IBM J. Research & Development 1, 294 (1957).
24. D. M. Baumann, J. Assoc. Computing Machinery 6, 76 (1957).
25. E. T. Whittaker, Proc. Roy. Soc. Edinburgh 36, 181 (1915).
26. C. E. Shannon, Proc. IRE 37, 10 (1949).
27. J. R. A. Beale, W. L. Stephenson, and E. Wolfendale, Proc. IEE 104, Part B, 394 (1957).
28. R. G. Baron, Proc. IRE 46, 21 (1957).
29. Consolidated Electrodynamics Corp., “Millisadic” Bull. No. 3003A.
30. F. H. Blecher, Bell System Tech. J. 36, 295 (1956).
31. B. D. Smith, Trans. IRE I-6, 155 (1956).
32. R. W. Sears, Bell System Tech. J. 27, 44 (1948).
33. A. J. Winter, Proc. Western Joint Computer Conf. p. 203 (1953).
34. J. B. Speller, IRE WESCON 29 (1954).
35. C. P. Spaulding, Trans. IRE I-6, 161 (1956).
36. L. G. de Bey and R. C. Webb, Convention Record IRE Pt. 5, 211 (1958).
37. H. Freeman, (Brit.) Comm. and Electronics 33, 588 (1957).
38. B. M. Gordon, Trans. IRE TRC-3, 5.1 (1957).
39. W. W. Hines, Proc. ISA Paper VM-2-57 (1957).
40. L. D. Whitelock, Proc. Eastern Joint Computer Conf., New York p. 9 (1956).
41. R. P. Castanias and J. E. Sherman, Trans. IRE EC-7, 65 (1958).
42. G. J. Prom and R. L. Crosby, Trans. IRE EC-6, 192 (1956).
43. W. N. Papian, Electronics 30 (10), 162 (1957).
44. J. P. Eckert, Proc. Eastern Joint Computer Conf., New York p. 16 (1956).
45. M. M. Astrahan, B. Housman, et al., IBM J. Research & Development 1, 76 (1957).
46. S. W. Dunswell, Proc. Eastern Joint Computer Conf., New York p. 20 (1956).
47. R. J. Mercer, J. Assoc. Computing Machinery 4, 157 (1957).
48. B. G. Farley and W. A. Clarke, Trans. IRE IT-4, 76 (1954).
49. W. F. Bauer and G. P. West, J. Assoc. Computing Machinery 4, 12 (1957).
50. W. W. Allen, J. Inst. Engrs. (Australia) 29, 255 (1957).
51. Am. Math. Soc., Proc. Symposium Appl. Maths. p. 6 (1956).
52. L. Brillouin, Information and Control 1, 1 (1957).
53. D. W. Kean, U.S. Navord Rept. No. 452 (1951).
54. P. M. Kinter and E. J. Armata, Trans. IRE I-6, 142 (1956).
55. E. A. Mechler, J. W. Porter, and R. O. Yavne, Natl. Conf. on Military Electronics, Washington, D.C. p. 213 (1958).
56. M. H. Nichols and L. L. Rauch, “Radio Telemetry,” 2nd ed. Wiley, New York, 1956.
57. G. E. Barlow, Natl. Telemetering Conf., Baltimore, Maryland p. 176 (1958).
58. G. F. Anderson, Trans. IRE TRC-2, 17 (1956).
59. C. P. Ballard, Trans. IRE I-2, 105 (1953).
60. D. H. Gridley and W. B. Poland, Natl. Telemetering Conf., Baltimore, Maryland p. 180 (1958).
61. K. A. Johnson, Convention Record IRE Part 5, 28 (1957).
62. G. Luecke and G. E. Sandgren, Trans. IRE TRC-3, 2.2 (1957).
63. R. S. Djorup, Natl. Telemetering Conf., Baltimore, Maryland p. 306 (1958).
64. B. W. Royce, Convention Record IRE Part 1, 129 (1956).
65. R. W. Kaisner, Proc. ISA Paper DHRD-1-57 (1957).
66. M. L. Klein, R. B. Rush, and H. C. Morgan, Electronic Eng. 29, 158 (1957).
67. L. Jaffe, Trans. IRE I-2, 31 (1953).
68. E. M. Sharp, Trans. IRE I-6, 186 (1957).
69. B. S. Fister and G. A. Woodcock, Trans. IRE I-7, 48 (1958).
70. R. W. Schumann and J. P. McMahon, Rev. Sci. Instr. 27, 675 (1956).
71. R. W. Schumann, Rev. Sci. Instr. 27, 686 (1956).
72. C. S. Wallace and M. H. Brennan, Conf. on Computers and Data Processing (W. R. E. Salisbury, ed.) South Australia (1957).
73. D. Eadie, Trans. IRE I-6, 234 (1957).
74. V. E. Suomi and R. J. Parent, Natl. Telemetering Conf., Baltimore, Maryland p. 186 (1958).
75. B. B. Morgan, Research 10, 271 (1957).
76. Engineering 183, 348 (1957).
77. Y. Goldschmidt-Clermont, Discovery 19, 148 (1958).
78. J. von Neumann, “Automata Studies,” C. E. Shannon and J. McCarthy, eds. Princeton University Press, Princeton, N. J., 1956.
79. D. J. P. Lipp, Trans. IRE RQC-10, 21 (1957).
80. D. Slepian, Bell System Tech. J. 36, 203 (1956).
GENERAL BIBLIOGRAPHY
W. C. Chaloner and W. O. Henderson, “Some aspects of the early history of automation.” Research 10, 335 (1957).
G. A. Korn and T. M. Korn, “Electronic Analogue Computers,” 2nd ed. McGraw-Hill, New York, 1956.
Various authors, “The Computer Issue.” Proc. IRE 41, (1953).
R. K. Livesly, “An Introduction to Automatic Digital Computers.” Cambridge University Press, London and New York, 1957.
R. K. Richards, “Arithmetic Operations in Digital Computers.” Van Nostrand, New York, 1955.
C. Hintae, “Mathematic vs physical simulation.” Proc. Flight Simulation Symposium, Houston, Texas p. 3 (1957).
J. A. Rajchman, “A survey of magnetic and other solid state devices for the storage of information.” Trans. IRE CP-4, 1210 (1957).
D. B. Breedon, “Analogue vs digital techniques for engineering design problems.” Trans. IRE PT-2, 86 (1957).
R. K. Richards, “Digital Computer Components and Circuits.” Van Nostrand, New York, 1957.
A. K. Susskind, ed., “Notes on Analogue-Digital Conversion Techniques.” Technology Press, M.I.T., Mass., 1957.
Electronic Engineering Co. California, “Project Datum,” 1957.
Operational Amplifiers
R. L. KONIGSBERG
Applied Physics Laboratory, The Johns Hopkins University, Silver Spring, Maryland
I. Introduction
II. The Basic Parallel Feedback-Operational Amplifier Network
III. The Laplace Transform and Transfer Function Concept
IV. The Analysis of the Parallel Feedback-Operatio...
V. Illustrations of Operations
B. Measurement Application
VII. Discussion of Errors
VIII. Developments in Operational Amplifier Design
A. Vacuum-tube Circuitry
B. Transistor Circuitry
I. INTRODUCTION
The operational amplifier is probably one of the most important electronic tools available today for analog computing purposes. It appears to have been introduced to the computer field by three investigators. Philbrick, in early 1938, used the operational amplifier in computing circuits for the solution of servomechanism problems (1). During World War II, Lovell and Parkinson of the Bell Telephone Laboratories introduced it in their work (1, 2), and, subsequently, it was employed with success in the Western Electric M-IX antiaircraft gun director computing circuits (3). In 1947, Ragazzini, Randall, and F. A. Russell reported on the use of the operational amplifier in computing circuits, after having their attention called to the circuits of the M-IX computer by J. B. Russell of Columbia University (3). Since then, the operational amplifier has been increasingly used in general and special electronic computing systems, and its development has more or less paralleled that of the electronic analog computer art. Some typical examples of present-day computers employing operational amplifiers are the Reeves REAC, Philbrick GAP-R, Boeing BEAC, Goodyear GEDA, Berkeley EASE, and Electronic Associates PACE systems. Reference 61 gives a more complete list of general-purpose computers.
Operational amplifiers are high-gain, carefully constructed d-c electronic
amplifiers and may be of the voltage or current (amplifier) types. When connected to certain electrical networks in a feedback arrangement,¹ the amplifier-network system is capable of performing on signals applied to its input the mathematical operations of differentiation and integration with respect to time, summation of the signals, multiplication by a negative constant, various combinations of these operations, and more complex operations. The feedback may be either of the series or parallel types, or combinations thereof. In general, feedback may also be negative or positive, according to whether the quantity fed back to the input opposes or assists, respectively, the original input signal.² Voltage amplification is generally accomplished with vacuum-tube circuitry, though recent trends point towards the use of transistor circuitry as well (5-10). There exists a type of magnetic amplifier (with no vacuum tubes or transistors therein) which also can be used as a voltage amplifier and in operational circuitry (11, 12). Because of its comparatively narrow bandwidth, however, it is not generally adapted for precision computer work. Another type of magnetic amplifier is capable of current amplification and, when connected to networks in a special manner, is capable of performing mathematical operations (13, 12). This type also has a comparatively narrow bandwidth. As we have seen above, there are several amplifier types and feedback configurations capable of being used for operational purposes. Herein we shall be mainly concerned with the system which is being used extensively in precision analog computer work today. It consists of the voltage amplifier-parallel feedback network with the feedback being negative. We shall not consider the comparatively narrow-band magnetic amplifier systems or those configurations employing current amplification. Also, feedback arrangements other than the parallel type will not be discussed.
However, the principles to be described in terms of the voltage amplifier-parallel feedback network are similar to those for the other systems; hence, there is no severe loss in generality by considering the proposed operational circuitry.
As used herein, the term “operational amplifier” will refer to an amplifier which basically is a high-gain, relatively high-input-impedance, phase-inverting, unidirectional, voltage-amplifying device. This amplifier will be employed with a parallel-feedback network to form an operational-type circuit. Unidirectional amplification implies amplification from input terminals to output terminals but not vice versa. The amplifier is generally capable of amplifying signals in the frequency range from 0 cps (d-c) to beyond several kc and is designed to exhibit low noise and low d-c drift properties.
¹ The term “feedback” as used herein means that a portion of the electrical energy from the output of a system is returned to its input and effectively added to or subtracted from the input signal. Feedback may be used to alter the properties of a system in a desirable manner.
² A discussion of feedback terms and principles may be found in Bode’s book (4). Parallel feedback, which will be discussed herein, implies that the feedback network and the amplifier are effectively connected in parallel as viewed at the system input by a source and at the system output by the load.
Our objectives here shall be:
1. To give the reader a general picture of the operational amplifier computing technique (Sec. II).
2. To indicate some of the basic mathematical concepts that lie behind the technique. In this regard we shall touch on the use of Laplace transforms and the transfer function concept (Sec. III).
3. To analyze the basic operational amplifier-parallel feedback network arrangement and arrive at the generalized operational transfer functions, employing the concept of 2 above (Sec. IV).
4. To illustrate, by examples, the types of operations which can be performed with the amplifier-network system. Such elementary operations as integration, differentiation, etc., will be considered (Sec. V).
5. To illustrate applications of operational circuits in computer work and in a measurement problem (Sec. VI).
6. To investigate the errors introduced by practical operational amplifiers on the performance of the operational circuit (Sec. VII).
7. To trace some of the developments in operational amplifier design (Sec. VIII).
II. THE BASIC PARALLEL FEEDBACK-OPERATIONAL AMPLIFIER NETWORK
Figure 1 illustrates the basic operational amplifier-network configuration. An operational amplifier is connected to a signal voltage source $E_1$ through an input impedance $Z_1$. A feedback impedance $Z_F$ is connected between the amplifier input terminal 1 and output terminal 3. This type of connection is referred to as “parallel feedback.” $E_o$ represents the output voltage of the amplifier-network system. $(-A)$ is the voltage gain of the amplifier, and the negative sign indicates that the amplifier output is 180° out of phase with its input. With $Z_F$ connecting the amplifier output and input terminals, as shown in Fig. 1, we have what is termed “negative feedback” (because a proportion of the output being fed back opposes the input). Considering only linear circuits herein, both $Z_1$ and $Z_F$, in general, contain combinations of linear, passive (i.e., containing no amplifying devices) network elements such as resistance, inductance, and capacitance. For convenience, and this is generally true in practice, one input terminal (terminal 2) and one output terminal (terminal 4) are grounded. All voltages or potentials, therefore, will be referenced to ground. Again
for convenience, we shall omit all ground terminals of the amplifier (i.e., terminals 2 and 4) in subsequent diagrams with the understanding that these terminals are, in reality, grounded. Figure 2, for example, shows the circuit of Fig. 1 with the amplifier ground terminals not shown. If the magnitude of the amplifier gain ($|A|$) is sufficiently high, the relation that defines $E_o$ in terms of the input $E_1$ and network impedances $Z_1$ and $Z_F$, as will be shown later, is
$$E_o = -\frac{Z_F}{Z_1}\,E_1 \tag{2.1}$$
FIG. 1. Basic operational amplifier circuit. (Diagram labels: signal source voltage, input impedance, feedback impedance, operational amplifier, output voltage.)
FIG. 2. Circuit of Fig. 1 with amplifier grounds not shown.
The factor $-Z_F/Z_1$ is commonly termed the “transfer function”; it defines, in effect, the ratio of output to input voltage $E_o/E_1$ for the particular network configuration. The transfer function is a function only of the linear, passive, network elements which are contained in $Z_1$ and $Z_F$. Hence, the output function $E_o$ in Eq. (2.1) may be made to conform as closely as we desire to some prescribed function by choosing precision network elements, keeping in mind that the amplifier gain should be sufficiently large to permit use of Eq. (2.1). As we shall see later, $E_1$ may correspond (but not be equal) to any practical time function defined for $t > 0$, with the restriction that the time function be identically zero for $t < 0$. $E_o$ will then correspond to some output time function. It is possible to adjust the transfer function, by means of the impedances $Z_1$ and $Z_F$, such that:
1. The output time function is a negative constant times the input time function.
2. The output (time function) is the time integral of the input (time function).
3. The output is the time derivative of the input.
4. The output is some complicated function of the input, other than those mentioned in 1, 2, and 3 above.
In effect, we have a circuit capable of performing many desired mathematical operations on an input time function. One other property of the circuit is worth mentioning now. Various combinations of the above operations may be performed on one or more input signals and then summed with the same amplifier. Figure 3 illustrates how this is done, and the output $E_o$ is defined as follows:
$$E_o = -\left(\frac{Z_F}{Z_1}\,E_1 + \frac{Z_F}{Z_2}\,E_2 + \cdots + \frac{Z_F}{Z_n}\,E_n\right)$$
FIG. 3. Basic summing circuit.
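The operations listed above can be tried numerically with a minimal sketch, assuming the ideal high-gain relation in which each input $E_k$ contributes $-(Z_F/Z_k)E_k$ to the output. The impedances are evaluated for sinusoidal signals at $s = j\omega$, which is an assumption made here for illustration; the component values and function names are hypothetical.

```python
def resistor(r_ohms):
    """Impedance of a resistor (independent of frequency)."""
    return complex(r_ohms, 0.0)


def capacitor(c_farads, omega):
    """Impedance of a capacitor at angular frequency omega, 1/(j*omega*C)."""
    return 1.0 / (1j * omega * c_farads)


def summing_output(inputs, z_f):
    """Ideal output of the summing circuit.

    inputs: list of (E_k, Z_k) pairs, one per input branch.
    z_f: feedback impedance.
    Assumes the amplifier gain is large enough for the ideal relation.
    """
    return -sum((z_f / z_k) * e_k for e_k, z_k in inputs)


if __name__ == "__main__":
    # Multiplication by a negative constant: Z_F = 1 megohm, Z_1 = 0.1 megohm.
    print(summing_output([(1.0, resistor(1e5))], resistor(1e6)))

    # Summation of two signals with different scale factors.
    print(summing_output([(1.0, resistor(1e6)), (2.0, resistor(5e5))],
                         resistor(1e6)))

    # Integration: Z_1 = 1 megohm, Z_F = 1 microfarad, so the transfer
    # function is -1/(j*omega*R*C), the negative of a time integral.
    omega = 2.0
    print(summing_output([(1.0, resistor(1e6))], capacitor(1e-6, omega)))
```

Interchanging the resistor and capacitor in the last case gives $-j\omega RC$, which corresponds to differentiation, so the same function covers items 1 to 3 of the list by a choice of network elements alone.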
Again, $Z_1$, $Z_2$, . . . , $Z_n$, and $Z_F$ are composed of linear, passive, network elements; each may be adjusted independently and as accurately as we desire. $E_1$, $E_2$, . . . , $E_n$ correspond to input time functions, while $E_o$ corresponds to some output time function. It can be argued that, to a certain extent, these operations can be performed by ordinary linear, passive electrical networks. For some applications such networks can be used, but for the great majority of computer applications there are many attendant disadvantages of the simple network approach which justify the existence of the operational amplifier
circuits. Some of the advantages and disadvantages of the operational amplifier technique may be listed as follows:
Advantages:
1. It can be shown that the effective input and output impedances of the amplifier are very low because of the parallel-type, negative-feedback connection (4); therefore, the input and output circuits are effectively isolated from each other. Operational amplifier-network circuits may therefore be cascaded without fear of interaction between the circuits. This is generally not true for passive networks.
2. Power, voltage, and current gains greater than unity for one or more input signals may be achieved with operational amplifier circuits. With physically realizable passive networks, only voltage or current gains are possible; power is attenuated.
3. Signals will be inverted; i.e., the negative of the input signal will be obtained at the output because of the phase reversal property of the operational amplifier.
4. Many signals may be operated upon simultaneously, and any one operation may be individually adjusted without affecting the others, if the amplifier gain is sufficiently large, by adjusting the associated input impedance ($Z_1$, or $Z_2$, . . . , or $Z_n$ in Fig. 3).
5. It can be shown that in the integrator application, computing time can be much longer for given network elements in the operational amplifier system than it can be for passive networks, assuming the same allowable computational errors in both cases (14).
Disadvantages:
1. Electronic voltage amplifiers and associated stable power-supply equipment are required. Thus, the equipment is much more complex compared with simple passive networks.
2. Electronic voltage amplifiers generally have drift and noise problems not associated with passive networks.
3. Careful design of the high-gain operational amplifier is required to insure stable performance (i.e., so that oscillations do not exist).
4. The dynamic range of the input signal is limited by the capabilities of the amplifier output stage. Excessive inputs cause limiting of the output and a deterioration in the accuracy of the operation.
It is possible to cascade a passive circuit with an amplifier for the purpose of achieving a net gain equal to that for an operational amplifier circuit. It will be found, however, that the magnitude of the output for this connection will be sensitive to gain variations, and care must be taken to closely control the gain against environmental influences if the desired
OPERATIONAL AMPLIFIERS
accuracy of the operation is to be maintained. On the other hand, in the operational amplifier type of circuit, large amplifier gain variations can be tolerated without affecting the accuracy of the operation if a certain minimum gain is maintained. To achieve the gain stabilization required for the passive circuit-amplifier configuration, compensation circuitry will be required, which, in general, will exceed in complexity that of the operational amplifier.
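The insensitivity to gain variations claimed above can be illustrated numerically. The sketch below is not taken from the text: it assumes a single resistive input $R_1$ and feedback resistor $R_F$, an amplifier of gain $-A$ with infinite input impedance and zero output impedance, and uses the resulting closed-loop gain $-(R_F/R_1)/[1 + (1 + R_F/R_1)/A]$. The function names and component values are illustrative only.

```python
def closed_loop_gain(r_f, r_1, a):
    """Gain e_o/e_1 of the parallel-feedback stage for an amplifier gain of -a.

    Assumes infinite amplifier input impedance and zero output impedance.
    """
    ideal = -r_f / r_1
    return ideal / (1.0 + (1.0 + r_f / r_1) / a)


def passive_plus_amplifier_gain(k, g):
    """Gain of a passive network (attenuation k) cascaded with an amplifier of gain -g."""
    return -k * g


if __name__ == "__main__":
    r_f, r_1 = 1.0e6, 1.0e5  # ideal operational gain of -10
    for a in (1.0e4, 2.0e4):  # amplifier gain doubles
        print("A =", a, "closed-loop gain =", closed_loop_gain(r_f, r_1, a))
    for g in (100.0, 200.0):  # same doubling with no feedback
        print("G =", g, "cascade gain =", passive_plus_amplifier_gain(0.1, g))
```

Under these assumptions, doubling the amplifier gain moves the closed-loop gain by well under one part in a thousand, whereas the passive circuit followed by an amplifier sees its output double.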
III. THE LAPLACE TRANSFORM AND TRANSFER FUNCTION CONCEPT
It is appropriate here to dwell on some mathematics leading to an understanding of the transfer function concept. By so doing, the reader should then be placed in a better position to comprehend the operational amplifier-network technique. By applying the mathematics, we shall arrive at results which will be recognized as straightforward extensions of ordinary linear steady-state a-c circuit theory. The reader with a basic knowledge of the latter should then be capable of deriving different types of transfer functions and “operations,” depending on his choice of network constants. Later, we shall see some typical applications.
In electronic analog computer work, it is often desirable to perform the mathematical operations of integration or differentiation, with respect to time, on a given input function of time. In this field it is quite common practice to deal with the operator $p$, which denotes “$d(\ )/dt$” (i.e., differentiation with respect to time, where $t$ = time), and the operator $1/p$, which denotes “$\int(\ )\,dt$” (i.e., integration with respect to time). In the electronic engineering field, however, it is becoming common practice to employ functions of the Laplace transform complex frequency variable $s$ ($= \sigma + j\omega$, where $\sigma$ and $\omega$ are real variables and $j = \sqrt{-1}$) in dealing with these same operations. Just how this is accomplished we shall see a little later. $s$ might itself be looked upon as an “operator” similar in characteristics to $p$, but in reality it is a complex variable.
Functions involving the complex variable $s$ are obtained from functions of the real time variable $t$ through a special functional transformation called the “Laplace transformation.” This transformation is effected in the following way. Consider a function of time $f(t)$ defined for $t > 0$, with $f(t) = 0$ for $t < 0$, and where $f(t)$ meets certain other restrictive conditions (required for convergence of an integral).
We shall not go into these restrictions here; suffice it to say that most practical engineering functions meet these conditions.³ $f(t)$ then has a Laplace transform $F(s)$ which is unique (i.e., only
³ The reader interested in learning more concerning the mathematical details of the Laplace transform may consult refs. 15 and 16. Gardner and Barnes (15) give an excellent treatment from both a mechanical and electrical point of view.
R. L. KONIGSBERG
one such transform exists; conversely, for every $F(s)$ only one such $f(t)$, in most practical cases, corresponds thereto) and is defined as follows:
$$F(s) = \mathcal{L}[f(t)] = \int_0^{\infty} f(t)\,e^{-st}\,dt \tag{3.1}$$
where $\mathcal{L}[f(t)]$ is a shorthand notation for the Laplace transform of $f(t)$. That the restriction $f(t) = 0$ for $t < 0$ causes no loss in the treatment of most practical problems follows from the fact that in these problems we are generally interested in what happens after a particular instant of time. Thus, we can always choose the zero on our time scale to coincide with that particular instant. We then confine our interest to positive time values (see p. 101 in Gardner and Barnes, 15). In what now follows, it will be assumed that we are dealing with linear network elements.⁴ It is our desire now to introduce the concept of “transfer function” which we shall frequently employ.
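As an illustration of the defining integral of the Laplace transform, $F(s) = \int_0^\infty f(t)e^{-st}\,dt$, the following sketch approximates it by the trapezoidal rule on a finite interval and compares the result with the well-known pair $e^{-at} \leftrightarrow 1/(s+a)$. The function name `laplace_numeric`, the truncation time, and the step count are assumptions made for the example, not part of the text.

```python
import cmath
import math


def laplace_numeric(f, s, t_max=50.0, n=200_000):
    """Approximate the integral of f(t)*exp(-s*t) over [0, t_max] by the trapezoidal rule.

    The upper limit is truncated at t_max, so the integrand must have decayed
    to a negligible value by then (assumption of this sketch).
    """
    h = t_max / n
    total = 0.5 * (f(0.0) + f(t_max) * cmath.exp(-s * t_max))
    for k in range(1, n):
        t = k * h
        total += f(t) * cmath.exp(-s * t)
    return total * h


if __name__ == "__main__":
    a = 2.0
    s = complex(1.0, 3.0)  # a point in the complex s plane, s = sigma + j*omega
    approx = laplace_numeric(lambda t: math.exp(-a * t), s)
    exact = 1.0 / (s + a)
    print("numerical:", approx)
    print("1/(s+a):  ", exact)
```

The same routine applied to the unit step ($f(t) = 1$ for $t \ge 0$) with a real positive $s$ approaches $1/s$.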
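[Figure 4. Top, time domain: electrical network with input $e_1(t)$ and output $e_o(t)$. Bottom, complex frequency domain: transfer function $T(s)$ with input $E_1(s)$ and output $E_o(s) = T(s)E_1(s)$.]
FIG. 4. Transformation from the time to complex frequency domain.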
The term “transfer function” takes on meaning when we consider some of the implications of the Laplace transformation. In Fig. 4 at the top we have shown a box containing an electrical network (which may consist of combinations of linear passive elements such as resistance, inductance, and capacitance) to which an input time function $e_1(t)$ (= the cause) has been applied and from which an output time function $e_o(t)$ (= the effect) is derived. If we now apply Kirchhoff's fundamental current and voltage laws to the network,⁵ and solve for $e_o(t)$ in terms of $e_1(t)$ and the network constants, we will become involved, in general, in a set of differential equations with many unknowns. $e_o(t)$ is one such unknown for which we desire a solution.
⁴ A linear network element is one for which the behavior can be described mathematically by a first-degree equation. See ref. 16, p. 3ff., for a discussion of this point.
⁵ Kirchhoff's current law states that the algebraic sum of instantaneous currents entering a network junction point equals the sum of the currents leaving the point. The voltage law states that the sum of the instantaneous voltage drops around any closed path, taken in a specified direction, is zero (see pp. 25, 26 in Gardner and Barnes, 15).
A solution could be obtained by employing the operator $p$ and then solving a set of linear, simultaneous algebraic equations. The same problem, however, can be solved by adopting a somewhat different approach, namely, by employing the Laplace transformation. Briefly, to solve by the method of Laplace transforms, the Laplace transform is found for each term [including the term involving the input function $e_1(t)$] in each equation of the set of differential equations, in accordance with the defining equation, Eq. (3.1). When this is done, we end up with a set of linear, simultaneous, algebraic equations involving functions of the complex variable $s$, the network constants, the Laplace transform of $e_1(t)$, denoted by $E_1(s)$, and that of the unknown $e_o(t)$, denoted by $E_o(s)$. When we solve for $E_o(s)$ (say, by determinants) and if all initial conditions within the network itself are set equal to zero⁶ (if they are not, the actual solution for $E_o(s)$ can be found with but minor modifications), a comparatively simple result is obtained:
$$E_o(s) = T(s)\,E_1(s) \tag{3.2}$$
Equation (3.2) is algebraic and obeys all the algebraic rules. $T(s)$ is a function of $s$ involving only the network constants. Solving for $T(s)$ in Eq. (3.2), we have
$$T(s) = \frac{E_o(s)}{E_1(s)} \tag{3.3}$$
We define $T(s)$ here as the operational transfer function. For any given passive network configuration, it is known, fixed, and independent of the type of signal applied to the network input.⁷ Once we have chosen an input time function (no matter how complicated, but subject to the restrictions noted above to permit the use of the Laplace transform), $T(s)$ permits us to calculate the Laplace transform of the output, $E_o(s)$, by Eq. (3.2). $T(s)$, in effect, “operates” on $E_1(s)$ to yield $E_o(s)$. The “operation” is, however, purely algebraic in the complex frequency domain; i.e., the term $T(s)E_1(s)$ is the algebraic product of $T(s)$ and $E_1(s)$. In effect, the time domain diagram of Fig. 4 (top) has been transformed to the complex frequency domain diagram of Fig. 4 (bottom). In Fig. 4, $e_1(t)$ corresponds directly to $E_1(s)$, $e_o(t)$ to $E_o(s)$, and the complicated electrical network in the time domain has in effect been transformed to one in the complex frequency domain. This transformation then permits the determination of a function, namely, $T(s)$, which, when algebraically multiplied by $E_1(s)$, yields $E_o(s)$. In the time domain, the network elements “operate” on the input time function to yield the output. In the complex frequency domain, the entity $T(s)$ “operates” on the Laplace transform of the input time function to yield the Laplace transform of the output function.
⁶ Setting the initial conditions equal to zero corresponds to zero energy storage in the network elements; that is, the network is in a “quiescent state.”
⁷ Page 132 in Gardner and Barnes (15) refers to $T(s)$ as the system function, since it incorporates in one function all the essential knowledge regarding the physical system (network, in this case).
Just what operation is being performed by the network elements in the time domain can be ascertained by finding $e_o(t)$, which corresponds to the inverse Laplace transform of $E_o(s)$. To find $e_o(t)$, knowing $E_o(s)$ by Eq. (3.2), we must find the inverse Laplace transform of $E_o(s)$, denoted by $\mathcal{L}^{-1}[E_o(s)]$. That is, we must attempt to answer the question: “What time function $e_o(t)$ has a Laplace transform given by $E_o(s)$?” There are several ways of finding the inverse transform, the most general of which involves integration in the complex $s$ plane.⁸ But the easiest method is to consult a table of transform pairs, i.e., a table which lists various common (and uncommon) time functions $f(t)$ encountered in practice and the corresponding Laplace transforms $F(s) = \mathcal{L}[f(t)]$. Fortunately, the uniqueness property in Laplace transform theory (15) assures us that for each $f(t)$ there is only one $F(s)$, and for each $F(s)$ there is only one $f(t)$ for most practical functions. Given $f(t)$, we consult our table of transform pairs and find $F(s)$. Conversely, given $F(s)$, we consult our table and find $f(t)$. A representative table has been prepared and appears in Table I.⁹ Tables similar to this can easily be made up by assuming time functions encountered frequently and finding their corresponding Laplace transforms in accordance with the defining Eq. (3.1). Note that in Table I, integration in the time domain corresponds to multiplying the transform of the original function by $(1/s)$ (see transform pairs Nos. 1 and 3).
Also, differentiation in the time domain corresponds to multiplying the transform of the original function by $s$, provided the initial value of the time function is zero (see transform pairs Nos. 1 and 2). Note the correspondence between these statements and those made with respect to the operator $p$; i.e., multiplication by $(1/p)$ denotes integration, and multiplication by $(p)$ denotes differentiation, both with respect to time. The derivation of transform pair No. 8 is a bit more complicated than the others; those interested therein should consult the references (see, e.g., Gardner and Barnes, 15, p. 228 ff.). The value of the Laplace transform approach lies in the fact that when $s = j\omega$ ($j = \sqrt{-1}$, $\omega$ = real angular frequency = $2\pi f$, where $f$ = real fre-
⁸ Those readers interested in the details of this should consult Goldman (16). Chapters 3 and 7 give a reasonably good background on the subject.
⁹ For additional tables of transform pairs see Gardner and Barnes (15), pp. 334-356.
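TABLE I. TRANSFORM PAIRS

No.   Time function ($t \ge 0$)                                   Laplace transform
1     $f(t)$                                                      $F(s)$ (Note 2)
2     $df(t)/dt$                                                  $sF(s) - f(0+)$ (Note 3)
3     $\int_0^t f(\tau)\,d\tau$                                   $F(s)/s$
4     $u(t)$ (Note 4)                                             $1/s$
5     $\delta(t)$ (Note 5)                                        $1$
6     $af(t)$ (Note 6)                                            $aF(s)$
7     $e^{-at}$                                                   $1/(s+a)$
8     $\int_0^t f_1(\tau)f_2(t-\tau)\,d\tau$ (Note 7)             $F_1(s)F_2(s)$

NOTES: 1. $f(t) = 0$, $t < 0$. 2. $F(s)$ = …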
$$\ldots \gg 1 \tag{4.13}$$
and we see immediately in Eqs. (4.10) and (4.11) that
$$0 \cong |\epsilon(j\omega)| \ll |E_1(j\omega)| \tag{4.14}$$
$$0 \cong |\epsilon(j\omega)| \ll |E_o(j\omega)| \tag{4.15}$$
In operational form, the relations (4.14) and (4.15) become, with $j\omega = s$,
$$0 \cong |\epsilon(s)| \ll |E_1(s)| \tag{4.16}$$
$$0 \cong |\epsilon(s)| \ll |E_o(s)| \tag{4.17}$$
From relations (4.14) and (4.15) we see a very important property of the parallel feedback-operational amplifier system. Over the useful frequency range of interest where $|1 - A(j\omega)|$ is sufficiently large, the error voltage $\epsilon(j\omega)$ is essentially zero. Thus, amplifier terminal 1 in Fig. 6a is essentially at ground potential. It follows, then, that, for all intents and purposes, the ends of impedances $Z_1(j\omega)$ and $Z_F(j\omega)$ connected to amplifier terminal 1 are essentially shorted to ground; i.e., the currents that flow out of these ends are essentially short-circuit currents. From this result a more general interpretation may be placed on $Z_1(j\omega)$ and $Z_F(j\omega)$. The latter quantities previously were each assumed to be two-terminal networks. We can now generalize by saying that $Z_1(j\omega)$ and $Z_F(j\omega)$ can be four-terminal networks, each with two input terminals and two output terminals, provided these impedances are defined in a special manner. If we, for convenience, ground one input and one output terminal of each of these impedances, the four-terminal networks become, in effect, three-terminal affairs. Thus, we can imagine a situation such as shown in Fig. 7, where $Z_1(j\omega)$ and $Z_F(j\omega)$ each represent four-terminal networks connected to the operational amplifier and signal source as shown. $I_{ss1}(j\omega)$ and $I_{ssF}(j\omega)$ represent the short-circuit currents that effectively flow from the $Z_1(j\omega)$ and $Z_F(j\omega)$ networks, respectively, towards the operational amplifier input terminal 1. We now generalize our definition of $Z_1(j\omega)$ and $Z_F(j\omega)$ to mean (3, 17)
$$Z_1(j\omega) = \frac{E_1(j\omega)}{I_{ss1}(j\omega)} = \text{steady-state a-c short-circuit transfer impedance of input network} \tag{4.18}$$
$$Z_F(j\omega) = \frac{E_o(j\omega)}{I_{ssF}(j\omega)} = \text{steady-state a-c short-circuit transfer impedance of feedback network} \tag{4.19}$$
In Fig. 7, the following relationship must hold true, assuming that the amplifier itself draws no input current, by Kirchhoff's current law:
$$I_{ss1}(j\omega) = -I_{ssF}(j\omega) \tag{4.20}$$
From Eqs. (4.18), (4.19), and (4.20), it follows that
$$T(j\omega) = \frac{E_o(j\omega)}{E_1(j\omega)} = -\frac{Z_F(j\omega)}{Z_1(j\omega)} = \text{steady-state a-c transfer function for operational amplifier system} \tag{4.21}$$
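where $Z_1(j\omega)$ and $Z_F(j\omega)$ are now defined by Eqs. (4.18) and (4.19), respectively. With $j\omega = s$, Eq. (4.21) becomes
$$T(s) = \frac{E_o(s)}{E_1(s)} = -\frac{Z_F(s)}{Z_1(s)} = \text{operational transfer function for operational amplifier system} \tag{4.22}$$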
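For two-terminal elements the short-circuit transfer impedance reduces to the ordinary impedance, so the ratio $-Z_F/Z_1$ can be evaluated directly with complex arithmetic. The following is a minimal sketch under that assumption; the helper names `resistor`, `capacitor`, and `transfer`, and the 1-megohm and 1-µf values, are illustrative choices rather than anything prescribed by the text.

```python
def resistor(r):
    """Impedance function Z(s) = R of a two-terminal resistance."""
    return lambda s: complex(r)


def capacitor(c):
    """Impedance function Z(s) = 1/(sC) of a two-terminal capacitance."""
    return lambda s: 1.0 / (s * c)


def transfer(z_f, z_1, s):
    """Transfer function T(s) = -Z_F(s)/Z_1(s), valid for sufficiently large amplifier gain."""
    return -z_f(s) / z_1(s)


R1, RF = 1.0e6, 1.0e6    # ohms (assumed values)
C1, CF = 1.0e-6, 1.0e-6  # farads (assumed values)
s = 1j * 1.0             # s = j*omega with omega = 1 rad/sec

gain_constant = transfer(resistor(RF), resistor(R1), s)    # -RF/R1
integrator = transfer(capacitor(CF), resistor(R1), s)      # -1/(R1*CF*s)
differentiator = transfer(resistor(RF), capacitor(C1), s)  # -RF*C1*s

if __name__ == "__main__":
    print(gain_constant, integrator, differentiator)
```

With these values $R_1C_F = R_FC_1 = 1$ sec, so at $\omega = 1$ rad/sec each of the three transfer functions has unit magnitude and they differ only in phase.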
where $Z_1(s)$ and $Z_F(s)$ are defined by Eqs. (4.18) and (4.19), respectively, with $j\omega = s$. Thus, because of the virtual ground at the amplifier input terminal 1, we have arrived at a more general interpretation of $Z_1(j\omega)$ and $Z_F(j\omega)$. The two-terminal network representation for each of the impedances $Z_1(j\omega)$ and $Z_F(j\omega)$ is then seen to be a special case of the four-terminal representation for each of these quantities. Finally, we can generalize further by applying many input driving functions $E_1(j\omega), E_2(j\omega), \ldots, E_n(j\omega)$ to their respective four-terminal networks represented by $Z_1(j\omega), Z_2(j\omega), \ldots, Z_n(j\omega)$, as in Fig. 8. The latter quantities are defined as the steady-state a-c short-circuit transfer impedances as follows:
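$$Z_k(j\omega) = \frac{E_k(j\omega)}{I_{ssk}(j\omega)}, \qquad k = 1, 2, \ldots, n \tag{4.23}$$
where $I_{ssk}(j\omega)$ is the short-circuit current flowing out of $Z_k(j\omega)$ at the amplifier input terminal 1. For $j\omega = s$, we have the operational short-circuit transfer impedances as follows:
$$Z_k(s) = \frac{E_k(s)}{I_{ssk}(s)}, \qquad k = 1, 2, \ldots, n$$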
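As a consequence of Kirchhoff's current law,
$$I_{ss1}(j\omega) + I_{ss2}(j\omega) + \cdots + I_{ssn}(j\omega) = -I_{ssF}(j\omega)$$
so that
$$E_o(j\omega) = -Z_F(j\omega)\left[\frac{E_1(j\omega)}{Z_1(j\omega)} + \frac{E_2(j\omega)}{Z_2(j\omega)} + \cdots + \frac{E_n(j\omega)}{Z_n(j\omega)}\right] \tag{4.27}$$
FIG. 8. Operational amplifier system with four-terminal networks and (n) input signals.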
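If we let $j\omega = s$, Eq. (4.27) becomes
$$E_o(s) = -Z_F(s)\left[\frac{E_1(s)}{Z_1(s)} + \frac{E_2(s)}{Z_2(s)} + \cdots + \frac{E_n(s)}{Z_n(s)}\right] \tag{4.28}$$
We now define the generalized operational transfer function $T_k(s)$ for signal $E_k(s)$:
$$T_k(s) = -\frac{Z_F(s)}{Z_k(s)} \tag{4.29}$$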
$T_k(s)$ is a function of $Z_k(s)$ and $Z_F(s)$ and is independent of operational amplifier gain as long as the magnitude of the gain is sufficiently large. Each $T_k(s)$ may be adjusted, independently of the others, as accurately as desired by using precision network elements to make up $Z_k(s)$. For accuracy, $Z_F(s)$ must also be precision-built, since $Z_F(s)$ is common to all $T_k(s)$. Thus, the operational amplifier system effectively isolates the input signals from each other. With $T_k(s)$ in Eq. (4.29) substituted in Eq. (4.28), we have finally
$$E_o(s) = T_1(s)E_1(s) + T_2(s)E_2(s) + \cdots + T_n(s)E_n(s) \tag{4.30}$$
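For purely resistive networks the sum of products $T_k(s)E_k(s)$ holds instant by instant in the time domain, which makes the independence of the individual gains easy to see. The sketch below assumes resistive $Z_k = R_k$ and $Z_F = R_F$ with a sufficiently large amplifier gain; the function names and values are hypothetical.

```python
def input_gains(r_f, input_resistances):
    """Gain constants T_k = -R_F/R_k for purely resistive input and feedback networks."""
    return [-r_f / r_k for r_k in input_resistances]


def summer_output(r_f, inputs):
    """Output e_o = sum of T_k * e_k, where inputs is a list of (R_k, e_k) pairs."""
    return sum(-(r_f / r_k) * e_k for r_k, e_k in inputs)


if __name__ == "__main__":
    r_f = 1.0e6
    print(input_gains(r_f, [1.0e6, 5.0e5]))
    print(summer_output(r_f, [(1.0e6, 2.0), (5.0e5, -0.5)]))
    # Changing R_2 alters only the second gain; the first is untouched.
    print(input_gains(r_f, [1.0e6, 2.5e5]))
```

Changing one input resistance changes only the gain for that signal, while changing $R_F$ rescales all of them together, in keeping with the remark that the feedback network is common to every $T_k(s)$.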
The individual $T_k(s)$ in Eq. (4.30) may have magnitudes greater than unity, thereby yielding signal gains rather than losses (as is usually the case for passive network operations without the benefit of an operational amplifier). The network elements which are combined to make up the various $Z_k(s)$ and $Z_F(s)$ may be resistance, inductance (including mutual inductance), and capacitance. For precision operations, precisely adjusted network elements are required. In practice, fairly precise resistances and essentially lossless capacitances (i.e., containing very little effective resistance in series with the capacitance) can be constructed in large values (1 megohm and 1 µf, respectively). It is difficult, however, to produce the precise, large-valued, lossless inductances that would be required for low-frequency computer work (where the operational amplifier finds its greatest use). Hence, the tendency is to avoid the use of inductance as a computing element.
V. ILLUSTRATIONS OF OPERATIONS
Let us now consider some typical operations which can be performed by the parallel feedback-operational amplifier system. Figure 9 illustrates two-terminal network configurations for both $Z_1(j\omega)$ and $Z_F(j\omega)$. Figure 10 illustrates a configuration where $Z_1(j\omega)$ is a four-terminal network and $Z_F(j\omega)$ a two-terminal network. Here, $Z_1(j\omega)$ and $Z_F(j\omega)$ may be defined by Eqs. (4.18) and (4.19), respectively. Figure 11 illustrates a cascade arrangement of operational amplifier systems for producing an output signal which has the same sign as that of the input. Figure 12 is an arrangement which finds much use in servomechanism work.
FIG. 9. (a) Multiplication by negative gain constant: $e_o(t) = -\dfrac{R_F}{R_1}\,e_1(t)$, $t \ge 0$. (b) Integration: $e_o(t) = -\dfrac{1}{R_1C_F}\displaystyle\int_0^t e_1(\tau)\,d\tau$, $t \ge 0$. (c) Differentiation, for $t \ge 0+$: $e_o(t) = -R_FC_1\,\dfrac{de_1(t)}{dt}$, provided $e_1(0+) = 0$. (d) Multiplication by negative gain constants and summation for $t \ge 0$: $e_o(t) = -\dfrac{R_F}{R_1}\,e_1(t) - \dfrac{R_F}{R_2}\,e_2(t)$.
FIG. 11. Cascaded operational networks.
In all cases, the magnitude of the operational amplifier gain is assumed to be sufficiently large so that Eq. (4.9) and the more generalized Eq. (4.30) apply. In Fig. 9a, the output signal is a negative gain constant times the input. By Eq. (4.7)
$$T(s) = -\frac{R_F}{R_1} \tag{5.1}$$
Inserting this value of $T(s)$ in Eq. (4.9), we have
$$E_o(s) = T(s)E_1(s) = -\frac{R_F}{R_1}\,E_1(s) \tag{5.2}$$
Taking the inverse Laplace transform of the terms on the left and right sides of Eq. (5.2), making use of transform pairs Nos. 1 and 6 in Table I, we have
$$e_o(t) = -\frac{R_F}{R_1}\,e_1(t) \tag{5.3}$$
where $e_o(t) = \mathcal{L}^{-1}[E_o(s)]$, $e_1(t) = \mathcal{L}^{-1}[E_1(s)]$
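$e_1(t)$ in Eq. (5.3) may be any practical time function subject to the restriction that $e_1(t) = 0$ for $t < 0$. A special case occurs when $R_F = R_1$. Then
$$e_o(t) = -e_1(t) \tag{5.4}$$
FIG. 12. Proportional plus integral operation:
$$e_o(t) = -\frac{R_F}{R_1}\,e_1(t) - \frac{1}{R_1C_F}\int_0^t e_1(\tau)\,d\tau = -\left[\frac{R_F}{R_1}\,e_1(t) + \frac{1}{R_1C_F}\int_0^t e_1(\tau)\,d\tau\right]$$
where the first term in the brackets is the proportional term and the second is the integral term.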
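Thus, the output signal is the negative of the input; i.e., a sign inversion has been performed on the input signal. In Fig. 9b, the output signal is a negative gain constant times the time integral of the input signal. By Eq. (4.7)
$$T(s) = -\frac{1}{R_1C_Fs} \tag{5.5}$$
Inserting this value of $T(s)$ in Eq. (4.9), we have
$$E_o(s) = T(s)E_1(s) = -\frac{1}{R_1C_Fs}\,E_1(s) = -\frac{1}{R_1C_F}\left[\frac{E_1(s)}{s}\right] \tag{5.6}$$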
Taking the inverse Laplace transform of the terms on the left and right sides of Eq. (5.6), making use of transform pairs Nos. 1, 3, and 6 in Table I, we have
$$e_o(t) = -\frac{1}{R_1C_F}\int_0^t e_1(\tau)\,d\tau \tag{5.7}$$
where $e_o(t) = \mathcal{L}^{-1}[E_o(s)]$, $e_1(t) = \mathcal{L}^{-1}[E_1(s)]$
$e_1(t)$ in Eq. (5.7) may be any practical time function subject to the restriction that $e_1(t) = 0$ for $t < 0$. The “operation” being performed on the input signal by this circuit is obviously time integration and multiplication by a negative gain constant. If, for example, the input function $e_1(t)$ is the unit step function $u(t)$, then, since $u(t) = 1$ for $t > 0$, it follows from Eq. (5.7) that
$$e_o(t) = -\frac{1}{R_1C_F}\int_0^t u(\tau)\,d\tau = -\frac{1}{R_1C_F}\int_0^t (1)\,d\tau = -\frac{t}{R_1C_F} \tag{5.7a}$$
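The step-input result can be checked by evaluating the integral of Eq. (5.7) numerically. The sketch below assumes an ideal integrator (infinite amplifier gain, zero initial charge on the capacitor); the function name `integrator_output` and the values $R_1$ = 1 megohm, $C_F$ = 1 µf are illustrative.

```python
def integrator_output(e1, t_end, r1, cf, n=1000):
    """Evaluate e_o(t_end) = -(1/(R1*CF)) * integral of e1 over [0, t_end].

    Uses the trapezoidal rule with n equal steps; e1 is a function of time.
    """
    h = t_end / n
    area = 0.0
    for k in range(n):
        area += 0.5 * (e1(k * h) + e1((k + 1) * h)) * h
    return -area / (r1 * cf)


def unit_step(t):
    """Unit step u(t), taken as 1 for t >= 0 in this sketch."""
    return 1.0 if t >= 0.0 else 0.0


if __name__ == "__main__":
    r1, cf = 1.0e6, 1.0e-6  # R1*CF = 1 sec
    for t in (0.5, 1.0, 2.0):
        print("t =", t, "e_o =", integrator_output(unit_step, t, r1, cf))
```

The printed outputs fall on the straight line $-t/(R_1C_F)$, i.e., the ramp described by Eq. (5.7a).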
That is, the output $e_o(t)$ is a linear function of time, namely, a ramp function. The allowable integration time $t$ in Eq. (5.7) depends on the accuracy desired and is a function of the operational amplifier gain. The greater the gain, the longer the integration time may be for a prescribed accuracy (14). In Fig. 9c, the output is a negative gain constant times the time derivative of the input signal, provided the initial condition associated with $e_1(t)$ is zero. By Eq. (4.7)
$$T(s) = -R_FC_1s \tag{5.8}$$
Inserting this value of $T(s)$ in Eq. (4.9), we have
$$E_o(s) = T(s)E_1(s) = -(R_FC_1s)E_1(s) = -R_FC_1[sE_1(s)] \tag{5.9}$$
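Setting $s = j\omega$ in the factor $-R_FC_1s$ of Eq. (5.9) gives a steady-state gain magnitude of $R_FC_1\omega$, which grows with frequency, whereas the integrator factor $-1/(R_1C_Fs)$ gives $1/(R_1C_F\omega)$, which falls. The short sketch below compares the two; the function names and component values are assumptions made for the illustration.

```python
def differentiator_gain(r_f, c_1, omega):
    """Magnitude of T(j*omega) = -R_F*C_1*(j*omega)."""
    return r_f * c_1 * omega


def integrator_gain(r_1, c_f, omega):
    """Magnitude of T(j*omega) = -1/(R_1*C_F*(j*omega))."""
    return 1.0 / (r_1 * c_f * omega)


if __name__ == "__main__":
    r, c = 1.0e6, 1.0e-6  # time constant of 1 sec for both circuits
    for omega in (1.0, 10.0, 100.0, 1000.0):
        print(omega, differentiator_gain(r, c, omega), integrator_gain(r, c, omega))
```

A disturbance at 1000 rad/sec is thus passed by the differentiator with a gain a thousand times that applied to a 1 rad/sec signal, while the integrator attenuates it by the same factor.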
Taking the inverse Laplace transform of the terms on the left and right sides of Eq. (5.9), making use of transform pairs Nos. 1, 2, and 6 in Table I, we have
$$e_o(t) = -R_FC_1\,\frac{de_1(t)}{dt} \tag{5.10}$$
where $e_o(t) = \mathcal{L}^{-1}[E_o(s)]$, $e_1(t) = \mathcal{L}^{-1}[E_1(s)]$
and where it has been assumed that the initial condition $e_1(0+) = 0$. $e_1(t)$ in Eq. (5.10) may be any practical time function subject to the restrictions that $e_1(t) = 0$ for $t < 0$ and $e_1(0+) = 0$. The “operation” being performed on the input signal by this circuit is obviously time differentiation and multiplication by a negative gain constant.
Differentiation is not often employed in practice. An inspection of its steady-state a-c transfer function $T(j\omega)$, given by Eq. (5.8) with $s = j\omega$, reveals that it increases directly with angular frequency $\omega$. This means that any high-frequency noise components contained in $E_1(j\omega)$ will be greatly amplified and more than likely overload the amplifier. Fortunately, in analog computer work it is easy to avoid differentiators in favor of integrators, which do not suffer this disadvantage.
In Fig. 9d, the output is the summation of negative gain constants times the respective input signals. By Eq. (4.29)
$$T_1(s) = -\frac{R_F}{R_1} \tag{5.11}$$
$$T_2(s) = -\frac{R_F}{R_2} \tag{5.12}$$
Inserting these values of $T_1(s)$ and $T_2(s)$ in Eq. (4.30), we have
$$E_o(s) = -\frac{R_F}{R_1}\,E_1(s) - \frac{R_F}{R_2}\,E_2(s) \tag{5.13}$$
Taking the inverse Laplace transform of each term on the left and right sides of Eq. (5.13), making use of transform pairs Nos. 1 and 6 in Table I, we have
$$e_o(t) = -\frac{R_F}{R_1}\,e_1(t) - \frac{R_F}{R_2}\,e_2(t) \tag{5.14}$$
where $e_o(t) = \mathcal{L}^{-1}[E_o(s)]$, $e_1(t) = \mathcal{L}^{-1}[E_1(s)]$, $e_2(t) = \mathcal{L}^{-1}[E_2(s)]$
$e_1(t)$ and $e_2(t)$ in Eq. (5.14) may be any practical time functions subject to the restrictions that $e_1(t) = 0$, $e_2(t) = 0$ for $t < 0$. The “operation” being performed on the input signals is summation after multiplication by the respective negative gain constants. If $R_1 = R_2 = R_F$, Eq. (5.14) reduces to
$$e_o(t) = [-e_1(t)] + [-e_2(t)] = -[e_1(t) + e_2(t)] \tag{5.15}$$
and the “operation” is strictly a summation process with signal sign inversion. In Fig. 10, a more complicated operation is being performed on $e_1(t)$ than any of the previous operations.¹⁰ If one calculates $Z_1(j\omega)$ and $Z_F(j\omega)$ in accordance with the defining Eqs. (4.18) and (4.19), one finds the values to be as shown in Fig. 10.
¹⁰ This particular operation is described in Bradley and McCoy's article (17).
By Eq. (4.22) the operational transfer function becomes
Inserting this value of $T(s)$ in Eq. (4.30), we have
Equation (5.17) may be rewritten in the following form by a partial fraction expansion:¹¹
where
Now, each term on the right of Eq. (5.18) represents the product of two functions of $s$. To find $e_o(t)$, we need only apply transform pairs Nos. 6 and 8 in Table I, where
$$\mathcal{L}[e_1(t)] = E_1(s), \qquad \mathcal{L}[e_o(t)] = E_o(s)$$
When this is done, $e_o(t)$ is
$$e_o(t) = Ke^{-(2/R_1C_1)t}\int_0^t e^{(2/R_1C_1)\tau}\,e_1(\tau)\,d\tau - Ke^{-(1/R_FC_F)t}\int_0^t e^{(1/R_FC_F)\tau}\,e_1(\tau)\,d\tau \tag{5.19}$$
where $K$ is the constant given in Eq. (5.18). From Eq. (5.19), it is apparent that the operation being performed on $e_1(t)$ is indeed quite complicated. Again, $e_1(t)$ may be any practical time function where $e_1(t) = 0$ for $t < 0$. In Fig. 11, the output signal is a positive gain constant times the integral of the input. The positive sign is produced by cascading the integrating operational amplifier network with another operational amplifier network having an effective operational transfer function of $(-1)$. Thus, if the circuit of Fig. 9b is cascaded with that of Fig. 9a, we have the result shown in Fig. 11. Figure 12 illustrates how the operation commonly termed “proportional + integral” is accomplished on an input time function. This operation is often used in servomechanism work.
¹¹ An explanation of the partial fraction expansion may be found in almost any college algebra text. See, e.g., Palmer and Miser (18), Chapter 19. Combining the two terms of Eq. (5.18) into one which has a common denominator results in Eq. (5.17).
Many other operations and transfer functions are possible, and those readers interested therein should consult the references. See, for example, Korn and Korn (19, p. 415ff) and Bradley and McCoy (17, pp. 147-148). Shumard (20, pp. 534-564) describes an application of a high-gain d-c amplifier with a form of parallel feedback slightly different from that described herein. With this arrangement, he is able to synthesize the transfer functions for very low-frequency low-pass, high-pass, and band-pass filter characteristics using only resistance-capacitance (R-C) networks (no inductances required) and one high-gain d-c amplifier. Bridgman and Brennan (21) indicate another type of parallel feedback configuration, employing a d-c operational amplifier, with which they are able to synthesize the transfer functions for second-order systems. Only R-C networks are used, and no inductances are required. L. Goldberg (22, pp. 128-131) describes a method of obtaining either positive or negative outputs with one operational amplifier employing the basic parallel-feedback arrangement. The amplifier employs a differential amplifier input stage, making available two isolated input terminals to which signals may be applied. The gain from one input terminal to the output is positive, while that from the other input terminal to the output is negative. A list of mathematical operations, available through use of this amplifier, is given. For a typical commercial amplifier, the Philbrick Applications Manual on the GAP/R K2 Series of computing amplifiers (23) lists a number of typical operations possible with the Philbrick amplifiers.
Hellerman (5) and Kerfoot (9) describe a different type of feedback arrangement which is designed to operate with current, rather than voltage, amplifiers. The amplifiers are viewed as “current operational amplifiers” (as distinguished from the “voltage operational amplifiers” employed generally in computing work), and the general operational transfer functions on a current basis are derived. Since transistors are basically current-operated devices, the intent is to explore the possibility of using the current amplifier approach for transistor utilization. Those interested in the operations which can be performed with magnetic amplifiers should consult Sack et al. ( I S ), Hubbard (11), and Patton (12). The Airpax Bulletin No. 221 on the “Ferrac” (24) also lists some typical operations which can be performed with this particular type of computing amplifier.
VI. APPLICATIONS OF OPERATIONAL AMPLIFIERS IN COMPUTING AND MEASURING SYSTEMS
The attempt will be made here to illustrate the use of operational amplifiers in a computing application and in a measuring problem.
A. Computing Application
Consider the equation which describes the motion of a spring-mass system with damping proportional to the velocity as shown in Fig. 13. As can be shown (25), this equation is [for an assumed driving function $f(t) = u(t)$, where $u(t)$ is the unit step function],
$$M\frac{d^2x}{dt^2} + K_1\frac{dx}{dt} + Kx(t) = f(t) = u(t)$$
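One way to see how such an equation lends itself to integrators is to solve it for the highest derivative, $d^2x/dt^2 = [u(t) - K_1\,dx/dt - Kx]/M$, and integrate twice. The sketch below does this numerically with small time steps. It is only an illustration of the principle: the sign inversions of actual operational amplifier integrators are ignored, and the function name `simulate_spring_mass`, the step size, and the parameter values are assumptions, not the set-up of the text.

```python
def simulate_spring_mass(m, k1, k, t_end, dt=1.0e-3):
    """Integrate M*x'' + K1*x' + K*x = u(t) from rest and return x(t_end).

    Each pass of the loop mimics a summer followed by two integrators in cascade.
    """
    x = 0.0  # displacement, zero initial condition
    v = 0.0  # velocity, zero initial condition
    steps = int(round(t_end / dt))
    for _ in range(steps):
        a = (1.0 - k1 * v - k * x) / m  # summer: form the highest derivative (u(t) = 1)
        v += a * dt                     # first integrator: acceleration to velocity
        x += v * dt                     # second integrator: velocity to displacement
    return x


if __name__ == "__main__":
    # Illustrative values: M = 1, K1 = 2, K = 4 (an underdamped system)
    for t in (1.0, 5.0, 20.0):
        print("t =", t, "x =", simulate_spring_mass(1.0, 2.0, 4.0, t))
```

For a unit step of force the displacement settles at $1/K$, the static deflection of the spring, which gives a simple check on the result.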
FIG. 13. Spring-mass system with frictional damping. $M$ = mass. $x$ = distance of center of mass from equilibrium position. $f_s$ = spring force = $Kx$, $K$ = spring constant. $f_d$ = frictional damping force = $K_1\,dx/dt$, $K_1$ = damping constant. $f(t)$ = external driving force applied to system. Equation of motion: $M\,\dfrac{d^2x}{dt^2} + K_1\,\dfrac{dx}{dt} + Kx = f(t)$.
$$\lim_{|A(j\omega)| \to \infty} (F_1) = 1$$
¹⁹ Meneley and Morrill (38, p. 1488) give the functions $F_1$, $F_2$, and $F_3$ under the condition that $Z_g = \infty$, $Z_o = 0$, $Z_L = \infty$.
where
Under the conditions represented by Eq. (7.3), Eq. (7.2) becomes
The conclusion to be drawn here is that an indefinitely large increase in the magnitude of the amplifier gain does not eliminate the noises caused by $E_N(j\omega)$ and $I_N(j\omega)$. The reduction of the contributions due to $E_N(j\omega)$ and $I_N(j\omega)$ at the amplifier output terminals has been of major concern to the designers of electronic operational amplifiers. In the next section (Sec. VIII), we shall see what steps have been taken by various investigators to minimize these contributions. As mentioned previously, the chief constituent of $E_N(j\omega)$ appears to be the d-c drift caused by changes in the terminal characteristics of the first-stage tube for vacuum-tube amplifiers and of the first-stage transistor for transistor amplifiers. $I_N(j\omega)$, on the other hand, is mainly due to changes in the d-c grid current drawn by the input tube of vacuum-tube amplifiers and changes in the input current to the first-stage transistor of transistor amplifiers. The other noise constituents of $E_N(j\omega)$ and $I_N(j\omega)$ usually can be made small, in comparison with the main effects mentioned above, by careful design of the amplifier (i.e., by proper layout and shielding, by the use of stable, internal passive elements, and by the use of well-regulated power supplies).
An inspection of $F_2$ in Eq. (7.3) reveals the fact that if $Z_F(j\omega), Z_1(j\omega), \ldots, Z_n(j\omega)$, and therefore $Z_p(j\omega)$, are all held fixed, then $F_2$ will increase as $Z_g(j\omega)$ decreases. Thus, lowering the input impedance $Z_g(j\omega)$ of the amplifier increases the noise contribution in the amplifier output caused by $E_N(j\omega)$, in accordance with Eq. (7.4), which represents the infinite gain case. This is also true for finite amplifier gains. If, in Eq. (7.2), the parameters $Z_F(j\omega), Z_1(j\omega), \ldots, Z_n(j\omega)$ are held fixed and the magnitude of $A(j\omega)$ is assumed to be some large but finite value, then an analysis will show that $F_1$ is complex (i.e., contains real and imaginary terms) in general and its magnitude tends to decrease from unity as the magnitude of $Z_o(j\omega)$ increases, $Z_g(j\omega)$ decreases, or $Z_F(j\omega)$ decreases [relative to $Z_o(j\omega)$]. Hence, the operations on the desired signals tend to become more inaccurate as the latter parameters vary in the manner described, since for accurate operations $F_1$ should be both real and unity in value. Thus, amplifier design should be such that the magnitude of $Z_g(j\omega)$ is kept high over the useful frequency range. If $Z_g(j\omega)$ cannot be made large, then the accuracy may be maintained by decreasing $Z_p(j\omega)$ and increasing $|A(j\omega)|$.²⁰ If the magnitude of $Z_o(j\omega)$ is small compared with each of the magnitudes of $Z_F(j\omega)$ and $Z_L(j\omega)$, the error in an operation caused by loading at the output terminals [which is due to $Z_F(j\omega)$ as well as $Z_L(j\omega)$] will be small compared with other errors. These conditions are not difficult to achieve in practice.
For accurate operations, the magnitude of the gain $-A(j\omega)$ should be sufficiently large over the intended frequency of operation of the amplifier so that $F_1$ in Eq. (7.2) is as close to being real and unity in value as desired. As an example, consider the negative gain constant operation with just one applied signal source $E_1(j\omega)$ and signal source impedance $Z_1(j\omega)$ (refer to Fig. 16). Let
$$Z_1(j\omega) = R_1, \qquad Z_F(j\omega) = R_F, \qquad Z_g(j\omega) = R_g, \qquad A(j\omega) = \frac{A_{dc}}{1 + j\omega\tau}, \qquad \tau = \frac{1}{\omega_c} \tag{7.5}$$
where $R_1$, $R_F$, $R_g$ = resistances, $\omega_c$ = amplifier bandwidth in radians/sec, and $A_{dc}$ = d-c gain of amplifier.
Also, assume that the amplifier output impedance $Z_o(j\omega)$ is sufficiently small and may be neglected in the following analysis (this may easily be achieved in practice). Then, it may be shown that the transfer function $T_1(j\omega)$, taking into consideration the amplifier parameters, becomes
where
Solving for $E_o(j\omega)$ in Eq. (7.6), we have
$$E_o(j\omega) = -\frac{R_F}{R_1}\,E_1(j\omega)\left[\frac{a}{j\omega + b}\right] \tag{7.7}$$
The desired output function $E_{od}(j\omega)$ is, on the other hand,
$$E_{od}(j\omega) = -\frac{R_F}{R_1}\,E_1(j\omega) \tag{7.8}$$
²⁰ This may be the case in transistor amplifier design (see Hellerman, 39).
By comparing Eqs. (7.7) and (7.8), we see that the error factor $F_1$ of Eq. (7.2) is
$$F_1(j\omega) = \frac{a}{j\omega + b} = \frac{a}{b}\cdot\frac{1}{1 + j\dfrac{\omega}{b}} \tag{7.9}$$
where the quantities $a$ and $b$ are as defined in Eq. (7.6). In a real frequency analysis, i.e., in a steady-state a-c circuit analysis at a prescribed frequency $\omega_o$, the error factor $F_1$ consists of two parts, namely, the factors $(a/b)$ and $[1/(1 + j\omega_o/b)]$. The first factor is a constant independent of $\omega_o$, while the second is dependent thereon. Having chosen a particular operating frequency $\omega_o$, we may wish to choose $a$ and $b$ in such a way that the magnitude of the actual output differs from that of the desired output by no more than a prescribed fractional error $\alpha$. If this is the case, we must ensure that the magnitude of $F_1$ does not differ from unity by more than the fractional error $\alpha$. If
+
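The error-factor test described above can be sketched numerically. The following Python fragment is only an illustration: the expressions used for a and b are an assumption obtained from an ordinary nodal analysis of the inverting circuit with the single-pole gain of Eq. (7.5) and zero output impedance, not a transcription of Eq. (7.6), and all function names and component values are hypothetical.

```python
import math

def feedback_constants(a_dc, omega_c, r1, rf, rg):
    """Return (a, b) such that E_o/E_1 = -(rf/r1) * a / (j*omega + b).

    Assumption: derived from a nodal analysis with gain -A(jw),
    A(jw) = a_dc / (1 + j*w/omega_c), input resistance rg at the
    summing point, and zero output impedance.
    """
    k = 1.0 + rf / r1 + rf / rg          # loading factor at the summing point
    a = a_dc * omega_c / k
    b = (a_dc + k) * omega_c / k
    return a, b

def error_factor(a, b, omega0):
    """Complex error factor F_1 = (a/b) / (1 + j*omega0/b)."""
    return (a / b) / (1.0 + 1j * omega0 / b)

def within_tolerance(a, b, omega0, alpha):
    """True if |F_1| differs from unity by no more than alpha."""
    return abs(abs(error_factor(a, b, omega0)) - 1.0) <= alpha

# Hypothetical values: d-c gain 10^4, bandwidth 100 rad/sec, 100-kilohm resistors.
a, b = feedback_constants(a_dc=1.0e4, omega_c=100.0, r1=1.0e5, rf=1.0e5, rg=1.0e5)
print(abs(error_factor(a, b, omega0=1.0e3)))
print(within_tolerance(a, b, omega0=1.0e3, alpha=0.001))
```

The constant factor a/b sets the low-frequency error, while the second factor contributes only when the operating frequency becomes comparable with b.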
voltage source. The feedback loop provides a constant voltage, regardless of the load impedance, within certain limits. The battery can be a low-capacity Mallory cell or some similar stable voltage reference. The input impedance to the Inductronic Amplifier is extremely high; therefore, the load on the reference cell is a minimum. Figure 29 shows a block diagram of this calibrator. By providing a suitable voltage-divider output, several standard voltage steps from 0 to 5 volts may be obtained. One of the main advantages of this voltage calibrator lies in the fact that no voltage monitoring need be done during the calibration process. The voltage output may be checked periodically by a Leeds and Northrup potentiometer to determine its stability. The short-time stability is extremely good, and the long-time stability lies in the neighborhood of 0.1% for a period of several months.
[Figure: block diagram. Legible labels: flight tape playback (Ampex 500), loop playback system (Ampex 500), tape time pulse detectors, dual preset counter, counting unit, programmer, relay driver amplifier, attenuators, coefficient amplifier, local oscillator, HP 300-A wave analyzer, cathode follower, square rooter, Sanborn recorder, voltage regulated power supply (Lambda Model 28).]
HENRY B. RIBLET
FIG. 29. Block diagram of a voltage calibrator. (Courtesy Johns Hopkins University, Applied Physics Laboratory.)

B. Frequency Calibrators

In an FM/FM telemetering system, it is necessary to have a source of accurate frequencies for the calibration of the telemetering ground stations to determine the response and accuracy of the subcarrier discriminators. It was the practice in the early days of telemetering to use a standard audio oscillator with a frequency-measuring device and to adjust the oscillator manually to the specific frequencies necessary to calibrate the subcarrier discriminators. This process was a long and laborious one which required many adjustments to cover the frequency bands. During the past few years, frequency calibrators have been designed which utilize crystal oscillators as standard frequency references. The crystal reference frequencies are chosen with the proper spacing to give calibration points at the center of each telemetering band and also at specific points within the band and at the band limits. Some of these calibrators provide as many as 11 points of calibration in each telemetering band. The frequencies in each telemetering band are obtained by dividing the frequency of the crystal oscillators
RADIO TELEMETERING
into many frequency combinations and using balanced modulators to add and subtract the various frequency components to arrive at the specific frequencies necessary within each band. There are several manufacturers who have designed specific equipment for special applications, but in general, they operate similarly and provide essentially the same performance characteristics. The frequency calibrators described here are designed to provide an output which calibrates all telemetering bands simultaneously and automatically steps from one calibration point to another to provide from 3 to 11 points within each band. The accuracy of the frequencies generated by the crystal oscillators is in the neighborhood of 0.02%. This type of equipment provides an easy method for the calibration of the subcarrier discriminators and also gives a rapid check of their frequency linearity. One type of equipment uses a panoramic indicating unit to enable the operator to monitor the calibration process and also to check the spectrum of the combined subcarrier frequencies of the telemetering system.
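The placement of calibration points within one subcarrier band can be illustrated by a short Python sketch. It is not a description of any particular calibrator: the equal spacing, the restriction to an odd number of points (so that the band center is included), the default deviation of 7.5%, and the function names are all assumptions made for the example.

```python
def calibration_points(center_hz, deviation_pct=7.5, n_points=11):
    """Equally spaced calibration frequencies from lower to upper band limit.

    Assumption: an odd number of points is used so that the band center
    is one of the calibration frequencies.
    """
    if n_points < 3 or n_points % 2 == 0:
        raise ValueError("n_points must be odd and at least 3")
    lower = center_hz * (1.0 - deviation_pct / 100.0)
    upper = center_hz * (1.0 + deviation_pct / 100.0)
    step = (upper - lower) / (n_points - 1)
    return [lower + i * step for i in range(n_points)]

def tolerance_hz(frequency_hz, accuracy_pct=0.02):
    """Permitted frequency error for a stated percentage accuracy."""
    return frequency_hz * accuracy_pct / 100.0

# Hypothetical band centered on 1000 cps with five calibration points.
for f in calibration_points(1000.0, deviation_pct=7.5, n_points=5):
    print(round(f, 2), "+/-", round(tolerance_hz(f), 3), "cps")
```

With an accuracy of 0.02%, the permitted error of each calibration frequency is a small fraction of the spacing between adjacent points, which is what makes a linearity check of the discriminator meaningful.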
C. Combination Test Equipment for FM/FM Systems

In the adjustment of FM/FM systems it is necessary to check the center frequencies of the various subcarrier oscillators in the airborne system and to check the deviation sensitivity of each channel. There have been several designs of combination equipment which provide a number of operations of this type. Such test equipment usually includes a full set of band-pass filters, a frequency-measuring device, such as an Eput meter, and an r-f power measuring system. In addition, an accurate voltmeter is usually employed to measure the relative amplitudes of the subcarrier frequencies as they are passed through the band-pass filters. A voltage calibrator is employed to stimulate the subcarrier oscillators at known voltage values. In this way, one can determine the frequency deviation of each subcarrier oscillator with respect to standard voltage steps. Again there is a large variety of such equipment available, and generally this type of test equipment is designed for special applications depending upon the particular requirements. Quite often an r-f receiver is used so that the test equipment will include the full r-f link in its test capability. Figure 30 is a block diagram of this equipment, and its operation is fairly obvious.

VIII. CONCLUSIONS
A. Review of Advances

In this review of advances in radio telemetering over the past six years, some of the highlights in the telemetering developments of both systems and components have been discussed briefly. An attempt has been made to
cover the entire system from the transducer to ground equipment. It has been impossible to discuss or reference all of the material published during the past few years on the subject of radio telemetry. Therefore, only typical developments have been mentioned, and the reader will discover additional material particularly in the IRE Transactions and IRE Convention Records as well as in the National Telemetering Conference reports.

FIG. 30. Block diagram of telemeter test rack.

It is believed that the significant advances in telemetering systems and components have been made in the following general areas: improved accuracy through the development of more stable components in both the airborne and ground equipment and through the more widespread use of digital techniques. A considerable advancement has been made in the development of high-temperature components, particularly in the area of transducers. One of the critical requirements for telemetering systems used in our complex missile designs today is the need for operation in a tremendously high-temperature environment. A considerable amount of effort has been devoted to the transistorization of telemetering systems and components. There have been a number of unique transistorized circuit designs, giving stable and reliable components. Therefore, more and more use is being made of transistorized telemetering components. Transistors have
the ability to withstand rugged environments and allow for reduction in size and weight. The smaller power requirements of transistors make the use of these new components very important. The standardization of telemetering systems and the improvement in flexibility have provided our missile designer and aircraft flight engineer with an electronic tool which is all but indispensable in this modern era of long-range missile development and space flight. Telemetering designers are continuing to improve the packaging and engineering design to give increased subminiaturization which will meet the ever-present requirement for less space and lower power consumption. There have been many improvements in the telemetering ground receiving equipment, giving the engineer a much more flexible station and one which is considerably more efficient in operation. In addition, the modern telemetering station can handle a vast amount of data compared with the telemetering ground station of a few years ago. The new equipment available for data recording and processing gives the telemetering system engineer a wide choice of components, which he can arrange in various combinations to provide a telemetering station capable of handling most any telemetering system and with which he can process data in either analog or digital form. The telemetering engineer must, of course, work closely with the data user to determine the types of data measurements which are desired. He must then carefully select a telemetering system which will best fit the specific application. The dynamic range of the measurement, the accuracy required, the frequency response, as well as the ultimate use of the data, all will have an important bearing on the specific type of system and components which will be required to instrument his specific application. With the use of present telemetering techniques, almost any demand can be met with respect to data requirements. 
In some cases, a little cleverness is required, but in general some combination of available components and systems will give measurements with accuracies in the order of 2 to 3% for the FM/FM systems, and in cases where extreme accuracy is required, the improved digital techniques can give data accuracies to 0.1%.
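As a rough illustration of what an accuracy of 0.1% implies for a digital system, the following Python sketch counts the binary digits needed for one quantizing step to be no larger than a stated percentage of full scale. It treats quantizing resolution only; transducer, noise, and calibration errors are ignored, and the function name is hypothetical.

```python
def bits_for_resolution(resolution_pct):
    """Smallest number of bits whose quantizing step (as a percentage of
    full scale, 100 / 2**bits) does not exceed resolution_pct."""
    if resolution_pct <= 0:
        raise ValueError("resolution_pct must be positive")
    bits = 0
    while 100.0 / (2 ** bits) > resolution_pct:
        bits += 1
    return bits

print(bits_for_resolution(0.1))   # word length for a 0.1% step
print(bits_for_resolution(2.0))   # word length for a 2% step
```

Under this simplified view, a tenfold or greater gain in resolution costs only a few additional bits per sample, which is one reason digital techniques are attractive where extreme accuracy is required.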
B. Future Trends

With the ever-increasing aircraft and missile speeds, the demands for higher and higher temperature operation will require a considerable development effort to increase the supply of telemetering components which can operate in these environments. It is believed that telemetering systems and components will use transistors almost exclusively in the near future, with the possible exception of the r-f transmitters. With the increased use of transistors, additional improvements will be made in the packaging and
miniaturization, providing increased reliability under rugged environmental conditions. It is expected that there will be an increased use of PCM telemetering systems with a digital output. There always will be a requirement for analog measurements, however, and the demand for improved accuracy will require the continued efforts of the designer for improvements in the stability and accuracy of the frequency division systems. A considerable improvement can be made in the engineering design and workmanship of telemetering components. When it is considered that the telemetering system is used as a tool to aid in the development of complex electronic circuits, which in themselves must have extremely high reliability to accomplish their mission, one must realize that the telemetering system must be several orders of magnitude more reliable than the equipment which it is monitoring. Therefore, the design engineer must be ever vigilant to design increased reliability, ruggedness, and ease of operation and of maintenance into his telemetering equipment. The recent work directed to microminiaturization of electronic circuits has a direct application to the telemetering components and systems of tomorrow. The demand for reduction in size, weight, and power always will be present. The newer microminiaturization techniques give possibilities for improvement in reliability, uniformity, and flexibility. The ever-increasing demand for greater data-handling capacity, reliability, and flexibility should challenge those working in the art of telemetering to continue the wonderful progress which has been made in the last few years.
Electron Diffraction Structure Analysis and the Investigation of Semiconducting Materials

Z. G. PINSKER
Institute of Crystallography, Academy of Sciences of the U.S.S.R., Moscow

Translated by Lewis B. Leder

I. Introduction
II. State of the Electron-Diffraction Method of Investigating Solids at the Present Time
A. Atomic Scattering
B. Scattering in Crystals
C. Remarks on the Present Development of the Dynamic Theory
D. Process of Structure Determination from Electron-Diffraction Patterns
E. Experimental Techniques
III. Investigations of the Structure of Semiconducting Phases
A. Films of Pure Substances
B. Substances Having the Zinc Blende Structure
C. The Systems Bi-Se and Bi-Te
D. Two- and Multicomponent Films Containing Tl, Sb, As, and Se
E. Phases with the GaS Structure
F. Other Semiconducting Phases
G. The Surface Structure of Thin Films and Monocrystalline Ge
Acknowledgement
References
I. INTRODUCTION

In the extensive development of the physics and technology of such fields as electronics, semiconductors, piezoelectrics, and certain others, much use has been made of the new methods of investigating and analyzing different materials and processes, among them the electron-diffraction method. Besides being used to investigate surface structure and surface phenomena, this method, through the initiative of Soviet authors, is more broadly used for the determination of the atomic structure of solids and has proven to be a valuable supplement to X-ray structure analysis. Several special effects in the scattering of electrons by atoms and crystals make it highly effective for the solution of many special problems in physics and engineering.
The strong scattering of electrons in matter makes it necessary to use very thin specimens or to investigate thin surface layers. This is very important in connection with different thin layers: semiconducting, piezo- and ferroelectric, luminescent materials, etc. Furthermore, it should be pointed out that many materials, such as multicomponent semiconductors, which in bulk form are sometimes difficult to obtain in a well-regulated and equilibrium condition, will in thin layers crystallize quite easily with moderate heating. In the same specimen there will often be observed, together with the stable phase, an unstable, intermediate structure having scientific and practical interest. As is known, to determine completely the structure of a crystal not having a simple composition or high symmetry, it is necessary to have, in the case of X-ray structure analysis, a single-crystal specimen on the order of magnitude of 0.1 mm. Meanwhile, a great many materials, natural or synthetic, may be obtained only in powder form. Such fine crystalline specimens are completely accessible to detailed structure investigations with modern electron-diffraction methods. Furthermore, many important synthetic materials may be formed as single-phase specimens by the methods which were developed and used for obtaining preparations for electron-diffraction studies. There is also interest in so-called kinematic electron diffraction, i.e., the method of investigation which consists in recording diffraction patterns on a moving photographic film or plate. In this way the disappearance of one and the appearance of another phase can be recorded during the course of some type of treatment of a given material. Electron diffraction has great importance in the observation of the processes occurring during the sublimation (usually in vacuum) of simple as well as complex samples.
This observation appears to be, if not a unique, then in any case an important method of determining the structure (partially also the composition) of the sublimated layers. Another point that should be stressed is that more detailed structure investigations using phase analysis can give incomparably more important and abundant results. Electron-diffraction analysis of crystal structure reveals new perspectives, as has already been shown in the published work of a number of investigators. The essential advantage of the method appears to be, just as with neutron diffraction, the possibility of determining the positions of light atoms in the presence of heavy ones: hydrogen in the presence of carbon, nitrogen, oxygen, fluorine, and even silicon; nitrogen in the presence of iron, molybdenum, etc. Furthermore, structural investigation helps in making precise determinations of the nature of chemical bonding and of the concentration of components in phases of various structures and aids in the solution of a number of other questions.
II. STATE OF THE ELECTRON-DIFFRACTION METHOD OF INVESTIGATING SOLIDS AT THE PRESENT TIME

More than twenty years ago the author had the idea of creating electron-diffraction structure analysis as a method for completely and independently determining crystal structure. Since that time, the author (1) and his collaborators and students and, in particular, B. K. Vainshtein (2) have developed the theory and experimental methods of structure analysis using electron diffraction and have completed a considerable number of structure determinations. Of especially great importance has been the use of the method of Fourier synthesis in electron diffraction, initiated by Vainshtein (3), which undoubtedly was the impetus for the use of Fourier synthesis in neutron diffraction. Structural electron diffraction may be considered as a form of the most penetrating and effective use of the electron-diffraction method for the solution of different scientific and practical problems in the structure of solids. With the development of structural electron diffraction, there appeared new investigations concerned with the scattering of electrons by atoms, molecules, solids, and liquids. To characterize the present state of the method, it is essential first to illuminate some principal results of recent work with respect to the scattering of electrons of average energy (30-100 kev) in matter.

A. Atomic Scattering
It appears definite that in a strict theory of the scattering of electrons in crystals it is necessary to proceed from a representation of a continuous periodic potential distribution. In practice, however, the investigator makes use of the method of superposition of the amplitudes $f_e(s)$ of scattering by individual atoms to calculate the intensities of electron diffraction patterns. The values of $f_e(s)$, where $s = 4\pi(\sin\theta)/\lambda$ [or $f_e((\sin\theta)/\lambda)$], given in tables are calculated by one method or another for free, spherically symmetric neutral atoms. It is obvious that the use of these data to calculate scattering in crystals is inaccurate. Moreover, an error is introduced when the Born approximation is used instead of an accurate calculation of the complex atomic amplitude. It is possible to imagine that the influence of the phase ($\eta$) of the complex amplitude would appear in the case of scattering in a crystal containing elements of highly different Z number (4). Thus, using the Born approximation for the atomic amplitude, we obtain
where $\varphi(r)$ represents the value of the potential at point r inside the atom and the integral extends over the volume of the atom. The atomic amplitude $f_e(s)$ is calculated by using the expression

where $f_x(s)$ is the well-known value of the X-ray atomic amplitude. In recent years extensive values of $f_e(s)$ have been tabulated by Vainshtein (2, 5) using $f_x(s)$ values and taking, for light atoms, the data of McWeeny and, for heavy atoms, Thomas-Fermi statistics. In this table are given also values of $f_e(0)$, i.e., values of the atomic amplitude for s = 0 obtained experimentally. As shown by Vainshtein, the value $f_e(0)$ in electron diffraction is analogous to the value Z in X-ray diffraction, i.e.,

corresponds to the total value of the atomic potential and may be used to calculate the form of the potential maximum and the unit amplitude, or to estimate the relative scattering power of different atoms. In Vainshtein's table the $f_e(s)$ values are given in so-called p units introduced by this author in analogy with electron units in X-ray analysis. If we rewrite (2) in the form

then, in the nature of the unit, it is possible to choose the scattering power of the proton for a value $(\sin\theta)/\lambda = 0.1 \times 10^8$ cm$^{-1}$; in this case, the relative value of $f_e$, for all atoms with corresponding value $(\sin\theta)/\lambda$, is expressed in dimensionless p units. To change to absolute units (see table of values) it is necessary to multiply by a constant of multiplication
Recently, J. A. Ibers (6) published a new table of $f_e(s)$ values (expressed in angstroms), including values of $f_e(0)$, calculated with greater accuracy than previous ones. For light atoms he used recent data for $f_x(s)$, calculated by the method of Hartree-Fock or Hartree, and for medium and heavy atoms the values of $f_x(s)$ were calculated by the improved method of
Thomas-Fermi-Dirac. Finally, the limiting quantities $f_e(0)$ were determined using the root-mean-square value of the atomic radius. The new values differ from the old for values of $(\sin\theta)/\lambda \times 10^{-8} < 0.4$-$0.5$ for light atoms, and $(\sin\theta)/\lambda \times 10^{-8} < 0.6$-$0.7$ for medium and heavy atoms; for the first case toward increased scattering power and for the second toward decreased power. This decrease is very considerable for $(\sin\theta)/\lambda < 0.1$-$0.2 \times 10^8$ cm$^{-1}$ and, in particular, for $f_e(0)$. As is well known, the value of $f_e(s)$ can be experimentally verified with carefully measured intensities of reflection from a crystalline material. At the same time, a well-defined interpretation of the experimental data is complicated by the conditions of scattering in the crystal, which may be treated either as kinematic or dynamic. We shall come back to this in a following paragraph. Ibers obtained a partial proof of his results by using the value of the molar diamagnetic susceptibility $\chi$, which is related to the value of $f_e(0)$ through the previously mentioned root-mean-square radius $(\overline{r^2})^{1/2}$ for a given atom by

$$-\chi = \frac{N h^2}{8\pi^2 m^2 c^2}\, f_e(0) = 4.492\, f_e(0) \tag{4}$$
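The numerical coefficient in (4) can be checked with a few lines of Python. The sketch assumes that the coefficient has the form $N h^2 / (8\pi^2 m^2 c^2)$, that $f_e(0)$ is expressed in angstroms, and that the susceptibility is quoted in units of $10^{-6}$ cm$^3$ per mole; these unit conventions are assumptions of the example, and the constants are present-day CGS values rather than those available in 1959.

```python
import math

# Physical constants in CGS units (modern values).
AVOGADRO = 6.02214076e23        # per mole
PLANCK = 6.62607015e-27         # erg sec
ELECTRON_MASS = 9.1093837e-28   # g
LIGHT_SPEED = 2.99792458e10     # cm/sec

def susceptibility_coefficient():
    """Coefficient relating -chi (in 1e-6 cm^3/mole) to f_e(0) in angstroms,
    assuming -chi = N h^2 f_e(0) / (8 pi^2 m^2 c^2)."""
    per_cm = AVOGADRO * PLANCK ** 2 / (
        8.0 * math.pi ** 2 * ELECTRON_MASS ** 2 * LIGHT_SPEED ** 2
    )                                   # cm^2 per mole, multiplies f_e(0) in cm
    return per_cm * 1.0e-8 * 1.0e6      # angstrom -> cm, then units of 1e-6

print(susceptibility_coefficient())
```

The result agrees with the quoted 4.492 to within the small difference expected from the revision of the physical constants since the table was computed.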
Ibers' method gave the best convergence of his data for the inert gases. Analyzing the conditions of the problem, this author pointed out that the reliability and accuracy of the tabulated values of $f_e(s)$ depend on the accuracy of the X-ray amplitude data which, in the same way, depends on the limited experimental data on the scattering of electrons in crystals. It should be noted that the disadvantage of Ibers' table of atomic amplitudes is that the quantity is calculated only in an interval up to $(\sin\theta)/\lambda = 0.70 \times 10^8$ cm$^{-1}$ and not for all elements. Meanwhile, in recent structural investigations the author tried to use the data from electron-diffraction pictures up to $(\sin\theta)/\lambda = 1.1$-$1.2 \times 10^8$ cm$^{-1}$, which is very important for reducing the cut-off effect in the Fourier-synthesis calculation of the potential. A more complete table of atomic amplitudes for light atoms has been compiled by Ibers and Vainshtein (7). Comparison of the theoretical and experimental intensities of electron diffraction patterns requires, just as in X-ray structure analysis, a correction to the theoretical $f_e(s)$ for the thermal oscillation of the atoms. This correction

$$e^{-B[(\sin\theta)/\lambda]^2}, \qquad B = \frac{6h^2 T}{m k \Theta^2} \tag{5}$$

causes a steeper fall-off of atomic scattering with increasing $(\sin\theta)/\lambda$.
The value of B is determined either experimentally or, in certain cases, from theoretical calculations. The above relationship is related to the elastic scattering of electrons. At the present time there is insufficient information from which to make any statements concerning the role of inelastic scattering in the formation of electron diffraction patterns. The well-known work of Heisenberg and Bewilogua apparently is related to collisions accompanied by a substantial variation in energy of the scattered electrons; it is possible that this effect often is the result of multiple collisions. An electron which loses a substantial part of its initial energy will not participate in the formation of interference maxima, but instead forms a general background. Calculation of this background has certain significance for the treatment of electron diffraction patterns from amorphous solids (or molecular vapors). In addition to this, in recent years it has been established that for scattering in films of 200-400 Å thickness the observed absorption of energy (from the electrons) is in the range of 10 ev and has a characteristic spectral form. A series of metals and their compounds have clearly marked and reproducible maxima on the curves representing the intensity of the electron beam as a function of energy loss. It has been shown by L. Marton and his co-workers (8), H. Watanabe, and others, that this beam takes part in the formation of interference maxima; i.e., it appears as an inelastically scattered but coherent beam. Comparison of the curves of elastic and inelastic scattering for typical electron diffraction patterns from various metals shows that the inelastically scattered electrons form a considerable part of the background intensity between maxima and, furthermore, take part in forming the maxima.
Apparently, in the present state of electron-diffraction analysis, deduction of the indicated characteristic loss intensities from electron-diffraction intensities appears to be a difficult problem.
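The relation between X-ray and electron atomic amplitudes and the thermal correction discussed in this subsection can be illustrated with a short Python sketch. It assumes that the conversion from $f_x$ to $f_e$ is the standard Mott formula, $f_e = (me^2/2h^2)\,[Z - f_x]/[(\sin\theta)/\lambda]^2$, with the numerical constant 0.023934 when lengths are in angstroms; this is offered as an assumption about the intended expression, not as a transcription of the original equations, and the function names are hypothetical.

```python
import math

MOTT_CONSTANT = 0.023934  # m e^2 / (2 h^2), in reciprocal angstroms

def electron_amplitude(z, f_x, sin_theta_over_lambda):
    """Electron atomic amplitude in angstroms from the X-ray amplitude f_x
    (in electron units); sin_theta_over_lambda is in reciprocal angstroms."""
    if sin_theta_over_lambda <= 0:
        raise ValueError("the formula is singular at zero scattering angle")
    return MOTT_CONSTANT * (z - f_x) / sin_theta_over_lambda ** 2

def thermal_factor(b, sin_theta_over_lambda):
    """Temperature correction exp(-B (sin(theta)/lambda)^2), B in angstroms^2."""
    return math.exp(-b * sin_theta_over_lambda ** 2)

# Scattering power of a bare proton (Z = 1, f_x = 0) at
# (sin theta)/lambda = 0.1 reciprocal angstroms, i.e. 0.1e8 per cm.
proton_unit = electron_amplitude(z=1, f_x=0.0, sin_theta_over_lambda=0.1)
print(proton_unit)               # in angstroms
print(thermal_factor(1.0, 0.5))  # hypothetical B = 1 angstrom^2
```

Under these assumptions the proton value computed here is the absolute size of the unit in which relative amplitudes would be expressed, and the thermal factor shows how quickly the correction grows with scattering angle.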
B. Scattering in Crystals (1, 2)

In kinematic theory the amplitude of a wave scattered by a crystal in a certain arbitrary direction is determined by summing the elementary waves scattered by individual atoms. The two stages of this summation, with respect to the atoms in an elementary cell and with respect to all the cells involved in the scattering in the volume of the crystal, are different. The first summation gives the structure amplitude

$$\Phi_{hkl} = \sum_j f_{ej}\,\exp\{2\pi i\,(\mathbf{r}_j \cdot \mathbf{h})\} \tag{6}$$

where $f_{ej}$ and $\mathbf{r}_j$ are, respectively, the atomic amplitude (2) and radius vector inside a cell for atom number j; $\mathbf{h}$ is the radius vector of the reciprocal lattice

$$\mathbf{h} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^* \tag{7}$$
Summing over the crystal gives a sum which can be represented by a factor of the form
Expression (8) represents integration of the scattering over the crystal volume, a parallelepiped with sides A, B, C. The quantity obtained

$$D(h_1 h_2 h_3) = \frac{\sin \pi A h_1}{\pi h_1}\cdot\frac{\sin \pi B h_2}{\pi h_2}\cdot\frac{\sin \pi C h_3}{\pi h_3} \tag{9}$$

can be considered as a distribution function of the intensity in the interference region of the reciprocal lattice space, and $h_i$ is the component, along the corresponding axis, of the radius vector in this region drawn from its center to any of its points. For the intensity of the reflected wave we obtain the expression

$$I = I_0\,|\Phi_{hkl}|^2\,[D(h_i)]^2 \tag{10}$$

This is the value of the intensity at point $h_i$ of the interference region. On the electron-diffraction pattern we obtain a certain cross section of this region in the plane of the photograph, where the distance on the photograph is $L\lambda$ times the distance in the volume of the reciprocal lattice. Passing to the integrated intensity, we take $dx_1$ and $dx_2$ as elements of length in the plane of the photograph and $dx_3$ perpendicular to the photograph and obtain (C = H)
Formula (11) is used when the plane of the photograph (to be more exact, the sphere of reflection) intersects the maximum region corresponding to $h_3 = 0$. Thus, the magnitude of the integrated intensity (also the magnitude of the intensity from one crystal in kinematic theory) is proportional to the square of the structure amplitude and the thickness of the crystal in the direction of the primary beam. As is well known, many crystals are involved in the formation of an electron-diffraction pattern (we do not consider Kikuchi-line patterns from
macro-monocrystals). We shall return to this point later, but for the present formula (11) is adequate for further discussion. In the dynamic theory we examine the wave field inside a crystal and the conditions on the crystal-vacuum boundary to obtain the intensity distribution of the scattered radiation. The potential function $(8\pi^2 me/h^2)\,\varphi(r)$ entering into the Schroedinger equation is represented in the form of a triple Fourier series
In accordance with this, the solution of the equation inside the crystal proves to be an infinite series of waves. In the zero approximation, corresponding to the kinematic theory, only one incident wave has a noticeable amplitude. In the first approximation, the most developed, two beams possess noticeable amplitude: the incident beam $\psi_0$ (wave vector $k_0$) and one of the scattered beams $\psi_g$ (wave vector $k_g$). After substitution of this solution into the Schroedinger equation, we obtain for the two specified waves inside the crystal equations of the form
where the coefficients, among them $v_{12}$ and $v_{21}$, prove to be the well-known functions of the wave vector of the incident wave in vacuum, $K = (8\pi^2 m e E/h^2)^{1/2}$, and of the components $v_{hkl}$; among the latter the zero component $v_{000}$ identifies the average inner potential of the crystal. It is further essential that, to calculate the value $X$, we must solve a system of second order relative to $X$, which leads, in general, to two waves each for the incident and the reflected beam: $\psi_0'$, $\psi_0''$, $\psi_g'$, $\psi_g''$. By imposing the boundary condition that the wave functions and their derivatives on both sides of the crystal-vacuum boundary must be equal, we find that in the simplest case, that of passing through a plane-parallel lamina, both components add. We obtain for the coefficient of reflection
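$$\frac{I_{hkl}}{I_0 S} = \frac{v_h^2/k^4}{(\theta-\theta_0)^2\sin^2 2\theta_0 + v_h^2/k^4}\,\sin^2\!\left[\tfrac{1}{2}kH\sqrt{(\theta-\theta_0)^2\sin^2 2\theta_0 + v_h^2/k^4}\,\right] \qquad (14)$$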
Here $k \approx K = 2\pi/\lambda = (8\pi^2 m e E/h^2)^{1/2}$; $eE$ is the total electron energy, $\theta_0$ is the angle of exact reflection, $\theta$ is the angle within the limits of the reflection maximum, and $H$ is the thickness of the lamina. Thus, the coefficient of reflection proves to be a periodic function of two arguments: the thickness of the crystal and the divergence from the angle of normal reflection, $\theta - \theta_0$. This is the so-called pendulum solution of the dynamic problem. The maximum reflection, according to (14), consists of a region of selective
(or total) reflection having a measurable angular width satisfying the condition $\theta = \theta_0$ and two regions of partial reflection on each side. For the region of selective reflection
As shown by Blackman (9), Eq. (15) allows us to estimate the boundary between regions using kinematic or dynamic theory. Actually, for a small value of the argument of the sine the magnitude of the reflection may be written in the form
Or, substituting $v_h = 4\pi|\Phi_{hkl}|/\Omega$ [see Eq. (39) later on] and $k = 2\pi/\lambda$, we obtain (11), i.e., the kinematic formula. As far as the degree of approximation to the condition
kinematic expression (11) becomes less and less applicable. Equation (17) can be rewritten in the form
This expression permits a quantitative estimate of the factor corresponding to the transition from the kinematic to the dynamic region of scattering. This transition, for a given thickness $H$, takes place for large $\lambda$, i.e., small accelerating voltage, or for a large value of $\Phi/\Omega$. It is obvious that in a highly symmetrical, simple structure, in particular one containing average or heavy atoms, the limiting thickness $H$ is small. The limiting thickness increases for more complicated and less symmetrical structures with greater values of $\Omega$ and smaller values of $\Phi$. In Table I are given certain values of the thickness $H$ calculated using Eq. (18) for $\lambda = 0.042$ (approximately corresponding to an accelerating voltage of 75 kev) for the first strong reflections of different structures. These data matter because structural investigations, in particular those using the intensities of electron-diffraction patterns, can be carried out in large part on the basis of patterns from polycrystalline specimens and oblique textures (as shown later on), and, as shown experimentally, the particle size in such specimens is on the order of a few hundred angstroms at maximum. Further, it is obvious that newly investigated structures having greater complexity than simple structures of the type of Au or Fe have a correspondingly greater limiting thickness.
TABLE I

Structure   Reflection      Φ_hkl/Ω (Å⁻²)           H (Å)
            indices hkl

Au          111             0.446                   50
            200             0.396                   60
            220             0.282                   85
            311             0.24                    100

n           111             0.314                   75
            200             0.279                   85
            220             0.193                   125
            311             0.163                   150

MoS2        203             9.89/103 = 0.096        250
            205             11.89/103 = 0.115       210
            008             13.0/103 = 0.126        190

Al          111             0.142                   170
            200             0.121                   200
            220             0.0825                  290
            311             0.0666                  360
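The entries of Table I can be reproduced approximately with a short calculation. The sketch below assumes that condition (18) amounts to λ(Φ/Ω)H reaching a value of about one; this form is an assumption inferred from the tabulated numbers, which satisfy it to within rounding, and the function name is illustrative only.

```python
def limiting_thickness(phi_over_omega, wavelength=0.042):
    """Thickness H (angstroms) at which wavelength * (Phi/Omega) * H reaches 1.

    phi_over_omega is in inverse square angstroms, wavelength in angstroms.
    """
    return 1.0 / (wavelength * phi_over_omega)

# (structure, hkl): (Phi/Omega from Table I, H from Table I)
table_I = {
    ("Au", "111"): (0.446, 50),
    ("Au", "311"): (0.24, 100),
    ("MoS2", "203"): (0.096, 250),
    ("Al", "111"): (0.142, 170),
    ("Al", "311"): (0.0666, 360),
}

for (structure, hkl), (ratio, h_table) in table_I.items():
    h_calc = limiting_thickness(ratio)
    print(f"{structure} {hkl}: calculated {h_calc:.0f} A, tabulated {h_table} A")
```

The calculated values agree with the rounded tabulated ones to within a few percent, and the strongest reflections of the heavy, simple structure (Au) give the smallest limiting thickness.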
From an examination of materials of any definite structure, we note that the limiting thickness is reached first of all in relation to the most intense reflections. This circumstance, connected with the presence of the quantity $\Phi_{hkl}$ in (18), is significant for practical structure investigation. If, as usually occurs, we observe a considerable region where $\Phi_{hkl}$ has magnitudes in the ratio of 100:1 in the reflection spectra from a given structure, then it is obvious that the limiting thickness of the crystal, and, significantly, also the condition of inapplicability of the kinematic theory, is reached first of all for rather strong reflections. Thus, there arises the question of introducing a correction which, according to Blackman, is calculated in the following manner. Equation (14) can be written in another form if we introduce the quantities
$$W = \frac{k^2(\theta-\theta_0)\sin 2\theta_0}{v_h} \qquad\text{and}\qquad A = \frac{v_h H}{2k} = \lambda\,\frac{\Phi}{\Omega}\,H$$
In this case

$$\frac{I_{hkl}}{I_0 S} = \frac{\sin^2\!\left(A\sqrt{W^2+1}\right)}{W^2+1} \qquad (19)$$
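The pendulum character of the reflection coefficient is easy to see numerically once Eq. (14) is expressed through the reduced quantities A and W. The following sketch assumes the reduced form sin²(A√(W²+1))/(W²+1), which follows from substituting the definitions of A and W into Eq. (14); the function names are illustrative.

```python
import math

def reduced_thickness(wavelength, phi_over_omega, thickness):
    """A = lambda * (Phi/Omega) * H, the dimensionless thickness parameter."""
    return wavelength * phi_over_omega * thickness

def reflection_coefficient(A, W):
    """Eq. (14) in reduced variables: sin^2(A*sqrt(W^2+1)) / (W^2+1)."""
    root = math.sqrt(W * W + 1.0)
    return math.sin(A * root) ** 2 / (W * W + 1.0)

# At exact reflection (W = 0) the coefficient oscillates with thickness.
full = reflection_coefficient(math.pi / 2, 0.0)   # complete transfer to the reflected beam
none = reflection_coefficient(math.pi, 0.0)       # energy returned to the incident beam
thin = reflection_coefficient(0.01, 0.0)          # kinematic limit, close to A**2
```

For small A the value at W = 0 tends to A², i.e., to the square of the structure amplitude, which is the kinematic result; for larger A the intensity swings periodically between the incident and reflected beams.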
It is possible to obtain the integrated intensity, for example, by integrating with respect to the angle of divergence (diffraction) $\alpha = \theta - \theta_0$ from the direction of ideal reflection $\theta_0$. In the integration one must include all the
angles in the region of a given maximum. The expression so obtained can be transformed to a Bessel function integral. Changing from variable $\alpha$ to $W$, we have

$$\frac{I_{hkl}}{I_0 S} = c\int_{-\infty}^{\infty}\frac{\sin^2\!\left(A\sqrt{W^2+1}\right)}{W^2+1}\,dW = c'\int_0^{A}J_0(2x)\,dx \qquad (20)$$
where $J_0$ is a Bessel function of zeroth order. For small $A$, i.e., at the lower limit of quantity (18b), the integral of the Bessel function is

$$\int_0^{A}J_0(2x)\,dx = A \qquad (21)$$

and the reflection integral is
i.e., proportional to the square of the structure amplitude, which is characteristic for kinematic scattering. For large $A$, i.e., beyond the limits of condition (18b), the integral referred to converges to 1/2 and the reflection
i.e., proportional to the first power of the structure amplitude, which is characteristic for dynamic scattering. Furthermore, note the result that in the case of (22) the reflection is proportional to scattering in the volume of the crystal, which does not occur in dynamic scattering (23). If the first part of expression (20) is multiplied and divided by $A$, then we obtain
Obviously, the quantity

$$\frac{1}{A}\int_0^{A}J_0(2x)\,dx \qquad (25)$$

can be considered as a correction for extinction, depending on the modulus of the structure amplitude. By introducing this correction, we can calculate the intensities of electron-diffraction patterns corresponding to (24), i.e., according to the kinematic scattering law. Note that the coefficients $c$ and $c'$ in the reduced formula prove to be
Lorentz factors, which illustrates the use of this factor in calculating electron diffraction intensities with dynamic scattering.
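The two limits of the Bessel-function integral discussed above, and the resulting extinction correction, can be evaluated directly. The sketch below integrates the power series of J0(2x) term by term; this is one convenient numerical route chosen here for illustration, not the method of the original work, and it is adequate only for moderate A (up to roughly 10).

```python
def bessel_integral(A, terms=80):
    """Integral of J0(2x) from 0 to A.

    Uses J0(2x) = sum_m (-1)^m x^(2m) / (m!)^2 integrated term by term,
    giving sum_m (-1)^m A^(2m+1) / ((2m+1) (m!)^2).
    """
    total = 0.0
    coeff = 1.0  # holds (-1)^m A^(2m) / (m!)^2
    for m in range(terms):
        total += coeff * A / (2 * m + 1)
        coeff *= -A * A / ((m + 1) ** 2)
    return total

def dynamic_correction(A):
    """(1/A) * integral of J0(2x) from 0 to A: close to 1 when kinematic, small when dynamic."""
    if A == 0.0:
        return 1.0
    return bessel_integral(A) / A

small = bessel_integral(0.01)   # close to A itself, the kinematic limit of Eq. (21)
large = bessel_integral(10.0)   # oscillates about 1/2, the dynamic limit
```

Multiplying a kinematically calculated intensity by `dynamic_correction(A)` reproduces the gradual passage from proportionality to the square of the structure amplitude toward proportionality to its first power.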
C. Remarks on the Present Development of the Dynamic Theory

In the last ten to fifteen years, there has been intensive development of the important problems of the dynamic theory. We cannot dwell here at length on this work, so we will confine ourselves to certain specific results, from the point of view of their possible use in structure analysis. The dynamic theory of scattering leads to a completely new method of determining the coefficients $v_{hkl}$, and, subsequently, also the phase of the structure amplitude, based not on measuring the intensities but on the geometry of electron-diffraction patterns. This result cannot be understood on the basis of the kinematic theory. Determination of the coefficient $v_{hkl}$ can be accomplished in two different ways. The first consists of using the fine structure of reflections observed on electron-diffraction patterns from convergent beams and was indicated by MacGillavry (10). As shown by Eq. (14), the diffraction maxima intensities, for a fixed value of crystal film thickness $H$, prove to be a periodic function of the angle of divergence from exact reflection, $\theta - \theta_0$. In other words, in the presence of the proper collection of angles of incidence, the diffraction maxima have a line (or band) structure. A minimum in the function (14) is observed if

$$\tfrac{1}{2}kH\sqrt{(\theta_{\min}-\theta_0)^2\sin^2 2\theta_0 + v_h^2/k^4} = m\pi \qquad (26)$$
or, since $\sin 2\theta_0 \approx 2\sin\theta_0 = \lambda/d_{hkl}$, then
Thus, from the angle of scattering between the maxima (or minima) of the fine structure of a given reflection, it is possible to determine $v_{hkl}$ and also the structure amplitude. The necessary conditions for the formation of this pattern, a fixed value of $H$ and a collection of angles of incidence, are realized in the convergent beam method of Kossel-Mollenstedt (11). If a magnetic lens is used in the electron-diffraction apparatus to focus the electron beam, not on the photoplate or screen as in the usual method of recording (according to Lebedev), but on the plane of the film, for example, mica, then we will have a thin convergent incident beam. In this case, the thickness of the mica will remain constant, with sufficient accuracy, in a small part of the focal spot. The method was tried and gave sufficiently good results for mica. Another method of determining the structure amplitude modulus is based on the variation of the reflection conditions for electrons passing through materials with nonparallel boundaries. As was shown above, the
dynamic theory leads to two waves inside a crystal, for the incident beam and for the reflected beam, from a given system of planes. In the case of a plane-parallel film, these waves, going from the crystal to vacuum, add, since their directions in vacuum coincide. If now the plane of escape into vacuum is nonparallel, then the waves $\psi_g'$ and $\psi_g''$ diverge, and an effect analogous to double refraction in crystal optics is observed. In the most simple experimentally realizable case, transmission through a cubic crystal of MgO, all the calculations were successfully carried out (12). If the crystal is oriented, for example, with the direction [111] parallel to the electron beam, then the indicated splitting of the reflected beam takes place at the limit of some boundary; the result of this is to produce a combination of beams arranged in a completely definite form. From the distance between beams forming the fine structure in a single reflection, it is possible to calculate the average inner potential $v_{000}$, and also the value $v_{hkl}$ corresponding to a given reflection (hkl). Recently (13) it has been shown that not only the modulus but also the phase of the structure amplitude may be determined from geometrical electron diffraction; namely, from the anomalous effect on a Kossel-Mollenstedt picture in a convergent beam and for the intersection of Kikuchi lines. Development of the theory was verified on electron-diffraction patterns of graphite. Finally, we have the very important work (14) in which it was shown that the well-known Friedel law is not fulfilled for reflection from a macromonocrystal. On the electron-diffraction patterns from a cleaved (110) crystal of sphalerite and the basal face (0001) of a crystal of MoS2, there is observed an asymmetry relative to the plane of incidence.
In the first case, the asymmetry appears to be the result of the absence of a center of symmetry in the structure, in the second case the presence of a polar axis in the centrosymmetrical structure, which is in accordance with analysis made on the basis of dynamical theory. Finally, it should be noted that a detailed study of the intensities of Kikuchi lines and bands also presents an interesting way of analyzing crystal structure. It has been pointed out that, in first approximation, it is possible to describe the intensities of Kikuchi lines with the help of atomic-type functions and structure amplitudes for incoherent interaction (15). In a new work (16) the connection between Kikuchi electron-diffraction patterns and diffuse thermal scattering of electrons in crystals has been established. The mentioned results of the dynamic theory of electron scattering and the experimental work confirming these results have value, in the main, for the principle involved, since they are related to a very special reflection condition. Still, there is no doubt that with further development of electron diffraction appropriate methods will be used in practice.
D. Process of Structure Determination from Electron-Diffraction Patterns

As was shown experimentally, in a majority of cases electron-diffraction patterns used in structure analysis correspond either to kinematic scattering, or, for a more detailed and careful comparison with theory, may be calculated by introducing the dynamic correction (25). We next note that a Fourier-series potential, in general, is only slightly sensitive to the divergence between the experimental and theoretical intensities, and allows us to localize atoms with ratios of Z numbers of 1:5, 1:6, up to approximately 1:30 without introducing any correction for dynamic scattering. The stages of structure investigation can be briefly described in the following manner. When studying an unknown structure, the question of which system it belongs to, the period of the lattice, and the angle between axes (for monoclinic or triclinic systems) may be determined with the help of spot patterns from a mosaic monocrystal and, especially, electron-diffraction patterns from oblique textures, analogous to X-ray rotation patterns. The reflections from a standard specimen obtained on the same electron-diffraction picture are used to calculate the constant of the photoplate $L\lambda$. NaCl, MgO, NH4Cl, LiCl, or some other materials may be used as standards. The spot patterns of greatest visibility give the symmetry of the structure, but it must be kept in mind that each given photograph corresponds to only one plane of the reciprocal lattice. Therefore, the mechanism of extinction produces a possible source of error in indexing a pattern, and for the total characteristics of the spatial lattice of the crystal several similar electron-diffraction patterns are required. The period is calculated in a simple manner from the formula
and the sine of the acute angle, in the case of an oblique lattice, by means of the appropriate distance $r_{h00}$ on the photograph. It is also not difficult to calculate the value of the period in the case where the plane of the electron pattern does not coincide with any coordinate plane of the reciprocal lattice. Electron-diffraction patterns of the oblique texture type, in particular those obtained from oriented films consisting of lamellar crystals with disorderly azimuthal orientation, are, in general, more valuable. In the majority of cases, the face parallel to the substrate corresponds to (or may be selected as) a coordinate plane of the reciprocal lattice. If a third coordinate axis or principal direction in the lattice coincides with the axis of
the texture, i.e., with a direction perpendicular to the plane referred to, then the reflections on the photograph will be arranged according to layer lines and ellipses. The first correspond to a definite value of the index $l$, and the second to indices $h$ and $k$. In the case where a principal direction parallel to the axis of the texture is absent, only ellipses will remain. Similar photographs make it possible to determine the crystallographic system, period of the lattice, and the angle between axes, and, accordingly, to index all reflections in the case of an unknown structure of any symmetry. The periods in the plane of the film and the angles between them are determined by the method of trial from the distances in the zero layer line, and the third period and angle according to the formula
where $r_{h0\bar{l}}$, $r_{h0l}$, and $r_{h00}$ are the distances measured on the electron-diffraction pattern from the central beam to the corresponding reflection, and $c^*$ is the period in the reciprocal lattice. If we have a photograph with layer lines, then the formula for the period $c$ becomes much simpler:
For a large elementary cell with average or low symmetry, a reliable and accurate indexing of the reflections on the photographs of oblique textures becomes more difficult and requires adequate resolving power of the electron-diffraction apparatus. Wilman proposed the use of Kikuchi patterns to determine elementary cells, but this method has not as yet been used. In studying a multiphase system, the identification of a well-known phase and the determination of the period of the new structure is often accomplished using an electron-diffraction pattern of an intermediate or mixed type. Different phases exhibit rings from polycrystalline sections of the specimen and arcs or spots from crystals oriented in one way or another. A determination of the period of an unknown structure from a photograph of this kind can be successfully accomplished in the case of cubic, tetragonal, or partially hexagonal symmetry. In other cases (in particular for large periods) the problem is resolved partially by resorting to other data. The next stage of structural investigations is to determine the number of molecules or formula units $N$ in the elementary cell. Because of the negligible quantity of material usually being analyzed by electron diffraction, direct chemical analysis or a determination of density proves to be impracticable (or very difficult). Therefore, we are obliged to go to indirect
means for the identification and determination of $N$. For isomorphous structures or for structures containing the same components, we can conduct a partial, parallel X-ray investigation or some other, of which we will speak in a special section of the article. Proper structure investigation, i.e., determination of the distribution of atoms in the elementary cell, testing for one or another structural model, or supplementary investigations such as locating the coordinates of light atoms, is not achieved by X-ray determination of a given structure, but requires the use of experimental electron-diffraction intensities. As is well known, the basic method of measuring intensities at present is photographic. This method has the advantage of surveying and recording on one photograph all interference patterns with short exposure. The photographic method achieves its greatest accuracy in the measurement of electron-diffraction patterns from polycrystals. In many cases, however, such patterns fail to record, and, furthermore, for materials with average or low crystal symmetry, a considerable number of the rings coincide for two or three reflections. It is true that this occurs also for cubic lattices. The majority of oblique texture patterns are free of such coincidence and may be traced on a recording microphotometer such as type MF4. For reliable evaluation and, even more, for quantitative measurements, it is necessary to use a series of electron-diffraction patterns taken with increasing exposure, since the latitude of photographic emulsions is insufficient to accurately record the entire range of intensities occurring in an electron-diffraction pattern. For visual evaluation, it is possible to use the well-known 10-value scale (17). More accurate results may be obtained by using a nomogram (18) and, finally, by drawing a blackening curve for each maximum and the value of the background under the maximum taken from the entire series of microphotometer traces.
In this way (19) each evaluation of the intensity of a maximum with respect to the blackening lies in a rectilinear part of the photographic curve. The blackening curve is plotted from the relation
$$D = k\log(It) \qquad (31)$$
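Relation (31) suggests a simple way to reduce a series of exposures of the same maximum to a single relative intensity. The sketch below is only an illustration under stated assumptions: the logarithm is taken to base 10, all points are assumed to lie on the rectilinear part of the curve, the fit is an ordinary least-squares line, and the exposure times and blackenings are synthetic, not measured data.

```python
import math

def fit_blackening_curve(exposure_times, blackenings):
    """Least-squares fit of D = k*log10(I*t) = k*log10(t) + k*log10(I).

    Returns (k, I): the slope of the rectilinear part and the relative intensity.
    """
    xs = [math.log10(t) for t in exposure_times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_d = sum(blackenings) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxd = sum((x - mean_x) * (d - mean_d) for x, d in zip(xs, blackenings))
    k = sxd / sxx
    intercept = mean_d - k * mean_x   # equals k*log10(I)
    return k, 10.0 ** (intercept / k)

# Synthetic series for one maximum: hypothetical exposures with k = 1.2 and I = 3.0.
times = [1.0, 2.0, 4.0, 8.0]
series = [1.2 * math.log10(3.0 * t) for t in times]
k_fit, intensity_fit = fit_blackening_curve(times, series)
```

Repeating the fit for each maximum (after subtracting the background blackening) gives intensities on a common relative scale, provided the same slope k applies across the plate.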
By careful execution this method can provide an accuracy of 5-10% in the measurement of intensity. To obtain this, it is necessary to arrange several series of measurements. Undoubtedly, what should be aimed at is to obtain experimental intensity data for the maximum possible number of reflections, in particular, for the maximum number of measurements of $(\sin\theta)/\lambda$. Usually, weak reflections, among them reflections for large $(\sin\theta)/\lambda$, cannot be measured from a microphotometer trace within the specified limits of accuracy. Note that for many structures the problem of adequate location is considerably less accurate than the values of the experimental intensities. Nevertheless, the final verification of the structure, obtaining the most
accurate atomic coordinates, in particular for light elements, and the investigation of much finer structure problems require the use of the indicated method. It should be emphasized that the statement in Sec. II,A of the theory of scattering in crystals permits the use of the indicated limit of accuracy of photographic measurements. In addition to the photographic method there are described in the literature different methods of electrical registration (90,91): the use of a Faraday cylinder with electrometric registration, counters, CdS crystals using photoelectric multiplication, and others. At present these methods still cannot compete with the photographic method, although they undoubtedly make it possible to obtain more accurate data. It is necessary to convert the experimental intensity $I_{\mathrm{obs}}$ to the structure factor $|\Phi_{\mathrm{obs}}|^2$ and to the structure amplitude $\Phi_{\mathrm{obs}}$ in order to use the experimental intensities for structure analysis. To do this, it is necessary to take into account the angle factor $L$ and the frequency factor $p$:

$$I_{\mathrm{obs}} \sim p\,L\,|\Phi|^2 \qquad (32)$$

Equation (32) corresponds to kinematic scattering. As was shown by Vainshtein (29), the angle factor $L$ remains invariable for different types of electron-diffraction patterns for the transition from kinematic to dynamic scattering. For electron-diffraction patterns from mosaic single crystals $L \sim d_{hkl}$, for polycrystals $L \sim d_{hkl}^2$. For oblique texture patterns it is possible to use values $d_{hk0}\,d_{hkl}$, where $d_{hk0}$ corresponds to the reflection of that ellipse lying on the zero layer line. The frequency factor $p$ is easily determined from the mechanism of formation of electron-diffraction patterns. The question of the applicability of formula (32) to a given electron-diffraction pattern and of the necessity of introducing the dynamic correction has already been solved to define structures more precisely. For the first stage, it is possible to use the kinematic expression with confidence. As is known, structure analysis investigations, at the present time, use a method of trial and geometrical analysis of the structure as well as Fourier synthesis of $\Phi^2$ and $\Phi$ values (in X-ray diffraction $F^2$ and $F$) obtained from experiment. For individual steps of the investigation and for the final verification of the determined structure, comparison is made between the experimental and theoretical structure amplitudes. To do this, it is already necessary to examine the applicability of the kinematic formula (32), chiefly for strong reflections (23). The experimental $\Phi_{\mathrm{obs}}$ can be normalized to a theoretical $\Phi_{\mathrm{calc}}$ by using the sum of the average or the average and weak reflections. After this, the ratio
-
=
-
can be calculated for all hkl and plotted on a graph as a function of
Evidently, in the case of pure kinematic scattering the experimental points corresponding to the ratio on the left side of (33) lie close to a value of one, since for small $A$
If now partial dynamic scattering occurs, then with increasing $\Phi_{\mathrm{calc}}$ the points gradually fall below one. By introducing the correct value of

$$A = \lambda\,\frac{|\Phi|}{\Omega}\,H \qquad (35)$$

determined from the graph of $D(A)$, we can calculate the corresponding value of the thickness $H$. This thickness, as a rule, agrees neither with the value obtained from the half-width of the maximum profile, nor (apparently) with the value calculated by Fourier analysis of the line profile. For a certain average $H$ we can calculate $\Phi^2_{\mathrm{obs,corr}}$ (corrected), from which $|\Phi_{\mathrm{obs,corr}}|$ is determined. In this way can be obtained, as in the case of pure kinematic scattering, good convergence of the experimental and theoretical amplitudes for partial dynamic scattering after introducing a correction for the presence of the proper structure coefficient. For the value of the reliability factor
just as in contemporary X-ray structure work, the author obtained 10-15% in some recent investigations (20,23). An important method of investigation is the use of $\Phi^2$ and $\Phi$ series. The $\Phi^2$ series, just as in X-ray analysis, corresponds to the interatomic vectors and is particularly useful in the initial stages of structure analysis. The $\Phi$ series corresponds to the distribution of potential in a given structure. The maximum potential, in the same way as the maximum electron density, corresponds to the center of gravity of the atom. To investigate and analyze a given potential distribution it is essential to normalize it to some units, for example, volts. By transforming the value of the lattice potential in each point $\varphi(x,y,z)$ into a three-dimensional Fourier series
multiplying both sides of (37) by $e^{2\pi i(\mathbf{r}\cdot\mathbf{h})}$ and integrating with respect to the volume of the reciprocal lattice, we obtain

$$\int \varphi(x,y,z)\,e^{2\pi i(\mathbf{r}\cdot\mathbf{h})}\,d\tau = \varphi_{hkl}\,\Omega \qquad (38)$$
On the left we have the quantity
where $k = -2\pi m e/h^2$, so that
Substituting the value of $2\pi m e/h^2$ and changing from units of potential to volts we obtain

$$\varphi(x,y,z) = \Phi(\mathrm{cm}) \times 4.78 \times 10^{8}/\Omega(\mathrm{\AA}^3)\ \text{volts} \qquad (40)$$
If now $\Phi_{hkl}$ is taken from the table in p units, then the transition coefficient turns out to be $114.5/\Omega(\mathrm{\AA}^3)$. A number of theoretical questions connected with the Fourier representation of potential have been treated by Vainshtein (9). The height of the maximum potential or the potential in the center of an atom can be represented by the formula
where $k$ has the value defined above and $f_{e,T}(s)$ indicates that the atomic amplitude has already been multiplied by a temperature factor. The integration extends over all values of $s$, although, in fact, $f_e(s)$ is used only up to $s = 15 \times 10^{8}$ cm$^{-1}$ [$(\sin\theta)/\lambda = 1.2 \times 10^{8}$ cm$^{-1}$] in correspondence with experimental observations of reflections. As was shown by B. K. Vainshtein, the value of $\varphi(0)$ for different atoms may be approximated by a formula of the type

$$\varphi(0) = 114.5\,k\,Z^{a}\,g\ \text{(volts)} \qquad (42)$$
where $k$ and $a$ depend on the assumed value of the coefficient $B$ in the temperature factor, and $g$, moreover, is determined by the cutoff value of the series using $f_e(s)$. The value of this factor for $B$ fluctuating from 1 to 4 is given by this author in a table. It is obvious that if the data in the table for $k$, $a$, and $g$ are used to calculate a new value of $f_e(s)$, then a correction must be applied. Formula (41) allows us to calculate the heights of maxima corresponding to a neutral, free atom of a given element. For this we use the theoretical (tabulated) values of $f_e(s)$. From a comparison of the theoretical and experimental results, we can (in principle in every case) form an opinion about the valence state of the atoms in the structure and, therefore, about the nature of the chemical bonding. Actually, in the presence of excess positive or negative charge, the value of the integral in (41) varies due to the variation of the quantity

$$f_e \sim (Z - f_x) \qquad (43)$$
For a cation the experimental height of a maximum exceeds the theoretical value, and for an anion it is less than the theoretical value. A difficulty arises in the practical application of the given method (do)
owing to the bad convergence of the integral in (41) for values of the temperature factor corresponding to experimental conditions. Therefore, for numerical integration we have to introduce an artificial temperature factor. Furthermore, what appears to be more essential is that the amplitude for ions is notably different from the amplitude for free atoms only for $(\sin\theta)/\lambda < 0.2 \times 10^{8}$ cm$^{-1}$, in a region containing usually one to two reflections. As a result, the function $\varphi(0)$ is only slightly sensitive to the disturbance of the neutrality of the scattering atom. Instead of the function $\varphi(0)$, we can compare with theory the distribution from experimental $f_e(s)$ in the region of small $(\sin\theta)/\lambda$. This method undoubtedly can provide a definite indication concerning the nature of the binding in a given structure, for which increasing experimental accuracy and precision of scattering theory are required. Analysis of the potential peak heights may have another use, in particular, for the study of semiconducting alloys. It is well known that many intermetallic and semiconducting phases appear to be phase-variable compounds in the sense that on the state diagram they correspond to a certain finite range of concentration. Also, it is often still unclear whether an investigated structure is a phase of constant or variable composition. In electron-diffraction investigations of films or layers prepared by sublimation in vacuum, it is difficult (often impossible) to specify the composition of the condensate prepared from an alloy. In such a case, the heights of the potential peaks or their ratio may help in determining the composition or the ratio of components. It is evident that the function $\varphi(0)$ has sufficient sensitivity for a variation of the content of given elements. The same goal can be achieved by minimizing the reliability factor (in the presence of variation we assume a value to calculate the composition).
We cannot take the time here to consider other important questions of Vainshtein's (2) theory, as for example, its use in analyzing the accuracy of other characteristic integrals such as (41) developed for light atoms in the presence of heavy or other atoms.
E. Experimental Techniques

The experimental techniques of electron-diffraction analysis, including the apparatus for obtaining and registering electron-diffraction patterns and the methods of preparing specimens, have been considerably developed. A diversity of apparatus has been developed for the solution of different problems, the details of which would take too much space. If we do not concern ourselves with apparatus for investigating molecular structures in vapor, recent electron-diffraction apparatus for investigating solids can be divided into two basic types, in correspondence with the two important directions of work. One of them is the apparatus for structure investigations and other practical uses of electron diffraction, and the other is the apparatus for investigating fine structure connected with dynamic scattering and certain other effects. Obviously, the study of fine structure requires considerably higher resolving power; in the case of structure studies, high resolution can cause difficulties in the treatment of experimental intensity data. Resolving power in electron-diffraction work can be expressed as the minimum value of the difference between two interplane distances $\Delta d_{hkl}$ which still may be separated. This value
is less for distant reflections, i.e., on the edge of the electron-diffraction pattern. As is evident, one identifies sharpness of image and dispersion of apparatus. Sharpness depends both on the nature of the apparatus and on the resolving power of the specimen. In many cases, small crystals or the presence of heterogeneity, analogous to that which in X-ray diffraction is connected with strains of the second type, cause more or less considerable broadening of the reflection. I n investigating materials consisting of lamellar crystals, there is sometimes observed a deterioration of the resolving power of the specimen for a change to reflections from planes forming small angles with the base. For a relative resolving power of d / A d = 1,000 it is possible, in the majority of cases, to separate nearly all reflections, even on photographs from polycrystals. With higher resolution the fine structure of Debye lines or other types of reflections is revealed, and complications arise in the measurement and indication of electron-diffraction patterns. The type of apparatus used a t present for different electron-diffraction investigations in the U.S.S.R. (26) is made by a number of manufacturers as type EG after the model developed a t the Institute of Crystallography of the Academy of Sciences of the U.S.S.R. (Figs. 1 and 2). It occupies a space of 2.1 X 0.73 meters (apart from the forevacuum pump); i t has two specimen chambers, making it possible to photograph for distances L from the specimen to photoplate of 700 mm and 250 mm. To accelerate the electrons, there are four steps of high voltage up t o a maximum of 75 kev. The resolving power of the apparatus is such that lines corresponding to a difference in the interplane distance of Ad = 0.001 A (for small d, i.e., d / A d = 500-700) can be separated; the apparatus registers lines in an interval d from 20 A to 0.45-0.42 A or sin e / k = 1.2 x 108 em-'. 
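The figures quoted for the type EG apparatus can be cross-checked with two elementary relations. The sketch below assumes Bragg's law, which gives (sin θ)/λ = 1/(2d), and the usual small-angle relation r = Lλ/d between ring radius and interplane distance; the wavelength of 0.042 Å is the value used for Table I, and the function names are illustrative.

```python
def sin_theta_over_lambda(d):
    """(sin theta)/lambda in inverse angstroms for interplane distance d (angstroms)."""
    return 1.0 / (2.0 * d)

def ring_radius(camera_length_mm, wavelength, d):
    """Distance from the central beam to a reflection, r = L*lambda/d (small angles), in mm."""
    return camera_length_mm * wavelength / d

def relative_resolution(d, delta_d):
    """Relative resolving power d / (delta d)."""
    return d / delta_d

limit = sin_theta_over_lambda(0.42)        # about 1.19 per angstrom, i.e. about 1.2e8 per cm
radius = ring_radius(700.0, 0.042, 0.42)   # position of the smallest registered spacing
resolution = relative_resolution(0.6, 0.001)
```

The smallest registered spacing of 0.42 Å indeed corresponds to (sin θ)/λ of about 1.2 × 10^8 cm⁻¹, and a separation of 0.001 Å at d of 0.5 to 0.7 Å corresponds to the quoted relative resolving power of 500 to 700.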
Furthermore, for visual examination of the specimen, one can observe a shadow electron-microscope picture at a magnification of somewhat over one hundred. The filament in the electron gun can be adjusted relative to the aperture
Z. G. PINSKER
FIG. 1. Type EG electron-diffraction apparatus constructed at the Institute of Crystallography, Academy of Sciences (U.S.S.R.); made by the Moscow Mechanical Engineering School. General form.
FIG. 2. Scheme of construction of the type EG electron-diffraction apparatus.
in the Wehnelt cylinder by several displacements. Furthermore, it is possible to adjust the gun and anode relative to the body of the apparatus. There is a tantalum diaphragm in the anode and another diaphragm in the end of the collimator tube near the entrance to the central chamber. A magnetic lens is used to focus slightly divergent electron beams.
ELECTRON DIFFRACTION STRUCTURE ANALYSIS
The crystal holder is adjustable for transmission or reflection under different conditions and provides displacement of the specimen of ±20 mm in two mutually perpendicular directions in a plane perpendicular to the beam, and rotation about two perpendicular axes. This arrangement makes it possible to use specimens of large area and to make a dependable survey on the phosphor screen of the diffraction from all parts of the specimen. Furthermore, the crystal holder allows one to obtain transmission pictures for a considerable angle of deflection from the position in which the beam is perpendicular to the film. For reflection work the azimuthal deflection of the reflecting surface can be used. The central chamber of the apparatus, in which the crystal holder is usually mounted, has provision for introducing a heater so that the specimen can be annealed or patterns can be obtained at elevated temperatures. A special arrangement for conducting electron-diffraction investigations at low temperatures can be introduced into the same chamber. Auxiliary ports in one wall of the central chamber make it possible to produce two simultaneous evaporations inside the apparatus and to measure the temperature of different sections of the specimen with a thermocouple or resistance thermometer. There is also an electron gun producing slow electrons to discharge the surfaces of dielectric specimens in reflection work. As already mentioned, besides taking patterns at L = 700 mm, the type EG electron-diffraction apparatus has provision for taking patterns at L = 250 mm by changing from the standard central chamber to an intermediate one. The short distance makes it possible to observe and investigate Kikuchi electron-diffraction patterns and different effects in large angular intervals. The basic method of recording in the type EG is on stationary 13 × 18 cm photographic plates.
For elementary examinations and studies of diffraction patterns, for indexing of reflections, and for approximate microphotometric measurements of intensity, one or two pictures, or more if required, can be taken on one plate. For quantitative measurements of the intensities of electron-diffraction patterns, 8 to 12 patterns in the form of narrow strips can be obtained on one plate by using multiple exposures. To obtain more than one pattern on a plate in the subsidiary chamber, there is a shutter arrangement allowing regulation of the width of the slit in the path of the diffracted beam. When photographic plates are changed, the vacuum in the apparatus goes down only to to mm Hg, which is essential when investigating materials which react with components of the atmosphere. The possibility of using partial electrical registration of electron-diffraction patterns in the type EG apparatus was shown by I. I. Yamzin in our laboratory (19). Furthermore, with a simple adjustment, recordings can
be made on continuously moving film, i.e., kinematic electron diffraction can be used (27) (see below). The vacuum in the apparatus is produced by standard methods: one forevacuum pump and one diffusion pump. Evacuation of the system from atmospheric pressure to the working vacuum of mm Hg is accomplished in 5-7 min. We shall not discuss here apparatus already described in the literature, or industrial instruments now or formerly manufactured, such as the apparatus of Finch (28), the diffraction apparatus of Trub-Tauber, or the apparatus constructed by GOI in Leningrad (29).

Essentially new is the kinematic method of taking patterns on a moving plate or photofilm (27). Photographs of the diffraction pattern are taken through a narrow slit about 0.5 mm wide. If a pattern is generated from a polycrystalline specimen, then a series of lines is recorded on the moving photographic layer, symmetrical relative to the line from the central beam (more accurately, the central stripe of greatest blackening). A phase change, for example the structure change in a film of an alloy brought about by a variation of temperature or composition, can be observed through the disappearance of one group of lines and the gradual appearance of another group. If the variation in the corresponding state parameter is recorded at the same time, then it is possible, as claimed by the author (27), to determine the thermodynamic (or kinetic) regions of existence of a distinct phase, and the temperature, concentration, etc., responsible for the phase change.

1. Methods of Preparing Specimens. The methods of preparing specimens for the investigation of different substances in reflection and transmission have been considerably developed. Some of these methods are described below in connection with one or another investigation. Here we limit ourselves to some general remarks.
To investigate freshly cleaved surfaces of single-crystal specimens of Ge, Si, quartz, and others, grinding and polishing are used to eliminate rough contours and to level the surface. To expose the crystal structure of a ground or polished surface, the surface is etched with a suitable reagent. Proper annealing is necessary to relieve plastic deformation in the surface layer. Metallic surfaces can also be treated by electrolytic etching or by ion bombardment. Parallel electron-microscope investigations, although effective, and in particular methods lying on the border between electron diffraction and electron microscopy, for example the method of dark-field images (30, 31), will not be gone into here. Structure investigations using transmission may be conducted with great success. One of the very important methods of preparing thin films is sublimation in vacuum and condensation on some surface. In the sublimation of condensed layers of pure components, different structures are
obtained by controlling the temperature and the rate of evaporation, the distance to the condensing surface, and the type or structure of the condensing surface. This also applies to the sublimation of stable compounds which do not decompose during evaporation. Materials such as Ge, Si, Se, Sb, and also two-component (now and then more) phases, for example GaSe and Sb&, are usually amorphous when condensed and give the characteristic diffuse diffraction rings. To obtain a crystalline structure, the condensate can be annealed in vacuum or in a neutral gas atmosphere. Certain materials crystallize only at sufficiently high temperatures (Si, about 600 °C); others require more or less prolonged heating at temperatures higher than 100-200 °C. Annealing in vacuum is less convenient than, for example, annealing in an atmosphere of inert gas, since raising the annealing temperature above 200-300 °C in itself tends to evaporate part of the volatile components, and, in the case of materials with high transformation temperatures (above 500-550 °C), it is not possible to do the annealing on crystals of sodium chloride. During annealing several stages in the process of crystallization are observed. At first the crystalline phase consists of an unoriented polycrystalline layer with gradually increasing crystal size; after this it orients with the formation of a lamellar texture without azimuthal order, and then changes to a mosaic crystal. In certain cases a phase change occurs with the formation of another structure. The process of crystallization, ordering of the atomic structure, and formation of texture as a rule goes noticeably faster on the cleavage face of single-crystal NaCl or other similar material. In certain cases, crystallization is not observed at all for sublimation on an amorphous substrate.
Sometimes, however, the use of NaCl as a substrate is inconvenient, since it hampers the production of completely irregular polycrystalline films or laminated textures without azimuthal orientation. For the investigation of more complex phases, formed as unstable compounds, the problem of obtaining sublimated specimens having the composition and structure of the original alloy becomes more difficult, and in certain cases impracticable. Similarly, compounds are often volatilized in vacuum from the liquid phase in the form of separate components in the order of increasing temperature of evaporation. To obtain films of the same composition, it is necessary to make sure (if possible) that all the components of the molecular beam are totally condensed, so that in the subsequent annealing homogeneity is obtained; i.e., the original phase is synthesized. To obtain films of two or more component alloys, it is possible jointly to condense a molecular beam of two or even three components onto a common surface. With this method, one can assign a definite relationship to the concentration of components (with low accuracy) if it is known that full
condensation of the atoms takes place without reflection. In the same way, one can obtain alloys with an excess of one or another of the components, for example, to investigate the concentration range, in a specimen, of a given phase on the state diagram. The number of structure studies using this method, in particular of phases in two-component (and partly three-component) systems, is increasing. Some work using kinematic recording has also been reported. It should be noted that these investigations do not always take into account that, even in the small crystals formed in thin films, the achievement of an equilibrium state and a stable structure requires a certain time. However, there is also interest in obtaining and investigating different transitional and metastable structures. It should be understood that, besides sublimation in vacuum, specimens are prepared in different investigations by precipitation from a suspension or colloidal solution and also by crystallization from a solution. Furthermore, many specimens obtained by one or another treatment of the film, for example in a gaseous medium or a solution, are first prepared by one of the described methods.
III. INVESTIGATIONS OF THE STRUCTURE OF SEMICONDUCTING PHASES

As already indicated in the introduction, electron-diffraction determinations of structure are usually performed with thin sublimated films, which in many respects are of interest in semiconductor technique.

A. Films of Pure Substances
If we first consider films of pure components having some semiconducting property, then in this case it is important to determine the presence of amorphous or crystalline structure and to study the process of crystallization and the character of crystal orientation. A number of authors have studied the structure of sublimated films of Sb (1, 32-34). It has been established that for condensation from the vapor onto an amorphous substrate such as celluloid, an amorphous film of Sb is first formed at room temperature. Its characteristic electron-diffraction pattern consists of somewhat diffuse rings. In the work of (32), such electron-diffraction patterns were used to investigate the short-range order, or the atomic structure, of amorphous films by the method of integral analysis. Experimental curves of intensity were plotted (Fig. 3) after careful microphotometer measurements of the intensity of the diffuse rings and the use of a darkening curve. After this, with the use of expression
$$4\pi r^{2}\rho(r) = 4\pi r^{2}\rho_{0} + \ldots \qquad (45)$$
FIG.3. Experimental curve of scattering intensity from a film of amorphous Sb; f' curve of the atomic factor for Sb.
FIG.4. Radial distribution of atoms in amorphous Sb.
a curve of the radial distribution (Fig. 4) was obtained, i.e., the value of 4πr²ρ(r) as a function of interatomic distance r. Since in this calculation the
experimental intensity curve was normalized to absolute intensity by using the curve of the atomic factor f_e(sin θ/λ) for antimony, the number of nearest-neighbor atoms can be obtained from the area under the maximum of the radial distribution curve. Comparison of the interatomic distances and the number of nearest neighbors obtained in this work with the analogous X-ray data determined by Hendus (35) shows a correspondence, and also shows the
much higher resolving power of the electron-diffraction method. In the amorphous state Sb possesses a much higher coordination number than in the crystalline state, which, in general, agrees with the information given in the literature concerning the increase of coordination number on liquefaction of lattices with homopolar or partly homopolar bonds. It is possible to make continuous observations on a fluorescent screen of the electron-diffraction pattern from a film of antimony during the process of evaporation in the central chamber of the electron-diffraction apparatus. With increasing film thickness (above about 1 × 10^(…) cm), lines of the crystalline phase suddenly appear, and after this the diffuse rings disappear. It is essential to note that the angle of scattering (or interplanar distance) responsible for the diffraction from the crystalline phase does not agree with the angle of scattering from the amorphous phase. Thus, it is not possible to connect the change from diffuse to sharp rings with the process of growth of small crystals or with an increase in resolving power. Moreover, there are indications that in films of Sb giving a diffuse electron-diffraction pattern the method of dark-field electron microscopy shows signs of crystalline regions or particles (36). Different authors have studied in detail the conditions under which films of Sb become crystalline, in particular the effect of temperature and the nature of the substrate (34). Rapid sublimation or heating of an Sb film easily brings about the orientation of crystallites with their bases, i.e., the plane (0001), parallel to the substrate with random azimuthal distribution (see Fig. 5). Another, more rarely observed (37) type of orientation occurs on the cube face of rock salt, with the axis $[11\bar{2}1]$ perpendicular to the substrate, for two equiprobable azimuthal positions:

$$[0001]_{\mathrm{Sb}} \parallel [100]_{\mathrm{NaCl}} \quad \text{or} \quad [0001]_{\mathrm{Sb}} \parallel [010]_{\mathrm{NaCl}} \qquad (46)$$
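The integral analysis of amorphous Sb described earlier in this subsection can be illustrated numerically. Only the first term of expression (45) survives in this text, so the sketch below assumes the standard Fourier-transform form of the radial distribution, 4πr²ρ(r) = 4πr²ρ₀ + (2r/π)∫ s·i(s)·sin(sr) ds, with s = 4π sin θ/λ and i(s) the normalized (reduced) intensity. That form, the function names, and the synthetic single-distance intensity with an illustrative spacing of 2.9 Å are assumptions of the sketch; they are not data or notation from the work cited as (32).

```python
import math


def radial_distribution(r, s_grid, i_grid, rho0):
    """Return 4*pi*r^2*rho(r) from the reduced intensity i(s) by trapezoidal integration.

    Assumed form: 4*pi*r^2*rho0 + (2r/pi) * integral of s*i(s)*sin(s*r) ds.
    """
    integrand = [s * i * math.sin(s * r) for s, i in zip(s_grid, i_grid)]
    integral = 0.0
    for k in range(1, len(s_grid)):
        ds = s_grid[k] - s_grid[k - 1]
        integral += 0.5 * (integrand[k] + integrand[k - 1]) * ds
    return 4.0 * math.pi * r * r * rho0 + (2.0 * r / math.pi) * integral


def pair_interference(s, r0):
    """Synthetic reduced intensity of a single interatomic distance r0: sin(s*r0)/(s*r0)."""
    x = s * r0
    return 1.0 if x == 0.0 else math.sin(x) / x


# Synthetic "experiment": one interatomic distance of 2.9 angstrom (illustrative value only)
s_grid = [0.01 * k for k in range(2001)]               # s from 0 to 20 reciprocal angstroms
i_grid = [pair_interference(s, 2.9) for s in s_grid]

# Evaluate the curve on a coarse grid of r and locate its maximum
r_grid = [2.0 + 0.1 * k for k in range(21)]            # r from 2.0 to 4.0 angstrom
curve = [radial_distribution(r, s_grid, i_grid, 0.0) for r in r_grid]
peak_r = r_grid[curve.index(max(curve))]

# With no interference term the function reduces to the average-density parabola
flat = radial_distribution(3.0, s_grid, [0.0] * len(s_grid), 0.03)

print(f"peak of the radial distribution at r = {peak_r:.1f} angstrom")
```

In a real analysis the area under the first maximum, rather than its position alone, would be integrated to estimate the number of nearest neighbors, as the text describes.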
Measurements of the electrical properties of very thin films of Sb (33) show a sudden change in the electrical conductivity (by a factor of approximately 200) accompanying the change from the amorphous phase to the crystalline; there is also a change in the nature of the conductivity: amorphous antimony possesses electron conductivity, whereas crystalline antimony possesses hole conductivity. Finally, the change is accompanied by a change in the thermoelectromotive force from α = -100 to -120 μV/°C for amorphous antimony to α = -30 μV/°C for crystalline. Electron-diffraction investigations of Se (1, 38) in the form of thin films sublimated in vacuum have established that the amorphous phase is formed first; when the film is heated in vacuum, it apparently first forms a monoclinic modification which quickly changes (at 80-90 °C) to hexagonal, although in certain cases it is possible that the change to a hexagonal lattice takes place immediately.
FIG.5 . Oblique texture pattern from a film of crystalline Sb.
Tellurium (38) immediately forms a crystalline deposit having the well-known structure. At the same time, as a rule, a textured layer is formed. Crystalline particles of Te arrange themselves with the prism faces $\{10\bar{1}0\}$ parallel to the backing, with random azimuthal orientation. This type of orientation appears in electron-diffraction patterns obtained from such films when the plane of the film is at an angle to the original direction of the electron beam (Fig. 6). Comparison of this electron-diffraction pattern with pictures of the type of Fig. 5 from a film of Sb discloses the following differences. On the picture from Sb the zero layer line contains reflections of the type $hki0$, the first layer line reflections of the type $hki1$, and further layer lines have increasing index $l$. On pictures from Te the zero layer line contains the reflections $000l$ and $hh\overline{2h}l$; the first layer line contains $h,\ h+1,\ \overline{2h+1},\ l$, the second $h,\ h+2,\ \overline{2h+2},\ l$, etc., for any $h$, including $h = 0$. On the layer lines lying below zero, the reflections are arranged as $h+1,\ h,\ \overline{2h+1},\ l$; $h+2,\ h,\ \overline{2h+2},\ l$; etc.
FIG.6. Oblique texture pattern from a film of Te.
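The arrangement of Te reflections on the layer lines described above can be tabulated mechanically. The sketch below is illustrative only: it uses the standard Miller-Bravais condition i = -(h + k), and it assumes, as the listing in the text suggests, that the layer-line number for this texture equals k - h. The function names are hypothetical.

```python
def four_index(h, k, l):
    """Return the Miller-Bravais indices (h, k, i, l) with i = -(h + k)."""
    return (h, k, -(h + k), l)


def layer_line(h, k):
    """Layer-line number assumed from the listing in the text: n = k - h."""
    return k - h


def reflections_by_layer(max_index, l):
    """Group all reflections with 0 <= h, k <= max_index by their layer-line number."""
    table = {}
    for h in range(max_index + 1):
        for k in range(max_index + 1):
            table.setdefault(layer_line(h, k), []).append(four_index(h, k, l))
    return table


lines = reflections_by_layer(2, 1)
for n in sorted(lines):
    print(n, lines[n])
```

For l = 1 the zero layer line collects reflections of the type (h, h, -2h, l), while the lines n = +1 and n = -1 collect (h, h+1, ...) and (h+1, h, ...) respectively, matching the description in the text.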
When comparing the character of crystallite orientation in films of Sb and Te with their crystalline structures, it is impossible not to note a distinct connection between these factors. The layered character of the Sb structure, with the layers arranged parallel to the base, leads to the formation of lamellar crystals, which in turn contributes to the emergence of the described texture. The structure of Te, as is known, consists of spiral chains of atoms bound by homopolar forces. The chains are arranged parallel to the c axis and are bound to one another by weaker residual forces. In correspondence with this, crystalline Te has a prismatic habit and a texture corresponding to a combination of dispersed hexahedral pencils. The use of a hot crystalline substrate and a high annealing temperature leads to additional azimuthal orientation of crystalline Te. Thus, on a cube face of NaCl we observe orientation of the type

$$\{10\bar{1}0\}_{\mathrm{Te}} \parallel \{100\}_{\mathrm{NaCl}} \quad \text{and} \quad [0001]_{\mathrm{Te}} \parallel [100]_{\mathrm{NaCl}} \ \text{or} \ [010]_{\mathrm{NaCl}}$$

and on mica

$$\{10\bar{1}0\}_{\mathrm{Te}} \parallel \{0001\}_{\mathrm{mica}} \quad \text{and} \quad [0001]_{\mathrm{Te}} \parallel [11\bar{2}0]_{\mathrm{mica}} \qquad (47)$$
Cd and Bi orient in the same way as Sb, i.e., with the base parallel to the substrate. When sublimated films of variable composition Bi_xSb_y (39) are investigated, an irregular solid solution possessing the same structure as the components is invariably observed after protracted annealing. The lattice period of this alloy varies continuously in the interval between the lattice periods of Sb and of Bi. Oxidation of a film of Sb in air changes it to senarmontite, Sb4O6, having a cubic lattice, the crystals of which are oriented with the [111] axis perpendicular to the backing or to the basal plane of the Sb. Oxidation of Bi gives a cubic oxide with the periods I: 4.65 Å and II: 5.49 Å; the oxide crystals are usually oriented with the face of the cube parallel to the substrate.
B. Substances Having the Zinc Blende Structure

It is generally known that, among semiconductors, those having the ZnS-type structure are important. From a study of the appropriate compounds we note that some of them, such as ZnS, CdS, CdSe, and AgI, are formed in two crystalline modifications, sphalerite and wurtzite, but for the most part ZnS-type phases have only one of these structures. These data, obtained from X-ray analysis, do not always correspond to the structure of the same compounds in the form of sublimated layers. Up to now, electron-diffraction investigations have been conducted for only a few materials. For the study of sublimated layers of CdTe (40) the films were prepared by simultaneous sublimation of both components, as well as by sublimation
of the alloy. The considerably higher vapor pressure of the components compared with their compound makes it possible to obtain a film containing only CdTe. The condensed film has a polycrystalline structure giving an electron-diffraction pattern with diffuse rings. Annealing leads to recrystallization, and sublimation on a hot substrate gives either a so-called mosaic monocrystal or an oriented layer without azimuthal regularity. From the latter, finally, an oblique texture pattern is obtained, allowing reliable determination of the phase composition of the film. From X-ray data CdTe is known to have only the sphalerite, or zinc-blende-type, structure. In sublimated layers of this compound, however, it has been determined that crystals of two phases are present: cubic, having the sphalerite structure, and hexagonal, with the wurtzite structure. The sphalerite structure can be described as a cubic close packing of spherical atoms of the nonmetallic elements (S, Se, Te, Cl, Br, P, As, Sb) in which the atoms of the metals (Cu, Ag, Cd, Zn, Hg, Be, Al, Ga) occupy half of the tetrahedral holes. Along the direction perpendicular to the planes normal to [111], one observes an alternation of layers of nonmetallic and metallic atoms. Hexagonal wurtzite has an entirely analogous structure. Here the basic framework of nonmetallic atoms is hexagonal close-packed, and the metallic atoms again occupy layers in half of the tetrahedral holes. A change from the sphalerite structure to the wurtzite structure amounts to a reorganization of the cubic packing of nonmetallic atoms into a hexagonal one, and, therefore, it is natural to anticipate a corresponding polymorphism of the compounds referred to. At the same time, it is more or less accurate to write the following obvious relations (48)
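The relations numbered (48) are not reproduced in this text. A minimal numerical sketch of what they presumably express is given below, assuming the standard geometry of close packing: the hexagonal period a equals the nearest-neighbor distance within a cubic (111) layer, a_cub/√2, and the hexagonal period c spans two such layers, 2·a_cub/√3, so that c/a = √(8/3) ≈ 1.633. The function name is illustrative; the cubic period 6.46 Å is the value quoted for InSb later in this section.

```python
import math


def wurtzite_from_sphalerite(a_cubic):
    """Ideal hexagonal periods (a, c) corresponding to a cubic close-packed period a_cubic.

    Assumes ideal close packing: a_hex = a_cubic / sqrt(2), c_hex = 2 * a_cubic / sqrt(3).
    """
    a_hex = a_cubic / math.sqrt(2.0)
    c_hex = 2.0 * a_cubic / math.sqrt(3.0)
    return a_hex, c_hex


# Cubic period quoted in the text for InSb, in angstroms
a_hex, c_hex = wurtzite_from_sphalerite(6.46)
ideal_ratio = c_hex / a_hex

print(f"a = {a_hex:.2f} angstrom, c = {c_hex:.2f} angstrom, c/a = {ideal_ratio:.3f}")
```

Under these assumptions the computed periods agree with the hexagonal InSb values quoted in the text (4.56 and 7.46 Å) to within about 0.01 Å, and the ratio reproduces the ideal value 1.633 mentioned for close-packed spheres.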
For CdTe (48) is valid with sufficient accuracy. For this compound c/a = 1.636, i.e., very close to the value 1.633 corresponding to an ideal close packing of spheres. It is well known that in the wurtzite-type structure there is one parameter, for example, the z coordinate of the Te atom if the origin is chosen at an atom of Cd. Using a Φ series for the experimental structure factors it was found that z = 0.368, i.e., a value close to the theoretical one of 0.375. It should be noted that a full investigation of the hexagonal modification was possible only by using the electron-diffraction patterns of oblique texture. In a study of the phases in films of CdS, CdSe, and CdTe, the author (40) has established that for the first two compounds the proportion of the hexagonal modification becomes greater with increased temperature, while for CdTe, inversely, the amount of cubic phase grows. It is also interesting to investigate the crystal orientation of these compounds in thin films. Thus, for condensation on a cleavage plane of mica a cubic crystal of CdTe orients in the following way:

$$\{111\}_{\mathrm{CdTe}} \parallel \{0001\}_{\mathrm{mica}} \quad \text{and} \quad [110]_{\mathrm{CdTe}} \parallel [11\bar{2}0]_{\mathrm{mica}} \qquad (49)$$
Note that an analogous orientation is observed for sublimation of Ag on a surface of silicon carbide (Sl), mica, or molybdenite. Another example of the investigation of compounds of the ZnS type is the work (49) concerned with the study of InSb films, which also had the object of investigating semiconducting materials (43). To investigate the structure of alloys in the system In-Sb in the form of thin films, the specimens were prepared as condensates from the component vapors in the same way as the alloys described earlier. Celluloid or rock salt was used as the substrate. For sublimation of the components on a common substrate, a layer of variable composition is obtained, depending on the ratio in which the components were evaporated. For a 1:1 ratio of In and Sb a film of InSb with the zinc-blende structure is formed; on either side of this zone we observe mixtures of InSb-Sb and InSb-In. The lattice periods of all these phases are constant and agree with the well-known data. Thus, practically no solid solution is produced in this system. During the sublimation of the alloy having the composition InSb, there was observed, in succession, the formation of Sb in the first stage of evaporation, after this a mixture of InSb-Sb, further on pure InSb, and finally a mixture of InSb-In. This result shows that electron-diffraction observation of the process in all its stages is very effective, since in many cases analogous methods which do not provide for direct structure inspection of the condensate introduce an error into the experiment. In indexing the electron-diffraction patterns obtained from polycrystalline films of InSb, there were found, besides lines corresponding to a cubic lattice (sphalerite type) with period 6.46 Å, some extra lines systematically appearing on the pictures. It was not difficult to establish that these lines belonged to a system of reflections from a hexagonal phase of the same composition with periods a = 4.56 and c = 7.46 Å.
This value is connected with the value for the cubic phase, 6.46 Å, by relation (48), just as in the case of CdTe. It must be pointed out that in other experiments in the laboratory of the Institute of Crystallography with annealed films of InSb having azimuthal crystal orientation we also invariably observe both phases, cubic and hexagonal, with the indicated periods. In addition to the phases of InSb, its semiconducting properties have also been studied (43). A series of samples with relative concentrations of In and Sb from 0 to 100% was made by simultaneous evaporation of the components onto glass surfaces. Measurements were made of the thermo-
emf and the electrical resistance of the films as a function of composition before and after annealing. Parallel with these measurements, an electron-diffraction check of the phase composition was made for the films on glass whose electrical properties were measured, and for films on celluloid obtained under analogous conditions. As a result of these investigations it was shown, just as in the previously described work, that in these films the composition changes in such a way that on either side of a 50% mixture of atoms there is a mixture of crystals of InSb with crystals of one or the other component. Condensation of the vapor on a cold surface resulted in the formation of amorphous Sb which, in the appropriate region of concentration, was mixed with crystals of InSb. The curve of the dependence of the thermo-emf on the specimen coordinate (i.e., the relative concentration of components) has, before annealing, one deep minimum (amorphous Sb) and two maxima. After annealing only the maxima remain, corresponding to the phase InSb. An analogous picture is given by the electrical resistance curve: two maxima before annealing and one after. Note that in this work we once more observe a stable amorphous state of Sb in sufficiently thick layers, which is connected with the disperse structure of the region representing a mixture with InSb.
C. The Systems Bi-Se and Bi-Te

The properties of the compounds and phases in these binary systems had been insufficiently investigated in X-ray work, and the available data, for example for Bi2Se3 and Bi2Te3, needed verification. S. A. Semiletov (44), in our laboratory, made a systematic investigation of these alloys in thin, sublimated layers. To analyze the structures, the method of close-packed spheres, developed by L. Pauling (45) and N. V. Belov (46), was used. Different phases were obtained either by simultaneous sublimation of the components, or by sublimation of alloys of one or another composition, or, finally, by condensation from sources consisting of the alloy and one of its components. In the majority of cases, the investigated films were obtained by condensation on the surface of cleaved sodium chloride or mica. Usually this produces a dispersed layer with insufficient resolving power, giving diffuse rings. Annealing in vacuum at low temperatures leads at first to recrystallization and after this to the formation of a film giving an oblique texture pattern. In such patterns a difference was frequently observed between the sharpness of the reflections hki0 (prismatic) and of the reflections hkil; the second were less sharp. This is connected with the lamellar form of the crystals, i.e., their insufficient thickness. It is obvious that such a particle form aids in the formation of
good orientation of the specified type. In certain cases mosaic monocrystals were formed. In the investigation of the system Bi-Se, the following phases were discovered. Sublimation of an alloy specimen of Bi2Se3 forms a film giving a clear electron-diffraction pattern of oblique texture and also a counterpart in reflection. It was found that the periods of the given phase, with hexagonal axes, were a = 4.12-4.17 ± 0.01 Å and c = 28.6-29.2 ± 0.2 Å, corresponding to the period a = 4.14 Å but to five times the period c = 5.71 Å which, according to X-ray data, correspond to the compound Bi2Se3. One must include three formula units ("molecules") in the elementary cell. Further, from the extinctions, it was established that the structure had rhombohedral symmetry. Analysis of the structure was made from geometrical considerations, and also by using Φ² and Φ series. It was determined that the structure can be described as a close nine-layered packing of Se atoms, i.e., with the identity period along the c axis enclosing nine atomic layers of Se. The atoms of Bi are arranged in the octahedral holes, filling two layers and leaving each third unfilled. If, as usual, we designate the sequence of layers by the capital letters A, B, C for the packing of Se atoms and by small letters for the layers of Bi atoms, then the expression for the structure has the form

b | A c B A c B a C B a C b A C b | A

where the lines show the period of the structure. The structure is described by the space group R3̄m. Atoms of Se occupy positions 3(a) and 6(c) with z_Se = 0.196. Bi atoms occupy position 6(c) with parameter z_Bi = 0.395. The values of the parameters were determined using one-dimensional cross sections of the three-dimensional Φ² and Φ series. In addition to the above phase, another type of electron-diffraction pattern was observed in the given series of experiments. From a detailed analysis, the author concluded that a phase Bi3Se4 was present, the structure of which was completely established by an analogous method. The periods of this phase are a = 4.22 ± 0.01 Å and c = 40.4 ± 0.3 Å. There are three formula units of Bi3Se4 in the elementary cell. The space group is R3̄m. Bi atoms occupy positions 3(a) and 6(c) with z_Bi = 0.428, and Se atoms occupy 6(c) with z1 = 0.139 and 6(c) with z2 = 0.286. The structure can be described as a 12-layered rhombohedral packing of Se atoms with Bi filling the octahedral holes in 3/4 of all the layers. Finally, from a series of spot electron-diffraction patterns it was determined that a third phase exists in this system, having the composition BiSe. The structure is of the NaCl type with a period a = 5.85-5.98 ± 0.02 Å. Table II gives the values of the principal interatomic distances determined in the work discussed (Fig. 7).
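The layer bookkeeping for the Bi2Se3 structure described above can be verified mechanically. The sketch below is illustrative and not taken from the cited work: it assumes the stacking period reads A c B A c B a C B a C b A C b (capital letters for Se layers, small letters for Bi layers), and the function name and the returned dictionary keys are hypothetical. It checks that neighboring layers never occupy the same site, that consecutive Se layers differ as close packing requires, and then counts the layers and the unfilled gaps between Se layers.

```python
def analyse_stacking(period):
    """Check a cyclic close-packed stacking period and count its layers.

    Capital letters denote anion (Se) layers, small letters cation (Bi) layers.
    Returns None if the sequence violates the packing rules, else a dict of counts.
    """
    layers = period.split()
    n = len(layers)

    # Neighbouring layers (cyclically) must not sit on the same site A, B or C
    for k in range(n):
        if layers[k].upper() == layers[(k + 1) % n].upper():
            return None

    anions = [x for x in layers if x.isupper()]
    cations = [x for x in layers if x.islower()]

    # Consecutive anion layers must differ, so that the cation holes are octahedral
    m = len(anions)
    for k in range(m):
        if anions[k] == anions[(k + 1) % m]:
            return None

    # A gap is empty where two anion layers follow each other directly
    empty_gaps = sum(
        1 for k in range(n)
        if layers[k].isupper() and layers[(k + 1) % n].isupper()
    )
    return {"anion_layers": m, "cation_layers": len(cations), "empty_gaps": empty_gaps}


BI2SE3_PERIOD = "A c B A c B a C B a C b A C b"
result = analyse_stacking(BI2SE3_PERIOD)
print(result)
```

The counts of nine Se layers and six Bi layers correspond to the three formula units of Bi2Se3 per cell stated in the text, and the three empty gaps correspond to every third octahedral layer being left unfilled.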
TABLE II. INTERATOMIC DISTANCES (Å)

Compound    Se-Se    Bi-Se        Te-Te    Bi-Te
Bi2Se3      3.30     2.99-3.07    -        -
Bi3Se4      3.30     3.30-3.10    -        -
BiSe        -        2.99         -        -
Bi2Te3      -        -            3.72     3.04-3.24
BiTe        -        -            -        3.23
FIG. 7. Electron-diffraction pattern from a film containing crystals of two phases, BiSe and Bi3Se4. The cross-shaped reflections are due to the normal pyramidal form of the BiSe crystals.
It was shown in a special investigation that the lattice periods of the three described phases fluctuate over a considerable range in different experiments. In order to determine the reason for this effect and the nature of the resulting solid solutions, an experiment was performed in which one or another composition and one of the components were sublimated onto a common substrate. By this means it was determined that the variation of the period reflects the tendency of these phases to form solid solutions with one or the other component. In this solid solution, additional atoms are introduced into the unoccupied octahedral holes, resulting in an increase in the lattice period. Evidently, such a solid solution can form only in the cases of Bi2Se3 and Bi3Se4, whose "ideal" structures have unoccupied octahedral holes. This does not apply to BiSe, whose maximum period (a = 5.98 Å) matches the composition accurately. On the other hand, a solid solution with an excess of Se and, correspondingly, with a deficiency of Bi appears to be a solid solution of subtraction, i.e., a loss of some of the Se atoms occurs and, therefore, the period diminishes compared with the normal lattice. Solid solutions of subtraction are observed for all three phases. By controlling the process of gradual variation of the film composition and observing this variation by electron diffraction, the author observed the resulting reorganization of the structure and the change of one phase into another. Analogous results were obtained in a study of the system Bi-Te (Fig. 8). The structures of Bi2Te3 and BiTe, which are entirely analogous to the corresponding phases in the Bi-Se system, were investigated. For Bi2Te3

a = 4.35-4.42 ± 0.01 Å;  c = 30.2-31.0 ± 0.2 Å;  z_Bi = 0.400;  z_Te = 0.212
For BiTe the period is a = 6.46 A (see Table II for the interatomic distance). Like the corresponding phases in the Bi-Se system, these structures are also able to dissolve certain excesses of one or the other component: an excess of Bi, entering by introduction, increases the lattice period, and an excess of Te (in Bi2Te3), by subtraction, decreases the period. This decrease is all the more remarkable since, apparently, the atomic radius of Te in the given compound is larger than the atomic radius of Bi. The phase changes taking place in these films on enrichment with one of the components have the same character as the processes which produce the solid solutions referred to. Thus, on additional sublimation of Bi onto a film of Bi2Se3 or Bi2Te3, the unoccupied octahedral voids are filled with atoms of Bi, and, when the proper composition is reached, there results a small reorganization and the production of a new phase: Bi3Se4 in the system Bi-Se, and BiTe in the system Bi-Te. From analysis of the experimental results, indications of the existence
FIG. 8. Oblique texture pattern from a film of Bi2Te3.
of structures BiTe2 and Bi3Te4 were discovered. In this connection, since the use of many semiconducting layers usually involves heating in air or in an atmosphere of oxygen at reduced pressure, a special experiment (47) was done in which sublimated films of Bi2Se3 and Bi2Te3 were heated in air. The processes taking place during the annealing were checked by electron diffraction. It was determined that annealing of the films caused, at first, substitution of oxygen for atoms of Se or Te, and after this a reorganization of the structure with the formation, apparently, of bismuth oxide, probably having the composition BiO with the NaCl-type structure. The
period of this structure gradually decreases from 5.65 A to a final value of 5.50 A (see above and also ref. 39). The electrical properties of the films are correspondingly changed.
D. Two- and Multicomponent Films Containing Tl, Sb, As, and Se

If the study of films containing Bi and Se and Bi and Te can serve as an example of the electron-diffraction investigation of two-component systems, we now turn to work which illustrates the possibilities of the method in investigating multicomponent films. A study of the phase composition and structure of such alloys was conducted by B. T. Kolomiitz and N. A. Goriunova, who synthesized suitable alloys and systematically studied their physical properties (48, 49, 50). This group of investigations was devoted to pseudobinary layers of the ternary system Tl-Sb-Se and also to layers of the pseudoternary system Tl2Se-Sb2Se3-As2Se3. From their study of the different alloys in the system Tl2Se-Sb2Se3, the indicated authors established that there is a specific point on the composition-property diagram corresponding to 50%, or the composition Tl2Sb2Se4. Electron-diffraction investigation (51) corroborated this conclusion. It was established that, together with the phases Tl2Se and Sb2Se3, there exists still another phase having the composition Tl2Sb2Se4. The structure of this phase was completely determined. To investigate the compound, a sample of the alloy was evaporated onto a celluloid film and afterwards annealed at a temperature a little above 100° C. In the case of Sb2Se3, the resulting structure was in complete agreement with the subsequently published X-ray work (52); it proved to be similar to the well-known structure of Sb2S3. The structure of Tl2Se was the subject of a special electron-diffraction investigation by Stasova and Vainshtein (53) on the basis of polycrystalline and oblique texture electron-diffraction patterns. The periods of the tetragonal cell, a = 8.52 ± 0.02 A and c = 12.6 ± 0.03 A, correspond exactly to the interplanar distances found in an X-ray investigation of this alloy. In the elementary cell there are 10 formula units of Tl2Se.
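As a quick numerical check on the cell data just quoted, the following sketch computes the volume of the tetragonal Tl2Se cell and the volume available to each formula unit. The function name is an illustrative assumption and is not part of the original investigation; only a, c, and the count of 10 formula units are taken from the text.

```python
def tetragonal_volume_per_formula_unit(a, c, formula_units):
    """Return (cell volume, volume per formula unit) in cubic angstroms.

    For a tetragonal cell the volume is simply a * a * c.
    """
    cell_volume = a * a * c
    return cell_volume, cell_volume / formula_units


# Values quoted in the text for Tl2Se: a = 8.52 A, c = 12.6 A, 10 formula units.
cell_volume, per_unit = tetragonal_volume_per_formula_unit(8.52, 12.6, 10)
print(f"cell volume = {cell_volume:.1f} A^3, per Tl2Se unit = {per_unit:.1f} A^3")
```

With the quoted periods this gives about 91.5 A^3 per Tl2Se unit, close to the 92 A^3 "molecular volume" of Tl2Se used later in this section.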
Analysis of the extinctions on oblique texture photographs leads to the space group P4/ncc - $D_{4h}^{8}$, which was adopted in the basic structure investigation, although two weak reflections, (101) and (111), could indicate the less symmetrical space group P4/n - $C_{4h}^{3}$ or $P4_2/n$ - $C_{4h}^{4}$. In the work 107 reflections were used, and the intensities were evaluated partly microphotometrically and partly visually from photographs with multiple exposures. In the first stage a projection and a two-dimensional cross section of the Φ² series were plotted, which made it possible to determine the general structural scheme. To make it more precise, a projection and a cross section of the potential were constructed. It was possible to describe the structure in the following way. Sixteen atoms of Tl
and 8 Se atoms in each elementary cell make up alternate layers which extend parallel to the plane (xy0) in the structure. In each layer an atom of Tl or, correspondingly, an atom of Se is located at a vertex of an elongated rhombus, stretched out along one or the other diagonal. Designating with the letters a or b layers with one or the other orientation of the rhombus, we have the following sequence of layers along the c axis:

Tl(a) Se(b) Tl(a) Tl(b) Se(a) Tl(b)

The remaining four Tl atoms and two Se atoms in each cell make up linear groups along the fourfold axes at 0 ½ z and ½ 0 z. In these groups, atoms of Se are distributed statistically, with weight ½, which denotes the possibility of variation of the phase composition toward an increased ratio of Se. Thus, using the described space group P4/ncc, the author obtained the following atom positions:

Tl: 16(g); x = 0.140, y = 0.148, z = 0.081
Se: 8(f); x = 0.340
Tl: 4(c); z = 0.25
Se: 4(c); z = 0
The important interatomic distances are

Tl-Tl from 3.24 to 4.71 A
Se-Se from 3.85 to 4.50 A
Tl-Se from 2.80 to 3.28 A

A complex phase, corresponding to the composition Tl2Sb2Se4 or TlSbSe2, is produced by sublimation of different alloys consisting of x·Tl2Se and y·Sb2Se3. In its pure form it is obtained by sublimation of alloys with x = y = 1. Following initial heating (~100° C) in vacuum, an oriented film is formed which gives a good oblique texture pattern (Fig. 9). The structure determination was done with the help of such photographs. In order to make certain that the sublimated films had precisely that composition, a calculation of the "molecular volume" was made for a series of structures, and parallel X-ray exposures were made of specimens of the original alloys. The electron-diffraction patterns referred to permit a determination of the periods for the rhombic structure of the given phase: a = 4.18 A, b = 4.50 A, and c = 12.00 A, Ω = 225.7 A³. The density of the original alloy, measured by a pycnometric method, was 6.988 g/cm³, which corresponds fairly accurately to one molecule of Tl2Sb2Se4 in an elementary cell. From the data quoted for the structures Tl2Se and Sb2Se3 there emerge volumes corresponding to "molecules" of the compounds: Tl2Se - 92 A³,
Sb2Se3 - 133.7 A³; 92 + 133.7 = 225.7 A³, i.e., in excellent agreement with the volume arrived at for Tl2Sb2Se4. The X-ray pattern of a specimen of the original alloy has the character of a Debye diagram, containing on a strong general background approximately 10 lines which can be measured and indexed. The values of d_hkl for these lines prove to be in good agreement with d_hkl determined from
FIG. 9. Oblique texture pattern from a film of Tl2Sb2Se4.
electron-diffraction patterns of oblique texture for the reflections of greatest intensity. Thus, the composition (chemical and phase) of the investigated films was verified. Analysis of the extinctions observed on the electron-diffraction patterns leads to two possible space groups, Pmna - $D_{2h}^{7}$ and Pna - $C_{2v}^{9}$, of which the first was adopted in the basic investigation. The positions of the Tl
and Sb atoms were at first selected from geometrical analysis: Tl 2(c): ½ ½ 0 and 0 ½ ½; Sb 2(a): 0 0 0 and ½ 0 ½. These atoms thus jointly make up a face-centered lattice. For Se atoms, a fourfold position (h) with two parameters was selected. Further investigation of the structure was done chiefly by the method of Φ² series. A projection (on the plane 001) and a series of cross sections of a three-dimensional series were calculated, which confirmed the selected positions for Tl and Sb and led to the values y ≈ 0.5 and z = 0.26 for Se atoms. To make these parameters more precise, reliability factor minimalization was used together with the calculated Φ-series (cross section in the plane y = ½) (Fig. 10). The final value was
FIG. 10. Cross section of the three-dimensional potential series in the plane y = ½, Φ(x ½ z), for the structure Tl2Sb2Se4.
z = 0.272, which is related to the parameter y, so that if we neglect the somewhat weak reflections of the type hkl with h + l ≠ 2n, it is possible to take y = 0.5 and describe the structure by the space group Cmmm - $D_{2h}^{19}$. The structure exhibits alternation, along the c axis, of a plane centered network of Tl and Sb atoms with a plane made up of zigzag chains of Se atoms stretched out along the a axis (Fig. 11). The Se-Se distance of 2.15 A in the chain is somewhat less than the corresponding distance of 2.32 A in the structure of selenium, which indicates the presence of a strong homopolar bond in the Se chain in the structure Tl2Sb2Se4. The Tl-Sb distance of 3.07 A in the plane of the lattice appears to be intermediate between tabulated values of the ionic and homopolar radii. Finally, the Tl-Se distance, equal to 2.74 A, corresponds to a binding of the Se chain with the lattice plane by a comparatively weak force of interaction in the given structure.

FIG. 11. Structure of Tl2Sb2Se4.

In a following work (54) the investigated phase was different only in that As atoms were partially substituted for Sb atoms. In this work the possibility was shown of making more precise composition determinations from electron-diffraction investigations. A film of the tetra-component alloy was prepared by sublimation of the alloy material, which was synthesized in vacuum in a quartz ampule (48-50). Two specimens of the resulting alloy gave two series of films with somewhat different structure (or composition). Condensation was on unwarmed films of celluloid in order to avoid evaporation of certain components. In this way an amorphous or irregular structure was obtained which required long annealing for crystallization. The process goes through the same steps as mentioned earlier: at first formation of a polycrystalline film, then of an oriented film giving a pattern of the oblique texture type. The effect of adding arsenic to the ternary alloy appears to be a greater tendency to form an amorphous phase and a retardation of the ordering process and crystallization. The achievement of full ordering with the formation of a crystalline structure, which takes place with difficulty in the thin film, proves to be practically unattainable in a macrospecimen. Electron-diffraction patterns obtained from tetra-component alloys were characterized by a stronger background and a more rapid drop of the reflection intensities with increasing (sin θ)/λ, which indicated that there was diffraction from another structure. Nevertheless, the pattern had good contrast, and a sufficient number of reflections were usable for structure determination. The general form of the electron-diffraction pattern was very like those obtained from Tl2Sb2Se4. Interpretation of the electron-diffraction patterns from films obtained from the two alloy specimens led to the determination of two slightly different elementary rhombic cells:

I: a = 4.15, b = 4.48, c = 11.85 A
II: a = 3.99, b = 4.43, c = 11.55 A
By comparing these data with the data obtained for Tl2Sb2Se4, it is possible to reach the following conclusions. Since the atomic radius of As is less than R_Sb, the given alloys obviously correspond to a structure in which part of the Sb atoms are replaced by As, the As content in I being less than in II. This result diverges from the conclusion made on the basis of investigations of the physical properties of macrospecimens, namely, that arsenic does not enter into a crystalline phase, but produces only an amorphous alloy component having the composition Tl2As2Se4. Investigation of the first structure was done chiefly by using a Φ² series. The similarity to the Tl2Sb2Se4 structure was established, whereupon the parameter z_Se = 0.266 was obtained. Judging from the heights of the peaks on the potential series cross section, it was possible to reach a conclusion about the irregular substitution of As atoms for Sb atoms. Analogous results were also obtained for structure II. The method of reliability factor minimalization was used to get a more precise phase composition. The best result was obtained for an As content of 20% in structure I and 75% in structure II. It is possible that these data are related only to the crystalline component of the alloy and that part of the As makes up an amorphous component, indicated by the higher composition, which gives only a diffuse background on the electron-diffraction pattern. The smaller number of reflections recorded on these pictures as compared with the usual electron-diffraction pattern indicates the presence of a certain statistically disordered crystalline structure in this phase. Furthermore, the irregular character is connected with the statistical substitution for the Sb atoms. Therefore, it is surprising that we do not observe a continuous variation of the lattice period and instead record two phases with constant period.
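The cell data quoted in this section lend themselves to a short numerical cross-check. The sketch below computes the rhombic cell volumes of Tl2Sb2Se4 and of the two As-substituted films, and estimates the number of Tl2Sb2Se4 formula units per cell from the pycnometric density. The helper names are illustrative, and the atomic masses are standard reference values supplied here as an assumption; they do not appear in the original text.

```python
AVOGADRO = 6.02214076e23  # 1/mol

# Standard atomic masses in g/mol (reference values, not taken from the text).
ATOMIC_MASS = {"Tl": 204.38, "Sb": 121.76, "Se": 78.971}


def orthorhombic_volume(a, b, c):
    """Volume of a rhombic (orthorhombic) cell in cubic angstroms."""
    return a * b * c


def formula_units_per_cell(density_g_cm3, volume_A3, composition):
    """Number of formula units implied by a measured density and cell volume."""
    molar_mass = sum(ATOMIC_MASS[el] * n for el, n in composition.items())
    volume_cm3 = volume_A3 * 1e-24  # 1 A^3 = 1e-24 cm^3
    return density_g_cm3 * volume_cm3 * AVOGADRO / molar_mass


# Cells quoted in the text (periods in angstroms).
cells = {
    "Tl2Sb2Se4": (4.18, 4.50, 12.00),
    "As-substituted I": (4.15, 4.48, 11.85),
    "As-substituted II": (3.99, 4.43, 11.55),
}
volumes = {name: orthorhombic_volume(*abc) for name, abc in cells.items()}

z_units = formula_units_per_cell(
    6.988, volumes["Tl2Sb2Se4"], {"Tl": 2, "Sb": 2, "Se": 4}
)

for name, volume in volumes.items():
    print(f"{name}: {volume:.1f} A^3")
print(f"formula units of Tl2Sb2Se4 per cell from density: {z_units:.2f}")
```

The estimate comes out close to one formula unit per cell, and the cell volume shrinks from the pure phase to structure I and further to structure II, in line with the conclusion that II contains more of the smaller As atom.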
A similar investigation, having important value, was conducted by Kolomiitz and Goriunova on a wide class of amorphous semiconductors (50). It should be noted that the recent method of determining "short-range" order in amorphous solids consisting of atoms of several kinds appears to be still very imperfect.
E. Phases with the GaS Structure

The structure of GaS (55) is characterized by the following data. The periods of the hexagonal cell are a = 3.57 and c = 15.47 A, N = 4. The space group is $P6_3/mmc$. The atomic coordinates: Ga and S in 4(f), with z_Ga = 0.17 and z_S = 0.60. Along the c axis hexagonal layers of Ga and S are stacked. If we denote them, as usual, by A and B for sulfur and a and b for gallium, so that each atom layer A(a) is located on either side of a central plane formed by a third atomic layer B(b), then the succession of layers looks like

B/AbbA/BaaB/AbbA/Ba ...

(see Fig. 12a, b). Thus, the structure appears to be laminated, and the connection between layers (designated by the lines) is accomplished by a weak residual force. Actually, the distance between S atoms in the neighboring layers A and B is 3.73 A for a total atomic radius of ~2.10 A. Each atom of Ga lies at the peak of a triangular pyramid with atoms of S in the base, at a distance S-S in the layer of 3.09 A and a distance Ga-S of 2.34 A, for a total atomic radius R_Ga + R_S = 2.37 A and a total ionic radius of 2.36 A. The nearest distance Ga-Ga vertically is 2.46 A for an atomic diameter of 2.52 A, so that the Ga atoms, arranged in a triangular pyramid, have a fourfold coordination similar to that in the ZnS structure. The structure of GaSe also was investigated by the X-ray method (56), but the results obtained require additional verification. In this work the presence of two modifications of GaSe was recorded: hexagonal, with periods a = 3.735 and c = 15.887 A, and rhombohedral, with a = 3.74 and c = 23.862 A. The characteristic feature of these phases appears to be the low rate of ordering and crystallization, similar to phases containing arsenic. Therefore, electron-diffraction investigation in thin layers is of interest. In conducting an electron-diffraction study of the structure of GaSe (57), specimens were prepared as sublimated layers and also by precipitation of pulverized particles of the alloy from an aqueous suspension onto celluloid films. The best oblique texture patterns were obtained from specimens prepared by sublimation, but the second method made it possible to ascertain whether sublimation led to the formation of an alloy of the same composition as the original. Besides, this also demonstrates the agreement of the lattice periods determined from electron-diffraction patterns with those of the above work (56). The periods of the hexagonal lattice, a = 3.74 and c = 15.89 A, are very close to the above-indicated cell dimensions of GaS.
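The GaS interatomic distances quoted above follow from the cell periods and the two z parameters. The sketch below is a minimal illustration, assuming the standard coordinates of position 4(f) in $P6_3/mmc$ (1/3, 2/3, z; 2/3, 1/3, z + 1/2; 2/3, 1/3, -z; 1/3, 2/3, 1/2 - z) and the usual hexagonal metric; the function and variable names are hypothetical.

```python
import math


def hexagonal_distance(p, q, a, c):
    """Distance in angstroms between fractional points p and q in a hexagonal cell.

    With the angle between the a and b axes equal to 120 degrees,
    d^2 = a^2 (dx^2 + dy^2 - dx*dy) + c^2 dz^2.
    """
    dx, dy, dz = (q[i] - p[i] for i in range(3))
    return math.sqrt(a * a * (dx * dx + dy * dy - dx * dy) + (c * dz) ** 2)


# GaS data quoted in the text.
A, C = 3.57, 15.47
Z_GA, Z_S = 0.17, 0.60

ga_lower = (1 / 3, 2 / 3, Z_GA)
ga_upper = (1 / 3, 2 / 3, 0.5 - Z_GA)        # 4(f) point 1/3, 2/3, 1/2 - z
s_bonded = (2 / 3, 1 / 3, Z_S + 0.5 - 1.0)   # 4(f) point 2/3, 1/3, z + 1/2, shifted by -c
s_next_layer = (1 / 3, 2 / 3, 0.5 - Z_S)     # 4(f) point 1/3, 2/3, 1/2 - z

d_ga_ga = hexagonal_distance(ga_lower, ga_upper, A, C)
d_ga_s = hexagonal_distance(ga_lower, s_bonded, A, C)
d_s_s = hexagonal_distance(s_bonded, s_next_layer, A, C)

print(f"Ga-Ga (vertical)   = {d_ga_ga:.2f} A")
print(f"Ga-S               = {d_ga_s:.2f} A")
print(f"S-S (between layers) = {d_s_s:.2f} A")
```

With the rounded parameters given in the text, the three distances come out near 2.48, 2.33, and 3.72 A, against the quoted values of 2.46, 2.34, and 3.73 A.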
A rhombohedral phase is not observed in the thin layers. To obtain a satisfactory photograph from which a structure analysis can be made requires long annealing of the sublimated films in vacuum. Furthermore, an essential condition for ordering and crystallization is the production of a thick film. Electron-diffraction patterns used in the work nevertheless contained reflections of two types, sharp and diffuse, the latter corresponding to indices with h - k ≠ 3n and l ≠ 0. A characteristic feature of the electron-diffraction patterns was the large number of extinctions, exceeding that which it is possible to expect for any of the hexagonal space groups. Namely, there were observed hkil for any l; 000l for l = 2n; $hh\overline{2h}l$ only for l = 2n; hkil with h - k = 3n only for l = 2n. These extinctions do not contradict the group $P6_3/mmc$, which was therefore chosen. The intensities of the sharp reflections were estimated visually, those of the diffuse reflections with a nonrecording microphotometer with correction for the large half-width. Further, Φ² and Φ-series were calculated, which made it possible to determine that the given structure was of the same structural type as GaS. Since the scattering powers of Ga and Se atoms are practically alike, however, the relationships of the peak heights in the Φ² and Φ-series do not give a well-defined determination of the positions of the atoms of Ga and Se. This was done in a following work (58) devoted to the structure of InSe. Thus, in the structure of GaSe atoms of Ga occupy the same positions as in GaS, and the Se atoms are analogous to S. The values of the parameters are z_Ga = -0.10 and z_Se = 0.177; the last is somewhat different from that for S. An exact determination of the z coordinates (parameters) could not be made with sufficient reliability, but the errors probably do not exceed 0.03-0.05 A.

FIG. 12. Structural type GaS (GaSe, InSe): (a) General structural form. (b) Atomic coordinates Ga(In).

Interesting results pertaining to the GaS structure type were obtained by S. A. Semiletov in an investigation of the system In-Se (58). By sublimation of the alloy, having the composition InSe, onto a surface of NaCl and then annealing, the author obtained a film having a hexagonal structure with a = 4.04 and c = 16.90 A (Figs.
13 and 14). Meanwhile, in an X-ray investigation of the alloy system In-Se, the elementary cell for the phase InSe was determined to be a = 4.05 and c = 25.00 A (approximately 3/2 of 16.90 A). Rhombohedral symmetry was ascribed to this phase. The general appearance of the electron-diffraction pattern from InSe is very similar to that for GaSe; in particular, it has the same extinction character and diffuseness of given reflections (Fig. 13). Taking the number of "molecules" of InSe in the hexagonal cell as being equal to 4 and the space group as $P6_3/mmc$, the author proceeded from the same atomic coordinates as for GaS and GaSe and also used Φ² series for verification of the selected model. From a cross section along the c axis clearly showing a difference in the heights of the maxima, it was possible to reach a well-defined conclusion, locating the In atoms at the center of the above-described prism with atoms of Se in the base and atoms of In in the peak. The
values of the parameters found from Φ² series were made more precise by minimalization of the reliability factor, giving z_In = 0.157 and z_Se = 0.102. The distance In-In inside the pyramid comprises 3.16 A for the In atomic diameter of 2.88 A, and the distance In-Se equals 2.50 A for the sum of the tetrahedral covalent radii 2.58 A.
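The extinction character shared by GaSe and InSe can be illustrated with a geometrical structure amplitude. The sketch below sums unit-weight phase factors over the two 4(f) positions of $P6_3/mmc$ with the InSe z parameters just quoted. The equal scattering weights and the function names are assumptions made only for illustration (real electron scattering amplitudes depend on the atom and on the angle); the vanishing of reflections with h - k = 3n and odd l does not depend on the weights.

```python
import cmath


def positions_4f(z):
    """The four equivalent points of position 4(f) in P6_3/mmc."""
    return [
        (1 / 3, 2 / 3, z),
        (2 / 3, 1 / 3, z + 0.5),
        (2 / 3, 1 / 3, -z),
        (1 / 3, 2 / 3, 0.5 - z),
    ]


def structure_amplitude(h, k, l, sites):
    """Sum of weight * exp(2*pi*i*(h*x + k*y + l*z)) over all atoms.

    `sites` is a list of (label, z parameter, weight). The weights are
    placeholders (assumption), not measured scattering amplitudes.
    """
    total = 0j
    for _label, z, weight in sites:
        for x, y, zz in positions_4f(z):
            total += weight * cmath.exp(2j * cmath.pi * (h * x + k * y + l * zz))
    return total


# z parameters quoted in the text for InSe; unit weights are an assumption.
inse_sites = [("In", 0.157, 1.0), ("Se", 0.102, 1.0)]

for hkl in [(0, 0, 1), (0, 0, 2), (1, 1, 1), (1, 1, 2), (1, 0, 0), (1, 0, 1)]:
    amplitude = abs(structure_amplitude(*hkl, inse_sites))
    print(hkl, "extinct" if amplitude < 1e-9 else f"|F| = {amplitude:.2f}")
```

Reflections such as 001 and 111 vanish, whereas 100 and 101, for which h - k is not a multiple of 3, remain for any l.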
FIG.13. Oblique texture pattern from a film of InSe.
It is possible to draw an analogy with the tetrahedral atomic coordination characteristic of the ZnS-type structure by positioning the metallic atoms at the center of the described pyramid. Semiletov explained the diamagnetic properties of InSe by the hypothesis that the pair of nearest In atoms can be considered as a paired ion $\mathrm{In_2^{4+}}$, from which it follows that the formula of the given compound is better written in the form In2Se2. It is possible that, similar to the ZnS structure, the GaS structure is
characterized by a mixed form of binding, so that if in the first case we change to a heavier atom in parallel with increased metallic binding, then in the second case there will be an increased degree of ionic binding.
FIG. 14. Electron-diffraction picture of a highly oriented film of InSe; the picture contains secondary dynamic reflections and diffraction lines due to deformation of the film.
F. Other Semiconducting Phases

In connection with processes that are decisive in electronics and semiconductor techniques, there is interest in investigating the oxidation of certain metal alloys. Thus, in a study of the alloys (59) I: Ag-Mg and II: Cu-Mg,
it was shown that unoxidized alloy specimens have slightly increased lattice periods, a_I = 4.14 A and a_II = 3.69 A, in correspondence with the greater atomic radius of Mg as compared with the atomic radii of Cu and Ag. By stripping the top layers by etching and abrasion, the author was convinced that the indicated values of the periods remained invariable through the depth of the alloy specimens. Annealing at increased temperatures leads to the formation of MgO on the surface and, parallel with this, to a decrease in the lattice period of the alloy to the normal value corresponding to the pure metal. For alloys II there is usually observed an excess, 3.62 A instead of 3.607 A, caused by solution of oxygen. Thus, it was shown that high-temperature annealing of alloys containing Mg leads to diffusion of Mg to the surface with formation of a layer of MgO there, proceeding to full removal of metallic magnesium from the alloy. In investigations of the binary alloys Mg-Bi, Mg-Sb, and Mg-Sn, an analogous course of the oxidation process is observed (60). In these systems, regardless of the components, there is formed only one phase having the character of a chemical compound: Mg3Bi2, Mg3Sb2, and Mg2Sn. The first two have the La2O3-type structure, and the third phase is anti-isomorphic with CaF2. These compounds, however, are characterized by instability, by reaction with oxygen in the air, and by decomposition. Sublimation of the components on a number of glass plates produced films with various relative concentrations. Certain intermediate regions were characterized by a rapid increase, under the action of oxygen in the air, of the optical transparency and of the electrical resistance by as much as 7 to 10 orders of magnitude. As shown by electron-diffraction structure investigation, this process apparently is connected with the diffusion of Mg to the crystal boundaries of a given phase, where it oxidizes to MgO and covers the remaining metallic particles.
As a result, a nonconducting layer is produced. Thus, the behavior of Mg in the studied phases is the same as in layers of MgAg and MgCu, which appears to be very strange, since in one case we have a distinct stoichiometric compound and in the other case a solid solution. Note that in films of Mg-Sb decomposition of Mg3Sb2 is accompanied by the formation of amorphous Sb, although, in general, the thickness of the film considerably exceeds the limiting thickness for the existence of amorphous antimony. In the given case, stabilization of the amorphous phase is explained by the disperse structure of the Sb particles insulated by the film of MgO. Another group of phases subjected to oxidation included PbS, PbSe, and PbTe. A detailed investigation was made of the oxidation of PbS. Oxidized lead is unusually unstable, forming different phases partly containing water or hydroxide; a partial study of it as the oxide and, in particular, as a complex oxygen-sulfur compound was done in detail, but a dependable study of the indicated process is very difficult. Apparently, there is no doubt of the formation of phases with the composition PbO·PbSO4; 4PbO·PbSO4; PbO2; and possibly also PbO·Pb(OH)2·PbCO3. Wilman (61), in analyzing the results of his investigation of sensitized layers of PbS and PbSe, assumed that orientation played an insignificant role in the process of forming photosensitive layers. More essential were small deviations from stoichiometric composition, which were not noticed as variations in the lattice period, and for PbS there was also the distribution in the layer of a newly formed phase, PbO·PbSO4. Vertzner (62) deems most essential the conditions of initial heating of the film, which tend to form one or another oxide-containing phase with a fixed distribution through the thickness of the film. A group of lines was discovered by him which he failed to identify with any well-known phase.

An example of the use of the kinematic method of taking electron-diffraction pictures to study the kinetics of phase changes is the investigation of Boettcher and co-workers (63). These authors studied changes in the systems Ag-S, Ag-Se, and Cu-Se. In the indicated systems, phase changes were studied for the compounds Ag2S, Ag2Se, and Cu2Se, which form typical defect structures. At room temperature there is a rhombic phase, changing with increased temperature to cubic. Near the conversion temperature, a partial change to a tetragonal structure was observed. The specimens were investigated in the form of thin films prepared by condensation of the component vapors in vacuum. The processes taking place on annealing of such layers in the electron-diffraction apparatus were recorded on a moving photographic film, on which the specimen temperature was also registered. A celluloid substrate proved to be suitable in spite of annealing temperatures up to 400° C, since its carbonization formed a sufficiently firm substrate.
Other authors, using X-ray structure investigation, determined the presence of the following phases for silver sulfide (Ag2S). Above 180° C a cubic phase is formed with the period a = 4.88 A; below this temperature there is a rhombic phase with periods a = 4.77, b = 6.92, and c = 6.99 A (from other data c = 6.88 A). The cubic phase, for N = 2, i.e., 4 Ag and 2 S in a cell, is characterized by fixed positions of the S atoms and a statistical disposition of the four Ag atoms over an extremely great number of positions. In the rhombic phase, an analogous distribution of both atoms can occur. Finally, in a recently published work (64) the results of a structure determination of the monoclinic β-phase were given. In an electron-diffraction study the author succeeded in confirming the change at 180° C from the rhombic to the cubic phase and also recorded the intermediate tetragonal structure forming from the rhombic at temperatures close to 180° C. The period of the tetragonal lattice was
a = 4.77 A and c = 6.90 = 4.88√2 A, indicating the close connection of this structure with both stable structures. In a kinematic electron-diffraction study of different specimens, the author also observed different relative intensities for some characteristic reflections. This apparently is connected with the deviation from stoichiometric composition of the investigated films. In the system Ag-Se an X-ray investigation also established the existence of two phases for the compound Ag2Se with the point of change at 133° C. The high-temperature cubic phase with period a = 4.98 A has a structure analogous to Ag2S, but the distribution of the Ag atoms apparently no longer proves to be fully irregular with respect to all (42) possible positions. X-ray data are absent for the low-temperature form. The results of some electron-diffraction investigations can be summarized as follows. In the work referred to (63), wherein the kinematic method was used, there were discovered a cubic phase and several tetragonal phases. The transition point of the high-temperature cubic phase to the tetragonal phase, ~133° C, and the period of the cubic lattice, a = 4.98 A, are in accord with X-ray data. On repeated heating and cooling of the films, there is observed a hysteresis, or lagging, in the formation of the phase which forms in the initial process (for example, heating preceded by cooling or vice versa). The following tetragonal cells were recorded: I: a = 7.06 and c = 4.98 A; II: a = 4.98 and c = 4.78 A; III: a = 7.06 and c = 4.67 A; IV: a = 4.98 and c = 4.85 A. Note that 4.98 = 7.06/√2, which is an indication of the close connection between the cubic structure and the tetragonal phases. The low-temperature rhombic modification was not observed in any experiment. In contrast to the kinematic investigations, films of Ag-Se were studied using the usual electron-diffraction apparatus and methods (65).
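The √2 relations between the cubic and tetragonal periods noted in the text are easy to verify numerically. The following sketch, with a hypothetical helper name, reports how far each quoted tetragonal period lies from √2 times the corresponding cubic period.

```python
import math


def sqrt2_mismatch(cubic_period, tetragonal_period):
    """Difference between a tetragonal period and sqrt(2) times the cubic period."""
    return tetragonal_period - cubic_period * math.sqrt(2)


# Periods in angstroms, as quoted in the text.
ag2s = sqrt2_mismatch(4.88, 6.90)   # tetragonal c against cubic a for Ag2S
ag2se = sqrt2_mismatch(4.98, 7.06)  # tetragonal a (cells I and III) against cubic a for Ag2Se
print(f"Ag2S:  c - a_cubic*sqrt(2) = {ag2s:+.3f} A")
print(f"Ag2Se: a - a_cubic*sqrt(2) = {ag2se:+.3f} A")
```

Both differences are a few hundredths of an angstrom or less, which is within the precision with which the periods are quoted.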
The author established that above 140° C a cubic phase was actually produced corresponding to the composition Ag2Se with lattice period a = 4.978 A. At room temperature, as a result of mutual diffusion of the layers of Ag and Se and recrystallization (requiring a definite time), a low-temperature rhombic modification was formed. The periods were determined from recordings of the oblique texture type to be a = 7.046, b = 4.325, c = 7.82 A. A structural investigation of this phase was not carried out. It is necessary to note in this connection that the kinematic electron-diffraction method apparently is not able to record equilibrium states and stable phases in those cases when the states are obtained as the result of a more or less long process of diffusion, ordering, or crystallization. In another work the same author (66) studied the structure of films having the composition Ag-Te prepared in an analogous manner. In accordance with X-ray data, above 150° C a cubic structure of Ag2Te is
observed with period a = 6.58 ± 0.03 A, and below the indicated temperature there is a complex phase of rather low symmetry for which no data are given. The last system investigated was that of Cu-Se (63). Compounds having compositions from Cu2Se to approximately Cu1.8Se were formed, having a phase structure somewhat close to that of Ag2S and Ag2Se. We shall not give any details of the investigation which, apparently, is still not completed, but will confine ourselves to some results. For layers having the composition Cu2Se, there was observed at room temperature a complex structure which has still not been studied. This changed to a cubic structure above 110° C. The period of the cubic structure is a = 5.84 A. It is built up of a rigid framework lattice of Se atoms and a statistical distribution of Cu ions over the greatest number of possible positions. A decrease of the ratio of Cu in the alloy leads to the formation, even at room temperature, of a tetragonal phase (a = 11.43 and c = 11.72 A) having the composition Cu1.8Se, and even of a cubic phase (a = 5.80 A) having the composition CuxSe if x < 1.9. The electron-diffraction investigation showed a series of tetragonal and cubic lattices which, apparently, differ in the distribution (always statistical) of the Cu atoms over one or another set of possible positions. Since the positions in the given Se lattice framework correspond to holes of different volume, different distributions of Cu ions lead to some expansion of the lattice as a whole, i.e., to variation of its period.
G. The Surface Structure of Thin Films and Monocrystalline Ge
Besides the investigations discussed above of the atomic structure of different semiconducting phases, some investigations have been performed during the last five to eight years on single-crystal Ge and on thin films of Ge obtained by sublimation in vacuum. This work has twofold value. On the one hand, it represents an example of structure control of an important semiconducting material and, on the other hand, an example of the use of electron diffraction in solving the general problem of the real structure of crystals, for example, the nature and distribution of defects and dislocations. It is well known that reflection of an electron beam from the surface of a single crystal gives a different diffraction pattern depending on the real structure of the face. Thus, the face of a diamond crystal gives a typical picture containing a large number of sharply drawn Kikuchi lines and bands, and for a fixed azimuth the so-called curved Kikuchi lines. Therefore, on the basis of the well-known elementary theory of the formation of Kikuchi electron-diffraction patterns, such a picture is considered evidence for a more or less perfect single-crystal face. To be more specific, this refers chiefly to the absence of irregularities in the surface layer.
The results of an investigation of a single crystal of Ge and of a thin layer of Ge, however, make it necessary to modify this view somewhat.
FIG. 15. Electron-diffraction pattern with Kikuchi lines and bands from freshly cleaved single crystal Ge; L = 250 mm.
In our laboratory (67) the structure variation on the surface of monocrystalline Ge has been studied for the processes of cutting and etching. A fresh surface produced by breaking a single-crystal block gives, in reflection, a Kikuchi electron-diffraction pattern similar to the picture from the face of diamond. The surface produced by cutting gives a diffraction pattern in the form of a point grid reflection in the presence of highly weakened Kikuchi lines. This is indicative of plastic deformation in the surface layers. In the case of the cut single crystal, separate blocks slightly disoriented in the range ~2-3° are formed on the surface. Polishing of the cut surface
FIG. 16. Electron-diffraction pattern from the surface of single crystal Ge treated by polishing and strong etching; L = 700 mm.
gives a polycrystalline ring pattern owing to fine-crystalline powder on the surface. The use of different etching agents, with gradually increasing etching time and temperature, leads to a gradual restoration of the original structure. With this treatment the point diffraction grid at first appears again; it then degenerates into a single bright-point reflection and is finally replaced by a Kikuchi electron-diffraction pattern (Figs. 15 and 16).
Films of Ge sublimated in vacuum were studied (68, 69). Condensation of Ge vapor in vacuum onto different surfaces at temperatures below 370° C gave amorphous films. Above this temperature, polycrystalline layers were formed. For rapid sublimation onto a surface kept at temperatures of 500-900° C, textured layers were obtained. The essential factor for this appears to be the direction from which the growing crystallites are supplied with the molecular beam from the evaporator. Crystallites of Ge orient on the surface of polished graphite or silicon carbide as {110} or {100}, on the surface of glass as {111} or {100}, and on the surface of polished corundum at 700-800° C with the face {100} parallel to the backing. For condensation of Ge on a cleavage face of calcite or mica, formation of mosaic monocrystals is observed which give spot diffraction patterns. The crystal orientation is {111}Ge parallel to {10$\bar{1}$1}CaCO3, and {111}Ge or {110}Ge parallel to {0001} mica. For sublimation of germanium onto a polished and etched surface of single-crystal Ge, oriented layers are observed. For a temperature somewhat in excess of 500° C, films are formed having a mosaic monocrystal structure, and for a substrate temperature of 700-800° C films having a more perfect structure are formed, if we can judge on the basis of Kikuchi patterns. From measurements of the electrical properties of similar films (70) having thicknesses up to 20 μ, it was determined that there was a considerable deviation from the properties of initially very pure Ge: specific resistance $\sim 2 \times 10^{-2}$ ohm-cm; Hall constant ~3 cm³/coulomb; carrier mobility (hole) ~150 cm²/v-sec. In all cases similar films had hole conductivity. This was also determined in a work (71) in which such properties of sublimated Ge films were explained by surface energy levels inherent in small crystallites.
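The three electrical quantities quoted for the sublimated films are linked by elementary transport relations, and a short calculation shows how they fit together. The sketch below is illustrative and rests on assumptions that are not stated in the text: a single carrier type (holes), a Hall factor of unity, so that p = 1/(e R_H) and ρ = R_H/μ. The function names are hypothetical.

```python
ELEMENTARY_CHARGE = 1.602e-19  # coulomb


def hole_density(hall_constant_cm3_per_coulomb: float) -> float:
    """Carrier density (per cm^3) from the Hall constant, assuming one
    carrier type and a Hall factor of unity: p = 1 / (e * R_H)."""
    return 1.0 / (ELEMENTARY_CHARGE * hall_constant_cm3_per_coulomb)


def resistivity(hall_constant_cm3_per_coulomb: float,
                mobility_cm2_per_v_sec: float) -> float:
    """Specific resistance (ohm-cm) implied by the Hall constant and the
    Hall mobility: rho = R_H / mu."""
    return hall_constant_cm3_per_coulomb / mobility_cm2_per_v_sec


# Approximate values quoted in the text for sublimated Ge films.
hall_constant = 3.0    # cm^3/coulomb
hole_mobility = 150.0  # cm^2/v-sec

p = hole_density(hall_constant)
rho = resistivity(hall_constant, hole_mobility)

print(f"hole density        ~ {p:.1e} per cm^3")
print(f"specific resistance ~ {rho:.2f} ohm-cm")
```

With these inputs the hole density comes out at about 2 × 10^18 per cm³, far above that of very pure Ge, which is in line with the considerable deviation noted in the text.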
Investigations of etch figures on films and on single crystals using electron and optical microscopy, in agreement with the discussed data for the electrical properties, indicate the presence of a symmetrical disposition of defects in films. Apparently, in a given case the change from a spot electron-diffraction pattern to a Kikuchi picture corresponds to a large increase in the mosaic block size (possibly by 1 to 146 times) together with improvement of the mutual orientation of blocks. It should be emphasized that the electron-diffraction method of inspecting and investigating the true structure and the degree of perfection of monocrystalline Ge and Si, with proper treatment, can prove to be very useful.
ACKNOWLEDGMENT I regard it a pleasant obligation to express my appreciation to Dr. B. K. Vainshtein for reviewing the present article and for making valuable comments concerning it.
REFERENCES
1. Z. G. Pinsker, "Electron Diffraction." Academy of Sciences, U.S.S.R., 1949; English ed. Butterworths Scientific Publications, London, 1953.
2. B. K. Vainshtein, "Structure Analysis by Electron Diffraction." Academy of Sciences, U.S.S.R., 1955.
3. B. K. Vainshtein and Z. G. Pinsker, Doklady Akad. Nauk S.S.S.R. 64, 49 (1949).
4. J. A. Ibers and J. A. Hoerni, Acta Cryst. 7, 405 (1954).
5. B. K. Vainshtein, J. Exptl. Theoret. Phys. 26, 157 (1953).
6. J. A. Ibers, Acta Cryst. 11, 178 (1957).
7. B. K. Vainshtein and J. A. Ibers, Kristallografiya 3, No. 4 (1958).
8. L. Marton, L. B. Leder, and H. Mendlowitz, Advances in Electronics and Electron Phys. 7, 183 (1955).
9. M. Blackman, Proc. Roy. Soc. A173, 68 (1939).
10. C. H. MacGillavry, Physica 7, 329 (1940).
11. W. Kossel and G. Mollenstedt, Ann. Physik [5] 26, 113 (1939).
12. K. Moliere et al., Z. Physik 137, 445; 139, 103 (1954); 140, 581 (1955).
13. K. Kambe, J. Phys. Soc. Japan 12, 13, 25 (1957).
14. R. Uyeda and S. Miyake, Acta Cryst. 10, 53 (1957).
15. Y. Kainuma, Acta Cryst. 8, 247 (1955).
16. S. Takagi, J. Phys. Soc. Japan 13, 278, 287 (1958).
17. S. A. Vekshinski, "New Method of Metallographic Study of Alloys." OGIZ, 1944.
18. B. K. Vainshtein and A. N. Lobachev, Kristallografiya 3, No. 4 (1958).
19. I. I. Yamzin and Z. G. Pinsker, Doklady Akad. Nauk S.S.S.R. 66, 645 (1949); I. I. Yamzin, Trudy Inst. Krist. Akad. Nauk S.S.S.R. 6, 69 (1949).
20. Z. G. Pinsker and L. N. Abrosimova, Kristallografiya 3, 281 (1958).
21. S. Lenander, Arkiv Fysik 8, 54, 551 (1954); S. Takagi and T. Suzuki, Acta Cryst. 8, 441 (1955).
22. B. K. Vainshtein, Kristallografiya 2, 340 (1957).
23. G. G. Dvoriankina and Z. G. Pinsker, Kristallografiya 3, No. 4 (1958).
24. Z. G. Pinsker and V. I. Khitrova, Kristallografiya 1, 300 (1956).
25. Z. G. Pinsker and B. K. Vainshtein, Trudy Inst. Krist. Akad. Nauk S.S.S.R. 9, 291 (1954).
26. B. K. Vainshtein and Z. G. Pinsker, Kristallografiya 3, 358 (1958).
27. A. Boettcher and R. Thun, Optik 11, 22 (1954).
28. G. P. Thomson and W. Cochrane, "Theory and Practice of Electron Diffraction." London, 1939.
29. G. O. Bagdykiants, Izvest. Akad. Nauk S.S.S.R., Ser. Fiz. 17, 255 (1953).
30. "Electron Microscopy," edited by Academician A. A. Lebedev, State Publishing House, 1954.
31. M. M. Umanskii and V. A. Krylov, J. Exptl. Theoret. Phys. 6, 691 (1936).
32. L. I. Tatarinova and Z. G. Pinsker, Doklady Akad. Nauk S.S.S.R. 96, 265 (1954); L. I. Tatarinova, Trudy Inst. Krist. Akad. Nauk S.S.S.R. 11, 104 (1955).
33. G. A. Kurov and Z. G. Pinsker, Kristallografiya 1, 407 (1956).
34. L. S. Palatnik and V. M. Kosevich, Doklady Akad. Nauk S.S.S.R. 121, 97 (1958).
35. H. Hendus, Z. Physik 119, 265 (1942).
36. I. G. Stoianova and A. I. Frimer, Zavodskaia Lab. 18, 1472 (1952).
37. L. I. Tatarinova, Trudy Inst. Krist. Akad. Nauk S.S.S.R. 11, 101 (1955).
38. S. A. Semiletov, Trudy Inst. Krist. Akad. Nauk S.S.S.R. 11, 115 (1955).
39. Z. G. Pinsker, O. S. Orekhvo, and A. I. Miller, Kristallografiya 1, 239 (1956).
40. S. A. Semiletov, Kristallografiya 1, 306 (1956).
41. B. I Epo a stronger divergence for the curves of different substances than was the case with metals (14). c. Qualitative interpretation. Measurements by the pulse method with a very low current intensity generally show the same high yields as earlier measurements with static methods. The results can be regarded as a further confirmation that high yields must be attributed to the substances themselves and that they are not caused by field-enhanced emission within the layer. Which of the fundamental properties of semiconductors and insulators are responsible for the greater range of variation of the yield cannot be made out with certainty for all cases. At least we have certain ideas based on the energy band model of solids, and they have proved to be useful. The basic feature characterizing a semiconductor is the position of optical lattice absorption within the spectrum. From it we find the amount of energy necessary to raise an electron from the highest occupied level into the next higher excited state. Since, with rare exceptions, the first excited level lies in the conduction band, it is customary to identify this energy with the distance of the conduction band from the valence band in the band model of semiconductors. This energy plus the energy difference between the vacuum level and the bottom of the conduction band is therefore necessary to produce one S in the solid. Now the observed high yields of substances with great energy gap are probably not caused by an increased source density of the S produced, since it may be assumed that because of the increased energy gap the excitation rate of S will decrease. The high yields therefore must be due to an increase of the depth of escape. The following mechanisms for energy loss of S may play a role in insulators: (1) electron-phonon interaction, (2) interaction with valence electrons, if the excitation energy of S in the upper band is greater than the
TABLE III. MAXIMUM YIELDS FROM SEMICONDUCTORS AND INSULATORS

Group / Substance               δ_m          E_pm° (ev)    References

Semiconductive elements
  Ge (single crystal)           1.2-1.4      400           a,b,c,d
  Si (single crystal)           1.1          250           a
  Se (amorphous)                1.3          400           c,e
  Se (crystal)                  1.35-1.40    400           c,e
  C (diamond)                   2.8          750           f
  C (graphite)                  1            250           g,h
  B                             1.2          150           a

Semiconductive compounds
  Cu2O                          1.19-1.25    400           i,j
  PbS                           1.2          500           k
  MoS2                          1.10                       i
  MoO2                          1.09-1.33                  l
  WS2                           0.96-1.04                  i
  Ag2O                          0.98-1.18                  l
  ZnS                           1.8          350           m

Intermetallic compounds
  SbCs3                         5-6.4        700           n,o,p
  SbCs                          1.9          550           n
  BiCs3                         6-7          1,000         n
  Bi2Cs                         1.9          1,000         n
  GeCs                          7            700           d
  Rb3Sb                         7.1          450           q

Insulators
  LiF (evaporated layer)        5.6                        i
  NaF (layer)                   5.7                        i
  NaCl (layer)                  6-6.8        600           i,r,s
  NaCl (single crystal)         14           1,200         t,u,v,w,x
  NaBr (layer)                  6.2-6.5                    i
  NaBr (single crystal)         24           1,800         x,y
  NaJ (layer)                   5.5                        i
  KCl (layer)                   7.5          1,200         i,z
  KCl (single crystal)          12                         v,a'
  KJ (layer)                    5.5                        i
  KJ (single crystal)           10.5         1,600         v
  RbCl (layer)                  5.8                        i
  KBr (single crystal)          12-14.7      1,800         v,a'
  BeO                           3.4          2,000         b'
  MgO (layer)                   4            400           i,c'
  MgO (single crystal)          23           1,200         d',e'
  BaO (layer)                   4.8          400           f'
  BaO-SrO (layer)               5-12         1,400         g'
  Al2O3 (layer)                 1.5-9        350-1,300
  SiO2 (quartz)                 2.4          400
  Mica                          2.4          300-384

Glasses
  Technical glasses             2-3          300-420       s
  Pyrex                         2.3          340-400       s,i'
  Quartz-glass                  2.9          420           s,h'
a L. R. Koller and J. S. Burgess, Phys. Rev. 70, 571 (1946).
b J. B. Johnson and K. G. McKay, Phys. Rev. 93, 668 (1953).
c H. Gobrecht and F. Speer, Z. Physik 136, 602 (1953).
d G. Appelt, Thesis, Dresden (1958).
e G. Oertel, Ann. Physik [7] 1, 305 (1958).
f J. B. Johnson, Phys. Rev. 92, 843 (1953).
g H. Bruining, Thesis, Leiden (1938).
h E. J. Sternglass, Phys. Rev. 80, 925 (1950).
i H. Bruining and J. H. de Boer, Physica 6, 823 (1939).
j N. B. Gornij, J. Exptl. Theoret. Phys. (U.S.S.R.) 26, 79 (1954).
k O. Hachenberg, unpublished (1944).
l A. Afanasjewa, P. W. Timofejew, and A. Ignatow, Physik. Z. Sowjetunion 10, 831 (1936).
m N. B. Gornij, J. Exptl. Theoret. Phys. (U.S.S.R.) 26, 88 (1954).
n G. Appelt and O. Hachenberg, Ann. Physik, to be published.
o N. D. Morgulis and B. I. Djatlowitskaja, J. Tech. Phys. (U.S.S.R.) 10, 657 (1940).
p P. W. Timofejew and J. Lemkowa, J. Tech. Phys. (U.S.S.R.) 10, 20 (1940).
q W. Iianef, Ann. Physik, to be published.
r M. M. Vudinsky, J. Tech. Phys. (U.S.S.R.) 9, 271 (1939).
s H. Salow, Z. tech. Physik 21, 8 (1940); Physik. Z. 41, 434 (1940).
t A. R. Shulman, W. L. Makedonsky, and J. D. Yaroshetsky, J. Tech. Phys. (U.S.S.R.) 23, 1152 (1953).
u A. R. Shulman, J. Tech. Phys. (U.S.S.R.) 26, 2150 (1955).
v A. R. Shulman and B. P. Dementyev, J. Tech. Phys. (U.S.S.R.) 26, 2256 (1955).
w D. N. Dobrezow and A. S. Titkow, Doklady Acad. Nauk U.S.S.R. 100, 33 (1955).
x D. N. Dobrezow and T. L. Matskevich, J. Tech. Phys. (U.S.S.R.) 27, 736 (1957).
y T. L. Matskevich, J. Tech. Phys. (U.S.S.R.) 26, 2399 (1956).
z M. Knoll, O. Hachenberg, and J. Randmer, Z. Physik 122, 137 (1944).
a' B. Petzel, Thesis, Dresden (1958).
b' K. H. Geyer, Ann. Physik [5] 42, 241 (1942).
c' G. Blankenfeld, Ann. Physik [6] 9, 48 (1950).
d' N. R. Whetten and A. B. Laponsky, J. Appl. Phys. 28, 515 (1957).
e' J. B. Johnson and K. G. McKay, Phys. Rev. 91, 582 (1953).
f' H. Bruining and J. H. de Boer, Physica 6, 17 (1938).
g' J. B. Johnson, Phys. Rev. 73, 1058 (1948).
h' H. Kruger, Thesis, Berlin (1957).
i' C. W. Mueller, J. Appl. Phys. 16, 453 (1945).
energy gap, and finally (3) interaction with lattice defects. In the case of insulators with great energy gap, mechanism (2) is of no importance. In the case of metals, mechanisms (1) and (3) are less effective than the interaction with free electrons, which is responsible for the relatively low yields of metals.
FIG. 14. Normalized yield curves for different materials. (a) SbCs3. (b) Mean yield curve of the metals. (c) KBr according to B. Petzel [Thesis, Dresden (1958)]. (d) Quartz. (e) Glass according to H. Salow [Z. tech. Physik 21, 8 (1940); Physik. Z. 41, 434 (1940)].
5. Temperature Dependence. a. General considerations. The above conclusions can be tested by examining the temperature dependence of the yield. Of course, we must confine ourselves to phenomena connected with reversible variations of the yield. Variations which evidently originate from changes in adsorbed surface layers or are connected with permanent changes of the crystal are not taken into consideration. Since interaction processes between excited S and the various components of the solid have a direct effect on the range of S and hence on the yield, a certain classification of the phenomena in the case of semiconductors and insulators could be made (cf. Hachenberg, 33). Solids can be classified into the following groups:
1. In crystalline solids with great energy gap and relatively few lattice defects, interaction of S with lattice vibrations is predominant. The range of S, and hence the yield, becomes temperature-dependent. Both must decrease with rising temperature.
2. If the number of free electrons in the conduction band of a solid is sufficiently high, the interaction of S with lattice vibrations is overshadowed by interaction with free electrons.
3. In the case of a small energy gap, interactions of S with electrons of the valence band become most important; semiconductors of this group will behave in a similar manner to metals.
4. Finally, in a solid with considerable disorder, in the extreme case an amorphous solid, the range of S will no longer depend on interaction with lattice vibrations; solids of this kind should not show any dependence of the yield on temperature.
b. Experimental results. For metals, where measurements can be carried out with some ease, the result of recent investigations is rather unequivocal. The yield is constant in the whole temperature range measured. Morotsov (34) and Wooldridge (35, 36) found that the temperature coefficient of the yield is smaller than the temperature coefficient of linear extension.
Blankenfeld (37), too, found no variations > 1% for Ni in the temperature range from 20° to 400° C. Only Sternglass (38) found a temperature dependence, which, however, may be due in particular to adsorption layers at the surface, since he did not use a tube that could be subjected to heat treatment. Of greater interest for our discussion are the results obtained from semiconductors and insulators. For Ge, Johnson and McKay (39) found a continuous decrease by 5% in a temperature range of 20° to 600° C. In spite of different doping with activators, the decrease remained the same for all samples. Ge has diamond structure; the energy gap between valence band and conduction band is 0.78 ev, the electron affinity is nearly 4.4 ev. The rise in temperature mentioned above is accompanied by an increase in the number of conduction electrons for pure Ge samples, e.g., from $6 \times 10^{14}$ per cm³
to $7 \times 10^{16}$ electrons per cm³; since, on the other hand, for samples doped with Ga and Sb the density of conduction electrons rises only from $3.4 \times 10^{18}$ to $3.5 \times 10^{18}$ electrons per cm³, the very different changes in the density of conduction electrons cannot have caused the observed change in the yield, which is the same in all cases. It is to be supposed, therefore, that the yield is largely independent of the density of electrons in the conduction band for electron densities < $10^{18}$ per cm³. The diffusion of S is mainly determined by interaction with valence electrons and partly by interaction with lattice vibrations, as is proved by the small temperature effect. Thus Ge belongs to those solids which are characterized by group 3. Arranging intermetallic compounds of types A^I B^V and A^I B^IV into the above groups presents more difficulties. Cs3Sb layers were examined by Appelt and Hachenberg (14) and CsGe layers by Appelt (18) as to the dependence of the yield on temperature. As both types of layers are very unstable compounds that are subject to changes under the influence of temperature, measurements can be made only with great precautions. Nevertheless, it was possible also in these cases to prove a real dependence of the yield on temperature in a range of -30° to +70° C. The energy gap in the case of Cs3Sb amounts to 1.2 ev. The layers apparently take up a surplus of Cs atoms into the lattice. Thus, the diffusion of S will largely depend on interaction with valence electrons and lattice imperfections and only to a lesser degree on interaction with lattice vibrations. Glasslike insulators must be regarded as solids with a high degree of disorder. As these absorb only in the far ultraviolet, interaction with valence electrons is rather improbable. Thus, the mobility of S is predominantly determined by the degree of disorder.
Measurements on Pyrex glass by Mueller (40), on technical glasses by Blankenfeld (37) and by Shulman, Makedonsky, and Yaroshetsky (41), and on quartz glass by Kruger (42) agree in showing no dependence of the yield on temperature, so that this group of substances obviously has to be placed in group 4. A marked dependence of the yield on temperature is to be expected for alkali halide single crystals as well as alkaline earth oxide crystals. Because of the great energy gap (≈ 10 ev) and the relatively small number of lattice imperfections in such crystals, excited electrons can spread in the conduction band over relatively wide ranges; the latter are predominantly limited by interaction with lattice vibrations. With rising temperature, the number of lattice vibrations increases nearly in proportion to T, and the range decreases correspondingly. Now if the range of P is smaller than or similar to that of S, variation of the depth of escape has no appreciable influence on the yield. On the other hand, the yield must decrease in proportion to the depth of escape for primary energies $E_p^0 > E_{pm}^0$. A first estimation of the range of the S was attempted by Hachenberg
(33); he found the depth of escape to be proportional to 1/T. An improved investigation of the diffusion process was carried out by Dekker (43, 44), who found a dependence of the depth of escape $d_s$ that can be approximately described by

$$d_s \sim T^{-1/2}$$
Thus, it has become highly important in yield measurements to determine exactly the dependence of the yield on temperature in the range $E_p^0 > E_{pm}^0$. Knoll, Hachenberg, and Randmer (31) determined the temperature dependence of the yield by a static method for evaporated layers of KCl and found $\delta \sim T^{-1}$. Today it is clear that such measurements should be carried out only with good single crystals having a perfect surface and only by means of a pulse method with the lowest possible current intensity. A surface altered by polishing or corrosion can blot out the whole temperature effect of the yield, as was shown by Shulman (Sf?). Measurements satisfying the above demands were carried out by Johnson and McKay (45) for MgO crystals. For $E_p^0$ = 2000 ev, they found $\delta_1/\delta_2 = 0.78$ for $T_1$ = 1013° K and $T_2$ = 298° K, which approaches the law $\delta \sim T^{-1/2}$. For KCl, KI, and KBr, Shulman and Dementyev (15) found a temperature dependence of the yield between 0° and 300° C, which can also be described by the relation proposed by Dekker. Strikingly high yields and also a marked temperature dependence were found by Matskevich (46) for a special NaBr sample. For $T_2$ = 300° K and $T_1$ = 600° K, he obtained $\delta_1/\delta_2 = 0.4$, while 0.72 would have to be expected according to the relation proposed by Dekker. The observed variation corresponds approximately to a law $\delta \sim T^{-1.32}$. Finally, Petzel (47) made a series of measurements on KCl and KBr; Fig. 15 gives the yield curves for KBr, and Fig. 16 shows ln δ for six different values of $E_p^0$ plotted against ln T in the case of KCl. For $E_p^0$ > 4 kev, the measurements satisfy the $T^{-1/2}$ law. While the stronger dependence on temperature found by some authors may be attributed to a secondary effect (possibly to field-enhanced emission, which decreases when the temperature rises), there are, on the other hand, samples with surface imperfections showing too little influence of temperature.
Nevertheless, the experimental results obtained up to date may be regarded as a confirmation of the ideas developed. In conclusion, let us point out that the shape of the yield curves also shows a dependence on temperature. In particular, the position of the maximum shifts to lower values with rising temperature. As shown in Fig. 15 for KBr, the maximum shifts from 1,600 to 1,300 ev for a temperature variation from 35° to 300° C, which indicates a decrease of the depth of escape of S.
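The comparison between a measured yield ratio and a power law in temperature can be checked numerically. The following sketch is illustrative only: it assumes that the yield follows a pure power law between the two temperatures, and it takes an inverse-square-root law as the high-temperature limit of Dekker's relation. The function names are hypothetical and do not come from the cited papers.

```python
import math


def power_law_exponent(yield_ratio: float, t_hot: float, t_cold: float) -> float:
    """Exponent n of an assumed law delta ~ T**(-n), computed from the
    ratio delta(t_hot) / delta(t_cold) of yields at two temperatures (K)."""
    return -math.log(yield_ratio) / math.log(t_hot / t_cold)


def predicted_ratio(exponent: float, t_hot: float, t_cold: float) -> float:
    """Ratio delta(t_hot) / delta(t_cold) predicted by delta ~ T**(-exponent)."""
    return (t_hot / t_cold) ** (-exponent)


# NaBr sample quoted in the text: ratio 0.4 between 600 K and 300 K.
nabr_exponent = power_law_exponent(0.4, t_hot=600.0, t_cold=300.0)

# Ratio expected for an inverse-square-root law over the same interval.
inverse_sqrt_ratio = predicted_ratio(0.5, t_hot=600.0, t_cold=300.0)

print(f"NaBr exponent from measured ratio: {nabr_exponent:.2f}")
print(f"ratio for an inverse-square-root law: {inverse_sqrt_ratio:.2f}")
```

For the NaBr figures this gives an exponent near 1.3, while an inverse-square-root law would predict a ratio of about 0.71, close to the value of 0.72 quoted for Dekker's relation.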
FIG. 15. Yield curves of a KBr single crystal at different values of temperature: T = 35° C, T = 100° C, T = 200° C, and T = 300° C. [B. Petzel, Thesis, Dresden (1958)].
6. Influence of Work Function on Yield. The influence of the potential barrier at the surface on the yield can be fully ascertained. From measurements by Jonker we know that the angular distribution of internal S in a solid is isotropic; the electron current density impinging on the surface from the inside, therefore, has a cosine distribution. If, in addition, the energy distribution of the internal electrons is known, the influence of the potential barrier can be stated explicitly (cf. Sec. IV,E).
FIG. 16. Variation of yield δ of a KCl single crystal target with temperature T (°K) at different values of $E_p^0$. Curve A: $T^{-1/2}$. Curve B: $\{2[\exp(h\nu/kT) - 1]^{-1} + 1\}^{-1/2}$ ($\nu = 6.3 \times 10^{12}$ cps). [B. Petzel, Thesis, Dresden (1958)].
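The two reference curves named in the caption of Fig. 16 can be compared numerically. The sketch below is a hedged illustration: it assumes that the garbled exponents in the caption are both -1/2, uses the lattice frequency quoted there, and compares only the relative change of each curve between 300° K and 700° K, the range of the temperature axis. The function names are hypothetical.

```python
import math

PLANCK = 6.626e-34      # joule-sec
BOLTZMANN = 1.381e-23   # joule per degree


def curve_a(temperature_k: float) -> float:
    """Curve A: simple inverse-square-root law in temperature."""
    return temperature_k ** -0.5


def curve_b(temperature_k: float, frequency_cps: float = 6.3e12) -> float:
    """Curve B: {2[exp(h nu / kT) - 1]^-1 + 1}^(-1/2), i.e. (2n + 1)^(-1/2)
    with n the mean phonon occupation number at the given temperature."""
    occupation = 1.0 / (math.exp(PLANCK * frequency_cps
                                 / (BOLTZMANN * temperature_k)) - 1.0)
    return (2.0 * occupation + 1.0) ** -0.5


# Relative decrease of each curve between 300 K and 700 K.
ratio_a = curve_a(700.0) / curve_a(300.0)
ratio_b = curve_b(700.0) / curve_b(300.0)

print(f"curve A, 700 K relative to 300 K: {ratio_a:.3f}")
print(f"curve B, 700 K relative to 300 K: {ratio_b:.3f}")
```

Under these assumptions curve B falls slightly less steeply than curve A over this range, and the two forms approach each other at high temperature, where the occupation number becomes proportional to T.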
For an experimental examination of the influence of work function, a metallic target whose work function is known is covered with a very thin (say, monomolecular) layer of another metal. This layer is supposed to have a negligibly small part in the production of S, but it will alter the work
function of the target to a measurable extent. An alteration of SE yield will then have to be attributed to the alteration of work function. Sixtus (48) was the first to carry out such yield measurements, for a W target covered with Th layers of different thickness. The Th layers reduced the work function from 4.52 to 3.3 ev and 2.6 ev; at the same time, the maximum yield increased from 1.8 to 2.0 and 2.2. Thus, lower work function results in higher yield. A number of other authors studied the dependence of yield on work function in a similar manner. We confine ourselves to mentioning the measurements of Treloar (49), who found a decrease in yield for an oxidized W layer as against the pure W target from $\delta_1 = 1.31$ to $\delta_2 = 1.06$, while the work function increased from 4.52 ev to 6.3 ev. The influence of the work function on secondary emission is obviously very small compared with its extraordinary effect on thermal and photoelectric emission. With use of the yield formula (67) (Sec. IV,E2), we can calculate the influence of a variation of work function on the yield. As a result, one obtains ($E_F$ = 5 ev) for $\delta_2/\delta_1$ the value 0.72, whereas we find from Treloar's experimental values $\delta_2/\delta_1 = 0.81$. This result is nearly the same as that obtained by Baroody (24). The influence of work function on yield measurements is certainly manifold, though it has not always been possible to distinguish it clearly from other factors. It is known that yields from different crystal faces of a solid are marked by small differences. Knoll and Theile (50) succeeded in making these differences visible by depicting crystalline layers by means of S. The differences in work function of different crystal planes are sufficient to account for the observed differences in yield. In the majority of yield measurements, the surface of the targets to be measured is covered with adsorbed layers that are more or less unknown. These exert a certain influence on the measurements by altering the work function.
Part of the disagreement between the yield values stated by different authors has certainly been caused in this way. 7. Miscellaneous Problems. a. Oblique incidence of primaries. So far perpendicular incidence has been considered exclusively. Oblique incidence of the primary beam results in an increase of the yield with increasing angle of incidence. This dependence on the angle of incidence, however, becomes noticeable only for primary energies $E_p^0 > E_{pm}^0$. Bruining (cf. ref. 2) was able to represent the dependence of the yield on the angle of incidence by the relation
$$\ln \frac{\delta_\theta}{\delta_0} = \mathrm{const}\,(1 - \cos\theta) \tag{2}$$
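The behavior described by relation (2) can be illustrated with a short calculation. This is only a sketch under stated assumptions: the ratio on the left-hand side is read as the yield at angle of incidence θ divided by the yield at normal incidence, and the constant is a free, material-dependent parameter whose value (0.5 here) is purely hypothetical and not taken from any measurement.

```python
import math


def yield_at_angle(yield_normal: float, constant: float,
                   angle_deg: float) -> float:
    """Yield at oblique incidence from ln(delta_theta / delta_0) =
    constant * (1 - cos(theta)), with theta measured from the normal."""
    theta = math.radians(angle_deg)
    return yield_normal * math.exp(constant * (1.0 - math.cos(theta)))


# Hypothetical inputs chosen only to show the trend.
delta_normal = 1.3
assumed_constant = 0.5

angles = [0, 30, 60, 80]
yields = [yield_at_angle(delta_normal, assumed_constant, a) for a in angles]

for angle, value in zip(angles, yields):
    print(f"theta = {angle:2d} deg  ->  delta = {value:.3f}")
```

The relation reduces to the normal-incidence yield at θ = 0 and rises monotonically with the angle of incidence, as stated in the text.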
In order to understand this effect, it must be taken into account that, according to Jonker (20), the distribution of S is nearly isotropic and independent of the angle of incidence of the primary beam. It therefore seems improbable that an anisotropy of excitation, which might exist, should be responsible for the increase of the yield. What oblique incidence brings about is a change in the spatial distribution of P in the solid, resulting in an increased source density of S. Since the spatial distribution of P is strongly dependent on the diffusion process of the primary beam, its straggling must have different effects. For weak diffusion, that is, in the case of the primary beam taking a nearly linear path, the range perpendicular to the surface of the target decreases with cos θ; the source density of S in the layer near the surface increases accordingly. For strong diffusion, the influence of the direction of incidence on the distribution of P decreases. A corresponding decrease of the dependence of the yield on the angle of incidence from the light elements to the heavy elements can be observed in the measurements that have as yet been made. b. Depth of origin of secondaries. Because of their interactions with the electrons and phonons in the solid, the S released within the target have only a limited range $d_s$. Only those electrons can contribute to the yield which on arrival at the surface still have sufficient energy to overcome the surface barrier. Only the electrons excited in a certain surface layer of thickness $d_s$ are able to escape. Experiments on the dependence of the yield on the thickness of layers resulted in distinct saturation values above a certain thickness. These values of the thickness of the layers are to be regarded as the maximum depths of origin of S. The depth of origin obtained in this way is independent of the primary energy, as was first proved by Djatlowitskaja (51), and depends only on the material of the target.
Values of about 100 Å were found for metals; KCl gave a value of about 500 Å (52), in qualitative accordance with the above ideas on the transport process in SE. c. Time constant. So far we have exclusively considered the stationary process of SE. If a target is exposed to a rectangular pulse of P, the yield of SE reaches its full value only after a certain length of time from the beginning of the pulse. This building-up process can be described by the time constant of SE. The time constant is extraordinarily small. Therefore, it has not yet been possible to determine it by experimental measurement. The experiments that have so far been made result only in an upper limit for its value. On the one hand, various indirect methods have been used to determine
the time constant. Attempts have been made to infer its value from measurements with dynatrons, or to determine its value by means of the upper limit of the working range of klystrons the reflector of which had been replaced by an SE electrode. High-frequency multipliers have also been used for this purpose. The result obtained is that the time constant must have a value smaller than $1 \times 10^{-11}$ sec. The only direct method to determine its value was used by Greenblatt (55). By deflecting an electron beam back and forth across a narrow slit with a frequency of 400 Mc, he produced primary pulses of a duration of $6 \times 10^{-11}$ sec. The S pulses released by these pulses he analyzed as to their deformation. The measured broadening of the S pulse of $7 \times 10^{-11}$ sec has to be attributed to the time the electrons take to pass through the deflecting mechanism, so that the time constant of SE itself must certainly be smaller than $7 \times 10^{-11}$ sec. For the time being, an evaluation of this quantity must be left to theoretical discussion (cf. IV,G2).
C. The Interaction of Primary Electrons with Solids
Though P are more or less only carriers of the energy needed for the excitation of S, their behavior within the solid is nevertheless of fundamental importance for the whole process, and even decisive with regard to some particular problems; therefore, the behavior of P in solids will be discussed in more detail than is usual in most special studies on SE. 1. The Paths of Primaries. We can get a first idea about the paths of P from the investigation of electrons passing through thin films. When a parallel electron beam strikes a thin layer perpendicularly, it is both stopped and scattered. First let us consider the scattering process. Except for the few cases when electron waves are refracted at the lattice of the layer, the directions of motion of the electrons after passing through appear in a continuous distribution around the direction of incidence. This distribution depends on the thickness and the substance of the layer and on the velocity of the electrons. We obtain a good insight into the scattering process by comparison with observations made on the paths of electrons in the Wilson chamber. On entrance into the layer, the electron beam is dispersed to the same or even a higher degree than on entrance into the gas of the Wilson chamber (Fig. 17). The electron beam transmits its energy into an almost semispherical range around the point of incidence. Part of the P runs counter to the original beam. This is due to scattering events with scattering angles > 90° and to the coupling of several moderate-angle scatterings adding up to deflections of the electron paths > 90°. The excitation in every layer consists of two components, one being
effected by the direct beam, the other by the rediffused beam. It is necessary to know both components if one wishes to obtain accurate information on the source function of S.

2. Rediffusion. The current of rediffused P cannot be measured in the solid itself. As we stated above, it is possible outside the solid to separate true S with sufficient accuracy from reflected and rediffused P by means of a retarding potential of about 50 ev. For thin layers, the fraction of rediffused electrons was determined as a function of the thickness of the layer. This fraction increases with the thickness of the layer and reaches a maximum. The thickness at which the maximum is reached is called "rediffusion range."
FIG. 17. Path of 40-kev electrons in the Wilson chamber at normal pressure.
Rediffusion range and rediffusion coefficient η still depend on the angle of incidence of the P, and, of course, on their energy. For P at normal incidence in the energy range that alone concerns us here, η has recently been measured by Palluel (27) and by Holliday and Sternglass (28). Figure 18 shows η plotted against primary energy for a number of metals. With increasing primary energy, η approaches an upper limit between 0.05 and 0.5. For all metals, the curves rise from low energies up to about 15 kev. For a number of insulators, Matskevich (54) carried out measurements by a pulse method and found similar results. The relation between the upper limit of η, say at a primary energy of 20 kev, and the atomic number of metals becomes apparent from Fig. 19.
FIG. 18. Rediffusion coefficient η of different metals (abscissa: primary energy in kev).

FIG. 19. Upper limit of rediffusion coefficient η of different metals plotted against the atomic number.
At first the rediffusion coefficient rises linearly with Z; for Z > 30, however, it increases only slightly. For the heaviest metals, it approaches the limit 0.5. The deviation of the individual points from the curve is remarkably small, so that rediffusion is mainly dependent on
the atomic number, while the other characteristic quantities of atoms are of minor importance for rediffusion. The value of η = 0.5 for the heaviest metals indicates that in this case the electrons of the beam are distributed isotropically before they are absorbed to any relevant degree. If a normally incident electron beam is totally diffused without considerable absorption, half the incident electrons should eventually be re-emitted. For light elements, on the other hand, diffusion of the beam seems to play a minor role in comparison with absorption. The beam is weakened, and the P have lost a great part of their energy before any diffusion worth mentioning takes place.

The energy distribution curve of rediffused electrons runs nearly horizontal. Just below the primary energy, single sharp peaks are superposed on the continuum, indicating that a P on penetration into the solid undergoes single discrete losses of energy. In these cases, electrons have either raised crystal electrons from deeper levels into the conduction band or have excited plasma oscillations. Moreover, Harrower (8) found slight maxima in the range 50 ev < E < 250 ev; they could be interpreted as Auger electrons.

3. Energy Loss of Primaries. a. Elementary processes. For a theoretical treatment of the subject it is obviously of great use to know the elementary processes responsible for the loss of energy of P. Evidence of the elementary processes is found in the energy distribution of a primary beam after passing through a thin film. The energy distribution of the P on the exit side indicates three different processes:

1. An electron beam after passing through a thin film has mainly a continuous energy distribution, which practically begins shortly below the primary energy. It apparently originates from the excitation of electrons of the outer shells of the atoms, which in the case of metals are raised to the unoccupied levels closely above the Fermi surface.
As the excited electrons are probably distributed in an energy range of ≈ 15 ev above the Fermi level, the spectrum is smeared out, so that it is not possible to distinguish single transitions between energy bands with an energy difference < 10 ev. In particular, we cannot distinguish the transitions occurring within the conduction band from transitions from the next occupied band. However, the energy distribution is continuous over a much wider range, so that it must be concluded that the excitation of outer electrons overshadows the excitations from deeper levels. Excitations from deeper atomic shells are slightly indicated in the spectrum by Auger electrons. We may suppose that the number of these transitions is small.

2. A further mechanism of energy loss is to be attributed to the excitation of plasma oscillations. These transitions are observed as a series of
sharp maxima shortly below the primary energy, equally spaced in energy (55). The corresponding loss of energy per centimeter also seems small in most cases compared with that due to the excitation of single electrons.

b. The stopping-power law. The excitation processes by the P taken together determine the average loss of energy $-dE_p/dx$ per unit path length and define the range R of the electrons in the solid.
FIG. 20. Practical range R for electrons in aluminum [J. R. Young, J. Appl. Phys. 27, 1 (1956)]. (a) $R = 0.042\,(E_p^0)^{1.3}$. (b) $R = 0.0093\,(E_p^0)^{2}$.
In earlier studies, the range R was preferably measured by means of fast electrons, and a quadratic dependence on $E_p^0$ (the Whiddington law) was found. This was occasionally adopted also for electrons with 100 ev < $E_p^0$ < 10,000 ev and was introduced into semiempirical theories of SE. In more recent experimental investigations, the stopping-power law was measured for the range of primary energy mentioned above. Young (56) studied the penetration of P in Al for 0.5 kev < $E_p^0$ < 11 kev. His results are represented in Fig. 20. For $E_p^0$ > 8.5 kev, he confirmed the Whiddington law; for $E_p^0$ < 8.5 kev, he found

$$R \sim (E_p^0)^{1.3}$$
and thus clearly proved that the Whiddington law is not valid in the range we are primarily concerned with. Analogous measurements on Al2O3 films carried out by Young (57) gave very similar results. Here the law found for 0.3 kev < $E_p^0$ < 7.25 kev was

$$R = 0.0115\,(E_p^0)^{1.35} \qquad (R \text{ in mg/cm}^2,\ E_p^0 \text{ in kev})$$
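The power laws quoted above can be compared numerically. The following is only a minimal sketch: the function names are hypothetical, the coefficients are those quoted in the text and in the caption of Fig. 20, and the units of the two aluminum fits are assumed (the caption does not state them) to be the same as for the Al2O3 law.

```python
def range_power_law(energy_kev, coeff, exponent):
    """Practical range R = coeff * E**exponent for a primary energy E in kev."""
    return coeff * energy_kev ** exponent


def crossover_energy(coeff_a, exp_a, coeff_b, exp_b):
    """Energy at which coeff_a * E**exp_a equals coeff_b * E**exp_b."""
    return (coeff_a / coeff_b) ** (1.0 / (exp_b - exp_a))


# Coefficients as quoted in the text. The units of the aluminum fits are an
# assumption (taken to match the Al2O3 law: R in mg/cm^2, E in kev).
AL_LOW_ENERGY = (0.042, 1.3)     # Fig. 20, curve (a)
AL_WHIDDINGTON = (0.0093, 2.0)   # Fig. 20, curve (b), quadratic law
AL2O3 = (0.0115, 1.35)           # law quoted for Al2O3 films

e_cross = crossover_energy(*AL_LOW_ENERGY, *AL_WHIDDINGTON)
print(f"Al curves (a) and (b) intersect near {e_cross:.1f} kev")

for e_kev in (0.5, 1.0, 2.0, 5.0):
    r = range_power_law(e_kev, *AL2O3)
    print(f"Al2O3: E = {e_kev:4.1f} kev -> R = {r:.4f} mg/cm^2")
```

With these coefficients the two aluminum curves intersect near 8.6 kev, which agrees with the changeover at about 8.5 kev reported for Fig. 20.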
Lane and Zaffarano (58), also investigating Al2O3, found an exponent 1.66. Therefore, the Whiddington law must be replaced by another law with a smaller exponent of about 1.5 for the range that we are interested in.

c. The energy loss of the primary electron beam. As a result of the diffusion of the electron beam, the differential energy loss to be derived from the stopping-power law is not identical with the energy $(dW/dx)\,dx$ dissipated in the layer of thickness $dx$. This quantity is important for the derivation of the source function of S in the solid. In order to obtain knowledge of this quantity, we need the energy distribution and the number of electrons that have passed through thin films of different thickness. Young (59) carried out such measurements on Al2O3 films and determined the respective energy losses of the electron beam as a function of the thickness of the layers. The average energy loss $dW/dx$ remains practically constant over the entire penetration depth.

4. Spatial Source Distribution of Secondary Electrons. The spatial distribution of the sources of S in a solid during continuous bombardment with P has not yet received special attention, as far as we know, though it is of utmost importance for the problem of SE. An experimental determination of the source function cannot, however, be achieved directly, for it is impossible to measure directly the electrons excited per second in a volume element of the solid. It is only possible to determine the loss of energy of the primary beam in a layer at the depth x and then to assume that a constant fraction of this amount of energy is employed for the production of S. Perhaps one can proceed in the following way: We imagine the solid as cut at a plane denoted by the coordinate $x_0$. First one has to measure the primary beam flowing at the separation surface towards points $x > x_0$; then the current of rediffused electrons coming from the range $x > x_0$ has to be determined. Both currents combined make up the flow of energy at the point $x_0$. Then the same procedure has to be repeated for a point $x_1$; the difference of the two numbers denotes the amount of energy absorbed in the layer $x_1 - x_0$. Exact measurements of both fractions (fraction of incident P and
rediffused fraction) at separation surfaces of the sample have not yet been made. On the other hand, the electron beam as well as the energy loss of electrons has often been measured for beams passing through thin films. More recent measurements with electrons of $E_p^0$ = 2.5 to 10 kev, carried out by Young (59) for Al2O3 films, determine the fraction of energy of the primary beam absorbed in films of different thickness. Now Al2O3 has a relatively small fraction of rediffused electrons, the rediffusion coefficient being η = 0.12, so that the fraction of rediffused electrons can be regarded as a correction. If one differentiates the curve measured by Young and adds the fraction resulting from rediffusion, one obtains the energy absorbed in a layer of definite thickness, which may be regarded as a measure of the source density of the S.
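The differentiation step described above can be sketched as follows. This is an illustration only: the function name is hypothetical, the sample numbers are invented for the purpose of the example and are not Young's data, and the rediffusion correction is left out.

```python
def layer_dissipation(thickness, absorbed_fraction):
    """Finite-difference estimate of the energy dissipated per unit thickness.

    thickness         -- film thicknesses in increasing order
    absorbed_fraction -- fraction of the primary beam energy absorbed in a
                         film of the corresponding thickness
    Returns the layer midpoints and the absorbed fraction per unit thickness
    in each layer between successive measurement points.
    """
    midpoints, density = [], []
    for i in range(len(thickness) - 1):
        dx = thickness[i + 1] - thickness[i]
        da = absorbed_fraction[i + 1] - absorbed_fraction[i]
        midpoints.append(0.5 * (thickness[i] + thickness[i + 1]))
        density.append(da / dx)
    return midpoints, density


# Hypothetical sample data (NOT measured values): thickness in ug/cm^2 and
# the fraction of the beam energy absorbed in a film of that thickness.
x = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0, 120.0]
a = [0.0, 0.18, 0.42, 0.63, 0.80, 0.92, 1.00]

mids, dens = layer_dissipation(x, a)
for m, d in zip(mids, dens):
    print(f"layer centered at {m:5.1f} ug/cm^2: {d:.4f} per ug/cm^2")
```

Summing the density times the layer thickness recovers the total absorbed fraction, which corresponds to the statement that the area under the source function is proportional to the energy of the primary beam.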
FIG. 21. Dissipated energy as a function of the space coordinate (abscissa: thickness in μg/cm²).
The source functions for three different primary energies are plotted in Fig. 21. At about one-third of the maximum range of the electron beam, the curves have a flat maximum, which moves toward the surface if rediffusion becomes more important. The curves end at the limiting range of the electron beam; the area enclosed is proportional to the energy of the primary beam. A similar result had been obtained earlier by Hachenberg (60) from KCl single crystals. If one supposes that the production of color centers in crystals under electron bombardment is proportional to the
energy dissipated per volume element by the electron beam, one can obtain the source function directly by photometry. KCl single crystals were bombarded with 60 and 90 kev electrons; then the crystals were sliced perpendicularly to the bombarded surface and measured by photographic photometry. The resulting coloration curves (Fig. 22) were much the same as those shown in Fig. 21. In this way, it was possible to obtain the approximate dependence of the source function on the depth in the solid; yet a question that is important for the theory, namely how much of the energy absorbed is used for the excitation of S, has still to be answered.
FIG. 22. Coloration of KCl single crystals after bombardment with 60-kev and 90-kev electrons [O. Hachenberg, unpublished] (abscissa: thickness in mm).
IV. THEORY OF SECONDARY EMISSION
A. Interaction between a Free Electron and a Bloch Electron

1. General Formulas. All calculations on the phenomena resulting from interaction between a relatively fast electron and an electron of the solid (energy loss of P, excitation of S) start from the quantum-mechanical formula for the transition probability per second into the interval $d\alpha$:

$$P_{\alpha_0 \ldots}(\alpha, \ldots)\, d\alpha \ldots = \frac{2\pi}{\hbar}\, \left| (\alpha, \ldots | H_w | \alpha_0, \ldots) \right|^2\, \delta(E_{\alpha \ldots} - E_{\alpha_0 \ldots})\, d\alpha \ldots \qquad (3)$$
where the quantum numbers $\alpha_0 \ldots$ and $\alpha \ldots$ denote the stationary initial and final states, respectively, of the unperturbed system, which is characterized by a Hamiltonian $H_0$. $H_w$ denotes the perturbation operator inducing the transitions. Let us orthonormalize all occurring wave functions in a periodicity volume V. Then
are the wave functions of the free particle with energy
For the corresponding functions of the Bloch electron we have

$$\psi_k(r) = \frac{1}{\sqrt{V}}\, u_k(r)\, e^{i(k,r)}, \qquad E = E(k)$$

with

$$u_k(r + G) = u_k(r)$$
where G is an arbitrary lattice vector. The vector k is regarded as a wave vector in the extended zone scheme. Thus, we can write for the matrix element in (3):
where the screened Coulomb potential is taken for the interaction operator $H_w$. If necessary, we may use the pure Coulomb potential with $\lambda = 0$. Integration over R can be carried out without difficulty, so that
then the transition probability per second becomes
Because of the boundary conditions used, all occurring wave vectors range over discrete values. From (4) it may be seen (62) that the integral I can be written as
If N is the number of atoms in V , then it follows from the periodic boundary condition that
H denotes the vectors in the reciprocal lattice derived from the crystal lattice vectors G by the relation (G,H) = integer. So we see (63) that a finite transition probability for the process $k, K \to k', K'$ exists only in the case of conservation of quasi momentum, i.e., if

$$k' = k + q + 2\pi H$$
If we define the "form factor" F as
then
An evaluation of F requires the explicit form of the Bloch wave functions $\psi_k(r)$, which is not known in general. We can state, however, that in any case $|F| \le \int |u_k|^2\, d^3r = 1/n$, that is to say, $1/n$ represents an upper limit for F.

2. H = 0 processes. If we first consider the transitions corresponding to H = 0 according to (6), we see that for $q \to 0$ the form factor F tends toward

$$\int_{\text{unit cell}} |u_k|^2\, d^3r = \frac{1}{n} \qquad (9)$$
i.e., toward its upper limit. Because of the factor $q^{-4}$ in (5), the processes with small q play a particularly important role. By (7), F possesses the same value (9) in H = 0 processes for all q if we make the additional assumption that the crystal electron is entirely free, i.e., that $u_k(r) = 1$. The reader will observe that H = 0 follows necessarily from this assumption, but not vice versa. Of course, H = 0 processes do not necessarily correspond to an interaction with free crystal electrons. In H = 0 processes we shall later replace F by the approximation (9).

3. H ≠ 0 processes. A corresponding discussion of the form factor for
H ≠ 0 processes is far more difficult. As we mentioned above, the function $u_k(r)$ is by no means a constant. Moreover, for H ≠ 0 processes, F differs essentially in the limit of very small q from its corresponding behavior for H = 0 processes. In that case, the integral (7) tends to

$$\int_{\text{unit cell}} u_{k'}^{*}\, u_k\, e^{i(k - k',\, r)}\, d^3r, \qquad k' = k + 2\pi H$$

Because of the orthogonality of eigenfunctions belonging to different eigenvalues, this becomes equal to zero (64). In order to examine the behavior of F in the neighborhood of q = 0, it is convenient to expand it into a Taylor series about this point:
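In order to obtain $F^{(1)}$, F can exactly be transformed into the following expression (65):

where (62)

$$A(H; k', k) = \int_{\text{unit cell}} u_{k'}^{*}\, \operatorname{grad} u_k\, e^{-2\pi i (H,r)}\, d^3r \qquad (12)$$

and

$$\Delta E_{k',k} = E(k') - E(k)$$

Taking

$$\mathcal{E}(k,H) = \Delta E_{k + 2\pi H,\, k}$$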
we obtain from (11) without difficulty (62) F(1) =
h2 ze
+ 2rH,k))
(13)
Since $\mathcal{E}(k,0) = 0$, it is evident that an expansion of F by (11) is essentially based on the assumption H ≠ 0. Of course, the "linear approximation" $F = F^{(1)}$ for the form factor F is only valid for sufficiently small q, i.e., only as long as, say, $|F^{(2)}|$ [...] $E_{pm}^0$, so that from a theory based on (47) one cannot, e.g., obtain the yield maximum. The solution of the equations (47) is obtained by first constructing the appropriate Green's functions. They are defined by
The required functions $\psi_l(E)$ will then result from

$$\psi_l(E) = \int^{E_m} G_l(E,E_0)\, S_l(E_0)\, dE_0 \qquad (49)$$

$E_m$ denoting a maximum energy of the S, such that the suppositions made on deriving (47) will hold just for $E \le E_m$. According to what has been said above, $E_m$ = 100 ev.

2. $P_0$ Approximation. By (48) we can calculate the function
[...] $E_{pm}^0$; hence, we can no longer determine the dependence of the different distribution functions on primary energy outside this range. Within it, however, we could conclude from the theory that the relative distribution functions are independent of $E_p^0$. Since there is no detailed theory for $E_p^0 < E_{pm}^0$, the phenomenon of SE is considered in a rather simplified manner in the so-called semiempirical theory. Nevertheless, this enables us to obtain a fairly satisfactory representation of the dependence of the yield on $E_p^0$. Obviously, the semiempirical theory must be regarded as a quasi-heuristic expedient which will eventually have to be replaced by a detailed theory. Let $S(x,E_p^0;E,\Omega)$ denote the number of S per centimeter excited into the state $(E,\Omega)$ at the point x by a P with initial energy $E_p^0$. Then
$$\delta(E_p^0) = \int_0^\infty dx \int dE\, d\Omega\, S(x,E_p^0;E,\Omega)\, p(x,E,\Omega) = \int_0^\infty dx\, \bar{p}(x,E_p^0) \int dE\, d\Omega\, S(x,E_p^0;E,\Omega)$$
Provided that the mean probability of escape $\bar{p}(x,E_p^0)$ does not depend considerably on $E_p^0$, which will just be the case if the excitation function contains $E_p^0$ only as a factor, this can be written

$$\delta(E_p^0) = \int_0^\infty dx\, p(x)\, S(x,E_p^0) \qquad (68)$$

$S(x,E_p^0)$ representing the total number of S excited per cm at the point x. Besides the assumption on $p(x,E_p^0)$ which led to (68), the semiempirical theory is characterized by two further assumptions (2):

$$p(x) = p(0)\, e^{-\alpha x}$$

and

$$S(x,E_p^0) = -K\, \frac{dE_p}{dx}$$

Relation (69) results from the following simplified treatment of the transport process of S: If we first consider, say, all S in the state $(E,\Omega)$ and then single out those which had already been in this state at a certain point $x_0$, we may ask for the spatial distribution of those S in the emitter. We are thereby obviously concerned with those S which, proceeding from $x_0$ in the state $(E,\Omega)$, have not yet suffered any collision. Their spatial distribution is expressed by the reduced Boltzmann equation:
with the solution

$$N(x,E,\Omega) = N(x_0,E,\Omega)\, \exp\!\left(-\frac{x - x_0}{l \cos\vartheta}\right)$$
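The way the two assumptions of the semiempirical theory combine in (68) can be illustrated numerically. The following is a rough sketch under stated assumptions, not the authors' calculation: the energy loss is taken as constant along the path and equal to $E_p^0/R$ (motivated by the remark that the average energy loss stays practically constant over the penetration depth), the range follows a power law with the coefficients quoted earlier for Al2O3 films, and the values of K, p(0), and $\alpha$ are purely hypothetical.

```python
import math


def practical_range(e0, coeff=0.0115, exponent=1.35):
    """Assumed power-law range R = coeff * E0**exponent (mg/cm^2, kev)."""
    return coeff * e0 ** exponent


def yield_numeric(e0, k=1.0, p0=1.0, alpha=100.0, steps=2000):
    """Midpoint-rule evaluation of delta = int_0^R p0*exp(-alpha*x)*K*(E0/R) dx.

    k, p0 and alpha are hypothetical illustration values; alpha is taken
    in cm^2/mg so that alpha*x is dimensionless.
    """
    r = practical_range(e0)
    loss = e0 / r                      # assumed constant -dE_p/dx
    dx = r / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += p0 * math.exp(-alpha * x) * k * loss * dx
    return total


def yield_closed_form(e0, k=1.0, p0=1.0, alpha=100.0):
    """Closed form of the same integral under the constant-loss assumption."""
    r = practical_range(e0)
    return p0 * k * (e0 / r) * (1.0 - math.exp(-alpha * r)) / alpha


energies = [0.1 * i for i in range(1, 101)]          # 0.1 ... 10 kev
yields = [yield_closed_form(e) for e in energies]
i_max = yields.index(max(yields))
print(f"yield maximum near E0 = {energies[i_max]:.1f} kev (hypothetical parameters)")
```

Under these assumptions the yield first rises with primary energy, because more energy is dissipated within the escape depth $1/\alpha$, and then falls once the range exceeds that depth, so that a yield maximum appears. Its position depends entirely on the assumed parameters.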
>
'EPo "I, ~
E- , = -
'("">'"I
I+-
2 EPo
+AE
(90)
If we further suppose that N(E,P) be isotropic, we find
By introducing (90) and (91) into (88) we obtain for MgO with W = 0.25 ev, AE = 5.77 ev, E", = 1,200 ev, ,6 = 24, d, = 2 x 10W cm, the time constant T* 3 3 X 10-1~sec. Because for metals the excitation function (34) diverges for E = Er and thus also N and 8,, we have to calculate the excitation function with