ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS VOLUME 66
EDITOR-IN-CHIEF
PETER W. HAWKES
Laboratoire d'Optique Electronique du Centre National de la Recherche Scientifique, Toulouse, France
ASSOCIATE EDITOR-IMAGE PICK-UP AND DISPLAY
BENJAMIN KAZAN
Xerox Corporation Palo Alto Research Center, Palo Alto, California
Advances in Electronics and Electron Physics
EDITED BY PETER W. HAWKES
Laboratoire d'Optique Electronique du Centre National de la Recherche Scientifique, Toulouse, France
VOLUME 66 1986
ACADEMIC PRESS, INC. Harcourt Brace Jovanovich, Publishers Orlando San Diego New York Austin London Montreal Sydney Tokyo Toronto
COPYRIGHT © 1986 BY ACADEMIC PRESS, INC.
ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.

ACADEMIC PRESS, INC.
Orlando, Florida 32887

United Kingdom Edition published by
ACADEMIC PRESS INC. (LONDON) LTD.
24-28 Oval Road, London NW1 7DX

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 49-7504

ISBN 0-12-014666-5

PRINTED IN THE UNITED STATES OF AMERICA
CONTENTS

CONTRIBUTORS TO VOLUME 66 . . . vii
PREFACE . . . ix

Applied Problems of Digital Optics
L. P. YAROSLAVSKII
I. Introduction . . . 1
II. Adaptive Correction of Distortions in Imaging and Holographic Systems . . . 5
III. Preparation of Pictures . . . 45
IV. Automatic Localization of Objects in Pictures . . . 68
V. Synthesis of Holograms . . . 92
References . . . 136

Two-Dimensional Digital Filters and Data Compression
V. CAPPELLINI
I. Introduction . . . 141
II. Two-Dimensional Digital Filters . . . 142
III. Local Space Operators . . . 152
IV. Data Compression . . . 158
V. Joint Use of Two-Dimensional Digital Filters and Data Compression . . . 173
VI. Applications . . . 176
References . . . 199

Statistical Aspects of Image Handling in Low-Dose Electron Microscopy of Biological Material
CORNELIS H. SLUMP and HEDZER A. FERWERDA
I. Introduction . . . 202
II. Object Wave Reconstruction . . . 213
III. Wave-Function Reconstruction of Weak Scatterers . . . 230
IV. Parameter Estimation . . . 254
V. Statistical Hypothesis Testing . . . 277
Appendix A: The Statistical Properties of the Fourier Transform of the Low-Dose Image . . . 297
Appendix B: The Statistical Properties of an Auxiliary Variable . . . 299
Appendix C: The Cramér-Rao Bound . . . 305
References . . . 306

Digital Processing of Remotely Sensed Data
A. D. KULKARNI
I. Introduction . . . 310
II. Preprocessing Techniques . . . 319
III. Enhancement Techniques . . . 326
IV. Geometric Correction and Registration Techniques
V. Classification Techniques
VI. System Design Considerations
VII. Conclusion . . . 361
References . . . 361

INDEX . . . 369
CONTRIBUTORS TO VOLUME 66

Numbers in parentheses indicate the pages on which the authors' contributions begin.

V. CAPPELLINI, Dipartimento di Ingegneria Elettronica, University of Florence, and IROE-C.N.R., Florence, Italy (141)

HEDZER A. FERWERDA, Department of Applied Physics, Rijksuniversiteit Groningen, 9747 AG Groningen, The Netherlands (201)

A. D. KULKARNI,* National Remote Sensing Agency, Balanagar, Hyderabad, 500037 India (309)

CORNELIS H. SLUMP,† Department of Applied Physics, Rijksuniversiteit Groningen, 9747 AG Groningen, The Netherlands (201)

L. P. YAROSLAVSKII, Institute for Information Transmission Problems, 101447 Moscow, USSR (1)

*Present address: Computer Science Department, University of Southern Mississippi, Hattiesburg, Mississippi 39406.
†Present address: Philips Medical Systems, Eindhoven, The Netherlands.
PREFACE

The four chapters that make up this volume are all concerned, though in very different ways, with image handling, image processing, and image interpretation. The first contribution, which comes from Moscow, should help Western scientists to appreciate the amount of activity in digital optics in the Soviet Union. The extent of this is not always realized, for despite translation programs, some of it is not readily accessible and little is presented at conferences in Europe, the United States, and Japan. I hope that L. P. Yaroslavskii's chapter will help to correct the perspective where necessary.

V. Cappellini needs no introduction to the electrical engineering community; here he surveys the difficult but very active and important fields of digital filtering in two dimensions and source coding. The list of applications in the concluding section shows the wide range of application of these ideas.

The third chapter is concerned with the extremely delicate problem of radiation damage and image interpretation in electron microscopy. For some years, it has been realized, with dismay, that some specimens of great biological importance are destroyed in the electron microscope by the electron dose needed to generate a usable image. One solution is to accumulate very low dose images by computer image manipulation, but a thorough knowledge of image statistics is imperative for this, as indeed it is for other types of electron image processing. This difficult area remained largely uncharted territory until C. H. Slump and H. A. Ferwerda began to explore it in detail: their chapter here gives a very full account of their findings and sheds much light, more indeed than I suspect they dared to hope when they began, on this forbidding subject.

The final chapter, by A. D. Kulkarni, is concerned with yet another branch of this vast subject, in particular with enhancement and image analysis. This should be a very helpful supplement to the basic material to be found in the standard textbooks on the subject.

P. W. Hawkes
Applied Problems of Digital Optics

L. P. YAROSLAVSKII
Institute for Information Transmission Problems, Moscow, USSR
I. Introduction . . . 1
II. Adaptive Correction of Distortions in Imaging and Holographic Systems . . . 5
   A. Problem Formulation. Principles of Adaptation to the Parameters of Signals and Distortions . . . 6
   B. Methods for Automatic Estimation of Random-Noise Parameters . . . 9
   C. Noise Suppression by Filters with Automatic Parameter Adjustment . . . 16
   D. Correction of Linear Distortions . . . 27
   E. Correction of Nonlinear Distortions . . . 34
III. Preparation of Pictures . . . 45
   A. Problems of Picture Preparation. Distinctive Characteristics of Picture Preparation in Automated Systems . . . 46
   B. Preparation by Means of Adaptive Nonlinear Transformations of the Video Signal Scale . . . 47
   C. Linear Preparation Methods as a Version of Optimal Linear Filtration . . . 55
   D. Rank Algorithms of Picture Preparation . . . 61
   E. Combined Methods of Preparation. Use of Vision Properties for Picture Preparation . . . 63
IV. Automatic Localization of Objects in Pictures . . . 68
   A. Optimal Linear Coordinate Estimator: Problem Formulation . . . 69
   B. Localization of an Exactly Known Object with Spatially Uniform Optimality Criterion . . . 71
   C. Allowance for Object's Uncertainty of Definition and Spatial Nonuniformity. Localization on "Blurred Pictures" and Characteristics of Detection . . . 78
   D. Optimal Localization and Picture Contours. Selection of Objects from the Viewpoint of Localization Reliability . . . 82
   E. Estimation of the Volume of Signal Corresponding to a Stereoscopic Picture . . . 88
V. Synthesis of Holograms . . . 92
   A. Mathematical Model . . . 94
   B. Discrete Representation of Fourier and Fresnel Holograms . . . 98
   C. Methods and Means for Recording Synthesized Holograms . . . 102
   D. Reconstruction of Synthesized Holograms . . . 120
   E. Application of Synthesized Holograms to Information Display . . . 128
References . . . 136
I. INTRODUCTION

Improvement of the quality and information throughput of optical devices has always been the main task of optics. For the majority of applications, today's optics and electronics have, in essence, solved the
problem of generating high-quality pictures with great information capacity. Now, the effective use of the enormous amount of information contained in them, i.e., processing of pictures, holograms, and interferograms, has become topical. One might develop the information aspects of the theory of optical pictures and systems on the basis of information and signal theory and enlist the existing tools and methods for signal processing (of which the most important today are those of digital computer engineering). Armed with electronics, optics has mastered new wavelength ranges and methods of measurement, and by means of computers it can extract the information content of radiation. Computerized optical devices enhance the analytical capabilities of radiation detection, thus opening qualitatively new horizons to all areas in which optical devices find application.

Historically, digital picture processing began at the turn of the 1960s with the application of general-purpose digital computers to the simulation of techniques for picture coding and transmission through communications channels (David, 1961; Huang et al., 1971; Yaroslavskii, 1965, 1968), although digital picture transmission was mentioned as early as the beginning of the 1920s (McFarlane, 1972). By the 1970s it had become obvious that, owing to the advances of computer engineering, it might be expedient to apply digital computers to other picture-processing problems (Vainshtein et al., 1969; Huang et al., 1971; Yaroslavskii, 1968) which traditionally belonged to the domain of optics and optoelectronics. First, publications appeared dealing with computer synthesis of holograms for information display, synthesis of holographic filters, and simulation of holographic processes (Brown and Lohmann, 1966, 1969; Huang and Prasada, 1966; Lesem, 1967; Huang, 1971).
Finally, by the middle of the 1970s progress in microelectronics enabled the advent of the first digital picture-processing systems, which found wide applications in Earth resource studies, medical diagnostics, and computer-aided research. The digital processing of pictures and other optical and similar signals is now emerging as a new scientific field integrating theory, methods, and hardware. We refer to this area as "digital optics" by analogy to the term "digital holography" (Huang, 1971; Yaroslavskii and Merzlyakov, 1977, 1980, 1982), which combines such segments as digital synthesis, analysis, and simulation of holograms and interferograms. The term digital optics reflects the fact that, along with lenses, mirrors, and other traditional optical elements, digital computers and processors are becoming integral to optical systems. Finally, to complete the characterization of digital optics as a scientific field, one should say that it is a part of the general penetration of computer engineering and digital methods into optical studies, as recently noted by Frieden (1980).
What qualitatively new features are brought to optical systems by digital processors? There are two major ones. The first is adaptability and flexibility. Owing to the fact that the digital computer is capable of rearranging the structure of the processing without changing its own physical structure, it is an ideal vehicle for adaptive processing of optical signals and is capable of rapid adaptation to various tasks, above all to information adaptation. It should also be noted that this capability of the digital computer to adapt and rearrange itself has found application in active and adaptive optics for the control of light beams as energy carriers. The second is the simplicity of acquiring and processing the quantitative data contained in optical signals, and of connecting optical systems with other information systems. The digital signal representing the optical one in the computer is essentially the pure information carried by the optical signal deprived of its physical vestment. Thanks to its universal nature, the digital signal is an ideal means for the integration of different information systems.

Digital optics relies upon information theory, digital signal processing theory, statistical decision theory, and the theory of systems and transformations in optics. Its methods are based on the results of these disciplines, and, similarly, these disciplines find in digital optics new formulations of their problems. Apart from general- and special-purpose computers, the hardware for digital optics also involves optical-to-digital signal converters for input into the digital processor and converters of digital signals into optical form such as displays, photorecorders, and other devices. In the early stages of digital optics, this hardware was borrowed from other fields, including general-purpose computer engineering, computer graphics, and computer-aided design.
Currently, however, dedicated hardware is being designed for digital optics, such as devices for the input of holograms and interferograms into computers, precision photorecorders for picture processing and the production of synthesized holograms, displays, and display processors. Digital optics considerably influences trends in today's computer engineering towards the design of dedicated parallel processors of two-dimensional signals. As an area of research, digital optics interfaces with other information and computer sciences such as pattern recognition, artificial intelligence, computer vision, television, introscopy, acoustoscopy, radio holography, and tomography. Therefore, the methods of digital optics are similar to those of these sciences, and vice versa.

The aim of this article is to discuss the most important problems of applied digital optics as seen by the author, including those of adaptation and of continuity and discreteness in processing pictures and other optical signals. The first section deals with methods for the correction of linear and nonlinear distortions of signals in display and holographic systems and with noise
suppression. The emphasis will be on adaptive correction of distortions with unknown parameters and on a means of automatic estimation of these parameters through the observed distorted signal. The second section is devoted to methods for the improvement of a picture's visual quality and to making preparations for facilitating visual picture interpretation. The term "preparation" was suggested by the present writer (Belikova and Yaroslavskii, 1974; Yaroslavskii, 1979a, 1985) expressly to stress the need for a special processing oriented to the individual user.

The philosophy of the methods described in the first two sections relies upon the adaptive approach formulated in Section II,A, which has three aspects. First, it is constructed around adaptation to unknown noise and distortion parameters by means of direct estimation of them through observed distorted signals. Second, for the determination of optimal processing parameters, a new statistical concept of a picture is used that regards the picture as a combination of random object(s) to be interpreted and a random background, together with a new correction quality criterion. This consists in considering that the correction error is minimized on average over a noise ensemble and the random parameters of "interpretation objects" (see Subsection II,A,1), while the background is considered as fixed. With this method, adaptation to the background is attained. Third, the approach envisages adaptation of picture processing to the user, that is, to the specific problem faced by the user of the data contained in a picture. As noted above, it is the simplicity of adaptive processing that is one of the basic merits of digital picture processing as compared with analog (optical, photographic, electronic, etc.) methods.

The third section demonstrates how this adaptive approach may be extended to the detection of objects in pictures. This is one of the fundamental problems in automated recognition.
The fourth section discusses the problems of digital holography and, by way of hologram synthesis for information display, illustrates another important and characteristic aspect of digital optics: the need to allow in digital processing for the analog nature of the processed signal, i.e., the need to observe the principle of correspondence between analog signal transformation and its digital counterpart. Such a need exists not only in the digital processing of optical signals, but in digital holography it is particularly manifest because the digital hologram obtained from a digital (discrete and quantized) signal is at the same time an analog object, an element of an analog optical system, and thus a most evident embodiment of the unity of discreteness and continuity.
II. ADAPTIVE CORRECTION OF DISTORTIONS IN IMAGING AND HOLOGRAPHIC SYSTEMS

There are many papers, reviews, and monographs on the correction of distortions in imaging systems (Vasilenko, 1979; Sondhi, 1972; Frieden, 1975; Huang et al., 1971; Huang, 1975; Andrews and Hunt, 1977; Gonzales and Wintz, 1977; Pratt, 1978). Their attention is focused on the elimination of distortions in systems which either may be regarded as linear, spatially invariant systems with additive and independent noise, or may be reduced to them. Distortions and their correction in holographic systems have not been sufficiently studied. Little attention has been paid to the correction of nonlinear distortions, including those due to signal quantization in digital processors, and to the suppression of random noise, which is of prime importance in real problems of processing pictures, holograms, and interferograms. Moreover, the characteristics of distortions and noise, which are required data for their correction and suppression, are usually assumed to be known, although in practical applications of picture processing this is far from being the case, and one must estimate the parameters of distortions and noise directly through the observed distorted signal. Finally, it should be mentioned that, in the majority of the existing studies of correction, insufficient attention has been paid to specific computational methods, the peculiarities of digital representation, and the processing of signals in digital computers. These problems are discussed in this section. In Subsection II,A the principles of the adaptive approach to picture distortion correction, correction quality estimation, and determination of distortion parameters through distorted signals are formulated.
Subsection II,B describes algorithms intended for noise parameter estimation through an observed noisy signal: measurements of the variance and correlation function of additive signal-independent fluctuation noise, and of the intensity and frequency of harmonic components of periodic noise in pictures; and estimates of the parameters of pulse noise, quantization noise, and noise of the "striped" type. Subsection II,C is devoted to noise filtration: linear filtration with automatic adjustment of parameters for the suppression of additive noise of narrow spectral composition as well as "striped" noise, and nonlinear methods of pulse noise filtration. On the basis of the adaptive approach developed, methods are proposed in Subsection II,D for the digital correction of linear distortions in imaging systems and in those for hologram recording and reconstruction. Subsection II,E discusses the digital correction of nonlinear distortions, its relation to the problem of optimal signal quantization, practical methods of
amplitude correction, and the possibilities of automatic estimation and correction of nonlinear distortions of interferograms and holograms.

A. Problem Formulation. Principles of Adaptation to the Parameters of Signals and Distortions
The solution of the distortion correction problem is built around the assumption that it is possible to define a two-dimensional function a(x, y) describing the output of an ideal system, and that the real system may be described by some transform F converting the ideal signal into that actually observed:

b(x, y) = F{a(x, y)}    (1)

The task of correction is then to determine, knowing some parameters of the transform F, a correcting transform Φ of the observed signal such that the result of its application,

ã(x, y) = Φ{b(x, y)}    (2)

be, in the sense of some given criterion, as close to the ideal signal as possible. The choice of approaches to this problem depends on the way of describing signals and their transformations in the corrected systems and also on the correction quality criterion.
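Equations (1) and (2) can be illustrated numerically. The sketch below is a minimal, hypothetical instance, not a method from the text: F is taken to be a circular five-point blur plus weak additive noise, and the correcting transform is a Tikhonov-regularized inverse filter; the kernel, noise level, and regularization constant `lam` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
x = np.linspace(0.0, 1.0, n, endpoint=False)
a = np.sin(2 * np.pi * 3 * x) + 1.0           # ideal signal a(x)

# Distorting transform F: circular 5-point moving-average blur + additive noise
h = np.zeros(n)
h[:5] = 1.0 / 5.0
H = np.fft.fft(h)
b = np.real(np.fft.ifft(np.fft.fft(a) * H)) + 0.01 * rng.standard_normal(n)

# Correcting transform: Tikhonov-regularized inverse filter
lam = 1e-3                                     # regularization constant (illustrative)
G = np.conj(H) / (np.abs(H) ** 2 + lam)
a_hat = np.real(np.fft.ifft(np.fft.fft(b) * G))

err_distorted = np.mean((b - a) ** 2)
err_corrected = np.mean((a_hat - a) ** 2)
```

With these settings the corrected signal is much closer to the ideal one than the raw observation; the regularization term keeps the inverse filter from amplifying noise at frequencies where |H| is small.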
1. Description of Pictures and Correction Quality Criterion

According to the fundamental concepts of information theory and optimal signal reception theory, signals are elements of a statistical ensemble defined by the ensembles of messages carried by the signals and random distortions and noise. The distortion correction quality is defined by the correction error of individual realizations of the signal,

⟨ ε̄²(a − ã) ⟩    (3)

averaged over these ensembles. Here, the overbar represents averaging over the ensemble of random distortions and noise, and the angle brackets represent an average over the ensemble of signals. For a concrete definition of averaging over the signal ensemble in Eq. (3), it is necessary to have a description of pictures as elements of the statistical ensemble. In studies of picture restoration, the statistical description relies most commonly on statistical models of Gaussian and Markov random processes and their generalizations to the two-dimensional case. As applied to picture processing, however, this approach is very limited. It is essential in picture processing that pictures are, from the viewpoint of information theory,
signals rather than messages. It is the random picture parameters, whose determination is in essence the final aim of picture interpretation, that are messages. These may be size, form, orientation, relative position of picture details, picture texture, etc. Therefore, two essentially different approaches should be distinguished in the formulation of the statistical description of pictures as signals. In one of them, which may be called a local informational approach, pictures are considered as a set of "interpretation objects" and a random background. Interpretation objects involve picture details whose random parameters (e.g., mutual position, form, orientation, number, etc.) are the messages which should be determined as the result of picture interpretation. The rest of the picture, which has no informative (from the viewpoint of the given application) parameters, is the background. The other approach may be called a structure informational one. In this case, the parameters of the picture as a whole, e.g., its texture, are informative, and the picture cannot be decomposed into interpretation objects and background. For a statistical description of pictures as textures, the above-mentioned classical methods and models of random process theory may be used. A statistical description of pictures in the local informational approach is more complicated and should be based on a separate statistical description of the interpretation objects and the background, and also of their interrelations. In particular, this results in the fact that the error [see Eq. (3)] of picture distortion correction should be averaged separately over the random parameters of interpretation objects and the random background. In doing so, the correcting transform minimizing the correction error (as averaged over the background) will also be optimal on the average. However, it is usually desirable that the correcting transform be the best for a given particular corrected picture rather than on the average.
From the standpoint of the local informational approach, this implies that a conditionally optimal transform with fixed background is desired rather than averaging of the correction error [Eq. (3)] over the random background. It is this approach that will be studied below. Accordingly, the ε̄²(a − ã) in Eq. (3) will be understood as values of the signal correction error averaged over the set of the corrected picture samples, and the angle brackets will be understood as averaging over random interpretation object parameters only.

2. System Description

It is customary to employ for the description of signal transformations in imaging and holographic systems models built of elementary units performing pointwise nonlinear or linear transformations of signals and responsible for
the so-called nonlinear and linear signal distortions, while random corruptions of the signal are described by models of additive and multiplicative fluctuation and pulse noise. In accordance with this description, correction is divided into suppression of noise and correction of linear and nonlinear distortions, which are solved in the sequence reverse to that of the units in the system model.

3. Principles Underlying Estimation of Noise and Distortion Parameters

The distinguishing feature of the correction of pictures, holograms, and interferograms is that the characteristics of noise and distortions which are necessary for the construction of correcting transforms are mostly unknown in advance and must be extracted directly from the observed distorted signal. This refers primarily to the determination of the statistical characteristics of noise.
In this approach, construction of optimal parameter estimation procedures should be based in principle on statistical models of the characteristics under consideration which should be constructed and substantiated specifically for each particular characteristic. Fortunately enough, in the majority of practical cases, noise is a very simple statistical object; i.e., it is describable by a few parameters, and the characteristics of the distorted signal are dependent mostly on the picture background. Therefore, the reduced problem of noise parameter estimation may be solved by comparatively simple tools even if the statistical properties of the measurable video signal characteristics are given a priori in a very rough and not too detailed manner. One has only to choose among all the measurable signal characteristics those for which noise-induced distortions manifest themselves as anomalies of behavior detectable in the simplest possible way.
Without making it our aim to construct an exhaustive theory of anomaly detection and estimation, we shall just describe two easily digitally realizable and, to our mind, sufficiently universal detection methods relying upon a common a priori assumption about the smoothness of nondistorted signal characteristics: those of prediction and voting (Jaroslavski, 1980b). Philosophically, these methods are akin to the recently developed robust parameter estimation methods [e.g., see Ershov (1978)].

In the prediction method, for each given element of the sequence under consideration, the difference is determined between its actual value and that predicted through the preceding, already analyzed elements. If the difference exceeds some given threshold, it is concluded that there is an anomalous overshoot. In doing so, the prediction depth, a technique for determination of the predicted value, and the threshold must be defined a priori for the given class of signals.

The voting method is a generalization of the well-known median smoothing method [e.g., see Pratt (1978)], in which each element of the sequence is considered together with 2n of its neighbors (n from the left and n from the right). This sample of (2n + 1) values is arranged in decreasing or increasing order of magnitude, and the value of the given element is compared with the k extreme (i.e., greatest or smallest) values of the ordered sequence. If it falls among them, it is concluded that this element has an anomalous (great or small) value. The voting method is built around the assumption that the "normal" characteristic as a rule is locally monotonous and that deviations from local monotonicity, if any, are small. Values of n and k are given a priori on the basis of assumptions about the "normal" behavior of the nondistorted signal characteristic.
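The two detection methods just described can be sketched as follows. The window sizes, the threshold, and the replace-on-detect policy in the prediction variant are illustrative assumptions, not specifics from the text.

```python
def prediction_outliers(seq, threshold, depth=3):
    """Prediction method: flag an element whose deviation from the mean of
    the `depth` preceding (already analyzed) values exceeds `threshold`.
    Detected anomalies are replaced by their prediction so they do not
    corrupt subsequent predictions (an illustrative policy)."""
    clean = list(seq[:depth])
    flags = [False] * min(depth, len(seq))
    for i in range(depth, len(seq)):
        pred = sum(clean[-depth:]) / depth
        if abs(seq[i] - pred) > threshold:
            flags.append(True)
            clean.append(pred)       # substitute the prediction for the anomaly
        else:
            flags.append(False)
            clean.append(seq[i])
    return flags


def voting_outliers(seq, n=2, k=1):
    """Voting method: within each window of 2n + 1 values, flag the centre
    element if it ranks among the k greatest or k smallest values."""
    flags = [False] * len(seq)
    for i in range(n, len(seq) - n):
        window = sorted(seq[i - n:i + n + 1])
        if seq[i] <= window[k - 1] or seq[i] >= window[-k]:
            flags[i] = True
    return flags


seq = [1.0, 1.1, 1.2, 9.0, 1.3, 1.4, 1.5]    # single anomalous overshoot at index 3
flags_p = prediction_outliers(seq, threshold=2.0)
flags_v = voting_outliers(seq)
```

Both methods flag only the spike at index 3; on a strictly monotone run the centre of a window is never one of its extreme values, so the voting rule leaves it alone.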
This approach to the correction of distortions of picture signals, where correction algorithms are optimized on the average over the random parameters of interpretation objects and realizations of random noise, and the required statistical properties of noise and distortions are determined directly from the observed distorted signal, may be called "adaptive."

B. Methods for Automatic Estimation of Random-Noise Parameters
This subsection deals with methods based on the above approach and intended for automatic diagnostics of such types of noise as additive signal-independent fluctuation noise, additive narrow-band noise, pulse noise, noise of the "striped" type, and quantization noise (those most commonly met in practical correction of pictures, holograms, and interferograms).
L. P. YAROSLAVSKII
1. Diagnostics of the Parameters of Additive Signal-Independent Fluctuation Wide-Band Noise in Pictures

The most important characteristics of additive and statistically signal-independent fluctuation noise are its standard deviation and correlation function. If, as is often the case, the noise is not correlated or is weakly correlated, the following simple algorithm may be constructed for determination of its variance and correlation function, based on the measurement of anomalies in the covariance function of the observed picture (Yaroslavskii, 1979a, 1985). Owing to the additivity and signal independence of the noise, the covariance function C_0(r, s), measured over observed, N x M-element pictures, is the sum of the covariance function C(r, s) of a non-noisy picture, the noise covariance function C_n(r, s), and the realization of a random process ε(r, s) that characterizes the measurement error of a noise covariance function through its finite-dimensional representation:

C_0(r, s) = C(r, s) + C_n(r, s) + ε(r, s)    (4)
The variance of the random process ε(r, s) is known to be inversely proportional to the number of samples NM over which the measurement was done. Since this number is over hundreds of thousands, the random error ε(r, s) in Eq. (4) is small, and C_n(r, s) may be estimated as

C_n(r, s) = C_0(r, s) − C(r, s)    (5)
Consider first the case of noncorrelated noise, where

C_n(r, s) = σ_n² δ(r, s)    (6)
σ_n² being the noise variance and δ(r, s) the Kronecker delta function. Thus, the covariance function of the observed picture differs from that of the non-noisy one only at the origin, the difference being equal to the noise variance:

σ_n² = C_0(0, 0) − C(0, 0)    (7)
and for the rest of the values of (r, s) one may use C_0(r, s) as an estimate of C(r, s):

C_0(r, s) = C(r, s)    (8)
As measurements of picture correlation functions have demonstrated (for example, see Mirkin, 1978), in the vicinity of the origin (r = 0, s = 0) they are very slowly varying functions of r and s. The value of C(r, s) necessary for the computation of noncorrelated noise variance through Eq. (7) may
be, therefore, estimated with high accuracy by interpolation over the values C_0(r, s) for points (r, s) in the vicinity of the origin. Thus, in order to determine the variance of additive noncorrelated noise in a picture, it is sufficient to measure the covariance function C_0(r, s) of the observed picture in a small vicinity of the point (0, 0), determine by interpolation the estimate Ĉ(r, s) of C(r, s), and apply

σ̂_n² = C_0(0, 0) − Ĉ(0, 0)    (9)

as a variance estimate. Experiments show that even interpolation over one-dimensional cross sections of the covariance function provides good estimates (Mirkin and Yaroslavskii, 1978). This approach may also be used for estimating the covariance function and variance of weakly correlated noise, i.e., noise whose covariance function C_n(r, s) is distinct from zero only in a small vicinity of the origin, where the non-noisy picture covariance function may be satisfactorily interpolated by the values of C_0(r, s) at those points where C_n(r, s) is known in advance to be zero. In the above method, the approximate dimensions of the domain within which the nonzero values of C_n(r, s) are concentrated, and the smoothness of C(r, s) in the vicinity of this domain, are postulated a priori. In Fig. 1, for the sake of illustration, the covariance function of the picture shown in Fig. 2 is presented on a semilogarithmic scale. One can readily see in Fig. 1 the break of the covariance function, interpolated values of this function in the vicinity of zero being shown by the dotted line. Below, the difference between the original and interpolated functions is shown, which serves as the estimate of the covariance function of noise in Fig. 2.
FIG. 1. Estimate of the covariance function of wide-band noise in the picture of Fig. 2.
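The one-dimensional variant of this variance-estimation procedure can be sketched as follows; the test signal, the noise level, and the linear extrapolation from lags 1 and 2 are illustrative assumptions, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Slowly varying "picture" signal plus white noise of known variance 0.09.
t = np.arange(8192)
clean = np.sin(2 * np.pi * t / 512) + 0.5 * np.sin(2 * np.pi * t / 173)
noisy = clean + rng.normal(0.0, 0.3, t.size)

def autocov(x, max_lag):
    """Sample autocovariance for lags 0 .. max_lag-1."""
    x = x - x.mean()
    return np.array([np.dot(x[:x.size - r], x[r:]) / x.size
                     for r in range(max_lag)])

c = autocov(noisy, 3)
# The clean-signal covariance varies slowly near the origin, so its value
# at lag 0 is estimated by linear extrapolation from lags 1 and 2; the
# break at lag 0 is then the white-noise variance, in the spirit of Eq. (9).
c0_interpolated = 2.0 * c[1] - c[2]
noise_variance_estimate = c[0] - c0_interpolated
```

With the parameters above, the estimate lands close to the true noise variance 0.09, illustrating how sharply the noise "break" stands out against a slowly varying picture covariance.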
FIG. 2. Picture used in experiments on estimation of the noise covariance function.
2. Estimation of’ Additive Wide-Band Noise Parameters in “One-Dimensional” Interferoyrams
An interferogram with monotonic variation, in some direction, of the phase difference between reference and object beams will be referred to as "one-dimensional" (Yaroslavskii and Fayans, 1975), as exemplified by the interferogram of Fig. 3a. The ideal, i.e., noiseless, interferogram is a two-dimensional sinusoidal signal. As follows from the properties of the discrete Fourier transform, in the power spectrum of this two-dimensional signal there exists a sharp peak near the mean spatial frequency of the interferogram (see Fig. 3b). If there is additive noise in the interferogram, as in Fig. 3a, the peak is also observed against the noise background (Fig. 3c). The problem of estimating signal and noise parameters via their observed power spectrum evidently boils down to that of detecting the signal peak in the spectrum and separating that area of the
FIG. 3. Noise parameter estimation in interferograms: (a) example of a noisy interferogram; (b) power spectrum of a non-noisy interferogram; (c) power spectrum of the interferogram in part (a).
spectral plane where the intensity of the signal spectrum components is essentially distinct from zero. The boundaries of this area may be determined by means of a priori data on the mean spatial frequency of the interferogram, which depends on the interferometer design, and on the maximal area of the spatial spectrum defined by the a priori data on the interferometry object. Yaroslavskii and Fayans (1975) have demonstrated that sufficiently good estimates of the noise power spectral density may be obtained by simple averaging of the noisy interferogram spectrum over the peripheral areas of the spectral density which are known not to be occupied by the signal spectrum. Notably, apodization masks (windows usually used in spectral analysis) must be used for better cleaning of the observed signal spectrum periphery from the tails of the spectrum peak of a non-noisy interferogram signal (Ushakov, 1981).
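A one-dimensional sketch of this estimate might look as follows; the interferogram frequency, the noise level, and the choice of peripheral band are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 1024
x = np.arange(N)
# "One-dimensional" interferogram: a sinusoid at a known mean spatial
# frequency (bin 60 here), plus additive wide-band noise.
interferogram = np.cos(2 * np.pi * 60 * x / N) + rng.normal(0.0, 0.5, N)

window = np.hanning(N)  # apodization mask to confine the signal peak
power = np.abs(np.fft.rfft(interferogram * window)) ** 2

# Average the power spectrum over peripheral frequencies known a priori
# not to be occupied by the signal spectrum (here: bins 200-499).
noise_spectral_density_estimate = power[200:500].mean()
```

The apodization window keeps the sidelobes of the strong interferogram peak from leaking into the peripheral band being averaged, which is exactly the role the text assigns to it.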
Even more exhaustive use of a priori data on the ideal interferogram and additive noise is also possible in the determination of noise parameters. For example, Ushakov (1979) has used the fact that the distribution of the noise spectral component intensity is essentially of the Rayleigh type for interferogram noise with a rather arbitrary distribution, due to the normalizing effect of the Fourier transform. He also supposed that the signal spectral component intensity has a uniform distribution, which is equivalent to the assumption of a triangular, pyramid-shaped form for the signal peak in the frequency domain. This allowed him to construct an algorithm deciding, for each spectral component of a noisy interferogram, whether it belongs to the signal or noise area of the spectral domain via the value of the likelihood ratio. Ushakov's experiments (1979, 1981) have demonstrated that this method of diagnostics may provide a very high degree of noise suppression in interferograms by filtration.
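The likelihood-ratio decision can be illustrated as follows; the exponential density for noise intensity (Rayleigh amplitudes) and the uniform density for signal intensity follow the stated model, while the parameter values and the simple LR > 1 threshold are illustrative assumptions rather than Ushakov's actual procedure:

```python
import numpy as np

def classify_spectral_components(intensity, noise_mean, signal_max):
    """Return True for components classified as signal by the likelihood
    ratio, assuming exponentially distributed noise intensity (Rayleigh
    amplitudes) and uniform signal intensity on [0, signal_max]."""
    p_noise = np.exp(-intensity / noise_mean) / noise_mean
    p_signal = np.where(intensity <= signal_max, 1.0 / signal_max, 0.0)
    return p_signal > p_noise

# Weak components are attributed to noise, strong ones to the signal peak.
labels = classify_spectral_components(np.array([0.5, 10.0]),
                                      noise_mean=1.0, signal_max=100.0)
```

The decision reduces to an intensity threshold at noise_mean * log(signal_max / noise_mean), which is where the two densities cross.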
3. Estimation of Intensity and Frequency of Harmonic Components of Periodic and Other Narrow-Band Noises

Periodic (moiré) noise occurs most commonly in TV and photographic systems where the video signal is transmitted through radio channels. Sometimes it occurs because of discretization of pictures with high-frequency periodic structures, and sometimes it is due to interference effects in pictures obtained in coherent optical imaging systems. The characteristic feature of this noise is that its spectrum in the Fourier basis has only a few components appreciably distinct from zero. Noise having narrow spectral composition in other bases may also be regarded as belonging to this class. At the same time, the spatial spectrum of non-noisy pictures in Fourier and some other bases is, as a rule, a more or less smooth and monotonic function. Therefore, narrow-band noise manifests itself in the form of anomalously great and localized deviations, or overshoots, in the spectra of distorted pictures. In contrast to the above fluctuation noise, having overshoots of the correlation function at the origin, the localization of these overshoots is unknown. They may be localized by means of the prediction and voting methods described above. To this end, the mean value ⟨|β_{r,s}|²⟩ of the squared modulus of the noisy signal spectral components, taken with respect to a chosen basis and computed by appropriate fast algorithms (Ahmed and Rao, 1975; Yaroslavskii, 1979a, 1985), is determined by averaging over all the observed pictures with similar periodic noise. If one-dimensional filtration is performed (e.g., along picture rows), averaging may be done over all the rows subject to filtration. Next, localized
noise components are detected by voting or prediction; i.e., the noise-distorted spectral components ⟨|β_{r,s}|²⟩ of the observed signal are marked. By virtue of noise additivity, the ⟨|β_{r,s}|²⟩ are obviously equal to the sum of the intensities of the spectral components of the non-noisy signal, ⟨|α_{r,s}|²⟩, and of the noise (see Subsection II,C,1 below). Consequently, the noise component intensities are

⟨|ξ_{r,s}|²⟩ = ⟨|β_{r,s}|²⟩ − ⟨|α_{r,s}|²⟩    (10)
Taking into account that the non-noisy signal a priori has a smooth spectrum, the values of ⟨|α_{r,s}|²⟩ required in Eq. (10) may be determined by interpolation over the nearest samples of |β_{r,s}|² which are not marked as noise distorted.

4. Estimation of Parameters of Pulse Noise, Quantization Noise, and "Striped" Noise
The basic statistical characteristic of pulse noise is the probability of distortion of signal samples, which defines the noise detection threshold in the filtration algorithm (see Section II,C). The threshold may be determined by means of the histogram of the distribution of the modulus of the difference between each picture sample and its value predicted from its vicinity. This histogram has two characteristic parts: one defined by the distribution of the difference signal of a non-noisy picture, and another defined by the distribution of the difference between predictions made through a non-noisy picture and noise, as well as by the distribution of the noise prediction error. As the video signals of neighboring elements are strongly correlated, the first part decreases rather quickly. The second part of the histogram decreases much more slowly because noise overshoots are independent (see Fig. 9b below, showing the histogram of the difference signal of Fig. 9a). A good estimate of the noise detection threshold is provided by the histogram breakpoint, which may be detected by the prediction method. Pulse noise overshoots may also be detected by the voting method if it is applied to the sequence of values of a noisy video signal in a small vicinity of each picture element (see Section II,C). Signal quantization noise depends on the number of quantization levels. In order to determine it, it suffices to construct a signal histogram and count the number of signal values for which the histogram is distinct from zero. Striped noise in pictures is caused by random overshoots of the video signal mean value computed in the direction of the stripes. For example, this was the type of noise in the photographs transmitted by the interplanetary stations "Mars-4" and "Mars-5" (Belikova et al., 1975, 1980). It may be detected and measured in the same way as the spectrum overshoots, by
analyzing the sequence of video signal values averaged along the rows in the direction of the bands (see also Section II,C).

C. Noise Suppression by Filters with Automatic Parameter Adjustment
In this section, filters for the suppression of additive and pulse noise in pictures are described that have automatic parameter adjustment (APA) to the observed distorted picture and are based on the principles of filtration formulated in Section II,A. For brevity they will be called APA filters.

1. Optimal Linear Automatic Parameter Adjustment Filtration of Additive Signal-Independent Noise
Linear filtration of a noisy signal is known to be the simplest tool for additive noise suppression. Filter parameters are usually determined on the basis of the optimal (Wiener) filtration theory developed for continuous signals and the rms filtration error criterion. The synthesis of rms-optimal discrete linear filters of random signals represented in an arbitrary basis was discussed by Pratt (1972, 1978; see also Ahmed and Rao, 1975). Relying upon the adaptive approach formulated in Section II,A, let us derive the basic formulas for optimal discrete linear filters. For the sake of simplicity we shall use one-dimensional notation; in order to pass to two variables, it is sufficient to regard the indices as two-component vectors. Let A = {α_s} be an N-dimensional vector of picture signal spectrum samples with respect to some orthonormal basis. It is desired to restore the signal from its observed mixture

B = A + X    (11)

with independent noise X = {ξ_s}, so that the squared modulus of the deviation of the signal estimate Â = {α̂_s} from the signal A, averaged over the ensemble of noise realizations and random signal parameters, and estimated per signal sample,

ε̄² = (1/N) Σ_{s=0}^{N−1} ⟨|α̂_s − α_s|²⟩    (12)
be minimal. Determine the optimal linear filter H = {η_{s,n}} which transforms signal B into Â,

α̂_s = Σ_{n=0}^{N−1} η_{s,n} β_n    (13)

and meets the above criterion (mrms filter).
Optimal values of η_{s,n} are solutions of the systems of equations

⟨(α̂_s − α_s) β*_m⟩ = 0,   m = 0, 1, …, N − 1    (14)

where the asterisk signifies the complex conjugate, i.e., of the following systems:

Σ_{n=0}^{N−1} η_{s,n} ⟨β_n β*_m⟩ = ⟨α_s β*_m⟩    (15)
If, as is usually the case, the mean value of the noise samples is zero, then

⟨β_n⟩ = ⟨α_n⟩ + ⟨ξ_n⟩ = ⟨α_n⟩,   ⟨β_n β*_m⟩ = ⟨α_n α*_m⟩ + ⟨ξ_n ξ*_m⟩    (16)
Since the systems (15) for η_{s,n} and η*_{s,n} are equivalent, it suffices to solve only one of them. By substituting Eq. (16) into Eq. (15), one obtains for η_{s,n} the following system of equations:

Σ_{n=0}^{N−1} η_{s,n} (⟨α_n α*_m⟩ + ⟨ξ_n ξ*_m⟩) = ⟨α_s α*_m⟩    (17)
The matrix H = {η_{s,n}} defined by Eq. (17) has dimensionality N × N, and, generally, filtration of an N-element vector requires N² operations, which is objectionable for practical applications such as the processing of pictures and other two-dimensional signals of large information content. A way out of this situation is provided by two-stage filtration,

Â = T⁻¹ H_d T B    (18)

where T and T⁻¹ are direct and inverse matrices of transformations which may be performed by so-called "fast algorithms" (e.g., see Ahmed and Rao, 1975), and H_d is a diagonal matrix describing the so-called "scalar filter" or "filter mask" (Yaroslavskii, 1979a, 1985). This approach to a digital realization of optimal linear filters was seemingly first suggested by Pratt (1972), who considered the use of the Walsh transform as the T transform. Obviously, to bring about good filtration quality, the joint transform T⁻¹H_dT should well approximate the optimal filter matrix H:

T⁻¹ H_d T ≈ H    (19)
Exact equality in Eq. (19) is known to be attainable only if T is a matrix of eigenvectors of the matrix H (see Ahmed and Rao, 1975); of course, there is no guarantee that this optimal transform will have a fast algorithm. In this connection, one has to check the feasibility of transform matrix factorization into a product of sparse matrices, and of the synthesis of transforms approximating the given one and definitely possessing a fast algorithm (Yaroslavskii, 1979a, 1985; Jaroslavski, 1980c). Similarly to the above general case, one may easily demonstrate that the scalar filter H_d = {η̄_s} that is optimal with respect to the chosen criterion is defined by

η̄_s = (⟨|β_s|²⟩ − ⟨|ξ_s|²⟩) / ⟨|β_s|²⟩    (20)

where ⟨|β_s|²⟩ is the power spectrum of the observed distorted signal in the chosen basis, averaged over noise realizations and random parameters of interpretation objects, and ⟨|ξ_s|²⟩ is the noise power spectrum. Another possible correction quality criterion which has proved effective is that of signal spectrum reconstruction (SSR) (e.g., see Pratt, 1978). By modifying it according to our approach, that is, by imposing the requirement that the restored signal power spectrum coincide with that of the distorted signal averaged over variations of interpretation objects and corrected by the estimate of the noise spectrum, we obtain that the scalar filter optimal with respect to this criterion is

η̄_s = [(⟨|β_s|²⟩ − ⟨|ξ_s|²⟩) / ⟨|β_s|²⟩]^{1/2}    (21)
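A one-dimensional DFT-basis sketch of the scalar APA filters of Eqs. (20) and (21) might read as follows; estimating ⟨|β_s|²⟩ from the single observed realization (instead of an ensemble or fragment average) is a simplifying assumption:

```python
import numpy as np

def apa_scalar_filter(noisy, noise_power, criterion="mrms"):
    """Scalar ("mask") filtering in the DFT basis with automatic parameter
    adjustment: the observed power spectrum itself estimates <|beta_s|^2>.
    criterion "mrms" follows Eq. (20); "ssr" takes the square root, Eq. (21)."""
    spectrum = np.fft.fft(noisy)
    observed_power = np.abs(spectrum) ** 2 / noisy.size
    ratio = np.clip((observed_power - noise_power) / observed_power, 0.0, 1.0)
    mask = ratio if criterion == "mrms" else np.sqrt(ratio)
    return np.real(np.fft.ifft(mask * spectrum))
```

Applied to a sinusoid in white noise of known power, the mask passes the strong signal bins nearly unchanged and suppresses the noise-dominated ones.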
The form of Eqs. (17), (20), and (21) for optimal linear filters implies that the desired signal and noise parameters may be determined through the observed noisy signal. Therefore, they define optimal linear APA filters. Depending on the depth of filtration error averaging over s in accordance with the criterion of Eq. (12), they will be adjustable either globally or locally. In the latter case, filtration errors are estimated on the average over picture fragments, and the corresponding formulas involve spectra and covariance matrices of fragments rather than of the picture as a whole. Notably, filters described by Eqs. (20) and (21) are realizable in adaptive coherent-optics systems for spatial filtration with a nonlinear medium in the Fourier plane (Yaroslavskii, 1981). Described below are some practical applications of APA filtration to additive noise in pictures and interferograms. Filtration of strongly correlated (narrow-band) additive noise, whose power spectrum ⟨|ξ_s|²⟩ contains only a few components distinct from zero, or
of a similar narrow-band signal against a background of wide-band noise, is one of the practically important cases of distortion correction in pictures and other signals for which the linear filtration technique based on Eqs. (20) and (21) performs well. Narrow-band noise may be exemplified by the periodic noise characteristic of some picture transmission systems. Filtration of narrow-band signals against background wide-band noise may be represented by the suppression of additive noise in one-dimensional interferograms. The filter of Eq. (20), designed to suppress narrow-band noise, passes without attenuation the video signal spectral components with zero noise intensity and significantly attenuates those with high noise intensity. For intensities of the individual noise components ⟨|ξ_s|²⟩ that are high as compared with the signal, the filter of Eq. (20) is well approximated by the so-called "rejection" filter, which completely suppresses the spectral signal components distorted by intensive noise components.
Computationally, the rejection filter,

η̄_s = 0 for s where ⟨|ξ_s|²⟩ ≠ 0,   η̄_s = 1 otherwise    (22)

is even simpler than that of Eq. (20); the sense and definition of ⟨|β_s|²⟩ and ⟨|ξ_s|²⟩ are the same as in Eq. (20). Correspondingly, a frequency response of the correcting filter can be written for the SSR criterion, in which the ⟨|α_s|²⟩ are mean values, over variations of interpretation objects, of the nondistorted picture spectrum for those s whose components are rejected. These values should be either known a priori or determined by interpolation of ⟨|β_s|²⟩ − ⟨|ξ_s|²⟩ by neighboring points, as was done in the diagnostics of narrow-band noise in Section II,B. Numerous experimental facts noted by the author and many other researchers indicate that in picture distortion correction sufficiently good results may be obtained if some typical spectrum of the given class of background pictures is used as an a priori nondistorted picture spectrum ⟨|α_s|²⟩ (e.g., see Slepyan, 1967). As we see it, this typical spectrum is a picture spectrum estimate averaged over variations of interpretation objects; denote it by |Ē_s|², and a corresponding formula for the SSR criterion then follows.
It then follows that for picture correction it is sufficient to know only the phase characteristics of the distorting imaging system. It also follows that if the imaging system does not distort the phase of the picture Fourier
spectral components, the characteristic of the correcting filter is independent of the distorting system. This implies that pictures may be corrected even with unknown distortion characteristics, the correction being independent of the distorting system characteristics. An important class of imaging systems is composed of systems without signal spectrum phase distortions. They may be exemplified by systems for observation through a turbulent atmosphere (e.g., see Pratt, 1978) or Gaussian aperture systems, that is, by practically all systems where an image is generated by an electron beam, etc. The effectiveness of this method of correction was borne out by simulation (Karnaukhov and Yaroslavskii, 1981), as may be seen in Fig. 11. It should also be stressed that a filter of the type in Eq. (30) may be easily implemented in an adaptive optical system with a nonlinear medium in the Fourier plane, similar to that described by Yaroslavskii (1981). When correcting linear distortions of imaging systems, one must take into consideration that correction usually precedes picture synthesis. The frequency response of a photographic or other recorder reproducing the corrected picture also differs from the ideal one, and this fact must be allowed for during correction. If one denotes by H₁(f_x, f_y) the continuous frequency response of the imaging system up to the place where correction may be done, and by H₂(f_x, f_y) the continuous frequency response of the processing system, one can readily obtain, for example, the rms-optimal continuous frequency response of the correcting Wiener filter.
The digital realization of such a correcting filter is possible either by means of processing the discrete Fourier spectrum with an FFT algorithm, or by digital filtration in the spatial domain. It is good practice to employ even signal continuation in order to attenuate the boundary effects of filtration, and combined discrete Fourier transform algorithms in order to reduce processing time [see Yaroslavskii (1979a, 1985)]. The choice between these two approaches is defined by the required amount of computation and memory size. It turns out in practice that if a correcting digital filter cannot be satisfactorily approximated by a separable and recursive one, processing in the spectral domain with FFT algorithms usually presents the smaller computational burden.
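The even signal continuation mentioned above can be sketched in one dimension as follows; the low-pass response used in the test is purely illustrative:

```python
import numpy as np

def filter_with_even_continuation(signal, frequency_response):
    """Filter in the DFT domain after even (mirror) continuation, which
    removes the wrap-around discontinuity of cyclic DFT filtration.
    `frequency_response` maps normalized frequencies to filter gains."""
    n = signal.size
    extended = np.concatenate([signal, signal[::-1]])   # even continuation
    spectrum = np.fft.fft(extended)
    gains = frequency_response(np.fft.fftfreq(extended.size))
    filtered = np.real(np.fft.ifft(spectrum * gains))
    return filtered[:n]                                 # keep the original part
```

For a ramp-like signal, the even continuation markedly reduces the boundary ringing that plain cyclic filtering produces, since the mirrored signal has no jump at the wrap-around point.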
FIG. 11. Correction of unknown Gaussian defocusing: (a) original defocused picture; (b) result of correction by the filter in Eq. (30).
The above technique was employed, for example, in processing the photographs made by the automatic interplanetary stations "Mars-4" and "Mars-5" (Belikova et al., 1975, 1980). In this case, the overall frequency response of the photographing and picture transmission system was known (Selivanov et al., 1974), and correction was performed by means of a simple separable recursive digital filter transforming the samples of the corrected video signal a_{k,l} through the following formula:

ã_{k,l} = a_{k,l} + g [ a_{k,l} − (1/((2N₁ + 1)(2N₂ + 1))) Σ_{m=−N₁}^{N₁} Σ_{n=−N₂}^{N₂} a_{k+m,l+n} ]    (32)
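A direct (nonrecursive) sketch of this kind of difference-signal amplification, in the spirit of Eq. (32), is given below; the edge-replication boundary handling and the parameter values are illustrative assumptions, and the actual Mars-mission filter was recursive:

```python
import numpy as np

def unsharp_correction(picture, g, n1, n2):
    """Amplify the difference between each sample and its local mean over a
    (2*n1+1) x (2*n2+1) window (separable box averaging, edge replication)."""
    padded = np.pad(picture, ((n1, n1), (n2, n2)), mode="edge")
    row_kernel = np.ones(2 * n2 + 1) / (2 * n2 + 1)
    col_kernel = np.ones(2 * n1 + 1) / (2 * n1 + 1)
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, row_kernel, mode="valid"), 1, padded)
    local_mean = np.apply_along_axis(
        lambda c: np.convolve(c, col_kernel, mode="valid"), 0, rows)
    return picture + g * (picture - local_mean)
```

A constant picture passes unchanged, while edges are sharpened with the characteristic over- and undershoot of difference-signal amplification.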
The gain of the difference signal, g, and the dimensions of the averaging area, (2N₁ + 1)(2N₂ + 1), were chosen by approximating the desired correcting filter frequency response by the continuous frequency response of the filter in Eq. (32),

H(f_x, f_y) = 1 + g{1 − [sinc(π(2N₁ + 1)f_x/2F_x)/sinc(πf_x/2F_x)] × [sinc(π(2N₂ + 1)f_y/2F_y)/sinc(πf_y/2F_y)]}    (33)

where (2F_x, 2F_y) are the dimensions of the rectangle confining the spatial picture spectrum and defining the signal sampling, and H₀(f_x, f_y) is the frequency response of the photographic recorder of the picture processing system (Yaroslavskii, 1979a, 1985). The dashed line in Fig. 12 shows the cross section of the system frequency response to be corrected (Selivanov et al., 1974), and the chain-dotted line shows the correcting frequency response, Eq. (33), for g = 4, N₁ = N₂ = 1. The curve labeled 1 in this figure is the post-correction frequency response disregarding the frequency response of the photorecording device, and the curve labeled 2 is the overall response. The digital correction thus has more than doubled the spatial bandwidth at the level 0.7. One can judge its effect visually, for example, by Fig. 13, showing a picture before (a) and after (b) correction. It should be noted that in this case correction by means of the separable recursive filter has been made possible owing to the rather simple form of the distorting system characteristics. Correction by this filter is not completely perfect; for instance, at "middle" frequencies it somewhat overcorrects. However, the time required for picture processing by such a filter is several times less than would be required for processing in the spectral domain by an FFT algorithm. Correction of linear distortions in holographic systems has its own peculiarities. In the analysis and synthesis of holograms, linear distortions are
FIG. 12. Correction of the overall frequency response of the photo-TV system.
FIG. 13. Picture (a) before and (b) after correction of the photo-TV system frequency response.
defined mostly by the finite dimensions of the apertures of devices for recording and sampling (measurement) of holograms and wave fields. As follows from the analysis of synthesized hologram reconstruction (see Section V,D), the finite size of the hologram recorder aperture and the limited resolution of the recording medium bring about a shadowing of the field by a masking function proportional to the squared modulus |h(x, y)|² of the Fourier transform of the recorder pulse response, with allowance for the characteristics of the photographic material used. This shadowing may be corrected by a corresponding predistortion of the original field amplitude distribution over the object (Yaroslavskii and Merzlyakov, 1977, 1980; Yaroslavskii, 1972a). For a rectangular Δξ × Δη recorder aperture,

h(x, y) = sinc(π Δξ x/λd) sinc(π Δη y/λd)    (34)

where λ is the hologram reconstruction wavelength, and d is the distance from the point source illuminating the hologram to the observation point (see Section V,A). Therefore, if the samples of the original field are enumerated by indices k, l (k = 0, 1, …, N − 1; l = 0, 1, …, M − 1), the amplitude of the field distribution over the object should be multiplied by the corresponding correcting function [see Eq. (171)]
(disregarding the modulation transfer function of the film used for recording holograms). The effect of the shadowing and of its correction is illustrated in Figs. 14a and b. Correction for the finite dimensions of signal sensors in the digital reconstruction of holograms and wave fields may be done in a similar manner (Yaroslavskii and Merzlyakov, 1977, 1980).

E. Correction of Nonlinear Distortions

Nonlinear distortions are described by the system amplitude characteristic showing the dependence of the output on the input:

b = w_d(a)    (36)

The ideal system amplitude characteristic is regarded as given; generally it is a linear function. The aim of the correction is to find a pointwise correcting transformation that makes the amplitude characteristic of the system after correction the same as the given one.
FIG. 14. Results of reconstruction of holograms synthesized (a) without and (b) with shadowing correction.
36
L. P. YAROSLAVSKII
1. Correction of Nonlinear Distortions in Imaging Systems

When determining correcting transformations for imaging systems, one should bear in mind that before and after correction in the digital system the signal is subjected to a number of nonlinear transformations, such as predistortion at the processor input, quantization, and nonlinear correction before signal reconstruction at the processor's output. The sequence of the transformations is illustrated in Fig. 15a. The task of the optimal correction is to minimize the difference between the corrected signal, b̂, and the nondistorted signal. It is akin to the well-known problem of
FIG. 15(A). Model of nonlinear distortions in imaging systems and their digital correction: nonlinear distortion w_d(a); nonlinear predistortion before quantization; uniform quantization; correction of nonlinear distortion; correction of nonlinear predistortion w_pd(b).
FIG. 15(B). Digital correction of nonlinear distortions in imaging systems.
optimal quantization (see Garmash, 1957; Max, 1960; Andrews, 1970; Yaroslavskii, 1979a, 1985), and may be solved by the following method for correction of a nonlinearity described by a given distorting function w_d(a) with a given predistorting function w_pd(b) (see Fig. 15b):

(1) The boundaries {a^r} of the signal quantization intervals prior to distortion,

a^r = w_d^{-1}(w_pd^{-1}(b^r))    (37)

are determined through a given quantization scale {b^r} (r = 0, 1, …, M − 1; M being the number of quantization levels of the signal b).
(2) For each rth quantization interval (a^r, a^{r+1}), the optimal value ã^r of a representative of this interval is determined, ensuring the minimal quantization error.
(3) For each rth representative, the number q of the quantization interval of the continuous variable reconstructed from its quantized values {b^r} is determined by the given function of nonlinear predistortion correction. The resulting table q(r) is the desired correction table.

2. Correction of Nonlinear Distortions in Holographic Systems
The effect of nonlinear distortions during the recording and reconstruction of holograms radically differs from what happens with pictures. Moreover, the nonlinearity of the amplitude characteristic of recording media and of devices for hologram recording and quantization has a different effect on mirror-reflecting and diffusion-reflecting objects (Yaroslavskii and Merzlyakov, 1977, 1980).
FIG. 16. Effect of thresholding of the dynamic range of the orthogonal components of a diffuse object hologram: (a) original distribution of field amplitude; (b) reconstructed distribution under ±3σ limitation; (c) the same under ±2σ; (d) the same under ±σ.
Nonlinear distortions and quantization of holograms of mirror-reflecting objects result in the destruction of object macroforms (in particular, reconstructed images become contourlike). By an appropriate choice of the quantized corrected signal values, distortion in the reconstructed image may be reduced. Holograms of diffusion-reflecting objects are more stable to thresholding and quantization. These distortions do not result in the destruction of the reconstructed image, but manifest themselves in the occurrence of random noise called diffusion, or speckle, noise. Figure 16 shows the results of a simulation of dynamic range thresholding during recording of the orthogonal components of a diffusion-reflecting object hologram [(a) is the initial distribution of field intensity over a one-dimensional test object, and (b)-(d) are the distributions after thresholding at the levels ±3σ, ±2σ, and ±σ, respectively, where σ is the rms value of the field components]. One may easily see from these pictures that diffusion noise appears and grows with thresholding, and that the object's macrostructure is preserved. Quantitatively, the noise may be evaluated through the dependence of diffusion noise intensity on the extent of thresholding of the hologram field orthogonal components, which is shown graphically in Fig. 17. In this graph
FIG. 17. Speckle contrast vs. hologram value thresholding depth.
the x axis represents the extent of thresholding of the hologram field orthogonal components with respect to their rms values, and the y axis gives values of the ratio of the standard deviation of the diffusion noise to the mean value of reconstructed field intensity (speckle contrast). The diagram was obtained for an object with a constant intensity reflection coefficient. A similar regularity is observed in the quantization of the orthogonal components of the field of the diffusion object hologram. Reduction of the number of quantization levels leads to higher diffusion noise, but the object's macrostructure is preserved [see Fig. 18, where (a) is the initial field intensity distribution, and (b)-(d) are distributions after quantization within the range ±3σ into 128, 64, and 16 levels, respectively]. The form of the speckle contrast
FIG. 18. Influence of quantization of the orthogonal components on a diffuse object hologram: (a) original distribution of object field intensity; (b)-(d) reconstructed distribution at uniform quantization into 128, 64, and 16 levels, respectively.
FIG. 19. Speckle contrast vs. number of levels of uniform quantization of hologram orthogonal components.
as a function of the number of hologram quantization levels (Fig. 19) is very instructive. This dependence shows that, with a decrease of the number of quantization levels, the relative noise intensity at first grows comparatively slowly, but after approximately 32 levels its growth dramatically accelerates.

The comparative stability of diffuse-object holograms to nonlinear distortions and quantization enables one to combat such distortions by simulating the diffuse light bias of the objects in hologram synthesis, as is also done in optical holography. This is something of an analogy to the well-known method of adding pseudorandom noise to combat picture quantization noise (e.g., see Roberts, 1962). However, this is not the only or the best way of providing hologram stability to nonlinear distortions and quantization. Some publications (e.g., Kurst et al., 1973) propose to employ so-called "regular" diffusors, which would give the same effect of "spreading" information over the hologram as a random diffusor, but without a random noise pattern over the reconstructed image.
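The dependence in Fig. 19 is easy to reproduce numerically. The following sketch (not the author's code; the one-dimensional object model, sizes, and quantization range are illustrative assumptions) synthesizes a diffuse object with constant reflectance and random phase, uniformly quantizes the orthogonal components of its Fourier hologram within ±3σ, and measures the speckle contrast of the reconstructed intensity:

```python
import numpy as np

rng = np.random.default_rng(0)

# One-dimensional diffuse test object: constant reflectance, random phase
N = 512
obj = np.exp(1j * rng.uniform(-np.pi, np.pi, N))
holo = np.fft.fft(obj)  # synthesized Fourier hologram

def quantize(x, levels, clip):
    """Uniform quantization of x restricted to the range [-clip, clip]."""
    x = np.clip(x, -clip, clip)
    step = 2 * clip / levels
    return (np.floor(x / step) + 0.5) * step

def speckle_contrast(levels):
    """Quantize both orthogonal components of the hologram, reconstruct,
    and return the speckle contrast of the reconstructed intensity."""
    sigma = holo.real.std()  # rms value of an orthogonal component
    q = (quantize(holo.real, levels, 3 * sigma)
         + 1j * quantize(holo.imag, levels, 3 * sigma))
    intensity = np.abs(np.fft.ifft(q)) ** 2
    return intensity.std() / intensity.mean()

for levels in (128, 64, 16):
    print(levels, round(speckle_contrast(levels), 3))
```

With these assumptions the contrast remains small at many quantization levels and grows much faster at a few tens of levels, in qualitative agreement with Fig. 19.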
As the digital synthesis of holograms is less limited by implementation considerations than by anything else, the idea of a "regular" diffusor may best be realized here.
A convenient and practicable method for introducing regular redundancy into a digital hologram, called the multiplication method, was proposed by
FIG. 20. Multiplication method for recording synthesized holograms.
Yaroslavskii (1974). Its essence is as follows: The synthesized hologram is broken down into several fragments of differing signal intensities, as shown in Fig. 20 (I is the signal intensity, f is the coordinate on the hologram); the signal in the central, usually most intensive, fragment is attenuated L times, where L has a value of the order of the ratio of the maximal signal in this interval to the signal maximum in a neighboring, less intensive fragment. The attenuated interval is repeated over the area L times and is summed with the signal in a neighboring interval of the hologram. As shown in Fig. 20, this procedure may be repeated several times, resulting in a multiple digital hologram with a much narrower dynamic range of values to be recorded. This method features such merits as simplicity of realization and flexibility, because all the multiplication operations are performed over the already computed hologram and, in principle, may be done in the course of hologram recording. Experimental multiplication of holograms has demonstrated that with an appropriate choice of multiplication parameters (number and size of multiple hologram fragments) this method works well (Yaroslavskii and Merzlyakov, 1977, 1980; Jaroslavski and Merzlyakov, 1979).

3. Correction of Nonlinear Distortions of Holograms and Interferograms under Unknown Distortion Function
The form of the characteristic of nonlinear signal distortion is often unknown, as occurs in hologram and interferogram reconstruction.
In such a case, a priori knowledge about the signal may sometimes be used for determination of the distortion characteristic and, consequently, of the correcting transformation. Yaroslavskii and Fayans (1975) proposed a method for the determination and correction of nonlinear distortions of interferograms relying upon the statistical properties of the values of an undistorted interferogram and hologram. An undistorted interferogram is described by the following equation:

I_nd = I_0(c + cos φ)     (38)

where I_0 is the interferogram amplitude, c is a constant defining the positive bias of the interferogram signal, and φ is the phase angle of the interferogram. If the observed interferogram contains quite a few periods, φ may be regarded as uniformly distributed over the interval [−π, π], and I_nd must be distributed according to the known law

h_0(I_nd) = {2π I_0 [1 − ((I_nd/I_0) − c)²]^{1/2}}^{−1}     (39)
Let

I_ob = W_d(I_nd)     (40)

be the real observed interferogram (hologram), where W_d is a distorting function. The distribution density h_1(I_ob) may be empirically measured by the observed signal histogram. Thus, correction of nonlinear distortions in this case boils down to construction of a transformation of the signal with distribution of values h_1(I_ob) into a signal with a given distribution. The recoding table for such a transformation may be determined by means of the following simple algorithm:
(1) Construct the table through the observed histogram by means of the formula

W_1(I) = int[ (M − 1) Σ_{s=0}^{I} h_1(s) ]     (41)
where M is the number of quantization levels, and the function int rounds to the nearest integer value. A signal transformation done according to this table is referred to as "equalization" because it transforms the arbitrarily distributed signal into a uniformly distributed one (Belikova and Yaroslavskii, 1974; Andrews, 1972; Hummel, 1975). See also Section III,B.
(2) Construct a similar table W_2(I) through the desired histogram h_0(I).
(3) Permute inputs and outputs of the table W_2(I) so as to obtain the table Î(W_2), which, up to quantization effects, defines the transformation of the uniformly distributed signal into one with distribution h_0(I).
FIG. 21. Correction of nonlinear distortions of interferograms: (a) distorted interferogram; (b) corrected interferogram; (c) cross section of distorted interferogram; (d) cross section of corrected interferogram.
(4) Construct from the tables W_1(I) and Î(W_2) a joint table

W_3(I) = Î(W_2 = W_1(I))     (42)
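A compact sketch of steps (1)-(4) (illustrative code, not from the original; the variable names and the test picture are invented) might look as follows:

```python
import numpy as np

def matching_table(observed, desired_hist, M=256):
    """Build a recoding table mapping the observed signal onto a signal
    with a desired value distribution (steps (1)-(4) of the algorithm)."""
    # (1) "Equalization" table W1 from the observed histogram
    h1, _ = np.histogram(observed, bins=M, range=(0, M))
    W1 = np.rint((M - 1) * np.cumsum(h1) / h1.sum()).astype(int)
    # (2) Similar table W2 from the desired histogram
    W2 = np.rint((M - 1) * np.cumsum(desired_hist) / np.sum(desired_hist)).astype(int)
    # (3) Invert W2: for each uniform level, the gray level producing it
    inv_W2 = np.searchsorted(W2, np.arange(M))
    # (4) Joint table W3(I) = inv_W2(W1(I))
    return inv_W2[W1]

# Usage: push a narrow Gaussian-histogram picture toward a uniform one
rng = np.random.default_rng(1)
img = np.clip(rng.normal(128, 20, (64, 64)), 0, 255).astype(int)
W3 = matching_table(img, desired_hist=np.ones(256))
out = W3[img]
```

Here `inv_W2` plays the role of the permuted table Î(W_2), and the indexing `W3[img]` applies the recoding table to every picture element at once.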
The operation of this algorithm is illustrated in Fig. 21. It should be noted that if the distorted interferogram or hologram contains additive noise, it will distort the distribution of its values and, consequently, the correcting transformation defined by the algorithm. Experimental verification of the algorithm's stability to additive noise, however, has demonstrated that, even at a significant noise level, the correction quality is quite satisfactory (Ushakov and Yaroslavskii, 1984).
III. PREPARATION OF PICTURES
As was already noted in the Introduction, representation of the object to the observer by means of an ideal imaging system often turns out to be insufficient for scientific and practical applications. In complicated problems requiring meticulous analysis of pictures (search, object identification, determination of various quantitative characteristics, generalizing descriptions, etc.), it is desirable to arm the observer's vision with means for the interpretation of pictures and the extraction of the data necessary for analysis. These are, first, technical means, using tools all the way from a magnifying glass, pencil, compass, ruler, tracing paper, etc., through complicated optical and optoelectronic devices and dedicated digital picture processing systems; and, second, methods of video signal processing. This auxiliary processing we call "picture preparation." Methodologically, picture preparation may be treated in two ways. From the viewpoint of the transformation of the object into the picture in imaging systems, preparation may be regarded as correction of the interaction of the video signal sensor with the object. From the viewpoint of interpretation and extraction of information, preparation is a preprocessing of the signal intended to match it with the end user, i.e., the human interpreter responsible for decision making. Preparation as picture processing to facilitate visual perception has two aspects: preparation for the collective user of such media as TV, movies, or print art, and preparation for the individual user. In the former case, it is often referred to as "enhancement" (Huang et al., 1971; Huang, 1975; Andrews, 1972; Gonzalez and Wintz, 1977; Pratt, 1978; Rosenfeld and Kak, 1982; Rosenfeld, 1969). The latter case corresponds to nonformalizable applied problems of picture interpretation. This article pays attention mostly to the second aspect, as being most important in applications and largely defining the structure of the processing system.
Awareness of the importance of this aspect is very significant both for further development of methods for picture processing oriented to interpretation, and for determination of approaches to the construction of automated picture processing systems. Section III,A classifies preparation problems and analyzes the requirements of automated picture processing systems from this standpoint. Methods of adaptive amplitude transformations of video signals are described in Section III,B. Linear methods of picture preparation are described and substantiated in Section III,C, and in Section III,D the concept of rank algorithms for picture preparation is presented. Section III,E is devoted to the combined preparation methods, to preparation involving determination and
visualization of the signal's quantitative characteristics as well as decision making, and to the ways of using color and stereoscopic vision for picture preparation.

A. Problems of Picture Preparation: Distinctive Characteristics of Picture Preparation in Automated Systems

Two classes of problems in preparation may be identified: geometrical transformations and feature processing. Geometrical transformations are performed to obtain the most convenient and obvious planar representation of three-dimensional objects. In this domain, digital processors do not have significant advantages as compared with analog (optical, TV) means. Their main merit, the capability of rapidly rearranging the transformation algorithm, does not make up for the transfers of bulky data, which require large memory space, and for the difficulty of providing high interpolation accuracy. That is why we shall not touch upon this class of problems. The processing of features is composed of extraction, measurement, and visualization of those video signal characteristics, or features, which are most informative for the visual system in the current problem of analysis. The choice of features is dictated by the task being executed in the course of analysis and by the distinguishing features of the objects under consideration. These may be, for instance, values and local mean values of the video signal in certain spectral ranges of the registered radiation, the power of the picture spatial spectrum in certain areas of the spectral plane, the area and form of a cross section of the normalized picture correlation function at a certain level, and so on. In selecting feature measurement and transformation methods for automated digital picture processing systems, it is advisable to proceed from the efficiency requirements of the software. To this end, basic transformation classes should be identified which could underlie the construction of ramified processing procedures. In compliance with well-known principles of the theory of signals and systems, the following transformation classes may be defined: nonlinear pointwise transformations, linear transformations, and combined transformations.
Below, consideration is given to the following feature processing methods that are based on the adaptive approach and belong to the above classes: methods of adaptive amplitude transformation, linear preparation methods, combined preparation methods, preparation methods with decision making, and determination and visualization of picture quantitative characteristics. The main characteristic of preparation by means of feature processing is the lack of a formal criterion of picture informativeness for visual analysis.
Therefore, preparation should be done interactively, with the user controlling the processing by direct observation of the picture in the course of processing. To support the interactive mode in automated picture processing systems, special devices for dynamic picture visualization (displays and display processors) should be provided. The basic functions of a display processor are as follows:
(1) Reproduction of high-quality black-and-white and color pictures from the digital signal arriving from the central processor of the picture processing system;
(2) Provision of feedback from the user to the central processor for both control and video signals; and
(3) Fast hard-wired picture processing in real time, coordinated with the user's inherent response and comfortable observation conditions.
To perform these functions, the display processor should include:
(1) Digital video signal storage;
(2) A bilateral data exchange channel between the memory and the central processor;
(3) An arithmetic unit and hard-wired processors for fast picture processing, with either subsequent visualization only, or visualization after writing into the memory;
(4) A graphic processor with generators of vectors, graphs, and characters; and
(5) Controls for dialogue (function keys, buttons, joysticks, track balls, light pens, etc.).
All the modern automated picture processing systems feature display processors (see, for example, Jaroslavskii, 1978; Kulpa, 1976; Machover et al., 1977; Reader and Hubble, 1981; Cady and Hodgson, 1980).

B. Preparation by Means of Adaptive Nonlinear Transformations of the Video Signal Scale
Pointwise nonlinear transformations of video signals are the simplest kind of transformations which may be classified as picture preparation and which came into practice long ago. It suffices to mention such methods as solarization, pseudocoloring in scientific and artistic photography, and gamma correction in print art and TV. With the advent of digital technology, these transformations, realizable in only one operation per picture element, have gained wide acceptance and development. Among the most popular, one
may cite such methods as equidensities, amplitude windows, bit slicing, equalization, and histogram hyperbolization (Belikova and Yaroslavskii, 1974; Andrews, 1972; Hummel, 1975; Frei, 1977). The latter two methods are notable for the fact that their video signal transformation laws are determined through measurement of a video signal histogram, thus making the transformations adaptive. Equalization is described by the following transformation:

m̂ = int[ (M − 1) Σ_{s=0}^{m} h(s) ]     (43)
where m is the quantized value of the transformed signal, m = 0, 1, ..., M − 1; h(s) is the histogram of its values, s = 0, 1, ..., M − 1; m̂ is the transformed value; and int(x) is the integer part of x. Histogram equalization brings about higher contrast in those picture areas which have the most frequent values of the video signal. Selectiveness of equalization with respect to the frequency of video signal values is its major advantage over other methods of contrast enhancement. Hyperbolization is related to equalization, but there it is the histogram of the logarithm of the video signal values that is equalized. If equalization is performed simultaneously over the entire picture and is based on the histogram of the entire picture, it will be globally adaptive. Often, however, local adaptation is required. In this case, picture fragments should be equalized rather than the entire picture, and the fragments may overlap each other. This mode of processing brings to its logical completion the concept of adaptation in nonlinear amplitude transformations. In fragmentwise equalization with overlapping, the distribution histogram is constructed over the whole fragment, but only its central part, corresponding to the nonoverlapping areas, is transformed. If each succeeding fragment is shifted with respect to the preceding one by one element, the transformation is called "sliding" (Belikova and Yaroslavskii, 1974). The table of the sliding transformation (equalization) varies from one picture element (k, l) to another depending on variations of the histograms h^{(k,l)}(s) of the surrounding fragments
m̂^{(k,l)} = int{ (M − 1) Σ_{s=0}^{m} [h^{(k,l)}(s) − h^{(k,l)}(0)] / [1 − h^{(k,l)}(0)] }     (44)
The fragmentwise and sliding equalizations were used in processing space photographs (Belikova et al., 1975, 1980; Nepoklonov et al., 1979), geological interpretation of aerial photographs, and medical radiograms (Belikova and Yaroslavskii, 1980).
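A brute-force sketch of sliding equalization (illustrative only; the window size and test picture are invented, and a practical implementation would use the recursive histogram-update algorithms mentioned in the text) is:

```python
import numpy as np

def sliding_equalization(img, half=7, M=256):
    """Fragmentwise ("sliding") histogram equalization: each pixel is
    remapped by the equalization table of its surrounding window."""
    padded = np.pad(img, half, mode='reflect')
    out = np.empty_like(img)
    rows, cols = img.shape
    for k in range(rows):
        for l in range(cols):
            frag = padded[k:k + 2 * half + 1, l:l + 2 * half + 1]
            h = np.bincount(frag.ravel(), minlength=M) / frag.size
            # Equalization table of this fragment, as in Eq. (43)
            table = np.rint((M - 1) * np.cumsum(h)).astype(int)
            out[k, l] = table[img[k, l]]
    return out

rng = np.random.default_rng(2)
small = np.clip(rng.normal(100, 10, (32, 32)), 0, 255).astype(int)
res = sliding_equalization(small, half=7)
```

Recomputing the full 15 x 15 window histogram at every pixel, as here, is exactly the cost that the recursive current-histogram estimates are meant to avoid.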
The effect of fragmentwise equalization may be seen in Fig. 22b. If the original aerial photograph (Fig. 22a) is equalized as a whole (see Fig. 22c) rather than by fragments, its total contrast will also be enhanced, but the distinguishability of the details will be much worse. Figure 23 shows the fragmentwise equalization of the Venus surface panoramas transmitted by the automatic interplanetary station "Venera-9." It may be easily seen that equalization enables one to distinguish numerous low-contrast details in the bright and dark areas of the panorama and to emphasize the relief of the scene. In both cases, fragments were 15 x 15 elements with a 3 x 3 step. Note that with fragmentwise and sliding equalization, the number of operations required for transformation table generation may become prohibitive if one does not use recursive algorithms for estimation of current histograms [e.g., see Yaroslavskii (1979a, 1985)]. Equalization may be regarded as a special case of amplitude transformation of the observed signal into one with a given distribution. The
FIG. 22. Picture equalization: (a) original aerial photograph; (b) effect of fragmentwise equalization; (c) equalization of the picture as a whole.
FIG. 23. Application of fragmentwise equalization to processing of a Venus surface panorama: (a) before processing; (b) after equalization.
algorithm for this transformation is presented in Section II,E. In the case of equalization, it is a uniform law. Such a transformation may be used for the standardization of various pictures; for example, in constructing photomosaics [see Milgram (1974)] or in texture analysis (Rosenfeld and Troy, 1970). Another interesting possibility of generalization lies in changing the relation between the steepness of a signal's nonlinear transformation and its histogram (Belikova and Yaroslavskii, 1974; Yaroslavskii, 1979a, 1985). At equalization, the transformation steepness is proportional to histogram values, but it may be made proportional to some power p of the histogram, thus leading to the formula

m̂ = int[ (M − 1) Σ_{s=0}^{m} h^p(s) / Σ_{s=0}^{M−1} h^p(s) ]     (45)
At p > 1, the greater p is, the more will weak modes be suppressed in the histogram and the most powerful ones extended over the entire range. The value p = 0 corresponds to a linear stretch of the video signal. At p < 0, the more powerful the mode, the greater its compression. Processing by Eq. (45) may be named "power intensification" of the picture, the choice of p being left to the user. Notably, Eq. (45) resembles the formulas describing the optimal signal predistortion law for quantization (see Yaroslavskii, 1979a, 1985). This similarity throws more light on the essence of adaptive amplitude transformations. From this point of view, power intensification corresponds to a
model regarding the visual system as a quantizing device and processing as a signal predistortion required for matching with this device. At p → ∞, power intensification becomes adaptive mode quantization, with quantization boundaries lying within the minima between the histogram modes. Adaptive mode quantization is a version of the cluster analysis that is very popular in pattern recognition and classification. Rosenfeld (1969) discusses the application of adaptive mode quantization to picture segmentation as the first step of automatic picture description. A method of adaptive mode quantization as a picture preparation method was developed by Belikova and Yaroslavskii (1974, 1975). This required a new approach to the substantiation of the number of histogram modes and the criterion of mode separation at quantization. In order to establish quantitative criteria for the selection of optimal boundaries between the modes, it is necessary to have a description of the causes of fuzziness of modes and of the losses due to misclassification. In picture preparation, the most constructive requirement seems to be that of the minimal number of incorrectly classified picture elements. Other requirements, such as smoothness of the boundaries of isolated areas, or lack of small foreign impregnations inside a large area, or similar conditions, are also possible. The degree of mode fuzziness is defined by the object's properties with respect to the chosen feature. Usually, they are not easily formalized, and one has to construct more or less plausible models relying on a priori knowledge of how the properties of objects manifest themselves through the observed picture.
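The power intensification transform of Eq. (45), of which equalization is the special case p = 1, can be sketched as follows (illustrative code; the test picture and parameters are invented, and p ≥ 0 is assumed so that empty histogram bins cause no division problems):

```python
import numpy as np

def power_intensification(img, p, M=256):
    """Adaptive amplitude transform whose steepness is proportional to the
    p-th power of the picture histogram (p = 1 gives plain equalization,
    p = 0 a linear stretch); p >= 0 is assumed here."""
    h = np.bincount(img.ravel(), minlength=M) / img.size
    hp = h ** p
    table = np.rint((M - 1) * np.cumsum(hp) / hp.sum()).astype(int)
    return table[img]

rng = np.random.default_rng(3)
img = np.clip(rng.normal(128, 15, (64, 64)), 0, 255).astype(int)
eq = power_intensification(img, p=1)      # equalization
strong = power_intensification(img, p=3)  # strong modes stretched further
```

As p grows, ever more of the output range is devoted to the most populated histogram modes, approaching the mode-quantization limit discussed next.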
For instance, the picture to be subjected to preparation may be treated as the result of a transformation of an original field containing only "pure" modes (i.e., a field whose distribution of values with respect to the feature under consideration consists of a set of delta functions) effected by random operators and/or noise. Then decision rules may be determined by means of statistical decision theory, for example, for the criterion of minimal frequency of picture element classification errors. The field of decisions resulting in this case may be treated as an estimate of the original picture under the assumption that the prepared picture was obtained by distortion of the original "pure"-mode field by noise and operators. The simplest models, with random operators acting upon the ideal picture and with additive or multiplicative noise, for which a closed solution may be obtained with respect to the choice of decision algorithms, usually are not sufficiently adequate to the actual relations between the object properties to be extracted and the measured features. For example, in the distribution of features over the picture, modes may be made significantly fuzzy because of a "trend" over the observed picture, which should not be treated as the result
only of the action of noise or of a linear operator on the signals; picture elements grouping into modes usually make up continuous areas or, at least for visual analysis, only continuous areas should be extracted, with small ones disregarded, and so on. In order to improve adaptive mode quantization and allow for the above-mentioned factors that are difficult to formalize, Belikova and Yaroslavskii (1975) proposed to make use of such auxiliary techniques as fragmentwise processing, separation by mode fuzziness types (fuzziness due to a linear operator and that due to additive noise), mode rejection by the value of the population, and rejection of small details. Some results of the application of the adaptive mode quantization method are illustrated in Figs. 24 through 26. Figure 24a shows the picture used in the experiments, and Figs. 24b-d show the results of its quantization with different mode rejection thresholds by the value of their
FIG. 24. Adaptive mode quantization: (a) original picture; (b)-(d) quantizations with thresholds 4, 5, and 7%, respectively.
FIG. 25. Separation of individual modes: (a) the picture of Fig. 24a as quantized into 3 levels with mode power threshold 10%; (b) details of one of the modes; (c) contours of this mode; (d) superposition of the contours on the original picture.
FIG. 26. Comparison of fragmentwise and global quantizations: (a) original picture; (b) global three-level quantization; (c) the result of fragmentwise quantization without overlapping.
population (power), respectively, 4, 5, and 7%. The resulting numbers of quantization levels were 11, 8, and 4. Comparison of these pictures reveals how details disappear with an increase of the mode rejection threshold and the preparation becomes more generalized. One may separate details pertaining to particular modes from other details, determine their boundaries, and impose them on the original photograph (see Fig. 25).
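The idea of adaptive mode quantization can be illustrated by a much-simplified sketch (this is not the Belikova and Yaroslavskii procedure, which adds the auxiliary techniques listed above; the histogram smoothing, rejection threshold, and bimodal test picture are invented for the illustration):

```python
import numpy as np

def mode_quantization(img, reject=0.04, M=256, smooth=9):
    """Place quantization boundaries at the minima between histogram modes;
    modes whose population falls below the rejection threshold are merged
    into the following segment."""
    h = np.bincount(img.ravel(), minlength=M) / img.size
    # Smooth the histogram so spurious tiny minima do not create modes
    hs = np.convolve(h, np.ones(smooth) / smooth, mode='same')
    # Interior local minima are candidate boundaries between modes
    interior = np.arange(1, M - 1)
    mins = interior[(hs[1:-1] <= hs[:-2]) & (hs[1:-1] < hs[2:])]
    bounds = [0] + list(mins) + [M]
    # Merge segments with population below the rejection threshold
    merged = [0]
    for b in bounds[1:]:
        if h[merged[-1]:b].sum() >= reject or b == M:
            merged.append(b)
    # Map each picture element to the mean gray level of its mode
    table = np.empty(M, dtype=int)
    for lo, hi in zip(merged[:-1], merged[1:]):
        w = h[lo:hi]
        vals = np.arange(lo, hi)
        table[lo:hi] = int(round((vals * w).sum() / w.sum())) if w.sum() else lo
    return table[img]

# Usage: a bimodal test picture with two populations of gray levels
rng = np.random.default_rng(4)
img = np.concatenate([rng.normal(80, 8, 2000), rng.normal(180, 8, 2000)])
img = np.clip(img, 0, 255).astype(int).reshape(40, 100)
q = mode_quantization(img)
```

On such a picture the procedure reduces the many observed gray levels to a handful of mode levels, which is the segmentation-like effect visible in Figs. 24 through 26.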
Fragmentwise and global (over the entire picture) quantization may be compared by their results shown in Fig. 26, fragment boundaries being shown by the grid. In Fig. 26b only the rough structure of the picture is left; Fig. 26c preserves numerous details of the original, the picture is sharper, and the boundaries between impregnations are seen better than in the original. Belikova and Yaroslavskii (1980) proposed a method of controlled adaptive transformations, which is a further extension of the methods of adaptive amplitude transformation. Transformation parameters are determined there by analyzing the histogram of a picture preparation or fragment, or of a picture of the same object in another radiation range, rather than directly of the processed picture.

C. Linear Preparation Methods as a Version of Optimal Linear Filtration
Numerous linear processing methods that may be regarded as picture preparation are well known. For emphasizing small details, suppression of the low, and amplification of the high, spatial frequencies of the signal Fourier spectrum is popular. For suppression of small hindering details, low-frequency filtration, i.e., suppression of the higher spatial frequencies of the picture, is advised (Huang et al., 1971; Huang, 1975; Andrews, 1972; Gonzalez and Wintz, 1977; Pratt, 1978; Rosenfeld and Kak, 1982; Rosenfeld, 1969). In order to provide a reasonable basis for the choice of linear transformations and their parameters, it is advisable to treat them as an optimal, in a sense, linear filtration of the useful signal against the noise background, regarding the picture details to be amplified as the useful signal and the background as noise. Let us determine the characteristics of a filter minimizing the squared modulus of the error between the signal of the extracted object (useful signal) and the result of filtration of the observed signal, averaged over all the possible variations of the useful signal and realizations of signal sensor noise. Let us confine our discussion to the most easily realizable filter-masks described by diagonal matrices, and consider the observed signal as an additive mixture of the extracted object and the background picture. Let {α_s}, {β_s}, {γ_s}, and {λ_s} be the representation coefficients, with respect to some basis {φ_s(k)}, respectively, of the objects to be extracted, the observed picture, the background, and the filter mask. Then the mean-squared value of the filtration error modulus is

⟨|ε|²⟩ = ⟨[ (1/N) Σ_{s=0}^{N−1} (|α_s − λ_s β_s|²)‾ ]⟩     (46)
where the bar ( )‾ means averaging over signal sensor noise, square brackets mean averaging over all the possible object positions in the picture, and angle brackets mean averaging over other stochastic parameters (form, orientation, scale, etc.). It may be readily demonstrated that the values of λ_s minimizing the error are defined by

λ_s = ⟨[ (α_s β_s*)‾ ]⟩ / ⟨[ (|β_s|²)‾ ]⟩     (47)

By substituting into Eq. (47) the representation of the observed picture as an additive mixture of the object and the background,

β_s = α_s + γ_s     (48)

one obtains

λ_s = { ⟨[ |α_s|² ]⟩ + [⟨α_s⟩] γ_s* } / ⟨[ (|β_s|²)‾ ]⟩     (49)

Since

α_s = Σ_{k=0}^{N−1} a_k ψ_k(s)     (50)

where ψ_k(s) is a basis reciprocal to {φ_s(k)}, and {a_k} are samples of the object signal,

[⟨α_s⟩] = Σ_{k=0}^{N−1} [⟨a_k⟩] ψ_k(s)     (51)

In the simplest and most natural case, where the object coordinates are uniformly distributed over the picture area, [⟨a_k⟩] is independent of k,

[⟨a_k⟩] = [⟨a⟩]     (52)

and Eq. (51) becomes

[⟨α_s⟩] = [⟨a⟩][ψ(s)]     (53)

where

[ψ(s)] = Σ_{k=0}^{N−1} ψ_k(s)     (54)

In this case,

λ_s = { ⟨[ |α_s|² ]⟩ + [⟨a⟩][ψ(s)] γ_s* } / ⟨[ (|β_s|²)‾ ]⟩     (55)
Note that, since [ψ(s)] = √N δ(s) for the majority of practically used bases, the second term in Eq. (55) affects only the value of λ_0, which is usually
responsible for the inessential constant component over the picture field. Therefore, it will be disregarded below. One may also assume in preparation problems that the objects to be extracted occupy only a minor part of the picture and that the contribution of their variations to the squared modulus of the observed signal spectrum may be taken into account by some smoothing of that spectrum. Thus, one obtains the final formula for the optimal filter mask

λ_s = ⟨[ |α_s|² ]⟩ / ( (|β_s|²)‾ )~     (56)

where the tilde means the above-mentioned smoothing. This is similar to the classical formula of the optimal Wiener filter, but with the denominator containing only the observed signal power spectrum, smoothed and averaged over the signal sensor noise, rather than the sum of the spectral power densities of signal and noise. Such a filter is optimal for the given observed picture on the average over all the variations of the object to be extracted and of the signal sensor noise. This filter will be referred to as an MRMS filter. If reconstruction of the signal power spectrum is used as the criterion of optimality (Pratt, 1978) instead of the minimum of the rms filtration error, one obtains the filter

λ_s = { ⟨[ |α_s|² ]⟩ / ( (|β_s|²)‾ )~ }^{1/2}     (57)

which will be called an RSS filter. Finally, if one desires to obtain through filtration the maximum of the ratio of the signal of the desired object at its localization point to the rms value of the background picture, one obtains the filter

λ_s = [⟨α_s⟩]* / ( (|β_s|²)‾ )~     (58)

which may be called an MSNR filter (see Section IV). Thus, a family of filters [Eqs. (56), (57), and (58)] results that may be used during preparation to make objects more prominent against a hindering background. These filters are adaptive because their characteristics depend on the spectrum of the processed picture. Adaptation may be either global, if the filtration error is averaged over the whole picture and, as a result, the formula for the filter frequency response involves the spectrum of the entire picture, or local, if the error is averaged over fragments and the formula involves fragment spectra. Notably, the above-mentioned recommendations about suppression of low, and amplification of high, spatial frequencies when extracting minor details, and suppression of high spatial frequencies when smoothing pictures, are included as special cases in the above three types of filters. Indeed, the picture spectrum as a rule is a function rapidly decreasing with the growth of the spatial frequency (index s). Thus, in all the filters of Eqs. (56)-(58), the
position of the passband maximum varies depending on object size, which affects the numerator. If the objects are of small size, the passband maximum lies in the domain of high spatial frequencies; if large details are extracted, it shifts to lower frequencies. Experimental processing of geological and medical pictures has demonstrated the effectiveness of these filters (Belikova and Yaroslavskii, 1980). Figure 27 shows filtration with the aim of enhancing the distinguishability of microcalcinates in mammograms (roentgenograms of the mammary gland), where (a) is the original mammogram and (b) is the result of MSNR filtering. Minor impregnations of microcalcinates into the soft tissues of the mammary gland are one of the most important symptoms of malignant tissue degeneration. Their differentiation in usual mammograms presents significant difficulties, especially at the early stages of the disease. Processing like that shown in Fig. 27b
FIG. 27. Optimal filtration for enhancing distinguishability of microcalcinates in mammograms: (a) original mammogram; (b) result of the optimal MSNR filtration; (c) marks indicating detected points.
APPLIED PROBLEMS OF DIGITAL OPTICS
FIG.28. Example of optimal filtration of an angiogram: (a) original brain radiogram; (b) isotropic separation of minor details, such as arbitrarily oriented blood vessels; (c) anisotropic separation of minor details, which extracts vertical vessels.
may be of great help in the early diagnosis of malignant tumors of mammary glands. Figure 28 demonstrates examples of applying similar processing to angiograms with the aim of enhancing the distinguishability of blood vessels. An interesting pseudo-relief effect is observed in Fig. 28c, resulting from the application to the radiogram of Fig. 28a of an anisotropic filter which extracts vertical vessels. Such processing might be an alternative to administering a contrast substance to a patient at examination, which is a painful and sometimes dangerous operation. Figure 29 illustrates the application of linear filtration to the suppression of ribs and the enhancement of middle-detail contrast in X rays. Fast computer implementation of preparation by spatial filtration is important, since interactive processing requires high speed. Single or multiple (parallel or cascaded) signal filtration through a two-dimensional separable recursive filter of the type in Eq. (32) is one of the fastest approaches to optimal filtration. This filter has a rectangular impulse response and is, therefore, suitable for separation of rectangular vertical or horizontal details. Multiple parallel filtration enables generation of an arbitrarily oriented impulse response corresponding to the orientation of picture details. Successive (cascaded) or iterative processing enables a smoother and, in particular, more isotropic impulse response. Sometimes it is more convenient to perform filtration in the spectral domain. It is good practice to do so if separate spectral components of the signal or narrow intervals of the signal spectrum (as in Fig. 29) are to be suppressed or enhanced. It is important to mention that the speed of existing or predicted digital processors is insufficient for interactive real-time linear transformations. Local spectral adaptation for processing a 1024 x 1024 picture requires, for example, K x 2^20 operations, where K is a complexity factor which inherently
FIG.29. Suppression of ribs and enhancement of contrast of middle-size details by linear filtration: (a) original x ray; (b) result of filtration.
cannot be less than several tens, even for the best recursive algorithms. Since interactive processing of one frame requires about 0.1 sec, the required speed of a digital processor is in the hundreds of millions of operations per second. Optical technology is known to be much superior in speed to digital technology in linear spatial filtration. There is a simple optical representation of the correction and preparation filters developed here and in Section II. To this end, it suffices, as Yaroslavskii suggested in 1981, to place a nonlinear optical medium, whose transparency depends on the energy of the incoming radiation, into the Fourier plane of the classical coherent optical system of spatial picture filtration. Introduction of this medium makes the optical system adaptive and enables implementation of filters with frequency response of the type in Eqs. (56)-(58).

D. Rank Algorithms of Picture Preparation

Apart from linear picture preparation methods, it is desirable to have nonlinear ones as well. Arbitrary transformation of digital signals, of course, can be realized with linear and pointwise nonlinear transformations of individual signal samples. Nevertheless, it is advisable to have units larger than pointwise transforms. The distinguishing feature of pictures as two-dimensional signals is that their individual points are related to their neighbors. Therefore, the majority of transformation algorithms are of a local nature; i.e., groups of points in some vicinity of the given point are processed simultaneously. Linear transformations readily comply with this requirement of locality and enable construction of algorithms whose computational complexity is only weakly dependent on the size of the vicinity. Nonlinear picture transformations should feature the same properties. At present, a very useful class of nonlinear transformations has appeared.
It features both locality and computational simplicity, and consists of algorithms that might be named "rank filtration algorithms" because they are built around the measurement of local order (rank) picture statistics. A value having the rth rank, i.e., occupying the rth place in a list of sample elements ranked in increasing order (in a variational sequence of R elements), is the rth-order statistic of a sample consisting of R values. Obviously, any rth-order statistic m_r(k, l) may be determined from the local histogram h^{(k,l)}(s) as the smallest value s for which

\sum_{v \le s} h^{(k,l)}(v) \ge r
For computation of local histograms there exist fast recursive algorithms similar to those of recursive digital filtration (Yaroslavskii, 1985). Therefore, the computational complexity of rank filtration algorithms basically is almost
independent of fragment size. With the computation of specific rank statistics and their derivatives, further simplifications may be possible due, in particular, to the informational redundancy of the picture. The most popular algorithm of this class is that of median filtration (see Section II,C) (Pratt, 1978; Huang, 1981; Justusson, 1981; Tyan, 1981), where samples of a processed sequence are replaced by the median of the distribution of values of points in a given vicinity of these samples. The median is known to be an estimate of the sample mean value that is robust against distribution "tails" (Huber, 1981). It is this robustness that makes the median filter superior to filters computing the local mean for picture smoothing. The low sensitivity of the median to distribution "tails" accounts for the fact, often mentioned in the literature [e.g., see Pratt (1978)], that, in contrast to smoothing by the sliding average, smoothing by the sliding median preserves sharp jumps and detail contours. Robustness allows one to make far-reaching generalizations of median filters, for example, in the direction of constructing median matched two-dimensional filters as robust analogs of linear matched and optimal filters, and, in particular, of the filters described in the preceding section. For instance, a median filter with an arbitrary window may be regarded as a robust matched filter for a detail having the form of the filter window. An algorithm based on the determination of the difference between the picture and the result of its arbitrary-window median filtration is a robust analog of the linear filters described in the preceding section and is oriented to the extraction of details in pictures. A version of this filter was described by Frieden (1980). The median represents the nth-order statistic of the local histogram constructed over a fragment consisting of (2n + 1) samples.
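A one-dimensional sketch of such a rank filter, assuming integer gray levels (the 256-level quantization, the function name, and the edge replication are illustrative assumptions, not details from the text): the local histogram is updated recursively as the window slides, so the cost per sample is nearly independent of the window length.

```python
import numpy as np

def sliding_median_1d(x, half, levels=256):
    """Sliding median over a (2*half + 1)-sample window via a
    recursively updated local histogram: at each step one sample
    leaves the histogram, one enters, and the median is read off
    as the (half + 1)st order statistic (1-based counting)."""
    x = np.asarray(x, dtype=int)
    n = 2 * half + 1
    padded = np.concatenate([np.repeat(x[:1], half), x, np.repeat(x[-1:], half)])
    hist = np.zeros(levels, dtype=int)
    for v in padded[:n]:
        hist[v] += 1
    rank = half + 1  # median rank of a (2*half + 1)-point fragment

    def order_statistic():
        count = 0
        for s in range(levels):
            count += hist[s]
            if count >= rank:
                return s

    out = np.empty(len(x), dtype=int)
    out[0] = order_statistic()
    for i in range(1, len(x)):
        hist[padded[i - 1]] -= 1       # outgoing sample
        hist[padded[i + n - 1]] += 1   # incoming sample
        out[i] = order_statistic()
    return out
```

Replacing `rank` by other values yields extremal and general rank filters; a two-dimensional version slides the histogram over picture fragments in the same way.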
Other generalizations of the median filter are possible using order statistics different from the median, such as extremal filtration algorithms, where the maximum (2nth-order statistic) or minimum (zero-order statistic) over a (2n + 1)-point fragment is substituted for the fragment under consideration. Obviously, if the rank of a point over the fragment is substituted for its value, the above sliding equalization algorithm results. Thus, both sliding equalization and other existing adaptive amplitude transformation algorithms relying upon analysis of local histograms may be regarded as rank algorithms. This relation is also stressed by another property of the rank algorithms: their local adaptability to the characteristics of the processed pictures and their potential applicability to robust feature extraction in preparation and automatic recognition of pictures, rather than to robust smoothing only. As an example of feature-extraction rank algorithms, one can describe a robust algorithm for estimation of local dispersion based on computation of the difference between some given order statistics to the right and to the left of the median (R-L algorithms). In Fig. 30 this algorithm is compared with an
FIG. 30. Comparison of sliding variance and rank R-L algorithms: (a) original picture; (b) pattern of values of local variances of the picture in (a) over a 9 x 9-point fragment; (c) result of processing by an R-L algorithm with the same size fragment and R = 51, L = 31.
estimate of local variance by computation of the sliding mean value of the squared difference between the values of picture points and their local mean values. This comparison demonstrates that R-L algorithms provide much better localization of picture nonuniformities than the sliding variance algorithm.

E. Combined Methods of Preparation. Use of Vision Properties for Picture Preparation
In real applications, the best results may obviously be obtained by using various combinations of nonlinear and linear preparation methods and by utilizing all the possibilities of visual perception. The diversity of combinations is unlimited, but two practically important classes may be distinguished among them: preparation with decision making, and preparation with determination and visualization of quantitative picture characteristics. Pictures resulting from preparation with decision making can be considered as fields of decisions with respect to selected features. A simple example of such algorithms is represented by the above (Section III,B) algorithms for adaptive mode quantization, which should be complemented by various linear and nonlinear algorithms whose aim is to provide higher stability of mode selection. The MSNR filters with subsequent detection and marking of the most intensive signal overshoots constitute another example of combined algorithms. This corresponds to optimal detection and localization of picture details, as shown in Section IV. Figure 31 presents examples of such processing. The diversity of methods for preparation with determination of quantitative characteristics is as great as the diversity of quantitative picture characteristics. What they have in common is that the results of quantitative measurements are represented as pictures: tables, graphs, dimetric projections of surfaces, lines of equal values, etc. Such preparation with determination and visualization of quantitative characteristics may consist of multiple stages. This may be illustrated by the detection of layers with respect to depth in the lunar soil samples conveyed by the automatic interplanetary station "Luna-24" (Leikin et al., 1980). One of the methods for detection of the layered structure of soil samples is separation of layers with respect to the characteristic size of stones in the stone fraction. The following method was employed for determination and visualization of the average size of stones:

(1) Optimal filtration of the original picture (Fig. 32a) by an MSNR filter for the separation of the stone fraction from the background;
(2) Binary quantization of the resulting preparation by the adaptive mode quantization algorithm for obtaining the field of decisions (Fig. 32b);
(3) Measurement of the normalized one-dimensional correlation functions of the preparation rows (i.e., of horizontal cross sections of the soil sample) and representation of the correlation function set as a two-dimensional signal whose values are the correlation functions in the coordinates "depth of drilling" and "interval of the correlation";
(4) One-dimensional smoothing of this signal by a rectangular window in the direction of increasing depth;
(5) Determination of the equal-value lines of the smoothed signal and plotting them in the coordinates "depth" and "width of the correlation function at a given level" (see the graph for the level 0.5 in Fig. 32c).

This graph is regarded as the final preparation, which along the depth coordinate corresponds to the original picture, and along the other coordinate characterizes the average
FIG. 31. Preparation with decision making: (a) original mammogram; (b) results of linear filtration of the mammogram by an MSNR filter oriented to the detection of microcalcinates; (c) isolation of concentration domains of calcinate-like details.
FIG. 32. Preparation with determination and visualization of the picture's quantitative characteristics: (a) original radiophotograph of a soil column; (b) binary preparation, the result of stone isolation; (c) graph of the correlation function section of the picture in (b) at level 0.5.
diameter of the black spots in the preparation of Fig. 32b, i.e., the average size of stones in the sample. One can easily see hills and valleys in this graph that correspond to the specimen areas with large and small stones. Therefore, the graph is a convenient quantitative measure for division of the specimen into layers according to the average size of stones. Obviously, a single feature is insufficient in the general case for picture interpretation. To put it differently, it is desirable to generate and represent for visual analysis multicomponent or vector features. To solve this problem, the properties of vision should be exploited to full advantage. First of all, color vision might be used for representation of vector features. In this case, simultaneous representation and observation of three-component features is possible: Each of three picture preparations representing three
FIG. 32 (continued)
features is shown by a distinct color (red, blue, or green), and these pictures are mixed on the display screen into a color picture. This technique of representation of preparation results may be named "colorization." Two-component vector attributes may also be represented by means of stereoscopic vision. This is most natural in processing pictures which comprise a stereoscopic pair. In this case, one or both photographs of the pair are substituted by some preparation, and the observer is thus able to examine the stereoscopic picture with the effects of preparation. Another approach to using stereoscopic vision is to treat the feature resulting from picture preparation as a "relief," and to synthesize from this relief and the original picture new pictures constituting a stereoscopic pair. The user can thus observe a pseudostereoscopic picture whose brightness is defined by one picture preparation or by the original picture, and whose relief is defined by another one. Finally, there is one more possibility for representing preparation results: picture cinematization, i.e., their transformation into movies by generating from a series of preparation results a series of movie frames shown at cinematographic speed in order to provide smoothness of the observed changes. Cinematization is best used for observation of smooth variations in a preparation parameter: e.g., fragment size at sliding equalization, the exponent at power intensification, etc. Combinations of all three methods are, of course, possible.

IV. AUTOMATIC LOCALIZATION OF OBJECTS IN PICTURES

One of the major tasks of pictures is to provide information about the relative location of objects in space. In many applications, detection and localization (measurement of coordinates) of objects is of extreme practical importance. Many other problems of automatic picture interpretation, especially those of object recognition, may also be reduced to this problem.
A copious literature exists on localization and detection of objects in pictures, but the variety of ideas used for the solution of this problem is not so rich. Essentially, in all methods, detection and localization of objects is reduced to some kind of correlation of the given object with the observed picture and to subsequent comparison of the result with a threshold. The approach is justified either by a simple additive model treating the observed picture as a sum of the desired object and correlated independent noise with a known autocorrelation function (Andrews, 1970; Vander Lugt, 1964; Rosenfeld, 1969; Pratt, 1978), or by the Schwarz inequality (Rosenfeld, 1969). Numerous experimental verifications, however, reveal that, for sufficiently complicated practical pictures, the probability of erroneous identification by a
correlation detector of the desired object with foreign background objects is rather high. In order to improve detection quality, various improvements have been suggested, such as signal quantization, spatial differentiation, predistortion of the form of the correlated object, etc. Being heuristic in nature, these improvements can be neither listed nor classified, nor ordered with respect to their quality. At the same time, this adherence to the correlator is not accidental. The correlation detector-estimator is essentially a version of the linear detector-estimator, where a decision about the presence of a desired object and its coordinates is made pointwise through the level of the signal at each point of the field at the output of a linear filter acting upon the observed picture. The aim of the linear filter in such devices is to transform the signal space so as to enable independent decision making by each signal coordinate of the transformed space, rather than by the signal as a whole. Due to the decomposition into independent linear and nonlinear spatially inertialess units, the analysis and implementation of such a device in digital and analog processors is much simplified. This accounts for the popularity of the correlation method for object detection and localization in pictures. Simplicity of implementation is an important factor, and it turns out that one can determine the optimal characteristics of the linear detector-estimator ensuring the best localization reliability by relying upon its representation as a combination of a linear filter and a nonlinear pointwise decision unit, as well as on the adaptive approach developed here. The present section is devoted to the presentation of this approach, which has proved fruitful both for digital and for purely optical processing. In Section IV,A the problem of an optimal detector-estimator is posed.
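The linear detector-estimator structure just described can be sketched as follows (an illustrative Python sketch; the circular FFT implementation, the zero-padded reference, and the function names are assumptions of this illustration, not the author's code): a linear filter applied to the observed picture, followed by a pointwise decision unit that takes the coordinates of the absolute maximum of the filter output. The classical correlator is the special case whose frequency response is the conjugate object spectrum.

```python
import numpy as np

def linear_detector_estimator(picture, frequency_response):
    """Linear filter followed by a pointwise decision unit: the
    estimated object coordinates are those of the absolute maximum
    of the (real) filter output."""
    output = np.real(np.fft.ifft2(np.fft.fft2(picture) * frequency_response))
    coords = np.unravel_index(np.argmax(output), output.shape)
    return coords, output

def correlator_response(object_signal, shape):
    """Frequency response of the classical correlator (matched
    filter): the complex conjugate of the object spectrum, with the
    reference placed at the top-left corner of a zero field."""
    reference = np.zeros(shape)
    reference[:object_signal.shape[0], :object_signal.shape[1]] = object_signal
    return np.conj(np.fft.fft2(reference))
```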
In Section IV,B the problem of determining the optimal linear filter for localization of an exactly known object by a spatially uniform localization criterion is solved, and data are presented that bear this result out. In Section IV,C it is extended to the cases of an inexactly defined object, spatially nonuniform criteria, and a distorted picture. In Section IV,D the results obtained are used to explain the well-known recommendations on the usefulness of extracting contours prior to picture correlation, and to define the very notion of contour more exactly. Moreover, the problem of selecting objects that are best from the standpoint of localization reliability is solved here. In the existing literature, this important practical problem has hardly been discussed.

A. Optimal Linear Coordinate Estimator. Problem Formulation
Let us consider an estimator consisting of a linear filter and decision unit determining the coordinates of the absolute maximum of a signal at the filter
output, and let us determine the optimal linear filter ensuring the best quality of estimation. The quality of the object coordinate estimation is defined by two kinds of errors: errors due to false identification of the object with separate details in the observed picture, and errors of measurement of the coordinates in the vicinity of their true value. Errors of the first kind produce large deviations of the result, exceeding the size of the desired object. In the case of detection, they are called false-alarm errors. We shall refer to them as anomalous. Errors of the second kind, or normal errors, are of the order of magnitude of the object size and are due mostly to the distortions of the object signal by sensor noise. They are quite satisfactorily described by the additive model. Therefore, the classical estimator with a matched filter is optimal in terms of the minimum of the normal error variance, as was shown by Yaroslavskii in 1972 (it may be assumed that normal errors are characterized by their variance). However, it will yield many anomalous errors. Their probability and the related property of estimator threshold were discussed in detail by Yaroslavskii (1972b). Here we shall determine the characteristics of the linear filter of an estimator optimal in terms of anomalous errors. Let us define exactly the notion of optimality. In order to allow for possible spatial nonuniformity of the optimality criterion, let us assume that the picture is decomposed into N fragments of area S_n, n = 0, 1, ..., N - 1. Let h^{(n)}(b, x_0, y_0) be the histogram of video signal magnitudes b(x, y) at the filter output, as measured for the nth fragment in points not occupied by the object, provided that the object lies at the point with coordinates (x_0, y_0), and let b_0 be the filter output at the object localization point (it may be assumed that b_0 > 0 without restricting generality).
As the linear estimator under consideration decides upon the coordinates of the desired object via those of the absolute maximum at the linear filter output, the integral

Q_n(x_0, y_0) = \int_{b_0}^{\infty} h^{(n)}(b, x_0, y_0)\, db    (59)
then represents the portion of nth-fragment points that can be erroneously taken by the decision unit for the object coordinates. Generally speaking, b_0 should be regarded as a random variable, because it depends on video signal sensor noise, the photographing environment, illumination, object orientation at photographing, neighboring objects, and other stochastic factors. In order to take them into consideration, we introduce a function q(b_0), which is the a priori probability density of b_0. The object coordinates should also be regarded as random. Moreover, the weight of measurement errors in localization problems may differ over different picture fragments. To allow for these factors, we introduce weighting functions
w^{(n)}(x_0, y_0) and W_n, characterizing the a priori significance of errors in the determination of coordinates within the nth fragment and for each nth fragment, respectively:

\iint_{S_n} w^{(n)}(x_0, y_0)\, dx_0\, dy_0 = 1, \qquad \sum_{n=0}^{N-1} W_n = 1    (60)
Then the quality of coordinate estimation by the estimator under consideration may be described by the weighted mean, with respect to q(b_0), w^{(n)}(x_0, y_0), and W_n, of the integral of Eq. (59):

Q = \sum_{n=0}^{N-1} W_n \int_{-\infty}^{\infty} q(b_0)\, db_0 \iint_{S_n} w^{(n)}(x_0, y_0)\, dx_0\, dy_0 \int_{b_0}^{\infty} h^{(n)}(b, x_0, y_0)\, db    (61)
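Numerically, this criterion can be illustrated for the simplest case of a single fragment, uniform weights, and an exactly known response b_0 (the function below and its top-left-corner convention for the object area are assumptions of this illustration): it returns the fraction of background points whose filter output exceeds the output at the true object position, i.e., the portion of points the decision unit could mistake for the object.

```python
import numpy as np

def anomalous_error_measure(filter_output, object_position, object_size):
    """Discrete stand-in for Eq. (59) with uniform weights: the
    fraction of points outside the object area whose filter output
    exceeds b0, the output at the object position (top-left corner)."""
    k, l = object_position
    b0 = filter_output[k, l]
    background = np.ones(filter_output.shape, dtype=bool)
    background[k:k + object_size[0], l:l + object_size[1]] = False
    return float(np.mean(filter_output[background] > b0))
```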
If we want to know the mean estimation quality over a set of pictures, Q should be averaged over this set. An estimator providing the minimum of Q will be regarded as optimal.

B. Localization of an Exactly Known Object with a Spatially Uniform Optimality Criterion

Assume that the desired object is exactly defined, which means that the response of any filter to this object may be exactly calculated, or that q(b_0) is a delta function:

q(b_0) = \delta(b_0 - \bar{b}_0)    (62)

Eq. (61) defining the localization quality then becomes

Q = \sum_{n=0}^{N-1} W_n \iint_{S_n} w^{(n)}(x_0, y_0)\, dx_0\, dy_0 \int_{\bar{b}_0}^{\infty} h^{(n)}(b, x_0, y_0)\, db    (63)

or, if the histogram averaged within each fragment over x_0 and y_0 is denoted by

\bar{h}^{(n)}(b) = \iint_{S_n} w^{(n)}(x_0, y_0)\, h^{(n)}(b, x_0, y_0)\, dx_0\, dy_0    (64)

it becomes

Q = \sum_{n=0}^{N-1} W_n \int_{\bar{b}_0}^{\infty} \bar{h}^{(n)}(b)\, db    (65)
Suppose that the optimality criterion is spatially homogeneous, i.e., that the weights W_n are independent of n and equal to 1/N. Then

\bar{h}(b) = \frac{1}{N} \sum_{n=0}^{N-1} \bar{h}^{(n)}(b)    (66)

is the histogram of the filter output as measured over the whole picture and averaged with respect to the unknown object coordinates. By substituting Eq. (66) into Eq. (65), we obtain

Q = \int_{\bar{b}_0}^{\infty} \bar{h}(b)\, db    (67)
First, let us determine the frequency response H(f_x, f_y) of a filter minimizing Q. The choice of H(f_x, f_y) affects both \bar{b}_0 and the histogram \bar{h}(b). Since \bar{b}_0 is the filter response at the object localization point, it may be determined through the object spectrum \alpha_0(f_x, f_y) as

\bar{b}_0 = \iint_{-\infty}^{\infty} \alpha_0(f_x, f_y)\, H(f_x, f_y)\, df_x\, df_y    (68)
As for the relation between \bar{h}(b) and H(f_x, f_y), it is, generally speaking, of an involved nature. The explicit dependence on H(f_x, f_y) may be written only for the second moment m_2 of the histogram \bar{h}(b), by making use of the Parseval relation for the Fourier transform:

m_2 = \left( \int_{-\infty}^{\infty} b^2\, \bar{h}(b)\, db \right)^{1/2} = \left( \iint_{\bar{S}} w(x_0, y_0)\, dx_0\, dy_0 \int_{-\infty}^{\infty} b^2\, h(b, x_0, y_0)\, db \right)^{1/2}    (69)

where \bar{S} is the area of the picture under consideration minus the area
occupied by the signal of the desired object at the filter output, and \alpha_{bg}(f_x, f_y) is the Fourier spectrum of the picture in which the signal in the area occupied by the desired object is set to zero (the background spectrum), so that

m_2^2 = \iint_{-\infty}^{\infty} |\alpha_{bg}(f_x, f_y)|^2\, |H(f_x, f_y)|^2\, df_x\, df_y    (70)

Therefore, we shall rely upon Chebyshev's inequality, which is well known in probability theory and which for histograms reads

\int_{\bar{b}_0}^{\infty} \bar{h}(b)\, db \le m_2^2 / \bar{b}_0^2    (71)

and require that

g = m_2^2 / \bar{b}_0^2    (72)

be minimal. This condition is equivalent to that of the maximum of

\gamma_1 = \frac{\bar{b}_0^2}{m_2^2} = \frac{\left[ \iint_{-\infty}^{\infty} \alpha_0 H\, df_x\, df_y \right]^2}{\iint_{-\infty}^{\infty} |\alpha_{bg}|^2 |H|^2\, df_x\, df_y}    (73)

In order to determine the maximum of \gamma_1 with respect to H(f_x, f_y), let us make use of the Schwarz inequality

\left| \iint_{-\infty}^{\infty} \alpha_0 H\, df_x\, df_y \right|^2 \le \iint_{-\infty}^{\infty} \frac{|\alpha_0|^2}{|\alpha_{bg}|^2}\, df_x\, df_y \cdot \iint_{-\infty}^{\infty} |\alpha_{bg}|^2 |H|^2\, df_x\, df_y    (74)

from which it follows that the maximum

\gamma_1^{\max} = \iint_{-\infty}^{\infty} \frac{|\alpha_0(f_x, f_y)|^2}{|\alpha_{bg}(f_x, f_y)|^2}\, df_x\, df_y    (75)

is attained at

H_{opt}(f_x, f_y) = \frac{\alpha_0^*(f_x, f_y)}{|\alpha_{bg}(f_x, f_y)|^2}    (76)
One may express |\alpha_{bg}(f_x, f_y)|^2 through the spectrum \alpha_p(f_x, f_y) of the observed picture and that of the desired object, \alpha_0(f_x, f_y). Obviously,

\alpha_{bg}(f_x, f_y) = \alpha_p(f_x, f_y) - \alpha_0(f_x, f_y)\, w(f_x, f_y)    (77)

Then, substitution of Eq. (77) into Eq. (70) results in

|\alpha_{bg}|^2 = |\alpha_p|^2 + |\alpha_0|^2 |w|^2 - \alpha_p^* \alpha_0 w - \alpha_p \alpha_0^* w^*    (78)

where

w(f_x, f_y) = \iint_{S} w(x_0, y_0)\, e^{-i 2\pi (f_x x_0 + f_y y_0)}\, dx_0\, dy_0    (79)

is the spectrum of the weight function w(x_0, y_0). Usually, the area occupied by the desired object is much less than the area of the picture itself. Therefore, the following approximate estimate is often practicable:

|\alpha_{bg}(f_x, f_y)|^2 \approx |\alpha_p(f_x, f_y)|^2    (80)

Obviously, if an optimal filter is required for a set of pictures, the result of spectrum averaging over the set should be substituted into Eqs. (78) and (80) for |\alpha_{bg}(f_x, f_y)|^2.
is the spectrum of the weight function w ( x o ,yo). Usually, the area occupied by the desired object is much less than the area of the picture itself. Therefore, the following approximate estimate is often practicable Obviously, if an optimal filter is required for a set of pictures, the result of spectrum averaging over the set should be substituted into Eqs. (78) and (80) for 1.,(fX,f,)l2. Such an optimal filter may be rather easily implemented by optical means (Yaroslavskii, 1976a, 1981) in an adaptive optical system with a nonlinear element in the Fourier plane and has shown to give good results (Dudinov et ul., 1977). With digital realization, it is most reasonable to process the signal in the frequency domain because the frequency response [Eq. (76)] of the optimal filter is based on measurement of the observed picture spectrum. Computer simulation of the optimal linear estimator also has confirmed its advantage over the traditional correlator. Figure 33 shows a 512 x 512element picture over which experiments were carried out on determination of the coordinates of 20 test 5 x 5-element dark marks whose disposition is shown in Fig. 34 by numbered squares. As may be seen from this scheme, the test objects are situated in structurally different areas of the aerial photograph; this fact enables us to estimate the correlator and optimal linear estimator under different conditions. The contrast of marks is about 25% of the video signal amplitude range. The ratio of the mark amplitude to the rms video signal value over the background is about 1.5. The results of the simulation are shown in Fig. 35, which presents (in the downward direction) the cross sections of the initial video signal and outputs of a standard correlator and optimal filter passing through the centers of marks (12) and (15) in Fig. 33. One may easily see in the graph of correlator output the autocorrelation peaks of test marks and false correlation peaks, including those exceeding the autocorre-
APPLIED PROBLEMS OF DIGITAL OPTICS
75
FIG.33. Test aerial photograph with square marks.
lation one. These false peaks result in false decisions (Fig. 36). Comparison of this graph with the lower one in Fig. 35 shows how the optimal filter facilitates the task of spot localization to the decision unit. The result of the optimal estimator operation is tabulated below in Table 111, which lists 31 main local maxima of the optimal filter output. As may be seen from the table, coordinates of all twenty test marks are precisely measured, and no false decision is made. It may also be seen which areas of the picture give a smaller output response, i.e., are potentially localizable with greater difficulty (see also Fig. 34 where each spot is numbered as in Table 111).
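A minimal numerical sketch of the optimal estimator of Eq. (76), using the approximation of Eq. (80), i.e., the measured power spectrum of the observed picture in place of the background spectrum (the regularization constant `eps`, the zero-padded reference, and the function names are assumptions of this sketch, not the author's implementation):

```python
import numpy as np

def optimal_estimator_response(object_signal, picture, eps=1e-6):
    """Frequency response of Eq. (76): the conjugate object spectrum
    divided by the power spectrum of the observed picture, the latter
    standing in for the background spectrum per Eq. (80)."""
    reference = np.zeros(picture.shape)
    reference[:object_signal.shape[0], :object_signal.shape[1]] = object_signal
    alpha_0 = np.fft.fft2(reference)
    power_p = np.abs(np.fft.fft2(picture)) ** 2
    return np.conj(alpha_0) / (power_p + eps)

def localize(picture, frequency_response):
    """Decision unit: coordinates of the absolute maximum of the
    filter output."""
    output = np.real(np.fft.ifft2(np.fft.fft2(picture) * frequency_response))
    return np.unravel_index(np.argmax(output), output.shape)
```

On a noise-free picture containing only the object, this whitening reduces to a phase-only matched filter and yields a nearly delta-shaped response at the object position; on structured backgrounds it suppresses the strong background frequencies that generate false correlation peaks.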
FIG. 34. Scheme of marks in Fig. 33.
FIG.35. Graphs of a section of the original picture (Fig. 33) video signal (upper), standard correlator output (middle), and optimal filter output (lower).
FIG.36. Scheme of decisions at the standard correlator output.
TABLE III
RESULTS OF MEASURING TEST MARKS IN FIG. 35

Serial number(a)   Relative local maximum
 1    1
 2    0.88
 3    0.81
 4    0.83
 5    0.83
 6    0.83
 7    0.82
 8    0.8
 9    0.8
10    0.78
11    0.78
12    0.78
13    0.778
14    0.774
15    0.77
16    0.766
17    0.762
18    0.754
19    0.754
20    0.737
21    0.733
22    0.729
23    0.725
24    0.721
25    0.721
26    0.713
27    0.709
28    0.709
29    0.709
30    0.704
31    0.704

(a) 1-20 are true peaks; 21-31 are false peaks.
C. Allowance for the Object's Uncertainty of Definition and Spatial Nonuniformity: Localization on "Blurred Pictures" and Characteristics of Detection

1. Localization of an Inexactly Defined Object
This is the case when q(b_0) cannot be regarded as a delta function; i.e., the object is not exactly known. As before, the picture will be regarded as spatially uniform. Now, the optimal estimator must provide the minimum of the integral

Q_1 = \int_{-\infty}^{\infty} q(b_0)\, db_0 \int_{b_0}^{\infty} \bar{h}(b)\, db    (81)

where \bar{h}(b) is defined by Eq. (66).
a. Estimator with selection. Decompose the interval of possible values of b_0 into subintervals within which q(b_0) may be regarded as constant. Then

Q_1 \approx \sum_i q_i \int_{b_0^{(i)}}^{\infty} \bar{h}(b)\, db    (82)

where b_0^{(i)} is a representative of the ith interval and q_i is the area under q(b_0) over the ith interval. Since q_i \ge 0, Q_1 is minimal if each

Q_1^{(i)} = \int_{b_0^{(i)}}^{\infty} \bar{h}(b)\, db    (83)

is minimal. The problem is thus reduced to the above problem of localization of an exactly known object, the only difference being that now an estimator with the filter

H^{(i)}(f_x, f_y) = \frac{\alpha_0^{(i)*}(f_x, f_y)}{|\alpha_{bg}(f_x, f_y)|^2}    (84)

should be generated separately for each "representative" of all the possible object variations. Stated differently, this means that there is more than one given object. Of course, this results in losses of time on selection.
b. Estimator adjusted to the averaged object. If the dispersion of the object parameters is small enough, one may, at the expense of a higher rate of anomalous errors, solve the problem as though the object were exactly known; the optimal filter in this case is corrected with due regard to the object parameter dispersion. In order to correct the filter characteristic, change in Eq. (81) the variables, b_1 = b - b_0, and the order of integration:
Q_1 = \int_{0}^{\infty} db_1 \int_{-\infty}^{\infty} q(b_0)\, \bar{h}(b_1 + b_0)\, db_0    (85)

The internal integral in Eq. (85) is a convolution of distributions, or the distribution of the difference of two independent variables b and b_0. One may denote this distribution by h_1(b_1). Its mean value is equal to the difference of the mean values \bar{b} and \bar{b}_0 of the distributions \bar{h}(b) and q(b_0), and its variance is equal to the sum of the variances of these distributions, that is, m_2^2 + \delta_0^2, where \delta_0^2 is the variance of the distribution q(b_0). Therefore,

Q_1 = \int_{0}^{\infty} h_1(b_1)\, db_1    (86)
The problem, thus, has boiled down to that of Section IV,B, and similarly to Eq. (76) one may write the following expression for the optimal filter frequency response
where @ < ( f x , f Y ) is a function complex conjugate to the object spectrum averaged over the set of unknown object parameters [the result of averaging over q(b,) in Eq. ( 8 5 ) ] , and
laef(fx>LA2= C ~ O ( S X J J - &l(S4l2
(88)
is the mean-squared difference a,(fx, j,) - ?i,(fx,fy). The optimal filter is somewhat different from that of the determinate case: It relies upon an “averaged” object and corrected power spectrum of the background picture, correction being the rms of the object power spectrum. 2. Localization in the Case of Spatially Nonhomogenuous Criterion
Let us turn to the general formula, Eq. (61). Depending on the constraints on implementation, one of two ways to attain the minimum of Q may be chosen.
a. Readjustable Estimator with Fragmentwise Optimal Filtration. Under a given W_n, the minimum of Q is attained at the minima of all

Q^(n) = ∫_{-∞}^{∞} ŵ(b - b_0) db_0 ∫∫_{S_n} w^(n)(x_0, y_0) h_1(b_0, x_0, y_0) dx_0 dy_0   (89)
This means that the linear filter should be readjustable and process pictures by fragments within which the averaging in Eq. (89) is done. For each fragment, the characteristic of an optimal filter is determined through Eq. (74) or (87) on the basis of measurements of the observed local power spectrum of fragments (with allowance for the above reservations about the influence of the object spectrum on the observed picture spectrum). According to Eq. (61), the
L. P. YAROSLAVSKII
fragments do not overlap. It is obvious from the very sense of Eq. (61) that it gives rise to a sliding processing algorithm based on an estimate of the current local power spectrum of the picture, because the error weights may be defined by a continuous function. Note also that, with fragmentwise and sliding processing, the readjustable filter characteristic is independent of the weights W_n or of a corresponding continuous function.

b. Nonreadjustable Estimator. When a readjustable estimator with fragmentwise or sliding processing cannot be implemented, the estimator should be adjusted to the power spectrum of picture fragments averaged over W_n. Indeed, it follows from Eq. (61)

where h̄_1(b) is a histogram averaged over {W_n} and w^(n)(x_0, y_0), whence one may conclude by analogy with Eqs. (76) and (87) that
where
Thus, the transfer function of the optimal filter is in this case dependent on the weights {W_n}.

3. Localization on Defocused Pictures
Let the picture be distorted by a linear, spatially invariant system with frequency response H_s(f_x, f_y). Obviously, the optimal estimator should be adjusted to an object that was subjected to the same transformation as the observed picture; i.e., the filter transfer characteristic should be as follows
Depending on which way is more convenient for filter implementation and for representation of the reference object, different modifications of this formula
are possible. For example,
corresponds to the estimator where the spectrum of the observed defocused picture is first "whitened" by a filter making its power spectrum uniform and then correlated with the reference ᾱ_0 H_s. The whitened spectrum divided by H_s may be regarded as the spectrum of a picture at the output of a filter inverse to the defocusing one, i.e., as the spectrum of a picture corrected by the inverse filter. Here a relation exists between localization in defocused pictures and correction of pictures distorted by linear systems (see Section II,D).
4. Detection Characteristics

Sometimes it is desirable to detect an object with certain reliability without a priori knowledge that it is present in the picture. Detection reliability is known to be characterized by the conditional probabilities of missing the object and of a false alarm (false detection). A peculiar feature of the localization and detection problem under consideration lies in the fact that the probabilities of missing an object and of a false alarm depend on different random factors: the former depends on signal sensor noise, and the latter on the presence of foreign objects and (to a lesser degree) signal sensor noise. Since foreign objects are assumed not to be defined a priori, it is impossible to determine the probability of a false alarm. One can only be sure that for the observed set of foreign objects it is minimized by the appropriate choice of the above linear filter. In order to determine the false alarm probability, one has to assume a statistical description of foreign objects in the form, for instance, of the signal overshoot distribution at the output of the optimal filter as defined for a given class of pictures. The noise of the video signal sensor is quite satisfactorily described by the additive Gaussian model. Therefore, the object-missing probability may be defined as
P_miss = Φ((h_thr - h_0)/σ)

where h_0 is the maximal signal of the desired object at the optimal filter output, h_thr is the chosen detection threshold, σ is the standard deviation of sensor noise, and Φ(x) is the error integral

Φ(x) = (1/√(2π)) ∫_{-∞}^{x} exp(-t²/2) dt
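Assuming the conventional Gaussian form P_miss = Φ((h_thr - h_0)/σ) for the object-missing probability, it can be evaluated numerically; the function and variable names below are hypothetical illustrations, not the author's notation:

```python
import math

def error_integral(x):
    """Phi(x) = (1/sqrt(2*pi)) * integral_{-inf}^{x} exp(-t^2/2) dt,
    expressed through the standard error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def miss_probability(h0, h_thr, sigma):
    """Probability that the maximal object response h0, perturbed by
    Gaussian sensor noise of standard deviation sigma, falls below the
    detection threshold h_thr (illustrative Gaussian-model form)."""
    return error_integral((h_thr - h0) / sigma)
```

For example, a peak response of 5 noise units against a threshold of 3 gives a miss probability of about 0.023, i.e., the object is almost always detected.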
D. Optimal Localization and Picture Contours. Selection of Objects from the Viewpoint of Localization Reliability
1. Whitening and Contours

In order to gain insight into the sense of the operations performed on the observed picture by the derived optimal linear filter, its characteristic, Eq. (76), may be conveniently represented as
In this representation, the filter action is reduced to the picture whitening (filter H_1) mentioned above in Section IV,C, followed by correlation of the whitened picture with the identically transformed desired object (filter H_2). An interesting feature of the optimal filter of Eq. (97) is that the whitening by the filter H_1(f_x, f_y) = 1/(|α_bg(f_x, f_y)|²)^{1/2} usually brings about contouring of the observed picture owing to amplification of its high spatial frequencies since, as a rule, the picture power spectrum is a sufficiently rapidly decreasing function of spatial frequencies and, consequently, H_1(f_x, f_y) grows with frequency. This conclusion is illustrated by Fig. 37, demonstrating the result of whitening of the picture shown in Fig. 33, and also by the results of test picture whitening shown in Fig. 38. The recommendation empirically established by some researchers, that in order to enhance localization reliability it is good practice to extract the contours of the picture prior to correlation by some kind of spatial differentiation, or to quantize it roughly to improve boundary sharpness, thus has a rational substantiation. Moreover, this result casts new light on what is to be regarded as picture contours and why contours are of such importance for the visual system. The concept of contours occurs often and is defined differently in publications on picture processing and recognition. From the viewpoint of object localization in pictures by the linear estimator, "contours" result from picture whitening. The more intensive this "contour" portion in the signal describing the object (the sharper the object picture, in particular), the more reliable is the localization.
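A minimal numerical sketch of this whitening operation (not the author's implementation; the regularizing constant eps is an added assumption to avoid division by zero at empty spectral bins):

```python
import numpy as np

def whiten(picture, eps=1e-8):
    """ "Whiten" a picture: divide its spectrum by the square root of its
    own power spectrum, flattening the spectrum. Because natural-picture
    spectra fall off with frequency, this amplifies high spatial
    frequencies and typically produces a contour-like result."""
    spec = np.fft.fft2(picture)
    power = np.abs(spec) ** 2
    H1 = 1.0 / np.sqrt(power + eps)      # whitening filter 1/sqrt(power spectrum)
    return np.real(np.fft.ifft2(spec * H1))
```

After this operation the magnitude spectrum is essentially flat, so all spatial frequencies contribute equally to a subsequent correlation with the (identically whitened) reference object.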
Possibly, from this standpoint one can explain the well-known effect in vision psychophysics that the visibility of noise and distortions near sharp brightness overfalls (object boundaries) is lower than where brightness varies smoothly, i.e., where the intensity of the "contour" signal is small. Notably, when contour extraction is discussed, usually isotropic differentiating procedures are implied. The optimal whitening for localization, however, is not necessarily isotropic or differentiating, because it is defined by
FIG. 37. The result of "whitening" of the picture of Fig. 33.
the spectrum of the background picture or, in the case of a spatially nonhomogeneous estimator, by those of the picture fragments over which the desired object is looked for. Moreover, the same phenomenon accounts for the adaptivity of whitening; that is, the filter characteristic is adjusted to the observed picture, and the effect of whitening is different on different pictures. For example, it is angular points that are emphasized in rectangles and parallelograms against the background of circles; in texts, vertical and horizontal fragments of characters are contoured (practically, only angular points are left of them), but sloping fragments almost do not change because they occur rarely (see Fig. 38b).
FIG. 38. "Whitening" of the test picture consisting of geometrical figures and characters: (a) original picture; (b) after whitening.
2. Selection of Reference Objects in Terms of Localization Reliability

There are numerous applications in which the localization object is not defined and one has to choose it. The question is how to do this to best advantage. This problem occurs in stereogrammetry and artificial intelligence, where it is called "the problem of characteristic points." The literature on stereogrammetry recommends taking as reference objects those fragments that have pronounced local characteristics such as crossroads, river bends, separate buildings, etc. Zavalishin and Muchnic (1974) suggest taking those picture areas over which specially introduced informativeness functions have extremal values. Qualitative recommendations of this sort may also be found in other publications on pattern recognition.
The above analysis gives a solution to this problem. Indeed, it follows from Eq. (75) for the maximal "signal-to-noise" ratio at the output of the optimal linear filter that the picture fragments with maximal "whitened" spectrum power will be the best references. They will provide the greatest response of the optimal filter and, consequently, the minimum of false identification errors. Hence, the following recommendation may be made on the selection of reference objects (in stereogrammetry, for example). One of the stereo pair pictures should be decomposed into fragments, and the ratio of their spectrum α(f_x, f_y) to the modulus of the second picture spectrum |α_bg(f_x, f_y)|² should be determined. Next, for each fragment the integral of Eq. (73) (or a corresponding sum in digital processing) is computed, and the required number of greatest results is chosen. Since, as was already observed, the picture spectrum is most commonly a rapidly decreasing function, the reference objects with slowly decreasing spectra, i.e., picture fragments which are visually estimated as containing the most intensive contours, will be the best ones. These recommendations were checked experimentally by Belinskii and Yaroslavskii (1980). Figures 39 and 40 show some of the results of detection of reference objects by means of the above algorithm with sliding processing by a 32 x 32 window. The degree of object (fragment of the original picture) detection reliability is shown by the degree of blackening. It may be readily seen that where the original picture has some sharply pronounced local peculiarities (brightness overfalls, variations of texture pattern, etc.) the best fragments are distinguished. The algorithm for reference object determination requires rather cumbersome computations, especially with sliding processing. Therefore, computationally simpler algorithms approximating the optimal one are of interest.
Experiments (Belinskii and Yaroslavskii, 1980) have shown that algorithms for computation of the local variance, or of mean local values of video signal gradients, for which fast recursive algorithms exist, may be used as simplified algorithms. All the processing methods described in this article may be effectively implemented in a hybrid optodigital system built around an adaptive optical correlator with a nonlinear medium in the Fourier plane (Yaroslavskii, 1976a, 1981). With a purely digital implementation one has to make some simplifications in order to enhance the speed. This is exemplified by rough quantization of the whitened signal, which (Belinskii et al., 1980) enables a drastic reduction of the operations for computation of the correlation between the "whitened" picture and the desired object, and by the algorithm for identification of benchmarks in aerial and space photographs (Yaroslavskii, 1976b).
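One such simplified test, a sliding local variance computed recursively with integral images (summed-area tables), can be sketched as follows; the window size and the synthetic example are illustrative, not taken from the experiments cited above:

```python
import numpy as np

def local_variance_map(picture, win=32):
    """Sliding local variance over win x win windows, computed with
    integral images: each window sum costs four lookups, a fast recursive
    substitute for the optimal spectrum-based informativeness test."""
    p = picture.astype(np.float64)
    # Integral images of the picture and its square, padded with a zero row/column.
    S = np.pad(p, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    S2 = np.pad(p * p, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    n = win * win

    def wsum(T):
        # Window sums for all top-left positions via four corner lookups.
        return T[win:, win:] - T[:-win, win:] - T[win:, :-win] + T[:-win, :-win]

    mean = wsum(S) / n
    return wsum(S2) / n - mean ** 2          # var = E[x^2] - (E[x])^2
```

Fragments where the map is largest (strong texture, sharp brightness overfalls) are the candidates for reference objects; flat areas give near-zero variance.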
FIG. 39. Automatic extraction of reference objects in an aerial photograph: (a) original picture; (b) result of testing of 32 x 32 fragments.

FIG. 40. Automatic extraction of reference objects in a space photograph: (a) original picture; (b) result of testing of 32 x 32 fragments.
E. Estimation of the Volume of Signal Corresponding to a Stereoscopic Picture

The stereo effect is known to be one of the basic stereo vision mechanisms (Valyus, 1950) widely used in different projections of stereo TV and cinema (Shmakov et al., 1966), in applied TV (Shmakov et al., 1966), in aerial photography and cartography, and in many other fields of human activity making use of visual information. Therefore, it is of great practical interest to estimate the volume of signal corresponding to stereo pictures, i.e., the capacity of the channel required for storage and transmission of stereo pictures. This problem is discussed in a number of publications (see, for example, Shmakov et al., 1966; Gurevich and Odnol'ko, 1970) from which one may conclude that the volume of signal corresponding to stereo pictures (a stereo pair) is approximately twice that of one picture in the pair, i.e., the capacity of the channel for transmission and storage of stereo pictures is approximately twice that of the single-picture channel. These estimates are based on data on vision resolution of flat pictures and those with depth, and rely upon an implicit assumption that the resolutions of stereo vision for the brightness and relief (depth) components of the stereo picture are equal. Being unfounded, this assumption leads to an overstated estimate of the signal volume. The analysis of optimal localization of objects in pictures, as presented in this article, enables much more optimistic estimates. From the informational standpoint, the two pictures of the pair are equivalent to one picture plus the relief (depth) map of the scene. Indeed, by means of two pictures one can construct a relief map, and, vice versa, from a relief map and one of the pictures the second picture of the pair may be constructed. Therefore, the increment of signal volume provided by the second picture of a pair is equal to the signal volume corresponding to the relief map.
The number of depth grades resolved by the eye is approximately the same as that of the brightness grades (about 200, according to Gurevich and Odnol'ko, 1970). Therefore, the relative increment of signal volume will be mostly defined by the number of degrees of freedom of the relief map, i.e., by the number of its independent samples. This number may be estimated by the following simple reasoning. Each sample of the relief map may be determined by identifying corresponding areas in the photographs that form a stereo pair, measuring their parallax, and recalculating it into the relief (plan) depth with due regard to the survey (observation) geometry. All the engineering systems using stereo pictures operate in this manner, and it would be natural to assume that the stereo vision mechanism operates similarly. The number of degrees of freedom
(independent samples) of the relief map, obviously, is equal to the ratio of the picture area to the minimal area of its fragments which may be identified with confidence in the other picture of the pair. It is also evident that, in order to provide reliable identification, the dimensions of the identified fragments should exceed those of the picture resolution element, and their area should be several times that of the resolution element. This implies that the number of independent samples of the relief map and, consequently, the signal volume increment, will always be several times less than the number of resolution elements in a stereo-pair picture. For example, for identified areas of 2 x 2 and 3 x 3 elements, the increment of signal volume will be, respectively, 4 and 9 times less than the signal volume of one picture, etc. The studies of an optimal linear detector of objects in pictures (Belinskii and Yaroslavskii, 1980) demonstrate that, for reliable identification in complicated pictures, the areas should be from 8 x 8 to 10 x 10 picture elements or more. This fact enables one to hypothesize that the signal volume increment required for representation of the stereo effect is only several percent, or even a fraction of one percent, of the signal volume for one picture of a stereo pair. The present writer has carried out a series of experiments on stereo picture processing with the aim of indirect verification of this hypothesis. Samples of one of the stereo pair pictures were thinned out, and bilinearly interpolated samples were substituted for the rejected ones. The experiments were aimed at determining the influence of thinning out on the perception of depth and sharpness of the observed stereoscopic picture. Experiments were carried out with frames of a stereoscopic cartoon film (Fig. 41) and a training aerial photograph (Fig. 42).
The former were of interest because of the sharp steplike changes of plans, over which the loss of resolution in one of the pictures caused by thinning out and interpolation might be more prominent. The stereo aerial photograph was used for a quantitative estimation of the influence of thinning out and interpolation on the precision of parallax measurements and, thus, on the accuracy of a relief map. Observing stereo pictures by means of drawings, one may see that thinning out and interpolation of one picture do not markedly affect the stereo picture quality, even at 5 x 5 thinning out, when the signal volume is decreased by a factor of 25. The same fact is confirmed by the results of measuring the precision of parallax determination for respective points, as performed on the stereo comparator for the aerial photograph of Fig. 42 over 31 randomly selected fragments. These results are plotted in Fig. 43. The graph of Fig. 43a shows that at 1:3 thinning out the rms error of
FIG. 41. Influence of thinning out of a picture from a stereo pair on the stereoscopic effect: (a) original stereo pair; (b)-(e) the right-hand frame of (a) thinned out with steps of 2:1, 3:1, 4:1, and 5:1.
FIG. 41 (continued)
parallax measurement is within the precision of the stereo comparator, which is characterized by the error for nonrastered (i.e., not sampled and reconstructed) pictures. Moreover, rastering and 1:2 thinning out slightly decrease this error. This may be explained by the fact that at sampling and reconstruction of pictures by means of a rectangular aperture, pseudocontours occur at the boundaries of neighboring samples that somewhat improve the accuracy of localization of respective points. As may be seen from the graph in Fig. 43b, the loss of stereo effect becomes noticeable only with 1:7 thinning out, thus confirming the above hypothesis. At the qualitative level it is confirmed also by the well-known fact that one of the pictures in a pair may be distorted significantly (decrease of sharpness, distorted reproduction of half-tints, distortion or even complete loss of colors) without appreciable loss of the stereo effect. On the other hand, the reasoning used for estimation of the signal volume increment seems to elucidate these phenomena to some extent.
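The thinning-out-and-interpolation procedure applied to one picture of the stereo pair can be imitated numerically; this is an illustrative sketch in which separable 1D linear interpolation realizes the bilinear interpolation described above:

```python
import numpy as np

def thin_and_interpolate(img, step):
    """Keep every step-th sample along both axes and rebuild the picture
    by bilinear interpolation. The retained signal volume drops roughly
    by a factor of step**2."""
    h, w = img.shape
    ys = np.arange(0, h, step)
    xs = np.arange(0, w, step)
    sparse = img[np.ix_(ys, xs)].astype(np.float64)   # thinned-out samples
    xi = np.arange(w, dtype=np.float64)
    yi = np.arange(h, dtype=np.float64)
    # Bilinear interpolation as two passes of 1D linear interpolation.
    rows = np.vstack([np.interp(xi, xs, sparse[k]) for k in range(len(ys))])
    return np.column_stack([np.interp(yi, ys, rows[:, c]) for c in range(w)])
```

Applied to one frame of a pair with step 5, this reproduces the 25-fold signal volume reduction of the experiments; a linear brightness ramp is restored exactly, while fine detail above the new Nyquist limit is lost.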
FIG. 42. Tutorial aerial photograph used in the experiments on thinning out.
It should be noted that the arguments about the minimal size of an identifiable area are tentative, because special pictures and objects may be imagined (e.g., sparse contrast points or linear objects against an absolutely even background) for which the increment estimate will not be so optimistic. However, it seems to be true for complicated pictures of natural origin.

V. SYNTHESIS OF HOLOGRAMS
Hologram synthesis requires the solution of two major problems: computation of the field to be recorded on the hologram, and recording the computation results on a physical carrier capable of interacting with radiation in a hologram reconstruction scheme or in an optical system of spatial
FIG. 43. (a) rms of parallax estimation error and (b) the rate of points with loss of stereo effect as functions of the degree of thinning out. Point "0" on the abscissa corresponds to the nonsampled picture and characterizes the precision of the stereo comparator. Point "1" corresponds to the sampled original picture without thinning out. Points 2 through 8 correspond to thinning out 1:2 through 1:8.
filtration. Solution of the first problem requires an adequate digital representation of the wave-field transformations occurring in optical systems. For the second problem, optical media are required which can be used for recording synthesized holograms, together with techniques and devices for controlling their optical properties, such as the transmission or reflection factor, refraction factor, or optical thickness. This section is devoted to the presentation of approaches to these problems. Section V,A formulates a mathematical model which may be used as a basis for the synthesis of holograms for data visualization. Section V,B describes, with allowance for the performance of devices for hologram recording and reconstruction, the discrete representation of Fourier and Fresnel holograms. Methods for recording synthesized holograms in amplitude, phase, and binary
media are analyzed in Section V,C, where the existing hologram recording methods and their modifications are discussed and a universal interpretation of the various methods is given. In Section V,D the reconstruction of synthesized holograms in the optical Fourier scheme is considered, and the distortions of the reconstructed image arising at construction of the continuous hologram from its discrete representation are discussed. Finally, Section V,E describes the existing methods of data visualization by means of synthesized holograms.
A. Mathematical Model

Consider a mathematical model of hologram synthesis built around the scheme of visual observation of objects shown in Fig. 44. The observer's position with respect to the observed object is defined by the observation surface where the observer's eyes are situated, and the set of foreshortenings is defined by the object observation angle. In order that the observer may see the object at the given observation angle, it suffices to reproduce, by means of the hologram, the distribution of intensity and phase of the light wave scattered by the object over the observation surface. For the sake of simplicity, consideration will be given to monochromatic object illumination, which enables one to describe light-wave transformations in terms of the complex wave amplitude. Although the interaction between radiation and the body at reflection from the body's surface is of an involved nature, the object characteristics defining its ability to reflect and dissipate incident radiation may be described for our purposes by a radiation reflection factor with respect to the intensity, B(x, y, z), or amplitude, b(x, y, z), which are functions of the object's surface coordinates. The intensity of the reflected
FIG. 44. Scheme of visual object observation by its hologram.
wave I_0(x, y, z) and its complex amplitude A_0(x, y, z) at the point (x, y, z) are related to the intensity I(x, y, z) and amplitude of the incident wave as follows

I_0(x, y, z) = B(x, y, z) I(x, y, z)   (98)

A_0(x, y, z) = b(x, y, z) A(x, y, z)   (99)

The reflection factor with respect to amplitude may be regarded as a complex function represented as

b(x, y, z) = |b(x, y, z)| exp[iβ(x, y, z)]   (100)

Its modulus |b| and phase β show how the amplitude modulus A and light-wave phase ω change after reflection by the body surface at the point (x, y, z)

|A_0(x, y, z)| = |A(x, y, z)| |b(x, y, z)|   (101)

ω_0(x, y, z) = ω(x, y, z) + β(x, y, z)   (102)

where

A_0(x, y, z) = |A_0(x, y, z)| exp[iω_0(x, y, z)]   (103)

A(x, y, z) = |A(x, y, z)| exp[iω(x, y, z)]   (104)

According to Eqs. (98)-(104), the intensity reflection factor may be determined through the amplitude reflection factor as

B = |b|² = bb*   (105)
The relation between the complex amplitude Γ(ξ, η, ζ) of the light-wave field over an arbitrary observation surface defined at the coordinates (ξ, η, ζ) and the complex amplitude A_0 at the object surface can be described by an integral

Γ(ξ, η, ζ) = ∫∫∫_{S(x,y,z)} A_0(x, y, z) T(x, y, z; ξ, η, ζ) dx dy dz   (106)

whose kernel T(x, y, z; ξ, η, ζ) depends on the spatial disposition of the object and the observation surface, integration being performed over the object
If K > 2 is chosen, the dynamic range of possible hologram values may be extended because the maximal reproducible amplitude is KA_0. The most interesting of the K > 2 cases are those of K = 3 and 4, because the two-dimensional spatial degrees of freedom of the medium and hologram recorder may be used more effectively through allocation of the component vectors according to Figs. 50b,c and 48c. The above hologram coding methods relying on additive representation of the complex number have one more important property in common. All of them make use of some form of implicit introduction of the spatial carrier and of a nonlinear transformation of the signal with a spatial carrier, similar to the classical method of recording optical holograms (Leith and Upatnieks, 1961). It is easy to check that Eqs. (136), (138), (139), (148), and (151) may be, for example, rewritten in the following equivalent form containing explicit hologram samples multiplied by those of the spatial carrier with respect to one or both coordinates
F(m, n) = Re{Γ(r, s) exp(-i2πm/2)},   m = 2r + m_0;  m_0 = 0, 1;  n = s   (136')

θ(m, n) = rctf{Re[Γ(r, s) exp(-i2πm/2)]},   m = 4r + 2m_01 + m_02;  m_01, m_02 = 0, 1;  n = s   (138')

Γ(m, n) = rctf(Re{Γ(r, s) exp[-i2π(m + 2n)/4]}),   m = 2r + m_0;  n = 2s + n_0;  m_0, n_0 = 0, 1   (139')
F(m, n) = (1/3) Σ_{p=0}^{2} hlim(Re{Γ(r, s) exp[-i2π(m + p)/3]}) rctf(Re{Γ(r, s) exp[-i2π(m + p + 3)/3]}),   m = 3r + m_0;  m_0 = 0, 1, 2;  n = s   (148')

F(m, n) = Γ_0 exp(i{φ(r, s) - cos(πm) arccos[|Γ(r, s)|/Γ_0]}),   m = 2r + m_0;  m_0 = 0, 1;  n = s   (151')
where rctf(z) is the "rectifier" function

rctf(z) = z for z ≥ 0, and rctf(z) = 0 for z < 0
with symbol probabilities p(s_1), p(s_2), ..., p(s_q) as above, each symbol s_i can be transformed or mapped into a fixed sequence of l_i symbols taken from a finite alphabet X = {x_1, x_2, ..., x_r}. This corresponds to encoding each symbol s_i into a code word X_i belonging to the set {X_1, X_2, ..., X_q}; X is called the code alphabet. Source codes can be classified according to the code-word structure: codes using variable-length encoding, in which the code words X_i have variable length; and codes with a fixed length of the code words. Further, source codes can be distinguished according to the following properties: (1) nonsingular codes, having all different code words; (2) codes which can be univocally decoded, for which the nth code extension is nonsingular for any finite value of n; (3) comma codes, having a specific symbol to separate a code word from the neighboring ones; (4) instantaneous codes, for which any code word can be decoded into a source symbol without the necessity of considering or knowing the following symbols.
An important example of a source code is represented by the usual binary code, used to represent an image sample (quantized grey level) in digital form.
DIGITAL FILTERS AND DATA COMPRESSION
This binary code, having in general a constant word length, is a simple example of a nonsingular code which can be univocally decoded. A very important property of source codes is connected to the economy or compactness of the representation of the information symbols. To make this concept precise, the average code-word length can be defined as

L̄ = Σ_{i=1}^{q} p(s_i) l_i   (48)
This parameter L̄ is very useful for measuring the economy of each source code. For instance, a source code having an average word length L̄ less than or equal to that of all other codes using the same code alphabet for the same information source is called a compact code. It is clear now that a fundamental problem of source coding consists in the search for and definition of compact codes for the different information sources. A general theoretical solution to the above problem is given by the first Information Theory theorem, or first Shannon theorem for source coding (Shannon and Weaver, 1949; Shannon, 1959). Substantially, this theorem establishes a general bound for the average word length L̄ in relation to the information source entropy. In simplified form this bound is expressed by the following relation

H_r(S) ≤ L̄   (49)

where H_r(S) is the source entropy, measured with a logarithm in base r. The bound can also be set as

H_r(S) ≤ L̄_n/n   (50)

where L̄_n represents the average word length of the nth extension of the information source, with the following limit

lim_{n→∞} (L̄_n/n) = H_r(S)   (51)
In general the price which is paid to reduce L̄ or L̄_n/n is represented by the complexity of the source coding. The above fundamental theorem permits us now to also define in a rigorous way the efficiency of each source code [according to Eq. (49)]

η = H_r(S)/L̄   (52)

and the redundancy of the source code as

1 - η = [L̄ - H_r(S)]/L̄   (53)

The above Eqs. (52) and (53) permit one to compare different codes for the same information source, selecting the one which has the higher efficiency (higher value of η) or less redundancy.
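Eqs. (48), (52), and (53) translate directly into code; the probabilities and code-word lengths below are illustrative, not drawn from the text:

```python
import math

def entropy(probs, r=2):
    """Source entropy H_r(S) in base r, as in Eq. (49)."""
    return -sum(p * math.log(p, r) for p in probs if p > 0)

def average_length(probs, lengths):
    """Average code-word length, Eq. (48)."""
    return sum(p * l for p, l in zip(probs, lengths))

def efficiency(probs, lengths, r=2):
    """Code efficiency eta = H_r(S) / L, Eq. (52); redundancy is 1 - eta."""
    return entropy(probs, r) / average_length(probs, lengths)
```

For a source with probabilities (0.5, 0.25, 0.25) and word lengths (1, 2, 2), both H_2(S) and L̄ equal 1.5 bits, so the efficiency is 1 (a compact code); the fixed-length code (2, 2, 2) for the same source has efficiency 0.75.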
V. CAPPELLINI
An example of compact or optimum codes is represented by the Huffman encoding procedure, in which the length l_i of each code word is inversely related to the value of the probability p(s_i). In this way the more probable, and therefore more frequent, words are encoded in shorter sequences compared with the less probable ones (Abramson, 1963).

B. Data Compression Methods and Techniques
Many methods and techniques of source coding or data compression have been studied, defined, and applied to image processing (for local processing, image transmission, or storage). According to the first Shannon theorem (see Section IV,A), the different data compression methods can be divided into two main classes: (1) reversible methods, which permit, at least in principle, recovering through decompression (source decoding or inverse transformation) all of the original source information; (2) irreversible methods, which do not permit recovering all the original data and which therefore introduce some information loss or distortion.

The reversible methods obey and respect the source coding bound of Eq. (50), while the irreversible methods do not. Among the reversible methods, some important ones for image processing are:

(1) adaptive sampling and quantization
(2) prediction and interpolation (adaptive and nonadaptive)
(3) variable word-length coding (as the Huffman code)
(4) digital filtering (maintaining a useful spectrum extension)
(5) use of transformations (Fourier, Walsh-Hadamard, Haar, Karhunen-Loeve)

while some irreversible methods are:

(1) thresholding
(2) parameter extraction
(3) power spectrum
(4) digital filtering (with spectrum extension reduction)
(5) probability functions.
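The variable word-length (Huffman) coding listed among the reversible methods can be illustrated with a minimal construction; this is an illustrative sketch, with symbols identified by their index and probabilities chosen arbitrarily:

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Build a binary Huffman code by repeatedly merging the two least
    probable nodes. Returns a dict: symbol index -> bit string. More
    probable symbols receive shorter code words."""
    tie = count()   # tie-breaker so the heap never compares the dicts
    heap = [(p, next(tie), {i: ""}) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tie), merged))
    return heap[0][2]
```

For probabilities (0.5, 0.25, 0.125, 0.125) the resulting word lengths are 1, 2, 3, 3 bits, giving an average length of 1.75 bits, which equals the source entropy: the code is compact, and it is prefix-free (instantaneous).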
To evaluate the performance of data compression methods, distortion functions and distortion measures can be used (Berger, 1971). In practice, three measurements are evaluated: the compression ratio C_r; the peak error e_p; and the
rms error e_rms. The compression ratio C_r is defined as (Benelli et al., 1980)

C_r = L_s/L̄   (54)

where L_s is the mean source word length [in general equal to the entropy H(S)] and L̄ is the mean word length after data compression. If the source messages (image samples) are represented in the standard binary form, the compression ratio C_r can be obtained as the ratio

C_r = N_s/N_c   (55)
where N_s represents the number of bits (0 and 1 values) of the source words (representing the image samples), and N_c represents the corresponding number of bits of the code words after compression. The peak error e_p represents the peak or maximum error between the input message (image samples) and the corresponding reconstructed message after decoding or decompression, while the rms error e_rms represents the root-mean-square error between the input and reconstructed messages evaluated on a suitable block of data (a number of image samples, in particular the image samples overall). It is clear that in general data compression methods of the irreversible type give higher values of the compression ratio C_r, with the penalty however of higher e_p and e_rms errors. Furthermore, reversible methods can also become irreversible if some parameters (such as threshold values, sampling frequency, amplitude tolerances, etc.) are changed in such a way as to obtain higher compression ratios (consequently introducing higher error values). In practice, the efficiency of the different data compression methods depends on the nature of the information source (different types of images), and each method can prove more efficient for some information sources (a particular type of image) than for others. In this regard, it is very important to perform a suitable analysis of the information source (image to be processed) before the application of any data compression method: (1) space analysis; (2) space-frequency analysis (amplitude and phase spectrum); (3) statistical analysis (statistical average, amplitude distribution, autocorrelation, power spectral density).

1. Adaptive Sampling and Quantization
Adaptive sampling methods are based on the use of the minimum sampling frequency for any processed image; the minimum sampling frequency, as is well known (from the sampling theorem), is equal to twice the maximum space frequency. Adaptivity can in particular be obtained by
V. CAPPELLINI
changing the sampling frequency for some sub-blocks of the processed image; for instance, sub-blocks of 32 x 32, 64 x 64, or 128 x 128 samples are considered, and the sampling frequency is locally adjusted in each image sub-block according to the local maximum space frequencies. The maximum space frequency in the image or image sub-blocks can be found in practice through the 2D FFT (fast Fourier transform). By means of the above procedures, a reduction of the number of samples is obtained, in comparison with the use of a fixed sampling frequency. Also, the quantization level (number of bits representing the image samples) can be changed in an adaptive way from one image to another or in image sub-blocks, taking into account the actual grey-level range (for instance, low grey-level ranges require a lower number of bits). The grey-level range can be easily found through the amplitude distribution (grey-level histogram). In practice, the change of sampling frequency or quantization law from one image to another or in image sub-blocks can be identified through a special word expressing the particular sampling frequency or quantization level locally used.
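The adaptive choice of the quantization level from the grey-level range can be sketched as follows; a minimal pure-Python illustration in which all names (bits_for_block, the 2 x 2 test blocks) are hypothetical:

```python
# Minimal sketch of adaptive quantization: choose the number of bits for an
# image sub-block from its actual grey-level range (the extremes of its
# grey-level histogram), as described above.
import math

def bits_for_block(block):
    """Smallest number of bits able to represent the block's grey-level range."""
    flat = [p for row in block for p in row]
    grey_range = max(flat) - min(flat)          # actual dynamic range
    return max(1, math.ceil(math.log2(grey_range + 1)))

# A quiet sub-block needs few bits; a full-range one needs 8.
quiet = [[100, 101], [102, 103]]   # range 3: 2 bits suffice
busy = [[0, 255], [17, 200]]       # range 255: 8 bits needed
print(bits_for_block(quiet), bits_for_block(busy))
```

In a full system the chosen bit count would be stored in the special identification word that precedes each sub-block.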
2. Prediction and Interpolation

Data compression methods with prediction or interpolation are interesting, due to their relatively simple structure and reasonably good efficiency (Benelli et al., 1980). Let us consider 1D algorithms, which can be applied to image processing line by line (for instance, processing the sequence of image samples along the rows). In prediction methods a priori knowledge of some previous samples (image samples along a row) is used, while in interpolation methods a priori knowledge of both previous and future samples is utilized. In both types of operation the most widely applied technique consists in comparing the predicted or interpolated sample with the actual sample: If the difference is less than a fixed error or amplitude tolerance, the actual sample is not maintained; otherwise the actual sample is maintained. Figure 4 shows a block diagram of a typical data compression system with prediction or interpolation: The nonredundant samples (i.e., the samples for which the prediction or interpolation fails) are fed into a buffer to be reorganized at constant space intervals with the space position identification (synchronization), necessary for the reconstruction of the original data from the compressed samples. The important role of the buffer is therefore to store the incoming samples remaining after the gate, so that they can be reorganized at a uniform sampling rate. In this way, while at the input of the buffer we have only bit compression, at its output we also have bandwidth compression (sampling rate reduction). In the following, the symbols C_r and C_r' indicate the average compression
FIG. 4. Block diagram of a data compression system with prediction or interpolation (comparator, predictor/interpolator, gate, buffer, synchronization, compressed output).
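The compare-and-gate scheme just described can be sketched in code. The following minimal pure-Python example uses the simplest predictor (the last retained sample, i.e., a zero-order prediction with floating aperture) and keeps the positions of the retained samples as the synchronization information; all names are illustrative:

```python
# Sketch of predictive compression with a zero-order predictor and floating
# aperture: a sample is discarded while it stays within +/- delta of the
# last retained sample; retained samples are stored with their positions
# (the synchronization information) for later reconstruction.

def zop_compress(samples, delta):
    kept = [(0, samples[0])]                 # always keep the first sample
    for i, s in enumerate(samples[1:], start=1):
        if abs(s - kept[-1][1]) > delta:     # prediction failed: keep sample
            kept.append((i, s))
    return kept

def zop_reconstruct(kept, n):
    """Zero-order hold: repeat the last retained value between positions."""
    positions = dict(kept)
    out, value = [], kept[0][1]
    for i in range(n):
        value = positions.get(i, value)
        out.append(value)
    return out

line = [10, 10, 11, 10, 30, 31, 30, 12, 12]
kept = zop_compress(line, delta=2)
print(kept)                                  # nonredundant samples only
print(zop_reconstruct(kept, len(line)))
```

The reconstruction error is bounded by the chosen tolerance delta, which is the peak error e_p of this scheme.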
ratios, respectively, with and without including the bits added for the space identification or synchronization. Prediction algorithms are utilized according to the following difference equation

f_p(n) = f(n - 1) + Δf(n - 1) + Δ²f(n - 1) + ... + Δ^N f(n - 1)    (56)
where f_p(n) = predicted sample at space position nX (X being the space sampling interval along the image rows and columns); f(n - 1) = sample value at the previous space position (n - 1)X; Δf(n - 1) = f(n - 1) - f(n - 2), ..., Δ^N f(n - 1) = Δ^{N-1} f(n - 1) - Δ^{N-1} f(n - 2). The value of N corresponds to the order of the prediction algorithm: with N = 0 we obtain the zero-order predictor (ZOP) and with N = 1 the first-order predictor (FOP). For the ZOP algorithm, several procedures can be followed: (1) ZOP with fixed aperture, in which the dynamic range of the data is divided into a set of fixed tolerance bands with a width of 2δ; if f(n - 1) is the last remaining sample, f(n) is not maintained when it lies in the same tolerance band; (2) ZOP with floating aperture, where a tolerance band ±δ is placed about the last remaining sample; if the following sample lies in this band, it is not maintained (in this case the next samples are compared again with the value of the last remaining sample ±δ and so on); (3) ZOP with offset aperture, in which the predicted sample is f_p(n) = f(n - 1) ± δ, where δ is a prefixed quantity and the sign + is used if the last remaining sample is out of tolerance in the positive direction and vice versa. In the FOP algorithm with floating aperture the first two samples are maintained and a straight line is drawn through them, placing an aperture ±δ
about that line: If the actual sample f(n) is within this aperture, it will not be maintained, and the line will be extrapolated for a following space interval X and so on. Data compression algorithms using interpolation differ from the corresponding ones with prediction, due to the fact that with interpolation both previous and following samples are used to decide whether or not the actual sample is redundant. The more interesting algorithms, based on low-order interpolation, are the zero-order interpolator (ZOI) and the first-order interpolator (FOI). Adaptive data compression methods with prediction or interpolation represent an improvement on the preceding ones of the nonadaptive type, especially when the input images to be processed have high activity (variation of the space and frequency behavior from one image part to another). These methods can be divided into linear and nonlinear ones depending on the specific procedure used for the adaptivity implementation. Of high interest is the adaptive-linear prediction (ALP) method, in which the predicted sample f_p(n) is evaluated by a linear weighting of M previous samples (Benelli et al., 1980)

f_p(n) = Σ_{k=1}^{M} β_k f(n - k)    (57)
where β_k are suitable weighting coefficients. If the prediction error falls within a given threshold value γ, the actual sample is not maintained. If the considered process is a stationary Gaussian series with zero mean, the coefficients β_k can be determined in such a way as to minimize the mean-square prediction error given by

ε = Σ_{i=1}^{N} [ f(n - i) - Σ_{k=1}^{M} β_k(M, N) f(n - i - k) ]²    (58)
M being the number of preceding samples used for the prediction and N the number of samples which the predictor uses to learn the evolution of the image samples (line by line). The method turns out to be advantageous as long as the statistical characteristics of the image are maintained over suitably extended regions. In practice a counter can be used to measure the number of consecutive predictions affected by error; when this number exceeds a prefixed value T, a new set of M coefficients is computed again and the algorithm goes on with the new coefficients.

3. Differential Pulse-Code Modulation and Delta Modulation
The general block diagram of differential pulse-code modulation (DPCM) is shown in Fig. 5. A predicted sample f_p(n) is evaluated through a linear
FIG. 5. Block diagram of a DPCM system.
weighting of the M previous samples (Benelli et al., 1980)

f_p(n) = Σ_{k=1}^{M} β_k f(n - k)    (59)
The predicted samples can be obtained using any of the prediction algorithms, such as ZOP, FOP, ALP, etc. The difference e_n between the actual sample and the predicted one is quantized with quantization intervals of amplitude A and encoded in a code word of fixed length. If the image samples have high correlation and the weighting coefficients are correctly chosen, DPCM generally offers a higher efficiency with respect to the usual binary coding; with an equal number of bits, DPCM assures a higher signal-to-quantization-noise ratio (SNR), or with an equal SNR it requires a lower number of bits. Many adaptive DPCM methods (ADPCM) have also been studied and defined. In general the A value is varied, becoming smaller when the grey level is quiescent and vice versa, or the length of the prediction interval (M) is changed, according to the signs and values of some previous differences between the predicted and the actual samples. A special data compression method, which can be considered as a DPCM with a 1-digit code, is represented by delta modulation (DM). In the DM method, the changes in the grey level between consecutive samples are substituted for the absolute grey-level values. These changes are represented in the form of binary pulses, whose sign (+ or -) depends on the sign of the amplitude change. Figure 6a shows the block diagram of a DM system, while Fig. 6b outlines the main wave forms at different points of the system. In the classical DM a single binary pulse is obtained at each sampling interval instead of a complete code word; the output pulse is in this case synchronous with the input word stream, yielding a constant compression ratio. Errors in the reconstructed data can, however, appear due to two effects: the approximation of the input wave form (grey-level variation) to a step function (granular or quantization noise); and quick variations of the input wave form, which cannot be followed with accuracy. Regarding this last
FIG. 6. DM system: (a) block diagram; (b) main wave forms at the different points of the system.
aspect, input variations cannot be followed for which the gradient of the sampled data exceeds the limit

g = A r    (60)
where A is the change in amplitude of a DM pulse and r is the rate of the pulses (pulse space frequency for the processed image). The distortion due to this effect is also known as slope-overload distortion. Many studies have been developed to analyze the efficiency of the classical DM and to increase that efficiency (Benelli et al., 1980). A first method is based on changing the step amplitude according to the wave-form variations:
The step amplitude is increased when a given number N_0 of consecutive samples have the same binary value, and it is decreased in the contrary case (high information delta modulation, HIDM). Another interesting modification of the classical DM is basic asynchronous delta modulation (BADM), in which the sampling rate is increased during intervals of high activity (rapid dynamic range variations) and decreased in lower activity intervals. A special technique, called operational asynchronous delta modulation (OADM), avoids the errors corresponding to rapid amplitude variations in the following way: When the difference between the input and the reconstructed samples exceeds a prefixed tolerance value, the algorithm goes back m samples and inverts the A value, adjusting the sampling interval appropriately.

4. Use of Digital Filtering

The use of digital filtering (1D and 2D) is a very useful approach for data compression, for several reasons (see also Section V). First, if the useful information is concentrated in a limited frequency band, digital filtering can extract this band, in particular through low-pass or bandpass filtering; indeed, a lower number of data are required to represent the extracted limited band (in comparison with the overall spectral extension), and hence a data compression result is obtained. Further, low-pass digital filtering is useful in preprocessing before the application of particular data compression methods, because the smoothed data can be more efficiently compressed by specific compression algorithms. The joint use of digital filtering and data compression methods is presented in detail in Section V.

5. Use of Transformations
Orthogonal transformations, such as Fourier, Hadamard-Walsh, Haar, Karhunen-Loeve, etc., in particular in discrete or digital form, can be used for data compression, due to the fact that in general they provide a more compact representation of image data. This means that the transformed data become defined and exist in a smaller region or domain than the original data; a lower number of significant transformed data then results. The 2D discrete Fourier transform (DFT) is defined as in Pratt (1978)

F(k₁, k₂) = (1/N) Σ_{n₁=0}^{N-1} Σ_{n₂=0}^{N-1} f(n₁, n₂) exp[-2πi(n₁k₁ + n₂k₂)/N]    (61)

while the inverse discrete Fourier transform (IDFT) is expressed as

f(n₁, n₂) = (1/N) Σ_{k₁=0}^{N-1} Σ_{k₂=0}^{N-1} F(k₁, k₂) exp[2πi(n₁k₁ + n₂k₂)/N]    (62)
where the k₁, k₂ indices correspond to frequencies (ν₁ = k₁ Δν, ν₂ = k₂ Δν, Δν being a constant space-frequency interval). With suitable symmetry properties, the discrete cosine transform and sine transform (DCT and DST, respectively) can be used. The DCT transform can be expressed in the following way

F_C(k₁, k₂) = c(k₁) c(k₂) Σ_{n₁=0}^{N-1} Σ_{n₂=0}^{N-1} f(n₁, n₂) cos[(2n₁ + 1)k₁π/2N] cos[(2n₂ + 1)k₂π/2N]    (63)

with c(0) = √(1/N) and c(k) = √(2/N) for k > 0.
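As an illustration of the cosine kernel, here is a small pure-Python sketch of the 1D orthonormal DCT (the 2D transform applies it along the rows and then along the columns); the function name and test signal are hypothetical:

```python
# Sketch of the 1D orthonormal DCT (type II); the 2D DCT applies it first
# along the rows and then along the columns.  Pure Python, for illustration.
import math

def dct(x):
    N = len(x)
    out = []
    for k in range(N):
        c = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        out.append(c * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                           for n in range(N)))
    return out

# A slowly varying signal compacts its energy into the low-order
# coefficients, which is what makes transform coding effective.
X = dct([10.0, 11.0, 12.0, 13.0])
print([round(v, 3) for v in X])
```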
The Hadamard transform is based on the properties of the Hadamard matrix (a square matrix with elements equal to ±1, having orthogonality between the rows and columns). A normalized Hadamard matrix, of N x N size, satisfies the relation

H Hᵀ = I    (64)

The orthonormal Hadamard matrix of lowest order is the 2 x 2 Hadamard matrix

H₂ = (1/√2) [ 1   1
              1  -1 ]    (65)
The above transform is also known in the literature as the Walsh transform (WT). A frequency interpretation of the above Hadamard matrix is indeed possible: the number of sign changes along any Hadamard matrix row, divided by 2, is called the sequency of the row. The rows of a Hadamard matrix of order N can also be considered as samples of rectangular functions having a subperiod equal to 1/N; these functions are called Walsh functions (Pratt, 1978). The Haar transform is based on the Haar matrix, which contains elements equal to ±1 and 0. One of the most efficient transforms is the Karhunen-Loeve transform (KLT), which can be defined in the following way

F(k₁, k₂) = Σ_{n₁=0}^{N-1} Σ_{n₂=0}^{N-1} f(n₁, n₂) A(n₁, n₂; k₁, k₂)    (66)
where the A(n₁, n₂; k₁, k₂) kernels satisfy the relation

Σ_{n₁'} Σ_{n₂'} C(n₁, n₂; n₁', n₂') A(n₁', n₂'; k₁, k₂) = λ(k₁, k₂) A(n₁, n₂; k₁, k₂)    (67)
where C(n₁, n₂; n₁', n₂') denotes the covariance function of the image data and λ(k₁, k₂) are constants (eigenvalues of the covariance function) for fixed values of k₁ and k₂. The above-considered discrete transforms can in general be evaluated in a fast form; the computation is divided into a sequence of subsequent computing steps in such a way that the results of the first computing steps
(partial results) can be utilized repetitively in subsequent steps. Efficient software packages are available for fast Fourier transforms (FFT) and fast Walsh transforms (FWT). For instance, the number of operations required to evaluate a 1D FFT becomes N log₂ N instead of N² for the DFT, and to evaluate a 1D FWT, N log₂ N instead of N² for the DWT (the difference between DFT-FFT and DWT-FWT is that for the first type of transform the operations are complex multiplications and additions, while for the second the operations are additions and subtractions). As already outlined above, the transformed data constitute a compact representation of the original image data; the number of significant transformed data is appreciably smaller than the number of original image data. For instance, an image constituted by a regular smoothed variation of grey levels will be represented by few Fourier components (few FFT data), while an image containing sudden variations in the grey level (nearly rectangular grey-level variations along the rows and columns) will be represented by few Walsh components (few FWT data). Further, the transformed data can be compressed in a stronger way by applying simple algorithms such as thresholding (for instance, setting to zero the values under a small threshold such as a few percent) or prediction interpolation. The block diagram of a system applying this last approach, in particular to verify the efficiency of thresholding or the ZOP algorithm also through suitable displays, is shown in Fig. 7. Variable-word-length coding can also be used for different transformed data blocks. In practice, the transformed data are divided into several squares and a minimum word length (a bit number sufficient to represent the maximum absolute amplitude value in the square, plus 1 bit for the sign) is employed for each of them.
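The points above (an FWT computed with additions and subtractions only, followed by thresholding of small coefficients) can be illustrated with a short pure-Python sketch; the names and the test line are hypothetical:

```python
# Sketch of transform-domain compression with a fast Walsh-Hadamard
# transform: only additions and subtractions are used, and small
# coefficients are then set to zero by thresholding.

def fwht(x):
    """Fast Walsh-Hadamard transform; len(x) must be a power of 2."""
    x = list(x)
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            for j in range(i, i + h):
                x[j], x[j + h] = x[j] + x[j + h], x[j] - x[j + h]
        h *= 2
    return x

def threshold(coeffs, t):
    return [c if abs(c) >= t else 0 for c in coeffs]

# A nearly rectangular grey-level profile needs only a few Walsh components.
line = [40, 40, 40, 40, 200, 200, 200, 200]
coeffs = fwht(line)
kept = threshold(coeffs, t=1)
print(coeffs)
print(sum(1 for c in kept if c != 0), "nonzero coefficients out of", len(kept))
```

Applying the same transform again and dividing by N recovers the original line, so only the few retained coefficients need to be stored.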
In the actual storing or transmission of the processed image data, an additional fixed-length word is inserted before each square of data, in order to specify the number of bits used to represent the square coefficients. With reference to the 2D FWT, if N = 2ⁿ, with n an integer, is the number of rows and columns of the sampled image and L is the number of grey levels, the maximum value which the transform will assume (corresponding to the addition of all the image samples) will be N²L. If q is the quantization value for the transformed data, the number of bits required to specify the word length used in a square will be

n_b = log₂[log₂(N²L/q + 1)]    (68)

where the logarithms are rounded to the next highest integer. A modification of the above method for image data compression by means of the 2D FFT or FWT consists in applying the same procedure of variable-word-length coding of the transformed data in a limited number of transformed
FIG. 7. Block diagram of a data compression system using FFT with thresholding or a ZOP algorithm.
image subareas. In particular, no value need be maintained for those subareas where the sum of the absolute values of the transformed data is below a given threshold. From a computational viewpoint, we can further outline the following comparison considerations:
(1) the 2D FFT is in general more efficient for images having continuous regular or smoothed variations (sine-wave type) in the grey level;
(2) the 2D FWT is more efficient for images having sudden variations (of a rectangular type) in the grey level;
(3) the Karhunen-Loeve transform is the most efficient, but it is much more complex than the others, and no fast computing routine is available for its use.

V. JOINT USE OF TWO-DIMENSIONAL DIGITAL FILTERS AND DATA COMPRESSION

The above-considered 2D digital filtering and data compression operations can be joined together with significant advantages for digital image-processing efficiency. The two operations are in general performed in cascade, one after the other. Most images indeed require some kind of filtering to smooth the data or to perform space-frequency corrections and obtain enhancement, in general with the goal of reducing the noise or disturbances and of obtaining higher-quality images. Data compression is, as already outlined, a desirable operation after filtering to reduce the amount of data, which is becoming a tremendous problem for the practical use of images in many application areas. Further, the combination of the two operations can be attractive to increase the compression efficiency (see Section IV,B,4); smoothed data after low-pass filtering can surely be more efficiently compressed by the different data compression algorithms, because they now operate on 2D data having lower space-frequency values. In the following, some typical connections of the two digital operations are first presented; then a special new system, based on digital filtering and data reduction, is described for digital comparison and correlation of digital images having different space resolutions.

A. Some Typical Connections of the Two Digital Operations
Local space operators, 2D digital filters, and data compression can be connected in different useful ways to obtain some specific results on the processed image.
(1) Low-pass filtering (by means of a local space smoother or 2D digital filtering) and thresholding: high space-frequency components due to noise and disturbances can be reduced, and hence a binary image can be obtained, where the more useful data are maintained (by selecting suitable threshold values).
(2) High-pass filtering (by means of local differential operators or 2D digital filtering) and thresholding: image enhancement is performed, giving higher contrast to image structures (grey-level variations), and hence a binary image is obtained, from which some useful structures and patterns can be extracted (by selecting suitable threshold values).
(3) Low-pass filtering [as in (1)] and compression by means of prediction interpolation, DPCM, DM, or variable-word-length coding: high space-frequency noise is reduced, and in the meantime the smoothed data are more efficiently compressed.
(4) Low-pass filtering [as in (1)] followed by edge detection and hence spike elimination (by means of nonlinear operators as in Section III,A): high space-frequency noise is reduced (for instance, random noise), useful edges and boundaries are extracted (representing a compressed form of the image), and further high-amplitude spikes or scintillation noise are eliminated.
(5) Low-pass filtering [as in (1)] and compression by means of digital transformations (2D FFT or FWT): the same result as in (3) can be obtained.
(6) Use of the 2D FFT to perform filtering and compression: once the 2D FFT has been evaluated, high space-frequency components can be discarded to obtain a filtering effect (smoothing with noise reduction); hence the remaining 2D FFT components can be reduced with thresholding or variable-word-length coding (see Section IV,B,5).

B. Processing System for Digital Comparison and Correlation of Images Having Different Space Resolution

In many application areas, such as remote sensing, biomedicine, and robotics, an important practical problem is represented by the availability of several images given by different sensors or equipment and regarding the same scene (land region, body organ, mechanical object, ...). In general these images are taken from different viewpoints and have different space resolutions. Increasingly often a processing goal is to obtain integrated images or maps, where the data from the different images pertaining to the same observed scene are suitably correlated (for instance, through a simple addition, difference, or specific weighting of one image's data by the other images' data). To solve the above problems, there are two types of digital processing to be performed:
(1) geometrical corrections and rotations with a change of viewpoint (to refer the different images to the same viewpoint, or to the point at infinite distance and orthogonal position, producing orthoimages);
(2) space resolution variations, in such a way as finally to have images with the same space resolution, which can actually be integrated.
While for point (1) there are several geometrical transformations available using trigonometric functions, for point (2) there are few approximation procedures. In the following a rigorous method is presented, based on 2D digital filtering and data reduction (Cappellini et al., 1984a). Let us consider two images f₁(n₁, n₂) and f₂(n₁, n₂) in digital form, the first with high space-frequency resolution or definition and a space-sampling interval X₁, the second with lower space-frequency resolution and a sampling interval X₂ > X₁. Practically, if m = X₂/X₁, to one pixel of the image f₂(n₁, n₂) there correspond m² pixels of the image f₁(n₁, n₂). Several approaches can be used to obtain from the high-definition image f₁(n₁, n₂) an image g₁(n₁, n₂) having a space-sampling interval X₂ equal to that of the lower-definition image [g₁(n₁, n₂) is a compressed form of f₁(n₁, n₂)].
One simple technique corresponds to evaluating the g₁(n₁, n₂) data as the usual average of the f₁(n₁, n₂) data [see also Eq. (24)]; that is (with m odd),

g₁(n₁, n₂) = (1/m²) Σ_{k₁=-(m-1)/2}^{(m-1)/2} Σ_{k₂=-(m-1)/2}^{(m-1)/2} f₁(mn₁ + k₁, mn₂ + k₂)    (69)
The g₁(n₁, n₂) image obtained in this way represents a rough smoothed version of the original high-resolution image f₁(n₁, n₂). A second, more refined approach consists in evaluating the g₁(n₁, n₂) data as a weighted average of the f₁(n₁, n₂) data; that is,

g₁(n₁, n₂) = Σ_{k₁=-(m-1)/2}^{(m-1)/2} Σ_{k₂=-(m-1)/2}^{(m-1)/2} w_a(k₁, k₂) f₁(mn₁ + k₁, mn₂ + k₂)    (70)
The weights w_a(k₁, k₂) in the above relation define the form of the smoothing operation which is performed on the f₁(n₁, n₂) data; it is easy to verify, for instance, that with w_a(k₁, k₂) = 1/m² Eq. (70) is equivalent to Eq. (69). It can appear reasonable to give, in general, greater weight to the central pixels of the m x m subimage of the f₁(n₁, n₂) image with respect to the peripheral ones. For this purpose one solution corresponds to using a linear weighting resulting in a conical (or pyramidal) function in the 2D domain; another solution consists in using a Gaussian weighting function (the 2D function can be easily obtained through the circular rotation of a 1D Gaussian function). The above-described techniques perform a smoothing operation on the high-resolution image f₁(n₁, n₂) in a heuristic way to obtain the image
g₁(n₁, n₂) to be compared and correlated with the lower-resolution image f₂(n₁, n₂). A more rigorous and precise system is based on the use of a 2D digital filter of the low-pass type with circular symmetry (see Section II). The precise steps of this system are the following:
(1) to perform a low-pass, circular-symmetry, 2D digital filtering with a cutoff frequency ω_c/2π = 1/2X₂, obtaining the filtered image g₁(n₁, n₂);
(2) to reduce or "decimate" the obtained data g₁(n₁, n₂) up to a space-sampling interval equal to X₂, that is, to obtain the image (in digital form) g₂(n₁, n₂) = g₁(n₁X₂, n₂X₂).
The above digital operations indeed remove from the 2D spectrum of the high-resolution image f₁(n₁, n₂) the space-frequency components greater than ω_c/2π, giving therefore an image which, as regards space resolution, is directly comparable to the lower-resolution image f₂(n₁, n₂). The two digital images g₂(n₁, n₂) and f₂(n₁, n₂) now turn out to have grey-level variations in the different space directions with the same maximum space frequency. It is interesting to observe that the 2D digital filter used in this last rigorous system includes, as particular cases, the approaches defined by Eqs. (69) and (70). The first is obtained by setting the coefficients of the 2D digital filter a(k₁, k₂) = 1/m² [see Eqs. (2) and (9) for the FIR case]; the second results by setting a(k₁, k₂) = w_a(k₁, k₂).
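As a minimal sketch of the filter-and-decimate idea, the following pure-Python code implements the simple block-average case [the a(k₁, k₂) = 1/m² FIR kernel]; in the rigorous system a circular-symmetry low-pass filter would take its place. Names are illustrative:

```python
# Sketch of resolution reduction by filtering and decimation, using the
# simple m x m block average as the low-pass kernel: averaging
# non-overlapping blocks combines the filtering and decimation steps.

def block_average(image, m):
    """One output pixel per non-overlapping m x m block of the input."""
    rows, cols = len(image), len(image[0])
    return [[sum(image[m * i + a][m * j + b]
                 for a in range(m) for b in range(m)) / (m * m)
             for j in range(cols // m)]
            for i in range(rows // m)]

f1 = [[10, 10, 20, 20],
      [10, 10, 20, 20],
      [30, 30, 40, 40],
      [30, 30, 40, 40]]
g2 = block_average(f1, m=2)
print(g2)   # one pixel of g2 per 2 x 2 block of f1
```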
VI. APPLICATIONS

In the following, some examples of applications of 2D digital filters, local space operators, and data compression to such important fields as communications, remote sensing, biomedicine, and robotics are presented.

A. Applications to Communications
In a communications system a message is transmitted from one place to another through different physical communication channels (lines, cables, satellite links, optical fibers, etc.). If the transmitted message is a signal s(t) (Fig. 8), the receiver produces an estimate s_e(t) of the original source message, trying to reduce the noise and degradation introduced by the channel. Often many messages of the same or different type have to be sent in parallel to utilize the communications medium in a more efficient way. Multiplex communication systems are used for this purpose. Two important types of multiplex systems are represented by frequency division multiplex
FIG. 8. General block diagram of a communication system.
(FDM) and time division multiplex (TDM). In the first the single messages are set in adjacent frequency bands, while in the second the messages are organized in subsequent time intervals, in general by sending one sample of each message after the other in a cycle or frame and sending one frame after the other. By representing each sample in the TDM system in digital form (a word of a given number of bits), a pulse-code-modulation (PCM) multiplex system is obtained. Digital communications of the PCM type have expanded recently in an exponential manner, due to the main aspects outlined in the Introduction (Section I). In digital communications it is easy to apply digital operations such as those previously described (digital filtering, local space operators, data compression). Considering the transmission of images, these are in general converted into a video signal through a scanning procedure, and then this signal is sampled and set in digital form. The digital operations described in the previous sections can be usefully applied to this digital signal both in transmission and reception, according to the general block diagram in Fig. 9. In transmission, 1D digital filtering can be performed to reduce high-frequency noise components and exactly define the bandwidth. 2D digital filtering and local space operators can also be applied, if a suitable memory or buffer is available, processing image data in such a way as to reduce high space-frequency components (image smoothing) or to obtain image enhancement. Hence data compression can be performed to maintain the more significant data. By means of filtering and compression, a bandwidth reduction or bandwidth compression is achieved, which is a very important result to increase the efficiency of the digital communication system (the same image data can be transmitted by using a smaller bandwidth, or other data can be transmitted with the image by using the same bandwidth).
A channel coder can be added after compression to protect the remaining important data against channel noise and disturbances (Benelli et al., 1977, 1984). In reception, after the eventual channel decoder for error detection and correction, the image data are reconstructed (decompression), also utilizing synchronization data (given by a sequence detector), and hence digital filtering is performed (1D and 2D type) to reduce remaining channel noise or to
FIG. 9. Block diagram of a digital communication system, using digital filtering, data compression, and error-controlled coding operations.
increase image quality. With reference to this last filtering processing, 1D and 2D digital filters can indeed be applied in the following way:
(1) 1D digital filtering can reduce channel noise and degradation (such as multipath and Doppler effects) through channel equalization (fixed and adaptive type) and matched filtering (Cappellini et al., 1978);
(2) 2D digital filtering can reduce space-frequency components due to noise and perform image restoration (inverse filtering) and enhancement.
Let us now consider in more detail the digital transmission of time-varying images (television) and of time-fixed or static images. In the first case, for good reproduction of movement, image sequences are required at a sufficient rate. In European TV standards, for instance, 25 images/s are used. By using 625 lines/image and 8 bits/sample, transmission rates of 50 Mbits/s are obtained. To reduce this high value, the analysis of two subsequent images can be performed in such a way as to take into account, for instance, only the actually moved parts in the transition from one image to the subsequent one (interframe techniques). DPCM techniques (see Section IV,B,3) can be used for this purpose; through variable-word-length coding, mean word lengths of 2-2.5 bits/sample are obtained, reducing the transmission rate to 10-20 Mbits/s. A suitable encoding, as with prediction interpolation or, more efficiently, with digital transformations (see Section IV,B), can also be performed within each single image (intraframe techniques). Combining interframe and intraframe techniques, lower bit rates (2-10 Mbits/s) are obtained, at the expense of higher complexity and cost. An interesting approach to reduce the redundancy in TV images, by means of inter-intraframe coding, is based on movement compensation: The coding consists essentially in determining for each pixel the prediction model (spatial, temporal) and transmitting, when necessary, the quantized difference and the prediction model changes (Brofferio et al., 1975). In the case of videotelephone or teleconference systems, due to the lower number of image points in movement from one image to another, and by using DPCM techniques with variable-word-length coding, 0.9-1 bits/sample are sufficient. If information about moving image objects or parts is suitably used, values of 0.4-0.5 bits/sample are reached. For instance, for an object translation, the shift and direction values can be sent to the receiver, which will also reconstruct the grey levels of the moved image parts. Transmission rates on the order of a few Mbits/s are thus obtained.
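The bit-rate figures quoted above follow from simple arithmetic. In the sketch below the number of samples per line (400) is an assumption chosen to be consistent with the quoted 50 Mbits/s; it is not stated in the text:

```python
# Worked bit-rate arithmetic for the TV figures quoted above.  The
# samples-per-line value (400) is an assumption chosen to match the
# 50 Mbits/s figure; it is not given explicitly in the text.
frames_per_s = 25
lines_per_frame = 625
samples_per_line = 400          # assumed
bits_per_sample = 8

pcm_rate = frames_per_s * lines_per_frame * samples_per_line * bits_per_sample
dpcm_rate = pcm_rate * 2.5 / 8  # variable-word-length DPCM at 2.5 bits/sample

print(pcm_rate, "bits/s straight PCM")
print(dpcm_rate, "bits/s with DPCM coding")
```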
For instance, for an object translation, the shift and direction values can be sent to the receiver, which will then reconstruct the grey levels of the moved image parts. Transmission rates on the order of a few Mbits/s are thus obtained. In the second case of static images, two practical situations can be considered: transmission of written documents and sheets, and transmission of photos (telephoto). Regarding documents and sheets, let us consider a standard A4 sheet (29.7 x 21.0 cm). To describe the written information and transmit it to the
V. CAPPELLINI
receiver (as in facsimile), an efficient method is the use of an optical reader of the written characters. If the sheet contains 30 lines, each with 70 characters, there are 2100 characters in a sheet. If, for instance, 7 bits are utilized for the transmission of each character, 14,700 bits are required for the representation of the sheet. If, however, variable-word-length coding is used, taking into account the character probabilities (see Section IV,A), a lower number of bits is required (as an example, for the English language a mean word length, equal to the source entropy, of about 4.2 bits/character results, with about 8800 bits required to represent the sheet). A more economic system can scan the sheet (assumed to be black characters on a white background) with 1200 lines (at least 4 lines/mm are required) and represent each line by 800 equidistant space samples (points). Each sample being of binary value, 960,000 bits are required to represent the whole sheet (an amount much greater than the previous one). A more efficient coding is obtained through the representation of the lengths of the black or white point sequences (run lengths) by means of variable-word-length coding: Mean values of 0.3-0.4 bits/point are sufficient. By using the variations of the above sequences line by line (comparing one line with the following one) and suitable encoding of these variations, mean values of 0.1-0.2 bits/point are obtained (with 96,000-180,000 bits required to represent the whole sheet). As concerns the representation and transmission of black and white photos, the number of data required is much higher. Let us consider a telephoto of 13 x 18 cm. To have sufficient space resolution, about 8.3 lines/mm are required, and therefore 1500 lines with 1100 samples/line result.
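The run-length scheme for binary document scanning described above can be sketched as follows; the fixed 10-bit run code is an illustrative assumption standing in for a real variable-word-length code.

```python
def run_lengths(line):
    """Lengths of the alternating white/black runs along one scan line."""
    runs, count = [], 1
    for a, b in zip(line, line[1:]):
        if a == b:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs

# A mostly white scan line (0 = white, 1 = black) with two black strokes,
# as on a typical text document.
line = [0] * 300 + [1] * 12 + [0] * 400 + [1] * 8 + [0] * 80
runs = run_lengths(line)

# Coding each run with a fixed 10-bit word (an assumption; real schemes
# use variable-word-length codes) already costs far less than 1 bit/point.
bits_per_point = 10 * len(runs) / len(line)
```

On such sparse lines the cost per point is tiny; averaged over whole documents with denser lines, the 0.3-0.4 bits/point of the text is the realistic figure.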
The overall amount of data, representing the grey level of each sample by 7 bits, is therefore 11.5 Mbits (by using transmission at 4800 bits/s, 40 minutes are required for the transmission of a complete photo). By means of data-compression techniques, such as DPCM with variable-word-length coding (see Section IV,B,3), a mean word length of 2-3 bits/sample is obtained, and by means of digital transformation (see Section IV,B,5) a value of 1-2 bits/sample and less can be reached. With reference to this last approach, Fig. 10 shows an example of the application of the discrete cosine transform (DCT, implemented in a fast way, FCT) to a typical photograph of Florence (the Old Bridge). Thresholding compression of the transformed data is used, representing image square sub-blocks containing Nc x Nc data (transformed data below a threshold are neglected). Fig. 10a shows the original digitized photo, and Fig. 10b the reconstruction in the case Nc = 16 (mean word length = 0.6 bits/sample, e_p = 16.8674, and e_r = 2.13%). Finally, it is important to observe that all the above data-compression and data-rate-reduction results can be improved if 2D digital filtering or local
FIG.10. Example of the application of the fast cosine transform (FCT) to perform data compression on a photograph of the Old Bridge in Florence: (a) original digitized photo; (b) reconstructed photo (with 0.6 bits/sample).
space processing (as of the low-pass filtering type) is performed before the application of data-compression techniques.

B. Applications to Remote Sensing
Many remote sensing images and maps are currently collected by platforms aboard aircraft and satellites. Indeed, passive remote sensing systems, such as optical cameras, multispectral scanners (MSS), and microwave radiometers, and active remote sensing systems, such as side-looking radar (SLR) and laser radar (lidar), give an impressive amount of images and data. These images, maps, and data in general need to be processed to improve their quality (geometric and sensor corrections, noise reduction, enhancement, ...) and to obtain final useful results (extraction of specific regions and land-sea areas, as for agriculture investigation or water-resource monitoring). 2D digital filters, local space operators, and data compression are indeed very useful digital operations to achieve the above-outlined goals. 2D digital filters or local space operators can be applied as a preprocessing operation to smooth the image data (by means of low-pass filtering), to perform a space-frequency correction, or to obtain enhancement (by means of high-pass or bandpass filtering), also extracting edges and boundaries. In particular, after enhancement better-quality images can in general be obtained, and through edge extraction different earth regions can be recognized and classified, with easy evaluation of the corresponding areas. Data compression can be applied, in general after some type of filtering, to reduce the amount of data, which is becoming a tremendous problem for the practical use of satellite data and aircraft photos of large earth areas (Cappellini, 1980). In the following some typical processing examples are given regarding the application of the above digital operations. Figure 11 shows an example of the application of filtering and edge extraction to an aircraft photo (a region south of Florence). In Fig. 11a the digitized image is shown, while in Fig.
11b the result of processing by means of the nonlinear filtering operator defined by Eq. (26) (nonlinear smoother) followed by the isotropic-gradient edge detector [Eq. (37)] appears. As it appears, the main different ground regions are isolated; adding the grey-level information, three classes can easily be obtained: forest (black and high-intensity grey levels), wine grapes and olive plants (medium-intensity grey levels), and other ground regions. Another very simple processing example is shown in Fig. 12. A LANDSAT-C image of the Tyrrhenian coast (at the bottom the Arno River appears) is processed first through grey-level expansion (stretching, see Section
DIGITAL FILTERS AND DATA COMPRESSION
FIG. 11. Example of the application of nonlinear smoothing and edge extraction to an aircraft photo (region south of Florence): (a) digitized photo; (b) processed result.
III) and then through thresholding. Figure 12a shows the original image, and
Fig. 12b gives the final result, which practically corresponds to an estimate of the water resources in the analyzed region. Figure 13 shows an example of the application of a 2D FIR digital filter of the high-pass type [Eq. (2)] to a LANDSAT-C image. At the right is a part of the original image (North Africa), while at the left the filtered image appears. As is clear, a good enhancement effect results, which can be very useful for extracting some significant regions; further, through thresholding as in Fig. 12b, final estimates regarding these regions can be obtained (Cappellini, 1984). Figure 14 shows an example of the application of data compression with a ZOP algorithm and floating tolerance (see Section IV,B,2) to an ERTS-1 image. At the left the original image is shown, and at the right the reconstructed one after compression (an average compression ratio Cra = 1.56 is obtained). Figure 15 gives another example of the application of data compression on the same ERTS-1 image, using the 2D FWT with variable-word-length coding
FIG. 12. Example of the application of stretching and thresholding to a LANDSAT-C image (Tyrrhenian coast with the Arno River at the bottom): (a) original image; (b) processed result.
of transformed data blocks (4 x 4). At the left is the original image, in the middle the reconstructed one with q/N = 4% [see Eq. (68)], corresponding to a compression ratio Cra = 2.14, and at the right the reconstructed one with q/N = 8% and Cra = 3.84. Higher compression ratios can be obtained by increasing q/N (Cappellini et al., 1976). A practical application of the processing system for digital comparison and correlation of images having different space resolutions, presented in Section V,B, is shown in Figs. 16-20. Figure 16 shows a SEASAT-SAR image (256 x 256) of a coastal region in South Italy (Sele River in Campania), which represents the high-resolution image f1(n1, n2). Figure 17 gives a LANDSAT-C image (256 x 256) of the same region, representing the lower-resolution one f2(n1, n2). Figure 18 shows the result of 2D FIR digital filtering (low-pass type, circular symmetry) of the SEASAT image. Figure 19 shows the two final images obtained for comparison and correlation. At the left is
FIG. 13. Example of the application of a 2D FIR digital filter of the high-pass type to a LANDSAT-C image (North Africa): at the right is a part of the original image, while at the left the filtered image is shown.
the LANDSAT image (a part of the original image suitably rotated to be registered with the SEASAT image); at the right is the SEASAT filtered image already decimated [corresponding to g2(n1, n2)]. Figure 20 finally gives a simple integration test: the addition of the two images in Fig. 19 (Cappellini et al., 1984a).
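The stretching-plus-thresholding processing used for the water-resource estimate of Fig. 12 can be sketched as below; the grey-level ranges, the threshold, and the toy scene are illustrative assumptions.

```python
import numpy as np

def stretch(image, lo, hi):
    """Linear grey-level stretching: map the range [lo, hi] onto 0-255."""
    out = (image.astype(float) - lo) * 255.0 / (hi - lo)
    return np.clip(out, 0, 255)

def threshold_area(image, t):
    """Binary thresholding; counting pixels below t gives a simple area
    estimate (e.g., water-covered surface in a coastal scene)."""
    mask = image < t
    return mask, int(mask.sum())

# Toy scene: a dark 'water' patch (grey ~30) inside brighter 'land' (~120).
scene = np.full((16, 16), 120.0)
scene[4:10, 4:12] = 30.0        # 6 x 8 = 48 'water' pixels

enhanced = stretch(scene, 20, 140)              # expand the useful grey range
mask, water_pixels = threshold_area(enhanced, 64)
```

Stretching first pushes the water and land populations apart, so a single threshold separates them cleanly; the pixel count times the ground resolution cell then estimates the water area.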
C. Applications to Biomedicine

With the recent rapid technological evolution, much equipment has been introduced in biomedicine to produce different types of biomedical images or
FIG. 14. Example of the application of data compression with a ZOP algorithm and floating tolerance to an ERTS-1 image: at the left the original image is shown, while at the right the reconstructed one is shown (Cra = 1.56).
bioimages. Some examples of biomedical branches giving bioimages are: radiography (x-ray), thermography (IR), scintigraphy (nuclear medicine), echography (ultrasonics), electrocardiography (ECG maps), electroencephalography (EEG maps), and computer tomography (CT). Other bioimages of increasing interest are nuclear-magnetic-resonance (NMR) images and microwave-radiometry images. The above bioimages can be processed by 2D digital filters, local space operators, and data compression to obtain several useful results. By means of low-pass filtering, a smoothing of the bioimage is obtained, reducing high space-frequency noise components. By using high-pass or bandpass filtering, enhancement effects result, outlining and extracting useful data and patterns otherwise not clearly recognized. By means of inverse filtering (restoration), noisy bioimages can be processed to obtain higher-quality images for clinical diagnosis and interpretation. Data-compression techniques can then reduce the amount of data, which is increasing in an impressive way, solving storage problems (archival systems) and increasing the efficiency of bioimage
FIG. 15. Example of the application of data compression using a 2D FWT and variable-word-length coding of transformed data blocks (4 x 4) to an ERTS-1 image: at the left, the original image; in the middle, the reconstructed image (Cra = 2.14); at the right, the reconstructed image (Cra = 3.84).
transmission from one place to another (telemedicine). In the following some typical examples are reported. A first example regards ECG or EEG maps. A special hardware system was recently built in Florence, containing a fast digital processor performing up to 256 1D digital filtering operations on ECG or EEG signals (Cappellini and Emiliani, 1983). In particular, 16 signals (representing a 4 x 4 micromap) can be processed in real time, and the filtered data, for instance corresponding to the α components of an EEG, can then be processed by computer systems performing 2D digital filtering or other compression operations. Indeed, the 1D digital filtering performed on the 4 x 4 signals by the hardware system represents an interesting example of parallel processing of 2D data, extracting useful frequency components (and in this way also performing a sort of data compression). Figure 21 shows a standard chart recording of parallel filtering
FIG.16. A SEASAT-SAR image of a region in South Italy.
of four EEG signals (2 x 2 micromap), obtaining three outputs for each input signal in the frequency bands 8-10, 10-12, and 12-14 Hz. Figure 22 shows an example of processing in infrared thermography. In Fig. 22a the original digitized image is given; in Fig. 22b the result of grey-level expansion (stretching) is reported, in conjunction with edge detection applied on a limited range of grey levels, outlining the venous traces (Prosperi, 1983). Figure 23 shows an example of the application of a 2D FIR digital filter of the bandpass type to a nuclear medicine image; at the top is the original image, and at the bottom the result of processing. Due to the special enhancement effect, a cyst now appears at the left of the image (the small black region). Figure 24 shows another example of processing, of a computer tomography image. In Fig. 24a the original image is given, in Fig. 24b the result of linear stretching
FIG.17. A LANDSAT-C image of the same region as in Fig. 16 (extended area).
performed in the limited grey-level range 80-190, and in Fig. 24c the result of a 2D FIR digital filtering of the parabolic type. As it appears, this last filter can indeed be useful for obtaining special enhancement. It can be proved that a 2D parabolic filter (with flexible parameters such as the origin and slope) is a good approximation of inverse or restoration filtering (Cappellini et al., 1978). This example outlines how in computer tomography, in addition to the standard image manipulation provided, special effects can be obtained, in particular by means of 2D digital filtering. As already observed, 2D digital filtering can be very useful as preprocessing before data compression (see Section V). Table I shows some experimental results, obtained by processing nuclear medicine images first with 2D low-pass digital filtering and then with data compression using digital transformations (2D FFT and FWT). As it appears, with the same e_p and e_r errors, the compression ratio Cra is appreciably increased when the 2D digital filter is
FIG. 18. Result of 2D FIR digital filtering (low-pass type, circular symmetry) applied to the SEASAT image.
FIG. 19. The two final images obtained for comparison and correlation: at the left is the LANDSAT image; at the right, the SEASAT filtered and decimated image.
FIG.20. A simple integration test: addition of the two images obtained in Fig. 19.
FIG. 21. Multiple parallel digital filtering of four EEG signals.
FIG. 22. Example of processing in infrared thermography: (a) original digitized image; (b) result of edge detection applied on a limited range of grey levels.
used in comparison with the situation with no prefiltering (in the first case Cra passes from 2.5 to 6, in the second from 3.5 to 8) (Cappellini, 1979a).

D. Applications to Robotics
In computer vision for robotics, in which one or more scene sensors such as TV cameras take information on mechanical objects or other systems in a
FIG.23. Example of the application of a 2D FIR digital filter of the bandpass type to a nuclear medicine image: at the top is the original image; at the bottom, the result of processing.
static position or in movement, efficient processing techniques are required to analyze the images given by the sensors. In particular, due to environmental noise conditions (light changes, different colors of the objects, movement, ...), preprocessing is required with fast filtering operations; then edge detection is useful to extract the object's shape before final recognition and classification. In the following, some examples of the application of the digital operations described in the previous sections are given.
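The edge-detection preprocessing step mentioned above can be sketched with the classical Sobel gradient operator, used here as a stand-in for the edge detectors of the text; the toy image and object are illustrative assumptions.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def sobel_magnitude(image):
    """Gradient-magnitude edge map from the two Sobel masks."""
    h, w = image.shape
    out = np.zeros((h, w))
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            patch = image[i - 1:i + 2, j - 1:j + 2]
            gx = np.sum(patch * SOBEL_X)   # horizontal gradient
            gy = np.sum(patch * SOBEL_Y)   # vertical gradient
            out[i, j] = np.hypot(gx, gy)
    return out

# Toy object: a bright square on a dark background.
img = np.zeros((10, 10))
img[3:7, 3:7] = 100.0
edges = sobel_magnitude(img)   # large only along the square's boundary
```

The edge map is zero inside homogeneous regions and large along object boundaries, which is exactly what the subsequent segmentation and shape-recognition stages need.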
FIG. 24. Example of processing a computer tomography image: (a) original image; (b) result of linear stretching; (c) result of 2D FIR digital filtering of the parabolic type.
The first example regards the analysis of complex objects, where the goal is to process images taken of the objects and to produce an automatic object decomposition and subpart identification/classification. The processing procedure is outlined in Fig. 25: by using a TV camera, images are acquired in three colors (R, G, B: red, green, blue); then prefiltering is performed to reduce the noise; after boundary extraction, decomposition and syntactical analysis are performed. Figure 26 shows an example of the application of this procedure: in Fig. 26a an original digitized image (red channel presented in black and white) is given, representing a circuit board (acquisition with strong noise); in Fig. 26b the result of decomposition of the circuit board, obtained through a nonlinear
FIG. 24b.
filter of the type presented in Section III,A, an edge detector, and a homogeneity operator applied on the three R, G, B images (Cappellini et al., 1984b). Another example regards the recognition and tracking of moving objects, as on a conveyor belt. The processing steps are the following: preprocessing with a nonlinear smoother [as in Eq. (26)]; edge detection (e.g., a Sobel-type operator); spike elimination with a nonlinear operator [as in Eqs. (27) and (28)]; segmentation; object recognition by performing the FFT on the boundary of the object (distances of the boundary points from the centroid). An example of the application of this procedure is given in Fig. 27 on some mechanical objects. At the left are the input digitized images of two
FIG. 24c.
positions; at the right the recognized objects are shown (each identified with a different color, here appearing as a different grey level) with perfect tracking of their movements (Cappellini and Del Bimbo, 1983). With reference also to the above examples, in these robotics applications the preprocessing step is indeed very important (fast and efficient nonlinear filtering operators and edge detectors are required), in such a way as to reduce the noise and disturbances and to extract the significant data in compressed form (that is, limited to the really significant ones) for the best performance of the final recognition-classification algorithms and procedures.
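The boundary-based recognition step described above (FFT of the distances of the boundary points from the centroid) can be sketched as follows; the boundary sampling and the number of retained coefficients are illustrative assumptions.

```python
import numpy as np

def fourier_descriptor(boundary, n_coeff=8):
    """Shape signature: FFT magnitudes of the centroid-to-boundary distance
    sequence, normalized by the DC term; invariant to the starting point of
    the boundary trace and to uniform scaling."""
    pts = np.asarray(boundary, dtype=float)
    dist = np.linalg.norm(pts - pts.mean(axis=0), axis=1)
    spectrum = np.abs(np.fft.fft(dist))
    return spectrum[1:n_coeff + 1] / spectrum[0]

# Two toy boundaries sampled at 64 equal angular steps:
theta = np.linspace(0, 2 * np.pi, 64, endpoint=False)
circle = np.c_[np.cos(theta), np.sin(theta)]
r = 1 + 0.3 * np.cos(4 * theta)                  # four-lobed blob
blob = np.c_[r * np.cos(theta), r * np.sin(theta)]

d_circle = fourier_descriptor(circle)    # essentially all zeros
d_blob = fourier_descriptor(blob)        # energy concentrated at harmonic 4
```

Comparing such descriptor vectors (e.g., by Euclidean distance against stored prototypes) gives a simple recognition rule that is insensitive to object position and rotation.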
TABLE I
EXPERIMENTAL RESULTS OBTAINED BY APPLYING DATA COMPRESSION USING TWO-DIMENSIONAL FFT AND FWT TRANSFORMATIONS TO NUCLEAR MEDICINE IMAGES WITH OR WITHOUT A TWO-DIMENSIONAL LOW-PASS DIGITAL PREFILTERING
          Compression ratio Cra = 2.5        Compression ratio Cra = 3.5
          (Cra = 6 with prefiltering)        (Cra = 8 with prefiltering)
          e_p         e_r                    e_p         e_r
FFT       0.1322      0.0134                 0.1581      0.0164
FWT       0.0483      0.0074                 0.1088      0.0160
[Fig. 25 block diagram: ACQUISITION (different-color-bands acquisition, prefiltering algorithm) followed by SEGMENTATION (boundary extraction, decomposition algorithm, syntactical analysis).]
FIG. 25. Processing procedure for automatic object decomposition and subpart identification/classification.
FIG. 26. Example of the application of the procedure in Fig. 25: (a) original digitized image representing a circuit board; (b) result of decomposition obtained through a nonlinear operator, an edge detector, and a homogeneity operator.
FIG.27. Example of processing images related to moving objects: at left, the input digitized images; at right, the recognized objects with movement tracking.
REFERENCES

Abramson, N. (1963). "Information Theory and Coding." McGraw-Hill, New York.
Benelli, G., Bianciardi, C., Cappellini, V., and Del Re, E. (1977). Proc. EUROCON - Eur. Conf. Electrotech., Venice.
Benelli, G., Cappellini, V., and Lotti, F. (1980). Radio Electron. Eng. 50, 29.
Benelli, G., Cappellini, V., and Del Re, E. (1984). IEEE J. Select. Areas Commun. SAC-2, 77.
Berger, T. (1971). "Rate Distortion Theory - A Mathematical Basis for Data Compression." Prentice-Hall, New York.
Bernabo, M., Cappellini, V., and Emiliani, P. L. (1976). Electron. Lett. 12, 288.
Brofferio, S., Cafforio, C., Rocca, F., and Ruffino, U. (1975). Proc. Florence Conf. Digital Signal Process., p. 158.
Calzini, M., Cappellini, V., and Emiliani, P. L. (1975). Alta Frequenza 44, 747.
Cappellini, V. (1979a). Proc. JUREMA Conf., Zagreb.
Cappellini, V. (1979b). Proc. Int. Workshop Image Process. Astron., Trieste, p. 258.
Cappellini, V. (1980). Int. J. Remote Sensing 1, 175.
Cappellini, V. (1983). Proc. IEEE Int. Symp. Circuits Systems, Newport Beach, p. 402.
Cappellini, V. (1984). Proc. EARSeL/ESA Symp. Integrated Approaches Remote Sensing, Guildford, p. 325.
Cappellini, V., and Del Bimbo, A. (1983). In "Issues in Acoustic Signal/Image Processing and Recognition" (C. H. Chen, ed.), p. 283. Springer-Verlag, Berlin and New York.
Cappellini, V., and Emiliani, P. L. (1983). Proc. MEDINFO-83, Amsterdam, p. 682.
Cappellini, V., and Odorico, L. (1981). Proc. IEEE Int. Conf. Acoust. Speech Signal Process., Atlanta, p. 1129.
Cappellini, V., Chini, A., and Lotti, F. (1976). Proc. Int. Tech. Sci. Meet. Space, Rome, p. 33.
Cappellini, V., Constantinides, A. G., and Emiliani, P. (1978). "Digital Filters and Their Applications." Academic Press, New York.
Cappellini, V., Carla, R., Conese, C., Maracchi, G. P., and Miglietta, F. (1984a). Proc. EARSeL/ESA Symp. Integrated Approaches Remote Sensing, Guildford, p. 23.
Cappellini, V., Del Bimbo, A., and Mecocci, A. (1984b). Image Vision Comput. 2, 109.
Costa, J. M., and Venetsanopoulos, A. N. (1974). IEEE Trans. Acoust. Speech Signal Process. ASSP-22, 432.
Dudgeon, D. E. (1975). IEEE Trans. Acoust. Speech Signal Process. ASSP-23, 242.
Ekstrom, M. P. (1980). IEEE Trans. Acoust. Speech Signal Process. ASSP-28, 16.
Harris, D. B., and Mersereau, R. M. (1977). IEEE Trans. Acoust. Speech Signal Process. ASSP-25, 492.
Hilberg, W., and Rothe, P. G. (1971). Inf. Control 18, 103.
Hu, J. V., and Rabiner, L. R. (1972). IEEE Trans. Audio Electroacoust. AU-20, 249.
Kaiser, J. F. (1966). In "System Analysis by Digital Computer" (F. F. Kuo and J. F. Kaiser, eds.), p. 218. Wiley, New York.
McClellan, J. H. (1973). Proc. Annu. Princeton Conf. Inf. Sci. Systems, 7th, p. 247.
Maria, G. A., and Fahmy, M. M. (1974). IEEE Trans. Acoust. Speech Signal Process. ASSP-22, 16.
Mecklenbrauker, W. F. G., and Mersereau, R. M. (1976). IEEE Trans. Circuits Systems CAS-23, 414.
Mersereau, R. M., and Dudgeon, D. E. (1975). Proc. IEEE 63, 610.
Mersereau, R. M., Mecklenbrauker, W. F. G., and Quatieri, T. F., Jr. (1976). IEEE Trans. Circuits Systems CAS-23, 405.
Oppenheim, A. V., and Schafer, R. W. (1975).
"Digital Signal Processing." Prentice-Hall, New York.
Pratt, W. K. (1978). "Digital Image Processing." Wiley, New York.
Prosperi, L. (1983). Thesis, Department of Electrical Engineering, University of Florence.
Shannon, C. E. (1959). IRE Natl. Conv. Rec. 7, 142.
Shannon, C. E., and Weaver, W. (1949). "The Mathematical Theory of Communication." Univ. of Illinois Press, Urbana.
Shanks, J. L., Treitel, S., and Justice, J. H. (1972). IEEE Trans. Audio Electroacoust. AU-20, 115.
ADVANCES IN ELECTRONICS AND ELECTRON PHYSICS, VOL. 66
Statistical Aspects of Image Handling in Low-Dose Electron Microscopy of Biological Material

CORNELIS H. SLUMP* AND HEDZER A. FERWERDA

Department of Applied Physics, Rijksuniversiteit Groningen, Groningen, The Netherlands
I. Introduction
   A. Need for a Fundamental Statistical Analysis
   B. Interaction of the Electron Beam with the Specimen
   C. Relation between the Object Structure and the Electron Wave Function
   D. Image Formation in the CTEM
   E. Remarks about the Contrast Mechanisms
   F. The Phase Problem in Electron Microscopy
   G. The Stochastic Process Characterizing the Low-Dose Image
II. Object Wave Reconstruction
   A. Introduction and Review
   B. Derivation of the Basic Equation
   C. Solution of the Integral Equations
   D. Statistical Analysis of an Approximate Solution
III. Wave-Function Reconstruction of Weak Scatterers
   A. Introductory Remarks
   B. Axial Illumination
   C. Tilted Illumination
IV. Parameter Estimation
   A. Maximum Likelihood Estimation in Electron Microscopy
   B. Illustrative Examples
   C. Two-Dimensional Examples
   D. Discussion and Conclusions
V. Statistical Hypothesis Testing
   A. Introduction to Statistical Hypothesis Testing in Electron Microscopy
   B. Object Detection
   C. Position Detection of Marker Atoms
   D. Statistical Significance of Image Processing
   E. Discussion and Conclusions
Appendix A: The Statistical Properties of the Fourier Transform of the Low-Dose Image
Appendix B: The Statistical Properties of an Auxiliary Variable
Appendix C: The Cramer-Rao Bound
References
* Present address: Philips Medical Systems, Eindhoven, The Netherlands.
Copyright © 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.
I. INTRODUCTION

A. Need for a Fundamental Statistical Analysis
It is well known that in the electron microscopy of biological specimens radiation damage constitutes a severe problem. This radiation damage manifests itself as the breaking of chemical bonds due to the bombardment of the specimen by electrons. Obviously, radiation damage can be reduced by limiting the number of electrons during exposure. The price to be paid for this economy is that the pictures exhibit a grainy appearance (“shot noise”) which introduces a probabilistic feature in the imaging process. The image is to be considered as the realization of a stochastic process. The processing of the noisy micrographs consequently acquires a stochastic nature. It is to be expected that the smaller the number of participating electrons the larger the uncertainty in the results will be. It is the aim of the present article to give a statistical characterization of the results obtained from noisy exposures. By this we mean that not only the average of a certain calculated quantity is needed but also its variance, etc. The present analysis puts intuitive notions such as signal-to-noise ratio on a firm basis. Ideally, the probability density function of the relevant quantity should be determined. As we shall see in the subsequent sections, such a goal may be too ambitious in general. In the past several attempts have been made in electron microscopy to improve the statistical significance of the obtained results. The most obvious approach is to repeat the experiments under identical circumstances. This will reduce the variance of the quantities in which we are interested. Unwin and Henderson (1975) achieve this for substances which can be crystallized. In this case the number of repetitions of the exposure of one single unit equals the number of unit cells in the crystal. Unfortunately, not all biological materials can be forced to crystallize. 
Periodic structures can be handled by the Fourier filtering method of Unwin and Henderson (1975) or the cross-correlating techniques of Saxton and Frank (1977). Nonperiodic objects have only been analyzed in a systematic way by techniques borrowed from pattern recognition, e.g., cluster analysis. Van Heel and Frank (1981) have applied a statistical version of such an algorithm, called correspondence analysis, to a large number of images that each contain a similar, single isolated biological macromolecule. These images could thereafter be oriented with respect to each other, aligned and averaged, resulting in a higher signal-to-noise ratio and showing many more details of the object. Even in the case of the abovementioned improvements in signal-to-noise ratio, a statistical characterization of the final quantities remains necessary.
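The variance-reduction argument behind repeating exposures can be illustrated numerically; the dose level, image size, and use of a pure Poisson shot-noise model are illustrative assumptions, not data from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
signal = 5.0        # assumed mean electron count per pixel (low dose)
n_pixels = 10_000

def snr(n_exposures):
    """Empirical signal-to-noise ratio of the per-pixel estimate obtained
    by averaging n identical low-dose exposures (Poisson shot noise)."""
    counts = rng.poisson(signal, size=(n_exposures, n_pixels))
    avg = counts.mean(axis=0)
    return avg.mean() / avg.std()

# Averaging n exposures divides the variance by n, so SNR grows as sqrt(n):
snr1, snr16 = snr(1), snr(16)
```

With a mean of 5 electrons per pixel the single-exposure SNR is about sqrt(5) ≈ 2.2, and averaging 16 repetitions raises it by roughly a factor of 4, which is the mechanism exploited by the crystal-averaging and image-alignment methods cited above.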
B. Interaction of the Electron Beam with the Specimen
A basic understanding of the scattering of electrons by the specimen under consideration is needed if one wants to deduce the object structure from one or more micrographs. Our discussion will be very sketchy; a more detailed account can be found in Heidenreich (1964), Haine (1961), and Misell (1973a). We distinguish two categories of interactions: (1) inelastic interactions and (2) elastic interactions. In inelastic interactions the incident electrons transfer energy and momentum to the object, leading to an excitation of the specimen. Such an excitation could be the breaking of a chemical bond, ionization or excitation to another energy level, plasmon interactions, etc. In elastic interactions there is no internal excitation of the object whatsoever; the transfer of momentum and energy is determined by conservation of energy and momentum. The elastic scattering is the scattering of the incident electrons from the electrons and atomic nuclei (Coulomb scattering) in the specimen. When considering monoenergetic incident electrons, the characteristic difference between elastic and inelastic scattering behavior is that the spread in scattering angles is larger for elastic scattering than for inelastic scattering. Intuitively we can use this fact to make some guesses about the resolution which might be obtained when deducing the object structure from the images. This calculation is an idealization, because the scattering characteristics of the objects are recorded in some complicated way in the electron micrograph, as will be clarified in the subsequent sections. Let Δp denote the spread of the momentum transfer between the incident electron and the object. According to the Heisenberg uncertainty principle, the object can be localized with an uncertainty of position of the order Δx = ħ/Δp, ħ being Planck's constant divided by 2π.
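The order-of-magnitude estimate Δx = ħ/Δp can be made concrete with assumed numbers; the 100 keV beam energy and the 10 mrad elastic scattering angle below are illustrative assumptions, not values from the text.

```python
import math

h = 6.626e-34                  # Planck constant, J*s
hbar = h / (2 * math.pi)
c = 2.998e8                    # speed of light, m/s
mc2 = 8.187e-14                # electron rest energy, J (511 keV)

E_kin = 100e3 * 1.602e-19      # assumed 100 keV beam energy, J
pc = math.sqrt(E_kin**2 + 2 * E_kin * mc2)   # relativistic momentum times c
p = pc / c                     # beam momentum, kg*m/s

theta = 0.01                   # assumed 10 mrad elastic scattering angle
dp = p * theta                 # momentum-transfer spread
dx = hbar / dp                 # position uncertainty, metres

print(f"wavelength {h / p * 1e12:.2f} pm, resolution ~{dx * 1e10:.2f} angstrom")
```

With these assumptions the electron wavelength comes out near 3.7 pm and Δx near 0.6 Å, consistent with the statement that elastic scattering (large Δp) is what permits angstrom-scale structural information.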
As Ap is largest for elastic scattering, high-resolution work (aiming at discovering structural information of the order of a few angstrom) has to utilize elastically scattered electrons. It is convenient to remove the inelastically scattered electrons by a filter lens (see e.g., Henkelman and Ottensmeyer, 1974; Egerton et al., 1975). The inelastically scattered electrons form an unwanted low-resolution background signal which can be interpreted as blur. The wave function describing the elastic scattering of an electron by an object has an important property: There exists a definite phase relationship between the incident wave function and the scattered wave function, a
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
relationship which does not hold for inelastically scattered electrons. For that reason we shall assume monochromatic, spatially coherent illumination. This is an idealized situation which is experimentally approximated by the field emission gun, which has a very small apparent spot size. Strictly speaking, the illumination is partially coherent. The degree of partial coherence might be incorporated in the calculations, which become increasingly cumbersome without yielding additional insight (for a discussion see Hawkes, 1980b). Therefore, in the present contribution, we shall assume monochromatic, fully spatially coherent illumination. The monochromaticity requirement is satisfied by removing the inelastically scattered electrons by an energy filter lens.

C. Relation between the Object Structure and the Electron Wave Function

The discussion in this section only applies to the case of elastic scattering. In order to determine the object structure from the scattered wave function, whose determination will be discussed in the following sections, we need a model for the object. We shall give two examples. (1) The object is considered as a collection of scattering centers. The position and scattering strength of each center is described by a certain number of parameters. In principle, the intensity distribution of the image can be calculated and is compared with the measured image intensity distribution. The parameters can now be determined by a fitting procedure. Needless to say, this procedure is only viable if the number of parameters is restricted, which is the case when we have considerable prior information about the object from other sources (e.g., chemical). This approach will be discussed in further detail in Section IV. (2) The object is described by an electrostatic potential distribution V(r) which is due to all charges inside the object. According to the WKB approximation and Glauber theory (Lenz, 1971; Glauber, 1959), the phase shift α(·,·)
imparted to a plane wave incident along the z direction is proportional to the projection of V(r) on a plane perpendicular to the z axis:

α(x0, y0) ∝ ∫ V(x0, y0, z) dz    (1)
The reconstruction of the scattered wave function yields a projection of the potential distribution. It should be clear that the inverse scattering problem (i.e., the determination of the object’s structure from the scattered wave function) is more
IMAGE HANDLING IN ELECTRON MICROSCOPY
complicated than the present examples suggest and needs further investigation.
D. Image Formation in the CTEM

The setup of the electron microscopic image formation has been schematically sketched in Fig. 1. The z axis of the coordinate system is chosen along the optical axis of the microscope. The wave function in a plane perpendicular to the z axis and situated immediately behind the object shall henceforth be denoted by the name "object wave function," and is written as

ψ0(x0, y0) = exp[iα(x0, y0) − β(x0, y0)]    (2)
assuming a plane wave incident along the z axis, exp(ikz), and choosing the z coordinate of the plane, z0, equal to zero. x0 and y0 are the coordinates in the object plane and will be measured in units of the wavelength of the incident electrons. We shall give a physical interpretation of the quantities α and β occurring in Eq. (2). For a plane incident wave along the z axis, α(x0, y0) represents the phase shift which is imparted to the incident wave function by the object and which can be related to the projection of the electrostatic potential distribution on the object plane z = z0 = 0 [see Eq. (1)]. The quantity β(x0, y0) has a phenomenological interpretation: This quantity describes the loss of electrons due to inelastic scattering. These electrons are supposed to have been removed by a filter lens. This filtering is incorporated phenomenologically by β(x0, y0). The wave function in the image plane of the microscope, i.e., the image wave function (see Fig. 1), is related to the object wave function by
ψ(x, y) = ∫∫ K(x0 − x, y0 − y) ψ0(x0, y0) dx0 dy0    (3)

where K(·,·) is given by [see, e.g., Hawkes (1980)]

K(x0 − x, y0 − y) = ∫∫ dξ dη exp{−iγ(ξ, η) − 2πi[(x0 − x)ξ + (y0 − y)η]}    (4)
Equations (3) and (4) are the equations of the linear transfer theory of image formation, and they apply when the optical system is isoplanar, i.e., when the wave-optical aberrations are independent of the object coordinates x0 and y0. The real function γ(·,·) introduced in Eq. (4) is the wave aberration function. It incorporates the effects of spherical aberration through the coefficient Cs and
FIG. 1. Schematic of image formation in the transmission electron microscope (object plane z = z0 = 0, exit pupil z = zp, and image plane z = zi along the optic axis).
the defocus through the coefficient D:

γ(ξ, η) = 2πλ⁻¹[¼Cs(ξ² + η²)² + ½D(ξ² + η²)]    (5)

The value of Cs, usually in the range of 1 to 3 mm, is fixed for a given microscope. However, the defocus D is variable and is to be chosen by the experimentalist. In Eq. (5) aberrations of higher order, such as coma and astigmatism, are neglected. In the object plane x0 and y0 are measured in units of λ; in the exit pupil ξ and η are expressed in units of the (back) focal length of the optical system. In the image plane x and y are measured in units of Mλ, with M representing the lateral magnification. A wave function is not a measurable physical quantity in itself. In the image plane only the intensity, i.e., the wave function multiplied by its complex conjugate, can be observed and registered. The information about the structure of the specimen is contained in the phase function α(·,·). In an aberration-free microscope this information would disappear from the product ψ(·,·)ψ*(·,·) when the defocus parameter D is also set to zero, a situation which would correspond to the Gaussian reference plane. The aberration function γ(·,·) [cf. Eq. (5)] is primarily responsible for the generation of phase contrast. In this context phase contrast is defined (as usual) as the contrast in the image caused by the phase shift imparted to the incident electron wave function by the object. This phase contrast is
observable thanks to the phase shift provided by the aberration function γ(·,·), which plays a role similar to the phase plate in a phase-contrast microscope.
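The transfer relations of Eqs. (3) and (4) and the role of γ as a phase plate can be illustrated numerically. The following sketch applies the transfer in Fourier form to a weak phase object; the grid size, aperture radius, defocus scale, and the object itself are assumed toy values (defocus-only aberration), not the book's parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 128

# Frequency grid (cycles per pixel) standing in for the pupil coordinates.
fx, fy = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n))
f2 = fx**2 + fy**2

# Weak phase object psi_0 = exp(i*alpha), i.e., Eq.-(2)-style with beta = 0;
# alpha is a smooth random field (assumed test object).
alpha = 0.1 * np.fft.ifft2(np.fft.fft2(rng.normal(size=(n, n))) * (f2 < 0.1**2)).real
psi0 = np.exp(1j * alpha)

aperture = f2 < 0.25**2          # one effective aperture in the exit pupil

def image_intensity(defocus):
    # Isoplanatic transfer in Fourier form: multiply the object spectrum by
    # the aperture and by exp(-i*gamma); here gamma has a defocus term only.
    gamma = np.pi * defocus * f2
    psi = np.fft.ifft2(np.fft.fft2(psi0) * aperture * np.exp(-1j * gamma))
    return np.abs(psi)**2

c0 = image_intensity(0.0).std()    # no aberration: phase contrast vanishes
c1 = image_intensity(400.0).std()  # defocus acts as a "phase plate"
print(c0, c1)
```

With γ = 0 the pure phase object produces almost no intensity contrast; with the defocus term switched on, contrast appears, which is the phase-contrast mechanism described above.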
E. Remarks about the Contrast Mechanisms

We will briefly make some comments on the physical mechanisms which give rise to image contrast. We will assume that the image is due to the elastically scattered electrons. For low-resolution microscopy one gets useful information from the scattering contrast. The scattering contrast (also called diffraction contrast) is caused by the removal of electrons which have been scattered over such large angles that they are intercepted by the apertures of the microscope. This scattering contrast should be clearly distinguished from the contrast caused by the removal of the inelastically scattered electrons, which have been removed by the energy filter lens. Scattering contrast is fully taken into account in our treatment by the finite size of the aperture in the exit pupil, which is incorporated in our formulas. In our treatment scattering contrast is caused by electrons that have been intercepted by the diaphragm in the exit pupil. This diaphragm should not be taken too literally: All the apertures inside the microscope are represented by one effective aperture taken to be located in the exit pupil. Scattering contrast can be observed when essentially only the unscattered electrons pass through the microscope. When the specimen undergoes crystalline scattering (in the context of crystalline specimens the name diffraction is usually preferred), contrast is observed when only the zero-order diffracted beam is transmitted. In some cases scattering contrast admits a simple physical interpretation. The object is considered to consist of a number of point scatterers, which scatter independently. Let us assume that the electrons are never scattered more than once; multiple scattering is excluded. Let the object be illuminated by a wave propagating in the z direction and let I(x, y, z) denote the beam intensity in an arbitrary point of space.
σ is the total cross section for scattering over angles larger than the angular half-width ε of the aperture in the exit pupil. As σ depends on the chemical composition of the scatterer, σ is taken to depend on the space coordinates: σ = σ(x, y, z). We now consider the propagation of the beam intensity I(x, y, z) when proceeding in the z direction (see Fig. 2). Going from the plane perpendicular to the z axis with z coordinate z to a similar plane with z coordinate z + Δz, we find that the number of electrons which is prevented from reaching the image plane is given by
I(x, y, z) − I(x, y, z + Δz) = σ(x, y, z) ρ(x, y, z) I(x, y, z) Δz

where ρ(x, y, z) denotes the number of scatterers per unit volume. I(x, y, z)
FIG. 2. Propagation of the beam intensity through the object.
consequently satisfies the transport equation

∂I(x, y, z)/∂z = −σ(x, y, z) ρ(x, y, z) I(x, y, z)

which is solved by

I(x, y, z) = Iinc exp[−∫0^z σ(x, y, z′) ρ(x, y, z′) dz′]
assuming a uniform incident beam intensity Iinc before the beam hits the object. In particular, for a weakly scattering object (meaning a "small" value of σ), we obtain by expanding the exponential function and retaining only the first two terms
I(x, y, z) = Iinc − Iinc ∫0^z σ(x, y, z′) ρ(x, y, z′) dz′    (8)
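The exponential solution of the transport equation and its weak-scattering linearization, Eq. (8), are easy to check numerically. A minimal sketch; the profile σρ(z) below is an assumed toy shape:

```python
import numpy as np

# Assumed "scattering power" profile sigma*rho along the beam direction.
z = np.linspace(0.0, 1.0, 1001)
power = 0.05 * (1.0 + np.sin(2.0 * np.pi * z))

I_inc = 1.0
proj = float(np.sum(0.5 * (power[1:] + power[:-1]) * np.diff(z)))  # trapezoid rule
I_exact = I_inc * np.exp(-proj)       # exponential solution of the transport eq.
I_linear = I_inc * (1.0 - proj)       # weak-scattering expansion, Eq. (8)

# Forward-Euler integration of dI/dz = -sigma*rho*I as an independent check.
I = I_inc
for k in range(len(z) - 1):
    I -= power[k] * I * (z[k + 1] - z[k])

print(I_exact, I_linear, I)
```

For a projected scattering power of 0.05 the linearized intensity of Eq. (8) differs from the exponential solution by only about 0.1%, which is why, for weak scatterers, the image contrast is simply the projection of σρ.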
From this formula one sees that the image contrast is the projection of σ(x, y, z)ρ(x, y, z) on the object plane. σ(x, y, z)ρ(x, y, z) might be interpreted as the "scattering power" of the object per unit length. If the scattering cross section σ does not vary appreciably over the object, the image contrast is approximately proportional to the projection of the density of the scattering centers, which in its turn is approximately equal to the mass density. The image contrast due to the inelastically scattered electrons gives information on the loss characteristics of the object and is irrelevant, even a nuisance, for structure determination. The reader should be warned not to take the simple-minded discussion of scattering contrast too seriously. The state of coherence of the incident beam
has been completely ignored, a liberty that seems to be common in the study of transport phenomena. For high-resolution electron microscopy the most important contrast mechanism is phase contrast, which arises from the interference between the unscattered and the scattered electron waves. In the next section this contrast mechanism is treated in greater depth.
F. The Phase Problem in Electron Microscopy

It is clear from the foregoing that determination of the structure of the object requires the knowledge of the complex object wave function, in particular its phase, given by α(x0, y0) [cf. Eqs. (1) and (2)]. We shall see later on that we can determine the complex object wave function if we know the complex wave function in the image plane, ψ(x, y). Unfortunately, only the intensity in the image plane is observable, which is proportional to the square of the modulus of this wave function, |ψ(x, y)|². So, apparently the phase of the image wave function is not recorded and must be considered as being lost. This prevents a unique determination of the object wave function and frustrates the determination of the object structure. This complication is known as the "phase problem." In this article the phase problem will be solved by using two or more exposures of the same specimen under different settings of the microscope, e.g., by changing the defocusing between two exposures. A more detailed account can be found in the review articles by Ferwerda (1978, 1981, 1983), where more references can be found, and Saxton (1980).
G. The Stochastic Process Characterizing the Low-Dose Image

Within the limit of the virtually countless number of electrons contributing to the image intensity, the observed intensity is proportional to the squared modulus of the image wave function ψ(·,·). We will call this situation the deterministic case. In this study, however, the specimens of interest are the radiation-sensitive objects of biological material. As has been stated earlier, irreparable radiation damage resulting from the imaging electrons restricts the electron dose tremendously. Lowering the dose in order to prevent uncontrollable structural changes in the specimen will cause quantum noise to become manifest. The interpretation of the image will become difficult due to quantum noise. It is obvious that a compromise must be made between radiation damage of the structure and interpretability of the image contrast [see, e.g., Kellenberger (1980)]. We will not pursue this further. We minimize the structural damage by reducing the electron dose, and we will investigate in
subsequent sections the consequences of the reduction for the retrieval of information from noisy images in general and in particular with respect to the phase problem. In low-dose circumstances the image intensity distribution recorded on the micrograph is a realization of a stochastic process. This noise process, which is directly related to the way in which images are recorded, is investigated in this section. The stochastic nature of the recorded image has the consequence that the results of image processing also become stochastic quantities, for example the results obtained with an algorithm for phase retrieval. This section is therefore of fundamental importance in this study. From among other noise sources, such as thermal fluctuations of the current in the magnetic lenses and mechanical vibrations, we restrict our analysis to the so-called quantum noise ("shot noise"). The reason for this restriction is that quantum noise is a fundamental and major noise source which cannot be made arbitrarily small by good instrumentation and intelligent operation of the microscope. Throughout this study we will assume that low-dose images are recorded in the following idealized way. The registration of the image is performed by a detector (viz., a photographic plate) which is divided into a large number N² (to be specified later) of identical nonoverlapping squares. We assume that each image cell counts the exact number of electrons which arrive in the cell. Hence, a recorded image consists of an N × N array of random counts n̂k,l, (k, l) ∈ {1, ..., N}. We use the hat symbol to denote random variables. The above assumptions are not limiting because the number of developed grains of silver in a region of a photographic emulsion is proportional to the number of incoming quanta. The step of scanning the image with a microdensitometer is omitted from the analysis presented here.
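The idealized recording model can be simulated directly: choose cell probabilities pk,l from a normalized intensity and draw independent Poisson counts. A sketch with assumed dose, grid size, and intensity pattern:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64                                   # image is an N x N array of cells
lam_s_T = 2.0e5                          # assumed mean number of emitted electrons

# Assumed normalized image intensity, integrated per cell -> probabilities p_{k,l}.
yy, xx = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
intensity = 1.0 + 0.5 * np.cos(2.0 * np.pi * xx / N)
p = intensity / intensity.sum()          # arrival probability per cell

# Independent Poisson counts per cell, one realization of the recorded image.
counts = rng.poisson(lam_s_T * p)
print(counts.sum(), counts.mean())
```

Each cell count is Poisson with parameter λsT·pk,l, so the total count fluctuates around λsT while the expected pattern follows the intensity.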
The process of digitizing the photographic plate will inevitably add noise to the signal to be processed. The approach in this section is to assume that the information in the micrograph is available in digital form. This is in line with new developments in instrumentation, where a direct interface exists between microscope image and computer, and with developments in solid-state image-receptor technology. The intensity of the electron source must be low because of the restriction to the low-dose regime. It is to be expected therefore that the electron wave packets of the successively emitted electrons do not overlap. This is experimentally supported by Munch (1975), who found an average spacing of 15 m between two electrons in a beam current of 1.5 × 10⁻¹² A at 75 keV. On the average there is only one electron in the microscope at a time. Therefore, the successively emitted electrons do not interact with each other, and the emissions are statistically independent events. A further consequence is that the scattering process of the beam with the specimen and the subsequent image formation can be described by a one-electron wave function. Due to the
statistical independence of the successive emissions, the total number n̂T of emitted electrons during the exposure time T is a realization of a Poisson process, as is shown for example in Davenport and Root (1958, Chap. 7). This means that n̂T is a Poisson-distributed random variable with probability distribution

P{n̂T = k} = exp(−λsT)(λsT)^k / k!,    k = 0, 1, 2, 3, ...    (9)

where λs is the source intensity parameter, equal to the mean number of emissions per second. The random counts n̂k,l, (k, l) ∈ {1, ..., N}, of which the recorded image consists, are independent, Poisson-distributed random variables, as will be shown in the following. In the low-dose regime the probability that an electron which has been emitted by the source will arrive in the lth image cell, l ∈ {1, ..., N²}, is expressed in terms of the one-electron image wave function, cf. Eq. (3), by
pl = ∫∫_{al} |ψ(x, y)|² dx dy    (10)
in which al is the area of the lth image cell. Let n̂l denote the number of detected electrons in the lth image element. The probability that k electrons will arrive in the lth image element is given by a k-times independently repeated Bernoulli trial, with pl being the probability of success. We arrive at

P{n̂l = k} = Σ_{m=k}^{∞} exp(−λsT)[(λsT)^m/m!] [m!/(k!(m − k)!)] pl^k (1 − pl)^(m−k),    k = 0, 1, 2, 3, ...    (11)
In Eq. (11) a combination of the Poisson distribution for the total number of electrons and the binomial distribution can be recognized. Defining a new variable j = m − k, we can write for Eq. (11)

P{n̂l = k} = exp(−λsT)(k!)⁻¹ pl^k Σ_{j=0}^{∞} (λsT)^(j+k) (j!)⁻¹ (1 − pl)^j
           = exp(−λsT pl)(k!)⁻¹(λsT pl)^k    (12)
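The two-stage argument behind Eqs. (11) and (12) — a Poisson-distributed total thinned by Bernoulli trials — can be verified by simulation. The numbers below (λsT = 50, pl = 0.07) are assumed illustrative values:

```python
import numpy as np

rng = np.random.default_rng(42)
lam_s_T = 50.0       # mean number of emitted electrons (assumed)
p_l = 0.07           # arrival probability for one image cell (assumed)
reps = 200000

# Two-stage experiment of Eq. (11): Poisson total, then Bernoulli thinning.
totals = rng.poisson(lam_s_T, size=reps)
counts = rng.binomial(totals, p_l)

# Eq. (12) says the thinned counts are Poisson with parameter lam_s_T * p_l.
print(counts.mean(), counts.var())
```

The sample mean and variance of the thinned counts both come out close to λsT·pl = 3.5, as Eq. (12) predicts for a Poisson-distributed variable.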
From Eq. (12) we conclude that n̂l is Poisson distributed with parameter λsT pl. Consider two nonoverlapping area elements of the image plane, areas i and j (see Fig. 3). We will determine the joint probability that k electrons arrive in area i and l electrons in area j. The line of approach followed here parallels the discussion by Papoulis (1965, pp. 76-77). In the low-dose regime the electrons are independently emitted. Therefore the image formation can be modeled as
FIG. 3. The area elements i and j in the image plane.
an m-times independently repeated experiment, where m is a Poisson-distributed random variable with parameter λsT. We have necessarily

m ≥ k + l    (13)

The following three mutually exclusive events can be distinguished as possible outcomes of an experiment:

event i:  the electron arrives in area element i
event j:  the electron arrives in area element j
event 0:  the electron arrives somewhere in the image but not in i or j

The individual probabilities of the three events are related by

Pi + Pj + P0 = 1    (14)

If m has a fixed value satisfying Eq. (13), the probability of the joint event {k electrons in i and l electrons in j} is

Pm{n̂i = k, n̂j = l} = m!/(k! l! [m − (k + l)]!) Pi^k Pj^l P0^(m−(k+l))    (15)
As m is Poisson distributed with the distribution of Eq. (9), we arrive at the following expression for the probability of the joint event that k electrons will arrive in area element i and l electrons will arrive in area element j:

P{n̂i = k, n̂j = l} = Σ_{m=k+l}^{∞} exp(−λsT)(m!)⁻¹(λsT)^m Pm{n̂i = k, n̂j = l}    (16)
Using Eqs. (14) and (15) and introducing a new variable m′ = m − (k + l), we obtain from Eq. (16)

P{n̂i = k, n̂j = l}
  = exp(−λsT)(k!)⁻¹(λsT Pi)^k (l!)⁻¹(λsT Pj)^l Σ_{m′=0}^{∞} (m′!)⁻¹(λsT)^(m′) (1 − Pi − Pj)^(m′)
  = exp(−λsT Pi)(k!)⁻¹(λsT Pi)^k exp(−λsT Pj)(l!)⁻¹(λsT Pj)^l    (17)

With Eq. (12) we obtain

P{n̂i = k, n̂j = l} = P{n̂i = k} P{n̂j = l}    (18)
which shows that the random variables n̂i and n̂j are statistically independent. Notice that this property does not depend on the size or shape of the areas i and j; the only requirement is that they do not overlap. Consequently, the recorded image {n̂1, n̂2, n̂3, ..., n̂N²} consists of N² independent and hence uncorrelated Poisson-distributed random variables and has the probability of occurrence
P{n̂1, n̂2, ..., n̂N²} = Π_{l=1}^{N²} exp(−λsT pl)(n̂l!)⁻¹(λsT pl)^(n̂l)    (19)
As n̂l is Poisson distributed according to Eq. (12), the mathematical expectation value of n̂l is given by

E{n̂l} = λsT pl    (20)

and the variance is given by

var{n̂l} = λsT pl    (21)
The stochastic image process is completely specified statistically by Eqs. (9), (10), (11), and (19). The stochastic properties of the data will play a dominant role in subsequent sections, which deal with the extraction and evaluation of information from the low-dose images.
II. OBJECT WAVE RECONSTRUCTION

A. Introduction and Review
The reconstruction of the object wave function is of great importance for the imaging of the structure of biological materials, especially at high resolution by means of an electron microscope. This is due to the relation between the object wave function and the electrostatic potential of the object,
as has been briefly touched upon in Section I. In Section I,D it is indicated that the interaction between the incoming electrons and the specimen is described by a shift in the phase of the electron wave function. The amount of phase shift corresponds to the projection in the propagation direction of the object's electrostatic potential. The information about the specimen structure is thus not related to a directly measurable physical quantity, a situation which leads to the so-called phase problem. Only the intensity of a wave function, which is proportional to its squared modulus, can be recorded, for example, on a photographic plate. In this section we will first discuss the properties of the various methods and algorithms which have been proposed in the literature for solving the phase problem when they are applied to low-dose imaging conditions. We hereby exclude periodic objects, so that the noise-reducing technique of averaging over the periodic repetition of image cells cannot be applied (Unwin and Henderson, 1975). Apart from analytical techniques such as zero flipping, which are mainly of theoretical importance, basically three algorithms exist for the retrieval of the phase of the object wave function. The elegant method proposed by Frank (1973) will in practical situations run into difficulties and is omitted here, together with the Gerchberg-Saxton algorithm (Gerchberg and Saxton, 1972), which can give nonunique results even for the noise-free case. The three methods discussed in this section are: (1) Newton-Kantorovich approach (Van Toorn and Ferwerda, 1976; Roger, 1981). In this approach the nonlinear equations, describing the object wave function in terms of recorded intensities, are expanded as a kind of Taylor expansion based on a first (intelligent) guess or starting from prior information about the solution to be obtained.
Based on the first-order term of this expansion, the wave function is updated; the updated wave function is the starting point for a new expansion, and the process is repeated. Owing to the noisy data, it is difficult to formulate a criterion to indicate that the calculational procedure has converged. Too stringent an adaptation to the noisy data has to be prevented. A further problem with this method concerns the convergence to a correct solution when the initial starting function is not close to it. Owing to the iterative nature of the procedure, the influence of the noise on the results obtained is difficult to quantify analytically. The determination of even the expectation value and the variance of the obtained results is a hard task. However, some insight can be obtained by Monte Carlo studies. Analytically tractable is the case of an initial starting solution which is close to the true solution, when only one correction update of the wave function is necessary. However, the statistical analysis of this situation is not of great value, as not much improvement is to be expected from adaptation to noisy data. What is most likely to happen is that the good initial a priori solution will be corrupted
by noise and that the procedure ends up with a result that is inferior to the a priori information. (2) Misell's algorithm (Misell, 1973b). This is also an iterative procedure, based on two exposures which are defocused with respect to each other. Starting from an initial image wave function with the correct modulus according to the first exposure and with an arbitrary or even random phase, the image wave function belonging to the second exposure is calculated. The modulus of this wave function is corrected to satisfy the second exposure. Based on this corrected wave function, the wave function corresponding to the first exposure is calculated, the correct modulus is enforced, etc. When this computational scheme has converged, the object wave function is obtained by inversion of the integral equation [cf. Eq. (3)] which relates the object and the image wave function to each other. With the Misell algorithm the same problems arise with respect to the convergence and the influence of noise on the result as with the Newton-Kantorovich approach. Owing to quantum noise the data of the two exposures are not consistent with each other, so that in a strict mathematical sense there exists no solution, and consequently the algorithm cannot be expected to converge. (3) The direct approach (Van Toorn and Ferwerda, 1976). This method is also based on two exposures with a different defocusing parameter. The wave function in the exit pupil is calculated by solving two coupled nonlinear Volterra integral equations of the first kind. The algorithm is not an iterative procedure and is therefore of potential interest for evaluating the obtained solution statistically. The algorithm is very sensitive to noise, as has been reported by Van Toorn et al. (1978), because of error accumulation. Of all methods the approach of solving the integral equations directly seems to be the most relevant for the analysis of the influence of the stochastic data on the reconstructed wave function.
Iterative procedures are not tractable for statistical evaluation; therefore, in the next two sections the direct method is analyzed in greater detail.
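In the noise-free case the Misell-type alternation between two defocused exposures is straightforward to sketch. The following one-dimensional toy implementation uses an assumed defocus-only aberration and an assumed weak phase object; it illustrates the alternating modulus enforcement only, and is not the book's algorithm or parameters:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 256
xi = np.fft.fftfreq(n)                      # pupil coordinate (arbitrary units)

def defocus_phase(d):
    # Toy defocus-only aberration function, gamma(xi) = pi * d * xi^2.
    return np.pi * d * xi**2

# Assumed unknown object: a weak phase object, exp(i*alpha) with beta = 0.
alpha = 0.5 * np.fft.ifft(np.fft.fft(rng.normal(size=n)) * (np.abs(xi) < 0.1)).real
psi0 = np.exp(1j * alpha)

def image_wave(d):
    return np.fft.ifft(np.fft.fft(psi0) * np.exp(-1j * defocus_phase(d)))

d1, d2 = 30.0, 80.0
m1, m2 = np.abs(image_wave(d1)), np.abs(image_wave(d2))   # the two "exposures"

# Alternate between the two planes, enforcing the measured modulus in each.
shift = np.exp(-1j * (defocus_phase(d2) - defocus_phase(d1)))
psi = m1.astype(complex)                     # initial guess: zero phase
errors = []
for _ in range(300):
    psi2 = np.fft.ifft(np.fft.fft(psi) * shift)           # plane 1 -> plane 2
    psi2 = m2 * np.exp(1j * np.angle(psi2))               # enforce |psi| = m2
    psi = np.fft.ifft(np.fft.fft(psi2) * np.conj(shift))  # plane 2 -> plane 1
    errors.append(float(np.linalg.norm(np.abs(psi) - m1)))
    psi = m1 * np.exp(1j * np.angle(psi))                 # enforce |psi| = m1
print(errors[0], errors[-1])
```

The enforced-modulus error is nonincreasing here because the two exposures are exactly consistent; with Poisson noise in the data this consistency is lost, which is precisely the convergence difficulty noted above.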
B. Derivation of the Basic Equation
In this section we derive the basic integral equation which relates the object wave function to a recorded intensity distribution in the image plane. In order to keep the equations as simple as possible, we treat in this chapter one lateral dimension of the images only. For electron microscopes with square diaphragms (if there are any), the extension to two lateral dimensions is straightforward. With a circular symmetry the mathematical treatment becomes more complicated. However, this elaborate analysis will not yield
more insight than the case of one lateral dimension, which is treated in this section. The illuminating electron beam is described by a one-electron wave function. For this we assume a quasimonochromatic plane wave propagating in the direction of the optical axis of the imaging system, which we consider to be isoplanar. The object wave function is written as

ψ0(x0) = exp[iα(x0) − β(x0)]    (22)
where the phase term α(·) denotes the phase shift due to the object's electrostatic potential and where the attenuation term β(·) describes the removal of inelastically scattered electrons from the imaging beam by the appropriate energy filter lens. The geometry of the electron microscope being considered here with one lateral dimension is presented in Fig. 4. The optical system is characterized by the relations between the wave functions in the three planes: object plane, exit pupil, and image plane. The wave function in the exit pupil ψp(·) is related to the object wave function ψ0(·) by

ψp(ξ) = exp[−iγ(ξ)] ∫_{−d}^{+d} ψ0(x0) exp(−2πix0ξ) dx0    (23)
where γ(·) is the aberration function in the exit pupil. The coordinate ξ in the exit pupil is measured in units of f, the focal length of the imaging system. In
FIG. 4. Schematic diagram of the imaging system for the case of axial illumination (specimen holder and object plane z = z0 = 0, exit pupil z = zp, image plane z = zi; coherent illumination along the optic axis).
the object plane x0 is measured in units of λ, the wavelength of the accelerated electrons. In the image plane the coordinate x is measured in units of Mλ, with M the (lateral) magnification. The aberration function γ(·) for an isoplanar system is given by

γ(ξ) = 2πλ⁻¹(¼Csξ⁴ + ½Dξ²)    (24)

where only the spherical aberration with coefficient Cs and the defocusing with coefficient D have been taken into account, neglecting higher-order aberrations such as coma and astigmatism. The image wave function ψ(·) is related to the pupil wave function ψp(·) by

ψ(x) = ∫_{−ε}^{+ε} ψp(ξ) exp(2πixξ) dξ    (25)
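To get a feeling for the aberration function of Eq. (24), the phase can be evaluated for assumed typical values, here written in the common angular form with the scattering angle as variable; the numbers below (100 keV wavelength, Cs = 2 mm, 1.2 μm underfocus) are assumptions for illustration only:

```python
import math

wavelength = 3.7e-12        # electron wavelength at ~100 keV, metres (assumed)
Cs = 2.0e-3                 # spherical aberration, within the 1-3 mm range
D = -1.2e-6                 # defocus, metres (assumed underfocus)

def gamma(theta):
    # Eq.-(24)-type aberration function, written in angular form (assumption).
    return (2.0 * math.pi / wavelength) * (0.25 * Cs * theta**4 + 0.5 * D * theta**2)

# Phase-contrast transfer goes as sin(gamma); sample a few angles (rad).
for theta in (0.0, 2e-3, 5e-3, 10e-3):
    print(theta, math.sin(gamma(theta)))
```

At zero angle γ vanishes, so sin γ = 0 and no phase contrast is transferred; at finite angles the defocus and spherical aberration terms supply the phase shifts that make the phase structure visible.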
As has been discussed in Section I,G the recorded low-dose image consists of a (one-dimensional) array of statistically independent, Poisson-distributed random counts n̂ = (n̂−N/2, n̂−N/2+1, ..., n̂N/2−1). The Shannon number N equals 2 × 2d × 2ε because the intensity ψψ* in the image plane has a bandwidth of 4ε. The random counts n̂k (k = −½N, ..., ½N − 1) represent the electrons which have arrived in the kth image cell. In Section I,G it has been shown that n̂k is a Poisson-distributed random variable with intensity parameter λk equal to its expectation value, given by

λk = E{n̂k} = λsT(2d)⁻¹ ∫_{ak} ψ(x)ψ*(x) dx    (26)

where ak denotes the kth image cell: (4ε)⁻¹k − (8ε)⁻¹ < x < (4ε)⁻¹k + (8ε)⁻¹. Calculating the Fourier transform of the stochastic image according to
ĉ(ξ) = Σ_{k=−N/2}^{N/2−1} n̂k exp[2πi(4ε)⁻¹kξ]    (27)
we obtain the stochastic function ĉ(·). For reasons of convenience ĉ(·) is defined here for a continuum of values of ξ. In practice Eq. (27) will be calculated using a fast Fourier transform (FFT) algorithm, which results in ĉ(ξl) for a discrete set of ξ values ξl. Where appropriate we will use this discrete representation. The function ĉ(·) is a complex stochastic function, and its statistical properties are studied in Appendix A. In Appendix A it is shown that the autocorrelation function R(·,·) of the complex stochastic process Eq. (27) is given by
R(ξ1, ξ2) = E{ĉ(ξ1)ĉ*(ξ2)} = E{ĉ(ξ1)}E{ĉ*(ξ2)} + Σk λk exp[2πi(4ε)⁻¹k(ξ1 − ξ2)]    (28)

from which it follows that ĉ(ξ1) and ĉ(ξ2) are correlated (ξ1 ≠ ξ2).
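The second-moment structure of Eq. (28) can be checked by simulation: generate many realizations of the counts, apply a DFT as in Eq. (27), and compare the variance of one Fourier bin with Σk λk. (NumPy's FFT uses the opposite sign convention from Eq. (27), which does not affect the second moments; all parameter values below are assumed.)

```python
import numpy as np

rng = np.random.default_rng(3)
N = 16
lam = rng.uniform(1.0, 5.0, size=N)          # cell parameters lambda_k (assumed)
reps = 5000

# Many realizations of the recorded counts, Fourier-transformed per Eq. (27).
counts = rng.poisson(lam, size=(reps, N))
c = np.fft.fft(counts, axis=1)

# Eq. (28) at xi_1 = xi_2: the variance of c(xi) equals sum_k lambda_k,
# since the Poisson counts are independent with var{n_k} = lambda_k.
m = 3
empirical = np.mean(np.abs(c[:, m] - c[:, m].mean())**2)
print(empirical, lam.sum())
```

The empirical variance of the Fourier bin agrees with Σk λk to within sampling error, illustrating that the noise term in Eq. (28) is the same for every frequency pair separation.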
The basic integral equation is derived from Eq. (27) by taking the expectation value of both sides. Using Eq. (26) this results in

E{ĉ(ξ)} = λsT(2d)⁻¹ Σ_{k=−N/2}^{N/2−1} exp[2πi(4ε)⁻¹kξ] ∫_{ak} ψ(x)ψ*(x) dx    (29)
Substituting Eq. (25) and performing the integration over x yields
an expression for E{ĉ(ξ)} containing the cell-averaging factor sin[2π(8ε)⁻¹ξ]/(πξ) together with the phase factors exp[2πi(4ε)⁻¹kξ].

To first order in α and β the image intensity distribution is

I(x, y) = 1 + 2 ∫∫ dξ dη sin[γ(ξ, η)] ∫∫ dx0 dy0 α(x0, y0) exp{2πi[(x − x0)ξ + (y − y0)η]}
            − 2 ∫∫ dξ dη cos[γ(ξ, η)] ∫∫ dx0 dy0 β(x0, y0) exp{2πi[(x − x0)ξ + (y − y0)η]}    (81)
In the deterministic (noise-free) case the image intensity distribution is proportional to Eq. (81). This situation applies when the total number of electrons involved in the image formation is infinite. In low-dose imaging the recorded image intensity is a realization of a stochastic process. The following briefly outlines the characterization of this stochastic process.
IMAGE HANDLING IN ELECTRON MICROSCOPY
233
1. Low-Dose Image Recording

The stochastic process that characterizes the low-dose image has been described extensively in Section I,G; we therefore only recapitulate the main properties here. In low-dose imaging the emissions of electrons by the source are statistically independent events. Therefore the total number of electrons n̂_T emitted during the exposure time T is a random variable distributed according to the Poisson distribution
P{n̂_T = k} = exp(−λ_s T)(λ_s T)^k / k!,   k = 0, 1, 2, ...   (82)
where the source intensity λ_s is the mean number of electron emissions per second. The image intensity is assumed to be recorded in the following idealized way. The image plane is divided into a large number N² of identical nonoverlapping squares. It is assumed that each image cell exactly counts all the electrons arriving at the cell. Consequently a recorded image consists of an N × N array of independent random counts n̂_{k,l}, (k,l) = {1, ..., N}. In low-dose imaging the probability that an electron which is emitted by the source will arrive in the (k,l)th image cell is given by

p_{k,l} = ∫∫_{α_{k,l}} dx dy ψ(x,y)ψ*(x,y)   (83)

where α_{k,l} denotes the area of the (k,l)th image cell. The recorded image is a realization of a stochastic Poisson process, characterized by

P{n̂_{k,l} = n} = exp(−λ_{k,l})(λ_{k,l})^n / n!   (84)

The random counts n̂_{k,l} are Poisson-distributed random variables with the parameter

λ_{k,l} = λ_s T p_{k,l}   (85)
The image wave function is a band-limited function of bandwidth ε. In Eq. (80) only the linear terms have been taken into account; in this approximation the image intensity therefore has bandwidth ε as well. Applying Whittaker-Shannon sampling to the image results in N² image cells, with the Shannon number N equal to 4dε. In the next subsection the consequences of the noise in the data for the reconstruction of α(·,·) and β(·,·) are treated in detail.
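As a numerical illustration of Eqs. (82)-(85) (an illustrative sketch, not taken from the text; the intensity pattern and all numbers are invented), the equivalence between "a Poisson-distributed total count distributed multinomially over the cells according to p_{k,l}" and "independent Poisson counts with parameters λ_{k,l} = λ_s T p_{k,l}" can be checked directly:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 16
x = np.arange(N)
# An invented smooth, positive intensity pattern standing in for ψψ*
psi2 = 1.0 + 0.5 * np.sin(2 * np.pi * x[:, None] / N) * np.cos(2 * np.pi * x[None, :] / N)
p = (psi2 / psi2.sum()).ravel()            # arrival probabilities p_kl, cf. Eq. (83)
lamT = 200.0                               # λs·T: mean number of emitted electrons

trials = 5000
counts = np.empty((trials, N * N))
for t in range(trials):
    total = rng.poisson(lamT)              # total emitted electrons, Eq. (82)
    counts[t] = rng.multinomial(total, p)  # electrons distributed over the cells

lam_kl = lamT * p                          # predicted cell parameters, Eq. (85)
emp_mean = counts.mean(axis=0)
emp_var = counts.var(axis=0)               # for a Poisson variable, variance = mean
```

The assertions below confirm that each cell behaves as an independent Poisson variable with mean and variance equal to λ_s T p_{k,l}, which is the property used throughout this section.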
2. Object Wave Reconstruction
We will determine the object wave function from two low-dose images with different defocusing parameters D. The Poisson-distributed integer data values are transformed into new random variables ŝ_{k,l} by subtracting the background intensity and scaling, as follows:

ŝ_{k,l} = N²(λ_s T)⁻¹ n̂_{k,l} − 1   (86)

With Eq. (86) a representation of the recorded image is obtained with the sample values ŝ(k/2ε, l/2ε), which are denoted by ŝ_{k,l}. The data n̂_{k,l} are integers; the number of significant decimals in ŝ_{k,l} is therefore governed by the value of N²(λ_s T)⁻¹, and further decimals are beyond retrieval and are discarded here. For the (mathematical) expectation value s_{k,l} of ŝ_{k,l} we have E{ŝ_{k,l}} = s_{k,l}, which follows from Eq. (81) by integrating the image contrast over the cell α_{k,l} [Eq. (87)],
where α_{k,l} is the area of the (k,l)th image cell, which is given by (2ε)⁻¹k − (4ε)⁻¹ ≤ x ≤ (2ε)⁻¹k + (4ε)⁻¹, (2ε)⁻¹l − (4ε)⁻¹ ≤ y ≤ (2ε)⁻¹l + (4ε)⁻¹. The Fourier transform ĉ(ξ,η) of the transformed data values is defined by

ĉ(ξ,η) = Σ_{k=−N/2}^{N/2−1} Σ_{l=−N/2}^{N/2−1} ŝ_{k,l} exp[2πi(2ε)⁻¹(kξ + lη)]   (88)
For reasons of convenience the function ĉ(ξ,η) is defined for a continuum of values of ξ and η. In actual practice Eq. (88) will be calculated using a fast Fourier transform (FFT) algorithm, which yields ĉ(ξ_p, η_q) for a discrete set of ξ_p and η_q values. The function ĉ(·,·) is a complex stochastic function, the properties of which are described in Appendix B. As is discussed in this appendix, the discrete representation of ĉ(·,·) has the advantage of consisting of uncorrelated random variables. When explicit calculations need to be performed with ĉ(·,·), we shall return to this discrete representation. Appendix B shows that, to a good approximation (for not too few electrons per sample cell), the probability distribution of ĉ(·,·) is complex Gaussian, with its mean equal to the true, deterministic function and a (nearly) constant variance N⁴(λ_s T)⁻¹. This variance is a microscope parameter, because its value depends only on the illumination dose and the resolution (i.e., the area of the sample cell). With Eq. (87) we can write for the expectation value of ĉ(·,·), by carrying out the integrations over x and y and evaluating the summations over k and l,
an expression containing, for each pair of frequencies, the Dirichlet kernel

exp[−πi(2ε)⁻¹(ξ − ξ″ + η − η″)] sin[2π(2ε)⁻¹ ½N(ξ − ξ″)] / sin[π(2ε)⁻¹(ξ − ξ″)]

together with the corresponding factor in η − η″. For a sufficiently large number of electrons per cell (λ_s T N⁻² ≫ 1), we obtain the following set of equations [which are the discrete counterparts of Eq. (94)] by which ᾱ(·,·) and
β̄(·,·) are estimated:

α̂(k/2ε, l/2ε) = N⁻² Σ_{k′=−N/2}^{N/2−1} Σ_{l′=−N/2}^{N/2−1} [(k′,l′) ≠ (0,0)] exp[−2πiN⁻¹(kk′ + ll′)] × {2 sin[πλ⁻¹ΔD(2d)⁻²(k′² + l′²)]}⁻¹ × {cos[γ₂(k′/2d, l′/2d)] ĉ⁽¹⁾_{k′,l′} − cos[γ₁(k′/2d, l′/2d)] ĉ⁽²⁾_{k′,l′}}

β̂(k/2ε, l/2ε) = N⁻² Σ_{k′=−N/2}^{N/2−1} Σ_{l′=−N/2}^{N/2−1} exp[−2πiN⁻¹(kk′ + ll′)] × {2 sin[πλ⁻¹ΔD(2d)⁻²(k′² + l′²)]}⁻¹ × {sin[γ₁(k′/2d, l′/2d)] ĉ⁽²⁾_{k′,l′} − sin[γ₂(k′/2d, l′/2d)] ĉ⁽¹⁾_{k′,l′}}   (101)

where ĉ⁽¹⁾_{k′,l′} and ĉ⁽²⁾_{k′,l′} denote the discrete Fourier coefficients [Eq. (88)] of the two images recorded with defocus values D₁ and D₂ = D₁ + ΔD, and γ₁ and γ₂ are the corresponding wave aberration functions.
In the equation for α̂(·,·) in Eq. (101) the point k′ = l′ = 0 must be excluded from the summation because in this case the denominator is equal to zero. In the expression for β̂(·,·) this precaution is not necessary because there the numerator is also equal to zero for k′ = l′ = 0. With Eq. (101) we have obtained estimates of the band-limited approximations ᾱ(·,·) and β̄(·,·) to the functions α(·,·) and β(·,·) in the object wave function, respectively. We now proceed to investigate the stochastic properties of Eqs. (101). Restricting ourselves to the significant decimals of the ŝ_{k,l} variables, we notice that the statistics of Eq. (101) are unbiased:
E{α̂(k/2ε, l/2ε)} = ᾱ(k/2ε, l/2ε),   (k,l) = {−½N, ..., ½N − 1}
E{β̂(k/2ε, l/2ε)} = β̄(k/2ε, l/2ε),   (k,l) = {−½N, ..., ½N − 1}   (102)
The variances of the two statistics of Eq. (101) will be calculated separately. We first consider the object amplitude function β̂(·,·). From Eq. (101) we note that the random variable β̂(·,·) is a weighted sum of Gaussian random variables, and therefore it is likewise a Gaussian variable. The expectation value of β̂(·,·) is the true β̄(·,·), as is expressed by Eq. (102), and the variance, which defines σ_β², is given by
var{β̂(k/2ε, l/2ε)} = (4λ_s T)⁻¹ Σ_{k′=−N/2}^{N/2−1} Σ_{l′=−N/2}^{N/2−1} {sin[πλ⁻¹ΔD(2d)⁻²(k′² + l′²)]}⁻² × {sin²[γ₁(k′/2d, l′/2d)] + sin²[γ₂(k′/2d, l′/2d)]} = (4λ_s T)⁻¹σ_β²,   (k,l) = {−½N, ..., ½N − 1}   (103)
From Eq. (103) we see that the variance of β̂(·,·) does not depend on (k,l). The value of the constant variance (4λ_s T)⁻¹σ_β² is fully determined by the microscope parameters. From the definition of γ(·,·) [cf. Eq. (74)] we note that every term in the summation is finite, provided that ΔD [cf. Eq. (95)] is chosen in such a way that sin[πλ⁻¹ΔD(2d)⁻²(k′² + l′²)] is zero only in (k′,l′) = (0,0). For (k′,l′) = (0,0) the term in the summation in Eq. (103) takes the finite value (ΔD)⁻²(D₁² + D₂²). To summarize, the object amplitude function β̂(·,·), estimated by Eq. (101), can be described as the sum of the true β̄(·,·) value plus a signal-independent Gaussian stochastic process with zero mean and constant variance:
β̂(k/2ε, l/2ε) = β̄(k/2ε, l/2ε) + N(0, (4λ_s T)⁻¹σ_β²),   (k,l) = {−½N, ..., ½N − 1}   (104)

where N(0, σ²) denotes a normally distributed random variable with mean equal to zero and a variance of σ². Next we turn to the object phase function ᾱ(·,·). From Eq. (101) we see that α̂(·,·) is also a weighted sum of Gaussian random variables. Therefore α̂(·,·) is a Gaussian random variable; its expectation value is the true ᾱ(·,·) value [cf. Eq. (102)], and its variance is given by
var{α̂(k/2ε, l/2ε)} = (4λ_s T)⁻¹ Σ_{k′=−N/2}^{N/2−1} Σ_{l′=−N/2}^{N/2−1} [(k′,l′) ≠ (0,0)] {sin[πλ⁻¹ΔD(2d)⁻²(k′² + l′²)]}⁻² × {cos²[γ₂(k′/2d, l′/2d)] + cos²[γ₁(k′/2d, l′/2d)]},   (k,l) = {−½N, ..., ½N − 1}   (105)
Unlike the situation with the object amplitude function, not every term in the summation in Eq. (105) is finite. If the term belonging to (k′,l′) = (0,0) had not been excluded from the summation, an infinite variance would have resulted. The variance, Eq. (105), is nevertheless still dominated by the (k′,l′) combinations close to (0,0), where the denominator of Eq. (105) is very small while the numerator is of order unity. The estimated object phase function α̂(·,·) obtained from Eq. (101) therefore appears to be very sensitive to noise. The variance of α̂(·,·) is considerably larger than the variance of the object amplitude function β̂(·,·), in fact so large that it is questionable whether this quantity is of any use at all. We will now investigate the mechanism which is responsible for the extremely high variance of α̂(·,·). For this purpose it is necessary to check the contributions of the different Fourier coefficients of the phase function ᾱ(·,·) to the data n̂_{k,l}. From Eq. (81) we arrive at the following expression for the contribution of the phase function α(·,·) to the recorded image intensity:

E{Δn̂_{k,l}} = 2λ_s T N⁻² ∫ dξ ∫ dη sin[γ(ξ,η)] ∫ dx₀ ∫ dy₀ α(x₀,y₀) × exp[2πi{(k(2ε)⁻¹ − x₀)ξ + (l(2ε)⁻¹ − y₀)η}]   (106)
where Δn̂_{k,l} denotes the contrast resulting from the phase function alone. To a good approximation we can substitute for this expression

E{Δn̂_{k,l}} = 2λ_s T N⁻² Σ_{k′=−N/2}^{N/2−1} Σ_{l′=−N/2}^{N/2−1} sin[γ(k′/2d, l′/2d)] exp[2πiN⁻¹(kk′ + ll′)] × N⁻² Σ_r Σ_s α(r/2ε, s/2ε) exp[−2πiN⁻¹(rk′ + sl′)]   (107)

In order to be detectable, the right-hand side of Eq. (107) must have a numerical value of at least one, as Δn̂_{k,l} has to be an integer. The term N⁻² Σ_r Σ_s α(r/2ε, s/2ε) exp{−2πiN⁻¹(rk′ + sl′)} in Eq. (107) is of the same order of magnitude as ᾱ(·,·), which is known to be less than one. Taking ᾱ(·,·) to be approximately 0.1, a rough guess is obtained as to which Fourier coefficients do contribute to the phase contrast. In order to observe phase contrast, the relation of Eq. (108) now applies.
In order for a specific Fourier coefficient (k′,l′) to contribute to the image intensity n̂, the more stringent condition must be imposed

|sin[γ(k′/2d, l′/2d)]| ≥ 5N²(λ_s T)⁻¹   (109)
From Eq. (74) we observe that γ(·,·) is a function of (2d)⁻²(k′² + l′²). Therefore we can find a circle around (0,0) with a radius q given by

k′² + l′² ≤ q²   (110)

with k′ and l′ such that

|sin[γ(k′/2d, l′/2d)]| ≤ 5N²(λ_s T)⁻¹   (111)
The recorded intensity distribution n̂ does not contain information about the Fourier coefficients (k′,l′) inside the circle with radius q; therefore these coefficients cannot be retrieved. In Eq. (105) we already observed that the variance of α̂(·,·) is dominated by the (k′,l′) combinations close to (0,0). If the Fourier coefficients (k′,l′) satisfying Eqs. (110) and (111) are excluded from the computation of α̂(·,·) in Eq. (101), the value of the noise variance in Eq. (105) is improved considerably, while nevertheless all the phase information contained in the images is used. This in fact represents a bandpass filtration of the object phase function α(·,·): in addition to the frequency components above ε, the low-frequency Fourier coefficients are filtered away also. Therefore we do not
reconstruct the ᾱ(·,·) function itself. Instead a filtered version ᾱ_q(·,·) is computed which does not contain frequency components in the circle with radius q. The bandpass-filtered object phase function ᾱ_q(·,·) is estimated by [cf. Eq. (101)]
α̂_q(k/2ε, l/2ε) = N⁻² Σ_{k′,l′: k′²+l′² > q²} exp[−2πiN⁻¹(kk′ + ll′)] × {2 sin[πλ⁻¹ΔD(2d)⁻²(k′² + l′²)]}⁻¹ × {cos[γ₂(k′/2d, l′/2d)] ĉ⁽¹⁾_{k′,l′} − cos[γ₁(k′/2d, l′/2d)] ĉ⁽²⁾_{k′,l′}},   (k,l) = {−½N, ..., ½N − 1}   (112)
The bandpass-filtered object phase function of Eq. (112) can be described as the sum of the expectation value of α̂_q(·,·), which is equal to the true ᾱ_q(·,·) value, plus a signal-independent Gaussian stochastic process with zero mean and a constant variance (4λ_s T)⁻¹σ²_{αq}. This variance follows from [cf. Eq. (105)]
var{α̂_q(k/2ε, l/2ε)} = (4λ_s T)⁻¹ Σ_{k′,l′: k′²+l′² > q²} {sin[πλ⁻¹ΔD(2d)⁻²(k′² + l′²)]}⁻² × {cos²[γ₂(k′/2d, l′/2d)] + cos²[γ₁(k′/2d, l′/2d)]} = (4λ_s T)⁻¹σ²_{αq},   (k,l) = {−½N, ..., ½N − 1}   (113)
To summarize, the bandpass-filtered object phase function α̂_q(·,·) estimated by Eq. (112) can be expressed as
α̂_q(k/2ε, l/2ε) = ᾱ_q(k/2ε, l/2ε) + N(0, (4λ_s T)⁻¹σ²_{αq}),   (k,l) = {−½N, ..., ½N − 1}   (114)
where the relation between the bandpass-filtered ᾱ_q(·,·) and the true ᾱ(·,·) function is represented by

ᾱ_q(k/2ε, l/2ε) = N⁻² Σ_r Σ_s ᾱ(r/2ε, s/2ε) Σ_{k′,l′: k′²+l′² > q²} exp{2πiN⁻¹[(k − r)k′ + (l − s)l′]},   (k,l) = {−½N, ..., ½N − 1}   (115)
Equations (101), (103), (104), (112), (113), and (114) are the principal results of this section. For examples of the reconstruction of object wave functions from simulated low-dose images involving the bandpass filtering, the interested reader is referred to Slump and Ferwerda (1982). The next paragraph is devoted to a discussion of a reconstruction algorithm for the object wave
function which is also capable of reconstructing the lower Fourier coefficients of the object phase function.
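The two-image estimator of Eq. (101) together with the bandpass filtering of Eqs. (110)-(112) can be sketched numerically. The sketch below is not from the original text: the aberration model (pure defocus, γ_i proportional to D_i(k′² + l′²)), all numerical values, the noise model, and the normalization and sign conventions of the linear forward model are assumptions, chosen so that the inversion is exact in the noise-free case.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 64
k = np.fft.fftfreq(N, d=1.0 / N)                 # integer spatial frequencies k', l'
KX, KY = np.meshgrid(k, k, indexing="ij")
R2 = KX**2 + KY**2

# Assumed pure-defocus aberration: γ_i(k', l') = coef · D_i · (k'^2 + l'^2),
# where coef stands for π λ^-1 (2d)^-2; the numbers are illustrative only.
coef, D1, D2 = 1.0e-6, 500.0, 900.0
g1, g2 = coef * D1 * R2, coef * D2 * R2

# Ground-truth Fourier coefficients A, B of the phase and amplitude parts
A = np.fft.fft2(rng.standard_normal((N, N)))
B = np.fft.fft2(rng.standard_normal((N, N)))

# Linearized weak-object model of the two defocused images (Fourier-space data),
# plus Gaussian noise standing in for the complex Gaussian statistics of ĉ
sigma = 0.005
c1 = 2 * np.sin(g1) * A - 2 * np.cos(g1) * B + sigma * rng.standard_normal((N, N))
c2 = 2 * np.sin(g2) * A - 2 * np.cos(g2) * B + sigma * rng.standard_normal((N, N))

# Bandpass mask of Eqs. (110)-(111): discard coefficients with k'^2 + l'^2 <= q^2
q = 6.0
mask = R2 > q**2

# Estimators in the spirit of Eqs. (101)/(112); B drops out of A_hat and vice versa
den = 2.0 * np.sin(g1 - g2)                      # proportional to 2 sin[πλ^-1 ΔD (2d)^-2 (k'^2+l'^2)]
A_hat = np.zeros_like(A)
B_hat = np.zeros_like(B)
A_hat[mask] = (np.cos(g2[mask]) * c1[mask] - np.cos(g1[mask]) * c2[mask]) / den[mask]
B_hat[mask] = (np.sin(g2[mask]) * c1[mask] - np.sin(g1[mask]) * c2[mask]) / den[mask]
```

In the noise-free limit the inversion is exact on the masked band; the mask removes exactly the (k′,l′) region near (0,0) where the denominator is small and the noise amplification described by Eq. (105) would dominate the phase estimate.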
C. Tilted Illumination

In the previous paragraph the low-dose reconstruction of a weak phase-amplitude object was discussed. The reconstruction algorithm was based on two defocused images obtained with axial illumination. It became evident that the lower spatial frequencies of the object phase structure are only weakly transmitted and therefore hardly contribute to the image contrast, especially when the applied electron dose is low. Low-dose imaging thus leads to large noise variances in the reconstructed phase part of the object wave function. In this section we discuss the properties of a promising though more elaborate method of reconstructing the object wave function in the context of low-dose electron microscopy. This reconstruction algorithm is based on several image intensity distributions, which are obtained by illuminating the object structure consecutively from different directions. In Fig. 7 a diagram of the imaging system with tilted illumination is presented. The illuminating electron beam is again described by a one-electron wave function. We assume coherent illumination by a plane wave exp(ik·r₀) with wave number k = |k| = 2π/λ, propagating in a direction which makes an angle θ₀ with the optic axis (see Fig. 7). In the simple model for thin weak scattering objects which is considered in this chapter, the object wave function is represented by
ψ₀(x₀,y₀) = exp[iα(x₀,y₀) − β(x₀,y₀)] ψ_b(x₀,y₀)   (116)
The wave ψ_b(·,·) represents the illuminating electron wave function; it is the object wave function in the absence of an object. The restriction to thin weak objects allows the approximation of the object wave function by

ψ₀(x₀,y₀) = [1 + iα(x₀,y₀) − β(x₀,y₀)] ψ_b(x₀,y₀)   (117)
Denoting the polar angles of the wave vector k by θ₀ and φ₀, we easily obtain

ψ_b(x₀,y₀) = exp(−2πi(x₀ sin θ₀ cos φ₀ + y₀ sin θ₀ sin φ₀))   (118)

where x₀ and y₀ are measured in units of λ and the position of the object plane is z₀ = 0. Defining the background wave function ψ_bg(·,·) as the image wave function in the absence of an object, we obtain, by substituting Eqs. (118) and (73) into Eq. (72) and carrying out the integration over x₀ and y₀,

ψ_bg(x,y) = ∫ dξ ∫ dη exp[−iγ(ξ,η) + 2πi(xξ + yη)] × {sin[2πd(ξ + sin θ₀ cos φ₀)] / π(ξ + sin θ₀ cos φ₀)} × {sin[2πd(η + sin θ₀ sin φ₀)] / π(η + sin θ₀ sin φ₀)}   (119)
FIG. 7. Schematic diagram of the imaging system for the case of tilted illumination: object plane z = 0, exit pupil z = z_p, image plane z = z_i.
Because of the numerically large value of d (the aperture in the object plane expressed in units of λ), the two sinc functions in Eq. (119) can be approximated by δ functions. This yields

ψ_bg(x,y) ≈ exp{−iγ(−sin θ₀ cos φ₀, −sin θ₀ sin φ₀) − 2πi(x sin θ₀ cos φ₀ + y sin θ₀ sin φ₀)}   (120)

In the derivation of Eq. (120) it is assumed that both sin θ₀ cos φ₀ and sin θ₀ sin φ₀ are contained in the interval; this corresponds to bright-field imaging. In the image plane, information about the object structure is contained in the image wave function ψ_i(·,·), defined by
+ 274x5 + y q ) ]
(121)
244
CORNELIS H. SLUMP A N D HEDZER A. FERWERDA
where the wave function in the exit pupil ψ_p(·,·) is given by

ψ_p(ξ,η) = ∫_{−d}^{d} dx₀ ∫_{−d}^{d} dy₀ [iα(x₀,y₀) − β(x₀,y₀)] exp{−2πi[(ξ + sin θ₀ cos φ₀)x₀ + (η + sin θ₀ sin φ₀)y₀]}   (122)
The squared modulus of the image wave ψ(·,·) is given by

ψ(x,y)ψ*(x,y) = |ψ_bg(x,y) + ψ_i(x,y)|²   (123)
Using Eq. (120) we obtain for the modulus of the image wave function

|ψ(x,y)| = [1 + ψ_bg(x,y)ψ_i*(x,y) + ψ_bg*(x,y)ψ_i(x,y) + ψ_i(x,y)ψ_i*(x,y)]^{1/2}   (124)
In the next subparagraph the recorded noisy image is expanded into a set of orthonormal functions. The properties of this expansion are then investigated.
1. Orthonormul Expansion Of the Low-dose Image The image wave function is a band-limited function of bandwidth E ; thus its squared modulus has bandwidth 2.5. In the next subparagraph we will show that the highest (spatial) frequency which is used in the reconstruction of the object wave function is equal to 3e. In order to improve the signal-to-noise ratio, we consider the squared modulus of the image wave function to have a bandwidth of ;E. Applying Whittaker-Shannon sampling to the image results in N 2 image cells, with N equal to ~ C J ~ The E . recorded image intensity is a realization of a stochastic Poisson process. The random counts Yik,l are Poisson-distributed random variables with intensity parameter
λ_{k,l} = λ_s T N⁻² ψ(k/3ε, l/3ε)ψ*(k/3ε, l/3ε)   (125)

which follows from Eq. (85) and the approximation of the integral in Eq. (83). We now expand the modulus of ψ(·,·) into a set of orthonormal functions. It is convenient to write the two-dimensional orthonormal functions as a direct product of two one-dimensional functions. From Eq. (124) we have the expansion

|ψ(x,y)| = Σ_m Σ_n a_{m,n} φ_m(x) φ_n(y)   (126)
The functions φ_m and φ_n are chosen to be orthonormal on the interval (−d, +d). This set of functions is complete if the indices m and n in Eq. (126) continue to infinity. This is not required here because the image is sampled in squares with sides of length (3ε)⁻¹. Within the cells the value of the functions is taken to be a constant.
From Eq. (125) it follows that

E{n̂_{k,l}} = λ_s T N⁻² [Σ_m Σ_n a_{m,n} φ_m(k/3ε) φ_n(l/3ε)]²   (127)
Our purpose is to estimate the expansion coefficients a = (..., a_{m,n}, ...) from the random variables n̂ = (..., n̂_{k,l}, ...) using Eqs. (84) and (127). As the variables n̂ are integers, the accuracy attainable in the coefficients a is limited to approximately (λ_s T)⁻¹N². The maximum-likelihood method claims the best estimate of a to be those values which maximize the likelihood function L(n̂, a). This function is the joint probability function of the observations: when the parameters a have their true values, L(n̂, a) is the probability of obtaining the recorded count pattern n̂, given by

L(n̂, a) = Π_{k,l} exp(−λ_{k,l})(λ_{k,l})^{n̂_{k,l}} / n̂_{k,l}!   (128)
In Appendix C the likelihood function [Eq. (128)] is used to determine the amount of information about the parameters a contained in the recorded image n̂. Closely related to this Fisher information matrix is the minimum achievable error variance of the parameters a, as expressed in the Cramér-Rao bound (see, for example, Kendall and Stuart, 1967; Van der Waerden, 1969; Van Trees, 1968). The estimated values for the parameters depend on the data; thus they also are random variables. Knowledge about their probability density function, or at least of the first two moments, is of as much importance as the values themselves. We will return to this subject further on. In order to simplify the estimation of the parameters a, the auxiliary variable ŝ_{k,l}, (k,l) = {−½N, ..., ½N − 1}, is introduced:

E{n̂_{k,l}} = λ_s T N⁻²(1 + s_{k,l})²   (129)

From Eq. (129), s_{k,l} is estimated by

ŝ_{k,l} = [N²(λ_s T)⁻¹ n̂_{k,l}]^{1/2} − 1   (130)

Appendix B shows that the probability density function of ŝ_{k,l} is to a good approximation Gaussian, with mean equal to s_{k,l} and variance equal to N²(4λ_s T)⁻¹. From Eqs. (127), (129), and (130), the relation between the auxiliary random variables ŝ_{k,l} and the parameters a is obtained as

1 + s_{k,l} = Σ_m Σ_n a_{m,n} φ_m(k/3ε) φ_n(l/3ε)   (131)

By using Eq. (131), the parameters a can be estimated either by the method of least squares or by the maximum-likelihood method, because the probability density function of each ŝ_{k,l} is Gaussian. The variances of ŝ_{k,l} do not depend
on a. Thus the estimated values for a are obtained by minimizing Q², which is defined as

Q² = Σ_k Σ_l {1 + ŝ_{k,l} − Σ_m Σ_n a_{m,n} φ_m(k/3ε) φ_n(l/3ε)}²   (132)
Minimizing Q² with regard to a results in the following expression for the parameters a_{p,q}:

Σ_k Σ_l {1 + ŝ_{k,l} − Σ_m Σ_n a_{m,n} φ_m(k/3ε) φ_n(l/3ε)} φ_p(k/3ε) φ_q(l/3ε) = 0   (133)

Using the orthonormality relations

Σ_k φ_p(k/3ε) φ_{p′}(k/3ε) = δ_{p,p′}   (134)

we obtain

â_{p,q} = Σ_k Σ_l (1 + ŝ_{k,l}) φ_p(k/3ε) φ_q(l/3ε)   (135)
For the expectation value of â_{p,q} we obtain from Eq. (131)

E{â_{p,q}} = a_{p,q}   (136)

thus the statistic of Eq. (135) is unbiased. We wish to remark here again that the number of significant figures which can be retrieved from the integer data values n̂ is limited to an accuracy of about (λ_s T)⁻¹N². As the probability density function of the ŝ_{k,l} variables is (approximately) Gaussian, as is shown in Appendix B, the probability density of â_{p,q} is also Gaussian. For the variance of â_{p,q} we find

var{â_{p,q}} = (4λ_s T)⁻¹N²,   (p,q) = {0, ..., N − 1}   (137)

The covariance matrix V of the estimated parameters is given by

V_{r,s;p,q} = E{[â_{r,s} − E{â_{r,s}}][â_{p,q} − E{â_{p,q}}]} = (4λ_s T)⁻¹N² δ_{r,p} δ_{s,q}   (138)

because of the statistical independence of the n̂ recordings (cf. Section I,G). From Eqs. (136) and (137) we conclude that the estimated parameters â of the expansion in Eq. (127) are uncorrelated and Gaussian distributed. The mean equals the true value given in Eq. (136), and the variance is given by Eq. (137), which shows that the variance is a quantity independent of the object.
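The square-root transformation of Eq. (130) and the (nearly) constant variance N²(4λ_s T)⁻¹ that underlies these results can be checked by simulation. The sketch below is illustrative only; the dose, the Shannon number, and the contrast value s are invented numbers:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 64
lamT = 4.0e5                                  # λs·T: the total illumination dose
s_true = 0.15                                 # assumed constant contrast value of one cell

lam_cell = lamT / N**2 * (1.0 + s_true) ** 2  # Eq. (129): E{n̂} = λsT N^-2 (1 + s)^2
n = rng.poisson(lam_cell, size=200000)        # many realizations of one cell count

s_hat = np.sqrt(N**2 * n / lamT) - 1.0        # the transformation of Eq. (130)

var_pred = N**2 / (4.0 * lamT)                # predicted variance N^2 (4 λsT)^-1
```

The variance of ŝ is indeed (nearly) independent of the value of s itself; this variance stabilization is what makes the simple least-squares statistic of Eq. (135) attain the Cramér-Rao bound.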
Moreover, when comparing the covariance matrix V in Eq. (138) with the Cramér-Rao bound in Appendix C, we see that they are identical. We therefore conclude that the expansion parameters a are efficiently estimated, i.e., estimated with the lowest achievable error variance. Equation (135) is therefore an efficient statistic, and all the information that is contained in the data is converted into estimated values. In the next subparagraph we use the expansions of the recorded images to reconstruct the object wave function.

2. Reconstruction of the Object Wave Function
In this subparagraph relations are derived between the object wave function and the recorded low-dose images. In order to do so we determine the wave function in the exit pupil using the orthonormal expansion of the image data described in the previous subsection. Using Eqs. (124) and (136) we obtain
where we neglected the squared modulus of ψ_i(·,·) and approximated the square root by the first two terms of its Taylor expansion. This is an admissible approximation because we have restricted ourselves to weak objects. Until now we have not specified the orthonormal functions to be used in the expansion in Eq. (127). Because of their convenient properties under Fourier transformation, we choose the prolate spheroidal functions (Slepian and Pollak, 1961). An overview of the properties of these functions is given, for example, by Frieden (1971). The prolate spheroidal functions are the eigenfunctions of the finite Fourier transform operator, and they are defined by the integral equations given by Slepian (1964).
TABLE II. Experimental covariance, negative bound, and Cramér-Rao bound for the parameters of Eq. (199), estimated from Poisson-distributed data with the following parameter setting: N = 101, a = 0.3, ε = 10⁻³, p = d/3 = 833.33, d = 2500, d/4 = 625.0. A horizontal bar indicates that the value in question could not be calculated due to severe nonparabolic behavior of the log-likelihood function for the corresponding parameter, caused by the neighboring local extrema. The size of the sampling cell is 4 Å²; hence the dose values correspond with 0.5, 1.0, 1.5, and 2.0 e/Å².
example. The information about the total number of local extrema present in a domain is of great value, as it tells us whether an iterative procedure to locate all of the zeros of the likelihood equations has missed a zero. This information is provided by evaluating an integral derived by Picard (1892) from previous work by Kronecker (1878) at the end of the nineteenth century. The integrands contain relatively simple algebraic quantities involving derivatives up to the third order of the log-likelihood function. The integration must be performed over the domain of interest. For an extensive discussion of this so-called Kronecker-Picard (KP) integral, illustrated with examples, see Hoenders and Slump (1983). The Kronecker-Picard integral yields the exact number of zeros of a set of equations in a domain, provided that the zeros are simple; i.e., the Jacobian must not be equal to zero at these points.

C. Two-Dimensional Examples

The application of the maximum-likelihood method to the estimation problems of the previous section illustrates the possibilities and properties of this method in estimating a priori parameters in the evaluation of low-dose electron micrographs. Image data are, however, essentially two dimensional. Therefore, in this section a more realistic example is presented which is based on two-dimensional data. The estimation problem presented in this section is inspired by the second example of the previous paragraph [cf. Eq. (191)]. The a priori image intensity is assumed to be a function of 15 parameters:
λ_{k,l} = λ₀ (1 + Σ_{m=1}^{3} a_m exp{−½ s_m⁻² [k(2ε)⁻¹ − p_m]² − ½ r_m⁻² [l(2ε)⁻¹ − q_m]²})   (202)

with λ₀ = λ_s T N⁻² and λ_{k,l} = E{n̂_{k,l}}. Besides the two-dimensional data, a difference with the estimation problem in Eq. (191) is that the amplitudes a_m are not constrained to be smaller than unity. The problem of this section is the estimation of the parameters of the three Gaussian blobs from simulated low-dose images, with Poisson-distributed picture elements n̂_{k,l}, (k,l) = (−½N, ..., ½N − 1), of which the corresponding intensity λ_{k,l} is given by Eq. (202). The simulated images are presented in Fig. 12. The estimated values for the parameters are obtained by maximizing the log-likelihood function corresponding to Eq. (202). The numerical procedure is identical to the one used in the one-dimensional examples of the previous section, where the details are described. Table III summarizes the series of simulations estimating the parameters p, q, r, and s, performed with increasing electron dose λ_s, with fixed values for the amplitudes a and with the image data of Fig. 12. The amplitudes
FIG. 12. Simulated images used in the estimation calculations which are summarized in Table III. (a) contains the noise-free image corresponding to Eq. (202).
TABLE III
SUMMARY OF THE NUMERICAL SIMULATIONS OF THE ESTIMATION OF THE PARAMETERS p, q, r, AND s OF THE THREE GAUSSIAN BLOBS*

Dose λ₀ | p̂₁ | q̂₁ | r̂₁ | ŝ₁ | p̂₂ | q̂₂ | r̂₂ | ŝ₂ | p̂₃ | q̂₃ | r̂₃ | ŝ₃
8  | −2516.3 | 3127.9 | 3026.3 | 2059.8 | 5070.6 | 1578.8 | 2607.6 | 2070.9 | −1288.5 | −5181.4 | 3000.6 | 3660.5
16 | −2529.4 | 3170.2 | 3144.9 | 2084.0 | 5115.3 | 1594.6 | 2587.5 | 2093.1 | −1296.0 | −5076.4 | 3110.0 | 3707.6
32 | −2537.1 | 3185.1 | 3173.7 | 2114.9 | 5098.2 | 1593.6 | 2574.1 | 2112.2 | −1287.1 | −5085.2 | 3156.9 | 3752.3
48 | −2547.2 | 3194.9 | 3175.9 | 2105.8 | 5088.9 | 1597.0 | 2557.2 | 2115.0 | −1285.5 | −5118.0 | 3172.5 | 3827.8
64 | −2558.6 | 3193.6 | 3170.7 | 2116.2 | 5094.6 | 1601.1 | 2568.1 | 2124.6 | −1285.9 | −5122.9 | 3179.0 | 3822.1

* Cf. Eq. (202), from Poisson-distributed data (see Fig. 12), with the following parameter setting: N = 128, a₁ = 4, p₁ = −2560, q₁ = 3200, r₁ = 3200, s₁ = 2133, a₂ = 6, p₂ = 5120, q₂ = 1600, r₂ = 2560, s₂ = 2133.3, a₃ = 5, p₃ = 1280, q₃ = 5120, r₃ = 3200, s₃ = 3840.0, ε = 0.5 × 10⁻³, d = 6400.
TABLE IV
THE VALUES OF THE EXACT CRAMÉR-RAO BOUND OF THE PARAMETERS p, q, r, AND s OF THE THREE GAUSSIAN BLOBS* (COLUMNS AS IN TABLE III)

Dose λ₀ | p₁ | q₁ | r₁ | s₁ | p₂ | q₂ | r₂ | s₂ | p₃ | q₃ | r₃ | s₃
8  | 35.0 | 26.2 | 43.7 | 25.0 | 33.4 | 13.6 | 41.5 | 11.3 | 14.2 | 62.0 | 14.7 | 111.4
16 | 24.6 | 18.5 | 30.8 | 17.6 | 23.6 | 9.6 | 29.4 | 8.0 | 10.1 | 43.8 | 10.4 | 78.8
32 | 17.4 | 13.1 | 21.8 | 12.5 | 16.7 | 6.8 | 20.8 | 5.6 | 7.1 | 31.0 | 7.4 | 55.7
48 | 14.2 | 10.7 | 17.8 | 10.2 | 13.6 | 5.6 | 17.0 | 4.6 | 5.8 | 25.3 | 6.0 | 45.5
64 | 12.3 | 9.3 | 15.4 | 8.8 | 11.8 | 4.8 | 14.7 | 4.0 | 5.0 | 21.9 | 5.2 | 39.4

* Cf. Eq. (202).
a are excluded from the estimation for reasons of computational convenience. The parameters that must be estimated are then all of the same order of magnitude. The exact Cramér-Rao bounds of the estimated parameters are presented in Table IV. An in-depth analysis of the shape of the attained maxima reveals that only for the highest dose values does the width of these maxima approach the values of the Cramér-Rao bound as presented in Table IV. This is due to the fact that Eq. (202) is a highly nonlinear function of the parameters that must be estimated. The analysis of the shape of the attained maxima was greatly facilitated through the use of the MINUIT program (James and Roos, 1975), developed at CERN, Geneva, for function optimization. Even with the capabilities for global optimum search offered by the MINUIT program, we still have no guarantee of attaining this maximum.

D. Discussion and Conclusions
The subject of this section is the optimal use of a priori information about the structure of the imaged specimen in low-dose electron microscopy. We take advantage of the available prior information by modeling the object structure as a functional relationship between a number of parameters. From this description a theoretical image intensity distribution results, i.e., the image contrast in the limit of an infinite number of electrons contributing to the image formation. Using the statistical technique of maximum-likelihood estimation, numerical values are obtained for the unknown parameters from the registered realization of the stochastic low-dose image process. The advantage of the parameter-estimation approach is that all the information available in the data is used to determine the relevant parameters of the imaged specimen that one wants to know. A disadvantage of parameter estimation is that the theoretical image contrast is required as a function of
the parameters to be estimated. This image contrast must be based on the object wave function, a calculation which is analytically very elaborate and complicated for phase-contrast images. Furthermore, the determination of the object wave function as a function of a number of parameters is not a simple task. Of course, the required functions can be computed numerically; however, the whole estimation procedure then becomes rather time consuming. More feasible is the situation at a much lower resolution scale, where scattering contrast dominates the image formation. The required image contrast as a function of the parameters can then be based on the much simpler mass-density model of the specimen involved. Because of the lower resolution, the sampling cells in the image are much larger, and better statistics in the data are achieved for low electron-dose values. A further complication with parameter estimation is the fact that in general the estimation problem is highly nonlinear in the parameters of interest. This nonlinearity manifests itself in the presence of local maxima in the likelihood function. The search for the global maximum of the likelihood function is a very complicated numerical problem when local extrema are present. Since the estimated parameters are based on stochastic data, the obtained values are also random variables. The statistical properties of the results are as important as the actual numerical values calculated. Unfortunately, the determination of even the first two moments is often a complicated task, due to the nonlinearity of the problem. A statistical characterization of the estimated parameters can only be established in the asymptotic regime of the maximum-likelihood estimator. Again, the low-resolution imaging of specimens with scattering contrast is the most promising situation for the application of maximum-likelihood parameter estimation to low-dose electron microscopy in molecular biology.
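The maximum-likelihood machinery of this section can be illustrated on a reduced version of the blob problem of Eq. (202): a single Gaussian blob whose centre (p, q) is estimated from one Poisson-distributed image by maximizing the Poisson log-likelihood. All numbers (image size, dose, blob parameters) are invented for the sketch, and an exhaustive grid search replaces the MINUIT optimization used by the authors:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 64
lam0 = 20.0                                   # background intensity λ0 = λsT N^-2
a, p0, q0, s0 = 3.0, 22.0, 40.0, 6.0          # blob amplitude, centre, and width (pixels)

kk, ll = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")

def intensity(p, q):
    """A priori image intensity with a single Gaussian blob, cf. Eq. (202)."""
    return lam0 * (1.0 + a * np.exp(-0.5 * ((kk - p) ** 2 + (ll - q) ** 2) / s0**2))

data = rng.poisson(intensity(p0, q0))         # one simulated low-dose image

def loglik(p, q):
    """Poisson log-likelihood, dropping the log(n!) term independent of (p, q)."""
    lam = intensity(p, q)
    return np.sum(data * np.log(lam) - lam)

# Maximum-likelihood estimate of the blob centre by an exhaustive grid search
grid = np.arange(0.0, N, 0.5)
vals = np.array([[loglik(p, q) for q in grid] for p in grid])
ip, iq = np.unravel_index(np.argmax(vals), vals.shape)
p_hat, q_hat = grid[ip], grid[iq]
```

Because the log-likelihood surface of a model such as Eq. (202) can have local maxima, an exhaustive search (or a zero-count check such as the Kronecker-Picard integral) provides a safeguard that a purely gradient-based optimizer lacks.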
V. STATISTICAL HYPOTHESIS TESTING
A. Introduction to Statistical Hypothesis Testing in Electron Microscopy
The present section is the second one which is devoted to the optimal use of a priori information. The evaluation of low-dose electron micrographs is considered using the techniques of statistical decision theory. First we provide a short introduction to the very useful technique of statistical hypothesis testing. This technique will be applied in consecutive subsections to three key problems in the evaluation of low-dose images: (1) The detection of the presence of an object, with a specified error probability for missing the object and for false alarm.
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
(2) The detection of the positions of single heavy atoms, to be used as markers in the analysis of images of identical molecules with random orientation. The markers allow the images to be aligned and averaged which leads to a higher signal-to-noise ratio (Frank, 1980; Van Heel and Frank, 1981). (3) How to measure the statistical significance of, e.g., applying image processing to the low-dose image, in order to judge to what extent artefacts are introduced by the computer processing of the image. The visual interpretation of low-dose electron micrographs of moderately stained or unstained biological material is almost impossible due to the low and noisy contrast. Therefore computer processing of these images is indispensable. However, image processing applied to electron micrographs by means of a digital computer has to be performed with great care in order to prevent artefacts and false judgements about the structure to be observed. For overcoming these complications, which can be severe, especially for low-dose images, statistical decision theory offers a tool for an independent and objective check afterwards by quantifying the statistical significance of the obtained results. This can be done by statistical hypothesis testing whenever one has prior information about the structure being observed. In many cases occurring in practice, the information in an electron micrograph is partially redundant. This redundancy of the image, which is equivalent to a priori information, offers the opportunity to reduce the influence of noise. One way in which this a priori information can be used optimally is to apply the method of maximum likelihood to the estimation of unknown parameters. This technique which has been studied in depth in the previous section is especially suited when detailed a priori information about the parametrization of the specimen and the resulting image intensity distribution is available. 
Another approach to using the available a priori information in an optimal way is the construction of one or more hypotheses about the image distribution. Next, the statistical significance of the hypothesis under consideration is tested against the recorded image intensity, which in case of consistency results in acceptance of the hypothesis; otherwise it is rejected. The rest of this section contains an outline of this technique of hypothesis testing (for a more general discussion see, e.g., Van der Waerden, 1969; Kendall and Stuart, 1967; Lehmann, 1959). Throughout this chapter a recorded low-dose image is represented by an N × N array of statistically independent Poisson-distributed random counts n̂_{k,l}, which correspond to the number of electrons that have arrived in the (k, l) image cell, (k, l) ∈ {−N/2, …, N/2 − 1}, with N² roughly equal to the number of degrees of freedom of the image. The probability distribution of an individual n̂_{k,l} has been discussed in Section I,G [cf. Eq. (12)]. The following example is a simple application of hypothesis testing to a recorded image.
IMAGE HANDLING IN ELECTRON MICROSCOPY
Suppose that we have to decide between two possibilities: in the image n̂, or in a smaller region of interest, either specimen A or specimen B is imaged. The specimens A and B can be, e.g., two different biological molecules. Specimen A is characterized by the intensity parameters λ⁰_{k,l} of the Poisson process of the image recording, and let specimen B correspond to the intensity parameters λ¹_{k,l}. We assume here that the intensity parameters λ⁰_{k,l} and λ¹_{k,l} are completely specified; i.e., they do not depend on unknown parameters that have to be determined from the image data. In this case we have two simple hypotheses, the null hypothesis H₀: specimen A is imaged, and the alternative hypothesis H₁: specimen B is imaged. The null hypothesis is the hypothesis which is tested, here chosen to correspond to specimen A. Composite hypotheses also exist, in the case that there is not one simple alternative hypothesis but instead a number of alternatives, usually involving a free parameter. In the next subsection an example of such a composite hypothesis will be encountered. From the recorded image n̂ we now have to test hypothesis H₀ against its alternative H₁ and to decide whether specimen A or B was imaged. In order to do so, a so-called test statistic T is needed, which is a function of the experimental data to be specified further. Let W be the sample space of the test statistic, i.e., the space containing all possible sets of values of T. The space W is now divided into a critical region w and a region of acceptance W − w. If T falls within the critical region, hypothesis H₀ is rejected; otherwise it is accepted. The critical region w is chosen in such a way that a preselected level of significance α of the test is achieved. This level of significance α is defined as the probability that T is in w while H₀ is true (see Fig. 13a)

α = P{T ∈ w | H₀} = ∫_c^∞ p(T | H₀) dT    (203)
In other words, α is the probability that H₀ is rejected although the hypothesis is true. Having chosen a value of α, the value of c follows from Eq. (203), such that if T ≥ c, H₀ is rejected and thus H₁ is accepted; if T < c, H₀ is accepted and H₁ is rejected. Whether a test is useful or not depends on its ability to discriminate against the alternative hypothesis H₁. This is measured by the power of the test, which is defined as the probability 1 − β that T is in w while H₁ is true. This makes β the probability that H₀ is accepted although H₁ is true (see Fig. 13b)

β = P{T ∈ W − w | H₁} = ∫_{−∞}^c p(T | H₁) dT    (204)
The performance of a specific test is measured by the two types of error that may occur. The first is type I: H₁ is chosen while H₀ is true ("false alarm"). The probability of a type-I error is α. The second error is called type II: H₀ is chosen while H₁ is true ("miss"). The probability that a type-II error will be made is β.
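As a numerical illustration of the two error types (with hypothetical distributions, not data from this chapter), suppose the test statistic T is approximately N(0, 1) under H₀ and N(3, 1) under H₁, and the critical region is T ≥ c. The probabilities α and β then follow directly from the normal distribution function:

```python
import math

def norm_cdf(x, mu, sigma):
    # Normal distribution function expressed via the error function
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Hypothetical example: T ~ N(0,1) under H0, T ~ N(3,1) under H1,
# critical region T >= c (reject H0).
c = 1.645  # threshold for a 5% significance level under H0

alpha = 1.0 - norm_cdf(c, 0.0, 1.0)  # P(T >= c | H0): type-I error, "false alarm"
beta = norm_cdf(c, 3.0, 1.0)         # P(T <  c | H1): type-II error, "miss"
power = 1.0 - beta

print(round(alpha, 4), round(beta, 4), round(power, 4))
```

Raising c lowers α but raises β; the choice of c trades the two errors against each other.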
Fig. 13. The level of significance α and the power 1 − β of testing the null hypothesis H₀ against the simple alternative hypothesis H₁.
In hypothesis testing one has to choose the significance level α, i.e., the probability of a type-I error one is willing to accept, and the test statistic T, which is to be chosen such that for a given value of α, β is minimal. In this section three test statistics will be compared: the likelihood ratio, the chi-square test, and Student's t test. These test statistics are introduced in the following. The likelihood of observing the recorded realization n̂ of the stochastic image process is given by [cf. Eq. (19)]

L(n̂, λ) = ∏_k ∏_l exp(−λ_{k,l}) (λ_{k,l})^{n̂_{k,l}} / n̂_{k,l}!    (205)
The likelihood ratio q is the test statistic which is defined as the ratio of the probabilities of obtaining the recorded count pattern for the hypotheses H₀ and H₁

q(n̂) = L(n̂, H₀)/L(n̂, H₁) = exp( Σ_k Σ_l [ λ¹_{k,l} − λ⁰_{k,l} + n̂_{k,l}(log λ⁰_{k,l} − log λ¹_{k,l}) ] )    (206)

Having calculated the likelihood ratio q according to Eq. (206), its value is to be compared with a threshold value q₀. If q ≥ q₀, hypothesis H₀ is accepted; otherwise H₁ is chosen. The test procedure is now completely specified; what remains to be solved is how the threshold value q₀ should be chosen in order to correspond to the desired α level. A further question is what the resulting
power of the test will be. In general these matters depend on the hypothesis at hand, i.e., the differences between λ⁰_{k,l} and λ¹_{k,l}. The likelihood ratio is a powerful test statistic for the decision between the two simple hypotheses H₀ and H₁. Another test statistic, which is especially suited for measuring the discrepancy between observed n̂_{k,l} data values and expected data values, is the chi-square statistic T_{χ²} with N² − 1 degrees of freedom

T_{χ²}(N² − 1) = Σ_k Σ_l λ_{k,l}^{−1} (n̂_{k,l} − λ_{k,l})²    (207)
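Both statistics introduced so far can be sketched compactly in code; the intensity maps and simulated counts below are illustrative, not data from this chapter. Working with log q rather than q avoids floating-point underflow on large images:

```python
import math
import random

def poisson(lam, rng):
    # Knuth's algorithm for Poisson-distributed counts; adequate for small lam
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def log_likelihood_ratio(counts, lam0, lam1):
    # log q(n) per Eq. (206); positive values favour H0
    return sum(l1 - l0 + n * (math.log(l0) - math.log(l1))
               for n, l0, l1 in zip(counts, lam0, lam1))

def chi_square_stat(counts, lam):
    # T per Eq. (207), with (number of cells - 1) degrees of freedom
    return sum((n - l) ** 2 / l for n, l in zip(counts, lam))

rng = random.Random(1)
lam0 = [4.0] * 256                  # specimen A: flat intensity
lam1 = [4.0] * 128 + [2.0] * 128    # specimen B: absorbs in half the cells
counts = [poisson(l, rng) for l in lam0]   # recorded image is of specimen A

print(log_likelihood_ratio(counts, lam0, lam1) > 0)   # ratio favours H0
print(chi_square_stat(counts, lam0) / 255)            # roughly 1 under H0
```

When the data really come from H₀, T_{χ²} fluctuates around its number of degrees of freedom, which is why the ratio printed last is close to unity.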
The larger the value of T_{χ²}, the larger is the discrepancy between the observed and expected data values. The expected data values are computed on the basis of a hypothesis H₀. This H₀ is rejected if the obtained χ² value exceeds the critical value at the desired significance level, e.g., χ²₀.₉₅ or χ²₀.₉₉, which are the critical values to be obtained from tables at the 5% and 1% significance levels, respectively (see, for example, Van der Waerden, 1969, Chap. 14, Table 6). In that case the chi-square test concludes that the observations differ significantly from the expected values at the chosen level of significance. Otherwise H₀ is accepted, or at least not rejected. In the next subsection a third test statistic will also be encountered, which has a more limited scope of application, namely Student's t test. This is the appropriate test statistic if one wants to test whether an observed mean value of a set of N² independent normally distributed random variables (X₁, X₂, …, X_{N²}) is consistent with the expected value μ. The test statistic t, which is defined as

t = s^{−1}(X̄ − μ)N    (208)
where X̄ = N^{−2} Σ_i X_i is the sample mean and s² = (N² − 1)^{−1} Σ_i (X_i − X̄)² denotes the sample variance, has a Student's t distribution with N² − 1 degrees of freedom. Also for this test statistic the critical values, e.g., t₀.₉₅ and t₀.₉₉, can be obtained from tables (for example Van der Waerden, 1969, Chap. 14, Table 7). If the test statistic exceeds the critical value, the hypothesis H₀ is rejected. In the next subsections statistical hypothesis testing is applied to problems in electron microscopy.

B. Object Detection
A critical problem in the evaluation of low-dose electron micrographs is the detection of the presence of an object in the noisy images. Once an object has been detected in a certain region of interest, various techniques can be applied to extract information about this object. However, first the question has to be answered whether there is an object present or whether the pertinent image intensity variation is just a random fluctuation. Otherwise, faulty
conclusions will arise from applying image processing to an image which consists of random noise only. The detection of objects in noise-limited micrographs has been treated by Saxton and Frank (1977) using a matched-filter approach based on cross-correlation, and by Van Heel (1982) applying a variance image operator to the low-dose image. The visual perceptibility of objects at low intensity levels has been treated in the pioneering work of Rose (1948a,b) in the early days of television systems. The results of Rose's analysis are fundamental to low-dose electron microscopy and will be outlined briefly. In order to detect an image-resolution cell of one picture element (pixel) having a contrast C, where C is defined as the relative difference with respect to the background intensity λ₀, C = Δλ/λ₀, in an image of N × N pixels, a total number of electrons n_T is needed in the image. According to Rose (1948a,b) this number is given by
n_T = N² k² C^{−2}    (209)

It is assumed that this total number of electrons is uniformly distributed over the image, and further the detection quantum efficiency (DQE) is taken to be unity, so that every impinging electron is recorded. The factor k in Eq. (209) is introduced in order to avoid false alarms and should be between 4 and 5 when the image has about 10⁵ pixels. The following example, adopted from Rose (1973), illustrates Eq. (209) and clarifies the role of the factor k. Suppose we want to detect a single picture element with a contrast value C of 10⁻² at an unknown position in an image consisting of 100 × 100 pixels. According to Eq. (209) a total number of (at least) 10⁸k² imaging electrons is needed in order to make this pixel visible. A pixel of the background receives in the mean a number of electrons λ₀ equal to 10⁴k², and the pixel to be detected expects to receive λ = 9900k² electrons. The recorded numbers of electrons, n̂₀ and n̂ respectively, are Poisson-distributed random variables, as is discussed in Section I,G. Due to the relatively large numbers involved, the Poisson distribution is well approximated by the Gaussian probability density function

P{n̂₀ = m} = exp(−λ₀) λ₀^m (m!)^{−1} ≈ p(m) = (2πσ₀²)^{−1/2} exp[−(m − λ₀)²/(2σ₀²)]    (210)
where σ₀² equals λ₀. Figure 14 presents the two probability density functions; the shaded areas correspond to the type-I error α and the type-II error β, which depend on the decision threshold c. The distance between the two peaks is 100k². This is equal to k times the standard deviation, which is nearly the same for both density functions. If the decision threshold c were situated halfway in between λ and λ₀, the error probabilities α and β would be equal.
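The numbers in Rose's example can be verified directly from Eq. (209) (k = 5, C = 10⁻², a 100 × 100 image):

```python
import math

# Rose's example for Eq. (209): n_T = N^2 * k^2 * C^-2
N, C, k = 100, 1e-2, 5

n_T = N**2 * k**2 * C**-2    # total number of imaging electrons
lam_bg = n_T / N**2          # mean electrons per background pixel
lam_obj = lam_bg * (1 - C)   # mean electrons for the object pixel

sigma = math.sqrt(lam_bg)    # standard deviation of the background counts
separation = (lam_bg - lam_obj) / sigma  # peak separation in units of sigma

print(n_T, lam_bg, lam_obj, separation)
```

The separation evaluates to k standard deviations, which is exactly the role of the factor k in Eq. (209).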
Fig. 14. The two probability density functions p(m, λ₀) and p(m, λ), together with the shaded areas α and β, representing, respectively, the probability of a type-I error ("false alarm") and a type-II error ("miss").
Usually this is a desirable situation, but not in this case, because there are 10⁴ − 1 background pixels. Although Rose cannot use the threshold c explicitly, because his detection criterion is visual perceptibility, it is argued in Rose (1973) that the distance from c to λ₀ should be at least 4 standard deviations in order to bring the total α risk down to 0.3. With a distance from λ to c of one standard deviation, the β risk becomes 0.158, and the value for k is found to be 5. When the image contains fewer pixels, the value for k can be lowered somewhat. Note that the dose values in this example (λ₀ = 25 × 10⁴ electrons/pixel) are far away from low-dose imaging conditions, which underlines the inherent difficulty of the evaluation of low-dose electron micrographs at high resolution. Considering the detection of image detail with a larger spot size than one pixel, Eq. (209) can be rewritten as
n_T = A k² (dC)^{−2}    (211)

where d is the diameter of the spot to be detected and A is the area of the image. According to Eq. (211) the diameter of a test spot which is just visible varies inversely with the contrast C for a fixed value of the electron dose. This relation is illustrated in Fig. 15, where a test pattern adopted from Rose (1948b) is presented, which consists of a two-dimensional array of discs in a uniform background. The diameter d of the discs decreases in steps of a factor of 2 while moving to the right along a row, and the contrast C of the discs decreases in steps of a factor of 2 while moving downwards along a column. These images show that the boundary between the visible and invisible discs lies roughly along a diagonal where the product dC is constant. With Fig. 15 the discussion of Rose's detection criterion is concluded, and we turn to hypothesis testing for the detection of objects. For objects
Fig. 15. Test pattern [Rose (1948b)], illustrating Eq. (211). (a) Original test pattern. (b)-(d) Test patterns for electron-dose values of 8, 16, and 32 electrons per pixel.
consisting of one pixel, not much improvement upon Eq. (209) is possible; however, for larger objects a significant improvement is possible, as has also been reported by Saxton and Frank (1977) and by Van Heel (1982). This can be understood from the fact that the Rose criterion is the visibility of the test spot, which is based on the integrated contrast over image elements with the size of the test spot. In the case of extended objects, detection methods based on the statistics of the individual pixels use more information from the recorded image and therefore are in principle more appropriate for the treatment of lower dose values.
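The tradeoff expressed by Eq. (211) — for fixed dose, the just-visible diameter varies inversely with the contrast — can be checked with a small helper (illustrative numbers):

```python
def rose_dose(A, k, d, C):
    # Eq. (211): total electrons needed to see a disc of diameter d, contrast C
    return A * k**2 / (d * C)**2

# Halving the diameter while doubling the contrast leaves the required dose
# unchanged: the product d*C determines visibility.
dose_1 = rose_dose(1.0e6, 5, 2.0, 0.25)
dose_2 = rose_dose(1.0e6, 5, 1.0, 0.50)
print(dose_1 == dose_2)
```

This constancy of d·C is the diagonal boundary seen in the test patterns of Fig. 15.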
Applying statistical hypothesis testing to the detection of the presence of an object, one has to decide between two possibilities: either there is no object, the null hypothesis (H₀: λ_{k,l} = constant = λ_T N^{−2}), or there is an object present, the alternative hypothesis (H₁: λ_{k,l} arbitrary, but not all equal). If λ_T N^{−2} is known, then H₀ is a simple hypothesis because it is completely specified; H₁, however, is a composite hypothesis (the object is not specified). The likelihood-ratio test statistic as discussed in the previous section does not apply to this situation, because the probability of the alternative hypothesis cannot be specified. This complication is overcome by using instead the generalized or maximum-likelihood ratio as test statistic, which is defined as the ratio of the maximum-likelihood values of the two hypotheses. For both hypotheses the likelihood is maximized by variation over the pertinent parameters; cf. maximum-likelihood estimation. When H₀ is true, L(n̂, Ĥ₀) is maximal when λ_{k,l} is estimated by the sample mean n̄ = N^{−2} Σ_k Σ_l n̂_{k,l}. If H₁ is true, we have λ̂_{k,l} = n̂_{k,l}; in the case of one observation the best estimated value is the observation itself. The generalized likelihood ratio q is
q(n̂) = L(n̂, Ĥ₀)/L(n̂, Ĥ₁) = exp( N² n̄ log n̄ − Σ_k Σ_l n̂_{k,l} log n̂_{k,l} )    (212)

from which it follows that 0 ≤ q ≤ 1. It can be shown (e.g., Kendall and Stuart, 1967, p. 233) that when H₀ is true, −2 log q is distributed for N² → ∞ as χ²(N² − 1). For image data we may well expect to be in the asymptotic regime, so that the probability distribution of −2 log q equals the chi-square distribution with N² − 1 degrees of freedom. When one has chosen the level of significance α, usually of the order of 1%, the threshold value c for the decision of acceptance or rejection of H₀ can be obtained from tables of the χ² distribution (see Fig. 16). The decision threshold
Fig. 16. The chi-square distribution with N² − 1 degrees of freedom of r = −2 log q. The threshold value c is chosen such that the shaded area equals 1 − α.
c, with which r = −2 log q is to be compared, is chosen such that

∫₀^c p_{χ²}(r, N² − 1) dr = 1 − α    (213)

However, for larger values of N² − 1 the chi-square distribution is approximated very well by the Gaussian distribution. Defining the value z as follows

z(q) = (2N² − 2)^{−1/2}(1 − N² − 2 log q)    (214)

we have obtained a variable which is distributed standard normal N(0,1). The α levels with the corresponding threshold values c are presented below.
α level             0.1      0.05     0.01
Threshold value c   1.282    1.645    2.326
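The object-detection recipe of Eqs. (212) and (214) can be sketched as follows; the image size, dose, and object contrast are illustrative only. The z value is compared with the thresholds tabulated above:

```python
import math
import random

def poisson(lam, rng):
    # Knuth's algorithm for Poisson-distributed counts (adequate for small lam)
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def z_statistic(counts):
    # Eqs. (212) and (214): z = (2M - 2)^(-1/2) * (1 - M - 2 log q),
    # with M the number of cells; zero-count cells contribute 0 to the sum.
    m = len(counts)
    nbar = sum(counts) / m
    log_q = m * nbar * math.log(nbar) - sum(n * math.log(n) for n in counts if n > 0)
    return (2 * m - 2) ** -0.5 * (1 - m - 2 * log_q)

rng = random.Random(7)
flat = [poisson(4.0, rng) for _ in range(1024)]                 # no object
dimmed = [poisson(1.0, rng) for _ in range(256)] + flat[256:]   # strong absorbing object
print(z_statistic(flat), z_statistic(dimmed))
```

Under H₀, z fluctuates around zero (with a small positive bias at low doses), while the strongly absorbing object drives z far beyond the tabulated thresholds.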
If the value z exceeds the threshold value c, the H₀ hypothesis is rejected at the corresponding significance level α. We will follow here the conventional terminology that results significant at the α level of 1% are highly significant, results significant at the 5% level are probably significant, and results significant at levels larger than 5% are not significant. Because H₁ is the set of all possible alternatives to H₀, nothing can be said about the probability β of a type-II error. Applying the chi-square test for the detection of the presence of an object, the test statistic is the following [cf. Eq. (207)]

T_{χ²}(N² − 1) = Σ_k Σ_l n̄^{−1}(n̂_{k,l} − n̄)²    (215)

where n̄ = N^{−2} Σ_k Σ_l n̂_{k,l}. Because of the asymptotic properties of the χ² distribution, which have already been discussed in connection with Eqs. (213) and (214), the value z defined as

z(T) = (2N² − 2)^{−1/2}[T_{χ²}(N² − 1) − N² + 1]    (216)
is distributed standard normal N(0,1). Therefore, the threshold values c of the above table also apply to the chi-square test. With the third test statistic, i.e., Student's t test of Eq. (208), the consistency of the observed mean value is measured against the expected value μ, which corresponds to λ_T N^{−2}. The value of μ is to be determined from previous exposures under the same conditions, e.g., by dose measurements using a Faraday cup. The t test requires that the random variables to be tested are
(approximately) normally distributed. We therefore first apply the square-root transformation to the Poisson-distributed image data

ŷ_{k,l} = (n̂_{k,l} + 3/8)^{1/2}    (217)

The obtained ŷ_{k,l} values are in good approximation normally distributed N(μ_y, 1/4), where μ_y = (μ + 3/8)^{1/2}. The square-root transformation in Eq. (217) is discussed in Appendix B. The test statistic t

t = s^{−1}(ȳ − μ_y)N    (218)

where ȳ = N^{−2} Σ_k Σ_l ŷ_{k,l} and s² = (N² − 1)^{−1} Σ_k Σ_l (ŷ_{k,l} − ȳ)², has the Student's t distribution with N² − 1 degrees of freedom. For large values of N² − 1, t is distributed standard normal N(0,1). It is to be expected that when an object is present, the number of electrons arriving in the image will be reduced, e.g., by scattering contrast or because of the energy filter lens which removes inelastically scattered electrons. Therefore, only significance levels less than the measured mean, corresponding to the lower tail of N(0,1), have to be tested. The threshold values c of the above table apply here, however with opposite sign. The three test statistics which are compared in this section are first applied to the detection of the presence of objects in simulated images of four model objects. The simulated images are arrays of 64 × 64 pixels. The objects are represented by their object wave function and consist of two amplitude objects and two phase objects:
amplitude:  ψ₀(x₀, y₀) = exp(−0.05)   inside a circle with a diameter of 8 and 16 pixels, respectively;
            ψ₀(x₀, y₀) = 1            otherwise.

phase:      ψ₀(x₀, y₀) = exp(i/4)     inside a circle with a diameter of 8 and 16 pixels, respectively;
            ψ₀(x₀, y₀) = 1            otherwise.
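Before turning to the simulations, the square-root transformation of Eq. (217) and the t statistic of Eq. (218) can be sketched numerically. The dose value and sample size below are illustrative; the 3/8 offset and the resulting variance of roughly 1/4 follow the variance-stabilization argument of Appendix B:

```python
import math
import random

def poisson(lam, rng):
    # Knuth's algorithm for Poisson-distributed counts
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

rng = random.Random(3)
mu = 10.0                                     # illustrative dose per image cell
counts = [poisson(mu, rng) for _ in range(256)]

ys = [math.sqrt(n + 0.375) for n in counts]   # Eq. (217)
mu_y = math.sqrt(mu + 0.375)

ybar = sum(ys) / len(ys)
s2 = sum((y - ybar) ** 2 for y in ys) / (len(ys) - 1)
t = (ybar - mu_y) * math.sqrt(len(ys)) / math.sqrt(s2)  # Eq. (218)

print(round(math.sqrt(s2), 3), round(t, 2))
```

The sample standard deviation comes out near 1/2 (variance near 1/4) independently of the dose, which is what makes a fixed-threshold t test possible.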
The image wave functions ψ(·, ·) are calculated according to Eqs. (3)-(5) for the following setting of the microscope parameters: D = 180 nm, C_s = 1.6 mm, λ = 4 pm, and ε = 5 ×. From this parameter setting it follows that a resolution cell in the image, which corresponds in this simulation to a picture element, is equal to (4 × 4) Ų. The contrast calculated from the image wave function ψ(·, ·) is the parameter λ_{k,l} of the image Poisson process. By means of random number generation a realization of the low-dose image is obtained. The results of the three test statistics, generalized likelihood ratio, chi-square, and Student's t test, respectively given in Eqs. (214), (216), and (218), are summarized in Table V.
TABLE V
Summary of the Detection of the Presence of the Model Objects in the Simulated Images with the Three Test Statistics for Increasing Electron Dose^a

                                     Amplitude object                   Phase object
λ_T N^{−2}      λ_T (e⁻/nm²),    Likelihood   Chi-      Student's   Likelihood   Chi-      Student's
(e⁻/pixel)     (e⁻/Ų)            ratio        square    t           ratio        square    t

Object diameter 8 pixels
12             75 (0.75)          H₀           H₀        H₀,₁        H₀,₁         H₀        H₀
16             100 (1)            H₀           H₀        H₁          H₀,₁         H₀        H₀
32             200 (2)            H₀           H₀        H₁          H₀,₁         H₀        H₀
48             300 (3)            H₀,₁         H₀        H₁          H₀,₁         H₀,₁      H₀
64             400 (4)            H₀,₁         H₀,₁      H₁          H₁           H₀,₁      H₀

Object diameter 16 pixels
12             75 (0.75)          H₀           H₀,₁      H₁          H₀,₁         H₀        H₀
16             100 (1)            H₀           H₀,₁      H₁          H₀,₁         H₀        H₀
32             200 (2)            H₁           H₁        H₁          H₁           H₀,₁      H₀
48             300 (3)            H₁           H₁        H₁          H₁           H₀,₁      H₀
64             400 (4)            H₁           H₁        H₁          H₁           H₁        H₀

^a H₀ means that the hypothesis H₀ is not rejected: the test statistic is not significant. H₁ indicates that the test statistic is highly significant, H₀ is rejected, while H₀,₁ indicates that the test statistic is probably significant.
From the simulated experiments summarized in Table V we observe that the Student's t test, which has a very sharp response to amplitude contrast, is not sensitive to phase contrast at all. This is not surprising, as the t test statistic measures the deviation between the total number of expected electrons in the image and the acquired number of detected electrons in the image. In the case of amplitude contrast, the number of electrons arriving at the image detector plane is reduced compared with the case that there is no object present. In the case of phase contrast this difference is negligible in comparison with the statistical fluctuation in the total number of electrons that is involved in the image formation. For the detection of phase contrast we observe from Table V that the likelihood-ratio test is more sensitive than the chi-square test. In a small experiment in which the presence of phase contrast in low-dose images is tested, we used the likelihood ratio as test statistic. The three low-dose electron micrographs which are the input data for this experiment are a courtesy of Dr. E. J. Boekema and Dr. W. Keegstra of the Biochemisch Laboratorium, Rijksuniversiteit Groningen. The imaged specimens presented in Fig. 17 are an image of a carbon support foil, a carbon foil with a small amount of uranyl acetate, which is used as staining material, and an image of a NADH:Q oxidoreductase crystal (slightly negatively stained with uranyl acetate) from bovine heart mitochondria (see Boekema et al., 1982). The crystal structure of the last image is visualized in Fig. 17d, which is obtained by Fourier peak filtration (Unwin and Henderson, 1975). The images are small sections of 128 × 128 pixels of low-dose CTEM electron micrographs, obtained with an electron dose in between 5 and 7 e⁻/Ų. The micrographs have been scanned by a microdensitometer with a sampling grid of 25 μm. Since the magnification is 46 600, the size of a pixel corresponds to 5.3 Å.
The low-dose images are recorded on a photographic plate, which is not the ideal device for electron detection. Moreover, due to the scanning by the microdensitometer, we cannot expect the recorded image to have the Poisson statistics of Section I,G. The test statistics developed in this section are based on the Poisson statistics of a recorded image as derived for ideal electron-detection conditions in Section I,G. The following crude approach has been chosen to correct the statistics of the image to the Poisson regime. From the carbon foil image the mean and variance are calculated. Hence the image is scaled in such a way that an equal mean and variance value are obtained, which numerically corresponds to a dose of 6 e⁻/Ų. Exactly the same scaling is applied to the two other images. The scaled images serve as input for the estimation experiment. The likelihood-ratio test statistic detects phase contrast, and thus the presence of an object, at the 5% level in the image of the carbon foil with uranyl acetate. In the image of the NADH dehydrogenase crystal, phase contrast is detected at the 1% level. Object detection by means of
CARBON FOIL    CARBON FOIL-UAc    CRYSTAL    CRYSTAL FILTERED
Fig. 17. Low-dose CTEM images used for an object detection experiment. The imaged specimens are consecutively (a) carbon foil, (b) carbon foil with a small amount of uranyl acetate, and (c) NADH:Q oxidoreductase crystal. (d) is obtained from (c) by means of Fourier peak filtration.
hypothesis testing exceeds the capability of the human eye, which is very useful in the noisy images of low-dose electron microscopy.
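The crude mean/variance rescaling described above can be sketched as follows (the densitometer values are hypothetical); the affine map is chosen so that the scaled image has mean and variance both equal to the target dose, as a Poisson image would:

```python
import random

def rescale_to_poisson_stats(pixels, lam):
    # Affine map y = a*g + b with a, b chosen such that
    # mean(y) == var(y) == lam, mimicking Poisson statistics of dose lam
    n = len(pixels)
    m = sum(pixels) / n
    var = sum((g - m) ** 2 for g in pixels) / n
    a = (lam / var) ** 0.5
    b = lam - a * m
    return [a * g + b for g in pixels]

rng = random.Random(11)
raw = [rng.gauss(120.0, 9.0) for _ in range(4096)]   # hypothetical scanned densities
scaled = rescale_to_poisson_stats(raw, 6.0)

m = sum(scaled) / len(scaled)
v = sum((x - m) ** 2 for x in scaled) / len(scaled)
print(round(m, 6), round(v, 6))
```

This matches only the first two moments, of course; as the text notes, it is a crude correction, not a recovery of true Poisson counting statistics.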
C. Position Detection of Marker Atoms

Another key problem in the evaluation of low-dose images is the detection of the position of single heavy atoms. These atoms are used as markers in the analysis of images of identical macromolecules with a random orientation
with respect to each other. If the marker positions are known, the images can be aligned and integrated, which leads to a higher signal-to-noise ratio (Frank, 1980; Van Heel and Frank, 1981). The imaging of single atoms in electron microscopy has been studied by many authors (see, e.g., Proceedings of the 47th Nobel Symposium, Direct Imaging of Atoms in Crystals and Molecules, Chemica Scripta 14 (1978-1979)). Calculations of the image contrast due to a single atom have been reported (e.g., by Scherzer, 1949; Niehrs, 1969, 1970; Reimer and Gilde, 1973; Iijima, 1977). Experimental observations showing evidence of imaged single atoms are found, among others, in Dorignac and Jouffrey (1980), Kirkland and Siegel (1981), and Isaacson et al. (1974). For the construction of hypotheses about the location of the marker, the theoretical image contrast of one isolated heavy atom is required. However, the calculation of the theoretical image intensity distribution as a function of the lateral position of the atom in the object plane is a complicated task. Our purpose here is to discuss the detection capability of marker atoms under low-dose illumination. In order to bring out the essentials, we will simplify this calculation considerably by neglecting inelastic scattering phenomena completely. Furthermore, we represent the electrostatic potential of the pertinent atom at the position (x_a, y_a, z_a) in the specimen, which is situated just before the object plane z = z₀, by the Wentzel potential (Lenz, 1954)

V(r_a) =
(Ze / 4πε₀ r_a) exp(−r_a R^{−1})    (219)

where r_a = [(x₀ − x_a)² + (y₀ − y_a)² + (z₀ − z_a)²]^{1/2}. The "radius" R of the atom is given by

R = a_H Z^{−1/3}    (220)
292
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
where KO(.)is the modified Bessel function of the second kind and of zero order. The constant a is given by
In the weak-object approximation* we obtain the object wave function
1 - 2iaKo(R-'C(xo - x,I2 + ( Y O - Y,) 1 ' ) (224) If the integration is extended over the whole object plane, we obtain from Eqs. ( 3 ) and (4) introducing the new variables x b = xo - x, and yb = y o - y, for the wave function in the exit pupil 2
$o(xo,~o;x,,~,)
$,(C3 '11
=
exp[
- pi((,
I')
~
d.uo dy, [ I
2ni(C.u, -
112
+ tly,)]
2ia K,(R-'(.x:
+Y:)':~)]
- 1
(225) exp[ -2ni((-ub + qyb)] The structure of the integrand suggests the use of polar coordinates. s;,= r b c o s 4 b , y b = r;,sin4b,c = r,cos4,,and 17 = r,sin4, $,,(r,. 41,) = exp[ - r;,(r,,) - 2nir,(.ua cos cj,, + J', sin 4,,)] x
x
((:
r ~ ) d r ~ ~ [ c ~ n d $- 2h i[alK o ( R - ' r b ) ] (226)
x exp[- z n i r ; r p c o s ( ~ ~-; , > 1, even for modest resolution requirements, since &, denotes the average number of electrons available for imaging per sampling cell. When Lo >> 1, the term exp{io&' exp [2xi (2~)-'( k t
+ Iq)]}
of Eq. (262) can be approximated by the first terms of its Taylor expansion
+ lq)]) exp(iw&' exp[2~i(2~)-'(k< 1+io&' e x p [ 2 n i ( 2 ~ ) - ' ( k 5 + 1 q ) ] - ~ w ~exp[4~i(2&)-'(kt E.~~ +1q)]
(266)
IMAGE HANDLING IN ELECTRON MICROSCOPY
303
With Eq. (266) we obtain for @?(o) to a good approximation exp [2ni(2~)-'(k[
+ lr])]
We define the real functions a^((, q ) and 6((, r ] ) to be the real and imaginary part of Z(5, r ] )
Z(5, r ] )
+ i&5, r ] )
= a^([, YI)
(268)
From Eq. (267) the characteristic functions of the real part a^(.,.) and the imaginary part ,.) are easily derived. These functions have the structure of the characteristic function of a Gaussian random function with mean a(.,.) and b(.,.) with variances given by
&.
From this we conclude that, to a good approximation (if we restrict our attention to the two lowest-order moments, mean and variance), ẑ(·, ·) is a Gaussian random function with parameters given by Eqs. (263) and (265).
B. Part 2
In the second part of this appendix we examine the probability density function of the auxiliary variable ŝ_{k,l} defined in Eq. (130). In order to simplify the notation we will drop the subscripts (k, l) and abbreviate also here the mean number of arriving electrons per image cell λ_T N^{−2} by λ₀. From Eq. (251) we have that

E{n̂} =
= &(l
+ sy
(270)
and s is estimated by s^ = & 1 / 2 ( @ 1 / 2
-
1-y)
Since n^ is distributed according to the Poisson distribution we have P{$}
= exp(-A)A'/$!
(272)
with
A = &(1
+
s)2
(273)
CORNELIS H. SLUMP AND HEDZER A. FERWERDA
Applying the transformation Eq. (271), which has Jacobian dŝ/dn̂ = (1/2)(λ₀n̂)^{−1/2}, and using Eq. (272), the probability density function of ŝ results:

p(ŝ) = 2λ₀(1 + ŝ) exp[−λ₀(1 + s)²] [λ₀(1 + s)²]^{λ₀(1+ŝ)²} / [λ₀(1 + ŝ)²]!   (274)

With Stirling's approximation to the factorial n!,

n! ≈ (2πn)^{1/2} nⁿ exp(−n)   (275)

we obtain for Eq. (274)

p(ŝ) ≈ (2π)^{−1/2}(4λ₀)^{1/2} exp{λ₀(1 + ŝ)² − λ₀(1 + s)² − 2λ₀(1 + ŝ)² ln[(1 + ŝ)/(1 + s)]}   (276)

Expanding the logarithm in Eq. (276) into a power series, we obtain to third order in s and ŝ

p(ŝ) ≈ (2π)^{−1/2}(4λ₀)^{1/2} exp{−2λ₀(ŝ − s)² − 2λ₀[(1/3)(ŝ³ − s³) + sŝ(ŝ − s)]}   (277)

Defining σ² as follows,

σ² = (4λ₀)^{−1}   (278)

Eq. (277) can be written

p(ŝ) ≈ (2πσ²)^{−1/2} exp[−(1/2)σ^{−2}(ŝ − s)²] exp{−(1/2)σ^{−2}[(1/3)(ŝ³ − s³) + sŝ(ŝ − s)]}   (279)
The second exponential in Eq. (279) is close to unity because s and ŝ are small, so that the third-order terms in the exponent are negligible.

H(f_x, f_y) = 1 − [cos(f_x) cos(f_y)]^α   (53)

where α > 1. In Eqs. (52) and (53), f_x and f_y are the spatial frequencies in the x and y directions, respectively. The low-pass filter removes the noise but blurs the image, while the high-pass filter sharpens the edges but also enhances the noise. Filtering can also be carried out in the spatial domain. A filtering technique using local statistics was first proposed by Wallis (1976) and then extended by Lee (1980). The filtering technique in the spatial domain can be represented as

î(i, j) = C₁ḡ(i, j) + C₂[g(i, j) − ḡ(i, j)]   (54)
where î(i, j) is the enhanced image; g(i, j) is the input or raw image; C₁ and C₂ are constants such that C₁ ≤ 1 and C₂ ≥ 1; i and j are the row and column numbers of the pixel; and ḡ(i, j) is the local gray-level mean surrounding the pixel (i, j). It can be seen that with C₁ = 1 and C₂ = 0 the operation is a simple smoothing, while with C₁ ≤ 1 and C₂ ≥ 1 the edges and fine details in the image are enhanced. Oppenheim et al. (1968) have modeled image formation as a multiplicative process in which a pattern of illumination is multiplied by a reflectance pattern to produce the brightness image. The image then can be represented as

g(i, j) = g_i(i, j) g_r(i, j)   (55)
In Eq. (55), g_i(i, j) and g_r(i, j) are the illumination and the reflectance patterns, respectively. The technique for filtering images modeled as in Eq. (55) is homomorphic filtering. The homomorphic image processor can be represented as shown in Fig. 20, where F represents the filtering
A. D. KULKARNI
FIG.20. Homomorphic image processor. F represents the filtering operation.
operation, which can also be carried out in the spatial domain, as described in Eq. (54). The output image is given by

î(i, j) = [g_i(i, j)]^{C₁} [g_r(i, j)]^{C₂}   (56)
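A minimal sketch of the homomorphic processor of Fig. 20, assuming a simple Gaussian split between low and high frequencies; the function name, the crossover parameter sigma, and the test image are illustrative, not from the text:

```python
import numpy as np

def homomorphic_filter(g, c1=0.7, c2=1.3, sigma=0.25):
    """Homomorphic enhancement: log -> Fourier-domain filter -> exp.

    c1 < 1 attenuates the low-frequency (illumination) component and
    c2 > 1 boosts the high-frequency (reflectance) component, as in
    Eq. (56).  sigma (illustrative) sets the low/high crossover.
    """
    G = np.fft.fft2(np.log(g + 1e-6))            # log turns gi * gr into a sum
    fy = np.fft.fftfreq(g.shape[0])[:, None]     # cycles per pixel, y direction
    fx = np.fft.fftfreq(g.shape[1])[None, :]     # cycles per pixel, x direction
    lowpass = np.exp(-(fx ** 2 + fy ** 2) / (2 * sigma ** 2))
    H = c1 * lowpass + c2 * (1.0 - lowpass)      # c1 on low, c2 on high frequencies
    return np.exp(np.real(np.fft.ifft2(G * H)))  # exp undoes the log

img = np.full((32, 32), 100.0)
img[16:, :] *= 2.0                               # a step in illumination
enhanced = homomorphic_filter(img)
```

Because the filter acts on the logarithm of the image, the illumination step is compressed while local reflectance detail is preserved.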
For simultaneous dynamic range reduction and edge enhancement, C₁ should be less than 1 and C₂ should be greater than 1.

D. Spatial Smoothing Techniques
If the image contains noise, smoothing techniques are used for cleaning it; however, smoothing also blurs the observed image, so edge enhancement may be needed afterwards. The simplest smoothing technique is a weighted averaging over a neighborhood of a pixel. It can be expressed as
ḡ(i, j) = Σ_{p=−m}^{m} Σ_{q=−n}^{n} w(p, q) g(i − p, j − q)   (57)

where the weighting coefficients w(p, q) are given by

w(p, q) = 1/[(2m + 1)(2n + 1)]   (58)
Equation (57) replaces the gray level at (i, j) by a gray level averaged over a (2m + 1) by (2n + 1) rectangular neighborhood surrounding (i, j). To reduce the blurring effect, several unequally weighted smoothing techniques have been suggested. Graham (1962) used a 3 × 3 neighborhood and a weighting factor matrix W given by
W = | 0.25  0.5  0.25 |
    | 0.5   1.0  0.5  |   (59)
    | 0.25  0.5  0.25 |

Brown (1966) proposed the weighting factor matrix of Eq. (60).
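The weighted-average smoothing of Eq. (57), using Graham's 3 × 3 matrix of Eq. (59) normalized so the weights sum to one, can be sketched as follows (the function name and the toy image are illustrative):

```python
import numpy as np

# Graham's 3 x 3 weighting matrix, Eq. (59), normalized to unit sum.
W = np.array([[0.25, 0.5, 0.25],
              [0.5,  1.0, 0.5],
              [0.25, 0.5, 0.25]])
W = W / W.sum()

def smooth(g, w=W):
    """Weighted-average smoothing, Eq. (57); border pixels are left unchanged."""
    m, n = w.shape[0] // 2, w.shape[1] // 2
    out = g.astype(float).copy()
    for i in range(m, g.shape[0] - m):
        for j in range(n, g.shape[1] - n):
            out[i, j] = np.sum(w * g[i - m:i + m + 1, j - n:j + n + 1])
    return out

noisy = np.full((8, 8), 50.0)
noisy[4, 4] = 250.0                # an isolated "salt" pixel
clean = smooth(noisy)
```

The symmetric kernel makes the correlation of Eq. (57) identical to a convolution, and the isolated spike is pulled toward the local mean.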
DIGITAL PROCESSING OF REMOTELY SENSED DATA
Kuwahara et al. (1976) proposed a smoothing scheme which replaces the gray level at (i, j) by the average gray level of its most homogeneous neighboring region. Yasuoka and Haralick (1983) have proposed a scheme using a slope facet model with a t test for cleaning pepper-and-salt noise. In a linear stochastic model, the gray value of any pixel can be expressed as

g(i, j) = αi + βj + γ + ε(i, j)   (61)

where i is the row position, j is the column position, ε represents an independent identically distributed (IID) random variable with standard deviation η, and α, β, γ, and η are the parameters of the model. Each pixel is checked for noise by considering a 3 × 3 neighborhood: the above model is fitted for a 3 × 3 block. The estimates α̂, β̂, and γ̂ are found from a criterion function J, which is in this case the total mean-squared error for the block R:
J = Σ_{i,j∈R} [g(i, j) − αi − βj − γ]²   (62)

Minimizing J with respect to α, β, and γ, we get

α̂ = Σ_{i,j} i g(i, j) / Σ_{i,j} i²   (63)

β̂ = Σ_{i,j} j g(i, j) / Σ_{i,j} j²   (64)

γ̂ = Σ_{i,j} g(i, j) / Σ_{i,j} 1   (65)

In all the above summations i and j vary from −1 to +1. From these estimates η̂ is found from the residual variance

η̂² = Σ_{i,j} [g(i, j) − α̂i − β̂j − γ̂]² / (N − 3)   (66)

where N is the number of elements in the block (in this case N = 9). The estimated gray value of the pixel can now be expressed as

ĝ(i, j) = α̂i + β̂j + γ̂   (67)

The t test can be used to test the hypothesis H₀: g(i, j) = ĝ(i, j), i.e., that the observed gray value equals the estimated value. Here t is defined as

t = [g(i, j) − ĝ(i, j)] / η̂   (68)

We take N = 9 and η = η̂. The threshold value of t is taken as t(N − 1, 0.05), i.e., at the 95% confidence level, and it can be read from the tables. If t < t(N − 1,
0.05), accept H₀; i.e., g(i, j) is not a noise element and is not replaced. If t ≥ t(N − 1, 0.05), reject H₀; i.e., g(i, j) is noise and is replaced by ĝ(i, j). The scheme is amenable to iteration. The image with noise and the noise-removed image obtained using the above algorithm are shown in Figs. 21 and 22, respectively.

FIG. 21. Landsat data with noise.

FIG. 22. Noise-filtered image of Fig. 21.

E. Enhancement by Band Ratioing
Band ratioing is often used in practice for enhancement of multispectral data. Spectral ratios can be defined as

G(i, j) = [a₁B₁(i, j) + a₂B₂(i, j) + ⋯ + aₙBₙ(i, j)] / [b₁B₁(i, j) + b₂B₂(i, j) + ⋯ + bₙBₙ(i, j)]   (69)
where a₁, a₂, ..., aₙ and b₁, b₂, ..., bₙ are constants; B₁, B₂, ..., Bₙ are the gray values in the different spectral bands; and n is the number of bands. Ratioed images often show details which are not visible in the raw image.
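Eq. (69) can be sketched directly; the helper name, the weight values, and the two toy bands below are illustrative (the weights shown give a simple near-infrared/red ratio):

```python
import numpy as np

def band_ratio(bands, a, b, eps=1e-6):
    """Spectral ratio image of Eq. (69).

    bands: list of 2-D arrays B1..Bn; a, b: numerator and denominator
    weights.  eps guards against division by zero in dark pixels.
    """
    num = sum(ai * Bi for ai, Bi in zip(a, bands))
    den = sum(bi * Bi for bi, Bi in zip(b, bands))
    return num / (den + eps)

B1 = np.array([[10.0, 20.0], [30.0, 40.0]])   # e.g. a red band
B2 = np.array([[40.0, 40.0], [30.0, 20.0]])   # e.g. a near-infrared band
ratio = band_ratio([B1, B2], a=(0.0, 1.0), b=(1.0, 0.0))   # NIR / red
```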
F. Enhancement by Principal Components

In Landsat multispectral images there exists a correlation between the gray values of a pixel in the different spectral bands. The spectral band values can be decorrelated using the Karhunen-Loeve transform (or principal component technique). The technique is discussed below. Let X = (x₁, x₂, ..., xₙ)ᵀ be an n-dimensional vector representing the spectral gray values of a pixel corresponding to n spectral bands. Let the transformed vector Y = (y₁, y₂, ..., yₙ)ᵀ be given by

Y = [A]X   (70)

where [A] is an n × n matrix, the rows of which are the eigenvectors of the covariance matrix C_x of X, such that

C_x = [A][Λ][Aᵀ]   (71)

where [Λ] represents a diagonal matrix, the elements of which correspond to the eigenvalues of C_x:

[Λ] = diag(λ₁, λ₂, ..., λₙ)   (72)

where λ₁ ≥ λ₂ ≥ ⋯ ≥ λₙ. The image corresponding to y₁ is the first principal component image; similarly, the image corresponding to yᵢ is the ith principal component image. The transformed values y₁, y₂, ..., yₙ, which are decorrelated, concentrate most of the information in the first few principal components, depending upon the relative magnitudes of the eigenvalues λᵢ. The first few principal component images may be used as enhanced images for visual interpretation purposes.
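The transform of Eqs. (70)-(72) can be sketched with a covariance eigendecomposition; the function name and the synthetic correlated bands are illustrative:

```python
import numpy as np

def principal_components(pixels):
    """Karhunen-Loeve transform of Eq. (70): Y = A X.

    pixels: (N, n) array of N pixel vectors in n spectral bands.
    Returns the transformed pixels and the eigenvalues, sorted so the
    first component carries the largest variance (lambda1 >= lambda2 >= ...).
    """
    centered = pixels - pixels.mean(axis=0)
    cov = np.cov(centered, rowvar=False)       # C_x, the band covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]          # reorder to descending eigenvalues
    A = eigvecs[:, order].T                    # rows of A are eigenvectors of C_x
    return centered @ A.T, eigvals[order]

rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
# Bands 1 and 2 strongly correlated, band 3 independent.
pixels = np.hstack([base, 0.9 * base, rng.normal(size=(200, 1))])
Y, lam = principal_components(pixels)
```

After the transform the band covariance of Y is diagonal, so the components are decorrelated as the text describes.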
G. Pseudocolor Composites

Enhancement can also be carried out by assigning various colors to the features in the image corresponding to different gray-level ranges. For example, from a single black-and-white image, three images can be generated such that each generated image enhances features in a chosen gray-level range. The three images can then be assigned red, blue, and green to generate an enhanced color composite. It is also possible to decompose the input image into three images such that each of the three images corresponds to a different frequency pass band; the three decomposed images can then be assigned red, blue, and green to generate the color composite output.
H. Enhancement by Stereo-Pair Decomposition

In the visual interpretation of Landsat images, in order to enhance interpretation capabilities, it is often desirable to add height information to the reflectance pattern or gray values in the image. MSS images are obtained by a scanning mechanism such that terrain height variations have almost no effect on the spatial location of the pixels in the image plane. If we know the height information for the pixels, then it is possible to generate stereo images from the MSS image using the model shown in Fig. 23. In Fig. 23, the object plane corresponds to a Landsat image. Let (x, y, z) be the coordinates of a point in the object plane; the (x, y) coordinates correspond to column and row numbers of the pixel. Let O₁ and O₂ be the perspective centers for obtaining projections of the object plane in the image planes IP₁ and IP₂. Let (x′₁, y′₁) and (x′₂, y′₂) be the coordinates of the point in image planes IP₁ and IP₂, respectively. The object plane and the image plane coordinates can be related by (Rao et al., 1982)

x′₁ = [1 + z/(z₁ − z)](x − x₁)   (73)

y′₁ = [1 + z/(z₁ − z)](y − y₁)   (74)
where (x, y, z) are the coordinates of a point in the object plane; (x₁, y₁, z₁) are the coordinates of the point O₁ with reference to O as the origin; and x′₁ and y′₁ are the coordinates of the corresponding point in the image plane IP₁ with O₁
FIG.23. Model for stereo image generation
as the origin. Similar equations can be obtained for projecting a point from the object plane to the image plane IP₂. Equations (73)-(76) can be used to generate stereo-pair images from the Landsat image. It is also possible to use other information, such as the earth's magnetic field, as the z information and generate the stereo images (Green et al., 1978).

I. Shift-Variant Enhancement Techniques

The techniques discussed above are shift invariant in nature; i.e., the transformation functions used for the enhancement do not change with respect to the spatial coordinates of the pixel. In practice, however, most images have different gray-value distributions and textural properties in different spatial regions, so shift-invariant operators may not yield good results for the entire image. To overcome this difficulty, shift-variant operators for intensity mapping and filtering can be used. The operators can be adaptive in nature and can be obtained by considering the local properties of the image at the different spatial locations (Kulkarni et al., 1982).
IV. GEOMETRIC CORRECTION AND REGISTRATION TECHNIQUES

As described in Section I, there are two types of geometric corrections. Corrections carried out using imaging-system characteristics are called systematic corrections; these are applied at the preprocessing stage. Precision corrections are usually carried out at the processing stage. One method of correcting the image is to use ground control point (GCP) information. Here the uncorrected image is compared with the corrected image or a map, and a few GCPs spread throughout the scene are identified. The spatial relationship between a point in the uncorrected image g(u, v) and the corrected image f(x, y) can be written as
g(u, v) = f(φ₁(x, y), φ₂(x, y))   (75)
where (u, v) are the spatial coordinates of a point in the uncorrected image; and (x, y) are the spatial coordinates of the corresponding point in the corrected image. Thus the transform relationship between the coordinates of the two images can be represented by

u = φ₁(x, y)   (76)

v = φ₂(x, y)   (77)

The functions φ₁ and φ₂ can be polynomials of the form

u = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} a_{mn} xᵐyⁿ   (78)

v = Σ_{m=0}^{M−1} Σ_{n=0}^{N−1} b_{mn} xᵐyⁿ   (79)
Equations (78) and (79) represent the relationship between the coordinates of a pixel in the uncorrected image and in the corrected image. The problem is to find the transformation coefficients a₀₀, a₁₀, ..., a_{M−1,N−1} and b₀₀, b₁₀, ..., b_{M−1,N−1}. These coefficients are obtained by using GCP information as follows. Let (xᵢ, yᵢ) for i = 1, 2, ..., N_p be the coordinates of the GCPs in the corrected image, where N_p represents the number of ground control points. Let (uᵢ, vᵢ) for i = 1, 2, ..., N_p be the coordinates of the corresponding GCPs in the uncorrected image. Let ûᵢ and v̂ᵢ represent the estimates of uᵢ and vᵢ obtained from Eqs. (78) and (79). The total error in the estimates ûᵢ is given by

J = Σ_{i=1}^{N_p} (ûᵢ − uᵢ)²   (80)
Equations (78) and (80) can be solved to get the coefficients a_{mn} such that J is minimum. The coefficients b_{mn} can be evaluated in the same fashion. Equations (78) and (79) represent polynomials of order M and N. In many cases the polynomials can be approximated by the first-order transformations given below.
u = a₀₀ + a₁₀x + a₀₁y   (81)

v = b₀₀ + b₁₀x + b₀₁y   (82)
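Fitting the first-order transformation of Eqs. (81)-(82) to ground control points, in the least-squares sense of Eq. (80), can be sketched as follows (the function name and the synthetic GCPs are illustrative):

```python
import numpy as np

def fit_first_order(xy, uv):
    """Least-squares fit of Eqs. (81)-(82), minimizing J of Eq. (80).

    xy: (Np, 2) GCP coordinates in the corrected image.
    uv: (Np, 2) corresponding coordinates in the uncorrected image.
    Returns coefficient vectors (a00, a10, a01) and (b00, b10, b01).
    """
    x, y = xy[:, 0], xy[:, 1]
    M = np.column_stack([np.ones_like(x), x, y])    # design matrix [1, x, y]
    a, *_ = np.linalg.lstsq(M, uv[:, 0], rcond=None)
    b, *_ = np.linalg.lstsq(M, uv[:, 1], rcond=None)
    return a, b

# Synthetic GCPs related by u = 2 + 0.5x, v = 1 + 0.5y (illustrative values).
xy = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
uv = np.column_stack([2 + 0.5 * xy[:, 0], 1 + 0.5 * xy[:, 1]])
a, b = fit_first_order(xy, uv)
```

With more than three GCPs the system is overdetermined, and the least-squares solution averages out GCP location errors.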
In order to carry out geometric correction, the pixels from the uncorrected image are transformed to the corrected image as defined by the transform equations, as shown in Fig. 24. The gray values in the output image are obtained from the input-image pixel gray values by using a resampling or interpolation technique; these are discussed in the next section. The uncorrected and the corrected images are shown in Figs. 25 and 26, respectively.

A. Interpolation Techniques
The interpolation techniques are used in geometric correction and are also used for image magnification and reduction. Interpolation is a process of estimating intermediate values of a continuous event from the discrete
FIG. 24. Geometric correction transformation.

FIG. 25. Modular multispectral scanner image, uncorrected.

FIG. 26. Geometrically corrected image of Fig. 25.
samples. The limitations of classical polynomial interpolation, like Lagrange interpolation, are thoroughly discussed by Hou and Andrews (1978). They developed an algorithm for interpolation using cubic spline functions. Recently, Keys (1981) has developed an algorithm for interpolation by cubic convolution. He has defined a kernel for interpolation. There are also other methods, like nearest neighbor, bilinear interpolation, and the hypersurface approximation.
The cubic convolution interpolation method is more accurate than the nearest-neighbor or bilinear interpolation methods; however, it is not as accurate as the cubic spline interpolation method. In interpolation by hypersurface approximation, a quadratic or cubic surface defined over two-dimensional space in the neighborhood of the point to be interpolated is used. For equispaced, one-dimensional data, the continuous interpolation function can be written as

g(x) = Σₖ cₖ u[(x − xₖ)/h]   (83)
where g(x) is the interpolated, continuous function corresponding to a sampled function f(xₖ); xₖ are the interpolation nodes; cₖ are coefficients which depend on the sampled data f(xₖ); and h is the sampling interval. The kernels for nearest-neighbor, bilinear, and cubic convolution interpolation are given in Eqs. (84), (85), and (86), respectively (Stucki, 1979).
u(s) = 1   for 0 ≤ |s| ≤ 0.5
     = 0   otherwise   (84)

u(s) = 1 − |s|   for 0 ≤ |s| ≤ 1
     = 0   otherwise   (85)

u(s) = |s|³ − 2|s|² + 1   for 0 ≤ |s| < 1
     = −|s|³ + 5|s|² − 8|s| + 4   for 1 ≤ |s| ≤ 2
     = 0   otherwise   (86)
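The kernel of Eq. (86) and the convolution interpolation of Eq. (83) can be sketched in one dimension (the function names and sample values are illustrative):

```python
def u_cubic(s):
    """Cubic convolution kernel of Eq. (86); this is the a = -1 member
    of the parametric family given later in Eq. (87)."""
    s = abs(s)
    if s < 1:
        return s ** 3 - 2 * s ** 2 + 1
    if s <= 2:
        return -s ** 3 + 5 * s ** 2 - 8 * s + 4
    return 0.0

def interpolate(samples, x, h=1.0):
    """One-dimensional convolution interpolation, Eq. (83), taking the
    coefficients c_k equal to the samples f(x_k)."""
    return sum(f_k * u_cubic((x - k * h) / h) for k, f_k in enumerate(samples))

f = [0.0, 1.0, 4.0, 9.0, 16.0]   # samples of f(x) = x^2 at the nodes
val = interpolate(f, 2.0)        # at a node the kernel reproduces the sample
```

Because u(0) = 1 and u(±1) = u(±2) = 0, the interpolant passes exactly through the samples, the defining property of an interpolating kernel.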
In the nearest-neighbor, bilinear, and cubic convolution interpolation methods, the coefficients cₖ in Eq. (83) are the sampled data values f(xₖ). Interpolation by hypersurface approximation can be carried out by using discrete orthogonal polynomials as the basis functions. The expression for a continuous surface in two-dimensional space using the hyperquadratic surface approximation has been given in Eq. (49); the same can be used for interpolation. As an illustration, the nearest-neighbor, cubic convolution, and hypersurface approximation interpolation algorithms have been applied to Landsat data, and the outputs are shown in Figs. 27 through 30 (Kulkarni and Sivaraman, 1984). Recently, Kekre et al. (1982) have used raised cosine functions as basis functions and have developed an algorithm for interpolation. Park and Schowengerdt (1983) have developed an algorithm for interpolation using the parametric cubic convolution technique. They have used the family of
FIG. 27. Modular multispectral scanner raw Landsat data.

FIG. 28. Interpolation by nearest neighbor of data in Fig. 27.

FIG. 29. Interpolation by cubic convolution of data in Fig. 27.
piecewise cubic polynomials, and the kernel for the same is given by

u(s) = (a + 2)|s|³ − (a + 3)|s|² + 1   for |s| ≤ 1
     = a|s|³ − 5a|s|² + 8a|s| − 4a   for 1 < |s| ≤ 2
     = 0   otherwise   (87)

Dᵢ(X) > Dⱼ(X)   for all j, j ≠ i   (95)
B. Minimum Distance Classifier

One of the important types of linear classifiers is the minimum distance classifier. Here the distances between sample points and the prototype training samples are used for the classification. Suppose that the reference vectors R₁, R₂, ..., R_m are given for m classes. The minimum distance classifier assigns the input sample X to a class ωᵢ if ‖X − Rᵢ‖ is the minimum, where ‖·‖ represents the distance, defined as

‖X − Rᵢ‖ = [(X − Rᵢ)ᵀ(X − Rᵢ)]^{1/2}   (96)
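The minimum distance rule of Eq. (96) can be sketched as follows (the function name and the reference vectors are illustrative):

```python
import numpy as np

def min_distance_classify(X, refs):
    """Assign X to the class whose reference vector R_i minimizes the
    Euclidean distance of Eq. (96); returns the class index."""
    d = [np.sqrt((X - R) @ (X - R)) for R in refs]
    return int(np.argmin(d))

refs = [np.array([40.0, 80.0]),    # illustrative class mean (e.g. vegetation)
        np.array([30.0, 10.0])]    # illustrative class mean (e.g. water)
label = min_distance_classify(np.array([38.0, 75.0]), refs)
```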
C. Supervised Classification Techniques

In supervised parametric classification techniques, the training samples are used to obtain the statistical properties of each of the categories. In decision making one can use the reflectance values of the pixels as the feature vectors, or the reflectance values can be mapped from an n-dimensional space to a lower-dimensional space. The maximum likelihood classifier has been widely accepted for the analysis of multispectral data. We assume that each observation (pixel) consists of a set of measurements on n variables. With the assumption of a multivariate normal distribution, the probability density that Xᵢ belongs to a class k is given by

p_k(Xᵢ) = (2π)^{−n/2} |Σ_k|^{−1/2} exp[−(1/2)(Xᵢ − μ_k)ᵀ Σ_k⁻¹ (Xᵢ − μ_k)]   (97)

where n is the number of measurement variables used to characterize each observation; Xᵢ is a vector of measurements on n variables associated with the ith observation; p_k(Xᵢ) is the probability density value associated with the observation vector Xᵢ, as evaluated for the class k; Σ_k is the covariance matrix
associated with the kth class; and μ_k is the mean vector associated with the kth class. In the maximum-likelihood decision rule, Eq. (97) allows the calculation of the probability that an observation is a member of the kth class; the individual pixel is then assigned to the class for which the probability is greatest. In an operational context, the mean and covariance matrices calculated from observed samples or training sets of finite sample size are used in Eq. (97). Equation (97) can be rewritten as

ln p_k(Xᵢ) = −(n/2) ln 2π − (1/2) ln|D_k| − (1/2)(Xᵢ − m_k)ᵀ D_k⁻¹ (Xᵢ − m_k)   (98)
where D_k is the covariance matrix associated with class k, taken as an estimator of Σ_k; and m_k is the mean vector associated with class k, taken as the estimator of μ_k. Since the log of the probability function is monotonically increasing, decisions can be made by comparing the values for each class obtained from Eq. (98). A simple decision rule can be derived from Eq. (98), as below, by eliminating the constants.

R₁: Choose the k which minimizes

F_k(Xᵢ) = ln|D_k| + (Xᵢ − m_k)ᵀ D_k⁻¹ (Xᵢ − m_k)   (99)

If we use the a priori probabilities of the classes, then the decision rule can be modified as (Strahler, 1980)

R₂: Choose the k which minimizes

F_k(Xᵢ) = ln|D_k| + (Xᵢ − m_k)ᵀ D_k⁻¹ (Xᵢ − m_k) − 2 ln p(ω_k)   (100)

where p(ω_k) is the probability that an observation will be a member of ω_k, i.e., a prior probability for class ω_k.
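The decision rules R₁ and R₂ of Eqs. (99)-(100) can be sketched as follows (the function names and the class statistics are illustrative):

```python
import numpy as np

def ml_discriminant(X, mean, cov, prior=None):
    """F_k of Eq. (99); smaller is better.  Passing a prior adds the
    -2 ln p(w_k) term of Eq. (100)."""
    d = X - mean
    F = np.log(np.linalg.det(cov)) + d @ np.linalg.inv(cov) @ d
    if prior is not None:
        F -= 2.0 * np.log(prior)
    return F

def ml_classify(X, means, covs, priors=None):
    """Choose the class index k minimizing F_k(X)."""
    priors = priors or [None] * len(means)
    return min(range(len(means)),
               key=lambda k: ml_discriminant(X, means[k], covs[k], priors[k]))

means = [np.array([40.0, 80.0]), np.array([30.0, 10.0])]   # illustrative stats
covs = [np.eye(2) * 25.0, np.eye(2) * 25.0]
k = ml_classify(np.array([38.0, 70.0]), means, covs)
```

With equal covariances, as here, the rule reduces to a Mahalanobis-distance comparison; the prior term of Eq. (100) shifts the decision toward more frequent classes.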
D. Tree Classifiers

Single-stage classifiers are used in practice in remote sensing. However, with the development of sensors like the thematic mapper, data are being acquired with higher spectral as well as spatial resolution, and the analysis of such data with a single-stage classifier would take a huge amount of computer time. Tree classifiers, also known as multilevel classifiers, are considered more effective than single-stage classifiers, since they improve both accuracy and computational efficiency. A typical decision tree scheme is shown in Fig. 36 (Fu, 1982). At the first level, the classes are classified into i groups using only n₁ features. Here i ≤ m and n₁ ≤ n, and the n₁ features selected are the best features to classify these i groups. The same procedure is then repeated at the second level for each of the i groups, and the method continues in this fashion for the third level, fourth level, etc., until each of the original m classes can be separately identified. Following each tree path in the decision tree, each
FIG.36. Decision tree structure classification
of the m classes can be recognized. The basic concerns of the tree classifier are the separation of the groups of classes at each nonterminal node and the choice of the subset of features which are most effective in separating these groups. Therefore, there are three major tasks in the design of a tree classifier (Lin and Fu, 1983): (1) to set up the structure of an optimum tree, (2) to choose the most effective feature subset at each nonterminal node, and (3) to choose the decision rule at each nonterminal node.
During the design of a tree classifier, one would always like to obtain an optimum tree classifier in the sense of achieving the highest possible classification accuracy while using the smallest possible amount of computer time. Many research workers have defined different evaluation functions to direct a search through possible decision tree structures. However, these optimization approaches require a large amount of computer time and memory space, and optimality is not guaranteed. One way to reduce the total number of possible tree structures is to limit the number of features selected at each stage. A binary tree can be considered a special case of the tree classifier. An algorithm for the binary tree has been developed and implemented (Kulkarni, 1983); the approach is motivated by classification accuracy as well as computational efficiency. Here, at each nonterminal node two clusters are formed. In order to obtain the clusters of classes, minimum distance has been used as the criterion: the two most distant class centers are chosen as the cluster centers, and the distance, in the feature space, of each class mean from the cluster centers is used to assign the class to one of the two clusters. After all the classes have been assigned, the cluster centers are
recalculated and the process is iterated, until the cluster centers are stabilized. This can be described in the steps below:
(1) Select the mean vectors of the two classes which are the maximum distance apart in the feature space as the starting cluster centers.
(2) Assign each class to one of the clusters, using the distance in the feature space as the criterion.
(3) Calculate the new cluster centers as the mean of the means of the classes in each cluster.
(4) If U₁ and U₂ are the old centers and Û₁ and Û₂ are the new cluster centers, then the error, Err, can be written as

Err = Σ_{i=1}^{2} (Uᵢ − Ûᵢ)(Uᵢ − Ûᵢ)ᵀ   (101)
(5) Repeat the procedure from steps 2 through 4 until the cluster centers are stabilized.

In order to classify a pixel to one of the clusters at each nonterminal node, one can use a subset of the selected features. For the selection of the subset of features, the following procedure can be adopted. At each nonterminal node the scatter matrix S₁ can be defined as

S₁ = Σ_{i=1}^{2} (Uᵢ − U₀)(Uᵢ − U₀)ᵀ   (102)

where Uᵢ is the mean vector corresponding to cluster i and U₀ is the total pooled mean vector. The pooled covariance matrix can be defined as

S₂ = Σ_{i=1}^{2} Kᵢ   (103)

where Kᵢ is the covariance matrix corresponding to cluster i. The separability at each nonterminal node can be defined as (O'Toole and Stark, 1980)

J = Tr(S₁S₂⁻¹)   (104)
where Tr(·) denotes the trace of a matrix. The separability J can be used as the criterion for selecting the features, as described in the steps below:

(1) Select a combination of the features.
(2) Obtain the scatter matrix S₁ and the pooled covariance matrix S₂ corresponding to the two clusters, considering the selected features.
(3) Obtain the separability J.
(4) Repeat the above procedure for the desired combinations of feature vectors, and select a minimum number of features with maximum separability as the subset of features to be used.
(5) Repeat this process at each nonterminal node.

Another measure of separability is the Bhattacharya distance. In the case of Gaussian-distributed classes, the Bhattacharya distance can be expressed as (Lin and Fu, 1983)
J = B_m + B_c   (105)

where

B_m = (1/8)(U₂ − U₁)ᵀ[(C₁ + C₂)/2]⁻¹(U₂ − U₁)   (106)

B_c = (1/2) ln{ |(C₁ + C₂)/2| / (|C₁|^{1/2}|C₂|^{1/2}) }   (107)
B_m is due to the difference of the means of the two clusters, and B_c is due to the covariances corresponding to the two clusters. As an illustration, an example has been worked out using the above procedure. In the example, Landsat data have been analyzed. In the area selected, 16 categories have been observed. The mean vectors corresponding to these 16 categories have been evaluated and are given in Table IV. The tree structure for these classes is obtained using the procedure described above and is shown in Fig. 37. In order to select the subset of features at each nonterminal node, the separability function J is used. Table V shows the separability J at all nonterminal nodes for various subsets. In the given example, the subsets of the
TABLE IV
MEAN VECTORS

                                         Mean vector (reflectance values)
Class no.  Description                  Band 1   Band 2   Band 3   Band 4
 1         Cropland                      39.0     38.8     82.5     79.7
 2         Fallowland                    77.9    103.2    110.0     88.6
 3         Mixed forest                  30.8     36.3     84.7     78.7
 4         Water                         52.2     57.7     37.7     10.7
 5         Wet crop                      39.7     37.8     75.2     64.3
 6         Sal forest                    40.7     38.5     99.0     94.7
 7         Cultural waste                43.4     48.4     72.4     61.2
 8         Scrubs                        47.7     54.8     87.7     79.7
 9         Garjin forest                 34.4     30.9     88.2     88.0
10         Tank water                    35.1     31.4     29.3     11.7
11         Wet land                      57.7     71.5     70.7     48.0
12         Bamboo forest                 35.6     37.6     90.8     90.4
13         Reservoir water               28.8     22.6     14.0      3.4
14         Jhum area                     44.1     55.5     61.9     47.5
15         Tropical fruit plantation
16         Evergreen forest              38.0     35.6     80.0     79.2
FIG. 37. Binary tree structure.
features to be used have been selected heuristically. However, the procedure can be automated by defining the optimization function in terms of accuracy and the computer time required. The maximum-likelihood technique can be employed at each nonterminal node for the classification.
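The iterative two-cluster split of steps (1)-(5) above can be sketched as follows (the function name and the toy class means are illustrative):

```python
import numpy as np

def split_two_clusters(class_means, tol=1e-6, max_iter=100):
    """Iterative two-cluster split of a set of class means.

    Step (1): start from the two most distant class means; then alternate
    between assigning classes to the nearer center (step 2) and recomputing
    the centers (step 3) until Err of Eq. (101) falls below tol (steps 4-5).
    Returns the assignment (0 or 1 per class) and the two cluster centers.
    """
    means = np.asarray(class_means, dtype=float)
    d = np.linalg.norm(means[:, None, :] - means[None, :, :], axis=-1)
    i, j = np.unravel_index(np.argmax(d), d.shape)
    centers = np.array([means[i], means[j]])
    for _ in range(max_iter):
        assign = np.argmin(
            np.linalg.norm(means[:, None, :] - centers[None, :, :], axis=-1),
            axis=1)
        new = np.array([means[assign == c].mean(axis=0) for c in (0, 1)])
        if np.sum((new - centers) ** 2) < tol:   # Err below tolerance: stable
            break
        centers = new
    return assign, centers

class_means = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0], [11.0, 10.0]]
assign, centers = split_two_clusters(class_means)
```

Applying the same split recursively to each resulting group yields the binary tree structure of Fig. 37.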
E. Contextual Classification Techniques

In the earlier sections we have discussed classification algorithms in which each pixel is classified individually and independently. Such classification can exploit only spectral information; there is no provision for using spatial information to help decide what a particular pixel in the image might be. Using spatial information together with spectral information, the analyst may easily identify roads, delineate boundaries, etc. Recent studies have demonstrated the effectiveness of a contextual classifier that combines spectral and spatial information within a general statistical approach. Machine classification algorithms can incorporate spatial information in several ways. These approaches can be categorized as being structural and
TABLE V
SEPARABILITY J FOR DIFFERENT FEATURE VECTOR SUBSETS
(J for each candidate band subset at nonterminal nodes 1-14)
textural or contextual. In the contextual approach, the probable classification on neighboring pixels influences the classification of each pixel. Classification accuracies can be improved through this approach, because of the fact that the certain ground cover classes naturally tend to occur more frequently in the same context than others. A general contextual classification approach probabilistically relates the classification of a pixel to the true classes of a limited number of surrounding
pixels. Chittineni (1981) uses a general Markov model to describe the class dependencies between neighboring pixels; he developed his model for one-dimensional multispectral data. Compound decision theory is invoked to develop a classification method which exploits spatial/spectral information. The approach was formulated by Swain et al. (1981) and was further developed by Tilton et al. (1982). In compound decision theory a decision rule d(x_ij) assigns a minimum-risk classification to a pixel (i, j), as shown below. Let x_ij be n observations from location (i, j) having a fixed but unknown classification u_ij. The classification u_ij can be any of m classes from the set Ω = (ω₁, ω₂, ..., ω_m). Define the context of a pixel at location (i, j) as p − 1 observations spatially near the observation x_ij, as shown in Fig. 38. Group the p observations in the p-context array into a vector of observations X_ij = (x₁, x₂, ..., x_p)ᵀ, and let u_ij be the vector of true but unknown classifications associated with the observations in X_ij. Let C^p ∈ Ω^p be a vector of possible classifications for the elements of any p-context array. The decision rule, which defines the set of discriminant functions for the classification problem, is (Tilton et al., 1982)

d(x_ij) = the action (classification) a which maximizes Σ over C^p ∈ Ω^p, with the classification of pixel (i, j) in C^p equal to a, of [Π_{k=1}^{p} f(x_k | c_k)] G(C^p)   (108)

where G(C^p) is a "context function," the relative frequency with which C^p occurs in the scene being analyzed, and f(x_k | c_k) is a weighted sum of multivariate normal densities. Methods for estimating the functions f(x_k | c_k) are well known from the noncontextual maximum-likelihood decision rule. Methods for estimating the context function G(C^p) are discussed by Swain et al. (1982). Tilton (1983) has implemented the algorithm using compound decision theory on a massively parallel processor (MPP) for efficient classification of multispectral data using contextual information.
FIG. 38. Examples of p-context arrays (p = 2, 3, 5).
F. Clustering Techniques

In the processing of remotely sensed data there are many instances where classification must, and can, be performed without a priori knowledge. Clustering techniques deal with this task. It is a common phenomenon that features belonging to the same class tend to form groups, or clusters, in the feature space. Suppose we want to classify N samples, each characterized by an n-dimensional vector; i.e., we are given a known set of vectors (X_1, X_2, ..., X_N). Each sample is to be placed into one of m classes (ω_1, ω_2, ..., ω_m), where m may or may not be known. The class to which the ith sample is assigned is denoted by ω_ki. A classification Ω is a vector made up of the ω_ki's, and a configuration X* is a vector made up of the X_i's; i.e.,

    Ω = [ω_k1 ω_k2 ... ω_kN]^T        (109)

    X* = [X_1^T X_2^T ... X_N^T]^T        (110)
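To make the formulation concrete, the sketch below produces a classification vector Ω for a small sample set by a simple iterative assignment scheme (a k-means-style algorithm — an illustrative choice, not one prescribed by the text):

```python
import numpy as np

def iterative_cluster(X, m, iters=20, seed=0):
    """Assign each of the N samples (rows of X) to one of m classes.

    Returns Omega, the classification vector [w_k1 ... w_kN] of Eq. (109),
    obtained by alternating nearest-center assignment and mean updates.
    """
    rng = np.random.default_rng(seed)
    N = len(X)
    centers = X[rng.choice(N, size=m, replace=False)]
    Omega = np.zeros(N, dtype=int)
    for _ in range(iters):
        # assignment step: nearest center in Euclidean distance
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        Omega = d.argmin(axis=1)
        # update step: recompute each class mean from its members
        for k in range(m):
            if np.any(Omega == k):
                centers[k] = X[Omega == k].mean(axis=0)
    return Omega

# Two well-separated clusters in a two-dimensional feature space
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
Omega = iterative_cluster(X, m=2)
```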
The clustering criterion J is a function of Ω and X* and can be written as (Fukunaga, 1972)

    J = J(Ω, X*)        (111)

By definition, the best classification Ω_0 satisfies either

    J(Ω_0, X*) = min over Ω of J(Ω, X*)        (112)

or

    J(Ω_0, X*) = max over Ω of J(Ω, X*)        (113)

according to the criterion chosen. Many iterative algorithms to obtain the optimum J are available. Distance measures or similarity measures are basically used to define the function J. Some of the common distance measures are given below (Deekshatulu and Kamat, 1983).

(1) Minkowski metric

    d(X_i, X_j) = [Σ_{k=1}^{n} |x_ik - x_jk|^s]^{1/s}        (114)
where n is the number of features.

(2) Quadratic metric

    d(X_i, X_j) = (X_i - X_j)^T W (X_i - X_j)        (115)

where W is an n x n positive definite matrix.

(3) Normalized correlation

    d(X_i, X_j) = (X_i^T X_j) / [(X_i^T X_i)(X_j^T X_j)]^{1/2}        (116)
DIGITAL PROCESSING OF REMOTELY SENSED DATA
(4) Mahalanobis metric

    d(X_i, X_j) = (X_i - X_j)^T C^{-1} (X_i - X_j)        (117)

where C denotes a within-group covariance matrix.

VI. SYSTEM DESIGN CONSIDERATIONS
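In a modular processing system of the kind described in this section, distance measures such as Eqs. (114)-(117) become small reusable routines. A minimal NumPy sketch (the function names and example vectors are illustrative, not from the text):

```python
import numpy as np

def minkowski(Xi, Xj, s=2):
    """Eq. (114): Minkowski metric over n features (s = 2 gives Euclidean)."""
    return float((np.abs(Xi - Xj) ** s).sum() ** (1.0 / s))

def quadratic(Xi, Xj, W):
    """Eq. (115): quadratic metric; W is an n x n positive definite matrix."""
    d = Xi - Xj
    return float(d @ W @ d)

def normalized_correlation(Xi, Xj):
    """Eq. (116): correlation of the two feature vectors, scale-invariant."""
    return float((Xi @ Xj) / np.sqrt((Xi @ Xi) * (Xj @ Xj)))

def mahalanobis(Xi, Xj, C):
    """Eq. (117): Mahalanobis metric; C is the within-group covariance."""
    d = Xi - Xj
    return float(d @ np.linalg.inv(C) @ d)

Xi, Xj = np.array([1.0, 2.0]), np.array([4.0, 6.0])
d_euclid = minkowski(Xi, Xj)              # 5.0
d_mahal = mahalanobis(Xi, Xj, np.eye(2))  # reduces to squared Euclidean, 25.0
```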
As discussed in the previous sections, the analysis of remotely sensed data is carried out in two stages, namely, the preprocessing stage and the processing stage. Preprocessing involves reading data from high-density digital tapes (HDDTs), systematic corrections for geometric distortions, radiometric corrections for sensor gain and bias variations, and generation of scene latitude and longitude information for standard film or computer-compatible tape (CCT) products. Processing deals with the application of precision geometric correction, enhancements, classification, etc., for generating thematic maps. The inputs to the processing system can be data on CCT from various sensors such as TM, MSS, etc. Input information can also be in the form of geometric control points, training sets, etc. The outputs of the processing system can be computer-compatible tapes, film products, histograms, scatter diagrams, etc. The processing system can be considered as a black box, as shown in Fig. 39. Since there is such a variety of inputs and outputs, it is very difficult to design an optimum system to meet all of the input/output and efficiency
FIG. 39. Typical software system I/O requirements. (Inputs: TM (CCT), IRS (CCT), MSS (CCT), GCP information, ground truth. Outputs: classified output, geometric correction, histograms, statistical tables, films, tapes.)
FIG. 40. Typical chain of modules: CCT input, reformatting, high-pass filter, contrast stretching, geometric correction, filming, film output.
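The chain of Fig. 40 amounts to function composition: each module consumes the previous module's output. A minimal sketch (the module bodies are simplified stand-ins, not the actual software):

```python
import numpy as np

def reformatting(img):
    """Read/convert raw CCT data into a working floating-point array."""
    return np.asarray(img, dtype=float)

def high_pass_filter(img):
    """Subtract a 3 x 3 local mean -- a crude high-pass, for illustration."""
    padded = np.pad(img, 1, mode="edge")
    rows, cols = img.shape
    local_mean = sum(padded[i:i + rows, j:j + cols]
                     for i in range(3) for j in range(3)) / 9.0
    return img - local_mean

def contrast_stretching(img):
    """Linearly stretch the gray-level range to [0, 1]."""
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) if hi > lo else img * 0.0

def run_chain(img, modules):
    for module in modules:  # output of one module feeds the next
        img = module(img)
    return img

out = run_chain([[1, 2], [3, 4]],
                [reformatting, high_pass_filter, contrast_stretching])
```

Because every module maps an image array to an image array, modules can be reordered or swapped freely, which is the flexibility the modular approach is meant to provide.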
requirements. Hence, a modular approach can be adopted; the typical chain of modules is shown in Fig. 40. The modular approach gives flexibility in the choice of inputs/outputs and of processing techniques. There can be a number of modules for the variety of processing techniques discussed earlier, and the input/output data formats of the modules can be designed so that the output of one module can be used as the input to another. With respect to hardware requirements, both the preprocessing and processing systems can be configured around a minicomputer or around the mainframe of a general-purpose computer. However, many special devices,
FIG. 41. Typical system configuration. (A VAX 11/780 host with console terminal, video terminals, a high-density 28-track tape unit, time-code translator and decom, two tape drives, two disks, and a film recorder, serving the input, processing, and output paths.)
like high-density tape units, display systems, film recorders, digitizers, array processors, etc., are often used in processing remotely sensed data. Hence, a dedicated system built around a minicomputer is more suitable. A typical system configuration is shown in Fig. 41.
VII. CONCLUSION

This article mainly describes some of the digital techniques which are often used in processing remotely sensed images. Some of the sensors are also discussed. It is hoped that this article will be useful as a first reading for scientists in various disciplines, specifically those who are using remotely sensed data for a variety of applications.
REFERENCES

Andrews, H. C. (1970). "Computer Techniques in Image Processing." Academic Press, New York.
Andrews, H. C., and Tescher, A. G. (1972). IEEE Spectrum July, 20-23.
Anuta, P. E. (1970). IEEE Trans. Geosci. Electron. GE-8, 353-368.
Anuta, P. E. (1977). Geophysics 42, 468-481.
Barnea, D. I., and Silverman, H. F. (1972). IEEE Trans. Comput. C-21, 179-186.
Bernstein, R. (1976). IBM J. Res. Dev. 20, 40-57.
Brown, D. W. (1966). J. Nucl. Med. 7, 165.
Chittineni, C. B. (1981). Comput. Graphics Image Process. 16, 305-340.
Chittineni, C. B. (1982). Proc. Int. Symp. Machine Process. Remotely Sensed Data, LARS, Purdue University, pp. 245-254.
Chittineni, C. B. (1983). IEEE Trans. GE-21, 163-174.
Deekshatulu, B. L., and Bajpai, O. P. (1982). Curr. Sci. 51, 1133.
Deekshatulu, B. L., and Kamat, D. S. (1983). Proc. Indian Acad. Sci. 6 (Part 2), 135-144.
Deekshatulu, B. L., and Krishnan, R. (1982). J. IETE 28, 447-456.
Duda, R. O., and Hart, P. E. (1970). "Pattern Classification and Scene Analysis." Wiley, New York.
Fu, K. S. (1976). IEEE Trans. Geosci. Electron. GE-14, 10-18.
Fu, K. S. (1980). "Digital Pattern Recognition." Springer-Verlag, Berlin and New York.
Fu, K. S., ed. (1982). "Application of Pattern Recognition." CRC Press, Boca Raton, Florida.
Fu, K. S. (1983). Proc. Indian Acad. Sci. 6 (Part 2), 153-175.
Fu, K. S., and Yu, T. S. (1980). "Statistical Pattern Classification." Wiley (Research Studies Press), New York.
Fukunaga, K. (1972). "Introduction to Statistical Pattern Recognition." Academic Press, New York.
Goldberg, M. (1981). In "Digital Image Processing" (J. C. Simon and R. M. Haralick, eds.), pp. 383-437. Reidel, Dordrecht.
Gonzalez, R. C., and Wintz, P. A. (1977). "Digital Image Processing." Addison-Wesley, Reading, Massachusetts.
Graham, R. E. (1962). IRE Trans. Inf. Theory IT-8, 129.
Green, A. A., Huntington, J. F., and Roberts, G. P. (1978). Proc. Int. Symp. Remote Sensing Environ., 12th 3, 1755.
Haralick, R. M. (1976). In "Topics in Applied Physics" (A. Rosenfeld, ed.), Vol. 11. Springer-Verlag, Berlin and New York.
Haralick, R. M. (1981). Proc. Pattern Recog. Image Process. Conf., Dallas, pp. 285-291.
Hou, H. S., and Andrews, H. C. (1978). IEEE Trans. ASSP-26, 508-517.
Hueckel, C. B. (1973). J. Assoc. Comput. Mach. 20, 631-647.
Kekre, H. B., Sahasrabudhe, S. C., and Goyal, N. C. (1982). Comput. Electron. Eng. 9, 131-152.
Keys, R. G. (1981). IEEE Trans. ASSP-29, 1153-1160.
Kulkarni, A. D. (1983). Proc. Int. Symp. Remote Sensing Environ., 17th, Ann Arbor, Michigan, pp. 609-615.
Kulkarni, A. D., and Sivaraman, K. (1984). Signal Process. 7, 65-73.
Kulkarni, A. D., Deekshatulu, B. L., and Rao, K. R. (1981). Proc. Int. Symp. MPRSD, LARS, Purdue University, 7th, pp. 181-187.
Kulkarni, A. D., Deekshatulu, B. L., and Rao, K. R. (1982). Proc. Int. Symp. MPRSD, LARS, Purdue University, 8th, pp. 258-262.
Kuwahara, M., Hachimura, K., and Kinoshita, M. (1976). In "Digital Processing of Bio-Medical Images" (K. Preston and M. Onoe, eds.). Plenum, New York.
Lee, J. S. (1980). IEEE Trans. PAMI-2, 165.
Lin, Y. K., and Fu, K. S. (1983). Pattern Recog. 16, 69-80.
Morgenthaler, D. S., and Rosenfeld, A. (1981). IEEE Trans. PAMI-3, 482-486.
Murphy, J. (1984). Technical Memo No. DMD-TM-84-368, Digital Methods Division, Canada Centre for Remote Sensing, Ottawa.
Oppenheim, A. V., Schafer, R. W., and Stockham, T. G. (1968). Proc. IEEE 56, 1264-1291.
O'Toole, R. K., and Stark, M. (1980). Appl. Opt. 19, 2496-2505.
Park, S. K., and Schowengerdt, R. A. (1983). Comput. Vision, Graphics Image Process. 23 (3).
Rao, K. R., Kulkarni, A. D., and Chennaiah, G. Ch. (1981). J. Photo Interpret. Remote Sensing 9, 44-48.
Rao, K. R., Kulkarni, A. D., and Chennaiah, G. Ch. (1982). J. Photo Interpret. Remote Sensing, Indian Soc. Photo Interpret. Remote Sensing 10, 1-5.
Reeves, G., ed. (1975). "Manual of Remote Sensing," Vol. I, p. 325. American Society of Photogrammetry.
Rosenfeld, A., ed. (1976). "Topics in Applied Physics," Vol. 11. Springer-Verlag, Berlin and New York.
Rosenfeld, A. (1983). Proc. Indian Acad. Sci. 6 (Part 2), 145-152.
Rosenfeld, A., and Kak, A. (1982). "Digital Picture Processing." Academic Press, New York.
Strahler, A. H. (1982). Remote Sensing Environ. 10, 135-163.
Stucki, P., ed. (1979). "Advances in Digital Image Processing." Plenum, New York.
Swain, P. H., and Davis, S. M. (1978). "Remote Sensing: The Quantitative Approach." McGraw-Hill, New York.
Swain, P. H., Vardeman, S. B., and Tilton, J. C. (1981). Pattern Recog. 13, 429-441.
Tilton, J. C. (1983). Proc. Int. Symp. Remote Sensing Environ., 17th, Ann Arbor, Michigan, pp. 1-9.
Tilton, J. C., Vardeman, S. B., and Swain, P. H. (1982). IEEE Trans. Geosci. Remote Sensing GE-20, 445-452.
Wallis, R. H. (1976). Proc. Symp. Curr. Math. Problems Image Sci., Monterey, California.
Wang, D. C. C., Vagnucci, A. H., and Li, C. C. (1983). Comput. Vision, Graphics Image Process. 26, 363-381.
Webb, W. (1983). Landsat-4 Ground Station Interference Description, Revision 7, GSFC-435-D-400, NASA, GSFC, August.
Webber, W. F. (1973). Proc. IEEE Conf. Mach. Process. Remotely Sensed Data, Oct.
Yasuoka, Y., and Haralick, R. M. (1983). Pattern Recog. 16, 113-129.
Index

A

Aberration function, 206-207, 216
Adaptation to parameters of signals and distortions, 6-9
  estimation of noise and distortion parameters, 8-9
  picture description and correction quality criterion, 6-7
  system description, 7-8
"Adaptive" correction of distortion, definition of, 9
Adaptive correction of distortions in imaging and holographic systems, 5-44
  automatic estimation of random-noise parameters, 9-16
  of linear distortions, 27-34
  noise suppression by filters with automatic parameter adjustment, 16-27
  of nonlinear distortions, 34-44
  problem formulation, 6-9
Adaptive differential pulse-code modulation, 167
Adaptive linear prediction, 166
Adaptive mode quantization, 51-54
Adaptive nonlinear transformations of the video signal scale, 47-55
Adaptive sampling and quantization, 162-164
ADPCM, see Adaptive differential pulse-code modulation
ALP, see Adaptive linear prediction
Amplitude windows, 48
Angiogram, using filters, 59
Antidiffusion operator, 334
APA, see Automatic parameter adjustment
Aperture function, 123
Apodization function, 121
Apodization masks, 13
Atmospheric windows, 313
Automatic localization of objects in pictures, 68-92
  allowance for object's uncertainty of definition and spatial nonuniformity, 78-81
  estimation of volume of signal corresponding to a stereoscopic picture, 88-92
  exactly known object, 71-77
  optimal linear coordinate estimator, 69-71
  optimal localization and picture contours, 82-87
Automatic parameter adjustment, 16
Auxiliary variable, statistical properties of, 299-305
Average operator, 153
Axial illumination, see Illumination, axial
B

BADM, see Basic asynchronous delta modulation
Band ratioing, 340
Basic asynchronous delta modulation, 169
Bhattacharya distance, 360
BIBO, see Bounded-input bounded-output
Binary hologram method, 108-109
Binary-media-oriented methods, 120
Biomedicine, 185-192
Bit-slicing, 48
Bounded-input bounded-output, 144
Burckhardt coding method, 114
C

CDF, see Cumulative distribution function
Central spot, 123
Cepstrum, complex, 145
  definition of, 145
Chavel-Hugonin coding method, 114
Chebyshev's inequality, for histograms, 73
Chi-square test, 281, 286-289
Cinematographic effect, 131
Classification techniques, 355-365
  pattern recognition, 355
Clustering techniques, 364-365
Code word length, defined, 161
Color holograms, 136
"Colorization," 68
Colormation C4300, 136
Compact code, 161
Compositional stereo holograms, 129, 131-132
Compound decision theory, 363
Compression ratio, defined, 162-163
  averages, 164-165
"Context function," 363
Contextual classification techniques, 361-363
"Contour," 82-83
Conventional transmission electron microscope, 203
Covariance function, of a picture, 10-11
Covariance matrix, 246
  identical to Cramer-Rao bound, 247
Cramer-Rao bound, 245, 271, 276, 305
Cramer-Rao confidence intervals, 269
CTEM, see Conventional transmission electron microscope
Cumulative distribution function, 328
D

Data compression, 158-173
  applications, 176-199
  methods and techniques, 162-173
    irreversible methods, 162
    reversible methods, 162
  prediction and interpolation, 164-166
    prediction algorithms, 165
  source coding, 159-162
DCT, see Discrete cosine transform
Decision compound theory, see Compound decision theory
Decision rule, 357-358
Defocus parameter, 219, 227, 260, 293
Delta modulation, 167-168
Design considerations, for remote sensing, 365-367
Detection of an object, 281-290
  in electron micrographs of carbon foil and NADH:Q oxidoreductase crystal, 289-290
"Detour phase" method, 108
Differential pulse-code modulation and delta modulation, 166-169, 179-180
Diffraction contrast, 207
Diffusion noise, 39
Digital computer engineering, 2
Digital correction of nonlinear distortions in imaging systems, 36-37
Digital filtering, 169
Digital filters, two-dimensional, 142-152, see also Two-dimensional digital filters
Digital filters, two-dimensional, 142-152
  applications, 176-199
  definition of, 142-144
  design methods of, 145-152
  stability of, 144-145
Digital filters, two-dimensional, and data compression, 141-200
  applications, 176-199
Digital image processing, 141-142
Digital optics, 1-140
Digital processing, 318-319
Digital processing of remotely sensed data, 310-367
  classification techniques, 355-365
  enhancement techniques, 326-343
  geometric correction and registration techniques, 343-355
  preprocessing techniques, 319-326
  system design considerations, 365-367
Direct approach, 215
  for reconstruction of object wave function, 219-224
    model computation, 221-224
    statistical analysis of approximate solution, 224-229
Discrete cosine transform, 170, 180
Discrete sine transform, 170
Distortions, correction of, 5-44
  distortion correction quality, 6
DM, see Delta modulation
DPCM, see Differential pulse-code modulation
DST, see Discrete sine transform
Duplicated symmetrization, 106-107

E

Edge detectors, 155-158
Edge-enhancement and detection techniques, 328-336
Electron microscopy of biological material
  contrast mechanisms, 207-209
  image formation in the CTEM, 205-207
    schematic of, 206
  interaction of beam with specimen, 203-204
    elastic scattering, 203
    inelastic scattering, 203
  the phase problem, 209, 214
  relation between object structure and electron wave function, 204-205
  models for the object
  stochastic process for low-dose image, 209-213
"Enhancement," 45
Enhancement by band ratioing, 340
Enhancement techniques, 326-343
Entropy function, 160
Entropy, of a source, 159-160
  definition of, 159
"Equalization," 43, 48-50
Equidensities, 48
ERTS-1 images, 183, 187
Extremal filtration algorithms, 62
F

"Fast algorithms," 17
Fast cosine transform, 180-181
Fast Fourier transform, 21, 219
Fast Fourier transform, 2D, 164, 171-172
Fast Walsh transforms, 171
FCT, see Fast cosine transform
FDM, see Frequency division multiplex
FFT, see Fast Fourier transform
Filtering techniques, 336-338
"Filter mask," 17
Finite impulse response, 143
FIR, see Finite impulse response
FIR digital filters, design of, 145-150
First-order interpolator, 166
First-order predictor, 165
First-order predictor algorithm, 165
Fisher information matrix, 245, 263-264, 305
Fletcher algorithm, 266
FOI, see First-order interpolator
FOP, see First-order predictor
Fourier holograms, 97-102
  synthesizing of, 133
Fourier transform of low-dose image, statistical properties of, 297-299
Fragmentwise equalization, 49-50
Frequency division multiplex, 176-177
Fresnel holograms, 97-102
FWT, see Fast Walsh transforms

G

GCP, see Ground control point
Geometric corrections, 323-326
  earth curvature and panoramic distortion, 324-325
  mirror nonlinearity, 325-326
  misalignment of TM axis and yaw, roll, pitch of the spacecraft, 326-327
Geometric correction and registration techniques, 343-354
  precision corrections, 343
  systematic corrections, 343
Global quantization, 55
Gray-scale manipulation techniques, 327-328
  or gray-level rescaling techniques, 327-328
Ground control point information, 319, 343-344

H

Haar transform, 170
Hadamard transform, 170
HIDM, see High information delta modulation
High information delta modulation, 169
High-resolution visible imaging instruments, 315
Histogram equalization, 328
Histogram of filter output, 72
Histogram hyperbolization, 48
Hologram coding, 109-112
Hologram synthesis, 92-136
  application to information display, 128-136
  discrete representation of Fourier and Fresnel holograms, 98-102
  mathematical model, 94-98
  reconstruction of, 120-128
  recording synthesized holograms, 102-120
Hologram window function, 123
Homomorphic filtering technique, 337
Homomorphic image processor, 337-338
HRV, see High-resolution visible
Huffman encoding procedure, 162
Hybrid hologram synthesis, 133-135
  rephotographing, 135
  sandwich holograms, 135
Hybrid optodigital holograms, 133-135
Hybrid volume hologram, 135
Hypothesis testing, see Statistical hypothesis testing

I

IDD, see Independent identically distributed
IIR, see Infinite impulse response
IIR digital filters, design of, 150-152
Illumination, axial, 230-242
  low-dose image recording, 233
  object wave reconstruction, 234-242
Illumination, tilted, 242-254
  orthonormal expansion of the low-dose image, 244-247
  reconstruction of the object wave function, 247-254
Image contrast, 208
Image enhancement techniques, in remote sensing, 326-343
Image handling in electron microscopy, statistical aspects, 202-308
  object wave reconstruction, 213-229
  parameter estimation, 254-277
  statistical hypothesis testing, 277-296
  wave-function reconstruction of weak scatterers, 230-254
Image intensity distribution in analytical form, 264
Image mask, 333
Image processing, statistical significance of, 295
Image wave function, 231
  relation to object wave function, 231
Independent identically distributed random variable, 339
Indian remote sensing satellite, 316
Infinite impulse response, 144
Information display, 128-136
Information Theory Theorem, first, 161
Information visualization, 129
  compositional stereo holograms, 129, 131-132
  "multiplan" holograms, 129-131
  programmable diffusors, 129, 132-134
Interferogram, equation for, 43
Interferogram, "one-dimensional," definition of, 12
Interplanetary stations, 15
Interpolation of hologram samples, 119
Interpolation techniques, 344-350
  cubic convolution method, 347
  cubic spline method, 347
  by hypersurface approximation, 347
  by nearest neighbor, 347-348
"Interpretation objects," 7
IRS, see Indian remote sensing satellite
K

Karhunen-Loeve transform, 170
"Kinoforms," 104
Kirsch mask, 158
KLT, see Karhunen-Loeve transform
KP, see Kronecker-Picard integral
Kronecker-Picard integral, 273
L

Landsat, 315, 342, 347-350
LANDSAT-C images, 182-185
Lee method, 111
Likelihood function, 245, 255, 270, 293
Likelihood ratio, 280, 285-289
Linear discriminant functions, 356
Linear distortions, correction of, 27-34
Linear filtration of noisy signal, 16
Local informational approach, 7
Localization on "blurred pictures," 78-81
  defocused pictures, 80-81
    detection characteristics, 81
  inexactly defined picture, 78-79
    estimator adjusted to averaged object, 78-79
    estimator with selection, 78
  spatially nonhomogeneous criterion, 79-80
    nonreadjustable estimator, 80
    readjustable estimator with fragmentwise optimal filtration, 79-80
Localization of objects in pictures, see Automatic localization of objects in pictures
Localization reliability, 82-87
Local space operators, 152-158
  applications, 176-199
  edge detectors, 155-158
  for image smoothing and enhancement, 153-155
Low-dose imaging, 209, 230, 242
  and stochastic process, 232
Low-resolution imaging of specimens with scattering contrast, 277
"Luna-24," 64
  lunar soil samples detected, 64, 66
M

Mammograms, after MSNR filtering, 58, 65
Markov source, 160
"Mars-4" and "Mars-5," 15, 20, 32
Mars surface, color pictures of, 21
Massive parallel processor, 363
Maximum likelihood estimation, 254-259
  applications of, 259-277
  consistency of, 256
  in electron microscopy, 254-259
Maximum-likelihood ratio, as test statistic, 285
Minimum distance classifier, 356
MINUIT program, 276
Misell's algorithm, 215
Moiré noise, 14
Markov model, 363
MPP, see Massive parallel processor
MRMS filter, 57, 64
MSNR filter, 57
MSS, see Multispectral scanner
"Multiplan" holograms, 129-131
Multiple exchange ascent algorithm, 148, 149
Multiplication method, 41-42
  for holograms, 41-42
Multispectral scanner, 314, 319, 329, 342
N

Narrow-band noise, 19-21
Newton-Kantorovich approach, 214-215
Noise covariance function, 10-12
Noise-signal separation, 8-9
Noise suppression, 16-27
  optimal linear APA filtration, 16-21
  pulse noise filtration, 21-27
    algorithm with detection by voting, 26-27
    iterative prediction algorithm, 24-25
    recursive prediction algorithm, 25-26
Nonlinear distortions, correction of, 34-44
  in imaging systems, 36-38
  in holograms and interferograms under unknown distortion function, 42-44
  in holographic systems, 38-42
O

OADM, see Operational asynchronous delta modulation
Object amplitude function, 239
Object phase function, 239, 241
Object wave function, 213
  and recorded intensity distribution, 215-218
Object wave reconstruction, 213-229
Operational asynchronous delta modulation, 169
Optical data processing, 128
Optimal filter mask, formula for, 57
Optimality criterion, 72
Optimal linear coordinate estimator, 69-71
Optimal linear filtration, 55-61
Optimal localization and picture contours, 82-87
  reference objects, selection of, 84-87
  whitening and contours, 82-83
Orthogonal coding, 123-125, 127
P

Parameter estimation, in low-dose electron microscopy, 254-277
  examples, 259-276
Parameter extraction, 162
Parseval relation, 72
Payload calibration data, 323, 326
PCD, see Payload calibration data
PCM, see Pulse-code modulation
Peak error, 162-163
Periodic noise, 14-15
Phase contrast, 209
Phase-media-oriented methods, 120
Phase problem, 209, 214
Picture distortion correction, 6, 7
Picture preparation, 45-68
  definition of, 45
Picture preparation in automated systems, 46-47
Picture preparation, in digital optics, 45-68
  by adaptive nonlinear transformations of video signal scale, 47-55
  combined methods of, 63-68
  linear preparation methods, 55-61
  problems of, 46-47
  rank algorithms of, 61-63
"Ping-pong propagation," 131
Pixel, 282
Poisson process, 211, 266, 270, 279, 293
Position detection of marker atoms, 290-294
  simulation experiment, presence of 3 uranium atoms, 293-294
"Power intensification of the picture," 50-51
Power spectrum, 162
Prediction method, of anomaly detection, 9
Prewitt mask, 158
Principal components, enhancement by, 341
"Problem of characteristic points," 84
Programmable diffusors, 129, 132-134
Prolate spheroidal functions, 247
Pseudocolor composites, 341
Pulse-code modulation, 115, 177
Pulse noise, 15-16
Pulse noise filtration algorithms, comparison of, 28
"Pure" modes, definition of, 51

Q

Quadrant recursive causal filter, 143
Quadruplicated symmetrization, 106-107
Quantization noise, 15-16
Quantization of orthogonal components, 40-41
R

Radiometric corrections, 319-323
  input radiance vs. output digital voltage, 322
  internal calibration systems, 321-322
  prelaunch gain and bias values, 320-321
  statistical methods, 322-323
Random-noise parameters, automatic estimation of, 9-16
  additive signal-independent fluctuation noise in pictures, 10-12
  additive wide-band noise parameters in one-dimensional interferograms, 12-14
  periodic and other narrow-band noises, 14-15
  pulse, quantization, and "striped" noise, 15-16
Rank algorithms of picture preparation, 61-63
RBV, see Return beam vidicon
Redundancy reduction, 158
Registration techniques, 350-354
"Regular" diffusors, 41
"Rejection" filter, 19
Remote sensing, 182-185
  laser radar, 182
  microwave radiometers, 182
  multispectral scanners, 182
  optical cameras, 182
  side-looking radar, 182
Remote sensing, applications of, 317-318
  agriculture and forestry, 317
  hydrology and water resources, 317
  land-use inventory and mapping, 317
  meteorology, 317-318
  military, 318
  mineral resources, 317
Remote sensors, 313-316
  photography, 313
  satellite remote sensing, 313-316
Rephotographing, 135
Return beam vidicon, 314
R-L algorithms, 62-63
rms error, 163
Robinson mask, 158
Robotics, 192-199
Robust-against-distribution-"tails," 62
RSS filter, 57
S

Sandwich holograms, 135
"Scalar filter," 17
Scattering contrast, 207
Scherzer focus conditions, 260-261
Schwartz inequality, 73
SDFT, see Shifted discrete Fourier transform
SEASAT-SAR images, 184-185, 188-191
Shadowing, correction of, 34
Shannon information, 258
Shannon theorem, first, 161, 162
Shifted discrete Fourier transform, 99-100, 106
Shift variant enhancement techniques, 343
"Shot noise," 202, 210, 225
Signal redundancy, 106
Signal-to-quantization-noise ratio, 167
"Sliding," 48
Slope-overload distortion, 168
Smoothing techniques, spatial, 338-340
SNR, see Signal-to-quantization-noise ratio
Source code efficiency, defined, 161
Source coding, 159-163
  comma codes, 160
  instantaneous codes, 160
  non-singular codes, 160
Source code redundancy, defined, 161
Spatial smoothing techniques, 338-340
Speckle contrast, 39-41
Spectral signatures, 310-313
  green vegetation, 310-311
  soil, 311-312
  water, 311-313
SPOT satellite, 315
SSR criterion, 29
Stability of two-dimensional digital filter, 144-145
  definition of, 144
Statistical decision theory, 3, 277
Statistical hypothesis testing, 277-296
  in electron microscopy, 277-296
Stereo effect, 88-89
Stereo image generation, 342
Stereo-pair decomposition, 342-343
Stochastic driver functions, 222
Stochastic function, 217
Stochastic Poisson process, 255, 264
Stochastic process for low-dose image, 209-213, 233
"Striped" noise, 5, 15-16
Structure informational approach, 7
Student's T test, 281, 286-289
Supervised classification techniques, 356-357
Symmetrization, 121-123
Synthesized holograms, application to information display, 128-136
  design of holographic displays, 129
  information visualization, 129
  optical data processing, 128
Synthesized holograms, reconstruction of, 120-128
  orthogonal coding, 123-125
  symmetrization, 121-123
  two-phase recording in phase medium, 125-128

T

TDM, see Time division multiplex
Telemedicine, 187
Test statistics, 280-289
  chi-square test, 281, 286-289
  likelihood ratio, 280, 285-289
  Student's T test, 281, 286-289
Thematic mapper, 315-316, 319, 325-327
Thresholding, 38-40, 162, 174, 180
Tilted illumination, see Illumination, tilted
Time division multiplex, 177
TM, see Thematic mapper
Transformations, 169-173
  Fourier, 169-170
  Haar, 170
  Hadamard-Walsh, 170
  Karhunen-Loeve, 170
Tree classifiers, 357-361
Two-dimensional digital filters, see Digital filters, two-dimensional
Two-dimensional digital filters and data compression, joint use of, 173-176
  processing system for digital comparison and correlation of images, 174-176
  typical connections of the two digital operations, 173-174
Two-dimensional low-pass digital filter, 148
Two-phase coding, 116-117
Two-phase recording in phase medium, 125-128
U

Unsymmetrical half-plane filters, 144
V

Variable-word-length coding, 171, 179-180
Variance, 246
Vision properties for picture preparation, use of, 63-68
Volume holograms, 135
Volume of signal corresponding to a stereoscopic picture, 88-92
Voting method, of anomaly detection, 9

W

Wave-aberration function, 231
Wave-function reconstruction of weak scatterers, 230-254
  axial illumination, 230-242
  tilted illumination, 242-254
Walsh spectrum, 20
Walsh transform, 170
Weakly scattering bell-shaped object structure, theoretical electron microscopy of, 269-273
Weak phase object, theoretical electron microscopy of, 259-264
Weak scatterers, wave-function reconstruction of, 230-254
"Whitening," 82-84
Whittaker-Shannon sampling, 301
Wiener filter, continuous frequency response of, 30
Wiener filtration theory, 16
Window, 351
Window function, hologram, 123
Window method, 146
  Kaiser window, 147
  Lanczos-extension window, 147
  Weber-type approximation window, 147
X

X-ray, with linear filtration, 60
Z

Zero-locating algorithm, 265
Zero-memory source, 159-160
  binary source, 159
  nth extension, 160
Zero-order interpolator, 166
Zero-order predictor, 165
Zero-order predictor algorithm, 165, 171, 172, 183, 186
ZOI, see Zero-order interpolator
ZOP, see Zero-order predictor