2. In region 1 the mean distance between particles will be $d_1$ and in region 2 the mean distance will be $d_2$, where $d_1 < d_2$. If we dilate using a kernel of radius $a$, where $d_1 < 2a < d_2$, this will tend to connect the particles in region 1 but should leave the particles in region 2 separate. To ensure connecting the particles in region 1, we can make $2a$ larger than $\frac{1}{2}(d_1 + d_2)$, but this may risk connecting the particles in region 2 (the risk will be reduced when the subsequent erosion operation is taken into account). Selecting an optimum value of $a$ clearly depends not only on the mean distances $d_1$, $d_2$ but also on their distributions. Space prevents us from entering into a detailed discussion of this: we merely assume that a suitable selection of $a$ is made, and that it is effective. The problem that is tackled here is whether the size of the final regions matches the a priori desired segmentation, i.e., whether any size distortion takes place. We start by
E. R. DAVIES
Figure 53. 1D particle distribution. The markers indicate the presence of particles, and $\rho_{1x}$, $\rho_{2x}$ denote the densities in the two regions. From Davies (2000b).
taking this to be an essentially 1D problem, which can be modeled as in Figure 53 (the 1D particle densities will now be given an $x$ suffix). Suppose first that $\rho_{2x} = 0$. Then in region 2 the initial dilation will be counteracted exactly (in 1D) by the subsequent erosion. Next take $\rho_{2x} > 0$: when dilation occurs, a number of particles in region 2 will be enveloped, and the erosion process will not exactly reverse the dilation. If a particle in region 2 is within $2a$ of an outermost particle in region 1, they will merge, and will remain merged when erosion occurs. The probability $P$ that this will happen is the integral over a distance $2a$ of the probability density for particles in region 2. In addition, when the particles are well separated we can take the probability density as being equal to the mean particle density
$\rho_{2x}$. Hence

$$P = \int_0^{2a} \rho_{2x}\,dx = 2a\rho_{2x} \qquad (112)$$
If such an event occurs, then region 1 will be expanded by amounts ranging from $a$ to $3a$, or 0 to $2a$ after erosion, though these figures must be increased by $b$ for particles of width $b$. Thus the mean increase in size of region 1 after dilation + erosion is $2a\rho_{2x}(a + b)$, where we have assumed that the particle density in region 2 remains uniform right up to region 1. We next consider what additional erosion operation will be necessary to cancel this increase in size. In fact, we just make the radius $\tilde{a}_{1D}$ of the erosion kernel equal to the increase in size:

$$\tilde{a}_{1D} = 2a\rho_{2x}(a + b) \qquad (113)$$
Finally, we must recognize that the required process is 2D rather than 1D, and take y to be the lateral axis, normal to the original (1D) x-axis. For simplicity we assume that the dilated particles in region 2 are separated laterally, and are not touching or overlapping (Fig. 54). As a result, the
GEOMETRIC DISTORTIONS PRODUCED BY IMAGE PROCESSING FILTERS
Figure 54. Model of the incidence of particles in two regions. Region 2 has sufficiently low density that the dilated particles will not touch or overlap. From Davies (2000b).
change of size of region 1 given by Eq. (113) will be diluted relative to the 1D case by the reduced density along the direction ($y$) of the border between the two regions: i.e., we must multiply the right-hand side of Eq. (113) by $b\rho_{2y}$. We now obtain the relevant 2D equation:

$$\tilde{a}_{2D} = 2ab\rho_{2x}\rho_{2y}(a + b) = 2ab\rho_2(a + b) \qquad (114)$$
where we have finally reverted to the appropriate 2D area particle density $\rho_2$. Clearly, for low values of $\rho_2$ an additional erosion will not be required, whereas for high values of $\rho_2$ substantial erosion will be necessary, particularly if $b$ is comparable to or larger than $a$. If $\tilde{a}_{2D} < 1$, it will be difficult to provide an accurate correction by applying an erosion operation, and all that can be done is to bear in mind that any measurements made from the image will require correction. (Note that if, as often happens, $a > 1$, $\tilde{a}_{2D}$ could well be at least 1.)

B. Discussion

This work was motivated by analysis of cereal grain images containing rodent droppings, which had to be consolidated by dilation operations to eliminate speckle, followed by erosion operations to restore size.¹⁰ It has been found that if the background contains a low density of small particles that tend, upon dilation, to increase the sizes of the foreground objects, additional erosion operations will in general be required to accurately represent the sizes of the regions. The effect would be similar if impulse noise were present, though theory shows what is observed in practice, that the effect is enhanced if the particles in the background are not negligible in
¹⁰ For further background on this application see Davies et al. (1998) and Davies (2000a).
size. The increases in size are proportional to the occurrence density of the particles in the background, and the kernel for the final erosion operation is calculable, the overall process being a necessary measure rather than an ad hoc technique.
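The correction of Eq. (114) is straightforward to apply in practice. The sketch below (a hypothetical helper of our own, not from the original paper) evaluates the additional erosion radius for given dilation radius, particle width, and background density:

```python
def extra_erosion_radius(a, b, rho2):
    """Additional erosion radius needed after dilation + erosion (Eq. 114):
    a~_2D = 2*a*b*rho2*(a + b), where a is the dilation kernel radius,
    b the width of the background particles, and rho2 the 2D area density
    of particles in region 2 (all in pixel units)."""
    return 2.0 * a * b * rho2 * (a + b)

# Example: a = 3, b = 2, rho2 = 0.01 gives a correction radius of 0.6 pixel.
# Being below 1 pixel, it cannot be applied as a discrete erosion, so any
# subsequent size measurements would have to be corrected numerically instead.
r = extra_erosion_radius(3, 2, 0.01)
```

This makes the point in the text concrete: only when $\tilde{a}_{2D}$ reaches the order of a pixel can the correction be realized as an actual erosion operation.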
X. A Median-Based Corner Detector

It may be thought that the edge shifts discussed at length in this article always present problems, but there is one case in which they have been turned to advantage: this is a novel strategy for detecting corners, developed by Paler et al. (1984). It adopts an initially surprising approach based on the properties of the median filter. The technique involves applying a median filter to the input image, and then forming another image that is the difference between the input and the filtered images. This difference image contains a set of signals that is interpreted as local measures of corner strength.

It may seem risky to apply such a technique since its origins suggest that far from giving a correct indication of corners, it may instead unearth all the noise in the original image and present this as a set of ‘‘corner’’ signals. Fortunately, analysis shows that these worries may not be too serious. First, in the absence of noise, strong signals are not expected in areas of background; nor are they expected near straight edges, since median filters do not shift or modify such edges significantly. However, if a neighborhood is moved gradually from a background region until its central pixel is just over a convex object corner, there is no change in the output of the median filter: hence there is a strong difference signal indicating a corner (see Section III.F).

Paler et al. (1984) analyzed the operator in some depth and concluded that the signal strength obtained from it is proportional to (1) the local contrast, and (2) the ‘‘sharpness’’ of the corner. The definition of sharpness they used was that of Wang et al. (1983), meaning the angle through which the boundary turns.
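The corner operator itself is only a few lines of code. A minimal sketch (with a hand-rolled median filter so that no image-processing library is assumed; the function name is ours) applied to a synthetic bright square shows strong signals at the four corners and essentially none along straight edges or in flat regions:

```python
import numpy as np

def median_corner_strength(image, size=5):
    """Corner strength = |input - median-filtered input| (Paler et al., 1984)."""
    img = image.astype(float)
    pad = size // 2
    p = np.pad(img, pad, mode='edge')
    med = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            med[i, j] = np.median(p[i:i + size, j:j + size])
    return np.abs(img - med)

# Synthetic test image: a bright 16x16 square on a dark background.
img = np.zeros((32, 32))
img[8:24, 8:24] = 100.0
c = median_corner_strength(img, size=5)
# c is large at the square's corners (e.g. c[8, 8] == 100) and zero both
# along straight edges (e.g. c[8, 16]) and in flat regions (e.g. c[16, 16]).
```

The corner pixel's 5 × 5 neighborhood is dominated by background, so the median stays at the background level while the input does not: the difference is the full object contrast.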
Since it is assumed here that the boundary turns through a significant angle within the filter neighborhood, the difference from the second-order intensity variation type of approach (based on modeling the local image intensity function in a Taylor series expansion) (Davies, 1997a) is a major one. Indeed, it is an implicit assumption in the latter approach that first- and second-order coefficients describe the local intensity characteristics reasonably rigorously, the intensity function being inherently continuous and differentiable. Thus the second-order methods may give unpredictable results with pointed corners where
directions change within the range of a few pixels. Nevertheless, it is worth looking at the similarities between the two approaches to corner detection before considering the differences. We proceed with this in the next subsection.
A. Analyzing the Operation of the Median Detector

This subsection considers the performance of the median corner detector under conditions in which the gray-scale intensity varies by only a small amount within the median filter neighborhood. This permits the performance of the corner detector to be related to low-order derivatives of the intensity variation, so that comparisons can be made with second-order corner detectors. To proceed we assume a continuous analogue image and a median filter operating in an idealized circular neighborhood. For simplicity, since we are attempting to relate signal strengths and differential coefficients, noise is ignored.

Next, recall that for an intensity function that increases monotonically with distance in some arbitrary direction $\tilde{x}$ but that does not vary in the perpendicular direction $\tilde{y}$, the median within the circular neighborhood is equal to the value at the center of the neighborhood. This means that the median corner detector gives zero signal if the curvature is locally zero. If there is a small curvature $\kappa$, the situation can be modeled by envisaging a set of constant-intensity contours of roughly circular shape and approximately equal curvature, within the circular neighborhood that will be taken to have radius $a$ (Fig. 55). Consider the contour having the median intensity value. This contour does not pass through the center of the neighborhood but is displaced to one side along the negative $\tilde{x}$-axis. Furthermore, the signal obtained from the corner detector depends on this displacement. If the displacement is $D$, it is easy to see that the corner signal is $Dg_{\tilde{x}}$, since $g_{\tilde{x}}$ allows the intensity change over the distance $D$ to be estimated (Fig. 55). The remaining problem is to relate $D$ to the curvature $\kappa$. A formula giving this relation has already been obtained. The required result is

$$D = \frac{1}{6}\kappa a^2 \qquad (115)$$

so the corner signal is

$$K = Dg_{\tilde{x}} = \frac{1}{6}\kappa g_{\tilde{x}} a^2 \qquad (116)$$
Figure 55. Geometry for estimation of corner signals from median-based detectors. (a) Contours of constant intensity within a small neighborhood: ideally, these are parallel, circular, and of approximately equal curvature; (b) cross section of intensity variation, indicating how the displacement D of the median contour leads to an estimate of corner strength. From Davies (1988b).
Note that $K$ has the dimensions of intensity (contrast), and that the equation may be re-expressed in the form

$$K = \frac{1}{12}(g_{\tilde{x}} a)(2\kappa a) \qquad (117)$$

so that, as in the formulation of Paler et al. (1984), corner strength is closely related to corner contrast and corner sharpness. To summarize, the signal from the median-based corner detector is proportional to curvature and to intensity gradient. Thus this corner detector gives an identical response to second-order intensity variation detectors such as the Kitchen and Rosenfeld (1982) (KR) detector.
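Equation (116) is easy to check numerically on the continuous model just described: take an intensity ramp $I = gr$ whose contours are circles of curvature $\kappa = 1/R$ about the origin, and compare the median over a disk-shaped neighborhood with the value at its center (a sketch under those model assumptions; the parameter values are illustrative):

```python
import numpy as np

# Contours of I = g*r are circles about the origin with curvature kappa = 1/R;
# the neighborhood is a disk of radius a centred at distance R from the origin.
g, R, a = 1.0, 5.0, 1.0
kappa = 1.0 / R

step = 0.01
xs = np.arange(R - a, R + a + step, step)
ys = np.arange(-a, a + step, step)
X, Y = np.meshgrid(xs, ys)
inside = (X - R) ** 2 + Y ** 2 <= a ** 2

# Corner signal = median over the neighborhood minus the centre value.
signal = np.median(g * np.sqrt(X ** 2 + Y ** 2)[inside]) - g * R
predicted = kappa * g * a ** 2 / 6.0   # Eq. (116): K = (1/6) * kappa * g * a^2
```

The measured and predicted signals agree to within the higher-order terms neglected in the derivation.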
However, this comparison is valid only when second-order variations in intensity give a complete description of the situation. Clearly the situation might be significantly different where corners are so pointed that they turn through a large proportion of their total angle within the median neighborhood. In addition, the effects of noise might be expected to be rather different in the two cases, as the median filter is particularly good at suppressing impulse noise. Meanwhile, for small curvatures, there ought to be no difference in the positions at which median and second-order derivative methods locate corners, and accuracy of localization should be identical in the two cases.
B. Practical Results

Experimental tests with the median approach to corner detection have shown that it is a highly effective procedure (Paler et al., 1984; Davies, 1988b). Corners are detected reliably and signal strength is indeed roughly proportional both to local image contrast and to corner sharpness (see Fig. 56). Noise is more apparent for 3 × 3 implementations and this makes it better to use 5 × 5 or larger neighborhoods to give good corner discrimination. However, the fact that median operations are slow in large neighborhoods, and that background noise is still evident even in 5 × 5 neighborhoods, means that the basic median-based approach gives poor performance by comparison with the second-order methods. However, both of these disadvantages are virtually eliminated by using a ‘‘skimming’’ procedure, in which edge points are first located by thresholding the edge gradient, and the edge points are then examined with the median detector to
Figure 56. Result of applying median-based corner detector. (a) Original off-camera 128 × 128 6-bit gray-scale image; (b) result of applying the median-based corner detector in a 5 × 5 neighborhood. Note that corner signal strength is roughly proportional both to corner contrast and to corner sharpness. From Davies (1997a).
Figure 57. Comparison of the median and KR corner detectors. (a) Original 128 × 128 gray-scale image; (b) result of applying a median detector; (c) result of including a suitable gradient threshold; (d) result of applying a KR detector. The considerable amount of background noise is saturated out in (a) but is evident from (b). To give a fair comparison between the median and KR detectors, 5 × 5 neighborhoods are employed in each case, and nonmaximum suppression operations are not applied: the same gradient threshold is used in (c) and (d). From Davies (1988b).
locate the corner points (Davies, 1988b). With this improved method, performance is found to be generally superior to that for (say) the KR method in that corner signals are better localized and accuracy is enhanced. Indeed, the second-order methods appear to give rather fuzzy and blurred signals that contrast with the sharp signals obtained with the improved median approach (Fig. 57). Next, we note that the sharpness of signals obtained by the KR method may be improved by nonmaximum suppression (Kitchen and Rosenfeld, 1982; Nagel, 1983). However, this technique can also be applied to the output of median-based corner detectors. Thus overall, the latter seem to be at least as effective as detectors based on finding second-order intensity variations in the input intensity function. Finally, see Davies (1992a) for a paper covering a fast median filtering algorithm with application to corner detection.
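The skimming procedure can be sketched as follows (our own illustrative implementation, not the original code: a 3 × 3 Sobel gradient threshold followed by the median-difference signal evaluated only at surviving edge points; the names and threshold value are ours):

```python
import numpy as np

def sobel_gradient(img):
    """Gradient magnitude from 3x3 Sobel masks."""
    p = np.pad(img, 1, mode='edge')
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.sqrt(gx ** 2 + gy ** 2)

def skimmed_corner_map(img, size=5, grad_thresh=50.0):
    """Threshold the edge gradient first, then evaluate the median-difference
    corner signal only at the surviving edge points (skimming)."""
    img = img.astype(float)
    pad = size // 2
    p = np.pad(img, pad, mode='edge')
    corner = np.zeros_like(img)
    for i, j in zip(*np.nonzero(sobel_gradient(img) > grad_thresh)):
        corner[i, j] = abs(img[i, j] - np.median(p[i:i + size, j:j + size]))
    return corner

# Demonstration on a bright square: only edge points are examined.
img = np.zeros((32, 32))
img[8:24, 8:24] = 100.0
cm = skimmed_corner_map(img)
```

Because the median need only be computed at edge points, the cost of large-neighborhood median operations falls sharply, and flat noisy regions are excluded at the gradient-threshold stage.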
XI. Boundary Length Measurement Problem

At first sight, this section may seem off the main track of this article. However, it is actually quite strongly linked to the central theme, as it is fundamentally involved with the relation between a continuum and measurements made in a discrete lattice of pixels.

There are many recognition schemes that involve tracking around the boundaries of objects. They include the ‘‘centroidal profile’’ or polar plot $(r, \theta)$ method, the $(r, s)$ method, the boundary orientation $(s, \psi)$ method, and the boundary curvature $(s, \kappa)$ method, $s$ being the boundary distance measured from some convenient point on the object boundary. These methods are described in some detail in Davies (1997a) and will not be considered further here. Simpler methods of recognizing objects also exist. One that has long been used is the ‘‘circularity’’ or ‘‘compactness’’ measure $C = \text{area}/(\text{perimeter})^2$, which also involves measurement along the object boundary.

The existence of a family of recognition schemes involving boundary distance $s$ makes it worthwhile to develop accurate means for estimating $s$. Probably the simplest measure of boundary distance takes all eight neighbors of a given pixel as being one unit of distance away. However, it is clearly more accurate to take the diagonally adjacent neighbors as being $\sqrt{2}$ times further away than the other four neighbors (Freeman, 1970)—a procedure that had become quite universal by 1977. At that stage Kulpa (1977) noted that this approach systematically overestimates the analogue boundary distance¹¹ by a small factor, and he calculated a correction. Thus the Freeman measure

$$L_F = n_e + \sqrt{2}\,n_o \qquad (118)$$

was replaced by the measure
$$L_K = 0.948\,n_e + 1.341\,n_o \qquad (119)$$
$n_e$ and $n_o$ being, respectively, the number of relevant even (nondiagonal) and odd (diagonal) Freeman chain code elements (Freeman, 1970). These measures are of the general form

$$L_G = \alpha n_e + \beta n_o \qquad (120)$$

where Kulpa assumed that $\beta/\alpha$ remains equal to $\sqrt{2}$. Later Proffitt and Rosen (1979) showed that this is valid, though the proof is purely

¹¹ That is, distance measured in the original analogue space, before digitization.
Figure 58. Possible variations of $L_F$ with $\theta$. These sketches show possible a priori variations of $w = L_F/L$ with $\theta$, $L$ being an ideal boundary distance measure. From Davies (1991b).
Figure 59. Geometry for calculating the variation of $L_G$ with $\theta$. OP and PQ are line segments with orientations 0° and 45° that represent the horizontal and diagonal sections of a line OQ with orientation $\theta$. From Davies (1991b).
mathematical and the validity of the result is not obvious. Here we study the problem with a view to clarifying the situation (Davies, 1991b).

A. Detailed Analysis

First we note that the point of the measure $L_F$ is that it is exactly correct in the two limiting cases in which we have straight edge boundaries aligned at angles $\theta = 0°$ and 45° to the pixel axes frame. However, between these limits $L_F$ varies with $\theta$ in an initially unknown way (Fig. 58). Next we follow Kulpa's method for calculating the length of a segment of boundary consisting of horizontal and diagonal sections, where the overall horizontal displacement is $a$ and the overall vertical displacement is $b$ (Fig. 59). Then the true (Euclidean) displacement is

$$L = \sqrt{a^2 + b^2} \qquad (121)$$
and the length measure is

$$L_F = (a - b) + \sqrt{2}\,b \qquad (122)$$

We now wish to generalize $L_F$ to the form

$$L_G = \alpha(a - b) + \beta b = \alpha a + (\beta - \alpha)b \qquad (123)$$
where $\alpha$ and $\beta$ are to be determined. Proceeding to polar coordinates (Fig. 59), so that $a = r\cos\theta$ and $b = r\sin\theta$, we find

$$L = r \qquad (124)$$

$$L_G = r[\alpha\cos\theta + (\beta - \alpha)\sin\theta] \qquad (125)$$

so that the ratio (ideally equal to unity) is

$$w = L_G/L = \alpha\cos\theta + (\beta - \alpha)\sin\theta \qquad (126)$$
We now note that $w$ can be rewritten in the form of a single cosine function:

$$w = \gamma\cos(\theta - \varphi) \qquad (127)$$

where

$$\gamma = \sqrt{\alpha^2 + (\beta - \alpha)^2} \qquad (128)$$

and

$$\tan\varphi = (\beta - \alpha)/\alpha \qquad (129)$$
However, we do not need to proceed with this detailed calculation, since it is our purpose here to point out some characteristics of the solution. In particular we note that $w$ is a symmetrical function and that it must be centered symmetrically at $\theta = 22.5°$ for the original case $\alpha = 1$, $\beta = \sqrt{2}$, since we know that $w = 1$ for $\theta = 0°$ and 45° [a formal proof can easily be obtained by substituting for $\alpha$ and $\beta$ in Eq. (129)]. This itself is a remarkable result, since it shows an interesting symmetry between the cases of lines near to 0° and 45° (see below). In fact our a priori arguments led only to Figure 58a and b and certainly did not predict such a symmetry.
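The formal proof just mentioned is a one-line substitution. Writing the amplitude and phase of Eq. (127) as $\gamma$ and $\varphi$, and using the half-angle value $\tan 22.5° = \sqrt{2} - 1$, Eq. (129) with $\alpha = 1$, $\beta = \sqrt{2}$ gives:

```latex
\tan\varphi \;=\; \frac{\beta - \alpha}{\alpha}
\;=\; \frac{\sqrt{2} - 1}{1}
\;=\; \sqrt{2} - 1
\;=\; \tan 22.5^{\circ}
\qquad\Longrightarrow\qquad \varphi = 22.5^{\circ}
```

so the cosine of Eq. (127) is indeed centered midway between the two limiting orientations 0° and 45°.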
We next calculate the mean value of $w$:

$$\bar{w} = \int_0^{\pi/4} [\alpha\cos\theta + (\beta - \alpha)\sin\theta]\,d\theta \bigg/ \int_0^{\pi/4} d\theta$$
$$= \frac{4}{\pi}\big[\alpha\sin\theta - (\beta - \alpha)\cos\theta\big]_0^{\pi/4}$$
$$= \frac{4}{\pi}\big[\alpha/\sqrt{2} - (\beta - \alpha)/\sqrt{2} + (\beta - \alpha)\big]$$
$$= \frac{2\sqrt{2}(\sqrt{2} - 1)}{\pi}\,(\sqrt{2}\,\alpha + \beta) \qquad (130)$$

Clearly we have to adjust $\sqrt{2}\,\alpha + \beta$ to make $\bar{w}$ equal to unity, but we also have to adjust the relative values of $\alpha$ and $\beta$ to minimize errors. (The reason it is necessary to do this when only $\sqrt{2}\,\alpha + \beta$ appears to matter is that we have to attempt to minimize the deviation in $w$ that can occur in any specific practical instance, i.e., for a specific value of $\theta$.) Proffitt and Rosen do this by adjusting the relative values so that the standard deviation of the $w(\theta) - 1$ distribution is minimized (Proffitt and Rosen, 1979). However, we proceed differently. We note that our starting values of $\alpha$ and $\beta$ make $w$ symmetric. Furthermore, adjusting $\sqrt{2}\,\alpha + \beta$ to make $\bar{w} = 1$ cannot alter the lateral placing ($\varphi$) of the function if $\alpha$, $\beta$ are maintained in the same ratio [see Eq. (129)]. Hence it cannot alter the symmetry. On the other hand adjusting the relative values of $\alpha$ and $\beta$ will destroy the symmetry. Now it is easy to see that the symmetrical placing of $w$ minimizes the maximum error, the mean square error, and a number of other possible error measures. Hence it is clear that the relative values of $\alpha$ and $\beta$ must remain unchanged. We assert that this was not obvious a priori, but it confirms and puts a new gloss on previous work.

Since we have now deduced that $\beta = \sqrt{2}\,\alpha$, Eq. (130) and the condition $\bar{w} = 1$ combine to give

$$\alpha = \frac{\pi}{8(\sqrt{2} - 1)} = 0.948 \qquad (131)$$

$$\beta = \frac{\sqrt{2}\,\pi}{8(\sqrt{2} - 1)} = 1.341 \qquad (132)$$
(Note that various other approximate versions of these values appear in the literature, several of them presumably having been produced by rounding or typographical errors.)
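The calibration can be confirmed numerically: with the values of Eqs. (131) and (132), the mean of $w(\theta)$ over 0°–45° is unity and $w$ remains symmetric about 22.5° (a quick check of our own, not part of the original analysis):

```python
import math

alpha = math.pi / (8.0 * (math.sqrt(2.0) - 1.0))   # Eq. (131): 0.948...
beta = math.sqrt(2.0) * alpha                      # Eq. (132): 1.341...

def w(theta):
    """Ratio L_G / L for a straight line at orientation theta (Eq. 126)."""
    return alpha * math.cos(theta) + (beta - alpha) * math.sin(theta)

# Mean of w over [0, 45 deg] -- the calibration condition makes this unity.
n = 100000
mean_w = sum(w(i * (math.pi / 4) / n) for i in range(n)) / n
# Symmetry about 22.5 deg: w(22.5 - t) equals w(22.5 + t) for any t,
# since keeping beta/alpha = sqrt(2) leaves the phase of Eq. (127) fixed.
```

This also makes it easy to see the residual deviation in any specific instance: $w(22.5°) = \gamma \approx 1.028$, the worst-case overestimate that no choice of overall scale can remove.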
B. Discussion

The above proof is based on the symmetry of the function $w(\theta)$. However, no reason has been given explaining physically why this symmetry occurs. Take the case of a straight, almost horizontal line (Fig. 60a). In this case the step contributes $\sqrt{2}$ to $L_F$, whereas ideally it would only contribute about 1—a clear overestimate. This interchange of the values 1 and $\sqrt{2}$ suggests that for a line near to 45° a horizontal step will contribute 1 to $L_F$ when ideally it would contribute $\sqrt{2}$—thereby leading to an underestimate. However careful consideration (Fig. 60b) shows that this argument is fallacious, since the amount ideally contributed by the horizontal step is $1/\sqrt{2}$—so in fact we get an overestimate by the same factor as before. Thus the symmetry between the two limiting cases is quite subtle. The true situation is that in both cases, the amount contributed by the step should be $1/\sqrt{2}$ ($= \cos 45°$) of the amount actually contributed: it is only the resolved component of the step distance along the general direction of the line that should actually count.

This section has studied the Kulpa boundary distance measure with a view to obtaining a better understanding of the mechanisms underlying choice of boundary distance calibration parameters. It is found that an interesting symmetry exists between the two limiting orientations, and that this explains why the parameters $\alpha$ and $\beta$ should be exactly in the ratio $1 : \sqrt{2}$. Further insight may be obtained by referring to the papers by Dorst and Smeulders (1987), Beckers and Smeulders (1989), and Koplowitz and Bruckstein (1989).
Figure 60. Special cases of straight lines with orientations close to 0° and 45°. (a) The special case of a nearly horizontal line; (b) the special case of a nearly diagonal line. In (b) note that the projection of the step along the general direction of the line is $1/\sqrt{2}$ of the length of the step, so taking the step as contributing a length of 1 pixel gives an overestimate by this factor. From Davies (1991b).
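The two measures of Eqs. (118) and (119) are trivial to compute from a Freeman chain code, and a short experiment shows the effect of the calibration. The sketch below (an illustrative helper of our own, not from the original papers) traces a digital line of slope 1/2 and compares the Freeman and Kulpa estimates with the true end-to-end Euclidean length:

```python
import math

def chain_code_lengths(codes):
    """Freeman (Eq. 118) and Kulpa (Eq. 119) boundary length estimates from
    an 8-direction Freeman chain code: even codes are axial moves (length 1),
    odd codes are diagonal moves (length sqrt(2))."""
    ne = sum(1 for c in codes if c % 2 == 0)
    no = len(codes) - ne
    return ne + math.sqrt(2.0) * no, 0.948 * ne + 1.341 * no

# A digital straight line of slope 1/2: alternating axial (0) and
# diagonal (1) moves -- 50 of each, ending at (100, 50).
codes = [0, 1] * 50
L_F, L_K = chain_code_lengths(codes)
true_len = math.hypot(100.0, 50.0)   # 111.80...
# L_F = 120.71 overestimates by about 8%; L_K = 114.45 is markedly closer.
```

Even at an orientation near the worst case of 22.5°, the calibrated measure substantially reduces the systematic overestimate, as the analysis above predicts.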
XII. Concluding Remarks

This article has attempted to provide an understanding of the edge shifts that arise when certain types of filter are applied to digital images. Initial calculations and experiments related to median filters, but it was soon shown that the shifts are not avoided by applying alternative types of filter such as mean and mode filters. Indeed, the amount of shift appeared very similar in all three cases. In retrospect this is not too surprising, as each of these filters represents an averaging process that seems bound to produce a shift of the same approximate magnitude.

However, it is possible to design filters that largely eliminate this problem, and among these is the hybrid median type of filter (though careful tests show that this type of filter reduces the shift only by a factor of around four, and does not eliminate it completely). Another filter that exhibited considerably reduced levels of shift distortion was a specially trained artificial neural network filter employing multilayer perceptrons (Greenhill and Davies, 1994): this showed especially good performance in inhibiting the chopping or filling in of corners (dark corners are better described as ‘‘chopped,’’ whereas light corners are better described as ‘‘filled in’’), though it was also good at preventing noise from causing bumps in boundaries. In fact, this type of filter was found to be susceptible to distortions in the training images, a factor that might affect the generality of this otherwise powerful approach to filter design: on the other hand, its capability for solving the problem at some level provides a useful existence theorem that satisfactory filters must exist, and indicates that more conventional filters could be designed with the right properties. One indication of this is provided by Davies (1992b), which showed the design and properties of a filter that is able to avoid edge bias in the vicinity of noise impulses.
Although edge shifts are generally disadvantageous, they are turned to good advantage in general rank-order filters and morphological filters, where they are used for processing shapes to create other shapes and in particular to filter objects by size, shape, and other detailed characteristics. Such filters can be made sufficiently general to be able to cope with a great variety of intensity profiles, so the filtering action has to be regarded as not merely binary but also gray scale and even color processing. In this article, space has not permitted color to be discussed; for similar reasons morphological filters have been restricted to what can be achieved with rank-order filters. The latter form a scale on which each individual filter is characterized by the rank-order parameter $r$, and the shifts for the whole range of rank-order filters for any neighborhood of $n$ pixels form an orderly progression from $-a$ to $+a$, where $a$ is the radius of the neighborhood. In
general, the shifts depend more on $r$ than on the intrinsic neighborhood shift, but for the median filter, the rank-order shift is identically zero, so the relatively small intrinsic neighborhood shift is readily observed.

Several attempts have been made to calculate and measure the intrinsic median shifts. The theory was first developed on a continuum approximation, i.e., suppressing any knowledge that the image lattice is discrete in nature. However, this led to difficulties in obtaining exact agreement with experiment, so ultimately a discrete theory of median shifts had to be devised. This not only demonstrated highly accurate agreement with experimental measurements of shift, but also showed that the shifts produced by median filters are very far from isotropic. However, it is possible that this is an overly harsh judgment, as rank-order filters give much larger shifts, and the anisotropy of the median shifts is small compared with the large shifts of these other types of filter.

Although excellent agreement has been obtained for median filters, mean filters lead to blurring, which largely masks the shifts, and no attempt has been made to derive a discrete model in this case. [Note, however, an interesting discrete calculation and experimental results for noise-induced edge shifts and edge orientation estimation for Sobel-like edge detectors that employ integral mean filtering (Davies, 1987).] The same situation applies for mode filters, though for general rank-order filters some attempts have been made to envisage the discrete shifts that exist. However, the fact that rank-order shifts are generally large means that there is little need to refine the continuum approach and create a detailed discrete model in that case.

One further factor has been found to be of great importance when calculating edge shifts: the intensity profile of the edges being investigated.
Binary edges constitute a nice concept, but in real gray-scale images, the edge is bound to be gradual and to occur over a distance of about a pixel. It proved possible to measure this effect for both median and mode filters, and in all cases examined nominal step edges appeared to have widths of about 1.45 pixels. At the other extreme from step edges lie linear slant edges. However, in the case of mode filters the curvature of the edge profile became important, and the most important parts of the characteristic were the edge plateaus. In fact, it appears that different types of filter seek out different parts of the intensity profile and act on it in different ways. This explains the detailed differences in edge shift that arise for mean, median, and mode filters. In particular, note that

1. Mean filters blur images and optimally suppress Gaussian noise.
2. Median filters do not blur images but are excellent at suppressing impulse noise.
3. Mode filters sharpen up images and are quite good at suppressing impulse noise.

In both the latter cases, note that the words ‘‘small irrelevant signals’’ could be used to replace ‘‘impulse noise,’’ thereby emphasizing the underlying (signal-oriented) characteristics of these filters.

So many different edge (intensity) profiles and so many shapes of edge boundary are possible that it is difficult to provide a full account of all the edge shifts that may arise in practice. Suffice it to say that the step edge and linear slant edge profiles provide useful extreme cases, whereas the circular edge boundary assumed consistently throughout the article represents a ‘‘worst case’’ scenario, i.e., one leading to the largest shifts.

If the edge shifts are a nuisance rather than an advantage, there are three possible courses of action: (1) employ an alternative filter that minimizes or eliminates the effect; (2) do not apply any filter at all; (3) estimate the extent of the shift and allow for it in any subsequent measurements. In this article, we feel that the last approach is generally preferable, and to this end we have provided the clearest guidance that is currently available on the magnitude of the shifts that can arise in a number of important cases. Table 1 lists these cases and indicates where in the article each is discussed. It is hoped that the analysis of the situation provided in this article will prove of some value to those who are working with filters in the area of image measurement.

Finally, some problems arose in trying to relate the shifts that arise for continua and discrete lattices of pixels. Such problems are omnipresent in image analysis and make themselves evident in a variety of ways. Another major example of this is in the estimation of boundary length for what was originally an analogue picture and then became a digital image.
The work of Kulpa and others in this area has been outlined in Section XI, and leads to the idea that a digital image with square tessellation will systematically overestimate length by a factor 1.055, so multiplication of boundary distance by the factor 0.948 is necessary to compensate for this. Related topics include the design of fiducial marks to permit maximum accuracy of location measurement (Bruckstein et al., 1998), and the partitioning of digital curves into maximal straight line segments (Lindenbaum and Bruckstein, 1993). For a recent tutorial review of the problems of achieving accuracy and robustness in low-level vision, see Davies (2000c).
Acknowledgments The author is grateful to Derek Charles for help in measuring edge shifts in large neighborhoods and for small circles (Sections IV.I and K; VI.D). In
addition, he would like to credit the following sources for permission to reproduce tables, figures, and extracts of text from his earlier publications: Academic Press for permission to reprint portions of Chapters 3 and 13 of the following book as text in Sections III and X; and as Figure 56: Davies (1997a). Elsevier Science for permission to reprint portions of the following paper as text in Section III; as Table 2; and as Figures 2, 3, and 5–13: Davies (1989). EURASIP for permission to reprint portions of the following paper as text in Section IV; and as Figure 23: Davies (1998). The IEE for permission to reprint portions of the following papers as text in Sections IV, V, VI, IX, and XI; as Table 5; and as Figures 14, 16–21, 35, 52–54, and 58–60: Davies (1991a,b, 1997b, 1999b, 2000b). Professional Engineering Publishing Ltd. and the Royal Photographic Society for permission to reprint portions of the following paper as text in Section VII; as Table 6; and as Figures 41–49: Davies (2000d). Springer-Verlag (Heidelberg) for permission to reprint portions of the following paper as text in Section X; and as Figures 55 and 57: Davies (1988b).
References

Bangham, J. A., and Marshall, S. (1998). Image and signal processing with mathematical morphology. IEE Electron. Commun. Eng. J. 10(3), 117–128.
Beckers, A. L. D., and Smeulders, A. W. M. (1989). A comment on "A note on 'Distance transformations in digital images'". Comput. Vision Graph. Image Process. 47, 89–91.
Bovik, A. C., Huang, T. S., and Munson, D. C. (1983). A generalization of median filtering using linear combinations of order statistics. IEEE Trans. Acoustics, Speech Signal Process. 31(6), 1342–1349.
Bovik, A. C., Huang, T. S., and Munson, D. C. (1987). The effect of median filtering on edge estimation and detection. IEEE Trans. Pattern Anal. Mach. Intell. 9(2), 181–194.
Bruckstein, A. M., O'Gorman, L., and Orlitsky, A. (1998). Design of shapes for precise image registration. IEEE Trans. Inform. Theory 44(7), 3156–3162.
Coleman, G. B., and Andrews, H. C. (1979). Image segmentation by clustering. Proc. IEEE 67, 773–785.
Davies, E. R. (1984). Circularity—a new principle underlying the design of accurate edge orientation operators. Image Vision Comput. 2, 134–142.
Davies, E. R. (1987). The effect of noise on edge orientation computations. Pattern Recogn. Lett. 6(5), 315–322.
Davies, E. R. (1988a). On the noise suppression and image enhancement characteristics of the median, truncated median and mode filters. Pattern Recogn. Lett. 7(2), 87–97.
Davies, E. R. (1988b). Median-based methods of corner detection. In Proceedings of the 4th BPRA International Conference on Pattern Recognition, Cambridge (28–30 March), edited by J. Kittler, Lecture Notes in Computer Science. Berlin: Springer-Verlag, Vol. 301, pp. 360–369.
Davies, E. R. (1989). Edge location shifts produced by median filters: Theoretical bounds and experimental results. Signal Process. 16(2), 83–96.
Davies, E. R. (1991a). Median and mean filters produce similar shifts on curved boundaries. Electron. Lett. 27(10), 826–828.
Davies, E. R. (1991b). Insight into operation of Kulpa boundary distance measure. Electron. Lett. 27(13), 1178–1180.
Davies, E. R. (1992a). Simple fast median filtering algorithm, with application to corner detection. Electron. Lett. 28(2), 199–201.
Davies, E. R. (1992b). Accurate filter for removing impulse noise from one- or two-dimensional signals. IEE Proc. E 139(2), 111–116.
Davies, E. R. (1992c). Simple two-stage method for the accurate location of Hough transform peaks. IEE Proc. E 139(3), 242–248.
Davies, E. R. (1993). Electronics, Noise and Signal Recovery. London: Academic Press.
Davies, E. R. (1997a). Machine Vision: Theory, Algorithms, Practicalities. 2nd ed. London: Academic Press.
Davies, E. R. (1997b). Shifts produced by mode filters on curved intensity contours. Electron. Lett. 33(5), 381–382.
Davies, E. R. (1998). From continuum model to a detailed discrete theory of median shifts. Proc. EUSIPCO'98, Rhodes, Greece, 8–11 Sept., pp. 805–808.
Davies, E. R. (1999a). High precision discrete model of median shifts. Proc. 7th IEE Int. Conf. Image Process. Appl., Manchester (13–15 July), IEE Conf. Publication No. 465, pp. 197–201.
Davies, E. R. (1999b). Image distortions produced by mean, median and mode filters. IEE Proc. Vision Image Signal Process. 146(5), 279–285.
Davies, E. R. (2000a). Image Processing for the Food Industry. Singapore: World Scientific.
Davies, E. R. (2000b). Resolution of problem with use of closing for texture segmentation. Electron. Lett. 36(20), 1694–1696.
Davies, E. R. (2000c). Low-level vision requirements. Electron. Commun. Eng. J. 12(5), 197–210.
Davies, E. R. (2000d). A generalized model of the geometric distortions produced by rank-order filters. Imaging Sci. J. 48(3), 121–130.
Davies, E. R., Bateman, M., Chambers, J., and Ridgway, C. (1998). Hybrid non-linear filters for locating speckled contaminants in grain. IEE Digest No. 1998/284, Colloquium on Non-Linear Signal and Image Processing, IEE (22 May), pp. 12/1–5.
Dorst, L., and Smeulders, A. W. M. (1987). Length estimators for digitized contours. Comput. Vision Graph. Image Process. 40, 311–333.
Evans, A. N., and Nixon, M. S. (1995). Mode filtering to reduce ultrasound speckle for feature extraction. IEE Proc. Vision Image Signal Process. 142(2), 87–94.
Fitch, J. P., Coyle, E. J., and Gallagher, N. C. (1985). Root properties and convergence rates of median filters. IEEE Trans. Acoust. Speech Signal Process. 33, 230–239.
Freeman, H. (1970). Boundary encoding and processing. In Picture Processing and Psychopictorics, edited by B. S. Lipkin and A. Rosenfeld, New York: Academic Press, pp. 241–266.
Gallagher, N. C., and Wise, G. L. (1981). A theoretical analysis of the properties of median filters. IEEE Trans. Acoust. Speech Signal Process. 29, 1136–1141.
Goetcherian, V. (1980). From binary to grey tone image processing using fuzzy logic concepts. Pattern Recogn. 12, 7–15.
Greenhill, D., and Davies, E. R. (1994). Relative effectiveness of neural networks for image noise suppression. In Pattern Recognition in Practice IV, edited by E. S. Gelsema and L. N. Kanal, Amsterdam: Elsevier Science B. V., pp. 367–378.
Griffin, L. D. (1997). Scale-imprecision space. Image Vision Comput. 15(5), 369–398.
Griffin, L. D. (2000). Mean, median and mode filtering of images. Proc. R. Soc. 456(2004), 2995–3004.
Haralick, R. M., and Shapiro, L. G. (1992). Computer and Robot Vision, Vol. 1. Reading, MA: Addison Wesley.
Haralick, R. M., Sternberg, S. R., and Zhuang, X. (1987). Image analysis using mathematical morphology. IEEE Trans. Pattern Anal. Mach. Intell. 9(4), 532–550.
Harvey, N. R., and Marshall, S. (1995). Rank-order morphological filters: A new class of filters. Proc. IEEE Workshop on Nonlinear Signal and Image Processing, Halkidiki, Greece, June, pp. 975–978.
Heinonen, P., and Neuvo, Y. (1987). FIR-median hybrid filters. IEEE Trans. Acoust. Speech Signal Process. 35, 832–838.
Hodgson, R. M., Bailey, D. G., Naylor, M. J., Ng, A. L. M., and McNeill, S. J. (1985). Properties, implementations and applications of rank filters. Image Vision Comput. 3(1), 4–14.
Kitchen, L., and Rosenfeld, A. (1982). Gray-level corner detection. Pattern Recogn. Lett. 1, 95–102.
Koplowitz, J., and Bruckstein, A. M. (1989). Design of perimeter estimators for digitized planar shapes. IEEE Trans. Pattern Anal. Mach. Intell. 11(6), 611–622.
Kulpa, Z. (1977). Area and perimeter measurement of blobs in discrete binary pictures. Comput. Graph. Image Process. 6, 434–451.
Laws, K. I. (1979). Texture energy measures. Proc. Image Understanding Workshop, Nov., pp. 47–51.
Lindenbaum, M., and Bruckstein, A. M. (1993). On recursive, O(N) partitioning of a digitized curve into digital straight segments. IEEE Trans. Pattern Anal. Mach. Intell. 15(9), 949–953.
Nagel, H.-H. (1983). Displacement vectors derived from second-order intensity variations in image sequences. Comput. Vision Graph. Image Process. 21, 85–117.
Nieminen, A., Heinonen, P., and Neuvo, Y. (1987). A new class of detail-preserving filters for image processing. IEEE Trans. Pattern Anal. Mach. Intell. 9(1), 74–90.
Paler, K., Föglein, J., Illingworth, J., and Kittler, J. (1984). Local ordered grey levels as an aid to corner detection. Pattern Recogn. 17, 535–543.
Proffitt, D., and Rosen, D. (1979). Metrication errors and coding efficiency of chain-encoding schemes for the representation of lines and edges. Comput. Graph. Image Process. 10, 318–332.
Wang, C., Sun, H., Yada, S., and Rosenfeld, A. (1983). Some experiments in relaxation image matching using corner features. Pattern Recogn. 16, 167–182.
Yang, G. J., and Huang, T. S. (1981). The effect of median filtering on edge location estimation. Comput. Graph. Image Process. 15, 224–245.
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 126
Two-Photon Excitation Microscopy

ALBERTO DIASPRO¹ AND GIUSEPPE CHIRICO²

¹LAMBS-INFM and Department of Physics, University of Genoa, 16146 Genova, Italy
²LAMBS-INFM and Department of Physics, University of Milano Bicocca, 20126 Milano, Italy
I. Introduction 195
II. Historical Notes 198
III. Basic Principles of Two-Photon Excitation of Fluorescent Molecules 202
IV. Behavior of Fluorescent Molecules under TPE Regime 212
V. Optical Consequences and Resolution Aspects 219
VI. Architecture of Two-Photon Microscopy 225
   A. General Considerations 225
   B. Laser Sources 235
   C. Lens Objectives 242
   D. Example of the Practical Realization of a TPE Microscope 244
VII. Application Gallery 257
VIII. Conclusions 273
References 276
"If I have seen further it is by standing on the shoulders of giants." (Isaac Newton in a letter to Robert Hooke, 5 February 1676)
I. Introduction

Copyright 2003 Elsevier Science (USA). All rights reserved. ISSN 1076-5670/03

Microscopes offer a key to pursuing the goal of opening nature, providing clues as in a secret garden. As recently noted by Colin Sheppard (2002), a microscope is an instrument that magnifies objects by means of a specific interaction—more commonly by means of lenses—so as to capture details invisible to the naked eye. Microscopes transmit information based on image formation, which renders visible previously hidden objects. A "primary" observer is then required to interpret the image (Rochow and Tucker, 1994). Since Hooke's ornate microscopes (Hooke, 1961) (Fig. 1) and van Leeuwenhoek's single lens magnifiers (Ford, 1991) (Fig. 2), the development of the optical microscope has undergone a secure and continuous evolution marked by relevant and revolutionary passages in the past 350 years. Inventions in microscopy, stimulated by the needs of scientists, and advances in technology have contributed to the evolution of the microscope in its very different modern forms (Beltrame et al., 1985; Fay et al., 1989;
Figure 1. Drawing of Hooke's microscope by Cock, 1665, from Micrographia (Hooke, 1961). Hooke did not make his own microscopes; they were made by the London instrument maker Christopher Cock, to whom Hooke gave much advice on design. In return, the success of Hooke's book made Cock a very famous microscope maker, and popularized the side-pillar design (see also http://www.utmem.edu/~thjones/hist).
Figure 2. Al Shinn's homemade replica of van Leeuwenhoek's microscope. Antony van Leeuwenhoek (1632–1723) was a microscopist and a microscope maker: he made more than 400 microscopes. Other information can be found at http://www.sirius.com/~alshinn. (Courtesy of Al Shinn.)
Benedetti, 1998; Amos, 2000). Despite the fact that all far-field light microscopes, including conventional, confocal, and two-photon microscopes, are limited in achievable diffraction-limited resolution (Abbe, 1910), light microscopy still occupies a unique niche. Its favorable position, especially for applications in medicine and biology, comes from the peculiar ability of the optical microscope to image living systems at relatively good spatial resolution. Well-established three-dimensional (3D) optical methods such as computational optical sectioning microscopy (Agard et al., 1989; Bianco and Diaspro, 1989; Diaspro et al., 1990; Carrington et al., 1995; Carrington, 2002) and confocal laser scanning microscopy (Brakenhoff et al., 1979; Sheppard and Wilson, 1980; Wilson and Sheppard, 1984; Carlsson et al., 1985; White et al., 1987; Brakenhoff et al., 1989; Wilson, 2002) have been widespread since the 1970s (Weinstein and Castleman, 1971; Shotton, 1993). To penetrate the delicate and complex relationship between structure and function, three-dimensional imaging is needed to overcome the major shortcoming of diffraction-limited resolution, which is in the range of 200 nm in the focal plane and 500 nm along the optical axis. For the past 10 years, confocal microscopes have proved to be extremely useful research tools, notably in the life sciences. This mature and powerful technique has now evolved to 3D (x–y–z) and 4D (x–y–z–t) analysis allowing researchers to probe even deeper into the intricate mechanisms of living systems (Cheng, 1994; Pawley, 1995; Masters, 1996; Sheppard and Shotton, 1997; Periasamy, 2001; Diaspro, 2002). Within this scenario, two-photon excitation (TPE) microscopy (Denk et al., 1990) is probably the most relevant advancement in fluorescence optical microscopy since the introduction of confocal imaging in the 1980s (Wilson and Sheppard, 1984; Pawley, 1995; Webb, 1996; Sheppard and Shotton, 1997; Diaspro, 2002).
It is worth noting that its fast and increasing spread has been strongly influenced by the availability of ultrafast pulsed lasers (Gratton and van de Vende, 1995; Svelto, 1998) as well as the advances in fluorescence microscopy that can also be ascribed to the availability of efficient and specific fluorophores (Haughland, 2002). Now, to place TPE microscopy in the framework of modern microscopy, consider that harm to a large portion of the specimen by fluorescence excitation is a very unfavorable condition affecting "classic" 3D optical schemes. Because of this experimental condition some potentially interesting biological experiments are defeated by photobleaching of the fluorescent label and phototoxicity. This fact applies in particular when there is the need for 3D imaging together with the use of ultraviolet excitable fluorochromes. The advent of two-photon excitation laser scanning microscopy mitigates these concerns, opening new perspectives to the application of microscopic techniques to the study of biological systems and related phenomena, and
providing further attractive advantages over classic fluorescence microscopy. In addition to its intrinsic three-dimensional ability, two-photon excitation microscopy is endowed with five other interesting capabilities. (1) TPE greatly reduces photointeractions and allows imaging of living specimens over long time periods. (2) TPE operates in a high-sensitivity, background-free acquisition scheme. (3) TPE microscopy can image turbid and thick specimens down to a depth of a few hundred micrometers. (4) TPE allows simultaneous excitation of different fluorescent molecules, reducing colocalization errors. (5) TPE can prime photochemical reactions within a subfemtoliter volume inside solutions, cells, and tissues. Moreover, the use of infrared (IR) radiation to excite ultraviolet (UV) or visible transitions allows better discrimination of the fluorescence signal from Rayleigh and Raman scattering, which again fall in the IR. So far, TPE fluorescence microscopy is not only revolutionary in its ability to provide optical sections, together with other practical advantages, but also in its elegance and effectiveness as applied to quantum physics (Loudon, 1983; Feynman, 1985; Shih et al., 1998).
Furthermore, this form of nonlinear microscopy also favors the development and application of other investigative techniques, such as three-photon excited fluorescence (Hell et al., 1996; Maiti et al., 1997), second-harmonic generation (Gannaway and Sheppard, 1978; Campagnola et al., 1999; Diaspro et al., 2002d; Zoumi et al., 2002), third-harmonic generation (Mueller et al., 1998; Squier et al., 1998), fluorescence correlation spectroscopy (Berland et al., 1995; Schwille et al., 1999, 2000), image correlation spectroscopy (Wiseman et al., 2000, 2002), lifetime imaging (Konig et al., 1996c; French et al., 1997; Sytsma et al., 1998; Straub and Hell, 1998), single-molecule detection schemes (Mertz et al., 1995; Xie and Lu, 1999; Sonnleitner et al., 1999; Chirico et al., 2001), photodynamic therapies (Bhawalkar et al., 1997), and others (Diaspro, 1998; White and Errington, 2000; Masters, 2002; Periasamy, 2001). For these and other reasons, TPE has become an important and relevant technique among biophysicists and biologists.
II. Historical Notes

In 1990 Denk and colleagues opened a new chapter in optical microscopy demonstrating the practical application of TPE to optical microscopy of biological systems (Denk et al., 1990). Notwithstanding this, the TPE story dates back to 1931 and its roots are in the theory originally developed by Maria Göppert-Mayer (1931) (Fig. 3) and later reprised by Axe (1964). The
Figure 3. Cover of the prestigious scientific journal Annalen der Physik and first page of the famous article published by Maria Göppert-Mayer (1931). (Image obtained by scanning from the Antonio Borsellino library collection, Department of Physics, University of Genoa.)
first page of her historical article from Göppert-Mayer's doctoral thesis, predicting the phenomenon of two-photon absorption, is shown in Figure 4. The keystone of TPE theory lies in the prediction that one atom or molecule can simultaneously absorb two photons in the very same quantum event, as originally sketched for the first time in Figure 5. Now, to understand the rarity of the event, consider that the adverb "simultaneously" here implies "within a temporal window of 10⁻¹⁶–10⁻¹⁵ s." As recalled by Denk and Svoboda (1997), in bright daylight a good one- or two-photon excitable fluorescent molecule absorbs a photon through one-photon interaction about once a second and a photon pair by two-photon simultaneous interaction every 10 million years. To increase the probability of the event a very high density of photons is needed, i.e., a very bright and efficient light source.
Figure 4. Photograph of Maria Göppert-Mayer biking with colleagues. (Reproduced with permission from AIP Emilio Segrè Visual Archives, http://www.aip.org/history/esva.)
Figure 5. Quantum physics two-photon absorption rules as originally reported by Maria Göppert-Mayer (1931). (Image obtained by scanning from the Antonio Borsellino library collection, Department of Physics, University of Genoa.)
In fact, it was only in the 1960s, after the development of the first laser sources (Wise, 1999; Svelto, 1998), that it became possible to find experimental evidence of Göppert-Mayer's prediction. Kaiser and Garrett (1961) reported two-photon excitation of fluorescence in CaF2:Eu2+ and Singh and Bradley (1964) were able to estimate the three-photon absorption cross section for naphthalene crystals. These two results complemented the related experimental achievement of Franken et al. (1961): second-harmonic generation in a quartz crystal using a ruby laser. Later, Rentzepis and colleagues (1970) observed three-photon excited fluorescence from organic dyes, and Hellwarth and Christensen (1974) collected second-harmonic generation signals from ZnSe polycrystals at a microscopic level. In 1976, Berns reported a probable two-photon effect as a result of focusing
Figure 6. First page of the revolutionary paper by Denk and colleagues on TPE microscopy of biological samples (Denk et al., 1990). (Image obtained by scanning from the Antonio Borsellino library collection, Department of Physics, University of Genoa.)
an intense pulsed laser beam onto chromosomes of living cells (Berns, 1976), and such interactions form the basis of modern nanosurgery (Konig, 2000). However, the original idea of generating 3D microscopic images by means of such nonlinear optical effects was first suggested and attempted in the 1970s by Sheppard, Kompfner, Gannaway, and Choudhury of the Oxford group (Sheppard et al., 1977; Gannaway and Sheppard, 1978; Sheppard and Kompfner, 1978). It was the Oxford group that realized that optical sectioning is possible because the excitation event is confined to the focal plane of the objective, the image intensity having a quadratic dependence on the illumination power (Wilson and Sheppard, 1984). It should be emphasized that for many years the application of two-photon absorption was mainly related to spectroscopic studies (Friedrich and McClain, 1980; Friedrich, 1982; Birge, 1986; Callis, 1997). The real "TPE boom" took place at the beginning of the 1990s at the W. W. Webb Laboratories (Cornell University, Ithaca, NY). In fact, as previously mentioned, it was the excellent and effective work done by Winfried Denk and colleagues (1990) that provided the major impetus for the spread of the technique and that revolutionized fluorescence microscopy imaging. Figure 6 reproduces the first page of the cited paper from Science that revolutionized the microscopic approach to the study of biological systems at the cellular and molecular level. The potential of two-photon excited fluorescence imaging in a scanning microscope was rapidly coupled with the availability of ultrafast pulsed lasers.
It was the development of mode-locked lasers, providing high peak power femtosecond pulses with a repetition rate around 100 MHz (Spence et al., 1991; Gosnell and Taylor, 1991; Gratton and van de Vende, 1995; Fisher et al., 1997; Wise, 1999), that made the fast dissemination of TPE laser scanning microscopy possible in practice and led to the flourishing of related techniques in a sort of avalanche effect (Hell, 1996; Diaspro, 1998, 1999a,b, 2002; Konig, 2000; Periasamy, 2001; Gratton et al., 2001). The technological advances that made two-photon excitation microscopy successful can be found in four continuously evolving areas, namely, the development of laser scanning microscopy, of ultrafast laser sources, of highly sensitive and fast acquisition devices, and of digital electronic tools (Shotton, 1993; Piston, 1999; Tan et al., 1999; Robinson, 2001).

III. Basic Principles of Two-Photon Excitation of Fluorescent Molecules

Fluorescence microscopy is a very popular contrast mechanism for imaging in biology since fluorescence is highly specific, either as exogenous labeling or endogenous autofluorescence (Periasamy, 2001). Fluorescent molecules
allow us to obtain both spatial and functional information through specific absorption, emission, lifetime, anisotropy, photodecay, diffusion, and other contrast mechanisms (Diaspro, 2002; Zoumi et al., 2002). This means that one can efficiently study, for example, the distribution of proteins, organelles, and DNA as well as ion concentration, voltage, and temperature within living cells (Chance, 1989; Tsien, 1998; Robinson, 2001). Two-photon excitation of fluorescent molecules is a nonlinear process related to the simultaneous absorption of two photons whose total energy equals the energy required for conventional, one-photon, excitation (Birks, 1970; Denk et al., 1995; Callis, 1997). In any case the energy required to prime fluorescence is the energy sufficient to produce a molecular transition to an excited electronic state. Conventional techniques for fluorescence excitation use UV or visible radiation, and excitation occurs when the absorbed photons are able to match the energy gap between the ground and the excited state. The excited fluorescent molecules then decay to an intermediate state, giving off a photon of light having an energy lower than the one needed to prime excitation. This means that the energy (E) provided by photons should equal the molecule's energy gap (Eg), and considering the relationship between photon energy (E) and radiation wavelength (λ) it follows that

Eg = E = hc/λ   (1)

where h = 6.6 × 10⁻³⁴ J s is Planck's constant and c = 3 × 10⁸ m s⁻¹ is the speed of light (in a vacuum, to a reasonable approximation). Due to energetic aspects, the fluorescence emission is shifted toward a wavelength longer than the one used for excitation. This shift typically ranges from 50 to 200 nm (Birks, 1970; Cantor and Schimmel, 1980). For example, a fluorescent molecule that absorbs one photon at 340 nm, in the ultraviolet region, exhibits fluorescence at 420 nm in the blue region, as sketched in Figure 7. In an almost classic three-dimensional fluorescence optical microscope such as the confocal one, the fluorescence process is such that the excitation photons are focused into a diffraction-limited spot scanned on the specimen (Wilson and Sheppard, 1984; Webb, 1996). The three-dimensional ability, i.e., the confocal effect, is obtained by confining both the illuminated focal region and the detected area of the emitted light (Sheppard, 2002; Wilson, 2002). So far, the light emitted from the specimen is imaged by the objective lens of the microscope into the image plane. Here a circular aperture (pinhole) is placed in front of a light detector, as depicted in Figure 8. This pinhole is responsible for rejection of the axial out-of-focus light and of the lateral overlapping diffraction patterns. This produces an improvement of spatial resolution of a factor 1.4 along each direction, resulting in a volume
Figure 7. Jablonski's fluorescence selection rules for one-photon excitation (UV or visible excitation at 340 nm, visible fluorescence emission at 420 nm). The fluorescent molecule is brought to an excited state and relaxes back by emitting fluorescence. (Courtesy of Ammasi Periasamy, W. M. Keck Center for Cellular Imaging, University of Virginia.)

Figure 8. Confocal basic setup. Fluorescence coming from the geometric focal plane (green) can reach the detector module unlike out-of-focus fluorescence above (red) and below (blue) the actual focal plane that is blocked by a pinhole. (Courtesy of Perkin Elmer.)
selectivity 2.7 times better than in the wide-field case (Brakenhoff et al., 1979; Wilson and Sheppard, 1984; Diaspro et al., 1999a; Jonkman and Stelzer, 2002; Torok and Sheppard, 2002). It is the physical suppression of the contributions from out-of-focus layers to image formation that produces the so-called optical sectioning effect. Unfortunately, a drawback is that during the excitation process of the fluorescent molecules the whole thickness of the specimen is harmed by every scan, within an hourglass-shaped region (Bianco and Diaspro, 1989). This means that even though out-of-focus fluorescence is not detected, it is generated, with the negative effect of potentially inducing the photobleaching and phototoxicity phenomena previously mentioned. The situation becomes particularly serious when there is the need for three-dimensional and temporal imaging coupled with the use of fluorochromes that require excitation in the ultraviolet regime (Stelzer et al., 1994; Denk, 1996). As earlier reported by Konig and colleagues (1996a), even using UVA (320–400 nm) photons may modify the activity of the biological system. DNA breaks, giant cell production, and cell death can be induced at radiant exposures of the order of magnitude of J/cm², accumulable during 10 scans with a 5-mW laser scanning beam at approximately 340 nm and a 50-ms pixel dwell time. In this context, two-photon excitation of fluorescent molecules provides an immediate practical advantage over confocal microscopy (Denk et al., 1990; Potter, 1996; Centonze and White, 1998; Gu and Sheppard, 1995; Piston, 1999; Squirrel et al., 1999; Diaspro and Robello, 2000; So et al., 2000; Wilson, 2002). In fact, reduced overall photobleaching and photodamage are generally acknowledged as major advantages of two-photon excitation in laser scanning microscopy of biological specimens (Brakenhoff et al., 1996; Denk and Svoboda, 1997; Patterson and Piston, 2000).
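The figures quoted above can be checked numerically. The short sketch below (illustrative only, using the rounded constants given in the text) evaluates the photon energies for the 340 nm/420 nm example via Eq. (1), and shows how a factor-1.4 resolution gain along each of the three axes compounds to roughly the quoted factor of 2.7 in volume selectivity:

```python
# Quick numerical checks, using the rounded constants from the text
# (h = 6.6e-34 J s, c = 3e8 m/s).
h = 6.6e-34   # Planck's constant, J s
c = 3.0e8     # speed of light, m/s

def photon_energy(wavelength_m):
    """Eq. (1): E = hc / lambda."""
    return h * c / wavelength_m

E_abs = photon_energy(340e-9)  # one-photon UV excitation, ~5.8e-19 J
E_em = photon_energy(420e-9)   # blue fluorescence emission, ~4.7e-19 J
assert E_em < E_abs            # Stokes shift: emitted photon carries less energy

# Confocal pinhole: a ~1.4x resolution improvement along each of the three
# axes compounds in volume, giving the "2.7 times better" selectivity quoted.
print(round(1.4 ** 3, 2))  # 2.74
```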
However, excitation intensity has to be kept low: normal operation means a regime under 10 mW of average power. When laser power is increased above 10 mW, further nonlinear effects may arise, evidenced by an abrupt rise in the signals (Hopt and Neher, 2001). Moreover, photothermal effects may be induced, especially when focusing on single-molecule detection schemes (Chirico et al., 2002). In TPE, two low-energy photons are involved in the interaction with absorbing molecules. The excitation process of a fluorescent molecule can take place only if two low-energy photons are able to interact simultaneously with the very same fluorophore. As mentioned in the introduction, the time scale for simultaneity is the time scale of molecular energy fluctuations at photon energy scales, as determined by the Heisenberg uncertainty principle, i.e., 10⁻¹⁶–10⁻¹⁵ s (Louisell, 1973). These two photons do not necessarily have to be identical, but their wavelengths, λ1 and λ2, have to be such that
λ1P ≅ (1/λ1 + 1/λ2)⁻¹   (2)

where λ1P is the wavelength needed to prime fluorescence emission in a conventional one-photon absorption process according to the energy request outlined in Eq. (1). This situation, compared to the conventional one-photon excitation process shown in Figure 7, is illustrated in Figure 9 using a Jablonski-like diagram. It is worth noting that for practical reasons the experimental choice is usually such that (Denk et al., 1990; Diaspro, 2001; Girkin and Wokosin, 2002)

λ1 = λ2 ≅ 2λ1P   (3)

and

Eg = 2hc/λ1P   (4)
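To make Eq. (2) concrete, the sketch below (illustrative; the function name is ours) combines two photon wavelengths into the equivalent one-photon excitation wavelength, since the photon energies add. The degenerate case of Eq. (3) shows two 680 nm IR photons acting like a single 340 nm UV photon:

```python
# Eq. (2) in executable form: two photons of wavelengths l1 and l2 deposit
# the same total energy as a single photon of wavelength (1/l1 + 1/l2)^-1.
def effective_one_photon_wavelength(l1_nm, l2_nm):
    return 1.0 / (1.0 / l1_nm + 1.0 / l2_nm)

# Degenerate case of Eq. (3): lambda_1 = lambda_2 = 2 * lambda_1P.
print(round(effective_one_photon_wavelength(680.0, 680.0), 1))  # 340.0

# Nondegenerate case: a 600 nm photon paired with an 800 nm photon.
print(round(effective_one_photon_wavelength(600.0, 800.0), 1))  # 342.9
```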
Considering this as a nonresonant process and assuming the existence of a virtual intermediate state, one should calculate the residence time, τvirt, in this intermediate state using the time–energy uncertainty consideration for TPE:
Eg τvirt ≅ ħ/2   (5)

where ħ = h/2π.

Figure 9. Jablonski's fluorescence selection rules for two-photon (IR excitation at 680 nm) and three-photon (IR excitation at 1020 nm) excitation, both with visible fluorescence emission at 420 nm. When the fluorescent molecule is brought to the excited state it relaxes emitting the same fluorescence as in the one-photon excitation case. (Courtesy of Ammasi Periasamy, W. M. Keck Center for Cellular Imaging, University of Virginia.)
It follows that

τvirt ≅ 10⁻¹⁵–10⁻¹⁶ s   (6)
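As an order-of-magnitude check of Eqs. (5) and (6), the sketch below (illustrative; the 340 nm gap is the example wavelength used earlier in this section) computes the virtual-state residence time from the time–energy uncertainty relation:

```python
import math

# Order-of-magnitude check of Eqs. (5)-(6): tau_virt ~ hbar / (2 * Eg),
# evaluated for a molecule whose one-photon gap corresponds to 340 nm.
h = 6.6e-34                  # Planck's constant, J s
c = 3.0e8                    # speed of light, m/s
hbar = h / (2.0 * math.pi)   # reduced Planck constant

Eg = h * c / 340e-9          # Eq. (1): energy gap for 340 nm excitation
tau_virt = hbar / (2.0 * Eg) # Eq. (5) rearranged for tau_virt
print(f"{tau_virt:.1e} s")   # ~9e-17 s, at the 1e-16 end of the quoted window
```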
This is the temporal window available to two photons to coincide in the virtual state. As will be more evident in the following sections, the TPE process requires high photon flux densities that can typically be obtained by tightly focusing a laser beam. So far, in a TPE process it is crucial to combine sharp spatial focusing with temporal confinement of the excitation beam. The process can be extended to n photons, requiring higher photon densities temporally and spatially confined (Fig. 9). Thus, near-infrared (about 680–1100 nm) photons can be used to excite UV and visible electronic transitions producing fluorescence. The typical photon flux densities are of the order of more than 10²⁴ photons cm⁻² s⁻¹, which implies intensities around MW–TW cm⁻² (Göppert-Mayer, 1931; Konig et al., 1996a). Such a high photon flux density can be achieved by focusing continuous near-infrared laser beams with high numerical aperture objectives (Hanninen and Hell, 1994; Konig et al., 1995). Liu and colleagues (1995) showed that cellular heating due to mW intensities at near-infrared wavelengths is of the order of 20 mK/mW (see also Konig and Tirlapur, 2002). However, the design and realization of ultrafast laser sources further improve the situation (Konig, 2000). Figure 10 shows the main factors in the application of such a phenomenon in microscopy, namely, high numerical aperture lenses and ultrafast infrared laser sources. A treatment in terms of quantum theory for the two-photon transition has been elegantly proposed by Nakamura (1999) using perturbation theory. He
Figure 10. Technical ingredients for two- and multiphoton excitation microscopy.
clearly describes the process by a time-dependent Schrödinger equation, where the Hamiltonian contains electric dipole interaction terms. Using a perturbative expansion, one finds that the first-order solution is related to one-photon excitation and higher-order solutions are related to n-photon ones (Faisal, 1987; Callis, 1997). It is worth noting that the dipole operator has odd parity; the one-photon transition moment thus requires that the initial and final states have opposite parity, whereas in the two-photon case the two states have the same parity (So et al., 2000). Now, let us discuss TPE on the basis of the following simple assumption: the probability of a molecule undergoing n-photon absorption is proportional to the probability of finding n photons within the volume it occupies at any moment in time (Louisell, 1973; Andrews, 1985). Alternatively, one may ask: what is the probability of finding two photons within the interval of time the molecule spends in a virtual state (Moscatelli, 1986)? Here we refer to the first case, discussed earlier by Andrews (1985): what is the probability $p_n$ that n photons are in the same molecular volume? We consider that all the molecules are endowed with a suitable set of energy levels such that all possible n-photon transitions are allowed. We now consider the relationship between the mean number of photons, M, at any time within a molecular volume and the intensity, I, of the laser beam, i.e., the energy per unit area per unit time. Consider a cube of side S through which the photons are delivered, within a beam whose width is much larger than S. The mean energy in this cubic box, for a given wavelength $\lambda$, is

$E_M = Mhc/\lambda$ (7)

Since the cross-sectional area is $S^2$ and the time needed for each photon to cross the box is $S/c$, the intensity is

$I = E_M/(S^2 \cdot S/c) = Mhc^2/(\lambda S^3)$ (8)

Recalling that $V = S^3 = V_m/N_a$, where the mean volume occupied by a molecule is the molar volume $V_m$ divided by Avogadro's constant, $N_a = 6.022 \times 10^{23}\ \mathrm{mol^{-1}}$, we have

$M = I\lambda V_m/(N_a h c^2)$ (9)
As an example, considering a wavelength of 780 nm delivered at peak intensities of the order of GW cm$^{-2}$ into a reasonable molecular volume of the order of $10^{-4}\ \mathrm{m^3\ mol^{-1}}$, the resulting value for M is of the order of $10^{-5}$. Using a Poisson distribution to determine $p_n$ (Louisell, 1973), we get

$p_n \simeq (M^n/n!)\,e^{-M}$ (10)
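Equations (9) and (10) can be checked with a short numerical sketch; the numerical values below are the order-of-magnitude figures quoted in the text (780 nm, a peak intensity of 1 GW cm$^{-2}$ is an assumed representative value):

```python
import math

# Physical constants (SI)
h = 6.626e-34    # Planck constant (J s)
c = 2.998e8      # speed of light (m/s)
Na = 6.022e23    # Avogadro's constant (mol^-1)

# Values quoted in the text
lam = 780e-9     # wavelength (m)
I = 1e9 * 1e4    # peak intensity, 1 GW/cm^2, converted to W/m^2
Vm = 1e-4        # molar volume (m^3/mol)

# Eq. (9): mean number of photons within one molecular volume
M = I * lam * Vm / (Na * h * c**2)

# Eq. (10): Poisson probability of finding n photons in that volume
def p(n):
    return M**n / math.factorial(n) * math.exp(-M)

print(M)     # ~2.2e-5, i.e., of the order of 10^-5 as stated
print(p(2))  # ~2.4e-10: simultaneous photon pairs are rare at this intensity
```

The smallness of $p_2$ at this already high intensity is what makes the tight spatial and temporal focusing discussed above indispensable.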
The resulting probability for TPE, n = 2, expanding the exponential in a Taylor series for small M and truncating at the first term (i.e., setting the exponential to unity), is

$p_2 = M^2/2 = \alpha I^2 \propto I^2$ (11)

where $\alpha$ is a proportionality factor (from Eq. (9), $\alpha = (\lambda V_m/N_a h c^2)^2/2$).
Here, the dependence of TPE on $I^2$ should be evident. Because TPE has a quadratic dependence on the instantaneous intensity of the excitation beam, we can introduce the molecular two-photon cross section $\sigma_2(\lambda)$, i.e., the propensity of the molecule to absorb, in a TPE event, photons of a certain energy or wavelength, and relate the fluorescence emission to the temporal characteristics of the light, I(t). Thus, the fluorescence intensity per molecule, $I_f(t)$, can be considered proportional to $\sigma_2(\lambda)$ and to $I(t)^2$:

$I_f(t) \propto \sigma_2 I(t)^2 \propto \sigma_2 \left[\dfrac{(NA)^2}{2\hbar c\lambda}\right]^2 P(t)^2$ (12)

where P(t) is the laser power and NA is the numerical aperture of the focusing objective lens. The last term of Eq. (12) accounts for the distribution in time and space of the photons, using the paraxial approximation in an ideal optical system (Born and Wolf, 1980). It follows that the time-averaged two-photon fluorescence intensity per molecule within an arbitrary time interval T, $\langle I_f(t)\rangle$, can be written as

$\langle I_f(t)\rangle = \dfrac{1}{T}\displaystyle\int_0^T I_f(t)\,dt \propto \sigma_2 \left[\dfrac{(NA)^2}{2\hbar c\lambda}\right]^2 \dfrac{1}{T}\displaystyle\int_0^T P(t)^2\,dt$ (13)

In the case of continuous wave (CW) laser excitation, where $P(t) = P_{ave}$, Eq. (13) becomes

$\langle I_{f,cw}(t)\rangle \propto \sigma_2 \left[\dfrac{(NA)^2}{2\hbar c\lambda}\right]^2 P_{ave}^2$ (14)

Now, because the present experimental situation for TPE is related to the use of ultrafast lasers, we consider a pulsed laser and take $T = f_p^{-1}$, where $f_p$ is the pulse repetition rate (Svelto, 1998). For a pulsed beam with pulse width $\tau_p$, repetition rate $f_p$, and average power

$P_{ave} = D\,P_{peak}$, with duty cycle $D = \tau_p f_p$ (15)

the approximated P(t) profile can be described as
$P(t) = \dfrac{P_{ave}}{\tau_p f_p}$ for $0 < t < \tau_p$, $\qquad P(t) = 0$ for $\tau_p < t < f_p^{-1}$ (16)

so that

$\langle I_{f,p}(t)\rangle = \dfrac{1}{T}\displaystyle\int_0^T \sigma_2 \left[\dfrac{(NA)^2}{2\hbar c\lambda}\right]^2 P(t)^2\,dt = \sigma_2 \left[\dfrac{(NA)^2}{2\hbar c\lambda}\right]^2 \dfrac{P_{ave}^2}{\tau_p f_p}$ (17)
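The relative efficiency of CW and pulsed excitation implied by Eqs. (14) and (17) can be checked numerically; the pulse parameters below are the typical figures used throughout the text (100-fs pulses at 100 MHz):

```python
import math

tau_p = 100e-15       # pulse width (s)
f_p = 100e6           # repetition rate (Hz)
duty = tau_p * f_p    # duty cycle D = tau_p * f_p = 1e-5

# Eq. (17) exceeds Eq. (14) by 1/(tau_p*f_p) at equal average power, so a CW
# beam needs 1/sqrt(tau_p*f_p) times more average power for equal efficiency.
gain = 1.0 / duty
power_ratio = 1.0 / math.sqrt(duty)

p_pulsed = 30e-3                 # 30 mW average power from the pulsed laser
p_cw = p_pulsed * power_ratio    # equivalent CW average power
print(power_ratio)  # ~316
print(p_cw)         # ~9.5 W, matching the ~10 W CW figure quoted below
```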
The conclusion here is that CW and pulsed lasers operate at the very same excitation efficiency, i.e., fluorescence intensity per molecule, if the average power of the CW laser is kept higher by a factor of $1/\sqrt{\tau_p f_p}$. This means that 10 W delivered by a CW laser, allowing the same efficiency as conventional (one-photon) excitation performed at approximately $10^{-1}$ mW, is nearly equivalent to 30 mW from a pulsed laser. This leads to the most popular relationship, reported below, which refers to the practical situation of a train of pulses focused through a high numerical aperture objective, with duration $\tau_p$ and repetition rate $f_p$. In this case, the probability, $n_a$, that a certain fluorophore simultaneously absorbs two photons during a single pulse, in the paraxial approximation, is given by (Denk et al., 1990)

$n_a \propto \dfrac{\sigma_2 P_{ave}^2}{\tau_p f_p^2} \left[\dfrac{(NA)^2}{2\hbar c\lambda}\right]^2$ (18)

where $P_{ave}$ is the time-averaged power of the beam and $\lambda$ is the excitation wavelength. Introducing 1 GM (Göppert-Mayer) $= 10^{-58}\ \mathrm{m^4\ s}$, for a $\sigma_2$ of approximately 10 GM per photon (Denk et al., 1990; Xu, 2002), focusing through an objective of NA = 1.2–1.4, an average incident laser power of 1–50 mW, operating at a wavelength ranging from 680 to 1100 nm with 100-fs pulse width and 100-MHz repetition rate, would saturate the fluorescence output as for one-photon excitation. This suggests that, for optimal fluorescence generation, the desirable repetition time of the pulses should be of the order of a typical excited-state lifetime, which is a few nanoseconds for commonly used fluorescent molecules. For this reason the typical repetition rate is around 100 MHz. A further condition for the validity of Eq. (18) is that the probability of each fluorophore being excited during a single pulse be smaller than one. The reason lies in the observation that during the pulse time ($10^{-13}$ s in duration, against a typical excited-state lifetime in the $10^{-9}$ s range) the molecule has insufficient time to relax to the ground state, which can be considered a prerequisite for the absorption of another photon pair. Therefore, whenever $n_a$ approaches
Figure 11. Pictorial (not to scale) representation of the typical time scales involved in two- and multiphoton excitation processes, namely, laser beam repetition rate (100 MHz), laser beam pulse width (100 fs), and fluorescence decay (ns).
unity, saturation effects start to occur. The use of Eq. (18) allows one to choose optical and laser parameters that maximize excitation efficiency without saturation. Figure 11 depicts (not to scale) the time-scale conditions typically used in practice. It is also evident that the optical parameter for enhancing the process in the focal plane is the lens numerical aperture, NA, even if the total fluorescence emitted is independent of this parameter, as shown by Xu (2002). This value was confined to around 1.3–1.4 as a maximum until the recent introduction of two new objectives by Olympus and Zeiss with numerical apertures of 1.65 and 1.45, respectively. Unfortunately, no information about their transmission properties in the UV–IR regime is available at this moment. One can now estimate $n_a$ for a common fluorescent molecule such as fluorescein, which possesses a two-photon cross section of 38 GM at 780 nm (So et al., 2001). For this purpose, we can use NA = 1.4, a repetition rate of 100 MHz, and a pulse width of 100 fs, with $P_{ave}$ assumed equal to 1, 10, 20, and 50 mW. Substituting the proper values into Eq. (18) we obtain

$n_a = 38 \times 10^{-58}\,\dfrac{P_{ave}^2}{100 \times 10^{-15}\,(100 \times 10^6)^2} \left[\dfrac{(1.4)^2}{2 \times 1.054 \times 10^{-34} \times 3 \times 10^8 \times 780 \times 10^{-9}}\right]^2 \simeq 5930\,P_{ave}^2$

The final results as a function of 1, 10, 20, and 50 mW are $5.93 \times 10^{-3}$, $5.93 \times 10^{-1}$, 2.37, and 14.8, respectively. It is evident that saturation begins to occur at 10 mW (Koester et al., 1999; So et al., 2001). The very same calculation can be made for rhodamine B by changing the cross-sectional value from 38 to 210 GM and considering 840 nm instead of
TABLE 1
Values of g(2) in Relation to Pulse Shape

Pulse shape                    g(2)
Rectangular                    1.0
Gaussian                       0.66
Hyperbolic-secant-squared      0.59
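The g(2) values of Table 1 can be reproduced by direct numerical integration. The sketch below assumes the convention $g^{(2)} = \tau_{\mathrm{FWHM}} \int P^2\,dt / (\int P\,dt)^2$ for unit-FWHM pulses, which recovers the tabulated values:

```python
import numpy as np

def g2(shape, tau=1.0):
    """Form factor g(2) = tau * int(P^2 dt) / (int(P dt))^2, FWHM = tau."""
    t = np.linspace(-20 * tau, 20 * tau, 400001)
    dt = t[1] - t[0]
    if shape == "rectangular":
        P = (np.abs(t) <= tau / 2).astype(float)
    elif shape == "gaussian":
        P = np.exp(-4 * np.log(2) * (t / tau) ** 2)
    elif shape == "sech2":  # hyperbolic-secant-squared
        P = 1.0 / np.cosh(2 * np.arccosh(np.sqrt(2)) * t / tau) ** 2
    return tau * np.sum(P ** 2) * dt / (np.sum(P) * dt) ** 2

for s in ("rectangular", "gaussian", "sech2"):
    print(s, round(g2(s), 2))   # 1.0, 0.66, 0.59 as in Table 1
```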
780 nm as the excitation wavelength. This leads to an $n_a$ approximately 4.76 times greater than in the case of fluorescein, which sets the saturation average power for rhodamine B around 5 mW instead of 10 mW. The related rate of photon emission per molecule, at a nonsaturating excitation level and in the absence of photobleaching (Patterson and Piston, 2000; So et al., 2001), is given by $n_a$ multiplied by the repetition rate of the pulses. This means approximately $5 \times 10^7$ photons s$^{-1}$ in both cases. It is worth noting that when considering the effective fluorescence emission one should include a further factor given by the so-called quantum efficiency of the fluorescent molecule. In the next section we report data for the fluorochrome action cross section, which combines the absorption cross section and the quantum efficiency. It has been demonstrated that the fluorophore emission spectrum is independent of the excitation mode (Xu et al., 1995; Xu, 2002). Thus, the quantum efficiency value is known from conventional one-photon excitation data (Pawley, 1995). Still referring to Eq. (18), one should also consider a further proportionality factor, g(2), related to the pulse shape of the laser beam (Svelto, 1998). Values of this form factor are reported in Table 1. All calculations here have been made considering a rectangular pulse shape. From Eq. (18) it should be clear that there are some key parameters involved in TPE that should be considered and controlled, namely, the cross section of the fluorescent molecule, the numerical aperture of the objective, and the excitation beam characteristics. The next section focuses on the behavior of fluorescent molecules and on the optical consequences of working under a TPE regime, before moving to considerations related to excitation beam characteristics, practical architectures, performance, and applications.
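The worked examples above (fluorescein versus rhodamine B) can be reproduced directly from Eq. (18), with the proportionality constant set to one as in the text's estimate; the small deviation from the quoted prefactor of about 5930 comes from rounding of the physical constants:

```python
GM = 1e-58          # 1 Goeppert-Mayer unit in m^4 s
hbar = 1.054e-34    # reduced Planck constant (J s)
c = 3e8             # speed of light (m/s)
NA = 1.4            # numerical aperture
tau_p, f_p = 100e-15, 100e6   # 100-fs pulses at 100 MHz

def n_a(sigma2_gm, lam, p_ave):
    """Eq. (18): two-photon absorption probability per molecule per pulse."""
    bracket = NA ** 2 / (2 * hbar * c * lam)
    return sigma2_gm * GM * p_ave ** 2 / (tau_p * f_p ** 2) * bracket ** 2

prefactor = n_a(38, 780e-9, 1.0)       # fluorescein, 38 GM at 780 nm
print(prefactor)                       # ~6.0e3 (text: ~5930)
print(n_a(38, 780e-9, 10e-3))          # ~0.6 at 10 mW: saturation sets in

# Rhodamine B: 210 GM at 840 nm
ratio = n_a(210, 840e-9, 1.0) / prefactor
print(ratio)                           # ~4.76, as stated in the text
```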
IV. Behavior of Fluorescent Molecules under TPE Regime

In TPE microscopy, several common fluorescent molecules can be used despite the fact that the quantum-mechanical selection rules are different
from those for the one-photon excitation condition (Loudon, 1983; Birge, 1979; Wang and Herman, 1996; Haughland, 2002; Xu, 2002). As a starting point, fluorescent molecules can be excited under a TPE regime at twice their one-photon excitation wavelength (Lakowicz, 1999). Figure 12 shows a simplified Jablonski diagram illustrating this rule of thumb. Because this extends to any fluorescent molecule, a variety of autofluorescent molecules can be effectively exploited without the need for ultraviolet excitation (Herman and Tanke, 1998; Lakowicz, 1999). Figure 13 represents the spectral distribution of the autofluorescence exhibited by some interesting biological molecules and macromolecules. To characterize fluorescent molecules in terms of their response to excitation, two specific parameters have to be calculated or measured (Harper, 2001; Berland, 2001): the molecular absorption cross section and the quantum efficiency. The former is related to the propensity of a molecule to absorb photons at a certain wavelength (Cantor and Schimmel, 1980). The latter is more directly connected to the fluorescence process and to the molecule itself: it is a measure of the yield in the conversion of absorbed energy into light emission. This last parameter is also known as the quantum yield and can be considered an indicator of the probability that a given excited molecule
Figure 12. Jablonski diagram illustrating one-photon (a) and two-photon (b) excitation and deexcitation pathways for a fluorescent molecule.
Figure 13. Autofluorescence spectral distribution of some interesting biological molecules. (See Color Insert.)
will produce a fluorescence photon. It is clear that both these parameters influence the detectable intensity of fluorescence from one or more specific fluorescent molecules. Moreover, their behavior is also influenced by environmental conditions, namely pH, temperature, etc. In general, the quantum yield of a fluorescent molecule excited conventionally, i.e., by one-photon excitation, is preserved in a two- or multiphoton excitation regime. Unfortunately, knowledge of the one-photon absorption cross sections does not permit any precise quantitative prediction of the two-photon ones. Table 2 reports measured data for both intrinsic and extrinsic fluorescent molecules, including green fluorescent protein (Tsien, 1998; Xu, 2002). This means that cross sections for TPE or higher orders of excitation have to be measured. However, the practical "rule of thumb" mentioned at the beginning of this section can be used, even if it does not guarantee optimal TPE conditions. This simple but effective rule works especially well with symmetrical molecules (Lakowicz, 1999). Figure 14 compares one- and two-photon performance for two common fluorescent molecules. It is also clear that TPE spectra exhibit a peculiar and interesting variety that allows more flexibility in excitation. This fact is depicted in Figure 15. The cross section parameter has been measured for a wide range of dyes (Xu et al., 1995; Albota et al., 1998b; Xu, 2002). It is worth noting that, also owing to the increasing dissemination of TPE microscopy, new "ad hoc" organic molecules endowed with large engineered two-photon absorption cross sections have recently been developed (Albota et al., 1998a). The TPE cross section not only indicates how well a specific fluorescent molecule is excited by light in the infrared spectral region but also gives the position of the two-photon absorption peak, which is normally unpredictable.
As can be seen from the cross-sectional data and graphs, there is a very interesting and useful variety of "relative peaks." The practical consequence is that, unlike in one-photon excitation, when using TPE one can often find a single "good wavelength" for exciting the fluorescence of several different molecules. The relevance of this fact is obvious, for example, with respect to colocalization problems. One can try to find an optimal excitation wavelength for simultaneously priming fluorescence of different fluorochromes. This means that a real and effective multiple fluorescence colocalization of biological molecules, macromolecules, organelles, etc. can be obtained. Figure 16 shows an example of multiple fluorescence realized by means of one- and two-photon excitation. Special mention is due to excitation of green fluorescent protein (GFP), an important molecular marker for gene expression (Chalfie et al., 1994; Chalfie and Kain, 1998; Potter, 1996; Tsien, 1998). GFP cross sections are around 6 GM (800 nm) and 7 GM (960 nm) in the case of wild type and
TABLE 2
Intrinsic and Extrinsic Fluorescent Molecules
[The table lists, for each fluorophore, the two-photon excitation wavelength λ (nm) and the measured two-photon cross section σ2 (GM). Intrinsic fluorophores: GFP wt, GFP S65T, BFP, CFP, YFP, EGFP, flavine, NADH, phycoerythrin. Extrinsic fluorophores: Bis-MSB, Bodipy, calcium green 1, calcofluor, cascade blue, coumarin 307, CY2, CY3, CY5, DAPI (free), dansyl, dansyl hydrazine, DiI, filipin, FITC, fluorescein (pH 11), Fura-2 (free and high Ca), Hoechst, Indo-1 (free and high Ca), Lucifer yellow, Nile red, Oregon green bapta 1, rhodamine B, rhodamine 123, Syto 13, Texas red, a triple probe (DAPI, FITC, and rhodamine), and TRITC (rhodamine). The column alignment of the individual values was lost in extraction.]
Figure 14. Comparison of one-photon (lines) and two-photon (solid circles) fluorescence excitation spectra. The abscissa reports excitation wavelengths in nanometers, which have to be scaled by a factor of two for one-photon excitation. The ordinate values represent the two-photon action cross section for Cascade blue in water (right) and the two-photon absorption cross section for fluorescein in water, pH 13 (left). Values are given in Göppert-Mayer units, GM (1 GM = $10^{-50}$ cm$^4$ s), and are reported on a logarithmic scale. It is worth noting that the "twice the wavelength" rule of thumb works almost perfectly for Cascade blue, whereas fluorescein exhibits a more complicated and interesting behavior as a function of wavelength. Nevertheless, fluorescein also respects the rule: in fact there is a relative maximum near twice the one-photon excitation peak (Xu et al., 1995; Xu, 2002).
S65T type, respectively. As a comparison, one should consider that the cross section for NADH, at its excitation maximum of 700 nm, is approximately 0.02 GM (So et al., 2000). The combination of GFP and TPE is one of the most promising scientific fields; unfortunately, it is too vast to be treated here. In discussing fluorescent molecules, another very important issue is TPE-induced photobleaching. Although the TPE scheme reduces overall photobleaching of the sample by limiting excitation to a small volume element, photobleaching within the excited area is not reduced (Brakenhoff et al., 1996). In fact, accelerated rates of photobleaching have been observed using TPE compared with conventional one-photon excitation (Patterson and Piston, 2000). Although two-photon excitation has several advantages for spectroscopic and imaging applications, very little is known about photobleaching and about similar effects on the stability of the molecules, especially when moving to single-molecule detection applications. The studies in the literature mostly refer to bulk measurements. Several definitions of bleaching can be given (Lakowicz, 1999), and we can envision two main sources. The molecule may convert from the excited state, usually with a radiative decay constant in the tens of nanoseconds range, to a second, metastable excited state with a vanishing radiative constant. Another possibility is that the molecule changes its structure in such a way that the molecular ground state assumes
Figure 15. Two-photon action cross sections for several common fluorescent molecules, namely, Cascade blue (CB), Lucifer yellow (LY), Bodipy (BD), DAPI free (DP), dansyl (DN), pyrene (PY), coumarin (CU) (above), Indo-1 calcium bound (IC), Indo-1 free (IF), Fura-2 calcium bound (FC), Fura-2 free (FF), calcium green calcium bound (CG), calcium orange calcium bound (CO), calcium crimson calcium bound (CC), and Fluo-3 calcium bound (F3). The solid circle marks the wavelength that is twice the optimal one-photon excitation wavelength. Axes as in Figure 14. More details about fluorochromes can be found at the Molecular Probes web site, www.probes.com (Haughland, 2002). (See Color Insert.)
a vanishing cross section for the excitation light. This modification may be induced by isomerization or thermal absorption. In both cases the molecular fluorescence emission drops to zero. Irreversible photobleaching and blinking are usually ascribed to the first type of transition. Mertz et al. (1995) compared one- and two-photon excitation with particular regard to saturation and higher-level transitions. More recently, Patterson and Piston (2000) provided data on bulk solutions of dyes indicating enhanced photobleaching in two-photon spectroscopy, probably due to a three-photon process.
Figure 16. Bovine pulmonary artery endothelial cells (F-147780, Molecular Probes) marked with three different fluorophores for mitochondria, F-actin, and DAPI. The image on the left shows mitochondria (red) and F-actin (green) labeling; a dark region appears in the center at the position of the nucleus because DAPI, which at one-photon excitation requires UV light, was not excited. On the right, a false-color picture obtained by means of TPE at 720 nm displays the nuclear portion (pink). Using TPE, simultaneous excitation of the three fluorophores occurred at 720 nm. (Image acquired at LAMBS.) (See Color Insert.)
We have very recently studied the effect of two-photon excitation on the total amount of fluorescence that can be collected from a single immobilized molecule at the high excitation intensities required for single-molecule studies (F. Federici, A. Gerbi, and A. Diaspro, unpublished data; Chirico et al., 2002). Four dyes were considered: Indo-1, rhodamine 6G, fluorescein, and pyrene. The choice of these dyes is also motivated by the different complexity of their chemical structures, which increases from pyrene to Indo-1. For these molecules we evaluated the total amount of fluorescence light that can be recovered from each dye versus the excitation intensity, the temperature, and the duration and nature of the excitation. The main result of this research is the characterization of the thermally induced bleaching of the dyes, and a clear correlation between the bleaching time, its dependence on the excitation intensity, and the photophysical parameters of the molecules (Chirico et al., 2002). These conclusions were also based on numerical simulations of the local temperature increase during laser excitation. Further studies of the features of fluorescent molecules, in particular single-molecule detection of both isolated and "in situ functioning" fluorescent molecules, are needed.
V. Optical Consequences and Resolution Aspects

A misconception about TPE microscopy is that optical resolution is enhanced. This is not true in terms of strict optical resolution, because as a first step toward TPE one must use wavelengths longer than in the conventional case. However, it is a common feeling among people involved in microscopy measurements that practical optical resolution is a mix of different parameters, including the signal-to-noise ratio (see also Section VI and Fig. 24). Thus, because background fluorescence is dramatically reduced in TPE, objects comparable to or smaller than the optical resolution attainable in conventional microscopy may appear brighter or better defined when using a doubled excitation wavelength. One should always remember that far-field TPE microscopy, as discussed in this article, is not a way to surpass the limit described by Abbe (1910). However, in microscopy one is also interested in obtaining complete spatial information about the sample, i.e., in performing three-dimensional imaging. Here, a very important optical consequence of TPE is the confinement of the spatial region where fluorescence takes place to a subfemtoliter volume. The practical consequence of this feature is that optical sectioning is an intrinsic ability of TPE microscopy. Within the one-photon excitation optical sectioning scheme depicted in Figure 17 (Agard, 1984; Bianco and Diaspro, 1989; Diaspro et al., 1990; Castleman, 1996), the observed image O at a plane j, where the focus of the lens is mechanically positioned and our main interest is concentrated, can be described by the following relationship, expressed for simplicity in the Fourier transform domain under the condition of a spatially invariant linear system (Castleman, 2002):

$O_j \cong I_j S_j + \displaystyle\sum_{k \neq j} I_k S_k + N$ (19)
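As a toy illustration of Eq. (19), here written in real space with the Fourier-domain products replaced by the convolutions they represent (all sizes and kernel widths are arbitrary choices for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
n, planes, j = 64, 5, 2          # j is the in-focus plane
I = rng.random((planes, n))      # true fluorescence distribution per plane
N = 0.01 * rng.random(n)         # additive noise term

def blur(signal, width):
    """Circular convolution with a normalized box kernel (a toy PSF S)."""
    k = np.zeros(n)
    k[:width] = 1.0 / width
    return np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(k)))

in_focus = blur(I[j], 3)                                        # I_j * S_j
defocus = sum(blur(I[k], 15) for k in range(planes) if k != j)  # sum over k
O_wide = in_focus + defocus + N   # wide-field image: all terms of Eq. (19)
O_tpe = in_focus + N              # TPE: the out-of-focus sum vanishes
```

The wide-field image carries the defocused contribution of every plane, while the TPE image retains only the in-focus term plus noise, which is the content of the statement that under TPE the second term does not exist.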
The first term properly accounts for the optical distortion S, given by the so-called point spread function of the microscope, acting on the true distribution of fluorescence intensity, I, at the plane j. The second term contains defocused contributions from the adjacent planes k above and below. In fact, fluorescent molecules in the adjacent planes are excited by the very same wavelength (see Section IV), even if more sparsely than those at the focal plane or volume (Jonkman and Stelzer, 2002). The third term, N, is the noise, considered additive to a reasonable approximation (Castleman, 1996; Agard, 1984). Now, the noise can easily be modeled or measured, as can S, the distortion introduced by the optical system, called the point spread function. In wide-field
Figure 17. Sketch of the optical sectioning scheme (a) obtained by exploiting the spatial confinement of TPE, depicted in the double-cone excitation geometry (b). Only fluorescent molecules positioned at the double-cone apex have a non-negligible probability of being excited under the TPE regime.
microscopy, after digital acquisition of the data, the system can be solved and the best estimate of I can be found. In confocal microscopy the situation is better, because the second term is dramatically reduced by the insertion of a pinhole and S is less disturbing (one would like to say that it tends toward the ideal unitary response, but this is not strictly true). Under TPE the second term does not exist at all, owing to the confinement properties of the excitation process. The 3D confinement of the two-photon excitation volume can be understood on the basis of optical diffraction theory (Born and Wolf, 1980). Using excitation light of wavelength $\lambda$, the intensity distribution in the focal region of an objective with numerical aperture NA $= \sin(\alpha)$ is described, in the paraxial regime, by (Born and Wolf, 1980; Sheppard and Gu, 1990)

$I(u,v) = \left| 2 \displaystyle\int_0^1 J_0(v\rho)\, e^{iu\rho^2/2}\, \rho\, d\rho \right|^2$ (20)

where $J_0$ is the zeroth-order Bessel function, $\rho$ is a radial coordinate in the pupil plane, and

$u = \dfrac{8\pi}{\lambda} \sin^2(\alpha/2)\, z, \qquad v = \dfrac{2\pi}{\lambda} \sin(\alpha)\, r$ (21)

are dimensionless axial and radial coordinates, respectively, normalized to the wavelength (Wilson and Sheppard, 1984). This implies that the fluorescence intensity distribution within the focal region behaves as I(u,v) for the one-photon case and as $I^2(u/2, v/2)$ for the TPE case, as shown earlier. The arguments of $I^2(u/2, v/2)$ take into account the fact that in the latter case one utilizes wavelengths approximately twice those used for one-photon excitation. These distributions are called the point spread functions (PSF) of the microscope (Born and Wolf, 1980; Jonkman and Stelzer, 2002; Castleman, 2002; Bertero and Boccacci, 2002). Compared with the one-photon PSF, the TPE PSF is axially confined (Nakamura, 1993; Gu and Sheppard, 1995; Jonkman and Stelzer, 2002). In fact, considering the integral over v at constant u, its behavior is constant along z for one-photon excitation, whereas it has a half-bell shape for TPE. This behavior, discussed in more detail in Wilson (2002), Torok and Sheppard (2002), and Jonkman and Stelzer (2002), explains the three-dimensional discrimination property of TPE. In general, two-photon microscopy has a radial resolution comparable with that of conventional one-photon microscopes, while a better signal-to-noise ratio and an effectively narrow depth of focus make it well suited for three-dimensional optical sectioning. Figure 18 shows the PSF shape for wide-field, confocal, and TPE conditions (Periasamy et al., 1999). Table 3 gives the calculated half-widths of the 3D intensity PSF in the transverse and axial directions (Gu and Sheppard, 1995). The comparison of the 3D intensity PSF for confocal one-photon and two-photon imaging reveals that the resolution in the two cases is almost the same (Jonkman and Stelzer, 2002; Torok and Sheppard, 2002). Now, the most interesting aspect, also predicted by Eq. (18) or Eq.
(20), is that the excitation power falls off as the square of the distance from the lens focal point, within the approximation of a conical illumination geometry. In practice this means that the quadratic relationship between excitation power and fluorescence intensity results in TPE falling off as the fourth power of the distance from the focal point of the objective.

Figure 18. Point spread function shapes for conventional digital deconvolution microscopy, confocal microscopy, and TPE microscopy, from left to right. (Courtesy of Ammasi Periasamy; modified from Periasamy et al., 1999.)

TABLE 3
Values of the Half-Width of the 3D Intensity Point Spread Function in the Transverse and Axial Directions, v1/2 and u1/2

         Conventional 1P   Confocal 1P   Conventional 2P   Confocal 2P
v1/2     1.62              1.17          2.34              1.34
u1/2     5.56              4.01          8.02              4.62

This implies that the point spread function, or the geometric resolution parameters, defines a sort of "volume of event" for TPE, as sketched in Figure 19a. Therefore, regions away from the focal volume of the objective lens, which is directly related to the numerical aperture of the objective itself, do not suffer photobleaching or phototoxicity effects and do not contribute to the signal detected when a TPE scheme is used. This situation is presented in Figure 19b. Because these regions are simply not involved in the excitation process, a confocal-like effect is obtained without the need for a confocal pinhole. Figure 20 shows the spatial extension of the fluorescence emission from a solution containing fluorescent molecules subjected to one- and two-photon excitation regimes. Consequently, photodamage and photobleaching effects are extremely localized, as demonstrated in Figure 21. Figure 22 shows a further demonstration of the three-dimensional localization attainable by means of TPE. Photobleaching was induced within a large fluorescent sphere (22 μm in diameter) using the confocal and the TPE modes. The latter not only exhibited the expected features but also pointed out the potential of the technique as an active photodevice, as will be seen in the following sections. In
Figure 19. (a) The PSF, or more generally the optical resolution parameters, can be used to determine the extent of the TPE "volume of event" (modified from Pawley, 1995). (b) In conventional excitation (left) all photons carry the "right energy" for priming fluorescence in any fluorescent molecule encountered within the double cone of excitation, whereas in TPE (right) only photons confined in the volume of event prime fluorescence, owing to their high temporal and spatial concentration. Under TPE, photons distributed at low density within the double cone of excitation cannot prime fluorescence. (See Color Insert.)
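The fourth-power falloff underlying the confinement sketched in Figure 19 follows from simple cone geometry; a minimal sketch (arbitrary units, with an illustrative cone half-angle):

```python
import numpy as np

P = 1.0        # total beam power (arbitrary units)
tan_a = 0.5    # tangent of the cone half-angle (illustrative value)
z = np.array([1.0, 2.0, 4.0])     # distances from the focal point

area = np.pi * (z * tan_a) ** 2   # beam cross section grows as z^2
I = P / area                      # excitation intensity falls as 1/z^2
F_tpe = I ** 2                    # TPE signal ~ I^2 falls as 1/z^4

print(I[1] / I[0])          # 0.25: doubling z quarters the intensity
print(F_tpe[1] / F_tpe[0])  # 0.0625: but the TPE signal drops 16-fold
```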
Figure 20. Fluorescence emission from a solution containing fluorescent molecules under one- (double cone, above) and two-photon (bright dot, below) excitation. (Picture courtesy of John Girkin from Bio-Rad web page.) (See Color Insert.)
Figure 21. Effect of TPE localization. Comparison between one- (above) and two-photon (below) induced photobleaching visualized along the z-axis in an x–z view. Scanning was performed in the volume defined by the rectangle within the double-cone excitation volume. (Courtesy of David Piston; adapted from Pawley, 1995.)
TPE, over 80% of the total fluorescence intensity comes from a 700- to 1000-nm-thick region about the focal point, for objectives with numerical apertures in the range 1.2–1.4 (Brakenhoff et al., 1979; Wilson and Sheppard, 1984; Wilson, 2002; Jonkman and Stelzer, 2001; Torok and
Figure 22. Three-dimensional side views (y–z plane cut) of a large fluorescent sphere (22 μm in diameter) in which photobleaching has been induced in a central x–y section in single-photon confocal (left) and TPE (right) mode, using 488 and 720 nm excitation wavelengths, respectively. (Adapted from Diaspro, 2001; image realized at LAMBS.)
Sheppard, 2002). This also implies a reduction in background that compensates for the reduction in spatial resolution due to the longer wavelength. The use of infrared wavelengths instead of UV-visible ones also allows deeper penetration than in the conventional case (So et al., 2000; Periasamy et al., 2002; Konig and Tirlapur, 2002). In fact, Rayleigh scattering produced by small particles is proportional to the inverse fourth power of the wavelength, so the longer wavelengths used in TPE, or in multiphoton excitation in general, are scattered less than the ultraviolet-visible wavelengths used for conventional excitation, and deeper targets within a thick sample can be reached. It is worth noting that, when considering deep imaging in thick samples, optical aberrations should be properly taken into account (de Grauw and Gerritsen, 2002; Centonze and White, 1998). For the fluorescence light on the way back, scattering can be mitigated by acquiring the emitted fluorescence with a large-area detector, collecting not only ballistic photons (Soeller and Cannell, 1999; Bueheler et al., 1999; Girkin and Wokosin, 2002). Because several factors influence whether a particular sample should be imaged with a confocal, multiphoton, or even wide-field camera imaging system, the highest-priced option, when buying or building a two-photon microscope, should not automatically be assumed to be the best for every biological imaging challenge.
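The inverse fourth-power dependence quoted above is easy to check numerically; a minimal sketch (the wavelengths are illustrative, not taken from a specific instrument):

```python
def relative_rayleigh_scattering(wavelength_nm, reference_nm):
    """Rayleigh scattering by small particles scales as 1/wavelength^4;
    return the scattering at wavelength_nm relative to reference_nm."""
    return (reference_nm / wavelength_nm) ** 4

# An 800 nm TPE beam versus 400 nm one-photon excitation:
ratio = relative_rayleigh_scattering(800, 400)  # 0.0625, i.e., 16x less scattering
```

Doubling the wavelength thus reduces Rayleigh scattering sixteenfold, which is the basis of the deeper penetration noted above.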
VI. Architecture of Two-Photon Microscopy

A. General Considerations

Two-photon microscopes and architectures are now commercially available, but they are very expensive. Table 4 presents an overview of market availability.
TABLE 4
Overview of Market Availability

| Model | Company | Dimension | Pulse Width Regime | Wavelength Range (nm) | Average Power (mW) | Laser Coupling | Acquisition | Other Features |
|---|---|---|---|---|---|---|---|---|
| LSM 510 NLO (META) | Zeiss | Compact/normal | fs | 700–900 | 50 | Direct-box/fiber | Descanned/nondescanned | Simultaneous confocal |
| MRC 1024 MP | Bio-Rad | Normal/large | fs | 690–1000 | Not reported | Direct-box | Descanned/nondescanned | None relevant |
| Radiance 2000 MP | Bio-Rad | Compact/normal | fs | 690–1000 | Not reported | Direct-box | Descanned/nondescanned | Faster scanning (>750 Hz) |
| RTS 2000 MP | Bio-Rad | Large | fs | 690–1000 | Not reported | Direct-box | Descanned/nondescanned | 130 frames/s video rate |
| TCS SP2 | Leica | Normal/large | ps | 720–900 | Not reported (120 max at the sample) | Fiber | Descanned/nondescanned | Spectral capability |
However, a TPE microscope can also be constructed from components or, as a very efficient compromise, by modifying an existing confocal laser scanning microscope. In the authors' opinion this last solution is still the best, allowing an effective mix of operational flexibility and a good quality-to-cost ratio. The basic designs of the three solutions mentioned above are very similar. The main ingredients for performing two-photon excitation microscopy and related techniques are a high peak-power laser delivering moderate average power (femtosecond or picosecond pulses at a relatively high repetition rate) and emitting infrared or near-infrared wavelengths (650–1100 nm), a laser beam scanning system, a high numerical aperture objective (>1), a high-throughput microscope pathway, and a high-sensitivity detection system (Denk et al., 1995; Konig et al., 1996b; So et al., 1996; Soeller and Cannell, 1996; Wokosin and White, 1997; Centonze and White, 1998; Potter et al., 1996; Wolleschensky et al., 1998; Diaspro et al., 1999b; Wier et al., 2000; Soeller and Cannell, 1999; Tan et al., 1999; Mainen et al., 1999; Majewska et al., 2000; Diaspro, 2002; Girkin and Wokosin, 2002; Iyer et al., 2002). Figure 23 shows a general scheme for a two-photon excitation microscope that also retains conventional excitation ability. In typical TPE or confocal microscopes, images are built by raster scanning the x–y mirrors of a galvanometer-driven mechanical scanner (Webb, 1996). This implies that the image formation speed is mainly determined by the mechanical properties of the scanner: the time needed to scan a single line is of the order of milliseconds. Faster beam-scanning schemes can be realized, although the ''eternal triangle of compromise'' should be taken into proper account. As shown in Figure 24, in agreement with Shotton (1995) and Pawley (1995), the triangle's vertices are sensitivity, spatial resolution, and temporal resolution.
Ideally, one wishes to maximize all three of these criteria. Unfortunately, the limitations of practical instrument design do not permit this, and the best choice is the one satisfying the majority of the needs of the specific application. Remaining within the galvanometric-mirror framework, in TPE setups particular attention should be given to the surfaces of the mirrors and to the way they are mounted on the scanners, in order to obtain the best reflection efficiency and scanning stability. Enhanced silver coating of the mirrors is frequently used to optimize reflectivity at the infrared excitation wavelengths (Wokosin and White, 1997). The excitation light should then reach the microscope objective through the minimum number of optical components and, if possible, along the shortest path. Typically, high-numerical-aperture objectives with high infrared transmission are used to maximize TPE efficiency (Benham and Schwartz, 2002). As also reported by Girkin and Wokosin (2002), signal detection efficiency can be further enhanced by using an additional reflector in the condenser assembly.
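The millisecond line-scan time quoted above sets the frame rate of a raster-scanned image; a minimal sketch (the 512-line format and 2 ms line period are assumed values for illustration):

```python
def frame_time_s(lines_per_frame, line_time_ms):
    """Acquisition time of one raster-scanned frame when the speed is
    limited by the line-scan period of the galvanometric mirrors."""
    return lines_per_frame * line_time_ms / 1000.0

# A 512-line image at 2 ms per scanned line takes about a second:
t = frame_time_s(512, 2.0)  # 1.024 s per frame, roughly 1 frame/s
```

This is why faster scanning schemes (resonant scanners, multifocal excitation) trade against the other vertices of the triangle of compromise.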
Figure 23. Schematic of a typical two-photon scanning microscope in which the ability to use the microscope as a confocal laser scanning microscope is retained. (Block labels: ultrafast laser source, beam control, OD filter, laser source, sample, z-axis control, laser scanning head, hardware control.)
Figure 24. The ‘‘eternal triangle of compromise.’’
Although the x–y scanners provide lateral focal-point scanning, axial scanning can be achieved by means of different positioning devices, the most popular being a belt-driven system using a DC motor and a single objective piezo nanopositioner. Usually, it is possible to switch between the
Figure 25. Photograph of a new-generation compact confocal laser scanning microscope architecture, the Nikon C1. The portable confocal scanning head is plugged into the side port of the Nikon inverted microscope. The advantage of such a compact confocal scanning head is the reduced optical pathway, resulting in increased sensitivity. (Courtesy of Cristiana Ricci, Nikon SpA, Florence, Italy.)
one-photon and two-photon modes while retaining x–y–z positioning on the sample being imaged. Figure 25 shows a new-generation compact confocal laser scanning microscope easily convertible into a TPE one. Acquisition and visualization are generally fully computer controlled by dedicated software that allows the key parameters to be set, as can be seen from the captured screen shown in Figure 26. Let us now consider two popular approaches that can be used to perform TPE microscopy, namely the descanned and nondescanned modes. The former uses the very same optical pathway and mechanism employed in confocal laser scanning microscopy. The latter optimizes the optical pathway by minimizing the number of optical elements encountered on the way from the sample to the detectors, and increases the detector area. Figure 27 illustrates these two approaches, also including the conventional confocal scheme with a pinhole along the descanned pathway. The nondescanned detection scheme is in tune with Pawley's axiom, also reported by Girkin and Wokosin (2002), which states that the single most important
Figure 26. Example of a software acquisition window. The main controllable parameters are photomultiplier tube gain (linear or logarithmic), dwell time or speed, field of view or zoom factor, channel port selection, and optical sectioning data. (EZ2000 software, courtesy of Kees van Oord and Nikon Europe; www.coord.nl.) (See Color Insert.)
aspect of fluorescence microscopy is to collect every excited photon possible (Pawley, 1995), as well as with John White's statement that ''The best optics are no optics!'' (Girkin and Wokosin, 2002). When working with point-scanning laser excitation systems, short pixel dwell times (microseconds) are often used, which necessitates very high source intensities for sufficient signal-to-noise imaging. These high intensities carry a correspondingly high risk of fluorophore bleaching and saturation. Every emission photon that can be collected should therefore be included in the final image, in order to maximize the signal-to-noise ratio and the signal-to-toxicity balance. This works against the achievement of good spatial resolution, especially along the z-axis; however, considering the overall balance in terms of image contrast, the situation is not so bad. For imaging at large depth into thick samples there is no competition with the confocal microscope: TPE is better. The TPE nondescanned mode performs very well, providing a superior signal-to-noise ratio inside strongly
Figure 27. Simplified optical schemes for descanned and nondescanned detection. A confocal pinhole can be used or fully opened. (Courtesy of Mark Cannell; adapted from Soeller and Cannell, 1999.)
scattering samples (Masters et al., 1997; Daria et al., 1998; Centonze and White, 1998; So et al., 2000). In the descanned approach, pinholes are removed or set to their maximum aperture, and the emission signal travels back through the excitation scanning device; for this reason it is called the descanned mode. In the nondescanned approach, the confocal architecture is modified in order to increase collection efficiency: pinholes are removed, and the emitted radiation is collected using dichroic mirrors on the emission path, or external detectors, without passing back through the galvanometric scanning mirrors. A high-sensitivity detection system is another critical issue (Wokosin et al., 1998; So et al., 2000; Girkin and Wokosin, 2002). The fluorescence emitted is collected by the objective and transferred to the detection system through
a dichroic mirror along the emission path. Because of the high excitation intensity, an additional barrier filter is needed to avoid mixing of the excitation and emission light at the detection system, which is placed differently depending on the acquisition scheme in use. Photodetectors that can be used include photomultiplier tubes, avalanche photodiodes, and charge-coupled device (CCD) cameras (Denk et al., 1995; Murphy, 2001). Photomultiplier tubes are the most commonly used, owing to their low cost, good sensitivity in the blue-green spectral region, high dynamic range, large sensitive area, and the availability of single-photon counting modes (Hamamatsu Photonics, 1999). Their quantum efficiency is around 20–40% in the blue-green spectral region, dropping below 1% in the red region. This is actually favorable in TPE, because one wants to reject as much as possible the wavelengths above 680 nm that are mainly used for excitation. Another advantage is that the large sensitive area of photomultiplier tubes allows efficient collection of signal in the nondescanned mode, within a dynamic range of the order of 10^8. Avalanche photodiodes are excellent in terms of sensitivity, exhibiting quantum efficiencies close to 70–80% in the visible spectral range. Unfortunately their cost is high, and the small active photosensitive area, < 1 mm in size, can introduce drawbacks in the detection scheme and require special descanning optics (Farrer et al., 1999). CCD cameras are used for video-rate multifocal imaging (Fujita and Takamatsu, 2002; Girkin and Wokosin, 2002). As a further general consideration, to obtain better spatial resolution it is also possible to retain the confocal pinhole, as shown in Figure 27 and as discussed in the previous section (Soeller and Cannell, 1999; Periasamy et al., 1999; Gauderon et al., 1999; Torok and Sheppard, 2002).
Unfortunately, in some practical experimental situations the low efficiency of the TPE fluorescence process may rule out such a solution. When pinhole insertion is possible, however, the major advantage is that the axial resolution can be improved by approximately 40%. Torok and Sheppard (2002) analyzed the theoretical dependence of the point spread function on pinhole size. The effect of the confocal pinhole is experimentally demonstrated in Figure 28 (Gauderon et al., 1999): the resolution, particularly in the axial direction, is improved by using a confocal pinhole. Because the chromosomes used as a test sample are dispersed in 3D, they are well suited to demonstrating the better spatial selectivity attainable, which results in a relevant 3D image enhancement. Figure 29 shows two three-dimensional views of a ''spiky'' pollen grain reconstructed from fluorescence optical sections acquired by means of confocal and TPE microscopy (Potter, 1996). The practical consequence of TPE is a better signal-to-noise ratio. This is particularly evident for good fluorescent samples; as usual, for weak fluorescence more complex considerations have
Figure 28. Optical sectioning x–y views of two groups of DAPI-stained onion root chromosomes in a three-dimensional volume imaged by two-photon excited fluorescence. Left: Confocal pinhole almost fully open. Right: Optimized confocal pinhole size. Using a confocal pinhole the chromosomes in the focal plane are better selected than in the pinhole open condition. (Courtesy of Colin Sheppard, adapted from Gauderon et al., 1999.) (See Color Insert.)
Figure 29. Spiky pollen grain images acquired by means of confocal and TPE threedimensional imaging. The background-free acquisition property of TPE imaging results in a better signal-to-noise ratio. (After Potter, 1996.)
TABLE 5
Comparison of TPE and Confocal Imaging Systems

| | TPE | Confocal |
|---|---|---|
| Excitation source | Laser, IR, fs–ps pulsed, 80–100 MHz repetition rate, tunable 680–1050 nm | Laser, VIS/UV CW (365, 488, 514, 543, 568, 633, 647 nm) |
| Excitation/emission separation | Wide | Close |
| Detectors | PMT (typical), CCD, APD | PMT (typical), CCD |
| Volume selectivity | Intrinsic (fraction of a femtoliter) | Pinhole required |
| Image formation | Beam scanning (or rotating disks) | Beam scanning (or rotating disks) |
| Deep imaging | > 500 μm (problems related to pulse shape modifications and scattering) | Approx. 200 μm (problems related to shorter wavelength scattering) |
| Spatial resolution | Less than confocal because of the focusing of IR radiation, compensated by the higher signal-to-noise ratio; pinhole increases resolution, good for high fluorescence | Diffraction limited, depending on pinhole size |
| Real-time imaging | Possible | Possible |
| Signal-to-noise ratio | High (especially in nondescanned mode) | Good |
| Fluorophores | All available for conventional excitation, plus new ones specifically designed for TPE | Selected fluorophores depending on laser lines in use |
| Photobleaching | Only in the focus volume defined through resolution parameters | Within all the double cone of excitation defined by the lens characteristics |
| Contrast mechanisms | Fluorescence, high-order harmonic generation, higher order n-photon excitation, autofluorescence | Fluorescence, reflection, transmission |
| Commercially available | Yes (but still not mature and too expensive) | Yes (very affordable) |
to be discussed (Brakenhoff et al., 1996). Table 5 compares one-photon confocal imaging features with TPE descanned ones. However, once the best possible image quality has been obtained, sophisticated mathematical algorithms can be applied to enhance the features of interest to the biological researcher and to improve the quality of the data to be used for three-dimensional modeling (Brakenhoff et al., 1989; Shotton, 1995; Diaspro et al., 1990, 2000; Boccacci and Bertero, 2002;
Carrington, 2002). Recently, an image restoration web service has been established to get the best quality 3D data set from a wide-field, confocal, or TPE optically sectioned sample. This tool, called ‘‘Power-up your microscope’’ (Diaspro et al., 2002c), is available for free at www.powermicroscope.com. Now, let us focus on three further aspects, namely, laser sources, lens objectives, and an example of a practical realization.
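As an illustration of the restoration algorithms referred to above, the Richardson–Lucy iteration is a classic choice for deconvolving optically sectioned fluorescence data; the 1D sketch below is generic and is not the specific algorithm of the cited web service:

```python
def convolve(signal, kernel):
    """'Same'-size 1D convolution with a centered kernel, zero boundary."""
    n, k = len(signal), len(kernel)
    half = k // 2
    out = []
    for i in range(n):
        s = 0.0
        for j in range(k):
            idx = i + j - half
            if 0 <= idx < n:
                s += signal[idx] * kernel[j]
        out.append(s)
    return out

def richardson_lucy(observed, psf, iterations=50):
    """Richardson-Lucy deconvolution: iteratively refine an estimate so
    that, blurred by the PSF, it matches the observed data."""
    psf_mirror = psf[::-1]
    estimate = [1.0] * len(observed)
    for _ in range(iterations):
        blurred = convolve(estimate, psf)
        ratio = [o / b if b > 1e-12 else 0.0 for o, b in zip(observed, blurred)]
        correction = convolve(ratio, psf_mirror)
        estimate = [e * c for e, c in zip(estimate, correction)]
    return estimate

# A point source blurred by a simple 3-tap PSF is progressively re-sharpened:
psf = [0.25, 0.5, 0.25]
truth = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
observed = convolve(truth, psf)
restored = richardson_lucy(observed, psf)  # peak re-concentrates at index 3
```

In practice 2D/3D variants with measured PSFs and regularization are used; the flux-preserving multiplicative update is the same.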
B. Laser Sources

Laser sources, as has often been the case in optical microscopy, represent an important resource, especially in fluorescence microscopy (Gratton and van de Ven, 1995; Svelto, 1998). Within the nonresonant TPE framework, owing to the comparatively low cross sections of fluorophores, high photon flux densities are required, > 10^24 photons cm^-2 s^-1 (Konig, 2000). As already discussed, using radiation in the spectral range of 600–1100 nm, excitation intensities in the MW–GW cm^-2 range are required. This spatial concentration can be obtained by the combined use of focusing lens objectives (see the next section) and CW (Hanninen and Hell, 1994; Konig et al., 1995) or pulsed (Denk et al., 1990) laser radiation of 50 mW mean power or less (Girkin and Wokosin, 2002; Diaspro and Sheppard, 2002). In fact, two-photon excitation microscopes have been realized using CW, femtosecond, and picosecond laser sources (Periasamy, 2001; Diaspro, 2002; Masters, 2002). Since the original successful experiments in TPE microscopy, advances have been made in the technology of ultrashort pulsed lasers. Even if prices are, in general, still very high, efforts have been made to lower the operative technical complexity and to produce systems that are simpler to maintain and more compact. The argon-pumped dye lasers originally used were rapidly replaced with argon-pumped Ti-sapphire lasers (Fisher et al., 1997), and these have more recently been surpassed by all-solid-state sources requiring a standard mains electrical power supply and minimal cooling (Wokosin et al., 1996). Laser sources suitable for TPE can now be described as ''turnkey'' systems. Figure 30 shows the emission range of different laser sources together with the cross-sectional behavior of some popular fluorophores. It is evident that the range 700–1050 nm is well covered by Ti-sapphire lasers.
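The role of pulsing in reaching these intensities at modest average power can be sketched numerically; the 50 mW, 80 MHz, and 100 fs figures below are typical values assumed for illustration:

```python
C = 299_792_458.0  # speed of light (m/s)

def peak_power_w(avg_power_w, rep_rate_hz, pulse_width_s):
    """Peak power of a mode-locked pulse train, assuming rectangular
    pulses: P_peak = P_avg / (repetition rate * pulse width)."""
    return avg_power_w / (rep_rate_hz * pulse_width_s)

def cavity_length_m(rep_rate_hz):
    """Linear-cavity length implied by the repetition rate: one pulse
    per round trip, so f = c / (2 L)."""
    return C / (2.0 * rep_rate_hz)

# 50 mW average power, 80 MHz repetition rate, 100 fs pulses:
p_peak = peak_power_w(50e-3, 80e6, 100e-15)  # 6250 W peak from 50 mW average
duty = 80e6 * 100e-15                        # duty cycle of 8e-6
length = cavity_length_m(80e6)               # cavity about 1.9 m long
```

A duty cycle of about 10^-5 is what turns tens of milliwatts of biologically tolerable average power into kilowatt-scale peak powers, and hence MW–GW cm^-2 intensities at a diffraction-limited focus.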
This range of wavelengths is very convenient because a variety of fluorophores have an excitation range, in the conventional one-photon regime, within 350–600 nm. Under the ''twice wavelength'' rule of thumb, the Ti-sapphire laser therefore appears the best choice. Other laser sources used for TPE are Cr-LiSAF and pulse-compressed Nd-YLF in the femtosecond regime, and mode-locked Nd-YAG and
Figure 30. Cross sections of common fluorophores compared with the emission wavelength range available by different commercial laser systems. (After Xu et al., 1995.)
picosecond Ti-sapphire lasers in the picosecond regime (Gratton and van de Ven, 1995; Wokosin et al., 1996). Moreover, the absorption coefficients of most biological samples, cells, and tissues are minimal within this spectral window (So et al., 2000). Figures 31 and 32 show a practical setup for an all-solid-state-pumped Ti-sapphire laser. Table 6 presents some data on the Ti-sapphire laser sources most commonly used for applications in microscopy and spectroscopy. These lasers operate in mode-locked fashion. Mode locking generates a train of very short pulses by modulating the gain or excitation of the laser at a frequency whose period equals the round-trip time of a photon within the laser cavity; the repetition rate is thus inversely proportional to the cavity length (Fisher et al., 1997; Svelto, 1998). The resulting pulse width is in the 50 to 150 fs regime. Figure 33 shows a photograph of the open cavity of a Tsunami (Spectra Physics, CA) Ti-sapphire laser. The chromatic beauty is provided by the green light of the solid-state pump and by the red fluorescence of the Ti-sapphire crystal. Measured values of pulse width and average power as a function of the operating wavelength are shown in the graph of Figure 34. This graph is restricted to the 680–830 nm range because, even though Ti-sapphire emits over the 680–1050 nm regime, the cavity mirrors have to be wavelength selected to obtain stable behavior. In terms of wavelengths, two-photon and multiphoton excitation microscopy take place with a ''comb'' of wavelengths. This fact has positive and negative effects. For a 1050-nm source, three-photon events at the 350-nm
Figure 31. Solid-state laser pump for a Ti-sapphire crystal laser cavity. Visible in this picture is the open cavity of a Millennia V (Spectra Physics, Mountain View, CA), emitting in the green at 532 nm and delivering 5 W. More compact solid-state pumps, such as the Millennia X by Spectra Physics and the Verdi by Coherent, have recently been introduced. (Courtesy of Alessandro Esposito; picture taken at LAMBS.)
equivalent wavelength (with the potential for phototoxic UV transitions) can occur together with 525-nm (two-photon equivalent) excitation. For a 720-nm laser beam the comb effect may be worse: one should consider possible effects at 360 nm and 180 nm, with the potential to induce DNA damage. The final choice is, as usual, a compromise dictated by the specific TPE microscope application. The parameters most relevant to the selection of the laser source are average power, pulse width, repetition rate, and wavelength, in accordance with Eq. (18). The most popular specifications for an infrared pulsed laser are 700 mW–1 W average power, 80–100 MHz repetition rate, and 100–150 fs pulse width. The use of short pulses and small duty cycles is mandatory to allow image acquisition in a
Figure 32. Coupling of the solid-state laser pump (Millennia V, Spectra Physics) with the Ti-sapphire unit (Tsunami, Spectra Physics). Another popular commercial combination is the Verdi and Mira pair by Coherent. Visible in the background is the only cooling system needed for both commercial systems, i.e., a chiller. (Courtesy of Alessandro Esposito; picture taken at LAMBS.)
reasonable time while using power levels that are biologically tolerable (Denk et al., 1994; Denk, 1996; Koester et al., 1999; Konig et al., 1996a,c; Konig et al., 1998; Konig, 2000; Konig and Tirlapur, 2002). The 100-fs pulses used for TPE microscopy have bandwidths of the order of 10–15 nm, and when these pulses pass through optical elements, mainly objective lenses, dispersion takes place. This means that the original pulse is stretched in time (Fig. 35), reducing its peak power and consequently the potential fluorescence signal (Soeller and Cannell, 1996; Wokosin and White, 1997; Wolleshensky et al., 1998, 2002). Compensating such dispersion is not easy, either to implement or to maintain, especially in a multiuser TPE microscopy facility. Such compensation is also required if an optical fiber is used to deliver the excitation beam to the microscope scanning head. With optical fibers the problem is further complicated by power limitations: when operating at high power, nonlinear effects within the fiber can occur, and the nonpropagating portion of the beam can produce damage at the fiber coupling zone. To minimize dispersion problems, Konig (2000) suggests working with pulses of around 150–200 fs. This seems a very good compromise for both pulse stretching and sample viability; it is necessary to keep in mind that a shorter pulse broadens more than a longer one. Until new optical fibers, such as the ones outlined by Warren Zipfel at the 2002 SPIE meeting on Multiphoton Microscopy, are designed and produced, it is preferable to
TABLE 6
Ti-Sapphire Laser Sources

| Company/Model | Tuning Range | Wavelength (nm) |
|---|---|---|
| Spectra Physics—Tsunami | Wide | 680–1050 |
| Coherent—Mira | Wide | 680–1000 |
| Spectra Physics—Mai Tai | 100 nm selectable | 750–850 or 780–920 |
| Coherent—Chameleon | 210 nm selectable | 720–930 |

[The pulse width entries of Table 6 and the objective transmission values of Table 7 (T% at 900 nm, in the classes 51–70 and >71) are garbled in the source and not reproduced.]

TABLE 8
Values of D at 800 nm

| Glass Type | D at 800 nm (fs^2 cm^-1) |
|---|---|
| CaF2 | +251 |
| Quartz | +300 |
| FK-3 | +389 |
| BK7 | +445 |
| SF2 | +1030 |
| SF10 | +1600 |
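The dispersion values above can be converted into an estimate of pulse broadening; a minimal sketch assuming a transform-limited Gaussian pulse (the broadening factors reported in the text for 100-fs pulses through microscope objectives are of the same order):

```python
import math

def broadening_factor(pulse_fs, gdd_fs2):
    """Broadening of a transform-limited Gaussian pulse by group-delay
    dispersion D: tau_out/tau_in = sqrt(1 + (4 ln2 * D / tau_in^2)^2)."""
    x = 4.0 * math.log(2.0) * gdd_fs2 / pulse_fs**2
    return math.sqrt(1.0 + x * x)

# 100 fs pulse through objectives with D of roughly 1500-2400 fs^2
# (the range reported in the text for Zeiss lenses at 800 nm):
bf_low = broadening_factor(100.0, 1494.0)   # ~1.08
bf_high = broadening_factor(100.0, 2398.0)  # ~1.20
```

The quadratic dependence on 1/tau_in^2 is why, as noted in the text, a shorter pulse broadens more than a longer one.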
high-NA objective is 5000 fs^2 (Konig, 2000). Wolleschensky et al. (2002) summarized dispersion parameters, measured at 800 nm, for Zeiss microscope objective lenses. D values in fs^2 are 1714, 1494, 2398, and 1531 (within an error of about 10%) for the 40×/0.8 water IR Achroplan, 63×/0.9 water IR Achroplan, 40×/1.3 oil Plan Neofluar, and 20×/0.75 Plan Apochromat, respectively. The broadening factor of a 100-fs pulse was estimated to be between 1.14 and 1.23.

D. Example of the Practical Realization of a TPE Microscope

This section describes the practical realization of a TPE microscope achieved through minor modifications of a commercial confocal laser scanning microscope (CLSM), in which the ability to operate as a standard CLSM has been preserved (Diaspro, 2001; Diaspro et al., 2001). This microscope has been established at LAMBS (Laboratory for Advanced
Figure 36. A schematic drawing of the TPE microscope developed at LAMBS (Diaspro, 2001). (See Color Insert.)
Bioimaging, Microscopy, and Spectroscopy) under the auspices of, and with grants from, the National Institute for the Physics of Matter (INFM, Istituto Nazionale per la Fisica della Materia), as the first Italian TPE architecture (Diaspro et al., 1999b; Diaspro, 2001). A scheme of the architecture is sketched in Figure 36. Figure 37 shows an overall picture of the laboratory, including the TPE microscope. The core of the architecture is a mode-locked Ti-sapphire infrared pulsed laser (Tsunami 3960, Spectra Physics Inc., Mountain View, CA), pumped by a high-power (5 W at 532 nm) solid-state laser (Millennia V, Spectra Physics Inc., Mountain View, CA). The Ti-sapphire laser output can be tuned across two ranges, namely from 680 to 830 nm and from 730 to 900 nm, depending on the set of mirrors mounted in the laser cavity; the restriction of the tunable range is given by the installed mirror set. These two sets allow the two-photon excitation of a variety of fluorescent molecules normally excited by visible and ultraviolet radiation, including the green fluorescent protein family (Xu, 2002). Power and wavelength measurements are performed using an RE201 model ultrafast laser spectrum analyzer (Ist-Rees, UK) and an AN2/10A-P model thermopile detector power meter (Ophir, Israel) that
Figure 37. Photograph of the TPE microscope realized at LAMBS within the strategic framework of a national project of the National Institute for the Physics of Matter (INFM) (Diaspro et al., 1999a; Diaspro, 2001). On the left, the open Tsunami cavity is visible. The microscope and the PCM2000 scanning head mounted on its lateral port are visible; on the right is the video unit. Part of the beam diagnostics (left) is performed using an ultrafast laser spectrum analyzer RE201 (Ist-Rees, UK) and a thermopile detector power meter AN2/10A-P (Ophir, Israel). Also visible are the author (right) and Mirko Corosu (left), the first student working at LAMBS on the TPE microscopy project.
constitute the beam diagnostics module of the system. A model 409-08 scanning autocorrelator (Spectra Physics, Mountain View, CA) has occasionally been used for precise pulse width evaluation, but it is not part of routine beam diagnostics. We currently use a compact optical autocorrelator, based on a Michelson interferometer and the fluorescence signal, that allows measurement of femtosecond laser pulses at the microscope objective plane (Cannone et al., 2002). A special dichroic mirror set (Stanley, 2001), optimized for high-power ultrashort infrared pulses (CVI, USA), is used to bring the Tsunami beam directly into the scanning head. Before entering the scanning head, the beam average power is brought to the desired value using a neutral density rotating wheel (Melles Griot, USA). For an average power of 20 mW at the entrance of the scanning head, the average power before the microscope objective is about 8–12 mW, and at the sample it is estimated at between 2 and 6 mW. We found that at the focal volume a 1.5- to 1.8-fold pulse broadening occurs when using a high numerical aperture objective and a reduced amount of optics within the optical path (Soeller and Cannell, 1996; Hanninen and Hell, 1994). For example, this means that
for a measured laser pulse width of about 100 fs at the Tsunami output window, the estimated width at the sample is about 150 fs. During measurement sessions, we continuously monitor the pulse condition by means of an oscilloscope connected to the output of the spectrum analyzer. The pulse condition can also be tested using a simple reflective grating: the reflected image on a screen will be sharp for quasicontinuous emission and blurred for pulsed emission, because the laser output is more spectrally broadened in the pulsed case. This spectrum is visible on the screen of the above-mentioned oscilloscope in the architecture pictures. For a transform-limited sech^2 pulse, the relationship between pulse width (dT) and frequency width (df) is dT·df = 0.315. In practice the pulse is not transform limited, so this product can exceed 0.315. The laser beam is aligned using a conventional laser source of the scanning head by marking some reference positions inside the scanning head itself. The scanning and acquisition system for microscopic imaging is based on a commercial single-pinhole scanning head, the Nikon PCM2000 (Nikon Instr., Florence, Italy), mounted on the lateral port of a common inverted microscope, the Nikon Eclipse TE300 (Fig. 38). The Nikon PCM2000 has a
Figure 38. Photograph of the confocal laser scanning head currently operating at LAMBS and modified for TPE imaging, i.e., Nikon PCM2000. (Courtesy of Alessandro Esposito.)
simple and compact light path that makes it very appropriate for conversion into a two-photon instrument (Diaspro et al., 1999b). The optical resolution performance of this microscope when operating in conventional confocal mode, using a 100×/1.3 NA oil immersion objective, has been reported in detail elsewhere: 178 ± 21 nm laterally and 509 ± 49 nm axially (Diaspro et al., 1999b). Under TPE the scanning head operates in the ''open pinhole'' condition, i.e., a wide-field descanned detection scheme is used (Diaspro et al., 1999a). Figure 39 illustrates the optical pathways available on the microscope. Figure 40 shows in detail the optical pathway within the laser scanning head and the beam delivery input port. Figure 41 shows the simple but effective optical path of the PCM2000 scanning head. A dichroic mirror (the first, D1) has been substituted in the original scanning head to allow
Figure 39. Rear view of the LAMBS TPE architecture: (1) Tsunami laser; (2–3) optical mounts for beam-splitting dichroics; (4) reflected-beam stops; (5–6) spectrum analyzer section, including neutral density filter (5) and spectrum analyzer head (6); (7) neutral density filter for average power control along the TPE microscopy pathway (right-side beam line); (8) mobile power meter, including measuring unit and display; (9) microscopy beam line input port at the confocal scanning head; (10) scanning lens coupling the confocal scanning head with the side port of the inverted optical microscope; (11) epifluorescence port sacrificed for TPE beam delivery for spectroscopic applications of the left-side beam line (Diaspro et al., 2001). (Photo courtesy of Alessandro Esposito.) (See Color Insert.)
TWO-PHOTON EXCITATION MICROSCOPY
Figure 40. Scanning head input (left) and scanning head components (right) including modified dichroics for TPE microscopy (D1, D2). (See Color Insert.)
excitation from 680 to 1050 nm (Chroma Inc., Brattleboro, VT). The substituted dichroic mirror reflects very efficiently (>95%) from 680 to 1050 nm; the 50% cut-off is around 640 nm. At its best performance (>90%) the mirror transmits from 410 to 620 nm. The neutral density filter at the open-pinhole location has been removed. The galvanometer mirrors are metal coated (silver) on fused silica and exhibit a high damage threshold. The minimum pixel residence time is 3 μs and is related to the mechanical response of the scanners. A series of custom-made emission filters that block infrared radiation (>650 nm) to an optical density of 6–7 at up to 50 mW of beam power incident on the filters themselves has been utilized, namely, E650P, HQ 460/50, HQ 535/50, HQ 485/30, and HQ 405/30 (Chroma Inc., Brattleboro, VT). The E650P filter was initially tested to check its blocking performance with respect to the IR/NIR reflections coming from stray rays within the scanning head or from the sample, and constitutes the base for the other HQ filters. Switching between the one-photon and two-photon modes is simply accomplished by moving from the single-mode optical fiber (one photon), coupled to a module containing conventional laser sources (Ar-ion, He-Ne green), to the optical path in air delivering the Tsunami laser beam (two photon). To minimize architectural changes of the PCM2000 scanning head, a lens having a numerical aperture close to 0.11, the numerical aperture of the optical fiber used for conventional excitation laser delivery, is used. Figure 42 shows the attachment developed at LAMBS for laser coupling. It is a device that can be directly plugged into the scanning head. Switching from
Figure 41. Optical scheme of the confocal scanning head Nikon PCM2000 shown in Figure 40. The excitation beam enters the PCM2000 scanning head through an optical coupler (1) in order to reach the sample on the x–y–z stage (5). The beam passes through the pinhole holder (2), kept in an open position, the galvanometric mirrors (3), and the scanning lens (4). Fluorescence generated from the sample (5) is delivered to the PMT through acquisition channels directed by two selectable mirrors (6, 8) via optical fiber (7, 9) coupling. The one-photon and two-photon modes are simply switched between by moving from the single-mode optical fiber (one photon), coupled to a module containing conventional laser sources (Ar-ion, He-Ne green), to the two-photon optical coupler (TPOC), allowing delivery of the Tsunami laser beam (two photon).
conventional to two-photon excitation is simple. Moreover, the switching operation allows the focus and position on the sample to be maintained, as demonstrated in Figures 43 and 44. A high-throughput optical fiber delivers the emitted fluorescence from the scanning head to the PCM2000 control unit, where the photomultiplier tubes (R928, Hamamatsu, Japan) are physically plugged in. This solution is particularly useful for three main reasons: (1) electrical noise is reduced, (2) background light noise is reduced, and (3) it is possible to directly verify optical conditions while keeping the scanning head without an enclosure. Axial scanning for confocal and TPE three-dimensional imaging is actuated by means of two different positioning devices depending on the
Figure 42. TPOC details. (A) Aluminum tube containing a low-magnification coupling objective (Edmund Scientific, USA); (B) TPOC plugged into the scanning head input port after removing the optical fiber for conventional excitation beam delivery; (C) optical fiber delivering conventional excitation connected through the TPOC to the scanning head. By means of the solution adopted in (C) it is possible to switch simply and precisely from the one- to the two-photon excitation mode (Diaspro, 2001). (Photograph courtesy of Federico Federici.)
experimental circumstances and axial accuracy needed, namely, a belt-driven system using a DC motor (RFZ-A, Nikon, Japan) and a single-objective piezo nanopositioner (PIFOC P-721-17, Physik Instrumente, Germany). The piezoelectric axial positioner allows an axial resolution of 10 nm within a motion range of 1000 nm at 100 nm steps and utilizes a linear variable differential transformer (LVDT) integrated feedback sensor. Acquisition and visualization are completely computer controlled by dedicated software, EZ2000 (Coord, Apeldoorn, The Netherlands; http://www.coord.nl). The main available controls are related to PMT voltage, pixel dwell time, frame dimensions (1024 × 1024 maximum), and field of scan (from 1 to 140 μm using a 100× objective). Remember that decreasing the size of the field of scan increases the radiation exposure time when the resulting pixel dimension is smaller than one-half the dimension of the diffraction-limited spot, i.e., <200 nm, as shown in Figure 45. By zooming over a specific area of the sample while simultaneously increasing the dwell time, it is possible to selectively destroy parts of the sample. This micropatterning effect can be software controlled, as reported in Figure 46. Specific patterns can be obtained by utilizing the microscope as an active device, as shown in Figure 47.
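The exposure argument above is simple arithmetic: the pixel size is the field of scan divided by the number of pixels per line, and once it drops below half the diffraction-limited spot the same spot is exposed over several neighboring pixels. A rough sketch, with an illustrative 512-pixel line and a ~400 nm spot width assumed:

```python
# Sketch: pixel size versus field of scan, and a flag for the oversampling
# regime described in the text, where the pixel is smaller than one-half the
# diffraction-limited spot (taken here as ~400 nm wide). Values illustrative.
def pixel_size_nm(field_um, pixels_per_line):
    return field_um * 1000.0 / pixels_per_line

def oversampled(field_um, pixels_per_line, spot_nm=400.0):
    # exposure per sample point grows once pixels are smaller than spot/2
    return pixel_size_nm(field_um, pixels_per_line) < spot_nm / 2

# full 140 um field over a 512-pixel line -> ~273 nm pixels (not oversampled);
# zooming to a 20 um field -> ~39 nm pixels, well inside the oversampling regime
print(round(pixel_size_nm(140, 512)), oversampled(140, 512))
print(round(pixel_size_nm(20, 512)), oversampled(20, 512))
```

In the oversampled regime each point of the sample accumulates dose over several pixels, which is the basis of the photodestruction effect of Figure 45.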
Figure 43. Optical sectioning demonstrated in the confocal and TPE modes after switching from one mode to the other using the TPOC developed at LAMBS. (See Color Insert.)
To evaluate the performance of the microscope some basic measurements have to be performed, namely, a check of the quadratic behavior of the fluorescence and a point spread function (PSF) measurement. PSF measurements refer to a plan-achromatic Nikon 100×, 1.4 NA oil immersion objective with enhanced transmission in the infrared region. Blue fluorescent carboxylate-modified microspheres 100 nm in diameter (F-8797, Molecular Probes, OR) were used. A drop of dilute bead suspension was spread between two cover slips of nominal thickness 0.17 mm. These microspheres constitute a very good compromise between subresolution point scatterers and acceptable fluorescence emission. The geometry used is sketched in Figure 48. An object plane field of 18 × 18 μm was imaged in a 512 × 512 frame, at a pixel dwell time of 17 μs. Axial scanning was performed and 21 consecutive parallel optical slices were collected at steps of 100 nm. The x–y scan step was 35 nm. The scanning head pinhole
Figure 44. Same cells as Figure 16, demonstrating switching from the one- to the two-photon excitation mode. In the TPE mode the internal structure of the nucleus is clearly visible, i.e., chromatin DNA marked by DAPI. Along with Figure 43, this figure clearly shows the control of positioning after mode switching. (See Color Insert.)
Figure 45. Selective photodestruction of cells after selective zooming and average power increase. When zooming, the residence time increases because the spot remains diffraction limited exactly as in unzoomed scanning, while the motion of the scanned point is finer and slower during zoom. (See Color Insert.)
was set to the open position. The 3D data sets of several specimens were analyzed in the form presented in Figure 49. The measured full width at half maximum (FWHM) lateral and axial resolutions were 210 ± 40 nm and 700 ± 50 nm, respectively (Diaspro, 2001). Intensity profiles along the x–y–z directions of the experimental data and the theoretical expectations are reported in Figure 50. To be sure of operating in the TPE regime, the quadratic behavior of the fluorescence intensity versus excitation power has
Figure 46. It is possible to perform software control of the scanning beam in the x–y–z frame. This picture shows the programming of the scanners through the graphic realization (upper left window) of the desired pathway. After this, training commands are sent to EZ2000; an example of selective photobleaching in a fluorescent sphere is shown (upper right window). (Courtesy of Alessandro Esposito, who developed this software tool, named "Stealth," which directly interfaces with the EZ2000 acquisition software.)
been demonstrated. Figure 51 shows the TPE trend obtained from a solution of fluorescein. Moreover, during any fluorescence acquisition a simple and effective test of the TPE condition can be performed by delivering continuous instead of pulsed radiation. This can be accomplished by interrupting the pumping of the Ti-sapphire laser for a while and switching off the pulse control. When the pump is reactivated, if there are not too many vibrations, it is possible to get a quasicontinuous beam that is not appropriate for TPE even though it has the very same average power as the pulsed one. Restoring pulsing at any moment during scanning makes the fluorescence visible again, confirming the TPE imaging condition. Figure 52 shows the 3D ability of the system. Two different views of a mature sperm head of the octopus Eledone cirrhosa (Diaspro et al., 1997) are shown, realized from optical sections.
Figure 47. Examples of controlled selective photobleaching (above) resulting in writing the characters INFM within the central plane of a fluorescent sphere, and of selective ablation of a cell layer from a three-dimensional multilayer sample (Diaspro, 1999c, 2001). (See Color Insert.)
Figure 48. Geometry of the acquisition conditions for measuring the point spread function. Subresolution fluorescent spherical beads are dried and mounted between two 0.17-mm cover slips optimized for refractive index homogeneity (Diaspro et al., 2002a).
Figure 49. Fluorescence signals from subresolution optical fluorescent beads collected plane by plane by means of optical sectioning (Diaspro et al., 1999a,b).
Figure 50. Radial and axial intensity profiles of the point spread function. (Adapted from Diaspro et al., 1999b.)
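FWHM values such as those quoted above can be extracted from sampled profiles like those of Figure 50 by interpolating the half-maximum crossings. A minimal sketch follows; the sample values are a synthetic, roughly Gaussian profile, not the measured data.

```python
# Sketch: estimating the FWHM of a sampled, single-peaked intensity profile
# (as done for the PSF measurements) by linear interpolation at the
# half-maximum crossings. Profile values and the 35 nm step are illustrative.
def fwhm(profile, step):
    """FWHM of a single-peaked sampled profile, in the units of `step`."""
    peak = max(profile)
    half = peak / 2.0

    def crossing(indices):
        # first index pair straddling the half maximum, interpolated linearly
        prev = None
        for i in indices:
            if prev is not None and (profile[prev] - half) * (profile[i] - half) <= 0:
                frac = (half - profile[prev]) / (profile[i] - profile[prev])
                return prev + frac * (i - prev)
            prev = i
        raise ValueError("no half-maximum crossing found")

    imax = profile.index(peak)
    left = crossing(range(imax, -1, -1))
    right = crossing(range(imax, len(profile)))
    return abs(right - left) * step

# roughly Gaussian profile sampled at 35 nm steps
samples = [0.02, 0.06, 0.15, 0.29, 0.50, 0.73, 0.93, 1.00,
           0.93, 0.73, 0.50, 0.29, 0.15, 0.06, 0.02]
print(round(fwhm(samples, 35.0)))  # -> 210, the order of the lateral FWHM
```

With real data the profile would be a line cut through the bead image, averaged over several beads as in Figure 49.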
[Figure 51 plot: fluorescence intensity (a.u.) versus average power Paverage (mW).]
Figure 51. Quadratic behavior check for imaging under the TPE regime. (Courtesy of Mirko Corosu; measurements made at LAMBS.)
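The quadratic check of Figure 51 amounts to verifying that log F versus log P has a slope close to 2, since for a two-photon process F ∝ P². A sketch with synthetic data, not the measured fluorescein values:

```python
# Sketch: verifying the TPE regime from fluorescence-vs-power data. For a
# two-photon process F is proportional to P^2, so the least-squares slope of
# log F versus log P should be ~2. Data points below are synthetic.
import math

def loglog_slope(powers, fluorescence):
    """Least-squares slope of log(F) versus log(P)."""
    xs = [math.log(p) for p in powers]
    ys = [math.log(f) for f in fluorescence]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

powers = [5, 10, 20, 40]          # mW (illustrative)
fluor = [0.1, 0.41, 1.58, 6.45]   # a.u., roughly quadratic in power
print(round(loglog_slope(powers, fluor), 2))  # close to 2 -> TPE regime
```

A slope near 1 would instead indicate a residual one-photon (or quasicontinuous) contribution.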
VII. Application Gallery Two-photon excitation microscopy has found applications in many areas of biology, medicine, physics, and engineering. Areas such as neurobiology and embryology, tissue engineering, and proteomics are only the tip of the
Figure 52. Three-dimensional views (a, b) of the mature sperm head of the octopus Eledone cirrhosa, loaded with DAPI, from 12 optical sections. (Courtesy of Silvia Scaglione; image processing and visualization by Fabio Mazzone, Francesco Di Fato, and Silvia Scaglione at LAMBS and BioLab, University of Genoa, Italy.)
iceberg. Never, since the Dutchman van Leeuwenhoek constructed his simple microscope in 1683, has there been such a vast, rapid, and widespread flourishing of applications based on a microscopic technique. As will be seen, applications are predominantly found in the neurosciences, the field of Denk and co-workers (1990), with whom the modern TPE story started. Here we will try to show different applications, mixing the various areas affected by the TPE revolution. Unfortunately, it is impossible to cover the vast extent of TPE applications; for this reason we refer the interested reader to web-based search engines. Starting from neuroscience, Yuste et al. (2000) provide a wide and complete collection of outstanding and excellent applications of two-photon excitation imaging. Figure 53 shows the complex organizational motifs of a special neuronal cell, the Purkinje cell, evidenced by means of Oregon Green labeling. This fluorescent molecule binds calcium ions in the cytoplasm. Through specific experimental procedures it is possible to obtain quantitative information within a three-dimensional and temporal framework. In this context TPE is also relevant because of the possibility of long-term imaging sessions and for the ability to perform these studies in intact tissues. In Figure 54 an optical section of rat cerebellar granule cells is shown. In this case Indo-1 AM fluorescence is the mechanism of contrast for calcium ion distribution. This marker is conventionally excited in the UV regime and can give quantitative information about calcium concentration. TPE microscopy allows quantitatively dynamic events to be followed without perturbing the delicate and
Figure 53. Purkinje cell labeled with Oregon green. Calcium ion concentration is mapped by means of a color scale from blue (low concentration level) to red (maximum concentration level). (Courtesy of Prof. Cesare Usai, Institute of Biophysics, National Research Council, Genoa, Italy. Image acquired at LAMBS.) (See Color Insert.)
Figure 54. Rat cerebellar granule cell loaded with Indo-1 AM calcium-binding dye. This UV-excitable fluorescent molecule has been excited at 720 nm at a moderate average power (2 mW) at the focal plane. (Courtesy of Alessandro Esposito, DIFI, University of Genoa, Italy. Image acquired at LAMBS.) (See Color Insert.)
complex relationships within neuronal cell networks; in the UV excitation case such perturbation would significantly limit the duration of the experiment. The possibility of following dynamic events allowed us to demonstrate that living cells, after encapsulation into fuzzy nanostructured polyelectrolyte matrices, preserve their morphology, metabolic activity, and duplication function (Diaspro et al., 2002c). This is shown in Figure 55. Here the polyelectrolyte capsule was bound to fluorescein, and DAPI was used to reveal mitochondrial and nuclear DNA distribution. For these dyes TPE
Figure 55. Demonstration of cell duplication ability after polyelectrolyte encapsulation by coupling transmission imaging (A) with TPE imaging (B) of fluorescein and DAPI, mapping the capsule wall (green) and the DNA distribution of a duplicating mother cell (blue). (Reprinted with permission from Langmuir, June 25, 2002, 18, 5047–5050. Copyright 2002 American Chemical Society.) Image acquired at LAMBS (Diaspro et al., 2002c). (See Color Insert.)
Figure 56. This image illustrates the peculiarity of TPE (right) with respect to conventional fluorescence excitation (left), an ability that is a keystone for TPE applications. TPE takes place only within a diffraction-limited volume of event, whereas conventional excitation takes place everywhere photons of the proper energy meet excitable fluorescent molecules. The volume of event, marked by the bright ellipsoid in the center of the excitation volume, can be roughly quantified using the resolution parameters of the system, as discussed in Section V. (See Color Insert.)
allowed simultaneous excitation at 720 nm at moderate average power (around 5 mW) without perturbing the hybrid cell–polyelectrolyte system. Such perturbation could occur under a conventional confocal excitation regime, for which 360 and 488 nm excitation wavelengths are required to excite DAPI and fluorescein, respectively. This very same ability to perform dynamic imaging is at the core of a recent note published by Ott (2002) on the ability of TPE microscopy to reveal tumor development. Figure 56 shows
Figure 57. Optical sections from a sea urchin egg marked by DAPI; TPE excitation at 720 nm. In this case the heterochromatin distribution within the female pronucleus is visible. The whole egg has a diameter of 80 μm, whereas the nucleus is 10 μm (given as its maximum visible diameter, for reference). In conventional wide-field microscopy we could see only a confused bright spot from the nucleus. (Preparation of the sample made by Carla Falugi, DIBISAA, University of Genoa; images acquired at LAMBS.) (See Color Insert.)
that the key feature of TPE is a strong spatial selectivity in exciting extrinsic and intrinsic fluorophores. This property is fundamental in three-dimensional imaging of thick samples. Excitation scattering is greatly reduced, and at the same time the scattered emission can be acquired in full, since it comes from a unique, well-defined subvolume within the sample located at the actual scanning position. The situation is dramatically improved with respect to conventional UV-regime excitation. Figure 57 shows the three-dimensional heterochromatin distribution within the nucleus of a sea urchin egg, which constitutes a comparatively thick biological sample. Also in this case DAPI was used for evidencing DNA, with the consequence that under conventional excitation DNA distribution details
Figure 58. Optical sections of Figure 57 have been mounted in a topographic image. The image shows EZ2000 (Coord, NL) rendering using the ‘‘volume height function’’ that allows us to map the position of the maximum fluorescence along the optical axis. (See Color Insert.)
Figure 59. Spongy mesophyll of a rice plant. TPE allowed simultaneous visualization of rice plant autofluorescence (red) and nonspecific DAPI binding to plant cell walls (blue). TPE at 790 nm. (Courtesy of Kirk J. Czymmek, Department of Biological Sciences, University of Delaware. Details on the project can be found at http://www.udel.edu/bio/people/faculty/kczymmek.html.) (See Color Insert.)
are generally lost. This is due to the thickness and turbidity of the sample, coupled with the need for UV excitation and the demands of three-dimensional imaging. Such a high-resolution imaging modality allows accurate topographical information to be obtained (Fig. 58), which can be used to monitor environmental effects on sea urchin egg development (C. Falugi, 2002, private communication). Another very interesting field of application of TPE microscopy is plant biology. Figure 59 shows the spongy mesophyll of a rice plant, combining chloroplast autofluorescence and DAPI binding fluorescence. It was recently observed that excitation with ultrashort
Figure 60. Top-down projection of senile plaques in the brain of a living transgenic mouse (Tg2576). This image is from an x–y–z volume of 500 × 615 × 200 μm³ (Christie et al., 2001). (Image by B. J. Bacskai, [email protected]; downloaded from the Bio-Rad site http://microscopy.bio-rad.com/gallery7.htm.) (See Color Insert.)
90- and 170-fs NIR laser pulses at λ = 740, 760, 780, 800, 820, 840, 860, 880, and 900 nm (at a mean power of 1 mW) invariably induces red chlorophyll autofluorescence of the chloroplasts present in the mesophyll cells (Tirlapur and Konig, 2002). As recently reported by Tirlapur and Konig (2002), the progress made in realizing TPE in plant biology indicates relevant contributions to the following topics: (1) signal transduction and ion dynamics, (2) protein–protein interactions, (3) symplastic communication, (4) basic aspects of organelle and cell division, (5) tip growth, and (6) plant morphogenesis as a whole. Hence TPE in planta is likely to exert an enormous impact, revolutionizing our basic thinking about structure–function relationships in three as well as in four dimensions. Figure 60 recalls the penetration properties of TPE microscopy, showing in red amyloid angiopathy and senile plaques from a living transgenic mouse brain; a fluorescent angiogram is shown in green. The image, captured from the Bio-Rad web site, is from outstanding work by Christie and co-workers (2001), and is realized as a top-down projection of a large volume 0.2 mm deep. This ability to image at a depth of 0.2 mm and deeper is unique to the two-photon approach. A comparative study by Centonze and White (1998) convincingly demonstrated that TPE microscopy is a superior method for thick specimen analysis. Moreover, the excellent work published by Squirrel's group on long-term imaging of mammalian embryos without compromising viability (Squirrel et al., 1999) definitively demonstrated the usefulness and relevance of TPE imaging in the noninvasive and high-resolution study of living specimens. This feature of
Figure 61. Mouse ear tissue structures visualized by means of two-photon excitation microscopy. Three-dimensional images of epidermal keratinocytes (a), basal cells (b), elastin/collagen fibers (c), and cartilage structure (d) (above). (Adapted from So et al., 1998, 2000.) In vivo imaging of human skin: basal layers and strata corneum can be distinguished (below). (Adapted from Masters and So, 1999; So et al., 2000; courtesy of Peter So and Barry Masters.) (See Color Insert.)
TPE is of critical importance for applying this technique in optical biopsy. Figure 61a shows three-dimensional reconstructed TPE images of dermal and subcutaneous structures in a mouse ear tissue specimen (So et al., 1998). Two-photon skin images obtained from the forearm of a human volunteer allow the distinct visualization of the strata corneum and of the basal layers, as reported in Figure 61b. This implies that pathological states, such as atypical changes in cellular morphology, as well as the penetration of intradermally delivered drugs, can be monitored. Notwithstanding these results, it should be mentioned that some technological limitations still exist. As accurately analyzed by Gu and co-workers (2000), the penetration depth under TPE can be limited by the strength of the primed fluorescence and is not necessarily larger than that under single-photon excitation. In fact, for a turbid tissue medium where Mie scattering is dominant, multiple scattering not only reduces the illumination power in the forward direction but also produces an anisotropic distribution of scattered photons. It is worth noting that in cells and tissue it is possible to perform high-resolution DNA analysis of specific sequences using two-photon excitation fluorescence in situ hybridization (FISH), more specifically three-dimensional two-photon multicolor FISH (Konig et al., 2000). Moving again to brain imaging and neuroscience applications, within the framework of a study of anatomical features in whole animals (Denk et al., 1994), Yoder and Kleinfeld (2002), in an effort to image the brain with subcellular spatial resolution, designed and applied a method to image directly through the thinned mouse skull using TPE microscopy and a stainless steel headframe (Kleinfeld and Denk, 2000). Figure 62 shows a cerebral vascular angiogram visualized through a thinned skull, and Figure 63 shows the related red blood cell motion.
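The spatial selectivity invoked throughout this section follows from a one-line Gaussian-beam argument: the beam cross section grows with defocus as A(z) = A₀[1 + (z/zR)²], so the per-plane one-photon signal, proportional to I·A = P, is the same in every plane, while the two-photon signal, proportional to I²·A = P²/A(z), collapses away from focus. A sketch with illustrative numbers (the Rayleigh range and unit waist area are assumptions, not system parameters):

```python
# Sketch: per-plane excitation versus defocus z for a Gaussian focus.
# One-photon: signal ~ I*A = P, identical in every plane (no sectioning).
# Two-photon: signal ~ I^2*A = P^2/A(z), confined to the focal region.
# The Rayleigh range (0.5 um) and unit waist area are illustrative.
def beam_area(z_um, waist_area=1.0, z_rayleigh_um=0.5):
    return waist_area * (1.0 + (z_um / z_rayleigh_um) ** 2)

def two_photon_plane_signal(z_um, power=1.0):
    area = beam_area(z_um)
    intensity = power / area
    return intensity ** 2 * area  # falls off as 1/A(z)

for z in (0.0, 0.5, 2.0):  # um of defocus
    print(z, round(two_photon_plane_signal(z), 3))
```

The 1/A(z) falloff is why out-of-focus planes contribute negligible fluorescence (and negligible photodamage) under TPE.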
Although the images shown here portray the cerebral vasculature of NIH Swiss mice, these methods are applicable to any preparation that involves fluorescence imaging in the mouse brain, such as intracellularly injected fluorescence or genetically encoded fluorescence (e.g., green fluorescent protein). If the mean power in 100-MHz femtosecond-laser TPE microscopes with a high numerical aperture is increased to light intensities of the order of magnitude of TW/cm², the instrument can switch from an imaging modality to active processes useful for material processing or localized photochemistry, as previously shown in Section VI (Diaspro, 1999c; Diaspro et al., 2001). The Tetsuro Takamatsu and Satoshi Kawata groups recently communicated the achievement of TPE-induced waves of calcium ion concentration in live biological cells (Smith et al., 2001). Calcium waves were precisely induced by femtosecond pulsed-laser illumination, exposing living HeLa cells to focused 140-fs pulses of 780 nm wavelength at 30 mW average power. The waves were imaged by fluorescence and were observed to propagate from
Figure 62. Cerebral vascular angiogram visualized through a thinned skull using 800 nm excitation and 90 mW average power. The focal plane is located 150 μm beneath the base of the skull. (Courtesy of Elizabeth Yoder. Reprinted with permission from Microscopy Research and Technique, 56, 305, 2002.)
Figure 63. Red blood cell motion within the capillary segment indicated in Figure 62. In this x–temporal view the unlabeled cells appear as dark bands against the fluorescent blood serum. A 40× water immersion objective was used. (Courtesy of Elizabeth Yoder. Reprinted with permission from Microscopy Research and Technique, 56, 305, 2002.)
Figure 64. Adapted view of the microbull (about the size of a red blood cell), the smallest bull in the world, realized by Satoshi Kawata's group. It demonstrates the power of two-photon photopolymerization, exploiting the three-dimensional capability and high spatial resolution of TPE microscopy (Kawata et al., 2001). (Image adapted from the web.)
Figure 65. Three-dimensional montage of drilled holes and cut structures in human chromosomes with a precision below the diffraction limit. Nanoprocessing has been performed using an 80-MHz ultrafast NIR laser source at 30–50 mW average power at the focal spot. (Courtesy of Karsten Konig; adapted from Konig and Tirlapur, 2002.)
the laser focal point inside the cell. In Kawata's group a two-photon polymerization technique was developed in 1997 that recently led to the realization of the smallest bull in the world (Kawata et al., 2001). Here two-photon absorption of light was used to cause a polymer to solidify, allowing the creation of a microbull in a block of commercially available resin. Using two-photon photopolymerization, Kawata's team was able to overcome the diffraction limit and create structures with a spatial resolution of about 120 nm, even though the laser used had a wavelength more than six times longer, by exploiting the nonlinear relationship between the
Figure 66. Chromosome dissection within living PTK cells with a precision of 110 nm using the femtosecond NIR laser of a TPE microscope without loss of viability. The cells completed cell division after laser surgery (König et al., 1999b, 2000). (Courtesy of Karsten König; adapted from Konig, 2000.)
polymerization reaction and the light intensity. Figure 64 shows a microbull about the size of a red blood cell. The exposure source employed was a 780-nm mode-locked Ti-sapphire laser, capable of producing laser pulses of 150 fs at a repetition rate of 76 MHz, which was focused into a sample of SCR 500 resin by a high-NA (1.4) oil immersion objective lens (Tanaka et al., 2002). The laser spot was scanned in the focal plane by a two-galvanomirror set, and along the optical axis by a piezo stage, both controlled by a computer. The microbull, as well as the smallest ever functional micromechanical system (a spring with a diameter of only 300 nm), illustrates the potential of a new microfabrication technique that could be used to make optoelectronic devices, micromachines, and drug-delivery systems. As an extension of this nanofabrication ability, by finely tuning the laser power within a TPE microscopy architecture it was possible to realize a noncontact nanoscalpel for surgery inside the living cell, cell nucleus, or organelle without affecting other cellular compartments. Karsten Konig and his group were able to cut chromosomes within a living cell (Konig et al., 2000). Figure 65 shows three-dimensional views of human chromosome
Figure 67. Control measurements were performed to verify that the bright-spot signals consisted of second-harmonic generation. As a first check, also used for two-photon excited autofluorescence, the laser was taken out of mode locking and the signals vanished, indicating that the signals originated from nonlinear processes; this was also verified by a quadratic dependence on the laser power. Moreover, the laser was tuned between 750 and 830 nm keeping the 405-nm emission filter fixed: no signal was detected at 405 nm except within a range of approximately 5 nm around 810 nm. Finally, the potential SHG image appeared bleach resistant. Diaspro's group is indebted to Colin Sheppard, Tony Wilson, and Guy Cox for critical and useful discussions about the still unclear origin of such a signal on the backscattering pathway. (Sample prepared by Paola Ramoino, DIPTERIS, University of Genoa; image acquired at LAMBS; Diaspro et al., 2002d.) (See Color Insert.)
nanoscalpeling. Figure 66 demonstrates that the cells remained alive and completed cell division after TPE-based nanosurgery. From the examples reported above, it should be clear that a promising direction for TPE applications lies not only in clinical diagnosis, for which optical biopsy can be considered a new paradigm, but also in clinical treatment based on photodynamic therapy and nanosurgery. To conclude this necessarily nonexhaustive section, let us turn to two more technical topics that can greatly increase the large potential of TPE applications, namely, second-harmonic generation imaging and single-molecule detection imaging. Second-harmonic generation (SHG), as primed by TPE nonlinear light–matter interaction (Sheppard and Kompfner, 1978), has only recently been used for biological imaging applications (Campagnola et al., 1999; Moreaux et al., 2000; Zoumi et al., 2002; Diaspro et al., 2002d). A powerful advance is
Figure 68. Intensity trends for single fluorescent molecules and molecular aggregates: comparison between the intensity decay of a single molecule and that of an aggregate of molecules. (Courtesy of Fabio Cannone, LAMBS and INFM Milano Bicocca.)
Figure 69. Distribution of the intensity of the spots in an image of a glass slide prepared by spin coating a C = 310 nM rhodamine 6G solution. The image was acquired with a residence time of 3 μs, a 35 × 35 μm² field of view, and an average excitation power of 7 mW. Top inset: Fluorescence of the peaks in the distribution in order of increasing intensity. (Adapted from Chirico et al., 2001.)
obtained by coupling TPE and SHG imaging on the same detection optical path, which involves different contrast mechanisms that can be used to obtain complementary information regarding biological system structure and functioning. TPE fluorescence is generally measured in an epi-illumination geometry, but the forward-propagating nature of SHG seemed to restrict
Figure 70. Average fluorescence of the dimmest spots measured on slides spin coated with rhodamine 6G (red scatter), fluorescein (green scatter), and pyrene (navy scatter). The solid lines are square-law best-fit curves showing clear evidence for the prevailing second-order process. (Courtesy of Fabio Cannone, LAMBS and INFM Milano Bicocca; data acquired at LAMBS.)
SHG microscopy to a transmission mode of detection. This hampered several potential experiments, especially in thick samples or in optical configurations where it is not possible to place forward detectors. Recently, reflected SHG signals were collected by Bruce Tromberg's and Alberto Diaspro's groups (Zoumi et al., 2002; Diaspro et al., 2002d), opening new application perspectives. Figure 67 shows autofluorescence and SHG signals from Paramecium primaurelia, a unicellular organism. The bright spots are forming vesicles and vacuoles, according to the cellular morphology and their relative positions. In this case, the autofluorescence signal was used as a cellular landmark. Background autofluorescence and bright spots allow us to image details from the samples without the need for staining (Diaspro et al., 2002d).
DIASPRO AND CHIRICO
The study of single molecules by spectroscopic techniques has recently become of major interest, and fluorescence has been used, among other techniques, to identify and characterize the properties of single-molecular entities. Xie and Lu, and Petra Schwille, have written two excellent reviews on the subject, covering outstanding developments of fluorescence correlation spectroscopy, first introduced by Magde et al. (1972), which for evident reasons deserve mention here (Xie and Lu, 1999; Schwille, 2001). However, we focus on single-molecule imaging (far-field) using simple two-photon optical configurations (Diaspro et al., 2001; Sonneleitner et al., 1999). Following the pioneering work by Sanchez et al. (1997) on two-photon imaging of single glass-immobilized rhodamine B molecules, spatially resolved applications of ultrasensitive TPE fluorescence have shown promising results (Sonneleitner et al., 2000; Chirico et al., 2001). Two basic issues in these studies are to diminish the background signal, whether residual scattering or fluorescence, and to discriminate between the signals arising from single molecules and those corresponding to small molecular aggregates. Apart from an elaborate and elegant method based on the observation of the anticorrelation effect due to the saturation of the ground level of a single molecule, the chaotic time behavior of the fluorescence signal on the millisecond range is most often taken as a fingerprint of the ‘‘single-molecule’’ spot. However, these observations can be performed only by following the time evolution of the fluorescence emission of the molecular aggregates, which may be degraded by the prolonged exposure to the exciting radiation. Moreover, they are performed mainly with sensitive and costly avalanche photodiodes in a single-photon counting regime.
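One simple way to discriminate single molecules from small aggregates on a single snapshot is intensity quantization: checking whether each spot's brightness is close to an integer multiple of the dimmest-spot level. The following is a minimal sketch of that idea only; the function name, tolerance parameter, and intensity values are hypothetical illustrations, not the authors' analysis code.

```python
# Sketch: assign a molecule count to each spot under the assumption
# (described in the text) that an aggregate of N molecules is roughly
# N times as bright as the dimmest ("single-molecule") spot.

def classify_spots(intensities, tolerance=0.25):
    """Return the estimated number of molecules per spot.

    intensities: background-subtracted spot intensities (arbitrary units).
    tolerance: maximum fractional deviation from an integer multiple of
               the basic level before a spot is flagged as ambiguous.
    """
    basic = min(intensities)  # dimmest spot sets the single-molecule level
    counts = []
    for i in intensities:
        n = round(i / basic)
        # keep the assignment only if the spot sits close to n * basic
        if n >= 1 and abs(i - n * basic) <= tolerance * basic:
            counts.append(n)
        else:
            counts.append(None)  # ambiguous spot
    return counts

# Hypothetical spot intensities (arbitrary units):
spots = [102.0, 99.5, 205.0, 301.0, 98.0, 410.0]
print(classify_spots(spots))  # -> [1, 1, 2, 3, 1, 4]
```

In practice the basic level would be taken from the first peak of the measured intensity distribution (as in Figure 69) rather than from the single dimmest spot, which is noise-sensitive.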
Recently we imaged the fluorescence signal of different fluorophores spread on glass substrates by means of the scanning head adapted to two-photon excitation (see Section VI), in the range of about 650–900 kW/cm² of excitation intensity (Chirico et al., 2001). It was possible to show that in this range of excitation intensity single molecules can be imaged even with analog detection and, more interestingly, that the distributions of the pixel content of the images show discrete peaks at specific levels that are found to be multiples of a reference basic fluorescence level, the latter corresponding to the dimmest spot revealed on the substrates. The main difference with respect to other single-molecule detection schemes was the employment of a simple analog detection scheme and the use of a commercial scanning head to quantitatively discriminate between single entities and aggregates on single snapshots of the spin-coated glasses. Figure 68 sketches single and multiple fluorescent molecule behavior under a TPE regime. Figure 69 shows the number density of the spots per mm² versus the concentration of the rhodamine 6G fluorescent molecule. Images were taken at microsecond residence time per pixel. The spin coating on the glass slide
of the fluorescent molecules was made from a solution of rhodamine 6G at C = 312 nM, with an excitation power of 7 mW at the entrance of the scanning head (Chirico et al., 2001). As a further control and single-molecule-level characterization step, Figure 70 demonstrates the expected quadratic dependence of the fluorescence of single-molecule image spots on excitation power under the TPE regime. This was a first step in the study of the behavior of single molecules; a further step concerns photothermal effects and blinking (Chirico et al., 2002).

VIII. Conclusions

The rapid spread of two-photon excitation microscopy since Denk's report at the beginning of the 1990s has brought dramatic changes in the design of experiments that utilize fluorescent molecules, and more specifically in fluorescence optical microscopy. We are both spectators and actors of an unprecedented revolution that is leading us to exciting new discoveries while prompting us to look back on decades of fluorescence microscopy. Not only are incredible new experiments being designed and performed, but past results are also being read critically by comparing one- and two-photon experiments. TPE offers real progress in science with its intrinsic three-dimensional resolution, the absence of background fluorescence, and the attractive possibility of exciting UV-excitable fluorescent molecules while increasing sample penetration. In fact, in a TPE scheme two 720-nm photons combine to produce the same fluorescence conventionally primed at, say, 360 nm. The excitation of the fluorescent molecules bound to the biological system being studied mainly takes place (80%) in an excitation volume of the order of 1 fl or smaller. This implies an intrinsic 3D optical sectioning effect.
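The two relations invoked here, the halved effective excitation wavelength and the square-law power dependence seen in Figure 70, can be illustrated with a short numerical sketch. The data below are synthetic, generated to follow F ∝ P² with a little noise; they are not the measured values from the figure.

```python
# Sketch: recover the order of the excitation process from a log-log fit.
# For a power law F = a * P**n, log F = log a + n * log P, so the fitted
# slope estimates n (~2 for a two-photon process, as in Figure 70).
import numpy as np

# Two photons at 720 nm deliver the energy of one 360 nm photon:
lambda_eff = 720.0 / 2  # nm

# Synthetic fluorescence readings F = a * P^2 with 2% multiplicative noise.
rng = np.random.default_rng(0)
P = np.linspace(1.0, 7.0, 15)  # excitation power (mW)
F = 3.0 * P**2 * (1 + 0.02 * rng.standard_normal(P.size))

n, log_a = np.polyfit(np.log(P), np.log(F), 1)  # slope first, then intercept
print(f"effective 1PE wavelength: {lambda_eff:.0f} nm, fitted order n = {n:.2f}")
```

The fitted slope comes out close to 2, which is the signature of the prevailing second-order process; a one-photon signal would instead give a slope near 1.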
What is invaluable for cell imaging, and in particular for live-cell imaging, is the fact that weak endogenous one-photon absorption and the highly localized spatial confinement of the TPE process dramatically reduce phototoxic stress. To the best of our knowledge, the situation compares favorably with the damage induced by conventional fluorescence excitation. Notwithstanding this, some care must be taken, and some experimental parameters need to be carefully controlled, such as average power, acquisition dwell time, zooming factor, and beam pulse width. The following points summarize the unique characteristics and distinct advantages of TPE: 1. Spatially confined fluorescence excitation in the focal plane of the specimen is the hallmark of TPE microscopy. It is one of the advantages over confocal microscopy, where fluorescence emission occurs across
the entire thickness of the sample excited by the scanning laser beam. A strong implication is that there is no photon signal from sources outside the geometric position of the optical focus within the sample. Therefore, the signal-to-noise ratio increases, photodegradation effects decrease, and optical sectioning is immediately available without the need for a pinhole or deconvolution algorithms. Besides, efficient acquisition schemes can be implemented, such as the nondescanned one realized by placing the detector near the specimen, outside the conventional confocal fluorescence pathway. 2. The use of near-IR/IR wavelengths permits examination of thick specimens in depth. This is due to the fact that, apart from special cases such as pigmented samples and the absorption spectral window of water, cells and tissues absorb poorly in the near-IR/IR region. Cellular damage is thus minimized, prolonging cell viability during image acquisition. Moreover, scattering is reduced and deeper targets can be reached without incurring the drawbacks of one-photon excitation, i.e., the need for more excitation intensity at the expense of photodamage and signal-to-noise ratio. The depth of penetration can be up to 0.5 mm. Whereas in one-photon excitation the emission wavelength is comparatively close to the excitation wavelength (about 50–200 nm longer), in TPE the fluorescence emission occurs at a substantially shorter wavelength, at a larger spectral distance from the excitation than in the one-photon case. Now, despite these advantages, some practical limitations and open questions remain to be examined closely. A severe limitation is the high cost of laser sources and their maintenance, primarily because of the limited and unpredictable lifetime of laser pump diodes. As other researchers have pointed out, once the technology becomes less expensive and simpler, every confocal microscope will also be a two- or multiphoton microscope.
Other matters under study involve local heating from absorption of IR light by water at high laser power (Schonle and Hell, 1998) and photothermal effects on fluorescent molecules (Chirico et al., 2002); phototoxicity from long-wavelength IR excitation and short-wavelength fluorescence emission (Tyrrel and Keyse, 1990; Konig, 2000; Hopt and Neher, 2001; Konig and Tirlapur, 2002); and the development of new fluorochromes better suited for TPE and multiphoton excitation (Albota et al., 1998a). In agreement with Gratton et al. (2001), it is our opinion that one of the major benefits of setting up a TPE microscope is the flexibility in choosing the measurement modality, favored by the simplification of the optical
design. In fact, a TPE microscope offers a number and variety of measurement options without changing any optics or hardware. This means that during the same experiment one can obtain real multimodal information from the specimen being studied. The recent work done by Bruce Tromberg's group is a clear demonstration of this and a brilliant and outstanding application of TPE (Zoumi et al., 2002). We think that this is a unique feature of the TPE microscope. In fact, the usefulness of the TPE scheme is already well documented for spectroscopic and lifetime studies (So et al., 1996; Sytsma et al., 1998; Schwille et al., 2000; Diaspro et al., 2001; Wiseman et al., 2002), for optical data storage and microfabrication (Cumpston et al., 1999; Kawata et al., 2001), and for single-molecule detection (Mertz et al., 1995; Farrer et al., 1999; So et al., 2000; Chirico et al., 2001). Moreover, very interesting applications involve the study of impurities affecting the growth of protein crystals (Caylor et al., 1999), TPE imaging in the field of plant biology (Tirlapur and Konig, 2002), and measurements in living systems (Squirrel et al., 1999; Yoder and Kleinfeld, 2002; Diaspro et al., 2002d). This growing area of microscopy also extends to the application of TPE as an active biomedical device for nanosurgery (Konig, 2000) and photodynamic therapy (Bhawalkar et al., 1997; So et al., 2000). Recently, TPE microscopy, even in an evanescent-field-induced configuration, has been extended to large-area structures of the order of square centimeters (Duveneck et al., 2001). This can open the way to further improving the sensitivity of biosensing platforms such as genomic and proteomic microarrays based upon large planar waveguides. Besides, we deem that important and dramatic future developments will occur in areas such as neurobiology, physiology, embryology, and tissue engineering.
It is an easy prediction that the range of applicability of TPE and multiphoton laser scanning microscopes will branch intensively into the biomedical, biotechnological, and biophysical sciences, as well as toward clinical applications. It is appropriate to end with this citation: ‘‘There are more things in Heaven and Earth, Horatio, Than are dreamt of in our philosophy’’ (‘‘Hamlet,’’ by William Shakespeare, c. 1600).
Acknowledgments

The authors are indebted to their co-workers at LAMBS (Laboratory for Advanced Microscopy, Bioimaging, and Spectroscopy), namely (in random order) Andrea Gerbi, Fabio Mazzone, Francesco Difato, Silvia Scaglione, Federico Federici, Fabio Cannone, Sabrina Beretta, Giancarlo Baldini,
Marco Scotto, Cesare Usai, Paola Ramoino, and Alessandro Esposito. Moreover, we are grateful to Salvatore Cannistraro, Alessandra Gliozzi, and Enrico Gratton for believing in the TPE project. A.D. is indebted to Peter Hawkes for infinite patience, and to his wife Teresa for lost sunny weekends and for help during hard days; without her this chapter could not have been written. A.D. dedicates this chapter to the memory of Mario Arace, who purchased his first oscilloscope, still in use for TPE (see figures), and to Ivan Krekule, more than a father. This research was performed under the auspices of, and with grants from, INFM, the National Institute for the Physics of Matter, Italy.
References

Abbe, E. (1910). Edited by O. Lummer and F. Reiche. Braunschweig.
Agard, D. A. (1984). Optical sectioning microscopy: Cellular architecture in three dimensions. Annu. Rev. Biophys. 13, 191–219.
Agard, D. A., Hiraoka, Y., Shaw, P. J., and Sedat, J. W. (1989). Fluorescence microscopy in three dimensions. Methods Cell Biol. 30, 353–378.
Albota, M. et al. (1998a). Design of organic molecules with large two-photon absorption cross sections. Science 281, 1653–1656.
Albota, M. A., Xu, C., and Webb, W. W. (1998b). Two-photon fluorescence excitation cross sections of biomolecular probes from 690 to 960 nm. Appl. Opt. 37, 7352–7356.
Amos, B. (2000). Lessons from the history of light microscopy. Nat. Cell Biol. 2, E151–E152.
Andrews, D. L. (1985). A simple statistical treatment of multiphoton absorption. Am. J. Phys. 53, 1001–1002.
Axe, J. D. (1964). Two-photon processes in complex atoms. Phys. Rev. 136, 42–45.
Beltrame, F., Bianco, B., Castellaro, G., and Diaspro, A. (1985). Fluorescence, absorption, phase-contrast, holographic and acoustical cytometries of living cells, in Interactions between Electromagnetic Fields and Cells, edited by A. Chiabrera and H. P. Schwan. NATO ASI Series, Vol. 97. New York: Plenum Press, pp. 483–498.
Benedetti, P. (1998). From the histophotometer to the confocal microscope: The evolution of analytical microscopy. Eur. J. Histochem. 42, 11–17.
Benham, G. S., and Schwartz, S. (2002). Suitable microscope objectives for multiphoton digital imaging, in Multiphoton Microscopy in the Biomedical Sciences II, edited by A. Periasamy and P. T. C. So. Proc. SPIE 4620, pp. 36–47.
Berland, K. (2001). Basics of fluorescence, in Methods in Cellular Imaging, edited by A. Periasamy. New York: Oxford University Press, pp. 5–19.
Berland, K. M., So, P. T. C., and Gratton, E. (1995). Two-photon fluorescence correlation spectroscopy: Method and application to the intracellular environment. Biophys. J. 68, 694–701.
Berns, M. W. (1976). A possible two-photon effect in vitro using a focused laser beam. Biophys. J. 16, 973–977.
Bertero, M., and Boccacci, P. (1998). Introduction to Inverse Problems in Imaging. Bristol and Philadelphia: IOP Publishing.
Bhawalkar, J. D., Kumar, N. D., Zhao, C. F., and Prasad, P. N. (1997). Two-photon photodynamic therapy. J. Clin. Laser Med. Surg. 15, 201–204.
Bianco, B., and Diaspro, A. (1989). Analysis of three-dimensional cell imaging obtained with optical microscopy techniques based on defocusing. Cell Biophys. 15(3), 189–200.
Birge, R. R. (1979). A theoretical analysis of the two-photon properties of linear polyenes and the visual chromophores. J. Chem. Phys. 70, 165–169.
Birge, R. R. (1986). Two-photon spectroscopy of protein-bound fluorophores. Accounts Chem. Res. 19, 138–146.
Birks, J. B. (1970). Photophysics of Aromatic Molecules. London: Wiley Interscience.
Boccacci, P., and Bertero, M. (2002). Image restoration methods: Basics and algorithms, in Confocal and Two-Photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 253–270.
Born, M., and Wolf, E. (1980). Principles of Optics, 6th ed. Cambridge, UK: Cambridge University Press.
Brakenhoff, G. J., Blom, P., and Barends, P. (1979). Confocal scanning light microscopy with high aperture immersion lenses. J. Microsc. 117, 219–232.
Brakenhoff, G. J., van Spronsen, E. A., van der Voort, H. T., and Nanninga, N. (1989). Three-dimensional confocal fluorescence microscopy. Methods Cell Biol. 30, 379–398.
Brakenhoff, G. J., Muller, M., and Ghauharali, R. I. (1996). Analysis of efficiency of two-photon versus single-photon absorption for fluorescence generation in biological objects. J. Microsc. 183, 140–144.
Buehler, C., Kim, K. H., Dong, C. Y., Masters, B. R., and So, P. T. C. (1999). Innovations in two-photon deep tissue microscopy. IEEE Eng. Med. Biol. 18, 23–30.
Callis, P. R. (1997). Two-photon-induced fluorescence. Annu. Rev. Phys. Chem. 48, 271–297.
Campagnola, P., Mei-de, Wei, Lewis, A., and Loew, L. (1999). High-resolution nonlinear optical imaging of live cells by second harmonic generation. Biophys. J. 77, 3341–3351.
Cannell, M. B., and Soeller, C. (1997). High resolution imaging using confocal and two-photon molecular excitation microscopy. Proc. R. Microsc. Soc. 32, 3–8.
Cannone, F., Chirico, G., Scotto, M., and Diaspro, A. (2003). In preparation.
Cantor, C. R., and Schimmel, P. R. (1980). Biophysical Chemistry. Part II: Techniques for the Study of Biological Structure and Function. New York: Freeman and Co.
Carlsson, K., Danielsson, P. E., Lenz, R., Liljeborg, A., Majlof, L., and Aslund, N. (1985). Three-dimensional microscopy using a confocal laser scanning microscope. Opt. Lett. 10, 53–55.
Carrington, W. (2002). Imaging live cells in 3-D using wide field microscopy with image restoration, in Confocal and Two-Photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 333–346.
Carrington, W. A., Lynch, R. M., Moore, E. D. W., Isenberg, G., Fogarty, K. E., and Fay, F. S. (1995). Superresolution in three-dimensional images of fluorescence in cells with minimal light exposure. Science 268, 1483–1487.
Castleman, K. R. (1996). Digital Image Processing. Englewood Cliffs, NJ: Prentice Hall.
Castleman, K. (2002). Sampling, resolution and digital image processing in spatial and Fourier domain: Basic principles, in Confocal and Two-Photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 237–252.
Caylor, C. L., Dobrianov, I., Kimmer, C., Thorne, R. E., Zipfel, W., and Webb, W. W. (1999). Two-photon fluorescence imaging of impurity distributions in protein crystals. Phys. Rev. E 59, 3831–3834.
Centonze, V. E., and White, J. G. (1998). Multiphoton excitation provides optical sections from deeper within scattering specimens than confocal imaging. Biophys. J. 75, 2015–2024.
Chalfie, M., and Kain, S., Eds. (1998). Green Fluorescent Protein: Properties, Applications and Protocols. New York: Wiley-Liss, Inc.
Chalfie, M., Tu, Y., Euskirchen, G., Ward, W. W., and Prasher, D. C. (1994). Green fluorescent protein as a marker for gene expression. Science 263, 802–805.
Chance, B. (1989). Cell Structure and Function by Microspectrofluorometry. New York: Academic Press.
Cheng, P. C., Ed. (1994). Computer Assisted Multidimensional Microscopies. New York: Springer-Verlag.
Chirico, G., Cannone, F., Beretta, S., Baldini, G., and Diaspro, A. (2001). Single molecule studies by means of the two-photon fluorescence distribution. Microsc. Res. Tech. 55, 359–364.
Chirico, G., Cannone, F., Baldini, G., and Diaspro, A. (2002). Two-photon thermal bleaching of single fluorescent molecules. Biophys. J. (in press).
Christie, R. H., Backsai, B. J., Zipfel, W. R., et al. (2001). Growth arrest of individual senile plaques in a model of Alzheimer's disease observed by in vivo multiphoton microscopy. J. Neurosci. 21(3), 858–864.
Cox, I. J. (1984). Scanning optical fluorescence microscopy. J. Microsc. 133, 149–153.
Cox, I. J., and Sheppard, C. J. R. (1983). Digital image processing of confocal images. Image Vision Comput. 1, 52–56.
Cumpston, B. H. et al. (1999). Two-photon polymerization initiators for three-dimensional optical storage and microfabrication. Nature 398, 51–54.
Daria, V., Blanca, C. M., Nakamura, O., Kawata, S., and Saloma, C. (1998). Image contrast enhancement for two-photon fluorescence microscopy in a turbid medium. Appl. Opt. 37, 7960–7967.
de Grauw, K., and Gerritsen, H. (2002). Aberrations and penetration depth in confocal and two-photon microscopy, in Confocal and Two-Photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 153–170.
Denk, W. (1996). Two-photon excitation in functional biological imaging. J. Biomed. Opt. 1, 296–304.
Denk, W., and Svoboda, K. (1997). Photon upmanship: Why multiphoton imaging is more than a gimmick. Neuron 18, 351–357.
Denk, W., Strickler, J. H., and Webb, W. W. (1990). Two-photon laser scanning fluorescence microscopy. Science 248, 73–76.
Denk, W., Delaney, K. R., Gelperin, A., Kleinfeld, D., Strowbridge, B. W., Tank, D. W., and Yuste, R. (1994). Anatomical and functional imaging of neurons using 2-photon laser scanning microscopy. J. Neurosci. Methods 54, 151–162.
Denk, W., Piston, D., and Webb, W. W. (1995). Two-photon molecular excitation in laser scanning microscopy, in Handbook of Confocal Microscopy, edited by J. B. Pawley. New York: Plenum Press, pp. 445–457.
Diaspro, A. (1998). Two-photon fluorescence excitation. A new potential perspective in flow cytometry. Minerva Biotechnol. 11(2), 87–92.
Diaspro, A. (1999a). (Guest editor) Two-photon microscopy. Microsc. Res. Tech. 47, 163–212.
Diaspro, A. (1999b). (Guest editor) Two-photon excitation microscopy. IEEE Eng. Med. Biol. 18(5), 16–99.
Diaspro, A. (1999). Two-photon excitation of fluorescence in three-dimensional microscopy. Eur. J. Histochem. 43, 169–178.
Diaspro, A. (2001). Building a two-photon microscope using a laser scanning confocal architecture, in Methods in Cellular Imaging, edited by A. Periasamy. New York: Oxford University Press, pp. 162–179.
Diaspro, A., Ed. (2002). Confocal and Two-Photon Microscopy: Foundations, Applications, and Advances. New York: Wiley-Liss, Inc.
Diaspro, A., and Robello, M. (2000). Two-photon excitation of fluorescence for three-dimensional optical imaging of biological structures. J. Photochem. Photobiol. B 55, 1–8.
Diaspro, A., and Sheppard, C. J. R. (2002). Two-photon excitation microscopy: Basic principles and architectures, in Confocal and Two-Photon Microscopy: Foundations, Applications, and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 34–74.
Diaspro, A., Sartore, M., and Nicolini, C. (1990). Three-dimensional representation of biostructures imaged with an optical microscope: I. Digital optical sectioning. Image Vision Comput. 8, 130–141.
Diaspro, A., Beltrame, F., Fato, M., Palmeri, A., and Ramoino, P. (1997). Studies on the structure of sperm heads of Eledone cirrhosa by means of CLSM linked to bioimage-oriented devices. Microsc. Res. Tech. 36, 159–164.
Diaspro, A., Annunziata, S., Raimondo, M., and Robello, M. (1999a). Three-dimensional optical behaviour of a confocal microscope with single illumination and detection pinhole through imaging of subresolution beads. Microsc. Res. Tech. 45(2), 130–131.
Diaspro, A., Corosu, M., Ramoino, P., and Robello, M. (1999b). Adapting a compact confocal microscope system to a two-photon excitation fluorescence imaging architecture. Microsc. Res. Tech. 47, 196–205.
Diaspro, A., Annunziata, S., and Robello, M. (2000). Single-pinhole confocal imaging of subresolution sparse objects using experimental point spread function and image restoration. Microsc. Res. Tech. 51, 464–468.
Diaspro, A., Chirico, G., Cannone, F., et al. (2001). Two-photon microscopy and spectroscopy based on a compact confocal scanning head. J. Biomed. Opt. 6, 300–310.
Diaspro, A., Federici, F., and Robello, M. (2002a). Influence of refractive-index mismatch in high-resolution three-dimensional confocal microscopy. Appl. Opt. 41, 685–690.
Diaspro, A., Silvano, D., Krol, S., Cavalleri, O., and Gliozzi, A. (2002b). Single living cell encapsulation in nano-organized polyelectrolyte shells. Langmuir 18, 5047–5050.
Diaspro, A., Boccacci, P., Bonetto, P., Scarito, M., Davolio, M., and Epifani, M. (2002c). ‘‘Power-up your Microscope,’’ www.powermicroscope.com.
Diaspro, A., Fronte, P., Raimondo, M., Fato, M., De Leo, G., Beltrame, F., Cannone, F., Chirico, G., and Ramoino, P. (2002d). Functional imaging of living paramecium by means of confocal and two-photon excitation fluorescence microscopy, in Functional Imaging, edited by D. Farkas. Proc. SPIE 4622, pp. 47–53.
Dong, C. Y., Yu, B., Hsu, L. L., and So, P. T. C. (2002). Characterization of two-photon point spread function in skin imaging applications, in Multiphoton Microscopy in the Biomedical Sciences II, edited by A. Periasamy and P. T. C. So. Proc. SPIE 4620, pp. 1–8.
Duveneck, G. L., Bopp, M. A., Ehrat, M., Haiml, M., Keller, U., Bader, M. A., Marowsky, G., and Soria, S. (2001). Evanescent-field-induced two-photon fluorescence: Excitation of macroscopic areas of planar waveguides. Appl. Phys. B 73, 869–871.
Faisal, F. H. M. (1987). Theory of Multiphoton Processes. New York: Plenum Press.
Farrer, R. A., Previte, M. J. R., Olson, C. E., Peyser, L. A., Fourkas, J. T., and So, P. T. C. (1999). Single molecule detection with a two-photon fluorescence microscope with fast scanning capabilities and polarization sensitivity. Opt. Lett. 24, 1832–1834.
Fay, F. S., Carrington, W., and Fogarty, K. E. (1989). Three-dimensional molecular distribution in single cells analyzed using the digital imaging microscope. J. Microsc. 153, 133–149.
Feynman, R. P. (1985). QED: The Strange Theory of Light and Matter. Princeton, NJ: Princeton University Press.
Fisher, W. G., Watcher, E. A., Armas, M., and Seaton, C. (1997). Titanium:sapphire laser as an excitation source in two-photon spectroscopy. Appl. Spectrosc. 51, 218–226.
Ford, B. J. (1991). The Leeuwenhoek Legacy. Bristol and London: Biopress and Farrand.
Franken, P. A., Hill, A. E., Peters, C. W., and Weinreich, G. (1961). Generation of optical harmonics. Phys. Rev. Lett. 7, 118–119.
French, T., So, P. T. C., Weaver, D. J., Coelho-Sampaio, T., and Gratton, E. (1997). Two-photon fluorescence lifetime imaging microscopy of macrophage-mediated antigen processing. J. Microsc. 185, 339–353.
Friedrich, D. M. (1982). Two-photon molecular spectroscopy. J. Chem. Educ. 59, 472–483.
Friedrich, D. M., and McClain, W. M. (1980). Two-photon molecular electronic spectroscopy. Annu. Rev. Phys. Chem. 31, 559–577.
Fujita, K., and Takamatsu, T. (2001). Real-time in situ calcium imaging with single and two-photon confocal microscopy, in Confocal and Two-Photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 483–498.
Gannaway, J. N., and Sheppard, C. J. R. (1978). Second harmonic imaging in the scanning optical microscope. Opt. Quant. Electron. 10, 435–439.
Gauderon, R., Lukins, R. B., and Sheppard, C. J. R. (1999). Effects of a confocal pinhole in two-photon microscopy. Microsc. Res. Tech. 47, 210–214.
Girkin, J., and Wokosin, D. (2002). Practical multiphoton microscopy, in Confocal and Two-Photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 207–236.
Göppert-Mayer, M. (1931). Über Elementarakte mit zwei Quantensprüngen. Ann. Phys. 9, 273–295.
Gosnell, T. R., and Taylor, A. J., Eds. (1991). Selected Papers on Ultrafast Laser Technology. SPIE Milestone Series. Bellingham, WA: SPIE Press.
Gratton, E., and van de Ven, M. J. (1995). Laser sources for confocal microscopy, in Handbook of Confocal Microscopy, edited by J. B. Pawley. New York: Plenum Press, pp. 69–97.
Gratton, E., Barry, N. P., Beretta, S., and Celli, A. (2001). Multiphoton fluorescence microscopy. Methods 25, 103–110.
Gu, M., and Sheppard, C. J. R. (1995). Comparison of three-dimensional imaging properties between two-photon and single-photon fluorescence microscopy. J. Microsc. 177, 128–137.
Gu, M., Gan, X., Kisteman, A., and Xu, M. G. (2000). Comparison of penetration depth between two-photon excitation and single-photon excitation in imaging through turbid tissue media. Appl. Phys. Lett. 77(10), 1551–1553.
Guild, J. B., Xu, C., and Webb, W. W. (1997). Measurement of group delay dispersion of high numerical aperture objective lenses using two-photon excited fluorescence. Appl. Opt. 36, 397–401.
Hamamatsu Photonics, K. K. (1999). Photomultiplier Tubes: Basics and Applications, 2nd ed. Japan: Hamamatsu Photonics K. K.
Hanninen, P. E., and Hell, S. W. (1994). Femtosecond pulse broadening in the focal region of a two-photon fluorescence microscope. Bioimaging 2, 117–121.
Harper, I. S. (2001). Fluorophores and their labeling procedures for monitoring various biological signals, in Methods in Cellular Imaging, edited by A. Periasamy. New York: Oxford University Press, pp. 20–39.
Haughland, P. R., Ed. (2002). Handbook of Fluorescent Probes and Research Chemicals. Eugene, OR: Molecular Probes.
Hell, S. W. (Guest editor) (1996). Nonlinear optical microscopy. Bioimaging 4, 121–172.
Hell, S. W., Bahlmann, K., Schrader, M., Soini, A., Malak, H., Gryczynski, I., and Lakowicz, J. R. (1996). Three-photon excitation in fluorescence microscopy. J. Biomed. Opt. 1, 71–74.
Hellwarth, R., and Christensen, P. (1974). Nonlinear optical microscopic examination of structures in polycrystalline ZnSe. Opt. Commun. 12, 318–322.
Herman, B., and Tanke, H. J. (1998). Fluorescence Microscopy. New York: Springer-Verlag.
Hooke, R. (1961). Micrographia (facsimile). New York: Dover.
Hopt, A., and Neher, E. (2001). Highly nonlinear photodamage in two-photon fluorescence microscopy. Biophys. J. 80, 2029–2036.
Iyer, V., Hoogland, T. M., Losavio, B. E., McQuiston, A. R., and Saggau, P. (2002). Compact two-photon laser scanning microscope made from minimally modified commercial components, in Multiphoton Microscopy in the Biomedical Sciences II, edited by A. Periasamy and P. T. C. So. Proc. SPIE, pp. 274–280.
Jonkman, J., and Stelzer, E. (2002). Resolution and contrast in confocal and two-photon microscopy, in Confocal and Two-Photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 101–126.
Kaiser, W., and Garrett, C. G. B. (1961). Two-photon excitation in CaF2:Eu2+. Phys. Rev. Lett. 7, 229–231.
Kawata, S., Sun, H.-B., Tanaka, T., and Takada, K. (2001). Finer features for functional microdevices. Nature 412, 697–698.
Kleinfeld, D., and Denk, W. (2000). Two-photon imaging of neocortical microcirculation, in Imaging Neurons, edited by R. Yuste, F. Lanni, and A. Konnerth. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press, pp. 23.1–23.15.
Koester, H. J., Baur, D., Uhl, R., and Hell, S. W. (1999). Ca2+ fluorescence imaging with pico- and femtosecond two-photon excitation: Signal and photodamage. Biophys. J. 77, 2226–2236.
Konig, K. (2000). Multiphoton microscopy in life sciences. J. Microsc. 200, 83–104.
Konig, K., and Tirlapur, U. K. (2002). Cellular and subcellular perturbations during multiphoton microscopy, in Confocal and Two-Photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 191–206.
Konig, K., Liang, H., Berns, M. W., and Tromberg, B. J. (1995). Cell damage by near-IR microbeams. Nature 377, 20–21.
Konig, K., Krasieva, T., Bauer, E., Fiedler, U., Berns, M. W., Tromberg, B. J., and Greulich, K. O. (1996a). Cell damage by UVA radiation of a mercury microscopy lamp probed by autofluorescence modifications, cloning assay and comet assay. J. Biomed. Opt. 1, 217–222.
Konig, K., Simon, U., and Halbhuber, K. J. (1996b). 3D resolved two-photon fluorescence microscopy of living cells using a modified confocal laser scanning microscope. Cell. Mol. Biol. 42, 1181–1194.
Konig, K., So, P. T. C., Mantulin, W. W., Tromberg, B. J., and Gratton, E. (1996c). Two-photon excited lifetime imaging of autofluorescence in cells during UVA and NIR photostress. J. Microsc. 183, 197–204.
Konig, K., Liang, H., Berns, M. W., and Tromberg, B. J. (1996). Cell damage in near infrared multimode optical traps as a result of multiphoton absorption. Opt. Lett. 21, 1090–1092.
Konig, K., So, P. T. C., Mantulin, W. W., and Gratton, E. (1997). Cellular response to near-infrared femtosecond laser pulses in two-photon microscopes. Opt. Lett. 22, 135–136.
Konig, K., Boehme, S., Leclerc, N., and Ahuja, R. (1998). Time-gated autofluorescence microscopy of motile green microalga in an optical trap. Cell. Mol. Biol. 44, 763–770.
Konig, K., Becker, T. W., Fischer, P., Riemann, I., and Halbhuber, K. J. (1999a). Pulse-length dependence of cellular response to intense near-infrared laser pulses in multiphoton microscopes. Opt. Lett. 24, 113–115.
Konig, K., Riemann, I., Fischer, P., and Halbhuber, K. J. (1999b). Intracellular nanosurgery with near infrared femtosecond laser pulses. Cell. Mol. Biol. 45, 195–201.
Konig, K., Gohlert, A., Liehr, T., Loncarevic, I. F., and Riemann, I. (2000). Two-photon multicolor FISH: A versatile technique to detect specific sequences within single DNA molecules in cells and tissues. Single Mol. 1, 41–51.
Kriete, A. Visualization in Biomedical Microscopies. Weinheim: VCH.
Lakowicz, J. R. (1999). Principles of Fluorescence Spectroscopy. New York: Plenum Press.
282
DIASPRO AND CHIRICO
Lakowicz, J. R., and Gryczynski, I. (1992). Tryptophan fluorescence intensity and anisotropy decays of human serum albumin resulting from one-photon and two-photon excitation. Biophys. Chem. 45, 1–6. Lemons, R. A., and Quate, C. F. (1975). Acoustic microscopy: Biomedical applications. Science. 188, 905–911. Liu, Y., Cheng, D., Sonek, G. J., Berns, M. W., Chapman, C. F., and Tromberg, B. J. (1995). Evidence of focalized cell heating induced by infrared optical tweezers. Biophys. J. 68, 2137–2144. Loudon, R. (1983). The Quantum Theory of Light. London: Oxford University Press. Louisell, W. H. (1973). Quantum Statistical Properties of Radiation. New York: Wiley. Magde, D., Elson, E., and Webb, W. W. (1972). Thermodynamic fluoctuations in a reacting system: Measurement by fluorescence correlation spectroscopy. Phys. Rev. Lett. 29, 705–708. Mainen, Z. F., Malectic-Savic, M., Shi, S. H., Hayashi, Y., Malinow, R., and Svoboda, K. (1999). Two-photon imaging in living brain slices. Methods. 18, 231–239. Maiti, S., Shear, J. B., Williams, R. M., Zipfel, W. R., and Webb, W. W. (1997). Measuring serotonin distribution in live cells with three-photon excitation. Science. 275, 530–532. Majewska, A., Yiu, G., and Yuste, R. (2000). A custom-made two-photon microscope and deconvolution system. Pflugers Arch. 441(2/3), 398–408. Manders, E. M. M., Stap, J., Brakenhoff, G. J., van Diel, R., and Aten, J. A. (1992). Dynamics of three-dimensional replication patterns during the s-phase analyzed by double labelling of DNA and confocal microscopy. J. Cell. Sci. 103, 857–862. Masters, B. R. (1996). Selected Papers on Confocal Microscopy. SPIE Milestone Series. Bellingham, WA: SPIE Press. Masters, B. R. (2002). Selected Papers on Multiphoton Excitation Microscopy. SPIE Milestone Series. Bellingham, WA: SPIE Press. Masters, B. R., and So, P. T. C. (1999). Multiphoton excitation microscopy and confocal microscopy imaging of in vivo human skin: A comparison. Microsc. Microanal. 5, 282–289. 
Masters, B. R., So, P. T. C., and Gratton, E. (1997). Multiphoton excitation fluorescence microscopy and spectroscopy of in vivo human skin. Biophys. J. 72, 2405–2412. Mertz, J., Xu, C., and Webb, W. W. (1995). Single molecule detection by two-photon excited fluorescence. Opt. Lett. 20, 2532–2534. Minsky, M. (1961). Memoir of inventing the confocal scanning microscope. Scanning. 10, 128–138. Moreaux, L., Sandre, O., and Mertz, J. (2000). J. Opt. Soc. Am. B 17, 1685–1694. Moscatelli, F. A. (1986). A simple conceptual model for two-photon absorption. Am. J. Phys. 54, 52–54. Mueller, M., Squier, J., Wilson, K. R., and Brakenhoff, G. J. (1998). 3D microscopy of transparent objects using third-harmonic generation. J. Microsc. 191, 266–274. Murphy, D. B. (2001). Fundamentals of Light Microscopy and Electronic Imaging New York: Wiley-Liss, Inc., pp. 1–367. Nakamura, O. (1993). Three-dimensional imaging characteristics of laser scan fluorescence microscopy: Two-photon excitation vs. single-photon excitation. Optik 93, 39–42. Nakamura, O. (1999). Fundamentals of two-photon microscopy. Microsc. Res. Tech. 47, 165–171. Ott, D. (2002). Two-photon microscopy reveals tumor development. Biophotonics Int. January/ February, 46–48. Patterson, G. H., and Piston, D. W. (2000). Photobleaching in two-photon excitation microscopy. Biophys. J. 78, 2159–2162. Pawley, J. B. Ed. (1995). Handbook of Biological Confocal MicroscopyNew York: Plenum Press. Periasamy, A. Methods in Cellular Imaging. New York: Oxford University Press.
TWO-PHOTON EXCITATION MICROSCOPY
283
Periasamy, A., Skoglund, P., Noakes, C., and Keller, R. (1999). An evaluation of two-photon excitation versus confocal and digital deconvolution fluorescence microscopy imaging in Xenopus morphogenesis. Microsc. Res. Tech. 47, 172–181. Periasamy, A., Noakes, C., Skoglund, P., Keller, R., and Sutherland, A. E. (2002). Two-photon excitation fluorescence microscopy imaging in Xenopus and transgenic mouse embryos, in Confocal and Two-photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 271–284. Pike, R. (2002). Superresolution in fluorescence confocal microscopy and in DVD optical storage, in Confocal and Two-photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 499–524. Piston, D. W. (1999). Imaging living cells and tissues by two-photon excitation microscopy. Trends Cell Biol. 9, 66–69. Piston, D. W., Masters, B. R., and Webb, W. W. (1995). Three-dimensionally resolved NAD(P)H cellular metabolic redox imaging of the in situ cornea with two-photon excitation laser scanning microscopy. J. Microsc. 178, 20–27. Potter, S. M. (1996). Vital imaging: Two-photons are better than one. Curr. Biol. 6, 1596–1598. Potter, S. M., Wang, C. M., Garrity, P. A., and Fraser, S. E. (1996). Intravital imaging of green fluorescent protein using two-photon laser-scanning microscopy. Gene. 173, 25–31. Rentzepis, P. M., Mitschele, C. J., and Saxman, A. C. (1970). Measurement of ultrashort laser pulses by three-photon fluorescence. Appl. Phys. Lett. 17, 122–124. Robinson, J. P. (2001). Current Protocols in Cytometry. New York: John Wiley & Sons. Rochow, G. T., and Tucker, P. A. (1994). Introduction to Microscopy by Means of Light, Electrons, X-Rays, or Acoustics. New York: Plenum Press. Saloma, C., Saloma-Palmes, C., and Kondoh, H. (1998). Site-specific confocal fluorescence imaging of biological microstructures in a turbid medium. Phys. Med. Biol. 43, 1741. Sanchez, E. 
J., Novotny, L., Holtom, G. R., and Xie, X. S. (1997). Room-temperature fluorescence imaging and spectroscopy of single molecules by two-photon excitation. J. Phys. Chem. 101, 7019–7023. Schonle, A., and Hell, S. W. (1998). Heating by absorption in the focus of an objective lens. Opt. Lett. 23, 325–327. Schrader, M., Hell, S. W., and van der Voort, H. T. M. (1996). Potential of confocal microscope to resolve in the 50–100 nm range. Appl. Phys. Lett. 69, 3644–3646. Schwille, P. (2001). Fluorescence correlation spectroscopy and its potential for intracellular applications. Cell Biochem. Biophys. 34, 383–405. Schwille, P., Haupts, U., Maiti, S., and Webb, W. W. (1999). Molecular dynamics in living cells observed by fluorescence correlation spectroscopy with one- and two-photon excitation. Biophys. J. 77, 2251–2265. Schwille, P., Kummer, S., Heikal, A. A., Moerner, W. E., and Webb, W. W. (2000). Fluorescence correlation spectroscopy reveals fast optical excitation-driven intramolecular dynamics of yellow fluorescent proteins. Proc. Natl. Acad. Sci. USA. 97, 151–156. Sheppard, C. J. R. (1977). The use of lenses with annular aperture scanning optical microscopy. Optik 48, 329–334. Sheppard, C. J. R. (1989). Axial resolution of confocal fluorescence miroscopy. J. Microsc. 154, 237–241. Sheppard, C. J. R. (2002). The generalized microscope, in Confocal and Two-photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 1–18. Sheppard, C. J. R., and Choudhury, A. (1977). Image formation in the scanning microscope. Opt. Acta. 24, 1051–1073.
284
DIASPRO AND CHIRICO
Sheppard, C. J. R., and Gu, M. (1990). Image formation in two-photon fluorescence microscopy. Optik. 86, 104–106. Sheppard, C. J. R., and Kompfner, R. (1978). Resonant scanning optical microscope. Appl. Opt. 17, 2879–2882. Sheppard, C. J. R., and Shotton, D. M. (1997). Confocal Laser Scanning Microscopy. Oxford, UK: BIOS. Sheppard, C. J. R., and Wilson, T. (1980). Image formation in confocal scanning microscopes. Optik. 55, 331–342. Sheppard, C. J. R., Kompfner, R., Gannaway, J., and Walsh, D. (1977). The scanning harmonic optical microscope. IEEE/OSA Conf. Laser Eng. Appl. Washington, DC. Shih, Y. H., Strekalov, D. V., Pittman, T. D., and Rubin, M. H. (1998). Why two-photon but not two photons? Fortschr. Phys. 46, 627–641. Shotton, D. M. Ed. (1993). Electronic Light Microscopy. Techniques in Modern Biomedical Microscopy. New York: Wiley-Liss, Inc. Shotton, D. M. (1995). Electronic light microscopy—present capabilities and future prospects. Histochem. Cell Biol. 104, 97–137. Singh, S., and Bradley, L. T. (1964). Three-photon absorption in naphthalene crystals by laser excitation. Phys. Rev. Lett. 12, 162–164. Smith, N. I., Fujita, K., Kaneko, T., Katoh, K., Nakamura, O., Kawata, S., and Takamastu, T. (2001). Generation of calcium waves in living cells by pulsed-laser-induced photodisruption. Appl. Phys. Lett. 79, 1208–1210. So, P. T. C., Berland, K. M., French, T., Dong, C. Y., and Gratton, E. (1996). Two photon fluorescence microscopy: Time resolved and intensity imaging, in Fluorescence Imaging Spectroscopy and Microscopy, edited by X. F. Wang and B. Herman. Chemical Analysis Series. New York: John Wiley & Sons, pp. 351–373. So, P. T. C., Kim, H., and Kochevar, I. E. (1998). Two-photon deep tissue ex vivo imaging of mouse dermal and subcutaneous structures. Opt. Express. 3, 339–350. So, P. T. C., Dong, C. Y., Masters, B. R., and Berland, K. M. (2000). Two-photon excitation fluorescence microscopy. Annu. Rev. Biomed. Eng. 2, 399–429. So, P. T. C., Kim, K. 
H., Buehler, C., Masters, B. R., Hsu, L., and Dong, C. Y. (2001). Basic principles of multi-photon excitation microscopy, in Methods in Cellular Imaging, edited by A. Periasamy. New York: Oxford University Press, pp. 152–161. Soeller, C., and Cannell, M. B. (1996). Construction of a two-photon microscope and optimisation of illumination pulse duration. Pfluegers Arch. 432, 555–561. Soeller, C., and Cannell, M. B. (1999). Two-photon microscopy: Imaging in scattering samples and three-dimensionally resolved flash photolysis. Microsc. Res. Tech. 47, 182–195. Sonnleitner, M., Schutz, G. J., and Schmidt, T. (1999). Imaging individual molecules by twophoton excitation. Chem. Phys. Lett. 300, 221–226. Sonnleitner, M., Schutz, G., Kada, G., and Schindler, H. (2000). Imaging single lipid molecules in living cells using two-photon excitation. Single Mol. 1, 182–183. Spence, D. E., Kean, P. N., and Sibbett, W. (1991). 60-fsec pulse generation from a self-modelocked Ti:sapphire laser. Opt. Lett. 16, 42–45. Squier, J. A., Muller, M., Brakenhoff, G. J., and Wilson, K. R. (1998). Third harmonic generation microscopy. Opt. Express. 3, 315–324. Squirrel, J. M., Wokosin, D. L., White, J. G., and Barister, B. D. (1999). Long-term two-photon fluorescence imaging of mammalian embryos without compromising viability. Nat. Biotechnol. 17, 763–767. Stanley, M. (2001). Improvements in Optical Filter Design, edited by A. Periasamy and P. T. C. So. Proc. SPIE. 4262, 52–61.
TWO-PHOTON EXCITATION MICROSCOPY
285
Stelzer, E. H. K., Hell, S., Lindek, S., Pick, R., Storz, C., Stricker, R., Ritter, G., and Salmon, N. (1994). Non-linear absorption extends confocal fluorescence microscopy into the ultraviolet regime and confines the illumination volume. Opt. Commun. 104, 223–228. Straub, M., and Hell, S. W. (1998). Fluorescence lifetime three-dimensional microscopy with picosecond precision using a multifocal multiphoton microscope. Appl. Phys. Lett. 73, 1769–1771. Straub, M., Lodemann, P., Holroyd, P., Jahn, R., and Hell, S. W. (2000). Live cell imaging by multifocal multiphoton microscopy. Eur. J. Cell Biol. 79, 726–734. Svelto, O. (1998). Principles of Lasers. 4th ed. New York: Plenum Press. Sytsma, J., Vroom, J. M., De Grauw, C. J., and Gerritsen, H. C. (1998). Time-gated fluorescence lifetime imaging and microvolume spectroscopy using two-photon excitation. J. Microsc. 191, 39–51. Tan, Y. P., Llano, I., Hopt, A., Wurriehausen, F., and Neher, E. (1999). Fast scanning and efficient photodetection in a simple two-photon microscope. J. Neurosci. Methods. 92, 123–135. Tanaka, T., Sun, H. B., and Kawata, S. (2002). Rapid sub-diffraction-limit laser micro’nanoprocessing in a threshold material system. Appl. Phys. Lett. 80, 312–314. Tirlapur, U. K., and Konig, K. (2002). Two-photon near infrared femtosecond laser scanning microscopy in plant biology, in Confocal and Two-photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 449–468. Torok, P., and Sheppard, C. J. R. (2002). The role of pinhole size in high aperture two and three-photon microscopy, in Confocal and Two-photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 127–152. Tsien, R. Y. (1998). The green fluorescent protein. Annu. Rev. Biochem. 67, 509–544. Tyrrell, R. M., and Keyse, S. M. (1990). The interaction of UVA radiation with cultured cells. J. Photochem. Photobiol. B. 4, 349–361. Wang, X. 
F., and Herman, B. (1996). Fluorescence Imaging Spectroscopy and Microscopy. New York: Wiley-Liss, Inc. Webb, R. H. (1996). Confocal optical microscopy. Rep. Prog. Phys. 59, 427–471. Weinstein, M., and Castleman, K. R. (1971). Reconstructing 3-D specimens from 2-D section images. Proc. SPIE. 26, 131–138. White, J. G., Amos, W. B., and Fordham, M. (1987). An evaluation of confocal versus conventional imaging of biological structures by fluorescence light microscopy. J. Cell Biol. 105, 41–48. White, N. S., and Errington, R. J. (2000). Improved laser scanning fluorescence microscopy by multiphoton excitation. Adv. Imag. Elect. Phys. 113, 249–277. Wier, W. G., Balke, C. W., Michael, J. A., and Mauban, J. R. (2000). A custom confocal and two-photon digital laser scanning microscope. Am. J. Physiol. 278, H2150–H2156. Wilson, T. (1989). Optical sectioning in confocal fluorescent microscope. J. Microsc. 154, 143–156. Wilson, T. Confocal Microscopy London: Academic Press. Wilson, T. (2002). Confocal microscopy: Basic principles and architectures, in Confocal and Two-Photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 19–38. Wilson, T., and Sheppard, C. J. R. (1984). Theory and Practice of Scanning Optical Microscopy. London: Academic Press. Wise, F. (1999). Lasers for two-photon microscopy, in Imaging: A Laboratory Manual, edited by R. Yuste, F. Lanni and A. Konnerth. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press, pp. 18.1–18.9.
286
DIASPRO AND CHIRICO
Wiseman, P. W., Squier, J. A., Ellisman, M. H., and Wilson, K. R. (2000). Two photon image correlation spectroscopy and image cross-correlation spectroscopy. J. Microsc. 200, 14–25. Wiseman, P. W., Capani, F., Squier, J. A., and Martone, M. E (2002). Counting dendritic spines in brain tissue slices by image correlation spectroscopy analysis. J. Microsc. 205, 177–186. Wokosin, D. L., and White, J. G. (1997). Optimization of the design of a multiple-photon excitation laser scanning fluorescence imaging system, in Three-Dimensional Microscopy: Image, Acquisition and Processing IV. Proc. SPIE. 2984, 25–29. Wokosin, D. L., Centonze, V. E., White, J., Armstrong, D., Robertson, G., and Ferguson, A. I. (1996). All-solid-state ultrafast lasers facilitate multiphoton excitation fluorescence imaging. IEEE J. Sel. Top. Quant. Elect. 2, 1051–1065. Wokosin, D. L., Amos, W. B., and White, J. G. (1998). Detection sensitivity enhancements for fluorescence imaging with multiphoton excitation microscopy. Proc. IEEE Eng. Med. Biol. Soc. 20, 1707–1714. Wolleschensky, R., Feurer, T., Sauerbrey, R., and Simon, U. (1998). Characterization and optimization of a laser scanning microscope in the femtosecond regime. Appl. Phys. B 67, 87–94. Wolleschensky, R., Dickinson, M., and Fraser, S. E. (2002). Group velocity dispersion and fiber delivery in multiphoton laser scanning microscopy, in Confocal and Two-Photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 171–190. Xie, X. S., and Lu, H. P. (1999). Single molecule enzymology. J. Biol. Chem. 274, 15967–15970. Xu, C. (2002). Cross-sections of fluorescence molecules used in multiphoton microscopy, in Confocal and Two-Photon Microscopy: Foundations, Applications and Advances, edited by A. Diaspro. New York: Wiley-Liss, Inc., pp. 75–100. Xu, C., Guild, J., Webb, W. W., and Denk, W. (1995). 
Determination of absolute two-photon excitation cross sections by in situ second-order autocorrelation. Opt. Lett. 20, 2372–2374. Yoder, E. J., and Kleinfeld, D. (2002). Cortical imaging through the intact mouse skull using two-photon excitation laser scanning microscopy. Microsc. Res. Tech. 56(4), 304–305. Yuste, R., Lanni, F., and Konnerth, A. (2000). Imaging Neurons: A Laboratory Manual. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press. Zoumi, A., Yeh, A., and Tromberg, B. J. (2002). Imaging cells and extracellular matrix in vivo by using second-harmonic generation and two-photon excited fluorescence. Proc. Natl. Acad. Sci. USA 99(17), 11014–11019. in press.
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 126
Phase Closure Imaging

ANDRÉ LANNES

Sciences de l'Univers au Centre Européen de Recherche et de Formation Avancée en Calcul Scientifique (SUC-CERFACS), F-31057 Toulouse cedex, France
I. Introduction 288
   A. Interferometric Graphs 289
   B. Phase Closure 290
   C. Phase Calibration 291
   D. Image Reconstruction 292
   E. Contents 292
II. Phase Spaces and Integer Lattices 293
   A. Pupil Phase Space 293
   B. Baseline Phase Space 293
   C. Unknown-Spectral Phase Space 294
   D. Bias Phase Space 295
   E. Loop-Entry Phase Space 296
III. Phase Closure Operator, Phase Closure Projection, and Related Properties 296
IV. Variance–Covariance Matrix of the Closure Phases 299
V. Spectral Phase Closure Projection 299
   A. Smith Normal Form of the Spectral Phase Closure Matrix 300
   B. Examples 300
      1. Weakly Redundant Case 300
      2. Strongly Redundant Case 302
VI. Reference Algebraic Framework 305
VII. Statement of the Phase Calibration Problem 307
VIII. Phase Calibration Discrepancy and Related Results 309
IX. Optimal Model Phase Shift and Related Results 313
   A. Optimal Bias Phase 314
   B. Optimal Pupil Phase 314
X. Special Cases 315
   A. Special Case Where m1 = p 315
   B. Special Case Where m1 = m with m < p 316
   C. Special Case Where m1 = m with m = p 317
XI. Simulated Example 317
XII. Concluding Comments 319
Appendix 1. Useful Property 320
Appendix 2. Smith Normal Form of Integral Matrices 321
Appendix 3. Reference Projections 321
Appendix 4. Closest Point Search in Lattices 323
References 327
287 Copyright 2003 Elsevier Science (USA). All rights reserved. ISSN 1076-5670/03
I. Introduction

Phase calibration is the key operation of phase closure imaging. In the general case of redundant arrays, the corresponding analysis is based on the Smith normal form of the spectral phase closure matrix. This mathematical representation, well known in integral matrix theory, has not been exploited so far in phase closure imaging. New results are thus exhibited. In this theoretical framework, the optimal model phase shift is obtained by successively solving two integer ambiguity problems. This study is illustrated with the aid of a simulation built on a particular redundant interferometric graph. The potential instabilities of a phase calibration operation can thus be well understood.

In this article, A is an interferometric array observing an incoherent source of small angular size (see Fig. 1); A includes n pupil elements: n telescopes in optical interferometry (Reasonberg, 1998) or n antennas in radio imaging (Hunt and Payne, 1997). Relative to the tracking center, the source is characterized by some two-dimensional angular brightness distribution $s_o(\xi)$. Let $r(j)$ denote the position vector of the $j$th pupil element projected onto a plane normal to the tracking axis (see Fig. 1). According to the Van Cittert–Zernike theorem (Born and Wolf, 1970), the data set, consisting of the experimental "complex visibilities" $V_e(j,k)$, corresponds to a certain sampling of the Fourier transform of $s_o$,

$$\hat{s}_o(u) := \int_{\mathbb{R}^2} s_o(\xi)\,\exp(-2i\pi\,u\cdot\xi)\,d\xi \qquad (1)$$

Each baseline $(j,k)$ defines a Fourier sampling point; the corresponding angular spatial frequency is defined by

$$u(j,k) = \frac{r(j) - r(k)}{\lambda} \qquad (2)$$

where $\lambda$ is the wavelength of the electromagnetic field under consideration. In the absence of errors, one thus has $V_e(j,k) = \hat{s}_o[u(j,k)]$. Within reasonable, well-defined limits, inversion of this basic relationship yields an approximation to $s_o$. The corresponding operation is associated with the notion of aperture synthesis.

In most cases encountered in practice, the relationship between $s_o$ and $V_e$ is not a simple Fourier sampling operation. In particular, residual optical path differences often blur the basic observational principle. More precisely, we then have

$$V_e(j,k) = \hat{s}_o[u(j,k)]\,\exp[i\beta_e(j,k)] + \text{error terms} \qquad (3)$$
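As a quick numerical illustration of Eq. (2), the following sketch computes the angular spatial frequency sampled by each baseline. The telescope positions and wavelength are invented for the example; they are not taken from the article.

```python
import numpy as np

# Hypothetical projected pupil positions r(j) in meters (illustrative only)
r = {1: np.array([0.0, 0.0]),
     2: np.array([30.0, 0.0]),
     3: np.array([30.0, 40.0])}
lam = 0.5e-6  # wavelength in meters (visible light, illustrative)

def u(j, k):
    """Angular spatial frequency of baseline (j, k), Eq. (2): u = (r(j) - r(k)) / lambda."""
    return (r[j] - r[k]) / lam

# u is antisymmetric, u(k, j) = -u(j, k), which mirrors the Hermitean
# symmetry of the visibilities V_e below
assert np.allclose(u(2, 1), -u(1, 2))
print(u(2, 1))  # a 30 m baseline at 0.5 um gives 6e7 cycles/radian on the first axis
```

Each such `u(j, k)` is one Fourier sampling point of $\hat{s}_o$; longer baselines probe finer angular structure.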
Figure 1. Interferometric observational principle. Each couple (j, k) of pupil elements defines a Fourier sampling point u(j, k) of the Fourier transform of the angular brightness distribution of the object source: $u(j,k) = \overrightarrow{M_k M_j}/\lambda$ [see Eq. (2)].
in which the $\beta_e(j,k)$ are bias phases of the form

$$\beta_e(j,k) = \alpha_e(j) - \alpha_e(k) \qquad (4)$$

The $\alpha_e(j)$ are unknown pupil phases. All the complex-valued functions involved in the observational Eq. (3) are Hermitean. For example, $V_e(k,j) = \overline{V_e(j,k)}$. In this article, we consider the situations in which the bias phases $\beta_e(j,k)$ cannot be calibrated in an experimental manner. The phase of $\hat{s}_o[u(j,k)]$, an antisymmetric function denoted by $\varphi_o(j,k)$, is therefore not directly accessible.

A. Interferometric Graphs

Let $B_c$ be the set of the $n(n-1)/2$ baselines $(j,k)$ generated by A. The graph $(A, B_c)$ (see Fig. 2 and Biggs, 1996), whose vertices are the pupil elements of A, and whose edges are the baselines of $B_c$, is said to be complete. In practice, one may be led to consider the values of the phase of $V_e$ only on a subset B of $B_c$: $B \subseteq B_c$. For example, this may result from the fact that $|V_e|$ is negligible on $B_c \setminus B$. The number of baselines of graph (A, B) is denoted by q. Clearly,

$$q \le \frac{n(n-1)}{2} \qquad (5)$$
Figure 2. Top: redundant array A; bottom: corresponding complete graph (A, Bc). By definition, Bc is the set of all the baselines generated by A.
According to the very principle of interferometry, A and B are defined so that (A, B) is connected (Biggs, 1996); one then speaks of the interferometric graph. The condition "$\alpha(j) - \alpha(k) = 0$ for all $(j,k) \in B$" is therefore equivalent to "$\alpha$ is constant on A."

According to Eq. (2), distinct baselines may generate the same angular frequency; $\hat{s}_o[u(j,k)]$ takes the same value on these baselines. Whenever this situation occurs, the interferometric graph is said to be redundant or partly redundant (see Fig. 2). To stress the fact that $\varphi_o$ is constant on the subsets of B defined by the list of distinct angular frequencies, one then says that $\vartheta_o \equiv \varphi_o$ is a spectral (baseline) phase.

B. Phase Closure

A subgraph of (A, B) with n vertices, n − 1 edges, and no loop (i.e., no cycle) in it is said to be a spanning tree of (A, B) (see Fig. 3 and Biggs, 1996). Let $(j_i, k_i)$ now be a baseline of B that does not lie in the set of baselines of the selected spanning tree. As illustrated in Figure 3 and specified in Lannes (1999), a baseline of this type defines a certain directed loop. The number of loops defined via a given spanning tree (fixed in an arbitrary manner) is therefore given by the formula

$$p = q - (n-1) \qquad (6)$$
For example, in Figure 3, the selected spanning tree includes five elements: baselines (1, 2), (1, 3), (1, 4), (1, 5), and (1, 6). Baselines (2, 3), (2, 4), (2, 5), and (4, 5) are lacking so that q = 11. The remaining baselines $(j_i, k_i)$, the six baselines (2, 6), (3, 4), (3, 5), (3, 6), (4, 6), and (5, 6), define as many loops (p = 6). Note that here, all these loops are of order 3.

Figure 3. Example of interferometric graph (n = 6). Baselines (2, 3), (2, 4), (2, 5), and (4, 5) are lacking so that q = 11. The thick lines correspond to the selected spanning tree. Here, such a tree includes five baselines; the remaining baselines define as many loops: p = 6 (see text).

By definition, the closure phases of $\varphi$ are the sums of the values of $\varphi$ along the directed loops defined through a given spanning tree. For example, in Figure 3, for the directed loop induced by the first loop-entry baseline $(j_1, k_1) = (2, 6)$, the closure phase of $\varphi$ is

$$\varphi(2,6) + \varphi(6,1) + \varphi(1,2)$$
Note that the closure phases of any bias phase are equal to zero.
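The bookkeeping above can be sketched for the graph of Figure 3, whose spanning tree is the star rooted at pupil element 1. This is our own construction from the text's definitions (variable names are not the article's); it verifies Eq. (6) and the fact that bias phases have vanishing closure phases.

```python
# Closure-phase bookkeeping for the graph of Figure 3
# (star spanning tree rooted at pupil element 1)
import math

tree = [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6)]          # selected spanning tree
loop_entries = [(2, 6), (3, 4), (3, 5), (3, 6), (4, 6), (5, 6)]
edges = tree + loop_entries
n, q = 6, len(edges)
p = q - (n - 1)                                           # Eq. (6)
assert p == len(loop_entries) == 6

def phi_value(phi, j, k):
    """Antisymmetric extension: phi(k, j) = -phi(j, k)."""
    return phi[(j, k)] if (j, k) in phi else -phi[(k, j)]

def closure_phases(phi):
    """Sum phi along each directed loop (j, k) -> (k, 1) -> (1, j)."""
    return [phi_value(phi, j, k) + phi_value(phi, k, 1) + phi_value(phi, 1, j)
            for (j, k) in loop_entries]

# A bias phase beta(j, k) = alpha(j) - alpha(k) telescopes around every loop,
# so all its closure phases vanish
alpha = {j: 0.3 * j * j for j in range(1, 7)}             # arbitrary pupil phases
beta = {(j, k): alpha[j] - alpha[k] for (j, k) in edges}
assert all(math.isclose(c, 0.0, abs_tol=1e-12) for c in closure_phases(beta))
```

The loops here are all of order 3 only because the tree is a star; for a general spanning tree each loop-entry baseline would be closed by the unique tree path between its endpoints.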
C. Phase Calibration

Let $s_m$ (where m stands for model) be an approximation to $s_o$. On each baseline $(j,k) \in B$, the phase of $V_e$, the baseline phase $\varphi_e(j,k)$, is related to that of $\hat{s}_m$, the spectral phase $\vartheta_m(j,k)$, by a relationship of the form

$$\varphi_e = (\vartheta_m + \vartheta) + \beta + 2\pi\kappa + \varepsilon \qquad (7)$$

in which $\varepsilon$ is an error term. Here, $\vartheta$ is a spectral phase, whereas $\beta$ is a bias phase: the $\vartheta(j,k)$ satisfy the redundancy constraint, whereas the $\beta(j,k)$ are of the form $\alpha(j) - \alpha(k)$. Clearly, $\kappa(j,k)$ is an integer-valued function. In the phase calibration operation, the quantities $\vartheta$, $\beta$, and $\kappa$ have to be chosen so as to minimize the size of the error term. The model is then constrained through a formula of the form $\vartheta_m' = \vartheta_m + \vartheta$. In what follows, this optimal $\vartheta$ is referred to as the "optimal model phase shift."
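A toy computation shows why the integer-valued term $2\pi\kappa$ of Eq. (7) is unavoidable: a measured baseline phase is only known modulo $2\pi$. The numbers below are invented, and the spectral shift $\vartheta$ is set to zero to keep the sketch to a single baseline.

```python
# Illustration of the 2*pi*kappa ambiguity in Eq. (7), single baseline,
# with the spectral shift theta set to 0 (made-up numbers)
import math

two_pi = 2 * math.pi

def wrap(x):
    """Principal value of a phase, in [-pi, pi)."""
    return x - two_pi * math.floor(x / two_pi + 0.5)

theta_m = 2.9            # model spectral phase on some baseline (j, k)
beta = 1.1               # bias phase alpha(j) - alpha(k)
eps = 0.05               # small error term
phi_e = wrap(theta_m + beta + eps)   # what is actually measured

# The unwrapped relation phi_e = theta_m + beta + 2*pi*kappa + eps fixes kappa:
kappa = round((phi_e - theta_m - beta) / two_pi)
assert math.isclose(phi_e, theta_m + beta + two_pi * kappa + eps, abs_tol=1e-12)
assert kappa == -1       # theta_m + beta + eps ~ 4.05 is wrapped down by one turn
```

In the actual calibration problem $\kappa$ must be chosen jointly with $\vartheta$ and $\beta$ over all baselines, which is what turns the operation into the integer ambiguity problems studied later in the article.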
D. Image Reconstruction

At any step of the image reconstruction procedure, the object model $s_m$ may be refined by performing a phase calibration operation followed by a Fourier synthesis process. The latter is performed by using as input the Fourier data of the refined model

$$\hat{s}_m'[u(j,k)] = \hat{s}_m[u(j,k)]\,\exp[i\vartheta(j,k)] \qquad (j,k) \in B \qquad (8)$$

Examples of Fourier synthesis methods can be found in Lannes et al. (1994, 1996, 1997). As will be clarified in this article, the notion of phase closure imaging is associated with the fact that the optimal model phase shift $\vartheta$ can be expressed in terms of the closure phases of $\varphi_e - \vartheta_m$.

Since the original work by Cornwell and Wilkinson (1981) on how to make maps with interferometers, radio astronomers have been well aware of the critical part played by the phase calibration operation (Hunt and Payne, 1997). Instabilities were observed, but never well understood until the analysis presented much later by Lannes (1999, 2001a) became available. By stating the problem at the level of the phase (instead of the phasor), it was then established that in the case of nonredundant arrays, a phase calibration operation amounts to solving a certain "nearest lattice point" problem. The related instabilities could then be well understood. The present study can be regarded as an extension of the paper by Lannes (2001a) to the case of redundant arrays. New aspects, which were hidden when concentrating on the nonredundant case, are thus revealed, hence providing better knowledge of the matter.
E. Contents

We first present the algebraic framework in which the analysis of the phase calibration problem can be developed. Phase spaces and their integer lattices are then introduced (Section II). Some properties related to the notion of phase closure are stated in Section III. The new results essentially concern the variance–covariance matrix of the closure phases (Section IV) and, especially, the Smith normal form (SNF) of the spectral phase closure matrix (Section V). As the reader may not be familiar with the notion of SNF, Section V is illustrated with the aid of two examples. The first one concerns a weakly redundant interferometric graph, and the second a strongly redundant graph. Section VI is devoted to the reference algebraic framework resulting from this analysis. The phase calibration problem is thoroughly stated and solved in Sections VII to IX. In the general case of redundant arrays, two integer ambiguity problems must then be successively solved: P1 and P2. Important special cases are examined in Section X. Section XI is devoted to a simulation built on a particular redundant array. As indicated in the concluding comments (Section XII), the present study can be extended to any interferometric device.

II. Phase Spaces and Integer Lattices

In what follows, we identify the n-element array A with A := {1, 2, ..., n}, and denote by $\bar{B} := \{(j,k),\,(k,j) : (j,k) \in B\}$ the set of directed baselines.

A. Pupil Phase Space

By definition, the pupil phase space is the space $H \equiv H(\mathbb{R})$ of real-valued functions defined on A. Endowed with the inner product

$$(\alpha_1 \mid \alpha_2)_H := \sum_{j \in A} \alpha_1(j)\,\alpha_2(j)$$
H is a Euclidean space of dimension n. The subset of H whose elements are functions with values in $\mathbb{Z}$ is denoted by $H(\mathbb{Z})$. This subset is a lattice of H (Cohen, 1996); its elements are the nodes of this lattice. The set $\{a_k : k \in A\}$ in which $a_k(j) = \delta_{jk}$ (the Kronecker symbol) is the standard basis of H, as well as of $H(\mathbb{Z})$, which is therefore of rank n. Given r in A, $H_r$ is the subspace of H with standard basis $\{a_\ell : \ell \in A \setminus r\}$; $H_r(\mathbb{Z})$ is the corresponding lattice.

B. Baseline Phase Space

The baseline phase space is the space $G \equiv G(\mathbb{R})$ of antisymmetric real-valued functions defined on $\bar{B}$: $\forall (j,k) \in \bar{B}$, $\varphi(k,j) = -\varphi(j,k)$. Clearly (see Section I.A),

$$\dim G = q$$

The subset of G whose elements are functions with values in $\mathbb{Z}$ is denoted by $G(\mathbb{Z})$. This subset is a lattice of G; its elements are the nodes of this lattice. The set of baseline phase functions

$$b_{j'k'}(j,k) := \begin{cases} 1 & \text{if } j = j' \text{ and } k = k' \\ -1 & \text{if } j = k' \text{ and } k = j' \\ 0 & \text{otherwise} \end{cases} \qquad (j',k') \in B$$

is the standard basis of G, as well as of $G(\mathbb{Z})$, which is therefore of rank q.
Let $ be a given symmetric weight function: $ð j; kÞ ¼ $ðk; jÞ > 0. Endowed with the inner product 1 X $ð j; kÞ1 ð j; kÞ2 ð j; kÞ ð1 j2 ÞG :¼ 2 ð j; kÞ 2 B ð9Þ X $ð j; kÞ1 ð j; kÞ2 ð j; kÞ :¼ ð j; kÞ 2 B
G is a real Hilbert space. In the absence of any ambiguity, the subscript G will be omitted; in other terms, $(\cdot \mid \cdot)$ and $\|\cdot\|$ stand for $(\cdot \mid \cdot)_G$ and $\|\cdot\|_G$, respectively.

C. Unknown-Spectral Phase Space

Whenever redundant interferometric graphs are considered, one is led to introduce an important subspace of G: the spectral phase space $G_s \equiv G_s(\mathbb{R})$. By definition, $G_s$ is the set of baseline phases $\vartheta \in G$ that satisfy the redundancy constraint: $\vartheta$ takes the same value on all the baselines that generate the same spatial frequency. As already mentioned, such a phase function is said to be a spectral phase. The weight function $\varpi$ involved in the definition of the inner product [Eq. (9)] satisfies the redundancy constraint.

The object spectral phase $\vartheta_o$ is often approximately known on a given subset $B_r$ of B. (The subscript r stands for reference.) In practice, $B_r$ corresponds to a given set of low frequencies. By definition, the unknown-spectral phase space $K \equiv K(\mathbb{R})$ is the space of spectral phases that vanish on the reference set in question. Let m be the number of spectral phase components to be determined, and $\{u_k\}_{k=1}^m$ be the corresponding set of distinct angular spatial frequencies. Clearly,
$$\dim K = m \tag{10}$$
For example, for the array shown in Figure 2, when $\vartheta(1,2)$ is assumed to be known a priori, the dimension of K is equal to 4. The subset of K whose elements are functions with values in $\mathbb{Z}$ is denoted by $K(\mathbb{Z})$. This subset is a lattice of K; its points are the nodes of this lattice. The standard basis of K is the set of the spectral phases
$$\mu_k(i,j) := \begin{cases} 1 & \text{if } u(i,j) = u_k \\ -1 & \text{if } u(i,j) = -u_k \\ 0 & \text{otherwise} \end{cases} \qquad (k = 1,\ldots,m)$$
This basis, $\{\mu_k\}_{k=1}^m$, is also the standard basis of lattice $K(\mathbb{Z})$. By construction, the latter is of rank m.
PHASE CLOSURE IMAGING
295
D. Bias Phase Space

In the process of stating the phase calibration problem, one is led to introduce the bias phase operator
$$B : H \to G, \qquad (B\alpha)(j,k) := \alpha(j) - \alpha(k)$$
By definition, the bias phase space L is the range of B: $L := BH$. As the graph (A, B) is connected, the space of functions that are constant on A is the kernel (also called the null space) of B. This subspace of H is of dimension unity. As a result,
$$\dim L = n - 1 \tag{11}$$
Given r in A, the set $\{\beta_\ell := Ba_\ell : \ell \in A \setminus r\}$ is a basis of L. This basis generates a lattice of L denoted by $L(\mathbb{Z})$. By construction, this lattice is of rank $n - 1$. Note that $L(\mathbb{Z})$ is the subset of L whose elements are functions with values in $\mathbb{Z}$: $L(\mathbb{Z}) = G(\mathbb{Z}) \cap L$. Its points are the nodes of this lattice. The orthogonal complement of L in G, M, is referred to as the bias-free phase space. In what follows, the orthogonal projections onto L and M are denoted by R and S, respectively (see Fig. 4). In practice, their action does not raise any particular difficulty [see the context of Eq. (37) in Lannes, 1999].
Figure 4. Main decomposition of the baseline phase space G. The range of B is the bias phase space L. Its orthogonal complement M is the bias-free phase space; R and S are the corresponding orthogonal projections. The spanning-tree phase space E is the space induced by the selected spanning tree. Its orthogonal complement F is the loop-entry phase space; P and Q are the corresponding orthogonal projections. Operator $\hat{C}$, which is the oblique projection of G onto F along L, is referred to as the phase closure projection.
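As a concrete illustration of the operator B, the sketch below (Python; a hypothetical complete four-element graph, not one of the chapter's arrays) builds the matrix of B, one row per directed baseline, and checks that its rank is $n - 1$, the kernel being the space of constant functions on A:

```python
from fractions import Fraction

# Nodes and directed baselines of a hypothetical complete 4-element graph.
A = [1, 2, 3, 4]
B = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]

# Matrix of the bias phase operator (B alpha)(j, k) = alpha(j) - alpha(k):
# one row per directed baseline, one column per node.
M = [[(1 if j == a else -1 if k == a else 0) for a in A] for (j, k) in B]

def rank(mat):
    """Exact rank over the rationals (Gaussian elimination with Fraction)."""
    m = [[Fraction(x) for x in row] for row in mat]
    r = 0
    for c in range(len(m[0])):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

print(rank(M))   # 3, i.e. n - 1: dim L = n - 1
```

Each row of M sums to zero, which is exactly the statement that the constant pupil phases form the one-dimensional kernel of B.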
E. Loop-Entry Phase Space

Let E be the subspace of G formed by the functions with support in the selected spanning tree; E is referred to as the spanning-tree phase space; its dimension is equal to $n - 1$. Its orthogonal complement, F, is the loop-entry phase space. Clearly, $\dim F = p$. The loop-entry phase functions
$$\gamma_i(j,k) := \begin{cases} 1 & \text{if } j = j_i \text{ and } k = k_i \\ -1 & \text{if } j = k_i \text{ and } k = j_i \\ 0 & \text{otherwise} \end{cases} \qquad (i = 1,\ldots,p) \tag{12}$$
form the standard basis of F. This basis generates a lattice of F denoted by $F(\mathbb{Z})$. Note that $F(\mathbb{Z}) = G(\mathbb{Z}) \cap F$.
By construction, this lattice is of rank p. As shown in Figure 4, the projections onto E and F are denoted by P and Q, respectively.
III. Phase Closure Operator, Phase Closure Projection, and Related Properties

The closure phases $\psi^{(1)}, \ldots, \psi^{(p)}$ of a function $\psi$ lying in G are the sums of the values of $\psi$ along the directed loops defined through a given spanning tree of (A, B) (see Section I.B). These closure phases are the components of a vector $b_\psi$ lying in $\mathbb{R}^p$. In this context, the operator
$$C : G \to \mathbb{R}^p, \qquad C\psi := b_\psi = \sum_{i=1}^{p} \psi^{(i)} \hat{\jmath}_i$$
is said to be the ``phase closure operator.'' Note that $\{\hat{\jmath}_i = C\gamma_i\}_{i=1}^p$ is the standard basis of $\mathbb{R}^p$. This explicitly shows that C is surjective. We therefore have $\dim(\ker C) = \dim G - \dim F = q - p$; hence, from Eq. (6), $\dim(\ker C) = n - 1$. Clearly, the range of B is contained in $\ker C$. As this range is of dimension $n - 1$ [see Eq. (11)], it follows that
$$\ker C = L$$
Consider the operator
$$\tilde{C} : \mathbb{R}^p \to F, \qquad \tilde{C}\lambda := \sum_{i=1}^{p} \lambda^{[i]} \gamma_i$$
where $\lambda^{[i]}$ denotes the ith component of $\lambda$. Clearly, the operator $\hat{C} := \tilde{C}C$ is such that
$$\hat{C}\psi = \sum_{i=1}^{p} \psi^{(i)} \gamma_i$$
As $\hat{C}\gamma_i = \gamma_i$, we have $\hat{C}^2 = \hat{C}$. Furthermore, L is the kernel of $\hat{C}$. This operator, which is therefore the oblique projection of G onto F along L (see Fig. 4), is said to be the ``phase closure projection.'' Note that $\psi - \hat{C}\psi$ lies in $\ker C$ and therefore in L. Any $\psi$ in G can therefore be uniquely decomposed in the form
$$\psi = \psi_L + \hat{C}\psi \qquad \text{with} \qquad \psi_L \in L \text{ and } \hat{C}\psi \in F$$
This explicitly shows that G can be regarded as the direct sum of L and F:
$$G = L + F \qquad \text{with} \qquad L \cap F = \{0\}$$
Let us now concentrate on the oblique projection of a node $\nu$ of $G(\mathbb{Z})$:
$$\hat{C}\nu = \sum_{i=1}^{p} \nu^{(i)} \gamma_i$$
As the $\nu^{(i)}$ are rational integers, $\hat{C}\nu$ is a node of $F(\mathbb{Z})$ (see Fig. 5). It is then clear that any $\nu$ in $G(\mathbb{Z})$ can be decomposed in the form (see Fig. 5)
$$\nu = \nu_L + \hat{C}\nu \qquad \text{with} \qquad \nu_L \in L(\mathbb{Z}) \text{ and } \hat{C}\nu \in F(\mathbb{Z})$$
This explicitly shows that $G(\mathbb{Z})$ can be regarded as the direct sum of $L(\mathbb{Z})$ and $F(\mathbb{Z})$:
$$G(\mathbb{Z}) = L(\mathbb{Z}) + F(\mathbb{Z}) \qquad \text{with} \qquad L(\mathbb{Z}) \cap F(\mathbb{Z}) = \{0\}$$
As $S(\psi - \hat{C}\psi) = 0$ (since $\psi - \hat{C}\psi$ lies in L, and S is the projection of G onto M), we have
$$S\psi = \sum_{i=1}^{p} \psi^{(i)} \varepsilon_i \qquad \text{where} \qquad \varepsilon_i := S\gamma_i$$
Figure 5. Canonical decomposition of lattice $G(\mathbb{Z})$. The intersection of $G(\mathbb{Z})$ with the bias phase space L, $L(\mathbb{Z})$, is a lattice of rank $n - 1$. The intersection of $G(\mathbb{Z})$ with the loop-entry phase space F, $F(\mathbb{Z})$, is a lattice of rank p. For a given choice of spanning tree, any $\nu \in G(\mathbb{Z})$ can be decomposed in the nonorthogonal form $\nu = \nu_L + \hat{C}\nu$ with $\nu_L \in L(\mathbb{Z})$ and $\hat{C}\nu \in F(\mathbb{Z})$; $G(\mathbb{Z})$ can therefore be regarded as the direct sum of $L(\mathbb{Z})$ and $F(\mathbb{Z})$.
Note that $\hat{C}\varepsilon_i = \gamma_i$ (see Fig. 4). As G is the direct sum of L and F, and the $\gamma_i$'s form a basis of F, it follows from the relation above that the $\varepsilon_i$'s form a basis of M (see Appendix 1). Let us now introduce the operator
$$C^\sharp : \mathbb{R}^p \to M, \qquad C^\sharp\lambda := S\tilde{C}\lambda = \sum_{i=1}^{p} \lambda^{[i]} \varepsilon_i$$
By construction, $S = C^\sharp C$. Note that $C\varepsilon_i = C\gamma_i$; hence, for any b in $\mathbb{R}^p$,
$$CC^\sharp b = \sum_{i=1}^{p} b^{[i]} C\varepsilon_i = \sum_{i=1}^{p} b^{[i]} C\gamma_i = \sum_{i=1}^{p} b^{[i]} \hat{\jmath}_i = b$$
As $\ker C = L$, $C^\sharp$ is therefore the Moore–Penrose pseudoinverse of C: $C^\sharp = C^+$.
Now, since C is surjective, we have
$$C^+ = C^*(CC^*)^{-1}$$
where $C^*$ denotes the adjoint of C; hence
$$(C^{+*}C^+)(CC^*) = I_p \qquad (\text{the identity on } \mathbb{R}^p)$$
As a result,
$$C^{\sharp*}C^\sharp = (CC^*)^{-1} \tag{14}$$
IV. Variance–Covariance Matrix of the Closure Phases

Let [C] and $[\Gamma]$ be the matrices of C and $CC^*$ in the standard bases of G and $\mathbb{R}^p$, and [V] be the diagonal matrix whose elements are the inverses of the baseline weights $\varpi(j,k)$ [see Eq. (9)]. Denote by $[C]^t$ the transpose matrix of [C]. As
$$(C\psi \mid \lambda)_{\mathbb{R}^p} = [\lambda]^t [C][\psi] \qquad \text{and} \qquad (\psi \mid C^*\lambda)_G = [C^*\lambda]^t [V]^{-1} [\psi] = [\lambda]^t [C^*]^t [V]^{-1} [\psi]$$
we have $[C^*] = [V][C]^t$, hence
$$[\Gamma] = [C][V][C]^t$$
Consequently, when [V] is regarded as the variance–covariance matrix of the baseline phases $\psi(j,k)$, $[\Gamma]$ is the variance–covariance matrix of the closure phases $\psi^{(i)}$. According to Eq. (14), we have
$$[C^{\sharp*}C^\sharp] = [\Gamma]^{-1}$$
Note that the matrix elements of $[C^{\sharp*}C^\sharp]$ are the inner products $(\varepsilon_i \mid \varepsilon_{i'})$.
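The propagation of the baseline variances to the closure phases, $[\Gamma] = [C][V][C]^t$, can be sketched as follows (hypothetical complete four-element graph with unit baseline variances; none of these numbers come from the text):

```python
# Baselines and loop-entry pairs of a hypothetical complete 4-element graph.
baselines = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
loops = [(2, 3), (2, 4), (3, 4)]

def coeff(loop, b):
    """Signed incidence of baseline b in the directed loop (j, k, 1)."""
    j, k = loop
    path = [(j, k), (k, 1), (1, j)]
    if b in path:
        return 1.0
    if (b[1], b[0]) in path:
        return -1.0
    return 0.0

C = [[coeff(l, b) for b in baselines] for l in loops]   # p x q matrix
V = [[1.0 if i == j else 0.0 for j in range(len(baselines))]
     for i in range(len(baselines))]                    # unit variances

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

Gamma = matmul(matmul(C, V), [list(r) for r in zip(*C)])
print(Gamma[0][0], Gamma[0][1])   # 3.0 1.0
```

With unit baseline variances, each closure phase (three baselines per loop) has variance 3, and two loops sharing one tree baseline have covariance 1: closure phases are correlated even when the baseline phases are not.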
V. Spectral Phase Closure Projection

The operator from K into F induced by $\hat{C}$ is denoted by $C_K$ and referred to as the spectral phase closure projection. Its kernel $K_0$ is the intersection of K with L:
$$K_0 := \ker C_K = K \cap L$$
We set
$$m_0 := \dim K_0, \qquad m_1 := m - m_0$$
A. Smith Normal Form of the Spectral Phase Closure Matrix

Let $[C_K]$ now be the matrix of $C_K$ in the standard bases of K and F: $\{\mu_k\}_{k=1}^m$, $\{\gamma_i\}_{i=1}^p$. By construction, the matrix elements of $[C_K]$ lie in $\mathbb{Z}$. Note that $[C_K]$ has p rows and m columns. According to the theorem introduced in Appendix 2, there exist a basis $\{\mu_{0,k}\}_{k=1}^{m_0} \cup \{\mu_{1,j}\}_{j=1}^{m_1}$ of $K(\mathbb{Z})$, a basis $\{\gamma_{1,j}\}_{j=1}^{m_1} \cup \{\gamma_{2,i}\}_{i=1}^{p-m_1}$ of $F(\mathbb{Z})$, and positive integers $c_1, c_2, \ldots, c_{m_1}$, with $c_j$ dividing $c_{j+1}$ for $1 \le j < m_1$, such that $\hat{C}\mu_{1,j} = c_j \gamma_{1,j}$ (for $1 \le j \le m_1$) and $\hat{C}\mu_{0,k} = 0$ (for $1 \le k \le m_0$); in other words, such that the matrix of $C_K$ in these bases is of Smith normal form. More precisely, there then exist two matrices $[\Lambda]$ and $[\Omega]$ (of order m and p, respectively) with coefficients in $\mathbb{Z}$ and determinant $\pm 1$ such that
$$[\Omega]^{-1}[C_K][\Lambda] = [C_K]_S$$
with $[C_K]_S$ the $p \times m$ matrix whose leading $m_1 \times m_1$ diagonal block is $\mathrm{diag}(c_1, c_2, \ldots, c_{m_1})$, all its other entries being equal to zero. Clearly, $C_K$ is of rank $m_1$. The components of $\mu_{1,j}$ and $\mu_{0,k}$ in the standard basis of $K(\mathbb{Z})$ form the jth and $(m_1 + k)$th column vectors of $[\Lambda]$, whereas the components of $\gamma_{1,j}$ and $\gamma_{2,i}$ in the standard basis of $F(\mathbb{Z})$ form the jth and $(m_1 + i)$th column vectors of $[\Omega]$. As illustrated in the following section, in most cases encountered in practice, the elementary divisors of $[C_K]$, $c_1, c_2, \ldots, c_{m_1}$, prove to be equal to unity.

B. Examples

In this section we present two examples. The first one concerns a weakly redundant interferometric graph (Section V.B.1), and the second a strongly redundant graph (Section V.B.2).

1. Weakly Redundant Case

Let us consider the four-element array shown in Figure 6 and the corresponding interferometric graph. This graph is complete, with $n = 4$, $q = 6$, and $p = 3$, and weakly redundant: only two baselines are
Figure 6. Top: an example of a weakly redundant array; bottom: corresponding complete interferometric graph.
redundant: baselines (1, 2) and (2, 3). The following vectors form the standard basis of $G_s$:
$$\mu_1 = b_{12} + b_{23}, \quad \mu_2 = b_{13}, \quad \mu_3 = b_{14}, \quad \mu_4 = b_{24}, \quad \mu_5 = b_{34}$$
We now identify K with $G_s$, so that $m = 5$. Note that here, m is strictly greater than p. The vectors $\gamma_1 = b_{23}$, $\gamma_2 = b_{24}$, and $\gamma_3 = b_{34}$, which are the loop-entry vectors of the directed loops (2, 3, 1), (2, 4, 1), and (3, 4, 1), form the standard basis of F. We then have
$$\hat{C}\mu_1 = 2\gamma_1 + \gamma_2, \quad \hat{C}\mu_2 = -\gamma_1 + \gamma_3, \quad \hat{C}\mu_3 = -\gamma_2 - \gamma_3, \quad \hat{C}\mu_4 = \gamma_2, \quad \hat{C}\mu_5 = \gamma_3$$
Matrix $[C_K]$ is therefore of the form
$$[C_K] = \begin{pmatrix} 2 & -1 & 0 & 0 & 0 \\ 1 & 0 & -1 & 1 & 0 \\ 0 & 1 & -1 & 0 & 1 \end{pmatrix}$$
Its Smith normal form is then as follows:
$$[C_K]_S = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{pmatrix}$$
Clearly, $C_K$ is then of rank 3: $m_0 = 2$ and $m_1 = 3$. Here,
$$[\Lambda] = \begin{pmatrix} 1 & 0 & 1 & 1 & 1 \\ 0 & 1 & 2 & 2 & 2 \\ 0 & 0 & 1 & 2 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & -1 \end{pmatrix}$$
The first three columns of $[\Lambda]$ yield the components of $\mu_{1,1}$, $\mu_{1,2}$, and $\mu_{1,3}$ in the standard basis of $K(\mathbb{Z})$; the last two columns yield those of $\mu_{0,1}$ and $\mu_{0,2}$. If need be, the reader may explicitly verify that $\hat{C}\mu_{0,k} = 0$ for $k = 1, 2$. The routines that give the Smith normal form also yield $[\Lambda]^{-1}$. Here,
$$[\Lambda]^{-1} = \begin{pmatrix} 1 & 0 & -1 & 1 & 0 \\ 0 & 1 & -2 & 2 & 0 \\ 0 & 0 & 1 & -2 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & -1 \end{pmatrix}$$
Likewise, we then get
$$[\Omega] = \begin{pmatrix} 2 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix}$$
The columns of $[\Omega]$ yield the components of $\gamma_{1,1}$, $\gamma_{1,2}$, and $\gamma_{1,3}$ in the standard basis of $F(\mathbb{Z})$. The reader may verify that $\hat{C}\mu_{1,j} = \gamma_{1,j}$ for $j = 1, 2, 3$. Here,
$$[\Omega]^{-1} = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 2 & 0 \\ 1 & -2 & 1 \end{pmatrix}$$
The weakest weakly redundant situation corresponds to the nonredundant case. Then, $K = G$, $m = q$, $\mu_{0,k} = \beta_k$ for $1 \le k \le m_0$ with $m_0 = n - 1$, and $\mu_{1,j} = \gamma_j$ for $1 \le j \le m_1$ with $m_1 = p$. We then have $\hat{C}\mu_{1,j} = \gamma_j$ since $\hat{C}\gamma_j = \gamma_j$.
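The decomposition of this example can be checked mechanically. In the sketch below the matrices are those reconstructed from the (partly garbled) printed example, and the product $[\Omega]^{-1}[C_K][\Lambda]$ is verified to be the Smith normal form:

```python
# Smith normal decomposition check for the weakly redundant example.
CK = [[2, -1, 0, 0, 0],
      [1,  0, -1, 1, 0],
      [0,  1, -1, 0, 1]]
Lam = [[1, 0, 1, 1, 1],
       [0, 1, 2, 2, 2],
       [0, 0, 1, 2, 1],
       [0, 0, 0, 1, 0],
       [0, 0, 0, 0, -1]]
Omega_inv = [[0, 1, 0],
             [-1, 2, 0],
             [1, -2, 1]]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

S = matmul(Omega_inv, matmul(CK, Lam))
print(S)   # [[1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [0, 0, 1, 0, 0]]
```

The last two columns of the product are zero, which is the statement that the last two columns of $[\Lambda]$ span the kernel lattice $K_0(\mathbb{Z})$.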
2. Strongly Redundant Case We now consider the six-element array shown in Figure 7 and the corresponding interferometric graph (the same as the one shown in Fig. 3). This graph is incomplete: baselines (2, 3), (2, 4), (2, 5), and (4, 5) are lacking. In this case, n ¼ 6; q ¼ 11, and p ¼ 6. This graph is strongly redundant in the sense that many baselines are redundant. The following vectors form the standard basis of Gs:
Figure 7. Top: an example of a strongly redundant array; bottom: the interferometric graph to be taken into consideration, the same as the one shown in Figure 3.
$$\mu_1 = b_{12} + b_{34} + b_{56}, \quad \mu_2 = b_{13} + b_{35} + b_{46}, \quad \mu_3 = b_{14} + b_{36}, \quad \mu_4 = b_{15} + b_{26}, \quad \mu_5 = b_{16}$$
We now define the unknown-spectral phase space K as the subspace of $G_s$ generated by the vectors $\mu_2$, $\mu_3$, $\mu_4$, and $\mu_5$ ($m = 4$). Note that m is then strictly less than p. The vectors $\gamma_1 = b_{26}$, $\gamma_2 = b_{34}$, $\gamma_3 = b_{35}$, $\gamma_4 = b_{36}$, $\gamma_5 = b_{46}$, and $\gamma_6 = b_{56}$, which are the loop-entry vectors of the directed loops (2, 6, 1), (3, 4, 1), (3, 5, 1), (3, 6, 1), (4, 6, 1), and (5, 6, 1), form the standard basis of F. We then have
$$\hat{C}\mu_2 = \gamma_2 + 2\gamma_3 + \gamma_4 + \gamma_5, \quad \hat{C}\mu_3 = -\gamma_2 + \gamma_4 + \gamma_5, \quad \hat{C}\mu_4 = \gamma_1 - \gamma_3 + \gamma_6, \quad \hat{C}\mu_5 = -\gamma_1 - \gamma_4 - \gamma_5 - \gamma_6$$
Matrix $[C_K]$ is therefore of the form
$$[C_K] = \begin{pmatrix} 0 & 0 & 1 & -1 \\ 1 & -1 & 0 & 0 \\ 2 & 0 & -1 & 0 \\ 1 & 1 & 0 & -1 \\ 1 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \end{pmatrix}$$
Its Smith normal form is then as follows:
$$[C_K]_S = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
Clearly, $C_K$ is then of rank 3: $m_0 = 1$ and $m_1 = 3$. Here,
$$[\Lambda] = \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 2 \end{pmatrix}$$
The first three columns of $[\Lambda]$ yield the components of $\mu_{1,1}$, $\mu_{1,2}$, and $\mu_{1,3}$ in the standard basis of $K(\mathbb{Z})$; the last column yields those of $\mu_{0,1}$. If need be, the reader may explicitly verify that $\hat{C}\mu_{0,1} = 0$. Here,
$$[\Lambda]^{-1} = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & -2 & 1 & 0 \\ 0 & -2 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}$$
Likewise, we then get
$$[\Omega] = \begin{pmatrix} 0 & 1 & -1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 \\ 2 & -1 & 0 & 0 & 0 & 0 \\ 1 & 0 & -1 & 1 & 0 & 0 \\ 1 & 0 & -1 & 0 & 1 & 0 \\ 0 & 1 & -1 & 0 & 0 & 1 \end{pmatrix}$$
The first three columns of $[\Omega]$ yield the components of $\gamma_{1,1}$, $\gamma_{1,2}$, and $\gamma_{1,3}$ in the standard basis of $F(\mathbb{Z})$. The reader may verify that $\hat{C}\mu_{1,j} = \gamma_{1,j}$ for $j = 1, 2, 3$. The last three columns yield the components of $\gamma_{2,1}$, $\gamma_{2,2}$, and $\gamma_{2,3}$. Thus, in this case, $\gamma_{2,1} = \gamma_4$, $\gamma_{2,2} = \gamma_5$, and $\gamma_{2,3} = \gamma_6$. Here,
$$[\Omega]^{-1} = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 2 & -1 & 0 & 0 & 0 \\ -1 & 2 & -1 & 0 & 0 & 0 \\ -1 & 1 & -1 & 1 & 0 & 0 \\ -1 & 1 & -1 & 0 & 1 & 0 \\ -1 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}$$
Figure 8. Geometric representation of the reference algebraic framework. Here, L is the bias phase space, and M its orthogonal complement in the baseline phase space G. The unknown-spectral phase space K is the direct sum of $K_0$ and $K_1$, where $K_0$ is the intersection of K with L. ($K_0$ is not represented in this figure.) Likewise, the loop-entry phase space F is the direct sum of $F_1$ and $F_2$, where $F_1$ is the image of $K_1$ by the spectral phase closure projection $C_K$. The Smith normal decomposition of $C_K$ provides bases for $K_0$, $K_1$, $F_1$, and $F_2$. The phase closure projection $\hat{C}$ maps $K_1$ onto $F_1$. The projection S of G onto M maps $K_1$ and $F_1$ onto $M_1$. The orthogonal complement of $K + L$ in G, $M_+$, is the orthogonal complement of $M_1$ in M.
VI. Reference Algebraic Framework

Let $K_0(\mathbb{Z})$, $K_1(\mathbb{Z})$, $F_1(\mathbb{Z})$, and $F_2(\mathbb{Z})$ be the lattices generated by the bases $\{\mu_{0,k}\}_{k=1}^{m_0}$, $\{\mu_{1,j}\}_{j=1}^{m_1}$, $\{\gamma_{1,j}\}_{j=1}^{m_1}$, and $\{\gamma_{2,i}\}_{i=1}^{p-m_1}$, respectively. The linear spaces generated by the same bases are denoted by $K_0$, $K_1$, $F_1$, and $F_2$, respectively. Clearly, $K(\mathbb{Z})$ can be regarded as the direct sum of $K_0(\mathbb{Z})$ and $K_1(\mathbb{Z})$:
$$K(\mathbb{Z}) = K_0(\mathbb{Z}) + K_1(\mathbb{Z}) \qquad \text{with} \qquad K_0(\mathbb{Z}) \cap K_1(\mathbb{Z}) = \{0\} \tag{15}$$
Likewise, $F(\mathbb{Z})$ can be regarded as the direct sum of $F_1(\mathbb{Z})$ and $F_2(\mathbb{Z})$:
$$F(\mathbb{Z}) = F_1(\mathbb{Z}) + F_2(\mathbb{Z}) \qquad \text{with} \qquad F_1(\mathbb{Z}) \cap F_2(\mathbb{Z}) = \{0\} \tag{16}$$
As a corollary, $K = K_0 + K_1$ with $K_0 \cap K_1 = \{0\}$, and $F = F_1 + F_2$ with $F_1 \cap F_2 = \{0\}$. Furthermore, as $\hat{C}\mu_{1,j} = c_j \gamma_{1,j}$ for $j = 1, \ldots, m_1$, $\hat{C}$ maps $K_1$ onto $F_1$ (see Fig. 8). The image of $K_1(\mathbb{Z})$ by $\hat{C}$ is a lattice of rank $m_1$. This lattice coincides with $F_1(\mathbb{Z})$ iff (if and only if) all the elementary divisors of $C_K$ are equal to unity. According to Eq. (15), any $\kappa$ in $K(\mathbb{Z})$ can be decomposed in the form
$$\kappa = \kappa_0 + \kappa_1$$
in which
$$\kappa_0 = \sum_{k=1}^{m_0} \kappa_{0,k}\, \mu_{0,k}, \qquad \kappa_1 = \sum_{j=1}^{m_1} \kappa_{1,j}\, \mu_{1,j}$$
The integers $\kappa_{0,k}$ for $1 \le k \le m_0$ and $\kappa_{1,j}$ for $1 \le j \le m_1$ are the components of $\kappa$ in the basis $\{\mu_{0,k}\}_{k=1}^{m_0} \cup \{\mu_{1,j}\}_{j=1}^{m_1}$. Likewise [see Eq. (16)], any $\nu$ in $F(\mathbb{Z})$ can be decomposed in the form
$$\nu = \nu_1 + \nu_2$$
in which
$$\nu_1 = \sum_{j=1}^{m_1} \nu_{1,j}\, \gamma_{1,j}, \qquad \nu_2 = \sum_{i=1}^{p-m_1} \nu_{2,i}\, \gamma_{2,i}$$
The integers $\nu_{1,j}$ for $1 \le j \le m_1$ and $\nu_{2,i}$ for $1 \le i \le p - m_1$ are the components of $\nu$ in the basis $\{\gamma_{1,j}\}_{j=1}^{m_1} \cup \{\gamma_{2,i}\}_{i=1}^{p-m_1}$. In this notation,
$$\hat{C}\kappa = \sum_{i=1}^{p} \kappa^{(i)} \gamma_i = \sum_{j=1}^{m_1} c_j\, \kappa_{1,j}\, \gamma_{1,j}$$
Let us now introduce the vectors of M (see Fig. 8):
$$\varepsilon_{1,j} := S\gamma_{1,j}, \qquad \varepsilon_{2,i} := S\gamma_{2,i}$$
As $\{\gamma_{1,j}\}_{j=1}^{m_1} \cup \{\gamma_{2,i}\}_{i=1}^{p-m_1}$ is a basis of F, $\{\varepsilon_{1,j}\}_{j=1}^{m_1} \cup \{\varepsilon_{2,i}\}_{i=1}^{p-m_1}$ is a basis of M (see Appendix 1). Furthermore, since $c_j S\gamma_{1,j} = S\hat{C}\mu_{1,j} = S\mu_{1,j}$, we have
$$\varepsilon_{1,j} = \frac{1}{c_j}\, S\mu_{1,j} \qquad (1 \le j \le m_1)$$
S therefore maps $K_1$ onto $M_1 := SK_1$. Clearly, the $\varepsilon_{1,j}$ form a basis of $M_1$ (see Fig. 8). Let $K_+$ be the orthogonal complement of $K_0$ in K. Denoting by $T_+$ the projection of K onto $K_+$, we set
$$\mu_{+,1,j} := T_+ \mu_{1,j} \tag{17}$$
The $\mu_{+,1,j}$ (for $1 \le j \le m_1$) form a basis of $K_+$ (see Appendix 1). As $S\mu_{1,j} = ST_+\mu_{1,j} = S\mu_{+,1,j}$, we also have
$$\varepsilon_{1,j} = \frac{1}{c_j}\, S\mu_{+,1,j}$$
From its definition, $M_1$ proves to be the orthogonal complement of L in $K + L$. As a result, $M_+ := (K + L)^\perp$ is the orthogonal complement of $M_1$ in M (see Fig. 8). Denoting by $U_1$ and $U_+$ the projections of G onto $M_1$ and $M_+$, respectively, we thus have
$$S = U_1 + U_+ \tag{18}$$
In this context, we set (see Fig. 8)
$$\varepsilon_{1,1,j} := U_1 \varepsilon_{1,j} = \varepsilon_{1,j}, \qquad \varepsilon_{+,1,j} := U_+ \varepsilon_{1,j} = 0 \qquad (1 \le j \le m_1) \tag{19}$$
and
$$\varepsilon_{1,2,i} := U_1 \varepsilon_{2,i}, \qquad \varepsilon_{+,2,i} := U_+ \varepsilon_{2,i} \qquad (1 \le i \le p - m_1) \tag{20}$$
The $\varepsilon_{+,2,i}$ (for $1 \le i \le p - m_1$) form a basis of $M_+$ (see Appendix 1). As $\varepsilon_{1,j} = (1/c_j) S\mu_{1,j}$, $\varepsilon_{1,j} = U_1\varepsilon_{1,j}$ and $U_1 S = U_1$, we also have
$$\varepsilon_{1,j} = \frac{1}{c_j}\, U_1 \mu_{1,j} \qquad (1 \le j \le m_1) \tag{21}$$
As specified in the following sections, in the general case of redundant interferometric graphs, the statement of the phase calibration problem leads us to consider two integer ambiguity problems, successively: P1 and P2. The vectors $\varepsilon_{+,2,i}$ (for $1 \le i \le p - m_1$) are the canonical basis vectors of the $\mathbb{Z}$-lattice involved in the nearest lattice node problem P1, whereas the vectors $\mu_{+,1,j}$ (for $1 \le j \le m_1$) are the canonical basis vectors of the $\mathbb{Z}$-lattice involved in the nearest lattice node problem P2. Furthermore, the vectors $\varepsilon_{1,2,i}$ (for $1 \le i \le p - m_1$) also play an important part in the statement of P1. All these vectors can be explicitly obtained as indicated in Appendix 3.
VII. Statement of the Phase Calibration Problem Let bxe be the nearest rational integer to x, and {x} be the discrepancy between x and this integer: {x} :¼ x bxe. The value of wrapped into the interval (; ) is denoted by arc( ). Thus, arcð Þ ¼ 2{ _}
where
_ :¼ 2
ð22Þ
Let be any function in the baseline phase space G; the discrepancy between _ :¼ =ð2Þ and the nearest lattice node of G(Z) (for the distance defined in G ) is the function
{_ } ¼ _ b_ e
ð23Þ
arcðÞ ¼ 2{_ }
ð24Þ
where b_ eð j; kÞ :¼ b_ ð j; kÞe. Note that
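A minimal sketch of these wrapping operators (the tie-breaking rule at half-integers is an assumption; the text only requires $\{x\}$ to be the discrepancy to the nearest integer):

```python
import math

# <x> is the nearest rational integer to x (ties broken so that {x} lies
# in (-1/2, 1/2], consistent with wrapping into (-pi, pi]),
# {x} := x - <x>, and arc(psi) = 2*pi*{psi/(2*pi)}.
def nearest_int(x):
    return math.ceil(x - 0.5)

def frac(x):
    return x - nearest_int(x)

def arc(psi):
    return 2.0 * math.pi * frac(psi / (2.0 * math.pi))

print(abs(arc(3.0 * math.pi / 2.0) + math.pi / 2.0) < 1e-12)   # True
```

For example, an angle of $3\pi/2$ is wrapped to $-\pi/2$, as the last line checks.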
In the process of stating the phase calibration problem, the guiding idea is to minimize the functional [see Eq. (7)]
$$f_1 : K \times L \to \mathbb{R}, \qquad f_1(\vartheta, \varphi) := \|\mathrm{arc}\{(\psi_e - \varphi) - (\vartheta_m + \vartheta)\}\|^2 \tag{25}$$
Setting
$$\psi := \psi_e - \vartheta_m \tag{26}$$
we have
$$f_1(\vartheta, \varphi) = \|\mathrm{arc}\{\psi - (\vartheta + \varphi)\}\|^2 \tag{27}$$
Let $(\vartheta_1, \varphi_1)$ now be a point of $K \times L$ at which the minimum of $f_1$ is attained. The quantity
$$\delta\psi := \mathrm{arc}\{(\psi_e - \varphi_1) - (\vartheta_m + \vartheta_1)\} = \mathrm{arc}\{\psi - (\vartheta_1 + \varphi_1)\} \tag{28}$$
is then referred to as the ``phase calibration discrepancy.'' In the general case where $K_0$ (the intersection of K with L) is not reduced to $\{0\}$, let us consider the set
$$S := \{(\vartheta_1 - \varphi - 2\pi\kappa,\; \varphi_1 + \varphi - 2\pi\lambda) : \varphi \in K_0,\; \kappa \in K(\mathbb{Z}),\; \lambda \in L(\mathbb{Z})\}$$
With regard to the minimum of $f_1$, all the points of S are equivalent. In this context, the points of physical interest are those for which the size of $\vartheta_1 - \varphi - 2\pi\kappa$ is minimum. To define the final solution(s), the idea is therefore to minimize the functional
$$f_2 : K_0 \to \mathbb{R}, \qquad f_2(\varphi) := \|\mathrm{arc}(\vartheta_1 - \varphi)\|^2 \tag{29}$$
Denoting by $\varphi_\star$ a bias phase for which the minimum of $f_2$ is obtained, the quantity
$$\vartheta_{m\star} := \vartheta_m + \vartheta_\star \tag{30}$$
in which
$$\vartheta_\star := \mathrm{arc}(\vartheta_1 - \varphi_\star) \tag{31}$$
is the ``constrained model phase''; $\vartheta_\star$ is the ``optimal model phase shift'' (see Fig. 9). The ``calibrated phase'' is then defined by the formula
$$\psi_{e\star} := \psi_e - \beta_\star \tag{32}$$
where
$$\beta_\star = \varphi_1 + \varphi_\star \tag{33}$$
is said to be an ``optimal bias phase.'' This phase distribution is defined modulo $2\pi$ in L. This means that for any $\lambda \in L(\mathbb{Z})$, $\beta_\star - 2\pi\lambda$ is a solution in $\beta_\star$.
Figure 9. Phase calibration terminology in the general case of redundant arrays: $\vartheta_\star$ is the optimal model phase shift, $\beta_\star$ is an optimal bias phase, $\vartheta_{m\star}$ is the constrained model phase, $\psi_{e\star}$ is the calibrated phase, and $\delta\psi$ is the phase calibration discrepancy. These functions take their values on the set of baselines of the interferometric graph. A trigonometric representation of this type can thus be associated with each baseline.
Note that the phase calibration discrepancy $\delta\psi$, as defined in Eq. (28), is also equal to $\mathrm{arc}(\psi_{e\star} - \vartheta_{m\star})$ (see Fig. 9). Given r in A, a pupil phase $\alpha_\star \in H_r$ such that $B\alpha_\star = \beta_\star$ is said to be an ``optimal pupil phase.'' Clearly, an optimal pupil phase is defined modulo $2\pi$ in $H_r$: for any $a \in H_r(\mathbb{Z})$, $\alpha_\star - 2\pi a$ is a solution in $\alpha_\star$. The problem of minimizing $f_1$ is first considered (Section VIII), and then that of minimizing $f_2$ (Section IX).
VIII. Phase Calibration Discrepancy and Related Results

According to Eqs. (25), (26), and (27), the search for the phase calibration discrepancy (28) leads us to consider the functional
$$f_{11} : K + L \to \mathbb{R}, \qquad f_{11}(\xi) := \|\mathrm{arc}(\psi - \xi)\|^2$$
$\xi$ is of the form $\vartheta + \varphi$ with $\vartheta \in K$ and $\varphi \in L$. As $\mathrm{arc}(\psi - \xi) = 2\pi\{\dot\psi - \dot\xi\}$, where $\dot\psi := \psi/(2\pi)$ and $\dot\xi := \xi/(2\pi)$ [see Eqs. (24), (23), and (22)], the problem of minimizing $f_{11}$ is equivalent to the one of minimizing
$$f_{12} : K + L \to \mathbb{R}, \qquad f_{12}(\dot\xi) := \|\{\dot\psi - \dot\xi\}\|^2$$
Note that $\{\dot\psi - \dot\xi\}$ is the discrepancy between $\dot\psi - \dot\xi$ and the nearest lattice node of $G(\mathbb{Z})$. The problem is therefore to identify the nodes of $G(\mathbb{Z})$ closest
to the affine space parallel to $K + L$ and passing through $\dot\psi$. We therefore have to minimize in $G(\mathbb{Z})$ the norm of the projection of $\nu - \dot\psi$ onto $(K + L)^\perp$. The nodes at which the minimum is attained are defined up to a node of $K(\mathbb{Z}) + L(\mathbb{Z})$. The bulk of the problem is therefore to find a minimum of the functional
$$f_{13} : G(\mathbb{Z}) \to \mathbb{R}, \qquad f_{13}(\nu) := \|U_+(\dot\psi - \nu)\|^2 = \|U_+(S\dot\psi - S\nu)\|^2$$
in which $U_+$ is the orthogonal projection onto $M_+ := (K + L)^\perp$. Let $\dot\psi^{(i)}$ for $1 \le i \le p$ be the closure terms of $\dot\psi$ in the standard basis of F. In the notation adopted here, we denote by $\dot\psi_{1,j}$ (for $1 \le j \le m_1$) and by $\dot\psi_{2,i}$ (for $1 \le i \le p - m_1$) the closure terms of $\dot\psi$ in the basis $\{\gamma_{1,j}\}_{j=1}^{m_1} \cup \{\gamma_{2,i}\}_{i=1}^{p-m_1}$ (see Section V.A). In matrix form, these closure vectors are explicitly related as follows:
$$\begin{pmatrix} [\dot\psi_1] \\ [\dot\psi_2] \end{pmatrix} = [\Omega]^{-1}\, [\dot\psi_\Gamma] \tag{34}$$
where $[\dot\psi_\Gamma]$ denotes the column vector with components $\dot\psi^{(i)}$.
As
$$U_+ S\dot\psi = U_+\left( \sum_{j=1}^{m_1} \dot\psi_{1,j}\, \varepsilon_{1,j} + \sum_{i=1}^{p-m_1} \dot\psi_{2,i}\, \varepsilon_{2,i} \right)$$
we have, since $U_+\varepsilon_{1,j} = 0$ and $\varepsilon_{+,2,i} := U_+\varepsilon_{2,i}$ [see Eqs. (19) and (20)],
$$U_+ S\dot\psi = \sum_{i=1}^{p-m_1} \dot\psi_{2,i}\, \varepsilon_{+,2,i}$$
Likewise,
$$U_+ S\nu = \sum_{i=1}^{p-m_1} \nu_{2,i}\, \varepsilon_{+,2,i}$$
Minimizing $f_{13}$ therefore leads to minimizing the functional
$$f_{14} : \mathbb{Z}^{p-m_1} \to \mathbb{R}, \qquad f_{14}(m_2) := \left\| \sum_{i=1}^{p-m_1} (\nu_{2,i} - \dot\psi_{2,i})\, \varepsilon_{+,2,i} \right\|^2$$
In this integer ambiguity problem, referred to as P1, $m_2$ is the vector of $\mathbb{Z}^{p-m_1}$ whose components are the $\nu_{2,i}$. Denoting by $\dot{c}_2$ the vector of $\mathbb{R}^{p-m_1}$ with components $\dot\psi_{2,i}$, we have
$$f_{14}(m_2) = q_1(m_2 - \dot{c}_2) \tag{35}$$
where $q_1$ is the quadratic form
$$q_1 : \mathbb{R}^{p-m_1} \to \mathbb{R}, \qquad q_1(z) := \left\| \sum_{i=1}^{p-m_1} z^{[i]}\, \varepsilon_{+,2,i} \right\|^2 \tag{36}$$
In the standard basis of $\mathbb{R}^{p-m_1}$, the matrix elements of $q_1$ are the inner products $(\varepsilon_{+,2,i} \mid \varepsilon_{+,2,i'})$. Let $m_2^\star$ be the solution of P1, i.e., the point of $(\mathbb{Z}^{p-m_1}, q_1)$ closest to $\dot{c}_2$ (see Appendix 4). Clearly, according to the definition of $f_{14}$, $\sum_{i=1}^{p-m_1} \nu_{2,i}^\star\, \varepsilon_{+,2,i}$ is the node of lattice $U_+G(\mathbb{Z})$ closest to $U_+\dot\psi$. Let us now set
$$\delta\dot\psi_{2,i} := \dot\psi_{2,i} - \nu_{2,i}^\star \qquad (1 \le i \le p - m_1) \tag{37}$$
$$\delta\dot\psi := \sum_{i=1}^{p-m_1} \delta\dot\psi_{2,i}\, \varepsilon_{+,2,i} \tag{38}$$
and
$$\nu_2^\star := \sum_{i=1}^{p-m_1} \nu_{2,i}^\star\, \gamma_{2,i} \tag{39}$$
Vector $\delta\dot\psi$, which lies in $M_+$, is none other than the phase calibration discrepancy up to a factor $2\pi$: $\delta\psi = 2\pi\,\delta\dot\psi$ [see Eqs. (7), (28), and the successive definitions of $f_{11}$, $f_{12}$, $f_{13}$, $f_{14}$]. The nodes of $G(\mathbb{Z})$ at which the minimum of $f_{13}$ is attained are equal to $\nu_2^\star$ up to a node of $K(\mathbb{Z}) + L(\mathbb{Z})$. Denoting by $\dot\xi_\star$ the value of $\dot\xi$ corresponding to $\nu_2^\star$, we have
$$\delta\dot\psi := \{\dot\psi - \dot\xi_\star\} = (\dot\psi - \dot\xi_\star) - \nu_2^\star$$
hence
$$\dot\xi_\star = \dot\psi - \delta\dot\psi - \nu_2^\star \tag{40}$$
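For a small number of degrees of freedom, a nearest lattice node problem of this type can even be solved by exhaustive search around the rounded target. A sketch (the Gram matrix is made up; practical solvers would first reduce the lattice basis and run a discrete search, cf. Appendix 4):

```python
import itertools
import math

def nearest_node(Q, x, radius=2):
    """Integer vector k minimizing q(k - x), with q given by Gram matrix Q
    (brute force within `radius` of the componentwise rounding of x)."""
    n = len(x)
    def q(z):
        return sum(z[i] * Q[i][j] * z[j] for i in range(n) for j in range(n))
    center = [round(xi) for xi in x]
    best, best_val = None, math.inf
    for delta in itertools.product(range(-radius, radius + 1), repeat=n):
        k = [c + d for c, d in zip(center, delta)]
        val = q([ki - xi for ki, xi in zip(k, x)])
        if val < best_val:
            best, best_val = k, val
    return best

Q = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]   # made-up form
print(nearest_node(Q, [0.2, 0.9, -1.4]))   # [0, 1, -1]
```

When the quadratic form is far from diagonal, the nearest node need not be the componentwise rounding, which is exactly why the search step matters.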
As $K + L$ can be regarded as the direct sum of $K_1$ and L (see Section VI), $\dot\xi_\star$ can be uniquely decomposed in the form
$$\dot\xi_\star = \dot\vartheta_1 + \dot\varphi_1 \tag{41}$$
with $\dot\vartheta_1$ in $K_1$ and $\dot\varphi_1$ in L. The point $(2\pi\dot\vartheta_1, 2\pi\dot\varphi_1)$ is therefore a point $(\vartheta_1, \varphi_1)$ of $K \times L$ at which the minimum of $f_1$ is attained: $\vartheta_1 = 2\pi\dot\vartheta_1$ and $\varphi_1 = 2\pi\dot\varphi_1$. It therefore remains to perform decomposition [Eq. (41)]. First note that $U_1$, the orthogonal projection onto $M_1$ (see Section VI), is equal to $U_1 S$. It then follows from Eq. (41) that
$$U_1 \dot\vartheta_1 = U_1 \dot\xi_\star$$
hence, from Eq. (40) (since $\delta\dot\psi$ is orthogonal to $M_1$),
$$U_1 \dot\vartheta_1 = U_1 \dot\psi - U_1 \nu_2^\star$$
But (see Section VI),
$$U_1 \dot\psi = U_1 S\dot\psi = U_1\left( \sum_{j=1}^{m_1} \dot\psi_{1,j}\, \varepsilon_{1,j} + \sum_{i=1}^{p-m_1} \dot\psi_{2,i}\, \varepsilon_{2,i} \right) = \sum_{j=1}^{m_1} \dot\psi_{1,j}\, \varepsilon_{1,j} + \sum_{i=1}^{p-m_1} \dot\psi_{2,i}\, \varepsilon_{1,2,i}$$
Furthermore, from Eq. (39),
$$U_1 \nu_2^\star = U_1 S\nu_2^\star = U_1\left( \sum_{i=1}^{p-m_1} \nu_{2,i}^\star\, \varepsilon_{2,i} \right) = \sum_{i=1}^{p-m_1} \nu_{2,i}^\star\, \varepsilon_{1,2,i}$$
As a result [see Eq. (37)],
$$U_1 \dot\vartheta_1 = \sum_{j=1}^{m_1} \dot\psi_{1,j}\, \varepsilon_{1,j} + \sum_{i=1}^{p-m_1} \delta\dot\psi_{2,i}\, \varepsilon_{1,2,i}$$
As $\varepsilon_{1,2,i}$ can be expressed as a linear combination of the $\varepsilon_{1,j}$ (see Appendix 3), it follows that
$$U_1 \dot\vartheta_1 = \sum_{j=1}^{m_1} \dot\vartheta_{12,j}\, \varepsilon_{1,j}$$
in which the $\dot\vartheta_{12,j}$ can be explicitly determined. But $\varepsilon_{1,j} = U_1\mu_{1,j}/c_j$, where the $c_j$'s are the elementary divisors of $[C_K]$ [Eq. (21)]. Consequently,
$$\dot\vartheta_1 = \sum_{j=1}^{m_1} \dot\vartheta_{1,j}\, \mu_{1,j} \qquad \text{with} \qquad \dot\vartheta_{1,j} = \frac{\dot\vartheta_{12,j}}{c_j} \tag{42}$$
The component $\dot\varphi_1$ immediately follows from Eqs. (41) and (40):
$$\dot\varphi_1 = \dot\psi - \delta\dot\psi - \nu_2^\star - \dot\vartheta_1 \tag{43}$$
IX. Optimal Model Phase Shift and Related Results

According to Eqs. (29), (24), (23), and (22), the search for the optimal model phase shift [Eq. (31)] leads us to minimize the objective functional
$$f_{22} : K_0 \to \mathbb{R}, \qquad f_{22}(\dot\varphi) := \|\{\dot\vartheta_1 - \dot\varphi\}\|^2$$
As a result, we have to identify the nodes of $K(\mathbb{Z})$ closest to the affine space parallel to $K_0$ and passing through $\dot\vartheta_1$. We therefore have to minimize in $K(\mathbb{Z})$ the norm of the projection of $\kappa - \dot\vartheta_1$ onto $K_+$ (the orthogonal complement of $K_0$ in K). The nodes at which the minimum is attained are defined up to a node of $K_0(\mathbb{Z})$. The bulk of the problem is therefore to find a minimum of the functional
$$f_{23} : K(\mathbb{Z}) \to \mathbb{R}, \qquad f_{23}(\kappa) := \|T_+(\dot\vartheta_1 - \kappa)\|^2$$
in which $T_+$ is the projection of K onto $K_+$ (see Section VI). Let $\kappa_0$ and $\kappa_1$ be the components of $\kappa$ on $K_0(\mathbb{Z})$ and $K_1(\mathbb{Z})$. Expanding $\dot\vartheta_1$ and $\kappa_1$ in the forms
in which T+ is the projection of K onto K+ (see Section VI). Let 0 and 1 be the components of on K0 (Z) and K1(Z). Expanding #_ 1 and 1 in the forms m1 m1 X X
1; j 1; j
1 :¼ #_ 1; j 1; j #_ 1 :¼ j¼1
j¼1
we have, since þ;1; j :¼ Tþ 1; j [Eq. (17)], m1 X Tþ ð#_ 1 Þ ¼ ð#_ 1; j 1; j Þþ;1; j j¼1
The problem is therefore to minimize the functional 2 m1 X m1 _ f24 : Z ! R; f24 ðr1 Þ :¼ ð#1; j 1; j Þþ;1; j j¼1
In this integer ambiguity problem, referred to as P2, r1 is the vector of Zm1 whose components are the 1, j. Denoting by q_ 1 the vector of Rm1 with components #_ 1; j , we have f24 ðr Þ ¼ q2 ðr q_ 1 Þ ð44Þ 1
1
where q2 is the quadratic form q2 : Rm1 ! R;
2 m1 X q2 ðzÞ :¼ ð jÞ þ;1; j j¼1
ð45Þ
In the standard basis of Rm1 , the matrix elements of q2 are the inner products ðþ;1; j j þ;1; j 0 Þ. Let r1 be the solution of P2, i.e., the point of (Zm1 ; q2 Þ closestPto q_ 1 (see Appendix 4). Clearly, according to the definition 1 _ of f24 ; m j¼1 1; j þ;1; j is the node of lattice Tþ KðZÞ closest to Tþ #1 .
Let us now set
$$\delta\dot\vartheta_{1,j} := \dot\vartheta_{1,j} - \kappa_{1,j}^\star \qquad (1 \le j \le m_1) \tag{46}$$
$$\delta\dot\vartheta_1 := \sum_{j=1}^{m_1} \delta\dot\vartheta_{1,j}\, \mu_{+,1,j} \tag{47}$$
and
$$\kappa_1^\star := \sum_{j=1}^{m_1} \kappa_{1,j}^\star\, \mu_{1,j} \tag{48}$$
The nodes of $K(\mathbb{Z})$ at which the minimum of $f_{23}$ is attained are equal to $\kappa_1^\star$ up to a node of $K_0(\mathbb{Z})$. Denoting by $\dot\varphi_\star$ the value of $\dot\varphi$ corresponding to $\kappa_1^\star$, we have
$$\delta\dot\vartheta_1 := \{\dot\vartheta_1 - \dot\varphi_\star\} = (\dot\vartheta_1 - \dot\varphi_\star) - \kappa_1^\star$$
hence
$$\dot\varphi_\star = \dot\vartheta_1 - \delta\dot\vartheta_1 - \kappa_1^\star \tag{49}$$
The optimal model phase shift $\vartheta_\star$ and the bias phase $\varphi_\star$ are then, respectively, given by the formulas
$$\vartheta_\star = 2\pi\,\delta\dot\vartheta_1, \qquad \varphi_\star = 2\pi\,\dot\varphi_\star$$

A. Optimal Bias Phase

The optimal bias phase is defined as
$$\beta_\star = 2\pi(\dot\varphi_1 + \dot\varphi_\star)$$
in which $\dot\varphi_1 = \dot\psi - \delta\dot\psi - \nu_2^\star - \dot\vartheta_1$ and $\dot\varphi_\star = \dot\vartheta_1 - \delta\dot\vartheta_1 - \kappa_1^\star$ [see Eqs. (33), (43), and (49)]. As $\dot\varphi_1 + \dot\varphi_\star = \dot\psi - \delta\dot\vartheta_1 - \delta\dot\psi - (\kappa_1^\star + \nu_2^\star)$, it follows that
$$\beta_\star = \psi - \vartheta_\star - \delta\psi - 2\pi(\kappa_1^\star + \nu_2^\star) \tag{50}$$
B. Optimal Pupil Phase

Given r in A, let $B_r$ be the operator from $H_r$ into E induced by B (see Sections II.A, II.D, and II.E); $B_r$ is invertible. Indeed, $B_r$ is injective, and
$$\dim H_r = \dim E = n - 1$$
The optimal pupil phase, which is defined modulo $2\pi$ in $H_r$, is then given by the formula
$$\alpha_\star = B_r^{-1} P\beta_\star$$
Note that $P\beta_\star$, the projection of $\beta_\star$ onto E, is none other than the restriction of $\beta_\star$ to the directed baselines of the corresponding spanning tree. From Eq. (50), and since $\nu_2^\star$ has its support outside the spanning tree, we therefore have
$$P\beta_\star = P(\psi - \vartheta_\star - \delta\psi - 2\pi\kappa_1^\star)$$
As clarified below, the inverse of $B_r$ may be obtained by performing the Smith normal decomposition of $B_r$. Let $[B_r]$ be the matrix of $B_r$ in the standard bases of $H_r$ and E (see Section II). Note that its entries are equal to $\pm 1$ or 0. As $B_r$ maps $H_r$ onto E, the column vectors of $[B_r]$ form a basis of E. The elementary divisors of $[B_r]$ are therefore equal to unity (see Appendix 4). As a result, the Smith normal form of $[B_r]$ is the identity matrix on $\mathbb{R}^{n-1}$: $I_{n-1}$. The related decomposition is therefore of the form $[I_{n-1}] = [D_r']^{-1}[B_r][D_r]$ with $[D_r'] = [I_{n-1}]$, hence $[B_r]^{-1} = [D_r]$.

X. Special Cases

In this section, we successively consider the special cases where (1) problem P1 disappears (Section X.A), (2) problem P2 is trivial (Section X.B), and (3) problem P1 disappears and problem P2 is trivial (Section X.C).

A. Special Case Where $m_1 = p$

As $m_1$ is the rank of $C_K$, $m_1$ is less than or equal to p. In this section, we consider the special case where $m_1 = p$. Note that this is typically the case for nonredundant arrays with $K = G$. Then, $K_0 = L$ and $K_1 = F$, hence $m_0 = n - 1$ and therefore
$$m_1 = m - m_0 = q - (n - 1) = p$$
The condition $m_1 = p$ may also be satisfied in the more general case where m is simply greater than p, i.e., in the case of weakly redundant arrays (see the example given in Section V.B.1). When $m_1 = p$, $K + L$ coincides with G. Indeed, $K + L = K_1 + L$ with $\hat{C}K_1 = F$, and G is the direct sum of L and F (see Section III). The phase calibration discrepancy is therefore reduced to zero:
$$\delta\psi = 0$$
As a result, the integer ambiguity problem P1 disappears, and Eq. (34) collapses to
$$[\dot\psi_1] = [\Omega]^{-1}\, [\dot\psi_\Gamma] \tag{51}$$
The $\vartheta$-solution in $K_1$ is therefore of the form [compare with Eq. (42)]
$$\dot\vartheta_1 = \sum_{j=1}^{p} \frac{\dot\psi_{1,j}}{c_j}\, \mu_{1,j} \tag{52}$$
Solving the integer ambiguity problem P2 then yields $\vartheta_\star$ and $\kappa_1^\star$. Modulo $2\pi$ in L, the optimal bias phase is then given by the formula [see Eq. (50)]
$$\beta_\star = \psi - \vartheta_\star - 2\pi\kappa_1^\star$$
Let us finally note that in the special case of nonredundant arrays with $K = G$, we have $\mu_{1,j} = \gamma_j$ for $1 \le j \le p$. As the elementary divisors of $[C_K]$ are then equal to unity, Eq. (52) then collapses to
$$\dot\vartheta_1 = \sum_{j=1}^{p} \dot\psi^{(j)} \gamma_j$$
Furthermore, $K_+ = M$ with $\mu_{+,1,j} = S\gamma_j = \varepsilon_j$. It then follows from the analysis presented in Section IV that, in the standard basis of $\mathbb{R}^p$, the matrix of the quadratic form $q_2$ is the inverse of the variance–covariance matrix $[\Gamma]$ of the closure phases. The search for a reduced basis of lattice $(\mathbb{Z}^p, q_2)$ then corresponds to a decorrelation process (see Appendix 4).
B. Special Case Where $m_1 = m$ with $m < p$

When, for a given choice of K, the spectral phase closure operator $C_K$ is injective, one says that the interferometric device is of ``full phase,'' and one speaks of redundant spacing calibration (RSC: Lannes and Anterrieu, 1999). This situation arises when operating on strongly redundant arrays ($m < p$). Then, $K_0 = \{0\}$, $K_1 = K$, $m_0 = 0$, $m_1 = m$. In this special case, once the integer ambiguity problem P1 has been solved, the particular solution $\dot\vartheta_1$ proves to be of the form
$$\dot\vartheta_1 = \sum_{j=1}^{m} \frac{\dot\vartheta_{12,j}}{c_j}\, \mu_{1,j}$$
As $K_0$ is reduced to $\{0\}$, we then have $\vartheta_\star = \mathrm{arc}(\vartheta_1)$. The integer ambiguity problem P2 is therefore trivial: $2\pi\kappa_1^\star = \vartheta_1 - \vartheta_\star$. Modulo $2\pi$ in L, the optimal bias phase is then given by the formula [see Eq. (50)]
$$\beta_\star = \psi - \vartheta_1 - \delta\psi - 2\pi\nu_2^\star$$
317
PHASE CLOSURE IMAGING
C. Special Case Where $m_1 = m$ with $m = p$

This situation corresponds to what is called ``critical redundancy'': a full-phase situation with $m = p$. In this case, the integer ambiguity problem P1 disappears, and P2 is trivial. We then have $\vartheta_\star = \mathrm{arc}(\vartheta_1)$ where (see Sections X.A and X.B)
$$\dot\vartheta_1 = \sum_{j=1}^{m} \frac{\dot\psi_{1,j}}{c_j}\, \mu_{1,j}$$
Modulo $2\pi$ in L, the optimal bias phase is then given by the formula
$$\beta_\star = \psi - \vartheta_1$$
XI. Simulated Example

The simulation presented in this section concerns the six-element array and the corresponding interferometric graph introduced in Section V.B.2 (see Fig. 7). The object spectral phase was assumed to be known on baselines (1, 2), (3, 4), and (5, 6): $\vartheta_r = 0$. The numbers of degrees of freedom of the integer ambiguity problems P1 and P2 are then both equal to 3: $p - m_1 = 3$, $m_1 = 3$. As specified in Sections II.B and II.C, the weight function $\varpi$ involved in the definition of the inner product must satisfy the redundancy constraint. In the simulation presented in this section, $\varpi$ was defined by the following components:
$$\varpi(1,2) \simeq 0.21, \quad \varpi(1,3) \simeq 0.06, \quad \varpi(1,4) \simeq 0.05, \quad \varpi(1,5) \simeq 0.03, \quad \varpi(1,6) \simeq 0.01$$
Here, these components are normalized so that $\sum_{(j,k) \in B} \varpi(j,k) = 1$ for the graph shown in Figure 7. All the elements involved in the integer ambiguity problems P1 and P2 can then be easily computed. The basic components of the object spectral phase $\vartheta_o$ were set equal to the following values:
$$\vartheta_o(1,2) = 0^\circ, \quad \vartheta_o(1,3) = 172^\circ, \quad \vartheta_o(1,4) = 40^\circ, \quad \vartheta_o(1,5) = 10^\circ, \quad \vartheta_o(1,6) = 15^\circ$$
The experimental baseline phases $\psi_e(j,k)$ were simulated by referring to Eq. (7) with $\vartheta_m = 0$ and $\vartheta = \vartheta_o$. The pupil phases $\alpha(j)$ were randomly distributed on the trigonometric circle, and the error term e was taken into account by adding Gaussian phase noise; its standard deviation was set equal to $3.9^\circ$. The values thus obtained were
$$\begin{aligned}
&\psi_e(1,2) \simeq -3.1^\circ, \quad \psi_e(1,3) \simeq 67.5^\circ, \quad \psi_e(1,4) \simeq -26.6^\circ, \quad \psi_e(1,5) \simeq -102.1^\circ,\\
&\psi_e(1,6) \simeq 138.6^\circ, \quad \psi_e(2,6) \simeq 161.1^\circ, \quad \psi_e(3,4) \simeq 147.8^\circ, \quad \psi_e(3,5) \simeq -153.9^\circ,\\
&\psi_e(3,6) \simeq -128.3^\circ, \quad \psi_e(4,6) \simeq -40.5^\circ, \quad \psi_e(5,6) \simeq -101.3^\circ
\end{aligned}$$
hence, for $\vartheta_m \equiv 0$, the closure phases of $\psi$:
$$\psi^{(1)} \simeq 19.3^\circ, \quad \psi^{(2)} \simeq 241.9^\circ, \quad \psi^{(3)} \simeq 15.7^\circ, \quad \psi^{(4)} \simeq -199.5^\circ, \quad \psi^{(5)} \simeq -205.7^\circ, \quad \psi^{(6)} \simeq -342.1^\circ$$
The change of variable (34) then gave
$$\psi_{1,1} \simeq 241.9^\circ, \quad \psi_{1,2} \simeq 468.0^\circ, \quad \psi_{1,3} \simeq 448.7^\circ, \quad \psi_{2,1} \simeq 7.4^\circ, \quad \psi_{2,2} \simeq 1.2^\circ, \quad \psi_{2,3} \simeq -361.3^\circ$$
The solution of the integer ambiguity problem P1 proved then to be $m_2^\star = \lfloor \dot{c}_2 \rceil$ [see Eqs. (36) and (35)]:
$$\nu_{2,1}^\star = 0, \qquad \nu_{2,2}^\star = 0, \qquad \nu_{2,3}^\star = -1$$
The norm of the phase calibration discrepancy $\delta\psi$ was then of the order of $0.99^\circ$. Solving the integer ambiguity problem P2 was not so easy: $r_1^\star$ was different from $\lfloor \dot{q}_1 \rceil$ [see Eqs. (45) and (44)]. However, as P2 was of small dimension ($m_1 = 3$), it was not necessary to search for a reduced basis of $(\mathbb{Z}^{m_1}, q_2)$ (see Appendix 4): the discrete search algorithm was simply applied to the matrix of $q_2$ in the standard basis of $\mathbb{R}^3$. The ambiguities $\kappa_{1,j}^\star$ thus resolved were the following:
$$\kappa_{1,1}^\star = 1, \qquad \kappa_{1,2}^\star = 2, \qquad \kappa_{1,3}^\star = 2$$
The optimal model phase shift φ* was then found to be characterized by the following spectral components (in degrees):

φ*(1, 2) ≈ 0.0    φ*(1, 3) ≈ 17.4   φ*(1, 4) ≈ 105.2
φ*(1, 5) ≈ 50.0   φ*(1, 6) ≈ 68.9

Its norm was of the order of 37.8°. In this case, as φm is equal to 0, the constrained model phase φm* coincides with φ*. This simulation was completed by computing an optimal bias phase and an optimal calibration phase. Modulo 2π in L,
β*(1, 2) ≈ 3.18    β*(1, 3) ≈ 276.9   β*(1, 4) ≈ 129.6
β*(1, 5) ≈ 771.7   β*(1, 6) ≈ 512.5   β*(2, 6) ≈ 509.3
β*(3, 4) ≈ 147.3   β*(3, 5) ≈ 494.8   β*(3, 6) ≈ 235.6
β*(4, 6) ≈ 382.9   β*(5, 6) ≈ 259.2
PHASE CLOSURE IMAGING
and modulo 2π in H1,

γ*(1) ≈ 0.0     γ*(2) ≈ 3.2    γ*(3) ≈ 83.1
γ*(4) ≈ 129.6   γ*(5) ≈ 51.7   γ*(6) ≈ 152.5
In this simulation, as C_K was not injective, and the discrepancy between φm and φo was large, the discrepancy between φm* and φo was also large. This situation was selected precisely to illustrate the fact that the phase calibration operation can be performed in any situation (here, obviously, a situation without any physical interest). As a general rule, the situations of physical interest are those for which m2 = ⌊ċ2⌉ and r̂1 = ⌊q̇1⌉ [see Eqs. (36), (35) and (45), (44)].
XII. Concluding Comments

The problems of integer ambiguity resolution arising in phase closure imaging had been analyzed previously in two extreme situations: nonredundant arrays (Lannes, 2001a) and full-phase arrays (Lannes and Anterrieu, 1999). The present study completes the results already obtained in this field. The corresponding theoretical framework is based on the Smith normal form of the spectral phase closure matrix (see Sections V and VI). In the general case of redundant interferometric graphs, two nearest lattice point problems must be successively solved: the integer ambiguity problems P1 and P2 [see the context of Eqs. (36), (35) and (45), (44)]. As specified in Appendix 4, a problem P of this type is to find the point k* of Z^ν closest to a point x̂ of R^ν, the distance being the one induced by a given quadratic form q. One then says that ν is the number of degrees of freedom of P. In the situations where there exist several k such that q(k − x̂) is of the order of q(k* − x̂), phase calibration instabilities may occur. As illustrated in Lannes (1999) and in Lannes and Anterrieu (1999), the problem is then unstable. The number of degrees of freedom of P1 is equal to p − m1, where p denotes the number of loops defined through a given spanning tree of the interferometric graph; m1 is the difference between m, the dimension of the unknown spectral phase space K, and m0, the dimension of the intersection of K with the bias phase space L. Note that m is the number of spectral phase components to be determined. The number of degrees of freedom of P2 is equal to m1. In the case of full-phase arrays, m0 is equal to m, so that m1 = 0; P2 then proves to be trivial. In the case of nonredundant arrays, m1 is equal to p, so that P1
disappears. With regard to P2, there then exists a particular initialization procedure for the search for the nearest lattice point. As specified in Lannes (2001b), this procedure benefits from the fact that the notion of graph and the related algebra are basically involved in the statement of the problem. This technique can of course be applied directly to weakly redundant situations. The less redundant the array is, the more efficient this initialization procedure is. The main result of a phase calibration operation is the optimal model phase shift φ* (see Fig. 9). It is important to note that the φ*(j, k) for (j, k) ∈ B depend only on the differences between the closure phases of the data and those of the model. The object model sm involved in Eq. (8) may result from a global image reconstruction process based, for example, on the maximum entropy principle. The data are then the moduli ρe(j, k) of the experimental complex visibilities

Ve(j, k) = ρe(j, k) exp[iφe(j, k)]

and the closure phases βe(i). The phase calibration operation followed by a Fourier synthesis process is then simply used as a refinement technique. When the phase data are the experimental baseline phases φe(j, k), as is typically the case in radio imaging, the calibrated phase is then defined by the formula φc := φe − β*, in which β* is an optimal bias phase. The optimal bias phases β*(j, k) can be computed (modulo 2π), as well as the related pupil phases, the optimal calibration phases γ*(j). An interferometric device is a set of arrays independently observing the same source. The present analysis can easily be extended to such devices.
Appendix 1. Useful Property

Let H be a real Hilbert space, and {e_i}_{i=1}^n be a basis of H. Given r < n, we denote by V the subspace of H with basis {e_i}_{i=1}^r, by V⊥ the orthogonal complement of V in H, and by P⊥ the orthogonal projection of H onto V⊥.

Property. Then, {P⊥ e_i}_{i=r+1}^n is a basis of V⊥.

Proof. As V⊥ is of dimension n − r, we simply have to show that {P⊥ e_i}_{i=r+1}^n is a free set of V⊥. The condition Σ_{i=r+1}^n a_i P⊥ e_i = 0 implies P⊥ Σ_{i=r+1}^n a_i e_i = 0. The vector Σ_{i=r+1}^n a_i e_i then lies in V. This means that

Σ_{i=r+1}^n a_i e_i = Σ_{j=1}^r b_j e_j

for some b_1, …, b_r. Setting a_j := −b_j for 1 ≤ j ≤ r, we therefore have Σ_{i=1}^n a_i e_i = 0. As {e_i}_{i=1}^n is a basis of H, all the a_i are equal to 0, in particular a_{r+1}, …, a_n. As a result, {P⊥ e_i}_{i=r+1}^n is a free set of V⊥. ∎
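The property is easy to check numerically: projecting the last n − r basis vectors onto V⊥ yields a set of full rank n − r. A quick sketch (illustrative random basis of H = R⁶ with r = 2; the projector is built here from a pseudoinverse):

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 6, 2

# A basis {e_i} of H = R^6, stored as the columns of E; V = span{e_1, e_2}.
E = rng.standard_normal((n, n))
V = E[:, :r]

# Orthogonal projector onto V-perp (I minus the projector onto col(V)).
P_perp = np.eye(n) - V @ np.linalg.pinv(V)

# The projected remaining basis vectors {P_perp e_i}, i = r+1..n.
proj = P_perp @ E[:, r:]

# They form a free set of V-perp: their rank equals n - r.
print(np.linalg.matrix_rank(proj))  # 4
```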
Appendix 2. Smith Normal Form of Integral Matrices

Let A be a Z-linear operator from Z^n′ into Z^n, and [A] be its matrix in the corresponding standard bases; [A] is an n × n′ matrix with coefficients in Z. The proof of the following theorem can be found in many textbooks (see, e.g., Newman, 1972; van der Waerden, 1967).

Theorem. There then exist a basis E′ := {e′_1, e′_2, …, e′_n′} of Z^n′ and a basis E := {e_1, e_2, …, e_n} of Z^n, some integer r ≥ 0 and positive integers a_1, a_2, …, a_r in Z, with a_j dividing a_{j+1} for 1 ≤ j < r, such that A e′_j = a_j e_j for 1 ≤ j ≤ r and A e′_j = 0 for j > r; in other words, such that the matrix of A in the bases E′ and E is of Smith normal form. More precisely, there then exist two matrices [W′] and [W] (of order n′ and n, respectively) with coefficients in Z and determinant ±1 such that

[W][A][W′] = diag(a_1, a_2, …, a_r, 0, …, 0)
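The diagonal entries can also be characterized without constructing [W] and [W′]: a_k = d_k / d_{k−1}, where d_k is the greatest common divisor of all k × k minors of [A] (with d_0 := 1). The following sketch computes them this way for a small integer matrix (the example matrix is ours, chosen for illustration):

```python
from itertools import combinations
from math import gcd

def det(M):
    # Integer determinant by cofactor expansion (fine for small matrices).
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def minors_gcd(A, k):
    # gcd of all k x k minors of A (math.gcd ignores signs; 0 if all vanish).
    g = 0
    for rows in combinations(range(len(A)), k):
        for cols in combinations(range(len(A[0])), k):
            g = gcd(g, det([[A[r][c] for c in cols] for r in rows]))
    return g

def elementary_divisors(A):
    # a_k = d_k / d_(k-1), where d_k is the gcd of the k x k minors of [A].
    divisors, d_prev = [], 1
    for k in range(1, min(len(A), len(A[0])) + 1):
        d_k = minors_gcd(A, k)
        if d_k == 0:
            break
        divisors.append(d_k // d_prev)
        d_prev = d_k
    return divisors

# Illustrative matrix whose Smith normal form is diag(2, 6, 12).
print(elementary_divisors([[2, 4, 4], [-6, 6, 12], [10, -4, -16]]))  # [2, 6, 12]
```

This gcd-of-minors route is exponentially slow for large matrices; the constructive algorithms mentioned in the text (based on the extended Euclidean algorithm) are what one uses in practice.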
The a_j are called the ‘‘elementary divisors’’ of [A]. Clearly, r is the rank of A, i.e., the dimension of its range. The coefficients of the jth column of [W′] are the components of e′_j in the standard basis of Z^n′, whereas the coefficients of the jth column of [W]⁻¹ are the components of e_j in the standard basis of Z^n. The constructive processes that perform the Smith normal decomposition of [A] are based on the extended Euclidean algorithm (see Cohen, 1996). They provide the elementary divisors of [A] and the matrix elements of [W], [W]⁻¹, [W′], and [W′]⁻¹. (All these matrix elements lie in Z.)

Appendix 3. Reference Projections

In this appendix, we show how to compute ψ⊥,2,i and ψ+,2,i for 1 ≤ i ≤ p − m1, and ψ+,1,j for 1 ≤ j ≤ m1. For each i, the components of ψ⊥,2,i and ψ+,2,i are obtained by minimizing the functional

g_1,i : R^m1 → R,   g_1,i(x) := ‖ ψ_2,i − Σ_{j=1}^{m1} x_j ψ_1,j ‖²
Clearly, the x_j are the components of x in the standard basis of R^m1. Denoting by x_1* the vector x for which the minimum of g_1,i is attained, we then have

ψ⊥,2,i = Σ_{j=1}^{m1} x_1*,j ψ_1,j,   ψ+,2,i = ψ_2,i − ψ⊥,2,i

Likewise, for each j, ψ+,1,j can be determined by minimizing the functional

g_0,j : R^m0 → R,   g_0,j(x) := ‖ ψ_1,j − Σ_{k=1}^{m0} x_k ψ_0,k ‖²

Denoting by x_0* the vector x for which the minimum of g_0,j is attained, we then have

ψ⊥,1,j = Σ_{k=1}^{m0} x_0*,k ψ_0,k,   ψ+,1,j = ψ_1,j − ψ⊥,1,j
Remark. Let ψ be a vector in a real Hilbert space H with inner product (· | ·), and {e_i}_{i=1}^n be a free subset of H. The projection of ψ onto the subspace generated by this subset, ψ∥ := Pψ, is obtained by minimizing the functional

g : R^n → R,   g(x) := ‖ ψ − Σ_{i=1}^n x_i e_i ‖²

Indeed, denoting by x* the vector for which the minimum of g is attained, we have

ψ∥ = Σ_{i=1}^n x*,i e_i

Let A be the operator

A : R^n → H,   Ax = Σ_{i=1}^n x_i e_i

Minimizing g amounts to solving the normal equation A*Ax = A*ψ. The ith component of A*ψ is then given by the formula
(A*ψ)_i = (e_i | ψ)

Note that the matrix elements of A*A are the inner products (e_i | e_i′):

a_{i,i′} = (e_i | e_i′)
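The normal-equation route to the projection is easy to check numerically. In the sketch below (hypothetical vectors in H = R⁵), the Gram matrix plays the role of A*A, and the residual ψ − Pψ comes out orthogonal to every e_i, as it must:

```python
import numpy as np

rng = np.random.default_rng(1)

# A free (linearly independent) set {e_i} in H = R^5, stored as columns of E,
# and an arbitrary vector psi to project onto span{e_i}.
E = rng.standard_normal((5, 3))
psi = rng.standard_normal(5)

# Normal equation (A*A) x = A* psi, with (A*A)_{ii'} = (e_i | e_i')
# and (A* psi)_i = (e_i | psi).
gram = E.T @ E
x_star = np.linalg.solve(gram, E.T @ psi)
psi_par = E @ x_star          # the projection P psi = sum_i x*_i e_i

# The residual psi - P psi is orthogonal to every e_i.
print(np.abs(E.T @ (psi - psi_par)).max())
```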
Appendix 4. Closest Point Search in Lattices

The notion of integer ambiguity resolution is associated with the problem of finding the point k* of Z^ν closest to a point x̂ of R^ν, the distance being the one induced by a given quadratic form q. In what follows, [k] and [x̂] denote the column matrices of k and x̂ in the standard basis of R^ν, and [Q] the matrix of q in this basis. The problem is therefore to minimize in k the quantity

q(k − x̂) = [k − x̂]^t [Q] [k − x̂]

The definitions of ν, k, x̂, and q of course depend on the particular problem to be solved. For example, in the integer ambiguity problems stated in Sections VIII and IX, we have, for P1 [see Eqs. (36) and (35)],

ν = p − m1,   k = m2,   x̂ = ċ2,   q = q1

and for P2 [see Eqs. (45) and (44)],

ν = m1,   k = r1,   x̂ = q̇1,   q = q2
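As a concrete illustration of such a problem, here is a brute-force sketch (the form Q and the target x̂ are hypothetical, chosen for illustration); when [Q] is far from diagonal, componentwise rounding of x̂ need not give the nearest lattice point:

```python
import numpy as np
from itertools import product

# Hypothetical 2D instance: q(u) = u^t Q u with a strongly non-diagonal Q.
Q = np.array([[1.0, 0.9],
              [0.9, 1.0]])
x_hat = np.array([0.3, -0.6])

def q(u):
    return u @ Q @ u

# Naive componentwise rounding...
k_round = np.round(x_hat)

# ...versus a brute-force search over a small box of lattice points.
box = product(range(-3, 4), repeat=2)
k_star = min((np.array(k) for k in box), key=lambda k: q(k - x_hat))

print(k_round, k_star, q(k_round - x_hat), q(k_star - x_hat))
```

Here rounding gives (0, −1), while the true nearest lattice point for the metric q is (0, 0), with a markedly smaller distance; this is exactly the failure mode the reduction and discrete-search techniques below address.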
Let ⌊x̂⌉ denote the vector whose components are the closest rational integers to the components of x̂. In many cases, the standard basis of lattice Z^ν is far from being orthogonal for the quadratic form q; [Q] is therefore far from being diagonal. As a result, in general, the integer ambiguity vector ⌊x̂⌉ is not the solution of the problem: ⌊x̂⌉ ≠ k*.

A. Search for a Reduced Basis

To circumvent this difficulty, one may be led to search for a basis of Z^ν, {e′_i}_{i=1}^ν, in which the matrix of q, [Q′], is as diagonal as possible. This amounts to exhibiting what is referred to as a reduced basis of the lattice (Z^ν, q). This operation can be performed by using a well-known algorithm in the algebra of numbers: the Lenstra–Lenstra–Lovász (LLL) algorithm (see
Section 2.6 in Cohen, 1996). One is then led to minimize in k′ the quantity

q(k − x̂) = [k′ − x̂′]^t [Q′] [k′ − x̂′]

The relationship between [Q′] and [Q] is of the form [Q′] = [W]^t [Q] [W], in which [W] is a matrix (of order ν) with coefficients in Z and determinant ±1. Then, [x̂] = [W][x̂′] and [k] = [W][k′]. (The current implementations of the LLL algorithm provide [W] and its inverse.) Despite this reduction operation, the minimum of [k′ − x̂′]^t [Q′] [k′ − x̂′] may not be attained at k′ = ⌊x̂′⌉. One then proceeds as specified below.

B. Discrete Search Process

For clarity, we now omit the primes, and note that

k − x̂ = ω − ω̂
(A4.1)

with

ω := k − ⌊x̂⌉,   ω̂ := x̂ − ⌊x̂⌉     (A4.2)

Set

Φ(x) := [x − ω̂]^t [Q] [x − ω̂]     (A4.3)

and consider the ellipsoid

E₀ := {x ∈ R^ν : Φ(x) ≤ Φ₀},   Φ₀ := Φ(0)     (A4.4)

As q(k − x̂) = Φ(ω), the problem is to search for the integer ambiguity vector(s) ω* at which the minimum of Φ in Z^ν is attained. From Eq. (A4.2), k* is then given by the formula

k* = ⌊x̂⌉ + ω*     (A4.5)

Clearly, the search for ω* can be confined to the points of Z^ν contained in E₀. Let us now consider the Cholesky factorization of Q:

[Q] = [U]^t [U]

where [U] is an upper triangular matrix with matrix elements u_ij. It then follows from Eq. (A4.3) that

Φ(ω) = ([U][ω − ω̂])^t ([U][ω − ω̂])
We therefore have

Φ(ω) = Σ_{i=1}^ν r_i²(ω)     (A4.6)

in which r_i² is the contribution of the ith row of [U]:

r_i(ω) := Σ_{j=i}^ν u_ij δ(j),   δ(j) := ω(j) − ω̂(j)     (A4.7)

Here, ω(j) is the jth component of ω (in the selected basis), i.e., the jth integer ambiguity (in this basis), whereas ω̂(j) is the jth component of ω̂ (in the same basis). Ellipsoid E₀ is now searched for candidates for the optimal ambiguity vector ω*. Following the ideas of the method presented in de Jonge (1998; see also Lannes, 2001b), we first show how to exhibit bounds for ambiguity ω(i), with the ambiguities ω(ν) through ω(i + 1) being already conditioned; the ambiguities ω(i − 1) through ω(1) are not yet conditioned. In other words, they are implicitly set equal to 0.

Ambiguity Bounds. For clarity, let us set r_i := r_i(ω). According to Eqs. (A4.4) and (A4.6), the following condition must be satisfied:

r_i² + Σ_{ℓ=i+1}^ν r_ℓ² ≤ Φ₀   if i < ν
r_i² ≤ Φ₀                      if i = ν

Denoting by y_i and z_i the quantities

y_i := r_i²,   z_i := Φ₀ − Σ_{ℓ=i+1}^ν r_ℓ²  if i < ν,   z_i := Φ₀  if i = ν     (A4.8)

we therefore have

y_i ≤ z_i     (A4.9)

For i < ν, z_i can be written in the form [see Eq. (A4.8)]

z_i = Φ₀ − ( Σ_{ℓ=i+2}^ν r_ℓ² ) − r_{i+1}²

hence the recurrence formula

z_i = z_{i+1} − y_{i+1}     (A4.10)

Note that

y_ν = [u_νν δ(ν)]²,   z_ν = Φ₀     (A4.11)

As |r_i| ≤ √z_i [see Eqs. (A4.8) and (A4.9)], we have

−√z_i ≤ r_i ≤ √z_i     (A4.12)

Let us now expand r_i in the form [see Eq. (A4.7)]

r_i = u_ii δ(i) + s_i     (A4.13)

where

s_i := Σ_{j=i+1}^ν u_ij δ(j)  if i < ν,   s_i := 0  if i = ν     (A4.14)

It then follows from Eq. (A4.12) that

−(1/u_ii)(√z_i + s_i) ≤ δ(i) ≤ (1/u_ii)(√z_i − s_i)

hence, from Eq. (A4.7),

ω̂(i) − (1/u_ii)(√z_i + s_i) ≤ ω(i) ≤ ω̂(i) + (1/u_ii)(√z_i − s_i)     (A4.15)
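The level-ν instance of the bound (A4.15), where s_ν = 0 and z_ν = Φ₀, can be checked directly: every lattice point inside E₀ must have its last component in the interval ω̂(ν) ± √Φ₀/u_νν. A sketch with a hypothetical 2 × 2 form (Q and ω̂ are ours, chosen for illustration):

```python
import numpy as np

# Hypothetical 2D instance of the ellipsoid search.
Q = np.array([[4.0, 1.0],
              [1.0, 2.0]])
w_hat = np.array([0.3, -0.7])          # omega-hat
U = np.linalg.cholesky(Q).T            # upper triangular, Q = U^t U

def Phi(w):
    d = w - w_hat
    return d @ Q @ d

Phi0 = Phi(np.zeros(2))                # Phi_0 = Phi(0)

# Level nu = 2 bound of Eq. (A4.15): s_nu = 0, z_nu = Phi_0.
half_width = np.sqrt(Phi0) / U[1, 1]
lo, hi = w_hat[1] - half_width, w_hat[1] + half_width

# Every lattice point inside the ellipsoid E_0 respects the bound.
inside = [(i, j) for i in range(-4, 5) for j in range(-4, 5)
          if Phi(np.array([i, j])) <= Phi0]
print(all(lo <= j <= hi for _, j in inside), len(inside))
```

This is precisely the pruning that makes the recursive search below explore only a small part of the ambiguity tree.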
Discrete Search Algorithm. Set i = ν. We then have, from Eqs. (A4.15), (A4.14), and (A4.8),

ω̂(ν) − √Φ₀ / u_νν ≤ ω(ν) ≤ ω̂(ν) + √Φ₀ / u_νν

For each integer ambiguity ω(ν) in this interval, one successively computes s_{ν−1}, r_ν, y_ν, and z_{ν−1}. One then uses a program that sets i = ν − 1 and defines the bounds for ambiguity ω(i). For each possible value of this ambiguity, one then computes r_i, y_i, and z_{i−1}. If i is greater than 1, and Φ₀ − z_{i−1} is smaller than the smallest value of Φ computed so far on the ambiguity tree [see Eq. (A4.8) with i = i − 1], one then uses the same program for defining the bounds for ambiguity ω(i − 1). All the ambiguity vectors ω of interest can thus be identified through the recursive call of a same program.

At level i = 1, we have ω_m(1) ≤ ω(1) ≤ ω_M(1), with ω_m(1) and ω_M(1) in Z. One then computes

r_1 = u_11 (ω_m(1) − ω̂(1)) + s_1

and y_1 = r_1². According to Eqs. (A4.6) and (A4.8), the value of Φ at the integer ambiguity vector ω thus conditioned is given by the formula

Φ(ω) = Φ₀ − (z_1 − y_1)     (A4.16)

When ω_M(1) is strictly greater than ω_m(1), i.e., when ω(1) is of the form ω_m(1) + n, the corresponding values of q are obtained through the variational formula

q(ω + n e(1) − ω̂) = Φ(ω) + 2(n u_11) r_1 + (n u_11)²     (A4.17)

Indeed,
q[(ω − ω̂) + n e(1)] = q(ω − ω̂) + 2n [ω − ω̂]^t [Q][e(1)] + n² q(e(1))

in which q(ω − ω̂) = Φ(ω), [ω − ω̂]^t [Q][e(1)] = u_11 r_1, and q(e(1)) = u_11².

References

Biggs, N. (1996). Algebraic Graph Theory, 2nd ed. Cambridge: Cambridge University Press.
Born, M., and Wolf, E. (1970). Principles of Optics. Oxford: Pergamon Press.
Cohen, H. (1996). A Course in Computational Algebraic Number Theory. Berlin: Springer-Verlag.
Cornwell, T. J., and Wilkinson, P. N. (1981). A new method for making maps with unstable interferometers. Mon. Not. R. Astron. Soc. 196, 1067–1086.
de Jonge, P. J. (1998). A processing strategy for the application of the GPS in networks. Publications on Geodesy, Vol. 46. Delft: Netherlands Geodetic Commission.
Hunt, G., and Payne, H. E. (1997). Astronomical Data Analysis Software and Systems VI. San Francisco: Astronomical Society of the Pacific.
Lannes, A. (1999). Phase calibration on interferometric graphs. J. Opt. Soc. Am. A 16, 443–454.
Lannes, A. (2001a). Integer ambiguity resolution in phase closure imaging. J. Opt. Soc. Am. A 18, 1046–1055.
Lannes, A. (2001b). Résolution d'ambiguïtés entières sur graphes interférométriques et GPS. C. R. Acad. Sci. Paris 333, Sér. I, 707–712.
Lannes, A., and Anterrieu, E. (1999). Redundant spacing calibration: Phase restoration methods. J. Opt. Soc. Am. A 16, 2866–2879.
Lannes, A., Anterrieu, E., and Bouyoucef, K. (1994). Fourier interpolation and reconstruction via Shannon-type techniques; Part I: Regularization principle. J. Mod. Opt. 41, 1537–1574.
Lannes, A., Anterrieu, E., and Bouyoucef, K. (1996). Fourier interpolation and reconstruction via Shannon-type techniques; Part II: Technical developments and applications. J. Mod. Opt. 43, 105–138.
Lannes, A., Anterrieu, E., and Maréchal, P. (1997). Clean and wipe. Astron. Astrophys. Suppl. Ser. 123, 183–198.
Newman, M. (1972). Integral Matrices. New York: Academic Press.
Reasenberg, R. D. (1998). Proceedings of the SPIE Meeting on Astronomical Interferometry, Kona, Hawaii. SPIE 3350.
van der Waerden, B. L. (1967). Algebra. Berlin: Springer-Verlag.
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 126
Three-Dimensional Image Processing and Optical Scanning Holography TING-CHUNG POON Optical Image Processing Laboratory, Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061
I. Introduction . . . 329
II. Two-Pupil Optical Heterodyne Scanning . . . 330
   A. Heterodyning Theory . . . 330
   B. Coherency Considerations . . . 333
   C. Special Cases: Fluorescent Specimens or Incoherently Reflecting Rough Surfaces . . . 335
   D. Detection Schemes . . . 336
III. Three-Dimensional Imaging Properties . . . 337
IV. Optical Scanning Holography . . . 340
   A. Cosine, Sine, and Complex Hologram . . . 340
   B. 3D Image Reconstruction . . . 341
V. Concluding Remarks . . . 347
References . . . 348

I. Introduction

Optical scanning holography (OSH) (Poon, 1985) was invented as a clever application of the pupil interaction processing technique (Poon and Korpel, 1979), which is unique in extending incoherent image processing to include the implementations of bipolar or even complex point-spread functions (Lohmann and Rhodes, 1978; Mait, 1987; Poon, 1985; Poon and Korpel, 1979; Stoner, 1978). One of the two-pupil processing techniques, namely the use of a pupil interaction scheme in a scanning illumination mode, has been developed extensively (Indebetouw and Poon, 1992). The pupil interaction scheme has been implemented by optical heterodyne scanning (Poon and Korpel, 1979) and has been used for many interesting applications such as textural edge extraction and tunable and three-dimensional (3D) filtering (Poon et al., 1988, 1990). When we drastically modify one of the pupils relative to the other (specifically, one of the pupils is an open mask and the
other is a pinhole mask) and defocus the optical system, we end up with an optical scanning system capable of holographic recording of the object being scanned, and thus the invention of OSH (Poon, 1985). Indeed, OSH has been invented to acquire holographic information through active two-dimensional optical scanning. Scanning holographic microscopy (Indebetouw et al., 2000; Poon, 1985; Poon et al., 1996; Schilling et al., 1997; Swoger et al., 2002), optical recognition of 3D objects (Kim and Poon, 2000; Poon and Kim, 1999), 3D holographic display (Poon, 2002), and 3D optical remote sensing (Kim et al., 2002; Klysubun et al., 2000; Schilling and Templeton, 2001) are some of its most recent developments. Among its many applications, holographic microscopy has been developed quite extensively due to its important applications in 3D imaging of biological specimens (Kim, 1999; Poon et al., 1995; Zhang and Yamaguchi, 1998). Indeed, some properties of a scanning holographic microscope have been outlined recently (Indebetouw et al., 2000) and numerical simulations have shown that point-spread functions leading to different imaging functionalities (e.g., enhanced spatial resolution, extended depth of focus, or optical sectioning) can be expected with proper choices of pupil functions (Indebetouw, 2002). The purpose of this article is to introduce OSH through the development of a two-pupil optical heterodyne scanning image processor, which will be discussed in Section II. In Section III, 3D imaging properties in terms of the two pupils are developed and subsequently 3D point-spread functions (PSF) are derived. We then compare the developed PSFs with those obtained with conventional 3D image processing. In Section IV, we discuss OSH as a simple example of the optical heterodyne scanning image processor.
We will then introduce the so-called sine-Fresnel zone plate (FZP) hologram, cosine-FZP hologram, and complex hologram and subsequently, in that section, we will discuss 3D reconstruction. Finally, in Section V, we make some concluding remarks.
II. Two-Pupil Optical Heterodyne Scanning

A. Heterodyning Theory

A typical two-pupil heterodyne optical scanning image processor is shown in Figure 1. We model the 3D object as a stack of transverse slices spanning a longitudinal range z.

Consider two weight vectors wA and wB, each with K elements and with var(wA) > 0 and var(wB) > 0. The correlation coefficient C between these vectors can be calculated as

C(wA, wB) = cov(wA, wB) / √( var(wA) var(wB) )     (17)

The correlation coefficient C(wA, wB) is a number in the range [−1, 1]. For |C(wA, wB)| = 1, there is a strong correlation; for C(wA, wB) = 0 there is no correlation. Therefore, the squared correlation C(wA, wB)² can be minimized to minimize the likeness of the two weight sets. Although this seems a natural thing to do, a problem is that the squared correlation can be minimized either by minimizing the squared covariance or by maximizing the variance of either weight vector.⁹ The latter is undesirable, as for interpretation the variance of one of the weight vectors

⁹ Up to a point, naturally, due to the nonlinearity of the transfer functions in the hidden and output layers. For this discussion it is assumed the network operates in the part of the transfer function that is still reasonably linear.
DE RIDDER ET AL.
IMAGE PROCESSING USING ARTIFICIAL NEURAL NETWORKS
Figure 22. Feature detector pairs found in ANN3, for four different random weight initializations (a–d).
should not be unnecessarily increased just to lower the squared correlation. Ideally, both weight vectors should have comparable variance. Therefore, a better measure to minimize is just the squared covariance. To do this, the derivative of the covariance with respect to a single weight w_i^A has to be computed:

∂ cov(wA, wB)² / ∂w_i^A = ∂/∂w_i^A [ (1/K) Σ_{k=1}^K (w_k^A − w̄^A)(w_k^B − w̄^B) ]²
                        = (2/K) cov(wA, wB) (w_i^B − w̄^B)     (18)
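Eq. (18) can be verified numerically against a finite-difference approximation (a sketch with arbitrary random weight vectors; note that `np.cov(..., bias=True)` matches the 1/K normalization used here):

```python
import numpy as np

rng = np.random.default_rng(2)
K = 8
wA, wB = rng.standard_normal(K), rng.standard_normal(K)

def sq_cov(wA, wB):
    # Squared covariance between two weight vectors (the quantity in Eq. 18).
    return np.cov(wA, wB, bias=True)[0, 1] ** 2

# Analytic derivative of Eq. (18): (2/K) * cov(wA, wB) * (wB_i - mean(wB)).
cov = np.cov(wA, wB, bias=True)[0, 1]
grad = (2.0 / K) * cov * (wB - wB.mean())

# Finite-difference check of the first component.
eps = 1e-6
wA_pert = wA.copy(); wA_pert[0] += eps
fd = (sq_cov(wA_pert, wB) - sq_cov(wA, wB)) / eps
print(grad[0], fd)
```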
This derivative can then be used in combination with the derivative of the MSE with respect to the weights to obtain a training algorithm minimizing both the MSE and the squared covariance (and therefore the squared correlation, because the variance of the weight vectors will remain bounded, since the ANN still has to minimize the MSE). Correlation has been used before in neural network training. In the cascade correlation algorithm (Fahlman and Lebiere, 1990), it is used as a tool to find an optimal number of hidden units by taking the correlation between a hidden unit's output and the error criterion into account. However, it has not yet been applied to the weights themselves, to force hidden units to learn different functions during training.

b. A Decorrelating Training Algorithm. Squared covariance minimization was incorporated into the CGD method used before. Basically, CGD iteratively applies three stages: calculation of the derivative of the error with respect to the weights, dE = ∂E(w)/∂w; deriving a direction h from dE that is conjugate to previously taken directions; and a line minimization of E from w along h to find a new weight vector w′. The squared covariance term was integrated into the derivative of the error function as an additive criterion, as in weight regularization (Bishop, 1995). A problem is how the added term should be weighted (cf. choosing the regularization parameter). The MSE can start very high but usually drops rapidly. The squared covariance part also falls in the range [0, 1], but it may well be the case that it cannot be completely brought down to zero, or only at a significant cost to the error. The latter effect should be avoided: the main training goal is to reach an optimal solution in the MSE sense. Therefore, the covariance information is used in the derivative function only, not in the line minimization. The squared covariance gradient, dcov, is
normalized to the length of the ordinary gradient dE (just its direction is used) and weighted with a factor λ; i.e.,

d = dE + λ (‖dE‖ / ‖dcov‖) dcov

where

dcov = [2 / (K(K − 1))] Σ_{k=1}^{K−1} Σ_{l=k+1}^{K} ∂ cov(w_k(0), w_l(0))² / ∂w
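The combined derivative can be sketched as follows (the function name is ours, not from the text; dE and dcov are taken as already-computed gradient vectors):

```python
import numpy as np

def dcgd_direction(dE, dcov, lam):
    """Combined DCGD derivative: the error gradient plus the covariance
    gradient rescaled to the error gradient's length and weighted by lam."""
    return dE + lam * (np.linalg.norm(dE) / np.linalg.norm(dcov)) * dcov

# With ||dE|| = 5 and a unit-length covariance gradient, lam = 1 adds a
# step of length 5 along the covariance-decreasing direction.
d = dcgd_direction(np.array([3.0, 4.0]), np.array([0.0, 1.0]), 1.0)
print(d)  # [3. 9.]
```

Because only the direction of dcov is used, the relative strength of the two terms is controlled entirely by λ, independent of the raw magnitude of the covariance gradient.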
Note that the derivative of the squared covariance is calculated only once for each pair of weight sets and attributed to only one of the weight sets. This allows one weight set to learn a globally optimal function, while the second set is trained to both lower the error and avoid covariance with the first set. It also allows initialization with fixed values, since the asymmetrical contribution of the squared covariance term provides a symmetry-breaking mechanism (which can even improve performance in some classification problems; see de Ridder et al., 1999). However, the outcome of the DCGD training process is still dependent on the choice of a number of parameters. DCGD even introduces a new one (the weight factor λ). If the parameters are chosen poorly, one will still not obtain understandable feature detectors. This is a problem of ANNs in general, which cannot be solved easily: a certain amount of operator skill in training ANNs is a prerequisite for obtaining good results. Furthermore, experiments with DCGD are reproducible due to the possibility of weight initialization with fixed values. The DCGD algorithm is computationally expensive, as it takes covariances between all pairs of receptive fields into account. Due to this O(n²) complexity in the number of receptive fields, application of this technique to large ANNs is not feasible. A possible way to solve this problem would be to take only a subset of covariances into account.

3. Training ANN3 Using DCGD

ANN3 was trained using DCGD. Weights and biases were initialized to a fixed value of 0.01, and N = 10 directions were kept conjugate at a time. The only parameter varied was the weighting factor of the squared covariance gradient, λ, which was set to 0.5, 1, 2, and 5. Training converged but was slow. The MSE eventually reached the values obtained using CGD (1.0 × 10⁻⁶, cf. Section IV.B.1); however, DCGD training was stopped when the MSE reached about 1.0 × 10⁻⁵, after about 500–1000 cycles, to prevent overtraining. In all cases, classification was perfect.
Figure 23 shows the feature detectors found in ANN3 trained using DCGD. Squared correlations C² between them are very small, showing that the minimization was successful (the squared covariance was, in all cases, nearly 0). For λ = 1 and λ = 2, the feature detectors are clearer than those found using standard CGD in Section IV.B.1. Their frequency responses resemble those of the feature detectors shown in Figure 22b and, due to the fixed weight initialization, are guaranteed to be found when training is repeated. However, λ should be chosen with some care; if it is too small (λ = 0.5), the squared covariance term will have too little effect; if it is too large (λ = 5), minimization of the squared covariance term becomes too important and the original functionality of the network is no longer clearly visible. The features detected seem to be diagonal bars, as seen before, and horizontal edges. This is confirmed by inspecting the output of the two feature maps in ANN3 trained with DCGD, λ = 1, for a number of input samples (see Fig. 24). For samples of class ‘‘1,’’ these outputs are lower than for class ‘‘7,’’ i.e., features specific to digits of class ‘‘7’’ have been found. Furthermore, the first feature detector clearly enhances the stem of ‘‘7’’ digits, whereas the second detector amplifies the top stroke. Finally, versions of ANN3 with three and four feature maps were also trained using DCGD. Besides the two feature detectors found before, no clear new feature detectors were found.
C. Discussion

The experiments in this section were performed to determine whether training ANNs with receptive field mechanisms leads to the ANN finding useful, shift-invariant features, and whether a human observer could interpret these features. In general, it was shown that the mere presence of receptive fields in an ANN and a good performance do not mean that shift-invariant features are detected. Interpretation was possible only after severely restricting the ANN architecture, data set complexity, and training method. One thing all experiments had in common was the use of ANNs as classifiers. Classification is a ‘‘derived’’ goal, i.e., the task is assigning (in principle arbitrary) outputs, representing class labels, to input samples. The ANN is free to choose which features to use (or not) to reach this goal. Therefore, to study the way in which ANNs solve problems, moving to regression problems might yield results more fit for interpretation, especially when a regression problem can be decomposed into a number of independent subproblems. The next sections will study the use of ANNs as nonlinear filters for image enhancement.
V. Regression Networks for Image Restoration

This section will study whether standard regression feedforward ANNs can be applied successfully to a nonlinear image filtering problem. If so, what are the prerequisites for obtaining a well-functioning ANN? A second question (as in the previous section) is whether these ANNs correspond to classic image processing approaches to solving such a task. Note that, again, the goal here is not to simply apply ANNs to an image processing problem, nor to construct an ANN that will perform better at it than existing techniques. Instead, the question is to what extent ANNs can learn the nonlinearities needed in some image processing applications. To investigate the possibilities of using feedforward ANNs and the problems one might encounter, the research concentrates on a single example of a nonlinear filter: the Kuwahara filter for edge-preserving smoothing (Kuwahara et al., 1976). Since this filter is well understood and the training goal is exactly known, it is possible to investigate to what extent ANNs are capable of performing this task. The Kuwahara filter also is an excellent object for this study because of its inherent modular structure, which allows the problem to be split into smaller parts. This is known to be an advantage in learning (Anand et al., 1995) and provides the opportunity to study subproblems in isolation. Pugmire et al. (1998) looked at the application of ANNs to edge detection and found that structuring learning in this way can improve performance; however, they did not investigate the precise role this structuring plays. ANNs have previously been used as image filters, as discussed in Section II.C.1. However, the conclusion was that in many applications the ANNs were nonadaptive. Furthermore, where ANNs were adaptive, a lot of prior knowledge of the problem to be solved was incorporated in the ANNs' architectures. Therefore, in this section a number of modular ANNs will be constructed and trained to emulate the Kuwahara filter, incorporating prior knowledge in various degrees. Their performance will be compared to standard feedforward ANNs. Based on results obtained in these experiments, in Section VI it is shown that several key factors influence ANN behavior in this kind of task.

A. Kuwahara Filtering

The Kuwahara filter is used to smooth an image while preserving the edges (Kuwahara et al., 1976). Figure 25a illustrates its operation. The input of the filter is a (2k − 1) × (2k − 1) pixel neighborhood around the central pixel. This neighborhood is divided into four overlapping subwindows
400
DE RIDDER ET AL.
IMAGE PROCESSING USING ARTIFICIAL NEURAL NETWORKS
Figure 23. Feature detector pairs found in ANN3 using DCGD with various values of weight factor (a–d). C2 is the squared correlation between the feature detectors after training.
401
402 DE RIDDER ET AL.
Figure 24. The output of (a) the first and (b) the second feature map of ANN3 trained with DCGD ( ¼ 1), for two samples of class ‘‘1’’ (left) and two samples of class ‘‘7’’ (right). The samples used were, for both digits, the leftmost two in Figure 17.
IMAGE PROCESSING USING ARTIFICIAL NEURAL NETWORKS
403
Wi ; i ¼ 1; 2; 3; 4, each of size pixels. For each of these subwindows, the average i and the variance i2 of the 2 gray values are calculated. The output of the filter is then found as the average m of the subwindow Wm having the smallest gray value variance ðm ¼ arg mini 2i Þ. This operation can be applied in a scan-wise manner to filter an entire image. For an example of the effect of the filter, see Figure 26. The filter is nonlinear. As the selection of the subwindow based on the variances is data-driven, edges are not blurred as in normal uniform
Figure 25. (a) The Kuwahara filter: subwindows in a ð2 1Þ ð2 1) window; here ¼ 3. (b) Kuwahara filter operation as a sequence of operations.
Figure 26. Images used for (a) training and (b–c) testing purposes. The top images are the originals; the bottom images are the Kuwahara filtered versions (for image A, the training target). For presentation purposes, the contrast of the images has been stretched (Young et al., 1998).
404
DE RIDDER ET AL.
filtering. Because a straight edge will always lie in at most three subwindows, there will always be at least one subwindow that does not contain an edge and therefore has low variance. For neighboring pixels in edge regions, different subwindows will be selected (due to the minimum operation), resulting in sudden large differences in gray value. Typically, application of the Kuwahara filter to natural images will result in images that have an artificial look but that may be more easily segmented or interpreted. This filter was selected for this research because of the following: It is nonlinear. If ANNs can be put to use in image processing, the most rewarding application will be one to nonlinear rather than linear image processing. ANNs are most often used for learning (seemingly) highly complex, nonlinear tasks with many parameters using only a relatively small number of samples. It is modular (Fig. 25b illustrates this). This means the operation can be split into subtasks that can perhaps be learned more easily than the whole task at once. It will be interesting to see whether an ANN will need this modularity and complexity in order to approximate the filter’s operation. Also, it offers the opportunity to study an ANN’s operation in terms of the individual modules.
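The subwindow selection just described can be sketched directly. This is a minimal, unoptimized NumPy version; the function name and the reflect-mode border handling are assumptions for illustration, as the text does not specify how image borders are treated:

```python
import numpy as np

def kuwahara(image, k=3):
    """Kuwahara filter sketch: replace each pixel by the mean of the k x k
    subwindow (out of four overlapping ones in its (2k-1) x (2k-1)
    neighborhood) that has the smallest gray-value variance."""
    pad = k - 1
    padded = np.pad(image.astype(float), pad, mode="reflect")  # border handling is an assumption
    out = np.empty(image.shape, dtype=float)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            pr, pc = r + pad, c + pad  # centre pixel in padded coordinates
            # four overlapping k x k subwindows, all containing the centre pixel
            subwindows = [
                padded[pr - k + 1:pr + 1, pc - k + 1:pc + 1],  # top-left
                padded[pr - k + 1:pr + 1, pc:pc + k],          # top-right
                padded[pr:pr + k, pc - k + 1:pc + 1],          # bottom-left
                padded[pr:pr + k, pc:pc + k],                  # bottom-right
            ]
            variances = [w.var() for w in subwindows]
            out[r, c] = subwindows[int(np.argmin(variances))].mean()
    return out
```

Applying this to a straight step edge leaves the edge intact: at least one subwindow always lies wholly on one side of the edge, has zero variance, and is therefore selected.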
B. Architectures and Experiments
In the previous section, it was shown that when studying ANN properties, such as internal operation (which functions are performed by which hidden units) or generalization capabilities, one often encounters a phenomenon that could be described as an ANN interpretability trade-off (Section IV.A.3). This trade-off, controlled by restricting the architecture of an ANN, is between the possibility of understanding how a trained ANN operates and the degree to which the experiment is still true to life. To cover the spectrum of possibilities, a number of modular ANNs with varying degrees of freedom were constructed. The layout of such a modular ANN is shown in Figure 27. Four types of modular ANN were created, ANN1M . . . ANN4M. These are discussed below in descending order of artificiality; i.e., the first is completely hand-designed, with every weight set to an optimal value, whereas the last consists only of standard feedforward modules.
1. Modular Networks
Each modular ANN consists of four modules; in the four types of modular ANN, different modules are used.
For ANN1M, the modules were hand-designed for the tasks they are to perform. In some cases, this meant using other than standard (i.e., sigmoid, linear) transfer functions and very unusual weight settings. Figure 28 shows the four module designs and the weights assigned to their connections.
— The average module (MODAvg, Fig. 28a) uses only linear transfer functions in units averaging the inputs. Four of these modules can be used to calculate μ1, . . ., μ4.
— The variance module (MODVar, Fig. 28b) uses a submodule (on the left) to calculate the average of the subwindow it is presented. The other submodule (on the right) just transports the original data to lower layers.10 The calculated averages are then subtracted from the original inputs, followed by a layer of units using an f(a) = tanh²(a) transfer function to approximate the square of the input11 (see Fig. 29a). Four of these modules can be used to find σ1², . . ., σ4².
— The position-of-minimum module for selecting the position of the minimum of four inputs (MODPos, Fig. 28c) is the most complicated one. Using the logarithm of the sigmoid as a transfer function,

f(a) = ln( 1 / (1 + exp(−a)) )    (19)
(see Fig. 29b), units in the first three hidden layers act as switches comparing their two inputs. Alongside these switches, linear transfer function units are used to transport the original values to deeper layers. Weights wA and wB are very high to enable the units to act as switches. If the input connected through weight wA (input IA) is greater than the input connected through weight wB (input IB), the sum will be large and negative, the output of the sigmoid part will approach 0.0, and the output of the unit (its logarithm) will be large and negative. If IB > IA, on the other hand, the sum will be large and positive, the output of the sigmoid part will approach 1.0, and the final output of the unit will be 0.0. This output can be used as an inhibiting signal, by passing it to units of the same type in lower layers. In this way, units in the third hidden layer have as output—if the inputs are denoted σ1, σ2, σ3, and σ4:
10 This part is not strictly necessary, but was incorporated since links between nonadjacent layers are difficult to implement in the software package used (Hoekstra et al., 1996).
11 This function is chosen since it approximates a² well on the interval it will be applied to, but is bounded: it asymptotically reaches 1 as the input grows to ±∞. The latter property is important for training the ANN, as unbounded transfer functions will hamper convergence.
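Footnote 11's claim about the tanh²(a) transfer function is easy to check numerically. A small sketch (variable names are mine) shows that it tracks a² near zero while staying bounded:

```python
import numpy as np

# Nonstandard transfer function of the variance module: f(a) = tanh^2(a).
# Near a = 0, tanh(a) ~ a, so f(a) ~ a^2; for large |a| it saturates at 1,
# which keeps the unit's output bounded and eases training.
def f(a):
    return np.tanh(a) ** 2
```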
Figure 27. A modular ANN. MODAvg, MODVar, MODPos, and MODSel denote the ANN modules, corresponding to the operations shown in Figure 25b. The top layer is the input layer. In this figure, shaded boxes correspond to values transported between modules, not units.
    s_i = { 0.0   if σ_i < min_{m=1,…,4, m≠i} σ_m
          { 0.5   otherwise                            (20)
Weights wA and wB are slightly different, to handle cases in which two inputs are exactly the same but one (in this case arbitrary) minimum position has to be found. The fourth and fifth hidden layers ensure that exactly one output unit will indicate that the corresponding input was minimal, by setting the output of a unit to 0.0 if another unit to the right has an output ≠ 0.0. The units perform an xor-like function, giving high output only when exactly one of the inputs is high. Finally, biases (indicated by bA, bB, and bC next to the units) are used to give the outputs the right value (0.0 or 0.5).
— The selection module (MODSel, Fig. 28d) uses large weights coupled to the position-of-minimum module outputs (inputs s1, s2, s3, and s4) to suppress the unwanted average values μi before adding them. The small weights with which the average values are multiplied and the large incoming weight of the output unit are used to avoid the nonlinearity of the transfer function.
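The comparator behavior of the MODPos switch units built on Eq. (19) can be sketched as follows. The weight magnitude of 100 is an assumption for illustration, standing in for the "very high" wA and wB of the text:

```python
import numpy as np

# Transfer function of Eq. (19): the logarithm of the sigmoid.
def ln_sigmoid(a):
    return np.log(1.0 / (1.0 + np.exp(-a)))

# A "switch" unit comparing two inputs through large weights of opposite sign.
# If i_b > i_a, the weighted sum is large and positive, the sigmoid approaches
# 1.0, and the unit outputs ~0.0. If i_a > i_b, the sum is large and negative,
# the sigmoid approaches 0.0, and the logarithm yields a large negative value,
# usable as an inhibiting signal for units in lower layers.
def switch(i_a, i_b, w=100.0):
    return ln_sigmoid(w * (i_b - i_a))
```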
Figure 28. The modules for (a) calculating the average, (b) calculating the variance, (c) finding the position of the minimum variance, and (d) selecting the right average. In all modules, the top layer is the input layer. Differently shaded boxes correspond to units with different transfer functions.
Since all weights were fixed, this ANN was not trained. The ANN2M modules have the same architectures as those of ANN1M; however, in this case the weights were not fixed, so the modules could be trained. These modules were expected to perform poorly, as some of the optimal weights (as set in ANN1M) are very high and some of the transfer functions are unbounded (see Fig. 29b). In the ANN3M modules, nonstandard transfer functions were no longer used. As a result, the modules MODVar and MODPos had to be replaced by standard ANNs. These ANNs contained two layers of 25 hidden units, each with a double sigmoid transfer function. This number of hidden units was thought to give the modules a sufficiently large number of parameters while keeping training times feasible. In the final type, ANN4M, all modules consisted of standard ANNs with two hidden layers of 25 units each.
Figure 28. (Continued)
Figure 29. The nonstandard transfer functions used in (a) MODVar and (b) MODPos.
With these four types, a transition is made from a fixed, hard-wired implementation of the Kuwahara filter (ANN1M) to a free type (ANN4M) in which only the prior knowledge that the filter consists of four subtasks is used. The goal of the exercise is to see a gradual change in behavior and performance. Note that the ANN1M architecture is probably not the only error-free implementation possible using ANN units. It should be clear from the discussion, though, that any such architecture would have to resort to nonstandard transfer functions and unconventional weight settings to perform the nonlinear operations error-free over a large range of input values. In this respect, the exact choices made here are less important.
2. Standard Networks
As shown in Section III, the use of prior knowledge in ANN design does not guarantee that such ANNs will perform better than standard architectures. To validate the results obtained with the ANNs described in the previous section, experiments were also performed with standard, fully connected feedforward ANNs. Although one hidden layer should theoretically be sufficient (Funahashi, 1989; Hornik et al., 1989), adding a layer may ease training or lower the number of required parameters (although there is some disagreement on this). Therefore, ANNs having one or two hidden layers of 1, 2, 3, 4, 5, 10, 25, 50, 100, or 250 units each were used. All units used the double sigmoid transfer function. These ANNs will be referred to as ANNSL,U, where L indicates the number of hidden layers
(1 or 2) and U the number of units per hidden layer. ANNSL will be used to denote the entire set of ANNs with L hidden layers.
3. Data Sets and Training
To train the ANNs, a training set was constructed by drawing samples randomly, using a uniform distribution, from image A (input) and its Kuwahara-filtered version (output), both shown in Figure 26a. The original 8-bit, 256-gray value image was converted to a floating point image and rescaled to the range [−0.5, 0.5]. Three data sets were constructed, containing 1000 samples each: a training set, a validation set, and a testing set. The validation set was used to prevent overtraining: if the error on the validation set did not drop below the minimum error found so far on that set for 1000 cycles, training was stopped. Because in all experiments only k = 3 Kuwahara filters were studied, the input to each ANN was a 5 × 5 region of gray values and the training target was a single value. For the modular ANNs, additional data sets were constructed from these original data sets to obtain the mappings required by the individual ANNs (average, variance, position-of-minimum, and selection).
For training, the standard stochastic backpropagation algorithm (Rumelhart et al., 1986) was used. Weights were initialized to random values drawn from a uniform distribution in the range [−0.1, 0.1]. The learning rate was set to 0.1; no momentum was used. Training was stopped after 25,000 cycles or when the validation set indicated overtraining, whichever came first. All experiments were repeated five times with different random initializations; all results reported are averages over the five runs. Wherever appropriate, error bars indicate standard deviations.
4. Results
Results are given in Figures 30 and 31 and are discussed below for the different architectures.
a. Modules. The different modules show rather different behavior (Fig. 30). Note that in these figures the MSE was calculated on a testing set of 1000 samples.
As was to be expected, the MSE is lowest for the hand-constructed ANN1M modules: for all modules except MODPos, it was 0. The error remaining for the MODPos module may look quite high, but it is caused mainly by the ANN choosing a wrong minimum when two or more input values σi² are very similar. Although the effect on the behavior of the final module (MODSel) will be negligible, the MSE is quite high, since one output that should have been 0.5 is incorrectly set to 0.0 and vice versa, leading to an MSE of 0.25 for that input pattern. For the other ANNs, it seems that if the
Figure 30. Performance of the individual modules on the testing set in each of the modular ANNs, ANN1M . . . ANN4M.
manually set weights are dropped (ANN2M), the modules are not able to learn their function as well as possible (i.e., as well as ANN1M). Nonetheless, the MSE is quite good and comparable to that of ANN3M and ANN4M. When the individual tasks are considered, the average is obviously the easiest function to approximate. Only for ANN4M, in which standard modules with two hidden layers were used, is the MSE larger than 0.0; apparently these modules generalize less well than the hand-constructed, linear MODAvgs. The variance, too, is not difficult: MSEs are O(10⁻⁵). Clearly, the position-of-minimum task is the hardest. Here, almost all ANNs perform poorly. Performances on the selection problem, finally, are quite good. What is interesting is that the more constrained modules (ANN2M, ANN3M) perform less well than the standard ones. Here again, the fact that the construction is closely tied to one optimal set of weights plays a role: although an optimal weight set exists, the training algorithm did not find it.
b. Modular Networks. When the modules are concatenated, the initial MSEs of the resulting ANNs are poor: for ANN2M, ANN3M, and ANN4M they are O(1), O(10⁻¹), and O(10⁻²), respectively. The MODPos module is mainly responsible for this; it is the hardest module to learn, due to the nonlinearity involved (see the discussion in Section V.B.4.a). If the trained MODPos in ANN2M . . . ANN4M is replaced by the constructed ANN1M module, the overall MSE always decreases significantly (see Table 2). This is an indication that although its MSE seems low [O(10⁻²)], this module does not perform well, and that the overall MSE is highly sensitive to the error this module makes. However, when the complete ANNs are trained a little further with a low learning rate (0.1), the MSE improves rapidly: after only 100–500 learning cycles, training can be stopped. The same effect occurs in Pugmire et al. (1998).
The MSEs of the final ANNs on the entire image are shown in Figure 31a, e, and i for images A, B, and C, respectively. Images B and C were preprocessed in the same way as image A: the original 8-bit (B) and 5-bit (C) gray value images were converted to floating point images with gray values in the range [−0.5, 0.5]. To get an idea of the significance of these results, reinitialized versions of the same ANNs were also trained; that is, all weights of the concatenated ANNs were initialized randomly, without using the prior knowledge of modularity. The results of these training runs are shown in Figure 31b, f, and j. Note that only ANN2M cannot be trained well from scratch, due to the nonstandard transfer functions used. For ANN3M and ANN4M the MSE is comparable to that of the other ANNs. This would indicate that modular training is not beneficial, at least according to the MSE criterion.
Figure 31. (Continued)
Figure 31. Performance of all ANNMs and ANNSs on the three images used: (a–d) on image A (Fig. 26a), (e–h) on image B (Fig. 26b), and (i–l) on image C (Fig. 26c). For the ANNSs, the x-axis indicates the number of hidden units per layer.
TABLE 2
Dependence of Performance, in MSE on the Image A Testing Set, on the MODPos Module^a

Type     MSE                            MSE with MODPos of ANN1M
ANN2M    9.2 × 10⁻¹ ± 5.2 × 10⁻¹        8.7 × 10⁻⁴ ± 1.7 × 10⁻⁴
ANN3M    1.2 × 10⁻¹ ± 1.2 × 10⁻¹        1.0 × 10⁻³ ± 2.0 × 10⁻⁴
ANN4M    3.6 × 10⁻² ± 1.7 × 10⁻²        1.2 × 10⁻³ ± 2.4 × 10⁻⁴

^a Values given are average MSEs ± standard deviations.
The ANNs seem to generalize well, in that nearly identical MSEs are reached by each network on all three images. However, the variance in MSE is larger on images B and C than on image A. This indicates that the modular networks may have become slightly too adapted to the content of image A.
c. Standard Networks. Results for the standard ANNs, ANNSs, are shown in Figure 31c–d, g–h, and k–l for images A, B, and C. In each case, the first figure gives the results for ANNs with one hidden layer and the second figure for ANNs with two hidden layers. What is most striking is that for almost all sizes of the ANNs the MSEs are more or less the same. Furthermore, this MSE is nearly identical to the one obtained by the modular ANNs ANN2M . . . ANN4M. It also seems that the smaller ANNs, which give a slightly larger MSE on images A and B, perform a bit worse on image C. This is due to the larger number of edge pixels in image C; the next section will discuss this further.
C. Investigating the Error
The experiments in the previous section indicate that no matter which ANN is trained (except for ANN1M), the MSE it will be able to reach on the images is equal. However, visual inspection shows small differences between images filtered by the various ANNs; see, e.g., the left and center columns of Figure 32. To gain more insight into the actual errors the ANNs make, a technique can be borrowed from the field of Bayesian learning, which allows the calculation of error bars for each output of the ANN (Bishop, 1995). The computation is based on the Hessian of the ANN output with respect to its weights w, H = ∇w² R(x; w), which needs to be found first. Using H, for each input x a corresponding variance σtot² can be found. This makes it possible to create an image in which each pixel corresponds to σtot, i.e., the gray value
equals half the width of the error bar on the ANN output at that location. Conversely, the inverse of σtot is sometimes used as a measure of confidence in an ANN output for a certain input. For a number of ANNs, the Hessian was calculated using a finite differencing approximation (Bishop, 1995). To calculate the error bars, this matrix first has to be inverted. Unfortunately, for the ANNMs inversion was impossible, as their Hessian matrices were too ill-conditioned because of the complicated architectures containing fixed and shared weights. Figure 32b and c shows the results for two standard ANNs, ANNS1,25 and ANNS2,25. In the left column the ANN output for image A (Fig. 26a) is shown. The center column shows the absolute difference between this output and the target image. In the third column the error bars calculated using the Hessian are shown. The figures show that the error the ANN makes is not spread out evenly over the image. The highest errors occur near the edges in image A, as can be seen by comparing the center column of Figure 32 with the gradient magnitude |∇IA| of image A, shown in Figure 33a. This gradient magnitude is calculated as (Young et al., 1998)

|∇IA| = √( (∂IA/∂x)² + (∂IA/∂y)² )    (21)

where ∂IA/∂x is approximated by convolving image A with a [−1 0 1] mask, and ∂IA/∂y by convolving image A with its transpose. The error bar images, in the right column of Figure 32, show that the standard deviation of the ANN output is also highest on and around the edges. Furthermore, although the outputs of the ANNs look identical, the error bars show that the ANNs actually behave differently. These results lead to the conclusion that the ANNs have learned fairly well to approximate the Kuwahara filter in flat regions, where it operates like a local average filter.
However, on and around edges they fail to give the correct output; most edges are sharpened slightly, but not nearly as much as they would be by the Kuwahara filter. In other words, the linear operation of the Kuwahara filter is emulated correctly, but the nonlinear part is not. Furthermore, the error bar images suggest there are differences between ANNs that are not expressed in their MSEs.
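For reference, the gradient magnitude of Eq. (21), used above to localize the errors, can be sketched in NumPy as follows. Border pixels are simply left at zero here, an assumption for brevity:

```python
import numpy as np

def gradient_magnitude(img):
    """Gradient magnitude as in Eq. (21): dI/dx approximated with a
    [-1 0 1] mask along x, dI/dy with its transpose along y.
    (The sign convention is irrelevant after squaring.)"""
    img = img.astype(float)
    dx = np.zeros_like(img)
    dy = np.zeros_like(img)
    # central differences on interior pixels; borders left at zero
    dx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    dy[1:-1, :] = img[2:, :] - img[:-2, :]
    return np.sqrt(dx ** 2 + dy ** 2)
```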
D. Discussion
The most noticeable result of the experiments above is that whatever ANN is trained, be it a simple ANN with one hidden unit or a specially constructed
Figure 32. (a) The original image A. (b) and (c), from left to right: outputs of two ANNSs on image A; absolute differences between target image and ANN output; and ANN output error bar widths, plotted as gray values.
Figure 33. (a) The gradient magnitude of image A, |∇IA|. (b) Performance of ANNS1,50 for various training set sample sizes.
modular ANN, approximately the same performance (measured in MSE) can be reached. Modular training does not seem to boost performance at all. However, inspection of error images and standard deviation of ANN outputs suggests that there are differences between ANNs. Furthermore, the errors made by ANNs are concentrated around edges, i.e., in the part where the Kuwahara filter’s nonlinearity comes into play. There are a number of hypotheses as to what causes all ANNs to seemingly perform equally well, some of which will be investigated in the next section:
— The problem may simply be too hard to be learned by a finite-size ANN. This does not seem plausible, since even for a two-hidden-layer ANN with 250 hidden units per layer, resulting in a total of 69,000 free parameters, the MSE is no better than for very simple ANNs. One would at least expect to see some enhancement of the results.
— The sample size of 1000 may be too small, as it was chosen rather arbitrarily. An experiment was performed in which ANNS1,50 was trained using training sets with 50, 100, 250, 500, 1000, and 2000 samples. The results, given in Figure 33b, show, however, that the chosen sample size of 1000 seems sufficient: the decrease in MSE when using 2000 samples in the training set is rather small.
— The training set may not be representative of the problem; i.e., the nature of the problem may not be well reflected in the way the set is sampled from the image.
— The error criterion may not be fit for training the ANNs or assessing their performance. It is quite possible that the MSE criterion used is of limited use in this problem, since it weighs the interesting parts of the image, around the edges, and the less interesting flat parts equally.
— The problem may be of such a nature that local minima are prominently present in the error surface, while the global minima are very hard to reach, causing suboptimal ANN operation.
VI. Inspection and Improvement of Regression Networks
This section tries to answer the questions raised by the experiments in the previous section by investigating the influence of the data set, the appropriateness of the MSE as a performance measure, and the trained ANNs themselves.
A. Edge-Favoring Sampling
Inspection of the ANN outputs and the error bars on those outputs led to the conclusion that the ANNs had learned to emulate the Kuwahara filter well in most places, except in regions near edges (Section V.C). A problem in sampling a training set from an image12 for this particular
12 From here on, the term sampling will be used to denote the process of constructing a data set by extracting windows from an image with coordinates sampled from a certain distribution on the image grid.
application is that such interesting regions, i.e., the regions where the filter is nonlinear, are very poorly represented. Edge pixels constitute only a very small percentage of the total number of pixels in an image [as a rule of thumb, O(√n) edge pixels on O(n) image pixels] and will therefore not be represented well in the training set when sampling randomly using a uniform distribution. To learn more about the influence of the training set on performance, a second group of data sets was created by sampling from image A (Fig. 26a) with a probability density function based on its gradient magnitude image |∇IA| [Eq. (21)]. If |∇I| is scaled by a factor c such that ∫x ∫y c |∇I(x, y)| dy dx = 1, and used as a probability density function when sampling, edge regions have a much higher probability of being included in the data set than pixels from flat regions. This will be called edge-favoring sampling, as opposed to normal sampling.
1. Experiments
Performances (in MSE) of ANNs trained on this edge-favoring set are given in Figures 34 and 35. Note that the results obtained on the normal training set (first shown in Fig. 31) are included again to facilitate comparison. The sampling of the data set clearly has an influence on the results. Because the edge-favoring set contains more samples taken from regions around edges, the task of finding the mean is harder to learn due to the larger variation. At the same time, it eases training the position-of-minimum and selection modules. For all tasks except the average, the final MSE on the edge-favoring testing set (Fig. 34b, d, f, and h) is better than that of ANNs trained using a normal training set. The MSE is, in some cases, even lower on the normal testing set (Fig. 34e and g). Overall results for the modular and standard ANNs (Fig. 35) suggest that performance decreases when ANNs are trained on a specially selected data set (i.e., the MSE increases).
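The edge-favoring sampling described above amounts to drawing pixel coordinates with probability proportional to the gradient magnitude. A minimal sketch, in which the function name and generator handling are my own:

```python
import numpy as np

def edge_favoring_coordinates(grad_mag, n_samples, rng=None):
    """Sample pixel coordinates with probability proportional to the
    gradient magnitude image |grad I|, normalized so the densities sum
    to one (the scaling factor c from the text)."""
    rng = np.random.default_rng() if rng is None else rng
    p = grad_mag.ravel().astype(float)
    p /= p.sum()  # normalize to a probability density over the image grid
    idx = rng.choice(p.size, size=n_samples, p=p)
    return np.unravel_index(idx, grad_mag.shape)  # (row indices, col indices)
```

Windows centered on the returned coordinates would then form the edge-favoring training set.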
However, when the quality of the filtering operation is judged by looking at the filtered images (see, e.g., Fig. 36), one finds that these ANNs give superior results in approximating the Kuwahara filter. Clearly, there is a discrepancy between performance as indicated by the MSE and the visual perception of filter quality. Therefore, below we investigate the possibility of finding another way of measuring performance.
B. Performance Measures for Edge-Preserving Smoothing
The results given in Section VI.A.1 show that it is very hard to interpret the MSE as a measure of filter performance. Although the MSEs differ only
Figure 34. Performance of the individual modules in each of the modular ANNs, ANN1M . . . ANN4M, on the normal testing set (a, c, e, g) and the edge-favoring testing set (b, d, f, h).
Figure 35. (Continued).
Figure 35. Performance of all ANNMs and ANNSs on the three images used: (a–d) on image A (Fig. 26a), (e–h) on image B (Fig. 26b), and (i–l) on image C (Fig. 26c).
slightly, visually the differences are quite large. If images filtered by various ANNs trained on the normal and edge-favoring data sets are compared, it seems clear which ANN performs better. As an example, Figure 36 shows two filtered images. The left image was filtered by ANN4M trained on an edge-favoring training set. The image on the right is the output of ANNS1,100 trained on a normal data set. Although the MSEs are nearly equal (1.48 × 10⁻³ for the left image versus 1.44 × 10⁻³ for the right one), in the left image the edges seem much crisper and the regions much smoother than in the image on the right; that is, one would judge the filter used to produce the left image to perform better. One would like to find a measure of filter performance that bears more relation to this qualitative judgment than the MSE. The reason the MSE is so uninformative is that by far the largest number of pixels do not lie on edges. Figure 37a illustrates this: the histogram of the gradient magnitude image is concentrated near zero, i.e., most pixels lie in flat regions. Because the MSE averages over all pixels, it may be quite low for filters that preserve edges poorly. Vice versa, the visual quality of the images produced by the ANNs trained using the edge-favoring data set may be better even though their MSE is worse, due to a large number of small errors made in flat regions. The finding that the MSE does not correlate well with perceptual quality judgment is not new. A number of alternatives have been proposed, among which the mean absolute error (MAE) seems to be the most prominent. There is also a body of work on performance measures for edge detection, e.g., Pratt's Figure of Merit (FOM) (Pratt, 1991) or Average Risk (Spreeuwers, 1992). However, none of these captures the dual goals of edge sharpening and region smoothing present in this problem.
Figure 36. Two ANN output images with details. For the left image, output of ANN4M trained on the edge-favoring set, the MSE is 1.48 × 10⁻³; for the right image, output of ANNS1,100 trained on a normally sampled set, it is 1.44 × 10⁻³. The details in the middle show the target output of the Kuwahara filter; the entire target image is shown in Figure 26a.
Figure 37. (a) Histograms of gradient magnitude values |∇I| of image A (Fig. 26a) and a Kuwahara-filtered version (k = 3). (b) Scattergram of the gradient magnitude image pixel values with estimated lines.
1. Smoothing versus Sharpening
In edge-preserving smoothing, two goals are pursued: on the one hand, the algorithm should preserve edge sharpness; on the other hand, it should smooth the image in regions that do not contain edges. In other words, the gradient of an image should remain the same in places where it is high13 and decrease where it is low. If the gradient magnitude |∇I| of an image I is plotted versus |∇f(I)| of a Kuwahara-filtered version f(I), for each pixel I(i, j), the result will look like Figure 37b. In this figure, the two separate effects can be seen: for a number of points the gradient is increased by filtering, while for another set of points the gradient is decreased. The steeper the upper cloud, the better the sharpening; the flatter the lower cloud, the better the smoothing. Note that the figure gives no indication of the density of the two clouds: in general, by far the most points lie in the lower cloud, since more pixels lie in smooth regions than on edges. The graph is reminiscent of the scattergram approach discussed (and denounced) in Katsulai and Arimizu (1981), but here the scattergram of the gradient magnitude images is shown.
13 Or even increase. If the regions divided by the edge become smoother, the gradient of the edge itself may increase, as long as there was no overshoot in the original image. Overshoot is defined as the effect of artificially sharp edges, which may be obtained by adding a small value to the top part of an edge and subtracting a small value from the lower part (Young et al., 1998).
To estimate the slope of the trend of both clouds, the point data is first separated into two sets:

A = \left\{ \bigl(|\nabla I|(i,j),\ |\nabla f(I)|(i,j)\bigr) \;\middle|\; |\nabla I|(i,j) \geq |\nabla f(I)|(i,j) \right\}   (22)

B = \left\{ \bigl(|\nabla I|(i,j),\ |\nabla f(I)|(i,j)\bigr) \;\middle|\; |\nabla I|(i,j) < |\nabla f(I)|(i,j) \right\}   (23)
Lines y = ax + b can be fitted through both sets using a robust estimation technique, minimizing the absolute deviation (Press et al., 1992), to get a density-independent estimate of the factors with which edges are sharpened and flat regions are smoothed:

(a_A, b_A) = \arg\min_{(a,b)} \sum_{(x,y) \in A} |y - (ax + b)|   (24)

(a_B, b_B) = \arg\min_{(a,b)} \sum_{(x,y) \in B} |y - (ax + b)|   (25)
The slope of the lower line found, a_A, gives an indication of the smoothing induced by the filter f. Likewise, a_B gives an indication of the sharpening effect of the filter. The offsets b_A and b_B are discarded, although it is necessary to estimate them to avoid a bias in the estimates of a_A and a_B. Note that a demand is that a_A ≤ 1 and a_B ≥ 1, so the values are clipped at 1 if necessary; because the estimated trends are not forced to go through the origin, this might be the case. To account for the number of pixels actually used to estimate these values, the slopes found are weighted with the relative number of points in the corresponding cloud. Therefore, the numbers

\mathrm{Smoothing}(f, I) = \frac{|A|}{|A| + |B|} \, (a'_A - 1)   (26)

and

\mathrm{Sharpening}(f, I) = \frac{|B|}{|A| + |B|} \, (a_B - 1)   (27)

are used, where a'_A = 1/a_A was substituted to obtain numbers in the same range [0, 1]. These two values can be considered to be an attenuation factor of flat regions and an amplification factor of edges, respectively. Note that these measures cannot be used as absolute quantitative indications of filter performance, since a higher value does not necessarily mean better performance, i.e., there is no absolute optimal value. Furthermore, the measures are highly dependent on image content and
DE RIDDER ET AL.
scaling of f(I) with respect to I. The scaling problem can be neglected, however, since the ANNs were trained to give output values in the correct range. Thus, for various filters f(I) on a certain image, these measures can now be compared, giving an indication of relative filter performance on that image. To get an idea of the range of possible values, smoothing and sharpening values for some standard filters can be calculated, like the Kuwahara filter; a Gaussian filter

f_G(I; \sigma) = I \otimes \frac{1}{2\pi\sigma^2} \exp\left( -\frac{x^2 + y^2}{2\sigma^2} \right)   (28)

for¹⁴ σ = 0.0, 0.1, …, 2.0; and an unsharp masking filter

f_U(I; \lambda) = I - \lambda \left[ I \otimes \frac{1}{16} \begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{pmatrix} - I \right]   (29)
which subtracts λ times the Laplacian¹⁵ from an image, for λ = 0.0, 0.1, …, 2.0.

2. Experiments

Smoothing and sharpening performance values were calculated for all ANNs discussed in Section VI.A.1. The results are shown in Figure 38. First, lines of performance values for the Gaussian and unsharp masking filters give an indication of the range of possible values. As expected, the Gaussian filter on images A and B (Fig. 26a and b) gives high smoothing values and low sharpening values, while the unsharp masking filter gives low smoothing values and high sharpening values. The Kuwahara filter scores high on smoothing and low on sharpening. This is exactly as it should be: the Kuwahara filter should smooth while preserving the edges; it should not necessarily sharpen them. If ANNs have a higher sharpening value, they are usually producing overshoot around the edges in the output images.

The measures calculated for image C (Fig. 26c) show the limitations of the method. In this image there is a large number of very sharp edges in an otherwise already rather smooth image. For this image the Gaussian filter gives only very low smoothing values and the unsharp masking filter gives no sharpening value at all. This is due to the fact that for this image, subtracting the Laplacian from an image produces a very small sharpening

¹⁴ For σ ≤ 0.5 the Gaussian is ill-sampled; in this case, a discrete approximation is used that is not, strictly speaking, a Gaussian.
¹⁵ This is an implementation of the continuous Laplacian edge detector mentioned in Section IV.A.1, different from the discrete detector shown in Figure 11.
value, together with a negative smoothing value, caused by the Laplacian greatly enhancing the amount of noise in the image. Because the values were clipped at 0, the results are not shown in the figure.

Regarding the ANNs, some things become clear. First, the hand-constructed ANN (ANN1M) almost perfectly mimics the Kuwahara filter, according to the new measures. However, as soon as the hand-set weights are dropped (ANN2M), performance drops drastically. Apparently the nonstandard transfer functions and special architecture inhibit the ANN too much. ANN3M and ANN4M perform better and generalize well to other images. However, besides ANN1M, no other ANN in this study seems able to approximate the Kuwahara filter well; the best trained ANN still performs much worse.

Second, edge-favoring sampling has a strong influence. Most of the architectures discussed perform reasonably only when trained on a set with a significantly larger number of edge samples than acquired by random sampling, especially the ANNSs. This indicates that although the MSE suggests that ANNs trained on an edge-favoring set perform worse, sampling in critical areas of the image is a prerequisite for obtaining a well-performing, nonlinear approximation to the Kuwahara filter.

Most standard ANNs perform poorly. Only for ANNS210, ANNS225, and ANNS250 is performance reasonable. In retrospect, this concurs with the drop in the MSE that can be seen in Figure 35d, although the differences there are very small. ANNS250 clearly performs best. A hypothesis is that this depends on the training of the ANNs, since training parameters were not optimized for each ANN. To verify this, the same set of standard ANNs was trained in experiments in which the weights were initialized using random values drawn from a uniform distribution over the range [-1.0, 1.0], using a learning rate of 0.5. Now the optimal standard ANN was found to be ANNS225, with all other ANNs performing very poorly.
Generalization is, for all ANNs, reasonable. Even on image C (Fig. 26c), which differs substantially from the training image (image A, Fig. 26a), performance is quite good. The best standard ANN, ANNS250 , seems to generalize a little better than the modular ANNs.
3. Discussion In Dijk et al. (1999), it is shown that the smoothing and sharpening performance measures proposed here correlate well with human perception. It should be noted that in this study, subjects had fewer problems in discerning various levels of smoothing than they had with levels of sharpening. This indicates that the two measures proposed are not equivalently spaced.
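Assembled into one routine, Eqs. (22) through (27) might be computed as in the sketch below. Note the simplifications: np.polyfit (ordinary least squares) stands in for the robust least-absolute-deviation fit of Press et al., and degenerate clouds are simply skipped:

```python
import numpy as np

def fit_slope(x, y):
    """Fit y = a*x + b. Ordinary least squares is used here for
    brevity; the text uses a robust least-absolute-deviation fit."""
    a, b = np.polyfit(x, y, 1)
    return a, b

def smoothing_sharpening(grad_orig, grad_filt):
    """Sketch of the measures of Eqs. (26) and (27), given the
    gradient magnitude images of I and f(I)."""
    x, y = np.ravel(grad_orig), np.ravel(grad_filt)
    in_A = x >= y                     # Eq. (22): gradient not increased
    in_B = ~in_A                      # Eq. (23): gradient increased
    a_A, _ = fit_slope(x[in_A], y[in_A]) if in_A.sum() > 1 else (1.0, 0.0)
    a_B, _ = fit_slope(x[in_B], y[in_B]) if in_B.sum() > 1 else (1.0, 0.0)
    a_A = min(a_A, 1.0)               # demands: a_A <= 1, a_B >= 1
    a_B = max(a_B, 1.0)
    n = x.size
    smoothing = in_A.sum() / n * (1.0 / a_A - 1.0)
    sharpening = in_B.sum() / n * (a_B - 1.0)
    return smoothing, sharpening
```

As a sanity check, halving every (positive) gradient value yields (1.0, 0.0), a pure smoother; doubling every value yields (0.0, 1.0), a pure sharpener.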
Figure 38. Performance of standard filters, all ANNMs, and ANNSs on the three images used: (a–d) on image A (Fig. 26a), (e–h) on image B (Fig. 26b), and (i–l) on image C (Fig. 26c). In the legends, ef stands for ANNs trained on edge-favoring data sets, as opposed to normally sampled data sets (nrm); further indicates ANNs initialized by training the individual modules as opposed to ANNs trained from scratch (over); and 10, 25, and so on denote the number of units per hidden layer.
The fact that the measures show that edge-favoring sampling in building a training set increases performance considerably suggests possibilities for extensions. Pugmire et al. (1998) claim that learning should be structured, i.e., start with the general problem and then proceed to special cases. This can easily be accomplished in training set construction, by adding a constant to each pixel in the gradient magnitude image before scaling it and using it as a probability density function from which window coordinates are sampled. If this constant is gradually lowered, edge pixels become better represented in the training set.

Another, more general possibility would be to train ANNs on normally sampled data first and calculate an error image (such as those shown in the center column of Fig. 32). Next, the ANN could be trained further, or retrained, on a data set sampled using the distribution of the errors the ANN made; a new error image can then be calculated, and so on. This is similar to boosting and arcing approaches in classification (Schapire, 1990). An advantage is that this does not use the prior knowledge that edges are important, which makes it more generally applicable.

4. Training Using Different Criteria

Ideally, the sharpening and smoothing performance measures discussed in the previous section should be used to train ANNs. However, this is infeasible, as they are not differentiable. This means they could be used only in learning procedures that do not need the criterion function to be differentiable, such as reinforcement learning (Gullapalli, 1990). This falls outside the scope of the experiments in this section. However, the previous section showed that ANNs did learn to emulate the Kuwahara filter better when trained using the edge-favoring data set. Note that constructing a data set in this way is equivalent to using a much larger data set and weighing the MSE with the gradient magnitude.
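The structured-sampling idea described above, using the gradient magnitude plus a constant as a sampling density and gradually lowering the constant, can be sketched as follows (illustrative names, NumPy):

```python
import numpy as np

def edge_favoring_density(grad_mag, c):
    """Gradient magnitude plus a constant c. Lowering c during
    training shifts the sampling density toward edge pixels."""
    return grad_mag + c

def sample_coords(density, n, rng):
    """Draw n (row, col) window positions from an unnormalized density."""
    p = density.ravel() / density.sum()
    idx = rng.choice(p.size, size=n, p=p)
    return np.unravel_index(idx, density.shape)
```

The same two functions cover the boosting-like variant: replace the gradient magnitude by the current error image and resample between training rounds.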
Therefore, this approach is comparable to using an adapted error criterion in training the ANN. However, this weighting is quite specific to this problem. In the literature, several more general alternatives to the MSE [Eq. (8)] have been proposed (Hertz et al., 1991; Burrascano, 1991). Among these, a very flexible family of error criteria based on the Lp norm is

E^p(W, B) = \frac{1}{2|L|} \sum_{(x_i, y_i) \in L} \sum_{\mu=1}^{m} \left| R(x_i; W, B)_\mu - y_{i,\mu} \right|^p   (30)
where p ∈ ℤ. Note that for p = 2, this criterion is equal to the MSE. For p = 0, each error is considered equally bad, no matter how small or large it is. For p = 1, the resulting error criterion is known as the mean absolute error, or MAE. The MAE is more robust to outliers than the MSE, as larger
errors are given relatively smaller weights than in the MSE. For p > 2, larger errors are given more weight, i.e., the data are considered not to contain outliers. In fact, which p to use should be decided by assuming a noise model for the target data (Burrascano, 1991). The L1 norm (robust to outliers) corresponds to a noise distribution with large tails, a Laplacian distribution, under which outliers are probable. At the other extreme, the L∞ norm corresponds to a uniform noise distribution.

As discussed before, the Kuwahara filter is most interesting around the edges in an image, where the filter behaves nonlinearly. It was also shown that exactly around these edges most ANNs make the largest errors (Fig. 32). Therefore, it makes sense to use an error criterion that puts more emphasis on larger errors, i.e., the Lp norm for p > 2. To this end, a number of experiments were run in which different norms were used. Although implementing these criteria in the backpropagation algorithm is trivial (only the gradient calculation at the output units changes), the modified algorithm does not converge well using standard settings. The learning rate and initialization have to be adapted for each choice of norm to avoid divergence. Therefore, the norms were used in the CGD training algorithm, which is less sensitive to initialization and choice of criterion due to the line minimization involved.

The best performing ANN found in Section VI.B, ANNS250, was trained using CGD with the Lp norm. The parameter p was set to 1, 2, 3, 5, and 7, and both the normal and the edge-favoring training sets were used. The ANN was trained using the same settings as before; in the CGD algorithm, directions were kept conjugate for 10 iterations. Figure 39 shows the results. Clearly, using the Lp norm helps the ANN trained on the normal set to achieve better performance (Fig. 39a). For increasing p, the sharpening performance becomes higher.
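Only the output-layer derivative changes in backpropagation when switching to Eq. (30); a sketch, assuming one scalar output per sample:

```python
import numpy as np

def lp_error(outputs, targets, p):
    """Lp error criterion of Eq. (30); p = 2 gives the usual MSE
    (including the conventional factor 1/2)."""
    return (np.abs(outputs - targets) ** p).sum() / (2 * len(outputs))

def lp_output_gradient(outputs, targets, p):
    """Derivative of the Lp criterion w.r.t. the network outputs,
    the only part of backpropagation that changes."""
    d = outputs - targets
    return p * np.abs(d) ** (p - 1) * np.sign(d) / (2 * len(outputs))
```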
However, the smoothing performance still lags behind that of the ANN trained using the MSE on the edge-favoring training set (Fig. 38d). When ANNS250 is trained using the Lp norm on the edge-favoring data set, smoothing performance actually decreases (Fig. 39b). This is caused by the fact that the training set and error criterion in concert stress errors around edges so much that the smoothing operation in flat regions suffers. Figure 40 illustrates this by showing the output of ANNS225 as well as the absolute difference between this output and the target image, for various values of p. For increasing p, the errors become less localized around the edges; for p ≥ 3 the error in flat regions becomes comparable to that around edges.

In conclusion, using different Lp norms instead of the MSE can help in improving performance. However, it does not help as much as edge-favoring sampling from the training set, since only the latter influences the error criterion exactly where it matters, around edges. Furthermore, it requires choosing a
value for the parameter p, for which an optimal setting is not clear beforehand. Finally, visual inspection still shows p = 2 to be the best choice.

C. Inspection of Trained Networks

1. Standard Networks

To gain insight into the relatively poor performance of most of the standard ANNs according to the performance measure introduced in Section VI.B, a very simple architecture was created, containing only a small number of weights (see Fig. 41a). Because the Kuwahara filter should be isotropic, a
Figure 39. Performance of ANNS250 on image A (Fig. 26a), trained using different Lp norm error criteria and (a) the normal training set and (b) the edge-favoring training set.
Figure 40. Top row: output of ANNS250 trained using the Lp norm on the edge-favoring data set, for various p (a–e). Bottom row: absolute difference between output and target image.
symmetric weight mask was imposed on the weights (cf. Section IV.A.2.d). Furthermore, linear transfer functions were used to avoid the complications introduced in the analysis by the use of sigmoids. No bias was used. This ANN was trained on the normal data set, using a validation set. The learned weight set is shown in Figure 42a. In filtering terms, the main component looks like a negative Laplacian-of-Gaussian (i.e., the negative values around the center and the slightly positive values in the four corners). Further analysis showed that this filter closely resembles a linear combination of a normal Gaussian and a Laplacian-of-Gaussian.

To confirm the hypothesis that standard ANNs learned such linear approximations to the Kuwahara filter, a simple standard ANN was trained in the same way ANNK was, using the DCGD training algorithm (Section IV.B.2). This ANN, ANNS12, is shown in Figure 41b. All weights were initialized to a fixed value of 0.01, λ was set to 1, and the number of directions to be kept conjugate was set to 10. After training, the MSE on the testing set was 1.43 × 10⁻³, i.e., comparable to that of other standard ANNs (Fig. 31), and C2 was 5.1 × 10⁻³. The resulting weight sets show that the filter can indeed be decomposed into a Gaussian-like and a negative Laplacian-like filter. Adding more hidden units and training using DCGD, for which results are not shown here, did not cause any new filters to be found. This decomposition can well be explained by looking at the training objective. The Kuwahara filter smoothes images while preserving the edges. The Gaussian is a smoothing filter, while its second derivative, the Laplacian, emphasizes edges when subtracted from the original. Therefore, the following model for the filter found by the ANN was set up:
Figure 41. (a) ANNK, the simplest linear ANN to perform a Kuwahara filtering: a 5 × 5 unit input layer and one output unit without bias. The ANN contains six independent weights, indicated in the mask by the letters A through F. (b) ANNS12: two hidden units, no mask (i.e., no restrictions).
Figure 42. (a) Weights found in ANNK (Fig. 41a). (b) Weights generated by the fitted model [Eq. (31): c1 = 10.21, σ1 = 2.87, c2 = 3.41, σ2 = 0.99]. (c) A cross section of this model at x = 0. (d, e) Weight matrices found in ANNS12 (Fig. 41b) trained using DCGD.
f(c_1, \sigma_1, c_2, \sigma_2) = c_1 f_G(\sigma_1) - c_2 f_L(\sigma_2)
  = c_1 \frac{1}{2\pi\sigma_1^2} \exp\left( -\frac{x^2 + y^2}{2\sigma_1^2} \right) - c_2 \frac{x^2 + y^2 - 2\sigma_2^2}{2\pi\sigma_2^6} \exp\left( -\frac{x^2 + y^2}{2\sigma_2^2} \right)   (31)
in which c1 and σ1 are parameters to be estimated for the Gaussian, and c2 and σ2 are parameters for the Laplacian. Figure 42c shows these two functions. A Gauss-Newton fitting procedure (Mathworks Inc., 2000) was used to find the parameters of f(c1, σ1, c2, σ2) given the weights shown in Figure 42a. The resulting model weights are shown in Figure 42b, and a cross section is shown in Figure 42c. Although the fit (c1 = 10.21, σ1 = 2.87, c2 = 3.41, σ2 = 0.99) is not perfect, with a model fit MSE of 2.5 × 10⁻³, the correlation between the model and the actual weights is quite high (C = 0.96).

The hypothesis was that this solution, i.e., applying a Gaussian and a Laplacian, was a local minimum to which the ANNSs had converged. To test this, the model fitting procedure was applied to each of the units in the first hidden layer of each of the ANNSs. This resulted in a model fit error and correlation C between the actual weights and the model weights for each unit. The results, given in Figure 43, show that, at least for the smaller ANNs, the hypothesis is supported by the data. For the ANNs trained on the normal data set, over a large range of sizes (i.e., 1-5, 10, and 25 hidden units) the model closely fits each hidden unit. Only for larger numbers of hidden units does the fit become worse. The reason for this is that in these ANNs many units have an input weight distribution that is very hard to interpret. However, these units do not play a large role in the final ANN output, since they are weighted by small weights in the next layer. For the ANNs trained on the edge-favoring set the fit is less good, but still gives a reasonable correlation. Note, however, that ANNs that have high performance with respect to the smoothing and sharpening measures (Section VI.B.2) do not necessarily show the lowest correlation: ANNSs with more hidden units give even lower correlation. An opposite effect is playing a role here: as ANNs become too large, they are harder to train.
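The decomposition test can be sketched as follows. Eq. (31) is linear in c1 and c2, so the sketch solves those in closed form and grid-searches σ1 and σ2; this replaces the Gauss-Newton procedure used in the text and is only meant to illustrate the fit:

```python
import numpy as np

def bases(s1, s2, size=5):
    """Gaussian and Laplacian-of-Gaussian masks of Eq. (31)."""
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    q = x ** 2 + y ** 2
    gauss = np.exp(-q / (2 * s1 ** 2)) / (2 * np.pi * s1 ** 2)
    log = (q - 2 * s2 ** 2) / (2 * np.pi * s2 ** 6) * np.exp(-q / (2 * s2 ** 2))
    return gauss, log

def model(c1, s1, c2, s2, size=5):
    """f(c1, s1, c2, s2) = c1 * f_G(s1) - c2 * f_L(s2)."""
    gauss, log = bases(s1, s2, size)
    return c1 * gauss - c2 * log

def fit_model(weights, sigmas=np.linspace(0.5, 4.0, 36)):
    """Fit (c1, s1, c2, s2) to a square weight mask."""
    t, best = weights.ravel(), None
    for s1 in sigmas:
        for s2 in sigmas:
            gauss, log = bases(s1, s2, weights.shape[0])
            # columns: Gaussian basis and minus-LoG basis, so the
            # least-squares coefficients are exactly (c1, c2)
            A = np.stack([gauss.ravel(), -log.ravel()], axis=1)
            coef, *_ = np.linalg.lstsq(A, t, rcond=None)
            err = float(np.sum((A @ coef - t) ** 2))
            if best is None or err < best[0]:
                best = (err, coef[0], s1, coef[1], s2)
    return best  # (residual, c1, s1, c2, s2)
```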
The conclusion is that many of the standard ANNs have learned a linear approximation to the Kuwahara filter. Although this approximation performs well in uniform regions, its output does not correspond to that of the Kuwahara filter near edges.

2. Modular Networks

It is interesting to see whether the modular ANNs still use their initialization. Remember that to obtain good performance, the ANNMs had to either be trained further after the modules were concatenated, or be reinitialized and trained over (Section V.B.4.b). The question is whether the
Figure 43. A comparison between the actual weights in ANNSs and the fitted models, for both ANNS1s and ANNS2s. The median fit error is shown in (a) and (b), as the average is rather uninformative due to outliers.
modules are still performing the functions they were initially trained on, or whether the ANN, after being trained further for a while, has found a better solution. To inspect the ANNs, the modules were first evaluated on the sets with which they were trained. Next, the concatenated ANNMs were taken apart and the modules were evaluated on the same sets. Figures 44 and 45 show some examples of such plots. Unfortunately, detailed inspection is hard. Ideally, if each module were performing exactly the function it was trained to perform, each plot would show a straight line y = x. The plots show that this is, in most cases, not true. However, it is possible to make some general remarks about the differences between the various ways of training the ANNs:

- The differences are most clear for the mean and selection modules. For well-performing ANNs, the mapping in each module is no longer evident; instead, it seems these modules make rather good use of their nonlinearity (Fig. 44c). The poorly performing ANNs still show a reasonably linear behavior (Fig. 45a).
- There is a progressive increase in nonlinearity for ANNM2, ANNM3, and ANNM4 (Figs. 44a-c, 45a-c, and 45d-f). The added complexity allows the modules more flexibility when they are trained further. Note, however, that the basic mapping is still preserved, i.e., the trend is still visible for all units.
- There is an increase in nonlinearity when ANNs are trained on the edge-favoring set instead of the normal set (Fig. 45a-c vs. d-f).
- As was to be expected, ANNMs trained from scratch generally do not find the modular structure (Fig. 44d-e).

This leads to the conclusion that although the initialization by training modules individually was useful, the modules of the better performing ANNs are no longer performing their original function. This is likely to be caused by the modules being trained individually on ideal, noiseless data; therefore, the modules have not learned to deal with errors made by other modules.
This is corrected when they are trained further together in the concatenated ANNs: the larger the correction, the better the final performance of the concatenated ANN.

For the MODVars and MODPoss, the differences are less clear. Most of these modules seem to have no function left in the final ANNs: the outputs are clamped at a certain value, or vary only a little in a small region around a value. For MODVar, only the ANNM4 modules have enough flexibility. Here too, training on the edge-favoring set increases the nonlinearity of the output (Fig. 46a-c). MODPos, finally, is clamped in almost all architectures; only ANNM4 modules give some variation in output (Fig. 46d-e). Networks trained from scratch are always clamped too.
Figure 44. Plots of outputs of the four MODAvgs before concatenation against outputs of the same modules after concatenation and training further or over. Different markers indicate different output units. The plots show progressively more freedom as the modules become less restricted (a–c) and an increase in nonlinearity when modules are trained on the edge-favoring data set (a–c vs. d–e).
In conclusion, it seems that in most ANNs the modules on the right-hand side (MODVar and MODPos, see Fig. 27) are disabled. However, the ANNs that do show some activity in these modules are the ANNs that perform best, indicating that the modular initialization is, to a certain extent, useful. All results indicate that although the nature of the algorithm can be used to construct and train individual modules, the errors these modules make are such that the concatenated ANNs perform poorly (see Section V.B.4.b). That is, modules trained separately on perfect data (e.g., precalculated positions of the minimal input) are ill-equipped to handle errors in their input, i.e., the output of preceding modules. For the concatenated ANNs, the training algorithm leaves its modular initialization to lower the overall MSE; trained as a whole, different weight configurations are optimal. The fact that a trained MODPos has a very specific weight configuration (with
Figure 45. Plots of MODSel outputs before concatenation against MODSel outputs after concatenation and training further or over. The plots show progressively more freedom as the modules become less restricted (a-c, d-f) and an increase in nonlinearity when modules are trained on the edge-favoring data set (a-c vs. d-f).
large weights) to be able to perform its function means it is more susceptible to weight changes than other modules, and will easily lose its original functionality. In other words, the final concatenated ANN has "worked around" the errors made by MODPos by disabling it.
D. Discussion

The previous section discussed a number of experiments in which modular and standard feedforward ANNs were trained to mimic the Kuwahara filter. The main result was that all ANNs, from very simple to complex, reached the same MSE. A number of hypotheses were proposed for this phenomenon: that the data set and error measure may not accurately represent the finer points of this particular problem, or that all ANNs have
Figure 46. Plots of MODVar (a–c) and MODPos (d, e) outputs before concatenation against the same outputs after concatenation and training further or over. Different markers indicate different output units. The plots show many module outputs in the concatenated ANNs are clamped at certain values. Note that in the latter two figures, the original output is either 0.0 or 0.5; a small offset has been added for the different units for presentation purposes.
reached local minima, simply because the problem is too hard. Testing these hypotheses in this section, it was shown that:

- using a different way of constructing training sets, i.e., by mainly sampling from regions around the edges, is of great benefit;
- using performance measures that do not average over all pixels, but take the two goals of edge-preserving smoothing into account, gives better insight into relative filter performance;
- by the proposed smoothing and sharpening performance measures, which correspond better to visual perception, modular ANNs performed better than standard ANNs;
- using the Lp norm to train ANNs, with p > 2, improves performance, albeit not dramatically;
- the smaller ANNSs have learned a linear approximation of the Kuwahara filter, i.e., they have reached a local minimum;
- in the poorly performing modular ANNs, the modules still perform the functions they were trained on. The better performing modular ANNs retain some of their initialization, but have adapted further, to a point where the function of the individual modules is no longer clear. The better the performance of the final ANN (according to the new measure), the less clearly the initialization is retained.

In the attempts to understand the operation of an ANN instead of treating it like a black box, the interpretability trade-off (discussed in Section IV) again played a role. For the modular ANNs, as soon as some of the constraints were dropped, ANN performance became much worse: there was no graceful degradation. It was also shown that it is hard to interpret the operation of the modular ANN after training it further; the operation of the ANN is distributed differently than in the original modular initialization. The one advantage of using the prior knowledge of the modular nature of the problem (for example, as in ANNM4) is that it helps to avoid painstaking optimization of the number of hidden layers and units, which was shown to be quite critical in standard ANNs. Of course, for different problems this prior knowledge may not be available.

The main conclusion is that, in principle, ANNs can be put to use as nonlinear image filters. However, careful use of prior knowledge, selection of ANN architecture, and sampling of the training set are prerequisites for good operation. In addition, the standard error measure used, the MSE, will not indicate an ANN performing poorly: unimportant deviations in the output image may lead to the same MSE as significant ones, if there is a large number of unimportant deviations and a smaller number of important ones.
Consequently, standard feedforward ANNs trained by minimizing the traditional MSE are unfit for designing adaptive nonlinear image filtering operations; other criteria should be developed to facilitate easy application of ANNs in this field. Unfortunately, such criteria will have to be specified for each application (see also Spreeuwers, 1992). In this light it is not surprising to find a large number of nonadaptive, application-specific ANNs in the literature.

Finally, although all performance measures used in this section suggest that ANNs perform poorly in edge-preserving smoothing, the perceptual quality of the resulting filtered images is quite good. Perhaps it is the very fact that these ANNs have only partially succeeded in capturing the nonlinearity of the Kuwahara filter that causes this. In some cases this could be considered an advantage: constrained nonlinear parametric approximations to highly nonlinear filtering algorithms may give better perceptual results than the real thing, which is, after all, only a means to an end.
VII. Conclusions

This article discussed the application of neural networks in image processing. Three main questions were formulated in the introduction:

- Applicability: can (nonlinear) image processing operations be learned by adaptive methods?
- Prior knowledge: how can prior knowledge be used in the construction and training of adaptive methods?
- Interpretability: what can be learned from adaptive methods trained to solve image processing problems?

Below, answers will be formulated to each of the questions.
A. Applicability

The overview in Section II discussed how many researchers have attempted to apply artificial neural networks (ANNs) to image processing problems. To a large extent, it is an overview of what can now perhaps be called the "neural network hype" in image processing: the approximately 15-year period following the publications of Kohonen, Hopfield, and Rumelhart et al. Their work inspired many researchers to apply ANNs to their own problems in any of the stages of the image processing chain. In some cases, the reason was biological plausibility; in most cases, however, the goal was simply to obtain well-performing classification, regression, or clustering methods.

In some of these applications the most interesting aspect of ANNs, the fact that they can be trained, was not (or only partly) used. This held especially for applications to the first few tasks in the image processing chain: preprocessing and feature extraction. Another advantage of ANNs often used to justify their use is the ease of hardware implementation; however, in most publications this did not seem to be the reason for application. These observations, and the fact that researchers often did not compare their results to established techniques, cast some doubt on the actual advantage of using ANNs. In the remainder of the article, ANNs were therefore trained on two tasks in the image processing chain: object recognition (supervised classification) and preprocessing (supervised regression), and, where possible, compared to traditional approaches.

The experiment on supervised classification, in handwritten digit recognition, showed that ANNs are quite capable of solving difficult object recognition problems. They performed (nearly) as well as some traditional
pattern recognition methods, such as the nearest neighbor rule and support vector classifiers, but at a fraction of the computational cost.

As supervised regressors, a number of ANN architectures were trained to mimic the Kuwahara filter, a nonlinear edge-preserving smoothing filter used in preprocessing. The experiments showed that careful construction of the training set is very important. If filter behavior is critical only in parts of the image represented by a small subset of the training set, this behavior will not be learned. Constructing training sets using the knowledge that the Kuwahara filter is at its most nonlinear around edges improved performance considerably. This problem is also due to the use of the mean squared error (MSE) as a training criterion, which will allow poor performance if it occurs only for a small number of samples. Another problem connected with the use of the MSE is that it is insufficiently discriminative for model choice; in first attempts, almost all ANN architectures showed identical MSEs on test images. Criteria that were proposed to measure smoothing and sharpening performance showed larger differences. Unfortunately, these results indicate that the training set and performance measure will have to be tailored for each specific application, with which ANNs lose much of their attractiveness as all-round methods. The findings also explain why, in the literature, many ANNs applied to preprocessing were nonadaptive.

In conclusion, ANNs seem to be most applicable to problems requiring a nonlinear solution for which there is a clear, unequivocal performance criterion. This means ANNs are more suitable for high-level tasks in the image processing chain, such as object recognition, than for low-level tasks. For both classification and regression, the choice of architecture, the performance criterion, and data set construction play a large role and will have to be optimized for each application.
B. Prior Knowledge

In many publications on the use of ANNs in image processing, prior knowledge was used to constrain ANNs. This is to be expected: unconstrained ANNs contain large numbers of parameters and run a high risk of being overtrained. Prior knowledge can be used to lower the number of parameters in a way that does not restrict the ANN to such an extent that it can no longer perform the desired function. One way to do this is to construct modular architectures, in which use is made of the knowledge that an operation is best performed as a number of individual suboperations. Another way is to use the knowledge that neighboring pixels are related and should be treated in the same way, e.g., by using receptive fields in shared weight ANNs.
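The receptive-field idea can be illustrated with a sketch. The layer sizes below are hypothetical; only the parameter counting reflects the argument in the text:

```python
import numpy as np

def feature_map(img, kernel, bias=0.0):
    """One shared-weight feature map: the same small kernel (a receptive
    field) is slid over the whole image, so every output pixel is
    computed with identical weights ('valid' correlation, no flip)."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for y in range(oh):
        for x in range(ow):
            out[y, x] = np.sum(img[y:y + kh, x:x + kw] * kernel) + bias
    return out

# Parameter counting for a 32x32 input and a 5x5 receptive field
# (illustrative sizes): sharing weights cuts the free parameters from
# one per connection to a single kernel plus one bias.
n_out = (32 - 5 + 1) ** 2
params_unshared = n_out * (5 * 5 + 1)   # a weight set and bias per output unit
params_shared = 5 * 5 + 1               # one kernel + one bias, reused everywhere
```

The number of connections (and hence the computation) is the same in both cases; only the number of free parameters differs, which is the comparison made in the next section.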
DE RIDDER ET AL.
The latter idea was tested in supervised classification, i.e., object recognition. The shared weight ANNs used contain several layers of feature maps (detecting features in a shift-invariant way) and subsampling maps (combining information gathered in previous layers). The question is to what extent this prior knowledge was truly useful. Visual inspection of trained ANNs revealed little. Standard feedforward ANNs comparable in the number of connections (and therefore the amount of computation involved), but with a much larger number of weights, performed as well as the shared weight ANNs. This proves that the prior knowledge was indeed useful in lowering the number of parameters without affecting performance. However, it also indicates that training a standard ANN with more weights than strictly required does not necessarily lead to overtraining.

For supervised regression, a number of modular ANNs were constructed. Each module was trained on a specific subtask of the nonlinear filtering problem to which the ANN was applied. Furthermore, different versions of each module were created, ranging from architectures specifically designed to solve the problem (using hand-set weights and tailored transfer functions) to standard feedforward ANNs. According to the proposed smoothing and sharpening performance measures, the fully hand-constructed ANN performed best. However, when the hand-constructed ANNs were (gradually) replaced by more standard ANNs, performance quickly decreased and became level with that of some of the standard feedforward ANNs. Furthermore, in the modular ANNs that performed well, the modular initialization was no longer visible (see also the next section). The only remaining advantage of a modular approach is that careful optimization of the number of hidden layers and units, as for the standard ANNs, is not necessary. These observations lead to the conclusion that prior knowledge can be used to restrict adaptive methods in a useful way.
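The modular decomposition of the filtering problem into suboperations can be sketched as a chain of simple functions operating on one pixel's four quadrant windows. Here each module is written as an exact function, whereas in the article each was (also) realized as a trainable ANN:

```python
import numpy as np

# The suboperations of a modular architecture for the Kuwahara-style
# filtering problem: compute the mean and variance of each quadrant,
# locate the minimum-variance quadrant, and select the corresponding
# mean as the output value.

def average_module(quads):
    return np.array([q.mean() for q in quads])

def variance_module(quads):
    return np.array([q.var() for q in quads])

def position_of_minimum_module(variances):
    return int(np.argmin(variances))

def selection_module(averages, position):
    return averages[position]

def modular_filter_pixel(quads):
    """Chain the modules for a single pixel neighborhood."""
    pos = position_of_minimum_module(variance_module(quads))
    return selection_module(average_module(quads), pos)
```

When every module is replaced by a trained sub-ANN, nothing forces the learned parts to keep these roles, which is the point made above about the modular initialization disappearing.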
However, various experiments showed that feedforward ANNs are not natural vehicles for doing so, as this prior knowledge has to be translated into a choice of ANN size, connectivity, transfer functions, etc.: parameters that do not have any physical meaning related to the problem. Therefore, such a translation does not necessarily result in an optimal ANN. It is easier to construct a (rough) model of the data and allow model variation by allowing freedom in a number of well-defined parameters. Prior knowledge should be used in constructing models rather than in molding general approaches.

C. Interpretability

Throughout this article, strong emphasis was placed on the question of whether ANN operation could be inspected after training. Rather than just
applying ANNs, the goal was to learn from the way in which they solved a problem. In few publications does this play a large role, although it would seem to be an important issue when ANNs are applied in mission-critical systems, e.g., in medicine, process control, or defense systems.

Supervised classification ANNs were inspected with respect to their feature extraction capabilities. As feature extractors, shared weight ANNs were shown to perform well, since standard pattern recognition algorithms trained on the extracted features performed better than on the original images. Unfortunately, visual inspection of trained shared weight ANNs revealed nothing. The danger here is overinterpretation, i.e., reading image processing operations into the ANN that are not really there. To be able to find out what features are extracted, two smaller problems were investigated: edge recognition and two-class handwritten digit recognition. A range of ANNs was built, which showed that ANNs need not comply with our ideas of how such applications should be solved. The ANNs took many "short cuts," using biases and hidden-to-output layer weights. Only after severely restricting the ANN did its operation make sense in terms of image processing primitives. Furthermore, in experiments on an ANN with two feature maps, the ANN was shown to distribute its functionality over these maps in an unclear way. An interpretation tool, the decorrelating conjugate gradient descent (DCGD) algorithm, can help in distributing functionality more clearly over different ANN parts. The findings lead to the formulation of the interpretability trade-off: between realistic yet hard-to-interpret experiments on the one hand, and easily interpreted yet nonrepresentative experiments on the other. This interpretability trade-off returned in the supervised regression problem.
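The core idea behind DCGD, adding a term that penalizes correlation between the outputs of different ANN parts so that functionality is distributed more clearly, can be sketched as follows. This is a simplified illustration of the decorrelation term only, not the published algorithm:

```python
import numpy as np

def decorrelation_penalty(a, b, eps=1e-12):
    """Squared sample correlation between the outputs of two ANN parts
    (e.g., two feature maps evaluated over a batch of inputs). Adding
    this term to the training error pushes the two parts toward
    distinct roles; gradient descent on error-plus-penalty is the
    DCGD idea in outline."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a @ a) * (b @ b)) + eps
    return float(((a @ b) / denom) ** 2)
```

The penalty is 1 for perfectly (anti)correlated outputs and 0 for uncorrelated ones, so minimizing it breaks the symmetry between otherwise interchangeable parts.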
Modular ANNs constructed using prior knowledge of the filtering algorithm performed well, but could no longer be interpreted in terms of the individual suboperations. In fact, retention of the modular initialization was negatively correlated with performance. ANN error evaluation was shown to be a useful tool in understanding where the ANN fails; it showed that filter operation was poorest around edges. The DCGD algorithm was then used to find out why: most of the standard feedforward ANNs had found a suboptimal linear approximation to the Kuwahara filter.

The conclusion of the experiments on supervised classification and regression is that as long as a distributed system such as an ANN is trained on a single goal, i.e., minimization of prediction error, the operation of its subsystems cannot be expected to make sense in terms of traditional image processing operations. This held for both the receptive fields in the shared weight ANNs and the modular setup of the regression ANNs: although they are there, they are not necessarily used as such. This also supports the conclusion of the previous section, that the use of prior knowledge in ANNs is not straightforward.
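The error-evaluation step, checking whether a trained filter's failures concentrate around edges, amounts to splitting the squared-error map by an edge mask. A minimal sketch (the gradient threshold and function names are our assumptions):

```python
import numpy as np

def simple_edge_mask(img, thr=0.1):
    """Mark pixels with a large gradient magnitude as 'edge' pixels."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    return np.hypot(gx, gy) > thr

def error_by_region(target, output, edge_mask):
    """Mean squared error computed separately on edge and non-edge
    pixels, to reveal whether an ANN's approximation of the target
    filter is poorest around edges."""
    err = (np.asarray(target, dtype=float) - np.asarray(output, dtype=float)) ** 2
    return err[edge_mask].mean(), err[~edge_mask].mean()
```

For a target that is well approximated in flat regions but smeared at edges, the first value dominates the second, which is exactly the failure mode reported above.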
This article showed that interpretation of supervised ANNs is hazardous. As large distributed systems, they can solve problems in a number of ways, not all of which necessarily correspond to human approaches to these problems. Simply opening the black box at some location where one expects the ANN to exhibit certain behavior does not give insight into its overall operation. Furthermore, knowledge obtained from the ANNs cannot be transferred to other systems, as it makes sense only in the precise setting of the ANN itself.
D. Conclusions

We believe that in the past few years there has been an attitude change toward ANNs, in which ANNs are no longer automatically seen as the best solution to any problem. The field of ANNs has to a large extent been reincorporated into the various disciplines that inspired it: machine learning, psychology, and neurophysiology. In machine learning, researchers are now turning toward other, nonneural adaptive methods, such as the support vector classifier. For them the ANN has become a tool, rather than the tool it was originally thought to be.

So when are ANNs useful in image processing? First, they are interesting tools when there is a real need for a fast parallel solution. Second, biological plausibility may be a factor for some researchers. But most importantly, ANNs trained on examples can be valuable when a problem is too complex to construct an overall model based on knowledge alone. Often, real applications consist of several individual modules performing tasks in various steps of the image processing chain. A neural approach can combine these modules, control each of them, and provide feedback from the highest level to change operations at the lowest level. The price one pays for this power is the black-box character, which makes interpretation difficult, and the problematic use of prior knowledge. If prior knowledge is available, it is better to use it to construct a model-based method and learn its parameters; performance can be as good, and interpretation comes naturally.
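The closing recommendation, constructing a model-based method and learning its parameters, can be contrasted with the black-box approach in a toy sketch. The smoother and the direct search over its one parameter are our illustration, not from the article:

```python
import numpy as np

def moving_average(x, width):
    """Model-based alternative to a generic ANN: a smoother whose single
    free parameter, the window width, has a direct physical meaning."""
    kernel = np.ones(width) / width
    return np.convolve(np.asarray(x, dtype=float), kernel, mode="same")

def fit_width(x, target, widths=(1, 3, 5, 7)):
    """'Training' the model: pick the interpretable parameter value that
    minimizes the MSE against example output, by direct search."""
    return min(widths,
               key=lambda w: float(np.mean((moving_average(x, w) - target) ** 2)))
```

After fitting, the learned quantity is a window width one can read off and reuse, whereas the weights of an equivalent trained ANN would carry no such meaning.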
Index
2D wavelet transform modulus maxima (WTMM) method, 17–23 definition, 17–18 methodology, 18–19 numerical implementation, 21–23 remark, 19–21 to perform image processing tasks, 38–41 2D continuous wavelet transform, 7–22 computation of, 21 3D image processing, 329–348 coherency considerations, 333–336 concluding remarks, 347–348 detection schemes, 336–337 special cases, 335–336 3D imaging properties, 337–340 3D image reconstruction, 341–346
A Acquisition scheme, 198 Additive processes, 36 Algorithm, discrete search, 326–327 Ambiguity bounds, 347 Analysis of the geometric distortions, 93–192 concluding remarks, 187–190 introduction, 94–96 problem with closing, 174–178 Anisotropic dilations, 10–17 Anisotropic scale invariance, 3 Annealed averaging, 23 Argument (A), 34 Artificial neural networks (ANN), 353–355 applications, 356 architecture, 366 classification, 357–358 feedforward, 356–357 image processing, 355 problems, 365–366 regression, 358–359 types, 360 Atmospheric dynamics, 42 Autoassociative artificial neural networks, 362 Avalanche photodiodes, 230 Average module, 405
B Baseline phase space, 293–294 Belt-driven system, 249 Bias phase space, 295 Bias-variance dilemma, 359 Binary images, 105 continuous, 110 discrete, 113–115 Black box problem, 359 Boundary length measurement problem, 183–187 detailed analysis, 184–186 discussion, 186–187 Brownian surfaces, 23
C Canonical description, 5 Charge-coupled device (CCD) cameras, 230 Classification, 355 Cloud structure, 41–59 liquid water content (LWC), 42 liquid water path (LWP), 43 Colocalization errors, 198
Competitive training, 340–346 Complex hologram, 298–299 Computer-aided diagnosis (CAD) methods, 73 Continuous images, 105 binary, 105 shifts produced by median filters in, 105–122 Continuous wave (CW) laser excitation, 209 Continuous wavelet transform, 4 2D, 7–22 computation of, 21 Convolution, 96–97 Corner detector, 178 Cosine, 340–341 Cosine-FZP hologram, 342–343 Cusp-like singularities, 5
D D(h) singularity spectrum, 3 Data-oriented approaches, problems, 364–365 Decorrelating conjugate gradient descent, 392–397 Decorrelating training algorithm, 396 Decorrelation, 393 Dense tissues in mammography, 74–77 Diffusion-limited aggregates (DLA), 5 Digital Database for Screening Mammography (DDSM) project, 74 Digital images, 122 Digitized mammograms, 73–80 Distance (l), 3 Dome-slicing, 130
E Edge detection, 378 Edge-favoring sampling, 418–419 Edge-preserving smoothing, 423–432
discussion, 430 experiments, 426 Edge recognition, 378–379 discussion, 386–387 training, 381 Edge shifts, 105, 121 arising with hybrid median filters, 121–122 general calculations, 128–129 theory of, 105–110 Energy dissipation field, 56 Enstrophy field, 59–61 Exponent, Hölder, 10 Extensions continuous gray-scale images, 110–112 discrete neighborhoods, 113
F Fast Fourier Transform (FFT), 21 Fatty tissues in mammography, 74–77 Feature detectors, 388 Feature extraction, 377–398 Feature maps, 368 Feedforward artificial neural networks, 356–357 applications, 360–361 classification, 357–358 preprocessing, 361 regression, 358–359 types, 360 Field dissipation energy, 56 turbulent, 61–68 enstrophy, 59 turbulent 3D, 68 radiance, 51–52 receptive, 367 temperature, 51–52 velocity, 51–56 Filters Gaussian, 95 image, 96–105
mean, 95 median, 104–105 mode, 100–101 modular, 404 morphological, 102–104 noise suppression, 98–100 nonlinear, 404 rank-order, 157 truncated median, 101 First ISCCP Research Experiment (FIRE), 44 Floating point operations (FLOP), 373 Fluorescence in situ hybridization (FISH), 266 Fluorescent molecules, 202 behavior under TPE regime, 212–218 two-photon excitation, 202–212 under TPE regime, 212–218 Fluorescent specimens, 293 Fokker-Planck approach, 55 Fourier domain, 21 Fractals homogeneous functions, 19 self-affine, 3 Fractal dimension (DF), 3 Fractional Brownian motion (fBm), 23 Fractional Brownian surfaces, 23–31 Full width at half maximum (FWHM) resolutions, 251 Function Gaussian, 7–8 linear, 19 point spread, 251 probability density, 53 space-space correlation, 50–51, 67–68, 71 WTMM probability density, 48–50, 66–67, 70–71 xor-like, 406
G Gaussian filters, 95 Generalized fractal dimension (Dq), 3
Geometric distortions, 93 Global Circulation Model (GCM), 44 Grand canonical description, 5 Graphs, interferometric, 311 Gray-scale images, 110, 116 continuous, 110–112 discrete, 116–121
H Handwritten digit recognition, 371 data set, 371 experiments, 372 feature extraction, 374 two-class, 388 Heterodyne scanning, 330–337 two-pupil optical, 330–337 Heterodyning theory, 330–333 High curvature, 163–165 High-resolution satellite images, 41–59 Hölder exponent, 10 Hologram, complex, 340–346 Holography, 329–337, 340–346 Homogeneous (monofractal) fractal functions, 19 Hybrid median filters, 121
I Images continuous, 105–122 digital, 122 gray-scale, 110, 116 restoration, 399–418 Image enhancement, 362 Image filters, 96–105 in-depth study of median filters, 104–105 mode filters, 100–101 morphological filters, 102–104 noise suppression filters, 98–100 Image processing, 7–22, 329–337, 352–353 filters, 93 Image understanding, 363–364
454 Input layer, 368 Input-output mapping, 355 Integer lattices, 293 baseline phase space, 293–294 pupil phase space, 293 unknown-spectral, 294 Integral scale (L), 53 Interfaces isotropic, 2 self-similar, 2 Interferometric graphs, 289–290 International Satellite Cloud Climatology Project (ISCCP), 44 Interpretability trade-off, 387 Isotropic dilations, 10 Isotropic interfaces, 2 Isotropic Mexican hat, 7
K Kolmogorov dissipative scale (), 53 Kuwahara filtering, 399–404
L Laboratory for Advanced Bioimaging, Microscopy, and Spectroscopy (LAMBS), 244 Landsat images, 44–51 marine stratocumulus cloud scenes, 43–44 radiance fields and velocity and temperature fields, 51–52 Laser excitation, 209 sources, 233–241 Lattices, closest point search, 345 Layer hidden, 370 input, 368 output, 370 Lens objectives, 242–244 TPE microscope, 244–253
Linear function, 19 Linear slant edge, 147–149 Linear variable differential transformer (LVDT), 249 Liquid water content (LWC), 42 Liquid water path (LWP), 43 Loop-entry phase space, 296
M Magnitude, 36 Mammograms, digitized, 73 application of 2D WTMM method, 74–77 detecting microcalcifications, 77–79 WT skeleton segmentation, 77–79 Mammographic tissues, 79 dense, 74 fatty, 74 Mammography application of the 2D WTMM method to tissue classification, 74–77 multifractal analysis, 73–80 Maps feature, 368 subsampling, 369 Maxima chains, 11 Maxima lines, 11 Mean energy dissipation (), 56 Mean filters, 146 Median filters continuous images, 105–122 theory of edge shifts, 105–110 Median shifts, 122–124 Median-based corner detector, 178–182 Median filters, 104–105 with small circles, 143–146 hybrid, 121 shifts produced by, 105–110, 122 Mexican hat, isotropic, 7 Microcalcifications in mammography, 77–80 Microscope, 195
Millimeter radars, 42 Mode filters, 100–101, 153 Modular filter, 404 Modular networks, 404, 411, 437 Modules, 410–411 average, 405 position-of-minimum, 405 selection, 406 variance, 405 Monofractal rough surfaces, 23 Morphological filters, 102–104 Multifractal analysis 2D WTMM method, 68 discussion, 71–72 mammographic tissue classification, 74–77 multifractal spectra, 68–70 numerical computation of the multifractal spectra, 63–66 remark, 62–63 space-space correlation function analysis, 50–51, 67–68, 71 WTMMM probability density functions, 43–44, 66–67, 70–71 3D turbulence simulation data, 53–72 description of intermittency, 53–56 high-resolution satellite images of cloud structure, 41–59 Multifractal properties, 19 Multifractal rough surfaces, 23, 31–35 Multifractal scaling, 3 Multifractal spectrum computation of, 22 numerical computation of, 63–66, 68–70 Multiplicative processes, 36 Multiscale edge detection, 7
N National Institute for the Physics of Matter (INFM), 244 Navier-Stokes dynamics, 55
Neighborhoods 3 x 3, 124, 129 5 x 5, 131, 136 7 x 7, 136 circular, 161 discrete, 113 p x p, 122 rectangular, 157 trends for large, 137–141 Network architecture, 379 Neural networks, artificial, 351–450 Nonlinear curve, 19 Nonlinear filters, 97, 404 Nonlinear image processing, 351–450 Nonmaximum suppression, 97 Nonparametric, 359 Noise power, 98 Noise suppression filters, 98–100 Normal sampling, 419 Numerical aperture (NA), 211
O Object recognition, 363, 366 Optical consequences and resolution aspects, 219–224 Optical scanning holography (OSH), 329–337 Optical heterodyne scanning, 330–337 Optimal model phase shift, 313 bias phase, 314 pupil phase, 314 Optimization, 364 Oscillating singularities, 5 Overtraining, 359
P Phase calibration, 288, 291 discrepancy and related results, 309–312 problem, 307–309 Phase closure, 290–291 operator, 296 projection, 296
Phase closure imaging, 287–327 appendices, 320–327 concluding comments, 319–320 contents, 287–288 simulated example, 317–319 special cases, 315–317 Phase closure operator, 296–299 Phase closure projection, 296–299 spectral, 299–304 Phase space baseline, 293–294 bias, 295 integer lattices, 293 loop-entry, 296 pupil, 293 unknown-spectral, 294 Phase transitions, 20 Photochemical reactions, 198 Photodetectors, 230 Photointeractions, 198 Photomultiplier tubes, 230 Pixel, 141–143 Point spread function (PSF) measurement, 251 Position-of-minimum module, 405 Power, 98 Probability density function (pdf), 53 Problem with closing, 174–178 detailed analysis, 175–177 discussion, 175–178 Pupil phase space, 293
Q Quenched averaging, 22
R Random cascades, 31–35 Rank-order filters, 157, 170–174 analysis of the situation, 170–172 discussion, 173–174 Receptive fields, 367 Redundant case strongly, 302–304 weakly, 300–302
INDEX
Reference algebraic framework, 305–307
Reference projections, 321
Regression method, 355
Regression networks, 399
  architecture and experiments, 404–415
  experiments, 419
  inspection and improvement, 418–442
Relative filter performance, 426
Reynolds number, 6
Rough surfaces, 9
  scale invariance properties, 36–38
  incoherently reflecting, 335–336
  local regularity properties of, 9–17
  test applications, 23–41
Roughness exponent (H), 3
S
Satellite images, high-resolution, 41
Scale invariance properties, 36–38
Scaling exponents, 4
Second-harmonic generation (SHG), 271
Selection module, 406
Self-affine fractals, 3
Self-organizing map (SOM), 360
Self-similar interfaces, 2
Shared weight networks, 367–377
  architecture, 368
  discussion, 375–377
  feature extraction, 377–398
  handwritten digit recognition, 371
Sharpening, 424
Shifts
  mean filters, 146–150
    discussion, 149–150
  median filters in continuous images, 105–122
  median filters in digital images, 122–146
    discussion, 137
  mode filters, 150–156
    discussion, 151–153
  rank-order filters, 156–170
    discussion, 169–170
Signal-to-noise ratio (SNR), 98
Sine, 340–341
Sine-Fresnel zone plate (FZP) hologram, 342–343
Single-objective piezo nanopositioner, 249
Singularities
  cusp-like, 5
  oscillating, 5
Smith normal form, 300, 321
Smoothing, 423
Soft-threshold, 379
Space-space correlation function, 50–51, 67–68, 71
Spectral phase closure projection, 299
  closure matrix, 300
  examples, 300–304
  weakly redundant case, 300–302
  Smith normal form, 300
  strongly redundant case, 302–304
Standard networks, 433
Step edges, 146–147
Stratocumulus cloud (Sc) scenes, 43–44
  Landsat data, 43
  application of 2D WTMM method, 44
Strength (h), 4
Subsampling maps, 369
Switches, 405
T
Temperature fields, 51
Template, 379
Threshold, 97, 379
Time delay neural networks (TDNN), 366
Trained networks, 433
Transport, 405
Truncated median filter, 101
Turbulence, fully developed, 51
Turbulent 3D enstrophy field, 68
Turbulent dissipation field, 61–68
Two-class handwritten digit classification, 388–398
  training, 388–392
Two-photon excitation (TPE) microscope, 244
Two-photon excitation microscopy, 195–273
  application gallery, 253–273
  architecture, 225
  basic principles, 202–212
  conclusions, 273
  general considerations, 225–233
  historical notes, 198–202
  optical consequences, 219–225
  resolution aspects, 219–225
Two-photon excitation (TPE) regime, 212–219
  behavior of fluorescent molecules, 212
Two-pupil optical heterodyne scanning, 330–337
  coherency considerations, 333–335
  detection schemes, 336–337
  special cases, 335–336
U
Unknown-spectral phase space, 294
Useful property, 320
V
Variance-covariance matrix, 299
Variance module, 405
Velocity field, 53–56
W
Wavelets, 4
  analyzing, for multiscale edge detection, 7–9
Wavelet-based method for multifractal image analysis, 1–92
  conclusion, 80–81
  introduction, 2–7
Wavelet orthogonal basis, 31–35
Wavelet transform (WT), 4, 5
  2D, 7
  continuous, 4
  image processing, 5
Wavelet transform skeleton
  computation, 21–22
  segmentation, 77–80
Wavelet transform modulus maxima (WTMM), 4
  definition, 17
  methodology, 18
  numerical implementation, 21
  remark, 19
Wavelet transform modulus maxima maxima (WTMMM), 11
  probability density functions, 48–50, 70–71
X
Xor-like function, 406