EDITORIAL ADVISORY BOARD
G. S. AGARWAL, Ahmedabad, India
T. ASAKURA, Sapporo, Japan
M. V. BERRY, Bristol, England
C. COHEN-TANNOUDJI, Paris, France
V. L. GINZBURG, Moscow, Russia
F. GORI, Rome, Italy
A. KUJAWSKI, Warsaw, Poland
J. PEŘINA, Olomouc, Czech Republic
R. M. SILLITTO, Edinburgh, Scotland
H. WALTHER, Garching, Germany
PROGRESS IN OPTICS VOLUME XXXVIII
EDITED BY
E. WOLF
University of Rochester, N.Y., U.S.A.
Contributors: S. DUTTA GUPTA, P. HELLO, J.L. HORNER, J. JAHNS, B. JAVIDI, A.W. LOHMANN, D. MENDLOVIC, W. NAKWASKI, M. OSIŃSKI, Z. ZALEVSKY
1998
ELSEVIER
AMSTERDAM · LAUSANNE · NEW YORK · OXFORD · SHANNON · SINGAPORE · TOKYO
ELSEVIER SCIENCE B.V.
SARA BURGERHARTSTRAAT 25, P.O. BOX 211, 1000 AE AMSTERDAM, THE NETHERLANDS
Library of Congress Catalog Card Number: 61-19297 ISBN Volume XXXVIII: 0 444 82907 5
© 1998 Elsevier Science B.V. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher, Elsevier Science B.V., Rights & Permissions Department, P.O. Box 521, 1000 AM Amsterdam, The Netherlands.
Special regulations for readers in the USA: This publication has been registered with the Copyright Clearance Center Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the USA. All other copyright questions, including photocopying outside of the USA, should be referred to the Publisher, unless otherwise specified.
No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein.
The paper used in this publication meets the requirements of ANSI/NISO Z39.48-1992 (Permanence of Paper).
PRINTED IN THE NETHERLANDS
PREFACE
It is a pleasure to report two happy events. Professor Claude Cohen-Tannoudji, a long-time member of the Editorial Advisory Board of Progress in Optics, has been named co-recipient of the 1997 Nobel Prize in Physics, and Professor Michael Berry, whose important contributions to optics and other fields of physics are also well known, has agreed, not long ago, to become a member of the Board.
The present volume contains six review articles on a wide range of topics of current research in optics.
The first article, by S. Dutta Gupta, deals with various nonlinear optical phenomena in stratified media. It shows that resonances which arise from stratification are of considerable importance for achieving low-threshold nonlinear optical devices. The article also includes a thorough study of optical bistability and harmonic generation in Kerr nonlinear layered media, and various phase matching techniques are discussed. Recent trends involving novel geometries and new materials are outlined. More recent developments concerning gap solitons in periodic structures, weak photon localization in quasi-periodic structures and enhancement of nonlinear susceptibilities in layered composites are also discussed.
The second article, by P. Hello, reviews the optical aspects of interferometric gravitational-wave detectors. Different optical configurations are reviewed and their sensitivities are estimated for typical values of the optical parameters.
The next article, by W. Nakwaski and M. Osiński, presents a review of temperature-related effects and thermal modeling of vertical-cavity surface-emitting lasers (VCSELs). The effects of temperature on the characteristics of such devices are discussed, including the temperature dependence of the longitudinal mode spectra, the transverse-mode structure and the output power. The principles of thermal VCSEL modeling are then outlined. Both analytic and numerical approaches are treated. Finally, the most important results obtained by the use of such models are presented.
The fourth article, by A.W. Lohmann, D. Mendlovic and Z. Zalevsky, entitled Fractional Transformations in Optics, describes some recent theoretical developments in mathematical techniques that are used in physical optics and in optical information processing. Many of the usual transforms used in these fields contain various parameters which are integers. The generalizations reviewed in this article have parameters which take on fractional or even complex values. The article surveys these developments and discusses the use of such generalized transforms in some areas of optics.
The article that follows, by B. Javidi and J.L. Horner, discusses a number of Fourier-plane nonlinear filtering techniques for use in image recognition. Such nonlinear filters can be optically implemented by a processor known as a joint transform correlator. Their performance is discussed and the use of nonlinear techniques in the design of distortion-invariant composite filters for image recognition is considered. The use of joint transform correlators for security verification of credit cards, passports and other documents is also discussed.
The concluding article, by J. Jahns, presents an overview of the field of optical digital computing and interconnection. Following an outline of the historical development of the subject, the motivation for the use of free-space optics in computing applications is discussed. Computational aspects of nonlinear optical devices and optical interconnections and their implementations are then reviewed. The article concludes with an overview of architectures and systems for free-space optical computing and switching.
Emil Wolf
Department of Physics and Astronomy
University of Rochester
Rochester, New York 14627, USA
January 1998
E. WOLF, PROGRESS IN OPTICS XXXVIII
© 1998 ELSEVIER SCIENCE B.V. ALL RIGHTS RESERVED

I
NONLINEAR OPTICS OF STRATIFIED MEDIA
BY
S. DUTTA GUPTA
School of Physics, University of Hyderabad, Hyderabad 500046, India
CONTENTS
§ 1. INTRODUCTION
§ 2. NONLINEAR TRANSMISSION AND OPTICAL BISTABILITY IN LAYERED MEDIA
§ 3. HARMONIC GENERATION AND OTHER NONLINEAR PHENOMENA IN LAYERED GEOMETRY
§ 4. NONLINEAR OPTICAL PROPERTIES OF LAYERED COMPOSITES
§ 5. CONCLUSIONS
ACKNOWLEDGEMENT
REFERENCES

§ 1. Introduction
The past two decades have witnessed an intense development of nonlinear optics of stratified media. Surveying the past, present and future of nonlinear optics, Bloembergen [1992] commented that nonlinear optics has entered the technology phase. This has been possible because of tremendous progress both in the understanding of the underlying phenomena and in technology. Now diode lasers with sub-watt power levels can be used to observe most nonlinear optical effects. As pointed out by Stegeman [1992], two factors, namely the development of new nonlinear optical materials with better characteristics and a manipulation of the sample geometry, have played the most significant role. In the context of the latter, optical fibers (Agrawal [1989]) and stratified layered media (Stegeman [1992]) have been of great importance. Our review explores the various possibilities of nonlinear optics in layered configuration, even to the extent of fabricating a nonlinear medium with better characteristics.
Linear properties of layered media are well documented (see, e.g., Yeh [1988]). It is now well understood that the optical properties of a layered medium can be distinct from those of its bulk constituents. The simplest possible example is the “structural” dispersion in stratified geometry. Even with a single dielectric slab (Fabry-Perot (FP) cavity) lacking in material dispersion, the transmission is frequency selective because of multiple reflections from the interfaces. The effect is more drastic for a multilayered geometry, for example, for a periodic layered medium, where the interference of the forward and backward propagating waves in each slab can lead to frequency stopgaps. Guided waves (see, e.g., Hunsperger [1984]) with geometry-dependent dispersion are another technologically significant example. The most important feature of layered structures is their ability to support resonances, which are always associated with local field enhancements.
It is in this context that low threshold nonlinear phenomena can be realized in such structures. Guided wave structures play an important role from the viewpoint of device applications. The confinement of the electromagnetic field in waveguides can lead to large power densities over a longer length compared with what can be realized in bulk samples. In addition, they offer the possibility of integration of various functions on the same optical “chip”. In fact, integrated optics and its nonlinear extension have developed to a stage that the realization of all-optical chips is only a matter of time.
In the context of nonlinear stratified media the theoretical and experimental achievements have been fascinating. Plausible theories now explain most nonlinear effects, including self-action optical bistability, frequency up- and down-conversion, four-wave mixing and phase conjugation, self-focusing, and spatial solitons. These effects have been observed experimentally. Predictions have been made of new physical phenomena that may be observed in the future. One example is the prediction of new nonlinearity-induced modes (without any linear counterpart) in Kerr nonlinear waveguides, which need large power levels for excitation. With the availability of high-power lasers the problem lies not in coupling sufficient power to the guided mode, but rather in the low damage threshold of the nonlinear materials suitable for waveguide fabrication. Thus the search for new waveguide materials with large nonlinearity and high damage threshold continues.
Since the early days of nonlinear optics in stratified geometry, Kerr nonlinearity leading to intensity-dependent refractive index has drawn considerable attention. The first reason is that it offers, perhaps, the simplest possible model amenable to satisfactory theoretical analysis. The second is the easy availability of Kerr nonlinear materials such as CS2, nitrobenzene, and liquid crystals. As a result, the literature on Kerr nonlinear effects in stratified media, both for normal and oblique incidence (mostly in waveguides), is extensive. We review both the theoretical and experimental achievements, concentrating on the exact theoretical models (Chen and Mills [1987a-c], Leung [1985, 1989]) and surveying some recent experiments involving liquid crystal, organic, and semiconductor films.
Results pertaining to periodic and quasiperiodic media are discussed separately because of the specific properties of such structures. In fact, the prediction of gap solitons (Chen and Mills [1987b]) in nonlinear periodic structures and of weak photon localization in linear quasiperiodic structures (Kohmoto, Sutherland and Iguchi [1987]) was one of the major achievements. We also discuss other approximate and numerical methods. We highlight the switching and bistability experiments, which hold considerable potential for optical switches and other signal processing and communication applications (for a survey of device potentials of Kerr nonlinear layered media, see, e.g., Assanto [1992]).
Harmonic generation, and in particular second harmonic generation, has remained one of the most pursued branches of nonlinear optics since its inception. In layered configuration it is attractive, since one can have enhancement of the generated harmonic using the resonances of the stratified medium. The development of a general theory for a multilayered medium, albeit without pump depletion, was rather recent (Bethune [1989, 1991], Hashizume, Ohashi, Kondo and Ito [1995]) and we review this theory in great detail. In the context of harmonic generation in waveguides, coupled mode theory has been applied extensively, and the results are well documented (Stegeman and Seaton [1985], Stegeman [1992]). The advantage of guided wave structures in the context of harmonic generation is obvious. In addition to a large power density over a large interaction length, the major advantage stems from the flexibility in options for phase matching. A real technological breakthrough was the realization of quasiphase matching by means of ferroelectric domain reversal in LiNbO3, LiTaO3, and KTiOPO4 (KTP) waveguides (for a detailed treatment of poling techniques in these materials, see, e.g., Fejer [1992], Bierlein [1992]). Along with other mechanisms of phase matching, we describe the major achievements in quasiphase-matched waveguides. We also discuss the case of surface-emitted second harmonic with counterpropagating fundamental waves. Note that some of these second harmonic devices using infrared (IR) diode lasers can lead to efficient blue light sources that are much needed for the xerography and laser printing industry. In addition to harmonic generation, we review some recent trends using cascaded second-order nonlinearity leading to efficient “third” order processes. We also outline some four-wave mixing experiments along with new theoretical proposals.
In the context of better nonlinear optical materials, a new proposal to enhance the effective nonlinear susceptibility by exploiting the local field corrections in layered composites was advanced (Boyd and Sipe [1994]) and tested (Fischer, Boyd, Gehr, Jenekhe, Osaheni, Sipe and Weller-Brophy [1995]).
After briefly surveying the properties of nonlinear Maxwell Garnett composites, we discuss the details of the proposal and summarize the experimental observations. Since the field is rather new, the device applications of such composites have not yet been probed.
In view of the vast literature available, the scope of the current review is limited. For example, we restrict discussion only to macroscopic nonlinear optical effects in the framework of a classical theory based on Maxwell’s equations. It is clear that for sufficiently thin layers, quantum confinement effects (for reviews, see, e.g., Flytzanis, Hache, Klein, Ricard and Roussignol [1991], Flytzanis [1992]) can become important, leading to the breakdown of the classical description. We thus consider layered media where each layer can be characterized by a macroscopic parameter such as the dielectric function. The other limitation is that we describe only the parametric processes, and do not cover important topics such as Raman and multiphoton processes, although we briefly discuss the multiphoton-induced Kerr-like effect.
Section 2 is devoted to Kerr nonlinear effects and is divided into four parts, the first three of which examine theoretical developments and the fourth summarizes the experiments. The cases of normal and oblique incidence are covered in the first two parts, and the third discusses the properties of periodic and quasiperiodic media. Section 3 describes harmonic generation and other effects like cascaded second-order processes and four-wave mixing. Section 4 discusses the properties of nonlinear composite materials, especially layered composites.
§ 2. Nonlinear Transmission and Optical Bistability in Layered Media
The nonlinear optical effects in layered media (for earlier reviews see, e.g., Stegeman and Seaton [1985], Stegeman, Seaton, Hetherington, Boardman and Egan [1986], Mihalache, Bertolotti and Sibilia [1989], Langbein, Lederer, Peschel, Trutschel and Mihalache [1990]; see also Ostrowsky and Reinisch [1992]) can be diverse, depending on the nature of the nonlinearity. An important nonlinear optical effect, optical bistability, originates in the intensity dependence of the real and imaginary parts of the refractive index. Since the real (imaginary) part of the refractive index defines the dispersive (absorptive) properties of the medium, the resulting multivalued response was labeled as dispersive (absorptive) bistability. Bistable response resulted when the nonlinear medium was contained in a Fabry-Perot cavity. Thus, feedback was identified as an important factor needed to have bistable response. The proposal for optical bistability with nonlinear cavities was advanced almost three decades ago by Seidel [1969] and Szoke, Daneu, Goldhar and Kurnit [1969]. Experimental observation of optical bistability (McCall, Gibbs and Venkatesan [1975], Gibbs, McCall and Venkatesan [1976]) was delayed because of the problems of fabricating a high finesse cavity. The importance of optical bistable devices in the context of optical signal processing as well as optical computing (see, e.g., Gibbs, Mandel, Peyghambarian and Smith [1986]) was well understood. Although the initial studies on optical bistability were in Fabry-Perot geometry, later proposals were advanced for having mirrorless bistability (Bowden and Sung [1980, 1981]). Nonlinear phenomena other than self-action of the wave were suggested for having bistable output.
Winful and Marburger [1980], Flytzanis and Tang [1980], Agrawal and Flytzanis [1981] and several others demonstrated the possibility of optical bistability in degenerate four-wave mixing and phase conjugation. An exhaustive treatment of bistable and multistable behavior in optical systems was presented by Gibbs [1985]. Our review attempts to examine the later developments in the context of layered geometry, beginning with the theoretical developments and followed by the experimental results. We restrict discussion to dispersive bistability because of its tremendous potential for device applications.
In the context of layered media one can classify the problems under two broad categories, namely, when the structure is illuminated by radiation incident normally or at an angle. In the first case, one ends up with a system of coupled nonlinear and linear Fabry-Perot cavities with no surface excitations, whereas in the second case one has to take into account the possible surface and guided modes of the structure. Note that irrespective of whether the incidence is normal or oblique, the resonances of the structure are responsible for the local field enhancements leading to low threshold nonlinear optical effects. This review addresses the cases of normal and oblique incidence separately.

2.1. NORMAL INCIDENCE
Marburger and Felber [1978] first studied dispersive bistability using a Fabry-Perot cavity. The cavity was represented by the mirror reflection coefficient R, and the intracavity medium was assumed to have a Kerr-type nonlinearity. As pointed out later by Leung [1989], the problem of transmission through a nonlinear slab is more complicated than that of a Fabry-Perot cavity with localized feedback (mirrors at, say, z = 0 and z = d with given reflectivities). In fact, in the case of a nonlinear layer the reflectivities at the two surfaces are not given, but rather must be determined together with the reflectivity of the film in a self-consistent fashion. The difficulties increase substantially when one deals with a layered medium comprising a combination of nonlinear layers. It is thus necessary to have a theory that can adequately describe the transmission characteristics of a general nonlinear layered medium consisting of both linear and nonlinear layers. Approximate and exact methods in the context of Kerr nonlinearity were proposed to solve this complicated problem. We describe the exact method (Chen and Mills [1987a-c]) followed by other numerical and approximate schemes.
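The role of feedback in a mirror-bounded cavity can be illustrated with the textbook plane-wave model of a Kerr-filled Fabry-Perot resonator. The sketch below is a minimal illustration under stated assumptions, not the formulation of Marburger and Felber: it assumes the Airy transmission with coefficient F = 4R/(1 - R)^2 and a single-pass phase that shifts linearly with the transmitted intensity. The names `delta0`, `beta`, `incident_intensity`, and `is_bistable` are hypothetical. Scanning the transmitted intensity and computing the incident intensity avoids root finding; bistability appears as a non-monotonic input-output curve.

```python
import math


def finesse_coefficient(R):
    """Airy coefficient F = 4R / (1 - R)^2 for mirror reflectivity R."""
    return 4.0 * R / (1.0 - R) ** 2


def incident_intensity(I_t, R, delta0, beta):
    """Incident intensity needed to produce the transmitted intensity I_t.

    Assumed model: I_t / I_in = 1 / (1 + F sin^2(delta0 + beta * I_t)),
    where delta0 is the linear detuning and beta * I_t the Kerr phase shift.
    """
    F = finesse_coefficient(R)
    return I_t * (1.0 + F * math.sin(delta0 + beta * I_t) ** 2)


def is_bistable(R, delta0, beta, I_t_max, samples=2000):
    """True if I_in(I_t) is non-monotonic on (0, I_t_max], i.e. an S-shaped response."""
    previous = 0.0
    for i in range(1, samples + 1):
        current = incident_intensity(I_t_max * i / samples, R, delta0, beta)
        if current < previous:
            return True
        previous = current
    return False


if __name__ == "__main__":
    # High-finesse cavity versus a weakly reflecting one (illustrative values).
    print(is_bistable(R=0.9, delta0=-0.2, beta=1.0, I_t_max=0.3))
    print(is_bistable(R=0.04, delta0=-0.2, beta=1.0, I_t_max=0.3))
```

In this model the mirror reflectivity is a given number; for a bare nonlinear slab it is not, which is the complication addressed by the self-consistent treatments described next.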
2.1.1. Chen-Mills exact solution
The exact solution for a Kerr nonlinear slab for normal incidence of a plane polarized wave was first presented by Chen and Mills [1987a]. The general solution was shown to involve four parameters, three of which could be expressed in terms of the fourth, which with proper scaling was bounded between zero and one. Use of appropriate boundary conditions (i.e., continuity of the electric field and its normal derivative) led to the numerical evaluation of these constants, and finally to the power-dependent transmission and reflection coefficients of the structure. A generalization to the case of a nonlinear layered medium consisting of a finite number of nonlinear slabs was presented later (Chen and Mills [1987b,c]). We briefly describe details of their analytical and numerical method.
Consider first the solution of the nonlinear wave equation for the propagation of a plane y-polarized wave along the z-direction. Measuring the field in terms of the incident field amplitude $E_0$, the expression for the wave in the nonlinear medium can be written as
$$E(z) = E_0\,\mathcal{E}(z)\exp[i\varphi(z)]. \qquad (2.1)$$
$E(z)$ given by eq. (2.1) must satisfy the nonlinear wave equation
$$\frac{d^2E}{dz^2} + k^2\left[1+\alpha|E|^2\right]E = 0, \qquad (2.2)$$
where $k = (\omega/c)\,n_0$ and we have assumed a nonlinearity of the form
$$n^2(|E|^2) = n_0^2\left(1+\alpha|E|^2\right). \qquad (2.3)$$
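In eq. (2.3), $n_0$ is the low-power limit of the refractive index and $\alpha$ is the nonlinearity constant. Substitution of eq. (2.1) into eq. (2.2) and subsequent integration lead to the following two equations for the phase $\varphi(z)$ and the amplitude $\mathcal{E}(z)$:
$$\frac{d\varphi}{dz} = \frac{W}{\mathcal{E}^2}, \qquad (2.4)$$
$$\left(\frac{d\mathcal{E}}{dz}\right)^2 + \frac{W^2}{\mathcal{E}^2} + k^2\mathcal{E}^2 + \tfrac{1}{2}k^2\tilde{\alpha}\,\mathcal{E}^4 = A. \qquad (2.5)$$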
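The intermediate steps are not written out in the text. The following worked derivation is a sketch that fills them in, assuming a real amplitude $\mathcal{E}(z)$ and a real phase $\varphi(z)$, with the field measured in units of $E_0$ so that the nonlinearity enters through $\tilde{\alpha} = \alpha|E_0|^2$; primes denote $d/dz$.

```latex
% Substitute E = E_0 \mathcal{E} e^{i\varphi} into the wave equation (2.2)
% and divide by E_0 e^{i\varphi}:
\mathcal{E}'' + 2i\,\mathcal{E}'\varphi' + i\,\mathcal{E}\varphi'' - \mathcal{E}\varphi'^{2}
  + k^{2}\left(1+\tilde{\alpha}\,\mathcal{E}^{2}\right)\mathcal{E} = 0 .

% Imaginary part: a total derivative, whose integral is eq. (2.4)
\frac{1}{\mathcal{E}}\frac{d}{dz}\left(\mathcal{E}^{2}\varphi'\right) = 0
  \quad\Longrightarrow\quad \frac{d\varphi}{dz} = \frac{W}{\mathcal{E}^{2}} .

% Real part, with \varphi' eliminated by means of eq. (2.4):
\mathcal{E}'' - \frac{W^{2}}{\mathcal{E}^{3}} + k^{2}\mathcal{E}
  + k^{2}\tilde{\alpha}\,\mathcal{E}^{3} = 0 .

% Multiply by 2\mathcal{E}' and integrate once; the constant is A of eq. (2.5)
\frac{d}{dz}\left[\mathcal{E}'^{2} + \frac{W^{2}}{\mathcal{E}^{2}} + k^{2}\mathcal{E}^{2}
  + \tfrac{1}{2}k^{2}\tilde{\alpha}\,\mathcal{E}^{4}\right] = 0 .
```

The constant $W$ plays the role of a conserved flux, and $A$ that of an energy-like first integral of the amplitude equation.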
In eqs. (2.4) and (2.5), $W$ and $A$ are integration constants, and $\tilde{\alpha} = \alpha|E_0|^2$. The incident field may then be taken to have unit amplitude, and the intensity dependence of the response can be probed by varying $\tilde{\alpha}$. Equations (2.4) and (2.5) can be integrated to yield
[Eqs. (2.6) and (2.7) are not legible in the source.]
with $I(z) = \mathcal{E}^2(z)$. In writing eqs. (2.6) and (2.7) it was assumed that the nonlinear medium terminates at $z = d$. It is clear from eqs. (2.6) and (2.7) that the general solutions involve four constants, $W$, $A$, $I(d)$ and $\varphi(d)$, which are to be determined from the boundary conditions. The integral in eq. (2.6) can be expressed in terms of the Jacobian elliptic functions, and the results are different for the cases $\tilde{\alpha} = 0$, $\tilde{\alpha} > 0$, and $\tilde{\alpha} < 0$. For $\tilde{\alpha} = 0$ (linear case), one can express $I(z)$ in terms of elementary functions as follows:
$$I(z) = \frac{1}{2k^2}\left\{A + \left(A^2 - 4k^2W^2\right)^{1/2}\sin\left[\pm 2k(z-d) + \sin^{-1}\left(\frac{2k^2 I(d) - A}{\left(A^2 - 4k^2W^2\right)^{1/2}}\right)\right]\right\}. \qquad (2.8)$$
For a self-focusing nonlinearity ($\tilde{\alpha} > 0$), $I(z)$ is given by
[Eqs. (2.9) and (2.10) are not legible in the source.]
In eqs. (2.8)-(2.10), $I^{(1)}$, $I^{(2)}$ and $I^{(3)}$ are the roots of the cubic polynomial in the denominator of the integrand of eq. (2.6). These roots are given by
$$I^{(l)} = -\frac{2}{3\tilde{\alpha}} + 2\left(-\frac{p}{3}\right)^{1/2}\cos\left(\theta + \frac{2\pi}{3}\,l\right), \qquad l = 1, 2, 3, \qquad (2.11)$$
and they are arranged such that $I^{(1)} > I^{(2)} > I^{(3)}$. In eq. (2.11), $\theta$ is determined from the relation
[Eq. (2.12) is not legible in the source.]
with
$$p = -\frac{4}{3\tilde{\alpha}^2} - \frac{2A}{k^2\tilde{\alpha}}, \qquad (2.13)$$
$$q = \frac{16}{27\tilde{\alpha}^3} + \frac{4A}{3k^2\tilde{\alpha}^2} + \frac{2W^2}{k^2\tilde{\alpha}}. \qquad (2.14)$$
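As a numerical check on the root formulas, the sketch below evaluates the three roots from p, q, and θ and substitutes them back into the cubic ½k²α̃I³ + k²I² − AI + W² = 0 that follows from eq. (2.5) with I = ℰ². It assumes the trigonometric relation cos 3θ = −(q/2)(−3/p)^{3/2}, which is the standard solution of a depressed cubic with three real roots and may differ in form from the authors' eq. (2.12). The function names and the parameter values are illustrative only.

```python
import math


def cubic(I, A, W, k, alpha_t):
    """Cubic whose roots bound the intensity: (1/2)k^2 a I^3 + k^2 I^2 - A I + W^2."""
    return 0.5 * k ** 2 * alpha_t * I ** 3 + k ** 2 * I ** 2 - A * I + W ** 2


def cubic_roots(A, W, k, alpha_t):
    """Roots I(1) > I(2) > I(3) of the cubic for alpha_t > 0.

    Assumes three real roots, i.e. p < 0 and |cos(3*theta)| <= 1.
    """
    p = -4.0 / (3.0 * alpha_t ** 2) - 2.0 * A / (k ** 2 * alpha_t)
    q = (16.0 / (27.0 * alpha_t ** 3)
         + 4.0 * A / (3.0 * k ** 2 * alpha_t ** 2)
         + 2.0 * W ** 2 / (k ** 2 * alpha_t))
    # Assumed form of the relation that fixes theta (standard trigonometric method).
    cos_3theta = -(q / 2.0) * (-3.0 / p) ** 1.5
    if abs(cos_3theta) > 1.0:
        raise ValueError("cubic does not have three real roots")
    theta = math.acos(cos_3theta) / 3.0
    shift = -2.0 / (3.0 * alpha_t)
    amplitude = 2.0 * math.sqrt(-p / 3.0)
    roots = [shift + amplitude * math.cos(theta + 2.0 * math.pi * l / 3.0)
             for l in (1, 2, 3)]
    return sorted(roots, reverse=True)


if __name__ == "__main__":
    for root in cubic_roots(A=2.0, W=0.5, k=1.0, alpha_t=0.1):
        print(root, cubic(root, 2.0, 0.5, 1.0, 0.1))
```

For a self-focusing medium the two positive roots bound the range over which the intensity oscillates inside the slab, and the third root is negative; the sum of the three roots equals −2/α̃, which gives a quick consistency test.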
Note that the choice of the sign in eqs. (2.8) and (2.10) is critical to arrive at the correct results. This issue was discussed in detail by Chen and Mills, clearly demonstrating that a proper implementation of the boundary conditions makes the choice unique. The case of defocusing nonlinearity ($\tilde{\alpha} < 0$) can be developed similarly as for $\tilde{\alpha} > 0$, but we do not describe it here because of the rather lengthy expressions involved.

Fig. 1. Schematic view of a layered medium consisting of N layers bounded on the left (right) by a linear medium with refractive index $n_i$ ($n_t$). The j-th layer is characterized by linear refractive index $n_j$ and nonlinear coefficient $\tilde{\alpha}_j$.

We now apply the solution for the nonlinear medium to a multilayered system with N nonlinear layers (fig. 1) embedded in vacuum ($n_i = n_t = 1$). Let the layers be labeled by integers $j$ ($j = 1, \ldots, N$). Each layer is characterized by its linear refractive index $n_j$ and nonlinearity $\tilde{\alpha}_j$, electric field amplitude $\mathcal{E}_j$, and phase $\varphi_j$. Let the beginning of the structure be at $z = 0$, and let the boundary between the $j$-th and $(j+1)$-th layers be at $z_j$. For incidence from the left, the boundary conditions at $z = 0$, $z = z_j$ and $z = z_N$ can be manipulated to yield the following relations:
$$\tfrac{1}{2}n_1^2\tilde{\alpha}_1 I_1^2(0) + \left(n_1^2 - 1\right)I_1(0) + 4 - \frac{2W}{k_0} - \frac{A_1}{k_0^2} = 0, \qquad k_0 = \frac{\omega}{c}, \qquad (2.15)$$
[Eqs. (2.16)-(2.21) are not legible in the source.]
In the units chosen, $W$ is a real number bounded between zero and $k_0$. The numerical procedure treats $W$ as a parameter. A guessed value of $W$ determines $I_N(z_N)$ through the relation $W = k_0 I_N(z_N)$ (see eqs. 2.4 and 2.21) and the value of the constant $A_N$ (eq. 2.20). Since all the constants $W$, $A_N$, and $I_N(z_N)$ for the N-th slab are known (note that the intensity given by eq. (2.6) does not contain the fourth constant $\varphi_N(z_N)$), the solution at the left edge of the N-th slab can be evaluated using eq. (2.9) with eqs. (2.10)-(2.14). One thus knows $I_{N-1}(z_{N-1})$. $A_{N-1}$ is then evaluated using eq. (2.19), with $W_N = W_{N-1} = W$. With the knowledge of all the relevant constants in the $(N-1)$-th layer, the solution can be propagated to the left edge of the $(N-1)$-th layer. The continuation of the procedure leads to $I_m(z)$ for all $m$. $I_1(0)$, thus evaluated, must satisfy eq. (2.15). This condition selects the allowed value of the constant $W$. Once the solution for $W$ is known, the intensity transmission coefficient $T$ can be evaluated using the boundary condition at $z = z_N$. $T$ is given by
$$T = I_N(z_N). \qquad (2.22)$$
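The backward propagation from the exit face can be imitated numerically for a single nonlinear slab in vacuum. The following is a minimal sketch under stated assumptions rather than the Chen-Mills algorithm itself: it replaces the elliptic-function propagation by a fourth-order Runge-Kutta integration of the amplitude equation implied by eqs. (2.4) and (2.5), and instead of searching for W at fixed incident amplitude it fixes the transmitted intensity and reads off the incident intensity from the boundary condition at z = 0. Intensities are in absolute units with nonlinearity constant `alpha`; the function names and parameter values are hypothetical.

```python
import math


def slab_response(I_t, k0, n, d, alpha=1.0, steps=4000):
    """Return (I_i, A_left, A_right) for a Kerr slab of index n and width d in vacuum.

    I_t is the transmitted intensity. Matching to a purely outgoing wave at
    z = d gives W = k0 * I_t and dE/dz = 0 there. A_left and A_right are the
    values of the first integral (2.5) at z = 0 and z = d.
    """
    k = k0 * n
    W = k0 * I_t

    def accel(E):
        # Second derivative of the real amplitude (real part of the wave equation).
        return W * W / E ** 3 - k * k * E - k * k * alpha * E ** 3

    def first_integral(E, dE):
        return dE * dE + W * W / E ** 2 + k * k * E ** 2 + 0.5 * k * k * alpha * E ** 4

    E, dE = math.sqrt(I_t), 0.0
    A_right = first_integral(E, dE)
    h = -d / steps  # integrate from z = d back to z = 0
    for _ in range(steps):
        k1y, k1v = dE, accel(E)
        k2y, k2v = dE + 0.5 * h * k1v, accel(E + 0.5 * h * k1y)
        k3y, k3v = dE + 0.5 * h * k2v, accel(E + 0.5 * h * k2y)
        k4y, k4v = dE + h * k3v, accel(E + h * k3y)
        E += h * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
        dE += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    # Continuity at z = 0: |2 k0 E_inc|^2 = (dE/dz)^2 + (k0 E + W / E)^2.
    I_i = (dE * dE + (k0 * E + W / E) ** 2) / (4.0 * k0 * k0)
    return I_i, first_integral(E, dE), A_right


def airy_transmission(k0, n, d):
    """Linear transmission of a lossless slab of index n in vacuum (for comparison)."""
    return 1.0 / (1.0 + ((n * n - 1.0) / (2.0 * n)) ** 2 * math.sin(n * k0 * d) ** 2)


if __name__ == "__main__":
    I_i, A_left, A_right = slab_response(0.05, k0=1.0, n=1.5, d=1.0, alpha=1.0)
    print("T =", 0.05 / I_i, " first integral drift =", A_left - A_right)
```

Scanning `I_t` and plotting it against the returned `I_i` traces the transmission curve, including any multivalued region. With `alpha=0` the ratio `I_t / I_i` reduces to the Airy formula, which serves as a check on the boundary conditions.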
2.1.2. Other numerical and approximate methods
Perhaps the simplest possible approach to the calculation of the reflection and transmission coefficients of a layered medium comprising Kerr nonlinear slabs was the extension of the linear transfer matrix method (see, e.g., Born and Wolf [1989]) to the nonlinear regime. The nonlinear transfer matrix method was first developed by Dutta Gupta and Agarwal [1987], who applied the theory to calculate the transmission of a system of single and coupled nonlinear Fabry-Perot cavities. Optical bistability for both cases was demonstrated, and the role of coupling between the cavities (in the case of coupled Fabry-Perot cavities) was assessed. Later the theory was generalized to multiple layers, and an efficient numerical scheme to handle such systems was presented (Dutta Gupta and Ray [1988]). We recall the essential steps pertaining to a multilayered medium consisting of N nonlinear slabs (fig. 1) bounded on the left (right) by a linear medium with dielectric constant $\epsilon_i$ ($\epsilon_t$). Let a TE- (or s-) polarized plane wave be incident on the structure from the left. Let the nonlinearity of the j-th layer be given by the nonlinear displacement vector $\vec{D}_{NL}$ as follows (Maker, Terhune and Savage [1964]):
[Eq. (2.23) is not legible in the source.]
where $\epsilon_j$ is the linear dielectric constant, $\chi_j$ is the constant of nonlinear interaction, $A_j$, $B_j$ are the Kerr and electrostriction nonlinearity constants, and $\vec{E}_j$ is the electric field vector, all pertaining to the j-th layer. In the j-th slab with nonlinearity given by eq. (2.23) the solutions of Marburger and Felber [1978] for the y-component of the electric field in the slowly varying envelope approximation can be written as
$$E_j = A_{j+}\,e^{ik_{j+}z} + A_{j-}\,e^{-ik_{j-}z}, \qquad (2.24)$$
with
$$k_{j\pm} = k_0\sqrt{\epsilon_j}\left(1 + U_{j\pm} + 2U_{j\mp}\right)^{1/2} = k_0 n_{j\pm}, \qquad k_0 = \frac{\omega}{c}, \qquad (2.25)$$
$$U_{j\pm} = \alpha_j|A_{j\pm}|^2, \qquad \alpha_j = \chi_j\left(A_j + B_j\right). \qquad (2.26)$$
In eqs. (2.24)-(2.26), $A_{j+}$ ($A_{j-}$), $k_{j+}$ ($k_{j-}$), and $U_{j+}$ ($U_{j-}$) are the constant amplitude, wave vector, and dimensionless intensity, respectively, of the forward (backward) propagating wave. Using eq. (2.24), one can obtain the expression for the tangential component of the magnetic field. Furthermore, one can eliminate $A_{j+}$ and $A_{j-}$ from the expressions of the tangential field components at the left and right interfaces of the j-th slab. This yields the characteristic matrix $M_j$ that relates the tangential magnetic and electric field components at the left and the right faces of the j-th slab. With $d_j$ denoting the width of the j-th slab, the elements of the characteristic matrix are given by
$$m^{(j)}_{11} = \left(n_{j-}\,e^{-ik_{j+}d_j} + n_{j+}\,e^{ik_{j-}d_j}\right)/\left(n_{j+} + n_{j-}\right), \qquad (2.27a)$$
$$m^{(j)}_{12} = \left(e^{-ik_{j+}d_j} - e^{ik_{j-}d_j}\right)/\left(n_{j+} + n_{j-}\right), \qquad (2.27b)$$
$$m^{(j)}_{21} = n_{j+}n_{j-}\left(e^{-ik_{j+}d_j} - e^{ik_{j-}d_j}\right)/\left(n_{j+} + n_{j-}\right), \qquad (2.27c)$$
$$m^{(j)}_{22} = \left(n_{j+}\,e^{-ik_{j+}d_j} + n_{j-}\,e^{ik_{j-}d_j}\right)/\left(n_{j+} + n_{j-}\right). \qquad (2.27d)$$
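The structure of the matrix can be checked numerically. The sketch below builds the 2×2 matrix for given effective indices and verifies two properties that follow from it: for equal forward and backward indices it reduces to the familiar linear characteristic matrix with cos(kd) on the diagonal, and its determinant equals exp[i(k₋ − k₊)d], which is unity only in the reciprocal case. The helper names are hypothetical. The off-diagonal elements are those obtained by eliminating the amplitudes from the tangential fields of eq. (2.24), with the magnetic field normalized as H ∝ n₊A₊ − n₋A₋; that normalization is an assumption about the convention used.

```python
import cmath
import math


def effective_indices(n_lin, U_plus, U_minus):
    """Effective indices seen by the forward and backward waves, after eq. (2.25)."""
    n_plus = n_lin * math.sqrt(1.0 + U_plus + 2.0 * U_minus)
    n_minus = n_lin * math.sqrt(1.0 + U_minus + 2.0 * U_plus)
    return n_plus, n_minus


def characteristic_matrix(n_plus, n_minus, k0, d):
    """Matrix relating (E, H) at the left face of a slab to (E, H) at its right face."""
    a = cmath.exp(-1j * k0 * n_plus * d)   # exp(-i k_plus d)
    b = cmath.exp(1j * k0 * n_minus * d)   # exp(+i k_minus d)
    s = n_plus + n_minus
    m11 = (n_minus * a + n_plus * b) / s
    m12 = (a - b) / s
    m21 = n_plus * n_minus * (a - b) / s
    m22 = (n_plus * a + n_minus * b) / s
    return (m11, m12), (m21, m22)


def determinant(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


if __name__ == "__main__":
    n_plus, n_minus = effective_indices(1.5, 0.02, 0.01)
    M = characteristic_matrix(n_plus, n_minus, k0=1.0, d=0.7)
    print(determinant(M), cmath.exp(1j * (n_minus - n_plus) * 0.7))
```

A determinant that differs from unity is a compact signature of the different optical paths seen by the two counterpropagating waves.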
Since the boundary conditions demand the continuity of the tangential components, the application of characteristic matrices to evaluate the tangential field components at any interface becomes straightforward. Henceforth, we assume that all media have the same nonlinearity constant, namely, $\alpha_j = \alpha$ for all $j$ (otherwise one needs to know all the $\alpha_j$). One starts at the right edge, that is, with the N-th layer, treating the transmitted intensity $U_t = \alpha|A_t|^2$ ($A_t$ is the transmitted amplitude) as the parameter. Forward and backward wave intensities in the N-th layer can then be expressed as
[Eq. (2.28) is not legible in the source.]
where $n_t = \sqrt{\epsilon_t}$ and $|\cdot|^2$ implies the mod square of the elements of the column matrix. The set of nonlinear algebraic equations (2.28) is solved numerically by any standard technique (e.g., fixed-point iteration) to obtain the values of $U_{N\pm}$. A knowledge of $U_{N\pm}$ yields $n_{N\pm}$ through eq. (2.25). The evaluation of the characteristic matrix $M_N$ using eq. (2.27) for the N-th slab is then straightforward. An analogous procedure is applied to the successive layers. For any j-th layer ($j \geq 1$) one thus needs to solve the coupled nonlinear equations
[Eq. (2.29) is not legible in the source.]
which yields the characteristic matrix $M_j$ for the j-th slab. For a total of N layers the total characteristic matrix is given by
[Eq. (2.30) is not legible in the source.]
As in the case of a linear stratified medium, the reflection and transmission coefficients are then given by
[Eqs. (2.31) and (2.32) are not legible in the source.]
Note that if any layer is linear, eq. (2.27) leads to the linear characteristic matrix and solving the coupled nonlinear equations like eq. (2.29) is unnecessary. The generalization of the above theory to include oblique incidence is also straightforward (see § 2.2.1).
We now stress some important aspects of the nonlinear characteristic matrix formalism of Dutta Gupta and Agarwal [1987]. It may be noted from the solution given by eq. (2.24) that the effective refractive indices experienced by the forward and backward waves, $n_{j+}$ and $n_{j-}$, respectively, are not the same, leading to the so-called nonreciprocity. In the context of counterpropagating waves with arbitrary polarization, the light-induced nonreciprocity can lead to interesting effects (Kaplan and Meystre [1981], Kaplan [1983], Kaplan and Law [1985]). The other important aspect is the nonlinearity of the boundary conditions (Agarwal and Dutta Gupta [1987]). The term “nonlinearity of the boundary conditions” is used in the sense that the boundary conditions involve magnetic fields which are nonlinear functionals of the electric fields in the medium. In other words, the magnetic field amplitudes in the j-th nonlinear medium, as can be seen from the derivative of eq. (2.24), are functions of the $n_{j\pm}$, which in turn depend on the forward and backward wave amplitudes $A_{j\pm}$ of the electric field. Neglect of the nonlinearity of the boundary conditions amounts to ignoring the nonlinearity in the amplitudes of the magnetic field while retaining it in the phases. Note that the complications associated with the nonlinearity of the boundary conditions do not arise when the Fabry-Perot cavity is characterized by mirrors with given reflection coefficients. This was the approach taken earlier (McCall [1974], Marburger and Felber [1978], Carmichael and Agrawal [1981], Abraham and Smith [1982], Cooperman, Dagenais and Winful [1984], Lang and Yariv [1986], Nishiyama and Kurita [1986]).
Agarwal and Dutta Gupta [1987] discussed the consequences of the neglect of the nonlinearity of the boundary conditions in detail. They considered a nonlinear slab of width $d$ coated on both sides by alternating $m$ low-index and $m + 1$ high-index linear slabs (fig. 2a), and compared the results with and without the nonlinearity of the boundary conditions. The results for the transmission coefficient of the nonlinear structure are shown in fig. 2b. An increase in the number of periods $m$ of the coating leads to higher finesse of the cavity, resulting in a lower bistability threshold (fig. 2b). It is clear from fig. 2b that a higher bistability threshold leads to larger deviations of the approximate results (with neglect of nonlinearity of the boundary conditions) from the exact nonlinear characteristic matrix theory. The corrections are almost insignificant for $m = 3$, when the threshold (as well as the nonlinear correction to the effective refractive indices) is rather low. Note also that for a given $m$ the deviations are more prominent near the upper bistability threshold, which is due to the larger intracavity field intensity at nonlinear resonance.
Thus, the full nonlinearity of the boundary conditions is important whenever one deals with relatively large intensities in the nonlinear medium. Finally, we consider the implications of simultaneous neglect of the effects of nonreciprocity and nonlinearity of the boundary conditions. The assumption of the same effective refractive indices for the forward and backward waves together with the neglect of nonlinearity of the boundary conditions amounts to using the standard Fresnel formulas with the linear refractive index $n_j$ replaced by the intensity-dependent refractive index $n_j[1 + (U_{j+} + U_{j-})]^{1/2}$. In fact, some of the earlier attempts to analyze optical bistability in the context of oblique incidence made use of an analogous simplified approach (Wysin, Simon and Deck [1981], Martinot, Laval and Koster [1984]). A computationally different, although essentially equivalent, characteristic matrix approach was proposed later by Danckaert, Thienpont, Veretennicoff, Haelterman and Mandel [1989]. A detailed study exploring both the domain of applicability of the matrix method and the validity of the various approximations
Fig. 2. (a) Schematic view of a nonlinear Fabry-Perot cavity with reflection coatings, composed of m low-index ($n_b$) and m + 1 high-index ($n_a$) λ/4 plates. Parameters are chosen as follows: $n_a$ = 2.3, $n_b$ = 1.3077, n = 1.7149. (b) Transmission coefficient T as a function of the incident intensity $U_i$: solid (dashed) curves give the results with (without) nonlinearity of the boundary conditions. Different curves are labeled by the values of m (m = 1, 2, 3). The parameters have been chosen as follows: $k_0 n d = (2 - 2\Delta)\pi$, Δ = 0.113 for m = 1, Δ = 0.04 for m = 2, and Δ = 0.018 for m = 3. Δ gives the half width at half maximum of the linear transmission resonances (Agarwal and Dutta Gupta [1987]).
was carried out by Danckaert, Fobelets, Veretennicoff, Vitrant and Reinisch [1991]. The basic difference between the method of Dutta Gupta and Agarwal [1987] and that of Danckaert, Thienpont, Veretennicoff, Haelterman and Mandel [1989] is that the latter group uses matrices that relate the constant amplitudes of the forward and backward waves in adjacent layers, whereas the former group uses tangential components of the electromagnetic fields. The advantage of the earlier approach is that it can directly yield the electric and magnetic fields at any point of the layered medium, whereas the other gives the reflection and transmission coefficients in a straightforward way. Moreover, in the latter
approach solving coupled nonlinear equations is unnecessary, which, although simple, can be slightly more time consuming. With respect to the main results of Danckaert, Fobelets, Veretennicoff, Vitrant and Reinisch [1991] regarding the validity of the nonlinear characteristic matrix formalism, conclusions were reached by comparing the results of the characteristic matrix formalism with the exact numerical solution of the nonlinear wave equation pertaining to the layered medium. The major approximations essential for developing the formalism can be listed as follows: (i) neglect of the spatial third harmonics (terms like $e^{\pm i3kz}$); (ii) slowly varying envelope approximation (SVEA). The motivation for the neglect of the spatial third harmonic is obvious. For a nonlinear layer with thickness d > λ, an averaging over the high frequency components leads to a smearing of the contributions from such terms. However, for d ≲ λ, as was pointed out by Biran [1990], this can be a poor approximation. The validity of the slowly varying envelope approximation was tested on the basis of exact and approximate calculations for a superlattice with alternate linear (with comparatively larger linear refractive index) and nonlinear layers. Results were obtained for several structures, in particular, a hundred-period (high-index/low-index) and a five-period (8 high-index/8 low-index) structure. In both cases approximately the same peak strength of the fields was noted. SVEA was shown to be a good approximation for the five-period superlattice, but it failed badly for the hundred-period structure. This was explained in terms of accumulation of errors (due to SVEA) through successive applications of the boundary conditions. This explanation seems incomplete, since the authors do not take into account the specifics of the field distribution in the distributed feedback structure.
Note that in the case of the hundred-period superlattice one has the formation of a stationary soliton-like distribution along the length of the superlattice (see § 2.3.1), whereas such a distribution does not emerge in the structure with fewer periods. To summarize, the major conclusion was that SVEA holds in the case of nonlinear layers with widths larger than the wavelength. The nonlinear characteristic matrix formalism and its simplified versions (with the neglect of nonreciprocity and/or nonlinearity of boundary conditions), because of their elegance and simplicity, have found many applications. Structures involving a few layers as well as periodic and quasiperiodic layered media have been studied. Because of the special properties of periodic and quasiperiodic structures, the results pertaining to such systems will be addressed separately. We present here the results for systems lacking periodicity or quasiperiodicity along the direction of stratification. An interesting effect that appears in the optical response of asymmetrical nonlinear layered media is the nonreciprocity of the overall structure. To be more
2 + V(E) = A.
(2.36)
In eqs. (2.35) and (2.36) the prime denotes the derivative with respect to ζ, and the constants appearing in them are integration constants. Equation (2.36) can be interpreted as the energy integral, where the potential function V(E) is given by (2.37)
In eq. (2.37), I? = ni - n?, n,=k,lko. Note that the sign of I? determines whether the waves are propagating or evanescent in the nonlinear layer for vanishing intensities and the solutions take different forms for these two cases. Equation (2.36) after integration leads to the quadrature determining E ( 5 ) in an implicit form: (2.38)
where $\zeta_0$ is another integration constant. For Kerr-type nonlinearity, one can define a dimensionless intensity $I(E)$, (2.39)
and carry out integration of eq. (2.38) to obtain the solution in terms of Jacobian elliptic functions. The behavior of the solutions is then determined (as in the
case of normal incidence) by the cubic dependence of the potential $V(I)$ on the dimensionless intensity, given by (2.40), with the constants $\eta_1$ and $\eta_2$ defined in (2.41).
The major difference between the two approaches lies in the implementation of the boundary conditions. Whereas Chen and Mills [1987c] treated the global constant as the free parameter, Leung [1989] used $I^{(2)}$ (the intermediate root of the equation $V(I) = 0$) as the parameter. In the former approach the equivalent constant W had to be scanned through a fixed range to find particular values that yielded a solution consistent with the boundary conditions. This amounts to solving for the roots of a single but highly complicated algebraic equation. The problem becomes further complicated since there can be several roots in the multivalued domain. In contrast, Leung’s approach avoids this problem, since no equation needs to be solved numerically. However, the method of Leung has not been generalized to a multilayered system. Using the exact solutions, Leung calculated the reflection coefficient from a nonlinear layer with a linear dielectric constant larger than that of the substrate and cladding, and demonstrated multivalued output in reflection. Moreover, Leung [1989] also reported nonlinearity-induced modes in the structures. In nonlinear structures, because of the dependence of the dielectric function on the intensity, the nonlinearity-induced increase in the optical width of the nonlinear layer may lead to additional resonances. Leung [1989] explored the origin of such resonances, and calculated the value of the incident intensity for which such resonances occurred in a nonlinear film of given thickness. Using the exact solutions for the nonlinear layers, Langbein, Lederer, Peschel, Trutschel and Mihalache [1990] developed a “matrix” method to handle the transmission and reflection coefficients for oblique incidence. Obviously, the method is not as simple as the nonlinear characteristic matrix approach, where the matrix for an N-layered nonlinear system can be obtained by a direct multiplication of the matrices in proper order.
The difficulty arises because these authors deal with the intensities and their derivatives (the solutions for which are known in terms of Jacobian elliptical functions) rather than the complex amplitudes of the fields. The other constraint is the assumption of lossless media, which enables the use of flux conservation.
Applying the matrix method, these authors explored the transmission resonances of various structures. Note that a transmission resonance is defined by the zero of the intensity reflection coefficient R. Thus, under the condition of a transmission resonance and in the absence of losses, all the incident energy is transmitted by the structure. They studied the power-dependent evolution of the transmission resonances. The relation between these transmission resonances and all-optical switching was also demonstrated. Transmission resonances without a linear analog (i.e., nonlinearity-induced modes) were another offshoot of their studies. Since other reviews have detailed the earlier studies on s-polarized nonlinear guided and surface waves (see, e.g., Stegeman and Seaton [1985], Mihalache, Bertolotti and Sibilia [1989], Stegeman [1992]), we do not discuss them here. Some major findings included the prediction of new nonlinearity-induced guided and surface modes, which do not have any linear counterpart. Another interesting observation was the possibility of s-polarized surface plasmon polaritons. The existence of s-polarized surface plasmon polaritons in thin metal films bounded on both sides by self-focusing media was predicted by Stegeman, Valera, Seaton, Sipe and Maradudin [1984]. Obviously, these kinds of surface plasmon polaritons do not have any linear analog, since surface plasmons in linear structures are essentially p-polarized. The dispersion relation for the s-polarized nonlinear surface plasmon polaritons guided by a metal film sandwiched between a linear substrate and nonlinear cladding was studied by Lederer and Mihalache [1986] and Mihalache, Mazilu and Lederer [1986]. The dispersion curves revealed a local power minimum and a restricted region for the permitted propagation constants. Along with the study of the power-dependent nonlinear surface and guided modes, optical bistability mediated by such modes has drawn considerable attention.
A convenient structure to study optical bistability with such modes was the attenuated total reflection (ATR) (sometimes referred to as frustrated total reflection) geometry, where a high-index prism with or without a low-index spacer layer loaded on top of the guiding interface/layer is used to couple the incident radiation to the surface/guided mode. Reflectivity of the structure is monitored as a function of the angle of incidence. A change in this angle changes the surface component of the wave vector, which can thereby be matched to the propagation constant of the surface/guided mode. In such a case one observes a dip in the reflection coefficient, which otherwise is close to unity because of total internal reflection at the prism-spacer layer interface. In the context of nonlinear ATR configuration, to derive a bistable response, one first obtains the angle of incidence when the surface/guided modes are
excited at low power levels. Keeping the angle of incidence slightly detuned in the proper direction (determined by the sign of nonlinearity) from the linear (low-power) resonance, one can sweep through the resonance by increasing the incident power. The sweeping is possible because of the dependence of the optical path in the nonlinear layers on intensity. The overall effect of the nonlinearity is a “bending” of the resonances, which eventually leads to the hysteretic response. Several authors exploited ATR geometry to demonstrate optical bistability with guided modes (Stegeman [1982], Reinisch, Arlot, Vitrant and Pic [1985]). Modifications of ATR geometry to lower the threshold were also proposed (Haelterman [1988]). In his proposal Haelterman exploited the intensity-dependent phase jump near the guided wave resonance to achieve low threshold bistability. As mentioned in § 2.1.2 in the context of normal incidence on a multilayered medium with an arbitrary number of nonlinear layers, the characteristic matrix approach can be generalized to include the case of oblique incidence. The general expression for the nonlinear characteristic matrix given by eq. (2.27) holds, except that now the effective refractive indices for the forward and backward waves (i.e., $n_{j\pm}$) are to be replaced by $n_{zj\pm}$, where $n_{zj\pm}$ are given by (2.42) with
$n_{zj}^2 = \epsilon_j - n_x^2 > 0, \quad n_x = k_x/k_0,$ (2.43)
where $k_x$ is the x-component of the wave vector determined by the angle of incidence, and $n_{zj\pm}$ give the nonlinearity-modified z-component of the propagation constants scaled by the vacuum wave vector $k_0$; $n_x$ gives the propagation constant (in units of $k_0$) or the effective index of the guided mode along x. Note that $n_x$ is continuous across the interface. The matrix approach for oblique incidence, however, has a restricted domain of application; that is, the $n_{zj}$ in each nonlinear layer has to be real. This implies that the method is applicable only when the waves are propagating in the nonlinear layers. Thus, it fails in the case of waveguides with nonlinear substrate or cladding, where the fields are generally evanescent. A much more accurate matrix method, using Jacobian elliptic functions as the solution in the nonlinear slab, was proposed by Trutschel, Lederer and Golz [1989]. They considered a system of an arbitrary number of unit cells sandwiched between linear substrate and
cladding. Each unit cell consisted of a linear layer between two Kerr nonlinear slabs. The method was applied to a GaAs/AlGaAs multilayer. They reported both symmetrical and asymmetrical TE guided modes and presented a detailed study of the corresponding dispersion characteristics. Recently, the optical properties of a Kerr nonlinear layer near a phase conjugate mirror (PCM) were studied using the characteristic matrix approach. Some studies in linear systems with PCM (Agarwal and Dutta Gupta [1995], Dutta Gupta and Jose [1996]) demonstrating the major role of evanescent waves served as a stimulus for such studies. Recent experimental observation of evanescent waves and their phase conjugation (Bozhevolnyi, Keller and Smolyaninov [1994], Bozhevolnyi, Vohnsen, Smolyaninov and Zayats [1995], Bozhevolnyi, Keller and Smolyaninov [1995]) was another motivation for the theoretical studies. Dutta Gupta and Jose [1996] probed the guided and surface wave structures near a PCM (see inset to fig. 3) for the signature of the interaction of guided and surface modes with a PCM. Phase conjugation of guided and surface modes leading to enhanced back scattering was reported (fig. 3). The studies on the nonlinear counterpart (Jose and Dutta Gupta [1998]) focused on two issues: (a) to probe the well-known distortion correction properties of PCM (see, e.g., Agarwal and Wolf [1982], Agarwal, Friberg and Wolf [1982a,b, 1983], Friberg and Drummond [1983]) in the context of a nonlinear layer; and (b) to look for the signature of nonlinearity in ordinary and phase conjugated reflectivity. In the domain where the waves were propagating in all the layers, it was shown that a PCM with reflectivity μ (|μ| = 1) can completely correct for the distortions introduced by a Kerr nonlinear slab. This was an explicit verification of an earlier general theorem of Agarwal [1983] that was applicable to a broad class of nonlinear media.
In the domain where guided modes are excited, it was shown that the presence of the PCM can lead to optical multistability in both specular and back scattering directions (fig. 4).

2.2.2. TM- or p-polarized waves

An exact solution for the scattering of p-polarized waves is known (Leung [1985]) only for a single linear-nonlinear interface (i.e., for a semi-infinite nonlinear medium). To date, to our knowledge, the general solution for a nonlinear slab and, more generally, for a layered medium has not been worked out. Here we briefly recall Leung’s exact solution pertaining to a single interface. Consider a semi-infinite isotropic nonlinear Kerr medium with a dielectric function of the form $\epsilon + \alpha|E|^2$ occupying the space z < 0. Assuming the x-dependence of
Fig. 3. Linear results for the reflection coefficient in (a) the specular direction, R, and (b) the back scattering direction, as functions of the angle of incidence θ [degrees] for μ = 1.0. The various dips are labeled by their corresponding mode numbers. The inset shows the layered medium on top of the PCM with the following parameter values: $d_1$ = 1 µm, $d_2$ = 5.5 µm, $d_3$ = 0.12 µm, λ = 0.82 µm, $\epsilon_1$ = 3.1329, $\epsilon_2$ = 12.95996 + 0.0453i, $\epsilon_3$ = 1, $\epsilon_4$ = 6.145 (Jose and Dutta Gupta [1998]).
Fig. 4. Reflection coefficient in (a) the specular direction, R, and (b) the back scattering direction, as functions of input intensity $U_4$ for θ = 47.69° (corresponding to point P in fig. 3a). Other parameters are as in fig. 3 (Jose and Dutta Gupta [1998]).
the fields to be $\sim e^{ik_x x}$ and with no variation along y, the Maxwell equations for p-polarized waves can be written in the form (2.44) (2.45) (2.46) where $\zeta = k_0 z$ and $n_x = k_x/k_0$. The set of equations (2.44)–(2.46) can be reduced to a second-order differential equation of the form
(z)‘(5 =
-
I ) By,
(2.47)
which can be solved for B, and B.L. The solutions for B, and Bh are given by (2.48)
where (2.50) Making use of these exact solutions, Leung [1985] found new waves with no linear counterpart. The dispersion relation for the surface modes was obtained without having to solve for the field profiles. Optical bistability mediated by nonlinear p-polarized waves has been an intense field of investigation. Surface plasmons (see, e.g., Raether [1977], Kovacs [1982]) at a metal-dielectric interface played an important role in such studies. These modes are localized near the surface in the sense that their fields decay exponentially away from the surface. Surface plasmons are excited by p-polarized waves in ATR configuration or by surface inhomogeneities (like a grating or surface roughness). There can be variations of the ATR geometry (fig. 5). The Otto geometry (Otto [1968]) has a low-index spacer layer between the high-index prism and the metal film, whereas in the Kretschmann configuration (Kretschmann [1971]) the metal film is deposited on the base of the high-index prism. A different geometry, which can support coupled surface plasmons in very thin metal films, was suggested by Sarid [1981]. In
Fig. 5. (a) Otto, (b) Kretschmann and (c) Sarid geometries for the excitation of surface plasmons. Arrows indicate the interface near which the surface excitation is localized. Note the possibility of the excitation of coupled surface plasmons in Sarid geometry.
the Sarid geometry one can excite both the symmetrical short-range (SR) and the antisymmetrical long-range (LR) surface plasmons. The long-range surface plasmons (LRSP) have the added advantage of large local field enhancements associated with them (Sarid, Deck, Craig, Hickernell, Jameson and Fasano [1982], Agarwal [1985]). Various nonlinear optical phenomena exploiting this extra enhancement were demonstrated by Sarid, Deck and Fasano [1982], Deck and Sarid [1982], and Quail, Rako, Simon and Deck [1983]. Optical bistability with surface plasmons at a metal-nonlinear dielectric interface was demonstrated by Wysin, Simon and Deck [1981]. Martinot, Laval and Koster [1984] used the ATR configuration with a nonlinear prism loaded on top of the metal film. The electric field intensity in the prism (i.e., $|E_i + E_r|^2$, with $E_i$ and $E_r$ the incident and reflected plane wave fields, respectively) was approximated by the sum of the intensities of the incident and reflected waves. Moreover, they assumed plane wave solutions in the nonlinear medium with the linear refractive index replaced by its nonlinear counterpart. Dutta Gupta and Agarwal [1986] investigated optical bistability in the prism-metal film-nonlinear substrate configuration without the assumption of a plane wave solution for the nonlinear dielectric. They used Kaplan's solutions (Kaplan [1981]) for an approximate wave equation suitable for p-polarized waves. Hickernell and Sarid [1986] demonstrated the advantages of LRSPs in the context of lowering the bistability threshold. They showed that the power threshold for switching between the two bistable states for LRSP can be two orders of magnitude less than that required for a single interface surface plasmon.
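The linear ATR resonance on which these bistability schemes rely can be illustrated with a minimal sketch of the Kretschmann configuration, using the standard three-medium Fresnel formula for p-polarized light. This is a purely linear calculation and not any of the nonlinear treatments cited above. All parameter values (prism permittivity 2.25, a 50 nm film with permittivity -18 + 0.5i at a wavelength of 632.8 nm, vacuum beyond the film) are hypothetical, silver-like numbers chosen here for illustration.

```python
import numpy as np

def kretschmann_reflectance(theta_deg, eps_prism=2.25, eps_metal=-18.0 + 0.5j,
                            eps_out=1.0, d_over_lambda=50.0 / 632.8):
    """p-polarized reflectance of a prism / metal film / dielectric stack.

    All default parameter values are illustrative assumptions.
    """
    theta = np.radians(np.asarray(theta_deg, dtype=float))
    n_x = np.sqrt(eps_prism) * np.sin(theta)      # k_x / k_0, conserved across layers

    def n_z(eps):
        # k_z / k_0; adding 0j selects the decaying branch (Im >= 0) when evanescent
        return np.sqrt(eps - n_x**2 + 0j)

    def r_p(eps_a, nz_a, eps_b, nz_b):
        # Fresnel reflection coefficient for p polarization, medium a -> medium b
        return (eps_b * nz_a - eps_a * nz_b) / (eps_b * nz_a + eps_a * nz_b)

    nz1, nz2, nz3 = n_z(eps_prism), n_z(eps_metal), n_z(eps_out)
    r12 = r_p(eps_prism, nz1, eps_metal, nz2)
    r23 = r_p(eps_metal, nz2, eps_out, nz3)
    film = np.exp(2j * (2 * np.pi * d_over_lambda) * nz2)   # round trip in the film
    r = (r12 + r23 * film) / (1 + r12 * r23 * film)
    return np.abs(r) ** 2

angles = np.linspace(35.0, 60.0, 2501)
R = kretschmann_reflectance(angles)
theta_res = angles[np.argmin(R)]
theta_crit = np.degrees(np.arcsin(np.sqrt(1.0 / 2.25)))
print(f"critical angle = {theta_crit:.2f} deg, "
      f"resonance at {theta_res:.2f} deg, R_min = {R.min():.3f}")
```

The reflectance stays close to unity beyond the critical angle except in a narrow dip where the in-plane wave vector matches that of the surface plasmon. An intensity-dependent permittivity in one of the layers shifts this dip with power, which is the "bending" of the resonance that underlies the hysteretic response.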
Exact results for optical bistability with surface plasmons in a layered structure on a nonlinear substrate were presented by Agarwal and Dutta Gupta [1986] using the solutions of Leung [1985]. Agarwal and Dutta Gupta [1986] considered a linear stratified medium consisting of N layers occupying a region $-z_N < z < 0$ on a Kerr nonlinear substrate filling the space z ≥ 0 (note the location of the nonlinear interface at z = 0 for obvious convenience). For the incidence of p-polarized light from the left at an angle θ the boundary conditions were written as
In eq. (2.51), $B_i$ ($B_r$) is the magnetic field amplitude of the incident (reflected) wave, $n_{zi} = \cos\theta/\sqrt{\epsilon_i}$ ($\epsilon_i$ is the dielectric constant of the medium of incidence), M is the characteristic matrix of the stratified medium (Born and Wolf [1989]) occupying $-z_N < z < 0$, and $B_{y,nl}(0)$ and $E_{x,nl}(0)$ are the tangential components of the magnetic induction and the electric field, respectively, at z = 0+. $B_{y,nl}(0)$ and $E_{x,nl}(0)$ are evaluated using eqs. (2.48)-(2.50) and taking the limit ζ → 0+. They are given by (2.52)
(2.53) with (2.54) Applying eq. (2.51) and treating I(0) as the free parameter enabled the straightforward calculation of the reflection coefficient of the structure. The exact results thus obtained were compared with the approximate results of Wysin, Simon and Deck [1981] and Dutta Gupta and Agarwal [1986]. The set of parameters applied by Wysin, Simon and Deck [1981] was used for calculation. It was shown that the results of Wysin, Simon and Deck underestimated, whereas those of Dutta Gupta and Agarwal overestimated, the switching thresholds. Calculations for the Sarid geometry supporting LRSP as well as SRSP were also performed. The former were shown to
lead to a switching threshold at least one order of magnitude less when compared to a structure supporting ordinary (single interface) surface plasmons. Exact numerical solutions for a nonlinear dielectric slab bounded symmetrically by thin metallic layers were studied by Pande and Dutta Gupta [1990]. The linear equivalent of the structure was proposed by Welford and Sambles [1988]. Pande and Dutta Gupta [1992] made a detailed study of the linear reflection and transmission when the dielectric layer possessed material absorption and dispersion. The metal-bound dielectric structure has the advantage that both the asymmetrical and symmetrical coupled modes (i.e., LRSP and SRSP) have comparable damping, unlike the case of Sarid geometry where the SRSP has a large decay compared with the LRSP. Since both the long-range and short-range surface plasmons have comparable decay, both (for proper operating conditions) can be affected by the nonlinearity to the same degree. Note that it is difficult to have optical bistability with the short-range modes in the other conventional scheme (Sarid configuration) where the coupling is by means of the metallic layer. In contrast, a metal-bound dielectric film can exhibit optical bistability with both LRSP and SRSP (Pande and Dutta Gupta [1990]). Pande and Dutta Gupta [1991] also considered the case of the saturation type of dispersive nonlinearity (Peschel, Dannberg, Langbein and Lederer [1988]) with the exact numerical method, and studied the effect of saturation on the bistable response. Nonlinearity-induced modes for both the cases of Kerr- and saturation-type nonlinearities were also reported. A more general form of nonlinearity allowing for the nonlinearity-induced anisotropy was considered by Boardman, Maradudin, Stegeman, Twardowski and Wright [1987] and Boardman and Twardowski [1989].
It was shown from first principles that for a macroscopically isotropic nonlinear material the induced polarization at frequency ω (= ω + ω − ω) can be calculated using the dielectric tensor which is given by
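$$
\hat{\epsilon} = \begin{pmatrix}
\epsilon_\infty + \alpha(E_x^2 + \beta E_y^2 + \gamma E_z^2) & 0 & 0 \\
0 & \epsilon_\infty + \alpha(\beta E_x^2 + E_y^2 + \beta E_z^2) & 0 \\
0 & 0 & \epsilon_\infty + \alpha(\gamma E_x^2 + \beta E_y^2 + E_z^2)
\end{pmatrix}
$$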
(2.55) In writing eq. (2.55), a planar guiding structure with the interface at z = 0 supporting waves with wave vector along x was assumed. The values of the constants β and γ are determined by the nature of the nonlinearity. For example, γ = 1/3, −1/2 and 1, and β = 2/3, 1/4 and 1 for electronic distortion,
molecular orientation, and thermal nonlinear mechanisms, respectively. The case of pure p-polarized waves on a semi-infinite nonlinear medium (with $E_y$ = 0) was considered for both metal and dielectric bounding media (Boardman, Maradudin, Stegeman, Twardowski and Wright [1987]). The exact equations of motion were integrated numerically (using a finite element code) to study the power flow as a function of the effective index of the guided mode. For metallic bounding media an extremal behavior (maximum) was predicted. However, access to this maximum requires a large change in the refractive index, which cannot be achieved at moderate power levels. Boardman and Twardowski [1989] studied a single interface as well as a linear film on a nonlinear substrate and focused on the mixed p- and s-polarized modes. Note that mixing is possible because of the presence of all field components in the expression for $\hat{\epsilon}$. Numerical studies were simplified because some of the first integrals could be found. It is clear that in contrast to the case of s-polarization, the study of p-polarized waves in nonlinear layered media poses a much more challenging problem. For s-polarized waves propagating along the x-direction with the z-axis giving the direction of stratification, one needs to consider the only nonzero component of the electric field, $E_y$, and the intensity-dependent nonlinearity can be expressed as a function of the local intensity $|E_y|^2$. In contrast, for p-polarized waves the local intensity becomes a function of $|E_x|^2 + |E_z|^2$, assuming an isotropic nonlinearity. Because of the complexity of solving Maxwell’s equations with both components of the field present in the nonlinearity, the earlier approaches resorted to various approximations for $|E_x|$ and $|E_z|$. One of the first attempts to model the p-polarized waves (Lederer, Langbein and Ponath [1983]) was based on the so-called longitudinal uniaxial approximation for which $|E_x| \gg |E_z|$.
Obviously, the approximation proved to be poor for guided waves (Seaton, Valera, Svenson and Stegeman [1985], Stegeman, Seaton and Ariyasu [1985], Langbein, Lederer, Mihalache and Mazilu [1987], Boardman, Maradudin, Stegeman, Twardowski and Wright [1987]). This led to the other approach known as the transverse uniaxial approximation which resorted to the other extreme, namely, IE,(.

3.2. TEMPERATURE DEPENDENCE OF THE THRESHOLD CURRENT
The lasing threshold in all diode lasers is determined by the balance between optical gain and losses. With increasing temperature, two main effects take place: the energy gap in III-V semiconductors shrinks (see table 5), and the carrier density distribution within each band broadens, with its peak shifting further into the band. The bandgap shrinkage is dominant, and the net result of these two opposing effects is that the gain peak shifts towards longer wavelengths. In addition, the gain peak is lowered at a given carrier concentration. Thus, even if the optical losses were to remain unchanged, the threshold current would increase with temperature, since a higher current density is required to maintain the same gain level. In addition, optical losses do increase with temperature, since the higher density of carriers necessary to maintain the required gain level results in increased free-carrier absorption in the active region. Somewhat less important, at least at room temperature and above, is an increase in free-carrier absorption that can occur in passive layers, caused by temperature-dependent impurity ionization. These considerations assume implicitly that all current flowing through the diode laser results in radiative transitions. However, only a fraction of electron-hole pairs recombines radiatively. Nonradiative processes can also be temperature dependent, either directly, or indirectly via the increased carrier density necessary to balance the optical loss. An example of such a process is Auger recombination, which increases rapidly with carrier density. Another mechanism of carrier loss is leakage along a shunt path away from the active region or straight over the quantum-well active region. The current density associated with the latter process can be described using a simple expression, analogous to the standard current-voltage equation for a p-n junction (Scott, Corzine, Young and Coldren [1993]):
Table 5. Temperature dependence of bandgap $E_g$ and peak gain $g_{max}$

3.95+1.15x (f)   3.34.0 (h,i)
In0.73Ga0.27As0.6P0.4   3.25-3.82 (i,j)   1.48
In0.6Ga0.4As0.85P0.15   3.75   1.22
In1-xGaxAsyP1-y   4.w.3y (i)   1

a The values of ∂$E_g$/∂T are not independent of temperature. Using the results of Thurmond [1975] for GaAs, we obtain ∂$E_g$/∂T = −5.405×10⁻⁴ T(T + 408)/(T + 204)² eV/°C.
b Bandgap wavelength $\lambda_g$ = 1.3 µm.
c Bandgap wavelength $\lambda_g$ = 1.55 µm.
d Lattice-matched to InP, with x = 0.4527/(1 − 0.0311y).
e Swaminathan and Macrander [1991], p. 16.
f Adachi [1985].
g Yan and Coldren [1990].
h Lautenschlager, Garriga and Cardona [1987].
i Adachi [1992].
j Dutta and Nelson [1982].
k Extracted from theoretical curves reported by Stern [1973].
l Extracted from theoretical curves reported by Dutta and Nelson [1982].
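The temperature derivative quoted in the table footnote for GaAs can be checked against the bandgap expression it derives from. The sketch below assumes the commonly quoted Thurmond [1975] form $E_g(T) = 1.519 - 5.405\times10^{-4}\,T^2/(T + 204)$ eV, with T in kelvin; the zero-temperature value 1.519 eV is not stated in the text above and is an assumption of this example.

```python
def gaas_bandgap_eV(T):
    """GaAs bandgap in eV at temperature T (kelvin), Varshni-type fit.

    The 1.519 eV zero-temperature gap is an assumed, commonly quoted value.
    """
    return 1.519 - 5.405e-4 * T**2 / (T + 204.0)

def gaas_dEg_dT(T):
    """Analytic derivative dE_g/dT in eV/K, as quoted in the table footnote."""
    return -5.405e-4 * T * (T + 408.0) / (T + 204.0) ** 2

T = 300.0
h = 0.01
numeric = (gaas_bandgap_eV(T + h) - gaas_bandgap_eV(T - h)) / (2 * h)
print(f"E_g(300 K)    = {gaas_bandgap_eV(T):.4f} eV")
print(f"dE_g/dT       = {gaas_dEg_dT(T):.3e} eV/K (analytic)")
print(f"dE_g/dT       = {numeric:.3e} eV/K (finite difference)")
```

At room temperature the slope is about −0.45 meV/K, and its magnitude grows with temperature, which is why the footnote stresses that ∂$E_g$/∂T is not a constant.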
where $E_{gB}$ is the local bandgap energy in the cladding (barrier) region (dependent on temperature), $\Delta F$ is the local separation of quasi-Fermi levels in the active region (dependent on both carrier concentration and temperature), $k_B$ is the Boltzmann constant, and $T$ is the local active-region temperature. For index-guided bottom-emitting mesa lasers, deeply etched just through the active region, the parameter $j_0$ was fitted to 8×10³ kA/cm² (Scott, Corzine, Young and Coldren [1993]). Another implicit assumption is that the laser operates at the gain-peak wavelength, and that the lasing wavelength follows the gain-peak shifts with temperature. This is true only when the spacing between the longitudinal modes in the lasing cavity is small, as in conventional Fabry-Pérot lasers. If the spacing is large or if some additional frequency-selective elements are used, the gain-peak wavelength will not coincide with the lasing wavelength. This situation occurs in distributed-feedback (DFB) and distributed-Bragg-reflector (DBR) EELs, and is characteristic of all VCSELs. Depending then on the sign of the
EFFECTS OF TEMPERATURE ON VCSEL OPERATION
initial detuning from the gain peak at room temperature, the lasing wavelength can approach the gain peak or depart from it. In the former case, which takes place when the room-temperature lasing wavelength is offset from the gain peak towards the longer wavelengths, the increase in threshold current described in the first paragraph is partially compensated by the simultaneous shift of the lasing wavelength towards the gain peak. Conversely, if the room-temperature lasing wavelength is on the short-wavelength side of the gain peak, the temperature variation of the threshold current will be accelerated.

According to Chow, Corzine, Young and Coldren [1995], many-body Coulomb interactions between carriers are not canceled completely by plasma screening. This leads to a decrease in the wavelength dependence of the threshold carrier concentration and, consequently, a greater tolerance to changes in the resonance/gain overlap with temperature on the low-temperature side of the threshold minimum.

It is clear that the complex interplay between all these mechanisms can result in a variety of different patterns of threshold current evolution with temperature. Yet, it is common practice to describe the temperature dependence of the threshold current using the Arrhenius-type relation,

$$I_{th}(T) = I_{th}(300\ \mathrm{K})\,\exp\!\left(\frac{T - 300}{T_0}\right), \tag{16}$$
with T in kelvin and the characteristic temperature T0 used as a measure of the temperature sensitivity of the threshold current. With T0 constant, eq. (16) usually approximates the actual threshold variation within a certain temperature interval. More generally, T0 is itself a function of temperature, and for an arbitrary Ith(T) dependence it can be defined simply as

$$T_0(T) = \left[\frac{d \ln I_{th}(T)}{dT}\right]^{-1} = \frac{I_{th}(T)}{dI_{th}(T)/dT}. \tag{17}$$
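As a minimal illustration of how the exponential threshold model of eq. (16) is used in practice, the sketch below evaluates Ith(T) for a constant T0 and recovers T0 from two threshold measurements. The function names and the numerical values (5 mA at 300 K, T0 = 130 K) are hypothetical and chosen only for the example.

```python
import math


def threshold_current(t_kelvin: float, i_th_300: float, t0: float) -> float:
    """Threshold current from the exponential model with a 300 K reference."""
    return i_th_300 * math.exp((t_kelvin - 300.0) / t0)


def t0_from_two_points(t1: float, i1: float, t2: float, i2: float) -> float:
    """Characteristic temperature implied by two (temperature, threshold) pairs.

    Follows from taking the logarithm of the ratio of the model at t2 and t1.
    A falling threshold (i2 < i1 for t2 > t1) gives a negative T0.
    """
    return (t2 - t1) / math.log(i2 / i1)


if __name__ == "__main__":
    # Hypothetical device: 5 mA threshold at 300 K, T0 = 130 K.
    i_350 = threshold_current(350.0, 5.0, 130.0)
    print("Ith(350 K) [mA]:", i_350)
    print("recovered T0 [K]:", t0_from_two_points(300.0, 5.0, 350.0, i_350))
```

A two-point estimate of this kind is only meaningful over the interval in which T0 is approximately constant; for VCSELs that interval can be narrow.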
In spite of its limited applicability to VCSELs, eq. (16) is the only analytical form in which the temperature dependence of the VCSEL threshold current has been reported. Not surprisingly, measured values of T0 for VCSELs are scattered widely and are sensitive to device structure (see tables 6a,b). They range from very high (practically infinite) values in devices where the threshold current remains practically constant within a certain temperature range (cf. Geels, Thibeault, Corzine, Scott and Coldren [1993]), through moderate values of 130-150 K (see, e.g., Uchiyama, Ohmae, Shimizu and Iga [1986] and Iga, Koyama and
Table 6a
Characteristic temperature T0 for pulsed operation

T0 (K)  Range (K)  Size (µm)  Structure                 Reference
200     100-220    20-30 ∅    DMEWL-TIT-A               Uchiyama, Ohmae, Shimizu and Iga [1986]
70      220-263    20-30 ∅    DMEWL-TIT-A               Uchiyama, Ohmae, Shimizu and Iga [1986]
210     283-363    15 ∅       PITSEL-ATT-A              Hasnain, Tai, Yang, Wang, Fischer, Wynn, Weir, Dutta and Cho [1991]
47.5    223-253    8×8        UMEWL-UCSB                Wada, Babic, Crawford, Reynolds, Dudley, Bowers, Hu, Merz, Miller, Koren and Young [1991]
26.8    253-339    8×8        UMEWL-UCSB                Wada, Babic, Crawford, Reynolds, Dudley, Bowers, Hu, Merz, Miller, Koren and Young [1991]
24      203-253    20×20      wafer-fused HMUML-UCSB    Wada, Babic, Ishikawa and Bowers [1992]
47      203-298    8×8        wafer-fused HMUML-UCSB    Wada, Babic, Ishikawa and Bowers [1992]
67      200-300    11 ∅       wafer-fused HMUML-UCSB    Dudley, Babic, Mirin, Yang, Miller, Ram, Reynolds, Hu and Bowers [1994]
Kinoshita [1988]), to negative values in devices that are detuned towards the longer wavelengths (cf. fig. 8). Thus, in stark contrast to Fabry-Pérot-type EELs, the T0 parameter in VCSELs becomes more of a design parameter (Tell, Brown-Goebeler, Leibenguth, Baez and Lee [1992]) than a material- or structure-related characteristic.

Since an arbitrary temperature sensitivity of the VCSEL threshold current can be obtained in principle, this opens up the possibility of designing temperature-insensitive VCSELs, with infinitely large T0. Some interesting examples of such constructions have been demonstrated not only for ambient room temperatures (Young, Scott, F.H. Peters, Thibeault, Corzine, M.G. Peters, Lee and Coldren [1993], Kajita, Kawakami, Nido, Kimura, Yoshikawa, Kurihara, Sugimoto and Kasahara [1995]), but also for cryogenic conditions (Lu, Luo, Hains, Cheng, Schneider, Choquette, Lear, Kilcoyne and Zolper [1995], Ortiz, Hains, Lu, Sun, Cheng and Zolper [1996], Goncher, Lu, Luo, Cheng, Hersee, Sun, Schneider and Zolper [1996]), and elevated temperatures (Dudley, Ishikawa, Babic, Miller, Mirin, Jiang, Bowers and Hu [1993], Catchmark, Morgan, Kojima, Leibenguth,
Table 6b
Characteristic temperature T0 for CW operation

T0 (K)  Range (K)  Size (µm)  Structure                   Reference
115     288-323    15 ∅       SMEWL-ATT                   Tai, Fischer, Seabury, Olsson, Huo, Ota and Cho [1989]
120     250-300    10 ∅       DMEWL-TIT-B                 Extracted from data reported by Koyama, Kinoshita and Iga [1989]
210     293-363    15 ∅       PITSEL-ATT-A                Hasnain, Tai, Dutta, Wang, Wynn, Weir and Cho [1991]
130     283-323    15 ∅       PITSEL-ATT-A                Hasnain, Tai, Yang, Wang, Fischer, Wynn, Weir, Dutta and Cho [1991]
330     213-298    10 ∅       PITSEL-ATT-B                Tu, Wang, Schubert, Weir, Zydzik and Cho [1991]
80      328-348    10 ∅       PITSEL-ATT-B                Tu, Wang, Schubert, Weir, Zydzik and Cho [1991]
40      283-353    20×20      PIBEL-ATT                   Von Lehmen, Banwell, Carrion, Stoffel, Florez and Harbison [1992]
60 a    328-393    10×10      shallow-etched BEML-UCSB    Geels, Thibeault, Corzine, Scott and Coldren [1993]
156     293-318    10 ∅       PITSEL-SNL                  Schneider, Choquette, Lott, Lear, Figiel and Malloy [1994]

a Determined using the active-region temperature.
Asom, Guth, Focht, Luther, Przybylek, Mullay and Christodoulides [1993], Shoji, Otsubo, Matsuda and Ishikawa [1994], Lu, Zhou, Cheng, Malloy and Zolper [1994], Morgan, Hibbs-Brenner, Marta, Walterson, Bounnak, Kalweit and Lehman [1995], Ohiso, Tateno, Kohama, Wakatsuki, Tsunetsugu and Kurokawa [1996]).

It should be emphasized that the temperature T usually used in eq. (17) in the experimental determination of T0 is the ambient (stage or heat-sink) temperature (cf. § 4.2). Under low-duty-cycle pulsed conditions, it coincides with the active-region temperature. The pulsed and CW values of T0 should be very similar if the active-region temperature is used instead of the ambient temperature, except for weakly guiding or weakly antiguiding VCSEL structures (cf. § 3.3), in which lateral nonuniformity of CW temperature profiles plays an important role.

It should be noted that the temperature sensitivity of the threshold current depends on the size of the active region. Larger devices usually exhibit lower values of the characteristic temperature T0, which results from poorer overlap
Fig. 6. Temperature effects on the gain-peak wavelength λmax and the vertical-cavity mode wavelength λDBR in a PITSEL. The λmax = λDBR point corresponds to the minimum threshold current. T0 is negative in the region where λDBR is offset towards longer wavelengths relative to λmax. After Tell, Brown-Goebeler, Leibenguth, Baez and Lee [1992].
between the gain and photon density profiles (Wada, Babic, Ishikawa and Bowers [1992]) as well as from worsening thermal properties, with increasingly one-dimensional heat flow.

The shift of the gain spectrum in VCSEL structures can be determined experimentally by fabricating Fabry-Pérot EELs from VCSEL wafers and measuring the lasing wavelength shift with temperature. Typical measured values of dλmax/dT for GaAs/AlGaAs VCSELs are (3.2-3.4) Å/°C (e.g., Geels, Thibeault, Corzine, Scott and Coldren [1993], Tell, Brown-Goebeler, Leibenguth, Baez and Lee [1992], Scott, Corzine, Young and Coldren [1993]), which is greater than the analogous value of ≈2.5 Å/°C in conventional EELs. The accelerated shift of the gain-peak wavelength is probably caused by heating associated with the higher series resistance of multilayer EELs incorporating horizontal Bragg reflectors and by reabsorption of amplified spontaneous emission in the vertical direction, enhanced by high-reflectivity horizontal Bragg
Fig. 7. Temperature dependence of the CW lasing threshold currents for three 16 µm PITSELs with different lasing mode positions relative to a common gain peak (847 nm at 300 K). The minimum threshold current occurs close to the temperature where the gain peak and lasing mode wavelengths coincide (after Lu, Zhou, Cheng, Malloy and Zolper [1994]). Horizontal axis: substrate temperature (K).
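The alignment condition mentioned in the caption can be estimated with a simple linear model, sketched below under stated assumptions: both the gain peak and the cavity mode are taken to shift linearly with temperature, the gain-peak rate is set to 0.33 nm/K (the middle of the 3.2-3.4 Å/°C range quoted in the text), and the mode rate is assumed to be 4.5 times smaller. The function name and these specific rates are illustrative, not values fitted to the figure.

```python
def alignment_temperature(t_ref: float,
                          detuning_nm: float,
                          gain_shift_nm_per_k: float,
                          mode_shift_nm_per_k: float) -> float:
    """Temperature at which the gain peak catches up with the cavity mode.

    detuning_nm is (mode wavelength - gain-peak wavelength) at t_ref;
    it is positive when the mode sits on the long-wavelength side.
    Both wavelengths are assumed to shift linearly with temperature.
    """
    relative_rate = gain_shift_nm_per_k - mode_shift_nm_per_k
    if relative_rate <= 0.0:
        raise ValueError("gain peak must shift faster than the cavity mode")
    return t_ref + detuning_nm / relative_rate


if __name__ == "__main__":
    gain_rate = 0.33             # nm/K, assumed mid-range value
    mode_rate = gain_rate / 4.5  # nm/K, assumed 4.5 times slower
    # Mode at 865 nm and gain peak at 847 nm at 300 K: detuning of 18 nm.
    print(alignment_temperature(300.0, 865.0 - 847.0, gain_rate, mode_rate))
```

With these assumed rates the 18 nm detuning gives an alignment temperature of roughly 370 K. The estimate is only indicative, since the shift rates are not constant over a wide temperature range and the threshold minimum also depends on how the losses change with temperature.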
mirrors. On the other hand, as illustrated in fig. 6, the observed mode wavelength shift (cf. § 3.1) is 4-5 times slower than the gain-peak wavelength shift.

For each VCSEL design there exists an optimal temperature at which the gain spectrum and the vertical-cavity resonant mode are ideally aligned. This usually coincides with the condition for minimum Ith(T), provided that optical and electrical losses do not change drastically around this temperature. Figure 7 presents typical Ith(T) curves for three PITSELs with different lasing mode positions at 300 K (Lu, Zhou, Cheng, Malloy and Zolper [1994]). The larger the room-temperature detuning of λDBR towards the longer wavelengths, the higher the temperature at which the threshold current reaches its minimum.

Figure 8 shows the temperature dependence of T0 extracted from the data of fig. 7 using eq. (17). It is clear that T0 can be considered constant only over a very limited range of temperatures, away from the vertical asymptote. The asymptotes in fig. 8 correspond to the minima of the Ith(T) curves in fig. 7.

The temperature dependence of the threshold current in VCSELs, with a minimum occurring near the temperature at which λDBR and λmax are aligned, resembles that of frequency-selective EELs, such as DFB or DBR lasers. In edge-emitting DFB lasers, however, Ith(T) characteristics may be more complicated,
Fig. 8. Temperature dependence of the characteristic temperature T0 extracted from the data of fig. 7. Horizontal axis: heat-sink temperature THS [K].
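The behaviour of T0 near a threshold minimum can be reproduced with a small numerical sketch. It assumes the local definition T0 = Ith/(dIth/dT) and a hypothetical parabolic Ith(T) curve with its minimum at 350 K; neither the curve nor its parameters are taken from the measured data.

```python
def local_t0(temps, currents):
    """Local characteristic temperature at interior grid points.

    Uses a central difference for dIth/dT and returns a list of
    (temperature, T0) pairs. Where the slope vanishes, T0 is reported
    as infinity, which corresponds to a vertical asymptote.
    """
    result = []
    for k in range(1, len(temps) - 1):
        slope = (currents[k + 1] - currents[k - 1]) / (temps[k + 1] - temps[k - 1])
        t0 = float("inf") if slope == 0.0 else currents[k] / slope
        result.append((temps[k], t0))
    return result


if __name__ == "__main__":
    # Hypothetical threshold curve (mA) with a minimum of 3 mA at 350 K.
    temps = [300.0 + 10.0 * k for k in range(11)]
    currents = [3.0 + 0.001 * (t - 350.0) ** 2 for t in temps]
    for t, t0 in local_t0(temps, currents):
        print(t, t0)
```

Below the minimum the threshold falls with temperature and T0 is negative; above it T0 is positive; at the minimum itself T0 diverges. No single constant T0 describes such a device over the whole range.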
with multiple minima corresponding to various transverse modes (cf. Aiki, Nakamura and Umeda [1976]).

3.3. TEMPERATURE DEPENDENCE OF TRANSVERSE-MODE PROPERTIES
Compared to EELs, transverse-mode properties of VCSELs are considerably more complex. Transverse modes in VCSELs are determined by an intricate interplay of gain profile, absorption, diffraction, reflection, spatial filtering, built-in index waveguiding, and temperature distribution (Scott, Young, Thibeault, Peters and Coldren [1995], Michalzik and Ebeling [1995]). Depending on the particular VCSEL structure, some of these effects can be more important than others, but rarely can a single mechanism be identified as the dominant one. In addition, VCSEL cavities can usually support many transverse modes (Valle, Sarma and Shore [1995a]), especially in large-diameter (≥ 20 µm) devices or in strongly index-guided structures. Hence, mode competition and single-transverse-mode control are important problems, even though the device operates in a single longitudinal mode.

The difficulties with transverse mode control are best illustrated by the small value (4.4 mW) of the fundamental-mode CW output power achieved so far (Lear, Schneider, Choquette, Kilcoyne, Figiel and Zolper [1994]). Coupling to an
external cavity, which is a proven technique for increasing the single-mode output power in EELs, has until now resulted in single-mode CW powers of only 2.4 mW (Hadley, Wilson, Lau and Smith [1993], Wilson, Hadley, Smith and Lau [1993]). Spatial filtering with the aid of a graded-index lens has recently extended this value to only 4.5 mW (Koch, Leger, Gopinath, Wang and Morgan [1997]). This contrasts with significantly larger single-mode powers of 36-100 mW in the same external-cavity VCSELs injected with 100 ns pulses at 0.1% duty cycle (Hadley, Wilson, Lau and Smith [1993], Wilson, Hadley, Smith and Lau [1993]). The large difference between the pulsed and CW results clearly illustrates the strong effect of heating on the transverse mode structure.

One of the distinct features of VCSELs is that their threshold currents for pulsed and CW operation are often very similar, and that the CW threshold can sometimes be even lower than the pulsed one (e.g., Hasnain, Tai, Yang, Wang, Fischer, Wynn, Weir, Dutta and Cho [1991]). This is caused by the so-called thermal lensing effect, which focuses the radiation in regions of higher temperature because of the temperature-dependent refractive index.

Thermal lensing can strongly influence the transverse mode structure in so-called gain-guided (or carrier-guided) diode lasers without a built-in waveguide in the p-n junction plane, where lateral confinement of the optical field occurs via a combination of gain guiding and index antiguiding (e.g., Nash [1973], Cook and Nash [1975], Thompson [1980] (chapter 6.4.1), Hadley, Hohimer and Owyoung [1987], Cherng and Osinski [1991]).
A similar situation takes place in VCSELs with no built-in lateral waveguide, for example in PITSELs (e.g., Hasnain, Tai, Yang, Wang, Fischer, Wynn, Weir, Dutta and Cho [1991], Chang-Hasnain, Harbison, Hasnain, Von Lehmen, Florez and Stoffel [1991], Zeeb, Moller, Reiner, Ries, Hackbarth and Ebeling [1995]), and, partially, in TBEMLs (Michalzik and Ebeling [1993]). In low-duty-cycle short-pulse operation, when thermal effects are negligible, the confinement of the optical field in the radial direction occurs via a combination of gain guiding, carrier antiguiding, absorption, diffraction (Babic, Chung, Dagli and Bowers [1993], Jansen van Doorn, van Exter and Woerdman [1995]), and spatial filtering at the top contact. Carrier antiguiding tends to defocus the optical field, which leads to large diffraction losses (Hasnain, Tai, Yang, Wang, Fischer, Wynn, Weir, Dutta and Cho [1991], Dutta, Tu, Hasnain, Zydzik, Wang and Cho [1991]).

Under CW conditions, the active-region heating results in a nonuniform, bell-shaped temperature distribution which peaks in the active region and falls off in the radial direction (cf. fig. 16 in § 5.1.1). Since (dnR/dT)/n is positive (see § 3.1), the thermal contribution to the refractive index also peaks in the active region, causing the thermal lensing effect. Nonuniformity of the temperature distribution
becomes more pronounced with increasing pumping current, to the point where real-index guiding associated with the temperature profile may become dominant, resulting in tighter focusing of the optical field. Experimental observations of narrowing near-field patterns of the bell-shaped fundamental transverse mode with increasing current in PITSELs (Chang-Hasnain, Harbison, Florez and Stoffel [1991], Chang-Hasnain, Harbison, Hasnain, Von Lehmen, Florez and Stoffel [1991]) have been confirmed by the calculations of Michalzik and Ebeling [1993]. Thermally-induced waveguiding improves the overlap between the optical field and the gain region and reduces the diffraction loss.

In the intermediate regime of relatively long pulses (over 100 ns long), the build-up of the thermal waveguide sometimes leads to anomalously long time delays in lasing. When the pulse amplitude is only slightly larger than the CW threshold current, the time delay before the onset of lasing can be as long as several µs (Hasnain, Tai, Yang, Wang, Fischer, Wynn, Weir, Dutta and Cho [1991]). The time delay rapidly decreases with increasing current and reaches the "normal" level of 25 ns when the pumping current amplitude exceeds the pulsed threshold value.

A quantitative analysis of thermal lensing and its effects on the time delay in PITSELs is given by Dutta, Tu, Hasnain, Zydzik, Wang and Cho [1991]. At the beginning of a low-amplitude pulse, threshold losses are higher than the modal gain, and therefore lasing action cannot start. As the device starts to heat up, thermal lensing begins to play an increasingly important role, steadily reducing diffraction losses. The observed time delay is simply equal to the time necessary to create a sufficiently strong thermally-induced waveguide. A similar phenomenon has been observed by Prince, Patel, Kasemset and Hong [1983] in carrier-guided stripe-geometry EELs and was explained in terms of thermally-controlled dynamic evolution of waveguide properties.
While a thermally-induced waveguide is beneficial from the point of view of lowering the CW threshold current, it can at the same time facilitate excitation of higher-order transverse modes. At higher currents, a stronger real-index thermal waveguide supports a larger number of high-order modes which can then compete with the fundamental mode. Therefore, the dynamic switch-on response of VCSELs sometimes initially shows a single-lobe profile (the fundamental transverse mode) and, after the time necessary for the thermal waveguide to build up (dependent on pumping conditions), transforms into a double-lobe profile (the first-order transverse mode) (Yu and Lo [1996], Buccafusca, Chilla, Rocca, Feld, Wilmsen, Morozov and Leibenguth [1996]).

Once the thermal waveguide is established, the main mode competition mechanism switches to spatial hole burning (e.g., Vakhshoori, Wynn, Zydzik, Leibenguth, Asom,
Kojima and Morgan [1993], Scott, Geels, Corzine and Coldren [1993], Scott, Young, Thibeault, Peters and Coldren [1995], Valle, Sarma and Shore [1995b], Law and Agrawal [1997]). The fundamental transverse mode is localized in the central part of the active region; therefore, the stimulated recombination associated with this mode takes place primarily in this area. This depresses the local carrier density and the gain in the central part of the laser cavity, reducing the modal gain for the fundamental mode, while allowing the carriers to build up near the edges of the active region and increasing the modal gain of higher-order doughnut-shaped transverse modes. Eventually, the laser ends up operating in multiple transverse modes.

In VCSELs with no built-in lateral waveguide or with weak index guiding, spatial hole burning can cause a positive-feedback phenomenon known as self-focusing, which further reinforces the real-index thermal waveguide (see Wilson, Kuchta, Walker and Smith [1994]). A depression in the carrier concentration produces a local increase in the refractive index, which can further intensify the stimulated emission, locally reducing the carrier concentration, and so on. A similar effect can also arise from thermal lensing via absorption of the emitted light within the core of the thermal waveguide.

A depression in carrier density, similar to that caused by spatial hole burning, can also be caused by nonuniformity of current injection in devices with annular contacts (see Osinski, Nakwaski and Varangis [1994]). The two effects can be distinguished by observing the spontaneous emission profile, which is proportional to the carrier density distribution. Nonuniformity due to current spreading should also manifest itself below the lasing threshold, while spatial hole burning can occur only above threshold.
The only experiments reported so far that involve measurements of the spontaneous emission profile above threshold (Vakhshoori, Wynn, Zydzik, Leibenguth, Asom, Kojima and Morgan [1993], Wilson, Kuchta, Walker and Smith [1994]) indicate that at moderate currents the spontaneous emission profile has a doughnut shape. Further above threshold, when higher-order transverse modes become excited, the carrier density profile is sensitive to details of the laser structure. For example, smooth profiles were observed by Wilson, Kuchta, Walker and Smith [1994] in their bottom-emitting VCSELs with circular top contacts, indicating that spatial hole burning was the dominant effect. In contrast, Vakhshoori, Wynn, Zydzik, Leibenguth, Asom, Kojima and Morgan [1993] observed a dark spot near the center of the spontaneous emission profile even high above threshold, which suggests that nonuniform injection was the main effect in their top-emitting devices with annular contacts.

Nonuniform current injection, with current crowding near the edges of the
Fig. 9. Current density profiles in the p-n junction plane for a 16 µm etched-well GaAs/AlGaAs VCSEL (structure DMEWL-TIT-B) with parameters given by Nakwaski and Osinski [1993].
active region in VCSELs with annular contacts (e.g., Nakwaski and Osinski [1991b], Nakwaski, Osinski and Cheng [1992], Wada, Babic, Ishikawa and Bowers [1992], Scott, Geels, Corzine and Coldren [1993]), also favors excitation of higher-order transverse modes. To some extent, the nonuniform injection is counterbalanced by the ambipolar radial diffusion of carriers prior to their recombination (Sarzala and Nakwaski [1997]), which makes the local gain distribution more uniform than the current-density distribution (e.g., Wada, Babic, Ishikawa and Bowers [1992], Chong and Sarma [1993], Sarzala, Nakwaski and Osinski [1995]). Nevertheless, the gain profile still has an on-axis minimum and is better matched to the higher-order transverse modes than to the fundamental one. This effect is usually not strong enough to suppress the fundamental mode near threshold, but gains in importance with increasing pumping level, as the current crowding becomes more and more intense (see fig. 9). The better overlap of the gain profile with the optical field of the higher-order modes may then become sufficient to overcome the higher diffraction loss suffered by these modes.

Nonuniformity of the current density in devices with annular contacts can be largely leveled out if the heterointerfaces between the alternating layers of
Bragg mirrors are not graded (Michalzik and Ebeling [1993]). This, however, increases the series resistance (the specific heteroresistance between p-GaAs and p-AlAs layers can be as high as 2.5 × … Ω cm^2) and results in more intense Joule heating.

Built-in index antiguiding can be used as a mechanism for extending the single-transverse-mode operation range, since the higher-order modes suffer a higher diffraction loss penalty than the fundamental mode (e.g., Chang-Hasnain, Wu, Li, Hasnain, Choquette, Caneau and Florez [1993], Wu, Chang-Hasnain and Nabiev [1994], Wu, Li, Nabiev, Choquette, Caneau and Chang-Hasnain [1995], Yoo, Chu, Park, Park and Lee [1996]). The negative index step between the equivalent index of the DBR reflector and the surrounding high-index medium can be made as large as 0.18 (Wu, Chang-Hasnain and Nabiev [1993]); hence, the antiguide cannot be affected significantly by the much smaller (by one to two orders of magnitude) positive index step due to the radial temperature profile. So far, however, this approach has had only limited success. While the near-field intensity profiles in bottom-emitting passive-antiguide-region InGaAs/AlGaAs VCSELs show no symptoms of thermal lensing, spatial hole burning or self-focusing, the maximum single-transverse-mode power is still limited to only 1.2 mW (Wu, Chang-Hasnain and Nabiev [1993], Wu, Li, Nabiev, Choquette, Caneau and Chang-Hasnain [1995]).

Introducing higher doping at the active-region perimeter to increase free-carrier losses and using low-reflectivity ring contacts on the top VCSEL reflector were other mode selection methods postulated by Morgan, Guth, Focht, Asom, Kojima, Rogers and Callis [1993].
Another class of temperature-insensitive waveguide involves strong index guiding (Jewell, Scherer, McCall, Lee, Walker, Harbison and Florez [1989], Geels, Corzine, Scott, Young and Coldren [1990], Geels and Coldren [1990, 1991], Shimizu, Babic, Dudley, Jiang and Bowers [1993], Dudley, Babic, Mirin, Yang, Miller, Ram, Reynolds, Hu and Bowers [1994], Young, Kapila, Scott, Malhotra and Coldren [1994], Yoffe, van der Vleuten, Leys, Karouta and Wolter [1994], Yoo, Park and Lee [1994]). Compared to strongly antiguiding VCSELs, index-guiding structures have the serious disadvantage of lowering the threshold of higher-order-mode excitation (Chang-Hasnain, Orenstein, Von Lehmen, Florez, Harbison and Stoffel [1990], Schroder, Grothe and Harth [1996]). Consequently, fundamental-transverse-mode operation can be maintained only over a very limited current range near threshold.

3.4. TEMPERATURE DEPENDENCE OF THE OUTPUT POWER
Because of the thermal lensing effect (see §3.3), the threshold current for
the CW operation in PITSELs is often distinctly lower than for the pulsed one. The external differential quantum efficiency, which is the laser parameter proportional to the slope of the light-current characteristic above the threshold current, is, however, much higher for pulsed operation (Hasnain, Tai, Yang, Wang, Fischer, Wynn, Weir, Dutta and Cho [1991]). Similarly, the maximum available output power and the operating current range are enhanced under pulsed conditions.

From fig. 7, we may conclude that in order to obtain efficient CW high-power operation of VCSELs at room temperature, their cavity-mode positions at this temperature should be on the long-wavelength side of the gain spectrum. Although such lasers may have higher threshold currents for pulsed operation than those with aligned cavity-mode and gain-peak positions, their CW threshold currents will be lower because of the active-region heating (provided the cavity-mode and gain-peak wavelengths are matched at the active-region temperature). However, since the active-region temperature depends on the driving current, the conditions for minimum threshold current will in general be different from the conditions for maximum output power.

This is illustrated in fig. 10, showing the temperature dependence of light-current (L-I) characteristics of a PITSEL device with a room-temperature detuning of the cavity mode by 18 nm towards the longer wavelengths. The CW lasing threshold for this device, shown also in fig. 7, has a minimum at 350 K. All L-I characteristics display a typical thermal roll-off behavior, indicating that over the wide temperature range of 90 K-400 K, the output power Pout is thermally limited. The maximum output power is determined primarily by the temperature variation of the peak gain (see table 5) and by changes in the external differential quantum efficiency ηd. The latter can be extracted from fig. 10 using the following formula (Agrawal and Dutta [1993]):

$$\eta_d(T) = \frac{e\,\lambda(T)}{h c}\,\frac{dP_{out}}{dI}, \tag{18}$$
where e is the electron charge, h is Planck's constant, and c is the speed of light. Equation (18) implies that all output power from a top-emitting VCSEL is collected through the top mirror. Figure 11 shows the temperature dependence of ηd, calculated by applying eq. (18) to the device of fig. 10. The rising part of the L-I curves, not too far above threshold, is used to determine ηd. λ(T) is obtained from the data of fig. 5, taking the CW lasing wavelength for the "865 nm mode" and extrapolating down to 90 K. In any case, the wavelength variation represents only a very small correction to ηd determined from the slope efficiency dPout/dI with a constant
Fig. 10. Temperature evolution of the light-current characteristics for a 16 µm PITSEL shown in fig. 7 as having minimum CW threshold at 350 K (mode wavelength 865 nm at 300 K) (after Lu, Zhou, Cheng and Malloy [1994]). Horizontal axis: current I.
Fig. 11. Temperature dependence of the differential quantum efficiency ηd for the device in fig. 10, using either the stage temperature THS (dotted line) or the active-region temperature TA (solid line) as the argument in ηd(T). Horizontal axis: temperature T [K].
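The conversion from a measured L-I slope to a differential quantum efficiency can be sketched as follows. The sketch assumes the standard relation ηd = (eλ/hc)·dPout/dI with all power collected through one mirror; the slope efficiency of 0.5 W/A is a hypothetical value used only for illustration, and the function name is not from the source.

```python
# Physical constants (SI, exact values of the 2019 SI definition).
E_CHARGE = 1.602176634e-19   # C
PLANCK = 6.62607015e-34      # J s
LIGHT_SPEED = 2.99792458e8   # m/s


def differential_quantum_efficiency(slope_w_per_a: float, wavelength_m: float) -> float:
    """External differential quantum efficiency from the L-I slope.

    slope_w_per_a is dPout/dI in W/A; wavelength_m is the lasing
    wavelength in metres. The result is photons out per electron in.
    """
    return E_CHARGE * wavelength_m / (PLANCK * LIGHT_SPEED) * slope_w_per_a


if __name__ == "__main__":
    slope = 0.5  # W/A, hypothetical slope efficiency
    for wavelength_nm in (850.0, 865.0):
        eta = differential_quantum_efficiency(slope, wavelength_nm * 1e-9)
        print(wavelength_nm, eta)
```

Comparing the two wavelengths shows why the wavelength variation is only a small correction: a 15 nm change near 860 nm alters ηd by less than 2% at a fixed slope efficiency.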
wavelength. The logarithmic scale in fig. 11 is chosen to verify whether an exponential formula analogous to eq. (16),
with a constant characteristic temperature Tη, would hold for VCSELs, as it does for EELs (e.g., Papannareddy, Ferguson and Butler [1987]). Note that sometimes a simpler approximation is used (Wipiejewski, Peters, Thibeault, Young and Coldren [1996]): ηd(T) = ηmax(1 - ΔT/Tmax), where Tmax is the characteristic roll-over temperature of the laser.

The two curves in fig. 11 correspond to results obtained using either the stage temperature THS (dotted line) or the active-region temperature TA (solid line) as the argument in ηd(T). TA is estimated using the wavelength shift between the CW and pulsed operation shown in fig. 5, again extrapolating down to 90 K. Under pulsed conditions, the "865 nm mode" device has two clear regimes of linear wavelength shift, characterized by dλ/dT = 0.41 Å/°C for 90 < THS < 300 K, and dλ/dT = 0.59 Å/°C for 300 < THS …

… jth, (34)

where ηi is the internal quantum efficiency of the stimulated emission. Whenever intense heating takes place, it is important to remember that the threshold current density jth is not a constant device parameter, but is itself temperature dependent. As the pumping current density j increases, so does the active-region temperature, and therefore jth is also current dependent. To emphasize this, Scott, Geels, Corzine and Coldren [1993] introduced the concept of a current-dependent effective threshold current density jth,e ≡ jth(j). Alternatively, we could write jth = jth(TA), where TA is the average active-region temperature.

For high reliability, the quality of semiconductor laser materials must be very good. Consequently, in most cases the internal quantum efficiency for stimulated emission ηi is very close to unity (Petermann [1991]). Thus, eq. (34) reduces to:
FUNDAMENTALS OF THERMAL MODELING OF VCSELs
In the case of proton-bombarded VCSELs, e.g., PITSELs, the part of the spontaneous radiation that leaves the active region is absorbed mainly in its closest vicinity, i.e., in the surrounding highly absorbing areas (with a high absorption coefficient α) that were exposed to a stream of protons during fabrication. The thicknesses (= α^-1) of these new heat sources are very small. Therefore, it is quite a good approximation to assume that these absorption events also take place inside the active region. Then, the radiative transfer coefficient fT should be set equal to zero in all the above expressions.

Saturation of the voltage drop Upn(r) at the p-n junction above the lasing threshold (e.g., Sommers [1971], Paoli [1973]) should also be taken into account. This does not simply mean that Upn(r) is taken as a constant distribution for all currents above threshold, because an increase in pumping is followed by an increase in the active-region temperature, which results in an increase in the threshold current. Therefore, for a given value of the pumping current, the saturated profile of the voltage drop at the p-n junction should correspond to the actual active-region temperature increase.

In laser structures where diffusion of minority carriers within the active region before their recombination (radiative or nonradiative) plays an important role, i.e., in lasers without radial carrier confinement mechanisms, it is more justified to associate the above heat generation with the carrier concentration distribution rather than with the current density profile. Each act of nonradiative recombination is followed by heat generation of energy equal to about hν, where h is the Planck constant and ν is the laser radiation frequency. Generally, especially in lasers with quantum-well active regions, this energy may differ from the energy eUpn, where e is the unit charge.
Then this heat generation consists of two processes, carrier thermalization and carrier recombination, whose sum must give the supply energy eUpn. Even if they are separated in space, they both occur inside or very close to the active region. Therefore we may neglect their separation. Equation (35) will then be modified to the following form:
where PA stands for the total effective threshold power generated (mainly nonradiatively) inside the active region, defined as
THERMAL PROPERTIES OF VCSELs
and $N_{A,e}$ is the total number of carriers composing the effective threshold within the active region:
with $r_S$ the structure radius and $n_{th,e}$ the threshold effective carrier concentration (associated with $j_{th,e}$). In the above, we assume that all the heat generation inside the active region is distributed uniformly over the $N_{A,e}$ recombining carriers.

4.3.2. Absorption of laser radiation

Absorption of laser radiation is associated with generation of heat of a volume density $g_{abs}$:
where $\alpha$ is the absorption coefficient (different in various layers) for the laser radiation and $p_{int}$ is its internal density inside the resonator:
Note that, following the suggestion of Petermann [1991], the internal quantum efficiency for stimulated emission is taken equal to unity in the above equations.

4.3.3. Absorption of spontaneous radiation

In contrast to stimulated radiation, spontaneous radiation is always emitted isotropically in all directions. Part of its vertical component is reflected at the boundaries between the active region and the cladding layers, as well as from the resonator mirrors, and is effectively absorbed within the active region, which was already taken into account in § 4.3.1. The in-plane emission, on the other hand, can be amplified significantly by stimulated processes within the active region (Onischenko and Sarma [1997]). Spontaneous radiation sometimes reaches distant regions of the laser, so its absorption may occur in many different places. For that reason, the distribution of heat generation associated with this absorption is usually difficult to determine, unless the active region is surrounded by highly absorptive areas, as in PITSELs (§ 4.3.1).
FUNDAMENTALS OF THERMAL MODELING OF VCSELs
4.3.4. Joule heating
In all layers, current flow is accompanied by volume Joule heating $g_J$:

$$g_J = j^2 \rho \quad [\mathrm{W/cm^3}], \tag{41}$$
where $\rho$ stands for the electrical resistivity (in Ω cm). Current flow through a potential barrier, such as a contact or a heterobarrier, is in turn accompanied by generation of Joule heat of a surface density $q_B$:
where $R_B$ is the specific contact resistance (in Ω cm²) of the barrier.

4.4. SELF-CONSISTENT APPROACHES
The thermal conductivity, $k$, of a semiconductor material is a temperature-dependent parameter. This dependence is especially important for relatively high temperature increases because of its strongly nonlinear behavior. It may be easily taken into account with the aid of the Kirchhoff transformation (Carslaw and Jaeger [1988], p. 11):
Then all of the calculations are carried out for the transformed temperature $\Theta$; i.e., as if the thermal conductivity were constant. The resulting temperature profiles should afterwards be recalculated for the temperature-dependent thermal conductivity case, using the inverse transformation. In eq. (43), $T_R$ stands for the reference temperature. Usually we assume it to be equal to the lowest temperature inside the semiconductor medium; i.e.,
The detailed form of the reverse transformation depends on a functional dependence k( T ) in a temperature range of interest. At temperatures around and
over room temperature, for example, the thermal conductivity of GaAs may be expressed as (Amith, Kudman and Steigmeier [1965]):

$$k_{GaAs}(T) = 0.44 \left(\frac{300}{T}\right)^{1.25} \quad [\mathrm{W/cm\,K}], \tag{45}$$

and that of InP as

$$k_{InP}(T) = \left[1.47 + \frac{T - 300}{117}\right]^{-1} \quad [\mathrm{W/cm\,K}]. \tag{46}$$
Equation (46) was obtained on the basis of fig. 1, published by Kudman and Steigmeier [1964]. Introducing successively eqs. (45) and (46) to eq. (43), we get the inverse transformation formula for GaAs in the following form:
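$$T = T_R \left[1 - \frac{k_{GaAs}(T_R)\,(\Theta - T_R)}{528} \left(\frac{T_R}{300}\right)^{1/4}\right]^{-4} = T_R \left[1 - \frac{\Theta - T_R}{1200}\,\frac{300}{T_R}\right]^{-4}, \tag{47}$$

and for InP in the form:

$$T = 128 + (T_R - 128)\exp\left[\frac{k_{InP}(T_R)\,(\Theta - T_R)}{117}\right]. \tag{48}$$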
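As a numerical cross-check of the inverse formulas, the following Python sketch evaluates the Kirchhoff transformation by quadrature and then applies closed-form inverses for the conductivities of eqs. (45) and (46). It assumes that eq. (43) has the form $\Theta = T_R + k(T_R)^{-1}\int_{T_R}^{T} k(T')\,dT'$, which is not reproduced in the text above; the function names and the Simpson-rule integration are illustrative choices, not part of the original model.

```python
import math


def k_gaas(temp):
    """Thermal conductivity of GaAs in W/(cm K), eq. (45); temp in kelvin."""
    return 0.44 * (300.0 / temp) ** 1.25


def k_inp(temp):
    """Thermal conductivity of InP in W/(cm K), eq. (46); temp in kelvin."""
    return 1.0 / (1.47 + (temp - 300.0) / 117.0)


def kirchhoff(k, temp, t_ref, steps=2000):
    """Transformed temperature Theta for a real temperature temp.

    Assumed form of eq. (43):
        Theta = t_ref + (1 / k(t_ref)) * integral of k from t_ref to temp.
    The integral is evaluated with Simpson's rule (steps must be even).
    """
    h = (temp - t_ref) / steps
    total = k(t_ref) + k(temp)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * k(t_ref + i * h)
    return t_ref + (h / 3.0) * total / k(t_ref)


def inverse_gaas(theta, t_ref):
    """Closed-form inverse for GaAs (constant 528 = 4 * 0.44 * 300)."""
    x = k_gaas(t_ref) * (theta - t_ref) / 528.0 * (t_ref / 300.0) ** 0.25
    return t_ref * (1.0 - x) ** -4


def inverse_inp(theta, t_ref):
    """Closed-form inverse for InP, using k_InP(T) ~ 117 / (T - 128)."""
    return 128.0 + (t_ref - 128.0) * math.exp(
        k_inp(t_ref) * (theta - t_ref) / 117.0)


if __name__ == "__main__":
    for name, k, inverse in (("GaAs", k_gaas, inverse_gaas),
                             ("InP", k_inp, inverse_inp)):
        theta = kirchhoff(k, 380.0, 300.0)
        print(name, "Theta =", round(theta, 3),
              "recovered T =", round(inverse(theta, 300.0), 3))
```

Because both conductivities decrease with temperature, the transformed temperature lies below the real one for $T > T_R$; the inverse transformation therefore raises the temperatures obtained from the constant-conductivity calculation.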
In all the above equations, temperatures should be expressed in kelvin. Thermal diffusivities $\kappa$ of semiconductor materials are also dependent on temperature. It is not, however, possible to take into consideration at the same time the temperature dependencies of both these thermal material parameters, i.e., $k(T)$ and $\kappa(T)$, using a transformation analogous to that presented above. Therefore, in detailed analytical transient thermal analyses, i.e., when both the above parameters should be included, another method of calculation, namely the so-called staircase approach, is recommended. For each time step, $\Delta t$, temperature profiles are determined using values of $k$ and $\kappa$ found in the previous calculation step, starting from an initial temperature of the entire structure equal to that of the ambient ($T_{amb}$). There is still another temperature-dependent term in the thermal conduction equation (20): the volume power density of heat generation, $g$. This is because many material parameters (such as electrical resistivities, refractive indices, and absorption coefficients) and device parameters (such as threshold current and quantum efficiencies) that influence the heat generation are strongly dependent on temperature. This may be included in the model using the self-consistent approach, in which, in successive iteration loops of the
[Figure 14: flow chart. Legible block labels include: START; determine all temperature-dependent ...; new average temperatures; ... spreading; carrier diffusion; ... profiles; ... sources; Kirchhoff transformation; linear ... conduction; determine reference temperature; a "No" decision branch; STOP.]
Fig. 14. Flow chart of the thermal-electrical self-consistent calculations in VCSELs.
calculation, the values of the above parameters determined in the previous loop are used. Self-consistency is assumed to be reached when the differences between the results obtained in two consecutive loops fall below given limits. Strictly speaking, not only material and device parameters but also the distributions of current densities and carrier concentrations within the whole laser structure depend on the actual temperature profiles. This is because current spreading and carrier diffusion are temperature-dependent processes. Therefore, in more exact thermal analyses of VCSELs, the thermal-electrical self-consistent procedure is recommended (fig. 14), in which mutual interactions between thermal and electrical processes in the laser are included. Even more exact is the thermal-electrical-optical self-consistent approach, in which optical processes, with their mutual interactions with both the thermal and electrical processes, are also taken into consideration. The full picture of mutual
Fig. 15. Mutual interactions between thermal, electrical, and optical processes in semiconductor lasers.
interactions between all these processes is shown in fig. 15. In VCSELs with strained active regions, mechanical processes should additionally be included.
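The self-consistent iteration described above can be sketched with a deliberately simplified, zero-dimensional toy model in Python. This is not the authors' model: the whole device is reduced to a single temperature, and every constant below (junction voltage, slope efficiency, thermal resistance, threshold current, characteristic temperature, series resistance) is an illustrative assumption. Only the structure of the loop mirrors the text: parameters are re-evaluated from the temperature of the previous loop until two consecutive results agree within a given limit.

```python
import math

# All numerical values are illustrative assumptions for this sketch.
T_AMB = 300.0    # ambient (heat-sink) temperature [K]
I_TH0 = 0.0102   # threshold current at T_AMB [A]
T0 = 210.0       # characteristic temperature of the threshold current [K]
R_S = 33.0       # electrical series resistance [ohm]
U_PN = 1.5       # junction voltage [V]
SLOPE = 0.435    # optical output per ampere above threshold [W/A]
R_TH0 = 1000.0   # thermal resistance at T_AMB [K/W]


def threshold_current(temp):
    """Threshold current rising exponentially with temperature."""
    return I_TH0 * math.exp((temp - T_AMB) / T0)


def heat_power(current, temp):
    """Dissipated heat: electrical input minus emitted optical power [W]."""
    optical = SLOPE * max(current - threshold_current(temp), 0.0)
    return U_PN * current + R_S * current ** 2 - optical


def thermal_resistance(temp):
    """Thermal resistance growing with temperature (k ~ T**-1.25 scaling)."""
    return R_TH0 * (temp / T_AMB) ** 1.25


def solve_self_consistent(current, tol=1e-6, max_loops=100):
    """Iterate until two consecutive loops differ by less than tol kelvin.

    Returns the converged temperature and the per-loop history; the first
    history entry corresponds to a non-self-consistent calculation.
    """
    temp = T_AMB
    history = []
    for _ in range(max_loops):
        new_temp = T_AMB + thermal_resistance(temp) * heat_power(current, temp)
        history.append(new_temp)
        if abs(new_temp - temp) < tol:
            return new_temp, history
        temp = new_temp
    raise RuntimeError("no convergence: possible thermal runaway")


if __name__ == "__main__":
    temp, history = solve_self_consistent(0.020)
    print("first loop:", round(history[0], 2), "K")
    print("converged :", round(temp, 2), "K after", len(history), "loops")
```

In this toy model both the heat power and the thermal resistance grow with temperature, so the output of the first loop is lower than the converged value; a failure to converge plays the role of thermal runaway.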
5. Comprehensive Thermal Models of VCSELs
The key parameter used in all simplified treatments of steady-state thermal problems in diode lasers is the thermal resistance $R_{TH}$ (in K/W), defined as the ratio of the average active-region temperature increase $\Delta T_{A,av}$ to the total dissipated thermal power $Q_T$ (Joyce and Dixon [1975], Manning [1981]):

$$R_{TH} = \frac{\Delta T_{A,av}}{Q_T}. \tag{49}$$
It should be noted that although the thermal resistance is a very useful parameter to compare the thermal properties of various devices, it may sometimes give
COMPREHENSIVE THERMAL MODELS OF VCSELs
misleading information. Consider, for example, a device with a very poor electrical contact between the device chip and the heat sink. The resultant heat, generated at the laser/heat-sink interface, would be very efficiently extracted by the heat sink (assuming it is made of a high-thermal-conductivity material), so its influence on the active-region heating would be relatively small. However, the heat generated near the heat sink would still contribute to the total heat power $Q_T$. Therefore, when eq. (49) is used to determine $R_{TH}$, such a device would appear to have a lower thermal resistance than a well-mounted laser with a low-electrical-resistance contact. Thermal-electrical behavior of VCSELs is described by a coupled system of partial differential equations with complicated boundary conditions. The approaches towards solving these equations can be classified into two major types: analytical and numerical models. In analytical models, the solution is written in the form of an analytical expression, usually at the expense of some approximations imposed by the postulated form of the solution. In numerical models, no functional form of the solution is sought, hence details of the device structure can be rendered more faithfully. However, in contrast to analytical models, where the accuracy of the solution can be easily controlled, it is more difficult to verify that a purely numerical solution does not contain significant errors. Details of hitherto known analytical and numerical comprehensive thermal VCSEL models are compared in table 8a and table 8b, respectively.

5.1. COMPREHENSIVE ANALYTICAL MODELS

5.1.1. Multilayer radially uniform structures
The first comprehensive approach to the thermal properties of VCSELs was developed by Kinoshita, Koyama and Iga [1987], Iga, Koyama and Kinoshita [1988], and Iga and Koyama [1990], who assumed only a single flat-disk heat source located in the center of the active region, but considered the influence of multilayer device structure on the 2D heat-flux spreading. The heat exchange with the exterior is assumed to take place only through the heat sink, with adiabatic boundary conditions for all remaining surfaces defining the device. For each layer, assumed to be radially uniform, 2D azimuthally symmetric temperature profiles are expressed in terms of infinite series containing the Bessel and hyperbolic functions. The expansion coefficients are found by imposing the boundary conditions of continuity of the temperature and heat flux profiles across the interfaces between the layers. The method is analogous to that proposed originally by Joyce and Dixon [1975] for edge-emitting lasers. The main
Table 8a
Analytical comprehensive thermal VCSEL models

Ref.  Year  Structure  Method
1     1987  DMEWL      Fourier
2     1991  DMEWL      Green
3     1992  PITSEL     Fourier
4     1995  DMEWL      Green
5     1995  PITSEL     Green
Current spreading
Carrier diffusion
-
-
crude
-
good
-
exact
-
good
-
fair
+ + + +
Structure modeling
Heat sourcesa
NR
SP
ST
+
-
-
+ + + +
+ + + +
+ + -
-
+ + + +
Self-consistency
BJ
VJ -
CJ -
- + + - + - -
k(T) -
+ + + -
Th-El Th-Op El-Op -
-
+ + +
+c
-
+c
-
+C
-
-
-
-
?
-?
i;;
a Abbreviations: NR, nonradiative recombination; SP, absorption of spontaneous radiation; ST, absorption of stimulated radiation; VJ, volume Joule heating; BJ, barrier Joule heating at heterojunctions; CJ, barrier Joule heating at the p-side contact.
b Abbreviations: k(T), temperature-dependent thermal conductivity; Th-El, thermal-electrical; Th-Op, thermal-optical; El-Op, electrical-optical.
c Partly.

References
(1) Kinoshita, Koyama and Iga [1987]
(2) Nakwaski and Osinski [1991a,b, 1993]
(3) Nakwaski and Osinski [1992c, 1994]
(4) Osinski and Nakwaski [1995b]
(5) Zhao and McInerney [1995]
Table 8b
Numerical comprehensive thermal VCSEL models

Ref.  Year  Structure  Method^a
1     1993  UMEWL      FDM
2     1993  TBEML      FEM
3     1994  HMML       FEM
4     1994  TEML       CVM
5     1994  PITSEL     FEM
6     1995  DMEWL      FEM
7     1995  DMEWL      FEM
8     1995  PITSEL     FEM
9     1996  PITSEL     FEM

Current spreading
Carrier diffusion
-
-
+ + + + +
a
Structure modeling
Heat sources
Self-consistency
NR
SP
ST
VJ
BJ
fair
+
?
-
-
-
+
-
exact
+
+
+
-
fair
+
+
-
-
fair
-
fair
-
exact
+ + +
+ + +
+ + -
-
-
fair
+
+
-
-
+ +
+ +
exact
+ +
+ +
+ +
+ +
fair
-
CJ -
+ -
+ + +
-
+ + -
+ +
k(T)
Th-El Th-Op El-@
-
-
-
-
+
-
-
-
-
-
-
-
-
-
-
+ +
+
-
+
-
-
-
-
-
+ + - -
+
+
-
-
+ +
Abbreviations: FDM, finite-difference method; FEM, finite-element method; CVM, control-volume method. See table 8a.
-
-
+
References
(1) Shimizu, Babic, Dudley, Jiang and Bowers [1993]
(2) Michalzik and Ebeling [1993]
(3) Piprek and Yoo [1994]
(4) Norris, Chen and Tien [1994], Chen, Hadley and Smith [1994], and Chen [1995]
(5) Piprek, Wenzel and Sztefka [1994], Piprek, Wenzel, Wünsche, Braun and Henneberger [1995]
(6) Rahman, Lepkowski and Grattan [1995]
(7) Baba, Kondoh, Koyama and Iga [1995a]
(8) Sarzała, Nakwaski and Osinski [1995]
(9) Hadley, Lear, Warren, Choquette, Scott and Corzine [1996]
limitation of this approach is that it neglects any structural nonuniformity in the radial direction. Consequently, in the case of buried-heterostructure DMEWLs (e.g., Koyama, Kinoshita and Iga [1989], see also fig. 3 and table 1), to which it was applied, neither the lateral confining layers nor the dielectric mirrors on the heat-sink side could be accounted for. The model of Kinoshita, Koyama and Iga [1987], Iga, Koyama and Kinoshita [1988], and Iga and Koyama [1990] is not self-consistent; i.e., the effect of the calculated temperature profiles on material parameters and heat-source efficiencies was not considered. In VCSELs, where heating is much more intense than in EELs, non-self-consistent models can underestimate the severity of the thermal problems. The first self-consistent treatment of thermal problems in VCSELs (Nakwaski and Osinski [1991a,b]) was applied to buried-heterostructure DMEWL devices and is discussed in § 5.1.2. As a matter of fact, this was the very first self-consistent thermal-electrical model applied to any semiconductor laser, including edge emitters and high-power laser arrays. While the radially nonuniform DMEWL structure is too complex for the model of Kinoshita, Koyama and Iga [1987], Iga, Koyama and Kinoshita [1988], and Iga and Koyama [1990] to give accurate results, the same model can be applied to radially uniform structures, such as PITSELs. The thermal conductivity of the highly electrically resistive regions that funnel the injected current into the active region is practically unaffected by the implantation process (Vook [1964]), which, combined with the planarity of the PITSEL structure, makes the device particularly suitable for analytical modeling.
We have incorporated an analytical approach similar to that of Kinoshita, Koyama and Iga [1987] in the analysis of PITSELs, as a portion of our comprehensive, thermal-electrical self-consistent model (Nakwaski and Osinski [1992a,c, 1994]), featuring the temperature-dependent distribution of multiple heat sources and the temperature dependence of material and device parameters. In the analysis, all important heat-generation mechanisms are taken into account, including nonradiative recombination, reabsorption of spontaneous radiation in the active region, free-carrier absorption of laser radiation, volume Joule heating and absorption of stimulated radiation in all the layers, and barrier Joule heating at heterojunctions. These distributed heat-generation processes are lumped into three uniform flat-disk heat sources, each of the active-region diameter $D_A = 2r_A$, located in the centers of the active region and the two Bragg mirrors. An analytical solution is obtained for the entire structure separately for each heat source. Using the superposition principle, a cumulative temperature distribution in the whole volume of the device is determined by adding together the contributions from all heat sources. Subsequently, a self-consistent solution is found with the
aid of an iteration procedure, taking into account the temperature dependencies of material and device parameters, including thermal conductivities, threshold current, electrical resistivities, voltage drop at the p-n junction, free-carrier absorption, and internal and external differential quantum efficiencies. The flow chart of numerical calculations of this type is shown in fig. 14, which, however, additionally includes carrier diffusion. Note that large temperature variations in VCSELs substantially affect their lasing characteristics because of strongly nonlinear thermal-electrical interactions, eventually leading to thermal runaway. For each flat heat source situated between the $j$th and the $(j+1)$th layers and for each $i$th layer, we look for the transformed temperature distribution in the following form:
where rs is the structure radius, z; is the coordinate of the top boundary of the ith layer, zi-1 1.
(63)
Again using eqs. (53) and (55) for $i \neq j$, but now for $i = 1$, and taking advantage of eq. (56), we get:
and
Now we can determine all $r_{j,i,n}$ for $2 \le i \le N$, working inward from $r_{j,1,n}$ and $r_{j,N,n}$. To determine all $\alpha_{j,i,n}$, we once more return to eqs. (53) and (55), but this time for $i = j$. After some mathematical manipulation, we find:
with
The remaining $\alpha_{j,i,n}$ coefficients can be determined from eq. (53), which, using the $r_{j,i,n}$ coefficients, can be rewritten in the following form:

$$\alpha_{j,i+1,n} = \alpha_{j,i,n}\left[\cosh(\kappa_n d_i) + r_{j,i,n}\sinh(\kappa_n d_i)\right] \quad \text{for } n > 1. \tag{69}$$
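The layer-to-layer recursion can be sketched in Python as follows. The sketch assumes that eq. (69) reads $\alpha_{j,i+1,n} = \alpha_{j,i,n}[\cosh(\kappa_n d_i) + r_{j,i,n}\sinh(\kappa_n d_i)]$ for one fixed heat source $j$ and one fixed series index $n$; the function name and the input values are hypothetical, and the $r_{j,i,n}$ coefficients are taken as given rather than derived.

```python
import math


def propagate_alpha(alpha_start, kappa_n, thicknesses, r_coeffs):
    """Apply the assumed form of eq. (69) layer by layer.

    alpha_start  -- known coefficient of the first layer considered
    kappa_n      -- radial eigenvalue for the chosen series index n [1/cm]
    thicknesses  -- layer thicknesses d_i [cm]
    r_coeffs     -- coefficients r_{j,i,n} of the same layers (given)
    Returns the list of coefficients, one more entry than there are layers.
    """
    alphas = [alpha_start]
    for d_i, r_i in zip(thicknesses, r_coeffs):
        factor = math.cosh(kappa_n * d_i) + r_i * math.sinh(kappa_n * d_i)
        alphas.append(alphas[-1] * factor)
    return alphas


if __name__ == "__main__":
    # Hypothetical three-layer stack; with r = -1 each factor is exp(-kappa d).
    print(propagate_alpha(1.0, 500.0, [1e-4, 2e-4, 5e-5], [-1.0, -1.0, -1.0]))
```

The limiting case $r_{j,i,n} = -1$ gives a simple sanity check, because $\cosh x - \sinh x = e^{-x}$ and the coefficients then decay exponentially with the accumulated thickness.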
All the $\beta_{j,i,n}$ coefficients can then be found from eq. (61). Figure 16 shows the pumping-current dependence of radial temperature profiles in the midplane of the active region calculated for a 35-μm PITSEL (Zhou, Cheng, Schaus, Sun, Zheng, Armour, Hains, Hsin, Myers and Vawter [1991]). The total number of layers in the simulation, from the highly doped cap layer on the p-type DBR side to the solder contact below the substrate, including the linearly graded interfaces, is 284. Note that the CW threshold current for this device is 10.2 mA, hence the lowest profile in fig. 16 shows the temperature profile just above threshold. A superlinear increase in the temperature at the center of the active region ($r = 0$) in response to increased pumping
[Figure 16: plot of temperature profiles versus radial distance r [μm] (axis range 0 to 250 μm); surviving annotations: r = D_A/2, D_A = 35 μm, curve 1: 10.5 mA.]
Fig. 16. Radial temperature profiles in the plane containing the active region of a 35 μm GaAs/AlGaAs/AlAs PITSEL for various CW pumping currents, calculated using the self-consistent thermal-electrical model. Chip diameter D_S is 500 μm.
current can be seen clearly. Consequently, the temperature profile becomes increasingly inhomogeneous, with a large temperature step between the center and the edge ($r = r_A$) of the active region. This results in the creation of a strong thermal waveguide (cf. § 3.3), with a refractive index step as large as $1.4 \times 10^{-2}$ at $I = 50$ mA, which corresponds to the index step that would be obtained if the active region were surrounded by Al$_{0.02}$Ga$_{0.98}$As rather than GaAs. On the other hand, since the slope $dT/dr$ is a measure of the lateral heat flow, it is evident that the importance of 2D heat flow increases with the pumping current. The pumping-current evolution of the relative share of the three major heat sources in the same PITSEL device is illustrated in fig. 17. The active-region heating is the dominant heat source near threshold, but gradually the p-type mirror heating takes over, owing to its roughly quadratic dependence on the pumping current. The situation reverses again near the thermal runaway limit, where we observe an accelerated increase in the relative importance of the active-region heating, caused primarily by nonlinear processes intensifying the nonradiative recombination. Figure 18 illustrates the current dependence of the thermal resistance, $R_{TH}$, as defined in eq. (49), for PITSELs of the same vertical structure as in figs. 16 and
[Figure 17: plot versus current [mA] (axis range 0 to 120 mA); curves: 1, active region; 2, P-type mirrors; 3, N-type mirrors.]
Fig. 17. Yields of three major heat sources in the 35 μm-diameter PITSEL of fig. 16 shown as a function of the pumping current.
[Figure 18: plot; the dotted line is labeled "uniform cylinder model".]
Fig. 18. Pumping-current dependence of thermal resistance R_TH for 35 μm-diameter PITSELs with various electrical series resistances. Curve 2 corresponds to the device simulated in figs. 16 and 17 and reported by Zhou, Cheng, Schaus, Sun, Zheng, Armour, Hains, Hsin, Myers and Vawter [1991].
17. In addition to the experimentally realized device with the series resistance $R_{s,0} = 33$ Ω (curve 2), we also consider hypothetical devices with lower (curve 1) or higher (curves 3 and 4) series resistances. The corresponding threshold voltages are readjusted using the following equation:
where $U_{pn}$ is the contribution of the p-n junction to the threshold voltage, assumed to be independent of the series resistance $R_s$ and determined from the I-V characteristic of the 33 Ω device. No variation of the pulsed threshold current with $R_s$ is assumed to take place. The CW threshold, however, does depend on $R_s$ because of the changing temperature of the active region, although for the devices considered here these changes are very small, primarily because of the high value of $T_0 = 210$ K (Hasnain, Tai, Dutta, Wang, Wynn, Weir and Cho [1991]) (cf. table 6b, p. 193) assumed in the calculations. Although $R_{TH}$ is usually treated as a constant parameter with a value characteristic of any particular device (cf. table 4, p. 187), it is clear that, owing to nonlinear processes, it varies substantially with the pumping current (see Nakwaski and Osinski [1992b]). The relatively high values of $R_{TH}$ displayed in fig. 18 are caused primarily by the "junction-up" configuration of PITSELs. The horizontal dotted line represents the thermal resistance calculated using the simplified uniform cylinder model (Nakwaski and Osinski [1992d]). It is clear that this model represents a reasonable approximation only in the linear regime, near the lasing threshold. The effect of the series electrical resistance on the average temperature increase of the active region $\Delta T_{A,av}$, used in the calculation of the thermal resistance $R_{TH}$, is illustrated in fig. 19. Because of nonlinear processes, the penalty for too high a series resistance increases rapidly with the pumping current. The operating current range of the 100 Ω device is nearly half that of the low-series-resistance (20 Ω) device. Zhao and McInerney [1995] have recently reported an analytical solution of the thermal conduction equation for a GaAs/AlGaAs PITSEL volume using the Green's function approach proposed for surface-emitting LEDs by Nakwaski and Kontkiewicz [1985].
In the model, the complex multilayer VCSEL structure seems to be replaced with an equivalent uniform structure, although the authors do not say so explicitly: in the solution, average (?) values of thermal conductivity and diffusivity are used for the entire VCSEL volume. The model would be exact if Green's function solutions (with unknown expansion coefficients) were assumed separately for each uniform structure layer. Then
[Figure 19: plot versus current I [mA] (axis range 20 to 140 mA); one line is labeled "uniform cylinder model".]
Fig. 19. Pumping-current dependence of the average active-region temperature increase ΔT_A,av in a 35 μm-diameter PITSEL. Curve 2 corresponds to the device reported by Zhou, Cheng, Schaus, Sun, Zheng, Armour, Hains, Hsin, Myers and Vawter [1991] and simulated in figs. 16 and 17.
the coefficients would be found from continuity conditions at all layer edges for the profiles of both temperature and heat flux, as in the approaches proposed for VCSELs by Nakwaski and Osinski [1992c] and earlier for EELs by Joyce and Dixon [1975]. As heat sources, Zhao and McInerney considered nonradiative recombination and absorption of spontaneous emission in the active region, and volume Joule heating in the current-spreading layers. Unfortunately, they did not solve the current-flow problem exactly, using instead a simplified approach with two adjustable parameters whose values are difficult to estimate. The temperature dependence of all model parameters was neglected. For all these reasons, the accuracy of the model seems to be very limited. Nevertheless, the model was used later in an interesting analysis of transverse modes in VCSELs (Zhao and McInerney [1996]).

5.1.2. Multilayer radially nonuniform structures

5.1.2.1. GaAs/AlGaAs lasers. Most VCSEL structures are either nonplanar, or contain laterally nonuniform layers confining the carriers, defining the waveguide/antiguide, or acting as reflectors. The analytical approach of § 5.1.1
may be used only for structures in which the radially nonuniform layers can be replaced with thermally equivalent uniform layers. An alternative analytical approach that takes lateral nonuniformity into consideration without requiring thermal equivalencies in the radial direction has been developed by Nakwaski and Osinski [1991a,b, 1993] and applied to buried-heterostructure DMEWLs (Koyama, Kinoshita and Iga [1989]) (see fig. 3 and table 1). First, current spreading between the etched-well substrate and the heat sink is found using approximate analytical formulae (Bugajski and Kontkiewicz [1982], Nakwaski and Osinski [1993]). Realistic, radially nonuniform, multiple heat sources associated with different layers of the device are considered, each with an axially uniform distribution across the layer thickness. The following heat sources are included: the active region, the N-type and the P-type cladding layers, and the p-side contact resistance. The device is then divided into two concentric cylinders (internal with $0 < r < D_A/2$ and external with $D_A/2 < r \le D_S/2$) such that within each cylinder all layers are radially uniform. The dividing wall at $r = D_A/2$ is considered to be thermally insulating; however, before the heat-spreading problem is solved, the heat generated by each source is redistributed between the two cylinders using an electrical analog model (Nakwaski and Osinski [1991a]). Because of the smaller size of the inner cylinder, the redistribution of heat within that cylinder, which contains the active region, is considered to be more accurate. For each cylinder, the multilayer structure is replaced with a thermally equivalent medium, and an analytical solution for the temperature profiles is found for each $i$th heat source using the Green's function method in the following form:
Region I:
Region II:
In the above equations, $T_R$ stands for the reference temperature equal to the temperature at the bottom edge of the laser crystal, $r_A$ and $r_S$ are the radii of the active region and of the laser structure, respectively, $j_{1,n}$ ($n = 1, 2, 3, \ldots$) is the $n$th zero of the first-order Bessel function of the first kind, $c_m = \pi(m + 1/2)$ ($m = 1, 2, 3, \ldots$) is the $(m+1)$-st zero of the cosine function, $z_{eq,a}$ denotes the $z$ coordinate after the space transformation, and $d_{eq,a}$ is its value for the bottom of
the etched well (the n-GaAs/N-AlGaAs interface), both for Region $a$ ($a$ = I, II). The coefficients $A_{nm,i,I}$ and $A_{nm,i,II}$ are calculated using the following formulae:
where $k_{eq,a}$ stands for the equivalent thermal conductivity of Region $a$, is the transformed coordinate of the top of the $i$th layer in Region $a$, and $g_{i,eq,a}$ is the equivalent distribution of the $i$th heat source after its redistribution. In each loop of the self-consistent calculations, the cumulative profiles of transformed temperature are found using the superposition principle by adding together the contributions from all heat sources. Then, the actual temperature profiles are determined by performing the inverse Kirchhoff transformation (cf. § 4.4). At the boundary between the two cylinders, the temperature profiles are matched by adjusting the profile of the outer cylinder to the level of the inner cylinder. The adjustment affects only the near vicinity of the boundary, on the side of $r > D_A/2$. A very important feature of the model is its self-consistency (cf. § 4.4), which accounts for mutual interactions between thermal and electrical phenomena. In an iterative loop, the temperature dependencies of many material and device parameters are considered, including thermal conductivity, electrical resistivity, threshold current, quantum efficiencies, and voltage drop at the p-n junction. Also, the temperature dependence of all important heat-generation mechanisms is taken into account, including nonradiative recombination, absorption of spontaneous emission, and Joule heating in all layers. The model described here represents the first application of a self-consistent approach to thermal problems in any semiconductor laser, including edge emitters. We
[Figure 20: plot of temperature profiles versus radial distance r [μm] (axis range 0 to 40 μm); solid line: self-consistent; dotted line: non-self-consistent.]
Fig. 20. Radial temperature profiles in the plane containing the active region of a 16 μm buried-heterostructure GaAs/AlGaAs DMEWL for the pumping current I = 5 I_th,P (I_th,P = 38.4 mA), where I_th,P stands for the room-temperature pulsed threshold current, calculated using the self-consistent thermal-electrical model (solid line) or taking the output of the first loop of the iterative process as a non-self-consistent solution (dotted line). Chip diameter D_S is 500 μm. The kink near r = 23 μm corresponds to the edge of the outer oxide layer.
have subsequently used the self-consistent approach in all of our comprehensive thermal modeling, including the model described in § 5.1.1. The importance of self-consistency is illustrated in fig. 20, which compares radial temperature profiles in the active region of a 16 μm GaAs/AlGaAs DMEWL obtained using the self-consistent solution (solid line) and taking the output of the first loop of the iterative process as a non-self-consistent solution (dotted line). The device structure is similar to that of Koyama, Kinoshita and Iga [1989], except for an enhanced P-AlGaAs-cladding doping level of $2 \times 10^{18}$ cm$^{-3}$, which significantly improves device thermal properties (see Nakwaski and Osinski [1991a,c]). A pumping current of 192 mA was assumed, corresponding to 5 times the room-temperature pulsed threshold current. Clearly, at currents significantly above threshold, the non-self-consistent solution grossly underestimates the active-region temperature increase. Another interesting feature displayed in fig. 20 is the on-axis dip in the temperature profile, which is a direct consequence of nonuniform current injection. Associated with the dip is a thermally induced antiguide. Self-consistent analysis reported by Nakwaski and Osinski [1991a,c,d] and Osinski
[Figure 21: plot versus relative pumping current (axis range 0 to 18); curves for active-region diameters D_A: 1, 5 μm; 2, 10 μm; 3, 16 μm; 4, 20 μm; 5, 30 μm; 6, 40 μm.]
Fig. 21. Current dependence of thermal resistance for buried-heterostructure GaAs/AlGaAs DMEWLs of various active-region diameters D_A.
and Nakwaski [1992] reveals that the sign of thermal waveguiding in the active region can be controlled by the N-AlGaAs doping level. Increasing the N-AlGaAs doping level beyond the value of $N = 7 \times 10^{17}$ cm$^{-3}$ used in fig. 20 results in improved uniformity of the injected current density. This is manifested initially by flattening of the active-region temperature profile, and eventually by the occurrence of a maximum at $r = 0$ for $N = 7 \times 10^{18}$ cm$^{-3}$. Freedom to engineer the thermal waveguide in the active region is a characteristic feature of all etched-well VCSELs. Depending on the application, it might be more beneficial to focus the output light into a narrower spot or to spread it over a wider area without changing the active-region diameter. A thermal antiguide can also enhance single-transverse-mode operation (cf. § 3.3). Figure 21 displays the current dependence of the thermal resistance $R_{TH}$ for DMEWL devices with various active-region diameters. Except for their lateral dimensions, the devices have the same structure as the device of fig. 20. Comparison with fig. 18 reveals a device-type-dependent variation of $R_{TH}$ with current. While the $R_{TH}(I)$ curves increase monotonically in PITSELs, they have distinct minima in DMEWLs, particularly for small-size emitters that can operate in CW mode far above threshold (curves 1 and 2 in fig. 21). These seemingly contradictory results can be understood by considering the various factors that can influence the evolution of $R_{TH}$ with current.
Figures 18 and 21 indicate that the thermal resistance in VCSELs is governed by a number of mechanisms that may affect the R_TH(I) dependence in opposite ways. The variation of VCSEL thermal resistance with pumping current is caused by the temperature dependence of the thermal conductivities of the constituent materials and by changes in the intensities of the various heat generation processes located in different parts of the laser. The former mechanism always increases the value of R_TH, whereas the latter may increase or decrease R_TH depending on the laser structure. This is the reason for the different R_TH(I) dependences shown in figs. 18 and 21. An increase in the pumping current invariably heats up the device, which in turn reduces the thermal conductivity and increases the fractional thermal resistance R_FA associated with every heat source A. This effect is nearly negligible at low currents, but steadily becomes more significant at higher currents, as evidenced in fig. 21 by a sudden increase in R_TH near the thermal runaway conditions. A more subtle effect is that of the heat source distribution. The thermal resistance of the device is obtained by summing the fractional resistances R_FA with weights Q_A/Q_T determined by the relative shares of the corresponding heat sources. If the relative share of heat sources with high fractional resistances R_FA increases, the total thermal resistance will tend to increase. As shown in fig. 18, this is the case for PITSELs. However, if the relative share of heat sources with high fractional resistances R_FA decreases, the variation of the total R_TH will depend on which of the two opposing mechanisms prevails: the reduction in thermal conductivity or the lower average fractional resistance. It follows from fig. 21 that this more complex behavior is the case for DMEWLs.
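The weighted-sum rule described above can be illustrated with a short numerical sketch. The function name and all numbers below are hypothetical and chosen only for illustration; they are not taken from the devices discussed in the text. The sketch assumes that each heat source is characterized by its heat power and its fractional thermal resistance, and that the total resistance is the sum of the fractional resistances weighted by the relative heat shares.

```python
def total_thermal_resistance(sources):
    """Weighted sum of fractional resistances.

    sources: list of (heat_power, fractional_resistance) pairs, one per
    heat source. The weight of each source is its share of the total heat.
    """
    total_heat = sum(q for q, _ in sources)
    return sum(q * r for q, r in sources) / total_heat


# Hypothetical fractional resistances (K/W) of two heat sources.
R_ACTIVE = 100.0   # active-region heating
R_PTYPE = 200.0    # p-type Joule heating, assumed farther from the heat sink

# Hypothetical heat powers (arbitrary units) at a low and a high current:
# the Joule share grows faster than the active-region share.
low_current = [(1.0, R_ACTIVE), (1.0, R_PTYPE)]
high_current = [(1.0, R_ACTIVE), (3.0, R_PTYPE)]

r_low = total_thermal_resistance(low_current)
r_high = total_thermal_resistance(high_current)
print(r_low, r_high)
```

In this example the source with the higher fractional resistance gains weight at the higher current, so the total resistance rises, which is the qualitative behavior attributed to PITSELs. Swapping the two resistance values would make the total fall instead, as long as the temperature dependence of the thermal conductivity (ignored here) does not dominate.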
In a typical VCSEL configuration, where all heat flux is directed towards the heat sink located on the side opposite to the output mirror, the fractional resistances of all heat sources of the same diameter are determined primarily by their distance from the heat sink. Hence, in a junction-up configuration of PITSELs, the P-type Bragg mirror has the largest fractional resistance, while in DMEWLs mounted junction-down the P-AlGaAs cladding layer has the lowest fractional resistance. The Joule heating associated with DBRs or cladding layers is roughly proportional to I² (this dependence would have been exact if current spreading mechanisms and electrical resistivities were independent of temperature) and, at least well below the thermal runaway, tends to grow faster than the active-region heating. Due to the relatively high electrical resistivity of p-type semiconductors, the p-type Joule heat sources are the most important ones to consider since they end up having higher weights Q_A/Q_T. From the above considerations it follows that the p-type Joule heat sources in PITSELs, being on the high end of fractional thermal resistances, cause a further increase in R_TH
NONLINEAR PATTERN RECOGNITION TECHNIQUES IN THE FOURIER DOMAIN
have to meet the separation condition to avoid overlap of these terms, and therefore the requirement on the space-bandwidth product of the system is relaxed. Equivalently, it can handle a larger input image than a conventional JTC can. This is especially useful for multiple-target detection. Suppose that the input signal s(x,y) is placed on the plane P1 and the reference signal r(x,y) is placed on P2. On the back focal planes of the Fourier-transform lenses 1 and 2, S(α,β) and R(α,β), the Fourier transforms of s(x,y) and r(x,y), respectively, are obtained. On P1f, S(α,β) is multiplied by the phase-only function M1(α,β), so the light emanating from P1f, denoted by A_P1f(α,β), is the product of S(α,β) and M1(α,β), i.e.

$$A_{P1f}(\alpha,\beta) = S(\alpha,\beta)\,M_1(\alpha,\beta).$$

For the same reason, on P2f,

$$A_{P2f}(\alpha,\beta) = R(\alpha,\beta)\,M_2(\alpha,\beta).$$

By using beam splitter BS and image lenses IL1, IL2, both the P1f and P2f planes are imaged onto P3. Consequently, on P3, the interference pattern of A_P1f and A_P2f is obtained. The modulated joint transform power spectrum can be obtained by a square-law device such as a video camera. The modulated joint power spectrum (MJPS) can be represented as

$$E(\alpha,\beta) = |A_{P1f}(\alpha,\beta) + A_{P2f}(\alpha,\beta)|^2,$$

where E(α,β) is the MJPS that is recorded by a camera. The output of the camera is connected to a nonlinear network to transform the MJPS. Then, the output of the nonlinear network is sent to an SLM to display the nonlinear version of the MJPS. The nonlinearity here can be any type of nonlinear transform, or a linear transform. Since the phase-only mask M_d(α,β), which is used as the demodulation function, is placed on the SLM, the light emanating from P4 can be represented as

$$G(\alpha,\beta) = g\{E(\alpha,\beta)\}\,M_d(\alpha,\beta),$$

where g{·} denotes the nonlinear transformation and G(α,β) represents the light distribution emanating from P4. Applying another Fourier transform optically, we get the output on the plane P0.
7.2. ANALYSIS OF RANDOM PHASE ENCODED LINEAR JOINT TRANSFORM CORRELATORS
Assume that the demodulation function M_d(α,β) is

$$M_d(\alpha,\beta) = M_1^{*}(\alpha,\beta)\,M_2(\alpha,\beta). \qquad (7.5)$$

It can be seen that the phase-only function M_d is used as a demodulation function which recovers the desired correlation signal completely while the other noise terms are still modulated by phase functions. Hence, on the output plane we have

$$g(x,y) = R_{ss}(x,y) * m_d(x,y) + R_{rr}(x,y) * m_d(x,y) + R_{sr}(x,y) + R_{rs}(x,y) * m_d(x,y) * m_d(x,y), \qquad (7.7)$$

where * represents the convolution operation. R_ss and R_rr are the autocorrelations of s(x,y) and r(x,y), respectively. R_sr and R_rs are the cross-correlations between s(x,y) and r(x,y) (their spectra are complex conjugates of each other), and m_d(x,y) is the Fourier transform of M_d(α,β).
From eq. (7.7) we can see that there are four terms in the output plane P0. The term R_sr is the desired cross-correlation signal. The other terms are considered as noise signals. This is different from the conventional JTC. The three noise terms are all convolved with some functions of M_d(α,β). By properly choosing M1(α,β) and M2(α,β), and therefore M_d(α,β), it is possible to reduce the noise peak intensity so that the signal peak-to-sidelobe ratio can be improved. If M1(α,β) and M2(α,β) are phase-only functions, M_d(α,β) is also a phase-only function. Therefore we have
$$\iint |W(\alpha,\beta)\,M_d(\alpha,\beta)|^2\,d\alpha\,d\beta = \iint |W(\alpha,\beta)|^2\,d\alpha\,d\beta, \qquad (7.9)$$

where W(α,β) is an arbitrary function. It can be seen that the total energy of each noise term will not be changed by the demodulation function M_d(α,β) if it is a phase-only function. Only the distribution of the noise in the output plane is changed.
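The energy statement of eq. (7.9) is easy to check numerically on a sampled spectrum. The following is a minimal sketch under stated assumptions: the spectrum is one-dimensional and discrete, W is an arbitrary complex sequence drawn at random, and the mask phases are uniformly distributed. All variable names are illustrative.

```python
import cmath
import random


def energy(field):
    """Sum of squared magnitudes of a sampled complex field."""
    return sum(abs(v) ** 2 for v in field)


rng = random.Random(0)
n = 256

# An arbitrary complex function W, sampled at n points.
w = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

# A phase-only mask: unit magnitude, random phase at every sample.
m_d = [cmath.exp(1j * rng.uniform(0, 2 * cmath.pi)) for _ in range(n)]

modulated = [a * b for a, b in zip(w, m_d)]

energy_before = energy(w)
energy_after = energy(modulated)
print(energy_before, energy_after)
```

The two energies agree to rounding error because every sample is multiplied by a factor of unit magnitude. What the mask changes is the phase of the spectrum, and hence how the energy is spread after the next Fourier transform.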
From eqs. (7.6) or (7.7) we can see that the noise terms, comprising the autocorrelations and one cross-correlation, are multiplied by M_d(α,β) and M_d²(α,β), respectively. Equation (7.5) is the demodulation condition that M_d(α,β) needs to satisfy. If M1(α,β) and M2(α,β) are chosen to be random phase functions exp[jφ_m1(α,β)] and exp[jφ_m2(α,β)], respectively, and φ_m1(α,β) and φ_m2(α,β) are two independent white Gaussian processes, then φ_md(α,β) = φ_m2(α,β) − φ_m1(α,β) is also a white Gaussian process. Therefore, M_d(α,β) = exp[jφ_md(α,β)] has the same diffusing properties. It reduces the noise peaks of the two autocorrelation terms by spreading the energy of the noise terms across the entire output plane. For the same reason M_d²(α,β) reduces the noise peak of the fourth term in eq. (7.7). We have shown that all the noise terms in a phase-encoded joint transform correlator are diffused by the random phase encoding but the desired correlation term remains intact.

7.3. COMPUTER SIMULATIONS
To study the performance of the random phase encoded JTC, computer simulations were conducted. The car image shown in fig. 23a (21 × 16 pixels) was used as the reference and was located at the center of the reference plane in fig. 22. The input image was the same car image located at the input plane and was shifted 36 pixels from the origin. Figure 24a shows the correlation output for a linear JTC. We see the strong autocorrelation terms at the center of the output plane and two cross-correlation terms on both sides of the autocorrelation terms. Here we assume a real-valued gray scale SLM that can display positive and negative values. If only positive values can be displayed, a bias has to be used to shift a negative value to a non-negative one. This will result in an extra DC peak at the origin of the output (Flavin and Horner [1989]). In fig. 24a, only one of the cross-correlation peaks is desired. The other cross-correlation is the redundant term that does not contain any more information. The autocorrelation terms are also considered as noise terms. Figure 24b shows the correlation output for the same reference and input images using the random phase encoded JTC shown in fig. 22. It shows that only one cross-correlation peak appears at the output plane. The autocorrelation peaks and the redundant cross-correlation peak are diffused by the random phase encoding. Since the phase encoding at the Fourier plane uniformly distributes the undesired terms at the output plane, the undesired terms contribute to the background noise floor, and the noise peaks are significantly reduced. Note that the random phase encoding and decoding at the Fourier plane recovers the desired correlation signal completely, and that this correlation signal is unchanged. However, since
Fig. 23. The input and the reference images used for single-target detection using a JTC and a random phase encoded JTC. (a) The reference image (a car image) located at the center. (b) The input image (same car image as the reference) located at (0,36).
the noise terms overlap with the desired correlation signal and are not negligible, the output noise increases slightly. This is due to the limited space-bandwidth product of the optical system that binds the noise energy to the limited area of the output plane, unlike the ideal case in which we assume that the noise energy
Fig. 24. The correlation outputs of linear JTCs for the reference and input images of figs. 23a and 23b, respectively. (a) The correlation output for the linear JTC (autocorrelation and cross-correlation peaks are labeled in the plot). (b) The correlation output for the random phase encoded linear JTC.
can be distributed over a sufficiently large area to result in a negligible increase in the background noise. We now test the random phase encoded JTC for multiple-target detection. Four identical car images used as the targets are placed at the input as shown in fig. 25. The reference image is the same car image placed at the center of the reference plane as in fig. 23. The size of the car images used here is reduced by half to avoid overlapping in the linear JTC. For the linear case, the correlation outputs for the conventional JTC and the random phase encoded JTC are shown in figs. 26a and 26b, respectively. Comparing these two plots, we can see that random phase encoding at the Fourier plane eliminates the autocorrelation peaks and the redundant cross-correlation peaks. These noise terms are distributed over the background noise floor, as expected.
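The behavior described above can be reproduced qualitatively with a small one-dimensional sketch. This is not the simulation reported in the text: it assumes a linear correlator (the nonlinearity g is the identity), a hypothetical random ±1 target in place of the car image, uniformly distributed mask phases instead of Gaussian ones, and circular discrete Fourier transforms. All names and parameter values are illustrative.

```python
import cmath
import random


def fft(x, sign=-1):
    """Recursive radix-2 DFT; len(x) must be a power of two.
    sign=-1 is the forward transform, sign=+1 the unnormalized inverse."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], sign)
    odd = fft(x[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(x):
    n = len(x)
    return [v / n for v in fft(x, sign=+1)]


def peak_index(values):
    mags = [abs(v) for v in values]
    return mags.index(max(mags))


def simulate(n=512, length=64, shift=200, seed=1):
    rng = random.Random(seed)
    pattern = [rng.choice((-1.0, 1.0)) for _ in range(length)]
    r = pattern + [0.0] * (n - length)       # reference at the origin
    s = [0.0] * n
    for i, v in enumerate(pattern):          # same target, displaced by `shift`
        s[(i + shift) % n] = v
    S, R = fft(s), fft(r)

    # Conventional linear JTC: joint power spectrum without masks.
    jps = [abs(a + b) ** 2 for a, b in zip(S, R)]
    conventional = ifft(jps)

    # Random phase encoded JTC: masks M1, M2 and demodulation mask Md = M1* M2.
    m1 = [cmath.exp(1j * rng.uniform(0, 2 * cmath.pi)) for _ in range(n)]
    m2 = [cmath.exp(1j * rng.uniform(0, 2 * cmath.pi)) for _ in range(n)]
    md = [a.conjugate() * b for a, b in zip(m1, m2)]
    mjps = [abs(a * p + b * q) ** 2 for a, p, b, q in zip(S, m1, R, m2)]
    encoded = ifft([e * d for e, d in zip(mjps, md)])

    return peak_index(conventional), peak_index(encoded)


conventional_peak, encoded_peak = simulate()
print(conventional_peak, encoded_peak)
```

In the conventional output the strongest sample is the autocorrelation (DC) term at the origin, whereas after encoding and demodulation the strongest sample sits at the displacement of the target, because only the desired cross-correlation term is left unmodulated. The margin between that peak and the diffused noise floor depends on the array size and the target energy, which reflects the space-bandwidth limitation noted in the text.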
Fig. 25. The input and reference for multiple-target detection: (a) the input image that contains four identical car images as the targets. (b) The reference car image located at the center of the reference plane.
Fig. 26. The correlation outputs of linear JTCs for the reference images shown in fig. 25: (a) the output of the linear JTC; (b) the output of the random phase encoded linear JTC.
8. Security Validation and Security Verification
We conclude with a practical application for the technology we have been describing so far. Credit card fraud is a serious problem facing many banks, businesses, and consumers. In addition, counterfeit parts such as computer chips, machine tools, etc. are being produced in great numbers. With the rapid advances in computers, CCD technology, image processing hardware and software, printers, scanners, and copiers, it is becoming increasingly simple to reproduce pictures, logos, symbols, money bills, or patterns. Presently, credit cards and passports use holograms for security. The holograms are inspected by human eye. In theory, the hologram cannot be reproduced by an unauthorized person using commercially available optical components. In practice, the holographic pattern can be easily acquired from a credit card (photographed or captured by a CCD camera) and then a new hologram synthesized. Therefore any pattern
that can be read by a conventional light source and a CCD camera can be easily reproduced. We propose an idea for security verification of credit cards, passports, and other IDs so that they cannot be easily reproduced (Javidi and Horner [1994a], Javidi [1997a,b]). We propose a new scheme of complex phase/amplitude patterns that cannot be seen and cannot be copied by an intensity-sensitive detector such as a CCD camera. The basic idea is to permanently and irretrievably bond a phase mask to a primary identification amplitude pattern such as a fingerprint, a picture of a face, or a signature. Computer simulation results and laboratory tests of the proposed system will be provided to verify that both the phase mask and the primary pattern are identifiable in an optical processor or correlator (Vander Lugt and Rotz [1970], Weaver and Goodman [1966], Javidi [1989a,b]). These patterns are utilized for verification of the authenticity of items bearing them. The phase portion of the pattern consists of a two-dimensional phase mask that is invisible under ordinary light. The complexities of interferometry and the large dimensions of the mask make it extremely difficult to determine the contents of the mask. The code in the mask is known only to the authorized producer of the card. One cannot analyze the mask by looking at the card under a microscope, photographing it, or reading it with a computer scanner in an attempt to reproduce it. Only an elaborate set-up like a Michelson interferometer is capable of deciphering it. The verification system that reads the card could be one of several coherent optical processor architectures.
A biometric image such as a fingerprint, or another primary pattern g(x,y) whose authenticity is to be verified, consisting of an amplitude gray-scale pattern to which a phase mask has been bonded, is placed in the input plane of the processor. Thus the composite input signal is g(x,y) exp[jM(x,y)].
If there is no primary pattern and the phase mask exp[jM(x,y)] alone is used for verification (as, for example, in product verification or authentication), g(x,y) will be a constant. If the primary pattern must be verified, the processor will have an a priori knowledge of the primary pattern g(x,y). In that case, the process is repeated for verifying the primary pattern. In fig. 27, the nonlinear joint transform correlator architecture is used to verify the authenticity of the card. The complex mask on the card and an image of a
Fig. 27a. Nonlinear single-SLM joint transform correlator for verifying a fingerprint match in addition to verifying the authenticity of the phase mask superposed over the fingerprint on an ID or credit card. Decision rule shown in the figure: if there is a correlation peak, the card is authentic; if there is no correlation peak, the card is not authentic.
corresponding reference phase mask are placed at the input plane of a JTC. If a SLM is used, the single-SLM version of the JTC (fig. 27a) is attractive because it is a more cost-effective type for this application. Figure 28 illustrates an example of the phase-encoded ID card inserted into the input plane of an optical processor such as the correlator of fig. 27. The processor (fig. 27) used to verify the input mask can also be used to verify the primary pattern such as a fingerprint or a picture. The processor will have an a priori knowledge of the primary pattern g(x,y) and the process is repeated for verifying the primary pattern. For even more security, the primary pattern could itself be phase-encoded. That is, a fingerprint or picture of a face could be written as a phase mask itself and combined with the random phase mask discussed previously. This would have the effect that the combined pattern would be completely invisible to the
Fig. 27b. Implementation of a verification system with a joint transform correlator without SLM. Decision rule shown in the figure: if there is a correlation peak, the card is authentic; if there is no correlation peak, the card is not authentic.
eye or to any other detector using conventional light sources. The composite mask could be produced by the same means used to make the random phase mask; e.g., refractive, embossing, or bleaching techniques. This double phase-encoding scheme would have an additional security value, in that anyone wanting to counterfeit the card would not even be able to determine easily what type of a primary pattern they would have to produce on the card. Even if they obtained the primary pattern, they would have to unscramble the random phase code from it. We have verified the idea proposed above with computer simulations. The card-holder's fingerprint can be verified before the phase mask on the card is verified. The order is not important. A random noise is used for M(x,y) to generate the reference phase mask exp[jM(x,y)]. This phase mask is multiplied by an image g(x,y) to form the input signal. The input image g(x,y), a segment of a twenty dollar bill, is shown in fig. 29. A phase mask is placed over the image of Andrew Jackson. In the computer simulation tests, the size of the input image is 80 × 100 pixels, and the size of the input array (with zero padding) is
Fig. 28. An example of the phase-encoded ID card inserted into the correlator of fig. 27. Labels on the front of the card: phase mask; an image (fingerprint, bio-signature, etc.); an image bonded with the phase mask, g(x,y) exp[jM(x,y)]. Labels on the back of the card: authorized signature; bio-signature bonded with the phase mask; magnetic strip, which includes algorithms to determine the sequence of codes to be used, PIN no., etc.
Fig. 29. Part of the input image (a $20 bill) used in the simulations. A phase mask is placed over the image of Andrew Jackson.
Fig. 30. Computer correlation simulation with fig. 29 as input: (a) the reference function is the random phase mask; (b) the reference function is a different random phase mask, for example a counterfeit mask.
256 × 512 pixels. The amplitude of g(x,y) in the input plane is normalized to have a maximum value of unity. We have simulated a binary JTC to verify the phase mask. Figure 30 shows the computer simulations of the correlation with the binary JTC. In all cases the input is that shown in fig. 29 - a picture of Andrew Jackson with a random phase code, exp[jM(x,y)], placed over it. In fig. 30a the reference mask is the same random phase pattern. In fig. 30b, the input image has a different phase mask multiplied with the image in fig. 29, that is, exp[jN(x,y)], where N(x,y) is not the authorized code. In this case, the reference image does not match the random phase mask placed on the twenty dollar bill image. It is
clear from these figures that the processor has verified the authentic phase mask and rejected the unauthorized mask. In conclusion, we have described a phase-encoding scheme that is virtually impossible to copy, and when combined with a biometric image such as an individual fingerprint or photograph, makes a foolproof security entry or identification system. The phase mask can be verified using a variety of system architectures. The phase mask can be made of a light-transmissive embossed plastic film, a bleached photographic film, or etched onto a glass or metallic reflector. The random phase modulating mask could be used alone, as for example on a tag of a manufactured object, for product authentication. We have presented a number of alternative methods to realize the basic idea. The combination used would ultimately be determined by the level of security required.
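The accept/reject behavior summarized above can be sketched with a simple one-dimensional linear correlation. This is only an illustration under stated assumptions and is not the binary JTC used for fig. 30: the primary pattern g is a hypothetical random amplitude sequence, M and N are independent uniformly distributed random phase codes, and the correlation is circular. All names are illustrative.

```python
import cmath
import random


def random_phase_mask(n, rng):
    """Phase-only mask exp[j*phi] with phi uniform on [0, 2*pi)."""
    return [cmath.exp(1j * rng.uniform(0, 2 * cmath.pi)) for _ in range(n)]


def correlation_peak(signal, reference):
    """Lag and magnitude of the largest circular cross-correlation sample."""
    n = len(signal)
    best_lag, best_mag = 0, 0.0
    for lag in range(n):
        acc = sum(signal[(m + lag) % n] * reference[m].conjugate()
                  for m in range(n))
        if abs(acc) > best_mag:
            best_lag, best_mag = lag, abs(acc)
    return best_lag, best_mag


rng = random.Random(7)
n = 256
g = [rng.uniform(0.2, 1.0) for _ in range(n)]   # hypothetical amplitude pattern
authorized = random_phase_mask(n, rng)          # exp[jM], the reference mask
counterfeit = random_phase_mask(n, rng)         # exp[jN], not the authorized code

card = [a * m for a, m in zip(g, authorized)]         # g * exp[jM]
fake_card = [a * m for a, m in zip(g, counterfeit)]   # g * exp[jN]

lag_ok, peak_ok = correlation_peak(card, authorized)
lag_bad, peak_bad = correlation_peak(fake_card, authorized)
print(lag_ok, peak_ok, peak_bad)
```

With the authorized mask the phases cancel at zero lag and the amplitudes add coherently, giving a pronounced peak. With a different code the products have random phases at every lag, so no sample rises far above the noise floor, and a threshold on the peak height separates the two cases.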
§ 9. Summary
We have discussed a number of Fourier-plane nonlinear techniques that can be used for pattern recognition. We have shown that nonlinear transformation of the joint power spectrum can result in very good correlation performance in terms of correlation peak size, peak-to-output-noise ratio, and sensitivity against similar objects. We investigated the correlation performance of binary joint transform correlators with unknown input-image light illumination for three different thresholding methods of the joint power spectrum. Two types of frequency-dependent threshold function have been analyzed: the sliding-window local-median thresholding method and the spatial-frequency dependent threshold method. We have shown that the binary JTC employing the latter method works better than the sliding-window local-median threshold method and can provide input-image illumination invariance. Computer simulations and experimental results show that for varying illumination, the variable threshold function and the sliding-window local-median thresholding perform well by providing large correlation peak intensity and large peak-to-noise ratio. We investigated the use of nonlinear techniques in the design of composite filters which enable these filter designs to be used in a nonlinear joint transform correlator. We find that the performance of these composite filters is improved substantially by applying the Fourier-plane nonlinear techniques. The main performance improvements gained from the incorporation of the Fourier-plane nonlinearity in our filter designs are a large increase in the peak intensity, a significant improvement in the peak-to-sidelobe ratio, and a substantial improvement in
the filter's discriminability. Additionally, the nonlinearities improve the SNR of the peak intensity value. We have presented several techniques to reduce the redundant and self-correlation terms of the joint transform correlator. These include the chirp-encoded technique, which places the reference and input images in separate planes. The other technique uses random phase encoding. The analysis and computer simulations show that these techniques can reduce the undesired output signals, including the DC terms, the spurious autocorrelations from multiple targets, and the higher-order harmonic terms. All of these take valuable space-bandwidth product and increase the system size. Since the undesirable output signals are reduced, the input image separation condition for the target and reference is relaxed. As an application we have described a phase-encoding scheme that is very difficult to copy, and when combined with an individual fingerprint or photograph, makes a foolproof security entry or identification system. The phase mask can be verified using a variety of system architectures. The phase mask can be made of a light-transmissive embossed plastic film, a bleached photographic film, or etched onto a glass or metallic reflector. Also, the random phase-modulating mask can be used alone on a manufactured item. We have presented a number of alternative methods to realize the basic idea. The combination used would ultimately be determined by the level of security required.
Acknowledgement
We acknowledge Dr. Wenlu Wang for his assistance in preparing this chapter. We also acknowledge the USAF Research Laboratory at Hanscom Air Force Base and the National Science Foundation for their support.
List of Symbols and Abbreviations
|·|    Magnitude
(·)⁻¹    Matrix inverse
⊗    Represents correlation
A    An arbitrary matrix
A_ij    The element of the ith row and jth column of matrix A
a    Input image illumination coefficient
C_uk    A constant related with u and k
C
A column vector containing the desired correlation peak for each training image Complex conjugate of c Correlation output at the location of ( x , y ) Side lobe intensity outside the window w(R2) Charge-coupled device Distance between any two of the targets s, and sm Distance between the reference image and any one of the targets si in the input scene
d
Pulse width of the binarized function g(a, /3)
dr
Distance between the reference signal and FTLl
ds DC terms
Distance between the input signal and FTLl The sum of the autocorrelation of the input scene and the autocorrelation of the reference signal that is formed on the optical axis Discrimination ratio Expected value Joint power spectrum A threshold value = median(hist[R2(a)(a2
+ 2a cos(2xoa) + l)]}
=(l/a*)~T,~
A threshold function
Equal-correlation-peak Focal length of the transform lens Airplane used as the non-target object to be rejected Fourier transform Fourier transform lens 1 Fourier transform lens 2 Fourier transform of g(E) The nonlinear characteristic of a nonlinear device The output of the nonlinear device with nonlinearity k The first-order harmonic term of gk(a, B)
The amplitude modulation of each harmonic term of the expansion of g(a,0) Composite filter A column vector obtained by lexicographic scanning of the composite filter
I;
Fourier transform of h
hist[.]
Histogram Expected value of the correlation peak intensity
I0
Ju
Identification Image lenses 1 and 2 Bessel function of the first kind, order u
JTC
Joint transform correlator
KO k LCLV
Coefficient of the Fourier series of g(a,B) The severity of the nonlinearity Liquid crystal light valve
M
Number of training images
MkY)
Random code used to generate the reference phase mask Random phase-only function
ID IL1, IL2
Random phase-only function = M;(a, PI M2(a,PI Fourier transform of Md(a,P) Mean of background color noise Median value
Airplane used as the training target to be recognized Modulated joint power spectrum Modulation transfer function Total number of pixels in w(R2)
PNR POE PSR
Random code used to generate the unauthorized phase mask Peak-to-noise ratio Peak-to-output-energy ratio Peak-to-sidelobe ratio
Optical Fourier transforms of the input signal s(x,y ) Autocorrelation of the reference image Cross-correlation of the reference and input targets Cross-correlation between different targets Autocorrelations of the input targets Fourier transform of the reference signal r(x,y), with
@ ~ (B) a ,the corresponding Fourier phase Autocorrelation of the input signal
Autocorrelation of the reference signal Cross-correlation of the input signal and the reference signal Correlation peak intensity located at (x0,yo) Reference signal located at ( X O , 0) in the input plane Reference image located at (x0,yo) in the input plane Training data matrix with si as its ith column Complex-conjugate transpose of S Fourier transform of S Optical Fourier transform of the reference signal r(x,y)
si
Fourier transform of the input signal s(x,y), with @s(a,B) the corresponding Fourier phase Fourier Transform of the input signal si(x,y), with @si( a,@) the corresponding Fourier phase A column vector obtained by lexicographic scanning of the ith training image Input signal put at (-xo, 0) in the input plane The ith target located at (xi,yi)at the input plane of the N targets Spatial light modulator Signal-to-noise ratio Input transmittance function
A threshold value for binarizing the joint power spectrum Variance An arbitrary function Neighborhood of the correlation peak in the output plane Distance of term s(x’,y’) @ r*(x’,y’)from the optical axis
VT
Distance of term s*(x’,y’) @ r(x’,y’) from the optical axis
z
Distance between the output plane and FTL2
20
Distance between the DC terms plane to FTL2
21,
z2
Distance between the cross-correlation term plane to FTL2 Point on a-axis where the envelopes (a - 1)2R2(a) reach E
E amax
Point on a-axis where the envelopes ( a + 1)2R2(a) reach E Spatial frequency coordinates A constant related with order LJ Two independent white Gaussian processes = $rn2(a,B)-$mi(a,B)
Wavelength of the illuminating coherent light Standard deviation of additive white noise Standard deviation of background color noise
Appendix A. Performance metrics
To evaluate correlation performance we use a number of criteria. These criteria include: correlation peak intensity, peak-to-output-energy ratio, peak-to-sidelobe ratio, peak-to-noise ratio (Horner [1992]), discrimination ratio, and signal-to-noise ratio of the correlation peak intensity (Javidi and Horner [1994b], Horner [1992]). In what follows, the notations |·| and E{·} are used to denote the magnitude and the expected value, respectively.
The correlation output peak intensity is defined as the maximum peak intensity, as it appears at the output when the input signal contains no noise. That is,
where R²(x₀,y₀) is the output correlation peak intensity located at (x₀,y₀), and c(x,y) is the correlation output at the location (x,y). Ideally the peak should appear at (x₀,y₀) = (0,0). The value of c(0,0) is used as a constraint and is forced to be a constant in the composite filter formulation of eq. (4.1). When the input signal is corrupted by noise, the correlation peak intensity I₀ is defined as the expected value of the correlation peak intensity at (x₀,y₀):

$$I_0 = E\{R^2(x_0,y_0)\}.$$

One measure of correlation peak sharpness and correlation efficiency is the peak-to-output-energy ratio (POE). POE is defined as the ratio of the expected value of the correlation peak intensity to the expected value of the spatial average of the output energy:

$$\mathrm{POE} = \frac{E\{R^2(x_0,y_0)\}}{E\{\overline{|c(x,y)|^2}\}},$$
where c(x,y) is the correlation output. The overbar symbol in the denominator denotes the normalized integration (spatial averaging) over (x,y). A correlation POE larger than unity does not guarantee a successful detection of the target. The target is detected successfully when the correlation peak is larger than every other pixel in the output plane. PSR, the peak-to-sidelobe ratio, also known as the peak-to-secondary ratio, is a suitable criterion that shows how well the target can be detected in the presence of noise and/or other objects. The PSR is defined to be the expected value of the ratio of the correlation-peak intensity to the maximum sidelobe intensity in the output plane:

$$\mathrm{PSR} = E\left\{\frac{R^2(x_0,y_0)}{\max_{(x,y)\notin w(R^2)} |c(x,y)|^2}\right\},$$

where |c(x,y)|² for (x,y) outside the window w(R²) is the sidelobe intensity, and w(R²) represents the neighborhood of the correlation peak in the output plane.
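The peak intensity, POE and PSR defined above can be evaluated on a sampled correlation output as in the following sketch. It assumes a single realization, so the expected values are replaced by the values of that one output, and it takes the window w(R²) to be a square of hypothetical half-width `window` pixels around the peak. The function name and the small test array are illustrative only.

```python
def peak_metrics(output, window=1):
    """Peak intensity, POE and PSR of a 2-D correlation output c(x, y).

    output: list of rows of real or complex values.
    window: half-width (in pixels) of the square neighborhood of the peak
            that is excluded when searching for the largest sidelobe.
    """
    intensity = [[abs(v) ** 2 for v in row] for row in output]
    rows, cols = len(intensity), len(intensity[0])
    peak, y0, x0 = max((intensity[y][x], y, x)
                       for y in range(rows) for x in range(cols))
    mean_energy = sum(map(sum, intensity)) / (rows * cols)
    sidelobes = [intensity[y][x]
                 for y in range(rows) for x in range(cols)
                 if abs(y - y0) > window or abs(x - x0) > window]
    return {
        "peak": peak,
        "location": (x0, y0),
        "POE": peak / mean_energy,
        "PSR": peak / max(sidelobes),
    }


# Hypothetical 4 x 4 correlation output with one peak and one sidelobe.
c = [
    [1, 1, 1, 1],
    [1, 1, 4, 1],
    [1, 1, 1, 1],
    [2, 1, 1, 1],
]
metrics = peak_metrics(c)
print(metrics)
```

To approximate the expected values in the definitions, the same function can be applied to many noisy realizations and the numerators and denominators (for POE) or the ratios (for PSR) averaged accordingly.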
To evaluate the robustness of the correlation against noise, we need to measure the variability of the correlation peak in response to the noise in the input scene. The signal-to-noise ratio (SNR) of the output correlation peak in the presence of noise in the input scene can be defined as (Javidi and Horner [1994b])
where
is the variance of the output correlation peak intensity due to the input noise. The peak-to-noise ratio (PNR) is defined as the ratio of the correlation peak intensity to the average noise intensity (Horner [ 19921):
where $N'$ is the total number of pixels in $w(R^2)$. The discrimination capability of a filter refers to the filter's ability to correctly distinguish true target images from non-target images. For this purpose, we desire a strong output correlation peak in response to the target, and a small correlation output in response to other signals. This criterion is evaluated by the discrimination ratio (DR). DR is defined as the ratio of the correlation peak intensity of the target to the correlation peak intensity of any non-target:

$$\mathrm{DR} = \frac{I_{0,\mathrm{target}}}{I_{0,\mathrm{non\text{-}target}}} = \frac{E\{R^2_{\mathrm{target}}\}}{E\{R^2_{\mathrm{non\text{-}target}}\}}.$$
A larger DR means that the composite filter has better discrimination against other non-target objects.
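The metrics of this appendix can be illustrated on a sampled correlation plane. The following is a minimal sketch, assuming a single noise-free realization, so the expected values in the definitions are replaced by the values of that one realization. The function names, the square window of half-width `half_width` standing in for $w(R^2)$, and the sample arrays are assumptions made for illustration and do not come from the text.

```python
def peak_intensity(c):
    """Return (|c|^2 at the peak, (row, col)) for a 2-D correlation output c."""
    best, loc = -1.0, (0, 0)
    for i, row in enumerate(c):
        for j, value in enumerate(row):
            intensity = abs(value) ** 2
            if intensity > best:
                best, loc = intensity, (i, j)
    return best, loc


def poe(c):
    """Peak-to-output-energy: peak intensity over the spatial average of |c|^2."""
    peak, _ = peak_intensity(c)
    n_pixels = sum(len(row) for row in c)
    mean_energy = sum(abs(v) ** 2 for row in c for v in row) / n_pixels
    return peak / mean_energy


def psr(c, half_width=1):
    """Peak-to-sidelobe ratio; sidelobes are the pixels outside a square
    window of the given half-width centred on the peak (an assumed window)."""
    peak, (pi, pj) = peak_intensity(c)
    sidelobes = [abs(v) ** 2
                 for i, row in enumerate(c)
                 for j, v in enumerate(row)
                 if abs(i - pi) > half_width or abs(j - pj) > half_width]
    return peak / max(sidelobes)


def discrimination_ratio(c_target, c_non_target):
    """DR: target peak intensity over non-target peak intensity."""
    return peak_intensity(c_target)[0] / peak_intensity(c_non_target)[0]


# Hypothetical correlation outputs (amplitudes), for illustration only.
c_target = [
    [0.0, 1.0, 0.0, 1.0, 0.0],
    [1.0, 0.5, 0.5, 0.5, 1.0],
    [0.0, 0.5, 4.0, 0.5, 0.0],
    [1.0, 0.5, 0.5, 0.5, 1.0],
    [0.0, 1.0, 0.0, 2.0, 0.0],
]
c_non_target = [
    [0.0, 1.0, 0.0],
    [1.0, 2.0, 1.0],
    [0.0, 1.0, 0.0],
]

print("POE:", poe(c_target))
print("PSR:", psr(c_target))
print("DR :", discrimination_ratio(c_target, c_non_target))
```

In the presence of noise, each quantity would have to be averaged over many realizations of the input scene to approximate the expected values used in the definitions.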
Appendix B. Frequency-dependent Threshold Function Methods

In this appendix, we describe three thresholding methods: the global spatial-frequency dependent threshold function method (Javidi, Wang and Tang [1991]),
the constant median thresholding method, and a sliding-window local-median thresholding method (Javidi and Wang [1991]). Given unity illumination over the reference image, and uniform illumination “a” over the input image, the input transmittance function is

$$t(x,y) = r(x - x_0, y) + a\,s(x + x_0, y). \tag{B1}$$
The joint power spectrum is
$$E(\alpha,\beta) = R^2(\alpha,\beta) + a^2S^2(\alpha,\beta) + 2aR(\alpha,\beta)S(\alpha,\beta)\cos[2x_0\alpha + \phi_S(\alpha,\beta) - \phi_R(\alpha,\beta)], \tag{B2}$$
where $\alpha$ and $\beta$ are the angular spatial frequency coordinates, $R(\alpha,\beta)$ and $S(\alpha,\beta)$ are the amplitude spectra, and $\phi_R(\alpha,\beta)$ and $\phi_S(\alpha,\beta)$ are the phases of the Fourier transforms of $r(x,y)$ and $s(x,y)$, respectively. In the binary JTC, the joint power spectrum is thresholded before the inverse Fourier transform operation is applied (Javidi [1989b]). For a binary nonlinearity with threshold value $E_T$, the binarized joint power spectrum can be written as

$$g(\alpha,\beta) = \begin{cases} 1 & \text{for } E(\alpha,\beta) \geq E_T, \\ 0 & \text{for } E(\alpha,\beta) < E_T, \end{cases} \tag{B3}$$

where $E_T$ can be a function of $(\alpha,\beta)$. The thresholded joint power spectrum can be considered as an infinite sum of harmonic terms (Javidi [1989b], Vander Lugt and Rotz [1970]). The first-order harmonic term generates the first-order cross-correlation between the reference and the input image. The first-order cross-correlation signal is diffracted to $2x_0$, and the $v$th-order cross-correlation signals are diffracted to $2vx_0$, where $v$ is an integer ($v > 1$). The Fourier component of transmittance that generates the first-order cross-correlation signal can be written as (Javidi [1989b])
for
where the subscript “1” denotes the first-order component. The Fourier components $g_v(\alpha,\beta)$ ($v > 1$) generate higher-order correlation signals at the output plane. When the reference and the input scene are placed
sufficiently far from each other in the input plane, the high-order harmonic correlation terms and the zero-order DC term are diffracted far from the first-order correlation terms. The analysis, based on the low-pass signal and noise models, shows that the binary JTC exhibits the best correlation performance among the class of nonlinear JTCs in terms of the output peak-to-noise ratio when the noise bandwidth is smaller than that of the target signal (Javidi, Wang and Fazlollahi [1994]).

B.1. SPATIAL-FREQUENCY DEPENDENT THRESHOLD FUNCTIONS
We first investigate the spatial-frequency dependent threshold function. It follows from eq. (B4) that the largest correlation peak is obtained when $E_T$ is (Javidi [1989b], Javidi, Wang and Tang [1991])

$$E_T(\alpha,\beta) = R^2(\alpha,\beta) + a^2S^2(\alpha,\beta). \tag{B5}$$
When the threshold function $E_T(\alpha,\beta) = R^2(\alpha,\beta) + a^2S^2(\alpha,\beta)$ is used for binarization of the joint power spectrum, the Fourier magnitudes of the reference image and the input image are removed from $g_{1,a}(\alpha,\beta)$. Thus, the Fourier component of the transmittance function that generates the first-order correlation signal is

$$g_{1,a}(\alpha,\beta) = \frac{1}{\pi}\exp[j\phi_S(\alpha,\beta)]\exp[-j\phi_R(\alpha,\beta)]\exp(j2x_0\alpha). \tag{B6}$$
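The effect of the threshold function $E_T = R^2 + a^2S^2$ can be checked numerically: subtracting it from the joint power spectrum of eq. (B2) leaves only the cross term $2aRS\cos[\cdot]$, whose sign does not depend on $a$ when $a > 0$. The following one-dimensional sketch illustrates this. The amplitude spectra, the phase difference, the sampling and the function names are hypothetical choices made for the illustration, not quantities from the text.

```python
import math


def joint_power_spectrum(R, S, phase_diff, x0, alpha, a):
    """One-dimensional form of eq. (B2) at a single frequency alpha."""
    return R * R + a * a * S * S + 2 * a * R * S * math.cos(2 * x0 * alpha + phase_diff)


def binarized_spectrum(a, x0=5.0, n=64):
    """Binarize the joint power spectrum with E_T = R^2 + a^2 S^2."""
    bits = []
    for k in range(n):
        alpha = 0.1 * k
        R = math.exp(-0.05 * alpha)        # hypothetical reference amplitude spectrum
        S = 0.8 * math.exp(-0.03 * alpha)  # hypothetical input amplitude spectrum
        phase_diff = 0.3 * alpha           # hypothetical phi_S - phi_R
        E = joint_power_spectrum(R, S, phase_diff, x0, alpha, a)
        E_T = R * R + a * a * S * S        # frequency-dependent threshold
        bits.append(1 if E >= E_T else 0)
    return bits


def cosine_sign_pattern(x0=5.0, n=64):
    """Reference pattern: 1 where the cosine of the cross term is non-negative."""
    return [1 if math.cos(2 * x0 * (0.1 * k) + 0.3 * (0.1 * k)) >= 0 else 0
            for k in range(n)]


dim = binarized_spectrum(a=0.2)
bright = binarized_spectrum(a=3.0)
print("same binary pattern for a = 0.2 and a = 3.0:", dim == bright)
```

The binarized pattern depends only on the phase term, which is the property that makes the resulting correlation signal insensitive to the illumination of the input image.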
It can be seen from eq. (B6) that the correlation signal is independent of the illumination “a”. An additional advantage of using the threshold function is that the DC terms and the intermodulation terms in the joint power spectrum are eliminated. This relaxes the requirement on input image separation in the input plane of the system (Javidi, Wang and Tang [1991]).

B.2. SLIDING-WINDOW LOCAL-MEDIAN THRESHOLDING
We consider the sliding-window local-median thresholding to investigate the illumination sensitivity of the binary JTC. We will show that this technique can improve the performance of the binary JTC and can reduce its sensitivity to the illumination “a”. As discussed above, the binary JTC using the frequency-dependent threshold function for binarization is invariant to input-image illumination, but requires computation of $E_T = R^2(\alpha,\beta) + a^2S^2(\alpha,\beta)$. We show here that
the median thresholding method is sensitive to illumination variations. The sliding-window local-median thresholding technique uses the median of the local histogram of pixel values of the joint power spectrum inside a window around each pixel to binarize that pixel value. This is a spatial-frequency dependent threshold that adapts to the changes of the amplitude spectrum and provides some illumination invariance, because the threshold value obtained with this method mimics the threshold function $E_T(\alpha,\beta) = R^2(\alpha,\beta) + a^2S^2(\alpha,\beta)$. Implementation of this threshold is simple owing to the small window size. The size of the window in this technique is selected according to the maximum and minimum possible distances between the reference image and the target in the input image. The sliding-window local-median thresholding technique is easier to implement than either the frequency-dependent threshold function method or the global median thresholding, and does not require an assessment of “a”, the illumination constant. The computer simulations and the experimental results show that the binary JTC using the sliding-window local-median thresholding technique performs well in detecting a target with unknown input image illumination.

B.3. GLOBAL MEDIAN THRESHOLDING
We discuss the variation of the correlation peak intensity with changes in input-image illumination using constant median thresholding for binarization. We show mathematically that the correlation peak in a binary JTC using median thresholding decreases when the illumination “a” departs from unity. We use one-dimensional notation for simplicity. The input transmittance function is $t(x) = r(x - x_0) + a\,s(x + x_0)$. If we assume $s(x) = r(x)$, i.e. autocorrelation, the binarized joint power spectrum that generates the first-order correlation is (see eq. B4)
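$$g_{1,a}(\alpha) = \frac{2}{\pi}\sqrt{1 - \left[\frac{(1 + a^2)R^2(\alpha) - E_{T,a}}{2aR^2(\alpha)}\right]^2}\,\cos(2x_0\alpha), \tag{B7}$$

where the constant threshold value $E_{T,a}$ is defined as

$$E_{T,a} = \mathrm{median}\{\mathrm{hist}[R^2(\alpha)(a^2 + 2a\cos(2x_0\alpha) + 1)]\}, \tag{B8}$$

where “hist” is the histogram of the bracketed quantity. When $a$ becomes 1, eq. (B7) becomes

$$g_{1,1}(\alpha) = \frac{2}{\pi}\sqrt{1 - \left[\frac{2R^2(\alpha) - E_{T,1}}{2R^2(\alpha)}\right]^2}\,\cos(2x_0\alpha), \tag{B9}$$

where

$$E_{T,1} = \mathrm{median}\{\mathrm{hist}[2R^2(\alpha)(1 + \cos(2x_0\alpha))]\}. \tag{B10}$$

Assuming that $R^2(\alpha)$ is slowly varying compared to $\cos(2x_0\alpha)$, the histogram of the joint power spectrum $E(\alpha) = R^2(\alpha)[a^2 + 2a\cos(2x_0\alpha) + 1]$ can be written as (Javidi, Li, Fazlollahi and Horner [1995])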
$$h(E) = \frac{1}{\pi}\int_{\alpha'_{E,a}}^{\alpha''_{E,a}} \frac{d\alpha}{\sqrt{-E^2 + 2(a^2 + 1)R^2(\alpha)E - (a^2 - 1)^2R^4(\alpha)}} \quad \text{for any } a. \tag{B11}$$

In eq. (B11), $\alpha'_{E,a}$ and $\alpha''_{E,a}$ are the points on the $\alpha$-axis where the envelopes $(a - 1)^2R^2(\alpha)$ and $(a + 1)^2R^2(\alpha)$ reach $E$, respectively. Equation (B11) is the mathematical expression for the histogram of the joint power spectrum when the input image $s(x) = r(x)$ has illumination $a$. The median $E_{T,a}$ is a value that satisfies
$$\int_0^{E_{T,a}} h(E)\,dE = \int_{E_{T,a}}^{E_{\max}} h(E)\,dE, \tag{B12}$$
with $E_{\max}$ the maximum value of the joint power spectrum $E$. We have shown elsewhere (Javidi, Li, Fazlollahi and Horner [1995]) how $E_{T,a}$ is related to $R^2(\alpha)$. We have also shown that with some assumptions on $R^2(\alpha)$ we can find an upper bound and a lower bound for $E_{T,a}$ in terms of $E_{T,1}$ as:
$E_{T,a}$ will lie somewhere between the bounds when “a” changes. Notice that the two bounds are tangent at $a = 1$. From eq. (B7) and the upper bound in eq. (B13), we have
(g) 2
Since is concave and reaches its minimum at a = 1, we can conclude that the first-order correlation peak reaches its maximum at a = 1 and
decreases when “a” is increased or decreased. The median values of the joint power spectrum $E(\alpha) = R^2(\alpha)[a^2 + 2a\cos(2x_0\alpha) + 1]$ for $a$ and $1/a$ are related as $E_{T,1/a} = (1/a^2)E_{T,a}$. Using this in eq. (B7) we can also conclude that the correlation peaks for $a$ and $1/a$ are the same.
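The two median-based schemes of Sections B.2 and B.3 can be compared on a sampled one-dimensional joint power spectrum. The sketch below is only an illustration under stated assumptions: the envelope $R^2(\alpha)$, the separation `x0`, the sampling step and the window half-width are hypothetical values, and the function names are not from the text. It also checks numerically the relation $E_{T,1/a} = (1/a^2)E_{T,a}$ stated above.

```python
import math
from statistics import median


def joint_power_spectrum(a, x0=20.0, n=512, d_alpha=0.01):
    """Samples of E(alpha) = R^2(alpha)[a^2 + 2a cos(2 x0 alpha) + 1] for s(x) = r(x)."""
    samples = []
    for k in range(n):
        alpha = k * d_alpha
        r2 = math.exp(-alpha)  # hypothetical, slowly varying R^2(alpha)
        samples.append(r2 * (a * a + 2 * a * math.cos(2 * x0 * alpha) + 1))
    return samples


def global_median_binarize(E):
    """Constant threshold: the median of the whole joint power spectrum."""
    threshold = median(E)
    return [1 if e >= threshold else 0 for e in E]


def local_median_binarize(E, half_window=8):
    """Sliding-window threshold: the median of the samples around each pixel."""
    bits = []
    for k, e in enumerate(E):
        lo = max(0, k - half_window)
        hi = min(len(E), k + half_window + 1)
        bits.append(1 if e >= median(E[lo:hi]) else 0)
    return bits


a = 2.0
E_a = joint_power_spectrum(a)
E_inv = joint_power_spectrum(1.0 / a)

# Medians for a and 1/a differ by the factor 1/a^2.
print(median(E_inv), median(E_a) / a ** 2)

global_bits = global_median_binarize(E_a)
local_bits = local_median_binarize(E_a)
print("ones (global median):", sum(global_bits))
print("ones (local median): ", sum(local_bits))
```

The window of 17 samples used here spans roughly one fringe period of $\cos(2x_0\alpha)$ for the chosen sampling, which is one way to relate the window size to the separation between reference and target.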
References

Davenport, W.B., and W.L. Root, 1958, An Introduction to the Theory of Random Signal and Noise (McGraw-Hill, New York).
Fielding, K.H., and J.L. Horner, 1990, 1-f binary joint transform correlator, Opt. Eng. 29, 1081.
Flannery, D.L., and J.L. Horner, 1989, Fourier optical signal processors, Proc. IEEE 77, 1511.
Flavin, M., and J.L. Horner, 1989, Amplitude encoded phase-only filters, Appl. Opt. 28, 1692.
Goodman, J.W., 1996, Introduction to Fourier Optics (McGraw-Hill, New York).
Hahn, W.B., and D.L. Flannery, 1992, Design elements of binary joint transform correlation and selected optimization techniques, Opt. Eng. 31, 896.
Hester, C.F., and D. Casasent, 1980, Multivariant technique for multiclass pattern recognition, Appl. Opt. 19, 1758.
Horner, J.L., 1992, Metrics for assessing pattern-recognition performance, Appl. Opt. 31, 165.
Horner, J.L., and P.D. Gianino, 1984, Phase-only matched filtering, Appl. Opt. 23, 812.
Horner, J.L., B. Javidi and G. Zhang, 1994, Analysis of method to eliminate undesired responses in a binary phase-only filter, Opt. Eng. 33, 1774.
Hsu, Y.N., and H.H. Arsenault, 1982, Optical character recognition using circular harmonic expansion, Appl. Opt. 21, 4016.
Javidi, B., 1989a, Synthetic discriminant function based nonlinear correlation, Appl. Opt. 28, 2490.
Javidi, B., 1989b, Nonlinear joint power spectrum based optical correlation, Appl. Opt. 28, 2358.
Javidi, B., 1990, Comparison of nonlinear joint transform correlator and nonlinearly transformed matched filter based correlators for noisy input scenes, Opt. Eng. 29, 1013.
Javidi, B., 1997a, Securing information with optical technologies, Phys. Today 50, 27.
Javidi, B., 1997b, Optical information processing for encryption and security systems, Opt. & Photonics News 8, 28.
Javidi, B., and J.L. Horner, 1989a, Single SLM joint transform optical correlator, Appl. Opt. 28, 1027.
Javidi, B., and J.L. Horner, 1989b, Multifunction nonlinear signal processor: deconvolution and correlation, Opt. Eng. 28, 837.
Javidi, B., and J.L. Horner, 1994a, Optical pattern recognition for validation and security verification, Opt. Eng. 33, 1752.
Javidi, B., and J.L. Horner, 1994b, Real-Time Optical Information Processing (Academic Press, San Diego, CA) p. 29.
Javidi, B., and C. Kuo, 1988, Joint transform image correlation using a binary spatial light modulation at the Fourier plane, Appl. Opt. 27, 663.
Javidi, B., J. Li, A.H. Fazlollahi and J.L. Horner, 1995, Binary nonlinear joint transform correlator performance with different thresholding methods under unknown illumination conditions, Appl. Opt. 34, 886.
Javidi, B., and D. Painchaud, 1996, Distortion-invariant pattern recognition with Fourier-plane nonlinear filters, Appl. Opt. 35, 318.
Javidi, B., and Q. Tang, 1994, Single input plane chirped encoded joint transform correlator, Appl. Opt. 33, 227.
Javidi, B., Q. Tang, D. Gregory and T.D. Huson, 1991, Experiments on nonlinear joint transform correlators using an optically addressed SLM in the Fourier plane, Appl. Opt. 30, 1772.
Javidi, B., Q. Tang, G. Zhang and F. Parchekani, 1994, Image classification using a chirp encoded optical processor, Appl. Opt. 33, 6219.
Javidi, B., and J. Wang, 1991, Binary nonlinear joint transform correlation with median and subset median thresholding, Appl. Opt. 30, 967.
Javidi, B., and J. Wang, 1992, Limitation of the classical definition of the correlation signal-to-noise ratio in optical pattern recognition with disjoint signal and scene noise, Appl. Opt. 31, 6826.
Javidi, B., and J. Wang, 1995, Distortion-invariant filter for detecting a noisy distorted target in nonoverlapping background noise, J. Opt. Soc. Am. 12, 2604.
Javidi, B., J. Wang and A.H. Fazlollahi, 1994, Performance of the nonlinear joint transform correlator for images with low-pass characteristics, Appl. Opt. 33, 834.
Javidi, B., J. Wang and Q. Tang, 1991, Multiple objects binary joint transform correlation using multiple level thresholding crossing, Appl. Opt. 30, 4234.
Javidi, B., W. Wang and G. Zhang, 1997, Composite Fourier-plane nonlinear filter for distortion invariant pattern recognition, Opt. Eng. 36, 2690.
Kozma, A., 1966, Photographic recording of spatially modulated coherent light, J. Opt. Soc. Am. 56, 428.
Mahalanobis, A., B.V.K.V. Kumar and D. Casasent, 1987, Minimum average correlation energy filters, Appl. Opt. 26, 2633.
Marom, E., 1993, Error diffusion binarization for joint transform correlators, Appl. Opt. 32, 707.
Oppenheim, A.V., and J.S. Lim, 1981, The importance of phase in signals, Proc. IEEE 69(5), 529.
Papoulis, A., 1984, Probability, Random Variables and Stochastic Process (McGraw-Hill, New York).
Refregier, Ph., 1990, Filter design for optical pattern recognition: multicriteria optimization approach, Opt. Lett. 15, 854.
Refregier, Ph., 1991a, Optimal trade-off filters for noise robustness, sharpness of the correlation peak, and Horner efficiency, Opt. Lett. 16, 829.
Refregier, Ph., 1991b, Optical pattern recognition: Optimal trade-off circular harmonic filters, Opt. Commun. 86, 113.
Refregier, Ph., V. Laude and B. Javidi, 1995, Basic properties of nonlinear global filtering techniques and optimal discriminant solutions, Appl. Opt. 34, 3915.
Rogers, S.K., J.D. Nine, M. Kabrisky and J.P. Mills, 1990, New binarization techniques for joint transform correlator, Opt. Eng. 29, 1088.
Schils, G.F., and D.W. Sweeney, 1988, Optical processor for recognition of three-dimensional targets viewed from any direction, J. Opt. Soc. Am. 5, 1309.
Tang, Q., and B. Javidi, 1991, Binary encoding of grayscale nonlinear joint transform correlators, Appl. Opt. 30, 1321.
Tang, Q., and B. Javidi, 1992, Sensitivity of the nonlinear joint transform correlators: experimental investigation, Appl. Opt. 31, 4016.
Tang, Q., and B. Javidi, 1993a, A technique for reducing the redundant and self correlation terms in joint transform correlators, Appl. Opt. 32, 1911.
Tang, Q., and B. Javidi, 1993b, Multiple object detection with a chirp encoded joint transform correlator, Appl. Opt. 32, 5079.
Vander Lugt, A., 1964, Signal detection by complex filtering, IEEE Trans. Inform. Theory IT-10, 139.
Vander Lugt, A., and F.B. Rotz, 1970, The use of film nonlinearities in optical spatial filtering, Appl. Opt. 9, 215.
Weaver, C.S., and J.W. Goodman, 1966, Technique for optically convolving two functions, Appl. Opt. 5, 1248.
Zhang, G., 1995, Signal detection by optical correlators with phase encoding at Fourier domain, Ph.D. Thesis (The University of Connecticut, Storrs, CT).
Zhang, G., and B. Javidi, 1993, Random phase modulation techniques for optical pattern recognition, in: Proc. IEEE Lasers and Electro-Optics Soc. Annu. Meeting, San Jose, 1993 (IEEE, Piscataway, NJ) p. 57.
E. WOLF, PROGRESS IN OPTICS XXXVIII © 1998 ELSEVIER SCIENCE B.V. ALL RIGHTS RESERVED
VI

FREE-SPACE OPTICAL DIGITAL COMPUTING AND INTERCONNECTION

BY

JÜRGEN JAHNS

FernUniversität Hagen, Optische Nachrichtentechnik, Feithstrasse 140, 58084 Hagen, Germany
CONTENTS

§ 1. FACETS
§ 2. SYSTEM MODEL AND COMPUTATIONAL ASPECTS
§ 3. NONLINEAR OPTICAL DEVICES
§ 4. OPTICAL INTERCONNECTIONS
§ 5. ARCHITECTURES AND SYSTEMS
§ 6. CONCLUSION AND OUTLOOK
ACKNOWLEDGEMENT
REFERENCES
§ 1. Facets
1.1. A LOOK BACK
From the beginning of the technical age, marked by the invention of the telephone in the middle of the nineteenth century, the three areas of information technology (i.e., communications, processing, and storage) were at first covered entirely by electric (or electronic) techniques. In 1876, Alexander Graham Bell received a patent for the telephone. There had been several forerunners, most notably Charles Bourseul in France (around 1830) and Philipp Reis in Germany (around 1860). Bell also invented an “optical telephone”, called the “photophone”, a couple of years later and patented it in 1880 (Bell [1880]). The photophone consists of a membrane that is excited by the acoustic waves of the speech signal. The membrane modulates a light beam which is reflected off the membrane and then propagates through free space to a detector (fig. 1.1). Bell considered the invention of the photophone to be even more important than the telephone. However, he could not get it to work at a satisfactory level. One problem was caused by the unguided propagation of the light signal through free space, which limited the transmission distance. This issue is connected to another problem, namely the non-availability of a suitable light source or, in other words, the lack of technology. In his demonstration experiments, Bell used light rays from the sun. It took around 80 years until, with the laser, a light source was invented that would have helped Bell with the photophone. Nonetheless, one may consider the photophone as the beginning of modern optical communications.

Fig. 1.1. Principle of the photophone invented by A.G. Bell: a light beam is modulated by a membrane. The membrane itself is driven by the sound waves of the speech signal.

The demonstration of the laser (Maiman [1960]) was the key invention for optical information technology, and triggered many activities in areas like fiber optic communications, optical data storage, holography, and analog optical information processing. As is well known, fiber optic communications and optical data storage based on compact disk technology have since developed into standard technologies, both making use of digital data formats. The preference for digital over analog techniques has a variety of reasons. Digital signals have a better immunity to noise and therefore provide better quality, in general. Furthermore, digital data formats offer a great deal of flexibility in modifying, exchanging, combining, and storing information. It is less well known that by the early 1960s, scientists started to investigate several nonlinear effects in lasers to demonstrate the feasibility of digital optical logic, as described, e.g., by Basov, Culver and Shah [1972]. A few early considerations on the use of optics for digital computing even date back to the 1950s (Ganzhorn, Schweitzer and Kulcke [1959]). Since those early days, digital optical computing has been going in waves. Comparisons of optical and electronic logic devices put a damper on the field, since it was predicted that thermal problems would be more severe for optics (Keyes and Armstrong [1969]). The digital optical computing field became quiescent until the mid-1970s, when the residue number system was discovered for optics (Huang, Tsunoda, Goodman and Ishihara [1979]). The analogy between the cyclical nature of modulo arithmetic and some optical phenomena (e.g., the phase of a light wave) spurred many activities.
Most of these came to a gradual end within a few years, since there were still no suitable “optical” devices to work with. However, working hybrid opto-electronic residue processors were demonstrated some ten years later, for example, by Falk, Capps and Houk [1988] and Goutzoulis, Malarkey, Davies, Bradley and Beaudet [1988]. In the late 1970s optical bistability was demonstrated in semiconductors (Miller, Smith and Johnston [1979], Gibbs, McCall, Venkatesan, Gossard, Passner and Wiegmann [1979]). Advances in semiconductor technology allowed the fabrication of practical nonlinear optoelectronic devices, e.g., the SEED device (Miller [1987]). What followed was a phase of intensive research, with major programs around the world including both industrial and academic laboratories. The most comprehensive effort was probably the program at AT&T Bell Laboratories, which covered all areas, including devices, interconnections, systems, and architectures. Out of those efforts resulted several demonstrations of digital optical processing systems. As a beneficial side effect, a number of significant advances were made in such areas as device technology, micro-optics, and system architectures. Since then, many of these advances have “cross-fertilized” other areas, such as the use of optical interconnections for VLSI systems, which is now the area that is most actively pursued in digital free-space optics.

The processing of information, or simply computing, still remains a domain of electronics, with optics covering only a few niche applications until now. Nonetheless, there have been significant efforts in both analog and digital optical computing over the past 30 years (table 1.1). What is the potential of optics, in particular in the area of digital computing, and what has been achieved? This is the question we shall consider in this chapter.

Table 1.1
Evolution of digital optical computing

1965    Optical logic operations
1975    Optical residue arithmetic
1985    Optical bistability, array optics
1995    Optical interconnections

1.2. WHAT'S IN A NAME?
“Optical computing” or “digital optics” or “optics in computing” - various names have been used; however, the meaning is always the same. Since the early 1960s, there has been interest in the question of whether optical signals could replace or complement electronics in computing. Digital computing is a complex task. It comprises all three areas of information technology: processing, communications, and storage (although other terms are more common in the computer world: logic, interconnections, and memory). The fundamental parts of a digital computer are the processor, the memory, and the bus used for processor-memory communication and data input/output. The processor typically contains combinational logic to perform arithmetic operations as well as registers to store intermediate results. Data are being exchanged constantly between the processor and the main memory over the bus that connects the various parts of a computer (fig. 1.2). The memory is often divided into two functional units, the main memory and the cache. The main memory holds all the relevant data required for a specific task. Memory chips currently hold 16 or 64 megabytes of information and have access times of more than 100 ns. The cache is a small, fast memory, located physically on the same chip as the processor to facilitate communications between them. The cache is used to store data which are frequently accessed by the processor. Its capacity is typically less than 100 kilobytes with an access time of less than 10 ns. Since data are often routed repeatedly between storage units and the logic unit, feedback loops are usually an inherent part of computing systems.

Fig. 1.2. Functional components of a computer: processor, memory, and bus.

The performance of a computer depends very strongly on the balance between its three parts. Recently, constant improvements of processor and memory technology have put much pressure on the interconnection capabilities of electronic computers. Therefore, optical solutions are being considered for the various levels of computer communication (i.e., between chips, boards, and racks).

More recently, with the development of fiber optic communications, interest arose in the optical routing and switching of optical signals (“photonic switching”). There are a number of differences between computing and switching, but also several similarities, so that the two are related areas. A communications network consists of a transmission facility and one or more switching systems (Cloonan [1994]) (fig. 1.3). The switching system receives signals from multiple transmission links and routes them to the desired output transmission facilities. The switch consists of the following functional parts: the controller, the switching fabric, and input/output interfaces. The switching fabric in turn consists of switching nodes which are interconnected by links. Routing information is extracted from the incoming signals and sent to the controller, which then sets up the nodes of the switching fabric. Different multiplexing schemes are used based on a spatial, temporal or wavelength-oriented representation of the optical signals. Unlike the data in a computer, the signals in a switching system always flow in a feed-forward fashion.
Fig. 1.3. Functional components of a switching system.
The interest in optics is motivated by the large bandwidth of optical signals and - in the case of free-space optics - the possibility of not being bound by wires or waveguides. It turns out that communications are more and more becoming the weakest link in computing and switching systems.

1.3. LIMITATIONS OF ELECTRONIC SYSTEMS
The processing capabilities of today's computers are impressive. Supercomputers have been built which can perform several GFLOPs (FLOP: floating point operation). At the chip level, one can still observe enormous increases in the processing power of a chip, with dimensions of the individual devices shrinking to a few tenths of a micrometer. However, the speed of today's high performance electronic computers is increasingly limited by communication problems like the number and bandwidth of the interconnections and by data storage and retrieval rates rather than by processing power. It is interesting to take a look at the different time scales that exist at the various levels of an electronic system, represented by the delay times for a signal (table 1.2). What is striking about this comparison is the fact that there are several orders of magnitude between the delay times for an individual device and the communications on a systems level. The switching time for a transistor (e.g., a MOSFET transistor) has remained relatively constant over the years: on the order of a picosecond, limited essentially by the channel length, i.e., the lateral dimensions between its electrodes (Meindl [1995]). Current technology allows one to fabricate transistors with channel lengths of 0.25 µm, while the microelectronics industry is gearing up for the next step to get to 0.1 µm structures (Taur, Buchanan, Chen, Frank, Ismail, Lo, Sai-Halasz, Viswanathan, Wann, Wind and Wong [1997]). The improvements in technology will bring the switching delays of a logic gate down to the range of less than 100 ps and clock frequencies of the processors close to the GHz range (Asai and Wada [1997]). However, while technology still allows one to improve processing and memory devices, the communications capabilities appear to have reached a final limit. The system bus in a computer has emerged as the bottleneck (Boxer [1995]).

Table 1.2
Delay times at different levels of an electronic computer

Transistor    O(1 ps) a)
Gate          O(10-100 ps)
Chip          O(1-10 ns)
Bus           O(100 ns)

a) O(.) denotes the order of magnitude.

Several communications-related limitations exist for the performance of all-electronic computers: (a) physical limitations such as a limited time-bandwidth product of electric wires due to capacitances and inductances as well as effects like electromigration, (b) topological limitations due to the two-dimensional layout of the wiring that is inherent to electronics, and (c) architectural limitations such as the often-cited “von Neumann bottleneck” that limits the amount of data exchanged between processing unit and memory. It is instructive to take a more detailed look at these issues.

1.3.1. Physical limitations

Where does the two-orders-of-magnitude discrepancy in the switching delays for a gate and a chip come from? The explanation lies in the need to communicate information between the various parts of the processor. In a processor, one has short (or “local”) and long (or “global”) interconnects. The distribution of interconnection lengths in a processor shows that there are two peaks, one around $0.1\sqrt{A}$ and one around $0.5\sqrt{A}$, where $A$ is the chip area (Bakoglu [1990]) (fig. 1.4). The short interconnections serve the majority of the wiring. There is also the need to carry information between distant parts of the processor via the global interconnects. With chip areas of typically 10 mm x 10 mm, local interconnection lengths are 1-2 mm and global interconnects are 5 mm or more.

Fig. 1.4. Histogram showing the distribution of interconnection lengths $L_w$ for intrachip communications in an electronic processor. The histogram shows two distinct peaks for local interconnections and global interconnections. $A$ denotes the chip area.

The time delay of an electrical signal traveling over a wire of length $L_w$ is given as
$$\tau = RL_w\left(C_L + \tfrac{1}{2}CL_w\right), \tag{1.1}$$
where $R$ and $C$ are the resistance and capacitance per unit wire length, respectively, and $C_L$ is the capacitance of the load at the end of the wire (Taur, Buchanan, Chen, Frank, Ismail, Lo, Sai-Halasz, Viswanathan, Wann, Wind and Wong [1997]). This delay is referred to as RC delay and must be added to other delay components. The second term in eq. (1.1) describes the part of the delay due solely to the wire itself. Wire capacitance per unit length tends to stay essentially constant at 0.2 pF/mm and does not scale with technology. On the other hand, resistance per unit length grows linearly with the scaling factor $a$ if the dimensions of the wire are reduced by that factor. Consequently, long and thin wires cause long delay times. They can be used for local interconnects, but not for global interconnects. The way one deals with that problem is by using a hierarchy of interconnections. While local interconnections are implemented by thin and densely packed wires (e.g., with a width of 0.3 µm), one uses “fat” wires (width ≥ 1 µm) for global interconnections. The number of wiring planes and the pitch of the wires need to be optimized from case to case. High performance processors based on 0.1 µm technology would require six to eight interconnection layers. Bandwidth limitations are even more apparent for off-chip communications. Long delay times are caused by capacitances and inductances. Consequently, the
speed of chip-to-chip or board-to-board communications is much slower than for intra-chip communications and channel density is much lower. The bandwidth of the system bus is typically 10 Mb/s. The number of input/output pins of a processor chip used for signal communications is a few hundred. This leads to another problem of electrical interconnects related to their topology.

1.3.2. Topological limitations of 2-D interconnections
With the scaling of device structures to a few tenths of a micrometer, processors now consist of $10^6$ to $10^7$ transistors. The processing capabilities of a processor must be supported by an adequate number of interconnections for the processor to work efficiently. This empirical relationship has been quantified by Rent's rule, which says that the number of logic gates $N_g$ and the number of interconnections $N_i$ should follow the relationship
$N_i = a N_g^b$, where a and b are empirical constants; b typically lies in the range between 0.5 and 0.7. Often used values are a = 2.5 and b = 0.6. A Pentium processor, for example, has $10^5$ logic gates. With the values just given, $N_i = 2500$ input/output connections would be required, while in practice only about 500 can be realized. This means that the communications appetite of a processor is not fully satisfied, a trend that will become worse with further improvements of device integration. The fundamental problem is that the number of devices and the number of interconnections scale differently with the chip area (fig. 1.5). While the number of devices $N_g$ is proportional to A, the number of interconnects in a two-dimensional geometry scales only with $\sqrt{A}$.
Fig. 1.5. Two-dimensional interconnection topology: the number of interconnections scales with the square root of the chip area A, while the number of devices grows directly with A. [Figure labels: $N_g$ gates, $N_g \propto A$; $N_i$ connections, $N_i \propto A^{1/2}$.]
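The Rent's rule estimate above can be reproduced with a short calculation. The following is a minimal sketch, not part of the original treatment; the function names `rent_connections` and `scale_2d` are illustrative, and the default constants a = 2.5 and b = 0.6 are the values quoted in the text.

```python
import math


def rent_connections(n_gates: float, a: float = 2.5, b: float = 0.6) -> float:
    """Rent's rule: number of connections N_i = a * N_g**b."""
    return a * n_gates ** b


def scale_2d(area_factor: float) -> tuple:
    """Growth of devices (~A) and of 2-D connections (~sqrt(A))
    when the chip area is multiplied by area_factor."""
    return area_factor, math.sqrt(area_factor)


# Example from the text: 10^5 logic gates, about 500 connections realized.
n_required = rent_connections(1e5)
n_realized = 500
print(f"required: {n_required:.0f}, realized: {n_realized}, "
      f"ratio: {n_required / n_realized:.1f}")

# Quadrupling the chip area: four times the devices, twice the 2-D connections.
devices, connections = scale_2d(4.0)
print(f"devices x{devices:.0f}, 2-D connections x{connections:.0f}")
```

The ratio between required and realized connections (about five for the quoted numbers) is one way to express the unsatisfied "communications appetite" described above.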
1.3.3. Von Neumann bottleneck
Computers are traditionally organized in the von Neumann architecture with processor and memory as separate functional units connected via the system bus (fig. 1.2). Non-von Neumann architectures are used only in special-purpose systems, e.g., systolic arrays. In a von Neumann computer, arithmetic operations like addition, subtraction, etc., are performed by first moving the data from the memory to the processor using a sequential binary addressing mechanism. After the arithmetic operation has been finished, the result is again transferred to the memory. The need to constantly exchange data between processor and memory and the sequential addressing mechanism slow down the operation of a system (Huang [1984]). The problem is aggravated by the bandwidth mismatch between processor and memory. Since 1980, processor speed has increased by a factor of 100, so that processors can currently operate at cycle times of about 5 ns. During the same time period, the cycle time of a DRAM memory chip has fallen only by a factor of 2, from 120 ns to 60 ns. This mismatch slows down the processor whenever it has to wait for data to be retrieved from main memory. This is true, in particular, for RISC (reduced instruction set computer) architectures that can perform several instructions within one clock cycle. The primary tool for handling the problem is to use a small cache memory which is implemented as a fast SRAM and placed on the same chip as the processor. The data exchange between processor and cache does not go over the system bus, so that it can be much faster than the exchange between processor and main memory. By constant updating, the cache keeps the data which are most likely to be used by the processor. Yet, there are limitations to the improvement that caching offers. 
These are, again, introduced by the limitations of the system bus and the resulting rate at which the cache can be updated with data from main memory. It was pointed out by Dickinson [1990] that increasing the bus width for processor-memory links from the currently typical 64 channels to 1024 channels would allow one to improve the system performance by a factor of 3 to 4. An even bigger improvement might be expected in multichip architectures where currently electronic crossbar switches are being considered as possible alternatives to the system bus.
1.4. OPTICS AS AN INTERCONNECTION TECHNOLOGY
Basically, one can view optics simply as an interconnection technology. With its
Fig. 1.6. Interconnection hierarchy in a computing system and optics technologies. Typical transmission distances are on the order of meters for rack-to-rack communication, O(0.1 m) for board-to-board, and O(1 cm) for chip-to-chip. [Figure labels: rack, bus, board, chip; free-space optics; waveguide / fiber bundle / free-space optics; distance scales of about 1 m and 0.1 m.]
large bandwidth and its interconnection capabilities, optics can offer interesting solutions to help alleviate the limitations of all-electronic computer systems. Optics is being considered for the various levels of interconnections in an electronic computing system (fig. 1.6). Depending on the transmission distance and interconnection density, different optics technologies are of interest. Between racks, fiber-optical links have already been used for several years, e.g., as the backplanes of switching systems. For board-to-board communications, polymer waveguides, fiber bundles, and free-space optics can be used. For chip-to-chip interconnections, integrated packaging technologies using either waveguide or free-space optics are being investigated. Optical interconnections behave very much like perfectly matched transmission lines; e.g., the propagation speed in a transmission line with a small permeability and permittivity (i.e., $\mu_r = 1$, $\varepsilon_r = 2$ to 4) is the same as in an optical fiber. However, for electrical interconnections to achieve transmission line speeds, they must be driven by low-impedance drivers and be terminated at the end of the wires in order to avoid reflections. Optical drivers always behave as if they were driven by a low-impedance source, and optical reflections at detectors are usually not a problem. Optical interconnections also offer the advantage of low absorption that is independent of bandwidth, and they can support larger fan-outs. Another major advantage is that they lack mutual coupling effects, which is why optical signals can cross through each other without the generation of noise or loss of information. In addition, optical signal transmission can be more energy efficient because it is not necessary to charge a wire or a cable in order to transmit a signal. This is the basis of fiber optic
Fig. 1.7. Three-dimensional interconnection topology: the number of interconnections $N_i$ scales linearly with the chip area A.
communications for long and medium distances. However, even for very short distances optical communications can be advantageous in terms of energy as compared with conventional electronic interconnections (Feldman, Esener, Guest and Lee [1988], Miller [1989]). Finally, optics offers various degrees of freedom, such as the wavelength, polarization, and the spatial position of a light beam, which can be used for signal multiplexing to enhance the throughput of a communication channel. In particular, free-space optical propagation is of interest to solve the topological problem of 2-D interconnections. For a 3-D interconnection technology, the number of interconnections scales with A like the number of devices on the chip (fig. 1.7). Free-space optical interconnections with several thousand channels are feasible. From that point of view, free-space optics would be ideally suited to satisfy the hunger for bandwidth of electronic processors. However, the advantages of optical interconnections are not yet entirely clear. High performance and reliability at low cost are necessary requirements for any kind of technology to be accepted into the competitive world of electronics. This is difficult to achieve, in particular for chip-to-chip interconnections. Here, free-space optics offers the largest potential, but also faces a variety of difficulties. Mainly, packaging issues remain to be solved (Jahns and Huang [1989]). Also, for free-space optical interconnections, the availability and performance of 2-D arrays of light sources and detectors are crucial. Both modulator arrays and arrays of vertical-cavity surface-emitting lasers (VCSELs) are being considered. For large 2-D arrays, addressing mechanisms and thermal issues (Ozaktas and Goodman [1992]) need to be taken care of. 
Researchers are actively working to solve the interconnection and packaging problems at many different levels and it is likely that future computing and switching systems will incorporate the results of many of these efforts. Research projects include work on guided wave technology using time-division multiplexing, wavelength division multiplexing and space division multiplexing (Hinton [1993]). Another direction of research is centered around what will
be the focus of this review, viz., the use of free-space optics for digital computing and switching systems. Free-space optics has the potential to enhance the performance of digital electronic systems by providing a large number of interconnections. Future computing systems will continue to make use of metallic wires, but the availability of free-space optics will give system designers alternatives for engineering a system. In addition, beyond being merely a high-bandwidth replacement for electronic wires, free-space optics offers design possibilities such as the use of the spatial position for implementing specific routing schemes, or reconfigurability of the interconnections.
1.5. OUTLINE
In this chapter, we will review the progress made in the area of digital free-space optics over the past 10-20 years. Guided-wave technology and analog optical computing are not considered here. An area like digital free-space optics that has recently gone through a phase of rapid development does not lend itself to a clear-cut presentation, in particular since it is a field to which many disciplines contribute. Nonetheless, this is an attempt to provide the reader with an overview. The chapter is organized as follows: in § 2, we will examine hardware-related issues from a computational point of view. As a basis, we use a model for a digital optical computer that was developed at AT&T Bell Laboratories in the mid-1980s and was adopted by many other research groups. In § 3 and § 4, the physics of nonlinear devices and optical interconnections will be described. We shall concentrate on semiconductor devices which have the potential for high speed. Liquid crystal components, which have also been investigated for computing and switching applications, shall not be considered here. In § 5 we describe some of the architectural and systems aspects relevant to free-space optical digital computing and switching. Some concluding remarks will follow in § 6. In writing this review, it was unavoidable to assume a certain point of view and consequently to give more room to some of the work related to digital free-space optics at the expense of other contributions. This is a question of economy rather than a lack of respect for the work done by the many groups working in the field. The author hopes that the reader will find this review a useful source for references. In addition, one may consult other overview articles or books that deal with the subject of digital free-space optics. 
Review articles include, e.g., Jahns [1980], Sawchuk and Strand [1984], Huang [1984], Goodman, Leonberger, Kung and Athale [1984], Berra, Ghafoor, Guizani, Marcinkowski and Mitkas [1989], Streibl, Brenner, Huang, Jahns, Jewell, Lohmann, Miller, Murdocca, Prise and Sizer [1989], Drabik [1994], Hinton, Cloonan, McCormick, Lentine
and Tooley [1994]. Books on the subject include Murdocca [1990], McAulay [1991], Hinton [1993], Lalanne and Chavel [1993], Erhard and Fey [1994], and Jahns and Lee [1994].
§ 2. System Model and Computational Aspects
Much of the work on optical computing was motivated by the parallelism that optical interconnections offer and the potentially high speed of optical logic devices. In the mid-1980s, the question of how to use that potential of optics was one of the main issues in the optical computing community. Some considerations on that subject were summarized by Prise, Streibl and Downs [1988]. According to these authors, there are two approaches which one can follow. One is to implement an architecture similar to that of conventional electronic computers, where the optical devices are connected by optical waveguides (West [1985]). This is worthwhile only if the optical switching devices can be operated at much higher speeds than the gates in electronic computers. Alternatively, one can try to take advantage of the interconnectivity of free-space optics. Optical imaging can provide up to 1000 × 1000 channels or more. In order to make use of such a wide interconnect, however, a significant fraction of those data channels must be busy at any instant. This requires computer architectures which are different from the sequential architectures of conventional von Neumann machines. Since interconnections based on imaging imply a certain amount of regularity, it is also necessary to take advantage of the parallelism of the optical interconnections without suffering from the constraint of regularity. Two main types of architectures for free-space digital optics were developed: symbolic substitution (Brenner, Huang and Streibl [1986]) and the concept of optical programmable logic arrays (Murdocca, Huang, Jahns and Streibl [1988]). Both will be described in § 5. Figure 2.1 shows the model of a digital optical computer as used in the work at Bell Labs in the late 1980s. It consists of 2-D arrays of optical switching devices interconnected through free space. The interconnections may be implemented by lenses, lenslet arrays, beamsplitters, gratings, holograms, etc. 
The architecture is characterized by a very wide pipeline with thousands of channels. Some problems with an inherent parallelism, such as switching in optical communications, may benefit directly from this architecture. For more general problems, the pipelined architecture shown in the figure is efficient only if the computation can be broken up into sufficiently large chunks of data where registering and storing occur seldom, if at all. In fast electronics, if data are not
Fig. 2.1. Model of a digital optical computer. [Figure labels: input, optical interconnect, logic array, column processing, row processing, output.]
registered periodically, signals get out of synchronization due to clock skew and differences in the propagation delays of the gates. With optics, since the path lengths can be controlled precisely, constant-latency architectures and pulsed logic are feasible. Free-space optical imaging systems can be designed such that the path lengths are constant over the whole field. Path-length tolerances caused by aberrations are less than a wavelength. This corresponds to transit time differences on the order of femtoseconds. The synchronicity of the architecture is one aspect. The other is the use of logic gates with a constant fan-in and fan-out. The fan-in of a device is the number of inputs, the fan-out is the number of outputs. All devices across the array should have the same computational and physical properties. For any device to be used for implementing logic operations, a minimum fan-in and fan-out of 2 is required. As Prise, Streibl and Downs [1988] pointed out, this value is optimum because a larger number of inputs and outputs makes the devices less tolerant of any device- or system-related variations. Large fan-ins also require high contrast of the devices. Finally, as we shall see below, it turns out that if the number of gates is large, regular interconnects are much easier to implement than random interconnects. It was shown that it is possible to build digital processors using simple regular interconnects with constant fan-in and fan-out. Such a design can also be efficient despite its regularity (Murdocca [1987]). Regular interconnections based on optical imaging still allow for some degree of space-variance. A variety of proposals for implementing space-variant optical networks, such as the Perfect Shuffle, have been demonstrated and will be reviewed in § 4. In the remainder of this section, we will discuss the computational aspects of the devices (§ 2.1) and interconnections (§ 2.2).
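The femtosecond figure quoted above for the transit time differences can be checked with a one-line estimate. The wavelength $\lambda = 0.85$ µm used here is an assumed, illustrative value (it is not stated in the text); the path-length difference $\Delta L$ is taken to be at most one wavelength:

```latex
\Delta t = \frac{\Delta L}{c} \lesssim \frac{\lambda}{c}
         = \frac{0.85 \times 10^{-6}\,\mathrm{m}}{3.0 \times 10^{8}\,\mathrm{m/s}}
         \approx 2.8 \times 10^{-15}\,\mathrm{s} = 2.8\ \mathrm{fs}
```

Any wavelength in the visible or near-infrared range gives a value of a few femtoseconds, so the conclusion does not depend on the particular wavelength assumed.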
2.1. COMPUTATIONAL PROPERTIES OF NONLINEAR OPTICAL DEVICES
In this section, we will discuss the properties of optical switching devices merely from a computational point of view. The physics of the devices and some examples will be treated in § 3. Throughout this section, we will follow the arguments given by Prise, Streibl and Downs [1988]. Figure 2.2 shows the characteristics of optical switching devices, categorized as inverting vs. non-inverting and thresholding vs. bistable. In order to use any of these devices as a logic gate, the signals from the previous stage (fig. 2.1) are fed to the device by the optical interconnect. For an inverting device, the output intensity will be high (we use the abbreviation HI) if the input intensity (or the sum of the input intensities from several input beams) is less than the switching power $P_{sw}$. Conversely, the device will be in the low state (LO) if the input intensity is larger than $P_{sw}$. In an analogous fashion, one can describe the operation of a non-inverting device. An external bias beam $P_B$ can be added to the device to bring it near its bistable
Fig. 2.2. Device characteristics. [Figure labels: thresholding; non-inverting; inverting; vertical axes "out".]
Fig. 2.3. Critically biased inverting bistable device operated as a logical NOR gate. [Truth table (input 1, input 2, output): LO, LO, HI; LO, HI, LO; HI, LO, LO; HI, HI, LO.]
switching point. This is essential in the case of devices without inherent gain, in particular for some bistable devices. Bistable devices can be used as logic gates or as latches. If the device is used in a pulsed mode, i.e., if the optical power is reduced below the bistable region before each switching event, the device can be considered as a thresholding device. Otherwise, the device is a latch, i.e., it can keep a state for a certain amount of time. The main difference between a bistable and a thresholding device is that for a bistable device the output power must be supplied by a bias beam. The bias brings the device near the switching point, so that a small additional change in the input power causes the device to switch. Therefore, operation of the device depends critically on the biasing, which imposes tight tolerances on the accuracy of the optical power supply and the optical interconnect. Figure 2.3 shows the operation of an inverting bistable device as a logic NOR gate (Hinton [1993]). The bias power $P_B$ is chosen to be close to the switching power $P_{sw}$. The power levels $P_1$ and $P_2$ of the two input beams are assumed to be the same. If the bias beam brings the device sufficiently close to its switching point, then any input will exceed the nonlinear portion, thus moving the device from the HI to the LO state. The truth table for the NOR operation is shown on the right of the figure. The output of the device will only be HI if both input beams are LO. The computational parameters of a device are the fan-in (i.e., the number of inputs), the fan-out (the number of outputs), and the threshold, which is the number of HI inputs required to switch the device. For example, the NOR gate shown in fig. 2.3 has a threshold of 1, whereas an AND gate with a fan-in of 4 would have a threshold of 4. Obviously, the value of the threshold must be larger than zero and less than or equal to the fan-in. 
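The description of a gate by fan-in, threshold and inverting or non-inverting behavior can be sketched in a few lines. This is an idealized illustration only: the function name `threshold_gate` and the representation of HI and LO as `True` and `False` are assumptions made here, and effects such as bias tolerances and contrast are ignored.

```python
from itertools import product


def threshold_gate(inputs, threshold, inverting=False):
    """Idealized thresholding device.

    inputs    -- sequence of booleans (True = HI, False = LO); its length is the fan-in
    threshold -- number of HI inputs required to switch the device
    inverting -- if True, the output is HI only while the device has NOT switched
    """
    fan_in = len(inputs)
    if not 0 < threshold <= fan_in:
        raise ValueError("threshold must satisfy 0 < threshold <= fan-in")
    switched = sum(inputs) >= threshold
    return (not switched) if inverting else switched


def nor(a, b):
    """NOR gate of fig. 2.3: inverting device, fan-in 2, threshold 1."""
    return threshold_gate((a, b), threshold=1, inverting=True)


def and4(a, b, c, d):
    """AND gate with fan-in 4: non-inverting device, threshold 4."""
    return threshold_gate((a, b, c, d), threshold=4)


# Print the NOR truth table in HI/LO notation.
name = {True: "HI", False: "LO"}
for a, b in product((False, True), repeat=2):
    print(name[a], name[b], "->", name[nor(a, b)])
```

With a threshold between 1 and the fan-in, the same function also describes OR-like (threshold 1, non-inverting) and NAND-like (threshold equal to the fan-in, inverting) gates.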
For a device to be used as a logic gate in a system, cascadability is important.
Fig. 2.4. Characteristics of a thresholding device.
This means that the output of one device can feed the inputs of another device. Another requirement is that a complete set of logic operations can be implemented. As is well known from Boolean logic, AND, OR and NOT form a complete set, but NOR alone is also a complete set, for example. For nonlinear devices, tolerances in the input power levels will cause variations in the output powers. Therefore, the slope and position of the characteristic curve of a device play an important role. The device properties are related to the system properties. Both are discussed in the following using the example of a non-inverting thresholding device. Figure 2.4 shows the characteristic in detail. A thresholding device is characterized by the following parameters:
- $P_{sw}$, the switching power,
- $\Delta P_{sw}$, the switching window,
- $P_{on}$, the output power just after switch-on,
- $P_{off}$, the output power just before switch-on,
- $T_{HI}$, the differential transmission of the device at the switching point when the device is on,
- $T_{LO}$, the differential transmission of the device at the switching point when the device is off.
From this, one can define figures of merit characterizing the device:
- $C_{sw}$, the switching contrast,
- $T_{sw}$, the switching transmission,
- $\sigma_{sw}$, the relative switching window: $\sigma_{sw} = \Delta P_{sw} / P_{sw}$.
Whether a device is operated in transmission or reflection is important. It has been pointed out that reflective operation reduces the switching power, enhances the contrast, and allows efficient cooling from the back of the device (Wherrett [1984]). It is important to note that a device can have a gain greater than one. One refers to these devices as having inherent or absolute gain, as opposed to devices with differential gain. The performance of a device in a system is defined not only by its individual parameters as defined above but also by variations in the power of the bias beam supplied to the device. This power may vary over an array of devices, and therefore variations in the switching behavior may occur. Furthermore, there may be variations in the switching power in time and across the array as well as variations in device transmission (or reflection, respectively). These variations intrinsic to a system can be taken into account by defining an effective switching window $\sigma_{eff}$:
$$\sigma_{eff} = \sigma_{sw} + \sigma_{sys} = \sigma_{sw} + \frac{\delta P_{sw}}{P_{sw}} + \frac{\delta P_B}{P_B} + \frac{\delta P_T}{P_T}.$$
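As a small numerical illustration, the effective switching window can be evaluated by adding the relative variations to the relative switching window of the device. The sketch below assumes the additive form given above; the function name and all numerical values are hypothetical and serve only to show how quickly system variations widen the window.

```python
def effective_switching_window(sigma_sw, rel_dP_sw, rel_dP_B, rel_dP_T):
    """Effective switching window as the sum of the device window sigma_sw
    and the relative variations of switching power, bias power and
    transmitted power (all dimensionless fractions)."""
    return sigma_sw + rel_dP_sw + rel_dP_B + rel_dP_T


# Hypothetical example: 5% device window, 2% variation in each system term.
sigma_eff = effective_switching_window(0.05, 0.02, 0.02, 0.02)
print(f"effective switching window: {sigma_eff:.2f}")
```

In this made-up example the system variations more than double the window of the isolated device, which is why uniform illumination of the array matters for critically biased devices.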
For critically biased devices, nonuniformities of the optical power supply beams pose a serious problem. The illumination of the devices is achieved by array generators (or array illuminators), which will be discussed below. One of the figures of merit of an array generator is the uniformity over the array of spots. Nonuniform illumination may result in erroneous operation of the devices; i.e., a device might switch due to the high intensity of the power supply beam rather than the state of the input beams. That problem can be largely reduced by using the concept of dual-rail logic (von Neumann [1963]). That term describes a binary coding technique that comprises both intensity and spatial encoding. The LO and HI states of a device are represented by two pixels which are always complementary to each other. A binary 0 would, e.g., be represented by two pixels where the upper one is bright and the lower one is dark. The use of
dual-rail logic means that the logic state of a device depends on the relative intensity levels of the two pixels (or the two light beams illuminating the two pixels, respectively) rather than on the absolute power level of a single beam. This reduces to a large extent the accuracy requirements related to the optical interconnect and the array generator.
2.2. COMPUTATIONAL ASPECTS OF THE OPTICAL INTERCONNECTS
In interconnecting arrays of devices, several basic operations are encountered (fig. 2.5). These are
- "copying" of an input array to an output array,
- "split-and-shift", where two or more laterally shifted versions of the same input array are copied to the output, requiring the optical implementation of fanning out from a single device, and
- "permutation" of the pixel positions.
All three types of operations can be implemented by using optical imaging techniques. However, whereas "copy" and "split-and-shift" are space-invariant operations, permutations are, in general, space-variant operations. The former can be implemented by conventional imaging techniques whereas the latter may require the use of microchannel imaging using multifaceted elements. We will discuss the physical implementations of various interconnection schemes in a later section. Here, we will concentrate on the terminology and the computational aspects of optical interconnects. First, we want to describe the terms space-variant and space-invariant, regular and irregular, as well as fixed and dynamic.
Space-invariant and space-variant interconnections: Interconnections for discrete input and output arrays can be represented as so-called bipartite graphs (fig. 2.6). Both the input and the output are represented by a regular array of positions which are denoted by characters (e.g., a, b, c, ...) or numbers (0, 1, 2, 3, ...). Often the number of positions is a power of 2.
Fig. 2.5. Basic interconnection tasks for array logic.
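The three basic operations can be illustrated on one-dimensional arrays. The following is a minimal sketch under stated assumptions: all function names are hypothetical, the perfect shuffle is chosen as the example permutation, shifted copies are wrapped around cyclically at the array edge, and an interconnection is represented as a mapping from each input position to the set of output positions it reaches.

```python
def copy_array(array):
    """'Copy': every input position i is imaged onto output position i."""
    return list(array)


def split_and_shift(array, shifts):
    """'Split-and-shift': superpose laterally shifted copies of the input.
    Each output position collects the inputs landing on it
    (cyclic wrap-around at the edges is an assumption of this sketch)."""
    n = len(array)
    out = [[] for _ in range(n)]
    for s in shifts:
        for i, value in enumerate(array):
            out[(i + s) % n].append(value)
    return out


def perfect_shuffle(array):
    """'Permutation' example: interleave the first and second half."""
    half = len(array) // 2
    out = []
    for i in range(half):
        out.append(array[i])
        out.append(array[half + i])
    return out


def is_space_invariant(connections, n):
    """connections maps input index -> set of output indices.
    The interconnect is space-invariant if every input position produces
    the same pattern of offsets (taken modulo n)."""
    patterns = {frozenset((o - i) % n for o in outs)
                for i, outs in connections.items()}
    return len(patterns) <= 1


n = 8
copy_conn = {i: {i} for i in range(n)}
shift_conn = {i: {i, (i + 1) % n} for i in range(n)}
# Perfect shuffle as a connection pattern: input i goes to output 2i
# (first half) or to output 2(i - n/2) + 1 (second half).
shuffle_conn = {i: {2 * i if i < n // 2 else 2 * (i - n // 2) + 1}
                for i in range(n)}

print(perfect_shuffle(list(range(n))))
print(is_space_invariant(copy_conn, n),
      is_space_invariant(shift_conn, n),
      is_space_invariant(shuffle_conn, n))
```

The check returns true for copy and split-and-shift and false for the perfect shuffle, in line with the statement that permutations are, in general, space-variant.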
Fig. 2.6. Bipartite graph representing an interconnection network (in this case space-variant). [Figure labels: input, output.]
Fig. 2.7. Space-invariant interconnect.
Linear optical systems can be characterized as either space-invariant (si) or space-variant (sv). A system is called space-variant if the array of lines emerging from an input position varies from one position to the next (fig. 2.6). A system is called space-invariant if every input position generates the same output pattern (fig. 2.7). In a bipartite graph this means that the arrays of lines emerging from all input positions are the same. For arrays of finite size, space-invariance is not strictly possible since some lines would not connect to a position in the output array. Occasionally, the lines are wrapped around in a cyclical fashion. This case is represented in fig. 2.7 by the dashed line.
Regular and irregular networks: These terms are not defined precisely; they are used according to the general understanding of what regular and irregular mean. It is important to note that irregular is not the same as space-variant. Certain space-variant interconnections exhibit a high degree of regularity. Actually, the interconnection pattern for sv multistage networks (see below) can be expressed explicitly in terms of a mathematical mapping of input to output positions.
Fixed and dynamic interconnections: In electronics, connections are usually fixed, with a few exceptions such as switch boards. For metallic wires, a change of a connection requires a mechanical displacement. Mechanical switches are
Fig. 2.8. Imaging setup. [Figure labels: object, lens, image, field (area A).]
also used in optics. However, since they are slow (milliseconds) their use is limited. Optical connections can also be changed using electro-optic effects (e.g., in a directional coupler) or by changing the optical elements themselves (dynamic holograms or kinoforms). Some of these effects can be very fast (nanoseconds or less) and they offer the possibility of dynamic routing of light signals. The usefulness of dynamic interconnections, however, is not quite clear. In computing, there appear to be only few interesting applications. Maybe this is because current electronic computers are built on fixed interconnections. In the following, we will consider only fixed interconnections and explain several more terms like fan-out, blocking and non-blocking networks. Conventional imaging systems may be based on single-lens (fig. 2.8) or 4f setups, for example. The interconnectivity or channel capacity is given by the space-bandwidth product (SBP) of the imaging system. The SBP is given as the ratio of the field of the imaging system, with an area A, and the area of one pixel, $\delta x^2$. The resolution $\delta x$ is given by the numerical aperture NA of the imaging system, with $\mathrm{NA} = \sin\alpha$. For a circular lens aperture, the pixel size is $\delta x = 2.44\,\lambda/\mathrm{NA}$. The field is the area over which aberrations are not significant. In general, a large numerical aperture yields a smaller optical field. The SBP of optical imaging systems ranges between $10^3$ for micro-optical imaging systems and $10^8$ for expensive systems used in lithography, for example. The SBP of an imaging system scales approximately linearly with the diameter of the lens if the NA is kept constant (Lohmann [1989]). A specific aspect of digital optics is that the objects are discrete, i.e., the input object consists of a discrete array of light sources, for example. If the size $\delta x$ of the devices is comparable to the pitch $\Delta x$, one speaks of a densely packed array (fig. 2.9). 
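The space-bandwidth product defined above is easy to estimate numerically. The sketch below simply applies the expressions as given in the text (pixel size $\delta x = 2.44\,\lambda/\mathrm{NA}$ and SBP $= A/\delta x^2$); the function names and the example values (wavelength 0.85 µm, NA = 0.1, field of 10 mm × 10 mm) are assumptions chosen for illustration only.

```python
def pixel_size(wavelength, numerical_aperture):
    """Pixel size for a circular aperture, using the expression quoted in
    the text: dx = 2.44 * wavelength / NA (same length unit as wavelength)."""
    return 2.44 * wavelength / numerical_aperture


def space_bandwidth_product(field_area, wavelength, numerical_aperture):
    """SBP = field area / area of one pixel (dx squared)."""
    dx = pixel_size(wavelength, numerical_aperture)
    return field_area / dx ** 2


# Hypothetical example values (SI units).
wavelength = 0.85e-6          # m
na = 0.1
field_area = 10e-3 * 10e-3    # m^2, i.e. a 10 mm x 10 mm field

dx = pixel_size(wavelength, na)
sbp = space_bandwidth_product(field_area, wavelength, na)
print(f"pixel size: {dx * 1e6:.1f} um, SBP: {sbp:.2e}")
```

For these assumed values the result lies between the two limits quoted in the text for micro-optical and lithographic imaging systems.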
(For simplicity, we assume that the device size is the same as the resolution $\delta x$ of the imaging system.) If $\Delta x/\delta x$ is large, one speaks of dilute
Fig. 2.9. Visualization of densely packed and dilute arrays. [Figure labels: pitch $\Delta x$; device size $\delta x$; densely packed array; dilute array.]
arrays. The first generations of self-electro-optic effect devices used in earlier systems experiments were densely packed, with typical values $\delta x = 10$ µm and $\Delta x = 20$ µm. Examples of dilute arrays would be arrays of optical monomode fibers ($\delta x < 10$ µm and $\Delta x = 125$ µm) or smart pixel arrays. It has been pointed out by Lohmann [1991] and McCormick, Tooley, Cloonan, Sasian and Hinton [1991] that the use of conventional imaging is not optimal for dilute arrays since the larger part of the SBP would be used for imaging the dead space between the devices, represented by the shaded area in fig. 2.9. Furthermore, the imaging of dilute arrays may require optical fields which are significantly larger than those which can be obtained with conventional imaging systems. For example, one might conceive of a situation in which a silicon chip with optical inputs and outputs is interconnected via free space. In that case, the size of the chip might be on the order of 10 mm × 10 mm. To better match the optics to such a situation, a microchannel imaging scheme seems more appropriate. Here, each optical channel is implemented by a pair of microlenses (fig. 2.10a). The advantage of microchannel imaging over conventional imaging is that the array size can be large without affecting the image quality, since field size and resolution are independent. A limitation occurs, however, in the longitudinal separation $\Delta z$ of the device arrays. Due to diffraction, light may couple between neighboring channels and thus crosstalk may occur. In order to prevent this, it is necessary that $\Delta z < D^2/2\lambda$ (Smith, Murdocca and Stone [1994]). Here, D is the effective aperture of each channel. A modified imaging scheme which combines in a hybrid form conventional imaging with the microchannel approach will be described in a later section. Due to the use of faceted array components in the microchannel approach, one can also implement space-variant random interconnections if suitable elements