
Introduction to Information Optics


Introduction to Information Optics Edited by FRANCIS T. S. YU The Pennsylvania State University

SUGANDA JUTAMULIA Blue Sky Research

SHIZHUO YIN The Pennsylvania State University

ACADEMIC PRESS
A Harcourt Science and Technology Company

San Diego San Francisco New York London Sydney Tokyo

Boston

This book is printed on acid-free paper.

Copyright © 2001 by Academic Press

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or storage in any information storage and retrieval system, without permission in writing from the publisher. Requests for permission to make copies of any part of the work should be mailed to: Permissions Department, Harcourt Brace & Company, 6277 Sea Harbor Drive, Orlando, Florida 32887-6777.

ACADEMIC PRESS
A Harcourt Science and Technology Company
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
http://www.academicpress.com

ACADEMIC PRESS
24-28 Oval Road, London NW1 7DX, UK

Library of Congress Cataloging Number: 2001089409
ISBN: 0-12-774811-3

PRINTED IN THE UNITED STATES OF AMERICA
01 02 03 04 05 SB 9 8 7 6 5 4 3 2 1

Contents

Preface

Chapter 1. Entropy Information and Optics
1.1. Information Transmission
1.2. Entropy Information
1.3. Communication Channel
1.3.1. Memoryless Discrete Channel
1.3.2. Continuous Channel
1.4. Band-limited Analysis
1.4.1. Degrees of Freedom
1.4.2. Gabor's Information Cell
1.5. Signal Analysis
1.5.1. Signal Detection
1.5.2. Statistical Signal Detection
1.5.3. Signal Recovering
1.5.4. Signal Ambiguity
1.5.5. Wigner Distribution
1.6. Trading Information with Entropy
1.6.1. Demon Exorcist
1.6.2. Minimum Cost of Entropy
1.7. Accuracy and Reliability Observation
1.7.1. Uncertainty Observation
1.8. Quantum Mechanical Channel
1.8.1. Capacity of a Photon Channel
References
Exercises


Chapter 2. Signal Processing with Optics
2.1. Coherence Theory of Light
2.2. Processing under Coherent and Incoherent Illumination
2.3. Fresnel-Kirchhoff and Fourier Transformation
2.3.1. Free Space Impulse Response
2.3.2. Fourier Transformation by Lenses
2.4. Fourier Transform Processing
2.4.1. Fourier Domain Filter
2.4.2. Spatial Domain Filter
2.4.3. Processing with Fourier Domain Filters
2.4.4. Processing with Joint Transformation
2.4.5. Hybrid Optical Processing
2.5. Image Processing with Optics
2.5.1. Correlation Detection
2.5.2. Image Restoration
2.5.3. Image Subtraction
2.5.4. Broadband Signal Processing
2.6. Algorithms for Processing
2.6.1. Mellin-Transform Processing
2.6.2. Circular Harmonic Processing
2.6.3. Homomorphic Processing
2.6.4. Synthetic Discriminant Algorithm
2.6.5. Simulated Annealing Algorithm
2.7. Processing with Photorefractive Optics
2.7.1. Photorefractive Effect and Materials
2.7.2. Wave Mixing and Multiplexing
2.7.3. Bragg Diffraction Limitation
2.7.4. Angular and Wavelength Selectivities
2.7.5. Shift-Invariant Limited Correlators
2.8. Processing with Incoherent Light
2.8.1. Exploitation of Coherence
2.8.2. Signal Processing with White Light
2.8.3. Color Image Preservation and Pseudocoloring
2.9. Processing with Neural Networks
2.9.1. Optical Neural Networks
2.9.2. Hopfield Model
2.9.3. Interpattern Association Model
References
Exercises

Chapter 3. Communication with Optics
3.1. Motivation of Fiber-Optic Communication
3.2. Light Propagation in Optical Fibers
3.2.1. Geometric Optics Approach
3.2.2. Wave-Optics Approach
3.2.3. Other Issues Related to Light Propagating in Optical Fiber
3.3. Critical Components
3.3.1. Optical Transmitters for Fiber-Optic Communications: Semiconductor Lasers
3.3.2. Optical Receivers for Fiber-Optic Communications
3.3.3. Other Components Used in Fiber-Optic Communications
3.4. Fiber-Optic Networks
3.4.1. Types of Fiber-Optic Networks Classified by Physical Size
3.4.2. Physical Topologies and Routing Topologies Relevant to Fiber-Optic Networks
3.4.3. Wavelength Division Multiplexed Optical Networks
3.4.4. Testing Fiber-Optic Networks
References
Exercises

Chapter 4. Switching with Optics
4.1. Figures of Merit for an Optical Switch
4.2. All-Optical Switches
4.2.1. Optical Nonlinearity
4.2.2. Etalon Switching Devices
4.2.3. Nonlinear Directional Coupler
4.2.4. Nonlinear Interferometric Switches
4.3. Fast Electro-optic Switches: Modulators
4.3.1. Direct Modulation of Semiconductor Lasers
4.3.2. External Electro-optic Modulators
4.3.3. Switches Without Moving Parts
4.4. Optical Switching Based on MEMS
4.4.1. MEMS Fabrications
4.4.2. Electrostatic Actuators
4.4.3. MEMS Optical Switches
4.5. Summary
References
Exercises

Chapter 5. Transformation with Optics
5.1. Huygens-Fresnel Diffraction
5.2. Fresnel Transform
5.2.1. Definition
5.2.2. Optical Fresnel Transform
5.3. Fourier Transform
5.4. Wavelet Transform
5.4.1. Wavelets
5.4.2. Time-Frequency Joint Representation
5.4.3. Properties of Wavelets
5.5. Physical Wavelet Transform
5.5.1. Electromagnetic Wavelet
5.5.2. Electromagnetic Wavelet Transform
5.5.3. Electromagnetic Wavelet Transform and Huygens Diffraction
5.6. Wigner Distribution Function
5.6.1. Definition
5.6.2. Inverse Transform
5.6.3. Geometrical Optics Interpretation
5.6.4. Wigner Distribution Optics
5.7. Fractional Fourier Transform
5.7.1. Definition
5.7.2. Fractional Fourier Transform and Fresnel Diffraction
5.8. Hankel Transform
5.8.1. Fourier Transform in Polar Coordinate System
5.8.2. Hankel Transform
5.9. Radon Transform
5.9.1. Definition
5.9.2. Image Reconstruction
5.10. Geometric Transform
5.10.1. Basic Geometric Transformations
5.10.2. Generalized Geometric Transformation
5.10.3. Optical Implementation
5.11. Hough Transform
5.11.1. Definition
5.11.2. Optical Hough Transform
References
Exercises

Chapter 6. Interconnection with Optics
6.1. Introduction
6.2. Polymer Waveguides
6.2.1. Polymeric Materials for Waveguide Fabrication
6.2.2. Fabrication of Low-Loss Polymeric Waveguides
6.2.3. Waveguide Loss Measurement
6.3. Thin-Film Waveguide Couplers
6.3.1. Surface-Normal Grating Coupler Design and Fabrication
6.3.2. 45° Surface-Normal Micromirror Couplers
6.4. Integration of Thin-Film Photodetectors
6.5. Integration of Vertical Cavity Surface-Emitting Lasers (VCSELs)
6.6. Optical Clock Signal Distribution
6.7. Polymer Waveguide-Based Optical Bus Structure
6.7.1. Optical Equivalent for Electronic Bus Logic Design
6.8. Summary
References
Exercises

Chapter 7. Pattern Recognition with Optics
7.1. Basic Architectures
7.1.1. Correlators
7.1.2. Neural Networks
7.1.3. Hybrid Optical Architectures
7.1.4. Robustness of JTC
7.2. Recognition by Correlation Detections
7.2.1. Nonconventional Joint-Transform Detection


7.2.2. Nonzero-order Joint-Transform Detection
7.2.3. Position-Encoding Joint-Transform Detection
7.2.4. Phase-Representation Joint-Transform Detection
7.2.5. Iterative Joint-Transform Detection
7.3. Polychromatic Pattern Recognition
7.3.1. Detection with Temporal Fourier-Domain Filters
7.3.2. Detection with Spatial-Domain Filters
7.4. Target Tracking
7.4.1. Autonomous Tracking
7.4.2. Data Association Tracking
7.5. Pattern Recognition Using Composite Filtering
7.5.1. Performance Capacity
7.5.2. Quantization Performance
7.6. Pattern Classification
7.6.1. Nearest Neighbor Classifiers
7.6.2. Optical Implementation
7.7. Pattern Recognition with Photorefractive Optics
7.7.1. Detection by Phase Conjugation
7.7.2. Wavelength-Multiplexed Matched Filtering
7.7.3. Wavelet Matched Filtering
7.8. Neural Pattern Recognition
7.8.1. Recognition by Supervised Learning
7.8.2. Recognition by Unsupervised Learning
7.8.3. Polychromatic Neural Networks
References
Exercises

Chapter 8. Information Storage with Optics
8.1. Digital Information Storage
8.2. Upper Limit of Optical Storage Density
8.3. Optical Storage Media
8.3.1. Photographic Film
8.3.2. Dichromated Gelatin
8.3.3. Photopolymers
8.3.4. Photoresists
8.3.5. Thermoplastic Film
8.3.6. Photorefractive Materials
8.3.7. Photochromic Materials
8.3.8. Electron-Trapping Materials
8.3.9. Two-Photon-Absorption Materials
8.3.10. Bacteriorhodopsin
8.3.11. Photochemical Hole Burning
8.3.12. Magneto-optic Materials
8.3.13. Phase-Change Materials
8.4. Bit-Pattern Optical Storage
8.4.1. Optical Tape
8.4.2. Optical Disk
8.4.3. Multilayer Optical Disk
8.4.4. Photon-Gating 3-D Optical Storage


8.4.5. Stacked-Layer 3-D Optical Storage
8.4.6. Photochemical Hole-Burning 3-D Storage
8.5. Holographic Optical Storage
8.5.1. Principle of Holography
8.5.2. Plane Holographic Storage
8.5.3. Stacked Holograms for 3-D Optical Storage
8.5.4. Volume Holographic 3-D Optical Storage
8.6. Near-Field Optical Storage
8.7. Concluding Remarks
References
Exercises

Chapter 9. Computing with Optics
9.1. Introduction
9.2. Parallel Optical Logic and Architectures
9.2.1. Optical Logic
9.2.2. Space-Variant Optical Logic
9.2.3. Programmable Logic Array
9.2.4. Parallel Array Logic
9.2.5. Symbolic Substitution
9.2.6. Content-Addressable Memory
9.3. Number Systems and Basic Operations
9.3.1. Operations with Binary Number Systems
9.3.2. Operations with Nonbinary Number Systems
9.4. Parallel Signed-Digit Arithmetic
9.4.1. Generalized Signed-Digit Number Systems
9.4.2. MSD Arithmetic
9.4.3. TSD Arithmetic
9.4.4. QSD Arithmetic
9.5. Conversion between Different Number Systems
9.5.1. Conversion between Signed-Digit and Complement Number Systems
9.5.2. Conversion between NSD and Negabinary Number Systems
9.6. Optical Implementation
9.6.1. Symbolic Substitution Implemented by Matrix-Vector Operation
9.6.2. SCAM-Based Incoherent Correlator for QSD Addition
9.6.3. Optical Logic Array Processor for Parallel NSD Arithmetic
9.7. Summary
References
Exercises

Chapter 10. Sensing with Optics
10.1. Introduction
10.2. A Brief Review of Types of Fiber-Optic Sensors
10.2.1. Intensity-Based Fiber-Optic Sensors
10.2.2. Polarization-Based Fiber-Optic Sensors
10.2.3. Phase-Based Fiber-Optic Sensors
10.2.4. Frequency (or Wavelength)-Based Fiber-Optic Sensors


10.3. Distributed Fiber-Optic Sensors
10.3.1. Intrinsic Distributed Fiber-Optic Sensors
10.3.2. Quasi-distributed Fiber-Optic Sensors
10.4. Summary
References
Exercises


Chapter 11. Information Display with Optics
11.1. Introduction
11.2. Information Display Using Acousto-optic Spatial Light Modulators
11.2.1. The Acousto-optic Effect
11.2.2. Intensity Modulation of Laser
11.2.3. Deflection of Laser
11.2.4. Laser TV Display Using Acousto-optic Devices
11.3. 3-D Holographic Display
11.3.1. Principles of Holography
11.3.2. Optical Scanning Holography
11.3.3. Synthetic Aperture Holography
11.4. Information Display Using Electro-optic Spatial Light Modulators
11.4.1. The Electro-optic Effect
11.4.2. Electrically Addressed Spatial Light Modulators
11.4.3. Optically Addressed Spatial Light Modulators
11.5. Concluding Remarks
References
Exercises

Chapter 12. Networking with Optics
12.1. Background
12.2. Optical Network Elements
12.2.1. Optical Fibers
12.2.2. Optical Amplifiers
12.2.3. Wavelength Division Multiplexer/Demultiplexer
12.2.4. Transponder
12.2.5. Optical Add/Drop Multiplexer
12.2.6. Optical Cross-Connect
12.2.7. Optical Monitoring
12.3. Design of Optical Transport Network
12.3.1. Optical Fiber Dispersion Limit
12.3.2. Optical Fiber Nonlinearity Limit
12.3.3. System Design Examples
12.4. Applications and Future Development of Optical Networks
12.4.1. Long-haul Backbone Networks
12.4.2. Metropolitan and Access Networks
12.4.3. Future Development
References
Exercises

Index


Preface

We live in a world bathed in light. Light is not only the source of energy necessary for life (plants grow by drawing energy from sunlight); it is also a source of energy for information, since our vision is based on light detected by our eyes. Furthermore, applications of optics to information technology are not limited to vision and can be found almost everywhere. The deployment rate of optical technology is extraordinary. For example, optical fiber for telecommunication is being installed at about one kilometer every second worldwide. Thousands of optical disk players, computer monitors, and television displays are produced daily. This book summarizes and reviews the state of the art in information optics, which is optical science in information technology. The book consists of 12 chapters written by active researchers in the field. Chapter 1 provides the theoretical relation between optics and information theory. Chapter 2 reviews the basics of optical signal processing. Chapter 3 describes the principles of fiber-optic communication. Chapter 4 discusses optical switches used for communication and parallel processing. Chapter 5 discusses integral transforms, which can be performed optically in parallel. Chapter 6 presents the physics of optical interconnects used for computing and switching. Chapter 7 reviews pattern recognition, including neural networks, based on the optical Fourier transform and other optical techniques. Chapter 8 discusses optical storage, including holographic memory, 3-D memory, and near-field optics. Chapter 9 reviews digital


optical computing, which takes advantage of the parallelism of optics. Chapter 10 describes the principles of optical fiber sensors. Chapter 11 introduces advanced displays, including 3-D holographic display. Chapter 12 presents an overview of fiber-optic networks. The book is not intended to be encyclopedic and exhaustive. Rather, it reflects selected current interests in optical applications to information technology. In view of the great number of contributions in this area, we have not been able to include all of them in this book.

Chapter 1. Entropy Information and Optics

Francis T. S. Yu, The Pennsylvania State University

Light is not only the mainstream of energy that supports life; it also provides us with an important source of information. One can easily imagine that without light, our present civilization would never have emerged. Furthermore, humans are equipped with exceptionally good (although not perfect) eyes, along with an intelligent brain. Humans were therefore able to advance themselves above the rest of the animals on planet Earth. It is undoubtedly true that if humans did not have eyes, they would not have evolved into their present form. In the presence of light, humans are able to search for the food they need and the art they enjoy, and to explore the unknown. Thus light, or rather optics, provides us with a very valuable source of information, the applications of which can range from very abstract artistic interpretations to very efficient scientific usages. This chapter discusses the relationship between entropy information and optics. Our intention is not to provide a detailed discussion, however, but to cover the basic fundamentals that are easily applied to optics. We note that entropy information was not originated by optical scientists, but rather by a group of mathematically oriented electrical engineers whose original interest was centered on electrical communication. Nevertheless, from the very beginning of the discovery of entropy information, interest in its application has never been totally absent from the optical standpoint. As a result of the recent development of optical communication, signal processing, and computing, among other discoveries, the relationship between optics and entropy information has grown more profound than ever.


1.1. INFORMATION TRANSMISSION

Although we seem to know the meaning of the word information, fundamentally that may not be the case. In reality, information may be defined in relation to usage. From the viewpoint of mathematical formalism, entropy information is basically a probabilistic concept; in other words, without probability theory there would be no entropy information. An information transmission system can be represented by a block diagram, as shown in Fig. 1.1. For example, a message represents an information source that is to be sent by means of a set of written characters that represent a code. If the set of written characters is recorded on a piece of paper, the information still cannot be transmitted until the paper is illuminated by a visible light (the transmitter), which obviously acts as an information carrier. When light reflected from the written characters arrives at your eyes (the receiver), a proper decoding (translating) process takes place; that is, character recognition (decoding) by the user (your mind). From this simple example we can see that a suitable encoding process may not be adequate unless a suitable decoding process also takes place. For instance, if I show you a foreign newspaper, you might not be able to decode the language, even though the optical channel is assumed to be perfect (i.e., noiseless). This is because a suitable decoding process requires a priori knowledge of the encoding scheme; for example, knowledge of the foreign characters. Thus the decoding process is also known as a recognition process. Information transmission can in fact be represented by spatial and temporal information. The preceding example of transmitting written characters obviously represents spatial information transmission. On the other hand, if the written language is transmitted by coded light pulses, then this

Fig. 1.1. Block diagram of a communication system.


language should be properly (temporally) encoded; for instance, as transmitted through an optical fiber, which represents a temporal communication channel. Needless to say, at the receiving end a temporal decoding process is required before the coded language is sent to the user. Viewing a television show, for example, represents a one-way spatial-temporal transmission. It is interesting to note that temporal and spatial information can be traded in information transmission. For instance, television signal transmission is a typical example of exploiting temporal information transmission for spatial information transmission. On the other hand, a movie sound track is an example of exploiting spatial information transmission for temporal information transmission.

Information transmission has two basic disciplines: one developed by Wiener [1.1, 1.2], and the other by Shannon [1.3, 1.4]. Although Wiener and Shannon share a common interest, there is a basic distinction between their ideas. The significance of Wiener's work is that, if a signal (information) is corrupted by some physical means (e.g., noise or nonlinear distortion), it may be possible to recover the signal from the corrupted one. It is for this purpose that Wiener advocates correlation detection, optimum prediction, and other ideas. Shannon, however, carries his work a step further. He shows that the signal can be optimally transferred provided it is properly encoded; that is, the signal to be transferred can be processed before and after transmission. In other words, it is possible to combat the disturbances in a communication channel by properly encoding the signal. This is precisely the reason that Shannon advocates the information measure of the signal, communication channel capacity, and signal coding processes. In fact, the main objective of Shannon's theory is the efficient utilization of the information channel.

A fundamental theorem proposed by Shannon may be the most surprising result of his work. The theorem can be stated approximately as follows: Given a stationary finite-memory information channel having a channel capacity C, if the binary information transmission rate R (which can be either spatial or temporal) of the signal is smaller than C, there exists a channel encoding and decoding process for which the probability of error per digit of information transfer can be made arbitrarily small. Conversely, if the information transmission rate R is larger than C, there exists no encoding and decoding process with this property; that is, the probability of error in information transfer cannot be made arbitrarily small. In other words, the presence of random disturbances in an information channel does not, by itself, limit transmission accuracy. Rather, it limits the transmission rate for which arbitrarily high transmission accuracy can be accomplished.

To conclude this section, we note that the distinction between the two communication disciplines is that Wiener assumes, in effect, that the signal in question can be processed after it has been corrupted by noise, whereas Shannon suggests that the signal can be processed both before and after its transmission


through the communication channel. However, both Wiener and Shannon share the same basic objective; namely, faithful reproduction of the original signal.
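Shannon's theorem can be made concrete with the simplest nontrivial example, the binary symmetric channel, whose capacity C = 1 − H(p) is a standard result of information theory. The Python sketch below and its crossover probability are illustrative additions, not from the text:

```python
import math

def h2(p):
    """Binary entropy function H(p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - h2(p)

# A channel that flips 11% of all bits still has about half a bit of
# capacity per use; by Shannon's theorem, any rate R < C is achievable
# with arbitrarily small error probability, and no rate R > C is.
print(bsc_capacity(0.11))
```

For p = 0.5 the output is statistically independent of the input and the capacity drops to zero, in agreement with the theorem's converse.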

1.2. ENTROPY INFORMATION

Let us now define the information measure, which is one of the vitally important aspects in the development of Shannon's information theory. For simplicity, we consider discrete input and output message ensembles A = {a_i} and B = {b_j}, respectively, as applied to a communication channel, as shown in Fig. 1.2. If a_i is an input event applied to the information channel and b_j is the corresponding transmitted output event, then the information measure provided by the received event b_j about the input event a_i can be written as

I(a_i; b_j) = log2 [P(a_i|b_j)/P(a_i)],  (1.1)

where P(a_i|b_j) is the conditional probability of the input event a_i given the output event b_j, P(a_i) is the a priori probability of the input event a_i, i = 1, 2, ..., M, and j = 1, 2, ..., N. By the symmetric property of the joint probability, we can show that

I(a_i; b_j) = I(b_j; a_i).  (1.2)

In other words, the amount of information transferred by the output event b_j about a_i is the same as that provided by the input event a_i about b_j. It is clear that if the input and output events are statistically independent, that is, if P(a_i, b_j) = P(a_i)P(b_j), then I(a_i; b_j) = 0. Furthermore, if I(a_i; b_j) > 0, then P(a_i, b_j) > P(a_i)P(b_j); that is, there is a higher joint probability of a_i and b_j. However, if I(a_i; b_j) < 0, then P(a_i, b_j) < P(a_i)P(b_j).

hν > E0 = −kT ln[1 − (1/2)^{1/α}],  (1.175)

where E0 is the threshold energy level for the photodetector.
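The information measure of Eq. (1.1) and the symmetry property of Eq. (1.2) are easy to check numerically. In the Python sketch below, the 2 × 2 joint probability table for a noisy binary channel is a hypothetical example, not taken from the text:

```python
import math

# Hypothetical joint probabilities P(a_i, b_j) for a noisy binary channel
# (rows: input events a_i; columns: output events b_j).
P_joint = [[0.40, 0.10],
           [0.05, 0.45]]

P_a = [sum(row) for row in P_joint]                              # P(a_i)
P_b = [sum(P_joint[i][j] for i in range(2)) for j in range(2)]   # P(b_j)

def info_measure(i, j):
    """I(a_i; b_j) = log2[P(a_i|b_j) / P(a_i)], Eq. (1.1)."""
    return math.log2((P_joint[i][j] / P_b[j]) / P_a[i])

def info_measure_reversed(i, j):
    """I(b_j; a_i) = log2[P(b_j|a_i) / P(b_j)]."""
    return math.log2((P_joint[i][j] / P_a[i]) / P_b[j])

# Eq. (1.2): the two measures agree for every event pair.
for i in range(2):
    for j in range(2):
        print(i, j, info_measure(i, j), info_measure_reversed(i, j))
```

Note that I(a_0; b_0) is positive (the joint probability exceeds the product of the marginals) while I(a_0; b_1) is negative, matching the sign discussion in the text.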


For α ≫ 1 the preceding equation can be written as

hν > kT(ln α + 0.367).  (1.176)

Since the absorption of one quantum of light is adequate for a positive response, the corresponding entropy increase is

ΔS = hν/T > k(ln α + 0.367).  (1.177)

The amount of information obtained would be

I = log2 α bits.  (1.178)

Thus we see that

ΔS − I k ln 2 > 0.367k > 0.  (1.179)

Except for the equality, this ΔS is identical to that of the low-frequency observation of Eq. (1.162). However, the entropy increase is much higher, since ν is very high. Although finer observation can be obtained by using a higher frequency, there is a price to be paid; namely, a higher cost of entropy. We now come to reliable observation. One must distinguish the basic difference between accuracy and reliability in observation. A reliable observation depends on the chosen decision threshold level E0; that is, the higher the threshold level, the higher the reliability. However, accuracy in observation is inversely related to the spread of the detected signal; the narrower the spread, the higher the accuracy. These two issues are illustrated in Fig. 1.17. It is evident that a higher chosen threshold energy level E0 yields higher reliability. However, higher reliability also produces a higher probability of misses. On the other hand, if E0 is set at a lower level, a less reliable observation is expected; in other words, a high probability of error (false alarms) may be produced, for example, by thermal noise fluctuation.
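The threshold condition of Eqs. (1.175) and (1.176) can be evaluated directly. In the Python sketch below, the SI constants, the 300 K detector temperature, and the reliability values α are assumptions made for illustration:

```python
import math

k = 1.380649e-23    # Boltzmann constant, J/K
h = 6.62607015e-34  # Planck constant, J*s

def threshold_energy(alpha, T):
    """Exact threshold E0 = -kT ln[1 - (1/2)**(1/alpha)], Eq. (1.175)."""
    return -k * T * math.log(1.0 - 0.5 ** (1.0 / alpha))

def approx_energy(alpha, T):
    """Large-alpha approximation kT(ln(alpha) + 0.367), Eq. (1.176)."""
    return k * T * (math.log(alpha) + 0.367)

T = 300.0  # assumed detector temperature, K
for alpha in (10, 100, 1000):
    E0 = threshold_energy(alpha, T)
    # E0 / h is the minimum photon frequency satisfying hv > E0
    print(alpha, E0, approx_energy(alpha, T), E0 / h)
```

Because the threshold grows only logarithmically with α, very high reliability costs comparatively little extra photon energy, consistent with the slow growth of the entropy cost in Eq. (1.177).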

1.7.1. UNCERTAINTY OBSERVATION

All physical systems are ultimately restricted by some limitations. When quantum conditions are in use, all limitations are essentially imposed by the basic Heisenberg uncertainty principle:

ΔE Δt ≥ h,  (1.180)

Fig. 1.17. Examples of accuracy and reliability of observation: the four cases combine high or low accuracy with high or low reliability.

where ΔE denotes the energy perturbation and Δt is the time interval of observation. By reference to the preceding observation made by radiation, one can compare the required energy ΔE with the mean-square thermal fluctuation of the photodetector, γkT, where γ is the number of degrees of freedom, which is essentially the number of low-frequency vibrations (hν ≪ kT). Thus, if ΔE < γkT, we have

Δt ≫ h/(γkT).  (1.181)

From this inequality we see that a larger time resolution Δt is required for low-frequency observation. Since ΔE is small, the perturbation within Δt is very small and can by comparison be ignored.


However, if the radiation frequency ν becomes higher, such that ΔE = hν > γkT, then we have

Δt ≥ h/ΔE = 1/ν.  (1.182)

We see that, as the radiant energy required for the observation increases, a more accurate time resolution can be made. But the perturbation of the observation is also higher. Thus, the time resolution Δt obtained by the observation may not even be correct, since ΔE is large. In the classical theory of light, observation has been assumed to be nonperturbing. This assumption is generally true for many-particle problems, for which a large number of quanta is used in observation. In other words, the accuracy of observation is not expected to be too high in the classical theory of light, since its imposed condition lies far from the uncertainty principle; that is, ΔE Δt ≫ h, or equivalently, Δp Δx ≫ h.

However, as quantum conditions occur, a nonperturbing system simply does not exist. When a higher quantum hν is used, a certain perturbation within the system is bound to occur; hence, high-accuracy observation is limited by the uncertainty principle. Let us look at the problem of observing an extremely small distance Δx between two particles. One must use a light source having a wavelength λ that satisfies the condition λ ≤ Δx.

ΔS > I(A; B) k ln 2.  (1.192)

1.8.1. CAPACITY OF A PHOTON CHANNEL

Let us denote the mean quantum number of the photon signal by m(ν), and the mean quantum number of the noise by n(ν). We have assumed an additive channel; thus, the signal plus noise is

f(ν) = m(ν) + n(ν).

Since the photon density (i.e., the mean number of photons per unit time per frequency) is the mean quantum number, the signal energy density per unit time can be written as E_s(ν) = m(ν)hν, where h is Planck's constant. Similarly, the noise energy density per unit time is E_N(ν) = n(ν)hν. Due to the fact that the mean quantum noise (blackbody radiation at temperature T) follows Planck's distribution (also known as the Bose-Einstein distribution),

n(ν) = 1/[exp(hν/kT) − 1],  (1.193)


the noise energy per unit time (noise power) can be calculated by

N = ∫_ε^∞ E_N(ν) dν = ∫_ε^∞ hν dν / [exp(hν/kT) − 1] = (πkT)²/6h,  (1.194)

where ε is an arbitrarily small positive constant. Thus, the minimum required entropy for the signal radiation is

ΔS = ∫₀^T (1/T′) [dE(T′)/dT′] dT′,  (1.195)

where E(T) is the signal energy density per unit time as a function of temperature, and T is the temperature of the blackbody radiation. Thus, in the presence of a signal, the output radiation energy per unit time (the power) can be written as

P = S + N,

where S and N are the signal and the noise power, respectively. Since the signal is assumed to be deterministic (i.e., a microstate signal), the signal entropy can be considered zero. Remember that the validity of this assumption rests mainly on the statistical independence of the signal and the noise, for which the photon statistics follow the Bose-Einstein distribution. However, the Bose-Einstein distribution cannot be used for the case of fermions because, owing to the Pauli exclusion principle, the microstates of the noise are restricted by the occupational states of the signal, or vice versa. In other words, in the case of fermions, the signal and the noise can never be assumed to be statistically independent. For the case of Bose-Einstein statistics, we see that the amount of entropy transfer by radiation remains unchanged.
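The closed form (πkT)²/6h of the noise-power integral in Eq. (1.194) can be checked numerically. The Python sketch below is a minimal illustration; the substitution x = hν/kT, the midpoint-rule quadrature, and the cutoff are arbitrary choices:

```python
import math

k = 1.380649e-23    # Boltzmann constant, J/K
h = 6.62607015e-34  # Planck constant, J*s

def noise_power(T, n=200000, x_max=50.0):
    """Integrate E_N(v) = hv/[exp(hv/kT) - 1] over frequency.

    With x = hv/(kT) the integral becomes (kT)**2/h times the integral
    of x/(e**x - 1), evaluated here by a simple midpoint rule.
    """
    dx = x_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += x / math.expm1(x) * dx
    return (k * T) ** 2 / h * total

T = 300.0
print(noise_power(T))                      # numerical quadrature
print((math.pi * k * T) ** 2 / (6 * h))    # closed form of Eq. (1.194)
```

The two printed values agree to several digits, since the dimensionless integral of x/(e^x − 1) over (0, ∞) equals π²/6.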

Since the mutual information is I(A; B) = H(B) − H(B/A), we see that I(A; B) reaches its maximum when H(B) is maximum. Thus, for maximum information transfer, the photon signal should be chosen randomly. But the maximum value of the entropy H(B) occurs when the ensemble of the microstates of the total radiation (the ensemble B) corresponds to Gibbs's distribution, which reaches thermal equilibrium. Thus, the corresponding mean occupational quantum number of the total radiation also follows the Bose-Einstein distribution at a given temperature T_e ≥ T:

f(ν) = 1/[exp(hν/kT_e) − 1],  (1.197)

where T_e is defined as the effective temperature. Since the output entropy can be determined by

H(B) = π²kT_e/(3h ln 2),  (1.198)

the quantum mechanical channel capacity can be evaluated as

C = π²k(T_e − T)/(3h ln 2).  (1.199)

In view of the total output power

(πkT_e)²/6h = S + (πkT)²/6h,  (1.200)

the effective temperature T_e can be written as

T_e = T[1 + 6hS/(πkT)²]^{1/2}.  (1.201)

Thus, the capacity of a photon channel can be shown to be

C = [π²kT/(3h ln 2)]{[1 + 6hS/(πkT)²]^{1/2} − 1}.  (1.202)

We would further note that a high signal-to-noise ratio corresponds to high-frequency transmission (hν ≫ kT), for which we have 6hS/(πkT)² ≫ 1.


Fig. 1.18. Photon channel capacity as a function of signal powers, for various values of thermal noise temperature T. Dashed lines represent the classical asymptotes of Eq. (1.203).

Then the photon channel capacity is limited by the quantum statistics; that is,

C_quant ≈ (π/ln 2)(2S/3h)^{1/2}.  (1.203)

However, if the signal-to-noise ratio is low (hν ≪ kT), the photon channel capacity reduces to the classical limit:

C_class ≈ S/(kT ln 2).  (1.204)

Figure 1.18 shows a plot of the photon channel capacity as a function of signal energy. We see that in the low signal-to-noise ratio (hν ≪ kT) regime, the channel capacity approaches this classical limit. However, for high signal-to-noise ratio (hν ≫ kT), the channel capacity approaches the quantum limit of Eq. (1.203), which offers a much higher transmission rate.


REFERENCES

1.1 N. Wiener, Cybernetics, MIT Press, Cambridge, Mass., 1948.
1.2 N. Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Series, MIT Press, Cambridge, Mass., 1949.
1.3 C. E. Shannon, "A Mathematical Theory of Communication," Bell Syst. Tech. J., vol. 27, 379-423, 623-656, 1948.
1.4 C. E. Shannon and W. Weaver, The Mathematical Theory of Communication, University of Illinois Press, Urbana, 1949.
1.5 W. Davenport and W. Root, Random Signals and Noise, McGraw-Hill, New York, 1958.
1.6 M. Loeve, Probability Theory, 3rd ed., Van Nostrand, Princeton, N.J., 1963.
1.7 D. Gabor, "Theory of Communication," J. Inst. Electr. Eng., vol. 93, 429, 1946.
1.8 D. Gabor, "Communication Theory and Physics," Phil. Mag., vol. 41, no. 7, 1161, 1950.
1.9 J. Neyman and E. S. Pearson, "On the Use and Interpretation of Certain Test Criteria for Purposes of Statistical Inference," Biometrika, vol. 20A, 175, 263, 1928.
1.10 L. Brillouin, Science and Information Theory, 2nd ed., Academic Press, New York, 1962.
1.11 F. W. Sears, Thermodynamics, the Kinetic Theory of Gases, and Statistical Mechanics, Addison-Wesley, Reading, Mass., 1953.
1.12 F. T. S. Yu and X. Yang, Introduction to Optical Engineering, Cambridge University Press, Cambridge, UK, 1997.

EXERCISES

1.1 A picture is indeed worth more than a thousand words. For simplicity, assume that an old-fashioned monochrome Internet screen has a 500 × 600 pixel array and that each pixel element is capable of providing eight distinguishable brightness levels. If the selection of each information pixel element is equiprobable, calculate the amount of information that the screen can provide. On the other hand, a broadcaster has a pool of 10,000 words; if he randomly picks 1000 words from this pool, calculate the amount of information provided by these words.

1.2 Given an n-ary symmetric channel, the transition probability matrix is written as

        | 1-p       p/(n-1)   ...   p/(n-1) |
        | p/(n-1)   1-p       ...   p/(n-1) |
[P] =   |   .         .        .       .    |
        |   .         .        .       .    |
        | p/(n-1)   p/(n-1)   ...   1-p     |


(a) Evaluate the channel capacity.
(b) Repeat part (a) for a binary channel.
1.3 Show that an information source provides the maximum amount of information if and only if the probability distribution of the information ensemble is equiprobable.
1.4 Let an input ensemble to a discrete memoryless channel be A = {a1, a2, a3}, with the probabilities of occurrence p(a1), p(a2), and p(a3) as given, and let B = {b1, b2, b3} be the set of the output ensemble. If the transition matrix of the channel is given by

a. Calculate the output entropy H(B).
b. Compute the conditional entropy H(A/B).
1.5 Let us consider a memoryless channel with input ensemble A = {a1, a2, ..., ar}, output ensemble B = {b1, b2, ..., bs}, and channel matrix [p(bj/ai)]. A random decision rule may be formalized by assuming that if the channel output is bj, for every j = 1, 2, ..., s, the decoder will select ai with probability q(ai/bj), for every i = 1, 2, ..., r. Show that for a given input distribution there is no random decision rule that provides a lower probability of error than the ideal observer.
1.6 The product of two discrete memoryless channels C1 and C2 is a channel whose inputs are ordered pairs (ai, a′j) and whose outputs are ordered pairs (bk, b′l), where the first coordinates belong to the alphabet of C1 and the second coordinates to the alphabet of C2. If the transition probability of the product channel is

P(bk, b′l / ai, a′j) = p(bk/ai) p(b′l/a′j),

determine the capacity of the product channel.
1.7 Develop a maximum-likelihood decision rule and determine the probability of error for a discrete memoryless channel, with the transition probability matrix P and the input probabilities p(a1) and p(a2) as given.


1.8 A memoryless continuous channel is perturbed by an additive Gaussian noise with zero mean and variance equal to N. The output entropy and the noise entropy can be written as

H(B) = (1/2) log2 [2πe(S + N)],
H(B/A) = (1/2) log2 (2πeN),

where b = a + c (i.e., signal plus noise), and σb² = σa² + σc² = S + N. Calculate the capacity of the channel.
1.9 The input to an ideal low-pass filter is a narrow rectangular pulse,

f(t) = 1 for |t| ≤ Δt/2, and 0 otherwise,

and the filter transfer function is

H(ν) = 1 for |ν| ≤ Δν/2, and 0 otherwise.

a. Determine the output response g(t).
b. Sketch g(t) when ΔtΔν < 1, ΔtΔν = 1, and ΔtΔν > 1.
c. If the filter, in addition, has a linear phase response, show how this linear-phase low-pass filter affects the answers in part (b).
1.10 Consider an m × n array spatial channel in which we spatially encode a set of coded images. What would be the spatial channel capacity?
1.11 A complex Gaussian pulse is given by u(t) = K exp(−λt²), where K is a constant and

λ = a + ib.

Determine and sketch the ambiguity function, and discuss the result as it applies to radar detection.
1.12 Given a chirp signal u(t), calculate and sketch the corresponding Wigner distribution.


1.13 A sound spectrograph can display speech signals in the form of a frequency-versus-time plot, which bears the name of logon (spectrogram). The analyzer is equipped with a narrowband filter, Δν = 45 Hz, and a wideband filter, Δν = 300 Hz.
a. Show that it is impossible to resolve the frequency and time information simultaneously by using only one of these filters.
b. For a high-pitched voice that varies from 300 to 400 Hz, show that it is not possible to resolve the time information, although it is possible to resolve the fine-frequency content.
c. On the other hand, for a low-pitched voice that varies from 20 to 45 Hz, show that it is possible to resolve the time information but not the fine-frequency content.
1.14 We equip Szilard's demon with a beam of light to illuminate the chamber, in which only one molecule wanders, as shown in Fig. 1.19. Calculate
a. the net entropy change per cycle of operation,
b. the amount of information required by the demon,
c. the amount of entropy change of the demon, and
d. show that with the intervention of the demon, the system still operates within the limit of the second law of thermodynamics.
1.15 A rectangular photosensitive paper may be divided into an M × N array of information cells. Each cell is capable of resolving K distinct gray levels. By a certain recording (i.e., encoding) process, we reduce the M × N array to a P × Q array of cells, with P < M and Q < N.
a. What is the amount of entropy decrease in reducing the number of cells?
b. Let an image be recorded on this photosensitive paper. The amount of information provided by the image is Ii bits, which is assumed to be smaller than the information capacity of the paper. If Ie (i.e., Ie < Ii)

Fig. 1.19. Szilard's machine with the intervention of the demon. D, photodetector; L, light beams; C, transparent cylinder; P, piston; m, molecule.


bits of the recorded information are considered to be in error, determine the entropy required for restoring the image.
1.16 For low-frequency observation, if the quantum state g = m is selected as the decision threshold, we will have an error probability of 50% per observation. If we choose g = 5m, calculate the probability of error per observation.
1.17 For high-frequency observation, show that the cost of entropy per observation is greater. What is the minimum amount of entropy required per observation? Compare this result with the low-frequency case and show that the high-frequency case is more reliable.
1.18 For the problem of many simultaneous observations, if we assume that an observation gives a positive (correct) result (i.e., any one of the α photodetectors gives rise to an energy level above E0 = ghν) and a 25% chance of observation error is imposed, calculate
a. the threshold energy level E0 as a function of α, and
b. the corresponding amount of entropy increase in the photodetectors.
1.19 In low-frequency simultaneous observations, if we obtain γ simultaneous correct observations out of α photodetectors (γ < α), determine the minimum cost of entropy required. Show that for the high-frequency case, the minimum entropy required is even greater.
1.20 Given an imaging device, for example a telescope, if the field of view of a distant scene is denoted by A and the resolution of this imaging system is limited to a small area ΔA over A, calculate the accuracy of this imaging system and the amount of information provided by the observation.
1.21 With reference to the preceding problem,
a. Show that a high-accuracy observation requires a light source with a shorter wavelength, and that the reliability of the observation is inversely proportional to the wavelength employed.
b. Determine the observation efficiency for higher-frequency observation.
1.22 The goal of a certain experiment is to measure the distance between two reflecting walls. Assume that a plane monochromatic wave is reflected back and forth between the two walls. The number of interference fringes is determined to be 10, and the wavelength of the light employed is 600 nm. Calculate
a. the separation between the walls,
b. the amount of entropy increase in the photodetector (T = 300 K),
c. the amount of information required,
d. the corresponding energy threshold level of the photodetector if a reliability R is required, where R = 1/(probability of error), and
e. the efficiency of this observation.
1.23 Repeat the preceding problem for the higher-frequency observation.


1.24 With reference to the problem of observation under a microscope, if a square aperture (sides = r0) instead of a circular one is used, calculate the minimum amount of entropy increase needed to overcome the thermal background fluctuation. Show that the cost of entropy under microscope observation is even greater than k ln 2.
1.25 Show that the spatial information capacity of an optical channel under coherent illumination is generally higher than under incoherent illumination.
1.26 Let us consider a band-limited periodic signal of bandwidth Δνm, where νm is the maximum frequency content of the signal. If the signal is passed through an ideal low-pass filter of bandwidth Δνf, where the cutoff frequency νf is lower than νm, estimate the minimum cost of entropy required to restore the signal.
1.27 Refer to the photon channel previously evaluated. What would be the minimum amount of energy required to transmit a bit of information through the channel?
1.28 High and low signal-to-noise ratios are directly related to the frequency of transmission through the channel. By referring to the photon channels that we have obtained, under what condition do the classical limit and the quantum statistics meet? For a low transmission rate, what would be the minimum energy required to transmit a bit of information?
1.29 It is well known that under certain conditions a band-limited photon channel capacity can be written as

C ≈ Δν log2 [1 + S/(hν Δν)].

The capacity of the channel increases as the mean occupation number n̄ = S/(hν Δν) increases. Remember, however, that this is valid only under the condition Δν/ν ≪ 1, for a narrowband channel. It is incorrect to assume that the capacity becomes infinitely large as ν → 0. Strictly speaking, the capacity will never exceed the quantum limit as given in Eq. (1.203). Show that the narrowband photon channel is in fact Shannon's continuous channel under high signal-to-noise ratio transmission.


Chapter 2

Signal Processing with Optics Francis T. S. Yu PENNSYLVANIA STATE UNIVERSITY

Optical processing can perform a myriad of processing operations, primarily because of its complex-amplitude processing capability. Optical signal processors can perform one- or two-dimensional spatial operations using simple linear operations, as in conventional linear systems. However, none of these inherent processing merits of optics can be realized without the support of the good coherence properties of light. For this reason, we shall begin our discussion with the fundamental coherence theory of light.

2.1. COHERENCE THEORY OF LIGHT

When radiation from two sources maintains a fixed phase relation between them, the sources are said to be mutually coherent. Accordingly, an extended source is coherent if all points of the source have fixed phase differences among them. We must first understand the basic theory of coherent light. In the classical theory of electromagnetic radiation, it is usually assumed that the electric and magnetic fields are measurable quantities at any position. In this situation there is no need for coherence theory to interpret the properties of light. There are scenarios, however, in which this assumption cannot be made; for these it is essential to apply coherence theory. For example, if we want to determine the diffraction pattern caused by radiation


Fig. 2.1. A wavefront propagates in space.

from several sources, we cannot obtain an exact result unless the degrees of coherence among the separate sources are taken into account. In such a situation, it is desirable to obtain a statistical ensemble average for the most likely result, for example, from any combination of sources. It is therefore more useful to provide a statistical description than to follow the dynamic behavior of a wave field in detail.

Let us assume an electromagnetic wave field propagating in space, as depicted in Fig. 2.1, where u1(t) and u2(t) denote the instantaneous wave disturbances at positions 1 and 2, respectively. The mutual coherence function (i.e., cross-correlation function) between these two disturbances can be written as

Γ12(τ) = ⟨u1(t + τ) u2*(t)⟩ = lim(T→∞) (1/2T) ∫ from −T to T of u1(t + τ) u2*(t) dt,   (2.1)

where the superasterisk denotes the complex conjugate, ⟨ ⟩ represents the time ensemble average, and τ is a time delay. The complex degree of coherence between u1(t) and u2(t) can be defined as

γ12(τ) = Γ12(τ) / [Γ11(0) Γ22(0)]^(1/2),   (2.2)

where Γ11(τ) and Γ22(τ) are the self-coherence functions of u1(t) and u2(t), respectively.
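Definitions (2.1) and (2.2) are easy to exercise numerically. In the following sketch (the mixing model and the coefficients are our own illustrative assumptions, not from the text), two disturbances share a common random component, and the time average recovers the expected degree of coherence:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Two complex wave disturbances sharing a common random component (toy model):
common = rng.normal(size=n) + 1j * rng.normal(size=n)
other = rng.normal(size=n) + 1j * rng.normal(size=n)
u1 = common
u2 = 0.8 * common + 0.6 * other   # 0.8**2 + 0.6**2 = 1 keeps the powers equal

def degree_of_coherence(a, b):
    """|gamma12(0)| of Eq. (2.2), using time averages for the Gammas."""
    gamma12 = np.mean(a * np.conj(b))                       # Gamma12(0)
    norm = np.sqrt(np.mean(abs(a) ** 2) * np.mean(abs(b) ** 2))
    return abs(gamma12) / norm

print(degree_of_coherence(u1, u2))   # close to the mixing weight, 0.8
print(degree_of_coherence(u1, u1))   # 1: a disturbance is fully coherent with itself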


Needless to say, the degree of coherence is bounded by 0 ≤ |γ12(τ)| ≤ 1, in which |γ12| = 1 represents strictly coherent and |γ12| = 0 represents strictly noncoherent radiation. We note that in the high-frequency regime it is difficult, if not impossible, to evaluate the degree of coherence directly. However, there exists a practical fringe-visibility relationship by which the degree of coherence |γ12| can be measured directly, by referring to Young's experiment in Fig. 2.2, in which Σ represents a monochromatic extended source. A diffraction screen is located at a distance r10 from the source, with two small pinholes, Q1 and Q2, in this screen, separated by a distance d. On the observing screen, located a distance r20 from the diffracting screen, we observe an interference pattern in which the maximum and minimum intensities Imax and Imin of the fringes are measured. The Michelson visibility can then be defined as

V = (Imax − Imin) / (Imax + Imin).   (2.3)

We shall now show that under the equal-intensity condition (i.e., I1 = I2), the visibility measure is equal to the degree of coherence. The electromagnetic wave disturbances u1(t) and u2(t) at Q1 and Q2 can be determined by the wave equation,

∇²u = (1/c²) ∂²u/∂t²,

where c is the velocity of light. The disturbance at point P on the observing screen can be written as

up(t) = c1 u1(t − r1/c) + c2 u2(t − r2/c),

Fig. 2.2. Young's experiment.


where c1 and c2 are the appropriate complex constants. The corresponding irradiance at P is written as

Ip = I1 + I2 + 2 Re⟨c1 u1(t − t1) c2* u2*(t − t2)⟩,

where I1 and I2 are proportional to the squares of the magnitudes of u1(t) and u2(t). By letting

t1 = r1/c,   t2 = r2/c,   τ = t2 − t1,

the preceding equation can be written as

Ip = I1 + I2 + 2 c1 c2 Re⟨u1(t + τ) u2*(t)⟩.

In view of the mutual coherence and self-coherence functions, we show that

Ip = I1 + I2 + 2 (I1 I2)^(1/2) Re[γ12(τ)].

Thus, we see that for I = I1 = I2, the preceding equation reduces to

Ip = 2I {1 + Re[γ12(τ)]},

in which we see that

V = |γ12(τ)|;   (2.4)

the visibility measure is equal to the degree of coherence. Let us now carry Young's experiment further by letting d increase. We see that the visibility drops rapidly to zero, then reappears, and so on, as shown in Fig. 2.3. There is also a variation in fringe frequency as d varies; in other words, as d increases, the spatial frequency of the fringes also increases. Since the visibility is equal to the degree of coherence, it is, in fact, the degree of spatial coherence between points Q1 and Q2. If we reduce the source size Σ until it is very small, as illustrated in Fig. 2.3, we see that the degree of (spatial) coherence becomes unity (100%) over the diffraction screen. From this point of view, we see that the degree of spatial coherence is, in fact, governed by the source size. Investigating further, when the observation point P moves away from the center of the observing screen, the visibility decreases as the path difference Δr = r2 − r1 increases, until it eventually becomes zero. The effect also depends

2.1. Coherence Theory of Light

Fig. 2.3. Visibility as a function of separation d.

on how nearly monochromatic the source is. The visibility as affected by the path difference can be written as

Δr = c/Δν,   (2.5)

where c is the velocity of light and Δν is the spectral bandwidth of the source. The preceding equation is also used to define the coherence length (or temporal coherence) of the source, which is the distance over which the light beam is longitudinally coherent. In view of the preceding discussion, one sees that spatial coherence is primarily governed by the source size, and temporal coherence is governed by the spectral bandwidth of the source. In other words, a monochromatic point source is a strictly coherent source, while a monochromatic source is a temporally coherent source and a point source is a spatially coherent source. Nevertheless, it is not necessary to have completely coherent light to produce an interference pattern. Under certain conditions, an interference pattern may be produced from an incoherent source; this effect is called partial coherence. It is worthwhile to point out that the degree of temporal coherence of a source can be obtained by using the Michelson interferometer, as shown in Fig. 2.4. In short, by varying one of the mirrors, an interference fringe pattern can be viewed at the observation plane. The path difference, after the light beam is


Fig. 2.4. The Michelson interferometer. BS, beam splitter; M, mirrors; P, observation screen.

split, is given by

Δr = c Δt = c/Δν,   (2.6)

where the temporal coherence interval of the source is

Δt = 1/Δν.   (2.7)

In fact, the coherence length of the source can also be shown to be

Δr = λ²/Δλ,   (2.8)

where λ is the center wavelength and Δλ is the spectral bandwidth of the light source.
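Equation (2.8) makes the contrast between narrowband and broadband sources concrete. As a quick numerical sketch (the source parameters below are typical illustrative values we have assumed, not taken from the text):

```python
def coherence_length(center_wavelength_m, bandwidth_m):
    """Coherence length of Eq. (2.8): delta_r = lambda**2 / delta_lambda."""
    return center_wavelength_m ** 2 / bandwidth_m

# A narrow laser line: lambda = 633 nm, delta_lambda = 0.001 nm
laser = coherence_length(633e-9, 1e-12)    # roughly 0.4 m

# A broadband white-light source: lambda = 550 nm, delta_lambda = 300 nm
white = coherence_length(550e-9, 300e-9)   # roughly 1 micrometer

print(laser, white)
```

The five-orders-of-magnitude gap is why interference fringes are easy to obtain with a laser but require nearly equal path lengths with white light.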

2.2. PROCESSING UNDER COHERENT AND INCOHERENT ILLUMINATION

Let a hypothetical optical processing system be as shown in Fig. 2.5. Assume that the light emitted by the source Σ is monochromatic, and let u(x, y) be the complex light distribution at the input signal plane due to an incremental source dΣ. If the complex amplitude transmittance of the input plane is f(x, y),


" Light source

lx"

i"

i s

-*ol~ -4-—^

input plane

Output plane

Fig. 2.5. A hypothetical optical processing system.

the complex light field immediately behind the signal plane would be u(x, y)f(x, y). We assume the optical system (i.e., the black box) is linear and spatially invariant, with a spatial impulse response h(x, y); the output complex light field, due to dΣ, can be calculated by

g(α, β) = ∫∫ u(x, y) f(x, y) h(α − x, β − y) dx dy,

which can be written as

g(α, β) = [u(α, β) f(α, β)] ∗ h(α, β),

where the asterisk represents the convolution operation and the superasterisk denotes the complex conjugation. The overall output intensity distribution is therefore

I(α, β) = ∫ over Σ of g(α, β) g*(α, β) dΣ,

which can be written in the following convolution integral:

I(α, β) = ∫∫∫∫ Γ(x, y; x′, y′) h(α − x, β − y) h*(α − x′, β − y′) f(x, y) f*(x′, y′) dx dy dx′ dy′,   (2.9)

where

Γ(x, y; x′, y′) = ∫ over Σ of u(x, y) u*(x′, y′) dΣ


is the spatial coherence function, also known as the mutual intensity function, at the input plane (x, y). For completely incoherent illumination, the spatial coherence function reduces to Γ(x, y; x′, y′) = K δ(x − x′) δ(y − y′), for which Eq. (2.9) becomes

I(α, β) = K ∫∫ |h(α − x, β − y)|² |f(x, y)|² dx dy,   (2.14)

in which we see that the output intensity distribution is the convolution of the input signal intensity with the intensity impulse response. In other words, for completely incoherent illumination, the optical signal processing system is linear in intensity; that is,

I(α, β) = K |h(α, β)|² ∗ |f(α, β)|²,   (2.15)

where the asterisk denotes the convolution operation. On the other hand, for completely coherent illumination, i.e., Γ(x, y; x′, y′) = K², the output intensity distribution can be shown to be

I(α, β) = g(α, β) g*(α, β) = ∫∫ h(α − x, β − y) f(x, y) dx dy ∫∫ h*(α − x′, β − y′) f*(x′, y′) dx′ dy′,   (2.16)

where

g(α, β) = ∫∫ h(α − x, β − y) f(x, y) dx dy,   (2.17)

for which we can see that the optical signal processing system is linear in complex amplitude. In other words, a coherent optical processor is capable of processing information in complex amplitude.
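The operational difference between the two regimes, linearity in intensity versus linearity in complex amplitude, can be seen in a one-dimensional toy calculation (the signal and impulse-response values below are arbitrary illustrations of our own):

```python
import numpy as np

# The same input f and impulse response h, processed two ways.
f = np.array([1.0, -1.0, 1.0, 0.0])   # complex amplitude transmittance (toy values)
h = np.array([1.0, 1.0])              # spatial impulse response (toy values)

# Coherent processing: convolve in amplitude, then take the intensity.
g = np.convolve(f, h)
I_coherent = np.abs(g) ** 2

# Incoherent processing: convolve the intensities directly.
I_incoherent = np.convolve(np.abs(f) ** 2, np.abs(h) ** 2)

print(I_coherent)    # amplitudes of opposite sign cancel before squaring
print(I_incoherent)  # intensities are nonnegative and can only add
```

The coherent output shows zeros where positive and negative amplitudes cancel, something the incoherent (intensity-only) system can never produce; this is the complex-amplitude processing capability referred to above.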


2.3. FRESNEL-KIRCHHOFF AND FOURIER TRANSFORMATION

2.3.1. FREE SPACE IMPULSE RESPONSE

To understand the basic concept of optical Fourier transformation, we begin our discussion with the development of the Fresnel-Kirchhoff integral. Let us start from the Huygens principle, in which the complex amplitude observed at the point p′ of a coordinate system



vi

Contents

Chapter 2. Signal Processing with Optics

67

2.1. Coherence Theory of Light 2.2. Processing under Coherent and Incoherent Illumination 2.3. Fresnel-Kirchhoff and Fourier Transformation 2.3.1. Free Space Impulse Response 2.3.2. Fourier Transformation by Lenses 2.4. Fourier Transform Processing 2.4.1. Fourier Domain Filter 2.4.2. Spatial Domain Filter 2.4.3. Processing with Fourier Domain Filters 2.4.4. Processing with Joint Transformation 2.4.5. Hybrid Optical Processing 2.5. image Processing with Optics 2.5.1. Correlation Detection 2.5.2. Image Restoration 2.5.3. Image Subtraction 2.5.4. Broadband Signal Processing 2.6. Algorithms for Processing 2.6.1. Mellin-Transform Processing 2.6.2. Circular Harmonic Processing 2.6.3. Homomorphic Processing 2.6.4. Synthetic Discriminant Algorithm 2.6.5. Simulated Annealing Algorithm 2.7. Processing with Photorefractive Optics 2.7.1. Photorefractive Eifect and Materials 2.7.2. Wave Mixing and Multiplexing 2.7.3. Bragg Diffraction Limitation 2.7.4. Angular and Wavelength Selectivities 2.7.5. Shift-Invariant Limited Correlators 2.8. Processing with Incoherent Light 2.8.1. Exploitation of Coherence 2.8.2. Signal Processing with White Light 2.8.3. Color Image Preservation and Pseudocoloring 2.9. Processing with Neural Networks 2.9.1. Optical Neural Networks 2.9.2. Holpfield Model 2.9.3. Inpattern Association Model References Exercises

67 2 76 76 77 79 79 82 83 85 88 89 89 93 98 98 103 104 105 107 108 112 i 15 115 118 121 122 125 131 131 135 138 141 142 143 144 147 148

Chapter 3, Communication with Optics

163

3.1. Motivation of Fiber-Optic Communication 3.2. Light Propagation in Optical Fibers 3.2.1. Geometric Optics Approach 3.2.2. Wave-Optics Approach 3.2.3. Other Issues Related to Light Propagating in Optical Fiber 3.3. Critical Components 3.3.1. Optical Transmitters for Fiber-Optic Communications — Semiconductor Lasers

163 164 164 164 168 184

7

184

Contents

3.4.

3.3.2. Optical Receivers for Fiber-Optic Communications 3.3.3. Other Components Used in Fiber-Optic Communications Fiber-Optic Networks 3.4.1. Types of Fiber-Optic Networks Classified by Physical Size 3.4.2. Physical Topologies and Routing Topologies Relevant to Fiber-Optic Networks 3.4.3. Wavelength Division Multiplexed Optics Networks 3.4.4. Testing Fiber-Optic Networks References Exercises

VI i

188 192 192 193 193 193 195 198 198

Chapter 4. Switching with Optics

20!

4.1. Figures of Merits for an Optical Switch 4.2. All-Optical Switches 4.2.1. Optical Nonlinearity 4.2.2. Etalon Switching Devices 4.2.3. Nonlinear Directional Coupler 4.2.4. Nonlinear Interferometric Switches 4.3. Fast Electro-optic Switches: Modulators 4.3.1. Direct Modulation of Semiconductor Lasers 4.3.2. External Electro-optic Modulators 4.3.3. MEMS Switches Without Moving Parts 4.4. Optical Switching Based on MEMS 4.4.1. MEMS Fabrications 4.4.2. Electrostatic Actuators 4.4.3. MEMS Optical Switches 4.5. Summary References Exercises

202 203 205 205 208 211 219 220 225 236 236 237 238 242 247 248 250

Chapter 5. Transformation with Optics

255

5.1. Huygens- Fresnel Diffraction 5.2. Fresnel Transform 5.2.1. Definition 5.2.2. Optical Fresnel Transform 5.3. Fourier Transform 5.4. Wavelet Transform 5.4.1 Wavelets 5.4.2. Time-frequency Joint Representation 5.4.3. Properties of Wavelets 5.5. Physical Wavelet Transform 5.5.1. Electromagnetic Wavelet 5.5.2. Electromagnetic Wavelet Transform 5.5.3. Electromagnetic Wavelet Transform and Huygens Diffraction 5.6. Wigner Distribution Function 5.6.1. Definition 5.6.2. Inverse Transform

256 257 257 257 259 260 260 261 262 264 264 266 260 270 270 271

Vlll

5.7.

5.8.

5.9.

5.10.

5.11.

Contents 5.6.3. Geometrical Optics Interpretation 5.6.4. Wigner Distribution Optics Fractional Fourier Transform 5.7.1. Definition 5.7.2. Fractional Fourier Transform and Fresnel Diffraction Hankel Transform 5.8.1. Fourier Transform in Polar Coordinate System 5.8.2. Hankel Transform Radon Transform 5.9.1. Definition 5.9.2. Image Reconstruction Geometric Transform 5.10.1. Basic Geometric Transformations 5.10.2. Generalized Geometric Transformation 5.10.3. Optical Implementation Hough Transform 5.11.1. Definition 5.11.2. Optical Hough Transform References Exercises

Chapter 6 Interconnection with Optics 6.1. 6.2.

Introduction Polymer Waveguides 6.2.1. Polymeric Materials for Waveguide Fabrication 6.2.2. Fabrication of Low-Loss Polymeric Waveguides 6.3.2. Waveguide Loss Measurement 6.3. Thin-Film Waveguide Couplers 6.3.1. Surface-Normal Grating Coupler Design and Fabrication 6.3.2. 45° Surface-Normal Micromirror Couplers 6.4. Integration of Thin-Film Photodetectors 6.5. Integration of Vertical Cavity Surface-Emitting Lasers (VCSELs) 6.6. Optical Clock Signal Distribution 6.7. Polymer Waveguide-Based Optical Bus Structure 6.7.1. Optical Equivalent for Electronic Bus Logic Design 6.8. Summary References Exercises

Chapter 7 Pattern Recognition with Optics 7.1.

7.2.

Basic Architectures 7.1.1. Correlators 7.1.2. Neural Networks 7.1.3. Hybrid Optical Architectures 7.1.4. Robustness of JTC Recognition by Correlation Detections 7,2.1. Nonconventional Joint-Transform Detection

271 272 275 275 277 279 279 281 282 282 283 284 288 288 289 292 292 293 294 295

299 299 303 303 305 310 312 312 326 331 334 339 343 345 348 349 351

355 356 356 357 357 362 364 364

Contents

7.3.

7.4.

7.5.

7.6.

7.7.

7.8.

7.2.2. Nonzero-order Joint-Transform Detection 7.2.3. Position-Encoding Joint-Transform Detection 7.2.4. Phase-Representation Joint-Transform Detection 7.2.5. Iterative Joint-Transform Detection Polychromatic Pattern Recognition 7.3.1. Detection with Temporal Fourier-Domain Filters 7.3.2. Detection with Spatial-Domain Filters Target Tracking 7.4.1. Autonomous Tracking 7.4.2. Data Association Tracking Pattern Recognition Using Composite Filtering 7.5.1. Performance Capacity 7.5.2. Quantization Performance Pattern Classification 7.6.1. Nearest Neighbor Classifiers 7.6.2. Optical Implementation Pattern Recognition with Photorefractive Optics 7.7.1. Detection by Phase Conjugation 7.7.2. Wavelength-Multiplexed Matched Filtering 7.7.3. Wavelet Matched Filtering Neural Pattern Recognition 7.8.1. Recognition by Supervised Learning 7.8.2. Recognition by Unsupervised Learning 7.8.3. Polychromatic Neural Networks References Exercises

Chapter 8 Information Storage with Optics 8.1. Digital Information Storage 8.2. Upper Limit of Optical Storage Density 8.3 Optical Storage Media 8.3.J. Photographic Film 8.3.2. Dichromated Gelatin 8.3.3. Photopolymers 8.3.4. Photoresists 8.3.5. Thermoplastic Film 8.3.6. Photorefractive Materials 8.3.7. Photochromic Materials 8.3.8. Electron-Trapping Materials 8.3.9. Two Photon-Absorption Materials 8.3.10. Bacteriorhodospin 8.3.11. Photochemical Hole Burning 8.3.12. Magneto-optic Materials 8.3.13. Phase-Change Materials 8.4. Bit-Pattern Optical Storage 8.4.1. Optical Tape 8.4.2. Optical Disk 8.4.3. Multilayer Optical Disk 8.4.4. Photon-Gating 3-D Optical Storage

IX 368 370 371 372 375 376 377 380 380 382 387 388 390 394 395 398 401 401 404 407 411 412 414 418 422 423

435 435 436 438 438 439 439 440 440 441 442 442 443 444 444 445 446 446 447 447 448 449

x

Contents

8.4.5. Stacked-Layer 3-D Optical Storage 8.4.6. Photochemical Hole-Burning 3-D Storage 8.5. Holographic Optical Storage 8.5.1. Principle of Holography 8.5.2. Plane Holographic Storage 8.5.3. Stacked Holograms for 3-D Optical Storage 8.5.4. Volume Holographic 3-D Optical Storage 8.6. Near Field Optical Storage 8.7. Concluding Remarks References Exercises

Chapter 9 9.1. 9.2.

9.3.

9.4.

9.5.

9.6.

9.7.

Computing with Optics

Introduction Parallel Optical Logic and Architectures 9.2.1. Optical Logic 9.2.2. Space-Variant Optical Logic 9.2.3. Programmable Logic Array 9.2.4. Parallel Array Logic 9.2.5. Symbolic Substitution 9.2.6. Content-Addressable Memory Number Systems and Basic Operations 9.3.1. Operations with Binary Number Systems 9.3.2. Operations with Nonbinary Number Systems Parallel Signed-Digit Arithmetic 9.4.1. Generalized Signed-Digit Number Systems 9.4.2. MSD Arithmetic 9.4.3. TSD Arithmetic 9.4.4. QSD Arithmetic Conversion between Different Number Systems 9.5.1. Conversion between Signed-Digit and Complement Number Systems 9.5.2. Conversion between NSD and Negabinary Number Systems Optical Implementation 9.6.1. Symbolic Substitution Implemented by Matrix-Vector Operation 9.6.2. SCAM-Based Incoherent Correlator for QSD Addition 9.6.3. Optical Logic Array Processor for Parallel NSD Arithmetic Summary References Exercises

Chapter 10 Sensing with Optics 10.1. Introduction 10.2. A Brief Review of Types of Fiber-Optic Sensors 10.2.1. Intensity-Based Fiber-Optic Sensors 10.2.2. Polarization-Based Fiber-Optic Sensors 10.2.3. Phase-Based Fiber-Optic Sensors 10.2.4. Frequency (or Wavelength)-Based Fiber-Optic Sensors



10.3. Distributed Fiber-Optic Sensors 10.3.1. Intrinsic Distributed Fiber-Optic Sensors 10.3.2. Quasi-distributed Fiber-Optic Sensors 10.4. Summary References Exercises


Chapter 11 Information Display with Optics 11.1. Introduction 11.2. Information Display Using Acousto-optic Spatial Light Modulators 11.2.1. The Acousto-optic Effect 11.2.2. Intensity Modulation of Laser 11.2.3. Deflection of Laser 11.2.4. Laser TV Display Using Acousto-optic Devices 11.3. 3-D Holographic Display 11.3.1. Principles of Holography 11.3.2. Optical Scanning Holography 11.3.3. Synthetic Aperture Holography 11.4. Information Display Using Electro-optic Spatial Light Modulators 11.4.1. The Electro-optic Effect 11.4.2. Electrically Addressed Spatial Light Modulators 11.4.3. Optically Addressed Spatial Light Modulators 11.5. Concluding Remarks References Exercises

Chapter 12 Networking with Optics 12.1. Background 12.2. Optical Network Elements 12.2.1. Optical Fibers 12.2.2. Optical Amplifiers 12.2.3. Wavelength Division Multiplexer/Demultiplexer 12.2.4. Transponder 12.2.5. Optical Add/Drop Multiplexer 12.2.6. Optical Cross-Connect 12.2.7. Optical Monitoring 12.3. Design of Optical Transport Network 12.3.1. Optical Fiber Dispersion Limit 12.3.2. Optical Fiber Nonlinearity Limit 12.3.3. System Design Examples 12.4. Applications and Future Development of Optical Networks 12.4.1. Long-haul Backbone Networks 12.4.2. Metropolitan and Access Networks 12.4.3. Future Development References Exercises

Index


Preface

We live in a world bathed in light. Light is not only the source of energy necessary for life (plants grow by drawing energy from sunlight); it is also a source of energy for information: our vision is based on light detected by our eyes, though our eyes supply us with information rather than bodily energy. Furthermore, applications of optics to information technology are not limited to vision and can be found almost everywhere. The deployment rate of optical technology is extraordinary. For example, optical fiber for telecommunication is being installed at about one kilometer every second worldwide. Thousands of optical disk players, computer monitors, and television displays are produced daily. This book summarizes and reviews the state of the art in information optics, which is optical science in information technology. The book consists of 12 chapters written by active researchers in the field. Chapter 1 provides the theoretical relation between optics and information theory. Chapter 2 reviews the basis of optical signal processing. Chapter 3 describes the principle of fiber-optic communication. Chapter 4 discusses optical switches used for communication and parallel processing. Chapter 5 discusses integral transforms, which can be performed optically in parallel. Chapter 6 presents the physics of optical interconnects used for computing and switching. Chapter 7 reviews pattern recognition, including neural networks, based on the optical Fourier transform and other optical techniques. Chapter 8 discusses optical storage, including holographic memory, 3-D memory, and near-field optics. Chapter 9 reviews digital


optical computing, which takes advantage of parallelism of optics. Chapter 10 describes the principles of optical fiber sensors. Chapter 11 introduces advanced displays including 3D holographic display. Chapter 12 presents an overview of fiber optical networks. The book is not intended to be encyclopedic and exhaustive. Rather, it merely reflects current selected interests in optical applications to information technology. In view of the great number of contributions in this area, we have not been able to include all of them in this book.

Chapter 1

Entropy Information and Optics Francis T.S. Yu THE PENNSYLVANIA STATE UNIVERSITY

Light is not only the mainstream of energy that supports life; it also provides us with an important source of information. One can easily imagine that without light, our present civilization would never have emerged. Furthermore, humans are equipped with exceptionally good (although not perfect) eyes, along with an intelligent brain. Humans were therefore able to advance themselves above the rest of the animals on this planet. It is undoubtedly true that if humans did not have eyes, they would not have evolved into their present form. In the presence of light, humans are able to search for the food they need and the art they enjoy, and to explore the unknown. Thus light, or rather optics, provides us with a very valuable source of information, the application of which can range from very abstract artistic interpretations to very efficient scientific usages. This chapter discusses the relationship between entropy information and optics. Our intention is not to provide a detailed discussion, however, but to cover the basic fundamentals that are easily applied to optics. We note that entropy information was not originated by optical scientists, but rather by a group of mathematically oriented electrical engineers whose original interest was centered on electrical communication. Nevertheless, from the very beginning of the discovery of entropy information, interest in its application has never totally been absent from the optical standpoint. As a result of the recent development of optical communication, signal processing, and computing, among other discoveries, the relationship between optics and entropy information has grown more profound than ever.


1.1. INFORMATION TRANSMISSION Although we seem to know the meaning of the word information, fundamentally that may not be the case. In reality, information may be defined in relation to usage. From the viewpoint of mathematical formalism, entropy information is basically a probabilistic concept; in other words, without probability theory there would be no entropy information. An information transmission system can be represented by a block diagram, as shown in Fig. 1.1. For example, a message represents an information source which is to be sent by means of a set of written characters that represent a code. If the set of written characters is recorded on a piece of paper, the information still cannot be transmitted until the paper is illuminated by a visible light (the transmitter), which obviously acts as an information carrier. When light reflected from the written characters arrives at your eyes (the receiver), a proper decoding (translating) process takes place; that is, character recognition (decoding) by the user (your mind). From this simple example we can see that a suitable encoding process may not be adequate unless a suitable decoding process also takes place. For instance, if I show you a foreign newspaper you might not be able to decode the language, even though the optical channel is assumed to be perfect (i.e., noiseless). This is because a suitable decoding process requires a priori knowledge of the encoding scheme; for example, knowledge of the foreign characters. Thus the decoding process is also known as a recognition process. Information transmission can in fact be represented by spatial and temporal information. The preceding example of transmitting written characters obviously represents spatial information transmission. On the other hand, if the written language is transmitted by coded light pulses, then this


Fig. 1.1. Block diagram of a communication system.


language should be properly (temporally) encoded; for instance, as transmitted through an optical fiber, which represents a temporal communication channel. Needless to say, at the receiving end a temporal decoding process is required before the temporally coded language is sent to the user. Viewing a television show, for example, represents a one-way spatial-temporal transmission. It is interesting to note that temporal and spatial information can be traded for information transmission. For instance, television signal transmission is a typical example of exploiting temporal information transmission for spatial information transmission. On the other hand, a movie sound track is an example of exploiting spatial information transmission for temporal information transmission. Information transmission has two basic disciplines: one developed by Wiener [1.1, 1.2], and the other by Shannon [1.3, 1.4]. Although both Wiener and Shannon share a common interest, there is a basic distinction between their ideas. The significance of Wiener's work is that, if a signal (information) is corrupted by some physical means (e.g., noise, nonlinear distortion), it may be possible to recover the signal from the corrupted one. It is for this purpose that Wiener advocates correlation detection, optimum prediction, and other ideas. However, Shannon carries his work a step further. He shows that the signal can be optimally transferred provided it is properly encoded; that is, the signal to be transferred can be processed before and after transmission. In other words, it is possible to combat the disturbances in a communication channel by properly encoding the signal. This is precisely the reason that Shannon advocates the information measure of the signal, communication channel capacity, and signal coding processes. In fact, the main objective of Shannon's theory is the efficient utilization of the information channel.
A fundamental theorem proposed by Shannon may be the most surprising result of his work. The theorem can be stated approximately as follows: Given a stationary finite-memory information channel having a channel capacity C, if the binary information transmission rate R (which can be either spatial or temporal) of the signal is smaller than C, there exists a channel encoding and decoding process for which the probability of error per digit of information transfer can be made arbitrarily small. Conversely, if the information transmission rate R is larger than C, there exist no encoding and decoding processes with this property; that is, the probability of error in information transfer cannot be made arbitrarily small. In other words, the presence of random disturbances in an information channel does not, by itself, limit transmission accuracy. Rather, it limits the transmission rate for which arbitrarily high transmission accuracy can be accomplished. To conclude this section, we note that the distinction between the two communication disciplines is that Wiener assumes, in effect, that the signal in question can be processed after it has been corrupted by noise. Shannon suggests that the signal can be processed both before and after its transmission


through the communication channel. However, both Wiener and Shannon share the same basic objective; namely, faithful reproduction of the original signal.
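As a minimal numerical sketch of the rate-versus-capacity statement, consider the binary symmetric channel with crossover probability p, whose capacity has the well-known closed form C = 1 − H(p); the values of p below are arbitrary illustrative choices:

```python
import math

def binary_entropy(p):
    """Shannon entropy H(p), in bits, of a binary source with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity in bits per channel use of a binary symmetric channel
    with crossover (error) probability p: C = 1 - H(p)."""
    return 1.0 - binary_entropy(p)

# A noisier channel has a lower capacity, but as long as the transmission
# rate R < C, Shannon's theorem guarantees coding schemes with arbitrarily
# small error probability; above C no such schemes exist.
for p in (0.0, 0.01, 0.1, 0.5):
    print(f"p = {p:4.2f}  ->  C = {bsc_capacity(p):.4f} bits/use")
```

Note that C vanishes at p = 0.5, where the output is statistically independent of the input, exactly as the theorem's converse requires.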

1.2. ENTROPY INFORMATION Let us now define the information measure, which is one of the vitally important aspects in the development of Shannon's information theory. For simplicity, we consider discrete input and output message ensembles A = {a_i} and B = {b_j}, respectively, as applied to a communication channel, as shown in Fig. 1.2. If a_i is an input event applied to the information channel and b_j is the corresponding transmitted output event, then the information measure about a_i provided by the received event b_j can be written as

I(a_i; b_j) = log2 [P(a_i/b_j)/P(a_i)],   (1.1)

where P(a_i/b_j) is the conditional probability of the input event a_i given the output event b_j, P(a_i) is the a priori probability of the input event a_i, i = 1, 2, ..., M and j = 1, 2, ..., N. By the symmetric property of the joint probability, we can show that

I(a_i; b_j) = I(b_j; a_i).   (1.2)

In other words, the amount of information transferred by the output event b_j about a_i is the same as that provided by the input event a_i about b_j. It is clear that, if the input and output events are statistically independent, that is, if P(a_i, b_j) = P(a_i)P(b_j), then I(a_i; b_j) = 0. Furthermore, if I(a_i; b_j) > 0, then P(a_i, b_j) > P(a_i)P(b_j); there is a higher joint probability of a_i and b_j. However, if I(a_i; b_j) < 0, then P(a_i, b_j) < P(a_i)P(b_j).

hν ≥ E0 = −kT ln[1 − (1/2)^(1/α)],   (1.175)

where E0 is the threshold energy level for the photodetector.
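The information measure of Eq. (1.1), and its ensemble average (the mutual information), can be illustrated numerically; the 2 × 2 joint distribution below is an arbitrary toy example chosen purely for illustration:

```python
import math

# Joint probabilities P(a_i, b_j) for a toy 2x2 channel
# (rows: input events a_i, columns: output events b_j).
P_ab = [[0.4, 0.1],
        [0.1, 0.4]]

P_a = [sum(row) for row in P_ab]                              # P(a_i)
P_b = [sum(P_ab[i][j] for i in range(2)) for j in range(2)]   # P(b_j)

def info_measure(i, j):
    """I(a_i; b_j) = log2[P(a_i/b_j)/P(a_i)] bits, as in Eq. (1.1)."""
    p_a_given_b = P_ab[i][j] / P_b[j]
    return math.log2(p_a_given_b / P_a[i])

# The measure is positive when the joint probability exceeds the product
# of the marginals, negative otherwise; averaging over the joint ensemble
# gives the mutual information I(A; B).
I_AB = sum(P_ab[i][j] * info_measure(i, j) for i in range(2) for j in range(2))
print(f"I(a_0; b_0) = {info_measure(0, 0):.3f} bits")
print(f"I(a_0; b_1) = {info_measure(0, 1):.3f} bits")
print(f"I(A; B)     = {I_AB:.3f} bits")
```

The symmetry I(a_i; b_j) = I(b_j; a_i) of Eq. (1.2) can be verified directly by transposing the joint matrix.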

1.7. Accuracy and Reliability Observation


For α ≫ 1 the preceding equation can be written as

hν ≥ kT(ln α + 0.367).   (1.176)

Since the absorption of one quantum of light is adequate for a positive response, the corresponding entropy increase is

ΔS = hν/T ≥ k(ln α + 0.367).   (1.177)

The amount of information obtained would be

I = log2 α bits.   (1.178)

Thus we see that

ΔS − Ik ln 2 ≥ 0.367k > 0.   (1.179)

Except for the equality, this ΔS is identical to that of the low-frequency observation of Eq. (1.162). However, the entropy increase is much higher, since ν is very high. Although finer observation can be obtained by using a higher frequency, there is a price to be paid; namely, a higher cost of entropy. We now come to reliable observation. One must distinguish the basic difference between accuracy and reliability in observations. A reliable observation depends on the chosen decision threshold level E0; that is, the higher the threshold level, the higher the reliability. However, accuracy in observation is inversely related to the spread of the detected signal; the narrower the spread, the higher the accuracy. These two issues are illustrated in Fig. 1.17. It is evident that a higher chosen threshold energy level E0 yields a higher reliability. However, higher reliability also produces a higher probability of misses. On the other hand, if E0 is set at a lower level, a less reliable observation is expected; in other words, a high probability of error (false alarms) may result, for example, due to thermal noise fluctuation.
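A quick numeric illustration of the entropy cost just discussed, using the bounds of Eqs. (1.177)–(1.179); the reliability values α (inverse error probabilities) chosen below are arbitrary:

```python
import math

k = 1.380649e-23  # Boltzmann constant, J/K

# For a reliability factor alpha, the high-frequency observation costs at
# least k(ln(alpha) + 0.367) in entropy (Eq. (1.177)), while the information
# gained is I = log2(alpha) bits (Eq. (1.178)). Their difference, Eq. (1.179),
# is always at least 0.367 k, so the observation can never break even.
for alpha in (10, 100, 1000):
    dS_min = k * (math.log(alpha) + 0.367)   # minimum entropy increase, J/K
    I_bits = math.log2(alpha)                # information obtained, bits
    net = dS_min - I_bits * k * math.log(2)  # Eq. (1.179)
    print(f"alpha={alpha:5d}  dS_min={dS_min:.3e} J/K  "
          f"I={I_bits:6.3f} bits  dS - Ik ln2 = {net:.3e} J/K")
```

The net cost is the same 0.367k for every α, because I k ln 2 = k ln α cancels the logarithmic term exactly.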

1.7.1. UNCERTAINTY OBSERVATION All physical systems are ultimately restricted by some limitations. When quantum conditions are in use, all limitations are essentially imposed by the basic Heisenberg uncertainty principle:

ΔE Δt ≥ h,   (1.180)

Fig. 1.17. Examples of accuracy and reliability of observation. (The four panels depict high accuracy with high reliability, high accuracy with low reliability, low accuracy with high reliability, and low accuracy with low reliability.)

where ΔE denotes the energy perturbation and Δt is the time interval of the observation. By reference to the preceding observation made by radiation, one can compare the required energy ΔE with the mean-square thermal fluctuation of the photodetector, γkT, where γ is the number of degrees of freedom, which is essentially the number of low-frequency vibrations (hν ≪ kT). Thus, if ΔE ≤ γkT, we have

Δt ≫ h/(γkT).   (1.181)

From this inequality we see that a larger time resolution Δt is required for low-frequency observation. Since ΔE is small, the perturbation within Δt is very small and can by comparison be ignored.
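As a rough numeric sketch of the bound in Eq. (1.181); the values of γ and T below are arbitrary illustrative choices, not taken from the text:

```python
h = 6.62607015e-34   # Planck constant, J*s
k = 1.380649e-23     # Boltzmann constant, J/K

def time_resolution_bound(gamma, T):
    """Lower scale h/(gamma k T) for the time resolution Delta_t of a
    low-frequency observation, per Eq. (1.181)."""
    return h / (gamma * k * T)

T = 300.0  # room temperature, K (illustrative)
for gamma in (1, 10, 100):
    print(f"gamma = {gamma:3d}:  Delta_t >> {time_resolution_bound(gamma, T):.3e} s")
```

At room temperature the scale h/kT is of order 10^-13 s, and more degrees of freedom (larger γ) relax the bound proportionally.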


However, if the radiation frequency ν becomes higher, such that ΔE = hν > γkT, then we have

Δt ≥ h/ΔE.   (1.182)

We see that, as the radiant energy required for the observation increases, a more accurate time resolution can be made. But the perturbation of the observation is also higher. Thus, the time resolution Δt obtained by the observation may not even be correct, since ΔE is large. In the classical theory of light, observation has been assumed nonperturbable. This assumption is generally true for many-particle problems, for which a large number of quanta is used in observation. In other words, the accuracy of observations is not expected to be too high in the classical theory of light, since its imposed condition is limited far away from the uncertainty principle; that is, ΔE Δt ≫ h, or equivalently, Δp Δx ≫ h.

However, as quantum conditions occur, a nonperturbing system simply does not exist. When a higher-quantum hν is used, a certain perturbation within the system is bound to occur; hence, high-accuracy observation is limited by the uncertainty principle. Let us look at the problem of observing an extremely small distance Δx between two particles. One must use a light source having a wavelength λ that satisfies the condition λ ≤ Δx.

ΔS ≥ I(A; B)k ln 2.   (1.192)

1.8.1. CAPACITY OF A PHOTON CHANNEL Let us denote the mean quantum number of the photon signal by m(ν), and the mean quantum number of the noise by n(ν). We have assumed an additive channel; thus, the mean quantum number of the signal plus noise is m(ν) + n(ν). Since the photon density (i.e., the mean number of photons per unit time per unit frequency) is the mean quantum number, the signal energy density per unit time can be written as E_s(ν) = m(ν)hν, where h is Planck's constant. Similarly, the noise energy density per unit time is E_N(ν) = n(ν)hν. Due to the fact that the mean quantum noise (blackbody radiation at temperature T) follows Planck's distribution (also known as the Bose-Einstein distribution),

n(ν) = 1/[exp(hν/kT) − 1],   (1.193)


the noise energy per unit time (noise power) can be calculated by

N = ∫_ε^∞ E_N(ν) dν = ∫_ε^∞ hν dν/[exp(hν/kT) − 1] = (πkT)²/6h,   (1.194)

where ε is an arbitrarily small positive constant. Thus, the minimum required entropy for the signal radiation is

S_min = ∫_0^T dE(T′)/T′,   (1.195)

where E(T′) is the signal energy density per unit time as a function of temperature T′, and T is the temperature of the blackbody radiation. Thus, in the presence of a signal, the output radiation energy per unit time (the power) can be written as

P = S + N,   (1.196)

where S and N are the signal and the noise power, respectively. Since the signal is assumed to be deterministic (i.e., a microstate signal), the signal entropy can be considered zero. Remember that the validity of this assumption is mainly based on the statistical independence between the signal and the noise, for which the photon statistics follow the Bose-Einstein distribution. However, the Bose-Einstein distribution cannot be used for the case of fermions because, owing to the Pauli exclusion principle, the microstates of the noise are restricted by the occupational states of the signal, or vice versa. In other words, in the case of fermions, the signal and the noise can never be assumed to be statistically independent. For the case of Bose-Einstein statistics, we see that the amount of entropy transferred by radiation remains unchanged:

Since the mutual information is I(A; B) = H(B) − H(B/A), we see that I(A; B) reaches its maximum when H(B) is maximum. Thus, for maximum information transfer, the photon signal should be chosen randomly. But the maximum value of entropy H(B) occurs when the ensemble of the microstates of the total radiation (the ensemble B) corresponds to Gibbs's distribution, which reaches thermal equilibrium. Thus, the corresponding mean occupational quantum number of the total radiation also follows the Bose-Einstein distribution at a given temperature T_e ≥ T:

⟨n(ν)⟩ = 1/[exp(hν/kT_e) − 1],   (1.197)

where T_e is defined as the effective temperature. Since the output entropy can be determined by

H(B) = π²kT_e/(3h ln 2),   (1.198)

the quantum mechanical channel capacity can be evaluated as

C = π²k(T_e − T)/(3h ln 2).   (1.199)

In view of the total output power

S + N = (πkT_e)²/6h,   (1.200)

the effective temperature T_e can be written as

T_e = T[1 + 6hS/(πkT)²]^(1/2).   (1.201)

Thus, the capacity of a photon channel can be shown to be

C = [π²kT/(3h ln 2)]{[1 + 6hS/(πkT)²]^(1/2) − 1}.   (1.202)

We would further note that a high signal-to-noise ratio corresponds to high-frequency transmission (hν ≫ kT), for which we have 6hS/(πkT)² ≫ 1.


Fig. 1.18. Photon channel capacity as a function of signal power, for various values of thermal noise temperature T. Dashed lines represent the classical asymptotes of Eq. (1.204).

Then the photon channel capacity is limited by the quantum statistic; that is,

C_quant ≈ (π/ln 2)(2S/3h)^(1/2).   (1.203)

However, if the signal-to-noise ratio is low (hν ≪ kT), the photon channel capacity reduces to the classical limit:

C_class ≈ S/(kT ln 2).   (1.204)

Figure 1.18 shows a plot of the photon channel capacity as a function of signal energy. We see that in the low signal-to-noise ratio (hν ≪ kT) regime, the channel capacity approaches the classical limit. However, for a high signal-to-noise ratio (hν ≫ kT), the channel capacity approaches the quantum limit of Eq. (1.203), which offers a much higher transmission rate.
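The two regimes can be checked numerically. The sketch below assumes the capacity takes the form C = [π²kT/(3h ln 2)]{[1 + 6hS/(πkT)²]^(1/2) − 1} together with its classical and quantum asymptotes; the signal powers chosen are arbitrary, picked to fall well below and well above the blackbody noise power:

```python
import math

h = 6.62607015e-34   # Planck constant, J*s
k = 1.380649e-23     # Boltzmann constant, J/K

def photon_channel_capacity(S, T):
    """Photon channel capacity, bits per second, for signal power S and
    noise temperature T, assuming the form of Eq. (1.202)."""
    x = 6 * h * S / (math.pi * k * T) ** 2
    return (math.pi ** 2 * k * T / (3 * h * math.log(2))) * (math.sqrt(1 + x) - 1)

def classical_limit(S, T):
    """Low-frequency (hv << kT) asymptote, Eq. (1.204): C ~ S/(kT ln 2)."""
    return S / (k * T * math.log(2))

def quantum_limit(S):
    """High-frequency (hv >> kT) asymptote, Eq. (1.203):
    C ~ (pi/ln 2) sqrt(2S/3h)."""
    return (math.pi / math.log(2)) * math.sqrt(2 * S / (3 * h))

T = 300.0                                # noise temperature, K (illustrative)
N = (math.pi * k * T) ** 2 / (6 * h)     # blackbody noise power, Eq. (1.194)
print(f"noise power N at {T:.0f} K: {N:.2e} W")
for S in (1e-12, 1e-3):                  # S << N and S >> N, respectively
    print(f"S = {S:.0e} W: C = {photon_channel_capacity(S, T):.3e}, "
          f"classical = {classical_limit(S, T):.3e}, "
          f"quantum = {quantum_limit(S):.3e} bits/s")
```

The weak signal reproduces the classical value almost exactly, while the strong signal approaches the quantum limit, mirroring the behavior of Fig. 1.18.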

REFERENCES

1.1 N. Wiener, Cybernetics, MIT Press, Cambridge, Mass., 1948.
1.2 N. Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Series, MIT Press, Cambridge, Mass., 1949.
1.3 C. E. Shannon, "A Mathematical Theory of Communication," Bell Syst. Tech. J., vol. 27, 379-423, 623-656, 1948.
1.4 C. E. Shannon and W. Weaver, The Mathematical Theory of Communication, University of Illinois Press, Urbana, 1949.
1.5 W. Davenport and W. Root, Random Signals and Noise, McGraw-Hill, New York, 1958.
1.6 M. Loeve, Probability Theory, 3rd ed., Van Nostrand, Princeton, N.J., 1963.
1.7 D. Gabor, "Theory of Communication," J. Inst. Electr. Eng., vol. 93, 429, 1946.
1.8 D. Gabor, "Communication Theory and Physics," Phil. Mag., vol. 41, no. 7, 1161, 1950.
1.9 J. Neyman and E. S. Pearson, "On the Use and Interpretation of Certain Test Criteria for Purposes of Statistical Inference," Biometrika, vol. 20A, 175, 263, 1928.
1.10 L. Brillouin, Science and Information Theory, 2nd ed., Academic Press, New York, 1962.
1.11 F. W. Sears, Thermodynamics, the Kinetic Theory of Gases, and Statistical Mechanics, Addison-Wesley, Reading, Mass., 1953.
1.12 F. T. S. Yu and X. Yang, Introduction to Optical Engineering, Cambridge University Press, Cambridge, UK, 1997.

EXERCISES

1.1 A picture is indeed worth more than a thousand words. For simplicity, we assume that an old-fashioned monochrome Internet screen has a 500 × 600 pixel array and each pixel element is capable of providing eight distinguishable brightness levels. If we assume the selection of each information pixel element is equiprobable, calculate the amount of information that can be provided by the Internet screen. On the other hand, a broadcaster has a pool of 10,000 words; if he randomly picks 1000 words from this pool, calculate the amount of information provided by these words.
1.2 Given an n-ary symmetric channel, its transition probability matrix is written as

        | 1-p        p/(n-1)   ...   p/(n-1) |
        | p/(n-1)    1-p       ...   p/(n-1) |
[P] =   |   ...        ...     ...     ...   |
        | p/(n-1)    p/(n-1)   ...   1-p     |

(a) Evaluate the channel capacity.
(b) Repeat part (a) for a binary channel.
1.3 Show that an information source will provide the maximum amount of information if and only if the probability distribution of the ensemble information is equiprobable.
1.4 Let an input ensemble to a discrete memoryless channel be A = {a_1, a_2, a_3} with probabilities of occurrence p(a_1), p(a_2), p(a_3) as given, and let B = {b_1, b_2, b_3} be the set of output ensemble. For the given transition matrix of the channel,
a. Calculate the output entropy H(B).
b. Compute the conditional entropy H(A/B).
1.5 Let us consider a memoryless channel with input ensemble A = {a_1, a_2, ..., a_r} and output ensemble B = {b_1, b_2, ..., b_s}, and channel matrix [p(b_j/a_i)]. A random decision rule may be formalized by assuming that if the channel output is b_j, for every j = 1, 2, ..., s, the decoder will select a_i with probability q(a_i/b_j), for every i = 1, 2, ..., r. Show that for a given input distribution there is no random decision rule that will provide a lower probability of error than the ideal observer.
1.6 The product of two discrete memoryless channels C_1 and C_2 is a channel the inputs of which are ordered pairs (a_i, a_j') and the outputs of which are ordered pairs (b_k, b_l'), where the first coordinates belong to the alphabet of C_1 and the second coordinates to the alphabet of C_2. If the transition probability of the product channel is P(b_k, b_l'/a_i, a_j') = P(b_k/a_i)P(b_l'/a_j'), determine the capacity of the product channel.
1.7 Develop a maximum-likelihood decision and determine the probability of errors for a discrete memoryless channel with a given transition matrix P and given input ensemble probabilities.


1.8 A memoryless continuous channel is perturbed by an additive Gaussian noise with zero mean and variance equal to N. The output entropy and the noise entropy can be written as H(B) = (1/2) log2[2πe(S + N)] and H(B/A) = (1/2) log2(2πeN), where b = a + c (i.e., signal plus noise), and σ_b² = σ_a² + σ_c² = S + N. Calculate the capacity of the channel.
1.9 The input to an ideal low-pass filter is a narrow rectangular pulse signal. The filter transfer function is given by H(ν) = 1 for |ν| ≤ Δν/2, and 0 otherwise.
a. Determine the output response g(t).
b. Sketch g(t) when Δt Δν < 1, Δt Δν = 1, and Δt Δν > 1.
c. If a linear phase term is added to H(ν), show how this linear-phase low-pass filter affects the answers in part (b).
1.10 Consider an m × n array spatial channel in which we spatially encode a set of coded images. What would be the spatial channel capacity?
1.11 A complex Gaussian pulse is given by u(t) = K exp(−λt²), where K is a constant and λ = a + ib. Determine and sketch the ambiguity function, and discuss the result as it applies to radar detection.
1.12 Given a chirp signal u(t), calculate and sketch the corresponding Wigner distribution.


1.13 A sound spectrograph can display speech signals in the form of a frequency-versus-time plot, which bears the name of logon (spectrogram). The analyzer is equipped with a narrow-band filter Δν = 45 Hz and a wide-band filter Δν = 300 Hz.
a. Show that it is impossible to resolve the frequency and the time information simultaneously by using only one of these filters.
b. For a high-pitched voice that varies from 300 to 400 Hz, show that it is not possible to resolve the time resolution, although it is possible to resolve the fine-frequency content.
c. On the other hand, for a low-pitched voice that varies from 20 to 45 Hz, show that it is possible to resolve the time resolution but not the fine-frequency resolution.
1.14 We equip Szilard's demon with a beam of light to illuminate the chamber, in which only one molecule wanders, as shown in Fig. 1.19. Calculate
a. the net entropy change per cycle of operation,
b. the amount of information required by the demon,
c. the amount of entropy change of the demon, and
d. show that with the intervention of the demon, the system still operates within the limit of the second law of thermodynamics.
1.15 A rectangular photosensitive paper may be divided into an M × N array of information cells. Each cell is capable of resolving K distinct gray levels. By a certain recording (i.e., encoding) process, we reduce the M × N array to a P × Q array of cells with P < M and Q < N.
a. What is the amount of entropy decrease in reducing the number of cells?
b. Let an image be recorded on this photosensitive paper. The amount of information provided by the image is I_i bits, which is assumed to be smaller than the information capacity of the paper. If I_e (i.e., I_e < I_i)

Fig. 1.19. Szilard's machine by the intervention of the demon. D, photodetector; L, light beams; C, transparent cylinder; P, piston; m, molecule.


bits of the recorded information are considered to be in error, determine the entropy required for restoring the image.
1.16 For low-frequency observation, if the quantum stage g = m is selected as the decision threshold, we will have an error probability of 50% per observation. If we choose g = 5m, calculate the probability of error per observation.
1.17 In high-frequency observation, show that the cost of entropy per observation is greater. What is the minimum amount of entropy required per observation? Compare this result with the low-frequency case and show that the high-frequency case is more reliable.
1.18 For the problem of many simultaneous observations, if we assume that an observation gives a positive (correct) result (i.e., any one of the α photodetectors gives rise to an energy level above E0 = ghν) and a 25% chance of observation error is imposed, then calculate
a. the threshold energy level E0 as a function of α, and
b. the corresponding amount of entropy increase in the photodetectors.
1.19 In low-frequency simultaneous observations, if we obtain γ simultaneous correct observations out of α photodetectors (γ < α), determine the minimum cost of entropy required. Show that for the high-frequency case, the minimum entropy required is even greater.
1.20 Given an imaging device, for example, a telescope: if the field of view at a distant corner is denoted by A and the resolution of this imaging system is limited to a small area ΔA over A, calculate the accuracy of this imaging system and the amount of information provided by the observation.
1.21 With reference to the preceding problem,
a. Show that a high-accuracy observation requires a light source with a shorter wavelength, and the reliability of this observation is inversely proportional to the wavelength employed.
b. Determine the observation efficiency for higher-frequency observation.
1.22 The goal of a certain experiment is to measure the distance between two reflecting walls.
Assume that a plane monochromatic wave is reflected back and forth between the two walls. The number of interference fringes is determined to be 10, and the wavelength of the light employed is 600 nm. Calculate
a. the separation between the walls,
b. the amount of entropy increase in the photodetector (T = 300 K),
c. the amount of information required,
d. the corresponding energy threshold level of the photodetector if a reliability of R = 4 is required, where R = 1/(probability of error), and
e. the efficiency of this observation.
1.23 Repeat the preceding problem for the higher-frequency observation.


1.24 With reference to the problem of observation under a microscope, if a square aperture (sides = r_0) instead of a circular one is used, calculate the minimum amount of entropy increase in order to overcome the thermal background fluctuations. Show that the cost of entropy under a microscope observation is even greater than k ln 2.
1.25 Show that the spatial information capacity of an optical channel under coherent illumination is generally higher than under incoherent illumination.
1.26 Let us consider a band-limited periodic signal of bandwidth Δν_m, where ν_m is the maximum frequency content of the signal. If the signal is passed through an ideal low-pass filter of bandwidth Δν_f, where the cutoff frequency ν_f is lower than ν_m, estimate the minimum cost of entropy required to restore the signal.
1.27 Refer to the photon channel previously evaluated. What would be the minimum amount of energy required to transmit a bit of information through the channel?
1.28 High and low signal-to-noise ratios are directly related to the frequency of transmission through the channel. By referring to the photon channels that we have obtained, under what condition would the classical limit and the quantum statistic meet? For a low transmission rate, what would be the minimum energy required to transmit a bit of information?
1.29 It is well known that under certain conditions a band-limited photon channel capacity can be written as C ≈ Δν log2[1 + S/(hν Δν)]. The capacity of the channel increases as the mean occupation number m = S/(hν Δν) increases. Remember, however, that this is valid only under the condition Δν/ν ≪ 1, for a narrow-band channel. It is incorrect to assume that the capacity becomes infinitely large as ν → 0. Strictly speaking, the capacity will never exceed the quantum limit as given in Eq. (1.202). Show that the narrow-band photon channel is in fact Shannon's continuous channel under high signal-to-noise ratio transmission.


Chapter 2

Signal Processing with Optics

Francis T. S. Yu
PENNSYLVANIA STATE UNIVERSITY

Optical processing can perform a myriad of processing operations, primarily because of its complex-amplitude processing capability. Optical signal processors can perform one- or two-dimensional spatial linear operations, much as conventional linear systems do. However, none of these inherent processing merits of optics can be realized without the support of the good coherence property of light. For this reason, we shall begin our discussion with the fundamental coherence theory of light.

2.1. COHERENCE THEORY OF LIGHT

When radiation from two sources maintains a fixed phase relation between them, they are said to be mutually coherent. Therefore, an extended source is coherent if all points of the source have fixed phase differences among them. We first must understand the basic theory of coherent light. In the classical theory of electromagnetic radiation, it is usually assumed that the electric and magnetic fields are always measurable quantities at any position. In this situation there is no need for coherence theory to interpret the properties of light. There are scenarios, however, in which this assumption cannot be made; for these it is essential to apply coherence theory. For example, if we want to determine the diffraction pattern caused by radiation



Fig. 2.1. A wavefront propagates in space.

from several sources, we cannot obtain an exact result unless the degrees of coherence among the separate sources are taken into account. In such a situation, it is desirable to obtain a statistical ensemble average for the most likely result, for example, from any combination of sources. It is therefore more useful to provide a statistical description than to follow the dynamic behavior of a wave field in detail. Let us assume an electromagnetic wave field propagating in space, as depicted in Fig. 2.1, where u₁(t) and u₂(t) denote the instantaneous wave disturbances at positions 1 and 2, respectively. The mutual coherence function (i.e., cross-correlation function) between these two disturbances can be written as

Γ₁₂(τ) = ⟨u₁(t + τ) u₂*(t)⟩ = lim_{T→∞} (1/2T) ∫_{−T}^{T} u₁(t + τ) u₂*(t) dt,   (2.1)

where the superasterisk denotes the complex conjugate, ⟨ ⟩ represents the time ensemble average, and τ is a time delay. The complex degree of coherence between u₁(t) and u₂(t) can be defined as

γ₁₂(τ) = Γ₁₂(τ) / [Γ₁₁(0) Γ₂₂(0)]^{1/2},   (2.2)

where Γ₁₁(τ) and Γ₂₂(τ) are the self-coherence functions of u₁(t) and u₂(t), respectively.
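The time-average definitions in Eqs. (2.1) and (2.2) can be illustrated with a short numerical sketch (our own construction, not from the text): for two sampled complex disturbances, a discrete-time estimate of |γ₁₂(0)| approaches 1 when the two phases are locked and approaches 0 when they are statistically independent.

```python
import numpy as np

rng = np.random.default_rng(0)

def degree_of_coherence(u1, u2):
    """Discrete-time estimate of |γ12(0)| = |<u1·u2*>| / sqrt(<|u1|²><|u2|²>)."""
    gamma12 = np.mean(u1 * np.conj(u2))          # mutual coherence at τ = 0
    gamma11 = np.mean(np.abs(u1) ** 2)           # self-coherence of u1
    gamma22 = np.mean(np.abs(u2) ** 2)           # self-coherence of u2
    return abs(gamma12) / np.sqrt(gamma11 * gamma22)

n = 200_000
common = np.exp(1j * rng.uniform(0, 2 * np.pi, n))       # shared random phasor
independent = np.exp(1j * rng.uniform(0, 2 * np.pi, n))  # unrelated phasor

u1 = common
u2_coherent = common          # fixed phase relation -> |γ12| = 1
u2_incoherent = independent   # no phase relation    -> |γ12| ≈ 0

print(degree_of_coherence(u1, u2_coherent))    # ≈ 1.0
print(degree_of_coherence(u1, u2_incoherent))  # ≈ 0
```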


Needless to say, the degree of coherence is bounded by 0 ≤ |γ₁₂(τ)| ≤ 1, in which |γ₁₂| = 1 represents strict coherence and |γ₁₂| = 0 represents strict incoherence. We note that, in the high-frequency regime, it is difficult, if not impossible, to evaluate the degree of coherence directly. However, there exists a practical fringe-visibility relationship by which the degree of coherence |γ₁₂| can be measured directly, by referring to Young's experiment in Fig. 2.2, in which Σ represents a monochromatic extended source. A diffraction screen is located at a distance r₁₀ from the source, with two small pinholes in this screen, Q₁ and Q₂, separated by a distance d. On the observing screen located r₂₀ away from the diffraction screen, we observe an interference pattern in which the maximum and minimum intensities I_max and I_min of the fringes are measured. The Michelson visibility can then be defined as

V = (I_max − I_min) / (I_max + I_min).   (2.3)
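A minimal numerical sketch (our own, using an assumed degree of coherence and anticipating the fringe model derived next) shows how V is obtained from a measured fringe pattern:

```python
import numpy as np

def visibility(intensity):
    """Michelson visibility V = (Imax - Imin) / (Imax + Imin), Eq. (2.3)."""
    i_max, i_min = intensity.max(), intensity.min()
    return (i_max - i_min) / (i_max + i_min)

# Simulated equal-intensity two-beam fringes, I1 = I2 = 1, with an
# assumed degree of coherence |γ12| = 0.6 (illustrative value):
gamma = 0.6
phase = np.linspace(0, 4 * np.pi, 1000)
fringes = 1.0 + 1.0 + 2.0 * gamma * np.cos(phase)

print(visibility(fringes))  # ≈ 0.6: visibility equals |γ12| for equal beams
```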

We shall now show that under the equal-intensity condition (i.e., I₁ = I₂), the visibility measure is equal to the degree of coherence. The electromagnetic wave disturbances u₁(t) and u₂(t) at Q₁ and Q₂ are governed by the wave equation,

∇²u = (1/c²) ∂²u/∂t²,

where c is the velocity of light. The disturbance at point P on the observing screen can be written as

u_P(t) = c₁ u₁(t − r₁/c) + c₂ u₂(t − r₂/c),

Fig. 2.2. Young's experiment.


where c₁ and c₂ are the appropriate complex constants. The corresponding irradiance at P is

I_P = I₁ + I₂ + 2 Re[c₁ c₂* ⟨u₁(t − t₁) u₂*(t − t₂)⟩],

where I₁ and I₂ are proportional to the squared magnitudes of u₁(t) and u₂(t). By letting

t₁ = r₁/c,   t₂ = r₂/c,   τ = t₂ − t₁,

the preceding equation can be written as

I_P = I₁ + I₂ + 2 Re[c₁ c₂* ⟨u₁(t + τ) u₂*(t)⟩].

In view of the mutual coherence and self-coherence functions, we can show that

I_P = I₁ + I₂ + 2 (I₁ I₂)^{1/2} Re[γ₁₂(τ)].

Thus, we see that for I = I₁ = I₂, the preceding equation reduces to

I_P = 2I {1 + Re[γ₁₂(τ)]} = 2I {1 + |γ₁₂(τ)| cos φ₁₂(τ)},

where φ₁₂(τ) is the phase of γ₁₂(τ). As P moves across the fringes, cos φ₁₂(τ) swings between +1 and −1, so that I_max = 2I(1 + |γ₁₂|) and I_min = 2I(1 − |γ₁₂|), in which we see that

V = |γ₁₂(τ)|;   (2.4)

the visibility measure is equal to the degree of coherence. Let us now proceed with Young's experiment further, by letting d increase. We see that the visibility drops rapidly to zero, then reappears, and so on, as shown in Fig. 2.3. There is also a variation in fringe frequency as d varies; in other words, as d increases, the spatial frequency of the fringes also increases. Since the visibility is equal to the degree of coherence, it is in fact the degree of spatial coherence between points Q₁ and Q₂. If we let the source size Σ be reduced to very small, as illustrated in Fig. 2.3, we see that the degree of (spatial) coherence becomes unity (100%) over the diffraction screen. From this point of view, we see that the degree of spatial coherence is in fact governed by the source size. As we investigate further, when the observation point P moves away from the center of the observing screen, the visibility decreases as the path difference Δr = r₂ − r₁ increases, until it eventually becomes zero. The effect also depends


Fig. 2.3. Visibility as a function of the separation d.

on how nearly monochromatic the source is. The condition on the path difference for the fringes to remain visible can be written as

Δr ≪ c/Δν,   (2.5)

where c is the velocity of light and Δν is the spectral bandwidth of the source. The preceding equation is also used to define the coherence length (or temporal coherence) of the source, which is the distance over which the light beam is longitudinally coherent. In view of the preceding discussion, one sees that spatial coherence is primarily governed by the source size and temporal coherence is governed by the spectral bandwidth of the source. In other words, a monochromatic point source is a strictly coherent source, while a monochromatic source is a temporally coherent source and a point source is a spatially coherent source. Nevertheless, it is not necessary to have completely coherent light to produce an interference pattern. Under certain conditions, an interference pattern may be produced from an incoherent source. This effect is called partial coherence. It is worthwhile to point out that the degree of temporal coherence of a source can be measured by using the Michelson interferometer, as shown in Fig. 2.4. In short, by varying one of the mirrors, an interference fringe pattern can be viewed at the observation plane. The path difference, after the light beam is


Fig. 2.4. The Michelson interferometer: BS, beam splitter; M, mirrors; P, observation screen.

split, is given by

Δr = c Δt ≤ c/Δν,   (2.6)

where the coherence time of the source is

Δt ≈ 1/Δν.   (2.7)

In fact, since ν = c/λ implies Δν = c Δλ/λ², the coherence length of the source can also be written as

Δr ≈ λ²/Δλ,   (2.8)

where λ is the center wavelength and Δλ is the spectral bandwidth of the light source.
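A quick numerical sketch of Eq. (2.8), with illustrative source values that are our own assumptions rather than taken from the text, shows how dramatically the coherence length differs between a broadband and a narrow-line source:

```python
def coherence_length_m(center_wavelength_m, spectral_width_m):
    """Coherence length Δr ≈ λ²/Δλ, Eq. (2.8), in meters."""
    return center_wavelength_m ** 2 / spectral_width_m

# Illustrative (assumed) values:
white_light = coherence_length_m(550e-9, 300e-9)  # broadband visible light
hene_laser = coherence_length_m(632.8e-9, 1e-12)  # narrow-line He-Ne laser

print(f"white light: {white_light * 1e6:.2f} um")  # ≈ 1 um
print(f"He-Ne laser: {hene_laser * 100:.1f} cm")   # ≈ 40 cm
```

The micron-scale coherence length of white light is why white-light fringes are observed only very near zero path difference, whereas a laser stays longitudinally coherent over tens of centimeters.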

2.2. PROCESSING UNDER COHERENT AND INCOHERENT ILLUMINATION

Consider the hypothetical optical processing system shown in Fig. 2.5. Assume that the light emitted by the source Σ is monochromatic, and let u(x, y) be the complex light distribution at the input signal plane due to an incremental source dΣ. If the complex amplitude transmittance of the input plane is f(x, y),


" Light source

lx"

i"

i s

-*ol~ -4-—^

input plane

Output plane

Fig. 2.5. A hypothetical optical processing system.

the complex light field immediately behind the signal plane would be u(x, y) f(x, y). We assume the optical system (i.e., the black box) is linear and spatially invariant, with a spatial impulse response h(x, y); the output complex light field, due to dΣ, can be calculated by

g(α, β) = ∫∫ u(x, y) f(x, y) h(α − x, β − y) dx dy,

which can be written as

g(α, β) = [u(α, β) f(α, β)] ∗ h(α, β),

where the asterisk represents the convolution operation and the superasterisk denotes complex conjugation. The overall output intensity distribution is therefore

I(α, β) = ⟨g(α, β) g*(α, β)⟩,

which can be written in the following convolution integral form:

I(α, β) = ∫∫∫∫ Γ(x, y; x′, y′) h(α − x, β − y) h*(α − x′, β − y′) f(x, y) f*(x′, y′) dx dy dx′ dy′,   (2.9)

where

Γ(x, y; x′, y′) = ⟨u(x, y) u*(x′, y′)⟩


is the spatial coherence function, also known as the mutual intensity function, at the input plane (x, y). For completely incoherent illumination, the spatial coherence function reduces to Γ(x, y; x′, y′) = K δ(x − x′, y − y′), and the output intensity distribution of Eq. (2.9) becomes

I(α, β) = K ∫∫ |h(α − x, β − y)|² |f(x, y)|² dx dy,   (2.14)

in which we see that the output intensity distribution is the convolution of the input signal intensity with the intensity impulse response. In other words, for completely incoherent illumination, the optical signal processing system is linear in intensity; that is,

I(α, β) = K |f(α, β)|² ∗ |h(α, β)|²,   (2.15)

where the asterisk denotes the convolution operation. On the other hand, for completely coherent illumination, i.e., Γ(x, y; x′, y′) = K², the output intensity distribution can be shown to be

I(α, β) = g(α, β) g*(α, β) = ∫∫ h(α − x, β − y) f(x, y) dx dy ∫∫ h*(α − x′, β − y′) f*(x′, y′) dx′ dy′,   (2.16)

where

g(α, β) = ∫∫ h(α − x, β − y) f(x, y) dx dy,   (2.17)

for which we can see that the optical signal processing system is linear in complex amplitude. In other words, a coherent optical processor is capable of processing the information in complex amplitudes.
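The distinction between intensity-linear processing, Eqs. (2.14)-(2.15), and amplitude-linear processing, Eqs. (2.16)-(2.17), can be made concrete with a one-dimensional numerical sketch (our own construction, reduced to 1-D for brevity): two out-of-phase amplitude samples interfere destructively under coherent processing, while under incoherent processing their intensities simply add.

```python
import numpy as np

# Two-point input amplitude: equal magnitude, opposite phase.
f = np.array([1.0, -1.0])
# Amplitude impulse response that spreads each point over two samples.
h = np.array([1.0, 1.0])

# Coherent processing: linear in complex amplitude (Eqs. 2.16-2.17).
g = np.convolve(f, h)           # g = f * h
I_coherent = np.abs(g) ** 2     # destructive interference at the center

# Incoherent processing: linear in intensity (Eqs. 2.14-2.15, K = 1).
I_incoherent = np.convolve(np.abs(f) ** 2, np.abs(h) ** 2)

print(I_coherent)    # [1. 0. 1.]  -> central sample goes dark
print(I_incoherent)  # [1. 2. 1.]  -> intensities add, no cancellation
```

The coherent processor preserves the phase information of f, which is exactly why it can process information in complex amplitude, and why its output can exhibit cancellations that an incoherent processor can never produce.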


2.3. FRESNEL-KIRCHHOFF AND FOURIER TRANSFORMATION

2.3.1. FREE SPACE IMPULSE RESPONSE

To understand the basic concept of optical Fourier transformation, we begin our discussion with the development of the Fresnel-Kirchhoff integral. Let us start from the Huygens principle, in which the complex amplitude observed at the point p′ of a coordinate system
