Ten Lectures on Wavelets (CBMS-NSF Regional Conference Series in Applied Mathematics)


Author: Ingrid Daubechies


INGRID DAUBECHIES

Rutgers University and AT&T Bell Laboratories

Ten Lectures on Wavelets

SOCIETY FOR INDUSTRIAL AND APPLIED MATHEMATICS PHILADELPHIA, PENNSYLVANIA

1992

Copyright 1992 by the Society for Industrial and Applied Mathematics All rights reserved. No part of this book may be reproduced, stored, or transmitted in any manner without the written permission of the Publisher. For information, write the Society for Industrial and Applied Mathematics, 3600 University City Science Center, Philadelphia, Pennsylvania 19104-2688.

Second Printing 1992 Third Printing 1994 Fourth Printing 1995 Fifth Printing 1997 Sixth Printing 1999

Library of Congress Cataloging-in-Publication Data

Daubechies, Ingrid. Ten lectures on wavelets / Ingrid Daubechies. p. cm. (CBMS-NSF regional conference series in applied mathematics; 61) Includes bibliographical references and index. ISBN 0-89871-274-2 1. Wavelets (Mathematics)-Congresses. I. Title. II. Series. QA403.3.D38 1992 515'.2433-dc20 92-13201

SIAM is a registered trademark.

To my mother, who gave me the will to be independent. To my father, who stimulated my interest in science.


Contents

INTRODUCTION

PRELIMINARIES AND NOTATION

CHAPTER 1: The What, Why, and How of Wavelets

CHAPTER 2: The Continuous Wavelet Transform

CHAPTER 3: Discrete Wavelet Transforms: Frames

CHAPTER 4: Time-Frequency Density and Orthonormal Bases

CHAPTER 5: Orthonormal Bases of Wavelets and Multiresolution Analysis

CHAPTER 6: Orthonormal Bases of Compactly Supported Wavelets

CHAPTER 7: More About the Regularity of Compactly Supported Wavelets

CHAPTER 8: Symmetry for Compactly Supported Wavelet Bases

CHAPTER 9: Characterization of Functional Spaces by Means of Wavelets

CHAPTER 10: Generalizations and Tricks for Orthonormal Wavelet Bases

REFERENCES

SUBJECT INDEX

AUTHOR INDEX

Introduction

Wavelets are a relatively recent development in applied mathematics. Their name itself was coined approximately a decade ago (Morlet, Arens, Fourgeau, and Giard (1982), Morlet (1983), Grossmann and Morlet (1984)); in the last ten years interest in them has grown at an explosive rate. There are several reasons for their present success. On the one hand, the concept of wavelets can be viewed as a synthesis of ideas which originated during the last twenty or thirty years in engineering (subband coding), physics (coherent states, renormalization group), and pure mathematics (study of Calderón-Zygmund operators). As a consequence of these interdisciplinary origins, wavelets appeal to scientists and engineers of many different backgrounds. On the other hand, wavelets are a fairly simple mathematical tool with a great variety of possible applications. Already they have led to exciting applications in signal analysis (sound, images) (some early references are Kronland-Martinet, Morlet and Grossmann (1987), Mallat (1989b), (1989c); more recent references are given later) and numerical analysis (fast algorithms for integral transforms in Beylkin, Coifman, and Rokhlin (1991)); many other applications are being studied. This wide applicability also contributes to the interest they generate.

This book contains ten lectures I delivered as the principal speaker at the CBMS conference on wavelets organized in June 1990 by the Mathematics Department at the University of Lowell, Massachusetts. According to the usual format of the CBMS conferences, other speakers (G. Battle, G. Beylkin, C. Chui, A. Cohen, R. Coifman, K. Gröchenig, J. Liandrat, S. Mallat, B. Torrésani, and A. Willsky) provided lectures on their work related to wavelets. Moreover, three workshops were organized, on applications to physics and inverse problems (chaired by B. DeFacio), group theory and harmonic analysis (H. Feichtinger), and signal analysis (M. Vetterli).

The audience consisted of researchers active in the field of wavelets as well as of mathematicians and other scientists and engineers who knew little about wavelets and hoped to learn more. This second group constituted the largest part of the audience. I saw it as my task to provide a tutorial on wavelets to this part of the audience, which would then be a solid grounding for more recent work exposed by the other lecturers and myself. Consequently, about two thirds of my lectures consisted of "basic wavelet theory,"


the other third being devoted to more recent and unpublished work. This division is reflected in the present write-up as well. As a result, I believe that this book will be useful as an introduction to the subject, to be used either for individual reading, or for a seminar or graduate course. None of the other lectures or workshop papers presented at the CBMS conference have been incorporated here. As a result, this presentation is biased more toward my own work than the CBMS conference was. In many instances I have included pointers to references for further reading or a detailed exposition of particular applications, complementing the present text.

Other books on wavelets published include Wavelets and Time Frequency Methods (Combes, Grossmann, and Tchamitchian (1987)), which contains the proceedings of the International Wavelet Conference held in Marseille, France, in December 1987, Ondelettes, by Y. Meyer (1990) (in French; English translation expected soon), which contains a mathematically more expanded treatment than the present lectures, with fewer forays into other fields however, Les Ondelettes en 1989, edited by P. G. Lemarié (1990), a collection of talks given at the Université Paris XI in the spring of 1989, and An Introduction to Wavelets, by C. K. Chui (1992b), an introduction from the approximation theory viewpoint. The proceedings of the International Wavelet Conference in May 1989, held again in Marseille, are due to come out soon (Meyer (1992)). Moreover, many of the other contributors to the CBMS conference, as well as some wavelet researchers who could not attend, were invited to write an essay on their wavelet work; the result is the essay collection Wavelets and their Applications (Ruskai et al. (1992)), which can be considered a companion book to this one. Another wavelet essay book is Wavelets: A Tutorial in Theory and Applications, edited by C. K. Chui (1992c); in addition, I know of several other wavelet essay books in preparation (edited by J. Benedetto and M. Frazier, another by M. Barlaud), as well as a monograph by M. Holschneider; there was a special wavelet issue of IEEE Trans. Inform. Theory in March of 1992; there will be another one, later in 1992, of Constructive Approximation Theory, and one in 1993, of IEEE Trans. Sign. Proc. In addition, several recent books include chapters on wavelets. Examples are Multirate Systems and Filter Banks by P. P. Vaidyanathan (1992) and Quantum Physics, Relativity and Complex Spacetime: Towards a New Synthesis by G. Kaiser (1990). Readers interested in the present lectures will find these books and special issues useful for many details and other aspects not fully presented here. It is moreover clear that the subject is still developing rapidly.

This book more or less follows the path of my lectures: each of the ten chapters stands for one of the ten lectures, presented in the order in which they were delivered. The first chapter presents a quick overview of different aspects of the wavelet transform. It sketches the outlines of a big fresco; subsequent chapters then fill in more detail. From there on, we proceed to the continuous wavelet transform (Chapter 2; with a short review of bandlimited functions and Shannon's theorem), to discrete but redundant wavelet transforms (frames; Chapter 3) and to a general discussion of time-frequency density and the possible existence of orthonormal bases (Chapter 4). Many of the results in Chapters 2-4 can be formulated for the windowed Fourier transform as well as the wavelet


transform, and the two cases are presented in parallel, with analogies and differences pointed out as we go along. The remaining chapters all focus on orthonormal bases of wavelets: multiresolution analysis and a first general strategy for the construction of orthonormal wavelet bases (Chapter 5), orthonormal bases of compactly supported wavelets and their link to subband coding (Chapter 6), sharp regularity estimates for these wavelet bases (Chapter 7), symmetry for compactly supported wavelet bases (Chapter 8). Chapter 9 shows that orthonormal bases are "good" bases for many functional spaces where Fourier methods are not well adapted. This chapter is the most mathematical of the whole book; most of its material is not connected to the applications discussed in other chapters, so that it can be skipped by readers uninterested in this aspect of wavelet theory. I included it for several reasons: the kind of estimates used in the proof are very important for harmonic analysis, and similar (but more complicated) estimates in the proof of the "T(1)"-theorem of David and Journé have turned out to be the groundwork for the applications to numerical analysis in the work of Beylkin, Coifman, and Rokhlin (1991). Moreover, the Calderón-Zygmund theorem, explained in this chapter, illustrates how techniques using different scales, one of the forerunners of wavelets, were used in harmonic analysis long before the advent of wavelets. Finally, Chapter 10 sketches several extensions of the constructions of orthonormal wavelet bases: to more than one dimension, to dilation factors different from two (even noninteger), with the possibility of better frequency localization, and to wavelet bases on a finite interval instead of the whole line. Every chapter concludes with a section of numbered "Notes," referred to in the text of the chapter by superscript numbers. These contain additional references, extra proofs excised to keep the text flowing, remarks, etc.

This book is a mathematics book: it states and proves many theorems. It also presupposes some mathematical background. In particular, I assume that the reader is familiar with the basic properties of the Fourier transform and Fourier series. I also use some basic theorems of measure and integration theory (Fatou's lemma, dominated convergence theorem, Fubini's theorem; these can be found in any good book on real analysis). In some chapters, familiarity with basic Hilbert space techniques is useful. A list of the basic notions and theorems used in the book is given in the Preliminaries.

The reader who finds that he or she does not know all of these prerequisites should not be dismayed, however; most of the book can be followed with just the basic notions of Fourier analysis. Moreover, I have tried to keep a very pedestrian pace in almost all the proofs, at the risk of boring some mathematically sophisticated readers. I hope therefore that these lecture notes will interest people other than mathematicians. For this reason I have often shied away from the "Definition-Lemma-Proposition-Theorem-Corollary" sequence, and I have tried to be intuitive in many places, even if this meant that the exposition became less succinct. I hope to succeed in sharing with my readers some of the excitement that this interdisciplinary subject has brought into my scientific life.

I want to take this opportunity to express my gratitude to the many people who made the Lowell conference happen: the CBMS board, and the Mathematics Department of the University of Lowell, in particular Professors G. Kaiser and


M. B. Ruskai. The success of the conference, which unexpectedly turned out to have many more participants than customary for CBMS conferences, was due in large part to its very efficient organization. As experienced conference organizer I. M. James (1991) says, "every conference is mainly due to the efforts of a single individual who does almost all the work"; for the 1990 Wavelet CBMS conference, this individual was Mary Beth Ruskai. I am especially grateful to her for proposing the conference in the first place, for organizing it in such a way that I had a minimal paperwork load, while keeping me posted about all the developments, and for generally being the organizational backbone, no small task.

Prior to the conference I had the opportunity to teach much of this material as a graduate course in the Mathematics Department of the University of Michigan, in Ann Arbor. My one-term visit there was supported jointly by a Visiting Professorship for Women from the National Science Foundation, and by the University of Michigan. I would like to thank both institutions for their support. I would also like to thank all the faculty and students who sat in on the course, and who provided feedback and useful suggestions.

The manuscript was typeset by Martina Sharp, whom I thank for her patience and diligence, and for doing a wonderful job. I wouldn't even have attempted to write this book without her. I am grateful to Jeff Lagarias for editorial comments. Several people helped me spot typos in the galley proofs, and I am grateful to all of them; I would like to thank especially Pascal Auscher, Gerry Kaiser, Ming-Jun Lai, and Martin Vetterli. All remaining mistakes are of course my responsibility. I also would like to thank Jim Driscoll and Sharon Murrel for helping me prepare the author index.

Finally, I want to thank my husband Robert Calderbank for being extremely supportive and committed to our two-career-track with family, even though it occasionally means that he as well as I prove a few theorems less.

Ingrid Daubechies
Rutgers University and AT&T Bell Laboratories

In subsequent printings minor mistakes and many typographical errors have been corrected. I am grateful to everybody who helped me to spot them. I have also updated a few things: some of the previously unpublished references have appeared and some of the problems that were listed as open have been solved. I have made no attempt to include the many other interesting papers on wavelets that have appeared since the first printing; in any case, the list of references was not and is still not meant as a complete bibliography of the subject.

Preliminaries and Notation

This preliminary chapter fixes notation conventions and normalizations. It also states some basic theorems that will be used later in the book. For those less familiar with Hilbert and Banach spaces, it contains a very brief primer. (This primer should be used mainly as a reference, to come back to in those instances when the reader comes across some Hilbert or Banach space language that she or he is unfamiliar with. For most chapters, these concepts are not used.)

Let us start with some notation conventions. For $x \in \mathbb{R}$, we write $\lfloor x \rfloor$ for the largest integer not exceeding $x$,

$$\lfloor x \rfloor = \max \{ n \in \mathbb{Z};\ n \le x \}.$$

For example, $\lfloor 3/2 \rfloor = 1$, $\lfloor -3/2 \rfloor = -2$, $\lfloor -2 \rfloor = -2$. Similarly, $\lceil x \rceil$ is the smallest integer which is larger than or equal to $x$.

If $a \to 0$ (or $\infty$), then we denote by $O(a)$ any quantity that is bounded by a constant times $a$, by $o(a)$ any quantity that tends to $0$ (or $\infty$) when $a$ does.

The end of a proof is always marked with a ∎; for clarity, many remarks or examples are ended with a □.

In many proofs, $C$ denotes a "generic" constant, which need not have the same value throughout the proof. In chains of inequalities, I often use $C, C', C'', \ldots$ or $C_1, C_2, C_3, \ldots$ to avoid confusion.

We use the following convention for the Fourier transform (in one dimension):

$$(\mathcal{F}f)(\xi) = \hat{f}(\xi) = \frac{1}{\sqrt{2\pi}} \int dx\, e^{-ix\xi} f(x). \tag{0.0.1}$$

With this normalization, one has

$$\|\hat{f}\|_{L^2} = \|f\|_{L^2}, \qquad |\hat{f}(\xi)| \le (2\pi)^{-1/2}\, \|f\|_{L^1},$$

where

$$\|f\|_{L^p} = \left[ \int dx\, |f(x)|^p \right]^{1/p}. \tag{0.0.2}$$
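The effect of the $1/\sqrt{2\pi}$ normalization can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the convention $\hat{f}(\xi) = (2\pi)^{-1/2} \int dx\, e^{-ix\xi} f(x)$, uses a Gaussian test function, and approximates all integrals by Riemann sums on a truncated grid. The function names, grid width, and step size are illustrative choices.

```python
import cmath
import math

def fourier_transform(f, xi, half_width=10.0, steps=400):
    """Riemann-sum approximation of (1/sqrt(2*pi)) * int dx e^{-i x xi} f(x)."""
    dx = 2.0 * half_width / steps
    total = 0j
    for k in range(steps + 1):
        x = -half_width + k * dx
        total += cmath.exp(-1j * x * xi) * f(x)
    return total * dx / math.sqrt(2.0 * math.pi)

def l2_norm(values, spacing):
    """Riemann-sum approximation of the L^2 norm of sampled values."""
    return math.sqrt(sum(abs(v) ** 2 for v in values) * spacing)

def gaussian(x):
    """Test function exp(-x^2 / 2); with this convention it is its own transform."""
    return math.exp(-x * x / 2.0)

# Sample f and its transform on the same grid [-10, 10] with spacing 0.05.
grid = [-10.0 + 0.05 * k for k in range(401)]
f_values = [gaussian(x) for x in grid]
f_hat_values = [fourier_transform(gaussian, xi) for xi in grid]

norm_f = l2_norm(f_values, 0.05)          # L^2 norm of f
norm_f_hat = l2_norm(f_hat_values, 0.05)  # L^2 norm of the transform
l1_norm_f = sum(abs(v) for v in f_values) * 0.05
sup_f_hat = max(abs(v) for v in f_hat_values)
```

Under these assumptions `norm_f` and `norm_f_hat` agree to within rounding, and `sup_f_hat` does not exceed `l1_norm_f / sqrt(2*pi)`, which illustrates the two statements that follow from the normalization.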



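Inversion of the Fourier transform is then given by

$$f(x) = \frac{1}{\sqrt{2\pi}} \int d\xi\, e^{ix\xi}\, (\mathcal{F}f)(\xi) = (\mathcal{F}f)^{\vee}(x), \qquad \check{g}(x) = \hat{g}(-x). \tag{0.0.3}$$

Strictly speaking, (0.0.1), (0.0.3) are well defined only if $f$, respectively $\mathcal{F}f$, are absolutely integrable; for general $L^2$-functions $f$, e.g., we should define $\mathcal{F}f$ via a limiting process (see also below). We will implicitly assume that the adequate limiting process is used in all cases, and write, with a convenient abuse of notation, formulas similar to (0.0.1) and (0.0.3) even when a limiting process is understood.

A standard property of the Fourier transform is:

$$\mathcal{F}\left( \frac{d^\ell}{dx^\ell} f \right)(\xi) = (i\xi)^\ell\, (\mathcal{F}f)(\xi),$$

hence

$$\int dx\, |f^{(\ell)}(x)|^2 < \infty \iff \int d\xi\, |\xi|^{2\ell}\, |\hat{f}(\xi)|^2 < \infty,$$

with the notation $f^{(\ell)} = \frac{d^\ell}{dx^\ell} f$.

[...]

(The lim sup of a sequence is defined by

$$\limsup_{n \to \infty} \alpha_n = \lim_{n \to \infty} \left[ \sup \{ \alpha_k;\ k \ge n \} \right];$$

every sequence, even if it does not have a limit (such as $\alpha_n = (-1)^n$), has a lim sup (which may be $\infty$); for sequences that converge to a limit, the lim sup coincides with the limit.)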

Dominated convergence theorem. Suppose $f_n(x) \to f(x)$ almost everywhere. If $|f_n(x)| \le g(x)$ for all $n$, and $\int dx\, g(x) < \infty$, then $f$ is integrable, and

$$\int dx\, f(x) = \lim_{n \to \infty} \int dx\, f_n(x).$$

[...]

all Cauchy sequences have limits within the space. (More explicitly, if $u_n \in \mathcal{H}$ and if $\|u_n - u_m\|$ becomes arbitrarily small when $n, m$ are large enough, i.e., for all $\epsilon > 0$, there exists $n_0$, depending on $\epsilon$, so that $\|u_n - u_m\| \le \epsilon$ if $n, m \ge n_0$, then there exists $u \in \mathcal{H}$ so that the $u_n$ tend to $u$ for $n \to \infty$, i.e., $\lim_{n \to \infty} \|u - u_n\| = 0$.)

A standard example of such a Hilbert space is $L^2(\mathbb{R})$, with

$$\langle f, g \rangle = \int dx\, f(x)\, \overline{g(x)}.$$

Here the integration runs from $-\infty$ to $\infty$; we will often drop the integration bounds when the integral runs over the whole real line.

Another example is $\ell^2(\mathbb{Z})$, the set of all square summable sequences of complex numbers indexed by integers, with

$$\langle c, d \rangle = \sum_{n=-\infty}^{\infty} c_n\, \overline{d_n}.$$

Again, we will often drop the limits on the summation index when we sum over all integers. Both $L^2(\mathbb{R})$ and $\ell^2(\mathbb{Z})$ are infinite-dimensional Hilbert spaces. Even simpler are finite-dimensional Hilbert spaces, of which $\mathbb{C}^k$ is the standard example, with the scalar product

$$\langle u, v \rangle = \sum_{j=1}^{k} u_j\, \overline{v_j},$$

for $u = (u_1, \ldots, u_k)$, $v = (v_1, \ldots, v_k) \in \mathbb{C}^k$.

Hilbert spaces always have orthonormal bases, i.e., there exist families of vectors $e_n$ in $\mathcal{H}$ with

$$\langle e_n, e_m \rangle = \delta_{n,m}$$

and

$$\|u\|^2 = \sum_n |\langle u, e_n \rangle|^2$$

for all $u \in \mathcal{H}$. (We only consider separable Hilbert spaces, i.e., spaces in which orthonormal bases are countable.) Examples of orthonormal bases are the Hermite functions in $L^2(\mathbb{R})$, the sequences $e_n$ defined by $(e_n)_j = \delta_{n,j}$, with $n, j \in \mathbb{Z}$ in $\ell^2(\mathbb{Z})$ (i.e., all entries but the $n$th vanish), or the $k$ vectors $e_1, \ldots, e_k$ in $\mathbb{C}^k$ defined by $(e_\ell)_m = \delta_{\ell,m}$, with $1 \le \ell, m \le k$. (We use Kronecker's symbol $\delta$ with the usual meaning: $\delta_{i,j} = 1$ if $i = j$, $0$ if $i \ne j$.)

A standard inequality in a Hilbert space is the Cauchy-Schwarz inequality,

$$|\langle v, w \rangle| \le \|v\|\, \|w\|, \tag{0.0.6}$$

easily proved by writing (0.0.5) for appropriate linear combinations of $v$ and $w$. In particular, for $f, g \in L^2(\mathbb{R})$, we have

$$\left| \int dx\, f(x)\, \overline{g(x)} \right| \le \left( \int dx\, |f(x)|^2 \right)^{1/2} \left( \int dx\, |g(x)|^2 \right)^{1/2}.$$

A consequence of (0.0.6) is

$$\|u\| = \sup_{v,\ \|v\| \le 1} |\langle u, v \rangle| = \sup_{v,\ \|v\| = 1} |\langle u, v \rangle|. \tag{0.0.7}$$

"Operators" on $\mathcal{H}$ are linear maps from $\mathcal{H}$ to another Hilbert space, often $\mathcal{H}$ itself. Explicitly, if $A$ is an operator on $\mathcal{H}$, then

$$A(\lambda_1 u_1 + \lambda_2 u_2) = \lambda_1 A u_1 + \lambda_2 A u_2.$$

An operator is continuous if $Au - Av$ can be made arbitrarily small by making $u - v$ small. Explicitly, for all $\epsilon > 0$ there should exist $\delta$ (depending on $\epsilon$) so that $\|u - v\| \le \delta$ implies $\|Au - Av\| \le \epsilon$. If we take $v = 0$, $\epsilon = 1$, then we find that, for some $b > 0$, $\|Au\| \le 1$ if $\|u\| \le b$. For any $w \in \mathcal{H}$ we can define $w' = \frac{b}{\|w\|}\, w$; clearly $\|w'\| \le b$ and therefore $\|Aw\| = \frac{\|w\|}{b}\, \|Aw'\| \le b^{-1} \|w\|$. If $\|Aw\| / \|w\|$ ($w \ne 0$) is bounded, then the operator $A$ is called bounded. We have just seen that any continuous operator is bounded; the reverse is also true. The norm $\|A\|$ of $A$ is defined by

$$\|A\| = \sup_{u \in \mathcal{H},\ \|u\| \ne 0} \|Au\| / \|u\| = \sup_{\|u\| = 1} \|Au\|. \tag{0.0.8}$$

It immediately follows that, for all $u \in \mathcal{H}$, $\|Au\| \le \|A\|\, \|u\|$.

Operators from $\mathcal{H}$ to $\mathbb{C}$ are called "linear functionals." For bounded linear functionals one has Riesz' representation theorem: for any $\ell: \mathcal{H} \to \mathbb{C}$, linear and



bounded, i.e., $|\ell(u)| \le C \|u\|$ for all $u \in \mathcal{H}$, there exists a unique $v_\ell \in \mathcal{H}$ so that $\ell(u) = \langle u, v_\ell \rangle$.

An operator $U$ from $\mathcal{H}_1$ to $\mathcal{H}_2$ is an isometry if $\langle Uv, Uw \rangle = \langle v, w \rangle$ for all $v, w \in \mathcal{H}_1$; $U$ is unitary if moreover $U\mathcal{H}_1 = \mathcal{H}_2$, i.e., every element $v_2 \in \mathcal{H}_2$ can be written as $v_2 = Uv_1$ for some $v_1 \in \mathcal{H}_1$. If the $e_n$ constitute an orthonormal basis in $\mathcal{H}_1$, and $U$ is unitary, then the $Ue_n$ constitute an orthonormal basis in $\mathcal{H}_2$. The reverse is also true: any operator that maps an orthonormal basis to another orthonormal basis is unitary.

A set $D$ is called dense in $\mathcal{H}$ if every $u \in \mathcal{H}$ can be written as the limit of some sequence of $u_n$ in $D$. (One then says that the closure of $D$ is all of $\mathcal{H}$. The closure of a set $S$ is obtained by adding to it all the $v$ that can be obtained as limits of sequences in $S$.) If $Av$ is only defined for $v \in D$, but we know that

$$\|Av\| \le C \|v\| \quad \text{for all } v \in D, \tag{0.0.9}$$

then we can extend $A$ to all of $\mathcal{H}$ "by continuity." Explicitly: if $u \in \mathcal{H}$, find $u_n \in D$ so that $\lim_{n \to \infty} u_n = u$. Then the $u_n$ are necessarily a Cauchy sequence, and because of (0.0.9), so are the $Au_n$; the $Au_n$ have therefore a limit, which we call $Au$ (it does not depend on the particular sequence $u_n$ that was chosen).

One can also deal with unbounded operators, i.e., $A$ for which there exists no finite $C$ such that $\|Au\| \le C \|u\|$ holds for all $u \in \mathcal{H}$. It is a fact of life that these can usually only be defined on a dense set $D$ in $\mathcal{H}$, and cannot be extended by the above trick (since they are not continuous). An example is $\frac{d}{dx}$ in $L^2(\mathbb{R})$, where we can take $D = C_0^\infty(\mathbb{R})$, the set of all infinitely differentiable functions with compact support. The dense set on which the operator is defined is called its domain.

The adjoint $A^*$ of a bounded operator $A$ from a Hilbert space $\mathcal{H}_1$ to a Hilbert space $\mathcal{H}_2$ (which may be $\mathcal{H}_1$ itself) is the operator from $\mathcal{H}_2$ to $\mathcal{H}_1$ defined by

$$\langle u_1, A^* u_2 \rangle = \langle A u_1, u_2 \rangle,$$

which should hold for all $u_1 \in \mathcal{H}_1$, $u_2 \in \mathcal{H}_2$. (The existence of $A^*$ is guaranteed by Riesz' representation theorem: for fixed $u_2$, we can define a linear functional $\ell$ on $\mathcal{H}_1$ by $\ell(u_1) = \langle A u_1, u_2 \rangle$. It is clearly bounded, and corresponds therefore to a vector $v$ so that $\langle u_1, v \rangle = \ell(u_1)$. It is easy to check that the correspondence $u_2 \to v$ is linear; this defines the operator $A^*$.) One has

$$\|A^*\| = \|A\|, \qquad \|A^* A\| = \|A\|^2.$$

If $A^* = A$ (only possible if $A$ maps $\mathcal{H}$ to itself), then $A$ is called self-adjoint. If a self-adjoint operator $A$ satisfies $\langle Au, u \rangle \ge 0$ for all $u \in \mathcal{H}$, then it is called a positive operator; this is often denoted $A \ge 0$. We will write $A \ge B$ if $A - B$ is a positive operator.

Trace-class operators are special operators such that $\sum_n |\langle A e_n, e_n \rangle|$ is finite for all orthonormal bases in $\mathcal{H}$. For such a trace-class operator, $\sum_n \langle A e_n, e_n \rangle$ is independent of the chosen orthonormal basis; we call this sum the trace of $A$,

$$\operatorname{tr} A = \sum_n \langle A e_n, e_n \rangle.$$



If $A$ is positive, then it is sufficient to check whether $\sum_n \langle A e_n, e_n \rangle$ is finite for only one orthonormal basis; if it is, then $A$ is trace-class. (This is not true for non-positive operators!)

The spectrum $\sigma(A)$ of an operator $A$ from $\mathcal{H}$ to itself consists of all the $\lambda \in \mathbb{C}$ such that $A - \lambda\, \mathrm{Id}$ ($\mathrm{Id}$ stands for the identity operator, $\mathrm{Id}\, u = u$) does not have a bounded inverse. In a finite-dimensional Hilbert space, $\sigma(A)$ consists of the eigenvalues of $A$; in the infinite-dimensional case, $\sigma(A)$ contains all the eigenvalues (constituting the point spectrum) but often contains other $\lambda$ as well, constituting the continuous spectrum. (For instance, in $L^2(\mathbb{R})$, multiplication of $f(x)$ with $\sin \pi x$ has no point spectrum, but its continuous spectrum is $[-1, 1]$.) The spectrum of a self-adjoint operator consists of only real numbers; the spectrum of a positive operator contains only non-negative numbers. The spectral radius $\rho(A)$ is defined by

$$\rho(A) = \sup \{ |\lambda|;\ \lambda \in \sigma(A) \}.$$

It has the properties

$$\rho(A) \le \|A\| \quad \text{and} \quad \rho(A) = \lim_{n \to \infty} \|A^n\|^{1/n}.$$

Self-adjoint operators can be diagonalized. This is easiest to understand if their spectrum consists only of eigenvalues (as is the case in finite dimensions). One then has $\sigma(A) = \{ \lambda_n;\ n \in \mathbb{N} \}$, with a corresponding orthonormal family of eigenvectors,

$$A e_n = \lambda_n e_n.$$

It then follows that, for all $u \in \mathcal{H}$,

$$Au = \sum_n \langle Au, e_n \rangle\, e_n = \sum_n \langle u, A e_n \rangle\, e_n = \sum_n \lambda_n \langle u, e_n \rangle\, e_n,$$

which is the "diagonalization" of $A$. (The spectral theorem permits us to generalize this if part (or all) of the spectrum is continuous, but we will not need it in this book.) If two operators commute, i.e., $ABu = BAu$ for all $u \in \mathcal{H}$, then they can be diagonalized simultaneously: there exists an orthonormal basis such that

$$A e_n = \alpha_n e_n \quad \text{and} \quad B e_n = \beta_n e_n.$$

Many of these properties for bounded operators can also be formulated for unbounded operators: adjoints, spectrum, diagonalization all exist for unbounded operators as well. One has to be very careful with domains, however. For instance, generalizing the simultaneous diagonalization of commuting operators requires a careful definition of commuting operators: there exist pathological examples where $A$, $B$ are both defined on a domain $D$, where $AB$ and $BA$ both make sense on $D$ and are equal on $D$, but where $A$ and $B$ nevertheless are not



simultaneously diagonalizable (because $D$ was chosen "too small"; see, e.g., Reed and Simon (1971) for an example). The proper definition of commuting for unbounded self-adjoint operators uses associated bounded operators: $H_1$ and $H_2$ commute if their associated unitary evolution operators commute. For a self-adjoint operator $H$, the associated unitary evolution operators $U_t$ are defined as follows: for any $v \in D$, the domain of $H$ (beware: the domain of a self-adjoint operator is not just any dense set on which $H$ is well defined), $U_T v$ is the solution $v(t)$ at time $t = T$ of the differential equation

$$i\, \frac{d}{dt} v(t) = H v(t),$$

with initial condition $v(0) = v$.

Banach spaces share many properties with but are more general than Hilbert spaces. They are linear spaces equipped with a norm (which need not be and generally is not derived from a scalar product), complete with respect to that norm (i.e., all Cauchy sequences converge; see above). Some of the concepts we reviewed above for Hilbert spaces also exist in Banach spaces; e.g., bounded operators, linear functionals, spectrum and spectral radius. An example of a Banach space that is not a Hilbert space is $L^p(\mathbb{R})$, the set of all functions $f$ on $\mathbb{R}$ such that $\|f\|_{L^p}$ (see (0.0.2)) is finite, with $1 \le p < \infty$, $p \ne 2$. Another example is $L^\infty(\mathbb{R})$, the set of all bounded functions on $\mathbb{R}$, with $\|f\|_{L^\infty} = \sup_{x \in \mathbb{R}} |f(x)|$. The dual $E^*$ of a Banach space $E$ is the set of all bounded linear functionals on $E$; it is also a linear space, which comes with a natural norm (defined as in (0.0.8)), with respect to which it is complete: $E^*$ is a Banach space itself. In the case of the $L^p$-spaces, $1 \le p < \infty$, it turns out that elements of $L^q$, where $p$ and $q$ are related by $p^{-1} + q^{-1} = 1$, define bounded linear functionals on $L^p$. Indeed, one has Hölder's inequality,

$$\left| \int dx\, f(x)\, \overline{g(x)} \right| \le \|f\|_{L^p}\, \|g\|_{L^q}.$$

It turns out that all bounded linear functionals on $L^p$ are of this type, i.e., $(L^p)^* = L^q$. In particular, $L^2$ is its own dual; by Riesz' representation theorem (see above), every Hilbert space is its own dual. The adjoint $A^*$ of an operator $A$ from $E_1$ to $E_2$ is now an operator from $E_2^*$ to $E_1^*$, defined by

$$(A^* \ell_2)(v_1) = \ell_2(A v_1).$$

There exist different types of bases in Banach spaces. (We will again only consider separable spaces, in which bases are countable.) The $e_n$ constitute a Schauder basis if, for all $v \in E$, there exist unique $\mu_n \in \mathbb{C}$ so that $v = \lim_{N \to \infty} \sum_{n=1}^{N} \mu_n e_n$ (i.e., $\|v - \sum_{n=1}^{N} \mu_n e_n\| \to 0$ as $N \to \infty$). The uniqueness requirement of the $\mu_n$ forces the $e_n$ to be linearly independent, in the sense that no $e_n$ can be in the closure of the linear span of all the others, i.e., there exist no $\gamma_m$ so that $e_n = \lim_{N \to \infty} \sum_{m=1,\ m \ne n}^{N} \gamma_m e_m$. In a Schauder basis, the ordering of the $e_n$ may be important. A basis is called unconditional if in addition it satisfies one of the following two equivalent properties:



• whenever $\sum_n \mu_n e_n \in E$, it follows that $\sum_n |\mu_n|\, e_n \in E$;

• if $\sum_n \mu_n e_n \in E$, and $\epsilon_n = \pm 1$, randomly chosen for every $n$, then $\sum_n \mu_n \epsilon_n e_n \in E$.

For an unconditional basis, the order in which the basis vectors are taken does not matter. Not all Banach spaces have unconditional bases: $L^1(\mathbb{R})$ and $L^\infty(\mathbb{R})$ do not.

In a Hilbert space $\mathcal{H}$, an unconditional basis is also called a Riesz basis. A Riesz basis can also be characterized by the following equivalent requirement: there exist $\alpha > 0$, $\beta < \infty$ so that

$$\alpha\, \|u\|^2 \le \sum_n |\langle u, e_n \rangle|^2 \le \beta\, \|u\|^2, \tag{0.0.10}$$

for all $u \in \mathcal{H}$. If $A$ is a bounded operator with a bounded inverse, then $A$ maps any orthonormal basis to a Riesz basis. Moreover, all Riesz bases can be obtained as such images of an orthonormal basis. In a way, Riesz bases are the next best thing to an orthonormal basis. Note that the inequalities in (0.0.10) are not sufficient to guarantee that the $e_n$ constitute a Riesz basis: the $e_n$ also need to be linearly independent!
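The statement that a bounded, boundedly invertible operator maps an orthonormal basis to a Riesz basis can be illustrated in the simplest setting, $\mathbb{R}^2$. The sketch below is an illustration under stated assumptions, not taken from the text: the matrix `A` is a hypothetical example, the basis vectors are the images of the standard basis, and the constants in the two-sided inequality are computed as the eigenvalues of the symmetric matrix $A A^T$, using the finite-dimensional identity $\sum_n |\langle u, e_n \rangle|^2 = \langle A A^T u, u \rangle$.

```python
import math

# Hypothetical invertible matrix A; its columns are the images of the
# standard orthonormal basis of R^2, i.e. the candidate Riesz basis.
A = [[2.0, 1.0],
     [0.0, 1.0]]
e1 = (A[0][0], A[1][0])
e2 = (A[0][1], A[1][1])

def frame_sum(u):
    """Sum over n of |<u, e_n>|^2 for the two basis vectors."""
    return sum((u[0] * e[0] + u[1] * e[1]) ** 2 for e in (e1, e2))

# S = A A^T is symmetric; its eigenvalues are the best lower and upper constants.
s11 = A[0][0] ** 2 + A[0][1] ** 2
s12 = A[0][0] * A[1][0] + A[0][1] * A[1][1]
s22 = A[1][0] ** 2 + A[1][1] ** 2
mean = (s11 + s22) / 2.0
radius = math.sqrt(((s11 - s22) / 2.0) ** 2 + s12 ** 2)
alpha, beta = mean - radius, mean + radius

# Sample unit vectors u = (cos t, sin t), for which ||u||^2 = 1.
samples = [frame_sum((math.cos(t), math.sin(t)))
           for t in (2.0 * math.pi * k / 360 for k in range(360))]
```

For this particular `A` the constants come out as $3 \mp \sqrt{5}$, and every sampled value of the sum lies between them. In finite dimensions the lower bound alone already forces linear independence; the extra warning in the text matters for infinite families.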

CHAPTER 1

The What, Why, and How of Wavelets

The wavelet transform is a tool that cuts up data or functions or operators into different frequency components, and then studies each component with a resolution matched to its scale. Forerunners of this technique were invented independently in pure mathematics (Calderón's resolution of the identity in harmonic analysis; see, e.g., Calderón (1964)), physics (coherent states for the $(ax + b)$-group in quantum mechanics, first constructed by Aslaksen and Klauder (1968), and linked to the hydrogen atom Hamiltonian by Paul (1985)) and engineering (QMF filters by Esteban and Galand (1977), and later QMF filters with exact reconstruction property by Smith and Barnwell (1986), Vetterli (1986) in electrical engineering; wavelets were proposed for the analysis of seismic data by J. Morlet (1983)). The last five years have seen a synthesis between all these different approaches, which has been very fertile for all the fields concerned.

Let us stay for a moment within the signal analysis framework. (The discussion can easily be translated to other fields.) The wavelet transform of a signal evolving in time (e.g., the amplitude of the pressure on an eardrum, for acoustical applications) depends on two variables: scale (or frequency) and time; wavelets provide a tool for time-frequency localization. The first section tells us what time-frequency localization means and why it is of interest. The remaining sections describe different types of wavelets.

1.1. Time-frequency localization.

In many applications, given a signal $f(t)$ (for the moment, we assume that $t$ is a continuous variable), one is interested in its frequency content locally in time. This is similar to music notation, for example, which tells the player which notes (= frequency information) to play at any given moment. The standard Fourier transform,

$$(\mathcal{F}f)(\omega) = \frac{1}{\sqrt{2\pi}} \int dt\, e^{-i\omega t} f(t),$$

also gives a representation of the frequency content of $f$, but information concerning time-localization of, e.g., high frequency bursts cannot be read off easily from $\mathcal{F}f$. Time-localization can be achieved by first windowing the signal $f$, so as to cut off only a well-localized slice of $f$, and then taking its Fourier transform:

$$(T^{\mathrm{win}} f)(\omega, t) = \int ds\, f(s)\, g(s - t)\, e^{-i\omega s}. \tag{1.1.1}$$

This is the windowed Fourier transform, which is a standard technique for time-frequency localization.¹ It is even more familiar to signal analysts in its discrete version, where $t$ and $\omega$ are assigned regularly spaced values: $t = n t_0$, $\omega = m \omega_0$, where $m, n$ range over $\mathbb{Z}$, and $\omega_0, t_0 > 0$ are fixed. Then (1.1.1) becomes

$$T^{\mathrm{win}}_{m,n}(f) = \int ds\, f(s)\, g(s - n t_0)\, e^{-i m \omega_0 s}. \tag{1.1.2}$$

This procedure is schematically represented in Figure 1.1: for fixed $n$, the $T^{\mathrm{win}}_{m,n}(f)$ correspond to the Fourier coefficients of $f(\cdot)g(\cdot - nt_0)$. If, for instance, $g$ is compactly supported, then it is clear that, with appropriately chosen $\omega_0$, the Fourier coefficients $T^{\mathrm{win}}_{m,n}(f)$ are sufficient to characterize and, if need be, to reconstruct $f(\cdot)g(\cdot - nt_0)$. Changing $n$ amounts to shifting the "slices" by steps of $t_0$ and its multiples, allowing the recovery of all of $f$ from the $T^{\mathrm{win}}_{m,n}(f)$. (We will discuss this in more mathematical detail in Chapter 3.)

Many possible choices have been proposed for the window function $g$ in signal analysis, most of which have compact support and reasonable smoothness. In physics, (1.1.1) is related to coherent state representations; the $g^{\omega,t}(s) = e^{i\omega s} g(s - t)$ are the coherent states associated to the Weyl-Heisenberg group (see, e.g., Klauder and Skagerstam (1985)). In this context, a very popular choice is a Gaussian $g$. In all applications, $g$ is supposed to be well concentrated in both time and frequency; if $g$ and $\hat g$ are both concentrated around zero, then $(T^{\mathrm{win}} f)(\omega, t)$ can be interpreted loosely as the "content" of $f$ near time $t$ and near frequency $\omega$. The windowed Fourier transform thus provides a description of $f$ in the time-frequency plane.

FIG. 1.1. The windowed Fourier transform: the function $f(t)$ is multiplied with the window function $g(t)$, and the Fourier coefficients of the product $f(t)g(t)$ are computed; the procedure is then repeated for translated versions of the window, $g(t - t_0)$, $g(t - 2t_0)$, ....
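The discrete formula (1.1.2) can be tried out numerically. The following is only a minimal sketch, assuming a Gaussian window, a made-up test signal `burst`, and a plain midpoint rule for the integral; none of these choices come from the text. It compares one coefficient whose labels match the location and frequency of the burst with two coefficients that are off in time or in frequency.

```python
import cmath
import math


def gaussian_window(s: float) -> float:
    """Window g(s) = exp(-s^2 / 2): a Gaussian, assumed here for illustration."""
    return math.exp(-0.5 * s * s)


def burst(s: float) -> float:
    """Hypothetical test signal: an oscillation at frequency 5 localized near s = 3."""
    return math.exp(-0.5 * (s - 3.0) ** 2) * math.cos(5.0 * s)


def windowed_fourier_coefficient(f, g, m, n, omega0, t0,
                                 s_min=-20.0, s_max=20.0, steps=8000):
    """Midpoint-rule approximation of
    T_{m,n}(f) = int ds f(s) g(s - n*t0) exp(-i*m*omega0*s)."""
    h = (s_max - s_min) / steps
    total = 0j
    for k in range(steps):
        s = s_min + (k + 0.5) * h
        total += f(s) * g(s - n * t0) * cmath.exp(-1j * m * omega0 * s)
    return total * h


omega0, t0 = 1.0, 1.0
# Window at time n*t0 = 3 and frequency m*omega0 = 5: matches the burst.
near = abs(windowed_fourier_coefficient(burst, gaussian_window, 5, 3, omega0, t0))
# Same frequency, but the window sits at time -3, away from the burst.
far_in_time = abs(windowed_fourier_coefficient(burst, gaussian_window, 5, -3, omega0, t0))
# Same time, but frequency 0 instead of 5.
far_in_freq = abs(windowed_fourier_coefficient(burst, gaussian_window, 0, 3, omega0, t0))
print(near, far_in_time, far_in_freq)
```

In this sketch the matching coefficient dominates the other two by more than an order of magnitude, which is the loose "content of $f$ near time $t$ and frequency $\omega$" reading described above.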

1.2. The wavelet transform: Analogies and differences with the windowed Fourier transform.

The wavelet transform provides a similar time-frequency description, with a few important differences. The wavelet transform formulas analogous to (1.1.1) and (1.1.2) are

$$(T^{\mathrm{wav}} f)(a, b) = |a|^{-1/2} \int dt\, f(t)\, \psi\!\left(\frac{t - b}{a}\right) \tag{1.2.1}$$

and

$$T^{\mathrm{wav}}_{m,n}(f) = a_0^{-m/2} \int dt\, f(t)\, \psi(a_0^{-m} t - nb_0) . \tag{1.2.2}$$

In both cases we assume that $\psi$ satisfies

$$\int dt\, \psi(t) = 0 \tag{1.2.3}$$

(for reasons explained in Chapters 2 and 3).
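Formula (1.2.1) and condition (1.2.3) can be checked numerically. The sketch below is illustrative only: it assumes the wavelet $\psi(t) = (1 - t^2)\exp(-t^2/2)$ (one standard choice), a made-up signal `transient`, and a midpoint rule on a finite interval; the function names are hypothetical.

```python
import math


def mexican_hat(t: float) -> float:
    """psi(t) = (1 - t^2) exp(-t^2 / 2), chosen here as an example wavelet."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def integrate(func, lo: float, hi: float, steps: int = 20000) -> float:
    """Midpoint-rule quadrature of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))


def wavelet_transform(f, psi, a: float, b: float,
                      lo: float = -50.0, hi: float = 50.0) -> float:
    """Approximate (T^wav f)(a, b) = |a|^(-1/2) int dt f(t) psi((t - b) / a)."""
    return abs(a) ** -0.5 * integrate(lambda t: f(t) * psi((t - b) / a), lo, hi)


def transient(t: float) -> float:
    """Hypothetical signal: a very short bump centred at t = 2."""
    return math.exp(-50.0 * (t - 2.0) ** 2)


# Condition (1.2.3): the wavelet integrates to zero.
psi_mean = integrate(mexican_hat, -20.0, 20.0)
# A small scale a, with b on top of the bump, picks the transient up ...
on_transient = wavelet_transform(transient, mexican_hat, a=0.2, b=2.0)
# ... while the same scale centred elsewhere sees essentially nothing.
off_transient = wavelet_transform(transient, mexican_hat, a=0.2, b=-2.0)
# Because of (1.2.3), a constant signal has vanishing transform.
constant_part = wavelet_transform(lambda t: 1.0, mexican_hat, a=1.0, b=0.0)
print(psi_mean, on_transient, off_transient, constant_part)
```

The finite integration window and the step count are arbitrary; they only need to be generous compared with the widths $|a|$ used.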

Formula (1.2.2) is again obtained from (1.2.1) by restricting $a, b$ to only discrete values: $a = a_0^m$, $b = nb_0a_0^m$ in this case, with $m, n$ ranging over $\mathbb{Z}$, and $a_0 > 1$, $b_0 > 0$ fixed. One similarity between the wavelet and windowed Fourier transforms is clear: both (1.1.1) and (1.2.1) take the inner products of $f$ with a family of functions indexed by two labels, $g^{\omega,t}(s) = e^{i\omega s}g(s - t)$ in (1.1.1), and $\psi^{a,b}(s) = |a|^{-1/2}\psi\!\left(\frac{s - b}{a}\right)$ in (1.2.1). The functions $\psi^{a,b}$ are called "wavelets"; the function $\psi$ is sometimes called "mother wavelet." (Note that $\psi$ and $g$ are implicitly assumed to be real, even though this is by no means essential; if they are not, then complex conjugates have to be introduced in (1.1.1), (1.2.1).) A typical choice for $\psi$ is $\psi(t) = (1 - t^2)\exp(-t^2/2)$, the second derivative of the Gaussian, sometimes called the Mexican hat function because it resembles a cross section of a Mexican hat. The Mexican hat function is well localized in both time and frequency, and satisfies (1.2.3). As $a$ changes, the $\psi^{a,0}(s) = |a|^{-1/2}\psi(s/a)$ cover different frequency ranges (large values of the scaling parameter $|a|$ correspond to small frequencies, or large scale $\psi^{a,0}$; small values of $|a|$ correspond to high frequencies or very fine scale $\psi^{a,0}$). Changing the parameter $b$ as well allows us to move the time localization center: each $\psi^{a,b}(s)$ is localized around $s = b$. It follows that (1.2.1), like (1.1.1), provides a time-frequency description of $f$. The difference between the wavelet and windowed Fourier transforms lies in the shapes of the analyzing functions $g^{\omega,t}$ and $\psi^{a,b}$, as shown in Figure 1.2. The functions $g^{\omega,t}$ all consist of the same envelope function $g$, translated to the proper time location, and "filled in" with higher frequency oscillations. All the $g^{\omega,t}$, regardless of the value of $\omega$, have the same width.
In contrast, the $\psi^{a,b}$ have time-widths adapted to their frequency: high frequency $\psi^{a,b}$ are very narrow, while low frequency $\psi^{a,b}$ are much broader. As a result, the wavelet transform is better able than the windowed Fourier transform to "zoom in" on very short lived high frequency phenomena, such as transients in signals (or singularities

[Figure 1.2: panel (a) shows the window $g(x)$ and $\mathrm{Re}\, g^{\omega,t}$; panel (b) shows wavelets $\psi^{a,b}$ for different values of $a$, one of them with $a > 1$.]

[...] (the associated scaling function) and $\psi$ for different constructions which we will encounter in later chapters. (a) The Meyer wavelets; (b) and (c) Battle-Lemarié wavelets; (d) the Haar wavelet; (e) the next member of the family of compactly supported wavelets, ${}_2\psi$; (f) another compactly supported wavelet, with less asymmetry.

(b), and (c) of Figure 1.8). The Haar wavelet, in Figure 1.8d, has been known since 1910. It can be viewed as the smallest degree Battle-Lemarié wavelet ($\psi_{\mathrm{Haar}} = \psi_{BL,0}$) or also as the first of a family of compactly supported wavelets constructed in Chapter 6, $\psi_{\mathrm{Haar}} = {}_1\psi$. Figure 1.8e plots the next member of the family of compactly supported wavelets ${}_N\psi$; ${}_2\phi$ and ${}_2\psi$ both have support width 3, and are continuous. In this family of ${}_N\psi$ (constructed in §6.4), the regularity increases linearly with the support width (Chapter 7). Finally, Figure 1.8f shows another compactly supported wavelet, with support width 11, and less asymmetry (see Chapter 8).

Notes.

1. There exist other techniques for time-frequency localization than the windowed Fourier transform. A well-known example is the Wigner distribution. (See, e.g., Boashash (1990) for a good review on the use of the Wigner distribution for signal analysis.) The advantage of the Wigner distribution is that, unlike the windowed Fourier transform or the wavelet transform, it does not introduce a reference function (such as the window function, or the wavelet) against which the signal has to be integrated. The disadvantage is that the signal enters in the Wigner distribution in a quadratic rather than linear way, which is the cause of many interference phenomena. These may be useful in some applications, especially for, e.g., signals which have a very short time duration (an example is Janse and Kaiser (1983); Boashash (1990) contains references to many more examples); for signals which last for a longer time, they make the Wigner distribution less attractive. Flandrin (1989) shows how the absolute values of both the windowed Fourier transform and the wavelet transform of a function can also be obtained by "smoothing" its Wigner distribution in an appropriate way; the phase information is lost in this process, however, and reconstruction is no longer possible.

2. The restriction $b_0 = 1$, corresponding to (1.3.4), is not very serious: if (1.3.4) provides an orthonormal basis, then so do the $\tilde\psi_{m,n}(x) = 2^{-m/2}\tilde\psi(2^{-m}x - nb_0)$, with $\tilde\psi(x) = |b_0|^{-1/2}\psi(b_0^{-1}x)$, where $b_0 \neq 0$ is arbitrary. The choice $a_0 = 2$ cannot be modified by scaling, and in fact $a_0$ cannot be chosen arbitrarily. The general construction of orthonormal bases we will present here can be made to work for all rational choices for $a_0 > 1$, as shown in Auscher (1989), but the choice $a_0 = 2$ is the simplest. Different choices for $a_0$ correspond of course to different $\psi$. Although the constructive method for orthonormal wavelet bases, called multiresolution analysis, can work only if $a_0$ is rational, it is an open question whether there exist orthonormal wavelet bases (necessarily not associated with a multiresolution analysis), with good time-frequency localization, and with irrational $a_0$.

CHAPTER 2

The Continuous Wavelet Transform

The images of $L^2$-functions under the continuous wavelet transform constitute a reproducing kernel Hilbert space (r.k.H.s.). Such r.k.H.s.'s occur and are useful in many different contexts. One of the simplest examples is the space of all bandlimited functions, discussed in §§2.1 and 2.2. In §2.3 we introduce the concept of band and time limiting; of course no nonzero function can be strictly time-limited (i.e., $f(t) \equiv 0$ for $t$ outside $[-T, T]$) and band-limited ($\hat f(\xi) \equiv 0$ for $\xi \notin [-\Omega, \Omega]$), but one can still introduce time-and-band-limiting operators. We present a short review of the beautiful work of Landau, Pollak, and Slepian on this subject. We then switch to the continuous wavelet transform: the resolution of the identity in §2.4 (with a proof of (1.3.1)), the corresponding r.k.H.s. in §2.5. In §2.6 we briefly show how the one-dimensional results of the earlier sections can be extended to higher dimensions. In §2.7 we draw a parallel with the continuous windowed Fourier transform. In §2.8 we show how a different kind of time-and-band-limiting operator can be built from the continuous windowed Fourier transform or from the wavelet transform. Finally, we comment in §2.9 on the "zoom-in" property of the wavelet transform.

2.1. Bandlimited functions and Shannon's theorem.

A function $f$ in $L^2(\mathbb{R})$ is called bandlimited if its Fourier transform $\mathcal{F}f$ has compact support, i.e., $\hat f(\xi) \equiv 0$ for $|\xi| > \Omega$. Let us suppose, for simplicity, that $\Omega = \pi$. Then $\hat f$ can be represented by its Fourier series (see Preliminaries),

$$\hat f(\xi) = \sum_{n \in \mathbb{Z}} c_n e^{-in\xi} ,$$

where

$$c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} d\xi\, e^{in\xi} \hat f(\xi) = \frac{1}{\sqrt{2\pi}}\, f(n) .$$

It follows that

$$f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} d\xi\, e^{ix\xi} \hat f(\xi) = \frac{1}{\sqrt{2\pi}} \int_{-\pi}^{\pi} d\xi\, e^{ix\xi} \sum_n c_n e^{-in\xi} = \frac{1}{\sqrt{2\pi}} \sum_n c_n\, \frac{2 \sin \pi(x - n)}{x - n} = \sum_n f(n)\, \frac{\sin \pi(x - n)}{\pi(x - n)} , \tag{2.1.1}$$

where we have interchanged integral and summation in the third step, which is only a priori justifiable if $\sum |c_n| < \infty$ (e.g., if only finitely many $c_n$ are nonzero). By a standard continuity argument, the final result holds for all bandlimited $f$ (for every $x$, the series is absolutely summable because $\sum_n |f(n)|^2 = 2\pi \sum_n |c_n|^2 < \infty$). Formula (2.1.1) tells us that $f$ is completely determined by its "sampled" values $f(n)$. If we lift the restriction $\Omega = \pi$ and assume support $\hat f \subset [-\Omega, \Omega]$, with $\Omega$ arbitrary, then (2.1.1) becomes

$$f(x) = \sum_n f\!\left(\frac{n\pi}{\Omega}\right) \frac{\sin(\Omega x - n\pi)}{\Omega x - n\pi} ; \tag{2.1.2}$$

the function is now determined by its samples $f(n\pi/\Omega)$, corresponding to a "sampling density" of $\Omega/\pi = |\text{support}\, \hat f|/2\pi$. (We use the notation $|A|$ for the "size" of a set $A \subset \mathbb{R}$, as measured by the Lebesgue measure; in this case $|\text{support}\, \hat f| = |[-\Omega, \Omega]| = 2\Omega$.) This sampling density is usually called the Nyquist density. The expansion (2.1.2) goes by the name of Shannon's theorem.
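Shannon's expansion (2.1.2) is easy to test on a concrete bandlimited function. The sketch below is a hedged illustration, not part of the original argument: it assumes the test function $f(x) = [\sin(\Omega x/2)/(\Omega x/2)]^2$ (whose Fourier transform is supported in $[-\Omega, \Omega]$), truncates the series at $|n| \le n_{\max}$, and uses hypothetical helper names.

```python
import math


def sinc(u: float) -> float:
    """sin(u)/u with the removable singularity at u = 0 filled in."""
    return 1.0 if u == 0.0 else math.sin(u) / u


def bandlimited_example(x: float, omega: float) -> float:
    """f(x) = sinc(omega*x/2)^2; its Fourier transform is supported in [-omega, omega]."""
    return sinc(0.5 * omega * x) ** 2


def shannon_reconstruct(f, omega: float, x: float, n_max: int) -> float:
    """Truncated version of (2.1.2): sum over |n| <= n_max of
    f(n*pi/omega) * sin(omega*x - n*pi) / (omega*x - n*pi)."""
    total = 0.0
    for n in range(-n_max, n_max + 1):
        total += f(n * math.pi / omega) * sinc(omega * x - n * math.pi)
    return total


omega = 2.0
x = 0.3  # a point that is not one of the sampling points n*pi/omega
exact = bandlimited_example(x, omega)
approx = shannon_reconstruct(lambda y: bandlimited_example(y, omega), omega, x, n_max=2000)
print(exact, approx)
```

Because the building blocks decay only like $1/|x|$, many terms are needed for high accuracy even for this well-behaved $f$; the truncation level used here is an arbitrary choice.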

FIG. 2.1. Graph of $\hat g_\lambda$.

The "elementary building blocks" $\frac{\sin \Omega x}{\Omega x}$ in (2.1.2) decay very slowly (they are not even absolutely integrable). "Oversampling" makes it possible to write $f$ as a superposition of functions with faster decay. Suppose that $f$ is still bandlimited in $[-\Omega, \Omega]$ (i.e., support $\hat f \subset [-\Omega, \Omega]$), but that $f$ is sampled at a rate $(1 + \lambda)$ faster than the Nyquist rate, with $\lambda > 0$. Then $f$ can be recovered from the $f(n\pi/[\Omega(1 + \lambda)])$ in the following way. Define $g_\lambda$ by

$$\hat g_\lambda(\xi) = \begin{cases} 1, & |\xi| \le \Omega , \\ 1 - \dfrac{|\xi| - \Omega}{\lambda\Omega}, & \Omega \le |\xi| \le (1 + \lambda)\Omega , \\ 0, & |\xi| \ge (1 + \lambda)\Omega \end{cases}$$

(see Figure 2.1). Because $\hat g_\lambda \equiv 1$ on support $\hat f$, we have $\hat f(\xi) = \hat f(\xi)\hat g_\lambda(\xi)$. We can now repeat the same construction as before:

$$\hat f(\xi) = \sum_n c_n\, e^{-in\pi\xi/[\Omega(1+\lambda)]} , \quad \text{with} \quad c_n = \frac{\sqrt{2\pi}}{2\Omega(1 + \lambda)}\, f\!\left(\frac{n\pi}{\Omega(1 + \lambda)}\right) ;$$

hence

$$f(x) = \sum_n f\!\left(\frac{n\pi}{\Omega(1 + \lambda)}\right) G_\lambda\!\left(x - \frac{n\pi}{\Omega(1 + \lambda)}\right) ,$$

where

$$G_\lambda(x) = \frac{\sqrt{2\pi}}{2\Omega(1 + \lambda)}\, g_\lambda(x) = \frac{2 \sin[x\Omega(1 + \lambda/2)]\, \sin(x\Omega\lambda/2)}{\lambda\Omega^2(1 + \lambda)x^2} .$$

These $G_\lambda$ have faster decay than $\frac{\sin \Omega x}{\Omega x}$; note that if $\lambda \to 0$, then $G_\lambda \to \frac{\sin \Omega x}{\Omega x}$, as expected. One can obtain even faster decay by choosing $\hat g_\lambda$ smoother, but it does not pay to put too much effort into making $\hat g_\lambda$ very smooth: true, $G_\lambda$ will have very fast decay for asymptotically large $x$, but the size of $\lambda$ imposes some restrictions on the numerical decay of $G_\lambda$. In other words, a $C^\infty$ choice of $\hat g_\lambda$ leads to $G_\lambda$ decaying faster than any inverse polynomial,

$$|G_\lambda(x)| \le C_N(\lambda)(1 + |x|)^{-(N+1)} ,$$

but the constant $C_N(\lambda)$ can be very large: it is related to the range of values of the $N$th derivative of $\hat g_\lambda$ on $[\Omega, \Omega(1 + \lambda)]$, so that it is roughly proportional to $\lambda^{-N}$.

What happens if $f$ is "undersampled," i.e., if support $\hat f = [-\Omega, \Omega]$, but only the $f(n\pi/[\Omega(1 - \lambda)])$ are known, where $1 > \lambda > 0$? We have

$$f\!\left(\frac{n\pi}{\Omega(1 - \lambda)}\right) = \frac{1}{\sqrt{2\pi}} \int_{-\Omega}^{\Omega} d\xi\, \hat f(\xi)\, e^{in\pi\xi/[\Omega(1-\lambda)]} = \frac{1}{\sqrt{2\pi}} \int_{-\Omega(1-\lambda)}^{\Omega(1-\lambda)} d\xi\, e^{in\pi\xi/[\Omega(1-\lambda)]} \left[ \hat f(\xi) + \hat f(\xi + 2\Omega(1 - \lambda)) + \hat f(\xi - 2\Omega(1 - \lambda)) \right] ,$$

where we have used that the $e^{in\pi\xi/\alpha}$ have period $2\alpha$, and where we have assumed $\lambda \le \frac{2}{3}$ (otherwise more terms would intervene in the sum in the last integrand). This means that the undersampled $f\!\left(\frac{n\pi}{\Omega(1-\lambda)}\right)$ behave exactly as if they were the Nyquist-spaced samples of a function of narrower bandwidth, the Fourier transform of which is obtained by "folding over" $\hat f$ (see Figure 2.2). In the "folded" version of $\hat f$, some of the high frequency content of $f$ is found back in lower frequency regions; only the $|\xi| \le \Omega(1 - 2\lambda)$ are unaffected. This phenomenon is called aliasing; for undersampled acoustic signals, for instance, it is very clearly audible as a metallic clipping of the sound.
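The oversampled reconstruction with the faster-decaying blocks $G_\lambda$ can be checked in the same spirit. This is a sketch under stated assumptions: the test function $[\sin(\Omega x/2)/(\Omega x/2)]^2$, the values $\Omega = 1$ and $\lambda = 0.5$, and the truncation level are all arbitrary illustrative choices, and the helper names are hypothetical.

```python
import math


def sinc(u: float) -> float:
    """sin(u)/u with the removable singularity at u = 0 filled in."""
    return 1.0 if u == 0.0 else math.sin(u) / u


def f_example(x: float, omega: float) -> float:
    """Bandlimited test function with Fourier transform supported in [-omega, omega]."""
    return sinc(0.5 * omega * x) ** 2


def G(x: float, omega: float, lam: float) -> float:
    """G_lambda(x) = 2 sin[x*omega*(1 + lam/2)] sin(x*omega*lam/2)
                     / (lam * omega^2 * (1 + lam) * x^2)."""
    if x == 0.0:
        return (1.0 + 0.5 * lam) / (1.0 + lam)  # limit of the expression as x -> 0
    numerator = 2.0 * math.sin(x * omega * (1.0 + 0.5 * lam)) * math.sin(0.5 * x * omega * lam)
    return numerator / (lam * omega ** 2 * (1.0 + lam) * x * x)


def oversampled_reconstruct(f, omega: float, lam: float, x: float, n_max: int) -> float:
    """Truncated sum of f(x_n) G_lambda(x - x_n), with x_n = n*pi / (omega*(1 + lam))."""
    step = math.pi / (omega * (1.0 + lam))
    return sum(f(n * step) * G(x - n * step, omega, lam) for n in range(-n_max, n_max + 1))


omega, lam, x = 1.0, 0.5, 0.37
exact = f_example(x, omega)
approx = oversampled_reconstruct(lambda y: f_example(y, omega), omega, lam, x, n_max=400)
print(exact, approx)
```

Note that $G_\lambda(0) = (1 + \lambda/2)/(1 + \lambda) \neq 1$: with oversampling, the expansion is no longer an interpolation formula in which each sample contributes only its own value at its own node.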

2.2. Bandlimited functions as a special case of a reproducing kernel Hilbert space.

For any $\alpha, \beta$, $-\infty \le \alpha < \beta \le \infty$, the set of functions $\{f \in L^2(\mathbb{R});\ \text{support}\, f \subset [\alpha, \beta]\}$ constitutes a closed subspace of $L^2(\mathbb{R})$, i.e., it is a subspace, and all Cauchy sequences composed of elements of the subspace converge to an element of the subspace. By the unitarity of the Fourier transform on $L^2(\mathbb{R})$, it follows that the set of all bandlimited functions

$$B_\Omega = \{f \in L^2(\mathbb{R});\ \text{support}\, \hat f \subset [-\Omega, \Omega]\}$$

is a closed subspace of $L^2(\mathbb{R})$. By the Paley-Wiener theorem (see Preliminaries), any function $f$ in $B_\Omega$ has an analytic extension to an entire function on $\mathbb{C}$, which we also denote by $f$, and which is of exponential type. More precisely,

$$|f(z)| \le \frac{1}{\sqrt{2\pi}}\, \|\hat f\|_{L^1}\, e^{|\mathrm{Im}\, z|\,\Omega} .$$

In fact, $B_\Omega$ consists of exactly those $L^2$-functions for which there exists an analytic extension to an entire function satisfying a bound of this type. We can therefore consider $B_\Omega$ to be a Hilbert space of entire functions. For $f$ in $B_\Omega$ we have

$$f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\Omega}^{\Omega} d\xi\, e^{ix\xi} \hat f(\xi) = \frac{1}{2\pi} \int_{-\Omega}^{\Omega} d\xi\, e^{ix\xi} \int dy\, f(y)\, e^{-i\xi y} = \int dy\, f(y)\, \frac{\sin \Omega(x - y)}{\pi(x - y)} . \tag{2.2.1}$$

(The interchange of integrals in the last step is permissible if $f \in L^1$, i.e., if $\hat f$ is sufficiently smooth. Since, for all $x$, $[\pi(x - \cdot)]^{-1} \sin \Omega(x - \cdot)$ is in $L^2(\mathbb{R})$, the conclusion then extends to all $f$ in $B_\Omega$ by the standard trick explained in Preliminaries.) Introducing the notation $e_x(y) = \frac{\sin \Omega(x - y)}{\pi(x - y)}$, we can rewrite (2.2.1) as

$$f(x) = \langle f, e_x \rangle . \tag{2.2.2}$$

Note that $e_x \in B_\Omega$, since $\hat e_x(\xi) = (2\pi)^{-1/2} e^{-ix\xi}$ for $|\xi| \le \Omega$, $\hat e_x(\xi) = 0$ for $|\xi| > \Omega$. Formula (2.2.2) is typical for a reproducing kernel Hilbert space (r.k.H.s.). In an r.k.H.s. $\mathcal{H}$ of functions, the map associating to a function $f$ its value $f(x)$ at a point $x$ is a continuous map (this does not hold in most Hilbert spaces of functions, in particular not in $L^2(\mathbb{R})$ itself), so that there necessarily exists $e_x \in \mathcal{H}$ such that $f(x) = \langle f, e_x \rangle$ for all $f \in \mathcal{H}$ (by Riesz' representation lemma; see Preliminaries). One also writes

$$f(x) = \int dy\, K(x, y)\, f(y) ,$$

where $K(x, y) = \overline{e_x(y)}$ is the reproducing kernel. In the particular case of $B_\Omega$, there even exist special $x_n = \frac{n\pi}{\Omega}$ so that the $e_{x_n}$ constitute an orthogonal basis for $B_\Omega$, leading to Shannon's formula (2.1.2). Such special $x_n$ need not exist in a general r.k.H.s. We will meet several examples of other r.k.H.s.'s in what follows.

FIG. 2.2. The three terms $\hat f(\xi)$, $\hat f(\xi + 2\Omega(1 - \lambda))$, and $\hat f(\xi - 2\Omega(1 - \lambda))$ for $|\xi| \le \Omega(1 - \lambda)$, and their sum (thick line).
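The reproducing property (2.2.1) can be observed numerically. The following sketch is only an illustration: it assumes $\Omega = 2$, the bandlimited test function $(\sin y / y)^2$, a Gaussian as a contrasting function that is not bandlimited, and a midpoint rule on a large but finite interval. All names are hypothetical.

```python
import math


def sinc(u: float) -> float:
    """sin(u)/u with the removable singularity at u = 0 filled in."""
    return 1.0 if u == 0.0 else math.sin(u) / u


def kernel(x: float, y: float, omega: float) -> float:
    """e_x(y) = sin(omega*(x - y)) / (pi*(x - y))."""
    return (omega / math.pi) * sinc(omega * (x - y))


def apply_kernel(f, x: float, omega: float,
                 half_width: float = 400.0, steps: int = 16000) -> float:
    """Midpoint-rule approximation of int dy f(y) e_x(y) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        y = -half_width + (k + 0.5) * h
        total += f(y) * kernel(x, y, omega)
    return total * h


omega = 2.0


def in_band(y: float) -> float:
    """sinc(y)^2: Fourier transform supported in [-2, 2], so it lies in B_omega for omega = 2."""
    return sinc(y) ** 2


def gaussian(y: float) -> float:
    """exp(-y^2/2): not bandlimited, so it is not in B_omega."""
    return math.exp(-0.5 * y * y)


reproduced = apply_kernel(in_band, 0.3, omega)      # close to in_band(0.3)
not_reproduced = apply_kernel(gaussian, 0.0, omega)  # differs from gaussian(0.0) = 1
print(reproduced, in_band(0.3), not_reproduced)
```

For the Gaussian, the integral returns the value at $0$ of its band-limited part, which by Parseval equals $\mathrm{erf}(\Omega/\sqrt{2}) \approx 0.954$ here rather than $1$; this is consistent with $e_x$ reproducing point values only on $B_\Omega$.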

T, and $(P_\Omega f)^\wedge(\xi) = \hat f(\xi)$ for $|\xi| \le \Omega$, $(P_\Omega f)^\wedge(\xi) = 0$ for $|\xi| > \Omega$. Then a signal which is timelimited to $[-T, T]$ satisfies $f = Q_T f$, and transmitting it over a channel with bandwidth $\Omega$ gives as end product $P_\Omega f = P_\Omega Q_T f$ (provided there is no other distortion). The operator $P_\Omega Q_T$ represents the total time + band limiting process. How well the transmitted $P_\Omega Q_T f$ approaches the original $f$ is measured by $\|P_\Omega Q_T f\|^2 / \|f\|^2 = \langle Q_T P_\Omega Q_T f, f \rangle / \|f\|^2$. The maximum value of this ratio is the largest eigenvalue of the symmetric operator $Q_T P_\Omega Q_T$, given explicitly by

$$(Q_T P_\Omega Q_T f)(x) = \begin{cases} \displaystyle\int_{-T}^{T} dy\, \frac{\sin \Omega(x - y)}{\pi(x - y)}\, f(y) & \text{if } |x| \le T , \\ 0 & \text{if } |x| > T . \end{cases} \tag{2.3.1}$$

The eigenvalues and eigenfunctions of this operator are known explicitly because of a fortunate accident: $Q_T P_\Omega Q_T$ commutes with the second order differential operator $A$,

$$(Af)(x) = \frac{d}{dx}\left[(T^2 - x^2)\, \frac{df}{dx}\right] - \frac{\Omega^2}{\pi^2}\, x^2 f(x) .$$

The eigenfunctions of this operator, which had been studied for different reasons long before their connection with band- and timelimiting was discovered, are called the prolate spheroidal wave functions, and many of their properties are known. Because $A$ commutes with $Q_T P_\Omega Q_T$ (and because the eigenvalues of $A$ are all simple), the prolate spheroidal wave functions are also the eigenfunctions of $Q_T P_\Omega Q_T$ (with different eigenvalues, of course). More specifically, if we denote the prolate spheroidal wave functions by $\psi_n$, $n \in \mathbb{N}$, ordered so that the corresponding eigenvalues $\alpha_n$ of $A$ increase as $n$ increases, then
0, so that, by straightforward application of (2.4.9),

where $(T^{\mathrm{wav}}_+ f)(a, b) = \langle f_+, \psi_+^{a,b} \rangle = \langle f, \psi_+^{a,b} \rangle$, $(T^{\mathrm{wav}}_- f)$ is defined analogously, and $C_\psi$ is as in (2.4.1).

Another important variation consists of introducing a different function for the reconstruction than for the decomposition. More explicitly, if $\psi_1, \psi_2$ satisfy that

(2.4.11)

then the same argument as in the proof of Proposition 2.4.1 shows that

(2.4.12)

with $C_{\psi_1,\psi_2} = 2\pi \int d\xi\, |\xi|^{-1}\, \overline{\hat\psi_1(\xi)}\, \hat\psi_2(\xi)$. If $C_{\psi_1,\psi_2} \neq 0$, then we can rewrite (2.4.12) as

$$f = C^{-1}_{\psi_1,\psi_2} \int \frac{da}{a^2} \int db\, \langle f, \psi_1^{a,b} \rangle\, \psi_2^{a,b} . \tag{2.4.13}$$

Note that $\psi_1$ and $\psi_2$ may have very different properties! One may be irregular, the other smooth; both need not even be admissible: if $\hat\psi_1(\xi) = O(\xi)$ for $\xi \to 0$, then $\hat\psi_2(0) \neq 0$ is allowed. We will not use this extra freedom here. In Holschneider and Tchamitchian (1990), the freedom in the choices of $\psi_1, \psi_2$ is exploited to prove some very interesting results (see also §2.9). One can, for instance, choose $\psi_2$ to be compactly supported, support $\psi_2 \subset [-R, R]$, so that, for any $x$, only the $\langle f, \psi_1^{a,b} \rangle$ with $|b - x| \le |a|R$ will contribute to $f(x)$ in the reconstruction formula (2.4.13); the set $\{(a, b);\ |b - x| \le |a|R\}$ is then called the "cone of influence" of $\psi_2$ on $x$. Holschneider and Tchamitchian (1990) also prove that, with mild conditions on $f$, (2.4.13) is true pointwise as well as in the $L^2$-sense.

PROPOSITION 2.4.2. Suppose that $\psi_1, \psi_2 \in L^1(\mathbb{R})$, that $\psi_2$ is differentiable with $\psi_2' \in L^2(\mathbb{R})$, that $x\psi_2 \in L^1(\mathbb{R})$, and that $\hat\psi_1(0) = 0 = \hat\psi_2(0)$. If $f \in L^2(\mathbb{R})$ is bounded, then (2.4.13) holds pointwise in every point $x$ where $f$ is continuous, i.e.,

$$f(x) = C^{-1}_{\psi_1,\psi_2} \lim_{\substack{A_1 \to 0 \\ A_2 \to \infty}} \int_{A_1 \le |a| \le A_2} \frac{da}{a^2} \int_{-\infty}^{\infty} db\, \langle f, \psi_1^{a,b} \rangle\, \psi_2^{a,b}(x) . \tag{2.4.14}$$

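Proof.

1. We can rewrite the right-hand member of (2.4.14) (before taking the limit) as

$$C^{-1}_{\psi_1,\psi_2} \int_{A_1 \le |a| \le A_2} \frac{da}{a^2} \int_{-\infty}^{\infty} db \int dy\, f(y)\, |a|^{-1}\, \overline{\psi_1\!\left(\frac{y - b}{a}\right)}\, \psi_2\!\left(\frac{x - b}{a}\right) = \int dy\, M_{A_1,A_2}(x - y)\, f(y) , \tag{2.4.15}$$

where all the changes of order of integration are permitted by Fubini's theorem (the integral converges absolutely). Here $M_{A_1,A_2}$ is defined by

$$M_{A_1,A_2}(x) = C^{-1}_{\psi_1,\psi_2} \int_{A_1 \le |a| \le A_2} \frac{da}{|a|^3} \int_{-\infty}^{\infty} db\, \overline{\psi_1\!\left(-\frac{b}{a}\right)}\, \psi_2\!\left(\frac{x - b}{a}\right) .$$

2. One easily computes that the Fourier transform of $M_{A_1,A_2}$ is

$$\hat M_{A_1,A_2}(\xi) = (2\pi)^{1/2}\, C^{-1}_{\psi_1,\psi_2} \int_{A_1 \le |a| \le A_2} \frac{da}{|a|}\, \hat\psi_2(a\xi)\, \overline{\hat\psi_1(a\xi)} \tag{2.4.16}$$

$$= \hat M(A_1\xi) - \hat M(A_2\xi) , \tag{2.4.17}$$

where $\hat M(\xi) = (2\pi)^{1/2}\, C^{-1}_{\psi_1,\psi_2} \int_{|a| \ge |\xi|} \frac{da}{|a|}\, \hat\psi_2(a)\, \overline{\hat\psi_1(a)}$, as follows from a change of variables $a \to a\xi$ in (2.4.16). Since $a\hat\psi_2(a) \in L^2(\mathbb{R})$ and $\hat\psi_1(a)$ is bounded, we have [...]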

CHAPTER

2

lal � lhl � I: db ( I l + I I��bll ) I tP2 (- h� ) I b �) I , lhl � lal 9 �� I: db ( I l + I I��bll ) I tP2 ( : ) - tP2 (-(2.9.4)

+

aa

+

aa

where we have assumed $\alpha > \gamma$. (If $\alpha \le \gamma$, things become simpler.) Let us denote the four terms in the right-hand side of (2.9.4) by $T_1$, $T_2$, $T_3$, and $T_4$.

4. In the second term, we use support $\psi_2 \subset [-R, R]$ to derive

Ihl , ' a ' �� r da l a r 1 + a I tP2 1 L ' + r -1 l I tP da l a 1 1 £ 2 i 2 og I a , al �h J1al :-:; l hl ClJl h l a .

5. Similarly, for sufficiently small T3

<
$\infty$. A similar theorem can be proved in the wavelet case (Daubechies, Klauder, and Paul (1987)).

7. Exactly the same arguments hold for all operators $W$ of type (2.8.2) for which $w(\omega, t)$ is rotationally symmetric, even if it is not an indicator function. An example is $w(\omega, t) = \exp[-\alpha(\omega^2 + t^2)]$, for which it was first shown in Gori and Guattari (1985) that the Hermite functions are the eigenfunctions (irrespective of $\alpha$; the eigenvalues depend on $\alpha$, of course!).

8. It is no coincidence that Fefferman and de la Llave would use a representation of type (2.8.3) for the operator (2.8.4): after all, Calderón's formula (to which (2.4.4) is essentially equivalent) is part of a toolbox developed precisely for the study of singular integral operators (long before wavelets!) so that it is well adapted for treating the singular kernel in (2.8.4). In this particular instance, (2.8.4) makes sense even for nonadmissible $\psi$ ($C_\psi$ cancels out); in Fefferman and de la Llave (1986), $\psi$ was taken to be the indicator function of the unit ball (which is nonadmissible, since its integral does not vanish).

9. If we make an extra transformation, mapping the upper half plane $\{b + ia;\ a \ge 0\}$ to the unit disk (by means of a conformal mapping), then everything becomes more transparent: the flow $z \to z(t)$ then corresponds to a simple rotation around the center of the disk, and $H$ as well as its eigenfunctions are given by simple expressions. See Paul (1985) or Seip (1991).

10. There exist many other choices of $\psi$ for which this analysis works. For each choice the set $S_C$ in time-frequency space corresponding to $B_C$ in $(a, b)$-space takes on a different shape. Explicit computations and a figure illustrating these different shapes can be found in Daubechies and Paul (1988).

CHAPTER 3

Discrete Wavelet Transforms: Frames

In this, the longest chapter in this book, we discuss various aspects of nonorthonormal, discrete wavelet expansions, together with some parallels with the windowed Fourier transform. The "frames" of the chapter title are sets of nonindependent vectors; they can nevertheless be used to write a straightforward and completely explicit expansion for every vector in the space. We will discuss wavelet frames as well as frames for the windowed Fourier transform; in the latter case, the approach can be viewed as "oversampled" with respect to the Nyquist density in time-frequency space. A lot of the material in this chapter has been taken from Daubechies (1990), updated here and there. A very nicely written review of frames (and of the continuous transforms as well), with some additional original theorems, is Heil and Walnut (1989).

3.1. Discretizing the wavelet transform.

In the continuous wavelet transform, we consider the family

$$\psi^{a,b}(x) = |a|^{-1/2}\, \psi\!\left(\frac{x - b}{a}\right) ,$$

where $b \in \mathbb{R}$, $a \in \mathbb{R}_+$ with $a \neq 0$, and $\psi$ is admissible. For convenience, in the discretization we restrict $a$ to positive values only, so that the admissibility condition becomes

$$\int_0^\infty d\xi\, \xi^{-1}\, |\hat\psi(\xi)|^2 = \int_{-\infty}^0 d\xi\, |\xi|^{-1}\, |\hat\psi(\xi)|^2 < \infty .$$

(See §2.4.) We would like to restrict $a, b$ to discrete values only. The discretization of the dilation parameter seems natural: we choose $a = a_0^m$, where $m \in \mathbb{Z}$, and the dilation step $a_0 \neq 1$ is fixed. For convenience we will assume $a_0 > 1$ (although it does not matter, since we take negative as well as positive powers $m$). For $m = 0$, it seems natural as well to discretize $b$ by taking only the integer (positive and negative) multiples of one fixed $b_0$ (we arbitrarily fix $b_0 > 0$), where $b_0$ is appropriately chosen so that the $\psi(x - nb_0)$ "cover" the whole line (in a sense to be made precise below). For different values of $m$, the width of $a_0^{-m/2}\psi(a_0^{-m}x)$ is $a_0^m$ times the width of $\psi(x)$ (as measured, e.g., by width$(f) = [\int dx\, x^2 |f(x)|^2]^{1/2}$, where we assume that $\int dx\, x|f(x)|^2 = 0$), so that the choice $b = nb_0a_0^m$ will ensure that the discretized wavelets at level $m$ "cover" the line in the same way that the $\psi(x - nb_0)$ do. Thus we choose $a = a_0^m$, $b = nb_0a_0^m$, where $m, n$ range over $\mathbb{Z}$, and $a_0 > 1$, $b_0 > 0$ are fixed; the appropriate choices for $a_0, b_0$ depend, of course, on the wavelet $\psi$ (see below). This corresponds to

$$\psi_{m,n}(x) = a_0^{-m/2}\, \psi(a_0^{-m}x - nb_0) . \tag{3.1.1}$$
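The discretization (3.1.1) is simple to express in code. The sketch below is illustrative only: it assumes $\psi(t) = (1 - t^2)\exp(-t^2/2)$ as the wavelet, $a_0 = 2$ and $b_0 = 1$ as default parameters, and a midpoint rule for the norm; the helper names are hypothetical. It checks that $\psi_{m,n}$ is centred at $nb_0a_0^m$ and that the factor $a_0^{-m/2}$ keeps the $L^2$-norm independent of $m$ and $n$.

```python
import math


def mexican_hat(t: float) -> float:
    """psi(t) = (1 - t^2) exp(-t^2 / 2), used here only as an example wavelet."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def discretized_wavelet(psi, m: int, n: int, a0: float = 2.0, b0: float = 1.0):
    """Return the function psi_{m,n}(x) = a0^(-m/2) psi(a0^(-m) x - n b0), as in (3.1.1)."""
    scale = a0 ** m
    return lambda x: scale ** -0.5 * psi(x / scale - n * b0)


def norm_squared_around(func, center: float, scale: float, steps: int = 40000) -> float:
    """Midpoint-rule approximation of int |func|^2 over [center - 25*scale, center + 25*scale]."""
    lo, hi = center - 25.0 * scale, center + 25.0 * scale
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) ** 2 for k in range(steps))


a0, b0 = 2.0, 1.0
psi_00 = discretized_wavelet(mexican_hat, 0, 0, a0, b0)
psi_3m2 = discretized_wavelet(mexican_hat, 3, -2, a0, b0)   # coarse scale, centred at -2 * 8 = -16
psi_m2_5 = discretized_wavelet(mexican_hat, -2, 5, a0, b0)  # fine scale, centred at 5 * 0.25 = 1.25

norm_00 = norm_squared_around(psi_00, 0.0, 1.0)
norm_3m2 = norm_squared_around(psi_3m2, -2 * b0 * a0 ** 3, a0 ** 3)
norm_m2_5 = norm_squared_around(psi_m2_5, 5 * b0 * a0 ** -2, a0 ** -2)
print(norm_00, norm_3m2, norm_m2_5)
```

The translation step $b_0a_0^m$ grows with the scale, so coarse wavelets are spaced far apart and fine wavelets are packed densely, which is the "covering" requirement described above.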

We can now ask two questions:

(1) Do the discrete wavelet coefficients $\langle f, \psi_{m,n} \rangle$ completely characterize $f$? Or, stronger, can we reconstruct $f$ in a numerically stable way from the $\langle f, \psi_{m,n} \rangle$?

(2) Can any function $f$ be written as a superposition of "elementary building blocks" $\psi_{m,n}$?$^1$ Can we write an easy algorithm to find the coefficients in such a superposition?

In fact, these questions are dual aspects of only one problem. We will see below that, for reasonable $\psi$ and appropriate $a_0, b_0$, there exist $\tilde\psi_{m,n}$ so that the answer to the reconstruction question is simply

$$f = \sum_{m,n} \langle f, \psi_{m,n} \rangle\, \tilde\psi_{m,n} .$$

It then follows that, for any $g \in L^2(\mathbb{R})$,

$$\langle g, f \rangle = \overline{\langle f, g \rangle} = \overline{\left( \sum_{m,n} \langle f, \psi_{m,n} \rangle \langle \tilde\psi_{m,n}, g \rangle \right)} = \sum_{m,n} \langle g, \tilde\psi_{m,n} \rangle \langle \psi_{m,n}, f \rangle ,$$

or $g = \sum_{m,n} \langle g, \tilde\psi_{m,n} \rangle\, \psi_{m,n}$, at least in the weak sense; this is effectively a prescription for the computation of the coefficients in a superposition of $\psi_{m,n}$ leading to $g$. We will mostly focus on the first set of questions here; for a more detailed discussion of the duality between (1) and (2), see Gröchenig (1991).

In the case of the continuous wavelet transform, both questions were answered immediately by the resolution of the identity, at least if $\psi$ was admissible. In the present discrete case there is no analog of the resolution of the identity,$^2$ so we have to attack the problem some other way. We can also wonder whether there exists a "discrete admissibility condition," and what it is. Let us first give some mathematical content to the questions in (1). We will restrict ourselves mostly to functions $f \in L^2(\mathbb{R})$, although discrete families of wavelets, like their continuously labelled cousins, can be used in many other function spaces as well.

Functions can then be "characterized" by means of their "wavelet coefficients" $\langle f, \psi_{m,n} \rangle$ if it is true that

$$\langle f_1, \psi_{m,n} \rangle = \langle f_2, \psi_{m,n} \rangle \ \text{for all } m, n \in \mathbb{Z} \quad \text{implies} \quad f_1 \equiv f_2 ,$$

or, equivalently, if

$$\langle f, \psi_{m,n} \rangle = 0 \ \text{for all } m, n \in \mathbb{Z} \ \Rightarrow\ f = 0 .$$

But we want more than characterizability: we want to be able to reconstruct $f$ in a numerically stable way from the $\langle f, \psi_{m,n} \rangle$. In order for such an algorithm to exist, we must be sure that if the sequence $(\langle f_1, \psi_{m,n} \rangle)_{m,n \in \mathbb{Z}}$ is "close" to $(\langle f_2, \psi_{m,n} \rangle)_{m,n \in \mathbb{Z}}$, then necessarily $f_1$ and $f_2$ were "close" as well. In order to make this precise, we need topologies on the function space and on the sequence space. On the function space $L^2(\mathbb{R})$ we already have its Hilbert space topology; on the sequence space we will choose a similar $\ell^2$-topology, in which the distance between sequences $c^1 = (c^1_{m,n})_{m,n \in \mathbb{Z}}$ and $c^2 = (c^2_{m,n})_{m,n \in \mathbb{Z}}$ is measured by

$$\|c^1 - c^2\|^2 = \sum_{m,n \in \mathbb{Z}} |c^1_{m,n} - c^2_{m,n}|^2 .$$

This implicitly assumes that the sequences $(\langle f, \psi_{m,n} \rangle)_{m,n \in \mathbb{Z}}$ are in $\ell^2(\mathbb{Z}^2)$ themselves, i.e., that $\sum_{m,n} |\langle f, \psi_{m,n} \rangle|^2 < \infty$ for all $f \in L^2(\mathbb{R})$. In practice, this is no problem. As we will see below, any reasonable wavelet (which means that $\psi$ has some decay in both time and frequency, and that $\int dx\, \psi(x) = 0$), and any choice for $a_0 > 1$, $b_0 > 0$ leads to

$$\sum_{m,n} |\langle f, \psi_{m,n} \rangle|^2 \le B\, \|f\|^2 . \tag{3.1.2}$$

We will assume (without specifying any restrictions yet on the $\psi_{m,n}$; we will come back to these later) that (3.1.2) holds. With the $\ell^2(\mathbb{Z}^2)$ interpretation of "closeness," the stability requirement means that if $\sum_{m,n} |\langle f, \psi_{m,n} \rangle|^2$ is small, then $\|f\|^2$ should be small. In particular, there should exist $\alpha < \infty$ so that $\sum_{m,n} |\langle f, \psi_{m,n} \rangle|^2 \le 1$ implies $\|f\|^2 \le \alpha$. Now take arbitrary $f \in L^2(\mathbb{R})$, and define $\tilde f = [\sum_{m,n} |\langle f, \psi_{m,n} \rangle|^2]^{-1/2} f$. Clearly, $\sum_{m,n} |\langle \tilde f, \psi_{m,n} \rangle|^2 \le 1$; hence $\|\tilde f\|^2 \le \alpha$. But this means

$$\left[ \sum_{m,n} |\langle f, \psi_{m,n} \rangle|^2 \right]^{-1} \|f\|^2 \le \alpha ,$$

or

$$A\, \|f\|^2 \le \sum_{m,n} |\langle f, \psi_{m,n} \rangle|^2 \tag{3.1.3}$$

for some $A = \alpha^{-1} > 0$. On the other hand, if (3.1.3) holds for all $f$, then the distance $\|f_1 - f_2\|$ cannot be arbitrarily large if $\sum_{m,n} |\langle f_1, \psi_{m,n} \rangle - \langle f_2, \psi_{m,n} \rangle|^2$

is small. It follows that (3.1.3) is equivalent to our stability requirement. Combining (3.1.3) with (3.1.2), we obtain that there should exist $A > 0$, $B < \infty$ so that

$$A\, \|f\|^2 \le \sum_{m,n} |\langle f, \psi_{m,n} \rangle|^2 \le B\, \|f\|^2 \tag{3.1.4}$$

for all $f \in L^2(\mathbb{R})$. In other words, the $\{\psi_{m,n};\ m, n \in \mathbb{Z}\}$ constitute a frame, a concept that we review in the next section. The connection between frames and numerically stable reconstructions from discretized wavelets was first pointed out by A. Grossmann (1985, personal communication).

3.2. Generalities about frames.

Frames were introduced by Duffin and Schaeffer (1952), in the context of nonharmonic Fourier series (i.e., expansions of functions in $L^2([0, 1])$ in complex exponentials $\exp(i\lambda_n x)$, where $\lambda_n \neq 2\pi n$); they are also reviewed in Young (1980). We review here their definition and some of their properties.

DEFINITION. A family of functions $(\varphi_j)_{j \in J}$ in a Hilbert space $\mathcal{H}$ is called a frame if there exist $A > 0$, $B < \infty$ so that, for all $f$ in $\mathcal{H}$,

$$A\, \|f\|^2 \le \sum_{j \in J} |\langle f, \varphi_j \rangle|^2 \le B\, \|f\|^2 .$$

[...] Then $f_n = Sg_n$, and

$$\|g_n - g_m\|^2 \le \alpha^{-1} \langle S(g_n - g_m), g_n - g_m \rangle \le \alpha^{-1}\, \|S(g_n - g_m)\|\, \|g_n - g_m\| ,$$

where we have used $\alpha \langle h, h \rangle \le \langle Sh, h \rangle$ in the first inequality. But this implies $\|g_n - g_m\| \le \alpha^{-1} \|f_n - f_m\|$, so that the $g_n$ necessarily constitute a Cauchy sequence in $\mathcal{H}$. This Cauchy sequence necessarily has a limit $g$ in $\mathcal{H}$. Because $S$ is continuous, we now trivially have $Sg = \lim_{n \to \infty} Sg_n = \lim_{n \to \infty} f_n$, so that $\lim_{n \to \infty} f_n \in \text{Ran}(S)$.

2. The orthogonal complement of Ran$(S)$ is $\{0\}$. Indeed, if $\langle f, Sg \rangle = 0$ for all $g \in \mathcal{H}$, then, in particular, $\langle f, Sf \rangle = 0$, which by $\alpha \|f\|^2 \le \langle Sf, f \rangle$ implies $\|f\| = 0$; hence $f = 0$. Together with point 1 this implies Ran$(S) = \mathcal{H}$. It follows that $S$ is invertible: any $f \in \mathcal{H}$ can be written as $f = Sg$; we define $S^{-1}f = g$. Moreover,

$$\alpha\, \|S^{-1}f\|^2 \le \langle SS^{-1}f, S^{-1}f \rangle = \langle f, S^{-1}f \rangle \le \|f\|\, \|S^{-1}f\| ;$$

hence $\|S^{-1}f\| \le \alpha^{-1} \|f\|$, as announced. $\blacksquare$

Therefore, we have $\|(F^*F)^{-1}\| \le A^{-1}$. The reader can easily check that we have, in fact,

$$B^{-1}\, \mathrm{Id} \le (F^*F)^{-1} \le A^{-1}\, \mathrm{Id} . \tag{3.2.5}$$


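Applying the operator $(F^*F)^{-1}$ to the vectors $\varphi_j$ leads to an interesting new family of vectors, which we denote by $\tilde\varphi_j$,

$$\tilde\varphi_j = (F^*F)^{-1} \varphi_j .$$

The family $(\tilde\varphi_j)_{j \in J}$ turns out to be a frame as well.

PROPOSITION 3.2.3. The $(\tilde\varphi_j)_{j \in J}$ constitute a frame with frame constants $B^{-1}$ and $A^{-1}$,

$$B^{-1}\, \|f\|^2 \le \sum_{j \in J} |\langle f, \tilde\varphi_j \rangle|^2 \le A^{-1}\, \|f\|^2 . \tag{3.2.6}$$

The associated frame operator $\tilde F : \mathcal{H} \to \ell^2(J)$, $(\tilde F f)_j = \langle f, \tilde\varphi_j \rangle$ satisfies $\tilde F = F(F^*F)^{-1}$, $\tilde F^* \tilde F = (F^*F)^{-1}$, $\tilde F^* F = \mathrm{Id} = F^* \tilde F$, and $\tilde F F^* = F \tilde F^*$ is the orthogonal projection operator, in $\ell^2(J)$, onto Ran$(F)$ = Ran$(\tilde F)$.

Proof.

1. As an exercise, the reader can check that if a bounded operator $S$ has a bounded inverse $S^{-1}$, and if $S^* = S$, then $(S^{-1})^* = S^{-1}$. It follows that

$$\langle f, \tilde\varphi_j \rangle = \langle f, (F^*F)^{-1} \varphi_j \rangle = \langle (F^*F)^{-1} f, \varphi_j \rangle ;$$

hence

$$\sum_{j \in J} |\langle f, \tilde\varphi_j \rangle|^2 = \sum_{j \in J} |\langle (F^*F)^{-1} f, \varphi_j \rangle|^2 = \|F(F^*F)^{-1} f\|^2 = \langle (F^*F)^{-1} f, F^*F(F^*F)^{-1} f \rangle = \langle (F^*F)^{-1} f, f \rangle . \tag{3.2.7}$$

By (3.2.5), this implies (3.2.6); the $\tilde\varphi_j$ constitute a frame. Moreover, (3.2.7) implies also that the frame operator $\tilde F$ satisfies $\tilde F^* \tilde F = (F^*F)^{-1}$.

2. $(F(F^*F)^{-1} f)_j = \langle (F^*F)^{-1} f, \varphi_j \rangle = \langle f, \tilde\varphi_j \rangle = (\tilde F f)_j$, $\tilde F^* F = [F(F^*F)^{-1}]^* F = (F^*F)^{-1} F^*F = \mathrm{Id}$, $F^* \tilde F = F^*F(F^*F)^{-1} = \mathrm{Id}$.

3. Since $\tilde F = F(F^*F)^{-1}$, it follows that Ran$(\tilde F) \subset$ Ran$(F)$. We have also $F = \tilde F(F^*F)$; hence Ran$(F) \subset$ Ran$(\tilde F)$. Consequently, Ran$(F)$ = Ran$(\tilde F)$. Let $P$ be the orthogonal projection operator onto Ran$(F)$. We want to prove that $\tilde F F^* = P$, which is equivalent to $\tilde F F^*(Ff) = Ff$ (i.e., $\tilde F F^*$ leaves elements of Ran$(F)$ unchanged) and $\tilde F F^* c = 0$ for all $c$ orthogonal to Ran$(F)$. Both assertions are easily checked:

$$\tilde F F^* F f = F(F^*F)^{-1} F^*F f = Ff$$

and

$$c \perp \text{Ran}(F) \ \Rightarrow\ \langle c, Ff \rangle = 0 \ \text{for all } f \in \mathcal{H} \ \Rightarrow\ F^*c = 0 \ \Rightarrow\ \tilde F F^* c = 0 . \quad \blacksquare$$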
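In finite dimensions the construction of Proposition 3.2.3 reduces to inverting a small matrix. The sketch below is illustrative only: it assumes a hypothetical real frame of three vectors in $\mathbb{R}^2$, namely $(1,0)$, $(0,1)$, $(1,1)$, for which $F^*F = \sum_j \varphi_j \varphi_j^T$ is a $2 \times 2$ matrix; the function names are not from the text. It computes the optimal frame bounds, the vectors $\tilde\varphi_j = (F^*F)^{-1}\varphi_j$, and checks $\tilde F^* F = \mathrm{Id} = F^* \tilde F$ on one vector.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]


def dot(u: Vec, v: Vec) -> float:
    return u[0] * v[0] + u[1] * v[1]


def frame_operator(frame: List[Vec]) -> Tuple[float, float, float]:
    """Entries (s11, s12, s22) of the symmetric 2x2 matrix F*F = sum_j phi_j phi_j^T."""
    s11 = sum(p[0] * p[0] for p in frame)
    s12 = sum(p[0] * p[1] for p in frame)
    s22 = sum(p[1] * p[1] for p in frame)
    return s11, s12, s22


def frame_bounds(frame: List[Vec]) -> Tuple[float, float]:
    """Optimal constants A, B: the smallest and largest eigenvalue of F*F."""
    s11, s12, s22 = frame_operator(frame)
    mean = 0.5 * (s11 + s22)
    radius = math.sqrt((0.5 * (s11 - s22)) ** 2 + s12 ** 2)
    return mean - radius, mean + radius


def dual_frame(frame: List[Vec]) -> List[Vec]:
    """phi~_j = (F*F)^{-1} phi_j, using the explicit inverse of a 2x2 matrix."""
    s11, s12, s22 = frame_operator(frame)
    det = s11 * s22 - s12 * s12
    return [((s22 * p[0] - s12 * p[1]) / det, (-s12 * p[0] + s11 * p[1]) / det)
            for p in frame]


def superpose(coefficients: List[float], vectors: List[Vec]) -> Vec:
    """sum_j c_j v_j."""
    return (sum(c * v[0] for c, v in zip(coefficients, vectors)),
            sum(c * v[1] for c, v in zip(coefficients, vectors)))


frame = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]   # hypothetical, non-tight frame for R^2
dual = dual_frame(frame)
A, B = frame_bounds(frame)
A_dual, B_dual = frame_bounds(dual)

f = (0.7, -1.3)
from_frame_coeffs = superpose([dot(f, p) for p in frame], dual)   # sum <f, phi_j> phi~_j
from_dual_coeffs = superpose([dot(f, q) for q in dual], frame)    # sum <f, phi~_j> phi_j
print(A, B, A_dual, B_dual, from_frame_coeffs, from_dual_coeffs)
```

For this frame the bounds are $A = 1$, $B = 3$, and the dual frame has bounds $B^{-1} = 1/3$ and $A^{-1} = 1$, in agreement with (3.2.6).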

CHAPTER 3

60

We will call $(\tilde\varphi_j)_{j\in J}$ the dual frame of $(\varphi_j)_{j\in J}$. It is easy to check that the dual frame of $(\tilde\varphi_j)_{j\in J}$ is the original frame $(\varphi_j)_{j\in J}$ back again. We can rewrite some of the conclusions of Proposition 3.2.3 in a slightly less abstract form; $\tilde F^*F = \mathrm{Id} = F^*\tilde F$ means that

$$\sum_{j\in J}\langle f, \varphi_j\rangle\,\tilde\varphi_j = f = \sum_{j\in J}\langle f, \tilde\varphi_j\rangle\,\varphi_j. \qquad (3.2.8)$$

This means that we have a reconstruction formula for $f$ from the $\langle f, \varphi_j\rangle$! At the same time we have also obtained a recipe for writing $f$ as a superposition of $\varphi_j$, which demonstrates that the two sets of questions in §3.1 are indeed "dual." When given a frame $(\varphi_j)_{j\in J}$, the only thing we therefore need to do, in order to apply (3.2.8), is to compute the $\tilde\varphi_j = (F^*F)^{-1}\varphi_j$. We will come back to this soon. First we will address a question that often arises at this point: I have stressed before that frames, even tight frames, are generally not (orthonormal) bases because the $\varphi_j$ are typically not linearly independent. This means that for a given $f$, there exist many different superpositions of the $\varphi_j$ which all add up to $f$. What then singles out the formula in the second half of (3.2.8) as especially interesting? We can get an inkling of the answer with a simple example.

EXAMPLE. We revisit the simple example of Figure 3.1. We had there, for any $v \in \mathbb C^2$,

$$v = \frac23\sum_{j=1}^{3}\langle v, e_j\rangle\, e_j. \qquad (3.2.9)$$

Since $\sum_{j=1}^{3} e_j = 0$ in this example, it follows that the following formulas are also true:

$$v = \frac23\sum_{j=1}^{3}\big[\langle v, e_j\rangle + \alpha\big]\, e_j, \qquad (3.2.10)$$

where $\alpha$ is arbitrary in $\mathbb C$. (In this particular case, one can prove that (3.2.10) gives all the possible superposition formulas valid for arbitrary $v$.) Somehow, (3.2.9) seems more "economical" than (3.2.10) if $\alpha \neq 0$. This intuitive statement can be made more precise in the following way:

$$\sum_{j=1}^{3}|\langle v, e_j\rangle|^2 = \frac32\,\|v\|^2,$$

whereas

$$\sum_{j=1}^{3}|\langle v, e_j\rangle + \alpha|^2 = \frac32\,\|v\|^2 + 3|\alpha|^2 > \frac32\,\|v\|^2 \quad\text{if } \alpha \neq 0. \qquad\Box$$
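The example can be checked numerically. The following is a minimal sketch, assuming that the three vectors of Figure 3.1 are unit vectors at mutual angles of 120 degrees (the coordinates below are one such choice); the names `inner`, `synthesize`, and `energy` are illustrative helpers, not notation from the text.

```python
import math

# Three unit vectors at 120-degree angles (assumed coordinates for Figure 3.1).
E = [(0.0, 1.0),
     (-math.sqrt(3) / 2, -0.5),
     (math.sqrt(3) / 2, -0.5)]


def inner(u, w):
    """Inner product <u, w> on C^2, linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, w))


def synthesize(coeffs):
    """Return (2/3) * sum_j coeffs[j] * e_j, the right-hand side of (3.2.9)/(3.2.10)."""
    return tuple((2.0 / 3.0) * sum(c * e[i] for c, e in zip(coeffs, E))
                 for i in range(2))


def energy(coeffs):
    """Return sum_j |coeffs[j]|^2."""
    return sum(abs(c) ** 2 for c in coeffs)


v = (1.0 + 2.0j, -0.5j)      # arbitrary test vector in C^2
alpha = 0.7 - 0.3j           # arbitrary nonzero shift
frame_coeffs = [inner(v, e) for e in E]
shifted_coeffs = [c + alpha for c in frame_coeffs]
norm_v_sq = sum(abs(x) ** 2 for x in v)

print(synthesize(frame_coeffs))     # reproduces v, as in (3.2.9)
print(synthesize(shifted_coeffs))   # reproduces v as well, as in (3.2.10)
print(energy(frame_coeffs), 1.5 * norm_v_sq)
print(energy(shifted_coeffs), 1.5 * norm_v_sq + 3 * abs(alpha) ** 2)
```

Both coefficient sequences synthesize the same vector, but the shifted one carries the extra energy $3|\alpha|^2$.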

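Likewise, the $\langle f, \tilde\varphi_j\rangle$ are the most "economical" coefficients for a decomposition of $f$ into $\varphi_j$.

PROPOSITION 3.2.4. If $f = \sum_{j\in J} c_j\varphi_j$ for some $c = (c_j)_{j\in J} \in \ell^2(J)$, and if not all $c_j$ equal $\langle f, \tilde\varphi_j\rangle$, then $\sum_{j\in J}|c_j|^2 > \sum_{j\in J}|\langle f, \tilde\varphi_j\rangle|^2$.

Proof.

1. Saying that $f = \sum_{j\in J} c_j\varphi_j$ is equivalent to saying that $f = F^*c$.

2. Write $c = a + b$, where $a \in \mathrm{Ran}(F) = \mathrm{Ran}(\tilde F)$, and $b \perp \mathrm{Ran}(F)$. In particular, $a \perp b$; hence $\|c\|^2 = \|a\|^2 + \|b\|^2$.

3. Since $a \in \mathrm{Ran}(\tilde F)$, there exists $g \in \mathcal H$ so that $a = \tilde F g$, or $c = \tilde F g + b$. Hence $f = F^*c = F^*\tilde F g + F^*b$. But $b \perp \mathrm{Ran}(F)$, so that $F^*b = 0$, and $F^*\tilde F = \mathrm{Id}$. It follows that $f = g$; hence $c = \tilde F f + b$, and

$$\sum_{j\in J}|c_j|^2 = \|c\|^2 = \|\tilde F f\|^2 + \|b\|^2 = \sum_{j\in J}|\langle f, \tilde\varphi_j\rangle|^2 + \|b\|^2,$$

which is strictly larger than $\sum_{j\in J}|\langle f, \tilde\varphi_j\rangle|^2$, unless $b = 0$ and $c = \tilde F f$. $\blacksquare$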

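This proposition can also be used to see how the $\tilde\varphi_j$ play a special role in the first half of (3.2.8). We typically have nonuniqueness there as well: there may exist many other families $(u_j)_{j\in J}$ so that $f = \sum_{j\in J}\langle f, \varphi_j\rangle u_j$. In our earlier two-dimensional example, such other families are given by $u_j = \frac23 e_j + a$, where $a$ is an arbitrary vector in $\mathbb C^2$. Since $\sum_{j=1}^{3} e_j = 0$, we obviously have

$$\sum_{j=1}^{3}\langle v, e_j\rangle\, u_j = \frac23\sum_{j=1}^{3}\langle v, e_j\rangle\, e_j + \Big\langle v, \sum_{j=1}^{3} e_j\Big\rangle\, a = v.$$

Again, however, the $u_j$ are "less economical" than the $\tilde e_j = \frac23 e_j$, in the sense that for all $v$ with $\langle v, a\rangle \neq 0$,

$$\sum_{j=1}^{3}|\langle v, u_j\rangle|^2 = \frac49\sum_{j=1}^{3}|\langle v, e_j\rangle|^2 + 3|\langle v, a\rangle|^2 = \frac23\|v\|^2 + 3|\langle v, a\rangle|^2 > \frac23\|v\|^2 = \sum_{j=1}^{3}|\langle v, \tilde e_j\rangle|^2.$$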

A similar inequality holds for every frame: if $f = \sum_{j\in J}\langle f, \varphi_j\rangle u_j$, then $\sum_{j\in J}|\langle u_j, g\rangle|^2 \ge \sum_{j\in J}|\langle\tilde\varphi_j, g\rangle|^2$ for all $g \in \mathcal H$, by Proposition 3.2.4.

Back to the reconstruction issue. If we know $\tilde\varphi_j = (F^*F)^{-1}\varphi_j$, then (3.2.8) tells us how to reconstruct $f$ from the $\langle f, \varphi_j\rangle$. So we only need to compute the $\tilde\varphi_j$, which involves the inversion of $F^*F$. If $B$ and $A$ are close to each other, i.e., $r = B/A - 1 \ll 1$, then (3.2.4) tells us that $F^*F$ is "close" to $\frac{A+B}{2}\,\mathrm{Id}$, so that $(F^*F)^{-1}$ is "close" to $\frac{2}{A+B}\,\mathrm{Id}$, and $\tilde\varphi_j$ "close" to $\frac{2}{A+B}\,\varphi_j$. More precisely,

$$f = \frac{2}{A+B}\sum_{j\in J}\langle f, \varphi_j\rangle\,\varphi_j + Rf, \qquad (3.2.11)$$

where $R = \mathrm{Id} - \frac{2}{A+B}F^*F$; hence $-\frac{B-A}{B+A}\,\mathrm{Id} \le R \le \frac{B-A}{B+A}\,\mathrm{Id}$. This implies$^5$ $\|R\| \le \frac{B-A}{B+A} = \frac{r}{2+r}$. If $r$ is small, we can drop the rest term $Rf$ in (3.2.11), and we obtain a reconstruction formula for $f$ which is accurate up to an $L^2$-error of $\frac{r}{2+r}\|f\|$. Even if $r$ is not so small, we can write an algorithm for the reconstruction of $f$ with exponential convergence. With the same definition of $R$, we have $F^*F = \frac{A+B}{2}(\mathrm{Id} - R)$; hence $(F^*F)^{-1} = \frac{2}{A+B}(\mathrm{Id} - R)^{-1}$. Since $\|R\| \le \frac{B-A}{B+A} < 1$, the series $\sum_{k=0}^{\infty} R^k$ converges in norm, and its limit is $(\mathrm{Id} - R)^{-1}$. It follows that

$$\tilde\varphi_j = (F^*F)^{-1}\varphi_j = \frac{2}{A+B}\sum_{k=0}^{\infty} R^k\varphi_j.$$

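Using only the zeroth order term in the reconstruction formula leads exactly to (3.2.11) with the rest term dropped. We obtain better approximations by truncating after $N$ terms,

$$\tilde\varphi_j^N = \frac{2}{A+B}\sum_{k=0}^{N} R^k\varphi_j = \tilde\varphi_j - \frac{2}{A+B}\sum_{k=N+1}^{\infty} R^k\varphi_j = \big[\mathrm{Id} - R^{N+1}\big]\tilde\varphi_j, \qquad (3.2.12)$$

with

$$\Big\|f - \sum_{j\in J}\langle f, \varphi_j\rangle\,\tilde\varphi_j^N\Big\| = \sup_{\|g\|=1}\Big|\Big\langle f - \sum_{j\in J}\langle f, \varphi_j\rangle\,\tilde\varphi_j^N,\ g\Big\rangle\Big| = \sup_{\|g\|=1}\Big|\sum_{j\in J}\langle f, \varphi_j\rangle\langle R^{N+1}\tilde\varphi_j, g\rangle\Big|$$
$$= \sup_{\|g\|=1}|\langle f, R^{N+1}g\rangle| \le \|R\|^{N+1}\|f\| \le \Big(\frac{r}{2+r}\Big)^{N+1}\|f\|,$$

which becomes exponentially small as $N$ increases, since $\frac{r}{2+r} < 1$. In particular, the $\tilde\varphi_j^N$ can be computed by an iterative algorithm,

$$\tilde\varphi_j^N = \frac{2}{A+B}\,\varphi_j + R\,\tilde\varphi_j^{N-1},$$

or

$$\tilde\varphi_j^N = \sum_{\ell\in J}\alpha_{j\ell}^N\,\varphi_\ell,$$

with

$$\alpha_{j\ell}^N = \frac{2}{A+B}\,\delta_{\ell j} + \alpha_{j\ell}^{N-1} - \frac{2}{A+B}\sum_{m\in J}\alpha_{jm}^{N-1}\langle\varphi_m, \varphi_\ell\rangle.$$

This may look daunting, but it is not so terrible in examples of practical interest, where many $\langle\varphi_m, \varphi_\ell\rangle$ are negligibly small. The same iterative technique can be applied directly to $f$:

$$f = (F^*F)^{-1}(F^*F)f = \lim_{N\to\infty} f_N,$$

with

$$f_N = \frac{2}{A+B}\sum_{k=0}^{N} R^k(F^*F)f = \frac{2}{A+B}(F^*F)f + Rf_{N-1} = f_{N-1} + \frac{2}{A+B}\sum_{j\in J}\big[\langle f, \varphi_j\rangle - \langle f_{N-1}, \varphi_j\rangle\big]\varphi_j.$$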
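The iteration for $f_N$ can be sketched in a few lines. This is a minimal illustration in a finite-dimensional real space, not an implementation for wavelet frames; the function name `frame_reconstruct`, the three-vector frame in $\mathbb R^2$, and its bounds $A = 1$, $B = 3$ (the eigenvalues of $F^*F$ for this particular frame) are assumptions chosen for the example.

```python
import math


def frame_reconstruct(coeffs, frame, A, B, n_iter):
    """Return f_N (N = n_iter) from the frame coefficients <f, phi_j>.

    Uses f_N = f_{N-1} + 2/(A+B) * sum_j (<f, phi_j> - <f_{N-1}, phi_j>) phi_j,
    started from f_{-1} = 0, so that f_0 = 2/(A+B) * F*F f.
    """
    dim = len(frame[0])
    relax = 2.0 / (A + B)
    f = [0.0] * dim
    for _ in range(n_iter + 1):
        residual = [c - sum(x * p for x, p in zip(f, phi))
                    for c, phi in zip(coeffs, frame)]
        f = [f[i] + relax * sum(r * phi[i] for r, phi in zip(residual, frame))
             for i in range(dim)]
    return f


# Hypothetical non-tight frame in R^2; F*F = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
frame = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
A, B = 1.0, 3.0
f_true = (2.0, -1.0)
coeffs = [sum(x * p for x, p in zip(f_true, phi)) for phi in frame]

N = 20
f_rec = frame_reconstruct(coeffs, frame, A, B, N)
error = math.sqrt(sum((x - y) ** 2 for x, y in zip(f_rec, f_true)))
r = B / A - 1.0
bound = (r / (2.0 + r)) ** (N + 1) * math.sqrt(sum(x * x for x in f_true))
print(error, bound)   # the error stays below the bound (r / (2 + r))^(N+1) * ||f||
```

Only the frame coefficients and the bounds $A$, $B$ enter the iteration; the dual frame is never computed explicitly.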

Now that we have thoroughly explored abstract frame questions, we return to discrete wavelets.

3.3. Frames of wavelets.

We saw in §3.1 that in order to have a numerically stable reconstruction algorithm for $f$ from the $\langle f, \psi_{m,n}\rangle$, we require that the $\psi_{m,n}$ constitute a frame. In §3.2 we found an algorithm to reconstruct $f$ from the $\langle f, \psi_{m,n}\rangle$ if the $\psi_{m,n}$ do constitute a frame; for this algorithm the ratio of the frame bounds is important, and we will come back to ways of computing at least a bound on this ratio, later in this section. First, however, we show that the requirement that the $\psi_{m,n}$ constitute a frame already imposes that $\psi$ is admissible.

3.3.1. A necessary condition: Admissibility of the mother wavelet.

THEOREM 3.3.1. If the $\psi_{m,n}(x) = a_0^{-m/2}\psi(a_0^{-m}x - nb_0)$, $m, n \in \mathbb Z$, constitute a frame for $L^2(\mathbb R)$ with frame bounds $A$, $B$, then

$$\frac{b_0\ln a_0}{2\pi}\,A \le \int_0^\infty d\xi\ \xi^{-1}|\hat\psi(\xi)|^2 \le \frac{b_0\ln a_0}{2\pi}\,B \qquad (3.3.1)$$

and

$$\frac{b_0\ln a_0}{2\pi}\,A \le \int_{-\infty}^0 d\xi\ |\xi|^{-1}|\hat\psi(\xi)|^2 \le \frac{b_0\ln a_0}{2\pi}\,B. \qquad (3.3.2)$$

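Proof.

1. We have, for all $f \in L^2(\mathbb R)$,

$$A\|f\|^2 \le \sum_{m,n\in\mathbb Z}|\langle f, \psi_{m,n}\rangle|^2 \le B\|f\|^2. \qquad (3.3.3)$$

If we write (3.3.3) for $f = u_\ell$, and add all the resulting inequalities, weighted with coefficients $c_\ell > 0$ such that $\sum_\ell c_\ell\|u_\ell\|^2 < \infty$, then we obtain

$$A\sum_\ell c_\ell\|u_\ell\|^2 \le \sum_\ell c_\ell\sum_{m,n}|\langle u_\ell, \psi_{m,n}\rangle|^2 \le B\sum_\ell c_\ell\|u_\ell\|^2. \qquad (3.3.4)$$

In particular, if $C$ is any positive trace-class operator (see Preliminaries), then

$$C = \sum_{\ell\in\mathbb N} c_\ell\,\langle\,\cdot\,, u_\ell\rangle\, u_\ell,$$

where the $u_\ell$ are orthonormal, $c_\ell \ge 0$, and $\sum_{\ell\in\mathbb N} c_\ell = \mathrm{Tr}\,C > 0$. For any such operator, we have therefore, by (3.3.4),

$$A\ \mathrm{Tr}\,C \le \sum_{m,n}\langle C\psi_{m,n}, \psi_{m,n}\rangle \le B\ \mathrm{Tr}\,C. \qquad (3.3.5)$$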

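2. We now apply (3.3.5) to a very special operator $C$, constructed via the continuous wavelet transform, with a different mother wavelet. Take $h$ to be any $L^2$-function such that support $\hat h \subset [0, \infty)$, $\int_0^\infty d\xi\ \xi^{-1}|\hat h(\xi)|^2 < \infty$, and define, as in Chapter 2, $h^{a,b}(x) = a^{-1/2}h\big(\frac{x-b}{a}\big)$ for $a, b \in \mathbb R$, $a > 0$. If $c(a, b)$ is a bounded, positive function, then

$$C = \int_0^\infty\frac{da}{a^2}\int_{-\infty}^\infty db\ \langle\,\cdot\,, h^{a,b}\rangle\, h^{a,b}\, c(a, b) \qquad (3.3.6)$$

is a bounded, positive operator (see §2.8). If, moreover, $c(a, b)$ is integrable with respect to $a^{-2}\,da\,db$, then $C$ is trace-class, and $\mathrm{Tr}\,C = \int_0^\infty\frac{da}{a^2}\int_{-\infty}^\infty db\ c(a, b)\,\|h\|^2$.$^6$ We will in particular choose $c(a, b) = w(|b|/a)$ if $1 \le a \le a_0$, $0$ otherwise, with $w$ positive and integrable. We then have

$$C = \int_1^{a_0}\frac{da}{a^2}\int_{-\infty}^\infty db\ w\Big(\frac{|b|}{a}\Big)\langle\,\cdot\,, h^{a,b}\rangle\, h^{a,b}$$

and

$$\mathrm{Tr}\,C = 2\,\|h\|^2\ln a_0\int_0^\infty ds\ w(s).$$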

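3. The middle term in (3.3.5) becomes, for this $C$,

$$\sum_{m,n}\langle C\psi_{m,n}, \psi_{m,n}\rangle = \sum_{m,n}\int_1^{a_0}\frac{da}{a^2}\int_{-\infty}^\infty db\ w\Big(\frac{|b|}{a}\Big)\,|\langle\psi_{m,n}, h^{a,b}\rangle|^2.$$

But

$$\langle\psi_{m,n}, h^{a,b}\rangle = \big\langle\psi,\ h^{a_0^{-m}a,\ a_0^{-m}b - nb_0}\big\rangle.$$

After the change of variables $a' = a_0^{-m}a$, $b' = a_0^{-m}b - nb_0$ we therefore obtain

$$\sum_{m,n}\langle C\psi_{m,n}, \psi_{m,n}\rangle = \int_0^\infty\frac{da}{a^2}\int_{-\infty}^\infty db\ |\langle\psi, h^{a,b}\rangle|^2\sum_{n\in\mathbb Z} w\Big(\frac{|b + nb_0|}{a}\Big).$$

Take now $w(s) = \lambda e^{-\lambda^2\pi s^2}$. This function has only one local maximum, and is monotone decreasing as $|s|$ increases. An elementary approximation argument for integrals (the full details of which can be found in Daubechies (1990), Lemma 2.2) shows that for such functions $w$ and for any $\alpha, \beta \in \mathbb R$, $\beta > 0$,

$$\int_{-\infty}^\infty dt\ w(t) - \beta w_{\max} \le \beta\sum_{n\in\mathbb Z} w(\alpha + n\beta) \le \int_{-\infty}^\infty dt\ w(t) + \beta w_{\max},$$

or, for our particular $w$,

$$\sum_{n} w\Big(\frac{|b + nb_0|}{a}\Big) = \frac{a}{b_0} + \rho(a, b),$$

with $|\rho(a, b)| \le w(0) = \lambda$. Consequently,

$$\sum_{m,n}\langle C\psi_{m,n}, \psi_{m,n}\rangle = \frac{1}{b_0}\int_0^\infty\frac{da}{a}\int_{-\infty}^\infty db\ |\langle\psi, h^{a,b}\rangle|^2 + R, \qquad (3.3.7)$$

where

$$|R| = \Big|\int_0^\infty\frac{da}{a^2}\int_{-\infty}^\infty db\ |\langle\psi, h^{a,b}\rangle|^2\,\rho(a, b)\Big| \le \lambda\, C_h\|\psi\|^2,$$

with $C_h$ as defined by (2.4.1). We can rewrite the first term in (3.3.7) as

$$\frac{1}{b_0}\int_0^\infty\frac{da}{a}\int_{-\infty}^\infty db\ \Big|\int d\xi\ \hat\psi(\xi)\,a^{1/2}\,\overline{\hat h(a\xi)}\,e^{ib\xi}\Big|^2 = \frac{2\pi}{b_0}\int_0^\infty da\int_0^\infty d\xi\ |\hat\psi(\xi)|^2\,|\hat h(a\xi)|^2 = \frac{2\pi}{b_0}\,\|h\|^2\int_0^\infty d\xi\ \xi^{-1}|\hat\psi(\xi)|^2.$$

4. For the particular weight function $w$ that we have chosen, we have $\int_0^\infty dt\ w(t) = \frac12$, hence $\mathrm{Tr}\,C = \|h\|^2\ln a_0$. Substituting all our results in (3.3.5) we find

$$A\,\|h\|^2\ln a_0 \le \frac{2\pi}{b_0}\,\|h\|^2\int_0^\infty d\xi\ \xi^{-1}|\hat\psi(\xi)|^2 + R \le B\,\|h\|^2\ln a_0,$$

where $|R| \le \lambda\, C_h\|\psi\|^2$. If we divide by $\frac{2\pi}{b_0}\|h\|^2$ and let $\lambda$ tend to zero, then this proves (3.3.1). The negative frequency formula (3.3.2) is proved analogously. $\blacksquare$

REMARKS.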

1. Formulas (3.3.1), (3.3.2) impose an a priori restriction on $\psi$, namely that $\int_0^\infty d\xi\ \xi^{-1}|\hat\psi(\xi)|^2 < \infty$ and $\int_{-\infty}^0 d\xi\ |\xi|^{-1}|\hat\psi(\xi)|^2 < \infty$. This is the same restriction as in the continuous case (see (2.4.6)).

2. In defining the discretely labelled $\psi_{m,n}$, we only took positive dilations $a_0^m$ into consideration (the sign of $m$ affects whether $a_0^m$ is $\ge 1$ or $\le 1$, but $a_0^m > 0$ for all $m$). This is the reason why formulas (3.3.1), (3.3.2) dissociate the positive and negative frequency domains. If we had allowed negative discrete dilations as well, then the condition would have involved only $\int_{-\infty}^\infty d\xi\ |\xi|^{-1}|\hat\psi(\xi)|^2$ (as is easy to check by mimicking the above proof).

3. If the $\psi_{m,n}$ constitute a tight frame ($A = B$), then (3.3.1), (3.3.2) imply

$$A = \frac{2\pi}{b_0\ln a_0}\int_0^\infty d\xi\ \xi^{-1}|\hat\psi(\xi)|^2 = \frac{2\pi}{b_0\ln a_0}\int_{-\infty}^0 d\xi\ |\xi|^{-1}|\hat\psi(\xi)|^2.$$

In particular, if the $\psi_{m,n}$ constitute an orthonormal basis of $L^2(\mathbb R)$ (such as the Haar basis, or other bases we will encounter), then

$$\int_0^\infty d\xi\ \xi^{-1}|\hat\psi(\xi)|^2 = \int_{-\infty}^0 d\xi\ |\xi|^{-1}|\hat\psi(\xi)|^2 = \frac{b_0\ln a_0}{2\pi}. \qquad (3.3.8)$$

It is an easy exercise that the Haar basis does indeed satisfy (3.3.8). Most of the orthonormal bases we will consider are real, so that the first equality in (3.3.8) is trivially satisfied.

4. A different proof of Theorem 3.3.1 is given in Chui and Shi (1993). In all that follows, we will always assume that $\psi$ is admissible. $\Box$
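The Haar exercise mentioned in Remark 3 can be checked numerically. The sketch below assumes the Haar wavelet $\psi = 1$ on $[0, \frac12)$, $-1$ on $[\frac12, 1)$, and the normalization $\hat\psi(\xi) = (2\pi)^{-1/2}\int dx\ e^{-ix\xi}\psi(x)$, which gives $|\hat\psi(\xi)|^2 = 16\sin^4(\xi/4)/(2\pi\xi^2)$; the helper names, the cutoff, and the step count are arbitrary choices made for the illustration.

```python
import math


def haar_hat_sq(xi):
    """|psi_hat(xi)|^2 for the Haar wavelet (assumed normalization, see lead-in)."""
    if xi == 0.0:
        return 0.0
    return 16.0 * math.sin(xi / 4.0) ** 4 / (2.0 * math.pi * xi * xi)


def simpson(func, a, b, steps):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0


def admissibility_integral(upper=200.0 * math.pi, steps=200_000):
    """Approximate the integral of |psi_hat(xi)|^2 / xi over (0, infinity)."""
    def integrand(xi):
        return haar_hat_sq(xi) / xi if xi > 0.0 else 0.0
    head = simpson(integrand, 0.0, upper, steps)
    # Tail estimate: replace sin^4 by its mean value 3/8 beyond the cutoff.
    tail = 3.0 / (2.0 * math.pi * upper ** 2)
    return head + tail


a0, b0 = 2.0, 1.0
lhs = admissibility_integral()
rhs = b0 * math.log(a0) / (2.0 * math.pi)
print(lhs, rhs)   # the two sides of (3.3.8) for the Haar basis
```

Since the Haar wavelet is real, the integral over the negative frequencies is the same, which is the first equality in (3.3.8).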

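3.3.2. A sufficient condition and estimates for the frame bounds.

Not all choices for $\psi$, $a_0$, $b_0$ lead to frames of wavelets, even if $\psi$ is admissible. In this subsection we derive some fairly general conditions on $\psi$, $a_0$, $b_0$ under which we do indeed obtain a frame, and we estimate the corresponding frame bounds. To do this, we need to estimate $\sum_{m,n}|\langle f, \psi_{m,n}\rangle|^2$:

$$\sum_{m,n\in\mathbb Z}|\langle f, \psi_{m,n}\rangle|^2 = \sum_{m,n}\Big|\int_{-\infty}^\infty d\xi\ \hat f(\xi)\,a_0^{m/2}\,\overline{\hat\psi(a_0^m\xi)}\,e^{ib_0a_0^mn\xi}\Big|^2$$
$$= \sum_{m,n} a_0^m\,\Big|\int_0^{2\pi b_0^{-1}a_0^{-m}} d\xi\ e^{ib_0a_0^mn\xi}\sum_{\ell\in\mathbb Z}\hat f(\xi + 2\pi\ell a_0^{-m}b_0^{-1})\,\overline{\hat\psi(a_0^m\xi + 2\pi\ell b_0^{-1})}\Big|^2$$
$$= \frac{2\pi}{b_0}\sum_m\int_0^{2\pi b_0^{-1}a_0^{-m}} d\xi\ \Big|\sum_{\ell\in\mathbb Z}\hat f(\xi + 2\pi\ell a_0^{-m}b_0^{-1})\,\overline{\hat\psi(a_0^m\xi + 2\pi\ell b_0^{-1})}\Big|^2$$

(by Plancherel's theorem for periodic functions)

$$= \frac{2\pi}{b_0}\sum_{m,k\in\mathbb Z}\int_{-\infty}^\infty d\xi\ \hat f(\xi)\,\overline{\hat f(\xi + 2\pi ka_0^{-m}b_0^{-1})}\ \overline{\hat\psi(a_0^m\xi)}\,\hat\psi(a_0^m\xi + 2\pi kb_0^{-1})$$
$$= \frac{2\pi}{b_0}\int_{-\infty}^\infty d\xi\ |\hat f(\xi)|^2\sum_{m\in\mathbb Z}|\hat\psi(a_0^m\xi)|^2 + \mathrm{Rest}(f). \qquad (3.3.9)$$

Here $\mathrm{Rest}(f)$ is bounded by

$$|\mathrm{Rest}(f)| = \frac{2\pi}{b_0}\,\Big|\sum_{m,\,k\neq 0}\int_{-\infty}^\infty d\xi\ \hat f(\xi)\,\overline{\hat f(\xi + 2\pi ka_0^{-m}b_0^{-1})}\ \overline{\hat\psi(a_0^m\xi)}\,\hat\psi(a_0^m\xi + 2\pi kb_0^{-1})\Big|$$
$$\le \frac{2\pi}{b_0}\sum_{m,\,k\neq 0}\Big[\int d\xi\ |\hat f(\xi)|^2\,|\hat\psi(a_0^m\xi)|\,|\hat\psi(a_0^m\xi + 2\pi kb_0^{-1})|\Big]^{1/2}\Big[\int d\zeta\ |\hat f(\zeta)|^2\,|\hat\psi(a_0^m\zeta - 2\pi kb_0^{-1})|\,|\hat\psi(a_0^m\zeta)|\Big]^{1/2}$$

(by Cauchy-Schwarz on the integral, followed by the change of variables $\zeta = \xi + 2\pi ka_0^{-m}b_0^{-1}$ in the second factor)

$$\le \frac{2\pi}{b_0}\sum_{k\neq 0}\Big[\int d\xi\ |\hat f(\xi)|^2\sum_m|\hat\psi(a_0^m\xi)|\,|\hat\psi(a_0^m\xi + 2\pi kb_0^{-1})|\Big]^{1/2}\Big[\int d\zeta\ |\hat f(\zeta)|^2\sum_m|\hat\psi(a_0^m\zeta)|\,|\hat\psi(a_0^m\zeta - 2\pi kb_0^{-1})|\Big]^{1/2}$$

(use Cauchy-Schwarz on the sum over $m$)

$$\le \frac{2\pi}{b_0}\,\|f\|^2\sum_{k\neq 0}\Big[\beta\Big(\frac{2\pi}{b_0}k\Big)\,\beta\Big(-\frac{2\pi}{b_0}k\Big)\Big]^{1/2}, \qquad (3.3.10)$$

where $\beta(s) = \sup_\xi\sum_{m\in\mathbb Z}|\hat\psi(a_0^m\xi)|\,|\hat\psi(a_0^m\xi + s)|$. Putting (3.3.9) and (3.3.10) together, we see that$^7$

$$\inf_{f\in\mathcal H,\ f\neq 0}\|f\|^{-2}\sum_{m,n}|\langle f, \psi_{m,n}\rangle|^2 \ge \frac{2\pi}{b_0}\Big\{\inf_\xi\sum_{m\in\mathbb Z}|\hat\psi(a_0^m\xi)|^2 - \sum_{k\neq 0}\Big[\beta\Big(\frac{2\pi}{b_0}k\Big)\,\beta\Big(-\frac{2\pi}{b_0}k\Big)\Big]^{1/2}\Big\}, \qquad (3.3.11)$$

$$\sup_{f\in\mathcal H,\ f\neq 0}\|f\|^{-2}\sum_{m,n}|\langle f, \psi_{m,n}\rangle|^2 \le \frac{2\pi}{b_0}\Big\{\sup_\xi\sum_{m\in\mathbb Z}|\hat\psi(a_0^m\xi)|^2 + \sum_{k\neq 0}\Big[\beta\Big(\frac{2\pi}{b_0}k\Big)\,\beta\Big(-\frac{2\pi}{b_0}k\Big)\Big]^{1/2}\Big\}. \qquad (3.3.12)$$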

If the right-hand sides of (3.3.11), (3.3.12) are strictly positive and bounded, then the $\psi_{m,n}$ constitute a frame, and (3.3.11) gives a lower bound for $A$, (3.3.12) an upper bound for $B$. To make this work, we need that, for all $1 \le |\xi| \le a_0$ (other values of $\xi$ can be reduced to this range by multiplication with a suitable $a_0^m$, except for $\xi = 0$, but this constitutes a set of measure zero, and therefore does not matter),

$$0 < \alpha \le \sum_{m\in\mathbb Z}|\hat\psi(a_0^m\xi)|^2 \le \beta < \infty;$$

moreover, $\sum_{m\in\mathbb Z}|\hat\psi(a_0^m\xi)|\,|\hat\psi(a_0^m\xi + s)|$ should have sufficient decay at $\infty$. "Sufficient" in this second condition means that $\sum_{k\neq 0}\big[\beta\big(\frac{2\pi}{b_0}k\big)\beta\big(-\frac{2\pi}{b_0}k\big)\big]^{1/2}$ converges, and that the sum tends to $0$ as $b_0$ tends to $0$, ensuring that for small enough $b_0$ the first terms in (3.3.11), (3.3.12) dominate, so that the $\psi_{m,n}$ do indeed constitute a frame. In order to ensure all this, it is sufficient to require that

• the zeros of $\hat\psi$ do not "conspire," so that

$$\sum_{m\in\mathbb Z}|\hat\psi(a_0^m\xi)|^2 \ge \alpha > 0 \qquad (3.3.13)$$

for all $\xi \neq 0$,

• $|\hat\psi(\xi)| \le C|\xi|^\alpha(1 + |\xi|^2)^{-\gamma/2}$, with $\alpha > 0$, $\gamma > \alpha + 1$.$^8$ (3.3.14)

These decay conditions on $\hat\psi$ are very weak, and in practice we will require much more! If $\hat\psi$ is continuous, and decays at $\infty$, then (3.3.13) is a necessary condition: if, for some $\xi_0 \neq 0$, $\sum_{m\in\mathbb Z}|\hat\psi(a_0^m\xi_0)|^2 \le \epsilon$, then one can construct $f \in L^2(\mathbb R)$, with $\|f\| = 1$, so that $(2\pi)^{-1}b_0\sum_{m,n}|\langle f, \psi_{m,n}\rangle|^2 \le 2\epsilon$, implying $A \le 4\pi\epsilon/b_0$.$^9$

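If $\epsilon$ can be chosen arbitrarily small, then there is no finite lower frame bound. (See also Chui and Shi (1993), where the stronger result $A \le \frac{2\pi}{b_0}\sum_m|\hat\psi(a_0^m\xi)|^2 \le B$ is proved.) The following proposition summarizes our findings.

PROPOSITION 3.3.2. If $\psi$, $a_0$ are such that

$$\inf_{1\le|\xi|\le a_0}\sum_{m=-\infty}^{\infty}|\hat\psi(a_0^m\xi)|^2 > 0, \qquad \sup_{1\le|\xi|\le a_0}\sum_{m=-\infty}^{\infty}|\hat\psi(a_0^m\xi)|^2 < \infty, \qquad (3.3.15)$$

and if $\beta(s) = \sup_\xi\sum_m|\hat\psi(a_0^m\xi)|\,|\hat\psi(a_0^m\xi + s)|$ decays at least as fast as $(1 + |s|)^{-(1+\epsilon)}$, with $\epsilon > 0$, then there exists $(b_0)_{\mathrm{thr}} > 0$ such that the $\psi_{m,n}$ constitute a frame for all choices $b_0 < (b_0)_{\mathrm{thr}}$. For $b_0 < (b_0)_{\mathrm{thr}}$, the following expressions are frame bounds for the $\psi_{m,n}$:

$$A = \frac{2\pi}{b_0}\Big\{\inf_{1\le|\xi|\le a_0}\sum_{m=-\infty}^{\infty}|\hat\psi(a_0^m\xi)|^2 - \sum_{k=-\infty,\ k\neq 0}^{\infty}\Big[\beta\Big(\frac{2\pi}{b_0}k\Big)\,\beta\Big(-\frac{2\pi}{b_0}k\Big)\Big]^{1/2}\Big\},$$

$$B = \frac{2\pi}{b_0}\Big\{\sup_{1\le|\xi|\le a_0}\sum_{m=-\infty}^{\infty}|\hat\psi(a_0^m\xi)|^2 + \sum_{k=-\infty,\ k\neq 0}^{\infty}\Big[\beta\Big(\frac{2\pi}{b_0}k\Big)\,\beta\Big(-\frac{2\pi}{b_0}k\Big)\Big]^{1/2}\Big\}.$$

The conditions on $\beta$ and $\hat\psi$ are satisfied if, e.g., $\hat\psi$ satisfies a decay estimate of the type (3.3.14), with $\alpha > 0$, $\gamma > \alpha + 1$.

Proof. We have already carried out all the necessary estimates. The decay of $\beta$ ensures the existence of a $(b_0)_{\mathrm{thr}}$ so that $\sum_{k\neq 0}\big[\beta\big(\frac{2\pi}{b_0}k\big)\beta\big(-\frac{2\pi}{b_0}k\big)\big]^{1/2} < \inf_{1\le|\xi|\le a_0}\sum_m|\hat\psi(a_0^m\xi)|^2$ if $b_0 < (b_0)_{\mathrm{thr}}$. $\blacksquare$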

The moral of these technical estimates is simple: if $\psi$ is at all "decent" (reasonable decay in time and frequency, $\int dx\ \psi(x) = 0$), then there exists a whole range of $a_0$, $b_0$ so that the corresponding $\psi_{m,n}$ constitute a frame. Since our conditions on $\psi$ imply that $\psi$ is admissible in the sense of Chapter 2, this is not so surprising for values of $a_0$, $b_0$ close to $1$, $0$ respectively: we already know that the resolution of the identity (2.4.4) holds for such $\psi$, and it is reasonable to expect that a sufficiently fine discretization of the integration variables should not upset the reconstruction too much. Surprisingly enough, for many $\psi$ of practical interest, the range of "good" $(a_0, b_0)$ includes values which are quite far from $(1, 0)$. We will see several examples below. But first we will look at the dual frame for a frame of wavelets, and discuss some variations on the basic scheme.

3.3.3. The dual frame.

As we saw in §3.2, the dual frame is defined by

$$\tilde\psi_{m,n} = (F^*F)^{-1}\psi_{m,n}, \qquad (3.3.16)$$

where $F^*Ff = \sum_{m,n}\langle f, \psi_{m,n}\rangle\,\psi_{m,n}$. We have an explicit formula for the inverse of $F^*F$ which converges exponentially fast, i.e., like $\sum_{n=0}^{\infty}\alpha^n$, with a convergence ratio $\alpha$ proportional to $\big(\frac BA - 1\big)$. It is therefore useful to have frame bounds $A$, $B$ which are close to each other. Nevertheless, (3.2.8) necessitates, in principle, an infinite number of $\tilde\psi_{m,n}$ to be computed. The situation is not quite as bad as one might expect: if we introduce the notation

$$(D^mf)(x) = a_0^{-m/2}f(a_0^{-m}x), \qquad (T^nf)(x) = f(x - nb_0),$$

then it is easy to check that, for all $f \in L^2(\mathbb R)$,

$$F^*F\,D^mf = D^mF^*Ff.$$

It follows that $(F^*F)^{-1}$ and $D^m$ commute as well. In particular, since $\psi_{m,n} = D^mT^n\psi$,

$$\tilde\psi_{m,n} = (F^*F)^{-1}D^mT^n\psi = D^m(F^*F)^{-1}T^n\psi = D^m\tilde\psi_{0,n},$$

or

$$\tilde\psi_{m,n}(x) = a_0^{-m/2}\,\tilde\psi_{0,n}(a_0^{-m}x).$$

Unfortunately, $F^*F$ and $T^n$ do not commute, so that we still have to compute, in principle, infinitely many $\tilde\psi_{0,n}$. In practice, one is interested only in functions "living" on a finite range of scales, on which $F^*F$ can be reasonably approximated by $\sum_{m=m_0}^{m_1}\sum_{n\in\mathbb Z}\langle\,\cdot\,, \psi_{m,n}\rangle\,\psi_{m,n}$ (see the time-frequency localization section below, §3.5). If $a_0^{m_1-m_0}$ is an integer, $N = a_0^{m_1-m_0}$, then one easily checks that this truncated version of $F^*F$ commutes with $T^N$, so that one only has to compute the $N$ different $\tilde\psi_{0,n}$, $0 \le n \le N - 1$ in this case. This number is still very large in many cases of practical interest, however. It is therefore especially advantageous to work with frames which are almost tight ("snug frames"), i.e., which have $\frac BA - 1 \ll 1$: we can then stop after the zeroth order term of the reconstruction formula (3.2.11), avoid all the complications with the dual frame, and still have a high quality reconstruction of arbitrary $f$.

On the other hand, there exist very special choices of $\psi$, $a_0$, $b_0$ for which the $\psi_{m,n}$ are not close to a tight frame, but it so happens that all the $\tilde\psi_{m,n}$ are generated by a single function $\tilde\psi$,

$$\widetilde{\psi_{m,n}}(x) = \tilde\psi_{m,n}(x) = a_0^{-m/2}\,\tilde\psi(a_0^{-m}x - n). \qquad (3.3.17)$$

An example is provided by some of the biorthogonal bases that we will encounter in Chapter 8; another example is given by the $\phi$-transform of Frazier and Jawerth (1988) (see also Frazier, Jawerth, and Weiss (1991)).

It is important to realize that the $\psi_{m,n}$ and the $\tilde\psi_{m,n}$ may have very different regularity properties. For instance, there exist frames where $\psi$ itself is $C^\infty$ and decays faster than any inverse polynomial, but where some of the $\tilde\psi_{0,n}$ are not in $L^p$ for small $p$ (implying that they have very slow decay). An example, due to P. G. Lemarié, is given in detail in Daubechies (1990), pp. 988-989.$^{10}$ Something similar may happen even if all the $\tilde\psi_{m,n}$ are generated by a single function $\tilde\psi$: there exist examples where $\psi \in C^k$ (with $k$ arbitrarily large), but where $\tilde\psi$ is not continuous. (The biorthogonal bases in Chapter 8 give examples where this happens; the first example was constructed by Tchamitchian (1989).) One can exclude such dissimilarities by imposing extra conditions on $\psi$, $a_0$, and $b_0$ (see Daubechies (1990), §II.D.2, pp. 991-992).

3.3.4. Some variations on the basic scheme.

So far, we have not restricted the value of $a_0$, beyond $a_0 > 1$. In practice, however, it is very convenient to have $a_0 = 2$. Going from one scale to the next then means doubling or halving the translation step, which is much more practical than if another $a_0$ is used. On the other hand, we have just seen that it is advantageous to use frames with $B/A - 1 \ll 1$. Since our estimates (3.3.11), (3.3.12) for $A$, $B$ give

$$A \le \frac{2\pi}{b_0}\sum_{m\in\mathbb Z}|\hat\psi(a_0^m\xi)|^2 \le B \qquad (3.3.18)$$

for all $\xi \neq 0$, these two requirements together imply that $\sum_{m\in\mathbb Z}|\hat\psi(2^m\xi)|^2$ should be almost constant for $\xi \neq 0$, which is a very strong restriction on $\psi$, not generally satisfied. The Mexican hat function $\psi(x) = (1 - x^2)e^{-x^2/2}$, for instance, leads to a frame with $B/A$ close to $1$ for $a_0 \le 2^{1/4}$, but certainly not for $a_0 = 2$, because the amplitude of the oscillations of $\sum_{m\in\mathbb Z}|\hat\psi(2^m\xi)|^2$ is too large. In order to remedy this situation, without having to give up too much of our freedom in choosing $\psi$ and its width in the frequency domain, we can adopt a method used by A. Grossmann, R. Kronland-Martinet, and J. Morlet, and use different "voices" per octave. This amounts to using several different wavelets, $\psi^1, \ldots, \psi^N$, and to look at the frame $\{\psi^\nu_{m,n};\ m, n \in \mathbb Z,\ \nu = 1, \ldots, N\}$. One can repeat the analysis of §3.3.2 (see, e.g., Daubechies (1990)), leading to the following estimates for the frame bounds of this multivoice frame:

$$A = \frac{2\pi}{b_0}\Big[\inf_{1\le|\xi|\le 2}\sum_{\nu=1}^{N}\sum_{m=-\infty}^{\infty}|\hat\psi^\nu(2^m\xi)|^2 - R\Big(\frac{2\pi}{b_0}\Big)\Big], \qquad (3.3.19)$$

$$B = \frac{2\pi}{b_0}\Big[\sup_{1\le|\xi|\le 2}\sum_{\nu=1}^{N}\sum_{m=-\infty}^{\infty}|\hat\psi^\nu(2^m\xi)|^2 + R\Big(\frac{2\pi}{b_0}\Big)\Big], \qquad (3.3.20)$$

with

$$R(x) = \sum_{k\neq 0}\sum_{\nu=1}^{N}\big[\beta^\nu(kx)\,\beta^\nu(-kx)\big]^{1/2}$$

and

$$\beta^\nu(s) = \sup_{1\le|\xi|\le 2}\sum_{m=-\infty}^{\infty}|\hat\psi^\nu(2^m\xi)|\,|\hat\psi^\nu(2^m\xi + s)|.$$

By choosing the $\psi^1, \ldots, \psi^N$ to have slightly staggered frequency localization centers, coupled with good decay at $\infty$, one can achieve $B/A - 1 \ll 1$. (See the examples in §3.3.5 below.) The time-frequency lattice corresponding to such a multivoice scheme looks a little different from Figure 1.4a; an example, with four voices per octave, is given in Figure 3.2.

FIG. 3.2. The time-frequency lattice for a scheme with four voices. In this case the different voice wavelets $\psi^1, \ldots, \psi^4$ are assumed to be dilations of a single function $\psi$, $\psi^j(x) = 2^{-(j-1)/4}\psi(2^{-(j-1)/4}x)$; if $|\hat\psi(\xi)|$ (which we assume to be even) peaks around $\pm\omega_0$, then the $|\hat\psi^j|$ will be concentrated around $\pm 2^{-(j-1)/4}\omega_0$.

For every dilation step, we find four different frequency levels (corresponding to the four different frequency localizations of $\psi^1, \ldots, \psi^4$), all translated by the same translation step. Such a lattice can be viewed as the superposition of four different lattices of the type in Figure 1.4a, stretched by different amounts in the frequency direction. Each of these four sublattices has a different "density," which is reflected by the fact that typically the $\psi^\nu$ have different $L^2$-norms. One choice favored by Grossmann, Kronland-Martinet, and Morlet is to take "fractionally" dilated versions of a single wavelet $\psi$:

$$\psi^\nu(x) = 2^{-(\nu-1)/N}\,\psi\big(2^{-(\nu-1)/N}x\big).$$

(Note that these do indeed have different $L^2$-norms!) In this case $\sum_{\nu=1}^{N}\sum_{m=-\infty}^{\infty}|\hat\psi^\nu(2^m\xi)|^2$ becomes simply $\sum_{m'=-\infty}^{\infty}|\hat\psi(2^{m'/N}\xi)|^2$, and this can easily be made to be almost constant, by choosing $N$ large enough.
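The effect of adding voices can be illustrated numerically for the Mexican hat function. The sketch below evaluates only the leading terms of (3.3.19), (3.3.20), that is $\frac{2\pi}{b_0}$ times the infimum and supremum over $1 \le \xi \le 2$ of $\sum_{m'}|\hat\psi(2^{m'/N}\xi)|^2$, and ignores the correction $R(2\pi/b_0)$, which is an assumption that is only reasonable for small $b_0$. It also assumes the normalized wavelet $\psi(x) = \frac{2}{\sqrt3}\pi^{-1/4}(1 - x^2)e^{-x^2/2}$, for which $|\hat\psi(\xi)|^2 = \frac{4}{3\sqrt\pi}\xi^4e^{-\xi^2}$; the function names, sample count, and truncation range are arbitrary.

```python
import math


def mexican_hat_hat_sq(xi):
    """|psi_hat(xi)|^2 for the L2-normalized Mexican hat (assumed normalization)."""
    return 4.0 / (3.0 * math.sqrt(math.pi)) * xi ** 4 * math.exp(-xi * xi)


def leading_bounds(n_voices, b0, samples=501, m_range=20):
    """Leading terms of (3.3.19)/(3.3.20) for fractionally dilated voices, a0 = 2.

    The correction term R(2*pi/b0) is deliberately left out (see the lead-in).
    """
    lowest, highest = float("inf"), 0.0
    for i in range(samples):
        xi = 1.0 + i / (samples - 1)          # xi runs over [1, 2]
        total = sum(mexican_hat_hat_sq(2.0 ** (k / n_voices) * xi)
                    for k in range(-m_range * n_voices, m_range * n_voices + 1))
        lowest, highest = min(lowest, total), max(highest, total)
    scale = 2.0 * math.pi / b0
    return scale * lowest, scale * highest


for n_voices in (1, 2, 3, 4):
    lower, upper = leading_bounds(n_voices, 0.25)
    print(n_voices, round(lower, 3), round(upper, 3), round(upper / lower, 4))
```

For one voice and $b_0 = .25$ the two numbers come out at roughly 13.09 and 14.18, and the ratio drops very close to 1 as soon as a second voice is added.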

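Fixing $a_0 = 2$ allows also for a modification of the estimation techniques in §3.3.2, which may be useful in some instances. Let us go back to the estimate for $\mathrm{Rest}(f)$. We can rewrite $k \in \mathbb Z$, $k \neq 0$, as $k = 2^\ell(2k' + 1)$, where $\ell \ge 0$, $k' \in \mathbb Z$; the correspondence $k \to (\ell, k')$ is one-to-one. If $a_0 = 2$, then we can regroup different terms, and write

$$\mathrm{Rest}(f) = \frac{2\pi}{b_0}\sum_{m',k'\in\mathbb Z}\int d\xi\ \hat f(\xi)\,\overline{\hat f\big(\xi + 2\pi(2k' + 1)b_0^{-1}2^{-m'}\big)}\cdot\sum_{\ell=0}^{\infty}\overline{\hat\psi(2^{m'+\ell}\xi)}\ \hat\psi\big[2^\ell\big(2^{m'}\xi + 2\pi(2k' + 1)b_0^{-1}\big)\big].$$

This leads to

$$A = \frac{2\pi}{b_0}\Big\{\inf_{1\le|\xi|\le 2}\sum_{m\in\mathbb Z}|\hat\psi(2^m\xi)|^2 - \sum_{k=-\infty}^{\infty}\Big[\beta_1\Big(\frac{2\pi}{b_0}(2k + 1)\Big)\,\beta_1\Big(-\frac{2\pi}{b_0}(2k + 1)\Big)\Big]^{1/2}\Big\}, \qquad (3.3.21)$$

$$B = \frac{2\pi}{b_0}\Big\{\sup_{1\le|\xi|\le 2}\sum_{m\in\mathbb Z}|\hat\psi(2^m\xi)|^2 + \sum_{k=-\infty}^{\infty}\Big[\beta_1\Big(\frac{2\pi}{b_0}(2k + 1)\Big)\,\beta_1\Big(-\frac{2\pi}{b_0}(2k + 1)\Big)\Big]^{1/2}\Big\}, \qquad (3.3.22)$$

where

$$\beta_1(s) = \sup_{1\le|\xi|\le 2}\sum_{m\in\mathbb Z}\Big|\sum_{\ell=0}^{\infty}\overline{\hat\psi(2^{m+\ell}\xi)}\ \hat\psi\big(2^\ell(2^m\xi + s)\big)\Big|. \qquad (3.3.23)$$

These estimates are due to Ph. Tchamitchian. (Full details of the derivation can be found in Daubechies (1990).) Note that $\beta_1$, unlike $\beta$, still takes the phases of $\hat\psi$ into account; as a result, the estimates (3.3.21), (3.3.22) are often better than (3.3.11), (3.3.12) when $\hat\psi$ is not a positive function. If $\hat\psi$ is positive, then (3.3.11), (3.3.12) may be better. The estimates (3.3.21), (3.3.22) hold if we have one single voice per octave; they can of course also be extended to the multivoice case.

3.3.5. Examples.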

A. Tight frames. The following construction (first proposed in Daubechies, Grossmann, and Meyer (1986)) leads to a family of tight wavelet frames. Let $\nu$ be a $C^k$ (or $C^\infty$) function from $\mathbb R$ to $\mathbb R$ that satisfies

$$\nu(x) = \begin{cases} 0 & \text{if } x \le 0, \\ 1 & \text{if } x \ge 1 \end{cases} \qquad (3.3.24)$$

(see Figure 3.3). An example of such a ($C^1$) function $\nu$ is

$$\nu(x) = \begin{cases} 0, & x \le 0, \\ \sin^2\frac{\pi}{2}x, & 0 \le x \le 1, \\ 1, & x \ge 1. \end{cases} \qquad (3.3.25)$$

FIG. 3.3. The function $\nu(x)$ defined by (3.3.25).

For arbitrary $a_0 > 1$, $b_0 > 0$ we then define $\hat\psi^\pm(\xi)$ by

$$\hat\psi^+(\xi) = [\ln a_0]^{-1/2}\begin{cases} 0, & \xi \le \ell \ \text{or}\ \xi \ge a_0^2\ell, \\ \sin\Big[\frac{\pi}{2}\,\nu\Big(\frac{\xi - \ell}{\ell(a_0 - 1)}\Big)\Big], & \ell \le \xi \le a_0\ell, \\ \cos\Big[\frac{\pi}{2}\,\nu\Big(\frac{\xi - a_0\ell}{a_0\ell(a_0 - 1)}\Big)\Big], & a_0\ell \le \xi \le a_0^2\ell, \end{cases}$$

where $\ell = 2\pi[b_0(a_0^2 - 1)]^{-1}$, and $\hat\psi^-(\xi) = \hat\psi^+(-\xi)$. Figure 3.4 shows $\hat\psi^+$ for $a_0 = 2$, $b_0 = 1$, and $\nu$ as in (3.3.25).

FIG. 3.4. The function $\hat\psi^+(\xi)$ with the choices $a_0 = 2$, $b_0 = 1$ (horizontal axis $\xi$, with marks at $4\pi/3$ and $8\pi/3$).

It is easy to check that

$$|\text{support } \hat\psi^+| = (a_0^2 - 1)\,\ell = 2\pi/b_0$$

and

$$\sum_{m\in\mathbb Z}|\hat\psi^+(a_0^m\xi)|^2 = (\ln a_0)^{-1}\,\chi_{(0,\infty)}(\xi),$$

where $\chi_{(0,\infty)}$ is the indicator function of the open half line $(0, \infty)$, i.e., $\chi_{(0,\infty)}(\xi) = 1$ if $0 < \xi < \infty$, $0$ otherwise. For any $f \in L^2(\mathbb R)$, one then has

$$\sum_{m,n\in\mathbb Z}|\langle f, \psi^+_{m,n}\rangle|^2 = \frac{2\pi}{b_0}\sum_{m\in\mathbb Z}\int_0^\infty d\xi\ |\hat f(\xi)|^2\,|\hat\psi^+(a_0^m\xi)|^2 = \frac{2\pi}{b_0\ln a_0}\int_0^\infty d\xi\ |\hat f(\xi)|^2.$$

Similarly, $\sum_{m,n}|\langle f, \psi^-_{m,n}\rangle|^2 = \frac{2\pi}{b_0\ln a_0}\int_{-\infty}^0 d\xi\ |\hat f(\xi)|^2$. It follows that the collection $\{\psi^\epsilon_{m,n};\ m, n \in \mathbb Z,\ \epsilon = + \text{ or } -\}$ is a tight frame for $L^2(\mathbb R)$, with frame bound $\frac{2\pi}{b_0\ln a_0}$. One can use a variant to obtain a frame consisting of real wavelets: $\psi^1 = \mathrm{Re}\,\psi^+ = \frac12[\psi^+ + \psi^-]$ and $\psi^2 = \mathrm{Im}\,\psi^+ = \frac{1}{2i}[\psi^+ - \psi^-]$ generate the tight frame $\{\psi^\lambda_{m,n};\ m, n \in \mathbb Z,\ \lambda = 1 \text{ or } 2\}$. These frames are not generated by translations and dilations of a single function; this is a natural consequence of the decoupling of positive and negative frequencies in the construction. A more serious objection to their practical use is the fact that their Fourier transforms are compactly supported, and that the size of this support is relatively small (for reasonable $a_0$, $b_0$). As a result, the decay of the wavelets is numerically rather slow: even though we may choose $\nu$ to be $C^\infty$, so that the $\psi^\pm$ decay faster than any inverse polynomial,

$$|\psi^\pm(x)| \le C_N(1 + |x|)^{-N},$$

the value of $C_N$ turns out to be too large to be practical. Note that we did not introduce any restriction on $a_0$, $b_0$ in this construction.
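The identity $\sum_m|\hat\psi^+(a_0^m\xi)|^2 = (\ln a_0)^{-1}\chi_{(0,\infty)}(\xi)$ behind the tight frame can be checked directly. The sketch below assumes the sine/cosine form of $\hat\psi^+$ on $[\ell, a_0\ell]$ and $[a_0\ell, a_0^2\ell]$ with $\ell = 2\pi[b_0(a_0^2 - 1)]^{-1}$ and $\nu$ as in (3.3.25); the function names and the truncation of the sum over $m$ are choices made for the illustration.

```python
import math


def nu(x):
    """The C^1 transition function of (3.3.25)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return math.sin(0.5 * math.pi * x) ** 2


def psi_plus_hat(xi, a0, b0):
    """psi_hat^+ supported on [ell, a0^2 * ell] (assumed sine/cosine profile)."""
    ell = 2.0 * math.pi / (b0 * (a0 * a0 - 1.0))
    if xi <= ell or xi >= a0 * a0 * ell:
        return 0.0
    if xi <= a0 * ell:
        value = math.sin(0.5 * math.pi * nu((xi - ell) / (ell * (a0 - 1.0))))
    else:
        value = math.cos(0.5 * math.pi * nu((xi - a0 * ell) / (a0 * ell * (a0 - 1.0))))
    return value / math.sqrt(math.log(a0))


def dilation_sum(xi, a0, b0, m_range=60):
    """sum over |m| <= m_range of |psi_hat^+(a0^m xi)|^2."""
    return sum(psi_plus_hat(a0 ** m * xi, a0, b0) ** 2
               for m in range(-m_range, m_range + 1))


a0, b0 = 2.0, 1.0
for xi in (0.01, 0.5, 1.0, 3.7, 100.0, -1.0):
    print(xi, dilation_sum(xi, a0, b0), 1.0 / math.log(a0))
```

For every positive $\xi$ exactly two dilates overlap, and their squares add up to $\sin^2 + \cos^2$ of the same argument; for negative $\xi$ the sum vanishes.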

B. The Mexican hat function. The Mexican hat function is the second derivative of the Gaussian $e^{-x^2/2}$; if we normalize it so that its $L^2$-norm is $1$, and $\psi(0) > 0$, we obtain

$$\psi(x) = \frac{2}{\sqrt3}\,\pi^{-1/4}(1 - x^2)\,e^{-x^2/2}.$$

This function (and dilated and translated versions of it) was plotted in Figure 1.2b; if you take one such plot, and imagine it rotated around its symmetry axis, then you obtain a shape similar to a Mexican hat. This function is popular in vision analysis (at least in theoretical expositions), where it was also christened. Table 3.1 gives the frame bounds for this function, as computed from (3.3.19), (3.3.20), with $a_0 = 2$, for different values of $b_0$ and for a number of voices varying from 1 to 4. As soon as we take 2 or more voices, the frame may be considered tight for all $b_0 \le .75$. Note that $b_0 = .75$ and $(a_0)_{\mathrm{effective}} = 2^{1/2} \simeq 1.41$ (intuitively corresponding to two voices per octave) are not small values for the Mexican hat function: the distance between the maximum of $\psi$ and its zeros is only $1$, and the width of the positive frequency bump of $\hat\psi$ (as measured by $[\int_0^\infty d\xi\ (\xi - \xi_{\mathrm{av}})^2|\hat\psi(\xi)|^2]^{1/2}$, with $\xi_{\mathrm{av}} = \int_0^\infty d\xi\ \xi\,|\hat\psi(\xi)|^2$) is approximately $1.23$. For fixed $N$, and $b_0$ small enough, so that the frame is almost tight, the table also shows that $A \simeq B$ is inversely proportional to $b_0$, which fits the intuition that for tight frames of normalized vectors, $A = B$ measures the "redundancy" of the frame (see §3.2), which should indeed double if $b_0$ is halved. On the other hand, the numbers in the table also show that $B/A$ increases dramatically if $b_0$ is chosen "too large". For every $N$, the last value of $b_0$ shown is the last value (with increments of $.25$) for which our estimate (3.3.19) for $A$ is positive; from the next $b_0$ on, the $\psi_{m,n}$ are probably not a frame any more. This very abrupt transition, from a reasonable frame, to a very loose frame and then no more frame, as $b_0$ increases, was first observed by J. Morlet (1985, personal communication), and was one of the motivations for a more detailed mathematical analysis.

C. A modulated Gaussian. This is the function most often used by R. Kronland-Martinet and J. Morlet. Its Fourier transform is a shifted Gaussian, adjusted slightly so that $\hat\psi(0) = 0$; in the time domain,

$$\psi(x) = \pi^{-1/4}\big(e^{-i\xi_0x} - e^{-\xi_0^2/2}\big)\,e^{-x^2/2}. \qquad (3.3.26)$$

Often $\xi_0$ is chosen so that the ratio of the highest and the second highest maximum of $\psi$ is approximately $\frac12$, i.e., $\xi_0 = \pi[2/\ln 2]^{1/2} \simeq 5.3364\ldots$; in practice one often takes $\xi_0 = 5$. For this value of $\xi_0$, the second term in (3.3.26) is so small that it can be neglected in practice. This Morlet wavelet is complex, even though most applications in which it is used involve only real signals $f$. Often (see, e.g., Kronland-Martinet, Morlet, and Grossmann (1989)) the wavelet transform of a real signal with this complex wavelet is plotted in modulus-phase form, i.e., rather than $\mathrm{Re}\,\langle f, \psi_{m,n}\rangle$, $\mathrm{Im}\,\langle f, \psi_{m,n}\rangle$, one plots $|\langle f, \psi_{m,n}\rangle|$ and $\tan^{-1}[\mathrm{Im}\,\langle f, \psi_{m,n}\rangle/\mathrm{Re}\,\langle f, \psi_{m,n}\rangle]$; the phase plot is particularly suited to the detection of singularities (Grossmann et al. (1987)). For real $f$, one can exploit $\hat f(-\xi) = \overline{\hat f(\xi)}$ to derive the following frame bounds (this is analogous to what was done in §2.4 for real $f$):

$$A\,\|f\|^2 \le \sum_{m,n\in\mathbb Z}|\langle f, \psi_{m,n}\rangle|^2 \le B\,\|f\|^2 \quad\text{for } f \text{ real}$$

with

$$A = \frac{2\pi}{b_0}\Big\{\frac12\inf_\xi\sum_{m\in\mathbb Z}\big[|\hat\psi(a_0^m\xi)|^2 + |\hat\psi(-a_0^m\xi)|^2\big] - R\Big\},$$

$$B = \frac{2\pi}{b_0}\Big\{\frac12\sup_\xi\sum_{m\in\mathbb Z}\big[|\hat\psi(a_0^m\xi)|^2 + |\hat\psi(-a_0^m\xi)|^2\big] + R\Big\},$$

where

$$R = \sum_{k\neq 0}\ \sum_{\epsilon=+,-}\Big[\mu_\epsilon\Big(\frac{2\pi}{b_0}k\Big)\,\mu_\epsilon\Big(-\frac{2\pi}{b_0}k\Big)\Big]^{1/2}$$

and

$$\mu_\epsilon(s) = \frac14\sup_\xi\sum_{m\in\mathbb Z}\big|\hat\psi(a_0^m\xi) + \epsilon\,\hat\psi(-a_0^m\xi)\big|\ \big|\hat\psi(a_0^m\xi + s) + \epsilon\,\hat\psi(-a_0^m\xi - s)\big|.$$

These can, of course, again be generalized to the multivoice case. Table 3.2 gives the frame bounds for $a_0 = 2$, several choices of $b_0$, and number of voices ranging from 2 to 4. In practice, the number of voices is often even higher.

TABLE 3.1
Frame bounds for wavelet frames based on the Mexican hat function $\psi(x) = \frac{2}{\sqrt3}\pi^{-1/4}(1 - x^2)e^{-x^2/2}$. The dilation parameter $a_0 = 2$ in all cases; $N$ is the number of voices.

N = 1
  b0        A         B        B/A
  .25    13.091    14.183     1.083
  .50     6.546     7.092     1.083
  .75     4.364     4.728     1.083
 1.00     3.223     3.596     1.116
 1.25     2.001     3.454     1.726
 1.50      .325     4.221    12.986

N = 2
  b0        A         B        B/A
  .25    27.273    27.278     1.0002
  .50    13.673    13.639     1.0002
  .75     9.091     9.093     1.0002
 1.00     6.768     6.870     1.015
 1.25     4.834     6.077     1.257
 1.50     2.609     6.483     2.485
 1.75      .517     7.276    14.061

N = 3
  b0        A         B        B/A
  .25    40.914    40.914     1.0000
  .50    20.457    20.457     1.0000
  .75    13.638    13.638     1.0000
 1.00    10.178    10.279     1.010
 1.25     7.530     8.835     1.173
 1.50     4.629     9.009     1.947
 1.75     1.747     9.942     5.691

N = 4
  b0        A         B        B/A
  .25    54.552    54.552     1.0000
  .50    27.276    27.276     1.0000
  .75    18.184    18.184     1.0000
 1.00    13.586    13.690     1.007
 1.25    10.205    11.616     1.138
 1.50     6.594    11.590     1.758
 1.75     2.928    12.659     4.324

TABLE 3.2
Frame bounds for wavelet frames based on the modulated Gaussian, $\psi(x) = \pi^{-1/4}\big(e^{-i\xi_0x} - e^{-\xi_0^2/2}\big)e^{-x^2/2}$, with $\xi_0 = \pi(2/\ln 2)^{1/2}$. The dilation constant $a_0 = 2$ in all cases; $N$ is the number of voices.

N = 2
  b0        A         B        B/A
  .5      6.019     7.820     1.299
 1.       3.009     3.910     1.230
 1.5      1.944     2.669     1.373
 2.       1.173     2.287     1.950
 2.5       .486     2.282     4.693

N = 3
  b0        A         B        B/A
  .5     10.295    10.467     1.017
 1.       5.147     5.234     1.017
 1.5      3.366     3.555     1.056
 2.       2.188     3.002     1.372
 2.5      1.175     2.977     2.534
 3.        .320     3.141     9.824

N = 4
  b0        A         B        B/A
  .5     13.837    13.846     1.0006
 1.       6.918     6.923     1.0008
 1.5      4.540     4.688     1.032
 2.       3.013     3.910     1.297
 2.5      1.708     3.829     2.242
 3.        .597     4.017     6.732

D. An example that is easy to implement. So far we have not addressed the question of how the wavelet coefficients $\langle f, \psi_{m,n}\rangle$ are computed in practice. In real life, $f$ is not given as a function, but in a sampled version. Computing the integrals $\int dx\ f(x)\,\overline{\psi_{m,n}(x)}$ then requires some quadrature formulas. For the smallest scales (most negative $m$) of interest, this will not involve many samples of $f$, and one can do the computation quickly. For larger scales, however, one faces huge integrals, which might considerably slow down the computation of the wavelet transform of any given function. Especially for on-line implementations, one should avoid having to compute these long integrals. A construction achieving this is the so-called "algorithme à trous" (Holschneider et al. (1989)), which uses an interpolation technique to avoid lengthy computations (for details, I refer to their paper). Here I propose an analogous example (although it is not "à trous"), by borrowing a leaf from multiresolution analysis and orthonormal bases (to which we will come back), i.e., by introducing an auxiliary function $\phi$. The basic idea is the following: suppose there exists a function $\phi$ so that

$$\psi(x) = \sum_k d_k\,\phi(x - k), \qquad (3.3.27)$$

$$\phi(x) = \sum_k c_k\,\phi(2x - k), \qquad (3.3.28)$$

where in each case only finitely many coefficients are different from zero.$^{11}$ (Such pairs of $\phi$, $\psi$ abound; an example is given below. The "algorithme à trous" corresponds to special $\phi$ for which $c_0 = 1$, all other even-indexed $c_{2n} = 0$.) Here $\phi$ does not have integral zero (but $\psi$ does!), and we will normalize $\phi$ so that $\int dx\ \phi(x) = 1$. Define, even though $\phi$ is not a wavelet, $\phi_{m,n}(x) = 2^{-m/2}\phi(2^{-m}x - n)$; we take $a_0 = 2$, $b_0 = 1$. Then it is clear that

$$\langle f, \psi_{m,n}\rangle = \sum_k d_k\,\langle f, \phi_{m,n+k}\rangle;$$

the problem of finding the wavelet coefficients is reduced to computing the $\langle f, \phi_{m,n}\rangle$ (finite combinations of which will give the $\langle f, \psi_{m,n}\rangle$). On the other hand,

$$\langle f, \phi_{m,n}\rangle = \frac{1}{\sqrt2}\sum_k c_k\,\langle f, \phi_{m-1,2n+k}\rangle,$$

so that the $\langle f, \phi_{m,n}\rangle$ can be computed recursively, working from the smallest scale (where they are easy to compute) to the largest scale. Everything is done by simple, finite convolutions. An example of a pair of functions satisfying (3.3.27), (3.3.28) is

$$\psi(x) = N\Bigl[-\tfrac12\,\phi(x+1) + \phi(x) - \tfrac12\,\phi(x-1)\Bigr],$$

$$\hat\phi(\xi) = \frac{1}{\sqrt{2\pi}}\,e^{-2i\xi}\Bigl(\frac{e^{i\xi}-1}{i\xi}\Bigr)^{4} = \frac{1}{\sqrt{2\pi}}\Bigl(\frac{\sin \xi/2}{\xi/2}\Bigr)^{4},$$

which corresponds to

$$\phi(x) = \begin{cases} \tfrac16\,(x+2)^3, & -2 \le x \le -1,\\ \tfrac23 - x^2(1 + x/2), & -1 \le x \le 0,\\ \tfrac23 - x^2(1 - x/2), & 0 \le x \le 1,\\ -\tfrac16\,(x-2)^3, & 1 \le x \le 2,\\ 0 & \text{otherwise}. \end{cases}$$

$N$ is a normalization constant chosen so that $\|\psi\| = 1$; one finds $N = 6\sqrt{70/1313}$. Figure 3.5a shows graphs of $\phi$, $\psi$; they are not unlike a Gaussian and its second derivative, plotted in Figure 3.5b for comparison. The function $\psi$ clearly satisfies (3.3.27) with $d_0 = N$, $d_{\pm 1} = -N/2$, all other $d_k = 0$, whereas

$$\hat\phi(2\xi) = \frac{1}{\sqrt{2\pi}}\Bigl(\frac{\sin \xi}{\xi}\Bigr)^{4} = \frac{1}{\sqrt{2\pi}}\Bigl(\frac{2\,\sin \xi/2\,\cos \xi/2}{\xi}\Bigr)^{4} = (\cos \xi/2)^4\,\hat\phi(\xi),$$

which implies

$$\phi(x) = \tfrac18\,\phi(2x+2) + \tfrac12\,\phi(2x+1) + \tfrac34\,\phi(2x) + \tfrac12\,\phi(2x-1) + \tfrac18\,\phi(2x-2),$$

or $c_0 = \tfrac34$, $c_{\pm 1} = \tfrac12$, $c_{\pm 2} = \tfrac18$, all other $c_k = 0$. For this $\psi$, $a_0 = 2$ and $b_0 = 1$, the frame bounds are $A = .73178$, $B = 1.77107$, corresponding to $B/A = 2.42022$; for $a_0 = 2$, $b_0 = .5$ we have $A = 2.33854$, $B = 2.66717$, and $B/A = 1.14053$ (using $b_0 = .5$ means that the recursion formulas, linking the $\psi_{m,n}$ with the $\phi_{m,n}$ and the $\phi_{m,n}$ with the $\phi_{m-1,n}$, have to be adapted, but this is easy). Here we have used only one voice. It is of course possible to choose several different $\psi^\nu$, corresponding to different $d_k^\nu$, that give rise to a multivoice scheme, closer to tight frames.

This concludes our example section; other examples are given in Daubechies (1990) (including one for which the estimates (3.3.21), (3.3.22) outperform (3.3.11), (3.3.12)). Many other examples can of course be constructed. The wavelets used in Mallat and Zhong (1990) are another example of the same type as our last one; in their case $\psi$ is chosen to be the first derivative of a function with non-zero integral (so that $\int dx\,\psi(x) = 0$ but $\int dx\,x\,\psi(x) \neq 0$).
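The recursion of this example can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's implementation: the function names `coarser` and `wavelet_coeffs` and the dictionary representation of finitely supported sequences are assumptions made here, and the coefficients $\langle f, \phi_{m,n}\rangle$ at the finest scale are assumed to be given already (for instance from samples of $f$). The values of $c_k$, $d_k$ and $N$ are the ones quoted in the text for the cubic spline example.

```python
import math

def phi(x):
    """Piecewise cubic phi of the example, supported on [-2, 2]."""
    if -2 <= x <= -1:
        return (x + 2) ** 3 / 6
    if -1 < x <= 0:
        return 2 / 3 - x * x * (1 + x / 2)
    if 0 < x <= 1:
        return 2 / 3 - x * x * (1 - x / 2)
    if 1 < x <= 2:
        return -(x - 2) ** 3 / 6
    return 0.0

# c_k from (3.3.28) and d_k from (3.3.27), as quoted in the text.
C = {-2: 1 / 8, -1: 1 / 2, 0: 3 / 4, 1: 1 / 2, 2: 1 / 8}
N = 6 * math.sqrt(70 / 1313)
D = {-1: -N / 2, 0: N, 1: -N / 2}

def coarser(s):
    """One recursion step: s maps n -> <f, phi_{m-1,n}>;
    the result maps n -> <f, phi_{m,n}> = 2^{-1/2} sum_k c_k s[2n+k]."""
    out = {}
    for j, value in s.items():
        for k, c in C.items():
            if (j - k) % 2 == 0:          # j = 2n + k for an integer n
                n = (j - k) // 2
                out[n] = out.get(n, 0.0) + c * value / math.sqrt(2)
    return out

def wavelet_coeffs(s):
    """s maps n -> <f, phi_{m,n}>; the result maps
    n -> <f, psi_{m,n}> = sum_k d_k s[n+k] (hypothetical helper name)."""
    out = {}
    for j, value in s.items():
        for k, d in D.items():
            out[j - k] = out.get(j - k, 0.0) + d * value
    return out

# Usage: start from (assumed) finest-scale coefficients and go one scale up.
finest = {n: 1.0 for n in range(-20, 21)}
scale1 = coarser(finest)
details1 = wavelet_coeffs(scale1)
```

Entries missing from a dictionary are treated as zero, so coefficients near the ends of a finite input are affected by truncation; only the interior values reproduce the infinite-sequence formulas.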

FIG. 3.5. An example that is easy to implement: graphs of $\phi$, $\psi$ (in a), and a comparison with a Gaussian and its second derivative (in b; panels labelled "Gaussian" and "Mexican Hat").

3.4. Frames for the windowed Fourier transform. The windowed Fourier transform of Chapter 2 can also be discretized. The natural discretization for $\omega$, $t$ in $g^{\omega,t}(x) = e^{i\omega x}\,g(x-t)$ is $\omega = m\omega_0$, $t = nt_0$, where $\omega_0$, $t_0 > 0$ are fixed, and $m$, $n$ range over $\mathbb{Z}$; the discretely labelled family is thus

$$g_{m,n}(x) = e^{im\omega_0 x}\,g(x - nt_0).$$

We can again seek answers to the same questions as in the wavelet case: for which choices of $g$, $\omega_0$, $t_0$ can a function $f$ be characterized by the inner products $\langle f, g_{m,n}\rangle$; when is it possible to reconstruct $f$ in a numerically stable way from these inner products; can an efficient algorithm be given to write $f$ as a linear combination of the $g_{m,n}$? The answers are again provided by the same abstract framework: stable numerical reconstruction of $f$ from its windowed Fourier coefficients

$$\langle f, g_{m,n}\rangle = \int dx\, f(x)\,e^{-im\omega_0 x}\,\overline{g(x - nt_0)}$$

is only possible if the $g_{m,n}$ constitute a frame, i.e., if there exist $A > 0$, $B < \infty$ so that

$$A\int dx\,|f(x)|^2 \;\le\; \sum_{m,n\in\mathbb{Z}} |\langle f, g_{m,n}\rangle|^2 \;\le\; B\int dx\,|f(x)|^2 .$$
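As a concrete illustration of the coefficients appearing in this inequality, the following sketch approximates $\langle f, g_{m,n}\rangle$ by a midpoint rule. It is a minimal example under stated assumptions: the window is taken real-valued (so no complex conjugate of $g$ is needed), the integration interval and step count are arbitrary choices, and the names `gauss` and `wft_coeff` are hypothetical.

```python
import cmath
import math

def gauss(x):
    """Gaussian with unit L2 norm, used here both as f and as window g."""
    return math.pi ** -0.25 * math.exp(-x * x / 2)

def wft_coeff(f, g, m, n, w0, t0, lo=-20.0, hi=20.0, steps=8000):
    """Midpoint-rule approximation of
    <f, g_{m,n}> = int f(x) exp(-i m w0 x) g(x - n t0) dx
    for a real-valued window g (assumption)."""
    h = (hi - lo) / steps
    total = 0j
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += f(x) * cmath.exp(-1j * m * w0 * x) * g(x - n * t0)
    return total * h

# Usage: coefficients of the Gaussian against its own lattice of translates
# and modulates, with w0 = pi and t0 = 1 (so w0 * t0 < 2 * pi).
c00 = wft_coeff(gauss, gauss, 0, 0, math.pi, 1.0)
c01 = wft_coeff(gauss, gauss, 0, 1, math.pi, 1.0)
c10 = wft_coeff(gauss, gauss, 1, 0, math.pi, 1.0)
```

For this Gaussian the integrals are known in closed form, $|\langle g, g_{0,n}\rangle| = e^{-(nt_0)^2/4}$ and $|\langle g, g_{m,0}\rangle| = e^{-(m\omega_0)^2/4}$, which gives a simple check of the quadrature.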

$$\cdots\Bigr)^{1/2} + \Bigl(\int_{|x|>T} dx\,|f(x)|^2\Bigr)^{1/2} + \epsilon\,\|f\|\,\Bigr]. \qquad (3.5.4)$$

REMARKS.

1. If $f$ satisfies (3.5.1) and (3.5.2), then the first two terms in the right-hand side of (3.5.3) are bounded by $2\delta\|f\|$; choosing $\epsilon = \delta$ then leads to $\bigl\| f - \sum_{(m,n)\in B_\delta} \langle f, \psi_{m,n}\rangle\,\widetilde{\psi_{m,n}} \bigr\| = O(\delta)$.

2. As $\epsilon \to 0$, $\# B_\epsilon(\Omega_0, \Omega_1; T) \to \infty$ (see proof, below): infinite precision is only possible if infinitely many $\langle f, \psi_{m,n}\rangle$ are used. $\Box$

Figure 3.7 gives a sketch of the set $B_\epsilon(\Omega_0, \Omega_1; T)$ for one particular value of $\epsilon$; the proof will show how we obtain this shape.

Proof. 1. We define the set $B_\epsilon$ as

$$B_\epsilon(\Omega_0, \Omega_1; T) = \{(m,n) \in \mathbb{Z}^2;\ m_0 \le m \le m_1,\ |nb_0| \le a_0^{-m}T + t\},$$

where $m_0$, $m_1$ and $t$, to be defined below, depend on $\Omega_0$, $\Omega_1$, $T$, and $\epsilon$. The points $(a_0^m n b_0, \pm a_0^{-m}\xi_0)$ corresponding to $(m,n)$ in such a set do indeed fill a shape like Figure 3.7.

FIG. 3.7. The set $B_\epsilon(\Omega_0, \Omega_1; T)$ of "wavelet lattice points" needed for an approximate reconstruction of $f$ if $f$ is localized mostly in $[-T, T]$ in time and in $[-\Omega_1, -\Omega_0] \cup [\Omega_0, \Omega_1]$ in frequency.

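2.

$$\Bigl\| f - \sum_{(m,n)\in B_\epsilon} \langle f, \psi_{m,n}\rangle\,\widetilde{\psi_{m,n}} \Bigr\| = \sup_{\|h\|=1} \Bigl| \langle f, h\rangle - \sum_{(m,n)\in B_\epsilon} \langle f, \psi_{m,n}\rangle\,\langle \widetilde{\psi_{m,n}}, h\rangle \Bigr|$$

$$\le \sup_{\|h\|=1} \sum_{m<m_0 \text{ or } m>m_1}\ \sum_{n\in\mathbb{Z}} \bigl[\,|\langle P_{\Omega_0,\Omega_1} f, \psi_{m,n}\rangle| + |\langle (1 - P_{\Omega_0,\Omega_1}) f, \psi_{m,n}\rangle|\,\bigr]\,|\langle \widetilde{\psi_{m,n}}, h\rangle|$$

$$+ \sup_{\|h\|=1} \sum_{m_0\le m\le m_1}\ \sum_{|nb_0| > a_0^{-m}T+t} \bigl[\,|\langle Q_T f, \psi_{m,n}\rangle| + |\langle (1 - Q_T) f, \psi_{m,n}\rangle|\,\bigr]\,|\langle \widetilde{\psi_{m,n}}, h\rangle|, \qquad (3.5.5)$$

where we have introduced $(Q_T f)(x) = f(x)$ for $|x| \le T$, $(Q_T f)(x) = 0$ otherwise, and $(P_{\Omega_0,\Omega_1} f)^\wedge(\xi) = \hat f(\xi)$ if $\Omega_0 \le |\xi| \le \Omega_1$, $(P_{\Omega_0,\Omega_1} f)^\wedge(\xi) = 0$ otherwise. Since the $\widetilde{\psi_{m,n}}$ constitute a frame with frame bounds $B^{-1}$, $A^{-1}$, we have, by Cauchy-Schwarz,

$$\sum_{m<m_0 \text{ or } m>m_1}\ \sum_{n\in\mathbb{Z}} |\langle (1 - P_{\Omega_0,\Omega_1}) f, \psi_{m,n}\rangle|\,|\langle \widetilde{\psi_{m,n}}, h\rangle| \le B^{1/2}\,\|(1 - P_{\Omega_0,\Omega_1}) f\|\; A^{-1/2}\,\|h\| = \sqrt{\frac{B}{A}}\,\Bigl[\int_{|\xi|<\Omega_0 \text{ or } |\xi|>\Omega_1} d\xi\,|\hat f(\xi)|^2\Bigr]^{1/2}$$

(because $\|h\| = 1$). Similarly,

$$\sum_{m_0\le m\le m_1}\ \sum_{|nb_0| > a_0^{-m}T+t} |\langle (1 - Q_T) f, \psi_{m,n}\rangle|\,|\langle \widetilde{\psi_{m,n}}, h\rangle| \le \sqrt{\frac{B}{A}}\,\Bigl[\int_{|x|>T} dx\,|f(x)|^2\Bigr]^{1/2}.$$

It remains to check that the other two terms in (3.5.5) can be bounded by $\sqrt{B/A}\;\epsilon\,\|f\|$.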

3. By the same Cauchy-Schwarz trick, we reduce the remaining two terms in (3.5.5) to

(3.5.6)

It is therefore sufficient to show that for appropriate $m_0$, $m_1$, $t$, each of the expressions between square brackets is smaller than $B\epsilon^2\|f\|^2/4$.

4. We tackle the first term in (3.5.6) by the same technique as in the proof of Proposition 3.3.2:

m<m o nEIL

or m > m l

I ' 1 + �

C1 (1 + e2 ) - ')' / 2 .

Substituting this into

(

0(;', -

(3. 5 .7) we find

(3. 5.7) � 271'bo C2 1 l Poo , oJ II 2 L (1 + e2 ) - ,),-\ /2

sup

L

�r 1

r

/'

(3. 5 . 8)

o o ::;I � I::; o, m<m o > 1, i.e., if > 1 / ,,(; we can choose, e.g. , The sum over e converges if � (1 + "(- 1 ). On the other hand, for no � I�I � n 1 , L 1 � (ao � ) 1 2 ( 1 - -\ ) (1 + a6m n6) - ')' ( I - -\ ) C3 m>m, £") ( . 9) C4 HO-2')' ( 1 - -\ ) ao-2mn ( 1 - -\ )

A

=

or m > m l

"(A

<
7. The essential infimum is defined by

$$\operatorname*{ess\,inf}_x f(x) = \inf\{\lambda;\ |\{x;\ f(x) \le \lambda\}| > 0\},$$

where $|A|$ stands for the Lebesgue measure of $A \subset \mathbb{R}$. The difference between $\operatorname{ess\,inf}_x f(x)$ and $\inf_x f(x)$ lies in the positive measure requirement: if $f(0) = 0$, $f(x) = 1$ for all $x \neq 0$, then $\inf_x f(x) = 0$, but $\operatorname{ess\,inf}_x f(x) = 1$, because $f \ge 1$ except on a set of measure zero, which "does not count." In fact we could be pedantic, and replace inf or sup by ess inf or ess sup in most of our conditions without invalidating them, but it is usually not worth it: in practice the expressions we are dealing with are continuous functions, for which inf and ess inf coincide. In (3.3.11) the situation is different: even for very smooth $\psi$, the sum $\sum_{m\in\mathbb{Z}} |\hat\psi(a_0^m\xi)|^2$ is discontinuous at $\xi = 0$, because $\hat\psi(0) = 0$. For the Haar function, for instance, $|\hat\psi(\xi)| = 4(2\pi)^{-1/2}\,|\xi|^{-1}\sin^2(\xi/4)$, and $\sum_{m\in\mathbb{Z}} |\hat\psi(2^m\xi)|^2 = (2\pi)^{-1}$ if $\xi \neq 0$, $0$ if $\xi = 0$. We therefore need to take the essential infimum; the infimum is zero.

8. This condition implies both the boundedness of $\sum_{m\in\mathbb{Z}} |\hat\psi(a_0^m\xi)|^2$ and the decay of $\beta(s)$:

$$\sum_{m\in\mathbb{Z}} |\hat\psi(a_0^m\xi)|^2 \le \sup_{1\le|\zeta|\le a_0} \sum_{m\in\mathbb{Z}} |\hat\psi(a_0^m\zeta)|^2 \le C^2 a_0^{2\alpha}\Bigl[\sum_{m=-\infty}^{0} a_0^{2m\alpha} + \sum_{m=1}^{\infty} a_0^{2m\alpha}\,(1 + a_0^{2m})^{-\gamma}\Bigr] < \infty$$


and

L

/.,p(ab" O / /.,p(ab" � + s) / (3 (s) = sup 1 :Sl e l :Sao mEZ -1 :::; 2 sup aD L ab"o (1 + /ab"2� + s/ 2 ) -b-o ) /2 1 :Sl e l :Sao m = _ oo + [ (1 + /ab" � / 2 ) ( 1 + /ab" � + s/ 2 )] -b-o ) /2

C

{

�

}.

In the first term we can use that, for l s I � /ao� + s/ � l s I - 1 � J.;l , hence (1 + /ab" � + S/2) - 1 :::; 4(1 + /s/2)-\ for l s I :::; (1 + /ao � + S/2)- 1 :::; 1 :::; 5(1 + /s/2)- 1 . It follows that the first term can be bounded by + /s/2)-b -o ) /2 as soon as a > 0, 'Y > a. For the second term we use that supX , Y E lNt.. 1!]) (1 + y2) [1 + ( x - y )2] - 1 [1 + ( x + y )2] - 1 00 to + /s/2)-b-o ) ( 1 -f ) /2 E := o (l + /ao�/2) -fb-o ) /2 , bound the sum by where 0 f 1 is arbitrary. Since 1 :::; / � / :::; ao, this can be bounded by + /s/2)-b-o ) ( 1 -f ) / 2 if 'Y > a . We have therefore, for 0 P 'Y - a ,

2,

and

C'( l

2,

then this infinite sum is uniformly bounded in �, and we can choose so that the whole right hand-side of the inequality is ::;

2f. 10. Beware of a mistake in the example on pp. 988-989 of Daubechies (1990). The formula for (hoo)'" should read (hoo)'" = 2::;:'0 fj 'l/Jj O , and leads to the conclusion that hoo�£p(lR) for small p. I would like to thank Chui and Shi (1993) for pointing this out to me. 11. This is slightly different from multiresolution analysis, where (3. 3 . 2 7) would also contain a scaling factor 2: 'l/J(X) = L dk ¢ ( 2x - k) . k

12. One can also construct tight frames where neither $g$ nor $\hat g$ have compact support. It is, for instance, possible to construct a tight frame in which both $g$ and $\hat g$ have exponential decay. One way of doing this is to start from any windowed Fourier frame, with window function $g$, and to define the function $G = (F^*F)^{-1/2} g$, where $F^*F = \sum_{m,n} \langle\,\cdot\,, g_{m,n}\rangle\, g_{m,n}$. The functions $G_{m,n}(x) = e^{im\omega_0 x}\,G(x - nt_0)$ (same $\omega_0$, $t_0$ as in the $g_{m,n}$) then constitute a tight frame. One has indeed

$$\sum_{m,n} |\langle f, G_{m,n}\rangle|^2 = \sum_{m,n} |\langle (F^*F)^{-1/2} f, g_{m,n}\rangle|^2 = \langle (F^*F)\,(F^*F)^{-1/2} f,\ (F^*F)^{-1/2} f\rangle = \|f\|^2 .$$

The explicit computation of $G$ can be carried out by a series expansion for $(F^*F)^{-1/2}$ analogous to the series for $(F^*F)^{-1}$ in §3.2. If $g$ and $\hat g$ have exponential decay (in particular if $g$ is Gaussian), then the resulting $G$ and its Fourier transform have exponential decay as well. For more details, plots of examples, and an interesting application, see Daubechies, Jaffard, and Journé (1991).


13. The proof in Bacry, Grossmann, and Zak (1975) uses the Zak transform, which we introduce and use in Chapter 4. Full details of their argument are also given in Daubechies (1990). It is interesting that their proof can be extended to show that the $g_{m,n}$ still span all of $L^2(\mathbb{R})$ if one (any one) of the $g_{m,n}$ is deleted, but not if two functions are deleted.

14. These exact formulas use again the Zak transform. Their derivation is given in Daubechies and Grossmann (1988); it is also reviewed in Daubechies (1990).

15. In some applications, Bastiaans' result is interpreted (correctly) to mean that one should "oversample" (i.e., choose $\omega_0 t_0 < 2\pi$) in order to restore stability. Nevertheless, even in such an "oversampled" regime, sometimes Bastiaans' pathological dual function is still used (see, for instance, Porat and Zeevi (1988)). If $\omega_0 t_0 = \pi$, then the $g_{m,n}$ can be split into two families, $g_{m,2n}$ and $g_{m,2n+1}$, each of which can be considered to be a family of Gaussian windowed Fourier functions with $\omega_0 t_0 = 2\pi$, one generated by $g$ itself, the other by $g(x - t_0)$. For both families, the badly convergent (nonconvergent in $L^2$) expansion (3.4.6) can be written, and a function can be viewed as the average of the two expansions. This is of course true in the sense of distributions, and in practice reasonable convergence (probably due to cancellations) seems to be achieved (using a truncated version of Bastiaans' $\tilde g$; private communication by Zeevi (1989)), but far better time-frequency localization, and I suspect better convergence, in practice, would be achievable by using the optimal dual function $\tilde g$ (corresponding to $\lambda = .5$ in Figure 3.6 in this case).

16. This symmetry is certainly not necessary.

17. It is in fact a true hyperbolic lattice with respect to the hyperbolic geometry on both the positive and negative frequency half plane.

18. Note, however, that Y. Meyer has proved recently that the $\alpha_{m,n}(f)$ which are local extrema in the construction above do not suffice to characterize $f$ completely.

CHAPTER 4

Time-Frequency Density and Orthonormal Bases

This chapter splits naturally into two parts. The first section discusses the role of time-frequency density in wavelet transforms versus windowed Fourier transforms. In particular, for the windowed Fourier transform, orthonormal bases are possible only at the Nyquist density, but no such restriction exists for the wavelet case. This leads naturally to the second section, which discusses different possibilities for orthonormal bases in the two cases.

4.1. The role of time-frequency density in wavelet and windowed Fourier frames. We start with the windowed Fourier case. We mentioned in §3.4.1 that a family of functions $(g_{m,n};\ m, n \in \mathbb{Z})$,

$$g_{m,n}(x) = e^{im\omega_0 x}\,g(x - nt_0), \qquad (4.1.1)$$

cannot be a frame, whatever the choice of $g$, if $\omega_0 \cdot t_0 > 2\pi$. In fact, for any choice of $g \in L^2(\mathbb{R})$, one can find $f \in L^2(\mathbb{R})$ so that $f \neq 0$ but $\langle f, g_{m,n}\rangle = 0$ for all $m, n \in \mathbb{Z}$. If, for instance, $\omega_0 = 2\pi$, $t_0 = 2$, then such a function $f$ is easy to construct: $\langle f, g_{m,n}\rangle = 0$ for all $m, n \in \mathbb{Z}$ leads to

$$0 = \int dx\, e^{2\pi i m x}\, f(x)\,\overline{g(x - 2n)} = \int_0^1 dx\, e^{2\pi i m x} \sum_{\ell\in\mathbb{Z}} f(x+\ell)\,\overline{g(x + \ell - 2n)},$$

so that it is sufficient to find $f \neq 0$ for which $\sum_{\ell\in\mathbb{Z}} f(x+\ell)\,\overline{g(x+\ell-2n)} = 0$. Define now, for $0 \le x < 1$, $\ell \in \mathbb{Z}$, $f(x+\ell) = (-1)^\ell\,\overline{g(x - \ell - 1)}$. Clearly, $\int_{-\infty}^{\infty} dx\,|f(x)|^2 = \int_{-\infty}^{\infty} dx\,|g(x)|^2$, so that $f \in L^2(\mathbb{R})$ and $f \neq 0$. However,

$$\sum_{\ell\in\mathbb{Z}} f(x+\ell)\,\overline{g(x+\ell-2n)} = \sum_{\ell\in\mathbb{Z}} (-1)^\ell\,\overline{g(x-\ell-1)}\;\overline{g(x+\ell-2n)},$$

which turns into its negative upon the substitution $\ell = 2n - \ell' - 1$, and therefore equals zero. The same construction can be used for any other pair $\omega_0$, $t_0$ with product $4\pi$; a generalization of this construction exists if $\omega_0 \cdot t_0 > 2\pi$ and $(2\pi)^{-1}\omega_0 t_0$ is rational (see Daubechies (1990), p. 978). If $\omega_0 t_0 (2\pi)^{-1}$ is larger than 1 but irrational, then I know of no explicit construction for $f \neq 0$, $f \perp g_{m,n}$.
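The construction for $\omega_0 = 2\pi$, $t_0 = 2$ is easy to check numerically. The sketch below is illustrative only: it assumes a real Gaussian window (so the complex conjugates can be dropped), truncates the integrals to a finite interval, and uses a midpoint rule whose nodes repeat with period 1 so that the pairwise cancellation $\ell \leftrightarrow 2n - \ell - 1$ is reproduced term by term. The helper names are hypothetical.

```python
import cmath
import math

def g(x):
    """Real Gaussian window with unit L2 norm (assumption for this sketch)."""
    return math.pi ** -0.25 * math.exp(-x * x / 2)

def f(y):
    """f(x + l) = (-1)^l g(x - l - 1) for 0 <= x < 1, l integer."""
    l = math.floor(y)
    sign = -1.0 if l % 2 else 1.0
    return sign * g(y - 2 * l - 1)

def inner(m, n, half_width=12, per_unit=200):
    """Midpoint-rule approximation of <f, g_{m,n}> with w0 = 2*pi, t0 = 2."""
    h = 1.0 / per_unit
    total = 0j
    for i in range(2 * half_width * per_unit):
        y = -half_width + (i + 0.5) * h
        total += f(y) * cmath.exp(-2j * math.pi * m * y) * g(y - 2 * n)
    return total * h

def norm_sq(half_width=12, per_unit=200):
    """Midpoint-rule approximation of the squared L2 norm of f."""
    h = 1.0 / per_unit
    return sum(f(-half_width + (i + 0.5) * h) ** 2
               for i in range(2 * half_width * per_unit)) * h

# Usage: f is not zero, yet all sampled coefficients vanish to rounding error.
largest = max(abs(inner(m, n)) for m in range(-2, 3) for n in range(-1, 2))
size = norm_sq()
```

Here `size` comes out close to 1 (the norm of $g$), while `largest` is at the level of rounding and truncation error, in line with $f \neq 0$ and $f \perp g_{m,n}$.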

The existence of such an $f \perp g_{m,n}$ was proved in Rieffel (1981), using arguments involving von Neumann algebras.$^1$ If only "reasonably nice" $g$ are considered (i.e., $g$ that have some decay in time as well as in frequency), and if we are only interested in proving that the $g_{m,n}$ cannot constitute a frame (which is weaker than proving that there exists $f \perp g_{m,n}$), then the following very elegant argument by H. Landau does the trick. If $|g(x)| \le C(1 + x^2)^{-\alpha/2}$, $|\hat g(\xi)| \le C(1 + \xi^2)^{-\alpha/2}$, and the $g_{m,n}$ constitute a frame, then Theorem 3.5.2 tells us that functions $f$ which are "essentially localized" in $[-T, T] \times [-\Omega, \Omega]$ in the time-frequency plane can be reconstructed, up to a small error, by using only the $\langle f, g_{m,n}\rangle$ with $|m\omega_0| \lesssim \Omega$, $|nt_0| \lesssim T$. More precisely, if $f$ is bandlimited to $[-\Omega, \Omega]$ and if $\bigl[\int_{|x|\ge T} dx\,|f(x)|^2\bigr]^{1/2} \le \epsilon\,\|f\|$, then

:s;

I

(I,

II

I mwo l

-

�

+W < I �W O I n to l � T + t
$\ldots > 0$ is arbitrary. Consider the $b$-dependent families $F(b) = \{\psi^b_{m,n};\ m, n \in \mathbb{Z}\}$. As $b$ changes, the "density" of the associated lattice changes as well. (Note that $a_0$ and $\psi$ are the same for all the $F(b)$.) If any representation like Figure 4.1 held also for wavelets, then we would expect, since $F(1)$ is an orthonormal basis for $L^2(\mathbb{R})$, that $F(b)$ would not span all of $L^2(\mathbb{R})$ if $b > 1$ ("not enough" vectors), and that $F(b)$ would not be linearly independent ("too many" vectors) if $b < 1$. Yet one can prove (see Theorem 2.10 in Daubechies (1990); we also sketch this proof later in this chapter) that for some $\epsilon > 0$, $F(b)$ is a Riesz basis for $L^2(\mathbb{R})$, for any $b \in\, ]1-\epsilon, 1+\epsilon[$. This example shows conclusively that it is not always safe to apply "time-frequency space density intuition" to families of wavelets.

4.2. Orthonormal bases.

4.2.1. Orthonormal wavelet bases. The conclusion of the last paragraph seems a rather negative point for wavelets: no clean time-frequency density concept. In this section we emphasize a much more positive aspect: the existence of orthonormal wavelet bases with good time-frequency localization. Historically, the first orthonormal wavelet basis is the Haar basis, constructed long before the term "wavelet" was coined. The basic wavelet is then, as we already saw in Chapter 1,

$$\psi(x) = \begin{cases} 1, & 0 \le x < \tfrac12,\\ -1, & \tfrac12 \le x < 1,\\ 0 & \text{otherwise}. \end{cases} \qquad (4.2.1)$$

We showed in §1.6 that the $\psi_{m,n}(x) = 2^{-m/2}\,\psi(2^{-m}x - n)$ constitute an orthonormal basis for $L^2(\mathbb{R})$. The Haar function is not continuous, and its Fourier transform decays only like $|\xi|^{-1}$, corresponding to bad frequency localization. It may therefore seem that this basis is no better than the windowed Fourier basis

$$g_{m,n}(x) = e^{2\pi i m x}\,g(x - n), \qquad (4.2.2)$$

with

$$g(x) = \begin{cases} 1, & 0 \le x < 1,\\ 0 & \text{otherwise}, \end{cases}$$

which is also an orthonormal basis for $L^2(\mathbb{R})$. However, the Haar basis already has advantages that this windowed Fourier basis does not have. It turns out, for instance, that the Haar basis is an unconditional basis for $L^p(\mathbb{R})$, $1 < p < \infty$, whereas the windowed Fourier basis (4.2.2) is not if $p \neq 2$.$^4$ We will come back to this in Chapter 9.

For the analysis of smoother functions, the discontinuous Haar basis is ill suited. An orthonormal wavelet basis with time-frequency properties complementary to the Haar basis is given by the Littlewood-Paley basis,

$$\hat\psi(\xi) = \begin{cases} (2\pi)^{-1/2}, & \pi \le |\xi| \le 2\pi,\\ 0 & \text{otherwise}, \end{cases}$$

or

$$\psi(x) = (\pi x)^{-1}\,(\sin 2\pi x - \sin \pi x).$$

It is easy to check that the $\psi_{m,n}(x) = 2^{-m/2}\,\psi(2^{-m}x - n)$ constitute indeed an orthonormal basis for $L^2(\mathbb{R})$. We have $\|\psi_{m,n}\| = 1$ for all $m, n \in \mathbb{Z}$, and

$$\sum_{m,n} |\langle f, \psi_{m,n}\rangle|^2 = \frac{1}{2\pi}\sum_{m,n} 2^{-m}\,\Bigl|\int_0^{2\pi} d\zeta\; e^{in\zeta}\,\bigl[\hat f(2^{-m}\zeta)\,\chi_{[\pi,2\pi]}(\zeta) + \hat f(2^{-m}(\zeta - 2\pi))\,\chi_{[0,\pi]}(\zeta)\bigr]\Bigr|^2$$

$$= \sum_m 2^{-m}\int_0^{2\pi} d\zeta\; \bigl|\hat f(2^{-m}\zeta)\,\chi_{[\pi,2\pi]}(\zeta) + \hat f(2^{-m}(\zeta - 2\pi))\,\chi_{[0,\pi]}(\zeta)\bigr|^2$$

$$= \sum_m 2^{-m}\int_{\pi\le|\zeta|\le 2\pi} d\zeta\; |\hat f(2^{-m}\zeta)|^2 = \int d\xi\,|\hat f(\xi)|^2 = \|f\|^2 .$$

By Proposition 3.2.1, this implies that $\{\psi_{m,n};\ m, n \in \mathbb{Z}\}$ is an orthonormal basis for $L^2(\mathbb{R})$. The decay of $\psi(x)$ is as bad ($\psi(x) \sim |x|^{-1}$ for $x \to \infty$) as that of the orthonormal windowed Fourier basis used in the Shannon expansion (2.1.1); both have excellent frequency localization, since their Fourier transforms are compactly supported.

In the last ten years, several orthonormal wavelet bases for $L^2(\mathbb{R})$ have been constructed which share the best features of both the Haar basis and the Littlewood-Paley basis: these new constructions have excellent localization properties in both time and frequency. The first construction is due to Stromberg (1982); his wavelets have exponential decay and are in $C^k$ ($k$ arbitrary but finite). Unfortunately, his construction was little noticed at the time. The next example is the Meyer basis mentioned above (Meyer (1985)), in which $\hat\psi$ is compactly supported (hence $\psi \in C^\infty$) and $\hat\psi \in C^k$ ($k$ arbitrary, may be $\infty$). Unaware at that time of Stromberg's construction, Y. Meyer actually found this basis while trying to prove a wavelet equivalent of Theorem 4.1.1, which would have shown the non-existence of these nice wavelet bases! Soon after, Tchamitchian (1987) constructed the first example of what we shall call biorthogonal wavelet bases (see §8.3). The next year, Battle (1987) and Lemarié (1988) used very different methods to construct identical families of orthonormal wavelet bases with exponentially decaying $\psi \in C^k$ ($k$ arbitrary but finite). (Battle was inspired by techniques in quantum field theory; Lemarié reused some of Tchamitchian's computations.) Despite having similar properties, the Battle-Lemarié wavelets are different from the Stromberg wavelets. In the fall of 1986, S. Mallat and Y. Meyer developed the "multiresolution analysis" framework, which gave a satisfactory explanation for all these constructions, and provided a tool for the construction of yet other bases. But this is for later chapters. Before we get into multiresolution analysis, let us review the construction of Meyer's wavelet basis.

The construction of $\psi$ is similar to the tight frame in §3.3.5.A. That frame had redundancy 2 (twice "too many" vectors). To get rid of this redundancy, Meyer's construction combines positive and negative frequencies (reducing a pair of functions to a single function). In order to achieve orthonormality, some clever

'I/J by (27r) - 1 /2 eif./2 sin [� v ( 2� 1 � 1 - 1 ) ] , 211" 1 or if

1 m - m' l = 1 , I n - nI l > 1 .

(4.2. 13)

Sullivan et al. (1987) present arguments explaining both the existence of Wilson's basis and its exponential decay. In both papers there are infinitely many functions $\phi_m$; as $m$ tends to $\infty$, the $\phi_m$ tend to a limit function $\phi_\infty$.

The moral of Wilson's construction is that orthonormal bases with good phase space localization seem possible after all if bi-modal functions as in (4.2.13) are used. Note that many of our wavelet constructions, frames as well as the orthonormal bases we saw earlier, have these two peaks in frequency (one for $\xi > 0$, one for $\xi < 0$). In the case of frames, or for the continuous wavelet transform, the two frequency regions can be separated (corresponding to one-frequency-peak functions; see §3.3.5.A or (2.4.9)), but this does not seem to be the case for orthonormal bases. We will see later that the two frequency peaks of $\hat\psi$ need not be symmetric: there even exist examples with $\|\psi\|^{-2}\int_{\xi\le 0} d\xi\,|\hat\psi(\xi)|^2$ arbitrarily small (but strictly positive!). However, there is no example, so far, of reasonably well-localized functions $\psi^\pm$ with support$(\hat\psi^\pm) \subset \mathbb{R}^\pm$ and such that the $\{\psi^\epsilon_{m,n};\ m, n \in \mathbb{Z},\ \epsilon = + \text{ or } -\}$ constitute an orthonormal basis for $L^2(\mathbb{R})$, corresponding to wavelet bases with only one "peak" in frequency. (Equivalently, there is no example of a reasonably smooth function $\eta = \hat\psi^+$ such that the functions $2^{m/2}\exp(2\pi i\,2^m n\xi)\,\eta(2^m\xi)$, $m, n \in \mathbb{Z}$, are an orthonormal basis of $L^2(\mathbb{R}_+)$.) It is believed, without proof so far, that no such basis exists.$^7$

But let us return to Wilson bases. If one gives up the restriction (4.2.13) (if $f_m$, $\phi_m$ have exponential decay, then these quantities decay exponentially fast in $|m - m'|$, $|n - n'|$ anyway), then Wilson's ansatz (4.2.11), (4.2.12) can be dramatically simplified. In Daubechies, Jaffard, and Journé (1991), a construction is proposed that uses only one function $\phi$. Explicitly, this construction defines

$$g_{m,n}(x) = f_m(x - n), \qquad m \in \mathbb{N}\setminus\{0\},\ n \in \mathbb{Z}, \qquad (4.2.14)$$

with

$$\hat f_1(\xi) = \hat\phi(\xi),$$
$$\hat f_2(\xi) = \tfrac{1}{\sqrt2}\,[\hat\phi(\xi - 2\pi) - \hat\phi(\xi + 2\pi)],$$
$$\hat f_3(\xi) = \tfrac{1}{\sqrt2}\,[\hat\phi(\xi - 2\pi) + \hat\phi(\xi + 2\pi)]\,e^{i\xi/2},$$
$$\hat f_4(\xi) = \tfrac{1}{\sqrt2}\,[\hat\phi(\xi - 4\pi) + \hat\phi(\xi + 4\pi)],$$
$$\hat f_5(\xi) = \tfrac{1}{\sqrt2}\,[\hat\phi(\xi - 4\pi) - \hat\phi(\xi + 4\pi)]\,e^{i\xi/2}, \quad \text{etc.},$$

or

$$\hat f_{2\ell+\sigma}(\xi) = \tfrac{1}{\sqrt2}\,[\hat\phi(\xi - 2\pi\ell) + (-1)^{\ell+\sigma}\,\hat\phi(\xi + 2\pi\ell)]\,e^{i\sigma\xi/2}, \qquad (4.2.15)$$

with $\ell \in \mathbb{N}$, $\sigma = 0$ or $1$, and $\ell = 0$, $\sigma = 0$ excluded. The result of all these phase factors and alternating signs is that $f_1(x) = \phi(x)$ and that, up to a constant factor of modulus 1, $f_{2\ell+\sigma}(x)$ equals $\sqrt2\,\phi(x + \tfrac{\sigma}{2})\cos(2\pi\ell x)$ if $\ell + \sigma$ is even and $\sqrt2\,\phi(x + \tfrac{\sigma}{2})\sin(2\pi\ell x)$ if $\ell + \sigma$ is odd. If we relabel the $g_{m,n}$ in (4.2.14) by defining $G_{m,n}$, $m \in \mathbb{N}$, $n \in \mathbb{Z}$ as

$$G_{0,n} = g_{1,n}, \qquad G_{\ell,2n+\sigma} = g_{2\ell+\sigma,n},$$

then

$$G_{0,n}(x) = \phi(x - n), \qquad (4.2.16)$$

and for $\ell > 0$,

$$G_{\ell,n}(x) = \sqrt2\,\phi\bigl(x - \tfrac{n}{2}\bigr)\begin{cases} \cos 2\pi\ell x & \text{if } \ell + n \text{ is even},\\ \sin 2\pi\ell x & \text{if } \ell + n \text{ is odd}. \end{cases} \qquad (4.2.17)$$

This construction (as well as others mentioned below) shows therefore that the key to obtaining good time-frequency localization ($\phi$ can be chosen so that $\phi$, $\hat\phi$ have exponential decay) and orthonormality in the windowed Fourier framework is to use sines and cosines (alternated in an appropriate way) rather than complex exponentials.

But let us get back to (4.2.14), (4.2.15) and show how this construction can lead to an orthonormal basis. As usual, we only need to check $\|g_{m,n}\| = 1$ and $\sum_{m=1}^{\infty}\sum_{n\in\mathbb{Z}} |\langle h, g_{m,n}\rangle|^2 = \|h\|^2$. We immediately have $\|g_{1,n}\| = \|f_1\| = \|\phi\|$, and for $m > 1$ ($m = 2\ell + \sigma$, $\ell > 0$)

$$\|f_m\|^2 = \|f_{2\ell+\sigma}\|^2 = \tfrac12\Bigl[\,2\|\phi\|^2 + 2(-1)^{\ell+\sigma}\int d\xi\;\hat\phi(\xi)\,\hat\phi(\xi + 4\pi\ell)\Bigr]$$

(we assume $\hat\phi$ is real, for simplicity). Hence $\|g_{m,n}\| = 1$ for all $m$, $n$ if

$$\int d\xi\;\hat\phi(\xi)\,\hat\phi(\xi + 4\pi\ell) = \delta_{\ell 0}. \qquad (4.2.18)$$

On the other hand,

$$\sum_{m=1}^{\infty}\sum_{n\in\mathbb{Z}} |\langle h, g_{m,n}\rangle|^2 = 2\pi\sum_{m=1}^{\infty}\sum_{k\in\mathbb{Z}}\int d\xi\;\hat h(\xi)\,\overline{\hat h(\xi + 2\pi k)}\;\overline{\hat f_m(\xi)}\,\hat f_m(\xi + 2\pi k).$$

This equals $\|h\|^2$ if

$$\sum_{m=1}^{\infty} \overline{\hat f_m(\xi)}\,\hat f_m(\xi + 2\pi k) = (2\pi)^{-1}\,\delta_{k0}. \qquad (4.2.19)$$

A few simple manipulations lead to

$$\sum_{m=1}^{\infty} \overline{\hat f_m(\xi)}\,\hat f_m(\xi + 2\pi k) = \hat\phi(\xi)\,\hat\phi(\xi + 2\pi k) + \tfrac12\sum_{\ell\neq0} \hat\phi(\xi + 2\pi\ell)\,\hat\phi(\xi + 2\pi\ell + 2\pi k)\,[1 + (-1)^k]$$
$$+ \tfrac12\sum_{\ell\neq0} (-1)^\ell\,\hat\phi(\xi - 2\pi\ell)\,\hat\phi(\xi + 2\pi\ell + 2\pi k)\,[1 - (-1)^k]. \qquad (4.2.20)$$

If $k$ is odd, $k = 2k' + 1$, then this reduces to

$$\sum_{\ell\in\mathbb{Z}} (-1)^\ell\,\hat\phi(\xi - 2\pi\ell)\,\hat\phi(\xi + 2\pi(\ell + 2k' + 1)), \qquad (4.2.21)$$

which is zero, since the substitution $\ell' = -(\ell + 2k' + 1)$ transforms (4.2.21) into its negative. If $k$ is even, $k = 2k'$, then (4.2.19) reduces to

$$\sum_{\ell\in\mathbb{Z}} \hat\phi(\xi + 2\pi\ell)\,\hat\phi(\xi + 2\pi\ell + 4\pi k') = (2\pi)^{-1}\,\delta_{k',0}. \qquad (4.2.22)$$

The $\{g_{m,n};\ m \in \mathbb{N}\setminus\{0\},\ n \in \mathbb{Z}\}$ therefore constitute an orthonormal basis if $\hat\phi$ is a real function satisfying (4.2.18) and (4.2.22). Note that integrating (4.2.22) over $\xi$, between $0$ and $2\pi$, automatically leads to (4.2.18), so that we really have only the single condition (4.2.22) to satisfy. This turns out to be easy: we can take for instance support $\hat\phi \subset [-2\pi, 2\pi]$, so that (4.2.22) is automatically satisfied for $k' \neq 0$, and we only need to check $\sum_{\ell\in\mathbb{Z}} \hat\phi(\xi + 2\pi\ell)^2 = (2\pi)^{-1}$. This is true if, e.g.,

$$\hat\phi(\xi) = \begin{cases} (2\pi)^{-1/2}\,\sin\bigl[\tfrac{\pi}{2}\,\nu\bigl(\tfrac{\xi}{2\pi} + 1\bigr)\bigr], & -2\pi \le \xi \le 0,\\ (2\pi)^{-1/2}\,\cos\bigl[\tfrac{\pi}{2}\,\nu\bigl(\tfrac{\xi}{2\pi}\bigr)\bigr], & 0 \le \xi \le 2\pi,\\ 0 & \text{otherwise}, \end{cases}$$

with $\nu$ as in (4.2.4). If $\nu$ is $C^\infty$, then the $f_m$ have decay faster than any inverse polynomial, but, as for the Meyer basis, the numerical decay may be slow. Faster decay for the $f_m$ can be obtained with noncompactly supported $\hat\phi$. To construct such a $\hat\phi$ satisfying (4.2.22), we can again use the Zak transform, now normalized so that

$$(Zh)(s,t) = (4\pi)^{1/2}\sum_{\ell\in\mathbb{Z}} e^{2\pi i t\ell}\,h(4\pi(s - \ell)).$$

With this normalization, $Z$ is again unitary from $L^2(\mathbb{R})$ to $L^2([0,1]^2)$. It is not hard to check that (4.2.22) is equivalent to

$$|(Z\hat\phi)(s,t)|^2 + |(Z\hat\phi)(s + \tfrac12, t)|^2 = 2. \qquad (4.2.23)$$

(Full details are given in Daubechies, Jaffard, and Journé (1991).) This suggests the following technique for constructing $\phi$:

• Take any $h$ such that

$$0 < \alpha \le |Z\hat h(s,t)|^2 + |Z\hat h(s + \tfrac12, t)|^2 \le \beta < \infty; \qquad (4.2.24)$$

• Define $\phi$ by

$$Z\hat\phi(s,t) = \sqrt2\;\frac{Z\hat h(s,t)}{\bigl[\,|Z\hat h(s,t)|^2 + |Z\hat h(s + \tfrac12, t)|^2\,\bigr]^{1/2}}. \qquad (4.2.25)$$

If $h$ and $\hat h$ both have exponential decay, then $\phi$ turns out to have exponential decay as well. Figure 4.4 shows the graph of $\phi$ and $\hat\phi$ when $h$ is a Gaussian. (Gaussians do indeed satisfy (4.2.24).) An interesting observation is that (4.2.23) is exactly equivalent to the requirement that the $\phi_{m,n}(x) = e^{2\pi i m x}\,\phi(x - \tfrac{n}{2})$, or equivalently, the $\psi_{m,n}(\xi) = e^{\pi i n \xi}\,\hat\phi(\xi - m)$, with $m, n \in \mathbb{Z}$, constitute a tight frame (with necessarily redundancy 2) for $L^2(\mathbb{R})$. The construction (4.2.25) can then be interpreted as the transition from a general frame, generated by $h$, to a tight frame, by application of $(F^*F)^{-1/2}$ (see note 11 after Chapter 3, or Daubechies, Jaffard, and Journé (1991)). This Wilson basis can therefore be viewed as the result of a clever "weeding" process on a (tight) frame with "twice too many" elements.

Many variations on this Wilson scheme are possible. Laeng (1990) has constructed an extension of the above scheme in which the frequency spacing need not be as regular as here. Auscher (1990) has reformulated the whole construction: starting directly from (4.2.16), (4.2.17) as an ansatz, he derives all the results without use of the Fourier transform, and constructs different examples. In particular, he obtains examples where, in the notations of (4.2.17), the "window" $\phi$ is compactly supported, which is very useful in applications. (The decay in frequency is less crucial, as long as it is "reasonable.") These examples can also be viewed as the result of a "weeding" procedure on the tight frames with redundancy 2 obtained by taking $\omega_0 t_0 = \pi$ in §3.4.4.A. Other windowed Fourier bases using cosines and sines rather than complex exponentials, and leading to good time-frequency localization, have been found by Malvar (1990) and Coifman and Meyer (1990). Malvar's paper again uses alternating sines and cosines; he presents applications of his construction to speech coding.
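The alternation of cosines and sines in (4.2.16), (4.2.17) can be made concrete with a small numerical experiment. The sketch below is illustrative and rests on an assumption that is not taken from the text: the window is the rectangular function $\phi = \sqrt2\,\chi_{[-1/4,1/4)}$, chosen only because orthonormality of the resulting family can be verified by hand (cosines and sines of the right parity on each interval of length $\tfrac12$). It is compactly supported but has poor frequency decay, unlike the smooth windows discussed above. The function names are hypothetical.

```python
import math

def phi(x):
    """Assumed rectangular window: sqrt(2) on [-1/4, 1/4), zero elsewhere."""
    return math.sqrt(2.0) if -0.25 <= x < 0.25 else 0.0

def G(l, n, x):
    """Family (4.2.16), (4.2.17): G_{0,n}(x) = phi(x - n) and, for l > 0,
    sqrt(2) phi(x - n/2) times cos(2 pi l x) or sin(2 pi l x),
    depending on the parity of l + n."""
    if l == 0:
        return phi(x - n)
    trig = math.cos if (l + n) % 2 == 0 else math.sin
    return math.sqrt(2.0) * phi(x - n / 2) * trig(2 * math.pi * l * x)

def inner(a, b, lo=-3.0, hi=3.0, per_unit=400):
    """Midpoint-rule inner product of G_a and G_b (real-valued functions);
    the grid is aligned with the window edges at multiples of 1/4."""
    h = 1.0 / per_unit
    steps = int(round((hi - lo) * per_unit))
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += G(a[0], a[1], x) * G(b[0], b[1], x)
    return total * h

# Usage: Gram matrix of a few members of the family.
indices = [(0, n) for n in range(-1, 2)]
indices += [(l, n) for l in range(1, 4) for n in range(-2, 3)]
gram_error = max(
    abs(inner(a, b) - (1.0 if a == b else 0.0))
    for i, a in enumerate(indices) for b in indices[i:]
)
```

With this window the Gram matrix of the sampled functions agrees with the identity up to rounding error; replacing the alternation by, say, cosines only would destroy this.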
FIG. 4.4. The functions $\phi$ and $\hat\phi$ corresponding to (4.2.25) if $h(x) = \pi^{-1/4}\exp(-x^2/2)$.

Coifman and Meyer's "localized sine basis" starts from a partition of $\mathbb{R}$ in intervals,

$$\mathbb{R} = \bigcup_{j\in\mathbb{Z}}\,[a_j, a_{j+1}],$$

with $a_j < a_{j+1}$ and $\lim_{j\to\pm\infty} a_j = \pm\infty$. They then build window functions $w_j$ localized around these intervals $I_j = [a_j, a_{j+1}]$, overlapping slightly with the neighboring intervals:

$$0 \le w_j \le 1, \qquad w_j(x) = 1 \ \text{ if } a_j + \epsilon_j \le x \le a_{j+1} - \epsilon_{j+1}, \qquad w_j(x) = 0 \ \text{ if } x \le a_j - \epsilon_j \text{ or } x \ge a_{j+1} + \epsilon_{j+1},$$

where we assume that the $\epsilon_k$ satisfy $a_j + \epsilon_j \le a_{j+1} - \epsilon_{j+1}$ for all $j$. Moreover, we require that $w_j$ and $w_{j-1}$ complement each other near $a_j$: $w_j(x) = w_{j-1}(2a_j - x)$ and $w_j^2(x) + w_{j-1}^2(x) = 1$ if $|x - a_j| \le \epsilon_j$. (All this can be achieved with smooth $w_j$; one can take, for instance, $w_j(x) = \sin\bigl[\tfrac{\pi}{2}\,\nu\bigl(\tfrac{x - a_j + \epsilon_j}{2\epsilon_j}\bigr)\bigr]$ for $|x - a_j| \le \epsilon_j$ and $w_j(x) = \cos\bigl[\tfrac{\pi}{2}\,\nu\bigl(\tfrac{x - a_{j+1} + \epsilon_{j+1}}{2\epsilon_{j+1}}\bigr)\bigr]$ for $|x - a_{j+1}| \le \epsilon_{j+1}$, with $\nu$ satisfying (4.2.4) and (4.2.5).) Coifman and Meyer (1990) prove that the family $\{u_{j,k};\ j, k \in \mathbb{Z}\}$, built from these windows,

constitutes an orthonormal basis for $L^2(\mathbb{R})$, consisting of compactly supported functions with fast decay in frequency. This basis has moreover a very interesting property: if for any $j \in \mathbb{Z}$ we define $P_j$ to be the orthogonal projection onto the space spanned by the $\{u_{j,k};\ k \in \mathbb{Z}\}$ ($P_j$ is "morally" the projection onto $[a_j, a_{j+1}]$), then $P_j + P_{j+1}$ is exactly the projection operator $\tilde P_j$ associated to $[a_j, a_{j+2}]$, that we would have obtained if we had deleted the point $a_{j+1}$ from our "slicing" of $\mathbb{R}$ (i.e., if we had started with the sequence $\tilde a_k$, $\tilde a_k = a_k$ if $k \le j$, $\tilde a_k = a_{k+1}$ if $k \ge j + 1$). This property makes it possible to split and regroup intervals at will, adapted to the application one has in mind. A very nice discussion of this whole construction, with full details, is Auscher, Weiss, and Wickerhauser (1992).

So there is, after all, more to orthonormal windowed Fourier bases than was expected even only a few years ago. None of these bases, however, are unconditional bases for $L^p(\mathbb{R})$ if $p \neq 2$. This is one point where wavelet bases have the advantage: they turn out to be unconditional bases for a much larger family of function spaces than even these "good" windowed Fourier bases. We will come back to this in Chapter 9.

Notes.

1. Rieffel's proof does not produce an explicit $f$ orthogonal to all the $g_{m,n}$. This is a challenge to the reader: find a (simple) construction of $f \perp g_{m,n}$ for all $m$, $n$, for arbitrary $\omega_0$, $t_0$ with $\omega_0 t_0 > 2\pi$.

2. For orthonormal bases the proof is much simpler. In this case we need not bother with the Zak transform, which was only introduced to prove that if $Qg, Pg \in L^2$, then $Q\tilde g, P\tilde g \in L^2$ as well. For orthonormal bases we can start directly with point 5, establishing $\langle Qg, Pg\rangle = \langle Pg, Qg\rangle$, which is impossible by point 6. This is the original elegant proof in Battle (1988).

3. If the $\psi_{m,n}(x) = a_0^{-m/2}\,\psi(a_0^{-m}x - nb_0)$ constitute a (tight) frame, then so do the $\psi^\#_{m,n}(x) = a_0^{-m/2}\,\psi^\#(a_0^{-m}x - nb_0')$, with $\psi^\#(x) = (b_0/b_0')^{1/2}\,\psi(b_0 x/b_0')$.

4. To illustrate this, the following example shows that the complex exponentials $\exp(2\pi i n x)$ do not constitute an unconditional basis for $L^p([0,1])$ if $p \neq 2$. One can show (see Zygmund (1959)) that

I�

I�

n - 1 /4 e2� i nx

n - 1 / 4 ei vn e2 �i nx

l l

LP ([O, 1])

x::: o Ci log x l X -2 . x -+ o C >

r-.J

0

1/2. � +

I TJ (�) I I TJ'(�)I G(l I W - a

CHAPTER 5

Orthonormal Bases of Wavelets and Multiresolution Analysis

The first constructions of smooth orthonormal wavelet bases seemed a bit miraculous, as illustrated by the proof in §4.2.A that the Meyer wavelets constitute an orthonormal basis. This situation changed with the advent of multiresolution analysis, formulated in the fall of 1986 by Mallat and Meyer. Multiresolution analysis provides a natural framework for the understanding of wavelet bases, and for the construction of new examples. The history of the formulation of multiresolution analysis is a beautiful example of applications stimulating theoretical development. When he first learned about the Meyer basis, Mallat was working on image analysis, where the idea of studying images simultaneously at different scales and comparing the results had been popular for many years (see, e.g., Witkin (1983) or Burt and Adelson (1983)). This stimulated him to view orthonormal wavelet bases as a tool to describe mathematically the "increment in information" needed to go from a coarse approximation to a higher resolution approximation. This insight crystallized into multiresolution analysis (Mallat (1989), Meyer (1986)).

5.1. The basic idea.

A multiresolution analysis consists of a sequence of successive approximation spaces $V_j$. More precisely, the closed subspaces $V_j$ satisfy$^1$

$$\cdots \subset V_2 \subset V_1 \subset V_0 \subset V_{-1} \subset V_{-2} \subset \cdots \tag{5.1.1}$$

with

$$\overline{\bigcup_{j\in\mathbb{Z}} V_j} = L^2(\mathbb{R}), \tag{5.1.2}$$

$$\bigcap_{j\in\mathbb{Z}} V_j = \{0\}. \tag{5.1.3}$$

If we denote by $P_j$ the orthogonal projection operator onto $V_j$, then (5.1.2) ensures that $\lim_{j\to-\infty} P_j f = f$ for all $f \in L^2(\mathbb{R})$. There exist many ladders of spaces satisfying (5.1.1)-(5.1.3) that have nothing to do with "multiresolution"; the multiresolution aspect is a consequence of the additional requirement

$$f \in V_j \iff f(2^j\,\cdot) \in V_0. \tag{5.1.4}$$

That is, all the spaces are scaled versions of the central space $V_0$. An example of spaces $V_j$ satisfying (5.1.1)-(5.1.4) is

$$V_j = \{f \in L^2(\mathbb{R});\ \forall k \in \mathbb{Z}:\ f|_{[2^jk,\,2^j(k+1)[} = \text{constant}\}.$$

We will call this example the Haar multiresolution analysis. (It is associated with the Haar basis; see Chapter 1 or below.) Figure 5.1 shows what the projection of some $f$ on the Haar spaces $V_0$, $V_{-1}$ might look like. This example also exhibits another feature that we require from a multiresolution analysis: invariance of $V_0$ under integer translations,

$$f \in V_0 \ \Rightarrow\ f(\cdot - n) \in V_0 \quad\text{for all } n \in \mathbb{Z}. \tag{5.1.5}$$

Because of (5.1.4) this implies that if $f \in V_j$, then $f(\cdot - 2^jn) \in V_j$ for all $n \in \mathbb{Z}$. Finally, we require also that there exists $\phi \in V_0$ so that

$$\{\phi_{0,n};\ n \in \mathbb{Z}\} \text{ is an orthonormal basis in } V_0, \tag{5.1.6}$$

where, for all $j, n \in \mathbb{Z}$, $\phi_{j,n}(x) = 2^{-j/2}\phi(2^{-j}x - n)$. Together, (5.1.6) and (5.1.4) imply that $\{\phi_{j,n};\ n \in \mathbb{Z}\}$ is an orthonormal basis for $V_j$ for all $j \in \mathbb{Z}$. This last requirement (5.1.6) seems a bit more "contrived" than the other ones; we will see below that it can be relaxed considerably. In the example given above, a possible choice for $\phi$ is the indicator function for $[0,1]$, $\phi(x) = 1$ if $0 \le x \le 1$, $\phi(x) = 0$ otherwise. We will often call $\phi$ the "scaling function" of the multiresolution analysis.$^2$

The basic tenet of multiresolution analysis is that whenever a collection of closed subspaces satisfies (5.1.1)-(5.1.6), then there exists an orthonormal wavelet basis $\{\psi_{j,k};\ j, k \in \mathbb{Z}\}$ of $L^2(\mathbb{R})$, $\psi_{j,k}(x) = 2^{-j/2}\psi(2^{-j}x - k)$, such that, for all $f$ in $L^2(\mathbb{R})$,

$$P_{j-1}f = P_jf + \sum_{k\in\mathbb{Z}} \langle f, \psi_{j,k}\rangle\,\psi_{j,k}. \tag{5.1.7}$$
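For the Haar multiresolution analysis, the relation between $P_{j-1}f$, $P_jf$ and the detail in $W_j$ can be checked numerically on sampled data. The following is only a sketch under stated assumptions: the function is represented by 64 samples with spacing 1/16 (so it is itself treated as piecewise constant at that finest scale, and projecting onto a coarser Haar space becomes a block average); the helper name `haar_projection` and the test function are hypothetical choices made for illustration.

```python
import numpy as np

def haar_projection(samples, block):
    """Replace each run of `block` consecutive samples by its mean.

    For data that is piecewise constant on the sampling grid, this is the
    orthogonal projection onto the Haar space whose steps span `block` samples.
    `block` must divide len(samples).
    """
    s = np.asarray(samples, dtype=float)
    means = s.reshape(-1, block).mean(axis=1)
    return np.repeat(means, block)

# 64 sample points on [0, 4), spacing 1/16 (assumed finest resolution)
x = (np.arange(64) + 0.5) / 16.0
f = np.sin(2 * np.pi * x / 4) + 0.3 * x

P0 = haar_projection(f, 16)    # projection onto V_0: constant on [k, k+1)
Pm1 = haar_projection(f, 8)    # projection onto V_{-1}: constant on [k/2, (k+1)/2)
detail = Pm1 - P0              # the component that (5.1.7) assigns to W_0

print(np.max(np.abs(haar_projection(detail, 16))))  # detail has zero mean on each unit interval
print(abs(np.dot(detail, P0)))                      # detail is orthogonal to the V_0 part
```

Both printed numbers should be at the level of rounding error, reflecting that the difference of two successive projections is orthogonal to the coarser space.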

($P_j$ is the orthogonal projection onto $V_j$.) The wavelet $\psi$ can, moreover, be constructed explicitly. Let us see how. For every $j \in \mathbb{Z}$, define $W_j$ to be the orthogonal complement of $V_j$ in $V_{j-1}$. We have

$$V_{j-1} = V_j \oplus W_j \tag{5.1.8}$$

and

$$W_j \perp W_{j'} \quad\text{if } j \neq j'. \tag{5.1.9}$$

(If $j > j'$, e.g., then $W_j \subset V_{j'} \perp W_{j'}$.) It follows that, for $j < J$,

$$V_j = V_J \oplus \bigoplus_{k=0}^{J-j-1} W_{J-k}, \tag{5.1.10}$$

FIG. 5.1. A function $f$ and its projections onto $V_{-1}$ and $V_0$.

where all these subspaces are orthogonal. By virtue of (5.1.2) and (5.1.3), this implies

$$L^2(\mathbb{R}) = \bigoplus_{j\in\mathbb{Z}} W_j, \tag{5.1.11}$$

a decomposition of $L^2(\mathbb{R})$ into mutually orthogonal subspaces. Furthermore, the $W_j$ spaces inherit the scaling property (5.1.4) from the $V_j$:

$$f \in W_j \iff f(2^j\,\cdot) \in W_0. \tag{5.1.12}$$

Formula (5.1.7) is equivalent to saying that, for fixed $j$, $\{\psi_{j,k};\ k \in \mathbb{Z}\}$ constitutes an orthonormal basis for $W_j$. Because of (5.1.11) and (5.1.2), (5.1.3), this then automatically implies that the whole collection $\{\psi_{j,k};\ j, k \in \mathbb{Z}\}$ is an orthonormal basis for $L^2(\mathbb{R})$. On the other hand, (5.1.12) ensures that if $\{\psi_{0,k};\ k \in \mathbb{Z}\}$ is an orthonormal basis for $W_0$, then $\{\psi_{j,k};\ k \in \mathbb{Z}\}$ will likewise be an orthonormal basis for $W_j$, for any $j \in \mathbb{Z}$. Our task thus reduces to finding $\psi \in W_0$ such that the $\psi(\cdot - k)$ constitute an orthonormal basis for $W_0$. To construct this $\psi$, let us write out some interesting properties of

$\hat\phi(\xi + 2\pi) + \hat\phi(\xi - 2\pi)$ and of $\hat\phi(\xi/2)$ for the Meyer multiresolution analysis; their product is $|\hat\psi(\xi)|$. (See also Figure 4.2.)

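and

$$\sum_k |c_k|^2 = (2\pi)^{-1}\int_0^{2\pi} d\xi\,\Big|\sum_k c_k e^{-ik\xi}\Big|^2,$$

(5.3.1) is equivalent to

$$0 < (2\pi)^{-1}A \le \sum_l |\hat\phi(\xi + 2\pi l)|^2 \le (2\pi)^{-1}B < \infty \quad\text{a.e.} \tag{5.3.2}$$

We can therefore define $\phi^\# \in L^2(\mathbb{R})$ by

$$\hat\phi^\#(\xi) = (2\pi)^{-1/2}\Big[\sum_l |\hat\phi(\xi + 2\pi l)|^2\Big]^{-1/2}\hat\phi(\xi). \tag{5.3.3}$$

Clearly, $\sum_l |\hat\phi^\#(\xi + 2\pi l)|^2 = (2\pi)^{-1}$ a.e., which means that the $\phi^\#(\cdot - k)$ are orthonormal. On the other hand, the space $V_0^\#$ spanned by the $\phi^\#(\cdot - k)$ is given by

$$\begin{aligned}
V_0^\# &= \Big\{f;\ f = \sum_n f_n^\#\,\phi^\#(\cdot - n),\ (f_n^\#)_{n\in\mathbb{Z}} \in \ell^2(\mathbb{Z})\Big\} \\
&= \{f;\ \hat f = \nu\,\hat\phi^\# \text{ with } \nu\ 2\pi\text{-periodic},\ \nu \in L^2([0,2\pi])\} \\
&= \{f;\ \hat f = \nu_1\,\hat\phi \text{ with } \nu_1\ 2\pi\text{-periodic},\ \nu_1 \in L^2([0,2\pi])\} \quad\text{(use (5.3.2) and (5.3.3))} \\
&= \Big\{f;\ f = \sum_n f_n\,\phi(\cdot - n) \text{ with } (f_n)_{n\in\mathbb{Z}} \in \ell^2(\mathbb{Z})\Big\} \\
&= V_0 \quad\text{(since the } \phi(\cdot - n) \text{ are a Riesz basis for } V_0\text{)}.
\end{aligned}$$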
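The orthonormalization step can be illustrated numerically. The sketch below is an assumption-laden example, not part of the text: it uses the piecewise linear spline (hat function) of §5.4 as $\phi$, truncates the sum over $l$ to $|l| \le 100$, and uses hypothetical helper names. It checks that the periodized sum for the hat function agrees with the closed form $(2\pi)^{-1}(\tfrac23 + \tfrac13\cos\xi)$, and that after dividing by its square root the periodized sum becomes $(2\pi)^{-1}$.

```python
import numpy as np

def phi_hat(xi):
    """Fourier transform of the hat function, with the (2*pi)^(-1/2) normalization."""
    # np.sinc(t) = sin(pi*t)/(pi*t), so np.sinc(xi/(2*pi)) = sin(xi/2)/(xi/2)
    return (2 * np.pi) ** -0.5 * np.sinc(xi / (2 * np.pi)) ** 2

def periodized_energy(xi, n_terms=100):
    """Truncated sum over l of |phi_hat(xi + 2*pi*l)|^2 for a 1-D array xi."""
    shifts = 2 * np.pi * np.arange(-n_terms, n_terms + 1)
    return np.sum(np.abs(phi_hat(xi[:, None] + shifts[None, :])) ** 2, axis=1)

def phi_sharp_hat(xi, n_terms=100):
    """Orthonormalized scaling function in the Fourier domain (truncated sum)."""
    return (2 * np.pi) ** -0.5 * periodized_energy(xi, n_terms) ** -0.5 * phi_hat(xi)

xi = np.linspace(-np.pi, np.pi, 41)
energy = periodized_energy(xi)
closed_form = (2.0 / 3.0 + np.cos(xi) / 3.0) / (2 * np.pi)

# Periodize |phi_sharp_hat|^2 over the same truncated set of shifts.
shifts = 2 * np.pi * np.arange(-100, 101)
grid = (xi[:, None] + shifts[None, :]).ravel()
sharp_energy = np.sum(np.abs(phi_sharp_hat(grid)).reshape(len(xi), -1) ** 2, axis=1)

print(np.max(np.abs(energy - closed_form)))
print(np.max(np.abs(sharp_energy - 1 / (2 * np.pi))))
```

Because $|\hat\phi(\xi)|^2$ decays like $|\xi|^{-4}$ for the hat function, the truncation error is far below the tolerances one would normally care about; a slower-decaying $\phi$ would need more terms.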

5.3.2. Using the scaling function as a starting point. As described in §5.1, a multiresolution analysis consists of a ladder of spaces $(V_j)_{j\in\mathbb{Z}}$ and a special function $\phi \in V_0$ such that (5.1.1)-(5.1.6) are satisfied (with (5.1.6) possibly relaxed as in §5.3.1). One can also try to start the construction from an appropriate choice for the scaling function $\phi$: after all, $V_0$ can be constructed from the $\phi(\cdot - k)$, and from there, all the other $V_j$ can be generated. This strategy is followed in many examples. More precisely, we choose $\phi$ such that

$$\phi(x) = \sum_n c_n\,\phi(2x - n), \tag{5.3.4}$$

$$0 < \alpha \le \sum_{l\in\mathbb{Z}} |\hat\phi(\xi + 2\pi l)|^2 \le \beta < \infty. \tag{5.3.5}$$

We then define $V_j$ to be the closed subspace spanned by the $\phi_{j,k}$, $k \in \mathbb{Z}$, with $\phi_{j,k}(x) = 2^{-j/2}\phi(2^{-j}x - k)$. The conditions (5.3.4) and (5.3.5) are necessary and sufficient to ensure that $\{\phi_{j,k};\ k \in \mathbb{Z}\}$ is a Riesz basis in each $V_j$, and that the $V_j$ satisfy the "ladder property" (5.1.1). It follows that the $V_j$ satisfy (5.1.1), (5.1.4), (5.1.5), and (5.1.6); in order to make sure that we have a multiresolution analysis we need to check whether (5.1.2) and (5.1.3) hold. This is the purpose of the following two propositions.

PROPOSITION 5.3.1. Suppose $\phi \in L^2(\mathbb{R})$ satisfies (5.3.5), and define $V_j = \overline{\mathrm{Span}\{\phi_{j,k};\ k \in \mathbb{Z}\}}$. Then $\bigcap_{j\in\mathbb{Z}} V_j = \{0\}$.

Proof.

1. By (5.3.5), the $\phi_{0,k}$ constitute a Riesz basis for $V_0$. In particular, they constitute a frame for $V_0$, i.e., there exist $A > 0$, $B < \infty$ so that, for all $f \in V_0$,

$$A\,\|f\|^2 \le \sum_{k\in\mathbb{Z}} |\langle f, \phi_{0,k}\rangle|^2 \le B\,\|f\|^2 \tag{5.3.6}$$

(see Preliminaries). Since $V_j$ and the $\phi_{j,k}$ are the images of $V_0$ and the $\phi_{0,k}$ under the unitary map $(D_jf)(x) = 2^{-j/2}f(2^{-j}x)$, it follows that, for all $f \in V_j$,

$$A\,\|f\|^2 \le \sum_{k\in\mathbb{Z}} |\langle f, \phi_{j,k}\rangle|^2 \le B\,\|f\|^2, \tag{5.3.7}$$

with the same $A$, $B$ as in (5.3.6).

2. Now take $f \in \bigcap_{j\in\mathbb{Z}} V_j$. Pick $\epsilon > 0$ arbitrarily small. There exists a compactly supported and continuous $\tilde f$ so that $\|f - \tilde f\|_{L^2} \le \epsilon$. If we denote by $P_j$ the orthogonal projection on $V_j$, then $\|f - P_j\tilde f\| = \|P_j(f - \tilde f)\| \le \|f - \tilde f\| \le \epsilon$ (since $f \in V_j$ for every $j$); hence

$$\|f\| \le \epsilon + \|P_j\tilde f\|$$
M ¢> mo, nI l -+ oo

mo,¢> ¢>. 7

0,

¢>. 6

1(0) 0 L2 (JR), 12 L (JR),

3. If is bounded, and continuous in then the condition =I- is nec essary in Proposition 5.3.2. This can be seen as follows. Take E then If Uj E z Vj = =Iwith support C [-R, RJ , R < = lim J --+ oo But

11 0,

00.

p- J I · j

k

Since 1 is continuous, the first term tends to AThe- I second 27r 1 1(0)term J-+ 00 , byexactly the dominated convergence theorem. 1 2 1 1can1 2 forbe bounded in (5.3. 15), so that this term tends to zero for J -+00. It follows that as in (5.3. 13) .

as

I l f l =I- 0, this implies 1(0) =I- O. 4. The argument in points 3 and 4 of the proof can also be used to prove 1 1(0) 1 2 ::; B/27r. We have indeed B I II 1 2 � B I P_ J II 1 2 � kELZ 1 (1, ¢>- J, k )1 2 = 27r f �1 1 ( T J �) 1 2 I j (�) 1 2 + R, where I R I can be bounded by C 2 - J for nice I. The other term tends to 27r 1 1(0)2 l 2 1 1 1 2 (see 4) . Together with remark 3 above, this implies A/27r ::; 1 1(0)1 ::; B/27r. In particular, if the ¢>O, k are orthonormal, then A = B and 1 1 (0) 1 = (27r) - 1 / 2 . 5. The conditions 1 E L OO , 1(0) =I- 0 (with 1 continuous in 0) imply certain restrictions on the as well. Equation (5.3.4) can be rewritten as (5.3. 18 ) 1(�) = mo(�/2) 1(�/2) , Since

en

145


n

n

with mo (�) = � L Cn ei ( In particular, ¢(O) implies mo (O) = 1 ( since ¢(O) =f. 0) or

mo(O) ¢(O), which

n

(5.3. 19)

Moreover, (5.3. 18) implies that mo is continuous, except possibly near the zeros of ¢. In particular, mo is continuous in � O. If, further more, I ¢(�) I :S C ( + I W - 1 / 2 - ., then the continuity of ¢ implies that L I I ¢(� + 27r£) 12 is continuous as well, so that ¢# ( as defined in §5.3.1) is also continuous; consequently, mt (�) = ¢# (2�)/¢# (�) satisfies mt ( O) = 1. Since Imt (0 12 + I mt (� + 7r) 12 = 1 , it follows that mt (7r) O. This implies mO (7r) = 0 (mt (�) = mO (�) [LI I ¢(� + 27r£) 12] 1 /2 . [LI I¢(2� + 27r£) 12t1/2) , or (5.3.20) Cn ( t = o.

l

=

=

Ln

n

-

l

n

n

Together with L Cn = 2, this implies L n C2 = 1 L C2n+ l . This is consistent with the admissibility condition for 1jJ. 8 Note also that Ln C2n = 1 = L C2nH is equivalent with Micchelli (1991)'s condition L l ¢(x-£) const. =f. 0 if 1¢(x) 1 :S C (1 + I x l ) - l - . and if ¢ is continuous.9 0

n

=

=
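The sum rules on the $c_n$ are easy to check for concrete two-scale coefficients. The short sketch below does this for the Haar scaling function ($c_0 = c_1 = 1$) and for the piecewise linear spline of §5.4 ($c_{-1} = \tfrac12$, $c_0 = 1$, $c_1 = \tfrac12$); the function name `sum_rules` and the dictionary layout are illustrative assumptions only.

```python
def sum_rules(c):
    """Return (sum c_n, sum over even n, sum over odd n, sum (-1)^n c_n).

    `c` maps the integer index n to the coefficient c_n in
    phi(x) = sum_n c_n phi(2x - n).
    """
    total = sum(c.values())
    even = sum(v for n, v in c.items() if n % 2 == 0)
    odd = sum(v for n, v in c.items() if n % 2 != 0)
    alternating = sum(v if n % 2 == 0 else -v for n, v in c.items())
    return total, even, odd, alternating

haar = {0: 1.0, 1: 1.0}            # phi = indicator of [0, 1]
hat = {-1: 0.5, 0: 1.0, 1: 0.5}    # piecewise linear spline of section 5.4

print(sum_rules(haar))
print(sum_rules(hat))
```

In both cases the total is 2, the even- and odd-indexed sums are each 1, and the alternating sum vanishes, in agreement with (5.3.19) and (5.3.20).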

All this suggests the following strategy for the construction of new orthonormal wavelet bases:

• Choose $\phi$ so that
  (1) $\phi$ and $\hat\phi$ have reasonable decay,
  (2) (5.3.4) and (5.3.5) are satisfied,
  (3) $\int dx\,\phi(x) \neq 0$
  (by Propositions 5.3.1, 5.3.2 the $V_j$ then constitute a multiresolution analysis);

• If necessary, perform the "orthonormalization trick"

  $$\hat\phi^\#(\xi) = (2\pi)^{-1/2}\Big[\sum_l |\hat\phi(\xi + 2\pi l)|^2\Big]^{-1/2}\hat\phi(\xi);$$

• Finally, $\hat\psi(\xi) = e^{i\xi/2}\,\overline{m_0^\#(\xi/2 + \pi)}\,\hat\phi^\#(\xi/2)$, with $m_0^\#(\xi) = m_0(\xi)\,[\sum_l |\hat\phi(\xi + 2\pi l)|^2]^{1/2}\,[\sum_l |\hat\phi(2\xi + 2\pi l)|^2]^{-1/2}$, or equivalently

  $$\psi(x) = \sum_n (-1)^n\,h^\#_{-n+1}\,\phi^\#(2x - n).$$

5.4. More examples: The Battle-Lemarié family.

The Battle-Lemarié wavelets are associated with multiresolution analysis ladders consisting of spline function spaces; in each case we take a B-spline with knots at the integers as the original scaling function. If we choose $\phi$ to be the piecewise constant spline,

$$\phi(x) = \begin{cases} 1, & 0 \le x \le 1, \\ 0 & \text{otherwise,} \end{cases}$$

then we end up with the Haar basis. The next example is the piecewise linear spline,

$$\phi(x) = \begin{cases} 1 - |x|, & 0 \le |x| \le 1, \\ 0 & \text{otherwise,} \end{cases}$$

plotted in Figure 5.4a. This $\phi$ satisfies

$$\phi(x) = \tfrac12\,\phi(2x + 1) + \phi(2x) + \tfrac12\,\phi(2x - 1);$$

see Figure 5.4b. Its Fourier transform is

$$\hat\phi(\xi) = (2\pi)^{-1/2}\left(\frac{\sin \xi/2}{\xi/2}\right)^2,$$

which satisfies

$$2\pi\sum_l |\hat\phi(\xi + 2\pi l)|^2 = \tfrac23 + \tfrac13\cos\xi.$$
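The two-scale relation for the piecewise linear spline, and its Fourier-domain counterpart $\hat\phi(\xi) = m_0(\xi/2)\,\hat\phi(\xi/2)$ with $m_0(\xi) = \frac12\sum_n c_n e^{-in\xi}$, can be verified pointwise. The following is a minimal sketch with hypothetical helper names; the grids and tolerances are arbitrary choices.

```python
import numpy as np

def hat(x):
    """Piecewise linear spline: 1 - |x| on [-1, 1], zero elsewhere."""
    return np.maximum(0.0, 1.0 - np.abs(x))

# Two-scale relation in the time domain.
x = np.linspace(-1.5, 1.5, 601)
two_scale = 0.5 * hat(2 * x + 1) + hat(2 * x) + 0.5 * hat(2 * x - 1)
time_gap = np.max(np.abs(hat(x) - two_scale))

def phi_hat(xi):
    """(2*pi)^(-1/2) * (sin(xi/2) / (xi/2))^2, written with np.sinc."""
    return (2 * np.pi) ** -0.5 * np.sinc(xi / (2 * np.pi)) ** 2

def m0(xi):
    """m0 for the coefficients c_{-1} = 1/2, c_0 = 1, c_1 = 1/2."""
    return 0.5 * (0.5 * np.exp(1j * xi) + 1.0 + 0.5 * np.exp(-1j * xi))

# Same relation in the Fourier domain.
xi = np.linspace(-20.0, 20.0, 401)
fourier_gap = np.max(np.abs(phi_hat(xi) - m0(xi / 2) * phi_hat(xi / 2)))

print(time_gap, fourier_gap)
```

For this $\phi$ one finds $m_0(\xi) = \cos^2(\xi/2)$, so the Fourier-domain check reduces to the double-angle identity $\sin(\xi/2) = 2\sin(\xi/4)\cos(\xi/4)$.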

cos 0, this implies that there exists a, possibly smaller than ",(, so that ReG(�) :::: a/2 for 1 1m � I < a. Consequently, G -1/2 can be defined as an analytic function on 1 1m �I < a, which means that ¢# = G - 1 / 2 ¢ has an extension to a uniformly bounded analytic function on the strip 1 1m � I < a. 4. On the other hand, (5.4. 1 ) implies that

(6

(6

6

for 16 1 ::; a. It follows that on 1 1m � I bounded by Consequently,

1 + 1, and suppose that f E c , with f (l) i 1 for C :S Then (5. 5 .1) J dx xi j(x) 0 fod 0, 1 ,

such that

=

=

0:

m.

m

=

=

"

"

m .

Proof.

1. The idea of the proof is very simple. Choose $j, k, j', k'$ so that $f_{j,k}$ is rather spread out, and $\tilde f_{j',k'}$ very much concentrated. (For this expository point only, we assume that $\tilde f$ has compact support.) On the tiny support of $\tilde f_{j',k'}$, the slice of $f_{j,k}$ "seen" by $\tilde f_{j',k'}$ can be replaced by its Taylor series, with as many terms as are well defined. Since, however, $\int dx\,f_{j,k}(x)\,\overline{\tilde f_{j',k'}(x)} = 0$, this implies that the integral of the product of $\tilde f$ and a polynomial of order $m$ is zero. We can then vary the locations of $\tilde f_{j',k'}$, as given by $k'$. For each location the argument can be repeated, leading to a whole family of different polynomials of order $m$ which all give zero integral when multiplied with $\tilde f$. This leads to the desired moment condition. But let us be more precise as follows.

2. We prove (5.5.1) by induction on $\ell$. The following argument works for both the initial step and the inductive step. Assume $\int dx\,x^n\,\tilde f(x) = 0$ for $n \in \mathbb{N}$, $n < \ell$. (If $\ell = 0$, then this amounts to no assumption at all.) Since $f^{(\ell)}$ is continuous ($\ell \le m$) and since the dyadic rationals $2^{-j}k$ ($j, k \in \mathbb{Z}$) are dense in $\mathbb{R}$, there exist $J, K$ so that $f^{(\ell)}(2^{-J}K) \neq 0$. (Otherwise $f^{(\ell)} \equiv 0$ would follow, implying $f \equiv$ constant if $\ell = 0$ or 1, which we know not to be the case, or, if $\ell \ge 2$, $f$ polynomial of order $\ell - 1 \ge 1$, which would imply that $f$ is not bounded and is therefore also excluded.) Moreover, for any $\epsilon > 0$ there exists $\delta > 0$ so that

$$\Big|f(x) - \sum_{n=0}^{\ell} (n!)^{-1} f^{(n)}(2^{-J}K)\,(x - 2^{-J}K)^n\Big| \le \epsilon\,|x - 2^{-J}K|^{\ell} \quad\text{if } |x - 2^{-J}K| \le \delta.$$

Take now $j > J$, $j > 0$. Then

$$0 = \int dx\,f(x)\,\overline{\tilde f(2^jx - 2^{j-J}K)} =$$

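$$\psi = \sum_n g_n\,\phi_{-1,n},$$

where $g_n = \langle \psi, \phi_{-1,n}\rangle = (-1)^n h_{-n+1}$. Consequently,

$$\psi_{j,k}(x) = 2^{-j/2}\,\psi(2^{-j}x - k) = 2^{-j/2}\sum_n g_n\,2^{1/2}\,\phi(2^{-j+1}x - 2k - n) = \sum_n g_n\,\phi_{j-1,2k+n}(x) = \sum_n g_{n-2k}\,\phi_{j-1,n}(x). \tag{5.6.1}$$

It follows that

$$\langle f, \psi_{1,k}\rangle = \sum_n \overline{g_{n-2k}}\,\langle f, \phi_{0,n}\rangle,$$

i.e., the $\langle f, \psi_{1,k}\rangle$ are obtained by convolving the sequence $(\langle f, \phi_{0,n}\rangle)_{n\in\mathbb{Z}}$ with $(\overline{g_{-n}})_{n\in\mathbb{Z}}$, and then retaining only the even samples. Similarly, we have

$$\langle f, \psi_{j,k}\rangle = \sum_n \overline{g_{n-2k}}\,\langle f, \phi_{j-1,n}\rangle, \tag{5.6.2}$$

which can be used to compute the $\langle f, \psi_{j,k}\rangle$ by means of the same operation (convolution with $\bar g$, decimation by factor 2) from the $\langle f, \phi_{j-1,k}\rangle$, if these are known. But, by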

(5.1.15)

0 (a necessary condition to have some regularity for $\psi$). Not every such $m_0$ is associated to an orthonormal wavelet basis, however, an issue addressed in §§6.2 and 6.3. The main results of these two sections are summarized in Theorem 6.3.6, at the end of §6.3. Section 6.4 contains examples of compactly supported wavelets generating orthonormal bases. The orthonormal wavelet bases thus obtained cannot, in general, be written in a closed analytic form. Their graph can be computed with arbitrarily high precision, via an algorithm that I call the "cascade algorithm," which is in fact a "refinement scheme" as used in computer aided design. All this is discussed in §6.5.

A lot of this material goes back to Daubechies (1988b); for many of the results, better, simpler, or more general proofs have been found since, and I have given preference to these new ways of looking at things. These different approaches are borrowed mainly from Mallat (1989), Cohen (1990), Lawton (1990, 1991), Meyer (1990), and Cohen, Daubechies, and Feauveau (1992); for the link with refinement equations the references are Cavaretta, Dahmen, and Micchelli (1991) and Dyn and Levin (1990), as well as earlier papers by these authors (see §6.5).

6.1. Construction of $m_0$.

In this chapter we are mainly interested in constructing compactly supported wavelets $\psi$. The easiest way to ensure compact support for the wavelet $\psi$ is to choose the scaling function $\phi$ with compact support (in its orthogonalized version). It then follows from the definition of the $h_n$,

$$h_n = \sqrt2\int dx\,\phi(x)\,\overline{\phi(2x - n)},$$

that only finitely many $h_n$ are nonzero, so that $\psi$ reduces to a finite linear combination of compactly supported functions (see (5.1.34)), and therefore automatically has compact support itself. Choosing both $\phi$ and $\psi$ with compact support also has the advantage that the corresponding subband filtering scheme (see §5.6) uses only FIR filters. For compactly supported $\phi$ the $2\pi$-periodic function $m_0$,

$$m_0(\xi) = \frac{1}{\sqrt2}\sum_n h_n e^{-in\xi},$$

becomes a trigonometric polynomial. As shown in Chapter 5 (see (5.1.20)), orthonormality of the $\phi_{0,n}$ implies

$$|m_0(\xi)|^2 + |m_0(\xi + \pi)|^2 = 1, \tag{6.1.1}$$

where we have dropped the "almost everywhere" because $m_0$ is necessarily continuous, so that (6.1.1) has to hold for all $\xi$ if it holds a.e. We are also interested in making $\psi$ and $\phi$ reasonably regular. By Corollary 5.5.4, this means that $m_0$ should be of the form

$$m_0(\xi) = \left(\frac{1 + e^{-i\xi}}{2}\right)^N \mathcal{L}(\xi), \tag{6.1.2}$$

with $N \ge 1$ and $\mathcal{L}$ a trigonometric polynomial. Note that even without regularity constraint, we need (6.1.2) with $N$ at least 1.$^1$ Putting (6.1.1), (6.1.2) together, it follows that we are looking for

$$M_0(\xi) = |m_0(\xi)|^2, \tag{6.1.3}$$

a polynomial in $\cos\xi$, satisfying

$$M_0(\xi) + M_0(\xi + \pi) = 1 \tag{6.1.4}$$

and

$$M_0(\xi) = \left(\cos^2\frac{\xi}{2}\right)^N L(\xi), \tag{6.1.5}$$

where $L(\xi) = |\mathcal{L}(\xi)|^2$ is also a polynomial in $\cos\xi$. For our purpose it is convenient to rewrite $L(\xi)$ as a polynomial in $\sin^2\xi/2 = (1 - \cos\xi)/2$,

$$M_0(\xi) = \left(\cos^2\frac{\xi}{2}\right)^N P\!\left(\sin^2\frac{\xi}{2}\right). \tag{6.1.6}$$

In terms of $P$, the constraint (6.1.4) becomes

$$(1 - y)^N P(y) + y^N P(1 - y) = 1, \tag{6.1.7}$$

which should hold for all $y \in [0,1]$, hence for all $y \in \mathbb{R}$. To solve (6.1.7) for $P$ we use Bezout's theorem.$^2$
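As a concrete illustration of (6.1.1) and of the factored form of $|m_0|^2$, one can take the classical four-tap filter with $N = 2$. The coefficients below are the standard closed-form values for that filter and are an assumption brought in for illustration; they are not listed in this section. Likewise $P(y) = 1 + 2y$ is used here as a candidate solution of (6.1.7) for $N = 2$, which the code checks rather than derives.

```python
import numpy as np

# Standard four-tap filter (assumed, not taken from this section); sum(h) = sqrt(2).
s3 = np.sqrt(3.0)
h = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))

def m0(xi):
    """m0(xi) = 2^(-1/2) * sum_n h_n exp(-i n xi) for a 1-D array xi."""
    n = np.arange(len(h))
    return np.exp(-1j * np.outer(xi, n)) @ h / np.sqrt(2.0)

xi = np.linspace(-np.pi, np.pi, 201)
M0 = np.abs(m0(xi)) ** 2
M0_shifted = np.abs(m0(xi + np.pi)) ** 2

def P(y):
    """Candidate polynomial for N = 2."""
    return 1 + 2 * y

y = np.sin(xi / 2) ** 2
factored = np.cos(xi / 2) ** 4 * P(y)           # (cos^2 xi/2)^N * P(sin^2 xi/2)
bezout_lhs = (1 - y) ** 2 * P(y) + y ** 2 * P(1 - y)

print(np.max(np.abs(M0 + M0_shifted - 1)))      # condition (6.1.1)
print(np.max(np.abs(M0 - factored)))            # factored form of |m0|^2
print(np.max(np.abs(bezout_lhs - 1)))           # constraint (6.1.7)
```

All three printed deviations should be at rounding level for this particular filter.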

COMPACTLY SUPPORTED WAVELETS

THEOREM 6.1.1. If $p_1$, $p_2$ are two polynomials, of degree $n_1$, $n_2$, respectively, with no common zeros, then there exist unique polynomials $q_1$, $q_2$, of degree $n_2 - 1$, $n_1 - 1$, respectively, so that

$$p_1(x)\,q_1(x) + p_2(x)\,q_2(x) = 1. \tag{6.1.8}$$
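For small degrees the pair $q_1$, $q_2$ can be computed directly. The sketch below does not follow the Euclidean-algorithm argument of the proof; instead it swaps in a different technique, solving the linear system for the unknown coefficients (a Sylvester-type matrix, which is invertible exactly when $p_1$ and $p_2$ have no common zeros). The function name `bezout_pair` is hypothetical. It is applied to $p_1(y) = (1-y)^N$, $p_2(y) = y^N$ with $N = 2$, the case relevant for (6.1.7).

```python
import numpy as np
from numpy.polynomial import polynomial as P

def bezout_pair(p1, p2):
    """Solve p1*q1 + p2*q2 = 1 with deg q1 <= n2 - 1 and deg q2 <= n1 - 1.

    Coefficient arrays are in increasing powers (numpy.polynomial convention).
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    n1, n2 = len(p1) - 1, len(p2) - 1
    size = n1 + n2
    sylvester = np.zeros((size, size))
    for i in range(n2):                  # columns multiplying the coefficients of q1
        sylvester[i:i + n1 + 1, i] = p1
    for i in range(n1):                  # columns multiplying the coefficients of q2
        sylvester[i:i + n2 + 1, n2 + i] = p2
    rhs = np.zeros(size)
    rhs[0] = 1.0                         # right-hand side: the constant polynomial 1
    solution = np.linalg.solve(sylvester, rhs)
    return solution[:n2], solution[n2:]

N = 2
p1 = P.polypow([1.0, -1.0], N)           # (1 - y)^N
p2 = np.array([0.0] * N + [1.0])         # y^N
q1, q2 = bezout_pair(p1, p2)
print(q1, q2)

ys = np.linspace(-2.0, 2.0, 9)
identity = P.polyval(ys, p1) * P.polyval(ys, q1) + P.polyval(ys, p2) * P.polyval(ys, q2)
```

For $N = 2$ this gives $q_1(y) = 1 + 2y$ and $q_2(y) = 3 - 2y = q_1(1 - y)$, so that $P = q_1$ satisfies (6.1.7).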

Proof.

1. We first prove existence; uniqueness follows later. We can assume that $n_1 \ge n_2$ (by renumbering, if necessary). Since degree $(p_2) \le$ degree $(p_1)$, we can find polynomials $a_2(x)$, $b_2(x)$, with degree $(a_2)$ = degree $(p_1)$ - degree $(p_2)$, degree $(b_2)$ < degree $(p_2)$, so that

$$p_1(x) = a_2(x)\,p_2(x) + b_2(x).$$

2. Similarly, we can find $a_3(x)$, $b_3(x)$, with degree $(a_3)$ = degree $(p_2)$ - degree $(b_2)$, degree $(b_3)$ < degree $(b_2)$, so that

$$p_2(x) = a_3(x)\,b_2(x) + b_3(x).$$

We keep going with this procedure, with $b_{n-1}$ taking the role of $p_2$ in this last equation, and $b_n$ the role of $b_2$,

This proves that (6.3.2) is satisfied, and finishes the proof (1) $\Rightarrow$ (2).

4. We now prove the converse, (2) $\Rightarrow$ (1). Define $\mu_k(\xi) = (2\pi)^{-1/2}\big[\prod_{j=1}^{k} m_0(2^{-j}\xi)\big]\,\chi_K(2^{-k}\xi)$, where $\chi_K$ is the indicator function of $K$, $\chi_K(\xi) = 1$ if $\xi \in K$, 0 otherwise. Since $K$ contains a neighborhood of 0, $\mu_k \to \hat\phi$ pointwise for $k \to \infty$.

5. By assumption, $|m_0(2^{-k}\xi)| \ge C > 0$ for $k \ge 1$ and $\xi \in K$. On the other hand, we also have, for any $\xi$, $|m_0(\xi) - m_0(0)| \le C'|\xi|$; hence $|m_0(\xi)| \ge 1 - C'|\xi|$. Since $K$ is bounded we can find $k_0$ so that $2^{-k}C'|\xi| < \frac12$ if $\xi \in K$ and $k \ge k_0$. Using $1 - x \ge e^{-2x}$ for $0 \le x \le \frac12$, we find therefore, for $\xi \in K$,

$$\begin{aligned}
|\hat\phi(\xi)| &= (2\pi)^{-1/2}\prod_{k=1}^{k_0} |m_0(2^{-k}\xi)|\ \prod_{k=k_0+1}^{\infty} |m_0(2^{-k}\xi)| \\
&\ge (2\pi)^{-1/2}\,C^{k_0}\prod_{k=k_0+1}^{\infty} \exp\big[-2C'\,2^{-k}|\xi|\big] \\
&\ge (2\pi)^{-1/2}\,C^{k_0}\exp\Big[-C'\,2^{-k_0+1}\max_{\xi\in K}|\xi|\Big] = C'' > 0.
\end{aligned}$$

(This is illustrated by the counterexample $m_0(\xi) = \frac12(1 + e^{-3i\xi})$ discussed above.) The points $\xi = \pm\frac{\pi}{3}$ play a special role for the following reason: $m_0(\pm\frac{\pi}{3}) = 0$ implies $\hat\phi(\frac{2\pi}{3} + 2k\pi) = 0$ for all $k \in \mathbb{Z}$, contradicting (6.2.5). This implication can be checked as follows. Take any $k \in \mathbb{N}$ (negative $k$ can be treated similarly). Then $k$ has a binary representation $k = \sum_{j=0}^{n} \epsilon_j 2^j$, with $\epsilon_j = 0$ or 1; for good measure we can add a couple of zeros at the front end of $k = \epsilon_n\epsilon_{n-1}\cdots\epsilon_1\epsilon_0$, so that we can assume $\epsilon_n = \epsilon_{n-1} = 0$. If $k$ is even, $k = 2\ell$, then

$$\hat\phi\Big(\frac{2\pi}{3} + 2k\pi\Big) = m_0\Big(\frac{\pi}{3} + 2\ell\pi\Big)\,\hat\phi\Big(\frac{\pi}{3} + 2\ell\pi\Big) = 0 \quad\Big(\text{because } m_0\Big(\frac{\pi}{3}\Big) = 0\Big).$$

We therefore need to check only what happens if $k$ is odd, $k = 2\ell + 1$, or $\epsilon_0 = 1$. Then $\frac{2\pi}{3} + 2k\pi = \frac{8\pi}{3} + 4\ell\pi$; hence

$$\hat\phi\Big(\frac{2\pi}{3} + 2k\pi\Big) = m_0\Big(\frac{4\pi}{3}\Big)\,m_0\Big(\frac{2\pi}{3} + \ell\pi\Big)\,\hat\phi\Big(\frac{2\pi}{3} + \ell\pi\Big).$$


FIG. 6 . 2 . This figure assumes that mo has only one zero in 11' / 3 < I{I :S 11' / 2, namely in {t = �; . We choose It = ] �� , Ii: [ ; hence 2It = ] �; , It; [. According to (6.3.6) , the compact set K is then [_ 1;; , _ It; ] u [-11', �; ] u [ 1;; , 11'] .

If $\ell$ is odd, i.e., $\epsilon_1 = 1$, then $m_0(\frac{2\pi}{3} + \ell\pi) = m_0(\frac{5\pi}{3}) = m_0(-\frac{\pi}{3}) = 0$. It follows that we only need to investigate further what happens for $\epsilon_1 = 0$, or $\ell$ even. We can continue this further, showing that only those $k$ with binary representation ending in $010101\cdots01$ do not automatically lead to $\hat\phi(\frac{2\pi}{3} + 2k\pi) = 0$. But if we work back far enough, then we will hit $\epsilon_n\epsilon_{n-1} = 00$, so that we indeed have $\hat\phi(\frac{2\pi}{3} + 2k\pi) = 0$. This whole argument uses that the zero set of $m_0$ contains $\{\frac{\pi}{3}, -\frac{\pi}{3}\} = \{\frac{2\pi}{3}, -\frac{2\pi}{3}\} + \pi$ (mod $2\pi$) and that $\{\frac{2\pi}{3}, -\frac{2\pi}{3}\}$ is an invariant cycle under the operation $\xi \mapsto 2\xi$ (mod $2\pi$), mapping $[-\pi, \pi]$ into itself. In his Ph.D. Thesis, Cohen (1990b) proves that such invariant cycles are the root of the problem.

THEOREM 6.3.3. Assume that $m_0$ is a trigonometric polynomial satisfying (6.1.1) and $m_0(0) = 1$, and define $\phi$ as in (6.2.2). Then conditions (1) and (2) in Theorem 6.3.1 are also equivalent to

3. there is no non-trivial cycle $\{\xi_1, \ldots, \xi_n\}$ in $[-\pi, \pi]$, invariant for the operation $\xi \mapsto 2\xi$ (mod $2\pi$), such that $|m_0(\xi_j)| = 1$ for all $j = 1, \ldots, n$.

REMARKS.

1. Because of (6.1.1), $|m_0(\xi_j)| = 1$ is of course equivalent to $|m_0(\xi_j + \pi)| = 0$.

2. Non-trivial means different from $\{0\}$, which is always an invariant cycle.

3. In our example above, $\xi_1 = \frac{2\pi}{3}$, $\xi_2 = -\frac{2\pi}{3}$. $\Box$

For a proof of this theorem and related results, we refer to Cohen (1990b); one of the two implications is in fact proved in step 6 of the proof of Theorem 6.3.5 below.
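The invariant cycle of the counterexample $m_0(\xi) = \frac12(1 + e^{-3i\xi})$ can be checked in a few lines. This is only an illustrative sketch; the function names are hypothetical, and the doubling map is reduced to the interval $(-\pi, \pi]$ as a convention chosen here.

```python
import cmath
import math

def m0_counterexample(xi):
    """m0(xi) = (1 + exp(-3 i xi)) / 2."""
    return 0.5 * (1 + cmath.exp(-3j * xi))

def doubling(xi):
    """The map xi -> 2 xi, reduced modulo 2*pi to the interval (-pi, pi]."""
    y = (2 * xi) % (2 * math.pi)
    return y - 2 * math.pi if y > math.pi else y

cycle = [2 * math.pi / 3, -2 * math.pi / 3]
images = [doubling(xi) for xi in cycle]

moduli_on_cycle = [abs(m0_counterexample(xi)) for xi in cycle]
moduli_shifted = [abs(m0_counterexample(xi + math.pi)) for xi in cycle]

print(images)            # the two points are exchanged by the doubling map
print(moduli_on_cycle)   # |m0| = 1 on the cycle
print(moduli_shifted)    # m0 vanishes at the points shifted by pi
```

The two points are swapped by $\xi \mapsto 2\xi$ (mod $2\pi$), $|m_0|$ equals 1 on both, and $m_0$ vanishes at both points shifted by $\pi$, which is exactly the situation excluded by condition 3.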

189


A very different approach to the derivation of conditions on mo that ensure (6. 2 . 5) was initiated by Lawton (1990). Let us assume that mo is of the form N " hn e - i ne , mo ( then the only trigonometric polynomials invariant under are the constants .

0 Iml

=

k, n

hn hn - k fU - k

=

(6. 3 . 8 )!

Po

(1991)), 6.3.5. mo(O) 1. mo (6.1.1) I mo - �) 1 0,

1, 1990 (1992)

0,

Po


191

R EMARK . This is sufficient to prove equivalence. If we denote Lawton's orig inal condition by (L), Cohen ' s condition by (C), Lawton ' s condition rephrased in terms of by and the orthonormality of the The 2 y are obtained by shifting the decimal point to the left. Since is 211'-periodic, only the "tail," i.e., the part of the expansion of 2 y to the right of the decimal point, decides whether 211'2 - y vanishes or not. If = db then y 2 would have the same decimal part as hence 211'y 2 = 0 would follow. Since 0, we therefore have = Similarly, e L+ n we conclude = = etc. It follows that are also successively equal to for some k E = . . . = {I, 2, . . . , n}. Since the are not all equal to 0, whereas = 0, this is a contradiction. This finishes the proof. •

/

eL+n

With Theorem 6.3.5 we end our discussion of necessary and sufficient conditions on $m_0$. The following theorem summarizes the main results of §§6.2 and 6.3.

THEOREM 6.3.6. Suppose $m_0$ is a trigonometric polynomial such that $|m_0(\xi)|^2 + |m_0(\xi + \pi)|^2 = 1$ and $m_0(0) = 1$. Define $\phi$, $\psi$ by

$$\hat\phi(\xi) = (2\pi)^{-1/2}\prod_{j=1}^{\infty} m_0(2^{-j}\xi),$$

$$\hat\psi(\xi) = -e^{-i\xi/2}\,\overline{m_0(\xi/2 + \pi)}\,\hat\phi(\xi/2).$$

Then $\phi$, $\psi$ are compactly supported $L^2$-functions, satisfying

$$\phi(x) = \sqrt2\sum_n h_n\,\phi(2x - n),$$

$$\psi(x) = \sqrt2\sum_n (-1)^n h_{-n+1}\,\phi(2x - n),$$

where $h_n$ is determined by $m_0$ via $m_0(\xi) = \frac{1}{\sqrt2}\sum_n h_n e^{-in\xi}$. Moreover, the $\psi_{j,k}(x) = 2^{-j/2}\psi(2^{-j}x - k)$, $j, k \in \mathbb{Z}$ constitute a tight frame for $L^2(\mathbb{R})$ with frame constant 1. This tight frame is an orthonormal basis if and only if $m_0$ satisfies one of the following equivalent conditions:

• There exists a compact set $K$, congruent to $[-\pi, \pi]$ modulo $2\pi$, containing a neighborhood of 0, so that

  $$\inf_{k>0}\ \inf_{\xi\in K} |m_0(2^{-k}\xi)| > 0.$$

• There exists no nontrivial cycle $\{\xi_1, \ldots, \xi_n\}$ in $[0, 2\pi[$, invariant under $\xi \mapsto 2\xi$ modulo $2\pi$, such that $m_0(\xi_j + \pi) = 0$ for all $j = 1, \ldots, n$.

• The eigenvalue 1 of the $[2(N_2 - N_1) - 1] \times [2(N_2 - N_1) - 1]$-dimensional matrix $A$ defined by

  $$A_{\ell k} = \sum_{n=N_1}^{N_2} h_n\,\overline{h_{k-2\ell+n}}, \qquad -(N_2 - N_1) + 1 \le \ell, k \le (N_2 - N_1) - 1$$

  (where we assume $h_n = 0$ for $n < N_1$, $n > N_2$) is nondegenerate.
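The third condition lends itself to a direct numerical test. The sketch below builds the matrix $A$ for real filters and counts how many eigenvalues lie within a small tolerance of 1. It is an illustration under assumptions: the function names and the tolerance are hypothetical, the four-tap filter is the standard one (quoted here, not taken from this section), and the second filter is the counterexample $m_0(\xi) = \frac12(1 + e^{-3i\xi})$, i.e. $h_0 = h_3 = 1/\sqrt2$.

```python
import numpy as np

def lawton_matrix(h):
    """A[l, k] = sum_n h_n h_{k - 2l + n} for a real filter h_0..h_{width}."""
    h = np.asarray(h, dtype=float)
    width = len(h) - 1                              # N2 - N1
    autocorr = np.correlate(h, h, mode="full")      # autocorr[m + width] = sum_n h_n h_{n+m}
    indices = range(-width + 1, width)              # -(N2-N1)+1 <= l, k <= (N2-N1)-1
    A = np.zeros((2 * width - 1, 2 * width - 1))
    for row, l in enumerate(indices):
        for col, k in enumerate(indices):
            m = k - 2 * l
            if abs(m) <= width:
                A[row, col] = autocorr[m + width]
    return A

def multiplicity_of_one(h, tol=1e-6):
    """Number of eigenvalues of the matrix within `tol` of 1."""
    eigenvalues = np.linalg.eigvals(lawton_matrix(h))
    return int(np.sum(np.abs(eigenvalues - 1.0) < tol))

s3 = np.sqrt(3.0)
four_tap = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
stretched_haar = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)

print(multiplicity_of_one(four_tap))        # eigenvalue 1 is simple
print(multiplicity_of_one(stretched_haar))  # eigenvalue 1 is degenerate
```

For the four-tap filter the eigenvalue 1 is simple, whereas for the counterexample it appears twice, consistent with the fact that the latter only yields a tight frame and not an orthonormal basis.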

FIG. 6.3. Plots of the scaling functions $_N\phi$ and wavelets $_N\psi$ for the compactly supported wavelets with maximum number of vanishing moments for their support width, and with the extremal phase choice, for $N$ = 2, 3, 5, 7, and 9.

TABLE 6.3

The low-pass filter coefficients for the "least asymmetric" compactly supported wavelets with maximum number of vanishing moments, for $N$ = 4 to 10. Listed here are the $c_{N,n} = \sqrt2\,h_{N,n}$; one has $\sum_n c_{N,n} = 2$.

N = 4

N = 5

N = 6

n

0

- 0 . 1 07 1 4890 1 4 1 8

1

- 0 . 04 1 9 1 0965 1 2 5

5

- 0.038493 5 2 1 263

2

0 . 703739068656

6

- 0 .073462 50876 1 0 . 5 1 5 398670374

eN,"

n

N _ 8

3

1 . 1 36658243408

7

4

0 . 4 2 1 234534204

8

1 .099106630537

5

- 0 . 1 403 1 7624 1 79

9

0 .68074 534 7 1 90

- 0 . 0 \ 782470 1 4 4 2

10

- 0 . 0866536 1 5406

7

0.045570345896

11

- 0 . 202648655286

0

0.038654 7959 5 5

12

0 . 0 1 07586 1 1 7 5 1

13

0.044 8236230 4 2

1

0.04 1 7468644 2 2

14

- 0 .000766690896

2

- 0.055344 1 861 1 7

15

- 0 .004 7834585 1 2

3

0 . 2 8 1 990896854

4

1 .023052966894

5

0 .896581 648380

N = 9

0

0 . 00 1 5 1 2487309

1

- 0 . 000669 1 4 1 509 - 0 .0 1 4 5 1 5 5 78553

6

0 .023478923136

2

7

- 0 . 2 4 795 1 3626 1 3

3

0 . 0 \ 2 5 28896242

8

- 0 .02984 2499869

4

0.087791 2 5 1 554

9

0.027632 1 5 2958

5

- 0 . 0 2 5 78644 5930

6

- 0 . 2 70893783503 0.049882830959 0 . 8730484U7349

0 . 0 2 1 784700327

7

1

0.0049366 1 2372

8

2

- 0 . 1 668632 1 54 1 2

9

1 .0 1 5 2 59790832

3

- 0.068323 1 2 1 587

10

0 . 337658923602 .- 0.0771 7 2 1 6 1 097

0

4

0.694 4 5 7972958

11

5

1 . 1 1 3892783926

12

U.0008 2 5 1 40929

6

0 . 4 779043 7 1 333

13

0 . 0 4 2 7 4 4 4 33602

7

- 0 . 1 02724969862

14

- 0 . 0 1 630335 1 2 26

8

- 0.02978375 1 299

15

- 0 . 0 1 8 769396836

9

0 .063250562660

16

0 . 000876502539

10

0 .002499922093

17

0.00 1 98 1 1 93736

11

- 0 . 0 1 1 03 1 867509

0

0 . 003792658534

0

0 . 00 1 089 1 704 4 7

1

0.000 1 35245020

2

- 0.00 1 48 1 2 2 59 1 5

2

- om 2 2 2064 2630

- 0.01 78704 3 1 6 5 1

3

- 0.002072363923

3

0.043 1 5 54 52582

4

4

0 . 0960 1 4 767936

5

+ 0.0649509 2 4 5 79

5

- 0.0700782 9 1 2 2 2

6

- 0 . 2 2 5 5 58972234

1

N = 8

eN."

0.0694904659 1 1

6

N = 10 N = i

4

0 . 0 1 64 1 8869H6

6

0.024665659489

7

- 0 . 1 002402 1 5031

7

0 . 7581 6260 1 964

8

0 . 6670 7 1 338 1 5 4

8

1 .0857827098 1 4

9

1 .088 2 5 1 530500

9

0.4081 83939725

10

0 . 5 4 2 8 1 30 1 1 2 1 3

10

- 0 . 1 98056706807

11

- 0.050256540092

11

- 0 . 1 5 24638 7 1 896

12

- 0.0452407722 1 8

12

0 .00567 1 3 4 2686

13

0.070703567550

13

0 .0 1 4 5 2 1 394762

14

0.008 1 5 2 8 1 6799

15

- 0.028786231926

0

0 .0026727!J3393

16

- 0 . 00 1 1 375353 1 4

1

- 0 . 0004 28394300

17

0 . 0064 95728375

2

- 0 . 02 1 1 4 5686528

18

0 . 00008066 1 204

3

0.005386388754

19

- 0 .000649589896

FIG. 6.4. Plots of the scaling function $\phi$ and the wavelet $\psi$ for the "least asymmetric" compactly supported wavelets with maximum number of vanishing moments, for $N$ = 4, 6, 8, and 10.

FIG. 6.5. $|m_0(\xi)|$ for $N$ = 2, 6 and 10, corresponding to the filters in Table 6.1 or 6.3.

FIG. 6.6. Plot of $|m_0(\xi)|$ for the 8-tap filter corresponding to $N = 2$ and $m_0(7\pi/9) = 0$.


orthonormal bases are indeed very flat at 0 and $\pi$, but very "round" in the transition region, near $\pi/2$. The filters can be made "steeper" in this transition region by a judicious choice of $R$ in (6.1.11). Figure 6.6 shows the plot of $|m_0|$ corresponding to $N = 2$ and $R$ of degree 3 chosen such that $m_0$ has a zero at $\xi = 7\pi/9$ (= 140°). This is much closer to a "realistic" subband coding filter. The corresponding "least asymmetric" function $\phi$ is shown in Figure 6.7; it is less smooth than $_4\phi$ (which has the same support width, but corresponds to $N = 4$ and $R \equiv 0$), but turns out to be smoother than $_2\phi$ (for which $m_0$ has a zero of the same multiplicity, i.e., 2, at $\xi = \pi$). In Chapter 7 we will come back in greater detail to these regularity and flatness issues. The $h_n$ corresponding to Figure 6.7 are listed in Table 6.4.

FIG. 6.7. The "least asymmetric" scaling function $\phi$ and wavelet $\psi$ corresponding to $|m_0|$ as plotted in Figure 6.6.

I¢I l ?,b l II mo 0mo(± 2; ) 1, mt(O mo 2; ). mt 1. (6.1.1), mo j i e mt(2 �), I1� 1 + f,/2

All these examples correspond to real hn ' ¢ and 'ljJ, i.e., to and symmet ric around � = O. It is also possible to construct (complex) examples with ¢ , l ?,b l of the concentrated much more on � > than on � < O. Take for instance the (� _ = = and define previous example, which satisfies since does, and ( O ) = We can there obviously satisfies This ?,b = = # (O fore construct ¢# (�) mt (�/2 7r) ¢# ( � /2) ;

mt

TABLE 6.4

The coefficients for the low-pass filter corresponding to the scaling function in Figure 6.7.

n

01 342 657

hn -0. 0 802861503271 243085969067 -0.0. 3062806341592 50576616156 0.0. 5229036357075 -0. 00644368523121 -0. 115565483406 0. 0381688330633

these are compactly supported L 2 -functions, and the 'lj;tk ' j, k E Z constitute a tight frame for L 2 (JR) , by Proposition Moreover, since the only zeros 7 of ma on are in � = ± ; , ±11", it follows that mt (O = only for or 7; . Consequently, Imt (� ) 1 � C > for I � I :::; i , and the 'lj;tk � = ±11", constitute an orthonormal wavelet basis, by Corollary Figure plots oo Imt (O I , I ¢# ( O I and 1 ¢ # (O I ; it is clear that fa d� 1 ¢ # (O I 2 is much larger than a L oo d� 1 'Ij; # (O I 2 . Note that the negative frequency part of 'Ij; # is much closer to the origin than the positive frequency part, as required by the necessary condition oo The existence of such fa � 1 � I - l l ¢ # (O I 2 = f� oo � 1 � I -l l ¢ # ( � ) 1 2 (see "asymmetric" ¢ was first pointed out in Cohen in fact, for any E > one can find an orthonormal wavelet basis such that La oo � 1 'Ij;(O I 2 < E.

6. 2 . 3 .

[-11",11"] - 5; -

0 6. 3 . 2 .

A

0

6. 8

A

§3.4). (1990);

A

0

FIG. 6.8. Plots of $|m_0|$, $|\hat\phi|$, and $|\hat\psi|$ for an orthonormal wavelet basis where $\hat\psi$ is concentrated more on positive than on negative frequencies.

6.5. The cascade algorithm: The link with subdivision or refinement schemes.

It can already be suspected from the figures in §6.4 that there is no closed form analytic formula for the compactly supported $\phi(x)$, $\psi(x)$ constructed here (except for the Haar case). Nevertheless, we can, if $\phi$ is continuous, compute $\phi(x)$ with arbitrarily high precision for any given $x$; we also have a fast algorithm to compute the plot of $\phi$.$^8$ Let us see how this works. First of all, since $\phi$ has compact support, and $\phi \in L^1(\mathbb{R})$ with $\int dx\,\phi(x) = 1$, we have

PROPOSITION 6.5.1. If $f$ is a continuous function on $\mathbb{R}$, then, for all $x \in \mathbb{R}$,

$$\lim_{j\to\infty} 2^j\int dy\,f(x + y)\,\overline{\phi(2^jy)} = f(x). \tag{6.5.1}$$

If $f$ is uniformly continuous, then this pointwise convergence is uniform as well. If $f$ is Hölder continuous with exponent $\alpha$,

$$|f(x) - f(y)| \le C\,|x - y|^\alpha,$$

then the convergence is exponentially fast in $j$:

$$\Big|f(x) - 2^j\int dy\,f(x + y)\,\overline{\phi(2^jy)}\Big| \le C\,2^{-j\alpha}. \tag{6.5.2}$$

Proof. All the assertions follow from the fact that $2^j\,\overline{\phi(2^j\,\cdot)}$ is an "approximate $\delta$-function" as $j$ tends to $\infty$. More precisely,

$$\begin{aligned}
\Big|f(x) - 2^j\int dy\,f(x + y)\,\overline{\phi(2^jy)}\Big| &= \Big|2^j\int dy\,[f(x) - f(x + y)]\,\overline{\phi(2^jy)}\Big| \\
&= \Big|\int dz\,[f(x) - f(x + 2^{-j}z)]\,\overline{\phi(z)}\Big| \\
&\le \|\phi\|_{L^1}\cdot\sup_{|u|\le 2^{-j}R} |f(x) - f(x + u)|
\end{aligned}$$

(where we suppose support $\phi \subset [-R, R]$). If $f$ is continuous, then this can be made arbitrarily small by choosing $j$ sufficiently large. If $f$ is uniformly continuous, then the choice of $j$ can be made independently of $x$, and the convergence is uniform. If $f$ is Hölder continuous, then (6.5.2) follows immediately as well. $\blacksquare$
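The "approximate $\delta$-function" behaviour is easy to observe in the simplest case. The sketch below makes several assumptions for illustration: it takes $\phi$ to be the Haar scaling function (the indicator of $[0,1]$), for which $2^j\int dy\,f(x+y)\,\phi(2^jy)$ is just the average of $f$ over $[x, x + 2^{-j}]$, and it uses $f = \cos$, which is Lipschitz with constant 1, so that the error should be bounded by a multiple of $2^{-j}$. The helper name `haar_smoothed` is hypothetical.

```python
import math

def haar_smoothed(antiderivative, x, j):
    """2^j * integral of f(x + y) * phi(2^j y) dy for phi = indicator of [0, 1].

    The integral is the mean of f over [x, x + 2^-j], evaluated exactly
    from an antiderivative of f.
    """
    width = 2.0 ** (-j)
    return (antiderivative(x + width) - antiderivative(x)) / width

x0 = 0.3
levels = range(1, 16)
# f = cos, with antiderivative sin
errors = [abs(math.cos(x0) - haar_smoothed(math.sin, x0, j)) for j in levels]
bounds = [2.0 ** (-j) for j in levels]

for j, err, bound in zip(levels, errors, bounds):
    print(j, err, bound)
```

Each increase of $j$ by one roughly halves the error, which is the exponential convergence in $j$ stated in (6.5.2) for Hölder exponent $\alpha = 1$.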

Assume now that $\phi$ itself is continuous, or even Hölder continuous with exponent $\alpha$. (We will see many techniques to compute the Hölder exponent of $\phi$ in the next chapter.) Take $x$ to be any dyadic rational, $x = 2^{-J}K$. Then Proposition 6.5.1 tells us that

$$\begin{aligned}
\phi(x) &= \lim_{j\to\infty} 2^j \int dy\, \phi(2^{-J}K + y)\,\phi(2^j y) \\
&= \lim_{j\to\infty} 2^{j/2} \int dz\, \phi(z)\,\phi_{-j,\,2^{j-J}K}(z) \\
&= \lim_{j\to\infty} 2^{j/2}\, \langle \phi, \phi_{-j,\,2^{j-J}K} \rangle .
\end{aligned}$$

Moreover, for $j$ larger than some $j_0$,

$$\left| \phi(2^{-J}K) - 2^{j/2}\,\langle \phi, \phi_{-j,\,2^{j-J}K} \rangle \right| \le C\,2^{-j\alpha}, \tag{6.5.3}$$

where $C$, $j_0$ are independent of $J$ or $K$. If $2^{j-J}K$ is integer, which is automatically true if $j \ge J$, then the inner products $\langle \phi, \phi_{-j,\,2^{j-J}K} \rangle$ are easy to compute. Under the assumption that the $\phi_{0,n}$ are orthonormal (which can be checked with any of the necessary and sufficient conditions on $m_0$ listed in Theorem 6.3.5), $\phi$ is the unique function $f$ characterized by

$$\langle f, \phi_{0,n} \rangle = \delta_{0,n}, \tag{6.5.4}$$

$$\langle f, \psi_{-j,k} \rangle = 0 \quad \text{for } j > 0,\ k \in \mathbb{Z}. \tag{6.5.5}$$

We can use this as input for the reconstruction algorithm of the subband filtering associated with $m_0$ (see §5.6). More specifically, we start with a low pass sequence $c^0_n = \delta_{0,n}$ and a highpass sequence $d^0_n = 0$, and we "crank the machine" to obtain

$$c^{-1}_n = \sum_k h_{n-2k}\, c^0_k . \tag{6.5.6}$$

We then use $d^{-1}_n = 0$, to obtain, after another cranking,

$$c^{-2}_m = \sum_n h_{m-2n}\, c^{-1}_n , \tag{6.5.7}$$

etc. At every stage, the $c^{-j}_n$ are equal to $\langle \phi, \phi_{-j,n} \rangle$. Together with (6.5.3), this means that we have an algorithm with exponentially fast convergence to

compute the values of $\phi$ at dyadic rationals. We can interpolate these values and thus obtain a sequence of functions $\eta_j$ approximating $\phi$.⁹ We can, for instance, define $\eta^0_j$ to be the function, piecewise constant on the intervals $[2^{-j}(n-1/2),\, 2^{-j}(n+1/2)[$, $n \in \mathbb{Z}$, such that $\eta^0_j(2^{-j}k) = 2^{j/2}\langle \phi, \phi_{-j,k} \rangle$. Another possible choice is $\eta^1_j(x)$, piecewise linear on the $[2^{-j}n,\, 2^{-j}(n+1)]$, $n \in \mathbb{Z}$, so that $\eta^1_j(2^{-j}k) = 2^{j/2}\langle \phi, \phi_{-j,k} \rangle$. For both choices we have the following proposition.

PROPOSITION 6.5.2. If $\phi$ is Hölder continuous with exponent $\alpha$, then there exist $C > 0$ and $j_0 \in \mathbb{N}$ so that, for $j \ge j_0$,

$$\|\phi - \eta^\epsilon_j\|_{L^\infty} \le C\,2^{-\alpha j}. \tag{6.5.8}$$

Proof. Take any $x \in \mathbb{R}$. For any $j$, choose $n$ so that $2^{-j}n \le x < 2^{-j}(n+1)$. By the definition of $\eta^\epsilon_j$, $\eta^\epsilon_j(x)$ is necessarily a convex linear combination of $2^{j/2}\langle \phi, \phi_{-j,n} \rangle$ and $2^{j/2}\langle \phi, \phi_{-j,n+1} \rangle$, whether $\epsilon = 0$ or 1. On the other hand, if $j$ is larger than some $j_0$,

$$\begin{aligned}
|\phi(x) - 2^{j/2}\langle \phi, \phi_{-j,n} \rangle|
&\le |\phi(x) - \phi(2^{-j}n)| + |\phi(2^{-j}n) - 2^{j/2}\langle \phi, \phi_{-j,n} \rangle| \\
&\le C\,|x - 2^{-j}n|^\alpha + C\,2^{-j\alpha} \le C\,2^{-j\alpha};
\end{aligned}$$

the same is true if we replace $n$ by $n+1$. It follows that a similar estimate holds for any convex combination, or $|\phi(x) - \eta^\epsilon_j(x)| \le C\,2^{-j\alpha}$. Here $C$ can be chosen independently of $x$, so that (6.5.8) follows. ∎

This then is our fast algorithm to compute approximate values of $\phi(x)$ with arbitrarily high precision:

1. Start with the sequence $\cdots 0 \cdots 0\,1\,0 \cdots 0 \cdots$, representing the $\eta_0(n)$, $n \in \mathbb{Z}$.

2. Compute the $\eta_j(2^{-j}n)$, $n \in \mathbb{Z}$, by "cranking the machine" as in (6.5.7). At every step of this cascade, twice as many values are computed: values at "even points" $2^{-j}(2k)$ are refined from the previous step,

$$\eta_j(2^{-j}\,2k) = \sqrt{2}\,\sum_\ell h_{2(k-\ell)}\;\eta_{j-1}(2^{-j+1}\ell), \tag{6.5.9}$$

and values at the "odd points" $2^{-j}(2k+1)$ are computed for the first time,

$$\eta_j(2^{-j}(2k+1)) = \sqrt{2}\,\sum_\ell h_{2(k-\ell)+1}\;\eta_{j-1}(2^{-j+1}\ell). \tag{6.5.10}$$

3. Interpolate the $\eta_j(2^{-j}n)$ (piecewise constant if $\epsilon = 0$, piecewise linear if $\epsilon = 1$) to obtain $\eta^\epsilon_j(x)$ for non-dyadic $x$.
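The cascade steps above can be sketched in a few lines. The following is only an illustration, not the author's code: the function name `cascade` and the array layout are assumptions, and the filter `D4` is assumed to be the four-tap filter of the $N = 2$ construction of §6.4, normalized so that $\sum_n h_n = \sqrt{2}$. One step of (6.5.9)-(6.5.10) is an upsampling by 2 followed by a convolution with $\sqrt{2}\,h$.

```python
import numpy as np

# Assumed example filter: four taps of the N = 2 construction, sum(h) = sqrt(2).
s3 = np.sqrt(3.0)
D4 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4.0 * np.sqrt(2.0))

def cascade(h, steps):
    """Return (x, eta) with eta[n] = eta_j(2^-j n), j = steps, x[n] = 2^-j n.

    Each pass implements eta_j(2^-j n) = sqrt(2) * sum_k h[n - 2k] * eta_{j-1}(2^{-j+1} k),
    i.e. formulas (6.5.9) (even n) and (6.5.10) (odd n) in one convolution.
    """
    h = np.asarray(h, dtype=float)
    eta = np.array([1.0])              # eta_0(n) = delta_{n,0}
    for _ in range(steps):
        up = np.zeros(2 * len(eta) - 1)
        up[::2] = eta                  # insert zeros between the old samples
        eta = np.sqrt(2.0) * np.convolve(h, up)
    x = np.arange(len(eta)) / 2.0**steps
    return x, eta

if __name__ == "__main__":
    x, eta = cascade(D4, 10)
    print(len(x), eta[2**10])          # sample count and the approximation to phi(1)
```

Plotting `eta` against `x` (piecewise linearly) gives the graph of $\eta^1_j$; the convergence at a fixed point is only geometric with ratio $2^{-\alpha}$, so several steps are needed before the values settle.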

The cascade can also be restarted locally, to zoom in on a small interval. For a filter with the four taps $h_0, \ldots, h_3$ (as the index bounds below presuppose), the computation of the $\eta_J(2^{-J}n)$ for which $15 \cdot 2^{J-4} \le n \le 17 \cdot 2^{J-4}$ involves only the $\eta_{J-1}(2^{-J+1}k)$ with $(n-3)/2 \le k \le n/2$. Computation of these, in turn, involves only the $\eta_{J-2}(2^{-J+2}\ell)$ with $(k-3)/2 \le \ell \le k/2$, or $n/4 - 3/4 - 3/2 \le \ell \le n/4$. Working back to $j = J-4$, we see that to compute $\eta_9$ on $[\frac{15}{16}, \frac{17}{16}]$ we only need the $\eta_5(2^{-5}m)$ for $28 \le m \le 34$. We can therefore start the cascade from $\cdots 0 \cdots 0\,1\,0 \cdots 0 \cdots$, go five steps, select the seven values $\eta_5(2^{-5}m)$, $28 \le m \le 34$, use only these as the input for a new cascade, with four steps, and end up with a graph of $\eta_9$ on $[\frac{15}{16}, \frac{17}{16}]$. For larger blowups on even smaller intervals, we simply repeat the process; the blowup graphs in Chapter 7 have all been computed in this way.¹¹

The arguments leading to the cascade algorithm have implicitly used the orthonormality of the $\psi_{j,k}$, or equivalently (see §6.2, 6.3), of the $\phi_{0,n}$: we have characterized $\phi$ as the unique function $f$ satisfying (6.5.4), (6.5.5). The cascade algorithm can also be viewed differently, without emphasizing orthonormality at all, as a special case of a stationary subdivision or refinement scheme. Refinement schemes are used in computer graphics to design smooth curves or surfaces going through or passing near a discrete, often rather sparse, set of points. An excellent review is Cavaretta, Dahmen, and Micchelli (1991). We will restrict ourselves, in this short discussion, to one-dimensional subdivision schemes.¹² Suppose that we want a curve $y = f(x)$ taking on the preassigned values $f(n) = f_n$. One possibility is simply to construct the piecewise linear graph through the points $(n, f_n)$; this graph has the peculiarity that, for all $n$,

$$f\!\left(\frac{2n+1}{2}\right) = \frac12 f(n) + \frac12 f(n+1), \tag{6.5.11}$$

which gives a quick way to compute $f$ at half-integer points. The values of $f$ at quarter-integer points can be computed similarly,

$$f\!\left(\frac{n}{2} + \frac14\right) = \frac12 f\!\left(\frac{n}{2}\right) + \frac12 f\!\left(\frac{n+1}{2}\right), \tag{6.5.12}$$

and so on for $\mathbb{Z}/4 + \mathbb{Z}/8$, etc. This provides a fast recursive algorithm for the computation of $f$ at all dyadic rationals. If we choose to have a smoother spline interpolation than by piecewise linear splines (quadratic, cubic or even higher order splines), then the formulas analogous to (6.5.11), (6.5.12), computing the $f(2^{-j}n + 2^{-j-1})$ from the $f(2^{-j}k)$, would contain an infinite number of terms. It is possible to opt for smoother than linear spline approximation, with interpolation formulas of the type

$$f(2^{-j}n + 2^{-j-1}) = \sum_k a_k\, f(2^{-j}(n-k)), \tag{6.5.13}$$

with only finitely many $a_k$ nonzero; the resulting curves are no longer splines. An example is

$$f(2^{-j}n + 2^{-j-1}) = -\frac{1}{16}\left[ f(2^{-j}(n-1)) + f(2^{-j}(n+2)) \right] + \frac{9}{16}\left[ f(2^{-j}n) + f(2^{-j}(n+1)) \right]. \tag{6.5.14}$$

This example was studied in detail in Dubuc (1986), Dyn, Gregory, and Levin (1987), and generalized in, for example, Deslauriers and Dubuc (1989) and Dyn and Levin (1989); it leads to an almost $C^2$-function $f$. (For details on methods to determine the regularity of $f$, see Chapter 7.) Formula (6.5.14) describes an interpolation refinement scheme, in which, at every stage of the computation, the values computed earlier remain untouched, and only values at intermediate points need to be computed. One can also consider schemes where at every stage the values computed at the previous stage are further "refined," corresponding to a more general refinement scheme of the type

$$f_{j+1}(2^{-j-1}n) = \sum_k w_{n-2k}\, f_j(2^{-j}k). \tag{6.5.15}$$
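A single step of the general scheme (6.5.15) can be sketched as an upsampling followed by a convolution with the mask $w$. The code below is an illustrative sketch only: the names `refine`, `start` and `w_min` are hypothetical, sequences are assumed to be finitely supported, and the mask used in the example is the one that reproduces the four-point formula (6.5.14), namely $w_0 = 1$, $w_{\pm 1} = 9/16$, $w_{\pm 3} = -1/16$.

```python
import numpy as np

def refine(f, start, w, w_min):
    """One step of f_{j+1}[n] = sum_k w[n - 2k] * f_j[k].

    f     : samples f_j[k] for k = start, start + 1, ...
    w     : mask values w[m] for m = w_min, w_min + 1, ...
    Returns (g, g_start): g[i] is f_{j+1}[g_start + i] on the twice finer grid.
    """
    f = np.asarray(f, dtype=float)
    up = np.zeros(2 * len(f) - 1)
    up[::2] = f                              # zeros at the new intermediate points
    g = np.convolve(np.asarray(w, dtype=float), up)
    return g, 2 * start + w_min

# Mask of the four-point interpolation scheme (6.5.14), indices -3..3.
FOUR_POINT = np.array([-1.0, 0.0, 9.0, 16.0, 9.0, 0.0, -1.0]) / 16.0

if __name__ == "__main__":
    p = lambda x: x**3 - 2.0 * x + 1.0       # any cubic is reproduced in the interior
    f0 = p(np.arange(10.0))
    g, g_start = refine(f0, 0, FOUR_POINT, -3)
    xs = (g_start + np.arange(len(g))) / 2.0
    inner = (xs >= 1.5) & (xs <= 7.5)        # points not affected by the truncation
    print(np.max(np.abs(g[inner] - p(xs[inner]))))
```

Because $w_{2k} = \delta_{k,0}$ for this mask, the even-indexed outputs simply copy the old values, which is what makes the scheme interpolating; near the ends of a finite sequence the output is affected by the truncation and should be discarded.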

Formula (6.5.15) corresponds in fact to two convolution schemes (with two masks, in the terminology of the refinement literature),

$$f_{j+1}(2^{-j}n) = \sum_k w_{2(n-k)}\, f_j(2^{-j}k) \tag{6.5.16}$$

(the refinement of already computed values), and

$$f_{j+1}(2^{-j}n + 2^{-j-1}) = \sum_k w_{2(n-k)+1}\, f_j(2^{-j}k) \tag{6.5.17}$$

(computation of values at new intermediate points). In a sensible refinement scheme, the $f_j$ converge, as $j$ tends to $\infty$, to a continuous (or smoother; see Chapter 7) function $f_\infty$. Note that (6.5.15) defines the $f_j$ only on the discrete set $2^{-j}\mathbb{Z}$. A precise statement of the "convergence" of $f_j$ to the continuous function $f_\infty$ is that

$$\lim_{j\to\infty}\ \sup_{k\in\mathbb{Z}} \left| f^\lambda_\infty(2^{-j}k) - f^\lambda_j(2^{-j}k) \right| = 0, \tag{6.5.18}$$

where the superscript $\lambda$ indicates the initial data, $f^\lambda_0(n) = \lambda_n$. The refinement scheme is said to converge if (6.5.18) holds for all $\lambda \in \ell^\infty(\mathbb{Z})$; see Cavaretta, Dahmen, and Micchelli (1991). (It is also possible to rephrase (6.5.18) by first introducing continuous functions $f^\lambda_j$ interpolating the $f^\lambda_j(2^{-j}k)$; see below.) A general refinement scheme is an interpolation scheme if $w_{2k} = \delta_{k,0}$, leading to $f_{j+1}(2^{-j}n) = f_j(2^{-j}n)$. In both cases, general refinement scheme or more restrictive interpolation

scheme, it is easy to see that the linearity of the procedure implies that the limit function $f_\infty$ (which we suppose continuous¹³) is given by

$$f_\infty(x) = \sum_n f_0(n)\, F(x-n), \tag{6.5.19}$$

where $F = F_\infty$ is the "fundamental solution," obtained by the same refinement scheme from the initial data $F_0(n) = \delta_{n,0}$. This fundamental solution obeys a particular functional equation. To derive this equation, we first introduce functions $f_j(x)$ interpolating the discrete $f_j(2^{-j}k)$:

$$f_j(x) = \sum_k f_j(2^{-j}k)\;\omega(2^j x - k), \tag{6.5.20}$$

where $\omega$ is a "reasonable"¹⁴ function so that $\omega(n) = \delta_{n,0}$. Two obvious choices are $\omega(x) = 1$ for $-\frac12 \le x < \frac12$, 0 otherwise, or $\omega(x) = 1 - |x|$ for $|x| \le 1$, 0 otherwise. (These correspond to the two choices in the exposition of the cascade algorithm above.) The convergence requirement (6.5.18) can then be rewritten as $\|f^\lambda_j - f^\lambda_\infty\|_{L^\infty} \to 0$ for $j \to \infty$. For the fundamental solution $F_\infty$, we start from $F_0(x) = \omega(x)$. The next two approximating functions $F_1$, $F_2$ satisfy

$$\begin{aligned}
F_1(x) &= \sum_n F_1(n/2)\,\omega(2x - n) && \text{(by (6.5.20))} \\
&= \sum_n w_n\,\omega(2x - n) && \text{(use (6.5.15) and } F_0(n) = \delta_{n,0}\text{)} \\
&= \sum_n w_n\, F_0(2x - n),
\end{aligned} \tag{6.5.21}$$

$$\begin{aligned}
F_2(x) &= \sum_n F_2(n/4)\,\omega(4x - n) \\
&= \sum_{n,k} w_{n-2k}\, F_1(k/2)\,\omega(4x - n) && \text{(use (6.5.15))} \\
&= \sum_k w_k \sum_\ell w_\ell\,\omega(4x - 2k - \ell) && \text{(because } F_1(k/2) = w_k\text{)} \\
&= \sum_k w_k\, F_1(2x - k).
\end{aligned}$$

This suggests that a similar formula should hold for all $F_j$, i.e.,

$$F_j(x) = \sum_k w_k\, F_{j-1}(2x - k) \tag{6.5.22}$$

$N_2$, and $h_{N_1} \ne 0 \ne h_{N_2}$. Then (6.5.24) already implies that either $N_1 = 0$ or $N_2 = 0$. Suppose $N_1 = 0$ ($N_2 = 0$ is analogous); $N_2$ is necessarily odd, $N_2 = 2L+1$. Take $k = 2L$ in (6.5.24). Then

… In that case $|\sin n\zeta| \le |n\zeta|^{\min(1,\epsilon)}$ leads to a similar bound.

5. We use here the classical formula

$$\frac{\sin x}{x} = \prod_{j=1}^\infty \cos(2^{-j}x).$$

An easy proof uses $\sin 2\alpha = 2\cos\alpha\,\sin\alpha$ to write

$$\prod_{j=1}^J \cos(2^{-j}x) = \prod_{j=1}^J \frac{\sin(2^{-j+1}x)}{2\,\sin(2^{-j}x)} = \frac{\sin x}{2^J \sin(2^{-J}x)},$$

which tends to $\frac{\sin x}{x}$ for $J \to \infty$. In Kac (1959) this formula is credited to Vieta, and used as a starting point for a delightful treatise on statistical independence.
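The telescoping identity in this note is easy to check numerically. The short sketch below is only an illustration (the function names are made up for the example): it compares the finite cosine product with the closed form $\sin x / (2^J \sin(2^{-J}x))$ and with the limit $\sin x / x$.

```python
import math

def cos_product(x, J):
    """Finite product prod_{j=1}^{J} cos(2^-j x)."""
    prod = 1.0
    for j in range(1, J + 1):
        prod *= math.cos(x / 2**j)
    return prod

def closed_form(x, J):
    """sin x / (2^J sin(2^-J x)), valid when sin(2^-J x) != 0."""
    return math.sin(x) / (2**J * math.sin(x / 2**J))

if __name__ == "__main__":
    x = 1.7
    for J in (1, 5, 20):
        print(J, cos_product(x, J), closed_form(x, J), math.sin(x) / x)
```

The gap between the finite product and $\sin x / x$ shrinks like $4^{-J}$, since $2^J \sin(2^{-J}x) = x\,(1 - x^2 4^{-J}/6 + \cdots)$.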

6. This is true in general: if $m_0$ satisfies (6.1.1) and $\phi$, as defined by (6.2.2), generates a non-orthonormal family of translates $\phi_{0,n}$, then necessarily $\sum_\ell |\hat\phi(\xi + 2\pi\ell)|^2 = 0$ for some $\xi$. (See Cohen (1990b).)

7. The condition $\int dx\, \psi(x)\,\psi(x-k) = \delta_{k,0}$ may seem stronger than $\|\psi\| = 1$, but since the $\psi_{j,k}$ constitute a tight frame with frame constant 1, the two are equivalent, by Proposition 3.2.1.

8. Since $\psi(x)$ is a finite linear combination of translates of $\phi(2x)$, fast algorithms to plot $\phi$ also lead to fast plots for $\psi$. Throughout this section, we restrict our attention to $\phi$ only.

9. If $\phi$ is not continuous, then the $\eta_j$ still converge to $\phi$ in $L^2$ (see §6.3). Moreover, they converge to $\phi$ pointwise in every point where $\phi$ is continuous.

10. The choice $\epsilon = 1$ was used in the proof of Proposition 3.3 in Daubechies (1988b), because the $\hat\eta^1_j$ are absolutely integrable, whereas the $\hat\eta^0_j$ are not. In Daubechies (1988b) the convergence of the $\eta_j$ to $\phi$ was actually proved first (using some extra technical conditions), and orthonormality of the $\phi_{0,n}$ was then deduced from this convergence.

11. Note that there exist many other procedures for plotting graphs of wavelets. Instead of a refinement cascade one can also start from appropriate $\phi(n)$ and then compute the $\phi(2^{-j}k)$ directly from $\phi(x) = \sqrt{2}\,\sum_n h_n\,\phi(2x - n)$. (In fact, when $\phi$ is not continuous, the cascade algorithm may diverge, while this direct use of the 2-scale equation with appropriate $\phi(n)$ still converges. I would like to thank Wim Sweldens for pointing this out to me.) This more direct computation can be done in a tree-like procedure; a different way of looking at this, avoiding the tree construction and leading to faster plots, uses a dynamical systems framework, as developed by Berger and Wang (see Berger (1992) for a review). The "zoom in" feature is lost, however.

12. Many experts on refinement or subdivision schemes find the multidimensional case much more interesting!

13. This is not a presentation with fullest generality! We merely suppose that the $w_k$ are such that there exists a continuous limit. This already implies $\sum_n w_{2n} = \sum_n w_{2n+1} = 1$.

14. For example, any compactly supported $\omega$ with bounded variation would be "reasonable" here.

15. The following stretched Haar function shows how the $F(x-n)$ can fail to be independent. Take $w_0 = w_2 = 1$, all other $w_n = 0$. The solution to (6.5.23) is then (up to normalization) $F(x) = 1$ for $0 \le x < 2$, 0 otherwise. In this case the $\ell^\infty$-sequence $\lambda$ defined by $\lambda_n = (-1)^n$ leads to $\sum_n \lambda_n F(x-n) = 0$ a.e.

16. This is no coincidence. If we fix the length of the symmetric filter $M_0 = |m_0|^2$, then the choice $R \equiv 0$ means that $M_0$ is divisible by $(1 + \cos\xi)$ with the highest possible multiplicity compatible with its length and the constraint $M_0(\xi) + M_0(\xi + \pi) = 1$. On the other hand, Lagrange refinement schemes of order $2N-1$ are the interpolation schemes with the shortest length that reproduce all polynomials of order $2N-1$ (or less) exactly from their integer samples. In terms of the filter $W(\xi) = \frac12 \sum_n w_n e^{in\xi}$, this means

$$W(\xi) + W(\xi + \pi) = 1 \qquad \text{(interpolation filter: } w_{2n} = \delta_{n,0}\text{)}$$

and

$$W(\xi) = 1 + O\!\left((1 - \cos\xi)^N\right)$$

(see Cavaretta, Dahmen, and Micchelli (1991), or Chapter 8). The two requirements together mean that $W(\xi + \pi)$ has a zero of order $2N$ in $\xi = 0$, i.e., that $W(\xi + \pi)$ is divisible by $(1 - \cos\xi)^N$; hence $W(\xi)$ by $(1 + \cos\xi)^N$. It follows that $W = M_0$.

CHAPTER 7

More About the Regularity of Compactly Supported Wavelets

The regularity of the Meyer or the Battle-Lemarié wavelets is easy to assess: the Meyer wavelet has a compactly supported Fourier transform, so that it is $C^\infty$, and the Battle-Lemarié wavelets are spline functions, more precisely, piecewise polynomial of degree $k$, with $(k-1)$ continuous derivatives at the knots. The regularity of compactly supported orthonormal wavelets is harder to determine. Typically, they have a non-integer Hölder exponent; moreover, they are more regular in some points than in others, as is already illustrated by Figure 6.3. This chapter presents a collection of tools that have been developed over the past few years to study the regularity of these wavelets. All of these techniques rely on the fact that

$$\phi(x) = \sum_n c_n\,\phi(2x - n), \tag{7.0.1}$$

where only finitely many $c_n$ are nonzero; the wavelet $\psi$, as a finite linear combination of translates of $\phi(2x)$, then inherits the same regularity properties. It follows that the techniques exposed in this chapter are not restricted to wavelets alone; they apply as well to the basic functions in subdivision schemes (see §6.5). Some of the tools discussed here were in fact first developed for subdivision schemes, and not for wavelets.

The different techniques fall into two groups: those that prove decay for the Fourier transform $\hat\phi$, and those that work directly with $\phi$ itself. We will illustrate each method by applying it to the family of examples $_N\phi$ constructed in §6.4. It turns out that Fourier-based methods are better suited for asymptotic estimates (rate of regularity increase as $N$ is increased in the examples, for instance); the second method gives more accurate local estimates, but is often harder to use.

References for the results in this chapter are Daubechies (1988b) and Cohen (1990b) for §7.1.1; Cohen (1990b) and Cohen and Conze (1992) for §7.1.2; Cohen and Daubechies (1991) for §7.1.3; Daubechies and Lagarias (1991, 1992), Micchelli and Prautzsch (1989), Dyn and Levin (1990), and Rioul (1992) for §7.2; Daubechies (1990b) for §7.3.

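7.1. Fourier-based methods. The Fourier transform of equation (7.0.1) is

$$\hat\phi(\xi) = m_0(\xi/2)\,\hat\phi(\xi/2), \tag{7.1.1}$$

where $m_0(\xi) = \frac12 \sum_n c_n e^{-in\xi}$ is a trigonometric polynomial. As we have seen many times before, (7.1.1) leads to

$$\hat\phi(\xi) = (2\pi)^{-1/2} \prod_{j=1}^\infty m_0(2^{-j}\xi), \tag{7.1.2}$$

where we have assumed $m_0(0) = 1$ and $\int dx\,\phi(x) = 1$, as usual. Moreover, $m_0$ can be factorized as

$$m_0(\xi) = \left(\frac{1 + e^{-i\xi}}{2}\right)^N \mathcal{L}(\xi), \tag{7.1.3}$$

where $\mathcal{L}$ is a trigonometric polynomial as well; this leads to

$$\hat\phi(\xi) = (2\pi)^{-1/2} \left(\frac{1 - e^{-i\xi}}{i\xi}\right)^N \prod_{j=1}^\infty \mathcal{L}(2^{-j}\xi). \tag{7.1.4}$$

A first method is based on a straightforward estimate of the growth of the infinite product of the $\mathcal{L}(2^{-j}\xi)$ as $|\xi| \to \infty$.

7.1.1. Brute force methods. For $\alpha = n + \beta$, $n \in \mathbb{N}$, $0 \le \beta < 1$, we define $C^\alpha$ to be the set of functions $f$ which are $n$ times continuously differentiable and such that the $n$th derivative $f^{(n)}$ is Hölder continuous with exponent $\beta$, i.e.,

$$|f^{(n)}(x) - f^{(n)}(x+t)| \le C\,|t|^\beta \quad \text{for all } x, t.$$

It is well known and easy to check that if

$$\int d\xi\, |\hat f(\xi)|\,(1 + |\xi|)^\alpha < \infty,$$

then $f \in C^\alpha$. In particular, if $|\hat f(\xi)| \le C\,(1 + |\xi|)^{-1-\alpha-\epsilon}$, then $f \in C^\alpha$. If the growth as $|\xi| \to \infty$ of $\prod_{j=1}^\infty \mathcal{L}(2^{-j}\xi)$ in (7.1.4) can be kept in check, then the factor $\left((1 - e^{-i\xi})/i\xi\right)^N$ ensures smoothness for $\phi$.

LEMMA 7.1.1. If $q = \sup_\xi |\mathcal{L}(\xi)| < 2^{N-\alpha-1}$, then $\phi \in C^\alpha$.

Proof.

1. Since $m_0(0) = 1$, $\mathcal{L}(0) = 1$; hence $|\mathcal{L}(\xi)| \le 1 + C|\xi|$. Consequently,

$$\sup_{|\xi| \le 1} \prod_{j=1}^\infty |\mathcal{L}(2^{-j}\xi)| \le \sup_{|\xi| \le 1} \prod_{j=1}^\infty \exp\left[C\,2^{-j}|\xi|\right] \le e^C.$$

2. Now take any $\xi$, with $|\xi| \ge 1$. There exists $J \ge 1$ so that $2^{J-1} \le |\xi| < 2^J$. Hence

$$\prod_{j=1}^\infty |\mathcal{L}(2^{-j}\xi)| = \prod_{j=1}^J |\mathcal{L}(2^{-j}\xi)| \prod_{j=1}^\infty |\mathcal{L}(2^{-j}2^{-J}\xi)| \le q^J e^C \le C'\,2^{J(N-\alpha-1-\epsilon)} \le C''\,(1 + |\xi|)^{N-\alpha-1-\epsilon}.$$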

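follows that $\mathcal{K} = \lim_{j\to\infty} \mathcal{K}_j$.

3. If $\mathcal{K} < N - 1 - \alpha$, then $\mathcal{K}_\ell < N - 1 - \alpha$ for some $\ell \in \mathbb{N}$. We can then repeat the argument in the proof of Lemma 7.1.1, applying it to

$$\prod_{j=1}^\infty \mathcal{L}(2^{-j}\xi) = \prod_{j=0}^\infty \mathcal{L}_\ell(2^{-\ell j - 1}\xi),$$

with $\mathcal{L}_\ell(\xi) = \prod_{j=0}^{\ell-1} \mathcal{L}(2^{-j}\xi)$, and with $2^\ell$ playing the role of 2 in Lemma 7.1.1. This leads to $|\hat\phi(\xi)| \le C\,(1 + |\xi|)^{-N + \mathcal{K}_\ell} \le C\,(1 + |\xi|)^{-\alpha-1-\epsilon}$, hence $\phi \in C^\alpha$. ∎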

The following lemma shows that in most cases, we will not be able to obtain much better by the brute force method.

LEMMA 7.1.3. There exists a sequence $(\xi_\ell)_{\ell \in \mathbb{N}}$ so that

$$(1 + |\xi_\ell|)^{-\mathcal{K}} \left| \prod_{j=1}^\infty \mathcal{L}(2^{-j}\xi_\ell) \right| \ge C > 0.$$

Proof.

1. By Theorem 6.3.1, the orthonormality of the ¢> ( . - n ) implies the existence of a compact set K congruent to [-11", 11"] modulo 211", such that 1 ¢ (e ) 1 � C > 0 for e E K. Since K is congruent to [-11", 11"] and CI. is periodic with period 21.+ 1 11", we have

ql.

=

sup

ICI. (e ) 1

l e 1 91 .".

sup ICI. (e) 1 ,

=

e E21 K

i.e., there exists (I. E 21.K so that 1 CI. ((I.) 1 = ql. . Since K is compact, the 2-1. (I. E K are uniformly bounded. We therefore have

( 7. 1 . 7)

for 0 < Cf•

1 1+;;( 1 = 1 cos e/2 1 ::; 1, we have for all e E 2i K, II C( 2 -j e) � II mo ( T j e) = 1¢( 2 -l.e) 1 � C > 0 . l.+l l.+l

2. Moreover, since 00

00

j=

j=

Putting it all together we find for el. = 2(1. 00

00

I Ci ((£ ) 1 II l C (Tj (i )

j =l.+ > C ql. = C 2£ lCl

•

By (7.1.7),

(1 + l el. l ) - lC

00

j el.) ( T C j=1 II

� C 2£lC l CIf

TilC .

Since /C = inf£ /C£ , this is bounded below by a strictly positive constant.

•

Let us now turn to the particular family of $_N\phi$ constructed in §6.4, and see how these estimates perform. We have

$$_Nm_0(\xi) = \left(\frac{1 + e^{-i\xi}}{2}\right)^N \mathcal{L}_N(\xi),$$

with

$$|\mathcal{L}_N(\xi)|^2 = P_N(\sin^2 \xi/2).$$

We start by establishing a few elementary properties of $P_N$.

LEMMA 7.1.4. The polynomial $P_N(x) = \sum_{n=0}^{N-1} \binom{N-1+n}{n} x^n$ satisfies the following properties:

$$0 \le x \le y \ \Rightarrow\ x^{-(N-1)} P_N(x) \ge y^{-(N-1)} P_N(y), \tag{7.1.8}$$

$$0 \le x \le 1 \ \Rightarrow\ P_N(x) \le 2^{N-1} \max(1, 2x)^{N-1}. \tag{7.1.9}$$

Proof.

1. If $0 \le x \le y$, then

$$x^{-(N-1)} P_N(x) = \sum_{n=0}^{N-1} \binom{N-1+n}{n} x^{-(N-1-n)} \ge \sum_{n=0}^{N-1} \binom{N-1+n}{n} y^{-(N-1-n)} = y^{-(N-1)} P_N(y).$$

2. Recall (see §6.1) that $P_N$ is the solution to

$$x^N P_N(1-x) + (1-x)^N P_N(x) = 1.$$

On substituting $x = \frac12$, it follows that $P_N(1/2) = 2^{N-1}$. For $x \le \frac12$, we have $P_N(x) \le P_N(\frac12) = 2^{N-1}$ because $P_N$ is increasing. For $x \ge \frac12$, applying (7.1.8) leads to $P_N(x) \le x^{N-1}\, 2^{N-1} P_N(\frac12) = 2^{N-1} (2x)^{N-1}$. This proves (7.1.9). ∎

It is now easy to apply Lemmas 7.1.1 and 7.1.2. We have

$$\sup_\xi |\mathcal{L}_N(\xi)| = \left[\sum_{n=0}^{N-1} \binom{N-1+n}{n}\right]^{1/2}$$
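The properties of $P_N$ used above are easy to confirm numerically. The following sketch is only an illustration (the helper name `P` is chosen for this example): it evaluates $P_N$ from its binomial definition and checks the Bezout-type identity, the value $P_N(1/2) = 2^{N-1}$, and the two inequalities of the lemma on a grid.

```python
from math import comb

def P(N, x):
    """P_N(x) = sum_{n=0}^{N-1} binom(N-1+n, n) x^n."""
    return sum(comb(N - 1 + n, n) * x**n for n in range(N))

if __name__ == "__main__":
    for N in range(1, 6):
        grid = [k / 20 for k in range(1, 21)]          # points in (0, 1]
        bezout = max(abs(x**N * P(N, 1 - x) + (1 - x)**N * P(N, x) - 1) for x in grid)
        ratio = [P(N, x) / x**(N - 1) for x in grid]   # should be non-increasing, cf. (7.1.8)
        bound = all(P(N, x) <= 2**(N - 1) * max(1, 2 * x)**(N - 1) + 1e-12 for x in grid)
        print(N, P(N, 0.5), bezout, ratio == sorted(ratio, reverse=True), bound)
```

In particular $\sup_\xi |\mathcal{L}_N(\xi)|^2 = P_N(1)$, which this helper evaluates directly as `P(N, 1)`.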

,

�

1 immediately leads =

to sharper results. We have, for instance, sup q2

j

I .cN (�) .cN (201 sup [PN (y) PN (4y(1 - y)W / 2 O : 0 is independent of k. C

=

=

m =

=

=

Cl > 0 so that, for all k E N, (7.1.10) Indeed, 2kM �o �o (mod 211" ) , so that (7.1.10) follows if �o =I- 0 or ±11". We already know that �o =I- 0 ; if �o ±11", then 6 0 (mod 211" ) and hence �o 2M- l 6 = 0 (mod 211" ) , which is impossible.

Proof. 1. First note that there exists =

2. Now =

=

=

221


1,

Since C is a trigonometric polynomial and C(O) = there exists G2 so that IC( O I - G2 I� I ;::: e -2 C2 I� I for I�I small enough. Hence, for r large enough,

;::: 1

00

00

j=r M

j=r M

Hence

1 ¢(2 k M +l �0) 1

(r+ k ) M - I ( 2k M I�ol ) -N G3 II C(2 k M - £ �o ) £= 0 G4 IC(�o ) C(6) · · · C(�M _dl r+ k +l (1 + 1 2 k M �o l) -N G5 2 K M k (1 + 12 k M �ol ) -N G (1 + 1 2 k M +l �ol ) -N + K .

> Gf > > >

•

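We can apply this to the example at the end of the last subsection: Lemma 7.1.5 implies $|\hat\phi(2^n \frac{2\pi}{3})| \ge C\,(1 + |2^n \frac{2\pi}{3}|)^{-N+\mathcal{K}}$, with $\mathcal{K} = \log|\mathcal{L}(\frac{2\pi}{3})\,\mathcal{L}(-\frac{2\pi}{3})| / (2\log 2)$. If $\mathcal{L}$ has only real coefficients (as is the case in most applications of practical interest), then $|\mathcal{L}(-\frac{2\pi}{3})| = |\mathcal{L}(\frac{2\pi}{3})|$, and $\mathcal{K} = \log|\mathcal{L}(\frac{2\pi}{3})| / \log 2$. The next short invariant cycles are $\{\frac{2\pi}{5}, \frac{4\pi}{5}, -\frac{2\pi}{5}, -\frac{4\pi}{5}\}$, $\{\frac{2\pi}{7}, \frac{4\pi}{7}, -\frac{6\pi}{7}\}$, etc.; each of them gives an upper bound for the decay exponent of $\hat\phi$. In some cases one of these upper bounds on $\alpha$ can be proved to be a lower bound as well. We first prove the following lemma.

LEMMA 7.1.6. Suppose that $[-\pi, \pi] = D_1 \cup D_2 \cup \cdots \cup D_M$, and that there exists $q > 0$ so that

$$\begin{aligned}
|\mathcal{L}(\xi)| &\le q, && \xi \in D_1, \\
|\mathcal{L}(\xi)\,\mathcal{L}(2\xi)| &\le q^2, && \xi \in D_2, \\
&\ \ \vdots \\
|\mathcal{L}(\xi)\,\mathcal{L}(2\xi) \cdots \mathcal{L}(2^{M-1}\xi)| &\le q^M, && \xi \in D_M.
\end{aligned}$$

Then $|\hat\phi(\xi)| \le C\,(1 + |\xi|)^{-N+\mathcal{K}}$, with $\mathcal{K} = \log q / \log 2$.

Proof.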

1. Let us jestimate 1 I1{:� C(2- k �) I , for some large but arbitrary j. (=

2- +l � E Dm for some

m

E

{ l , 2, · · · , M } , we have

Since

222

7 We can now apply the same trick to 2 m (, and keep doing so until we cannot go on. At that point we have CHAPTER

1

with at most M - different £-factors remaining ( Le., r

with q 1 defined as in

I nk=O £(Tko l ::; qj-M+l q�- l ,

::; M - 1). Hence

(7.1. 5 ). Consequently, with the definition (7.1.6), 1 Kj ::; -:-- [C + J. log q] , J 1og 2

and K = limj -+ oo Kj ::; 10g q / log 2. The bound on (p now follows from • Lemma

7.1.2.

In particular, one has the following lemma. LEMMA Suppose that

7.1.7.

I � I ::; 2; , (7.1.11) 2 2 1 £ (0 £(2�) 1 ::; l £ e3"' ) 1 for ; ::; I � I ::; 7r . Then 1 (P (�) 1 ::; C ( l + I W - N + K , with K = 10g l £ en l / log 2, and this decay is optimal. The proof is a straightforward consequence of Lemmas 7.1. 5 and 7.1. 6Proof. . Of course, Lemma 7.1.7 is only applicable to very special £; in most cases (7.1.11) will not be satisfied: there even exist £ for which £ ( �.n = O. Similar optimal bounds can be derived by using other invariant cycles as breakpoints for a partition of [ -7r, 7rJ , and applying Lemma 7.1. 6 . Let us return to our "standard" example N- _ To prove (7.1.18) it is therefore sufficient to prove N- l 9 N-l 2� (2y 1) 4 (7.1.20) (2y � 1) 2 Since both (2y - 1) - 2 and y(2y - 1) - 2 are decreasing on [�, 1] , it suffices to verify that (7.1.20) holds for y = � , i.e., that N 4 . � -l < _ 6 9 v'N This is true for N 2: 13.

[

]

IN (

()

)


225

5. It remains to prove (7.1.13) for � � Y � 1 and 1 � N � 12. We do this in two steps: Y � Yo = ¥, and Y Yo. For Y � ¥, 4y(1 - y) �, hence, again by Lemma 7.1. 4 , [16y(1 - y)] N - l . 2':

2':

< ( 20 ) N - l PN (�) 9

y2 (1 - y) (7.1.21)

[�, ¥].

,

(7.1.21 )

because is decreasing on One checks by numerical computation that for � N � is indeed smaller than

[PN (�)j 2 1

12.

6. For ¥ = YoN -�1 Y � 1 we usederive the bounds PN (4y(1-y)) � (2y-1) - 2 N and PN(y) � ( ;; ) PN (yO) to N- l < N + 2 o PN(4y(1 - y))PN(y) Y ! PN(yo)(2y - 1) [ (2Y � 1) 2 ] < 2 N PN(Yo) , (7.1. 2 2) where the last inequality uses that (2y - 1) - 2 and y (2y _ 1) - 2 are both decreasing in [Yo, 1]. One checks by numerical computation that (7.1. 2 2) is smaller than [PN (�W for 5 � N � 12. 7. small It remains to prove (7.1.13) for 1 � N � 4 and ¥ � y � 1. For these values of N the polynomial PN(y) PN (4y(1-y))-PN (�) 2 has degree at most 9, and its roots can be computed easily ( numerically ) . One checks that there are none in H , 1], which finishes the proof, because (7.1.13) is satisfied in y = 1. •

follows from Lemmas

N¢(�):

7.1.8 and 7.1.7 that we know the exact asymptotic decay (7.1.23)

CHAPTER 7

226

For the first few values of N this translates into N
. 1 / log 2 . 1, then F E eo:-< for all f > O. a

If 1 >' 1

' 1 . Since, for any b > 0, there exists e > 0 so that IIAn l l ::; C(p( A) + b) n for all n E N, it follows that (7. 1.30) 3. On the other hand, f (� ) � 1 for � ::; I�I ::; 11". Together with the bound edness of n;': 1 Mo(2 -j �) for I�I 11" (derived as usual from IMo ( � ) 1 < 1 + C l W , this implies

::;

j=1

::; e

J

For every binary sequence $d = (d_n)_{n \in \mathbb{N}\setminus\{0\}}$ we also define its right shift $\tau d$ by

$$(\tau d)_n = d_{n+1}, \qquad n = 1, 2, \ldots.$$

It is then clear that $\tau d(x) = d(2x)$ if $0 \le x < \frac12$, $\tau d(x) = d(2x-1)$ if $\frac12 < x \le 1$. (For $x = \frac12$, we have two possibilities: $\tau d_+(\frac12) = d(0)$, $\tau d_-(\frac12) = d(1)$.) Although $\tau$ is really defined on binary sequences, we will make a slight abuse of notation and write $\tau x = y$ rather than $\tau d(x) = d(y)$. With this new notation, we can rewrite (7.2.5), (7.2.7) as the single equation

$$v_{j+1}(x) = T_{d_1(x)}\, v_j(\tau x). \tag{7.2.8}$$

If the $v_j$ have a limit $v$, then this vector-valued function $v$ will therefore be a fixed point of the linear operator $T$ defined by

$$(Tw)(x) = T_{d_1(x)}\, w(\tau x); \tag{7.2.9}$$

$T$ acts on all the vector-valued functions $w : [0,1] \to \mathbb{R}^K$ that satisfy the requirements

$$[w(0)]_1 = 0, \quad [w(1)]_K = 0, \quad [w(0)]_k = [w(1)]_{k-1}, \ k = 2, \ldots, K. \tag{7.2.10}$$

(As a result of these conditions $Tw$ is defined unambiguously at the dyadic rationals: the two expansions lead to the same result.) What has all this recasting the equations into different forms done for us? Well, it follows from (7.2.9) that

$$v_j(x) = T_{d_1(x)}\, T_{d_2(x)} \cdots T_{d_j(x)}\, v_0(\tau^j x),$$

which implies

$$v_j(x) - v_{j+\ell}(x) = T_{d_1(x)} \cdots T_{d_j(x)} \left[ v_0(\tau^j x) - v_\ell(\tau^j x) \right]. \tag{7.2.11}$$

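In other words, information on the spectral properties of products of the $T_d$-matrices will help us to control the difference $v_j - v_{j+\ell}$, so that we can prove $v_j \to v$, and derive smoothness for $v$. But let us turn to an example. For the function $_2\phi$, (7.2.1) reads

$$_2\phi(x) = \sum_{k=0}^3 c_k\; {}_2\phi(2x - k), \tag{7.2.12}$$

with

$$c_0 = \frac{1 + \sqrt3}{4}, \quad c_1 = \frac{3 + \sqrt3}{4}, \quad c_2 = \frac{3 - \sqrt3}{4}, \quad c_3 = \frac{1 - \sqrt3}{4}.$$

Note that

$$c_0 + c_2 = c_1 + c_3 = 1 \tag{7.2.13}$$

and

$$2c_2 = c_1 + 3c_3, \tag{7.2.14}$$

both of which are consequences of the divisibility of $m_0(\xi) = \frac12 \sum_{k=0}^3 c_k e^{-ik\xi}$ by $(1 + e^{-i\xi})^2$. The values $_2\phi(1)$, $_2\phi(2)$ are determined by the system

$$\begin{pmatrix} {}_2\phi(1) \\ {}_2\phi(2) \end{pmatrix} = M \begin{pmatrix} {}_2\phi(1) \\ {}_2\phi(2) \end{pmatrix}, \quad \text{with} \quad M = \begin{pmatrix} c_1 & c_0 \\ c_3 & c_2 \end{pmatrix}.$$

Because of (7.2.13), the columns of $M$ all sum to 1, ensuring that $(1, 1)$ is a left eigenvector of $M$ with eigenvalue 1. This eigenvalue is nondegenerate; the right eigenvector for the same eigenvalue is therefore not orthogonal to $(1, 1)$, which means it can be normalized so that the sum of its entries is 1. This choice of $({}_2\phi(1), {}_2\phi(2))$ leads to

$$_2\phi(1) = \frac{1 + \sqrt3}{2}, \qquad {}_2\phi(2) = \frac{1 - \sqrt3}{2}.$$

The matrices $T_0$, $T_1$ are $3 \times 3$ matrices given by

$$T_0 = \begin{pmatrix} c_0 & 0 & 0 \\ c_2 & c_1 & c_0 \\ 0 & c_3 & c_2 \end{pmatrix}, \qquad T_1 = \begin{pmatrix} c_1 & c_0 & 0 \\ c_3 & c_2 & c_1 \\ 0 & 0 & c_3 \end{pmatrix}.$$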
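The linear algebra of this example can be checked numerically. The sketch below is illustrative only: it assumes the index convention $(T_0)_{ij} = c_{2i-j-1}$, $(T_1)_{ij} = c_{2i-j}$ ($1 \le i, j \le 3$) for the two matrices, and the variable names are made up for the example. It recovers ${}_2\phi(1)$, ${}_2\phi(2)$ from the eigenvector of $M$ and lists the spectra of $T_0$, $T_1$.

```python
import numpy as np

s3 = np.sqrt(3.0)
c = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / 4.0   # c_0, ..., c_3 of (7.2.12)

# Values at the integers: right eigenvector of M for eigenvalue 1, entries summing to 1.
M = np.array([[c[1], c[0]],
              [c[3], c[2]]])
vals, vecs = np.linalg.eig(M)
v = vecs[:, np.argmin(np.abs(vals - 1.0))].real
phi_1, phi_2 = v / v.sum()

# Assumed convention: (T0)_{ij} = c_{2i-j-1}, (T1)_{ij} = c_{2i-j}, zero outside 0..3.
T0 = np.array([[c[0], 0.0,  0.0],
               [c[2], c[1], c[0]],
               [0.0,  c[3], c[2]]])
T1 = np.array([[c[1], c[0], 0.0],
               [c[3], c[2], c[1]],
               [0.0,  0.0,  c[3]]])

eig_T0 = np.sort(np.linalg.eigvals(T0).real)
eig_T1 = np.sort(np.linalg.eigvals(T1).real)
alpha = -np.log2((1 + s3) / 4.0)                        # exponent tied to the third eigenvalue of T0

if __name__ == "__main__":
    print(phi_1, phi_2)
    print(eig_T0, eig_T1, alpha)
```

Both matrices have the eigenvalues 1 and 1/2 in common; the remaining eigenvalue is $c_0$ for $T_0$ and $c_3$ for $T_1$, and the larger of the two in absolute value governs the Hölder exponent $\approx .550$.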

Because of (7.2.13), $T_0$ and $T_1$ have a common left eigenvector $e_1 = (1, 1, 1)$ with eigenvalue 1. Moreover, for all $x \in [0,1]$,

$$\begin{aligned}
e_1 \cdot v_0(x) &= e_1 \cdot [(1-x)\,v_0(0) + x\,v_0(1)] \\
&= (1-x)\,[{}_2\phi(1) + {}_2\phi(2)] + x\,[{}_2\phi(1) + {}_2\phi(2)] && \text{(use (7.2.2))} \\
&= 1.
\end{aligned}$$

It follows that, for all $x \in [0,1]$, all $j \in \mathbb{N}$,

$$e_1 \cdot v_j(x) = e_1 \cdot T_{d_1(x)} \cdots T_{d_j(x)}\, v_0(\tau^j x) = 1.$$

Consequently, $v_0(y) - v_\ell(y) \in E_1 = \{w;\ e_1 \cdot w = w_1 + w_2 + w_3 = 0\}$, the space orthogonal to $e_1$. In view of (7.2.11), we therefore only need to study products of $T_d$-matrices restricted to $E_1$ in order to control the convergence of the $v_j$. But more is true! Define $e_2 = (1, 2, 3)$. Then (7.2.14) implies

$$e_2\, T_0 = \tfrac12\, e_2 + a_0\, e_1, \qquad e_2\, T_1 = \tfrac12\, e_2 + a_1\, e_1, \tag{7.2.15}$$

with $a_0 = c_0 + 2c_2 - \frac12 = \frac{5 - \sqrt3}{4}$, $a_1 = c_1 + 2c_3 - \frac12 = \frac{3 - \sqrt3}{4}$. If we define $e_2^0 = e_2 - 2a_0\, e_1$, then (7.2.15) becomes

$$e_2^0\, T_0 = \tfrac12\, e_2^0, \qquad e_2^0\, T_1 = \tfrac12\, e_2^0 - \tfrac12\, e_1.$$

On the other hand,

$$e_2^0 \cdot v_0(x) = (1-x)\, e_2^0 \cdot v_0(0) + x\, e_2^0 \cdot v_0(1) = -x;$$

consequently,

$$\begin{aligned}
e_2^0 \cdot v_j(x) &= e_2^0 \cdot T_{d_1(x)}\, v_{j-1}(\tau x) \\
&= -\tfrac12\, d_1(x) + \tfrac12\, e_2^0 \cdot v_{j-1}(\tau x) \\
&= -\sum_{m=1}^j 2^{-m} d_m(x) + 2^{-j}\, e_2^0 \cdot v_0(\tau^j x) \\
&= -x.
\end{aligned}$$

It follows that $e_2^0 \cdot [v_0(x) - v_\ell(x)] = 0$. This means that we only need to study products of $T_d$-matrices restricted to $E_2$, the space orthogonal to $e_1$ and $e_2^0$, in order to control $v_j - v_{j+\ell}$. But, because this is a simple example, $E_2$ is one-dimensional, and $T_d|_{E_2}$ is simply multiplication by some constant, namely the third eigenvalue of $T_d$, which is $\frac{1+\sqrt3}{4}$ for $T_0$, $\frac{1-\sqrt3}{4}$ for $T_1$. Consequently,

$$\|v_j(x) - v_{j+\ell}(x)\| \le C \left[\frac{1+\sqrt3}{4}\right]^j \left|\frac{1-\sqrt3}{1+\sqrt3}\right|^{\sum_{n=1}^j d_n(x)}, \tag{7.2.16}$$

where we have used that the $v_\ell$ are uniformly bounded.⁶ Since $\left|\frac{1-\sqrt3}{1+\sqrt3}\right| < 1$, (7.2.16) implies $\|v_j(x) - v_{j+\ell}(x)\| \le C\,2^{-\alpha j}$, with $\alpha = |\log((1+\sqrt3)/4)| / \log 2 = .550$. It follows that the $v_j$ have a limit function $v$, which is continuous since all the $v_j$ are and since the convergence is uniform. Moreover $v$ automatically satisfies (7.2.10), since all the $v_j$ do, so that it can be "unfolded" into a continuous function $F$ on $[0,3]$. This function solves (7.2.1), so that $_2\phi = F$, and it is uniformly approachable by piecewise linear spline functions $F_j$ with nodes at the $k\,2^{-j}$,
If $\lambda = 1/2$, then the $\ell$th derivative $F^{(\ell)}$ of $F$ is almost Lipschitz: it satisfies

$$|F^{(\ell)}(x+t) - F^{(\ell)}(x)| \le C\,|t|\,|\ln|t||.$$

REMARK. The restriction $\lambda \ge \frac12$ means only that we pick the largest possible integer $\ell \le L$ for which (7.2.21) holds with $\lambda < 1$. If $\ell = L$, then necessarily $\lambda \ge \frac12$ (see Daubechies and Lagarias (1992)); if $\ell < L$ and $\lambda < \frac12$, then we could replace $\ell$ by $\ell + 1$ and $\lambda$ by $2\lambda$, and (7.2.21) would hold for a larger integer $\ell$. □

A similar general theorem can be formulated for the local regularity fluctuations exhibited by the example of $_2\phi$ …

… $\phi$, $\psi$ have sufficient decay; both conditions are trivially satisfied for the compactly supported wavelet bases as constructed in Chapter 6.) This was our motivation to construct the $_N\phi$, which lead to $_N\psi$ with $N$ vanishing moments. The asymptotic results in §7.1.2 show however that the $_N\phi$, $_N\psi \in C^{\mu N}$ with $\mu \simeq .2$. This means that 80% of the zero moments are "wasted," i.e., the same regularity could be achieved with only $N/5$ vanishing moments.

Something similar happens for small values of $N$. For instance, $_2\phi$ is continuous but not $C^1$, $_3\phi$ is $C^1$ but not $C^2$, even though $_2\psi$, $_3\psi$ have, respectively, two and three vanishing moments. We can therefore "sacrifice" in each of these two cases one of the vanishing moments and use the additional degree of freedom to obtain $\phi$ with a better Hölder exponent than $_2\phi$ or $_3\phi$ have, with the same support width. This amounts to replacing $|m_0(\xi)|^2 = (\cos^2 \frac{\xi}{2})^N P_N(\sin^2 \frac{\xi}{2})$ by

$$|m_0(\xi)|^2 = \left(\cos^2 \tfrac{\xi}{2}\right)^{N-1} \left[ P_{N-1}\!\left(\sin^2 \tfrac{\xi}{2}\right) + a \left(\sin^2 \tfrac{\xi}{2}\right)^{N-1} \cos\xi \right]$$

(see (6.1.11)), and to choose $a$ so that the regularity of $\phi$ is improved. Examples for $N = 2, 3$ are shown in Figures 7.4 and 7.5; the corresponding $h_n$ are as follows:

N = 2:
  h_0 =  3/(5√2)
  h_1 =  6/(5√2)
  h_2 =  2/(5√2)
  h_3 = -1/(5√2)

N = 3:
  h_0 =   .37432841633/√2
  h_1 =  1.09093396059/√2
  h_2 =   .786941229301/√2
  h_3 =  -.146269859213/√2
  h_4 =  -.161269645631/√2
  h_5 =   .0553358986263/√2
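The two filters listed above can be sanity-checked against the conditions any such filter must satisfy. The sketch below is illustrative only (the helper names are made up): it verifies $\sum_n h_n = \sqrt2$, the orthonormality relations $\sum_n h_n h_{n+2k} = \delta_{k,0}$, and counts vanishing moments through the alternating sums $\sum_n (-1)^n n^p h_n$. The coefficient $h_1$ for $N = 3$ is taken as $1.09093396059/\sqrt2$, the value forced by $\sum_n h_{2n+1} = 1/\sqrt2$.

```python
import numpy as np

h2 = np.array([3.0, 6.0, 2.0, -1.0]) / (5.0 * np.sqrt(2.0))
h3 = np.array([0.37432841633, 1.09093396059, 0.786941229301,
               -0.146269859213, -0.161269645631, 0.0553358986263]) / np.sqrt(2.0)

def shifted_products(h):
    """Return [sum_n h_n h_{n+2k}] for k = 0, 1, ..., len(h)/2 - 1."""
    return np.array([np.dot(h[2 * k:], h[:len(h) - 2 * k]) for k in range(len(h) // 2)])

def alternating_moment(h, p):
    """sum_n (-1)^n n^p h_n; zero for p < (number of vanishing moments)."""
    n = np.arange(len(h))
    return float(np.sum((-1.0) ** n * n ** p * h))

if __name__ == "__main__":
    for name, h in (("N=2", h2), ("N=3", h3)):
        print(name, h.sum() / np.sqrt(2.0), shifted_products(h),
              [alternating_moment(h, p) for p in range(3)])
```

As expected from the "sacrificed" moment, the $N = 2$ filter has one vanishing alternating moment and the $N = 3$ filter has two, one fewer than the standard filters of the same length.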

These examples correspond to a choice of a such that max[p(To I Et ) ' P(T1 I Et )] is minimized; the eigenvalues of To, T1 are then degenerate.s One can prove that the Holder exponents of these two functions are at least .5864, 1.40198 respectively, and at most .60017, 1.4176; these last values are probably the true Holder' exponents. For more details, see Daubechies (1993). 7.4.

Regularity or vanishing moments?

FIG. 7.4. The scaling function $\phi$ for the most regular wavelet construction with support width 3.

FIG. 7.5. The scaling function $\phi$ for the most regular wavelet construction with support width 5.

The examples in the previous section show that for fixed support width of $\phi$, $\psi$, or equivalently, for fixed length of the filters in the associated subband coding scheme, the choice of the $h_n$ that leads to maximum regularity is different from the choice with maximum number $N$ of vanishing moments for $\psi$. The question then arises: what is more important, vanishing moments or regularity? The answer depends on the application, and is not always clear. Beylkin, Coifman, and Rokhlin (1991) use compactly supported orthonormal wavelets to compress large matrices, i.e., to reduce them to a sparse form. For the details of this application, the reader should consult the original paper, or the chapter by Beylkin in Ruskai et al. (1991); one of the things that make their method work is the number of vanishing moments. Suppose you want to decompose a function $F(x)$ into wavelets (strictly speaking, matrices should be modelled by a function of two variables, but the point is illustrated just as well, and in a simpler way, with one variable). You compute all the wavelet coefficients $\langle F, \psi_{j,k}\rangle$, and to compress all that information, you throw away all the coefficients smaller than some threshold $\epsilon$. Let us see what this means at some fine scale; $j = -J$, $J \in \mathbb{N}$ and $J$ "large." If $F$ is $C^{L-1}$ and $\psi$ has $L$ vanishing moments, then, for $x$ near $2^{-J}k$, we have

$$F(x) = F(2^{-J}k) + F'(2^{-J}k)(x - 2^{-J}k) + \cdots + \frac{1}{(L-1)!}F^{(L-1)}(2^{-J}k)(x - 2^{-J}k)^{L-1} + (x - 2^{-J}k)^L R(x),$$

where $R$ is bounded. If we multiply this by $\psi(2^J x - k)$ and integrate, then the first $L$ terms will not contribute because $\int dx\, x^\ell \psi(x) = 0$, $\ell = 0, \ldots, L-1$. Consequently,

$$|\langle F, \psi_{-J,k}\rangle| = \left|\int dx\, (x - 2^{-J}k)^L R(x)\, 2^{J/2}\psi(2^J x - k)\right| \le C\, 2^{-J(L-1/2)} \int dy\, |y|^L |\psi(y)|.$$
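The discrete counterpart of this argument is easy to see in a few lines. The sketch below is an illustration only, under stated assumptions: it uses the standard 4-tap orthonormal filter with two vanishing moments, the high-pass convention $g_n = (-1)^n h_{3-n}$ (one of several equivalent conventions), and a hypothetical helper `detail_coefficients`. Detail coefficients of linear data vanish, while quadratic data give a constant nonzero value.

```python
import math

s3 = math.sqrt(3.0)
norm = 4.0 * math.sqrt(2.0)
# Standard 4-tap low-pass filter with two vanishing moments, sum(h) = sqrt(2).
h = [(1 + s3) / norm, (3 + s3) / norm, (3 - s3) / norm, (1 - s3) / norm]
# Assumed high-pass convention for this sketch: g[n] = (-1)^n h[3 - n].
g = [(-1) ** n * h[3 - n] for n in range(4)]

def detail_coefficients(samples):
    """d[m] = sum_n g[n] * samples[2m + n], for every m where the filter fits."""
    return [sum(g[n] * samples[2 * m + n] for n in range(4))
            for m in range((len(samples) - 4) // 2 + 1)]

linear = [0.5 * k + 2.0 for k in range(16)]      # polynomial of degree 1
quadratic = [float(k * k) for k in range(16)]    # polynomial of degree 2

d_linear = detail_coefficients(linear)        # all (numerically) zero
d_quadratic = detail_coefficients(quadratic)  # constant and nonzero
print(d_linear)
print(d_quadratic)
```

With a threshold $\epsilon$, every coefficient in `d_linear` would be discarded, which is the compression mechanism described above.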

1 (2x - n) .

Then the same calculations as in Chapter 5 show that the functions $\psi_{2j,k}(x) = 2^{-j}\psi^1(2^{-2j}x - k)$, $\psi_{2j+1,k}(x) = 2^{-j-1/2}\psi^2(2^{-2j-1}x - k)$ ($j, k \in \mathbb{Z}$) constitute an orthonormal basis for $L^2(\mathbb{R})$. Since the recursions above correspond to

$$\hat\phi^1(\xi) = (2\pi)^{-1/2}\, m_0(\xi/2)\,\overline{m_0(\xi/4)}\, m_0(\xi/8)\,\overline{m_0(\xi/16)} \cdots = (2\pi)^{-1/2}\prod_{j=0}^{\infty}\left[m_0(2^{-2j-1}\xi)\,\overline{m_0(2^{-2j-2}\xi)}\right],$$

the phase of $\hat\phi^1$ can be expected to be closer to linear phase than that of $\hat\phi(\xi) = (2\pi)^{-1/2}\prod_{j=1}^{\infty} m_0(2^{-j}\xi)$. Note also that $\hat\phi^2(\xi) = \overline{\hat\phi^1(\xi)}$, $\hat\psi^2(\xi) = \overline{\hat\psi^1(\xi)}$; hence $\phi^2(x) = \phi^1(-x)$, $\psi^2(x) = \psi^1(-x)$. Figure 8.2 shows $\phi^1$, $\psi^1$ computed from the $h_n$ for $N = 2$, i.e., $h_0 = \frac{1+\sqrt3}{4\sqrt2}$, $h_1 = \frac{3+\sqrt3}{4\sqrt2}$, $h_2 = \frac{3-\sqrt3}{4\sqrt2}$, $h_3 = \frac{1-\sqrt3}{4\sqrt2}$. (Unlike the previous construction, this "switching" makes a difference even for $N = 2$.) For the "least asymmetric" $h_n$ given in Table 6.3, this switching technique leads to slightly "better" $\phi$, but seems to have little effect on $\psi$. $\Box$

FIG. 8.2. Scaling function $\phi^1$ and wavelet $\psi^1$ obtained by applying the "switching trick" to the 4-tap wavelet filters of §6.4.

CHAPTER 8

8.2. Coiflets. In §7.4 we saw one advantage of having a high number of vanishing moments for $\psi$: it led to high compressibility because the fine scale wavelet coefficients of a function would be essentially zero where the function was smooth. Since $\int dx\, \phi(x) = 1$, the same thing can never happen for the $\langle f, \phi_{j,k}\rangle$. Still, if $\int dx\, x^\ell \phi(x) = 0$ for $\ell = 1, \ldots, L$, then we can apply the same Taylor expansion argument and conclude that for $J$ large, $\langle f, \phi_{-J,k}\rangle \simeq 2^{-J/2} f(2^{-J}k)$, with an error that is negligibly small where $f$ is smooth. This means that we have a particularly simple quadrature rule to go from the samples of $f$ to its fine scale coefficients $\langle f, \phi_{-J,k}\rangle$. For this reason, R. Coifman suggested in the spring of 1989 that it might be worthwhile to construct orthonormal wavelet bases with vanishing moments not only for $\psi$, but also for $\phi$.$^4$ In this section I give a brief account of how this can be done; more details are given in Daubechies (1993). Because they were first requested by Coifman (with a view to applying them for the algorithms in Beylkin, Coifman, and Rokhlin), I have named the resulting wavelets "coiflets." The goal is to find $\psi$, $\phi$ so that

$$\int dx\, x^\ell \psi(x) = 0, \qquad \ell = 0, \ldots, L-1 \tag{8.2.1}$$

and

$$\int dx\, \phi(x) = 1, \qquad \int dx\, x^\ell \phi(x) = 0, \qquad \ell = 1, \ldots, L-1; \tag{8.2.2}$$

$L$ is then called the order of the coiflet. We already know how to express (8.2.1) in terms of $m_0$; it is equivalent with

$$m_0(\xi) = \left(\frac{1 + e^{-i\xi}}{2}\right)^L \mathcal{L}(\xi). \tag{8.2.3}$$

What does (8.2.2) correspond to? It is equivalent to the condition $\frac{d^\ell}{d\xi^\ell}\hat\phi\big|_{\xi=0} = 0$, $\ell = 1, \ldots, L-1$. Let us check what $\hat\phi'(0) = 0$ means for $m_0$. Because $\hat\phi(\xi) = m_0(\xi/2)\,\hat\phi(\xi/2)$, we have

$$\hat\phi'(\xi) = \tfrac12 m_0'(\xi/2)\,\hat\phi(\xi/2) + \tfrac12 m_0(\xi/2)\,\hat\phi'(\xi/2);$$

hence

$$\hat\phi'(0) = \tfrac12 m_0'(0)\,(2\pi)^{-1/2} + \tfrac12 \hat\phi'(0),$$

or

$$m_0'(0) = (2\pi)^{1/2}\,\hat\phi'(0).$$

Consequently, $\int dx\, x\phi(x) = 0$ is equivalent with $m_0'(0) = 0$. Similarly, one sees that (8.2.2) is equivalent with $\frac{d^\ell}{d\xi^\ell}m_0\big|_{\xi=0} = 0$, $\ell = 1, \ldots, L-1$, or with

$$m_0(\xi) = 1 + (1 - e^{-i\xi})^L\, \tilde{\mathcal{L}}(\xi), \tag{8.2.4}$$

SYMMETRY FOR COMPACTLY SUPPORTED WAVELET BASES

where $\tilde{\mathcal{L}}$ is a trigonometric polynomial. In addition to (8.2.3) and (8.2.4), $m_0$ will of course also have to satisfy $|m_0(\xi)|^2 + |m_0(\xi + \pi)|^2 = 1$. Let us specialize to $L$ even (the easiest case, although odd $L$ are not much harder), $L = 2K$. Then (8.2.3), (8.2.4) imply that we have to find two trigonometric polynomials $P_1$, $P_2$ so that

$$\left(\cos^2\tfrac{\xi}{2}\right)^K P_1(\xi) = 1 + \left(\sin^2\tfrac{\xi}{2}\right)^K P_2(\xi). \tag{8.2.5}$$

(Because $\left(\frac{1 + e^{-i\xi}}{2}\right)^{2K} = e^{-i\xi K}\left(\cos^2\frac{\xi}{2}\right)^K$, $(1 - e^{-i\xi})^{2K} = e^{-iK\xi}\left(2i\sin\frac{\xi}{2}\right)^{2K}$.)

But we already know what the general form of such $P_1$, $P_2$ is: (8.2.5) is nothing other than the Bezout equation which we already solved in §6.1. In particular, $P_1$ has the form

$$P_1(\xi) = \sum_{k=0}^{K-1}\binom{K-1+k}{k}\left(\sin^2\tfrac{\xi}{2}\right)^k + \left(\sin^2\tfrac{\xi}{2}\right)^K f(\xi),$$

where $f$ is an arbitrary trigonometric polynomial. It then remains to tailor $f$ in $m_0(\xi) = \left((1 + e^{-i\xi})/2\right)^{2K} P_1(\xi)$ so that $|m_0(\xi)|^2 + |m_0(\xi + \pi)|^2 = 1$ is satisfied. With the ansatz $f(\xi) = \sum_{n=0}^{2K-1} f_n e^{-in\xi}$ it is shown in Daubechies (1990) how to reduce this "tailoring" to the solution of a system of $K$ quadratic equations for $K$ unknowns. A heuristic, perturbative argument suggests that this system will have a solution for large $K$, and explicit numerical solutions are computed for $K = 1, \ldots, 5$. Figure 8.3 shows the plots of the resulting $\phi$, $\psi$; the corresponding coefficients are listed in Table 8.1. It is clear from the figure that $\phi$, $\psi$ are much more symmetric than the ${}_N\phi$, ${}_N\psi$ of §6.4, or even than the $\phi$, $\psi$ in §8.1, but there is of course a price to pay: a coiflet with $2K$ vanishing moments typically has support width $6K - 1$, as compared to $4K - 1$ for ${}_{2K}\psi$.

REMARK. The ansatz $f(\xi) = \sum_{n=0}^{2K-1} f_n e^{-in\xi}$ is not the only possible one, but it makes the computations easier. For small values of $K$ ($K = 1, 2, 3$), different ansatzes are also tried out in Daubechies (1993). It turns out that the smoothest coiflets (at least at these small values for $K$) are not the most symmetric ones; for $K = 1$, for instance, there exists a (very asymmetric) coiflet with Hölder exponent 1.191814, whereas the coiflet of order 2 in Figure 8.3 is not $C^1$; both have support width 5. Similar gains of regularity can be found for $K = 2, 3$. For graphs, coefficients and more details, see Daubechies (1990b). $\Box$
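In filter form, conditions (8.2.1) and (8.2.2) say that the discrete moments $\sum_n (-1)^n n^\ell h_n$ vanish for $\ell = 0, \ldots, L-1$ and that $\sum_n n^\ell h_n$ vanishes for $\ell = 1, \ldots, L-1$. The sketch below checks this for the coiflet of order $L = 4$ ($K = 2$), using the values of $h_n/\sqrt2$ from Table 8.1 with signs fixed by the requirement that the coefficients sum to 1. The function names are hypothetical and the block is an illustration, not the construction used in the text.

```python
# Coiflet of order L = 4 (K = 2): values of h_n / sqrt(2) for n = -4, ..., 7.
coif = {
    -4: 0.011587596739, -3: -0.029320137980, -2: -0.047639590310,
    -1: 0.273021046535, 0: 0.574682393857, 1: 0.294867193696,
    2: -0.054085607092, 3: -0.042026480461, 4: 0.016744410163,
    5: 0.003967883613, 6: -0.001289203356, 7: -0.000509505399,
}
L = 4

def phi_moment(ell):
    """sum_n n^ell c_n: proportional to the ell-th derivative of m0 at xi = 0."""
    return sum(n ** ell * c for n, c in coif.items())

def psi_moment(ell):
    """sum_n (-1)^n n^ell c_n: proportional to the ell-th derivative of m0 at xi = pi."""
    return sum((-1) ** n * n ** ell * c for n, c in coif.items())

def shifted_product(k):
    """sum_n c_n c_{n+2k}; orthonormality requires 1/2 for k = 0, 0 otherwise."""
    return sum(c * coif.get(n + 2 * k, 0.0) for n, c in coif.items())

print("sum of coefficients:", phi_moment(0))
print("phi moments, ell = 1..L-1:", [phi_moment(ell) for ell in range(1, L)])
print("psi moments, ell = 0..L-1:", [psi_moment(ell) for ell in range(L)])
print("shifted products:", [shifted_product(k) for k in range(6)])
```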

FIG. 8.3. Coiflets $\psi$ and their corresponding scaling functions $\phi$ for $L = 2, 4, 6, 8$, and $10$. The support width of $\phi$ and $\psi$ is $3L - 1$ in all cases.

TABLE 8.1. The coefficients for coiflets of order $L = 2K$, $K = 1$ to 5. (The coefficients listed are normalized so that their sum is 1; they are equal to the $2^{-1/2}h_n$.)

K = 1
    n    h_n/√2
   -2   -.051429728471
   -1    .238929728471
    0    .602859456942
    1    .272140543058
    2   -.051429972847
    3   -.011070271529

K = 2
    n    h_n/√2
   -4    .011587596739
   -3   -.029320137980
   -2   -.047639590310
   -1    .273021046535
    0    .574682393857
    1    .294867193696
    2   -.054085607092
    3   -.042026480461
    4    .016744410163
    5    .003967883613
    6   -.001289203356
    7   -.000509505399

K = 3
    n    h_n/√2
   -6   -.002682418671
   -5    .005503126709
   -4    .016583560479
   -3   -.046507764479
   -2   -.043220763560
   -1    .286503335274
    0    .561285256870
    1    .302983571773
    2   -.050770140755
    3   -.058196250762
    4    .024434094321
    5    .011229240962
    6   -.006369601011
    7   -.001820458916
    8    .000790205101
    9    .000329665174
   10   -.000050192775
   11   -.000024465734

K = 4
    n    h_n/√2
   -8    .000630961046
   -7   -.001152224852
   -6   -.005194524026
   -5    .011362459244
   -4    .018867235378
   -3   -.057464234429
   -2   -.039652648517
   -1    .293667390895
    0    .553126452562
    1    .307157326198
    2   -.047112738865
    3   -.068038127051
    4    .027813640153
    5    .017735837438
    6   -.010756318517
    7   -.004001012886
    8    .002652665946
    9    .000895594529
   10   -.000416500571
   11   -.000183829769
   12    .000044080354
   13    .000022082857
   14   -.000002304942
   15   -.000001262175

K = 5
    n    h_n/√2
  -10   -.0001499638
   -9    .0002535612
   -8    .0015402457
   -7   -.0029411108
   -6   -.0071637819
   -5    .0165520664
   -4    .0199178043
   -3   -.0649972628
   -2   -.0368000736
   -1    .2980923235
    0    .5475054294
    1    .3097068490
    2   -.0438660508
    3   -.0746522389
    4    .0291958795
    5    .0231107770
    6   -.0139736879
    7   -.0064800900
    8    .0047830014
    9    .0017206547
   10   -.0011758222
   11   -.0004512270
   12    .0002137298
   13    .0000993776
   14   -.0000292321
   15   -.0000150720
   16    .0000026408
   17    .0000014593
   18   -.0000001184
   19   -.0000000673

8.3. Symmetric biorthogonal wavelet bases. As mentioned above, it is well known in the subband filtering community that symmetry and exact reconstruction are incompatible, if the same FIR filters are used for reconstruction and decomposition. As soon as this last requirement is given up, symmetry is possible. This means that we replace the block diagram of Figure 5.11 by Figure 8.4. Several questions naturally arise: what does Figure 8.4 mean in terms of multiresolution analysis? What do $c^1$ and $d^1$ now stand for? (They were coefficients of orthogonal projections in Chapter 5.) Is there an associated wavelet basis? How does it differ from the bases constructed earlier? The answer is that, provided the filters satisfy certain technical conditions, such a scheme corresponds to two dual wavelet bases, associated with two different multiresolution ladders. In this section we will see how to prove all this, and give several families of (symmetric!) examples. Except for an improved argument due to Cohen and Daubechies (1992), all these results are from Cohen, Daubechies, and Feauveau (1992). Many of the same examples are also derived independently in Vetterli and Herley (1990), who present a treatment from the "filter design" point of view; a useful factorization scheme for this type of filter bank is in Nguyen and Vaidyanathan (1989).

FIG. 8.4. Subband filtering scheme with exact reconstruction but reconstruction filters different from the decomposition filters.

8.3.1. Exact reconstruction. Since we have now four filters instead of two, we have to rewrite (5.6.5), (5.6.6) as

$$c^1_n = \sum_k h_{k-2n}\, c^0_k, \qquad d^1_n = \sum_k g_{k-2n}\, c^0_k$$

and

$$\tilde c^0_\ell = \sum_n \left[\tilde h_{\ell-2n}\, c^1_n + \tilde g_{\ell-2n}\, d^1_n\right].$$

In the z-notation introduced in §5.6, this can be rewritten as

$$\tilde c^0(z) = \tfrac12\left[\tilde h(z)\bar h(z) + \tilde g(z)\bar g(z)\right]c^0(z) + \tfrac12\left[\tilde h(z)\bar h(-z) + \tilde g(z)\bar g(-z)\right]c^0(-z).$$

Consequently, we require

$$\tilde h(z)\bar h(z) + \tilde g(z)\bar g(z) = 2, \tag{8.3.1}$$

$$\tilde h(z)\bar h(-z) + \tilde g(z)\bar g(-z) = 0, \tag{8.3.2}$$

where we assume $\tilde h$, $\tilde g$, $h$, $g$ to be polynomials since the filters are all FIR. (For simplicity, we use the term "polynomial" in a slightly wider sense than usual: we also allow negative powers. In other words, $\sum_{n=-N_1}^{N_2} a_n z^n$ is a polynomial in this terminology.) From (8.3.1) it follows that $\bar h$ and $\bar g$ have no common zeros; consequently, (8.3.2) implies that

$$\tilde g(z) = \bar h(-z)\,p(z), \qquad \tilde h(z) = -\bar g(-z)\,p(z), \tag{8.3.3}$$

for some polynomial $p$. Substitution into (8.3.1) leads to

$$p(z)\left[\bar h(-z)\bar g(z) - \bar h(z)\bar g(-z)\right] = 2.$$

The only polynomials that divide constants are monomials; hence $p(z) = \alpha z^k$ for some $\alpha \in \mathbb{C}$, $k \in \mathbb{Z}$, and (8.3.3) becomes

$$\tilde g(z) = \alpha z^k\, \bar h(-z), \qquad \tilde h(z) = -\alpha z^k\, \bar g(-z). \tag{8.3.4}$$

Any choice for $\alpha$ and $k$ will do; we choose $\alpha = 1$, $k = 1$, which makes the equations (8.3.4) for $g$ and $\tilde g$ symmetric. Substitution into (8.3.1) gives

$$\tilde h(z)\bar h(z) + \tilde h(-z)\bar h(-z) = 2. \tag{8.3.5}$$

In terms of the filter coefficients, all this becomes

$$\tilde g_n = (-1)^{n+1} h_{-n+1}, \qquad g_n = (-1)^{n+1}\tilde h_{-n+1}, \tag{8.3.6}$$

$$\sum_n h_n \tilde h_{n+2k} = \delta_{k,0}, \tag{8.3.7}$$

where we have implicitly assumed that all the coefficients are real. These equations are obvious generalizations of (5.1.39), (5.1.34).
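The conditions above can be exercised on a concrete pair of symmetric filters. The sketch below is an illustration under stated assumptions: it uses the spline pair with $\tilde m_0 = \frac14(z^{-1} + 2 + z)$ and $m_0 = -\frac18 z^{-2} + \frac14 z^{-1} + \frac34 + \frac14 z - \frac18 z^2$, builds the high-pass filters from (8.3.6), and treats the signal as periodic (a choice made here only to avoid boundary effects). The helper names `analyze`, `synthesize`, and `biorthogonality` are hypothetical.

```python
import math
import random

SQRT2 = math.sqrt(2.0)

# Low-pass filters: sqrt(2) times the coefficients of z^n in m0 and m0~.
h = {-2: -SQRT2 / 8, -1: SQRT2 / 4, 0: 3 * SQRT2 / 4, 1: SQRT2 / 4, 2: -SQRT2 / 8}
h_dual = {-1: SQRT2 / 4, 0: SQRT2 / 2, 1: SQRT2 / 4}

# High-pass filters from (8.3.6): g~_n = (-1)^(n+1) h_(1-n), g_n = (-1)^(n+1) h~_(1-n).
# With m = 1 - n the sign (-1)^(n+1) equals (-1)^m.
g_dual = {1 - m: (-1) ** m * c for m, c in h.items()}
g = {1 - m: (-1) ** m * c for m, c in h_dual.items()}

def analyze(c0, filt):
    """out[n] = sum_k filt[k - 2n] * c0[k], with c0 extended periodically."""
    M = len(c0)
    return [sum(w * c0[(2 * n + j) % M] for j, w in filt.items()) for n in range(M // 2)]

def synthesize(c1, d1, lo, hi):
    """out[l] = sum_n (lo[l - 2n] * c1[n] + hi[l - 2n] * d1[n]), periodic."""
    M = 2 * len(c1)
    out = [0.0] * M
    for n in range(len(c1)):
        for j, w in lo.items():
            out[(2 * n + j) % M] += w * c1[n]
        for j, w in hi.items():
            out[(2 * n + j) % M] += w * d1[n]
    return out

def biorthogonality(k):
    """sum_n h_n h~_{n+2k}, which (8.3.7) requires to equal delta_{k,0}."""
    return sum(c * h_dual.get(n + 2 * k, 0.0) for n, c in h.items())

random.seed(0)
c0 = [random.uniform(-1.0, 1.0) for _ in range(16)]
c1, d1 = analyze(c0, h), analyze(c0, g)
reconstructed = synthesize(c1, d1, h_dual, g_dual)
max_error = max(abs(x - y) for x, y in zip(c0, reconstructed))
print("max reconstruction error:", max_error)
```

Decomposition uses $h$, $g$ and reconstruction uses $\tilde h$, $\tilde g$; swapping the roles of the two pairs also gives exact reconstruction.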

8.3.2. Scaling functions and wavelets. Because we have two pairs of filters, we also have two pairs of scaling function + wavelet: $\phi$, $\psi$ and $\tilde\phi$, $\tilde\psi$. [...]

[...] with

$$\cdots \subset V_1 \subset V_0 \subset V_{-1} \subset \cdots, \qquad \cdots \subset \tilde V_1 \subset \tilde V_0 \subset \tilde V_{-1} \subset \cdots,$$

$V_0 = \mathrm{Span}\{\phi_{0,k};\ k \in \mathbb{Z}\}$, $\tilde V_0 = \mathrm{Span}\{\tilde\phi_{0,k};\ k \in \mathbb{Z}\}$. The spaces $W_j = \mathrm{Span}\{\psi_{j,k};\ k \in \mathbb{Z}\}$, $\tilde W_j = \mathrm{Span}\{\tilde\psi_{j,k};\ k \in \mathbb{Z}\}$ are again complements of $V_j$, respectively, $\tilde V_j$ in $V_{j-1}$, respectively, $\tilde V_{j-1}$, but they are not orthogonal complements: typically the angle$^7$ between $V_j$, $W_j$ or $\tilde V_j$, $\tilde W_j$ will be smaller than 90°. This is the reason why we have to prove (8.3.12) in this case, whereas it was automatic in the orthonormal case. Another way of seeing this is the following. Because of the non-orthogonality we have

$$\alpha\left[\sum_k |\langle f, \phi_{j,k}\rangle|^2 + \sum_k |\langle f, \psi_{j,k}\rangle|^2\right] \le \sum_k |\langle f, \phi_{j-1,k}\rangle|^2 \le \beta\left[\sum_k |\langle f, \phi_{j,k}\rangle|^2 + \sum_k |\langle f, \psi_{j,k}\rangle|^2\right],$$

with $\alpha < 1$, $\beta > 1$ (in the orthogonal case, equality holds, with $\alpha = \beta = 1$). Unlike the orthonormal case, we cannot telescope these inequalities to prove that the $\psi_{j,k}$ constitute a Riesz basis: telescoping would lead to a blowup of the constants. We therefore have to follow a different strategy. Note that (8.3.13) implies that $W_j \perp \tilde V_j$, $\tilde W_j \perp V_j$. The two multiresolution hierarchies and their sequences of complement spaces fit together like a giant zipper, and this is what allows us to control expressions like $\sum_{j,k} |\langle f, \psi_{j,k}\rangle|^2$.

But let us return to the conditions (8.3.12) and (8.3.14). We already saw how to tackle condition (8.3.14) in §6.3, in the simpler orthogonal case. Our strategy here is essentially the same. We again define an operator $P_0$ acting on $2\pi$-periodic functions, [...]; a second operator $\tilde P_0$ is defined analogously. In terms of the Fourier coefficients of $f$, the action of $P_0$ is given by [...]; we will be mostly interested in invariant trigonometric polynomials for $P_0$. This means that we can restrict our attention to the $2(N_2 - N_1) + 1$-dimensional subspace of $f$ for which $f_\ell = 0$ if $|\ell| > N_2 - N_1$ (we assume $h_n = 0$ if $n < N_1$ or $n > N_2$), on which $P_0$ is represented by a matrix. Theorems 6.3.1 and 6.3.4 have the following analog.

THEOREM 8.3.1. The following three statements are equivalent:


1. $\phi$, $\tilde\phi \in L^2(\mathbb{R})$ and $\langle \phi_{0,k}, \tilde\phi_{0,\ell}\rangle = \delta_{k,\ell}$.

2. There exist strictly positive trigonometric polynomials $f_0$, $\tilde f_0$ invariant for $P_0$, $\tilde P_0$; there also exists a compact set $K$ congruent to $[-\pi, \pi]$ modulo $2\pi$ so that [...].

3. There exist strictly positive trigonometric polynomials $f_0$, $\tilde f_0$ invariant for $P_0$, $\tilde P_0$, and these are the only invariant polynomials for $P_0$, $\tilde P_0$ (up to normalization).

The proof is very similar to the proofs in Chapter 6, but a bit more complicated. In §6.3, the functions $f_0$, $\tilde f_0$ were simply constant; in this case, they are essentially $f_0(\xi) = \sum_\ell |\hat\phi(\xi + 2\pi\ell)|^2$, $\tilde f_0(\xi) = \sum_\ell |\hat{\tilde\phi}(\xi + 2\pi\ell)|^2$. For details on how to adapt the proofs of §6.3 to the present case, see Cohen, Daubechies, and Feauveau (1992).

Condition (8.3.14) therefore simply amounts to checking that two matrices have a nondegenerate eigenvalue 1 and that the entries of the corresponding eigenvectors define a strictly positive trigonometric polynomial. (Note that if the trigonometric polynomial takes negative values, then $\phi \notin L^2(\mathbb{R})$. This happens for some exact reconstruction filter quadruplets.)

Condition (8.3.12) is something we had not encountered in the orthogonal case. It turns out that this condition is satisfied if any of the three conditions in Theorem 8.3.1 holds. The proof of this surprising fact is in the following steps$^8$:

• First, one shows that the existence of an eigenvalue $\lambda$ of $P_0$ with $|\lambda| \ge 1$, $\lambda \ne 1$ would contradict the square integrability of $\phi$. It follows therefore from Theorem 8.3.1 that all the other eigenvalues of $P_0$ have absolute value strictly smaller than 1 if the eigenvalue 1 is nondegenerate and the associated eigenvector corresponds to a strictly positive trigonometric polynomial. The proof of this step uses Lemma 7.1.10.

• Since $m_0(\pi) = 0 = \tilde m_0(\pi)$, we have obviously $M_0(\pi) = |m_0(\pi)|^2 = 0 = |\tilde m_0(\pi)|^2 = \tilde M_0(\pi)$. We saw in Chapter 7 that this means that the columns of the matrix representing $P_0$ all sum to 1, so that the row vector (of the appropriate dimension) with all entries 1 is a left eigenvector for $P_0$ with eigenvalue 1. It follows from the first point that $\rho$, the spectral radius of $P_0|_{E_1}$, with $E_1 = \{f;\ \sum_n f_n = 0\}$, is strictly smaller than 1. One then uses that $f(\xi) = 1 - \cos\xi$ is in $E_1$ to prove (the estimates are analogous to those in the proof of Theorem 7.1.12) that

$$\int_{2^{n-1}\pi \le |\xi| \le 2^n\pi} d\xi\, |\hat\phi(\xi)|^2 \le C\left(\frac{1+\rho}{2}\right)^n.$$

• Via Hölder's inequality this implies $\int d\xi\, |\hat\phi(\xi)|^{2(1-\delta)} < \infty$ for sufficiently small $\delta$. This can then be used to prove a "discretized" version, i.e., $\sum_{m\in\mathbb{Z}} |\hat\phi(\xi + \pi m)|^{2(1-\delta')} \le C < \infty$ for all $\xi \in \mathbb{R}$, again for sufficiently small $\delta'$. Because $m_1$ is bounded, $\hat\psi$ satisfies a similar bound,

$$\sum_{m\in\mathbb{Z}} |\hat\psi(\xi + 2\pi m)|^{2(1-\delta')} \le C < \infty. \tag{8.3.15}$$

• On the other hand, one can also prove that [...] (8.3.16). Since $\hat\psi$ is entire and $\hat\psi(0) = 0$, $|\hat\psi(\xi)| \le C|\xi|$ for sufficiently small $|\xi|$, so that $\sum_{j=-\infty}^{-1} |\hat\psi(2^j\xi)|^{2\delta'}$ is uniformly bounded for $|\xi| \le 2\pi$, and we only need to concentrate on $j \ge 0$ in (8.3.16). But [...]

FIG. 8.7. Spline examples with $\tilde N = 3$, $N = 3, 5, 7$, and $9$. For $N$ [...] is not square integrable. Here support [...] $= [-\frac{N+1}{2}, \frac{N+3}{2}]$. The functions [...] and ${}_{3,3}\psi$ are examples of the fact that the cascade algorithm may diverge while the direct algorithm still converges (see Note 11 at the end of Chapter 6).

FIG. 8.7. (continued).


TABLE 8.2. List of ${}_{\tilde N}\tilde m_0$, ${}_{\tilde N,N}m_0$ for the first few values of $\tilde N$, $N$, with $z = e^{-i\xi}$. The corresponding filter coefficients ${}_{\tilde N}\tilde h_k$, ${}_{\tilde N,N}h_k$ are obtained by multiplying $\sqrt2$ with the coefficient of $z^k$ in ${}_{\tilde N}\tilde m_0$, ${}_{\tilde N,N}m_0$, respectively. Note that the coefficients of ${}_{\tilde N,N}m_0$ are always symmetric; for very long ${}_{\tilde N,N}m_0$, we only list about half the coefficients (the others can be deduced by symmetry).

$\tilde N = 1$:  ${}_{\tilde N}\tilde m_0 = \frac12(1 + z)$
  $N = 1$:  $\frac12(1 + z)$
  $N = 3$:  $-\frac{z^{-2}}{16} + \frac{z^{-1}}{16} + \frac12 + \frac{z}{2} + \frac{z^2}{16} - \frac{z^3}{16}$
  $N = 5$:  $\frac{3}{256}z^{-4} - \frac{3}{256}z^{-3} - \frac{11}{128}z^{-2} + \frac{11}{128}z^{-1} + \frac12 + \frac{z}{2} + \frac{11}{128}z^2 - \frac{11}{128}z^3 - \frac{3}{256}z^4 + \frac{3}{256}z^5$

$\tilde N = 2$:  ${}_{\tilde N}\tilde m_0 = \frac14(z^{-1} + 2 + z)$
  $N = 2$:  $-\frac18 z^{-2} + \frac14 z^{-1} + \frac34 + \frac14 z - \frac18 z^2$
  $N = 4$:  $\frac{3}{128}z^{-4} - \frac{3}{64}z^{-3} - \frac18 z^{-2} + \frac{19}{64}z^{-1} + \frac{45}{64} + \frac{19}{64}z - \frac18 z^2 - \frac{3}{64}z^3 + \frac{3}{128}z^4$
  $N = 6$:  $-\frac{5}{1024}z^{-6} + \frac{5}{512}z^{-5} + \frac{17}{512}z^{-4} - \frac{39}{512}z^{-3} - \frac{123}{1024}z^{-2} + \frac{81}{256}z^{-1} + \frac{175}{256} + \frac{81}{256}z - \frac{123}{1024}z^2 \cdots$
  $N = 8$:  $2^{-15}(35z^{-8} - 70z^{-7} - 300z^{-6} + 670z^{-5} + 1228z^{-4} - 3126z^{-3} - 3796z^{-2} + 10718z^{-1} + 22050 + 10718z - 3796z^2 \cdots)$

$\tilde N = 3$:  ${}_{\tilde N}\tilde m_0 = \frac18(z^{-1} + 3 + 3z + z^2)$
  $N = 3$:  $\frac{3}{64}z^{-3} - \frac{9}{64}z^{-2} - \frac{7}{64}z^{-1} + \frac{45}{64} + \frac{45}{64}z - \frac{7}{64}z^2 - \frac{9}{64}z^3 + \frac{3}{64}z^4$
  $N = 5$:  $-\frac{5}{512}z^{-5} + \frac{15}{512}z^{-4} + \frac{19}{512}z^{-3} - \frac{97}{512}z^{-2} - \frac{13}{256}z^{-1} + \frac{175}{256} + \frac{175}{256}z - \frac{13}{256}z^2 \cdots$
  $N = 7$:  $2^{-14}(35z^{-7} - 105z^{-6} - 195z^{-5} + 865z^{-4} + 336z^{-3} - 3489z^{-2} - 307z^{-1} + 11025 + 11025z \cdots)$
  $N = 9$:  $2^{-17}(-63z^{-9} + 189z^{-8} + 469z^{-7} - 1911z^{-6} - 1308z^{-5} + 9188z^{-4} + 1140z^{-3} - 29676z^{-2} + 190z^{-1} + 87318 + 87318z \cdots)$

The functions ${}_{1,3}\psi$ and ${}_{1,3}\tilde\psi$ were first constructed in Tchamitchian (1987) as an example of two dual wavelet bases with very different regularity properties. Here they constitute the first non-orthonormal example of the family ($\tilde N = 1 = N$ gives the Haar basis). As in the orthonormal case, arbitrarily high regularity can be attained with these examples, for both $\psi$ and $\tilde\psi$. As a spline function, ${}_{\tilde N,N}\tilde\psi$ is piecewise polynomial of degree $\tilde N - 1$ and is $C^{\tilde N - 2}$ at the knots; the regularity of ${}_{\tilde N,N}\psi$ can be assessed with any of the techniques in Chapter 7. Asymptotically, for large $\tilde N$, one finds that ${}_{\tilde N,N}\psi \in C^m$ if $N > 4.165\,\tilde N + 5.165\,(m + 1)$. These spline examples have several remarkable features. For one thing, all the filter coefficients are dyadic rationals; since division by 2 can be done very fast on a computer, this makes them very suitable for fast computations. Another attractive property is that the functions ${}_{\tilde N,N}\tilde\psi(x)$ are known exactly and explicitly for all $x$, unlike the orthonormal compactly supported wavelets we saw before.$^{11}$ One disadvantage they have is that $m_0$ and $\tilde m_0$ are very unequal in length, as is apparent from Table 8.2. This is reflected in very different support widths for $\phi$ and $\tilde\phi$; because they are determined by both $m_0$ and $\tilde m_0$, $\psi$ and $\tilde\psi$ always have the same support width, given by the average of the filter lengths of $m_0$, $\tilde m_0$, minus 1. The large difference in filter lengths for $m_0$, $\tilde m_0$ can be a nuisance in some applications, such as image analysis.
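Because the spline filter coefficients are dyadic rationals, the duality relation (8.3.7) can be verified in exact arithmetic. The following sketch does this for two pairs, $(\tilde N, N) = (1, 3)$ and $(3, 3)$. It is an illustration only: the $m_0$ coefficients are those of Table 8.2, the form $\tilde m_0 = \frac18(z^{-1} + 3 + 3z + z^2)$ for $\tilde N = 3$ is an assumption of the sketch, and the helper names are hypothetical.

```python
from fractions import Fraction as F

# Coefficients of z^n (z = e^{-i xi}) in m0~ and m0 for two spline pairs.
pairs = {
    (1, 3): ({0: F(1, 2), 1: F(1, 2)},
             {-2: F(-1, 16), -1: F(1, 16), 0: F(1, 2),
              1: F(1, 2), 2: F(1, 16), 3: F(-1, 16)}),
    (3, 3): ({-1: F(1, 8), 0: F(3, 8), 1: F(3, 8), 2: F(1, 8)},
             {-3: F(3, 64), -2: F(-9, 64), -1: F(-7, 64), 0: F(45, 64),
              1: F(45, 64), 2: F(-7, 64), 3: F(-9, 64), 4: F(3, 64)}),
}

def is_dyadic(q):
    """True if the denominator of the fraction q is a power of two."""
    d = q.denominator
    return d & (d - 1) == 0

def duality(m_dual, m, k):
    """sum_n m[n] * m_dual[n + 2k]. Since h = sqrt(2) m and h~ = sqrt(2) m~,
    relation (8.3.7) is equivalent to: 1/2 for k = 0 and 0 otherwise."""
    return sum(c * m_dual.get(n + 2 * k, 0) for n, c in m.items())

for label, (m_dual, m) in pairs.items():
    print(label, [duality(m_dual, m, k) for k in range(-3, 4)])
```

Exact rational arithmetic avoids any tolerance in the comparison, which is a small practical benefit of the dyadic coefficients noted in the text.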

Examples with less disparate filter lengths. Even if we still take $R \equiv 0$, it is possible to find $m_0$ and $\tilde m_0$ with closer filter lengths by choosing an appropriate factorization of $P(\sin^2\frac{\xi}{2})$ into $q_0(\cos\xi)$ and $\tilde q_0(\cos\xi)$. For fixed $\ell + \tilde\ell$ there is a limited number of factorizations. One way to find them is to use spectral factorization again: we determine all the zeros (real and pairs of conjugated complex zeros) of $P$, so that we can write this polynomial as a product of real first and second order polynomials,

$$P(x) = A \prod_{j=1}^{j_1}(x - x_j)\prod_{i=1}^{j_2}\left(x^2 - 2\,\mathrm{Re}\,z_i\, x + |z_i|^2\right).$$

Regrouping of these factors leads to all the possibilities for $q_0$ and $\tilde q_0$. Table 8.3 gives the coefficients for $m_0$, $\tilde m_0$ for three examples of this kind, for $\ell + \tilde\ell = 4$ and 5. (Note that $\ell + \tilde\ell = 4$ is the smallest value for which a non-trivial factorization of this type is possible, with $q_0$, $\tilde q_0$ both real.) For $\ell + \tilde\ell = 4$, the factorization is unique; for $\ell + \tilde\ell = 5$ there are two possibilities. In both cases we have chosen $\ell$, $\tilde\ell$ so as to make the length difference of $m_0$, $\tilde m_0$ as small as possible. The corresponding wavelets and scaling functions are given in Figures 8.8 and 8.9. In all cases the conditions of §8.4.2 are satisfied.

TABLE 8.3. The coefficients of $m_0$, $\tilde m_0$ for three cases of "variations on the spline case" with filters of similar length, corresponding to $\ell + \tilde\ell = 4$ and 5 (see text). For each filter we have also given the number of $(\cos\xi/2)$ factors (denoted $N$, $\tilde N$). As in Table 8.2, multiplying the entries below with $\sqrt2$ gives the filter coefficients $h_n$, $\tilde h_n$.

N, Ñ             n     coefficient of          coefficient of
                       e^{-inξ} in m̃_0         e^{-inξ} in m_0
N = 4, Ñ = 4     0      .557543526229           .602949018236
                 ±1     .295635881557           .266864118443
                 ±2    -.028771763114          -.078223266529
                 ±3    -.045635881557          -.016864118443
                 ±4     0                       .026748757411
N = 5, Ñ = 5     0      .636046869922           .520897409718
                 ±1     .337150822538           .244379838485
                 ±2    -.066117805605          -.038511714155
                 ±3    -.096666153049           .005620161515
                 ±4    -.001905629356           .028063009296
                 ±5     .009515330511           0
N = 5, Ñ = 5     0      .382638624101           .938348578330
                 ±1     .242786343133           .333745161515
                 ±2     .043244142922          -.257235611210
                 ±3     .000197904543          -.083745161515
                 ±4     .015436545027           .038061322045
                 ±5     .007015752324           0

FIG. 8.8. The functions $\phi$, $\tilde\phi$, $\psi$, $\tilde\psi$ corresponding to the case $N = 4 = \tilde N$ in Table 8.3.

FIG. 8.9. The functions $\phi$, $\tilde\phi$, $\psi$, $\tilde\psi$ corresponding to the two cases $N = 5 = \tilde N$ in Table 8.3.

8.3.5. Biorthogonal bases close to an orthonormal basis. The first example of this family was suggested by M. Barlaud, whose research group in vision analysis tried out the filters in §6A, 6B for image coding (see Antonini et al. (1992)). Because of the popularity of the Laplacian pyramid scheme (Burt and Adelson (1983)), Barlaud wondered whether dual systems of wavelets could be constructed, using the Laplacian pyramid filter as either $\tilde m_0$ or $m_0$. These filters are given explicitly by

$$-a\,e^{-2i\xi} + .25\,e^{-i\xi} + (.5 + 2a) + .25\,e^{i\xi} - a\,e^{2i\xi}. \tag{8.3.27}$$

For $a = -1/16$, this reduces to the spline filter ${}_4\tilde m_0$ as described under the "spline examples" above. For applications in vision, the choice $a = .05$ is especially popular: even though the corresponding $\tilde\phi$ has less regularity than ${}_4\tilde\phi$, it seems to lead to results that are better from the point of view of visual perception. Following Barlaud's suggestion, we chose therefore $a = .05$ in (8.3.27), or

$$\tilde m_0(\xi) = .6 + .5\cos\xi - .1\cos 2\xi = \left(\cos\tfrac{\xi}{2}\right)^2\left(1 + \tfrac45\sin^2\tfrac{\xi}{2}\right). \tag{8.3.28}$$

Candidates for $m_0$ dual to this $\tilde m_0$ have to satisfy

$$m_0(\xi)\,\overline{\tilde m_0(\xi)} + m_0(\xi + \pi)\,\overline{\tilde m_0(\xi + \pi)} = 1.$$

As shown in §8.4.4, such $m_0$ can be chosen to be symmetric (since $\tilde m_0$ is symmetric); we also opt for $m_0$ divisible by $(\cos\xi/2)^2$ (so that the corresponding $\psi$, $\tilde\psi$ both have two zero moments). In other words,

$$m_0(\xi) = \left(\cos\tfrac{\xi}{2}\right)^2 P\left(\sin^2\tfrac{\xi}{2}\right),$$

where

$$(1 - x)^2\left(1 + \tfrac45 x\right)P(x) + x^2\left(\tfrac95 - \tfrac45 x\right)P(1 - x) = 1.$$

By Theorem 6.1.1, together with the symmetry of this equation for substitution of $x$ by $1 - x$, this equation has a unique solution $P$ of degree 2, which is easily found to be

$$P(x) = 1 + \tfrac65 x - \tfrac{24}{35}x^2.$$

This leads to

$$m_0(\xi) = \left(\cos\tfrac{\xi}{2}\right)^2\left[1 + \tfrac65\sin^2\tfrac{\xi}{2} - \tfrac{24}{35}\sin^4\tfrac{\xi}{2}\right] \tag{8.3.29}$$

$$= -\tfrac{3}{280}e^{-3i\xi} - \tfrac{3}{56}e^{-2i\xi} + \tfrac{73}{280}e^{-i\xi} + \tfrac{17}{28} + \tfrac{73}{280}e^{i\xi} - \tfrac{3}{56}e^{2i\xi} - \tfrac{3}{280}e^{3i\xi}. \tag{8.3.30}$$

One can check that both (8.3.28) and (8.3.29) satisfy all the conditions in §8.4.2. It follows that these $\tilde m_0$ and $m_0$ do indeed correspond to a pair of biorthogonal wavelet bases. Figure 8.10 shows graphs of the corresponding $\phi$, $\tilde\phi$, $\psi$ and $\tilde\psi$. All four functions are continuous but not differentiable. It is very striking how similar $\tilde\phi$ and $\phi$ are, or $\psi$ and $\tilde\psi$. This can be traced back to a similarity of $\tilde m_0$ and $m_0$, which is not immediately obvious from (8.3.27) and (8.3.30), but becomes apparent by comparison of the explicit numerical values of the filter coefficients, as in Table 8.4. In fact, both filters are very close to the (necessarily nonsymmetric) filter corresponding to one of the orthonormal coiflets (see §8.2), which we list again, for comparison, in the third column in Table 8.4. This proximity of $\tilde m_0$ to an orthonormal wavelet filter explains why the $m_0$ dual to $\tilde m_0$ is so close to $\tilde m_0$ itself. A first application to image analysis of these biorthogonal bases associated to the Laplacian pyramid is given in Antonini et al. (1992).

FIG. 8.10. Graphs of $\phi$, $\tilde\phi$, $\psi$, $\tilde\psi$ for the biorthogonal pair constructed from the Burt-Adelson low-pass filter.
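The duality between the Burt filter with $a = .05$ and the filter (8.3.30) can be confirmed in exact rational arithmetic. The sketch below is illustrative, with hypothetical function names: it checks that $\sum_n m_n \tilde m_{n+2k} = \frac12\delta_{k,0}$ for the coefficients of the two trigonometric polynomials (equivalent to $\sum_n h_n \tilde h_{n+2k} = \delta_{k,0}$ after multiplying by $\sqrt2$), and that $P(x) = 1 + \frac65 x - \frac{24}{35}x^2$ solves the polynomial equation stated in the text.

```python
from fractions import Fraction as F

a = F(1, 20)  # a = .05
# Burt filter: coefficients -a, 1/4, 1/2 + 2a, 1/4, -a for n = -2, ..., 2.
m_dual = {-2: -a, -1: F(1, 4), 0: F(1, 2) + 2 * a, 1: F(1, 4), 2: -a}
# Dual filter (8.3.30), n = -3, ..., 3.
m = {-3: F(-3, 280), -2: F(-3, 56), -1: F(73, 280), 0: F(17, 28),
     1: F(73, 280), 2: F(-3, 56), 3: F(-3, 280)}

def duality(k):
    """sum_n m[n] * m_dual[n + 2k]; expected 1/2 for k = 0, 0 otherwise."""
    return sum(c * m_dual.get(n + 2 * k, 0) for n, c in m.items())

def P(x):
    return 1 + F(6, 5) * x - F(24, 35) * x * x

def bezout_lhs(x):
    """Left side of (1-x)^2 (1 + 4x/5) P(x) + x^2 (9/5 - 4x/5) P(1-x) = 1."""
    return ((1 - x) ** 2 * (1 + F(4, 5) * x) * P(x)
            + x ** 2 * (F(9, 5) - F(4, 5) * x) * P(1 - x))

print([duality(k) for k in range(-3, 4)])
print([bezout_lhs(F(k, 7)) for k in range(-3, 11)])
```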

M. Barlaud's suggestion led to the accidental discovery that the Burt filter is very close to an orthonormal wavelet filter. (One wonders whether this closeness makes the filter so effective in applications?) This example suggested that maybe other biorthogonal bases, with symmetric filters and rational filter coefficients, can be constructed by approximating and "symmetrizing" existing orthonormal wavelet filters, and computing the corresponding dual filter. The coiflet coefficients listed in §8.2 were obtained via a construction method that naturally led to close to symmetric filters; it is natural, therefore, to expect that symmetric biorthogonal filters close to an orthonormal basis will in fact be close to these coiflet bases.

TABLE 8.4. Filter coefficients for $(\tilde m_0)_{\rm Burt}$, for the dual filter $(m_0)_{\rm Burt}$ computed in this section, and for a very close filter $(m_0)_{\rm coiflet}$ corresponding to an orthonormal basis of coiflets (see the entries for $K = 1$ in Table 8.1).

    n    (m̃_0)_Burt    (m_0)_Burt         (m_0)_coiflet
   -3     0.           -.010714285714      0.
   -2    -.05          -.053571428571     -.051429728471
   -1     .25           .260714285714      .238929728471
    0     .6            .607142857143      .602859456942
    1     .25           .260714285714      .272140543058
    2    -.05          -.053571428571     -.051429972847
    3     0.           -.010714285714     -.011070271529

The analysis in §8.2 suggests, therefore,

$$\tilde m_0(\xi) = (\cos\xi/2)^{2K}\left[\sum_{k=0}^{K-1}\binom{K-1+k}{k}(\sin\xi/2)^{2k} + O\left((\sin\xi/2)^{2K}\right)\right].$$

In the examples below we have chosen in particular

$$\tilde m_0(\xi) = (\cos\xi/2)^{2K}\left[\sum_{k=0}^{K-1}\binom{K-1+k}{k}(\sin\xi/2)^{2k} + a\,(\sin\xi/2)^{2K}\right],$$

and we have then followed the following procedure:

1. Find $a$ such that $\int_{-\pi}^{\pi} d\xi\,\left[1 - |\tilde m_0(\xi)|^2 - |\tilde m_0(\xi + \pi)|^2\right]$ is minimal (zero in the examples below). This optimization criterion can of course be replaced by other criteria (e.g., least sum of squares of all the Fourier coefficients of $1 - |\tilde m_0(\xi)|^2 - |\tilde m_0(\xi + \pi)|^2$ instead of only the coefficient of $e^{i\ell\xi}$ with $\ell = 0$). For the cases $K = 1, 2, 3$, the smallest root for $a$ is .861001748086, 3.328450120793, 13.113494845221, respectively.

2. Replace this (irrational) "optimal" value for $a$ by a close value expressible as a simple fraction.$^{12}$ For our examples $a = .8 = 4/5$ was chosen for $K = 1$, $a = 3.2 = 16/5$ for $K = 2$ and $a = 13$ for $K = 3$. For $K = 1$, this reduces then to the example above.

3. Since $\tilde m_0$ is now fixed, we can compute $m_0$. If we require that $m_0$ be also divisible by $(\cos\xi/2)^{2K}$, then

$$m_0(\xi) = (\cos\xi/2)^{2K}\, P_K\left((\sin\xi/2)^2\right), \tag{8.3.31}$$

where $P_K$ is a polynomial of degree $3K - 1$. The same analysis as in Daubechies (1990) shows that

$$P_K(x) = \sum_{k=0}^{K-1}\binom{K-1+k}{k}x^k + O(x^K),$$

thereby determining already $K$ of the $3K$ coefficients of $P_K$. The others can be computed easily. For $K = 2$ and 3 we find

$$P_2(x) = 1 + 2x + \tfrac{14}{5}x^2 + 8x^3 - \tfrac{8024}{455}x^4 + \tfrac{3776}{455}x^5, \tag{8.3.32}$$

$$P_3(x) = 1 + 3x + 6x^2 + 7x^3 + 30x^4 + 42x^5 - \tfrac{1721516}{6075}x^6 + \tfrac{1921766}{6075}x^7 - \tfrac{648908}{6075}x^8. \tag{8.3.33}$$
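Steps 1 and 3 of this procedure can be reproduced in a few lines. The sketch below is an illustration under assumptions not spelled out in the text: for $K = 1$ it uses the expansion $\tilde m_0(\xi) = (\frac12 + \frac a8) + \frac12\cos\xi - \frac a8\cos 2\xi$ of $\cos^2\frac{\xi}{2}\,(1 + a\sin^2\frac{\xi}{2})$, so that the $\ell = 0$ Fourier coefficient of $1 - |\tilde m_0(\xi)|^2 - |\tilde m_0(\xi+\pi)|^2$ equals $1 - 2\sum_n c_n^2$ with $c_n$ the coefficients of $\tilde m_0$; the root is found by plain bisection. For $K = 2$ it checks in exact arithmetic that $P_2$ from (8.3.32), with $a = 16/5$, satisfies the duality relation written in the variable $x = \sin^2\frac{\xi}{2}$. Function names are hypothetical.

```python
from fractions import Fraction as F

def constant_term(a):
    """l = 0 Fourier coefficient of 1 - |m0~(xi)|^2 - |m0~(xi + pi)|^2 for K = 1."""
    coeffs = [-a / 16, 0.25, 0.5 + a / 8, 0.25, -a / 16]  # n = -2, ..., 2
    return 1.0 - 2.0 * sum(c * c for c in coeffs)

def bisect(f, lo, hi, iterations=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

a_opt = bisect(constant_term, 0.0, 2.0)
print("optimal a for K = 1:", a_opt)

def P2(x):
    coeffs = [F(1), F(2), F(14, 5), F(8), F(-8024, 455), F(3776, 455)]
    return sum(c * x ** k for k, c in enumerate(coeffs))

def duality_lhs(x, a=F(16, 5)):
    """(1-x)^4 (1 + 2x + a x^2) P2(x) + the same expression with x -> 1 - x."""
    q = lambda y: (1 - y) ** 4 * (1 + 2 * y + a * y * y)
    return q(x) * P2(x) + q(1 - x) * P2(1 - x)

print([duality_lhs(F(k, 7)) for k in range(-3, 11)])
```

Rounding `a_opt` to the nearby simple fraction $4/5$ is what makes the dual filter coefficients rational in the $K = 1$ example.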

In Table 8.5 we list the explicit numerical values of the filter coefficients for $\tilde m_0$, $m_0$ and the closest coiflet, for $K = 2$ and 3. We have graphed

CHAPTER

and

$$\alpha \le |Q_k|^{-1} \int_{Q_k} dx\, f(x) \le 2\alpha, \quad \text{for all } k \in \mathbb{N}.$$

Proof.

1. Choose $L = 2^\ell$ so that $2^{-\ell} \int_{\mathbb{R}} dx\, f(x) \le \alpha$. It follows that $L^{-1} \int_{kL}^{(k+1)L} dx\, f(x) \le \alpha$ for all $k \in \mathbb{Z}$. This defines a first partition of $\mathbb{R}$.

2. Take a fixed interval $Q = [kL, (k+1)L[$ in this first partition. Split it into two halves, $[kL, (k+\frac12)L[$ and $[(k+\frac12)L, (k+1)L[$. Take either of the halves, call it $Q'$, and compute $I_{Q'} = |Q'|^{-1} \int_{Q'} dx\, f(x)$. If $I_{Q'} > \alpha$, then put $Q'$ in the bag of intervals that will make up $B$. We have indeed

$$\alpha < I_{Q'} \le |Q'|^{-1} \int_{Q} dx\, f(x) = 2|Q|^{-1} \int_{Q} dx\, f(x) \le 2\alpha.$$

If $I_{Q'} \le \alpha$, keep going (split into halves, etc.), if necessary, ad infinitum. Do the same for the other half of $Q$, and also for all the other intervals $[kL, (k+1)L[$. At the end we have a countable bag of "bad" intervals which all satisfy the equation at the top of this page; call their union $B$ and the complement set $G$.

3. By the construction of $B$, we find that for any $x \notin B$, there exists an infinite sequence of smaller and smaller intervals $Q_1, Q_2, Q_3, \cdots$ so that $x \in Q_n$ for every $n$, and $|Q_n|^{-1} \int_{Q_n} dy\, f(y) \le \alpha$. In fact, $|Q_j| = \frac12 |Q_{j-1}|$ for every $j$, and $Q_j \subset Q_{j-1}$. Because the $Q_n$ "shrink to" $x$,

$$|Q_n|^{-1} \int_{Q_n} dy\, f(y) \to f(x) \quad \text{almost surely.}$$

Since the left side is $\le \alpha$ by construction, it follows that $f(x) \le \alpha$ a.e. in $G$. ∎
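The stopping-time construction in the proof can be mimicked on sampled data. The following is a minimal sketch, assuming a nonnegative function given by a list of samples whose length is a power of 2 and whose overall mean is already at most $\alpha$ (the discrete analogue of step 1). The function name `cz_decompose` and the index-interval representation are illustrative choices, not anything from the text.

```python
def cz_decompose(f, alpha):
    """Return the 'bad' dyadic index intervals (start, stop) of a sampled f >= 0.

    Assumes len(f) is a power of 2 and sum(f) <= alpha * len(f).
    Every returned interval has mean in (alpha, 2*alpha]; every sample
    outside the returned intervals is <= alpha.
    """
    bad = []

    def split(lo, hi):
        # Invariant: the mean of f[lo:hi] is <= alpha.
        if hi - lo == 1:
            return  # a single "good" sample, nothing left to split
        mid = (lo + hi) // 2
        for a, b in ((lo, mid), (mid, hi)):
            total = sum(f[a:b])
            if total > alpha * (b - a):
                # Mean exceeds alpha; it is at most 2*alpha because the
                # parent mean is <= alpha and the child has half its length.
                bad.append((a, b))
            else:
                split(a, b)

    split(0, len(f))
    return bad


if __name__ == "__main__":
    samples = [0, 0, 0, 8, 1, 0, 0, 0]
    print(cz_decompose(samples, alpha=2))
```

Comparing `total` with `alpha * (b - a)` rather than dividing keeps the test exact for integer input.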

Note that the choice $L = 2^\ell$ implies that all the intervals occurring in this proof are automatically dyadic intervals, i.e., of the form $[k2^{-j}, (k+1)2^{-j}[$ for some $k, j \in \mathbb{Z}$. Next we define Calderón-Zygmund operators and prove a classical property.

DEFINITION. A Calderón-Zygmund operator $T$ on $\mathbb{R}$ is an integral operator

$$(Tf)(x) = \int dy\, K(x, y)\, f(y) \tag{9.1.1}$$

for which the integral kernel satisfies

$$|K(x, y)| \le \frac{C}{|x - y|}, \tag{9.1.2}$$

$$\left|\frac{\partial}{\partial x} K(x, y)\right| + \left|\frac{\partial}{\partial y} K(x, y)\right| \le \frac{C}{|x - y|^2}, \tag{9.1.3}$$

and which defines a bounded operator on $L^2(\mathbb{R})$.


CHARACTERIZATION OF FUNCTIONAL SPACES

THEOREM 9.1.2. A Calderón-Zygmund operator is also a bounded operator from $L^1(\mathbb{R})$ to $L^1_{\text{weak}}(\mathbb{R})$.

The space $L^1_{\text{weak}}(\mathbb{R})$ in this theorem is defined as follows.

DEFINITION. $f \in L^1_{\text{weak}}(\mathbb{R})$ if there exists $C > 0$ so that, for all $\alpha > 0$,

$$|\{x;\ |f(x)| \ge \alpha\}| \le \frac{C}{\alpha}. \tag{9.1.4}$$

The infimum of all $C$ for which (9.1.4) holds (for all $\alpha > 0$) is sometimes called $\|f\|_{L^1_{\text{weak}}}$.^2

EXAMPLES.

1. If $f \in L^1(\mathbb{R})$, then (9.1.4) is automatically satisfied. Indeed, if $S_\alpha = \{x;\ |f(x)| \ge \alpha\}$, then

$$\alpha \cdot |S_\alpha| \le \int_{S_\alpha} dx\, |f(x)| \le \int_{\mathbb{R}} dx\, |f(x)| = \|f\|_{L^1};$$

hence $\|f\|_{L^1_{\text{weak}}} \le \|f\|_{L^1}$.

2. $f(x) = |x|^{-1}$ is in $L^1_{\text{weak}}$, since $|\{x;\ |x|^{-1} \ge \alpha\}| = 2/\alpha$. However, $f(x) = |x|^{-\beta}$ is not in $L^1_{\text{weak}}$ if $\beta > 1$.

The name $L^1_{\text{weak}}$ is justified by these examples: $L^1_{\text{weak}}$ extends $L^1$, and contains the functions $f$ for which $\int |f|$ "just" misses to be finite because of logarithmic singularities in the primitive of $|f|$. We are now ready for the proof of the theorem.
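Example 2 can be illustrated numerically. The sketch below estimates the measure of a level set on a midpoint grid; the window $[-4, 4]$, the grid size and the helper name `level_set_measure` are assumptions made for the illustration. For $f(x) = |x|^{-1}$ the product $\alpha \cdot |\{|f| \ge \alpha\}|$ should stay close to 2 for every $\alpha$ whose level set fits inside the window.

```python
def level_set_measure(f, alpha, lo, hi, n=200_000):
    """Approximate |{x in [lo, hi] : |f(x)| >= alpha}| on a midpoint grid."""
    h = (hi - lo) / n
    count = sum(1 for i in range(n) if abs(f(lo + (i + 0.5) * h)) >= alpha)
    return count * h


def inverse_abs(x):
    """f(x) = 1/|x|; the midpoint grid used below never hits x = 0."""
    return 1.0 / abs(x)


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 2.0, 4.0):
        measure = level_set_measure(inverse_abs, alpha, -4.0, 4.0)
        print(alpha, alpha * measure)
```

The printed products give a numerical lower bound on the smallest admissible $C$ in (9.1.4) for this $f$.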

Proof of Theorem 9.1.2.

1. We want to estimate $|\{x;\ |Tf(x)| \ge \alpha\}|$. We start by making a Calderón-Zygmund decomposition of $\mathbb{R}$ for the function $|f|$, with threshold $\alpha$. Define now

$$g(x) = \begin{cases} f(x) & \text{if } x \in G, \\ |Q_k|^{-1} \int_{Q_k} dy\, f(y) & \text{if } x \in \text{interior of } Q_k, \end{cases}$$

$$b(x) = \begin{cases} 0 & \text{if } x \in G, \\ f(x) - |Q_k|^{-1} \int_{Q_k} dy\, f(y) & \text{if } x \in \text{interior of } Q_k. \end{cases}$$

Then $f(x) = g(x) + b(x)$ a.e.; hence $Tf = Tg + Tb$. It follows that $|Tf(x)| \ge \alpha$ is only possible if either $|Tg(x)| \ge \alpha/2$ or $|Tb(x)| \ge \alpha/2$ (or both); consequently,

$$|\{x;\ |Tf(x)| \ge \alpha\}| \le \left|\left\{x;\ |Tg(x)| \ge \tfrac{\alpha}{2}\right\}\right| + \left|\left\{x;\ |Tb(x)| \ge \tfrac{\alpha}{2}\right\}\right|. \tag{9.1.5}$$

The theorem will therefore be proved if each of the terms in the right-hand side of (9.1.5) is bounded by $\frac{C}{\alpha} \|f\|_{L^1}$.

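2. We have

$$\left(\frac{\alpha}{2}\right)^2 \left|\left\{x;\ |Tg(x)| \ge \tfrac{\alpha}{2}\right\}\right| \le \int_{\{x;\ |Tg(x)| \ge \alpha/2\}} dx\, |Tg(x)|^2 \le \int_{\mathbb{R}} dx\, |Tg(x)|^2 = \|Tg\|_{L^2}^2 \le C \|g\|_{L^2}^2, \tag{9.1.6}$$

because $T$ is a bounded operator on $L^2$. Moreover,

$$\|g\|_{L^2}^2 = \int_G dx\, |g(x)|^2 + \int_B dx\, |g(x)|^2 \le \alpha \int_G dx\, |f(x)| + \sum_k |Q_k| \left[\frac{1}{|Q_k|} \int_{Q_k} dy\, |f(y)|\right]^2 \le$$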

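$$f \in L^p(\mathbb{R}) \iff \Big[\sum_{j,k} |\langle f, \psi_{j,k}\rangle|^2\, |\psi_{j,k}(x)|^2\Big]^{1/2} \in L^p(\mathbb{R}) \iff \Big[\sum_{j,k} |\langle f, \psi_{j,k}\rangle|^2\, 2^{-j} \chi_{[2^j k,\, 2^j(k+1)]}(x)\Big]^{1/2} \in L^p(\mathbb{R}).$$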

For a proof that these are indeed equivalent characterizations of $L^p(\mathbb{R})$, see Meyer (1990). Similarly, wavelets provide unconditional bases and characterizations for many other functional spaces. We list a few here, without proofs. The Sobolev spaces $W^s(\mathbb{R})$.

The Sobolev spaces are defined by


Their characterization by means of wavelet coefficients is

$$f \in W^s(\mathbb{R}) \iff \sum_{j,k} |\langle f, \psi_{j,k}\rangle|^2\, (1 + 2^{-2js}) < \infty.$$

The Holder spaces CS (lR) . For 0 < CS (JR)

=

For s = n + Sf , 0 < S f

{

$\le C$ for $j \ge 0$, but the sum over $j$ between $j_0 < 0$ and 0 then still leads to a term in $|x - x_0|\, |\ln |x - x_0||$. That is why one has to be more circumspect for integer $\alpha$, and why the Zygmund class enters.

3. Theorems 9.2.1 and 9.2.2 are also true if $\psi$ has infinite support, and $\psi$ and $\psi'$ have good decay at $\infty$ (see Jaffard (1989b)). Compact support for $\psi$ makes the estimates easier. □

Local regularity can therefore be studied by means of wavelet coefficients. For practical purposes, one should beware, however: it may be that very large values of $j$ are needed to determine $\alpha$ in (9.2.5) reliably. This is illustrated by the following example. Take

$$f(x - a) = \begin{cases} 2e^{-|x-a|} & \text{if } x \le a - 1, \\ e^{-|x-a|} & \text{if } a - 1 \le x \le a + 1, \\ e^{-(x-a)}\,[(x - a - 1)^2 + 1] & \text{if } x \ge a + 1; \end{cases}$$

this function is graphed in Figure 9.1 (with $a = 0$). This function has Hölder exponents 0, 1, 2 at $x = a - 1$, $a$, $a + 1$, respectively, and is $C^\infty$ elsewhere.

FIG. 9.1. This function is $C^\infty$ except at $x = -1$, 0 and 1, where, respectively, $f$, $f'$, and $f''$ are discontinuous.

One can then, for each of the three points $x_0 = a - 1$, $a$, or $a + 1$, compute

$A_j = \max\{|\langle f, \psi_{-j,k}\rangle|;\ x_0 \in \text{support}(\psi_{-j,k})\}$, and plot $\log A_j / \log 2$. If $a = 0$, then these plots line up on straight lines, with slope 1/2, 3/2 and 5/2, with pretty good accuracy, leading to good estimates for $\alpha$. A decomposition in orthonormal wavelets is not translation invariant, however, and dyadic rationals, particularly 0, play a very special role with respect to the dyadic grid $\{2^{-j}k;\ j, k \in \mathbb{Z}\}$ of localization centers for our wavelet basis. Choosing different values for $a$ illustrates this: for $a = 1/128$, we have very different $\langle f, \psi_{j,k}\rangle$, but still a reasonable line-up in the plots of $\log A_j / \log 2$, with good estimates for $\alpha$; for irrational $a$, the line-up is much less impressive, and determining $\alpha$ becomes correspondingly less precise. All this is illustrated in Figure 9.2, showing the plots of $\log A_j / \log 2$ as a function of $j$, for $x_0 = a - 1$, $a$, $a + 1$ and for the three choices $a = 0$, $1/128$ and $\sqrt{2} - 11/8$ (we subtract 11/8 to obtain $a$ close to zero, for programming convenience). To make the figure, $|\langle f, \psi_{-j,k}\rangle|$ was computed for the relevant values of $k$ and for $j$ ranging from 3 to 10. (Note that this means that $f$ itself had to be sampled with a resolution $2^{-17}$, in order to have a reasonable accuracy for the $j = 10$ integrals.) For $a = 0$, the eight points line up beautifully and the estimate for $\alpha + \frac12$ is accurate to less than 1.5% at all three locations. For $a = 1/128$, the points at the coarser resolution scales do not align as well, but if $\alpha + \frac12$ is estimated from only the finest four resolution points, then the estimates are still within 2%. For the irrational choice $a = \sqrt{2} - 11/8$ no alignment can be seen at the discontinuity at $a - 1$ (one probably needs even smaller scales), and the estimate for $\alpha + \frac12$ at $a$, where $f$ is Lipschitz, is off by about 13% (interestingly enough, the estimate would be much better if the scale-10 point were deleted); at $a + 1$, where $f'$ is Lipschitz, the estimate is within 2.5%.

This illustrates that to determine the local regularity of a function, it is more useful to use very redundant wavelet families, where this translational non-invariance is much less pronounced (discrete case) or absent (continuous case). (See Holschneider and Tchamitchian (1990), Mallat and Hwang (1992).) Another reason for using very redundant wavelet families for the characterization of local regularity is that then only the number of vanishing moments of $\psi$ limits


FIG. 9.2. Estimates of the Hölder exponents of $f(x - a)$ (see Figure 9.1) at $a - 1$ (top), $a$ (middle), $a + 1$ (bottom), computed from $\log A_j / \log 2$, for different values of $a$. (This figure was contributed by M. Nitzsche, whom I would like to thank for her help.)

[Figure: plots of $\log A_j / \log 2$ against the index $j$ of the scale. Slopes printed on the panels: top row, $-0.50522$ ($a = 0$) and $-0.50596$ ($a = 1/128$); middle row, $-1.49477$ ($a = 0$), $-1.50367$ ($a = 1/128$) and $-1.70184$ ($a = \sqrt{2} - 1.375$); bottom row, $-2.45071$ ($a = 1/128$) and $-2.44146$ ($a = \sqrt{2} - 1.375$).]

the maximum regularity that can be characterized; the regularity of $\psi$ plays no role (see §2.9). If orthonormal bases are used, then we are necessarily limited by the regularity of $\psi$ itself, as is illustrated by choosing $f = \psi$. For this choice we have indeed $\langle f, \psi_{-j,k}\rangle = 0$ for all $j > 0$, all $k$; it follows that with orthonormal wavelets we can hope to characterize only regularity up to $C^{r-\epsilon}$ if $\psi \in C^r$.
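The estimation step used for Figure 9.2 (fit a straight line to $\log A_j / \log 2$ against $j$ and read off $\alpha + \frac12$ from the slope) can be sketched as follows. This is only an illustration on synthetic data: it assumes $A_j$ behaves like $C\, 2^{-j(\alpha + 1/2)}$, uses an ordinary least-squares fit, and does not compute any wavelet coefficients. The function names are hypothetical.

```python
import math


def fit_slope(js, values):
    """Least-squares slope of values against js."""
    n = len(js)
    mean_j = sum(js) / n
    mean_v = sum(values) / n
    num = sum((j - mean_j) * (v - mean_v) for j, v in zip(js, values))
    den = sum((j - mean_j) ** 2 for j in js)
    return num / den


def holder_exponent(js, A):
    """Estimate alpha from A_j ~ C * 2**(-j * (alpha + 1/2))."""
    slope = fit_slope(js, [math.log2(a) for a in A])
    return -slope - 0.5


if __name__ == "__main__":
    js = list(range(3, 11))                      # scales j = 3, ..., 10
    A = [3.0 * 2.0 ** (-1.5 * j) for j in js]    # synthetic data, alpha = 1
    print(holder_exponent(js, A))
```

Applied to a measured slope such as the $-1.70184$ printed in Figure 9.2, the same conversion gives $\alpha + \frac12 \approx 1.70$ instead of 1.5, which is the roughly 13% discrepancy mentioned in the text.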

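9.3. Wavelets for $L^1([0,1])$.

Since $L^1$-spaces do not have unconditional bases, wavelets cannot provide one. Nevertheless, they still outperform Fourier analysis in some sense. We will illustrate this by a comparison of expansions in wavelets versus Fourier series of $L^1([0,1])$-functions. But first we must introduce "periodized wavelets." Given a multiresolution analysis with scaling function $\phi$ and wavelet $\psi$, both with reasonable decay (say, $|\phi(x)|, |\psi(x)| \le C(1 + |x|)^{-2-\epsilon}$), we define

$$\phi^{\text{per}}_{j,k}(x) = \sum_{\ell \in \mathbb{Z}} \phi_{j,k}(x + \ell), \qquad \psi^{\text{per}}_{j,k}(x) = \sum_{\ell \in \mathbb{Z}} \psi_{j,k}(x + \ell);$$

and

$$V_j^{\text{per}} = \overline{\text{Span}\{\phi^{\text{per}}_{j,k};\ k \in \mathbb{Z}\}}, \qquad W_j^{\text{per}} = \overline{\text{Span}\{\psi^{\text{per}}_{j,k};\ k \in \mathbb{Z}\}}.$$

Since $\sum_{\ell \in \mathbb{Z}} \phi(x + \ell) = 1$,^6 we have, for $j \ge 0$, $\phi^{\text{per}}_{j,k}(x) = 2^{-j/2} \sum_\ell \phi(2^{-j}x - k + 2^{-j}\ell) = 2^{j/2}$, so that the $V_j^{\text{per}}$, for $j \ge 0$, are all identical one-dimensional spaces, containing only the constant functions. Similarly, because $\sum_\ell \psi(x + \ell/2) = 0$,^7 $W_j^{\text{per}} = \{0\}$ for $j \ge 1$. We therefore restrict our attention to the $V_j^{\text{per}}$, $W_j^{\text{per}}$ with $j \le 0$. Obviously $V_j^{\text{per}}, W_j^{\text{per}} \subset V_{j-1}^{\text{per}}$, a property inherited from the non-periodized spaces. Moreover, $W_j^{\text{per}}$ is still orthogonal to $V_j^{\text{per}}$, because

$$\int_0^1 dx\, \psi^{\text{per}}_{j,k}(x)\, \phi^{\text{per}}_{j,k'}(x) = \sum_{\ell, \ell' \in \mathbb{Z}} 2^{-j} \int_0^1 dx\, \psi(2^{-j}x + 2^{-j}\ell - k)\, \phi(2^{-j}x + 2^{-j}\ell' - k')$$

$$= \sum_{\ell, \ell' \in \mathbb{Z}} 2^{|j|} \int_{\ell'}^{\ell'+1} dy\, \psi(2^{|j|}y + 2^{|j|}(\ell - \ell') - k)\, \phi(2^{|j|}y - k') \quad (\text{because } j \le 0)$$

$$= \sum_{r \in \mathbb{Z}} \langle \psi_{j,k+2^{|j|}r}, \phi_{j,k'} \rangle = 0.$$

It follows that, as in the non-periodized case, $V_{j-1}^{\text{per}} = V_j^{\text{per}} \oplus W_j^{\text{per}}$. The spaces $V_j^{\text{per}}$, $W_j^{\text{per}}$ are all finite-dimensional: since $\phi^{\text{per}}_{j,k+m2^{|j|}} = \phi^{\text{per}}_{j,k}$ for $m \in \mathbb{Z}$, and the same is true for $\psi^{\text{per}}$, both $V_j^{\text{per}}$ and $W_j^{\text{per}}$ are spanned by the $2^{|j|}$ functions obtained from $k = 0, 1, \cdots, 2^{|j|} - 1$. These $2^{|j|}$ functions are moreover orthonormal; in e.g., $W_j^{\text{per}}$ we have, for $0 \le k, k' \le 2^{|j|} - 1$,

$$\langle \psi^{\text{per}}_{j,k}, \psi^{\text{per}}_{j,k'} \rangle = \sum_{r \in \mathbb{Z}} \langle \psi_{j,k+2^{|j|}r}, \psi_{j,k'} \rangle = \delta_{k,k'}.$$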

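We have therefore a ladder of multiresolution spaces, with successive orthogonal complements $W_0^{\text{per}}$ (of $V_0^{\text{per}}$ in $V_{-1}^{\text{per}}$), $W_{-1}^{\text{per}}, \ldots$, and orthonormal bases $\{\phi^{\text{per}}_{j,k};\ k = 0, \cdots, 2^{|j|} - 1\}$ in $V_j^{\text{per}}$, $\{\psi^{\text{per}}_{j,k};\ k = 0, \cdots, 2^{|j|} - 1\}$ in $W_j^{\text{per}}$. Since $\overline{\bigcup_{j \in -\mathbb{N}} V_j^{\text{per}}} = L^2([0,1])$ (this follows again from the corresponding non-periodized version), the functions in $\{\phi^{\text{per}}_{0,0}\} \cup \{\psi^{\text{per}}_{j,k};\ -j \in \mathbb{N},\ k = 0, \cdots, 2^{|j|} - 1\}$ constitute an orthonormal basis in $L^2([0,1])$. We will relabel this basis as follows:

$$\begin{aligned}
g_0(x) &= 1 = \phi^{\text{per}}_{0,0}(x) \\
g_1(x) &= \psi^{\text{per}}_{0,0}(x) \\
g_2(x) &= \psi^{\text{per}}_{-1,0}(x) \\
g_3(x) &= \psi^{\text{per}}_{-1,1}(x) = \psi^{\text{per}}_{-1,0}(x - \tfrac12) = g_2(x - \tfrac12) \\
g_4(x) &= \psi^{\text{per}}_{-2,0}(x) \\
&\ \ \vdots \\
g_{2^j}(x) &= \psi^{\text{per}}_{-j,0}(x) \\
&\ \ \vdots \\
g_{2^j+k}(x) &= \psi^{\text{per}}_{-j,k}(x) = g_{2^j}(x - k2^{-j}) \quad \text{for } 0 \le k \le 2^j - 1
\end{aligned}$$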
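The relabeling $n = 2^j + k$ is easy to make concrete for the simplest case. The sketch below assumes the Haar wavelet (an illustrative choice; the text allows any multiresolution analysis with the stated decay), for which $\psi_{-j,k}$ with $0 \le k \le 2^j - 1$ is already supported in $[0,1]$, so periodization changes nothing. The Gram matrix of the first few $g_n$ is then checked on a dyadic midpoint grid, where the sums are exact for these piecewise-constant functions.

```python
def haar_psi(x):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


def g(n, x):
    """g_0 = 1 and g_{2^j + k}(x) = 2^{j/2} psi(2^j x - k) for x in [0, 1)."""
    if n == 0:
        return 1.0
    j = n.bit_length() - 1      # n = 2**j + k with 0 <= k < 2**j
    k = n - (1 << j)
    return 2.0 ** (j / 2) * haar_psi(2**j * x - k)


def inner(n, m, samples=64):
    """Inner product on [0, 1) via a midpoint rule on a dyadic grid."""
    xs = [(i + 0.5) / samples for i in range(samples)]
    return sum(g(n, x) * g(m, x) for x in xs) / samples


if __name__ == "__main__":
    for n in range(4):
        print([round(inner(n, m), 12) for m in range(4)])
```

With 64 samples the midpoint rule resolves every $g_n$ with $n < 16$, so the computed Gram matrix should be the identity up to rounding.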

Then this basis has the following remarkable property.

THEOREM 9.3.1. If $f$ is a continuous periodic function with period 1, then there exist $\alpha_n \in \mathbb{C}$ so that

(9.3.1)

Proof.

1. Since the $g_n$ are orthonormal, we necessarily have $\alpha_n = \langle f, g_n \rangle$. Define $S_N$ by

$$S_N f = \sum_{n=0}^{N-1} \langle f, g_n \rangle\, g_n.$$

In a first step we prove that the $S_N$ are uniformly bounded, i.e.,

(9.3.2)

with $C$ independent of $f$ or $N$.

2. If $N = 2^j$, then $S_{2^j} = \text{Proj}_{V_{-j}^{\text{per}}}$; hence $(S_{2^j} f)(x) = \sum_{k=0}^{2^{|j|}-1} \langle f,$

. correspond exactly to the wavelet coefficients $\langle F, \Psi^{\lambda}_{j;\mathbf{n}} \rangle$, with $F$ In an image, horizontal edges will show up in $d^{1,h}$, vertical edges in $d^{1,v}$, diagonal edges in $d^{1,d}$, as illustrated in the image example below. (This justifies the $h$, $v$, $d$ superscripts.) Note that if the original image consists of an $N \times N$ array, then (apart from border effects; see also §10.6), every array $d^{1,\lambda}$ consists of $\frac{N}{2} \times \frac{N}{2}$ elements, and can therefore be represented by an image (magnitudes of the coefficients corresponding to grey levels) of one quarter the size of the original. The whole scheme can therefore be represented as in Figure 10.2. Of course, one can decompose even further if more multiresolution layers are wanted. Figure 10.3 shows this decomposition scheme on a real image, with three multiresolution layers.
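One level of such a separable decomposition can be sketched with the Haar filters, which are an illustrative assumption here (the text does not fix a filter). The sketch takes the "h" array to be low-pass along rows and high-pass across rows, so that it responds to horizontal edges, and similarly for "v" and "d"; this labeling is also an assumption. Each output array has $N/2 \times N/2$ entries.

```python
def haar_level_2d(img):
    """One level of a separable 2-D Haar decomposition of an N x N list of lists.

    Returns (c, d_h, d_v, d_d), each of size N/2 x N/2 (N assumed even).
    """
    half = len(img) // 2
    c = [[0.0] * half for _ in range(half)]
    d_h = [[0.0] * half for _ in range(half)]
    d_v = [[0.0] * half for _ in range(half)]
    d_d = [[0.0] * half for _ in range(half)]
    for i in range(half):
        for j in range(half):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            e, f = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            c[i][j] = (a + b + e + f) / 2     # coarse approximation
            d_h[i][j] = (a + b - e - f) / 2   # difference between the two rows
            d_v[i][j] = (a - b + e - f) / 2   # difference between the two columns
            d_d[i][j] = (a - b - e + f) / 2   # diagonal difference
    return c, d_h, d_v, d_d


if __name__ == "__main__":
    # A horizontal edge between the first and second pixel rows.
    image = [[0, 0, 0, 0],
             [1, 1, 1, 1],
             [1, 1, 1, 1],
             [1, 1, 1, 1]]
    for name, arr in zip(("c", "d_h", "d_v", "d_d"), haar_level_2d(image)):
        print(name, arr)
```

For this test image only the coarse array and `d_h` are nonzero, which is the behavior described in the text for horizontal edges.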

c�. more apparent. I would like to thank M. Barlaud for providing this figure.


GENERALIZATIONS AND TRICKS

generalizations of (5.1.1)-(5.1.6)) in which $V_0$ is not a tensor product of two one-dimensional $V_0$-spaces.^1 Some (but not all!) of the constructions done in one dimension can be repeated for this case. More precisely, the multiresolution structure of the $V_j$ implies that the corresponding scaling function $\Phi$ satisfies (10.1.1) for some sequence $(h_n)_{n \in \mathbb{Z}^2}$. Orthonormality of the $\Phi_{0;n}$ forces the trigonometric polynomial (10.1.2) to satisfy

To construct an orthonormal basis of wavelets corresponding to this multiresolution analysis, one has to find three wavelets $\Psi^1, \Psi^2, \Psi^3$ in $V_{-1}$, orthogonal to $V_0$ and such that the three spaces spanned by their respective integer translates are orthogonal; moreover the $\Psi^\lambda(\cdot - n)$ should also be orthonormal for each fixed $\lambda$. This implies that

where the $m_1, m_2, m_3$ are such that the matrix

$$\begin{pmatrix}
m_0(\xi, \zeta) & m_1(\xi, \zeta) & m_2(\xi, \zeta) & m_3(\xi, \zeta) \\
m_0(\xi + \pi, \zeta) & m_1(\xi + \pi, \zeta) & m_2(\xi + \pi, \zeta) & m_3(\xi + \pi, \zeta) \\
m_0(\xi, \zeta + \pi) & m_1(\xi, \zeta + \pi) & m_2(\xi, \zeta + \pi) & m_3(\xi, \zeta + \pi) \\
m_0(\xi + \pi, \zeta + \pi) & m_1(\xi + \pi, \zeta + \pi) & m_2(\xi + \pi, \zeta + \pi) & m_3(\xi + \pi, \zeta + \pi)
\end{pmatrix} \tag{10.1.4}$$

is unitary. The analysis leading to this condition is entirely similar to the one-dimensional analysis in §5.1; see, e.g., Meyer (1990, §III.4).^2 Note that the number of wavelets to be constructed can be determined by an easy trick. In two dimensions, for example, $V_0$ is generated by the translates of one function $\Phi(x, y)$, over $\mathbb{Z}^2$; the space $V_{-1}$ is generated by the translates of $\Phi(2x, 2y)$ over $\frac12 \mathbb{Z}^2$, or equivalently, by the $\mathbb{Z}^2$-translates of the four functions $\Phi(2x, 2y)$, $\Phi(2x - 1, 2y)$, $\Phi(2x, 2y - 1)$, $\Phi(2x - 1, 2y - 1)$. $V_{-1}$ is therefore "four times as big" as $V_0$. On the other hand, each of the $W_0^j$-spaces is generated by the $\mathbb{Z}^2$-translates of a single function $\Psi^j(x, y)$, and is therefore "of the same size" as $V_0$. It follows that one needs three (= four minus one) spaces $W_0^j$ (hence three wavelets $\Psi^j$) to make up the complement of $V_0$ in $V_{-1}$. This rule may sound like "hand-waving," but we can also rephrase (and prove) it in more mathematical terms: the number of wavelets is equal to the number of different cosets (different from $\mathbb{Z}^2$ itself) of the subgroup $\mathbb{Z}^2$ in the group $\frac12 \mathbb{Z}^2$. In the general $n$-dimensional case, the same rule shows that there are $2^n - 1$ different functions $m_j$ to determine; they have to be such that the $2^n \times 2^n$-dimensional matrix

(10.1.5) is unitary, with $r = 1, \cdots, 2^n$, and $s = (s_1, \cdots, s_n) \in \{0, 1\}^n$.^3

In fact the unitarity requirement of (10.1.4) or (10.1.5) calls for a tricky balance: $m_1, m_2, m_3$ have to be found so that the first row of (10.1.4) has unit norm, which seems harmless enough, but we also simultaneously need orthogonality with and among the other rows, which are all shifted versions (in $\xi$ or $\zeta$) of the first row. These correlations between the rows may be hard to juggle in practice. It is useful to untangle them first, which can be done via the so-called polyphase decomposition. We write, e.g.,

$$2 m_0(\xi, \zeta) = m_{0,0}(2\xi, 2\zeta) + e^{-i\xi} m_{0,1}(2\xi, 2\zeta) + e^{-i\zeta} m_{0,2}(2\xi, 2\zeta) + e^{-i(\xi+\zeta)} m_{0,3}(2\xi, 2\zeta);$$

$m_{\ell,j}$, $j = 0, \cdots, 3$, are defined similarly from $m_\ell$, $\ell = 1, \cdots, 3$. One easily checks that (10.1.3) is equivalent to

$$|m_{0,0}(2\xi, 2\zeta)|^2 + |m_{0,1}(2\xi, 2\zeta)|^2 + |m_{0,2}(2\xi, 2\zeta)|^2 + |m_{0,3}(2\xi, 2\zeta)|^2 = 1.$$

Similarly, all the other conditions ensuring unitarity of (10.1.4) can be recast in terms of the $m_{\ell,j}$; one finds that (10.1.4) is unitary if and only if the polyphase matrix

$$\begin{pmatrix}
m_{0,0}(\xi, \zeta) & m_{1,0}(\xi, \zeta) & m_{2,0}(\xi, \zeta) & m_{3,0}(\xi, \zeta) \\
m_{0,1}(\xi, \zeta) & m_{1,1}(\xi, \zeta) & m_{2,1}(\xi, \zeta) & m_{3,1}(\xi, \zeta) \\
m_{0,2}(\xi, \zeta) & m_{1,2}(\xi, \zeta) & m_{2,2}(\xi, \zeta) & m_{3,2}(\xi, \zeta) \\
m_{0,3}(\xi, \zeta) & m_{1,3}(\xi, \zeta) & m_{2,3}(\xi, \zeta) & m_{3,3}(\xi, \zeta)
\end{pmatrix} \tag{10.1.6}$$

is unitary. In $n$ dimensions, one similarly defines

$$2^{n/2}\, m_r(\xi_1, \cdots, \xi_n) = \sum_{s \in \{0,1\}^n} e^{-i(s_1 \xi_1 + \cdots + s_n \xi_n)}\, m_{r,s}(2\xi_1, \cdots, 2\xi_n),$$

and the unitarity of $U$ is equivalent to the unitarity of the polyphase matrix $\tilde U$ defined by (10.1.7)

The construction therefore boils down to the following question: given $m_0$ (from (10.1.1), (10.1.2)), can $m_1, \cdots, m_{2^n-1}$ be found such that (10.1.6) is unitary? In the two-dimensional case, and if $m_0(\xi, \zeta)$ happens to be a real trigonometric polynomial, then one can even dispense with the polyphase matrix: it is easy to check that the choice $m_1(\xi, \zeta) = e^{-i\xi} m_0(\xi + \pi, \zeta)$, $m_2(\xi, \zeta) = e^{-i(\xi+\zeta)} m_0(\xi, \zeta + \pi)$, $m_3(\xi, \zeta) = e^{-i\zeta} m_0(\xi + \pi, \zeta + \pi)$ makes (10.1.4) unitary. If $m_0$ is not real, then things are more complicated. At first sight one might even think the task is impossible in general in the $n$-dimensional situation, where (10.1.7) is a $2^n \times 2^n$-matrix: after all, we need to find unit vectors, depending continuously on the $\xi_i$ (namely the second to last columns of (10.1.7)), orthogonal to a unit vector (the first column of (10.1.7)), i.e., tangent to the unit sphere. But it is well known that "it is impossible to comb a sphere," i.e., there exist no nowhere-vanishing continuous vector fields tangent to the unit sphere, except in real dimensions 2, 4, or 8. The first column in (10.1.7) does not describe the full sphere, however; in fact, because it is a continuous function of $n$ variables (the $\xi_1, \cdots, \xi_n$) in a $2^n$-dimensional space, and $2^n > n$, it only describes a compact set of measure zero. This fact saves the day and makes it possible to construct $m_1, \cdots, m_{2^n-1}$, as shown by Gröchenig (1987); see also §III.6 in Meyer (1990). Gröchenig's proof is not constructive; a different, constructive proof is given in Vial (1992). Unfortunately, these constructions cannot force compact support for the $\Psi^j$: even if $m_0$ is a trigonometric polynomial (only finitely many $h_n \ne 0$), the $m_j$ are not necessarily.
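The "easy to check" claim for real $m_0$ can be probed numerically. The sketch below assumes that the rows of (10.1.4) are the vector $(m_0, m_1, m_2, m_3)$ evaluated at $(\xi, \zeta)$, $(\xi + \pi, \zeta)$, $(\xi, \zeta + \pi)$ and $(\xi + \pi, \zeta + \pi)$. The particular real trigonometric polynomial used for `m0` is hypothetical and is not claimed to satisfy (10.1.3). For any real $m_0$ the rows come out mutually orthogonal, with common squared norm equal to the sum of $m_0^2$ over the four shifts, so unitarity reduces to that sum being 1.

```python
import cmath
import math

PI = math.pi
SHIFTS = [(0.0, 0.0), (PI, 0.0), (0.0, PI), (PI, PI)]


def m0(xi, zeta):
    """Hypothetical real trigonometric polynomial, used only for illustration."""
    return (1 + math.cos(xi)) * (2 + math.cos(zeta)) / 6


def filters(xi, zeta):
    """(m0, m1, m2, m3) at (xi, zeta), with the choice quoted in the text."""
    return [
        m0(xi, zeta),
        cmath.exp(-1j * xi) * m0(xi + PI, zeta),
        cmath.exp(-1j * (xi + zeta)) * m0(xi, zeta + PI),
        cmath.exp(-1j * zeta) * m0(xi + PI, zeta + PI),
    ]


def matrix(xi, zeta):
    """Rows: the filter vector at the four shifts of (xi, zeta)."""
    return [filters(xi + s, zeta + t) for s, t in SHIFTS]


def gram(U):
    """U times its conjugate transpose."""
    return [[sum(a * b.conjugate() for a, b in zip(r1, r2)) for r2 in U]
            for r1 in U]


if __name__ == "__main__":
    xi, zeta = 0.3, 1.1
    G = gram(matrix(xi, zeta))
    s = sum(m0(xi + a, zeta + b) ** 2 for a, b in SHIFTS)
    print("sum of squares:", s)
    for row in G:
        print([round(abs(z), 12) for z in row])
```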

10.2. One-dimensional orthonormal wavelet bases with integer dilation factor larger than 2.

For illustration purposes, let us choose dilation factor 3. A multiresolution analysis for dilation 3 is defined in exactly the same way as for dilation 2, i.e., by (5.1.1)-(5.1.6), except that (5.1.4) is replaced by

We can use the same trick again as above: $V_0$ is generated by the integer translates of one function, i.e., by the $\phi(x - n)$, while $V_{-1}$ is generated by the $\phi(3x - n)$, or, equivalently, by the integer translates of three functions, $\phi(3x)$, $\phi(3x - 1)$, and $\phi(3x - 2)$. $V_{-1}$ is "three times as big" as $V_0$, and two spaces of the "same size" as $V_0$ are needed to complement $V_0$ and constitute $V_{-1}$: we will need two spaces $W_0^1$, $W_0^2$, or two wavelets, $\psi^1$ and $\psi^2$. We can again introduce $m_0, m_1, m_2$ by

$$\hat\phi(\xi) = m_0(\xi/3)\, \hat\phi(\xi/3), \qquad \hat\psi^\ell(\xi) = m_\ell(\xi/3)\, \hat\phi(\xi/3), \quad \ell = 1, 2.$$

Orthonormality of the whole family $\{\phi_{0,n}, \psi^1_{0,n}, \psi^2_{0,n};\ n \in \mathbb{Z}\}$, where $\phi_{j,n}$ is now defined by (the $\psi^\ell_{j,n}$ are defined analogously), again forces several orthonormality conditions on the $m_\ell$, which can be summarized by the requirement that the matrix

$$\begin{pmatrix}
m_0(\xi) & m_1(\xi) & m_2(\xi) \\
m_0(\xi + \frac{2\pi}{3}) & m_1(\xi + \frac{2\pi}{3}) & m_2(\xi + \frac{2\pi}{3}) \\
m_0(\xi + \frac{4\pi}{3}) & m_1(\xi + \frac{4\pi}{3}) & m_2(\xi + \frac{4\pi}{3})
\end{pmatrix} \tag{10.2.1}$$

is unitary. Again, one can restate this in terms of a polyphase matrix, removing the correlations between the rows. Explicit choices of $m_0, m_1, m_2$ for which (10.2.1) is indeed unitary have been constructed in the ASSP literature (see, e.g., Vaidyanathan (1987)). The question is then again, as in Chapter 6, whether these filters correspond to bona fide $L^2$-functions $\phi$, $\psi^1$, and $\psi^2$, whether the $\psi^\ell_{j,k}$ constitute an orthonormal basis, and what the regularity is of all these functions. We know, from Chapter 3, that $\psi^1$ and $\psi^2$ must necessarily have integral zero, corresponding to $m_1(0) = 0 = m_2(0)$. Since the first row of (10.2.1) must have norm 1 for all $\xi$, it follows that $m_0(0) = 1$ (which is necessary anyway for the convergence of the infinite product $\prod_{j=1}^{\infty} m_0(3^{-j}\xi)$ which defines $\hat\phi(\xi)$). The first column of (10.2.1) must also have norm 1 for all $\xi$, so that $m_0(0) = 1$ implies $m_0(\frac{2\pi}{3}) = 0 = m_0(\frac{4\pi}{3})$, i.e., $m_0(\xi)$ is divisible by $\frac{1 + e^{-i\xi} + e^{-2i\xi}}{3}$. If, moreover, any smoothness for $\psi^1, \psi^2$ is desired, then we need additional vanishing moments of $\psi^1, \psi^2$, which by exactly the same argument as before, lead to divisibility of $m_0(\xi)$ by $((1 + e^{-i\xi} + e^{-2i\xi})/3)^L$ if $\psi^1, \psi^2 \in C^{L-1}$. One is thus led to looking for $m_0$ of the type $m_0(\xi) = ((1 + e^{-i\xi} + e^{-2i\xi})/3)^N \mathcal{L}(\xi)$ such that $|m_0(\xi)|^2 + |m_0(\xi + \frac{2\pi}{3})|^2 + |m_0(\xi + \frac{4\pi}{3})|^2 = 1$. If $m_0$ is a trigonometric polynomial, this means that $L = |\mathcal{L}|^2$ is again the solution to a Bezout problem. The minimal degree solution leads to functions $\phi$ with arbitrarily high regularity; however, the regularity index only grows logarithmically with $N$ (L. Villemoes, private communication).^4

Once $m_0$ is fixed, $m_1$ and $m_2$ have to be determined. The design scheme explained in Vaidyanathan et al. (1989) gives a way to do this. In this scheme, the matrix (10.2.1) (or rather, its z-notation equivalent) is written as a product of similar matrices the entries of which are much lower degree polynomials, with only a few parameters determining each factor matrix.^5 If one imposes that the first column of a product of such matrices is given by the $m_0$ we have fixed, then the values of these parameters are fixed likewise, and $m_1, m_2$ can be read off from the product matrix.^6

If the compact support constraint is lifted, then other constructions are possible. In Auscher (1989) one can find examples where $\phi$ and $\psi^\ell$ are $C^\infty$ functions with fast decay (and infinite support).

One final remark about dilation factor 3. We have seen that $m_0$ must necessarily be divisible by $(1 + e^{-i\xi} + e^{-2i\xi})/3$. This factor does not vanish for $\xi = \pi$ (unlike the factor $(1 + e^{-i\xi})/2$ for the dilation factor 2 case). However, if we want to interpret $m_0$ as a low-pass filter, then $m_0(\pi) = 0$ would be a good idea. To ensure this, we need $\mathcal{L}(\pi) = 0$, which means going beyond the lowest degree solution to the Bezout equation for $|\mathcal{L}|^2$.

Similar constructions can be made for larger integer dilation factors. For non-prime dilation factors $a$, one can generate acceptable $m_\ell$ from constructions for the factors of $a$, although not all possible solutions for dilation $a$ can be obtained in this way. For $a = 4$, e.g., one can start from a scheme with dilation 2 and filters $m_0$ and $m_1$, and one can define the filters $\tilde m_0, \tilde m_1, \tilde m_2, \tilde m_3$ (still orthonormal; the ~ distinguishes them from the dilation factor 2 filters) by

$$\tilde m_0(\xi) = m_0(\xi)\, m_0(\xi/2), \qquad \tilde m_1(\xi) = m_0(\xi)\, m_1(\xi/2),$$

$$\tilde m_2(\xi) = m_1(\xi)\, m_1(\xi/2), \qquad \tilde m_3(\xi) = m_1(\xi)\, m_0(\xi/2).$$

(It is left to the reader as an exercise to prove that this leads indeed to an orthonormal basis. One easily checks that the $4 \times 4$ analogue of (10.2.1) is unitary.) Note that the function $\phi$ is the same for the factor 4 and the factor 2 constructions! We will come back to this in §10.5.

10.3. Multidimensional wavelet bases with matrix dilations.

This is a generalization of both §10.1 and §10.2: the multiresolution spaces are subspaces of $L^2(\mathbb{R}^n)$, and the basic dilation is a matrix $D$ with integer entries (so that $D\mathbb{Z}^n \subset \mathbb{Z}^n$) such that all its eigenvalues have absolute value strictly larger than 1 (so that we are indeed dilating in all directions). The number of wavelets is again determined by the number of cosets of $D\mathbb{Z}^n$; one introduces again $m_0, m_1, \cdots$, and the orthonormality conditions can again be formulated as a unitarity requirement for a matrix constructed from the $m_0, m_1, \cdots$. The analysis for these matrix dilation cases is quite a bit harder than for the one-dimensional case with dilation 2, and, depending on the matrix chosen, there are a few surprises. One surprise is that generalizing the Haar basis (i.e., choosing $m_0$ so that all its nonvanishing coefficients are equal) leads in many cases to a function $\phi$ which is the indicator function of a selfsimilar set with fractal boundary, tiling the plane. For two dimensions, with D = ( 1 -1 ), e.g., one finds that $\phi$ can be the indicator function of the twin dragon set, as shown in Gröchenig and Madych (1992) and Lawton and Resnikoff (1991). Note that such fractal tiles may occur even for $D = 2\,\text{Id}$ if $m_0$ is chosen "non-canonically" (e.g., $m_0(\xi, \zeta) = \frac14 (1 + e^{-i\zeta} + e^{-i(\xi+\zeta)} + e^{-i(\xi+2\zeta)})$ in two dimensions; see Gröchenig and Madych (1992)). For more complicated $m_0$ (not all coefficients are equal), the problem is to control regularity. Zero moments for the $\psi^j$ do not lead to factorization of $m_0$ in these multidimensional cases (because it is not sufficient to know zeros of a multi-variable polynomial to factorize it), and one has to resort to other tricks to control the decay of $\hat\phi$.

A particularly interesting case is given by the "quincunx lattice," i.e., the two-dimensional case where $D\mathbb{Z}^2 = \{(m, n);\ m + n \in 2\mathbb{Z}\}$. In this case there is only one other coset, and therefore only one wavelet to construct, so that the choice for $m_1$ is as straightforward as it was for dilation 2 in one dimension. The conditions on $m_0, m_1$ reduce to the requirement that the $2 \times 2$ matrix

$$\begin{pmatrix} m_0(\xi, \zeta) & m_1(\xi, \zeta) \\ m_0(\xi + \pi, \zeta + \pi) & m_1(\xi + \pi, \zeta + \pi) \end{pmatrix}$$

be unitary. It is convenient to choose

Note that any orthonormal basis for dilation factor 2 in one dimension automatically gives rise to a pair of candidates for $m_0, m_1$ for the quincunx scheme: it suffices to take $m_0(\xi, \zeta) = m_0^{\#}(\xi)$ (where $m_0^{\#}$ is the one-dimensional filter).^7 Different choices for $D$ can be made, however. Two possibilities studied in detail in Cohen and Daubechies (1993b) and Kovačević and Vetterli (1992) are two matrices denoted $D_1$ and $D_2$. The same choice for $m_0$ leads to very different wavelet bases for these two matrices; in particular, if one derives, via the mechanism explained above, the filter $m_0$ from the "standard" one-dimensional wavelet filters ${}_N m_0$ in §6.4, then the resulting $\phi$ are increasingly regular if $D_2$ is chosen (with regularity index proportional to $N$), whereas choosing $D_1$ leads to $\phi$ which are at most continuous, regardless of $N$. Other choices for $D$ may lead to yet other families, with different regularity properties again.

One can of course also choose to construct two biorthogonal bases rather than one orthonormal basis, as in §8.3; for the choices $D_1$, $D_2$ several possibilities are explored in Cohen and Daubechies (1993b) and Kovačević and Vetterli (1992). In this biorthogonal case, one can again derive filters from one-dimensional constructions. If one starts from a symmetric biorthogonal filter pair in one dimension, where all the filters are polynomials in $\cos \xi$, then it suffices to replace $\cos \xi$ by $\frac12(\cos \xi + \cos \zeta)$ in every filter to obtain symmetric biorthogonal filter pairs for the quincunx case.^8 Because of the symmetry of these examples, the matrices $D_1$ and $D_2$ lead to the same functions $\phi$, $\tilde\phi$ in this case. One finds again that symmetric biorthogonal bases with arbitrarily high regularity are possible (see Cohen and Daubechies (1993b)).

The quincunx case is of interest in image processing because it treats the different directions more homogeneously than the separable (tensor-product) two-dimensional scheme: instead of having two favorite directions (horizontals and verticals), the quincunx schemes treat horizontals, verticals, and diagonals on the same footing, without introducing redundancy to achieve this. The first quincunx subband filtering schemes, with aliasing cancellation but without exact reconstruction (which had not been discovered even for one dimension at the time) are given in Vetterli (1984); Feauveau (1990) contains orthonormal and biorthogonal schemes, and links them to wavelet bases; Vetterli, Kovačević, and LeGall (1990) discusses the use of perfect reconstruction quincunx filtering schemes for HDTV applications. In Antonini, Barlaud, and Mathieu (1991) biorthogonal quincunx decompositions combined with vector quantization give spectacular results for image compaction.
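The coset-counting rule used throughout §10.1-§10.3 (number of wavelets = number of cosets of $D\mathbb{Z}^n$ in $\mathbb{Z}^n$, minus one) can be checked by brute force in two dimensions. The sketch below relies on the fact that two integer vectors lie in the same coset exactly when $\text{adj}(D)(v - w)$ is divisible by $\det D$. The function names and the specific quincunx matrices are illustrative assumptions: both listed matrices satisfy $D\mathbb{Z}^2 = \{(m, n);\ m + n \text{ even}\}$, but they are not claimed to be the book's $D_1$, $D_2$.

```python
from itertools import product


def num_cosets_2d(D):
    """Number of cosets of D Z^2 in Z^2 for an integer 2 x 2 matrix D, det != 0."""
    (a, b), (c, d) = D
    m = abs(a * d - b * c)
    seen = set()
    # Every coset has a representative in [0, m)^2, since m Z^2 lies in D Z^2.
    for v0, v1 in product(range(m), repeat=2):
        # adj(D) v mod |det D| is constant exactly on cosets of D Z^2.
        seen.add(((d * v0 - b * v1) % m, (-c * v0 + a * v1) % m))
    return len(seen)


def num_wavelets_2d(D):
    """One wavelet per coset other than D Z^2 itself."""
    return num_cosets_2d(D) - 1


if __name__ == "__main__":
    print(num_wavelets_2d([[2, 0], [0, 2]]))    # separable case D = 2 Id
    print(num_wavelets_2d([[1, -1], [1, 1]]))   # a quincunx matrix
    print(num_wavelets_2d([[1, 1], [1, -1]]))   # another quincunx matrix
```

The count always equals $|\det D|$, which gives three wavelets for $D = 2\,\text{Id}$ and a single wavelet for the quincunx lattice, as stated in the text.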



10.4. One-dimensional orthonormal wavelet bases with non-integer dilation factors.

In one dimension, we have so far only discussed integer dilation factors $\ge 2$.^9 Non-integer dilation factors are also possible, however. Within the framework of a multiresolution analysis, the dilation factor must be rational^{10} (for a proof, see Auscher (1989)). It had already been pointed out by G. David in 1985 that the construction of the Meyer wavelet could be generalized to dilation $a = \frac{k+1}{k}$, for $k \in \mathbb{N}$, $k \ge 1$; Auscher (1989) contains constructions for arbitrary rational $a$ (see also Auscher's paper in Ruskai et al. (1992)). Let us illustrate for $a = \frac32$ how the factor 2 scheme has to be adapted. We start again from a multiresolution analysis, defined as in (5.1.1)-(5.1.6), with $\frac32$ instead of 2 for the dilation factor. We have again

corresponding to $n = 3\ell$, $n = 3\ell + 1$, and $n = 3\ell + 2$. The space $V_0$ is generated by the $2\mathbb{Z}$ translates of two functions, $\phi(x - 2\ell)$ and $\phi(x - 2\ell - 1)$, $\ell \in \mathbb{Z}$. It follows that the complement space $W_0$ is generated by the $2\mathbb{Z}$ translates of a single function, $W_0 = \overline{\text{Span}\{\psi(\cdot - 2n);\ n \in \mathbb{Z}\}}$. ("$W_0$ is half the size of $V_0$.") We expect therefore an orthonormal basis of the type $\psi_{j,k}(x) = (\frac32)^{-j/2}\, \psi((\frac32)^{-j} x - 2k)$, $j, k \in \mathbb{Z}$. This function $\psi$ can also be written as a linear combination of the $\phi(\frac32 x - n)$, and orthonormality of the $\psi(x - 2n)$, plus orthogonality with respect to the $\phi(x - 2n)$, $\phi(x - 2n - 1)$ implies

$$\sum_n g_n\, g_{n-3\ell} = \delta_{\ell 0},$$

$$\sum_n g_n\, h^0_{n-3\ell} = 0, \tag{10.4.6}$$

$$\sum_n g_n\, h^1_{n-3\ell} = 0. \tag{10.4.7}$$

With the definition $m_1(\xi) = \sum_n g_n e^{-in\xi}$, the conditions (10.4.2), (10.4.4)-(10.4.7) are equivalent to the unitarity of the matrix

$$\begin{pmatrix}
m_0^0(\xi) & m_0^1(\xi) & m_1(\xi) \\
m_0^0(\xi + \frac{2\pi}{3}) & m_0^1(\xi + \frac{2\pi}{3}) & m_1(\xi + \frac{2\pi}{3}) \\
m_0^0(\xi + \frac{4\pi}{3}) & m_0^1(\xi + \frac{4\pi}{3}) & m_1(\xi + \frac{4\pi}{3})
\end{pmatrix} \tag{10.4.8}$$

This matrix looks identical to (10.2.1), but this similarity is deceptive: in (10.4.8) the first two columns are both given by low-pass filters, because they are both related to the scaling function ($m_0^0(0) = 1 = m_0^1(0)$), whereas the second column in (10.2.1) corresponds to a high-pass filter. Such $m_0^0$, $m_0^1$, $m_1$ can indeed be constructed (see Auscher (1989) for details and graphs). Note that $m_0^1$ and $m_0^0$ are closely related. The Fourier transforms of (10.4.1), (10.4.3) are

$$\hat\phi(\xi) = m_0^0\big(\tfrac23 \xi\big)\, \hat\phi\big(\tfrac23 \xi\big), \qquad \hat\phi(\xi)\, e^{-i\xi} = m_0^1\big(\tfrac23 \xi\big)\, \hat\phi\big(\tfrac23 \xi\big),$$

implying

(10.4.9)

which should hold for almost all $\xi$. If $\hat\phi$ is continuous, then the following argument shows that $\hat\phi$ vanishes on some intervals. Since $\hat\phi(0) = (2\pi)^{-1/2}$, there exists $\alpha$ so that for $|\xi| \le \alpha$, $|\hat\phi(\xi)| \ge (2\pi)^{-1/2}/2$. Consequently, for $|\xi| \le \alpha$, $m_0^0(\xi) = e^{3i\xi/2}\, m_0^1(\xi)$, or

$$m_0^0(\xi + 2\pi) = -e^{3i\xi/2}\, m_0^1(\xi + 2\pi);$$

since $m_0^0$, $m_0^1$ are also $2\pi$-periodic, this implies $m_0^0(\xi + 2\pi) = 0 = m_0^1(\xi + 2\pi)$ for $|\xi| \le \alpha$. It follows that $|\hat\phi(\frac32 \xi + 3\pi)| = 0$ for $|\xi| \le \alpha$. In particular, this means that $\phi$ cannot be compactly supported (compact support for $\phi$ means that $\hat\phi$ is entire, and non-trivial entire functions can only have isolated zeros).

Nevertheless, subband filtering schemes with rational noninteger dilation factors, in particular with dilation $\frac32$, have been proposed and constructed by Kovačević and Vetterli (1993), with FIR filters. The basic idea is simple: starting from $c^0$, one can first decompose into three subbands, by means of a scheme as in §10.2, and then regroup the two lowest frequency bands by means of a synthesis filter corresponding to dilation 2; the result of this operation is $c^1$, while the third, highest frequency band after the first decomposition is $d^1$. The corresponding block diagram is Figure 10.4. If all the filters are FIR, then the whole scheme is FIR as well. But didn't we just prove that there does not exist a multiresolution analysis for dilation factor $\frac32$ with FIR filters? The solution to this paradox is that the block diagram above does not correspond to the construction described earlier. A detailed analysis of Figure 10.4 shows that this scheme uses two different functions $\phi^1$ and $\phi^2$, with $V_0$ generated by the $\phi^1(x - 2n)$, $\phi^2(x - 2n)$, $n \in \mathbb{Z}$. The argument used to prove that $\phi$ cannot have compact support then no longer applies, and $\phi^1$, $\phi^2$ can indeed have compact support. The analog of (10.4.9) is now an equation relating the two-dimensional vectors $(\hat\phi^1(\xi), \hat\phi^2(\xi))$ and $(\hat\phi^1(\frac23 \xi), \hat\phi^2(\frac23 \xi))$, however, and it is hard to see how to formulate conditions on the filters that result in regularity of $\phi^1$, $\phi^2$.

FIG. 10.4. Block diagram corresponding to a subband filtering with dilation factor $\frac32$, as constructed in Kovačević and Vetterli (1993).

One may well wonder what the rationale is for these fractional dilation factors. The answer is that they may provide a sharper frequency localization. If the dilation factor is 2, then $\hat\psi$ is essentially localized between $\pi$ and $2\pi$, as illustrated by the Fourier transform of a "typical" $\psi$ in Figure 10.5. For some applications, it may be useful to have wavelet bases that have a bandwidth narrower than one octave, and fractional dilation wavelet bases are one possible answer. A different answer is given in Cohen and Daubechies (1993a), summarized in the next section.

FIG. 10.5. Modulus of $|{}_{10}\hat\psi(\xi)|$, with ${}_N\psi$ as defined in §6.4.

10.5. Better frequency resolution: The splitting trick.

Suppose that $h_n$, $g_n$ are the filter coefficients associated to a wavelet basis with dilation factor 2, i.e., 1 mo (
