Editorial, Sales, and Customer Service Office A K Peters, Ltd. 63 South Avenue Natick, MA 01760 Copyright © 1998 by A K Peters, Ltd. All rights reserved. No part of the material protected by this copyright notice may be reproduced or utilized in any form, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without written permission from the copyright owner.
Library of Congress Cataloging-in-Publication Data Blatter, Christian, 1935- [Wavelets. English] Wavelets: a primer / Christian Blatter. p. cm. ISBN 1-56881-095-4 1. Wavelets (Mathematics) I. Title. QA403.3.B5713 1998 515'.2433-dc21
98-29959 CIP Rev.
Originally published in the German language by Friedr. Vieweg & Sohn Verlagsgesellschaft mbH, D-65189 Wiesbaden, with the title "Wavelets. Eine Einführung. 1st Edition" © by Friedr. Vieweg & Sohn Verlagsgesellschaft mbH, Braunschweig/Wiesbaden, 1998
Printed in the United States of America 02 01 00 99 98 10 9 8 7 6 5 4 3 2 1
Contents

Preface
Read Me
1 Formulating the problem
  1.1 A central theme of analysis
  1.2 Fourier series
  1.3 Fourier transform
  1.4 Windowed Fourier transform
  1.5 Wavelet transform
  1.6 The Haar wavelet
2 Fourier analysis
  2.1 Fourier series
  2.2 Fourier transform on R
  2.3 The Heisenberg uncertainty principle
  2.4 The Shannon sampling theorem
3 The continuous wavelet transform
  3.1 Definitions and examples
  3.2 A Plancherel formula
  3.3 Inversion formulas
  3.4 The kernel function
  3.5 Decay of the wavelet transform
4 Frames
  4.1 Geometrical considerations
  4.2 The general notion of a frame
  4.3 The discrete wavelet transform
  4.4 Proof of theorem (4.10)
5 Multiresolution analysis
  5.1 Axiomatic description
  5.2 The scaling function
  5.3 Constructions in the Fourier domain
  5.4 Algorithms
6 Orthonormal wavelets with compact support
  6.1 The basic idea
  6.2 Algebraic constructions
  6.3 Binary interpolation
  6.4 Spline wavelets
References
Index
Preface
This book is neither the grand retrospective view of a protagonist nor an encyclopedic research monograph, but the approach of a working mathematician to a subject that has stimulated approximation theory and inspired users in many diverse domains of applied mathematics, unlike any other since the invention of the Fast Fourier Transform. As a matter of fact, I had only set out to draw up a one-semester course for our students at ETH Zurich that would introduce them to the world of wavelets ab ovo; indeed, such a course hadn't been given here before. But in the end, thanks to encouraging comments from colleagues and people in the audience, the present booklet came into existence.

I had imagined that the target group for this course would be the following: students of mathematics in their senior year or first graduate year, having the usual basic knowledge of analysis, carrying around a knapsack full of convergence theorems, but without any practical experience, say, in Fourier analysis. In the back of my mind I also entertained the hope that some people from the field of engineering would attend the course. In fact, they did, and afterward I found out that exactly these students had profited the most from my efforts.

The contents of the book can be summarized as follows: The introductory Chapter 1 presents a tour d'horizon over various ways of signal representation; it is here that the Haar wavelet makes its first appearance. Chapter 2 serves primarily as a tutorial of Fourier analysis (without proofs); it is supplemented by the discussion of two theorems that define ultimate limits of signal theory: the Heisenberg uncertainty principle and Shannon's sampling theorem. In Chapter 3 we are finally ready for a treatment of the continuous wavelet transform, and Chapter 4, entitled "Frames", describes a general framework (pun not intended) allowing us to handle the continuous and the discrete wavelet transforms in a uniform way.
All this being accomplished, we finally arrive at the main course: multiresolution analysis with its fast algorithms in Chapter 5 and the construction of orthonormal wavelets with compact support in Chapter 6. The book ends with a brief treatment of spline wavelets in Section 6.4.
Given the small size of this treatise, some things had to be left out: biorthogonal systems, wavelets in two dimensions, and a detailed description of applications, to name a few. Furthermore, I decided to leave distributions out of the picture. This means that there aren't any Sobolev spaces, nor a discussion of pointwise convergence, etc., of wavelet approximations, and the Paley-Wiener theorem is not at our disposal either. Fortunately, there is an elementary argument coming to our rescue in proving that the Daubechies wavelets indeed have compact support.

When putting the material together, I made generous use of the work of other authors. In the first place, of course, I borrowed from Ingrid Daubechies' incomparable "Ten Lectures on Wavelets" [D], to some lesser extent from [L], which at the time (winter 1996-97) was the only wavelet book available in German, and from Kaiser's "Friendly Guide to Wavelets" [K]. Concerning further sources of inspiration, I refer the reader to the list of references at the end of the book. I have deliberately kept this list short and have refrained from reprinting the more extensive, but not updated, lists of references given in [D] or [L]. A substantial and at the same time very recent (1998) list of references can be found in [Bu], which, by the way, takes an approach to wavelets that is fairly similar to ours.

Let me comment briefly on the figures. Most graphs of mathematically defined functions were first computed with the help of Mathematica®, then output as Plot, and finally finished in the graphics environment "Canvas". A few of the figures, e.g., Figures 3.7 and 6.1, were generated by means of "Think Pascal" as bitmaps, then printed out in letter format and finally reduced to the required width photographically.

This book was published first in German by Vieweg-Verlag under the title "Wavelets - Eine Einführung".
I am grateful to Klaus Peters that he consented to give the present English edition a chance, and to his collaborators for streamlining the schoolboy's English of my raw translation. Christian Blatter Zurich, 14 August, 1998
Read Me
This book is divided into six chapters, and each chapter is subdivided into a certain number of sections. Formulas that are used again at some later point are numbered sectionwise in parentheses: (1). When referring to formula (5) of the current section, we do not give the section number; 3.4.(2), however, denotes formula (2) of Section 3.4. New terms are printed in slanted type at their place of definition or first appearance; as a rule there is no further warning of the "Watch out: Here comes a definition!" type. The exact spot where a term is defined is referenced in the index at the back of the book. Propositions and theorems are numbered by chapters, the boldface marker (4.3) denoting the third theorem in Chapter 4. Theorems are usually announced; in any case they are recognizable from the marker at the beginning and from their text being printed in slanted type. The two corners ⌜ and ⌟ denote the beginning and the end of a proof.
① Circled numbers mark the beginning of examples, some of them of a more explanatory nature, some of them describing famous animals created by means of the general theory. The numbering of examples begins anew in each section; the empty circle ○ marks the end of an example.

A family of objects $c_\alpha$ over the index set $I$ (called an array for short) is designated by
\[ (c_\alpha \mid \alpha \in I) \;=:\; c_{\cdot}\,. \]
$1_A$ denotes the characteristic function of the set $A$ and $1_X$ the identity mapping of the vector space $X$. If $e$ resp. $a_1, \ldots, a_r$ are given vectors of a vector space $X$, then $\langle e \rangle$ resp. $\operatorname{span}(a_1, \ldots, a_r)$ denote the subspace spanned by $e$ resp. the $a_k$.

$\mathbb{R}^* := \mathbb{R} \setminus \{0\}$ is the multiplicative group of real numbers. $\mathbb{R}^2_- := \mathbb{R}^* \times \mathbb{R}$ is the $(a, b)$-plane "cut up into two halves". Note that in corresponding figures the $a$-axis is drawn vertically and the $b$-axis horizontally, as explained in Section 1.5.
The symbol $\int$ without upper and lower limits always denotes the integral over all of $\mathbb{R}$ with respect to the Lebesgue measure:
\[ \int f(t)\,dt \;:=\; \int_{-\infty}^{\infty} f(t)\,dt . \]
In an analogous manner, sums $\sum_k$ without upper and lower limits are meant to be sums over all of $\mathbb{Z}$:
\[ \sum_k a_k \;:=\; \sum_{k=-\infty}^{\infty} a_k . \]
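The Fourier transform is defined as
\[ \hat f(\xi) \;:=\; \frac{1}{\sqrt{2\pi}} \int f(t)\, e^{-i\xi t}\,dt , \]
and the Fourier inversion formula, sometimes called the Fourier$^{\vee}$ transform, reads
\[ f(t) \;=\; \frac{1}{\sqrt{2\pi}} \int \hat f(\xi)\, e^{i\xi t}\,d\xi . \]
By $j_a^N f$ we denote the $N$-jet (the Taylor polynomial of order $N$) of $f$ at the point $a \in \mathbb{R}$, given by
\[ j_a^N f(t) \;:=\; \sum_{k=0}^{N} \frac{f^{(k)}(a)}{k!}\,(t-a)^k . \]
The symbol $\mathbf{e}_\alpha$ denotes the function
\[ \mathbf{e}_\alpha\colon \quad t \mapsto e^{i\alpha t} . \]
If $f$ is a complex-valued function defined on $X := \mathbb{R}$ or $X := \mathbb{Z}$, then $a(f)$ and $b(f)$ denote the left and right ends of the support of $f$, respectively:
\[ a(f) := \inf\{x \in X \mid f(x) \ne 0\}, \qquad b(f) := \sup\{x \in X \mid f(x) \ne 0\} . \]
A time signal is simply a function $f\colon \mathbb{R} \to \mathbb{C}$.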
1 Formulating the problem
1.1 A central theme of analysis

The approximation, resp. the representation, of arbitrary known or unknown functions $f$ by means of special functions can be viewed as a central theme of analysis. "Special functions" are functions taken from a catalogue, e.g., monomials $t \mapsto t^k$, $k \in \mathbb{N}$, or functions of the form $t \mapsto e^{ct}$, $c \in \mathbb{C}$ a parameter. As a rule special functions are well understood, very often they are easy to compute and have interesting analytical properties; in particular, they tend to incorporate and reexpress the evident or hidden symmetries of the situation under consideration. In order to fix ideas we consider a (given or unknown) function
\[ f\colon \mathbb{R} \to \mathbb{C} , \]
assuming that $f$ is sufficiently many times differentiable in a neighbourhood $U$ of the point $a \in \mathbb{R}$. Such a function can be approximated within $U$ by its Taylor polynomials
\[ j_a^n f(t) \;:=\; \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}\,(t-a)^k \tag{1} \]
(jets for short), up to an error that can be quantitatively controlled, and under suitable assumptions the function $f$ is actually represented by its Taylor series, meaning that one has
\[ f(t) \;=\; \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}\,(t-a)^k \]
for all $t$ in a certain neighbourhood $U' \subset U$.

The general setup in this realm is the following: Depending on the particular situation at hand one chooses a family $(e_\alpha \mid \alpha \in I)$ of basis functions $t \mapsto e_\alpha(t)$; the index set $I$ may be a discrete or a "continuous" set. An approximation of a more or less arbitrary function $f$ by means of the $e_\alpha$ then has the form
\[ f(t) \;\approx\; \sum_{k=1}^{N} c_k\, e_{\alpha_k}(t) \]
with coefficients $c_k$ to be determined, and a representation of $f$ has the form
\[ f(t) \;\equiv\; \sum_{\alpha \in I} c_\alpha\, e_\alpha(t) ; \tag{2} \]
or it appears as an integral over the index set $I$:
\[ f(t) \;\equiv\; \int_I d\alpha\; c(\alpha)\, e_\alpha(t) . \tag{3} \]
In the ideal case there are exactly as many basis functions at our disposal as are needed to represent any function $f$ of the considered kind in exactly one way in the form (2) resp. (3). The operation that assigns a given function $f$ the corresponding coefficient vector or array $(c_\alpha \mid \alpha \in I)$ is called the analysis of $f$ with respect to the family $(e_\alpha \mid \alpha \in I)$. The coefficients $c_\alpha$ are particularly easy to determine if the basis functions $e_\alpha$ are orthonormal (see below). In the case of the Taylor expansion (1) the coefficients have to be determined by computing recursively ever higher derivatives of $f$; and in the case of the so-called Tchebycheff approximation there are no formulas for the coefficients $c_k$, even though they are uniquely determined. The inverse operation that takes a given coefficient vector $(c_\alpha \mid \alpha \in I)$ as input and returns the function itself as output is called the synthesis of $f$ by means of the $e_\alpha$.
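The pair "analysis" and "synthesis" can be tried out in the simplest possible setting. The following is a minimal sketch, not taken from the text: the "functions" are vectors in $\mathbb{R}^4$, and the orthonormal family `BASIS` (a discrete Haar-type system) as well as the names `analysis` and `synthesis` are assumptions chosen for illustration only. For an orthonormal family the coefficients are simply the scalar products of $f$ with the basis vectors.

```python
import math

# Hypothetical orthonormal basis of R^4 (a discrete Haar-type system),
# chosen only for illustration; it is not taken from the text.
s = 1 / math.sqrt(2)
BASIS = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, 0.5, -0.5, -0.5],
    [s, -s, 0.0, 0.0],
    [0.0, 0.0, s, -s],
]

def inner(u, v):
    """Euclidean scalar product of two real vectors."""
    return sum(a * b for a, b in zip(u, v))

def analysis(f):
    """Coefficient array (c_alpha): for an orthonormal family, c_alpha = <f, e_alpha>."""
    return [inner(f, e) for e in BASIS]

def synthesis(c):
    """Rebuild f = sum over alpha of c_alpha * e_alpha from the coefficient array."""
    n = len(BASIS[0])
    return [sum(c_a * e[i] for c_a, e in zip(c, BASIS)) for i in range(n)]

f = [4.0, 2.0, 5.0, 7.0]
c = analysis(f)          # analysis: function -> coefficients
f_back = synthesis(c)    # synthesis: coefficients -> function
```

Since the family is orthonormal and spans the whole space, `f_back` agrees with `f` up to rounding; with a non-orthogonal family the analysis step would require solving a linear system instead.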
① Suppose that the $x$-interval $[0, L]$ is modeling a heat conducting rod $S$ (see Figure 1.1). The spatially and temporally variable temperature within this rod is described by a function $(x, t) \mapsto u(x, t)$ that satisfies the one-dimensional heat equation
\[ \frac{\partial u}{\partial t} \;=\; a^2\, \frac{\partial^2 u}{\partial x^2} ; \tag{4} \]
here $a > 0$ is a material constant. The initial temperature $x \mapsto f(x)$ along the rod is given, as is the boundary condition that the two ends of the rod are kept at temperature 0 at all times. Along the rod, i.e., for $0 < x < L$, there is no heat exchange with the surroundings. The task is to determine the resulting temperature fluctuation $u(\cdot, \cdot)$ within the rod. In connection with problems of this kind the following procedure (called separation of variables) has turned out to be useful: One begins by determining functions $U(\cdot, \cdot)$ of the special form
\[ (x, t) \mapsto U(x, t) = X(x)\, T(t) , \]
satisfying (4) and vanishing at the two ends of the rod. A collection of functions fulfilling these requirements is given by
\[ u_k(x, t) \;:=\; \exp\Bigl( -\frac{k^2 \pi^2 a^2}{L^2}\, t \Bigr) \sin \frac{k \pi x}{L} . \]
[Figure 1.1: the rod $S$ along the $x$-axis, with the temperature $u$ plotted above it]
Since the conditions imposed on the $u_k$ are linear and homogeneous, it follows that arbitrary linear combinations
\[ u(x, t) \;:=\; \sum_{k=1}^{\infty} c_k\, u_k(x, t) \]
of the $u_k$ are in their turn solutions of the heat equation vanishing at the ends of the rod. Therefore we shall have the solution of the original problem in our hands, if we are able to specify the coefficients $c_k$ in such a way that the initial condition $u(x, 0) \equiv f(x)$ is fulfilled as well. This means that we would have to guarantee the identity
\[ \sum_{k=1}^{\infty} c_k \sin \frac{k \pi x}{L} \;\equiv\; f(x) \qquad (0 < x < L) . \tag{5} \]
It is at this point that the question arises as to whether the function system
\[ e_k(x) \;:=\; \sin \frac{k \pi x}{L} \qquad (k \in \mathbb{N}_{\ge 1}) \]
is "complete", that is to say, is rich enough to allow the representation of an arbitrarily given function $f\colon\ ]0, L[\ \to \mathbb{R}$ in the form (5). The answer to this question is yes, as is proven in the theory of Fourier series (see below). ○

As we move along, another issue enters the picture: If a function $f$ is analyzed or synthesized not only in thought and for theoretical purposes, but concretely, as in the analysis of ECGs or of long term climate changes, then for the numerical work a more or less complete discretization becomes almost indispensable. The discretization refers, on the one hand, to the collection of basis functions (in case the latter has not been discrete from the outset) and, on the other hand, to the space parametrized by the independent variable $t$
(resp. $x$, $\mathbf{x}$, ...): The values of all occurring (given or unknown) functions are evaluated, measured or computed only at the discrete places
\[ t := kT \qquad (k \in \mathbb{Z},\ T > 0 \text{ fixed}) . \]
The fact that the function values $f(t)$ themselves are represented in the computer in a "quantized" form only, instead of with "infinite precision", does not concern us here.
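The heat conduction example and the remarks on discretization can be combined in a small numerical experiment. The sketch below is an illustration under stated assumptions, not the author's procedure: the values of `L`, `a`, the truncation order `K`, the sample count `M` and the "tent" initial temperature `f` are hypothetical, and the coefficient formula $c_k = \frac{2}{L}\int_0^L f(x)\sin\frac{k\pi x}{L}\,dx$ is the standard sine-series formula, which the text itself defers to the theory of Fourier series.

```python
import math

L = 1.0    # rod length (assumed value)
a = 0.5    # material constant a > 0 (assumed value)
K = 50     # number of sine modes kept (truncation, an assumption)
M = 2000   # number of sample points used for the quadrature

def f(x):
    """Hypothetical initial temperature: a 'tent' vanishing at both ends."""
    return x if x <= L / 2 else L - x

def sine_coefficient(k):
    """c_k = (2/L) * integral_0^L f(x) sin(k pi x / L) dx, by the midpoint rule."""
    h = L / M
    total = sum(f((j + 0.5) * h) * math.sin(k * math.pi * (j + 0.5) * h / L)
                for j in range(M))
    return 2.0 / L * total * h

# Analysis: coefficients c_1, ..., c_K of the initial temperature.
COEFFS = [sine_coefficient(k) for k in range(1, K + 1)]

def u(x, t):
    """Truncated synthesis: sum of c_k exp(-k^2 pi^2 a^2 t / L^2) sin(k pi x / L)."""
    return sum(c * math.exp(-(k * math.pi * a / L) ** 2 * t)
               * math.sin(k * math.pi * x / L)
               for k, c in enumerate(COEFFS, start=1))
```

Evaluating `u(x, 0.0)` reproduces the tent up to the truncation error, while for larger `t` the high modes die out quickly and the profile approaches a single damped sine arc.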
Wavelets are novel systems of basis functions used for the representation, filtration, compression, storage, and so on of any "signals"
\[ f\colon \mathbb{R}^n \to \mathbb{C} . \]
In the case $n = 1$, the variable $t$ represents time, and one works with time signals $f\colon \mathbb{R} \to \mathbb{C}$. The case $n = 2$ refers to image processing; a concrete example is the representation and storage of millions upon millions of fingerprints in the FBI's computer, see [1]. We shall approach these wavelets by recalling briefly some facts about Fourier series and the Fourier transform. A more complete tutorial of Fourier analysis is given in Sections 2.1 and 2.2.
1.2 Fourier series

Fourier series concern $2\pi$-periodic functions
\[ f\colon \mathbb{R} \to \mathbb{C}, \qquad f(t + 2\pi) \equiv f(t) , \]
equivalently written as $f\colon \mathbb{R}/2\pi \to \mathbb{C}$. The "natural" domain of definition of such a function is the unit circle $S^1$ in the complex $z$-plane, see Figure 1.2. On $S^1$ the infinitely many modulo $2\pi$ equivalent points $t + 2k\pi$, $k \in \mathbb{Z}$, appear as a single point $z = e^{it}$.

[Figure 1.2: the points $t - 2\pi$, $t$, $t + 2\pi$ on the $t$-axis and the corresponding single point on the unit circle]
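Expressing the monomial power functions
\[ z \mapsto z^k \qquad (k \in \mathbb{Z}) \]
in terms of the variable $t$, one arrives at the trigonometrical basis functions or pure harmonics
\[ \mathbf{e}_k\colon \quad t \mapsto e^{ikt} \qquad (k \in \mathbb{Z}) . \]
(Unfortunately there is no universally used and accepted notation for these functions; so we shall give the boldface $\mathbf{e}$ a try here.) The natural scalar product for functions $f\colon \mathbb{R}/2\pi \to \mathbb{C}$ is given by
\[ \langle f, g \rangle \;:=\; \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\, \overline{g(t)}\,dt . \tag{1} \]
The $\mathbf{e}_k$ are orthonormal:
\[ \langle \mathbf{e}_j, \mathbf{e}_k \rangle = \delta_{jk} ; \]
in particular, they are linearly independent. From general principles of linear algebra it follows that
\[ c_k \;:=\; \langle f, \mathbf{e}_k \rangle \;=\; \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\, e^{-ikt}\,dt \tag{2} \]
is the "$k$th coordinate of $f$ with respect to the basis $(\mathbf{e}_k \mid k \in \mathbb{Z})$", and
\[ s_N \;:=\; \sum_{k=-N}^{N} c_k\, \mathbf{e}_k \qquad \text{resp.} \qquad s_N(t) \;:=\; \sum_{k=-N}^{N} c_k\, e^{ikt} \]
is the orthogonal projection of $f$ onto the subspace
\[ U_N \;:=\; \operatorname{span}(\mathbf{e}_{-N}, \ldots, \mathbf{e}_N) \]
formed by all linear combinations of the $\mathbf{e}_k$ having $|k| \le N$. Being the foot of the perpendicular from $f$ to $U_N$ (see Figure 1.3), the point $s_N$ is nearest to $f$ among all points of $U_N$. In saying this we have tacitly assumed that in our function space the distance function
\[ d(f, g) \;:=\; \|f - g\| \;:=\; \Bigl( \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t) - g(t)|^2\,dt \Bigr)^{1/2} \]
corresponding to the scalar product (1) has been adopted.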
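The statement that $s_N$ is the point of $U_N$ nearest to $f$ can be checked numerically. The following sketch rests on assumptions that are not part of the text: the integrals are replaced by a midpoint rule on `M` nodes, the test signal is the $2\pi$-periodic tent $|t|$, and the names `inner`, `dist` and `partial_sum` are hypothetical helpers. Perturbing a single coefficient of $s_N$ must increase the distance to $f$, in accordance with the Pythagorean picture of Figure 1.3.

```python
import cmath
import math

M = 4096  # quadrature nodes on [-pi, pi) (assumed resolution)
NODES = [-math.pi + (j + 0.5) * 2 * math.pi / M for j in range(M)]

def inner(f, g):
    """<f, g> = (1/2pi) * integral of f(t) conj(g(t)) dt, by the midpoint rule."""
    return sum(f(t) * complex(g(t)).conjugate() for t in NODES) / M

def e(k):
    """Pure harmonic e_k: t -> exp(ikt)."""
    return lambda t: cmath.exp(1j * k * t)

def f(t):
    """Hypothetical test signal: the 2pi-periodic tent |t| on [-pi, pi)."""
    return abs(t)

def partial_sum(coeffs, N):
    """Function t -> sum over |k| <= N of c_k exp(ikt)."""
    return lambda t: sum(coeffs[k] * cmath.exp(1j * k * t) for k in range(-N, N + 1))

def dist(f, g):
    """Distance ||f - g|| belonging to the scalar product above."""
    return math.sqrt(sum(abs(f(t) - g(t)) ** 2 for t in NODES) / M)

N = 5
coeffs = {k: inner(f, e(k)) for k in range(-N, N + 1)}   # c_k = <f, e_k>
d_best = dist(f, partial_sum(coeffs, N))

perturbed = dict(coeffs)
perturbed[1] += 0.05                                     # move away from s_N inside U_N
d_perturbed = dist(f, partial_sum(perturbed, N))
```

Here `d_perturbed ** 2` exceeds `d_best ** 2` by exactly the squared size of the perturbation (up to rounding), which is the right-angle relation between $f - s_N$ and the subspace $U_N$.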
Figure 1.3
This has been the easy part. But what is crucial here, and much more difficult to prove as well, is that the system $(\mathbf{e}_k \mid k \in \mathbb{Z})$ is complete: Any reasonable function $f\colon \mathbb{R}/2\pi \to$ [...]

[...] The variable $a$ is called the scaling parameter, and $b$ is the translation parameter. The factor $1/|a|^{1/2}$ in (1) is not crucial and is more of a technical nature; it is thrown in to guarantee $\|\psi_{a,b}\| = 1$.

[Figure: graph of $y = \psi\bigl(\frac{t-b}{a}\bigr)$]
[...] $\to \mathbb{C}$ appears (we are talking about a bona fide function here, not an equivalence class). This notion is explained as follows: To an arbitrary subdivision
\[ T\colon \quad 0 = t_0 < t_1 < t_2 < \ldots < t_n = 2\pi \]
of the interval $[0, 2\pi]$ belongs the increment sum
\[ V_T(f) \;:=\; \sum_{k=1}^{n} |f(t_k) - f(t_{k-1})| . \]
(Note that the absolute values of the increments are summed here!) The total variation $V(f)$ of the $2\pi$-periodic function $f$ is the supremum of these sums over all subdivisions $T$. If $V(f)$ is finite, then $f$ is called a function of bounded variation. One may consider the function $t \mapsto f(t)$ as a parametric representation of a closed curve $\gamma$ in the complex plane. In light of this interpretation the quantity $V(f)$ is nothing more than the length $L(\gamma)$ of this curve. If $f$ is, e.g., piecewise continuously differentiable, then
\[ V(f) = L(\gamma) = \int_0^{2\pi} |f'(t)|\,dt < \infty . \]
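The increment sums can be computed directly. As a minimal sketch (the choice $f(t) = e^{it}$ and the restriction to uniform subdivisions are assumptions made for illustration), the sums $V_T(f)$ for finer and finer subdivisions increase towards the length $2\pi$ of the unit circle, in agreement with $V(f) = L(\gamma) = \int_0^{2\pi} |f'(t)|\,dt$.

```python
import cmath
import math

def increment_sum(f, n):
    """V_T(f) for the uniform subdivision 0 = t_0 < t_1 < ... < t_n = 2*pi."""
    ts = [2 * math.pi * k / n for k in range(n + 1)]
    return sum(abs(f(ts[k]) - f(ts[k - 1])) for k in range(1, n + 1))

def circle(t):
    """Hypothetical example: f(t) = exp(it), whose curve is the unit circle."""
    return cmath.exp(1j * t)

# Each sum is the perimeter of an inscribed regular n-gon, hence a lower
# bound for the total variation V(f) = 2*pi.
sums = [increment_sum(circle, n) for n in (4, 16, 64, 256)]
```

Uniform subdivisions only give lower bounds for the supremum; for a smooth curve such as this one they nevertheless converge to the total variation.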
(2.5) Let the function $f\colon \mathbb{R}/2\pi \to \mathbb{C}$ be continuous and of bounded variation. Then the partial sums $s_N(t)$ of the Fourier series of $f$ converge for $N \to \infty$ uniformly on $\mathbb{R}/2\pi$ to $f(t)$.

Using the idea of variation we can formulate the following "quantitative version" of the Riemann-Lebesgue lemma:
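(2.6) Let $f^{(r)}$ denote the $r$th derivative, $r \ge 0$, of the function $f\colon \mathbb{R}/2\pi \to \mathbb{C}$. If $f^{(r)}$ is continuous and $V(f^{(r)}) =: V$ is finite, then
\[ |c_k| \;\le\; \frac{V}{2\pi\, |k|^{r+1}} \qquad \forall k \ne 0 . \]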
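As a numerical illustration of (2.6), one can compare computed Fourier coefficients with a bound of the form $V / (2\pi |k|^{r+1})$; this form of the estimate is assumed here, as is everything else in the sketch: the test function (the $2\pi$-periodic tent $|t|$, used with $r = 0$ and $V = 2\pi$), the midpoint quadrature and the helper names.

```python
import cmath
import math

M = 4096  # number of quadrature nodes on [-pi, pi) (assumed resolution)
H = 2 * math.pi / M
NODES = [-math.pi + (j + 0.5) * H for j in range(M)]

def tent(t):
    """Hypothetical test function: |t| on [-pi, pi), continued 2*pi-periodically."""
    return abs(t)

def coefficient(f, k):
    """c_k = (1/2pi) * integral of f(t) exp(-ikt) dt, by the midpoint rule."""
    return sum(f(t) * cmath.exp(-1j * k * t) for t in NODES) / M

V = 2 * math.pi  # total variation of the tent: it rises by pi and falls by pi
R = 0            # the tent itself (r = 0) is continuous and of bounded variation

# Rows (k, |c_k|, assumed bound V / (2 pi k^(r+1))) for k = 1, ..., 10.
table = []
for k in range(1, 11):
    size = abs(coefficient(tent, k))
    bound = V / (2 * math.pi * k ** (R + 1))
    table.append((k, size, bound))
```

For this particular function the coefficients even decay like $1/k^2$, faster than the bound requires; the bound only uses the continuity and bounded variation of the tent itself.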
This can be summarized as follows: The smoother the function $f$, the faster the Fourier coefficients $c_k$ decay for $k \to \pm\infty$. Theorem (2.6) can, in a way, be reversed:

(2.7) If the coefficients $c_k$ obey an estimate of the form
\[ |c_k| \;\le\; \frac{C}{|k|^{r+1+\varepsilon}} \qquad (k \ne 0) \]
for some $\varepsilon > 0$, then the function $f(t) := \sum_k c_k\, e^{ikt}$ is at least $r$ times continuously differentiable.

⌜ When the series defining the function $f$ is differentiated term-by-term $p$ times, one obtains
\[ \sum_k c_k\, (ik)^p\, e^{ikt} . \]
The estimate
\[ \bigl| c_k\, (ik)^p\, e^{ikt} \bigr| \;\le\; \frac{C}{|k|^{r+1+\varepsilon-p}} \;\le\; \frac{C}{|k|^{1+\varepsilon}} \qquad (k \ne 0,\ p \le r) \]
shows that the resulting series is uniformly convergent (to a continuous function) as long as $p \le r$. In fact, for such $p$ the series represents $f^{(p)}$, so altogether we have $f \in C^r$. ⌟

The phenomena described in (2.6) and (2.7) become manifest again when we are dealing with Fourier analysis on $\mathbb{R}$ and will have decisive consequences for the smoothness of our wavelets; we shall come back to this.

We conclude this section by writing down the relevant formulas for the Fourier series and its coefficients in case of a period of arbitrary length $L > 0$ instead of $2\pi$. For $L := 2\pi$, these formulas must become (1) and (3), and similarly for Parseval's formula.
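(2.8) Let $f\colon \mathbb{R} \to \mathbb{C}$ be periodic with period $L > 0$, and suppose $\int_0^L |f(x)|^2\,dx < \infty$. Then the formal Fourier series of $f$ is given by
\[ f(x) \;\sim\; \sum_{k=-\infty}^{\infty} c_k\, e^{2k\pi i x/L} , \qquad c_k \;:=\; \frac{1}{L} \int_0^L f(x)\, e^{-2k\pi i x/L}\,dx , \tag{4} \]
and Parseval's formula appears as
\[ \sum_{k=-\infty}^{\infty} |c_k|^2 \;=\; \frac{1}{L} \int_0^L |f(x)|^2\,dx . \]

⌜ The function $g(t) := f\bigl(\frac{L}{2\pi}\, t\bigr)$ is $2\pi$-periodic, thus the relations (4) are obtained by a simple substitution of variables. From (2.2), it follows that for $L$-periodic functions an equality of the form
\[ \sum_k |c_k|^2 \;=\; C \int_0^L |f(x)|^2\,dx \]
must hold. The special function $f(t) :\equiv 1$ has Fourier coefficients $c_k = \delta_{0k}$ (Kronecker delta), which leads to the conclusion $C = \frac{1}{L}$. ⌟

2.2 Fourier transform on $\mathbb{R}$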
Notation: From this point on until the end of the book an integral sign $\int$ without upper and lower limits denotes the integral with respect to the Lebesgue measure on $\mathbb{R}$, extended over the whole real axis:
\[ \int f(t)\,dt \;:=\; \int_{-\infty}^{\infty} f(t)\,dt . \]
Fourier analysis on $\mathbb{R}$ is governed not by one theory but by at least three different theories, all depending on which function space is chosen as the basic environment. All of these theories deal with functions of the type
\[ f\colon \mathbb{R} \to \mathbb{C} ; \tag{1} \]
we shall call such functions time signals for short.
The space $L^1$ consists of the measurable functions (1) for which the integral
\[ \int |f(t)|\,dt \;=:\; \|f\|_1 \]
(the 1 is a notational index!) is finite; to be precise, it consists of equivalence classes of such functions. Analogously, the space $L^2$ consists of the functions (1) for which the integral
\[ \int |f(t)|^2\,dt \;=:\; \|f\|^2 \]
(the 2 is an exponent!) is finite. The third of these spaces is the so-called Schwartz space $\mathcal{S}$; its elements are the functions (1) with the following properties: $f$ has derivatives of all orders (in symbols, $f \in C^\infty(\mathbb{R})$), and for $|t| \to \infty$ all derivatives decay faster to 0 than any negative power $1/|t|^n$. Examples of such functions are
\[ t \mapsto e^{-t^2} , \qquad t \mapsto \frac{1}{\cosh t} . \]
Figure 2.1 shows the inclusions that are valid between these spaces. All wavelets of any practical significance belong to the intersection $L^1 \cap L^2$, so the $L^1$-theory as well as the $L^2$-theory is available for them. The famous "Mexican hat" (see Figure 3.4) even lies in $\mathcal{S}$.

[Figure 2.1: inclusions between the spaces $L^1$, $L^2$ and $\mathcal{S}$, with example functions such as $\frac{\sin t}{t}$]
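The Fourier transform $\hat f$ of a function $f \in L^1$ is defined by the integral
\[ \hat f(\xi) \;:=\; \frac{1}{\sqrt{2\pi}} \int f(t)\, e^{-i\xi t}\,dt \qquad (\xi \in \mathbb{R}) . \tag{2} \]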
The definition of $\hat f$ is not uniform in the mathematical literature. In addition to the integral given here, one also encounters
\[ \int f(t)\, e^{-2\pi i \xi t}\,dt \]
and others. The content of the theory remains intact under such changes, of course, but the formulas will look a little different throughout.

For a given $\xi \in \mathbb{R}$, the well-determined value $\hat f(\xi)$ may be interpreted as follows: $\hat f(\xi)$ is the complex amplitude with which the pure oscillation $\mathbf{e}_\xi$ is represented in $f$. The following "Gedankenexperiment" (thought experiment) will illustrate this: Consider a time signal $f$ whose value $f(t)$ oscillates around the origin (not necessarily in circles) with an angular velocity approximately $\xi$ during some length of time and is very weak the rest of the time. If $I$ is the time interval of this encircling motion, then $\arg\bigl(f(t)\, e^{-i\xi t}\bigr)$ is more or less constant on $I$, and the integral
\[ \int_I f(t)\, e^{-i\xi t}\,dt \]
has a large absolute value, since there is little cancellation. The remaining integral
\[ \int_{\mathbb{R} \setminus I} f(t)\, e^{-i\xi t}\,dt , \]
on the other hand, will have a very small value, since the signal reading $f(t)$ is more or less constant on $\mathbb{R} \setminus I$, while $\mathbf{e}_\xi$ is oscillating rapidly and harmonically there, so that we have a great deal of cancellation during the summation process on $\mathbb{R} \setminus I$.
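The thought experiment can be imitated numerically. The sketch below is only an illustration under assumptions of its own: the signal is a Gaussian envelope modulated with angular velocity `OMEGA = 5`, the transform (with the normalization $1/\sqrt{2\pi}$ and the kernel $e^{-i\xi t}$) is approximated by a midpoint rule on a truncated interval, and `fourier_transform` is a hypothetical helper, not a library routine.

```python
import cmath
import math

def fourier_transform(f, xi, T=12.0, n=6000):
    """Midpoint-rule approximation of (1/sqrt(2 pi)) * integral of f(t) exp(-i xi t) dt
    over the truncated interval [-T, T]."""
    h = 2 * T / n
    total = 0j
    for j in range(n):
        t = -T + (j + 0.5) * h
        total += f(t) * cmath.exp(-1j * xi * t)
    return total * h / math.sqrt(2 * math.pi)

OMEGA = 5.0  # angular velocity of the hypothetical test signal

def signal(t):
    """Gaussian envelope circling the origin with angular velocity OMEGA."""
    return math.exp(-t * t / 2) * cmath.exp(1j * OMEGA * t)

# Absolute value of the transform at three trial frequencies.
amplitude = {xi: abs(fourier_transform(signal, xi)) for xi in (0.0, 5.0, 10.0)}
```

At $\xi = 5$ the factor $e^{-i\xi t}$ undoes the rotation of the signal, so nothing cancels and the amplitude is large; at $\xi = 0$ and $\xi = 10$ the integrand keeps rotating and the contributions cancel almost completely.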
(2.9) The Fourier transform $\hat f$ of a function $f \in L^1$ is automatically continuous. Furthermore, one has
\[ \lim_{\xi \to \pm\infty} \hat f(\xi) = 0 . \]
The vanishing of $\hat f$ at $\pm\infty$ is nothing more than the Fourier transform version of the Riemann-Lebesgue lemma.

We now derive a few rules for calculating the Fourier transforms of functions related to some given $f$ by translation, dilation and the like. For any time signal $f$ and arbitrary $h \in \mathbb{R}$, the function $T_h f$ is defined by
\[ T_h f(t) \;:=\; f(t - h) . \]
[Figure 2.2: the graph of $f$ and the graph of $T_h f$, translated by $h$ along the $t$-axis]
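If $h$ is positive, then $T_h$ translates the graph of $f$ by $h$ to the right (see Figure 2.2). Let $f$ be in $L^1$ and $g(t) := T_h f(t)$. Then the Fourier transform of $g$ is computed as
\[ \hat g(\xi) = \frac{1}{\sqrt{2\pi}} \int f(t - h)\, e^{-i\xi t}\,dt = \frac{1}{\sqrt{2\pi}} \int f(t')\, e^{-i\xi (t' + h)}\,dt' = e^{-i\xi h}\, \hat f(\xi) . \]
This proves our first rule:
\[ (T_h f)^{\wedge}(\xi) \;=\; e^{-i\xi h}\, \hat f(\xi) , \tag{R1} \]
which may be expressed in words as follows: If $f$ is translated by $h$ to the right along the time axis, then its Fourier transform $\hat f$ picks up a factor $\mathbf{e}_{-h}$.

We again consider an arbitrary signal $f \in L^1$ and modulate $f$ with a pure oscillation $\mathbf{e}_\omega$, $\omega \in \mathbb{R}$; that is to say, we consider the function $g(t) := e^{i\omega t} f(t)$. The Fourier transform of $g$ is given by
\[ \hat g(\xi) = \frac{1}{\sqrt{2\pi}} \int f(t)\, e^{i\omega t}\, e^{-i\xi t}\,dt = \hat f(\xi - \omega) . \]
So we have the following rule, which is in a way "dual" to (R1):
\[ (\mathbf{e}_\omega f)^{\wedge}(\xi) \;=\; \hat f(\xi - \omega) . \tag{R2} \]
In words: If the signal $f$ is modulated with $\mathbf{e}_\omega$, then the graph of $\hat f$ is translated by $\omega$ (to the right, if $\omega > 0$) on the $\xi$-axis.

Speaking philosophically, one can say that Fourier theory is the systematic exploitation of translational symmetry. In the realm of wavelets dilations of the time axis play a role of even more importance. For this reason we have to investigate how the Fourier transform behaves under the operation $D_a$, which for arbitrary $a \in \mathbb{R}^*$ is defined by
\[ D_a f(t) \;:=\; f\Bigl(\frac{t}{a}\Bigr) . \]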
[Figure 2.3: the graphs of $f$ and $D_a f$ for $a = 3$]

The effect of $D_a$ on the graph $\mathcal{G}(f)$ of a signal $f$ is shown in Figure 2.3 for the case $a := 3$. If $|a| > 1$, then $\mathcal{G}(f)$ is stretched horizontally by the factor $|a|$, and for $|a| < 1$ the graph is compressed horizontally by the factor $|a|$. If $a < 0$, then, in addition, $\mathcal{G}(f)$ is reflected on the vertical axis. So let $g(t) := D_a f(t)$. In order to compute $\hat g$ we use the substitution
\[ t := a t' \quad (t' \in \mathbb{R}) , \qquad dt = |a|\,dt' \]
(absolute value of the Jacobian!) and obtain
\[ \hat g(\xi) = \frac{1}{\sqrt{2\pi}} \int f\Bigl(\frac{t}{a}\Bigr)\, e^{-i\xi t}\,dt = \frac{|a|}{\sqrt{2\pi}} \int f(t')\, e^{-i\xi a t'}\,dt' = |a|\, \hat f(a\xi) . \]
All in all, we have proven the formula
\[ (D_a f)^{\wedge}(\xi) \;=\; |a|\, \hat f(a\xi) \qquad (a \in \mathbb{R}^*) . \tag{R3} \]
In terms of the graphs of $f$ and $\hat f$ this means the following: If the graph of $f$ is stretched horizontally by a factor $a > 1$, then the graph of $\hat f$ is compressed horizontally to the fraction $\frac{1}{a} < 1$ of its original width; moreover, it is scaled vertically by the factor $|a|$.
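The translation and dilation rules can be checked numerically, here in the form $(T_h f)^{\wedge}(\xi) = e^{-i\xi h} \hat f(\xi)$ and $(D_a f)^{\wedge}(\xi) = |a|\, \hat f(a\xi)$. This is a minimal sketch under assumptions not found in the text: the test signal is the Gaussian $e^{-t^2/2}$, the values of `h`, `a` and `xi` are arbitrary, and `ft` is a hypothetical midpoint-rule approximation of the Fourier integral on a truncated interval.

```python
import cmath
import math

def ft(f, xi, T=40.0, n=20000):
    """Midpoint-rule approximation of (1/sqrt(2 pi)) * integral of f(t) exp(-i xi t) dt
    over the truncated interval [-T, T]."""
    step = 2 * T / n
    total = 0j
    for j in range(n):
        t = -T + (j + 0.5) * step
        total += f(t) * cmath.exp(-1j * xi * t)
    return total * step / math.sqrt(2 * math.pi)

def gauss(t):
    """Hypothetical test signal exp(-t^2 / 2)."""
    return math.exp(-t * t / 2)

h, a, xi = 1.5, 3.0, 0.4   # arbitrary translation, dilation and frequency

# (R1): translating f by h multiplies the transform by exp(-i xi h).
lhs_r1 = ft(lambda t: gauss(t - h), xi)
rhs_r1 = cmath.exp(-1j * xi * h) * ft(gauss, xi)

# (R3): dilating f by a gives |a| times the transform evaluated at a * xi.
lhs_r3 = ft(lambda t: gauss(t / a), xi)
rhs_r3 = abs(a) * ft(gauss, a * xi)
```

Both pairs agree to within rounding error, since the midpoint rule is extremely accurate for smooth, rapidly decaying integrands of this kind.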
For any two given functions $f$ and $g \in L^1$, their convolution product $f * g$ is defined by
\[ f * g(x) \;:=\; \int f(x - t)\, g(t)\,dt \qquad (x \in \mathbb{R}) . \]
In any case the object $f * g$ is an element of $L^1$. This means that a priori it is only an equivalence class of functions. In most concrete cases, however, $f * g$ is a bona fide function with well-determined values. One can even say more:
The function $f * g$ is at least as smooth as the smoother of the two functions $f$ and $g$. A typical application of convolution is the so-called regularization of a given function $f$ by means of smooth bump functions $g_\varepsilon \in C^\infty$. The $g_\varepsilon$ have total mass $\int g_\varepsilon(t)\,dt = 1$ and are identically zero outside of the interval $[-\varepsilon, \varepsilon]$, see Figure 2.4. The value $f * g_\varepsilon(x)$ can then be regarded as a weighted average of the $f$-values taken in an $\varepsilon$-neighbourhood of $x$, so the $C^\infty$-function $f_\varepsilon := f * g_\varepsilon$ is an "$\varepsilon$-smeared out" version of the given function $f$.

[Figure 2.4: a bump function $g_\varepsilon$ with $\int g_\varepsilon(t)\,dt = 1$]
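Regularization by a bump function can be sketched in a few lines. Everything concrete below is an assumption made for illustration: the bump is the standard $C^\infty$ function $\exp\bigl(-1/(1 - (t/\varepsilon)^2)\bigr)$ normalized numerically to total mass 1, the function to be smoothed is the unit step, and the convolution integral is approximated by a midpoint rule over the support $[-\varepsilon, \varepsilon]$.

```python
import math

EPS = 0.2   # half-width of the bump (assumed value)
N = 2000    # quadrature nodes on [-EPS, EPS]

def raw_bump(t):
    """Smooth bump exp(-1 / (1 - (t/EPS)^2)) inside ]-EPS, EPS[, zero outside."""
    s = t / EPS
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1 else 0.0

H = 2 * EPS / N
NODES = [-EPS + (j + 0.5) * H for j in range(N)]
MASS = sum(raw_bump(t) for t in NODES) * H   # numerical value of the integral

def g_eps(t):
    """Bump function normalized to total mass 1."""
    return raw_bump(t) / MASS

def step(x):
    """Hypothetical rough signal: the unit step."""
    return 1.0 if x >= 0 else 0.0

def regularize(f, x):
    """f * g_eps (x) = integral of f(x - t) g_eps(t) dt, midpoint rule over the support."""
    return sum(f(x - t) * g_eps(t) for t in NODES) * H
```

Away from the jump, `regularize(step, x)` reproduces the step exactly; within distance `EPS` of the jump it interpolates smoothly between 0 and 1, passing through 1/2 at `x = 0` by symmetry of the bump.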
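With the help of Fubini's theorem (on the interchange of the order of integration) we can now easily compute the Fourier transform of $f * g$:
\[ (f * g)^{\wedge}(\xi) = \frac{1}{\sqrt{2\pi}} \int \Bigl( \int f(x - t)\, g(t)\,dt \Bigr) e^{-i\xi x}\,dx = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R} \times \mathbb{R}} f(x - t)\, g(t)\, e^{-i\xi x}\,d(x, t) = \frac{1}{\sqrt{2\pi}} \int g(t) \Bigl( \int f(x - t)\, e^{-i\xi x}\,dx \Bigr) dt . \]
By rule (R1), the resulting inner integral has the value $\sqrt{2\pi}\, e^{-i\xi t}\, \hat f(\xi)$, and here only the factor $e^{-i\xi t}$ is dependent on $t$. Thus we may continue the above chain of equalities with
\[ \ldots = \hat f(\xi) \int g(t)\, e^{-i\xi t}\,dt = \sqrt{2\pi}\, \hat f(\xi)\, \hat g(\xi) . \]
Our computation proves the so-called convolution theorem
\[ (2.10) \qquad (f * g)^{\wedge} \;=\; \sqrt{2\pi}\; \hat f \cdot \hat g . \]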
In words: The Fourier transform converts the convolution product of the two functions $f$ and $g$ into the ordinary, meaning pointwise, product of their Fourier transforms.

Now for the $L^2$-theory. On $L^2$ one defines a scalar product by
\[ \langle f, g \rangle \;:=\; \int f(t)\, \overline{g(t)}\,dt . \tag{3} \]
For any two functions $f, g \in L^2$, their scalar product $\langle f, g \rangle$ is a well-determined complex number. Any $f \in L^2$ has a finite 2-norm, norm for short,
\[ \|f\| \;:=\; \sqrt{\langle f, f \rangle} = \Bigl( \int |f(t)|^2\,dt \Bigr)^{1/2} , \]
and one easily proves Schwarz' inequality
\[ |\langle f, g \rangle| \;\le\; \|f\|\, \|g\| . \tag{4} \]
$L^2$ is a Hilbert space, as was $L^2_\circ$, but not everything carries over. For a general $f \in L^2$, the Fourier integral (2) need not exist: Since $\mathbf{e}_\xi$ is not an element of $L^2$, this integral cannot be regarded as being the scalar product $\frac{1}{\sqrt{2\pi}} \langle f, \mathbf{e}_\xi \rangle$. Fortunately, the subset $X := L^1 \cap L^2$ is dense in $L^2$, and this makes it possible to extend the Fourier transform
\[ \mathcal{F}\colon \quad f \mapsto \hat f , \]
defined on $X$ by formula (2), in a unique way to all of $L^2$. This implies, of course, that the Fourier transform of a function $f \in L^2 \setminus X$ becomes accessible only through an additional limiting process. Working out the details, one arrives at the following picture: The Fourier transform $\hat f$ of a function $f \in L^2$ about which nothing else is known is again an $L^2$-object, i.e., an equivalence class of functions, and does not have well-determined values at individual points $\xi \in \mathbb{R}$. But as a map
\[ \mathcal{F}\colon \quad L^2 \to L^2 \]
the Fourier transform is well-defined and bijective (a miracle!). In fact, even more is true: $\mathcal{F}$ is an isometry with respect to the scalar product (3). This is analytically expressed by the following theorem, called the Parseval-Plancherel formula:
(2.11) For arbitrary $f, g \in L^2$ one has
\[ \langle \hat f, \hat g \rangle = \langle f, g \rangle , \]
or, written out in full,
\[ \int \hat f(\xi)\, \overline{\hat g(\xi)}\,d\xi = \int f(t)\, \overline{g(t)}\,dt . \]
In particular,
\[ \|\hat f\|^2 = \|f\|^2 , \qquad \text{resp.} \qquad \int |\hat f(\xi)|^2\,d\xi = \int |f(t)|^2\,dt . \]

A periodic function $f$ can be reconstructed from its Fourier coefficients $c_k = \hat f(k)$ by summing the series. In a similar vein, there is also a reconstruction procedure (called the inversion formula) for the Fourier transform. It accepts the Fourier transform $\hat f$ of a time signal $f$ as input and reproduces the original signal $f$ by means of a summation process. In the textbooks on Fourier analysis one finds various approaches to such an inversion formula under ever weaker assumptions about $f$ and $\hat f$. Let us note here the following version:

(2.12) If $f$ and $\hat f$ are both in $L^1$, then
\[ f(t) = \frac{1}{\sqrt{2\pi}} \int \hat f(\xi)\, e^{i\xi t}\,d\xi \]
almost everywhere, in particular at all points $t$ where $f$ is continuous.

This formula can be written "abstractly" in the form
\[ f = \frac{1}{\sqrt{2\pi}} \int d\xi\; \hat f(\xi)\, \mathbf{e}_\xi , \]
which may be interpreted as follows: The original signal $f$ is a linear combination of pure oscillations of all possible frequencies $\xi \in \mathbb{R}$; to be more precise, any individual oscillation $\mathbf{e}_\xi$ occurs in $f$ with complex amplitude $\hat f(\xi)$ (cf. our remarks following the definition (2) of $\hat f$). In Theorem (2.12) there are assumptions not only about the original signal $f$ but also about $\hat f$. Thus we have to address the following question: How are the properties of $\hat f$ (continuity, decay at infinity, etc.) related to those of $f$? Generally speaking, the following can be said in this regard: The smoother
the time signal $f$, the faster the decay of $\hat f(\xi)$ for $|\xi| \to \infty$. Reflecting this in a logical mirror, one has the following dual statement: The faster the original signal decays for $|t| \to \infty$, the smoother, or more regular, is its Fourier transform $\hat f$. (Following the general custom, we use the word regular to convey a not very precise notion of smoothness.) A function $f$ in Schwartz space $\mathcal{S}$ is "super smooth", and as a consequence its Fourier transform $\hat f$ decays "super fast". On the other hand, $f$ and all its derivatives enjoy "super fast" decay, and as a consequence $\hat f$ is "super smooth". All in all, it turns out that $\mathcal{F}$, restricted to $\mathcal{S}$, maps this space bijectively onto itself.

We want to formulate the described general principle somewhat more precisely, i.e., in a more quantitative way. The smoothness (regularity) of a function is most easily expressed by the number of times it can be continuously differentiated. So we first have to investigate the interplay between the Fourier transform and differentiation. Let $f$ be a $C^1$-function and assume that $f$ as well as $f'$ are integrable, i.e., in $L^1$. Then in any case one has $\lim_{t \to \pm\infty} f(t) = 0$ (an exercise!), and partial integration of the Fourier integral (2) gives
\[ \int f'(t)\, e^{-i\xi t}\,dt \;=\; f(t)\, e^{-i\xi t}\, \Big|_{t=-\infty}^{\infty} + i\xi \int f(t)\, e^{-i\xi t}\,dt , \]
from which we can read off the following rule for computing the Fourier transform of a derivative:
(R4) Continuing in this way, we obtain, at least formally, for arbitrary r ;::: 0, the formula
(5) Assume, e.g., that our signal f is r times continuously differentiable and that the derivatives f(k) (0:::; k :::; r) are in Ll. Then formula (5) is applicable, and Theorem (2.9), applied to fH, guarantees lim
~doo
If.l ri(f.) = 0 .
This can be read as follows: Under the described circumstances the Fourier transform has a decay at infinity (i.e., for 1f.1 ~ (0) that is faster than the decay of l/If.lr.
i
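Rule (R4) can be checked numerically. The following is only a sketch: the helper name `fourier_transform`, the Gaussian test signal and the grids are assumptions made for this illustration, and the Fourier integral is replaced by a plain Riemann sum, which is very accurate for rapidly decaying smooth signals.

```python
import numpy as np

def fourier_transform(f_vals, t, xi):
    """Riemann-sum approximation of (1/sqrt(2*pi)) * int f(t) exp(-i*xi*t) dt."""
    dt = t[1] - t[0]
    kernel = np.exp(-1j * np.outer(xi, t))      # shape (len(xi), len(t))
    return (kernel @ f_vals) * dt / np.sqrt(2.0 * np.pi)

t = np.linspace(-20.0, 20.0, 4001)              # time grid (assumed wide enough)
xi = np.linspace(-5.0, 5.0, 101)                # frequency grid

f = np.exp(-t**2 / 2.0)                         # test signal
f_prime = -t * f                                # its exact derivative

F = fourier_transform(f, t, xi)
F_prime = fourier_transform(f_prime, t, xi)

# (R4): the transform of f' should equal i*xi times the transform of f.
r4_error = np.max(np.abs(F_prime - 1j * xi * F))
print("max deviation from (R4):", r4_error)
```

The same helper can be reused to watch the decay statement: multiplying `F` by `np.abs(xi)**r` for a few values of `r` shows how fast the transform of a smooth signal falls off.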
Using (2.11) instead of (2.9) we arrive at a similar result: If, under suitable assumptions about the derivatives $f^{(k)}$ ($0 \le k \le r$), the integral $\int |f^{(r)}(t)|^2\,dt$ is finite, then the integral $\int |\xi|^{2r}\,|\hat f(\xi)|^2\,d\xi$ is finite as well, which implies that $\hat f$ must have corresponding decay at infinity.
As a counterpart to the considerations in the last paragraph we start afresh, but this time with time signals $f$ that have fast decay at infinity. We consider an $f \in L^1$ decaying for $|t| \to \infty$ at least fast enough to make the integral $\int |t|\,|f(t)|\,dt$ convergent. We shall denote the function $t \mapsto t\,f(t)$ by $tf$ for short, so we assume $tf \in L^1$. We now compute the derivative of $\hat f$. To this end we write

$$\frac{\hat f(\xi+h) - \hat f(\xi)}{h} = \frac{1}{\sqrt{2\pi}} \int f(t)\,e^{-i\xi t}\,\frac{e^{-ith} - 1}{h}\,dt .$$

Here the integrand

$$g_h(t) := f(t)\,e^{-i\xi t}\,\frac{e^{-ith} - 1}{h}$$

can be estimated as follows:

$$|g_h(t)| \le |t\,f(t)| \qquad \forall h \ne 0 .$$
By Lebesgue's theorem (about the interchange of limit and integration) we conclude that the derivative

$$(\hat f)'(\xi) = \lim_{h\to 0} \frac{\hat f(\xi+h) - \hat f(\xi)}{h} = \frac{1}{\sqrt{2\pi}} \int f(t)\,e^{-i\xi t}\,(-it)\,dt$$

exists. If the last equation is read from right to left, one obtains the following rule for computing the Fourier transform of $tf$:

$$(tf)^\wedge(\xi) = i\,(\hat f)'(\xi) . \tag{R5}$$

Because of (2.9), the function $(\hat f)'$ is even continuous. By induction one proves easily that the following is true for arbitrary $r \ge 1$:

(2.13) Assume that $f \in L^1$ decays fast enough for $|t| \to \infty$ to make the integral $\int |t|^r\,|f(t)|\,dt$ finite. Then the Fourier transform $\hat f$ is at least $r$ times continuously differentiable. Furthermore,

$$(\hat f)^{(r)}(\xi) = (-i)^r\,(t^r f)^\wedge(\xi) . \tag{6}$$

An extremal case of fast decay is when the time signal $f \in L^1$ has in fact compact support. If $\mathrm{supp}(f) \subset [-b, b]$, we may write

$$\hat f(\zeta) = \frac{1}{\sqrt{2\pi}} \int_{-b}^{b} f(t)\,e^{-i\zeta t}\,dt . \tag{7}$$
Note that we have replaced the frequency variable $\xi$ by a $\zeta$, for something essential has happened: The Fourier transform $\hat f$ has become an entire holomorphic function of the complex variable $\zeta = \xi + i\eta$. Looking back, we remark that for the convergence of the Fourier integral (2) in general it was crucial that the factor $e^{-i\xi t}$ remain bounded when $t \to \pm\infty$. Now in the integral (7) over a finite interval, the factor $e^{-i\zeta t}$ can be estimated for complex $\zeta$ as follows:

$$|e^{-i\zeta t}| = |e^{-i(\xi + i\eta)t}| \le e^{b|\eta|} \qquad (-b \le t \le b) .$$

This shows that the integral (7) is convergent for arbitrary values of $\zeta \in \mathbb{C}$, and as in the proof of (R5) it follows that one may differentiate (7) in the sense of complex function theory with respect to the variable $\zeta$. Furthermore, one has for $\hat f$ itself an estimate of the form

$$|\hat f(\zeta)| \le \frac{1}{\sqrt{2\pi}} \int_{-b}^{b} |f(t)|\,e^{|t\,\mathrm{Im}(\zeta)|}\,dt \le C\,e^{b\,|\mathrm{Im}(\zeta)|} .$$

Thus the size of the support of $f$ determines the rate of increase of the entire function $\zeta \mapsto \hat f(\zeta)$ in the vertical direction.

Since the Fourier transform in this case has turned out to be an entire holomorphic function, it is impossible that $\hat f$ has compact support, if this is the case for $f$ (unless $f$ vanishes identically). Turned the other way around, a bandlimited signal (see Section 2.4) cannot have compact support.
We conclude this section with a few examples.
① Let $a > 0$, and consider the function $f := 1_{[-a,a]}$. Its Fourier transform is computed as follows:

$$\hat f(\xi) = \frac{1}{\sqrt{2\pi}} \int_{-a}^{a} e^{-i\xi t}\,dt = \frac{1}{\sqrt{2\pi}}\,\frac{1}{-i\xi}\,e^{-i\xi t}\,\Big|_{t=-a}^{a} = \frac{2}{\sqrt{2\pi}\,\xi}\cdot\frac{e^{i\xi a} - e^{-i\xi a}}{2i} = \sqrt{\frac{2}{\pi}}\,\frac{\sin(a\xi)}{\xi} \qquad (\xi \ne 0) .$$

The value $\xi = 0$ is special. By a separate calculation or by looking at $\lim_{\xi\to 0} \hat f(\xi)$, one finds

$$\hat f(0) = \sqrt{\frac{2}{\pi}}\,a .$$

The graphs of both $f$ and $\hat f$ are shown in Figure 2.5. In the signal theoretic literature, very often the so-called sinc function is introduced as a standard tool. It is usually defined by

$$\mathrm{sinc}(x) := \begin{cases} \dfrac{\sin x}{x} & (x \ne 0) \\[1ex] 1 & (x = 0) \end{cases}$$

and is an entire holomorphic function of $x$, when $x$ is considered as a complex variable. Using this function we may write down our result about $\hat f$ in the following way:

$$\widehat{1_{[-a,a]}}(\xi) = \sqrt{\frac{2}{\pi}}\;a\;\mathrm{sinc}(a\xi) . \tag{8}$$

Figure 2.5 (graphs of $f$ and $\hat f$)
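Formula (8) is easy to confirm by quadrature. The sketch below is an illustration under assumptions: the value `a = 1.5`, the midpoint rule and the grid sizes are chosen freely here. Note that NumPy's `np.sinc` is the normalized variant $\sin(\pi x)/(\pi x)$, so the unnormalized sinc of the text is obtained by rescaling the argument.

```python
import numpy as np

def sinc(x):
    """sinc as defined in the text: sin(x)/x, with sinc(0) = 1."""
    return np.sinc(np.asarray(x) / np.pi)

a = 1.5                                          # half-width of the indicator (assumed value)
n = 20000
h = 2.0 * a / n
t_mid = -a + h * (np.arange(n) + 0.5)            # midpoints of a partition of [-a, a]
xi = np.linspace(-20.0, 20.0, 81)

# Midpoint rule for (1/sqrt(2*pi)) * int_{-a}^{a} exp(-i*xi*t) dt
numeric = np.exp(-1j * np.outer(xi, t_mid)).sum(axis=1) * h / np.sqrt(2.0 * np.pi)

# Right-hand side of (8)
closed_form = np.sqrt(2.0 / np.pi) * a * sinc(a * xi)

max_error = np.max(np.abs(numeric - closed_form))
print("max deviation from (8):", max_error)
```

The slow $1/|\xi|$ envelope of `closed_form` is the numerical face of the jump discontinuities of $1_{[-a,a]}$.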
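As an exercise in using our rules, we compute the Fourier transform of the Haar wavelet (see Section 1.6) a second time. Considered as an element of $L^1$, the Haar wavelet may be written as follows:

$$\psi_{\mathrm{Haar}} = 1_{[0,\frac12]} - 1_{[\frac12,1]} = T_{\frac14}\,1_{[-\frac14,\frac14]} - T_{\frac34}\,1_{[-\frac14,\frac14]} .$$

Rule (R1) now allows us to read off $\hat\psi_{\mathrm{Haar}}$ directly from (8):

$$\hat\psi_{\mathrm{Haar}}(\xi) = \left(e^{-i\xi/4} - e^{-3i\xi/4}\right) \sqrt{\frac{2}{\pi}}\,\frac{\sin(\xi/4)}{\xi} = \frac{i}{\sqrt{2\pi}}\,\frac{\sin^2(\xi/4)}{\xi/4}\,e^{-i\xi/2} ,$$

as before.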
② The function

$$g(t) := 1_{[-a,a]}(t)\,e^{i\omega_0 t}$$

models a certain process setting in at the exact time $t := -a$ and abruptly stopping at time $t := a$. In between, we observe a pure oscillation of frequency (angular velocity, to be exact) $\omega_0$. The Fourier transform treats this process mandatorily as an overall phenomenon extended over the full time axis. Rule (R2) gives, in this case:

$$\hat g(\xi) = \sqrt{\frac{2}{\pi}}\,\frac{\sin\bigl(a(\xi - \omega_0)\bigr)}{\xi - \omega_0} .$$

As was to be expected, the function $\hat g$ has a more or less distinctive maximum at the frequency $\xi := \omega_0$ (see Figure 2.6). But because of the jump discontinuities of $g$ at the times $t := \pm a$, the absolute value $|\hat g|$ decays only slowly with $|\xi| \to \infty$; in fact, $\hat g$ is not even in $L^1$.

Figure 2.6
③ The Fourier transform of the function

$$g_0(t) := \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}$$

is most easily computed via the methods of complex function theory. Since $g_0$ is real and even, its Fourier transform $\hat g_0$ will also be a real and even function. So it suffices to discuss $\xi > 0$. Inspired by $g_0$, we consider the function $f(z) := e^{-z^2/2}$, holomorphic in the full complex $z$-plane, and draw the rectangle $R$ shown in Figure 2.7. Since, in the end, we shall take the limit $a \to \infty$, we may assume right from the start that $a \ge \xi > 0$; note that $\xi$ is fixed here.

Cauchy's integral theorem tells us that

$$\int_{\partial R} f(z)\,dz = 0 .$$

Therefore we have

$$\int_{\sigma_1} f(z)\,dz = \int_{\sigma_0} f(z)\,dz + \int_{\sigma_+} f(z)\,dz - \int_{\sigma_-} f(z)\,dz ,$$

where $\sigma_0$ and $\sigma_1$ denote the lower and the upper horizontal side of $R$, traversed from left to right, and $\sigma_\pm$ the vertical sides through $\pm a$, traversed upwards.

Figure 2.7 (the rectangle $R$)

which we may abbreviate as
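$$I_1 = I_0 + I_+ - I_- .$$

For $I_1$ we use the parametric representation

$$\sigma_1: \quad t \mapsto z(t) := t + i\xi \qquad (-a \le t \le a)$$

and obtain

$$I_1 = \int_{-a}^{a} \exp\left(-\frac{t^2 + 2i\xi t - \xi^2}{2}\right) dt = e^{\xi^2/2} \int_{-a}^{a} e^{-t^2/2}\,e^{-i\xi t}\,dt = e^{\xi^2/2}\,\bigl(2\pi\,\hat g_0(\xi) + o(1)\bigr) \qquad (a \to \infty) . \tag{9}$$

The integral $I_0$ can be written as

$$I_0 = \int_{-a}^{a} e^{-t^2/2}\,dt = \sqrt{2\pi} + o(1) \qquad (a \to \infty) . \tag{10}$$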
Here we have used a well-known special value of the probability integral, which can be obtained without excursion into the complex domain. To compute the remaining integrals $I_\pm$, we use the parametric representation

$$\sigma_\pm: \quad t \mapsto z(t) := \pm a + it \qquad (0 \le t \le \xi)$$

and obtain

$$I_\pm = \int_0^{\xi} \exp\left(-\frac{a^2 \pm 2iat - t^2}{2}\right) i\,dt .$$

Because of $a \ge \xi$, the last integral can be estimated as follows:

$$|I_\pm| \le \int_0^a \exp\left(-\frac{(a-t)(a+t)}{2}\right) dt \le \int_0^a \exp\left(-\frac{a}{2}\,(a-t)\right) dt = \ldots = \frac{2}{a}\left(1 - e^{-a^2/2}\right) = o(1) \qquad (a \to \infty) .$$

This proves $I_1 = I_0 + o(1)$ ($a \to \infty$); therefore from (9) and (10), by passing to the limit $a \to \infty$, we obtain

$$\hat g_0(\xi) = \frac{1}{\sqrt{2\pi}}\,e^{-\xi^2/2} .$$

We see that the special function $N_{1,0}$ has as its Fourier transform an identical copy of itself, but living on the $\xi$-axis.
Figure 2.8 ($\sigma = 1$, $\omega_0 = 5$)

We conclude the present example by computing the Fourier transform of the "wave train"

$$g(t) := N_{\sigma,0}(t)\,\cos(\omega_0 t) = \frac{1}{\sqrt{2\pi}\,\sigma}\,\exp\left(-\frac{t^2}{2\sigma^2}\right) \frac{e^{i\omega_0 t} + e^{-i\omega_0 t}}{2}$$

(see Figure 2.8). To this end we use our rules. First, one has $N_{\sigma,0} = \frac{1}{\sigma}\,D_\sigma g_0$, so rule (R3) gives

$$\hat N_{\sigma,0}(\xi) = \hat g_0(\sigma\xi) = \frac{1}{\sqrt{2\pi}}\,e^{-\sigma^2\xi^2/2} .$$

To this we apply rule (R2) and obtain

$$\hat g(\xi) = \frac{1}{2\sqrt{2\pi}} \left( e^{-\sigma^2(\xi - \omega_0)^2/2} + e^{-\sigma^2(\xi + \omega_0)^2/2} \right) .$$

We see that the Fourier transform of our "wave train" has peaks at the two points $\pm\omega_0$ of the $\xi$-axis, and these peaks become more and more pronounced as $\sigma$ increases, i.e., when the number of oscillations of frequency $\omega_0$ that in fact could be observed becomes larger and larger.

For additional formulas giving the Fourier transforms of special functions we refer the reader to the extensive tables in [13].
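The wave-train computation can be cross-checked numerically. This is a sketch only: the grids are assumptions of this illustration, the parameter values are those quoted for Figure 2.8, and the closed form used for comparison is the two-Gaussian expression obtained from rules (R3) and (R2).

```python
import numpy as np

sigma, omega0 = 1.0, 5.0                         # values quoted for Figure 2.8

t = np.linspace(-12.0, 12.0, 2401)               # time grid (assumed wide enough)
dt = t[1] - t[0]
xi = np.linspace(-10.0, 10.0, 401)               # frequency grid

# Wave train g(t) = N_{sigma,0}(t) * cos(omega0 * t)
g = np.exp(-t**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma) * np.cos(omega0 * t)

# Riemann sum for (1/sqrt(2*pi)) * int g(t) exp(-i*xi*t) dt
g_hat_numeric = (np.exp(-1j * np.outer(xi, t)) @ g) * dt / np.sqrt(2 * np.pi)

# Two Gaussian bumps centred at +omega0 and -omega0
g_hat_formula = (np.exp(-sigma**2 * (xi - omega0)**2 / 2)
                 + np.exp(-sigma**2 * (xi + omega0)**2 / 2)) / (2 * np.sqrt(2 * np.pi))

max_error = np.max(np.abs(g_hat_numeric - g_hat_formula))
peak_location = abs(xi[np.argmax(np.abs(g_hat_numeric))])
print("max deviation:", max_error, " peak at |xi| =", peak_location)
```

Increasing `sigma` in this sketch narrows the two bumps, which is the sharpening of the peaks described in the text.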
2.3 The Heisenberg uncertainty principle

We have noted at several places already that a time signal $f$ and its Fourier transform $\hat f$ cannot be simultaneously localized in a small domain of the $t$- resp. the $\xi$-axis.

• The scaling rule (R3) implies that the graph of $\hat f$ is stretched horizontally (and, in addition, flattened by vertical scaling) when the graph of $f$ is compressed horizontally.

• The Fourier transform of a pure oscillation cut off outside $\pm a$ has all of $\mathbb{R}$ as its support and is not even absolutely integrable for $|\xi| \to \infty$.

• A time signal with compact support cannot be bandlimited (see Section 2.4).

• Further observations can be made in the same vein, which the reader is invited to make on his own.
The phenomenon described here rather intuitively has found its quantitative expression in the famous Heisenberg uncertainty principle, a theorem of Fourier analysis that plays an important role in quantum mechanics. There the motion of a particle is described "abstractly" by a certain function $\psi \in \mathcal{S}$ (no connection with our wavelets) in the following way: The function $f_X(x) := |\psi(x)|^2$ is interpreted as the probability density for the position $X$ of this particle, considered as a random variable, and $f_P(\xi) := |\hat\psi(\xi)|^2$ is the corresponding density for its momentum $P$. The uncertainty principle states in the form of a precise inequality that these two densities cannot simultaneously have a single marked peak. Here we have tacitly assumed $\psi \in L^2$, and, for the probabilistic interpretation,

$$\|\psi\|^2 = \int f_X(x)\,dx = 1 .$$

The quantity

$$\|x\psi\|^2 := \int x^2\,|\psi(x)|^2\,dx$$

is the expectation of the random variable $X^2$ and consequently a measure for the horizontal spread of the function $\psi$. Analogously, the integral

$$\|\xi\hat\psi\|^2 := \int \xi^2\,|\hat\psi(\xi)|^2\,d\xi$$

can be regarded as a measure of the spread of $\hat\psi$ over the $\xi$-axis. In terms of these quantities, the Heisenberg uncertainty principle can be formulated as follows:

(2.14) Let $\psi$ be an arbitrary function in $L^2$. Then

$$\|x\psi\| \cdot \|\xi\hat\psi\| \ge \frac12\,\|\psi\|^2 , \tag{1}$$

the left-hand side being allowed to assume the value $\infty$. The equality sign is valid exactly for the constant multiples of the functions $x \mapsto e^{-cx^2}$, $c > 0$.

If $\|x\psi\| = \infty$ or $\|\xi\hat\psi\| = \infty$, then there is nothing to prove. In this case at least one of the two functions $\psi$ and $\hat\psi$ is definitely "very spread out". Therefore we may assume that the left-hand side of (1) is finite and prove this inequality first for functions $\psi \in \mathcal{S}$. Under this additional hypothesis all convergence questions are moved out of the way; in particular, we have $\lim_{x\to\pm\infty} x\,|\psi(x)|^2 = 0$.
The Fourier transform $\hat\psi$ may be eliminated from (1) by means of rule (R4) and Parseval's formula (2.11). One has

$$\|\xi\hat\psi\| = \|\widehat{\psi'}\| = \|\psi'\| ,$$

from which it follows that the stated inequality (1) is equivalent to

$$\|x\psi\| \cdot \|\psi'\| \ge \frac12\,\|\psi\|^2 . \tag{2}$$

Now by Schwarz' inequality 2.2.(4), we have

$$\|x\psi\| \cdot \|\psi'\| \ge |\langle x\psi, \psi'\rangle| \ge |\mathrm{Re}\,\langle x\psi, \psi'\rangle| . \tag{3}$$

Here the right-hand side can be computed as follows:

$$2\,\mathrm{Re}\,\langle x\psi, \psi'\rangle = \langle x\psi, \psi'\rangle + \langle \psi', x\psi\rangle = \int x\,\bigl(\psi(x)\,\overline{\psi'(x)} + \psi'(x)\,\overline{\psi(x)}\bigr)\,dx = x\,|\psi(x)|^2\,\Big|_{-\infty}^{\infty} - \int |\psi(x)|^2\,dx = -\|\psi\|^2 .$$
If we insert this on the right side of (3), the inequality (2) follows. To finish up the proof we have to get rid of the assumption $\psi \in \mathcal{S}$. Since $\mathcal{S}$ is dense in $L^2$, a simple approximation argument (which we leave as an exercise) will do the job. One has equality in (1), if and only if both $\ge$ relations in (3) are in fact equalities, and for this to be valid it is necessary, in the first place, that the two vectors $x\psi$ and $\psi' \in L^2$ are linearly dependent. So there has to be a $\mu + i\nu \in \mathbb{C}$ with

$$\psi'(x) \equiv (\mu + i\nu)\,x\,\psi(x) \qquad (x \in \mathbb{R}) . \tag{4}$$

The solutions of this differential equation are given by

$$\psi(x) = C\,\exp\left(\frac{\mu + i\nu}{2}\,x^2\right) ,$$

and such a $\psi$ is an element of $L^2$ if and only if $\mu =: -c$ is negative. For the second $\ge$ in (3) to be an equality, $\langle x\psi, \psi'\rangle$ has to be real. Together with (4) we are led to the condition

$$\langle x\psi, \psi'\rangle = (\mu - i\nu)\,\|x\psi\|^2 \in \mathbb{R} ,$$

so $\nu$ has to be zero.

According to this theorem, the two functions $\psi$, $\hat\psi$ cannot simultaneously be sharply localized at $x := 0$, $\xi := 0$: At least one of the numbers $\|x\psi\|^2$ and $\|\xi\hat\psi\|^2$ is $\ge \|\psi\|^2/2$. Of course the same is true for an arbitrary pair $(x_0, \xi_0)$ instead of $(0,0)$:

(2.15) For any $\psi \in L^2$ and arbitrary $x_0 \in \mathbb{R}$, $\xi_0 \in \mathbb{R}$ one has

$$\|(x - x_0)\,\psi\| \cdot \|(\xi - \xi_0)\,\hat\psi\| \ge \frac12\,\|\psi\|^2 .$$

Here $\|(x - x_0)\,\psi\|$ resp. $\|(\xi - \xi_0)\,\hat\psi\|$ denote the following quantities:

$$\left( \int (x - x_0)^2\,|\psi(x)|^2\,dx \right)^{1/2} \quad\text{resp.}\quad \left( \int (\xi - \xi_0)^2\,|\hat\psi(\xi)|^2\,d\xi \right)^{1/2} .$$
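We bring the auxiliary function

$$g(t) := e^{-i\xi_0 t}\,\psi(t + x_0)$$

into play and compute

$$\|g\|^2 = \int |\psi(t + x_0)|^2\,dt = \|\psi\|^2 , \qquad \|tg\|^2 = \int t^2\,|\psi(t + x_0)|^2\,dt = \int (x - x_0)^2\,|\psi(x)|^2\,dx .$$

Writing $g$ in the form

$$g(t) = e^{-i\xi_0 t}\,h(t) , \qquad h(t) := \psi(t + x_0) ,$$

and with the help of rules (R2) and (R1), we deduce that

$$\hat g(\tau) = \hat h(\tau + \xi_0) = e^{i x_0 (\tau + \xi_0)}\,\hat\psi(\tau + \xi_0) .$$

This implies

$$\|\tau\hat g\|^2 = \int \tau^2\,|\hat\psi(\tau + \xi_0)|^2\,d\tau = \int (\xi - \xi_0)^2\,|\hat\psi(\xi)|^2\,d\xi .$$

If we now apply (2.14) to the function $g$ and insert the values obtained for $\|g\|$, $\|tg\|$ and $\|\tau\hat g\|$, we arrive at the stated formula.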
24 The Shannon sampling theorem
53
2.4 The Shannon sampling theorem The Shannon sampling theorem gives a surprising answer to the following question: Is it possible to reconstruct a time signal f from discrete values (J(kT) IkE Z) completely, i.e., for all values of the continuous variable t? Without further assumptions about f the answer to this question of course has to be no, for in the open intervals between the sample points kT the graph of f could be filled in more or less arbitrarily. The sampling theorem has an interesting history; see [9J for a very readable account. The fact is that the series representation given by Shannon's theorem had been known long before Shannon by the name of cardinal series. :\. function fELl is called nbandlimited if its Fourier transform vanishes :dentically for I~I > 0.:
1
(I~I
> 0.) .
Shannon's theorem states that an nbandlimited function can be reconstructed completely from its values
(J(kT)
IkE Z) ,
T·
n, 7r
(1)
sampled at the discrete points kT. By "completely" we mean that at all points t E JR we get back the exact original value f(t). Now this might come as a surprise, but a moment's reflection shows that it is not so surprising after all: A bandlimited time signal f is automatically an entire holomorphic function of the complex variable t (cf. the corresponding statement about the Fourier transform of time signals having compact support), and it is well known that such a function is determined on all of C by giving its values on a comparatively "modest" set. So uniqueness follows from general principles, but Shannon's theorem even gives a formula for f. In (1) a certain rigid relation between the bandwidth 0. and the sampling lllterval T is stipulated. There is a lot to be said about that, and we shall come back to this matter later on. For the moment, the following will suffice: All harmonic components e~ actually occurring in f have a period length ~ 27r /0.. Thus, by requiring T := 7r /0., one makes sure that any pure oscillation possibly present in f would be sampled at least twice per period. Here is the sampling theorem (Figure 2.9):
(2.16) Let the continuous function $f: \mathbb{R} \to \mathbb{C}$ be $\Omega$-bandlimited and assume that $f$ satisfies an estimate of the form

$$f(t) = O\left(\frac{1}{|t|^{1+\varepsilon}}\right) \qquad (t \to \pm\infty) . \tag{2}$$

Let $T := \pi/\Omega$. Then

$$f(t) = \sum_{k=-\infty}^{\infty} f(kT)\,\mathrm{sinc}\bigl(\Omega(t - kT)\bigr) \qquad (t \in \mathbb{R}) . \tag{3}$$

Figure 2.9

The formal series appearing in (3) is called the cardinal series in the literature. Because the sinc-function is bounded on $\mathbb{R}$, the assumption (2) guarantees that the cardinal series is uniformly convergent on $\mathbb{R}$ and so represents a function $\tilde f$ that is continuous on all of $\mathbb{R}$. The relations $\mathrm{sinc}(k\pi) = \delta_{0k}$ imply that the function $\tilde f$ automatically interpolates the given values $f(kT)$. This means that the cardinal series can be used as a continuous interpolant of the given data $(f(kT) \mid k \in \mathbb{Z})$ even in cases where $f$ is not bandlimited. From what was said above about $f$, it is no restriction of generality to assume right from the start that $f$ is continuous. The assumption (2) could be weakened.
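The cardinal series (3) can be tried out on a concrete bandlimited signal. What follows is a sketch under assumptions: the test signal $\mathrm{sinc}(\Omega t/2)^2$ (whose Fourier transform is a triangle supported in $[-\Omega, \Omega]$ and which decays like $1/t^2$), the value $\Omega = \pi$ and the truncation index `K` are choices made for this illustration.

```python
import numpy as np

def sinc(x):
    """sinc as defined in the text: sin(x)/x, with sinc(0) = 1."""
    return np.sinc(np.asarray(x) / np.pi)

Omega = np.pi                 # bandwidth (assumed value)
T = np.pi / Omega             # sampling interval from (1); here T = 1

def f(t):
    # Omega-bandlimited test signal with O(1/t^2) decay, so assumption (2) holds
    return sinc(Omega * np.asarray(t) / 2.0) ** 2

K = 500                       # truncation of the series to |k| <= K
k = np.arange(-K, K + 1)
samples = f(k * T)

def cardinal_series(t):
    """Truncated cardinal series: sum over |k| <= K of f(kT) * sinc(Omega * (t - kT))."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    return sinc(Omega * (t[:, None] - k[None, :] * T)) @ samples

t_test = np.linspace(-3.0, 3.0, 121)
max_error = np.max(np.abs(cardinal_series(t_test) - f(t_test)))
print("max reconstruction error on [-3, 3]:", max_error)
```

The remaining error in this sketch comes only from truncating the series; it shrinks as `K` grows.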
Because of (2) the function $f$ is in $L^1 \cap L^2$ and has a continuous Fourier transform $\hat f$ by (2.9). Since $\hat f$ vanishes for $|\xi| > \Omega$, it is in $L^1$ as well, and the right side of the inversion formula (2.12) produces a continuous function $t \mapsto \tilde f(t)$ which coincides with $f$ almost everywhere, so is actually $\equiv f$:

$$f(t) = \frac{1}{\sqrt{2\pi}} \int \hat f(\xi)\,e^{it\xi}\,d\xi = \frac{1}{\sqrt{2\pi}} \int_{-\Omega}^{\Omega} \hat f(\xi)\,e^{it\xi}\,d\xi \qquad (t \in \mathbb{R}) . \tag{4}$$
Since is continuous, one has 1 n are filtered out, so that the cardinal series would essentially produce the function 
1
f:=
f(C
y
27r
111 d~ f(~) ~ e~ . 11
Unfortunately, this conjecture is false. In reality, a new phenomenon occurs. It is called aliasing and is a nuisance in various fields of technology (telephone communications, computer tomography, etc.), where discretization of analog phenomena is an essential ingredient. Things become more clear when we now consider an $f$ that is only "moderately" undersampled. We take $\Omega < \Omega' < 3\Omega$ and assume that $\hat f(\xi) \equiv 0$ for $|\xi| > \Omega'$. Then we can write (cf. (4))

$$f(kT) = \frac{1}{\sqrt{2\pi}} \left( \int_{-3\Omega}^{-\Omega} + \int_{-\Omega}^{\Omega} + \int_{\Omega}^{3\Omega} \right) \hat f(\xi)\,e^{ikT\xi}\,d\xi .$$

If we make the substitution $\xi := \xi' \pm 2\Omega$ ($-\Omega \le \xi' \le \Omega$) in the two exterior integrals on the right, then $e^{ikT\xi} = e^{ikT\xi'}$ (because of $2\Omega T = 2\pi$), and we obtain

$$f(kT) = \frac{1}{\sqrt{2\pi}} \int_{-\Omega}^{\Omega} \bigl( \hat f(\xi) + \hat f(\xi - 2\Omega) + \hat f(\xi + 2\Omega) \bigr)\,e^{ikT\xi}\,d\xi . \tag{9}$$

This brings into the game the continuous function $g \in L^2$ whose Fourier transform is given by

$$\hat g(\xi) := \begin{cases} \hat f(\xi) + \hat f(\xi - 2\Omega) + \hat f(\xi + 2\Omega) & (-\Omega \le \xi \le \Omega) \\ 0 & (|\xi| > \Omega) \end{cases} \tag{10}$$

Because of (9), the function $g$ satisfies

$$g(kT) = \frac{1}{\sqrt{2\pi}} \int_{-\Omega}^{\Omega} \hat g(\xi)\,e^{ikT\xi}\,d\xi = f(kT) \qquad (k \in \mathbb{Z}) .$$

We realize that $g$ has the same cardinal series as $f$, but $g$ is, contrary to $f$, truly $\Omega$-bandlimited. This implies that the common cardinal series of $f$ and $g$ represents not $f$ but $g$, and we are led to the following general conclusion: If the true bandwidth $\Omega'$ of $f$ is larger than the Nyquist frequency $\Omega := \pi/T$, then the high frequency parts of $f$ are not simply filtered out or "forgotten" by the cardinal series, but they appear therein, afflicted with a mysterious frequency shift. The cardinal series produces an $\Omega$-bandlimited function $g$ whose Fourier transform $\hat g$ is given by (10) and is shown in Figure 2.10.

Figure 2.10 Aliasing
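The frequency shift by $\pm 2\Omega$ appearing in (10) can be seen directly on samples of a single oscillation. The sketch below is only an illustration: a pure cosine is not in $L^1$, so it lies outside the hypotheses of the text, and the concrete frequencies are assumptions chosen here.

```python
import numpy as np

Omega = np.pi                              # Nyquist frequency belonging to T = 1
T = np.pi / Omega
k = np.arange(-50, 51)                     # sample indices

omega_true = 1.5 * Omega                   # lies in (Omega, 3*Omega): undersampled
omega_alias = omega_true - 2.0 * Omega     # shifted by 2*Omega into [-Omega, Omega]

samples_true = np.cos(omega_true * k * T)
samples_alias = np.cos(omega_alias * k * T)

# On the grid kT the two oscillations cannot be told apart.
indistinguishable = np.allclose(samples_true, samples_alias)
print("samples coincide:", indistinguishable)
```

Any interpolation built from these samples alone, the cardinal series included, therefore returns the low-frequency oscillation.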
While undersampling leads, as we have seen, to the undesirable effect of aliasing, the skillful deployment of oversampling can be used to improve the rate of convergence. We now show how this can be realized. Let a sampling rate $T^{-1}$ be given and let $\Omega := \pi/T$ be the corresponding Nyquist frequency. We assume that the signals $f$ taken into consideration are $\Omega'$-bandlimited for some $\Omega' < \Omega$. Let the auxiliary function $q \in L^2$ be defined by giving its Fourier transform:

$$\hat q(\xi) := \begin{cases} 1 & (|\xi| \le \Omega') \\[0.5ex] \dfrac12 \left( 1 - \sin\dfrac{\pi\,(2|\xi| - \Omega - \Omega')}{2\,(\Omega - \Omega')} \right) & (\Omega' \le |\xi| \le \Omega) \\[1.5ex] 0 & (|\xi| \ge \Omega) \end{cases}$$

Note that $q$ is, apart from the parameter values $\Omega$ and $\Omega'$, independent of $f$. Figure 2.11 shows the graphs of $\hat q$ and of a typical $\hat f$ under consideration.

Figure 2.11

The signal $f$ satisfies the assumptions of theorem (2.16), therefore (8) is valid, and we may write

$$\hat f(\xi) = \frac{\sqrt{2\pi}}{2\Omega} \sum_{k=-\infty}^{\infty} f(kT)\,e^{-ikT\xi} \qquad (|\xi| \le \Omega) .$$

Furthermore, we know that $\hat f$ is identically zero for $\Omega' \le |\xi| \le \Omega$. In the interval $|\xi| \le \Omega'$ we have $\hat q(\xi) \equiv 1$. This implies that, starting with (4), we may do the following computation:

$$f(t) = \frac{1}{\sqrt{2\pi}} \int_{-\Omega}^{\Omega} \hat f(\xi)\,\hat q(\xi)\,e^{it\xi}\,d\xi = \frac{1}{2\Omega} \sum_{k=-\infty}^{\infty} f(kT) \int_{-\Omega}^{\Omega} \hat q(\xi)\,e^{i(t - kT)\xi}\,d\xi .$$
Using the abbreviation

$$\frac{1}{2\Omega} \int_{-\Omega}^{\Omega} \hat q(\xi)\,e^{is\xi}\,d\xi =: Q(s) , \tag{11}$$

we see that the cardinal series (3) has been transformed into the novel representation

$$f(t) = \sum_{k=-\infty}^{\infty} f(kT)\,Q(t - kT) . \tag{12}$$

In order to be able to judge the announced improvement in convergence we need the "universal" (i.e., independent of $f$) function $Q$ in explicit form. Since $\hat q$ is an even function, the integral (11) is computed as follows:

$$Q(s) = \frac{1}{2\Omega} \int_{-\Omega}^{\Omega} \hat q(\xi)\,\cos(s\xi)\,d\xi = \frac{1}{\Omega} \left( \int_0^{\Omega'} \cos(s\xi)\,d\xi + \int_{\Omega'}^{\Omega} \ldots\,\cos(s\xi)\,d\xi \right) = \frac{\pi^2}{2\Omega s} \cdot \frac{\sin(\Omega' s) + \sin(\Omega s)}{\pi^2 - (\Omega - \Omega')^2 s^2} .$$

From this, we immediately deduce

$$Q(s) = O\bigl(|s|^{-3}\bigr) \qquad (|s| \to \infty) .$$

Let us consider an example. Oversampling the time signal $f$ twice means $\Omega' = \frac12\,\Omega$. Imagine that we want to reconstruct the signal $f$ in the $t$-interval $[0, T]$. For the comparison of (12) and (3) we have to estimate the order of magnitude of the factor $Q(t - kT)$ in (12) when $|k| \to \infty$. It is given by

$$\frac{2\pi^2}{2\Omega \cdot |k|T \cdot (\Omega/2)^2 (kT)^2} = \frac{4}{\pi}\,\frac{1}{|k|^3} .$$

In simplifying, we have used the relation $\Omega T = \pi$. Compare this with the cardinal series (3): The order of magnitude of the corresponding factor $\mathrm{sinc}(\Omega(t - kT))$ when $|k| \to \infty$ is much larger, namely

$$\frac{1}{\pi}\,\frac{1}{|k|} .$$

It follows that, using (3), one would have to take several times more terms into account as compared to (12) in order to guarantee the same level of precision.
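The closed form of $Q$ and the claimed gain in decay can be checked numerically. This is a sketch under assumptions: the function names, the parameter values $\Omega = \pi$, $\Omega' = \Omega/2$ and the trapezoidal quadrature are choices of this illustration; the closed form has removable singularities at $s = 0$ and $s = \pm\pi/(\Omega - \Omega')$, which the test points avoid.

```python
import numpy as np

Omega, Omega1 = np.pi, np.pi / 2.0        # Nyquist frequency and bandwidth Omega' (twofold oversampling)
T = np.pi / Omega

def q_hat(xi):
    """Fourier transform of q: 1 on |xi| <= Omega', sine-shaped roll-off up to Omega, 0 beyond."""
    a = np.abs(xi)
    roll_off = 0.5 * (1.0 - np.sin(np.pi * (2.0 * a - Omega - Omega1) / (2.0 * (Omega - Omega1))))
    return np.where(a <= Omega1, 1.0, np.where(a <= Omega, roll_off, 0.0))

def Q_numeric(s, n=200001):
    """Integral (11) by the trapezoidal rule; the imaginary part vanishes since q_hat is even."""
    xi = np.linspace(-Omega, Omega, n)
    d = xi[1] - xi[0]
    vals = q_hat(xi) * np.cos(s * xi)
    return d * (vals.sum() - 0.5 * (vals[0] + vals[-1])) / (2.0 * Omega)

def Q_closed(s):
    """Closed form of Q(s) as stated in the text."""
    return (np.pi**2 / (2.0 * Omega * s)
            * (np.sin(Omega1 * s) + np.sin(Omega * s))
            / (np.pi**2 - (Omega - Omega1) ** 2 * s**2))

test_points = [0.3, 1.7, 5.0, 10.0]
max_error = max(abs(Q_numeric(s) - Q_closed(s)) for s in test_points)

# Size of the k-th term factor for t = T/2 and k = 50 in (12) and in (3).
s_far = 0.5 * T - 50 * T
q_term = abs(Q_closed(s_far))
sinc_term = abs(np.sinc(Omega * s_far / np.pi))   # sinc(Omega * s) in the text's convention
print(max_error, q_term, sinc_term)
```

In this sketch the factor from (12) is smaller than the sinc factor from (3) by several orders of magnitude at $k = 50$, in line with the $|k|^{-3}$ versus $|k|^{-1}$ comparison.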
3 The continuous wavelet transform
3.1 Definitions and examples

A function $\psi: \mathbb{R} \to \mathbb{C}$ satisfying the conditions

$$\|\psi\| = 1 \tag{1}$$

and

$$2\pi \int_{\mathbb{R}^*} \frac{|\hat\psi(a)|^2}{|a|}\,da =: C_\psi$$
0), one obtains the function

$$\psi_{a,b}(t) := \psi_a(t - b) = \frac{1}{|a|^{1/2}}\,\psi\left(\frac{t - b}{a}\right) , \tag{5}$$

appearing in the integral (4); see Figure 3.1. We obviously have

$$\|\psi_{a,b}\| = 1 \qquad \forall\,(a,b) \in \mathbb{R}^2_- .$$

Using the $\psi_{a,b}$ we can write the definition (4) of the wavelet transform in the form of a scalar product:

$$Wf(a,b) = \langle f, \psi_{a,b} \rangle . \tag{6}$$

Figure 3.1

This implies, first, that at each point $(a,b) \in \mathbb{R}^* \times \mathbb{R}$ the wavelet transform $Wf$ has a well determined value $Wf(a,b)$ and, second, by Schwarz' inequality, that $Wf$ is uniformly bounded on $\mathbb{R}^2_-$:

$$|Wf(a,b)| \le \|f\| \qquad \forall\,(a,b) \in \mathbb{R}^2_- . \tag{7}$$
We now compute the Fourier transforms of the functions $\psi_{a,b}$. According to rule (R3) we have

$$\widehat{\psi_a}(\xi) = |a|^{1/2}\,\hat\psi(a\xi) ,$$

whence we obtain by rule (R1), applied to (5):

$$\widehat{\psi_{a,b}}(\xi) = |a|^{1/2}\,e^{-ib\xi}\,\hat\psi(a\xi) . \tag{8}$$

On account of (2.11) (Parseval's formula) and (6) we therefore can write $Wf(a,b)$ in the following form:

$$Wf(a,b) = \langle \hat f, \widehat{\psi_{a,b}} \rangle = |a|^{1/2} \int \hat f(\xi)\,\overline{\hat\psi(a\xi)}\,e^{ib\xi}\,d\xi . \tag{9}$$

The last integral can be regarded as a Fourier integral; to be precise, it gives the Fourier$^\vee$-transform of the $L^1$-function

$$F_a(\xi) := \sqrt{2\pi}\,|a|^{1/2}\,\hat f(\xi)\,\overline{\hat\psi(a\xi)} , \tag{10}$$

written as a function of the variable $b$. Altogether, we have proven the following proposition:

(3.2) For fixed $a \ne 0$ the function

$$Wf(a, \cdot): \quad b \mapsto Wf(a,b)$$

can be regarded as the Fourier$^\vee$-transform of the function $F_a$, the latter given by (10).

Because of (2.9) one may conclude in particular that the function $Wf$ is continuous on horizontal lines $a = \mathrm{const.}$, and takes the limit $0$ when $b \to \pm\infty$, keeping $a$ fixed.
CD The function 1/J := 1/JHaar is obviously a wavelet in the sense of the general definition. If a > 0 then (b:::;to lal
J>o
Similarly, in the case
~
27f
< 0, the substitution da =
gives
da'
lIT
1 2 A Plancherel formula
73
:\ ow one continues as before:
A second look at the proof of theorem (3.3) shows that the bilinearity of the Plancherel formula with respect to the variables $f$ and $g$ permits a considerable generalization of the theorem: One may transform $f$ and $g$ by means of two different wavelets and still get a formula of type (3.3). This fact of course increases the flexibility of the wavelet transform both for the analysis and for the synthesis of time signals $f$.

(3.5) Let $\psi$ and $\chi$ be two wavelets and assume that the integral

$$2\pi \int_{\mathbb{R}^*} \frac{\overline{\hat\psi(a)}\,\hat\chi(a)}{|a|}\,da =: C_{\psi\chi} \tag{5}$$

is defined, i.e., finite. If $W_\psi$ and $W_\chi$ denote the wavelet transform with respect to $\psi$ and $\chi$, then the following is true for arbitrary $f$, $g \in L^2$:

$$\langle W_\psi f, W_\chi g \rangle = C_{\psi\chi}\,\langle f, g \rangle .$$

Repeat the proof of (3.3) with $F_a$ defined by 3.1.(10) as before, while $G_a$ obviously has to be replaced by

$$G_a(\xi) := \sqrt{2\pi}\,|a|^{1/2}\,\hat g(\xi)\,\overline{\hat\chi(a\xi)} .$$

We leave the details to the reader.

The formulas established in this section are best understood in the framework of topological groups and their representations. For a short but very readable presentation of this aspect see [L], Section 1.6.
3.3 Inversion formulas

The continuous wavelet transform encodes a given time signal, i.e., a function $f$ of one real variable $t$, as a function $Wf$ of two real variables $a$ and $b$. Instead of $\infty^1$ data we now have, so to speak, $\infty^2$ of them, and this means that $f$ is represented in the data $(Wf(a,b) \mid (a,b) \in \mathbb{R}^2_-)$ with very high redundancy. It will come as no surprise that this circumstance greatly facilitates the reconstruction of the original signal $f$ from $Wf$. As a matter of fact, there is not only one inversion formula, as with the Fourier transform, but in the end there is an arbitrary number of such formulas. We shall see in the next chapter that even an appropriate discrete collection of values

suffices to restore $f$ completely; in other words, there is also a kind of Shannon theorem for the wavelet transform. In purely set theoretic terms the set $\mathbb{R}^2_-$ has "the same number" of points as $\mathbb{R}$, and consequently there are "equally many" functions of the form $u: \mathbb{R}^2_- \to \mathbb{C}$ as there are functions $f: \mathbb{R} \to \mathbb{C}$. Nevertheless, it is beyond question that not every theoretically possible set of data $(u(a,b) \mid (a,b) \in \mathbb{R}^2_-)$ can actually occur as a wavelet transform of some function $f \in L^2$. This means that the values $Wf(a,b)$ of genuine wavelet transforms must be intercorrelated in an as yet mysterious way. We shall come back to this point in Section 3.4.

We will need the following regularization lemma:
(3.6) Let

$$g_\sigma(t) := \frac{1}{\sqrt{2\pi}\,\sigma}\,\exp\left(-\frac{t^2}{2\sigma^2}\right)$$

denote the normal distribution with standard deviation $\sigma$, and assume that the function $f \in L^1$ is continuous at some given point $x$. Then

$$\lim_{\sigma\to 0+} (f * g_\sigma)(x) = f(x) .$$

Let an $\varepsilon > 0$ be given. There is an $h > 0$ (not dependent on $\sigma$) with

$$|f(x - t) - f(x)| < \varepsilon \qquad (|t| \le h) .$$

Because of $\int g_\sigma(t)\,dt = 1$ we may write

$$(f * g_\sigma)(x) - f(x) = \int \bigl( f(x - t) - f(x) \bigr)\,g_\sigma(t)\,dt ,$$

which can be estimated as follows:

$$|(f * g_\sigma)(x) - f(x)| \le \int_{|t| \le h} |f(x - t) - f(x)|\,g_\sigma(t)\,dt + \int_{|t| \ge h} \bigl( |f(x - t)| + |f(x)| \bigr)\,g_\sigma(t)\,dt$$

$$\le \varepsilon \int_{-h}^{h} g_\sigma(t)\,dt + \|f\|_1\,g_\sigma(h) + |f(x)| \int_{|t| \ge h} g_\sigma(t)\,dt .$$

Here the first integral on the right hand side has a value $< 1$, and $g_\sigma(h)$ as well as the last integral tend to $0$ with $\sigma \to 0+$; see Figure 3.8. Thus one can find a $\sigma_0$ so that for all $\sigma < \sigma_0$ the following is true:

$$|(f * g_\sigma)(x) - f(x)| < 2\varepsilon .$$

Since $\varepsilon > 0$ was arbitrary, the proof is complete.
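Lemma (3.6) can be watched at work on a simple example. The sketch below rests on assumptions made for this illustration: $f$ is taken to be the indicator function of $[0,1]$, the convolution integral is evaluated by the midpoint rule, and the function names are hypothetical.

```python
import numpy as np

def g(t, sigma):
    """Normal density g_sigma(t) with standard deviation sigma."""
    return np.exp(-t**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)

def smoothed_indicator(x, sigma, n=200000):
    """(f * g_sigma)(x) for f = indicator of [0, 1], by the midpoint rule on [0, 1]."""
    h = 1.0 / n
    t_mid = h * (np.arange(n) + 0.5)
    return np.sum(g(x - t_mid, sigma)) * h

sigmas = [0.5, 0.1, 0.02]

# x = 0.5 is a point of continuity with f(0.5) = 1: the values approach 1.
at_continuity_point = [smoothed_indicator(0.5, s) for s in sigmas]

# x = 0 is a jump of f: the values approach 1/2, the mean of the one-sided limits.
at_jump = [smoothed_indicator(0.0, s) for s in sigmas]

print(at_continuity_point)
print(at_jump)
```

The second list shows why continuity at $x$ is assumed in the lemma: at a jump the regularized values converge, but not to a value of $f$ one could single out in advance.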
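We note as an addendum the following identity, valid for arbitrary $f \in L^2$:

$$(f * g_\sigma)(x) = \langle f, T_x g_\sigma \rangle . \tag{1}$$

The left hand side of (1) is by definition equal to $\int f(t)\,g_\sigma(x - t)\,dt$, but the same is true for the right hand side, since $g_\sigma$ is a real symmetric (i.e., even) function.

Figure 3.8

The Plancherel formula (3.3) can be written as follows:

$$\langle f, g \rangle = \frac{1}{C_\psi} \int_{\mathbb{R}^2_-} Wf(a,b)\,\langle \psi_{a,b}, g \rangle\,\frac{da\,db}{|a|^2} . \tag{2}$$

Letting $g := T_x g_\sigma$ this becomes

$$\langle f, T_x g_\sigma \rangle = \frac{1}{C_\psi} \int_{\mathbb{R}^2_-} Wf(a,b)\,\langle \psi_{a,b}, T_x g_\sigma \rangle\,\frac{da\,db}{|a|^2} ,$$

so that by means of (1) we obtain the formula

$$(f * g_\sigma)(x) = \frac{1}{C_\psi} \int_{\mathbb{R}^2_-} Wf(a,b)\,(\psi_{a,b} * g_\sigma)(x)\,\frac{da\,db}{|a|^2} . \tag{3}$$

We now let $\sigma \to 0+$ on both sides of (3) and use Lemma (3.6). This leads to the following reconstruction formula for our time signal $f$:

(3.7) Let $x$ be a point of continuity of the time signal $f$. Under suitable assumptions about $f$ and $\psi$ one has the equality

$$f(x) = \frac{1}{C_\psi} \int_{\mathbb{R}^2_-} Wf(a,b)\,\psi_{a,b}(x)\,\frac{da\,db}{|a|^2} . \tag{4}$$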
Performing the limit under the integral sign in (3) is quite subtle. For a complete proof we refer the reader to [D], Proposition 2.4.2.

Formula (4) can be viewed "abstractly" as saying

$$f = \frac{1}{C_\psi} \int_{\mathbb{R}^2_-} \frac{da\,db}{|a|^2}\;Wf(a,b)\,\psi_{a,b} . \tag{5}$$

Written in this form it represents the original signal $f$ as a superposition ("linear combination") of wavelet functions $\psi_{a,b}$, the values $Wf(a,b)$ of the wavelet transform serving as coefficients. By the way, the validity of (5) in the so-called "weak sense" can be regarded as an immediate consequence of the Plancherel formula (3.3). We are referring here to the following functional-analytic hocus-pocus: Any vector $f \in L^2$ possesses a second ("weak") personality in the form of a continuous conjugate-linear functional, to wit

$$g \mapsto \langle f, g \rangle ;$$

and any continuous conjugate-linear functional $\phi: L^2 \to \mathbb{C}$ belongs to a well determined $f$. If we now look at the Plancherel formula in the form (2) for a fixed $f$ and variable $g \in L^2$, then it says no more and no less than

$$\langle f, \cdot \rangle = \frac{1}{C_\psi} \int_{\mathbb{R}^2_-} d\mu\;Wf(a,b)\,\langle \psi_{a,b}, \cdot \rangle .$$

This can be expressed in words as follows: The "weak version" of $f$ is retrieved from $Wf$ by superimposing the functionals $\langle \psi_{a,b}, \cdot \rangle$, using the values $Wf(a,b)$ as coefficients. The formal agreement with (5) is evident. From the two variants (3.4) and (3.5) of the Plancherel formula one derives in the same way the following reconstruction formulas:
(3.8) Under suitable regularity assumptions one has
r
1 dadb f(x) = C~ JlR~ Wf(a, b) V;a,b(X)
W '
if'1f; satisfies the symmetry condition 3.2.(4), and similarly
f(x) =
1 c 1/Jx
1 lR':..
W1/Jf(a, b) Xa,b(X)
dadb I 12 ' a
iftbe quantity C1/Jx, see 3.2.(5), is defined. The last formula can be read as
It performs the reconstruction of f using a different set of wavelet functions from the ones previously used for the analysis of f. We shall encounter analysissynthesispairings of this kind a second time in connection with the discretized version of the wavelet transform.
3.4 The kernel function

Formula 3.3.(5) can be paraphrased in the following way: The mapping

$$f \mapsto \frac{1}{C_\psi} \int_{\mathbb{R}^2_-} \frac{da\,db}{|a|^2}\;\langle f, \psi_{a,b} \rangle\,\psi_{a,b} \tag{1}$$

is the identity. If in this connection people talk about a resolution of the identity, then this is to be understood in an almost chemical sense: The map $\mathrm{id}: L^2 \to L^2$ is first resolved into its $(a,b)$-constituents and in the end recrystallized in the integral 3.3.(5) resp. (1). Resolutions of the identity are encountered already on a very elementary level: If $(e_1, \ldots, e_n)$ is an orthonormal basis of the euclidean $\mathbb{R}^n$, then the formula

$$x = \sum_{k=1}^{n} \langle x, e_k \rangle\,e_k$$

is valid identically in $x \in \mathbb{R}^n$; in other words, the mapping

$$x \mapsto \sum_{k=1}^{n} \langle x, e_k \rangle\,e_k$$

is the identity. There is, however, an essential difference relative to 3.3.(5) resp. (1): The vectors $e_k$ ($1 \le k \le n$) are linearly independent, but the functions $\psi_{a,b}$ ($a \in \mathbb{R}^*$, $b \in \mathbb{R}$) are not. In Sections 4.1 and 4.2 we shall study these matters once again and in a more general setting. For the moment we stay with $H := L^2(\mathbb{R}^2_-, d\mu)$. From (3.3) we infer
IIW/II :::; .;0;11/11 showing that the wavelet transform W: L2 ) H is a continuous map. Let
be the image space. In the case at hand there is an inverse mapping
the inverse W 1 being given (at least formally), according to 3.3.(5), by
3.4 The kernel function
79
The space U consisting of all wavelet transforms Wj, j E L2, is a proper subspace of H. We know, e.g., that the functions u E U have a well determined value at all points (a, b) E JR.:', and each individual u E U is globally bounded owing to 3.1.(7):
Ilull oo
:=
sup{u(a,b) I (a,b)
E JR.~}
(d) With the help of (10) one obtains the following expression for the Gram operator $\tilde G$ belonging to the collection $\tilde a_\bullet$:

$$\tilde G := \tilde T^* \tilde T = G^{-1}\,T^* T\,G^{-1} = G^{-1} .$$

This implies $\tilde{\tilde a}_j := \tilde G^{-1} \tilde a_j = G\,\tilde a_j = a_j$ for all $j$, as stated.

If $r > n := \dim(X)$, then the $\tilde a_j$ are linearly dependent, so there have to be infinitely many representations of a given vector $x \in X$ as a linear combination of the $\tilde a_j$. Among these the representation (4.4)(a) is distinguished as follows:

(4.5) Let $a_\bullet$ and $\tilde a_\bullet$ be dual frames, and let $x = \sum_{j=1}^{r} \xi_j\,\tilde a_j$ be an arbitrary representation of the vector $x \in X$ as a linear combination of the $\tilde a_j$. Then

$$\sum_{j=1}^{r} |\xi_j|^2 \ge \sum_{j=1}^{r} |\langle x, a_j \rangle|^2 ,$$

the equality sign holding only if $\xi_j = \langle x, a_j \rangle$ for $1 \le j \le r$.

Consider the point $(\xi_1, \ldots, \xi_r) =: y \in Y$. According to (4.4)(b) one has $x = By$, and (4.3) implies $Tx = TBy = P_U\,y$. This at once leads to

$$\sum_{j=1}^{r} |\langle x, a_j \rangle|^2 = \|Tx\|^2 = \|P_U\,y\|^2 \le \|y\|^2 = \sum_{j=1}^{r} |\xi_j|^2 .$$

Here we can have equality only if $y = P_U\,y = Tx$. Expressing these geometric facts in terms of coordinates one obtains the statements of the theorem.

The content of Theorem (4.5) can be expressed in this way: The "natural" representation (4.4)(a) uses the least amount of "coefficient energy".
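Theorem (4.5) can be illustrated with a small redundant family in $\mathbb{R}^2$. The sketch below uses assumptions of its own: three unit vectors at mutual angles of $120°$ serve as the $a_j$, the dual vectors are computed as $\tilde a_j = G^{-1} a_j$ with $G = T^*T$, and all variable names are hypothetical.

```python
import numpy as np

# Three unit vectors a_1, a_2, a_3 in R^2 at mutual angles of 120 degrees (rows of A).
angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
A = np.stack([np.cos(angles), np.sin(angles)], axis=1)      # shape (3, 2)

G = A.T @ A                                # Gram operator T*T on X = R^2
A_dual = A @ np.linalg.inv(G)              # rows are the dual vectors G^{-1} a_j (G is symmetric)

x = np.array([0.3, -1.2])

xi_natural = A @ x                         # coefficients <x, a_j>
x_from_natural = A_dual.T @ xi_natural     # sum_j <x, a_j> * dual_j

# Another representation of the same x: the a_j sum to zero, so adding a
# constant vector to the coefficients does not change the synthesized vector.
xi_other = xi_natural + 0.5 * np.ones(3)
x_from_other = A_dual.T @ xi_other

energy_natural = np.sum(xi_natural**2)
energy_other = np.sum(xi_other**2)
print(x_from_natural, x_from_other, energy_natural, energy_other)
```

Both coefficient vectors synthesize the same `x`, but the one made of the scalar products $\langle x, a_j \rangle$ has the smaller sum of squares, as the theorem asserts.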
4.2 The general notion of a frame

The geometrical (and finite-dimensional) analysis presented in the foregoing section served to prepare us for the following general dispositions:

X is a complex Hilbert space whose vectors we denote by $f$, $h$ and similar letters. One should imagine X being infinite-dimensional.

M is an "abstract" set of points $m$. On the set M a measure $\mu$ is defined that assigns each measurable subset $E \subset M$ its "mass" or "volume" $\mu(E) \in [0, \infty]$. The measurable subsets form a so-called $\sigma$-algebra $\mathcal{F}$, and care is taken that any "reasonable" subset $E \subset M$ belongs to $\mathcal{F}$. According to general principles it is then possible to set up an integral calculus for functions on M, and it makes sense, e.g., to speak about the Hilbert space $Y := L^2(M, \mu)$. The pair $(M, \mu)$ is the abstraction of the pair $(\{1, 2, \ldots, r\}, \#)$ that played such a prominent role in the last section.

Furthermore, a family $h_\bullet := (h_m \mid m \in M)$ of vectors $h_m \in X$ is given, the measure space M serving as index set for this family. The $h_m$ are (analogous to the $a_j$ of Section 4.1) to be viewed as "measuring probes", by means of which we want to explore the individual vectors $f \in X$ as completely as possible. In Section 1.5 we tentatively spoke of "key patterns" when actually the same "measuring probes" were meant. The fact is, for a given $f \in X$, one gets hold (numerically, experimentally, conceptually, or otherwise) of the family of all scalar products
$$Tf(m) := \langle f, h_m\rangle \qquad (m \in M) .$$
In this way one obtains an array $(Tf(m) \mid m \in M)$ that is nothing other than a function $Tf\colon M \to \mathbb{C}$. The integral installed on M now enables us to quantify the yield of our measuring efforts: The $L^2$-integral
$$\|Tf\|^2 := \int_M |Tf(m)|^2 \, d\mu(m) \qquad (1)$$
is obviously a natural measure for the amount of information so collected about $f$. This brings us to the following definition: The family $h_\bullet$ is a frame, if the following conditions are satisfied:

• the function $Tf$ is $\mu$-measurable for all $f \in X$, so that the integral (1) is always defined;
• there are constants $B \ge A > 0$ such that
$$A\,\|f\|^2 \;\underset{(a)}{\le}\; \|Tf\|^2 \;\underset{(b)}{\le}\; B\,\|f\|^2 \qquad \forall f \in X .$$
Here the inequality (b) guarantees that the frame operator
$$T\colon\ X \to \mathbb{C}^M, \qquad f \mapsto Tf$$
is a bounded operator from X to $Y := L^2(M, \mu)$. The inequality (a), in most cases the crucial one of the two, serves to make sure that T is injective, signifying that no information is lost in the process $f \mapsto Tf$.

While we are at it, we proceed to explain the related notion of a "Riesz basis", which will play a certain role in connection with the discrete wavelet transform later on. Here the set M is countable to start with, and $\mu$ is the counting measure $\#$ on M. A family $h_\bullet = (h_m \mid m \in M)$ of vectors $h_m \in X$ is called a Riesz basis of X if the following conditions are satisfied:

• $\overline{\operatorname{span}}(h_\bullet) = X$;
• there are constants $B \ge A > 0$ such that
$$A \sum_m |\xi_m|^2 \;\underset{(c)}{\le}\; \Bigl\| \sum_m \xi_m h_m \Bigr\|^2 \;\le\; B \sum_m |\xi_m|^2 \qquad \forall\, \xi_\bullet \in l^2(M) . \qquad (2)$$

Altogether, these conditions say that the mapping
$$K\colon\ l^2(M) \to X, \qquad \xi_\bullet \mapsto \sum_m \xi_m h_m$$
is a bounded operator having a bounded inverse $K^{-1}\colon X \to l^2(M)$.
The relation between the two concepts "frame" and "Riesz basis" is not obvious, because the two definitions speak about totally different things. Thus it is not a bad idea to prove the following proposition:

(4.6) A Riesz basis $h_\bullet$ with constants $B \ge A > 0$ is automatically a frame with A and B as frame constants.

Let $(e_m \mid m \in M)$ be the canonical orthonormal basis of $l^2(M)$. Then one has $K e_m = h_m$ and consequently
$$K^* x = \sum_m \langle K^* x, e_m\rangle\, e_m = \sum_m \langle x, K e_m\rangle\, e_m = \sum_m \langle x, h_m\rangle\, e_m = Tx$$
for all $x \in X$. By general principles of functional analysis the conditions (2) imply the analogous inequalities for $K^* = T$. This means that we also have
$$A\,\|x\|^2 \le \|Tx\|^2 \le B\,\|x\|^2 \qquad \forall x \in X .$$
The following somewhat vague statement is not so far from the truth: A Riesz basis is a countable frame whose vectors are linearly independent and stay so even "in the limit". To wit, the inequality (c) in (2) guarantees that it is impossible for a nontrivial linear combination $\sum_m \xi_m h_m$ to represent the zero vector.

In the finite-dimensional case the inverse $G^{-1}$ of the Gram operator and the dual frame $\tilde a_\bullet$ could be computed by inverting a certain matrix. In the case at hand, an operator
$$G\colon\ X \to X, \qquad \dim(X) = \infty ,$$
has to be inverted. This can be accomplished by means of an iteration procedure whose rate of convergence is tied to the quotient $B/A$. The nearer this quotient is to 1, the better the convergence of our procedure is. In fact, we shall prove the following:
(4.7) Assume that $h_\bullet$ is a frame for X with frame constants $B \ge A > 0$, and let $y \in X$ be an arbitrary vector. If the sequence $x_\bullet$ is recursively defined by
$$x_0 := 0, \qquad x_{n+1} := x_n + \frac{2}{A+B}\,(y - G x_n) \quad (n \ge 0) ,$$
then $\lim_{n\to\infty} x_n = G^{-1} y$.

In practice (that is to say, in the actual numerical computation of the frame vectors $\tilde a_j := G^{-1} a_j$), the described procedure is cut short as soon as the increments $(y - G x_n)$ become negligibly small.
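The iteration (4.7) is easy to try out in the finite-dimensional model of Section 4.1. The following Python sketch is only an illustration under stated assumptions: the three frame vectors in $\mathbb{R}^2$ are hypothetical sample data, and the helper names (`gram_matrix`, `frame_bounds`, `solve_by_iteration`) are made up for this example. The frame constants A and B are taken to be the extreme eigenvalues of the Gram matrix G.

```python
import math

# Hypothetical sample data (not from the text): three frame vectors in R^2.
frame = [(1.0, 0.0), (0.0, 2.0), (1.0, 1.0)]

def gram_matrix(vectors):
    """Matrix of G x = sum_j <x, a_j> a_j for real 2-vectors a_j."""
    g11 = sum(a[0] * a[0] for a in vectors)
    g12 = sum(a[0] * a[1] for a in vectors)
    g22 = sum(a[1] * a[1] for a in vectors)
    return ((g11, g12), (g12, g22))

def apply_matrix(g, x):
    return (g[0][0] * x[0] + g[0][1] * x[1], g[1][0] * x[0] + g[1][1] * x[1])

def frame_bounds(g):
    """Smallest and largest eigenvalue of the symmetric 2x2 matrix g."""
    mean = 0.5 * (g[0][0] + g[1][1])
    radius = math.hypot(0.5 * (g[0][0] - g[1][1]), g[0][1])
    return mean - radius, mean + radius

def solve_by_iteration(g, y, a_const, b_const, tol=1e-12, max_steps=10_000):
    """Iterate x_{n+1} = x_n + 2/(A+B) * (y - G x_n), starting from x_0 = 0."""
    relax = 2.0 / (a_const + b_const)
    x = (0.0, 0.0)
    for step in range(max_steps):
        gx = apply_matrix(g, x)
        increment = (y[0] - gx[0], y[1] - gx[1])
        if math.hypot(*increment) < tol:  # cut short once y - G x_n is negligible
            return x, step
        x = (x[0] + relax * increment[0], x[1] + relax * increment[1])
    return x, max_steps

G = gram_matrix(frame)
A, B = frame_bounds(G)
y = (1.0, 0.0)
x, steps = solve_by_iteration(G, y, A, B)
print("A =", A, "B =", B, "steps =", steps, "x =", x)
```

For this sample data the quotient $(B - A)/(B + A)$ is about 0.52, so each step roughly halves the error; a frame with B/A closer to 1 would need correspondingly fewer steps.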
We consider the auxiliary operator
$$R := I_X - \frac{2}{A+B}\, G .$$
In terms of R, the iteration formula can be rewritten as
$$x_{n+1} := \frac{2}{A+B}\, y + R\, x_n .$$
Now G is a positive definite selfadjoint operator, and by assumption on T we know that $A\, I_X \le G \le B\, I_X$ (such inequalities make sense in this case!). This implies
$$\Bigl\| G - \frac{A+B}{2}\, I_X \Bigr\| \le \frac{B-A}{2} ,$$
so that we get the following estimate for the norm of R:
$$\|R\| = \frac{2}{A+B}\, \Bigl\| G - \frac{A+B}{2}\, I_X \Bigr\| \le \frac{B-A}{B+A} = \frac{B/A - 1}{B/A + 1} < 1 .$$
By the contraction principle (i.e., the general fixed-point theorem) we can conclude now that $\lim_{n\to\infty} x_n =: x \in X$ exists, and furthermore that
$$x = \frac{2}{A+B}\, y + R\, x .$$
The last equation implies $y - Gx = 0$, whence $x = G^{-1} y$, as stated.
At this time we can see the following two applications of the concepts presented here: Number one, of course, the finite-dimensional model discussed in Section 4.1, and number two, the continuous wavelet transform as treated in Chapter 3. We are now going to review and interpret the latter in the functional analytic framework (!) set up in this section.

X is the space $L^2(\mathbb{R})$ of time signals $f$, and M is the set
$$\mathbb{R}^2_- := \{(a, b) \mid a \in \mathbb{R}^*,\ b \in \mathbb{R}\} ,$$
provided with the measure $d\mu := da\,db/|a|^2$. The Hilbert space $Y := L^2(M)$ is the space $L^2(\mathbb{R}^2_-, d\mu)$ that was denoted by H in Chapter 3. After a mother wavelet $\psi$ has been selected, one defines the wavelet functions
$$\psi_{a,b}(t) := \frac{1}{|a|^{1/2}}\, \psi\Bigl(\frac{t-b}{a}\Bigr)$$
and in this way installs a family
$$\psi_\bullet := (\psi_{a,b} \mid (a, b) \in \mathbb{R}^2_-)$$
of vectors $\psi_{a,b} \in L^2$. The corresponding frame operator T transforms any function $f \in L^2$ into a function $Tf\colon \mathbb{R}^2_- \to \mathbb{C}$ according to the prescription
$$Tf(a, b) := \langle f, \psi_{a,b}\rangle = Wf(a, b) .$$
We see that the wavelet transform W is nothing other than the frame operator T corresponding to the family $\psi_\bullet$. Now by Theorem (3.3) one has
$$\|Wf\|^2 = C_\psi\, \|f\|^2 \qquad \forall f \in L^2 ,$$
where the constant $C_\psi$ is given by
$$C_\psi := 2\pi \int_{\mathbb{R}^*} \frac{|\hat\psi(a)|^2}{|a|}\, da .$$
In terms of the concepts defined in the current chapter, we can express this fact as follows:

(4.8) Let $\psi$ be an arbitrary mother wavelet. Then the family $\psi_\bullet$ is a tight frame with frame constant $C_\psi$.

In view of this theorem, the inverse of the Gram operator is given by $G^{-1} = \frac{1}{C_\psi}\, I_X$, and the dual frame $\tilde\psi_\bullet$ coincides with $\psi_\bullet$ up to the same constant factor:
$$\tilde\psi_{a,b} = \frac{1}{C_\psi}\, \psi_{a,b} .$$
If we now apply formula (4.4)(a), which reconstructs a given vector $x \in X$ from the values $(Tx)_j := \langle x, a_j\rangle$, to the situation at hand, we arrive at the following:
$$f = \frac{1}{C_\psi} \int_{\mathbb{R}^2_-} Wf(a, b)\, \psi_{a,b}\, \frac{da\,db}{|a|^2} . \qquad (3)$$
This is in agreement with (3.7) resp. 3.3.(4). It must be admitted, however, that (4.4)(a) is related to a finite-dimensional model, so the validity of (3) is not guaranteed in the present situation. As a matter of fact, formula (3) is valid only in a "weak" sense or else under stronger assumptions on $f$ and $\psi$; see our remarks in Section 3.3 regarding this point.
4.3 The discrete wavelet transform

Shannon's sampling theorem (Section 2.4) accomplishes the full reconstruction of a bandlimited time signal $f$ from a discrete collection $(f(kT) \mid k \in \mathbb{Z})$ of sampled values. In this section we set out to attain something similar in the realm of the wavelet transform. The data that we shall use in the reconstruction of $f$ are no longer $f$-values at equally spaced points $kT$, but results of "wavelet measurements" $\langle f, \psi_{a,b}\rangle$; that is to say, suitably chosen values of the wavelet transform $Wf\colon \mathbb{R}^2_- \to \mathbb{C}$. One must always keep in mind that a given signal $f$ is encoded in its wavelet transform with an enormous redundancy. Under these circumstances it is not so surprising that a discrete set of $Wf$-values is already sufficient to reconstruct the given $f$ as an $L^2$-object or even pointwise, and all this even without the assumption that $f$ is bandlimited.

We now describe the class of "grids" in the $(a, b)$-plane that we shall use for the sampling of the function $Wf$: First a zoom step $\sigma > 1$ is chosen (the habitual choice is $\sigma = 2$) as well as a base step $\beta > 0$ (a good choice is $\beta = 1$). These two parameters characterize the chosen "grid" and are kept fixed in the following. Then one sets
$$a_m := \sigma^m, \qquad b_{m,n} := n\,\sigma^m \beta \qquad (m, n \in \mathbb{Z}) .$$
[Figure: the sampling grid in the $(a, b)$-plane; the row $m = 0$ is marked, with spacing $\beta$ indicated along it.]

[...] $> 0$ satisfy the following inequalities:
In particular, one cannot have $A = B$ unless $C_- = C_+$. This is a consequence of the fact that we have rejected negative $a$-values; cf. the analogous condition in Theorem (3.4). For the proof of (4.9) we refer the reader to [D].

So far, so good, but what we really want is a theorem of the following kind: Under exactly described circumstances it is guaranteed that the collection $\psi_\bullet$ is a frame, with the frame constants $B \ge A > 0$ obeying tolerances stipulated in advance.
Assume that a zoom step $\sigma > 1$ is given. A wavelet $\psi$ is called admissible for the purposes of this discussion if its Fourier transform $\hat\psi$ fulfills the conditions (a) and (b) below.

(a) There are constants $\alpha > 0$, $\rho > 0$ and $C$ such that
[...] $(|\xi| \le 1)$,
[...] $(|\xi| \ge 1)$. (2)
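This condition is in fact harmless and serves mainly to introduce the constants $\alpha$, $\rho$ and $C$.

(b) There is a constant $A' > 0$ such that
$$\sum_{m=-\infty}^{\infty} |\hat\psi(\sigma^m \xi)|^2 \ge A' \qquad (\xi \in \mathbb{R}^*) . \qquad (3)$$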
Since the left hand side of (3) is invariant with respect to the transformation $\xi \mapsto \sigma\xi$, it is enough to check the required inequality on the domain $1 \le |\xi| \le \sigma$. According to this condition the zeros of $\hat\psi$ are in a way forbidden to be in "logarithmic conspiration". Thus it is in particular excluded that the support of $\hat\psi$ is contained in a single interval of the form $]b, \sigma b[$. Assume, e.g., that $\psi$ has finite order N. Then because of 3.5.(3) there is an $h > 0$ with
[...]
[...] $\sigma > 1$ and sufficiently small $\beta > 0$ the quantities $\|Tf\|^2$ and $\|f\|^2$ have the same order of magnitude, as required by (4). Theorem (4.10) shows that in reality very modest assumptions about $\psi$ suffice to guarantee that the data
$$(Tf(m, n) \mid (m, n) \in \mathbb{Z}^2) \qquad (6)$$
encode all features of the analyzed function $f$, as soon as $\beta$ is small enough; in particular, it is all right to take $\sigma := 2$ in such a case.

For the reconstruction of the original $f$ using the data (6) we need the frame $\tilde\psi_\bullet$, dual to $\psi_\bullet$. If the frame $\psi_\bullet$ is not tight, we have to compute the $\tilde\psi_{m,n}$ using the prescription $\tilde\psi_{m,n} := G^{-1}(\psi_{m,n})$. Unfortunately the $\tilde\psi_{m,n}$ cannot be obtained from a single $\tilde\psi$ by mere dilation and translation, unless of course $\psi$ is chosen in a very special way at the outset. The following considerations will make this more clear: The two operators
$$Df(t) := \frac{1}{\sqrt{\sigma}}\, f\Bigl(\frac{t}{\sigma}\Bigr) \qquad \text{and} \qquad Sf(t) := f(t - \beta)$$
are unitary, therefore we have $D^* = D^{-1}$ and $S^* = S^{-1}$. Consider now the Gram operator G, given by
$$Gf := \sum_{m,n} \langle f, \psi_{m,n}\rangle\, \psi_{m,n} .$$
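Regarding D, we have
$$D\psi_{m,n}(t) = \frac{1}{\sqrt{\sigma}}\, \psi_{m,n}\Bigl(\frac{t}{\sigma}\Bigr) = \frac{1}{\sigma^{(m+1)/2}}\, \psi\Bigl(\frac{t}{\sigma^{m+1}} - n\beta\Bigr) = \psi_{m+1,n}(t)$$
and consequently
$$GDf = \sum_{m,n} \langle Df, \psi_{m,n}\rangle\, \psi_{m,n} = \sum_{m,n} \langle f, D^{-1}\psi_{m,n}\rangle\, \psi_{m,n} = \sum_{m,n} \langle f, \psi_{m-1,n}\rangle\, D\psi_{m-1,n} = D \sum_{m,n} \langle f, \psi_{m,n}\rangle\, \psi_{m,n} = DGf . \qquad (7)$$
Obviously in this case $G^{-1}$ commutes with D as well, and we obtain
$$\tilde\psi_{m,n} = G^{-1} D^m \psi_{0,n} = D^m G^{-1} \psi_{0,n} = D^m \tilde\psi_{0,n} ,$$
that is to say,
$$\tilde\psi_{m,n}(t) = \sigma^{-m/2}\, \tilde\psi_{0,n}\bigl(t/\sigma^m\bigr) .$$
Unfortunately G and S do not commute, so that the above calculation (7) cannot (mutatis mutandis) be repeated. The reason is the following: The functions $S\psi_{m,n}$ appearing on the right hand side of the formula
$$SGf = \sum_{m,n} \langle f, \psi_{m,n}\rangle\, S\psi_{m,n}$$
cannot be identified with certain $\psi_{m',n'}$, as the $D\psi_{m,n}$ could; rather, they look like this:
$$S\psi_{m,n}(t) = \sigma^{-m/2}\, \psi\Bigl(\frac{t - \beta}{\sigma^m} - n\beta\Bigr) = \sigma^{-m/2}\, \psi\Bigl(\frac{t}{\sigma^m} - (n + \sigma^{-m})\,\beta\Bigr) ,$$
and in general the factor $n + \sigma^{-m}$ is not an integer. From this observation one has to draw the conclusion that the dual wavelet functions $\tilde\psi_{0,n}$, $n \in \mathbb{Z}$, are not related to each other in a simple way, so they have to be determined individually.

For the reasons described above, in most circumstances one is eager to choose a tight frame $\psi_\bullet$ right at the outset. The following theorem shows that such a choice is indeed possible: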
(4.11) Assume that the Fourier transform $\hat\psi$ of the mother wavelet $\psi$ has compact support in the interval $I := [\omega, \omega']$, $\omega' > \omega > 0$, and that
$$\sum_{m=-\infty}^{\infty} |\hat\psi(\sigma^m \xi)|^2 \equiv A' > 0 \qquad (\xi > 0) .$$
Then the collection $\psi_\bullet = (\psi_{m,n} \mid (m, n) \in \mathbb{Z}^2)$ belonging to the zoom step $\sigma$ and arbitrary base step
$$\beta \le \frac{2\pi}{\omega' - \omega}$$
is a tight frame for real-valued time signals $f \in L^2$.

Without restriction of generality we may assume
$$\beta := \frac{2\pi}{\omega' - \omega} . \qquad (8)$$
IITI1I2 =
If
.L 1(1, ¢m,n> \2 = Lam 1(~) ¢(am~) einuTn{3e d~r m,n
m,n
Introducing the auxiliary function
we can write liT1112 in the form
IITI1I2 =
Lam m,n
If g(~)einuTn{3e~r =
L am lQmnl 2
,
m,n
the Qmn being given by
(note that the function 9 is identically zero outside the interval amI). The functions (nEZ)
4.3 The discrete wavelet transform
III
are the trigonometrical basis functions for an interval of length
in particular for the interval(7m I. This indicates that the Qmn are essentially Fourier coefficients; in fact the formulas (2.8) give
211" 9~( n) , Qmn = (7 m {j" and summing over n (m is fixed) gives
At the up arrow i we have used Parseval's formula for period length 21r/(3, as quoted in (2.8). In this way we finally obtain
It is only at the very end that we have used realvalued. In this case the identity
the~umption
i( ~) ==
i(~) holds.
that
(7m •
f should be ~
We now are confronted with the task of producing a mother wavelet $\psi$ that satisfies the assumptions of Theorem (4.11). Since these assumptions refer to the Fourier transform $\hat\psi$, it suggests itself to start with $\hat\psi$. In the following example, constructed by Daubechies-Grossmann-Meyer, a suitable $\hat\psi$ is given in terms of simple formulas; the actual wavelet $\psi$ in the time domain then has to be computed numerically. Now, this Fourier inversion concerns a single function and may be performed once and for all, preceding the wavelet analysis of any time signal $f$.
[Figure 4.6: horizontal axis labelled t, x with ticks at 0, 1/2 and 1.]

①
We shall need the auxiliary function
$$\nu(x) := \begin{cases} 0 & (x \le 0) \\ 10x^3 - 15x^4 + 6x^5 & (0 \le x \le 1) \\ 1 & (x \ge 1) \end{cases} \qquad (9)$$
(or some other function with similar properties). In the interval $0 \le x \le 1$ this function can be written as
$$\nu(x) = 30 \int_0^x t^2 (1 - t)^2\, dt .$$
Looking at the integrand on the right hand side (see Figure 4.6), we see that it has a double zero both at $t = 0$ and at $t = 1$, is otherwise positive, and is symmetrical with respect to the point $t = \frac12$. It follows that $\nu(x)$ increases monotonically from 0 to 1 in the interval $0 \le x \le 1$, with $C^2$-crossings at the points $x = 0$ and $x = 1$; moreover, the mentioned symmetry implies the identity
$$\nu(1 - x) = 1 - \nu(x) \qquad \forall x \in \mathbb{R}, \qquad (10)$$
which is going to play a certain role later on.

Let $\sigma > 1$ and $\beta > 0$ be given, and set
$$\omega := \frac{2\pi}{(\sigma^2 - 1)\,\beta}, \qquad \omega' := \sigma^2 \omega ;$$
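in this way (8) is fulfilled. We now define $\hat\psi$ having support $I := [\omega, \omega']$ by the formula
$$\hat\psi(\xi) := \sqrt{A'} \cdot \begin{cases} \sin\Bigl( \dfrac{\pi}{2}\, \nu\Bigl( \dfrac{\xi - \omega}{\sigma\omega - \omega} \Bigr) \Bigr) & (\omega \le \xi \le \sigma\omega) \\[2mm] \cos\Bigl( \dfrac{\pi}{2}\, \nu\Bigl( \dfrac{\xi - \sigma\omega}{\sigma^2\omega - \sigma\omega} \Bigr) \Bigr) & (\sigma\omega \le \xi \le \sigma^2\omega) \\[2mm] 0 & (\text{otherwise}) \end{cases} \qquad (11)$$
(see Figure 4.7). The constant $A'$ appearing here is determined by the condition $\|\psi\| = 1$.

[Figure 4.7: graph of $\hat\psi$ for $\sigma = 2$, $\beta = 1$; ticks at $2\pi/3$ and $4\pi/3$ on the horizontal axis.]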
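As we remarked earlier, the function
$$\Psi(\xi) := \sum_m |\hat\psi(\sigma^m \xi)|^2$$
is invariant with respect to the transformation $\xi \mapsto \sigma\xi$. If we restrict our attention to the $\xi$-interval $[\omega, \sigma\omega]$, then we see that only the two terms corresponding to $m = 0$ and $m = 1$ contribute anything to $\Psi(\xi)$ at all. Therefore we have
$$\Psi(\xi) = |\hat\psi(\xi)|^2 + |\hat\psi(\sigma\xi)|^2 = A' \Bigl( \sin^2\bigl( \tfrac{\pi}{2} \nu(x) \bigr) + \cos^2\bigl( \tfrac{\pi}{2} \nu(x) \bigr) \Bigr) = A' ,$$
where we have used the abbreviation
$$\frac{\xi - \omega}{\sigma\omega - \omega} =: x .$$
So much for $\hat\psi$. The (complex-valued) wavelet $\psi$ having the given $\hat\psi$ as its Fourier transform is shown in Figure 4.8; one observes that $\mathrm{Re}(\psi)$ is an even, $\mathrm{Im}(\psi)$ an odd function. We shall come back to this example in Section 5.3.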
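The key property of this example, namely that $\sum_m |\hat\psi(\sigma^m \xi)|^2$ is constant for $\xi > 0$, can be checked numerically. The Python sketch below is an illustration only: it assumes that (9) is the quintic $10x^3 - 15x^4 + 6x^5$ on $[0, 1]$, that (11) uses a sine branch on $[\omega, \sigma\omega]$ and a cosine branch on $[\sigma\omega, \sigma^2\omega]$, and it sets $A' = 1$ instead of normalizing $\|\psi\| = 1$. The function names are hypothetical.

```python
import math

def nu(x):
    """Auxiliary function (9): 0 for x <= 0, 1 for x >= 1, a quintic in between."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 10 * x**3 - 15 * x**4 + 6 * x**5

def psi_hat(xi, sigma=2.0, beta=1.0, a_prime=1.0):
    """Formula (11) with omega = 2*pi / ((sigma^2 - 1) * beta); a_prime = 1 is an assumption."""
    omega = 2 * math.pi / ((sigma**2 - 1) * beta)
    if omega <= xi <= sigma * omega:
        x = (xi - omega) / (sigma * omega - omega)
        return math.sqrt(a_prime) * math.sin(0.5 * math.pi * nu(x))
    if sigma * omega < xi <= sigma**2 * omega:
        x = (xi - sigma * omega) / (sigma**2 * omega - sigma * omega)
        return math.sqrt(a_prime) * math.cos(0.5 * math.pi * nu(x))
    return 0.0

def big_psi(xi, sigma=2.0, beta=1.0, m_range=range(-60, 61)):
    """Truncated version of the sum over m of |psi_hat(sigma^m * xi)|^2."""
    return sum(psi_hat(sigma**m * xi, sigma, beta) ** 2 for m in m_range)

# Sample points spread over several octaves of the positive xi-axis.
samples = [0.01 * 1.37**j for j in range(30)]
values = [big_psi(xi) for xi in samples]

# Largest violation of the symmetry (10) on a grid that also covers x < 0 and x > 1.
symmetry_gap = max(abs(nu(1 - x) - (1 - nu(x))) for x in [i / 50 for i in range(-10, 61)])

print(min(values), max(values), symmetry_gap)
```

The truncation to $|m| \le 60$ is harmless here because for each sampled $\xi$ at most two or three terms of the sum are nonzero.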
[Figure 4.8: Daubechies-Grossmann-Meyer wavelet (step sizes $\sigma = 2$, $\beta = 1$); one of the plotted curves is $y = \mathrm{Im}(\psi(t))$.]
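4.4 Proof of theorem (4.10)

The following proof is essentially taken from [D], Section 3.3.2. We are confronted with the task of estimating the sum on the right hand side of 4.3.(5) as accurately as possible. To this end, we begin with 3.1.(9):
$$Wf(a, b) = |a|^{1/2} \int \hat f(\xi)\, \overline{\hat\psi(a\xi)}\, e^{ib\xi}\, d\xi .$$
Introducing the auxiliary function
$$g(\xi) := \hat f(\xi)\, \overline{\hat\psi(a\xi)} , \qquad (1)$$
we can write
$$Wf(a, nb) = |a|^{1/2} \int_{\mathbb{R}} g(\xi)\, e^{inb\xi}\, d\xi = |a|^{1/2} \int_0^{2\pi/b} G(\xi)\, e^{inb\xi}\, d\xi , \qquad (2)$$
where we have tacitly assumed $b > 0$. The function
$$G(\xi) := \sum_l g\Bigl( \xi + l\, \frac{2\pi}{b} \Bigr)$$
is periodic and of period $\frac{2\pi}{b}$. Because of the formulas (2.8) we therefore can interpret (2) as
$$Wf(a, nb) = |a|^{1/2} \cdot \frac{2\pi}{b}\, \hat G(-n) .$$
Taking the sum with respect to n we obtain
$$\sum_n |Wf(a, nb)|^2 = |a| \Bigl( \frac{2\pi}{b} \Bigr)^2 \sum_n |\hat G(-n)|^2 = \frac{2\pi |a|}{b} \int_0^{2\pi/b} |G(\xi)|^2\, d\xi , \qquad (3)$$
where at the end we used Parseval's formula for period length $\frac{2\pi}{b}$, see (2.8). We now take a closer look at the last integral:
$$\int_0^{2\pi/b} |G(\xi)|^2\, d\xi = \sum_k \sum_l \int_0^{2\pi/b} g\Bigl( \xi + l\, \frac{2\pi}{b} \Bigr)\, \overline{g\Bigl( \xi + (l + k)\, \frac{2\pi}{b} \Bigr)}\, d\xi .$$
Substituting $\xi + l\, \frac{2\pi}{b} =: \xi'$, we can continue with
$$\ldots = \sum_k \int_{\mathbb{R}} g(\xi')\, \overline{g\Bigl( \xi' + k\, \frac{2\pi}{b} \Bigr)}\, d\xi' .$$
The last expression is now inserted into (3), leading to the following intermediate result:
$$\sum_n |Wf(a, nb)|^2 = \frac{2\pi |a|}{b} \sum_k \int g(\xi)\, \overline{g\Bigl( \xi + k\, \frac{2\pi}{b} \Bigr)}\, d\xi .$$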
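Here we set
$$a := \sigma^m, \qquad b := \sigma^m \beta \qquad (m \in \mathbb{Z})$$
and sum over m as well, so that we finally obtain
$$\|Tf\|^2 = \sum_{m,n} |Wf(\sigma^m, n\sigma^m\beta)|^2 = \frac{2\pi}{\beta} \sum_{k,m} Q_{km} . \qquad (4)$$
When the $Q_{km}$ appearing on the right are unpacked using the definition (1) of g, they look as follows:
$$Q_{km} = \int \hat f(\xi)\, \overline{\hat f\Bigl( \xi + \frac{2k\pi}{\sigma^m \beta} \Bigr)}\; \overline{\hat\psi(\sigma^m \xi)}\, \hat\psi\Bigl( \sigma^m \xi + \frac{2k\pi}{\beta} \Bigr)\, d\xi .$$
It will turn out that the terms with $k = 0$ in (4) account for the lion's share of $\|Tf\|^2$. For this reason we collect all terms $Q_{km}$ belonging to $k \ne 0$ into a single remainder term Q and write (4) in the form
$$\|Tf\|^2 = \frac{2\pi}{\beta} \Bigl( \int |\hat f(\xi)|^2 \sum_m |\hat\psi(\sigma^m \xi)|^2\, d\xi + Q \Bigr) .$$
We now have to play the dominant part and the remainder term against each other. In order to bring the main line of reasoning to a close, we formulate the following lemma:

(4.12) Let $\psi$ be an admissible wavelet with parameters $\alpha$, $\rho$, C and $A'$. Then there is a constant $B'$ such that
$$\sum_m |\hat\psi(\sigma^m \xi)|^2 \le B' \qquad \forall \xi \in \mathbb{R} ,$$
and, more important, one has
[...] (5)
with a constant $C'$ that does not depend on $\beta$. Using this lemma and, of course, Definition 4.3.(3) of the parameter $A'$ we arrive at the inequalities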
[...]
appearing in the statement of the theorem. This completes the proof of (4.10), modulo the lemma.

It remains to carry out the proof of Lemma (4.12).
In order to estimate the sum $\sum_m |\hat\psi(\sigma^m \xi)|^2$ from above we have to treat the terms corresponding to $m < 0$ resp. to $m \ge 0$ separately, using the appropriate inequality concerning $\hat\psi$ in each of the two cases. In this way we obtain
[...]
as stated. Now we come to (5), but this is a longer story. We regard $Q_{km}$ as a scalar product, using a suitable decomposition of the various factors appearing in the definition of $Q_{km}$. In this way we obtain, by Schwarz' inequality,
[...]
If we use the substitution $\xi + 2k\pi/(\sigma^m \beta) =: \xi'$ in the second factor, this transforms into
[...]
For the estimate (5) we now have to sum the $|Q_{km}|$ over all $k \ne 0$ and all m. For the inner sum (with respect to m) we use Schwarz' inequality in the form
leading to
[...] (6)
In order to estimate the sums $\sum_m$ under the integral signs we introduce the auxiliary function
$$q(s) := \sup_\xi \sum_m |\hat\psi(\sigma^m \xi)|\, |\hat\psi(\sigma^m \xi + s)| ,$$
where, as we have seen in similar cases before, it is enough to take the supremum over the set of $\xi$'s with $1 \le |\xi| \le \sigma$. In terms of this function $q(\cdot)$ the inequality (6) takes the following form:
$$|Q| \le \|f\|^2 \sum_{k \ne 0} \sqrt{q(2k\pi/\beta)\, q(-2k\pi/\beta)} . \qquad (7)$$
In estimating $q(\cdot)$ we may assume $\beta \le \pi$ from the outset; this has the consequence that only values $q(s)$ for $|s| \ge 2$ need to be considered. As in the first part of the lemma we have to treat the terms corresponding to $m < 0$ resp. to $m \ge 0$ separately. To this end we split $q(\cdot)$ into the two parts
[...]
so that in any case
[...] (8)
We take up $m < 0$ first. The inequalities $|\xi| \le \sigma$ and $|s| \ge 2$ together imply
[...]
Therefore the assumptions on $\hat\psi$ allow the estimate
and taking the sum over all $m < 0$ one obtains
[...]
In the case $m \ge 0$ we argue as follows: At least one of the two numbers $|\sigma^m \xi|$ and $|\sigma^m \xi + s|$ is $\ge |s|/2$ (note that $\xi$ and s may be of different signs) and at least one is $\ge |\sigma^m \xi|$. Both $|s|/2$ and $|\sigma^m \xi|$ are $\ge 1$. Since $|\hat\psi(\xi)| \le C$ for all $\xi$, these circumstances allow the following conclusion:
[...]
Taking the sum over all $m \ge 0$, we see that $q_+(\cdot)$ can be estimated as follows:
[...]
Because of (8), we now have
[...] $(|s| \ge 2)$
and consequently
[...] $(k \ne 0)$.
Inserting this into (7) and performing the summation over all $k \ne 0$, we finally obtain the stated estimate for Q:
[...]
It is easy to verify that the introduced constants $C_1, \ldots, C_4$ and $A'$ do not depend on $\beta$.
5 Multiresolution analysis
The triumphant progress wavelets have made in a great variety of applications is based in the first place on the so-called "fast algorithms" (fast wavelet transform, FWT), and these in turn owe their existence to a careful choice of the mother wavelet $\psi$. So far in this book the particular mother wavelet chosen only had to fulfill some "technical" conditions, such as $t^r\psi \in L^1$ or $\psi \in C^r$ for some $r \ge 0$ and, of course, $\hat\psi(0) = 0$ or, even better, $\psi$ should be of a certain order $N > 1$.

The trigonometric basis functions $e_\alpha\colon t \mapsto e^{i\alpha t}$ are distinguished by the following linear reproducing property: If such a function is subject to a translation $T_h$, it simply picks up a constant factor:
$$T_h e_\alpha = e^{-i\alpha h}\, e_\alpha .$$
Contrary to this, in the realm of wavelets the operation of scaling is the central theme, i.e., for arbitrary $a \in \mathbb{R}^*$ the operation
$$D_a f(t) := \frac{1}{|a|^{1/2}}\, f\Bigl( \frac{t}{a} \Bigr) .$$
With respect to this operation, the wavelets considered so far did not behave in a special way (except $\psi_{\rm Haar}$). Their graph became flattened out or got compressed in the t-direction, depending on the value of a, but there was no reproduction property in the sense that the scaled version of a $\psi$ could be related to the original $\psi$ in some other way.

In the discrete case only the integer iterates of a single scaling operation $D_\sigma$, $\sigma > 1$ denoting the zoom step, enter the picture. From now until the end of the book we choose $\sigma := 2$; by the way, this is also the value most commonly used in practice. If we now adopt a mother wavelet that in a certain way "reproduces itself" when it is subject to the scaling $D_2$, then novel and highly desirable effects develop. That's what "multiresolution analysis" is all about. To be more specific, things are arranged in such a way that the mother wavelet $\psi$ satisfies a linear identity having the following structure:
$$D_2\psi(t) \equiv \sum_{k=0}^{n} c_k\, \psi(t - k) .$$
This identity carries in its wake analogous linear formulas between the scalar products $\langle f, \psi_{n,k}\rangle$ and $\langle f, \psi_{n+1,k}\rangle$, so that these scalar products (called the wavelet coefficients of $f$) need not be computed by tedious integrations over and over again when going from one zoom level to the next one. The definitive formulas will look somewhat different, but this is the general idea.
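To make the general idea tangible, here is a minimal Python sketch for the simplest case, the Haar system. It is not the algorithm of Section 5.4, only an illustration under assumptions: the input list is taken to hold the scalar products $\langle f, \phi_{j,k}\rangle$ with the Haar scaling function at some level j, the sample numbers are invented, and the function name `haar_step` is hypothetical. One zoom step then needs only sums and differences of neighboring values, no integration.

```python
import math

def haar_step(scaling_coeffs):
    """One zoom step for the Haar system: level j -> level j + 1.

    scaling_coeffs[k] is assumed to hold <f, phi_{j,k}>; its length must be even.
    Returns the lists (<f, phi_{j+1,k}>)_k and (<f, psi_{j+1,k}>)_k.
    """
    if len(scaling_coeffs) % 2:
        raise ValueError("need an even number of coefficients")
    r = 1 / math.sqrt(2)
    pairs = list(zip(scaling_coeffs[0::2], scaling_coeffs[1::2]))
    averages = [r * (a + b) for a, b in pairs]  # coarser scaling coefficients
    details = [r * (a - b) for a, b in pairs]   # wavelet coefficients of the new level
    return averages, details

# Invented sample data standing in for the level-j scalar products.
data = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
averages, details = haar_step(data)

# The step is an orthogonal change of coordinates, so the energy is preserved.
energy_in = sum(c * c for c in data)
energy_out = sum(c * c for c in averages) + sum(c * c for c in details)
print(averages, details, energy_in, energy_out)
```

Applying `haar_step` again to `averages` moves one further zoom level up; the cost of each step is proportional to the number of coefficients handled.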
5.1 Axiomatic description

In Section 4.3 we discretized the continuous wavelet transform, and we showed that under suitable assumptions a discrete, i.e., countable, set of "wavelet measurements" $(Tf(m, n) \mid (m, n) \in \mathbb{Z}^2)$ is sufficient to allow the complete reconstruction of $f$ in the $L^2$-sense or pointwise, etc., depending on the exact circumstances. Multiresolution analysis is discrete to begin with, and the wavelet functions $\psi_{j,k}$ being used form an orthonormal basis of $L^2$ by construction, so it is not necessary to compute any $\tilde\psi_{j,k}$'s.

We now come to the formal definition. A multiresolution analysis, abbreviated MRA, is constituted by the following ingredients (a)-(c).

(a) A bilateral sequence $(V_j \mid j \in \mathbb{Z})$ of closed subspaces of $L^2$. These $V_j$ are ordered by inclusion,
$$\ldots \subset V_2 \subset V_1 \subset V_0 \subset V_{-1} \subset \ldots \subset V_j \subset V_{j-1} \subset \ldots \subset L^2 \qquad (1)$$
(smaller values of j correspond to larger spaces $V_j$!), and one has
$$\bigcap_j V_j = \{0\} \qquad \text{(separation axiom)} , \qquad (2)$$
$$\overline{\bigcup_j V_j} = L^2 \qquad \text{(completeness axiom)} . \qquad (3)$$
The following intuitive description will be helpful later on: The time signals $f \in V_j$ only comprise features (i.e., details) exhibiting a spread of size $\ge 2^j$ on the time axis. The more negative j is, the finer are the details that may occur in an $f \in V_j$, and "in the limit" every single $f \in L^2$ can be attained by functions $f_j \in V_j$.

(b) The $V_j$ are connected to each other by a rigid scaling property:
$$V_{j+1} = D_2(V_j) \qquad \forall j \in \mathbb{Z} . \qquad (4)$$
Referring to time signals $f$ this can be expressed as follows:
$$f \in V_j \iff f(2^j\, \cdot) \in V_0 . \qquad (5)$$
(c) $V_0$ contains one basis vector per base step 1. To be precise, there is a function $\phi \in L^2 \cap L^1$ such that its translates $(\phi(\cdot - k) \mid k \in \mathbb{Z})$ form an orthonormal basis of $V_0$. This function $\phi$ is commonly called the scaling function of the MRA under consideration; it is the determining element of the whole setup.

Please note: Several authors number the $V_j$'s in the reverse direction compared to (1). We stick to the ordering used in [D].

According to (c) above, the space $V_0$ can be described as a set of time signals $f$ in the following way:
$$V_0 = \Bigl\{ f \in L^2 \Bigm| f(t) = \sum_k c_k\, \phi(t - k),\ \sum_k |c_k|^2 < \infty \Bigr\} .$$
[...]
$$\ldots = \sum_{k,l} h_k\, \overline{h_l}\, \delta_{2n+k,\,l} = \sum_k h_k\, \overline{h_{k+2n}} .$$
We see that in order for the $\phi_{0,k}$ to be orthonormal it is necessary that the $h_k$ satisfy the so-called consistency relations

(5.4)
$$\sum_{k=-\infty}^{\infty} h_k\, \overline{h_{k+2n}} = \delta_{0n} \qquad \forall n \in \mathbb{Z} ;$$
in particular, one must have $\sum_k |h_k|^2 = 1$.
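The consistency relations (5.4) are easy to test for a concrete coefficient vector. The following Python sketch checks them for two real filters: the Haar coefficients $h_0 = h_1 = 1/\sqrt{2}$ and the well-known four-tap Daubechies coefficients, which are quoted here from general knowledge and not from this section. For real $h_k$ the complex conjugation in (5.4) can be dropped; the function name is hypothetical.

```python
import math

def consistency_sum(h, n):
    """Return sum_k h_k * h_{k+2n} for a real filter h = (h_0, ..., h_{L-1})."""
    return sum(
        hk * h[k + 2 * n]
        for k, hk in enumerate(h)
        if 0 <= k + 2 * n < len(h)
    )

s3 = math.sqrt(3.0)
filters = {
    "Haar": [1 / math.sqrt(2), 1 / math.sqrt(2)],
    "four-tap Daubechies": [
        c / (4 * math.sqrt(2)) for c in (1 + s3, 3 + s3, 3 - s3, 1 - s3)
    ],
}

for name, h in filters.items():
    for n in range(-2, 3):
        # (5.4) asks for 1 when n == 0 and for 0 otherwise.
        print(name, n, consistency_sum(h, n))
```

Passing this test is only a necessary condition: as the text stresses further on, a vector satisfying (5.4) need not belong to a usable scaling function.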
While we are at it, we are going to prove a certain linear relation among the $h_k$; the condition $q \ne 0$ appearing therein is of no importance because of (1).

(5.5) Suppose that $h_\bullet \in l^1(\mathbb{Z})$ and that $\int \phi(t)\, dt =: q \ne 0$. Then
$$\sum_k h_k = \sqrt{2} .$$
Integrating the scaling equation (2) from $-N$ to N with respect to t gives
$$\int_{-N}^{N} \phi(t)\, dt = \sqrt{2} \sum_k h_k \int_{-N}^{N} \phi(2t - k)\, dt = \frac{1}{\sqrt{2}} \sum_k h_k \int_{-2N-k}^{2N-k} \phi(t')\, dt' . \qquad (3)$$
5.2 The scaling function
Since
$$\Bigl| \int_{-2N-k}^{2N-k} \phi(t')\, dt' \Bigr| \le \|\phi\|_1 \qquad \forall k \in \mathbb{Z} ,$$
we can apply the theorem of Lebesgue to the sum on the right hand side of (3). Letting $N \to \infty$ in (3) we obtain
$$q = \frac{1}{\sqrt{2}} \sum_k h_k\, q ,$$
from which the theorem follows.

But we should be careful: Even if we have a coefficient vector $h_\bullet \in l^2(\mathbb{Z})$ that satisfies the relations (5.4) and (5.5), we can by no means be sure that there exists a usable function $\phi$ fulfilling the scaling equation (2).

Let us assume for the moment that a multiresolution analysis according to (a)-(c) above is given to us. If we write (2) in the form
$$\phi = \sum_k h_k\, \phi_{-1,k} ,$$
then we see that according to general principles about orthonormal bases one has the formula
$$h_k = \langle \phi, \phi_{-1,k}\rangle \qquad (k \in \mathbb{Z}) . \qquad (4)$$
The scalar product $\langle \phi, \phi_{-1,k}\rangle$ can only be $\ne 0$, if the supports of $\phi$ and of $\phi_{-1,k}$ overlap. Thus formula (4) allows us to conclude the following:
(5.6) If the scaling function $\phi$ has compact support, then only finitely many $h_k$ are different from 0.

But one can say even more. To this end, for arbitrary functions $f\colon \mathbb{R} \to \mathbb{C}$ we define the quantities
$$a(f) := \inf\{x \mid f(x) \ne 0\} \ge -\infty, \qquad b(f) := \sup\{x \mid f(x) \ne 0\} \le \infty .$$
Thus $a(f)$ and $b(f)$ are respectively the "left end" and the "right end" of the support of $f$. In the following theorem we assume for simplicity that $\phi$ is a bona fide function, not a mere $L^2$-object.
(5.7) If the scaling function $\phi$ has compact support, then the quantities $a := a(\phi)$ [...]

[...] $\varepsilon\,\bigl(2M(|q|^2 + C) + 4MC + \ldots$ [...] $\|P_j f\|^2 = 2|q|^2 + \ldots$ [...]

As $\varepsilon > 0$ was arbitrary we see that (8) is valid if and only if $|q| = 1$.
5.3 Constructions in the Fourier domain

Multiresolution analysis is "invariant" with respect to (a) integer translations of the time axis and (b) dilations by powers of 2. In order to make the best use of this inner symmetry we shall transfer the actual construction of admissible scaling functions $\phi$ and corresponding mother wavelets $\psi$ into the "Fourier domain". As a consequence, e.g., the orthonormality of the $\phi_{0,k} = \phi(\cdot - k)$ has to be expressed in terms of properties of $\hat\phi$; of course we also need a Fourier version of the scaling equation, and so on.

For an arbitrary function $\phi \in L^2$ one may write
$$\|\phi\|^2 = \int_{\mathbb{R}} |\hat\phi(\xi)|^2\, d\xi = \sum_l \int_0^{2\pi} |\hat\phi(\xi + 2\pi l)|^2\, d\xi .$$
The integral on the right hand side can be thought of as an integral over $\mathbb{Z} \times [0, 2\pi]$. If one interchanges the order of integration, the function
$$\Phi(\xi) := \sum_l |\hat\phi(\xi + 2\pi l)|^2$$
appears as the new inner integral. By Fubini's theorem $\Phi$ is defined almost everywhere, first on $[0, 2\pi]$, then on all of $\mathbb{R}$, is $2\pi$-periodic, and one has
$$\int_0^{2\pi} \Phi(\xi)\, d\xi = \|\phi\|^2 .$$
We first prove the following lemma:
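(5.9) The integer translates $\phi_k := \phi(\cdot - k)$ of an arbitrarily given function $\phi \in L^2$ constitute an orthonormal system if and only if the following identity holds:
$$\Phi(\xi) \equiv \frac{1}{2\pi} \qquad (\text{almost all } \xi \in \mathbb{R}) . \qquad (1)$$
For symmetry reasons it is enough to consider the scalar products of the form $\langle \phi_0, \phi_k\rangle$. They are computed as follows:
$$\langle \phi_0, \phi_k\rangle = \langle \hat\phi_0, \hat\phi_k\rangle = \int_{\mathbb{R}} |\hat\phi(\xi)|^2\, e^{ik\xi}\, d\xi = \int_0^{2\pi} \Phi(\xi)\, e^{ik\xi}\, d\xi = 2\pi\, \hat\Phi(-k) .$$
This implies that the orthonormality condition $\langle \phi_0, \phi_k\rangle = \delta_{0k}$ is equivalent to
$$\hat\Phi(k) = \frac{1}{2\pi}\, \delta_{0k} \qquad \forall k \in \mathbb{Z} ,$$
and the latter obviously means $\Phi(\xi) \equiv \frac{1}{2\pi}$ almost everywhere.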
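Lemma (5.9) can be illustrated numerically with the Haar scaling function, the indicator function of $[0, 1)$, whose integer translates are obviously orthonormal. The Python sketch below assumes the unitary Fourier convention $\hat\phi(\xi) = \frac{1}{\sqrt{2\pi}} \int \phi(t)\, e^{-i\xi t}\, dt$, for which $|\hat\phi(\xi)|^2 = \frac{1}{2\pi} \bigl( \sin(\xi/2) / (\xi/2) \bigr)^2$; the function names and the truncation length are choices made for this example only.

```python
import math

def phi_hat_abs2(xi):
    """|phi_hat(xi)|^2 for the Haar scaling function, unitary Fourier convention assumed."""
    if xi == 0.0:
        return 1.0 / (2 * math.pi)
    s = math.sin(xi / 2) / (xi / 2)
    return s * s / (2 * math.pi)

def periodized(xi, terms=20000):
    """Truncated version of Phi(xi) = sum over l of |phi_hat(xi + 2*pi*l)|^2."""
    return sum(phi_hat_abs2(xi + 2 * math.pi * l) for l in range(-terms, terms + 1))

target = 1 / (2 * math.pi)
for xi in (0.3, 1.0, 2.5, 5.0):
    print(xi, periodized(xi), target)
```

The terms decay only like $1/l^2$, so the truncated sum approaches $1/(2\pi)$ slowly; this is why a fairly large number of terms is used.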
The next point on our agenda is the scaling equation
$$\phi(t) \equiv \sqrt{2} \sum_k h_k\, \phi(2t - k) \qquad (\text{almost all } t \in \mathbb{R}) . \qquad (2)$$
Taking the Fourier transform on both sides of (2) we obtain, using the rules (R1) and (R2), the identity
$$\hat\phi(\xi) = \frac{1}{\sqrt{2}} \sum_k h_k\, e^{-ik\xi/2}\, \hat\phi\Bigl( \frac{\xi}{2} \Bigr) .$$
Looking at this formula we are led to introduce (at first only formally) the function
$$H(\xi) := \frac{1}{\sqrt{2}} \sum_k h_k\, e^{-ik\xi} ; \qquad (3)$$
we call it the generating function of the multiresolution analysis under consideration. Because of $\|h_\bullet\| = 1$, the series (3) is almost everywhere convergent, by Theorem (2.4), and defines H as an actual $2\pi$-periodic function. If only finitely many $h_k$ [...]
(5.10) The generating function H of a multiresolution analysis satisfies the identity
$$|H(\omega)|^2 + |H(\omega + \pi)|^2 \equiv 1 \qquad (\text{almost all } \omega \in \mathbb{R}) .$$
This of course implies that H is uniformly bounded on $\mathbb{R}$:
$$|H(\omega)| \le 1 \qquad (\omega \in \mathbb{R}) . \qquad (5)$$
Furthermore, since $\hat\phi(0) \ne 0$ by 5.2.(1), it follows from (4) that $H(0) = 1$, and (5.10) in turn implies $H(\pi) = 0$.
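The statements about the generating function can be checked for concrete coefficient vectors. The Python sketch below assumes the normalization $H(\omega) = 2^{-1/2} \sum_k h_k e^{-ik\omega}$ from (3) and the identity $|H(\omega)|^2 + |H(\omega + \pi)|^2 \equiv 1$. It uses the Haar coefficients and, as a second test case quoted from general knowledge rather than from this section, the four-tap Daubechies coefficients. The function names are hypothetical.

```python
import cmath
import math

def generating_function(h, omega):
    """H(omega) = 2^(-1/2) * sum_k h_k * exp(-i k omega) for h = (h_0, ..., h_{L-1})."""
    return sum(hk * cmath.exp(-1j * k * omega) for k, hk in enumerate(h)) / math.sqrt(2)

def identity_defect(h, samples=200):
    """Largest deviation of |H(w)|^2 + |H(w + pi)|^2 from 1 on a grid in [0, 2*pi)."""
    worst = 0.0
    for j in range(samples):
        w = 2 * math.pi * j / samples
        total = (
            abs(generating_function(h, w)) ** 2
            + abs(generating_function(h, w + math.pi)) ** 2
        )
        worst = max(worst, abs(total - 1.0))
    return worst

s3 = math.sqrt(3.0)
haar = [1 / math.sqrt(2), 1 / math.sqrt(2)]
daub4 = [c / (4 * math.sqrt(2)) for c in (1 + s3, 3 + s3, 3 - s3, 1 - s3)]

for name, h in (("Haar", haar), ("four-tap Daubechies", daub4)):
    print(name, identity_defect(h),
          generating_function(h, 0.0), generating_function(h, math.pi))
```

In both cases the printed defect is at rounding level, and the values at $0$ and $\pi$ agree with $H(0) = 1$ and $H(\pi) = 0$ up to rounding.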
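Our next goal is to describe the space $W_0$, i.e., the orthogonal complement of $V_0$ in the larger space $V_{-1}$, as explicitly as possible. Having such a description in hand we shall be able to give an explicit formula for a possible mother wavelet $\psi$ belonging to the given scaling function $\phi$.

[...] $\lambda(\omega)\, H'$, (10)

where the coefficient $\lambda(\omega)$ is given by the formula
[...]
The function $\omega \mapsto \lambda(\omega)$ satisfies the identity $\lambda(\omega + \pi) \equiv -\lambda(\omega)$, consequently there is a $2\pi$-periodic function $\nu(\cdot)$ such that
$$\lambda(\omega) = e^{i\omega}\, \nu(2\omega) . \qquad (11)$$
Inserting this into (10) and extracting the first coordinate we obtain the following representation of $m_f$:
$$m_f(\omega) = e^{i\omega}\, \nu(2\omega)\, \overline{H(\omega + \pi)} .$$
Introducing this into (8) we finally get for $\hat f$ the expression
$$\hat f(\xi) = e^{i\xi/2}\, \nu(\xi)\, \overline{H\Bigl( \frac{\xi}{2} + \pi \Bigr)}\, \hat\phi\Bigl( \frac{\xi}{2} \Bigr) \qquad (\text{almost all } \xi \in \mathbb{R}) . \qquad (12)$$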
This line of reasoning leads us to the following theorem:

(5.11) A function $f \in L^2$ belongs to the space $W_0$, if and only if there exists a function $\nu(\cdot) \in L^2_\circ$, such that $\hat f$ can be written in the form (12).

We have already shown that $f \in W_0$ implies the existence of a $2\pi$-periodic function $\nu\colon \mathbb{R} \to \mathbb{C}$ such that $\hat f$ has a representation of the form (12). Solving (11) for $\nu(\cdot)$ we get the expression $\nu(\xi) = e^{-i\xi/2}\, \lambda(\xi/2)$, and we infer from (10) that
[...]
This implies
[...]
Conversely, if (12) is true for some $\nu(\cdot) \in L^2_\circ$, then we have (8) with
$$m_f(\omega) := e^{i\omega}\, \nu(2\omega)\, \overline{H(\omega + \pi)} .$$
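Because of (5) we may conclude that $m_f \in L^2_\circ$, and this in turn implies $f \in V_{-1}$. Furthermore, we have
$$m_f(\omega)\, \overline{H(\omega)} + m_f(\omega + \pi)\, \overline{H(\omega + \pi)} = e^{i\omega}\, \nu(2\omega)\, \Bigl( \overline{H(\omega + \pi)}\, \overline{H(\omega)} - \overline{H(\omega)}\, \overline{H(\omega + \pi)} \Bigr) = 0 ,$$
proving that the vector $m_f$ is orthogonal on H for almost all $\omega$. This means that (9) is true for almost all $\omega$; on the other hand, for an $f \in V_{-1}$ this is equivalent to $f \perp V_0$.

Inspired by the identity (12) we now define the mother wavelet $\psi$ corresponding to the given $\phi$ by the following formula:
$$\hat\psi(\xi) := e^{i\xi/2}\, \overline{H\Bigl( \frac{\xi}{2} + \pi \Bigr)}\, \hat\phi\Bigl( \frac{\xi}{2} \Bigr) . \qquad (13)$$
It appears that in doing so we are successful: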
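(5.12) If the mother wavelet $\psi$ is defined by (13), then the system of functions $(\psi_{0,k} \mid k \in \mathbb{Z})$ constitutes an orthonormal basis of $W_0$.

According to (5.9) the orthonormality of the $\psi_{0,k}$ is proven by the following calculation:
$$\begin{aligned} \sum_l |\hat\psi(\xi + 2\pi l)|^2 &= \sum_l |\hat\psi(\xi + 4\pi l)|^2 + \sum_l |\hat\psi(\xi + 2\pi + 4\pi l)|^2 \\ &= \Bigl| H\Bigl( \frac{\xi}{2} + \pi \Bigr) \Bigr|^2 \sum_l \Bigl| \hat\phi\Bigl( \frac{\xi}{2} + 2\pi l \Bigr) \Bigr|^2 + \Bigl| H\Bigl( \frac{\xi}{2} \Bigr) \Bigr|^2 \sum_l \Bigl| \hat\phi\Bigl( \frac{\xi}{2} + \pi + 2\pi l \Bigr) \Bigr|^2 \\ &= \Bigl( \Bigl| H\Bigl( \frac{\xi}{2} + \pi \Bigr) \Bigr|^2 + \Bigl| H\Bigl( \frac{\xi}{2} \Bigr) \Bigr|^2 \Bigr) \frac{1}{2\pi} \equiv \frac{1}{2\pi} . \end{aligned}$$
As $1 \in L^2_\circ$, it follows from (5.11) that $\psi$ is indeed in $W_0$, whence all integer translates $\psi_{0,k}$ belong to $W_0$ as well.

On the other hand, consider an arbitrary $f \in W_0$. By Theorem (5.11) resp. (12) and (13) we know that there is a $\nu(\cdot) \in L^2_\circ$ such that
$$\hat f(\xi) = \nu(\xi)\, \hat\psi(\xi) \qquad (\text{almost all } \xi \in \mathbb{R}) . \qquad (14)$$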
The function $\nu(\cdot)$ can be developed into a Fourier series $\sum_k \nu_k\, e^{-ik\xi}$, and by Carleson's theorem (2.4) this series converges almost everywhere to $\nu(\xi)$. It follows that we can replace (14) by
$$\hat f(\xi) = \sum_k \nu_k\, e^{-ik\xi}\, \hat\psi(\xi) \qquad (\text{almost all } \xi \in \mathbb{R}) .$$
Now, this is nothing more than the Fourier transform of the representation
$$f(t) = \sum_k \nu_k\, \psi(t - k) \qquad \text{resp.} \qquad f = \sum_k \nu_k\, \psi_{0,k} ,$$
the series appearing on the right converging in $L^2$. Altogether this proves that the $\psi_{0,k}$ do indeed form an orthonormal basis of $W_0$.

The scaling function $\phi$ does not determine the corresponding mother wavelet $\psi$ uniquely, thus formula (13) can be modified to a certain degree. For instance, amending it by factors $e^{i\alpha}\, e^{-iN\xi}$ with $\alpha \in \mathbb{R}$, $N \in \mathbb{Z}$, is allowed. An additional factor $e^{-iN\xi}$ in $\hat\psi$ produces a translation of the graph of $\psi$ by N units to the right. In this way, depending on circumstances, one can achieve that $\psi$ has the same support as $\phi$.

Formula (13) gives only the Fourier transform of the wavelet $\psi$. In order to obtain the function $\psi$ itself we have to translate (13) back into the time domain. Using (3) we get
$$e^{i\xi/2}\, \overline{H\Bigl( \frac{\xi}{2} + \pi \Bigr)} = \frac{1}{\sqrt{2}} \sum_{k'} (-1)^{k'}\, \overline{h_{k'}}\, e^{i(k'+1)\xi/2} = \frac{1}{\sqrt{2}} \sum_k (-1)^{k-1}\, \overline{h_{-k-1}}\, e^{-ik\xi/2} ,$$
where at the very end we performed the substitution $k := -k' - 1$ $(k' \in \mathbb{Z})$. Therefore (13) can be replaced by
$$\hat\psi(\xi) = \frac{1}{\sqrt{2}} \sum_k (-1)^{k-1}\, \overline{h_{-k-1}}\, e^{-ik\xi/2}\, \hat\phi\Bigl( \frac{\xi}{2} \Bigr) . \qquad (15)$$
According to the rules (R1) and (R3) the last formula is nothing other than the Fourier transform of the representation
$$\psi(t) = \sqrt{2} \sum_k (-1)^{k-1}\, \overline{h_{-k-1}}\, \phi(2t - k) . \qquad (16)$$
In order to get a well-structured set of formulas we set
$$(-1)^{k-1}\, \overline{h_{-k-1}} =: g_k . \qquad (17)$$
In this way (16) becomes '1/J(t) =
v2L9k ¢(2t 
(18)
k) ,
k
an identity that has the same structure as the scaling equation 5.2.(2). Another admissible definition of the 9k would have been (19) If, e.g., only the hk for D ~ k ~ 2N 1 are different from zero, then (19) implies the same state of affairs for the 9k, and all summations in the corresponding algorithms (see Section 5.4) range over the index set {D, 1, ... ,2N  I}. Let us summarize the results obtained so far in the following theorem:
(5.13) Assume that $(V_j\mid j\in\mathbb{Z})$ is a multiresolution analysis with scaling function $\phi$ and generating function $H$, and let the mother wavelet $\psi$ be defined by (13) resp. by (16). Then the function system

\[ \psi_{j,k}(t) := 2^{-j/2}\,\psi\Bigl(\frac{t-k\cdot 2^j}{2^j}\Bigr) \]

is an orthonormal wavelet basis of $L^2(\mathbb{R})$.

⌜ Consider a fixed $j\in\mathbb{Z}$. Since according to (5.12) the $\psi_{0,k}$ constitute an orthonormal basis of $W_0$, it is an easy consequence of the principle 5.1.(9) and a small calculation that $(\psi_{j,k}\mid k\in\mathbb{Z})$ is an orthonormal basis of $W_j$. The theorem now follows from Proposition (5.1). ⌟

①
As our first example we take up the Haar multiresolution analysis again, cf. Example 5.2.①. This time we are in a position to construct the mother wavelet $\psi$ following the prescriptions of the general theory. It is easy to verify that $\phi := \phi_{\rm Haar}$ has as its Fourier transform the function

\[ \hat\phi(\xi) = \frac{1}{\sqrt{2\pi}}\,\frac{\sin(\xi/2)}{\xi/2}\,e^{-i\xi/2}. \tag{20} \]

On the other hand we now insert the values of the $h_k$, as computed in 5.2.(5), into (3) and obtain the following generating function:

\[ H(\xi) = \frac{1}{\sqrt2}\cdot\frac{1}{\sqrt2}\,\bigl(1+e^{-i\xi}\bigr) = \cos\frac{\xi}{2}\;e^{-i\xi/2}. \tag{21} \]
It is easily seen that the functional equation (4) is fulfilled in this case. The recipe (13) now gives

\[ \hat\psi(\xi) = e^{i\xi/2}\,\overline{H\bigl(\tfrac{\xi}{2}+\pi\bigr)}\,\hat\phi\bigl(\tfrac{\xi}{2}\bigr) = -\frac{i}{\sqrt{2\pi}}\,\frac{\sin^2(\xi/4)}{\xi/4}\,e^{i\xi/2}, \]

which is the same as 1.6.(1), up to a factor $-e^{i\xi}$. This means that the $\psi$ we have constructed here is translated one unit to the left and is multiplied by $-1$ with respect to the "official" Haar wavelet. This fact is corroborated, if we now compute the $g_k$ by means of (17):

\[ g_{-1} = \overline{h_0} = \frac{1}{\sqrt2}, \qquad g_{-2} = -\overline{h_1} = -\frac{1}{\sqrt2}, \]

all remaining $g_k$ being zero. This gives

\[ \psi = \frac{1}{\sqrt2}\,\phi_{-1,-1} - \frac{1}{\sqrt2}\,\phi_{-1,-2} \]

resp. $\psi(t) = \phi(2t+1) - \phi(2t+2)$, as announced above. The reader may convince himself on his own that the alternative definition (19) of the $g_k$ (in the case at hand we have $N=1$) would have led to the "official" $\psi_{\rm Haar}$, whose support coincides with that of $\phi_{\rm Haar}$. ○
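The index bookkeeping in (17) and (19) is easy to get wrong, so a small numerical check may help. The following Python sketch builds $\psi$ from (18) for the Haar coefficients $h_0=h_1=1/\sqrt2$ under both conventions. The function names are illustrative only, and real $h_k$ are assumed, so that the complex conjugation in (17) and (19) can be dropped.

```python
import math

def phi_haar(t):
    """Haar scaling function: indicator of [0, 1)."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0

def g_from_17(h):
    """Formula (17) for real h_k: g_k = (-1)^(k-1) * h_{-k-1}; h and g are dicts index -> value."""
    g = {}
    for m, value in h.items():              # m = -k-1, i.e. k = -m-1
        sign = -1.0 if m % 2 else 1.0       # (-1)^(k-1) has the same parity as (-1)^m
        g[-m - 1] = sign * value
    return g

def g_from_19(h, N):
    """Formula (19) for real h_k: g_k = (-1)^k * h_{2N-1-k}, 0 <= k <= 2N-1."""
    return {k: (-1.0) ** k * h.get(2 * N - 1 - k, 0.0) for k in range(2 * N)}

def psi(t, g):
    """Formula (18): psi(t) = sqrt(2) * sum_k g_k * phi(2t - k)."""
    return math.sqrt(2.0) * sum(gk * phi_haar(2.0 * t - k) for k, gk in g.items())

h_haar = {0: 1.0 / math.sqrt(2.0), 1: 1.0 / math.sqrt(2.0)}

def psi_17(t):
    return psi(t, g_from_17(h_haar))

def psi_19(t):
    return psi(t, g_from_19(h_haar, N=1))

for t in (-0.75, -0.25, 0.25, 0.75):
    print(t, round(psi_17(t), 12), round(psi_19(t), 12))
```

With (17) the nonzero values sit on $[-1,0[$ (shifted and negated Haar wavelet), with (19) on $[0,1[$, in line with the remark about the support of $\psi_{\rm Haar}$.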
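② As our second example we present the so-called Meyer wavelet. For its construction we again make use of the auxiliary function

\[ \nu(x) := \begin{cases} 0 & (x\le 0)\\ 10x^3-15x^4+6x^5 & (0\le x\le 1)\\ 1 & (x\ge 1) \end{cases} \]

shown in Figure 4.6 (this $\nu(\cdot)$ has nothing to do with the $\nu(\cdot)$'s appearing in Theorem (5.11)). We set

\[ \hat\phi(\xi) := \begin{cases} \dfrac{1}{\sqrt{2\pi}} & \bigl(|\xi|\le\frac{2\pi}{3}\bigr)\\[1.5ex] \dfrac{1}{\sqrt{2\pi}}\,\cos\Bigl(\dfrac{\pi}{2}\,\nu\bigl(\dfrac{3}{2\pi}|\xi|-1\bigr)\Bigr) & \bigl(\frac{2\pi}{3}\le|\xi|\le\frac{4\pi}{3}\bigr)\\[1.5ex] 0 & \bigl(|\xi|\ge\frac{4\pi}{3}\bigr) \end{cases} \]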
Figure 5.2
(see Figure 5.2). This defines a function $\phi\in L^2$ about which we can say the following right away: From the fact that $\hat\phi$ has compact support it follows that $\phi\in C^\infty$, and because of $\hat\phi\in C^2$ the assumption 5.2.(6) of Theorem (5.8) is satisfied by $\phi$; furthermore, one has

\[ \int\phi(t)\,dt = \sqrt{2\pi}\,\hat\phi(0) = 1, \]

as is required for $\overline{\bigcup_j V_j} = L^2$, see (5.8). In view of Proposition (5.9) we now have to examine the function

\[ \Phi(\xi) := \sum_l\bigl|\hat\phi(\xi+2\pi l)\bigr|^2 . \]
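The periodized sum $\Phi$ can also be evaluated numerically. The sketch below is a minimal Python check under stated assumptions: the names `nu`, `phi_hat` and `Phi` are illustrative, and the sum is truncated to $|l|\le 2$, which is harmless because $\hat\phi$ vanishes outside $[-\frac{4\pi}{3},\frac{4\pi}{3}]$. It samples $2\pi\,\Phi(\xi)$ on $[-\pi,\pi]$ and reports the largest deviation from 1.

```python
import math

def nu(x):
    """Auxiliary function nu: 0 for x <= 0, 1 for x >= 1, 10x^3 - 15x^4 + 6x^5 in between."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x ** 3 * (10.0 - 15.0 * x + 6.0 * x * x)

def phi_hat(xi):
    """Fourier transform of the Meyer scaling function as defined in the text."""
    a = abs(xi)
    if a <= 2.0 * math.pi / 3.0:
        return 1.0 / math.sqrt(2.0 * math.pi)
    if a >= 4.0 * math.pi / 3.0:
        return 0.0
    return math.cos(0.5 * math.pi * nu(3.0 * a / (2.0 * math.pi) - 1.0)) / math.sqrt(2.0 * math.pi)

def Phi(xi, l_max=2):
    """Truncated periodization: sum over |l| <= l_max of |phi_hat(xi + 2 pi l)|^2."""
    return sum(phi_hat(xi + 2.0 * math.pi * l) ** 2 for l in range(-l_max, l_max + 1))

grid = [-math.pi + 2.0 * math.pi * i / 1000 for i in range(1001)]
worst = max(abs(2.0 * math.pi * Phi(xi) - 1.0) for xi in grid)
print("max |2 pi Phi - 1| =", worst)
```

The deviation stays at rounding level because, in the overlap region, the two contributing terms are $\cos^2$ and $\sin^2$ of the same angle.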
A short glance at Figure 5.2 shows that it is sufficient to verify condition (1) in the $\xi$-interval $[\frac{2\pi}{3},\frac{4\pi}{3}]$. In this interval only the two terms corresponding to $l=0$ and $l=-1$ contribute anything to $\Phi(\xi)$ at all. Because of

\[ \frac{3}{2\pi}\,|\xi-2\pi| - 1 = 1 - \Bigl(\frac{3}{2\pi}\,\xi - 1\Bigr) \qquad \Bigl(\frac{2\pi}{3}\le\xi\le\frac{4\pi}{3}\Bigr) \]

\[ \phi(t) = \sqrt2\sum_k h_k\,\phi(2t-k), \tag{1} \]

paired with the analogous equation for $\psi$. The latter can by 5.3.(18) be written in the form

\[ \psi(t) = \sqrt2\sum_k g_k\,\phi(2t-k), \tag{2} \]
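the $g_k$ appearing in (2) being related to the $h_k$ according to 5.3.(17) or 5.3.(19). From (1) we deduce, for arbitrary $j\in\mathbb{Z}$, $n\in\mathbb{Z}$, the identity

\[ 2^{-j/2}\,\phi\Bigl(\frac{t}{2^j}-n\Bigr) = \sum_k h_k\;2^{-(j-1)/2}\,\phi\Bigl(\frac{t}{2^{j-1}}-2n-k\Bigr). \]

This may be written in the form

\[ \forall j,\ \forall n:\qquad \phi_{j,n} = \sum_k h_k\,\phi_{j-1,2n+k}\,, \tag{3} \]

that is to say, as a recursion formula for $\phi_{j-1,\cdot}\rightsquigarrow\phi_{j,\cdot}$. In an analogous way one obtains from (2) the formula

\[ \forall j,\ \forall n:\qquad \psi_{j,n} = \sum_k g_k\,\phi_{j-1,2n+k}\,, \tag{4} \]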
which leads from the array $\phi_{j-1,\cdot}$ to the array $\psi_{j,\cdot}$.

We are now going to analyze a time signal $f\in L^2$, and having done that we are going to synthesize it back to its original appearance. In the whole process there will be a finest scale to be considered; we may assume that it belongs to the value $j=0$. Therefore the analysis begins with the data

\[ a_{0,k} := \langle f,\phi_{0,k}\rangle = \int f(t)\,\overline{\phi(t-k)}\,dt . \]

These values could be determined, e.g., by numerical integration. It may also be the case that $f$ is only given in the form of a discrete array $(f(k)\mid k\in\mathbb{Z})$ to begin with. In such circumstances one simply puts

\[ a_{0,k} := f(k) \qquad (k\in\mathbb{Z}). \]
5.4 Algorithms
This is not so far-fetched in view of the fact that $\int\phi(t)\,dt = 1$, particularly in the case when $\phi$ has a narrow support and subsequent values of $f$ do not differ much from each other. Be that as it may, for the remaining discussion our basic assumption on $f$ can be summarized as follows:

\[ P_0f = \sum_k a_{0,k}\,\phi_{0,k}\,. \]

The wavelet analysis now proceeds in the direction of increasing $j$, and this means in the direction of ever longer waves resp. toward more drawn-out features of the signal $f$. We describe right away the step $j-1\rightsquigarrow j$. Let $j\ge1$ and assume

\[ P_{j-1}f = \sum_k a_{j-1,k}\,\phi_{j-1,k}\,, \tag{5} \]
where the values $a_{j-1,k}$ are known and stored in an array. Intuitively speaking, the image $P_{j-1}f$ encompasses all features of $f$ having a spread of size $\ge 2^{j-1}$ on the time axis; see our detailed explanations in this regard in Section 5.1. Our first task is the computing of the quantities $a_{j,n}$ ($n\in\mathbb{Z}$). Using (3) we obtain

\[ a_{j,n} := \langle f,\phi_{j,n}\rangle = \sum_k\overline{h_k}\,\langle f,\phi_{j-1,2n+k}\rangle , \]

so that we can write down the following recursion formula for the step from $a_{j-1,\cdot}$ to $a_{j,\cdot}$:

\[ \boxed{\;a_{j,n} = \sum_k\overline{h_k}\;a_{j-1,2n+k}\;} \]

The array $a_{j,\cdot}$ encodes the next coarser approximation of $f$, to wit

\[ P_jf = \sum_k a_{j,k}\,\phi_{j,k}\,. \]

The approximations $P_{j-1}f$ and $P_jf$ are related to each other by the formula

\[ P_{j-1}f = P_jf + Q_jf , \]

$Q_j$ denoting the orthogonal projection onto $W_j$. The image $Q_jf$ contains all features (details) of $f$ that have a time spread of size $\sim 2^j/\sqrt2$. Since $(\psi_{j,k}\mid k\in\mathbb{Z})$ is an orthonormal basis of $W_j$, we can write

\[ Q_jf = \sum_k d_{j,k}\,\psi_{j,k}\,, \]
and on account of (4) the coefficients appearing here are given by

\[ d_{j,n} = \langle f,\psi_{j,n}\rangle = \sum_k\overline{g_k}\,\langle f,\phi_{j-1,2n+k}\rangle . \]

Expressing the scalar products on the right by means of (5) we therefore obtain the following formula for the "diagonal" step from $a_{j-1,\cdot}$ to $d_{j,\cdot}$:

\[ \boxed{\;d_{j,n} = \sum_k\overline{g_k}\;a_{j-1,2n+k}\;} \]

The information about the time signal $f$ that was extracted in the transition from $P_{j-1}f$ to $P_jf$ is now stored in the array $d_{j,\cdot}$. Contrary to the "temporary" quantities $a_{j,k}$, the $d_{j,k}$ are actual wavelet coefficients. Altogether we obtain the following cascade, in the course of which at each step the signal $f$ is made coarser by a factor of two and at the same time details having a time spread of size $\sim 2^j/\sqrt2$ are extracted:
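\[
\begin{array}{ccccccccccc}
a_{0,\cdot} & \xrightarrow{\ \bar h\ } & a_{1,\cdot} & \xrightarrow{\ \bar h\ } & a_{2,\cdot} & \xrightarrow{\ \bar h\ } & a_{3,\cdot} & \xrightarrow{\ \bar h\ } & \cdots & \xrightarrow{\ \bar h\ } & a_{J,\cdot} \\
 & \searrow{\scriptstyle\bar g} & & \searrow{\scriptstyle\bar g} & & \searrow{\scriptstyle\bar g} & & \searrow{\scriptstyle\bar g} & & \searrow{\scriptstyle\bar g} & \\
 & & d_{1,\cdot} & & d_{2,\cdot} & & d_{3,\cdot} & & \cdots & & d_{J,\cdot}
\end{array}
\]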
(6)

The wavelet analysis (6) of the given time signal $f$ is terminated after $J$ steps, where the number $J$ comes out in a natural way, see below. We now address the following question: How many arithmetical operations were necessary for this analysis? In order to fix ideas we assume from the outset that the scaling function $\phi$ has compact support. We know from (5.7) that in this case the numbers $a(\phi)$ and $b(\phi)$ are integers. In keeping with the notation used in certain famous examples later on we assume that

\[ a(\phi) = 0, \qquad b(\phi) = 2N-1, \qquad N\ge1 . \]

It follows from (5.7) that only the $h_k$ with $0\le k\le 2N-1$ are different from 0, and the same is true for the $g_k$, if we agree on 5.3.(19). We introduce the following piece of notation: If $x_\cdot$ is an arbitrary array over the index set $\mathbb{Z}$, then the formulas

\[ \operatorname{supp}(x_\cdot)\subset[p,q[\,, \qquad \operatorname{length}(x_\cdot)\le q-p \]
express the fact that at most the $x_k$ with $p\le k<q$ are nonzero and that at most $q-p$ individual entries are considered resp. stored at all. (The numbers $p$ and $q$ need not be integers.) The array $a_{0,\cdot}$ encodes all the information that we are going to use about the time signal $f$. For simplicity, we assume, e.g.,

\[ \operatorname{supp}(a_{0,\cdot})\subset[0,2^J[\,, \qquad \operatorname{length}(a_{0,\cdot}) = 2^J . \]

We assert that under the described circumstances the supports of the arrays $a_{j,\cdot}$ can be bounded as follows:

\[ \operatorname{supp}(a_{j,\cdot})\subset[-2N+2,\,2^{J-j}[ \qquad (j\ge0). \tag{7} \]
⌜ For $j=0$ the assertion is true by assumption. For the step $j-1\rightsquigarrow j$ we may suppose that $j\ge1$ and that

\[ \operatorname{supp}(a_{j-1,\cdot})\subset[-2N+2,\,q[\,, \qquad q := 2^{J-(j-1)} . \]

Because of

\[ a_{j,n} = \sum_{k=0}^{2N-1}\overline{h_k}\;a_{j-1,2n+k}\,, \]

a component $a_{j,n}$ can be $\ne0$ only if the two sets

\[ \{2n,\,2n+1,\,\ldots,\,2n+2N-1\} \qquad\text{and}\qquad [-2N+2,\,q[ \]

have a nonempty intersection, and for the latter it is necessary and sufficient that the inequalities

\[ 2n<q \quad\wedge\quad 2n+2N-1\ge -2N+2 \]

hold. The first of these says $n<q/2=2^{J-j}$, the second $n\ge -2N+\frac32$. Thus we may conclude that $\operatorname{supp}(a_{j,\cdot})$ is bounded as stated in (7). ⌟

Formula (7) suggests that we terminate the process after $J$ steps, since from then on $\operatorname{supp}(a_{j,\cdot})$ stays put at $[-2N+2,\,0]$. How many multiplications have been carried out up to this point? (For the sake of simplicity we are disregarding the additions here.) The computation of an individual value $a_{j,n}$ requires at most $\operatorname{length}(h_\cdot)=2N$ multiplications. On the other hand we conclude from (7) that

\[ \operatorname{length}(a_{j,\cdot})\le 2^{J-j}+2N-2 \qquad (j\ge0), \]
and for $\operatorname{length}(d_{j,\cdot})$ we obviously have the same bound. Altogether we obtain the following upper bound for the total number $\mu$ of multiplications required for the complete analysis of the given signal $f$:

\[ \mu\le 2\cdot2N\cdot\sum_{j=1}^{J}\bigl(2^{J-j}+2N-2\bigr) = 4N\bigl(2^J-1+J(2N-2)\bigr). \]

This implies

\[ \mu\le 2\,\operatorname{length}(h_\cdot)\,\operatorname{length}(a_{0,\cdot})\,\bigl(1+o(1)\bigr) \qquad (J\to\infty); \]

that is to say, the number of required operations is linear in the input length. Starting from $a_{0,\cdot}$ and proceeding in the described way we have computed in $J\ge1$ steps the coefficient arrays

\[ d_{1,\cdot},\ d_{2,\cdot},\ \ldots,\ d_{J,\cdot},\ a_{J,\cdot} \]
(the intermediate or "temporary" arrays $a_{0,\cdot},\ldots,a_{J-1,\cdot}$ are no longer needed). The total length of these arrays is about equal to $\operatorname{length}(a_{0,\cdot})$, so that at first glance we have gained nothing in terms of storage requirements. But we have to bear in mind that the individual coefficient arrays $d_{j,\cdot}$ will contain long sequences of negligible entries $d_{j,k}$, depending on the fine structure of the time signal $f$ in different regions of the $t$-axis. By disregarding all $d_{j,k}$ whose absolute value is below a certain threshold and releasing the corresponding storage cells one is able to achieve spectacular compression ratios without significant loss of information. For instructive examples in this regard, we refer the reader to [19].

Now for the synthesis: Here we obtain an algorithm of a similar simplicity. Since the step $j-1\rightsquigarrow j$ amounts to replacing the orthonormal basis $\phi_{j-1,\cdot}$ of $V_{j-1}$ by the likewise orthonormal basis $\phi_{j,\cdot}\cup\psi_{j,\cdot}$, the reverse step $j\rightsquigarrow j-1$ does not necessitate the inversion of a certain matrix. The details are as follows: One has

\[ P_{j-1}f = \sum_k a_{j,k}\,\phi_{j,k} + \sum_k d_{j,k}\,\psi_{j,k} \]

and consequently

\[ a_{j-1,n} = \langle P_{j-1}f,\phi_{j-1,n}\rangle = \sum_k a_{j,k}\,\langle\phi_{j,k},\phi_{j-1,n}\rangle + \sum_k d_{j,k}\,\langle\psi_{j,k},\phi_{j-1,n}\rangle . \]

The scalar products appearing on the right can be read off from (3) and (4):

\[ \langle\phi_{j,k},\phi_{j-1,n}\rangle = h_{n-2k}\,, \qquad \langle\psi_{j,k},\phi_{j-1,n}\rangle = g_{n-2k}\,, \]

so that altogether the following synthesis formula emerges:

\[ \boxed{\;a_{j-1,n} = \sum_k h_{n-2k}\,a_{j,k} + \sum_k g_{n-2k}\,d_{j,k}\;} \]

In this way we obtain as a counterpart to (6) an "upward" cascade that takes the coefficient arrays $a_{J,\cdot}$, $d_{J,\cdot}$, $d_{J-1,\cdot}$, ..., $d_{2,\cdot}$, $d_{1,\cdot}$ as its input and finally returns $a_{0,\cdot}$, i.e., $P_0f$, as its output:

\[
\begin{array}{ccccccccccc}
a_{J,\cdot} & \xrightarrow{\ h\ } & a_{J-1,\cdot} & \xrightarrow{\ h\ } & a_{J-2,\cdot} & \xrightarrow{\ h\ } & \cdots & \xrightarrow{\ h\ } & a_{1,\cdot} & \xrightarrow{\ h\ } & a_{0,\cdot} \\
 & \nearrow{\scriptstyle g} & & \nearrow{\scriptstyle g} & & & & \nearrow{\scriptstyle g} & & \nearrow{\scriptstyle g} & \\
d_{J,\cdot} & & d_{J-1,\cdot} & & \cdots & & d_{2,\cdot} & & d_{1,\cdot} & &
\end{array}
\]
We leave it to the reader as an exercise to compute the total number $\mu$ of multiplications required for such a synthesis. The resulting figure will be about twice as large as the $\mu$ from the "downward" cascade (6).

The boxed formulas show that we need only a table of the $h_k$ and the $g_k$ in order to be able to begin with concrete numerical work. Neither the scaling function $\phi$ nor the mother wavelet $\psi$ have to be stored, be it numerically or otherwise, nor do they have to be recomputed again and again at runtime. (By the way, one does not need to understand anything of the underlying theory either...) In [D] one finds a great number of such tables; they relate to various wavelets $\psi$ that for one reason or another have proved their worth. The following example of such a table belongs to the so-called Daubechies wavelet ${}_3\psi$ having support $[0,5]$:

  k    h_k                     g_k = (-1)^k h_{5-k}
  0     .3326705529500825       .0352262918857095
  1     .8068915093110924       .0854412738820267
  2     .4598775021184914      -.1350110200102546
  3    -.1350110200102546      -.4598775021184914
  4    -.0854412738820267       .8068915093110924
  5     .0352262918857095      -.3326705529500825

(8)

We shall construct this wavelet in 6.2.② ab ovo; only there we shall see how the values of the $h_k$ tabulated above come about.
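The three boxed formulas translate almost literally into code. The following Python sketch is one possible rendering under stated assumptions: the $h_k$ are taken from Table (8) (real, so the conjugation bars play no role), the index $2n+k$ is wrapped around periodically (an assumption made here to keep all arrays finite; the text works with finitely supported arrays over $\mathbb{Z}$), and all function names are illustrative.

```python
import math

# Table (8): h_k for the Daubechies wavelet with N = 3, and g_k = (-1)^k h_{5-k}.
H = [0.3326705529500825, 0.8068915093110924, 0.4598775021184914,
     -0.1350110200102546, -0.0854412738820267, 0.0352262918857095]
G = [(-1) ** k * H[5 - k] for k in range(6)]

def analysis_step(a):
    """One step a_{j-1,.} -> (a_{j,.}, d_{j,.}); indices 2n+k are wrapped periodically."""
    m = len(a)
    coarse = [sum(H[k] * a[(2 * n + k) % m] for k in range(6)) for n in range(m // 2)]
    detail = [sum(G[k] * a[(2 * n + k) % m] for k in range(6)) for n in range(m // 2)]
    return coarse, detail

def synthesis_step(coarse, detail):
    """One step back: a_{j-1,n} = sum_k h_{n-2k} a_{j,k} + sum_k g_{n-2k} d_{j,k}."""
    m = 2 * len(coarse)
    a = [0.0] * m
    for k in range(len(coarse)):
        for i in range(6):                       # i = n - 2k
            a[(2 * k + i) % m] += H[i] * coarse[k] + G[i] * detail[k]
    return a

def analyze(a0, J):
    """Downward cascade (6): returns a_{J,.} and the list [d_{1,.}, ..., d_{J,.}]."""
    details, a = [], list(a0)
    for _ in range(J):
        a, d = analysis_step(a)
        details.append(d)
    return a, details

def synthesize(aJ, details):
    """Upward cascade: rebuilds a_{0,.} from a_{J,.} and the detail arrays."""
    a = list(aJ)
    for d in reversed(details):
        a = synthesis_step(a, d)
    return a

a0 = [math.sin(0.3 * k) + 0.05 * k for k in range(32)]
aJ, details = analyze(a0, J=3)
rebuilt = synthesize(aJ, details)
print("reconstruction error:", max(abs(x - y) for x, y in zip(a0, rebuilt)))
```

Only the two coefficient tables enter the computation; neither $\phi$ nor $\psi$ is ever evaluated, which is the point made in the paragraph above.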
① (Continuation of 5.3.②) We have not yet computed the $h_k$ corresponding to the Meyer wavelet. That's what we are going to do now. The generating function $H(\cdot)$ is given by 5.3.(22) and is an even function, as is $\hat\phi$. Thus on account of 5.3.(3) we obtain successively

\[ h_k = \frac{\sqrt2}{2\pi}\int_{-\pi}^{\pi}H(\xi)\,e^{ik\xi}\,d\xi = \frac{\sqrt2}{\pi}\int_0^{\pi}H(\xi)\cos(k\xi)\,d\xi = \frac{\sqrt2}{\pi}\int_0^{\pi}\sqrt{2\pi}\,\sum_l\hat\phi(2\xi+4\pi l)\,\cos(k\xi)\,d\xi . \]

In the last sum, only the term corresponding to $l=0$ is contributing anything to the integral, whence we obtain

\[ h_k = h_{-k} = \frac{2}{\sqrt\pi}\int_0^{\pi}\hat\phi(2\xi)\cos(k\xi)\,d\xi . \]

These integrals now have to be computed numerically. In view of the function $\nu(\cdot)$ used in the construction, the resulting $\hat\phi$ has 4-clicks at the two points $\pm\frac{2\pi}{3}$ and 3-clicks at the two points $\pm\frac{4\pi}{3}$; apart from that it is infinitely differentiable. This implies (cf. Example 1.2.②) that for $k\to\infty$ the $h_k$ decay only like $1/k^4$. The numerical computation results in the following values:

   k    h_k = h_{-k}       k    h_k = h_{-k}
   0    .748791           16    .000329
   1    .442347           17    .000061
   2    .039431           18    .000333
   3    .127928           19    .000231
   4    .033278           20    .000059
   5    .057120           21    .000174
   6    .024807           22    .000115
   7    .025310           23    .000027
   8    .016000           24    .000115
   9    .009538           25    .000067
  10    .008556           26    .000028
  11    .002451           27    .000066
  12    .003416           28    .000040
  13    .000058           29    .000015
  14    .000647           30    .000046
  15    .000225           31    .000027

○
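The integral formula for the Meyer coefficients can be evaluated with a few lines of code. The sketch below is a hedged illustration, not the procedure used for the table: it assumes the definitions of $\nu$ and $\hat\phi$ from 5.3.②, applies a plain composite trapezoidal rule with an arbitrarily chosen number of sample points, and uses illustrative names throughout. Running it yields the values together with their signs.

```python
import math

def nu(x):
    """Auxiliary function nu: 0 for x <= 0, 1 for x >= 1, 10x^3 - 15x^4 + 6x^5 in between."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x ** 3 * (10.0 - 15.0 * x + 6.0 * x * x)

def phi_hat(xi):
    """Fourier transform of the Meyer scaling function."""
    a = abs(xi)
    if a <= 2.0 * math.pi / 3.0:
        return 1.0 / math.sqrt(2.0 * math.pi)
    if a >= 4.0 * math.pi / 3.0:
        return 0.0
    return math.cos(0.5 * math.pi * nu(3.0 * a / (2.0 * math.pi) - 1.0)) / math.sqrt(2.0 * math.pi)

def meyer_h(k_max, samples=20000):
    """h_k = (2/sqrt(pi)) * int_0^pi phi_hat(2 xi) cos(k xi) d xi for k = 0..k_max."""
    step = math.pi / samples
    grid = [i * step for i in range(samples + 1)]
    # trapezoidal weights: 1/2 at both end points, 1 elsewhere
    f = [(0.5 if i in (0, samples) else 1.0) * phi_hat(2.0 * xi) for i, xi in enumerate(grid)]
    scale = 2.0 / math.sqrt(math.pi) * step
    return [scale * sum(fi * math.cos(k * xi) for fi, xi in zip(f, grid))
            for k in range(k_max + 1)]

h = meyer_h(31)
for k in range(8):
    print(k, f"{h[k]: .6f}")
# Since H(0) = 1, the full sum of the h_k (with h_{-k} = h_k) should be close to sqrt(2).
print("sum:", h[0] + 2.0 * sum(h[1:]))
```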
6 Orthonormal wavelets with compact support
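6.1 The basic idea

We are confronted with the task of producing scaling functions $\phi\colon\mathbb{R}\to\mathbb{C}$ having the following properties:

(a) $\phi\in L^2$, $\operatorname{supp}(\phi)$ compact,

(b) $\phi(t)\equiv\sqrt2\sum_k h_k\,\phi(2t-k)$ resp. $\hat\phi(\xi)=H\bigl(\tfrac{\xi}{2}\bigr)\,\hat\phi\bigl(\tfrac{\xi}{2}\bigr)$,

(c) $\int\phi(t)\,dt=1$ resp. $\hat\phi(0)=\frac{1}{\sqrt{2\pi}}$,

(d) $\int\phi(t)\,\overline{\phi(t-k)}\,dt=\delta_{0k}$ $\ \forall k$, resp. $\sum_l\bigl|\hat\phi(\xi+2\pi l)\bigr|^2\equiv\frac{1}{2\pi}$.

If all these conditions are met, then Theorem (5.13) will provide us with an orthonormal basis of wavelets $\psi_{j,k}$ having compact support.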
Condition (a) immediately implies $\phi\in L^1$ and $\hat\phi\in C^\infty$; furthermore, we know from (5.6) that only finitely many $h_k$ are nonzero. It follows that the generating function

\[ H(\xi) := \frac{1}{\sqrt2}\sum_k h_k\,e^{-ik\xi} \]

is a trigonometric polynomial satisfying the identity

\[ |H(\xi)|^2 + |H(\xi+\pi)|^2 \equiv 1 \tag{1} \]

and having the special values $H(0)=1$, $H(\pi)=0$; see (5.10).

The systematic construction of polynomials with these properties is an algebraic problem that we shall take up in the next section. For the moment we assume that we have such an $H$ at our disposal, and we begin our undertaking by showing that the corresponding scaling function $\phi$, if there is one at all, is uniquely determined by $H$. Applying (b) recursively $r$ times we obtain

\[ \hat\phi(\xi) = \prod_{j=1}^{r}H\Bigl(\frac{\xi}{2^j}\Bigr)\cdot\hat\phi\Bigl(\frac{\xi}{2^r}\Bigr) \]

and, therefore, because of (c),

\[ \hat\phi(\xi) = \frac{1}{\sqrt{2\pi}}\,\lim_{r\to\infty}\prod_{j=1}^{r}H\Bigl(\frac{\xi}{2^j}\Bigr), \tag{2} \]
if the infinite product converges. In this regard, we show:

(6.1) Assume that the generating function $H\in C^1$ satisfies the identity (1) as well as $H(0)=1$. Then the product (2) converges locally uniformly on $\mathbb{R}$ to a function $\hat\phi\in L^2$.
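The convergence asserted in (6.1) can be watched in the simplest case. The following Python sketch compares the truncated product (2) with the closed form 5.3.(20). It assumes the Haar generating function $H(\xi)=\frac12(1+e^{-i\xi})$ from 5.3.(21); the truncation index `r` and all function names are illustrative choices.

```python
import cmath
import math

def H_haar(xi):
    """Haar generating function, 5.3.(21)."""
    return 0.5 * (1.0 + cmath.exp(-1j * xi))

def phi_hat_product(xi, r=40, H=H_haar):
    """Truncated version of (2): (1/sqrt(2 pi)) * prod_{j=1}^{r} H(xi / 2^j)."""
    prod = 1.0 + 0.0j
    for j in range(1, r + 1):
        prod *= H(xi / 2.0 ** j)
    return prod / math.sqrt(2.0 * math.pi)

def phi_hat_haar(xi):
    """Closed form 5.3.(20) of the Fourier transform of the Haar scaling function."""
    if xi == 0.0:
        return complex(1.0 / math.sqrt(2.0 * math.pi))
    return math.sin(xi / 2.0) / (xi / 2.0) * cmath.exp(-0.5j * xi) / math.sqrt(2.0 * math.pi)

for xi in (0.0, 0.5, 3.0, 10.0, -7.0):
    print(xi, abs(phi_hat_product(xi) - phi_hat_haar(xi)))
```

For any other trigonometric polynomial $H$ with $H(0)=1$ the same loop applies; only the closed form used for comparison is specific to the Haar case.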
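⌜ Setting

\[ \max_\xi|H'(\xi)| =: M \]

and using the mean value theorem of differential calculus we obtain

\[ |H(\xi)-1| = |H(\xi)-H(0)| \le M\,|\xi| \qquad (\xi\in\mathbb{R}), \]

therefore we may conclude

\[ \Bigl|H\Bigl(\frac{\xi}{2^j}\Bigr)-1\Bigr| \le M\,\frac{|\xi|}{2^j} \qquad (j\ge0). \]

Because $\sum_{j\ge1}2^{-j}=1$, this implies by general principles that the product (2) is converging locally uniformly to a continuous function $\hat\phi\colon\mathbb{R}\to\mathbb{C}$.

In order to prove $\hat\phi\in L^2$ we have to modify the limiting process leading from $H$ to $\hat\phi$ slightly by means of a "cutoff function". To this end we set

\[ \hat f_0(\xi) := \frac{1}{\sqrt{2\pi}}\,1_{[-\pi,\pi[}(\xi) \]

and define recursively, as in (b),

\[ \hat f_{r+1}(\xi) := H\Bigl(\frac{\xi}{2}\Bigr)\,\hat f_r\Bigl(\frac{\xi}{2}\Bigr). \tag{3} \]

This implies

\[ \hat f_r(\xi) = \frac{1}{\sqrt{2\pi}}\,\prod_{j=1}^{r}H\Bigl(\frac{\xi}{2^j}\Bigr)\cdot1_{[-2^r\pi,\,2^r\pi[}(\xi). \tag{4} \]

For any given $\xi\in\mathbb{R}$ there is an $r_0$ such that

\[ -2^r\pi\le\xi<2^r\pi \qquad \forall r>r_0, \]

showing that the "cutoff factor" in (4) has no effect as soon as $r>r_0$. Therefore the comparison with (2) proves

\[ \lim_{r\to\infty}\hat f_r(\xi) = \hat\phi(\xi) \qquad (\xi\in\mathbb{R}); \]

moreover, we have locally uniform convergence of the $\hat f_r$ as well. The next point on the agenda is the following lemma: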
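(6.2) For each $r\ge0$ the family $\bigl(f_r(\cdot-k)\mid k\in\mathbb{Z}\bigr)$ is an orthonormal system.

⌜ Because of Proposition (5.9) the assertion of the lemma is equivalent to

\[ \sum_l\bigl|\hat f_r(\xi+2\pi l)\bigr|^2 \equiv \frac{1}{2\pi} \qquad (r\ge0). \]

Now the recursion formula (3) for the

mula for the functions as well as the mother wavelets $\psi$ will then be real-valued as well. According to 5.3.(13) the Fourier transform of $\psi$ is given by

\[ \hat\psi(\xi) = e^{i\xi/2}\,\overline{H\bigl(\tfrac{\xi}{2}+\pi\bigr)}\,\hat\phi\bigl(\tfrac{\xi}{2}\bigr). \]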
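Now, on account of what we said in Section 3.5 (see, e.g., Theorem (3.13)), we are interested in our wavelet $\psi$ having an order $N$ as high as possible, and according to 3.5.(3) this is equivalent to the requirement that $\hat\psi$ should vanish of an order $N$ as large as possible at $\xi=0$. As a consequence the generating function $H$ should have a zero of order $N\gg1$ at $\xi=\pi$, a fact that we express most elegantly by writing

\[ H(\xi) = \Bigl(\frac{1+e^{-i\xi}}{2}\Bigr)^{N}B(\xi), \qquad N\ge1 . \]

Instead of looking for $H$ we switch for a moment to the function

\[ M(\xi) := |H(\xi)|^2 \tag{1} \]

that would have to satisfy the linear identity

\[ M(\xi) + M(\xi+\pi) \equiv 1 . \tag{2} \]

For symmetry reasons the function $M$ is a polynomial in $\cos\xi$, and $M$ contains the factor

\[ \Bigl|\frac{1+e^{-i\xi}}{2}\Bigr|^{2N} = \Bigl(\cos^2\frac{\xi}{2}\Bigr)^{N}. \]

Therefore we may write

\[ M(\xi) = \Bigl(\cos^2\frac{\xi}{2}\Bigr)^{N}A(\xi), \qquad A(\xi) = p(\cos\xi), \tag{3} \]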
6.2 Algebraic constructions
where $p$ is a certain polynomial as well. Now we introduce a new variable $y$ by letting $y := \sin^2\frac{\xi}{2}$. This leads to

\[ A(\xi) = p(\cos\xi) = p(1-2y) =: P(y), \tag{4} \]

where again $P$ is a certain polynomial. In this way (3) becomes

\[ M(\xi) = (1-y)^N\,P(y). \]

Because of

\[ \cos^2\frac{\xi+\pi}{2} = \sin^2\frac{\xi}{2} = y \]

and

\[ A(\xi+\pi) = p(-\cos\xi) = p(2y-1) = p\bigl(1-2(1-y)\bigr) = P(1-y), \]

the identity (2) takes the following form when expressed in terms of the variable $y$:

\[ (1-y)^N\,P(y) + y^N\,P(1-y) \equiv 1 . \tag{5} \]

This formula is valid for $0\le y\le1$ at first, but by general principles on holomorphic functions we may conclude that it is true for arbitrary $y\in\mathbb{C}$. By the theorem on decomposition into partial fractions there are uniquely determined coefficients $c_k$, $c_k'$ such that

\[ \frac{1}{(1-y)^N\,y^N} = \sum_{k=1}^{N}\frac{c_k}{y^k} + \sum_{k=1}^{N}\frac{c_k'}{(1-y)^k}, \]

and for symmetry reasons one has $c_k = c_k'$ for all $k$. Clearing denominators, we can infer that there is a polynomial $P_N$ of degree $\le N-1$ such that

\[ (1-y)^N\,P_N(y) + y^N\,P_N(1-y) \equiv 1 \]

holds, and $P_N$ is the only polynomial solution of (5) having a degree $\le N-1$. Now it is easy to see that any solution $P$ of (5) satisfies the identity

\[ P(y) \equiv (1-y)^{-N}\bigl(1-y^N\,P(1-y)\bigr) \]

as well. In particular, this is the case for $P_N$, and this allows us to draw the following conclusion, $j_0^{N-1}$ denoting the Taylor polynomial of order $N-1$ at $y=0$:

\[ P_N(y) = j_0^{N-1}P_N(y) = \sum_{k=0}^{N-1}\binom{-N}{k}(-y)^k = \sum_{k=0}^{N-1}\binom{N+k-1}{k}\,y^k . \tag{6} \]
Here we have made use of the fact that the part of $P_N$ carrying the factor $y^N$ gives no contribution to $j_0^{N-1}P_N$. The solution of (5) having the smallest possible degree now has been determined explicitly: It is the right hand side of (6).

Now let $P$ be an arbitrary solution of (5). Then

\[ (1-y)^N\bigl(P(y)-P_N(y)\bigr) + y^N\bigl(P(1-y)-P_N(1-y)\bigr) \equiv 0 \tag{7} \]

and consequently

\[ P(y)-P_N(y) = y^N\,P^*(y) \]

for some polynomial $P^*$. If we insert this into (7) again, we obtain

\[ P^*(y) + P^*(1-y) \equiv 0, \]

which is equivalent to

\[ P^*(y) = R(1-2y) = R(\cos\xi), \qquad R\ \text{odd}. \]
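The algebra leading to (6) and to the general solution can be confirmed in exact rational arithmetic. The Python sketch below checks identity (5) for $P_N$ from (6) and for $P_N(y)+y^N R(1-2y)$ with one odd polynomial $R$. The particular $R$, the sample points and the function names are arbitrary illustrative choices.

```python
from fractions import Fraction
from math import comb

def P_N(N, y):
    """Right hand side of (6): sum_{k=0}^{N-1} binom(N+k-1, k) y^k."""
    return sum(comb(N + k - 1, k) * y ** k for k in range(N))

def lhs_of_5(P, N, y):
    """Left hand side of (5): (1-y)^N P(y) + y^N P(1-y)."""
    return (1 - y) ** N * P(y) + y ** N * P(1 - y)

def R(u):
    """An odd polynomial, chosen only for illustration."""
    return u ** 3 - 2 * u

samples = [Fraction(i, 7) for i in range(-7, 15)]
for N in range(1, 7):
    minimal = lambda y, N=N: P_N(N, y)
    general = lambda y, N=N: P_N(N, y) + y ** N * R(1 - 2 * y)
    assert all(lhs_of_5(minimal, N, y) == 1 for y in samples)
    assert all(lhs_of_5(general, N, y) == 1 for y in samples)

print([comb(3 + k - 1, k) for k in range(3)])   # coefficients of P_3: [1, 3, 6]
```

Because `Fraction` arithmetic is exact, the assertions test the polynomial identities themselves rather than a floating-point approximation.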
Since we can perform the same computations backward as well, all in all the following theorem has been proven:

(6.7) A trigonometric polynomial $M(\cdot)$ satisfies the identity (2) if and only if it has the following form:

\[ M(\xi) = \Bigl(\cos^2\frac{\xi}{2}\Bigr)^{N}\,P\Bigl(\sin^2\frac{\xi}{2}\Bigr). \]

Here

\[ P(y) = P_N(y) + y^N\,R(1-2y), \]

where $P_N$ is given by (6) and $R$ is an arbitrary odd polynomial.

In view of (1) such a function $M(\cdot)$ is of use only if $P$ satisfies the additional condition

\[ P(y)\ge0 \qquad (0\le y\le1). \]

Letting $P := P_N$, this condition is obviously satisfied. So much for the admissible functions $M$, these being related to $H$ by (1). In order to get the generating functions $H$ themselves, we must, so to speak, "take the square root of $M$". In doing this we only have to bother about the factor

\[ P\Bigl(\sin^2\frac{\xi}{2}\Bigr) = p(\cos\xi) = A(\xi) \]

introduced in (3). For carrying out this task a surprising lemma of Riesz will come to our help. It reads as follows:
(6.8) If

\[ A(\xi) = \sum_{k=0}^{n}a_k\cos^k\xi, \]

and if $A(\xi)\ge0$ for real $\xi$, in particular $A(0)=1$, then there is a trigonometric polynomial

\[ B(\xi) = \sum_{k=0}^{n}b_k\,e^{-ik\xi} \]

with real coefficients $b_k$ and $B(0)=1$, such that

\[ A(\xi) \equiv B(\xi)\,B(-\xi) \tag{8} \]

identically in $\xi$.
⌜ The function $A(\cdot)$ possesses a product representation of the form

\[ A(\xi) = a_n\prod_{j=1}^{n}(\cos\xi-c_j), \tag{9} \]

the $c_j$ being real or else appearing in complex conjugate pairs. We introduce the complex variable $z$ by writing $e^{i\xi} =: z$; then (9) goes over into

\[ A(\xi) = a_n\prod_{j=1}^{n}\Bigl(\frac{z+z^{-1}}{2}-c_j\Bigr). \tag{10} \]

In investigating the individual factors appearing in (10), we need the well known properties of the mapping $z\mapsto(z+z^{-1})/2$ as well as the identity

\[ \frac{z+z^{-1}}{2}-\frac{s+s^{-1}}{2} = -\frac{1}{2s}\,(z-s)(z^{-1}-s) \qquad (s\ne0). \tag{11} \]

(a) If $c_j\in\mathbb{R}$ and $|c_j|\ge1$, then there is an $s\in\mathbb{R}^*$ such that $c_j=(s+s^{-1})/2$. Therefore we obtain, using (11):

\[ \frac{z+z^{-1}}{2}-c_j = -\frac{1}{2s}\,(z-s)\cdot(z^{-1}-s). \]

(b) If $c_j\in\mathbb{R}$ and $|c_j|<1$, then there is an $s=e^{i\alpha}\ne\pm1$ such that

\[ c_j = \frac{s+s^{-1}}{2} = \cos\alpha . \]
This implies that $A(\xi)$ contains a factor $\cos\xi-\cos\alpha$, and the latter is not compatible with $A(\xi)\ge0$ ($\xi\in\mathbb{R}$), unless this factor occurs an even number of times. Therefore there is a $j'$ such that $c_{j'}=c_j$, and using (11) we obtain the identity

\[
\begin{aligned}
\Bigl(\frac{z+z^{-1}}{2}-c_j\Bigr)\Bigl(\frac{z+z^{-1}}{2}-c_{j'}\Bigr)
&= \frac{1}{4e^{2i\alpha}}\,(z-e^{i\alpha})(z^{-1}-e^{i\alpha})(z-e^{i\alpha})(z^{-1}-e^{i\alpha})\\
&= \frac14\,(z-e^{i\alpha})(z-e^{-i\alpha})\,(z^{-1}-e^{i\alpha})(z^{-1}-e^{-i\alpha})\\
&= \frac14\,(z^2-2z\cos\alpha+1)\cdot(z^{-2}-2z^{-1}\cos\alpha+1).
\end{aligned}
\]

(c) If $c_j\notin\mathbb{R}$, then there is, first, a $j'$ such that $c_{j'}=\overline{c_j}$ and, second, an $s\in\mathbb{C}^*$ such that

\[ c_j = \frac{s+s^{-1}}{2}, \qquad c_{j'} = \frac{\bar s+\bar s^{-1}}{2}. \]

Using (11) again we get

\[ \Bigl(\frac{z+z^{-1}}{2}-c_j\Bigr)\Bigl(\frac{z+z^{-1}}{2}-c_{j'}\Bigr) = \frac{1}{4|s|^2}\,(z-s)(z-\bar s)\cdot(z^{-1}-s)(z^{-1}-\bar s). \]

All things considered, it follows that it is possible to combine and to regroup the factors appearing in (10) in such a way that the resulting representation of $A(\xi)$ assumes the following form:

\[ A(\xi) = C\cdot Q(z)\,Q(z^{-1}). \]

Here $Q(z)=\sum_{k=0}^{n}q_kz^k$ is a polynomial with real coefficients $q_k$, and the constant $C\in\mathbb{C}^*$ is obtained by collecting $a_n$ and the various numerical factors that have appeared in (a)-(c). The extra condition $A(0)=1$ gives $C=1/(Q(1))^2$. It follows that, if we set $B(\xi):=Q(e^{-i\xi})/Q(1)$, then (8) is valid; therefore the lemma is proven. ⌟

The decomposition (8) is not uniquely determined, since in the cases (a) and (c) interchanging $s$ and $s^{-1}$ leads to another decomposition of the corresponding partial product of $A(\cdot)$. This, albeit modest, flexibility can be used to make
the resulting scaling function and in consequence the related mother wavelet more symmetrical. We shall not pursue this matter any further.

Assume that $N$ is given. If we choose for simplicity $P := P_N$, then $A(\cdot)$ becomes a polynomial of degree $N-1$ in $\cos\xi$ and $B(\cdot)$ a polynomial of degree $N-1$ in $e^{-i\xi}$. In this way the generating function

\[ H(\xi) = \Bigl(\frac{1+e^{-i\xi}}{2}\Bigr)^{N}B(\xi) \]

is of degree $2N-1$ in $e^{-i\xi}$, and the support of the corresponding scaling function ($=:{}_N\phi$) turns out to be the interval $[0,2N-1]$. The mother wavelets ${}_N\psi$ derived from the ${}_N\phi$ are called Daubechies wavelets.

① In the case $N=1$ we of course obtain the Haar wavelet. Formula (6) gives $P_1(y)\equiv1$, and this in turn implies $p(\cos\xi)\equiv1$, $B(\xi)\equiv1$, so that we finally get

\[ H(\xi) = \frac12\,\bigl(1+e^{-i\xi}\bigr), \]

which is in agreement with 5.3.(21). ○

The case $N=2$ shall be dealt with in detail in the next section; the case $N=3$ appears as Example ② below. In [D], Table 6.1, the coefficient vectors $(h_k\mid0\le k\le2N-1)$ corresponding to the Daubechies wavelets ${}_N\psi$ are given to 16 decimal places for $2\le N\le10$. In [L], Table 2.3, one finds these coefficients to six decimal places for $N$ from 2 to 5.
② We now describe in detail the case $N=3$, choosing

\[ P(y) := P_3(y) = \binom20+\binom31y+\binom42y^2 = 1+3y+6y^2 . \]

Inserting

\[ y = \sin^2\frac{\xi}{2} = \frac14\bigl(-e^{i\xi}+2-e^{-i\xi}\bigr) \]

into (4) we get

\[ A(\xi) = \frac38\,e^{2i\xi}-\frac94\,e^{i\xi}+\frac{19}{4}-\ldots \]

Figure 6.2 confirms that $A(\xi)$ is $\ge0$ throughout so that it makes sense to proceed with our computation.

Figure 6.2

In the case at hand, the function $B(\cdot)$ has the form $B(\xi)=b_0+b_1e^{-i\xi}+b_2e^{-2i\xi}$, so that we have to compare coefficients in the identity

\[ \bigl(b_0+b_1e^{-i\xi}+b_2e^{-2i\xi}\bigr)\bigl(b_0+b_1e^{i\xi}+b_2e^{2i\xi}\bigr) = \frac38\,e^{2i\xi}-\frac94\,e^{i\xi}+\frac{19}{4}-\ldots \]

For symmetry reasons it is enough to check the coefficients corresponding to $e^{2i\xi}$, $e^{i\xi}$ and 1. In this way we obtain the three equations

\[ b_0b_2=\frac38, \qquad b_0b_1+b_1b_2=-\frac94, \qquad b_0^2+b_1^2+b_2^2=\frac{19}{4}. \tag{12} \]

Because $A(0)=P(0)=1$, Lemma (6.8) guarantees that we can find real solutions $(b_0,b_1,b_2)$ that satisfy the additional condition $b_0+b_1+b_2=1$. If we use this condition to eliminate $b_0+b_2$ from the second equation in (12), we get for $b_1$ the quadratic equation $b_1^2-b_1-\frac94=0$, and this in turn leads to

\[ b_1 = \frac{1\pm\sqrt{10}}{2}. \]

We leave it to the reader to pursue the upper choice of the sign here; it will result in complex solutions $b_0$ and $b_2$. This means that we definitively have $b_1=(1-\sqrt{10})/2$, and because of the first equation in (12) we can say that $b_0$ and $b_2$ are the two solutions of the quadratic equation

\[ x^2-\frac{1+\sqrt{10}}{2}\,x+\frac38 = 0 . \]

Choosing arbitrarily (well, not quite...) one of the two possible assignments, we get

\[ B(\xi) = \frac{1+\sqrt{10}+\sqrt{5+2\sqrt{10}}}{4} + \frac{1-\sqrt{10}}{2}\,e^{-i\xi} + \frac{1+\sqrt{10}-\sqrt{5+2\sqrt{10}}}{4}\,e^{-2i\xi}, \]

so that we finally obtain

\[
\begin{aligned}
H(\xi) &= \Bigl(\frac{1+e^{-i\xi}}{2}\Bigr)^3B(\xi)\\
&= \frac18\bigl(1+3e^{-i\xi}+\ldots\bigr)\cdot\Bigl(\frac{1+\sqrt{10}+\sqrt{5+2\sqrt{10}}}{4}+\frac{1-\sqrt{10}}{2}\,e^{-i\xi}+\ldots\Bigr)\\
&= \frac{1+\sqrt{10}+\sqrt{5+2\sqrt{10}}}{32}+\frac{5+\sqrt{10}+3\sqrt{5+2\sqrt{10}}}{32}\,e^{-i\xi}+\ldots
\end{aligned}
\]

From the part of $H$ that is actually printed out here one can immediately read off $h_0$ and $h_1$:

\[ h_0 = \sqrt2\;\frac{1+\sqrt{10}+\sqrt{5+2\sqrt{10}}}{32} = 0.33267\ldots, \qquad h_1 = \sqrt2\;\frac{5+\sqrt{10}+3\sqrt{5+2\sqrt{10}}}{32} = 0.80689\ldots, \]

both in agreement with Table 5.4.(8). We leave it to the reader as an exercise to compute the remaining $h_k$ as well and so convince herself that we have indeed determined the coefficient vector $h_\cdot$ corresponding to the Daubechies wavelet ${}_3\psi$. Figures 6.3 and 6.4 show the functions ${}_3\phi$ and ${}_3\psi$ in the time domain. ○

Figure 6.3 The Daubechies scaling function ${}_3\phi$

Figure 6.4 The Daubechies wavelet ${}_3\psi$
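The exercise of computing the remaining $h_k$ for $N=3$ can be delegated to a few lines of Python. The sketch below multiplies out $H(\xi)=\bigl(\frac{1+e^{-i\xi}}{2}\bigr)^3B(\xi)$ as a polynomial in $w=e^{-i\xi}$, using the closed forms for $b_0$, $b_1$, $b_2$ found in this example, and compares the result with Table 5.4.(8). The helper `poly_mul` and the variable names are illustrative.

```python
import math

s10 = math.sqrt(10.0)
root = math.sqrt(5.0 + 2.0 * s10)
b = [(1.0 + s10 + root) / 4.0, (1.0 - s10) / 2.0, (1.0 + s10 - root) / 4.0]

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

H_coeffs = b
for _ in range(3):                         # multiply three times by (1 + w)/2
    H_coeffs = poly_mul(H_coeffs, [0.5, 0.5])
h = [math.sqrt(2.0) * c for c in H_coeffs] # H = (1/sqrt 2) * sum_k h_k w^k

table_8 = [0.3326705529500825, 0.8068915093110924, 0.4598775021184914,
           -0.1350110200102546, -0.0854412738820267, 0.0352262918857095]
for k, (computed, tabulated) in enumerate(zip(h, table_8)):
    print(k, f"{computed: .13f}", f"{tabulated: .13f}")

# Coefficient comparison (12): the autocorrelation of (b0, b1, b2) must reproduce A.
print(b[0] * b[2], b[1] * (b[0] + b[2]), sum(x * x for x in b))
```

The last line should print values close to $\frac38$, $-\frac94$ and $\frac{19}{4}$, which is the content of (12).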
6.3 Binary interpolation

In the two foregoing sections we obtained scaling functions and corresponding wavelets by means of constructions in the Fourier domain, and also as limiting functions of an iteration procedure. In neither approach, however, did we discuss the convergence behaviour in the time domain. Now there is a third method, called the direct method, for constructing scaling functions $\phi$. This method yields without a limiting process the exact values $\phi(x)$ at all "binary rational" points $x\in\mathbb{R}$, and it is with the help of this method that one obtains the best regularity results, e.g. for the Daubechies wavelets ${}_N\psi$.

In order to fix ideas, we assume that an $N>1$ has been chosen once and for all and, furthermore, that

\[ a(h_\cdot)=0, \qquad b(h_\cdot)=2N-1, \]

as agreed upon in connection with the Daubechies wavelets. The following abbreviations will prove useful:

\[ \{0,1,\ldots,2N-1\} =: J . \]

For the description of the binary rational numbers we use the handy notation

\[ \mathbb{D}_r := \{k\,2^{-r}\mid k\in\mathbb{Z}\} \quad (r\ge0), \qquad \mathbb{D} := \bigcup_{r\ge0}\mathbb{D}_r\,; \]

therefore we have the inclusions

\[ \mathbb{Z}=\mathbb{D}_0\subset\mathbb{D}_1\subset\ldots\subset\mathbb{D}_r\subset\mathbb{D}_{r+1}\subset\ldots\subset\mathbb{D}, \]

and $\mathbb{D}$ is dense in $\mathbb{R}$. The scaling equation now has the form

\[ \phi(t) = \sqrt2\sum_{k=0}^{2N-1}h_k\,\phi(2t-k), \qquad h_0\,h_{2N-1}\ne0 . \tag{1} \]

The "direct method" is founded on the following three simple facts:
• If $t\in\mathbb{D}_r$ for some $r\ge1$, then the numbers $2t-k$ ($k\in J$) belong to $\mathbb{D}_{r-1}$.
• If $t<0$, then the numbers $2t-k$ ($k\in J$) are $<0$ as well.
• If $t>2N-1$, then the numbers $2t-k$ ($k\in J$) are $>2N-1$ as well.

On account of these facts the scaling equation (2) allows us to compute the values of $\phi$ successively on $\mathbb{D}_1$, $\mathbb{D}_2$, ...,
and therefore on all of $\mathbb{D}$, if only these values have been determined on $\mathbb{D}_0=\mathbb{Z}$ beforehand. Moreover, if $\phi(k)=0$ for $k\in\mathbb{Z}\setminus J$ to begin with, then automatically $\phi(t)=0$ for all $t\in\mathbb{D}\setminus[0,2N-1]$.

$\ldots{}^{\perp}$, by counting dimensions we therefore have $V=\langle e\rangle^{\perp}$. Because $a\notin V$, this implies

\[ \sum_{k\in J}a_k = \langle e,a\rangle \ne 0, \]

which is enough to show that the sum on the left can be normalized to 1. ⌟
Condition (3), resp. $\sum_{k\in J}\phi(k)=1$, does not come out of the blue. As a matter of fact, one has the following theorem (cf. (6.1)):

(6.10) Suppose that the generating function $H$ is as in Theorem (6.1) and that $\hat\phi\in L^2$ is defined by the infinite product 6.1.(2). If $\phi$ is in reality a continuous function, satisfying an estimate of the form

\[ |\phi(t)| \le \frac{C}{1+t^2} \qquad (t\in\mathbb{R}), \]

then the following identity holds:

\[ \sum_k\phi(x-k) \equiv 1 \qquad (x\in\mathbb{R}). \tag{5} \]

⌜ By assumption on $\phi$ the auxiliary function

\[ g(x) := \sum_k\phi(x-k) \]

is a continuous periodic function of period 1 and has Fourier coefficients

\[ c_j = \int_0^1 g(x)\,e^{-2j\pi ix}\,dx = \sum_k\int_0^1\phi(x-k)\,e^{-2j\pi i(x-k)}\,dx = \int\phi(x)\,e^{-2j\pi ix}\,dx = \sqrt{2\pi}\,\hat\phi(2j\pi) = \delta_{0j} \qquad (j\in\mathbb{Z}), \]

where in the end we have used 6.1.(15). From this it follows that $g$ has the constant value 1, as stated. ⌟

For $N\ge2$ the Daubechies scaling functions ${}_N\phi$
$Tf=f^*$), goes over into the reproduction scheme (10) for the function $\phi\,|\,(\mathbb{D}\cap[0,1])$. From this it follows that our $\phi\colon\mathbb{D}\to\mathbb{R}$, restricted to $0\le x\le1$, has a continuous extension on all of $[0,1]$. Now from (8) one concludes that such continuous extensions exist in the intervals $[1,2]$ and $[2,3]$ as well, and outside of $[0,3]$ the definition $\phi(x):=0$ trivially makes for a continuous extension.

Figure 6.5 The Daubechies scaling function ${}_2\phi$
Let us summarize our results so far:

(6.14) There is a unique continuous function $\phi\colon\mathbb{R}\to\mathbb{R}$ having support $[0,3]$ and satisfying, identically in $x$, the following equations:

(a) $\phi(x)=\sum_{k=0}^{3}h_k\,\phi(2x-k)$,

(b) $\sum_k\phi(x-k)=1$,

(c) $\sum_k k\,\phi(x-k)=x-\dfrac{3-\sqrt3}{2}$.

⌜ (a) The function $u(x):=\phi(x)-\sum_{k=0}^{3}h_k\,\phi(2x-k)$ is continuous and vanishes at all points of $\mathbb{D}$, consequently $u(x)\equiv0$. In any bounded $x$-interval, the left hand side of (b) is a finite sum and therefore a continuous function $v(\cdot)$. According to (6.12)(c) this function takes the value 1 at all points of $\mathbb{D}$, therefore we have $v(x)\equiv1$ on all of $\mathbb{R}$. In an analogous manner one obtains the identity (c) from (6.12)(d). ⌟

⌜ The function $\phi\colon\mathbb{R}\to\mathbb{R}$ we constructed here is in fact the Daubechies scaling function ${}_2\phi$, for (6.14)(a) implies

\[ \hat\phi(\xi) = H\Bigl(\frac{\xi}{2}\Bigr)\,\hat\phi\Bigl(\frac{\xi}{2}\Bigr), \]

and from (6.14)(b) one concludes

\[ \sqrt{2\pi}\,\hat\phi(0) = \int_0^3\phi(x)\,dx = \int_0^1\sum_{k=0}^{2}\phi(x+k)\,dx = \int_0^1\sum_k\phi(x+k)\,dx = 1 . \]

Altogether this means that 6.1.(2) is true. It follows that our $\phi$ is the "original", i.e., time domain version of the unique scaling function belonging to the coefficient vector $(h_0,\ldots,h_3)$. This function, by definition, is ${}_2\phi$; but up to this point it was analytically available to us only in the form $\hat\phi$. ⌟

In Figures 6.5 and 6.6, the functions ${}_2\phi$ and ${}_2\psi$ are shown. These figures have been created by means of the described recursion procedure, computing $3\cdot256$ values in each of the two cases.
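The recursion procedure just mentioned can be sketched in a few lines. The Python code below is a minimal illustration for $N=2$ under stated assumptions: the coefficients are the well-known closed forms $(1\pm\sqrt3)/4$, $(3\pm\sqrt3)/4$, normalized as in (6.14)(a) (i.e. the factor $\sqrt2$ is absorbed; these values are not quoted in the surviving text of this section), the start values on $\mathbb{Z}$ are obtained from the scaling equation together with $\phi(1)+\phi(2)=1$, and all names are illustrative.

```python
from fractions import Fraction
import math

SQRT3 = math.sqrt(3.0)
# Assumed coefficients for N = 2, normalized as in (6.14)(a), i.e. without the factor sqrt(2).
C = [(1 + SQRT3) / 4, (3 + SQRT3) / 4, (3 - SQRT3) / 4, (1 - SQRT3) / 4]

def binary_interpolation(levels):
    """Values of phi on the binary rationals of level <= levels in [0, 3], as {Fraction: float}."""
    # phi(0) = phi(3) = 0; phi(1) follows from phi(1) = C[0]*phi(2) + C[1]*phi(1)
    # combined with the normalization phi(1) + phi(2) = 1.
    phi1 = C[0] / (1.0 - C[1] + C[0])
    values = {Fraction(0): 0.0, Fraction(1): phi1, Fraction(2): 1.0 - phi1, Fraction(3): 0.0}
    for r in range(1, levels + 1):
        for m in range(1, 3 * 2 ** r, 2):          # odd numerators: the new points of level r
            t = Fraction(m, 2 ** r)
            # 2t - k is a point of level r-1; points outside [0, 3] are absent and count as 0
            values[t] = sum(c * values.get(2 * t - k, 0.0) for k, c in enumerate(C))
    return values

values = binary_interpolation(8)                    # 3 * 256 + 1 grid points
print(len(values), values[Fraction(1)], values[Fraction(2)])

# Checks of (6.14)(b) and (6.14)(c) on the grid points in [0, 1[.
worst_b = max(abs(values[x] + values[x + 1] + values[x + 2] - 1.0)
              for x in (Fraction(m, 256) for m in range(256)))
worst_c = max(abs(-values[x + 1] - 2.0 * values[x + 2] - (float(x) - (3.0 - SQRT3) / 2.0))
              for x in (Fraction(m, 256) for m in range(256)))
print(worst_b, worst_c)
```

Exact `Fraction` keys are used for the grid points so that the lookups $2t-k$ never miss a point because of rounding; only the function values are floating-point numbers.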
Figure 6.6 The Daubechies wavelet ${}_2\psi$
6.4 Spline wavelets

In this last section we construct the so-called Battle-Lemarié wavelets. The starting material is spline functions, and that's why these wavelets are occasionally called spline wavelets as well, even though they are no longer spline functions. At the same time, the Battle-Lemarié wavelets, in contradiction to the title of the current chapter, don't have compact support either. Nevertheless it will be possible to use the formalism that we have erected in the foregoing sections for the treatment of these wavelets as well. But let's take everything in turn!

Another glance at the scaling equation in the form 5.3.(4) shows that, given two pairs $(\hat\phi_1, H_1)$ and $(\hat\phi_2, H_2)$, each of them satisfying such an equation, the pair $(\hat\phi_1\cdot\hat\phi_2,\ H_1\cdot H_2)$ satisfies such an equation as well. Multiplication in the Fourier domain corresponds to convolution in the time domain; in other words, if $\phi_1$ and $\phi_2$ are scaling functions, then $\phi_1 * \phi_2$ will satisfy a scaling equation as well. Therefore, beginning with $\phi_0 := \phi_{\text{Haar}}$ and setting up the recursion scheme $\phi_{n+1} := \phi_0 * \phi_n$ $(n \ge 0)$, we should obtain a sequence of ever more regular functions that a priori satisfy scaling equations and could maybe be adapted to be useful in the construction of wavelets.

We are going to change our notation to some extent, for the functions obtained in this way have previously appeared in numerical practice, going by the name of B-splines (for "basis splines"), and they play an important role in the general theory of spline approximation. Various notations for these functions can be found in the literature, among them the following, which suits our purposes well enough:

$$B_0(x) := \begin{cases} 1 & (0 \le x < 1) \\ 0 & (\text{otherwise}) \end{cases}, \qquad B_{n+1}(x) := (B_0 * B_n)(x) = \int_{x-1}^{x} B_n(t)\,dt \quad (n \ge 0). \tag{1}$$
Doing the actual computation one finds, e.g., that the cubic B-spline is given by the following formulas:

$$B_3(x) = \begin{cases} \frac{1}{6}x^3 & (0 \le x \le 1) \\ \frac{2}{3} - 2x + 2x^2 - \frac{1}{2}x^3 & (1 \le x \le 2) \\ B_3(4 - x) & (2 \le x \le 4) \\ 0 & (\text{otherwise}). \end{cases}$$

Figure 6.7 shows the graphs of $B_1$, $B_2$ and $B_3$.
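The cubic formulas can be cross-checked numerically. The following is a minimal sketch that evaluates $B_n$ through the truncated-power representation $B_n(x) = \frac{1}{n!}\sum_{k=0}^{n+1}(-1)^k\binom{n+1}{k}(x-k)_+^n$. This closed form is a standard fact about cardinal B-splines; it is not derived in the text and is used here as an assumption. The names `bspline` and `cubic_piecewise` are illustrative only.

```python
from math import comb, factorial

def bspline(n, x):
    """B_n(x) via the truncated-power formula (assumed, not taken from the text)."""
    if x < 0 or x > n + 1:
        return 0.0  # outside supp(B_n) = [0, n+1]
    total = 0.0
    for k in range(n + 2):
        t = x - k
        if t >= 0:  # (x - k)_+^n; Python's 0**0 == 1 gives the half-open B_0
            total += (-1) ** k * comb(n + 1, k) * t ** n
    return total / factorial(n)

def cubic_piecewise(x):
    """The piecewise formulas for the cubic B-spline B_3 on [0, 4]."""
    if 0 <= x <= 1:
        return x ** 3 / 6
    if 1 < x <= 2:
        return 2 / 3 - 2 * x + 2 * x ** 2 - x ** 3 / 2
    if 2 < x <= 4:
        return cubic_piecewise(4 - x)  # symmetry about x = 2
    return 0.0

if __name__ == "__main__":
    for x in (0.5, 1.0, 1.5, 2.0, 3.25):
        print(x, bspline(3, x), cubic_piecewise(x))
```

Both functions should agree up to rounding on the whole real line; the same evaluator works for any small $n$, although cancellation in the alternating sum limits its accuracy for large $n$.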
The easy verification of the following statements is left to the reader:

$$\operatorname{supp}(B_n) = [0,\, n+1], \qquad \int B_n(x)\,dx = 1 \qquad (n \ge 0);$$

furthermore, one has

$(n \ge 1)$.
Figure 6.7
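Since for all practical purposes $B_0 = \phi_{\text{Haar}}$, copying 5.3.(20) gives

$$\hat B_0(\xi) = \frac{1}{\sqrt{2\pi}}\, e^{-i\xi/2}\, \frac{\sin(\xi/2)}{\xi/2}.$$

The convolution theorem (2.10) converts the recursion formula (1) into the formula

$$\hat B_{n+1}(\xi) = \sqrt{2\pi}\; \hat B_0(\xi)\, \hat B_n(\xi),$$

and by multiplicative accumulation one obtains

$$\hat B_n(\xi) = \frac{1}{\sqrt{2\pi}} \left( e^{-i\xi/2}\, \frac{\sin(\xi/2)}{\xi/2} \right)^{n+1} \qquad (n \ge 0). \tag{2}$$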
The following can immediately be read off from this representation of $\hat B_n$:

$$\hat B_n(\xi) = O\bigl(|\xi|^{-(n+1)}\bigr) \qquad (|\xi| \to \infty). \tag{3}$$
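The closed form of $\hat B_n$ and the normalization of the Fourier transform can be checked numerically. The sketch below assumes the convention $\hat f(\xi) = \frac{1}{\sqrt{2\pi}}\int f(x)\,e^{-ix\xi}\,dx$, which is consistent with the factor $\frac{1}{2\pi}$ appearing in the estimate for $|\hat B_n(\xi)|^2$ later in this section. It compares a midpoint-rule quadrature with $\hat B_n(\xi) = \frac{1}{\sqrt{2\pi}}\bigl(e^{-i\xi/2}\sin(\xi/2)/(\xi/2)\bigr)^{n+1}$. The B-spline evaluator uses the truncated-power formula, an assumption not taken from the text, and all function names are illustrative.

```python
import cmath
import math
from math import comb, factorial

def bspline(n, x):
    """B_n(x) via the truncated-power formula (assumed, not from the text)."""
    if x < 0 or x > n + 1:
        return 0.0
    total = 0.0
    for k in range(n + 2):
        t = x - k
        if t >= 0:
            total += (-1) ** k * comb(n + 1, k) * t ** n
    return total / factorial(n)

def bhat_closed_form(n, xi):
    """(1/sqrt(2 pi)) * (exp(-i xi/2) * sin(xi/2)/(xi/2))**(n+1)."""
    s = 1.0 if xi == 0 else math.sin(xi / 2) / (xi / 2)
    return (cmath.exp(-0.5j * xi) * s) ** (n + 1) / math.sqrt(2 * math.pi)

def bhat_quadrature(n, xi, steps=4000):
    """Midpoint rule for (1/sqrt(2 pi)) * integral of B_n(x) exp(-i x xi) over [0, n+1]."""
    h = (n + 1) / steps
    total = 0j
    for j in range(steps):
        x = (j + 0.5) * h
        total += bspline(n, x) * cmath.exp(-1j * xi * x)
    return total * h / math.sqrt(2 * math.pi)

if __name__ == "__main__":
    for xi in (0.0, 1.3, -4.0):
        print(xi, bhat_closed_form(2, xi), bhat_quadrature(2, xi))
```

The closed form also makes the decay visible: its modulus is bounded by $\frac{1}{\sqrt{2\pi}}\,(2/|\xi|)^{n+1}$ for $\xi \ne 0$.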
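On account of what was said at the beginning of this section, we now expect that each B-spline $B_n$ satisfies a scaling equation. As a matter of fact, we have

$$\frac{\sin(\xi/2)}{\xi/2} = \cos\frac{\xi}{4} \cdot \frac{\sin(\xi/4)}{\xi/4}$$

and consequently

$$e^{-i\xi/2}\, \frac{\sin(\xi/2)}{\xi/2} = e^{-i\xi/4} \cos\frac{\xi}{4} \cdot e^{-i\xi/4}\, \frac{\sin(\xi/4)}{\xi/4}.$$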
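This means that

$$\hat B_n(\xi) = H_n\Bigl(\frac{\xi}{2}\Bigr)\, \hat B_n\Bigl(\frac{\xi}{2}\Bigr), \tag{4}$$

where the generating function $H_n$ is given by

$$H_n(\xi) := \Bigl( e^{-i\xi/2} \cos\frac{\xi}{2} \Bigr)^{n+1} = \Bigl( \frac{1 + e^{-i\xi}}{2} \Bigr)^{n+1}. \tag{5}$$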
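We see that the coefficients $h_k$ ($h_k^{(n)}$, really) of $H_n$ have the following values:

$$h_k = \begin{cases} \dfrac{\sqrt{2}}{2^{n+1}} \dbinom{n+1}{k} & (0 \le k \le n+1) \\[1ex] 0 & (\text{otherwise}), \end{cases}$$

so that the scaling equation in the time domain takes on the following form:

$$B_n(x) = \frac{1}{2^n} \sum_{k=0}^{n+1} \binom{n+1}{k}\, B_n(2x - k) \qquad (x \in \mathbb{R}).$$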
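The time-domain scaling equation can be verified numerically. With $h_k = \frac{\sqrt2}{2^{n+1}}\binom{n+1}{k}$ and a scaling equation of the form $\phi(x) = \sqrt2\sum_k h_k\,\phi(2x-k)$, the identity to be checked is $B_n(x) = 2^{-n}\sum_{k=0}^{n+1}\binom{n+1}{k}B_n(2x-k)$. The sketch below is a minimal check under these assumptions; the B-spline evaluator uses the truncated-power formula, which is not taken from the text, and the function names are illustrative.

```python
from math import comb, factorial, sqrt

def bspline(n, x):
    """B_n(x) via the truncated-power formula (assumed, not from the text)."""
    if x < 0 or x > n + 1:
        return 0.0
    total = 0.0
    for k in range(n + 2):
        t = x - k
        if t >= 0:
            total += (-1) ** k * comb(n + 1, k) * t ** n
    return total / factorial(n)

def scaling_coeffs(n):
    """h_k = sqrt(2) / 2**(n+1) * C(n+1, k) for 0 <= k <= n+1."""
    return [sqrt(2) / 2 ** (n + 1) * comb(n + 1, k) for k in range(n + 2)]

def two_scale_rhs(n, x):
    """Right-hand side sqrt(2) * sum_k h_k * B_n(2x - k) of the scaling equation."""
    return sqrt(2) * sum(h * bspline(n, 2 * x - k)
                         for k, h in enumerate(scaling_coeffs(n)))

if __name__ == "__main__":
    n = 3
    grid = [0.05 + 0.1 * j for j in range(40)]
    worst = max(abs(bspline(n, x) - two_scale_rhs(n, x)) for x in grid)
    print("largest deviation for n = 3:", worst)
```

The deviation should be at the level of floating-point rounding for every small $n$, including the Haar case $n = 0$.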
That the $B_n$ would satisfy such identities could not immediately be guessed from looking at their definition!

In order to check whether $B_n$ can be used as a scaling function, according to (5.9) we have to examine the $2\pi$-periodic function

$$\Phi_n(\xi) := \sum_l \bigl| \hat B_n(\xi + 2\pi l) \bigr|^2. \tag{6}$$

Because of (3) the series appearing on the right is uniformly convergent. It follows that $\Phi_n$ is a continuous function (we shall compute $\Phi_n$ explicitly later on). Furthermore, we obtain, using (2) and the inequality

$$\frac{\sin x}{x} \ge \frac{2}{\pi} \qquad \Bigl(0 < |x| \le \frac{\pi}{2}\Bigr),$$

the following estimate:

$$\bigl| \hat B_n(\xi) \bigr|^2 = \frac{1}{2\pi} \left| \frac{\sin(\xi/2)}{\xi/2} \right|^{2n+2} \ge \frac{1}{2\pi} \Bigl( \frac{2}{\pi} \Bigr)^{2n+2} \qquad (|\xi| \le \pi).$$
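Under these circumstances there are numbers $B \ge A > 0$ ($B$ and $A$ depend on $n$) such that

$$A \le \Phi_n(\xi) \le B \qquad \forall\, \xi \in \mathbb{R},$$

and on account of part (a) of Theorem (5.14) we come to the conclusion that the translates $B_n(\cdot - k)$ $(k \in \mathbb{Z})$ constitute a Riesz basis of the space

$$V_0 := \overline{\operatorname{span}}\bigl( B_n(\cdot - k) \bigm| k \in \mathbb{Z} \bigr).$$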
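The two bounds can be estimated numerically from a truncated version of the periodized sum $\sum_l |\hat B_n(\xi + 2\pi l)|^2$ (called $\Phi_n$ here). The sketch below assumes $|\hat B_n(\xi)|^2 = \frac{1}{2\pi}\,|\sin(\xi/2)/(\xi/2)|^{2n+2}$, cuts the sum off at $|l| \le$ `terms`, and samples $\xi$ on a grid over one period. Both `terms` and `samples` are hypothetical parameters, and the resulting minimum and maximum are only approximations to admissible constants $A$ and $B$.

```python
import math

def phi_n(n, xi, terms=500):
    """Truncated series (1/(2 pi)) * sum over |l| <= terms of (sin(u)/u)**(2n+2), u = xi/2 + pi*l."""
    total = 0.0
    for l in range(-terms, terms + 1):
        u = xi / 2 + math.pi * l
        s = 1.0 if u == 0 else math.sin(u) / u
        total += s ** (2 * n + 2)
    return total / (2 * math.pi)

def riesz_bounds(n, samples=64):
    """Approximate (min, max) of phi_n over one period [-pi, pi]."""
    values = [phi_n(n, -math.pi + 2 * math.pi * j / samples)
              for j in range(samples + 1)]
    return min(values), max(values)

if __name__ == "__main__":
    for n in (1, 2, 3):
        a, b = riesz_bounds(n)
        lower = (2 / math.pi) ** (2 * n + 2) / (2 * math.pi)
        print(n, a, b, "analytic lower bound:", lower)
```

For $n \ge 1$ the truncation error is negligible because the terms decay at least like $l^{-4}$; for $n = 0$ the series converges slowly and `terms` would have to be much larger.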
The proof of the following lemma is deferred to a later point:

(6.15) There are polynomials $P_n$ of respective degree $n$ such that the following is true:

$$\Phi_n(\xi) = \frac{1}{2\pi}\, P_n(\cos\xi) \qquad (n \ge 0).$$

The $P_n$ can be computed recursively and have rational coefficients.

We now suppose that an $n \ge 1$ has been chosen and remains fixed in what follows. Part (b) of Theorem (5.14) describes an orthonormalization procedure; in particular, it gives a formula for the "definitive" scaling function $\phi$ corresponding to the chosen $n$, meaning that the translates $\phi(\cdot - k)$ $(k \in \mathbb{Z})$ of $\phi$ are in fact orthonormal. The formula in question is

$$\hat\phi(\xi) = \frac{\hat B_n(\xi)}{\sqrt{2\pi\, \Phi_n(\xi)}} = \frac{\hat B_n(\xi)}{\sqrt{P_n(\cos\xi)}}. \tag{7}$$

In order to get an expression for $\phi$ in the time domain, we develop the function $1/\sqrt{P_n(\cos\xi)}$ into a Fourier series:

$$\frac{1}{\sqrt{P_n(\cos\xi)}} = \sum_k c_k\, e^{-ik\xi}.$$

Inserting this into (7) and applying rule (R1) we finally obtain the following representation of the scaling function $\phi$ corresponding to the chosen $n$:

$$\phi(x) = \sum_k c_k\, B_n(x - k). \tag{8}$$
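The coefficients $c_k$ in (8) are the Fourier coefficients of $1/\sqrt{P_n(\cos\xi)}$ and can be approximated by an equispaced rectangle rule, which converges very quickly for analytic periodic integrands. The sketch below rests on two assumptions that are not taken from the text: the truncated-power formula for $B_n$, and the identity $P_n(\cos\xi) = \sum_{k=-n}^{n} B_{2n+1}(n+1+k)\cos(k\xi)$, which expresses the periodized sum through the inner products of the integer translates of $B_n$. The names `p_of_cos`, `ortho_coeffs` and `phi`, and the cut-off `kmax`, are illustrative.

```python
import math
from math import comb, factorial

def bspline(n, x):
    """B_n(x) via the truncated-power formula (assumed, not from the text)."""
    if x < 0 or x > n + 1:
        return 0.0
    total = 0.0
    for k in range(n + 2):
        t = x - k
        if t >= 0:
            total += (-1) ** k * comb(n + 1, k) * t ** n
    return total / factorial(n)

def p_of_cos(n, xi):
    """P_n(cos xi), evaluated as sum_k B_{2n+1}(n+1+k) * cos(k*xi) (assumed identity)."""
    return sum(bspline(2 * n + 1, n + 1 + k) * math.cos(k * xi)
               for k in range(-n, n + 1))

def ortho_coeffs(n, kmax=20, samples=1024):
    """Approximate c_k for |k| <= kmax by the rectangle rule on [0, 2*pi)."""
    xs = [2 * math.pi * j / samples for j in range(samples)]
    fs = [1 / math.sqrt(p_of_cos(n, x)) for x in xs]
    c = {}
    for k in range(kmax + 1):
        # the integrand is even, so c_k = c_{-k} is a pure cosine coefficient
        c[k] = c[-k] = sum(f * math.cos(k * x) for f, x in zip(fs, xs)) / samples
    return c

def phi(n, x, c):
    """phi(x) = sum_k c_k * B_n(x - k), truncated to the available coefficients."""
    return sum(ck * bspline(n, x - k) for k, ck in c.items())

if __name__ == "__main__":
    c = ortho_coeffs(1)
    print([round(c[k], 6) for k in range(6)])
    print(phi(1, 1.0, c))
```

For $n = 1$ the trigonometric polynomial reduces to $(2 + \cos\xi)/3$, and the computed $c_k$ alternate in sign while shrinking geometrically, in line with the exponential decay discussed in the text.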
It has to be admitted, however, that the coefficients

$$c_k = c_{-k} = \frac{1}{\pi} \int_0^{\pi} \frac{\cos(k\xi)}{\sqrt{P_n(\cos\xi)}}\, d\xi \qquad (k \ge 0)$$

appearing here have to be computed numerically one by one. Since $1/\sqrt{P_n(\cos\xi)}$ is a real-analytic $2\pi$-periodic function, the $c_k$ have exponential decay when $|k| \to \infty$: There is a $\rho < 1$ such that

$$|c_k| \le C\, \rho^{|k|} \qquad (k \in \mathbb{Z}),$$

and because of $\operatorname{supp}(B_n) = [0, n+1]$ it easily follows from this that $\phi(x)$ is exponentially decaying when $|x| \to \infty$ as well. But the compact support of $B_n$ has been lost in the orthogonalization process.

Proceeding along the lines of the general theory, we further need the modified generating function $H^\#$, and in order to be able to work with the mother wavelet $\psi$ corresponding to the above $\phi$ we need the coefficients $h_r^\#$ in the representation

$$H^\#(\xi) = \frac{1}{\sqrt{2}} \sum_r h_r^\#\, e^{-ir\xi}. \tag{9}$$
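From (7) we conclude because of (4) that

$$H^\#(\xi) = H_n(\xi)\, \sqrt{\frac{P_n(\cos\xi)}{P_n(\cos(2\xi))}}.$$

Therefore, by means of (5), we get the representation

$$H^\#(\xi) = \Bigl( \frac{1 + e^{-i\xi}}{2} \Bigr)^{n+1} \sqrt{\frac{P_n(\cos\xi)}{P_n(\cos(2\xi))}}, \tag{10}$$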
from which one can read off already that $\psi$ has the order $n + 1$. The square root on the right now has to be developed into a Fourier series:

$$\sqrt{\frac{P_n(\cos\xi)}{P_n(\cos(2\xi))}} = \sum_k a_k\, e^{-ik\xi};$$

here again the coefficients

$$a_k = a_{-k} = \frac{1}{\pi} \int_0^{\pi} \sqrt{\frac{P_n(\cos\xi)}{P_n(\cos(2\xi))}}\, \cos(k\xi)\, d\xi \qquad (k \ge 0) \tag{11}$$

have to be computed numerically one by one. Comparing coefficients in (9) and (10) we obtain the following formula for the $h_r^\#$:

$$h_r^\# = \frac{\sqrt{2}}{2^{n+1}} \sum_{k=0}^{n+1} \binom{n+1}{k}\, a_{r-k}. \tag{12}$$

Only now are we in a position to compute the Battle-Lemarié wavelet resp. spline wavelet $\psi$ corresponding to the chosen $n$. On account of 5.3.(16) resp. (8) we have
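$$\begin{aligned} \psi(t) &= \sqrt{2} \sum_k (-1)^{k-1}\, h^\#_{-k-1}\, \phi(2t - k) \\ &= \sqrt{2} \sum_k \sum_l (-1)^{k-1}\, h^\#_{-k-1}\, c_l\, B_n(2t - k - l) \\ &= \sqrt{2} \sum_r \sum_k (-1)^{k-1}\, h^\#_{-k-1}\, c_{r-k}\, B_n(2t - r). \end{aligned}$$

This means that we should introduce the new set of coefficients

$$b_r := \sqrt{2} \sum_k (-1)^{k-1}\, h^\#_{-k-1}\, c_{r-k},$$

and in this way we get definitively

$$\psi(t) = \sum_r b_r\, B_n(2t - r).$$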
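The whole chain from $P_n$ to the coefficients $b_r$ can be sketched in a few lines. This is only an illustration under several assumptions that are not taken from the text: the truncated-power formula for $B_n$; the identity $P_n(\cos\xi) = \sum_{k=-n}^{n} B_{2n+1}(n+1+k)\cos(k\xi)$; the name $a_k$ for the Fourier coefficients of $\sqrt{P_n(\cos\xi)/P_n(\cos(2\xi))}$; the reading $h_r^\# = \frac{\sqrt2}{2^{n+1}}\sum_{k=0}^{n+1}\binom{n+1}{k}a_{r-k}$ of formula (12); and a fixed cut-off `kmax` for all infinite sums. All function names are hypothetical.

```python
import math
from math import comb, factorial

def bspline(n, x):
    """B_n(x) via the truncated-power formula (assumed, not from the text)."""
    if x < 0 or x > n + 1:
        return 0.0
    total = 0.0
    for k in range(n + 2):
        t = x - k
        if t >= 0:
            total += (-1) ** k * comb(n + 1, k) * t ** n
    return total / factorial(n)

def p_of_cos(n, xi):
    """P_n(cos xi) as sum_k B_{2n+1}(n+1+k) * cos(k*xi) (assumed identity)."""
    return sum(bspline(2 * n + 1, n + 1 + k) * math.cos(k * xi)
               for k in range(-n, n + 1))

def cosine_coeffs(f, kmax, samples=1024):
    """Fourier coefficients (|k| <= kmax) of an even 2*pi-periodic function f."""
    xs = [2 * math.pi * j / samples for j in range(samples)]
    fs = [f(x) for x in xs]
    half = [sum(v * math.cos(k * x) for v, x in zip(fs, xs)) / samples
            for k in range(kmax + 1)]
    return {k: half[abs(k)] for k in range(-kmax, kmax + 1)}

def battle_lemarie(n, kmax=30):
    """Return (h_sharp, b) as dicts indexed by integers, truncated at kmax."""
    c = cosine_coeffs(lambda xi: 1 / math.sqrt(p_of_cos(n, xi)), kmax)
    a = cosine_coeffs(lambda xi: math.sqrt(p_of_cos(n, xi) / p_of_cos(n, 2 * xi)), kmax)
    scale = math.sqrt(2) / 2 ** (n + 1)
    h = {r: scale * sum(comb(n + 1, k) * a.get(r - k, 0.0) for k in range(n + 2))
         for r in range(-kmax, kmax + n + 2)}
    b = {}
    for r in range(-2 * kmax - n - 2, 2 * kmax):
        total = 0.0
        for k in range(-kmax - n - 2, kmax):
            sign = 1.0 if k % 2 else -1.0  # equals (-1)**(k-1)
            total += sign * h.get(-k - 1, 0.0) * c.get(r - k, 0.0)
        b[r] = math.sqrt(2) * total
    return h, b

def psi(n, t, b):
    """psi(t) = sum_r b_r * B_n(2t - r), truncated to the available b_r."""
    return sum(br * bspline(n, 2 * t - r) for r, br in b.items())

if __name__ == "__main__":
    h, b = battle_lemarie(1)
    print("sum of squares of h#:", sum(v * v for v in h.values()))
    print("psi(0.25) for n = 1:", psi(1, 0.25, b))
```

Two quick sanity checks are available: the $h_r^\#$ should satisfy $\sum_r (h_r^\#)^2 = 1$ and $\sum_r h_r^\# h_{r+2}^\# = 0$ up to truncation error, and $\sum_r b_r$ should vanish, reflecting $\int\psi(t)\,dt = 0$. Increasing `kmax` until these residuals stop improving is one way to choose the cut-off.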
How many terms of this expansion actually have to be taken into consideration is best decided "at run time". The last formula has brought our discussion to a close. It remains to supply the proof of Lemma (6.15).
⌜ Inserting (2) into the definition (6) of $\Phi_n$ we get

$$\Phi_n(\xi) = \frac{1}{2\pi}\, \sin^{2n+2}\frac{\xi}{2}\; \sum_l \frac{1}{\bigl( \frac{\xi}{2} + \pi l \bigr)^{2n+2}}$$
1 . 2n+2 es (t::) 2n+2 = 2 SIn 2 n