Contents
INTRODUCTION
PRELIMINARIES AND NOTATION
CHAPTER 1. The What, Why, and How of Wavelets
CHAPTER 1. The What, Why, and How of Wavelets
[...] where $-\infty < a < b < \infty$, then its Fourier transform $\hat f(\xi)$ is well defined also for complex [...] $> 0$ are fixed. Then (1.1.1) becomes
This procedure is schematically represented in Figure 1.1: for fixed $n$, the $T^{win}_{m,n}(f)$ correspond to the Fourier coefficients of $f(\cdot)g(\cdot - nt_0)$. If, for instance, $g$ is compactly supported, then it is clear that, with appropriately chosen $\omega_0$, the Fourier coefficients $T^{win}_{m,n}(f)$ are sufficient to characterize and, if need be, to reconstruct $f(\cdot)g(\cdot - nt_0)$. Changing $n$ amounts to shifting the "slices" by steps of $t_0$ and its multiples, allowing the recovery of all of $f$ from the $T^{win}_{m,n}(f)$. (We will discuss this in more mathematical detail in Chapter 3.) Many possible choices have been proposed for the window function $g$ in signal analysis, most of which have compact support and reasonable smoothness. In physics, (1.1.1) is related to coherent state representations; the $g^{\omega,t}(s) = e^{i\omega s}g(s - t)$ are the coherent states associated to the Weyl-Heisenberg group (see, e.g., Klauder and Skagerstam (1985)). In this context, a very popular choice is a Gaussian $g$. In all applications, $g$ is supposed to be well concentrated in both time and frequency; if $g$ and $\hat g$ are both concentrated around zero, then $(T^{win}f)(\omega, t)$ can be interpreted loosely as the "content" of $f$ near time $t$ and near frequency $\omega$. The windowed Fourier transform thus provides a description of $f$ in the time-frequency plane.
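The slicing procedure just described can be sketched numerically. The following is a minimal illustration only, assuming a sampled signal, a Hamming window as $g$, and a finite sum in place of the integral; the function names and the choice of the window length as the Fourier period are assumptions of this sketch, not part of the text.

```python
import cmath
import math


def hamming(length):
    """Symmetric Hamming window; an assumed choice for the window g."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * j / (length - 1))
            for j in range(length)]


def windowed_fourier_coeffs(f, g, t0):
    """Return T[n][m] = sum_k f[k] * g[k - n*t0] * exp(-2*pi*i*m*k/L).

    L = len(g) is the window length, so m indexes the L Fourier
    coefficients of the n-th slice f(.)g(. - n*t0).
    """
    L = len(g)
    coeffs = []
    n = 0
    while n * t0 + L <= len(f):          # keep only slices that fit inside f
        start = n * t0
        row = []
        for m in range(L):
            total = 0j
            for j in range(L):
                k = start + j
                total += f[k] * g[j] * cmath.exp(-2j * math.pi * m * k / L)
            row.append(total)
        coeffs.append(row)
        n += 1
    return coeffs


# Pure tone with 8 cycles over 64 samples, i.e. 2 cycles per 16-sample window.
N = 64
f = [math.cos(2 * math.pi * 8 * k / N) for k in range(N)]
T = windowed_fourier_coeffs(f, hamming(16), t0=8)

# For every slice n, the non-negative frequency index with largest modulus.
peak_bins = [max(range(9), key=lambda m: abs(row[m])) for row in T]
print(len(T), peak_bins)
```

Every slice reports the same dominant frequency index, which is the discrete counterpart of reading off the "content" of $f$ near a given time and frequency.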
FIG. 1.1. The windowed Fourier transform: the function $f(t)$ is multiplied with the window function $g(t)$, and the Fourier coefficients of the product $f(t)g(t)$ are computed; the procedure is then repeated for translated versions of the window, $g(t - t_0)$, $g(t - 2t_0)$, ....
1.2. The wavelet transform: Analogies and differences with the windowed Fourier transform.
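The wavelet transform provides a similar time-frequency description, with a few important differences. The wavelet transform formulas analogous to (1.1.1) and (1.1.2) are
$$(T^{wav}f)(a,b) = |a|^{-1/2}\int dt\, f(t)\,\psi\!\left(\frac{t-b}{a}\right) \tag{1.2.1}$$
and
$$T^{wav}_{m,n}(f) = a_0^{-m/2}\int dt\, f(t)\,\psi(a_0^{-m}t - nb_0). \tag{1.2.2}$$
In both cases we assume that $\psi$ satisfies
$$\int dt\,\psi(t) = 0 \tag{1.2.3}$$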
(for reasons explained in Chapters 2 and 3). Formula (1.2.2) is again obtained from (1.2.1) by restricting $a$, $b$ to only discrete values: $a = a_0^m$, $b = nb_0a_0^m$ in this case, with $m$, $n$ ranging over $\mathbb{Z}$, and $a_0 > 1$, $b_0 > 0$ fixed. One similarity between the wavelet and windowed Fourier transforms is clear: both (1.1.1) and (1.2.1) take the inner products of $f$ with a family of functions indexed by two labels, $g^{\omega,t}(s) = e^{i\omega s}g(s - t)$ in (1.1.1), and $\psi^{a,b}(s) = |a|^{-1/2}\psi\!\left(\frac{s-b}{a}\right)$ in (1.2.1). The functions $\psi^{a,b}$ are called "wavelets"; the function $\psi$ is sometimes called "mother wavelet." (Note that $\psi$ and $g$ are implicitly assumed to be real, even though this is by no means essential; if they are not, then complex conjugates have to be introduced in (1.1.1), (1.2.1).) A typical choice for $\psi$ is $\psi(t) = (1 - t^2)\exp(-t^2/2)$, the second derivative of the Gaussian, sometimes called the Mexican hat function because it resembles a cross section of a Mexican hat. The Mexican hat function is well localized in both time and frequency, and satisfies (1.2.3). As $a$ changes, the $\psi^{a,0}(s) = |a|^{-1/2}\psi(s/a)$ cover different frequency ranges (large values of the scaling parameter $|a|$ correspond to small frequencies, or large scale $\psi^{a,0}$; small values of $|a|$ correspond to high frequencies or very fine scale $\psi^{a,0}$). Changing the parameter $b$ as well allows us to move the time localization center: each $\psi^{a,b}(s)$ is localized around $s = b$. It follows that (1.2.1), like (1.1.1), provides a time-frequency description of $f$. The difference between the wavelet and windowed Fourier transforms lies in the shapes of the analyzing functions $g^{\omega,t}$ and $\psi^{a,b}$, as shown in Figure 1.2. The functions $g^{\omega,t}$ all consist of the same envelope function $g$, translated to the proper time location, and "filled in" with higher frequency oscillations. All the $g^{\omega,t}$, regardless of the value of $\omega$, have the same width.
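The properties claimed for the Mexican hat family can be checked with a few lines of code. This is a minimal sketch, assuming a midpoint-rule quadrature on a truncated interval; the helper names and the sample values of $a$ and $b$ are illustrative assumptions.

```python
import math


def mexican_hat(t):
    """psi(t) = (1 - t^2) exp(-t^2 / 2)."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)


def psi_ab(s, a, b):
    """psi^{a,b}(s) = |a|^{-1/2} psi((s - b) / a)."""
    return abs(a) ** -0.5 * mexican_hat((s - b) / a)


def integrate(func, lo=-40.0, hi=40.0, steps=40000):
    """Midpoint rule; the interval is an assumed truncation of the real line."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


# Zero mean, i.e. condition (1.2.3).
mean = integrate(mexican_hat)

# The squared L2 norm of psi^{a,b} does not depend on a (or b).
norms = {a: integrate(lambda s, a=a: psi_ab(s, a, 1.0) ** 2)
         for a in (0.5, 1.0, 4.0)}

# psi^{a,b} vanishes at s = b - |a| and s = b + |a|, so its central bump
# has width 2|a|: narrow for small |a|, broad for large |a|.
zeros = {a: (psi_ab(1.0 - a, a, 1.0), psi_ab(1.0 + a, a, 1.0))
         for a in (0.5, 1.0, 4.0)}
```

The factor $|a|^{-1/2}$ is exactly what keeps the norm fixed while the width scales with $|a|$.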
In contrast, the $\psi^{a,b}$ have time-widths adapted to their frequency: high frequency $\psi^{a,b}$ are very narrow, while low frequency $\psi^{a,b}$ are much broader. As a result, the wavelet transform is better able than the windowed Fourier transform to "zoom in" on very short-lived high frequency phenomena, such as transients in signals (or singularities in functions or integral kernels).

FIG. 1.2. Typical shapes of (a) windowed Fourier transform functions $g^{\omega,t}$, and (b) wavelets $\psi^{a,b}$. The $g^{\omega,t}(x) = e^{i\omega x}g(x - t)$ can be viewed as translated envelopes $g$, "filled in" with higher frequencies; the $\psi^{a,b}$ are all copies of the same function, translated and compressed or stretched.

This is illustrated by Figure 1.3, which shows windowed Fourier transforms and the wavelet transform of the same signal $f$ defined by
$$f(t) = \sin(2\pi\nu_1 t) + \sin(2\pi\nu_2 t) + \gamma\,[\delta(t - t_1) + \delta(t - t_2)].$$
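In practice, this signal is not given by this continuous expression, but by samples, and adding a $\delta$-function is then approximated by adding a constant to one sample only. In sampled version, we have then
$$f(n\tau) = \sin(2\pi\nu_1 n\tau) + \sin(2\pi\nu_2 n\tau) + \alpha\,[\delta_{n,n_1} + \delta_{n,n_2}].$$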
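As a concrete illustration, the sampled signal can be generated as follows, using the parameter values quoted in the text for Figure 1.3a ($\nu_1 = 500$ Hz, $\nu_2 = 1$ kHz, $\tau = 1/8000$ sec, $\alpha = 1.5$, $n_2 - n_1 = 32$). The position `N1` of the first pulse and the total number of samples are not given in the text and are hypothetical choices made only for this sketch.

```python
import math


def sampled_signal(num_samples, nu1, nu2, tau, alpha, n1, n2):
    """f(n*tau) = sin(2 pi nu1 n tau) + sin(2 pi nu2 n tau) + alpha at n1, n2."""
    samples = []
    for n in range(num_samples):
        value = (math.sin(2 * math.pi * nu1 * n * tau)
                 + math.sin(2 * math.pi * nu2 * n * tau))
        if n in (n1, n2):
            value += alpha          # discrete stand-in for a delta function
        samples.append(value)
    return samples


TAU = 1.0 / 8000.0                  # 8,000 samples per second
N1 = 100                            # hypothetical position of the first pulse
N2 = N1 + 32                        # second pulse, 32 samples later
signal = sampled_signal(512, 500.0, 1000.0, TAU, 1.5, N1, N2)

pulse_gap_ms = (N2 - N1) * TAU * 1000.0   # spacing of the two pulses in ms
```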
For the example in Figure 1.3a, $\nu_1 = 500$ Hz, $\nu_2 = 1$ kHz, $\tau = 1/8{,}000$ sec (i.e., we have 8,000 samples per second), $\alpha = 1.5$, and $n_2 - n_1 = 32$ (corresponding to 4 milliseconds between the two pulses).

FIG. 1.3. (a) The signal $f(t)$. (b) Windowed Fourier transforms (spectrograms) of $f$ with three different window widths. (c) Wavelet transform of $f$. (d) Comparison of the frequency resolution between the spectrograms and the wavelet transform.

The three spectrograms (graphs of the modulus of the windowed Fourier transform) in Figure 1.3b use standard Hamming windows, with widths 12.8, 6.4, and 3.2 milliseconds, respectively. (Time $t$ varies horizontally, frequency $\omega$ vertically, on these plots; the grey levels indicate the value of $|T^{win}(f)|$, with black standing for the highest value.) As the window width increases, the resolution of the two pure tones gets better, but it becomes harder or even impossible to resolve the two pulses. Figure 1.3c shows the modulus of the wavelet transform of $f$ computed by means of the (complex) Morlet wavelet $\psi(t) = C e^{-t^2/\alpha^2}\left(e^{i\pi t} - e^{-\pi^2\alpha^2/4}\right)$, with $\alpha = 4$. (To make comparison with the spectrograms easier, a linear frequency axis has been used here; for wavelet transforms, a logarithmic frequency axis is more usual.) One already sees that the two impulses are resolved even better than with the 3.2 msec Hamming window (right in Figure 1.3b), while the frequency resolution for the two pure tones is comparable with that obtained with the 6.4 msec Hamming window (middle in Figure 1.3b). This comparison of frequency resolutions is illustrated more clearly by Figure 1.3d: here sections of the spectrograms (i.e., plots of $|(T^{win}f)(\cdot, t)|$ with fixed $t$) and of the wavelet transform modulus ($|(T^{wav}f)(\cdot, b)|$ with fixed $b$) are compared. The dynamic range (ratio between the maxima and the "dip" between the two peaks) of the wavelet transform is comparable to that of the 6.4 msec spectrogram. (Note that the flat horizontal "tail" for the wavelet transform in the graphs in Figure 1.3d is an artifact of the plotting package used, which set a rather high cut-off, as compared with the spectrogram plots; anyway, this cut-off is already at $-24$ dB.)

In fact, our ear uses a wavelet transform when analyzing sound, at least in the very first stage. The pressure amplitude oscillations are transmitted from the eardrum to the basilar membrane, which extends over the whole length of the cochlea.
The cochlea is rolled up as a spiral inside our inner ear; imagine it unrolled to a straight segment, so that the basilar membrane is also stretched out. We can then introduce a coordinate $y$ along this segment. Experiment and numerical simulation show that a pressure wave which is a pure tone, $f_\omega(t) = e^{i\omega t}$, leads to a response excitation along the basilar membrane which has the same frequency in time, but with an envelope in $y$, $F_\omega(t, y) = e^{i\omega t}\phi_\omega(y)$. In a first approximation, which turns out to be pretty good for frequencies $\omega$ above 500 Hz, the dependence on $\omega$ of $\phi_\omega(y)$ corresponds to a shift by $\log\omega$: there exists one function $\phi$ so that $\phi_\omega(y)$ is very close to $\phi(y - \log\omega)$. For a general excitation function $f$, $f(t) = \frac{1}{\sqrt{2\pi}}\int d\omega\,\hat f(\omega)e^{i\omega t}$, it follows that the response function $F(t, y)$ is given by the corresponding superposition of "elementary response functions,"
$$F(t, y) = \frac{1}{\sqrt{2\pi}}\int d\omega\,\hat f(\omega)\,F_\omega(t, y) = \frac{1}{\sqrt{2\pi}}\int d\omega\,\hat f(\omega)\,e^{i\omega t}\,\phi(y - \log\omega).$$
If we now introduce a change of parameterization, by defining
$$\hat\psi(e^{-x}) = (2\pi)^{-1/2}\,\phi(x), \qquad G(a, t) = F(t, \log a),$$
then it follows that
$$G(a, t) = a\int dt'\, f(t')\,\psi(a(t - t')),$$
which (up to normalization) is exactly a wavelet transform. The dilation parameter comes in, of course, because of the logarithmic shifts in frequency in the $\phi_\omega$. The occurrence of the wavelet transform in the first stage of our own biological acoustical analysis suggests that wavelet-based methods for acoustical analysis have a better chance than other methods to lead, e.g., to compression schemes undetectable by our ear.

Different types of wavelet
[...] 0). Moreover, one easily checks that
$$\int_0^\infty da\int_{-\infty}^{\infty} db\; a\,|G(b + ia)|^2 = \int dx\,|f(x)|^2,$$
so that $T^{wav}$ can be interpreted as an isometry from $H^2$ to the Bergman space of all analytic functions on the upper half plane, square integrable with respect to the measure $\mathrm{Im}\,z\; d(\mathrm{Im}\,z)\, d(\mathrm{Re}\,z)$. On the other hand, one can prove that any function in this Bergman space is associated, via the wavelet transform with this particular $\psi$, to a function in $H^2$: the isometry is onto, and is therefore a unitary map. For other choices of $\psi$, such as $\psi \in H^2$ with $\hat\psi($ [...] possible extensions of (2.4.4) to $L^2(\mathbb{R}^n)$ with $n > 1$. One possibility is to choose the wavelet $\psi \in L^2(\mathbb{R}^n)$ so that it is spherically symmetric. Its Fourier transform is then spherically symmetric as well,
$$\hat\psi(\xi) = \eta(|\xi|).$$
The admissibility condition then becomes [...] and the corresponding resolution of the identity is [...]
A similar construction can be made in dimensions larger than 2. These wavelets with rotation angles were studied by Murenzi (1989), and applied by Argoul et al. (1989) in a study of DLA (diffusion-limited aggregates) and other two-dimensional fractals.

2.7. Parallels with the continuous windowed Fourier transform.
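The windowed Fourier transform of a function $f$ is given by
$$(T^{win}f)(\omega, t) = \langle f, g^{\omega,t}\rangle, \tag{2.7.1}$$
where $g^{\omega,t}(x) = e^{i\omega x}g(x - t)$. Arguments completely similar to those in the proof of Proposition 2.4.1 show that, for all $f_1, f_2 \in L^2(\mathbb{R})$,
$$\int\!\!\int d\omega\, dt\;(T^{win}f_1)(\omega, t)\,\overline{(T^{win}f_2)(\omega, t)} = 2\pi\,\|g\|^2\,\langle f_1, f_2\rangle,$$
which can be rewritten as
$$f = (2\pi\|g\|^2)^{-1}\int\!\!\int d\omega\, dt\;(T^{win}f)(\omega, t)\,g^{\omega,t}. \tag{2.7.2}$$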
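A finite-dimensional analogue of this resolution of the identity can be checked numerically. The sketch below works on cyclic sequences of length $N$ (an assumption made only to obtain a finite computation), where the factor $N$ plays the role that $2\pi$ plays on the real line; the signal `f` and window `g` are arbitrary illustrative choices.

```python
import cmath
import math

N = 16  # length of the finite cyclic "signal" (an assumption of this sketch)


def atom(g, w, t):
    """Discrete analogue of g^{w,t}(x) = exp(i w x) g(x - t) on Z_N."""
    return [cmath.exp(2j * math.pi * w * x / N) * g[(x - t) % N]
            for x in range(N)]


def inner(u, v):
    """Inner product <u, v>, linear in u and conjugate-linear in v."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))


def transform(f, g):
    """All coefficients <f, g^{w,t}> for w, t in 0..N-1."""
    return [[inner(f, atom(g, w, t)) for t in range(N)] for w in range(N)]


def reconstruct(coeffs, g):
    """(N ||g||^2)^{-1} * sum over (w, t) of coeffs[w][t] * g^{w,t}."""
    norm_g_sq = inner(g, g).real
    out = [0j] * N
    for w in range(N):
        for t in range(N):
            a = atom(g, w, t)
            out = [o + coeffs[w][t] * ax for o, ax in zip(out, a)]
    return [o / (N * norm_g_sq) for o in out]


f = [complex(math.sin(0.9 * x), math.cos(0.4 * x * x)) for x in range(N)]
g = [math.exp(-0.3 * min(x, N - x) ** 2) + 0.1 * (x % 3) for x in range(N)]

coeffs = transform(f, g)
energy = sum(abs(c) ** 2 for row in coeffs for c in row)
expected = N * inner(g, g).real * inner(f, f).real
recon_error = max(abs(a - b) for a, b in zip(reconstruct(coeffs, g), f))
```

The window here is deliberately not a Gaussian, in line with the observation that no admissibility condition is needed.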
There is no admissibility condition in this case: any window function $g$ in $L^2$ will do. A convenient normalization for $g$ is $\|g\| = 1$. (The absence of an admissibility condition is due to the unimodularity of the Weyl-Heisenberg group; see Grossmann, Morlet, and Paul (1985).) The continuous windowed Fourier transform can again be viewed as a map from $L^2(\mathbb{R})$ to an r.k.H.s.; the functions $F \in T^{win}L^2(\mathbb{R})$ are all in $L^2(\mathbb{R}^2)$ and moreover satisfy
where $K(\omega, t; \omega', t') = \langle g^{\omega',t'}, g^{\omega,t}\rangle$. (We assume $\|g\| = 1$ here.) Again there exist very special choices for $g$ which reduce this r.k.H.s. to a space of analytic functions: for $g(x) = \pi^{-1/4}\exp(-x^2/2)$, one finds
where $\phi$ is an entire function. The set of all entire functions $\phi$ which can be obtained in this way constitutes the Bargmann Hilbert space (Bargmann (1961)). The $g^{\omega,t}$ obtained from $g(x) = g_0(x) = \pi^{-1/4}\exp(-x^2/2)$ are often called the canonical coherent states (see the primer in Klauder and Skagerstam (1985)); the associated continuous windowed Fourier transform is the canonical coherent state representation. It has many beautiful and useful properties, of which we will explain one that will be used in the next section. Applying the differential operator $H = -\frac{d^2}{dx^2} + x^2 - 1$ to $g_0(x)$ leads to
$$H g_0(x) = 0,$$
i.e., $g_0$ is an eigenfunction of $H$ with eigenvalue 0. In quantum mechanics language, $H$ is the harmonic oscillator Hamiltonian operator, and $g_0$ is its ground state. (Strictly speaking, $H$ is really twice the standard harmonic oscillator Hamiltonian.) The other eigenfunctions of $H$ are given by higher order Hermite functions,
[...]
which satisfy
$$H\phi_n = 2n\,\phi_n. \tag{2.7.4}$$
(The standard and easiest way to derive (2.7.4) is to write $H = A^*A$, where $A = x + \frac{d}{dx}$ and $A^*$ is its adjoint $A^* = x - \frac{d}{dx}$, and to show that $Ag_0 = 0$, $A(A^*)^n = (A^*)^nA + 2n(A^*)^{n-1}$, so that $H\phi_n = \alpha_n A^*A(A^*)^n g_0 = 2n\,\alpha_n (A^*)^n g_0 = 2n\,\phi_n$; the normalization $\alpha_n$ can be computed easily as well.) It is well known that the $\{\phi_n;\ n \in \mathbb{N}\}$ form an orthonormal basis for $L^2(\mathbb{R})$; they constitute therefore a "complete set of eigenfunctions" for $H$. Let us now consider the one-parameter families $\psi_s = \exp(-iHs)\psi$. These are the solutions to the equation
$$i\,\partial_s\psi_s = H\psi_s, \tag{2.7.5}$$
with initial condition $\psi_0 = \psi$. In the very special case where $\psi(x) = g^{\omega,t}(x) = e^{i\omega x}\exp[-(x - t)^2/2]$, we find $\psi_s = e^{i\alpha_s}g^{\omega_s,t_s}$, where $\omega_s = \omega\cos 2s - t\sin 2s$, $t_s = \omega\sin 2s + t\cos 2s$, and $\alpha_s = \frac{1}{2}(\omega t - \omega_s t_s)$ (as can easily be verified by explicit computation). That is, a canonical coherent state, when "evolved" under (2.7.5), remains a canonical coherent state (up to a phase factor which will not be important to us); the label $(\omega_s, t_s)$ of the new coherent state is obtained from the initial $(\omega, t)$ by a simple rotation in the time-frequency plane.
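The operator identities used in the derivation of (2.7.4) can be verified exactly with integer polynomial arithmetic. The sketch below represents a function $p(x)\exp(-x^2/2)$ by the coefficient list of the polynomial $p$; on such functions $A$ acts as $p \mapsto p'$ and $A^*$ as $p \mapsto 2xp - p'$. The representation and the helper names are assumptions of this illustration, and the normalization constants $\alpha_n$ are omitted.

```python
def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p


def derivative(p):
    """Coefficients of p'(x), where p[k] is the coefficient of x**k."""
    return [k * p[k] for k in range(1, len(p))] or [0]


def lower_op(p):
    """A = x + d/dx applied to p(x) exp(-x^2/2): the polynomial part is p'."""
    return trim(derivative(p))


def raise_op(p):
    """A* = x - d/dx applied to p(x) exp(-x^2/2): polynomial part 2x p - p'."""
    two_x_p = [0] + [2 * c for c in p]
    dp = derivative(p) + [0] * (len(two_x_p) - len(derivative(p)))
    return trim([a - b for a, b in zip(two_x_p, dp)])


def hamiltonian(p):
    """H = A* A."""
    return raise_op(lower_op(p))


# Unnormalized eigenfunctions (A*)^n g_0, starting from g_0 <-> p = 1.
polys = [[1]]
for _ in range(6):
    polys.append(raise_op(polys[-1]))

# Ratios H phi_n / phi_n read off from the leading coefficients.
eigenvalues = [hamiltonian(p)[-1] // p[-1] if n else 0
               for n, p in enumerate(polys)]
print(eigenvalues)
```

The polynomial parts produced by repeated application of $A^*$ are the (physicists') Hermite polynomials, which is why the $\phi_n$ are called Hermite functions.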
2.8. The continuous transforms as tools to build useful operators.

The resolutions of the identity (2.4.4), (2.7.2) can be rewritten in yet another way:
[...] (2.8.1a)
[...]
where $\langle\,\cdot\,, \phi\rangle\phi$ stands for the operator on $L^2(\mathbb{R})$ that sends $f$ to $\langle f, \phi\rangle\phi$; this is a rank one projection operator (i.e., its square and its adjoint are both identical to the operator itself, and its range is one-dimensional). Formulas (2.8.1) state that a "superposition," with equal weights, of the rank one projection operators corresponding to a family of wavelets (or a family of windowed Fourier functions) is exactly the identity operator. (As before, the integrals in (2.8.1) have to be taken in the weak sense.) What happens if we take similar superpositions, but give different weights to the different rank one projection operators? If the weight function is at all reasonable, we end up with a well-defined operator, different from the identity operator. If the weight function is bounded, then the corresponding operator is as well, but in many examples it is advantageous to consider even unbounded weight functions, which may give rise to unbounded operators. We will review a few interesting examples (bounded and unbounded) in this section. We start with the windowed Fourier case. Let us rewrite (2.8.1b) in the $p$, $q$ (momentum, position) notation customary in quantum mechanics (rather than the $\omega$, $t$ notation for the frequency-time plane), and insert a weight function
If $w \notin L^\infty(\mathbb{R}^2)$, then $W$ may be unbounded and hence not everywhere defined; as a domain for $W$ we can then take $\{f;\ \int\!\!\int dp\, dq\,|w(p,q)|^2\,|\langle f, g^{p,q}\rangle|^2 < \infty\}$ [...] decrease monotonically as $n$ increases; for small $n$ they are close to 1, for large $n$ close to zero. The threshold value around which they make this "plunge," as measured, for example, by $n_{thr} = \max\{n;\ \lambda_n \ge 1/2\}$, is $n_{thr} \simeq R^2/2$. Note that $n_{thr}$ is again equal to $|S_R|/2\pi$, i.e., the area of the time-frequency localization region $S_R$ multiplied by the Nyquist density, just as in §2.3. The width of the plunge region is wider than in §2.3 ($\#\{n;\ 1 - \epsilon \ge \lambda_n \ge \epsilon\} \le C_\epsilon R$, as compared to the logarithmic width in (2.3.2)), but it is still negligible, for large $R$, when compared with $n_{thr}$. Another striking difference with §2.3 is that in this case the eigenfunctions are independent of the size of the region $S_R$ (unlike the prolate spheroidal wave functions): the $R$-dependence is completely concentrated in the $\lambda_n(R)$.
FIG. 2.1. The eigenvalues $\lambda_n(R)$ for $R = 3$, 5, and 7.
Examples similar to all of the above exist for the continuous wavelet transform. We can again insert a non-constant function $w(a, b)$ in the integral in (2.8.1a), and construct operators $W$ different from the identity operator. An example is $w(a, b) = a^2$ in three dimensions, with a spherically symmetric $\psi$ (where the resolution of the identity is given by (2.6.2)), i.e.,
$$(Wf)(x) = C_\psi^{-1}\int_0^\infty \frac{da}{a^4}\int_{\mathbb{R}^3} db\; a^2\,(T^{wav}f)(a, b)\,\psi^{a,b}(x), \tag{2.8.3}$$
where $\hat\psi(\xi) = \phi(|\xi|)$ and [...]. Because the three-dimensional Fourier transform of $g(x) = |x|^{-1}$ is $\hat g(\xi) = \sqrt{2/\pi}\,|\xi|^{-2}$ (in the sense of distributions), one easily checks that $Wf$ can also be written as
so that $\langle Wf, g\rangle$ represents the Coulomb potential interaction energy for two charge distributions $f$ and $g$. This formula was used in, e.g., the relativistic stability of matter paper by Fefferman and de la Llave (1986). Note that $\langle Wf, g\rangle$ becomes "diagonal" in the representation (2.8.3) (which, incidentally, is why it turned out to be useful in Fefferman and de la Llave (1986)). Note also that this diagonal wavelet representation completely captures the singularity of the kernel in (2.8.4); there is no "clipping off" of the singularity as in the windowed Fourier case. This is due to the fact that wavelets can zoom in on singularities (an extreme version of very short-lived high frequency features!), whereas the windowed Fourier functions cannot (see §1.2 or §2.9). We can also, as in the windowed Fourier case, choose to restrict the integral in (2.8.1a) to a subset $S$ of $(a, b)$-space, thus defining time-frequency localization operators $L_S$. These are well defined for measurable $S$, and $0 \le L_S \le 1$. For compact $S$ not containing any points with $a = 0$, $L_S$ is a trace-class operator. For general $S$, the eigenfunctions and eigenvalues may again be hard to characterize, but there exist again special choices of $\psi$ and $S$ so that the eigenfunctions and eigenvalues of $L_S$ are known explicitly. Their analysis is similar to the windowed Fourier case, but a bit more tricky. We will only sketch the results here; for full details the reader should consult Paul (1985) or Daubechies and Paul (1988). One such special choice is $\hat\psi($ [...]

[...] (although it does not matter, since we take negative as well as positive powers $m$). For $m = 0$, it seems natural as well to discretize $b$ by taking only the integer (positive and negative) multiples of one fixed $b_0$ (we arbitrarily fix $b_0 > 0$), where $b_0$ is appropriately chosen so that the $\psi(x - nb_0)$ "cover" the whole line (in a sense to be made precise below). For different values of $m$, the width of $a_0^{-m/2}\psi(a_0^{-m}x)$ is $a_0^m$ times the width of $\psi(x)$ (as measured, e.g., by width$(f) = [\int dx\, x^2|f(x)|^2]^{1/2}$, where we assume that $\int dx\, x|f(x)|^2 = 0$), so that the choice $b = nb_0a_0^m$ will ensure that the discretized wavelets at level $m$ "cover" the line in the same way that the $\psi(x - nb_0)$ do. Thus we choose $a = a_0^m$, $b = nb_0a_0^m$, where $m$, $n$ range over $\mathbb{Z}$, and $a_0 > 1$, $b_0 > 0$ are fixed; the appropriate choices for $a_0$, $b_0$ depend, of course, on the wavelet $\psi$ (see below). This corresponds to
$$\psi_{m,n}(x) = a_0^{-m/2}\,\psi\!\left(\frac{x - nb_0a_0^m}{a_0^m}\right) = a_0^{-m/2}\,\psi(a_0^{-m}x - nb_0).$$
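The scaling behavior behind this choice of lattice can be illustrated numerically. The following is a minimal sketch, assuming the Mexican hat as $\psi$, the values $a_0 = 2$, $b_0 = 1$, and a midpoint-rule quadrature on a truncated interval; all of these are illustrative assumptions.

```python
import math

A0, B0 = 2.0, 1.0                      # assumed dilation and translation steps


def mexican_hat(x):
    return (1.0 - x * x) * math.exp(-x * x / 2.0)


def psi_mn(x, m, n):
    """psi_{m,n}(x) = a0^{-m/2} psi(a0^{-m} x - n b0)."""
    return A0 ** (-m / 2.0) * mexican_hat(A0 ** (-m) * x - n * B0)


def width(func, lo=-80.0, hi=80.0, steps=80000):
    """width(f) = [ int dx x^2 |f(x)|^2 ]^{1/2}, by the midpoint rule."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        total += x * x * func(x) ** 2
    return math.sqrt(h * total)


def center(m, n):
    """Location n * b0 * a0^m around which psi_{m,n} is centered."""
    return n * B0 * A0 ** m


base = width(lambda x: psi_mn(x, 0, 0))
ratios = {m: width(lambda x, m=m: psi_mn(x, m, 0)) / base
          for m in (-1, 1, 2)}
spacing = {m: center(m, 1) - center(m, 0) for m in (-1, 0, 1, 2)}
```

Both the width and the spacing of neighboring centers scale by $a_0^m$, so each level "covers" the line in the same way as level $m = 0$.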
We can now ask two questions:

(1) Do the discrete wavelet coefficients $\langle f, \psi_{m,n}\rangle$ completely characterize $f$? Or, stronger, can we reconstruct $f$ in a numerically stable way from the $\langle f, \psi_{m,n}\rangle$?

(2) Can any function $f$ be written as a superposition of "elementary building blocks" $\psi_{m,n}$? Can we write an easy algorithm to find the coefficients in such a superposition?
In fact, these questions are dual aspects of only one problem. We will see below that, for reasonable $\psi$ and appropriate $a_0$, $b_0$, there exist $\tilde\psi_{m,n}$ so that the answer to the reconstruction question is simply
$$f = \sum_{m,n}\langle f, \psi_{m,n}\rangle\,\tilde\psi_{m,n}.$$
It then follows that, for any $g \in L^2(\mathbb{R})$,
$$\langle f, g\rangle = \sum_{m,n}\langle f, \psi_{m,n}\rangle\,\langle\tilde\psi_{m,n}, g\rangle,$$
or $g = \sum_{m,n}\langle g, \tilde\psi_{m,n}\rangle\,\psi_{m,n}$, at least in the weak sense; this is effectively a prescription for the computation of the coefficients in a superposition of $\psi_{m,n}$ leading to $g$. We will mostly focus on the first set of questions here; for a more detailed discussion of the duality between (1) and (2), see Gröchenig (1991). In the case of the continuous wavelet transform, both questions were answered immediately by the resolution of the identity, at least if $\psi$ was admissible. In the present discrete case there is no analog of the resolution of the identity, so we have to attack the problem some other way. We can also wonder whether there exists a "discrete admissibility condition," and what it is. Let us first give some mathematical content to the questions in (1). We will restrict ourselves mostly to functions $f \in L^2(\mathbb{R})$, although discrete families of wavelets, like their continuously labelled cousins, can be used in many other function spaces as well.
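Functions can then be "characterized" by means of their "wavelet coefficients" $\langle f, \psi_{m,n}\rangle$ if it is true that
$$\langle f_1, \psi_{m,n}\rangle = \langle f_2, \psi_{m,n}\rangle \ \text{ for all } m, n \in \mathbb{Z} \quad\text{implies}\quad f_1 \equiv f_2,$$
or, equivalently, if
$$\langle f, \psi_{m,n}\rangle = 0 \ \text{ for all } m, n \in \mathbb{Z} \ \Longrightarrow\ f = 0.$$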
But we want more than characterizability: we want to be able to reconstruct $f$ in a numerically stable way from the $\langle f, \psi_{m,n}\rangle$. In order for such an algorithm to exist, we must be sure that if the sequence $(\langle f_1, \psi_{m,n}\rangle)_{m,n\in\mathbb{Z}}$ is "close" to $(\langle f_2, \psi_{m,n}\rangle)_{m,n\in\mathbb{Z}}$, then necessarily $f_1$ and $f_2$ were "close" as well. In order to make this precise, we need topologies on the function space and on the sequence space. On the function space $L^2(\mathbb{R})$ we already have its Hilbert space topology; on the sequence space we will choose a similar $\ell^2$-topology, in which the distance between sequences $c^1 = (c^1_{m,n})_{m,n\in\mathbb{Z}}$ and $c^2 = (c^2_{m,n})_{m,n\in\mathbb{Z}}$ is measured by
$$\|c^1 - c^2\|^2 = \sum_{m,n\in\mathbb{Z}}|c^1_{m,n} - c^2_{m,n}|^2.$$
This implicitly assumes that the sequences $(\langle f, \psi_{m,n}\rangle)_{m,n\in\mathbb{Z}}$ are in $\ell^2(\mathbb{Z}^2)$ themselves, i.e., that $\sum_{m,n}|\langle f, \psi_{m,n}\rangle|^2 < \infty$ for all $f \in L^2(\mathbb{R})$. In practice, this is no problem. As we will see below, any reasonable wavelet $\psi$ (which means that $\psi$ has some decay in both time and frequency, and that $\int dx\,\psi(x) = 0$), and any choice for $a_0 > 1$, $b_0 > 0$ leads to

[...] $\sum_{j\in J}|\langle f, \varphi_j\rangle|^2 =$ [...]
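1. Saying that $f = \sum_{j\in J}c_j\varphi_j$ is equivalent to saying that $f = F^*c$.

2. Write $c = a + b$, where $a \in \mathrm{Ran}(F) = \mathrm{Ran}(\tilde F)$, and $b \perp \mathrm{Ran}(F)$. In particular, $a \perp b$; hence $\|c\|^2 = \|a\|^2 + \|b\|^2$.

3. Since $a \in \mathrm{Ran}(\tilde F)$, there exists $g \in \mathcal{H}$ so that $a = \tilde Fg$, or $c = \tilde Fg + b$. Hence $f = F^*c = F^*\tilde Fg + F^*b$. But $b \perp \mathrm{Ran}(F)$, so that $F^*b = 0$, and $F^*\tilde F = \mathrm{Id}$. It follows that $f = g$; hence $c = \tilde Ff + b$, and
$$\|c\|^2 = \|\tilde Ff\|^2 + \|b\|^2 = \sum_{j\in J}|\langle f, \tilde\varphi_j\rangle|^2 + \|b\|^2,$$
which is strictly larger than $\sum_{j\in J}|\langle f, \tilde\varphi_j\rangle|^2$, unless $b = 0$ and $c = \tilde Ff$.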
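The conclusion of this argument, that the coefficients $\langle f, \tilde\varphi_j\rangle$ have the smallest $\ell^2$-norm among all coefficient sequences that synthesize $f$, can be observed on a small example. The sketch below assumes a specific frame of three unit vectors in the plane at mutual angles of 120 degrees (chosen here for illustration; it is tight with frame bound 3/2, so the dual vectors are $\frac{2}{3}e_j$), and real vectors for simplicity.

```python
import math

# Assumed example frame: three unit vectors in the plane, 120 degrees apart.
E = [(0.0, 1.0),
     (-math.sqrt(3) / 2, -0.5),
     (math.sqrt(3) / 2, -0.5)]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def synthesize(coeffs):
    """Return sum_j coeffs[j] * e_j."""
    return (sum(c * e[0] for c, e in zip(coeffs, E)),
            sum(c * e[1] for c, e in zip(coeffs, E)))


v = (0.7, -1.2)

# Canonical coefficients <v, (2/3) e_j>, i.e. inner products with the dual frame.
canonical = [2.0 / 3.0 * dot(v, e) for e in E]

# Another valid coefficient sequence: add the same constant to every entry.
# Synthesis is unchanged because e_1 + e_2 + e_3 = 0.
shifted = [c + 0.4 for c in canonical]

norm_canonical = sum(c * c for c in canonical)
norm_shifted = sum(c * c for c in shifted)
```

Both sequences reproduce `v`, but the shifted one carries an extra component orthogonal to the range of the analysis operator, which only adds to its norm.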
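This proposition can also be used to see how the $\tilde\varphi_j$ play a special role in the first half of (3.2.8). We typically have nonuniqueness there as well: there may exist many other families $(u_j)_{j\in J}$ so that $f = \sum_{j\in J}\langle f, \varphi_j\rangle u_j$. In our earlier two-dimensional example, such other families are given by $u_j = \frac{2}{3}e_j + a$, where $a$ is an arbitrary vector in $\mathbb{C}^2$. Since $\sum_{j=1}^3 e_j = 0$, we obviously have
$$\sum_{j=1}^3\langle v, e_j\rangle u_j = \frac{2}{3}\sum_{j=1}^3\langle v, e_j\rangle e_j + \Big[\sum_{j=1}^3\langle v, e_j\rangle\Big]a = v.$$
Again, however, the $u_j$ are "less economical" than the $\tilde e_j$, in the sense that for all $v$
$$\sum_{j=1}^3|\langle v, u_j\rangle|^2 = \frac{2}{3}\|v\|^2 + 3|\langle v, a\rangle|^2 \ge \frac{2}{3}\|v\|^2 = \sum_{j=1}^3|\langle v, \tilde e_j\rangle|^2.$$
A similar inequality holds for every frame: if $f = \sum_{j\in J}\langle f, \varphi_j\rangle u_j$, then $\sum_{j\in J}|\langle u_j, g\rangle|^2 \ge \sum_{j\in J}|\langle\tilde\varphi_j, g\rangle|^2$ for all $g \in \mathcal{H}$, by Proposition 3.2.4.

Let us return to the reconstruction issue. If we know $\tilde\varphi_j = (F^*F)^{-1}\varphi_j$, then (3.2.8) tells us how to reconstruct $f$ from the $\langle f, \varphi_j\rangle$. So we only need to compute the $\tilde\varphi_j$, which involves the inversion of $F^*F$. If $B$ and $A$ are close to each other, i.e., $r = B/A - 1 \ll 1$, then (3.2.4) tells us that $F^*F$ is "close" to $\frac{A+B}{2}\,\mathrm{Id}$, so that $(F^*F)^{-1}$ is "close" to $\frac{2}{A+B}\,\mathrm{Id}$, and $\tilde\varphi_j$ "close" to $\frac{2}{A+B}\varphi_j$. More precisely,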
$$f = \frac{2}{A+B}\sum_{j\in J}\langle f, \varphi_j\rangle\,\varphi_j + Rf, \tag{3.2.11}$$
where $R = \mathrm{Id} - \frac{2}{A+B}F^*F$; hence $-\frac{B-A}{B+A}\,\mathrm{Id} \le R \le \frac{B-A}{B+A}\,\mathrm{Id}$. This implies $\|R\| \le \frac{B-A}{B+A} = \frac{r}{2+r}$. If $r$ is small, we can drop the rest term $Rf$ in (3.2.11), and we obtain a reconstruction formula for $f$ which is accurate up to an $L^2$-error of $\frac{r}{2+r}\|f\|$. Even if $r$ is not so small, we can write an algorithm for the reconstruction of $f$ with exponential convergence. With the same definition of $R$, we have
$$F^*F = \frac{A+B}{2}\,(\mathrm{Id} - R);$$
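One way to turn this identity into an algorithm is to expand $(\mathrm{Id} - R)^{-1}$ as a Neumann series, which converges because $\|R\| < 1$. The sketch below is an illustration under stated assumptions rather than the text's own algorithm: it uses a small non-tight frame in the plane, with $A = 1$ and $B = 3$ the eigenvalues of $F^*F$, and the iteration $f_{k+1} = f_k + \frac{2}{A+B}\sum_j\left(\langle f, \varphi_j\rangle - \langle f_k, \varphi_j\rangle\right)\varphi_j$, whose error is multiplied by $R$ at every step.

```python
import math

# Assumed example frame in the plane; F*F = [[2, 1], [1, 2]] has
# eigenvalues 1 and 3, which are the optimal frame bounds A and B.
FRAME = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
A, B = 1.0, 3.0


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def analysis(f):
    """Frame coefficients <f, phi_j>."""
    return [dot(f, phi) for phi in FRAME]


def synthesis(coeffs):
    """sum_j coeffs[j] * phi_j."""
    return (sum(c * phi[0] for c, phi in zip(coeffs, FRAME)),
            sum(c * phi[1] for c, phi in zip(coeffs, FRAME)))


def reconstruct(coeffs, iterations):
    """Iterate f_{k+1} = f_k + 2/(A+B) * F*(c - F f_k), starting from 0."""
    estimate = (0.0, 0.0)
    for _ in range(iterations):
        residual = [c - d for c, d in zip(coeffs, analysis(estimate))]
        step = synthesis(residual)
        estimate = (estimate[0] + 2.0 / (A + B) * step[0],
                    estimate[1] + 2.0 / (A + B) * step[1])
    return estimate


def distance(u, v):
    return math.hypot(u[0] - v[0], u[1] - v[1])


f = (3.0, -1.0)
coeffs = analysis(f)
first_order_error = distance(reconstruct(coeffs, 1), f)   # dropping R f
converged_error = distance(reconstruct(coeffs, 40), f)
bound = (B - A) / (B + A) * math.hypot(*f)
```

The first iterate is exactly the approximation obtained by dropping $Rf$ in (3.2.11), and each further iteration reduces the error by at least the factor $\frac{B-A}{B+A}$.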
[...] 0, or, for our particular $w$, [...] with $|\rho(a, b)|$ [...] where [...] $w(0) = \lambda$. Consequently, [...] with $C_\psi$ as defined by (2.4.1). We can rewrite the first term in (3.3.7) as
4. For the particular weight function $w$ that we have chosen, we have $\int dt\, w(t) =$ [...], hence $\mathrm{Tr}\, C = \|h\|^2$ [...]. Substituting all our results in (3.3.9) we find
[...]
where $|R| \le \lambda\, C_\psi\,\|h\|^2$. If we divide by $\|h\|^2$ and let $\lambda$ tend to zero, then this proves (3.3.1). The negative frequency formula (3.3.2) is proved analogously.
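1. Formulas (3.3.1), (3.3.2) impose an a priori restriction on $\psi$, namely that $\int_0^\infty d\xi\,\xi^{-1}|\hat\psi(\xi)|^2 < \infty$ and $\int_{-\infty}^0 d\xi\,|\xi|^{-1}|\hat\psi(\xi)|^2 < \infty$. This is the same restriction as in the continuous case (see (2.4.6)).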
It follows that the collection $\{\psi^\epsilon_{m,n};\ m, n \in \mathbb{Z},\ \epsilon = +\text{ or }-\}$ is a tight frame for $L^2(\mathbb{R})$, with frame bound [...]. One can use a variant to obtain a frame consisting of real wavelets: $\psi^1 = \mathrm{Re}\,\psi^+ = \frac{1}{2}[\psi^+ + \psi^-]$ and $\psi^2 = \mathrm{Im}\,\psi^+ = \frac{1}{2i}[\psi^+ - \psi^-]$ generate the tight frame $\{\psi^\lambda_{m,n};\ m, n \in \mathbb{Z},\ \lambda = 1\text{ or }2\}$. These frames are not generated by translations and dilations of a single function; this is a natural consequence of the decoupling of positive and negative frequencies in the construction. A more serious objection to their practical use is the fact that their Fourier transforms are compactly supported, and that the size of this support is relatively small (for reasonable $a_0$, $b_0$). As a result, the decay of the wavelets is numerically rather slow: even though we may choose $\nu$ to be $C^\infty$, so that the $\psi^\pm$ decay faster than any inverse polynomial,
$$|\psi^\pm(x)| \le C_N\,(1 + |x|)^{-N},$$
the value of $C_N$ turns out to be too large to be practical. Note that we did not introduce any restriction on $a_0$, $b_0$ in this construction.
B. The Mexican hat function. The Mexican hat function is the second derivative of the Gaussian $e^{-x^2/2}$; if we normalize it so that its $L^2$-norm is 1, we obtain
$$\psi(x) = \frac{2}{\sqrt{3}}\,\pi^{-1/4}\,(1 - x^2)\,e^{-x^2/2}.$$
This function (and dilated and translated versions of it) was plotted in Figure 1.2b; if you take one such plot, and imagine it rotated around its symmetry axis, then you obtain a shape similar to a Mexican hat. This function is popular in vision analysis (at least in theoretical expositions), where it was also christened. Table 3.1 gives the frame bounds for this function, as computed from (3.3.19), (3.3.20), with $a_0 = 2$, for different values of $b_0$ and for a number of voices varying from 1 to 4. As soon as we take 2 or more voices, the frame may be considered tight for all $b_0 \le .75$. Note that $b_0 = .75$ and $(a_0)_{\text{effective}} = 2^{1/2} \simeq 1.41$ (intuitively corresponding to two voices per octave) are not small values for the Mexican hat function: the distance between the maximum of $\psi$ and its zeros is only 1, and the width of the positive frequency bump of $\hat\psi$ (as measured by [...]

[...] $> 0$ are fixed, and $m$, $n$ range over $\mathbb{Z}$; the discretely labelled family is thus
$$g_{m,n}(x) = e^{im\omega_0 x}\,g(x - nt_0).$$
We can again seek answers to the same questions as in the wavelet case: for which choices of $g$, $\omega_0$, $t_0$ can a function be characterized by the inner products $\langle f, g_{m,n}\rangle$; when is it possible to reconstruct $f$ in a numerically stable way from these inner products; can an efficient algorithm be given to write $f$ as a linear combination of the $g_{m,n}$? The answers are again provided by the same abstract framework: stable numerical reconstruction of $f$ from its windowed Fourier coefficients
$$\langle f, g_{m,n}\rangle = \int dx\, f(x)\, e^{-im\omega_0 x}\,\overline{g(x - nt_0)}$$
is only possible if the $g_{m,n}$ constitute a frame, i.e., if there exist $A > 0$, $B < \infty$ so that
$$A\,\|f\|^2 \le \sum_{m,n}|\langle f, g_{m,n}\rangle|^2 \le B\,\|f\|^2.$$
[...] 0, then there exists $(\omega_0)_{thr} > 0$ so that the $g_{m,n}(x) = e^{im\omega_0 x}g(x - nt_0)$ constitute a frame whenever $\omega_0 < (\omega_0)_{thr}$. For $\omega_0 < (\omega_0)_{thr}$, the right-hand sides of (3.4.3), (3.4.4) are frame bounds for the $g_{m,n}$. The conditions on [...] and (3.4.5) are met if, e.g., $|g(x)| \le C(1 + |x|)^{-\gamma}$ with $\gamma > 1$.

REMARK. The windowed Fourier case exhibits a symmetry under the Fourier transform absent in the wavelet case. We have
$$(g_{m,n})^\wedge(\xi) = e^{imn\omega_0 t_0}\, e^{-int_0\xi}\,\hat g(\xi - m\omega_0),$$
which implies that (3.4.3), (3.4.4) still hold if we replace $g$, $\omega_0$, $t_0$ by $\hat g$, $t_0$, $\omega_0$, respectively, everywhere in the right-hand sides (including in the definition of $\beta$). Using this remark, we can therefore compute two estimates each for $A$ and $B$, and pick the highest one for $A$, the lowest for $B$. $\Box$

3.4.3. The dual frame.

The dual frame is again defined by
$$\tilde g_{m,n} = (F^*F)^{-1}g_{m,n},$$
where $F^*F$ is now $(F^*F)f = \sum_{m,n}\langle f, g_{m,n}\rangle\, g_{m,n}$. In this case one easily checks that $F^*F$ commutes with translations by $t_0$ as well as with multiplications by $e^{i\omega_0 x}$, i.e., if $(Tf)(x) = f(x - t_0)$, $(Ef)(x) = e^{i\omega_0 x}f(x)$, then
$$F^*F\,T = T\,F^*F, \qquad F^*F\,E = E\,F^*F.$$
It follows that $(F^*F)^{-1}$ also commutes with $E$ and $T$, so that
$$\tilde g_{m,n} = (F^*F)^{-1}E^mT^n g = E^mT^n(F^*F)^{-1}g = E^mT^n\tilde g,$$
or
$$\tilde g_{m,n}(x) = e^{im\omega_0 x}\,\tilde g(x - nt_0),$$
where $\tilde g = (F^*F)^{-1}g$. Unlike the generic wavelet case, the dual frame is always generated by a single function $\tilde g$. This means that it is not as important in the windowed Fourier case that the frame be close to a tight frame: if $B/A - 1$ is non-negligible, then one simply computes $\tilde g$ to high precision, once and for all, and works with the two dual frames.
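The commutation relations behind this simplification can be verified in a finite cyclic model. The sketch below assumes sequences of length $N = 12$, a time step of 3 samples in place of $t_0$, and a frequency step of 2 bins in place of $\omega_0$; the window and test signal are arbitrary illustrative choices, and the discretization itself is an assumption of this sketch.

```python
import cmath
import math

N, T_STEP, W_STEP = 12, 3, 2     # assumed cyclic length and lattice steps


def gabor_atom(g, m, n):
    """Discrete analogue of g_{m,n}(x) = exp(i m w0 x) g(x - n t0)."""
    return [cmath.exp(2j * math.pi * m * W_STEP * x / N) * g[(x - n * T_STEP) % N]
            for x in range(N)]


def inner(u, v):
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))


def frame_operator(f, g):
    """(F*F) f = sum over (m, n) of <f, g_{m,n}> g_{m,n}."""
    out = [0j] * N
    for m in range(N // W_STEP):
        for n in range(N // T_STEP):
            atom = gabor_atom(g, m, n)
            c = inner(f, atom)
            out = [o + c * a for o, a in zip(out, atom)]
    return out


def translate(f):
    """(T f)(x) = f(x - t0)."""
    return [f[(x - T_STEP) % N] for x in range(N)]


def modulate(f):
    """(E f)(x) = exp(i w0 x) f(x)."""
    return [cmath.exp(2j * math.pi * W_STEP * x / N) * f[x] for x in range(N)]


g = [math.exp(-(min(x, N - x) ** 2) / 4.0) for x in range(N)]
f = [complex(math.sin(1.3 * x), math.cos(0.7 * x)) for x in range(N)]

t_defect = max(abs(a - b) for a, b in
               zip(frame_operator(translate(f), g), translate(frame_operator(f, g))))
e_defect = max(abs(a - b) for a, b in
               zip(frame_operator(modulate(f), g), modulate(frame_operator(f, g))))
```

Both defects vanish up to rounding, so in this model a single solve for $\tilde g = (F^*F)^{-1}g$ would determine the whole dual family.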
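A. Tight frames with compact support in time or frequency. The following construction, again from Daubechies, Grossmann, and Meyer (1986), and similar to §3.3.6.A, leads to tight windowed Fourier frames with arbitrarily high regularity if $\omega_0 t_0 < 2\pi$. If support $g \subset [-\pi/\omega_0, \pi/\omega_0]$, then
[...]
where we have used that for any $n$, at most one value of $\ell$ can contribute, because of the support property of $g$. Consequently,
[...]
and the frame is tight if and only if $\sum_n|g(x - nt_0)|^2 = $ constant. For instance, if $\omega_0 t_0 \ge \pi$, then we can start again from a $C^k$ or $C^\infty$-function $\nu$ satisfying (3.3.25) and define
$$g(x) = t_0^{-1/2}\begin{cases}\sin\left[\frac{\pi}{2}\,\nu\!\left(\frac{\pi + \omega_0 x}{2\pi - \omega_0 t_0}\right)\right], & -\pi/\omega_0 \le x \le \pi/\omega_0 - t_0,\\[4pt] 1, & \pi/\omega_0 - t_0 \le x \le -\pi/\omega_0 + t_0,\\[4pt] \cos\left[\frac{\pi}{2}\,\nu\!\left(\frac{\pi + \omega_0 x - \omega_0 t_0}{2\pi - \omega_0 t_0}\right)\right], & -\pi/\omega_0 + t_0 \le x \le \pi/\omega_0,\\[4pt] 0, & \text{otherwise}.\end{cases}$$
Then $g$ is a $C^k$ or $C^\infty$ function (depending on the choice of $\nu$) with compact support, $\|g\| = 1$, and the $g_{m,n}$ constitute a tight frame with frame bound $2\pi(\omega_0 t_0)^{-1}$ (as already followed from (3.4.2)). If $\omega_0 t_0 < \pi$, then this construction can easily be adapted. This construction gives a tight frame with compactly supported $g$. By taking its Fourier transform, we obtain a frame for which the window function has compactly supported Fourier transform.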
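The tightness criterion, that $\sum_n|g(x - nt_0)|^2$ be constant, can be checked numerically for one window of this type. The sketch below is written under explicit assumptions: the piecewise sine/one/cosine form of `window` is one natural way to meet the stated requirements, `nu` is taken to be $\sin^2(\pi x/2)$ on $[0, 1]$ (0 to the left, 1 to the right), and $t_0 = 1$, $\omega_0 = 1.5\pi$ are illustrative values with $\pi \le \omega_0 t_0 < 2\pi$.

```python
import math

T0 = 1.0
OMEGA0 = 1.5 * math.pi          # assumed values with pi <= OMEGA0 * T0 < 2 pi


def nu(x):
    """Assumed transition function: 0 for x <= 0, 1 for x >= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return math.sin(math.pi * x / 2.0) ** 2


def window(x):
    """Assumed compactly supported window g on [-pi/OMEGA0, pi/OMEGA0]."""
    half = math.pi / OMEGA0
    denom = 2.0 * math.pi - OMEGA0 * T0
    scale = T0 ** -0.5
    if -half <= x <= half - T0:                       # rising edge
        return scale * math.sin(math.pi / 2.0 * nu((math.pi + OMEGA0 * x) / denom))
    if half - T0 < x < -half + T0:                    # flat top
        return scale
    if -half + T0 <= x <= half:                       # falling edge
        return scale * math.cos(
            math.pi / 2.0 * nu((math.pi + OMEGA0 * x - OMEGA0 * T0) / denom))
    return 0.0


def translate_sum(x, n_range=5):
    """sum_n |g(x - n t0)|^2 over the translates that can overlap x."""
    return sum(window(x - n * T0) ** 2 for n in range(-n_range, n_range + 1))


samples = [translate_sum(-0.5 + k / 200.0) for k in range(201)]
deviation = max(abs(s - 1.0 / T0) for s in samples)

# ||g||^2 by the midpoint rule; it should equal 1.
steps = 20000
h = 2.0 / steps
norm_sq = h * sum(window(-1.0 + (i + 0.5) * h) ** 2 for i in range(steps))

frame_bound = 2.0 * math.pi / (OMEGA0 * T0)
```

The rising and falling edges are built from a sine and a cosine of the same argument, so overlapping translates add up to a constant.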
B. The Gaussian. In this case g ( t ) = n-li4 e - ~ ' / ~ Discrete . families of wiqdowed Fourier functions starting from a Gaussian window have been disc& ' extensively in the literature for many reasons. Gabor (1946) prop& their use for communication purposes (he proposed -to = 27r, however, which is inapprapriate: see below); because of the importance of the "canonical coherent states* in quantum mechanics (see Klauder and Skagemtarn (1985)) they are of interest to physicists; the link between Gaussian coherent states and the Bargrnann space of entire function makes it poesible to rewrite results concerning the gm,rn in
DISCRETE *AVELIW TRANSFOW.
7
.
85
FRAMES *
.
terms of sampling properties for the Bargmann space. Exploiting this link with entire functions, it was proved in Bargmann et al. (1971) and independently in Perelomov (1971) that the $g_{m,n}$ span all of $L^2(\mathbb{R})$ if and only if $\omega_0 t_0 \le 2\pi$; in Bacry, Grossmann, and Zak (1975) a different technique was used to show that if $\omega_0 t_0 = 2\pi$, then

$$\inf_{f \in L^2,\ f \ne 0} \|f\|^{-2} \sum_{m,n} |\langle f, g_{m,n}\rangle|^2 = 0,$$

even though the $g_{m,n}$ are "complete," in the sense that they span $L^2(\mathbb{R})$.¹³ (We will see in Chapter 4 that this is a direct consequence of $\omega_0 \cdot t_0 = 2\pi$, and of the regularity of both $g$ and $\hat g$.) This is therefore an example of a family of $g_{m,n}$ where the inner products $\langle f, g_{m,n}\rangle$ suffice to characterize the function $f$ (if $\langle f_1, g_{m,n}\rangle = \langle f_2, g_{m,n}\rangle$ for all $m, n$, then $f_1 = f_2$), but where there is no numerically stable reconstruction formula for $f$ from the $\langle f, g_{m,n}\rangle$. Bastiaans (1980, 1981) has constructed a dual function $\tilde g$ such that
$$f = \sum_{m,n} \langle f, g_{m,n}\rangle\, \tilde g_{m,n}, \tag{3.4.6}$$

with $\tilde g_{m,n}(x) = e^{im\omega_0 x}\, \tilde g(x - nt_0)$, but convergence of (3.4.6) holds only in a very weak sense (in the sense of distributions; see Janssen (1981, 1984)), and not even in the weak $L^2$-sense; in fact, $\tilde g$ itself is not in $L^2(\mathbb{R})$.

The case $\omega_0 \cdot t_0 = 2\pi$ is thus completely understood; what happens if $\omega_0 t_0 < 2\pi$? Table 3.3 shows the values of the frame bounds $A$, $B$ and of the ratio $B/A$, for various values of $\omega_0 \cdot t_0$, computed from (3.4.3), (3.4.4) and the analogous formulas using $\hat g$. We find that the $g_{m,n}$ do constitute a frame, even for $\omega_0 \cdot t_0/(2\pi) = .95$, although $B/A$ becomes very large so close to the "critical" density. It turns out that when $\omega_0 \cdot t_0/(2\pi) = 1/N$, $N \in \mathbb{N}$, $N > 1$, the frame bounds can also be computed via another technique, which leads to exact values (within the error of computation) instead of lower, respectively upper, bounds for $A$, $B$.¹⁴ For such values of $\omega_0 \cdot t_0/(2\pi)$, Table 3.3 lists these exact values as well; it is surprising to see how close our bounds on $A$, $B$ (which are, after all, obtained via a Cauchy-Schwarz inequality, and might therefore be quite coarse) are to the exact values. Substituting these values for $A$, $B$ into the approximation formulas at the end of §3.2, we can compute $\tilde g$ for these different choices of $\omega_0$, $t_0$. Figure 3.6 shows plots of $\tilde g$ for the special case where $\omega_0 = t_0 = (\lambda 2\pi)^{1/2}$, with $\lambda$ taking the values .25, .375, .5, .75, .95, and 1. Note that Bastiaans' function $\tilde g$, which corresponds to $\lambda = 1$ (lower right plot in Figure 3.6), has to be computed differently, since $A = 0$ for $\lambda = 1$. For small $\lambda$, the frame is very close to tight, and $\tilde g$ is close to $g$ itself, as is illustrated by the near-Gaussian profile of $\tilde g$ for $\lambda = .25$. As $\lambda$ increases, the frame becomes both less redundant (as reflected by the growing maximum amplitude of $\tilde g$) and less tight, causing $\tilde g$ to deviate more from a Gaussian. Because both $g$ and $\hat g$ have (faster than) exponential decay, one can easily prove from the converging series representation for $\tilde g$ that $\tilde g$ and its Fourier transform have exponential decay as well, if $A > 0$.
It follows that $\tilde g$ has good time-frequency localization properties, for all the values of $\lambda < 1$ in Figure 3.6, even though it is quite striking how $\tilde g$ tends to Bastiaans'
FIG. 3.6. The dual frame function $\tilde g$ for Gaussian $g$ and $\omega_0 = t_0 = (2\pi\lambda)^{1/2}$, with $\lambda$ = .25, .375, .5, .75, .95, and 1. As $\lambda$ increases, $\tilde g$ deviates more and more from a Gaussian (reflecting the increase of $B/A$), and its amplitude increases as well (because $A + B$ decreases). For $\lambda = 1$, $\tilde g$ is no longer square integrable.
pathological $\tilde g$ as $\lambda$ increases. For $\lambda = 1$, all time-frequency localization breaks down.¹⁵

The series of plots in Figure 3.6 suggests the conjecture, first formulated in Daubechies and Grossmann (1988), that, at least for Gaussian $g$, the $g_{m,n}$ are a frame whenever $\omega_0 t_0 < 2\pi$. In Daubechies (1990) it was shown that this is indeed the case for $\omega_0 t_0/(2\pi) < .996$. Using entire function methods, this conjecture has since been proved, by Lyubarskii (1989) and independently by Seip and Wallstén (1990).

There exist of course many other possible and popular choices for the window function $g$, but we will stop our list of examples here, and return to wavelets.

3.5. Time-frequency localization.
One of our main motivations for studying wavelet transforms (or windowed Fourier transforms) is that they provide a time-frequency picture, with, hopefully, good localization properties in both variables. We have asserted several times that if $\psi$ itself is well localized in time and in frequency, then the frame generated by $\psi$ will share that property. In this section we want to make this vague statement more precise.

For the sake of convenience, we assume $|\psi|$ and $|\hat\psi|$ to be symmetric (true if, e.g., $\psi$ is real and symmetric; a good example is the Mexican hat function)¹⁶; then $\psi$ is centered around 0 in time and near $\pm\xi_0$ in frequency (with, e.g., $\xi_0 = \int_0^\infty d\xi\, \xi\, |\hat\psi(\xi)|^2 / [\int_0^\infty d\xi\, |\hat\psi(\xi)|^2]$). If $\psi$ is well localized in time and frequency, then $\psi_{m,n}$ will similarly be well localized around $a_0^m n b_0$ in time and around $\pm a_0^{-m}\xi_0$ in frequency. Intuitively speaking, $\langle f, \psi_{m,n}\rangle$ then represents the "information content" in $f$ near time $a_0^m n b_0$ and near the frequencies $\pm a_0^{-m}\xi_0$. If $f$ itself is "essentially localized" on two rectangles in time-frequency space, meaning that, for some $0 < \Omega_0 < \Omega_1$ ...
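$$\left\| f - \sum_{(m,n)\in B_\epsilon} \langle f, \psi_{m,n}\rangle\, \tilde\psi_{m,n} \right\| = \sup_{\|h\|=1} \left| \langle f, h\rangle - \sum_{(m,n)\in B_\epsilon} \langle f, \psi_{m,n}\rangle \langle \tilde\psi_{m,n}, h\rangle \right| = \sup_{\|h\|=1} \left| \sum_{(m,n)\notin B_\epsilon} \langle f, \psi_{m,n}\rangle \langle \tilde\psi_{m,n}, h\rangle \right|$$

$$\le \sup_{\|h\|=1} \sum_{m < m_0 \text{ or } m > m_1} \sum_{n} \Big[ |\langle P_{\Omega_0,\Omega_1} f, \psi_{m,n}\rangle| + |\langle (1 - P_{\Omega_0,\Omega_1}) f, \psi_{m,n}\rangle| \Big]\, |\langle \tilde\psi_{m,n}, h\rangle|$$

$$\quad + \sup_{\|h\|=1} \sum_{m_0 \le m \le m_1}\ \sum_{|nb_0| > a_0^{-m}T + t} \Big[ |\langle Q_T f, \psi_{m,n}\rangle| + |\langle (1 - Q_T) f, \psi_{m,n}\rangle| \Big]\, |\langle \tilde\psi_{m,n}, h\rangle|, \tag{3.5.5}$$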
where we have introduced $(Q_T f)(x) = f(x)$ for $|x| \le T$, $(Q_T f)(x) = 0$ otherwise.

The sum over $n$ splits into two parts, $n > b_0^{-1}(a_0^{-m}T + t)$, and $n < -b_0^{-1}(a_0^{-m}T + t)$. Let $n_1$ be the smallest integer larger than $b_0^{-1}(a_0^{-m}T + t)$. Then

(because $|a_0^{-m}x - nb_0| \ge nb_0 - a_0^{-m}x > (n - n_1)b_0 + t + a_0^{-m}(T - x)$).
The sum over $n < -b_0^{-1}(a_0^{-m}T + t)$ is dealt with in the same way. It follows that

which can be made smaller than $B\epsilon^2\|f\|^2/4$ by choosing

This concludes the proof. ∎
The estimates for $m_0$, $m_1$, $t$ that follow from this proof are very coarse; in practice, one can obtain much less coarse values if $\psi$ and $\hat\psi$ have faster decay than stated in the theorem (see, e.g., Daubechies (1990), p. 996). For later reference, let us estimate $\# B_\epsilon(\Omega_0, \Omega_1; T)$, as a function of $\Omega_0$, $\Omega_1$, $T$, and $\epsilon$. We find
On the other hand, the area of the time-frequency region $[-T, T] \times ([-\Omega_1, -\Omega_0] \cup [\Omega_0, \Omega_1])$ is $4T(\Omega_1 - \Omega_0)$. As $\Omega_0 \to 0$ and $T, \Omega_1 \to \infty$, we find

which is not independent of $\epsilon$. We will come back to this in Chapter 4.

Theorem 3.5.1 tells us that if $\psi$ has reasonable decay in time and in frequency, then frames generated by $\psi$ do indeed exhibit time-frequency localization features, at least with respect to time-frequency sets of the type $[-T, T] \times ([-\Omega_1, -\Omega_0] \cup [\Omega_0, \Omega_1])$. In practice, one is interested in localization on many other sets. A chirp signal, for instance, intuitively corresponds to a diagonal region (possibly curved) in the time-frequency plane, and it should be possible to reconstruct it from only those $\psi_{m,n}$ for which $(a_0^m n b_0, \pm a_0^{-m}\xi_0)$ is in or close to this region. This turns out to be the case in practice (for chirp signals, and many others). It is harder to formulate this in a precise theorem, mainly because one first has to agree on the meaning of "localization" on a prescribed time-frequency set, when this set is not a union of rectangles as in Theorem 3.5.1.
If we choose the interpretation in terms of the operators Ls defined i n $2.8 (i.e., f is mostly localized in S if II(1 - Ls) f 11 ;'+ :
t ; ' ( ~ t,) is exactly equal to that for n < - t i l ( T t L ) ;we may restrict ourselves to negative n only, at the price of a factor 2. By reaefhhg y = E wo C if f! is positive, we see that ue may restrict owselves to negative C as well. Hence
+
sup
[l
-
+ (z+ nto)'~-~/*
I=l 0, we have
I
It follows that
I
Let
7tl
be thc: smailcst integer larger than
00
1, it is clear that appropriate choices of $t_\epsilon$, $\omega_\epsilon$ (independent of $T$ or $\Omega$) make (3.5.15), (3.5.16) smaller than $B\epsilon^2\|f\|^2/4$. This concludes the proof. ∎
m. For some applications (in particular, all applications that involve "recognizing" $f$) this can be a real problem. In a first approximation, the solution proposed by S. Mallat is the following:
$n_0$, then

which tends to 0 for $n_0 \to \infty$. Hence the $\eta_n$ constitute a Cauchy sequence, with limit $\eta$ in $L^2(\mathbb{R})$. For this $\eta$, and any $f \in L^2(\mathbb{R})$,
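This is proved as follows:

$$\mathrm{Re}\,\langle Rf, g\rangle = \tfrac14 \left[ \langle R(f+g), f+g\rangle - \langle R(f-g), f-g\rangle \right] \quad (\text{because } R^* = R);$$

hence

$$|\mathrm{Re}\,\langle Rf, g\rangle| \le \tfrac14\, \frac{B-A}{B+A} \left[ \|f+g\|^2 + \|f-g\|^2 \right] = \tfrac12\, \frac{B-A}{B+A} \left[ \|f\|^2 + \|g\|^2 \right];$$

$$|\langle Rf, g\rangle| = \langle Rf, g\rangle\, \overline{\langle Rf, g\rangle}\, /\, |\langle Rf, g\rangle| = \Big\langle Rf,\ \langle Rf, g\rangle\, g / |\langle Rf, g\rangle| \Big\rangle$$

$$\le \tfrac12\, \frac{B-A}{B+A} \left[ \|f\|^2 + \big\| \langle Rf, g\rangle\, g / |\langle Rf, g\rangle| \big\|^2 \right] = \tfrac12\, \frac{B-A}{B+A} \left[ \|f\|^2 + \|g\|^2 \right];$$

$$\|R\| = \sup_{\|f\|=1,\ \|g\|=1} |\langle Rf, g\rangle| \le \frac{B-A}{B+A}. \qquad ∎$$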
6. Intuitively, this operator can be understood as a "superposition" of the rank one trace-class operators $\langle\,\cdot\,, h^{a,b}\rangle h^{a,b}$, with weights $c(a,b)$. If $c$ is integrable with respect to $a^{-2}\,da\,db$, then the individual traces of $\langle\,\cdot\,, h^{a,b}\rangle h^{a,b}$ (which are all equal to 1), weighted by the $c(a,b)$, are "summable," so that the whole superposition has finite trace,

This handwaving argument can be made rigorous by approximation arguments.
7. We use here the "essential infimum" (notation: ess inf) defined by

$$\operatorname*{ess\,inf}_x f(x) = \inf\{\lambda;\ |\{x;\ f(x) \le \lambda\}| > 0\},$$

where $|A|$ stands for the Lebesgue measure of $A \subset \mathbb{R}$. The difference between $\operatorname{ess\,inf}_x f(x)$ and $\inf_x f(x)$ lies in the positive measure requirement: if $f(0) = 0$, $f(x) = 1$ for all $x \ne 0$, then $\inf_x f(x) = 0$, but $\operatorname{ess\,inf}_x f(x) = 1$, because $f \ge 1$ except on a set of measure zero, which "does not count." In fact we could be pedantic, and replace inf or sup by ess inf or ess sup in most of our conditions without invalidating them, but it is usually not worth it: in practice the expressions we are dealing with are continuous functions, for which inf and ess inf coincide. In (3.3.11) the situation is different: even for very smooth $\hat\psi$, the sum $\sum_m |\hat\psi(a_0^m\xi)|^2$ is discontinuous at $\xi = 0$, because $\hat\psi(0) = 0$. For the Haar function, for instance, $|\hat\psi(\xi)| = 4(2\pi)^{-1/2}|\xi|^{-1}\sin^2(\xi/4)$, and $\sum_{m\in\mathbb{Z}} |\hat\psi(2^m\xi)|^2 = (2\pi)^{-1}$ if $\xi \ne 0$, 0 if $\xi = 0$. We therefore need to take the essential infimum; the infimum is zero.
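The value $(2\pi)^{-1}$ quoted for the Haar function can be checked numerically. The sketch below truncates the sum over $m$ to $|m| \le M$ and evaluates it at a few nonzero frequencies; the function names, the cutoff `M = 40`, and the sample frequencies are illustrative choices, not part of the text. The identity $\sin^4 y = \sin^2 y - \tfrac14 \sin^2 2y$ makes the sum telescope, which is why the truncation error is negligible.

```python
import numpy as np

def haar_hat_abs(xi):
    """|psi_hat(xi)| for the Haar wavelet, with the (2 pi)^(-1/2) normalization."""
    return 4.0 * (2.0 * np.pi) ** -0.5 * np.sin(xi / 4.0) ** 2 / np.abs(xi)

def dyadic_sum(xi, M=40):
    """Truncation of sum_{m in Z} |psi_hat(2^m xi)|^2 to |m| <= M (xi != 0)."""
    m = np.arange(-M, M + 1)
    return np.sum(haar_hat_abs(2.0 ** m * xi) ** 2)

sample_frequencies = (0.3, 1.0, -2.7, 17.0)
values = [dyadic_sum(xi) for xi in sample_frequencies]
target = 1.0 / (2.0 * np.pi)
```

Every entry of `values` agrees with `target` to rounding error, whereas at $\xi = 0$ each term of the sum vanishes; this is the discontinuity that forces the essential infimum.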
8. This condition implies both the boundedness of $\sum_{m\in\mathbb{Z}} |\hat\psi(a_0^m\xi)|^2$ and the decay of $\beta(s)$:
and
)
5 c2 sup
l 0, -y > a. For the second term that suPz,y,g (1 + y2)[1 + (z- y)2]-1[1 + (z3- v ) ~ ) - ' < 00 to bound the sum by C"(1 + Cz==a(l + loo*ft2)-c(~-Q)/2, whse b < E < 1 is wbitrary. Since 1 < 5 a,this can be bounded by if 7 > a. We have therefore, for 0 < p < y - a, Cm@-I-
lee
Em
If 6 is continuous and has decay at oo, then ili)(qf)l2 is mntinuous in c. errepr at 5 = 0. There exists therefore a a,that ~ $ ( @ c ) I5 ~ fr if K - QJ5 a. Define, for a' < a,a fundion f by f([) = (2a')-1/2 if 1.f - 5 a', i(c)= 0 atherwise. Then
k-bt-'
(me Cauchy-Wmm on the integral)
CHAPTER 3
3
SC
4-
2 ~ Sup ' f
19t&a$'{)l2. m~it
If $|\hat\psi(\xi)| \le C(1 + |\xi|^2)^{-\gamma/2}$ with $\gamma > 1$, then this infinite sum is uniformly bounded in ...

... $F(b)$ is a Riesz basis for $L^2(\mathbb{R})$, for any $b \in [1 - \epsilon, 1 + \epsilon]$. This example shows conclusively that it is not always safe to apply "time-frequency space density intuition" to families of wavelets.
4.2. Orthonormal bases.

4.2.1. Orthonormal wavelet bases. The conclusion of the last paragraph seems a rather negative point for wavelets: no clean time-frequency density concept. In this section we emphasize a much more positive aspect: the existence of orthonormal wavelet bases with good time-frequency localization. Historically, the first orthonormal wavelet basis is the Haar basis, constructed long before the term "wavelet" was coined. The basic wavelet is then, as we already saw in Chapter 1,
$$\psi(x) = \begin{cases} 1, & 0 \le x < \tfrac12, \\ -1, & \tfrac12 \le x < 1, \\ 0 & \text{otherwise}. \end{cases}$$
We showed in §1.6 that the $\psi_{m,n}(x) = 2^{-m/2}\,\psi(2^{-m}x - n)$ constitute an orthonormal basis for $L^2(\mathbb{R})$. The Haar function is not continuous, and its Fourier transform decays only like $|\xi|^{-1}$, corresponding to bad frequency localization. It may therefore seem that this basis is no better than the windowed Fourier basis ...
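The orthonormality of the Haar family is easy to confirm on a computer, because every $\psi_{m,n}$ with breakpoints on a dyadic grid is constant on the grid cells, so a Riemann sum gives the inner products exactly. The following sketch does this for the 31 Haar functions with $-2 \le m \le 2$ supported in $[0, 4)$; the grid spacing, the interval, and the range of $m$ are arbitrary illustrative choices.

```python
import numpy as np

def haar(x):
    """Haar wavelet: 1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    return (np.where((x >= 0.0) & (x < 0.5), 1.0, 0.0)
            - np.where((x >= 0.5) & (x < 1.0), 1.0, 0.0))

def psi(m, n, x):
    """psi_{m,n}(x) = 2^{-m/2} psi(2^{-m} x - n)."""
    return 2.0 ** (-m / 2) * haar(2.0 ** (-m) * x - n)

dx = 2.0 ** -6                          # fine enough for every m >= -2 used here
x = np.arange(0.0, 4.0, dx)             # left endpoints of the grid cells
indices = [(m, n) for m in (-2, -1, 0, 1, 2) for n in range(2 ** (2 - m))]
samples = np.array([psi(m, n, x) for m, n in indices])

gram = samples @ samples.T * dx         # matrix of inner products <psi_i, psi_j>
```

The resulting `gram` is the identity matrix: functions at the same scale have disjoint supports, and a function at a finer scale sits inside a region where any coarser one is constant, so its zero mean kills the inner product.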
a) may be very large, reflecting a large $C_N$ in (4.2.8)). The exponentially decaying wavelets of Stromberg or Battle-Lemarié have much faster numerical decay, at the price of sacrificing regularity.

In the matter of orthonormal bases then, wavelets seem to do quite a bit better than windowed Fourier functions: there are constructions in which both $\psi$ and $\hat\psi$ have fast decay, in stark contrast with Theorem 4.1.1, which forbids simultaneous good decay for $g$ and $\hat g$ if $g$ is a window function leading to an orthonormal basis. If I had written this chapter three years ago, this is probably where I would have stopped. But matters are not quite that simple: in the last few years, the windowed Fourier transform has led to a few surprises, which we will discuss briefly in the remainder of this chapter.
4.2.2. The windowed Fourier transform revisited: "Good" orthonormal bases after all! One way in which one could try to generalize the windowed Fourier construction, so as to get round Theorem 4.1.1, is to consider families $g_{m,n}(x)$ that are not generated by a strict time-frequency lattice. This allows for a little leeway: Bourgain (1988) has constructed an orthonormal basis $(g_j)_{j\in J}$ for $L^2(\mathbb{R})$ such that

uniformly in $j \in J$, where $x_j = \int dx\, x\,|g_j(x)|^2$, $\xi_j = \int d\xi\, \xi\,|\hat g_j(\xi)|^2$. (Note that wavelet bases do not satisfy such a uniform bound.) Giving up the lattice structure therefore permits better localization than allowed by the Balian-Low theorem. However, Steger (private communication, 1986) proved that even slightly better localization than (4.2.9) is impossible: $L^2(\mathbb{R})$ does not admit an orthonormal basis $(g_j)_{j\in J}$ satisfying

uniformly in $j$, if $\epsilon > 0$. This approach can therefore not lead to good time-frequency localization.

There is another way in which we can try to break away from the lattice scheme (4.1.1). Note that in (4.2.9), (4.2.10), "time-frequency localization" stands for strong decay properties of the $g_{m,n}$, $(g_{m,n})^\wedge$ away from their average values. This corresponds to a picture in which both $g_{m,n}$ and $(g_{m,n})^\wedge$ have essentially one peak. Wilson (1987) proposes instead to construct orthonormal bases $g_{m,n}$ of the type
where

has two peaks, situated near

and

with $\phi_m^+$, $\phi_m^-$ centered around 0. This ansatz changes the picture completely. Wilson (1987) proposes numerical evidence for the existence of such an orthonormal basis, with uniform exponential decay for $f_m$ and $\phi_m^+$, $\phi_m^-$. In his numerical construction he further "optimizes" the localization by requiring
Sullivan et al. (1987) present arguments explaining both the existence of Wilson's basis and its exponential decay. In both papers there are infinitely many functions $\phi_m^\pm$; as $m$ tends to $\infty$, the $\phi_m^\pm$ tend to a limit function $\phi_\infty^\pm$.

The moral of Wilson's construction is that orthonormal bases with good phase space localization seem possible after all if bimodal functions as in (4.2.13) are used. Note that many of our wavelet constructions, frames as well as the orthonormal bases we saw earlier, have these two peaks in frequency (one for $\xi > 0$, one for $\xi < 0$). In the case of frames, or for the continuous wavelet transform, the two frequency regions can be separated (corresponding to one-frequency-peak functions; see §3.3.5.A or (2.4.9)), but this does not seem to be the case for orthonormal bases. We will see later that the two frequency peaks of $\hat\psi$ need not be symmetric: there even exist examples with ...
where we assume that the $\epsilon_j$ satisfy $a_j + \epsilon_j \le a_{j+1} - \epsilon_{j+1}$ for all $j$. Moreover, we require that $w_j$ and $w_{j-1}$ complement each other near $a_j$: $w_j(x) = w_{j-1}(2a_j - x)$ and $w_j^2(x) + w_{j-1}^2(x) = 1$ if $|x - a_j| \le \epsilon_j$. (All this can be achieved with smooth $w_j$; one can take, for instance, $w_j(x) = \sin\left[\tfrac{\pi}{2}\,\nu\!\left(\tfrac{x - a_j + \epsilon_j}{2\epsilon_j}\right)\right]$ for $|x - a_j| \le \epsilon_j$, and $w_j(x) = \cos\left[\tfrac{\pi}{2}\,\nu\!\left(\tfrac{x - a_{j+1} + \epsilon_{j+1}}{2\epsilon_{j+1}}\right)\right]$ for $|x - a_{j+1}| \le \epsilon_{j+1}$, with $\nu$ satisfying (4.2.4) and (4.2.5).) Coifman and Meyer (1990) prove that the family $\{u_{j,k};\ j, k \in \mathbb{Z}\}$,

with
2n.
2. For orthonormal bases the proof is much simpler. In this case we need not bother with the Zak transform, which was only introduced to prove that if $Qg, Pg \in L^2$, then $Q\tilde g, P\tilde g \in L^2$ as well. For orthonormal bases we can start directly with point 5, establishing $\langle Qg, Pg\rangle = \langle Pg, Qg\rangle$, which is impossible by point 6. This is the original elegant proof in Battle (1988).

3. If the $\psi_{m,n}(x) = a_0^{-m/2}\,\psi(a_0^{-m}x - nb_0)$ constitute a (tight) frame, then so do the $\psi^\#_{m,n}(x) = a_0^{-m/2}\,\psi^\#(a_0^{-m}x - nb_0')$, with $\psi^\#(x) = (b_0/b_0')^{1/2}\,\psi(b_0 x/b_0')$.
4. To illustrate this, the following example shows that the complex exponentials $\exp(2\pi i n x)$ do not constitute an unconditional basis for $L^p([0,1])$ if $p \ne 2$. One can show (see Zygmund (1959)) that

In both cases, $x = 0$ is the worst singularity, and the integrability of powers of these functions on $[0,1]$ is determined by their behavior around
0. The first function is in LP for p < $, the second is not, even though the absolute values of their Fourier coeffic~entsare the same. This means that the functions exp (2rznx) do not constitute an unconditional basis for L ~ / ~ ( [I)), o, The Haar h i s adapted to the interval [0,11 cbnsists of { 4 ) ~ { ~ , 6m, ~ ,n~E; 2, m 5 0, 0 n 5 2Iml - 11, with r$(x) 3 1 on [O, 11. ThlS basis is orthonormal in L2((0,I]), and is an unconditional basis for LP([O, I]) if l 0, B < oo are independent of the c, (see Prelimideries). *
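and

so that (5.3.1) is equivalent to

$$0 < (2\pi)^{-1} A \le \sum_{\ell\in\mathbb{Z}} |\hat\phi(\xi + 2\pi\ell)|^2 \le (2\pi)^{-1} B < \infty \quad \text{a.e.}$$

We can therefore define $\phi^\# \in L^2(\mathbb{R})$ by

$$\hat\phi^\#(\xi) = (2\pi)^{-1/2} \left[ \sum_{\ell\in\mathbb{Z}} |\hat\phi(\xi + 2\pi\ell)|^2 \right]^{-1/2} \hat\phi(\xi).$$

Clearly, $\sum_\ell |\hat\phi^\#(\xi + 2\pi\ell)|^2 = (2\pi)^{-1}$ a.e., which means that the $\phi^\#(\cdot - k)$ are orthonormal. On the other hand, the space $V_0^\#$ spanned by the $\phi^\#(\cdot - k)$ is given by

$$V_0^\# = \{f;\ \hat f = \nu\,\hat\phi^\# \text{ with } \nu\ 2\pi\text{-periodic},\ \nu \in L^2([0, 2\pi])\}$$
$$= \{f;\ \hat f = \nu_1\,\hat\phi \text{ with } \nu_1\ 2\pi\text{-periodic},\ \nu_1 \in L^2([0, 2\pi])\} \quad (\text{use (5.3.2) and (5.3.3)})$$
$$= V_0 \quad (\text{since the } \phi(\cdot - n) \text{ are a Riesz basis for } V_0).$$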
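The orthonormalization step can be tried out on a concrete non-orthogonal generator. The sketch below uses the hat function $\phi(x) = 1 - |x|$ on $[-1, 1]$ (an illustrative choice, not taken from this passage), whose Fourier transform is $(2\pi)^{-1/2}[\sin(\xi/2)/(\xi/2)]^2$. It checks that the periodized sum $\sum_\ell |\hat\phi(\xi + 2\pi\ell)|^2$ equals the closed form $(2 + \cos\xi)/(6\pi)$, and that dividing $\hat\phi$ by the square root of $2\pi$ times this sum yields a function whose periodized sum is $(2\pi)^{-1}$. The truncation length `L` is an assumption of the sketch.

```python
import numpy as np

def phi_hat(xi):
    """Fourier transform of the hat function 1 - |x| on [-1, 1]."""
    # np.sinc(t) = sin(pi t) / (pi t), so this is (sin(xi/2) / (xi/2))^2
    return (2.0 * np.pi) ** -0.5 * np.sinc(xi / (2.0 * np.pi)) ** 2

def periodized_sum(f_hat, xi, L=2000):
    """Truncation of sum_l |f_hat(xi + 2 pi l)|^2 to |l| <= L."""
    shifts = 2.0 * np.pi * np.arange(-L, L + 1)
    return np.sum(np.abs(f_hat(xi[:, None] + shifts[None, :])) ** 2, axis=1)

def phi_sharp_hat(xi):
    """Orthonormalized generator, using the closed form of the periodized sum."""
    return (2.0 * np.pi) ** -0.5 * phi_hat(xi) / np.sqrt((2.0 + np.cos(xi)) / (6.0 * np.pi))

xi = np.linspace(-np.pi, np.pi, 201)
S = periodized_sum(phi_hat, xi)
closed_form = (2.0 + np.cos(xi)) / (6.0 * np.pi)
S_sharp = periodized_sum(phi_sharp_hat, xi)
```

Here `S` stays between $1/(6\pi)$ and $1/(2\pi)$, so the integer translates of the hat function form a Riesz basis but not an orthonormal one, while `S_sharp` is constant at $1/(2\pi)$.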
5.3.2. Using the scaling function as a starting point. As described in §5.1, a multiresolution analysis consists of a ladder of spaces $(V_j)_{j\in\mathbb{Z}}$ and a special function $\phi \in V_0$ such that (5.1.1)-(5.1.6) are satisfied (with (5.1.6) possibly relaxed as in §5.3.1). One can also try to start the construction from an appropriate choice for the scaling function $\phi$: after all, $V_0$ can be constructed from the $\phi(\cdot - k)$, and from there, all the other $V_j$ can be generated. This strategy is followed in many examples. More precisely, we choose $\phi$ such that

$$\phi(x) = \sum_n c_n\, \phi(2x - n), \tag{5.3.4}$$

where $\sum_n |c_n|^2 < \infty$, and

$$0 < \alpha \le \sum_{\ell\in\mathbb{Z}} |\hat\phi(\xi + 2\pi\ell)|^2 \le \beta < \infty; \tag{5.3.5}$$

we then define $V_j$ to be the closed subspace spanned by the $\phi_{j,k}$, $k \in \mathbb{Z}$, with $\phi_{j,k}(x) = 2^{-j/2}\,\phi(2^{-j}x - k)$. The conditions (5.3.4) and (5.3.5) are necessary and sufficient to ensure that $\{\phi_{j,k};\ k \in \mathbb{Z}\}$ is a Riesz basis in each $V_j$, and that the $V_j$ satisfy the "ladder property" (5.1.1). It follows that the $V_j$ satisfy (5.1.1), (5.1.4), (5.1.5), and (5.1.6); in order to make sure that we have a multiresolution analysis we need to check whether (5.1.2) and (5.1.3) hold. This is the purpose of the following two propositions.

PROPOSITION 5.3.1. Suppose $\phi \in L^2(\mathbb{R})$ satisfies (5.3.5), and define $V_j = \overline{\mathrm{Span}\,\{\phi_{j,k};\ k \in \mathbb{Z}\}}$. Then $\bigcap_{j\in\mathbb{Z}} V_j = \{0\}$.
Proof.

1. By (5.3.5), the $\phi_{0,k}$ constitute a Riesz basis for $V_0$. In particular, they constitute a frame for $V_0$, i.e., there exist $A > 0$, $B < \infty$ so that, for all $f \in V_0$,

$$A\,\|f\|^2 \le \sum_{k\in\mathbb{Z}} |\langle f, \phi_{0,k}\rangle|^2 \le B\,\|f\|^2 \tag{5.3.6}$$

(see Preliminaries). Since $V_j$ and the $\phi_{j,k}$ are the images of $V_0$ and the $\phi_{0,k}$ under the unitary map $(D_j f)(x) = 2^{-j/2} f(2^{-j}x)$, it follows that, for all $f \in V_j$,

$$A\,\|f\|^2 \le \sum_{k\in\mathbb{Z}} |\langle f, \phi_{j,k}\rangle|^2 \le B\,\|f\|^2, \tag{5.3.7}$$

with the same $A$, $B$ as in (5.3.6).

2. Now take $f \in \bigcap_{j\in\mathbb{Z}} V_j$. Pick $\epsilon > 0$ arbitrarily small. There exists a compactly supported and continuous $\tilde f$ so that $\|f - \tilde f\|_{L^2} \le \epsilon$. If we denote by $P_j$ the orthogonal projection on $V_j$, then
i-
-+
llf
-PJ~II
= ll a > 0, this implies that there exists 6, p m i b l y smaller than 7 , sa that ReG( J , j > 0. Then
I
every step we compute not only the wavelet coefficients $\langle f, \psi_{j,k}\rangle$ of the corresponding $j$-level, but also the $\langle f, \phi_{j,k}\rangle$ for the same $j$-level, which are useful for the computation of the next level wavelet coefficients.

The whole process can also be viewed as the computation of successively coarser approximations of $f$, together with the difference in "information" between every two successive levels. In this view we start out with a fine-scale approximation to $f$, $f^0 = P_0 f$ (recall that $P_j$ is the orthogonal projection onto $V_j$; we will denote the orthogonal projection onto $W_j$ by $Q_j$), and we decompose $f^0 \in V_0 = V_1 \oplus W_1$ into $f^0 = f^1 + \delta^1$, where $f^1 = P_1 f^0 = P_1 f$ is the next coarser approximation of $f$ in the multiresolution analysis, and $\delta^1 = f^0 - f^1 = Q_1 f^0 = Q_1 f$ is what is "lost" in the transition $f^0 \to f^1$. In each of these $V_j$, $W_j$ spaces we have the orthonormal bases $(\phi_{j,k})_{k\in\mathbb{Z}}$, $(\psi_{j,k})_{k\in\mathbb{Z}}$, respectively, so that
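$$f^0 = \sum_n c_n^0\,\phi_{0,n}, \qquad f^1 = \sum_n c_n^1\,\phi_{1,n}, \qquad \delta^1 = \sum_n d_n^1\,\psi_{1,n}.$$

Formulas (5.6.2), (5.6.4) give the effect on the coefficients of the orthogonal basis transformation $(\phi_{0,n})_{n\in\mathbb{Z}} \to (\phi_{1,n}, \psi_{1,n})_{n\in\mathbb{Z}}$:

$$c_k^1 = \sum_n \overline{h_{n-2k}}\; c_n^0, \qquad d_k^1 = \sum_n \overline{g_{n-2k}}\; c_n^0. \tag{5.6.5}$$

With the notation $a = (a_n)_{n\in\mathbb{Z}}$, $\bar a = (\overline{a_{-n}})_{n\in\mathbb{Z}}$ and $(Ab)_k = \sum_n a_{2k-n}\, b_n$, we rewrite this as

$$c^1 = \bar H c^0, \qquad d^1 = \bar G c^0.$$

The coarser approximation $f^1 \in V_1 = V_2 \oplus W_2$ can again be decomposed into $f^1 = f^2 + \delta^2$, $f^2 \in V_2$, $\delta^2 \in W_2$, with

$$f^2 = \sum_n c_n^2\,\phi_{2,n}, \qquad \delta^2 = \sum_n d_n^2\,\psi_{2,n}.$$

We again have

$$c^2 = \bar H c^1, \qquad d^2 = \bar G c^1.$$

Schematically, all this can be represented as in Figure 5.8.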
FIG. 5.8. Schematic representation of (5.6.5).
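One decomposition step of this kind, and its inverse, can be sketched in a few lines. The code below works on finite sequences with periodic index wrap-around (an assumption made here to avoid boundary handling), uses real filters so that the complex conjugates drop out, and builds the high-pass filter as $g_i = (-1)^i h_{L-1-i}$, which for a filter of even length $L$ is an even-index shift of the usual alternating-flip rule. Because the step is an orthogonal change of basis, the inverse is simply the transpose, which is what `synthesize` applies. The function names and the test signal are hypothetical.

```python
import numpy as np

SQRT3 = np.sqrt(3.0)
# 4-tap Daubechies low-pass filter, normalized so that sum(h) = sqrt(2)
h_d4 = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4 * np.sqrt(2.0))

def high_pass(h):
    """Alternating flip g_i = (-1)^i h_{L-1-i} (assumed index convention)."""
    L = len(h)
    return np.array([(-1) ** i * h[L - 1 - i] for i in range(L)])

def analyze(c, h, g):
    """c1_k = sum_n h_{n-2k} c_n and d1_k = sum_n g_{n-2k} c_n, indices mod len(c)."""
    N = len(c)
    c1, d1 = np.zeros(N // 2), np.zeros(N // 2)
    for k in range(N // 2):
        for i in range(len(h)):
            c1[k] += h[i] * c[(2 * k + i) % N]
            d1[k] += g[i] * c[(2 * k + i) % N]
    return c1, d1

def synthesize(c1, d1, h, g):
    """Adjoint step: c_n = sum_k (h_{n-2k} c1_k + g_{n-2k} d1_k), indices mod N."""
    N = 2 * len(c1)
    c = np.zeros(N)
    for k in range(len(c1)):
        for i in range(len(h)):
            c[(2 * k + i) % N] += h[i] * c1[k] + g[i] * d1[k]
    return c

g_d4 = high_pass(h_d4)
rng = np.random.default_rng(0)
c0 = rng.standard_normal(16)            # stand-in for the fine-scale coefficients
c1, d1 = analyze(c0, h_d4, g_d4)
reconstructed = synthesize(c1, d1, h_d4, g_d4)
```

Applying `analyze` again to `c1` gives the next coarser level; the energy of `c0` is split exactly between `c1` and `d1`, and `reconstructed` reproduces `c0`.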
In practice, we will stop after a finite number of levels, which means we have rewritten the information in $(\langle f, \phi_{0,n}\rangle)_{n\in\mathbb{Z}} = c^0$ as $d^1, d^2, d^3, \ldots, d^J$ and a final coarse approximation $c^J$, i.e., $(\langle f, \psi_{j,k}\rangle)_{k\in\mathbb{Z},\ j=1,\ldots,J}$ and $(\langle f, \phi_{J,k}\rangle)_{k\in\mathbb{Z}}$. Since all we have done is a succession of orthogonal basis transformations, the inverse operation is given by the adjoint matrices. Explicitly,

$$f^{j-1} = f^j + \delta^j = \sum_k c_k^j\,\phi_{j,k} + \sum_k d_k^j\,\psi_{j,k};$$

hence

$$c_n^{j-1} = \sum_k \left[ h_{n-2k}\, c_k^j + g_{n-2k}\, d_k^j \right] \tag{5.6.6}$$

(use (5.6.1), (5.6.3)).

In electrical engineering terms (5.6.5) and (5.6.6) are the analysis and synthesis steps of a subband filtering scheme with exact reconstruction. In a two-channel subband filtering scheme, an incoming sequence is convolved with two different filters, one low-pass and one high-pass. The two resulting sequences are then subsampled, i.e., only the even (or only the odd) entries are retained. This is exactly what happens in (5.6.5). For readers unfamiliar with this "filtering" terminology, let me explain briefly what it means. Any square summable sequence $(c_n)_{n\in\mathbb{Z}}$ can be interpreted as the sequence of sampled values $\gamma(n)$ of a bandlimited function $\gamma$ with support $\hat\gamma \subset [-\pi, \pi]$ (see Chapter 2),

$$\gamma(x) = \sum_n c_n\, \frac{\sin \pi(x - n)}{\pi(x - n)}.$$

A filtering operation corresponds to the multiplication of $\hat\gamma$ with a $2\pi$-periodic function, e.g.,
$$\alpha(\xi) = \sum_n a_n\, e^{-in\xi}.$$

The result is another bandlimited function, $\alpha * \gamma$,

$$(\alpha * \gamma)^\wedge(\xi) = \alpha(\xi)\,\hat\gamma(\xi),$$

or

$$(\alpha * \gamma)(x) = \sum_n \Big( \sum_m a_{n-m}\, c_m \Big) \frac{\sin \pi(x - n)}{\pi(x - n)}.$$

The filter is low-pass if $\alpha|_{[-\pi,\pi]}$ is mostly concentrated on $[-\pi/2, \pi/2]$, high-pass if $\alpha|_{[-\pi,\pi]}$ is mostly concentrated on $\{\xi;\ \pi/2 \le |\xi| \le \pi\}$; see Figure 5.9. The "ideal" low-pass and high-pass filters are $\alpha_L(\xi) = 1$ if $|\xi| < \pi/2$, 0 if $\pi/2 < |\xi|$ ...

... $> 0$ (a necessary condition to have some regularity for $\psi$). Not every such $m_0$ is associated to an orthonormal wavelet basis, however, an issue addressed in §§6.2 and 6.3. The main results of these two sections are summarized in Theorem 6.3.6, at the end of §6.3. Section 6.4 contains examples of compactly supported wavelets generating orthonormal bases. The orthonormal wavelet bases thus obtained cannot, in general, be written in a closed analytic form. Their graph can be computed with arbitrarily high precision, via an algorithm that I call the "cascade algorithm," which is in fact a "refinement scheme" as used in computer aided design. All this is discussed in §6.5.

A lot of this material goes back to Daubechies (1988b); for many of the results, better, simpler, or more general proofs have been found since, and I have given preference to these new ways of looking at things. These different approaches are borrowed mainly from Mallat (1989), Cohen (1990), Lawton (1990, 1991), Meyer (1990), and Cohen, Daubechies, and Feauveau (1992); for the link with refinement equations the references are Cavaretta, Dahmen, and Micchelli (1991) and Dyn and Levin (1990), as well as earlier papers by these authors (see §6.5).
6.1. Construction of $m_0$.

In this chapter we are mainly interested in constructing compactly supported wavelets $\psi$. The easiest way to ensure compact support for the wavelet $\psi$ is to choose the scaling function $\phi$ with compact support (in its orthogonalized version). It then follows from the definition of the $h_n$,
$$h_n = \sqrt2 \int dx\; \phi(x)\, \overline{\phi(2x - n)},$$

that only finitely many $h_n$ are nonzero, so that $\psi$ reduces to a finite linear combination of compactly supported functions (see (5.1.34)), and therefore automatically has compact support itself. Choosing both $\phi$ and $\psi$ with compact support also has the advantage that the corresponding subband filtering scheme (see §5.6) uses only FIR filters.

For compactly supported $\phi$ the $2\pi$-periodic function $m_0$,

$$m_0(\xi) = \frac{1}{\sqrt2} \sum_n h_n\, e^{-in\xi},$$

becomes a trigonometric polynomial. As shown in Chapter 5 (see (5.1.20)), orthonormality of the $\phi_{0,n}$ implies

$$|m_0(\xi)|^2 + |m_0(\xi + \pi)|^2 = 1, \tag{6.1.1}$$
where we have dropped the "almost everywhere" because $m_0$ is necessarily continuous, so that (6.1.1) has to hold for all $\xi$ if it holds a.e. We are also interested in making $\psi$ and $\phi$ reasonably regular. By Corollary 5.5.4, this means that $m_0$ should be of the form

with $N \ge 1$, and $\mathcal{L}$ a trigonometric polynomial. Note that even without regularity constraint, we need (6.1.2) with $N$ at least 1.¹ Putting (6.1.1), (6.1.2) together, it follows that we are looking for

and

where $L(\xi) = |\mathcal{L}(\xi)|^2$ is also a polynomial in $\cos\xi$. For our purpose it is convenient to rewrite $L(\xi)$ as a polynomial in $\sin^2 \xi/2$ ...

... $\ge 0$ for $y \in [0, 1]$.
This proposition completely characterizes $|m_0(\xi)|^2$. For our purposes we need however $m_0$ itself, not $|m_0|^2$. So how do we "extract the square root" from $L$? Here a lemma by Riesz (see Polya and Szegő (1971)) comes to our help.

LEMMA 6.1.3. Let $A$ be a positive trigonometric polynomial invariant under the substitution $\xi \to -\xi$; $A$ is necessarily of the form
where we have regrouped the two different kinds of zeros.

For $z = e^{-i\xi}$ on the unit circle, we have

Consequently,

where

is clearly a trigonometric polynomial of order $M$ with real coefficients. ∎
1. This proof is constructive. It uses factorization of a polynomial of degree $M$, however, which has to be done numerically and may lead to problems if $M$ is large and some zeros are close together. Note that in this proof we need to factor a polynomial of degree only $M$, unlike some other procedures, which factor directly $P_A$, a polynomial of degree $2M$.

2. This procedure of "extracting the square root" is also called spectral factorization in the engineering literature.

3. The polynomial $B$ is not unique! For $M$ odd, for instance, $P_A$ may have $\frac{M-1}{2}$ quadruplets of complex zeros, and 1 pair of real zeros. In each quadruplet we can choose to retain either $z_j$, $\bar z_j$ to make up $B$, or $z_j^{-1}$, $\bar z_j^{-1}$; in each duplet we can choose either $r_\ell$ or $r_\ell^{-1}$. This makes already for $2^{(M+1)/2}$ different choices for $B$. Moreover, we can always multiply $B$ with $e^{in\xi}$, $n$ arbitrary in $\mathbb{Z}$.
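To make the "square root extraction" concrete, here is a small numerical sketch. It does not follow the degree-$M$ route of the proof; instead it takes the more direct approach mentioned in remark 1 and factors the full degree-$2M$ polynomial with `numpy.roots`, keeping the zeros inside the unit circle. As a worked case it assumes $|m_0(\xi)|^2 = \cos^4(\xi/2)\,(2 - \cos\xi)$, i.e. $N = 2$ with $L(\xi) = 1 + 2\sin^2(\xi/2)$; this particular $L$ is supplied here for illustration. The function name `spectral_factor` and the coefficient convention $A(\xi) = a_0 + 2\sum_{m\ge1} a_m \cos m\xi$ are likewise choices of the sketch.

```python
import numpy as np

def spectral_factor(a):
    """Real coefficients b with |sum_k b_k e^{-ik xi}|^2 = a[0] + 2 sum_m a[m] cos(m xi).

    Assumes the trigonometric polynomial is strictly positive.
    """
    a = np.asarray(a, dtype=float)
    laurent = np.concatenate([a[:0:-1], a])        # a_M, ..., a_1, a_0, a_1, ..., a_M
    roots = np.roots(laurent)                      # zeros come in pairs (z, 1/z)
    inside = roots[np.abs(roots) < 1.0]            # keep one zero of every pair
    b = np.real(np.poly(inside))                   # monic polynomial with those zeros
    scale = np.sqrt(a[0] + 2.0 * a[1:].sum()) / abs(b.sum())   # match the value at xi = 0
    return scale * b

# L(xi) = 2 - cos(xi): a_0 = 2, a_1 = -1/2
b = spectral_factor([2.0, -0.5])

# m0 = ((1 + e^{-i xi}) / 2)^2 * (b_0 + b_1 e^{-i xi}); multiplying out gives h_n / sqrt(2)
h = np.sqrt(2.0) * np.convolve(np.array([0.25, 0.5, 0.25]), b)

def m0(h, xi):
    n = np.arange(len(h))
    return np.exp(-1j * np.outer(xi, n)) @ h / np.sqrt(2.0)

xi = np.linspace(-np.pi, np.pi, 401)
identity = np.abs(m0(h, xi)) ** 2 + np.abs(m0(h, xi + np.pi)) ** 2
```

The resulting `h` is the familiar 4-tap filter with entries $(1+\sqrt3, 3+\sqrt3, 3-\sqrt3, 1-\sqrt3)/(4\sqrt2)$, and `identity` equals 1 across the grid, as (6.1.1) requires. Keeping the zero outside the unit circle instead would give the same filter in reversed order, which illustrates the non-uniqueness noted in remark 3.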
Together, Proposition 6.1.2 and Lemma 6.1.3 tell us how to construct all the possible trigonometric polynomials $m_0$ satisfying (6.1.1) and (6.1.2). It is not yet clear, however, whether any such $m_0$ leads to an orthonormal wavelet basis. In fact, some do not. This will be discussed in the next two sections. Readers who would like to skip most of the technicalities can find the main results summed up in Theorem 6.3.6 at the end of §6.3.

6.2. Correspondence with orthonormal wavelet bases.
We start by deriving a formula for a candidate scaling function $\phi$. Once this is done, we will check when this candidate defines indeed a bona fide multiresolution analysis. If a trigonometric polynomial $m_0$ is associated with a multiresolution analysis as in §5.1, and if the corresponding scaling function $\phi$ is in $L^1(\mathbb{R})$, then we know that for all ...

... $n \in \mathbb{Z}\}$ constitute a multiresolution analysis (by §5.3.2); in $V_j$, the $(\phi_{j,n})_{n\in\mathbb{Z}}$ constitute an orthonormal basis. We define $\psi$ by

$$\psi(x) = \sqrt2 \sum_n (-1)^n\, h_{-n+1}\, \phi(2x - n); \tag{6.2.6}$$
this is automatically compactly supported because $\phi$ is and because only finitely many $h_n$ differ from zero. The $(\psi_{j,k})_{j,k\in\mathbb{Z}}$ constitute then an orthonormal basis of compactly supported wavelets for $L^2(\mathbb{R})$.

Before we go into the conditions on $m_0$ that ensure (6.2.5), it is interesting to remark that even if (6.2.5) is not satisfied, the function $\psi$ defined by (6.2.6) still generates a tight frame, as proved in Lawton (1990).

PROPOSITION 6.2.3. Let $m_0$ be a trigonometric polynomial satisfying (6.1.1) and $m_0(0) = 1$, and let $\phi$, $\psi$ be the compactly supported $L^2$-functions defined by (6.2.2), (6.2.6). Define, as usual, $\psi_{j,k}(x) = 2^{-j/2}\,\psi(2^{-j}x - k)$. Then, for all $f \in L^2(\mathbb{R})$,

$$\sum_{j,k\in\mathbb{Z}} |\langle f, \psi_{j,k}\rangle|^2 = \|f\|^2,$$

i.e., the $\{\psi_{j,k};\ j, k \in \mathbb{Z}\}$ constitute a tight frame for $L^2(\mathbb{R})$.
1. First remember that (6.1.1) can also be written as

$$\sum_n h_n\, \overline{h_{n+2k}} = \delta_{k,0}$$

(see (5.1.39)).

2. Take $f$ compactly supported and $C^\infty$. Then $\sum_k |\langle f, \psi_{j,k}\rangle|^2$ converges for all $j$:
Choose $K$ so that $2^{-j}\,\mathrm{support}(f) \cap [2^{-j}\,\mathrm{support}(f) + k]$ is empty if $k \ge K$. Then

(because, for every $\ell$, the sets $(2^{-j}\,\mathrm{support}(f) + \ell + nK)$, $n \in \mathbb{Z}$, do not overlap).

Similarly, $\sum_k |\langle f, \phi_{j,k}\rangle|^2$ converges for all $j$.
It is easy to check that the right-hand side of (6.2.9) is absolutely summable (use that only finitely many $h_n$ are nonzero), so that we may invert the order of the summations.

4. If $n$, $m$ are even, $n = 2r$, $m = 2s$, we have

(substitute $k = s + t - \ell$).

Similarly, for $n = 2r + 1$, $m = 2s + 1$ both odd,

(by (6.2.7)).

5. If $n = 2r$ is even and $m = 2s + 1$ is odd, then

(substitute $k = s + t - \ell$).

6. This establishes

for all $m$, $n$. Consequently,

By "telescoping," we have
7. The same estimates as in points 3 and 4 of the proof of Proposition 5.3.1 show that, for fixed continuous and compactly supported $f$, $\sum_k |\langle f, \phi_{J,k}\rangle|^2 \le \epsilon$ if $J$ is large enough, with $\epsilon$ arbitrarily small ($J$ depending on $f$ and $\epsilon$). Similarly, the estimate in point 3 of the proof of Proposition 5.3.2 leads to
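with $|R| \le \epsilon$ if $J$ is sufficiently large. Since $\hat\phi$ is continuous at $\xi = 0$, and $\hat\phi(0) = (2\pi)^{-1/2}$, the first term in the right-hand side of (6.2.11) converges to $\int d\xi\, |\hat f(\xi)|^2$ for $J \to \infty$ (by dominated convergence: $|\hat\phi(\xi)| \le (2\pi)^{-1/2}$ for all ...

$$\inf_{k > 0}\ \inf_{\xi \in K}\ |m_0(2^{-k}\xi)| > 0. \tag{6.3.2}$$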
FIG. 6.1. The set $K = [-\tfrac{11\pi}{8}, -\tfrac{5\pi}{4}] \cup [-\pi, -\tfrac{\pi}{2}] \cup [-\tfrac{\pi}{4}, \tfrac{5\pi}{8}] \cup [\tfrac{3\pi}{4}, \pi] \cup [\tfrac{3\pi}{2}, \tfrac{7\pi}{4}]$ is congruent to $[-\pi, \pi]$ modulo $2\pi$; it can be viewed as the result of cutting $[-\pi/2, -\pi/4]$ and $[5\pi/8, 3\pi/4]$ out of $[-\pi, \pi]$ and moving the first piece to the right by $2\pi$, the second to the left.
REMARK. The condition (6.3.2) may seem a bit technical, and hard to verify in practice. Remember however that $K$ is compact, and is therefore bounded: $K \subset [-R, R]$. By the continuity of $m_0$ and $m_0(0) = 1$, it follows that $|m_0(2^{-k}\xi)|$ stays bounded away from zero, uniformly for all $|\xi| \le R$, if $k$ is larger than some $k_0$. This means that (6.3.2) reduces to requiring that the $k_0$ functions $m_0(\xi/2), m_0(\xi/4), \ldots, m_0(2^{-k_0}\xi)$ have no zero on $K$, or equivalently, that $m_0$ has no zero in $K/2, K/4, \ldots, 2^{-k_0}K$. This is already much more accessible!
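For the simplest choice $K = [-\pi, \pi]$, every set $2^{-k}K$ with $k \ge 1$ lies inside $[-\pi/2, \pi/2]$, so the condition holds as soon as $m_0$ has no zero on that half band. The sketch below evaluates $\min |m_0|$ on a grid of $[-\pi/2, \pi/2]$ for two filters: the 4-tap Daubechies filter, and the "stretched Haar" filter $h_0 = h_3 = 1/\sqrt2$, for which $m_0(\xi) = (1 + e^{-3i\xi})/2$ satisfies (6.1.1) but vanishes at $\xi = \pi/3$. Both filters and the grid size are illustrative choices; note that failing the check for this particular $K$ does not by itself rule out other compact sets $K$.

```python
import numpy as np

SQRT3 = np.sqrt(3.0)
d4 = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4 * np.sqrt(2.0))
stretched_haar = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)

def m0(h, xi):
    """m0(xi) = 2^{-1/2} sum_n h_n e^{-in xi} for a filter indexed from n = 0."""
    n = np.arange(len(h))
    return np.exp(-1j * np.outer(xi, n)) @ h / np.sqrt(2.0)

def min_on_half_band(h, points=2001):
    """Smallest |m0| found on a grid of [-pi/2, pi/2]."""
    xi = np.linspace(-np.pi / 2, np.pi / 2, points)
    return np.min(np.abs(m0(h, xi)))

min_d4 = min_on_half_band(d4)
min_stretched = min_on_half_band(stretched_haar)
```

For the Daubechies filter the minimum is attained at the band edge and is about 0.707, comfortably away from zero; for the stretched Haar filter the grid minimum is close to zero, reflecting the zero at $\pi/3$.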
Proof of Theorem 6.3.1.

1. We start by proving (1) ⇒ (2). Assume that (6.3.1) holds, or equivalently, ... Then, for all $\xi \in [-\pi, \pi]$, there exists $\ell_\xi \in \mathbb{N}$ so that

... $> 0$ for $k \ge 1$ and $\xi \in K$. On the other hand, we also have, for any $\xi$, $|m_0(\xi) - m_0(0)| \le C'|\xi|$; hence $|m_0(\xi)| \ge 1 - C'|\xi|$. Since $K$ is bounded we can find $k_1$ so that $2^{-k}C'|\xi| < \tfrac12$ if $\xi \in K$ and $k \ge k_1$. Using $1 - x \ge e^{-2x}$ for $0 \le x \le \tfrac12$ we find therefore, for $\xi \in K$,
We can rephrase this as

This implies

We can therefore apply the dominated convergence theorem and conclude that $\mu_k \to \hat\phi$ in $L^2$.

6. The congruence of $K$ with $[-\pi, \pi]$ modulo $2\pi$ means that for any $2\pi$-periodic function $f$, $\int_{\xi\in K} d\xi\, f(\xi) = \int_{-\pi}^{\pi} d\xi\, f(\xi) = \int_0^{2\pi} d\xi\, f(\xi)$. In particular,
=III
. Since this implies J
Ipk( N; m5 can always be brought into this forrn by multiplication by esNl0
k>O (EK
k*
r
27r, containing
.
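There exists no nontrivial cycle $\{\xi_1, \ldots, \xi_n\}$ in $[0, 2\pi[$, invariant under $\xi \mapsto 2\xi$ modulo $2\pi$, such that $m_0(\xi_j + \pi) = 0$ for all $j = 1, \ldots, n$.

The eigenvalue 1 of the $[2(N_2 - N_1) - 1] \times [2(N_2 - N_1) - 1]$-dimensional matrix $A$ defined by

$$A_{\ell k} = \sum_{n=N_1}^{N_2} h_n\, \overline{h_{k - 2\ell + n}}, \qquad -(N_2 - N_1) + 1 \le \ell, k \le (N_2 - N_1) - 1$$

(where we assume $h_n = 0$ for $n < N_1$, $n > N_2$) is nondegenerate.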
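The eigenvalue criterion lends itself to a direct numerical test. The sketch below assumes the matrix has entries $A_{\ell k} = \sum_n h_n \overline{h_{k-2\ell+n}}$ with $-(N_2 - N_1) + 1 \le \ell, k \le (N_2 - N_1) - 1$; this is the form one obtains by inserting the two-scale relation $\phi(x) = \sqrt2 \sum_n h_n \phi(2x - n)$ into $\langle \phi, \phi(\cdot - \ell)\rangle$, and it is stated here as an assumption of the example. For real filters the entries are autocorrelation values of $h$ at lag $k - 2\ell$. The two test filters (4-tap Daubechies and the "stretched Haar" filter $h_0 = h_3 = 1/\sqrt2$) and the tolerance are illustrative.

```python
import numpy as np

SQRT3 = np.sqrt(3.0)
d4 = np.array([1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3]) / (4 * np.sqrt(2.0))
stretched_haar = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)

def lawton_matrix(h):
    """Matrix with entries r(k - 2l), where r is the autocorrelation of the real filter h."""
    h = np.asarray(h, dtype=float)
    L = len(h) - 1                              # plays the role of N2 - N1
    r = np.correlate(h, h, mode="full")         # r[lag + L] = sum_n h[n] h[n + lag]
    idx = range(-(L - 1), L)                    # l, k = -(L-1), ..., L-1
    A = np.zeros((2 * L - 1, 2 * L - 1))
    for i, l in enumerate(idx):
        for j, k in enumerate(idx):
            lag = k - 2 * l
            if abs(lag) <= L:
                A[i, j] = r[lag + L]
    return A

def multiplicity_of_one(h, tol=1e-8):
    """Number of eigenvalues of the matrix within tol of 1."""
    eigenvalues = np.linalg.eigvals(lawton_matrix(h))
    return int(np.sum(np.abs(eigenvalues - 1.0) < tol))

mult_d4 = multiplicity_of_one(d4)
mult_stretched = multiplicity_of_one(stretched_haar)
```

For the Daubechies filter the eigenvalue 1 is simple. For the stretched Haar filter it appears twice: besides the sequence $\delta_{\ell,0}$, the true autocorrelation of $\phi = \tfrac13 \chi_{[0,3]}$, with values $\tfrac13, \tfrac29, \tfrac19$ at lags $0, \pm1, \pm2$, is a second eigenvector, which signals that the translates of $\phi$ are not orthonormal.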
From the point of view of subband filtering, this theorem tells us that, provided the high-pass filter has a null at DC ($m_0(\pi) = 0$, hence $m_0(0) = 1$ with the appropriate phase choice), we "almost always" have a corresponding orthonormal wavelet basis. The correspondence only fails "accidentally," as is illustrated by the last two equivalent necessary and sufficient conditions. In practice, one likes to work with filter pairs in which the low-pass filter has no zeros in the band $|\xi| \le \pi/2$, which is sufficient to ensure that the $\psi_{j,k}$ are an orthonormal basis. But it is time to look at some examples!

6.4. Examples of compactly supported wavelets generating an orthonormal basis.
All the examples we give in this section are obtained by spectral factorization of (6.1.11), with different choices of $N$ and $R$. Except for the Haar basis, we have no closed-form formula for $\phi(x)$, $\psi(x)$; we will explain in the next section how the plots of $\phi$, $\psi$ are obtained. A first family of examples, constructed in Daubechies (1988b), corresponds to $R \equiv 0$ in (6.1.11). In the spectral factorization needed to extract C( O , k E 2 .
I
(6.5.4) (6.5.5)
f
We can use this as input for the reconstruction algorithm of the subband filtering associated with $m_0$ (see §5.6). More specifically, we start with a low-pass sequence $c^0_n = \delta_{n,0}$ and a high-pass sequence $d^0_n = 0$, and we "crank the machine" to obtain
We then use $d^{-1}_n = 0$ to obtain, after another cranking,
etc. At every stage, the $c^{-j}_n$ are equal to $\langle \phi, \phi_{-j,n} \rangle$. Together with (6.5.3), this means that we have an algorithm with exponentially fast convergence to compute the values of $\phi$ at dyadic rationals. We can interpolate these values and thus obtain a sequence of functions $\eta_j$ approximating $\phi$.$^9$ We can, for instance, define $\eta^0_j(x)$ to be the function, piecewise constant on the intervals $[2^{-j}(n - 1/2), 2^{-j}(n + 1/2)[$, $n \in \mathbb{Z}$, such that $\eta^0_j(2^{-j}k) = 2^{j/2} \langle \phi, \phi_{-j,k} \rangle$. Another possible choice is $\eta^1_j(x)$, piecewise linear on the $[2^{-j}n, 2^{-j}(n+1)]$, $n \in \mathbb{Z}$, so that $\eta^1_j(2^{-j}k) = 2^{j/2} \langle \phi, \phi_{-j,k} \rangle$. For both choices we have the following proposition.

PROPOSITION 6.5.2. If $\phi$ is Hölder continuous with exponent $\alpha$, then there exist $C > 0$ and $j_0 \in \mathbb{N}$ so that, for $j \ge j_0$,
$$\|\phi - \eta^\epsilon_j\|_{L^\infty} \le C\, 2^{-\alpha j} . \qquad (6.5.8)$$
Proof. Take any $x \in \mathbb{R}$. For any $j$, choose $n$ so that $2^{-j}n \le x < 2^{-j}(n+1)$. By the definition of $\eta^\epsilon_j$, $\eta^\epsilon_j(x)$ is necessarily a convex linear combination of $2^{j/2} \langle \phi, \phi_{-j,n} \rangle$ and $2^{j/2} \langle \phi, \phi_{-j,n+1} \rangle$, whether $\epsilon = 0$ or 1. On the other hand, if $j$ is larger than some $j_0$,
the same is true if we replace $n$ by $n + 1$. It follows that a similar estimate holds for any convex combination, or $|\phi(x) - \eta^\epsilon_j(x)| \le C\, 2^{-j\alpha}$. Here $C$ can be chosen independently of $x$, so that (6.5.8) follows. ∎

This then is our fast algorithm to compute approximate values of $\phi(x)$ with arbitrarily high precision:
1. Start with the sequence $\cdots 0 \cdots 0\,1\,0 \cdots 0 \cdots$, representing the $\eta_0(n)$, $n \in \mathbb{Z}$.
2. Compute the $\eta_j(2^{-j}n)$, $n \in \mathbb{Z}$, by "cranking the machine" as in (6.5.7). At every step of this cascade, twice as many values are computed: values at "even points" $2^{-j}(2k)$ are refined from the previous step,
$$\eta_j(2^{-j}\, 2k) = \sqrt{2} \sum_\ell h_{2(k-\ell)}\, \eta_{j-1}(2^{-j+1}\ell) , \qquad (6.5.9)$$
and values at the "odd points" $2^{-j}(2k+1)$ are computed for the first time,
$$\eta_j(2^{-j}(2k+1)) = \sqrt{2} \sum_\ell h_{2(k-\ell)+1}\, \eta_{j-1}(2^{-j+1}\ell) . \qquad (6.5.10)$$
Both (6.5.9) and (6.5.10) can be viewed as convolutions.
3. Interpolate the $\eta_j(2^{-j}n)$ (piecewise constant if $\epsilon = 0$, piecewise linear if $\epsilon = 1$) to obtain $\eta^\epsilon_j(x)$ for non-dyadic $x$.
The whole algorithm was called the cascade algorithm in Daubechies and Lagarias (1991), where $\epsilon = 1$ was chosen; in Daubechies (1988b), the choice $\epsilon = 0$ was made.$^{10}$ All the plots of $\phi$, $\psi$ in §6.4 and in later chapters are, in fact, plots of $\eta^1_j$, with $j = 7$ or 8; at the resolution of these figures, the difference between $\phi$ and these $\eta^1_j$ is imperceptible.

A particularly attractive feature of the cascade algorithm is that it allows one to "zoom in" on particular features of $\phi$. Suppose we already have computed all the $\eta^1_4(2^{-4}n)$, but we would like to look at a blowup, with much better resolution, of $\phi$ in the interval $[15/16, 17/16]$ centered around 1. We could do this by computing all the $\eta^1_J(2^{-J}n)$ for very large $J$, and then plotting $\eta^1_J(x)$ only on the small interval of interest, corresponding to $2^{J-4} \cdot 15 \le n \le 2^{J-4} \cdot 17$. But we do not need to: by the "local" nature of (6.5.9), (6.5.10) much fewer computations suffice. Suppose $h_n = 0$ for $n < 0$, $n > 3$. The computation of $\eta^1_J(2^{-J}n)$ only involves those $\eta^1_{J-1}(2^{-J+1}k)$ for which $(n-3)/2 \le k \le n/2$. Computation of these, in turn, involves only the $\eta^1_{J-2}(2^{-J+2}\ell)$ with $(k-3)/2 \le \ell \le k/2$, or $n/4 - 3/2 - 3/4 \le \ell \le n/4$. Working back to $j = J - 4$, we see that to compute $\eta^1_9$ on $[15/16, 17/16]$ we only need the $\eta^1_5(2^{-5}m)$ for $28 \le m \le 34$. We can therefore start the cascade from $\cdots 0 \cdots 0\,1\,0 \cdots 0 \cdots$, go five steps, select the seven values $\eta^1_5(2^{-5}m)$, $28 \le m \le 34$, use only these as the input for a new cascade, with four steps, and end up with a graph of $\eta^1_9$ on $[15/16, 17/16]$. For larger blowups on even smaller intervals, we simply repeat the process; the blowup graphs in Chapter 7 have all been computed in this way.

The arguments leading to the cascade algorithm have implicitly used the orthonormality of the $\psi_{j,k}$, or equivalently (see §6.2, 6.3), of the $\phi_{0,n}$: we have characterized $\phi$ as the unique function $f$ satisfying (6.5.4), (6.5.5).
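The cascade described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the figures: it assumes the filter is supported on $n = 0, \ldots, M$ with $\sum_n h_n = \sqrt{2}$, it merges the even and odd updates into one convolution with the upsampled sequence, and it returns only the values at the dyadic points (the interpolation step is left out). The names `cascade`, `HAAR` and `DAUBECHIES_4` are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT3 = math.sqrt(3.0)
HAAR = [1.0 / SQRT2, 1.0 / SQRT2]
# Standard 4-tap Daubechies filter, h_n = 0 for n < 0 and n > 3.
DAUBECHIES_4 = [c / (4.0 * SQRT2)
                for c in (1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3)]


def cascade(h, steps):
    """Return the list eta_j(2**-j * n), n = 0, 1, ..., for j = steps.

    Each step applies eta_j(2**-j n) = sqrt(2) * sum_k h[n - 2k] * eta_{j-1}(2**-(j-1) k),
    which covers both the "even point" and the "odd point" updates.
    """
    c = [SQRT2 * hn for hn in h]
    eta = [1.0]  # the sequence ...0 1 0..., i.e. eta_0(n) = delta_{n,0}
    for _ in range(steps):
        refined = [0.0] * (2 * len(eta) + len(c) - 2)
        for k, value in enumerate(eta):
            for n, cn in enumerate(c):
                refined[2 * k + n] += cn * value
        eta = refined
    return eta


if __name__ == "__main__":
    values = cascade(DAUBECHIES_4, 8)
    # A Riemann sum of the sampled values approximates the integral of phi.
    print(len(values), sum(values) * 2.0 ** -8)
```

For the Haar filter every step returns a sequence of ones, as expected for the indicator function of $[0, 1[$; for longer filters the sum of the values at level $j$ equals $2^j$, which reflects $\int \phi = 1$.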
The cascade algorithm can also be viewed differently, without emphasizing orthonormality at all, as a special case of a stationary subdivision or refinement scheme. Refinement schemes are used in computer graphics to design smooth curves or surfaces going through or passing near a discrete, often rather sparse, set of points. An excellent review is Cavaretta, Dahmen, and Micchelli (1991).$^{11}$ We will restrict ourselves, in this short discussion, to one-dimensional subdivision schemes.$^{12}$ Suppose that we want a curve $y = f(x)$ taking on the preassigned values $f(n) = f_n$. One possibility is simply to construct the piecewise linear graph through the points $(n, f_n)$; this graph has the peculiarity that, for all $n$,
Let $j_1 < j_2$. Then $j_2 = n j_1 + r$ with $0 \le r < j_1$, and
$$q_{j_2} \le (q_{j_1})^n\, q_1^r .$$
Consequently,
$$K_{j_2} \le \frac{n \log q_{j_1} + r \log q_1}{j_2 \log 2} \le K_{j_1} + C\, \frac{j_1}{j_2} .$$
2. For any $\epsilon > 0$, there exists $j_0$ so that $K = \inf_j K_j > K_{j_0} - \epsilon$. For $j \ge j_0$ we then have $K_j \le K_{j_0} + C j_0/j \le K + \epsilon + C j_0/j$. Since $\epsilon$ was arbitrary, it follows that
$$K = \lim_{j \to \infty} K_j .$$
3. If $K < N - 1 - \alpha$, then $K_\ell < N - 1 - \alpha$ for some $\ell \in \mathbb{N}$. We can then repeat the argument in the proof of Lemma 7.1.1, applying it to
n:;:
with &(() = 4 2 - I < ) , add with 2' plying the role of 2 in Lemma 7.1.1. This W to J&c)~5 ~ ( ItI)-N+"t;l l +5 C(1+ kl)-a-l-c, hence 4 E CP. a ;
The following lemma shows that in most cases, we will not be able to obtain much better by the brute force method.

LEMMA 7.1.3. There exists a sequence
so
UIot
CHAPTER 7
1. By Theorem 6.3.1, the orthonormality of the $\phi(\cdot - n)$ implies the existence of a compact set $K$ congruent to $[-\pi, \pi]$ modulo $2\pi$, such that $|\hat\phi(\xi)| \ge C > 0$ for $\xi \in K$. Since $K$ is congruent to $[-\pi, \pi]$ modulo $2\pi$, and $\mathcal{L}_\ell$ is periodic with period $2^{\ell+1}\pi$, we have
i.e., there exists $\zeta_\ell \in 2^\ell K$ so that $|\mathcal{L}_\ell(\zeta_\ell)| = q_\ell$. Since $K$ is compact, the $2^{-\ell}\zeta_\ell \in K$ are uniformly bounded. We therefore have
/
ICrl 5 2 (2
(7.1.7)
for 0 < C'.
2. Moreover, since $\left| \frac{1 + e^{-i\xi}}{2} \right| = |\cos \xi/2| \le 1$, we have for all $\xi \in 2^\ell K$,
Putting it all together we find for tt = 2Ct
Since $K = \inf_\ell K_\ell$, this is bounded below by a strictly positive constant. ∎

Let us now turn to the particular family of $_N\phi$ constructed in §6.4, and see how these estimates perform. We have
with
We start by establishing a few elementary properties of $P_N$.
MORE ABOUT COMPACTLY SUPPORTED WAVELETS
LEMMA 7.1.4. The polynomial $P_N(x) = \sum_{n=0}^{N-1} \binom{N-1+n}{n} x^n$ satisfies the following properties:
05 x 5 y
A
05 x
$
c
1 immediately leads to sharper results. We have, for instance,
(because $\sin^2 \xi = 4 \sin^2 \frac{\xi}{2}\, (1 - \sin^2 \frac{\xi}{2})$).
4 I
4 + 4 (implying 4p(l - ,y) 5 f), then < p(N-l)by (7.1.9). In the remabiq window $ + 9 2
If eitha y 5 1/2 or y 2 [P~(u)P~(4y(1 y))] a,? $,wehave PN(v) PN(4y(l- Y))
0 m that, for all k E N, Indeed, 2kMb = &I (mod %), ao that (7.1.10) f o U m if b # 0 or f T . We already know that to# 0; if b = fr , then = 0 (mod 2 ~ ahd ) hence & = 2 M - 1 =~ 0 (mod 2n), which ie impossible.
2. Now
L
Since & ia a trigonometric polynomiai and L(0) = 1, there exists Cz so that I&(
I
We start by proving yet another property of $P_N$.
LEMMA 7.1.9.
$$P_N'(x) = \frac{N}{1-x} \left[ P_N(x) - P_N(1)\, x^{N-1} \right] . \qquad (7.1.14)$$
Proof.
1.
3. Combining (7.1.15) and (7.1.16) gives
$$(1-x)\, P_N'(x) = N \left[ P_N(x) - \binom{2N-1}{N} x^{N-1} \right] .$$
Since $P_N(1) = \sum_{n=0}^{N-1} \binom{N-1+n}{n} = \binom{2N-1}{N}$, (7.1.14) follows. ∎
We now tackle the proof of Lemma 7.1.8.

Proof of Lemma 7.1.8.
1. Since $P_N(y)$ is increasing on $[0, 1]$, we only need to prove (7.1.13).
2. Define $f(y) = P_N(y)\, P_N(4y(1-y))$. Applying Lemma 7.1.9 leads to
3. Since $4y(1-y) \le y$ for $y \ge 3/4$, we can apply (7.1.8) to derive
Substituting this into (7.1.17) leads to
The quantity in square brackets equals *(I - g)PN(y) 1 0 for y 5 1, so that g(y) 5 o for 5 y I It follows that PN(y) PN(4g(l - y)) is decreasing on which prove6 (7.1.13) for y _< 9.
i.
(i, I],
4. For
5 y 5 1 we follow a different strategy. PN(?) by Lenrma 7.1.4, it suffices to prove
(e)N'l
Since PN(y) 5
But PN(4y(l- y)) 5 [I - 4y(l - ~ ) I - N= ( 2 -~ 1)-2N ( b m (1 PN(t)= 1 x N P N ( -~ x ) I), and
-
For $M_0$ of the type
where $\mathcal{L}$ is a trigonometric polynomial such that $\mathcal{L}(\pi) \neq 0$, the matrix $P_0$ has very special spectral properties.

LEMMA 7.1.11. The values $1, \frac12, \ldots, 2^{-2K+1}$ are eigenvalues for $P_0$. The row vectors $e_k = (j^k)_{j=-J,\ldots,J}$, $k = 0, \ldots, 2K-1$, generate a subspace which is left invariant for $P_0$. More precisely,
$$e_k P_0 = 2^{-k} e_k + \text{linear combination of the } e_n,\ n < k .$$
1. The factorization (7.1.28) is equivalent to
$$\sum_j a_j\, j^k (-1)^j = 0 \quad \text{for } k = 0, \ldots, 2K-1 .$$
Moreover, since $M_0(0) = 1$, $\sum_j a_{2j} = \sum_j a_{2j+1} = \frac12$. This means that the sum of each column in the matrix (7.1.27) is equal to 1; $e_0$ is thus a left eigenvector of $P_0$ with eigenvalue 1.
2. For $0 < k \le 2K - 1$, define $g_k = e_k P_0$, i.e., $(g_k)_m = 2 \sum_j j^k a_{2j-m}$.
t For m even, m = 2!,
: 1
Hence
where
A,,, = Co?,(~j)"' I
F
= ~ 0 z J + 1 ( 2+j lIrn f
A consequence of Lemma 7.1.11 is that the spaces $E_k$,
with $1 \le k \le 2K$, are all right invariant for $P_0$. The main result of this subsection is then the following.

THEOREM 7.1.12. Let $\lambda$ be the eigenvalue of $P_0|_{E_{2K}}$ with the largest absolute value. Define $F$, $\alpha$ by
If $|\lambda| < 1$, then $F \in C^{\alpha - \epsilon}$ for all $\epsilon > 0$.
2. The spectral radius $\rho(P_0|_{E_{2K}})$ equals $|\lambda|$. Since, for any $\delta > 0$, there exists $C > 0$ so that $\|A^n\| \le C(\rho(A) + \delta)^n$ for all $n \in \mathbb{N}$, it follows that
3. On the other hand, f( then the first term, of order 2-1, dominates, and $_2\phi$ is Lipschitz. In fact, one can even prove that $_2\phi$ is differentiable in these points, which constitute a set of full measure. This establishes a whole hierarchy of fractal sets (the sets on which $r(x)$ takes some preassigned value) on which $_2\phi$ has different Hölder exponents. And what happens at dyadic rationals? Well, there you can define $r_\pm(x)$, depending on whether you come "from above" (associated with $d^+(x)$) or "from below" ($d^-(x)$); $r_+(x) = 0$, $r_-(x) = 1$. As a consequence, $_2\phi$ is left differentiable at dyadic rationals $x$, but has Hölder exponent .550 when $x$ is approached from the right. This is illustrated by Figure 7.1, which shows blowups of $_2\phi$, exhibiting the characteristic lopsided peaks at even very fine scales.
In this example, we had two "sum rules" (7.2.13), (7.2.14), reflecting that $m_0(\xi) = \frac12 \sum_k c_k e^{-ik\xi}$ was divisible by $((1 + e^{-i\xi})/2)^2$. In general, $m_0$ is divisible by $((1 + e^{-i\xi})/2)^N$, and we have $N$ sum rules. The subspace $E_N$ will, however, be more than one-dimensional, which complicates estimates. The general theorem about global regularity is as follows.

THEOREM 7.2.1. Assume that the $c_k$, $k = 0, \ldots, K$, satisfy $\sum_{k=0}^K c_k = 2$ and
$$\sum_{k=0}^K (-1)^k k^\ell c_k = 0 \quad \text{for } \ell = 0, 1, \ldots, L . \qquad (7.2.20)$$
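For every $m = 1, \ldots, L+1$, define $E_m$ to be the subspace of $\mathbb{R}^N$ orthogonal to $U_m = \mathrm{Span}\{e_1, \ldots, e_m\}$, where $e_j = (1^{j-1}, 2^{j-1}, \ldots, N^{j-1})$. Assume that there exist $1/2 \le \lambda < 1$, $0 \le \ell \le L$ ($\ell \in \mathbb{N}$) and $C > 0$ such that, for all binary sequences $(d_j)_{j \in \mathbb{N}}$, and all $m \in \mathbb{N}$,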
1. there exists a non-trivial continuous $L^1$-solution $F$ for the two-scale equation (7.2.1) associated with the $c_n$,
2. this solution $F$ is $\ell$ times continuously differentiable, and
3. if $\lambda > \frac12$, then the $\ell$th derivative $F^{(\ell)}$ of $F$ is Hölder continuous, with exponent at least $|\ln \lambda| / \ln 2$; if $\lambda = 1/2$, then the $\ell$th derivative $F^{(\ell)}$ of $F$ is almost Lipschitz: it satisfies
REMARK. The restriction $\lambda \ge \frac12$ means only that we pick the largest possible integer $\ell \le L$ for which (7.2.21) holds with $\lambda < 1$. If $\ell = L$, then necessarily $\lambda \ge \frac12$ (see Daubechies and Lagarias (1992)); if $\ell < L$ and $\lambda < \frac12$, then we could replace $\ell$ by $\ell + 1$ and $\lambda$ by $2\lambda$, and (7.2.21) would hold for a larger integer $\ell$. □

A similar general theorem can be formulated for the local regularity fluctuations exhibited by the example of $_2\phi$. For a precise statement, more details and proofs, I refer to Daubechies and Lagarias (1991, 1992). When applied to the $_N\phi$, these methods lead to the following optimal Hölder exponents:
4
4,
These are clearly better than what was obtained in §7.1.3; moreover, we see to our surprise that $_3\phi$ is continuously differentiable, even though its graph seems to have a "peak" at $x = 1$. Blowups show that this is deceptive: the true maximum lies a little to the right of $x = 1$, and everything is indeed smooth (see Figure 7.2). The derivative of $_3\phi$ is continuous, but has a very small Hölder exponent, as illustrated by Figure 7.3.
Unfortunately, these matrix methods are too cumbersome to treat large examples. Another, more recent "direct method" has been developed in Dyn and Levin (1989) and Rioul (1991); when applied to the $_N\phi$ with $N = 2, 3, 4$ it reproduces the $\alpha$-values above; since it is computationally less heavy, it can also tackle larger values of $N$ with better results than in §7.1.3 (see Rioul (1991)).

1. Note the similarity of the matrices $T_0$, $T_1$ and $P_0$ in §7.1.3 (see (7.1.27))! Even the spectral analysis, with the nested invariant subspaces, is the same. This shows that the result in Theorem 7.1.12 is indeed optimal: if iis the
spectral radius of 'polhX = . T I J then ~ ~ ,
so that X in (7.2.21) must be at least &, and the H6lder exponent is at most C Ilog XI/ bg 2 5 1 log i t / log 2. The difference between the two appro& is that the present method also givea optimal estimates if &([) is not positive, unlike $7.1.3.
+
2. The condition (7.2.21) suggests that infinitely many conditions on the $T_0$, $T_1$ have to be checked before Theorem 7.2.1 can be applied. Fortunately, (7.2.21) can be reduced to equivalent conditions which can be checked in a finite-time computer search. For details, see Daubechies and Lagarias (1992).
3. In practice, it is not necessary to work with $T_0$, $T_1$ and restrict them to $E_K$. One can also define directly the matrices $\tilde T_0$, $\tilde T_1$ corresponding to the coefficients of $m_0(\xi) / ((1 + e^{-i\xi})/2)^K$; it turns out that bounds on $\|T_{d_1} \cdots T_{d_m}|_{E_K}\|$ are equivalent to bounds on $\|\tilde T_{d_1} \cdots \tilde T_{d_m}\| \cdot 2^{-Lm}$ (see Daubechies and Lagarias (1992), §5). The matrices $\tilde T_d$ are much smaller than $T_d$ ($(N-K) \times (N-K)$ instead of $N \times N$). □
Since this method works for any function satisfying an equation of the type (7.2.1), we can apply it to the basic functions in subdivision schemes. For the Lagrangian interpolation function corresponding to (6.5.14), a detailed analysis shows that $F$ is "almost" $C^2$: it is $C^1$, and $F'$ satisfies
This had already been obtained previously by Dubuc (1986). But our matrix methods can do more! They can prove that $F'$ is almost everywhere differentiable, and they can even compute $F''$ where it is well defined. For details, see again Daubechies and Lagarias (1992).

7.3. Compactly supported wavelets with more regularity.
By Corollary 5.5.2, an orthonormal basis of wavelets can consist of $C^{N-1}$ wavelets only if the basic wavelet $\psi$ has $N$ vanishing moments. (We implicitly assume that $\psi$ stems from a multiresolution analysis and that $\phi$, $\psi$ have sufficient decay; both conditions are trivially satisfied for the compactly supported wavelet bases as constructed in Chapter 6.) This was our motivation to construct the $_N\phi$, which lead to $_N\psi$ with $N$ vanishing moments. The asymptotic results in §7.1.2 show however that the $_N\phi$, $_N\psi \in C^{\mu N}$ with $\mu \simeq .2$. This means that 80% of the zero moments are "wasted," i.e., the same regularity could be achieved with only $N/5$ vanishing moments. Something similar happens for small values of $N$. For instance, $_2\phi$ is continuous but not $C^1$, $_3\phi$ is $C^1$ but not $C^2$, even though $_2\psi$, $_3\psi$ have, respectively, two and three vanishing moments. We can therefore "sacrifice" in each of these two cases one of the vanishing moments and use the additional degree of freedom to obtain $\phi$ with a better Hölder exponent than $_2\phi$ or $_3\phi$ have, with the same support width. This amounts to replacing $|m_0(\xi)|^2 = (\cos^2 \frac{\xi}{2})^N P_N(\sin^2 \frac{\xi}{2})$ by $|m_0(\xi)|^2 = (\cos^2 \frac{\xi}{2})^{N-1} [P_{N-1}(\sin^2 \frac{\xi}{2}) + a\, (\sin^2 \frac{\xi}{2})^{N-1} \cos \xi]$ (see (6.1.11)), and to choose $a$ so that the regularity of $\phi$ is improved. Examples for $N = 2, 3$ are shown in Figures 7.4 and 7.5; the corresponding $h_n$ are as follows:
These examples correspond to a choice of $a$ such that $\max[\rho(T_0|_{E_\ell}), \rho(T_1|_{E_\ell})]$ is minimized; the eigenvalues of $T_0$, $T_1$ are then degenerate.$^8$ One can prove that the Hölder exponents of these two functions are at least .5864, 1.40198 respectively, and at most .60017, 1.4116; these last values are probably the true Hölder exponents. For more details, see Daubechies (1990b).

7.4. Regularity or vanishing moments?
FIG. 7.4. The scaling function $\phi$ for the most regular wavelet construction with support width 3.

FIG. 7.5. The scaling function $\phi$ for the most regular wavelet construction with support width 5.

The examples in the previous section show that for fixed support width of $\phi$, $\psi$, or equivalently, for fixed length of the filters in the associated subband coding scheme, the choice of the $h_n$ that leads to maximum regularity is different from the choice with maximum number $N$ of vanishing moments for $\psi$. The question
then arises: what is more important, vanishing moments or regularity? The answer depends on the application, and is not always clear. Beylkin, Coifman, and Rokhlin (1991) use compactly supported orthonormal wavelets to compress large matrices, i.e., to reduce them to a sparse form. For the details of this application, the reader should consult the original paper, or the chapter by Beylkin in Ruskai et al. (1991); one of the things that make their method work is the number of vanishing moments. Suppose you want to decompose a function $F(x)$ into wavelets (strictly speaking, matrices should be modelled by a function of two variables, but the point is illustrated just as well, and in a simpler way, with one variable). You compute all the wavelet coefficients $\langle F, \psi_{j,k} \rangle$, and to compress all that information, you throw away all the coefficients smaller than some threshold $\epsilon$. Let us see what this means at some fine scale; $j = -J$, $J \in \mathbb{N}$ and $J$ "large." If $F$ is $C^{L-1}$ and $\psi$ has $L$ vanishing moments, then, for $x$ near $2^{-J}k$, we have
where $R$ is bounded. If we multiply this by $\psi(2^J x - k)$ and integrate, then the first $L$ terms will not contribute because $\int dx\, x^\ell \psi(x) = 0$, $\ell = 0, \ldots, L-1$. Consequently,
$$|\langle F, \psi_{-J,k} \rangle| = \left| \int dx\, (x - 2^{-J}k)^L\, R(x)\, 2^{J/2}\, \psi(2^J x - k) \right| .$$
For $J$ large, this will be negligibly small, unless $R$ is very large near $k 2^{-J}$. After thresholding, we will therefore only retain fine-scale wavelet coefficients near singularities of $F$ or its derivatives. The effect will be all the more pronounced if the number $L$ of vanishing moments of $\psi$ is large.$^9$ Note that the regularity of $\psi$ does not play a role at all in this argument; it seems that for Beylkin, Coifman, and Rokhlin-type applications the number of vanishing moments is far more important than the regularity of $\psi$.

For other applications, regularity may be more relevant. Suppose you want to compress the information in an image. Again, you decompose into wavelets (two-dimensional wavelets, e.g., associated with a tensor product multiresolution analysis), and you throw away all the small coefficients. (This is a rather primitive procedure. In practice, one chooses to allocate more precision to some coefficients than to others, by means of a quantization rule.) You end up with a representation of the type
where $S$ is only a (small) subset of all the possible values, chosen in function of $I$. The mistakes you have made will consist of multiples of the deleted $\psi_{j,k}$. If these are very wild objects, then the difference between $I$ and $\tilde I$ might well be much more perceptible than if $\psi$ is smoother. This is admittedly very much a hand-waving argument, but it suggests that at least some regularity might be required. Some first experiments reported in Antonini et al. (1991) seem to confirm this, but more experiments are required for a convincing answer. The sum rules (7.2.20), equivalent to the divisibility of m( N. Then $N$ is odd, because $N$ even, $N = 2m$ together with
1. We can always shift
.
2. Since $h_n = 0$ for $n < 0$, $n > N$, support $\phi = [0, N]$, by Lemma 6.2.2. The standard definition (5.1.34) then leads to support $\psi = [-n_0, n_0 + 1]$, where $n_0 = \frac{N-1}{2}$. The symmetry axis is therefore necessarily at $\frac12$; we have either $\psi(1 - x) = \psi(x)$ or $\psi(1 - x) = -\psi(x)$.
v.
4;
+
which means that the $W_j$-spaces are invariant under the map $x \mapsto -x$. Since $V_j = \bigoplus_{k > j} W_k$, $V_j$ is invariant as well.
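SYMMETRY FOR COMPACTLY SUPPORTED WAVELET BASES
4. Define now $\tilde\phi(x) = \phi(N - x)$. Then the $\tilde\phi(\cdot - n)$ generate an orthonormal basis of $V_0$ (since $V_0$ is invariant for $x \mapsto -x$), $\int dx\, \tilde\phi(x) = \int dx\, \phi(x) = 1$, and support $\tilde\phi$ = support $\phi$. It follows from Corollary 8.1.3 that $\tilde\phi = \phi$, i.e., $\phi(N - x) = \phi(x)$. Consequently,
$$h_n = \sqrt{2} \int dx\, \phi(x)\, \phi(2x - n) = \sqrt{2} \int dx\, \phi(N - x)\, \phi(N - 2x + n) = h_{N-n} .$$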
5. On the other hand,
(use (8.1.2) on the second term)
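By Lemma 8.1.2, this implies $h_{2m} = \delta_{m,m_0}\, \alpha$ for some $m_0 \in \mathbb{Z}$, $|\alpha| = 2^{-1/2}$. Since we assumed $h_0 \neq 0$, this means that $h_{2m} = \delta_{m,0}\, \alpha$. By (8.1.2), $h_N = h_0 = \alpha$ as well, and $h_{2m+1} = \alpha\, \delta_{m,n_0}$ in general. The normalization $\sum_n h_n = \sqrt{2}$ (see Chapter 5) fixes the value of $\alpha$, $\alpha = \frac{1}{\sqrt{2}}$.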
6. We have thus $h_{2m} = \frac{1}{\sqrt{2}}\, \delta_{m,0}$, $h_{2m+1} = \frac{1}{\sqrt{2}}\, \delta_{m,n_0}$, or
$$m_0(\xi) = \frac12 \left( 1 + e^{-iN\xi} \right) .$$
It f b k that d( 1, then the q5(. - n) are not dhonormsl, which cbntradicts the aseumpti~lein the theorem. a
E
1 E*
r-t .
k
.
1. The nonexistence of symmetric or antisymmetric real compactly supported wavelets should be no surprise to anybody familiar with subband coding: it had already been noted by Smith and Barnwell (1986) that symmetry is not compatible with the exact reconstruction property in subband filtering. The only extra result of Theorem 8.1.4 is that symmetry of $\psi$ necessarily implies symmetry for the $h_n$, but that is a rather intuitively true result anyway.
2. If the restriction that $\phi$ be real is lifted, then symmetry is possible, even if $\phi$ is compactly supported (Lawton, private communication, 1990). □
CHAPTER 8
The asymmetry of all the examples plotted in §6.4 is therefore unavoidable. But why should we care? Symmetry is nice, but can't we do without? For some applications it does not really matter at all. The numerical analysis applications in Beylkin, Coifman, and Rokhlin (1991), for instance, work very well with very asymmetric wavelets. For other applications, the asymmetry can be a nuisance. In image coding, for example, quantization errors will often be most prominent around edges in the images; it is a property of our visual system that we are more tolerant of symmetric errors than asymmetric ones. In other words, less asymmetry would result in greater compressibility for the same perceptual error.$^3$ Moreover, symmetric filters make it easier to deal with the boundaries of the image (see also Chapter 10), another reason why the subband coding engineering literature often sticks to symmetry. The following subsections discuss what we can do to make orthonormal wavelets less asymmetric, or how we can recover symmetry if we give up orthonormality.

8.1.1. Closer to linear phase. Symmetric filters are often called linear phase filters by engineers; if a filter is not symmetric, then its deviation from symmetry is judged by how much its phase deviates from a linear function. More precisely, a filter with filter coefficients $a_n$ is called linear phase if the phase of the function $a(\xi) = \sum_n a_n e^{-in\xi}$ is a linear function of $\xi$, i.e., if, for some $\ell \in \mathbb{Z}$,
This means that the $a_n$ are symmetric around $\ell$, $a_n = a_{2\ell - n}$. Note that according to this definition, the Haar filter $m_0(\xi) = (1 + e^{-i\xi})/2$ is not linear phase, although the filter coefficients are clearly symmetric. This is because the $h_n$ are symmetric around $\frac12 \notin \mathbb{Z}$ in this case.
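The distinction between symmetry around an integer and symmetry around a half-integer can be checked directly on filter coefficients. The Python sketch below is an illustration under stated assumptions: the coefficient list is assumed to contain no leading or trailing zeros, `first_index` is the index of its first entry, and the helper names are hypothetical.

```python
import cmath
import math


def frequency_response(a, xi, first_index=0):
    """a(xi) = sum_n a_n exp(-i n xi), with a[0] carrying index first_index."""
    return sum(an * cmath.exp(-1j * (first_index + n) * xi) for n, an in enumerate(a))


def symmetry_centre(a, first_index=0, tol=1e-12):
    """Return c such that a_n = a_{2c - n} for all n, or None if a is not symmetric.

    Assumes the list has no leading or trailing zero entries.
    """
    if all(abs(x - y) <= tol for x, y in zip(a, reversed(a))):
        return first_index + (len(a) - 1) / 2.0
    return None


def is_linear_phase(a, first_index=0):
    """Linear phase in the strict sense used above: symmetric around an integer."""
    centre = symmetry_centre(a, first_index)
    return centre is not None and float(centre).is_integer()


if __name__ == "__main__":
    haar = [0.5, 0.5]            # coefficients of (1 + exp(-i xi)) / 2
    print(symmetry_centre(haar), is_linear_phase(haar))
    print(is_linear_phase([0.25, 0.5, 0.25]))
```

For the Haar coefficients the centre is 1/2, so the strict definition fails even though the list is symmetric, while a 3-tap symmetric filter is centred at the integer 1.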
The phase has a discontinuity at $\pi$, where $|m_0| = 0$. If we extend the definition of linear phase to include also the filters for which the phase of $a(\xi)$ is piecewise linear, with constant slope, and has discontinuities only where $|a(\xi)|$ is zero, then filters with the same symmetry as the Haar filter are also included. To make a filter "close" to symmetric, the idea is then to juggle with its phase so that it is "almost" linear. Let us apply this to the "standard" construction of the $_N\phi$, $_N\psi$, as given in §6.4. In that case we have
and the coefficients $h_n$ were determined by taking the "square root" of $P_N$ via spectral factorization. Typically this means writing the polynomial $L(z)$, defined by $L(e^{i\xi}) = P_N(\sin^2 \xi/2)$, as a product of $(z - z_\ell)(z - \bar z_\ell)(z - z_\ell^{-1})(z - \bar z_\ell^{-1})$ or $(z - r_\ell)(z - r_\ell^{-1})$, where $z_\ell$, $r_\ell$ are the complex, respectively, real roots of $L$, and selecting one pair $(z_\ell, \bar z_\ell)$ out of each quadruple of complex roots, and one value $r_\ell$ out of each pair of real roots. Up to normalization, the resulting $m_0$ is
ca:
-
1'
)
then
~ m o ( E )=
(
1
+ e"C
)
N *
n
- d(e-* - %)
(e-*
t
n
(e-.
- .k)
-
k
*
The phase of ~ m cen o therefore be computed from the phase of each contribution. Since (e-4
- Rc e"ioc)(e-'C - I&
and (e"€
ebe) = e-g(e-4
- t t ) izs
the corre~pondingph contributi= arcts
and
k ( 4 ) ==ctg
(e-%/2
e,
cire
-3- 1
tlJ
USa!
+ R! tic)
-
- ,.[ 8 € / 2 ) ,
- 1) sin(
(R; ((i+ ii;)
(-rt
- 2Rr
)
- 2Rc mat
f) -
Let us choose the valuation of arctg so that $\Phi_\ell$ is continuous in $[0, 2\pi]$, and $\Phi_\ell(0) = 0$; as shown by the example of the Haar basis, this may not be the "true" phase: we have ironed out possible discontinuities. To see how linear the phase is, this ironing out is exactly what we want to do, however. Moreover, we would like to extract only the nonlinear part of $\Phi_\ell$; we therefore define
*(;
at(€)= at( 0. Since we are dealing with pal~cw,mi& in cost, P win 8~tomIdyb -; (1 + C Y ) ~ = r - 4 d i = 2e+[l+ws 0,
kEZ.
(9.2.1)
We have implicitly assumed here that $\psi \in C^r$, with $r > s$. For proofs and more examples, see Meyer (1990). Of the examples given here, the only spaces that can be completely characterized (with "if and only if" conditions) by Fourier transforms are the Sobolev spaces.

The conditions (9.2.1) characterize global regularity. Local regularity can also be studied by means of coefficients with respect to an orthonormal wavelet basis. The most general theorem is the following, due to Jaffard (1989b). For simplicity, we assume that $\psi$ has compact support and is $C^1$ (the formulation of the theorem is slightly different for more general $\psi$).
CHAPTER 9
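THEOREM 9.2.1. If $f$ is Hölder continuous with exponent $\alpha$, $0 < \alpha < 1$, at $x_0$, i.e.,
$$|f(x) - f(x_0)| \le C\, |x - x_0|^\alpha , \qquad (9.2.2)$$
then
$$\max_k \left[ |\langle f, \psi_{-j,k} \rangle|\ \mathrm{dist}(x_0, \mathrm{support}(\psi_{-j,k}))^{-\alpha} \right] = O\!\left( 2^{-(\frac12 + \alpha) j} \right) \qquad (9.2.3)$$
for $j \to \infty$. Conversely, if (9.2.3) holds and if $f$ is known to be $C^\epsilon$ for some $\epsilon > 0$, then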
We do not have exact equivalence between (9.2.3) and (9.2.2) here. The estimate (9.2.4) is in fact optimal, as is the condition $f \in C^\epsilon$: if $f$ is merely continuous, or if the logarithm in (9.2.4) is omitted, then counterexamples can be found (Jaffard (1989b)). Non-equivalence of (9.2.2) and (9.2.3) can be caused by the existence of less regular points near $x_0$, or by wild oscillations of $f(x)$ near $x_0$ (see, e.g., Mallat and Hwang (1992)). If we modify condition (9.2.3) slightly, then these problems are circumvented. More precisely (again with compactly supported $\psi \in C^1$), we have the following.

THEOREM 9.2.2. Define, for $\epsilon > 0$,
If, for some $\epsilon > 0$, and some $\alpha$, $0 < \alpha < 1$,
then $f$ is Hölder continuous with exponent $\alpha$ in $x_0$.
Proof.
1. Choose any $x$ in $]x_0 - \epsilon, x_0 + \epsilon[$. Since either $\psi_{j,k}(x) \neq 0$ or $\psi_{j,k}(x_0) \neq 0$ implies $k \in S(x_0, j; \epsilon)$, we have
It follows that
CHARACTERIZATION OF FUNCTIONAL SPACES
2. Since $\psi$ has compact support, the number of $k$ for which $\psi_{j,k}(x) \neq 0$ or $\psi_{j,k}(x_0) \neq 0$ is bounded, uniformly in $j$, by $2\, |\mathrm{support}(\psi)|$. Consequently,
Since $\psi$ is bounded and $C^1$,
3. Now choose $j_0$ so that $2^{j_0} \le |x - x_0| \le 2^{j_0+1}$. Then

1. Similar theorems can, of course, be proved for $C^\alpha$-spaces with $\alpha > 1$.
2. If $\alpha = 1$ (or more generally, $\alpha \in \mathbb{N}$), then the very last step of the proof does not work any more, because the second series will not converge. That is why one has to be more circumspect for integer $\alpha$, and why the Zygmund class enters.
3. Theorems 9.2.1 and 9.2.2 are also true if $\psi$ has infinite support, and $\psi$, $\psi'$ have good decay at $\infty$ (see Jaffard ( 1 ~ )). Compact support for $\psi$ makes the estimates easier. □
Local regularity can therefore be studied by means of wavelet coefficients. For practical purposes, one should beware, however: it may be that very large values of $j$ are needed to determine $\alpha$ in (9.2.5) reliably. This is illustrated by the following example. Take
$$f(x) = \begin{cases} 2\, e^{-|x-a|} & \text{if } x \le a - 1 , \\ e^{-|x-a|} & \text{if } a - 1 \le x \le a + 1 , \\ e^{-|x-a|}\, [(x - a - 1)^2 + 1] & \text{if } x \ge a + 1 ; \end{cases}$$
this function is graphed in Figure 9.1 (with $a = 0$).
This function has Hölder exponents 0, 1, 2 at $x = a - 1$, $a$, $a + 1$, respectively, and is $C^\infty$ elsewhere. We can then, for each of the three points $x_0 = a - 1$, $a$, or $a + 1$, compute $A_j = \max \{ |\langle f, \psi_{-j,k} \rangle| ;\ x_0 \in \mathrm{support}(\psi_{-j,k}) \}$, and plot $\log A_j / \log 2$. If $a = 0$, then these plots line up on straight lines, with slope 1/2, 3/2 and 5/2, with pretty good accuracy, leading to good estimates for $\alpha$. A decomposition in orthonormal wavelets is not translation invariant, however, and dyadic rationals, particularly 0, play a very special role with respect to the dyadic grid $\{2^{-j}k;\ j, k \in \mathbb{Z}\}$ of localization centers for our wavelet basis. Choosing different values for $a$ illustrates this: for $a = 1/128$, we have very different $\langle f, \psi_{j,k} \rangle$, but still a reasonable line-up in the plots of $\log A_j / \log 2$, with good estimates for $\alpha$; for irrational $a$, the line-up is much less impressive, and determining $\alpha$ becomes correspondingly less precise.

All this is illustrated in Figure 9.2, showing the plots of $\log A_j / \log 2$ as a function of $j$, for $x_0 = a - 1$, $a$, $a + 1$ and for the three choices $a = 0$, $1/128$ and $\sqrt{2} - 11/8$ (we subtract 11/8 to obtain $a$ close to zero, for programming convenience). To make the figure, $|\langle f, \psi_{-j,k} \rangle|$ was computed for the relevant values of $k$ and for $j$ ranging from 3 to 10. (Note that this means that $f$ itself had to be sampled with a resolution $2^{-17}$, in order to have a reasonable accuracy for the $j = 10$ integrals.) For $a = 0$, the eight points line up beautifully and the estimate for $\alpha$ is accurate to less than 1.5% at all three locations. For $a = 1/128$, the points at the coarser resolution scales do not align as well, but if $\alpha + \frac12$ is estimated from only the finest four resolution points, then the estimates are still within 2%. For the irrational choice $a = \sqrt{2} - 11/8$ no alignment can be seen at the discontinuity at $a - 1$ (one probably needs even smaller scales), and the estimate for $\alpha + \frac12$ at $a$, where $f$ is Lipschitz, is off by about 13% (interestingly enough, the estimate would be much better if the scale 10 point were deleted); at $a + 1$, where $f'$ is Lipschitz, the estimate is within 2.5%.
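The slope fit described above can be sketched as follows. This is only an illustration of the line-up idea, not the procedure used for Figure 9.2: it assumes the amplitudes behave like $A_j \simeq C\, 2^{-j(\alpha + 1/2)}$ for the scales supplied (so that $\log A_j / \log 2$ has slope $-(\alpha + \frac12)$ in $j$ under this sign convention), it uses an ordinary least-squares line, and the function name is hypothetical. Computing the $A_j$ themselves from wavelet coefficients is not shown.

```python
import math


def estimate_holder_exponent(scales, amplitudes):
    """Least-squares estimate of alpha from A_j ~ C * 2**(-j * (alpha + 1/2)).

    scales: the values of j; amplitudes: the corresponding A_j > 0.
    """
    xs = [float(j) for j in scales]
    ys = [math.log2(a) for a in amplitudes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope - 0.5


if __name__ == "__main__":
    js = list(range(3, 11))
    # Synthetic amplitudes for a Lipschitz point (alpha = 1), purely for illustration.
    amps = [0.7 * 2.0 ** (-1.5 * j) for j in js]
    print(estimate_holder_exponent(js, amps))
    # Using only the four finest scales, as suggested in the text for a = 1/128.
    print(estimate_holder_exponent(js[-4:], amps[-4:]))
```

Restricting the fit to the finest scales is then just a matter of slicing both lists before calling the function.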
This illustrates that to determine the local regularity of a function, it is more useful to use very redundant wavelet families, where this translational non-invariance is much less pronounced (discrete case) or absent (continuous case). (See Holschneider and Tchamitchian (1990), Mallat and Hwang (1992).) Another reason for using very redundant wavelet families for the characterization of local regularity is that then only the number of vanishing moments of $\psi$ limits the maximum regularity that can be characterized; the regularity of $\psi$ plays no role (see §2.9). If orthonormal bases are used, then we are necessarily limited by the regularity of $\psi$ itself, as is illustrated by choosing $f = \psi$. For this choice we have indeed $\langle f, \psi_{-j,k} \rangle = 0$ for all $j > 0$, all $k$; it follows that with orthonormal wavelets we can hope to characterize only regularity up to $C^{r - \epsilon}$ if $\psi \in C^r$.

FIG. 9.2. Estimates of the Hölder exponents of $f(x - a)$ (see Figure 9.1) at $a - 1$ (top), $a$ (middle), $a + 1$ (bottom), computed from $\log A_j / \log 2$, for different values of $a$. Horizontal axis: index $j$ of the scale.
9.3. Wavelets for $L^1([0,1])$. Since $L^1$-spaces do not have unconditional bases, wavelets cannot provide one. Nevertheless, they still outperform Fourier analysis in some sense. We will illustrate this by a comparison of expansions in wavelets versus Fourier series of $L^1([0,1])$-functions. But first we must introduce "periodized wavelets." Given a multiresolution analysis with scaling function $\phi$ and wavelet $\psi$, both with reasonable decay (say, $|\phi(x)|, |\psi(x)| \le C(1 + |x|)^{-1-\epsilon}$), we define
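$$\phi^{\mathrm{per}}_{j,k}(x) = \sum_{\ell \in \mathbb{Z}} \phi_{j,k}(x + \ell), \qquad \psi^{\mathrm{per}}_{j,k}(x) = \sum_{\ell \in \mathbb{Z}} \psi_{j,k}(x + \ell),$$
and
$$V_j^{\mathrm{per}} = \overline{\operatorname{Span}\{\phi^{\mathrm{per}}_{j,k};\ k \in \mathbb{Z}\}}, \qquad W_j^{\mathrm{per}} = \overline{\operatorname{Span}\{\psi^{\mathrm{per}}_{j,k};\ k \in \mathbb{Z}\}}.$$
Since $\sum_{\ell \in \mathbb{Z}} \phi(x + \ell) = 1$, we have, for $j \ge 0$, $\phi^{\mathrm{per}}_{j,k}(x) = 2^{-j/2} \sum_{\ell} \phi(2^{-j}x - k + 2^{-j}\ell) = 2^{j/2}$, so that the $V_j^{\mathrm{per}}$, for $j \ge 0$, are all identical one-dimensional spaces, containing only the constant functions. Similarly, because $\sum_{\ell} \psi(x + \ell/2) = 0$, $W_j^{\mathrm{per}} = \{0\}$ for $j \ge 1$. We therefore restrict our attention to the $V_j^{\mathrm{per}}$, $W_j^{\mathrm{per}}$ with $j \le 0$. Obviously $V_j^{\mathrm{per}}, W_j^{\mathrm{per}} \subset V_{j-1}^{\mathrm{per}}$, a property inherited from the non-periodized spaces. Moreover, $W_j^{\mathrm{per}}$ is still orthogonal to $V_j^{\mathrm{per}}$ because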
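$$\langle \psi^{\mathrm{per}}_{j,k}, \phi^{\mathrm{per}}_{j,k'} \rangle = \sum_{r \in \mathbb{Z}} \langle \psi_{j,k+2^{|j|}r}, \phi_{j,k'} \rangle = 0.$$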
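The three facts just stated (constants for $j \ge 0$, vanishing periodized wavelets for $j \ge 1$, and orthogonality for $j \le 0$) can be checked numerically in the simplest case. The sketch below assumes the Haar scaling function and wavelet, which are compactly supported so the periodizing sum is finite; the helper names `dilate`, `periodize` and `inner` are illustrative and not from the text.

```python
def haar_phi(x):
    """Haar scaling function: indicator of [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def haar_psi(x):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1)."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def dilate(func, j, k):
    """Return f_{j,k}(x) = 2**(-j/2) * f(2**(-j) * x - k)."""
    scale, norm = 2.0 ** (-j), 2.0 ** (-j / 2.0)
    return lambda x: norm * func(scale * x - k)

def periodize(func, reach=64):
    """Return x -> sum of func(x + l) over |l| <= reach (exact for short supports)."""
    return lambda x: sum(func(x + l) for l in range(-reach, reach + 1))

def inner(f, g, n=256):
    """Midpoint-rule inner product on [0, 1]; exact here on the dyadic grid."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) * g((i + 0.5) * h) for i in range(n))

# j >= 0: the periodized scaling function is the constant 2**(j/2).
phi_per_3_1 = periodize(dilate(haar_phi, 3, 1))
# j >= 1: the periodized wavelet vanishes identically.
psi_per_1_0 = periodize(dilate(haar_psi, 1, 0))
# j = -2: periodized wavelets are orthogonal to periodized scaling functions.
cross = [inner(periodize(dilate(haar_psi, -2, k), 4),
               periodize(dilate(haar_phi, -2, kp), 4))
         for k in range(4) for kp in range(4)]
```

For wavelets without compact support the sum over $\ell$ would have to be truncated with care, which is where the decay assumption on $\phi$ and $\psi$ enters.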
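It follows that, as in the non-periodized case, $V_{j-1}^{\mathrm{per}} = V_j^{\mathrm{per}} \oplus W_j^{\mathrm{per}}$. The spaces $V_j^{\mathrm{per}}$, $W_j^{\mathrm{per}}$ are finite-dimensional: since $\phi^{\mathrm{per}}_{j,k+m2^{|j|}} = \phi^{\mathrm{per}}_{j,k}$ for $m \in \mathbb{Z}$, and the same is true for $\psi^{\mathrm{per}}$, both $V_j^{\mathrm{per}}$ and $W_j^{\mathrm{per}}$ are spanned by the $2^{|j|}$ functions obtained from $k = 0, 1, \dots, 2^{|j|} - 1$. These $2^{|j|}$ functions are moreover orthonormal; in, e.g., $W_j^{\mathrm{per}}$ we have, for $0 \le k, k' \le 2^{|j|} - 1$,
$$\langle \psi^{\mathrm{per}}_{j,k}, \psi^{\mathrm{per}}_{j,k'} \rangle = \sum_{r \in \mathbb{Z}} \langle \psi_{j,k+2^{|j|}r}, \psi_{j,k'} \rangle = \delta_{k,k'}.$$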
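We have therefore a ladder of multiresolution spaces,
$$V_0^{\mathrm{per}} \subset V_{-1}^{\mathrm{per}} \subset V_{-2}^{\mathrm{per}} \subset \cdots,$$
with successive orthogonal complements $W_0^{\mathrm{per}}$ (of $V_0^{\mathrm{per}}$ in $V_{-1}^{\mathrm{per}}$), $W_{-1}^{\mathrm{per}}$, ..., and orthonormal bases $\{\phi^{\mathrm{per}}_{j,k};\ k = 0, \dots, 2^{|j|} - 1\}$ in $V_j^{\mathrm{per}}$, $\{\psi^{\mathrm{per}}_{j,k};\ k = 0, \dots, 2^{|j|} - 1\}$ in $W_j^{\mathrm{per}}$. Since $\overline{\bigcup_{j \in -\mathbb{N}} V_j^{\mathrm{per}}} = L^2([0,1])$ (this follows again from the corresponding non-periodized version), the functions in $\{\phi^{\mathrm{per}}_{0,0}\} \cup \{\psi^{\mathrm{per}}_{j,k};\ -j \in \mathbb{N},\ k = 0, \dots, 2^{|j|} - 1\}$ constitute an orthonormal basis in $L^2([0,1])$. We will relabel this basis as follows: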
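$$g_0(x) = \phi^{\mathrm{per}}_{0,0}(x) = 1, \qquad g_{2^j+k}(x) = \psi^{\mathrm{per}}_{-j,k}(x) = g_{2^j}(x - k2^{-j}) \quad \text{for } j \ge 0,\ 0 \le k \le 2^j - 1.$$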
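The relabeling $n = 2^j + k$ is a bijection between the integers $n \ge 1$ and the pairs $(j, k)$ with $j \ge 0$, $0 \le k \le 2^j - 1$. A minimal sketch of the two directions follows; the function names `split_index` and `join_index` are illustrative only.

```python
def split_index(n):
    """Return (j, k) with n = 2**j + k and 0 <= k <= 2**j - 1, for n >= 1."""
    if n < 1:
        raise ValueError("n = 0 labels the constant function and has no (j, k)")
    j = n.bit_length() - 1  # largest j with 2**j <= n
    return j, n - (1 << j)

def join_index(j, k):
    """Return n = 2**j + k for j >= 0 and 0 <= k <= 2**j - 1."""
    if j < 0 or not 0 <= k < (1 << j):
        raise ValueError("need j >= 0 and 0 <= k <= 2**j - 1")
    return (1 << j) + k
```

For example, `split_index(6)` gives `(2, 2)`, so the function labeled 6 is the periodized wavelet at scale index 2 and position 2.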
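Then this basis has the following remarkable property.

THEOREM 9.3.1. If $f$ is a continuous periodic function with period 1, then there exist $\alpha_n \in \mathbb{C}$ so that
$$\Big\| f - \sum_{n=0}^{N} \alpha_n g_n \Big\|_{L^\infty} \to 0 \quad \text{as } N \to \infty.$$

Proof.
1. Since the $g_n$ are orthonormal, we necessarily have $\alpha_n = \langle f, g_n \rangle$. Define $S_N$ by
$$S_N f = \sum_{n=0}^{N-1} \langle f, g_n \rangle\, g_n.$$
In a first step we prove that the $S_N$ are uniformly bounded, i.e.,
$$\|S_N f\|_{L^\infty} \le C\, \|f\|_{L^\infty}, \tag{9.3.2}$$
with $C$ independent of $f$ or $N$.
2. If $N = 2^j$, then $S_{2^j} = \operatorname{Proj}_{V_{-j}^{\mathrm{per}}}$; hence [...] and this is uniformly bounded if $|\phi(x)| \le C(1 + |x|)^{-1-\epsilon}$. This establishes (9.3.2) for $N = 2^j$.
[...] Estimates exactly similar to those in point 2 show that the $L^\infty$-norm of the second sum is also bounded by $C\, \|f\|_{L^\infty}$, uniformly in $j$, which proves (9.3.2) for all $N$.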
Proof.
1. Suppose, for simplicity, that $\psi$ is compactly supported, with support $\psi \subset [-L, L]$. For sufficiently large $j$, this means that $\psi^{\mathrm{per}}_{-j,k}(x) = \psi_{-j,k}(x)$ if $|2^{-j}k - x_0| \le 2^{-j}$. (Again, this is not crucial. For non-compactly supported $\psi$, one only has to be a little more careful in the estimates below.)
2. For $m = 2^j + k$, $a_m = \int dx\, f(x)\, \psi_{-j,k}(x)$. Here
$$\operatorname{support} \psi_{-j,k} \subset [2^{-j}(k - L),\ 2^{-j}(k + L)] \subset [x_0 - 2^{-j}(L + 1),\ x_0 + 2^{-j}(L + 1)]$$
(because $|2^{-j}k - x_0| \le 2^{-j}$); hence [...] (use $f(x) - f(x_0) - (x - x_0) f'(x_0) = o(x - x_0)$, and change variables: $y = 2^j (x - x_0)$).
This has the following corollary.

COROLLARY 9.4.2. If, for all $m$, $C_1 m^{-3/2} \le |a_m| \le C_2 m^{-3/2}$, with $C_1 > 0$, $C_2 < \infty$, then $\sum_{m=0}^{\infty} a_m g_m$ is in $C^\alpha$ for all $\alpha < 1$, but is nowhere differentiable.

Proof. Immediate from Theorem 9.2.2 and Lemma 9.4.1.
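Let us now construct a very particular function. Take $a_m = a_{2^j+k} = \beta_j$, independently of $k$. Then
$$\sum_{m=1}^{\infty} a_m g_m(x) = \sum_{j=0}^{\infty} \beta_j \sum_{k=0}^{2^j-1} \psi^{\mathrm{per}}_{-j,k}(x) = \sum_{j=0}^{\infty} \beta_j\, 2^{j/2}\, F(2^j x),$$
where $F(x) = \sum_{m} \psi(x - m)$ is a periodic function. We have
$$F(x) = \sum_{n} F_n\, e^{2\pi i n x},$$
with
$$F_n = \int_0^1 dx\, F(x)\, e^{-2\pi i n x} = \int_{-\infty}^{\infty} dx\, \psi(x)\, e^{-2\pi i n x} = \sqrt{2\pi}\, \hat\psi(2\pi n).$$
In the special case where $\psi = \psi^{\mathrm{Meyer}}$ (see Chapters 4 and 5), support $\hat\psi = \{\xi;\ 2\pi/3 \le |\xi| \le 8\pi/3\}$, so that $\hat\psi(2\pi n) \ne 0$ only if $n = \pm 1$. Moreover, $\hat\psi(-2\pi) = \hat\psi(2\pi)$. Consequently, $F(x) = A \cos(2\pi x)$, and
$$\sum_{m=1}^{\infty} a_m g_m(x) = A \sum_{j=0}^{\infty} \beta_j\, 2^{j/2} \cos(2\pi 2^j x).$$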
The "full" wavelet series of the left-hand side has a lacunary Fourier expansion! If now the $\beta_j$ are chosen so that $C_1 2^{-j} \le 2^{j/2} \beta_j \le C_2 2^{-j}$, then we can apply Corollary 9.4.2, and conclude that the function is nowhere differentiable. For this special case, this is in fact a well-known result about lacunary Fourier series: $\sum_n \gamma_n \cos(\lambda_n x)$, with $\sum_n |\gamma_n| < \infty$ but $\gamma_n \lambda_n \not\to 0$, defines a continuous, nowhere differentiable function. On the other hand, if we take a function with a localized singularity, but which is $C^\infty$ elsewhere, such as, e.g., $f(x) = |\sin \pi x|^{-\alpha}$, with $0 < \alpha < 1$, then its wavelet expansion will be more or less lacunary (all the coefficients decay very fast as $-j \to \infty$, except the few for which $2^{-|j|}k$ is close to the singularity), while the Fourier series is "full": $f_n = \gamma_\alpha\, n^{-1+\alpha}$ plus a faster-decaying remainder, with $\gamma_\alpha \ne 0$; the effects of the singularity are felt in all the Fourier coefficients.
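The two conditions in the lacunary-series criterion are easy to inspect for the cosine series obtained above. The sketch below assumes the specific hypothetical choice $A = 1$ and $2^{j/2}\beta_j = 2^{-j}$, so that $\gamma_j = 2^{-j}$ and $\lambda_j = 2\pi 2^j$; the helper names are illustrative.

```python
import math

def lacunary_terms(J):
    """Amplitudes gamma_j = 2**(-j) and frequencies lambda_j = 2*pi*2**j, j < J."""
    gammas = [2.0 ** (-j) for j in range(J)]
    lambdas = [2.0 * math.pi * 2.0 ** j for j in range(J)]
    return gammas, lambdas

def partial_sum(x, J):
    """Partial sum of sum_j gamma_j * cos(lambda_j * x) with J terms."""
    gammas, lambdas = lacunary_terms(J)
    return sum(g * math.cos(lam * x) for g, lam in zip(gammas, lambdas))

gammas, lambdas = lacunary_terms(30)
# Absolute summability: the series converges uniformly to a continuous function.
total_amplitude = sum(gammas)
# gamma_j * lambda_j stays equal to 2*pi, so the termwise derivative cannot converge.
derivative_amplitudes = [g * lam for g, lam in zip(gammas, lambdas)]
```

The amplitudes sum to less than 2 while every product $\gamma_j \lambda_j$ equals $2\pi$, which is exactly the situation "summable but $\gamma_n\lambda_n \not\to 0$" described in the text.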
Notes.
1. There exist many different definitions of Calderón-Zygmund operators. A discussion of these different definitions and their evolution is given at the start of Meyer (1990, vol. 2). Note that the bounds are infinite on the diagonal $x = y$; in general $K$ will be singular on the diagonal. Strictly speaking, we should be more careful about what happens on the diagonal. One way to make sure everything is well defined is to require that $T$ is bounded from $\mathcal{D}$ to $\mathcal{D}'$ ($\mathcal{D}$ is the set of all compactly supported $C^\infty$ functions, $\mathcal{D}'$ its dual, the space of (non-tempered) distributions), and that if $x \notin \operatorname{support}(f)$, then $(Tf)(x) = \int dy\, K(x, y) f(y)$. It then follows that $K$ does not completely determine $T$: the operator $(T_1 f)(x) = (Tf)(x) + m(x) f(x)$, with $m \in L^\infty(\mathbb{R})$, has the same integral kernel. See Meyer (1990, vol. 2) for a clear and extensive discussion.
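2. Note that $\|f\|_{L^p_{\mathrm{weak}}}$ constitutes a (very convenient) abuse of notation. As shown, for example, by $\big\| |x - 1|^{-1} + |x + 1|^{-1} \big\|_{L^1_{\mathrm{weak}}} > \big\| |x - 1|^{-1} \big\|_{L^1_{\mathrm{weak}}} + \big\| |x + 1|^{-1} \big\|_{L^1_{\mathrm{weak}}}$, the triangle inequality is not satisfied, so that $\|\cdot\|_{L^p_{\mathrm{weak}}}$ is not a true norm.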
CHAPTER 9
3. If the "weak" is dropped, then the theorem is known as the Riesz-Thorin theorem; in this case $K = C_1^t C_2^{1-t}$, and the restriction $q_1 \le p_1$, $q_2 \le p_2$ is not necessary.
5. We can suppose without loss of generality that $a \ge 0$. Find $k$ so that $k \le a \le k + 1$. Then
[...] $-2N + 1$. All but $2N - 2$ of these are untouched by the restricting procedure: $\phi^{\mathrm{half}}_{j,k}(x) = \phi_{j,k}(x)$ if $k \ge 0$; these functions are therefore still orthonormal. The $2N - 2$ functions $\phi^{\mathrm{half}}_{j,k}$, $k = -1, \dots, -(2N - 2)$ are independent of each other and of the $\phi_{j,k}$ with $k \ge 0$. Define now $W_j^{\mathrm{half}}$ to be the orthogonal complement in $V_{j-1}^{\mathrm{half}}$ of $V_j^{\mathrm{half}}$. If, for convenience, we shift $\psi$ so that it is also supported on $[0, 2N - 1]$, then the $\psi^{\mathrm{half}}_{j,k}$ (restrictions of $\psi_{j,k}$ to $[0, \infty)$) are clearly in $W_j^{\mathrm{half}}$ if $k \ge 0$, since they are orthogonal to all the $\phi^{\mathrm{half}}_{j,k'}$, and lie in $V_{j-1}^{\mathrm{half}}$. What about the $\psi^{\mathrm{half}}_{j,k}$ with $k = -1, \dots, -(2N - 2)$? (If $k$ is even smaller, $k \le -2N + 1$, then $\psi^{\mathrm{half}}_{j,k} = 0$.) It turns out (see Meyer (1992)) that the $\psi^{\mathrm{half}}_{j,k}$ with $k = -N, -(N + 1), \dots, -(2N - 2)$ are in $V_j^{\mathrm{half}}$, i.e., they are orthogonal to $W_j^{\mathrm{half}}$. The other $\psi^{\mathrm{half}}_{j,k}$, $k = -1, \dots, -(N - 1)$ contribute to $W_j^{\mathrm{half}}$; in fact, we have that $\{\phi^{\mathrm{half}}_{j,k};\ k \ge -(2N - 2)\} \cup \{\psi^{\mathrm{half}}_{j,k};\ k \ge -(N - 1)\}$ is a (non-orthogonal) basis for $V_{j-1}^{\mathrm{half}}$. In order to orthonormalize this basis, one proceeds in the following steps:
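(1) Orthonormalize the $\phi^{\mathrm{half}}_{0,k}$, $k = -1, \dots, -(2N - 2)$. The resulting functions $\tilde\phi^{\mathrm{half}}_{0,k}$, $k = -1, \dots, -(2N - 2)$ are automatically orthogonal to the $\phi_{0,k}$, $k \ge 0$, and together they provide an orthonormal basis for $V_0^{\mathrm{half}}$. If we define
$$\tilde\phi^{\mathrm{half}}_{j,k}(x) = 2^{-j/2}\, \tilde\phi^{\mathrm{half}}_{0,k}(2^{-j}x), \quad j \in \mathbb{Z},\ k = -1, \dots, -(2N - 2),$$
then $\{\phi_{j,k};\ k \ge 0\} \cup \{\tilde\phi^{\mathrm{half}}_{j,k};\ k = -1, \dots, -(2N - 2)\}$ is an orthonormal basis for $V_j^{\mathrm{half}}$ for any $j \in \mathbb{Z}$.

(2) Project the $\psi^{\mathrm{half}}_{0,k}$, $k = -1, \dots, -(N - 1)$ onto $W_0^{\mathrm{half}}$, by subtracting from each of them its components along the $2N - 2$ orthonormalized functions $\tilde\phi^{\mathrm{half}}_{0,\ell}$ of step (1).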
(3) Orthonormalize the projected functions obtained in step (2). The resulting $\tilde\psi^{\mathrm{half}}_{0,k}$, $k = -1, \dots, -(N - 1)$, together with the $\psi^{\mathrm{half}}_{0,k}$, $k \ge 0$, provide an orthonormal basis for $W_0^{\mathrm{half}}$. We can again define
$$\tilde\psi^{\mathrm{half}}_{j,k}(x) = 2^{-j/2}\, \tilde\psi^{\mathrm{half}}_{0,k}(2^{-j}x);$$
then $\{\tilde\psi^{\mathrm{half}}_{j,k};\ k = -1, \dots, -(N - 1)\} \cup \{\psi_{j,k};\ k \ge 0\}$ is an orthonormal basis for $W_j^{\mathrm{half}}$. The union of all these bases ($j$ ranging over $\mathbb{Z}$) gives a basis for $L^2([0, \infty))$.

The resulting bases are not only orthonormal bases for $L^2([0, \infty))$, but provide also unconditional bases for the Hölder spaces restricted to the half-line (i.e., they even handle regularity properties at 0 "correctly"), etc.; see Meyer (1992) for proofs. To implement all this in practice, one needs to compute extra filter coefficients at the boundaries, corresponding to the expansion of the $\tilde\psi^{\mathrm{half}}_{j,k}$, $k = -1, \dots, -(N - 1)$, and the $\tilde\phi^{\mathrm{half}}_{j,k}$, $k = -1, \dots, -(2N - 2)$, in terms of the $\tilde\phi^{\mathrm{half}}_{j-1,\ell}$, $\ell = -1, \dots, -(2N - 2)$ and the $\phi_{j-1,\ell}$, $\ell = 0, \dots, 4N - 5$. These can be computed from the original $h_n$; tables are given in Cohen, Daubechies, and Vial (1992); this paper also contains an alternative to Meyer's construction, involving fewer additional functions at the edges (only $N$ instead of $2N - 2$), while still handling the regularity properties correctly, even at the edge.

One last remark about wavelet bases on an interval. In image analysis, it is customary to treat border effects by extending the image, beyond the border, by its reflection: this extension avoids the discontinuity that follows from periodization or extending by zero (although there still is a discontinuity in the derivative). It is well known that this means that border effects are minimized, and that no extra coefficients (to deal with the borders) have to be introduced, provided that the filters used are symmetric. The same trick can be used to provide biorthogonal wavelet bases on $[0, 1]$, with much less effort than the orthonormal wavelet bases on the interval of Meyer (1992) or Cohen, Daubechies, and Vial (1992). If $f$ is a function on $\mathbb{R}$, then we can define a function on $[0, 1]$ by "folding" its graph at 0 and 1. The first "fold," at 0, amounts to replacing $f(x)$ by $f(x) + f(-x)$.
Folding back the two tails (one from the original $f$, the other from the folded-over negative part) sticking out beyond 1 leads to $f(x) + f(-x) + f(2 - x) + f(x - 2)$. If we keep folding like this, then we end up with
$$f^{\mathrm{fold}}(x) = \sum_{\ell \in \mathbb{Z}} f(x - 2\ell) + \sum_{\ell \in \mathbb{Z}} f(2\ell - x).$$
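The folding sum can be tried out numerically. The sketch below is an illustration under assumptions not in the text: the test function is a shifted Gaussian, the sum over $\ell$ is truncated to $|\ell| \le 50$, and the names `fold` and `gauss` are hypothetical. It checks that the folded function, viewed on all of $\mathbb{R}$, is symmetric under reflection at 0 and at 1, and that folding preserves the total integral.

```python
import math

def fold(f, reach=50):
    """Return x -> sum over |l| <= reach of f(x - 2l) + f(2l - x)."""
    return lambda x: sum(f(x - 2 * l) + f(2 * l - x)
                         for l in range(-reach, reach + 1))

def gauss(x):
    """A rapidly decaying test function, deliberately not symmetric about 0."""
    return math.exp(-(x - 0.3) ** 2)

g_fold = fold(gauss)

# Midpoint rule on [0, 1]; the integral of the folded function over [0, 1]
# should equal the integral of gauss over the whole line, sqrt(pi).
n = 200
integral = sum(g_fold((i + 0.5) / n) for i in range(n)) / n
```

The reflection symmetry at both endpoints is what makes the extension continuous across the borders, in contrast to periodization.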
Note, for later convenience, that [...]
CHAPTER 10
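Take now $\psi$, $\tilde\psi$, two wavelets which generate biorthogonal wavelet bases of $L^2(\mathbb{R})$, with associated scaling functions $\phi$, $\tilde\phi$, as constructed in §8.3; assume also that $\phi$, $\tilde\phi$ are symmetric around $\tfrac12$, $\phi(1 - x) = \phi(x)$, $\tilde\phi(1 - x) = \tilde\phi(x)$, and that $\psi$, $\tilde\psi$ are antisymmetric around $\tfrac12$, $\psi(1 - x) = -\psi(x)$, $\tilde\psi(1 - x) = -\tilde\psi(x)$. (Examples were constructed in §8.3.) Apply the "folding" technique to the $\psi_{j,k}$ and $\tilde\psi_{j,k}$: [...]; $\tilde\psi^{\mathrm{fold}}_{j,k}$ is defined analogously. We will restrict our attention to $j \le 0$, or $j = -J$ with $J \ge 0$, for which (10.7.2) can be rewritten as [...] For good measure we also define $\phi^{\mathrm{fold}}_{-J,k}$, $\tilde\phi^{\mathrm{fold}}_{-J,k}$; since $\phi(x) = \phi(1 - x)$, $\tilde\phi(x) = \tilde\phi(1 - x)$, we find [...]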
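Obviously, $\phi^{\mathrm{fold}}_{-J,k+2^{J+1}m} = \phi^{\mathrm{fold}}_{-J,k}$ for $m \in \mathbb{Z}$, so that we only need to consider the $\phi^{\mathrm{fold}}_{-J,k}$ with $k = 0, \dots, 2^{J+1} - 1$. Moreover, $\phi^{\mathrm{fold}}_{-J,k} = \phi^{\mathrm{fold}}_{-J,2^{J+1}-k-1}$, which means we can restrict ourselves to only $k = 0, \dots, 2^J - 1$. A similar argument shows that we only need to consider the $\psi^{\mathrm{fold}}_{-J,k}$ for $k = 0, \dots, 2^J - 1$. Remarkably, the $\phi^{\mathrm{fold}}_{-J,k}$ and $\tilde\phi^{\mathrm{fold}}_{-J,k'}$, with $0 \le k, k' \le 2^J - 1$, are still biorthogonal on $[0, 1]$. To prove this, use (10.7.1): [...]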
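The two index identities for the folded scaling functions (period $2^{J+1}$ in $k$, and the mirror symmetry $k \mapsto 2^{J+1} - k - 1$) can be checked in a simple case. This sketch assumes the Haar scaling function, which is symmetric around 1/2, and an unnormalized folding sum; any constant normalization factor in the definition of the folded functions would not affect these identities. All names are illustrative.

```python
def haar_phi(x):
    """Haar scaling function: indicator of [0, 1), symmetric around 1/2."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def phi_fine(J, k):
    """phi_{-J,k}(x) = 2**(J/2) * phi(2**J * x - k)."""
    return lambda x: 2.0 ** (J / 2.0) * haar_phi(2.0 ** J * x - k)

def fold(f, reach=8):
    """Unnormalized folding: sum over |l| <= reach of f(x - 2l) + f(2l - x)."""
    return lambda x: sum(f(x - 2 * l) + f(2 * l - x)
                         for l in range(-reach, reach + 1))

J = 2
period = 2 ** (J + 1)
# Sample points in [0, 1] that avoid the breakpoints of the Haar function.
xs = [(i + 0.5) / 64 for i in range(64)]

def same_on_grid(f, g):
    return all(abs(f(x) - g(x)) < 1e-12 for x in xs)

periodic_in_k = all(
    same_on_grid(fold(phi_fine(J, k)), fold(phi_fine(J, k + period)))
    for k in range(period))
mirror_in_k = all(
    same_on_grid(fold(phi_fine(J, k)), fold(phi_fine(J, period - k - 1)))
    for k in range(period))
```

Both flags come out true, so only the indices $k = 0, \dots, 2^J - 1$ produce distinct folded functions in this example.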
GENERALIZATIONS AND TRICKS

This biorthogonality implies, among other things, that the $\phi^{\mathrm{fold}}_{-J,k}$, $k = 0, \dots, 2^J - 1$ are all independent, and provide a basis for $V_J^{\mathrm{fold}} = \{f^{\mathrm{fold}};\ f \in V_{-J}\}$. (The same is true for the $\tilde\phi^{\mathrm{fold}}_{-J,k}$.) We can also define spaces $W_J^{\mathrm{fold}}$ by $W_J^{\mathrm{fold}} = \{f^{\mathrm{fold}};\ f \in W_{-J}\}$. Obviously, $W_J^{\mathrm{fold}}$ is spanned by the $\psi^{\mathrm{fold}}_{-J,k}$, $k = 0, \dots, 2^J - 1$; moreover, computations similar to (10.7.4) show that [...], proving that $W_J^{\mathrm{fold}} \perp \tilde V_J^{\mathrm{fold}}$ and that the $\psi^{\mathrm{fold}}_{-J,k}$, $0 \le k \le 2^J - 1$ are independent. It follows that the "folded" structures inherit all the properties (nesting of the spaces, biorthogonality, basis properties, ...) from the unfolded originals. The filter coefficients corresponding to these folded biorthogonal bases are likewise obtained by folding at the edges corresponding to $x = 0, 1$; if $\psi$, $\tilde\psi$, $\phi$, $\tilde\phi$ are compactly supported, then only the filter coefficients near the borders will be affected. Examples are given in Cohen, Daubechies, and Vial (1992).

Because analyzing $f$ on $[0, 1]$ with these folded biorthogonal wavelets amounts to the same as extending $f$ to all of $\mathbb{R}$ by reflections and analyzing this extension with the original biorthogonal wavelets, we cannot hope, however, to characterize Hölder spaces on $[0, 1]$ beyond Hölder exponent 1 with this technique. This is progress with respect to what periodized wavelets can do, but it performs less well than the orthonormal wavelet bases on $[0, 1]$. For more details, see Cohen, Daubechies, and Vial (1992).
Notes.
1. One example is the following. Let $\Gamma$ be the hexagonal lattice, $\Gamma = \{n_1 e_1 + n_2 e_2;\ n_1, n_2 \in \mathbb{Z}\}$, where $e_1 = (1, 0)$, $e_2 = (1/2, \sqrt{3}/2)$; $\Gamma$ defines a partition of $\mathbb{R}^2$ into equilateral triangles. Define $V_0$ to be the space of continuous functions in $L^2(\mathbb{R}^2)$ that are piecewise affine on these triangles. The orthonormal basis for this multiresolution analysis is constructed in Jaffard (1989). Biorthogonal bases of compactly supported wavelets with this hexagonal symmetry are constructed in Cohen and Schlenker (1991).
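2. The one-dimensional conditions of §5.1 can also be cast in a matrix form like that. If we use $\hat\psi(\xi) = m_1(\xi/2)\, \hat\phi(\xi/2)$, then the conditions are $|m_0(\xi)|^2 + |m_0(\xi + \pi)|^2 = 1$, $|m_1(\xi)|^2 + |m_1(\xi + \pi)|^2 = 1$, $m_0(\xi)\overline{m_1(\xi)} + m_0(\xi + \pi)\overline{m_1(\xi + \pi)} = 0$, ensuring the orthonormality of the $\{\phi_{0,n};\ n \in \mathbb{Z}\}$, of the $\{\psi_{0,n};\ n \in \mathbb{Z}\}$ and the orthogonality of these two sets of vectors, respectively. But these conditions are equivalent to the requirement that the matrix
$$\begin{pmatrix} m_0(\xi) & m_1(\xi) \\ m_0(\xi + \pi) & m_1(\xi + \pi) \end{pmatrix}$$
is unitary.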
3. If one prefers to index the entries of $U$ with the numbers $1, \dots, 2^n$ rather than entries of $\{0, 1\}^n$, then one can renumber $s \in \{0, 1\}^n$ by assigning to it the index $1 + \sum_{j=1}^{n} s_j 2^{j-1} \in \{1, \dots, 2^n\}$.
4. At present, I know of no explicit scheme that provides an infinite family of $m_0$, for dilation factor 3, with regularity growing proportionally to the filter support width.
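5. The same can be done for dilation factor 2, where the factor matrices are even simpler. The basic idea is that if $|m_0(\xi)|^2 + |m_0(\xi + \pi)|^2 = 1$, then, for any $\gamma \in \mathbb{R}$, $n \in \mathbb{Z}$, $m_0^{\#}(\xi) = (1 + \gamma^2)^{-1/2}\, [\, m_0(\xi) + \gamma e^{-i(2n+1)\xi}\, \overline{m_0}(\,\dots$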